A chloroplast DNA hypervariable region in eucalypts

Download A chloroplast DNA hypervariable region in eucalypts

Post on 10-Jul-2016




2 download

Embed Size (px)


<ul><li><p>Abstract Eucalypt chloroplast DNA (cpDNA) providesuseful markers for phylogenetic and population researchincluding gene flow and maternity studies. All cpDNAstudies in Eucalyptus to date have been based on theRFLP technique, which requires relatively large amountsof clean DNA. The objective of this study was to developPCR-based cpDNA markers for Eucalyptus. The chloro-plast genome of Eucalyptus, like that of most angio-sperms, possesses inverted repeats (IR). The two junctionsbetween the IRs and the large single copy (LSC) regionsare defined as JLA and JLB. The region surrounding the JLAjunction was sequenced from 26 Eucalyptus DNA sam-ples (21 of E. globulus, plus 5 other species), and the JLBregion was sequenced using 5 of these samples. The sam-ples were chosen to represent all major haplotypes identi-fied in previous cpDNA RFLP studies. The JLA productswere 150210 bp in size, while those from JLB were ap-proximately 500 bp in size. Eighteen mutations werescored in total. Extensive variation was found in the IRwithin the intergenic spacer between rpl2 and the IR/LSCjunctions. Many of these characters were complex indels.One sample of E. globulus possessed a relatively large(38 bp) insertion which caused displacement of theIR/LSC junctions. This is the first report of intraspecificvariation in the position of IR/LSC junctions; interspecificvariation was also found. The LSC region near JLB had alow rate of mutation and none were informative. The LSCregion near JLA possessed 2 informative mutations. Thevariation revealed from these sequences reflects the differ-entiation previously uncovered in an extensive RLFPanalysis on the same samples. These results suggest thatthe JLA region will provide very useful cpDNA polymor-phisms for future studies in Eucalyptus.</p><p>Key words Chloroplast DNA JLA PCR Eucalyptus Inverted repeat </p><p>Introduction</p><p>Eucalypt species are fast becoming the hardwood speciesof choice in plantations throughout the world, and E.globulus is the most important of these species in tem-perate regions of the world (Eldridge et al. 1993). Chlo-roplast DNA (cpDNA) has been proven to be useful inphylogenetic (Steane et al. 1991; Sale et al. 1996), popu-lation (Byrne and Moran 1994; Jackson et al. 1999) andevolutionary studies in Eucalyptus (Steane et al. 1998;Jackson et al. 1999; McKinnon et al. 1999). All cpDNAstudies in Eucalyptus to date have been based on the restriction fragment length polymorphism (RFLP) tech-nique. This requires relatively large amounts of cleanDNA, which is time-consuming to obtain in eucalypts.The cpDNA microsatellites developed for gymnosperms(Powell et al. 1995; Vendramin et al. 1996) do not trans-fer well to Eucalyptus (D.A. Steane personal communi-cation). Consequently, development of cpDNA markersin Eucalyptus that would enable the RFLP technique tobe avoided would prove useful in studies that require alarge sample size.</p><p>In three independent RFLP studies (Steane et al.1998; Jackson et al. 1999; McKinnon et al. 1999) thecpDNA of Eucalyptus was found to be very variablewith respect to the size of restriction fragments whenprobed with Petunia probe nos. 20 and 19 (Sytsma andGottlieb 1986). This variation, which appeared as a wob-ble in band size, could not be used in these studies be-cause it could not be ascribed to simple restriction sitemutations or indels. These Petunia probes hybridize to acpDNA region that overlaps the junctions between thelarge single-copy region (LSC) and the inverted repeats(IR). Byrne et al. (1993) found that the chloroplast ge-nome of Eucalyptus possesses the IR structure character-istic of most angiosperms, such as found in Petunia andNicotiana. The IR is a duplicated section of cpDNA, and</p><p>Communicated by A.L. Kahler</p><p>R.E. Vaillancourt () H.D. JacksonCooperative Research Centre for Sustainable Production Forestryand School of Plant Science, University of Tasmania, GPO Box 25255, Hobart, Tasmania 7001, Australiae-mail: R.Vaillancourt@utas.edu.auFax: +61 3 6226 2698</p><p>Theor Appl Genet (2000) 101:473477 Springer-Verlag 2000</p><p>O R I G I N A L A RT I C L ER.E. Vaillancourt H.D. Jackson</p><p>A chloroplast DNA hypervariable region in eucalypts</p><p>Received: 20 March 1999 / Accepted: 16 December 1999</p></li><li><p>in each duplicate (25 kb in size) genes are in the oppositeorder. The IRs are separated by two single-copy regions,one large (LSC, 80 kb in size) and the other small (SSC,21 kb in size; Byrne et al. 1993). There are two junctionsbetween the LSC and IR, and these are defined as JLAand JLB in Shinozaki et al. (1986). Goulding et al. (1996)developed primers, anchored in conserved genes on either side of the junctions (for JLA rpl2 and trnH; forJLB rpl2 and rpl22), for amplifying both regions. Thegene encoding the ribosomal protein rps19 is often prox-imal to JLB on the LSC side of the junction and thereforecan be found with these primers (Goulding et al. 1996).Goulding et al. (1996) found that the DNA region sur-rounding JLA evolved rapidly by expansion and contrac-tion in Nicotiana, while the JLB region was much moreconserved. In the study presented here, we investigatedthe potential of the JLA and JLB regions as a source ofhighly polymorphic polymerase chain reaction (PCR)-based cpDNA markers for Eucalyptus.</p><p>Materials and methods</p><p>The DNA used in this study were 22 samples from Jackson et al.(1999), representing all haplotypes therein, and 4 samples fromSteane et al. (1998). The taxa belong to subgenus Symphyomyrtus(20 samples of Eucalyptus globulus; and 1 each of E. dalrymple-ana, E. archeri, E. morrisbyi and E. vernicosa) except for E. obli-qua from subgenus Monocalyptus, which served as an outgroup.Primers for PCR amplification of the JLA and JLB regions werethose developed by Goulding et al. (1996): for the JLA region,rpl2, 5-GATAATTTGATTCTTCGTCGCC-3 (which is in both</p><p>inverted repeats) and trnH, 5-CGGATGTAGCCAAGTGGATC-3(in the LSC side of JLA); for the JLB region, rpl2 and rpl22, 5-ACTCTTCGTGCTTTGTAGC-3 (in the LSC side of JLB). PCRreaction mixtures (50 l final volume) contained 5 l of Taq poly-merase 10 buffer (Promega), 2 U Taq polymerase, 200 M ofeach dNTP, 2 mM MgCl2, 100 g/ml of BSA (bovine serum albu-min), 7.5 pmol of each primer, 5 l 50% glycerol and 20 ng of ge-nomic DNA. Amplification was carried out in a PTC-200 (MJ Research) thermocycler programmed as follows: an initial dena-turation for 5 min at 95C followed by 35 cycles of 1 min at 94C,1 min at 46C, 1 min at 72C; and a final extension of 5 min at72C. For sequencing, PCR products were cleaned using the QIA-quick column (QIAGEN) and then quantified using a fluorometer(Hoefer). Sequencing was performed using the BigDye terminatorchemistry of Perkin Elmer (PE Applied Biosystems) on an ABIautomated sequencer. Each PCR product was sequenced in bothdirections. Sequence alignments were done manually. To judgewhether sequence characters were diagnostic of haplotypes, wemapped the characters onto the phylogenetic tree obtained fromcpDNA RFLP analysis on the same samples by Jackson et al.(1999; Fig. 2 therein).</p><p>Results</p><p>The region of cpDNA spanning JLA was sequenced in 26samples of Eucalyptus. Five of these were also se-quenced for the region of cpDNA spanning JLB, whichenabled the exact position of the IR/LSC junctions to bedetermined (Fig. 1). Eighteen variable characters werescored in total (Table 1), 11 of which were found in theIR. All IR mutations occurred in the intergenic spacerbetween rpl2 and the end of the IR. A few of these muta-tions were simple base pair (bp) substitutions (4 out of</p><p>474</p><p>Table 1 Description of all vari-able characters scored from sequencing the JLA and JLBregions of the chloroplast DNAof Eucalyptus. Characters werescored by comparison with theoutgroup</p><p>Character number Character type Character description and position (pos)aand location</p><p>Inverted repeats1 Transversion GT, pos 382 Substitution Outgroup state=C, 2.1=A, 2.2=T, pos 513 Transversion TA, pos 584 Indel 5-bp insertion at pos 59635 Multistate Various insertions at pos 6470: 5.0=gap; 5.1=</p><p>AATTATT, 5.2=AATTATA; 5.3=AATTAAA,5.4=AATTAT_</p><p>6 Multistate Various number of As at pos 7185: 6.2=A2,6.8=AATAAAAA, 6.9=A9, 6.10=A10, 6.15=A15</p><p>7 Indel 48-bp insertion at pos 801278 Indel 13-bp deletion at pos 1321449 Indel 12 bp insertion at pos 134145</p><p>10 Transversion TG, pos 13211 Indel 38 bp insertion at pos 94131, originating from</p><p>both sides of the JLB junctionLSC of JLA12 Indel 5-bp insertion, pos 146150.13 Transversion GC, pos 146LSC of JLB14 Transition TC, in rps1915 Transversion CA, in rps1916 Transition TC, in rps1917 Transversion GT, in rps1918 Transition TC, in rps19</p><p>a See Fig. 1</p></li><li><p>475</p><p>11), and many were complex sometimes involving largeindels. The 5 mutations (1.2%) in the large single-copyregion contiguous to JLB (JLB-LSC) were simple bp sub-stitutions which mainly distinguished the outgroup fromthe ingroup. Two mutations were found in the large sin-gle-copy region contiguous to JLA (JLA-LSC), one ofwhich was an indel.</p><p>The sample of E. globulus from North Flinders Is-land possessed a large insertion of 38 bp in the JLA re-gion (Table 1, character 11; Fig. 1). The sequence of thisinsertion is identical to the sequence flanking JLB. Thefirst 14 bp of this insert are the usual last 14 bp of the IR(as in the sample from St Helens) followed by a C, while</p><p>the next 23 bp of the insert corresponds to the first 23 bpof rps19 (JLB-LSC). A complete copy of rps19 wasfound outside of the inverted repeat, in the LSC portionof the JLB region. Therefore, the 23-bp insertion of therps19 sequence into JLA results in a shift in the IR/LSCjunctions in the North Flinders sample relative to otherE. globulus samples. Four other indels are also possibleduplications of nearby fragments (characters 4, 5, 7 and12; Table 1), and all these were judged to be insertionsby outgroup comparisons. The region between positions64 and 80 has a high degree of variability, which makescharacter interpretation difficult. For example, character5 may be an insertion, however there appear to havebeen a number of mutations in the fragment since theoriginal insertion event. This would make it difficult todetermine the original sequence that was inserted. There-fore, the variation in this region was ascribed to only 3characters (5, 6 and 7), 2 of which have many indepen-dent states (Table 1).</p><p>The 18 sequence characters were mapped onto a phylogenetic tree obtained from cpDNA RFLP analysisby Jackson et al. (1999; Fig. 2). Most of the charac-ters mapped to only one position on the tree (i.e. non-</p><p>Fig. 1 Alignment of the sequence from the JLA region in 15 Euca-lyptus samples. Only samples with different haplotypes are shown.Eucalyptus globulus samples are labelled by locality name only.Hyphens represent gaps inserted for alignment. Samples where theend of the inverted repeat (IR) is shaded represent those that havebeen sequenced for JLB. The position of the JLA junction (JLA), theIR and the large single-copy (LSC) region was then extrapolatedfor all samples as shown. The first 20 (within rpl2) and the last 65(within trnH) base pairs of the JLA products are not shown becausethey were highly conserved with tobacco</p></li><li><p>homoplasious), which indicates agreement between thetwo data sets. For example, character 2 from the JLA se-quence places the South Australian sample as basal tothe ingroup (Table 1, Fig. 2), as in the original RFLPanalysis. Another important character was number 12,which supported a clade (Fig. 2) identical to an RFLPclade and similarly had a clear geographical pattern toits distribution amongst E. globulus samples. However,the structure within this RFLP clade was previously de-fined only by homoplasious RFLP characters. There-fore, this part of the tree has been redrawn, taking into account the differentiation obtained with the non-homoplasious sequence character 13. The mapping ofmultistate characters 5 and 6 onto the RFLP tree wasmore difficult, with one possible interpretation beingshown in Fig. 2. The sequence data added further reso-lution to some of the RFLP clades. For example, theTriabunna, Taranna and Sandy Bay E. globulus samples(identical according to cpDNA RFLPs) are now all dif-ferentiated by the sequence data (Fig. 2).</p><p>Discussion</p><p>Sequencing the JLA region has provided useful molecularmarkers. In eucalypts there is considerable sequencevariation in the IR region close to the LSC. More infor-mative characters were found in the JLA than the JLB re-gion. The sequence variation uncovered using the JLA re-gion largely reflects the differentiation previously uncov-ered using extensive RLFP analysis on the same samples</p><p>(Jackson et al. 1999). Jackson et al. (1999) identified twomajor cpDNA clades within Tasmanian samples of E.globulus. Our results show that these can be unambigu-ously distinguished by the presence or absence of a 5-bpindel (character 12). By mapping the sequence charactersonto the RFLP phylogenetic tree of Jackson et al. (1999),we were able to show that the two types of data harbourthe same phylogenetic signal. Some characters in the JLAregion will be useful in phylogenetic studies, althoughsome difficulties may be encountered with the morecomplex characters. With characters that have many dif-ferent states it is difficult to judge how many differentmutations have actually occurred and whether these havefollowed any particular series. Despite these difficulties,these mutations are still useful in fingerprinting haplo-types.</p><p>Many of the mutations in the JLA region are indels,and often these appeared to be duplications of nearbyfragments. The reason why indels are so common in thejunction regions is not well understood, but probably relates to the recombination and gene conversion be-tween the two IRs that is presumed to occur continually(Goulding et al. 1996). Sequence mutations were alsofound to occur within the indel characters, indicating thatthis intergenic DNA is evolving at a high rate within Eucalyptus. The initial 20 bp of the intergenic spacer (bp 3463), upstream of rpl2, is highly conserved be-tween Eucalyptus and Nicotiana, suggesting that themechanism mutating the DNA is mostly active close tothe IR/LSC junction. In eucalypts, the gene arrangementacross the IR/LSC junctions is similar to that found inNicotiana (Goulding et al. 1996). A complete copy ofrps19 was found within the LSC of JLB, as in most an-giosperms, however the truncated copy of rps19 (rps19)commonly found in the IR side of JLA (Goulding et al.1996) was only found in 1 Eucalyptus sample (NorthFlinders). Six different rps19 structures were observedamong 13 Nicotiana species studied by Goulding et al.(1996), and these correlated to interspecific variation inthe position of the IR/LSC junctions. In Eucalyptus, theposition of the junctions was variable between and with-in species, with this being the first case of intraspecificvariation reported.</p><p>The JLA region has provided useful PCR markers forcpDNA that should be widely applicable in Eucalyptus.Some of the mutations result in fragment...</p></li></ul>