Proc. Natl. Acad. Sci. USA
Vol. 89, pp. 7526-7530, August 1992
Plant Biology

Classification and evolution of α-amylase genes in plants
(cereals/phylogeny/polymerase chain reaction/signature regions)

NING HUANG*, G. LEDYARD STEBBINS, AND RAYMOND L. RODRIGUEZ†

Department of Genetics, University of California, Davis, CA 95616

Contributed by G. Ledyard Stebbins, April 24, 1992

ABSTRACT The DNA sequences for 17 plant genes for α-amylase (EC 3.2.1.1) were analyzed to determine their phylogenetic relationship. A phylogeny for these genes was obtained using two separate approaches, one based on molecular clock assumptions and the other based on a comparison of sequence polymorphisms (i.e., small and localized insertions) in the α-amylase genes. These polymorphisms are called "α-amylase signatures" because they are diagnostic of the gene subfamily to which a particular α-amylase gene belongs. Results indicate that the cereal α-amylase genes fall into two major classes: AmyA and AmyB. The AmyA class is subdivided into the Amy1 and Amy2 subfamilies previously used to classify α-amylase genes in barley and wheat. The AmyB class includes the Amy3 subfamily to which most of the α-amylase genes of rice belong. Using polymerase chain reaction and oligonucleotide primers that flank one of the two signature regions, we show that the AmyA and AmyB gene classes are present in approximately equal amounts in all grass species examined except barley. The AmyB (Amy3 subfamily) genes in the latter case are comparatively underrepresented. Additional evidence suggests that the AmyA genes appeared recently and may be confined to the grass family.

α-Amylase (EC 3.2.1.1) plays a key role in the metabolism of the plant by hydrolyzing starch in the germinating seed and in other tissues. This is accomplished primarily through the 1,4-α endoglycolytic cleavage of amylose and amylopectin, the principal components of starch granules in plant cells. Because of its importance to cereal seed germination and malting, the genetic basis and biochemical mechanism of this process have been the subject of study for many years (1). More recent descriptions of the biology and biochemistry of plant α-amylases have been reviewed by Akazawa et al. (2) and Fincher (3).

Protein sequence comparisons of α-amylases from plants, animals, and microbes reveal four highly conserved domains that correspond to sites necessary for enzyme structure and/or function (4). A subsequent comparison of 11 different cereal α-amylase proteins revealed the same four sites plus three additional sites (5). Two of these correspond to intron splice sites, whereas the third may represent a duplication of the calcium binding site, essential for enzyme stability. These results clearly indicate a common ancestry for the α-amylases and conservation of critical sequences over evolutionary time.

Although three α-amylase isozymes have been detected in germinating rice seeds (6, 7), it is now known that α-amylase isozymes are encoded by a family of 10 genes located on five different chromosomes (8-10). Multiple α-amylase genes are also known to exist in barley (11) and wheat (12). In barley, seven genes encoding α-amylase isozymes with high isoelectric points (pI) map to chromosome 1, whereas four low pI genes map to chromosome 6 (11, 13). In hexaploid wheat, 12-14 α-Amy1 genes reside on homeologous chromosomes 6A, 6B, and 6D, whereas 10 or 11 α-Amy2 genes are found on chromosomes 7A, 7B, and 7D. The wheat α-Amy3 genes (14, 15) map to the homeologues of chromosome 5.

Different experimental approaches to the study of cereal α-amylases have led to different nomenclatures and classification schemes (16). Consequently, some confusion has arisen as more protein and nucleotide sequences have accumulated in electronic data bases. For example, in barley, genes for high pI isozyme have been alternatively called Amy1 (17) and Amy2 (18). On the other hand, genes for low pI isozyme are called either Amy2 (17) or Amy1 (18). Since zymograms do not resolve the α-amylase isozymes of rice into distinct high and low pI groups (2, 19), it has been difficult to relate these isozymes to those observed in barley and wheat. Subsequently, Huang et al. (20) used DNA-DNA hybridization to classify 30 rice genomic clones into five hybridization groups. These five groups were eventually consolidated into three subfamilies (Amy1, Amy2, and Amy3) on the basis of protein comparisons to other wheat and barley α-amylase genes (8, 20-22). Other investigators have used the high and low pI nomenclature to describe their cloned rice α-amylase genes (23, 24). Since the cereal α-amylase genes appear to be evolutionarily related, we believe an understanding of their phylogeny can be used to establish a consistent and informative nomenclature applicable to other monocot and perhaps dicot α-amylase genes.

For this paper, we examined the phylogenetic relationships of several plant α-amylase genes with and without the rate-constancy assumption of the molecular clock hypothesis. On the basis of this analysis, the cereal α-amylase genes could be divided into two major classes: AmyA and AmyB. The AmyA class consists of the Amy1 and Amy2 subfamilies, whereas AmyB consists of the Amy3 subfamily. All grasses examined so far have AmyA and AmyB gene classes. The AmyB genes contained some structural characteristics of the prototype α-amylase gene, which may relate them to a common ancestor for α-amylase in other angiosperms.

*Present address: International Rice Research Institute, P.O. Box 933, 1099 Manila, The Philippines.
†To whom reprint requests should be addressed.

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

MATERIALS AND METHODS

Preparation of Plant Genomic DNA. Plant DNAs suitable for polymerase chain reaction (PCR) were isolated using the CTAB procedure (25) as modified by Rogers and Bendich (26). DNA was isolated from the following tissues: 10-day etiolated rice young leaves (cv. M202), 7-day etiolated barley young leaves (cv. Klages and cv. Himalaya), 10-day etiolated young leaves from ryegrass, leaf tissue from Zebrina pendula (wandering jew), and Vigna radiata (mung bean) sprouts. DNA for rye, oats, Triticum aestivum (hexaploid wheat), and Triticum urartu (diploid wheat) was generously provided by H. Zhang and J. Dvorak (Agronomy Department, University of California, Davis), whereas maize DNA was provided by J. Callis (Biochemistry and Biophysics Department, University of California, Davis). Brassica and potato DNA were gifts from C. Quiros and J. Hu (Department of Vegetable Crops, University of California, Davis).

DNA Sequences of Rice, Barley, Wheat, and Mung Bean α-Amylase Genes. DNA sequences of barley, wheat, and mung bean α-amylase genes were obtained from published reports (see Fig. 1 legend) or from the GenBank data base (Release 70). Partial DNA sequences were not included in this study. Only one sequence was chosen from identical or nearly identical sequences. Unpublished sequences for two wheat genes (W-Amy1/13 and W-Amy2/54) were kindly provided by D. Baulcombe and M. Lazarus. Rice α-amylase gene sequences were previously determined by this laboratory and are available from GenBank. The DNA sequences were aligned by PILEUP and edited by LINEUP of the GCG Sequence Analysis Software Package (27). Aligned sequences were used to estimate the synonymous and nonsynonymous nucleotide substitution rates using a program developed by Li et al. (28). Nonsynonymous nucleotide substitutions were taken as a measure of genetic distance and the UPGMA program was used to construct the phylogenetic tree shown in Fig. 1 (29).

PCR Amplification of the Signature Region. PCR primers were synthesized on a Cyclone Plus DNA synthesizer (MilliGen/Biosearch, Novato, CA). The 22-mer (no. 139, 5'-CGGCA GGAGC TCGTC AACTG GG-3') anneals 5' to the signature region, whereas the 24-mer (no. 138, 5'-GAGCA TGCCC TTGGT GGTGA AGTC-3') is located 3' to the signature region (Fig. 3). The expected PCR products are 84 and 87 base pairs (bp) for AmyA genes and 78 bp for AmyB genes. The design of these primers was based on the following two criteria: (i) both primers correspond to highly conserved regions in the α-amylase genes and (ii) the PCR products for the AmyA and AmyB genes could be easily resolved by polyacrylamide gel electrophoresis. The two mung bean α-amylase gene primers were based on the previously published sequence information (30).

PCR amplification of total genomic DNA from various plant species was performed according to previously published procedures (8). Fifty to 500 ng of genomic DNA (depending on the genome size) were used for PCR amplification, whereas only 50 pg of cloned α-amylase gene DNA was used for control amplifications. PCRs were carried out in 50 µl containing genomic DNA, 0.4 µM primer 138, 0.4 µM primer 139, 0.2 mg of bovine serum albumin per ml, 1 mM each dNTP, and 1 unit of Taq polymerase (United States Biochemical) in Taq polymerase buffer [20 mM Tris, pH 8.4/5 mM MgCl2/2.5 mM KCl/15 mM (NH4)2SO4]. The DNA denaturation was set at 94°C for 1 min, primer annealing was at 55°C for 1 min, and primer extension was at 72°C for 2 min. The reaction was allowed to run for 30 cycles before a final 7-min primer extension at 72°C. The PCR products were resolved by electrophoresis on 10% polyacrylamide gels run in 1× TAE (40 mM Tris-acetate/2 mM EDTA) buffer and the separated DNA bands were visualized by ethidium bromide staining.
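The UPGMA step mentioned in the Methods above can be illustrated with a small sketch. The code below is not the program used in the paper (ref. 29); it is a minimal Python illustration, and the function name `upgma`, the nested-tuple tree format, and the four-taxon distance matrix are assumptions made here. The distances are KA means taken from Table 1 (rice/barley, rice/wheat, and wheat/barley for Amy1; mung bean versus each species), used only as stand-ins for a full gene-by-gene matrix.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster taxa by UPGMA.

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) -> pairwise distance.
    Returns (tree, root_height); tree is a nested tuple of labels.
    """
    # cluster id -> (subtree, number of leaves)
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    d = {
        frozenset((i, j)): dist[frozenset((names[i], names[j]))]
        for i, j in combinations(range(len(names)), 2)
    }
    next_id = len(names)
    height = 0.0
    while len(clusters) > 1:
        # Join the closest pair of clusters.
        i, j = min(combinations(sorted(clusters), 2),
                   key=lambda pair: d[frozenset(pair)])
        height = d[frozenset((i, j))] / 2  # node height is half the distance
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        # Distance from the new cluster to each remaining cluster is the
        # leaf-count-weighted mean of the two old distances.
        for k in clusters:
            d[frozenset((next_id, k))] = (
                n_i * d[frozenset((i, k))] + n_j * d[frozenset((j, k))]
            ) / (n_i + n_j)
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree, height


# Illustrative KA values from Table 1 (assumed stand-ins, see text).
names = ["rice", "barley", "wheat", "mung_bean"]
ka = {
    frozenset(("rice", "barley")): 0.093,
    frozenset(("rice", "wheat")): 0.096,
    frozenset(("barley", "wheat")): 0.025,
    frozenset(("mung_bean", "barley")): 0.265,
    frozenset(("mung_bean", "rice")): 0.266,
    frozenset(("mung_bean", "wheat")): 0.264,
}
tree, root_height = upgma(names, ka)
print(tree, round(root_height, 4))
```

With these inputs barley and wheat join first, rice joins next, and mung bean attaches last, which is consistent with mung bean acting as the outgroup to the cereal sequences.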

RESULTS

Phylogenetic Relationship of α-Amylase Genes by Sequence Homology. DNA sequences of 17 α-amylase genes from rice, barley, wheat, and mung bean were aligned and the number of nucleotide substitutions at synonymous (Ks) and nonsynonymous (KA) sites was estimated based on this alignment (28). Because synonymous nucleotide substitution sites were saturated between monocot and dicot genes, these values were not calculated (Table 1). Based on the nonsynonymous substitution rate, it is clear that the mung bean α-amylase gene is equidistant from the Amy1, Amy2, and Amy3 genes of barley, rice, and wheat, indicating equal rates of α-amylase gene evolution in these three species. The Ks and KA values were also determined for the monocot α-amylase genes (Table 1) and, as reported previously (31), Ks values were much higher than KA values. A strong, positive correlation (correlation coefficient = 0.982) was observed between these sets of values.

Table 1. Synonymous (Ks) and nonsynonymous (KA) substitution per site between mung bean (MB), rice, barley, and wheat

Amy gene          No.   Ks               KA
Monocot/dicot
  MB/barley        6    NC               0.265 ± 0.019
  MB/rice          7    NC               0.266 ± 0.019
  MB/wheat         3    NC               0.264 ± 0.019
  MB/Amy1          5    NC               0.249 ± 0.018
  MB/Amy2          5    NC               0.278 ± 0.019
  MB/Amy3          6    NC               0.267 ± 0.019
Monocot/monocot
  Rice/barley
    Amy1           3    0.461 ± 0.053    0.093 ± 0.010
    Amy2           3    0.671 ± 0.076    0.124 ± 0.012
  Rice/wheat
    Amy1           1    0.461 ± 0.052    0.096 ± 0.010
    Amy2           1    0.559 ± 0.064    0.122 ± 0.012
    Amy3           5    0.782 ± 0.086    0.182 ± 0.015
    Amy1/Amy2      2    0.752 ± 0.092    0.194 ± 0.016
  Wheat/barley
    Amy1           3    0.129 ± 0.028    0.025 ± 0.005
    Amy2           3    0.207 ± 0.030    0.042 ± 0.006

See Fig. 1 for sequence sources. Values are presented as mean ± SD. NC, not calculated.

Because Ks values for monocot/dicot α-amylase genes were not available (Table 1), KA values were used to construct the phylogenetic tree shown in Fig. 1. Assuming that the pattern of gene evolution is divergent, all monocot α-amylase genes were found to share a common ancestor with the mung bean α-amylase gene, indicating that monocot and dicot α-amylase genes were derived from a single progenitor. Since the divergence of monocot/dicot plants, the α-amylase gene in the monocot lineage leading to the grasses (Poaceae) expanded into a multigene family consisting of three subfamilies: Amy1, Amy2, and Amy3.

[Figure 1: dendrogram; horizontal axis, nonsynonymous substitution/site (ticks at 0.10, 0.15, 0.20, 0.25)]

FIG. 1. Phylogenetic relationship of α-amylase genes from rice (R), barley (B), wheat (W), and mung bean. DNA sequences for the exons of 17 α-amylase genes were aligned and a phylogenetic tree was constructed. DNA sequence information was obtained from the following sources: RAmy2A (22), Amy32b (32), CloneE (33), gKAmy155 and gKAmy141 (17), Amy2/54 and Amy1/13 (generously provided by D. Baulcombe and M. Lazarus), Amy6-4 and Amy46 (13), RAmy1A (20), RAmy3A, RAmy3B, and RAmy3C (21), RAmy3D and RAmy3E (8), mung bean (30), Amy3/33 (14).

To estimate the rate of α-amylase gene evolution relative to other nuclear genes, Ks and KA values for six different rice and barley genes were calculated (Table 2). The values for the α-amylase genes (Amy1 and Amy2) were found to be comparable to other nuclear plant genes shown in Table 2 and to those estimated by Wolfe et al. (31).

Table 2. Synonymous (Ks) and nonsynonymous (KA) substitution per site between rice and barley

Gene      L     Ks               KA
Amy1      453   0.461 ± 0.053    0.093 ± 0.010
Amy2      453   0.671 ± 0.076    0.124 ± 0.012
ADH2      379   0.564 ± 0.065    0.090 ± 0.012
Waxy      611   0.704 ± 0.063    0.091 ± 0.009
Lectin    228   0.594 ± 0.108    0.227 ± 0.024
CAB2      266   0.812 ± 0.107    0.119 ± 0.015

Values are presented as mean ± SD. L, number of codons compared. Sequences were obtained from GenBank Release 70.

Use of Sequence Polymorphisms to Determine the Phylogeny of Cereal α-Amylase Genes. DNA sequence comparisons revealed a possible phylogenetic relationship based on the rate-constancy assumption of the molecular clock hypothesis. Alternatively, a phylogeny for the α-amylase genes was derived based on the analysis of two polymorphic regions (small, localized insertions) in the aligned sequences. Insertions in DNA are generally considered to be rare events compared to nucleotide substitutions and tend to be maintained after gene duplication (34). Therefore, genes sharing the same insertion are probably derived from a common ancestor and should be classified into the same group. Examination of the aligned α-amylase DNA sequences revealed two distinct regions that are phylogenetically informative (Figs. 2 and 3). These regions of sequence polymorphisms are called "α-amylase signature regions" since they can be used to define the subfamily of a particular cereal α-amylase gene without the need to determine genetic distance.

The first of these regions (α-amylase signature I) is located around the junction between the signal peptide and the amino-terminal end of the mature peptide (Fig. 2). On the basis of sequence differences in this region, the α-amylase genes could be organized into three groups consistent with the Amy1, Amy2, and Amy3 subfamilies currently used to describe the wheat α-amylase genes. This signature region has the following characteristics: TCC GGG --- --- for the Amy1 genes, TCC GGC CAT --- for the Amy2 genes, and --- --- CA(A/G) GCN for the Amy3 genes. Although rice α-amylase genes RAmy3A and RAmy3D are at variance with other genes in the Amy3 subfamily, their classification as such is supported by genetic distance measurements (Fig. 1) and sequence characteristics in the α-amylase signature II region (Fig. 3).

             61                                               99
Mung bean    TCT TCC CCT GCC ... ... ... ... ... TTG CTG TTT CAG

Amy3
RAmy3E       TCC AGC TTA GCA ... ... CAA GCC CAA GTT CTC TTC CAG
W-Amy3/33    TCC AGC TTA GCA ... ... CAG GCT CAA ATT CTT TTC CAG
RAmy3B       TCT CAC TTG GCC ... ... CAA GCC CAG GTC CTC TTC CAG
RAmy3C       TCT CAC TTA GCC ... ... CAG GCT CAG GTT CTC TTT CAG
RAmy3A       CCC GAC GTC GCG CAC GCG CAG ACG CAG ATC CTC TTC CAG
RAmy3D       ACC TGT AAC TCG ... GGT CAA GCA CAG GTC CTC TTC CAG

Amy1
B-Amy46      TGC AGC TTG GCC TCC GGG ... ... CAA GTC CTG TTT CAG
B-gKAmy141   TGC AGC TTG GCC TCT GGG ... ... CAA GTC CGG TTT CAG
W-Amy1/13    GCC AGT TTG GCC TCT GGC ... ... CAA GTC CTG TTT CAG
B-Amy6-4     GCC AGC TTG GCC TCC GGG ... ... CAA GTC CTC TTT CAG
RAmy1A       TCC AAC TTG ACA GCC GGG ... ... CAA GTC CTG TTT CAG

Amy2
B-CloneE     GCC GGG TTG GCG TCC GGC CAC ... CAA GTC CTC TTT CAG
W-Amy2/54    GCC GGA TTT GCG TCC GGC CAT ... CAA GTT CTC TTT CAG
B-Amy32b     GCC GGG TTG GCG TCC GGC CAT ... CAA GTC CTC TTT CAG
B-gKAmy155   GCC GGG TTG GCA TCC GGC CAT ... CAG GTC CTG TTT CAG
RAmy2A       CTC GGC TTG GCT TCC GGC GAC ... AAG ATT CTC TTC CAG

Consensus    -CC -GC TTG GC- -C- GG- CA- --- CAA GTC CTC TTT CAG

FIG. 2. Partial DNA sequence alignment of the α-amylase genes. DNA sequences of 17 α-amylase genes from rice, barley, wheat, and mung bean were aligned using the PILEUP command of the GCG DNA Sequence Analysis Software (27). Gaps were inserted only between codons and in multiples of three. Sources of DNA sequences are indicated in the legend to Fig. 1.

The α-amylase signature II region is located between nucleotide positions 769 and 855 of the consensus sequence (Fig. 3). In this region, sequences are highly conserved, as revealed in the consensus sequence. Based on the sequence polymorphisms found in these two regions, the α-amylase genes can be sorted into two groups: those genes without insertions (AmyB) and those genes with either a 6- or 9-bp insertion (AmyA). The genes in the AmyA group can be further divided into Amy1 and Amy2, depending on the sequence polymorphisms shown in Fig. 2. This classification scheme is in complete agreement with a phylogeny based on genetic distance (Fig. 1).
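The two-step classification described above can be expressed as a short sketch. The Python below is an illustration only: the function names, the use of `...` as the gap symbol, and the reduction of signature I to a presence/absence pattern over its four codon columns are simplifying assumptions made here, not the authors' procedure. The example codons are copied from Fig. 2, and the insertion lengths follow the 6- or 9-bp rule stated for signature II.

```python
GAP = "..."


def gene_class(insertion_bp):
    """Gene class from the signature II insertion length in base pairs."""
    if insertion_bp == 0:
        return "AmyB"
    if insertion_bp in (6, 9):
        return "AmyA"
    raise ValueError(f"unexpected insertion length: {insertion_bp}")


# Presence (True) or gap (False) in each of the four signature I columns.
SIGNATURE_I = {
    (True, True, False, False): "Amy1",  # TCC GGG --- ---
    (True, True, True, False): "Amy2",   # TCC GGC CAT ---
    (False, False, True, True): "Amy3",  # --- --- CA(A/G) GCN
}


def subfamily(signature_i_codons):
    """Subfamily from the signature I gap pattern, or None if atypical."""
    pattern = tuple(codon != GAP for codon in signature_i_codons)
    return SIGNATURE_I.get(pattern)


# Signature I columns for four genes, as they appear in Fig. 2.
examples = {
    "B-Amy46": ["TCC", "GGG", "...", "..."],
    "B-CloneE": ["TCC", "GGC", "CAC", "..."],
    "RAmy3E": ["...", "...", "CAA", "GCC"],
    "RAmy3A": ["CAC", "GCG", "CAG", "ACG"],
}
results = {name: subfamily(codons) for name, codons in examples.items()}
print(results)
```

RAmy3A returns `None` under this simplified rule, which matches the remark that it is at variance with the other Amy3 genes in signature I and is placed by genetic distance and signature II instead.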

Barley Contains Amy3 Gene: The AmyB Class. Prior to our research, Amy3-type genes had been reported only in wheat (14). No gene belonging to this subfamily had been reported for barley or any other member of the grass family. If our assumption of a common α-amylase ancestor is correct, one would predict that the barley genome should contain at least one Amy3 gene. To test this possibility we took advantage of the signature region shown in Fig. 3 and used flanking primers to search for an Amy3-type gene in barley genomic DNA. According to this strategy, one would predict PCR products of 78 bp for the AmyB genes (which include members of the Amy3 subfamily) and 84-bp and 87-bp products for the AmyA genes.

[Figure 3: nucleotide alignment of consensus positions 769-855 for mung bean, the AmyB genes (RAmy3E, W-Amy3/33, RAmy3B, RAmy3C, RAmy3A, RAmy3D), and the AmyA genes (B-Amy46, B-gKAmy141, W-Amy1/13, B-Amy6-4, RAmy1A, B-CloneE, W-Amy2/54, B-Amy32b, B-gKAmy155, RAmy2A), followed by the consensus sequence and the two oligonucleotide primers]

FIG. 3. Partial DNA sequence alignment of the α-amylase genes. Sequences were aligned as described in the legend to Fig. 2. The primers for DNA amplification of AmyA and AmyB genes are shown below the consensus sequence. The monocot primers were synthesized based on the consensus sequence, whereas the dicot primers were synthesized based only on the mung bean cDNA sequence. Nucleotide coordinates are based on the consensus sequence starting from the A of the first AUG codon and excluding introns.

The validity of this strategy was confirmed by PCR amplification of the appropriate positive controls (Fig. 4). For example, when primers 138 and 139 were used to amplify two rice α-amylase cDNA clones, one corresponding to an AmyA gene (20) and the other corresponding to an AmyB gene (8), PCR products of about 87 bp (Fig. 4, lane 2) and 78 bp (Fig. 4, lane 3) could be observed. Furthermore, when these same primers were used to amplify two barley cDNA clones, one corresponding to an Amy1 gene and the other to an Amy2 gene, PCR products of about 84 bp (Fig. 4, lane 6) and 87 bp (Fig. 4, lane 5) were obtained. Under the electrophoresis conditions used in this study, the 84-bp and 87-bp products comigrate as one band when run together in the same lane or when amplified from genomic DNA. This can be seen when genomic DNA from rice and wheat (T. aestivum) was amplified. The PCR products of AmyA genes (84 bp and 87 bp) and AmyB genes (78 bp) could be clearly seen in those lanes containing rice (Fig. 4, lane 4) and wheat (Fig. 4, lane 8) amplification products. These results demonstrate that the primers and the amplification conditions used were suitable for detecting AmyB genes in the genomes of barley and other species. The amplification of barley genomic DNA (Fig. 4, lane 7) produced two bands that matched those produced by the rice and wheat genomic DNAs and the two rice cDNA clones. The lower of these bands corresponds to the 78-bp product characteristic of the AmyB gene class. The faintness of this band is probably due to the low copy number of this gene class in the barley genome.

AmyA and AmyB Genes Present in the Grass Family. To determine if AmyA and AmyB genes are present in other members of the grass family, we used PCR amplification of the signature II region to examine genomic DNAs from eight species (Fig. 5). All of them exhibited the 78-bp band characteristic of the AmyB gene class and the 84- and 87-bp bands characteristic of the AmyA gene class. Except in the case of barley, all genomic DNAs produced AmyA and AmyB bands of approximately equal intensity. Assuming the efficiency of amplification is the same for both gene classes, these species appear to have about equal numbers of AmyA and AmyB genes. In the case of barley, the AmyA class (which contains the Amy1 and Amy2 subfamily genes) appears to be the predominant gene class.

[Figure 5: gel image; lanes 1-9 with molecular size markers]

FIG. 5. PCR amplification of the signature regions of selected monocot plant genomic DNA. Lane 1, rice; lane 2, barley cv. Klages; lane 3, barley cv. Himalaya; lane 4, wheat (hexaploid); lane 5, corn; lane 6, rye; lane 7, rye grass; lane 8, oat; lane 9, wheat (diploid); and lane M, molecular size markers (see Fig. 4).

When this analysis was expanded to include genomic DNA from other members of the grass family, PCR products indicative of AmyA and AmyB gene classes were observed (Fig. 5). However, when genomic DNAs from distantly related monocots such as Z. pendula or dicot species (e.g., mung bean) were amplified, only the AmyB class genes were detected (Fig. 6). In the case of potato genomic DNA, no bands were observed. In the latter two instances, primers based on the mung bean α-amylase gene sequence (Fig. 3) were used for PCR. The absence of PCR products in the case of potato DNA suggests nucleotide sequence divergence in one or both of the primer regions. Similar negative results were obtained with genomic DNA from Brassica and Arabidopsis (data not shown). Presumably, as more sequence information on dicot α-amylases becomes available, it should be possible to design new primer sets that will enable us to extend this investigation to a wider range of species.
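The band-size reasoning used in this survey can be written out as a short sketch. The primer sequences and the 78-, 84-, and 87-bp product sizes are taken from the Methods; the function names and the idea of encoding a gel lane as a set of called band sizes are assumptions introduced here for illustration, and a real gel would first need its bands sized against the markers.

```python
# Primer sequences as given in Materials and Methods (5' to 3').
PRIMER_139 = "CGGCAGGAGCTCGTCAACTGGG"    # 22-mer, anneals 5' to the signature region
PRIMER_138 = "GAGCATGCCCTTGGTGGTGAAGTC"  # 24-mer, located 3' to the signature region

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


AMYB_PRODUCT_BP = 78  # product size for genes with no insertion


def insertion_bp(product_bp):
    """Insertion length implied by a PCR product size."""
    return product_bp - AMYB_PRODUCT_BP


def classes_in_lane(band_sizes):
    """Gene classes implied by the set of band sizes called in one lane."""
    found = set()
    for size in band_sizes:
        extra = insertion_bp(size)
        if extra == 0:
            found.add("AmyB")
        elif extra in (6, 9):
            found.add("AmyA")
    return sorted(found)


grass_lane = classes_in_lane({78, 84, 87})  # pattern reported for the grasses
zebrina_lane = classes_in_lane({78})        # pattern reported for Z. pendula
empty_lane = classes_in_lane(set())         # no bands, as reported for potato
print(grass_lane, zebrina_lane, empty_lane)
print(reverse_complement(PRIMER_138))       # sense-strand sequence at the 3' primer site
```

Because the 84- and 87-bp products comigrate under the electrophoresis conditions described, a lane can support the presence of AmyA genes but not distinguish a 6-bp from a 9-bp insertion; the sketch keeps the two sizes separate only to make the arithmetic explicit.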

DISCUSSION

α-Amylase Genes in Plants Consist of Two Classes: AmyA and AmyB. Based on isozyme and DNA sequence analysis, barley and wheat α-amylase genes can be classified into two groups: the high pI, type A or Amy1 group and the low pI, type B or Amy2 group (13, 15, 17, 36). Recently, an additional subfamily of α-amylase genes (Amy3) has been identified in wheat. The genes in Amy3 are much less homologous to the wheat Amy1 and Amy2 genes than these genes are to each other (14). One of the wheat Amy3 genes, Amy3/33, is only expressed at low levels in immature seeds and no orthologous gene has been detected in barley (14). Our studies indicate that 6 of the 10 rice α-amylase genes belong to the Amy3 subfamily (8, 9, 21). Unlike the wheat Amy3 gene, however, members of this gene subfamily in rice are expressed in germinating seeds as well as in other tissues of the plant (8).

From the data shown in Figs. 1-3, it is clear that the α-amylase genes can be divided into two main classes (AmyA and AmyB) and three subfamilies (Amy1, Amy2, and Amy3). The ability to classify the cereal α-amylase genes in this way allowed us to conduct a cursory survey of the distribution of these gene classes in other grasses using PCR (Figs. 3-5). These results indicate that AmyA and AmyB genes are present in all grass species examined. Furthermore, the relative proportion of genes in these two classes is approximately equal with the exception of barley. In the latter case, AmyB gene(s) were present in low copy number. It is not clear why the distribution of genes within the AmyA and AmyB classes varies so drastically between rice and barley (Figs. 1 and 4). It is tempting to speculate that this asymmetry is due to the habitats or growth conditions for which rice and barley were selected (e.g., tropical/subtropical vs. temperate climates and relatively slow vs. rapid germination). However, careful investigation of other grass genera will be required before such a hypothesis can be safely formulated.

FIG. 4. Amplification of rice, barley, and wheat genomic DNA using primers 138 and 139. The same PCR conditions were used for all template DNAs. The absence of bands in lane 1 (no DNA) indicates that the PCRs were free of contaminating templates and artifact bands. Lane 2, RAmy1A (8); lane 3, RAmy3D (8); lane 4, rice genomic DNA; lane 5, CloneE (33); lane 6, pM/C (35); lane 7, barley genomic DNA; lane 8, wheat genomic DNA; and lane 9, molecular size markers: pBluescript KS- digested with restriction enzyme Hae III.

[Figure 6: gel image; lanes 1-4, with band positions marked at 84-87 bp and 78 bp]

FIG. 6. Amplification of signature region of genomic DNA of Z. pendula and mung bean. The DNA of Z. pendula was amplified with the primers used in Figs. 4 and 5. The mung bean primers are shown in Fig. 3. The PCR products of rice (lane 1), Z. pendula (lane 2), mung bean (lane 3), and potato (lane 4) were resolved on a 10% polyacrylamide gel.

The purpose of maintaining multiple isozymes for α-amylase is not known, although slight differences in the enzymatic properties may make one isozyme better suited to a particular substrate or intracellular environment than another. This notion is supported by studies indicating that wheat α-amylase isozymes I and II adsorb and degrade starch granules at different rates (37, 38). Moreover, we have observed that the rice α-amylase isozyme encoded by the RAmy1A gene has a low pH optimum, whereas the RAmy3D isozyme is active over a broad pH range (M. Terashima and R.L.R., unpublished observation). We believe such enzymatic differences may reflect adaptive changes in the physiology and morphology of each species in response to the climatic and edaphic requirements of their respective habitats. It remains to be seen whether differences in the distribution of genes within the AmyA and AmyB classes are associated with these adaptive changes.

Evolution of Plant α-Amylase Genes. On the basis of nucleotide sequence information, the phylogeny of the α-amylase genes shows that monocot and dicot α-amylase genes are derived from a common ancestor. Therefore, the α-amylase genes in the monocot lineage must have resulted from a duplication to form AmyA and AmyB genes. The PCR results (Figs. 4-6) suggest that AmyA genes are restricted to the grass family and arose from a duplication of an AmyB gene after the grass family separated from the ancestors of the Commelinaceae, to which Z. pendula belongs. Subsequent duplications of the AmyA and AmyB genes produced the Amy1, Amy2, and Amy3 genes that comprise the subfamilies shown in Fig. 1. Since all Amy3 genes, as well as the mung bean α-amylase gene, lack the 6- and 9-bp insertions diagnostic of the AmyA class (Fig. 3), we assume that the ancestral gene that preceded the monocot-dicot divergence also lacked these insertions. Furthermore, the evolution of the AmyA class can be explained by a 9-bp insertion event before the separation of rice and wheat/barley to produce an Amy2-type gene, followed by a 3-bp deletion event to produce an Amy1-type gene. This would explain the presence of the 9-bp insertion in the Amy1 and Amy2 genes of rice. That the 3-bp deletion occurred in the Amy1 genes before the divergence of barley and wheat is supported by the observation that all known Amy1 genes in barley and wheat contain this 3-bp deletion. The possibility that the Amy1 genes evolved from Amy3 genes by a 6-bp insertion, followed by an additional 3-bp insertion to produce the Amy2 genes, cannot be ruled out at this time. Further analysis of the signature regions in other subfamilies, such as the Arundinoideae, Chloridoideae, and the Panicoideae, should answer this question.

The picture emerging for the cereal α-amylases raises several questions of interest to evolutionists and molecular biologists. For example, why and how has this diversity evolved? Has it aided grasses in occupying the enormous range of habitats that are characteristic of this family? How do differences in the distribution of α-amylase genes correlate with molecular data on plant ribosomal and chloroplast genes? The experimental approach described in this paper should provide answers to these and other important questions in molecular evolution.

We thank Michael Clegg and John Gillespie for critical comments on this manuscript and Steve Reinl and John Chandler for their expert technical assistance.

1. Brown, H. T. & Morris, G. H. (1890) J. Chem. Soc. 57, 458.

2. Akazawa, T., Mitsui, T. & Hayashi, M. (1988) Biochem. Plants 14, 465-492.

3. Fincher, G. B. (1989) Annu. Rev. Plant Physiol. Plant Mol. Biol. 40, 305-346.

4. Nakajima, R., Imanaka, T. & Aiba, S. (1986) Appl. Microbiol. Biotechnol. 23, 355-360.

5. O'Neill, S. D., Kumagai, M. H., Majumdar, A., Huang, N., Sutliff, T. D. & Rodriguez, R. L. (1990) Mol. Gen. Genet. 221, 235-244.

6. Miyata, S. & Akazawa, T. (1982) Plant Physiol. 70, 147-153.

7. Daussant, J., Miyata, S., Mitsui, T. & Akazawa, T. (1983) Plant Physiol. 71, 88-95.

8. Huang, N., Koizumi, N., Reinl, S. & Rodriguez, R. L. (1990) Nucleic Acids Res. 18, 7007-7014.

9. Ranjhan, S., Litts, J. C., Foolad, M. & Rodriguez, R. L. (1991) Theor. Appl. Genet. 82, 481-488.

10. Rodriguez, R. L., Huang, N., Sutliff, T. D., Ranjhan, S., Karrer, E. & Litts, J. (1992) Rice Genetics II: Proceedings of the Second International Rice Genetics Symposium (Int. Rice Res. Inst., Los Banos, Philippines), pp. 417-429.

11. Muthukrishnan, S., Gill, B. S., Swegle, M. & Chandra, G. R. (1984) J. Biol. Chem. 259, 13637-13639.

12. Gale, M. D., Law, C. N., Chojecki, A. J. & Kempton, R. A. (1983) Theor. Appl. Genet. 64, 309-316.

13. Khursheed, B. & Rogers, J. C. (1988) J. Biol. Chem. 263, 18953-18960.

14. Baulcombe, D. C., Huttly, A. K., Martienssen, R. A., Barker, R. F. & Jarvis, M. G. (1987) Mol. Gen. Genet. 209, 33-40.

15. Huttly, A. K., Martienssen, R. A. & Baulcombe, D. C. (1988) Mol. Gen. Genet. 214, 232-240.

16. MacGregor, E. A. & MacGregor, A. W. (1987) CRC Crit. Rev. Biochem. 5, 129-142.

17. Knox, C. A. P., Sonthayanon, B., Chandra, G. R. & Muthukrishnan, S. (1987) Plant Mol. Biol. 9, 3-17.

18. Aoyagi, K., Sticher, L., Wu, M. & Jones, R. L. (1990) Planta 180, 333-340.

19. Akazawa, T. & Hara-Nishimura, I. (1985) Annu. Rev. Plant Physiol. 36, 441-472.

20. Huang, N., Sutliff, T. D., Litts, J. C. & Rodriguez, R. L. (1990) Plant Mol. Biol. 14, 655-668.

21. Sutliff, T. D., Huang, N., Litts, J. C. & Rodriguez, R. L. (1991) Plant Mol. Biol. 16, 579-591.

22. Huang, N., Reinl, S. J. & Rodriguez, R. L. (1992) Gene 111, 223-228.

23. Ou-Lee, T., Turgeon, R. & Wu, R. (1988) Proc. Natl. Acad. Sci. USA 85, 6366-6369.

24. Yu, S., Tai, Y., Goldman, S., Chuu, Y., Ou-Lee, T. & Wu, R. (1990) in Structure and Function of Nucleic Acids and Proteins, eds. Wu, Y. & Wu, C. (Raven, New York), pp. 287-295.

25. Murray, M. G. & Thompson, W. F. (1980) Nucleic Acids Res. 8, 4321-4325.

26. Rogers, S. O. & Bendich, A. J. (1985) Plant Mol. Biol. 5, 69-76.

27. Devereux, J., Haeberli, P. & Smithies, O. (1984) Nucleic Acids Res. 12, 387-395.

28. Li, W., Wu, C. & Luo, C. (1985) Mol. Biol. Evol. 2, 150-174.

29. Nei, M. (1987) Molecular Evolutionary Genetics (Columbia Univ. Press, New York), pp. 287-326.

30. Koizuka, N., Tanaka, Y. & Morohashi, Y. (1990) Plant Physiol. 94, 1488-1491.

31. Wolfe, K. H., Sharp, P. M. & Li, W. (1989) J. Mol. Evol. 29, 208-211.

32. Whittier, R. F., Dean, D. A. & Rogers, J. C. (1987) Nucleic Acids Res. 15, 2515-2535.

33. Rogers, J. C. & Milliman, C. (1983) J. Biol. Chem. 258, 8169-8174.

34. Smith, T. F., Waterman, M. S. & Fitch, W. M. (1981) J. Mol. Evol. 18, 38-46.

35. Rogers, J. C. (1985) J. Biol. Chem. 260, 3731-3738.

36. Jacobsen, J. V. & Higgins, T. J. V. (1982) Plant Physiol. 70, 1647-1653.

37. Sargeant, J. G. (1978) Starch 30, 160-163.

38. Sargeant, J. G. (1979) in The Biochemistry of Cereals, eds. Laidman, D. L. & Wyn, R. G. (Academic, New York), pp. 339-343.
