
Evolutionary history of the GH3 family of acyl adenylases in rosids

Rachel A. Okrent • Mary C. Wildermuth

Received: 8 October 2010 / Accepted: 10 April 2011

© Springer Science+Business Media B.V. 2011

Abstract GH3 amino acid conjugases have been identified in many plant and bacterial species. The evolution of GH3 genes in plant species is explored using the sequenced rosids Arabidopsis, papaya, poplar, and grape. Analysis of the sequenced non-rosid eudicots monkey flower and columbine, the monocots maize and rice, as well as spikemoss and moss is included to provide further insight into the origin of GH3 clades. Comparison of co-linear genes in regions surrounding GH3 genes between species helps reconstruct the evolutionary history of the family. Combining analysis of synteny with phylogenetics, gene expression and functional data redefines the Group III GH3 genes, of which AtGH3.12/PBS3, a regulator of stress-induced salicylic acid metabolism and plant defense, is a member. Contrary to previous reports that restrict PBS3 to Arabidopsis and its close relatives, PBS3 syntelogs are identified in poplar, grape, columbine, maize and rice, suggesting descent from a common ancestral chromosome dating to before the eudicot/monocot split. In addition, the clade containing PBS3 has undergone a unique expansion in Arabidopsis, with expression patterns for these genes consistent with specialized and evolving stress-responsive functions.

Keywords GH3 · Rosids · Phylogeny · Synteny · Acyl adenylase · Salicylic acid · Phytohormone

Abbreviations

Compounds
BTH  1,2,3-benzothiodiazole-7-carbothioic acid S-methyl ester
IAA  Indole-3-acetic acid
JA  Jasmonic acid
SA  Salicylic acid

Genes
bZIP  Basic-domain leucine-zipper
ERF  Ethylene response factor
GDG1  GH3-like defense gene 1
GH3  Gretchen Hagen 3
ICS1  Isochorismate synthase 1
JAR1  Jasmonic acid resistant 1
PBS3  avrPphB susceptible 3
WIN3  HopW1-1-interacting 3

Organisms
Ac  Aquilegia coerulea (columbine)
At  Arabidopsis thaliana
Cp  Carica papaya (papaya)
Mg  Mimulus guttatus (monkey flower)
Os  Oryza sativa (rice)
Pp  Physcomitrella patens (moss)
Pt  Populus trichocarpa (poplar)
Sm  Selaginella moellendorffii (spikemoss)
Vv  Vitis vinifera (grape)
Zm  Zea mays (maize)

Terms
ML  Maximum likelihood
MP  Maximum parsimony
NJ  Neighbor joining
PID  Percent identity
WGD  Whole genome duplication

Accession numbers: AtGH3.1, At2g14960; AtGH3.2, At4g37390; AtGH3.3, At2g23170; AtGH3.4, At1g59500; AtGH3.5, At4g27260; AtGH3.6, At5g54510; AtGH3.7, At1g23160; AtGH3.8, At5g51470; AtGH3.9, At2g47750; AtGH3.10, At4g03400; AtGH3.11, At2g46370; AtGH3.12, At5g13320; AtGH3.13, At5g13350; AtGH3.14, At5g13360; AtGH3.15, At5g13370; AtGH3.16, At5g13380; AtGH3.17, At1g28130; AtGH3.18, At1g48670; AtGH3.19, At1g48660.

Electronic supplementary material The online version of this article (doi:10.1007/s11103-011-9776-y) contains supplementary material, which is available to authorized users.

R. A. Okrent · M. C. Wildermuth (corresponding author)
Department of Plant and Microbial Biology, University of California, 221 Koshland Hall, Berkeley 94720, USA
e-mail: [email protected]

Present Address: R. A. Okrent
Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA

Plant Mol Biol
DOI 10.1007/s11103-011-9776-y

Introduction

GH3 (Gretchen Hagen 3) genes were originally identified in Glycine max (soybean) as responsive to the phytohormone auxin (Hagen et al. 1984) and have since been identified in many plant species (Terol et al. 2006). Several Arabidopsis thaliana genes were identified in genetic screens for altered phytohormone-mediated responses to auxin [e.g., DFL1 (Nakazawa et al. 2001)] or jasmonic acid [e.g., JAR1 (Staswick et al. 1992)]. However, the molecular function of these genes remained unknown until Staswick et al. identified structural similarity between the A. thaliana GH3 proteins and the firefly luciferase-like superfamily of proteins (Staswick et al. 2002). The firefly luciferase-like superfamily, also called the adenylate-forming superfamily, is a diverse group of enzymes that catalyze the addition of AMP to carboxyl groups on a wide variety of substrates. This family includes nonribosomal peptide synthetases, 4-coumarate-CoA ligases, acyl-CoA ligases, and oxidoreductases (Conti et al. 1996). These enzymes typically contain three conserved motifs that form a binding pocket for AMP and the substrate (Chang et al. 1997). Staswick et al. identified the three conserved motifs in the A. thaliana GH3 proteins. Furthermore, in vitro activity assays revealed that one, JAR1 (GH3.11), catalyzes the addition of amino acids to the plant hormone jasmonic acid (Staswick et al. 2002) and that several others catalyze the addition of amino acids to auxin (Staswick et al. 2005). PBS3 (GH3.12), among others, was not active on any of the phytohormone substrates tested (Staswick et al. 2002). Our subsequent work determined that 4-substituted benzoates serve as substrates of PBS3 (Okrent et al. 2009).

Previously published phylogenetic trees constructed using distance methods divided plant GH3 proteins into three major clades, identified as Groups I, II, and III (Felten et al. 2009; Staswick et al. 2002; Terol et al. 2006). Groups I and II contained genes from many more species than did Group III, which contained genes from only three species, Arabidopsis thaliana, Brassica napus (rapeseed), and Gossypium hirsutum (cotton), all in the Eurosids II subclade of the rosids superfamily of eudicotyledonous plants (Terol et al. 2006). Substrate specificity of assayed proteins tended to correspond to these phylogenetic relationships (Staswick et al. 2002, 2005). The JAR1 enzyme active on JA was placed in GH3 Group I (comprising 2 GH3 genes in A. thaliana), with GH3 enzymes active on IAA in Group II (8 AtGH3 genes), and enzymes including PBS3 active on neither of these compounds in Group III (9 AtGH3 genes).

We are interested in the evolutionary history of the GH3 family as a means of gaining insight into the evolved functions of these enzymes in plants. As mentioned above, this family of enzymes catalyzes the amino acid conjugation of small molecules, including the hormones IAA and JA. In doing so, these GH3 proteins alter the activity of the hormone and its extensive impact on plant metabolism and physiology. For example, JAR1 catalyzes the conjugation of Ile to JA forming JA-Ile, the active form of the hormone, resulting in the degradation of a JA repressor protein and the subsequent activation of downstream transcriptional responses (Chini et al. 2007). The function of the Group III enzymes has remained more elusive, as a substrate for only one enzyme, PBS3, has been identified (Okrent et al. 2009). Though PBS3 does not act directly on the phytohormone SA, its function is required for full activation of SA-dependent defense responses (Nobuta et al. 2007; Jagadeeswaran et al. 2007; Lee et al. 2007). If the Group III GH3 genes were only present in a small group of related species, it would suggest the encoded enzymes evolved a new function. Possibly, this function, e.g., acting on a unique substrate, would be specifically required by these species or confer a growth or reproductive benefit.

Herein, we explore the evolutionary history of the GH3 family, focusing particularly on Group III. The recent sequencing of multiple plant genomes coupled with new computational tools such as the CoGe suite of comparative genomics programs (Lyons and Freeling 2008; Lyons et al. 2008) allows us to leverage information from whole genomes to infer descent from a common ancestral gene by analyzing co-linearity of neighboring genes. We performed our analysis focusing primarily on genome sequence data from the rosids Arabidopsis thaliana and Arabidopsis lyrata, order Brassicales; Carica papaya (papaya), order Brassicales; Populus trichocarpa (poplar), order Malpighiales; and Vitis vinifera (grape), order Vitales. For comparison, we also used genome data from the asterid Mimulus guttatus (monkey flower), order Lamiales; the basal eudicot Aquilegia coerulea (columbine), order Ranunculales; the monocot grasses Oryza sativa (rice) and Zea mays (maize); the lycophyte Selaginella moellendorffii (spikemoss); and the moss Physcomitrella patens (Fig. 1). This syntenic analysis is coupled with investigation of expression patterns of the GH3 genes and available functional information to gain insight on potential gene function. As detailed below, we find that contrary to past reports, Group III GH3 enzymes descended from a common ancestral chromosome dating to before the eudicot/monocot split and are a sister taxon to the Group II IAA-conjugating enzymes. Furthermore, our analyses find the subsequent expansion of Group III GH3s to be consistent with a role in response to (a)biotic stress. Our identification of syntelogs (syntenic orthologs) of key Group I, II, and III members in agronomically important species allows one to prioritize those genes for translational research. Moreover, analysis of conserved genes in syntenic regions may provide insight on ancient and evolving GH3 syntelog function(s), as we have explored for PBS3.

Materials and methods

Identification of GH3 genes and syntenic regions flanking GH3 genes

The CoGe platform for comparison of genome sequences (http://www.synteny.cnr.berkeley.edu/CoGe/; Lyons and Freeling 2008) was used to identify regions of potential synteny between Arabidopsis thaliana (TAIR, v9) and Arabidopsis lyrata (JGI, v1) and other sequenced plant genomes, including Aquilegia coerulea (JGI, v1), Carica papaya [v0.4, ASPGB draft genome (Ming et al. 2008)], Mimulus guttatus (JGI, v1), Oryza sativa [TIGR v5 (Ouyang et al. 2006)], Physcomitrella patens [JGI v1.1 (Rensing et al. 2008)], Populus trichocarpa [JGI, v2 (Tuskan et al. 2006)], Selaginella moellendorffii (JGI v1), Vitis vinifera [v1, French-Italian Public Consortium for Grapevine Genome Characterization (Jaillon et al. 2007)], and Zea mays cultivar B73 [Maizesequence.org v2 (Schnable et al. 2009)]. The regions of potential synteny compiled in Lyons et al. (2008) were used as a starting point, modified with new genome sequence data, evaluated for accuracy, and expanded with additional analyses. The V. vinifera genome, less subject to rearrangements, duplications and gene loss than other genomes (Jaillon et al. 2007; Semon and Wolfe 2007), was used as a bridge between A. thaliana and the other genomes to identify regions of co-linearity and possible synteny. It should be noted that genes from some of the sequenced plants have not yet been assigned to chromosomes (Phytozome 6.0).
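The idea of co-linearity between two genomic regions can be illustrated with a minimal Python sketch. This is not the algorithm used by CoGe or GEvo; it is a simplified stand-in that treats each region as an ordered list of gene-family labels and finds the longest set of shared families that occur in the same order in both regions (a longest common subsequence). The region contents and the function name are hypothetical.

```python
def colinear_genes(region_a, region_b):
    """Return the longest list of gene families that appear in the same
    order in both regions (longest common subsequence)."""
    n, m = len(region_a), len(region_b)
    # table[i][j] = length of the best match between region_a[i:] and region_b[j:]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if region_a[i] == region_b[j]:
                table[i][j] = 1 + table[i + 1][j + 1]
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Walk the table to recover the shared, ordered gene families.
    shared = []
    i = j = 0
    while i < n and j < m:
        if region_a[i] == region_b[j]:
            shared.append(region_a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return shared


# Hypothetical gene orders; the labels are illustrative, not real annotations.
arabidopsis_region = ["kinase", "GH3", "ERF", "unknown", "GH3", "GH3", "transporter"]
grape_region = ["kinase", "protease", "GH3", "ERF", "transporter"]

shared = colinear_genes(arabidopsis_region, grape_region)
print(len(shared), shared)
```

A longer shared, ordered list is stronger evidence that two regions descend from the same ancestral region; real synteny tools additionally score sequence similarity, strand, and gap sizes.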

Four Arabidopsis GH3 genes were used as seeds to identify GH3 genes in the other eudicot genomes. The CDS sequences of PBS3 (AtGH3.12, At5g13320), DFL1 (AtGH3.6, At5g54510), JAR1 (AtGH3.11, At2g46370), and DFL2 (AtGH3.10, At4g03400) were used as BLAST seeds against papaya, grape, poplar, monkey flower, and columbine in CoGe Blast. Similarly, Arabidopsis and rice GH3 sequences were used to identify homologs in maize, and moss GH3 sequences were used to identify homologs in spikemoss. Additional BLASTN searches were performed with the CDS sequence from each species against the genome of origin to find any other possible matches, and results were checked against genes containing the keyword "GH3" in Phytozome v6 (http://www.phytozome.net/) from the Department of Energy's Joint Genome Institute and the Center for Integrative Genomics. Sequence length and gene models were also retrieved from the Phytozome website and compared to BLASTX results. The sequences were aligned using MUSCLE (Edgar 2004), visualized in JalView and evaluated for global alignment and presence of AMP-binding motifs (Chang et al. 1997).

Fig. 1 Plant phylogeny showing number of GH3 genes in modern species and location of paleohexaploidy and paleotetraploidy events in lineages of interest. Figure adapted from Freeling (2009) with phylogenetic information from the Missouri Botanical Garden's Angiosperm Phylogeny Project and Phytozome v6

As two divergent haplotypes of Selaginella were sequenced, a non-redundant set of sequences was composed using a gene set from JGI (http://www.genome.jgi.-psf.org/Selmo1.info.html) and by removing nearly identical sequences. Several sequences were found to be incomplete. GSVIV00026990001 (VvGH3.7) was missing motif II and the protein sequence was shorter than expected by approximately 200 amino acids based on comparison with other GH3 proteins, which are typically around 600 amino acids long. BLAST searches found peptides and ESTs containing the missing sequence information (ESTs gi 110368758 and gi 110698541 and peptide gi 225454466). The corrected sequence length and exon number are shown in Table 1. Two papaya sequences, EVM prediction supercontig_1065.2 (CpGH3.3) and EVM prediction supercontig_9.204 (CpGH3.4), were both truncated due to missing sequence data. Some additional base pairs in the C-terminus of the p1065.2 gDNA sequence were identified as potentially coding, and added to the CDS sequence. However, this correction did not account for all of the missing sequence at the C-terminus. No ESTs were identified that contain the missing regions for either of the papaya sequences, so they remain incomplete.
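The length comparison used to spot incomplete gene models can be sketched in a few lines of Python. The protein lengths below are the values listed in Table 1; the 550-residue cutoff and the function name are assumptions chosen for illustration (the text only states that GH3 proteins are typically around 600 amino acids long).

```python
# Protein lengths (amino acids) as listed in Table 1 for papaya and for the
# corrected grape model VvGH3.7.
protein_lengths = {
    "CpGH3.1": 599,
    "CpGH3.2": 599,
    "CpGH3.3": 455,
    "CpGH3.4": 491,
    "CpGH3.5": 607,
    "CpGH3.6": 591,
    "VvGH3.7": 583,
}


def flag_truncated(lengths, min_length=550):
    """Return the names of proteins shorter than min_length.

    min_length is an illustrative threshold, not a value from the paper.
    """
    return sorted(name for name, length in lengths.items() if length < min_length)


truncated = flag_truncated(protein_lengths)
print(truncated)
```

With this cutoff the two papaya models described as truncated are flagged, while the corrected VvGH3.7 model passes.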

Sequence alignment and phylogenetic tree construction

CDS and peptide sequences were aligned with MUSCLE (Edgar 2004) via the online phylogeny.fr platform (http://www.phylogeny.fr; Dereeper et al. 2008) using default parameters. Distance (BioNJ), maximum likelihood (PhyML) and maximum parsimony (TNT) tree construction methods were used and MEGA 4.0 (Tamura et al. 2007) was employed to visualize and annotate the phylogenetic trees. Sequences missing AMP binding motifs or the highly conserved C-terminus were omitted from alignments and phylogenetic tree construction. DNA regions containing sequences related phylogenetically were tested for synteny using the GEvo tool of the CoGe browser, described above.
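Distance methods such as neighbor joining start from pairwise dissimilarities between aligned sequences. As a minimal sketch of that first step, the Python function below computes percent identity (PID) between two rows of an alignment. It is an illustration only, not the distance model used by BioNJ on phylogeny.fr; it assumes one common PID convention (columns with a gap in either sequence are skipped), and the two toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns (one of several PID conventions)
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy aligned fragments (hypothetical, not real GH3 sequences).
pid = percent_identity("MKT-LLE", "MRTALLE")
distance = 1.0 - pid / 100.0  # simple p-distance derived from identity
print(round(pid, 1), round(distance, 3))
```

Other PID definitions divide by the full alignment length or by the shorter sequence, so values are only comparable when the same convention is used throughout.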

Table 1 GH3 genes in the rosids Populus trichocarpa, Vitis vinifera, and Carica papaya and their relationship to Arabidopsis thaliana genes through common position on an ancestral chromosome

Name (a)   Locus (b)                 Pos. (c)   Protein length (b)   Exons (b)   Group (d)   At syntelog(s) (d)
PtGH3.1    POPTR_0007s10350          7          597                  3           II          AtGH3.2, AtGH3.3
PtGH3.2    POPTR_0009s09590          9          690                  3           II          AtGH3.1
PtGH3.3    POPTR_0001s30560          1          596                  3           II          AtGH3.1
PtGH3.4    POPTR_0013s14740          13         608                  3           II          –
PtGH3.5    POPTR_0011s13330          11         611                  3           II          AtGH3.5, AtGH3.6
PtGH3.6    POPTR_0001s43990          1          611                  3           II          AtGH3.5, AtGH3.6
PtGH3.7    POPTR_0001s12850          1          606                  4           III         AtGH3.12, AtGH3.17
PtGH3.8    POPTR_0003s15970          3          594                  5           III         AtGH3.12, AtGH3.17
PtGH3.9    POPTR_0002s20790          2          596                  4           III         AtGH3.9
PtGH3.10   POPTR_0013s14050          13         595                  4           I           AtGH3.10
PtGH3.11   POPTR_0019s13450          19         595                  4           I           AtGH3.10
PtGH3.14   POPTR_0014s09120          14         576                  4           I           AtGH3.11
PtGH3.15   POPTR_0002s16960          2          576                  4           I           AtGH3.11
VvGH3.1    GSVIV00007718001          3          598                  4           II          AtGH3.1
VvGH3.2    GSVIV00019610001          7          600                  3           II          AtGH3.2, AtGH3.3
VvGH3.3    GSVIV00027472001          19         605                  4           II          AtGH3.5, AtGH3.6
VvGH3.4    GSVIV00026120001          12         588                  5           II          –
VvGH3.5    GSVIV00027964001          7          596                  4           III         AtGH3.9
VvGH3.6    GSVIV00026000001          12         592                  4           I           AtGH3.10
VvGH3.7    GSVIV00026990001 (e)      15         583                  4           I           AtGH3.11
VvGH3.8    GSVIV00006220001          1          593                  5           III         AtGH3.12, AtGH3.17
CpGH3.1    EVM supercontig_292.1     292        599                  3           II          AtGH3.1
CpGH3.2    EVM supercontig_6.74      6          599                  3           II          AtGH3.2, AtGH3.3
CpGH3.3    EVM supercontig_1065.2    1065       455 (f)              3           II          AtGH3.5, AtGH3.6
CpGH3.4    EVM supercontig_9.204     9          491 (f)              3           III         AtGH3.9
CpGH3.5    EVM supercontig_34.122    34         607                  4           I           AtGH3.10
CpGH3.6    EVM supercontig_1483.1    1483       591                  4           I           AtGH3.11

Syntelogs of PBS3 (AtGH3.12) are the entries that list AtGH3.12 in the last column
(a) Pt names from Felten et al. (2009); PtGH3.15, Vv, Cp and Mg names first described here
(b) Locus names, protein length and exon number from Phytozome v5.0
(c) Position is the chromosome for Pt and Vv, and supercontig for Cp and Mg
(d) Group and At syntelog from synteny analysis described in Materials and methods
(e) Gene model in Phytozome v5.0 corrected based on global alignment and EST data, see Materials and methods
(f) Sequencing error results in missing sequence

Analysis of expression data

The expression patterns of AtGH3 genes were explored using tools from Genevestigator (Zimmermann et al. 2004), NascArrays (Craigon et al. 2004), the eFP browser (Toufighi et al. 2005; Winter et al. 2007) (http://www.bar.utoronto.ca/efp/cgi-bin/efpWeb.cgi) and analysis of the relevant literature. Experiments in which GH3 genes were expressed in response to abiotic or biotic stress or hormone treatments were identified in Genevestigator and the eFP browser, and the data were downloaded. Most experiments analyzed were from the AtGenExpress series (Goda et al. 2008). The NascArrays experiment reference numbers are shown in the corresponding data tables in the "Results" section. Values greater than a 2-fold increase relative to control experiments were reported, and significance was tested using Student's t tests at α = 0.05 for experiments with three replicates. Experiments with two replicates were manually examined for reproducibility of experimental and control samples. PtGH3 and VvGH3 expression patterns were analyzed using the Plant Expression Database [Plexdb, http://www.plexdb.org (Wise et al. 2007)].
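The reporting criterion for three-replicate experiments (more than 2-fold induction and a significant Student's t test at α = 0.05) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis script: it assumes linear-scale signal values, a two-tailed equal-variance t test, and replicates that are not all identical; the function names and expression values are hypothetical. The critical value 2.776 is the two-tailed Student's t value for α = 0.05 with 4 degrees of freedom (three treated plus three control replicates).

```python
from math import sqrt
from statistics import mean, variance

# Two-tailed critical value of Student's t for alpha = 0.05 and df = 4.
T_CRITICAL_DF4 = 2.776


def fold_change(treated, control):
    """Ratio of mean treated signal to mean control signal (linear scale assumed)."""
    return mean(treated) / mean(control)


def student_t(treated, control):
    """Equal-variance (pooled) two-sample t statistic."""
    n1, n2 = len(treated), len(control)
    pooled = ((n1 - 1) * variance(treated) + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    return (mean(treated) - mean(control)) / sqrt(pooled * (1 / n1 + 1 / n2))


def is_induced(treated, control, min_fold=2.0, t_critical=T_CRITICAL_DF4):
    """True if expression rises more than min_fold and the t test is significant."""
    return (fold_change(treated, control) > min_fold
            and abs(student_t(treated, control)) > t_critical)


# Hypothetical signal values for two genes, three replicates each.
strong = is_induced([410, 450, 430], [100, 120, 110])
weak = is_induced([150, 160, 140], [100, 120, 110])
print(strong, weak)
```

If the downloaded values were log-transformed, the fold change would instead be computed from the difference of means; the source does not state which scale was used.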

Analysis of transcription factor binding motifs

Potential transcription factor binding sites 1 kb upstream of AtGH3 transcriptional start sites were identified using the Arabidopsis cis-regulatory element database (AtcisDB, http://www.arabidopsis.med.ohio-state.edu/AtcisDB/; Molina and Grotewold 2005) of the Arabidopsis Gene Regulatory Information Server from Ohio State University. Several of the tandemly duplicated genes have less than 1 kb between them and the next gene: AtGH3.13 (100 bp), AtGH3.15 (288 bp), AtGH3.16 (711 bp) and AtGH3.18 (989 bp). Promoter motifs associated with transcription factor families of interest were tallied and summarized. These plant transcription factors include WRKYs, basic-domain leucine-zippers (bZIPs), MYBs, and dehydration response element binding proteins (DREBs) known to mediate plant response to biotic stress (Singh et al. 2002). Furthermore, ethylene responsive element binding factors (ERFs), auxin response factors (ARFs) and MYC2 transcription factors (Dombrecht et al. 2007) were included to reflect responses mediated by the hormones ethylene, auxin, and jasmonate, respectively. Cis-acting regulatory elements are defined as follows: WRKY transcription factors recognize the W-box (ttgact/c); bZIPs recognize actcat (ATB2/AtbZip53) and acacttg (DPBF1&2) motifs; the MYB and MYB-like transcription factors bind MYB (aaccaaac, taactaac) and aaatct (MYB-related CCA1) motifs; DREBs recognize tgccgacaa, gaccgacct, and aaccgacca motifs; ERFs bind the GCC-box (gccgcc); ARF1 binds tgtctc; and MYC2 binds the G-box (cacgtg), T/G-box (cacgtt), and cacatg.
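The motif definitions above translate directly into a small tallying routine. The Python sketch below counts occurrences of each listed motif in a promoter sequence, grouped by transcription factor family. It is an illustration rather than a reimplementation of AtcisDB: it scans only the given strand (whether the database also scans the reverse complement is not stated here), counts overlapping matches, and uses a made-up toy sequence.

```python
import re

# Motifs as defined in the text; "ttgact/c" is written as the pattern ttgac[tc].
MOTIFS = {
    "WRKY": ["ttgac[tc]"],
    "bZIP": ["actcat", "acacttg"],
    "MYB": ["aaccaaac", "taactaac", "aaatct"],
    "DREB": ["tgccgacaa", "gaccgacct", "aaccgacca"],
    "ERF": ["gccgcc"],
    "ARF": ["tgtctc"],
    "MYC2": ["cacgtg", "cacgtt", "cacatg"],
}


def tally_motifs(promoter):
    """Count motif occurrences per transcription factor family on one strand."""
    seq = promoter.lower()
    counts = {}
    for family, patterns in MOTIFS.items():
        total = 0
        for pattern in patterns:
            # A lookahead lets overlapping occurrences each be counted once.
            total += len(re.findall("(?=" + pattern + ")", seq))
        counts[family] = total
    return counts


# Toy promoter fragment (hypothetical, not an Arabidopsis sequence).
toy_promoter = "ATTGACTGGCCGCCGCCAACACGTGTTGACC"
counts = tally_motifs(toy_promoter)
print(counts)
```

For genes with less than 1 kb to the neighboring gene, the input would simply be the shorter intergenic sequence.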

Results

Identification of GH3 genes

Potential orthologs of the AtGH3 genes were identified by BLASTN search using CoGe Blast with AtGH3, OsGH3, and PpGH3 genes as seeds, aligned using MUSCLE (Edgar 2004), visualized in JalView and evaluated for global alignment and presence of AMP-binding motifs and the highly conserved C-terminal domain (see Materials and methods). For the rosids, in addition to the 19 Arabidopsis thaliana GH3 genes, 6 GH3 genes were identified in Carica papaya, 13 in Populus trichocarpa [one newly described gene plus 12 described in Felten et al. (2009)], and 8 in Vitis vinifera (Fig. 1, Table 1).

For comparison, we also identified GH3 genes in the non-rosid eudicots Mimulus guttatus (6 GH3 genes) and Aquilegia coerulea (10 genes), as well as in the monocot grasses Zea mays (13 GH3 genes) and Oryza sativa [13 genes; previously described in Jain et al. (2006)], and in Selaginella moellendorffii (20 GH3 genes) and Physcomitrella patens [2 genes; previously described in Bierfreund et al. (2004) and Ludwig-Müller et al. (2009)] (Fig. 1, Online Resource 1). It should be noted that there may be additional M. guttatus and C. papaya GH3 genes, as the sequencing and annotation of those genomes are still incomplete.

Loss and gain of GH3 genes in rosids

Analysis of synteny is complicated by gene duplication, either due to whole genome duplication (WGD) or local duplication, and by gene loss or insertion (Lyons et al. 2008). Plant genomes are heavily duplicated, with WGD events quite common in the evolutionary history of many plant lineages. Following episodes of genome duplication, selective gene loss, called fractionation, is typically observed (Freeling 2009). As shown in Fig. 1, the rosids all show evidence of a pre-rosid hexaploidy event with subsequent fractionation. Though the exact classification of V. vinifera remains uncertain, sequence comparisons of chloroplast genomes suggest that V. vinifera is a basal rosid (Jaillon et al. 2007; Jansen et al. 2006) as depicted. Although A. thaliana has the smallest genome of any sequenced plant, two rounds of WGD have occurred since its divergence from the primary rosid lineage (Blanc et al. 2003). One round of WGD has also occurred in the P. trichocarpa lineage. This can lead to as many as four copies in A. thaliana and two in P. trichocarpa for each V. vinifera or C. papaya gene. Though Fig. 1 does not show a WGD event as part of the M. guttatus lineage, several members of the Mimulus genus have been shown to have evidence of polyploidy (Wu et al. 2007). Completion of the M. guttatus annotation will allow future resolution of this issue.
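The copy-number expectation stated above follows from simple doubling: each round of WGD can at most double the number of copies of a gene, before fractionation removes some of them. A minimal Python sketch, using the numbers of WGD rounds given in the text (the dictionary and function names are illustrative):

```python
def max_copies_per_ancestral_gene(wgd_rounds):
    """Upper bound on copies of one ancestral gene if no copy is lost."""
    return 2 ** wgd_rounds


# Rounds of WGD since divergence from the primary rosid lineage, per the text.
wgd_rounds = {
    "A. thaliana": 2,
    "P. trichocarpa": 1,
    "V. vinifera": 0,
    "C. papaya": 0,
}

max_copies = {
    species: max_copies_per_ancestral_gene(rounds)
    for species, rounds in wgd_rounds.items()
}
print(max_copies)
```

These are upper bounds only; observed counts are usually lower because of fractionation, and can be higher where local duplication or insertion adds copies.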

As noted in Fig. 1, our analysis indicates that the common ancestor prior to the pre-rosid hexaploidy WGD event had 3 GH3 genes, leading to nine after the hexaploidy WGD event, one of which was lost in the lineage prior to the divergence of grape. V. vinifera has retained all 8 GH3 genes, but the other rosids analyzed have lost multiple genes (Table 2). Uniquely, A. thaliana lost 21 genes but also gained 7 GH3 genes, due to whole genome duplication, fractionation, local duplication and transposition. Of particular interest, the genes gained are all in Group IIIA defined below.
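The arithmetic in this paragraph can be checked with a short Python sketch. The ancestral count and the hexaploidy (triplication) step come from the text; the per-group Arabidopsis losses and gains are the Table 2 entries, with the IIIA entry read as a loss of 2 and a gain of 7, which is an interpretation of a garbled table cell. Variable names are illustrative.

```python
# Ancestral GH3 complement before the pre-rosid hexaploidy event.
ANCESTRAL_GH3_GENES = 3

after_hexaploidy = ANCESTRAL_GH3_GENES * 3       # hexaploidy triples each gene
before_grape_divergence = after_hexaploidy - 1   # one gene lost before grape diverged

# Arabidopsis (lost, gained) per syntenic group, as read from Table 2.
# The IIIA entry (2 lost, 7 gained) is an assumed reading of the table.
arabidopsis_changes = {
    "IA": (3, 0),
    "IB": (3, 0),
    "IIA1": (2, 0),
    "IIA2": (2, 0),
    "IIB1": (2, 0),
    "IIB2": (4, 0),
    "IIIA": (2, 7),
    "IIIB": (3, 0),
}

total_lost = sum(lost for lost, _ in arabidopsis_changes.values())
total_gained = sum(gained for _, gained in arabidopsis_changes.values())
print(after_hexaploidy, before_grape_divergence, total_lost, total_gained)
```

Under this reading the per-group losses sum to the 21 genes lost and 7 gained that the text reports for A. thaliana, and the eight genes remaining before grape diverged match the eight syntenic groups.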

Identification of GH3 syntenic sets

Using the GEvo tool of the CoGe genome comparison browser (Lyons and Freeling 2008), eight GH3 syntenic sets were identified, incorporating ten genomic regions of Arabidopsis thaliana that flank GH3 genes. These include several areas of A. thaliana chromosomes 1, 2, 4, and 5 and account for 14 of the 19 A. thaliana GH3 (AtGH3) genes. The gene names, locus, chromosome, contig, or scaffold number, predicted protein length, number of exons, and Arabidopsis syntelog are shown in Table 1 for the rosids and Online Resource 1 for the others.

As an example, Fig. 2 shows part of the Arabidopsis PBS3 (AtGH3.12/At5g13320) syntenic set including approximately 40 kb of Arabidopsis chromosome 5 with 100 kb of V. vinifera chromosome 1, 90 kb of P. trichocarpa chromosome 1, and 60 kb of P. trichocarpa chromosome 3. The Arabidopsis chromosome 5 region contains PBS3 (AtGH3.12, At5g13320) and four other AtGH3 (AtGH3.13-16) genes. Though not shown in Fig. 2, 70 kb of Arabidopsis chromosome 1 containing AtGH3.17 is also syntenic (see Table 3 and Online Resource 2). Of the rosids, only C. papaya did not share a region of synteny with PBS3; however, this needs to be reexamined once the final completed and annotated genome is available. Regions of M. guttatus showed evidence of synteny with the region surrounding PBS3 (see Table 3); however, none of these M. guttatus regions contain a GH3 sequence. It remains possible that a GH3 sequence resides in the missing region of M. guttatus scaffold 39 (open boxes in Table 3). As shown in Online Resource 2, Aquilegia coerulea does contain a syntenic region (mapped to scaffold 49) and GH3 gene, AcGH3.6. Similarly, a region of Oryza sativa chromosome 11 containing OsGH3.13 (Os11g32520) (Terol et al. 2006) and a region of Zea mays containing ZmGH3.13 demonstrate co-linearity of genes with the region containing Arabidopsis PBS3 (Online Resource 2). No syntenic region is detectable in S. moellendorffii or P. patens.

A detailed comparison of the syntenic regions flanking PBS3 shows that the five AtGH3 genes in the Arabidopsis chromosome 5 region correspond to a single gene in each of the other genomes. This suggests that these additional Arabidopsis GH3 genes result from local duplication. Two additional genes (At5g13330 and At5g13340, annotated as an ERF/AP2 transcription factor family member and of unknown function, respectively) are located between PBS3 and the other four AtGH3 genes. One of these, the gene encoding an ERF/AP2 transcription factor, also has syntelogs in many of the species examined. In total, in addition to the GH3 genes, seven Arabidopsis genes in the chromosome 5 region and three in the chromosome 1 region display synteny to genetic regions of other species. These genes and their annotations are summarized in a slightly abbreviated version for the eudicots in Table 3 and more fully for all species examined in Online Resource 2.

Five A. thaliana, one P. trichocarpa (PtGH3.4), one A. coerulea (AcGH3.1) and two O. sativa GH3 genes (OsGH3.7 and OsGH3.12) are not members of syntenic sets (Online Resources 1 and 3); there is no detectable co-linearity between their surrounding genes and those surrounding GH3 genes from other species. In addition, it is not possible to detect co-linearity between the chromosomal regions surrounding the S. moellendorffii and P. patens genes and those of the other species studied (data not shown). Four of the five Arabidopsis genes not in a syntenic set (AtGH3.7, AtGH3.8, AtGH3.18, AtGH3.19) are in Group IIIA (defined below), which contains PBS3, and can be explained by local duplication and gene insertion (Fig. 3). AtGH3.18 (At1g48670) and AtGH3.19 (At1g48660), as well as a severely truncated GH3 gene (At1g48690), likely arose from insertion and duplication (Fig. 3), and there are several retrotransposons nearby (e.g. At1g48680). AtGH3.7 (At1g23160), most similar in protein sequence to PBS3, is present in A. lyrata but not papaya, suggesting that it was inserted into an ancestor of A. thaliana before the divergence from A. lyrata but after the divergence from papaya (data not shown). The region surrounding AtGH3.8 (At5g51470) contains many stress response genes, including PBS2, also known as RAR1, which was identified in the same mutant screen for altered disease resistance as PBS3 (Warren et al. 1999) (data not shown; can be recapitulated at http://www.genomevolution.org/r/37d). Regions of V. vinifera chromosome 16 and P. trichocarpa chromosomes 12 and 15 display co-linear genes with the region around AtGH3.8 (At5g51470), although no GH3 gene is present, suggesting that AtGH3.8 was inserted at this site. Though AtGH3.4 (At1g59500) is grouped with the IIA GH3 proteins (below), synteny was ambiguous as only one other gene in the surrounding region matched genes from other species.

Table 2 Loss and gain of GH3 genes in each syntenic group since divergence from common ancestor

Syntenic group   At      Cp   Pt   Vv
IA               -3      NC   -1   NC
IB               -3      NC   NC   NC
IIA1             -2      NC   -1   NC
IIA2             -2      NC   NC   NC
IIB1             -2      NC   NC   NC
IIB2             -4      -1   -1   NC
IIIA             -2/+7   -1   -1   NC
IIIB             -3      -1   NC   NC
Total            19      5    12   8

Group IIIA contains PBS3. NC no change
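As an illustration only, the notion of co-linearity used in this section can be sketched in a few lines of Python. This is not the CoGe/BLASTz workflow used for the analyses. The sketch assumes that each genomic region has already been reduced to an ordered list of gene-family labels (the labels and their order below are hypothetical) and reports the longest chain of shared labels that occur in the same order in both regions, computed as a longest common subsequence.

```python
def colinear_chain(region_a, region_b):
    """Longest chain of shared gene-family labels found in the same order
    in both regions (a longest common subsequence). A crude proxy for
    co-linearity: it ignores strand, inversions, and sequence similarity."""
    n, m = len(region_a), len(region_b)
    # table[i][j] = length of the best chain using region_a[i:] and region_b[j:]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if region_a[i] == region_b[j]:
                table[i][j] = 1 + table[i + 1][j + 1]
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Walk the table from the start to recover one best chain.
    chain, i, j = [], 0, 0
    while i < n and j < m:
        if region_a[i] == region_b[j]:
            chain.append(region_a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return chain


# Hypothetical gene orders, for illustration only.
region_at = ["GH3", "ERF", "GH3", "GH3", "NEF1",
             "transporter", "isomerase", "transaldolase"]
region_vv = ["GH3", "NEF1", "transporter", "P450",
             "isomerase", "transaldolase", "ERF"]

chain = colinear_chain(region_at, region_vv)
print(len(chain), chain)
```

A GH3 gene flanked by a long chain of this kind would be a candidate member of a syntenic set, whereas a GH3 gene with no shared neighbors would resemble the inserted genes described above.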

Classification of GH3 proteins into groups based on synteny and phylogenetic relationships

Corrected GH3 protein sequences were aligned using MUSCLE (Edgar 2004) and curated by removing gaps manually or using Gblocks (Castresana 2000) on the phylogeny.fr server (Dereeper et al. 2008), as described in “Materials and methods”. The curated alignments were clustered into phylogenetic trees using maximum parsimony (TNT), maximum likelihood (PhyML), and neighbor joining (BioNJ) algorithms. The tree found to best correspond to the syntenic relationships identified above was constructed with PhyML from a multiple sequence alignment curated using Gblocks with relaxed settings. This phylogenetic tree, composed of the eudicot GH3 sequences, is shown in Fig. 4, with the complete phylogenetic tree including eudicot, monocot, moss and spikemoss sequences provided as Online Resource 3.
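The simplest form of alignment curation mentioned above, removal of gapped positions, can be sketched as follows. This is a minimal illustration under stated assumptions, not the Gblocks algorithm (which also weighs conservation and block length): the alignment is assumed to be a dictionary of equal-length strings with "-" as the gap character, and the sequence names and residues are invented.

```python
def strip_gap_columns(alignment):
    """Return a copy of the alignment with every column removed that
    contains a gap ('-') in at least one sequence."""
    seqs = list(alignment.values())
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    keep = [i for i in range(len(seqs[0])) if all(s[i] != "-" for s in seqs)]
    return {name: "".join(seq[i] for i in keep)
            for name, seq in alignment.items()}


# Toy alignment (hypothetical sequences).
toy_alignment = {
    "seqA": "MK-LVT",
    "seqB": "MKALVT",
    "seqC": "MR-LV-",
}
curated = strip_gap_columns(toy_alignment)
print(curated)
```

Stricter or more relaxed curation changes how many columns survive, which is one reason trees built from the same sequences can differ between curation settings.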

The eight sets of GH3 syntenic genes are distributed between Groups I, II, and III. The proteins from all species other than S. moellendorffii and P. patens separate into these three Groups in the phylogenetic trees, with each syntenic set containing one V. vinifera protein. Each of the two Group I sets contains one VvGH3, AtGH3 (AtGH3.11/JAR1 in Set IA and AtGH3.10/DFL2 in Set IB), CpGH3, MgGH3, and AcGH3 protein and two poplar proteins, with Set IA also including monocot grass syntelogs.

Group II contains four VvGH3 proteins, each of which corresponds to a syntenic subset of sequences (IIA1, IIA2, IIB1, and IIB2). Within IIB, IIB1 contains eight proteins including AtGH3.5/WES1 and AtGH3.6/DFL1, other rosid proteins, as well as proteins from Mimulus and Aquilegia, while IIB2 comprises only VvGH3.4 and PtGH3.4. Within subgroup IIA, the sequences are more evenly distributed between the IIA1 and IIA2 syntenic sets, with

[Figure 2 image; region labels: Pt 2, At, Vv, Pt 1; gene labels: AtGH3.13, AtGH3.14, AtGH3.15, AtGH3.16, PBS3 (AtGH3.12)]

Fig. 2 Synteny between the region of Arabidopsis chromosome 5 surrounding PBS3 and grape and poplar chromosomes. A screenshot of the BLASTz output from the CoGe browser is shown. Each large horizontal bar represents one genomic region, with the dashed line dividing the top (5′ on left) and bottom (5′ on right) strand. The genome origin is indicated on the right (At Arabidopsis thaliana, Vv Vitis vinifera, Pt Populus trichocarpa). The colored arrows represent gene models: green are CDS, blue are RNA, and gray are introns. The areas of similarity from BLASTz between genomic regions are shown as colored blocks above or below the gene models. Each pairwise comparison is shown in a different color. Brown and pink lines are drawn between similar regions of At and the other species. Lines connecting other pairwise comparisons were omitted for clarity. The analysis can be regenerated at http://www.genomevolution.org/r/1d9i. The 70 kb syntenic region of Arabidopsis chromosome 1 containing AtGH3.17 is not shown here but is included in Table 3


Table 3 Comparison of syntenic regions including PBS3 (AtGH3.12, At5g13320)

At locus 1 (Ch 5) | At locus 2 (Ch 1) | Vv locus 1 (Ch 1) | Pt locus 1 (Ch 1) | Pt locus 2 (Ch 3) | Mg locus 1 (scaffold 39) | Mg locus 2 (scaffold 7) | Mg locus 3 (scaffold 54) | Annotation
At5g13250 | At1g28080 | None | None | None | None | None | | Unknown, plastid
At5G13320 (AtGH3.12) | At1g28130 (AtGH3.17) | GSVIV00006220001 (VvGH3.8) | POPTR_0001s12850 (PtGH3.7) | POPTR_0003s15970 (PtGH3.8) | None | None | | PBS3, GH3 family protein
At5g13330 | At1g28160 | GSVIV00006201001 | POPTR_0001s12820 | POPTR_0003s15940 | None | None | | ERF/AP2 transcription factor
At5g13340 | None | None | None | None | None | None | | Unknown
At5g13350 | None | None | None | None | None | None | | GH3 family protein
At5g13360 | None | None | None | None | None | None | | GH3 family protein
At5g13370 | None | None | None | None | None | None | | GH3 family protein
At5g13380 | None | None | None | None | None | None | | GH3 family protein
At5g13390 | None | GSVIV00006219001 | POPTR_0001s12860 | POPTR_0003s15980 | mgf012815m | None | None | NEF1, chloroplast
At5g13400 | None | GSVIV00006214001 | POPTR_0001s12890 | None | mgf016173m | None | None | Proton-dependent oligopeptide transport, plastid
None | None | GSVIV00006211001 | POPTR_0001s12910 | POPTR_0003s16010 | mgf08773m | None | None | Cytochrome P450
At5g13410 | None | GSVIV00006210001 | POPTR_0001s12920 | None | mgf015787m | None | None | Peptidyl-prolyl cis–trans isomerase, chloroplast
At5g13420 | None | GSVIV00006209001 | POPTR_0001s12930 | POPTR_0003s16030 | mgf012100m | None | None | Transaldolase, chloroplast
None | None | GSVIV00006208001 | None | None | None | mgf004556m | mgf014939m | Unknown
None | None | GSVIV00006206001 | None | None | None | mgf015946m | mgf014306m, mgf000651m, mgf008051m | Unknown
At5g13430, At5g13440 | None | GSVIV00006205001 | POPTR_0001s12960 | POPTR_0003s16060 | mgf004959m | None | None | Ubiquinol-cytochrome C reductase mitochondrion
None | At1G28140 | GSVIV00006202001 | None | None | mgf003850m | None | None | Unknown, chloroplast


AtGH3.2/YDK1 and AtGH3.3 in IIA1 and AtGH3.1 in IIA2. As the evolutionary distance between the species increases, it is increasingly difficult to distinguish subsets of IIA and IIB. Though we did assign subsets of IIA and IIB for Mimulus and Aquilegia, there is less certainty about this assignment than for the rosids. For the monocots, we could not parse out these subsets and differentiate only IIA from IIB (Online Resource 3). If MgGH3.3 is misassigned and should be placed in IIA1 instead of IIA2, then a duplication of the Group II sequences after the divergence of the rosids would explain the presence of non-rosid species in only one subset of IIA (i.e. IIA1) and IIB (i.e. IIB1).

Group III contains two VvGH3 proteins, corresponding to two syntenic sets, both of which contain proteins from species other than rosids. Set IIIA contains the Arabidopsis

[Figure 3 image; legend: syntenic sets IIIA and IIIB; branch annotations: insertion, local duplication; loci labeled: At2g47750, At1g28130, At5g13360, At5g13370, At5g13380, At5g13350, At1g48670, At1g48660, At5g51470, At1g23160]

Fig. 3 Maximum likelihood phylogenetic tree of Group III rosid GH3 proteins, with evidence for expansion in Arabidopsis Group IIIA detailed. The tree was constructed using PhyML on multiple sequence alignments from MUSCLE curated using Gblocks with reliability of internal branch length tested using the aLRT method. Branches with the same symbol represent a set with evidence of synteny. Sequences are from the rosid species At Arabidopsis thaliana, Cp Carica papaya, Pt Populus trichocarpa, and Vv Vitis vinifera. PBS3 is highlighted

[Figure 4 image; legend: syntenic sets IA, IB, IIA1, IIA2, IIB1, IIB2, IIIA, IIIB; clade labels: Groups I, II and III, each with subclades A and B]

Fig. 4 Maximum likelihood phylogenetic tree of eudicot GH3 proteins. The tree was constructed using PhyML on multiple sequence alignments from MUSCLE curated using Gblocks with reliability of internal branch length tested using the aLRT method. Branches with the same symbol represent a set with evidence of synteny. Sequences are from the species Aq Aquilegia coerulea, At Arabidopsis thaliana, Cp Carica papaya, Mg Mimulus guttatus, Pt Populus trichocarpa, and Vv Vitis vinifera. PBS3 is highlighted


proteins PBS3 (AtGH3.12) and AtGH3.13-16 on chromosome 5 and AtGH3.17 on chromosome 1. PBS3 syntelogs in other species include VvGH3.8, PtGH3.7 and PtGH3.8, AcGH3.6, OsGH3.13 and ZmGH3.13 (Online Resources 2 and 3). Set IIIB contains AtGH3.9 and a variety of other eudicot proteins. AtGH3.17 of Set IIIA and AtGH3.9 of Set IIIB had previously been classified with Group II proteins (Staswick et al. 2005).

Analysis of GH3 functional data

Examination of expression data and transcription factor binding motifs in gene promoters complements biochemical and phenotypic data and can help provide insight into gene function. For example, the first GH3 genes were identified as inducible by auxin (Hagen et al. 1984), years before their function was known. Subsequent analysis revealed that GH3 enzymes catalyze the conjugation of amino acids to the auxin IAA, regulating IAA activity and function (Staswick et al. 2005). Most of the available functional data are for the Arabidopsis GH3 genes/proteins, which are the focus of our functional analysis.

Arabidopsis GH3s induced by IAA are in Group II (Fig. 5). Though all Group II AtGH3 genes are induced by IAA, only two of the six analyzed contain a cis-acting regulatory element bound by auxin responsive transcription factors (ARFs). However, on a percentage basis Group II promoters are more likely to contain at least one cis-acting regulatory element bound by an ARF, at 33%, compared with 0 and 20% for Groups I and III, respectively (Online Resource 4). It should be noted that AtGH3.17 and AtGH3.9, which we reclassify as Group III above, are not induced by IAA (Fig. 5). We find that the Group II GH3s are also induced by (a)biotic stress including pathogens (Fig. 6), with the exception of the lowly expressed AtGH3.1. In addition to being enriched in cis-acting regulatory elements bound by ARFs, Group II promoters are also enriched for regulatory elements bound by ethylene responsive binding factors (ERFs) known to regulate response to (a)biotic stress (Gutterson and Reuber 2004). Group II Arabidopsis GH3 proteins are also active on IAA, with mutants in these genes resulting in an IAA phenotype when tested (Fig. 5). Of note, AtGH3.5/WES1 is active on both IAA and salicylic acid (2-hydroxybenzoate) (Staswick et al. 2002; Park et al. 2007b; Zhang et al. 2007).
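The percentage comparison above (two of six Group II promoters, or 33%) amounts to scanning each promoter for a motif and counting the promoters with at least one hit. A minimal sketch of that calculation follows. It assumes the canonical auxin response element TGTCTC as the ARF-bound motif and uses invented toy promoter sequences; the motif definitions and 1 kb promoter sequences actually used for Online Resource 4 are not reproduced here.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_motif(promoter, motif):
    """Sorted 0-based start positions of the motif on either strand."""
    promoter = promoter.upper()
    hits = set()
    for pattern in {motif.upper(), reverse_complement(motif)}:
        start = promoter.find(pattern)
        while start != -1:
            hits.add(start)
            start = promoter.find(pattern, start + 1)
    return sorted(hits)


def fraction_with_motif(promoters, motif):
    """Fraction of promoters containing at least one motif occurrence."""
    with_hit = sum(1 for seq in promoters.values() if find_motif(seq, motif))
    return with_hit / len(promoters)


AUXRE = "TGTCTC"  # assumed ARF-bound element; not specified in the text

# Toy promoters (hypothetical sequences, far shorter than 1 kb).
toy_promoters = {
    "geneA": "AAATGTCTCGGG",  # motif on the forward strand
    "geneB": "CCGAGACATTTT",  # motif on the reverse strand
    "geneC": "ACGTACGTACGT",
    "geneD": "TTTTTTTTTTTT",
    "geneE": "GGGCCCGGGCCC",
    "geneF": "ATATATATATAT",
}
fraction = fraction_with_motif(toy_promoters, AUXRE)
print(f"{fraction:.0%} of toy promoters contain the motif")
```

Both strands are searched because a transcription factor binding site can occur in either orientation relative to the gene.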

In contrast to the Arabidopsis Group II genes, which are all induced by IAA and encode proteins active on IAA, the two Group I genes AtGH3.11/JAR1 and AtGH3.10/DFL2 have minimal unifying functional data. Surprisingly, though JAR1 is active on JA (Staswick and Tiryaki 2004), JAR1 is not induced by JA, by other hormones, or in response to a diverse set of pathogens (Figs. 5 and 6). jar1 mutants do, however, exhibit JA-associated phenotypes and are more susceptible to pathogens such as Botrytis cinerea that activate ET/JA-dependent defenses (Ferrari et al. 2003). AtGH3.10/DFL2 is in a distinct syntenic set from JAR1 and does not appear to function in JA signaling and response (Fig. 5). Instead, it mediates red light-specific hypocotyl elongation, with its expression controlled by light (Takase et al. 2003). Interestingly, jar1 mutants also exhibit far-red light insensitivity (Hsieh et al. 2000). Neither Group I Arabidopsis gene contains cis-acting regulatory elements bound by ARFs, ERFs, or dehydration response element binding proteins (DREBs) in its 1 kb promoter (Online Resource 4). However, both do contain other biotic stress-associated cis-acting regulatory elements, including those bound by MYC2 transcription factors. Of particular interest, the promoter of JAR1 contains four MYC2 binding sites. Proteolysis of the JAZ family of JA repressors is mediated by the JA-Ile conjugate, whose formation is catalyzed by JAR1. The proteolysis of the JAZ repressors then allows for downstream JA-associated gene expression through MYC2 transcription factors (Dombrecht et al. 2007).

An analysis of publicly available expression data showed that the Arabidopsis Group III GH3 genes tend to have higher expression than those in Groups I and II, excluding AtGH3.13 and AtGH3.16, whose expression has not been observed (Figs. 5 and 6; Online Resource 5). Two of the seven tested and expressed Group III A. thaliana GH3 genes are induced by phytohormone treatment (Fig. 5), with five of the seven induced by pathogen or abiotic stress (Fig. 6). Group III GH3s tend to contain cis-acting regulatory elements associated with (a)biotic stress response in their promoters (Online Resource 4). It is interesting that motifs bound by DREBs are only present in the promoters of the Group III genes AtGH3.12/PBS3, AtGH3.7 (the most similar A. thaliana protein to PBS3), and AtGH3.15. In addition, PBS3 and AtGH3.7 are induced by osmotic stress (Fig. 6); AtGH3.15 does not have a gene-specific probe on the ATH1 array. In terms of enzymatic activity, AtGH3.9 and AtGH3.17, now classified into distinct syntenic Group III sets, are active on IAA, though their expression is not induced by it [Fig. 5 (Khan and Stone 2007; Staswick et al. 2005)]. AtGH3.12 (PBS3), a syntelog of AtGH3.17, is not active on IAA, but on 4-substituted benzoates (Okrent et al. 2009). Its expression is induced by SA [Fig. 5 and (Jagadeeswaran et al. 2007)], with mutants exhibiting compromised SA accumulation and pathogen resistance (Okrent et al. 2009; Lee et al. 2007; Jagadeeswaran et al. 2007). Substrates for other Arabidopsis Group III members remain unknown.

In addition to the functional data discussed above for Arabidopsis GH3 genes, data for P. trichocarpa, V. vinifera, and O. sativa provide further evidence for selected biotic stress induction of Group II and III GH3 members. For example, in poplar (Populus tremula × Populus alba) tree roots colonized by the ectomycorrhizal fungus (EMF) Laccaria bicolor, the Group III PBS3 syntelogs PtGH3.7 and PtGH3.8 were induced (Felten et al. 2009). In contrast, PtGH3.7 (probeset PtpAffx.140928.1.A1_at) and PtGH3.8 (probeset PtpAffx.210014.1.S1_at) were not significantly elevated in response to infection with the Melampsora rust fungus (Plant Expression Database). No probesets were identified for the grape Group III members, the PBS3 syntelog VvGH3.8 and the AtGH3.9 syntelog VvGH3.5; however, the PBS3 syntelog from rice, OsGH3.13, is active on IAA, is highly upregulated during drought conditions and, to a lesser extent, upon treatment with the phytohormones IAA, SA and ABA, and confers enhanced drought tolerance when overexpressed in rice (Zhang et al. 2009).

For Group II, OsGH3.8 (in IIA) is upregulated following infection with the bacterial pathogen Xanthomonas oryzae pv. oryzae, and overexpression of OsGH3.8 in rice is correlated with increased levels of IAA-Asp conjugates and enhanced resistance to Xanthomonas (Ding et al. 2008). Similarly, constitutive expression of OsGH3.1 (in IIB) alters auxin homeostasis and enhances resistance to the fungal pathogen Magnaporthe grisea (Domingo et al. 2009). Furthermore, the IIA genes PtGH3.1 and PtGH3.2 were induced in EMF-colonized tree roots (Felten et al. 2009) and VvGH3.2 (probeset: 1610880_s_at) expression was elevated in response to infection with Bois Noir phytoplasma (Albertazzi et al. 2009).

Discussion

GH3 phylogeny

A careful analysis of sequence data reconciled with syntenic analysis indicates that GH3 phylogeny is not as clear-cut as previously reported. Previous analyses of the GH3 gene family relied on global sequence similarities (Staswick et al. 2005; Felten et al. 2009; Terol et al. 2006), which are not necessarily indicative of true evolutionary descent. The increasing number of sequenced genomes and new comparative genomic tools make it possible to use synteny to evaluate phylogenetic trees. For example, Jun et al. recently evaluated the use of local synteny for identifying orthologous genes in mammals, and found it quite effective, particularly in cases of local duplication and transposition (Jun et al. 2009). We used the syntenic data to guide the choice of phylogenetic methods in order to best understand how PBS3 is related to the other GH3 genes. Phylogenetic trees were constructed using neighbor-joining (NJ), maximum parsimony (MP), and maximum likelihood (ML) methods with various alignment curation methods.

[Figure 5 image; column groups: Induced (JA, IAA, SA), Activity (JA, IAA, B), Phenotype (JA, IAA, SA)]

Fig. 5 Maximum likelihood phylogenetic tree of the Arabidopsis GH3 proteins showing induction by phytohormones, enzyme activity, and mutant phenotype. Symbols in branches are as in Fig. 4. JA is jasmonic acid, IAA is indole-3-acetic acid, SA is salicylic acid, B is benzoates. In the “Induced” column, gray shaded boxes correspond to ≥2-fold increase in expression compared to control treatment, black shaded boxes correspond to ≥10-fold increase in expression compared to control treatment. Boxes with dashed line were not tested or did not have gene-specific data on array. PBS3 is highlighted. Microarray data from NASCArrays set 174 (JA), 175 (IAA), 192 (SA) and 392 (SA-analogue BTH). Shaded boxes in the “Activity” column indicate that the enzyme is active on the corresponding substrate. References for activity data are (Staswick and Tiryaki 2004; Okrent et al. 2009). Shaded boxes in the “Phenotype” column indicate that plants with mutations in the genes have altered signaling through the phytohormone indicated, with black shaded boxes moderate/strong and gray as weak phenotype. References for phenotypic data are as follows: AtGH3.1 (Staswick et al. 2005); AtGH3.2 (YDK1) (Staswick et al. 2005; Takase et al. 2004); AtGH3.5 (WES1) (Staswick et al. 2005; Park et al. 2007b; Zhang et al. 2007); AtGH3.6 (DFL1) (Nakazawa et al. 2001); AtGH3.9 (Khan and Stone 2007); AtGH3.11 (JAR1) (Staswick et al. 1992); AtGH3.12 (PBS3) (Nobuta et al. 2007; Jagadeeswaran et al. 2007; Lee et al. 2007); AtGH3.17 (Staswick et al. 2005; Khan and Stone 2007)

The ML method is a discrete data method that begins with a model of rates of evolutionary change and alters the model until it fits the observed data (Mount 2004). This is contrasted with distance methods such as NJ, for which the percentage of aligned positions that differ between two sequences is computed pair-wise for all sequences in an alignment and the values are arranged so that sequences with lower difference scores are closer together on the tree (Mount 2004). Distance methods tend to be favored by molecular biologists as they are straightforward and computationally efficient. The drawback of distance methods is that they can be misleading when using an incorrect evolutionary model (Huelsenbeck and Hillis 1993). While ML methods are traditionally computationally intensive, new algorithms, such as that employed by PhyML (Guindon and Gascuel 2003), reduce calculation time sufficiently to allow for routine use. There is considerable argument in the literature over the “best” method for phylogenetic analysis, often defined as how much a method can tolerate violations of its assumptions (Huelsenbeck and Hillis 1993). For example, there is a contentious debate over the relative merits of ML, particularly when different positions in a sequence evolve at different rates (Steel 2005). Many experts suggest constructing phylogenetic trees using multiple methods and evaluating them carefully (Hall 2005; Mount 2004; Thornton and Kolaczkowski 2005). When available, data from syntenic analysis can help determine the choice of phylogenetic method. We found that the tree constructed via ML best fit the data from syntenic analysis and was supported by the available functional data.
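The pair-wise difference score that underlies distance methods, as described above, can be sketched briefly. This is a minimal illustration and not the BioNJ implementation: it computes only the uncorrected proportion of differing positions (the p-distance) for each pair of aligned sequences, skips positions where either sequence has a gap (a choice made for this sketch), and uses invented sequences. A tree-building step such as NJ would then take these values as input.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Fraction of compared aligned positions at which two sequences differ.
    Positions with a gap ('-') in either sequence are skipped."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped positions to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(alignment):
    """p-distance for every unordered pair of sequence names."""
    return {(m, n): p_distance(alignment[m], alignment[n])
            for m, n in combinations(sorted(alignment), 2)}


# Toy alignment (hypothetical sequences).
toy_alignment = {"s1": "MKLVTA", "s2": "MKLVSA", "s3": "MRIVSG"}
dist = distance_matrix(toy_alignment)
closest = min(dist, key=dist.get)  # the pair a distance method joins first
print(dist, closest)
```

Because the p-distance ignores multiple substitutions at the same site, it underestimates divergence between distant sequences, which is one way a distance method can mislead when its evolutionary model is a poor fit.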

As shown in the eudicot GH3 ML phylogenetic tree in Fig. 4, there are two clades with high statistical support. One of these is Group I, which in turn forms two subclades: one with JAR1 and its syntelogs (set IA) and one with DFL2 and its syntelogs (set IB). Set IA predates the monocot/eudicot split whereas set IB contains only eudicots (Online Resource 3). As discussed in Results, JAR1 is active on JA and mediates JA-dependent developmental and (a)biotic stress responses, though it is not induced by JA or by the (a)biotic stresses tested (Figs. 5 and 6). Our analysis identifies JAR1 syntelogs in agronomically important species including grape, poplar, rice, and maize as potential targets for enhanced disease resistance. By contrast, DFL2 is not active on JA or involved in JA-associated responses (Figs. 5 and 6). Instead, it mediates red light-specific hypocotyl elongation, with its expression controlled by light (Takase et al. 2003, 2004) and its syntelogs present only in the eudicots.

[Figure 6 image; column groups: Induced, with Biotic (Pst vir, Pst avr, Psp nonhost, B. cinerea, P. infestans) and Abiotic stress; Phenotype]

Fig. 6 Maximum likelihood phylogenetic tree of Arabidopsis GH3 proteins and expression in response to biotic and abiotic stress. Symbols in branches correspond to syntenic set as in Fig. 4. For the “Induced” column, black shaded boxes correspond to ≥2-fold expression compared to control treatment and gray shaded boxes correspond to ≥10-fold increase in expression compared to control treatment at α = 0.05. Boxes with dashed line were not tested or did not have gene-specific data on array. PBS3 is highlighted. Microarray data are from adult leaves, unless otherwise noted, treated as indicated, from the following NASCArrays datasets: virulent and avirulent Pseudomonas syringae pv. tomato and nonhost Pseudomonas syringae pv. phaseicola (120), Botrytis cinerea (167), Phytophthora infestans (123), seedlings treated with cold/osmotic/salt/drought (138–141). For the “Phenotype” column, shaded boxes indicate phenotypes have been observed in mutants in response to biotic stress. Assessment of abiotic stress response has been minimal so it has not been included here. References are as follows: AtGH3.5 (WES1; Park et al. 2007a, b; Zhang et al. 2007); AtGH3.6 (DFL1; Zhang et al. 2007); AtGH3.11 (JAR1; Ferrari et al. 2003); AtGH3.12 (PBS3, GDG1, WIN3; Nobuta et al. 2007; Jagadeeswaran et al. 2007; Lee et al. 2007)

The second major clade contains the rest of the sequences and divides into two subclades, referred to as Group II and Group III. Group II contains two major sets (IIA and IIB), which precede the monocot/eudicot split. As discussed in the Results, Group II members, where tested, are induced by the growth phytohormone IAA and are active on IAA, with mutants exhibiting auxin-associated phenotypes (e.g. Figs. 5 and 6). In addition, the promoters of the Arabidopsis GH3 Group II member genes are enriched in ARF- and ERF-binding motifs (Online Resource 4), consistent with their induction by IAA and, for some, by pathogens (Figs. 5 and 6). Group II GH3 members in other eudicot and monocot species are also induced by pathogens and exhibit both auxin and altered susceptibility phenotypes (see “Results”).

Group III sequences consist of set IIIA (PBS3 and its syntelogs, including AtGH3.17, which is active on IAA) and set IIIB (AtGH3.9, which is active on IAA, and its syntelogs). As discussed earlier, AtGH3.9 and AtGH3.17 and their orthologs had previously been classified as Group II proteins, though they did group separately from the other Group II proteins (Staswick et al. 2005; Terol et al. 2006; Felten et al. 2009; Liu et al. 2005). Similar to Group II GH3 genes, Group III genes are often induced by (a)biotic stress, and altered expression can result in (a)biotic stress phenotypes [i.e. AtPBS3 (Nobuta et al. 2007; Lee et al. 2007; Jagadeeswaran et al. 2007) and OsGH3.13 (Zhang et al. 2009)].

The most parsimonious explanation for the presence of IAA-conjugating enzymes in both Group II and III is that the ancestral gene encoded an enzyme that used IAA as a substrate. Indeed, both rice Group II and III members have been found to be active on IAA (Zhang et al. 2009; Ding et al. 2008; Domingo et al. 2009). However, as described in the Results, genes in both Group II and III can be induced by SA and encode enzymes active on benzoates [e.g. Group II AtGH3.5 is active on SA (2-HBA) and Group III PBS3/AtGH3.12 is active on 4-HBA]. IAA-amino acid conjugates and their function in regulating auxin homeostasis have been a subject of long-standing investigation [reviewed in (Woodward and Bartel 2005)]. However, despite the fact that a variety of benzoate and cinnamate amino acid conjugates have been detected in plants (e.g. Suzuki et al. 1988; Bourne et al. 1991; Trennheuser et al. 1994), GH3 protein substrate preference has routinely been assessed only for 2-hydroxybenzoate (SA). Indeed, though PBS3 is not active on SA, it is active on a series of other related benzoates, with a strong preference for 4-substituted benzoates such as 4-HBA (Okrent et al. 2009). Additional functional analyses of the Group III (and Group II) GH3 members, including high throughput assays [described in (Okrent et al. 2009)] to assess GH3 activity on a variety of substrates, combined with a comprehensive ancestral analysis as more genomes are sequenced, should eventually allow for a fuller understanding of the ancestral function(s) and evolution of these proteins. Importantly, the early emergence of GH3 proteins active on benzoates is suggested by the ability of Lemna paucicostata (an early diverging monocot) to produce benzoyl-Asp (Suzuki et al. 1988) and the detection of 4-HBA-Glu and p-coumaroyl-Glu in the hornwort Anthoceros agrestis (Trennheuser et al. 1994).

Function of PBS3 and its syntelogs

Contrary to previous reports that restrict PBS3 to Arabidopsis and its close relatives, we identify PBS3 syntelogs in poplar, grape, columbine, maize and rice, suggesting descent from a common ancestral chromosome dating to before the eudicot/monocot split. Furthermore, our analysis of co-linear genes found a syntenic relationship between the Arabidopsis PBS3/AtGH3.12 gene and AtGH3.17 that was not apparent from sequence similarity alone. The PBS3 syntelogs for which expression data exist are induced by biotic interactions and/or abiotic stress, with the exception of some of the recently acquired Arabidopsis genes (see “Results”). As mentioned earlier, AtPBS3 is active on 4-substituted benzoates while AtGH3.17 and OsGH3.13 are active on IAA. However, these latter enzymes were not tested for activity on 4-HBA. Loss of PBS3 function alters benzoate metabolism, with a substantial impact on total SA accumulation and disease resistance (Nobuta et al. 2007; Jagadeeswaran et al. 2007; Lee et al. 2007). Enhanced expression of OsGH3.13/TLD1 alters plant architecture and IAA homeostasis, and enhances drought tolerance (Zhang et al. 2009). However, a TLD1 loss of function mutant displayed no obvious growth or drought phenotypes, presumably due to the compensatory action of other GH3s (Zhang et al. 2009). Similarly, an Arabidopsis gh3.17 mutant exhibited only very minor auxin-associated phenotypes; it was not tested for altered (a)biotic stress resistance (Staswick et al. 2005). Since pbs3 mutants exhibit substantial defects in total SA accumulation and disease resistance, benzoate metabolism is likely to be the primary target of PBS3 and perhaps also of its syntelogs. Cross-talk between SA, auxin, and the drought-induced phytohormone ABA [e.g. (Park et al. 2007b; Zhang et al. 2009)] might then explain the phenotypes observed when OsGH3.13/TLD1 is overexpressed.

An examination of the predicted or known function of conserved genes in the PBS3 syntenic regions (Online Resource 2) may provide additional insight into the ancient and perhaps conserved function of the PBS3 syntelogs. Though benzoates are produced in the plastid, there is no evidence that PBS3 is plastid-localized. However, many of the genes in the PBS3 syntenic region are known or predicted to be in the plastid (Online Resource 2). NEF1 (At5g13390) syntelogs are present in most species examined, including eudicots and monocots. NEF1 is plastid-localized and involved in exine formation of pollen (Ariizumi et al. 2004). The sculptured wall exine consists of phenols and fatty acid derivatives and plays an important role in protecting the pollen from various (a)biotic stresses (in Ariizumi et al. 2004). Because NEF1 is expressed in flowers, and benzoates including SA are known to impact the induction of flowering (Martinez et al. 2004), pollination strategy (e.g. to act as specific pollinator attractants) and defense (e.g. Dudareva and Pichersky 2000), we examined whether PBS3 and the other Arabidopsis genes in the PBS3 syntenic regions are expressed in flowers. With the exception of At5g13440, a duplicate of At5g13430, all Arabidopsis genes in the syntenic region surrounding PBS3 are expressed in flowers (Online Resource 2). Indeed, using a PBS3 promoter::GUS fusion, Jagadeeswaran et al. (2007) found PBS3 to be expressed at multiple stages of flower development and in specific floral organs. Benzoate metabolism in flowers is integrated with the functions of other phytohormones and can be modulated by herbivory (e.g. see Kessler et al. 2010). Therefore, a possible conserved function of PBS3 syntelogs may be to modulate stress-induced benzoate metabolism in flowers and by so doing provide a reproductive benefit.

In addition to its expression in flowers, NEF1 is strongly induced by osmotic stress, heat, and cold in seedlings (abiotic stress, At-TAX series). Both PBS3 and OsGH3.13 are also induced by drought/salt stress in addition to pathogens and SA, as is the nearby ERF transcription factor At5g13330/RAP2.6L (At-TAX dataset; Jagadeeswaran et al. 2007; Zhang et al. 2009; Krishnaswamy et al. 2011). Similar to overexpression of the PBS3 syntelog OsGH3.13 in rice (Zhang et al. 2009), overexpression of RAP2.6L in Arabidopsis results in reduced stature and enhanced salt and drought tolerance (Krishnaswamy et al. 2011). By contrast, rap2.6L Arabidopsis mutants exhibit enhanced expression of SA-dependent defense genes (e.g. PR1) and enhanced resistance to Pseudomonas syringae (Sun et al. 2010), while pbs3 mutants are compromised in SA-dependent gene expression and resistance to P. syringae (e.g. Nobuta et al. 2007). Unfortunately, rice plants with altered OsGH3.13 expression have not been tested for altered pathogen resistance/susceptibility and Arabidopsis plants with altered PBS3 expression have not been tested for drought tolerance. In any case, it is clear that PBS3 and its syntelogs can mediate (a)biotic stress responses independent of flowering, with this role supported by the (a)biotic stress induction and, where tested, function of its conserved neighbors NEF1 and RAP2.6L.

Expansion of the clade containing PBS3 in Arabidopsis

Of the rosids analyzed, A. thaliana was unique in that it not only lost GH3 genes but also gained seven GH3 genes (Table 2). Intriguingly, the genes gained in Arabidopsis are all in Group IIIA, which contains PBS3. AtGH3.7 and AtGH3.8 were inserted into their location in the genome, as evidenced by corresponding regions in papaya, grape, and poplar that contain no GH3 gene (data not shown). AtGH3.13-16 (syntelogs of PBS3) were locally duplicated, likely from the ancestral gene of PBS3, as the proteins encoded by the four duplicated genes are more similar to each other than they are to PBS3 (data not shown). The genes AtGH3.18 and AtGH3.19 represent a tandem duplication following insertion. All of these rearrangements and insertions suggest a rapidly evolving group of genes. Genes that respond to (a)biotic stress are overrepresented in arrays of tandemly duplicated genes (Hanada et al. 2008; Rizzon et al. 2006) and tend to be retained following local duplication events (reviewed in Freeling 2009). In addition, the same genes that tend to be locally duplicated also tend to transpose within chromosomal regions with high recombination rates. Much of the work studying the divergence in expression of duplicated genes has been done in yeast. However, expression patterns of tandemly and segmentally duplicated genes in A. thaliana have been found to be similar, with a weak but significant correlation between expression pattern and promoter similarity (Haberer et al. 2004). The two tandemly duplicated sets of A. thaliana genes in the GH3 family (PBS3/AtGH3.12 with AtGH3.13-16, and AtGH3.18 with AtGH3.19) do not retain correlated expression (data not shown). The expression of the genes in different types of tissues suggests possible specialization (Online Resource 5). In addition, they differ in the frequency of putative transcription factor binding motifs (Online Resource 4), suggesting promoter evolution. In many cases, these newer AtGH3 genes are induced in response to (a)biotic stress (Fig. 6), suggesting an evolving role for these genes in response to stress. The rare effector HopW1-1 from Pseudomonas syringae pv. maculicola has been shown to bind to the AtPBS3 protein (Lee et al. 2008), indicating unique targeting of PBS3. This further illustrates the importance of the PBS3 protein in disease resistance/susceptibility and suggests that host counter-evolution could play a role in the expansion of this clade in Arabidopsis.
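As an illustration only, the kind of comparison described above (whether duplicated genes retain correlated expression across tissues or treatments) could be sketched as a pairwise Pearson correlation of expression profiles. The statistic, the function names, the gene labels, and the numeric values below are assumptions introduced for this sketch; they are not the method, data, or results of this study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("a constant profile has undefined correlation")
    return cov / sqrt(vx * vy)


def pairwise_correlations(profiles):
    """Return {(gene_a, gene_b): r} for every unordered pair of genes."""
    return {
        (a, b): pearson(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


# Hypothetical placeholder profiles (one value per tissue or condition).
# These are NOT measurements from this study.
example_profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # rises with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],  # falls as geneA rises
}

if __name__ == "__main__":
    for pair, r in pairwise_correlations(example_profiles).items():
        print(pair, round(r, 3))
```

In such a sketch, duplicates whose coefficients are near zero or negative across a shared set of conditions would be candidates for diverged expression; any threshold for calling a pair "correlated" would have to be chosen and justified separately.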


Acknowledgments We would like to thank Dr. Eric Lyons for his assistance with the CoGe browser, Dr. Divya Chandran for careful reading of the manuscript, and the William Carroll Smith Graduate Research Fellowship in Plant Pathology (to R.A.O.) and UC Berkeley awards (to M.C.W.) for financial support. Some of the genome sequence data described here were analyzed prior to publication by the sequencing projects. Of these, the Aquilegia coerulea, Mimulus guttatus, and Selaginella moellendorffii data were produced by the US Department of Energy Joint Genome Institute. Carica papaya data were produced by the ASGPB Hawaii Papaya Genome Project (http://www.asgpb.mhpcc.hawaii.edu/papaya/). Zea mays data were produced by the Genome Sequencing Center at Washington University School of Medicine in St. Louis and can be obtained from http://www.maizesequence.org/.

References

Albertazzi G, Milc J, Caffagni A, Francia E, Roncaglia E et al (2009)

Gene expression in grapevine cultivars in response to Bois noir

phytoplasma infection. Plant Sci 176:792–804

Ariizumi T, Hatakeyama K, Hinata K, Inatsugi R, Nishida I, Sata

S, Kato T, Tabata S, Toriyama K (2004) Disruption of the

novel plant protein NEF1 affects lipid accumulation in the

plastids of the tapetum and exine formation of pollen,

resulting in male sterility in Arabidopsis thaliana. Plant J

39:170–181

Bierfreund NM, Tintelnot S, Reski R, Decker EL (2004) Loss of GH3

function does not affect phytochrome-mediated development in

a moss, Physcomitrella patens. J Plant Physiol 161:823–835

Blanc G, Hokamp K, Wolfe KH (2003) A recent polyploidy

superimposed on older large-scale duplications in the Arabidopsis

genome. Genome Res 13:137–144

Bourne DJ, Barrow KD, Milborrow BV (1991) Salicyloylaspartate as

an endogenous component of the leaves of Phaseolus vulgaris.

Phytochemistry 30:4041–4044

Castresana J (2000) Selection of conserved blocks from multiple

alignments for their use in phylogenetic analysis. Mol Biol Evol

17:540–552

Chang K-H, Xiang H, Dunaway-Mariano D (1997) Acyl-adenylate

motif of the acyl-adenylate/thioester-forming enzyme superfamily:

a site-directed mutagenesis study with the Pseudomonas sp. strain

CBS3 4-chlorobenzoate:Coenzyme A ligase. Biochemistry

36:15650–15659

Chini A, Fonseca S, Fernandez G, Adie B, Chico JM et al (2007) The

JAZ family of repressors is the missing link in jasmonate

signalling. Nature 448:666

Conti E, Franks NP, Brick P (1996) Crystal structure of firefly

luciferase throws light on a superfamily of adenylate-forming

enzymes. Structure 4:287–298

Craigon DJ, James N, Okyere J, Higgins J, Jotham J et al (2004)

NASCArrays: a repository for microarray data generated by NASC’s

transcriptomics service. Nucleic Acids Res 32:D575–D577

Dereeper A, Guignon V, Blanc G, Audic S, Buffet S et al (2008)

Phylogeny.fr: robust phylogenetic analysis for the non-specialist.

Nucleic Acids Res 36:W465–W469

Ding X, Cao Y, Huang L, Zhao J, Xu C et al (2008) Activation of the

indole-3-acetic acid-amido synthetase GH3-8 suppresses expansin

expression and promotes salicylate- and jasmonate-independent

basal immunity in rice. Plant Cell 20:228–240

Dombrecht B, Xue GP, Sprague SJ, Kirkegaard JA, Ross JJ et al

(2007) MYC2 differentially modulates diverse jasmonate-dependent

functions in Arabidopsis. Plant Cell 19:2225–2245

Domingo C, Andres F, Tharreau D, Iglesias DJ, Talon M (2009)

Constitutive expression of OsGH3.1 reduces auxin content and

enhances defense response and resistance to a fungal pathogen in

rice. Mol Plant Microbe Interact 22:201–210

Dudareva N, Pichersky E (2000) Biochemical and molecular genetic

aspects of floral scents. Plant Physiol 122:627–634

Edgar RC (2004) MUSCLE: multiple sequence alignment with high

accuracy and high throughput. Nucleic Acids Res 32:1792–1797

Felten J, Kohler A, Morin E, Bhalerao RP, Palme K et al (2009) The

ectomycorrhizal fungus Laccaria bicolor stimulates lateral root

formation in poplar and Arabidopsis through auxin transport and

signaling. Plant Physiol 151:1991–2005

Ferrari S, Plotnikova JM, De Lorenzo G, Ausubel FM (2003)

Arabidopsis local resistance to Botrytis cinerea involves salicylic

acid and camalexin and requires EDS4 and PAD2, but not SID2,

EDS5, or PAD4. Plant J 35:193–205

Freeling M (2009) Bias in plant gene content following different sorts

of duplication: Tandem, whole-genome, segmental, or by

transposition. Annu Rev Plant Biol 60:433–453

Goda H, Sasaki E, Akiyama K, Maruyama-Nakashita A, Nakabayashi

K et al (2008) The AtGenExpress hormone and chemical

treatment data set: experimental design, data evaluation, model

data analysis and data access. Plant J 55:526–542

Guindon S, Gascuel O (2003) A simple, fast, and accurate algorithm

to estimate large phylogenies by maximum likelihood. Syst Biol

52:696–704

Gutterson N, Reuber TL (2004) Regulation of disease resistance

pathways by AP2/ERF transcription factors. Curr Opin Plant

Biol 7:465–471

Haberer G, Hindemitt T, Meyers BC, Mayer KFX (2004) Transcriptional

similarities, dissimilarities, and conservation of cis-elements

in duplicated genes of Arabidopsis. Plant Physiol

136:3009–3022

Hagen G, Kleinschmidt A, Guilfoyle T (1984) Auxin-regulated gene

expression in intact soybean Glycine max cultivar Wayne

hypocotyl and excised hypocotyl sections. Planta 162:147–153

Hall BG (2005) Comparison of the accuracies of several phylogenetic

methods using protein and DNA sequences. Mol Biol Evol

22:792–802

Hanada K, Zou C, Lehti-Shiu MD, Shinozaki K, Shiu SH (2008)

Importance of lineage-specific expansion of plant tandem

duplicates in the adaptive response to environmental stimuli.

Plant Physiol 148:993–1003

Hsieh H-L, Okamoto H, Wang M, Ang L-H, Matsui M et al (2000)

FIN219, an auxin-regulated gene, defines a link between

phytochrome A and the downstream regulator COP1 in light

control of Arabidopsis development. Genes Dev 14:1958–1970

Huelsenbeck JP, Hillis DM (1993) Success of phylogenetic methods

in the four-taxon case. Syst Biol 42:247–264

Jagadeeswaran G, Raina S, Acharya BR, Maqbool SB, Mosher SL et al

(2007) Arabidopsis GH3-like Defense Gene 1 is required for

accumulation of salicylic acid, activation of defense responses and

resistance to Pseudomonas syringae. Plant J 51:234–246

Jaillon O, Aury JM, Noel B, Policriti A, Clepet C et al (2007) The

grapevine genome sequence suggests ancestral hexaploidization

in major angiosperm phyla. Nature 449:463–467

Jain M, Kaur N, Tyagi AK, Khurana JP (2006) The auxin-responsive

GH3 gene family in rice (Oryza sativa). Funct Integr Genomics

6:36–46

Jansen R, Kaittanis C, Saski C, Lee S-B, Tomkins J et al (2006)

Phylogenetic analyses of Vitis (Vitaceae) based on complete

chloroplast genome sequences: effects of taxon sampling and

phylogenetic methods on resolving relationships among rosids.

BMC Evol Biol 6:32


Jun J, Mandoiu I, Nelson C (2009) Identification of mammalian

orthologs using local synteny. BMC Genomics 10:630

Kessler D, Diezel C, Baldwin IT (2010) Changing pollinators as a

means of escaping herbivores. Curr Biol 20:237–242

Khan S, Stone JM (2007) Arabidopsis thaliana GH3.9 influences

primary root growth. Planta 226:21–34

Krishnaswamy S, Verma S, Rahman MH, Kav NNV (2011)

Functional characterization of four APETALA2-family genes

(RAP2.6, RAP2.6L, DREB19 and DREB26) in Arabidopsis.

Plant Mol Biol 75:107–127

Lee MW, Lu H, Jung HW, Greenberg JT (2007) A key role for the

Arabidopsis WIN3 protein in disease resistance triggered by

Pseudomonas syringae that secrete AvrRpt2. Mol Plant Microbe Interact 20:1192–1200

Lee MW, Jelenska J, Greenberg JT (2008) Arabidopsis proteins

important for modulating defense responses to Pseudomonas

syringae that secrete HopW1-1. Plant J 54:452–465

Liu Kl, Kang B-C, Jiang H, Moore SL, Li H et al (2005) A GH3-like

gene, CcGH3, isolated from Capsicum chinense L. fruit is

regulated by auxin and ethylene. Plant Mol Biol 58:447–464

Ludwig-Muller J, Julke S, Bierfreund NM, Decker EL, Reski R

(2009) Moss (Physcomitrella patens) GH3 proteins act in auxin

homeostasis. New Phytol 181:323–338

Lyons E, Freeling M (2008) How to usefully compare homologous plant

genes and chromosomes as DNA sequences. Plant J 53:661–673

Lyons E, Pedersen B, Kane J, Alam M, Ming R et al (2008) Finding

and comparing syntenic regions among Arabidopsis and the

outgroups papaya, poplar, and grape: CoGe with rosids. Plant

Physiol 148:1772–1781

Martinez C, Pons E, Prats G, Leon J (2004) Salicylic acid regulates

flowering time and links defence responses and reproductive

development. Plant J 37:209–217

Ming R, Hou S, Feng Y, Yu Q, Dionne-Laporte A et al (2008) The

draft genome of the transgenic tropical fruit tree papaya (Carica

papaya Linnaeus). Nature 452:991–996

Molina C, Grotewold E (2005) Genome wide analysis of Arabidopsis

core promoters. BMC Genomics 6:25

Mount DW (2004) Bioinformatics: sequence and genome analysis.

Cold Spring Harbor Laboratory Press, Cold Spring Harbor

Nakazawa M, Yabe N, Ichikawa T, Yamamoto YY, Yoshizumi T

et al (2001) DFL1, an auxin-responsive GH3 gene homologue,

negatively regulates shoot cell elongation and lateral root

formation, and positively regulates the light response of

hypocotyl length. Plant J 25:213–221

Nobuta K, Okrent RA, Stoutemyer M, Rodibaugh N, Kempema L

et al (2007) The GH3 acyl adenylase family member PBS3

regulates salicylic acid-dependent defense responses in Arabidopsis.

Plant Physiol 144:1144–1156

Okrent RA, Brooks MD, Wildermuth MC (2009) Arabidopsis

GH3.12 (PBS3) conjugates amino acids to 4-substituted

benzoates and is inhibited by salicylate. J Biol Chem 284:

9742–9754

Ouyang S, Zhu W, Hamilton J, Lin H, Campbell M et al (2006) The

TIGR rice genome annotation resource: improvements and new

features. Nucleic Acids Res 35:D883–D887

Park J-E, Seo PJ, Lee A-K, Jung J-H, Kim Y-S et al (2007a) An

Arabidopsis GH3 gene, encoding an auxin-conjugating enzyme,

mediates phytochrome B-regulated light signals in hypocotyl

growth. Plant Cell Physiol 48:1236–1241

Park J-E, Park J-Y, Kim Y-S, Staswick PE, Jeon J, Yun J, Kim

S-Y, Kim J, Lee Y-H, Park C-M (2007b) GH3-mediated

auxin homeostasis links growth regulation with stress

adaptation response in Arabidopsis. J Biol Chem 282:

10036–10046

Rensing SA, Lang D, Zimmer AD, Terry A, Salamov A et al (2008)

The Physcomitrella genome reveals evolutionary insights into

the conquest of land by plants. Science 319:64–69

Rizzon C, Ponger L, Gaut BS (2006) Striking similarities in the

genomic distribution of tandemly arrayed genes in Arabidopsis

and rice. PLoS Comput Biol 2:e115

Schnable PS, Ware D, Fulton RS, Stein JC, Wei F et al (2009) The

B73 maize genome: complexity, diversity, and dynamics.

Science 326:1112–1115

Semon M, Wolfe KH (2007) Consequences of genome duplication.

Curr Opin Genet Dev 17:505–512

Singh KB, Foley RC, Onate-Sanchez L (2002) Transcription factors

in plant defense and stress responses. Curr Opin Plant Biol

5:430–436

Staswick PE, Tiryaki I (2004) The oxylipin signal jasmonic acid is

activated by an enzyme that conjugates it to isoleucine in

Arabidopsis. Plant Cell 16:2117–2127

Staswick PE, Su W, Howell SH (1992) Methyl jasmonate inhibition

of root growth and induction of a leaf protein are decreased in an

Arabidopsis thaliana mutant. Proc Natl Acad Sci USA

89:6837–6840

Staswick PE, Tiryaki I, Rowe ML (2002) Jasmonate response locus

JAR1 and several related Arabidopsis genes encode enzymes of

the firefly luciferase superfamily that show activity on jasmonic,

salicylic, and indole-3-acetic acids in an assay for adenylation.

Plant Cell 14:1405–1415

Staswick PE, Serban B, Rowe M, Tiryaki I, Maldonado MT et al

(2005) Characterization of an Arabidopsis enzyme family that

conjugates amino acids to indole-3-acetic acid. Plant Cell

17:616–627

Steel M (2005) Should phylogenetic models be trying to “fit an

elephant”? Trends Genet 21:307–309

Sun F, Liu P, Xu J, Dong H (2010) Mutation in RAP2.6L, a

transactivator of the ERF transcription factor family enhances

resistance to Pseudomonas syringae. Physiol Mol Plant Pathol

74:295–301

Suzuki Y, Yamaguchi I, Murofushi N, Takahashi N (1988) Biological

conversion of benzoic acid in Lemna paucicostata 151 and its

relation to flower induction. Plant Cell Physiol 29:439–444

Takase T, Nakazawa M, Ishikawa A, Manabe K, Matsui M (2003)

DFL2, a new member of the Arabidopsis GH3 family is involved

in red light-specific hypocotyl elongation. Plant Cell Physiol

44:1071–1080

Takase T, Nakazawa M, Ishikawa A, Kawashima M, Ichikawa T,

Takahashi N, Shimada H, Manabe K, Matsui M (2004) ydk1-D,

an auxin-responsive GH3 mutant that is involved in hypocotyl

and root elongation. Plant J 37:471–483

Tamura K, Dudley J, Nei M, Kumar S (2007) MEGA4: molecular

evolutionary genetics analysis (MEGA) software version 4.0.

Mol Biol Evol 24:1596–1599

Terol J, Domingo C, Talon M (2006) The GH3 family in plants:

genome wide analysis in rice and evolutionary history based on

EST analysis. Gene 371:279–290

Thornton JW, Kolaczkowski B (2005) No magic pill for phylogenetic

error. Trends Genet 21:310–311

Toufighi K, Brady SM, Austin R, Ly E, Provart NJ (2005) The botany

array resource: E-northerns, expression angling, and promoter

analyses. Plant J 43:153–163

Trennheuser F, Burkhard G, Becker H (1994) Anthocerodiazonin, an

alkaloid from Anthoceros agrestis. Phytochemistry 37:899–903

Tuskan GA, DiFazio S, Jansson S, Bohlmann J, Grigoriev I et al

(2006) The genome of black cottonwood, Populus trichocarpa

(Torr. & Gray). Science 313:1596–1604

Warren RF, Merritt PM, Holub E, Innes RW (1999) Identification of

three putative signal transduction genes involved in R gene-specified

disease resistance in Arabidopsis. Genetics 152:401–412

Winter D, Vinegar B, Nahal H, Ammar R, Wilson GV et al (2007) An

“electronic fluorescent pictograph” browser for exploring and

analyzing large-scale biological data sets. PLoS One 2:e718


Wise RP, Caldo RA, Hong L, Shen L, Cannon E et al (2007)

BarleyBase/PLEXdb: a unified expression profiling database for

plants and plant pathogens. In: Edwards D (ed) Plant bioinformatics:

methods and protocols. Humana Press, Totowa,

pp 347–363

Woodward AW, Bartel B (2005) Auxin: regulation, action, and

interaction. Ann Bot 95:707–735

Wu CA, Lowry DB, Cooley AM, Wright KM, Lee YW et al (2007)

Mimulus is an emerging model system for the integration of

ecological and genomic studies. Heredity 100:220–230

Zhang Z, Li Q, Li Z, Staswick PE, Wang M et al (2007) Dual

regulation role of GH3.5 in salicylic acid and auxin signaling

during Arabidopsis-Pseudomonas syringae interaction. Plant

Physiol 145:450–464

Zhang S-W, Li C-H, Cao J, Zhang Y-C, Zhang S-Q et al (2009)

Altered architecture and enhanced drought tolerance in rice via

the down-regulation of indole-3-acetic acid by tld1/osgh3.13

activation. Plant Physiol 151:1889–1901

Zimmermann P, Hirsch-Hoffmann M, Hennig L, Gruissem W (2004)

GENEVESTIGATOR. Arabidopsis microarray database and

analysis toolbox. Plant Physiol 136:2621–2632
