crystal structure of transaldolase b from escherichia coli suggests a circular permutation of the...

10
Crystal structure of transaldolase B from Escherichia coli suggests a circular permutation of the a/b barrel within the class I aldolase family Jia Jia 1 , Weijun Huang 1 , Ulrich Schörken 2 , Hermann Sahm 2 , Georg A Sprenger 2 , Ylva Lindqvist 1 * and Gunter Schneider 1 * Background: Transaldolase is one of the enzymes in the non-oxidative branch of the pentose phosphate pathway. It transfers a C3 ketol fragment from a ketose donor to an aldose acceptor. Transaldolase, together with transketolase, creates a reversible link between the pentose phosphate pathway and glycolysis. The enzyme is of considerable interest as a catalyst in stereospecific organic synthesis and the aim of this work was to reveal the molecular architecture of transaldolase and provide insights into the structural basis of the enzymatic mechanism. Results: The three-dimensional (3D) structure of recombinant transaldolase B from E. coli was determined at 1.87 Å resolution. The enzyme subunit consists of a single eight-stranded a/b-barrel domain. Two subunits form a dimer related by a twofold symmetry axis. The active-site residue Lys132 which forms a Schiff base with the substrate is located at the bottom of the active-site cleft. Conclusions: The 3D structure of transaldolase is similar to structures of other enzymes in the class I aldolase family. Comparison of these structures suggests that a circular permutation of the protein sequence might have occurred in transaldolase, which nevertheless results in a similar 3D structure. This observation provides evidence for a naturally occurring circular permutation in an a/b-barrel protein. It appears that such genetic permutations occur more frequently during evolution than was previously thought. Introduction The pentose phosphate pathway for the metabolism of glucose-6-phosphate can be described as containing three rather distinct enzyme systems [1]. One of these systems catalyses a dehydrogenase/decarboxylase step which decarboxylates glucose-6-phosphate resulting in the for- mation of ribulose-5-phosphate and CO 2 with concomitant reduction of NADP+. The pathway includes an isomeriz- ing system which produces an equilibrium mixture of ribose-5-phosphate, ribulose-5-phosphate and xylulose- 5-phosphate. Finally, the pathway contains a sugar rearrangement system which uses the enzymes transketo- lase and transaldolase to produce three to seven carbon carbohydrate intermediates which can be shunted into other metabolic pathways such as glycolysis. The pathway is very flexible and can adjust itself to the various chang- ing requirements of the cell, for example, for metabolic intermediates or reducing power in the form of NADPH. Transaldolase (D-sedoheptulose-7-phosphate: D-glyceralde- hyde-3-phosphate dihydroxyacetone transferase, EC 2.2.1.2), one of the enzymes in the non-oxidative branch of the pentose phosphate pathway, catalyzes the reversible transfer of a dihydroxyacetone moiety, derived from fructose-6-phosphate, to erythrose-4-phosphate yielding sedoheptulose-7-phosphate and glyceraldehyde-3-phos- phate. Catalysis by transaldolase proceeds through the for- mation of a covalent intermediate, a Schiff base between an active-site lysine residue and the dihydroxyacetone moiety [2], similar to the mechanism of other class I aldolases (Fig. 1). The subunit of transaldolase from E. coli has a molecular weight of 35 kDa and the enzyme forms a dimer in solution [3] as do the enzymes from Saccharomyces cerevisiae [4] and Candida utilis [5]. The gene for transal- dolase B from E. coli codes for 317 amino acids, but the N-terminal methionine residue is cleaved off in the mature protein [3]. Comparison of the amino-acid sequences of transaldolases reveals overall sequence identities of greater than 50% between prokaryotic and eukaryotic species, for example 53% identity between the enzymes from E. coli and Saccharomyces cerevisiae and 55% identity between E. coli and human transaldolase [3,6,7]. Aldolases are of considerable interest as catalysts in stereospecific organic synthesis. In particular, fructose- 1,6-phosphate aldolase (F-1,6-P aldolase) has been suc- cessfully employed for stereospecific carbon–carbon bond formation in the synthesis of monosaccharides [8]. As part Addresses: 1 Division of Molecular Structural Biology, Department of Medical Biochemistry and Biophysics, Karolinska Institute, Doktorsringen 4, S-171 77 Stockholm, Sweden and 2 Institut für Biotechnologie 1, Forschungszentrum Jülich GmbH, PO Box 1913, D-52425 Jülich, Germany. *Corresponding authors. E-mail: YL, [email protected]; GS, [email protected] Key words: enzyme mechanism, evolution, gene permutation, protein crystallography, transaldolase Received: 14 Mar 1996 Revisions requested: 4 April 1996 Revisions received: 12 April 1996 Accepted: 16 April 1996 Structure 15 June 1996, 4:715–724 © Current Biology Ltd ISSN 0969-2126 Research Article 715

Upload: jia-jia

Post on 18-Sep-2016

212 views

Category:

Documents


0 download

TRANSCRIPT

Crystal structure of transaldolase B from Escherichia colisuggests a circular permutation of the a/b barrel within the class I aldolase familyJia Jia1, Weijun Huang1, Ulrich Schörken2, Hermann Sahm2, Georg A Sprenger2, Ylva Lindqvist1* and Gunter Schneider1*

Background: Transaldolase is one of the enzymes in the non-oxidative branch ofthe pentose phosphate pathway. It transfers a C3 ketol fragment from a ketosedonor to an aldose acceptor. Transaldolase, together with transketolase, creates areversible link between the pentose phosphate pathway and glycolysis. Theenzyme is of considerable interest as a catalyst in stereospecific organic synthesisand the aim of this work was to reveal the molecular architecture of transaldolaseand provide insights into the structural basis of the enzymatic mechanism.

Results: The three-dimensional (3D) structure of recombinant transaldolase Bfrom E. coli was determined at 1.87 Å resolution. The enzyme subunit consists ofa single eight-stranded a/b-barrel domain. Two subunits form a dimer related bya twofold symmetry axis. The active-site residue Lys132 which forms a Schiffbase with the substrate is located at the bottom of the active-site cleft.

Conclusions: The 3D structure of transaldolase is similar to structures of otherenzymes in the class I aldolase family. Comparison of these structures suggeststhat a circular permutation of the protein sequence might have occurred intransaldolase, which nevertheless results in a similar 3D structure. Thisobservation provides evidence for a naturally occurring circular permutation in ana/b-barrel protein. It appears that such genetic permutations occur morefrequently during evolution than was previously thought.

IntroductionThe pentose phosphate pathway for the metabolism ofglucose-6-phosphate can be described as containing threerather distinct enzyme systems [1]. One of these systemscatalyses a dehydrogenase/decarboxylase step whichdecarboxylates glucose-6-phosphate resulting in the for-mation of ribulose-5-phosphate and CO2 with concomitantreduction of NADP+. The pathway includes an isomeriz-ing system which produces an equilibrium mixture ofribose-5-phosphate, ribulose-5-phosphate and xylulose-5-phosphate. Finally, the pathway contains a sugarrearrangement system which uses the enzymes transketo-lase and transaldolase to produce three to seven carboncarbohydrate intermediates which can be shunted intoother metabolic pathways such as glycolysis. The pathwayis very flexible and can adjust itself to the various chang-ing requirements of the cell, for example, for metabolicintermediates or reducing power in the form of NADPH.

Transaldolase (D-sedoheptulose-7-phosphate:D-glyceralde-hyde-3-phosphate dihydroxyacetone transferase, EC2.2.1.2), one of the enzymes in the non-oxidative branch ofthe pentose phosphate pathway, catalyzes the reversibletransfer of a dihydroxyacetone moiety, derived from

fructose-6-phosphate, to erythrose-4-phosphate yieldingsedoheptulose-7-phosphate and glyceraldehyde-3-phos-phate. Catalysis by transaldolase proceeds through the for-mation of a covalent intermediate, a Schiff base betweenan active-site lysine residue and the dihydroxyacetonemoiety [2], similar to the mechanism of other class Ialdolases (Fig. 1). The subunit of transaldolase from E. colihas a molecular weight of 35 kDa and the enzyme forms adimer in solution [3] as do the enzymes from Saccharomycescerevisiae [4] and Candida utilis [5]. The gene for transal-dolase B from E. coli codes for 317 amino acids, but theN-terminal methionine residue is cleaved off in the matureprotein [3]. Comparison of the amino-acid sequences oftransaldolases reveals overall sequence identities of greaterthan 50% between prokaryotic and eukaryotic species, forexample 53% identity between the enzymes from E. coliand Saccharomyces cerevisiae and 55% identity betweenE. coli and human transaldolase [3,6,7].

Aldolases are of considerable interest as catalysts instereospecific organic synthesis. In particular, fructose-1,6-phosphate aldolase (F-1,6-P aldolase) has been suc-cessfully employed for stereospecific carbon–carbon bondformation in the synthesis of monosaccharides [8]. As part

Addresses: 1Division of Molecular StructuralBiology, Department of Medical Biochemistry andBiophysics, Karolinska Institute, Doktorsringen 4,S-171 77 Stockholm, Sweden and 2Institut fürBiotechnologie 1, Forschungszentrum JülichGmbH, PO Box 1913, D-52425 Jülich, Germany.

*Corresponding authors. E-mail: YL, [email protected]; GS, [email protected]

Key words: enzyme mechanism, evolution, genepermutation, protein crystallography, transaldolase

Received: 14 Mar 1996Revisions requested: 4 April 1996Revisions received: 12 April 1996Accepted: 16 April 1996

Structure 15 June 1996, 4:715–724

© Current Biology Ltd ISSN 0969-2126

Research Article 715

of on-going efforts to explore the suitability of trans-aldolases as catalysts in organic synthesis, we have initi-ated crystallographic studies of this enzyme [9]. Theseanalyses should reveal the molecular architecture of thecatalyst and provide insights into the structural basis ofthe enzymatic mechanism.

In this paper, we describe the three-dimensional (3D)structure of recombinant transaldolase from E. coli at1.87 Å resolution. We also present the results of a compari-son of the structure of transaldolase with structures ofother class I aldolases.

Results and discussionStructure determination and electron-density mapThe structure of transaldolase was solved by the multipleisomorphous replacement (MIR) method using five heavymetal derivatives. After solvent flattening and histogrammatching, the electron-density map was of sufficientquality to trace the polypeptide chain of the enzyme. Themap was subsequently improved by phase combination,using calculated phases from the partial model and theMIR phases. In this improved map, the complete biologi-cal amino-acid sequence was fitted into the electrondensity. In the final electron-density maps, there is con-tinuous well-defined electron density for the wholepolypeptide chain except for a few side chains at thesurface of the molecule. A representative part of the2|Fo|–|Fc| electron-density map after completion of thecrystallographic refinement is shown in Figure 2. Theoverall residue-by-residue real-space correlation [10] be-tween the model and the 2|Fo|–|Fc| electron-density mapis 0.89. The heavy metal ions in the mercury derivatives

bind at the same site, Cys239, and the platinum interactswith the side chain of Met312. The final model consists of2×316 residues and 524 water molecules. One residue,Asp293, in each subunit was modelled with two alterna-tive conformations. The protein model has good stereo-chemistry: the root mean square (rms) deviation of thebond lengths is 0.006 Å and the rms deviation for the bond angles is 1.521°. The Ramachandran plot (Fig. 3) shows that 96.5% of the residues are in the most favoured regions of ϕ,f space with no outliers in the disallowed regions. The only residue in the generously allowed region (Ser226) has very well defined electron density. The conventional crystallographic R-factor afterrefinement is 20.1% and R-free is 23.4% in the resolutioninterval 5.5–1.87 Å.

Overall structureStructure of the subunitThe enzyme subunit consists of a single domain whichhas the fold of an eight-stranded a/b barrel (Fig. 4).Eight parallel b strands (b1–b8) form the core of thebarrel. This core is surrounded by eight a helices(a1–a8) running approximately antiparallel with theb strands with the exception of helix a8 which has itshelix axis almost perpendicular to strand b8. There aresix additional a helices (aA–aF), three of which areinserted in loop regions between b strands and a helicesof the barrel, two after strand 2 (aB and aC) and oneafter strand 6 (aD). One additional a helix (aA) is foundat the N terminus and two more helices (aE and aF)occur at the C terminus of the polypeptide chain. Two ofthese helices, aB and aD, point with their N-terminalends into the active site. The helix aF runs across one

716 Structure 1996, Vol 4 No 6

Figure 1

Reactions catalyzed by (a) transaldolase and(b) fructose-1,6-bisphosphate aldolase.

CH2OPO32-

C

C

C

H

C

H OH

H OH

CH2OPO32-

C O

CH2OH

CH2OPO32-

O

HO

CH2OH

C O

C

C

HO H

C

H OH

H OH

C

CH2OH

C O

C

C

HO H

C

H OH

H OH

CH2OPO32-

H

C

C OH

CH2OPO32-

CH2OPO32-

OHH

H O

E N C

CH2OH

CH2OPO32-

E NH2

E NH2 E N C

CH2OH

CH2OH

E NH2

H

C

C OH

CH2OPO32-

H O

E NH2 +

D-erythrose4-phosphate

(b) fructose-1,6-bisphosphate aldolase

+ +

H2OH2O

+

Dihydroxyacetone phosphate

D-fructose1,6-bisphophate

H2O

(a) transaldolase

D-sedoheptulose 7-phosphate

D-glyceraldehyde 3-phosphate

D-glyceraldehyde 3-phosphate

D-fructose6-phosphate

+ +

H2O

Schiff-baseintermediate

Schiff-baseintermediate

half of the active-site cleft at the C-terminal of the barreland partly covers this cleft (Fig. 4). The last helix of thebarrel, a8, is connected to the penultimate C-terminalhelix aE (which packs against a4 and a5 of the barrel) by a very long loop wrapping around almost half of the barrel. A stereoview of the structure is shown in Figure 5 and the assignment of the secondary structure elements to the amino-acid sequence of transaldolase is shown in Figure 6.

Structure of the dimerThe dimer is formed between one of the subunits in theasymmetric unit and a symmetry mate of the secondsubunit. The dimer interface buries an area of 1600 Å2 andinvolves interactions between the N-terminal parts ofhelix aF in the two subunits and between the b3–a3 loopregion from one subunit and helix aE from the other.These interactions involve hydrophobic contacts andhydrogen bonds.

One major contribution to this interface area is the inter-action between the N-terminal parts of helix aF of thetwo subunits across a non-crystallographic twofold symme-try axis. Hydrophobic contacts are formed between sidechains of Gln287, Val292, Ala296, Arg300 and their corre-sponding symmetry mates in the second subunit. Otherimportant subunit–subunit interactions in this area involvea hydrogen bond from the conserved Arg300 to the mainchain oxygen of the conserved Asn286 and the corre-sponding hydrogen bond generated by the twofold sym-metry. In this interface region, Asp293 is located. The sidechain of this residue has been modelled in two conforma-tions. In one of the conformations, it forms a hydrogenbond to the same conformer in the second subunit. In theother conformation, the side chain points away from thesecond subunit and its charge is compensated for the sur-rounding basic residues Arg300 from both subunits. Thecrystals are grown at pH 4.5 and it seems likely that thetwo observed conformations reflect the distribution of theprotonated and deprotonated states of the carboxylategroup at this pH.

Interactions in other parts of this interface region includemostly hydrophobic contacts from the loop connecting b3with a3 of one subunit to the helix aE of the second

Research Article Crystal structure of transaldolase Jia et al. 717

Figure 2

Stereodiagram showing part of the final2Fo–Fc electron-density map for transaldolase,contoured at 1.0s.

Figure 3

Ramachandran plot for the refined model of transaldolase. Trianglesrepresent glycine and squares represent non-glycine residues.

B

A

L

b

a

l

p

~p

~b

~a

~l

b

~b

b~b

~b

-180 -135 -90 -45 0 45 90 135 180

-135

-90

-45

0

45

90

135

180

Phi (degrees)

Psi (

degr

ees)

Ser226

subunit. These involve the side chains of Arg100, Tyr103,Trp283, Leu282, and corresponding residues related bythe twofold symmetry.

The active siteAs in all other a/b-barrel enzymes found so far, the activesite is located at the C-terminal end of the b strands andthe walls of the active-site cavity are formed by the loopsconnecting these b strands with the a helices. In the struc-ture presented here, this cavity is accessible from the bulksolution despite the fact that helix aF runs across one half

of the funnel-shaped opening of the cavity. A large numberof ordered solvent molecules were found in the active-sitecleft. At the bottom of this cleft, we find the e-nitrogen ofthe conserved Lys132 that forms the Schiff base duringcatalysis. The identification of this particular lysine as theSchiff-base-forming residue in E. coli transaldolase is basedon several lines of evidence. Firstly, the amino-acidsequence surrounding this residue shows homology to theamino-acid sequence of an active-site peptide, comprisingthe Schiff-base-forming lysine residue, derived from yeastCandida utilis transaldolase (Fig. 7) [11]. Secondly, site-directed mutagenesis of the corresponding Lys144 to a glu-tamine in transaldolase from the yeast Saccharomycescerevisiae resulted in complete loss of activity [4], thus sup-porting the proposed role of this residue in catalysis.Finally, Lys132 is the only lysine residue present in theactive site of transaldolase from E. coli.

In the close vicinity of this residue, several conserved polarresidues are located (Fig. 4). Asp17 in b1 and Glu96 in b3might act in proton transfer during catalysis. Other con-served polar residues such as Asn35 in the loop betweenb2 and a2, Ser176 in b6 and Ser226 in the loop betweenb7 and a7 are perhaps involved in substrate binding.

Both substrates of the reaction carry a phosphate groupand the active site has therefore been analyzed for one ormore possible phosphate-binding sites. Two a helices (aBand aD) point with their N-terminal ends into the activesite and one of them might contribute to phosphatebinding. However, both helices contain bulky side chainsat their N-terminal ends and hydrogen bonds from thephosphate to main-chain nitrogens of one of the helices(as observed in other a/b barrel enzymes) are thereforenot very likely. However, at the N terminus of helix aDthere is a conserved arginine residue, Arg181, and in closeproximity, on the loop connecting b7 with a7, there isanother conserved arginine, Arg228. The guanidiniumgroups of these residues are about 9–12 Å away from the

718 Structure 1996, Vol 4 No 6

Figure 5

Stereoview of the Ca trace of a subunit oftransaldolase. Every tenth residue is indicated.

Figure 4

Schematic view of the subunit of transaldolase from E. coli. Theb strands are coloured in green, the helices pink and the connectingloops yellow. Conserved polar amino acids at the active site areincluded as ball-and-stick models. The figure was generated using theprogram MOLSCRIPT [36].

e-amino group of Lys132 and they might participate inbinding of the substrate’s phosphate groups. No otherstructural features which would suggest a second phos-phate-binding site were found. The kinetic mechanism oftransaldolase is still unknown, but the presence of onlyone phosphate-binding site suggests that there is only onebinding site for the two substrates, consistent with a ping-pong mechanism.

Comparison with other class I aldolases suggests a circularpermutation of the a/b barrelA number of enzymes belonging to the class I aldolasefamily have been studied by protein crystallography andthe 3D structures of F-1,6-P aldolase from several species(human [12] and rabbit muscle [13] and Drosophila melano-gaster [14]), 2-keto-3-deoxy-phosphogluconate aldolase[15], N-acetylneuraminate lyase [16] and dihydropicolinatesynthase [17] are known. Among these enzymes, the reac-tion catalyzed by F-1,6-P aldolase is, in a chemical sense,very similar to one half of the trans-aldolase reaction, thatis, aldol cleavage of a fructose-phosphate to produce glycer-aldehyde-3-phosphate and a dihydroxyacetone moiety.

The overall topologies of transaldolase and F-1,6-Paldolase are quite similar. Both of them are eight-strandeda/b barrels, and contain an additional a helix at the N ter-minus. In F-1,6-P aldolase, this helix packs against theN-terminal end (with respect to the b strands) of the barrel,whereas in transaldolase the corresponding helix packsagainst the loop connecting a8 with aE. Three helices areinserted in the C-terminal loops of the barrel in bothenzymes, however the two helices at the very C terminusof the polypeptide chain are only found in transaldolase.

One obvious way to superpose the transaldolase andF-1, 6-P aldolase barrels is by aligning their secondarystructural elements by order of appearance, that is, strandb1 with strand b1, strand b2 with strand b2 and so forth(alignment b1→b1). This alignment results in 113 equiva-lent Ca atoms with an rms fit of 2.53 Å. It is obvious fromthe corresponding sequence alignment based on this struc-tural superposition that the two Schiff-base-forming lysineresidues (Lys132 of the transaldolase and Lys229 of theF-1,6-P aldolase) are not aligned (Fig. 6). Lys132 of thetransaldolase corresponds to another lysine of the F-1, 6-P

Research Article Crystal structure of transaldolase Jia et al. 719

Figure 6

Assignment of secondary structure elementsof transaldolase and sequence alignment oftransaldolase with the amino-acid sequenceof human muscle aldolase based on thecomparison of their 3D structure. Secondarystructure assignments were made with theprogram DSSP [37]. The first line showssecondary structure elements oftransaldolase; second line, amino-acidsequence of transaldolase from E. coli [3];third line, sequence alignment of humanmuscle aldolase, using the structuralsuperposition in which the strands carryingthe Schiff-base-forming lysine were aligned(alignment b4→b6); fourth line, sequencealignment of human muscle aldolase, usingthe structural alignment in which the strandswere superposed in order of appearance(alignment b1→b1). Conserved residues inboth enzymes are shaded in grey, and theSchiff-base-forming lysine residues areenclosed in rectangles.

. . . . . .MTDKLTSLRQYTTVVADTGDIAAMKLYQPQDATTNPSLILNAAQIPEYRKLIDDAVAWAK LKKKGIILGIKV YKKDGCDFAKWRC 97 137 GKGILAAD LLFST ISGVI 26 61 73

. . . . . .QQSNDRAQQIVDATDKLAVNIGLEILKLVPGRISTEVDARLSYDTEASIAKAKRLIKLYN LENANVLARYASICQSQR VPIVEP AQKVTETVLAAVYKALS 164 183 201 AEILIILGIKVDKG DLAARCAQY 94 102 129

. . . . . .DAGISNDRILIKLASTWQGIRAAEQLEKEGINCNLTLLFSFAQARACAEAGVFLISPFVGDHHVYLEGTLLKPN GVTFLS EEEATVNLSAIN LTFSYG 266 276 297 CDFAKWR NANVLAR RIVPIVEPEVTETVLA TLLKPNM 142 166 181 205 226

. . . . . .RILDWYKANTDKKEYAPAEDPGVVSVSEIYQYYKEHGYETVVMGASFRNIGEILELAGCD NIAAG LKRAKA KGILAA RQLLFST IS 319 328 27 59 73 AKKNTPEEI AAVTGVTFLSSEEEATVNLS L 240 262 275 297

. . . . . .RLTIAPALLKELAESEGAIERKLSYTGEVKARPARITESEFLWQHNQDPMAVDKLAEGIRGVILFHETLYQ A 359TFSYG

. KFAIDQEKLEKMIGDLLNHAY

1

61

121

181

241

301

60

120

180

240

300

317

A B C

D

E F

α β1 α1 β2 α α

α2 β3 α3

β4 α4 β5 α5 β6

α α6 β7 α7

β8 α8 α α

aldolase, Lys146, which is also in the active site and wassuggested to bind the C1-phosphate group of the substrateand to participate in the catalytic reaction [18]. The Schiff-base-forming residue in F-1,6-P aldolase is aligned withSer176 in transaldolase. Of the 113 equivalent residues,nine are conserved between the F-1,6-P aldolase andtransaldolase. Five of these residues are hydrophobic andwith the exception of the above mentioned lysine residuesand two of the hydrophobic residues, all of the remainingconserved residues are far from the active-site pocket.

Another way of aligning the two barrels is by aligning theb strands carrying the Schiff-base-forming lysine residue,that is, b4 of transaldolase with b6 of F-1,6-P aldolase andso forth (b4→b6). Such an alignment, using O, results in asignificant higher number of structurally equivalent Caatoms, 146 atoms with an rms fit of 2.17 Å (Fig. 8).

Three of the remaining six possible ways to superpose thetwo barrels result in alignments with similar statistics to theb1→b1 alignment, with 100–109 equivalenced Ca atoms(rms varying between 2.29–2.35 Å). The remaining threepossible superpositions yield alignments with only 80–89equivalent Ca atoms (rms between 2.01 Å and 2.48 Å).

A different approach to reveal topological and structuralsimilarities is taken by the program TOP (G Lu, unpub-lished program). This program does not require an initial

set of equivalent atoms, but identifies topological similari-ties in proteins by finding the optimal match of secondarystructural elements. Comparison of the transaldolase withthe F-1,6-P aldolase structure using TOP reveals that,from a topological point of view, both structures are signif-icantly more similar when strands b4→b6 are aligned (87matched residues, rms 1.48 Å, compared with the nextbest alignment, b1→b1, which results in 64 matchedresidues, rms 1.73 Å).

In the b4→b6 based superpositions, the two Schiff-base-forming lysine residues are at equivalent positions, and wenow find several polar residues that are conserved in bothtransaldolases and aldolases at corresponding locations inthe 3D structure. All in all, 13 of the equivalencedresidues are conserved between E. coli transaldolase andF-1,6-P aldolase, and five of them are located at the activesite. Glu96 of the transaldolase superimposes with Glu187of F-1,6-P aldolase, and Ser176 of the transaldolase withSer300 of F-1,6-P aldolase. Furthermore, the side chainsof the conserved residues Asp17 of transaldolase andAsp33 of F-1,6-P aldolase are at the same position in 3Dspace, but they are located on different b strands.

In addition, this way of superposition also aligns the pro-posed phosphate-binding sites for the C6 phosphate. Thebinding site of this phosphate group in F-1,6-P aldolase islocated in a cleft between the loops after b8 and b1[18,19]. Arg303 in F-1,6-P aldolase is located at the N ter-minus of an additional a helix after b8 and its side-chainposition is very close to the position of the side chain ofArg181 in transaldolase, also located the N terminus of anadditional a helix.

In summary, transaldolase and F-1,6-P aldolase are struc-turally very similar in spite of the lack of significantoverall amino acid sequence homology. Given the highdegree of overall structural similarity along with theconservation of some active-site residues and of the

720 Structure 1996, Vol 4 No 6

Figure 8

Superposition of a single subunit fromtransaldolase (red) and aldolase (blue). Thesuperposition is based on alignment b4→b6(for details see text).

Figure 7

Active-site peptide of transaldolase from Candida utilis, carrying theSchiff-base-forming lysine residue, and alignment of homologousregions in the amino-acid sequences of transaldolases. Conservedresidues are indicated by bold letters.

phosphate-binding site, it seems likely that both enzymefamilies have evolved from a common ancestor. However,optimization of structural overlay and optimal alignmentof the active-site residues between F-1,6-P aldolases andtransaldolases require a circular permutation of the sec-ondary structural elements of the barrel. We thereforesuggest that such a permutation in an ancestral aldolasegene might have occurred during evolution, whereby thefirst two b strands of this ancient aldolase were moved to the C terminus. It is of interest to note that in KDPG aldolase [15], N-acetylneuraminate lyase [16] anddihydrodipicolinate synthase [17], the three other class Ialdolases with known 3D structure, the Schiff-base-forming lysine is located on strand b6. The similarity inchemistry and substrates between F-1,6-P aldolase andtransaldolase implies that these enzymes are more closelyrelated to each other than to the other enzymes of thisfamily. If this assumption is correct (it is also supported bythe conservation of active-site residues other than theSchiff-base-forming lysine residue amongst F-1,6-Paldolases and transaldolases) we conclude that thecommon ancestor of the class I aldolase family had thelysine group at b strand six. Furthermore, the permutationof the gene resulting in the rearrangement of the sec-ondary elements should have occurred at a later stage inevolution, after N-acetylneuraminate lyase and dihy-drodipicolinate synthase had diverged from F-1,6-Paldolase and transaldolase.

In a series of elegant experiments, Kirschner and col-leagues [20] have shown that circular permutations of thea/b-barrel motif at the level of the gene can result in prop-erly folded, active enzymes . The present study providesevidence that such permutations in a/b barrels mightindeed have occurred naturally during evolution. Natu-rally circular permutations of protein sequences within asingle domain have been found in other instances: thesaposin domains [21] and b-glucanases [22]. It might bethat such circular permutations of gene sequences, withpreservation of the 3D structure of the encoded protein,occur more often than previously thought.

Another possible explanation for these findings would bethat we observe a case of convergent evolution imposedby the chemical constraints of the reaction to be catalyzed.Although convergent evolution cannot be ruled out unam-biguously, the higher structural similarity found betweenthe permuted structures indicates that a circular permuta-tion is the more likely explanation.

Implications for the catalytic mechanismImportant polar residues at the active site that have been implicated to participate in substrate binding and/or catalysis are conserved between F-1,6-P aldolase and transaldolase. This conservation indicates furthermechanistic similarities beyond the mere formation of a

Schiff-base intermediate. Littlechild and Watson [18] haveproposed a reaction mechanism for F-1,6-P aldolase.Based on the crystal structure of the enzyme from threespecies and amino acid sequence comparisons, they iden-tified a number of conserved amino acids surrounding theSchiff-base-forming lysine residue. In addition to thisresidue, two acidic residues are conserved between theactive sites of F-1,6-P aldolase and transaldolases. Glu96is very close to the e-NH2 group of the lysine residue, buta function for this residue other than maintaining aneutral electrostatic potential at the active site [18] hadnot been discussed previously. Its proximity to Lys132suggests to us that this residue might participate inproton abstraction from the e-NH2 group of Lys132, thusfacilitating nucleophilic attack of this nitrogen on the car-bonyl carbon of the substrate. Another acidic residue atthe active site of F-1,6-P aldolase, Asp33, has been impli-cated in protonating the leaving hydroxyl group of thecarbinolamine intermediate. In transaldolase, the con-served residue Asp17 is found at the corresponding posi-tion in 3D space and might fulfil a similar role in thetransaldolase reaction.

In addition to these residues, we can identify anotherpolar amino acid at the active site which is conserved inthe two enzyme families. Ser176 in transaldolase and thecorresponding residue Ser300 in F-1,6-P aldolase arewithin 5 and 3 Å, respectively of the e-NH2 group of theSchiff-base-forming lysine residue and might be involvedin substrate binding.

One important difference in the active-site structurebetween the two enzymes is the absence of the ‘second’lysine residue [23] in transaldolase. This lysine residue,Lys146 in F-1,6-P aldolase, is close to the Schiff-base-forming lysine and has been implicated in binding of theC1 phosphate group [18,24]. No second lysine residue isfound in the active site of E. coli transaldolase. As thenatural substrates of transaldolase are not phosphorylatedat the C1 position, there might be no need for this residueto be conserved. This lysine was also proposed to be a keyplayer in formation and cleavage of the Schiff-base byacting as a acid/base catalyst [18]. At the position ofLys146, we find in transaldolase residue Thr33, conservedin the transaldolases. While this side chain can participatein hydrogen bonding and in this way might stabilize theSchiff base intermediate, it is not very likely that thisresidue is acting as an acid/base catalyst in the formationand breakdown of the Schiff-base intermediate.

However, we want to point out that so far, no structuralinformation of an F-1,6-P aldolase or transaldolase com-plexed with substrate or substrate analogues is available.Thus, a more definite assignment of the function ofactive-site residues will require crystallographic analysis ofsuch complexes.

Research Article Crystal structure of transaldolase Jia et al. 721

Biological implicationsTransaldolase, one of the enzymes in the pentose phos-phate pathway, catalyses the reversible transfer of adihydroxy moiety between various sugar phosphates. Asoccurs in other enzymes of the class I aldolase family,transaldolase forms a covalent Schiff-base adductbetween the substrate and the side chain of an active-sitelysine residue. Aldolases, in general, are of considerableinterest for stereospecific carbon–carbon bond formationin organic synthesis. The crystal structure determina-tion of transaldolase provides the structural frameworkfor attempts to modify substrate specificity of thisenzyme by rational design.

The enzyme is a homodimer and each monomer con-sists of an eight-stranded a/b barrel, the same overallfold found for other enzymes in the class I aldolasefamily. The active site is at the C-terminal end of thebarrel, at the bottom of a funnel-shaped cleft formed bythe loops connecting the eight b strands with thea helices. In the interior of this cleft, Lys132, whichforms a Schiff-base with the substrate during catalysis,is located. The e-nitrogen atom of this residue is sur-rounded by a number of conserved polar residues whichmight be involved in proton transfer during catalysisand/or substrate binding. The active-site topology oftransaldolase suggests some mechanistic differencesfrom fructose-1,6-phosphate aldolases with respect toformation and cleavage of the Schiff-base intermediate.

Structural comparisons of transaldolase with otheraldolase structures reveal that the residue forming theSchiff base, located on strand b6 in aldolases, is foundon strand b4 in transaldolase. This suggests the possibil-ity of a circular permutation in the aldolase gene whichmoved the first two a/b units of an ancient aldolase geneto the C terminus during evolution. Thus, transaldolasemight be the first example of a naturally occurring circu-lar permutation in an a/b barrel. Together with otherrecent observations, this result suggests that such circu-lar permutations in gene sequences, resulting in permu-tations of protein structure within a domain withpreservation of the overall 3D fold, occur more oftenthan previously thought.

Materials and methodsEnzyme preparation and crystallizationTransaldolase from recombinant E. coli K-12 cells (strain DH5/pGSJ451)was purified to electrophoretic homogeneity by successive ammoniumsulphate precipitation and two anion exchange chromatography steps[3]. Crystals of transaldolase were obtained by a combination of micro-and macroseeding techniques using the hanging drop vapour diffusionmethod [9]. The crystals were orthorhombic, space group P212121 withcell dimensions a=68.9 Å, b=91.3 Å and c=130.5 Å.

Data collection and screening of heavy atom compoundsNative and derivative data were collected at 2.0 Å and 2.2 Å resolution,respectively, on an R-axis II imaging plate detector at the department of

Molecular Biology, Uppsala, Sweden. The detector was mounted on aRigaku rotating anode, operating at 50 KV and 90 mA. Details of datacollection are given in Table 1. A second native data set to 1.87 Å res-olution was collected at beamline X12-C at NSLS, Brookhaven, using aMAR Research scanner. All data processing and scaling was carriedout using DENZO and SCALEPACK [25]. The two native data setswere merged to give a final data set containing 62 509 unique reflec-tions to 1.87 Å, (91.3% overall completeness, 82.8% in the highestresolution shell, (1.97–1.87 Å), with an overall merging R-factor of0.088. Heavy metal derivatives were prepared by soaking crystals withsolutions of heavy metal compounds in mother liquor for several hours.

Phase determinationThe heavy metal sites could be located by manual inspection of the dif-ference Patterson maps. Starting from the platinum derivative, theheavy metal positions in the other derivatives were located by cross-phased difference Fourier maps. The metal sites in the HgMi andHgAc2 derivatives were identical. The parameters of the heavy metalderivatives were refined using MLPHARE [26]. The final figure of meritat 2.2 Å resolution after phase refinement was 0.38. Table 2 summa-rizes the phasing statistics. The MIR phases were improved by asolvent flattening and histogram matching procedure using the pro-gramme SQUASH [27]. All crystallographic computing was carried outwith the CCP4 program suite [28].

722 Structure 1996, Vol 4 No 6

Table 1

Data collection results.

Data set No. of Resolution Measured Unique Rmerge†

crystals (Å) reflections reflections*

Nat(1) 1 2.0 330 510 47 372 (84.0%) 0.064Nat(2) 1 1.87 167 704 59 359 (83.0%) 0.062Merged

native 2 1.87 498 214 62 509 (91.3%) 0.088PIP‡(1) 1 2.2 142 682 32 311 (75.5%) 0.066PIP‡(2) 1 2.2 109 429 25 526 (60.1%) 0.086HgAc2(1) 1 2.2 243 823 35 624 (84.0%) 0.068HgAc2(2) 1 2.2 64 507 13 348 (40.7%) 0.086HgMi§ 1 2.2 105 944 25 812 (60.7%) 0.054

*The percentage of the total number of reflections is given inparentheses. †Rmerge=SSi|Ii–<I>|/ S|<I>, where Ii are the intensitymeasurements for a reflection and <I> is the mean value for thisreflection. ‡PIP, di-m-iodobis(ethylenediamine)diplatinum(II). §HgMi,CH3OC2H4HgCl.

Table 2

Phasing statistics.

Derivative Number of sites Rderiv* Rcullis† Phasing power‡

HgAc2 2 13.5 0.84 1.07HgAc2(2) 2 24.5 0.83 0.92HgMi 2 16.3 0.81 1.17PIP 2 14.9 0.85 0.95PIP(2) 2 15.4 0.84 1.05

*Rderiv=S|FPH–FP|/ S|FP| where FPH is the structure factor amplitude ofthe derivative crystals and FP is that of the native crystal.†Rcullis=S|FPH±FP|–|FH (calc)|/ S|FPH–FP|, where FPH and FP are definedas above and FH(calc) is the calculated heavy atom structure factoramplitude summed over centric reflections only.‡Phasing power=F(H)/E, the rms heavy atom structure factoramplitudes divided by lack of closure error.

From native Patterson maps, it was established that the crystal asym-metric unit contains two subunits related by a translation of x=0, y=0.5and z=0.136 [9]. This non-crystallographic symmetry was confirmedby the heavy metal positions and subsequently refined using theprogram IMP [29].

Model building and crystallographic refinementThe data set Nat1 was used for phase refinement, solvent flattening andcalculation of the initial density maps and subsequent phase combina-tion. The MIR map and the solvent flattened map were of very goodquality and allowed tracing of the polypeptide main chain. A polyalaninemodel comprising about 85% of the residues was built using O [10].The map was improved by cycles of refinement using X-PLOR [30],phase combination using SIGMAA [31] and model building. In theserefinement runs, the parameters described by Engh and Huber [32] wereused, and strict non-crystallographic symmetry constraints and an overallB-factor were applied. After a few rounds of refinement and manual inter-vention, the biological amino-acid sequence could be fitted to the elec-tron density. At this stage, crystallographic refinement was continuedwith the new merged data set containing the synchrotron data. A test set(2.5% of the reflections) was set aside to monitor progress of the refine-ment by means of the free R-factor [33]. In the last refinement cycle, indi-vidual B-factors were refined. Non-crystallographic symmetry constraintswere kept throughout the whole refinement procedure, as their releasedid not lower the free R-factor. The final R-factor is 20.1% and the freeR-factor is 23.4% in the resolution interval 5.5–1.87 Å. The stereochem-istry of the model is good with rms deviations for bond lengths of0.006 Å and bond angles of 1.521°. Table 3 gives the final refinementstatistics and stereochemical parameters for the model. The proteinmodel has been analyzed with the program PROCHECK [34].

Structure comparisons Refined models for F-1,6-P aldolase from human [12] and rabbit [13]muscle (Protein data bank [35] accession codes 1fba and 1ald, respec-tively) were used for the structural comparisons. Superpositions of thestructures were made by least-squares methods using the program O[10] or the program TOP (G Lu, unpublished program). For the align-ments using O, a small set of core atoms consisting of about 32 Caatoms from the eight strands of the a/b barrels were assigned for theinitial alignment which was subsequently used to maximize the numberof structurally equivalent Ca atoms. Atoms were considered equivalent ifthey were within 3.8 Å distance from each other and within a consecu-tive region consisting of at least four equivalent residues. Two cases ofinitial alignments were tried. In one alignment, the b strands of transal-dolase were aligned with those of F-1,6-P aldolase in order of appear-ance, i.e. strand b1 with strand b1, strand b2 with strand b2 and soforth. In a second run, the two b strands carrying the Schiff-base-forming lysine were the starting points of the alignment, i.e. b4 oftransaldolase was aligned with b6 of F-1,6-P aldolase, b5 with b7 andso forth. Structurally homologous residues in the two superpositions are

aligned in Figure 6. Superpositions were also carried out using theprogram TOP (G Lu, unpublished program). In this case, no initialmanual assignment of equivalent atoms is required.

Accession numbersAtomic coordinates for the refined transaldolase model have beendeposited with the Protein Data Bank, Brookhaven.

AcknowledgementsWe thank the staff, in particular RM Sweet, at the Department of Biology,Brookhaven National Laboratory, for access to beamline X12-C at NationalSynchrotron Light Source. This work was supported by a grant from theSwedish Natural Science Research Council. GS acknowledges receipt of amajor equipment grant from the Knut and Alice Wallenberg Foundation andGAS, US, and HS acknowledge a grant from the Deutsche Forschungs-gemeinschaft through SFB 380/B21.

References1. Gubler, C.J. (1991). Physiological functions and mechanism of action

of transketolase. In Biochemistry and Physiology of ThiaminDiphosphate Enzymes. (Bisswanger, H. & Ullrich, J., eds),pp. 311–322, Verlag Chemie, Weinheim, Germany.

2. Venkatamaran, R. & Racker, E. (1961). Mechanism of action oftransaldolase. J. Biol. Chem. 236, 1883–1886.

3. Sprenger, G.A., Schörken, U., Sprenger, G. & Sahm, H. (1995).Transaldolase B of Escherichia coli K-12: cloning of its gene, talB,and characterization of the enzyme from recombinant strains.J. Bacteriology 177, 5930–5936.

4. Miosga, T., Schaaff-Gerstenschläger, I., Franken, E. & Zimmermann,F.K. (1993). Lysine144 is essential for the catalytic activity ofSaccharomyces cerevisiae transaldolase. Yeast 9, 1241–1249.

5. Tsolas, O. & Lai, C.Y. (1972). Transaldolase. In The Enzymes. (Boyer,P.D., ed), vol. 7, pp. 259–280, Academic Press, New York, NY.

6. Schaaff, I., Hohmann, S. & Zimmermann, F.K. (1990). Molecularanalysis of the structural gene for yeast transaldolase. Eur. J.Biochem. 188, 597–603.

7. Banki, K., Halladay, D. & Perl, A. (1994). Cloning and expression ofthe human gene for transaldolase. J. Biol. Chem. 269, 2847–2851.

8. Toone, E.J., Simon, E.S., Bednarski, M.D. & Whitesides, G.M. (1989).Enzyme-catalyzed synthesis of carbohydrates. Tetrahedron Lett. 45,5365–5421.

9. Jia, J., Lindqvist, Y., Schneider, G., Schörken, U., Sahm, H. &Sprenger, G.A. (1996). Crystallization and preliminary X-raycrystallographic analysis of recombinant transaldolase B fromEscherichia coli. Acta Cryst. D 52, 192–193.

10. Jones, T.A., Zou, J.Y., Cowan, S. & Kjeldgaard, M. (1991). Improvedmethods for building protein models in electron density maps andlocation of errors in these models. Acta Cryst. A 47, 110–119.

11. Lai, C.Y., Chen, C. & Tsolas, O. (1967). Isolation and sequenceanalysis of a peptide from the active site of transaldolase. Arch.Biochem. Biophys. 121, 790–797.

12. Gamblin, S.J., Cooper, B., Millar, J.R., Davies, G.J., Littlechild, J.A. &Watson, H.C. (1990). The crystal structure of human muscle aldolaseat 3.0 Å resolution. FEBS Lett. 262, 182–286.

13. Sygusch, J., Beaudry, D. & Allaire, M. (1987). Molecular architecture ofrabbit skeletal muscle aldolase at 2.7 Å resolution. Proc. Natl. Acad.Sci. USA 84, 7846–7850.

14. Hester, G., et al., & Piontek, K. (1991). The crystal structure offructose-1,6-bisphosphate aldolase from Drosophila melanogaster at2.5 Å resolution. FEBS Lett. 292, 237–242.

15. Mavridis, J.M., Hatada, M.H., Tulinsky, A. & Lebioda, D. (1992).Structure of 2-keto-3-deoxy-6-phosphogluconate aldolase at 2.8 Åresolution. J. Mol. Biol. 162, 419–444.

16. Izard, T., Lawrence, M.C., Malby, R.L., Lilley, G.G. & Colman, P.M.(1994). The three-dimensional structure of N-acetylneuraminate lyasefrom Escherichia coli. Structure 2, 361–369.

17. Mirwaldt, C., Korndörfer, I. & Huber, R. (1995). The crystal structure ofdihydropicilinate synthase from Escherichia coli at 2.5 Å resolution.J. Mol. Biol. 246, 227–239.

18. Littlechild, J.A. & Watson, H.C. (1993). A data-based reactionmechanism for type I fructose bisphosphate aldolase. Trends Biol.Sci. 18, 36–39.

19. Gamblin, S.J., Davies, G.J., Grimes, J.M., Jackson, R.M., Littlechild, J.A.& Watson, H.C. (1991). Activity and specificity of human aldolases.J. Mol. Biol. 219, 573–576.

Research Article Crystal structure of transaldolase Jia et al. 723

Table 3

Refinement statistics.

Resolution (Å) 1.87Number of unique reflections 62 509R-factor (%) 20.1%R-free (%) 23.4%Non-hydrogen atoms of protein 4948Water molecules 524Average B-factors (Å2)

Protein atoms 23.8Solvent atoms 36.9

Rms bond length deviation (Å) 0.006Rms bond angle deviation(°) 1.506Rms B (Å2) 1.211

20. Luger, K., Hommel., U., Herold, M., Hofsteenge, J. & Kirschner, K.(1989). Correct folding of circularly permuted variants of a ba barrelenzyme in vivo. Science 243, 206–210.

21. Ponting, C.P. & Russell, R.B. (1995). Swaposins; circularpermutations within genes encoding saposin homologues. TrendsBiol. Sci. 20, 179–180.

22. Heinemann, U. & Hahn, M. (1995). Circular permutations of proteinsequence: not so rare? Trends Biol. Sci. 20, 349–350.

23. Horecker, B.L., Tsolas, O. & Lai, C.Y. (1972). Aldolases. In TheEnzymes. (Boyer, P.D., ed), vol. 7, pp. 213–258, Academic Press,New York, NY.

24. Hartman, F.C. & Brown, P.J. (1976). Affinity labeling of a previouslyundetected essential lysyl residue in class I fructose bisphosphatealdolase. J. Biol. Chem. 251, 3057–3062.

25. Otwinowski, Z. (1993). DENZO: An Oscillation Data ProcessingProgram for Macromolecular Crystallography. Yale University, NewHaven, CT.

26. Otwinowski, Z. (1991). Maximum likelihood refinement of heavy atomparameters. In Isomorphous Replacement and Anomalous Scattering.Proceedings of the CCP4 Study Weekend (Wolf, W., Evans, P.R. &Leslie, A.G.W., eds), pp. 80–88, SERC Daresbury Laboratory,Warrington, UK.

27. Zhang, K.Y. (1993). SQUASH — Combining Constraints forMacromolecular Phase Refinement and Extension. Acta Cryst. D 49,213–222.

28. Collaborative Computational Project, Number 4. (1994). The CCP4suite: programs for protein crystallography. Acta Cryst. D 50,760–763.

29. Jones, T.A. (1992). a, yaap, asap, @#*? A set of averaging programs.In CCP4 Study weekend 1992: Molecular replacement. (Dodson,E.J., Gover, S. & Wolf, W., eds), pp. 91–105, SERC DaresburyLaboratory, Warrington, UK.

30. Brünger, A. (1989). Crystallographic refinement by simulatedannealing: application to crambin. Acta Cryst. A 45, 50–61.

31. Read, R.J. (1986). Improved coefficients for map calculation usingpartial structures with errors. Acta Cryst. A 42, 140–149.

32. Engh, R.A. & Huber, R. (1991). Accurate bond and angle parametersfor X-ray protein structure refinement. Acta Cryst. A 47, 392–400.

33. Brünger, A. (1992). Free R value: a novel statistical quantity forassessing the accuracy of crystal structure. Nature 355, 472–475.

34. Laskowski, R.A., MacArthur, M.W., Moss, D.S. & Thornton, J.M.(1993). PROCHECK: a program to check the stereochemical qualityof protein structures. J. Appl. Cryst. 26, 946–950.

35. Bernstein, F.C., et al., & Tasumi, M. (1977). The Protein Data Bank: acomputer-based archival file for macromolecular structures. J. Mol.Biol. 112, 535–542.

36. Kraulis, P.J. (1991). MOLSCRIPT: a program to produce both detailedand schematic plots of protein structure. J. Appl. Cryst. 24, 946–950.

37. Kabsch, W. & Sander, C. (1983). Dictionary of protein secondarystructure: pattern recognition of hydrogen bond and geometricalfeatures. Biopolymers 22, 2577–2637.

724 Structure 1996, Vol 4 No 6