Research Article 715

Crystal structure of transaldolase B from Escherichia coli suggests a circular permutation of the α/β barrel within the class I aldolase family

Jia Jia1, Weijun Huang1, Ulrich Schörken2, Hermann Sahm2, Georg A Sprenger2, Ylva Lindqvist1* and Gunter Schneider1*

Background: Transaldolase is one of the enzymes in the non-oxidative branch of the pentose phosphate pathway. It transfers a C3 ketol fragment from a ketose donor to an aldose acceptor. Transaldolase, together with transketolase, creates a reversible link between the pentose phosphate pathway and glycolysis. The enzyme is of considerable interest as a catalyst in stereospecific organic synthesis and the aim of this work was to reveal the molecular architecture of transaldolase and provide insights into the structural basis of the enzymatic mechanism.

Results: The three-dimensional (3D) structure of recombinant transaldolase B from E. coli was determined at 1.87 Å resolution. The enzyme subunit consists of a single eight-stranded α/β-barrel domain. Two subunits form a dimer related by a twofold symmetry axis. The active-site residue Lys132, which forms a Schiff base with the substrate, is located at the bottom of the active-site cleft.

Conclusions: The 3D structure of transaldolase is similar to structures of other enzymes in the class I aldolase family. Comparison of these structures suggests that a circular permutation of the protein sequence might have occurred in transaldolase, which nevertheless results in a similar 3D structure. This observation provides evidence for a naturally occurring circular permutation in an α/β-barrel protein. It appears that such genetic permutations occur more frequently during evolution than was previously thought.

Addresses: 1Division of Molecular Structural Biology, Department of Medical Biochemistry and Biophysics, Karolinska Institute, Doktorsringen 4, S-171 77 Stockholm, Sweden and 2Institut für Biotechnologie 1, Forschungszentrum Jülich GmbH, PO Box 1913, D-52425 Jülich, Germany.

*Corresponding authors. E-mail: YL, [email protected]; GS, [email protected]

Key words: enzyme mechanism, evolution, permutation, protein crystallography, transaldolase

Received: 14 Mar 1996
Revisions requested: 4 April 1996
Revisions received: 12 April 1996
Accepted: 16 April 1996

Structure 15 June 1996, 4:715–724

© Current Biology Ltd ISSN 0969-2126

Introduction

The pentose phosphate pathway for the metabolism of glucose-6-phosphate can be described as containing three rather distinct enzyme systems [1]. One of these systems catalyses a dehydrogenase/decarboxylase step which decarboxylates glucose-6-phosphate resulting in the formation of ribulose-5-phosphate and CO2 with concomitant reduction of NADP+. The pathway includes an isomerizing system which produces an equilibrium mixture of ribose-5-phosphate, ribulose-5-phosphate and xylulose-5-phosphate. Finally, the pathway contains a sugar rearrangement system which uses the enzymes transketolase and transaldolase to produce three to seven carbon carbohydrate intermediates which can be shunted into other metabolic pathways such as glycolysis. The pathway is very flexible and can adjust itself to the various changing requirements of the cell, for example, for metabolic intermediates or reducing power in the form of NADPH.

Transaldolase (D-sedoheptulose-7-phosphate:D-glyceraldehyde-3-phosphate dihydroxyacetone transferase, EC 2.2.1.2), one of the enzymes in the non-oxidative branch of the pentose phosphate pathway, catalyzes the reversible transfer of a dihydroxyacetone moiety, derived from fructose-6-phosphate, to erythrose-4-phosphate yielding sedoheptulose-7-phosphate and glyceraldehyde-3-phosphate. Catalysis by transaldolase proceeds through the formation of a covalent intermediate, a Schiff base between an active-site lysine residue and the dihydroxyacetone moiety [2], similar to the mechanism of other class I aldolases (Fig. 1). The subunit of transaldolase from E. coli has a molecular weight of 35 kDa and the enzyme forms a dimer in solution [3] as do the enzymes from Saccharomyces cerevisiae [4] and Candida utilis [5]. The gene for transaldolase B from E. coli codes for 317 amino acids, but the N-terminal methionine residue is cleaved off in the mature protein [3]. Comparison of the amino-acid sequences of transaldolases reveals overall sequence identities of greater than 50% between prokaryotic and eukaryotic species, for example 53% identity between the enzymes from E. coli and Saccharomyces cerevisiae and 55% identity between E. coli and human transaldolase [3,6,7].

Aldolases are of considerable interest as catalysts in stereospecific organic synthesis. In particular, fructose-1,6-phosphate aldolase (F-1,6-P aldolase) has been successfully employed for stereospecific carbon–carbon bond formation in the synthesis of monosaccharides [8]. As part

716 Structure 1996, Vol 4 No 6

Figure 1

Reactions catalyzed by (a) transaldolase and (b) fructose-1,6-bisphosphate aldolase.

(a) Transaldolase. Species shown: D-fructose 6-phosphate, Schiff-base intermediate, D-glyceraldehyde 3-phosphate, D-erythrose 4-phosphate, D-sedoheptulose 7-phosphate.

(b) Fructose-1,6-bisphosphate aldolase. Species shown: D-fructose 1,6-bisphosphate, Schiff-base intermediate, D-glyceraldehyde 3-phosphate, dihydroxyacetone phosphate.

of on-going efforts to explore the suitability of transaldolases as catalysts in organic synthesis, we have initiated crystallographic studies of this enzyme [9]. These analyses should reveal the molecular architecture of the catalyst and provide insights into the structural basis of the enzymatic mechanism.

In this paper, we describe the three-dimensional (3D) structure of recombinant transaldolase from E. coli at 1.87 Å resolution. We also present the results of a comparison of the structure of transaldolase with structures of other class I aldolases.

Results and discussion

Structure determination and electron-density map

The structure of transaldolase was solved by the multiple isomorphous replacement (MIR) method using five heavy metal derivatives. After solvent flattening and histogram matching, the electron-density map was of sufficient quality to trace the polypeptide chain of the enzyme. The map was subsequently improved by phase combination, using calculated phases from the partial model and the MIR phases. In this improved map, the complete biological amino-acid sequence was fitted into the electron density. In the final electron-density maps, there is continuous well-defined electron density for the whole polypeptide chain except for a few side chains at the surface of the molecule. A representative part of the 2|Fo|–|Fc| electron-density map after completion of the crystallographic refinement is shown in Figure 2. The overall residue-by-residue real-space correlation [10] between the model and the 2|Fo|–|Fc| electron-density map is 0.89. The heavy metal ions in the mercury derivatives bind at the same site, Cys239, and the platinum interacts with the side chain of Met312. The final model consists of 2×316 residues and 524 water molecules. One residue, Asp293, in each subunit was modelled with two alternative conformations. The protein model has good stereochemistry: the root mean square (rms) deviation of the bond lengths is 0.006 Å and the rms deviation for the bond angles is 1.521°. The Ramachandran plot (Fig. 3) shows that 96.5% of the residues are in the most favoured regions of φ,ψ space with no outliers in the disallowed regions. The only residue in the generously allowed region (Ser226) has very well defined electron density. The conventional crystallographic R-factor after refinement is 20.1% and R-free is 23.4% in the resolution interval 5.5–1.87 Å.

Overall structure

Structure of the subunit

The enzyme subunit consists of a single domain which has the fold of an eight-stranded α/β barrel (Fig. 4). Eight parallel β strands (β1–β8) form the core of the barrel. This core is surrounded by eight α helices (α1–α8) running approximately antiparallel with the β strands with the exception of helix α8 which has its helix axis almost perpendicular to strand β8. There are six additional α helices (αA–αF), three of which are inserted in loop regions between β strands and α helices of the barrel, two after strand 2 (αB and αC) and one after strand 6 (αD). One additional α helix (αA) is found at the N terminus and two more helices (αE and αF) occur at the C terminus of the polypeptide chain. Two of these helices, αB and αD, point with their N-terminal ends into the active site. The helix αF runs across one

Figure 2

Stereodiagram showing part of the final 2Fo–Fc electron-density map for transaldolase, contoured at 1.0σ.

half of the active-site cleft at the C-terminal end of the barrel and partly covers this cleft (Fig. 4). The last helix of the barrel, α8, is connected to the penultimate C-terminal helix αE (which packs against α4 and α5 of the barrel) by a very long loop wrapping around almost half of the barrel. A stereoview of the structure is shown in Figure 5 and the assignment of the secondary structure elements to the amino-acid sequence of transaldolase is shown in Figure 6.

Figure 3

Ramachandran plot (Psi versus Phi, in degrees) for the refined model of transaldolase; Ser226 is labelled. Triangles represent glycine and squares represent non-glycine residues.

Structure of the dimer

The dimer is formed between one of the subunits in the asymmetric unit and a symmetry mate of the second subunit. The dimer interface buries an area of 1600 Å² and involves interactions between the N-terminal parts of helix αF in the two subunits and between the β3–α3 loop region from one subunit and helix αE from the other. These interactions involve hydrophobic contacts and hydrogen bonds.

One major contribution to this interface area is the interaction between the N-terminal parts of helix αF of the two subunits across a non-crystallographic twofold symmetry axis. Hydrophobic contacts are formed between side chains of Gln287, Val292, Ala296, Arg300 and their corresponding symmetry mates in the second subunit. Other important subunit–subunit interactions in this area involve a hydrogen bond from the conserved Arg300 to the main chain oxygen of the conserved Asn286 and the corresponding hydrogen bond generated by the twofold symmetry. Asp293 is located in this interface region. The side chain of this residue has been modelled in two conformations. In one of the conformations, it forms a hydrogen bond to the same conformer in the second subunit. In the other conformation, the side chain points away from the second subunit and its charge is compensated for by the surrounding basic residues Arg300 from both subunits. The crystals are grown at pH 4.5 and it seems likely that the two observed conformations reflect the distribution of the protonated and deprotonated states of the carboxylate group at this pH.

Interactions in other parts of this interface region include mostly hydrophobic contacts from the loop connecting β3 with α3 of one subunit to the helix αE of the second

subunit. These involve the side chains of Arg100, Tyr103, Trp283, Leu282, and corresponding residues related by the twofold symmetry.

Figure 4

Schematic view of the subunit of transaldolase from E. coli. The β strands are coloured in green, the helices pink and the connecting loops yellow. Conserved polar amino acids at the active site are included as ball-and-stick models. The figure was generated using the program MOLSCRIPT [36].

The active site

As in all other α/β-barrel enzymes found so far, the active site is located at the C-terminal end of the β strands and the walls of the active-site cavity are formed by the loops connecting these β strands with the α helices. In the structure presented here, this cavity is accessible from the bulk solution despite the fact that helix αF runs across one half of the funnel-shaped opening of the cavity. A large number of ordered solvent molecules were found in the active-site cleft. At the bottom of this cleft, we find the ε-nitrogen of the conserved Lys132 that forms the Schiff base during catalysis. The identification of this particular lysine as the Schiff-base-forming residue in E. coli transaldolase is based on several lines of evidence. Firstly, the amino-acid sequence surrounding this residue shows homology to the amino-acid sequence of an active-site peptide, comprising the Schiff-base-forming lysine residue, derived from yeast Candida utilis transaldolase (Fig. 7) [11]. Secondly, site-directed mutagenesis of the corresponding Lys144 to a glutamine in transaldolase from the yeast Saccharomyces cerevisiae resulted in complete loss of activity [4], thus supporting the proposed role of this residue in catalysis. Finally, Lys132 is the only lysine residue present in the active site of transaldolase from E. coli.

In the close vicinity of this residue, several conserved polar residues are located (Fig. 4). Asp17 in β1 and Glu96 in β3 might act in proton transfer during catalysis. Other conserved polar residues such as Asn35 in the loop between β2 and α2, Ser176 in β6 and Ser226 in the loop between β7 and α7 are perhaps involved in substrate binding.

Both substrates of the reaction carry a phosphate group and the active site has therefore been analyzed for one or more possible phosphate-binding sites. Two α helices (αB and αD) point with their N-terminal ends into the active site and one of them might contribute to phosphate binding. However, both helices contain bulky side chains at their N-terminal ends and hydrogen bonds from the phosphate to main-chain nitrogens of one of the helices (as observed in other α/β barrel enzymes) are therefore not very likely. However, at the N terminus of helix αD there is a conserved arginine residue, Arg181, and in close proximity, on the loop connecting β7 with α7, there is another conserved arginine, Arg228. The guanidinium groups of these residues are about 9–12 Å away from the

Figure 5

Stereoview of the Cα trace of a subunit of transaldolase. Every tenth residue is indicated.

Figure 6

Assignment of secondary structure elements of transaldolase and sequence alignment of transaldolase with the amino-acid sequence of human muscle aldolase based on the comparison of their 3D structure. Secondary structure assignments were made with the program DSSP [37]. The first line shows secondary structure elements of transaldolase; second line, amino-acid sequence of transaldolase from E. coli [3]; third line, sequence alignment of human muscle aldolase, using the structural superposition in which the strands carrying the Schiff-base-forming lysine were aligned (alignment β4→β6); fourth line, sequence alignment of human muscle aldolase, using the structural alignment in which the strands were superposed in order of appearance (alignment β1→β1). Conserved residues in both enzymes are shaded in grey, and the Schiff-base-forming lysine residues are enclosed in rectangles.

Transaldolase sequence and secondary structure elements, in order of appearance in each block:

αA β1 α1 β2 αB αC
  1 MTDKLTSLRQYTTVVADTGDIAAMKLYQPQDATTNPSLILNAAQIPEYRKLIDDAVAWAK 60

α2 β3 α3
 61 QQSNDRAQQIVDATDKLAVNIGLEILKLVPGRISTEVDARLSYDTEASIAKAKRLIKLYN 120

β4 α4 β5 α5 β6
121 DAGISNDRILIKLASTWQGIRAAEQLEKEGINCNLTLLFSFAQARACAEAGVFLISPFVG 180

αD α6 β7 α7
181 RILDWYKANTDKKEYAPAEDPGVVSVSEIYQYYKEHGYETVVMGASFRNIGEILELAGCD 240

β8 α8 αE αF
241 RLTIAPALLKELAESEGAIERKLSYTGEVKARPARITESEFLWQHNQDPMAVDKLAEGIR 300

301 KFAIDQEKLEKMIGDLL 317

ε-amino group of Lys132 and they might participate in binding of the substrate's phosphate groups. No other structural features which would suggest a second phosphate-binding site were found. The kinetic mechanism of transaldolase is still unknown, but the presence of only one phosphate-binding site suggests that there is only one binding site for the two substrates, consistent with a ping-pong mechanism.

Comparison with other class I aldolases suggests a circular permutation of the α/β barrel

A number of enzymes belonging to the class I aldolase family have been studied by protein crystallography and the 3D structures of F-1,6-P aldolase from several species (human [12] and rabbit muscle [13] and Drosophila melanogaster [14]), 2-keto-3-deoxy-phosphogluconate aldolase [15], N-acetylneuraminate lyase [16] and dihydrodipicolinate synthase [17] are known. Among these enzymes, the reaction catalyzed by F-1,6-P aldolase is, in a chemical sense, very similar to one half of the transaldolase reaction, that is, aldol cleavage of a fructose-phosphate to produce glyceraldehyde-3-phosphate and a dihydroxyacetone moiety.

The overall topologies of transaldolase and F-1,6-P aldolase are quite similar. Both of them are eight-stranded α/β barrels, and contain an additional α helix at the N terminus. In F-1,6-P aldolase, this helix packs against the N-terminal end (with respect to the β strands) of the barrel, whereas in transaldolase the corresponding helix packs against the loop connecting α8 with αE. Three helices are inserted in the C-terminal loops of the barrel in both enzymes; however, the two helices at the very C terminus of the polypeptide chain are only found in transaldolase.

One obvious way to superpose the transaldolase and F-1,6-P aldolase barrels is by aligning their secondary structural elements by order of appearance, that is, strand β1 with strand β1, strand β2 with strand β2 and so forth (alignment β1→β1). This alignment results in 113 equivalent Cα atoms with an rms fit of 2.53 Å. It is obvious from the corresponding sequence alignment based on this structural superposition that the two Schiff-base-forming lysine residues (Lys132 of the transaldolase and Lys229 of the F-1,6-P aldolase) are not aligned (Fig. 6). Lys132 of the transaldolase corresponds to another lysine of the F-1,6-P

aldolase, Lys146, which is also in the active site and was suggested to bind the C1-phosphate group of the substrate and to participate in the catalytic reaction [18]. The Schiff-base-forming residue in F-1,6-P aldolase is aligned with Ser176 in transaldolase. Of the 113 equivalent residues, nine are conserved between the F-1,6-P aldolase and transaldolase. Five of these residues are hydrophobic and with the exception of the above mentioned lysine residues and two of the hydrophobic residues, all of the remaining conserved residues are far from the active-site pocket.

Figure 7

Active-site peptide of transaldolase from Candida utilis, carrying the Schiff-base-forming lysine residue, and alignment of homologous regions in the amino-acid sequences of transaldolases. Conserved residues are indicated by bold letters.

Another way of aligning the two barrels is by aligning the β strands carrying the Schiff-base-forming lysine residue, that is, β4 of transaldolase with β6 of F-1,6-P aldolase and so forth (β4→β6). Such an alignment, using O, results in a significantly higher number of structurally equivalent Cα atoms, 146 atoms with an rms fit of 2.17 Å (Fig. 8).

Three of the remaining six possible ways to superpose the two barrels result in alignments with similar statistics to the β1→β1 alignment, with 100–109 equivalenced Cα atoms (rms varying between 2.29 Å and 2.35 Å). The remaining three possible superpositions yield alignments with only 80–89 equivalent Cα atoms (rms between 2.01 Å and 2.48 Å).

A different approach to reveal topological and structural similarities is taken by the program TOP (G Lu, unpublished program). This program does not require an initial set of equivalent atoms, but identifies topological similarities in proteins by finding the optimal match of secondary structural elements. Comparison of the transaldolase with the F-1,6-P aldolase structure using TOP reveals that, from a topological point of view, both structures are significantly more similar when strands β4→β6 are aligned (87 matched residues, rms 1.48 Å, compared with the next best alignment, β1→β1, which results in 64 matched residues, rms 1.73 Å).

In the β4→β6 based superpositions, the two Schiff-base-forming lysine residues are at equivalent positions, and we now find several polar residues that are conserved in both transaldolases and aldolases at corresponding locations in the 3D structure. All in all, 13 of the equivalenced residues are conserved between E. coli transaldolase and F-1,6-P aldolase, and five of them are located at the active site. Glu96 of the transaldolase superimposes with Glu187 of F-1,6-P aldolase, and Ser176 of the transaldolase with Ser300 of F-1,6-P aldolase. Furthermore, the side chains of the conserved residues Asp17 of transaldolase and Asp33 of F-1,6-P aldolase are at the same position in 3D space, but they are located on different β strands.

In addition, this way of superposition also aligns the proposed phosphate-binding sites for the C6 phosphate. The binding site of this phosphate group in F-1,6-P aldolase is located in a cleft between the loops after β8 and β1 [18,19]. Arg303 in F-1,6-P aldolase is located at the N terminus of an additional α helix after β8 and its side-chain position is very close to the position of the side chain of Arg181 in transaldolase, also located at the N terminus of an additional α helix.

In summary, transaldolase and F-1,6-P aldolase are structurally very similar in spite of the lack of significant overall amino acid sequence homology. Given the high degree of overall structural similarity along with the conservation of some active-site residues and of the

Figure 8

Superposition of a single subunit from transaldolase (red) and aldolase (blue). The superposition is based on alignment β4→β6 (for details see text).

phosphate-binding site, it seems likely that both enzyme families have evolved from a common ancestor. However, optimization of structural overlay and optimal alignment of the active-site residues between F-1,6-P aldolases and transaldolases require a circular permutation of the secondary structural elements of the barrel. We therefore suggest that such a permutation in an ancestral aldolase gene might have occurred during evolution, whereby the first two β strands of this ancient aldolase were moved to the C terminus. It is of interest to note that in KDPG aldolase [15], N-acetylneuraminate lyase [16] and dihydrodipicolinate synthase [17], the three other class I aldolases with known 3D structure, the Schiff-base-forming lysine is located on strand β6. The similarity in chemistry and substrates between F-1,6-P aldolase and transaldolase implies that these enzymes are more closely related to each other than to the other enzymes of this family. If this assumption is correct (it is also supported by the conservation of active-site residues other than the Schiff-base-forming lysine residue amongst F-1,6-P aldolases and transaldolases) we conclude that the common ancestor of the class I aldolase family had the lysine group at β strand six. Furthermore, the permutation of the gene resulting in the rearrangement of the secondary elements should have occurred at a later stage in evolution, after N-acetylneuraminate lyase and dihydrodipicolinate synthase had diverged from F-1,6-P aldolase and transaldolase.

In a series of elegant experiments, Kirschner and colleagues [20] have shown that circular permutations of the α/β-barrel motif at the level of the gene can result in properly folded, active enzymes. The present study provides evidence that such permutations in α/β barrels might indeed have occurred naturally during evolution. Naturally occurring circular permutations of protein sequences within a single domain have been found in other instances: the saposin domains [21] and β-glucanases [22]. It might be that such circular permutations of gene sequences, with preservation of the 3D structure of the encoded protein, occur more often than previously thought.

Another possible explanation for these findings would be that we observe a case of convergent evolution imposed by the chemical constraints of the reaction to be catalyzed. Although convergent evolution cannot be ruled out unambiguously, the higher structural similarity found between the permuted structures indicates that a circular permutation is the more likely explanation.

Implications for the catalytic mechanism

Important polar residues at the active site that have been implicated in substrate binding and/or catalysis are conserved between F-1,6-P aldolase and transaldolase. This conservation indicates further mechanistic similarities beyond the mere formation of a Schiff-base intermediate. Littlechild and Watson [18] have proposed a reaction mechanism for F-1,6-P aldolase. Based on the crystal structure of the enzyme from three species and amino acid sequence comparisons, they identified a number of conserved amino acids surrounding the Schiff-base-forming lysine residue. In addition to this residue, two acidic residues are conserved between the active sites of F-1,6-P aldolase and transaldolases. Glu96 is very close to the ε-NH2 group of the lysine residue, but a function for this residue other than maintaining a neutral electrostatic potential at the active site [18] had not been discussed previously. Its proximity to Lys132 suggests to us that this residue might participate in proton abstraction from the ε-NH2 group of Lys132, thus facilitating nucleophilic attack of this nitrogen on the carbonyl carbon of the substrate. Another acidic residue at the active site of F-1,6-P aldolase, Asp33, has been implicated in protonating the leaving hydroxyl group of the carbinolamine intermediate. In transaldolase, the conserved residue Asp17 is found at the corresponding position in 3D space and might fulfil a similar role in the transaldolase reaction.

In addition to these residues, we can identify another polar amino acid at the active site which is conserved in the two enzyme families. Ser176 in transaldolase and the corresponding residue Ser300 in F-1,6-P aldolase are within 5 and 3 Å, respectively, of the ε-NH2 group of the Schiff-base-forming lysine residue and might be involved in substrate binding.

One important difference in the active-site structure between the two enzymes is the absence of the 'second' lysine residue [23] in transaldolase. This lysine residue, Lys146 in F-1,6-P aldolase, is close to the Schiff-base-forming lysine and has been implicated in binding of the C1 phosphate group [18,24]. No second lysine residue is found in the active site of E. coli transaldolase. As the natural substrates of transaldolase are not phosphorylated at the C1 position, there might be no need for this residue to be conserved. This lysine was also proposed to be a key player in formation and cleavage of the Schiff base by acting as an acid/base catalyst [18]. At the position of Lys146, we find in transaldolase residue Thr33, conserved in the transaldolases. While this side chain can participate in hydrogen bonding and in this way might stabilize the Schiff-base intermediate, it is not very likely that this residue is acting as an acid/base catalyst in the formation and breakdown of the Schiff-base intermediate.

However, we want to point out that so far, no structural information on an F-1,6-P aldolase or transaldolase complexed with substrate or substrate analogues is available. Thus, a more definite assignment of the function of active-site residues will require crystallographic analysis of such complexes.
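The strand correspondence implied by the proposed circular permutation can be made concrete with a small sketch. It assumes only what the comparison with F-1,6-P aldolase states: both barrels have eight β strands, and strand β4 of transaldolase is equivalent to strand β6 of F-1,6-P aldolase, that is, a cyclic shift of two positions. The function name and the shift parameter are illustrative and are not part of any program used in this work.

```python
N_STRANDS = 8  # both enzymes are eight-stranded alpha/beta barrels


def aldolase_strand(transaldolase_strand: int, shift: int = 2) -> int:
    """Map a transaldolase strand number (1-8) to its aldolase equivalent.

    shift=2 encodes the beta4 -> beta6 alignment described in the text.
    Assumption: the permutation is a pure cyclic shift of the strand order.
    """
    if not 1 <= transaldolase_strand <= N_STRANDS:
        raise ValueError("strand number must be between 1 and 8")
    return (transaldolase_strand - 1 + shift) % N_STRANDS + 1


# Full correspondence table for the eight strands of the barrel.
mapping = {i: aldolase_strand(i) for i in range(1, N_STRANDS + 1)}
for tal, ald in mapping.items():
    print(f"transaldolase beta{tal} -> aldolase beta{ald}")
```

Under this mapping, transaldolase strands β7 and β8 correspond to aldolase strands β1 and β2, which is the sense in which the first two strands of the ancestral aldolase would have been moved to the C terminus.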

Biological implications Molecular Biology, Uppsala, Sweden. The detector was mounted on a Transaldolase, one of the enzymes in the pentose phos- Rigaku rotating anode, operating at 50 KV and 90 mA. Details of data collection are given in Table 1. A second native data set to 1.87 Å res- phate pathway, catalyses the reversible transfer of a olution was collected at beamline X12-C at NSLS, Brookhaven, using a dihydroxy moiety between various sugar phosphates. As MAR Research scanner. All data processing and scaling was carried occurs in other enzymes of the class I aldolase family, out using DENZO and SCALEPACK [25]. The two native data sets transaldolase forms a covalent Schiff-base adduct were merged to give a final data set containing 62 509 unique reflec- tions to 1.87 Å, (91.3% overall completeness, 82.8% in the highest between the substrate and the side chain of an active-site resolution shell, (1.97–1.87 Å), with an overall merging R-factor of lysine residue. Aldolases, in general, are of considerable 0.088. Heavy metal derivatives were prepared by soaking crystals with interest for stereospecific carbon–carbon bond formation solutions of heavy metal compounds in mother liquor for several hours. in organic synthesis. The crystal structure determina- tion of transaldolase provides the structural framework Table 1 for attempts to modify substrate specificity of this enzyme by rational design. Data collection results.

† Data set No. of Resolution Measured Unique Rmerge The enzyme is a homodimer and each monomer con- crystals (Å) reflections reflections* sists of an eight-stranded ␣/␤ barrel, the same overall fold found for other enzymes in the class I aldolase Nat(1) 1 2.0 330 510 47 372 (84.0%) 0.064 family. The active site is at the C-terminal end of the Nat(2) 1 1.87 167 704 59 359 (83.0%) 0.062 Merged barrel, at the bottom of a funnel-shaped cleft formed by native 2 1.87 498 214 62 509 (91.3%) 0.088 the loops connecting the eight ␤ strands with the PIP‡(1) 1 2.2 142 682 32 311 (75.5%) 0.066 ␣ helices. In the interior of this cleft, Lys132, which PIP‡(2) 1 2.2 109 429 25 526 (60.1%) 0.086 forms a Schiff-base with the substrate during catalysis, HgAc2(1) 1 2.2 243 823 35 624 (84.0%) 0.068 HgAc2(2) 1 2.2 64 507 13 348 (40.7%) 0.086 is located. The ⑀-nitrogen atom of this residue is sur- HgMi§ 1 2.2 105 944 25 812 (60.7%) 0.054 rounded by a number of conserved polar residues which might be involved in proton transfer during catalysis *The percentage of the total number of reflections is given in † ⌺⌺ ԽI I Խ ⌺Խ I I and/or substrate binding. The active-site topology of parentheses. Rmerge= i i–< > / < >, where i are the intensity measurements for a reflection and is the mean value for this transaldolase suggests some mechanistic differences reflection. ‡PIP, di-␮-iodobis(ethylenediamine)diplatinum(II). §HgMi, from fructose-1,6-phosphate aldolases with respect to CH3OC2H4HgCl. formation and cleavage of the Schiff-base intermediate.

Structural comparisons of transaldolase with other aldolase structures reveal that the residue forming the Schiff base, located on strand β6 in aldolases, is found on strand β4 in transaldolase. This suggests the possibility of a circular permutation in the aldolase gene which moved the first two α/β units of an ancient aldolase gene to the C terminus during evolution. Thus, transaldolase might be the first example of a naturally occurring circular permutation in an α/β barrel. Together with other recent observations, this result suggests that such circular permutations in gene sequences, resulting in permutations of protein structure within a domain with preservation of the overall 3D fold, occur more often than previously thought.

Materials and methods

Enzyme preparation and crystallization
Transaldolase from recombinant E. coli K-12 cells (strain DH5/pGSJ451) was purified to electrophoretic homogeneity by successive ammonium sulphate precipitation and two anion exchange chromatography steps [3]. Crystals of transaldolase were obtained by a combination of micro- and macroseeding techniques using the hanging drop vapour diffusion method [9]. The crystals were orthorhombic, space group P2₁2₁2₁ with cell dimensions a=68.9 Å, b=91.3 Å and c=130.5 Å.

Data collection and screening of heavy atom compounds
Native and derivative data were collected at 2.0 Å and 2.2 Å resolution, respectively, on an R-axis II imaging plate detector at the department of

Phase determination
The heavy metal sites could be located by manual inspection of the difference Patterson maps. Starting from the platinum derivative, the heavy metal positions in the other derivatives were located by cross-phased difference Fourier maps. The metal sites in the HgMi and HgAc2 derivatives were identical. The parameters of the heavy metal derivatives were refined using MLPHARE [26]. The final figure of merit at 2.2 Å resolution after phase refinement was 0.38. Table 2 summarizes the phasing statistics. The MIR phases were improved by a solvent flattening and histogram matching procedure using the programme SQUASH [27]. All crystallographic computing was carried out with the CCP4 program suite [28].

Table 2

Phasing statistics.

Derivative   Number of sites   Rderiv*   Rcullis†   Phasing power‡

HgAc2        2                 13.5      0.84       1.07
HgAc2(2)     2                 24.5      0.83       0.92
HgMi         2                 16.3      0.81       1.17
PIP          2                 14.9      0.85       0.95
PIP(2)       2                 15.4      0.84       1.05

*$R_{deriv} = \sum |F_{PH} - F_P| / \sum |F_P|$, where $F_{PH}$ is the structure factor amplitude of the derivative crystals and $F_P$ is that of the native crystal. †$R_{cullis} = \sum \bigl| |F_{PH} \pm F_P| - |F_H(\mathrm{calc})| \bigr| / \sum |F_{PH} - F_P|$, where $F_{PH}$ and $F_P$ are defined as above and $F_H(\mathrm{calc})$ is the calculated heavy atom structure factor amplitude summed over centric reflections only. ‡Phasing power = $F(H)/E$, the rms heavy atom structure factor amplitudes divided by lack of closure error.
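The Rderiv and phasing power definitions in the footnotes to Table 2 can be written out as a small sketch. The function names, the plain-list inputs and the toy amplitudes are assumptions for illustration only, and taking E as the rms lack of closure error is an assumed reading of the footnote; this is not the MLPHARE implementation. Table 2 quotes Rderiv as a percentage, whereas the function below returns a fraction.

```python
import math


def r_deriv(f_ph, f_p):
    """Rderiv = sum|F_PH - F_P| / sum|F_P| over matched reflections.

    f_ph and f_p are equal-length sequences of derivative and native
    structure factor amplitudes (hypothetical input layout).
    """
    numerator = sum(abs(ph - p) for ph, p in zip(f_ph, f_p))
    denominator = sum(abs(p) for p in f_p)
    return numerator / denominator


def rms(values):
    """Root mean square of a non-empty sequence of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def phasing_power(f_h_calc, lack_of_closure):
    """Phasing power = rms heavy atom amplitude / rms lack of closure error.

    Assumption: E in the footnote is taken as the rms lack of closure.
    """
    return rms(f_h_calc) / rms(lack_of_closure)


# Toy amplitudes, invented for illustration.
print(round(100 * r_deriv([110.0, 90.0, 100.0], [100.0, 100.0, 100.0]), 2))
print(round(phasing_power([3.0, 4.0], [5.0, 5.0]), 3))
```

With these definitions, a phasing power above 1 means the heavy atom signal is larger than the lack of closure error, which gives a quick way to read the last column of Table 2.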

From native Patterson maps, it was established that the crystal asymmetric unit contains two subunits related by a translation of x=0, y=0.5 and z=0.136 [9]. This non-crystallographic symmetry was confirmed by the heavy metal positions and subsequently refined using the program IMP [29].

Model building and crystallographic refinement
The data set Nat(1) was used for phase refinement, solvent flattening and calculation of the initial density maps and subsequent phase combination. The MIR map and the solvent flattened map were of very good quality and allowed tracing of the polypeptide main chain. A polyalanine model comprising about 85% of the residues was built using O [10]. The map was improved by cycles of refinement using X-PLOR [30], phase combination using SIGMAA [31] and model building. In these refinement runs, the parameters described by Engh and Huber [32] were used, and strict non-crystallographic symmetry constraints and an overall B-factor were applied. After a few rounds of refinement and manual intervention, the biological amino-acid sequence could be fitted to the electron density. At this stage, crystallographic refinement was continued with the new merged data set containing the synchrotron data. A test set (2.5% of the reflections) was set aside to monitor progress of the refinement by means of the free R-factor [33]. In the last refinement cycle, individual B-factors were refined. Non-crystallographic symmetry constraints were kept throughout the whole refinement procedure, as their release did not lower the free R-factor. The final R-factor is 20.1% and the free R-factor is 23.4% in the resolution interval 5.5–1.87 Å. The stereochemistry of the model is good with rms deviations for bond lengths of 0.006 Å and bond angles of 1.521°. Table 3 gives the final refinement statistics and stereochemical parameters for the model. The protein model has been analyzed with the program PROCHECK [34].

Table 3

Refinement statistics.

Resolution (Å)                    1.87
Number of unique reflections      62 509
R-factor (%)                      20.1
R-free (%)                        23.4
Non-hydrogen atoms of protein     4948
Water molecules                   524
Average B-factors (Å²)
  Protein atoms                   23.8
  Solvent atoms                   36.9
Rms bond length deviation (Å)     0.006
Rms bond angle deviation (°)      1.506
Rms B (Å²)                        1.211

Structure comparisons
Refined models for F-1,6-P aldolase from human [12] and rabbit [13] muscle (Protein Data Bank [35] accession codes 1fba and 1ald, respectively) were used for the structural comparisons. Superpositions of the structures were made by least-squares methods using the program O [10] or the program TOP (G Lu, unpublished program). For the alignments using O, a small set of core atoms consisting of about 32 Cα atoms from the eight strands of the α/β barrels was assigned for the initial alignment, which was subsequently used to maximize the number of structurally equivalent Cα atoms. Atoms were considered equivalent if they were within 3.8 Å distance from each other and within a consecutive region consisting of at least four equivalent residues. Two cases of initial alignments were tried. In one alignment, the β strands of transaldolase were aligned with those of F-1,6-P aldolase in order of appearance, i.e. strand β1 with strand β1, strand β2 with strand β2 and so forth. In a second run, the two β strands carrying the Schiff-base-forming lysine were the starting points of the alignment, i.e. β4 of transaldolase was aligned with β6 of F-1,6-P aldolase, β5 with β7 and so forth. Structurally homologous residues in the two superpositions are aligned in Figure 6. Superpositions were also carried out using the program TOP (G Lu, unpublished program). In this case, no initial manual assignment of equivalent atoms is required.

Accession numbers
Atomic coordinates for the refined transaldolase model have been deposited with the Protein Data Bank, Brookhaven.

Acknowledgements
We thank the staff, in particular RM Sweet, at the Department of Biology, Brookhaven National Laboratory, for access to beamline X12-C at the National Synchrotron Light Source. This work was supported by a grant from the Swedish Natural Science Research Council. GS acknowledges receipt of a major equipment grant from the Knut and Alice Wallenberg Foundation and GAS, US, and HS acknowledge a grant from the Deutsche Forschungsgemeinschaft through SFB 380/B21.

References
1. Gubler, C.J. (1991). Physiological functions and mechanism of action of transketolase. In Biochemistry and Physiology of Thiamin Diphosphate Enzymes. (Bisswanger, H. & Ullrich, J., eds), pp. 311–322, Verlag Chemie, Weinheim, Germany.
2. Venkataraman, R. & Racker, E. (1961). Mechanism of action of transaldolase. J. Biol. Chem. 236, 1883–1886.
3. Sprenger, G.A., Schörken, U., Sprenger, G. & Sahm, H. (1995). Transaldolase B of Escherichia coli K-12: cloning of its gene, talB, and characterization of the enzyme from recombinant strains. J. Bacteriology 177, 5930–5936.
4. Miosga, T., Schaaff-Gerstenschläger, I., Franken, E. & Zimmermann, F.K. (1993). Lysine144 is essential for the catalytic activity of Saccharomyces cerevisiae transaldolase. Yeast 9, 1241–1249.
5. Tsolas, O. & Lai, C.Y. (1972). Transaldolase. In The Enzymes. (Boyer, P.D., ed), vol. 7, pp. 259–280, Academic Press, New York, NY.
6. Schaaff, I., Hohmann, S. & Zimmermann, F.K. (1990). Molecular analysis of the structural gene for yeast transaldolase. Eur. J. Biochem. 188, 597–603.
7. Banki, K., Halladay, D. & Perl, A. (1994). Cloning and expression of the human gene for transaldolase. J. Biol. Chem. 269, 2847–2851.
8. Toone, E.J., Simon, E.S., Bednarski, M.D. & Whitesides, G.M. (1989). Enzyme-catalyzed synthesis of carbohydrates. Tetrahedron Lett. 45, 5365–5421.
9. Jia, J., Lindqvist, Y., Schneider, G., Schörken, U., Sahm, H. & Sprenger, G.A. (1996). Crystallization and preliminary X-ray crystallographic analysis of recombinant transaldolase B from Escherichia coli. Acta Cryst. D 52, 192–193.
10. Jones, T.A., Zou, J.Y., Cowan, S. & Kjeldgaard, M. (1991). Improved methods for building protein models in electron density maps and location of errors in these models. Acta Cryst. A 47, 110–119.
11. Lai, C.Y., Chen, C. & Tsolas, O. (1967). Isolation and sequence analysis of a peptide from the active site of transaldolase. Arch. Biochem. Biophys. 121, 790–797.
12. Gamblin, S.J., Cooper, B., Millar, J.R., Davies, G.J., Littlechild, J.A. & Watson, H.C. (1990). The crystal structure of human muscle aldolase at 3.0 Å resolution. FEBS Lett. 262, 182–286.
13. Sygusch, J., Beaudry, D. & Allaire, M. (1987). Molecular architecture of rabbit skeletal muscle aldolase at 2.7 Å resolution. Proc. Natl. Acad. Sci. USA 84, 7846–7850.
14. Hester, G., et al., & Piontek, K. (1991). The crystal structure of fructose-1,6-bisphosphate aldolase from Drosophila melanogaster at 2.5 Å resolution. FEBS Lett. 292, 237–242.
15. Mavridis, J.M., Hatada, M.H., Tulinsky, A. & Lebioda, D. (1992). Structure of 2-keto-3-deoxy-6-phosphogluconate aldolase at 2.8 Å resolution. J. Mol. Biol. 162, 419–444.
16. Izard, T., Lawrence, M.C., Malby, R.L., Lilley, G.G. & Colman, P.M. (1994). The three-dimensional structure of N-acetylneuraminate lyase from Escherichia coli. Structure 2, 361–369.
17. Mirwaldt, C., Korndörfer, I. & Huber, R. (1995). The crystal structure of dihydrodipicolinate synthase from Escherichia coli at 2.5 Å resolution. J. Mol. Biol. 246, 227–239.
18. Littlechild, J.A. & Watson, H.C. (1993). A data-based reaction mechanism for type I fructose bisphosphate aldolase. Trends Biol. Sci. 18, 36–39.
19. Gamblin, S.J., Davies, G.J., Grimes, J.M., Jackson, R.M., Littlechild, J.A. & Watson, H.C. (1991). Activity and specificity of human aldolases. J. Mol. Biol. 219, 573–576.

20. Luger, K., Hommel, U., Herold, M., Hofsteenge, J. & Kirschner, K. (1989). Correct folding of circularly permuted variants of a βα barrel enzyme in vivo. Science 243, 206–210.
21. Ponting, C.P. & Russell, R.B. (1995). Swaposins; circular permutations within genes encoding saposin homologues. Trends Biol. Sci. 20, 179–180.
22. Heinemann, U. & Hahn, M. (1995). Circular permutations of protein sequence: not so rare? Trends Biol. Sci. 20, 349–350.
23. Horecker, B.L., Tsolas, O. & Lai, C.Y. (1972). Aldolases. In The Enzymes. (Boyer, P.D., ed), vol. 7, pp. 213–258, Academic Press, New York, NY.
24. Hartman, F.C. & Brown, P.J. (1976). Affinity labeling of a previously undetected essential lysyl residue in class I fructose bisphosphate aldolase. J. Biol. Chem. 251, 3057–3062.
25. Otwinowski, Z. (1993). DENZO: An Oscillation Data Processing Program for Macromolecular Crystallography. Yale University, New Haven, CT.
26. Otwinowski, Z. (1991). Maximum likelihood refinement of heavy atom parameters. In Isomorphous Replacement and Anomalous Scattering. Proceedings of the CCP4 Study Weekend (Wolf, W., Evans, P.R. & Leslie, A.G.W., eds), pp. 80–88, SERC Daresbury Laboratory, Warrington, UK.
27. Zhang, K.Y. (1993). SQUASH: Combining Constraints for Macromolecular Phase Refinement and Extension. Acta Cryst. D 49, 213–222.
28. Collaborative Computational Project, Number 4. (1994). The CCP4 suite: programs for protein crystallography. Acta Cryst. D 50, 760–763.
29. Jones, T.A. (1992). a, yaap, asap, @#*? A set of averaging programs. In CCP4 Study weekend 1992: Molecular replacement. (Dodson, E.J., Gover, S. & Wolf, W., eds), pp. 91–105, SERC Daresbury Laboratory, Warrington, UK.
30. Brünger, A. (1989). Crystallographic refinement by simulated annealing: application to crambin. Acta Cryst. A 45, 50–61.
31. Read, R.J. (1986). Improved coefficients for map calculation using partial structures with errors. Acta Cryst. A 42, 140–149.
32. Engh, R.A. & Huber, R. (1991). Accurate bond and angle parameters for X-ray protein structure refinement. Acta Cryst. A 47, 392–400.
33. Brünger, A. (1992). Free R value: a novel statistical quantity for assessing the accuracy of crystal structure. Nature 355, 472–475.
34. Laskowski, R.A., MacArthur, M.W., Moss, D.S. & Thornton, J.M. (1993). PROCHECK: a program to check the stereochemical quality of protein structures. J. Appl. Cryst. 26, 283–291.
35. Bernstein, F.C., et al., & Tasumi, M. (1977). The Protein Data Bank: a computer-based archival file for macromolecular structures. J. Mol. Biol. 112, 535–542.
36. Kraulis, P.J. (1991). MOLSCRIPT: a program to produce both detailed and schematic plots of protein structure. J. Appl. Cryst. 24, 946–950.
37. Kabsch, W. & Sander, C. (1983). Dictionary of protein secondary structure: pattern recognition of hydrogen bond and geometrical features. Biopolymers 22, 2577–2637.