
doi:10.1016/j.jmb.2007.05.066 J. Mol. Biol. (2007) 371, 1007–1021

The Crystal Structure of the Cytosolic Exopolyphosphatase from Saccharomyces cerevisiae Reveals the Basis for Substrate Specificity

Emilie Ugochukwu, Andrew L. Lovering, Owen C. Mather, Thomas W. Young and Scott A. White⁎

The School of Biosciences, The University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK

Inorganic long-chain polyphosphate is a ubiquitous linear polymer in biology, consisting of many phosphate moieties linked by phosphoanhydride bonds. It is synthesized by polyphosphate kinases, and metabolised by a number of enzymes, including exo- and endopolyphosphatases. The Saccharomyces cerevisiae gene PPX1 encodes a 45 kDa, metal-dependent, cytosolic exopolyphosphatase that processively cleaves the terminal phosphate group from the polyphosphate chain, until inorganic pyrophosphate is all that remains. PPX1 belongs to the DHH family of phosphoesterases, which includes: family-2 inorganic pyrophosphatases, found in Gram-positive bacteria; prune, a cyclic AMPase; and RecJ, a single-stranded DNA exonuclease. We describe the high-resolution X-ray structures of PPX1, solved using the multiple isomorphous replacement with anomalous scattering (MIRAS) technique, and its complexes with phosphate (1.6 Å), sulphate (1.8 Å) and ATP (1.9 Å). Yeast PPX1 folds into two domains, and the structures reveal a strong similarity to the family-2 inorganic pyrophosphatases, particularly in the active-site region. A large, extended channel formed at the interface of the N and C-terminal domains is lined with positively charged amino acids and represents a conduit for polyphosphate and the site of phosphate hydrolysis. Structural comparisons with the inorganic pyrophosphatases and analysis of the ligand-bound complexes lead us to propose a hydrolysis mechanism. Finally, we discuss a structural basis for substrate selectivity and processivity.

© 2007 Elsevier Ltd. All rights reserved.

Keywords: exopolyphosphatase; DHH family; phosphoesterases; long-chain polyphosphate; family 2 PPase

*Corresponding author

Present addresses: E. Ugochukwu, Structural Genomics Consortium, University of Oxford, Botnar Research Centre, Oxford OX3 7LD, UK; A. L. Lovering, University of British Columbia, Life Sciences Center, 2350 Health Sciences Mall, Vancouver, Canada; O. C. Mather, Evotec (UK) Ltd, 111 Milton Park, Abingdon, Oxfordshire OX14 4RZ, UK.

Abbreviations used: B.s., Bacillus subtilis; MIRAS, multiple isomorphous replacement with anomalous scattering; … polyP, long-chain polyphosphate; polyP3 (4,5,…n), tri (tetra, penta)polyphosphate; PPase, inorganic pyrophosphatases; PPi, inorganic pyrophosphate; PPX, exopolyphosphatase; RMS(D), root-mean-square (deviation); S.c., Saccharomyces cerevisiae; S.c.-PPX1, S.c. cytosolic exopolyphosphatase; S.g., Streptococcus gordonii; S.m., Streptococcus mutans.

E-mail address of the corresponding author: [email protected]

0022-2836/$ - see front matter © 2007 Elsevier Ltd. All rights reserved.

Introduction

Inorganic long-chain polyphosphates (polyP) are linear molecules of variable length that may span anywhere from a few to several hundred phosphoryl monomers. The functions and properties of polyP, and the enzymes of polyP metabolism, have been studied for many years, principally by the Kornberg group and the Kulaev group.1,2 PolyP has been detected in many types of cells and […]. Although long regarded as simply a phosphate store,3,4 or metal chelator, it has been observed recently that a growing number of cellular processes depend on, or are triggered by, polyP metabolism, including long-term cell survival,5,6 cell sporulation or adaptation to a changing cellular environment,7,8 gene […] and regulation,1 cell envelope formation,9 and bacterial cell motility.10–12 Most recently, it has been proposed that polyP is a potent regulator of blood clotting in humans.13

PolyP is synthesised by polyphosphate kinases (PPKs), and hydrolysed by exo- or endopolyphosphatases (PPXs or PPNs),1 in addition to other enzymes.14,15 The yeast Saccharomyces cerevisiae (S.c.) possesses several PPXs with enzymatic activity observed in the cytosol, vacuole, mitochondrial matrix, mitochondrial membrane, cell envelope and nucleus.16,17 The S. cerevisiae exopolyphosphatase gene PPX1 encodes a low molecular mass cytosolic PPX (S.c.-PPX1, 45 kDa),18,19 as well as the functionally distinguishable PPX in the mitochondrial matrix.17,20 It may also encode a cell-envelope PPX,21 although it is possible that the observable PPX activity in the cell envelope has arisen by contamination of S.c.-PPX1.20 The PPXs in the vacuole,22 mitochondrial membrane,23 cell nucleus,24 and a high molecular mass cytosolic PPX (∼1000 kDa)25 are thought to be encoded by separate, but as yet uncharacterised, genes. The cytosolic PPXs have different substrate profiles, with S.c.-PPX1 having a preference for shorter chain polyP. In contrast to the high molecular mass cytosolic PPX, S.c.-PPX1 is also able to cleave the terminal phosphate group from nucleoside 5′ tetra- and pentaphosphates (e.g. p4A and p5A, with kcat values of 720 s−1 and 40 s−1, respectively, at an optimum pH of 4.6, and an apparent KM for p4A of 80 μM),26,27 but in common with its functional homologue, it cannot hydrolyse ATP. Both enzymes cleave the terminal phosphate group from polyP in a processive manner. S.c.-PPX1 polyphosphatase activity is dependent on divalent metal cations (Co2+ > Mn2+ > Mg2+ > Zn2+).28 The KM for polyP decreases with increasing chain length, with values ranging from 140 μM for polyP3 to 0.004 μM for polyP250.18 PolyP3 kcat values for S.c.-PPX1 purified from yeast cells have been reported as 180 s−1,18 and 280 s−1,28 (see also Results). In terms of the concentration of the polyP chain, kcat values decrease with increasing chain length but, expressed in terms of phosphoryl monomers, phosphate release is fairly constant over the whole range of polyP3 to polyP250, and approaches 500 Pi residues mol−1 s−1 at 37 °C.18

Two PPXs have been characterised structurally: the stringent response-related exopolyphosphatase/guanosine pentaphosphate phosphohydrolase (PPX/GPPA) from Aquifex aeolicus (PDB entries 1t6c and 1t6d),29 a monomeric enzyme that folds into two domains and is thought to be principally a GPPA; and the PPX from Escherichia coli (2flo,30 1u6z31), a dimer with four domains in each subunit. Domains 1 and 2 of E. coli PPX, responsible for the PPX activity, can be superimposed on the A. aeolicus PPX/GPPA with a root-mean-square deviation (RMSD) of 1.6 Å for 282 Cα pairs. The two enzymes share approximately 22% sequence identity and have been shown to belong to the sugar kinase/actin/hsp 70 superfamily of phosphotransferases.30 The active site in this superfamily forms at the interface of the two domains. Although no enzyme–substrate complex has been determined for these PPXs, a number of sulphate ions bound at the interface between domains 1 and 2 in 1u6z have been suggested to indicate the pathway of polyphosphate entrance into the active site.

With an amino acid sequence completely unrelated to the A. aeolicus and E. coli PPXs, S.c.-PPX1 is a member of the so-called DHH phosphoesterase family,32 a group of functionally related enzymes that includes: the family 2 inorganic pyrophosphatases (PPase†),33–36 prune,37 and RecJ,38 a single-stranded DNA exonuclease. The DHH family is named after the characteristic triplet motif, Asp-His-His, that contributes to the active site of each member. S.c.-PPX1 and family 2 PPases share approximately 26% sequence identity, with the most conserved regions contributing to the active site, and are expected to have a similar catalytic mechanism. Despite the similarity, S.c.-PPX1 has negligible activity against the substrate inorganic pyrophosphate (PPi),18,28 and family 2 PPases are poor at hydrolysing tripolyphosphate (polyP3) or longer chains.39

We report on the crystal structure determination of S.c.-PPX1 using multiple isomorphous replacement with anomalous scattering (MIRAS) and describe its structure bound with sulphate (native structure), phosphate and ATP. On the basis of the active site structure, and a comparison of S.c.-PPX1 with the PPases, a potential enzyme mechanism and the structural basis for substrate selectivity and processivity are discussed.

† Throughout this text PPase refers to the so-called family-2 PPases. Family-1 PPases, which are structurally unrelated functional homologues, are not discussed.

Results

Cloning, expression and purification

The gene PPX1 has been cloned from S. cerevisiae to enable over-expression of the full-length protein, amino acid residues 1–397. Over-expressed protein
was purified 17.4-fold using a two-step purification protocol, yielding approximately 1 mg of S.c.-PPX1 from 800 ml of cell culture. SDS-PAGE and electrospray ionization time-of-flight mass spectrometry confirmed that the protein was full-length and uncleaved with a parent ion peak of 45,058 Da, cf. a theoretical value of 45,051 Da. The purified recombinant S.c.-PPX1 had a specific activity of 2070 units/mg (equivalent to a turnover of 1550 s−1), where one unit corresponds to the release of 1 μmol of orthophosphate (Pi) per minute at 37 °C. These values are higher than those reported previously for non-recombinant enzyme.18,28

Bioinformatics analysis

For the interpretation and analysis of PPX structure and function, sequences homologous to S.c.-PPX1 were obtained using a Blast2 search (E cutoff 1×10−10) against the Uniprot Database.40 Removal of PPase and prune sequences yielded a multi-sequence alignment with an overall score of 37.0, consisting of 13 confirmed sequences and a further eight preliminary (whole genome shotgun) sequences, representing 16 fungal species (of which eight are […]), three parasites (Leishmania major, Trypanosoma cruzi and Trypanosoma brucei), one protozoan (Tetrahymena thermophila) and one fish (Tetraodon nigroviridis). The resulting consensus sequence, representative of 21 DHH family PPX sequences, is included in Figure 1.

The Structure of S.c.-PPX1

The X-ray structure of S.c.-PPX1 has been determined using the MIRAS technique and refined to 1.8 Å resolution, with excellent stereochemical quality (Tables 1 and 2). The native structure contains a molecule of sulphate in each active site, and will therefore be referred to as the sulphate-bound complex, or simply SO4. The structures of S.c.-PPX1 crystals soaked in the presence of 10 mM polyP3 and 5 mM CoCl2 (referred to as P3Co), and of 5 mM ATP and 5 mM MgCl2 (Mg.ATP), have also been determined to resolutions of 1.6 Å and 1.9 Å, respectively (Table 1). Since the P3Co structure is refined at the highest resolution and represents a metal-bound enzyme–phosphate complex with potential relevance to the physiological substrate-bound complex (see Discussion), most of the analysis and discussion of S.c.-PPX1 structure and function is based on this structure.

There are two copies of S.c.-PPX1 in the asymmetric unit, related by 178° around a non-crystallographic symmetry axis, approximately 30° off the c-axis (RMS deviation for the SO4 structure is 0.47 Å for 349 pairs of equivalent Cα atoms). S.c.-PPX1 appears to be a monomer in the crystalline phase, consistent with physiological measurements.18 There is well-defined electron density for the majority of both molecules; residues 5–397 were fit into each chain A, and residues 4–272 and 279–397 into each chain B. As predicted by the lack of sequence homology, S.c.-PPX1 is structurally unrelated

Figure 1. A structure-based alignment of S.c.-PPX1 and S.g.-PPase sequences. Secondary structure elements, as calculated by Procheck56 from the S.c.-PPX1 structure, are indicated: helices are lettered A–R and denoted by cylinders, β-strands are numbered 1–11 and denoted by arrows. Loops connecting secondary structure elements are named after the preceding and following strands or helices. PPX amino acid residues that contribute to the substrate-binding channel are indicated with reverse font or ringed for sections 1 or 2 of the channel, respectively. Highlighted in boxes are motifs I–IV, common to all DHH domains in the DHH family of phosphoesterases,32 and motifs V–VI, common to all DHHA2 domains. Immediately below the S.c.-PPX1 sequence is a consensus sequence based on the 21 DHH family PPX sequences obtained from the Uniprot database.40 Invariant positions are denoted in bold italics with the appropriate single-letter code. Bold numbers indicate a strongly conserved position across 21 sequences, where all amino acids belong to a similarity group (based on the Blosum62 similarity groups: (1) D, N; (2) E, Q; (3) S, T; (4) R, K; (5) F, Y, W; (6) I, L, V, M). Normal type (letters or numbers) denotes a position that is conserved across at least two-thirds of the sequences.
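The turnover number quoted in the purification section follows from the specific activity and the molar mass of the enzyme. The sketch below reproduces that arithmetic, assuming one active site per 45,051 Da polypeptide and the unit definition given in the text (1 μmol Pi per minute); the function name is illustrative and not taken from the paper.

```python
def turnover_per_second(units_per_mg, molar_mass_da):
    """Convert specific activity (umol product per min per mg enzyme) to kcat (s^-1).

    Assumes one active site per polypeptide chain of the given molar mass.
    """
    # 1 mg of enzyme contains 1000 / M micromoles of enzyme (M in g/mol).
    umol_enzyme_per_mg = 1000.0 / molar_mass_da
    # Product formed per enzyme molecule per minute.
    per_minute = units_per_mg / umol_enzyme_per_mg
    return per_minute / 60.0


# Values from the text: 2070 units/mg and a theoretical mass of 45,051 Da.
KCAT = turnover_per_second(2070.0, 45051.0)
print(f"kcat is roughly {KCAT:.0f} per second")
```

Rounded to three significant figures, this is consistent with the 1550 s−1 stated above.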

Table 1. X-ray data collection and phasing statistics

| Data set(a) | Re-1 | Re-2 | Co-1 | SO4 (native) | P3Co | Mg.ATP |
|---|---|---|---|---|---|---|
| A. Crystal properties | | | | | | |
| Cell dimension a (Å) | 79.13 | 78.44 | 79.92 | 79.66 | 79.99 | 78.1 |
| Cell dimension b (Å) | 82.67 | 82.70 | 83.00 | 82.91 | 83.04 | 82.8 |
| Cell dimension c (Å) | 119.24 | 120.83 | 118.98 | 119.23 | 118.90 | 121.8 |
| B. Data collection | | | | | | |
| Beamline(b) | ID29 | ID14-1 | Stn-9.6 | ID14-1 | ID14-1 | ID14-2 |
| Wavelength (Å) | 1.176 | 0.934 | 1.488 | 0.934 | 0.934 | 0.933 |
| No. observations(c) | 117,829 (17,417) | 331,146 (46,168) | 60,026 (538) | 281,723 (14,522) | 365,698 (25,270) | 231,791 (33,047) |
| No. unique reflections | 20,255 (2915) | 27,727 (3948) | 18,380 (414) | 64,994 (5374) | 99,085 (11,292) | 60,043 (8715) |
| Completeness (%) | 100 (99.9) | 99.9 (99.5) | 71.5 (11.5) | 88.2 (51.4) | 94.6 (74.9) | 99.4 (99.9) |
| Anomalous completeness (%) | 99.9 (99.8) | 99.5 (98.2) | | | | |
| Redundancy | 5.8 (6.0) | 11.9 (11.7) | 3.3 (1.3) | 4.3 (2.7) | 3.7 (2.2) | 4.0 (3.9) |
| I/σ(I) | 8.9 (4.8) | 6.3 (3.1) | 5.4 (1.7) | 8.5 (3.7) | 5.8 (2.8) | 7.7 (1.6) |
| Rsym(d) (%) | 6.7 (13.8) | 9.5 (23.5) | 9.3 (32.8) | 5.3 (19.4) | 5.4 (26.6) | 6.5 (32.7) |
| Resolution range (Å) | 51.7–2.8 (2.94–2.80) | 35.8–2.5 (2.63–2.50) | 68.1–2.6 (2.72–2.60) | 41.4–1.8 (1.89–1.80) | 68.0–1.6 (1.69–1.60) | 30.5–1.9 (2.00–1.90) |
| Derivative () | 50 mM ReO4 | 50 mM ReO4 | 20 mM CoCl2 | | | |
| Soaking time (h) | 25 | 9 | 1.5 | | | |
| No. sites found | 5 | 5 | 3 | | | |

Phasing power: 0.87/1.05, 0.20/0.25. Combined mean FOM(e): 0.45.

a Re-1, Re-2 and Co-1 are derivative datasets used in phasing. The native dataset is a sulphate-bound complex.
b All beamlines are at the European Synchrotron Radiation Facility (ESRF), except Stn 9.6 (Synchrotron Radiation Source, SRS).
c Values in parentheses represent data in the highest resolution shell.
d $R_{\mathrm{sym}} = \sum_j |\langle I \rangle - I_j| / \sum \langle I \rangle$, where $I_j$ is the intensity of the jth reflection and $\langle I \rangle$ is the average intensity.
e FOM, figure of merit.

Table 2. Refinement statistics

| Dataset | SO4 | P3Co | Mg.ATP |
|---|---|---|---|
| Resolution range (Å) | 67.4–1.8 | 67.4–1.6 | 69–1.9 |
| No. non-hydrogen atoms | 7067 | 7134 | 6948 |
| No. water molecules | 788 | 858 | 659 |
| R-factor(a) (%) | 15.7 | 17.1 | 18.4 |
| Rfree(b) (%) | 20.1 | 20.1 | 23.3 |
| Average B-factor(c) (Å2) | 23.4 | 24.4 | 26.0 |
| RMSD from ideal: bond angles (deg.) | 1.47 | 1.43 | 1.15 |
| RMSD from ideal: bond lengths (Å) | 0.014 | 0.008 | 0.009 |
| Ramachandran plot(d): core | 664 | 663 | 659 |
| Ramachandran plot(d): allowed | 49 | 49 | 53 |
| Ramachandran plot(d): generously allowed | 1 | 2 | 2 |
| Ramachandran plot(d): disallowed | 0 | 1 | 1 |
| PDB entry | 2qb6 | 2qb7 | 2qb8 |

a $R\text{-factor} = \sum_{hkl} \big| |F_o| - |F_c| \big| / \sum_{hkl} |F_o|$.
b R-factor based on a random 5% of the data withheld from refinement.
c Average B after calculation of total B-factors using TLSANL.52
d The number of non-glycine amino acids in the core, allowed, generously allowed and disallowed regions of the Ramachandran Phi-Psi plot, as defined and calculated by the program Procheck.56

to the PPXs from A. aeolicus and E. coli. S.c.-PPX1 folds into two domains (Figures 2 and 3). The N-terminal domain, residues 1–256, forms a three-layered α/β/α structure with helices F and G, a 5-stranded parallel β-sheet (β-strands 1–5), and helices A–E and H–L, respectively. The C-terminal domain (amino acid residues 257–397) also folds as a three-layered α/β/α structure with β-strands 6–11 forming a mixed β-sheet, sandwiched between helices M–O on one side and P–Q on the other. Overall, the topology strongly resembles that of the family-2 PPases (see Figures 2 and 3).35 Superpositioning of the N and C-terminal domains of S.c.-PPX1:A and S.g.-PPase:A (PDB 1k20:A) results in an RMSD of 1.22 Å (133 Cα pairs) and 1.21 Å (88 Cα pairs), respectively. However, there is a slight difference in the quaternary structures of the two proteins: with the two N-terminal domains structurally aligned, the C-terminal domain of S.c.-PPX1 is twisted by approximately 12° with respect to the S.g.-PPase C-terminal domain. A comparison of the six S.c.-PPX1 structures (two each for SO4, P3Co and Mg.ATP) shows that there is a rigid-body movement of the C-terminal domain, but it is less significant than that for the PPases.35,36,41 With respect to the “most closed structure” (Mg.ATP:A), each of the C-terminal domains has a rotation about a common axis (indicated in Figure 4): SO4:A 1.7°; P3Co:A 2.7°; P3Co:B 3.4°; SO4:B 3.8°; Mg.ATP:B 7.1°.

Packing analysis

The maximum amount of solvent-accessible surface area buried by contacting molecules is 1200 Å2 at the interface between the non-crystallographic symmetry related chains A and B. Chain A forms crystal contacts with an additional eight neighbouring molecules, burying a further 3400 Å2 of solvent-accessible surface area. Chain B contacts an additional six neighbouring molecules, burying 2900 Å2 of solvent-accessible surface area.64

There is no electron density for loop 6–7 (amino acid residues 273–278) in chain B, suggesting that it is disordered. In chain B, loop 6–7 is exposed to solvent, whereas in chain A, it is involved in two stabilizing contacts: Lys275 NZ is H-bonded (2.9 Å) to Asn57 O of a neighbouring chain B; and Lys278 is 4.0 Å away from Glu242 and 3.8 Å away from Lys245 from a second neighbouring chain B.

The substrate-binding channel

A distinctive feature of the S.c.-PPX1 structure is the 18 Å-long channel formed at the interface of the N and C-terminal domains (burying 900 Å2 of solvent-accessible surface area in chain A and 800 Å2 in chain B). This channel is the location of metal ion and ligand binding (Figures 5 and 6; and see below), and as with the domain:domain interface regions of A. aeolicus and E. coli PPXs, represents the most likely site of polyphosphate binding and phosphate hydrolysis. For convenience, the substrate-binding channel can be described in two sections (Figure 7(b)). Section 1 is buried deeper within the core of the protein. There are two smaller gaps in the channel wall, labelled [1] and [2] in Figure 7. Section 2 of the substrate-binding channel can be defined as the region starting at Arg381 and moving outward to an extensive protein–solvent interface with significantly larger channel openings, labelled [3] and [4] in Figure 7.

Section 1

The β-strand 6 and helix R (motifs V and VI, respectively) in the C-terminal domain pack against motifs I–IV of the N-terminal domain to form a pocket, highly conserved in sequence (Figure 1). On the basis of the homology with the PPases, this pocket forms the active site containing the invariant sequence Asp-His-His (amino acid residues 147–149, motif III) present in all members of the DHH family. Electron density maps for the P3Co structure (Figure 5) clearly define two metal positions, refined as cobalt ions, and three phosphate ions in the substrate channel. PT in the terminal phosphate site (see Discussion) is bound with contacts almost exclusively from the N-terminal domain (Table 3). The two other phosphate moieties, PE1 and PE3, are bound exclusively to the C-terminal domain. Both metals and phosphate ions PT and PE1 are located in the active site. Cobalt-1 is co-ordinated in an approximate square-pyramidal co-ordination to Asp41 OD2, Asp127 OD1, His148 NE2, a water molecule and phosphate PT-O2 (Figure 6; Table 3). Cobalt-2 is co-ordinated in a near-perfect octahedral co-ordination with contacts to phosphates PT-O3, PE1-O2 and four water molecules. There is an extensive network of potential H-bonds and electrostatic interactions between the phosphates and the amino-acid side-chains or water molecules. Briefly, phosphate PT H-bonds to Asn35 ND2, Asp39 OD2, Asp127, His149 NE2 and Lys268 NZ. Phosphate PE1 H-bonds to Lys268, Arg381, Lys382, acetate ligand, and water.

In the SO4 structure, there is strong electron density for a single metal in each active site, which has either co-purified with the protein or was present in trace amounts in the crystallisation buffer. It occupies the same position as cobalt-1 in P3Co, and has very similar co-ordination geometry, except that PT-O2 is replaced by water. The sulphate ion, S1, originating from the crystallisation buffer, occupies the same location in each active site as PE1 in the P3Co structure (Figures 4 and 8).

In the Mg.ATP structure, each active site has a single strong electron density peak (refined as Mg2+) corresponding to cobalt-1. In addition, in chain A there is strong electron density to position the triphosphate portion of a bound ATP molecule at the boundary of the two channel sections, so that the γ-phosphate group of ATP, salt-bridged to Arg381, is in a position equivalent to that of anions PE1 and S1 described for the other two structures (Figures 4 and 8).

Figure 2. Ribbon diagram of S.c.-PPX1 (P3Co) subunit A with α-helices in blue and β-strands in red. The three bound phosphate moieties PT, PE1 and PE3 are shown in ball-and-stick, and the two bound Co2+ as pink spheres, labelled 1 and 2. Secondary structure elements are labelled according to Figure 1.

Figure 3. A TOPS diagram62 showing the tertiary structure of S.c.-PPX1. Secondary structure elements are labelled as in Figure 1. Loop L-M forms the hinge linking the N and C-terminal domains.

Section 2

Section 2 of the channel, formed by loops J-K and 5-H of the N-terminal domain and β-strands 7, 8 and

Figure 4. A superpositioning of four of the six S.c.-PPX1 structures obtained, focussing on the substrate-binding channel. Positively-charged side-chains that appear mobile or have alternative conformations are shown. The long-dash broken line (light blue) indicates the common hinge axis. The superpositioning shows that there are distinct anion-binding sites (see also Figure 8). The chains are coloured: dark blue, SO4:A; red, P3Co:A; green, Mg.ATP:A; yellow, Mg.ATP:B.
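Domain rotations such as the approximately 12° twist relative to S.g.-PPase, or the 1.7° to 7.1° hinge rotations between the six chains, can be read off the rotation matrix of a least-squares superposition. The following is a minimal sketch of that last step only, assuming a proper 3×3 rotation matrix is already available; the function names are hypothetical and this is not necessarily the procedure the authors used.

```python
import math


def rotation_angle_and_axis(R):
    """Return (angle in degrees, unit axis or None) for a 3x3 rotation matrix R."""
    trace = R[0][0] + R[1][1] + R[2][2]
    # trace(R) = 1 + 2*cos(theta); clamp to guard against rounding error.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    if abs(sin_theta) < 1e-8:
        # Axis is undefined (0 degrees) or needs special handling (180 degrees).
        return math.degrees(theta), None
    # The antisymmetric part of R encodes the rotation axis.
    axis = (
        (R[2][1] - R[1][2]) / (2.0 * sin_theta),
        (R[0][2] - R[2][0]) / (2.0 * sin_theta),
        (R[1][0] - R[0][1]) / (2.0 * sin_theta),
    )
    return math.degrees(theta), axis


def rotation_about_z(degrees):
    """Build a test rotation matrix about the z axis."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


# Synthetic check: a 12 degree rotation about z should be recovered.
ANGLE, AXIS = rotation_angle_and_axis(rotation_about_z(12.0))
print(round(ANGLE, 3), AXIS)
```

Comparing the axes recovered for different chain pairs is one way to test whether the rotations share a common hinge axis, as Figure 4 indicates.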

Figure 5. A wall-eyed stereo representation of the active site of S.c.-PPX1 (P3Co:A) with bound phosphates and Co2+. The final 2Fo–Fc electron density map, contoured at 1.5σ, is in blue mesh.
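The co-ordination described for the two cobalt sites can be cross-checked against the metal contact distances listed in Table 3. The sketch below tallies the co-ordination number, the number of water ligands and the mean metal–ligand distance for each site. The distances are transcribed from Table 3 (P3Co:A); the data layout and function name are illustrative assumptions, and a count alone does not establish geometry, which also depends on the ligand angles.

```python
# Metal-ligand contact distances in angstroms, transcribed from Table 3 (P3Co:A).
CONTACTS = {
    "Co1": {"Asp41 OD2": 2.1, "Asp127 OD1": 2.1, "His148 NE2": 2.2,
            "PT-O2": 2.0, "Water 822": 2.2},
    "Co2": {"PT-O3": 2.1, "PE1-O2": 2.1, "Water 105": 2.2,
            "Water 155": 2.2, "Water 729": 2.2, "Water 824": 2.2},
}


def summarise(contacts):
    """Tally ligand count, water ligands and mean distance for each metal site."""
    summary = {}
    for metal, ligands in contacts.items():
        distances = list(ligands.values())
        summary[metal] = {
            "coordination_number": len(distances),
            "water_ligands": sum(1 for name in ligands if name.startswith("Water")),
            "mean_distance": sum(distances) / len(distances),
        }
    return summary


SUMMARY = summarise(CONTACTS)
for metal, stats in SUMMARY.items():
    print(metal, stats)
```

Five ligands for cobalt-1 and six for cobalt-2 (four of them water) agree with the square-pyramidal and octahedral descriptions in the text.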

9 and loop 11-R of the C-terminal domain, links the active site to large openings in the channel wall (labelled [3] and [4] in Figure 7). Whereas the structure of section 1 of the channel is highly related to the active site in PPases, section 2 of the channel is unique to PPXs. In all six S.c.-PPX1 structures, this

Figure 6. (a) A wall-eyed stereo representation of the substrate-binding channel (P3Co:A) with bound phosphate moieties (labelled PT, PE1 and PE3), Co2+ (1 and 2), an acetate molecule (ACT) and ordered water molecules. Arg334 is shown with two alternative conformations. For clarity, Asn35 and Lys268 are labelled * and #, respectively. The orientation of S.c.-PPX1 is approximately the same as in Figure 2. (b) Wall-eyed stereo of the P3Co:A active site showing the intermolecular contacts between side-chains and ligands.

Figure 7. (a) and (b) Two orthogonal views of the Grasp representation of the S.c.-PPX1 surface, coloured according to electrostatic charge.63 In (a) the S.c.-PPX1 orientation is the same as that of Figure 2. The four channel entrance sites are labelled [1]–[4]. Entrance [1] at the back of the structure in (b) aligns with entrance [4]. (c) A cross-section of the substrate-binding channel, with a portion of the enzyme (indicated with a black broken-lined box in (a), and indicated in (b) with a broken black line) opened up like a book to reveal the “back” and “front” inside walls (left-hand and right-hand side of the Figure, respectively). The phosphorus atoms in the three phosphate moieties define the cross-section plane, and are in the plane of the paper (approximately the same orientation as that in Figure 2). Amino acid residues are represented as van der Waals surfaces and coloured according to type: positive, blue (R, K, H); negative, red (D, E); polar, yellow (N, Q, S, T, Y); and non-polar, grey (A, C, F, G, I, L, M, P, V, W). The three phosphate ions (transparent van der Waals surface and ball-and-stick) and two Co2+ are represented in both halves of the opened Figure. The solvent-accessible surface is represented as small blue spheres. Channel entrance sites are labelled [1]–[4].

section is largely filled with ordered water molecules (Figure 6(a)). In P3Co:A and B, phosphate PE3 binds in this section, and in Mg.ATP:A the β and α-phosphate groups of the ATP molecule are ordered, occupying the PE2 and PE3 sites (see Discussion and Figure 8). There is no electron density for the ribose and adenosine rings of ATP, suggesting disorder.

Around the large channel entrance, electron density maps suggest that a number of the positively-charged amino acids (Arg207, Arg208, Lys209, Lys248, Arg334, Lys382) are mobile or disordered. The electron density for these residues is often weak after the gamma carbon atom or, in some cases,
indicates alternative conformations. Furthermore,

the positions of some of these amino acid side-chains differ when the chains are structurally aligned (Figure 4).

Table 3. Contact distances to cobalt and phosphate groups in P3Co:A

| Phosphate or metal | Contact (distance, Å) |
|---|---|
| PT-O1 | Asn35 ND2 (3.0); Asp39 OD2 (2.7); Asp127 OD2 (3.3); Water 155 (3.2) |
| PT-O2 | Asp127 OD1 (3.2); Water 822 (2.8); Water 686 (2.9) |
| PT-O3 | PE1-O2 (3.2); Water 685 (2.7); Water 824 (2.9); Water 105 (3.0); Water 155 (3.2) |
| PT-O4 | His149 NE2 (2.6); Lys268 NZ (2.7); PE1-O4 (2.6) |
| PE1-O1 | Lys268 NZ (3.2); Arg381 NE (2.9); Acetate OXT (2.5) |
| PE1-O2 | Arg381 NH2 (2.9) |
| PE1-O3 | Lys382 N (3.2); PE3-O1 (2.5) |
| PE1-O4 | PT-O4 (2.6); Water 393 (2.9) |
| PE3-O1 | PE1-O1 (2.5); Water 824 (2.8) |
| PE3-O2 | Arg334 NH2 (3.1); Arg381 N (2.9); Water 20 (2.7) |
| PE3-O3 | Arg334 NH2 (3.4); Water 31 (3.2); Water 253 (2.7) |
| PE3-O4 | Ser286 OG (2.6); Water 729 (2.6) |
| Co1 | D41 OD2 (2.1); D127 OD1 (2.1); H148 NE2 (2.2); PT-O2 (2.0); Water 822 (2.2) |
| Co2 | PT-O3 (2.1); PE1-O2 (2.1); Water 105 (2.2); Water 155 (2.2); Water 729 (2.2); Water 824 (2.2) |

Figure 8. A schematic of the polyP substrate pathway, and ligand-binding states of S.c.-PPX1 structures. (a) The three phosphate moieties PT, PE1 and PE3 bound in the P3Co structure. (b) The sulphate (SO4) and (c) the triphosphate of ATP (Mg.ATP). (d) Outline of the mechanism proposed in scenario 1. (e) Pyrophosphate cannot reach site T. (f) Scenario 2, in which sites T and E1 represent phosphate-binding sites following hydrolysis and electrostatic repulsion (“product complex”). Potential motion of the C-terminal domain to bring sites T and E1 closer is indicated by the zig-zag line. The potential rotation of Arg381 is discussed.

Discussion

S.c.-PPX1 is a member of the so-called DHH family of modular phosphatases with an invariant Asp-His-His sequence and four conserved sequence motifs common to all members (labelled I–IV in Figure 1).32 Although there are currently 812 DHH family members listed in the Profam 20.0 database (May 2006), to date, only four members, representing two types of enzyme activity, have been structurally characterised: the PPases from Streptococcus gordonii (S.g.) (PDB entries 1k20 and 1wpp), Streptococcus mutans (S.m.) (1i74) and Bacillus subtilis (B.s.) (1wpn, 1wpm, 1k23); and the manganese-dependent exonuclease RecJ from Thermus thermophilus (1ir6). Whereas RecJ belongs to DHH-subfamily-1, S.c.-PPX1 and PPases belong to subfamily-2, containing a DHH-A2 domain (see motifs V and VI in Figure 1). A third type of enzyme in the DHH subfamily-2 is known as prune, a phosphodiesterase found in the animal kingdom, and thought to be an important indicator of cancer metastasis in humans.37,42,43

All three types of phosphatase in DHH-subfamily 2 are related in sequence, with typical pairwise sequence identities of approximately 25%. Blast searches with any one DHH subfamily-2 member will find the other two phosphatase types with E scores less than 10−10. However, prune sequences can be distinguished easily, since motifs II and III contain DHH and DHR, respectively, instead of DHN and DHH, for both PPXs and PPases (Figure 1). PPXs can be distinguished easily from PPases by sequence length, typically 400 amino acid residues versus 310. Also, one of the invariant His residues responsible for binding Mn2+ in PPase (His9 in S.g.-PPase from motif I: GHxnPD[S|T]D) is not conserved in PPXs (Figure 1), resulting in a profound difference in metal-binding properties (see below).

The substrate-binding channel

The active site

The active site is the most conserved part of the substrate-binding channel, and indeed the whole protein structure, with nine out of the ten amino acids contributing to the channel wall in section 1 being invariant across 21 DHH family PPX sequences (Figure 1). As anticipated from sequence alignments, the S.c.-PPX1 active site is structurally very similar to PPase active sites, strongly suggesting a similar catalytic mechanism for phosphate hydrolysis. However, there is a striking difference between the two enzymes. PPases contain a binuclear manganese cluster (3.5 Å Mn–Mn distance) with a bridging ligand, thought to be an activated water molecule or hydroxide ion.35,36,41 The PPase binuclear cluster (metals M1 and M2) is proposed to be responsible for binding one of the phosphate moieties of the substrate PPi, polarising the P-O bonds, making the phosphate more susceptible to nucleophilic attack, and lowering the pKa of the bridging water, promoting hydroxide generation. In the most recent structure of a substrate-bound complex of B.s.-PPase, complexed with PNP (a PPi analogue), a third metal (labelled M4) is present, forming a near-perfect equilateral triangle with M1 and M2, with each metal co-ordinating an oxygen of the N-terminal bound phosphate group.44 The hydroxide ion is approximately equidistant from metals M1, M2 and M4, and well-positioned for in-line attack of the phosphate, triggering hydrolysis (see Figure 7 of Ahn et al.35 and Figure 5(b) of Fabrichniy et al.44).

The binuclear manganese cluster in PPase is chelated to a pair of histidine residues (His9 and His99 in S.g.-PPase), proposed to be responsible for binding manganese rather than magnesium.36 But in the S.c.-PPX1 active site, only one of the metal positions is conserved (equivalent to M2 in B.s., S.g. and S.m.-PPases), with the invariant Asn35 (motif I) of PPX1 in a position analogous to that of S.g.-PPase His9. Despite numerous crystal soaking experiments to introduce a second metal into the active site, to form a binuclear metal cluster equivalent to that in the PPases, electron density maps were always consistent with only one bound metal ion (data not shown).

In the P3Co structure there is a strong electron density peak located between phosphate groups PT and PE1 for a second cobalt ion, 6.2 Å from cobalt-1 (Figures 5 and 6), but there is no direct contact between cobalt-2 and the protein. Again, attempts to place a second metal to form a binuclear metal cluster, this time in the presence of Pi, PPi, polyP3 or a mixture of all three, failed. The near-perfect octahedral geometry and choice of ligands around the cobalt-2 site is typical of a Mg2+ substrate-bound counter ion.

The RecJ active site also contains one metal ion (analogous to cobalt-1),38 suggesting perhaps that cobalt-1 alone is responsible for generating the nucleophilic water/hydroxide ion in S.c.-PPX1. Assuming an analogous catalytic mechanism, based on that proposed for PPases, an activated water molecule chelated to cobalt-1 would attack the terminal phosphate group of the polyP chain (PT, Figure 8(d)). His149 (forming a His/Asp pair with Asp147 of the DHH sequence in motif III) would serve as a general acid to protonate the oxygen bridging the terminal and penultimate phosphate groups, facilitating release of PT as the product Pi.

The prune enzymes show an intriguing rearrangement of sequence motifs. In multi-sequence alignments of all members in DHH subfamily-2, the Asp-His-His in prune sequences aligns with motifs II from PPX and PPases (loops 3-G and 4-5, respectively), not motifs III (loops 4-5 and 5-6, respectively). Since this would have the effect of moving the Asp-His-His tripeptide one β-strand across the central parallel sheet in the 3-D structure, it is likely the Asp-His-His would have a similar function in prune, despite the difference in sequence location.

In common with S.c.-PPX1, the PPXs from A. aeolicus and E. coli are dependent on metals for activity. Based on the calcium-bound A. aeolicus PPX structures, and from comparisons with other members of the sugar kinase/actin/hsp-70 superfamily, a mechanism has been proposed for E. coli PPX involving a metal-activated water molecule attacking the terminal phosphate group.29,30 However, there is little structural overlap of functional groups in E. coli and S.c.-PPX1 active sites, with many more of the phosphate contacts to E. coli PPX being water-mediated or formed with the polypeptide main chain.

A conduit for the extended polyphosphate chain

In contrast to section 1 of the S.c.-PPX1 inter-domain channel, section 2 is poorly conserved in the DHH family PPX multi-sequence alignment (Figure 1), with only two of the 14 amino acids contributing to the channel wall being invariant and a further two strongly conserved across all 21 sequences. In the S.c.-PPX1 structures presented here, section 2 of the channel forms a large polar cavity linking the active site to bulk solvent (Figure 7). In both P3Co:A and B phosphate PE3 is bound in section 2, forming contacts to Arg381 N, Ser286 OG and Arg334, which has two alternative conformations. Section 2 of the channel contains the β and α-phosphate groups of the soaked ATP molecule in Mg.ATP:A, and is unoccupied in Mg.ATP:B.

Section 2 of the channel acts as a conduit for the polyphosphate chain to access the active site with the disordered and/or mobile positively-charged side-chains around the largest channel entrance

([3]/[4] in Figure 7) facilitating polyP movement. It is not known if long polyP chains have any form of superstructure; presumably polyP–metal ion complexes are not long, extended chains. Section 2 of the channel could act to unravel any superstructure. However, it must be stressed that section 2 of the channel is poorly conserved in sequence across the 21 DHH family PPX sequences aligned, perhaps explaining the reported species-dependent variation in preferred length of polyP chain for the PPXs examined.45

While it is clear that the polyP chain enters the channel through the large channel opening [3]/[4], what is not so clear from structural analysis is how the resulting Pi ions exit. Figure 7 shows a cross-section of the substrate-binding channel. The solvent-accessible surface is continuous at either end of the channel. One possible Pi exit is through the small opening labelled [1], between His149 and Lys268, although this potential exit is much closer to phosphate PE1 than to PT. In P3Co:A and B this potential exit route is occupied by an acetate ion originating from the crystallisation buffer (Figure 6(a)). Fabrichniy et al. have highlighted a possible role of the invariant Lys207 in open PPases (equivalent to the invariant Lys268 in PPX).41 As the PPase C-terminal domain moves to open the active site, Lys207 moves further away from the binuclear cluster, and has been proposed to facilitate the leaving group Pi. The environment of S.c.-PPX1:Lys268 can be seen in Figures 4, 6 and 7. For opening [1] to be a potential exit pathway for the product Pi, either the Lys268 side-chain would have to change rotamer, or the C-terminal domain would have to swing, as it does in the PPases (see below). Perhaps a better exit channel for Pi (labelled [2] in Figure 7) is formed between the side-chains of Lys382 and Lys209, and is situated closer to PT. Lys382 (equivalent to Lys296 in B.s.-PPase and Lys298 in S.g.-PPase) is in different conformations, and could facilitate phosphate transfer out of the protein.

Structural basis for substrate specificity and processivity?

The structural homology between the S.c.-PPX1 and PPase active sites supports an analogous enzyme mechanism for phosphate hydrolysis. But PPases have a strong preference for PPi, whereas PPXs, processively cleaving the terminal phosphate from polyP, have reportedly negligible PPi hydrolysis activity, and have a species-dependent preference for longer or shorter polyphosphate chains.18,45 An induced-fit mechanism has been proposed for PPases, based on the crystal structures of open and closed forms,35,41 in which the mobile C-terminal domain packs against the catalytic domain to form a closed active site around the PPi substrate. In contrast, S.c.-PPX1 has a positively-charged, solvent-filled channel linking the active site at one end to the large channel openings at the other, allowing long polyP chains to enter the enzyme. The question remains as to why S.c.-PPX1 is so bad at hydrolysing PPi, and how does the polyP substrate advance processively through the channel? As a first step to understanding the structural basis of substrate selectivity and processivity, we compare the structures of the six S.c.-PPX1 chains and their bound ligands with the available PPase structures.

In Figure 8(a)–(c) the polyP pathway through S.c.-PPX1 is represented schematically, showing a series of phosphate-binding sites, occupied by Pi, sulphate or ATP. Sites T and E1 are in the active site, with T being the site of phosphate hydrolysis. In the six S.c.-PPX1 structures, site E1 is occupied most often, either by phosphate (PE1 in P3Co:A and B, and ATP-γ-phosphate in Mg.ATP:A), or by sulphate (S1 in SO4:A and B), suggesting that site E1 is the highest-affinity site. Anions in site E1 chelate to cobalt-2 and interactions include a bidentate salt-bridge to Arg381 (Table 3, Figure 6). Sites T and E3 are also occupied by phosphate groups (PT and PE3 in P3Co:A and B, and α-phosphate in Mg.ATP:A), suggesting that these sites too have a higher affinity for phosphate. Site E2 is occupied only in Mg.ATP:A. However, the electron density for the β-phosphate is weaker and less defined, compared to either the α (site E3) or the γ-phosphate group (site E1), suggesting some disorder.

Depending on the relationship between PT and PE1, we can envisage two scenarios for polyP processivity. In the first scenario, PE1 represents the third phosphate moiety in the polyP chain (Figure 8(d)). The PT–PE1 distance of 4.3 Å is within the range of i–i+2 phosphate separation in polyP (4.3–4.7 Å cf. 2.9 Å for i–i+1). Occupation of the second phosphate site, PT-1, is not observed in our structures, perhaps indicating a weaker binding site, but simple modelling shows that there are two possible locations for a second phosphate. One possibility is the location currently occupied by cobalt-2, so that oxygen atoms PT-O3 and PE1-O2 (3.6 Å apart in P3Co:A and B) would be the bridging oxygen atoms of PT-1. A second potential, and more favoured, position would be in the unoccupied solvent region within section 1 of the channel, close to His149, so that it bridged phosphate oxygen atoms PT-O4 and PE1-O4 (see Figure 6(b)), which are 2.6 Å apart in P3Co:A and B, close to the ideal O-P-O distance of 2.47 Å in orthophosphate. Supporting this model is the fact that oxygen PT-O4 is H-bonding to His149 NE2 and Lys268 NZ. The same situation occurs in the PPase active site: His99 NE2 and Lys209 NZ (S.g. numbers) H-bonding to the bridging oxygen atom of PPi, with His99 proposed to be the active site general acid facilitating hydrolysis.35,44 Following hydrolysis of the terminal phosphate, PT, from the polyP chain, and its departure from the active site, the polyP chain advances one position so that the new terminal phosphate occupies binding site T, and the cycle continues. The long polyP chain is hydrolysed processively until only PPi is remaining (Figure 8(e)). At this point, the PPi can continue to advance along the channel only with site E1, the highest affinity site, becoming unfavourably vacant, rendering PPX a poor pyrophosphatase.
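
The distance argument used for the first scenario can be written out as a small comparison. The sketch below is illustrative only: the reference separations (2.9 Å for i–i+1, 4.3–4.7 Å for i–i+2) are the values quoted above, whereas the function name and the 0.2 Å tolerance are assumptions introduced here, not values from the structural analysis. A match to the i–i+2 range shows consistency with scenario 1; it does not by itself exclude other interpretations of the same distance.

```python
# Phosphate-phosphate separations quoted in the text (in Å).
ADJACENT_SEPARATION = 2.9            # i to i+1
NEXT_BUT_ONE_RANGE = (4.3, 4.7)      # i to i+2


def classify_separation(distance, tolerance=0.2):
    """Label a P-P distance by the polyP separation it matches.

    `tolerance` (Å) is an arbitrary illustrative margin, not a value
    taken from the paper.
    """
    low, high = NEXT_BUT_ONE_RANGE
    if low - tolerance <= distance <= high + tolerance:
        return "i-i+2"
    if abs(distance - ADJACENT_SEPARATION) <= tolerance:
        return "i-i+1"
    return "neither"


if __name__ == "__main__":
    # 4.3 Å is the observed PT-PE1 distance in the P3Co structure.
    print(classify_separation(4.3))
```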

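The processive cycle of the first scenario can also be caricatured as a toy occupancy model. This is a minimal sketch under stated assumptions, not a description of measured kinetics: it assumes the binding sites lie in the order T, T-1, E1, E2, E3 from the active site outwards, that the terminal phosphate is cleaved only while site E1 is occupied, and that one Pi leaves per cycle. The function names are hypothetical.

```python
# Toy occupancy model of scenario 1 (illustrative assumptions only).
# Assumed site order from the active site outwards: T, T-1, E1, E2, E3.
SITES = ["T", "T-1", "E1", "E2", "E3"]


def occupied_sites(chain_length):
    """Sites filled by a chain whose terminal phosphate sits in site T."""
    return SITES[:min(chain_length, len(SITES))]


def processive_hydrolysis(chain_length):
    """Cleave terminal phosphates while site E1 stays occupied.

    Returns (pi_released, remaining_chain_length).
    """
    if chain_length < 1:
        raise ValueError("chain_length must be at least 1")
    pi_released = 0
    while "E1" in occupied_sites(chain_length):
        chain_length -= 1      # terminal phosphate PT leaves as Pi
        pi_released += 1       # chain advances one position into site T
    return pi_released, chain_length


if __name__ == "__main__":
    for n in (2, 3, 10):
        released, remaining = processive_hydrolysis(n)
        print(f"polyP{n}: {released} Pi released, chain of {remaining} remains")
```

In this simplified picture a chain of any length is shortened until two phosphates remain, at which point site E1 is empty and cleavage stops, mirroring the argument that PPi is a poor substrate.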
An alternative scenario to that proposed above is that PT and PE1 in P3Co represent the last and penultimate phosphates in the polyP chain (i–i+1) immediately following cleavage and electrostatic repulsion, a so-called hydrolysis product complex, rather than an i–i+2 distance (Figure 8(f)). Such a structure of a hydrolysis product complex has been observed in the sulphate-bound, open structure of S.m.-PPase (PDB 1i74),36,41 where the SO4–SO4 distance is 4.3 Å. In the family-2 PPases, the formation of the hydrolysis product complex and subsequent product release is correlated with a significant motion of the C-terminal domain, enlarging the active site volume. In the S.m.-PPase hydrolysis product complex, the C-terminal domain is reported to have opened by 8° and twisted by 13°, moving Arg295 (equivalent to Arg381 in S.c.-PPX1) further away from the binuclear Mn cluster compared to the closed PPase active site.41 In the six S.c.-PPX1 structures reported here, the movement of the C-terminal domain is significant only when the channel is unoccupied (Mg.ATP:B). For all others, the observed motion is between 0° (Mg.ATP:A) and 3.8° (SO4:B). If PT and PE1 represent a hydrolysis product complex, rather than an i–i+2 polyphosphate distance, Arg381 would have to be able to move closer to cobalt-1 to allow sites PT and PE1 to approach a distance comparable to an i–i+1 interaction (2.9 Å), requiring either a rigid-body motion of the C-terminal domain (cf. the C-terminal domain motions of family-2 PPases35,36,41 and A. aeolicus PPX29), or a conformational change in the Arg381 side-chain, to bring its guanidinium group closer to the PT site (Figure 8(f)). Modelling such a rigid-body rotation around the hinge axis, shown in Figure 4, shows that the C-terminal domain could potentially move closer, with only steric clashes between Arg208 and Arg327. However, arginine side-chain rotation to push PE1 closer to PT is less likely. The guanidinium group of Arg381 does not have the freedom to move. In addition to the bidentate salt-bridge to phosphate PE1 via NE and NH2, there are two H-bonds to backbone carbonyl groups (NH1–Leu284 O and NH1–Asp266 O) and a salt-bridge to an aspartate side-chain (NH2–Asp266 OD1). Furthermore, a structural superpositioning of the six DHH family PPX C-terminal domains, together with 14 available family-2 PPase C-terminal domains, shows that Arg381, or its PPase equivalent Arg297 (S.g.-PPase numbers), aligns very well and has the same rotamer in 16 of the 20 structures aligned, regardless of the position of the C-terminal domain, or the contents of the active site. The four PPase structures in which the arginine adopts a slightly different conformation are the four copies of B.s.-PPase in 1k23, where the C-terminal domains are opened by between 70° and 90°, and the SRKK motifs are phosphate-free and exposed to bulk solvent.

The fact that p4A is a substrate favours scenario 2, since it is easy to envisage p4A binding in a manner similar to that of the ATP–inhibitor complex (Figure 8(c)), but with site T occupied by the terminal δ-phosphate before catalysis. Were scenario 1 correct, the ribose group of p4A would have to move into site E3 to allow the terminal phosphate group to reach site T, then following hydrolysis, the product ATP would move back to give the inhibitor complex. On the other hand, it is less clear why S.c.-PPX1 cannot hydrolyse PPi in scenario 2, unless PPi requires either more metal co-ordination, as strongly suggested in the recently-observed trimetal state of PNP-bound B.s.-PPase44 to properly orient and activate the substrate, or sites E2 and E3 must be occupied to advance the substrate along the channel and/or get proper closure of the polyphosphate channel by C-terminal domain motion.

In conclusion, the DHH family PPXs (found in eukaryotes) and PPases (found in Gram-positive bacteria) are evolutionarily related in structure and function. On the basis of structural and sequence alignments, and taking advantage of the available PPase structures, we propose a structural basis for the intriguing differences in PPX and PPase substrate profiles, and two alternative scenarios for the processive nature of polyP metabolism by S.c.-PPX1. Support for either scenario needs clarification via further biophysical experiments.

Methods and Materials

Cloning of S.c.-PPX1

The full-length coding region of PPX-1 was amplified from genomic DNA of Saccharomyces cerevisiae (strain AH22) with the oligonucleotide primers:

5′-GTCTAGACATATGTCGCCTTTGAGAAAGACGG-3′
5′-GAATTCGGATCCTCACTCTTCCAGGTTTGAGT-3′

These introduced XbaI and NdeI restriction endonuclease sites at the 5′ end, and BamHI and EcoRI sites at the 3′ end of the amplified fragment. A single product of approximately 1.2 kb was obtained in each of three parallel amplifications. Each product was cloned in-frame into E. coli expression vector pET11c (Novagen) using the unique NdeI and BamHI sites and transformed first into E. coli strain DH5α for maintenance and storage, and secondly into strain BL21(DE3) for protein expression. The direction and nucleotide sequence of the cloned PPX gene in the pET11c vector was confirmed by DNA sequencing.

Protein expression

Recombinant S.c.-PPX1 was expressed and purified from E. coli. LB (5 ml) containing 100 μg.ml−1 ampicillin was inoculated with a single colony of E. coli containing the S.c.-PPX1 construct and incubated at 37 °C, with shaking at 200 rpm, until the OD at 600 nm reached approximately 0.5. The culture was pelleted by centrifugation (4800 rpm, 10 min, 4 °C), resuspended in fresh LB medium containing 100 μg.ml−1 ampicillin and stored at 4 °C overnight. The resuspended cells were added to 200 ml of LB medium containing 100 μg.ml−1 ampicillin and incubated at 37 °C, with shaking at 200 rpm. When the OD at 600 nm reached approximately 0.6, expression of S.c.-PPX1 was induced with the addition of 1 mM IPTG; cells were left to grow for a further 5 h. The cells were pelleted by centrifugation (9000 rpm, 15 min, 4 °C), freeze-thawed, pooled and resuspended in 25 ml of 100 mM Tris–HCl (pH 8), before lysing the cells by sonication. Cell lysate was cleared by a two-step centrifugation (20,000 rpm, 30 min, 4 °C, then after discarding the initial pellet, 45,000 rpm, 2 h, 4 °C) then stored at −20 °C.

Protein purification

Cleared cell lysate was loaded onto a mono-Q FPLC column, pre-equilibrated with 20 mM L-histidine, pH 5.5. S.c.-PPX1 was eluted using a 0–2 M NaCl gradient at a flow-rate of 1 ml.min−1 and collected in 2 ml fractions, which were analysed by SDS-PAGE gel and tested for activity using the Fiske-Subbarow method.46 Fractions containing S.c.-PPX1 (typically, between 18 mM and 26 mM NaCl) were pooled and concentrated by centrifugation (3600 rpm, 4 °C) using a spin filter (10,000 Da cut-off, Vivascience) to approximately 5–15 mg.ml−1. The protein was washed with 1 M (NH4)2SO4 in the spin filter four times, then applied onto a phenyl Superose column, pre-equilibrated with 20 mM L-histidine, 1 M (NH4)2SO4 at pH 5.5. The protein was eluted at 0.2 ml.min−1 with a 1–0 M (NH4)2SO4 gradient in 20 mM L-histidine, pH 5.5. Fractions (1.6 ml) were analysed by SDS-PAGE and tested for activity. Those containing pure, active protein were pooled and the protein was concentrated and stored frozen in 10 mM Tris–HCl (pH 7.5) at −20 °C. Protein was analysed also by electrospray mass spectrometry (University of Birmingham) before crystallisation.

Crystallisation

Crystals of S.c.-PPX1 were grown by the hanging-drop, vapour-diffusion method in 6 μl drops. The crystals were obtained from a 1:1 mixture of purified protein (10 mg.ml−1, in 10 mM Tris–HCl, pH 7.5) and a reservoir solution containing 7% (w/v) PEG 4000, 0.1 M sodium acetate buffer (pH 4.6), and 150 mM (NH4)2SO4. Crystals were needle-shaped, took 24–36 h to reach a maximum length of 0.2 mm, and diffracted beyond 1.6 Å resolution on beamline ID14-1 at the European Synchrotron Radiation Facility (ESRF), Grenoble, France. Crystals were soaked in an artificial mother liquor, supplemented with 25% (w/v) ethylene glycol, before flash-cooling directly into liquid . P3Co and ATP derivative crystals were soaked in an artificial mother liquor containing the ligands for approximately 30 min before introducing cryoprotectant and flash-cooling.

Data collection and phase determination

A complete native dataset (SO4) was collected to 1.8 Å resolution at ESRF (Table 1). The unit cell dimensions are consistent with two molecules per asymmetric unit, giving an estimated solvent content of 45% (v/v). Diffraction images were indexed and integrated using MOSFLM47 and scaled using SCALA.48 Three derivative data sets, labelled Re-1, Re-2, and Co-1, were included in a MIRAS phase calculation using the program SOLVE49 (Table 1) to give an initial phase set with a mean Figure of Merit of 0.45. Although not of exceptional quality, the electron density map, calculated to 3 Å resolution, could be partially interpreted. Inclusion of accurate 2-fold rotation and translation non-crystallographic symmetry parameters (calculated from the initial partial model co-ordinates), solvent flattening and phase extension to 1.8 Å resolution in density modification routines by the program RESOLVE50 resulted in a vastly improved electron density map, which could be automatically interpreted using the autobuild feature in RESOLVE. The resulting model contained 611 out of the possible 794 amino acid residues in the asymmetric unit. The modelling of side-chains and the remaining residues was completed manually using TURBO-FRODO.51

Refinement

The PPX SO4 structure was refined against a maximum likelihood target using rigid-body and restrained refinement using TLS protocols,52 as implemented in the program REFMAC5.53 Rounds of refinement were interspersed with manual rebuilding of the model using 2mFo–DFc (2Fo–Fc) and mFo–DFc (Fo–Fc) σA-weighted electron density maps as a guide.54 Water molecules were added manually to the model using Fo–Fc peak co-ordinates. It was clear at the start of the refinement procedure that each active site contained a heavy element, which was verified by inspection of anomalous difference maps. This metal must have co-purified with the protein, and, in the absence of any other data, was assumed to be Mn, which gives maximal activity with S.c.-PPX1. Progress of the refinement of the model was judged throughout by following a reduction in Rcryst and Rfree (5% of the data).

S.c.-PPX1 crystals soaked with (a) 10 mM sodium tripolyphosphate and 5 mM CoCl2 (labelled P3Co in Table 1) and (b) 5 mM Mg.ATP (labelled Mg.ATP) were isomorphous to the SO4 crystals. Electron density maps (Fo(soak)–Fo(native), 2Fo–Fc and Fo–Fc) all showed additional features corresponding to added ligands. The fully-refined SO4 structure was used as the starting model for the P3Co and Mg.ATP models; the model was adjusted using manual rebuilding and refinement with REFMAC5, including several cycles of automatic removal and addition of water molecules using Arp/wArp v.5.0 (Arp_waters55). Following refinement, the SO4, P3Co and Mg.ATP models, including all water molecules, were checked manually against electron density maps and validated using the program Procheck.56 Final refinement statistics for all three models are given in Table 2.

Bioinformatic and structure analysis

Sequences homologous to S.c.-PPX1 were obtained using Blast2 searches against the Uniprot Database.40 T-Coffee was used to calculate multiple sequence alignments (MSA) and detect outlier sequences.57 MSA analysis and preparation of preliminary sequence figures made use of GeneDoc.58 Structure superpositioning and rms deviations between models were calculated using the program Swiss PDB Viewer,59 or LSQKAB.60 Surface area calculations were made using Areaimol.60 Structure figures were generated using Povscript61 or Swiss PDBViewer.59

Acknowledgements

We thank Albert Wadeson, John Osborne and Mohammad Ilias for help during the cloning of the

S.c.-PPX1 protein. We acknowledge the BBSRC (to E.U. and O.C.M.) and the School of Biosciences (to A.L.L.) for studentship funding, and the MRC for part funding of the BIP computational suite. We are grateful to the ESRF and SRS for access to X-ray facilities and help with data collection.

References

1. Kornberg, A., Rao, N. N. & Ault-Riche, D. (1999). Inorganic polyphosphate: a molecule of many functions. Annu. Rev. Biochem. 68, 89–125.
2. Kulaev, I. S., Vagabov, V. M., Kulakovskaya, T. V., Lichko, L. P., Andreeva, N. A. & Trilisenko, L. V. (2000). The development of A.N. Belozersky's ideas in polyphosphate biochemistry. Biochemistry (Mosc), 65, 271–278.
3. Kulaev, I. S. (1979). The Biochemistry of Inorganic Polyphosphates. Wiley, New York.
4. Kulaev, I. S. & Vagabov, V. M. (1983). Polyphosphate metabolism in micro-organisms. Advan. Microb. Physiol. 24, 83–171.
5. Rao, N. N. & Kornberg, A. (1996). Inorganic polyphosphate supports resistance and survival of stationary-phase Escherichia coli. J. Bacteriol. 178, 1394–1400.
6. Kim, K. S., Rao, N. N., Fraley, C. D. & Kornberg, A. (2002). Inorganic polyphosphate is essential for long-term survival and virulence factors in Shigella and Salmonella spp. Proc. Natl Acad. Sci. USA, 99, 7675–7680.
7. Chatterji, D. & Ojha, A. K. (2001). Revisiting the stringent response, ppGpp and starvation signaling. Curr. Opin. Microbiol. 4, 160–165.
8. Ruiz, F. A., Rodrigues, C. O. & Docampo, R. (2001). Rapid changes in polyphosphate content within acidocalcisomes in response to cell growth, differentiation, and environmental stress in Trypanosoma cruzi. J. Biol. Chem. 276, 26114–26121.
9. Kulaev, I. & Kulakovskaya, T. (2000). Polyphosphate and phosphate pump. Annu. Rev. Microbiol. 54, 709–734.
10. Rashid, M. H. & Kornberg, A. (2000). Inorganic polyphosphate is needed for swimming, swarming, and twitching motilities of Pseudomonas aeruginosa. Proc. Natl Acad. Sci. USA, 97, 4885–4890.
11. Rashid, M. H., Rao, N. N. & Kornberg, A. (2000). Inorganic polyphosphate is required for motility of bacterial pathogens. J. Bacteriol. 182, 225–227.
12. Ogawa, N., Tzeng, C. M., Fraley, C. D. & Kornberg, A. (2000). Inorganic polyphosphate in Vibrio cholerae: genetic, biochemical, and physiologic features. J. Bacteriol. 182, 6687–6693.
13. Smith, S. A., Mutch, N. J., Baskar, D., Rohloff, P., Docampo, R. & Morrissey, J. H. (2006). Polyphosphate modulates blood coagulation and fibrinolysis. Proc. Natl Acad. Sci. USA, 103, 903–908.
14. Lorenz, B. & Schroder, H. C. (2001). Mammalian intestinal alkaline phosphatase acts as highly active exopolyphosphatase. Biochim. Biophys. Acta, 1547, 254–261.
15. Lemercier, G., Espiau, B., Ruiz, F. A., Vieira, M., Luo, S. H., Baltz, T. et al. (2004). A pyrophosphatase regulating polyphosphate metabolism in acidocalcisomes is essential for Trypanosoma brucei virulence in mice. J. Biol. Chem. 279, 3420–3425.
16. Kulaev, I. S., Andreeva, N. A., Lichko, L. P. & Kulakovskaya, T. V. (1997). Comparison of exopolyphosphatases of different yeast cell compartments. Microbiol. Res. 152, 221–226.
17. Lichko, L. P., Andreeva, N. A., Kulakovskaya, T. V. & Kulaev, I. S. (2003). Exopolyphosphatases of the yeast Saccharomyces cerevisiae. FEMS Yeast Res. 3, 233–238.
18. Wurst, H. & Kornberg, A. (1994). A soluble exopolyphosphatase of Saccharomyces cerevisiae. Purification and characterization. J. Biol. Chem. 269, 10996–11001.
19. Wurst, H., Shiba, T. & Kornberg, A. (1995). The gene for a major exopolyphosphatase of Saccharomyces cerevisiae. J. Bacteriol. 177, 898–906.
20. Lichko, L., Kulakovskaya, T. & Kulaev, I. (2002). Effect of PPX1 inactivation on exopolyphosphatases of different cell compartments of the yeast Saccharomyces cerevisiae. Biochim. Biophys. Acta, 1599, 102–105.
21. Andreeva, N. A. & Okorokov, L. A. (1993). Purification and characterization of highly active and stable polyphosphatase from Saccharomyces cerevisiae cell envelope. Yeast, 9, 127–139.
22. Andreeva, N. A., Kulakovskaya, T. V. & Kulaev, I. S. (1998). Purification and properties of exopolyphosphatase isolated from Saccharomyces cerevisiae vacuoles. FEBS Letters, 429, 194–196.
23. Lichko, L., Kulakovskaya, T. & Kulaev, I. (1998). Membrane-bound and soluble polyphosphatases of mitochondria of Saccharomyces cerevisiae: identification and comparative characterization. Biochim. Biophys. Acta, 1372, 153–162.
24. Lichko, L. P., Kulakovskaya, T. V. & Kulaev, I. S. (2003). Nuclear exopolyphosphatase of Saccharomyces cerevisiae is not encoded by the PPX1 gene encoding the major yeast exopolyphosphatase. FEMS Yeast Res. 3, 113–117.
25. Andreeva, N. A., Kulakovskaya, T. V. & Kulaev, I. S. (2004). Purification and properties of exopolyphosphatase from the cytosol of Saccharomyces cerevisiae not encoded by the PPX1 gene. Biochemistry (Mosc), 69, 387–393.
26. Kulakovskaya, T. V., Andreeva, N. A. & Kulaev, I. S. (1997). Adenosine-5′-tetraphosphate and guanosine-5′-tetraphosphate: new substrates of the cytosolic exopolyphosphatase of the yeast Saccharomyces cerevisiae. Biochemistry (Mosc), 62, 1051–1052.
27. Guranowski, A., Starzynska, E., Barnes, L. D., Robinson, A. K. & Liu, S. (1998). Adenosine 5′-tetraphosphate phosphohydrolase activity is an inherent property of soluble exopolyphosphatase from yeast Saccharomyces cerevisiae. Biochim. Biophys. Acta, 1380, 232–238.
28. Andreeva, N., Kulakovskaya, T., Sidorov, I., Karpov, A. & Kulaev, I. (1998). Purification and properties of polyphosphatase from Saccharomyces cerevisiae cytosol. Yeast, 14, 383–390.
29. Kristensen, O., Laurberg, M., Liljas, A., Kastrup, J. S. & Gajhede, M. (2004). Structural characterization of the stringent response related exopolyphosphatase/guanosine pentaphosphate phosphohydrolase protein family. Biochemistry, 43, 8894–8900.
30. Rangarajan, E. S., Nadeau, G., Li, Y., Wagner, J., Hung, M. N., Schrag, J. D. et al. (2006). The structure of the exopolyphosphatase (PPX) from Escherichia coli O157:H7 suggests a binding mode for long polyphosphate chains. J. Mol. Biol. 359, 1249–1260.
31. Alvarado, J., Ghosh, A., Janovitz, T., Jauregui, A., Hasson, M. S. & Sanders, D. A. (2006). Origin of exopolyphosphatase processivity: fusion of an ASKHA phosphotransferase and a cyclic nucleotide phosphodiesterase homolog. Structure, 14, 1263–1272.
32. Aravind, L. & Koonin, E. V. (1998). A novel family of predicted phosphoesterases includes Drosophila prune protein and bacterial RecJ exonuclease. Trends Biochem. Sci. 23, 17–19.
33. Young, T. W., Kuhn, N. J., Wadeson, A., Ward, S., Burges, D. & Cooke, G. D. (1998). Bacillus subtilis ORF yybQ encodes a manganese-dependent inorganic pyrophosphatase with distinctive properties: the first of a new class of soluble pyrophosphatase? Microbiology-uk, 144, 2563–2571.
34. Shintani, T., Uchiumi, T., Yonezawa, T., Salminen, A., Baykov, A. A., Lahti, R. & Hachimori, A. (1998). Cloning and expression of a unique inorganic pyrophosphatase from Bacillus subtilis: evidence for a new family of enzymes. FEBS Letters, 439, 263–266.
35. Ahn, S., Milner, A. J., Futterer, K., Konopka, M., Ilias, M., Young, T. W. & White, S. A. (2001). The “open” and “closed” structures of the type-C inorganic pyrophosphatases from Bacillus subtilis and Streptococcus gordonii. J. Mol. Biol. 313, 797–811.
36. Merckel, M. C., Fabrichniy, I. P., Salminen, A., Kalkkinen, N., Baykov, A. A., Lahti, R. & Goldman, A. (2001). Crystal structure of Streptococcus mutans pyrophosphatase: a new fold for an old mechanism. Structure, 9, 289–297.
37. D'Angelo, A., Garzia, L., Andre, A., Carotenuto, P., Aglio, V., Guardiola, O. et al. (2004). Prune cAMP phosphodiesterase binds nm23-H1 and promotes cancer metastasis. Cancer Cell, 5, 137–149.
38. Yamagata, A., Kakuta, Y., Masui, R. & Fukuyama, K. (2002). The crystal structure of exonuclease RecJ bound to Mn2+ ion suggests how its characteristic motifs are involved in exonuclease activity. Proc. Natl Acad. Sci. USA, 99, 5908–5912.
39. Kuhn, N. J. & Ward, S. (1998). Purification, properties, and multiple forms of a manganese-activated inorganic pyrophosphatase from Bacillus subtilis. Arch. Biochem. Biophys. 354, 47–56.
40. Bairoch, A., Apweiler, R., Wu, C. H., Barker, W. C., Boeckmann, B., Ferro, S. et al. (2005). The universal protein resource (UniProt). Nucl. Acids Res. 33, D154–D159.
41. Fabrichniy, I. P., Lehtio, L., Salminen, A., Zyryanov, A. B., Baykov, A. A., Lahti, R. & Goldman, A. (2004). Structural studies of metal ions in family II pyrophosphatases: the requirement for a janus ion. Biochemistry, 43, 14403–14411.
42. Zollo, M., Andre, A., Cossu, A., Sini, M. C., D'Angelo, A., Marino, N. et al. (2005). Overexpression of h-prune in breast cancer is correlated with advanced disease status. Clin. Cancer Res. 11, 199–205.
43. Kobayashi, T., Hino, S., Oue, N., Asahara, T., Zollo, M., Yasui, W. & Kikuchi, A. (2006). Glycogen synthase kinase 3 and h-prune regulate cell migration by modulating focal adhesions. Mol. Cell Biol. 26, 898–911.
44. Fabrichniy, I. P., Lehtio, L., Tammenkoski, M., Zyryanov, A. B., Oksanen, E., Baykov, A. A. et al. (2007). A trimetal site and substrate distortion in a II inorganic pyrophosphatase. J. Biol. Chem. 282, 1422–1431.
45. Rodrigues, C. O., Ruiz, F. A., Vieira, M., Hill, J. E. & Docampo, R. (2002). An acidocalcisomal exopolyphosphatase from Leishmania major with high affinity for short chain polyphosphate. J. Biol. Chem. 277, 50899–50906.
46. Fiske, C. H. & Subbarow, Y. (1925). The colorimetric determination of phosphorus. J. Biol. Chem. 66, 375–400.
47. Leslie, A. G. W. (1992). Recent changes to the MOSFLM package for processing film and image plate data. Joint CCP4 ESF-EAMCB Newsletter on Protein Crystallography, vol. 26.
48. Evans, P. R. (1997). Joint CCP4 and ESF-EACBM Newsletter, vol. 33, pp. 22–24.
49. Terwilliger, T. C. & Berendzen, J. (1999). Automated MAD and MIR structure solution. Acta Crystallog. sect. D, 55, 849–861.
50. Terwilliger, T. C. (2002). Automated structure solution, density modification and model building. Acta Crystallog. sect. D, 58, 1937–1940.
51. Roussel, A. C., C. (1991). Silicon Graphics Geometry Partners Directory, vol. 86.
52. Winn, M. D., Isupov, M. N. & Murshudov, G. N. (2001). Use of TLS parameters to model anisotropic displacements in macromolecular refinement. Acta Crystallog. sect. D, 57, 122–133.
53. Murshudov, G. N., Vagin, A. A. & Dodson, E. J. (1997). Refinement of macromolecular structures by the maximum-likelihood method. Acta Crystallog. sect. D, 53, 240–255.
54. Read, R. J. (1986). Improved Fourier coefficients for maps using phases from partial structures with errors. Acta Crystallog. sect. A, 42, 140–149.
55. Perrakis, A., Morris, R. & Lamzin, V. S. (1999). Automated protein model building combined with iterative structure refinement. Nature Struct. Biol. 6, 458–463.
56. Laskowski, R. A., Macarthur, M. W., Moss, D. S. & Thornton, J. M. (1993). Procheck - a program to check the stereochemical quality of protein structures. J. Appl. Crystallog. 26, 283–291.
57. Notredame, C., Higgins, D. G. & Heringa, J. (2000). T-Coffee: a novel method for fast and accurate multiple sequence alignment. J. Mol. Biol. 302, 205–217.
58. Nicholas, K. B., Nicholas, H. B., Jr & Deerfield, D. W., II (1997). GeneDoc: analysis and visualization of genetic variation. EMBNEW.NEWS, 4, 14.
59. Guex, N. & Peitsch, M. C. (1997). SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis, 18, 2714–2723.
60. Collaborative Computing Project Number 4. (1994). The CCP4 suite - programs for protein crystallography. Acta Crystallog. sect. D, 50, 760–763.
61. Fenn, T. D., Ringe, D. & Petsko, G. A. (2003). POVScript+: a program for model and data visualization using persistence of vision ray-tracing. J. Appl. Crystallog. 36, 944–947.
62. Flores, T. P., Moss, D. S. & Thornton, J. M. (1994). An algorithm for automatically generating topology cartoons. Protein Eng. 7, 31–37.
63. Nicholls, A., Bharadwaj, R. & Honig, B. (1993). Grasp - graphical representation and analysis of surface-properties. Biophys. J. 64, A166.
64. Ramachandran, G. N. & Sasisekharan, V. (1968). Conformation of polypeptides and proteins. Advan. Protein Chem. 23, 283–438.

Edited by M. Guss

(Received 20 December 2006; received in revised form 19 May 2007; accepted 22 May 2007) Available online 31 May 2007