Structural and Functional Studies on the Dodecameric Tetrahedral-shaped Aminopeptidase FrvX from Pyrococcus horikoshii

Inaugural dissertation of the Faculty of Science (Philosophisch-naturwissenschaftliche Fakultät) of the University of Bern

submitted by Santina Russo from Aeschi b. Spiez

Thesis supervisor: Prof. Dr. Ulrich Baumann, Department of Chemistry and Biochemistry

Accepted by the Faculty of Science.

Bern, 23 March 2005

The Dean

Prof. Dr. P. Messerli

Summary

Protein turnover is an essential process in all living cells. The cellular protein content is constantly renewed through synthesis of new proteins and degradation of misfolded or unneeded proteins. Since protein degradation is also a hazard, it must be subject to spatial and temporal control. A basic regulatory mechanism is self-compartmentalization, in which oligomerization is used to arrange the active sites of the subunits on the inside of the particles in large central cavities. Especially in prokaryotes, which lack cellular compartments, self-compartmentalization evolved to restrict access to the rather unspecific active sites and to ensure complete degradation. Recently, a novel self-compartmentalizing aminopeptidase complex with a tetrahedral shape (TET) was found in the archaeon Haloarcula marismortui. In contrast to all other chambered proteases described before, the central cavity of TET is accessible through a total of eight channels, four wider channels and four narrow channels. TET is capable of degrading most peptides down to single amino acids.

We present here the determination of the crystal structure of the TET homolog FrvX from Pyrococcus horikoshii. The gene encoding FrvX was cloned into a bacterial expression vector and overexpressed in Escherichia coli. Purified FrvX crystallized in space group P6₃ with four molecules per asymmetric unit. The structure was determined by molecular replacement using an ortholog from Bacillus subtilis as search model and refined at a resolution of 2.1 Å. Like TET, FrvX is a dodecameric tetrahedral-shaped aminopeptidase complex. The monomer has a typical clan MH fold, as found for example in Aeromonas proteolytica aminopeptidase, containing a dinuclear zinc active centre. The quaternary structure is built from dimers with a length of 100 Å, which form the edges of the tetrahedron. All twelve active sites are located in a large central cavity inside the particle. The central cavity is accessible through four channels located on the faces of the tetrahedral particle. The diameter of these channels restricts access to the central chamber so that only short peptides and, at most, unfolded proteins can serve as substrates.

Before we were able to solve the crystal structure of FrvX by molecular replacement, we tried to determine phases by multiple-wavelength anomalous dispersion (MAD). We performed MAD experiments on tantalum- and platinum-derivatized crystals. The heavy-atom compounds used were a Ta₆Br₁₄ cluster and di-µ-iodobis(ethylenediamine)diplatinum(II) nitrate. However, no interpretable electron-density map could be obtained from these data.

The protein solution that yielded the hexagonal crystals used for structure determination was purified by heat treatment and ion exchange chromatography. In initial trials, however, the protein was purified by Ni-NTA affinity chromatography. Crystallization of this protein solution yielded crystals that belong to a trigonal space group and diffract only to low resolution.

FrvX and TET are the only tetrahedral-shaped proteases that have been studied structurally and functionally so far. Both proteins are postulated to have a major role in the cellular degradation pathway. However, further studies are needed to prove and clarify their relevance.

Contents

Summary

Abbreviations

Introduction

Chapter I Crystal structure of a dodecameric tetrahedral-shaped aminopeptidase

Chapter II Further analysis of the structure

Chapter III Anomalous scattering of tantalum and platinum-derivatized crystals

Chapter IV Trigonal crystal form

Acknowledgements

Curriculum Vitae

Abbreviations

Å         Ångström
ApAp      Leucine aminopeptidase from Aeromonas proteolytica
ATP       Adenosine triphosphate
CCD       Charge-coupled device
CCP4      Collaborative Computational Project Number 4
CNS       Crystallography and NMR System
Da        Dalton
DTT       Dithiothreitol
DESY      Deutsches Elektronen-Synchrotron
E. coli   Escherichia coli
EMBL      European Molecular Biology Laboratory
ESRF      European Synchrotron Radiation Facility
FrvX      Fructose-like operon protein X
IPTG      Isopropyl β-D-thiogalactopyranoside
MAD       Multiple-wavelength anomalous dispersion
NTA       Nitrilotriacetate
ODx       Optical density at a wavelength of x nanometers
PAGE      Polyacrylamide gel electrophoresis
PCR       Polymerase chain reaction
PDB       Protein Data Bank
PEG       Polyethylene glycol
r.m.s.    Root mean square
RT        Room temperature
SAD       Single-wavelength anomalous dispersion
SDS       Sodium dodecyl sulfate
Se-Met    Selenomethionine
SHARP     Statistical Heavy Atom Refinement and Phasing
SIR       Single isomorphous replacement
SLS       Swiss Light Source
TET       Tetrahedral aminopeptidase
Tris      Tris(hydroxymethyl)aminomethane
XDS       X-ray Detector Software

Introduction

Degradation of proteins

Protein turnover is an essential process in all living cells. The cellular protein content is constantly renewed through synthesis of new proteins and degradation of misfolded or unnecessary proteins. Cellular structures have to be constantly rebuilt or adapted, in particular during development or in response to external stimuli. Misfolded or damaged proteins must be degraded efficiently to prevent the formation of large aggregates in the cell [1].

Furthermore, protein degradation is of major importance as a regulatory mechanism involved in several cellular functions, since the turnover of many regulatory proteins must be controlled. Some proteins are very stable but many proteins have a short lifetime, particularly those that are involved in metabolic regulation. A rapid and specific degradation rate is a prerequisite for the required fast changes of the concentration of those proteins, not only downwards but also upwards [2], since the amino acids generated during the process of degradation are reused in protein synthesis.

In eukaryotic cells, a large number of physiological processes are controlled at least partially by protein degradation. These include gene transcription, cell-cycle progression, inflammatory response, tumor suppression, organ formation and antigen processing [3].

Therefore, controlled protein degradation is a key cellular function and must be under spatial and temporal control in order to prevent damage of the cell.

Degradation pathways

Eukaryotic cells contain lysosomes, which employ a battery of proteolytic enzymes for the degradation of almost exclusively extracellular proteins that are taken up from the cell surface. In contrast, the degradation of cytosolic proteins mostly occurs by the ubiquitin-proteasome pathway. Here, ubiquitinylation of target proteins is carried out by linking the C-terminal glycine of the conserved protein ubiquitin to the ε-amino group of a lysine residue of the protein substrate in an ATP-dependent reaction [4-6]. Subsequently, a polyubiquitin chain is usually formed, in which the C-terminus of each ubiquitin unit is linked to a specific lysine residue (most commonly Lys48) of the previous ubiquitin. A chain of four or more ubiquitin molecules serves as a molecular tag that marks the protein for rapid degradation by the large proteolytic complex known as the 26S proteasome [7-9]. In an ATP-dependent process, this large complex generates diverse peptides ranging in length from 2 to 24 residues, with approximately two thirds being less than 8 residues long [10-13].

Nearly all of these peptides are then rapidly degraded to amino acids. They are first cleaved by endopeptidases like the monomeric thimet oligopeptidase (TOP) [14] to peptides 2-5 amino acids long, which are then degraded by aminopeptidases to single amino acids. This rapid clearance of peptides released by the proteasome is essential for cell viability, since undegraded peptide fragments accumulating in the cytosol could interfere with important protein-protein interactions, aggregate or turn out to be toxic [15, 16]. The amino acids provided by the hydrolysis of the peptides can be reused in the synthesis of new proteins. This is of particular importance under conditions where the supply of exogenous amino acids is limited. Unneeded amino acids are processed by the urea cycle.

In higher vertebrates, a small fraction of the longer proteasome products (9-17 amino acids long) escapes this complete destruction and is further cleaved by the tripeptidyl peptidase II (TPPII) [17, 18], which assists the proteasome in producing the characteristic 7-9 amino acid long peptides used in the immune response. These peptides are translocated from the cytosol into the endoplasmic reticulum, where they bind to major histocompatibility complex (MHC) class I molecules and are transported to the cell surface to serve in antigen presentation. Electron micrographs of TPPII have shown a giant rod-shaped complex composed of a stack of eight ring-shaped segments, resulting in a complex that is even bigger than the proteasome [17]. TPPII is a serine peptidase of the subtilisin type and combines aminopeptidase and (trypsin-like) endopeptidase activities.

Proteasomes and their kin

The 26S proteasome is a large multisubunit protease found in the cytosol and in the nucleus of eukaryotic cells. It consists of a proteolytic core, the 20S proteasome, and regulatory multisubunit 19S caps attached to one or both ends of the core. The ATP-dependent 19S caps are believed to recognize and unfold the ubiquitinylated substrates and feed them to the actual protease [19-21]. The 20S proteasome seems to be more ancient than the ubiquitin system since both prokaryotic and archaebacterial ancestors have been identified [22, 23]. Archaea and a few eubacteria contain a primordial 20S proteasome that possesses the same oligomeric architecture as its eukaryotic counterpart, and most bacteria contain a proteasome homolog called heat shock locus V (HslV), which adopts a slightly simplified quaternary structure.

Crystal structures are available for Escherichia coli and Haemophilus influenzae proteasome homologs [24, 25] and the Thermoplasma acidophilum and Saccharomyces cerevisiae 20S proteasomes [26, 27].

The bacterial protease HslV (Figure 1d) is a 240 kDa homododecamer built from two six-membered rings assembled head to head, forming a barrel-like structure with two axial channels leading to a central cavity [24]. The complex measures 75 Å in height and 100 Å in diameter. The channels have a diameter of about 19 Å. The amino acid sequence of an HslV monomer shares about 20% identity with the subunits of the archaeal and eukaryotic 20S proteasomes. The HslV monomer adopts the same fold as all 20S proteasome subunits and employs the same catalytic mechanism. The HslV and 20S proteasome subunits are threonine proteases with an N-terminal threonine as nucleophile, which places them in the family of Ntn (N-terminal nucleophile) hydrolases. Together with the HslU ATPase subunits, which form two homohexameric rings stacked onto both ends of HslV, a large ATP-dependent protease named HslUV is formed [23, 25] (Figure 1e). However, HslUV is not essential for protein breakdown in bacteria [28], and only a few in vivo substrates are known [29, 30].

Most bacterial proteins are degraded by a protease called caseinolytic protease (ClpP) (Figure 1c), which was shown to be essential for protein degradation [31]. The quaternary structure of ClpP is built from two seven-membered rings of identical subunits stacked on one another, enclosing the active sites in the interior of the complex [32]. ClpP is a serine protease and unrelated to the HslV and 20S proteasome subunits in sequence and fold. The protease is activated by binding to hexameric rings of ClpA or ClpX subunits, two distinct regulatory complexes, forming the ClpAP or ClpXP holoenzymes.

Figure 1: Proteasomes and bacterial proteasome-like protease complexes. (a) 20S proteasome of Saccharomyces cerevisiae [33]. (b) 20S proteasome of Thermoplasma acidophilum [33]. (c) ClpP protease of Escherichia coli [33]. The upper panels show end-on views with the axial pores in an open (T. acidophilum proteasome and ClpP) or closed (S. cerevisiae proteasome) state. The lower panels show side-on cross-sections, each with its seven-fold axis oriented vertically. The α- and β-rings are labelled. The yellow spheres denote active sites. The AC and CC labels on the proteasomes indicate the antechambers and the catalytic chamber, respectively. (d) HslV protease of Escherichia coli. Cartoon representation along the molecular six-fold axis. The subunits related by the crystallographic three-fold axis are drawn in the same color. (e) HslUV protease-chaperone complex of Haemophilus influenzae. Cartoon representation perpendicular to the six-fold axis. The subunits forming HslV are colored in green and yellow, the HslU subunits in red and blue. Figures d and e were prepared with the program PYMOL (www.pymol.org).

Archaebacteria contain a 20S proteasome built from a stack of four heptameric rings [26] (Figure 1b). The two inner rings consist of β-subunits, which provide the proteolytic activity. The subunits of these β-rings assemble in an orientation analogous to that of HslV and form the central cavity, which harbors all 14 active sites. The two outer rings are built from α-subunits, which are proteolytically inactive but share the fold of the β-subunits. Together with the β-subunits they form two antechambers, which provide a holding area for the unfolded substrates. Unlike in HslV, the central chamber never faces the cytosol because the isolated β-subunits have lost the ability to assemble in the absence of the α-subunits. The axial pores providing the entry to the antechambers have a diameter of about 13 Å. The two openings of the central β-chamber are larger, with a diameter of about 25 Å. For proteolysis to occur, the substrates have to be unfolded by oligomeric chaperone-like ATPases [34] before they can enter the antechamber.

In eukaryotic 20S proteasomes, both α- and β-subunits have diverged into seven similar but distinct subunits each. Four of the subunits of the β-rings have lost the ability to cleave their propeptide autocatalytically and thus to achieve proteolytic activity. The remaining three active β-subunits per ring display different specificities, namely trypsin-like, chymotrypsin-like and caspase-like activities [35, 36]. Proteolysis is activated and regulated by the 19S caps, which associate with the 20S proteasome to form the active, ATP- and ubiquitin-dependent protein degradation machine called the 26S proteasome [19-21]. The components of these ~900 kDa complexes seem to have evolved from the archaeal ATPases [37]. Each 19S complex contains at least 17 different subunits, forming a base of eight subunits sitting on the α-ring of the 20S proteasome and a lid subcomplex on top of the base [38]. Overall, the eukaryotic 26S proteasome is a 2.5 MDa particle.

Energy-independent protease complexes

In prokaryotes, the products of the proteasome and its bacterial counterparts are thought to be further processed by energy-independent multimeric assemblies like the tricorn protease found in the archaeon Thermoplasma acidophilum [39, 40]. Tricorn (TRI) is a 720 kDa homohexameric complex built of two staggered and interdigitating trimeric rings forming a distorted hexagon, which is traversed by a prominent channel along the three-fold axis [39, 40] (Figure 2a). Inside the hexamer the channel widens into a cavity that has a diameter of 85 Å and encloses the active sites. TRI was shown to cleave both trypsin-like and chymotrypsin-like substrates and to have a preference for basic P1 residues. Sequences homologous to the tricorn protease can be found in all sequenced Eubacteria species but only in some archaea.

The products of TRI are di- and tripeptides, which are broken down to amino acids by the tricorn interacting factors, a proline iminopeptidase (F1) [41] and two metallo-aminopeptidases (F2, F3) [40, 42]. In vivo, TRI can assemble into a giant icosahedral capsid of 14.6 MDa [43], which might serve as the organizing centre of the multi-proteolytic complex.
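The masses quoted for tricorn imply simple stoichiometries, which the following minimal Python sketch works out as a plausibility check. It uses only the 720 kDa hexamer mass and the 14.6 MDa capsid mass given in the text; the variable names are illustrative, and the rounding to a whole number of hexamers per capsid is an inference from these two figures rather than a statement made in the text.

```python
# Masses taken from the text (in kDa).
TRI_HEXAMER_KDA = 720.0      # homohexameric tricorn complex
TRI_CAPSID_KDA = 14_600.0    # icosahedral capsid, 14.6 MDa

# A homohexamer consists of six identical subunits.
subunit_kda = TRI_HEXAMER_KDA / 6

# Ratio of capsid mass to hexamer mass; the quoted masses are rounded,
# so the ratio is not expected to be an exact integer.
hexamers_per_capsid = TRI_CAPSID_KDA / TRI_HEXAMER_KDA

print(f"tricorn subunit: {subunit_kda:.0f} kDa")
print(f"hexamers per capsid: {hexamers_per_capsid:.1f} "
      f"(about {round(hexamers_per_capsid)})")
```

A ratio close to 20 would be consistent with one hexamer per face of an icosahedron, which has 20 faces.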

A very similar quaternary structure is adopted by the smaller yeast bleomycin hydrolase (Gal6) [44] (Figure 2b). It is a cysteine protease that forms hexamers of 300 kDa. The six identical subunits of Gal6 are arranged in two intercalating rings with 32 point-group symmetry, creating a channel along the three-fold axis. Gal6 was shown to detoxify the anticancer drug bleomycin by hydrolysis of an amide group and thus to regulate the effectiveness and the side effects of the drug [45, 46]. Furthermore, Gal6 binds DNA and acts as a repressor in the Gal4 regulatory system, which plays a central role in the galactose metabolism of yeast [47].

Figure 2: Energy-independent protease complexes. (a) Tricorn protease of Thermoplasma acidophilum [39]. Left: Ribbon representation along the molecular three-fold axis. Individual subunits are distinguished by color. Middle: Cut-open surface representation along the three-fold axis. Right: Cut-open surface representation viewed from the side. The central pore has a diameter of about 85 Å. (b) Yeast bleomycin hydrolase (Gal6). Cartoon representation along the three-fold axis. The six subunits are shown in different colors. (c) D-aminopeptidase DppA from Bacillus subtilis. Cartoon representation along the molecular five-fold axis. The subunits of the pentamer closest to the viewer are distinguished by different colors. (d) Bovine lens leucine aminopeptidase (BLAP). Cartoon representation along the three-fold axis. The six subunits are shown in different colors. Figures b, c and d were prepared with the program PYMOL (www.pymol.org).

Another energy-independent multisubunit protease is the D-aminopeptidase DppA from Bacillus subtilis [48] (Figure 2c), which is involved in the remodelling of peptidoglycan in the cell wall. It is a dinuclear zinc-dependent D-specific aminopeptidase that consists of ten identical subunits, which assemble into two pentameric rings stacked on one another. A 20 Å wide channel, which gives access to a central 50 Å wide cavity, runs along the five-fold axis of the complex.

One of the few other multimeric aminopeptidases known is the bovine lens leucine aminopeptidase (BLAP), a hexameric protein composed of two trimeric rings stacked on one another [49-51] (Figure 2d). However, in contrast to DppA, the active sites of BLAP are located on the surface of the molecule. BLAP possesses neither a central cavity nor channels restricting access to the active sites.

Self-Compartmentalization

With the exception of BLAP, all complexes described above are so-called self-compartmentalizing or chambered proteases [37, 52, 53]. Despite a complete lack of similarity in sequence, structure or catalytic mechanism, these proteases have converged towards a similar architecture. They use oligomerization to arrange their active sites on the inside of the particles in large central cavities. In all cases, the proteolytic subunits associate into rings (three-, five-, six- or seven-membered) that stack upon each other to form barrel-shaped complexes. All of them possess two axial channels leading to a central chamber, which harbors all active sites.

Apparently, self-compartmentalization is a regulatory mechanism that prevents unintended hydrolysis of the wrong substrates. Especially in prokaryotes, which lack cellular compartments, self-compartmentalization evolved to restrict access to the rather unspecific active sites and to ensure complete degradation.

In the case of energy-dependent chambered proteases like the proteasomes, additional regulatory complexes which contain ATP-dependent chaperone-like proteins, so-called reverse chaperones or unfoldases [54], are used for recognition of target proteins, their unfolding and translocation into the inner cavity.

In the case of energy-independent proteases, the diameter of the channels restricts access to the central chamber so that only short peptides and, at most, unfolded proteins can serve as substrates.

Tetrahedral Aminopeptidase

In 2002, Franzetti and coworkers were able to isolate a novel self-compartmentalizing aminopeptidase complex called tetrahedral aminopeptidase (TET). Electron microscopy analysis of TET at 17 Å showed that, in contrast to all other chambered proteases described before, the homododecameric complex has a tetrahedral shape, which is formed by association of six antiparallel dimers [55] (Figure 3a). TET possesses a central cavity with a diameter of about 35 Å, which most probably harbors the active sites. The proteolytic chamber is accessible through a total of eight channels of two different types. Four wider channels originate at the centres of the faces of the tetrahedron, and four narrow channels are located at the vertices of the tetrahedron (Figure 3b). The wide channels have a diameter of about 21 Å, while the second channel type is too narrow for its diameter to be determined by electron microscopy. TET was shown to process oligopeptides of 9-12 amino acids in length most efficiently and to possess a broad specificity with a preference for neutral and basic residues [55]. It has been proposed that the substrate peptides enter the cavity through the larger channels, while the narrow channels provide the exit for the free amino acids.
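The subunit and channel counts given for TET follow directly from the combinatorics of a tetrahedron. The short Python sketch below makes this explicit under the assumptions stated in the text, namely one antiparallel dimer per edge, one wide channel per face and one narrow channel per vertex. It is a purely combinatorial illustration with hypothetical variable names and involves no structural coordinates.

```python
from itertools import combinations

# Label the four vertices of an abstract tetrahedron; no coordinates are needed.
vertices = [0, 1, 2, 3]

# Every pair of vertices is joined by an edge; every triple spans a face.
edges = list(combinations(vertices, 2))
faces = list(combinations(vertices, 3))

# Assumption taken from the text: one antiparallel dimer lies along each edge.
subunits = 2 * len(edges)

# Wide channels sit at the face centres, narrow channels at the vertices.
wide_channels = len(faces)
narrow_channels = len(vertices)
total_channels = wide_channels + narrow_channels

# Number of edges (dimers) that meet at each vertex.
dimers_per_vertex = {v: sum(v in e for e in edges) for v in vertices}

print(f"{len(edges)} edges -> {subunits} subunits")
print(f"{wide_channels} wide + {narrow_channels} narrow = {total_channels} channels")
print(f"dimers meeting at each vertex: {sorted(set(dimers_per_vertex.values()))}")
```

Three dimers meet at every vertex, which fits the three-fold axes running through the vertices and the face centres of the particle.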

Figure 3: Three-dimensional structure of TET [55] according to EM reconstruction at 17 Å resolution. (a) View along the molecular two-fold axis with one antiparallel dimer highlighted. (b) Left: same orientation as in (a). Middle: view along the three-fold axis down the narrow channel at the vertex. Right: view along the three-fold axis down the wider channel at the face.

TET represents a new type of self-compartmentalizing protease. The properties of TET would in principle enable it to degrade the products of the proteasome down to single amino acids and thus be a functional homolog of the tricorn protease with its interacting factors.

Sequence analysis of TET reveals similarity to assigned bacterial and archaeal aminopeptidases. Among them is a protein called FrvX from the hyperthermophilic archaeon Pyrococcus horikoshii, which shares 27% sequence identity with TET.

FrvX from Pyrococcus horikoshii

FrvX is a 353 amino acid long protein of 39 kDa, which mainly consists of a peptidase M42 domain according to the MEROPS database [56]. The M42 domain occurs only in bacterial and archaeal species and contains a dinuclear zinc center in the active site. The metal ions are complexed by conserved histidine, aspartic acid and glutamic acid residues. Sequence alignments with proteins of the same family and comparison with the crystal structure of the leucine aminopeptidase of Aeromonas proteolytica [57] point to a conserved glutamic acid as the putative catalytic base of the hydrolysis reaction.
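As a rough consistency check on these numbers, the monomer mass can be estimated from the chain length, and the mass of the dodecamer described in the Summary follows from the monomer mass. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that does not appear in the text; the resulting dodecamer mass is likewise an estimate and not a measured value.

```python
# Values from the text.
N_RESIDUES = 353        # length of the FrvX polypeptide chain
MONOMER_KDA = 39.0      # quoted monomer mass
N_SUBUNITS = 12         # FrvX is described as a dodecamer in the Summary

# Assumption (not from the text): average mass of one amino acid residue.
AVG_RESIDUE_DA = 110.0

# Monomer mass estimated from the chain length alone, in kDa.
estimated_monomer_kda = N_RESIDUES * AVG_RESIDUE_DA / 1000

# Estimated mass of the assembled dodecameric particle, in kDa.
estimated_dodecamer_kda = N_SUBUNITS * MONOMER_KDA

print(f"estimated monomer mass: {estimated_monomer_kda:.1f} kDa")
print(f"estimated dodecamer mass: {estimated_dodecamer_kda:.0f} kDa")
```

The length-based estimate lands close to the quoted 39 kDa, and twelve such subunits give a particle of just under half a megadalton.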

Nothing is known about the function of FrvX so far. Since the optimal growth temperature of the hyperthermophile Pyrococcus horikoshii is 95 °C, it is likely that FrvX is only active at very high temperatures.

References

1. Goldberg, A.L. & St John, A.C. (1976). Intracellular protein degradation in mammalian and bacterial cells: Part 2. Annu Rev Biochem 45, pp. 747-803.
2. Schimke, R.T. & Doyle, D. (1970). Control of enzyme levels in animal tissues. Annu Rev Biochem 39, pp. 929-976.
3. Berg, J.M., Tymoczko, J.L. & Stryer, L. (2002). Biochemistry, fifth edition. W. H. Freeman and Company, New York, USA.
4. Hershko, A. & Ciechanover, A. (1998). The ubiquitin system. Annu Rev Biochem 67, pp. 425-479.
5. Hershko, A., Ciechanover, A., Heller, H., Haas, A.L. & Rose, I.A. (1980). Proposed role of ATP in protein breakdown: conjugation of protein with multiple chains of the polypeptide of ATP-dependent proteolysis. Proc Natl Acad Sci U S A 77, pp. 1783-1786.
6. Ciechanover, A., Heller, H., Katz-Etzion, R. & Hershko, A. (1981). Activation of the heat-stable polypeptide of the ATP-dependent proteolytic system. Proc Natl Acad Sci U S A 78, pp. 761-765.
7. Ganoth, D., Leshinsky, E., Eytan, E. & Hershko, A. (1988). A multicomponent system that degrades proteins conjugated to ubiquitin. Resolution of factors and evidence for ATP-dependent complex formation. J Biol Chem 263, pp. 12412-12419.
8. Hough, R., Pratt, G. & Rechsteiner, M. (1987). Purification of two high molecular weight proteases from rabbit reticulocyte lysate. J Biol Chem 262, pp. 8303-8313.
9. Waxman, L., Fagan, J.M. & Goldberg, A.L. (1987). Demonstration of two distinct high molecular weight proteases in rabbit reticulocytes, one of which degrades ubiquitin conjugates. J Biol Chem 262, pp. 2451-2457.
10. Kisselev, A.F., Akopian, T.N. & Goldberg, A.L. (1998). Range of sizes of peptide products generated during degradation of different proteins by archaeal proteasomes. J Biol Chem 273, pp. 1982-1989.
11. Kisselev, A.F., Akopian, T.N., Woo, K.M. & Goldberg, A.L. (1999). The sizes of peptides generated from protein by mammalian 26 and 20 S proteasomes. Implications for understanding the degradative mechanism and antigen presentation. J Biol Chem 274, pp. 3363-3371.
12. Emmerich, N.P., Nussbaum, A.K., Stevanovic, S., Priemer, M., Toes, R.E., Rammensee, H.G. & Schild, H. (2000). The human 26 S and 20 S proteasomes generate overlapping but different sets of peptide fragments from a model protein substrate. J Biol Chem 275, pp. 21140-21148.
13. Niedermann, G., King, G., Butz, S., Birsner, U., Grimm, R., Shabanowitz, J., Hunt, D.F. & Eichmann, K. (1996). The proteolytic fragments generated by vertebrate proteasomes: structural relationships to major histocompatibility complex class I binding peptides. Proc Natl Acad Sci U S A 93, pp. 8572-8577.
14. Saric, T., Graef, C.I. & Goldberg, A.L. (2004). Pathway for degradation of peptides generated by proteasomes: a key role for thimet oligopeptidase and other metallopeptidases. J Biol Chem 279, pp. 46723-46732.
15. Hughes, E., Burke, R.M. & Doig, A.J. (2000). Inhibition of toxicity in the beta-amyloid peptide fragment beta-(25-35) using N-methylated derivatives: a general strategy to prevent amyloid formation. J Biol Chem 275, pp. 25109-25115.
16. Tenidis, K., Waldner, M., Bernhagen, J., Fischle, W., Bergmann, M., Weber, M., Merkle, M.L., Voelter, W., Brunner, H. & Kapurniotu, A. (2000). Identification of a penta- and hexapeptide of islet amyloid polypeptide (IAPP) with amyloidogenic and cytotoxic properties. J Mol Biol 295, pp. 1055-1071.
17. Geier, E., Pfeifer, G., Wilm, M., Lucchiari-Hartz, M., Baumeister, W., Eichmann, K. & Niedermann, G. (1999). A giant protease with potential to substitute for some functions of the proteasome. Science 283, pp. 978-981.
18. Shastri, N., Schwab, S. & Serwold, T. (2002). Producing nature's gene-chips: the generation of peptides for display by MHC class I molecules. Annu Rev Immunol 20, pp. 463-493.
19. Coux, O., Tanaka, K. & Goldberg, A.L. (1996). Structure and functions of the 20S and 26S proteasomes. Annu Rev Biochem 65, pp. 801-847.
20. DeMartino, G.N., Moomaw, C.R., Zagnitko, O.P., Proske, R.J., Chu-Ping, M., Afendis, S.J., Swaffield, J.C. & Slaughter, C.A. (1994). PA700, an ATP-dependent activator of the 20 S proteasome, is an ATPase containing multiple members of a nucleotide-binding protein family. J Biol Chem 269, pp. 20878-20884.
21. Chu-Ping, M., Vu, J.H., Proske, R.J., Slaughter, C.A. & DeMartino, G.N. (1994). Identification, purification, and characterization of a high molecular weight, ATP-dependent activator (PA700) of the 20 S proteasome. J Biol Chem 269, pp. 3539-3547.
22. Dahlmann, B., Kopp, F., Kuehn, L., Niedel, B., Pfeifer, G., Hegerl, R. & Baumeister, W. (1989). The multicatalytic proteinase (prosome) is ubiquitous from eukaryotes to archaebacteria. FEBS Lett 251, pp. 125-131.
23. Rohrwild, M., Coux, O., Huang, H.C., Moerschell, R.P., Yoo, S.J., Seol, J.H., Chung, C.H. & Goldberg, A.L. (1996). HslV-HslU: A novel ATP-dependent protease complex in Escherichia coli related to the eukaryotic proteasome. Proc Natl Acad Sci U S A 93, pp. 5808-5813.
24. Bochtler, M., Ditzel, L., Groll, M. & Huber, R. (1997). Crystal structure of heat shock locus V (HslV) from Escherichia coli. Proc Natl Acad Sci U S A 94, pp. 6070-6074.
25. Sousa, M.C., Trame, C.B., Tsuruta, H., Wilbanks, S.M., Reddy, V.S. & McKay, D.B. (2000). Crystal and solution structures of an HslUV protease-chaperone complex. Cell 103, pp. 633-643.
26. Lowe, J., Stock, D., Jap, B., Zwickl, P., Baumeister, W. & Huber, R. (1995). Crystal structure of the 20S proteasome from the archaeon T. acidophilum at 3.4 Å resolution. Science 268, pp. 533-539.
27. Groll, M., Ditzel, L., Lowe, J., Stock, D., Bochtler, M., Bartunik, H.D. & Huber, R. (1997). Structure of 20S proteasome from yeast at 2.4 Å resolution. Nature 386, pp. 463-471.
28. Kanemori, M., Nishihara, K., Yanagi, H. & Yura, T. (1997). Synergistic roles of HslVU and other ATP-dependent proteases in controlling in vivo turnover of sigma32 and abnormal proteins in Escherichia coli. J Bacteriol 179, pp. 7219-7225.
29. Khattar, M.M. (1997). Overexpression of the hslVU operon suppresses SOS-mediated inhibition of cell division in Escherichia coli. FEBS Lett 414, pp. 402-404.
30. Missiakas, D., Schwager, F., Betton, J.M., Georgopoulos, C. & Raina, S. (1996). Identification and characterization of HslV HslU (ClpQ ClpY) proteins involved in overall proteolysis of misfolded proteins in Escherichia coli. Embo J 15, pp. 6899-6909.
31. Tobias, J.W., Shrader, T.E., Rocap, G. & Varshavsky, A. (1991). The N-end rule in bacteria. Science 254, pp. 1374-1377.
32. Wang, J., Hartling, J.A. & Flanagan, J.M. (1997). The structure of ClpP at 2.3 Å resolution suggests a model for ATP-dependent proteolysis. Cell 91, pp. 447-456.
33. Pickart, C.M. & Cohen, R.E. (2004). Proteasomes and their kin: proteases in the machine age. Nat Rev Mol Cell Biol 5, pp. 177-187.
34. Zwickl, P., Ng, D., Woo, K.M., Klenk, H.P. & Goldberg, A.L. (1999). An archaebacterial ATPase, homologous to ATPases in the eukaryotic 26 S proteasome, activates protein breakdown by 20 S proteasomes. J Biol Chem 274, pp. 26008-26014.
35. Heinemeyer, W., Fischer, M., Krimmer, T., Stachon, U. & Wolf, D.H. (1997). The active sites of the eukaryotic 20 S proteasome and their involvement in subunit precursor processing. J Biol Chem 272, pp. 25200-25209.
36. Orlowski, M., Cardozo, C. & Michaud, C. (1993). Evidence for the presence of five distinct proteolytic components in the pituitary multicatalytic proteinase complex. Properties of two components cleaving bonds on the carboxyl side of branched chain and small neutral amino acids. Biochemistry 32, pp. 1563-1572.
37. Lupas, A., Flanagan, J.M., Tamura, T. & Baumeister, W. (1997). Self-compartmentalizing proteases. Trends Biochem Sci 22, pp. 399-404.
38. Glickman, M.H., Rubin, D.M., Coux, O., Wefes, I., Pfeifer, G., Cejka, Z., Baumeister, W., Fried, V.A. & Finley, D. (1998). A subcomplex of the proteasome regulatory particle required for ubiquitin-conjugate degradation and related to the COP9-signalosome and eIF3. Cell 94, pp. 615-623.
39. Brandstetter, H., Kim, J.S., Groll, M. & Huber, R. (2001). Crystal structure of the tricorn protease reveals a protein disassembly line. Nature 414, pp. 466-470.
40. Tamura, T., Tamura, N., Cejka, Z., Hegerl, R., Lottspeich, F. & Baumeister, W. (1996). Tricorn protease--the core of a modular proteolytic system. Science 274, pp. 1385-1389.
41. Tamura, T., Tamura, N., Lottspeich, F. & Baumeister, W. (1996). Tricorn protease (TRI) interacting factor 1 from Thermoplasma acidophilum is a proline iminopeptidase. FEBS Lett 398, pp. 101-105.
42. Tamura, N., Lottspeich, F., Baumeister, W. & Tamura, T. (1998). The role of tricorn protease and its aminopeptidase-interacting factors in cellular protein degradation. Cell 95, pp. 637-648.
43. Walz, J., Tamura, T., Tamura, N., Grimm, R., Baumeister, W. & Koster, A.J. (1997). Tricorn protease exists as an icosahedral supermolecule in vivo. Mol Cell 1, pp. 59-65.
44. Joshua-Tor, L., Xu, H.E., Johnston, S.A. & Rees, D.C. (1995). Crystal structure of a conserved protease that binds DNA: the bleomycin hydrolase, Gal6. Science 269, pp. 945-950.
45. Sebti, S.M., Mignano, J.E., Jani, J.P., Srimatkandada, S. & Lazo, J.S. (1989). Bleomycin hydrolase: molecular cloning, sequencing, and biochemical studies reveal membership in the cysteine proteinase family. Biochemistry 28, pp. 6544-6548.
46. Sebti, S.M., Jani, J.P., Mistry, J.S., Gorelik, E. & Lazo, J.S. (1991). Metabolic inactivation: a mechanism of human tumor resistance to bleomycin. Cancer Res 51, pp. 227-232.
47. Xu, H.E. & Johnston, S.A. (1994). Yeast bleomycin hydrolase is a DNA-binding cysteine protease. Identification, purification, biochemical characterization. J Biol Chem 269, pp. 21177-21183.
48. Remaut, H., Bompard-Gilles, C., Goffin, C., Frere, J.M. & Van Beeumen, J. (2001). Structure of the Bacillus subtilis D-aminopeptidase DppA reveals a novel self-compartmentalizing protease. Nat Struct Biol 8, pp. 674-678.
49. Burley, S.K., David, P.R., Taylor, A. & Lipscomb, W.N. (1990). Molecular structure of leucine aminopeptidase at 2.7-Å resolution. Proc Natl Acad Sci U S A 87, pp. 6878-6882.
50. Burley, S.K., David, P.R. & Lipscomb, W.N. (1991). Leucine aminopeptidase: bestatin inhibition and a model for enzyme-catalyzed peptide hydrolysis. Proc Natl Acad Sci U S A 88, pp. 6916-6920.
51. Burley, S.K., David, P.R., Sweet, R.M., Taylor, A. & Lipscomb, W.N. (1992). Structure determination and refinement of bovine lens leucine aminopeptidase and its complex with bestatin. J Mol Biol 224, pp. 113-140.
52. Baumeister, W., Walz, J., Zuhl, F. & Seemuller, E. (1998). The proteasome: paradigm of a self-compartmentalizing protease. Cell 92, pp. 367-380.
53. Larsen, C.N. & Finley, D. (1997). Protein translocation channels in the proteasome and other proteases. Cell 91, pp. 431-434.
54. Lupas, A., Koster, A.J. & Baumeister, W. (1993). Structural features of 26S and 20S proteasomes. Enzyme Protein 47, pp. 252-273.
55. Franzetti, B., Schoehn, G., Hernandez, J.F., Jaquinod, M., Ruigrok, R.W. & Zaccai, G. (2002). Tetrahedral aminopeptidase: a novel large protease complex from archaea. Embo J 21, pp. 2132-2138.
56. Rawlings, N.D. & Barrett, A.J. (1999). MEROPS: the peptidase database. Nucleic Acids Res 27, pp. 325-331.
57. Chevrier, B., Schalk, C., D'Orchymont, H., Rondeau, J.M., Moras, D. & Tarnus, C. (1994). Crystal structure of Aeromonas proteolytica aminopeptidase: a prototypical member of the co-catalytic zinc enzyme family. Structure 2, pp. 283-291.

Chapter I

The following chapter has been published in the Journal of Biological Chemistry.

Crystal Structure of a Dodecameric Tetrahedral-shaped Aminopeptidase

Russo, S. and Baumann, U. (2004) J. Biol. Chem. 279, 51275-51281


Chapter II

Further analysis of the structure

The oligomeric state of FrvX and related proteins

As described in Chapter I, the FrvX monomer consists of two domains: a larger domain that contains all amino acids of the active site (domain I) and a smaller domain (domain II). The fold of the catalytic domain I resembles those found in other clan MH proteases (MEROPS database [1]), such as the leucine aminopeptidase from Aeromonas proteolytica (ApAp, PDB code 1AMP) [2] and the aminopeptidase from Streptomyces griseus (SgAp, PDB code 1XJO) [3]. In a structural alignment, the positions of the active sites of FrvX, ApAp and SgAp overlap (data not shown). However, ApAp and SgAp completely lack the small domain II. This suggests that domain II is in principle not needed for catalytic activity and might serve another purpose. Since both ApAp and SgAp are monomers in vivo [4, 5], one might assume that domain II of FrvX is needed for its oligomerization.

A related oligomeric protein is the clan MF (MEROPS database) bovine lens leucine aminopeptidase (BLAP), which, like FrvX, consists of two domains, a larger domain carrying the active site and a smaller domain [6]. Its physiological form is a hexamer built from two stacked trimers. The trimers themselves have a spiral form and are held together through interactions of the large catalytic domains. The main connection between the two trimeric spirals that forms the hexamer, however, is made by residues of the small domains, which interact at the vertices of the triangular structure.

In FrvX, the primary building blocks of the dodecamer are six antiparallel dimers which form the edges of the tetrahedron. The monomers use a large portion of their surface for dimer interactions. The dimers themselves are connected to each other more weakly, mainly at the vertices of the particle. The residues which participate in the edge-dimer interaction are almost exclusively located in domain II or its nearest periphery (Figure 4 of Chapter I).

These observations about the quaternary structures of BLAP and FrvX support an important role of domain II in oligomerization and lead to the assumption that its existence is the prerequisite as well as the cause for oligomerization.

As mentioned in Chapter I, two as yet unpublished structures which are very similar to FrvX are deposited in the PDB. The first protein, from Bacillus subtilis (PDB entry 1VHE), is the search model we used for structure solution. The second protein is an ortholog from Thermotoga maritima (PDB entry 1VHO). 1VHE shares 37% sequence identity with FrvX, whereas 1VHO is 30% identical to FrvX. Like FrvX, the Bacillus enzyme forms a tetrahedral-shaped dodecamer in the crystal lattice, while the Thermotoga ortholog appears to be a monomer.

Figure 1: Stereoview of a superposition of subunit A of FrvX, 1VHE and 1VHO colored in blue, red and green respectively. The structures were superimposed using the program O [7]. The figure was prepared using PYMOL (www.pymol.org).

A structure-based sequence alignment of the three proteins reveals only minor differences between FrvX and 1VHE on the one hand and 1VHO on the other hand (Figure 4 of Chapter I). A superposition of the monomer structures of FrvX, 1VHE and 1VHO is shown in Figure 1. Overall, the monomer structures of FrvX, 1VHE and 1VHO are very similar. They exhibit Cα-r.m.s. deviations of about 1 Å. However, a major conformational difference occurs in domain II of 1VHO compared to FrvX and 1VHE. One loop of domain II of 1VHO clearly displays a different conformation from the equivalent loops of FrvX and 1VHE. Although most residues of this loop are not visible in our FrvX structure, we can conclude that its conformation is similar to that in 1VHE, since the visible residues clearly adopt the same conformation as those in 1VHE.

A superposition of 1VHE and 1VHO monomers with the edge-forming dimer of FrvX reveals the relevance of this loop for dimer formation (Figure 2).

Figure 2: Overlay of 1VHE and 1VHO with edge dimer of FrvX. Subunit A of FrvX (blue) is superposed with 1VHE (red) and 1VHO (green). Subunit D of FrvX which completes the edge dimer is shown in surface representation. The structures were superposed using the program O [7]. The figure was prepared using PYMOL (www.pymol.org).

The loop conformations found in FrvX and 1VHE allow and putatively help in dimer formation by interacting with the dimer partner. In contrast, the loop of 1VHO would collide with the dimer partner and consequently interfere with the formation of the edge dimer. In the 1VHO crystal structure this loop forms crystal contacts.

Since no solution studies of either 1VHE or 1VHO are available, the physiological oligomeric state of both proteins is unknown, and it is difficult to decide whether the loop conformation observed in 1VHO is physiological or merely a crystallization artefact.

The relevance of domain II for oligomerization of FrvX and related proteins remains to be established. During our work on FrvX, two additional homologous proteins were examined, namely Peptidase A from Lactococcus lactis (PepA) and FrvX from Thermoanaerobacter tengcongensis (FrvXTt). A sequence alignment of these two proteases with FrvX suggests that both enzymes contain a small domain in addition to the large catalytic domain (data not shown). On a gel-filtration column, however, PepA and FrvXTt migrate as dimeric and monomeric proteins, respectively. Determining the crystal structures of these two proteins might therefore help to elucidate the role of domain II in oligomerization.

The role of a conserved aspartic acid

Sequence alignments of FrvX with proteins of the M42 peptidase family reveal a conserved aspartic acid (Asp-70; Figure 4 of Chapter I) near the active site, which was thought to take part in the catalytic mechanism [8]. Asp-70 is also present in other related aminopeptidases, such as ApAp and TET. The crystal structures of FrvX and ApAp show that Asp-70 (Asp-99 in ApAp) is in close proximity to the binuclear zinc center but too distant to be directly involved in the reaction mechanism (Figure 4). Asp-70 is situated on one of the two loops that connect domain I and domain II. It is within hydrogen-bond distance of the backbone of Glu-212, which is the catalytic base, and of Glu-213, which is one of the zinc ligands.

Figure 4: Stereoview of the active site of FrvX. The sidechains of the active site residues are shown as sticks. The backbone of Glu-212 and Glu-213 is shown as lines. Atomic colors are as follows: oxygen, red; nitrogen, blue; carbon, black or grey. Zinc ions and the water molecule are shown as black or red spheres respectively. Hydrogen–bonds are drawn as violet dashes and labeled with the distances between hydrogen–bond partners in Å. The figure was prepared using PYMOL (www.pymol.org).

It seems that these hydrogen bonds stabilize the linking loop and consequently the connection of the two domains. The strict conservation of Asp-70 suggests that these interactions play a major role in stabilization of the FrvX structure.

In order to pursue this question, mutation studies are in progress. Asp-70 has been replaced by alanine, asparagine and glutamic acid. Preliminary results of activity measurements (data not shown) indicate that the D70A and D70N mutants are completely inactive and that the D70E mutant retains partial activity. However, these results are still to be confirmed. In addition, further activity studies of the D70E mutant and determination of the crystal structures of all the mutant proteins are planned.

The specificity for leucine of FrvX

As mentioned in Chapter I, FrvX shows a significant specificity for leucine in the P1 position of the substrate. Related aminopeptidases such as ApAp, SgAp and BLAP likewise strongly prefer substrates with large hydrophobic side chains at the N-terminus, with leucine being the most efficiently cleaved [9]. To illustrate the similarity of the substrate binding pockets of FrvX and ApAp, a superposition of FrvX with leucinephosphonic acid bound to ApAp (PDB code 1FT7) is shown in Figure 3.

Figure 3: Surface representation of FrvX superposed with leucinephosphonic acid bound to ApAp. The amino acids of ApAp are not visible. The leucinephosphonic acid molecule and the sidechains of the active site residues of FrvX are shown as sticks. Atomic colors are as follows: oxygen is drawn in red; carbon in black; nitrogen in blue; phosphorus in yellow. The structures were superposed using the program O [7]. The figure was prepared using PYMOL (www.pymol.org).

The substrate binding pocket of FrvX is confined by Ile-238, Lys-293 and Thr-298. Both the hydrophobicity and the form of the binding pocket seem to render it a perfect host for a leucine side-chain.

In order to gain deeper insight into this subject, we plan to analyze the effect of different inhibitors on the activity of FrvX. Furthermore, we plan to determine the crystal structures of FrvX complexed with inhibitors such as leucine hydroxamate and amastatin.

In contrast to FrvX, TET possesses a rather broad specificity with a preference for neutral and basic residues [8]. FrvX and TET are the only dodecameric tetrahedral-shaped aminopeptidases which have been functionally studied so far. Both proteins are postulated to have a major role in the cellular degradation pathway. However, further studies are needed to prove and clarify their relevance.

References

1. Rawlings, N.D. & Barrett, A.J. (1999). MEROPS: the peptidase database. Nucleic Acids Res 27, pp. 325-331. 2. Chevrier, B., Schalk, C., D'Orchymont, H., Rondeau, J.M., Moras, D. & Tarnus, C. (1994). Crystal structure of Aeromonas proteolytica aminopeptidase: a prototypical member of the co-catalytic zinc enzyme family. Structure 2, pp. 283-291. 3. Greenblatt, H.M., Almog, O., Maras, B., Spungin-Bialik, A., Barra, D., Blumberg, S. & Shoham, G. (1997). Streptomyces griseus aminopeptidase: X-ray crystallographic structure at 1.75 A resolution. J. Mol. Biol 265, pp. 620-636. 4. Prescott, J.M., Wilkes, S.H., Wagner, F.W. & Wilson, K.J. (1971). Aeromonas aminopeptidase. Improved isolation and some physical properties. J. Biol. Chem. 246, pp. 1756-1764. 5. Spungin, A. & Blumberg, S. (1989). Streptomyces griseus aminopeptidase is a calcium-activated zinc metalloprotein. Purification and properties of the enzyme. Eur J Biochem 183, pp. 471-477. 6. Burley, S.K., David, P.R., Taylor, A. & Lipscomb, W.N. (1990). Molecular structure of leucine aminopeptidase at 2.7-A resolution. Proc Natl Acad Sci U S A 87, pp. 6878- 6882. 7. Jones, T.A., Zou, J.Y., Cowan, S.W. & Kjeldgaard (1991). Improved methods for building protein models in electron density maps and the location of errors in these models. Acta Crystallogr A 47 ( Pt 2), pp. 110-119. 8. Franzetti, B., Schoehn, G., Hernandez, J.F., Jaquinod, M., Ruigrok, R.W. & Zaccai, G. (2002). Tetrahedral aminopeptidase: a novel large protease complex from archaea. Embo J 21, pp. 2132-2138. 9. Lowther, W.T. & Matthews, B.W. (2002). Metalloaminopeptidases: common functional themes in disparate structural surroundings. Chem Rev 102, pp. 4581-4608.

Chapter III

Anomalous scattering of tantalum- and platinum-derivatized crystals

Before we were able to solve the crystal structure of FrvX by molecular replacement, we tried to determine phases by multiwavelength anomalous dispersion (MAD). We performed MAD experiments on tantalum- and platinum-derivatized crystals. The heavy-atom compounds used were a Ta6Br14 cluster (TaBr) and di-µ-iodobis(ethylenediamine)diplatinum(II) nitrate (PIP).

Experimental Procedures

FrvX was expressed, purified and crystallized as described in Chapter I. The resulting crystals were incubated in a 5 µl drop containing mother liquor (0.1 M Na-citrate pH 6.1, 18% (w/v) PEG 400, 0.1 M NaCl) and 1 mM TaBr or PIP at 18°C for 10-14 hours. The soaked crystals were cryoprotected prior to data collection by raising the PEG 400 concentration to 35% (v/v) and flash-cooled in a nitrogen stream at 110 K. A MAD experiment on a TaBr-derivatized crystal was performed on beamline BW7A at the EMBL outstation at DESY in Hamburg using a MAR-165 CCD detector (Marresearch, Hamburg, Germany). A dataset of a PIP-derivatized crystal was measured in-house at 110 K on a Rigaku RU300 rotating-anode X-ray source using a RaxisIV image plate area detector. A MAD experiment on a similar PIP crystal was carried out on beamline BM14 at the ESRF in Grenoble on a MAR Mosaic 225 CCD detector (Marresearch, Hamburg, Germany). Exposure times varied from 10 min (rotating anode) to 5 s (ESRF) for an oscillation angle of 0.2°. The datasets were integrated and scaled with XDS [1-3]. Heavy atom positions were determined with CNS [4] or SHELXD [5]. Phase calculation and solvent flattening were performed using the program SHARP [6]. Electron-density maps were computed using the CCP4 software package [7] and inspected with the program O [8]. Difference Fourier maps were calculated using CNS. Patterson map predictions were made with XtalView [9].

Results and Discussion

Native FrvX crystals belong to spacegroup P63 with cell constants of a = b = 158 Å, c = 114 Å (Figure 1). They diffract to a resolution of about 2.1 Å at a synchrotron source. TaBr-soaked crystals possess a slightly extended unit cell with typical cell constants of about a = b = 161 Å, c = 116 Å and diffract to lower resolution (~2.8 Å).

Figure 1: Diffraction quality crystals of FrvX used for soaking experiments (scale bar: 200 µm).

Binding of TaBr to the crystals could be observed through a significant color change from colorless to dark green. Data collection statistics of a Ta-derivatized crystal measured at DESY in Hamburg are reported in Table 1.

Table 1: Data collection statistics of Ta-derivative

Crystal parameters: P63; a = b = 161.0 Å, c = 116.7 Å; 4 molecules / a.s.u.
Data collection (XDS):

| | Ta peak | Ta inflection | Ta remote |
|---|---|---|---|
| Beamline | BW7A, DESY | BW7A, DESY | BW7A, DESY |
| Wavelength (Å) | 1.2575 | 1.2590 | 1.1716 |
| Resolution range (Å) a | 20-3.50 (3.71-3.50) | 20-2.72 (2.88-2.72) | 20-2.87 (3.09-2.87) |
| No. observations | 300100 | 556042 | 471361 |
| No. unique reflections b | 41502 | 87347 | 76789 |
| Completeness (%) | 97.3 (90.8) | 96.0 (80.5) | 99.2 (99.1) |
| Rsym (%) c | 5.0 (7.3) | 4.6 (48.9) | 6.0 (25.6) |
| I/σ(I) | 32.65 (16.80) | 30.75 (3.33) | 24.92 (5.37) |

a The values in parentheses of resolution range, completeness, Rsym and I/σ(I) correspond to the outermost resolution shell.
b Friedel pairs were treated as different reflections.
c Rsym = ΣhklΣj|I(hkl;j) - 〈I(hkl)〉| / (ΣhklΣj〈I(hkl)〉), where I(hkl;j) is the jth measurement of the intensity of the unique reflection (hkl) and 〈I(hkl)〉 is the mean over all symmetry-related measurements.

One TaBr-site per asymmetric unit (a.s.u.) could be located using the anomalous signal of the peak dataset to 5 Å resolution. In spacegroup P63, as in all polar spacegroups, a single heavy-atom position in the asymmetric unit results in a centrosymmetric heavy atom substructure.
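The centrosymmetry of a single-site substructure in P63 can be checked numerically. The following is a minimal sketch, not part of the original analysis: it generates the six general positions of P63 for one site and tests whether inversion through the point (0, 0, z + 1/4) maps the set onto itself. The fractional coordinates of the site are hypothetical and chosen only for illustration; the function names are likewise illustrative.

```python
from fractions import Fraction as Fr


def p63_equivalents(x, y, z):
    """General positions of space group P6_3 in fractional coordinates."""
    half = Fr(1, 2)
    return [
        (x, y, z),
        (-y, x - y, z),
        (-x + y, -x, z),
        (-x, -y, z + half),
        (y, -x + y, z + half),
        (x - y, x, z + half),
    ]


def wrap(point):
    """Reduce a fractional coordinate triple into the unit cell [0, 1)."""
    return tuple(c % 1 for c in point)


def is_centrosymmetric_about(sites, centre):
    """True if inversion through `centre` maps the set of sites onto itself."""
    original = {wrap(s) for s in sites}
    inverted = {
        wrap(tuple(2 * c - s_i for c, s_i in zip(centre, s))) for s in sites
    }
    return original == inverted


# Hypothetical heavy-atom site (illustration only, not the refined TaBr position).
site = (Fr(13, 100), Fr(41, 100), Fr(7, 100))
sites = p63_equivalents(*site)

# Candidate inversion centre on the 6_3 axis, a quarter cell above the site.
centre = (Fr(0), Fr(0), site[2] + Fr(1, 4))
print(is_centrosymmetric_about(sites, centre))
```

Exact fractions are used instead of floating-point numbers so that the comparison of symmetry-equivalent positions does not depend on a tolerance.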

An electron-density map obtained through phase information of a heavy atom substructure is not the true electron density. In the case of SIR phasing, the electron-density map is a superposition of the true electron density and the inverse of the true electron density convoluted with the Fourier transform of $\exp(2i\varphi_{\mathrm{sub}})$. The Fourier coefficients of this map are then given by $F + F^{*}\exp(2i\varphi_{\mathrm{sub}})$, where $\varphi_{\mathrm{sub}}$ are the phases of the heavy-atom substructure, $F$ the complex structure factors and $F^{*}$ their complex conjugates [10]. An electron-density map obtained through SAD phasing is a superposition of the true electron density and the negative inverse of the true electron density convoluted with the same Fourier transform. The Fourier coefficients are then $F - F^{*}\exp(2i\varphi_{\mathrm{sub}})$. Normally, if the substructure is not centrosymmetric, the second terms only contribute to the noise of the map. However, if the substructure has a centrosymmetric configuration in a non-centrosymmetric crystal, $\varphi_{\mathrm{sub}}$ is restricted to 0 or π (with the origin at the centre of symmetry), so that $\exp(2i\varphi_{\mathrm{sub}}) = 1$ and the coefficients reduce to $F \pm F^{*}$. The resulting map is then the superposition of the true electron density and its exact inverse or negative inverse, and interpretation of the map is significantly more difficult. In the case of MAD phasing there is no twofold ambiguity in the phase determination, unlike in SIR or SAD. Here, if the heavy atom substructure is centrosymmetric, the handedness cannot be determined. In both hands the resulting electron density combines elements of the true electron density and the inverse electron density. In this case neither inspection of the electron-density map after solvent flattening nor statistical parameters reveal information about the correct solution. Therefore, the phases calculated from the located TaBr-sites were not sufficient to yield an interpretable electron-density map, and data from a second heavy atom derivative were needed.

As second heavy-atom compound, di-µ-iodobis(ethylenediamine)diplatinum(II) nitrate (PIP) was chosen. PIP-soaked crystals typically possess cell constants of about a = b = 158 Å, c = 114.5 Å and diffract to a resolution of about 2.7 Å. A PIP-soaked crystal was measured in-house to a resolution of 2.74 Å. Data collection statistics are reported in Table 2.

Table 2: Data collection statistics of Pt-derivatives

Crystal parameters (Pt L3 peak, Pt L1 peak, Pt remote): P63; a = b = 157.6 Å, c = 114.5 Å; 4 molecules / a.s.u.
Crystal parameters (Pt in house): P63; a = b = 158.4 Å, c = 114.5 Å; 4 molecules / a.s.u.
Data collection (XDS):

| | Pt L3 peak | Pt L1 peak | Pt remote | Pt in house |
|---|---|---|---|---|
| Beamline | BM14, ESRF | BM14, ESRF | BM14, ESRF | in house |
| Wavelength (Å) | 1.0715 | 0.8856 | 0.9868 | 1.5418 |
| Resolution range (Å) a | 20-2.70 (2.86-2.70) | 20-2.70 (2.86-2.70) | 20-2.70 (2.86-2.70) | 20-2.74 (2.91-2.74) |
| No. observations | 331950 | 375178 | 335062 | 282992 |
| No. unique reflections b | 85971 | 86564 | 86416 | 79479 |
| Completeness (%) | 98.4 (95.9) | 99.1 (99.0) | 98.9 (98.3) | 94.1 (78.2) |
| Rsym (%) c | 5.0 (30.7) | 7.6 (51.2) | 8.1 (55.5) | 7.7 (27.2) |
| I/σ(I) | 19.86 (4.32) | 16.23 (2.86) | 14.49 (2.45) | 16.30 (3.75) |

a The values in parentheses of resolution range, completeness, Rsym and I/σ(I) correspond to the outermost resolution shell.
b Friedel pairs were treated as different reflections.
c Rsym = ΣhklΣj|I(hkl;j) - 〈I(hkl)〉| / (ΣhklΣj〈I(hkl)〉), where I(hkl;j) is the jth measurement of the intensity of the unique reflection (hkl) and 〈I(hkl)〉 is the mean over all symmetry-related measurements.
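The Rsym definition in footnote c can be made concrete with a short sketch. It assumes that each observation has already been reduced to its unique index (hkl) and follows the formula literally, so singly measured reflections contribute to the denominator only. The function name and the toy intensities are hypothetical; the values reported in the tables were computed by XDS, not by this code.

```python
from collections import defaultdict


def r_sym(observations):
    """Rsym for an iterable of ((h, k, l), intensity) pairs.

    Each (h, k, l) is assumed to be the unique index to which the
    measurement has already been mapped by symmetry.
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    numerator = 0.0    # sum over hkl and j of |I(hkl;j) - <I(hkl)>|
    denominator = 0.0  # sum over hkl and j of <I(hkl)>
    for intensities in groups.values():
        mean = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean) for i in intensities)
        denominator += mean * len(intensities)
    return numerator / denominator


# Toy data (illustration only): two unique reflections, two measurements each.
toy = [
    ((1, 0, 0), 100.0),
    ((1, 0, 0), 110.0),
    ((0, 1, 0), 50.0),
    ((0, 1, 0), 50.0),
]
toy_r_sym = r_sym(toy)
print(f"Rsym = {100 * toy_r_sym:.1f} %")
```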

Four PIP sites could be located in the asymmetric unit using the anomalous signal of the crystal measured in house to 5 Å resolution. Initially, only the PIP positions were used for phasing. The quality of the resulting phases was confirmed by an anomalous difference Fourier calculation using the anomalous Ta-peak dataset. For this, an imaginary difference Fourier map was calculated using the Ta-peak data and the phases derived from the PIP positions. This revealed the position of the anomalous scatterer TaBr. A Patterson map prediction of this position overlaps perfectly with the anomalous Patterson map calculated from the Ta-peak dataset (Figure 2). Hence, the PIP phases identified the correct TaBr-site.


Figure 2: Superposition of the anomalous Patterson map of the Ta-peak dataset with labels of the peaks of a predicted Patterson map calculated from the position of the major peak of the cross-difference Fourier map. (a) Harker section at z = 0. (b) Harker section at z = 0.5. The Figure was prepared using XtalView [9].
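The two Harker sections named in the caption follow directly from the symmetry of P63: every vector between symmetry mates of a single site has w = 0 or w = 1/2. The sketch below illustrates this by listing the predicted self-vectors of one site. It is an illustrative calculation under stated assumptions, not the procedure implemented in XtalView, and the site coordinates are hypothetical.

```python
from fractions import Fraction as Fr
from itertools import permutations


def p63_equivalents(x, y, z):
    """General positions of space group P6_3 in fractional coordinates."""
    half = Fr(1, 2)
    return [
        (x, y, z),
        (-y, x - y, z),
        (-x + y, -x, z),
        (-x, -y, z + half),
        (y, -x + y, z + half),
        (x - y, x, z + half),
    ]


def harker_vectors(site):
    """Group the Patterson self-vectors of one site by their w coordinate."""
    mates = p63_equivalents(*site)
    sections = {Fr(0): set(), Fr(1, 2): set()}
    for a, b in permutations(mates, 2):
        # Interatomic vector a - b, reduced into the unit cell.
        u, v, w = ((p - q) % 1 for p, q in zip(a, b))
        sections[w].add((u, v))
    return sections


# Hypothetical heavy-atom site (illustration only).
site = (Fr(13, 100), Fr(41, 100), Fr(7, 100))
sections = harker_vectors(site)
for w, vectors in sections.items():
    print(f"w = {float(w):.1f}: {len(vectors)} distinct (u, v) peaks")
```

Because the z coordinate of the site cancels in every difference vector, the predicted peaks depend only on x and y, which is why a single site cannot fix the origin along the polar c axis.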

However, the electron-density map obtained from the PIP data was not interpretable. At this point, the TaBr and PIP data were combined to profit from the large anomalous signal obtained from the TaBr data and thereby improve the phases. Although SHARP yielded good phasing statistics, the resulting electron-density map was not interpretable.

Another PIP-derivatized crystal was measured at the ESRF in Grenoble to a resolution of 2.7 Å. Data collection statistics are reported in Table 2. Due to various problems at the beamline, only an incomplete MAD experiment without data collection at the inflection point could be performed. The data of this crystal were confirmed in the same way as the in-house PIP data and identified the correct TaBr site. The data were then scaled and merged with the existing native dataset, the TaBr data and the in-house PIP data and used for phasing.

A few different phasing scenarios were tested. First, the PIP MAD data alone were used to obtain phases. As this did not lead to an interpretable electron-density map, the data were combined with either the TaBr MAD data or the in-house PIP data or with all available data. The anomalous phasing power obtained from SHARP was in all cases acceptable to about 5 Å resolution. Nevertheless, none of the tested combinations resulted in an interpretable electron-density map.

A possible reason for this could be the rather poor quality of the PIP data measured at the ESRF. The observed anomalous signal calculated by XDS (data not shown) was quite low in all three collected datasets. Moreover, the crystal seemed to suffer from radiation damage, since the anomalous signal dropped significantly during the experiment. In addition, both platinum derivatives displayed only low isomorphous phasing power, and hence isomorphous differences could not be exploited.

References

1. Kabsch, W. (1988). Evaluation of single crystal x-ray diffraction data from a position sensitive detector. J Appl Cryst 21, pp. 916-924. 2. Kabsch, W. (1988). Automatic indexing of rotation diffraction patterns. J Appl Cryst 21, pp. 67-71. 3. Kabsch, W. (1993). Automatic processing of rotation diffraction data from crystals of initially unknown symmetry and cell constants. J Appl Cryst 26, pp. 795-800. 4. Brunger, A.T., Adams, P.D., Clore, G.M., DeLano, W.L., Gros, P., Grosse-Kunstleve, R.W., Jiang, J.S., Kuszewski, J., Nilges, M., Pannu, N.S., Read, R.J., Rice, L.M., Simonson, T. & Warren, G.L. (1998). Crystallography & NMR system: A new software suite for macromolecular structure determination. Acta Crystallogr D Biol Crystallogr 54 ( Pt 5), pp. 905-921. 5. Schneider, T.R. & Sheldrick, G.M. (2002). Substructure solution with SHELXD. Acta Crystallogr D Biol Crystallogr 58, pp. 1772-1779. 6. La Fortelle, E. & D., B. (1997). Maximum-likelihood heavy-atom parameter refinement for multiple isomorphous replacement and multiwavelength anomalous diffraction methods. Methods Enzymol 276, pp. 472-494. 7. Collaborative Computational Project, N. (1994). The CCP4 suite: programs for protein crystallography. Acta Crystallogr D Biol Crystallogr 50, pp. 760-763. 8. Jones, T.A., Zou, J.Y., Cowan, S.W. & Kjeldgaard (1991). Improved methods for building protein models in electron density maps and the location of errors in these models. Acta Crystallogr A 47 ( Pt 2), pp. 110-119. 9. McRee, D.E. (1999). XtalView/Xfit--A versatile program for manipulating atomic coordinates and electron density. J Struct Biol 125, pp. 156-165. 10. Grosse-Kunstleve, R.W. & Adams, P.D. (2003). On symmetries of substructures. Acta Crystallogr D Biol Crystallogr 59, pp. 1974-1977.

Chapter IV

Trigonal crystal form

The protein solution which yielded the hexagonal crystals used for structure determination was purified by heat treatment and ion exchange chromatography. In initial trials though, the protein was purified by Ni-NTA affinity chromatography. Moreover, in contrast to the protein sample allowing successful structure determination, the His tag was subsequently removed by thrombin digestion. Crystallization of this protein solution yielded crystals which belong to a trigonal spacegroup and diffract only to low resolution.

Experimental Procedures

The N-terminally His6-tagged FrvX construct used for structure solution was expressed in E. coli Rosetta™ (DE3) cells (Novagen) according to the protocol described in Chapter I. The harvested cells were resuspended in 20 mM Tris-HCl, pH 8.0, 300 mM NaCl, 10 mM imidazole, 0.02% sodium azide and disrupted by sonication. After centrifugation at 41000 g for 30 minutes, the supernatant was applied to a column containing Ni-NTA SUPERFLOW medium (Qiagen) and FrvX was isolated according to the manufacturer’s instructions. The elution fraction was incubated with thrombin (1 U/mg) overnight to cleave off the His tag. A second Ni-NTA purification step was carried out to separate digested FrvX from uncut protein. Truncated FrvX was further purified by gel-filtration chromatography using a Superdex 75 column (Amersham Biosciences) equilibrated in 20 mM Tris-HCl, pH 8.0, 100 mM NaCl, 2 mM DTT, 0.02% sodium azide. All purification steps were analyzed by SDS-PAGE [1]. The protein concentration was determined by Bradford assay (BioRad) [2]. Typical yields were 20 mg/l pure protein.

FrvX was crystallized by the vapour diffusion method in sitting drops. Droplets containing 2 µl protein solution (8 mg/ml) and 1 µl reservoir solution were equilibrated against 100 µl of reservoir solution at 18°C. Crystals were cryoprotected by addition of 30% (v/v) glycerol to the reservoir solution and flash-cooled in liquid nitrogen. One dataset was collected at the Swiss Light Source at 100 K on beamline X06SA on a MAR CCD 165 detector (Marresearch GmbH, Hamburg, Germany). The data were processed with the program XDS [3-5]. Calculation of a self-rotation function was performed using MOLREP [6]. Molecular replacement trials were carried out with the programs MOLREP and PHASER [7].

Results and Discussion

The first small crystals grew from a Crystal Screen I condition from Hampton Research (50 mM glycine, pH 2.9, 2% (w/v) PEG 8000, 1.0 M lithium sulphate). The crystals are shown in Figure 1a. Optimization of the condition by grid screening of pH and salt concentration yielded the crystals shown in Figure 1b (50 mM glycine, pH 3.1, 2% (w/v) PEG 8000, 1.0 M lithium sulphate). They were of an average size of 200 x 100 x 100 µm3 and belonged to spacegroup P31/221 (that is, P3121 or its enantiomorph P3221) with cell constants of a = b = 160.0 Å, c = 418.2 Å.

Figure 1: Trigonal crystals of FrvX. (a) Initial crystals (scale bar: 200 µm). (b) Optimized crystals (scale bar: 300 µm).

This trigonal crystal form diffracted to a maximal resolution of 5.0 Å at the SLS. A diffraction image is shown in Figure 2 and data collection statistics are given in Table 1 of chapter I.

Despite efforts to improve the crystals, either by screening of drop ratio, protein concentration and different PEGs at different concentrations, or by Additive Screens I-III and Detergent Screens I-III from Hampton Research, as well as by seeding techniques and dehydration methods, the resolution could not be improved. This may be explained partially by the dimensions of the unit cell but still hints at significant disorder in the crystal lattice.

Figure 2: Diffraction image of FrvX collected on beamline X06SA at the SLS. Exposure time: 1s, crystal- to-detector distance: 450 mm, oscillation angle: 0.5°. The edge of the detector is at 5.0 Å resolution.

The calculation of the Matthews coefficient suggests that the asymmetric unit contains twelve subunits and 62.4% solvent (Table 1). A self-rotation function (Figure 3) shows the presence of non-crystallographic three-fold and two-fold axes, which is in agreement with a tetrahedral particle.

Table 1: Matthews coefficients of FrvX in spacegroup P31/221

| Nmol/a.s.u. | Matthews coefficient (Å3/Da) | Solvent (%) |
|---|---|---|
| 10 | 4.0 | 68.7 |
| 11 | 3.6 | 65.5 |
| 12 | 3.3 | 62.4 |
| 13 | 3.0 | 59.3 |
| 14 | 2.8 | 56.1 |
| 15 | 2.6 | 53.0 |
| 16 | 2.5 | 49.9 |
| 17 | 2.3 | 46.7 |
| 18 | 2.2 | 43.6 |
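The entries of Table 1 can be approximated with a short calculation. The sketch below makes several assumptions that are not stated in this chapter: a subunit mass of about 39 kDa (a hypothetical round value), six symmetry operators for P3121/P3221, and the conventional estimate of the solvent fraction as 1 - 1.23/Vm. Small deviations from the tabulated values are therefore to be expected.

```python
import math


def hexagonal_cell_volume(a, c):
    """Volume (Å^3) of a trigonal/hexagonal cell with a = b and gamma = 120°."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews(volume, n_sym_ops, n_mol, mass_da):
    """Return (Matthews coefficient in Å^3/Da, estimated solvent fraction)."""
    vm = volume / (n_sym_ops * n_mol * mass_da)
    solvent_fraction = 1.0 - 1.23 / vm  # conventional protein-density constant
    return vm, solvent_fraction


# Cell constants from the text; the subunit mass is an ASSUMED round value.
A_AXIS, C_AXIS = 160.0, 418.2     # Å
N_SYM_OPS = 6                     # general positions in P3(1)21 / P3(2)21
SUBUNIT_MASS_DA = 39000.0         # hypothetical, for illustration only

volume = hexagonal_cell_volume(A_AXIS, C_AXIS)
for n_mol in range(10, 19):
    vm, solvent = matthews(volume, N_SYM_OPS, n_mol, SUBUNIT_MASS_DA)
    print(f"{n_mol:2d}  Vm = {vm:.1f} Å^3/Da  solvent = {100 * solvent:.1f} %")

vm_12, solvent_12 = matthews(volume, N_SYM_OPS, 12, SUBUNIT_MASS_DA)
```

Because Vm scales inversely with the number of subunits, a single row is enough to check the whole table once the subunit mass is known.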


Figure 3: Self-rotation functions of FrvX in spacegroup P31/221 at Chi = 180°, 120°, 90°, and 60°

The large cell constants, the Matthews coefficient and the result of the self-rotation function indicate that the asymmetric unit of the trigonal crystal form contains one whole dodecameric tetrahedron.

After the crystal structure was solved with data from the hexagonal crystal form, we tried to determine the structure of FrvX in the trigonal spacegroup to 5.0 Å resolution by molecular replacement. Initially, the whole tetrahedron was used as search model. Since this did not lead to a solution, further trials were performed using either four subunits, the edge-dimer or a monomer as search probe. However, neither MOLREP nor PHASER was able to locate one of the search models. It is likely that severe disorder and defects in the crystal lattice are the reasons for the failure to solve the structure.

References

1. Laemmli, U.K. (1970). Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature 227, pp. 680-685. 2. Bradford, M.M. (1976). A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding. Anal Biochem 72, pp. 248-254. 3. Kabsch, W. (1988). Evaluation of single crystal x-ray diffraction data from a position sensitive detector. J Appl Cryst 21, pp. 916-924. 4. Kabsch, W. (1988). Automatic indexing of rotation diffraction patterns. J Appl Cryst 21, pp. 67-71. 5. Kabsch, W. (1993). Automatic processing of rotation diffraction data from crystals of initially unknown symmetry and cell constants. J Appl Cryst 26, pp. 795-800. 6. Vagin, A. & Teplyakov, A. (1997). MOLREP: an automated program for molecular replacement. J Appl Cryst 30, pp. 1022-1025. 7. Storoni, L.C., McCoy, A.J. & Read, R.J. (2004). Likelihood-enhanced fast rotation functions. Acta Crystallogr D Biol Crystallogr 60, pp. 432-438.

Acknowledgements

I would like to thank Prof. Ulrich Baumann for his excellent supervision and, in particular, for the pleasant atmosphere in the research group, which he creates and makes possible as its head.

I thank Prof. Bernhard Erni for his constant willingness to talk.

I sincerely thank Dr. Anselm Oberholzer, Dr. Mario Bumann, Reto Meier, Christoph Bieniossek, Dr. Paola Luci and Dr. Achim Stocker for their helpfulness and for the pleasant and stimulating collaboration.

I thank Dr. Johann Schaller, Urs Kämpfer, Adrian Schindler and Michael Locher for the excellent mass spectrometry and sequencing service.

Last but not least, I thank my parents, who have always encouraged and supported me.


Curriculum Vitae

Santina Russo

Personal Data: Date of birth: 17th September 1977 Nationality: Swiss/Italian

Scholastic Education:

1987 – 1991 Primary school in Lyssach, Switzerland

1989 – 1991 Secondary school in Kirchberg, Switzerland

1991 – 1997 High school (gymnasium) and graduation (A-levels) in Burgdorf, Switzerland

Academic Education:

Oct. 1997–Nov. 2001 Studies in chemistry, major subject biochemistry, University of Bern, Switzerland.

April 2001–Nov. 2001 Master thesis under the supervision of Prof. U. Baumann, Department of Chemistry and Biochemistry, University of Bern, Switzerland. Thesis: “Structural and Functional Studies of Nitrilotriacetate-Monooxygenase from Chelatobacter heinzii”

Nov. 2001 M.Sc. degree in chemistry at the University of Bern.

Nov. 2001-Feb. 2005 PhD in the group of Prof. Dr. U. Baumann at the Department of Chemistry and Biochemistry, University of Bern, Switzerland. Thesis: “Structural and functional studies on a dodecameric tetrahedral-shaped aminopeptidase”
