Research Collection

Doctoral Thesis

Structural and Mechanistic Characterization of the Bacterial -related Bpa and Dop

Author(s): Bolten, Marcel

Publication Date: 2016

Permanent Link: https://doi.org/10.3929/ethz-a-010814710

Rights / License: In Copyright - Non-Commercial Use Permitted

This page was generated automatically upon download from the ETH Zurich Research Collection. For more information please consult the Terms of use.

ETH Library DISS. ETH NO. 23817

STRUCTURAL AND MECHANISTIC CHARACTERIZATION OF THE BACTERIAL PROTEASOME-RELATED PROTEINS BPA AND DOP

A thesis submitted to attain the degree of

DOCTOR OF SCIENCES of ETH ZURICH

(Dr. sc. ETH Zurich)

presented by

MARCEL BOLTEN

M.Sc. in Chemical Biology, TU Dortmund

born on 25.07.1986

citizen of Germany

accepted on the recommendation of

Prof. Dr. Eilika Weber-Ban Prof. Dr. Rudi Glockshuber Prof. Dr. Donald Hilvert

2016

Chapter3 of this thesis has appeared in the following publication:

Bolten, M.1; Delley, C.L.1; Leibundgut, M.; Boehringer, D.; Ban, N. & Weber-Ban, E. Structural analysis of the bacterial proteasome activator Bpa in complex with the 20S proteasome. Structure 2016, 24, 2138–2151 doi: 10.1016/j.str.2016.10.008

1contributed equally

Zusammenfassung

Proteinhomöostase ist ein fundamentaler Prozess des Lebens und beschreibt die Kontrolle der Proteinkonzentrationen und der Funktionalität des Proteoms in le- benden Zellen. Ein wichtiger Aspekt dieses Kontrollprozesses ist der Proteinabbau, bei welchem Proteasen selektiv beschädigte oder nicht länger benötigte Proteine verdauen. Diese Aufgabe wird üblicherweise von kompartmentalisierenden Pro- teasekomplexen übernommen, die abgeschlossene Abbaukompartimente mit Zu- gangskontrolle besitzen, wodurch deren proteolytische Aktivität gegenüber dem Zytosol abgeschottet ist. In einigen Actinobakterien wurde einer dieser makromo- lekularen Komplexe, das Proteasom, gefunden, das für gewöhnlich nur in Eukaryo- ten und Archaeen vorkommt. Das Proteasom besteht aus einem Hohlzylinder mit siebenzähliger Symmetrie, in dessen Innerem sich die katalytisch aktiven Zentren befinden. Um den aktiven Proteasomkomplex zu bilden, assoziiert dieses soge- nannte “core particle”, (CP) mit ringförmigen Regulatoren, die für die Erkennung abzubauender Proteinsubstrate verantwortlich sind. In Actinobakterien sind zwei Regulatorkomplexe mit unterschiedlicher Funktion bekannt, die mycobakterielle, proteasomale ATPase (Mpa) und der bakterielle Proteasomeaktivator (Bpa). Bpa besitzt eine Interaktionsplattform für partiell entfaltete Proteine, vermittelt ATP- unabhängig deren weitere Entfaltung und die Einfädelung in den Hohlzylinder des CP. Dagegen ist Mpa ein durch ATP-Hydrolyse angetriebener Komplex, wel- cher Proteine, die post-translational kovalent mit dem prokatyotischen ubiquitin- ähnlichen (Pup) modifiziert sind, rekrutiert, und unter Umsatz von ATP aktiv entfaltet. Im Gegensatz zum gut untersuchten Mpa-Komplex ist nur wenig über das erst kürzlich entdeckte Bpa bekannt. In der vorliegenden Arbeit wurde die dreidimensionale Struktur von Bpa durch Röntgenkristallographie mit einer Auflösung von 2.6 Å aufgeklärt. Der Bpa-Kom- plex bildet eine Ring aus zwölf Untereinheiten, die eine Vier-Helix-Bündel Faltung einnehmen. Die C-terminalen Segmente der Protomere, die das für die Bindung am CP essentielle C-terminale Tetrapeptidmotif GQYL tragen, treten alle auf der

iii gleichen Seite des Bpa-Ringkomplexes hervor. Die Bpa-CP Berührungsfläche ist somit ein weiteres Beispiel für die Wechselwirkung zwischen Ringsystemen mit nicht übereinstimmender Symmetrie. Mpa-abhängiger Proteinabbau erfordert eine als Pupylierung bezeichnete post- translationale Modifikation der abzubauende Proteine. In diesem Prozess wird durch die Ligase PafA eine Isopeptidbindung zwischen dem C-terminalen Glutama- trest des kleinen Proteins Pup und einer Lysinseitenkette des Substratproteins aus- gebildet. In einigen Actinobakterien ist Pup als Protein mit einem C-terminalen Glutamin kodiert, welches vor der Ligation erst zu Glutamat umgewandelt werden muss. Die Deamidase Dop (deamidase of Pup) katalysiert diese Reaktion und weist zusätzlich Depupylierungsaktivität auf. PafA und Dop sind paraloge Enzyme mit hoher Struktur- und Sequenzhomologie und sehr ähnlich aufgebauten katalytischen Zentren. Während bei PafA der ATP-Umsatz stöchiometrisch mit der Ligations- reaktion gekoppelt ist, ist bei Dop zwar die Bindung von ATP für die Aktivität erforderlich, jedoch erfolgt keine stöchiometrische Umsetzung von ATP. Im zweiten Teil dieser Arbeit wird der Frage nachgegangen, warum Dop ATP be- nötigt. Mit Hilfe von Röntgenkristallographie und biochemischen Methoden wird gezeigt, dass die katalytische Aktivität von Dop abhängig von ADP und anorga- nischem Phosphat ist und nicht von ATP. Bei der Depupylase-Aktivität von Dop fungiert das anorganische Phosphat als Nukleophil bei der Spaltung der Amid- bindung. Somit ist der katalytische Mechanismus von Dop demjenigen von PafA ähnlicher als zuvor angenommen. Die vorgestellten Ergebnisse liefern aus verschiedenen Blickwinkeln neue und in- teressante Erkenntnisse zum proteasomalen Proteinabbau in Mykobakterien. Beide Teile der Arbeit liefern strukturelle Details in atomarer Auflösung und bilden da- her die Grundlage für die Entwicklung neuer Medikamente, die auf der Inhibition des für die Virulenz von Mycobacterium tuberculosis essentiellen proteasomalen Proteinabbaus beruhen.

iv Abstract

Protein homeostasis is a fundamental process of life and describes the control of the protein levels and integrity of the proteome in living cells. An important part of this controll process is protein degradation, where proteases digest selectively damaged or no longer needed proteins. This activity is usually carried out by compartmentalizing proteases that form reaction chambers which are under con- trolled access and shield the proteolytic activity from the rest of the cytosol. It was found that some actinobacteria harbor one of these macromolecular assemblies that is usually only found in eukaryotes or archaea, namely the proteasome. The proteasome consists of a proteolytic core cylinder with seven fold symmetry, also called core particle (CP), and ring-shaped regulators that associate with the CP to form the active proteasome complex. In actinobacteria two regulators, Mpa (mycobacterium proteasome ATPase) and Bpa (bacterial proteasome activator), are known, which can be distinguished by their mode of action. Bpa provides an interaction platform for partially unfolded proteins and relies on a passive, ATP-independent, further unfolding to ultimately degrade proteins. Mpa is an ATPase-driven complex that recruits proteins covalently modified with prokary- otic ubiquitin-like protein (Pup) and actively unfolds them under expense of ATP. In contrast to the well-studied Mpa only limited knowledge is available for the recently discovered Bpa. In the first part of this thesis I solved the Bpa structure by X-ray crystallography at a resolution of 2.6 Å. The structure shows that Bpa is a dodecameric ring assem- bly formed by protomers with a four-helix bundle fold. The twelve C-termini bear a conserved GQYL interaction motif that protrudes on one side of the ring in order to dock onto the CP cylinder. Interestingly, the Bpa-CP interface is an example of a ring-ring interaction with a significant twelve to seven symmetry mismatch. Protein degradation mediated by Mpa requires a post-translational modification of degradation substrates referred to as pupylation. In this process the C-terminal glutamate of the small protein Pup is attached via an isopeptide bond to a lysine

v side chain of the protein substrates by the Pup ligase PafA. In some actinobacteria Pup is genetically encoded with a C-terminal glutamine that needs to be converted to glutamate prior to ligation. Dop (deamidase of Pup) is catalyzing this reaction and features in addition depupylase activity as well. PafA and Dop are paralogs with a high structural and sequence homology and they even have identical residues forming the active site. However, PafA shows stoichiometric ATP turnover along with the ligation reaction, whereas Dop needs ATP for proper function without stoichiometric turnover. In the second part of this thesis, the puzzling question why Dop needs ATP is addressed. Using a combination of X-ray crystallography and biochemical methods it is shown that the catalytic activity of Dop depends on ADP and inorganic phosphate rather than on ATP. The inorganic phosphate acts as a nucleophile during amide bond cleavage and the overall catalytic mechanism of Dop is therefore more similar to that of PafA than previously thought. In summary, the presented results provide new and interesting insights in the field of bacterial proteasome-mediated protein degradation from different points of view. Both parts provide structural details at atomic resolution and thus lay a foundation for the development of new drugs targeting the proteasomal protein degradation system of mycobacteria, that was shown to be essential for virulence of Mycobacterium tuberculosis.

vi Acknowledgments

First of all, I would like to thank my PhD supervisor Prof. Eilika Weber-Ban for giving me the opportunity to work on very interesting projects and for the trust she put in the ideas I developed during my time in her lab. I am grateful to Prof. Rudi Glockshuber and Prof. Donald Hilvert for being co- referees of my thesis and for the valuable input during the committee meeting. I like to thank Prof. Nenad Ban for many great advices in the field of X-ray crystallography. I really appreciate his interest and support while solving crystal- lographic problems. Furthermore I am grateful for the possibility to participate in the informal group meetings of the Ban-Lab. Thanks to Michał not only for taking time to discuss scientific things but also for being a great lab mate. I really enjoyed our time in the new established outpost 6 (E 2 =E3) of the Weber-Ban laboratory. Thanks to Cyrille for the close collaboration with constructive discussions on the Bpa project. I thank my lab mates Andreas, Cyrille, Dennis, Jannis, Jürg, Julia, Jonas, and Michał for the nice lab atmosphere, the after work beers and movie evenings. Thanks to my students Christian and Colette for their contributions to the Dop project. I got great support from the “structure experts” Marc Leibundgut, Chris Aylett, Jan Erzberger, and David Sargent: thank you very much. I thank all present members of the Glockshuber-Lab: Christoph, Dawid, Fabia, Hiang, Jonathan, Lena, Max, and Vreni for being always helpful and for the pro- ductive working atmosphere. I would like to thank the Ban-Lab for scientific input during their informal lab meetings. I am grateful to my family for the constant support over the last years. Your advices were always helpful and brought back my motivation at anytime. The gratitude I feel to my wife Lena cannot be expressed in words.

vii

Contents

Zusammenfassung iii

Abstractv

Acknowledgments vii

List of Figures xi

List of Tables xiii

1 Introduction1 1.1 The ubiquitin proteasome system...... 3 1.2 The bacterial proteasome...... 5 1.3 Pupylation – a post-translational modification for proteasomal de- gradation...... 9 1.4 PafA and Dop are members of the γ-carboxylate amine/ammonia ligase superfamily...... 13

2 Aim of the thesis 15

3 Structural analysis of Bpa in complex with the 20S proteasome 17 Summary...... 17 Introduction...... 18 Results...... 19 Discussion...... 32 Experimental procedures...... 36 Author contributions...... 43 Acknowledgments...... 43 Supplement...... 43

4 Depupylase Dop requires inorganic phosphate as the active site nu- cleophile 53 Abstract...... 53 Introduction...... 54 Results...... 55 Discussion...... 62

ix Experimental procedures...... 66 Acknowledgments...... 69 Conflict of interest...... 69 Supplement...... 69

5 Concluding remarks and outlook 71

Bibliography 75

Curriculum Vitae 87

x List of Figures

1.1 Compartmentalizing proteases of Mycobacterium tuberculosis in com- parison with the eukaryotic proteasome...... 2 1.2 Schematic overview of the ubiquitination cascade in eukaryotes...3 1.3 X-ray structure and catalytic mechanism of the Mycobacterium tu- berculosis proteasomal core particle...... 6 1.4 Mpa mediated proteasomal protein degradation...... 8 1.5 The pupylation cycle in actinobacteria...... 10 1.6 Crystal structures of the PafA · Pup complex and Dop...... 12 1.7 Crystal structures of glutamine synthetase and glutamate cysteine ligase...... 13

3.1 Bpa consists of a highly conserved α-helical core with disordered C- and N-termini...... 21 3.2 Bpa protomers fold into a four-helix bundle and assemble into a dodecameric ring...... 23 3.3 Assessment of Bpa-CP affinity by measuring Bpa-mediated activa- tion of the proteasome for casein degradation...... 25 3.4 Site-specific cross-linking stabilizes the Bpa-CP complex...... 27 3.5 Cryo-EM density map of the Bpa-CP complex...... 29 3.6 The C-terminal GQYL motif of Bpa gives rise to clearly discernable density in the proteasome α-rings and can be modeled based on the Blm10 HbYX motif...... 31 3.7 Passive and active proteasomal targeting in bacteria...... 34 3.S1 Diffraction data show that Bpa crystals suffer from a lattice translo- cation defect...... 44 3.S2 Improvement of the experimental phases in space group P6322 using 4-fold NCS averaging during the solvent flipping procedure..... 45 3.S3 Heavy atom phasing and structure determination in space group P6322...... 46 3.S4 Model of HisBpa36–159 refined in space group P3...... 47 3.S5 Model of Bpa36–159 refined in space group R32:h...... 47 3.S6 Procedure outline to obtain the cryo-EM 3D reconstructions.... 49 3.S7 Fourier shell correlation and local resolution estimates for the two EM reconstructions...... 50

xi 3.S8 Comparison of Bpa-occupied and Bpa-free faces of the Bpa-CP com- plex...... 52

4.1 Depupylation time course of PanB-Pup in the presence of different nucleotides...... 56 4.2 Crystal structure of nucleotide-bound Dop...... 59 4.3 ATP hydrolysis mediated by Dop is dependent on the C-terminal residue of Pup...... 60 4.4 Initial ATP hydrolysis is necessary for depupylation of model sub- strate Pup-Fl...... 61 4.5 Proposed catalytic mechanism of Dop...... 65 4.S1 ATP hydrolysis mediated by Dop in absence and presence of model substrate Pup-Fl...... 69

xii List of Tables

3.S1 X-ray data collection, phasing and refinement statistics for Bpa.. 48 3.S2 EM data collection and refinement statistics for Bpa...... 51

4.1 X-ray data collection and refinement statistics for Dop · ADP · Pi .. 58

xiii

Introduction 1

All living organisms are challenged by a constantly changing environment. While some species have developed the possibility to change their environment by active movement and thereby have influence to some extent on their surroundings, oth- ers depend more strongly on the conditions they are faced with. Nevertheless, in either case the adaptation to a changing environment is one of the features that defines living organisms. Adaptation is important because the external influences need to be counterbalanced so that homeostasis can be maintained (1 ). The abil- ity of an organism to react to an external stimulus —a changing environment— is ultimately limited by genetically encoded information. Protein biosynthesis is thus the main process converting DNA encoded information to functionality in the form of proteins. The complex interplay of proteins with each other, with other cellular components, including metabolites, allows an organism to rapidly adapt. Hence protein homeostasis (proteostasis) is of crucial importance in maintaining a func- tional interaction network of proteins (2 –4 ). Whereas DNA encoded information is converted to proteins exclusively by the ribosome, the folding and degradation of proteins is accomplished by a plethora of monomeric proteins and large protein complexes. As a consequence, all domains of life (bacteria, archaea and eukarya) harbor protein degradation machineries (5 , 6 ). While eukaryotes are the only organisms containing organelles called lysosomes which are used for digestion of all kinds of biopolymers, they share the appearance of the cytosolic proteasome with archaea and actinobacteria, a phylum of bacteria. It should be noted that the proteasome is rather an anomaly in bacteria as the majority of bacteria do not feature it (7 ). Bacterial protein degradation, also in actinobacteria, is primarily carried out by a selection of the canonical chaperone proteases ClpCP, ClpXP, FtsH, HslVU and Lon (8 –10 ). To prevent unspecific proteolysis in the cytosol, the protein degra- dation machineries are assemblies that form compartments harboring the prote-

1 Canonical bacterial Eukaryotic-like proteasomal Cytosolic

RP Mpa6 11S

α7 α1–7 α1–7 ClpC6 ClpX6 β7 β1–7 β1–7 CP CP ClpP27 ClpP27 β β β FtsH6 7 1–7 1–7

ClpP17 ClpP17 α7 α1–7 α1–7

Mpa 11S 6 RP

Figure 1.1: Compartmentalizing proteases of Mycobacterium tuberculosis in compar- ison with the eukaryotic proteasome. The illustrated protease complexes are composed of a central cylinder (light and dark salmon), harboring the active sites (red dots), and an ATPase- dependent (orange) or energy-independent (purple) interactor. The eukaryotic proteasome has evolved toward more complexity and has additional non-ATPase subunits (magenta) in the reg- ulatory particle (RP). Figure adapted from Imkamp et al. (9 ). olytic activity and shielding it from the cytosol. A comparative overview of the compartmentalizing proteases of Mycobacterium tuberculosis and the eukaryotic proteasome is given in Figure 1.1. As there are only a small number of compart- mentalizing proteases in each organism compared to the multitude of proteins, they need adapters, also called regulators, to mediate the recruitment of proteins selected for degradation to the proteases appears to be a logical consequence. In addition, the modular setup also allows separation of recruitment and proteolytic activity. A further challenge for the degradation system is the discrimination be- tween proteins that require degradation and those that should be left untouched. Taking into account that such a difference can be as small as an oxidation or loss of an functional group and that proteins continuously undergo transitions between different folding states, the challenge seems to be insuperable. Neverthe- less, the proteostasis network overcomes this obstacle by specialized enzymes that sense the degree of alteration that often comes along with a loss of protein sta- bility and therefore with (partial) unfolding. The altered proteins can be labeled for degradation by post-translational modifications like ubiquitination or pupyla- tion which are recognized by the proteasome regulators (7 , 9 , 11 ). This affords a fast, efficient and highly dynamic means to remove damaged or no longer needed proteins. As a consequence of its fundamental importance the proteostasis network is in- volved in many, if not all, cellular processes throughout all organisms and is there-

2 1 Introduction fore associated with several diseases. In humans, many age-related diseases, such as type II diabetes, cancer, and cardiovascular diseases are related to deficiencies in proteostasis (3 ).

1.1 The ubiquitin proteasome system

In eukaryotes protein degradation is mainly carried out by the ubiquitin protea- some system. The 76 amino acid long protein ubiquitin (Ub) is found in all eukary- otic cells and the sequence and the structure are conserved. Ub has a characteristic globular fold where an α-helix is “grasped” by a five stranded β-sheet, explaining why it is called “the β-grasp fold” (12 ). On the surface of Ub several lysines are presented and the C-terminus is formed by a di-glycine motif. Ub can be cova- lently attached to lysines of substrates via its di-glycine motif. The formation of the isopeptide bond is catalyzed by a cascade of three enzymes (Figure 1.2)(13 ). The ubiquitin-activating enzyme E1 uses ATP to activate the C-terminus of Ub and form a ubiquitin-adenylylate intermediate. The activated carbonyl carbon is attacked by the catalytic cysteine of E1 and a thioester is formed. In the next step Ub is transferred to the ubiquitin-conjugating enzyme E2 by a transthiolation re- action. The last step of the cascade is performed by the ubiquitin ligase E3 which catalyzes the formation of the isopeptide bond between the C-terminal di-glycine motif and a lysine residue of a substrate protein. Ub itself can be a substrate for ubiquitination and several of the surface lysines undergo modification resulting

activation thioester formation O O O +E1, ATP

−PP −AMP O i O-AMP S SH E2 E1 transfer

O O O 3×Ub +E3, sub N N −E2, E3 S

polyUB formation via K48 isopeptide bond formation

Figure 1.2: Schematic overview of the ubiquitination cascade in eukaryotes. Ubiquitin (Ub) is activated at its C-terminus by E1 via AMPylation. The catalytic cysteine of E1 subse- quently forms a thioester with Ub and releases AMP. Ub is transferred to E2 in a transthiolation reaction. An E3 ligase mediates the formation of the isopeptide bond between the C-terminus of Ub and a lysine on the target substrate. For proteasomal degradation, a minimum of four Ub molecules have to be linked via Lys48 of Ub itself to form a linear polyubiquitin chain.

3 in a complex way of controlling cellular functions. In addition, deubiquitinating enzymes, capable of cleaving the isopeptide bond, further increase the ways in which cells can regulate the post-translational modification pattern. For proteaso- mal degradation, a polyubiquitin of at least four ubiquitin molecules, linked with each other via the C-terminus of one unit and Lys48 of the next, is necessary (14 ). However, it was recently shown that the underlying recognition mechanism is far more complex (15 , 16 ). The eukaryotic proteasome belongs to the family of compartmentalizing pro- teases and is a multienzyme complex (Figure 1.1). The catalytic activity for pro- teolysis is harbored in the proteasomal core particle (CP) which interacts on both sides with a regulatory particle (RP). The RP combines several functionalities, including control of access to the proteolytic interior of the CP, recruitment of ubiquitinated proteins, deubiquination of the substrates, and ATP-dependent un- folding and feeding of a substrate into the CP catalytic chamber. Substrates are recognized by the RP via two ubiquitin receptors that due to their distance sense the presence of the polyubiquitin chain. Nevertheless, ubiquitination on its own is not sufficient for degradation (17 ). It was shown that the ubiquitinated substrate needs to be partially unfolded to provide a handle for further unfolding (18 ). Be- fore the substrate is unfolded by the heterohexameric AAA-ATPase, the ubiquitins are removed by a deubiquitinase (19 ) that is located inside the RP close to the AAA-ATPase. The CP consists of seven distinct α- and β-subunits, which form seven-membered α1–7 and β1–7 rings (20 ). The rings are stacked in four layers with an αββα order (Figure 1.1). The α-rings provide binding pockets formed by two neighboring subunits that are used for interaction with the RP. The proteasomal core particle can interact with other regulators as well. This interaction is always established by the C-terminus of the regulators which docks into the binding pockets of an α-ring. The other regulators, in contrast to RP, are ATP-independent and therefore unable to actively unfold proteins, which in turn lets them only interact with peptides and partially unfolded proteins. One of the two known passive regulators is called the 11S complex, which forms heptameric rings (21 ). Depending on the organism, this complex is formed by one or two distinct subunits (22 , 23 ) that assemble as a hollow cylinder with an inner diameter between 20 and 30 Å. The C-terminal carboxylate of an 11S protomer forms a salt bridge with a lysine in a binding pocket of the α-ring, which summed over all seven interactions, leads to sufficient binding (24 , 25 ).

4 1 Introduction

The other regulator is called Blm10. It is not an oligomeric complex but rather a single polypeptide chain which adopts a HEAT-repeat like fold with an overall arrangement of a solenoid spiral (26 –28 ). Like the 11S complex protomers, the

Blm10 C-terminus docks into a binding pocket between the CP α5 and α6 subunits and forms a similar salt bridge (29 , 30 ). For Blm10, a conserved motif in the last three C-terminal residues, referred to as a HbYX motif, was identified, which consists of a hydrophobic residue followed by a tyrosine before a nonconserved C-terminal residue. The HbYX motif was shown to be sufficient for granting access to the proteolytic interior of the CP as hepta-peptides with this sequence induce activation (31 ).

1.2 The bacterial proteasome

The bacterial proteasome consists of the proteasomal core particle (CP) and one of two regulators (7 , 9 ). The CP, also referred to as the 20S particle because of its sedimentation coefficient, is an approximately 700 kDa multisubunit complex constructed from the two paralogous proteins proteasome core A and B (prcA/B) (32 , 33 ). Each of these proteins organizes as heptameric rings, which further assemble as a stack of four rings (α7β7β7α7) (Figure 1.1)(32 , 34 ). The final as- sembly has a barrel-like shape with a diameter of ≈120 Å and a length of ≈150 Å (32 , 33 , 35 ). The α-rings on both sides of the CP form a gated pore of roughly 20 Å and thereby constitute the entrance to the otherwise inaccessible interior (Fig- ure 1.3A, B) (33 , 36 ). The inner architecture is dominated by three large chambers formed by two adjacent rings whereby the central cavity formed by the β-subunits comprises the catalytic active sites and therefore is called the proteolytic chamber (32 , 34 , 35 ). The assembly of the CP occurs in three steps (37 , 38 ). First, a half-proteasome consisting of an α- and a β-ring forms spontaneously. Next, two half- form a preholoproteasome. PrcB is genetically encoded with an N-terminal pro- peptide that was shown to have an influence on the assembly of the preholopro- teasome (33 , 35 , 39 ). In addition, the propeptides prevent unwanted proteolytic activity prior to assembly. In the last step, maturation to the catalytically ac- tive proteasome happens by cleavage of propeptides in an autocatalytic process resulting in free N-terminal threonines (Thr1) (38 ).

5 A B

α

β

β

α

C Lys33 Lys33 Lys33 Lys33

NH2 NH3 NH3 R' NH3 H H O O H N 2 R' N H R' N R' O O N O H O H HO H O H H H O Thr1 Thr1 R O Thr1 Thr1 N HN N H O N N HN H3N H3N H2N H3N R H N H R O O 2 O O R' H R' N N N COOH H O R H

Figure 1.3: X-ray structure and catalytic mechanism of the Mycobacterium tubercu- losis proteasomal core particle. A) Overview of the core particle (pdb 3mi0,(36 )). The α- and β-rings are colored differently. The active site Thr1 is represented as spheres with the nitrogen in blue and oxygen in red. B) Top view of the α-ring showing the closed gate (pdb 3mka,(36 )). The subunits are colored according to the different arrangement of their gate-closing N-termini. The hydrophobic residues closing the gate are highlighted in red. The grey stars indicate the binding pockets for the C-termini of regulators. C) Reaction mechanism of CP-catalyzed proteolysis.

The usage of Thr1 as the catalytic nucleophile classifies the CP as an N-terminal nucleophile aminohydrolase (40 ). These enzymes use the functional group of the side chain, in this case the hydroxy group, as the nucleophile and the N-terminal amino group as a proton donor/acceptor for general acid-base catalysis (Fig- ure 1.3C) (33 , 34 , 41 ). The substrate specificity of the CP is encoded in PrcB, too. In general, the bacterial CP has chymotrypsin-like activity, meaning that it hydrolyzes peptide bonds after hydrophobic residues (32 , 33 ). For the CP of

6 1 Introduction

Mycobacterium tuberculosis additional tryptic (cleavage after basic residues) and caspase-like (cleavage after acidic residues) activity was found (33 , 34 ). The interior of the CP is shielded from unwanted binding of non-target proteins and small peptides by the gated pore formed by the N-terminal amino acids of PrcA (33 , 36 ). The N-termini of the seven α-ring subunits adopt three different conformations whereby one subunit does not contribute to an alternating sequence of elongated and angled orientations (Figure 1.3B). The three angled N-termini interact via their hydrophobic residues and thereby block the pore (36 ). In archaea and eukarya the gate opening relies on the interaction of the CP with its regulators (42 –44 ). On the side of the CP an interaction platform is provided by binding pockets formed by two adjacent PrcA protomers (45 ). Although these binding pockets are present in the bacterial CP as well (34 , 36 ), the increased activity observed for non-bacterial CPs has never been detected (46 , 47 ).

1.2.1 Interactors of the bacterial proteasome

Interactors, also called activators or regulators, of the bacterial proteasome core particle serve and combine several functionalities. They can be involved in the activation of the CP, and they provide an interaction platform for recruiting sub- strates that should be degraded by the CP. In bacteria two interactors, Bpa (bac- terial proteasome activator (48 ), also called PafE [proteasome accessory factor E] (49 )) and Arc (ATPase forming a ring-shaped complex (50 ), called Mpa [my- cobacterium proteasome ATPase] in mycobacteria (51 )), are known. They can be distinguished by their ability to unfold proteins in an active (ATP-driven) or passive (ATP-independent) manner (46 , 50 ). Independent of their mode of action they bind to the CP via a conserved C-terminal GQYL motif (46 , 48 , 49 , 52 ) (Figure 1.4, upper half). Bpa was identified by a bioinformatic approach searching for proteins that co- occur with PrcA/B and feature the C-terminal GQYL motif (48 ). Subsequent analysis by electron microscopy (EM) revealed that Bpa forms complexes with the CP in which additional features on each side of the long axis of the CP appear. EM micrographs of Bpa on its own showed rings with a six or seven fold symmetry. Gel filtration was used to estimate the stoichiometry of the rings and suggested a hexa- or heptameric assembly. In an independent study, gel filtration experiments cou- pled with multiangle light-scattering suggested a dodecameric stoichiometry (49 ).

7 +

Mpa CP

Substrate

Pup

Mpa

CP

Figure 1.4: Mpa mediated proteasomal protein degradation. The proteasomal core parti- cle (CP, light and dark salmon) opens its gate (dark salmon) upon interaction with Mpa (orange). The C-terminal GQYL motifs of the Mpa complex bind in pockets formed by to adjacent CP α-subunits. The N-terminal coiled-coil domains of Mpa recruit substrates (grey) by forming a three-part coiled-coil with Pup (red). The unstructured N-terminus of Pup is grabbed by the hy- drophobic loops (green) in the central channel which, due to ATP hydrolysis, undergo movements that first pull Pup and ultimately unfold the substrate. Pup and the substrate are degraded in the proteolytic chamber. Figure adapted from Imkamp et al. (9 ).

Bpa was found to accelerate the degradation of β-casein, which gives a hint toward its native function (48 , 49 ). It could be involved in the degradation of partially unfolded proteins upon stress (48 ). This speculation was further supported by the in vivo observation that a bpa-deficient M. tuberculosis strain exhibited only minor growth effects in liquid culture, but had severe growth phenotypes on solid medium and in a mouse model (49 ). The bpa-deficient strain accumulated the heat shock repressor HspR which was efficiently degraded by the Bpa-CP complex in vitro. The structural analysis of Bpa is part of this thesis and will be discussed in Chapter3. In contrast to the recently discovered Bpa, the ATP-dependent proteasomal in- teractor Arc/Mpa has been studied in greater detail. Mpa belongs to the “ATPases

8 1 Introduction associated with diverse cellular activities” (AAA) superfamily and forms homo- hexameric ring complexes (53 ). ATP hydrolysis is coupled to the unfolding of proteins through movements of hydrophobic loops in a central channel formed by the six subunits (46 ). The unfolded polypeptide chains are further translocated into the catalytic chamber of the CP where they are ultimately digested (54 ). An Mpa protomer folds into several domains, starting at the N-terminus with an α-helix that forms a shared coiled-coil with one neighboring protomer. The follow- ing stretch folds into two oligonucleotide/oligosaccharide-binding (OB) domains (47 , 55 ). The C-terminal domain is the actual AAA-ATPase domain which con- tains the C-terminal GYQL motif that is necessary for interaction with the CP. The OB domains themselves are able to form a hexameric ring (47 , 56 ), giving rise to the assumption that they are mainly responsible for the sixfold arrangement. Substrate recruitment is realized via the N-terminal coiled-coil domains which rec- ognize a post-translational modification called pupylation (57 ) (Figure 1.4 lower half). Upon binding to Mpa, the 64 amino acid long and intrinsically disordered protein Pup (prokaryotic ubiquitin-like protein) adopts a partial α-helical fold (residues 21–58) and forms a three-part coiled-coil (57 –61 ). Pup binds antiparal- lel to the Mpa α-helices which enables Mpa to grab the unstructured N-terminal part of Pup and initiate unfolding (46 ). As a consequence of Pup being the point of attack, it gets, at least in vitro, degraded along with the substrate (46 ).

1.3 Pupylation – a post-translational modification for proteasomal degradation

Post-translational modifications (PTM) add an additional layer of functionality and control to ribosomally synthesized polypeptide chains (62 ). For example, they enable fast adaptation and are very diverse in size, ranging from small modifications like phosphorylation to the attachment of macromolecules like ubiquitin or Pup (13 , 62 ). Pupylation is one of the modifications in the field of PTMs involving the attach- ment of proteins and follows the common theme of modifying lysine side chains of target proteins (63 –65 ). The prokaryotic ubiquitin-like protein Pup suggests, based on its name, a close relationship to its eponym, which holds true for the overall functionality but collapses upon a closer look at the underlying biochem- istry. Furthermore, Pup and ubiquitin do not share either sequence or structural

9 O O PafA PafA Pup-C Pup-C ADP Pi O ADP O P O ATP O Substrate

O

R’NH2 H N-K O O -O 2 PafA Pup-C Pup-C

Pup

O P Pi NHR’

Ligation Dop

ATP O O- O N-K H

O NH O 2 Pup O Pup - - O Deamidation O O -O O H H

Pup Depupylation Pup-Substrate

ATP O O Dop •ATP O N-K Pup-C Pup-C H O H2O NHR O RNH - 3 O O H H Dop

PupGG

Figure 1.5: The pupylation cycle in actinobacteria. Before PupE (red) enters the cycle, Dop (green) deamidates PupQ. The Pup ligase PafA (blue) catalyzes isopeptide bond formation between PupE and a substrate (grey). In addition, Dop has depupylase activity as well and can cleave the isopeptide bond. The reaction mechanism for both enzymes is indicated. R’ represents the lysine side chain of a substrate whereas R is either a hydrogen or a lysine of a substrate. Figure adapted from Barandun et al. (68 ). homology, yet both are involved in marking proteins for proteasomal degradation (66 ). The covalent attachment of Pup to a substrate occurs via an isopeptide bond between the side chain carboxylate of the C-terminal glutamate of Pup and the ε-amino group of a lysine of the substrate (67 ). The pupylation system consists of three proteins (Figure 1.5). The small in- trinsically disordered protein Pup (58 –61 ), the deamidase of Pup (Dop) which has depupylase activity as well (65 , 69 , 70 ), and the Pup ligase PafA (protea- some accessory factor A) (65 ). Pup, depending on the organism, is genetically encoded in two isoforms, containing either a C-terminal glutamate (PupE) or a glutamine (PupQ) (66 , 71 ). Thus, ligation to substrates occurs either in one or

10 1 Introduction two steps, respectively (Figure 1.5). In the two-step sequence Dop first catalyzes the deamidation of PupQ and thereby transforms it to PupE (65 ). In the next step, PafA activates the side chain carboxylate group of PupE at the expense of ATP and forms an unusually stable phosphorylated intermediate that remains bound to the enzyme(72 ). In the second half of the PafA-catalyzed reaction, PafA most likely is involved in deprotonation of the lysine ε-amino group that forms the isopeptide bond. The deprotonated amine is a better nucleophile and attacks the activated phosphoanhydride, resulting in the covalent amide bond (72 ). Pupylated substrates can undergo two different fates. Either they are recruited to the pro- teasome by Mpa and are degraded (Figure 1.4)(63 –65 ) or they undergo cleavage of the isopeptide bond by Dop (Figure 1.5)(69 , 70 ). Together, the features of the enzymes involved in pupylation allow for a regulatory circuit in which, depending on the activities of the individual enzymes, the substrates are either removed from the cell by proteolysis or kept intact (66 , 71 ). Like the bacterial proteasome, pupylation is mainly found in actinobacteria (e.g. Mycobacterium tuberculosis) and the genes encoding the proteasomal and the pupy- lation proteins usually cluster together (65 , 68 ). Bpa, the regulator that is un- related to pupylation, is found elsewhere in the genome (48 ). Interestingly, some organisms (e.g. Corynebacterium glutamicum) maintain the pupylation genes and mpa without having the proteasomal core genes prcA/B. Thus, in such organ- isms pupylation most likely does not serve as a degradation tag but rather has some other functionality. Nevertheless, due to the lack of experimental evidence degradation by other proteases cannot be ruled out (66 ). PafA and Dop are paralogs with high sequence and structural homology and are about 500 amino acids long (Figure 1.6)(71 , 73 ). A large N-terminal domain of roughly 400 residues includes the catalytic active site and presents the Pup binding platform. Approximately 70 C-terminal amino acids fold into a compact domain that tightly interacts with the N-terminal domain and together they form a binding pocket for an adenosine moiety. The central β-sheet of the enzymes is accessible on its concave side and, together with an adjacent loop, it forms a β-sheet cradle in which the phosphate groups of ATP run along the β-strands away from the adenosine binding pocket. At this end of the β-sheet a conserved groove emerges where Pup binds, as seen in the co-crystal structure of PafA and Pup (71 ). The C-terminal carboxylate of Pup most likely forms a salt bridge with a conserved arginine close to the active site and thereby positions the side chain carboxyl group

11 PafA Dop

Dop loop

Pup

Figure 1.6: Crystal structures of the PafA · Pup complex and Dop. Both enzymes are depicted in cartoon representation and the ATP bound in the active site is highlighted in black. Magnesium ions are shown in magenta. The C-terminus of Pup reaches into the active site. Dotted lines show structurally undefined regions. The approximately 40 residues long Dop loop is characteristic for the depupylase. PafA · Pup complex: pdb 4bjr( 71 ), Dop: pdb 4b0s( 73 ). in close proximity to the γ-phosphate of ATP. A strictly conserved aspartate in Dop (Asp94) and PafA (Asp64) flanks the point where the γ-phosphate of ATP and the side chain carboxyl group of the C-terminal glutamate of Pup meet each other. Mutation of this aspartate to alanine in Dop and asparagine in PafA inactivates the enzymes, suggesting that Asp functions as catalytic base. Asp64 in PafA is therefore thought to activate the incoming ε-amino group by deprotonation, while Asp94 of Dop activates water (73 ). The partially negatively charged intermediates that occur during catalysis are most likely stabilized by one of the arginines in the loop adjacent to the β-sheet. A good indication for the discrimination of PafA and Dop is the so-called “Dop loop”, an additional stretch of about 40 amino acids that is only found in the N-terminal domain of Dop (65 , 73 ). The Dop loop was not resolved in the X-ray structure indicating high flexibility, but mutational studies showed that this segment has no influence on the catalytic activity (73 ). An obscure observation is that both enzymes, PafA and Dop, require ATP for enzymatic activity, even though Dop does not use it in stoichiometric amounts and was found to be active with ADP and the slowly hydrolyzing ATP analog ATPγS, albeit much more slowly (65 ). Although ATP is necessary for the activation of the terminal carboxylate in the ligation process (72 ), the requirement for ATP is not obvious in the case of the deamidation/depupylation. The deamidation and the cleavage of the amide bond are entropically favored and numerous proteases rely on this concept.

12 1 Introduction

GS GCL

Figure 1.7: Crystal structures of glutamine synthetase (GS) and glutamate cysteine ligase (GCL). Both enzymes are depicted in cartoon representation and the ADPs (black) along with phosphorylated inhibitors (cyan) are bound in the active site. Magnesium ions are shown in magenta. GS: pdb 2bvc( 75 ), GCL: pdb 1va6( 76 ).

In an effort to characterize the mechanism of Mycobacterium tuberculosis Dop, an electrophilic trap was employed and Asp95 (Asp94 in the structure) was identified as a potential nucleophile involved in catalysis (74 ). The trap consisted of a Pup variant where the C-terminal amino acid was replaced by the unnatural amino acid DON (6-diazo-5-oxo-L-norleucine) resulting in Pup-DON. A covalent adduct of Pup and Dop was identified by mass spectrometry. Additional experiments with 18O-water and hydroxylamine showed incorporation into the C-terminal side chain carboxylate of Pup. In the proposed mechanism for Dop, Asp95 attacks the amide bond at the carbonyl carbon directly and forms an anhydride intermediate that is resolved in a second step by water (74 ). This mechanism does not explain the role of ATP. A new mechanism that assigns a role to the nucleotide and accounts for the previously found results is presented in Chapter4.

1.4 PafA and Dop are members of the γ-carboxylate amine/ammonia ligase superfamily

After the discovery of the Pup-proteasome system in Mycobacterium tuberculosis (63 , 77 ), a bioinformatic study identified the genes pafA and dop as members of the γ-carboxylate amine/ammonia ligase superfamily (78 , 79 ). Prominent members of this family are glutamine synthetase (GS) and glutamate cysteine ligase (GCL) (Figure 1.7), which both catalyze the formation of a covalent carbon-nitrogen bond to give an amide as observed for PafA (65 , 67 , 78 ). The enzymes of this clan do

13 this via the two-step mechanism previously described for PafA. The γ-carboxylate activated by phosphorylation and then attacked nucleophilicaly by a primary amine or ammonia. Catalysis always takes place on the concave side of a central antipar- allel six-stranded β-sheet (78 ). The β-sheet is stabilized on the convex side by two α-helices that pack against it. The highly conserved residues in the active site are involved in Mg2+ ion coordination which in turn mediates ATP binding. Two to three binding sites for magnesium are found depending on the enzyme and experimental method (75 , 79 , 80 ). A conserved motif for ATP binding (GxExE) was identified on the most N-terminal β-strand. The neighboring strands contain the conserved glutamates, glutamines and histidines responsible for magnesium binding (73 , 78 , 79 ). The members of the γ-carboxylate amine/ammonia ligase family assemble into complexes with a different number of protomers. Bacterial GS forms dodecameric complexes out of two hexameric rings and the active sites are buried in deep pock- ets formed by two adjacent protomers within a hexamer (81 ). GCL in plants was found to exist as a monomer or dimer with an interesting redox-dependent dimerization process that regulates activity (82 ). PafA and Dop are monomeric enzymes, mechanisms for their regulation have not been described (66 , 71 , 73 ).

14 Aim of the thesis 2

The discovery of the Pup proteasome system in actinobacteria nearly a decade ago showed that bacteria also harbor a system for proteasomal protein degradation similar to the well-known ubiquitin proteasome system. Despite the fact that both systems use a small protein as a post-translational marker for proteasomal degradation, they differ substantially. As of today the only enzymes known to be involved in the catalytic attachment and detachment of Pup are the Pup ligase PafA and the depupylase/deamidase Dop. Interestingly, they are paralogous and the key catalytic residues are highly conserved. Both belong to a protein-fold superfamily of carboxylate-amine/ammonia ligases. The well-studied enzymes of this superfamily (glutamine synthetase and glutamate-cysteine ligase) use ATP to form a transient phosphorylated intermediate to generate an amide. PafA was shown to utilize this mechanism and the phosphorylated intermediate could even be isolated. On the other hand, Dop was shown to function only in the presence of ATP but catalysis was not accompanied by stoichiometric ATP turnover. This is a puzzling observation in light of the high structural and sequence similarity between Dop and PafA. The main goal of my thesis was to resolve this issue by characterizing Dop with a combination of biochemical and biophysical techniques. During my time as a PhD student my colleague Cyrille identified a new bacte- rial proteasome interactor termed Bpa in an bioinformatic study. In contrast to the previously known interactor Arc/Mpa, Bpa functions in an ATP-independent manner. Sequence analysis failed to identify any related protein, making Bpa an attractive target for structural analysis by protein X-ray crystallography. As a consequence, a further goal of my thesis was the structural characterization of Bpa.

15

Structural analysis of the bacterial proteasome 3 activator Bpa in complex with the 20S proteasome

Marcel Bolten1, Cyrille L. Delley1, Marc Leibundgut, Daniel Boehringer, Nenad Ban and Eilika Weber-Ban Structure 2016, 24, 2138–2151 doi: 10.1016/j.str.2016.10.008 1 contributed equally. CD contributed electron microscopy experiments while MB contributed crystallization experiments. Non-structural experiments were done by both authors.

SUMMARY

Mycobacterium tuberculosis harbors proteasomes in addition to the bacteria-typic degradation complexes. As in eukaryotes, an ubiquitin-like modification targets proteins for proteasomal degradation. Recently, a non-ATPase activator termed Bpa was shown to support an alternate proteasomal degradation pathway. Here, we present the cryo-EM structure of Bpa in complex with the 20S core particle (CP). For docking into the cryo-EM density, we solved the X-ray structure of Bpa, showing that it forms tight four-helix bundles that arrange into a 12-membered ring with a 40 Å wide central pore and the C-terminal helix of each protomer protruding from the ring. The Bpa model was fitted into the cryo-EM map of the Bpa-CP complex, revealing its architecture and striking symmetry mismatch. The Bpa-CP interface was resolved to 3.5 Å, showing the interactions between the C-terminal GQYL motif of Bpa and the proteasome α-rings. This docking mode is related to the one observed for eukaryotic ATP-independent activators with features specific to the bacterial complex.

17 INTRODUCTION

The occurrence of proteasomes is a special feature of many members of the acti- nobacterial phylum, amongst them the human pathogen Mycobacterium tubercu- losis (Mtb) (83 ). While proteasomes represent the main molecular machine for processive protein degradation in eukaryotes (84 ), the actinobacterial proteasomes exist alongside canonical bacterial compartmentalizing proteases (6 ), and their role is to ensure survival under especially demanding conditions as for example encoun- tered by Mtb inside host macrophages (85 ). Similar to its eukaryotic counterpart, the bacterial proteasome core particle (CP) is gated for substrate entry (36 ). The Mtb CP is assembled from four homo- heptameric rings built of two homologous subunits, α and β (34 ). The rings stack to form a four-tiered cylinder with the two central β-rings flanked by the outer α-rings that form the entrance pores to the interior of the cylinder. The N-terminal residues of the α-subunits surround the pore and adopt a conformation that keeps the pores tightly sealed (36 ). In order to degrade cellular proteins, the CP has to associate with interaction partners referred to as regulators or activators that allow entry of selected sub- strates into the gated proteolytic chamber (9 , 43 , 44 ). These binding partners are themselves multi-subunit complexes, usually of ring-shaped architecture and featuring a central pore (27 , 45 , 86 ). The regulators stack at the ends of the core cylinder and align their ring pore with the entrance pore of the core particle. In complex with the ring-shaped ATPase Mpa (called ARC in other Actinobacte- ria), the bacterial proteasome is able to degrade stably folded proteins that have been post-translationally modified with the prokaryotic ubiquitin-like protein Pup (46 ). This involves recognition of Pup by Mpa followed by ATP-dependent unfold- ing and translocation of the Pup-conjugated protein. We and others have recently discovered a novel, ring-shaped, ATP-independent activator of the bacterial protea- some, Bpa (bacterial proteasome activator), also referred to as PafE, which forms an alternative proteasome complex mediating a second, pupylation-independent degradation pathway (9 , 48 , 49 ). It was shown that association with Bpa can activate the CP to degrade an unstructured model substrate (48 ), suggesting that substrates might be selected on the basis of their poorly structured conformational state. This is further supported by the finding that heat shock repressor HspR, an

18 3 Structural analysis of Bpa in complex with the 20S proteasome intrinsically unstable protein that partially denatures upon heat shock, is a natural substrate of the Bpa-proteasome pathway (49 ). Interestingly, Bpa is to date the only known proteasomal interactor able to successfully open the gate of the bacterial CP (48 , 49 ). Mpa was never shown to stably interact with the wild type CP or open its gate in vitro, despite the wealth of functional and genetic evidence linking it to the proteasomal degradation pathway in vivo (46 , 47 ). In fact, the field has relied on a variant of the CP with a truncated gate to carry out in vitro Mpa-driven proteasome degradation experiments. The inability of Mpa to interact with and open the wild type CP is surprising, since the C-terminal proteasome interaction motif (GQYL) is shared between Bpa and Mpa, suggesting a common mode of gate-opening. With the aim to structurally characterize the interaction of Bpa with the 20S proteasome core, we stabilized the Bpa-proteasome complex with the help of DNA-encoded site-specific crosslinking for analysis by cryo-electron microscopy (cryo-EM). Our cryo-EM 3D image reconstructions reveal an assembly that has a symmetry-mismatch across the Bpa/CP interface. In addition, we determined the crystal structure of Bpa in isolation by X-ray crystallography. It shows a rigid 12-membered ring with flexible protruding C-terminal helices and features a large central pore. During the preparation of this manuscript, a crystal structure of Mtb-Bpa was published (87 ), which is in good agreement with our independently determined structure. We used our Bpa model for fitting into the cryo-EM density of the Bpa-CP complex. In our reconstruction, the proteasome core can be visu- alized at high resolution, revealing additional density associated with its α-rings, which correspond to the C-terminal GQYL motif of Bpa. The docking mode is re- lated to eukaryotic ATP-independent activators but also displays features specific to the bacterial complex.

RESULTS

CD analysis, sequence alignment and secondary structure prediction suggests a protomer with an all α-helical core domain and an unstructured N-terminal stretch of 35 residues

Bpa occurs as a ring-shaped oligomer with a stain-filled center on negative-stain electron micrographs (48 ), and size-exclusion chromatography followed by multi-

19 angle light scattering suggests a homododecameric complex (49 ). However, the amino acid sequence of Bpa does not belong to any known protein family. To assess the domain structure of Bpa, we performed a sequence alignment between Bpa from different mycobacterial species combined with secondary struc- ture prediction (88 , 89 ). The data suggest that Mtb-Bpa folds into a conserved α-helical core flanked by intrinsically disordered termini with low sequence homol- ogy, except for a strictly conserved C-terminal GQYL motif (Figure 3.1A). This prediction agrees with the characteristic CD signature observed for full-length Bpa, which features two minima at 208 nm and 220 nm (Figure 3.1B). Based on our in silico analysis, we generated a truncation variant (HisBpa36–159), from which the non-homologous and likely disordered regions were excluded. The CD spectrum of this variant shows the same overall shape with a slightly stronger helical signal, indicating that it is folded and indeed features an increased relative helical content.

Bpa forms crystals with a large unit cell

While no diffracting crystals could be obtained from full-length Mtb-Bpa, we were able to grow plate-like hexagonal crystals from a truncated untagged Bpa36–159 variant (Figure 3.1C) and collect a dataset to 2.6 Å resolution, which could be in- dexed in space group R32:h (hexagonal lattice setting of R32) with unit cell param- eters of a = b = 101.4 Å, c = 311.2 Å and α = β = 90◦, γ = 120◦ (Table 3.S1). The diffraction data showed a severe lattice-translocation defect (LTD) (Figure 3.S1) and were anisotropic, which limited the overall resolution, although diffraction was visible to 2.4 Å along the c* axis. As no molecular replacement model of Bpa was available, at this stage of the project, we used seleno-methionine (Se-Met) heavy atom derivatization for phasing. The Se-Met labelled Bpa truncation vari- ant HisBpa36–159 (Figure 3.1C) formed hexagonal crystals that diffracted to 3.3 Å resolution and could be indexed in space group P6322 (unit cell parameters a = b = 100.9 Å, c = 207.4 Å, α = β = 90◦, γ = 120◦). We solved the HisBpa36–159 struc- ture at a resolution of 3.3 Å by single-wavelength anomalous dispersion (SAD) phasing using the Se-Met sites and a combination of solvent flipping and 4-fold non-crystallographic symmetry (NCS) averaging, followed by model building and refinement (Figure 3.S2, 3.S3). Due to the increased R-values obtained in space group P6322, which might be the result of an NCS emulating a higher crystallo- graphic symmetry or a translational lattice defect, the model was refined against

20 3 Structural analysis of Bpa in complex with the 20S proteasome

A Pre Exp 10 20 30 40 50 60 70 80 90 Mtb 1 MVIGLSTGSDD---DDVEVIGGVDPRLIAVQEN----DSDESSLTDLVEQPAKVMRIGTMIKQLLEEVRAAPLDEASRNRLRDIHATSIRELE 86 Mmar 1 MVTGLSTGNDDPEAEDVEVIGGIDPRLMAVTDE----DSDERSLTDLVEQPAKVMRIGTMIKQLLEEVRAAPLDEASRNRLRDIHATSIRELE 89 Mavi 1 ----MST---RNDDDGVEVIGGVDPRMMGVDDD----DADERSLTDLVEQPAKVMRIGTMIKQLLEEVRAAPLDDASRNRLREIHATSIRELE 82 Mlep 1 ------MNNDNDDSIEIIGGVDPRTMATRGEDESRDSDEPSLTDLVEQPAKVMRIGTMIKQLLEEVRAAPLDEASRNQLREIHATSIRELE 85 Msm 1 ------MTINPDDDNIEILTGAAGG----ADTEGEGEGEGKSLTDLVEQPAKVMRIGTMIKQLLEEVRAAPLDDASRNRLREIHQTSIRELE 82

Pre Exp 100 110 120 130 140 150 160 170 180 Mtb 87 DGLAPELREELDRLTLPFNEDAVPSDAELRIAQAQLVGWLEGLFHGIQTALFAQQMAARAQLQQMRQGALPPGVGKSGQH-----GHGTGQYL 174 Mmar 90 DGLAPELREELDRLTLPFNEDSAPSDAELRIAQAQLVGWLEGLFHGIQTALFAQQMAARAQLEQMRQGALPPGIGKPGPH-----GHGTGQYL 177 Mavi 83 EGLAPELREELDRLTLPFNEDAAPSDAELRIAQAQLVGWLEGLFHGIQTALFAQQMAARAQLEKMRQGALPSGIGKPGSA------PGHGQYL 169 Mlep 86 DGLAPELREELDRLTLPFNESTAPSNAELRIAQAQLVGWLEGLFHGIQTALFAQQMAARAQLEQMRHSALPPGMGKPGQAG----GQGTGQYL 174 Msm 83 DGLAPELREELERLTLPFTDDNVPSDAELRIAQAQLVGWLEGLFHGIQTALFAQQMAARAQLEQMRQGALPPGIQVPGAQRGGATHPGTGQYL 175 random coil α-helix unresolved in X-ray structure GQYL motif B 5 C Bpa Bpa HisBpa36–159 −1 0 dmol

2 Bpa36–159 −5

deg cm His 3 Bpa36–159 −10 / x10

His MRW Bpa36–174

[ Θ ] −15

HisBpa −20 200 210 220 230 240 * λ / nm

Figure 3.1: Bpa consists of a highly conserved α-helical core with disordered C- and N-termini. A) Multiple sequence alignment of Bpa orthologs from different Mycobacteria. The predominant residues in positions with an identity score above 0.5 are shaded in blue where increasing similarity is indicated by a gradient from light to dark blue. Predicted (Pre) and exper- imentally (Exp) determined secondary structure is indicated above the sequences. Black triangles mark the positions where the crystallized Bpa variant has been truncated. My- cobacterium tuberculosis (Mtb), Mycobacterium marinum (Mmar), Mycobacterium avium (Mavi), Mycobacterium leprae (Mlep), Mycobacterium smegmatis (Msm). B) Circular dichroism spectra of Bpa and HisBpa36–159 exhibit dominant α-helical features with local minima at 208 nm and 220 nm. C) Cartoon depicting Mtb-Bpa constructs used in this study (His: hexa histidine-tag, numbers indicate the boundaries, the asterisk indicates mutation of T170 to the amber stop codon for incorporation of para-benzoyl-L-phenylalanine).

the same data processed into space group P3 with 16 monomers in the asymmetric unit (ASU) and similar spatial arrangement of the dodecameric rings. This mod- ification yielded improved R-values (Table 3.S1) and extra density in the area of the C-terminal helix. The P3 model (Figure 3.S4) was then used for molecular replacement in the LTD-corrected native R32:h 2.6 Å dataset.

21 Bpa adopts a rigid dodecameric ring structure with protruding flexible C-terminal helices

The crystal structure of Bpa36–159 reveals a dodecameric ring, in which the indi- vidual protomers are in the same orientation (Figure 3.2B). The Bpa ring exhibits an outer diameter of approximately 100 Å with a 40 Å wide central pore. These physical dimensions are in agreement with earlier EM micrographs of the wt Mtb- Bpa (48 , 49 ) and with the recent X-ray structure (87 ). The Bpa protomers fold into an antiparallel four-helix bundle (Figure 3.2A) that forms a compact body with the longer C-terminal helix H4 extending out from the bundle. While short turns connect H1 to H2 and H2 to H3, H3 is linked to H4 via a long loop that protrudes from the periphery of the Bpa ring, giving it a burled appearance. The C-terminally extending part of H4 shows weaker electron density which is indica- tive of intrinsic flexibility. Interestingly, in the crystal, the C-terminal helices of two Bpa dodecamers interdigitate such that the continuous packing along the c axis is formed by double ring units, resulting in alternating interdigitating and non-interdigitating contacts (Figure 3.S5). As an interdigitating double-ring is not observed for wt Bpa at biologically relevant concentrations in solution based on gel filtration and cryo-EM experiments (48 , 49 ) and as the dimensions of the single ring are in excellent agreement with the EM data the single dodecameric ring is evidently representative for the solution structure. The protomer-protomer interface is stabilized by two salt bridges between residues R49-E96 and E84-R99, both located at the periphery of the ring, and a further salt bridge at the inner perimeter between residues R62-E60-R116 (Figure 3.2C).

Bpa and CP form a complex with low apparent affinity

To structurally characterize the Bpa-CP complex by EM a high occupancy of CP with Bpa is essential. It was shown that Bpa must form a physical complex with the CP to mediate the degradation of protein substrates, since a variant of Bpa lacking the GQYL proteasome-interaction motif cannot stimulate proteasomal degradation of the unstructured model substrate β-casein (48 ). We first assessed the affinity for complex formation by exploiting the ability of Bpa to stimulate β-casein degradation. β-casein was labeled randomly with fluorescein, so that its digestion could be followed spectroscopically by measuring fluorescein unquench- ing upon cleavage of the protein backbone. The steady-state rates of β-casein

22 3 Structural analysis of Bpa in complex with the 20S proteasome

A H2 H3-H4 loop B

H3-H4 loop

H1 30 Å H3 40 Å

30 Å

N H4

C

C R116

H4 H1* R116

E60 H1 R62 E60*

R62 R99 E96 E84

R49 E84* H2* H1*

R99

E96

R49*

H3

Figure 3.2: Bpa protomers fold into a four-helix bundle and assemble into a dode- cameric ring. A) The Bpa36–159 protomer forms an antiparallel four-helix bundle with the C-terminal helix H4 extending further out of the bundle. B) Dodecameric ring assembly of Bpa. One protomer is shown as cartoon with the same col- oration as in A. The other protomers are depicted in surface representation with alternating dark and light grey colors. C) The protomer-protomer interface is visualized in splayed-open view and colored according to electrostatic charge from red (−2.5 KbT/ec) to blue (+2.5 KbT/ec) as determined by APBS (90 ) and PDB2PQR (91 ). The surface not involved in direct contacts is colored in grey. Insets show close-ups of the salt-bridges involved in stabilizing the protomer-protomer interaction. Adjacent protomers in the insets are colored cyan and green. The 2FO−FC Fourier difference density map is contoured at 1 σ level and shown as light grey mesh.

23 degradation were measured at increasing concentrations of HisBpa, thereby shift- ing the equilibrium from free CP towards Bpa-occupied CP (Figure 3.3A). Since CP alone only exhibits a marginal β-casein degradation capacity compared to the Bpa-CP complex, complex formation is coupled with an increase in the β-casein degradation rate. From the concentration-dependent change in the initial veloci- ties of degradation we estimated an apparent affinity of 11.8 ± 1.5 µM of CP for the Bpa ring-complex (Figure 3.3B, black curve). Although it has been shown that the C-terminal GQYL motif of Bpa mediates this interaction (48 ), curiously, the other proteasome interactor Mpa, featuring the identical GQYL motif, does not interact with CP in vitro. Indeed, monitoring the degradation of β-casein by Coomassie-stained SDS gel electrophoresis demonstrates that, in contrast to Bpa, Mpa is unable to stimulate β-casein degradation significantly (Figure 3.3C). The crystal structure of Bpa reveals that, unlike for Mpa, the N-terminus of Bpa emerges at the same face of the Bpa ring as the C-terminal GQYL motif. This arrangement raises the question whether the N-terminus of Bpa contributes to the interaction with the CP and could explain the discrepancy in gate-opening activity between Bpa and Mpa. To test this hypothesis, we assessed the ability of an N-terminally truncated HisBpa36–174 variant to stimulate β-casein degradation at different concentrations of the variant. An apparent affinity of 3.8 ± 0.3 µM was obtained for the truncated variant, indicating an even stronger interaction than observed for the full-length Bpa (Figure 3.3B, grey curve). This result argues against a favorable contribution to binding and gate-opening by the N-terminal region of Bpa.

The Bpa-20S complex can be stabilized for EM analysis by site-specific cross-linking

For EM grid preparation, typically a ≈100 nM particle concentration is used, which poses a problem in the case of the Bpa-CP complex, because the established disso- ciation constant of ≈10 µM indicates an equilibrium that favors dissociated com- plexes at this particle concentration. It was therefore clear that additional means of stabilizing the Bpa-CP complex were required to obtain sufficient numbers of Bpa-CP particles across the EM grids. The active site of the CP is allosteri- cally connected to the conformation of the interactor binding site at the α-ring face, and the active site variant (CPT1A) exhibits increased affinity for proteaso-

24 3 Structural analysis of Bpa in complex with the 20S proteasome

A B app. Kd app. kcat Casein-F HisBpa 11.8 +_ 1.5 μM 4.4 +_ 0.2 4 HisBpa36–174 3.8 +_ 0.3 μM 3.8 +_ 0.1 F

3 −1 + 2 ON / min Bpa CP Bpa-CP T 1 F

0 100 101 102 [Bpa] / μM C 0 5 10 15 20 0 5 10 15 20 0 5 10 15 20 time /h 70 kDa - Mpa

*

Casein 25 kDa - CP Bpa ** 15 kDa -

Figure 3.3: Assessment of Bpa-CP affinity by measuring Bpa-mediated activation of the proteasome for casein degradation. A) Cartoon depiction of affinity assay reaction scheme. The equilibrium between free Bpa and CP and associated Bpa-CP complex depends on the Bpa concentration. The increase in Bpa-CP complex over free CP can be measured indirectly by assessing the increase in casein degradation activity. The proteasomal model substrate casein is labeled nonspecifically with fluorescein, leading to a fluorescence increase upon casein degradation. B) Casein degradation rates were measured at increasing Bpa concentrations (0–76.8 µM pro- tomer) while keeping CP constant (0.2 µM). Turnover numbers (TON) refer to the to- tal concentration of CP present. HisBpa (black circles) and HisBpa36–174 (grey cir- cles) were measured in triplicate and the dependence of the determined turnover num- bers on the Bpa protomer concentration was fitted to the following saturation function: TON([Bpa]) = kcat×[Bpa]/(K d+[Bpa]). Errors are given as one standard error. pH 7.5, T = 37 ◦C. C) Casein (27 µM) degradation by Bpa-CP (left gel, 0.5 µM Bpa complex and 0.2 µM CP), by CP (middle gel, 0.2 µM) or by Mpa-CP (right gel, 0.5 µM Mpa complex and 0.2 µM CP) evaluated by SDS-PAGE and Coomassie staining. In contrast to Bpa-CP, Mpa-CP exhibits only very little casein degradation. Bands marked by ** presumably indicate truncation pat- terns of incompletely digested casein. * indicates an impurity in the proteasome preparation. pH 7.5, T = 37 ◦C.

25 mal interactors (49 , 92 –94 ). It was also demonstrated that the interaction of Mpa with the core particle, which like for Bpa is mediated by a GQYL motif, could be stabilized by using the so-called open gate proteasome (∆7CP), where the N-terminal seven amino acids of the α-rings are removed (47 ). Thus, we used an open-gate variant of CPT1A in our EM experiments to maximize interaction with Bpa. To further increase the fraction of assembled Bpa-CP complexes on the grid, we applied site-specific crosslinking (95 ) by replacing certain residues in the Bpa sequence close to the GQYL motif with the genetically encoded unnatural amino acid para-benzoyl-L-phenylalanine (Figure 3.4A, left). The most successful variant carried a replacement of T170 preceding the GQYL motif (Figure 3.4A, bottom). Upon expression of BpaT170*, para-benzoyl-L-phenylalanine incorpora- tion rescued the amber mutant with approximately 80 % efficiency as reported for other constructs (96 ). Strep-Tactin purification of the CP after crosslinking revealed that 10 % of the α-ring protomers are crosslinked to a Bpa protomer, as indicated by the band close to 55 kDa in the polyacrylamid gel (Figure 3.4A, right). The crosslinking strategy resulted in EM micrographs with a high occu- pancy of singly or bicapped CP complexes when the cross-linked preparation was supplemented with a moderate excess of free HisBpa (Figure 3.4B).

Cryo-electron microscopy reconstruction of the Bpa-20S complex shows a symmetry mismatch

After this optimization procedure, the complexes showed the expected side views of Bpa stacking coaxially with the CP (Figure 3.4B, C) as observed previously with uncrosslinked material (48 ). Roughly 200 000 particles were picked from the 2 150 recorded cryo-EM images and were submitted to 2D classification (Figure 3.4C). About half of the particles presented as side views, which could be classified into three main groups, CP particles without Bpa associated on either side, CP particles associated with Bpa on one side and CP particles associated with Bpa on both cylinder ring faces. Particles containing only the proteasomal core were discarded, leaving roughly 55 000 Bpa-CP complexes. The obtained 2D class averages show significant particle heterogeneity, which is not limited to assembly state (mono- or bicapped), but also represents differences in shape (e.g. deviation from coaxial position of Bpa, Figure 3.4C).

26 3 Structural analysis of Bpa in complex with the 20S proteasome

A 0 X Ft El B O multiple 70 kDa x-link 55 kDa Bpa-PrcA 40 kDa

NH2 35 kDa Δ7PrcA 25 kDa O O PrcBT1A

p-BpF BpaT170*

160 170

Bpa ...P G V G K S G Q H G H G * G Q Y L T170* C I II III IV V 30 nm 50 nm

Figure 3.4: Site-specific cross-linking stabilizes the Bpa-CP complex. A) On the left, structure of the unnatural amino acid crosslinker para-benzoyl-L-phenylalanine. At the bottom, C-terminal sequence stretch of Bpa with the GQYL motif highlighted in red. The asterisk indicates introduction of the amber codon for incorporation of the unnatural amino acid crosslinker para-benzoyl-L-phenylalanine. To the right, Coomassie stained SDS gel of the Bpa-CP sample before crosslinking (0), after 4 min crosslinking (X) and after pu- rification across a Strep-Tactin resin (El). The flow-through of the Strep-Tactin purification is also shown (Ft). B) Representative cryo-EM micrograph of the specimen generated from the elution fraction in panel A at a nominal defocus of 3.1 µm. C) Representative 2D class averages of the second classification round (top views and ice crystals were removed in the first round). Classes in lanes I–III represent deformed particles (4 844), CP with partially disconnected Bpa (14 371) or CP without Bpa (26 763), respectively, and were excluded from 3D reconstruction, while singly capped (37 417) (IV) and bi-capped (18 296) (V) particles were retained.

The remaining particles were submitted to 3D classification into eight classes, and complexes with distorted features were rejected (Figure 3.S6). Because no subclasses with the Bpa ring adopting a defined orientation relative to the CP were obtained, a consensus model averaging the complete data set was calculated in order to determine the ensemble maximum likelihood orientation for all particles. According to the 2D class averages, our dataset contained some particles where the Bpa ring was displaced from its coaxially stacked position on the α-rings. To remove these misaligned particles, the 3D reconstruction was further sorted for coplanar Bpa rings. Classes with misaligned rings could be easily identified, since

27 they feature smeared out density around the Bpa ring due to the off-center axis (Figure 3.S6, strategy I). This approach yielded a 4.6 Å near-atomic model for the CP according to gold standard Fourier shell correlation (FSC) (Figure 3.S7A, B). Although Bpa and the interface with CP were resolved to only ≈10 Å in the EM-map, which precluded fitting of individual Bpa subunits (Figure 3.S7A, B), the dimension of the Bpa ring precisely matches that of the dodecameric Bpa model from the crystal structure. We were therefore able to assemble a composite model by docking both the crystal structure of Bpa (this study) and of the proteasome complex (pdb 3mi0,(36 )) into the EM map, leaving a gap of about 20 Å between the last observed Bpa residue in the crystal structure (L148) and the α-ring face of the CP (Figure 3.5A, B). The 26 C-terminal residues of Bpa (148–174) not present in the crystal structure model are sufficient to cover this distance and to reach into the binding pockets between the two α-ring subunits, according to the Gaussian chain end-to-end distance model for a random coil (97 ). In the EM reconstruction without 7-fold symmetry imposed, Bpa appears to de- viate from a perfect circular shape when in complex with the CP (Figure 3.5A, B), in contrast to the crystal structure. This could be a consequence of the symmetry mismatch between Bpa and the CP (Figure 3.5B).

Cryo-EM reconstruction of the Bpa-20S interface to 3.5 Å resolution reveals density for the C-terminal interaction motif of Bpa

The C-terminal GQYL motif of Bpa was shown to mediate the interaction with the CP (48 ). To characterize this interaction structurally, we aimed at visualizing the Bpa/α-ring interface at higher resolution. For a better signal to noise ratio, the Bpa-proteasome complex map was reconstructed with imposed C7 symmetry while at the same time applying a mask to the complex that excludes the core of the symmetry-mismatched Bpa ring. With this approach any contribution of Bpa-derived density at the proteasomal α-rings will remain untouched, so that the GQYL interaction motif bound to the CP may be visualized. In order to ensure that extra density observed in the CP α-rings originates from the Bpa C-terminal tails, CP α-ring faces occupied with Bpa are compared with unoccupied CP α-ring faces. For the comparison bicapped CP particles were excluded from the reconstruction, leaving only CP complexes associated with Bpa on one end of the cylinder but not on the other. The CP α-ring proximal to Bpa was then compared

28 3 Structural analysis of Bpa in complex with the 20S proteasome

A 100 B 40

56

20

C-terminus of Bpa H4 C

160

extra density

125

Figure 3.5: Cryo-EM density map of the Bpa-CP complex. A) Side view of Bpa in complex with CP. The EM density was low pass filtered to 8 Å, and the crystal structures of Bpa (turquoise, this study) and the refined proteasome model (salmon, based on pdb 3mi0) were docked into the density. Dimensions are indicated in Å. B) Top view of the particle in A. C) The high resolution EM map of the Bpa-CP complex is displayed as grey surface. The FO−FC Fourier difference density map obtained by rigid-body refinement of the proteasome model from A (salmon) into the high resolution EM map (grey) is shown in yellow, revealing unassigned areas that originate from Bpa. The last few residues of the C-terminal helices visible in the Bpa crystal structure are indicated in turquoise in the positions they occupy in A and B. to the distal α-ring that is not capped with Bpa (Figure 3.S6, strategy II). Our analysis confirms that additional density is only detected at the CP cylinder end that is capped by Bpa (Figure 3.S8A). In order to obtain a model of the Bpa-activated proteasome, we performed rigid body fitting of the proteasomal protomers of a previously solved Mtb CP crys- tal structure (pdb 3mi0,(34 , 36 )) into our 3.5 Å (FSC 0.143 criterion) EM map

29 (Figure 3.S7C, D, Table 3.S2), resulting in a slight reorientation of the α-ring pro- tomers, while the β-ring subunits remain in an unchanged position. The fit of the α-rings could be further improved by independent docking of helix H0, which resulted in a retraction of the H0s from the ring-pore, resembling the opening of a seven-bladed diaphragm. Although the movement is slight, it could indicate opening of the α-ring gate upon binding of Bpa. We therefore compared the con- formation from the rigid body fit of the Bpa-bound α-ring to the fit of the α-ring of the non-engaged side (Figure 3.S8C). The superimposition shows that the un- engaged α-rings already exhibit a rearrangement of the H0 diaphragm towards the open state, albeit to a lesser extent. This could be due to the fact that a variant of the CP (∆7CPT1A) was used for the EM studies, which is presumably already partially open. Indeed, the crystal structure of wild type Mtb CP, which is in the closed state exhibits a much more closed conformation of the H0 diaphragm (Fig- ure 3.S8B), although functional conclusions should be drawn with caution, because the H0 helices are involved in crystal contacts in the available X-ray structures. A movement of H0 into a fully upright position, bringing it to a 90◦ angle with re- spect to the α-ring plane, as encountered in the Mtb proteasome crystal structure (36 ), was not observed in our EM structure (36 ). Comparison of the model with the experimental EM map revealed volumes of extra density between the protomers of the proximal α-ring, which is in contact with Bpa, but not in the distal α-ring (Figure 3.5C and Figure 3.S8A, yellow). Strikingly, when viewed from the top, these seven extra density volumes form a ring with a diameter that matches the circular distribution of the H4 helices in the docked Bpa dodecamer from which the C-terminal tails extend, further supporting the conclusion that the density derives from the C-terminal GQYL motif of Bpa (Figure 3.5C). Indeed, a similar difference density was previously observed for the archaeal proteasome when it was incubated together with synthetic activation peptides corresponding to the C-terminal hydrophobic-tyrosine-X (HbYX) motif of the archaeal proteasomal ATPase PAN (98 ). Additional density was also observed in the catalytic chamber close to the active sites of the proteasome (Figure 3.S8). It is likely that small peptides that diffused into the proteasome are responsible for this observation. A similar effect has been noted previously (99 ). To interpret the density representing the suspected Bpa GQYL motif, we su- perimposed the crystal structure of eukaryotic proteasomal interactor Blm10 in complex with the S. cerevisiae proteasome (pdb 4v7o,(29 )) onto the α-subunits

30 3 Structural analysis of Bpa in complex with the 20S proteasome

A H0 B H0 H0

R26 H0 H0 Y2141 Y2142 H0

A2143 K52

density from Bpa exit turn C D H0 H0 H0 straight exit H0 R26 R26 G171

Y173 Y173 G171 Q172 L174 K52 K52 Q172 L174

Figure 3.6: The C-terminal GQYL motif of Bpa gives rise to clearly discernable density in the proteasome α-rings and can be modeled based on the Blm10 HbYX motif. A) The EM map (grey surface) exhibits additional density between the interfaces of adjacent proteasome α-subunits, which can be attributed to the C-terminal interaction motif of Bpa (turquoise density). Density of the α-subunit residue R26 protrudes from the base of the H0 helices, partially covering the interactor density. B) The functionally analogous C-terminal HbYX interaction motif of eukaryotic Blm10 (dark yellow, pdb 4v7o,(29 )) can be fitted into the EM-docked proteasome model of Mtb with good congruence. Placement of the Blm10 C-terminus is achieved by rigid body alignment of yeast proteasome α5-subunit with the Mtb proteasome α-subunit (salmon). C) Based on the analogous placement of Blm10, a model of the GQYL motif of Bpa (turquoise) could be built into the Mtb proteasome EM map (mesh). D) The same view as in B for the Bpa GQYL motif. of our own Mtb model, which positions the tyrosine of the Blm10 HbYX motif precisely into our unassigned density (Figure 3.6A, B and D). Using the conserved tyrosine of Blm10 as a start, we were able to trace the backbone of the last five amino acids of Bpa and could establish the side chain conformations of the last three residues (Figure 3.6C). On the proteasomal side, the conserved K52 of the α-subunits features a well resolved density, indicating the formation of a salt bridge with the C-terminal car-

31 boxylate of Bpa (Figure 3.6C). The corresponding residue was previously shown to be critical for binding of the proteasomal interactors in S. cerevisiae and T. aci- dophilum (K66 in both) (30 , 31 , 45 , 99 ). At the C-terminal end of helix H0, we observe strong side chain density protruding above residue Y173 of Bpa, which represents R26 of the α-ring and suggests a planar stacking interaction with the tyrosine of the GQYL motif (100 ). This arginine is well conserved in bacteria but only weakly in eukaryotic proteasomal α-subunits, where at the equivalent position sometimes a lysine engages in cation-π interaction with the HbYX motif (99 ).

DISCUSSION

While it has been known for a long time that the 20S core particle of eukaryotes can associate with different activator complexes to provide access to the proteolytic chamber for a variety of degradation substrates, this is a very recent discovery for bacterial proteasomes (48 , 49 ). In bacteria, recruitment of proteins for protea- somal degradation was largely assumed to be mediated by a post-translational modification that functionally, though not structurally, shows parallels to ubiqui- tin (66 , 101 ). With the discovery of the ATP-independent bacterial proteasome activator Bpa (also referred to as PafE), this notion had to be amended to include a new group of substrates that do not require tagging and are degraded without the expense of ATP (9 , 48 , 49 ). Two families of ATP-independent activators have been de- scribed in lower and higher eukaryotes, the oligomeric 11S activators (PA26 or PA28) and the large proteins of the Blm10 type (PA200) (27 , 102 , 103 ). Proteins of both families are all-α-helical, a feature they apparently also share with Bpa (29 , 42 , 87 ). Architecturally, Bpa shows more similarity to the 11S rings, where small helix-bundle protomers are arranged into a toroidal structure. Interestingly, however, the 11S family of ATP-independent interactors exhibits a 7-fold rotational symmetry that matches that of the CP cylinder, resulting in 7-fold symmetrical 11S-CP complexes (102 ). In contrast, association of Bpa with the CP involves the binding of the 12-fold rotational Bpa ring to the 7-fold rotational proteaso- mal α-ring, resulting in a symmetry-mismatch between Bpa and the CP particle and twelve Bpa GQYL motifs ready to interact with only seven binding pockets on the proteasomal α-rings. In light of the hexameric proteasomal ATPases and the heptameric non-ATPase activators, the dodecameric assembly of Bpa appears

32 3 Structural analysis of Bpa in complex with the 20S proteasome surprising. One possible explanation for this observation is avidity, as the twelve flexibly tethered GQYL motifs can replace each other readily due to close proxim- ity. However, our cryo-EM model suggests another convincing reason, namely one of assembly geometry. As can be seen in Figure 3.5C, the dodecameric assembly generates just the right diameter of the ring to position the C-terminal helical ex- tensions of Bpa on the same perimeter as the binding pockets of the proteasome α-rings in the Bpa-CP complex.

Another consequence of the large number of protomers in the Bpa ring is the 40 Å diameter of the central pore, which is wide compared to the eukaryotic 11S ac- tivators (20 Å) and Blm10 (18×9 Å) (29 , 104 ). The overall shape of Bpa resembles a funnel with a large opening through which substrates can enter, a bowl-shaped “chamber” and a narrow outlet formed by the proteasomal α-rings that represent a gate with a mere 23 Å diameter, even when opened.

The combination of the wide, chamber-forming architecture of Bpa and rela- tively narrow opening of the proteasome suggests a different role for this bacterial activator than those played by the eukaryotic counterparts. Although the function of eukaryotic ATP-independent activators is still not understood very well, sev- eral studies suggest a role in degradation of peptides and very small, unstructured proteins (24 , 105 ). While Bpa ultimately has to rely on the diffusion of unstruc- tured segments of a protein substrate through the CP gate as well, the Bpa ring itself could accommodate small proteins in the 10–30 kD range and might serve to cradle the substrate protein close to the CP entrance, increasing the likelihood that poorly structured segments of the protein, such as loops or unstructured ter- mini, will insert into the degradation chamber (Figure 3.7). Bpa may thus act as a molecular funnel that allows proteins to enter into the 20S degradation chamber. This stands in contrast to the mechanistic principles of the Mpa-CP degradation complex. Mpa features a much smaller pore for substrate entry, but capture of substrates is achieved by the long and flexible coiled-coil domains that reach out from the top rim of the pore and form a three-stranded coiled-coil with Pup, where Pup provides the third strand (Figure 3.7). The antiparallel orientation of Pup in the coiled-coil points the unstructured N-terminal segment of Pup directly into the pore, where it is captured by the ATPase-driven pore loops. The two mechanisms of substrate recruitment, active targeted delivery of substrates into the pore pow- ered by ATPase activity versus conformational selection and diffusion-mediated

33 partially disordered pupylated proteins (e.g. HspR) (e.g. PanB-Pup)

Bpa12 Mpa6

α α7 7

β β 7 7

β β 7 7

α α 7 7

Figure 3.7: Comparison of passive and active proteasomal targeting in bacteria. Bpa (turquoise) interacts with the proteasomal core particle (salmon) to function like a molecular funnel that recruits substrates by conformational selection and diffusion-mediated entry. Without actively engaging substrate (grey), the funnel-shaped architecture of Bpa increases the likelihood for poorly folded substrate proteins to enter the proteolytic chamber. In contrast, the ATPase Mpa (orange) engages pupylated (crimson) substrate proteins (grey) by specifically interacting with the recruitment tag via dedicated coiled-coil domains and then translocates them into the proteasome chamber in an active ATP-driven process. entry, are reflected in the different architectural features of the two proteasome interactors. Our high resolution cryo-EM reconstruction of the proteasome core permitted visualization of density originating from the C-terminal tails of Bpa that are in- serted into the docking sites between the proteasomal α-subunits (Figure 3.5C and Figure 3.6). This density can be interpreted by fitting it to a model of the GQYL motif aided by available structural information on the eukaryotic Blm10 C-terminal HbYX motif in complex with the proteasome, where the HbYX motif performs the analogous function as a docking module that attaches the regulator to the protea- some. The similarities include the conserved salt-bridge between the C-terminal carboxylate of Bpa and a conserved lysine residue in the proteasome pocket (K52), demonstrating a common strategy for docking of regulator complexes to the protea- some core across all three kingdoms. Small differences exist however in position −3 from the C-terminus (Q of the GQYL motif), as this residue is oriented downwards away from helix H0 rather than up as is the case of the equivalent hydrophobic

34 3 Structural analysis of Bpa in complex with the 20S proteasome residue (Hb) in the HbYX motif. This provides a plausible structural explana- tion for the differences in amino acid composition between the eukaryotic/archaeal HbYX-type and bacterial GQYL proteasome recognition motifs: In the HbYX- type interaction, the upward oriented hydrophobic side chain stacks between two proteasomal α-subunits such that the backbone of the interactor leaves the α-ring interface through a characteristic turn (Figure 3.6B–D) (42 , 45 ). In contrast, our structure reveals that the corresponding, less-well conserved glutamine side chain of the GQYL motif points “downwards” away from the H0 helices, which leads to a more direct exit of the C-terminal segment out of the binding pocket without the formation of the turn observed in Blm10 (29 ). Although the local resolution is not high enough to model the main chain conformation unambiguously, the rotational flexibility necessary to obey the observed density is likely provided by the strictly conserved G171, which has no counterpart in the HbYX motif. While Bpa offers an excess of C-terminal binding motifs over proteasomal bind- ing pockets, it nevertheless appears from the EM structure of the non-symmetrized complex, that Bpa is tethered more strongly to one side of the proteasome α-ring, apparent in the upward tilting of the Bpa ring for some of the particles. This sug- gests that not all binding pockets are simultaneously occupied. As demonstrated by several of the eukaryotic and archeal proteasomal activators (29 , 86 , 106 ), substoichiometric occupancy is usually sufficient to mediate interaction. As most of the visualized core particles were singly capped by Bpa in our cryo- electron micrographs, it was possible to compare the Bpa-associated side of the proteasome with the Bpa-free side to deduce possible conformational rearrange- ments induced by Bpa binding. Although we can detect a slight movement of the H0 helices away from the pore in the Bpa bound conformation, the changes are too small to represent gate-opening. This is not surprising, since the variant of the CP chosen to stabilize complex formation is already in a largely open state. Nev- ertheless, an iris-like opening of the pore by concerted movement of the H0 helix C-terminal ends to a more peripheral position might well be part of a mechanism for gate-opening. Regardless of the interpretation of the lateral movements of H0, we can exclude a gate conformation where one or several H0 helices swing upward, as was suggested previously as a conceivable model for gate-opening based on the crystal structure (36 ). It is possible that the upturned H0 helix conformation is only present in the context of the crystal, as it is involved in crystal contacts, and therefore may not reflect a functional state in solution.

35 In summary, the structure of the Bpa-proteasome complex provides intriguing views into the architecture of an alternate proteasome interactor in Mtb and sug- gests a rather different mode of substrate processing than observed with Mpa and also different from the mechanism employed by the eukaryotic proteasome activa- tors. Further studies into the biological role played by the Bpa-proteasome pathway will shed more light on the reasons for the observed architectural differences, and identification of a larger set of substrate proteins might help characterize the key interactions between Bpa and its substrates. It will be of particular interest to establish, if both proteasome complexes operate concomitantly or if the passive and active mode of the CP represents a distinct state in the actinobacterial life cycle.

EXPERIMENTAL PROCEDURES

CD spectroscopy

CD spectra of Bpa (62.2 µM protomer) and HisBpa36–159 (37.8 µM protomer) in 10 mM Na/K-phosphate buffer pH 7 were measured in a 0.1 cm quartz cuvette ◦ on a Jasco J-715 spectropolarimeter (Jasco, Tokyo) at 25 C. The ellipticity (Θobs in mdeg) was recorded between 200 nm and 240 nm in 0.5 nm steps at a speed of 20 nm/min. Displayed spectra are an average of five scans and expressed as molar 2 −1 ellipticity per residue [Θ]MRW in deg cm dmol ([Θ]MRW = Θobs / (10clN ) where c is the molar protein concentration, l is the path length of the cuvette in cm, and N represents the number of amino acids).

Cloning, expression and purification of proteins

Escherichia coli Rosetta or BL21 was transformed with a modified pET24 vec- tor (71 ) containing the bpa gene variants resulting in constructs summarized in Figure 3.1C. Cells were grown at 37 ◦C and protein expression was induced at an

OD600 of 0.8 by addition of isopropyl-β-D-1-thiogalactopyranoside (IPTG) to a final concentration of 0.5 mM and further cultivated overnight at 23 ◦C. Cells were lysed using a microfluidizer M110-L with a H10Z 100 µm chamber (Microfluidics) in buffer L (50 mM HEPES-NaOH pH 8 at 23 ◦C, 500 mM NaCl). After centrifugation at 45 krpm and 4 ◦C for 1 h (rotor: Type70Ti), the expressed protein was purified from the supernatant by affinity chromatography using a 5 mL Hi-Trap IMAC col-

36 3 Structural analysis of Bpa in complex with the 20S proteasome umn (GE Healthcare Life Sciences) charged with Ni2+ pre-equilibrated in buffer L supplemented with 40 mM imidazole. After washing the protein-containing column with 50 mL equilibration buffer, the protein was eluted with buffer L supplemented with 300 mM imidazole. Protein-containing fractions were pooled and in the case of constructs Bpa and Bpa36–159 dialyzed for 1 h at 4 ◦C against buffer (50 mM HEPES-NaOH pH 8 at 4 ◦C, 150 mM NaCl). The N-terminal histidine tag was cleaved at the tobacco etch virus (TEV) cleavage site by addition of his-tagged TEV protease and further dialysis for 15 h. TEV protease was removed via affinity chromatography. All proteins were further purified by size exclusion chromatog- raphy using a Superdex200 column in 20 mM HEPES-NaOH pH 8 at 20 ◦C and 50 mM NaCl. Selenomethionine-labeled HisBpa36–159 was prepared as described by Van Duyne

(107 ). In addition, the M9 medium contained 3 mg/mL Fe2SO4. All buffers con- tained 1–5 mM tris(2-carboxyethyl)phosphine (TCEP). To design Bpa variants for site-specific crosslinking, the bpa nucleotide se- quence was searched for amino acid codons close to the C-terminus, outside the strictly conserved region and with a high chance of quantitative incorporation of the photo-activatable tyrosine analog cross-linker para-benzoyl-L-phenylalanine (pBpF, Bachem) in response to the amber codon. The latter condition is fulfilled by codons preceding a codon that starts with adenine or cytosine (96 ). Four posi- tions: H166, H168, G169 and T170 met these criteria and were therefore changed to the amber stop codon by site directed mutagenesis. All four reached a level of incorporation of approximately 80 %. Subsequent testing revealed T170* as the most successful in cross-linking.

An E. coli BL21 strain was co transformed with pET-24b containing BpaT170* and pEVOL-pBpF (95 ). 100 mL Luria brooth culture containing 100 µg/mL kanamycin and 37 µg/mL chloramphenicol was inoculated with 0.1 mL saturated ◦ preculture and grown at 37 C. At an OD600 of 1.0 protein expression and artificial tyrosine analog incorporation was induced simultaneously by supplementing the culture with 1 mM IPTG, 0.02 % L-arabinose and 1 mM pBpF (from stock 1 M in 1 M NaOH). The temperature was reduced to 20 ◦C and the expression was continued overnight. Cells were harvested by centrifugation at 4500 g for 20 min, resuspended in buffer S- (50 mM HEPES-NaOH pH 7.5 at 4 ◦C, 150 mM NaCl, 10 % (v/v) glycerol) and lysed by three passes through the microfluidizer (Mi- crofluidics M-110L). BpaT170* was purified by Ni-NTA affinity chromatography in

37 gravity flow, the buffer was exchanged to S (S- supplemented with 1 mM EDTA) by dialysis and the protein was concentrated using an Amicon spin column with a 10 kDa molecular weight cut-off.

The PrcBT1A variant was constructed by site directed mutagenesis and expressed and purified as previously described (46 ).

Crystallization of Bpa

Crystallization was carried out in sitting drop vapor diffusion plates at a pro- tein concentration of 15–20 mg/mL at 20 ◦C by mixing 1 µL of protein solu- tion with 2 µL of reservoir. Bpa36–159 formed hexagonal plate-shaped crystals (200×200×50 µm) in reservoir solutions consisting of 2–4 % (v/v) propane-2-ol, ◦ 100 mM HEPES-NaOH pH 7.5–8.2 at 20 C and 150–300 mM MgCl2. Before flash cooling the crystals with liquid nitrogen, glycerol was added in 5 % (v/v) steps to a final concentration of 30 % (v/v) by using reservoir solution supplemented with glycerol. Selenomethionine-incorporated HisBpa36–159 formed hexagonal plate- shaped crystals with rounded edges (150×150×40 µm) in reservoir solutions con- sisting of 5–7 % (v/v) propane-1,2-diol, 100 mM HEPES-NaOH pH 8.0–8.2 at 20 ◦C and 200 mM MgCl2. Before flash cooling the crystals with liquid nitrogen, propane- 1,2-diol was increased in 5 % (v/v) steps to a final concentration of 30 % (v/v) by using reservoir solution supplemented with propane-1,2-diol.

Data collection, experimental phasing, structure determination, model building and refinement

Native data sets of Bpa36–159 and SAD data sets of selenomethionine-incorporated HisBpa36–159 were collected at beamline X06SA of the Swiss Light Source (Paul Scherrer Institut, Villigen, Switzerland), and data were indexed and integrated using XDS (108 ). Initial analysis of data was performed using POINTLESS (109 ) and phenix.xtriage (110 ). Scaling was subsequently done by AIMLESS (111 ). Bpa36–159 crystals suffered from lattice-translocation defects as indicated by al- ternating sharp and diffuse reflections (Figure 3.S1A, C) and prominent packing peaks in the native Patterson map (Figure 3.S1E). The intensities were corrected for the effects of translocation disorder by applying the methods described by Wang and Kamtekar (112 ).

38 3 Structural analysis of Bpa in complex with the 20S proteasome

Selenium sites were localized in space group P6322 by SHELXC/D/E (113 ). The symmetry of the sites indicated a dodecameric ring structure, with two out of the four possible Se-Met positions per molecule ordered, resulting in a packing of four molecules in the ASU and a solvent content of ≈47 % in the crystal. Using these sites, experimental phasing was performed using the SAD pipeline imple- mented in PHASER-EP from the CCP4 suite (114 ). After solvent flipping of the initial phases using CNS (115 ), α-helices occurred as weak rod-like density fea- tures, from which the NCS operators between the molecules in the asymmetric unit could be precisely determined. A non-overlapping mask suitable for four-fold NCS symmetry averaging was be generated using NCSMASK in CCP4 (Figure 3.S2A, C). A major improvement of the experimental phases could then be achieved us- ing a combination of solvent flipping and 4-fold NCS averaging with CNS, which resulted in readily interpretable experimental electron density maps that revealed the main-chain connectivity and most protein side chain features (Figure 3.S2B, D–F). Although the calculation of an anomalous difference Fourier map using the improved phases (Figure 3.S3A) indicated that one of the Se-Met in each molecule adopts alternative conformations, inclusion of these sites during phasing did not further improve the appearance of the experimental maps.

An initial model was built including the Se-Met positions as a guide, and the model was further improved by iterative model building in COOT (116 ) or O (117 ) and refinement in PHENIX (110 ). Due to the increased R-values obtained in space group P6322, which might be the result of an NCS emulating a higher crystallo- graphic symmetry or a translational lattice defect, the model was refined against the same data processed in space group P3 with 16 monomers in the ASU and sim- ilar spatial arrangement of the dodecameric rings, resulting in improved R-values and extra density in the area of the C-terminal helix (Table 3.S1). An analysis of the crystal packing revealed a continuous lattice along all crystallographic axes, with no obvious gaps occurring (Figure 3.S3B, C), which is in contrast to a recent report (87 ). The area of weaker electron density represents the long C-terminal he- lices of the molecule, which are flexibly disposed and interdigitate with the opposed dodecameric ring (Figure 3.S4). Amino acids beyond residue L137 or Q140 were built as polyalanine backbone trace in the P6322 and P3 structures, respectively.

The resulting model was then used for molecular replacement in PHASER-MR (114 ) in the better resolved native dataset with space group R32:h, revealing a

39 packing with four protomers in the ASU. The R32:h model was further refined and used in this manuscript unless stated otherwise.

Bpa proteasome affinity estimate with fluorescein labeled β-casein

Purified β-casein (from casein from bovine milk mix, Sigma) was labeled with NHS-fluorescein (Thermo Scientific) according to the manufacturer’s protocol. An average of 1.4 fluorescein labels per β-casein molecule was estimated. The protea- some complex (0.2 µM) was incubated with 0–76.8 µM Bpa protomer (0–6.4 µM Bpa complex) for 15 min in buffer S. The reaction was started by addition of 20 µM β-casein, with a fluorescein labeling ratio of approximately 0.5. The change in fluorescence (Ex485/20, Em528/20) was monitored using a Synergy 2 (BioTek) plate reader at 37 ◦C. Initial velocities were estimated by a linear fit to the first 10 min of the time trace. The measurements were performed in triplicate, the basal activity of CP was subtracted and the data was fitted to the function: TON([Bpa]) = vmax×[Bpa]/(K d+ [Bpa]) where [Bpa] indicates

Bpa protomer concentration to estimate apparent K d and vmax of the Bpa-CP complex. The kcat values were estimated by dividing the mean total fluores- cence signal increase after reaching saturation by the total protein concentration

(F(tsat)−F(t0))/([CP][Casein]∆t).

Casein degradation with Bpa- and Mpa-CP

A master mix in buffer S-, containing WT CP (0.2 µM), β-casein (27 µM), 10 mM

ATP, 20 mM MgCl2, 80 mM creatine phosphate and 1 mU/µl creatine phosphoki- nase was prepared, and the reaction was started by addition of either Mpa (0.5 µM hexamer), Bpa (0.5 µM dodecamer) or buffer S-. Samples were drawn at the indi- cated time points and quenched in SDS loading dye. The complete reaction was performed at 37 ◦C.

UV cross-linking of Bpa and CP

240 µM BpaT170* was incubated with 1 µM of open-gate active site mutant CP

(∆7PrcABT1A) in buffer S- for 30 min on ice. The complex was applied to an agarose gel imaging system (GelLogic 212 pro) in aliquots of 25 µl in thin wall PCR tubes for long wavelength UV (peak around 350 nm) irradiation for 4 min. Then, aliquots were pooled and the proteasome was purified by Strep-Tactin tag

40 3 Structural analysis of Bpa in complex with the 20S proteasome

purification. The BpaT170*-CP complex was eluted with 2.5 mM desthiobiotin in buffer EmG (50 mM HEPES-NaOH, 150 mM NaCl, pH 7.5 at 23 ◦C).

EM specimen preparation and imaging

Negatively stained Bpa-CP complexes: Carbon-coated 300 mesh copper grids

(Quantifoil) were glow discharged at 25 mA for 45 s. BpaT170*-CP complex was diluted to 110 nM immediately prior to application to the grid. After 2 min excess liquid was blotted away and the specimen was stained with 1 % (w/v) aqueous uranyl acetate for 45 s. A FEI Morgagni 268 transmission electron microscope was operated at 100 kV and a nominal magnification of 36 000 with an applied defocus of −1.2 µm for imaging. Cryo-EM grid preparation: Holey carbon copper grids (Quantifoil R 2/2) were covered with a thin carbon layer for cryo-EM specimen preparation. The grids were glow discharged for 22 s at 25 mA. A mixture of 0.5 µM BpaT170*-CP and 12 µM Bpa was rapidly diluted 5:1 in EmG immediately prior to grid application. The sample was incubated on the grid for 45 s and blotted for 18 s using a Vitrobot (FEI company) set to 95 % humidity, 7.5 ◦C and plunged into liquid ethane. The data set was automatically recorded with EPU software on a Titan Krios cryo-transmission electron microscope (FEI company) equipped with an Falcon II direct electron detector (FEI). The Microscope was operated at 300 kV and 100 000-fold magnification within an applied defocus range from −1.0 µm to −3.6 µm resulting in a pixel size of 1.4 Å on the object scale. Images were recorded as seven separate frames, with a total dosage of 25 e−/Å2. The images were aligned with MotionCorr (118 ).

Data processing and refinement

CTF was estimated using CTFFIND3 (119 ) and micrographs were rejected based on the extent and regularity of Thon rings. Single particle images were windowed out with batchboxer from EMAN (120 ) on four fold binned images resulting in 212 176 particles and used in a 2D and 3D refinement with RELION (121 , 122 ). Two rounds of reference free 2D classifications were performed, in the first round only side views of proteasomal particles were retained which yielded 108 815 side views of proteasomal particle. In the second round we selected 55 713 Bpa-CP particles displaying density for at least one Bpa ring. An initial 3D structure was

41 obtained in RELION using the 3D classification routine with one class imposing D7 symmetry. As initial reference we used a reconstruction of the Bpa-CP com- plex low pass filtered to 60 Å obtained by negative stain EM, which in turn was obtained with the T. acidophilum proteasome as reference (EMDB 5623,(118 )). The particles were subsequently classified in eight 3D classes without imposing any symmetry to remove distorted particles and to assess global heterogeneity. For subsequent 3D reconstructions a mask was used including the proteasome and one Bpa ring. This resulted in alignment of all singly-capped particles with the capped side pointing in the same direction. Binning was released stepwise and the remaining 54 424 particles were aligned imposing C7 symmetries. Subsequently two different sorting strategies were implemented (see masks in Figure 3.S6). Strategy I used a mask for the Bpa ring excluding the proteasome core, to focus on the Bpa ring conformation. Particles that display excessive tilting of the Bpa ring with respect to the coaxial placement on the proteasome were removed. Strategy II on the other hand aims at visualizing the proteasome core itself and at comparting the Bpa-capped α-rings with the Bpa-free α-rings. The sorting step in strategy II therefore removes all bicapped particles. Perfect coaxial alignment of the Bpa ring and the proteasome, however, is not required in this case, since only the pro- teasome and the Bpa-docking portion in the α-rings is visualized, so that axially slightly misaligned complexes were not removed in this sorting step (Figure 3.S6). The particles (33 652 in I and 48 799 in II) were then used in an autorefinement according to “gold-standard” protocol (122 ) without imposing symmetries (I) or imposing C7 symmetries (II) to result in a map with the estimated average of 4.6 Å (I) and 3.5 Å (II), respectively according to the FSC 0.143 criterion. Local resolu- tion was estimated for the unfiltered reconstruction with ResMap (123 ). Volumes were visualized using UCSF Chimera (124 ). The model of CP complex with Bpa was based on pdb 3mi0( 36 ) and GQYL motif was placed based on Blm10 HbYX terminus from pdb 4v7o( 29 ) by aligning the α-subunits. The H0 helix was re- aligned and the GQYL motif residues were adjusted manually in O (117 ). All the proteasomal chains (excluding Bpa chains) were refined with PHENIX as rigid bodies.

42 3 Structural analysis of Bpa in complex with the 20S proteasome

Accession codes

Protein Data Bank: Structure factors and atomic coordinates for the Bpa35–159 His 2.6 Å R32:h, Bpa35–159 3.5 Å P3 and 3.3 Å P6322 structures have been de- posited with accession codes 5lfj, 5lfq and 5lfp, respectively. Proteasomal model coordinates have been deposited with pdb code 5lzp. Electron Microscopy Data Bank: The EM map of the asymmetric proteasome was deposited as EMDB 4127 and of the C7 particle as EMDB 4128.

AUTHOR CONTRIBUTIONS

MB, ML, NB and EWB conceived and analyzed all crystallization experiments, CD, DB, NB and EWB designed and analyzed all electron microscopy experiments. MB, CD and EWB wrote the manuscript. MB and CD cloned and purified all constructs and performed the non-structural experiments, CD and DB performed the EM measurements, MB the crystallization experiments.

ACKNOWLEDGEMENTS

We thank Achmad Jomaa for valuable comments and discussions. We thank B. Blattmann and C. Stutz-Ducommun for support with initial screening and the ScopeM facility staff for help with EM data collection. We acknowledge the staff of X06SA at the Swiss Light Source (SLS, Paul Scherrer Institut, Villigen, Switzer- land) for support with data collection. We thank Jimin Wang for sharing his For- tran code of LTD program. This work was supported by the Swiss National Science Foundation (SNF), the National Center for Excellence in Research (NCCR) Struc- tural Biology program of the SNF and an ETH research grant. The authors declare that they have no conflict of interest.

SUPPLEMENT

43 A B 2.22Å 2.22Å

2.71Å 2.71Å 3.53Å 3.53Å 5.20Å 5.20Å 10.30Å 10.30Å

C 2.6Å D 2.6Å c* b*

b* a*

1.00 1.00 E uncorrected F corrected 0.66 0.66

V 0.33 V 0.33

0.00 0.00

U U 0.33 0.33

0.66 0.66

1.00 1.00

Figure 3.S1: Diffraction data show that Bpa crystals suffer from a lattice translocation defect. A) X-ray diffraction pattern from a Bpa36–159 crystal showing a combination of sharp and diffuse reflections. B) An image from the crystal in A rotated by 90◦shows only sharp Bragg reflections. C) Pseudo-precession photograph of 0,k,l layer generated by LABELIT (125 ). Along the c* axis typical Bragg spots are observed for index l = 3n. Streaky features occur for index l =6 3n. D) Pseudo-precession photograph of h,k,0 layer generated by LABELIT. All spots appear as typical Bragg spots. E) Native uncorrected Patterson map (scaled from 0–100) showing a non-origin peak at (2/3, 1/3, 0). Contour lines starting from 3 σ level above mean density in steps of 3 σ levels. F) Native corrected (κ = 0.17) Patterson map (scaled from 0–100) no longer showing non-origin peak. Contour levels as in E.

44 3 Structural analysis of Bpa in complex with the 20S proteasome

A B

C D

E F

Figure 3.S2: Improvement of the experimental phases in space group P6322 using 4-fold NCS averaging during the solvent flipping procedure. A, B) Overview of the experimentally determined electron density maps, calculated after im- provement of the heavy atom SAD phases by solvent flipping without (A) or by includ- ing 4-fold NCS averaging (B). The initially positioned helices corresponding to the four molecules in the ASU are shown in different colors. They were used to generate an aver- aging mask covering one ASU (green). C, D) Detailed views of the maps shown in A and B either without (C) or with (D) applying 4-fold NCS averaging. E, F) The resulting experimental maps allowed tracing of the full main chain and establishing the correct side chain register, also using the positions of the Se-Met sites as a guide (corresponding anomalous difference Fourier map shown in yellow at 4 σ contour level).

45 A

B

x

y z

C z

y x

Figure 3.S3: Heavy atom phasing and structure determination in space group P6322. A) Anomalous difference Fourier map (contoured at 4 σ level) showing the location of the Se-Met sites, which occur as rings. The map was calculated using the improved experi- mental phases after solvent flipping and NCS averaging. For orientation, the backbone of the refined structure in space group P6322 is shown, with the four molecules in the ASU depicted in different colors. B, C) Solvent flipped and NCS averaged experimental electron density maps (contoured at 1.25 σ level) showing the continuous packing of the rings in a view along the z- and x-axes. The apparent areas of weak density in C represent the region where two opposed dodecameric rings interdigitate via the long C-terminal helices in the protomers.

46 3 Structural analysis of Bpa in complex with the 20S proteasome

Figure 3.S4: Model of HisBpa36–159 refined in space group P3. Two opposed dodecameric rings of the structure are colored according to the B-values from blue (low value) to red (high value). The C-terminal helices of the monomers from both rings interdigitate, however they are partially flexible, as indicated by the increased B-values and weak electron density, which occurs as gaps at higher contour levels of the maps (Figure 3.S3C).

A B

Figure 3.S5: Model of Bpa36–159 refined in space group R32:h. Two opposed dodecameric rings of the structure are colored according to the B-values from blue (low value) to red (high value). The C-terminal helices of the monomers from both rings interdigitate, however they are partially flexible, as indicated by the increased B-values.

47 Table 3.S1: Data collection, phasing and refinement statistics.

HisBpa36–159 HisBpa36–159 Bpa36–159 Bpa36–159 (SeMet)a (SeMet)a corrected for LTD PDB 5lfq 5lfp 5lfj Crystal form Space group (Å) P 3 P 63 2 2 R 3 2 :H R 3 2 :H Unit cell 100.854 100.857 101.378 101.378 dimensions (Å) 100.854 100.857 101.378 101.378 207.433 207.422 311.246 311.246 angels α, β, γ (◦) 90 90 120 90 90 120 90 90 120 90 90 120 Molecules / ASU 16 4 4 4 Data collection Wavelength (Å) 0.9785 0.9785 1 1 Resolution (Å)b 45.35–3.5 45.35–3.3 84.5–2.6 84.5–2.6 (3.55–3.5) (3.41–3.3) (2.74–2.6) (2.74–2.6) Total reflectionsb 139 455 (4 089) 153 365 (10 798) 188 770 (26 082) 188 770 (26 082) Unique reflectionsb 48 589 (1 729) 16 086 (1 402) 19 401 (2 731) 19 401 (2 727) Multiplicityb 2.9 (2.4) 9.5 (7.7) 9.7 (9.6) 9.7 (9.6) Completeness (%)b 0.81 (0.78) 0.90 (0.88) 1.00 (1.00) 1.00 (1.00) Mean I / σ(I)b 6.75 (1.41) 10.36 (1.56) 17.83 (2.96) 17.83 (2.96) Wilson B-factor (Å2) 93.29 95.37 67.82 67.56 b Rmerge 0.089 (0.526) 0.110 (1.002) 0.070 (0.723) 0.070 (0.684) b Rmeas 0.105 (0.649) 0.116 (1.068) 0.075 (0.765) 0.074 (0.723) b CC1/2 0.997 (0.85) 0.999 (0.824) 0.999 (0.997) 0.999 (0.999) Phasing Number of Se-sites 8 FOM 0.407 Lattice translocation defect Patterson peak location (2/3, 1/3, 0) Patterson peak ratioc (%) 22 Defect fractions (%) 83 and 17 Observed ‘streaky’ axis c* Refinement Reflections usedb 48 741 (1 703) 15 999 (1 400) 19 352 (2 726) 19 335 (2 717) b Reflections used Rfree 3 739 (122) 1 605 (150) 968 (136) 968 (137) b Rwork 0.2423 (0.3869) 0.2539 (0.3702) 0.2241 (0.3015) 0.2432 (0.3346) b Rfree 0.3013 (0.3442) 0.3157 (0.3795) 0.2894 (0.3930) 0.3122 (0.4131) Model composition non-hydrogen atoms 13 856 3 324 3 487 3 487 macromolecules 13 856 3 324 3 467 3 467 water 20 20 Protein residues 1 808 432 441 441 Rms deviations Bonds 0.005 0.01 0.008 0.010 Angles 0.71 1.29 1.17 1.29 Dihedrals 9.49 8.61 9.05 9.68 Ramachandran plot Favored (%) 98 97 100 100 Allowed (%) 1.7 2.6 0.23 0.23 Outliers (%) 0 0.24 0 0 Rotamer outliers (%) 0.57 0.29 0.83 1.1 Clashscore 6.57 16.41 8.18 9.23 Average B-factor 106.81 98.35 87.28 83.03 macromolecules 106.81 98.35 87.37 83.14 water 70.61 63.60 Number of TLS groups 16 16 16 16 a Anomalous pairs are counted separately b Values in parentheses are for highest-resolution shell

Rmerge = Σ Ih,i − hIhi /ΣI(h,i) where hIhi is the mean intensity of the reflections c The Patterson peak ratio is the ratio between the off-origin peak height and the origin peak height

48 3 Structural analysis of Bpa in complex with the 20S proteasome

108815 particles accepted 2D Classes Negative stain template D7, b4 D7 55713 particles EMDB 5623

C1, b4, as

0.321 0.082 0.070 0.057 0.308 0.023 0.079 0.059

54424 particles

C1, b2, a

mask schematic: included part C7, b1, a excluded part

I II

C7, b1, s C7, b1, s

0.286 0.071 0.128 0.241 0.178 0.095

0.465 0.205 0.226

0.104 33652 particles 48799 particles

C1, b1, a C7, b1, a

Figure 3.S6: Procedure outline to obtain the cryo-EM 3D reconstructions. The EM reconstruction EMDB 5623 of the archeal proteasome served as an initial model for the particle reconstruction of the successful classes from Figure 3.4C. For each reconstruction step on the top right corner the imposed symmetries (D7, C7 or C1), image binning (b4, b2 and b1 where b1 represents the full frame) and which procedure, aligning (a) or sorting (s) or both (as) was applied. The relative class distribution of particles from the sorting steps is given below and rejected classes are indicated. Where appropriate, a schematic representation of the mask indicating included parts is given. The two different sorting strategies for the high resolution reconstruction are indicated (I and II).

49 A B

FSC 0.143 o /Å 4.6 A 5 7 9 11 13

o C D

o FSC 0.143 3.5 A 3.5 4.0 4.5 5.0 7.5 _10> /Å

o

Figure 3.S7: Fourier shell correlation and local resolution estimates for the two EM reconstructions. A) Fourier shell correlation (FSC) curve for the reconstructed particle of strategy I (Figure 3.S6 bottom left, reconstructed in C1 symmetriy). B) Surface representation of the same particle low pass filtered to 8 Å and colored according to the local resolution estimate of ResMap. C) Same as A but for the reconstruction strategy II (Figure 3.S6, bottom right, reconstructed in C7). D) Same as B for the same particle as in C.

50 3 Structural analysis of Bpa in complex with the 20S proteasome

Table 3.S2: EM data collection and refinement statistics.

∆7PrcAPrcBT1A PDB 5lzp Data collection Particles 48 799 Pixel size (Å) 1.4 Defocus range (µm) 1.0–3.6 Voltage (kV) 300 Electron dose (e−/Å2) 25 Reciprocal space data Spacegroup (Å) P1 a, b, c (Å) 403.2, 403.2, 403.2 α, β, γ (◦) 90, 90, 90 Refinement Resolution range (Å) 403.2–3.45 Applied geometry weight (wxc) 1.15 No. reflections 3 340 040 R-factor 0.3144 Model composition non-hydrogen atoms 46 557 Protein residues 6 182 Rms deviations Bonds 0.007 Angles 0.814 Dihedrals 13.326 Ramachandran plot Favored (%) 96.01 Allowed (%) 3.87 Outliers (%) 0.11 Rotamer outliers (%) 4.82 Clashscore 6.1 Average B-factor 81.81

51 A B H0 GQYL density

PrcA

PrcB H0

C PrcB

PrcA

Active site density

Figure 3.S8: Comparison of Bpa-occupied and Bpa-free faces of the Bpa-CP com- plex. A) Overlay of FO−FC Fourier difference density map (yellow) with the modeled structure of the bacterial CP (salmon), revealing additional density in the active sites and in the Bpa- associated α-rings. B) Comparison of the conformation of the Bpa-CP complex with the crystal structure of the My- cobacterium tuberculosis proteasome. The model from A is superimposed with the structure pdb 3mi0 (slate blue). C) Similar overlay as in B, comparing the Bpa-capped side of the CP (salmon) with the Bpa-free side (recolored in green).

52 Depupylase Dop requires inorganic phosphate as the 4 active site nucleophile

Marcel Bolten, Christian Vahlensieck, Colette Lipp, Marc Leibundgut, Nenad Ban and Eilika Weber-Ban

ABSTRACT

Analogous to eukaryotic ubiquitination, proteins in actinobacteria can be post- translationally modified in a process referred to as pupylation, the covalent attach- ment of prokaryotic ubiquitin-like protein Pup to lysine side chains of the target protein via an isopeptide bond. As in eukaryotes, an opposing activity counter- acts the modification by specific cleavage of the isopeptide bond formed with Pup. However, the enzymes involved in pupylation and depupylation have evolved in- dependently of ubiquitination and are related to the family of ATP-binding and hydrolyzing carboxylate-amine ligases of the glutamine synthetase type. Further- more, the Pup ligase PafA and the depupylase Dop share close structural and sequence homology and have a common evolutionary history despite catalyzing opposing reactions. Here, we investigate the role played by the nucleotide in the active site of the depupylase Dop using a combination of biochemical experiments and X-ray crystallographic studies. We show that, although Dop does not turn over ATP stoichiometrically, the active site nucleotide species in Dop is ADP and inorganic phosphate rather than ATP, and that non-hydrolyzable analogs of ATP cannot support the enzymatic reaction. This finding suggests that the catalytic mechanism is more similar to the mechanism of the ligase PafA than previously thought and likely involves the transient formation of a phosphorylated Pup inter- mediate. Evidence is presented for a mechanism where the inorganic phosphate acts as the nucleophilic species in amide bond cleavage and implications for Dop function are discussed.

53 INTRODUCTION

In pupylation, proteins are marked by the post-translational modification of ly- sine side chains with the small, monomeric protein Pup (prokaryotic ubiquitin-like protein) (63 , 64 , 66 , 101 ). Bacterial pupylation shows many functional parallels to eukaryotic ubiquitination, as for example the employment of a macromolecular tag, the nature of the generated covalent linkage and the role played as an impor- tant recognition element in a protein degradation pathway involving a proteasome complex. However, bacteria have evolved this functionally analogous system in- dependently, and the enzymes involved in pupylation and depupylation are not related to ubiquitination or deubiquitination systems but rather belong to the superfamily of carboxylate-amine/ammonia ligases (65 , 78 , 79 ). Ligation of Pup to target proteins is catalyzed by the enzyme PafA (proteasome accessory factor A) and results in the formation of an isopeptide bond between the side chain carboxylate of the C-terminal glutamate of Pup and the ε-amino group of a lysine residue in the target protein (65 , 67 ). In accordance with the energy re- quirement of isopeptide bond formation the ligation process requires the turn-over of ATP. The structure of PafA shows that it has a similar active site arrange- ment as glutamine synthetase (GS), consisting of a curved anti-parallel β-sheet with ATP bound at one end of the β-sheet cradle and the triphosphate chain running along the strands toward the opposite side of the sheet, where the gluta- mate residue of Pup is bound (71 , 73 ). The γ-carboxylate of Pup’s C-terminal glutamate binds in close proximity to the γ-phosphate, allowing an attack of the glutamyl γ-carboxylate oxygen on the γ-phosphate of ATP to cleave off ADP and generate the γ-glutamyl phosphate-mixed anhydride intermediate of Pup (72 ). This phospho-Pup intermediate is activated for nucleophilic attack of the ε-amino group of the target lysine in the next step, which then leads to formation of the isopeptide bond. This reaction is chemically similar to the activation of the glu- tamate side chain for the attack of ammonia in GS. However, while bacterial GS is an oligomeric assembly (a double ring of hexamers) with the active sites buried in deep pockets at the intra-ring subunit interfaces (81 ), the Pup ligase PafA is active as a monomer and features a broad, easily accessible active site (71 , 73 ). In mycobacteria and several other actinobacteria, Pup is encoded with a C-terminal glutamine instead of glutamate, necessitating deamidation of the C-terminal glutamine to glutamate before ligation to a target is possible. This ac-

54 4 Depupylase Dop requires inorganic phosphate as the active site nucleophile tivity is carried out by Dop (deamidase of Pup), which is structurally highly similar to the ligase PafA and is also encoded in the pupylation gene locus (65 , 73 , 126 ). Intriguingly, Dop also opposes the ligase activity by catalyzing the specific cleavage of the isopeptide bond formed between Pup and the protein (69 , 70 ). Thus, at least in mycobacteria, Dop is involved in both the pupylation and depupylation of protein substrates, suggesting an intricate network of regulation. Furthermore, pupylated target proteins can be recruited to a bacterial proteasome complex con- sisting of the 20S proteasome core and the AAA-ATPase ring Mpa (mycobacterial proteasomal ATPase), where recognition of Pup takes place (46 , 57 , 61 ). Pupy- lated substrates can escape this fate, however, when Pup is cleaved off by the depupylase Dop. The fate of a pupylation target is therefore tightly controlled by all four activities. In agreement with its structural similarity to PafA, Dop also features an active site nucleotide (73 ). Yet, cleavage of the isopeptide bond does not require energy and it has been shown that Dop does not turn over ATP stoichiometrically with substrate deamidated or Pup cleaved off (65 , 69 ). It is, therefore, an intriguing question what role the nucleotide plays in the catalytic mechanism of the enzyme. In order to investigate the role of the nucleotide in the active site of Dop we used a combination of biochemical experiments and X-ray crystallographic analysis. Our results show that the active site of Dop requires ADP and inorganic phosphate to support both the deamidation and the depupylation activities. This finding has implications for the catalytic mechanism and suggests that Dop is mechanistically more closely related to the ligase PafA than previously thought.

RESULTS

Non-hydrolyzable ATP analogs cannot support Dop activity

It was shown earlier that Dop does not hydrolyze ATP stoichiometrically during the reaction progress of deamidation (65 ) and depupylation (69 ). In addition it was found that Dop shows only marginal deamidase activity in the presence of the non-hydrolyzable ATP analog ATPγS(65 ). To better understand the role of the nucleotide for the catalytic activity of Dop we analyzed the depupylation reaction of the known pupylation substrate PanB (ketopantoate hydroxymethyl- transferase) in the presence of different nucleotides. The enzymatic removal of

55 0.2 1 2 3 4 8 15 time / min 1.0 ADP + P A B i PanB∼Pup ATP ATP PanB 0.8

PanB∼Pup 0.6 AMP-PCP

0.4 ∼ PanB Pup fraction PanB ADP + Pi PanB 0.2 ADP

PanB∼Pup ADP 0.0 AMP-PCP PanB 0 5 10 15 time / min

Figure 4.1: Depupylation time course of PanB-Pup in the presence of different nu- cleotides. 3 µM CgluDop was incubated with 3 µM MtbPanB-CgluPup, 0.5 mM nucleotides and 10 mM phosphate at 30 ◦C and pH 8. A) SDS-polyacrylamide gels showing the influence of different nucleotides on mobility shift of PanB upon depupylation over the reaction time course. B) Densitometric analysis of the PanB band in relation to the total amount of PanB used in the reaction. PupE is poorly stained by Coomassie Blue and was not expected to contribute to the density of the PanB-Pup band at the concentrations under which the assay was performed.

Pup from PanB-Pup was probed by taking aliquots along the depupylation time course and quenching them with SDS-loading dye to stop the reaction. Aliquots at the respective time points were subjected to SDS-PAGE followed by Coomassie staining (Figure 4.1A). In addition, the gels were densitometrically evaluated for a more quantitative assessment (Figure 4.1B). In the presence of ATP, all PanB-Pup is converted to PanB within 3 min under the conditions used, while in the presence of the non-hydrolyzable analogue AMP-PCP no PanB-Pup was depupylated even after 15 min. This is a curious observation, considering that ATP is not turned over stoichiometrically during the reaction. One possible explanation could be that ATP cleavage, although not accompany- ing substrate turnover, is nevertheless required to produce the correct arrangement of the active site. A consequence of this would be that hydrolyzed ATP, i.e. ADP and phosphate (Pi) in the active site would be able to support the reaction, even though stoichiometric turnover does not take place during the reaction. Indeed, testing this hypothesis, we found that depupylation is catalyzed in the presence of ADP and Pi (Figure 4.1). Complete turnover of PanB-Pup in the presence of

ADP and Pi is reached after 2 min instead of 3 min with ATP. Interestingly, ADP alone shows only marginal activity. This residual activity of Dop in the presence of ADP could stem from minor impurities of Pi in the ADP stock solution or an adenylate kinase impurity in the Dop or PanB-Pup stock solution (127 ).

56 4 Depupylase Dop requires inorganic phosphate as the active site nucleophile

Co-crystal structure of Dop and ATP reveals ADP and phosphate as the active site species

One of our previously solved crystal structures of Dop contained ATP (73 ). How- ever, as the nucleotide was not co-crystallized but soaked into the crystal, the occupancy was poor. To investigate whether the active site species in Dop might be ADP and Pi rather than ATP, we co-crystallized ATP and Dop to obtain full occupancy. The co-crystallization attempts with Dop from Acidothermus cellulolyticus (AcelDop) and ATP yielded well diffracting crystals. The structure was solved by molecular replacement using the previously solved Dop structure [pdb 4b0r( 73 )] and refined to 1.9 Å (Table 1). With the exception of a disordered region between residues 42 to 79 in the so-called Dop loop, the electron density was continuous.

The structure clearly shows that ADP and Pi are bound in the active site with high occupancy as indicated by the density (Figure 4.2A). In contrast to the ear- lier, ATP-soaked structure, all magnesium binding sites characteristic for members of carboxylate-amine/ammonia ligase superfamily (n1–n3) are occupied and their importance for mediating contacts between the amino acid residues and the phos- phate groups becomes obvious (Figure 4.2A, B). An additional Mg2+ binding site (n5) was observed that contributes to the binding of the β-phosphate. All Mg2+- ions are coordinated in almost perfect octahedral symmetry. Notably, Asp94, a residue previously shown to be important for activity (73 , 126 ) and proposed to form a mixed anhydride intermediate during catalysis (74 ), coordinates Mg2+ at the n1 position. A comparison of the active site of Dop with that of Saccharomyces cerevisiae glutamate cysteine ligase (ScGCL) in complex with ADP and the transition state analogue (TSA) inhibitor buthionine sulfoximine phosphate [pdb 3lvv( 80 )] (Fig- ure 4.2C) shows that the phosphate complexed by Dop superimposes almost per- fectly with the phosphate group of the TSA, indicating the general relevance of the bound phosphate and in particular the conservation of the phosphate location within the carboxylate-amine/ammonia ligase superfamily. Arg472 of ScGCL is positioned to form hydrogen bonds with the sulfoximine oxygen and an oxygen of the phosphate group and, while the offset to Arg227 is 3.4 Å, the distance to the phosphate is kept constant. Although in the absence of an additional hydrogen bond partner only the guanidyl group coordinates the Pi, the conformational free-

57 Table 4.1: Data collection and refinement statistics

AcelDop · ADP · Pi

PDB 5lrt Crystal form Space group (Å) P 31 2 1 Unit cell 72.398 dimensions (Å) 72.398 215.25 angels α, β, γ (◦) 90 90 120 Molecules / ASU 1 Data collection Wavelength (Å) 0.99998 Resolution (Å) 35.88–1.85 (1.88–1.85) Total reflections 506 641 (23 851) Unique reflections 56 899 (2 622) Multiplicity 8.9 (9.1) Completeness (%) 100 (100) Mean I / σ(I) 25.52 (1.50) Wilson B-factor (Å2) 39.17 Rmerge 0.044 (1.425) Rmeas 0.047 (1.51) CC1/2 1 (0.607) Refinement Reflections used 56 895 (2 622) Reflections used Rfree 2 880 (125) Rwork 0.1665 (0.3185) Rfree 0.1977 (0.3695) Model composition non-hydrogen atoms 3 996 macromolecules 3 689 ligands 83 water 224 Protein residues 464 Rms deviations Bonds 0.027 Angles 1.13 Dihedrals 14.97 Ramachandran plot Favored (%) 97 Allowed (%) 2.6 Outliers (%) 0.21 Rotamer outliers (%) 0.52 Clashscore 4.54 Average B-factor 49.53 macromolecules 49.25 ligands 52.69 water 53.07 Number of TLS groups 11

Values in parentheses are for highest-resolution shell

Rmerge = Σ Ih,i − hIhi /ΣI(h,i) where hIhi is the mean intensity of the reflections

58 4 Depupylase Dop requires inorganic phosphate as the active site nucleophile

A Glu229 Arg433 Arg239 Arg227

Trp453 n5

His241 Arg90 Dop loop Glu8 n2 His155 Ser101 Thr217 n3 Glu99 Tyr92 n4 n1 Asp94 H2O Mg2+

+ Arg205 Glu10 Na H N B Pro230 C O NH Arg433 NH H N 2 2 N H N N

N N O H Pro103 O N Glu229 H O O 2 H O − O Thr102 H O H O n5 2 O O NH

H2O H2O + P O HN Trp453 Arg239 NH2 HN − HO − O O NH O Ser101 2 + Arg227 P − NH O 2 (Arg472) Glu8 O O H2N − N O n3 H Arg90 n2 HO His241 N HN Tyr92 + O − NH2 O − O n1 O Arg227 HN P − Glu99 NH2 O − OH O N H2O OH2 H O n4 2 O His155 N OH2 O Asp94 H − H2O H O O 2 Glu10 Glu10 OH Asp94 H2O NH2 (Glu52) (Glu96) Thr217 + N N H2 H Arg205 Arg205 (Arg313)

Figure 4.2: Crystal structure of nucleotide-bound Dop. A) AcelDop active site with bound ADP, phosphate, magnesium and sodium ions. The unbiased mFO−DFC Fourier simulated annealing omit map was calculated with a model in which all but the protein atoms and waters were omitted and is contoured at 5 σ level. B) Schematic representation of polar interactions between Dop active site residues, ADP, phos- phate, magnesium ions, sodium ion and water. Residues labeled in blue are located in the C-terminal domain of Dop. C) Comparison of the active sites of AcelDop (green) and Saccharomyces cerevisiae glutamate cysteine ligase (pdb 3lvv, grey, residue numbers in parentheses). The phosphate group of the transition state mimic buthionine sulfoximine phosphate overlays with the free phosphate bound by Dop.

59 ADP

ATP

t / min 0.5 5 15 20 30 50 0.5 5 15 20 30 50 0.5 5 15 20 30 50 0.5 5 15 20 30 50 Dop Dop + PupQ Dop + PupE Dop + PupGG

ADP

ATP

t / min 0.5 5 15 20 30 50 0.5 5 15 20 30 50 0.5 5 15 20 30 50 0.5 5 15 20 30 50 autohydrolysis PupQ PupE PupGG

Figure 4.3: ATP hydrolysis mediated by Dop is dependent on the C-terminal residue of Pup. The C-terminal residues glutamine or glutamate of Pup accelerate the ATP hydrolysis mediated by Dop. A Pup variant lacking the C-terminal residue (PupGG) shows the same ATPase activity as Dop alone. ATP and the Pup variants alone show negligible ATPase activity over the assay time. 30 µM CgluDop was incubated with 190 µM CgluPupE or 250 µM CgluPupQ or 250 µM CgluPupGG and 100 µM ATP (100 mCi/mmol [α-32P]ATP) at 23 ◦C and pH 8. dom of Arg227 would allow formation of a similar stabilizing interaction as Arg472 of ScGCL. Judging by the position of the carboxylate end of the TSA it is likely that Arg205 in Dop coordinates the free C-terminus of Pup.

ATP Hydrolysis Depends on the C-terminal Residue of Pup

Our structural analysis suggests that ATP is cleaved into ADP and phosphate in the active site of Dop. In contrast to the ligase, this reaction is not coupled to the turnover of substrate. Nevertheless, it is possible that the presence of Pup in the active site might influence the cleavage. To investigate a potential role of Pup in the hydrolysis of ATP we therefore monitored the conversion of [α-32P]ATP to [α-32P]ADP in the active site of Dop by TLC and phosphorimaging in the absence and presence of Pup. As ATP cleavage presumably occurs to configure the active site rather than to turn over the substrate, a high concentration of Dop (30 µM) had to be employed to detect the cleavage. When Pup was absent, Dop showed only a very low basal ATPase activity (Figure 4.3, upper left panel). However,

60 4 Depupylase Dop requires inorganic phosphate as the active site nucleophile when either PupE or PupQ were included in the reaction mixture, significant amounts of ADP were produced in the measured time frame of 50 min (Figure 4.3, two upper middle panels). This increased Dop activity is due to a stimulation of the ATP cleavage by Pup and not due to any possible impurity in the Pup preparations, since PupQ or PupE alone did not show turnover in the same time period (Figure 4.3, two lower middle panels). To test whether hydrolysis of ATP is stimulated by a conformational change in Dop upon binding of Pup or rather by the presence of the very C-terminal residue in the active site, a Pup variant lacking the last residue (PupGG) was used. Strikingly, PupGG was not able to accelerate the ATP hydrolysis over the background level that Dop exhibits on its own, indicating that indeed the C-terminal residue of Pup in the active site stimulates the hydrolysis of ATP.

ATP hydrolysis is necessary for Dop activity

We next analyzed whether ATP hydrolysis is required for the active site to carry out catalysis of the depupylation reaction. To obtain a continuous record of the depupylation reaction we used the fluorescent model substrate Pup-Fl (128 ) which can be monitored by fluorescence anisotropy. In parallel, we followed the ATP hydrolysis radiochemically (supplemental Figure 4.S1). The depupylation reaction shows an initial lag phase of about 10 min, during which Pup-Fl is turned over only very slowly (Figure 4.4). Maximal Dop activity is reached after approximately

1.0 0 O O O−

0.8 1 Fluorescein −OOC 0.6 2 −OOC 0.4 Lys N fraction Pup-Fl 3 O 0.2 H NH Pup-Fl 0.0 4 ATP equivalents of Dop Pup O 0 10 20 30 40 50 60 Glu time / min − O O

Figure 4.4: Initial ATP hydrolysis is necessary for depupylation of model substrate Pup-Fl. After an initial lag phase of ≈12 min, during which ATP is hydrolyzed, the depupylation reaction reaches a maximal rate. At 30 min, ≈90 % of Pup-Fl is depupylated and ≈40 % of ATP remains. The depupylation reaction progress was monitored by fluorescence anisotropy at 23 ◦C and pH 8. 25 µM CgluDop were incubated with 100 µM ATP and 100 µM Pup-Fl. For monitoring ATP hydrolysis, ATP was spiked with 200 mCi/mmol [α-32P]ATP (phosphorimaging is shown in supplemental Figure 4.S1).

61 12 min under the conditions used, at which point an ATP amount has been cleaved that is equivalent to 1.4 times the concentration of Dop active sites. From the assessment of ATP cleavage in the presence of PupE we know that ATP is turned over beyond a single turnover of available active sites, indicating that ADP and

Pi feature a certain off-rate. Nevertheless, the observed amount of ATP cleaved is rather close to stoichiometric with Dop active sites.

DISCUSSION

The two enzymes involved in pupylation and depupylation of proteins in actinobac- teria, the ligase PafA and the depupylase Dop, are evolutionarily related. Both belong to the large family of carboxylate-amine ligases, enzymes that catalyze the formation of an amide linkage between a carboxylate and an amine via an acylphosphate intermediate (78 , 79 ). Accordingly, PafA and Dop are structural homologs and the residues forming the active site are highly conserved (73 , 126 ). A defining feature of the active site of carboxylate-amine ligase family members is the nucleotide binding site. The role of ATP in the ligation reaction is to activate the carboxylate for nucleophilic attack by the amine and to drive the otherwise thermodynamically unfavorable ligation reaction. For the Pup ligase PafA, the activation of Pup occurs by formation of a γ-glutamyl-phosphate mixed anhydride intermediate at the C-terminal glutamate of Pup (72 ). This intermediate is poised in the active site, protected from hydrolysis, for reaction with a lysine ε-amino group of an incoming pupylation substrate. As a close homolog of PafA, Dop features a nearly identical ATP-binding site (73 ). However, thermodynamically, ATP turnover is not required, since cleavage of an amide linkage is entropically favorable. In accordance with that, amide bond cleavage catalyzed by Dop is not accompanied by stoichiometric turnover of ATP, neither during deamidation nor during depupylation (65 , 69 ). Nevertheless, our structural analysis clearly identi-

fied ADP and Pi in the active site of Dop, indicating that ATP does not merely play a structural role in maintaining active site configuration. Although it was previously shown that the ATP analog ATPγS is able to support a very low level of deamidase activity (65 ), this might be due to the tendency of this analog to exhibit some cleavage of the bond to the γ-phosphate. In contrast, the ATP analog AMP-PCP employed in this study is not known to undergo hydrolysis (129 ). The fact that Dop does not exhibit any activity with this nucleotide analog (Figure 4.1)

62 4 Depupylase Dop requires inorganic phosphate as the active site nucleophile indicates that cleavage of the bond to the γ-phosphate of ATP and, therefore, the presence of ADP and phosphate in the active site of Dop is crucial for catalysis. This is further supported by our finding that radiolabeled [α-32P]ATP is turned over to [α-32P]ADP independent of substrate turnover (Figure 4.3, upper panels). Another puzzling observation is that ADP and phosphate but not ADP alone can support Dop activity (Figure 4.1). Together, these lines of evidence strongly point to a mechanistic role of the inorganic phosphate in the depupylase/deamidase re- action. We propose that the role of the phosphate in the active site of Dop is the formation of a transient phospho-Pup intermediate, a scenario also supported by the evolutionary relationship with the carboxylate-amine ligase family, where exactly such an intermediate is formed in the forward reaction. An inorganic phosphate oxygen attacks the side chain carbonyl carbon of the Pup C-terminal glutamine or, in the case of depupylation, the carbonyl carbon of the isopeptide bond between Pup and substrate, thereby bringing about the cleavage of the amide bond. The ammonium/amine leaving group dissociates from the enzyme and, in the next step, water, likely activated for nucleophilic attack by Asp94 in the active site, hydrolyses the phospho-Pup intermediate, thereby releasing Pup from the en- zyme. The importance of this aspartate during catalysis has been demonstrated previously, as mutation to either alanine (73 ) or asparagine (126 ) abolishes Dop activity in vitro for AcelDop and in vivo for Dop from Mycobacterium tuberculosis (MtbDop). In agreement with the lack of a thermodynamic requirement for ATP hydrolysis, inorganic phosphate remains in the active site and is ready for another round of catalysis. This mechanism is similar to the reaction, where GS catalyzes the conversion of glutamine to glutamate in the presence of ADP and arsenate (81 ).

Superposition of the Dop active site containing ADP and Pi with structures of other members of the carboxylate-amine/ammonia ligase family in complex with phosphorylated inhibitors results in excellent congruence of Pi with the phosphate group of the inhibitors (Figure 4.2C). This observation lends strong support to the existence of a phosphorylated Pup intermediate during the depupylation reaction catalyzed by Dop, which resembles the well-characterized intermediates from the other family members (75 , 76 , 80 , 82 ), including the phosphorylated Pup from PafA (72 ). Intriguingly, only in the presence of Pup, and more specifically in the presence of the Pup C-terminal residue in the active site of Dop, are ADP and Pi formed

63 efficiently. Dop in the absence of Pup or in the presence of a Pup variant shortened by one residue hydrolyzes ATP only very slowly (Figure 4.3). Pup thus supports the activation of water in the active site, which is needed for the attack on the γ-phosphate of ATP. This occurs either directly or indirectly by slight rearrange- ments of active site residues, like for example Asp94. The requirement of Pup binding for efficient ATP cleavage might serve as a protection mechanism against uncontrolled ATP hydrolysis by free Dop. An earlier study suggested that Asp95 of MtbDop (corresponding to Asp94 in A. cellulolyticus Dop) participates in catalysis by forming a covalent mixed an- hydride intermediate with Pup (74 ). This hypothesis was based on the observa- tion that an electrophilic Pup derivative acting as an irreversible trap, Pup-DON (6-diazo-5-oxo-L-norleucine), forms a covalent bond with Asp95 in the active site of MtbDop. Pup-DON features a highly reactive aliphatic diazo group that, after addition of a proton to the ε-carbon and elimination of N2, forms a carbenium ion that readily reacts with any nucleophiles in the vicinity, like for example carboxy- lates. This is in perfect agreement with a mechanism in which this aspartate acts as a catalytic base to activate water. It should also be noted in this context that the ε-carbon in the Pup-DON trap, which is attacked by the nucleophile, is positioned one bond-length deeper in the active site than the side chain carbonyl carbon of Pup (Cε versus Cδ), a position that is never occupied by the isopeptide bond or the carbamoyl group of glutamine. Pup containing a C-terminal asparagine was found to resist deamidation, indicating the significance of the length of the sidechain and the precise position of the carbonyl carbon in the active site (126 ). Further support for our proposed mechanism is given by the sequence of events, with ATP hydrolysis taking place first and creating a lag phase in the depupylation reaction. The latter reaches maximal velocity only after the ADP and phosphate have been generated in the active site (Figure 4.4). In an earlier study it was shown that during the cycle of catalysis, 18O-water 18 (H2 O) is incorporated only into Pup and not into Dop (74 ). Furthermore, the authors showed that hydroxylamine can act as a nucleophile to form Pup- hydroxamate, indicating that during the Dop catalyzed reaction an activated car- bonyl must exist. These findings are in agreement with our proposed mechanism, 18 as the resolution of the phosphorylated Pup intermediate by water (H2 O) or hy- droxylamine directly leads to the experimentally identified Pup species, 18O-labeled Pup or Pup-hydroxamate, respectively.

64 4 Depupylase Dop requires inorganic phosphate as the active site nucleophile

Arg227 Arg227 Glu229 Glu229

NH O O NH O O Mg2+ Mg2+ H2N NHH2 +Pup~NHR H2N NHH2 O O O O Mg2+ O O P P P O P P P O O O adenosine O O O O adenosine H2O O O O O O O O NH H Mg2+ Mg2+ R Mg2+ Mg2+ O O O Pup NH O O O O O O O O O O Glu10 O Glu10 Glu99 Glu99 Asp94 Asp94

−PupE Arg227 Arg227 Glu229 +Pup~NHR Glu229

NH O O NH O O Mg2+ Mg2+ H2N NHH2 H2N NHH2 O Mg2+ O O O Mg2+ O O O P P P O P P P HO O O O adenosine O O O O adenosine O O O O O O O OH O NH Mg2+ Mg2+ R Mg2+ Mg2+ O O Pup NH O O Pup NH O O O O O O HO O O O O Glu10 O Glu10 Glu99 Glu99 Asp94 Asp94

−RNH2 Arg227 Arg227 Glu229 Glu229

NH O O NH O O Mg2+ Mg2+ H2N NHH2 H2N NHH2 O Mg2+ O O O Mg2+ O O O P P P O P P P O O O O adenosine O O O O adenosine O O O O O O O OH O H 2+ O 2+ Mg Mg2+ Mg Mg2+ O O H Pup NH O O Pup NH O O HO O O O O O O O O Glu10 O Glu10 Glu99 Glu99 Asp94 Asp94 R = H, Substrate

Figure 4.5: Proposed catalytic mechanism of Dop. Active site residues are numbered ac- cording to AcelDop and shown in a simplified representation omitting Mg2+ coordinating residues Glu8, Tyr92, His155, His241. Mg2+ at position n2 is omitted at the beginning of the reaction for clarity and added with the first step (compare to Figure 4.2B).

Taken together, our findings on the fate of ATP in the Dop active site, previ- ous mutational and biochemical studies of Dop and the high degree of structural conservation between the members of the carboxylate-amine ligase family, in par- ticular the nearly identical active site configuration of Dop and PafA, strongly indicate that Dop catalyzes the cleavage of the isopeptide bond by a mechanism representative of the reverse reaction of PafA (Figure 4.5). In a first step, Dop hydrolyzes ATP to produce ADP and Pi which remain bound in the active site. This at the same time prevents Dop from acting as a ligase. In the next step, the

65 active site-bound Pi attacks the isopeptide bond and forms the transient phospho- rylated Pup intermediate. During this step Arg227 plays a critical role likely by stabilizing the transition state with its guanidyl group. In the last step, the resolu- tion of the phosphorylated Pup intermediate is mediated by nucleophilic attack of water, activated by the active site catalytic residue Asp94, on the carbonyl carbon of Pup. The two tetrahedral transition states that flank the transiently formed phospho-Pup intermediate are further stabilized by the Mg2+ ion coordinated by residues Glu10, Asp94 and Glu99 in the active site. Our biochemical analysis supported by X-ray crystallographic data provides the framework for understanding the mechanism of the depupylase/deamidase enzyme Dop and thus provides insight that might be exploited for the rational design of antituberculosis drugs aimed at interfering with the pupylation enzymes. Further- more, it contributes to our understanding of the evolutionary relationship between the two opposing players in the pupylation system, the Pup ligase PafA and the depupylase Dop.

EXPERIMENTAL PROCEDURES

Chemicals and reagents

Chemicals were obtained from Sigma unless otherwise noted. 5-FAM Lys was pro- vided by AnaSpec. [α-32P]ATP was obtained from Hartmann Analytic (Braun- schweig, Germany) at a specific activity of 15 TBq (400 Ci)/mmol. Polyethylene- imine TLC plates were provided by VWR International. ADP was further purified by ion-exchange chromatography (6 mL Resource Q column, GE Healthcare Life Sciences) and desalted by size exclusion chromatography (100 mL Superose 12 col- umn, GE Healthcare Life Sciences).

Protein Expression and Purification

Dop from Acidothermus cellulolyticus (AcelDop) was expressed from isopropyl-β-D- 1-thiogalactopyranoside-inducible pET21 vector in Escherichia coli Rosetta (DE3) cells (Invitrogen) with a C-terminal tobacco etch virus (TEV) protease cleavage site-His6 fusion and purified by affinity chromatography using a 5 mL Hi-Trap IMAC HP column (GE Healthcare Life Sciences) charged with Ni2+. After wash- ing the protein-containing column with 50 mL buffer W (50 mM HEPES-NaOH

66 4 Depupylase Dop requires inorganic phosphate as the active site nucleophile pH 8 at 23 ◦C, 500 mM NaCl, 40 mM imidazole), protein was eluted with buffer W containing 300 mM imidazole. Protein-containing fractions were pooled and dia- lyzed for 1 h at 4 ◦C against buffer D (50 mM HEPES-NaOH pH 8 at 4 ◦C, 150 mM NaCl). The C-terminal histidine tag was cleaved at the TEV protease cleavage site by addition of His-tagged TEV protease and further dialysis for 15 h. TEV protease was removed via affinity chromatography. AcelDop was further purified by size ex- clusion chromatography using a Superdex75 column in 20 mM HEPES-NaOH pH 8 at 20 ◦C and 50 mM NaCl. Corynebacterium glutamicum Dop (CgluDop) with an

N-terminal His6-TEV protease cleavage site fusion was expressed from pET24 and purified similar to AcelDop.

Crystallization of Dop

Crystallization of AcelDop was carried out in sitting drop vapor diffusion plates at a protein concentration of 8–12 mg/mL at 20 ◦C by mixing 2 µL of protein solu- tion with 1 µL of reservoir solution. AcelDop formed crystals in reservoir solutions consisting of 18–23 % (w/v) PEG 3350, 100 mM Bis-Tris propane pH 8.25–9.0 at ◦ 20 C, 200 mM KSCN, 20,mM MgCl2 and 5 mM ATP (pH 8). Before flash cool- ing the crystals with liquid nitrogen, PEG400 was added in 5 % (v/v) steps to a final concentration of 30 % (v/v) by using reservoir solution supplemented with PEG400.

Data Collection, Processing, Structure Determination and Refinement

A data set was collected at beamline X06SA of the Swiss Light Source (Paul Scher- rer Institut, Villigen, Switzerland) at 100 K and data was indexed and integrated using XDS (108 ). Initial analysis of data was performed using POINTLESS (109 ) and PHENIX.xtriage (110 ). Scaling was subsequently done by AIMLESS (111 ). The structure was solved by molecular replacement in PHASER-MR (114 ) using the previously solved Dop structure [pdb 4b0r( 73 )] as a search model. The model was further improved by iterative model building in COOT (116 ) and refinement in PHENIX (110 ). Statistics are summarized in Table 1. An additional strong electron density peak near the Pi was interpreted as a sodium ion (n4) based on coordination geometry and hydrogen bond distances. Structural alignments were done by cealign in PYMOL (130 ) using only the highly conserved residues around the active site.

67 Gel-based Depupylation Assays

3 µM MtbPanB-CgluPup (produced as described in (69 )) was incubated with 3 µM CgluDop at 30 ◦C in buffer R (50 mM HEPES-NaOH pH 8 at 23 ◦C, 150 mM NaCl and 20 mM MgCl2) supplemented with 0.5 mM nucleotides and additional 10 mM Na/K-phosphate pH 7 at 23 ◦C in the case of ADP. The formation of PanB was monitored by SDS–PAGE and Coomassie staining and analyzed densitometrically by GelAnalyzer software version 2010a (131 ). The fraction of PanB is expressed in relation to total PanB (PanB, PanB-Pup). Pup was not taken into account because of poor staining.

Fluorescence Anisotropy-based Depupylation Assay of Pup-Fl

Pupylation of 5-FAM Lys and the depupylation assay were carried out as described (128 ). For the generation of Pup-Fl 1 mM 5-FAM Lys and 250 µM CgluPupE was used and incubated in the presence of 10 µM CgluPafA with 5 mM ATP. The depupylation was carried out in buffer R. All traces were set up in triplicate and the mean values are shown unless otherwise noted.

Radioactive Assay to Monitor ATP Hydrolysis

Assays were performed as described (72 ). In brief, 30 µM CgluDop was incubated with 190 µM CgluPupE or 250 µM CgluPupQ or 250 µM CgluPupGG and 100 µM ATP (100 mCi/mmol [α-32P]ATP) at 23 ◦C in reaction buffer R. A Molecular Dy- namics phosphorimaging screen (GE Healthcare) was exposed to the dried TLC plates for 1–2 h and subsequently scanned using a Typhoon Trio phosphorimag- ing system (GE Healthcare). Images were analyzed using ImageJ software version 1.48 (132 ) to determine the ratio r of non-hydrolyzed ATP to total nucleotide (ATP and ADP) for each reaction time point t. To correct for the background hydrolysis of Dop without Pup-Fl (−Pup-Fl), the fraction of non-hydrolyzed ATP f at each time point t during the depupylation of Pup-Fl was normalized to the corresponding background [f (t) = r (t, +Pup-Fl) / r (t, −Pup-Fl)].

68 4 Depupylase Dop requires inorganic phosphate as the active site nucleophile

ACKNOWLEDGEMENTS

We thank B. Blattmann and C. Stutz-Ducommun for support with initial screening and acknowledge the staff of X06SA at the Swiss Light Source (SLS, Paul Scherrer Institut, Villigen, Switzerland) for support with data collection.

CONFLICT OF INTEREST

The authors declare that they have no conflict of interest with the contents of this article.

SUPPLEMENT

− Pup-Fl + Pup-Fl

ADP

ATP

0.5 2 4 6 10 15 30 60 t / min 0.5 2 4 6 10 15 30 60

Figure 4.S1: ATP hydrolysis mediated by Dop in absence and presence of model substrate Pup-Fl. 25 µM CgluDop was incubated with 100 µM Pup-Fl and 100 µM ATP spiked with 200 mCi/mmol [α-32P]ATP. The time point at 2 min was not taken into account because of the different running behavior.

69

Concluding remarks and outlook 5

The discovery of the Pup-proteasome system in actinobacteria proved that bacteria use proteins as post-translational modifications (63 , 64 ). This feature was previ- ously thought only to be present in eukaryotes, with the ubiquitination system as the most prominent example. It is worth mentioning that the pupylation and ubiq- uitination systems thus represent a further example of convergent evolution. The overall purpose of both systems is the same but the underlying chemistry of the tagging and untagging enzymes differs. The proteasome is the common enzyme complex that degrades marked substrates, although the eukaryotic system has evolved toward higher complexity. In contrast to the degradation machinery, the Pup ligase PafA and the deamidase/depupylase Dop do not show either sequence or structural similarity to the corresponding ubiquitin system-related enzymes. Nor is the chemistry catalyzed by these enzymes related. PafA and Dop are members of the γ-carboxylate amine/ammonia ligase super- family (71 , 73 , 78 ). As such, they contain an ATP binding site and the nucleotide is usually used for the activation of the side chain carboxylate by phosphorylation. This makes sense in the case of a ligation reaction where the hydroxyl group needs to be modified to become a better leaving group. However, for the cleavage of an isopeptide bond, it seems to be unnecessary. Nevertheless, Dop was previously shown to function only in the presence of ATP (65 ). While the two-step mecha- nism typical for this superfamily was experimentally confirmed for the ligase PafA, and the phosphorylated intermediate could even be isolated (72 ), the role of ATP for Dop was unclear. A mechanism for Dop was previously proposed in which an unusual anhydride intermediate is formed between a highly conserved aspartate of Dop and the γ-carboxylate of Pup’s C-terminal amino acid, but this mechanism does not take into account the function of ATP (74 ). The work presented in this thesis provides a more reasonable mechanism that explains previous experimental observations

71 and is supported by new findings focusing on the fate of ATP. It is shown that Dop requires inorganic phosphate as the active site nucleophile and the proposed mechanism can be interpreted as a reverse ligation reaction. The phosphorylated intermediate, which, according to the new mechanism, forms during the Dop cat- alyzed reaction, could not be isolated in this study. On the one hand, this is peculiar as the isolation was possible for the Pup ligase but, on the other hand, PafA is the only γ-carboxylate amine/ammonia ligase family member for which the phosphorylated intermediate has ever been observed (72 ). For glutamine synthetase (GS) and glutamate cysteine ligase (GCL), two fur- ther members of the γ-carboxylate amine/ammonia ligase superfamily (78 , 79 ), potent inhibitors are known. One of them is methionine sulfoximine that is phosphorylated in the active site and then functions as a transition state ana- log (81 , 133 , 134 ). Other sulfoximine inhibitors have been developed as well and were used in the structural analysis of these enzymes (75 , 80 , 81 ). They proved to be of great value for mechanistic studies and a similar approach could be used to further dissect the mechanisms of PafA and Dop. It is likely that the unnatural amino acid methionine sulfoximine or one of its derivatives will need to be fused to the C-terminus of Pup, as was done with the electrophilic trap (74 ), since the free trap showed no effect. Using such an approach, it might become possible to understand how PafA and Dop manage to catalyze reverse reactions using a nearly identical active site. Preliminary experiments that aimed to transform Dop into a ligase based on transplantations of parts from PafA to Dop were not successful (73 ). A further unresolved mystery in the field of pupylation is the Dop loop. It will be interesting to see what function can be assigned to this stretch of amino acids. As the Dop loop features a conserved region it might be used to recognize other interaction partners. It could also provide an additional feature for the discrimina- tion between pupylated proteins that should undergo depupylation an those that should stay pupylated. It was previously shown that the proteasomal interactor Mpa, which is responsible for substrate recognition and unfolding, facilitates the depupylation process of proteins (69 ). This could be a way to recycle Pup be- fore the substrate is degraded, as seen in the ubiquitin-proteasome system where ubiquitin is recycled (11 ). Another proteasomal interactor recently discovered in actinobacteria is Bpa. In contrast to Mpa, Bpa is an ATPase-independent proteasomal activator and, unlike

72 5 Concluding remarks and outlook

Mpa, it could be shown in vitro for the first time that the proteasomal activity is stimulated upon interaction with Bpa (48 , 49 ). It is thought that Bpa recognizes its substrates in a less specific manner compared to Mpa. While Mpa relies on pupylation, so far only β-casein, a model substrate for partially unfolded proteins, and the heat heat shock repressor HspR have been shown to be degraded by the Bpa-CP complex. This observation indicates a possible role for Bpa in a stress response mechanism induced for example by heat shock. It will be necessary to identify further substrates to decipher the specificity of Bpa. The analysis of the inner surface of the “chamber” formed by Bpa, which was made possible after the structure was solved, did not reveal a specific preference either for charged or for hydrophobic interactions. Along with this unsolved question comes the question how do substrates ultimately unfold once they contact Bpa. Maybe the twelve unstructured C-terminal stretches are involved in further unfolding as they could provide additional interaction possibilities for the already partially unfolded substrates. Due to the seven to twelve symmetry mismatch between CP and Bpa a few C-termini of some Bpa protomers are always free. The fact that they could not be traced in the EM density map indicates their flexibility. Therefore, one can also envision that the entire Bpa complex somehow hovers above the CP but does not dissociate because of avidity. A detailed mechanism for gate opening remains unclear and further research needs to be undertaken. The fact that Bpa and Mpa use the GQYL motif for interaction with the CP but show different behavior toward the activation could help to solve this mystery. Consequently, a structure of the Mpa-CP complex would be of great interest. A similar approach as used for the Bpa-CP could be expedient. As the Bpa-CP complex was obtained with the open gate variant of the CP, it will be necessary to perform additional studies to determine how the rearrangement of the PrcA N-termini occurs. A mutational study could shed light on the individual contributions of the N-terminal amino acids to the gate opening process. Taken together, the structural analysis of the Bpa dodecamer, the Bpa-CP com- plex, and Dop in complex with ADP and Pi provides detailed information for the field of proteasome-related protein degradation in actinobacteria from different points of view. One of the most prominent members of actinobacteria is Mycobac- terium tuberculosis (Mtb), the causative agent of tuberculosis. The ever increasing number of multidrug resistant Mtb strains shows the urgent need for new drug

73 targets and new drugs. Pupylation was shown to be essential for the persistence of Mtb inside the host macrophages (85 ) and thus it is an attractive new drug target. In addition, the enzymes PafA and Dop are not present in eukaryotes reducing the chance of adverse effects due to drug treatment. Dop is essential for full virulence in Mtb because Pup is biosynthesized with a C-terminal glutamine and has to be deamidated before it can be ligated to substrates (126 ). Therefore, the high resolution structure of Dop can directly aid the development of new drugs (135 ). A small molecule spanning the entire active site could efficiently deactivate Dop and PafA alike. On the one end it could make use of interactions in the glutamate binding pocket and on the other end of the adenosine binding site. In the best case the moieties would be connected by a triphosphate-like segment that mimics the transition state. The field of proteasomal protein degradation in actinobacteria and the closely related pupylation pathway have, since their discovery nearly a decade ago, brought numerous interesting facts to light. While the underlying molecular mechanisms become more and more clear, the overall biological significance, especially with respect to different growth conditions, can only be envisioned.

74 Bibliography

[1] Cannon, W. B. Organization for physiological homeostasis. Physiol Rev 1929, 9, 399–431.

[2] Bednarska, N. G.; Schymkowitz, J.; Rousseau, F.; Van Eldere, J. Protein aggregation in bacteria: the thin boundary between functionality and toxicity. Microbiology 2013, 159, 1795–806, doi: 10.1099/mic.0.069575-0.

[3] Kim, Y. E.; Hipp, M. S.; Bracher, A.; Hayer-Hartl, M.; Hartl, F. U. Molecular chaperone functions in and proteostasis. Annu Rev Biochem 2013, 82, 323–55, doi: 10.1146/annurev-biochem-060208-092442.

[4] Labbadia, J.; Morimoto, R. I. The biology of proteostasis in aging and disease. Annu Rev Biochem 2015, 84, 435–64, doi: 10.1146/annurev-biochem-060614-033955.

[5] Saibil, H. Chaperone machines for protein folding, unfolding and disaggregation. Nat Rev Mol Cell Biol 2013, 14, 630–42, doi: 10.1038/nrm3658.

[6] Laederach, J.; Leodolter, J.; Warweg, J.; Weber-Ban, E. In The Molecular Chaperones Interaction Networks in Protein Folding and Degradation; Houry, W. A., Ed.; Interac- tomics and Systems Biology; Springer New York, 2014; Vol. 1; Chapter 16, pp 419–44, doi: 10.1007/978-1-4939-1130-1_16.

[7] Jastrab, J. B.; Darwin, K. H. Bacterial Proteasomes. Annu Rev Microbiol 2015, 69, 109–27, doi: 10.1146/annurev-micro-091014-104201.

[8] Frees, D.; Brøndsted, L.; Ingmer, H. Bacterial proteases and virulence. Subcell Biochem 2013, 66, 161–92, doi: 10.1007/978-94-007-5940-4_7.

[9] Imkamp, F.; Ziemski, M.; Weber-Ban, E. Pupylation-dependent and -independent proteasomal degradation in mycobacteria. Biomol Concepts 2015, 6, 285–301, doi: 10.1515/bmc-2015-0017.

[10] Bittner, L.-M.; Arends, J.; Narberhaus, F. Mini review: ATP-dependent proteases in bac- teria. Biopolymers 2016, 105, 505–17, doi: 10.1002/bip.22831.

[11] Finley, D. Recognition and processing of ubiquitin-protein conju- gates by the proteasome. Annu Rev Biochem 2009, 78, 477–513, doi: 10.1146/annurev.biochem.78.081507.101607.

[12] Vijay-Kumar, S.; Bugg, C. E.; Cook, W. J. Structure of ubiquitin refined at 1.8 A resolution. J Mol Biol 1987, 194, 531–44, doi: 10.1016/0022-2836(87)90679-6.

75 [13] Kerscher, O.; Felberbaum, R.; Hochstrasser, M. Modification of proteins by ubiq- uitin and ubiquitin-like proteins. Annu Rev Cell Dev Biol 2006, 22, 159–80, doi: 10.1146/annurev.cellbio.22.010605.093503.

[14] Hartmann-Petersen, R.; Gordon, C. Integral UBL domain proteins: a family of proteasome interacting proteins. Semin Cell Dev Biol 2004, 15, 247–59, doi: 10.1016/j.semcdb.2003.12.006.

[15] Ciechanover, A.; Stanhill, A. The complexity of recognition of ubiquitinated sub- strates by the 26S proteasome. Biochim Biophys Acta 2014, 1843, 86–96, doi: 10.1016/j.bbamcr.2013.07.007.

[16] Komander, D.; Rape, M. The ubiquitin code. Annu Rev Biochem 2012, 81, 203–29, doi: 10.1146/annurev-biochem-060310-170328.

[17] Thrower, J. S.; Hoffman, L.; Rechsteiner, M.; Pickart, C. M. Recognition of the polyubiq- uitin proteolytic signal. EMBO J 2000, 19, 94–102, doi: 10.1093/emboj/19.1.94.

[18] Prakash, S.; Tian, L.; Ratliff, K. S.; Lehotzky, R. E.; Matouschek, A. An unstructured initiation site is required for efficient proteasome-mediated degradation. Nat Struct Mol Biol 2004, 11, 830–37, doi: 10.1038/nsmb814.

[19] Verma, R.; Aravind, L.; Oania, R.; McDonald, W. H.; Yates, J. R., 3rd; Koonin, E. V.; Deshaies, R. J. Role of Rpn11 metalloprotease in deubiquitination and degradation by the 26S proteasome. Science 2002, 298, 611–15, doi: 10.1126/science.1075898.

[20] Groll, M.; Ditzel, L.; Löwe, J.; Stock, D.; Bochtler, M.; Bartunik, H. D.; Huber, R. Struc- ture of 20S proteasome from yeast at 2.4 Å resolution. Nature 1997, 386, 463–71, doi: 10.1038/386463a0.

[21] Dubiel, W.; Pratt, G.; Ferrell, K.; Rechsteiner, M. Purification of an 11 S regulator of the multicatalytic protease. J Biol Chem 1992, 267, 22369–77.

[22] Rechsteiner, M.; Hill, C. P. Mobilizing the proteolytic machine: cell biological roles of proteasome activators and inhibitors. Trends Cell Biol 2005, 15, 27–33, doi: 10.1016/j.tcb.2004.11.003.

[23] Masson, P.; Lundin, D.; Söderbom, F.; Young, P. Characterization of a REG/PA28 pro- teasome activator homolog in Dictyostelium discoideum indicates that the ubiquitin- and ATP-independent REGgamma proteasome is an ancient nuclear protease. Eukaryot Cell 2009, 8, 844–51, doi: 10.1128/EC.00165-08.

[24] Ma, C. P.; Willy, P. J.; Slaughter, C. A.; Demartino, G. N. Pa28, an Activator of the 20-S Proteasome, Is Inactivated by Proteolytic Modification at Its Carboxyl-Terminus. J Biol Chem 1993, 268, 22514–22519.

[25] Sprangers, R.; Kay, L. E. Quantitative dynamics and binding studies of the 20S proteasome by NMR. Nature 2007, 445, 618–22, doi: 10.1038/nature05512.

76 Bibliography

[26] Kajava, A. V.; Gorbea, C.; Ortega, J.; Rechsteiner, M.; Steven, A. C. New HEAT-like repeat motifs in proteins regulating proteasome structure and function. J Struct Biol 2004, 146, 425–30, doi: 10.1016/j.jsb.2004.01.013.

[27] Ortega, J.; Heymann, J. B.; Kajava, A. V.; Ustrell, V.; Rechsteiner, M.; Steven, A. C. The axial channel of the 20S proteasome opens upon binding of the PA200 activator. J Mol Biol 2005, 346, 1221–7, doi: 10.1016/j.jmb.2004.12.049.

[28] Iwanczyk, J.; Sadre-Bazzaz, K.; Ferrell, K.; Kondrashkina, E.; Formosa, T.; Hill, C. P.; Ortega, J. Structure of the Blm10-20 S proteasome complex by cryo-electron microscopy. Insights into the mechanism of activation of mature yeast proteasomes. J Mol Biol 2006, 363, 648–59, doi: 10.1016/j.jmb.2006.08.010.

[29] Sadre-Bazzaz, K.; Whitby, F. G.; Robinson, H.; Formosa, T.; Hill, C. P. Structure of a Blm10 complex reveals common mechanisms for proteasome binding and gate opening. Mol Cell 2010, 37, 728–35, doi: 10.1016/j.molcel.2010.02.002.

[30] Stadtmueller, B. M.; Ferrell, K.; Whitby, F. G.; Heroux, A.; Robinson, H.; Myszka, D. G.; Hill, C. P. Structural models for interactions between the 20S proteasome and its PAN/19S activators. J Biol Chem 2010, 285, 13–7, doi: 10.1074/jbc.C109.070425.

[31] Smith, D. M.; Chang, S. C.; Park, S.; Finley, D.; Cheng, Y.; Goldberg, A. L. Docking of the proteasomal ATPases’ carboxyl termini in the 20S proteasome’s alpha ring opens the gate for substrate entry. Mol Cell 2007, 27, 731–44, doi: 10.1016/j.molcel.2007.06.033.

[32] Tamura, T.; Nagy, I.; Lupas, A.; Lottspeich, F.; Cejka, Z.; Schoofs, G.; Tanaka, K.; De Mot, R.; Baumeister, W. The first characterization of a eubacterial proteasome: the 20S complex of Rhodococcus. Curr Biol 1995, 5, 766–74.

[33] Lin, G.; Hu, G.; Tsu, C.; Kunes, Y. Z.; Li, H.; Dick, L.; Parsons, T.; Li, P.; Chen, Z.; Zwickl, P.; Weich, N.; Nathan, C. Mycobacterium tuberculosis prcBA genes encode a gated proteasome with broad oligopeptide specificity. Mol Microbiol 2006, 59, 1405–16, doi: 10.1111/j.1365-2958.2005.05035.x.

[34] Hu, G.; Lin, G.; Wang, M.; Dick, L.; Xu, R. M.; Nathan, C.; Li, H. Structure of the My- cobacterium tuberculosis proteasome and mechanism of inhibition by a peptidyl boronate. Mol Microbiol 2006, 59, 1417–28, doi: 10.1111/j.1365-2958.2005.05036.x.

[35] Kwon, Y. D.; Nagy, I.; Adams, P. D.; Baumeister, W.; Jap, B. K. Crystal structures of the Rhodococcus proteasome with and without its pro-peptides: implications for the role of the pro-peptide in proteasome assembly. J Mol Biol 2004, 335, 233–45, doi: 10.1016/j.jmb.2003.08.029.

[36] Li, D.; Li, H.; Wang, T.; Pan, H.; Lin, G.; Li, H. Structural basis for the assembly and gate closure mechanisms of the Mycobacterium tuberculosis 20S proteasome. EMBO J 2010, 29, 2037–47, doi: 10.1038/emboj.2010.95.

77 [37] Zühl, F.; Tamura, T.; Dolenc, I.; Cejka, Z.; Nagy, I.; De Mot, R.; Baumeister, W. Subunit topology of the Rhodococcus proteasome. FEBS Lett 1997, 400, 83–90, doi: 10.1016/S0014-5793(96)01403-2.

[38] Zühl, F.; Seemüller, E.; Golbik, R.; Baumeister, W. Dissecting the assembly pathway of the 20S proteasome. FEBS Lett 1997, 418, 189–94, doi: 10.1016/S0014-5793(97)01370-7.

[39] Witt, S.; Kwon, Y. D.; Sharon, M.; Felderer, K.; Beuttler, M.; Robinson, C. V.; Baumeis- ter, W.; Jap, B. K. Proteasome assembly triggers a switch required for active-site matura- tion. Structure 2006, 14, 1179–88, doi: 10.1016/j.str.2006.05.019.

[40] Brannigan, J. A.; Dodson, G.; Duggleby, H. J.; Moody, P. C.; Smith, J. L.; Tomchick, D. R.; Murzin, A. G. A protein catalytic framework with an N-terminal nucleophile is capable of self-activation. Nature 1995, 378, 416–19, doi: 10.1038/378416a0.

[41] Mc Cormack, T.; Baumeister, W.; Grenier, L.; Moomaw, C.; Plamondon, L.; Pramanik, B.; Slaughter, C.; Soucy, F.; Stein, R.; Zühl, F.; Dick, L. Active site-directed inhibitors of Rhodococcus 20 S proteasome. Kinetics and mechanism. J Biol Chem 1997, 272, 26103–9, doi: 10.1074/jbc.272.42.26103.

[42] Whitby, F. G.; Masters, E. I.; Kramer, L.; Knowlton, J. R.; Yao, Y.; Wang, C. C.; Hill, C. P. Structural basis for the activation of 20S proteasomes by 11S regulators. Nature 2000, 408, 115–20, doi: 10.1038/35040607.

[43] Bajorek, M.; Glickman, M. H. Keepers at the final gates: regulatory complexes and gating of the proteasome channel. Cell Mol Life Sci 2004, 61, 1579–88, doi: 10.1007/s00018-004-4131-y.

[44] Schmidt, M.; Hanna, J.; Elsasser, S.; Finley, D. Proteasome-associated proteins: regulation of a proteolytic machine. Biol Chem 2005, 386, 725–37, doi: 10.1515/BC.2005.085.

[45] Forster, A.; Masters, E. I.; Whitby, F. G.; Robinson, H.; Hill, C. P. The 1.9 A struc- ture of a proteasome-11S activator complex and implications for proteasome-PAN/PA700 interactions. Mol Cell 2005, 18, 589–99, doi: 10.1016/j.molcel.2005.04.016.

[46] Striebel, F.; Hunkeler, M.; Summer, H.; Weber-Ban, E. The mycobacterial Mpa-proteasome unfolds and degrades pupylated substrates by engaging Pup’s N-terminus. EMBO J 2010, 29, 1262–71, doi: 10.1038/emboj.2010.23.

[47] Wang, T.; Li, H.; Lin, G.; Tang, C.; Li, D.; Nathan, C.; Darwin, K. H.; Li, H. Structural insights on the Mycobacterium tuberculosis proteasomal ATPase Mpa. Structure 2009, 17, 1377–85, doi: 10.1016/j.str.2009.08.010.

[48] Delley, C. L.; Laederach, J.; Ziemski, M.; Bolten, M.; Boehringer, D.; Weber-Ban, E. Bac- terial proteasome activator Bpa (Rv3780) is a novel ring-shaped interactor of the mycobac- terial proteasome. PLoS One 2014, 9, e114348, doi: 10.1371/journal.pone.0114348.

78 Bibliography

[49] Jastrab, J. B.; Wang, T.; Murphy, J. P.; Bai, L.; Hu, K.; Merkx, R.; Huang, J.; Chatter- jee, C.; Ovaa, H.; Gygi, S. P.; Li, H.; Darwin, K. H. An adenosine triphosphate-independent proteasome activator contributes to the virulence of Mycobacterium tuberculosis. Proc Natl Acad Sci U S A 2015, 112, E1763–72, doi: 10.1073/pnas.1423319112.

[50] Wolf, S.; Nagy, I.; Lupas, A.; Pfeifer, G.; Cejka, Z.; Müller, S. A.; Engel, A.; De Mot, R.; Baumeister, W. Characterization of ARC, a divergent member of the AAA ATPase family from Rhodococcus erythropolis. J Mol Biol 1998, 277, 13–25, doi: 10.1006/jmbi.1997.1589.

[51] Darwin, K. H.; Lin, G.; Chen, Z.; Li, H.; Nathan, C. F. Characterization of a Mycobac- terium tuberculosis proteasomal ATPase homologue. Mol Microbiol 2005, 55, 561–71, doi: 10.1111/j.1365-2958.2004.04403.x.

[52] Pearce, M. J.; Arora, P.; Festa, R. A.; Butler-Wu, S. M.; Gokhale, R. S.; Darwin, K. H. Identification of substrates of the Mycobacterium tuberculosis proteasome. EMBO J 2006, 25, 5423–32, doi: 10.1038/sj.emboj.7601405.

[53] Djuranovic, S.; Hartmann, M. D.; Habeck, M.; Ursinus, A.; Zwickl, P.; Mar- tin, J.; Lupas, A. N.; Zeth, K. Structure and activity of the N-terminal sub- strate recognition domains in proteasomal ATPases. Mol Cell 2009, 34, 580–90, doi: 10.1016/j.molcel.2009.04.030.

[54] Striebel, F.; Kress, W.; Weber-Ban, E. Controlled destruction: AAA+ ATPases in protein degradation from bacteria to eukaryotes. Curr Opin Struct Biol 2009, 19, 209–17, doi: 10.1016/j.sbi.2009.02.006.

[55] Murzin, A. G. OB(oligonucleotide/oligosaccharide binding)-fold: common structural and functional solution for non-homologous sequences. EMBO J 1993, 12, 861–867.

[56] Zhang, X.; Stoffels, K.; Wurzbacher, S.; Schoofs, G.; Pfeifer, G.; Banerjee, T.; Parret, A. H. A.; Baumeister, W.; De Mot, R.; Zwickl, P. The N-terminal of the Rhodococcus erythropolis ARC AAA ATPase is neither necessary for oligomerization nor nucleotide hydrolysis. J Struct Biol 2004, 146, 155–65, doi: 10.1016/j.jsb.2003.10.020.

[57] Sutter, M.; Striebel, F.; Damberger, F. F.; Allain, F. H.; Weber-Ban, E. A distinct structural region of the prokaryotic ubiquitin-like protein (Pup) is recognized by the N- terminal domain of the proteasomal ATPase Mpa. FEBS Lett 2009, 583, 3151–7, doi: 10.1016/j.febslet.2009.09.020.

[58] Liao, S.; Shang, Q.; Zhang, X.; Zhang, J.; Xu, C.; Tu, X. Pup, a prokaryotic ubiquitin- like protein, is an intrinsically disordered protein. Biochem J 2009, 422, 207–15, doi: 10.1042/BJ20090738.

[59] Chen, X.; Solomon, W. C.; Kang, Y.; Cerda-Maira, F.; Darwin, K. H.; Walters, K. J. Prokaryotic ubiquitin-like protein pup is intrinsically disordered. J Mol Biol 2009, 392, 208–17, doi: 10.1016/j.jmb.2009.07.018.

79 [60] Wang, T.; Darwin, K. H.; Li, H. Binding-induced folding of prokaryotic ubiquitin-like protein on the Mycobacterium proteasomal ATPase targets substrates for degradation. Nat Struct Mol Biol 2010, 17, 1352–57, doi: 10.1038/nsmb.1918.

[61] Burns, K. E.; Pearce, M. J.; Darwin, K. H. Prokaryotic ubiquitin-like protein provides a two-part degron to Mycobacterium proteasome substrates. J Bacteriol 2010, 192, 2933–5, doi: 10.1128/JB.01639-09.

[62] Walsh, C. T.; Garneau-Tsodikova, S.; Gatto, G. J., Jr Protein posttranslational modifica- tions: the chemistry of proteome diversifications. Angew Chem Int Ed 2005, 44, 7342–72, doi: 10.1002/anie.200501023.

[63] Pearce, M. J.; Mintseris, J.; Ferreyra, J.; Gygi, S. P.; Darwin, K. H. Ubiquitin-like protein involved in the proteasome pathway of Mycobacterium tuberculosis. Science 2008, 322, 1104–7, doi: 10.1126/science.1163885.

[64] Burns, K. E.; Liu, W. T.; Boshoff, H. I.; Dorrestein, P. C.; Barry, r., C. E. Proteasomal protein degradation in Mycobacteria is dependent upon a prokaryotic ubiquitin-like protein. J Biol Chem 2009, 284, 3069–75, doi: 10.1074/jbc.M808032200.

[65] Striebel, F.; Imkamp, F.; Sutter, M.; Steiner, M.; Mamedov, A.; Weber-Ban, E. Bacterial ubiquitin-like modifier Pup is deamidated and conjugated to substrates by distinct but homologous enzymes. Nat Struct Mol Biol 2009, 16, 647–51, doi: 10.1038/nsmb.1597.

[66] Striebel, F.; Imkamp, F.; Ozcelik, D.; Weber-Ban, E. Pupylation as a signal for pro- teasomal degradation in bacteria. Biochim Biophys Acta 2014, 1843, 103–13, doi: 10.1016/j.bbamcr.2013.03.022.

[67] Sutter, M.; Damberger, F. F.; Imkamp, F.; Allain, F. H.; Weber-Ban, E. Prokaryotic ubiquitin-like protein (Pup) is coupled to substrates via the side chain of its C-terminal glutamate. J Am Chem Soc 2010, 132, 5610–2, doi: 10.1021/ja910546x.

[68] Barandun, J.; Delley, C. L.; Weber-Ban, E. The pupylation pathway and its role in my- cobacteria. BMC Biol 2012, 10, 95, doi: 10.1186/1741-7007-10-95.

[69] Imkamp, F.; Striebel, F.; Sutter, M.; Ozcelik, D.; Zimmermann, N.; Sander, P.; Weber- Ban, E. Dop functions as a depupylase in the prokaryotic ubiquitin-like modification path- way. EMBO Rep 2010, 11, 791–7, doi: 10.1038/embor.2010.119.

[70] Burns, K. E.; Cerda-Maira, F. A.; Wang, T.; Li, H.; Bishai, W. R.; Darwin, K. H. "Depupy- lation" of prokaryotic ubiquitin-like protein from mycobacterial proteasome substrates. Mol Cell 2010, 39, 821–7, doi: 10.1016/j.molcel.2010.07.019.

[71] Barandun, J.; Delley, C. L.; Ban, N.; Weber-Ban, E. Crystal structure of the complex between prokaryotic ubiquitin-like protein and its ligase PafA. J Am Chem Soc 2013, 135, 6794–7, doi: 10.1021/ja4024012.

80 Bibliography

[72] Guth, E.; Thommen, M.; Weber-Ban, E. Mycobacterial ubiquitin-like protein ligase PafA follows a two-step reaction pathway with a phosphorylated pup intermediate. J Biol Chem 2011, 286, 4412–9, doi: 10.1074/jbc.M110.189282.

[73] Ozcelik, D.; Barandun, J.; Schmitz, N.; Sutter, M.; Guth, E.; Damberger, F. F.; Al- lain, F. H.; Ban, N.; Weber-Ban, E. Structures of Pup ligase PafA and depupylase Dop from the prokaryotic ubiquitin-like modification pathway. Nat Commun 2012, 3, 1014, doi: 10.1038/ncomms2009.

[74] Burns, K. E.; McAllister, F. E.; Schwerdtfeger, C.; Mintseris, J.; Cerda-Maira, F.; Noens, E. E.; Wilmanns, M.; Hubbard, S. R.; Melandri, F.; Ovaa, H.; Gygi, S. P.; Dar- win, K. H. Mycobacterium tuberculosis prokaryotic ubiquitin-like protein-deconjugating enzyme is an unusual aspartate amidase. J Biol Chem 2012, 287, 37522–9, doi: 10.1074/jbc.M112.384784.

[75] Krajewski, W. W.; Jones, T. A.; Mowbray, S. L. Structure of Mycobacterium tuberculosis glutamine synthetase in complex with a transition-state mimic provides functional insights. Proc Natl Acad Sci U S A 2005, 102, 10499–504, doi: 10.1073/pnas.0502248102.

[76] Hibi, T.; Nii, H.; Nakatsu, T.; Kimura, A.; Kato, H.; Hiratake, J.; Oda, J. Crystal structure of gamma-glutamylcysteine synthetase: insights into the mechanism of catalysis by a key enzyme for glutathione homeostasis. Proc Natl Acad Sci U S A 2004, 101, 15052–7, doi: 10.1073/pnas.0403277101.

[77] Festa, R. A.; Pearce, M. J.; Darwin, K. H. Characterization of the proteasome accessory factor (paf) operon in Mycobacterium tuberculosis. J Bacteriol 2007, 189, 3044–50, doi: 10.1128/JB.01597-06.

[78] Iyer, L. M.; Burroughs, A. M.; Aravind, L. Unraveling the biochemistry and prove- nance of pupylation: a prokaryotic analog of ubiquitination. Biol Direct 2008, 3, 45, doi: 10.1186/1745-6150-3-45.

[79] Abbott, J. J.; Pei, J.; Ford, J. L.; Qi, Y.; Grishin, V. N.; Pitcher, L. A.; Phillips, M. A.; Grishin, N. V. Structure prediction and active site analysis of the metal binding determi- nants in gamma -glutamylcysteine synthetase. J Biol Chem 2001, 276, 42099–107, doi: 10.1074/jbc.M104672200.

[80] Biterova, E. I.; Barycki, J. J. Structural basis for feedback and pharmacological inhibition of Saccharomyces cerevisiae glutamate cysteine ligase. J Biol Chem 2010, 285, 14459–66, doi: 10.1074/jbc.M110.104802.

[81] Eisenberg, D.; Gill, H. S.; Pfluegl, G. M.; Rotstein, S. H. Structure-function rela- tionships of glutamine synthetases. Biochim Biophys Acta 2000, 1477, 122–45, doi: 10.1016/S0167-4838(99)00270-8.

[82] Hothorn, M.; Wachter, A.; Gromes, R.; Stuwe, T.; Rausch, T.; Scheffzek, K. Structural basis for the redox control of plant glutamate cysteine ligase. J Biol Chem 2006, 281, 27557–65, doi: 10.1074/jbc.M602770200.

81 [83] Lupas, A.; Zuhl, F.; Tamura, T.; Wolf, S.; Nagy, I.; De Mot, R.; Baumeister, W. Eubacterial proteasomes. Mol Biol Rep 1997, 24, 125–31, doi: 10.1023/A:1006803512761.

[84] Rock, K. L.; Gramm, C.; Rothstein, L.; Clark, K.; Stein, R.; Dick, L.; Hwang, D.; Gold- berg, A. L. Inhibitors of the proteasome block the degradation of most cell proteins and the generation of peptides presented on MHC class I molecules. Cell 1994, 78, 761–71, doi: 10.1016/S0092-8674(94)90462-6.

[85] Darwin, K. H.; Ehrt, S.; Gutierrez-Ramos, J. C.; Weich, N.; Nathan, C. F. The proteasome of Mycobacterium tuberculosis is required for resistance to nitric oxide. Science 2003, 302, 1963–6, doi: 10.1126/science.1091176.

[86] Lander, G. C.; Martin, A.; Nogales, E. The proteasome under the microscope: the regulatory particle in focus. Curr Opin Struct Biol 2013, 23, 243–51, doi: 10.1016/j.sbi.2013.02.004.

[87] Bai, L.; Hu, K.; Wang, T.; Jastrab, J. B.; Darwin, K. H.; Li, H. Structural analysis of the dodecameric proteasome activator PafE in Mycobacterium tuberculosis. Proc Natl Acad Sci USA 2016, 113, E1983–92, doi: 10.1073/pnas.1512094113.

[88] Mirabello, C.; Pollastri, G. Porter, PaleAle 4.0: high-accuracy prediction of protein sec- ondary structure and relative solvent accessibility. Bioinformatics 2013, 29, 2056–8, doi: 10.1093/bioinformatics/btt344.

[89] Walsh, I.; Martin, A. J.; Di Domenico, T.; Vullo, A.; Pollastri, G.; Tosatto, S. C. CSpritz: accurate prediction of protein disorder segments with annotation for homol- ogy, secondary structure and linear motifs. Nucleic Acids Res 2011, 39, W190–6, doi: 10.1093/nar/gkr411.

[90] Baker, N. A.; Sept, D.; Joseph, S.; Holst, M. J.; McCammon, J. A. Electrostatics of nanosystems: application to microtubules and the ribosome. Proc Natl Acad Sci U S A 2001, 98, 10037–41, doi: 10.1073/pnas.181342398.

[91] Dolinsky, T. J.; Czodrowski, P.; Li, H.; Nielsen, J. E.; Jensen, J. H.; Klebe, G.; Baker, N. A. PDB2PQR: expanding and upgrading automated preparation of biomolecular structures for molecular simulations. Nucleic Acids Res 2007, 35, W522–5, doi: 10.1093/nar/gkm276.

[92] Flynn, J. M.; Neher, S. B.; Kim, Y. I.; Sauer, R. T.; Baker, T. A. Proteomic discovery of cellular substrates of the ClpXP protease reveals five classes of ClpX-recognition signals. Mol Cell 2003, 11, 671–83, doi: 10.1016/S1097-2765(03)00060-1.

[93] Ruschak, A. M.; Kay, L. E. Proteasome allostery as a population shift between interchanging conformers. Proc Natl Acad Sci U S A 2012, 109, E3454–62, doi: 10.1073/pnas.1213640109.

[94] Shi, L.; Kay, L. E. Tracing an allosteric pathway regulating the activity of the HslV pro- tease. Proc Natl Acad Sci U S A 2014, 111, 2140–5, doi: 10.1073/pnas.1318476111.

82 Bibliography

[95] Chin, J. W.; Martin, A. B.; King, D. S.; Wang, L.; Schultz, P. G. Addition of a pho- tocrosslinking amino acid to the genetic code of Escherichia coli. Proc Natl Acad Sci U S A 2002, 99, 11020–4, doi: 10.1073/pnas.172226299.

[96] Smolskaya, S.; Zhang, Z. J.; Alfonta, L. Enhanced yield of recombinant proteins with site- specifically incorporated unnatural amino acids using a cell-free expression system. PLoS One 2013, 8, e68363, doi: 10.1371/journal.pone.0068363.

[97] Flory, P. J. Statistical mechanics of chain molecules; Interscience Publishers: New York, 1969; pp xix, 432 p.

[98] Rabl, J.; Smith, D. M.; Yu, Y.; Chang, S. C.; Goldberg, A. L.; Cheng, Y. Mechanism of gate opening in the 20S proteasome by the proteasomal ATPases. Mol Cell 2008, 30, 360–8, doi: 10.1016/j.molcel.2008.03.004.

[99] Yu, Y.; Smith, D. M.; Kim, H. M.; Rodriguez, V.; Goldberg, A. L.; Cheng, Y. Interactions of PAN’s C-termini with archaeal 20S proteasome and implications for the eukaryotic proteasome-ATPase interactions. EMBO J 2010, 29, 692–702, doi: 10.1038/emboj.2009.382.

[100] Flocco, M. M.; Mowbray, S. L. Planar stacking interactions of arginine and aromatic side- chains in proteins. J Mol Biol 1994, 235, 709–17, doi: 10.1006/jmbi.1994.1022.

[101] Samanovic, M. I.; Li, H.; Darwin, K. H. The pup-proteasome system of Mycobacterium tuberculosis. Subcell Biochem 2013, 66, 267–95, doi: 10.1007/978-94-007-5940-4_10.

[102] Johnston, S. C.; Whitby, F. G.; Realini, C.; Rechsteiner, M.; Hill, C. P. The proteasome 11S regulator subunit REG alpha (PA28 alpha) is a heptamer. Protein Sci 1997, 6, 2469–73, doi: 10.1002/pro.5560061123.

[103] Schmidt, M.; Haas, W.; Crosas, B.; Santamaria, P. G.; Gygi, S. P.; Walz, T.; Finley, D. The HEAT repeat protein Blm10 regulates the yeast proteasome by capping the core particle. Nat Struct Mol Biol 2005, 12, 294–303, doi: 10.1038/nsmb914.

[104] Knowlton, J. R.; Johnston, S. C.; Whitby, F. G.; Realini, C.; Zhang, Z.; Rechsteiner, M.; Hill, C. P. Structure of the proteasome activator REGalpha (PA28alpha). Nature 1997, 390, 639–43, doi: 10.1038/37670.

[105] Zhang, Z.; Clawson, A.; Realini, C.; Jensen, C. C.; Knowlton, J. R.; Hill, C. P.; Rech- steiner, M. Identification of an activation region in the proteasome activator REGalpha. Proc Natl Acad Sci U S A 1998, 95, 2807–11.

[106] Barthelme, D.; Chen, J. Z.; Grabenstatter, J.; Baker, T. A.; Sauer, R. T. Architecture and assembly of the archaeal Cdc48*20S proteasome. Proc Natl Acad Sci U S A 2014, 111, E1687–94, doi: 10.1073/pnas.1404823111.

[107] Van Duyne, G. D.; Standaert, R. F.; Karplus, P. A.; Schreiber, S. L.; Clardy, J. Atomic structures of the human immunophilin FKBP-12 complexes with FK506 and rapamycin. J Mol Biol 1993, 229, 105–24, doi: 10.1006/jmbi.1993.1012.

83 [108] Kabsch, W. XDS. Acta Crystallogr D Biol Crystallogr 2010, 66, 125–32, doi: 10.1107/S0907444909047337.

[109] Evans, P. Scaling and assessment of data quality. Acta Crystallogr D Biol Crystallogr 2006, 62, 72–82, doi: 10.1107/S0907444905036693.

[110] Adams, P. D. et al. PHENIX: a comprehensive Python-based system for macromolec- ular structure solution. Acta Crystallogr D Biol Crystallogr 2010, 66, 213–21, doi: 10.1107/S0907444909052925.

[111] Evans, P. R.; Murshudov, G. N. How good are my data and what is the resolution? Acta Crystallogr D Biol Crystallogr 2013, 69, 1204–14, doi: 10.1107/S0907444913000061.

[112] Wang, J.; Kamtekar, S.; Berman, A. J.; Steitz, T. A. Correction of X-ray intensities from single crystals containing lattice-translocation defects. Acta Crystallogr D Biol Crystallogr 2005, 61, 67–74, doi: 10.1107/S0907444904026721.

[113] Sheldrick, G. M. Experimental phasing with SHELXC/D/E: combining chain tracing with density modification. Acta Crystallogr D Biol Crystallogr 2010, 66, 479–85, doi: 10.1107/S0907444909038360.

[114] McCoy, A. J.; Grosse-Kunstleve, R. W.; Adams, P. D.; Winn, M. D.; Storoni, L. C.; Read, R. J. Phaser crystallographic software. J Appl Crystallogr 2007, 40, 658–74, doi: 10.1107/S0021889807021206.

[115] Brunger, A. T. Version 1.2 of the Crystallography and NMR system. Nat Protoc 2007, 2, 2728–33, doi: 10.1038/nprot.2007.406.

[116] Emsley, P.; Lohkamp, B.; Scott, W. G.; Cowtan, K. Features and development of Coot. Acta Crystallogr D Biol Crystallogr 2010, 66, 486–501, doi: 10.1107/S0907444910007493.

[117] Jones, T. A.; Zou, J. Y.; Cowan, S. W.; Kjeldgaard, M. Improved methods for building protein models in electron density maps and the location of errors in these models. Acta Crystallogr A 1991, 47 ( Pt 2), 110–9, doi: 10.1107/S0108767390010224.

[118] Li, X.; Mooney, P.; Zheng, S.; Booth, C. R.; Braunfeld, M. B.; Gubbens, S.; Agard, D. A.; Cheng, Y. Electron counting and beam-induced motion correction en- able near-atomic-resolution single-particle cryo-EM. Nat Methods 2013, 10, 584–90, doi: 10.1038/nmeth.2472.

[119] Mindell, J. A.; Grigorieff, N. Accurate determination of local defocus and specimen tilt in electron microscopy. J Struct Biol 2003, 142, 334–47, doi: 10.1016/S1047-8477(03)00069-8.

[120] Ludtke, S. J.; Baldwin, P. R.; Chiu, W. EMAN: semiautomated software for high-resolution single-particle reconstructions. J Struct Biol 1999, 128, 82–97, doi: 10.1006/jsbi.1999.4174.

84 Bibliography

[121] Scheres, S. H. A Bayesian view on cryo-EM structure determination. J Mol Biol 2012, 415, 406–18, doi: 10.1016/j.jmb.2011.11.010.

[122] Scheres, S. H. RELION: implementation of a Bayesian approach to cryo-EM structure determination. J Struct Biol 2012, 180, 519–30, doi: 10.1016/j.jsb.2012.09.006.

[123] Kucukelbir, A.; Sigworth, F. J.; Tagare, H. D. Quantifying the local resolution of cryo-EM density maps. Nat Methods 2014, 11, 63–5, doi: 10.1038/nmeth.2727.

[124] Pettersen, E. F.; Goddard, T. D.; Huang, C. C.; Couch, G. S.; Greenblatt, D. M.; Meng, E. C.; Ferrin, T. E. UCSF Chimera–a visualization system for exploratory research and analysis. J Comput Chem 2004, 25, 1605–12, doi: 10.1002/jcc.20084.

[125] Sauter, N. K. Visualizing the raw diffraction patter with LABELIT. 2011; http://www. phenix-online.org/newsletter/CCN_2011_01.pdf.

[126] Cerda-Maira, F. A.; Pearce, M. J.; Fuortes, M.; Bishai, W. R.; Hubbard, S. R.; Dar- win, K. H. Molecular analysis of the prokaryotic ubiquitin-like protein (Pup) conju- gation pathway in Mycobacterium tuberculosis. Mol Microbiol 2010, 77, 1123–35, doi: 10.1111/j.1365-2958.2010.07276.x.

[127] Chen, B.; Sysoeva, T. A.; Chowdhury, S.; Guo, L.; Nixon, B. T. ADPase activ- ity of recombinantly expressed thermotolerant ATPases may be caused by copurifi- cation of adenylate kinase of Escherichia coli. FEBS J 2009, 276, 807–15, doi: 10.1111/j.1742-4658.2008.06825.x.

[128] Hecht, N.; Gur, E. Development of a fluorescence anisotropy-based assay for Dop, the first enzyme in the pupylation pathway. Anal Biochem 2015, 485, 97–101, doi: 10.1016/j.ab.2015.06.019.

[129] Bystrom, C. E.; Pettigrew, D. W.; Remington, S. J.; Branchaud, B. P. ATP analogs with non-transferable groups in the gamma position as inhibitors of glycerol kinase. Bioorg Med Chem Lett 1997, 7, 2613–16, doi: 10.1016/S0960-894x(97)10051-8.

[130] Schrödinger, LLC. The PyMOL Molecular Graphics System, Version 1.8. 2015; www.pymol. org.

[131] Lázár, I. GelAnalyzer, Version 2010a. 2010; www.gelanalyzer.com.

[132] Rasband, W. ImageJ, Version 1.48. 1997-2016; imagej.nih.gov/ij.

[133] Liaw, S. H.; Eisenberg, D. Structural model for the reaction mechanism of glutamine synthetase, based on five crystal structures of enzyme-substrate complexes. Biochemistry 1994, 33, 675–81, doi: 10.1021/bi00169a007.

[134] Richman, P. G.; Orlowski, M.; Meister, A. Inhibition of gamma-glutamylcysteine syn- thetase by L-methionine-S-sulfoximine. J Biol Chem 1973, 248, 6684–90.

[135] Merz, K. M., Ed. Drug Design: Structure- And Ligand-Based Approaches; CAMBRIDGE UNIV PR, 2010.

85

Marcel Bolten

Personal data Date of birth July 25th, 1986 Place of birth Mönchengladbach, Germany Marital status married Citizenship German Education 05/2012–09/2016 PhD Thesis Title Structural and mechanistic characterization of the bacterial proteasome-related proteins Bpa and Dop Supervisor Prof. Dr. Eilika Weber-Ban Institution Institute of Molecular Biology and Biophysics, ETH Zurich 03/2011–01/2012 Master thesis Title Purification and structural analysis of a hopanoid biosynthesis associated RND transporter like protein HpnN Supervisor Prof. Dr. Stefan Raunser Institution Max-Planck-Institute of Molecular Physiology, Dortmund Department of Physical Biochemistry 10/2009–01/2012 Master of Science in Chemical Biology, TU Dortmund 04/2009–09/2009 Bachelor thesis Title Investigations on preparation, purification and structure of ternary complexes of L11, rRNA and thiopeptide ligands Supervisor Prof. Dr. Hans-Dieter Arndt Institution Max-Planck-Institute of Molecular Physiology, Dortmund Department of Chemical Biology 10/2005–09/2009 Bachelor of Science in Chemical Biology, TU Dortmund 2005 University entrance diploma (Abitur), Gesamtschule Hardt, Mönchengladbach Publications Marcel Bolten, Christian Vahlensieck, Colette Lipp, Marc Leibundgut, Nenad Ban, and Eilika Weber-Ban. Depupylase Dop requires inorganic phosphate as the active site nucleophile. under review. Marcel Bolten*, Cyrille L. Delley*, Marc Leibundgut, Daniel Boehringer, Nenad Ban, and Eilika Weber-Ban. Structural analysis of the bacterial proteasome activator Bpa of Mycobacterium tuberculosis in complex with the 20S proteasome. Structure, 24(12):2138–2151, 2016. Cyrille L. Delley, Juerg Laederach, Michał Ziemski, Marcel Bolten, Daniel Boehringer, and Eilika Weber-Ban. Bacterial proteasome activator Bpa (Rv3780) is a novel ring- shaped interactor of the mycobacterial proteasome. PLoS One, 9(12):e114348, 2014. Sascha Baumann, Sebastian Schoof, Marcel Bolten, Claudia Haering, Motoki Takagi, Kazuo Shin-ya, and Hans-Dieter Arndt. Molecular determinants of microbial resistance to thiopeptide antibiotics. J. Am. Chem. Soc., 132(20):6973–6981, 2010. * equal contribution 87