Structure and function of poly ADP-ribose glycohydrolases

A thesis submitted to the University of Manchester for the degree of Ph.D in the Faculty of Life Sciences

2015 Amy Brassington

CONTENTS TABLE

FIGURE LIST 9

TABLE LIST 14

ABBREVIATIONS 15

ABSTRACT 17

DECLARATION 18

COPYRIGHT STATEMENT 18

ACKNOWLEDGEMENTS 19

CHAPTER ONE 20

Introduction 20

1.1 Introduction to DNA damage 20

1.2 Poly ADP-ribosylation and glycosylation 21

1.3 ‘Readers’, ‘writers’ and ‘erasers’ in PARylation 24

1.4 Poly ADP-ribose glycohydrolase (PARG) the ‘erasers’ of PARylation 28

1.5 The structures of bacterial and canonical PARGs 30

1.6 Mono ADP-ribosylation by macrodomains 35

1.7 Therapeutic implications of PARG 36

1.8 Recruitment of PARG and the relationship between PARG and PARP 38

1.9 Parthanatos and the release of PAR chains 39

1.10 PAR in bacteria and endotoxins 40

1.11 Project objectives 41

CHAPTER TWO 43

Materials and methods 43

2.1 Materials 43

2.2 E. coli and P. pastoris molecular biology methods 43

2.2.1 Determination of DNA concentration 43

2.2.2 Polymerase chain reaction (PCR) 43

2.2.3 PCR product purification 44

2.2.4 Agarose gel electrophoresis 45

2.2.5 Restriction DNA digest of plasmid DNA 45

2.2.6 Purification of Plasmid DNA from E. coli 45

2.2.7 In Fusion Cloning reaction 46

2.2.8 Colony PCR 47

2.2.9 DNA sequencing reactions 48

2.2.10 E. coli strains. 49

2.2.11 E. coli growth conditions. 49

2.2.12 Transformation of competent E. coli cells. 49

2.2.13 Purification of plasmid from E. coli transformants 50

2.2.14 Glycerol stocks of E. coli strains 50

2.2.15 P. pastoris growth conditions 50

2.2.16 P. pastoris strains 52

2.2.17 Transformation of P. pastoris 52

2.3 Protein expression and purification 53

2.3.1 Determining protein concentration 53

2.3.2 SDS-PAGE electrophoresis 53

2.3.3 Western Blot analysis 53

2.3.4 E. coli protein expression trials 55

2.3.5 P. pastoris protein expression trials 55

2.3.6 E. coli large-scale protein expression 56

2.3.7 Production of selenomethionine-labelled proteins in E. coli 56

2.3.8 Large-scale protein expression in P. pastoris 57

2.3.9 E. coli cell disruption by sonication. 58

2.3.10 P. pastoris cell disruption 58

2.3.11 Protein buffer exchange using de-salting columns 58

2.3.12 Batch nickel affinity chromatography 59

2.3.13 Reverse-batch nickel affinity chromatography 59

2.3.14 Gel filtration chromatography 59

2.4 Enzymatic and biophysical methods 60

2.4.1 Poly ADP-ribose glycohydrolase activity assays 60

2.4.2 Circular dichroism (CD) spectroscopy 61

2.4.3 Isothermal titration calorimetry (ITC) 61

2.4.4 Nuclear magnetic resonance (NMR) 63

2.4.5 Thermal shift assays. 64

2.4.6 Liquid Chromatography-Mass Spectrometry of tryptic digests 64

2.4.7 Multi-Angle Laser Light Scattering (MALLS) 65

2.5 Crystallographic methods 65

2.5.1 Crystallisation and crystal handling methods 65

2.5.2 X-ray data collection 68

2.5.3 Structure elucidation and refinement 68

CHAPTER THREE 70

Biophysical characterization of Thermomonospora curvata poly ADP-ribose glycohydrolase 70

3.1 Background information 70

3.2 Biophysical characterization of bacterial PARG variants E114A & E114Q 71

3.2.1 Expression and purification of bPARG WT, E114A & E114Q 71

3.2.2 Spectral analysis of bPARG E114A & E114Q 72

3.2.3 Thermal shift binding assays of bPARG WT, E114A and E114Q 73

3.2.4 ITC measurement of bPARG E114A FAD and ADP-ribose binding 75

3.3 Structural characterization of bPARG E114A 77

3.3.1 Crystallization of bPARG E114A 77

3.3.2 Structure determination of bPARG E114A Apo and plus ADP-ribose 78

3.3.3 Crystallisation of bPARG E114A with FAD 83

3.4 Engineering of T. curvata bPARG and H. sapiens MACROD2 hybrid variants 83

3.4.1 Design of a bPARG-MACROD2 Loop hybrid 83

3.4.2 Cloning of bPARG-MD2 and MACROD2-bP hybrid 84

3.4.3 Protein expression of bPARG-MD2 and MACROD2-bP hybrid proteins 85

3.4.4 Purification of MACROD2, bPARG-MD2 and MACROD2-bP proteins 85

3.4.5 PARG Activity assay of MACROD2-bP hybrid 87

3.4.6 Thermal shift assay of modified proteins 89

3.5 Summary and discussion 91

CHAPTER FOUR 94

Biophysical and structural characterization of Tetrahymena thermophila PARG bound to the poly (ADP-ribose) substrate 94

4.1 Background information 94

4.2 Expression and purification of inactive Tetrahymena thermophila PARG 95

4.2.1 Expression and purification of TTPARG WT, E256A, E256Q, E255Q and E255A 95

4.2.2 UV-Vis Spectral analysis of TTPARG E255A and E255Q 97

4.3 Biophysical characterization of purified TTPARG 98

4.3.1 Thermal shift assays of TTPARG E256Q and E256A 98

4.3.2 ITC Binding assays of TTPARG WT and E256Q, E256A, E255A, E255Q 100

4.3.3 SPR ligand-binding assays of TTPARG WT and E256Q, E256A, E255A, E255Q 102

4.4 Structural characterization of the inactive TTPARG-PAR complex 103

4.4.1 Crystallisation of inactive E256Q TTPARG with poly-ADP-ribose (PAR) fragments 103

4.4.2 Structure determination of inactive TTPARG with PAR 104

4.4.3 Crystal structure of inactive TTPARG with PAR 107

4.4.4 Crystal structure of TTPARG-PAR reveals exo-glycohydrolase binding mode 110

4.4.5 The catalytic mechanism of canonical PARG 110

4.5 Discussion 113

CHAPTER FIVE 115

Expression, purification and biophysical characterization of the mammalian PARG regulatory domain 115

5.1 Background information 115

5.2 Cloning, expression and purification of full length human PARG and human PARG regulatory domain variants in E. coli 117

5.2.1 Rational design of hPARG regulatory domain truncations 117

5.2.2 Cloning of hPARG C-terminally truncated forms 117

5.2.3 Expression trials of full length hPARG and hPARG fragments 119

5.2.4 Large-scale expression and purification of full-length hPARG 120

5.2.5 Large-scale expression and purification of various hPARG fragments 123

5.3 Cloning, expression and purification of human PARG regulatory domain in P. pastoris 128

5.3.1 hPARG expression in P. pastoris 128

5.3.2 Cloning of hPARG 1-388 into pPICZ 3.1 plasmid 129

5.3.3 Transformation into P. pastoris strains 130

5.3.4 Expression trials of hPARG 1-388 in P. pastoris 130

5.3.5 Large-scale expression and purification of hPARG 1-388 in P. pastoris 134

5.4 Biophysical characterization of purified hPARG regulatory domain variants 135

5.4.1 Multi angle light scattering (MALLS) 135

5.4.2 Circular dichroism (CD) of various PARG fragments 136

5.4.3 Nuclear magnetic resonance (NMR) of hPARG 1-388 138

5.4.4 Pull-down assays of hPARG regulatory domain (1-388) and catalytic domain hPARG (448-976) 139

5.5 Crystallization of PARG regulatory domain fragments 141

5.5.1 Crystallization and initial data collection of hPARG regulatory domain fragments 141

5.5.2 Crystallization trials of other mammalian PARG regulatory domains 145

5.6 Summary and discussion 146

CHAPTER SIX 150

Structure determination of luciferase-like mono-oxygenase from a SirTM operon 150

6.1 Background information 152

6.2 Biophysical characterization of SAV0323 152

6.2.1 Expression and purification of SAV0323 153

6.2.2 Thermal shift ligand binding assay. 154

6.2.3 Multi angle light scattering (MALLS) 155

6.3 Elucidation of the SAV0323 protein structure 157

6.3.1 Crystallization and data collection of native SAV0323 159

6.3.2 Expression, purification and crystallization of selenomethionine-labelled SAV0323 159

6.3.3 Structure elucidation of SAV0323 159

6.4 Discussion 161

CHAPTER SEVEN 165

Discussion 165

REFERENCES 173

APPENDIX 186

Figure list

Figure 1. Schematic representation of proposed PAR mediated single and double strand DNA repair mechanisms 21

Figure 2. PAR synthesis/degradation pathway and the role of key enzymes 22

Figure 3. DNA repair mechanism and chromatin re-organisation by PARylation 25

Figure 4. Structures of ADP-ribose binder domains 28

Figure 5. Phylogenetic distribution of PARG proteins with clear division between canonical and bacterial PARGs 30

Figure 6. Proposed catalytic mechanism for the bacterial PARG 31

Figure 7. Comparisons of the overall structures and active sites for bacterial, eukaryotic and human PARGs 34

Figure 8. Surface representation of TARG1 bound to ADP-ribose (PDB: 4J5S) 35

Figure 9. Schematic representation of genome arrangements of the macrodomain-sirtuin linked operons 41

Figure 10. Basic configuration of an isothermal titration calorimetry (ITC) instrument 62

Figure 11. Diagram of protein crystallization solubility curve 65

Figure 12. Diagram of sitting drop vapor diffusion method 67

Figure 13. SDS-PAGE analysis of nickel affinity chromatography fractions of bPARG E114A, E114Q and WT respectively 72

Figure 14. Absorbance spectrum of purified bPARG E114A and E114Q 73

Figure 15. Thermal shift assays of bPARG variants 74

Figure 16. ITC binding data for bPARG WT and E114A with ADP-ribose and FAD 76

Figure 17. bPARG E114A crystal stored in a cryoloop 78

Figure 18. Omit density corresponding to bound ADP-ribose 79

Figure 19. T. curvata PARG crystal structure in complex with ADP-ribose 82

Figure 20. T. curvata PARG and H. sapiens MACROD2 crystal structures in complex with ADP-ribose 84

Figure 21. SDS-PAGE gel of cell extracts from bPARG, MACROD2, bPARG-MD2 and MACROD2-bP proteins 85

Figure 22. SDS-PAGE analysis of MACROD2-bP, bPARG-MD2 hybrid and MACROD2 purified from nickel affinity chromatography 87

Figure 23. Western blot results of the PARG activity assays using bPARG, MACROD2-bP, bPARG-MD2 and MACROD2 88

Figure 24. Thermal shift assays for bPARG, MACROD2-bP, bPARG-MD2 and MACROD2 in presence and absence of ADP-ribose 90

Figure 25. Structures of ADP-ribose, FAD and a model of bPARG T. curvata bound to FAD 92

Figure 26. Structural comparison of PARG from T. curvata, canonical PARG from T. thermophila 95

Figure 27. SDS-PAGE analysis of TTPARG WT and variants using nickel affinity chromatography 96

Figure 28. Chromatogram of TTPARG E256Q gel filtration purification 97

Figure 29. UV-Vis spectra of purified TTPARG E256Q 98

Figure 30. Thermal shift assays for TTPARG WT and mutant versions in presence and absence of ADP-ribose 99

Figure 31. ITC Isotherm diagram, binding data for TTPARG WT with ADP-ribose 101

Figure 32. SPR protein array sensorgram of bPARG WT with ADP-ribose 102

Figure 33. Photograph of a TTPARG-PAR9 crystal mounted in a cryo-loop 104

Figure 34. View of TTPARG with PAR bound 108

Figure 35. Crystal structure of a PARG–PAR9 complex 109

Figure 36. Catalytic mechanism of poly-ADP-ribose hydrolysis by PARG 112

Figure 37. Schematic representation of hPARG protein 115

Figure 38. The order-disorder prediction of hPARG from IUPRED 116

Figure 39. Agarose electrophoresis analysis of HsPARG regulatory domain colony PCR reactions 118

Figure 40. SDS gels showing the expression of hPARG 1-388, 1-380, 1-365 and 1-329 before induction, after 2 hours of induction and after 18 hours of expression at various temperatures 119

Figure 41. SDS gel showing the expression of full length hPARG 120

Figure 42. Analysis of full-length hPARG large-scale expression 122

Figure 43. SDS-PAGE electrophoresis of hPARG 1-460 purified by nickel affinity chromatography 123

Figure 44. SDS-PAGE electrophoresis of hPARG 1-388 and 1-380 purified by nickel affinity chromatography 124

Figure 45. SDS-PAGE electrophoresis of hPARG 1-365 and 1-329 purified by nickel affinity chromatography 124

Figure 46. Peptide coverage of hPARG 1-460 degradation product in hPARG 1-460 sequence 125

Figure 47. Analysis of hPARG 1-388 after purification with size exclusion chromatography 127

Figure 48. Prediction of phosphorylation sites within the hPARG 1-388 fragment 128

Figure 49. Agarose gel electrophoresis of amplified DNA fragments for hPARG 1-388 after DpnI digest 129

Figure 50. SDS-PAGE gel of hPARG 1-388 expressing P. pastoris strains X-33 and KM71H, samples taken at various time points during methanol expression 132

Figure 51. Western blot of hPARG 1-388 expression trials in P. pastoris strains X-33 and KM71H 133

Figure 52. Analysis of expression of hPARG 1-388 using the P. pastoris X-33 expression system 135

Figure 53. Multi-angled light scattering (MALLS) chromatogram of hPARG 1-388 136

Figure 54. CD spectra of hPARG regulatory domain fragments used for deconvolution 138

Figure 55. NMR spectrum, Bruker 800 MHz spectrometer 139

Figure 56. SDS-PAGE gel of the pull down assay fractions, between hPARG regulatory domain (1-388) and catalytic domain hPARG (448-976) 140

Figure 57. hPARG 1-460 crystal stored in a cryoloop with corresponding diffraction pattern 142

Figure 58. hPARG 1-460 crystal stored in a cryoloop with corresponding diffraction pattern snapshots 143

Figure 59. hPARG 1-388 crystal stored in a cryoloop 144

Figure 60. SDS-PAGE gel showing expression of rat, mouse and chicken regulatory domain fragments 146

Figure 61. Schematic representation of genome arrangements of the macrodomain-sirtuin linked operon from Staphylococcus aureus 150

Figure 62. Schematic diagram of the relationship between the elements of the Staphylococcus extended operon components 151

Figure 63. SDS-PAGE analysis of SAV0323 nickel affinity chromatography elution fractions 153

Figure 64. Thermal shift assay for SAV0323 154

Figure 65. Multi-angled light scattering (MALLS) chromatogram of SAV0323 155

Figure 66. SAV0323 crystal stored in a cryo-loop 157

Figure 67. SDS-PAGE analysis of nickel affinity chromatography fractions of selenomethionine labelled SAV0323 158

Figure 68. Selenomethionine labelled SAV0323 crystal stored in a cryo-loop 159

Figure 69. Overall structure of SAV0323 161

Figure 70. Structural comparison of SAV0323 and MnsO8 (PDB: 4US5) 162

Figure 71. Structural comparison of SAV0323 and a bacterial luciferase (PDB: 3FGC) 163

Figure 72. Schematic representation of PARylation and PAR degradation with structures of key proteins involved 169

Figure 73. Schematic representation of translation and transcription of the newly discovered SirTM operon and the proposed mechanisms for the proteins involved 171

Table list

Table 1. Thermal cycling parameters for PCR reactions 44

Table 2. Thermal cycling parameters for colony PCR reactions 48

Table 3. Thermal shift assays of bPARG variants 75

Table 4. Thermodynamic binding parameters for bPARG WT and variants 77

Table 5. X-ray crystallographic statistics for data collection of bPARG E114 variant 80

Table 6. Tm for bPARG variants after thermal shift assay 90

Table 7. Thermal shift assays of TTPARG variants 99

Table 8. Thermodynamic binding parameters for TTPARG WT using ITC 101

Table 9. Thermodynamic binding parameters for TTPARG WT using SPR 103

Table 10. Crystallographic data and model refinement parameters for TTPARG-PAR9 106

Table 11. Deconvolution percentages for hPARG regulatory domain fragments 137

Table 12. Crystallographic data and model refinement parameters for SAV0323 160

ABBREVIATIONS

3D Three dimensional
ADP Adenosine diphosphate
ADP-HPD Adenosine diphosphate (hydroxymethyl)pyrrolidinediol
AIF Apoptosis inducing factor
ALC1 Amplified in Liver Cancer 1
APLF Aprataxin and PNK-like factor
ARTD ADP-ribosyl transferase
ATP Adenosine triphosphate
CD Circular dichroism
CHFR Checkpoint with FHA and RING finger domains
Da Dalton
dH2O Distilled water
DNA Deoxyribonucleic acid
DP Differential power
DSB Double strand break
EDTA Ethylenediaminetetraacetic acid
Fc Calculated structure factors
Fo Observed structure factors
FAD Flavin adenine dinucleotide
FMN Flavin mononucleotide
FPLC Fast Protein Liquid Chromatography
g Grams
GcvH-L Glycine cleavage system H-like
ITC Isothermal titration calorimetry
IPTG Isopropyl β-D-1-thiogalactopyranoside
KB Kilobase
Kd Dissociation constant
Km Michaelis-Menten constant
L Litre
LB Lysogeny broth
LpIA2 Lipoate-protein homolog
LLM Luciferase-like monooxygenase
ml Millilitre
mM Millimolar
mg Milligrams
M Molar
MACROD Macrodomain protein D
MALLS Multi-angle laser light scattering
min Minutes
MTS Mitochondrial targeting sequence
NAD Nicotinamide adenine dinucleotide
ng Nanograms
nl Nanolitres
nm Nanometres
nM Nanomolar
NMR Nuclear magnetic resonance
OYE Old yellow enzyme
OD600 Optical density at 600 nm
PAR Poly ADP-ribose
PARG Poly ADP-ribose glycohydrolase
PARP Poly ADP-ribose polymerase
PARylation Poly (ADP-ribosyl)ation
PBM PAR-binding linear motif
PBZ PAR-binding zinc finger
PCR Polymerase chain reaction
PCNA Proliferating cell nuclear antigen
PDB Protein data bank
PEG Polyethylene glycol
PIP PCNA interacting protein
pKa Acid dissociation constant
ppm Parts per million
Rcf Relative centrifugal force
Rmsd Root mean square deviation
RNA Ribonucleic acid
Rpm Revolutions per minute
SDS-PAGE Sodium dodecyl sulphate polyacrylamide gel electrophoresis
SMARCA5 SWI/SNF-related matrix-associated actin-dependent regulator of chromatin subfamily A member 5
SNM1 Sensitive to nitrogen mustard
SSB Single strand break
TAE Tris acetic acid EDTA
TARG1 Terminal ADP-ribose glycohydrolase 1
Tm Melting temperature
UV Ultra violet
V Volts
VEGF Vascular endothelial growth factor
WT Wild type
XRCC1 X-ray repair cross-complementing protein 1
YPD Yeast extract-peptone-dextrose

Abstract

Thesis submitted to the University of Manchester in 2015 for the degree of Ph.D in the Faculty of Life Sciences by Amy Brassington, entitled: Structure and function of poly ADP-ribose glycohydrolases

Post-translational modification of proteins by poly ADP-ribosylation is involved in numerous cellular processes such as chromatin restructuring, cell cycle progression and diversion, transcription, DNA repair, cell signalling, apoptosis, necrosis, replicative ageing and wound healing 1–8. As observed for other post-translational modifications, the process is tightly regulated, with proteins functioning as ‘readers’, ‘writers’ and ‘erasers’ coordinating the process 9. A superfamily of enzymes, known as poly (ADP-ribose) polymerases (PARPs), catalyses ADP-ribosylation of target proteins as well as the subsequent elongation of poly (ADP-ribose) (PAR) by transferring the ADP-ribose moiety from NAD+. Poly ADP-ribose glycohydrolase (PARG), on the other hand, catalyses the breakdown of PAR into predominantly ADP-ribose monomers 10–14. Until recently, research has primarily focused on the generation of PAR, with the degradation pathway being comparatively less studied.

The recent structure elucidation of a bacterial PARG from T. curvata proved a breakthrough in this field, leading to a postulated model for PARG catalysis in general. Results from our further work with a mutant bacterial PARG showed a role for the E114 residue in directing ligand binding and preventing FAD binding. PARG from T. curvata was found to be structurally homologous to the MACROD2 protein from H. sapiens. Both proteins can bind ADP-ribose, and each has a slightly different catalytic loop that is responsible for its catalytic activity. Our results suggest that these two loops are not interchangeable, and that the stability and activity of the catalytic loops are governed by the residues surrounding them.

We also present the crystal structure of a canonical PARG incorporating the PAR substrate. The two terminal ADP-ribose units of the polymeric substrate are bound in exo-mode. Our structure reveals that PARG acts predominantly as an exo-glycohydrolase and expands our understanding of the mechanism of poly-ADP-ribose degradation.

Although various canonical PARG structures have been reported recently 11,13,15–18, little insight is available into the structure and function of the large regulatory domain present in vertebrate PARGs. Here we present purification of the human PARG regulatory region from E. coli as well as initial crystallisation conditions. These results will support further studies in unravelling the complete mechanism of vertebrate PARG.

Finally, ADP-ribosylation is more widespread than initially thought, ranging from bacteria, including many virulent species, through to humans 19. A previously unrecognised class of sirtuins that possess ADP-ribosylation activity has been discovered in microbial pathogens 20. The sirtuin-mediated ADP-ribosylation seen in Staphylococcus aureus and Streptococcus pyogenes is dependent on lipoylation, and appears important for the response of these microbial pathogens to a host defence mechanism, oxidative stress 20. We present the ligand-free crystal structure of SAV0323, a bacterial luciferase-like enzyme that is implicated in this response. The structure contains many disordered loops, suggesting that FMNH2 and/or substrate binding are required to order the active site region.

DECLARATION

No portion of the work referred to in the thesis has been submitted in support of an application for another degree or qualification at the University of Manchester or any other university or other institute of learning.

COPYRIGHT

The author of this thesis (including any appendices and/or schedules to this thesis) owns certain copyright or related rights in it (the “Copyright”) and s/he has given The University of Manchester certain rights to use such Copyright, including for administrative purposes.

Copies of this thesis, either in full or in extracts and whether in hard or electronic copy, may be made only in accordance with the Copyright, Designs and Patents Act 1988 (as amended) and regulations issued under it or, where appropriate, in accordance with licensing agreements which the University has from time to time. This page must form part of any such copies made.

The ownership of certain Copyright, patents, designs, trade-marks and other intellectual property (the “Intellectual Property”) and any reproductions of copyright works in the thesis, for example graphs and tables (“Reproductions”), which may be described in this thesis, may not be owned by the author and may be owned by third parties. Such Intellectual Property and Reproductions cannot and must not be made available for use without the prior written permission of the owner(s) of the relevant Intellectual Property and/or Reproductions.

Further information on the conditions under which disclosure, publication and commercialisation of this thesis, the Copyright and any Intellectual Property and/or Reproductions described in it may take place is available in the University IP Policy (see http://documents.manchester.ac.uk/DocuInfo.aspx?DocID=487), in any relevant Thesis restriction declarations deposited in the University Library, The University Library’s regulations (see http://www.manchester.ac.uk/library/aboutus/regulations) and in The University’s policy on Presentation of Theses.

ACKNOWLEDGEMENTS

Firstly, I would like to sincerely thank Prof David Leys for giving me the opportunity to study for this degree and for his unwavering support throughout. Without him, I would not have made it to this point and for that I cannot thank him enough. In addition, I thank Dr Ivan Ahel and everyone in the Ahel group for their backing and continuous guidance within this field. Secondly, I would like to acknowledge Dr Mark Dunstan for all of his training and support; he has been an amazing mentor and I have learned so much from him. Thirdly, a sincere thank you to Dr Karl Payne, Dr Karl Fisher, Dr Mary Ortmeyer and Dr Colin Levy for all of their advice and expertise; I would not be at this point without the guidance they have given me. Finally, I thank all of the Leys group (Mark, Ste, Caro, Hanno, Laura, Frazer) and the larger enzymology family (Munro and Scrutton) for all of their help and friendship. This has been a great experience and I would like to wish everyone the best in their future endeavours. Thank you again.

I would like to dedicate this thesis to my wonderful family. To my mum, you have been my inspiration. You have made me the woman I am and without you I would not have gotten to this point. There aren’t enough words to thank you; you are my everything. Elliott, without you I am not sure I would have gotten through this. You have seen the highs and the lows of this Ph.D and have done nothing but encourage me; you are my rock. To my nanna and grandad, my uncle Steven and my Aunty Tracy, who have shown me nothing but love throughout this journey, and even more tolerance. Thank you all.

Chapter One

Introduction

1.1 Introduction to DNA damage and repair

DNA damage via double strand breaks often leads to cell death if left unrepaired, and can cause instability within the genome if the breaks are incorrectly repaired. DNA double strand breaks are often seen as a result of ionising radiation or anti-cancer drugs that promote these breaks. In addition, these breaks can occur naturally within the cell when normal cell metabolism produces reactive oxygen species, or as essential intermediates during meiosis 21, 22. One key way that the cell repairs these double strand breaks is homologous recombination, which often uses the sister chromatid as a template for the repair and acts in the late S and G2 phases of the cell cycle. This repair pathway is often referred to as ‘error-free’ because its use of the sister chromatid allows restoration of the damaged sequence 23. The other key repair pathway for double strand breaks is non-homologous recombination, or non-homologous DNA end-joining, which is more error-prone 24. If either of these repair pathways is defective, the affected cells have been shown to be sensitive to radiation, and such defects are associated with certain cancers and various immunological or neurological syndromes 25.

The most common DNA breaks are single strand breaks, of which there are thousands per cell every hour. These can severely disrupt the progression of transcription and, if not repaired rapidly, can develop into double strand breaks during genome duplication. In the absence of repair, these breaks are encountered during DNA replication and stall the replication fork, causing double strand breaks to accumulate 26 (see figure 1).

Recent research has shown the importance of post-translational modifications in the regulation of these repair pathways, and more specifically the importance of poly ADP-ribosylation (PARylation), a post-translational modification that acts as a central mediator coordinating the immediate detection of DNA damage 22. Furthermore, the repair pathways for both single and double strand DNA damage use PARylation as a mediator 26 (see figure 1).

Figure 1. Schematic representation of proposed PAR mediated single and double strand DNA repair mechanisms. Single strand DNA breaks are detected by PARP1, which ADP-ribosylates itself and possibly other proteins in the area of the break, e.g. histones. This activity recruits the putative chromatin remodelling proteins ALC1 and APLF and the scaffold protein XRCC1, which promotes repair of the DNA break by the single strand repair proteins. Double strand DNA damage is also detected by PARP1, which again ADP-ribosylates itself; this mediates the recruitment of MRE11 in cooperation with CtIP to initiate the first steps of homologous recombination. This initiates the XRCC1/LigII homologous recombination repair pathway. Based on information from Beck et al. and Caldecott et al. 22,26.

1.2 Poly ADP-ribosylation and glycosylation

Post-translational modification of proteins by poly ADP-ribosylation is involved in numerous cellular processes in both eukaryotes and prokaryotes, such as chromatin restructuring, cell cycle progression and diversion, transcription, DNA repair, cell signalling, apoptosis, necrosis, replicative ageing and wound healing 1–8. A superfamily of enzymes, known as poly (ADP-ribose) polymerases (PARPs), catalyses ADP-ribosylation of target proteins and the consequent elongation and branching of PAR. Poly (ADP-ribose) glycohydrolase (PARG), on the other hand, catalyses the breakdown of PAR into predominantly ADP-ribose monomers 10–12 (see figure 2). Poly ADP-ribose is a highly negatively charged and heterogeneous polymer consisting of branched and unbranched ADP-ribose chains linked via glycosidic ribose-ribose bonds (see figure 2).

Poly ADP-ribosylation of proteins starts with a PARP protein transferring an ADP-ribose moiety from NAD+ onto an acceptor protein, with subsequent release of nicotinamide and one proton (see figure 2). This ADP-ribose moiety is specifically transferred to glutamate or aspartate residues via an ester bond 13, 14. Further repeating units of ADP-ribose are added via a ribose-ribose 2’,1”-O-glycosidic bond to produce poly ADP-ribose (PAR) of up to 200 units 27. The poly ADP-ribose glycohydrolase protein (PARG) then hydrolyses the ribose-ribose bonds between the ADP-ribose units. The final ADP-ribose monomer is removed from the acceptor protein by Macrodomain containing protein 1 or Macrodomain containing protein 2 (MACROD1/MACROD2). Terminal ADP-ribose glycohydrolase 1 (TARG1) removes entire chains of poly ADP-ribose by cleaving the ester bond to the acceptor protein 27.

Figure 2. PAR synthesis/degradation pathway and the role of key enzymes. PARP enzymes transfer ADP-ribose from NAD+ to acceptor proteins via Glu and Asp residues, releasing nicotinamide and one proton. PARPs elongate the PAR polymer up to around 200 units; branching points are located around every 40–50 ADP-ribose units. PARG breaks down PAR into ADP-ribose, and MACROD1, MACROD2 or TARG1 remove the terminal ADP-ribose unit. TARG1 can also remove entire chains of PAR from the protein at the ester bond. Based on information from Barkauskaite et al. and Hottiger et al. 27,28.
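The bond bookkeeping implied by this pathway can be made concrete with a small sketch. The Python function below is illustrative only: it assumes a single linear (unbranched) PAR chain attached to one acceptor residue, and it assumes that PARG hydrolyses every ribose-ribose bond. The function name and dictionary keys are hypothetical and are not taken from any published tool or from the experimental work in this thesis.

```python
def par_degradation_summary(chain_length: int) -> dict:
    """Count bonds and products for one linear PAR chain on an acceptor protein.

    Assumptions (illustrative only): the chain is unbranched and PARG
    hydrolyses every ribose-ribose bond in it.
    """
    if chain_length < 1:
        raise ValueError("a PAR chain needs at least one ADP-ribose unit")
    # n units in a linear chain are joined by n - 1 ribose-ribose bonds.
    glycosidic_bonds = chain_length - 1
    return {
        "glycosidic_bonds": glycosidic_bonds,
        # Each hydrolysed ribose-ribose bond frees one ADP-ribose monomer.
        "adp_ribose_released_by_parg": glycosidic_bonds,
        # The protein-proximal unit stays on the Glu/Asp residue until
        # MACROD1, MACROD2 or TARG1 removes it.
        "protein_linked_units_remaining": 1,
        # One ester bond anchors the whole chain to the acceptor protein.
        "ester_bonds": 1,
    }


if __name__ == "__main__":
    # A chain at the approximate upper length limit quoted in the text.
    print(par_degradation_summary(200))
```

Under these assumptions, a 200-unit chain has 199 ribose-ribose bonds available to PARG and a single ester-linked unit left for MACROD1, MACROD2 or TARG1. Branched chains are outside the scope of this sketch.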

Poly (ADP-ribosylation) was first discovered over 50 years ago by the Chambon group 29. They discovered that adding NAD+ to hen liver extracts stimulated the production of poly (ADP-ribose). In humans, the family of enzymes responsible for ADP-ribosylation is the poly ADP-ribose polymerases (PARPs). The PARP superfamily consists of 17 members, all encoded by different genes 1, 5. The human PARP superfamily can be divided into three subclasses, based primarily on their structural features: 1) those forming PAR, consisting of PARPs 1–5; 2) those only capable of mono-ADP-ribosylation, consisting of PARPs 6–8, 10–12 and 14–16; and 3) the inactive PARPs, consisting of PARPs 9 and 13, which lack any NAD+ binding residues 8. Within this PARP superfamily, PARPs 1, 2 and 3 have been studied the most and have a vital role in the repair of DNA damage 22.

PARP1 was found to be activated by various types of DNA damage, including stalled replication forks as well as single and double strand breaks 26 (see figure 1); however, PARP2 has been suggested to recognise overhanging regions of DNA damage or ‘flap structures’. PARP3 has been implicated in responding to only double strand breaks 26,30– 33. It is commonly recognised that PARP1 and PARP2 are central components of the repair pathway for specific single strand breaks, known as base excision repair/single strand break repair process (BER/SSBR) 22, see figure 1.

It has been proposed that PARP1 mediates the repair of single strand breaks by recruiting the DNA repair protein X-ray repair cross-complementing protein 1 (XRCC1), which accumulates at the break and acts as a scaffold protein to associate and stabilise a number of other repair enzymes 26. The inhibition of PARP1 and 2 was shown to cause high sensitivity to agents which cause single strand breaks and base damage 5. In addition, the inhibition of PARylation has been shown to increase the presence of double strand breaks, and recent evidence shows that PARP mediates the repair not only of single strand breaks but also of double strand breaks, via PARP1 and PARP3 30,34–36.

It has recently been suggested that all catalytically active PARPs ‘transfer’ the ADP-ribose moiety from NAD+ onto a specific protein, and that all PARPs are therefore ‘transferases’ and not ‘polymerases’ 27. For this reason the PARP enzymes are also often referred to as ADP-ribosyl transferases (ARTDs) 2. In addition, some of the PARP enzymes only function as mono (ADP-ribosyl) transferases rather than PARPs 5. Hence, the PARP1 and PARP2 enzymes can catalyse the addition of further ADP-ribose monomers to create poly (ADP-ribose) (PAR) linked by O-glycosidic ribose-ribose bonds, whereas other smaller PARP enzymes can only transfer one ADP-ribose monomer onto acceptor proteins 5,37–39.


A number of virulent bacteria also have a family of mono ADP-ribosyl transferase protein toxins, which are involved in modifying specific host proteins by covalently transferring the ADP-ribose portion of NAD+ to these host proteins. Each of these modifications is unique to the type of bacteria and modified toxins entering the host cell 40. This family of bacterial ADP-ribosyl transferases can be divided into four subclasses.

The AB5 group, which includes the cholera toxin, consists of one A and five B subunits plus a target small G-protein. The AB3 group (including diphtheria toxin) ribosylates a diphthamide residue on the elongation factor; its members have a binding, a catalytic and a transmembrane domain. The single polypeptide group includes the C3 toxin from C. botulinum, the pathogenesis of which has not yet been characterised. Finally, the AB binary toxin group is made up of an A and a B subunit and includes the ADP-ribosyltransferase (or binary toxin) from the virulent pathogen Clostridium difficile, which is made up of an enzymatic 48 kDa domain and a 74 kDa regulatory domain 41. Since the initial discovery of mono ADP-ribosylation by bacterial toxins, a number of toxins have been identified as using this mechanism, such as Pseudomonas exoenzyme S, pertussis toxin, cholera toxin and diphtheria toxin, all of which disrupt the host cell’s regulatory, metabolic and biosynthetic pathways 42.

1.3 ‘Readers’, ‘writers’ and ‘erasers’ in PARylation

Within posttranslational modification of proteins via mono and poly ADP-ribosylation, there exist three general groups: the readers, writers and erasers. Readers recognise and bind ADP-ribose and its metabolites, e.g. the macrodomains binding to PAR. The writers are responsible for the modifications, i.e. PARP, which transfers/synthesises poly or mono ADP-ribose onto acceptor proteins. The erasers are responsible for the removal of the PAR signal, e.g. PARG 9.

These reader proteins contain globular domains with PAR affinity that bind to the PAR chain, either to allow recruitment to the appropriate site of the signal or to alter protein function 39,43. For example, PARP1, in response to DNA damage at various sites, will produce PAR at each point of damage along the DNA strand. Histone poly-ADP-ribosylation promotes dissociation of the histones from the DNA, allowing complementary DNA repair proteins to gain access to the now relaxed DNA structure. Furthermore, following activation, PARP1 is auto-poly ADP-ribosylated and serves as a scaffold protein during the repair process 28,44,45, see figure 3.


Figure 3. DNA repair mechanism and chromatin re-organisation by PARylation. 1) PARP1 associates with DNA at the sites of DNA damage; the subsequent PARylation promotes relaxation of the chromatin via two mechanisms: 2) direct PARylation of histones and other chromatin associated proteins; and 3) auto-modification of PARP1 to recruit repair proteins, acting as a scaffold for chromatin re-modellers such as SMARC5 and ALC1, and histone chaperones and variants such as APLF and MACRO-H2A. Based on information from Barkauskaite, et al, Tallis, et al and Haince et al 28,44,45.

The PAR-binding proteins have been classified into four groups: the PAR binding zinc fingers (PBZ), the macrodomains, PAR-binding linear motifs (PBMs) and WWE domains 8. These specific reader domains allow three key physiological functions: the binding of mono-ADP-ribosylated substrate proteins, PAR binding and localisation of proteins to sites of PARP activity, and turnover of ADP-ribose metabolites 9.

The PAR binding zinc finger (PBZ) domains have been associated primarily with proteins involved in DNA damage and repair, checkpoint regulation and PAR metabolism 46. The zinc finger has a consensus [K/R]x2Cx[F/Y]Gx2Cxbbx4Hx3[F/Y]xH motif, revealed by the recently solved crystal structure of the human E3 ubiquitin protein ligase checkpoint with forkhead and RING finger domains (CHFR), which contains a C-terminal PBZ domain, amino acids 425–664 (PDB: 2XOC) 9,47, see figure 4a. The crystal structure of the CHFR protein bound to ‘PAR-like’ substrates, together with additional biophysical analysis, revealed that the PBZ recognises the two adenine-containing subunits of PAR and the phosphate backbone connecting them, suggesting that PBZ motifs may recognise various different subunits of PAR 47. The p53 protein has been found to possess the signature PBZ domain known to associate with PAR 46,48, suggesting that these recognisable domains indicate which proteins may be key players within PARylation and the mechanisms which surround it. One research group noted that the PBZ motif located within CHFR and aprataxin PNK-like factor (APLF) strongly associated with PAR, and that depletion of zinc or mutation of the zinc binding finger inhibited the binding of PAR 46.

The WWE domain is so named because of its three conserved residues (tryptophan, tryptophan and glutamic acid) and is primarily found within the PARP superfamily and the E3 ubiquitin ligase proteins 9. The WWE domains have been structurally characterised in PARPs 11 and 14, as well as in the RING-type E3 ubiquitin ligase RNF146 49. The crystal structure of the RNF146 WWE domain (PDB: 3V3L) revealed that the WWE domain consists of six β strands which form a half β barrel, the other side of which is covered by an α helix. The adenine ring of the iso-ADP-ribose sits within the half β barrel pocket 50, see figure 4b.

Structural information regarding the PAR-binding linear motifs (PBMs) is limited. There are currently only two ligand-free crystal structures: one of the BRCT domain of XRCC1 (PDB: 2D8M) and the other of the mouse apoptosis inducing factor (PDB: 1GV4) 9,51.

Finally, the macrodomains are so named because of their identification within the macroH2A protein 52. Macrodomain proteins are often referred to as both readers and erasers, as they bind PAR as a ‘reader’ and break down PAR polymers as ‘erasers’. Macrodomain proteins, which include poly ADP-ribose glycohydrolase (PARG), have a distinct ‘macro’ fold, approximately 130–190 amino acids long, which can be found in most proteins known to bind ADP-ribose and PAR 53,54,46, see figure 4c. Numerous crystal structures of macrodomain proteins have been solved in the last 15 years, from the first structure in 2003, of the Af1521 protein from Archaeoglobus fulgidus (PDB: 1HJZ), to the human macroH2A1.1 protein bound to ADP-ribose (PDB: 3IID) 55–60, see figure 4c. The macrodomain is globular, consisting of a mixed β sheet and five α helices which form a groove for the binding of ADP-ribose, where ligand binding occurs through stacking of the adenine ring and hydrogen bonding to the distal ribose, strengthened by interactions with the pyrophosphate of the ADP-ribose 54,8. As previously mentioned, most macrodomain proteins bind ADP-ribose; however, only a few have catalytic activity. This is due to differences in the macrodomain amino acid sequence, which alter the protein’s preference for different NAD+ metabolites. A chromatin remodelling enzyme, ALC1, has been shown to bind PAR only, whereas macroH2A1.1 can bind both PAR and mono ADP-ribosylated proteins, in each case via the macrodomain 53. Recent structural studies of some positive strand RNA viruses show that they contain macrodomains, some with different activity profiles implicating them in different cellular pathways, some of which may involve RNA rather than ADP-ribose derivatives; these viruses may also affect the PAR signal and PAR metabolism 61–63. The number of macrodomain proteins keeps expanding as more proteins are discovered with these unique domains and catalytic activities. Recently, the structure of MACROD2 and enzymatic studies of both MACROD1 and MACROD2 revealed that these proteins cleave the terminal ADP-ribose from acceptor proteins, establishing that poly- and mono-ADP-ribosylation are completely reversible reactions 64,65. The recent study by Slade, et al revealed that PARG itself has a key macrodomain fold in its catalytic domain, which hydrolyses the ribose-ribose bonds within the PAR polymer, see ‘domain C’ of figure 5 13. Macrodomains can thus act as hydrolytic enzymes that release ADP-ribose and/or as PAR-binding modules 8.


Figure 4. Structures of ADP-ribose binder domains. (a) In light blue is the crystal structure of the PBZ-domain protein CHFR (PDB: 2XOC), bound to ADP. (b) In orange is the crystal structure of the WWE-domain protein RNF146 (PDB: 3V3L) bound to iso-ADP-ribose. (c) In green is the crystal structure of the macrodomain protein macroH2A1.1 (PDB: 3IID) bound to ADP-ribose. Images were created using QTMG software (CCP4i suite).

1.4 Poly ADP-ribose glycohydrolase (PARG) the ‘erasers’ of PARylation

PAR-levels within the cell are considered transient due to rapid PAR-turnover (1–6 minutes) 66. PAR-levels are considerably lower when there is little or no DNA damage, largely due to the regulation of the catalytic activity of PARP1 67.


Poly ADP-ribose glycohydrolase (PARG) is the key PAR processing protein, hydrolysing the ribose-ribose glycosidic bonds within the PAR chain 13. The human PARG protein is a multi-domain protein which exists in alternative isoforms: a full length 110 kDa nuclear form, two cytoplasmic isoforms of 99 and 102 kDa, and two mitochondrial isoforms of 60 and 55 kDa 68–70. It was originally believed that the different isoforms were the result of various genes encoding different PARG enzymes, as is true for the PARP superfamily. However, it was discovered that all isoforms arise from alternative splicing of the same transcript 71. The two cytoplasmic isoforms, PARG99 and PARG102, have cleaved nuclear localisation signals located within the N-terminus of the protein 72.

Until recently there has been very little information regarding the main purpose of the cytoplasmic isoforms (PARG102 and PARG99). However, recent research suggests that these are activated in response to cellular stress, not unlike the nuclear PARG111 version of the enzyme. The levels of cytoplasmic poly (ADP-ribose) chains are extremely low, and the chains are located mainly with the spindle apparatus during cell cycle regulation, causing the accumulation of cytoplasmic PARG. During stalled translation, PAR recruitment directly affects the proteins responsible for the expression of microRNAs. Consequently, microRNA-rich structures known as stress granules are formed. The formation of these stress granules is paramount during times of cellular stress; thus the formation and breakdown of PAR are equally important. The cytoplasmic PARG isoforms were found to co-localise with these stress granules, while the larger nuclear isoform PARG111 did not 72.

The PARG proteins have been found to be distributed throughout all domains of life, from bacteria through to humans, see figure 5. It was initially thought that bacteria did not undergo PAR metabolism; however, various studies have shown that some bacteria contain PARP1 homologues 13. More than 50 sequenced bacteria have shown divergent forms of PARG, and more than 300 PARGs have been found in 150 eukaryotic species 8,1. A number of the recently discovered PARG containing eukaryotes were originally thought to contain PARPs but no PARG activity, such as certain filamentous fungi, protozoa, mushrooms and the rotifer Adineta vaga, where the PARG protein is fused to repeats of PBZ, see figure 5.


Figure 5. Phylogenetic distribution of PARG proteins with clear division between canonical and bacterial PARGs. Domain A represents the regulatory domain of the Homo sapiens PARG protein (HsPARG), Domain B represents the accessory domain of the PARG from Tetrahymena thermophila (TTPARG) and of Homo sapiens PARG, and Domain C represents the core catalytic macrodomain. Domains B and C show the minimal catalytic domain of Homo sapiens PARG. Adapted from Dunstan et al and Barkauskaite et al 8,13.

1.5 The structures of bacterial and canonical PARGs

The PARG structure from the thermophilic bacterium Thermomonospora curvata (PDB: 3SIG) diverges considerably from the human PARG enzyme, consisting essentially of a macrodomain with a small, mostly alpha helical accessory domain in the N-terminal region, see figure 7a. The bacterial protein revealed a PARG specific catalytic loop with the signature sequence GGG-X6–8-QEE, in which the two conserved glutamic acid residues, Glu114 and Glu115, are essential for catalytic activity. The ADP-ribose is positioned close to the PARG specific catalytic loop, specifically the Glu115 residue, allowing direct hydrogen bonding with the 2’OH group of the distal ribose, see figure 6. This bacterial protein was shown to be as efficient as human PARG, degrading PAR into individual ADP-ribose monomers both in vivo and in vitro. It had no catalytic activity on individual ADP-ribose monomers, showing activity on the glycosidic bonds only 13.
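
The signature sequence described above lends itself to a simple pattern search. As a minimal sketch, assuming the signature is read as three glycines, followed by six to eight residues of any type, followed by Gln-Glu-Glu, a regular expression can flag candidate catalytic loops in a protein sequence given in one-letter code. The function name and the toy sequence are hypothetical and serve for illustration only; the toy sequence is not the T. curvata PARG sequence.

```python
import re

# Signature quoted in the text: GGG-X6-8-QEE.
# Assumption: "X" is any residue written as an upper-case one-letter code.
PARG_LOOP = re.compile(r"GGG[A-Z]{6,8}QEE")


def find_parg_loops(sequence):
    """Return (start_index, matched_text) for each non-overlapping match.

    start_index is 0-based and refers to the position of the first glycine.
    """
    seq = sequence.upper()
    return [(m.start(), m.group()) for m in PARG_LOOP.finditer(seq)]


# Hypothetical toy sequence, invented for this example only.
toy_sequence = "MKTAGGGAVLSTPRQEEALK"

if __name__ == "__main__":
    for start, loop in find_parg_loops(toy_sequence):
        print(start, loop)
```

A match of this kind only marks a candidate loop; it does not by itself establish a macrodomain fold or PARG activity, which would still need structural or biochemical confirmation.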

Using a combination of mutagenesis and X-ray crystallography, Slade, et al derived a mechanism for the hydrolysis of PAR 13, see figure 6. Owing to the size restriction within the enzyme, accommodating an n+1 ADP-ribose (n being the ADP-ribose observed in the crystal) would require significant conformational change. For this reason T. curvata PARG was proposed to act as an exo-glycohydrolase 13. In the proposed mechanism, the ribose OH leaving group is protonated by the Glu115 residue, forming a positively charged oxocarbenium intermediate, which is stabilised by the diphosphate group. This intermediate is subsequently attacked by a nearby water molecule to release the ADP-ribose monomer. The oxocarbenium intermediate is stabilised by the adenosine diphosphate group, with a phenylalanine, Phe227, in close contact, see figure 6. Although Glu114 is not directly involved in the mechanism, any site directed mutagenesis of this residue caused the enzyme to be inactive, suggesting that this residue may be responsible for directing or guiding ligand binding. Mutagenesis of Glu115 also inhibited the enzyme, as would be expected because of its direct role in the hydrolysis of the ribose-ribose bond 13. The bacterial PARG revealed an exo-glycohydrolase catalytic mechanism; however, this may be due to the simplicity of this protein when compared with the more complex canonical PARG, which may allow endo-glycohydrolase activity. This endo/exo issue has been discussed previously but, as yet, no agreement has been reached 73–75.

Figure 6. Proposed catalytic mechanism for the bacterial PARG from Slade, et al. 2011. Protonation of the ribose OH leaving group by Glu115 forms a positively charged oxocarbenium intermediate. The intermediate is stabilised by the diphosphate group and is subsequently attacked by a water molecule, cleaving the ribose-ribose glycosidic bond and releasing the ADP-ribose monomer.


Canonical PARGs from higher organisms are larger and more complex than the bacterial PARG proteins, with extra domains, see figure 7. The catalytic domain is conserved across species, with the central macrodomain adopting the same fold from bacteria through to humans, see figure 7 d, e & f. In addition to the bacterial catalytic domain, the canonical PARG contains a C-terminal extension, domain B in figure 5. The structure of the recently crystallised canonical PARG protein from Tetrahymena thermophila (TTPARG) showed this C-terminal extension domain, which is predicted to stabilise the PARG specific catalytic loop within the macrodomain, acting in a similar manner to the comparatively smaller N-terminal extension seen in the bacterial PARG, see green region in figure 5. The TTPARG structure provides a model for the minimal catalytic region within the human PARG 16. The human PARG protein has an extra putative regulatory domain not seen in the smaller canonical TTPARG. As in the bacterial catalytic mechanism, the canonical PARG forms an oxocarbenium intermediate, which is stabilised by close proximity to the adenosine diphosphate of the ADP-ribose molecule. Again, mutations to the two key glutamic acid residues within the PARG specific catalytic loop inactivate the enzyme; E114 and E115 of the bacterial PARG correspond to E255 and E256 in TTPARG. The E256 residue protonates the ribose OH leaving group. Site mutations to the E255 residue also render the enzyme inactive, even though this residue is not seen to be directly involved in catalysis. Phe227, which stabilises the oxocarbenium intermediate in bacterial PARG, is replaced by Phe371 in TTPARG, which acts in the same way 16.

Toward the N-terminus, vertebrate PARGs have a short mitochondrial targeting sequence (MTS) and a large regulatory region (1-460 and 1-456 in human and rat respectively), which is poorly understood and not required for PARG activity in vitro 17,15,76,13. The recently released structure of the catalytic domain of rat PARG (residues 1-398) shows key similarities to the canonical PARG from Tetrahymena thermophila; however, it includes an MTS domain, which was proposed to play a role in stabilising a tyrosine clasp element that is present in the active sites of all canonical PARGs 17. This tyrosine clasp contains a specific tyrosine residue which is not seen in the bacterial PARG protein, and was proposed to be a flexible domain 17. However, a more recent comparison with other PARG structures suggested that this domain is unlikely to be flexible 8. Although the authors claimed that the rat mitochondrial targeting sequence (residues 457-472) is integral to the catalytic activity, other findings suggest that a truncated human protein lacking this mitochondrial targeting sequence still retains high PAR hydrolysing activity, and that canonical TTPARG also retains PAR hydrolysing activity without this clasp 16.

Structures of new canonical PARGs showed a different possibility for ligand binding compared with bacterial PARG. The 2’ OH portion of the terminal ADP-ribose of the PAR ligand is solvent exposed, suggesting that the canonical PARGs have the potential to bind the PAR polymer in multiple locations, not just at the end, see figure 7 13,16,15,77. Further studies by Dunstan, et al revealed that the canonical PARG active site can fit additional linear ADP-ribose units and possibly provide endo-glycohydrolase activity 16. However, structural data of canonical PARG bound to poly ADP-ribose has been unobtainable owing to the lack of large quantities of homogeneous, defined length PAR that would be needed for crystallography; a recent paper has suggested that this may now be possible 78.


Figure 7. Comparisons of the overall structures and active sites for bacterial, eukaryotic and human PARGs. (a) Structure of bacterial PARG from T. curvata in complex with ADP-ribose (PDB: 3SIG). (b) Surface representation of the active site from bacterial PARG bound to ADP-ribose. (c) Structure of canonical PARG from T. thermophila in complex with ADP-ribose (PDB: 4EPP). (d) Surface representation of the active site of canonical PARG from T. thermophila in complex with ADP-ribose. (e) Structure of human PARG in complex with ADP-ribose (PDB: 4B1H). (f) Surface representation of human PARG in complex with ADP-ribose. Purple regions show alpha helices and yellow regions show sheets. Images were created using QTMG software (CCP4i suite).


1.6 Mono ADP-ribosylation by macrodomains

PARG is implicated as the key protein responsible for the degradation of PAR into ADP-ribose monomers; however, until recently it was unknown how the terminal ADP-ribose monomer was removed from the modified protein. In 2013, three groups identified the three elusive proteins responsible for this reaction: MACROD1, MACROD2 and TARG1/human C6orf130 (terminal ADP-ribose glycohydrolase 1) 60,79,80. Previous studies on these proteins had revealed that they were novel O-acetyl-ADP-ribose deacetylases, suggesting an involvement in sirtuin signalling; however, the new studies in 2013 revealed that the catalytic mechanism for the removal of the terminal ADP-ribose unit is different 81,82. MACROD1 and MACROD2 are able to cleave the bond between the glutamate and the ADP-ribose monomer by substrate-assisted cleavage of this ester linkage, as opposed to the acid-base catalysis seen in PARG hydrolysis 60. The structure and biochemical analysis of the TARG1 protein (PDB: 4J5S) suggested that the catalytic mechanism involves a transient covalent TARG1-lysyl-(ADP-ribose) intermediate. In addition, the TARG1 protein showed solvent access at the terminal ribose 2’ OH group, which suggested that this protein is not only capable of removing the terminal ADP-ribose unit, but can also bind PAR within the chain. This would suggest that TARG1 may have the ability to remove entire chains of PAR from the modified protein, see figure 8 80.

Figure 8. Surface representation of TARG1 bound to ADP-ribose (PDB: 4J5S). The terminal ribose 2’ OH group is solvent exposed, suggesting the potential for PAR binding. Images were created using QTMG software (CCP4i suite).


The MACROD1 and MACROD2 proteins have been reported to have nearly identical macrodomains within the catalytic domain, but MACROD1 has an additional MTS toward the N-terminus of the protein 60,63. The specific biological role of these proteins still remains unknown; however, recent research has suggested that MACROD2 is involved in reactivation of the glycogen synthase kinase 3 beta (GSK3β kinase) and MACROD1 was proposed to act as a cofactor for androgen and oestrogen receptors 27. The recent revelation that these proteins cleave the terminal ADP-ribose units from modified proteins indicates a key role in this post-translational modification.

1.7 Therapeutic implications of PARG

The widespread involvement of poly (ADP-ribosylation) in cell signalling pathways, DNA damage, cell trafficking, necrosis and apoptosis means that the enzymes involved have become key therapeutic targets in numerous diseases, in particular cancer.

Early PARP knockout models in mice (PARP-/- ES cells) showed that the mice were still viable and fertile; however, they had increased sensitivity to gamma irradiation and DNA damaging agents, presenting PARPs as a key therapeutic target in the role of DNA damage and repair 83. Cortes, et al demonstrated that, like the PARP knockout models, mice lacking the PARG 110 kDa protein were viable and fertile but were also extremely sensitive to DNA damaging agents 84. Further research then showed that not only were PARG null cells sensitive to DNA damaging agents, but the absence of PARG also leads to increased cell death. The mechanism behind this induction of cell death has been predicted to result from apoptosis inducing factor (AIF) being released from the mitochondria and transported to the nucleus in response to elevated levels of PAR. In the absence or deletion of PARG, the levels of PAR continue to increase within the nucleus, and AIF is transported to the nucleus, where it binds histone H2AX and Cyclophilin A (CypA) to induce DNA disintegration and cell death, an effect that is more pronounced after UV treatment 85. The lack of PARG not only causes the release of AIF and thus the initiation of apoptosis, but also induces complications within mitosis when coupled with radiation. The lack of PARG within cells causes the augmentation of mitotic equipment leading to apoptosis 86. The mass depletion of intracellular ATP arising from increasing levels of NAD+ being converted to ADP-ribose via PARP has also been shown to lead to cell death in an ischemic rat model after inhibition of PARG 87.


Adenosine diphosphate (hydroxymethyl)pyrrolidinediol (ADP-HPD) was one of the first chemically produced specific inhibitors of PARG, showing almost a 50% reduction of PARG activity in vitro 88. Other possible inhibitors include mono-galloyl glucose derivatives, N-bis-(3-phenyl-propyl)9-oxo-fluorene-2,7-diamide (GPI 16552) and tannins. Mono-galloyl glucose derivatives were found to significantly reduce the half-life of PARG, causing an increase in levels of PAR in vitro 89.

GPI 16552, a specific synthetic PARG inhibitor, was designed to aid the activity of temozolomide (TMZ) in malignant melanoma in mice. TMZ, a methylating agent, is currently in use as an anticancer drug in the treatment of brain tumours. With the systemic addition of the small molecule PARG inhibitor GPI 16552, the malignant melanomas were significantly more sensitive to TMZ, reducing the growth of the tumours in vitro almost 3 fold. Furthermore, the combined treatment of TMZ and GPI 16552 reduced the number of metastases almost 10 fold in mice (P < 0.0001) 90.

In 2009, gallotannin was used to inhibit the activity of PARG in human colon carcinoma cell lines. Gallotannin inhibited the expression of PARG within the cells and subsequently increased the levels of PAR. Furthermore, a connection was found between the inhibition of PARG and the activity of PARP. Levels of PARG and PARP within cancer cells are significantly higher than in normal cells, and decreasing the expression of PARG led to decreased levels of PARP within the carcinoma cells. In addition, PARP1 was found to have a direct effect on the levels of vascular endothelial growth factor (VEGF) and basic epithelial growth factor (b-EGF), which are responsible for increased angiogenesis. These results indicate an indirect relationship between the levels of PARG and the levels of VEGF and b-EGF 91. Gallotannin, however, cannot easily pass through the blood brain barrier, and the other inhibitors show low potency and short half-lives as well as difficulties penetrating the cell membrane 92.

The non-cell permeable ADP-HPD molecule is thought to be increasingly promising as a specific inhibitor of PARG; however, further research since the initial study in 1995 has been limited 88. Slade, et al revealed the structure of bacterial PARG from T. curvata bound to the ADP-HPD inhibitor 13. As new PARG structures become available there will be greater opportunities to develop new or improved inhibitors via structure based drug design.


1.8 Recruitment of PARG and the relationship between PARG and PARP

The presence of PARG in areas of DNA damage is elevated owing to the high transient levels of PAR being synthesised by PARP 44. The question remains as to how synergistically PARG works with PARP enzymes: does the presence of PARPs and PAR recruit PARG, or is PARG recruited independently?

Isoforms of PARG have been seen to shuttle between the nucleus and cytoplasm; however, their role within DNA damage was previously unknown. Using GFP tagging of PARG111, PARG102 and PARG99, Mortusewicz, et al were able to determine PARG recruitment to the areas of DNA damage 76. PARG99 and PARG102 are recruited to areas of DNA damage along with PARG111 in vitro. However, these took significantly longer to reach damaged areas, and the final concentrations of these smaller isoforms were much lower than that of the full length isoform. A smaller PARG 60 kDa isoform, localised to the mitochondria, is not recruited to areas of DNA damage. Using PARP inhibitors and PARP1, it was found that the amount of PARG recruited to sites of DNA damage was reduced by 50%. This result suggested that poly ADP-ribosylation may not be the only trigger for the recruitment of PARG, and that other mechanisms may be involved. Using various isoforms of PARG, it was predicted that the putative regulatory domain of the PARG 111 kDa isoform contains a ‘PIP box’ (PCNA interacting protein box) motif, located between residues 76–82, which binds the proliferating cell nuclear antigen (PCNA). These results suggested that PARG is recruited to areas of DNA damage via two pathways, one PARP dependent and one PCNA dependent. Direct docking of PARG with PARP1 or poly (ADP-ribose) allowed access to areas of DNA damage via the opening and relaxation of chromatin, giving access to the surrounding poly (ADP-ribosylated) proteins. PARG binding to PCNA acts as an alternative mode of reaching areas of DNA damage, independent of PARP and PAR; this process proved to be a much slower means of PARG recruitment 76. One could infer that this allows for a second wave of PAR degradation, providing broader control of the local levels of poly ADP-ribose.

Maruta, et al found that the vital role of PARG recruitment at sites of DNA damage was not only to control the levels of PAR, but also to provide energy to proteins involved in DNA repair 93. Since the liberated monomeric ADP-ribose units are quickly converted to ATP via ADP-ribose pyrophosphorylase, the proteins involved in DNA repair are provided with the intra-nuclear increase in local ATP levels needed during the repair process 93. As the key function of PARG is the degradation of PAR and that of PARP is the synthesis of PAR, we would expect the inhibition or silencing of these two proteins to result in opposite biological effects. On the contrary, a number of recent studies have shown that both of these proteins are needed for effective repair of DNA damage, showing a synergy between them 94–97. More recent studies have shown that inhibition of PARG specifically kills BRCA2-deficient tumour cells 98. However, targeting both PARP1 and PARG together has shown no increase in cell sensitivity to chemotherapy agents, whereas targeting these proteins separately can lead to greater chemotherapeutic efficacy 96.

1.9 Parthanatos and the release of PAR chains

As previously mentioned, the AIF protein has been shown to play a key role in PAR related cell death 85. PAR itself, independent of proteins, has also recently been shown to be a key molecule for predicting cell death, as it is able to translocate from the mitochondria to the nucleus and initiate apoptosis 99,100; this type of cell death is known as parthanatos. A recent study showed that AIF is a high affinity PAR binding protein and that this PAR binding is essential for parthanatos in vivo and in vitro. The group found that AIF bound PAR at a site distinctly different from the AIF DNA binding site, and that the PAR binding itself initiates the release of AIF from the mitochondria 101. The question still remains as to how PAR free from proteins is released from the nucleus 8.

As previously mentioned, the MACROD1, MACROD2 and TARG1 proteins are the key enzymes responsible for hydrolysis of the terminal ADP-ribose unit that remains after PARG degradation or PARP-dependent mono-ADP-ribosylation 60,79,80. Further to this, a recent study showed that TARG1 can remove not only the terminal ADP-ribose unit, but also the entire PAR chain from the modified protein, suggesting that this protein may contribute to parthanatos 80. This suggested binding at the root of the PAR chain differs from the binding of PARG to PAR, which is suggested to occur at the end of, or at points along, the PAR chain.


Until the work completed in this thesis, no structures of PARG or the other key macrodomain proteins bound to the PAR polymer were available. As a result, PAR binding was not completely understood or defined.

1.10 PAR in bacteria and endotoxins

The sirtuin family of proteins, more specifically Sir2 (silencing information regulator 2), was initially discovered over 20 years ago in the yeast Saccharomyces cerevisiae as a transcriptional silencer of the mating type loci 102. Since this initial discovery, these proteins have been described as deacylases, with the ability to remove acyl groups, and are primarily involved in metabolism and DNA repair 103. These proteins have recently been divided into five subclasses based primarily on substrate preference: Class I, which incorporates human SirT1, SirT2 and SirT3 along with other sirtuins; Class II, which incorporates human SirT4 along with other sirtuins; Class III, which incorporates human SirT5 along with other sirtuins; Class IV, which incorporates human SirT6 and SirT7 along with other sirtuins; and finally Class U, which incorporates a number of other sirtuins. Although these proteins have been reported primarily as deacylases 103, two recent studies have shown possible links to ADP-ribosylation 104,105.

In a recent study, Rack et al. identified a new class of sirtuins, here described as SirTMs, which are found in pathogens and were shown to function as ADP-ribosyl transferases 106. Further research into this new group of pathogen-specific sirtuins revealed a genetic link to a specific subclass of macrodomain proteins, which themselves reverse the ADP-ribosylation catalysed by the sirtuins, see figure 9. The sirtuin domains and macrodomains are fused in a number of fungi, or remain adjacent to one another within the same operon in other species. Within a number of other pathogens, such as Lactobacillales and Staphylococcaceae, the two genes are found within extended operons, where the macrodomain and sirtuin genes are flanked by genes encoding the glycine cleavage system H-like (GcvH-L) protein and a lipoate-protein ligase homolog (LplA2), see figure 9. The LplA2 proteins scavenge lipoate and play a key role in the virulence of microbial pathogens. The sirtuin-mediated ADP-ribosylation within Staphylococcus aureus and Streptococcus pyogenes was also found to be dependent on another posttranslational modification, lipoylation; see the LplA2 gene (yellow) upstream of the sirtuin gene in figure 9 106.


Figure 9. Schematic representation of genome arrangements of the macrodomain-sirtuin linked operons. The blue arrows represent the gene encoding a glycine cleavage system H-like protein (GcvH-L), the green arrows represent the gene encoding the macrodomain protein, the red arrows represent the gene encoding the sirtuin protein and the yellow arrow represents the gene encoding a lipoate protein ligase A (LplA2). Based on Rack et al. 106.

The structural and biochemical analysis of this novel sirtuin/macrodomain system revealed crosstalk between ADP-ribosylation and lipoylation, and showed that these posttranslational modifications are key in the response of these microbial pathogens to oxidative stress, a potent host defence mechanism 106.

1.11 Project objectives

The key objectives of this project involved the biophysical and structural characterisation of PARGs from bacteria through to human to help improve our understanding of the posttranslational ADP-ribosylation and poly ADP-ribose hydrolysis.

The results will be divided into four key sections. Firstly, studies will focus on establishing further information on the bacterial PARG protein and its structural homolog MACROD2 from humans. Efforts will focus on gaining binding data and structural crystallographic data for bacterial PARG mutants and activity studies of an engineered MACROD2 protein with an inserted PARG-specific catalytic loop.

Secondly, we will focus on structurally characterising the canonical PARG protein from Tetrahymena thermophila with the PAR substrate in order to help understand whether this protein acts as an endo- or exo-glycohydrolase. Thirdly, we will attempt to crystallise human PARG protein, more specifically the elusive regulatory domain, in an attempt to gain a structure-based understanding of the regulation of the PARG protein. Finally, we will attempt to obtain structural information on the previously uncharacterised bacterial luciferase-like monooxygenase (LLM) protein SAV0323, which is genetically linked to the pathogenic SirTM operon.


Chapter 2

Materials and methods

2.1 Materials

All materials were purchased from Sigma Aldrich, UK unless stated otherwise.

2.2 E. coli and P. pastoris molecular biology methods

2.2.1 Determination of DNA concentration

DNA concentration of samples was determined using the Nanodrop 2000c (Thermo Scientific, USA). DNA absorbance was measured at 260 nm. The Beer-Lambert equation is modified to use a conversion factor with units of ng-cm/µl. Modified equation:

C = (A × CF) / L

Where,

C = the nucleic acid concentration in ng/µl

A = the sample absorbance

CF = conversion factor in ng-cm/µl

L = the pathlength in cm

Conversion factors used are:

• Double-stranded DNA: 50 ng-cm/µl
• Single-stranded DNA: 33 ng-cm/µl
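
As a minimal sketch of the modified Beer-Lambert calculation above, the following Python function converts an absorbance reading at 260 nm into a nucleic acid concentration. The function name and the example absorbance are illustrative assumptions and are not part of the instrument software; the conversion factors are those listed above, taken here to be in ng-cm/µl.

```python
# Conversion factors from the text (assumed units: ng-cm/µl).
CONVERSION_FACTORS = {"dsDNA": 50.0, "ssDNA": 33.0}


def dna_concentration(a260, nucleic_acid="dsDNA", pathlength_cm=1.0):
    """Return C = (A x CF) / L, the concentration in ng/µl."""
    if pathlength_cm <= 0:
        raise ValueError("pathlength must be positive")
    return a260 * CONVERSION_FACTORS[nucleic_acid] / pathlength_cm


# Hypothetical reading: A260 = 0.5 at a 1 cm pathlength.
print(dna_concentration(0.5))           # 25.0
print(dna_concentration(0.5, "ssDNA"))  # 16.5
```

A shorter pathlength gives a proportionally higher concentration for the same absorbance, which is why the pathlength appears in the denominator.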

2.2.2 Polymerase chain reaction (PCR)

PCR reactions were performed using 100 ng of template DNA, 0.4 µl 10 mM dNTPs, 1 µl 10 µM forward primer, 1 µl 10 µM reverse primer, 4 µl 5X Phusion HF buffer, 0.2 µl Phusion DNA polymerase (New England Biolabs) and made up to 20 µl with nuclease-free water. Reaction mixtures were thermally cycled using the BIO-RAD C1000™ thermal cycler, see table 1.


Step | Temperature (°C) | Time | Number of cycles
Initial denaturation | 98 | 30 seconds | 1
Denaturation | 98 | 10 seconds | 35
Annealing | A | 30 seconds | 35
Extension | 72 | B seconds | 35
Final extension | 72 | 10 minutes | 1

Table 1. Thermal cycling parameters for PCR reactions. Temperatures are given for each step with timings and number of cycles.

The annealing temperature ‘A’ was set 5 °C lower than the Tm (melting temperature) of the primers and the extension time ‘B’ was 30 seconds per kilobase (kb). The reaction mixture was then cooled to and held at 4 °C.

Tm = 4 (G+C) + 2(A+T)
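
The cycling parameters ‘A’ and ‘B’ can be worked out from the primer sequences and amplicon length. The sketch below applies the Tm formula above together with the two rules stated in the text (annealing 5 °C below the Tm, extension at 30 seconds per kb). The function names and primer sequences are hypothetical, and using the lower of the two primer Tm values is an assumption, since the text does not say which primer sets the annealing temperature.

```python
def wallace_tm(primer):
    """Tm = 4(G+C) + 2(A+T), in °C, as given in the text."""
    seq = primer.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at != len(seq):
        raise ValueError("primer contains characters other than A, C, G, T")
    return 4 * gc + 2 * at


def annealing_temperature(forward, reverse):
    """Annealing temperature 'A': 5 °C below the Tm.

    Assumption: when the primers differ, the lower Tm is used.
    """
    return min(wallace_tm(forward), wallace_tm(reverse)) - 5


def extension_time_seconds(amplicon_bp):
    """Extension time 'B': 30 seconds per kilobase."""
    return 30 * amplicon_bp / 1000


# Hypothetical 20-mer primers and a hypothetical 1.5 kb amplicon.
fwd = "ATGCATGCATGCATGCATGC"
rev = "GGGGCCCCAAAATTTTGGCC"
print(wallace_tm(fwd), wallace_tm(rev))  # 60 64
print(annealing_temperature(fwd, rev))   # 55
print(extension_time_seconds(1500))      # 45.0
```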

After thermal cycling, PCR reaction mixtures were digested using DpnI to remove the template DNA from the reaction mixture. 20 units of DpnI were added to the 20 µl reaction mixture and incubated at 37 °C for 2 hours, followed by heat inactivation at 80 °C for 20 minutes.

2.2.3 PCR product purification

PCR product purification was completed using the QIAquick PCR purification kit following the manufacturer’s protocol (QIAGEN, UK). The kit uses a bind-wash-elute procedure. Binding buffer is added directly to the PCR product and the mixture is applied to the QIAquick spin column. The DNA adsorbs to the silica membrane in the high-salt conditions provided by the buffer. Impurities are washed away and pure DNA is eluted with a small volume of low-salt buffer or water.

2.2.4 Agarose gel electrophoresis

Agarose gel electrophoresis was used to estimate the yield and purity of DNA obtained from restriction digests, PCR reaction mixtures or PCR product purifications. A 1% agarose gel was made by dissolving 0.5 g agarose in 50 ml Tris-acetate-ethylenediaminetetraacetic acid (TAE) buffer. The 1X TAE buffer was obtained by dilution from a 50X stock TAE (242 g/L Tris-base, 57.1 ml acetic acid and 100 ml 0.5 M EDTA, pH 8.0). The agarose solution was left to cool to approximately 40-50 °C before ethidium bromide was added to a final concentration of 0.5 µg/ml. The agarose solution was poured into an electrophoresis chamber, a comb was added to yield the desired number of wells and the mixture was left to solidify. Samples were prepared for loading by the addition of 1 µl 6X DNA loading dye (0.25% bromophenol blue, 0.25% xylene cyanol and 30% glycerol) to a 5 µl sample. GeneRuler 1 kb DNA ladder (Fermentas, Loughborough, UK) was used as a DNA marker. The gel was run at 90 V and DNA bands were visualized using a UV transilluminator (Syngene, Cambridge, UK).

2.2.5 Restriction enzyme DNA digest of plasmid DNA

Restriction digests were conducted using 0.5-1 µg of DNA, 5-10 units of the desired restriction enzymes (New England Biolabs), 0.05 µg BSA, 5 µl 10x New England Biolabs (NEB) buffer 4 and made up to 50 µl with dH2O. The reaction mixture was incubated at 37 °C for 2-3 hours, followed by incubation at 70 °C for 20 minutes to inactivate the enzymes.

2.2.6 Purification of Plasmid DNA from E. coli

Plasmid DNA was purified from E. coli using the QIAprep Spin Miniprep kit following the manufacturer’s protocol (QIAGEN, UK). Cultures of 15 ml were grown overnight in lysogeny broth (LB) supplemented with the correct antibiotic. Cells were then harvested by centrifugation at 4000 rcf for 10 minutes. The pelleted cells were re-suspended in buffer P1 and lysed by the addition of buffer P2 (alkaline lysis buffer). N3 buffer was added to neutralise the lysate and the cellular debris was removed by centrifugation at 16,000 rcf for 10 minutes at 4 °C. The DNA-containing supernatant was applied to a spin column and the DNA was bound to the silica membrane by centrifuging at 16,000 rcf for 30 seconds at 4 °C. The column was then washed with PB and PE buffer and the DNA was eluted in dH2O.

2.2.7 In Fusion Cloning reaction

The In-Fusion cloning reaction was completed using the In-Fusion enzyme, which fuses PCR-generated DNA sequences and linearized vectors by recognizing a 15 bp overlap at their ends. These 15 bp extensions (5’) were complementary to the ends of the linearized vector. The In-Fusion cloning reaction was completed using a 2:1 vector to insert ratio, calculated using the Clontech online tool [http://bioinfo.clontech.com/infusion/molarRatio.do], using the In-Fusion Advantage PCR Cloning Kit (Clontech, CA, USA).

Each cloning reaction required 2 µl of In-Fusion enzyme to be added to the appropriate amount of vector and insert. The reaction mixture was made up to 10 µl with dH2O and incubated at 50 °C for 15 minutes, followed by immediate transfer to ice for 5 minutes. The reaction mixture was diluted five-fold and 2.5 µl of the dilution was used to transform competent E. coli.

The pET28a plasmids were digested with NdeI and XhoI restriction enzymes and purified using the plasmid purification kit. The Clontech online tool: [http://bioinfo.clontech.com/infusion/molarRatio.do] was used to determine the amount of DNA to use for the cloning reaction with a ratio of 3:1 insert to vector, respectively. Following In-Fusion cloning, 2.5 µl of cloning reaction was used to transform NEB5-alpha E. coli cells.
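
The kind of calculation a molar-ratio tool performs can be approximated by hand. The sketch below is not the Clontech tool itself; it assumes that the average mass per base pair is the same for vector and insert, so that the number of moles scales with mass divided by length. The function name, fragment sizes and vector mass are hypothetical.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, insert_to_vector_ratio=3.0):
    """Mass of insert (ng) giving the requested insert:vector molar ratio.

    Assumes equal average mass per base pair for both fragments, so
    moles are proportional to mass / length.
    """
    if vector_bp <= 0 or insert_bp <= 0:
        raise ValueError("fragment lengths must be positive")
    return insert_to_vector_ratio * vector_ng * insert_bp / vector_bp


# Hypothetical example: 100 ng of a 5000 bp linearized vector and a
# 1000 bp insert at a 3:1 insert:vector molar ratio.
print(insert_mass_ng(100, 5000, 1000))  # 60.0
```

Because the insert is shorter than the vector, a 3:1 molar excess of insert corresponds to less insert than vector by mass.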

Bacterial hybrid genes were ordered from Genscript, USA, and 2 µg of each construct was delivered, cloned into a pmcnEAVNH plasmid. The human MACROD2 gene was ordered codon-optimised for expression in E. coli. The hybrid genes, bPARG WT and MACROD2 WT were all designed to be under the control of the T7 promoter and lac operator. Primers were designed with complementary sequences for incorporation of the gene into a pET28 vector with an N-terminal His tag, using the NdeI and XhoI restriction sites (primers are listed in appendix). The genes were PCR amplified, including the N-terminal His-tag from the pmcnEAVNH plasmid (Genscript, USA), using the specific primers. A gradient of 50 °C–70 °C for the annealing temperature was used. The PCR mixture was digested with DpnI restriction enzyme at 37 °C for 3 hours and inactivated at 80 °C for 20 minutes. The PCR product mixture was verified for the presence of the amplified gene using agarose gel electrophoresis and purified using the PCR purification kit (Qiagen, UK). The pET28a vector was cut using the NdeI and XhoI restriction enzymes (New England Biolabs) and purified. The genes were cloned into the cut vector using the In-Fusion Advantage PCR cloning technique (Clontech, CA, USA) and the online tool [http://bioinfo.clontech.com/infusion/molarRatio.do] was used to find the concentrations of gene and plasmid needed for a 3:1 molar ratio, respectively. A 5 µl aliquot of the final cloning mixture was diluted five-fold and 2.5 µl of the dilution was used to transform DH5α E. coli cells. Transformants were plated onto 50 µg/ml kanamycin LB agar selection plates. Colony PCRs were performed on individual colonies using T7 forward and reverse primers. Colonies that were identified to contain the correct gene were used to inoculate 15 ml of LB broth supplemented with 50 µg/ml kanamycin. Cells were grown overnight and plasmid DNA was isolated the following morning. Plasmid samples were sent to MWG-operon (Wolverhampton, UK) for sequencing using the standard T7 forward and reverse primers to verify the constructs.

For hPARG transformants, codon-optimized hPARG 1-460, 1-388, 1-380, 1-365 and 1-329 and full-length hPARG were PCR amplified from a pDonor plasmid (from the Ahel group) (primers are listed in appendix). These primers contained complementary sequences to allow insertion of the PCR product into the pET28a plasmid using In-Fusion Advantage PCR Cloning. In addition, the forward and reverse primers included NdeI and XhoI restriction sites, respectively, to facilitate subsequent cloning reactions. PCR was used to amplify the five hPARG regulatory domain fragments and the full-length gene. Annealing at 60 °C for 30 seconds followed by extension at 72 °C for 30 seconds was used. Following confirmation of the correct PCR reaction product, the template DNA was subjected to DpnI enzyme digest.

2.2.8 Colony PCR

Colony PCR was used to screen candidate transformant E. coli colonies for the presence of the correct gene (following successful In-Fusion reactions), using vector-specific primers and PCR to amplify the inserted gene. This was performed using REDTaq ReadyMix PCR reaction mix. Each colony PCR reaction mixture contained the following: 25 µl REDTaq PCR reaction mix for a final concentration of 1X, 0.1-1.0 µM forward primer, 0.1-1.0 µM reverse primer and made up to 50 µl with dH2O. Individual colonies were picked and streaked onto a fresh agar plate (containing the appropriate antibiotic) using a sterile pipette tip. The tip was then used to inoculate the PCR reaction mixture with some of the remaining cells. PCR reaction mixtures containing the cells were thermally cycled using the BIO-RAD C1000™ thermal cycler (see table 2), then verified by loading directly onto a 1% agarose gel.

Step | Temperature (°C) | Time | Number of cycles
Initial denaturation | 98 | 30 seconds | 1
Denaturation | 98 | 10 seconds | 37
Annealing | 55 | 120 seconds | 37
Extension | 72 | 180 seconds | 37
Final extension | 72 | 10 minutes | 1

Table 2. Thermal cycling parameters for colony PCR reactions. Temperatures are given for each step with timings and number of cycles.

2.2.9 DNA sequencing reactions

Plasmids containing a gene of the correct size were sequenced by MWG-Biotech (Wolverhampton, UK). Two 15 µl samples of purified DNA (50-150 ng/µl) were sent for forward and reverse sequencing. Results were analyzed using the BioX program (eBioTools, UK).


2.2.10 E. coli strains

The competent E. coli strains DH5α [F- endA1 glnV44 thi-1 recA1 relA1 gyrA96 deoR nupG Φ80dlacZΔM15 Δ(lacZYA-argF)U169, hsdR17(rK- mK+), λ–]; BL21(DE3) [F– ompT gal dcm lon hsdSB(rB- mB-) λ(DE3 [lacI lacUV5-T7 gene 1 ind1 sam7 nin5])] and Rosetta(DE3)pLysS [F- ompT hsdSB(rB- mB-) gal dcm λ(DE3 [lacI lacUV5-T7 gene 1 ind1 sam7 nin5]) pLysSRARE (CamR)] were used for DNA propagation (DH5α) and protein expression (BL21, Rosetta), respectively (New England Biolabs).

2.2.11 E. coli growth conditions

E. coli strains were cultured in LB broth (10 g/L Tryptone, 10 g/L NaCl, 5 g/L Yeast Extract) and incubated at 37 °C with shaking at 200 rpm. E. coli was cultured on LB agar plates (15 g/L Agar, 10 g/L Tryptone, 5 g/L Yeast Extract, 5 g/L NaCl) and incubated at 37 °C. Both LB broth and LB agar were sterilized by autoclaving and supplemented with the appropriate antibiotic (kanamycin 50 µg/ml, chloramphenicol 33 µg/ml and ampicillin 50 µg/ml). LB agar plates containing the appropriate antibiotic were stored at 2-8 °C for future use. Kanamycin is stable at 2-8 °C for up to 6 months, ampicillin is stable at 2-8 °C for up to 2 weeks, and chloramphenicol is stable at 2-8 °C for up to 2 years.

2.2.12 Transformation of competent E. coli cells

All E. coli strains were transformed using the same protocol. Individual 50 µl tubes of competent E. coli cells (New England Biolabs) were thawed on ice for 10 minutes. Approximately 100 ng of plasmid DNA was added to the cell mixture in a maximal volume of 5 µl. The sample was gently mixed and placed back on ice for 30 minutes. The cells were then heat shocked at exactly 42 °C for 30 seconds and placed back on ice for 5 minutes. A 950 µl volume of SOC (0.5% Yeast Extract, 2% tryptone, 10 mM NaCl, 2.5 mM KCl, 10 mM MgCl2, 10 mM MgSO4, 20 mM glucose) at room temperature was added directly to the tube. The mixture was placed at 37 °C for 60 minutes, shaking vigorously at 250 rpm. A total of 50 µl was spread onto prewarmed LB agar plates containing the appropriate antibiotic and plates were incubated overnight at 37 °C.


2.2.13 Purification of plasmid from E. coli transformants

Plasmids from E. coli transformants were purified using a plasmid mini prep kit (Qiagen Ltd, West Sussex, UK). All reagents mentioned in the protocol are contained within the kit. A 5-15 ml volume of LB supplemented with the appropriate antibiotic was inoculated and cells were grown overnight at 37 °C, shaking at 200 rpm. Cells were pelleted by centrifugation at 4000 rcf for 10 minutes at 4 °C. The cell pellet was re-suspended in 300 µl of buffer P1 and transferred to a 1.5 ml microcentrifuge tube. A 300 µl volume of alkaline lysis buffer P2 was added and the tube was inverted 4-6 times to ensure mixing (the mixture turns blue from the LyseBlue reagent in the P2 buffer). The solution was neutralized by adding 300 µl of N3 buffer and inverting 4-6 times. The lysed cells were separated by centrifuging at 16,000 rcf for 30 minutes at 4 °C. The supernatant was then applied to a Qiagen column and centrifuged for 30-60 seconds at 16,000 rcf to bind the DNA to the membrane. The membrane was washed twice with 750 µl of PB buffer followed by 750 µl of PE buffer and the column was transferred to a clean, sterile microcentrifuge tube. The DNA was eluted from the membrane by adding 30-50 µl of dH2O and stored at -20 °C.

The following buffers were used:

• P1 buffer- 50 mM Tris-HCl pH 8.0; 10 mM EDTA; 100 μg/ml RNaseA.
• P2 buffer- 200 mM NaOH; 1% SDS.
• N3 buffer- 4.2 M guanidine hydrochloride and 0.9 M potassium acetate pH 4.8.
• PB buffer- 5 M guanidine hydrochloride and 30% isopropanol.
• PE buffer- 10 mM Tris-HCl pH 7.5; 80% ethanol.

2.2.14 Glycerol stocks of E. coli strains

E. coli cultures were grown overnight at 37 °C in medium supplemented with the appropriate antibiotic. A 500 µl volume of cells was diluted with sterile glycerol to a final concentration of 15% glycerol and stored in a sterile 1.5 ml cryogenic tube. This solution (i.e. the glycerol stock) was then stored at -80 °C for long-term storage.

2.2.15 P. pastoris growth conditions

Prior to transforming P. pastoris strains, the gene of interest was first cloned into the plasmid using E. coli as a host. For the transformation into E. coli, low-salt LB liquid medium and solid plates were prepared containing: 10 g/L Tryptone, 5 g/L NaCl, 5 g/L Yeast extract (plus 15 g/L agar for plates). Liquid media and plates were supplemented with 25 µg/ml of zeocin antibiotic.

P. pastoris was grown on YPDS+zeocin plates (1% yeast extract, 2% peptone, 2% dextrose, 1 M sorbitol, 2% agar and 1000 µg/ml zeocin) and in YPDS+zeocin media (1% yeast extract, 2% peptone, 2% dextrose, 1 M sorbitol and 100 µg/ml zeocin).

Protein expression in small volumes of P. pastoris cultures required BMGY media, while BMMY media was used for large-scale expression. Both media were prepared using the following stock solutions:

- 10x YNB (13.4% yeast nitrogen base with ammonium sulfate, without amino acids): the solution was autoclaved and stored at 4 °C, with a shelf life of one year.

- 500x B (0.02% biotin): the solution was filter sterilized and stored at 4 °C, with a shelf life of one year.

- 10x M (5% methanol): the solution was filter sterilized and stored at 4 °C, with a shelf life of 2 months.

- 10x GY (10% glycerol): the solution was autoclaved and stored at room temperature, with a shelf life of 2 years.

- 1 M potassium phosphate buffer, pH 6.0: this solution was prepared by combining 132 ml of 1 M K2HPO4 with 868 ml of 1 M KH2PO4. The pH was adjusted slightly to pH 6.0 using phosphoric acid or KOH. This solution can be stored for over a year at room temperature.

A 1 L volume of BMGY (buffered glycerol-complex medium) was prepared containing 100 ml of 1 M potassium phosphate buffer (pH 6.0), 100 ml 10x YNB, 2 ml 500x B and 100 ml 10x GY. A 1 L volume of BMMY (buffered methanol-complex medium) was prepared containing 100 ml of 1 M potassium phosphate buffer (pH 6.0), 100 ml 10x YNB, 2 ml 500x B and 100 ml 10x M. Both solutions were stored at 4 °C.
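
The final concentration of each component in the two media follows from the stock concentrations and the volumes listed above (C1V1 = C2V2). The short sketch below works these out for a 1 L final volume; the function and variable names are illustrative only, and the 1 M phosphate stock is expressed as 1000 mM for convenience.

```python
def final_concentration(stock_conc, stock_volume_ml, final_volume_ml):
    """C2 = C1 * V1 / V2 (the unit of C2 follows the unit of C1)."""
    return stock_conc * stock_volume_ml / final_volume_ml


FINAL_VOLUME_ML = 1000.0

# (stock concentration, unit, volume of stock added in ml), from the text.
shared = {
    "potassium phosphate, pH 6.0": (1000.0, "mM", 100.0),  # 1 M stock
    "yeast nitrogen base": (13.4, "%", 100.0),             # 10x YNB
    "biotin": (0.02, "%", 2.0),                            # 500x B
}
carbon_source = {
    "BMGY": ("glycerol", 10.0, "%", 100.0),  # 10x GY
    "BMMY": ("methanol", 5.0, "%", 100.0),   # 10x M
}

for medium, (name, conc, unit, vol) in carbon_source.items():
    print(medium)
    components = dict(shared)
    components[name] = (conc, unit, vol)
    for comp, (c, u, v) in components.items():
        print(f"  {comp}: {final_concentration(c, v, FINAL_VOLUME_ML):g} {u}")
```

For BMGY this gives 1% glycerol and for BMMY 0.5% methanol, with 100 mM potassium phosphate and 1.34% yeast nitrogen base in both.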


2.2.16 P. pastoris strains

Two P. pastoris strains were used:

1. X-33 with a wild type genotype to be used with zeocin resistance encoding expression vectors.

2. KM71H with aox1-ARG4 and arg4 genotypes to be used with zeocin resistance encoding expression vectors to generate MutS phenotype.

2.2.17 Transformation of P. pastoris

The gene of interest was cloned into a pPICZB plasmid using the infusion cloning technique (section 2.2.7) and transformed into NEB5α E. coli cells (section 2.2.12). The construct(s) containing the correct gene(s) (verified with sequencing) were then linearized using the SacI unique restriction site at the AOX1 locus to allow efficient integration into the P. pastoris genome. The restriction digest was completed as previously described, (section 2.2.5).

All P. pastoris strains were transformed using the same approach. To make the cells competent, 5 ml of YPD was inoculated with one stock strain of P. pastoris and grown at 30 °C, shaking at 200 rpm, for 24 hours. A 400 µl volume of the culture was then transferred to 500 ml of YPD and cells were grown overnight (approx. 12 hours) to an OD600 of 1.3-1.5 at 30 °C, shaking at 200 rpm. The 500 ml culture was centrifuged at 7000 rcf for 10 minutes and the pellet re-suspended in 500 ml ice-cold filter-sterilized water. The cells were once again centrifuged at 7000 rcf for 10 minutes and the pellet again re-suspended in 500 ml ice-cold filter-sterilized water. The cells were centrifuged for a third time at 7000 rcf for 10 minutes before re-suspending the cells in 20 ml of ice-cold 1 M sorbitol. The cell mixture was then centrifuged at 7000 rcf for 5 minutes before re-suspending the pellet in 1 ml of ice-cold 1 M sorbitol. Competent cells can be stored on ice for use on the same day or slowly frozen to -80 °C in Styrofoam for long-term storage.

Competent P. pastoris cells were transformed by electroporation. An 80 µl volume of competent cells was transferred to a 0.2 cm electroporation cuvette, followed by addition of 5-10 µg of linearized DNA. The mixture was charged to 180 V, followed by immediate addition of 1 ml of ice-cold 1 M sorbitol, and incubated at 30 °C for 2 hours (without shaking). A 20 µl volume of the transformation mixture was spread onto YPDS plates containing 1 mg/ml zeocin and incubated at 30 °C for 2-3 days. A few colonies from the plates were then streaked onto fresh YPDS plates containing 1000 µg/ml zeocin and incubated at 30 °C for 2-3 days to ensure full incorporation of the gene into the yeast genome.

2.3 Protein expression and purification

2.3.1 Determining protein concentration

Protein concentration was determined by measuring the UV absorbance at 280 nm and calculated using the extinction coefficient (calculated from the amino acid sequence using the ProtParam tool [ExPASy, Swiss Institute of Bioinformatics]) and the Beer-Lambert law, A280 = ε280 × b × c, where A280 is the absorbance at 280 nm, ε280 is the extinction coefficient at 280 nm, b is the path length and c is the concentration.
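
Rearranging the Beer-Lambert law for concentration gives c = A280 / (ε280 × b). The sketch below illustrates the calculation; the function names and the example absorbance, extinction coefficient and molecular weight are hypothetical, and the extinction coefficient is assumed to be in M^-1 cm^-1, the unit in which ProtParam reports it.

```python
def molar_concentration(a280, epsilon_280, pathlength_cm=1.0):
    """c = A280 / (epsilon280 * b), in mol/L for epsilon in M^-1 cm^-1."""
    if epsilon_280 <= 0 or pathlength_cm <= 0:
        raise ValueError("extinction coefficient and pathlength must be positive")
    return a280 / (epsilon_280 * pathlength_cm)


def mg_per_ml(conc_molar, molecular_weight_da):
    """mol/L x g/mol = g/L, which is numerically equal to mg/ml."""
    return conc_molar * molecular_weight_da


# Hypothetical protein: A280 = 1.0, epsilon280 = 50,000 M^-1 cm^-1,
# molecular weight = 50,000 Da, 1 cm pathlength.
c = molar_concentration(1.0, 50000.0)
print(f"{c * 1e6:.1f} µM")                   # 20.0 µM
print(f"{mg_per_ml(c, 50000.0):.2f} mg/ml")  # 1.00 mg/ml
```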

2.3.2 SDS-PAGE electrophoresis

Proteins were visualized via SDS-PAGE gel electrophoresis. Precast 4-12% SDS-PAGE gels (BioRad, UK) with 10 lanes that hold 30 µl of sample or 15 lanes that hold 15 µl of sample were used. Protein samples were diluted with 2x SDS loading dye and heated to 90 °C for 10 minutes to denature. The gel chamber was filled with 1x SDS buffer (10x SDS buffer: 30 g Tris base, 144 g glycine and 10 g SDS) and a Precision Plus Protein Unstained marker was used as a standard (BioRad, UK). The protein samples were loaded into the individual wells and the gel was electrophoresed at 300 V for 20 minutes. It was visualized using the Bio-Rad stain-free system and digitally analyzed using Image Lab software (BioRad, UK).

2.3.3 Western Blot analysis

Western blot analysis (Western Breeze Chemiluminescent Kit, Life Technologies, UK) was used to determine protein expression levels using anti-His tag antibodies (Life Technologies, Paisley, UK) or to detect PAR using anti-PAR antibodies (AMS Biotechnology, UK).


An SDS-PAGE gel was run (section 2.3.2) and transferred to a PVDF membrane using the Trans-Blot® Turbo™ transfer system (BioRad, UK). For the transfer, a positive ion reservoir stack was placed within the cassette, on top of the anode base. On top of this, the blotting membrane was placed, followed by the SDS-PAGE gel, then the negative ion reservoir stack and finally the cathode. A blot roller was used to remove any air bubbles. A voltage of 25 V was applied to the stacks for 7 minutes, following which the SDS-PAGE gel and stacks were discarded and the blotted membrane was immediately placed into dH2O for short-term storage.

The PVDF membrane was placed in 10 ml of blocking solution and incubated at room temperature for 30 minutes on a rotary shaker set at 1 revolution per second. The membrane was then rinsed with 20 ml of dH2O for 5 minutes and the excess solution decanted. The primary antibody was diluted 1:1000 in primary antibody solution and incubated with the membrane for 1 hour. The membrane was then rinsed with 20 ml of antibody wash solution for 5 minutes and the solution decanted; this was repeated 3 times. A 10 ml volume of secondary antibody solution was incubated with the membrane for 30 minutes, followed by rinsing with 20 ml of antibody wash solution for 5 minutes, 3 times. Two further wash steps were completed using 20 ml of dH2O for 5 minutes, with the solution decanted each time. Transparent plastic cellophane was laid out and 2.5 ml of chemiluminescent solution applied to its surface; the membrane was placed face down onto the solution and left to develop for 5 minutes. The excess chemiluminescent solution was blotted from the membrane, which was then covered in clean cellophane and placed in a blackout case to prepare for luminography. In a dark room, the X-ray film (Kodak X-OMAT AR film, Sigma) was exposed to the membrane for 1 second to several minutes, depending on the strength of the signal.

The following buffers were used:

• Blocking buffer- 5 ml dH2O, 2 ml solution A (concentrated buffered saline solution containing detergent) and 3 ml of solution B (concentrated Hammarsten casein solution).

• Primary antibody solution- 7 ml dH2O, 2 ml solution A (concentrated buffered saline solution containing detergent) and 3 ml of solution B (concentrated Hammarsten casein solution).

• Antibody wash solution- 150 ml dH2O and 10 ml of 16x antibody wash solution (concentrated buffered saline solution containing detergent).

2.3.4 E. coli protein expression trials

A single colony from a transformation of the Rosetta or BL21 expression strain(s) containing the appropriate gene(s) was used to inoculate 10 ml of LB (supplemented with the appropriate antibiotic). Cells were left to grow overnight at 37 °C, shaking at 200 rpm. A 1 ml volume of overnight culture was used to inoculate 200 ml of LB (supplemented with the appropriate antibiotic) and cells were grown at 37 °C, 200 rpm, until an OD600 of 0.6-0.8 was reached. Cells were then induced with varying amounts of IPTG (100 µM to 1 mM); various temperatures were used during induction. A 500 µl sample was taken just before induction, then 4 hours after induction and after overnight induction. The samples were centrifuged at 16,000 rcf at 4 °C for 30 minutes and the supernatant removed. The cell pellets were lysed after addition of and RNase and vortexing for 2 minutes; protein expression levels were estimated using SDS-PAGE gel electrophoresis.

2.3.5 P. pastoris protein expression trials

A single P. pastoris colony was used to inoculate 5 ml of sterile BMGY media in a 100 ml baffled flask and cells were grown at 30 °C, shaking at 250 rpm, until the culture reached an OD600 between 2 and 6 (approximately 5-6 hours). The cells were harvested by centrifuging at 7000 rcf for 5 minutes at room temperature and the pellet was re-suspended in sterile BMMY media to a final OD600 of 1 (approximately 25-75 ml). The culture was transferred to a sterile 250 ml baffled flask and returned to the incubator at 30 °C, shaking at 250 rpm. Cultures were then incubated for 72 hours; at each 24 hour interval, 100% methanol was added to a final concentration of 0.5% to maintain induction and a 1 ml sample was taken. Each sample was centrifuged at 16,000 rcf for 3 minutes at 4 °C to separate the cell pellet and supernatant. Both pellet and supernatant were stored at -80 °C for analysis via SDS-PAGE to determine whether the protein was expressed intracellularly or secreted.
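
The volume of 100% methanol needed at each 24 hour feed depends on the culture volume. The sketch below is a simple illustration under the assumption that none of the previously added methanol remains in the culture; the function name and the example culture volumes (taken from the approximate 25-75 ml range above) are illustrative only.

```python
def methanol_to_add_ml(culture_volume_ml, target_percent=0.5):
    """Volume x of 100% methanol such that x / (V + x) equals the target.

    Assumes the culture contains no residual methanol before the addition.
    """
    fraction = target_percent / 100.0
    if not 0 < fraction < 1:
        raise ValueError("target_percent must be between 0 and 100")
    return fraction * culture_volume_ml / (1.0 - fraction)


for volume in (25, 50, 75):
    added = methanol_to_add_ml(volume)
    print(f"{volume} ml culture: add {added:.3f} ml of 100% methanol")
```

At such a low target concentration the result is very close to the simpler estimate of 0.5% of the culture volume (for example, roughly 0.25 ml for a 50 ml culture).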


The cell pellet(s) were thawed on ice and re-suspended in 10 ml of breaking buffer (50 mM sodium phosphate, pH 7.4, 1 mM EDTA, 5% glycerol) supplemented with Complete EDTA-free protease inhibitors. Cells were broken open using cell disruption (see section 2.3.10). The samples were centrifuged at 16,000 rcf for 10 minutes at 4 °C and 50 µl of supernatant was removed for analysis by SDS-PAGE gel.

2.3.6 E. coli large-scale protein expression

Large-scale expression of proteins in Rosetta or BL21 E. coli was started by inoculating 200 ml of LB (supplemented with the appropriate antibiotic) using a single colony obtained from a recent transformation or a glycerol stock. Cells were grown overnight at 37 °C, shaking at 200 rpm. A 10 ml volume of the overnight starter culture was then used to inoculate 1 L of LB in 2 L flasks, supplemented with the appropriate antibiotic (routinely 12 L in total were used). Cells were grown at 37 °C, shaking at 200 rpm, until an OD600 of 0.6-0.8 was reached. At this point the cells were induced with IPTG (to a concentration determined by small-scale studies to be optimal for soluble protein expression) and incubated at the temperature determined by small-scale studies to be optimal for soluble protein expression. Cells were grown overnight and the next day centrifuged at 7,000 rcf at 4 °C for 10 minutes to pellet. Pellets were transferred to a 50 ml falcon tube, weighed and then frozen at -20 °C. For bPARG proteins the following conditions were used: 1 mM IPTG was used for induction at 23 °C overnight (~16 hours).

For TTPARG proteins, large-scale expression of all proteins was performed under the same conditions. 750 µM IPTG was used for induction and cells were left at 20 °C overnight (~16 hours). Per 12 litres of expression, the yield of cell pellet was ~14 g.

For hPARG proteins, large scale expression was done using 1 mM IPTG at 30 °C for 2 hours. Per 12 liters of expression,

2.3.7 Production of selenomethionine-labelled proteins in E. coli

The following solutions were prepared for 10 L of culture:

• 2 L stock solution of 5x salts, prepared from 128 g Na2HPO4·7H2O, 30 g KH2PO4, 5.0 g NaCl and 10 g NH4Cl, adjusted to pH 7.5 and then autoclaved.
• 20 ml of 1 M MgSO4 and 5 ml of 1 M CaCl2, both filter sterilized.
• 10x 2 L flasks, each containing 770 ml of dH2O, autoclaved.
• 150 ml of 20% glucose solution, filter sterilized.
• 50 ml stock of 10x amino acid solution: 1 g lysine, 1 g phenylalanine, 1 g threonine, 0.5 g isoleucine, 0.5 g leucine, 0.5 g valine, 0.85 g Se-Met.
• 10 ml stock of 10x biotin and thiamine solution: 20 mg biotin, 20 mg thiamine.

A total of ten 10 ml E. coli starter cultures supplemented with the correct antibiotic were grown overnight as described in section 2.2.12. To each of the 2 L flasks containing 770 ml of autoclaved dH2O, 200 ml of 5x salts, 2 ml of 1 M MgSO4, 100 µl of 1 M CaCl2, 20 ml of 20% glucose and the desired antibiotic were added. The flasks were gently shaken to ensure complete mixing. Flasks were then each inoculated with 10 ml of the starter culture. Cells were grown at 37 °C to an OD600 of 0.6, at which point 5 ml of the 10x amino acid stock solution and 1 ml of the 10x biotin and thiamine stock solution were added to each flask. Cells continued to grow at 37 °C for 15 minutes before the temperature was reduced to 20 °C. Protein expression was induced with 1 mM IPTG. Cells were left for a further 18 hours at 20 °C before harvesting by centrifugation.
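The per-flask recipe above can be checked with a short dilution calculation (C1V1 = C2V2). The sketch below is illustrative only: it uses the volumes quoted in the text, ignores the small volume of antibiotic (which the text does not specify), and the variable names are assumptions.

```python
# Volumes added to each 2 L flask, in ml, as listed in the text
components_ml = {
    "dH2O": 770.0,
    "5x salts": 200.0,
    "1 M MgSO4": 2.0,
    "1 M CaCl2": 0.1,
    "20% glucose": 20.0,
    "starter culture": 10.0,
}

# Antibiotic volume is not given in the text and is ignored here
total_ml = sum(components_ml.values())


def final_concentration(stock_conc: float, volume_ml: float) -> float:
    """Final concentration after dilution into the flask (same units as stock_conc)."""
    return stock_conc * volume_ml / total_ml


salts_x = final_concentration(5.0, components_ml["5x salts"])          # fold-strength
mgso4_mM = final_concentration(1000.0, components_ml["1 M MgSO4"])     # mM
cacl2_mM = final_concentration(1000.0, components_ml["1 M CaCl2"])     # mM
glucose_pct = final_concentration(20.0, components_ml["20% glucose"])  # % (w/v)

print(f"Total volume per flask: {total_ml:.1f} ml")
print(f"Salts: {salts_x:.2f}x, MgSO4: {mgso4_mM:.2f} mM, "
      f"CaCl2: {cacl2_mM:.3f} mM, glucose: {glucose_pct:.2f}%")
```

Each flask therefore holds close to 1 L of approximately 1x salts with about 2 mM MgSO4, 0.1 mM CaCl2 and 0.4% glucose.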

2.3.8 Large-scale protein expression in P. pastoris

Large-scale expression of proteins in P. pastoris strains was started by using a single colony to inoculate 2x 25 ml of sterile BMGY media in 2x 250 ml baffled flasks. Cells were grown at 30 °C, shaking at 200 rpm, until the cultures reached an OD600 between 2 and 6 (approximately 18-24 hours). The cultures were used to inoculate 2x 500 ml of BMGY in 2x 2 L baffled flasks and grown to an OD600 between 2 and 6 (12-16 hours). The cultures were harvested by centrifuging at 7,000 rcf for 5 minutes at room temperature and the pellets were re-suspended in sterile BMMY media to a final OD600 of 1 (approximately 2-6 L). The culture was divided into 500 ml portions per 2 L baffled flask and returned to the incubator at 30 °C, shaking at 250 rpm. Cultures were incubated for the optimized length of time determined via expression trials, adding 100% methanol (0.5% final concentration) at each 24 hour interval to maintain induction. After complete induction the cells were pelleted at 8,500 rcf at 4 °C and stored at -80 °C.

2.3.9 E. coli cell disruption by sonication

E. coli cells were lysed as the first step of protein purification. Cell pellets were thawed on ice, transferred into a plastic 250 ml beaker and re-suspended in 150 ml of ice-cold buffer (25 mM Tris pH 7.5, 150 mM NaCl, 1 mM β-mercaptoethanol). A complete EDTA-free protease inhibitor tablet (Roche, UK), 10 µg/ml DNase and 10 µg/ml RNase were added to the re-suspended cells. The beaker was placed on ice and cells were lysed using a sonicator set to 30% intensity, vibrating for 20 seconds and resting for 40 seconds, for a total of 30 minutes. Lysate was then centrifuged at high speed to remove cell debris: 180,000 rcf for 1 hour at 4 °C using a Beckman Coulter C100XP Ultra centrifuge and rotor Ti 45 (Beckman, UK).

2.3.10 P. pastoris cell disruption

P. pastoris cells require high pressures for effective lysis, so sonication cannot be used. The P. pastoris cells previously frozen at -80 °C were thawed on ice, transferred to a 1 L plastic beaker and suspended in 500 ml of ice-cold lysis buffer (see list of buffers). Three complete EDTA-free protease inhibitor tablets were added, as well as 10 µg/ml DNase, and the beaker was placed on ice. The cell disruptor (Constant Systems Ltd, UK) was cooled to 4 °C and set to 40 kpsi. A 150 ml volume of lysis buffer was used to equilibrate the cell disruptor before the 500 ml of lysate was passed through 3 times to ensure complete lysis. A further 150 ml of lysis buffer was added to remove any remaining lysate. Lysate was then centrifuged at high speed to remove cell debris: 180,000 rcf for 1 hour at 4 °C using a Beckman Coulter C100XP Ultra centrifuge and rotor Ti 45 (Beckman, UK).

2.3.11 Protein buffer exchange using de-salting columns

Disposable PD-10 de-salting columns were used primarily to buffer exchange proteins after ion exchange chromatography. The excess storage solution was discarded and the column was equilibrated with 2x CV of 25 mM Tris pH 7.5, 150 mM NaCl, 1 mM DTT. The protein was either concentrated or diluted to 3 ml, after which it was applied to the column and allowed to drain into the resin; the 3 ml flow-through was discarded. To elute the protein, 4 ml of 25 mM Tris pH 7.5, 150 mM NaCl, 1 mM DTT was applied to the column and the 4 ml eluate was collected and analyzed by SDS-PAGE.

2.3.12 Batch nickel affinity chromatography

A 5 ml volume of Ni-NTA resin (Qiagen, UK) was applied to a 25 ml plastic disposable batch column (Sigma) and equilibrated with 5 column volumes of 25 mM Tris pH 7.5, 150 mM NaCl, 1 mM β-mercaptoethanol buffer. The soluble protein extract was mixed and incubated with the Ni-NTA resin for 30 minutes to allow full protein binding to the nickel resin. The Ni-NTA resin and protein extract were reapplied to the column, with the flow-through collected for further analysis. The column was washed with 5 column volumes of buffer containing 15 mM imidazole, with collection of the flow-through for analysis. The column was then washed with 5 column volumes of 25 mM Tris pH 7.5, 150 mM NaCl, 50 mM imidazole, 1 mM β-mercaptoethanol buffer, again collecting the flow-through for analysis. Finally, the His-tagged protein was eluted by applying 2 column volumes of 25 mM Tris pH 7.5, 150 mM NaCl, 250 mM imidazole, 1 mM β-mercaptoethanol buffer. Eluate was transferred to dialysis tubing and dialyzed overnight in 5 L of 25 mM Tris pH 7.5, 150 mM NaCl and 1 mM DTT buffer. Samples taken were analyzed by SDS-PAGE.

2.3.13 Reverse-batch nickel affinity chromatography

Proteins containing a TEV cleavage site were incubated with 125 units of TEV protease (AcTEV™, Life Technologies, Paisley, UK) per mg of protein at 4 °C in 25 mM Tris pH 7.5, 150 mM NaCl and 1 mM DTT buffer for 18 hours. To separate the cleaved from uncleaved proteins, the incubated mixture was applied to a pre-equilibrated Ni-NTA resin column (as described above). Un-tagged proteins, which were collected in the flow-through, were transferred to dialysis tubing and dialyzed in 5 L of 25 mM Tris pH 7.5, 150 mM NaCl and 1 mM DTT buffer.

2.3.14 Gel filtration chromatography

Proteins purified using batch nickel affinity chromatography were concentrated using Vivaspin 20 centrifugal concentrators (Generon, Berkshire, UK) to 1 ml at 50–100 mM. High-resolution gel filtration was completed using a Superdex 200 10/300 GL column, operated using an AKTA Fast Protein Liquid Chromatography (FPLC) system, both from GE Healthcare (Buckinghamshire, UK). The column was equilibrated with 2 column volumes of 25 mM Tris pH 7.5, 150 mM NaCl and 1 mM DTT buffer. A 500 µl volume of protein sample was micro-centrifuged at 16,000 rcf for 10 minutes to remove any small debris and applied to the column. Proteins were separated according to size and detected by absorbance at 280 nm. The eluate was collected in 500 µl aliquots. Fractions were further analysed by SDS-PAGE, flash frozen in liquid nitrogen and stored at -80 °C.

2.4 Enzymatic and Biophysical methods

2.4.1 Poly ADP-ribose glycohydrolase activity assays

For a PARG activity assay, PAR was synthesized by auto-modification of PARP1. The reaction mixture contained 2 units of PARP1 enzyme (Trevigen), 50 mM Tris (pH 7.5), 50 mM NaCl, 200 µM NAD (Trevigen) and 10 units of activated DNA (Trevigen). The enzymatic reaction was left at room temperature for 30 minutes and stopped by adding 1 µM of Olaparib (KU-0058948) (Stratech Scientific, Suffolk, UK); the final PAR concentration was approximately 5 µM. A 10 µl volume of the PAR mix was aliquoted into a fresh Eppendorf tube and PARG was added at various concentrations (30 µM–400 µM) for various times (0-30 minutes); 30 µM of PARG over 15 minutes is enough to effectively cleave PAR chains. The 2x SDS loading dye was added directly to the PARG reaction mix, which was boiled for 15 minutes to stop the reaction. Samples were loaded onto an SDS-PAGE gel and run at 300 V for 20 minutes. The gel was then used for western blot analysis as previously described (section 2.2.4). The PVDF membrane was blotted with 1:1000 rabbit polyclonal anti-PAR antibodies (Trevigen).

For bPARG and MACROD2 hybrid PARG activity assays, a series of 10 µl reaction mixtures were prepared to contain 2 units of PARP1, 200 µM NAD, 50 mM Tris pH 7.5 and 50 mM NaCl and incubated at room temperature. The PARP1 auto-modification reaction was stopped after 30 minutes with the addition of PARP inhibitor KU-0058948 (olaparib); the reaction mixture contained ~5 µM PAR. Each protein to be tested for PARG activity was added to the reaction mixture to a final concentration of 5 µM; bPARG WT can efficiently hydrolyze 5 µM PAR at 10 nM concentrations over 15 minutes. Reactions were stopped by adding SDS buffer and heat denaturation at 90 °C at time points of 5, 10 and 15 minutes. The denatured reaction mixtures were applied to an SDS-PAGE gel for analysis. The SDS-PAGE gel was blotted onto a PVDF membrane and PAR was visualized with rabbit polyclonal anti-PAR antibodies (AMS Biotechnology, UK) using a 1:1000 dilution of antibody in dH2O.

2.4.2 Circular dichroism (CD) spectroscopy

Circular dichroism is the difference in absorption between left-handed and right-handed circularly polarized light (LCPL and RCPL). A signal occurs when the sample contains one or more chiral chromophores (is optically active) and, for protein samples, spectral features can be assigned to particular secondary structural features of the molecule. The absorption wavelengths of interest lie in the far-UV region, between 190 nm and 260 nm.

\[ \Delta A(\lambda) = A_{\mathrm{LCPL}}(\lambda) - A_{\mathrm{RCPL}}(\lambda) \]

λ = wavelength

Circular dichroism = ΔA(λ)

A pure protein sample in 25 mM Tris pH 7.5, 100 mM NaCl and 1 mM DTT buffer was diluted to 20-30 µM and centrifuged in a bench top micro-centrifuge at 16,000 rcf for 5 minutes to remove any debris. A 50 µl volume of protein was loaded into a 0.1 mm quartz cuvette (Starna Scientific, Essex, UK) and placed in a Chirascan spectrometer (Applied Photophysics, Surrey, UK). Parameters were set to measure absorbance every 1 nm between 190 nm and 260 nm for 3 seconds per point; 50 µl of 25 mM Tris pH 7.5, 100 mM NaCl and 1 mM DTT buffer was used to provide a baseline. Scans were repeated in triplicate and the data were further analyzed using the Chirascan CDNN software to estimate percentages of secondary structure. Data files were opened in the CDNN program, the molecular mass, number of amino acids and path length were supplied, and the spectra were ‘deconvoluted’ to calculate the percentage contributions of the various components to the protein secondary structure.
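The molecular mass, number of amino acids and path length matter because they are what is needed to normalise a raw CD signal to mean residue ellipticity before secondary structure estimation. The sketch below shows the standard conversions; it is not a description of what CDNN does internally, and the function names and example numbers are hypothetical.

```python
import math


def delta_a_to_mdeg(delta_a: float) -> float:
    """Convert differential absorbance (A_LCPL - A_RCPL) to ellipticity in millidegrees."""
    # theta (deg) = 180 * ln(10) / (4 * pi) * delta_A, which is about 32.98 * delta_A
    return 1000.0 * 180.0 * math.log(10.0) / (4.0 * math.pi) * delta_a


def mean_residue_weight(molecular_mass_da: float, n_residues: int) -> float:
    """Mean residue weight: molecular mass divided by the number of peptide bonds."""
    return molecular_mass_da / (n_residues - 1)


def mean_residue_ellipticity(theta_mdeg: float, mrw: float,
                             path_cm: float, conc_mg_per_ml: float) -> float:
    """Mean residue ellipticity in deg cm^2 dmol^-1."""
    return theta_mdeg * mrw / (10.0 * path_cm * conc_mg_per_ml)


# Hypothetical example: a 30 kDa, 280-residue protein at 0.75 mg/ml
# in the 0.1 mm (0.01 cm) cuvette described in the text
theta = delta_a_to_mdeg(-1.0e-4)
mrw = mean_residue_weight(30000.0, 280)
print(mean_residue_ellipticity(theta, mrw, path_cm=0.01, conc_mg_per_ml=0.75))
```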

2.4.3 Isothermal titration calorimetry (ITC)

Isothermal titration calorimetry (ITC) is used to measure reactions between biomolecules. In this case, ITC was performed to measure the binding affinity between proteins and ligands, the data providing stoichiometry, entropy and enthalpy information of the binding reaction. Measuring the heat transfer during binding allows an accurate determination of the dissociation constant (Kd), see figure 10.

Figure 10. Basic configuration of an isothermal titration calorimetry (ITC) instrument. Both the reference and sample cells are shown next to one another within the adiabatic jacket, supplied by either constant or feedback power (reference and sample respectively). The syringe titrates the protein into the sample cell, stirring constantly. Based on information from Malvern Instruments Ltd, VP-ITC.

A MicroCal Incorporated VP-ITC 2000 (Northampton, MA, USA) was used. The sample cell was manually loaded with 1800 µl of 20–50 µM protein solution. The substrate, at 10 times the concentration of protein (200-500 µM), was loaded into the syringe. Twenty 15 µl aliquots were injected into the sample cell over 36 seconds per individual injection, spaced 300 seconds apart. Any heat generated by a binding reaction is absorbed by the sample cell and causes a change in the feedback power, see figure 10. This power difference is proportional to the heat difference (ΔT) and is converted to a measure of either a positive change in power (endothermic reaction) or a negative change in power (exothermic reaction). Experiments were carried out in triplicate.
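The injection scheme above fixes the range of ligand-to-protein molar ratios sampled during a titration. The following is a simplified sketch: it treats the 1800 µl loaded into the cell as the working volume and ignores displacement of cell contents on injection, both of which are assumptions, so the numbers are approximate.

```python
def molar_ratios(cell_volume_ul: float = 1800.0,
                 injection_ul: float = 15.0,
                 n_injections: int = 20,
                 syringe_to_cell_conc_ratio: float = 10.0) -> list:
    """Approximate ligand:protein molar ratio in the cell after each injection.

    Simplification: the amount of protein in the cell is treated as constant
    (no correction for volume displaced by each injection).
    """
    ratios = []
    for i in range(1, n_injections + 1):
        ligand_added = i * injection_ul * syringe_to_cell_conc_ratio
        ratios.append(ligand_added / cell_volume_ul)
    return ratios


ratios = molar_ratios()
print(f"After first injection: {ratios[0]:.3f}")
print(f"After final injection: {ratios[-1]:.2f}")
```

With the values in the text, the titration ends at a molar ratio of roughly 1.7, which is enough to pass through the equivalence point of a 1:1 interaction.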

To prepare a protein sample for ITC, it was buffer exchanged using a PD-10 desalting column (see method 2.2.12) to ensure complete buffer exchange into 25 mM Tris pH 7.5, 150 mM NaCl and 1 mM DTT. The ligand was prepared in the same buffer. Ligand and protein buffers needed to be exactly the same, to avoid any false positive changes in heat due to buffer mixing. To provide a baseline, a run was completed using just buffer in the sample cell to ensure that any signal was not the result of buffer mixing.

The raw data are recorded as DP (differential power) in µcal/sec against time. A positive DP value indicates an endothermic reaction and a negative DP value indicates an exothermic reaction. The molar ratio between the ligand and protein gradually increases as the number of injections increases. As protein saturation increases, the heat change begins to decrease. The raw data were further processed and the resulting isotherm was fitted to a binding model to generate the affinity (Kd), stoichiometry (N) and enthalpy of interaction (ΔH) using the MicroCal PEAQ-ITC instrument control software.
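Once Kd and ΔH have been fitted, the free energy and entropic contribution follow from the standard relations ΔG = RT ln(Kd) and ΔG = ΔH − TΔS. The sketch below illustrates this arithmetic; the temperature and the example Kd and ΔH are hypothetical values, not results from this work.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def binding_thermodynamics(kd_molar: float, delta_h_kcal: float,
                           temperature_k: float = 298.15):
    """Return (delta_G, minus_T_delta_S) in kcal/mol.

    Uses delta_G = R * T * ln(Kd), with Kd in mol/L relative to a 1 M standard
    state, and delta_G = delta_H - T * delta_S.
    """
    delta_g = R_KCAL * temperature_k * math.log(kd_molar)
    minus_t_delta_s = delta_g - delta_h_kcal
    return delta_g, minus_t_delta_s


# Hypothetical example: Kd = 1 µM, delta_H = -10 kcal/mol
dg, mtds = binding_thermodynamics(1.0e-6, -10.0)
print(f"delta_G = {dg:.2f} kcal/mol, -T*delta_S = {mtds:.2f} kcal/mol")
```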

2.4.4 Nuclear magnetic resonance (NMR)

A protein sample was prepared and purified into a low salt buffer, 25 mM Tris pH 7.5, 100 mM NaCl and 1 mM DTT, and concentrated to 500 µl at a concentration of ~25 µM. 1D proton NMR measurements were performed by Dr Matthew Cliff and results were recorded at 298 K using excitation sculpting for water suppression, and 1024 scans on the Bruker 800 MHz spectrometer (Avance III) with a 1H/13C/15N TCP cryoprobe.

2.4.5 Thermal shift assays

Thermal shift assays were used to assess protein stability and to detect ligand binding. Stock solutions of protein at 3 mg/ml and SYPRO orange diluted 1 in 300 were prepared. SYPRO orange is a fluorescent dye (Life technologies, Paisley, UK) and binds preferentially to hydrophobic regions of the protein. Thermally induced protein denaturation therefore affects the fluorescence.

A master mix was prepared containing 600 µl of the stock SYPRO orange dye and 70 µl of stock protein. 30 x 20 µl of this master mix was aliquoted into individual wells of a Hard-Shell 90-Well Semi-Skirted PCR plate (BioRad, Hertfordshire, UK). Ligands were added to the wells containing protein-dye solution at concentrations ranging from 10 µM to 1000 µM. The plate was sealed using an optical-quality sealing tape and the assay was completed in a CFX96 Touch Real-Time PCR Detection System. The samples were held at 15 °C for 1 minute, followed by incremental increases in temperature of 0.2 °C every 5 minutes until 95 °C was reached. The fluorescence was measured following every temperature increase. Excitation was at 490 nm and fluorescence emission was monitored at 575 nm.

For bPARG and MACROD2 hybrid thermal shift assays, a total of 8 wells of a 90 well hard shell semi-skirted PCR plate were filled with 19 µl of the master mix containing SYPRO orange dye and bPARG protein. Two wells were used for each protein, and the following mixtures were tested: bPARG no ligand; bPARG plus ADP-ribose; bPARG-MD2 no ligand; bPARG-MD2 plus ADP-ribose; MACROD2 no ligand; MACROD2 plus ADP-ribose; MACROD2-bP no ligand and MACROD2-bP plus ADP-ribose. ADP-ribose was added to a final concentration of 50 µM in 3 mg/ml protein.
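The text does not state how the melting temperature (Tm) was extracted from the fluorescence traces. One common approach, shown here purely as an assumed illustration, is to take the temperature at which the first derivative of fluorescence with respect to temperature is greatest. The function name and the synthetic melt curve are hypothetical.

```python
import math


def estimate_tm(temps, fluorescence):
    """Estimate Tm as the temperature of maximum dF/dT (central differences)."""
    if len(temps) != len(fluorescence) or len(temps) < 3:
        raise ValueError("need two equal-length series with at least 3 points")
    best_index, best_slope = 1, -math.inf
    for i in range(1, len(temps) - 1):
        slope = (fluorescence[i + 1] - fluorescence[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_index, best_slope = i, slope
    return temps[best_index]


# Synthetic two-state melt curve sampled every 0.2 degrees from 15 to 95,
# with a hypothetical midpoint of 51.4
temps = [15.0 + 0.2 * k for k in range(401)]
signal = [1.0 / (1.0 + math.exp((51.4 - t) / 2.0)) for t in temps]
print(f"Estimated Tm: {estimate_tm(temps, signal):.1f}")
```

Real traces usually show a fall in fluorescence after the transition as the protein aggregates, so the search is often restricted to the rising part of the curve.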

2.4.6 Liquid Chromatography-Mass Spectrometry of tryptic digests

SDS-PAGE gels were sent for liquid chromatography-mass spectrometry to identify proteins, completed by Dr Martin Reed. Specific sections corresponding to a particular protein were excised from the gel and subjected to a trypsin protease digest. The peptide mass fingerprint was used to identify the polypeptides present in the gel sample.


2.4.7 Multi-Angle Laser Light Scattering (MALLS)

All MALLS experiments were performed by Emma Keevill. Protein samples were diluted to 0.5 mg/ml concentration in 25 mM Tris, 150 mM NaCl. A 500 ml volume of this buffer was also prepared, filtered and de-gassed. The buffer was then used to equilibrate a Superdex 200 10/300 GL size exclusion column (GE Healthcare, Buckinghamshire, UK), following which the protein was run down the column and eluted through a flow cell in the DAWN-EOS MALLS spectrometer instrument (Wyatt Technology corp. CA, USA) with a quasi-elastic light scattering detector. Data was analyzed by the Astra Wyatt software on the apparatus to estimate molecular mass values.

2.5 Crystallographic methods

2.5.1 Crystallisation and crystal handling methods

Proteins were purified to the highest purity achievable, preferably at least 90%, as confirmed by SDS-PAGE. The factors that determine whether a protein crystallizes are not fully understood; however, to increase the chances of crystallization, proteins should be pure, soluble and homogeneous, with little aggregation or degradation. For protein crystals to grow, the solution must move through certain ‘zones’, determined by the solubility of the protein and the precipitant concentration in solution 107, see figure 11.

Figure 11. Diagram of protein crystallization solubility curve. Crystals nucleate only in the nucleation zone and grow in the metastable zone. If they fall into the precipitation zone, proteins will usually not crystallize. Adapted from Asherie, et al 107.


Proteins were crystallized using the sitting drop vapor diffusion technique (see figure 12), with a number of commercial crystallization screens (JCSG+, Morpheus, PACT Premier, Clear Strategy (I) and Clear Strategy (II)), which together contained over 200 different precipitants, various pH values, salts and buffer systems to improve the chances of crystallization. Vapor diffusion involves adding a volume of protein, usually between 150 nl and 1000 nl, to a well on a platform above a larger reservoir that holds the mother liquor (crystallization screen solution); for our experiments, between 30 and 70 µl of mother liquor was added, depending on optimized conditions. An equal volume of mother liquor was added to the protein within the well, so that the drop was 50% protein and 50% screen solution. The well was then sealed air tight. Because the drop is more dilute than the reservoir, its vapor pressure is higher, which leads to a net loss of water from the drop within the diffusion chamber, see figure 12. Trays were set up using the Mosquito liquid handling robot (TTP Labtech) and stored in three temperatures, 4 °C or 25 °C, to improve chances of crystal growth. The drops within the trays were examined at regular intervals using a stereoscopic light microscope (Nikon) at the temperature at which the crystals had been grown. Initial examination was completed 48 hours after trays were set, so as not to disturb any initial nucleation. If crystal growth was observed, either the crystal screen was optimized or crystals were cryo-protected in 10-30% PEG 200 if the mother liquor was not already cryo-protective. Once cryo-protected, crystals were flash cooled in liquid nitrogen and stored for diffraction at the Diamond Light Source facility.


Figure 12. Diagram of sitting drop vapor diffusion method. Protein and precipitant are placed in a well above the reservoir of mother liquor solution and sealed air tight. Due to the difference in vapour pressure, vapour diffusion causes a net transfer of water from the protein solution into the reservoir, increasing the protein and precipitant concentration. Based on information in Asherie, et al 107.

During co-crystallization procedures, sitting drop vapor diffusion was also used, with the initial protein drop being a mix of protein and ligand. Ligand was dissolved or purified into the same buffer as the protein and an excess was added to the protein solution, usually to a concentration of around 1 mM. It is important to make sure that all of the binding sites of the protein are occupied, to avoid a heterogeneous protein population that would reduce the chances of crystallization. In most cases, the addition of the ligand helped to stabilize the protein, reduced precipitation and improved the chances of nucleation and crystal growth. The same method was used as previously described to set up sitting drop vapor diffusion trays using the Mosquito liquid handling robot (TTP Labtech). In this case, the ligand-saturated protein was placed on the platform and an equal amount of the mother liquor was added.


For crystallization using a seed, 50 µl of the mother liquor was added to a Micro Seed bead (Molecular Dimensions). Either individual crystals or a whole drop containing crystals were removed and added to the 50 µl of mother liquor, which was vortexed for 90 seconds to crush the crystals. This solution was then diluted, usually 1 in 4. Crystallization set-up using the robot then involved adding seed stock, usually as one quarter of the total volume, e.g. 100 nl diluted seed stock to 100 nl protein and 200 nl mother liquor.

2.5.2 X-ray data collection

Once successful crystallization was achieved, crystals were flash frozen in liquid nitrogen. Diffraction data were collected by Professor David Leys, Dr Mark Dunstan and Dr Colin Levy at the Diamond Light Source facility (Oxford, UK) using beamlines I0-2, I0-3 or I0-4 at 100 K. During data collection, the crystal was rotated to collect more reflections. The automated data reduction suite ‘Xia2’ processed initial diffraction data on site. This initial data provided the resolution, following which a full data set was taken if diffraction was seen and the resolution was sufficient. ‘Xia2’ then further processed the diffraction data to give the cell dimensions and space group, following which it integrated and scaled the resulting reflections. The scaled output file contains an index for each spot and its measured intensity. The spots were listed in numerical order according to their index in a file known as an mtz file.

2.5.3 Structure elucidation and refinement

The structures of TTPARG and bPARG were solved using molecular replacement. The PHASER program, part of the CCP4 suite, was used for molecular replacement 108. For the bacterial PARG protein E114A, the bPARG WT was used as a model for molecular replacement (PDB code: 3SIG), and the TTPARG protein bound to PAR was also solved using the TTPARG WT protein as a model (PDB accession code: 4L2H). Both of the structures were refined against the diffraction data using the Refmac5 software (CCP4 suite) 109,110. Subsequent real space refinement was used to improve the structures using COOT 111, followed by further processing in Refmac5. PAR used for co-crystallisation of TTPARG was biosynthesized by tankyrase 1 in the presence of histones, detached with potassium hydroxide and purified by means of the dihydroxy boronyl column. The resulting bulk polymers were fractionated by anion exchange chromatography and desalted to yield homogeneous PAR of defined lengths: trimer, hexamer, heptamer, decamer, tetradecamer and pentadecamer. The size identity of PAR fragments was confirmed by mass spectrometry [work completed by the Tan group (Harvard, USA)] 78.

The SAV0323 structure was built using a variety of crystallographic techniques. The Se-Met anomalous dispersion signal (Se-SAD) was used to obtain initial phases for this protein using Autorickshaw. The phases derived from this process were sufficient to detect solvent boundaries, but not to complete an atomic model. Experimental phases were improved using NCS averaging combined with multi-crystal averaging using the non-isomorphous native data set (using DM-multi from the CCP4 suite). The resulting electron density maps were of sufficient quality to allow the automated model building software BUCCANEER 112 to build an initial model, which was further refined using iterative cycles of manual building in COOT 111 and refinement using REFMAC5 109,110.


Chapter Three Biophysical characterization of Thermomonospora curvata poly ADP-ribose glycohydrolase

3.1 Background information

The structure of a bacterial PARG from Thermomonospora curvata was recently reported and a catalytic mechanism proposed 13. Importantly, the PARG-specific catalytic loop with signature sequence GGG-X6–8-QEE is inserted into a macrodomain fold. Two key residues found in the catalytic loop, Glu114 and Glu115, are proposed to be largely responsible for the PARG activity. The Glu115 residue projects into the active site and forms a hydrogen bond with the ribose-ribose O-glycosidic linkage. Mutation of either the Glu114 or Glu115 residue renders the enzyme inactive. The proposed role for the Glu115 residue in catalyzing the hydrolysis of the ribose-ribose bond, in part, explains this observation. However, the Glu114 residue appears responsible only for ligand binding and orientation, rather than for catalysis directly. This would lead to the assumption that (conservative) mutations of Glu114 may retain some activity. The purified T. curvata PARG variants E114A and E114Q were bright yellow in colour. This led to the speculation that the Glu114 variants bind flavin and that Glu114 therefore plays a role in hindering flavin binding to the WT enzyme. This hypothesis was tested by biophysically characterizing the Glu114 variants, assessing binding affinity, specificity and the corresponding crystal structures.

T. curvata PARG is structurally homologous to other macrodomain proteins such as MACROD2 from H. sapiens, and both retain the ability to bind ADP-ribose. However, despite the structural homology of these two proteins, only bacterial PARG from T. curvata (here called bPARG) contains the key catalytic loop, conferring the ability to hydrolyse whole PAR chains into predominantly ADP-ribose monomers 13. The MACROD2 enzyme, on the other hand, cleaves the ester bond linking the proximal ADP-ribose unit directly to the modified protein 65. Given the close structural relationship between both enzymes, and the fact that a single loop appears linked to activity, it was of interest to study variant bPARG and MACROD2 enzymes that contained a MACROD2 and bPARG loop sequence respectively. If activity was truly determined by the loop sequence only, we proposed that the hybrid enzymes would display some of the corresponding activity as determined by the loop sequence. The second part of this chapter describes the results from engineering proteins bPARG and MACROD2 in order to swap the enzyme-specific catalytic loops and enzyme activity.

3.2 Biophysical characterization of bacterial PARG variants E114A & E114Q

3.2.1 Expression and purification of bPARG WT, E114A & E114Q

Three pET-28a plasmids (Invitrogen), containing the bPARG WT, E114A & E114Q genes under the control of the T7 promoter and lac operator, were received from the Ahel group. The constructs were used to transform BL21 (DE3) E. coli cells and transformants were stored as glycerol stocks.

Small-scale expression conditions had been previously reported 13. These conditions were used as a guide to optimize large-scale expression of all 3 proteins (see section 2.3.6). The bPARG WT, E114A and E114Q proteins were released from the cells by sonication and purified by immobilization on a batch Ni-NTA column. After elution with 250 mM imidazole from the Ni-NTA column, both bPARG E114A and E114Q were visibly yellow in colour, while the bPARG WT remained colourless. After each elution of the nickel affinity column the eluates were analyzed by SDS-PAGE (figure 13). Protein concentration was determined by absorbance measurements at 280 nm using the predicted extinction coefficient. Twelve litres of culture produced approximately 100 mg of protein. All three proteins were judged to be over 90% pure after Ni-affinity chromatography. Proteins were buffer exchanged into lower salt conditions; E114Q and E114A remained bright yellow in colour.
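Determining concentration from absorbance at 280 nm relies on the Beer-Lambert law, A = ε c l. A minimal sketch of the calculation follows; the extinction coefficient, molecular mass and absorbance reading are hypothetical placeholders, since the actual values for the bPARG constructs are not given in the text.

```python
def protein_conc_mg_per_ml(a280: float, ext_coeff_per_m_cm: float, mass_da: float,
                           path_cm: float = 1.0, dilution_factor: float = 1.0) -> float:
    """Protein concentration in mg/ml from A280 using the Beer-Lambert law.

    ext_coeff_per_m_cm is the molar extinction coefficient (M^-1 cm^-1);
    dilution_factor corrects for any dilution made before the reading.
    """
    molar = a280 * dilution_factor / (ext_coeff_per_m_cm * path_cm)  # mol/L
    return molar * mass_da  # (mol/L) * (g/mol) = g/L = mg/ml


# Hypothetical example: A280 = 0.5 for a 30 kDa protein with
# an extinction coefficient of 40,000 M^-1 cm^-1 in a 1 cm cuvette
print(protein_conc_mg_per_ml(0.5, 40000.0, 30000.0))
```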


Figure 13. SDS-PAGE analysis of nickel affinity chromatography fractions of bPARG E114A, E114Q and WT respectively. Lanes 1-4, bPARG E114A lysate, 50 mM imidazole wash, 250 mM imidazole eluate and protein after buffer exchange column. Lanes 5-9, bPARG E114Q lysate, 15 mM imidazole wash, 50 mM imidazole wash, 250 mM imidazole eluate and protein after buffer exchange column. Lanes 10-14, bPARG WT lysate, 15 mM imidazole wash, 50 mM imidazole wash, 250 mM imidazole eluate, and protein after buffer exchange column. Arrows indicate the protein of interest.

3.2.2 Spectral analysis of bPARG E114A & E114Q

To confirm whether FAD binding was responsible for the yellow colour of the bPARG E114A & E114Q proteins, a UV-VIS absorption spectrum was taken for the purified samples (see figure 14). Two spectral features, at 375 nm and 450 nm respectively, correspond to a typical FAD absorbance spectrum. FAD was removed from the protein by buffer exchange from a 25 mM Tris pH 7.5, 150 mM NaCl buffer into a 25 mM Tris pH 7.5, 150 mM NaCl and 2 M potassium bromide buffer. Following removal of potassium bromide and subsequent buffer exchange into a 25 mM Tris pH 7.5, 150 mM NaCl and 1 mM DTT buffer, no residual FAD binding could be detected by UV-VIS (see figure 14) and the protein remained colourless even when concentrated to approximately 60 mg/ml.


Figure 14. Absorbance spectrum of purified bPARG E114A and E114Q. (a) The darker grey line represents the spectrum of bPARG E114A as purified, with FAD bound. The lighter grey line represents the E114A spectrum after incubation with 2 M potassium bromide. (b) The blue line represents the spectrum of bPARG E114Q as purified, with FAD bound. The purple line represents the E114Q spectrum after incubation with 2 M potassium bromide.

3.2.3 Thermal shift binding assays of bPARG WT, E114A and E114Q

The fact that FAD binding occurs only in the bPARG variants E114A and E114Q corresponds to a marked change in ligand binding specificity (the natural ligand for bPARG WT is PAR/ADP-ribose). This suggests the E114 mutations may have a variety of effects on the proteins' behavior. Thermal shift assays were performed to assess whether the addition of FAD and/or ADP-ribose would increase the stability of the WT protein and the E114 variants.

The following samples were tested: bPARG WT no ligand, bPARG WT plus FAD, bPARG WT plus ADP-ribose, bPARG E114A no ligand, bPARG E114A plus FAD, bPARG E114A plus ADP-ribose, bPARG E114Q no ligand, bPARG E114Q plus FAD and bPARG E114Q plus ADP-ribose. Ligands were added to a final concentration of 50 µM and proteins were added at 3 mg/ml. The temperature was set to increase incrementally from 15 °C to 95 °C.

The data reveal that the E114 mutations destabilize the protein: the unfolding/melting temperature (Tm) of bPARG WT was 51.4 ± 0.7 °C, whereas bPARG E114A and E114Q unfolded at 43.0 ± 0.5 °C and 44.0 ± 1.2 °C respectively. When FAD was added to the samples, no significant shift in Tm was observed for the WT protein (0.5 °C, within error), indicating a lack of binding. In contrast, the Tm values for bPARG E114A and E114Q increased by 11.8 °C and 12.8 °C respectively, corresponding to a marked increase in stability. The addition of ADP-ribose increased the Tm for all three proteins. Together with the increase in Tm seen for the E114 variants after the addition of FAD, this suggests that the variants can bind both FAD and ADP-ribose (see figure 15 and table 3).


Figure 15. Thermal shift assays of bPARG variants. WT, E114A and E114Q bPARG proteins, both in absence and presence of ligand (FAD or ADP-ribose) presented in the first derivative with corresponding Tm values.
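As an illustration of how a Tm value can be read from a first-derivative plot such as Figure 15, the following minimal sketch generates a synthetic two-state melt curve and takes the Tm as the temperature at which dF/dT is largest. The sigmoid model, the 0.5 °C step size and all function names are assumptions made for illustration; this is not the instrument software used in this work.

```python
import math

def synthetic_melt_curve(temps, tm, slope=1.5, f_min=100.0, f_max=1000.0):
    """Hypothetical two-state (sigmoidal) fluorescence curve, standing in for real data."""
    return [f_min + (f_max - f_min) / (1.0 + math.exp((tm - t) / slope)) for t in temps]

def tm_from_first_derivative(temps, fluorescence):
    """Return the temperature at which the central-difference dF/dT is largest."""
    best_temp, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        d = (fluorescence[i + 1] - fluorescence[i - 1]) / (temps[i + 1] - temps[i - 1])
        if d > best_slope:
            best_temp, best_slope = temps[i], d
    return best_temp

# 15 °C to 95 °C as in the assay; the 0.5 °C increment is an assumption.
temps = [15.0 + 0.5 * i for i in range(161)]
# Synthetic curve centred on the Tm reported for bPARG WT without ligand.
curve = synthetic_melt_curve(temps, tm=51.4)
estimated_tm = tm_from_first_derivative(temps, curve)
print(f"Estimated Tm: {estimated_tm:.1f} °C")
```

The estimate is limited by the temperature step, so it can only agree with the underlying Tm to within one increment.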


bPARG    Tm no ligand     Tm + ADP-ribose   ΔTm (no ligand      Tm + FAD          ΔTm (no ligand
                                            and ADP-ribose)                       and FAD)

WT       51.4 ± 0.7 °C    60.0 ± 0.4 °C     8.6 °C              51.9 ± 0.9 °C     0.5 °C

E114A    43.0 ± 0.5 °C    52.2 ± 0.7 °C     9.2 °C              54.8 ± 0.6 °C     11.8 °C

E114Q    44.0 ± 1.2 °C    54.2 ± 1.1 °C     10.2 °C             56.8 ± 1.0 °C     12.8 °C

Table 3. Thermal shift assays of bPARG variants. Tm values for bPARG WT, E114A and E114Q before and after the addition of ADP-ribose or FAD, and the corresponding differences in Tm (ΔTm) relative to the ligand-free protein.
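The ΔTm columns of Table 3 follow directly from the measured Tm values. The short sketch below recomputes them as a consistency check; the dictionary layout and the function name are illustrative choices, and the numbers are copied from the table.

```python
# Mean Tm values (°C) copied from Table 3; the layout of this dictionary is illustrative.
tm_values = {
    "WT":    {"none": 51.4, "ADP-ribose": 60.0, "FAD": 51.9},
    "E114A": {"none": 43.0, "ADP-ribose": 52.2, "FAD": 54.8},
    "E114Q": {"none": 44.0, "ADP-ribose": 54.2, "FAD": 56.8},
}

def delta_tm(protein, ligand):
    """Shift in Tm on adding a ligand, relative to the ligand-free protein."""
    row = tm_values[protein]
    return round(row[ligand] - row["none"], 1)

for protein in tm_values:
    print(protein, delta_tm(protein, "ADP-ribose"), delta_tm(protein, "FAD"))
```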

3.2.4 ITC measurement of bPARG E114A FAD and ADP-ribose binding

Both E114A and E114Q bPARG show an increase in Tm after the addition of ADP-ribose or FAD. To quantify ligand binding affinities, ITC was used to measure the binding of FAD and of ADP-ribose to E114A.

Solutions of WT and E114A bPARG were each prepared at 20 µM in 25 mM Tris pH 7.5, 150 mM NaCl and 1 mM DTT. FAD and ADP-ribose solutions were prepared at a concentration of 200 µM in the same buffer. After the protein was loaded into the sample cell, 20 aliquots of 15 µl of ligand solution were injected and mixed with the protein.

Exothermic binding reactions were recorded for WT and E114A bPARG after the addition of ADP-ribose. Additionally, exothermic reactions were recorded for bPARG E114A after the addition of FAD. No binding reaction was observed after addition of FAD to bPARG WT. The raw data peaks were integrated to produce binding isotherms. Fitting the integrated heats as a function of the molar ratio of ligand to protein yielded a Kd for each interaction (see figure 16).

From the fitted curve, K (the association constant), n (the number of binding sites) and ΔH (the enthalpy change) can be derived. The dissociation constant is Kd = 1/K. The affinity of the WT protein for ADP-ribose corresponded to a Kd of 248 ± 55 nM. The Kd values of bPARG E114A for ADP-ribose and FAD were 397 ± 79 nM and 82 ± 26 nM respectively. These data indicate preferential binding of the E114A mutant to FAD over ADP-ribose.
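The size of this preference can be made explicit by comparing the two Kd values for E114A. The sketch below converts Kd to K using Kd = 1/K and computes the Kd ratio with a propagated uncertainty. Treating the two fitting errors as independent is an assumption of this sketch, and the function names are illustrative.

```python
import math

# Kd values and fitting errors (nM) taken from the ITC results above.
KD_E114A_ADPR = (397.0, 79.0)
KD_E114A_FAD = (82.0, 26.0)

def association_constant(kd_nanomolar):
    """K = 1 / Kd, returned in units of M^-1."""
    return 1.0 / (kd_nanomolar * 1e-9)

def ratio_with_uncertainty(a, da, b, db):
    """Ratio a/b with the error propagated assuming independent uncertainties."""
    ratio = a / b
    relative_error = math.sqrt((da / a) ** 2 + (db / b) ** 2)
    return ratio, ratio * relative_error

fold_preference, fold_error = ratio_with_uncertainty(*KD_E114A_ADPR, *KD_E114A_FAD)
print(f"K (E114A, FAD) = {association_constant(KD_E114A_FAD[0]):.2e} M^-1")
print(f"FAD is bound {fold_preference:.1f} ± {fold_error:.1f} times more tightly than ADP-ribose")
```

Under this assumption the uncertainty on the ratio is sizeable, so the fold difference is best read as approximate.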


Figure 16: ITC binding data for bPARG WT and E114A with ADP-ribose and FAD. (A) E114A with ADP-ribose, Kd is 397 ± 79 nM, (B) E114A with FAD, Kd is 82 ± 26 nM and (C) WT with ADP-ribose, Kd is 248 ± 55 nM.


Protein   Ligand       Kd (nM)     Stoichiometry (N)   ΔH (kcal/mol)    ΔS (cal/mol/K)

WT        ADP-ribose   248 ± 55    0.61 ± 0.01         -7.94 ± 0.18     4.02

WT        FAD          No binding observed

E114A     ADP-ribose   397 ± 79    0.56 ± 0.01         -8.52 ± 0.22     0.72

E114A     FAD          82 ± 26     0.70 ± 0.0          -4.51 ± 0.91     17.2

Table 4. Thermodynamic binding parameters for bPARG WT and variants. The table shows the dissociation constant, stoichiometry, change in enthalpy and change in entropy for the binding of ADP-ribose to the wild type and the E114A mutant, and for the binding of FAD to the E114A mutant.
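The parameters in Table 4 are linked by ΔG = ΔH − TΔS = RT ln Kd, which gives a simple internal consistency check. The sketch below assumes a temperature of 298.15 K, since the experimental temperature is not stated in this section; the function names are illustrative.

```python
import math

GAS_CONSTANT = 1.987e-3   # kcal/(mol*K)
TEMPERATURE = 298.15      # K; assumed, the ITC temperature is not stated in this section

# (Kd in nM, ΔH in kcal/mol, ΔS in cal/mol/K), copied from Table 4.
table4 = {
    ("WT", "ADP-ribose"): (248.0, -7.94, 4.02),
    ("E114A", "ADP-ribose"): (397.0, -8.52, 0.72),
    ("E114A", "FAD"): (82.0, -4.51, 17.2),
}

def delta_g_from_kd(kd_nanomolar):
    """ΔG = RT ln(Kd), with Kd expressed in mol/l."""
    return GAS_CONSTANT * TEMPERATURE * math.log(kd_nanomolar * 1e-9)

def delta_g_from_enthalpy_entropy(delta_h, delta_s):
    """ΔG = ΔH − TΔS, converting ΔS from cal to kcal."""
    return delta_h - TEMPERATURE * delta_s / 1000.0

for (protein, ligand), (kd, dh, ds) in table4.items():
    print(f"{protein} + {ligand}: "
          f"{delta_g_from_kd(kd):.2f} vs {delta_g_from_enthalpy_entropy(dh, ds):.2f} kcal/mol")
```

At the assumed temperature the two routes to ΔG agree to within about 0.15 kcal/mol for all three interactions. The values also show that FAD binding to E114A has a smaller enthalpic and a larger entropic contribution than ADP-ribose binding.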

3.3 Structural characterization of bPARG E114A

We attempted to crystallise these FAD-binding mutants in order to visualize the impact that the single mutation of the E114 residue has on the structure, and from this to elucidate why or how FAD binding occurs.

3.3.1 Crystallization of bPARG E114A

For crystallographic trials, bPARG E114A was further purified using a Superdex 200 10/300 GL size exclusion column. After purification, the FAD was removed from the binding site using 2 M potassium bromide solution, and the protein was desalted into 25 mM Tris pH 7.5, 150 mM NaCl and 1 mM DTT to remove the high salt. A seed stock was prepared from previously crystallized bPARG WT. The protein was prepared both with a molar excess of ADP-ribose and in the absence of ligand. Both protein samples were tested against a number of commercially available crystallization screens at concentrations of 7 and 12 mg/ml.

Seed stock prepared from bPARG WT crystals was used in a 1:4 volume ratio in a 400 µl sitting drop (50 µl seed stock, 150 µl protein or protein + ADP-ribose and 200 µl of mother liquor). Crystals were obtained by sitting drop vapor diffusion at 4 °C using protein at 12 mg/ml in 25 mM Tris pH 7.5, 150 mM NaCl and 1 mM DTT. Drops containing bPARG E114A plus ADP-ribose produced more crystals. While drops containing E114A with no ligand also produced crystals, larger “rod” shaped crystals were only seen in condition B3 (0.1 M Bicine pH 9.0 and 20% w/v PEG 6000) of the JCSG-plus crystal screen (Molecular Dimensions), in drops containing bPARG E114A plus ADP-ribose. An optimized screen was created around these conditions: 0.1 M Bicine pH 9.0 and 0%–40% w/v PEG 6000. The condition 0.1 M Bicine pH 9.0, 20% w/v PEG 6000 led to growth of a few larger crystals, one of which was flash frozen in liquid nitrogen following addition of 10% PEG 200 (Figure 17).

Figure 17. bPARG E114A crystal stored in a cryoloop. The crystal was grown in 0.1 M Bicine pH 9.0 and 20% w/v PEG 6000.

3.3.2 Structure determination of bPARG E114A Apo and plus ADP-ribose

Two complete diffraction data sets were obtained from single crystals on beamline I04 at the Diamond Light Source facility (Oxfordshire, UK); data were reduced and scaled using X-ray Detector Software (XDS) 113. A 2.02 Å data set was obtained from ligand-free bPARG E114A crystals and a 1.82 Å data set from bPARG E114A co-crystallised with ADP-ribose. Both E114A structures (ligand-bound and apo) were solved by molecular replacement with PHASER (CCP4 suite), using the previously solved bPARG WT structure (PDB: 3SIG) 13 as the search model. Refmac5 (CCP4 suite) was used for model refinement 109,110.


For the ligand free structure, an initial round of Refmac5 produced an Rwork/Rfree of 24.17/28.53; for the ADP-ribose ligand structure, an initial round of Refmac5 produced an Rwork/Rfree of 22.26/26.02 (table 5). Further rounds of refinement and model building were completed using COOT (CCP4 suite) 111. The COOT program allows for real space refinement of the model, addition of waters, mutation of the E114 residue and addition of the ADP-ribose ligand (figure 18). Rounds of manual real space refinement and model building were interspersed with model refinement using Refmac5. To complete the structure of bPARG E114A with ADP-ribose, molecular restraints for the ADP-ribose ligand were added in COOT by selecting the ligand ‘APR’. After multiple rounds of refinement, the final model of bPARG E114A had an Rwork/Rfree of 20.07/25.32; the bPARG E114A: ADP-ribose complex had an Rwork/Rfree of 19.01/23.45. The final data processing and refinement statistics can be seen in table 5.

Figure 18. Density corresponding to bound ADP-ribose within the bPARG E114A model. Electron density is shown contoured at 3 sigma. The bound ADP-ribose as well as associated water molecules can be clearly observed. Green density represents ADP-ribose. The image is a snapshot from the COOT software, CCP4i suite.


                       bPARG E114A                bPARG E114A : ADP-ribose

Data collection

Space group            P212121                    P212121

Cell dimensions
  a, b, c (Å)          44.49, 50.57, 118.11       44.1, 50.1, 117.2

Resolution (Å)         41.63-2.02 (4.4-2.02)      31.7-1.82

Rmeas                  5.7 (60.1)                 5.1 (50.6)

I/σI                   17.82 (3.71)               16.53 (3.25)

Completeness (%)       99.17                      98.5

Redundancy             5.2                        6.9

Refinement

Resolution (Å)         2.02                       1.82

No. of reflections     170514                     240658

Rwork/Rfree (%)        20.07/25.32                19.01/23.45

No. of atoms           2138                       2469
  Protein              1940                       2164
  Water                198                        305
  Ligand               -                          36

B-factors (Å2)
  Protein              27.02                      20.15
  Water                35.71                      30.32
  Ligand               -                          15.76

R.m.s. deviations
  Bond lengths (Å)     0.0189                     0.0190
  Bond angles (°)      1.8832                     1.579

Table 5. X-ray crystallographic statistics for data collection, processing and refinement of the T. curvata bPARG E114A variant. Both ADP-ribose bound and apo statistics are given. Values in parentheses indicate values obtained for the highest resolution shell. Equations for Rmeas and Rwork/Rfree are given below. For Rwork and Rfree, the formula for both is the same, Rwork is calculated for the working set, whereas Rfree is calculated for the test set.
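For reference, the standard definitions of these statistics are written out below in LaTeX. These are the conventional textbook forms and are given here as an assumption about the intended equations, since the original formulae are not reproduced in this text. Here $I_i(hkl)$ is the $i$-th of $n_{hkl}$ observations of reflection $hkl$, $\langle I(hkl)\rangle$ is their mean, and $F_{obs}$ and $F_{calc}$ are the observed and calculated structure factor amplitudes.

```latex
R_{meas} = \frac{\sum_{hkl} \sqrt{\dfrac{n_{hkl}}{n_{hkl}-1}} \; \sum_{i=1}^{n_{hkl}} \left| I_i(hkl) - \langle I(hkl) \rangle \right|}
                {\sum_{hkl} \sum_{i=1}^{n_{hkl}} I_i(hkl)}

R_{work},\ R_{free} = \frac{\sum_{hkl} \Big| \, |F_{obs}(hkl)| - |F_{calc}(hkl)| \, \Big|}
                           {\sum_{hkl} |F_{obs}(hkl)|}
```

For Rwork the sums run over the reflections of the working set, and for Rfree over the reflections of the test set that were excluded from refinement.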


Figure 19c shows the structure of bPARG E114A bound to ADP-ribose superimposed on bPARG WT. Although mutation of the E114 residue renders the enzyme inactive, there is no obvious change to the overall fold of the protein. Similarly, ligand binding does not affect the protein conformation: superimposing bPARG E114A on the bPARG E114A-ADP-ribose complex gives an r.m.s.d. of 0.22 Å and a Z score of 48.4. The ADP-ribose binding sites of bPARG WT 13 and bPARG E114A are displayed in Figure 19a and 19b respectively. No obvious differences in the overall position of the ligand can be observed, and the majority of the protein-ligand interactions are identical in both structures. Binding studies using ITC (section 3.2.4) showed a decrease in binding affinity for ADP-ribose after mutation of glutamic acid 114 to alanine. The E114A crystal structure suggests this is likely due to loss of the polar contact between E114 and the ribose 2′-OH.
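The r.m.s.d. quoted above is the root-mean-square distance between equivalent atoms of two superimposed structures. A minimal sketch of the calculation is given below; it assumes the two coordinate sets are already superimposed and matched atom for atom, and the toy coordinates are hypothetical rather than taken from the deposited models.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points.

    Assumes the structures are already superimposed; no fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Hypothetical C-alpha positions (Å), used only to exercise the function.
apo = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.0)]
# Every atom displaced by 0.22 Å along x, mimicking a small uniform shift.
complex_ = [(x + 0.22, y, z) for x, y, z in apo]
print(f"r.m.s.d. = {rmsd(apo, complex_):.2f} Å")
```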


Figure 19. T. curvata PARG crystal structure in complex with ADP-ribose. (A), (B) Detailed view of ADP-ribose in the active site bound to bPARG WT and the E114A mutant respectively. Key active site residues are labeled and arrows show the mutated and non-mutated residue. (C) Overall fold of the bPARG E114A mutant (yellow) superimposed on bPARG WT (grey); ADP-ribose in the active site is shown as spheres. Images were created using QTMG, part of the CCP4i suite.


3.3.3 Crystallisation of bPARG E114A with FAD

To understand how FAD binds to bPARG E114A, attempts were made to crystallize bPARG E114A with FAD. Unfortunately, neither co-crystallization nor soaking of bPARG E114A crystals with FAD led to any diffraction-quality E114A-FAD crystals. Various attempts were made, including micro-seeding protocols. The lack of diffraction-quality crystals may indicate that the bulky FAD isoalloxazine ring hinders crystal contact formation.

3.4 Engineering of T. curvata bPARG and H. sapiens MACROD2 hybrid variants.

3.4.1 Design of a bPARG-MACROD2 Loop hybrid

Both bPARG and MACROD2 are ADP-ribose binding proteins, but while PARG proteins hydrolyse the PAR glycosidic ribose-ribose bonds, MACROD2 serves to remove the terminal ADP-ribose monomer from the modified protein 13,65. Having distinct catalytic mechanisms, PARG and MACROD2 enzymes cannot substitute for each other, and instead act synergistically within post-translational ADP-ribose metabolism. Both enzymes contain a unique catalytic loop that is inserted into the ADP-ribose binding site and, given the close structural relationship between the two enzymes, it appears that this single loop governs activity. To test this hypothesis, we set out to study variant bPARG and MACROD2 enzymes that contain a MACROD2 and a bPARG loop sequence respectively. The aim was to explore whether this could swap the catalytic activities of the proteins. The structures of both bPARG and MACROD2 consist of a ‘macrodomain fold’ but share only 19% sequence identity (see figure 20).


Figure 20. T. curvata PARG and H. sapiens MACROD2 crystal structures in complex with ADP-ribose. (A) Overall ‘macro’ fold of the T. curvata PARG (PDB:3SIG) and H. sapiens MACROD2 (PDB:4IQY) proteins superimposed to show structural homology. (B) A closer view of the T. curvata PARG and H. sapiens MACROD2 active sites, focusing on the specific catalytic loops for each protein highlighted in purple; the black arrows represent the start and finish point for both loops. Figures were drawn using QTMG part of the CCP4i suite.

To design the proposed hybrid genes, the amino acid sequences of human MACROD2 and bPARG were aligned according to structure (see appendix). Figure 20 shows the superimposed structures of bPARG WT and MACROD2 WT. Figure 20b highlights the overlap between both catalytic loops, the bPARG loop in blue and the MACROD2 loop in gold. Although the loops are different in length, each loop is connected to the macrodomain fold at very similar positions, suggesting a “swap” would result in folded protein. Using both the structures and the sequence alignments, hybrid genes were designed (see appendix for sequences).

3.4.2 Cloning of bPARG-MD2 and MACROD2-bP hybrid genes

After identifying the specific catalytic loop regions for both proteins, hybrid genes were designed to encode bPARG with a MACROD2 catalytic loop (bPARG-MD2) and MACROD2 with a bPARG catalytic loop (MACROD2-bP); see section 2.2.7 for more details.

3.4.3 Protein expression of bPARG-MD2 and MACROD2-bP hybrid proteins

Competent BL21 DE3 Rosetta E. coli cells were transformed with the bPARG, MACROD2, bPARG-MD2 and MACROD2-bP protein expression plasmids. An IPTG concentration of 400 µM, an induction temperature of 18 °C and an induction time of 16 hours were used for all proteins, based on the induction parameters for bPARG WT protein. Protein expression levels were estimated by SDS-PAGE analysis. Figure 21 shows the SDS-PAGE gel that verifies the presence of soluble versions of all four proteins at relatively high levels. For larger scale protein expression, the above conditions were used to grow 12 litres of each strain, producing on average 13 g of cell pellet for each bacterial strain.

Figure 21. SDS-PAGE gel of cell extracts from cells expressing bPARG, MACROD2, bPARG-MD2 and MACROD2-bP proteins. Samples were taken after 16 hours of induction at 18 °C with 400 µM IPTG. Lanes 1&2, bPARG supernatant and cell pellet. Lanes 3&4, bPARG-MD2 supernatant and cell pellet. Lanes 5&6, MACROD2 supernatant and cell pellet. Lanes 7&8, MACROD2-bP supernatant and cell pellet. Arrows highlight proteins of interest.

3.4.4 Purification of MACROD2, bPARG-MD2 and MACROD2-bP proteins

bPARG-MD2, MACROD2-bP, and MACROD2 proteins were released from the cells by sonication and purified by immobilization on a batch Ni-NTA column. After multiple wash steps, the proteins were eluted with a 250 mM imidazole buffer and exchanged into a 25 mM Tris pH 7.5, 150 mM NaCl, 1 mM DTT buffer via a desalting column. During the first purification of the bPARG-MD2 and MACROD2-bP hybrids, much of the protein was eluted in the 50 mM imidazole wash step, so subsequent purifications used a lower concentration of imidazole in the wash to avoid premature elution: the hybrids were washed with 70 ml of a 15 mM imidazole solution and then eluted with the 250 mM imidazole buffer (see figure 22). Protein concentrations were determined after nickel affinity chromatography using the predicted extinction coefficients (see section 2.3.1). bPARG-MD2 has a predicted extinction coefficient of 0.787 g-1, and the purification yielded ~4 ml at 18 mg/ml. MACROD2-bP has a predicted extinction coefficient of 1.03 g-1, yielding ~4 ml at 15 mg/ml, and MACROD2, with a predicted extinction coefficient of 1.05 g-1, yielded ~4 ml at 18 mg/ml.
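The way concentration and yield follow from an A280 reading can be sketched with the Beer-Lambert law. The sketch below assumes that the extinction coefficients quoted above are mass extinction coefficients, that is, the absorbance of a 1 mg/ml solution over a 1 cm path length. The A280 reading and the dilution factor are hypothetical values chosen for illustration, not measurements from this work.

```python
def concentration_mg_per_ml(a280, mass_extinction, dilution_factor=1.0, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l), scaled back by the dilution factor.

    mass_extinction is assumed to be the A280 of a 1 mg/ml solution in a 1 cm cell.
    """
    return a280 * dilution_factor / (mass_extinction * path_cm)

def total_yield_mg(concentration, volume_ml):
    """Total protein (mg) in a pooled sample."""
    return concentration * volume_ml

# Hypothetical reading: bPARG-MD2 diluted 50-fold gives A280 = 0.283.
bparg_md2_conc = concentration_mg_per_ml(0.283, mass_extinction=0.787, dilution_factor=50.0)
print(f"bPARG-MD2: {bparg_md2_conc:.1f} mg/ml")

# Yield implied by the reported ~4 ml at 18 mg/ml.
print(f"Yield: {total_yield_mg(18.0, 4.0):.0f} mg")
```

On the reported figures, ~4 ml at 18 mg/ml corresponds to roughly 72 mg of protein, and ~4 ml at 15 mg/ml to roughly 60 mg.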

The bPARG-MD2, MACROD2-bP and MACROD2 proteins were further purified by size exclusion chromatography to remove remaining E. coli derived contaminants. Fractions taken from the gel filtration column corresponding to the main protein peak were pooled and analyzed via SDS-PAGE electrophoresis to assess purity, (see figure 22). During the final purification steps of bPARG-MD2 and MACROD2-bP hybrids, there were smaller diffuse bands appearing on the SDS-PAGE gel. Following LCMS analysis of tryptic digests, these smaller bands were identified as C-terminal degradation products of the bPARG-MD2 and MACROD2-bP hybrids respectively.


Figure 22. SDS-PAGE analysis of MACROD2-bP and bPARG-MD2 hybrids (lanes 1-10) and MACROD2 (lanes 11-16) purified by nickel affinity chromatography. Lane 1: Precision plus protein ladder. Lanes 2-5: MACROD2-bP protein lysate, wash step, elution and pooled elution fractions after gel filtration column. Lane 6: Gap. Lanes 7-10: bPARG-MD2 lysate, wash step, elution and pooled elution fractions after gel filtration column. Lane 11: Precision plus protein ladder. Lanes 12-16: MACROD2 lysate, 1st wash step, 2nd wash step, elution and pooled elution fractions after gel filtration column. Purple arrows indicate the bands analyzed by LCMS of tryptic digests.

3.4.5 PARG Activity assay of MACROD2-bP hybrid

The aim of the PARG activity assays was to ascertain whether the modified MACROD2-bP hybrid would display PARG catalytic activity and whether the modified bacterial PARG (bPARG-MD2) would retain any PARG activity. Both WT bPARG and MACROD2 were used as controls. The PARG activity assays involved the enzymatic production of poly ADP-ribose (PAR) chains by auto-modification of PARP1 in response to the addition of double-strand damaged DNA. In the presence of PARG activity, PAR is converted to ADP-ribose monomers (for specific methods see section 2.4.1).

The results of the activity assay, visualized by western blot (Figure 23a), show that bPARG hydrolyses the PAR chain; however, no hydrolysis is seen for the other three proteins. Most importantly, no PAR hydrolysis is seen for the MACROD2-bP protein. Given the high concentration of enzyme used, this suggests a total lack of PARG activity, which was verified in triplicate experiments.


To further verify the lack of PARG activity, higher concentrations of enzyme were added (figure 23b); in this case the proteins were added to a final concentration of 30 µM, a 3000-fold excess.


Figure 23. Western blot results of the PARG activity assays using bPARG, MACROD2-bP, bPARG-MD2 and MACROD2. (A) The effect of time on the hydrolysis reaction: hydrolysis of PAR by unmodified bPARG is complete after 10 and 15 minutes; however, no other enzyme shows hydrolysis activity. (B) The effect of enzyme concentration on the hydrolysis reaction. Again bPARG shows almost complete hydrolysis of PAR, whereas PAR hydrolysis is not seen in any of the other enzyme reactions.


3.4.6 Thermal shift assay of modified proteins

As no PARG activity was seen for the MACROD2-bP hybrid, the protein was further characterized using biophysical studies to verify correct folding and ligand binding properties. A thermal shift assay with the bPARG-MD2 and MACROD2-bP hybrids, using ADP-ribose as a ligand, was carried out to determine whether the addition of ADP-ribose would improve protein stability, implying ligand binding (for specific methods see section 2.4.5).

The resulting thermal shift data (figure 24) show that bPARG had a melting temperature (Tm) of 54 ± 0.6 °C, which increased to 60.2 ± 0.7 °C with the addition of ADP-ribose. This indicates an increase in stability and is similar to the Tm values recorded for bPARG in the earlier thermal shift assay (figure 15). A similar result (i.e. stabilization upon ligand binding) was obtained for MACROD2, which had a melting temperature of 53.8 ± 0.8 °C, increasing to 56.1 ± 0.4 °C after addition of ADP-ribose. The hybrid bPARG-MD2 had a similar melting temperature to bPARG, at 52.8 ± 0.7 °C, and addition of ADP-ribose did not alter the Tm, indicating that ADP-ribose does not bind to this protein. The MACROD2-bP hybrid had a lower melting temperature than all other proteins, at 48.0 ± 1.0 °C, and no change in melting temperature was observed after the addition of ADP-ribose, indicating the likely absence of ADP-ribose binding. It appears both hybrids are less stable than the unmodified proteins and lack affinity for ADP-ribose. This was verified using ITC studies, which revealed no binding of ADP-ribose to the hybrid proteins.


Figure 24. Thermal shift assays for bPARG, MACROD2-bP, bPARG-MD2 and MACROD2 in presence and absence of ADP-ribose. Results shown in the first derivative to allow visualization of corresponding Tm values.

Protein       Tm no ligand      Tm + ADP-ribose    ΔTm (no ligand and ADP-ribose)

bPARG         54 ± 0.6 °C       60.2 ± 0.7 °C      6.2 °C

MACROD2-bP    48.0 ± 1.0 °C     48.2 ± 0.9 °C      0.2 °C

bPARG-MD2     52.8 ± 0.7 °C     52.8 ± 0.8 °C      0 °C

MACROD2       53.8 ± 0.8 °C     56.1 ± 0.4 °C      2.3 °C

Table 6. Tm values for bPARG, MACROD2 and the hybrid proteins from thermal shift assays. Tm values for bPARG, MACROD2-bP, bPARG-MD2 and MACROD2 are given before and after the addition of ADP-ribose.


3.5 Summary and discussion

Chapter three presents biophysical and structural studies of the bPARG E114A mutant from T. curvata. Although crystals of the E114 mutants bound to FAD were grown, none produced any diffraction data. However, the structure of the bPARG E114A mutant was solved both with and without ADP-ribose. Chapter three also presents biophysical studies of engineered bPARG and MACROD2 proteins to determine whether they retain any catalytic activity. PARG activity assays were used to ascertain the success of the catalytic loop swap, and attempts were made to crystallize these proteins.

During the initial purification of the bPARG E114 variants or FAD-binding variants, it was noted these would precipitate at concentrations higher than 5 mg/ml. The proteins were bright yellow in colour and remained so throughout purification. The purified proteins had a typical FAD spectrum and the addition of 2 M potassium bromide allowed the FAD to be stripped from the binding site. Once the FAD was removed, the E114 variants became more soluble.

The thermal shift assays provided quantitative data on the decreased stability of the bPARG E114 variants and indicated possible preferential binding to FAD over ADP-ribose. This preferential FAD binding of the E114 variants was confirmed by ITC binding assays, which show that E114A bPARG binds FAD over four times more tightly than ADP-ribose. As the bPARG WT protein shows no binding to FAD, it appears that E114 is critical for ligand binding preference and perhaps for avoiding FAD inhibition of the enzyme.

While crystals of E114A with and without ADP-ribose could be obtained, the bPARG E114A: FAD complex could not be crystallised. The structures of FAD and ADP-ribose (Figure 25a) are very similar, and our data reveal that the E114A mutation does not drastically affect the bPARG conformation. Hence a plausible E114A-FAD complex can be modelled by replacing the ribose moiety of the ADP-ribose molecule with the flavin isoalloxazine-ribityl moiety. This model suggests that the isoalloxazine ring could protrude from the binding site (see Figure 25b), which could be responsible for the protein aggregation observed.


Figure 25. Structures of ADP-ribose, FAD and a model of T. curvata bPARG bound to FAD. (a) The molecule to the left is ADP-ribose and the molecule to the right is FAD. (b) A model of the E114A: FAD complex. The model was drawn using the QTMG software, part of the CCP4i suite.

Recent publication of the crystal structure of the MACROD2 protein by Jankevicius et al. provided data on the high degree of structural homology between this human protein and bPARG 65. It was previously established that the bacterial PARG protein contained a “macrofold”, but the high degree of structural homology (r.m.s.d. 2.02 Å) between MACROD2 and bPARG had not been explored fully. Both proteins contain a “catalytic loop” with a specific signature sequence that appears solely responsible for their respective catalytic activities. While both proteins bind ADP-ribose, MACROD2 cleaves the ester bond to remove the terminal ADP-ribose monomer from the modified protein, while bPARG WT hydrolyses the glycosidic ribose-ribose bonds between the ADP-ribose monomers of PAR 13,65. Neither enzyme can perform both reactions 8. If the loop region were solely responsible for activity, hybrid proteins containing the macrodomain of one enzyme with the loop sequence specific for the other might display the activity of the loop sequence donor, as opposed to that of the macrodomain donor. The hybrid proteins bPARG-MD2 and MACROD2-bP could be produced and purified, indicating these were (at least partially) folded. However, protein degradation occurred easily, suggesting the engineered proteins displayed high flexibility.

No enzymatic activity could be observed for the hybrid proteins, which may be a result of the degradation or of incorrect folding of the catalytic loop. This appears to be corroborated by the fact that ADP-ribose binding could not be observed for either hybrid. The loss of ligand binding would explain the lack of activity, and suggests that further studies would need to look more carefully at hybrid construction.


CHAPTER FOUR

Biophysical and structural characterization of Tetrahymena thermophila PARG bound to the poly (ADP-ribose) substrate

4.1 Background information

Canonical PARG is a highly conserved protein, found in organisms ranging from protozoa, such as Tetrahymena thermophila (TTPARG), to humans 71. The Leys group previously presented the structure and mechanism of a divergent, bacterial form of PARG 13, on which the further biophysical and structural studies in chapter three were based. Recent structural determination of PARG from bacterial, protozoan and mammalian sources 13,15,16,77 revealed that these enzymes essentially consist of a macrodomain ADP-ribose binding module 54, elaborated upon through insertion of the highly conserved PARG-specific catalytic loop. Eukaryotic canonical PARGs are more complex and larger in size than their bacterial-type counterparts (see figure 26).

Recent studies identified a minimal region in the human PARG (hPARG) protein required for catalytic activity in vitro, which extends beyond the bacterial-type catalytic macrodomain. This N-terminal extension (domain B, figure 5) contains a short motif known as the Regulatory Segment/Mitochondrial Targeting Sequence (RT/MTS), which was suggested to be essential for PARG activity 68.

While bacterial PARG has been shown to act solely as an exo-glycohydrolase 13, canonical PARGs have been reported to act as both endo- and exo-glycohydrolases 114. PARG from T. thermophila is structurally homologous to the minimal catalytic region of the human PARG protein (seen in figures 5 and 7), making this a suitable model to study human PARG-PAR binding, catalytic mechanism and structure.


Figure 26. Structural comparison of PARG from T. curvata and canonical PARG from T. thermophila. A superimposition of TTPARG (in blue) and bPARG (in pink) reveals a structurally conserved macrodomain core and the presence of a TTPARG accessory domain. The image was drawn using the QTMG software, part of the CCP4i suite.

Previous studies have revealed that mutation of residues within the signature PARG catalytic loop (GGG-X6–8-QEE) produces catalytically inactive proteins 13. Similar mutations in the human enzyme also lead to inactivation, indicating that the TTPARG protein is a good model for the human PARG protein 16, 11.

To provide a detailed understanding of the inherent balance of exo- and endo-glycohydrolase activities of the PARG proteins, we set out to obtain biophysical and structural data for a catalytically inactive canonical eukaryotic PARG (TTPARG) in complex with its natural ligand, poly ADP-ribose (PAR).

4.2 Expression and purification of inactive Tetrahymena thermophila PARG.

4.2.1 Expression and purification of TTPARG WT, E256A, E256Q, E255Q and E255A

WT TTPARG and the mutant E256A, E256Q, E255Q and E255A genes were previously cloned into pET 28a plasmids (Invitrogen) by the Ahel group and thus placed under the control of the T7 promoter and lac operator. The constructs were used to transform E. coli Rosetta2 (DE3) cells (Novagen), and transformants were inoculated into LB broth for glycerol stocks.


Optimum small-scale expression conditions reported by the Ahel group were used as a guide to optimize large-scale expression of the constructs (see section 2.3.6).

TTPARG WT, E256A, E256Q, E255Q and E255A recombinant proteins were released from the cells by French press and purified by immobilization on a batch Ni-NTA column. After elution with 250 mM imidazole from the Ni-NTA column, TTPARG WT, E256A, E256Q, E255Q and E255A eluates were analysed by SDS-PAGE. Figure 27a shows an example of one Ni-NTA purification, for the E256Q variant. Figure 27b shows samples of all variants after Ni-NTA purification, confirming the presence of TTPARG proteins at ~55.8 kDa. Protein concentration was determined by measuring absorbance at 280 nm using the predicted extinction coefficient (section 2.3.1). Twelve litres of culture produced approximately 25 mg of each protein.

Figure 27. SDS-PAGE analysis of TTPARG WT and variants using nickel affinity chromatography. (a) Gel after Ni-NTA for TTPARG E256Q. Lane 1: molecular marker. Lanes 2-4: lysate, wash step and eluate. (b), Gel of TTPARG WT and variants after 250 mM imidazole elution. Lane 1: TTPARG WT eluate. Lane 2: TTPARG E255A eluate. Lane 3: TTPARG E255Q eluate. Lane 4: TTPARG E256A eluate. Lane 5: TTPARG E256Q eluate. Arrows indicate proteins of interest.

After nickel affinity chromatography, numerous impurities remained in the final 250 mM imidazole eluate (see figures 27a and 27b). Consequently, TTPARG WT, E256A, E256Q, E255Q and E255A were further purified using a Superdex 200 10/300 GL size exclusion column (GE Healthcare, UK) on an AKTA FPLC system. The fractions corresponding to the major peak at 280 nm were retained. Figure 28 shows an example of the typical chromatogram seen for these variants (E256Q).

Figure 28. Chromatogram of TTPARG E256Q gel filtration purification. Main peak fractions were collected after elution at around 15 ml.

4.2.2 UV-Vis Spectral analysis of TTPARG E255A and E255Q.

All of the purified TTPARG proteins were colourless to the naked eye after expression and purification. However, because FAD was persistently present after expression of the E114A bPARG protein (equivalent to TTPARG E255), TTPARG E255A and E255Q were analysed by taking spectral absorption measurements between 280 nm and 600 nm (see figure 29). In contrast to the bacterial equivalent, no FAD absorption features were seen around 375 nm and 450 nm.


Figure 29. UV-Vis spectra of purified TTPARG E256Q (in yellow) and E256A (in black). No typical FAD peaks at 375 nm and 450 nm were seen for these proteins.

4.3 Biophysical characterization of purified TTPARG

4.3.1 Thermal shift assays of TTPARG E256Q and E256A

Thermal shift assays were used as an initial experiment to ascertain whether the inactive mutants of TTPARG still retain the capacity to bind ADP-ribose. In the bPARG results chapter (chapter three), bacterial PARG mutant versions showed ligand-dependent shifts in melting temperatures and ligand binding could be verified using ITC.

A total of 10 wells in a 90 well hard shell semi skirted PCR plate were each filled with 19 µl of the master mix containing SYPRO orange dye and TTPARG protein. Two wells were used for each protein, and the following reactions were tested: TTPARG WT no ligand, TTPARG WT plus ADP-ribose, TTPARG E256Q no ligand, TTPARG E256Q plus ADP-ribose, TTPARG E256A no ligand, TTPARG E256A plus ADP-ribose, TTPARG E255A no ligand, TTPARG E255A plus ADP-ribose, TTPARG E255Q no ligand, TTPARG E255Q plus ADP-ribose.

Results of the thermal shift assays (see figure 30) show that, in the absence of ligand, TTPARG WT was at least as stable as the mutant versions, with its Tm of 55 ± 0.4 °C shifting to 61 ± 0.7 °C after the addition of ADP-ribose. This rise of 6 °C suggests binding of TTPARG WT to ADP-ribose. Increases in Tm were also seen for the E255Q, E256A, E255A and E256Q mutants following the addition of ADP-ribose. The E256 variants had higher Tm values than the two E255 mutants, both with and without ligand, and so appeared more stable, although their increases in Tm were smaller. All proteins tested had increased stability after the addition of ADP-ribose (see table 7).


Figure 30. Thermal shift assays for TTPARG WT and mutant versions in the presence and absence of ADP-ribose. Fluorescence was measured at 575 nm. The 1st derivative shows the Tm readings.
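As an illustration of how a Tm can be read from the first derivative of a melt curve, the sketch below builds a synthetic two-state unfolding curve and takes the temperature of the steepest fluorescence rise. The curve shape, slope parameter and temperature range are assumptions made for the example; no instrument data from this work are used.

```python
import math


def boltzmann(temp_c, tm_c, slope=1.5, f_min=0.0, f_max=1.0):
    """Synthetic two-state melt curve: fluorescence rises sigmoidally around Tm."""
    return f_min + (f_max - f_min) / (1.0 + math.exp((tm_c - temp_c) / slope))


def tm_from_first_derivative(temps, fluorescence):
    """Return the midpoint of the temperature interval with the largest dF/dT."""
    best_slope, best_temp = float("-inf"), None
    for i in range(len(temps) - 1):
        d = (fluorescence[i + 1] - fluorescence[i]) / (temps[i + 1] - temps[i])
        if d > best_slope:
            best_slope, best_temp = d, 0.5 * (temps[i] + temps[i + 1])
    return best_temp


# Synthetic example: 25-95 °C in 0.5 °C steps with an assumed Tm of 55 °C
temps = [25.0 + 0.5 * i for i in range(141)]
curve = [boltzmann(t, tm_c=55.0) for t in temps]
tm_estimate = tm_from_first_derivative(temps, curve)
print(f"Tm from first derivative: about {tm_estimate:.1f} °C")
```

Real melt curves usually fall again after the transition as the protein aggregates, which is why the maximum of the derivative, rather than the maximum of the raw signal, is used.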

TTPARG   Tm (°C)     Tm + ADP-ribose (°C)   ΔTm (°C)
WT       55 ± 0.4    61 ± 0.6               6
E255Q    43 ± 1.0    48 ± 0.3               5
E255A    42 ± 0.7    49 ± 0.9               7
E256Q    55 ± 0.2    57 ± 0.4               2
E256A    51 ± 0.7    53 ± 0.8               2

Table 7. Thermal shift assays of TTPARG variants. Tm values for TTPARG WT, E256Q, E256A, E255A and E255Q before and after the addition of ADP-ribose, and the difference between these values.
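The comparison drawn from table 7 can be restated as a short calculation. The sketch below uses the mean Tm values from the table (errors omitted) to compute ΔTm for each protein and to pick the inactive mutant with the highest Tm in the presence of ADP-ribose; the variable and function names are illustrative only.

```python
# Mean Tm values (°C) from table 7: (no ligand, plus ADP-ribose)
TM_TABLE = {
    "WT": (55, 61),
    "E255Q": (43, 48),
    "E255A": (42, 49),
    "E256Q": (55, 57),
    "E256A": (51, 53),
}


def delta_tm(table):
    """Ligand-induced shift in melting temperature for each protein."""
    return {name: bound - apo for name, (apo, bound) in table.items()}


shifts = delta_tm(TM_TABLE)

# Most stable inactive mutant when bound to ADP-ribose (WT excluded)
most_stable_mutant = max(
    (name for name in TM_TABLE if name != "WT"),
    key=lambda name: TM_TABLE[name][1],
)

for name, shift in shifts.items():
    print(f"{name}: dTm = {shift} °C")
print(f"Most stable mutant with ADP-ribose: {most_stable_mutant}")
```

On these numbers, the E255 variants show the larger shifts while the E256 variants have the higher absolute Tm, so the two criteria rank the mutants differently.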


4.3.2 ITC Binding assays of TTPARG WT and E256Q, E256A, E255A, E255Q.

As the eventual aim of this project was to gain structural insight into binding of TTPARG with its natural ligand PAR, it was important to choose the inactive mutant with the highest ligand binding affinity. The amount of purified homogenous PAR available was minimal, and would only allow for a couple of attempts at crystallisation. Therefore, quantitative binding data were needed for the TTPARG inactive mutants, and TTPARG WT was used as a positive control. TTPARG WT, E256Q, E256A, E255A and E255Q proteins were used for ITC binding reactions, each prepared at 20 µM. The ligand was prepared at 10x the protein concentration, i.e. 200 µM.

An exothermic binding reaction was recorded for TTPARG WT after the addition of ADP-ribose (see figure 31). The raw heat peaks were integrated and plotted against the molar ratio of ligand to protein to produce the binding isotherm. Fitting this curve gives K (the association constant, a measure of binding affinity), n (the number of binding sites) and ΔH (the enthalpy change); the dissociation constant is Kd = 1/K. The Kd for TTPARG WT was 200 ± 86.0 nM. No binding reactions were observed for E256Q, E256A, E255A and E255Q, so Kd values could not be obtained for these mutants.


Figure 31. ITC isotherm diagram showing binding data for TTPARG WT with ADP-ribose. Binding data are derived from integration of the exothermic peaks. The Kd for TTPARG binding to ADP-ribose is 200 ± 86.0 nM.

Protein   Ligand       Kdiss (nM)   Stoichiometry (N)   Delta H (cal/mol)   Delta S (cal/mol/K)
WT        ADP-ribose   200 ± 86     0.8808 ± 0.01153    -7717 ± 147         4.75

Table 8. Thermodynamic binding parameters for TTPARG WT using ITC. The table shows the dissociation constant, stoichiometry, change in enthalpy and change in entropy for the binding of ADP-ribose to the wild type TTPARG.
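The fitted parameters in table 8 can be checked for internal consistency, since the free energy of binding can be obtained either from the dissociation constant (ΔG = RT ln Kd) or from the enthalpy and entropy terms (ΔG = ΔH - TΔS). The sketch below does both. It assumes a measurement temperature of 25 °C, which is not stated in this section, and takes ΔH as -7717 cal/mol (about -7.7 kcal/mol), the only reading of the tabulated value that is consistent with the reported Kd and ΔS.

```python
import math

R_CAL = 1.987  # gas constant in cal/(mol*K)


def delta_g_from_kd(kd_molar, temp_k):
    """Free energy of binding from the dissociation constant (1 M standard state)."""
    return R_CAL * temp_k * math.log(kd_molar)


def delta_g_from_h_s(delta_h_cal, delta_s_cal_per_k, temp_k):
    """Free energy of binding from the enthalpy and entropy terms."""
    return delta_h_cal - temp_k * delta_s_cal_per_k


TEMP_K = 298.15  # assumed 25 °C; the experimental temperature is not given here

dg_kd = delta_g_from_kd(200e-9, TEMP_K)            # Kd = 200 nM (table 8)
dg_hs = delta_g_from_h_s(-7717.0, 4.75, TEMP_K)    # ΔH and ΔS (table 8)

print(f"dG from Kd:       {dg_kd / 1000:.2f} kcal/mol")
print(f"dG from dH - TdS: {dg_hs / 1000:.2f} kcal/mol")
```

Under these assumptions both routes give a ΔG of roughly -9.1 kcal/mol, with the enthalpy term providing most of the driving force.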


4.3.3 SPR ligand-binding assays of TTPARG WT and E256Q, E256A, E255A, E255Q

Due to the lack of any quantitative binding data for the TTPARG catalytically inactive mutants using ITC, we attempted to use Surface Plasmon Resonance (SPR) as an alternative. bPARG WT (positive control), TTPARG WT and E256Q, E256A, E255A, E255Q were all immobilised via the inherent His-tag onto a high-loading capacity HTG chip to around 800 RU. The samples were subjected to incrementally increasing concentrations of ligand (ADP-ribose) and the response was measured over 60 seconds. Binding reactions were analysed using the Langmuir analysis in the Bio-Rad ProteOn manager software.

Unfortunately, a distinct response to ligand binding was only seen for the positive control, bPARG WT (see figure 32). Furthermore, the Kd derived from the bPARG binding was highly variable and different from the binding data previously measured by ITC (see figure 31).

Figure 32. SPR protein array sensorgram of bPARG WT with ADP-ribose. ADP-ribose was passed over the immobilised bPARG WT protein bound to an HTG chip. Data were fitted using kinetic Langmuir analysis (ProteOn manager software, Bio-Rad). Response to ADP-ribose was seen at the following concentrations: light blue, 100 nM; dark blue, 1 µM; green, 10 µM; pink, 100 µM; orange, 500 µM.


The equilibrium binding constant (KD) is derived from the Langmuir fit, together with the dissociation rate constant (kd) and the association rate constant (ka). Averaging over the five concentrations of ADP-ribose gave a KD of 704 ± 47 nM, a ka of 3.22 × 10⁴ ± 0.28 × 10⁴ M⁻¹ s⁻¹ and a kd of 2.27 × 10⁻² ± 0.98 × 10⁻² s⁻¹ (see table 9 and figure 8).

ADP-ribose concentration range (µM)   ka (M⁻¹ s⁻¹)                kd (s⁻¹)                      KD (nM)
0.1-50                                3.22 × 10⁴ ± 0.28 × 10⁴     2.27 × 10⁻² ± 0.98 × 10⁻²     704 ± 47

Table 9. Kinetic binding parameters for bPARG WT using SPR. Equilibrium binding constant (KD), dissociation rate constant (kd) and association rate constant (ka) for bPARG WT with ADP-ribose as derived using SPR.
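For a 1:1 Langmuir model the equilibrium constant is the ratio of the two rate constants, KD = kd / ka. The short sketch below applies this to the averaged rate constants in table 9 as a consistency check; the function name is illustrative and the error terms are ignored.

```python
def equilibrium_kd_nm(ka_per_m_per_s, kd_per_s):
    """Equilibrium dissociation constant in nM for a 1:1 Langmuir model."""
    if ka_per_m_per_s <= 0:
        raise ValueError("association rate constant must be positive")
    return kd_per_s / ka_per_m_per_s * 1e9  # convert mol/L to nmol/L


# Averaged rate constants from table 9
kd_nm = equilibrium_kd_nm(ka_per_m_per_s=3.22e4, kd_per_s=2.27e-2)
print(f"KD = kd / ka = {kd_nm:.0f} nM")
```

The ratio comes out at about 705 nM, in agreement with the tabulated 704 ± 47 nM, and confirms that ka carries units of M⁻¹ s⁻¹.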

4.4 Structural characterization of the inactive TTPARG-PAR complex

4.4.1 Crystallisation of inactive E256Q TTPARG with poly-ADP-ribose (PAR) fragments

The volume and concentration of homogenous PAR fragments available were low, so attempts at co-crystallisation would be limited to one mutant. A catalytically inactive mutant would prevent hydrolysis of the PAR fragments and mimic the Michaelis-Menten enzyme-substrate complex, showing whether the protein bound to PAR in an exo and/or endo mode. In the absence of quantitative binding data, E256Q was chosen as it seemed the most stable of the four inactive mutants.

Sitting drop crystal trays of TTPARG E256Q (at 7, 12, 14 and 15.5 mg/mL) in complex with various PAR fragments (1 mM concentration, see method section 2.5.3) were set up against a number of commercial crystallisation screens (Molecular Dimensions) at 4 °C. Various crystals were obtained for all six PAR fragments tested. Crystals of TTPARG E256Q-PAR suitable for diffraction were obtained at 15.5 mg/mL at 4 °C in 400 nL drops. Two different initial crystallisation conditions were identified:


1) 0.1 M carboxylic acids, 0.1 M buffer system 3, pH 8.5 and 30% P550MME_P20K
2) 0.1 M HEPES, pH 7.0 and 30% v/v Jeffamine ED-2001

Figure 33. Photograph of a TTPARG-PAR9 crystal mounted in a cryo-loop. The crystal was obtained in 0.1 M HEPES, pH 7.0 and 30% v/v Jeffamine ED-2001.

4.4.2 Structure determination of inactive TTPARG with PAR

Crystals were flash cooled in liquid nitrogen and diffraction data were collected using the I02 beamline at the Diamond Light Source facility (Oxford, UK). Data were reduced and scaled using X-ray Detector Software (XDS) 112. The highest resolution data were obtained from the TTPARG-PAR9 crystal grown in 0.1 M HEPES, pH 7.0 and 30% v/v Jeffamine ED-2001 (see figure 33), which diffracted to 1.42 Å (PDB ID: 4L2H). The space group was P2₁2₁2₁ (unit cell a = 55.58 Å, b = 75.12 Å, c = 138.72 Å), the overall resolution range was 30.0-1.42 Å, overall Rmeas was 0.075 and overall I/σ was 12.7. The structure was solved by molecular replacement with the PHASER programme (CCP4 suite), using the TTPARG WT structure (PDB ID: 4EPP) as the search model. Initial rounds of model refinement using Refmac5 produced Rwork/Rfree of 24.2/28.9 109,110.

Further manual refinement was completed using the COOT program 111, allowing for real space refinement of the model, addition of waters, mutation of the E256 residue and addition of the PAR ligand. The electron density for the PAR ligand showed only two of the nine ADP-ribose monomers, bound within the binding site in the ‘exo’ mode (figure 34). The PDB file for the PAR ligand was manually edited in COOT starting from a previously constructed PAR trimer 16, following which a ‘cif’ library was generated using the Dundee PRODRG server (http://davapc1.bioch.dundee.ac.uk/cgi-bin/prodrg). The ligand was then manually fitted into the density using real space refinement in COOT. Further rounds of real space refinement and refinement using Refmac5 produced a final model of TTPARG-PAR with Rwork/Rfree of 13.8/17.7. For the full crystallographic data and refinement parameters, see table 10.


                          TTPARG E256Q-PAR9
Data collection
Space group               P2₁2₁2₁
Cell dimensions
  a, b, c (Å)             55.8, 75.6, 138.7
  α, β, γ (°)             90.0, 90.0, 90.0
Resolution (Å)            30
Rmeas                     7.9
I/σI                      13.91
Completeness (%)          99.4
Redundancy                5.46
Refinement
Resolution (Å)            30-1.46
No. reflections           103,366
Rwork / Rfree             13.8/17.7
No. atoms
  Protein                 7,392
  Ligand                  92
  Water                   386
B-factors (Å²)
  Protein                 14.7
  Ligand                  16.6
  Water                   27.0
R.m.s. deviations
  Bond lengths (Å)        0.026
  Bond angles (°)         2.067

Table 10. Crystallographic data and model refinement parameters for TTPARG-PAR9. Values in parentheses indicate values obtained for the highest resolution shell. Equations for Rmeas and Rwork/Rfree are given below. The formula for Rwork and Rfree is the same; Rwork is calculated for the working set, whereas Rfree is calculated for the test set.
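For reference, the conventional definitions of these statistics are written out below. These are the standard textbook forms and are assumed, rather than confirmed, to match the expressions used for table 10. Here I_i(hkl) is the i-th of n observations of reflection hkl, ⟨I(hkl)⟩ is their mean, and F_obs and F_calc are the observed and calculated structure factor amplitudes.

```latex
R_{\mathrm{meas}} =
  \frac{\sum_{hkl} \sqrt{\dfrac{n}{n-1}} \, \sum_{i=1}^{n}
        \left| I_i(hkl) - \langle I(hkl) \rangle \right|}
       {\sum_{hkl} \sum_{i=1}^{n} I_i(hkl)}
\qquad
R_{\mathrm{work}},\ R_{\mathrm{free}} =
  \frac{\sum_{hkl} \bigl|\, |F_{\mathrm{obs}}(hkl)| - |F_{\mathrm{calc}}(hkl)| \,\bigr|}
       {\sum_{hkl} |F_{\mathrm{obs}}(hkl)|}
```

The second expression is evaluated over the working set of reflections for Rwork and over the excluded test set for Rfree.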


Crystals were obtained in a range of conditions for each of the PAR polymers tested. Crystals grew in the following conditions for different lengths of PAR: 24 % w/v PEG 1500 with PAR14; 0.1 M HEPES pH 7.0, 30 % v/v Jeffamine® ED-2003 with PAR6; 0.15 M potassium bromide, no buffer, 30 % w/v PEG 2000 MME with PAR10; and 0.2 M sodium chloride, 0.1 M BIS-Tris pH 5.5, 25 % w/v PEG 3350 with PAR15 and PAR7. All crystals were flash cooled in liquid nitrogen and diffraction data were collected using the I03 beamline at the Diamond Light Source facility (Oxford, UK). In all cases, the crystal packing was the same regardless of the size of the PAR fragment or the composition of the mother liquor.

4.4.3 Crystal structure of inactive TTPARG with PAR

The two terminal ADP-ribose units of the PAR ligand can be clearly seen in the electron density (figure 34). Between the two ADP-ribose monomers we can see the ribose-ribose O-glycosidic bond, which is positioned in close proximity to the Gln256 side chain (figure 35a). We did not observe any direct interactions between the protein and the n-1 ribose’; however, the terminal ADP-ribose monomer interacts with the Asn250 and Glu255 residues. We found that the n-ribose 3’ OH group was within hydrogen bonding distance of the terminal n-ribose” 3’ OH, which was bound by the Asn240 residue and a water molecule (referred to as W2). This W2 was bound by the Gly246 amide nitrogen and the n-ribose” 2’ OH group (bound to Glu255). The n-1 adenosine was positioned between the Val253 residue and the water molecules that interact with the Arg164 residue. Direct hydrogen bonding interactions between the Leu252 amide nitrogen and the n-1 adenosine N11, as well as between the Ser297 side chain and the n-1 adenosine N10, can also be observed. Only a few polar interactions occur between the protein and the n-1 diphosphate, involving the backbone amide nitrogen and the side chain of Asn250. As a consequence, from the n-1 α-phosphate towards the n-2 ADP-ribose group of the PAR chain, electron density rapidly becomes weaker, signifying high levels of flexibility in the bound PAR from the n-1 α-phosphate onwards (see figure 34). Beyond the n-1 β-phosphate, electron density is reduced to background level. The crystal packing is such that solvent channels sufficiently large to contain the disordered PAR region are located directly adjacent to the n-1 β-phosphate (figure 35b).


Figure 34. Active site view of TTPARG with PAR bound. The solvent-accessible surface of PARG is shown in grey. The 2FoFc electron density corresponding to the ordered region of PAR is shown as a blue mesh (contour level 1 sigma). Original figure from Barkauskaite & Brassington, et al.


Figure 35. Crystal structure of a PARG–PAR9 complex. (a) PARG active site. Residues involved in direct contacts with the PAR ligand are shown in atom-coloured sticks. The mutated Glu256 is shown with green rather than with light blue carbons. Hydrogen bonds between ligand and protein or structural waters are indicated by dotted lines, original figure from Barkauskaite & Brassington, et al. (b) Symmetry related monomers contacting the TTPARG-PAR9 complex, revealing a large solvent channel to line the surface corresponding to the active site PAR binding region, original figure from Barkauskaite & Brassington, et al.


4.4.4 Crystal structure of TTPARG-PAR reveals exo-glycohydrolase binding mode

The observed binding of the PAR ligand within the TTPARG binding site corresponds to exo-glycohydrolase activity. However, the terminal ribose 2’ OH group is solvent exposed, suggesting that another ADP-ribose monomer could fit in the n+1 position at the PARG surface (which would lead to endo-mode binding). Only an exo-binding mode was seen for the various PAR fragments crystallised with TTPARG E256Q, including a PAR 15-mer. As the latter contains only one terminal residue compared with 13 intermediate positions, we can estimate at least a 100 fold preference for binding in the exo-mode rather than the endo-mode. To provide further insights into the endo/exo binding modes, an MD simulation of TTPARG bound to a PAR trimer in the endo-conformation was carried out by Dr. P. Lafite (Université d’Orléans, France) (figure 3 in Barkauskaite & Brassington, et al, see appendix). It shows a reorientation of the n adenosine to avoid steric clashes with the conserved Phe398. This in turn leads to a disruption of the polar interactions with the side chain of Glu228 and the amide nitrogen of Ile227. Mutations of Glu228 and Ile227 (completed by the Ahel group; figure 2a in Barkauskaite & Brassington, et al, see appendix) significantly reduce PARG activity but do not abolish it. This is to be expected for mutations that affect binding but not catalysis.

4.4.5. The catalytic mechanism of canonical PARG

The first insight into the PARG catalytic mechanism (from bacterial PARG) revealed the PARG catalytic loop and the two integral glutamates responsible for catalytic activity, E114 (E256 in TTPARG) and E115 (E255 in TTPARG) 13. It was predicted from the bacterial structure that the first glutamic acid (E115) was solely responsible for the hydrolysis, while the second glutamic acid (E114) assisted with the binding of PAR and was not directly involved in the catalytic mechanism (section 1.5).

A plausible model of the TTPARG-PAR enzyme-substrate complex can be derived by replacing Gln256 with Glu256 in a conformation observed in the WT-PARG complex (PDB ID: 4EPP). This reveals that the E256 residue is within hydrogen bonding distance of the ribose-ribose glycosidic bond, suggesting that E256 is protonated in the enzyme-substrate complex (see figure 36). pKa calculations were performed by Dr Warwicker on TTPARG for the free enzyme and for the enzyme bound to PAR. Both E255 and E256 were predicted to be negatively charged in the free enzyme. In the free enzyme, the pKa of the E256 residue does not change substantially after the E255A mutation; however, the charge coupling between the two glutamates is increased in the substrate-bound enzyme, so much so that one proton is added to the pair. As is common for tightly coupled protonation, the precise location of the proton is more difficult to predict; however, hydrogen bonding within the active site is consistent with the side chain of E256 picking up the proton in the enzyme-substrate complex. Calculations suggest that in the E255A mutant, where charge coupling is removed, the E256 pKa reverts to a much lower value and E256 is less likely to pick up a proton. This suggests that E256 becomes protonated by W2 concomitant with PAR binding, and establishes a clear role for the E255 residue in the catalytic mechanism (see appendix for Barkauskaite & Brassington, et al).


Figure 36. Proposed catalytic mechanism of poly-ADP-ribose hydrolysis by PARG based on the PARG-PAR structure. R1= Poly (ADP-ribose), R2= ADP.


4.5 Discussion

Chapter four presents data on the biophysical, structural and catalytic studies of canonical active and inactive PARG from Tetrahymena thermophila. The expression and purification of TTPARG WT and four inactive mutants E256Q, E256A, E255Q and E255A is reported, followed by the biophysical characterisation of ADP-ribose binding to these proteins. The chapter concludes by presenting the successful crystallisation of inactive PARG with its natural ligand PAR, to reveal an exo-binding conformation.

Preferential FAD binding was seen for the inactive bacterial PARG mutants E114A and E114Q (see section 3.2.4); however, no FAD was bound to the equivalent TTPARG mutants E255Q and E255A upon expression and purification. This suggests that inherent FAD binding is not a property of canonical PARG and refutes the hypothesis that a (partial) role of E255 is to hinder FAD binding and so avoid interference with enzyme activity by FAD. With a plausible role for E255 established in PARG catalysis (i.e. binding of PAR and modulation of the E256 pKa), the strict conservation of E255 can now be explained. It would seem that FAD binding by the E114 variants is specific to bPARG, with additional TTPARG features (perhaps the N-terminal accessory domain) hindering FAD binding even for E255 variants. It seems plausible that the latter would hinder binding of the bulky FAD isoalloxazine ring.

Due to the small amount of PAR available to us, it was important to find the most stable of the mutants to be taken forward into crystallisation trials with PAR. Thermal shift assays were performed on TTPARG WT, E256Q, E256A, E255A and E255Q with and without the addition of ADP-ribose. These experiments were completed for two reasons: first, to ascertain whether the inactive mutants would still bind PAR or ADP-ribose, and second, to find the most stable of the inactive mutants (when bound to ADP-ribose).

The thermal shift assays (section 4.3.1) suggested that the E256Q mutant was the most stable and retained PAR binding ability. Attempts to gain more quantitative binding data with ADP-ribose, using either ITC or SPR, failed. It is possible that the heat of binding between the TTPARG inactive mutants and ADP-ribose was not measurable because the binding was too weak. In the case of SPR, there were difficulties with stable immobilisation of the proteins onto the chip and no measurable binding reactions were seen for the TTPARG inactive mutants or even TTPARG WT. Primarily on the basis of the thermal shift assay results, TTPARG E256Q was chosen to go forward into crystal trials with PAR.

Varying lengths of homogenous PAR fragments were co-crystallised with TTPARG E256Q, and all resulted in structures revealing the exo-binding mode. The structural data indicated that TTPARG is inherently predisposed to act as an exo-glycohydrolase. This is due to a higher affinity for the PAR terminus, a consequence of the presence of the conserved phenylalanine residue (Phe398/902). As a result, under normal physiological conditions ADP-ribose will be the dominant PARG product.

The PARG catalytic mechanism was first reported by Slade, et al. and revealed the bacterial PARG to act as an exo-glycohydrolase 13; however, this was originally thought to be possibly limited to the bacterial PARG, with eukaryotic PARG having the space within the binding site to accommodate a further ADP-ribose monomer (see figure 7). The catalytic mechanism proposed in figure 36 indicates that this predominant ‘exo’ binding and catalytic mode would hydrolyse the ribose-ribose bonds linking the terminal ADP-ribose, resulting in primarily ADP-ribose products. This conclusion was supported by further biochemical and modelling studies, which indicated that canonical PARG is inherently predisposed to act as an exo-glycohydrolase. We discovered this was due to the presence of a conserved Phe residue (Phe398) (see figure 35), mutation of which removed the steric clash and thus increased the endo-binding mode affinity. This latent, low-affinity endo-binding mode suggests that, in vivo, the relative balance between exo- and endo-glycohydrolase activity will be a function of the PAR/PARG ratio 11.


Chapter five

Expression, purification and biophysical characterization of the mammalian PARG regulatory domain

5.1 Background information

This chapter reports on the cloning, expression, purification and biophysical characterization of the mammalian PARG regulatory domain. The human PARG (hPARG) protein exists in various isoforms arising from the same gene: a 110 kDa nuclear form, two cytoplasmic isoforms of 99 and 103 kDa, and two mitochondrial isoforms of 60 and 55 kDa 68,69,70. Hence, compared with the bacterial and eukaryotic PARG proteins, human PARG is far more complex and consists primarily of two large regions: the regulatory region (A in figure 37) and the catalytic region (B in figure 37). These regions are connected via a (flexible) linker (see figure 37). There have been a number of recent publications on the structure and function of the catalytic region of various vertebrate PARGs, in addition to a protozoan PARG 15,18,77. In contrast, there is currently very little information available about the function or structure of the vertebrate PARG regulatory domain.

Figure 37. Schematic representation of hPARG protein, (adapted from Kim, et al. and Mortusewicz, et al). Domain A represents the structurally uncharacterised regulatory domain of the hPARG protein. Domain B represents the catalytic region. The key MTS region is represented as a black box within the catalytic domain, and has proven essential for PARG activity. The macrodomain is shown in purple.

A recent study has revealed that mice expressing the catalytic PARG region only (i.e. without the regulatory domain) are more sensitive to genotoxic stress 84. Further studies have shown that the regulatory domain is also required for recruitment of PARG to sites of DNA damage and contains a proliferating cell nuclear antigen binding domain 76. The PARG catalytic domain also contains a mitochondrial targeting sequence (MTS), which is essential for PARG activity 68. A recent study revealed that the catalytic domain alone had higher PARG activity compared with the full-length protein 115. These results indicate that the regulatory domain could be responsible for regulation of PARG activity and the recruitment of the protein to areas of DNA damage and/or to poly ADP-ribose (PAR).

No catalytic function has been proposed for the PARG regulatory domain, and sequence prediction tools indicate that a large amount of disorder is likely (figure 38). The key focus of this chapter is to determine whether the regulatory domain contains globular folded elements and, if so, whether the structure of these could provide further information on associated function and hence on the regulation of the PARG enzyme. Deficiency of the PARG protein has been shown to cause cell death, and PARG depletion causes sensitization to certain DNA damaging agents, implicating PARG as a potential therapeutic target in several disease areas, particularly oncology 84,86,116,117. As the regulatory domain has been shown to limit the movement of PARG to sites of DNA damage 76, further information on the structure of this region could provide the information needed to target PARG regulation, reduce recruitment to sites of DNA damage and establish the domain as a key therapeutic target.

Figure 38. The order-disorder prediction of hPARG from IUPRED 118. Regions above the blue line (0.5 disorder tendency) represent likely disorder. The regulatory domain incorporates residues 1-460 (yellow brackets) and shows the highest levels of predicted disorder when compared with the catalytic domain (black brackets).

5.2 Cloning, expression and purification of full length human PARG and human PARG regulatory domain variants in E. coli

5.2.1 Rational design of hPARG regulatory domain truncations

Residues 1-460 were chosen as a starting point for expression of the hPARG regulatory domain, avoiding the mitochondrial targeting sequence (see figure 37). However, the exact length and nature of the linker region is unclear, hence we also made a series of further truncations. Multiple sequence alignments of PARG were used to identify multiple and possibly more stable truncations of the human PARG regulatory domain, (see alignment in appendix). It is important to note that this protein is only present in higher vertebrates, leading to few non-conserved areas. Areas of lower sequence homology were however identified around residues 365 and 380.

Hence, the hPARG fragments 1-380 and 1-365 were also made. Finally, the hPARG 1-329 fragment was cloned after purification and crystallography trials of the larger fragments suggested this might correspond to a minimal globular region (see section 5.5.1).

5.2.2 Cloning of hPARG C-terminally truncated forms

Primers were designed to allow PCR amplification of the putative N-terminal regulatory domain (C-terminally truncated) and the full-length 110 kDa human PARG. Primers for the C-terminally truncated hPARG proteins were designed to incorporate an N-terminal His-tag for protein purification purposes. Previously reported purifications of the human catalytic domain (Ahel group) had used a C-terminal His-tag, which improved the stability of the catalytic domain. For this reason, a reverse primer incorporating a His-tag was used for the PCR amplification of the full-length protein only.

The genes of the five different N-terminal human PARG fragments and the full-length hPARG were placed under the control of the lac operator and a T7 promoter. This allows heterologous expression in E. coli to be induced with IPTG (see section 2.2.7 for more details).

To confirm correct cloning, following transformation and selection of the NEB5α E. coli cells, colony PCR reactions were completed using vector specific T7 and T7term primers. DNA agarose gels were used to visualize amplified fragments and identify colonies containing the hPARG genes. The agarose gel in figure 39 shows colonies of NEB5α cells containing hPARG 1-460, 1-388, 1-365 and 1-329 genes.

Figure 39. Agarose electrophoresis analysis of HsPARG regulatory domain colony PCR reactions. (a) Amplified DNA fragments from seven colonies; six colonies contained the genes for hPARG N-terminal regulatory domains 1-388, 1-380, 1-365 and 1-329, and one colony, shown as hPARG 1-388(2), did not contain the correct insert. (b) Amplified DNA fragments from one colony containing the hPARG 1-460 gene. (c) Amplified DNA fragments from one colony containing the hPARG full-length gene.

Plasmids were sent to Eurofins MWG-operon (Ebersberg, Germany) for sequencing, confirming the various hPARG constructs (full-length, 1-460, 1-388, 1-380, 1-365 and 1-329) as error free.


5.2.3 Expression trials of full length hPARG and hPARG fragments

E. coli Rosetta BL21 (DE3) cells were transformed with the hPARG plasmids and transformants were tested for their ability to express the corresponding hPARG fragments. Expression trials were used to identify the optimum expression conditions that provided the best yield of protein. The five C-terminally truncated fragments all gave the best yield after expression overnight (~18 hours) at 20 °C. In contrast, the full-length protein gave the best expression after 2 hours at 30 °C (see figures 40, 41 and 42).

Figure 40. SDS gels showing the expression of hPARG 1-388, 1-380, 1-365 and 1-329 before induction, after 2 hours of induction and after 18 hours of expression at various temperatures. Lane 1: Ladder. Lane 2: hPARG 1-388 before induction (20 °C expression temperature), Lane 3: hPARG 1-388 before induction (25 °C expression temperature), Lane 4: hPARG 1-380 before induction (20 °C expression temperature), Lane 5: hPARG 1-380 before induction (25 °C expression temperature), Lane 6: hPARG 1-365 before induction (20 °C expression temperature), Lane 7: hPARG 1-365 before induction (25 °C expression temperature), Lane 8: hPARG 1-329 before induction (20 °C expression temperature), Lane 9: hPARG 1-329 before induction (25 °C expression temperature), Lane 10: hPARG 1-388 2 hours after induction at 20 °C, Lane 11: hPARG 1-388 2 hours after induction at 25 °C, Lane 12: hPARG 1-380 2 hours after induction at 20 °C, Lane 13: hPARG 1-380 2 hours after induction at 25 °C, Lane 14: hPARG 1-365 2 hours after induction at 20 °C, Lane 15: hPARG 1-365 2 hours after induction at 25 °C, Lane 16: Ladder, Lane 17: hPARG 1-329 2 hours after induction at 20 °C, Lane 18: hPARG 1-329 2 hours after induction at 25 °C, Lane 19: hPARG 1-388 18 hours after induction at 20 °C, Lane 20: hPARG 1-388 18 hours after induction at 25 °C, Lane 21: hPARG 1-380 18 hours after induction at 20 °C, Lane 22: hPARG 1-380 18 hours after induction at 25 °C, Lane 23: hPARG 1-365 18 hours after induction at 20 °C, Lane 24: hPARG 1-365 18 hours after induction at 25 °C, Lane 25: hPARG 1-329 18 hours after induction at 20 °C, Lane 26: hPARG 1-329 18 hours after induction at 25 °C. Arrows indicate proteins of interest.


Figure 41. SDS gel showing the expression of full length hPARG. Lane 1: molecular marker. Lanes 2-8: hPARG FL before induction, after 2 hours of induction at 20 ˚C, after 2 hours of induction at 30 ˚C, after 2 hours of induction at 25 ˚C, after o/n induction (~18 hours) at 20 ˚C, after o/n induction (~18 hours) at 30 ˚C, after o/n induction (~18 hours) at 25 ˚C. Arrow indicates protein of interest.

5.2.4 Large-scale expression and purification of full-length hPARG

Large-scale expression of the full-length hPARG enzyme was carried out and the protein was purified on a Ni-NTA column (see section 2.3.6). Expression was induced with 1 mM IPTG and cells were left at 20 °C overnight (~18 hours). The yield of cell pellet was ~9 g for the full-length hPARG protein, relatively low compared with yields from the bacterial and eukaryotic PARGs (see sections 3.2.1 and 4.2.1). The full-length hPARG was released from the cells by sonication, purified by immobilization on a batch Ni-NTA column and eluted with 250 mM imidazole.

Previous studies of the bacterial and eukaryotic PARGs (sections 3.2.3 and 4.3.1) had shown a significant increase in stability after the addition of ADP-ribose. For this reason, the hPARG full-length protein was incubated with ADP-ribose for the size exclusion stage of purification, in an attempt to stabilize the protein. The presence of species of lower molecular weight suggests that a large portion of the full-length protein was degrading (see figure 42). Separate fractions corresponding to the individual peaks were further analyzed by SDS-PAGE and Western blot (see figure 42c). SDS-PAGE revealed a protein at the correct weight for the full-length protein, ~111 kDa (fraction B10), another corresponding to a 60 kDa fragment (fraction B1) and a third around 55 kDa (fractions C4-C8). A Western blot using an anti-His-tag antibody was performed on the SDS gel seen in figure 42b. As the full-length hPARG was C-terminally His-tagged, detection of the smaller fragments by this antibody suggests that they correspond to the catalytic C-terminal region of the protein.


Figure 42. Analysis of full-length hPARG large-scale expression. (a) Chromatogram of full-length hPARG size exclusion chromatography. (b) SDS-PAGE of fractions of individual peaks seen during size exclusion chromatography. Lane 1, Molecular protein marker. Lane 2, fraction B10. Lane 3, fraction B1. Lane 4, fraction C4. Lane 5, fraction C5. Lane 6, fraction C6. Lane 7, fraction C7. Lane 8, fraction C8. Lane 9, fraction D10. (c) SDS-PAGE and corresponding Western blot of fractions of individual peaks seen during size exclusion chromatography. Lane 10, Molecular protein marker. Lane 11, fraction B10. Lane 12, combined fractions C4-C8. Lane 13, antibody staining of fraction B10. Lane 14, antibody staining of fractions C4-C8. Arrows indicate protein of interest.


5.2.5 Large-scale expression and purification of various hPARG fragments

Large-scale expression of the five C-terminally truncated fragments was performed under the same conditions (see section 2.3.6). Per 12 liters of expression, the yield of cell pellet was ~13 g for the five fragments. The hPARG variants were released from the cells by sonication, purified by immobilization on a batch Ni-NTA column and eluted at 250 mM imidazole. SDS-PAGE analysis of the eluate (seen in figures 43, 44 and 45) revealed low levels of expression and high levels of sample impurity/heterogeneity.

Figure 43. SDS-PAGE of hPARG 1-460 purified by nickel affinity chromatography. Lane 1: Molecular weight marker, Lane 2: hPARG 1-460 lysate, Lane 3: hPARG 1-460 imidazole wash step, Lane 4: hPARG 1-460 eluate. The arrow indicates the lower weight band that was analysed by LC-MS.


Figure 44. SDS page electrophoresis of hPARG 1-388 and 1-380 purified by nickel affinity chromatography. Lane 1, Molecular weight marker. Lane 2, hPARG 1-388 lysate. Lane 3, hPARG 1-388 imidazole wash step. Lane 4, hPARG 1-388 eluate. Lane 5, Gap. Lane 6, hPARG 1-380 lysate. Lane 7, hPARG 1-380 imidazole wash step. Lane 8, hPARG 1-380 eluate. Arrows indicate protein of interest.

Figure 45. SDS-PAGE of hPARG 1-365 and 1-329 purified by nickel affinity chromatography. Lane 1, Molecular weight marker. Lane 2, hPARG 1-365 lysate. Lane 3, hPARG 1-365 imidazole wash step. Lane 4, hPARG 1-365 eluate. Lane 17, Gap. Lane 5, hPARG 1-329 lysate. Lane 6, hPARG 1-329 imidazole wash step. Lane 7, hPARG 1-329 eluate. Arrows indicate protein of interest.


Using liquid chromatography-mass spectrometry (LC-MS) of tryptic digests of the various gel bands, impurities could be identified as constitutively expressed E. coli proteins.

The hPARG 1-460 protein appeared to have a prominent band of lower molecular weight, indicated by the red arrow in figure 43. This was identified as a lower molecular weight form of the hPARG regulatory protein, indicating protein degradation. This also suggests that the degradation of the hPARG regulatory proteins occurred at the C-terminus, as only peptides up to residue 356 could be detected, see figure 46. In addition, N-terminal fragments would not be purified using the Ni-affinity procedure.

Figure 46. Peptide coverage of hPARG 1-460 degradation product in hPARG 1-460 sequence. Peptides identified are in red, 269/460 amino acids were covered, giving an overall coverage of 58%.
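The coverage value quoted in the caption is the number of residues covered by at least one identified peptide, divided by the length of the construct. The following Python sketch shows that calculation; the function names and the example peptide positions are hypothetical and are not taken from the LC-MS data, and only the residue counts 269 and 460 come from the caption.

```python
def coverage_from_ranges(ranges, protein_length):
    """Count residues covered by 1-based, inclusive peptide ranges.

    Overlapping peptides are counted once. Returns (residues, percent).
    """
    covered = set()
    for start, end in ranges:
        covered.update(range(start, end + 1))
    return len(covered), 100.0 * len(covered) / protein_length


def coverage_percent(covered_residues, protein_length):
    """Percentage of the sequence covered by identified peptides."""
    return 100.0 * covered_residues / protein_length


# Hypothetical peptide positions, for illustration only.
example_ranges = [(1, 10), (5, 20), (40, 49)]
print(coverage_from_ranges(example_ranges, 460))

# Residue counts quoted in the caption for the hPARG 1-460 degradation product.
print(f"{coverage_percent(269, 460):.1f}% of hPARG 1-460 covered")
```

Counting covered positions in a set avoids double counting when tryptic peptides overlap, for example through missed cleavages.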

An initial attempt to improve the stability of the hPARG 1-460 protein was to purify the protein using a higher salt buffer (buffers containing 100 mM up to 750 mM NaCl were used). However, protein degradation was still observed. Different buffers at various pH values were also used, with Tris buffer at pH 7.5 leading to minimal levels of degradation.

The hPARG variants were further purified by size exclusion chromatography using a Superdex 200 10/300 GL size exclusion column (GE Healthcare, Buckinghamshire, UK) on an AKTA FPLC system. Figure 11a shows an example of a typical chromatogram obtained for hPARG variants. Fractions corresponding to the main peak were verified by SDS-PAGE (see figure 47a), pooled and, following concentration, either used for crystallization trials or flash frozen in liquid nitrogen for future use. Liquid chromatography-mass spectrometry (LC-MS) of tryptic digests was performed on the bands from the main peak and the smaller peak, as indicated by the red arrows in the gel (figure 47b). The results showed that these were smaller N-terminal fragments of the hPARG regulatory protein, see figure 47c.


Figure 47. Analysis of hPARG 1-388 after purification by size exclusion chromatography. (a) Chromatogram of hPARG 1-388 size exclusion chromatography; fractions B3 to C2, covering the main peak, correspond to hPARG 1-388. (b) SDS-PAGE of the main peaks shown by the chromatogram. Lane 1, molecular marker. Lane 2, fraction B11. Lane 3, fraction B10. Lane 4, fraction B3. Lane 5, fraction B2. Lane 6, fraction B1. Lane 7, fraction C4. Lane 8, fraction C6. Lane 9, fraction C8. Lane 10, fraction C10. (c) Peptide coverage of hPARG 1-388 bands A1 and A2 seen in the gel. Peptides identified are in red; 234/388 amino acids were covered for A1, giving an overall coverage of 60%, and 31/388 for A2, giving an overall coverage of 8%.


5.3 Cloning, expression and purification of human PARG regulatory domain in P. pastoris

5.3.1 hPARG expression in P. pastoris

Expression of the hPARG regulatory domain was attempted in P. pastoris yeast strains to test whether eukaryotic expression would lead to improved protein stability. Many eukaryotic proteins require post-translational modification in order to adopt the correct structure 119. The hPARG 1-388 sequence is predicted to contain multiple phosphorylation sites according to the NetPhos 2.0 server (see figure 48). Using a yeast-based expression system might lead to correct post-translational modification(s) and provide a more stable protein, while still maintaining a high enough expression level to support crystallization trials.

Figure 48. Prediction of phosphorylation sites within the hPARG 1-388 fragment. The prediction is based on the residue sequence and was generated using the NetPhos 2.0 server.

The recent publication of the crystal structure of the rat PARG catalytic domain 385-972 (equivalent to 389-978 human PARG) revealed that a linker region between residues 385-465 (equivalent to 389-460 human numbering) was part of the catalytic region, see figure 37. This coincided with the lower homology seen around the 380 aa region (see appendix). As the 389-460 region was now implicated as part of the catalytic region, the hPARG 1-388 fragment was chosen for yeast expression trials. pPICZ 3.1 vectors were kindly given to us by the protein expression group for recombinant protein expression in P. pastoris. The pPICZ 3.1 vectors can be used with multiple strains of P. pastoris; for this experiment, the two strains chosen were X-33 and KM71H.

5.3.2 Cloning of hPARG 1-388 into pPICZ 3.1 plasmid

Initial cloning steps used a standard In-Fusion technique and transformation into E. coli, followed by P. pastoris cloning. The hPARG 1-388 gene, including an N-terminal His tag and TEV cleavage site, was PCR amplified from a pDonor plasmid (from the Ahel group); primers are listed in the appendix. These primers contained complementary sequences to allow insertion of the PCR product into the pPICZ plasmid using In-Fusion Advantage PCR Cloning. In addition, the forward and reverse primers included EcoRI and NotI restriction sites, respectively, to facilitate subsequent cloning reactions. An annealing step at 65 ˚C for 30 seconds, followed by an extension step at 72 ˚C for 30 seconds, was used. Once the PCR reactions were complete, the PCR product was confirmed on a 40% agarose gel, as seen in figure 49, followed by DpnI enzyme digest (New England Biolabs).

The pPICZ plasmids were digested with EcoRI and NotI restriction enzymes (New England Biolabs) and purified using the plasmid purification kit.

Figure 49. Agarose gel electrophoresis of amplified DNA fragments for hPARG 1-388 after DpnI digest. The amplified gene for hPARG can be seen at around 1 kb (1164 bases).


5.3.3 Transformation into P. pastoris strains.

Following the cloning reaction, 2.5 µl of the reaction mixture was used to transform NEB5α E. coli cells, which were plated onto a low salt solid medium supplemented with 25 µg/ml of Zeocin antibiotic. Plasmids from successful transformants were sent to Eurofins MWG Operon (Ebersberg, Germany) for sequencing using vector-specific T7 and T7term primers, confirming that the hPARG 1-388 DNA sequence was correctly inserted without mutations.

Two genes in P. pastoris code for alcohol oxidase, AOX1 and AOX2, and two phenotypic classes of recombinant strains can be generated: Mut+ and MutS. MutS (as in the KM71H strain) refers to the "Methanol utilization slow" phenotype, caused by the loss of alcohol oxidase activity encoded by the AOX1 gene. A strain with a MutS phenotype has a mutant AOX1 locus but is wild type for AOX2, which results in slow growth on methanol medium. X-33 is Mut+ ("Methanol utilization plus"), which refers to the wild type ability of strains to metabolize methanol as the sole carbon source. Transformation of X-33 with plasmid DNA linearized in the 5´ AOX1 region will yield Mut+ transformants, while KM71H will yield only MutS transformants. Both phenotypes (Mut+ and MutS) were used in this experiment, as one may give better protein expression than the other.

Once the plasmid had been verified to contain the correct hPARG 1-388 insert, the plasmid was linearized (at the AOX1 locus) to allow recombination into the P. pastoris genome. P. pastoris strains X-33 and KM71H were made competent and transformed with 3 µg of the linearized hPARG plasmid by electroporation. The transformed cells were then plated onto YPDS plates supplemented with 1000 µg/ml Zeocin and left to grow for 2–3 days to allow complete incorporation into the yeast genome. Colonies were re-streaked onto 1000 µg/ml Zeocin YPDS plates to ensure complete homologous recombination.

5.3.4 Expression trials of hPARG 1-388 in P. pastoris

P. pastoris is a methylotrophic yeast, capable of metabolizing methanol as its sole carbon source. Expression from the AOX1 gene is controlled at the level of transcription; for this reason, growth in glycerol is recommended for optimal induction with methanol. A single colony of the transformed P. pastoris cells was used to inoculate BMGY medium (buffered glycerol-complex medium). Once an OD600 between 2–6 was reached, the cells were pelleted and transferred to methanol-containing BMMY medium (buffered methanol-complex medium) for induction. Cells were grown for 72 hours, supplemented with methanol every 24 hours to maintain induction, and then pelleted by centrifugation. P. pastoris cells require high pressures for effective lysis, and the cell disruptor was able to lyse the cells effectively at 40 kpsi. Cell debris was removed and solubilized proteins were analyzed by SDS-PAGE (see figure 50). Although the cells appeared to yield protein at the correct size, expression levels were very low and a Western blot was used to confirm protein expression (see figure 51). The X-33 strain expressed more of the hPARG protein than the KM71H strain (although still at very low levels). As X-33 has a Mut+ phenotype, the cells grew much faster in methanol media and expressed higher levels of protein.


Figure 50. SDS-PAGE of P. pastoris strains X-33 and KM71H expressing hPARG 1-388, with samples taken at various time points during methanol expression. (a) Lanes 1-11 are samples of X-33 cells taken at 24h, 36h, 48h, 65h and 84h during methanol expression; (1) and (2) represent the colony that was used to grow the cells from. The red arrow highlights the hPARG 1-388 protein. Lanes 12-15 are samples of KM71H taken at 0h and one sample taken at 24h. (b) Lanes 1-15 are samples of KM71H taken at 24h, 48h, 55h, 72h and 84h during methanol expression; (2), (5) and (7) represent the colony used to grow cells from. The red arrow highlights the presence of hPARG 1-388.


Figure 51. Western blot of hPARG 1-388 expression trials in P. pastoris strains X-33 and KM71H. (a) X-33 expression samples taken at 24h, 36h, 48h, 65h and 84h time points and a positive control of hPARG 1-388 purified from E. coli; (1) and (2) represent the colony used to grow cells from. Low levels of expression can be seen in X-33, indicated by the red arrow. (b) Samples of KM71H taken at 24h, 48h, 55h, 72h and 84h and a positive control of hPARG 1-388 purified from E. coli; (2), (5) and (7) represent the colony used to grow cells from. The red arrow highlights hPARG 1-388 in 48h samples from colonies 7 and 2.


5.3.5 Large-scale expression and purification of hPARG 1-388 in P. pastoris

For large-scale protein expression, P. pastoris Mut+ X-33 was chosen as the host. Expression trials revealed that the optimum induction time was 48 hours, with a single addition of 100% methanol at 24h. At 65 hours, the protein appeared to start degrading. Cells were cultured and induced using the P. pastoris methods described above and grew effectively to optimum density, after which they were harvested and homogenized. Cell lysates were analyzed for expression of hPARG 1-388 by SDS-PAGE. Following Ni-affinity chromatography, SDS-PAGE of the 250 mM imidazole elution step from the nickel affinity column revealed that hPARG 1-388 protein was expressed, but at very low concentrations (see figure 52a).

To confirm the presence of soluble hPARG 1-388, the SDS-PAGE gel was transferred to a PVDF membrane for Western blot, see figure 52b. Compared with E. coli expression of hPARG 1-388 (see figure 44), the expression levels were lower. Only ~0.5 mg of soluble, highly impure protein could be obtained from 12 L of cell culture.


Figure 52. Analysis of expression of hPARG 1-388 using the P. pastoris X-33 expression system. (a) SDS-PAGE of the 250 mM imidazole elution from the nickel column of hPARG 1-388 expressed in P. pastoris at different expression times. Lane 1: Molecular marker, Lane 2: P. pastoris X-33 expression taken at 0h time point, Lane 3: P. pastoris X-33 cells lysed after 24 hours of expression, Lane 4: P. pastoris X-33 cells lysed after 48 hours of expression. (b) Corresponding Western blot. Lane 5: Molecular marker, Lane 6: P. pastoris X-33 expression taken at 0h time point, Lane 7: P. pastoris X-33 cells lysed after 24 hours of expression, Lane 8: P. pastoris X-33 cells lysed after 48 hours of expression. The red arrow indicates the small amount of expression seen.

5.4 Biophysical characterization of purified hPARG regulatory domain variants

5.4.1 Multi-angle light scattering (MALLS)

Multi-angle light scattering (MALLS) was performed by Emma Keevil at the University of Manchester. The DAWNHEOS MALLS spectrophotometer was fitted with a quasi-elastic light scattering detector to determine molecular mass. The hPARG 1-388 protein was purified by Ni-affinity chromatography, further purified by size exclusion chromatography and concentrated to 0.5 mg/ml. The MALLS results (figure 53) revealed that the protein predominantly existed as a monomer, with a measured molecular weight of 43.24 kDa (±0.706%); however, high levels of aggregation were also seen.

Figure 53. Multi-angle light scattering (MALLS) chromatogram of hPARG 1-388. A graph representing the molecular mass of hPARG as determined by MALLS. The mass distribution profile corresponds to a monomer.
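The monomer assignment can be checked by comparing the measured mass with a rough estimate from the sequence length. The Python sketch below assumes an average residue mass of 110 Da, a common rule of thumb rather than a value calculated from the hPARG sequence, and it ignores the His tag and TEV site. The function names are illustrative.

```python
# Assumption: rule-of-thumb mean residue mass, not derived from the sequence.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimated_mass_kda(n_residues):
    """Rough protein mass estimate from the number of residues."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


def oligomeric_state(measured_kda, monomer_kda):
    """Nearest whole number of monomers consistent with the measured mass."""
    return round(measured_kda / monomer_kda)


monomer_kda = estimated_mass_kda(388)        # hPARG 1-388, tag not included
measured_kda = 43.24                         # MALLS value reported above
uncertainty_kda = measured_kda * 0.706 / 100.0

print(f"estimated monomer mass: {monomer_kda:.2f} kDa")
print(f"measured mass: {measured_kda:.2f} +/- {uncertainty_kda:.2f} kDa")
print(f"monomers per particle: {oligomeric_state(measured_kda, monomer_kda)}")
```

A dimer would be expected near twice the estimated monomer mass, so a measured value close to the single-chain estimate supports the monomer assignment even with the approximations used here.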

5.4.2 Circular dichroism (CD) of various PARG fragments

Circular dichroism uses circularly polarized light to study protein secondary structure. The aim of this experiment was to determine whether the various hPARG regulatory region fragments were folded (with corresponding percentages of secondary structure) or largely disordered. The hPARG fragments 1-460, 1-388, 1-380, 1-365 and 1-329 were analyzed using a Chirascan™ CD spectrometer.

This method of analysis is a fairly new technique and has been shown to be very sensitive in predicting types of secondary structure, especially in proteins that are predominantly alpha helical in nature. Often, if a protein has a large helix and small sheet content, the spectral contribution of the beta sheet may be difficult to detect and the accuracy of the derived sheet content will be low 120. Table 11 reveals the hPARG 1-388 variant to have the highest percentage of secondary structure and lowest percentage of random coil. For hPARG 1-329, the percentage of random coil is almost 40% of the protein. Many well-defined folded proteins have regions of random coil as determined by deconvolution methods. Hence, this technique does not suggest that the hPARG protein is necessarily unfolded, rather that the smaller truncations may be less ordered and more difficult to crystallize 120. Figure 54 shows two examples of scans used for deconvolution.

                    hPARG    hPARG    hPARG    hPARG    hPARG
                    1-460    1-388    1-380    1-365    1-329

Helix (%)            45.2     47.8     43.8     39.2     27.2
Antiparallel (%)      6.6      6.5      6.0      7.2      5.3
Parallel (%)          8.3      7.9      7.9      6.5      7.5
Beta-turn (%)        18.5     19.2     19.5     16.1     19.6
Random coil (%)      20.5     17.6     20.9     30.2     38.9
Total Sum (%)        99.1     99.0     98.1     99.2     98.5

Table 11. Deconvolution percentages for hPARG regulatory domain fragments. The table presents secondary structure percentages obtained by deconvolution using CDNN software for hPARG proteins 1-460, 1-388, 1-380, 1-365 and 1-329. CD spectra were measured between 180-260 nm.
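The comparison drawn from Table 11 can be reproduced with a few lines of Python. The sketch below uses only the percentages in the table; the dictionary layout and variable names are illustrative choices and are not part of the CDNN output.

```python
# Values copied from Table 11, in the order:
# helix, antiparallel, parallel, beta-turn, random coil (all in %).
TABLE_11 = {
    "1-460": (45.2, 6.6, 8.3, 18.5, 20.5),
    "1-388": (47.8, 6.5, 7.9, 19.2, 17.6),
    "1-380": (43.8, 6.0, 7.9, 19.5, 20.9),
    "1-365": (39.2, 7.2, 6.5, 16.1, 30.2),
    "1-329": (27.2, 5.3, 7.5, 19.6, 38.9),
}

# Recompute the "Total Sum" row as a transcription check.
totals = {name: round(sum(values), 1) for name, values in TABLE_11.items()}

# Rank the fragments by helix content and by random coil content.
most_helical = max(TABLE_11, key=lambda name: TABLE_11[name][0])
least_coil = min(TABLE_11, key=lambda name: TABLE_11[name][4])

for name, values in TABLE_11.items():
    print(f"hPARG {name}: total {totals[name]}%, random coil {values[4]}%")
print(f"highest helix content: hPARG {most_helical}")
print(f"lowest random coil content: hPARG {least_coil}")
```

The totals fall slightly below 100% for every fragment, which is typical of deconvolution output and does not change the ranking.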


Figure 54. CD spectra of hPARG regulatory domain fragments used for deconvolution. (a) CD spectrum of hPARG 1-388 measured between 190-260 nm. (b) CD spectrum of hPARG 1-365 measured between 190-260 nm. Path length was 1 cm.

5.4.3 Nuclear magnetic resonance (NMR) of hPARG 1-388

Both the protein purification behavior and CD spectral analysis had indicated that the hPARG 1-388 variant is the most stable of the C-terminal hPARG truncations. In order to assess whether the hPARG 1-388 protein was folded, a 1D proton NMR spectrum was collected (performed by Dr Matthew Cliff). Protein samples were prepared to the specifications required for NMR: purified into a low salt buffer (25 mM Tris pH 7.5, 100 mM NaCl and 1 mM DTT) and concentrated to a volume of 500 µl at ~25 µM. The resulting spectrum (see figure 55) is dominated by narrow line width peaks with little dispersion, consistent with an unfolded protein. There were no peaks above 1 ppm and very little around 9 ppm, showing few or no methyl groups packed against aromatic rings and little beta sheet, respectively. The spectrum indicates that the protein is largely unfolded, although highly soluble.
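For comparison with the mg/ml concentrations used elsewhere in this chapter, the molar NMR sample concentration can be converted as in the Python sketch below. It assumes that the mass measured by MALLS in section 5.4.1 (43.24 kDa) is a reasonable monomer mass for the construct; the function name is illustrative.

```python
def micromolar_to_mg_per_ml(conc_um, mass_kda):
    """Convert a molar concentration to a mass concentration.

    1 uM of a 1 kDa species is 1 mg/L, i.e. 0.001 mg/ml.
    """
    return conc_um * mass_kda / 1000.0


MASS_KDA = 43.24          # assumption: MALLS value used as the monomer mass
SAMPLE_VOLUME_ML = 0.5    # 500 ul NMR sample

conc_mg_per_ml = micromolar_to_mg_per_ml(25.0, MASS_KDA)
total_mg = conc_mg_per_ml * SAMPLE_VOLUME_ML

print(f"NMR sample: {conc_mg_per_ml:.2f} mg/ml, {total_mg:.2f} mg in total")
```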


Figure 55. 1H NMR spectrum of hPARG 1-388 protein. The spectrum was recorded at 298 K, using excitation sculpting for water suppression with a spectral width of 16 ppm, centred at 4.7 ppm (the water frequency), and 1024 scans, on the Bruker 800 MHz spectrometer (Avance III) with a 1H/13C/15N TCI cryoprobe.

5.4.4 Pull-down assays of hPARG regulatory domain (1-388) and catalytic domain hPARG (448-976)

The hPARG 1-388 protein was expressed in E. coli, purified as before and concentrated to 26.79 µM. An hPARG catalytic domain expression construct was given to us by the protein expression group led by Edward Mackenzie. The hPARG catalytic gene had been cloned into the pET28a vector to incorporate an N-terminal TEV protease-cleavable 6-His tag. The protein consisted of residues 448–976 of the full-length human protein (529 residues) and had been previously crystallized by Tucker et al. 15. The hPARG 448–976 protein was expressed in E. coli, induced with 500 µM IPTG at 18 ˚C for 16 hours. The protein was released from the cells by sonication in a buffer containing 40 mM HEPES pH 8.0, 0.3 M NaCl, 20 mM imidazole, 10% glycerol and 1 mM DTT. The protein was then purified using nickel affinity chromatography, followed by a desalting column to remove imidazole. The protein was incubated with TEV protease (AcTEV) to cleave and remove the N-terminal His tag, followed by reverse nickel affinity chromatography to isolate the cleaved protein. The cleaved hPARG 448-976 was concentrated to 41 µM.


The hPARG 1-388 protein was applied to a Ni-NTA column to allow binding (via the His-tag) to the Ni-NTA resin (i.e. to act as “bait”). The quantity of regulatory protein was in excess of the catalytic protein, to ensure adequate binding to the Ni-NTA resin and an excess of potential binding sites for the hPARG catalytic protein. The column containing the bound regulatory domain protein was washed with a 15 mM imidazole buffer, after which the catalytic domain protein was added to the column. The flow through was collected, a 15 mM imidazole wash step was performed and the eluate was collected, see figure 56. The column was then washed with a 250 mM imidazole buffer to elute bound proteins. All samples were compared using SDS-PAGE, which revealed no strong association between the two proteins. The flow through and 15 mM wash steps contained hPARG catalytic domain at similar levels to the load sample, indicating that little or none of the protein had bound to the hPARG regulatory protein under the conditions tested.

Figure 56. SDS-PAGE of the pull-down assay fractions, between hPARG regulatory domain (1-388) and catalytic domain hPARG (448-976). Lane 1: Molecular marker, Lanes 2-4: hPARG 1-388 load sample, flow through after Ni-NTA binding, wash step, Lanes 5-8: hPARG 448-976 load sample, flow through, 1st wash step, 2nd wash step, Lane 9: hPARG 1-388 250 mM wash step. The orange arrow highlights the hPARG 1-388 protein and the yellow arrow highlights the hPARG 448-976 protein.


5.5 Crystallization of PARG regulatory domain fragments

5.5.1 Crystallization and initial data collection of hPARG regulatory domain fragments

Crystallization trials of the hPARG 1-460 fragment were performed first, before moving to smaller hPARG truncated forms. Crystallization trials were set up using the sitting drop vapor diffusion method against a number of commercially available crystallization screens at protein concentrations of 5, 10 and 15 mg/ml. The majority of the drops visibly precipitated almost immediately, indicating that the likelihood of crystallization would be low. Crystal trays were incubated at 4 ˚C and checked on a regular basis. Regular checking revealed a large amount of phase separation, but no nucleation. After around 3 months, a few very small crystals were seen in two 5 mg/ml drops, containing 0.1 M HEPES 7.5, 10 % w/v PEG 8000 and 0.2 M potassium bromide, 0.1 M Tris 7.5, 15 % w/v PEG 4000, respectively. Six crystals (3 from each condition) were flash frozen in liquid nitrogen with 10% PEG 400 as a cryoprotectant. X-ray diffraction data were collected at the Diamond Light Source facility (Oxford, UK). Data were reduced and scaled using X-ray Detector Software (XDS) 113. Unfortunately, the best crystal only diffracted to 5 Å, see figure 57. The space group was P21, the overall resolution was 87-5.0 Å, with an I/σI of 5.9 and an overall Rmeas of 0.173.


Figure 57. hPARG 1-460 crystal stored in a cryoloop with corresponding diffraction pattern. (a) The crystal was grown in 0.1 M HEPES 7.5, 10 % w/v PEG 8000. (b) Two snap shots of a typical diffraction pattern obtained.

Optimization trials for hPARG 1-460 were attempted using both of these conditions to improve crystallization and crystal size. A screen was prepared around the two conditions listed above, applying a gradient of pH, percentage of precipitant and/or concentration of salt. Protein was used at 5, 6, 7 and 8 mg/ml. In addition, a seed stock was prepared from hPARG 1-460 crystals and was used in a 1:4 volume ratio in a larger 800 µl sitting drop (200 µl seed stock, 600 µl protein and 800 µl of mother liquor). Drop sizes were increased in an attempt to grow larger crystals. Again, many of the drops precipitated immediately. Trays were incubated at 4, 21 and 30 ˚C and checked at regular intervals. A few crystals were obtained at 4 ˚C, while no crystals were seen at other temperatures. Crystals were again small and took around 2–3 months to grow. Crystals were flash frozen in liquid nitrogen and taken to the Diamond Light Source facility for data collection. Data were reduced and scaled using X-ray Detector Software (XDS) 113. However, the best crystal diffracted only to 3.76 Å, see figure 58. The space group was again determined to be P21, with an overall resolution of 80-3.8 Å, I/σI of 6.3 and overall Rmeas of 0.226.


Figure 58. hPARG 1-460 crystal stored in a cryoloop with corresponding diffraction pattern snap shots. (a) The crystal was grown in 0.2 M HEPES 7.5, 12 % w/v PEG 8000. (b) Snap shot of a typical diffraction pattern.


Calculations based on the Matthews coefficient suggested that the hPARG 1-460 crystals contained approximately 5 hPARG 1-460 molecules in the asymmetric unit, although a higher solvent content (which usually correlates with poorer resolution) could correspond to only 3 or 4 molecules per asymmetric unit. As no homologous structures are available for the hPARG regulatory domain, the structure could not be solved using molecular replacement. Se-Met expression of hPARG 1-460 was attempted in triplicate. Unfortunately, no Se-Met labeled protein expression was achieved.
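The Matthews coefficient is the unit cell volume divided by the total protein mass in the cell, and the solvent fraction is conventionally estimated as 1 - 1.23/Vm. The Python sketch below illustrates the calculation for 3, 4 and 5 molecules per asymmetric unit. The unit cell dimensions are hypothetical placeholders chosen for illustration (the measured cell parameters are not given in this section), and the protein mass is a rough estimate of 110 Da per residue; only the P21 space group, which has two asymmetric units per cell, is taken from the text.

```python
import math


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume of a monoclinic unit cell in cubic angstroms."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(cell_volume, mass_da, n_per_asu, z):
    """Vm in cubic angstroms per dalton for n_per_asu molecules per asymmetric unit."""
    return cell_volume / (mass_da * n_per_asu * z)


def solvent_fraction(vm):
    """Approximate solvent fraction from the Matthews coefficient."""
    return 1.0 - 1.23 / vm


# Hypothetical cell for illustration only, not the measured hPARG 1-460 cell.
volume = monoclinic_cell_volume(90.0, 160.0, 90.0, 100.0)
MASS_DA = 460 * 110.0   # assumption: about 110 Da per residue
Z_P21 = 2               # asymmetric units per cell in space group P21

for n in (3, 4, 5):
    vm = matthews_coefficient(volume, MASS_DA, n, Z_P21)
    print(f"{n} molecules/ASU: Vm = {vm:.2f} A^3/Da, "
          f"solvent = {100 * solvent_fraction(vm):.0f}%")
```

Fewer molecules per asymmetric unit give a larger Vm and a higher solvent fraction, which is the trade-off described in the paragraph above.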

In a further attempt to improve crystallization, trials were prepared with the hPARG 1-388 protein using the commercially available screens and also the two screens optimized from conditions obtained with the hPARG 1-460 protein. Larger 800 µl sitting drop vapour diffusion trays were prepared using 5, 7 and 9 mg/ml hPARG 1-388, with and without addition of hPARG 1-460 seed stock, and trays were stored at 4, 21 and 30 ˚C. As observed for the hPARG 1-460 protein, hPARG 1-388 appeared to precipitate quickly, and crystals appeared after around a month at 7 mg/ml in the 0.1 M HEPES 7.5, 10 % w/v PEG 8000 screen. These crystals were also flash frozen in liquid nitrogen with 10% PEG 200 for data collection at the Diamond Light Source facility, see figure 59. Unfortunately, these crystals did not diffract.

Figure 59. hPARG 1-388 crystal stored in a cryoloop. The crystal was grown in 0.15 M HEPES 7.5, 11 % w/v PEG 8000.

Due to the degradation seen during purification of hPARG truncated variants (section 5.2.4), it was hypothesized that limited proteolysis during incubation of the crystallization drops could lead to further truncation, which might produce forms more amenable to crystallization. Hence, the smaller hPARG variants 1-380, 1-365 and 1-329 were prepared at 5, 7 and 10 mg/ml, and sitting drop vapour diffusion crystal trays were set up against commercial screens and the two screens optimized for hPARG 1-460. Unfortunately, no crystal growth could be seen for these three fragments.

5.5.2 Crystallization trials of other mammalian PARG regulatory domains

As hPARG fragments were extremely difficult and slow to crystallize, and only diffracted to very low resolution, we searched for homologues that may be easier to crystallize. The PARG regulatory domain is only present in higher organisms such as vertebrates and becomes much smaller and more divergent in lower species e.g. chicken PARG (see appendix). Three homologs (from chicken, rat and mouse) were chosen for expression, purification and crystallization trials.

Primers were designed to amplify the rat regulatory domain (rPARG) residues 1-456 and 1-382, the mouse regulatory domain (mPARG) residues 1-438 and 1-387 and the chicken PARG regulatory domain (cPARG) residues 1-421 and 1-361, to match the hPARG 1-460 and 1-388 fragments respectively (see primers in appendix). Following PCR, cloning and protein expression (all similar in procedure to the hPARG variants), only a low level of expression was obtained for the cPARG fragments (not enough to support further crystallization studies). The mPARG and rPARG fragments showed levels of expression similar to the hPARG fragments, see figure 60.


Figure 60. SDS-PAGE gel showing expression of rat, mouse and chicken regulatory domain fragments. Lane 1, Molecular weight marker. Lane 2, mPARG 1-387 pellet. Lane 3, mPARG 1-387 supernatant. Lane 4, mPARG 1-438 pellet. Lane 5, mPARG 1-438 supernatant. Lane 6, Molecular weight marker. Lane 7, rPARG 1-456 supernatant. Lane 8, rPARG 1-382 supernatant. Lane 9, Molecular weight marker. Lane 10, rPARG 1-456 pellet. Lane 11, rPARG 1-382 pellet. Lane 12, cPARG 1-421 pellet. Lane 13, cPARG 1-421 supernatant. Lane 14, cPARG 1-361 pellet. Lane 15, cPARG 1-361 supernatant. Arrows indicate protein of interest.

Mouse and rat PARG variants were purified by nickel affinity and size exclusion and used for crystallization trials. Unfortunately, a significant level of degradation could be observed. Sitting drop vapor diffusion trays were set up against a number of commercially available crystallization screens. Instant precipitation at higher protein concentration was seen. Crystal trays were incubated at 4 ˚C and checked on a regular basis. No crystals could be obtained despite various attempts.

5.6 Chapter five summary and discussion

The aim of the work described in this chapter was to gain insight into the function and structure of the hPARG regulatory domain. As previously mentioned, the structure of the PARG catalytic domain has now been reported for mouse, rat and human PARG. However, very little is known about the regulatory domain, other than that it is involved in the localization of the PARG protein to sites of DNA damage, and may be implicated in the regulation of the enzymatic activity 76,115.

When expressing full-length hPARG in E. coli, the majority of the protein produced corresponds to the catalytic domain, suggesting the regulatory domain is prone to proteolysis. We attempted to clone, express and purify a range of hPARG C-terminally truncated fragments in both E. coli and P. pastoris. The corresponding proteins were purified and characterized by CD and NMR, and were also subjected to crystallization trials. In addition, the regulatory domains of hPARG homologues (chicken, mouse and rat) were also cloned, expressed and tested for crystallisation.

The hPARG fragments 1-460, 1-388, 1-380, 1-365 and 1-329 all appeared to express to a similar level. However, following purification using nickel affinity chromatography, samples contained multiple species that could not be fully removed using size exclusion chromatography. LC-MS analysis of tryptic digests of one of the smaller bands seen in purified hPARG 1-460 confirmed degradation occurs from the C-terminus under the conditions used. This suggests flexible regions exist in the 1-460 protein that are susceptible to proteolysis and degradation. While size exclusion chromatography allowed smaller degradation products to be removed from the full-length protein (which was unfortunately unstable), this was difficult for the isolated regulatory domain, suggesting a possible aggregation of the sample.

The hPARG regulatory domain protein is predicted to have multiple phosphorylation sites, and bacterial expression might not give rise to fully functional protein in the absence of phosphorylation. We tested the use of a eukaryotic yeast expression system as a better source of heterologously produced hPARG fragments. While hPARG regulatory domain protein expression could be detected in both MutS (KM71H) and Mut+ (X33) strains, the protein started to degrade after 48 hours. Furthermore, expression levels were disappointing, and the method was time-consuming. In addition, Western blot results hinted at the presence of degradation products, suggesting that this production method did not improve stability.

Multi-angle light scattering (MALLS) of the purified hPARG 1-388 regulatory protein revealed that the protein existed in solution as a monomer. However, a large population corresponded to aggregated species. Circular dichroism (CD) spectra of the various hPARG regulatory domain forms were collected and revealed that hPARG 1-388 appeared to have the highest percentage of secondary structure. Additional analysis of the hPARG 1-388 protein using 1D proton nuclear magnetic resonance (NMR) gave results to the contrary, suggesting the protein to be largely unfolded. However, it is unusual to see an unfolded protein that is as soluble as hPARG 1-388.

Crystals of hPARG 1-460 could be obtained, as well as an initial 3.8 Å data set. This suggests that hPARG 1-460 contains a folded domain. However, as no homologous structure exists, the structure could not be solved by molecular replacement, and attempts at experimental phase determination using Se-Met labeling failed.

Proteins can exist in three states: the ordered state, the molten globule state and the random coil. While the majority of proteins occupy the ordered state once folded, several proteins have been identified with large regions that are intrinsically ‘molten globule’. In some cases, certain regions remain intrinsically disordered 121. Molten globule proteins are difficult to characterize by NMR or crystallography as they are heterogeneous in structure 122,123. The hPARG regulatory domain was predicted to be largely disordered (figure 38) and our 1D NMR results (figure 55) appear to confirm this. This suggests it is possible that the hPARG regulatory domain exists in a molten globule state.

It would seem implausible that a molten globule state of hPARG would crystallise, as we observe for the hPARG1-460 fragments. The possibility exists that limited proteolysis led to smaller, folded fragments that crystallised. However, it is important to note the crystals might derive from an E. coli protein contaminant. No crystal structures have been reported that have a similar unit cell to that observed for hPARG1-460, suggesting it does not correspond to commonly occurring or easily crystallizable E. coli contaminants. Finally, a complex formed between hPARG1-460 or fragments thereof and E. coli contaminants also remains a possibility.

The association assay between the hPARG regulatory domain and the hPARG catalytic domain was performed to determine whether the regulatory and catalytic regions would associate independently of the linker region. It is also possible that the regulatory domain completes the transition from molten globule to folded structure upon hPARG catalytic domain complex formation. A pull-down assay did not show any evidence for a strong association, suggesting that the regulatory region acts independently of the catalytic region. However, as the hPARG regulatory protein was seen to be partially unfolded in vitro, it cannot be ruled out that these two proteins associate in vivo by means other than the linker region.

As the hPARG 1-460 and 1-388 crystals were very difficult to reproduce due to inherent instability and degradation, PARG regulatory domain homologues from rat, mouse and chicken were expressed, purified and tested for crystallisation. Unfortunately, the chicken PARG regulatory domain did not express, while the rat and mouse PARG regulatory domain proteins were expressed at lower levels compared with hPARG and also appeared to have more degradation. The rat and mouse regulatory domain proteins were used for crystallization trials but could not be crystallized. Future work will have to focus on delineating those PARG regulatory domain fragments that are folded, or that fold following phosphorylation or in the presence of the hPARG catalytic domain or PARG ligands.

Chapter six

Structure determination of luciferase-like mono-oxygenase from a SirTM operon

6.1 Background information

Sirtuins have been identified as a class of evolutionarily conserved NAD+ dependent protein deacetylases that control a number of key cellular processes. In humans, they have been implicated in cellular regulation in aging, repair, cancer and diabetes 124. Recently, a previously unrecognised class of sirtuins possessing ADP-ribosylation activity has been discovered in the microbial pathogens Staphylococcus aureus and Streptococcus pyogenes 20. This class of sirtuins, now designated SirTM, has been found to be genetically linked to a subclass of macrodomain proteins, see figure 61. The latter act to reverse the sirtuin ADP-ribosylation.

Figure 61. Schematic representation of genome arrangements of the macrodomain-sirtuin linked operon from Staphylococcus aureus. The purple arrow represents the gene encoding the old yellow enzyme protein, sav0322. The grey arrow represents the gene encoding the flavin-utilising bacterial luciferase-like monooxygenase, sav0323. The blue arrow represents the gene encoding a glycine cleavage system H-like protein (GcvH-L), sav0324. The green arrows represent the gene encoding the macrodomain protein, sav0325, the red arrows represent the gene encoding the sirtuin protein, sav0326, and the yellow arrow represents the gene encoding a lipoate protein ligase A (LplA2)-sav0326, based on Rack, et al 20.

Furthermore, the SirTM-mediated ADP-ribosylation seen in Staphylococcus aureus and Streptococcus pyogenes is dependent on another post-translational modification, lipoylation, see figure 62. Rack et al. propose that cross talk between these two types of post-translational modification is important for the response of microbial pathogens to the host defence mechanism of oxidative stress 106. Transcriptional analysis of the operon revealed that it was activated under oxidative stress conditions. In S. aureus, the SirTM operon is associated with two putative oxidoreductases, one of which exhibits similarities to flavin-utilising bacterial luciferase-like monooxygenases (LLM) (sav0323) while the other is similar to Old Yellow Enzyme (OYE) type NADH:flavin oxidoreductases (sav0322), see figure 61. Both of these enzymes are currently uncharacterized. However, related proteins have been implicated in detoxification of reactive oxygen species 20, 125.

Figure 62. Schematic diagram of the relationship between the elements of the Staphylococcus extended operon components. Lipoate protein ligase A (LplA2) transfers lipoic acid to the glycine cleavage system H-like (GcvH-L) protein. Following lipoylation, SirTM transfers an ADP-ribose monomer to GcvH-L, releasing the nicotinamide. The macrodomain-associated protein removes the ADP-ribose, based on Rack, et al 20.

The aim of this chapter was to structurally characterise the SAV0323 protein to provide a framework for further study (a structure for SAV0322 was recently deposited in the PDB, code 3L5A). Structural insights will allow us to compare this protein with previously characterised luciferase-like monooxygenase enzymes, and might provide a clue as to the nature of the SAV0323 substrate(s).

6.2 Biophysical Characterization of SAV0323

6.2.1 Expression and purification of SAV0323

A pET 28a plasmid (Invitrogen), containing the sav0323 gene under the control of the T7 promoter and lac operator, was received from the Ahel group. The construct was used to transform E. coli BL21 DE3 cells, and transformants were stored as glycerol stocks. A medium-scale protein expression trial was carried out, inoculating 250 ml of LB broth with 5 ml of overnight culture of E. coli BL21 DE3 pET28a-sav0323 supplemented with kanamycin. The culture was induced with 1 mM IPTG and left for ~16 hours overnight at 23 ˚C. Cells were disrupted by sonication and the SAV0323 protein was purified by Ni-NTA affinity chromatography. After elution with 250 mM imidazole the protein appeared very pure, see figure 63. The eluate was buffer exchanged into 25 mM Tris pH 7.5, 150 mM NaCl and 1 mM DTT buffer. Following confirmation of successful protein production, these expression and purification conditions were used for large-scale expression of the SAV0323 protein, and 12 L of culture produced around 100 mg of protein.

Figure 63. SDS-PAGE analysis of SAV0323 nickel affinity chromatography elution fractions. Lane 1, Molecular marker. Lane 2, SAV0323 lysate flow through from loading the nickel column. Lane 3, SAV0323 flow through wash step with 15 mM imidazole. Lane 4, SAV0323 elution from wash step with 50 mM imidazole. Lane 5, SAV0323 elution from 250 mM imidazole. Arrow indicates protein of interest.

6.2.2 Thermal shift ligand binding assay

SAV0323 is similar to the flavin-utilising bacterial luciferase-like monooxygenases. Hence, we wanted to ascertain whether we could see any potential binding to flavin cofactors. Thermal shift assays were performed to assess whether the addition of FAD and/or FMN would increase the protein melting temperature (Tm). In addition, the ADP-ribosylation mediated by the sirtuins found in Staphylococcus aureus and Streptococcus pyogenes is also dependent on lipoylation. For this reason, we tested a number of other ligands to see if these could improve the stability of the protein: NAD, lipoic acid, lipoamide and ADP-ribose. Unfortunately, no variation could be observed in the SAV0323 Tm, suggesting these ligands did not bind to the enzyme (data not shown). The Tm was 52.8 ˚C for the SAV0323 protein, see figure 64.

Figure 64. Thermal shift assay for SAV0323. Melting temperatures were measured at 575 nm. The 1st derivative shows the Tm readings.
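As an illustration of how a Tm can be read from the first derivative of a melt curve, the following minimal sketch locates the temperature at which dF/dT is greatest. The sigmoidal curve is synthetic (a two-state Boltzmann model with an assumed slope factor) and stands in for real fluorescence data; the function names are hypothetical and this is not the instrument software used in this work.

```python
import math

def boltzmann(temp, tm, slope=1.5, f_min=0.0, f_max=1.0):
    """Idealised two-state melt curve (synthetic stand-in for raw fluorescence)."""
    return f_min + (f_max - f_min) / (1.0 + math.exp((tm - temp) / slope))

def melting_temperature(temps, fluorescence):
    """Return the temperature at which the first derivative dF/dT is largest."""
    best_temp, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        # Central difference approximation of dF/dT at temps[i]
        d = (fluorescence[i + 1] - fluorescence[i - 1]) / (temps[i + 1] - temps[i - 1])
        if d > best_slope:
            best_temp, best_slope = temps[i], d
    return best_temp

# Synthetic scan from 25 to 95 degrees C in 0.2 degree steps, with the Tm set
# to the value reported for SAV0323 (52.8 degrees C).
temps = [25.0 + 0.2 * i for i in range(351)]
signal = [boltzmann(t, 52.8) for t in temps]
tm_estimate = melting_temperature(temps, signal)
print(f"Estimated Tm: {tm_estimate:.1f} C")
```

A ligand that stabilises the protein would shift the derivative maximum to a higher temperature; no such shift was seen here.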

6.2.3 Multi angle light scattering (MALLS)

Multi-angle light scattering (MALLS) was completed by Emma Keevil at the University of Manchester. The DAWNHEOS MALLS spectrophotometer was fitted with a quasi-elastic light scattering detector, to determine molecular mass. SAV0323 protein was buffer exchanged into 25 mM Tris pH 7.5 with 150 mM NaCl and diluted to 0.2 mg/ml. The MALLS results in figure 65 revealed that the protein predominantly existed as a dimer, with a measured molecular weight of 73.91 kDa (±0.264%) (the SAV0323 monomer is 37.15 kDa).
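The dimer assignment follows from dividing the measured mass by the monomer mass. A minimal sketch of that arithmetic, using only the values quoted above (the function name is hypothetical):

```python
def oligomeric_state(measured_kda, monomer_kda):
    """Return the mass ratio and the nearest whole number of subunits."""
    ratio = measured_kda / monomer_kda
    return ratio, round(ratio)

measured_kda = 73.91    # MALLS molar mass
monomer_kda = 37.15     # SAV0323 monomer mass
uncertainty_kda = measured_kda * 0.264 / 100.0   # quoted +/- 0.264 %

ratio, subunits = oligomeric_state(measured_kda, monomer_kda)
print(f"mass ratio = {ratio:.2f} (+/- {uncertainty_kda / monomer_kda:.3f})")
print(f"nearest oligomeric state = {subunits}")
```

The ratio is within about 1% of 2, consistent with a dimer in solution.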

Plot (sample SAV0323 230514): molar mass vs. volume, showing molar mass (g/mol, logarithmic axis from 1.0x10^4 to 1.0x10^7) and dRI against elution volume (12.0 to 17.0 mL).

Figure 65. Multi-angle light scattering (MALLS) chromatogram of SAV0323. A graph representing the molecular mass of SAV0323 as determined by MALLS. The mass distribution profile corresponds to a dimer.

6.3 Elucidation of the SAV0323 protein structure

6.3.1 Crystallization and data collection of native SAV0323

For crystallization trials, SAV0323 was purified using a Ni-NTA column followed by buffer exchange into 25 mM Tris, 150 mM NaCl and 1 mM DTT. There was no requirement for further purification, as the protein fractions obtained appeared very pure, see figure 63. The protein was concentrated to 12, 17 and 22 mg/ml, and these samples were tested against a number of commercially available crystallization screens (Molecular Dimensions). Crystals were obtained by sitting drop vapor diffusion at 17 mg/ml at 4 °C in three similar conditions from Clear Strategy II: (E4) 0.15 M potassium thiocyanate, 0.1 M sodium cacodylate 6.5, 20 % v/v PEG 600; (D5) 0.5 M potassium phosphate monobasic, 0.1 M Tris 7.5; (C4) 0.5 M potassium phosphate monobasic, 0.1 M sodium cacodylate 6.5. In addition, one Morpheus condition gave crystals: (F12) 100 mM monosaccharides II, 0.1 M buffer system 6 pH 8.5, 50 % v/v precipitant mix 8. Unfortunately, all crystals were small, and exhibited multiple nucleation points.

An optimization screen based on the Morpheus condition F12 was prepared, using the previously obtained crystals for micro-seeding. The seed stock was prepared and used in a 1:4 volume ratio in 400 µl sitting drops (50 µl seed stock, 350 µl of SAV0323 protein). SAV0323 was concentrated to 17 and 20 mg/ml, and drops were prepared using a sitting drop vapour diffusion set up. Again, the crystals that were obtained exhibited multiple nucleation sites.

An alternative screen based on the clear strategy conditions was prepared, using the crystals previously obtained in these conditions for micro-seeding. A wider range of protein concentrations was used: 10, 12, 14, 16, 18 and 20 mg/ml. In this case, larger, single crystals were obtained in 0.55 M potassium phosphate monobasic 0.12 M Tris pH 7.5 at 18 mg/ml of protein. Attempts to flash-cool these crystals using PEG 200 proved difficult, as addition of the cryoprotectant affected crystal integrity and lower levels did not fully cryo-protect the crystals. This led to crystals frequently being cracked or covered in ice. Other cryo-protectants were tested (including 10-20% glycerol, Paratone N (cryo-oil), 100% ammonium sulphate) and crystals in each of these conditions were flash frozen in liquid nitrogen.

A complete 2.7 Å data set was obtained from a single crystal in Paratone N using beamline I02 at the Diamond Light Source facility (Oxford, UK), see figure 66. Data were reduced and scaled using X-ray Detector Software (XDS) 113. The space group was P212121 (unit cell a=82.00 Å, b=115.71 Å, c=163.98 Å), with an overall I/σI of 17.9 and Rmeas of 0.093 for data in the range 81.99-2.65 Å. Analysis of the Matthews coefficient predicted 4 molecules in the asymmetric unit. Homology searches in the Protein Data Bank revealed that a bacterial luciferase structure from Vibrio harveyi (PDB:1LUC) had 26% sequence identity to SAV0323, and a luciferase-like monooxygenase from Streptomyces bottropensis had 33% sequence identity (PDB:4US5). Molecular replacement was attempted using the latter structure as a search model. However, the quality of the resulting solution was insufficient to support model refinement. It was therefore decided to solve the structure using Se-MAD on selenomethionine labelled crystals.
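The Matthews coefficient estimate can be reproduced approximately from the unit cell quoted above. The sketch below assumes the 37.15 kDa monomer mass reported in section 6.2.3 (any contribution from the affinity tag is ignored), four symmetry operators for P212121, and the standard approximation that the solvent fraction is 1 - 1.23/VM. Function names are hypothetical, and this is not the CCP4 program used for the analysis.

```python
def matthews_coefficient(a, b, c, monomer_da, n_per_asu, n_symops=4):
    """V_M in cubic angstroms per dalton for an orthorhombic cell (all angles 90 degrees)."""
    cell_volume = a * b * c
    return cell_volume / (n_symops * n_per_asu * monomer_da)

def solvent_fraction(vm):
    """Approximate solvent content, using the usual 1.23 A^3/Da protein constant."""
    return 1.0 - 1.23 / vm

a, b, c = 82.00, 115.71, 163.98   # native SAV0323 cell (angstroms)
monomer_da = 37150.0              # SAV0323 monomer mass (assumed, tag ignored)

results = {}
for n in range(1, 7):
    vm = matthews_coefficient(a, b, c, monomer_da, n)
    results[n] = (vm, solvent_fraction(vm))
    print(f"{n} molecule(s) per ASU: V_M = {vm:.2f} A^3/Da, solvent = {results[n][1]:.0%}")
```

Under these assumptions, four molecules per asymmetric unit give a VM of about 2.6 Å³/Da and roughly 53% solvent, which falls in the range most commonly observed for protein crystals.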

Figure 66. SAV0323 crystal stored in a cryo-loop. The crystal was grown in 0.55 M potassium phosphate monobasic 0.12 M Tris pH 7.5 at 18 mg/ml.

6.3.2 Expression, purification and crystallization of selenomethionine-labelled SAV0323

For production of selenomethionyl proteins in E. coli, cells were pre-cultured in 10 ml LB overnight and used to inoculate the specialized selenomethionine growth medium, supplemented with kanamycin. Cells were grown to an OD600 of 0.6 and then induced with 1 mM IPTG, and the temperature was reduced to 23 ˚C for 18 hours. The Se-Met labelled SAV0323 protein was purified in the same way as the native protein, breaking the cells open using sonication, followed by purification on a Ni-NTA column, see figure 67. The labelled protein appeared to be as pure as the native protein and expressed to similar levels.

Figure 67. SDS-PAGE analysis of nickel affinity chromatography fractions of selenomethionine labelled SAV0323. Lane 1, Molecular marker. Lane 2, Se-Met labelled SAV0323 lysate flow through from loading the nickel column. Lane 3, Se-Met labelled SAV0323 flow through wash step with 15 mM imidazole. Lane 4, Se-Met labelled SAV0323 elution from wash step with 50 mM imidazole. Lane 5, Se-Met labelled SAV0323 elution from 250 mM imidazole. Lane 6, additional elution of Se-Met labelled SAV0323 from 250 mM imidazole. Lane 9, non-labelled native SAV0323 as positive control.

For crystallization trials, selenomethionine-labelled SAV0323 was purified using a Ni-NTA column followed by buffer exchange into 25 mM Tris, 150 mM NaCl and 1 mM DTT. Crystals were obtained by sitting drop vapor diffusion at 18 mg/ml at 4 °C in the optimised Clear Strategy II screen: 0.50 M potassium phosphate monobasic and 0.11 M Tris pH 7.5. Crystals took a little longer to grow, ~1-2 weeks compared with a few days for the native crystals. Multiple crystals were cryo-protected in Paratone N and flash frozen in liquid nitrogen.

6.3.3 Structure elucidation of SAV0323

Multiple high redundancy data sets, ranging between 3.8 and 3.2 Å in resolution, were collected at the selenium edge (wavelength 0.9793 Å) from single Se-Met crystals (see figure 68), using beamline I04 at the Diamond Light Source facility (Oxford, UK). Data were reduced and scaled using X-ray Detector Software (XDS) 113. The space group was P212121 (unit cell a=83.02 Å, b=113.91 Å, c=159.12 Å), with an overall I/σI of 17.1 and Rmeas of 0.218.

Figure 68. Selenomethionine labelled SAV0323 crystal stored in a cryo-loop. Crystal was grown in 0.50 M Potassium phosphate monobasic and 0.11 M Tris pH 7.5 at 18 mg/ml.

Although the Se-Met crystals had the same space group as the native enzyme, the unit cell parameters were sufficiently different to preclude the use of any isomorphous replacement signal between both data sets (see table 12). Initial selenium sites were found with Auto-Rickshaw from the Se-Met anomalous signal, using a SAD approach. While the SAD phases did allow solvent boundaries to be detected, the Se-Met signal proved too weak and the Se-Met data resolution too low to support map interpretation and/or automated model building.
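The degree of non-isomorphism can be gauged from the fractional change in each cell edge. The sketch below compares the native cell listed in table 12 with the Se-Met cell from section 6.3.3. The 1% cutoff is only an assumed rule of thumb for illustration, not a criterion applied in this work, and the function name is hypothetical.

```python
def percent_changes(reference, other):
    """Percentage change of each cell edge relative to the reference cell."""
    return [100.0 * (o - r) / r for r, o in zip(reference, other)]

native_cell = (82.99, 115.72, 163.99)   # a, b, c in angstroms (table 12)
semet_cell = (83.02, 113.91, 159.12)    # a, b, c in angstroms (Se-Met crystal)

changes = percent_changes(native_cell, semet_cell)
for axis, change in zip("abc", changes):
    print(f"{axis}: {change:+.2f} %")

# Assumed rule-of-thumb cutoff: edge changes beyond about 1 % are taken here
# to indicate that the two crystals are not usefully isomorphous.
ISOMORPHISM_CUTOFF_PERCENT = 1.0
non_isomorphous = any(abs(ch) > ISOMORPHISM_CUTOFF_PERCENT for ch in changes)
print("non-isomorphous by this cutoff:", non_isomorphous)
```

The b and c edges shrink by roughly 1.6% and 3.0% respectively, which is why the two data sets were combined by multi-crystal averaging instead of being treated as an isomorphous pair.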

A multi-crystal averaging density modification procedure including NCS averaging was then used to provide phases of sufficient quality for the higher diffracting un-labelled SAV0323 form. Briefly, this included using the low quality molecular replacement model for both crystal forms to determine the initial monomer mask and NCS matrices, followed by multiple rounds of density modification and NCS averaging, including mask and NCS matrix refinement. This procedure resulted in phases of sufficient quality to support map interpretation and initial autobuilding of the SAV0323 model using Buccaneer 112. This model was further refined against the 2.8 Å data using Refmac5 109,110 and manual model building using Coot 111. Final data and structure refinement statistics are shown in table 12.

Data collection
    Space group                    P212121
    Cell dimensions a, b, c (Å)    82.99, 115.72, 163.99
    Resolution (Å)                 52.8-2.70
    Rmeas                          0.093
    I/σI                           17.9
    Completeness (%)               100
    Redundancy                     6.2

Refinement
    No. of reflections             41542
    Rwork/Rfree                    0.25/0.29
    No. of atoms                   9510
    B-factors (Å²)                 53.6
    R.m.s. deviations
        Bond lengths (Å)           0.015
        Bond angles (°)            1.649

Table 12. Crystallographic data and model refinement parameters for SAV0323. Values in parentheses indicate values obtained for the highest resolution shell. Equations for Rmeas and Rwork/Rfree are given below. For Rwork and Rfree, the formula for both is the same, Rwork is calculated for the working set, whereas Rfree is calculated for the test set.
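For reference, the conventional definitions of these statistics are sketched here, on the assumption that these are the forms intended. $I_i(hkl)$ is the $i$-th of $n$ observations of reflection $hkl$, $\langle I(hkl) \rangle$ is their mean, and $F_{\mathrm{obs}}$ and $F_{\mathrm{calc}}$ are the observed and calculated structure factor amplitudes.

```latex
\[
R_{\mathrm{meas}} =
  \frac{\sum_{hkl} \sqrt{\frac{n}{n-1}} \, \sum_{i=1}^{n} \left| I_i(hkl) - \langle I(hkl) \rangle \right|}
       {\sum_{hkl} \sum_{i=1}^{n} I_i(hkl)}
\]

\[
R_{\mathrm{work}},\ R_{\mathrm{free}} =
  \frac{\sum_{hkl} \bigl|\, |F_{\mathrm{obs}}(hkl)| - |F_{\mathrm{calc}}(hkl)| \,\bigr|}
       {\sum_{hkl} |F_{\mathrm{obs}}(hkl)|}
\]
```

The second sum runs over the working set of reflections for Rwork and over the test set for Rfree.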

6.4 Discussion

The final structure of SAV0323 contains four molecules within the asymmetric unit (AU), with an Rwork/Rfree of 0.25/0.29. The tetramer within the AU is made up of a dimer of dimers, see figure 69. Analysis using PISA from CCP4 showed that only the dimer is a genuine assembly, while the tetramer is crystallographic. This result corresponds with the MALLS results, which show that the solution state is predominantly dimeric. Several loop regions of the structure are highly flexible (green arrows in figure 69). This might explain the low resolution obtained for these crystals, which only diffracted to 2.8 Å. The SAV0323 monomer adopts a TIM barrel fold consisting of eight α-helices and eight parallel β-strands 126.

Figure 69. Overall structure of SAV0323. The four subunits can be seen as a dimer of dimers. Four individual subunits are shown in different colours. Chain A, α-subunit is gold, chain B β-subunit is blue, chain C α-subunit is orange and chain D β-subunit is in green; the green arrows indicate the flexible regions. Images were drawn using the QTMG software, part of the CCP4i suite.

A structural homology search using the DALI server 127 revealed that the closest homolog is the recently crystallised MsnO8 protein, a bacterial luciferase-like monooxygenase from Streptomyces bottropensis (PDB:4US5) 128, see figure 70. There is 34% structural identity between these proteins, with an r.m.s.d. value of 2.3 Å and a Z score of 34.1 (for 287 Cα atoms). Other hits with Z scores over 20 included an FMN-bound bacterial luciferase (PDB:3FGC), a bacterial monooxygenase (PDB:2I7G) and a bacterial luciferase (PDB:1LUC).

Unfortunately, we were unable to obtain a ligand structure of SAV0323 with FMN. Attempts to co-crystallise with FMN impeded crystal formation, and ligand soaking attempts were unsuccessful.

Figure 70. Structural comparison of SAV0323 and MsnO8 (PDB: 4US5). A superimposition of SAV0323 (in gold and blue) and MsnO8 (in grey) revealing significant structural similarity. Images were drawn using the QTMG software, part of the CCP4i suite.

The SAV0323 protein is similar in structure to the FMN-bound bacterial luciferase structure (PDB:3FGC) (structural similarity with an r.m.s.d. of 2.9 Å and a Z score of 25.8), see figure 71. In this heterodimeric luciferase, the α subunit has been reported to contain the active site 129, with the β subunit essential for high efficiency 130.

Figure 71. Structural comparison of SAV0323 and a bacterial luciferase (PDB:3FGC). (a) A superimposition of SAV0323 (in gold) and bacterial luciferase (in red) reveals structural similarity and the presence of the FMN ligand in the α subunit, highlighted by the green arrow. The orange arrow highlights the bacterial luciferase ‘mobile loop’. (b) Close up of the FMN binding region; the missing loop region of the SAV0323 structure is represented by the orange dashed line. Key residues from this loop are labelled for the bacterial luciferase. (c) Sequence alignment of bacterial luciferase (PDB:3FGC) and SAV0323; the red box highlights the key loop region of 3FGC. Structural images were drawn using the QTMG software, part of the CCP4i suite.

It appeared that the residues forming a key loop region (106-112) were missing from the potential FMN binding site in SAV0323, see figure 71c, as indicated by the orange dashed line in figure 71b. The residues in this loop (106-112) of the bacterial luciferase (PDB: 3FGC) have been implicated in ligand coordination: Phe110 stacks against the isoalloxazine ring, Cys106 projects toward the site of flavin oxidation, and Arg107 has been reported to interact directly with the 5’ phosphate group of FMN. In addition, mutations to Cys106 rendered the bacterial luciferase enzyme inactive 131. Furthermore, SAV0323 does not contain what is known as the ‘mobile loop’, a loop region within the bacterial luciferase α-subunit that stretches from β strand 7 to α helix 7 (orange arrow in figure 71a). This loop has been reported to be very protease labile, and binding of FMN protects the enzyme from proteolytic inactivation 131,132.

Increased specificity for the reduced form of riboflavin has been previously reported and can be common for bacterial luciferase-like proteins. Luciferases tend to utilize FMNH2 as a substrate and not as a prosthetic group 131, and it has been reported that these proteins bind FMN more weakly than FMNH2 129. This could account for why we did not see a shift in Tm when we added FMN to SAV0323 (section 6.2.2). Future work will include attempting to obtain an FMN or FMNH2 bound structure of SAV0323 and/or establishing that FMNH2 binds to the protein.

Chapter Seven

Discussion

Since the initial discovery of post-translational modification of proteins via ADP-ribosylation 29, research into understanding poly ADP-ribose formation, recognition and removal has made significant progress. In fact, given the therapeutic relevance of PAR metabolism/signalling, this represents a growing area of research.

Until recently, much more was known about PAR formation (via PARP) and PAR recognition (macrodomains and other ‘readers’) than about PAR removal (via PARG). However, the last 4-5 years have seen the first detailed structural and mechanistic insights into the key eraser proteins involved in PAR degradation 11,13,15,16,77. Furthermore, PAR metabolism has now been found in both eukaryotes and prokaryotes, and PARG structures from a bacterium, a protozoan and vertebrates (including human) have been reported. In addition, it has now become clear that the terminal mono-ADP-ribose attached to the protein is removed by MACROD1/MACROD2 11,13,15,16,77. The corresponding crystal structures have provided a framework for further studies aimed at mechanistic understanding, as well as ligand and drug binding, see figure 72.

The PARG structures reported reveal a macrodomain type fold containing a specific PARG catalytic loop present in all PARGs 11,13,15,16,77. This macrodomain fold within the PARG protein was initially discovered by Slade, et al for the bacterial T. curvata PARG 13. This structure revealed that the PARG specific catalytic loop was inserted into the macrodomain fold 13. The recent structure of MACROD2 revealed it had a distinct, MACROD2 specific, catalytic loop inserted into the macro fold. This loop is responsible for the ability to remove the terminal ADP-ribose unit after the PAR degradation by PARG 60. Despite the common macrodomain fold, neither of these two enzymes can perform both reactions 8. We predicted that if the loop region was the sole determinant of catalytic activity, then hybrid proteins containing the macrodomain of one enzyme with the loop sequence specific for the other should display activity corresponding to the loop sequence donor. Unfortunately, the hybrid proteins produced were particularly susceptible to degradation and were unable to bind ADP-ribose (figures 23 and 24). This suggests the hybrid proteins were partially unfolded, and that the catalytic loop requires specific interactions with the cognate macrodomain to adopt the correct structure.

The PARG catalytic loop contains the two key residues needed for catalytic activity, E114 and E115 13. We mutated the E114 residue to either an alanine or a glutamine, which changed the ligand binding specificity, causing tight binding to FAD in preference over ADP-ribose. In the proposed catalytic mechanism, the function of E114 seems limited to ligand binding. However, our results suggest that for the bacterial PARG protein, the E114 residue might be critical for ligand binding preference and preventing FAD binding in vivo.

While the bacterial PARG arguably represents the minimal PARG catalytic module, the eukaryotic PARGs (or canonical PARGs) contain an additional N-terminal domain that is essential for catalytic activity 16. While the active site of bacterial and canonical PARGs is highly conserved 11, the exact nature of the substrate-enzyme complex for each enzyme was not well understood. Our structure of canonical PARG in complex with PAR polymers revealed the first insights into PAR binding at the atomic level, which is intended to encourage other structural studies aimed at increasing understanding of PAR-protein interaction networks in genome stability 11. We were able to crystallize the catalytically inactive PARG protein with various lengths of PAR, all of which resulted in an exo-binding mode, with only the two terminal ADP-ribose units visible in the binding site. This indicates that canonical PARG is inherently predisposed to act as an exo-glycohydrolase. Further evidence to support this included the fact that mutation of a conserved phenylalanine residue led to a higher level of endo-glycohydrolase products, suggesting that under normal physiological conditions ADP-ribose will be the dominant PARG product 11.

Recent work by the Ahel group provided the structure of the human PARG catalytic domain bound to a chemically synthesised dimeric ADP-ribose ligand that mimics the two terminal ADP-ribose units. This provides a more accessible method for obtaining protein:PAR complex structures in future studies. Furthermore, the study revealed many similarities in PAR binding with the TTPARG:PAR complex, and reinforces our suggestion that binding of branched PAR fragments is unlikely due to steric hindrance 133. The similarities between the human PARG:PAR2 and the TTPARG:PAR9 complexes confirm that canonical PARG acts predominantly as an exo-glycohydrolase. We propose that stress conditions might give rise to unusual PARG/PAR ratios, leading to activation of the latent PARG endo-glycohydrolase activity 11. Hence, under stress conditions, PARG has the capacity to produce larger oligomers of PAR that could ultimately lead to parthanatos, see figure 72.

Parthanatos is a recently discovered, distinct form of cell death that occurs after overactivation of the PARP enzyme producing PAR, usually in response to high levels of DNA damage 134. The importance of PARG in parthanatos has previously been studied using genetic deletion, pharmacological and overexpression approaches, in which PARG was found to protect against PAR-mediated cell death 135. During parthanatos, PAR accumulates in the nucleus; however, the question remains how protein-free PAR causes the release of AIF to initiate cell death 8,101. It was predicted that TARG may contribute to parthanatos 101. However, the discovery that canonical PARG may also release chains of PAR during cell stress 11 suggests PARG can also be involved in parthanatos-mediated cell death, see figure 72. The structure of canonical PARG bound to PAR greatly improves our understanding of PAR degradation and provides a more detailed basis for structure-based drug design.

Despite the recent influx of canonical PARG structures, no structure is available for a full-length vertebrate PARG. The latter enzymes contain a regulatory domain that is believed to be involved in regulating the catalytic activity of the protein, and silencing this domain increases the cellular response to genotoxic stress. This suggests that understanding this region of the protein is key to the future development of drugs and to understanding how these may interact with other proteins 68,84. Structural and biophysical information on the regulatory region of vertebrate PARG would provide further insight into PARG regulation.

Our attempts to purify and crystallise the full-length human PARG protein proved unsuccessful. The protein rapidly degraded, and was often reduced in size to the catalytic domain region. This suggests that the two regions (regulatory and catalytic) are likely joined by a flexible linker region, which may be sensitive to proteolysis. This was further supported by the lack of association seen between the regulatory and catalytic domains in our pull down assay. Unfortunately, additional studies on the isolated human PARG regulatory region proved difficult, as the corresponding protein degraded easily after purification from E. coli. Furthermore, only low levels of expression could be obtained from P. pastoris. Some of our biophysical studies suggested that this region may exist as a ‘molten globule’ and is possibly highly disordered. In contrast, we obtained crystals of the regulatory domain, although these crystals were difficult to reproduce. It is possible these were the result of limited proteolysis within the drop.

Our understanding of PARylation and the roles of the various proteins involved in this process increases as we continue to obtain atomic-level insights. The current view of PARylation is continuously changing, and key questions remain, see figure 72. The structure of a protein-ADP-ribose complex does not allow us to determine the relative PAR and ADP-ribose affinities 8. In addition, if the protein binds to PAR, it can be difficult to predict in what mode (i.e. exo versus endo). Various pull down assays have suggested PAR binding potential for specific proteins 136, but structures of these proteins bound to PAR or ADP-ribose would provide more information. This was the case for TTPARG, which we found predominantly binds PAR in the exo-position. This could not be predicted from the ADP-ribose complex structure alone. Structures of a number of other key proteins involved in PARylation and PAR degradation have recently been solved. Most of these structures, however, are only bound to ADP-ribose, e.g. MACROD2 and TARG1 60,80 (PDBs 4IQY and 4J5S respectively), see figure 72. It will be important to gain structural insights into their interactions with the PAR polymer. In addition, the crystal structure of PARP protein bound to DNA has recently been solved (PDB: 4DQY). However, we still do not have a structure of a PARylated protein in complex with a reader/eraser. Such a structure would allow us to determine whether any of the key eraser proteins interact with the protein as well as the PAR chain. This is particularly relevant for the TARG and MACRO proteins, which are predicted to bind at opposite terminal ADP-ribose units of the PAR chain 8.


Figure 72. Schematic representation of PARylation and PAR degradation with structures of key proteins involved: PARP1 bound to DNA double strand break (PDB: 4DQY), canonical PARG bound to PAR (PDB: 4L2H), TARG1 bound to ADP-ribose (PDB: 4J5S) and MACROD2 bound to ADP-ribose (PDB: 4IQY). PARP binds to DNA after damage, which promotes PARylation. Over-activation of PARylation can lead to PAR-mediated cell death ‘parthanatos’. PARG breaks down the PAR chains into ADP-ribose monomers and longer PAR oligomers that can lead to parthanatos. TARG is predicted to remove the PAR chain from the root of the PARylated protein, which also leads to parthanatos. The PARG regulatory domain is still largely uncharacterised. MACROD1 and MACROD2 remove the terminal ADP-ribose monomer after PARG degradation.

The emerging role of ADP-ribosylation in bacteria has recently been extended to include bacterial pathogenesis 104,105. A new group of bacterial sirtuins that function as ADP-ribosyl transferases was shown to be genetically linked to a subclass of macrodomains. The latter catalyse the reversal of this ADP-ribosylation. In addition, this ADP-ribosylation requires another posttranslational modification: lipoylation.


Lipoylation of the GcvH-L protein is required in order for the protein to be ADP-ribosylated. Up-regulation of the operonal macrodomain in S. aureus usually occurs after internalisation into human cells 136, indicating that de-ADP-ribosylation is the main pathway. A genetic linkage is observed with two uncharacterised oxidoreductases in S. aureus and S. pyogenes, one of which exhibits similarity with flavin-utilising bacterial luciferase-like monooxygenases (sav0323) and the other with Old Yellow Enzyme (OYE) type NADH:flavin oxidoreductases (sav0322) 106. It is proposed that in situations of oxidative stress, these two oxidoreductases (SAV0323 and SAV0322) interact with GcvH-L; however, when the GcvH-L protein is ADP-ribosylated (non-stress conditions), this interaction is inhibited, see figure 73. This indicates that ADP-ribosylation acts as a ‘keep off’ mechanism for these oxidoreductases in times of low stress 20.

Our structure determination of SAV0323 confirmed the homology with other bacterial luciferase enzymes, although many of the active site loops were disordered. This makes it difficult to predict a substrate or an interaction mode with the GcvH-L protein. We now have structures of many of the key proteins of the SirTM operon; however, we are missing the macrodomain protein, which currently has no homologues deposited in the PDB above 40% identity, and the LplA2 protein, which likewise has no homologues listed in the PDB above 38% identity, see figure 73. When the bacterium is internalised into human cells, the pathogenesis mechanism relies on the up-regulation of the macrodomain gene and higher rates of ADP-ribose removal. Currently, we can only speculate that this de-ADP-ribosylation allows the interaction of the oxidoreductases with the protein 106. The exact mechanism by which these proteins (SAV0323 and SAV0322) interact with GcvH-L needs to be further explored. A complete picture of this system is needed to identify therapeutic targets for the treatment of these virulent bacteria, see figure 73.


Figure 73. Schematic representation of translation and transcription of the newly discovered SirTM operon and the proposed mechanisms for the proteins involved: GcvH-L (SAV0327) (PDB:5A35), Sirtuin (SAV0326) (PDB:5A3A), OYE (SAV0322) and LLM (SAV0323) (PDB:3L5A). LplA2 lipoylates the GcvH-L protein prior to the ADP-ribosylation of the now lipoylated GcvH-L protein by the sirtuin protein. This ADP-ribosylation inhibits the association of the two oxidoreductase enzymes, SAV0322 and SAV0323. The macrodomain protein removes the ADP-ribose monomer, and releases the inhibition of the oxidoreductases.

As always, the increased understanding of PAR metabolism/posttranslational modification has led to new questions. Until recently, bacteria were considered to be devoid of PAR metabolism as they were thought to lack PARP and PARG enzymes. However, recent studies including our own have identified PARP and PARG homologs in bacteria 13. We now know that over 50 bacterial species have been identified as having divergent PARG-type proteins 1. The role these enzymes and the associated PAR metabolism play in these bacteria is currently unclear.

Various studies have shown that inhibiting PARG leads to increased cell death 83–85, making PARG a key therapeutic target for cancer and other diseases that are correlated with increased levels of DNA damage (leading to overproduction of PAR). The recent structures of hPARG and other canonical PARG proteins have revealed key ligand-binding site information, providing a blueprint for the design of new drugs to inhibit the catalytic activity of this protein. The structure and function of the vertebrate-specific regulatory domain remain unclear. This region is responsible for localising PARG to poly ADP-ribosylated proteins 76. How this region specifically recognises PARylated proteins is also unclear.


[1] D. Perina, A. Mikoč, J. Ahel, H. Ćetković, R. Žaja, and I. Ahel, “Distribution of protein poly(ADP-ribosyl)ation systems across all domains of life.,” DNA Repair (Amst)., vol. 23, pp. 4–16, Nov. 2014.

[2] M. O. Hottiger, P. O. Hassa, B. Lüscher, H. Schüler, and F. Koch-Nolte, “Toward a unified nomenclature for mammalian ADP-ribosyltransferases.,” Trends Biochem. Sci., vol. 35, no. 4, pp. 208–19, Apr. 2010.

[3] T. El-Hamoly, C. Hegedűs, P. Lakatos, K. Kovács, P. Bai, M. A. El-Ghazaly, E. S. El-Denshary, É. Szabó, and L. Virág, “Activation of poly(ADP-ribose) polymerase-1 delays wound healing by regulating keratinocyte migration and production of inflammatory mediators.,” Mol. Med., vol. 20, pp. 363–71, Jan. 2014.

[4] L. Virág, A. Robaszkiewicz, J. M. Rodriguez-Vargas, and F. J. Oliver, “Poly(ADP-ribose) signaling in cell death.,” Mol. Aspects Med., vol. 34, no. 6, pp. 1153–67, Dec. 2013.

[5] B. A. Gibson and W. L. Kraus, “New insights into the molecular and cellular functions of poly(ADP-ribose) and PARPs.,” Nat. Rev. Mol. Cell Biol., vol. 13, no. 7, pp. 411–24, Jul. 2012.

[6] W. L. Kraus, “Transcriptional control by PARP-1: chromatin modulation, enhancer-binding, coregulation, and insulation.,” Curr. Opin. Cell Biol., vol. 20, no. 3, pp. 294–302, Jun. 2008.

[7] D. D’Amours, S. Desnoyers, I. D’Silva, and G. G. Poirier, “Poly(ADP-ribosyl)ation reactions in the regulation of nuclear functions.,” Biochem. J., vol. 342 (Pt 2), pp. 249–68, Sep. 1999.

[8] E. Barkauskaite, G. Jankevicius, A. G. Ladurner, I. Ahel, and G. Timinszky, “The recognition and removal of cellular poly(ADP-ribose) signals.,” FEBS J., vol. 280, no. 15, pp. 3491–507, Aug. 2013.

[9] T. Karlberg, M.-F. Langelier, J. M. Pascal, and H. Schüler, “Structural biology of the writers, readers, and erasers in mono- and poly(ADP-ribose) mediated signaling.,” Mol. Aspects Med., vol. 34, no. 6, pp. 1088–108, Dec. 2013.

[10] F. Koch-Nolte, S. Kernstock, C. Mueller-Dieckmann, M. S. Weiss, and F. Haag, “Mammalian ADP-ribosyltransferases and ADP-ribosylhydrolases.,” Front. Biosci., vol. 13, pp. 6716–29, Jan. 2008.

[11] E. Barkauskaite, A. Brassington, E. S. Tan, J. Warwicker, M. S. Dunstan, B. Banos, P. Lafite, M. Ahel, T. J. Mitchison, I. Ahel, and D. Leys, “Visualization of poly(ADP-ribose) bound to PARG reveals inherent balance between exo- and endo-glycohydrolase activities.,” Nat. Commun., vol. 4, p. 2164, Jan. 2013.

[12] P. O. Hassa and M. O. Hottiger, “The diverse biological roles of mammalian PARPS, a small but powerful family of poly-ADP-ribose polymerases.,” Front. Biosci., vol. 13, pp. 3046–82, Jan. 2008.


[13] D. Slade, M. S. Dunstan, E. Barkauskaite, R. Weston, P. Lafite, N. Dixon, M. Ahel, D. Leys, and I. Ahel, “The structure and catalytic mechanism of a poly(ADP-ribose) glycohydrolase.,” Nature, vol. 477, no. 7366, pp. 616–20, Sep. 2011.

[14] P. O. Hassa, S. S. Haenni, M. Elser, and M. O. Hottiger, “Nuclear ADP-ribosylation reactions in mammalian cells: where are we today and where are we going?,” Microbiol. Mol. Biol. Rev., vol. 70, no. 3, pp. 789–829, Sep. 2006.

[15] J. A. Tucker, N. Bennett, C. Brassington, S. T. Durant, G. Hassall, G. Holdgate, M. McAlister, J. W. M. Nissink, C. Truman, and M. Watson, “Structures of the Human Poly (ADP-Ribose) Glycohydrolase Catalytic Domain Confirm Catalytic Mechanism and Explain Inhibition by ADP-HPD Derivatives,” PLoS One, vol. 7, no. 12, 2012.

[16] M. S. Dunstan, E. Barkauskaite, P. Lafite, C. E. Knezevic, A. Brassington, M. Ahel, P. J. Hergenrother, D. Leys, and I. Ahel, “Structure and mechanism of a canonical poly(ADP-ribose) glycohydrolase.,” Nat. Commun., vol. 3, p. 878, Jan. 2012.

[17] I.-K. Kim, J. R. Kiefer, C. M. W. Ho, R. A. Stegeman, S. Classen, J. A. Tainer, and T. Ellenberger, “Structure of mammalian poly(ADP-ribose) glycohydrolase reveals a flexible tyrosine clasp as a substrate-binding element,” Nat. Struct. Mol. Biol., vol. 19, no. 6, pp. 653–656, 2012.

[18] Z. Wang, J.-P. Gagné, G. G. Poirier, and W. Xu, “Crystallographic and biochemical analysis of the mouse poly(ADP-ribose) glycohydrolase.,” PLoS One, vol. 9, no. 1, p. e86010, 2014.

[19] S. Vyas, I. Matic, L. Uchima, J. Rood, R. Zaja, R. T. Hay, I. Ahel, and P. Chang, “Family-wide analysis of poly(ADP-ribose) polymerase activity.,” Nat. Commun., vol. 5, p. 4426, Jan. 2014.

[20] J. G. M. Rack, R. Morra, E. Barkauskaite, R. Kraehenbuehl, A. Ariza, Y. Qu, M. Ortmayer, O. Leidecker, D. R. Cameron, I. Matic, A. Y. Peleg, D. Leys, A. Traven, and I. Ahel, “Identification of a Class of Protein ADP-Ribosylating Sirtuins in Microbial Pathogens.,” Mol. Cell, vol. 59, no. 2, pp. 309–20, Jul. 2015.

[21] S. P. Jackson and J. Bartek, “The DNA-damage response in human biology and disease.,” Nature, vol. 461, no. 7267, pp. 1071–8, Oct. 2009.

[22] C. Beck, I. Robert, B. Reina-San-Martin, V. Schreiber, and F. Dantzer, “Poly(ADP-ribose) polymerases in double-strand break repair: focus on PARP1, PARP2 and PARP3.,” Exp. Cell Res., vol. 329, no. 1, pp. 18–25, Nov. 2014.

[23] P. Sung and H. Klein, “Mechanism of homologous recombination: mediators and helicases take on regulatory functions.,” Nat. Rev. Mol. Cell Biol., vol. 7, no. 10, pp. 739–50, Oct. 2006.


[24] M. R. Lieber, Y. Ma, U. Pannicke, and K. Schwarz, “Mechanism and regulation of human non-homologous DNA end-joining.,” Nat. Rev. Mol. Cell Biol., vol. 4, no. 9, pp. 712–20, Sep. 2003.

[25] M. Bétermier, P. Bertrand, and B. S. Lopez, “Is non-homologous end-joining really an inherently error-prone process?,” PLoS Genet., vol. 10, no. 1, p. e1004086, Jan. 2014.

[26] K. W. Caldecott, “Protein ADP-ribosylation and the cellular response to DNA strand breaks.,” DNA Repair (Amst)., vol. 19, pp. 108–13, Jul. 2014.

[27] E. Barkauskaite, G. Jankevicius, and I. Ahel, “Structures and Mechanisms of Enzymes Employed in the Synthesis and Degradation of PARP-Dependent Protein ADP-Ribosylation.,” Mol. Cell, vol. 58, no. 6, pp. 935–46, Jun. 2015.

[28] M. O. Hottiger, “ADP-ribosylation of histones by ARTD1: an additional module of the histone code?,” FEBS Lett., vol. 585, no. 11, pp. 1595–9, Jun. 2011.

[29] P. Chambon, J. D. Weill, and P. Mandel, “Nicotinamide mononucleotide activation of new DNA-dependent polyadenylic acid synthesizing nuclear enzyme.,” Biochem. Biophys. Res. Commun., vol. 11, pp. 39–43, Apr. 1963.

[30] S. L. Rulten, A. E. O. Fisher, I. Robert, M. C. Zuma, M. Rouleau, L. Ju, G. Poirier, B. Reina-San-Martin, and K. W. Caldecott, “PARP-3 and APLF function together to accelerate nonhomologous end-joining.,” Mol. Cell, vol. 41, no. 1, pp. 33–45, Jan. 2011.

[31] C. Boehler, L. R. Gauthier, O. Mortusewicz, D. S. Biard, J.-M. Saliou, A. Bresson, S. Sanglier-Cianferani, S. Smith, V. Schreiber, F. Boussin, and F. Dantzer, “Poly(ADP-ribose) polymerase 3 (PARP3), a newcomer in cellular response to DNA damage and mitotic progression.,” Proc. Natl. Acad. Sci. U. S. A., vol. 108, no. 7, pp. 2783–8, Feb. 2011.

[32] R. Krishnakumar and W. L. Kraus, “The PARP side of the nucleus: molecular actions, physiological outcomes, and clinical targets.,” Mol. Cell, vol. 39, no. 1, pp. 8–24, Jul. 2010.

[33] J. Yélamos, V. Schreiber, and F. Dantzer, “Toward specific functions of poly(ADP-ribose) polymerase-2,” Trends Mol. Med., vol. 14, no. 4, pp. 169–178, Apr. 2008.

[34] M.-F. Langelier, J. L. Planck, S. Roy, and J. M. Pascal, “Crystal structures of poly(ADP-ribose) polymerase-1 (PARP-1) zinc fingers bound to DNA: structural and functional insights into DNA-dependent PARP-1 activity.,” J. Biol. Chem., vol. 286, no. 12, pp. 10690–701, Mar. 2011.

[35] M.-F. Langelier, J. L. Planck, S. Roy, and J. M. Pascal, “Structural basis for DNA damage-dependent poly(ADP-ribosyl)ation by human PARP-1.,” Science, vol. 336, no. 6082, pp. 728–32, May 2012.


[36] J.-F. Haince, D. McDonald, A. Rodrigue, U. Déry, J.-Y. Masson, M. J. Hendzel, and G. G. Poirier, “PARP1-dependent kinetics of recruitment of MRE11 and NBS1 proteins to multiple DNA damage sites.,” J. Biol. Chem., vol. 283, no. 2, pp. 1197–208, Jan. 2008.

[37] H. Kleine, E. Poreba, K. Lesniewicz, P. O. Hassa, M. O. Hottiger, D. W. Litchfield, B. H. Shilton, and B. Lüscher, “Substrate-assisted catalysis by PARP10 limits its activity to mono-ADP-ribosylation.,” Mol. Cell, vol. 32, no. 1, pp. 57–69, Oct. 2008.

[38] S. Vyas, I. Matic, L. Uchima, J. Rood, R. Zaja, R. T. Hay, I. Ahel, and P. Chang, “Family-wide analysis of poly(ADP-ribose) polymerase activity.,” Nat. Commun., vol. 5, p. 4426, Jan. 2014.

[39] D. D’Amours, S. Desnoyers, I. D’Silva, and G. G. Poirier, “Poly(ADP-ribosyl)ation reactions in the regulation of nuclear functions,” Biochem. J., vol. 342, no. 2, pp. 249–268, Sep. 1999.

[40] Q. Deng and J. T. Barbieri, “Molecular mechanisms of the cytotoxicity of ADP-ribosylating toxins.,” Annu. Rev. Microbiol., vol. 62, pp. 271–88, Jan. 2008.

[41] A. H. Davies, A. K. Roberts, C. C. Shone, and K. R. Acharya, “Super toxins from a super bug: structure and function of Clostridium difficile toxins.,” Biochem. J., vol. 436, no. 3, pp. 517–26, Jun. 2011.

[42] M. Mashimo, J. Kato, and J. Moss, “Structure and function of the ARH family of ADP-ribosyl-acceptor hydrolases.,” DNA Repair (Amst)., vol. 23, pp. 88–94, Nov. 2014.

[43] M. De Vos, V. Schreiber, and F. Dantzer, “The diverse roles and clinical relevance of PARPs in DNA damage repair: current state of the art.,” Biochem. Pharmacol., vol. 84, no. 2, pp. 137–46, Jul. 2012.

[44] J.-F. Haince, M. Rouleau, M. J. Hendzel, J.-Y. Masson, and G. G. Poirier, “Targeting poly(ADP-ribosyl)ation: a promising approach in cancer therapy.,” Trends Mol. Med., vol. 11, no. 10, pp. 456–63, Oct. 2005.

[45] M. Tallis, R. Morra, E. Barkauskaite, and I. Ahel, “Poly(ADP-ribosyl)ation in regulation of chromatin structure and the DNA damage response.,” Chromosoma, vol. 123, no. 1–2, pp. 79–90, Mar. 2014.

[46] I. Ahel, D. Ahel, T. Matsusaka, A. J. Clark, J. Pines, S. J. Boulton, and S. C. West, “Poly(ADP-ribose)-binding zinc finger motifs in DNA repair/checkpoint proteins.,” Nature, vol. 451, no. 7174, pp. 81–5, Jan. 2008.

[47] J. Oberoi, M. W. Richards, S. Crumpler, N. Brown, J. Blagg, and R. Bayliss, “Structural basis of poly(ADP-ribose) recognition by the multizinc binding domain of checkpoint with forkhead-associated and RING Domains (CHFR).,” J. Biol. Chem., vol. 285, no. 50, pp. 39348–58, Dec. 2010.


[48] Z. Y. Abd Elmageed, A. S. Naura, Y. Errami, and M. Zerfaoui, “The poly(ADP-ribose) polymerases (PARPs): new roles in intracellular transport.,” Cell. Signal., vol. 24, no. 1, pp. 1–8, Jan. 2012.

[49] F. He, K. Tsuda, M. Takahashi, K. Kuwasako, T. Terada, M. Shirouzu, S. Watanabe, T. Kigawa, N. Kobayashi, P. Güntert, S. Yokoyama, and Y. Muto, “Structural insight into the interaction of ADP-ribose with the PARP WWE domains.,” FEBS Lett., vol. 586, no. 21, pp. 3858–64, Nov. 2012.

[50] Z. Wang, G. A. Michaud, Z. Cheng, Y. Zhang, T. R. Hinds, E. Fan, F. Cong, and W. Xu, “Recognition of the iso-ADP-ribose moiety in poly(ADP-ribose) by WWE domains suggests a general mechanism for poly(ADP-ribosyl)ation-dependent ubiquitination.,” Genes Dev., vol. 26, no. 3, pp. 235–40, Feb. 2012.

[51] M. J. Maté, M. Ortiz-Lombardía, B. Boitel, A. Haouz, D. Tello, S. A. Susin, J. Penninger, G. Kroemer, and P. M. Alzari, “The crystal structure of the mouse apoptosis-inducing factor AIF.,” Nat. Struct. Biol., vol. 9, no. 6, pp. 442–6, Jun. 2002.

[52] J. R. Pehrson and V. A. Fried, “MacroH2A, a core histone containing a large nonhistone region.,” Science, vol. 257, no. 5075, pp. 1398–400, Sep. 1992.

[53] W. Han, X. Li, and X. Fu, “The macro domain protein family: structure, functions, and their potential therapeutic implications.,” Mutat. Res., vol. 727, no. 3, pp. 86–103, 2011.

[54] G. I. Karras, G. Kustatscher, H. R. Buhecha, M. D. Allen, C. Pugieux, F. Sait, M. Bycroft, and A. G. Ladurner, “The macro domain is an ADP-ribose binding module.,” EMBO J., vol. 24, no. 11, pp. 1911–20, Jun. 2005.

[55] G. Timinszky, S. Till, P. O. Hassa, M. Hothorn, G. Kustatscher, B. Nijmeijer, J. Colombelli, M. Altmeyer, E. H. K. Stelzer, K. Scheffzek, M. O. Hottiger, and A. G. Ladurner, “A macrodomain-containing histone rearranges chromatin upon sensing PARP1 activation.,” Nat. Struct. Mol. Biol., vol. 16, no. 9, pp. 923–9, Sep. 2009.

[56] M. D. Allen, A. M. Buckle, S. C. Cordell, J. Löwe, and M. Bycroft, “The crystal structure of AF1521 a protein from Archaeoglobus fulgidus with homology to the non-histone domain of macroH2A.,” J. Mol. Biol., vol. 330, no. 3, pp. 503–11, Jul. 2003.

[57] G. Kustatscher, M. Hothorn, C. Pugieux, K. Scheffzek, and A. G. Ladurner, “Splicing regulates NAD metabolite binding to histone macroH2A.,” Nat. Struct. Mol. Biol., vol. 12, no. 7, pp. 624–5, Jul. 2005.

[58] J. Tan, C. Vonrhein, O. S. Smart, G. Bricogne, M. Bollati, Y. Kusov, G. Hansen, J. R. Mesters, C. L. Schmidt, and R. Hilgenfeld, “The SARS-unique domain (SUD) of SARS coronavirus contains two macrodomains that bind G-quadruplexes.,” PLoS Pathog., vol. 5, no. 5, p. e1000428, May 2009.


[59] M.-P. Egloff, H. Malet, A. Putics, M. Heinonen, H. Dutartre, A. Frangeul, A. Gruez, V. Campanacci, C. Cambillau, J. Ziebuhr, T. Ahola, and B. Canard, “Structural and functional basis for ADP-ribose and poly(ADP-ribose) binding by viral macro domains.,” J. Virol., vol. 80, no. 17, pp. 8493–502, Sep. 2006.

[60] G. Jankevicius, M. Hassler, B. Golia, V. Rybin, M. Zacharias, G. Timinszky, and A. G. Ladurner, “A family of macrodomain proteins reverses cellular mono-ADP-ribosylation.,” Nat. Struct. Mol. Biol., vol. 20, no. 4, pp. 508–14, Apr. 2013.

[61] M.-P. Egloff, H. Malet, A. Putics, M. Heinonen, H. Dutartre, A. Frangeul, A. Gruez, V. Campanacci, C. Cambillau, J. Ziebuhr, T. Ahola, and B. Canard, “Structural and functional basis for ADP-ribose and poly(ADP-ribose) binding by viral macro domains.,” J. Virol., vol. 80, no. 17, pp. 8493–502, Sep. 2006.

[62] H. Malet, B. Coutard, S. Jamal, H. Dutartre, N. Papageorgiou, M. Neuvonen, T. Ahola, N. Forrester, E. A. Gould, D. Lafitte, F. Ferron, J. Lescar, A. E. Gorbalenya, X. de Lamballerie, and B. Canard, “The crystal structures of Chikungunya and Venezuelan equine encephalitis virus nsP3 macro domains define a conserved adenosine binding pocket.,” J. Virol., vol. 83, no. 13, pp. 6534–45, Jul. 2009.

[63] M. Neuvonen and T. Ahola, “Differential activities of cellular and viral macro domain proteins in binding of ADP-ribose metabolites.,” J. Mol. Biol., vol. 385, no. 1, pp. 212–25, Jan. 2009.

[64] F. Rosenthal, K. L. H. Feijs, E. Frugier, M. Bonalli, A. H. Forst, R. Imhof, H. C. Winkler, D. Fischer, A. Caflisch, P. O. Hassa, B. Lüscher, and M. O. Hottiger, “Macrodomain-containing proteins are new mono-ADP-ribosylhydrolases.,” Nat. Struct. Mol. Biol., vol. 20, no. 4, pp. 502–7, Apr. 2013.

[65] G. Jankevicius, M. Hassler, B. Golia, V. Rybin, M. Zacharias, G. Timinszky, and A. G. Ladurner, “A family of macrodomain proteins reverses cellular mono-ADP-ribosylation.,” Nat. Struct. Mol. Biol., vol. 20, no. 4, pp. 508–14, Apr. 2013.

[66] R. Alvarez-Gonzalez and F. R. Althaus, “Poly(ADP-ribose) catabolism in mammalian cells exposed to DNA-damaging agents,” Mutat. Res. Repair, vol. 218, no. 2, pp. 67–74, Sep. 1989.

[67] J. Ménissier de Murcia, M. Ricoul, L. Tartier, C. Niedergang, A. Huber, F. Dantzer, V. Schreiber, J.-C. Amé, A. Dierich, M. LeMeur, L. Sabatier, P. Chambon, and G. de Murcia, “Functional interaction between PARP-1 and PARP-2 in chromosome stability and embryonic development in mouse.,” EMBO J., vol. 22, no. 9, pp. 2255–63, May 2003.

[68] D. Botta and M. K. Jacobson, “Identification of a regulatory segment of poly(ADP-ribose) glycohydrolase.,” Biochemistry, vol. 49, no. 35, pp. 7674–82, Sep. 2010.

[69] R. G. Meyer, M. L. Meyer-Ficca, C. J. Whatcott, E. L. Jacobson, and M. K. Jacobson, “Two small enzyme isoforms mediate mammalian mitochondrial poly(ADP-ribose) glycohydrolase (PARG) activity.,” Exp. Cell Res., vol. 313, no. 13, pp. 2920–36, Aug. 2007.

[70] M. L. Meyer-Ficca, R. G. Meyer, E. L. Jacobson, and M. K. Jacobson, “Poly(ADP-ribose) polymerases: managing genome stability.,” Int. J. Biochem. Cell Biol., vol. 37, no. 5, pp. 920–6, May 2005.

[71] W. Lin, J. C. Amé, N. Aboul-Ela, E. L. Jacobson, and M. K. Jacobson, “Isolation and characterization of the cDNA encoding bovine poly(ADP-ribose) glycohydrolase.,” J. Biol. Chem., vol. 272, no. 18, pp. 11895–901, May 1997.

[72] A. K. L. Leung, S. Vyas, J. E. Rood, A. Bhutkar, P. A. Sharp, and P. Chang, “Poly(ADP-ribose) regulates stress responses and microRNA activity in the cytoplasm.,” Mol. Cell, vol. 42, no. 4, pp. 489–99, May 2011.

[73] G. Brochu, C. Duchaine, L. Thibeault, J. Lagueux, G. M. Shah, and G. G. Poirier, “Mode of action of poly(ADP-ribose) glycohydrolase.,” Biochim. Biophys. Acta, vol. 1219, no. 2, pp. 342–50, Oct. 1994.

[74] M. Ikejima and D. M. Gill, “Poly(ADP-ribose) degradation by glycohydrolase starts with an endonucleolytic incision.,” J. Biol. Chem., vol. 263, no. 23, pp. 11037–40, Aug. 1988.

[75] S. A. Braun, P. L. Panzeter, M. A. Collinge, and F. R. Althaus, “Endoglycosidic cleavage of branched polymers by poly(ADP-ribose) glycohydrolase.,” Eur. J. Biochem., vol. 220, no. 2, pp. 369–75, Mar. 1994.

[76] O. Mortusewicz, E. Fouquerel, J. C. Amé, H. Leonhardt, and V. Schreiber, “PARG is recruited to DNA damage sites through poly(ADP-ribose)- and PCNA-dependent mechanisms,” Nucleic Acids Res., vol. 39, no. 12, pp. 5045–5056, 2011.

[77] I.-K. Kim, J. R. Kiefer, C. M. W. Ho, R. A. Stegeman, S. Classen, J. A. Tainer, and T. Ellenberger, “Structure of mammalian poly(ADP-ribose) glycohydrolase reveals a flexible tyrosine clasp as a substrate-binding element,” Nat. Struct. Mol. Biol., vol. 19, no. 6, pp. 653–656, 2012.

[78] E. S. Tan, K. A. Krukenberg, and T. J. Mitchison, “Large-scale preparation and characterization of poly(ADP-ribose) and defined length polymers.,” Anal. Biochem., vol. 428, no. 2, pp. 126–36, Sep. 2012.

[79] F. Rosenthal, K. L. H. Feijs, E. Frugier, M. Bonalli, A. H. Forst, R. Imhof, H. C. Winkler, D. Fischer, A. Caflisch, P. O. Hassa, B. Lüscher, and M. O. Hottiger, “Macrodomain-containing proteins are new mono-ADP-ribosylhydrolases.,” Nat. Struct. Mol. Biol., vol. 20, no. 4, pp. 502–7, Apr. 2013.

[80] R. Sharifi, R. Morra, C. D. Appel, M. Tallis, B. Chioza, G. Jankevicius, M. A. Simpson, I. Matic, E. Ozkan, B. Golia, M. J. Schellenberg, R. Weston, J. G. Williams, M. N. Rossi, H. Galehdari, J. Krahn, A. Wan, R. C. Trembath, A. H. Crosby, D. Ahel, R. Hay, A. G. Ladurner, G. Timinszky, R. S. Williams, and I. Ahel, “Deficiency of terminal ADP-ribose protein glycohydrolase TARG1/C6orf130 in neurodegenerative disease.,” EMBO J., vol. 32, no. 9, pp. 1225–37, May 2013.

[81] F. C. Peterson, D. Chen, B. L. Lytle, M. N. Rossi, I. Ahel, J. M. Denu, and B. F. Volkman, “Orphan macrodomain protein (human C6orf130) is an O-acyl-ADP-ribose deacylase: solution structure and catalytic properties.,” J. Biol. Chem., vol. 286, no. 41, pp. 35955–65, Oct. 2011.

[82] D. Chen, M. Vollmar, M. N. Rossi, C. Phillips, R. Kraehenbuehl, D. Slade, P. V Mehrotra, F. von Delft, S. K. Crosthwaite, O. Gileadi, J. M. Denu, and I. Ahel, “Identification of macrodomain proteins as novel O-acetyl-ADP-ribose deacetylases.,” J. Biol. Chem., vol. 286, no. 15, pp. 13261–71, Apr. 2011.

[83] M. Masutani, T. Nozaki, E. Nishiyama, T. Shimokawa, Y. Tachi, H. Suzuki, H. Nakagama, K. Wakabayashi, and T. Sugimura, “Function of poly(ADP-ribose) polymerase in response to DNA damage: gene-disruption study in mice.,” Mol. Cell. Biochem., vol. 193, no. 1–2, pp. 149–52, Mar. 1999.

[84] U. Cortes, W.-M. Tong, D. L. Coyle, M. L. Meyer-Ficca, R. G. Meyer, V. Petrilli, Z. Herceg, E. L. Jacobson, M. K. Jacobson, and Z.-Q. Wang, “Depletion of the 110-kilodalton isoform of poly(ADP-ribose) glycohydrolase increases sensitivity to genotoxic and endotoxic stress in mice.,” Mol. Cell. Biol., vol. 24, no. 16, pp. 7163–78, Aug. 2004.

[85] Y. Zhou, X. Feng, and D. W. Koh, “Enhanced DNA accessibility and increased DNA damage induced by the absence of poly(ADP-ribose) hydrolysis.,” Biochemistry, vol. 49, no. 34, pp. 7360–6, Aug. 2010.

[86] J.-C. Amé, E. Fouquerel, L. R. Gauthier, D. Biard, F. D. Boussin, F. Dantzer, G. de Murcia, and V. Schreiber, “Radiation-induced mitotic catastrophe in PARG-deficient cells.,” J. Cell Sci., vol. 122, no. Pt 12, pp. 1990–2002, Jun. 2009.

[87] X.-C. M. Lu, E. Massuda, Q. Lin, W. Li, J.-H. Li, and J. Zhang, “Post-treatment with a novel PARG inhibitor reduces infarct in cerebral ischemia in the rat.,” Brain Res., vol. 978, no. 1–2, pp. 99–103, Jul. 2003.

[88] J. T. Slama, N. Aboul-Ela, D. M. Goli, B. V Cheesman, A. M. Simmons, and M. K. Jacobson, “Specific inhibition of poly(ADP-ribose) glycohydrolase by adenosine diphosphate (hydroxymethyl)pyrrolidinediol.,” J. Med. Chem., vol. 38, no. 2, pp. 389–93, Jan. 1995.

[89] L. Formentini, P. Arapistas, M. Pittelli, M. Jacomelli, V. Pitozzi, S. Menichetti, A. Romani, L. Giovannelli, F. Moroni, and A. Chiarugi, “Mono-galloyl glucose derivatives are potent poly(ADP-ribose) glycohydrolase (PARG) inhibitors and partially reduce PARP-1-dependent cell death.,” Br. J. Pharmacol., vol. 155, no. 8, pp. 1235–49, Dec. 2008.

[90] L. Tentori, C. Leonetti, M. Scarsella, A. Muzi, M. Vergati, O. Forini, P. M. Lacal, F. Ruffini, B. Gold, W. Li, J. Zhang, and G. Graziani, “Poly(ADP-ribose) glycohydrolase inhibitor as chemosensitiser of malignant melanoma for temozolomide.,” Eur. J. Cancer, vol. 41, no. 18, pp. 2948–57, Dec. 2005.

[91] L. Lin, J. Li, Y. Wang, and X. Lin, “Relationship of PARG with PARP, VEGF and b-FGF in colorectal carcinoma,” Chinese J. Cancer Res., vol. 21, no. 2, pp. 135–141, Jun. 2009.

[92] S. Tanuma, Y. J. Tsai, H. Sakagami, K. Konno, and H. Endo, “Lignin inhibits (ADP-ribose)n glycohydrolase activity.,” Biochem. Int., vol. 19, no. 6, pp. 1395–402, Dec. 1989.

[93] H. Maruta, N. Matsumura, and S. Tanuma, “Role of (ADP-ribose)n catabolism in DNA repair.,” Biochem. Biophys. Res. Commun., vol. 236, no. 2, pp. 265–9, Jul. 1997.

[94] C. Keil, T. Gröbe, and S. L. Oei, “MNNG-induced cell death is controlled by interactions between PARP-1, poly(ADP-ribose) glycohydrolase, and XRCC1.,” J. Biol. Chem., vol. 281, no. 45, pp. 34394–405, Nov. 2006.

[95] K. Erdélyi, P. Bai, I. Kovács, E. Szabó, G. Mocsár, A. Kakuk, C. Szabó, P. Gergely, and L. Virág, “Dual role of poly(ADP-ribose) glycohydrolase in the regulation of cell death in oxidatively stressed A549 cells.,” FASEB J., vol. 23, no. 10, pp. 3553–63, Oct. 2009.

[96] X. Feng and D. W. Koh, “Inhibition of poly(ADP-ribose) polymerase-1 or poly(ADP-ribose) glycohydrolase individually, but not in combination, leads to improved chemotherapeutic efficacy in HeLa cells.,” Int. J. Oncol., vol. 42, no. 2, pp. 749–56, Feb. 2013.

[97] A. E. O. Fisher, H. Hochegger, S. Takeda, and K. W. Caldecott, “Poly(ADP-ribose) polymerase 1 accelerates single-strand break repair in concert with poly(ADP-ribose) glycohydrolase.,” Mol. Cell. Biol., vol. 27, no. 15, pp. 5597–605, Aug. 2007.

[98] C. Fathers, R. M. Drayton, S. Solovieva, and H. E. Bryant, “Inhibition of poly(ADP-ribose) glycohydrolase (PARG) specifically kills BRCA2-deficient tumor cells.,” Cell Cycle, vol. 11, no. 5, pp. 990–7, Mar. 2012.

[99] S.-W. Yu, S. A. Andrabi, H. Wang, N. S. Kim, G. G. Poirier, T. M. Dawson, and V. L. Dawson, “Apoptosis-inducing factor mediates poly(ADP-ribose) (PAR) polymer-induced cell death.,” Proc. Natl. Acad. Sci. U. S. A., vol. 103, no. 48, pp. 18314–9, Nov. 2006.

[100] S. A. Andrabi, N. S. Kim, S.-W. Yu, H. Wang, D. W. Koh, M. Sasaki, J. A. Klaus, T. Otsuka, Z. Zhang, R. C. Koehler, P. D. Hurn, G. G. Poirier, V. L. Dawson, and T. M. Dawson, “Poly(ADP-ribose) (PAR) polymer is a death signal.,” Proc. Natl. Acad. Sci. U. S. A., vol. 103, no. 48, pp. 18308–13, Nov. 2006.

[101] Y. Wang, N. S. Kim, J.-F. Haince, H. C. Kang, K. K. David, S. A. Andrabi, G. G. Poirier, V. L. Dawson, and T. M. Dawson, “Poly(ADP-ribose) (PAR) binding to apoptosis-inducing factor is critical for PAR polymerase-1-dependent cell death (parthanatos).,” Sci. Signal., vol. 4, no. 167, p. ra20, Jan. 2011.

[102] J. Rine and I. Herskowitz, “Four genes responsible for a position effect on expression from HML and HMR in Saccharomyces cerevisiae.,” Genetics, vol. 116, no. 1, pp. 9–22, May 1987.

[103] J.-E. Choi and R. Mostoslavsky, “Sirtuins, metabolism, and DNA repair.,” Curr. Opin. Genet. Dev., vol. 26, pp. 24–32, Jun. 2014.

[104] T. M. Kowieski, S. Lee, and J. M. Denu, “Acetylation-dependent ADP-ribosylation by Trypanosoma brucei Sir2.,” J. Biol. Chem., vol. 283, no. 9, pp. 5317–26, Feb. 2008.

[105] M. C. Haigis, R. Mostoslavsky, K. M. Haigis, K. Fahie, D. C. Christodoulou, A. J. Murphy, D. M. Valenzuela, G. D. Yancopoulos, M. Karow, G. Blander, C. Wolberger, T. A. Prolla, R. Weindruch, F. W. Alt, and L. Guarente, “SIRT4 inhibits glutamate dehydrogenase and opposes the effects of calorie restriction in pancreatic beta cells.,” Cell, vol. 126, no. 5, pp. 941–54, Sep. 2006.

[106] J. G. M. Rack, R. Morra, E. Barkauskaite, R. Kraehenbuehl, A. Ariza, Y. Qu, M. Ortmayer, O. Leidecker, D. R. Cameron, I. Matic, A. Y. Peleg, D. Leys, A. Traven, and I. Ahel, “Identification of a Class of Protein ADP-Ribosylating Sirtuins in Microbial Pathogens.,” Mol. Cell, vol. 59, no. 2, pp. 309–20, Jul. 2015.

[107] N. Asherie, “Protein crystallization and phase diagrams,” Methods, vol. 34, no. 3, pp. 266–272, 2004.

[108] A. J. McCoy, R. W. Grosse-Kunstleve, P. D. Adams, M. D. Winn, L. C. Storoni, and R. J. Read, “Phaser crystallographic software.,” J. Appl. Crystallogr., vol. 40, no. Pt 4, pp. 658–674, Aug. 2007.

[109] G. N. Murshudov, A. A. Vagin, A. Lebedev, K. S. Wilson, and E. J. Dodson, “Efficient anisotropic refinement of macromolecular structures using FFT.,” Acta Crystallogr. D. Biol. Crystallogr., vol. 55, no. Pt 1, pp. 247–55, Jan. 1999.

[110] G. N. Murshudov, A. A. Vagin, and E. J. Dodson, “Refinement of macromolecular structures by the maximum-likelihood method.,” Acta Crystallogr. D. Biol. Crystallogr., vol. 53, no. Pt 3, pp. 240–55, May 1997.

[111] P. Emsley, B. Lohkamp, W. G. Scott, and K. Cowtan, “Features and development of Coot.,” Acta Crystallogr. D. Biol. Crystallogr., vol. 66, no. Pt 4, pp. 486–501, Apr. 2010.

[112] K. Cowtan, “The Buccaneer software for automated model building. 1. Tracing protein chains.,” Acta Crystallogr. D. Biol. Crystallogr., vol. 62, no. Pt 9, pp. 1002–11, Sep. 2006.

[113] W. Kabsch, “XDS.,” Acta Crystallogr. D. Biol. Crystallogr., vol. 66, no. Pt 2, pp. 125–32, Feb. 2010.

[114] K. Hatakeyama, Y. Nemoto, K. Ueda, and O. Hayaishi, “Purification and characterization of poly(ADP-ribose) glycohydrolase. Different modes of action on large and small poly(ADP-ribose).,” J. Biol. Chem., vol. 261, no. 32, pp. 14902–11, Nov. 1986.

[115] H. Gao, D. L. Coyle, M. L. Meyer-Ficca, R. G. Meyer, E. L. Jacobson, Z.-Q. Wang, and M. K. Jacobson, “Altered poly(ADP-ribose) metabolism impairs cellular responses to genotoxic stress in a hypomorphic mutant of poly(ADP-ribose) glycohydrolase.,” Exp. Cell Res., vol. 313, no. 5, pp. 984–96, Mar. 2007.

[116] W. Min, U. Cortes, Z. Herceg, W.-M. Tong, and Z.-Q. Wang, “Deletion of the nuclear isoform of poly(ADP-ribose) glycohydrolase (PARG) reveals its function in DNA repair, genomic stability and tumorigenesis.,” Carcinogenesis, vol. 31, no. 12, pp. 2058–65, Dec. 2010.

[117] D. W. Koh, A. M. Lawler, M. F. Poitras, M. Sasaki, S. Wattler, M. C. Nehls, T. Stöger, G. G. Poirier, V. L. Dawson, and T. M. Dawson, “Failure to degrade poly(ADP-ribose) causes increased sensitivity to cytotoxicity and early embryonic lethality.,” Proc. Natl. Acad. Sci. U. S. A., vol. 101, no. 51, pp. 17699–704, Dec. 2004.

[118] Z. Dosztányi, V. Csizmok, P. Tompa, and I. Simon, “IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content.,” Bioinformatics, vol. 21, no. 16, pp. 3433–4, Aug. 2005.

[119] T. A. Brown, “Synthesis and Processing of the Proteome.” Wiley-Liss, 2002.

[120] L. Whitmore and B. A. Wallace, “Protein secondary structure analyses from circular dichroism spectroscopy: Methods and reference databases,” Biopolymers, vol. 89, no. 5, pp. 392–400, 2008.

[121] A. K. Dunker, J. D. Lawson, C. J. Brown, R. M. Williams, P. Romero, J. S. Oh, C. J. Oldfield, A. M. Campen, C. M. Ratliff, K. W. Hipps, J. Ausio, M. S. Nissen, R. Reeves, C. Kang, C. R. Kissinger, R. W. Bailey, M. D. Griswold, W. Chiu, E. C. Garner, and Z. Obradovic, “Intrinsically disordered protein,” J. Mol. Graph. Model., vol. 19, no. 1, pp. 26–59, Feb. 2001.

[122] C. Bracken, “NMR spin relaxation methods for characterization of disorder and folding in proteins.,” J. Mol. Graph. Model., vol. 19, no. 1, pp. 3–12, Jan. 2001.

[123] R. Ishima and D. A. Torchia, “Protein dynamics from NMR.,” Nat. Struct. Biol., vol. 7, no. 9, pp. 740–3, Sep. 2000.

[124] N. Poulose and R. Raju, “Sirtuin regulation in aging and injury,” Biochim. Biophys. Acta - Mol. Basis Dis., vol. 1852, no. 11, pp. 2442–2455, Aug. 2015.

[125] P. Chaiyen, M. W. Fraaije, and A. Mattevi, “The enigmatic reaction of flavins with oxygen.,” Trends Biochem. Sci., vol. 37, no. 9, pp. 373–80, Sep. 2012.

[126] D. W. Banner, A. C. Bloomer, G. A. Petsko, D. C. Phillips, C. I. Pogson, I. A. Wilson, P. H. Corran, A. J. Furth, J. D. Milman, R. E. Offord, J. D. Priddle, and S. G. Waley, “Structure of chicken muscle triose phosphate isomerase determined crystallographically at 2.5 angstrom resolution using amino acid sequence data.,” Nature, vol. 255, no. 5510, pp. 609–14, Jun. 1975.

[127] L. Holm and P. Rosenström, “Dali server: conservation mapping in 3D.,” Nucleic Acids Res., vol. 38, no. Web Server issue, pp. W545–9, Jul. 2010.

[128] S. Maier, T. Pflüger, S. Loesgen, K. Asmus, E. Brötz, T. Paululat, A. Zeeck, S. Andrade, and A. Bechthold, “Insights into the bioactivity of mensacarcin and epoxide formation by MsnO8.,” Chembiochem, vol. 15, no. 5, pp. 749–56, Mar. 2014.

[129] L. Y. Lin, T. Sulea, R. Szittner, V. Vassilyev, E. O. Purisima, and E. A. Meighen, “Modeling of the bacterial luciferase-flavin mononucleotide complex combining flexible docking with structure-activity data.,” Protein Sci., vol. 10, no. 8, pp. 1563–71, Aug. 2001.

[130] Z. Li, R. Szittner, and E. A. Meighen, “Subunit interactions and the role of the luxA polypeptide in controlling thermal stability and catalytic properties in recombinant luciferase hybrids.,” Biochim. Biophys. Acta, vol. 1158, no. 2, pp. 137–45, Oct. 1993.

[131] Z. T. Campbell, A. Weichsel, W. R. Montfort, and T. O. Baldwin, “Crystal structure of the bacterial luciferase/flavin complex provides insight into the function of the beta subunit.,” Biochemistry, vol. 48, no. 26, pp. 6085–94, Jul. 2009.

[132] T. F. Holzman and T. O. Baldwin, “Proteolytic inactivation of luciferases from three species of luminous marine bacteria, Beneckea harveyi, Photobacterium fischeri, and Photobacterium phosphoreum: evidence of a conserved structural feature.,” Proc. Natl. Acad. Sci. U. S. A., vol. 77, no. 11, pp. 6363–7, Nov. 1980.

[133] M. J. Lambrecht, M. Brichacek, E. Barkauskaite, A. Ariza, I. Ahel, and P. J. Hergenrother, “Synthesis of Dimeric ADP-Ribose and Its Structure with Human Poly(ADP-ribose) Glycohydrolase,” J. Am. Chem. Soc., vol. 137, no. 10, pp. 3558–3564, 2015.

[134] A. A. Fatokun, V. L. Dawson, and T. M. Dawson, “Parthanatos: mitochondrial-linked mechanisms and therapeutic opportunities.,” Br. J. Pharmacol., vol. 171, no. 8, pp. 2000–16, Apr. 2014.

[135] S. A. Andrabi, N. S. Kim, S.-W. Yu, H. Wang, D. W. Koh, M. Sasaki, J. A. Klaus, T. Otsuka, Z. Zhang, R. C. Koehler, P. D. Hurn, G. G. Poirier, V. L. Dawson, and T. M. Dawson, “Poly(ADP-ribose) (PAR) polymer is a death signal.,” Proc. Natl. Acad. Sci. U. S. A., vol. 103, no. 48, pp. 18308–13, Nov. 2006.

[136] K. Surmann, S. Michalik, P. Hildebrandt, P. Gierok, M. Depke, L. Brinkmann, J. Bernhardt, M. G. Salazar, Z. Sun, D. Shteynberg, U. Kusebauch, R. L. Moritz, B. Wollscheid, M. Lalk, U. Völker, and F. Schmidt, “Comparative proteome analysis reveals conserved and specific adaptation patterns of Staphylococcus aureus after internalization by different types of human non-professional phagocytic host cells.,” Front. Microbiol., vol. 5, p. 392, Jan. 2014.

Chapter Eight

Appendix

Primers list

hPARG 1-460
Forward: TCG AAG GTA GGC ATA TGA ATG CGG GCC CCG
Reverse 1-460: GGT GGT GGT GCT CGA GCT ACT CCT CAA TGG GAG T
Reverse 1-388: GGT GGT GGT GCT CGA GTT ATC CAG GTA GTT TAG CAT TTA AAT C
Reverse 1-380: GGT GGT GGT GCT CGA GTT AAT TCA TTC CAG TGC GAC TCT C
Reverse 1-365: GGT GGT GGT GCT CGA GTT ATC TAA CTT CAC CGC CCT TAG
Reverse 1-329: GGT GGT GGT GCT CGA GTT ATT GTT CAT CAA AAC CTG GAC TTG

Human PARG for P. pastoris expression 1-388
Forward: TCG AAA CGA GGA ATT CAG AAT GGG CAG CAG CCA TCA TCA TCA T
Reverse: AGA AAG CTG GCG GCC GCC TAT CCA GGT AGT TTA GCA TTT AAA TC

Mouse PARG
Forward: CGC GCG GCA GCC ATA TGA GTG CGG GCC CCG GC
Reverse 1-438: GGT GGT GGT GCT CGA GCT AAG GTG GGA TGT ATT TTG GAA T
Reverse 1-378: GGT GGT GGT GCT CGA GCT ATG GCT TGG CAT TTA AGT CAC T

Chicken PARG
Forward: CGC GCG GCA GCC ATA TGT CCG CAG GTT GTG G
Reverse 1-421: GGT GGT GGT GCT CGA GCT ACG GCA CAT ATT TCG GAA TTT T
Reverse 1-362: GGT GGT GGT GCT CGA GCT ACG GTT TAA TAT GCA GAC CG

Rat PARG
Forward: CGC GCG GCA GCC ATA TGT CCG CTG GTC CGG GC
Reverse 1-456: GGT GGT GGT GCT CGA TCA TTC TTC AAT CGG CGT AC
Reverse 1-382: GGT GGT GGT GCT CGA GCT ACG GTT TGG CGT TCA GAT C

Human MACROD2
Forward: CGC GCG GCA GCC ATA TGA AGA AGA AAG TGT GGC GT
Reverse: GGT GGT GGT GCT CGA GTC AAT CCA CCG AGA AAA ATT C

Bacterial PARG from T. curvata
Forward: CGC GCG GCA GCC ATA TGC GTC ACA GTC GC
Reverse: GGT GGT GGT GCT CGA GTT ACA GGC TGC CAA AAC
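The primers listed above can be sanity-checked with a short script. The following is a minimal Python sketch, not a tool used in this work. It strips the triplet spacing, computes the reverse complement and GC fraction, and scans for the standard recognition sequences CATATG (NdeI) and CTCGAG (XhoI). That these two sites were the intended cloning sites is an assumption based on the sequences themselves, and all function names are hypothetical.

```python
# Minimal primer sanity check (illustrative sketch; function names are hypothetical).
# Assumption: the primers are meant to carry NdeI (CATATG) and XhoI (CTCGAG) sites.

SITES = {"NdeI": "CATATG", "XhoI": "CTCGAG"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(primer: str) -> str:
    """Remove the triplet spacing used in the primer list and upper-case the result."""
    return "".join(primer.split()).upper()


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return clean(seq).translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    s = clean(seq)
    return (s.count("G") + s.count("C")) / len(s)


def find_sites(primer: str) -> dict:
    """Map each enzyme name to the 0-based start positions of its site in the primer."""
    s = clean(primer)
    hits = {}
    for name, site in SITES.items():
        positions = []
        start = s.find(site)
        while start != -1:
            positions.append(start)
            start = s.find(site, start + 1)
        hits[name] = positions
    return hits


if __name__ == "__main__":
    hparg_forward = "TCG AAG GTA GGC ATA TGA ATG CGG GCC CCG"
    hparg_reverse_1_460 = "GGT GGT GGT GCT CGA GCT ACT CCT CAA TGG GAG T"
    print(find_sites(hparg_forward))
    print(find_sites(hparg_reverse_1_460))
    print(round(gc_fraction(hparg_forward), 2))
```

Run over the whole list, a scan of this kind finds no CTCGAG in the rat Reverse 1-456 primer as printed (GCT CGA TCA). Whether that reflects the primer design or a transcription slip in the list cannot be determined from the text.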

Gene synthesis sequences MACROD2 with PARG loop: ATGAAGAAGAAAGTGTGGCGTGAAGAAAAAGAACGTCTGCTGAAAATGACCCTGGAAGAAC GTCGTAAAGAATACCTGCGTGACTACATTCCGCTGAATAGTATCCTGTCCTGGAAAGAAGAA ATGAAAGGCAAGGGCCAGAACGATGAAGAAAATACCCAGGAAACGTCACAAGTGAAAAAGTC GCTGACCGAAAAAGTTAGCCTGTATCGTGGTGATATTACGCTGCTGGAAGTGGACGCGATCG TTAACGCAGCAAATGCAAGTCTGCTGGGCGGTGGCTTTCTGTCAGGTGCACATGCTCAAGAA GAATGCATTCACCGTGCAGCTGGTCCGTGCCTGCTGGCAGAATGTCGTAACCTGAATGGTTGC GATACCGGCCATGCGAAAATTACGTGTGGTTATGACCTGCCGGCCAAATACGTCATCCACACC GTGGGCCCGATTGCACGTGGTCATATCAACGGCTCCCACAAAGAAGATCTGGCTAATTGCTA TAAGAGCTCTCTGAAACTGGTCAAGGAAAACAATATCCGCAGCGTTGCGTTTCCGTGTATTT CTACCGGTATCTACGGCTTCCCGAACGAACCGGCGGCCGTGATTGCACTGAATACGATCAAAG AATGGCTGGCTAAGAACCATCACGAAGTTGACCGCATTATCTTTTGTGTCTTCCTGGAAGTG GATTTTAAGATTTACAAGAAGAAGATGAATGAATTTTTCTCGGTGGATTGA

Bacterial PARG with Macro loop: ATGCGTCACAGTCGCCGTGCTATTGCCGCCGAAACCGTGGAAATTCTGGAACGTGGTCGCTAT ACCGCCCCGTCTGGTCGTGTTGTTCCGATTGCAGATCATGTTGCACAGGCAGTCCGTGGTACC CGTCTGTATCGTCCGGAAAAACTGGCAGTGCTGCTGGAAGGTCTGGGTGCAGCAAGCGACGG TGCTCCGACCCGCATCGAAGTTACGGAAGAAACCACGCTGGCAGCTGCACGTCGCCTGACGGG TGCAGCAGGTGATCAAGTTGCCTGCCTGAACTTTGCATCAGCTGAACATCCGGGCGGTGGCGG TGTCGACGGCGGTCTGGCCCGCAGCTCTGGCCTGTATGCATCGCTGCGTGCTGTGCCGCAGTT TTACGCATTCCATCACCGTCAACGCGATCCGCTGTATAGTGACCACCTGATTTACTCCCCGGG CGTCCCGGTGTTTCGTGATGACGCGGGTCGTCTGCTGGAAGAACCGTATCGCGTTGCCTTCCT GACCAGTCCGGCACCGAACCGTCGCGCTATTGGCGATCTGCGCACGGTGGAAGAAATCGGCCG TGTTCTGCGTGGTCGTGCTGCAAAGGTCCTGGCAGCAGCTCGTCATCACGGTCATCGTCGCCT GGTGCTGGGTGCATGGGGCTGTGGTATCTACGGTAATGATCCGGCACAGGTCGCAGAAACCT TTGCAGGTCTGCTGCTGGACGGCGGTCCGTTTGCAGGTCGTTTCGCCCACGTGGTTTTCGCGG

TTTGGGACACGGCACCGGGCGCACCGCGTCACGCAGCATTCGCACGTCGTTTTGGCAGCCTGT AA Chicken regulatory domain: ATGTCCGCAGGTTGTGGTCGTGATCATCCGTGTAAACGTGCTCGTCTGAGTCCGGGTGGTGG TTCCCAGGGTGAACAGCGTCCGCCGCCGCCGGATAGCGGTCCGGAAGGTCCGGGCGGTGGCAG CTCTGTGACCGCAGTTGGTAATAAAGCTTGCAAACAGCGTACGATTACCACGTGGCTGGAAG ATAAAGGCCCGAAAACCACGGAATCGCGCAGCCTGCAATCTAAAATCAACAACAACACCGAA GAAAATACCAAAATGACGAGTGTGAAAAAAGAAAACGTTTGCGAACATGATGTCAAACAGCT GGAAAATATTTGCCAGCAAAACTGTCCGGAAGTTAGCGGTGAAGTCGGCACCAAACACGTGA ATCTGAGTCAAACGGGTTCCATCTGTAACTGGGAAGCAGAAGGCCGTGATCCGGTTAGCGAC CGCGTTCCGGTCAAAGCTGAACAGACCGGTGAAGCGTTTAAAAATGCCAACATTG ATCAGATGCATCGTACGGGTAAAGAAGCGCAACGTCGCTGTGAAGAATCCGGCGACATCTGG ACCCAGCGTAAACGCAGTTCCGGTCAAAGCCTGAAAACCGGCAATACGAAACCGTCATCGAA AGATACCGAAACGGACGGCCAGGTGGTTCTGTGCAACTCACAAAAACTGAAATCGTGTGATC CGGTTGACACCGGTGAACAGGAAGAAGGCGATGTCGTGCCGGAATCTCCGCTGAGTGATACG GGTTGCGACGCATGTGTCAGTCGTGTGGGTGGCCCGGAAAAACTGTCCAAATGCCGCAGCTCT AGTGGCGATTCACCGGCTTTTGAAAAAGAATCAGAACCGGAATCGCCGATGGATGTGGACAA TTCCAAAAACTCATGTCACGGTAGCGAAGCGGATGAAGAAACCTCTCCGCTGCTGGACGAAC GTGAAGAAAACAACATGGCGAAAAGCATCAACAAATCTTTCCGCATCCAGTATGGCGATGCC GAACTGGAATCGCGTAAAAGCCGCTTTTCTACCAAAGGTTGCGAAGATTTCGAAGGCATTGA CAACTCTGTCAAAGATGACGGTCTGCATATTAAACCGCCGGGTATCAGTCCGGCACCGTCCTC AGAGGGTAAAGGCGCCAAACCGGGTGTGAAAAAAGATAGTAAAATCACCAGCCACTTCATGC GTATCCCGAAAATGGAAGAAAAATCGCGCAAAGACAAATGTGAATTCAAAAGCCAGCGCGCC GGCAAGAAAATTCCGAAATATGTGCCGCCGCCGCTGCCGACGAATAAAAAATGGTTTGGTAC GCCGATTGAAGAATAA

Rat regulatory domain: ATGTCCGCTGGTCCGGGCTGTGAACCGTGTACGAAACGTCCGCGTTGGGGCGCTGCAGGCACG AGTGCTCCGACCGCAAGTGATTCCCGCAGCTTTCCGGGCCGTCAAAAACGCGTCCTGGATCCG AAAGACGCACCGGTGCAGTTCCGTGTTCCGCCGAGCTCTAGTGCCTGTGTGTCTGGTCGTGCA GGTCCGCATCGTGGCAGTGTCACCAGCTTTGTGTTCAAACAGAAACCGATTACCACGTGGAT GGATACCAAAGGTCCGAAAACGGCCGAAAGTGAATCCAAAGAAAACAATAACACCCGCACGG ATCCGATGATGTCCTCAGTGCAAAAAGACAACTTCTATCCGCATAAAGTTGAAAAACTGGGC AATGTCCCGCAGCTGAACCTGGATAAATCACCGACCGAAAAATCGACGCCGTACCTGAACCA GCAACAGACCGCGGGCGTTTGTAAATGGCACTCGGCAGGTGAACGCGCTGAACAGCTGTCAG CATCGGAACCGAGTGCGGTCACGCAAGCCCCGAAACAGCTGTCAAATGCTAACATCGATCAG TCGCCGCCGACCGACGGTCATAGCGATACGGACCACGAAGAAGATCGTGACAATCAACAGTT CCTGACCCCGGTGAAACTGGCGAACGCCAAACAAACGGTTGGCGATGGTCAGGCACGCAGCA ATTGCAAATGTAGCGCTTCTTGCCAATGTGGCCAGGATTGCGCAGGTTGTCAGCGTGAAGAA GCTGACGTCATTCCGGAATCACCGCTGTCGGATGTGGGCGCCGAAGACATCGGCACCGGTAGC AAAAATGATAACAAACTGACGGGTCAGGAAAGCGGCCTGGGTGATTCTCCGCCGTTTGAAAA AGAAAGTGAACCGGAATCCCCGATGGATGTGGACAATAGCAAAACCTCTTGCCAGGATTCTG AAGCAGACGAAGAAGCTAGTCCGGTTTTCGATGAACAAGATGACCAGGATGACCGTAGCAGC

CAAACCGCGAACAAACTGTCTAGTCGTCAGGCCCGCGAAGTGGATGGCGACCTGCGTAAACG CTATCTGACCAAAGGTTCCGAAATTCGTCTGCATTTTCAGTTCGAAGGCGGTAGTAATGCGG GCACGTCCGATCTGAACGCCAAACCGAGCGGTAATTCCTCATCGCTGAACGTTGATGGCCGCA GCTCTAAACAGCATGGTAAACGTGATTCCAAAATTACCGACCACTTTGTTCGCATCCCGAAA TCAGAAGATAAACGTAAAGAACAATGTGAAGTCCGTCACCAGCGCGCGGAACGTAAAATCCC GAAATATGTGCCGCCGAACCTGCCGCCGGACAAAAAATGGCTGGGTACGCCGATTGAAGAAT GA Sequence alignments

Sequence alignment of MACROD2 (H. sapiens) and bacterial PARG (T. curvata). The residues that form the catalytic loop of each protein are highlighted in blue: Glu103–Glu115 for bPARG and Glu93–Glu99 for MACROD2. The overall sequence identity is 19%.
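A percentage identity of this kind can be recomputed from any pairwise alignment. The following is a minimal Python sketch, not the method used in this work; the alignment program and the denominator convention behind the 19% figure are not stated here. The sketch assumes the two sequences are already aligned to equal length with '-' marking gaps and, as an explicit assumption, divides the number of matching columns by the number of columns in which both sequences carry a residue. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical residues over columns where neither sequence has a gap.

    Assumption: other conventions (alignment length, shorter sequence length)
    exist and give different values for the same alignment.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no residue-to-residue columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment, not taken from the MACROD2/bPARG alignment.
    print(percent_identity("MKV-LA", "MRVGLA"))
```

Because the value depends on the denominator, an identity figure is easiest to reproduce when the convention is stated alongside it.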

Multiple alignment of hPARG regulatory domain with other vertebrate PARG proteins. ePARG is Erinaceus europaeus (hedgehog), fPARG is Fukomys damarensis (mole rat), cPARG is Gallus gallus (chicken) and chincPARG is Chinchilla lanigera (chinchilla).

Plasmid list

pET28a; DH5α; pDONOR; pPICZ 3.1; pPICZB 3.1; pmcnEAVNH.

Published work

ARTICLE

Received 8 Apr 2013 | Accepted 17 Jun 2013 | Published 6 Aug 2013
DOI: 10.1038/ncomms3164 OPEN

Visualization of poly(ADP-ribose) bound to PARG reveals inherent balance between exo- and endo-glycohydrolase activities

Eva Barkauskaite1,*, Amy Brassington2,*, Edwin S. Tan3, Jim Warwicker2, Mark S. Dunstan2, Benito Banos1, Pierre Lafite4, Marijan Ahel5, Timothy J. Mitchison3, Ivan Ahel1,6 & David Leys2

Poly-ADP-ribosylation is a post-translational modification that regulates processes involved in genome stability. Breakdown of the poly(ADP-ribose) (PAR) polymer is catalysed by poly(ADP-ribose) glycohydrolase (PARG), whose endo-glycohydrolase activity generates PAR fragments. Here we present the crystal structure of PARG incorporating the PAR substrate. The two terminal ADP-ribose units of the polymeric substrate are bound in exo-mode. Biochemical and modelling studies reveal that PARG acts predominantly as an exo-glycohydrolase. This preference is linked to Phe902 (human numbering), which is responsible for low-affinity binding of the substrate in endo-mode. Our data reveal the mechanism of poly-ADP-ribosylation reversal, with ADP-ribose as the dominant product, and suggest that the release of apoptotic PAR fragments occurs at unusual PAR/PARG ratios.

1 Cancer Research UK, Paterson Institute for Cancer Research, University of Manchester, Wilmslow Road, Manchester M20 4BX, UK. 2 Manchester Institute of Biotechnology, University of Manchester, 131 Princess Street, Manchester M1 7DN, UK. 3 Harvard Medical School WA 536, Boston, Massachusetts 02115, USA. 4 Institut de Chimie Organique et Analytique, Université d’Orléans, CNRS, UMR 7311, BP6769 Rue de Chartres, 45067 Orléans cedex 2, France. 5 Rudjer Boskovic Institute, Bijenicka 54, HR-10000 Zagreb, Croatia. 6 Sir William Dunn School of Pathology, University of Oxford, South Parks Road, Oxford OX1 3RE, UK. * These authors contributed equally to this work. Correspondence and requests for materials should be addressed to I.A. (email: [email protected]) or to D.L. ([email protected]).

NATURE COMMUNICATIONS | 4:2164 | DOI: 10.1038/ncomms3164 | www.nature.com/naturecommunications © 2013 Macmillan Publishers Limited. All rights reserved.

Poly-ADP-ribosylation is a reversible post-translational modification that regulates a variety of cellular functions involved in genome stability1–4. Poly(ADP-ribose) (PAR) synthesis is achieved by poly(ADP-ribose) polymerases (PARPs), which, by using NAD as substrate, link repeating ADP-ribose units via unique O-glycosidic ribose–ribose bonds5. Efficient and timely breakdown of the PAR polymer is catalysed by a PAR glycohydrolase (PARG), and PARG depletion results in embryonic lethality in mouse models, as well as increased radiosensitivity and chemosensitivity in cells, making PARG an attractive therapeutic target6–8. The complex chemical nature of the poly(ADP-ribose) polymer has limited the understanding of the structure and recognition of this signalling molecule.

Recent structure determination of PARG from bacterial, protozoan and mammalian sources9–12 revealed that these enzymes essentially consist of a macrodomain ADP-ribose-binding module13, elaborated upon through insertion of the highly conserved and PARG-specific catalytic loop (containing the key catalytic residues), and both N- and C-terminal extensions. Canonical eukaryotic PARGs are more complex when compared with their prokaryotic counterparts. In terms of size, canonical PARGs are bigger and contain a significantly larger accessory domain to the aforementioned, essential substrate-binding macrodomain (Supplementary Fig. S1). Furthermore, the bacterial type PARGs have recently been shown to act solely in an exo-glycohydrolase mode9, whereas canonical PARGs have been reported to exhibit both endo- and exo-glycohydrolase activities14. PARG is unable to cleave the ester bond linking the proximal ADP-ribose unit directly to proteins9. Although the enzymatic activity that catalyses this step of PAR catabolism has been detected in mammalian cell extracts 30 years ago, the proteins responsible remained unknown until very recently15–18. The deficiency in the proteins demodifying mono(ADP-ribosyl)ated PARP substrates in cells leads to a severe neurodegeneration disease in humans16,19.

To provide detailed understanding of the inherent endo- and exo-glycohydrolase activities of canonical PARGs, we set out to determine the structure of an inactive PARG in complex with the PAR substrate. Crystal structure determination of the E256Q Tetrahymena thermophila PARG in complex with PAR fragments surprisingly yielded only exo-glycohydrolase-mode complexes. The inherent preference for exo- (as opposed to endo-) binding is linked to the presence of the conserved residue Phe902 (human numbering), and this is confirmed by detailed studies of product profiles for canonical PARGs. We conclude that, in contrast to bacterial PARG, canonical PARGs can act as endo-glycohydrolases. The balance between exo- and endo-activity is likely to be a function of the PARG/PAR ratio, offering the possibility that apoptotic PAR fragments are only formed at unusual PARG/PAR ratios.

Results

Crystal structure of the inactive TTPARG with PAR. To understand poly-ADP-ribose binding by canonical PARGs, we prepared homogenous PAR fragments. PAR was prepared by enzymatic synthesis using the activity of PARP Tankyrase1 as described20, which assures the production of biologically relevant polymers. The PAR fragments were resolved by anionic chromatography at single-nucleotide resolution, providing sufficient amounts of homogenous PAR fragments of defined lengths for the envisioned structural studies (Supplementary Fig. S2). Size identity of PAR fragments was confirmed by mass spectrometry.

The inactive E256Q TTPARG mutant was co-crystallized with a series of defined length PAR oligomers, ranging from a 6-mer to a 16-mer fragment. Crystals were obtained in a range of conditions, for each of the PAR polymers tested. In all cases, the same crystal packing was observed, regardless of PAR polymer size or mother liquor composition. No significant difference in the corresponding structures was observed, and we report the highest resolution structure obtained to 1.46 Å for a E256Q TTPARG in complex with a PAR9 (PDB code 4L2H). Compared with previously determined TTPARG structures10, changes in the TTPARG protein structure are minor and are probably due to the distinct crystal packing obtained for the PARG–PAR9 complex. The bound poly-ADP-ribose can clearly be discerned from the electron density, but only the two terminal ADP-ribose units are visible (Fig. 1a). The terminal ADP-ribose moiety occupies a position similar to that observed for the product ADP-ribose molecule. The ribose–ribose O-glycosidic linkage between both ADP-riboses is clearly visible and is positioned in close proximity of the Gln256 side chain (Fig. 1b). In contrast to the interactions between Asn250 and Glu255 and the terminal ribose′, no direct interactions are observed between protein and the n-1 ribose′. The 3-OH group of the latter is within hydrogen bonding distance of the terminal n ribose′ 3-OH (in turn bound by Asn240) and a water molecule (referred to as W2). The W2 is bound by the Gly246 amide nitrogen and the n ribose′ 2-OH group (bound to Glu255). The n-1 adenosine is sandwiched between Val253 and water molecules that interact with Arg164. Direct hydrogen bonding interactions between the Leu252 amide nitrogen and the n-1 adenosine N11, as well as the Ser297 side chain and the n-1 adenosine N10, can also be observed. A few polar interactions occur between the protein and the n-1 diphosphate, including the amide nitrogen backbone, and side chain of Asn250. As a consequence, starting with the n-1 α-phosphate towards the N-2 ADP-ribose group of the PAR chain, electron density rapidly becomes weaker, signifying high levels of flexibility in the bound PAR from the n-1 α-phosphate onwards. Beyond the n-1 β-phosphate, electron density is reduced to background level, corresponding to the large crystal solvent channel that lines the TTPARG active-site region (Supplementary Fig. S3). The latter region is sufficiently large to contain the remaining disordered PAR section.

Comparison of the TTPARG–PAR9 structure with the recently published human PARG structure12 reveals that the majority of PARG–PAR interactions are conserved across eukaryotic PARGs (Supplementary Fig. S4). The few differences that are observed are located in the vicinity of the n-1 ADP-ribose unit, providing further evidence that this region does not significantly contribute to PAR binding (Fig. 2). Indeed, mutations of residues implicated in the binding of the n-1 ADP-ribose suggested by the TTPARG–PAR crystal structure, or, by analogy, a human PARG–PAR model, have little to no effect on PARG catalytic activity (Fig. 2). In contrast, mutations of residues implicated in the binding of the n adenosine moiety in humans and TTPARGs significantly diminish activity, suggesting that analogous PAR binding mode also occurs in human PARG.

PARG binds PAR in an exo-glycohydrolase binding mode. The observed binding mode corresponds to a PARG exo-glycohydrolase activity, and, given the fact that previous studies suggested potential PAR conformations corresponding to endo-glycohydrolase activity were compatible with canonical PARG structures10–12, comes as a surprise. The 2-OH group of the terminal ribose′ is indeed solvent exposed (Fig. 1a), suggesting that a further n + 1 unit could fit at the PARG surface. The crystal packing is also such that a large solvent channel runs along the PARG active-site surface, and is thus unlikely to influence the observed PAR conformation (Supplementary Fig. S3). We

[Figure 1 panel labels: (a) N-ribose 2OH, N-ribose 3OH, N-ribose′ 3OH and (N-1) ribose 3OH; (b) residues Y293, L226, E228, Y296, F398, R164, S297, V253, Q256, N250, E255, N240 and F371, and waters W1 and W2.]

Figure 1 | Crystal structure of a PARG–PAR complex. (a) Stereoview of the solvent-accessible surface of PARG in grey with the bound PAR at the active-site surface shown in atom-coloured sticks. The 2Fo-Fc electron density corresponding to the ordered region of PAR is shown in a blue mesh (contour level 1 sigma). (b) Stereoview of the PARG active site. Residues involved in direct contacts with the PAR ligand are shown in atom-coloured sticks. The mutated Glu256 is shown with green rather than with light blue carbons. Hydrogen bonds between ligand and protein or structural waters are indicated by dotted lines.

re-evaluated the modelling of a PAR trimer using the new TTPARG–PAR structure and compared the average architectures of an exo-glycohydrolase and an endo-glycohydrolase binding mode during molecular dynamics (MD) simulations (Fig. 3a). These reveal that although binding of an additional n + 1 unit is indeed possible, this requires a reorientation of the n ribose′ to allow for the presence of the additional n + 1 ribose′ unit and avoid steric clashes with conserved Phe398 (Fig. 3b). This in turn leads to a reorientation of the n adenosine, which is no longer able to maintain a series of polar interactions with the side chain of Glu228 and the amide nitrogen of Ile227. In addition, a water-mediated interaction with Lys365 and Cys396 backbone atoms is also disrupted. The disruption of complementarity between the PARG active site and the n adenosine provides a likely explanation for the observed preference to bind the PAR terminus (Table 1). Given the fact that only an exo-glycohydrolase conformation was observed for different PAR lengths, with the longest oligomer reaching 15 ADP-ribose units (which contains only 1 terminal residue, compared with 13 intermediate positions), this preference can be estimated to be at least 100-fold.

PARG acts mainly as an exo-glycohydrolase. The above observations suggest that correct positioning of the n adenosine moiety is indeed important for PARG (exo)glycohydrolase functionality. The mutation of Glu228 to Ala or Ile227 to Pro indeed significantly affects, but does not abolish, TTPARG activity (Fig. 2a).
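The position count quoted for the 15-unit oligomer (1 terminal residue against 13 intermediate positions) can be reproduced with a short sketch. This is an illustration only, written in Python under two assumptions that the text does not spell out: the chain is treated as linear (unbranched), and the protein-proximal unit is counted separately from both the terminal and the intermediate positions. The function names are hypothetical, and the 100-fold figure is the authors' own estimate, which is not derived here.

```python
def position_counts(n_units: int) -> dict:
    """Classify the ADP-ribose units of a linear PAR chain by position.

    Assumptions: no branching; one distal terminus (exo-mode site), one
    protein-proximal unit, and every remaining unit an intermediate
    (potential endo-mode) position.
    """
    if n_units < 3:
        raise ValueError("need at least 3 units to have an intermediate position")
    return {"terminal": 1, "protein_proximal": 1, "intermediate": n_units - 2}


def unbiased_endo_to_exo_ratio(n_units: int) -> float:
    """Ratio of intermediate to terminal positions, i.e. the binding ratio
    expected if PARG had no positional preference at all."""
    counts = position_counts(n_units)
    return counts["intermediate"] / counts["terminal"]


if __name__ == "__main__":
    print(position_counts(15))
    print(unbiased_endo_to_exo_ratio(15))
```

With no positional preference, intermediate positions would outnumber the terminus 13 to 1 on a 15-unit chain, so observing only the exo-mode complex implies a preference well above that statistical baseline.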

[Figure 2 image. Panel a: % PARG activity for T. thermophila PARG (WT, R164K, R164A, I227P, E228A, V253I, E256Q, F398G, no enzyme) and human PARG (WT, R671K, R671A, I726P, E727A, V753I, E756Q, F902G, no enzyme). Panel b: poly(ADP-ribosyl)ated PARP1 at increasing ratios of BactPARG E115Q to TTPARG or human PARG. Panel c: % total reaction product (ADPR, ADPR2, ADPR3, ADPR4) for WT and F902G human PARG and for WT and F398G T. thermophila PARG.]

Figure 2 | PARG mutagenesis and PAR termini protection by bactPARG confirms the PARG–PAR model. (a) Activity of the T. thermophila PARG WT and mutants and the corresponding human PARG mutants. Error bars represent s.d. (n = 3) (*P < 0.05; **P < 0.01; ***P < 0.001) obtained using paired t-test. (b) Inhibition of glycohydrolase activity for both human and T. thermophila PARGs using PARP1-generated PAR substrate with bactPARG Glu115Gln. (c) Distribution of exo- and endo-glycohydrolase products obtained after treatment with WT and mutant human and T. thermophila PARGs, as determined by LC/MS (see also Supplementary Fig. S6).
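The significance marks in Figure 2a come from a paired t-test on triplicate measurements. The sketch below shows how such a test can be computed for exactly three pairs, where the Student t distribution with 2 degrees of freedom has a closed-form tail probability. The activity values are hypothetical placeholders, not data from this study, and the helper names are illustrative only.

```python
import math
import statistics


def paired_t_test_n3(a, b):
    """Two-sided paired t-test for exactly three pairs (2 degrees of freedom).

    For 2 d.f. the t distribution has the closed-form CDF
    F(t) = 1/2 + t / (2 * sqrt(t**2 + 2)), so no statistics library is needed.
    """
    if len(a) != 3 or len(b) != 3:
        raise ValueError("this sketch only handles n = 3 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance")
    t = statistics.mean(diffs) / (sd / math.sqrt(3))
    p = 1.0 - abs(t) / math.sqrt(t * t + 2.0)
    return t, p


def stars(p):
    """Map a P value onto the star notation used in the figure legend."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""


if __name__ == "__main__":
    # Hypothetical % activity values for three paired replicates.
    wt = [100.0, 97.0, 104.0]
    mutant = [42.0, 37.0, 47.0]
    t, p = paired_t_test_n3(wt, mutant)
    print(f"t = {t:.2f}, P = {p:.2g} {stars(p)}")
```

With only three replicates the test has little power, so a library routine that reports confidence intervals alongside the P value may be preferable in practice.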

This is to be expected for mutations that affect binding affinity but not catalysis, but could also reflect that a proportion of endo-glycohydrolase activity does occur. To further estimate the extent of PARG endo-glycohydrolase activity, we used a mutant bacterial PARG to inhibit the activities of canonical PARG proteins. Unlike eukaryotic PARGs, bacterial PARG shields the terminal n ribose’ from solvent access and was shown to exclusively operate as an exo-glycohydrolase9. The Glu115Gln bacterial PARG mutant (corresponding to the Glu256Gln in TTPARG) binds to the PARG reaction product ADP-ribose as efficiently as the wild-type (WT) protein, and it was added in increasing ratios with an aim to protect the substrate PAR termini, gradually allowing for (any residual) endo-glycohydrolase activity only to occur. However, complete inhibition of canonical PARG was observed at higher Glu115Gln concentrations, suggesting that all available PAR termini are protected from PARG exo-glycohydrolase activity (Fig. 2b). This suggests that canonical PARG endo-glycohydrolase activity is not comparable to exo-glycohydrolase activity under the conditions used. Detection of PARG endo-glycohydrolase activity has been historically challenging. To overcome this, we developed a sensitive liquid chromatography/mass spectrometry (LC/MS) method that allowed us to detect


[Figure 3 image. Panel a: PARG surface with Phe398 and the N+1, N and N-1 ADP-ribose units labelled. Panel b: stereoview with Leu226, Ile227, Glu228, Tyr293, Tyr296 and Phe398 labelled, together with the N exo, N endo and N+1 positions.]

Figure 3 | Model of the PARG–PAR complex in endo-glycohydrolase mode. (a) Solvent-accessible surface of PARG in grey (the position of Phe398 is shown in blue) with a PAR3 modelled in an endo-glycohydrolase position. Carbons belonging to the three individual ADP-ribose units are coloured distinctly. (b) Stereoview of an overlay of the modelled endo-glycohydrolase PAR3 (colour coded as in panel a) with the observed exo-glycohydrolase crystal structure. Residues implicated in steric hindrance with the additional N+1 unit are shown in VDW spheres (Phe398 and Leu226). The hydrogen network between the N adenosine and the protein/structural water observed for the exo-glycohydrolase binding mode is shown in dotted lines.
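The distance comparison reported in Table 1 can be summarized programmatically. The sketch below takes the three contacts and their values from Table 1 and reports how much each contact lengthens in the endo model. The 4.0 Å cutoff used to flag a contact as retained is an illustrative assumption, not a criterion stated in the text.

```python
# Mean distances in Å from Table 1: (E256Q crystal structure, exo model, endo model).
TABLE_1 = {
    "Y293(OH)-PAR3(N6)": (3.8, 3.5, 5.3),
    "E228(OE1)-PAR3(N6)": (3.0, 3.7, 5.1),
    "I227(N)-PAR3(N1)": (3.1, 3.5, 4.6),
}

# Assumed, illustrative cutoff for counting a polar contact as retained.
CONTACT_CUTOFF = 4.0


def summarize(table, cutoff):
    """Return (contact, endo minus exo in Å, exo retained?, endo retained?) per row."""
    rows = []
    for contact, (_crystal, exo, endo) in table.items():
        rows.append((contact, round(endo - exo, 1), exo <= cutoff, endo <= cutoff))
    return rows


if __name__ == "__main__":
    for contact, shift, exo_ok, endo_ok in summarize(TABLE_1, CONTACT_CUTOFF):
        print(f"{contact:<20} +{shift:.1f} Å  exo retained: {exo_ok}  endo retained: {endo_ok}")
```

With this cutoff, all three adenosine contacts are retained in the exo model and lost in the endo model, in line with the loss of complementarity described in the text.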

Table 1 | Observed variation of key distances across the MD simulations for a PARG–PAR3 model.

Distance (Å)            E256Q structure (PDB code 4L2H)    TTPARG PAR3 exo model    TTPARG PAR3 endo model
Y293(OH)–PAR3 (N6)      3.8                                3.5±0.1                  5.3±0.2
E228(OE1)–PAR3 (N6)     3.0                                3.7±0.3                  5.1±0.2
I227(N)–PAR3 (N1)       3.1                                3.5±0.2                  4.6±0.1

Average distances derived from MD simulations between protein residues and nitrogen atoms of the PAR adenosine moiety bound in the I227/E228/Y293 pocket.

very low levels of endo-glycohydrolase products. We produced long PAR chains to further increase the chance of endo-glycohydrolytic action. In the conditions used, the PARG reaction enriched the shorter endo products (probably owing to the coupling with exo reaction) and ADP-ribose oligomers of more than four repeats could not be detected (Fig. 2c). Bacterial Glu115Gln PARG is also able to inhibit human PARG activity, and our LC/MS data reveal a similar predisposition of the human PARG to generating the ADP-ribose exo-glycohydrolase product, suggesting that canonical PARG activity substrate preferences are conserved across the eukaryotic kingdoms. We generated a Phe398Gly mutation in TTPARG and the corresponding Phe902Gly mutation in human PARG to remove the steric clash that occurs with the poly(ADP-ribose) bound in the endo-glycohydrolase mode. In comparison with the WT human and TTPARG proteins, LC-MS analysis reveals an approximate four- and sevenfold increase of endo-glycohydrolase products, respectively (Fig. 2c), supporting the role of Phe398/Phe902 in balancing the relative endo/exo-activities of canonical PARGs.

A role for both conserved glutamates in catalysis. A model of the TTPARG–PAR Michaelis complex can be derived by positioning the Glu256 in a conformation as observed in the WT TTPARG–ADP-ribose complex (PDB ID: 4EPP). This reveals that Glu256 is within the hydrogen bonding distance of the O-glycosidic ribose–ribose bond, thus suggesting that Glu256 is protonated.
We performed pKa calculations for the free enzyme

and the substrate–TTPARG bound complex. Both Glu255 and Glu256 are predicted to be negatively charged in the free enzyme. Although the predicted pKa of Glu256 does not change substantially for the Glu255Ala mutant in the free enzyme, the charge coupling between both glutamic acid residues is much increased when the substrate is bound such that one proton is added to the pair. As is common for tightly coupled protonations, the precise location of the proton is more difficult to predict, but hydrogen bonding within the active site is consistent with the side chain of Glu256 picking up a proton in the enzyme–substrate complex. Calculations suggest that in the Glu255Ala mutant, where the charge coupling is removed, the Glu256 pKa reverts to a much lower value, and is less likely to pick up a proton. This implies that Glu256 becomes protonated by W2 on PAR binding, and establishes a clear role for Glu255 in the catalytic mechanism that extends beyond the binding of the n ribose’ 2-OH group (Fig. 4). The elevated pKa of Glu256 allows it to protonate the leaving group on concomitant formation of the oxocarbenium intermediate. The Glu256-mediated hydrolysis of this intermediate can then occur either by W1 (which is within the hydrogen bonding distance of the ribose’ C2) or by W2. Modelling studies indicate that the motion of W1 is severely restricted compared with W2, and thus it seems plausible that the latter is responsible for PAR hydrolysis, leading to the α-ribose product.

PARG does not hydrolyse mono-ADP-ribosylated substrates. Modelling of a glutamate linkage to the n ribose’, to mimic the protein-PAR linkage, leads to clashes of the glutamate carboxylate moiety with PARG Val253 and Ala370. Furthermore, the presence of non-catalytic domains of PARG (the accessory domain and the additional regulatory domain in mammalian PARGs) probably limits access to the active site for bulkier ADP-ribosylated protein substrates. It has recently been established that several macrodomain proteins namely MacroD1, MacroD2 and TARG1 (refs 16–18) can catalyse the removal of the final ADP-ribose moiety efficiently (Fig. 4b). A comparison of the MacroD1 and MacroD2 structures17,21 with the PAR-PARG complex (Supplementary Fig. S5) reveals that it does not contain the PARG-specific catalytic loop insert and associated additional domains, rendering the MacroD1/D2 active site significantly more accessible to the ADP-ribosylated protein. Furthermore, unlike for the hydrolysis of the O-glycosidic ribose–ribose linkage, the formation of the oxocarbenium intermediate in hydrolysis of the glutamate–ADP-ribose linkage does not require acid–base catalysis. We propose that MacroD1 and related proteins catalyse bond breakage by forcing the substrate ribose’ close to the α-phosphate, similar to what has been suggested for PARG. Although the MacroD1/D2 and PARG structures share similarities in the binding mode of the ADP-ribose module, the related non-catalytic MacroH2A.1.1 (ref. 22) has a distinct conformation for the ribose’, supporting the notion that enzymatic macrodomains achieve catalysis by inducing substrate strain. It has been suggested that in MacroD1/D2 a conserved structural water molecule, positioned between the ribose and the α-phosphate, could be activated by the latter. In view of the established pKa of α-phosphates of ~2 (ref. 21), this seems unlikely. In the absence of any obvious MacroD1/MacroD2-derived acid–base catalyst, it is possible that the Glu leaving group is involved in assisting hydrolysis of the oxocarbenium intermediate or that non-enzymatic hydrolysis of the unstable intermediate occurs.

Discussion
The TTPARG–PAR9 complex crystal structure represents the first visualization of PAR binding at atomic detail and encourages further structural studies aimed at understanding PAR-protein

[Figure 4 image. Panel a: reaction scheme showing E256/R1, F398/F306, F371/F272 and E255/D184 with the elevated pKa of the catalytic glutamate indicated. Panels b and c: blots of poly(ADP-ribosyl)ated and mono(ADP-ribosyl)ated PARP1 treated with no enzyme, HS MacroD1 WT, HS MacroD1 G270E, HS PARG WT and HS PARG E756Q.]

Figure 4 | A mechanism for poly-ADP-ribose hydrolysis. (a) A detailed mechanism for PARG and MacroD1 based on the PARG–PAR structure. R1 = pADPr for PARG, glutamate for MacroD1 and R2 = ADP. (b) PARG is unable to hydrolyse the terminal ADP-ribose–PARP1 bond. MacroD1 and human PARG activities on radioactively labelled poly(ADP-ribosyl)ated PARP1. MacroD1 WT and the catalytic mutant, G270E, as well as human PARG E756Q mutant, are unable to process PAR. (c) MacroD1 WT removes the terminal ADP-ribose group attached to mono(ADP-ribosyl)ated PARP1 E988Q substrate. In contrast, the MacroD1 catalytic mutant and human PARG do not exhibit the mono(ADP-ribosyl) activity.

interaction networks involved in genome stability. Supported by biochemical and modelling studies, the structural data indicate that canonical PARG is inherently predisposed to act as an exo-glycohydrolase owing to higher affinity for the PAR terminus, a consequence of the presence of the conserved phenylalanine residue (Phe398/902). Unlike the bacterial PARG, however, a (latent) low-affinity endo-glycohydrolase binding mode is possible, as confirmed by our LC-MS studies of the WT proteins. This suggests that, in vivo, the relative balance between exo- and endo-glycohydrolase activity will be a function of the PAR/PARG ratio. The latter mode of action is more likely in cases where PAR/PARG ratio is increased, for instance, when cells encounter extreme stress leading to excess PAR production and apoptosis, where released larger oligo-PAR fragments may act to further amplify the apoptotic signal23. We suggest that under normal physiological conditions ADP-ribose is the dominant PARG product, with the latent endo-glycohydrolase activity of PARG activated by increased PAR/PARG ratios and/or increasing PAR chain lengths following cellular insult. The final step in complete removal of PAR requires hydrolysis of the glutamate–ADP-ribose linkage connecting PAR to the modified protein. This reaction is catalysed by MacroD family of proteins (in addition to TARG1), which do not contain the PARG-specific acid–base catalytic machinery, but make similar use of substrate strain to catalyse the formation of the oxocarbenium intermediate.

Table 2 | Crystallographic data and model refinement parameters.

                          TTPARG E256Q-PAR9
Data collection
  Space group             P212121
  Cell dimensions
    a, b, c (Å)           55.8, 75.6, 138.7
    α, β, γ (°)           90.0, 90.0, 90.0
  Resolution (Å)          30 (1.55–1.46)
  Rmeas                   7.9 (82.5)
  I/σI                    13.91 (2.11)
  Completeness (%)        99.4 (97.7)
  Redundancy              5.46
Refinement
  Resolution (Å)          30–1.46
  No. of reflections      103,366
  Rwork/Rfree             13.8/17.7 (24.2/28.9)
  No. atoms
    Protein               7,392
    Ligand                92
    Water                 386
  B-factors (Å2)
    Protein               14.7
    Ligand                16.6
    Water                 27.0
  R.m.s. deviations
    Bond lengths (Å)      0.026
    Bond angles (°)       2.067

Values in parentheses indicate values obtained for the highest resolution shell.

Methods
Plasmids and proteins. The WT and mutant T. thermophila PARG2 (TTHERM_00294690) were expressed from the pET28a vector (Novagen). Human PARG with an N-terminal truncation (Δ1–455) served as a reference for the WT PARG activity, and the mutant PARG constructs were expressed from the pColdTF vector (Takara). All proteins bear an N-terminal his-tag. For crystallization studies, T. thermophila E256Q mutant was purified by fast protein liquid chromatography (FPLC) on a HisTrap HP column (GE Healthcare), followed by size-exclusion chromatography using HiLoad 16/60 Superdex 200 column. Mutations were introduced using the QuickChange II Site-Directed Mutagenesis kit (Stratagene).

PARG activity assays. For western blot analysis of PARG activity, PAR was synthesized by the automodification of PARP1 in a reaction mixture containing 2 units of PARP1 (Trevigen), 200 mM NAD (Trevigen), activated DNA (Trevigen), 50 mM Tris (pH 7.5) and 50 mM NaCl at room temperature. Reactions were stopped after 30 min by the addition of the PARP inhibitor KU-0058948 and contain 5 mM of PAR. In mutational studies, either 80 nM human NTR1 or 7 nM T. thermophila WT and mutant PARGs were added to the reactions and incubated for another 30 min. In PARG inhibition studies, a preincubation step with T. curvata E115Q (ref. 9) was included. In the latter assays, the reaction mix was incubated with increasing amounts (0.05, 0.1, 0.5, 0.75, 1, 2, 5 and 10 mM) of T. curvata E115Q PARG for 5 min before the addition of 20 nM human full-length or 30 nM T. thermophila PARGs, followed by a subsequent 15-min incubation at room temperature. All reactions were run on 4–12% SDS–PAGE gels and blotted onto a nitrocellulose membrane. PAR hydrolysis was visualized by rabbit polyclonal anti-PAR antibodies (Trevigen; 1:1,000 dilution). Western blots were analysed densitometrically by GeneTools (SynGene), followed by statistical analysis using a paired t-test.

PAR isolation and analytical verification. PAR biosynthesized by tankyrase 1 (1,093–1,327) in the presence of histones were detached with potassium hydroxide and purified by means of the dihydroxy boronyl column. The resulting bulk polymers were fractionated by anion exchange chromatography and desalted to yield PAR of defined lengths20.

Crystallization and structure solution. A concentration of 15.5 mg ml−1 TTPARG E256Q was incubated with a series of purified PAR fragments at 1 mM (hexamer, heptamer, decamer, tetradecamer and pentadecamer were used) before setting down sitting-drop vapour diffusion trays. Crystals were observed for all PAR fragments tested under a range of conditions. Crystals were flash-cooled in liquid nitrogen, and diffraction data were collected at the Diamond Lightsource (UK), reduced and scaled using X-ray Detector Software (XDS)24. The highest resolution data were obtained for a PAR9 complex, which was obtained in 0.1 M HEPES, pH 7.0, and 30% v/v Jeffamine ED-2001 at 4 °C. The structure was solved by molecular replacement using the WT TTPARG structure10 (PDB ID: 4EPP), and final refinement statistics are given in Table 2.

Molecular dynamics simulations. Topology and parameter files for the PAR trimer were obtained using the Antechamber program25 with AM1-BCC charges26. The PARG–ligand complex model was placed in a periodic water box (TIP3) and neutralized by adding Na+ ions. This complex was equilibrated with several cycles of minimizations (steepest descent, 10,000 steps) and MD simulations (50 K, 20 ps) with the protein atoms fixed. MD simulations were performed (310 K, 0.5 ns) at a time step of 2 ps, with the protein backbone restrained to the X-ray structure conformation. Individual snapshots showing the PARG–ligand interactions were extracted from the last 100-ps simulation, and minimized (10,000 steps, steepest descent). The overall r.m.s.d. of the PARendo–PARG model with the X-ray coordinates was 0.43 Å for all Cα atoms.

pKa calculations. Calculations of pKa values were made with a combined Finite Difference Poisson–Boltzmann and Debye–Hückel method, termed FDDH27,28. Relative dielectric values of 78.4 (water) and 4 (protein, Finite Difference Poisson–Boltzmann method) were assigned, with an ionic strength of 0.15 M. Calculations were made for TTPARG with and without the bound PAR substrate.

Ultrahigh-performance liquid chromatography coupled to quadrupole time-of-flight mass spectrometry. The poly(ADP-ribosyl)ated PARP1 was treated with PARG, and the mixture was filtered using centricons (30 kDa cutoff). The analysis of the filtrate was performed using a modified procedure by Coulier et al.29, as described earlier in Dunstan et al.10 Briefly, all analyses were performed using a Waters Acquity ultrahigh-performance liquid chromatography system (Waters, Milford, MA, USA), equipped with a binary solvent delivery system and autosampler. The chromatographic separations used a column (100 mm × 2.1 mm) filled with a 1.7 μm BEH C18 stationary phase (Waters, Milford, MA, USA). Binary gradients at a flow rate of 0.4 ml min−1 were applied for the elution. The eluent A was water containing 5 mmol l−1 of pentylamine, and the pH value was adjusted to 6.5 using acetic acid, whereas the eluent B was acetonitrile. A fast elution gradient was applied, starting with 2% B, and then the percentage of B linearly increased to 25% in 5 min, followed by an isocratic hold until 10 min.

The mass spectrometry was performed on a quadrupole time-of-flight Premier instrument (Waters Micromass, Manchester, UK) using an orthogonal Z-spray–electrospray interface. The instrument was operated in V mode, with time-of-flight mass spectrometry data being collected between m/z 100 and 2,000, applying collision energy of 4 eV. All acquisitions were carried out using an independent reference spray via the lock spray interface, whereas leucine enkephalin was applied as a lock mass in negative ionization mode (m/z 554.2615).
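For orientation, the m/z values expected for deprotonated ADP-ribose oligomers in negative ionization mode can be estimated from elemental composition. The sketch below assumes that ADP-ribose is C15H23N5O14P2, that each ribose–ribose linkage removes one water, and that each negative charge corresponds to the loss of one hydrogen atom mass (electron mass neglected). That last convention is an assumption, chosen because it appears to reproduce the target values quoted in this Methods section (558.0639, 1,099.125, 819.589 and 1,090.120) to within about a millidalton.

```python
# Monoisotopic atomic masses (u).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146, "P": 30.9737620}

# Elemental composition of free ADP-ribose (C15H23N5O14P2).
ADPR = {"C": 15, "H": 23, "N": 5, "O": 14, "P": 2}


def mono_mass(formula):
    """Monoisotopic mass of a composition given as {element: count}."""
    return sum(MONO[element] * count for element, count in formula.items())


def oligomer_mz(n_units, charge):
    """m/z of an (ADP-ribose)n ion carrying `charge` negative charges.

    Assumptions: n - 1 waters are lost on forming the glycosidic bonds, and
    each charge removes one hydrogen atom mass (electron mass neglected).
    """
    if n_units < 1 or charge < 1:
        raise ValueError("n_units and charge must be positive integers")
    water = 2 * MONO["H"] + MONO["O"]
    neutral = n_units * mono_mass(ADPR) - (n_units - 1) * water
    return (neutral - charge * MONO["H"]) / charge


if __name__ == "__main__":
    # Species and charge states named in the text.
    for label, n, z in [("ADPR", 1, 1), ("ADPR2", 2, 1), ("ADPR3", 3, 2), ("ADPR4", 4, 2)]:
        print(f"{label:<6} z = -{z}  m/z = {oligomer_mz(n, z):.4f}")
```

Using the proton mass instead of the hydrogen atom mass shifts each value by roughly 0.5 mDa per charge, which is far inside the 50 mDa extraction window.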


The reaction products, expected to occur in the reaction mixture, in particular ADP-ribose and oligomers of poly-ADP-ribose with up to five ADP-ribose units, as well as NAD+, were searched for using extracted ion chromatograms. The target analyses were based on the characteristic accurate m/z ratios applying a mass window of 50 mDa. As the information on the electrospray ionization mass spectra for oligomeric ADP-ribose species is only scarcely available in the literature30, the recorded total ion current (TIC) chromatograms were systematically examined using specific masses of possible mono- and multiply charged ions of individual oligomers. The optimal responses were obtained using m/z ratios of 558.0639, 1,099.125, 819.589 and 1,090.120 for deprotonated ADP-ribose, deprotonated ADPR dimer, doubly charged deprotonated ADPR trimer and doubly charged deprotonated ADPR tetramer, respectively. The relative contributions of the individual oligomers were determined assuming that they had the same responses as ADPR.

References
1. Pears, C.J. et al. The role of ADP-ribosylation in regulating DNA double-strand break repair. Cell Cycle 11, 48–56 (2012).
2. Beneke, S. Regulation of chromatin structure by poly(ADP-ribosyl)ation. Front. Genet. 3, 169 (2012).
3. Gibson, B.A. & Kraus, W.L. New insights into the molecular and cellular functions of poly(ADP-ribose) and PARPs. Nat. Rev. Mol. Cell Biol. 13, 411–424 (2012).
4. Kalisch, T., Amé, J.-C., Dantzer, F. & Schreiber, V. New readers and interpretations of poly(ADP-ribosyl)ation. Trends Biochem. Sci. 37, 381–390 (2012).
5. D’Amours, D., Desnoyers, S., D’Silva, I. & Poirier, G.G. Poly(ADP-ribosyl)ation reactions in the regulation of nuclear functions. Biochem. J. 342, 249–268 (1999).
6. Koh, D.W. et al. Failure to degrade poly(ADP-ribose) causes increased sensitivity to cytotoxicity and early embryonic lethality. Proc. Natl Acad. Sci. USA 101, 17699–17704 (2004).
7. Hanai, S. et al. Loss of poly(ADP-ribose) glycohydrolase causes progressive neurodegeneration in Drosophila melanogaster. Proc. Natl Acad. Sci. USA 101, 82–86 (2004).
8. Heeres, J.T. & Hergenrother, P.J. Poly(ADP-ribose) makes a date with death. Curr. Opin. Chem. Biol. 11, 644–653 (2007).
9. Slade, D. et al. The structure and catalytic mechanism of a poly(ADP-ribose) glycohydrolase. Nature 477, 616–620 (2011).
10. Dunstan, M.S. et al. Structure and mechanism of a canonical poly(ADP-ribose) glycohydrolase. Nat. Commun. 3, 878 (2012).
11. Kim, I-K. et al. Structure of mammalian poly(ADP-ribose) glycohydrolase reveals a flexible tyrosine clasp as a substrate-binding element. Nat. Struct. Mol. Biol. 19, 653–656 (2012).
12. Tucker, J.A. et al. Structures of the human poly(ADP-ribose) glycohydrolase catalytic domain confirm catalytic mechanism and explain inhibition by ADP-HPD derivatives. PLoS One 7(12): e50889 (2012).
13. Karras, G.I. et al. The macro domain is an ADP-ribose binding module. EMBO J. 24, 1911–1920 (2005).
14. Hatakeyama, K., Nemoto, Y., Ueda, K. & Hayaishi, O. Purification and characterization of poly(ADP-ribose) glycohydrolase. Different modes of action on large and small poly(ADP-ribose). J. Biol. Chem. 261, 14902–14911 (1986).
15. Oka, J., Ueda, K., Hayaishi, O., Komura, H. & Nakanishi, K. ADP-ribosyl protein lyase. Purification, properties, and identification of the product. J. Biol. Chem. 259, 986–995 (1984).
16. Sharifi, R. et al. Deficiency of terminal ADP-ribose protein glycohydrolase TARG1/C6orf130 in neurodegenerative disease. EMBO J. 32, 1225–1237 (2013).
17. Jankevicius, G. et al. A family of macrodomain proteins reverses cellular mono-ADP-ribosylation. Nat. Struct. Mol. Biol. 20, 508–514 (2013).
18. Rosenthal, F. et al. Macrodomain-containing proteins are new mono-ADP-ribosylhydrolases. Nat. Struct. Mol. Biol. 20, 502–507 (2013).
19. Williams, J.C., Chambers, J.P. & Liehr, J.G. Glutamyl ribose 5-phosphate storage disease. A hereditary defect in the degradation of poly(ADP-ribosylated) proteins. J. Biol. Chem. 259, 1037–1042 (1984).
20. Tan, E.S., Krukenberg, K.A. & Mitchison, T.J. Large-scale preparation and characterization of poly(ADP-ribose) and defined length polymers. Anal. Biochem. 428, 126–136 (2012).
21. Chen, D. et al. Identification of macrodomain proteins as novel O-acetyl-ADP-ribose deacetylases. J. Biol. Chem. 286, 13261–13271 (2011).
22. Kustatscher, G., Hothorn, M., Pugieux, C., Scheffzek, K. & Ladurner, A.G. Splicing regulates NAD metabolite binding to histone macroH2A. Nat. Struct. Mol. Biol. 12, 624–625 (2005).
23. Andrabi, S.A. et al. Poly(ADP-ribose) (PAR) polymer is a death signal. Proc. Natl Acad. Sci. USA 103, 18308–18313 (2006).
24. Kabsch, W. Evaluation of single-crystal X-ray diffraction data from a position-sensitive detector. J. Appl. Cryst. 21, 916–924 (1988).
25. Cornell, W.D. et al. A second generation force field for the simulation of proteins, nucleic acids, and organic molecules. J. Am. Chem. Soc. 117, 5179–5197 (1995).
26. Jakalian, A., Jack, D.B. & Bayly, C.I. Fast, efficient generation of high-quality atomic charges. AM1-BCC model: II. Parameterization and validation. J. Comput. Chem. 23, 1623–1641 (2002).
27. Warwicker, J. pKa predictions with a coupled finite difference Poisson-Boltzmann and Debye-Hückel method. Proteins 79, 3374–3380 (2011).
28. Warwicker, J. Improved pKa calculations through flexibility based sampling of a water-dominated interaction scheme. Protein Sci. 13, 2793–2805 (2004).
29. Coulier, L. et al. Simultaneous quantitative analysis of metabolites using ion-pair liquid chromatography-electrospray ionization mass spectrometry. Anal. Chem. 78, 6573–6582 (2006).
30. Morrison, A.R. et al. ART2, a T cell surface mono-ADP-ribosyltransferase, generates extracellular poly(ADP-ribose). J. Biol. Chem. 281, 33363–33372 (2006).

Acknowledgements
This work was supported by a BBSRC studentship to A.B., a postdoctoral fellowship from the American Cancer Society (116420-PF-09-024-01-CCG) to E.S.T. and a grant from the National Institutes of Health (NCI PO1 grant CA139980) to T.J.M. The work in I.A. laboratory has been supported by Cancer Research UK and European Research Council. We are grateful to R. Morra for providing reagents. Access to Diamond beamline IO3 is gratefully acknowledged.

Author contributions
E.B. performed biochemical and in vitro experiments, prepared proteins and performed crystallization. E.S.T. and T.J.M. prepared and purified PAR fragments. A.B. prepared proteins and performed crystallization and structural studies assisted by M.S.D. P.L. performed molecular modelling studies. J.W. performed pKa calculations. M.A. performed LC-MS studies. B.B. performed supporting studies. I.A. and D.L. wrote the manuscript, designed experiments and analysed data.

Additional information
Accession codes: Coordinates and structure factors for catalytically inactive PARG in complex with a poly(ADP-ribose) fragment have been deposited in the protein data bank under PDB accession code 4L2H.

Supplementary Information accompanies this paper at www.nature.com/naturecommunications.

Competing financial interests: The authors declare no competing financial interests.

Reprint and permissions information is available at http://npg.nature.com/reprintsandpermissions/

How to cite this article: Barkauskaite, E. et al. Visualization of poly(ADP-ribose) bound to PARG reveals inherent balance between exo- and endo-glycohydrolase activities. Nat. Commun. 4:2164 doi: 10.1038/ncomms3164 (2013).

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/


Received 3 Feb 2012 | Accepted 4 May 2012 | Published 6 Jun 2012 DOI: 10.1038/ncomms1889 Structure and mechanism of a canonical poly(ADP-ribose) glycohydrolase

Mark S. Dunstan1,*, Eva Barkauskaite2,*, Pierre Lafite3, Claire E. Knezevic4, Amy Brassington1, Marijan Ahel5, Paul J. Hergenrother4, David Leys1 & Ivan Ahel2

Poly(ADP-ribosyl)ation is a reversible post-translational protein modification involved in the regulation of a number of cellular processes including DNA repair, chromatin structure, mitosis, transcription, checkpoint activation, apoptosis and asexual development. The reversion of poly(ADP-ribosyl)ation is catalysed by poly(ADP-ribose) (PAR) glycohydrolase (PARG), which specifically targets the unique PAR (1′′-2′) ribose–ribose bonds. Here we report the structure and mechanism of the first canonical PARG from the protozoan Tetrahymena thermophila. In addition, we reveal the structure of T. thermophila PARG in a complex with a novel rhodanine-containing mammalian PARG inhibitor RBPI-3. Our data demonstrate that the protozoan PARG represents a good model for human PARG and is therefore likely to prove useful in guiding structure-based discovery of new classes of PARG inhibitors.

1 Manchester Interdisciplinary Biocentre, Princess Street 131, M1 7DN, Manchester, UK. 2 Cancer Research UK, Paterson Institute for Cancer Research, University of Manchester, Wilmslow Road, Manchester M20 4BX, UK. 3 ICOA–UMR CNRS 7311 Université d’Orléans Rue de Chartres, F-45067 Orléans, France. 4 University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA. 5 Rudjer Boskovic Institute, Bijenicka 54, HR-10000 Zagreb, Croatia. *These authors contributed equally to this work. Correspondence and requests for materials should be addressed to D.L. (email: [email protected]) or to I.A. (email: [email protected]). nature communications | 3:878 | DOI: 10.1038/ncomms1889 | www.nature.com/naturecommunications  © 2012 Macmillan Publishers Limited. All rights reserved. ARTICLE nature communications | DOI: 10.1038/ncomms1889

PAR is synthesized by the PARP family of enzymes in a reaction that utilizes NAD as a substrate. In contrast to the abundance of structural and functional data describing PAR synthesis1–4, our present understanding of the PAR-degradation pathway is comparatively poor5,6. Canonical PARG is a highly conserved protein found in organisms ranging from protozoa to humans7. The disruption of the PARG gene in mouse and Drosophila melanogaster leads to early embryonic lethality8,9. A highly diverged PARG (bacterial-type) is present in filamentous fungi and a number of bacterial species10. The recently solved structure of a bacterial PARG revealed that the catalytic centre is essentially a macrodomain with a loop region inserted that contains the PARG signature sequence (GGG-X6–8-QEE)5,10. Macrodomains are evolutionarily conserved ADP-ribose-binding modules that often mediate PAR signalling11–13. The bacterial PARG structures combined with biochemical studies established a role for the PARG signature residues in substrate binding and catalysis10. Despite this progress, structural information on the canonical PARGs is still lacking. Recent studies identified a minimal region in human PARG required for catalytic activity in vitro, which extends beyond the macrodomain alone14. This amino-terminal extension contains a short motif called the Regulatory Segment/MTS (RS/MTS) that was suggested to be essential for PARG activity14. Additionally, a conserved tyrosine only present in canonical-type PARGs (Y795 in human PARG) was suggested to be involved in PAR binding15. To investigate the mechanism and structure of canonical-type PARGs, we screened for a crystallizable homologue of the human PARG catalytic core, and identified a PARG from the ciliated protozoan Tetrahymena thermophila (TTPARG). Our structures of TTPARG in complex with ADP-ribose and the mammalian PARG inhibitor RBPI-3 combined with solution studies reveal details of the substrate recognition and the reaction mechanism of canonical PARGs.

[Figure 1 image. Panel a: domain organisation (regulatory region, RS/MTS, minimal catalytic region, macrodomain) of canonical-type PARG in vertebrates and most of the eukaryotes, and of bacterial-type PARG in filamentous fungi, Trichomonas, Adineta, Naegleria and a scattering of bacteria. Panel b: anti-PAR blot of poly(ADP-ribosyl)ated PARP1 (191 kDa and 97 kDa markers) treated with no enzyme, TTPARG, TTPARG E255A, TTPARG E256A, HS PARG, HS PARG E755A and HS PARG E756A.]

Figure 1 | Phylogenetic distribution and the activity of PARG enzymes. (a) The domain structure of canonical and bacterial type PARGs. (b) Poly(ADP-ribose) hydrolytic activities of Tetrahymena thermophila and human PARGs. The activity is dependent on the integrity of catalytic glutamate residues.

Results
The structure of a canonical PARG from T. thermophila. The full-length TTPARG is highly similar to the minimal catalytic region of human PARG, but it lacks the obvious RS/MTS motif (Fig. 1a; Supplementary Fig. S1). Nonetheless, the TTPARG binds ADP-ribose (Supplementary Fig. S2) and displays high PAR glycohydrolase activity. This activity is dependent on the conserved glutamate residues as observed for other PARGs (Fig. 1b). The structure was solved using a Se-Met MAD-based approach and refined against 1.95 Å data (Table 1; Supplementary Fig. S3). The TTPARG structure (Fig. 2) consists of a macrodomain sandwiched between a large N-terminal accessory domain and a smaller carboxy-terminal extension. The macrodomain itself is most similar in structure to the recently determined bacterial PARG (bactPARG) structure (Fig. 2b; PDB code 3SIG Z score 15.4), with which it shares the PARG-specific loop. In both bacterial and TTPARG, the conformation of the glycine-rich PARG catalytic loop seems to be stabilized by contacts with residues derived from an N-terminal extension to the macrodomain. In bactPARG, this extension consists of a few α-helices (Fig. 2b). In contrast, the N-terminal accessory domain of the TTPARG accounts for approximately half of the enzyme structure and is predominantly α-helical in nature, with the exception of a few β-strands that form an extension to

Table 1 | Data collection and refinement statistics.

                        TTPARG-ADP ribose        TTPARG-RBPI3           TTPARG-SeMet peak
Data collection
Space group             P41                      P41                    P41
Cell dimensions
  a, b, c (Å)           112.70, 112.70, 88.60    80.68, 80.68, 89.42    81.18, 81.18, 89.32
  α, β, γ (°)           90.0, 90.0, 90.0         90.0, 90.0, 90.0       90.0, 90.0, 90.0
Resolution (Å)          27.5 (2.07-1.95)*        27.5 (2.55-2.4)        27.9 (2.55-3.0)
Rsym                    14.7 (54.1)              12.8 (85.9)            11.7 (55.9)
I/σI                    11.1 (2.98)              13.6 (2.03)            14.63 (2.76)
Completeness (%)        99.5 (97.2)              99.8 (99.4)            99.2 (96.1)
Redundancy              6.9 (6.9)                8.2 (8.1)              7.5 (7.5)

Refinement
Resolution (Å)          1.95                     2.4
No. reflections         158535                   22523
Rwork/Rfree             16.7/20.9                17.0/23.6
B-factors
  Protein               27.5                     59.8
  Ligand/ion            22.3                     38
  Water                 35.8                     48.2
R.m.s. deviations
  Bond lengths (Å)      0.007                    0.008
  Bond angles (°)       1.03                     1.16

*Values in parentheses are for highest resolution shell.

the macrodomain central β-sheet. This accessory domain is not related to any previously determined structures and seems unique to canonical PARGs. Given the sequence similarity between human and Tetrahymena PARGs (32% identity for residues 91–431), a plausible model (r.m.s.d. 0.51 Å) can be created accounting for the majority of the human PARG catalytic domain (residues 604–937) (Fig. 2c). We attempted to produce truncations in the N-terminal domain of TTPARG, which were chosen based on our structure (∆1-67 and ∆1-115), but both truncations dramatically affected expression and stability of the proteins in Escherichia coli. Similar behaviour was observed for homologous truncations of human PARG, demonstrating the importance of the N-terminal domain for canonical PARGs. Despite this, we obtained some soluble protein for the human PARG truncation ∆1-619 (corresponding to TTPARG ∆1-115), and this protein exhibited significant PARG activity in vitro (Supplementary Fig. S4), demonstrating that the RS/MTS motif14 and a large part of the N-terminal domain are not absolutely essential for the enzymatic activity of canonical PARGs.

nature communications | 3:878 | DOI: 10.1038/ncomms1889 | www.nature.com/naturecommunications © 2012 Macmillan Publishers Limited. All rights reserved.

The canonical PARG catalytic mechanism. The TTPARG active site is clearly identified by the presence of the ADP-ribose ligand (Fig. 3a). The ADP-moiety is bound by the macrodomain with the ribose group located in close proximity to the key catalytic residues provided by the PARG-specific catalytic loop. Our homology model reveals that near-identical contacts occur in the human PARG (Fig. 3b). The majority of the hydrogen bonds established with the ligand phosphates occur through main chain atoms. The adenosine base is stacked with Phe398 and forms direct hydrogen bonds with Ile227 and Glu228. Whereas the C2-hydroxyl group of the adenosine ribose is devoid of any protein-ligand interactions, a water-mediated network of interactions can be observed for the C3-hydroxyl group. This network includes the Y296 side chain, homologous to the hPARG Y795 previously implicated in the binding of an ADP-ribose analogue inhibitor of PARG15. Mutation of this residue leads to reduced PARG activity in vitro for both TTPARG and hPARG (Fig. 3c; Supplementary Fig. S5). Each of the ribose hydroxyl groups is directly hydrogen-bonded by a PARG residue, with C1, C2 and C3 hydroxyl groups bound by Glu256, Glu255 and Asn240, respectively. While mutation of the conserved Glu residues abolishes PARG enzyme activity, Asn240Ala (Asn740Ala in hPARG) mutants retain a low level of in vitro activity (Fig. 3c). Glu255 is in direct hydrogen-bonding contact with Asp237, which has been previously implicated in canonical PARG activity5. The ribose’ moiety is also in direct van der Waals contact with Phe371. This interaction is important for the activity, as Phe371Ala (Phe875Ala in hPARG) severely affects the ability of the enzyme to hydrolyse PAR (Fig. 3c).

A comparison with the previously determined bactPARG-ADP-ribose complex structure reveals significant similarity in enzyme-ligand interactions between both complexes (Supplementary Fig. S6). In both enzymes, the Gly-rich PARG catalytic loop conformation is stabilized through interactions with residues from an N-terminal extension to the macrodomain. The catalytic Glu residues bind the ribose C1 and C2 hydroxyl groups, with Glu256 (Glu756 in hPARG) presumably acting as the hydrogen donor to the leaving group. The Asp237–Glu255 pair could form a proton relay network to Glu256, mediated in part by the ADP-ribose hydroxyl groups. In bactPARG, Asn95 replaces the TTPARG Asp237, which suggests that the mechanism for PAR hydrolysis is subtly different between both enzymes, or that TTPARG Glu255 and/or bactPARG Glu114 are not directly involved in acid-base catalysis. Regardless, in both enzymes, the transient oxocarbenium intermediate seems to be stabilized by close contact with the adenosine diphosphate group, an interaction enforced by the presence of Phe371 (Phe227 in bactPARG). In both structures, a water molecule is positioned in close contact to the C1 carbon, with nucleophilic attack on C1 facilitated by Glu256 (Glu115 in bactPARG) acting as a base.

Figure 2 | Structure of canonical PARG enzyme from Tetrahymena thermophila. (a) The TTPARG-ADP-ribose complex structure. The TTPARG macrodomain is coloured in blue (residues 218–414), the catalytic loop in green (residues 244–256), the N-terminal accessory domain in red (residues 1–217), and the C-terminal extension in teal (residues 415–458). (b) A TTPARG/bactPARG structure overlay. TTPARG is coloured as in (a). The bactPARG structure is shown using cyan for the macrodomain, magenta for the N-terminal extension, and a dark green colour for the catalytic PARG loop. (c) Homology model for the human PARG catalytic domain, using residues 91–431 of TTPARG as a template. For clarity, regions not included in the model (due to lack of significant homology in those regions) are transparent.

Structure of a mammalian PARG inhibitor:TTPARG complex. Inhibition of mammalian PARG enzymes was recently shown for several related rhodanine-containing compounds16. Not unexpectedly, we were able to show that TTPARG is also inhibited by this class of inhibitors (Supplementary Fig. S7). The structure of the TTPARG-inhibitor complex obtained by co-crystallization reveals that the inhibitor RBPI-3 binds predominantly via a π–π stacking interaction with Tyr296 (implicated in binding of distinct ADP-ribose analogues15) and the conserved Phe398 (Fig. 3d). To accommodate the binding of RBPI-3, Phe398 moves into the adenosine binding pocket. The RBPI-3 carboxyl moiety occupies a region corresponding to the ADP-ribose alpha-phosphate group and H-bonds to main chain atoms of Lys365 and Gln254. The RBPI-3 di-chlorobenzyl moiety extends into the solvent and is disordered.

Poly(ADP-ribose) recognition by canonical PARGs. To further explore canonical PARG-PAR interactions, we modelled an

[Figure 3 panels a–f: structural views labelled with active-site residues of TTPARG and their human PARG counterparts, and bar charts of % PARG activity for wild-type and mutant TTPARG and human PARG.]

Figure 3 | Reaction mechanism of canonical PARGs. (a) Bound ADP-ribose is shown with electron density in cyan. For clarity, interactions with backbone are only indicated by dotted lines, with no representation of the backbone atoms. (b) Detailed view of the active site regions of the human PARG homology model (in grey), overlaid with the corresponding TTPARG structure. (c) Relative poly(ADP-ribose) hydrolytic activities of Tetrahymena thermophila (upper panel) and human (lower panel) PARG mutants. Error bars represent s.d. (n = 3). (d) Structure of the TTPARG-RBPI-3 complex. RBPI-3 is shown in atom-coloured sticks, with an ADP ribose overlaid in black. The omit 2Fo-Fc electron density corresponding to RBPI-3 is shown in the insert as a blue mesh contoured at 1.5σ. (e) A PARG-PAR4 model surface representation of TTPARG with 40 different PAR4 conformations derived from MD simulations. The PAR4 is shown in atom-coloured sticks with ADP-ribose N in grey carbons, N − 1 magenta carbons, N − 2 teal carbons and N + 1 in green carbons. (f) Detailed view of the PAR-binding site of a representative PARG-PAR4 model. The PAR4 is coloured as in (a), with key contacts between PARG-PAR4 displayed by black dotted lines. Cα atoms of Gly residues important in PAR4 binding are shown in spheres.

oligo(ADP-ribose)4 molecule in the TTPARG structure (Fig. 3e,f). The position and conformation of the n − 1, n − 2 and n + 1 ADP-ribose units (n being the ADP-ribose unit observed in the crystal structure) can be derived from linking these through an α(O1ribose’-O2ribose) O-glycosidic linkage and through the constraints imposed on the available space by the TTPARG structure. The model predicts that the n − 1 ribose unit is in van der Waals contact with both Gly246 and Gly251, whereas the n − 1 adenosine base is sandwiched against Val253 and Ala370. The n − 1 diphosphate group is located in close proximity to the conserved Gly246, with the α-phosphate making polar contacts with the amide nitrogens of Asn250 and Gly251. Molecular dynamics simulations suggest that the PAR ligand shows significant mobility from the n − 1 β-phosphate, with increasing conformational flexibility for the n − 1 ribose and n − 2 ADP-ribose unit (Fig. 3e). Mutation of Gly246 to Ser (G746S in hPARG) leads to a significant reduction in in vitro activity, while mutating the preceding Gly245 to Ala (G745A in hPARG) nearly completely abolishes activity (Fig. 3c). The Cα of the latter residue is in van der Waals contact with the n-ribose O2 group. This suggests that binding, and/or orientation, of the n − 1 ADP-ribose is less critical to activity when compared with binding of the n ADP-ribose. The n − 2 ADP-ribose is only contacted by residues from the N-terminal accessory domain, where sequence conservation is limited compared with those regions contacting the n − 1 and n units, and it displays significant conformational mobility (Fig. 3e).

The adenosine C2-hydroxyl group is solvent exposed in the TTPARG structure, in contrast to bactPARG, and modelling of an additional n + 1 ADP-ribose reveals that TTPARG can bind PAR at intermediate positions, and could thus display endo-glycohydrolase activity as suggested for canonical PARGs17. However, the endo-glycohydrolase activity has been very difficult to confirm, as it was suggested to be coupled with exo-glycohydrolase activity18. Thus far, we have not been able to experimentally demonstrate the endo-glycohydrolase activity for any of our PARG enzyme preparations and we consistently observed ADP-ribose as the only detectable product (Supplementary Fig. S8)10. Nonetheless, we tested some mutants at the n + 1 ADP-ribose binding site that, according to our PAR-PARG model, could specifically affect the endo-glycohydrolase function. For example, the ribose’ of the n + 1 ADP-ribose is positioned in close contact with Lys365, which can hydrogen-bond to both C2 and C3 hydroxyl groups. Interestingly, the mutation of Lys365 to Ala (Asn869Ala in hPARG) leads to diminished in vitro activity while retaining ADP-ribose-binding properties, as verified by thermal shift assays (Fig. 3c; Supplementary Fig. S9). We also analysed the effect of mutating Phe398 and Asp400 to tryptophan, which, based on our predictions, could affect the TTPARG endo-glycohydrolase function by blocking the binding to n + 1 ADP-ribose. We observed a substantially reduced overall PARG activity for the Asp400Trp TTPARG mutant only (Fig. 3c; Supplementary Fig. S9). Mutating the homologous residues in human PARG in both cases severely affected the activity of the protein. Collectively, our model suggests that canonical PARG-PAR specific contacts are more extensive than for bactPARG, but are still limited to the n and, to a lesser extent, n − 1 and n + 1 ADP-ribose units.
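The Results pair each TTPARG residue with its human PARG equivalent as the mutants are discussed. As a reading aid only, the pairs stated in the text can be collected into a small lookup table. This is a minimal sketch: the names TT_TO_HUMAN, human_equivalent and residue_type_changes are illustrative and not part of the original work, and only pairs named explicitly in the prose are included.

```python
# TTPARG -> human PARG residue pairs transcribed from the Results text.
# All identifiers here are hypothetical helpers for illustration.
TT_TO_HUMAN = {
    "Asn240": "Asn740",
    "Gly245": "Gly745",
    "Gly246": "Gly746",
    "Glu256": "Glu756",
    "Tyr296": "Tyr795",
    "Lys365": "Asn869",
    "Phe371": "Phe875",
}


def human_equivalent(tt_residue: str) -> str:
    """Return the human PARG residue that the text pairs with a TTPARG residue."""
    return TT_TO_HUMAN[tt_residue]


def residue_type_changes() -> list:
    """List the pairs whose amino-acid type differs between the two enzymes."""
    return [(tt, hs) for tt, hs in TT_TO_HUMAN.items() if tt[:3] != hs[:3]]


if __name__ == "__main__":
    print(human_equivalent("Phe371"))
    print(residue_type_changes())
```

Listing the pairs this way makes it easy to see that the numbering offset is not constant along the sequence, and that Lys365 in TTPARG corresponds to an asparagine in the human enzyme.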


Discussion

In summary, we reveal the first structure of a canonical PARG. Our data suggests that canonical PARG is similar in mechanism to bacterial-type PARGs, but displays some key differences, including additional PAR-PARG interactions (that is, Asn240, Tyr296 and Lys365) and the ability to bind PAR at intermediate positions. Targeting poly(ADP-ribosyl)ation in the therapy of human disease has attracted considerable attention over the past few years after it was demonstrated that permeable PARP inhibitors can be highly effective against hereditary breast and ovarian cancers19. Given the impact and the prevalence of these diseases, there has been a rapidly growing interest in PARG as an alternative target in human therapy. We believe that the novel PARG-inhibitor structure presented here could provide the groundwork for future studies that might lead to the development of small, cell-permeable PARG inhibitors.

Methods

Plasmids and proteins. The T. thermophila PARG2 (TTHERM_00294690) was synthesized, according to the database sequence (GenScript USA). The human PARG gene (Q86W56) was cloned from HeLa complementary DNA. T. thermophila PARG proteins were expressed from the pET28a vector (Novagen). Human PARG with an N-terminal truncation (∆1-455) served as a reference for the wild-type PARG activity and the mutant PARG constructs were expressed from the pColdTF vector (Takara). All proteins bear an N-terminal His tag. Human proteins bear the additional Trigger Factor chaperone tag that increases solubility. Mutations were introduced using the QuikChange II site-directed mutagenesis kit (Stratagene). Proteins were expressed in E. coli Rosetta2(DE3) cells (Novagen). Recombinant proteins were purified on Ni-NTA beads, according to standard procedure. For crystallization studies, the T. thermophila PARG2 was purified by FPLC on a HisTrap HP column followed by gel filtration, using a HiLoad 16/60 Superdex 200 column (GE Healthcare). Rhodanine-containing PARG inhibitor RBPI-3 was synthesized, as described16.

PARG activity assays. For western-blot analysis of PARG activity, poly(ADP-ribose) was synthesized by the automodification of PARP1 in a reaction mixture containing 2 units of PARP1 (Trevigen), 200 µM NAD (Trevigen), activated DNA (Trevigen), 50 mM Tris pH 7.5 and 50 mM NaCl at room temperature. Reactions were stopped after 30 min by the addition of the PARP inhibitor KU-0058948. In reactions, either 1 µM of both human and T. thermophila PARGs, or for mutational analysis, 400 nM human and 30 nM T. thermophila PARGs were added to the reaction and incubated for another 30 min. The reactions were run on 4–12% SDS–PAGE gels and blotted onto a nitrocellulose membrane. PAR hydrolysis was detected by rabbit polyclonal anti-PAR antibodies (Trevigen; 1:1,000 dilution). Western blots were analysed densitometrically by GeneTools (SynGene).

PARG inhibition assay. Inhibition of TTPARG by RBPI-3 was determined, as previously reported16, with the following modifications. Purified TTPARG was employed at a final concentration of 50 nM and reactions were incubated at 37 °C for 60 min before stopping by the addition of 1.0 µl 1% SDS.

Thermal shift assay. In each 50 µl reaction, 15 µl of 300× Sypro Orange (Sigma), 2 µl of water and 33 µl of protein (4 mg ml−1) in 25 mM Tris pH 7.5, 50 mM NaCl, 10% glycerol and 1 mM DTT were added to the wells of a 96-well thin-wall PCR plate (Bio-Rad). For thermal shift assays in the presence of ADP-ribose, 2 µl of 25 mM ADP-ribose was added instead of water. Samples were placed in a CFX96 Real-Time PCR thermal cycler (Bio-Rad) and slowly heated from 15 to 95 °C with sample fluorescence recorded every 0.2 °C. Fluorescence was monitored at 575 nm (emission), using 490 nm for excitation.

Protein expression and purification. Selenomethionine TTPARG was produced in the same Rosetta (DE3) cells as native TTPARG, but in the conditions where the methionine biosynthesis pathway is inhibited20. Briefly, 1 l of unlabelled TBGG (‘Terrific Broth’ supplemented with 1.0% glucose and 4% glycerol) was inoculated with 5 ml of an overnight culture grown in lysogeny broth (LB). The culture was grown to mid-log phase, pelleted and washed. The pellet was resuspended in warm minimal media and evenly distributed between 6 l of Se-Met supplemented media; Se-Met media contains abundant amounts of all amino acids known to inhibit methionine biosynthesis but no methionine (1.8 g of all amino acids). Finally, 50 mg of selenomethionine was added and the cultures allowed to recover for 90 min before being induced with 1 mM isopropyl-β-d-thiogalactoside overnight at 30 °C. The native and selenomethionine-labelled proteins were purified to homogeneity by HisTrap HP Ni2+-affinity chromatography followed by size exclusion chromatography on a HiLoad 16/60 Superdex 200 column. Samples were concentrated to ~15 mg ml−1 and 1:1 molar ratio of TTPARG to ADP-ribose was added before crystallization experiments.

Structure determination and refinement. Selenomethionine incorporated TTPARG crystals were obtained by sitting drop vapour diffusion in drops containing equal volumes of ADP-ribose complexed protein and a solution containing 0.15 M potassium thiocyanate, 0.1 M Tris pH 8.5 and 15% PEG 6 K. Data was obtained on beamline I04 at the Diamond Light Source and reduced and scaled with the X-ray Detector Software suite. The crystal structure of TTPARG was determined by single-wavelength anomalous diffraction and was phased with the program Solve21. The resulting map allowed a single low-resolution copy of TTPARG to be built with Buccaneer22. An additional high-resolution data set was collected, on a second crystal grown in conditions containing 0.2 M potassium bromide, 0.1 M Tris pH 7.5 and 15% PEG 4 K. The low-resolution model from Buccaneer was then used as a start model for molecular replacement and applied to the high-resolution data. The resulting MR solution contained two TTPARG molecules in the asymmetric unit (AU). The missing residues and loops were completed by iterative cycles of automated model building (ARP/wARP) and manual model building and real-space refinement, using the program COOT and crystallographic refinement using phenix.refine. Structure validation was performed with Molprobity. Co-crystallization with the RBPI-3 inhibitor was carried out by replacing ADP-ribose with RBPI-3. The processing and final refinement statistics are presented in Table 1. The thermodynamic ligand-binding properties of TTPARG and ADP-ribose were measured using a VP-ITC microcalorimeter. Protein and ligand concentrations were 20 µM and 200 µM, respectively, in 50 mM Tris (pH 7.5), 100 mM NaCl. Titration curves were fitted using a nonlinear least-squares method in Microcal Origin software. A model that was indicative of a single-binding site was found to give the best fit and this model was used to obtain the thermodynamic parameters.

Computational simulations. NAMD software23 was used to perform all molecular dynamics (MD) simulations of the ADP-α-ribose tetramer in complex with PARG10. PAR tetramer was created by addition of three additional ADP-α-ribose monomers to the ADP-α-ribose bound to PARG. Topology and parameter files for this ligand were obtained using the Antechamber program24 and AM1-BCC charges25. The PARG-tetramer complex model was immersed in a periodic water box (TIP3) and neutralized by adding Na+ ions. This complex was equilibrated with several cycles of minimizations (steepest descent, 10,000 steps) and MD simulations (50 K, 20 ps) with the protein atoms kept fixed. Then MD simulations were performed (200 K, 2 ns), with the protein backbone restrained on the X-ray structure conformation. During these MD simulations, several snapshots were periodically extracted and energy-minimized (no restraints) to show the binding interactions of PAR tetramer with PARG active site.

Analyses by ultrahigh-performance liquid chromatography. The poly(ADP-ribosyl)ated PARP1 was treated with PARG, and the mixture filtered using centricons (30 kDa cutoff). The analysis of the filtrate was performed using a modified procedure by Coulier et al.26, which employed ion-pair chromatography for the separation of nucleotides and negative electrospray ionization for the mass spectrometric detection. All analyses were performed using a Waters Acquity UPLC system (Waters Corp., Milford, MA, USA) coupled to a QTOF Premier mass spectrometer (Waters Micromass, Manchester, UK). The chromatographic separations employed a column (100 mm × 2.1 mm) filled with a 1.7-µm BEH C18 stationary phase (Waters Corp., Milford, MA, USA). Binary gradients at a flow rate of 0.4 ml min−1 were applied for the elution. The eluent A was water containing 5 mmol l−1 of pentylamine with the pH value adjusted to 6.5, using acetic acid. The eluent B was acetonitrile. A fast elution gradient was applied, starting with 2% B and then the percentage of B linearly increased to 25% in 5 min, followed by an isocratic hold until 9 min. The mass spectrometry was performed using an orthogonal Z-spray-electrospray interface. Drying gas and nebulizing gas were nitrogen. The desolvation gas flow was set to 700 l h−1 at a temperature of 280 °C. The cone gas flow was adjusted to 25 l h−1, and the source temperature to 120 °C. The capillary and cone voltages were 3,500 V and 30 V, respectively. The instrument was operated in V mode with TOFMS data being collected between m/z 100–2,000, applying collision energy of 4 eV. Leucine enkephalin was applied as a lock mass in negative ionization mode (m/z 554.2615).

The chromatograms, recorded in the total ion current mode, were systematically examined by manually generating mass spectra of each visible individual peak using the background-subtraction option. Specific target components, expected to occur in the reaction mixture, in particular ADP-ribose, oligomers of poly ADP-ribose with up to 5 ADP-ribose units and NAD+, were searched for using extracted ion chromatograms. The target analyses were based on the accurate mass feature of the instrument, applying a mass window of 50 mDa. Information on the electrospray ionization mass spectra for oligomeric ADP-ribose species is only scarcely available in the literature27. Therefore, the recorded total ion current chromatograms were systematically examined using specific masses of possible mono- and multiply charged ions of individual oligomers.
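The targeted search described above relies on knowing the m/z values of singly and multiply deprotonated ADP-ribose oligomers. The following is a minimal sketch of how such target values could be tabulated, not the procedure actually used. It rests on several assumptions that are not stated in the text: ADP-ribose is taken as C15H23N5O14P2, each ribose-ribose glycosidic bond is taken to cost one water, ions are treated as [M − zH]z−, and the 50 mDa window is interpreted as a total width centred on the target (±25 mDa). All function names are illustrative.

```python
# Monoisotopic atomic masses in unified atomic mass units (standard values).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401,
        "O": 15.99491462, "P": 30.97376200}
PROTON = 1.00727647  # proton mass (u)

# Assumed elemental composition of free ADP-ribose: C15H23N5O14P2.
ADP_RIBOSE = {"C": 15, "H": 23, "N": 5, "O": 14, "P": 2}
WATER = {"H": 2, "O": 1}


def mono_mass(formula):
    """Monoisotopic mass of an elemental composition given as {element: count}."""
    return sum(MONO[element] * count for element, count in formula.items())


def oligomer_mass(n_units):
    """Neutral mass of a linear oligomer of n ADP-ribose units.

    Assumes one water is lost for each of the n - 1 glycosidic bonds.
    """
    return n_units * mono_mass(ADP_RIBOSE) - (n_units - 1) * mono_mass(WATER)


def negative_ion_mz(n_units, charge):
    """m/z of the [M - zH]z- ion in negative electrospray mode."""
    return (oligomer_mass(n_units) - charge * PROTON) / charge


def in_window(observed_mz, target_mz, window_mda=50.0):
    """True if observed m/z falls inside a window of total width window_mda (mDa)."""
    return abs(observed_mz - target_mz) <= window_mda / 2000.0


if __name__ == "__main__":
    # Targets for oligomers of 1 to 5 units, singly and doubly charged.
    for n in range(1, 6):
        for z in (1, 2):
            print(n, z, round(negative_ion_mz(n, z), 4))
```

Under these assumptions the singly deprotonated monomer comes out near m/z 558.064. If the window was instead meant as ±50 mDa, the divisor in in_window would change from 2000 to 1000.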


References

1. D’Amours, D., Desnoyers, S., D’Silva, I. & Poirier, G. G. Poly(ADP-ribosyl)ation reactions in the regulation of nuclear functions. Biochem. J. 342, 249–268 (1999).
2. Hakme, A., Wong, H. K., Dantzer, F. & Schreiber, V. The expanding field of poly(ADP-ribosyl)ation reactions. ‘Protein Modifications: Beyond the Usual Suspects’ Review Series. EMBO Rep. 9, 1094–1100 (2008).
3. Ahel, I. et al. Poly(ADP-ribose)-binding zinc finger motifs in DNA repair/checkpoint proteins. Nature 451, 81–85 (2008).
4. Langelier, M. F., Planck, J., Servent, K. & Pascal, J. Purifications of human PARP-1 and PARP-1 domains from Escherichia coli for structural and biochemical analysis. Methods Mol. Biol. 780, 209–226 (2011).
5. Patel, C. N., Koh, D. W., Jacobson, M. K. & Oliveira, M. A. Identification of three critical acidic residues of poly(ADP-ribose) glycohydrolase involved in catalysis: determining the PARG catalytic domain. Biochem. J. 388, 493–500 (2005).
6. Panda, S., Poirier, G. G. & Kay, S. A. tej defines a role for poly(ADP-ribosyl)ation in establishing period length of the Arabidopsis circadian oscillator. Dev. Cell 3, 51–61 (2002).
7. Lin, W., Ame, J. C., Aboul-Ela, N., Jacobson, E. L. & Jacobson, M. K. Isolation and characterization of the cDNA encoding bovine poly(ADP-ribose) glycohydrolase. J. Biol. Chem. 272, 11895–11901 (1997).
8. Koh, D. W. et al. Failure to degrade poly(ADP-ribose) causes increased sensitivity to cytotoxicity and early embryonic lethality. Proc. Natl Acad. Sci. USA 101, 17699–17704 (2004).
9. Hanai, S. et al. Loss of poly(ADP-ribose) glycohydrolase causes progressive neurodegeneration in Drosophila melanogaster. Proc. Natl Acad. Sci. USA 101, 82–86 (2004).
10. Slade, D. et al. The structure and catalytic mechanism of poly(ADP-ribose) glycohydrolase. Nature 477, 616–620 (2011).
11. Ahel, D. et al. Poly(ADP-ribose)-dependent regulation of DNA repair by the chromatin remodeling enzyme ALC1. Science 325, 1240–1243 (2009).
12. Gottschalk, A. et al. Poly(ADP-ribosyl)ation directs recruitment and activation of an ATP-dependent chromatin remodeler. Proc. Natl Acad. Sci. USA 106, 13770–13774 (2009).
13. Timinszky, G. et al. A macrodomain-containing histone rearranges chromatin upon sensing PARP1 activation. Nat. Struct. Mol. Biol. 16, 923–929 (2009).
14. Botta, D. & Jacobson, M. Identification of a regulatory segment of poly(ADP-ribose) glycohydrolase. Biochemistry 49, 7674–7682 (2010).
15. Koh, D. W. et al. Identification of an inhibitor binding site of poly(ADP-ribose) glycohydrolase. Biochemistry 42, 4855–4863 (2003).
16. Finch, K. E., Knezevic, C., Nottbohm, A. C., Partlow, K. C. & Hergenrother, P. J. Selective small molecule inhibition of poly(ADP-ribose) glycohydrolase (PARG). ACS Chem. Biol. 7, 563–570 (2012).
17. Brochu, G. et al. Mode of action of poly(ADP-ribose) glycohydrolase. Biochim. Biophys. Acta 1219, 342–350 (1994).
18. Ikejima, M. & Gill, D. M. Poly(ADP-ribosylation) degradation by glycohydrolase starts with an endonucleolytic incision. J. Biol. Chem. 263, 11037–11040 (1988).
19. Fong, P. C. et al. Inhibition of poly(ADP-ribose) polymerase in tumours from BRCA mutation carriers. N. Engl. J. Med. 361, 123–134 (2009).
20. Van Duyne, G. J. et al. Atomic structures of the human immunophilin FKBP-12 complexes with FK506 and rapamycin. J. Mol. Biol. 229, 105–124 (1993).
21. Adams, P. D. et al. PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. D Biol. Crystallogr. 66, 213–221 (2010).
22. Cowtan, K. The Buccaneer software for automated model building. 1. Tracing protein chains. Acta Crystallogr. D Biol. Crystallogr. 62, 1002–1011 (2006).
23. Phillips, J. C. et al. Scalable molecular dynamics with NAMD. J. Comput. Chem. 26, 1781–1802 (2005).
24. Cornell, W. D. et al. A second generation force field for the simulation of proteins, nucleic acids, and organic molecules. J. Am. Chem. Soc. 117, 5179–5197 (1995).
25. Jakalian, A., Jack, D. B. & Bayly, C. I. Fast, efficient generation of high-quality atomic charges. AM1-BCC model: II. Parameterization and validation. J. Comput. Chem. 23, 1623–1641 (2002).
26. Coulier, L. et al. Simultaneous quantitative analysis of metabolites using ion-pair liquid chromatography-electrospray ionization mass spectrometry. Anal. Chem. 78, 6573–6582 (2006).
27. Morrison, A. R. et al. ART2, a T cell surface mono-ADP-ribosyltransferase, generates extracellular poly(ADP-ribose). J. Biol. Chem. 281, 33363–33372 (2006).

Acknowledgements

We thank A. Jordan, S. Williams, D. Slade and A. Boakes for helpful advice. This work was funded by Cancer Research UK and the European Research Council. Access to Diamond beamlines is gratefully acknowledged.

Author contributions

M.D. performed structural/biophysical studies and analysed data. E.B. prepared proteins for crystallization, set up crystallization trials, performed mutagenesis and in vitro experiments; P.L. performed molecular modelling studies. M.A. performed LC/MS analyses. C.K. and P.H. performed inhibition studies, A.B. performed supporting studies, I.A. and D.L. wrote the manuscript, designed experiments and analysed data.

Additional information

Accession codes: Atomic coordinates and structure factors have been deposited in the Protein Data Bank under accession codes 4EPP and 4EPQ.

Supplementary Information accompanies this paper at http://www.nature.com/naturecommunications

Competing financial interests: The authors declare no competing financial interests.

Reprints and permission information is available online at http://npg.nature.com/reprintsandpermissions/

How to cite this article: Dunstan, M. S. et al. Structure and mechanism of a canonical poly(ADP-ribose) glycohydrolase. Nat. Commun. 3:878 doi: 10.1038/ncomms1889 (2012).
