University of Connecticut OpenCommons@UConn

Doctoral Dissertations University of Connecticut Graduate School

12-12-2016 Structural Insight into the Mechanisms of Activation and Substrate Specificity of Human Deubiquitinating USP7 Alexandra Pozhidaeva University of Connecticut, [email protected]

Follow this and additional works at: https://opencommons.uconn.edu/dissertations

Recommended Citation Pozhidaeva, Alexandra, "Structural Insight into the Mechanisms of Activation and Substrate Specificity of Human USP7" (2016). Doctoral Dissertations. 1287. https://opencommons.uconn.edu/dissertations/1287 Structural Insight into the Mechanisms of Activation and Substrate Specificity of Human

Deubiquitinating Enzyme USP7

Alexandra Pozhidaeva, PhD

University of Connecticut, 2016

The major part of this thesis describes studies of human -specific protease 7 (USP7), a deubiquitinating enzyme that regulates cellular levels of key oncoproteins and tumor suppressors.

Inactivation of USP7 has recently emerged as a new approach to treatment of malignancies. However, design of potent and specific small-molecule compounds requires detailed understanding of the molecular mechanisms of USP7 substrate recognition and regulation of its catalytic activity. The goal of this work was to explore these mechanisms using solution nuclear magnetic resonance spectroscopy in combination with other methods. In our studies of USP7 substrate recognition, we structurally characterized its interaction with ICP0 from 1 and identified a novel USP7 substrate-binding site harbored within its C-terminal region. To address the question of USP7 activity regulation, we investigated its interaction with ubiquitin, which was believed to cause structural rearrangement of USP7 active site from an unproductive to a catalytically competent conformation. Surprisingly, we showed that in solution USP7 – ubiquitin interaction alone is not sufficient for activation of the enzyme as was previously postulated. Finally, we uncovered a previously unknown mechanism of USP7 inactivation by two of its known inhibitors. We found that these compounds bind to the active site of USP7 and inactivate the enzyme via covalent modification of a catalytic cysteine residue. The efficacy of the inhibitors was confirmed in cells. Altogether, these results advance our understanding of the mechanisms of substrate specificity, activation and inhibition of USP7 and open new strategies for rational structure-based drug design.

Chapter 4 describes another example of using NMR spectroscopy in studies of complex biological systems. REV1 is a translesion synthesis (TLS) DNA polymerase that plays a key role in replication Alexandra Pozhidaeva – University of Connecticut, 2016

across the sites of DNA damage. Beyond its catalytic activity, its major role is to recruit other TLS polymerases to DNA lesions. The goal of this work was to structurally characterize protein – protein interactions mediated by the C-terminal region of REV1 (REV1-CT). We determined a solution NMR structure of REV1-CT alone and in complex with TLS polymerase η, and unraveled the molecular mechanism by which REV1 recognizes other Y-family TLS polymerases.

Structural Insight into the Mechanisms of Activation and Substrate Specificity of Human

Deubiquitinating Enzyme USP7

Alexandra Pozhidaeva

B.S., St. Petersburg State University, Russia, 2009

A Dissertation

Submitted in Partial Fulfillment of the

Requirements for the Degree of

Doctor of Philosophy

at the

University of Connecticut

2016

i

Copyright by

Alexandra Pozhidaeva

2016

ii

APPROVAL PAGE

Doctor of Philosophy Dissertation

Structural Insight into the Mechanisms of Activation and Substrate Specificity of Human

Deubiquitinating Enzyme USP7

Presented by Alexandra Pozhidaeva, B.S.

Major Advisor ______Irina Bezsonova

Associate Advisor ______Dmitry Korzhnev

Associate Advisor ______Sandra Weller

Associate Advisor ______Jeffrey Hoch

Associate Advisor ______Peter Setlow

University of Connecticut 2016

iii

AKNOWLEDGMENTS

I am extremely grateful to the many people who surrounded me during a half-decade journey towards achieving this life milestone. Here I wish to acknowledge the people without whom the completion of this dissertation would not be possible.

First and foremost, I would like to express my deepest gratitude to my advisors, Dr. Irina

Bezsonova and Dr. Dmitry Korzhnev, for allowing me to join their research group. It has been an honor to be their first PhD student. Despite the pressure of starting their own lab, from my first day there they showed themselves as exceptional mentors. Dmitry having a background in physics and Irina being a biologist, they taught me about science from different angles making the learning experience unique and always interesting. To Irina, your passion about protein structures was contagious and your irrepressible optimism was always a motivation for me. To

Dmitry, thank you for teaching me to pay close attention to details and to be critical of the results.

I would like to thank my committee members Dr. Jeffrey Hoch, Dr. Sandra Weller and Dr.

Peter Setlow for their constructive dialogue, overall support and guidance. Their feedback over the years not only enabled me to take the project in new directions but also helped me develop qualities necessary for being a successful scientist.

I would like to acknowledge past and present members of Korzhnev/Bezsonova lab for our fruitful scientific discussions and for the help whenever it was needed. Special thanks goes to Dr.

Yulia Pustovalova for spending long nights teaching me LINUX and basics of protein structure calculation, and to Dr. Mariana Quezado for bringing a lot of laughter and cheer to the lab. I am also grateful to Dr. Mark Maciejewski for help with operating NMR spectrometers and to other

iv members of the NMR super group (Dr. Adam Schuyler, Michael R. Gryk and Gerard

Weatherby) for valuable comments and suggestions.

If not for all the members of MBB department, my PhD experience would never be the same.

Because of the diverse journal clubs, work in progress talks and seminars I now a have vast knowledge of various scientific topics that will help me to navigate through the rest of my scientific career. I would also like to thank professors Dr. Siu-Pok Yee, Dr. Henry Smilowitz and

Dr. Melissa Caimano from the labs next door for insightful thoughts on the project and for non- scientific conversations reminding me about the big World outside the lab.

Finally, I would like to thank my family and friends for all their love and tremendous support. To my mother Ludmila, thank you for teaching me to never give up and to always look forward. To my brother Vladislav, thank you for encouraging my interest in science since my early childhood. To many friends I acquired in Connecticut, especially to Oksana, Olena and

Dina, thank you for all the wonderful time we had together. Last but not the least, to my fiancé

Michael, I am deeply thankful to you for always being there for me listening, understanding and cheering.

v

Table of contents

CHAPTER 1. Introduction ...... 1

1.1. USP7 – a complex, pharmaceutically relevant biological system ...... 2

1.1.1. Overview of ubiquitin – system ...... 2

1.1.2. Deubiquitinating and their role in the ubiquitin – proteasome system ...... 3

1.1.3. Introducing USP7: functions, substrates and regulation ...... 6

1.1.4. Structure of USP7 ...... 10

1.1.5. USP7 in human disease ...... 13

1.1.6. Pharmaceutical potential of USP7 ...... 14

1.2. NMR spectroscopy as a powerful method for studies of and protein – protein interactions...... 16

1.2.1. Basics of NMR ...... 16

1.2.2. Protein structure determination using NMR spectroscopy ...... 19

1.2.3. NMR studies of protein dynamics ...... 23

1.2.4. NMR titration experiments in studies of protein –ligand interactions ...... 26

1.3. Thesis objectives ...... 29

CHAPTER 2. Structural characterization of interaction between USP7 and ICP0 E3 ligase from HSV-1 ...... 39

2.1. Abstract ...... 40

2.2. Introduction ...... 41

2.3. Experimental procedures ...... 43

2.4. Results ...... 48

2.4.1. Isolated USP7 UBL domains can be studied independently in solution ...... 48

2.4.2. Solution structure of the USP7 UBL1 domain ...... 49

vi

2.4.3. Backbone dynamics of the USP7 UBL1 domain ...... 51

2.4.4. USP7 is required for the efficient HSV-1 infection in HFF-1 cells ...... 51

2.4.5. The first two USP7 UBL domains (UBL12) are responsible for interaction with ICP0 ...... 53

2.4.6. Mapping of the UBL12-ICP0 binding site...... 54

2.5. Discussion ...... 56

CHAPTER 3. Mechanism of USP7 inhibition by small-molecule compounds ...... 70

3.1. Abstract ...... 71

3.2. Introduction ...... 71

3.3. Experimental procedures ...... 74

3.4. Results ...... 80

3.4.1. NMR studies of the 41kDa catalytic core of USP7 in solution ...... 80

3.4.2. Interaction of the USP7 catalytic core with ubiquitin ...... 81

3.4.3. Mapping of the small-molecule binding site ...... 82

3.4.4. Mechanism of USP7 inhibition by DUB inhibitors P22077 and P50429 ...... 85

3.4.5. Inhibition of endogenous full-length USP7 by DUB inhibitors P2077 and P50429 in Jurkat cells ...... 87

3.5. Discussion ...... 88

CHAPTER 4. Structural NMR studies of other complex biological systems (example of translesion synthesis DNA polymerase REV1) ...... 102

4.1. Abstract ...... 103

4.2. Introduction ...... 104

4.3. Experimental procedures ...... 108

4.4. Results and discussion ...... 114

vii

4.4.1. The Rev1 C-terminus is a structured domain that binds human Y-family DNA polymerases...... 114

4.4.2. Structure of the free Rev1-CT domain ...... 117

4.4.3. Structure of the Rev1-CT/polη-RIR complex ...... 119

4.4.4. Corroboration of Rev1-CT/Polη-RIR interface by yeast two-hybrid assays ...... 123

4.4.5. Conformational dynamics of Rev1-CT domain and Rev1-CT/polη-RIR complex ...... 125

4.5. Concluding remarks ...... 130

CHAPTER 5. Summary and future directions ...... 142

5.1. Summary ...... 143

5.2. Closing remarks and future directions ...... 145

5.2.1. NMR spectroscopy is a powerful tool for studies of USP7 ...... 145

5.2.2. Identification of novel substrate-binding sites within USP7 ...... 146

5.2.3. Mechanism of USP7 activation ...... 148

5.2.4. Future of USP7 inhibition by small-molecule compounds ...... 149

REFERENCES ...... 156

viii

List of Figures

CHAPTER 1

Figure 1.1. Functions of DUBs in the ubiquitin-proteasome system ...... 33

Figure 1.2. Human DUBs ...... 34

Figure 1.3. USP7 domain organization ...... 35

Figure 1.4. C-terminal region of USP7 ...... 36

Figure 1.5. Human USPs predicted to have at least one ubiquitin-like domain ...... 37

Figure 1.6. NMR spectroscopy in structural biology ...... 38

CHAPTER 2

Figure 2.1. Isolated USP7 UBL domains can be studied independently in solution ...... 62

Figure 2.2. Solution NMR structure of the USP7 UBL1 (residues 537-664) ...... 63

Figure 2.3. The backbone dynamics of UBL1 from 15N relaxation ...... 64

Figure 2.4. USP7 is required for successful establishment of HSV-1 lytic infection ...... 65

Figure 2.5. NMR and ITC analysis of the USP7 UBL domain interactions with the ICP0617-627 peptide ...... 66

Figure 2.6. Analysis of ICP0 binding to the USP7 UBL12 tandem ...... 67

Figure 2.7. The C-terminal region of ICP0 may contain a structured domain ...... 69

CHAPTER 3

Figure 3.1. Backbone resonance assignment of USP7 catalytic domain ...... 95

Figure 3.2. The isolated USP7 catalytic domain binds ubiquitin ...... 96

Figure 3.3. DUB inhibitors P22077 and P502429 occupy the active site of USP7 ...... 97

Figure 3.4. Inhibitors P22077 and P50429 covalently modify catalytic Cys223 residue in the USP7 active site ...... 98

ix

Figure 3.5. Proposed mechanisms of chemical reaction between Cys223 of USP7 and inhibitors P22077 and P50429 ...... 99

Figure 3.6. P22077 and P50429 inhibit USP7 activity in cells ...... 100

Figure 3.7. Specificity determinants of dual USP7/USP47 inhibitors ...... 101

CHAPTER 4

Figure 4.1. The Rev1-CT forms a structured domain that binds DNA polymerase η ...... 135

Figure 4.2. Three-dimensional structure of the human Rev1-CT domain (residues 1157- 1251) determined by solution NMR spectroscopy ...... 136

Figure 4.3. Spatial structure of the complex of the human Rev1-CT domain (residues 1157-1251) with the RIR containing peptide from DNA polymerase η ...... 137

Figure 4.4. Inter-molecular NOEs observed for the Rev1-CT/polη-RIR complex ...... 138

Figure 4.5. Mapping of the Rev1-CT/polη-RIR interface by yeast two-hybrid assays ...... 139

Figure 4.6. Conformational dynamics of the Rev1-CT domain and the Rev1-CT/polη- RIR complex ...... 140

Figure 4.7. On-off equilibrium between free and bound forms of Rev1-CT contributes to ms time-scale conformational exchange observed for the Rev1-CT/polη-RIR complex ...... 141

CHAPTER 5

Figure 5.1. NMR resonance assignment of all USP7 domains provides new opportunities for studying the enzyme in solution ...... 152

Figure 5.2. Backbone resonance assignment of the USP7 UBL3 domain ...... 153

Figure 5.3. Backbone resonance assignment of the USP7 UBL45 domain ...... 154

Figure 5.4. Pathway towards the structure of full-length USP7 ...... 155

x

List of Tables

CHAPTER 1

Table 1.1. List of USP7 substrates ...... 32

CHAPTER 2

Table 2.1. NMR structural statistics of USP7 UBL1 domain ...... 61

CHAPTER 4

Table 4.1. NMR-based restraints used for structure calculation of Rev1-CT domain and Rev1-CT/pol-RIR complex, and structure refinement statistics ...... 134

xi

List of used abbreviations

UPS ubiquitin – proteasome system

DUB deubiquitinating enzyme

USP7 ubiquitin specific protease 7

C-USP7 C-terminal region of USP7

TRAF-like tumor necrosis factor receptor – associated factor-like

UBL ubiquitin-like domain

HSV-1 Herpes simplex virus-1

ICP0 infected cell protein 0

TLS translesion synthesis

Rev1 reversionless 1

Rev1-CT C-terminal domain of Rev1

Polη (ι, κ) TLS polymerase η (ι, κ)

RIR Rev1 interacting region

PCNA proliferating cell nuclear antigen

PDB protein databank

NMR nuclear magnetic resonance spectroscopy

2D, 3D two-, three-dimensional

HSQC heteronuclear single quantum coherence

TROSY transverse relaxation optimized spectroscopy

NOESY nuclear Overhauser effect spectroscopy

NOE nuclear Overhauser effect

CPMG Carr–Purcell–Meiboom–Gill

xii ps, μs, ms pico-, micro and miliseconds

DMSO dimethyl sulfoxide

IPTG isopropyl β-D-1-thiogalactopyranoside

DTT dithiothreitol

TEV Tobacco Etch virus nuclear-inclusion-a endopeptidase

EDTA ethylenediaminetetraacetic acid

PMSF phenylmethylsulfonyl fluoride

xiii

CHAPTER 1

Introduction

1

1.1. USP7 – a complex, pharmaceutically relevant biological system

1.1.1. Overview of ubiquitin – proteasome system

As a part of their metabolism cells constantly synthetize new proteins and, at the same time, break down proteins that are misfolded, aggregated or no longer needed. Eukaryotic cells developed two major systems responsible for protein degradation: lysosomal proteolysis and the ubiquitin – proteasome system (UPS). While the first system is specialized in degradation of membrane and endocytosed proteins, UPS is versatile and accomplishes degradation of the majority of cellular proteins (1). In this pathway a polyubiquitin chain is conjugated to a protein substrate. This chain serves as a signal for the substrate recognition and degradation by the 26S proteasome. Protein ubiquitination is a multi-step process, involving consecutive action of three enzymes: ubiquitin-activating enzyme E1, ubiquitin-conjugating enzyme E2 and E3 ligase

(Figure 1.1). First, E1 uses ATP to form a thioester bond between a cysteine residue in its active site and the C-terminal glycine of ubiquitin. Activated ubiquitin is then transferred to E2 with the formation of second thioester intermediate. In the final step, ubiquitin is transferred to a protein substrate with help of E3 ligase (2). There are two families of E3 ligases that act through different mechanisms. RING E3 ligases bind both the E2 – ubiquitin complex and a substrate bringing them together and, thereby, promoting the attack of the substrate lysine residue on E2 – ubiquitin thioester (3). E3 ligases of the HECT family receive ubiquitin from E2 with formation of the third thioester intermediate and then transfer it to the substrate (4). The process described above can be repeated several times to generate a polyubiquitin chain that consists of ubiquitin monomers attached to each other through a linkage between one of seven lysine residues (Lys6,

Lys11, Lys27, Lys29, Lys33, Lys48, and Lys63) of one monomer and the C-terminal glycine of another. Chains linked through N-terminal methionine can also be formed (5). Interestingly, only

2

Lys48- and Lys11-linked chains are known to target proteins to proteasome (6) while other linkage types are involved in signaling pathways distinct from protein degradation. Specifically,

K63-linked chains play a role in regulating signal transduction, DNA repair and endocytosis

(7,8), and linear polyubiquitin (Met1-linked) is important in inflammatory cytokine signaling (9).

In addition to polyubiquitination, protein monoubiquitination often takes place in various cellular pathways such as intracellular and membrane trafficking, DNA transcription and DNA repair

(10). Overall, the functions of protein ubiquitination are not limited to mediating proteasome- dependent degradation. The ability of ubiquitin to form various polymers adds a complexity to

UPS in which mono- and polyubiquitination as well as diverse chain linkage types determine different functional outcomes for protein substrates.

1.1.2. Deubiquitinating enzymes and their role in the ubiquitin – proteasome system

Protein ubiquitination can be reversed by action of deubiquitinating enzymes (DUBs) that cleave ubiquitin moieties from their substrates. Nearly 100 DUBs identified in can be classified into two mechanistically distinct classes: metalloproteases and cysteine proteases (11). Metalloproteases coordinate a zinc ion and a water molecule in the active site.

During catalysis, a hydrogen atom from the water molecule is abstracted by a neighboring glutamate residue and the resulting hydroxyl ion attacks the carbonyl carbon of the isopeptide bond. In humans, metalloproteases are represented by a family of JAB1/MPN/MOV34 domain containing proteases (JAMM) that consists of 12 members (12). The catalytic mechanism of cysteine proteases is analogous to that of well-studied plant enzymes papains (13). All cysteine peptidases share a catalytic triad consisting of cysteine, histidine and aspartate/asparagine residues. The role of the histidine is to deprotonate the thiol group of the cysteine and initiate its nucleophilic attack on the isopeptide bond, while the aspartate restricts side-chain rotation of the

3 histidine and polarizes it. Based on the architecture of the catalytic core, cysteine proteases can be subdivided into five families (Figure 1.2). Four families including ubiquitin-specific proteases

(USPs) ubiquitin carboxyl-terminal hydrolases (UCHs), ovarian tumor proteases (OTUs) and

Machado-Joseph (Josephin) domain containing (MJD) proteases have been extensively studied

(14). The fifth family of MINDY proteases (motif interacting with ubiquitin-containing novel

DUB family) was recently discovered and, to date, contains only one characterized protein

MINDY-1 (15). Structurally, proteins of UCH family are relatively small and do not have domains other than the catalytic core while DUBs from other families often have varying numbers of additional domains with distinct structures and functions. These domains often participate in protein – protein interactions such as assembly in macromolecular complexes or

DUB binding to substrates and substrate adaptors (16).

DUBs perform a variety of functions in the UPS (Figure 1.1). They remove polyubiquitin chains from their substrates thus preventing their proteasomal degradation. Furthermore, DUBs recycle ubiquitin by disassembling the chains. These enzymes are also involved in maturation of ubiquitin, which is synthetized as a linear fusion protein. DUBs cleave the precursor and generate a pool of ubiquitin monomers. Finally, DUBs can redirect the fate of a substrate by editing the conjugated ubiquitin chains. For example, they can transform polyubiquitinated protein into monoubiquitinated and vice versa. In some instances, they disassemble a chain of one linkage type so a chain of a different linkage could be formed (17). Interestingly, some

DUBs, including many USPs, recognize ubiquitin chains of several linkage types while others demonstrate linkage specificity (16). For example, proteases of the JAMM family cleave K63- linked chains only (18,19). In contrast, OTU family members vary in their linkage preferences.

4

Thus, OTUD4 and OTUB1 are specific to Lys48, Cezanne and Cezanne2 – to Lys11, OTUD1 – to Lys63 and OTULIN – to Met1 (14).

Given the number of cellular pathways involving protein deubiquitination, it is not surprising that DUBs activity is tightly regulated. To date, a few mechanisms of DUBs regulation have been discovered that function independently or in combination, depending on the particular enzyme. These mechanisms include post-translational modifications, recruitment to substrates and/or proteasome, inner structural rearrangements upon substrate binding and, finally, interactions with other proteins that can either activate or inhibit the enzyme (20). In turn, dysregulation of DUBs activity was linked to prostate, colon, ovary and other cancers as well as to neurodegenerative diseases such as Parkinson disease and ataxia (21,22).

Overall, protein ubiquitination and deubiquitination are two highly dynamic processes that maintain a delicate balance between protein stabilization and degradation, and mediate numerous signaling pathways in the cell. Interest in studying DUBs rapidly developed in recent years due to their contribution to almost every cellular process and their role in human pathologies. Despite some progress made in elucidation of DUBs activity and regulation, only a small fraction of these enzymes has been well characterized. The substrates, mechanisms of regulation as well as three-dimensional structures of many DUBs are still not known and many questions remain unanswered even for those enzymes that have been extensively studied. How do DUBs distinguish between ubiquitin chains of different linkage types? What are the general mechanisms of controlling DUBs activity that are applied within and across the families?

Multiple layers of regulation were described for several enzymes but in order to generalize these mechanisms we need further studies of many other DUBs. The same is true for defining the determinants of DUBs substrate specificity. The lack of information about protein targets for

5 many enzymes prevents determination of common substrate features. Finally, we do not know how DUBs are regulated in the context of macromolecular complexes since such studies are complex and limited by modern technology.

1.1.3. Introducing USP7: functions, substrates and regulation

Human ubiquitin-specific protease 7 (USP7) also known as Herpes virus associated protease

(HAUSP) belongs to the largest DUB family of USPs. Located in the nucleus, it regulates the stability of many proteins involved in multiple cellular processes (Table 1.1) (23). One such process is the DNA damage response where USP7 performs several functions. First, it deubiquitinates both transcription factor and E3 ligase HDM2 responsible for polyubiquitination of p53 and targeting it to the proteasome (24-26). Interestingly, USP7 preferably binds HDM2 preventing its degradation (27). In turn, HDM2 ubiquitinates p53 and maintains low cellular levels of this tumor suppressor during normal homeostasis. DNA damage causes ATM-dependent dephosphorylation of USP7 by PPM1G that decreases the affinity of the enzyme towards HDM2 (28). Consequently, USP7 associates with p53 and protects it from ubiquitination. This interaction results in p53 stabilization and initiation of the p53-dependent

DNA damage response. Second, USP7 is involved in regulation of nucleotide excision repair.

Specifically, it prevents degradation of the XPC protein that recognizes helix-distorting DNA lesions and initiates the repair (29). Furthermore, USP7 plays a significant role in the ATR-Chk1 branch of the DNA damage response pathway where it deubiquitinates and stabilizes Chk1, an important checkpoint kinase (30), and clapsin, an adapter protein (31). Another interesting substrate of USP7 is UVSSA, one of the initiating factors of UV-induced transcription-coupled

DNA repair (32). Finally, in response to genotoxic stress, USP7 stabilizes HLTF that regulates error-free replication through DNA lesions (33).

6

On the other hand, USP7 is also involved in DNA damage tolerance pathways. It controls the stability of E3 ligase Rad18 that monoubiquitinates a proliferating-cell nuclear antigen (PCNA) in response to the presence of DNA lesions stalling the replication fork (34). PCNA monoubiquitination facilitates recruitment of specialized translesion synthesis (TLS) DNA polymerases, which can efficiently bypass DNA lesions, thereby allowing replication to proceed

(35). The stability of at least one TLS polymerase, polymerase η, is also controlled by USP7

(36). In addition, USP7 plays a role in conventional DNA replication. First, its complex with

MCM-BP unloads the minichromosome maintenance protein complex from chromatin at the end of S-phase (37). Second, it was recently suggested that USP7 might regulate replication fork progression through creating SUMO-rich environment at sites of DNA replication (38). Finally,

USP7 controls protein levels of replisome components UHRF1 and methyltransferase DNMT1 that function together in maintaining tissue-specific DNA methylation patterns (39,40).

There are many other cellular pathways that require the deubiquitinase activity of USP7. For instance, this enzyme associates with chromatin and participates in epigenetic control of expression. Specifically, USP7 interacts with BMI1, MEL18, SCML2 and E3 ligase RING1B – components of the PRC1 complex responsible for monoubiquitination of histone H2A, which results in gene repression (41-43). It also stabilizes several other histone-associated proteins such as acetyltransferase TIP60 (44), demethylase PHF8 (45), methyltransferase MLL5 (46) and E3 ligase RNF168 (47). Furthermore, USP7 functions in the immune response where it regulates the transcription activity of master transcription factor NF-kB via stabilization of its subunit p65

(48).

Although USP7 is mostly known for removing polyubiquitin chains from its substrates, it is also capable of deubiquitinating monoubiquitinated proteins. Thus, USP7 forms a complex with

7

GMPS in Drosophila and removes monoubiquitin from thereby contributing to gene silencing (49). In humans, USP7 deubiquitinates monoubiquitinated transcription factors

PTEN and FOX(O)4 (50,51). In this case, the activity of USP7 is a part of a mechanism negatively regulating the transcription activity of these proteins since deubiquitination of both

PTEN and FOX(O)4 causes their translocation from the nucleus to cytoplasm and, therefore, their inactivation. As a side note, some functions of USP7 do not require its deubiquitinase activity at all. In particular, the enzyme negatively regulates PML nuclear bodies and the mechanism relies only on USP7 interactions with other proteins (52).

In addition to numerous cellular processes, USP7 is known to play a role in viral infection.

Originally, it was discovered as an enzyme associated with ICP0, an immediate-early protein from Herpes Simplex Virus-1 (53). ICP0 is the E3 ligase required for efficient lytic infection

(54). Its binding to USP7 prevents its auto-ubiquitination and proteasomal degradation (55).

Later studies revealed that USP7 interacts with proteins from three other viruses of the

Herpesviridae family. Thus, binding of USP7 to LANA from Kaposi’s sarcoma-associated herpes virus (KSHV) and its functional homologue EBNA-1 from Epstein-Barr virus (EBV) has a regulatory effect on latent viral replication (56,57). Furthermore, EBNA-1 – USP7 interaction blocks the interaction of the latter to p53 and, therefore, diminishes p53 stabilization (58). The similar viral mechanism that disrupts the p53-mediated antiviral response was proposed for another USP7 substrate, the vIRF4 protein from KSHV (59), while the significance of interaction between USP7 and UL35 from human cytomegalovirus (HCMV) remains elusive (60). Finally, besides the role in infections mediated by Herpesviruses, USP7 also promotes adenoviral replication via interaction with viral multifunctional protein E1B-55K (61).

8

USP7 is an important component of UPS and its activity in the cell is tightly regulated to avoid uncontrolled stabilization of its multiple substrates. There are several levels of USP7 regulation including intermolecular mechanisms, post-transcriptional modifications and protein – protein interactions. Intramolecular mechanisms include the domain reorganization required for the enzyme activation and the active site rearrangement upon ubiquitin binding (described in chapter 1.1.4). Post-transcriptional modifications further tune USP7 activity. In particular, the enzyme was shown to be phosphorylated at Ser18 and Ser963, and ubiquitinated at Lys869 (62).

Phosphorylation at Ser18 by protein kinase CK2 alters affinity of USP7 towards its substrates

HDM2 and p53 in such a way that the phosphorylated enzyme preferably binds to HDM2 while its dephosphorylation results in higher affinity to p53 (28). USP7 ubiquitination at Lys869 by E3 ligase TRIM27 promotes TNF-α-induced apoptosis through deubiquitination of RIPK1 (63) and the role of phosphorylation at Ser963 remains to be determined. In addition, USP7 is aberrantly phosphorylated at Tyr243 by chimeric protein p210 BCR-ABL in chronic myeloid leukemia

(CML) cells. This post-translational modification enhances deubiquitinase activity of the enzyme towards tumor suppressor PTEN, whose dysregulation is critical for CML pathogenesis (64). The highest level of USP7 regulation is mediated by several proteins that interact with the enzyme and mediate its activity. Thus, Daxx binds to USP7 and HDM2 facilitating deubiquitination of the latter (65). Daxx – USP7 complex also regulates stability of E3 ligase CHFR and Aurora-A kinase involved in mitosis progression (66). In the Wnt/β-catenin signaling pathway USP7 mediated stabilization of β-catenin depends on E3 ligase RNF220 that forms a tertiary complex with both USP7 and β-catenin and promotes deubiquitination of the latter (67). Another protein,

MBD4, recruits USP7 to heterochromatin foci in order to co-localize it with UHRF1 and

DNMT1 thus promoting their interaction (68). Finally, at least one protein, TRAF4 negatively

9 regulates USP7 activity. It binds to the same region of USP7 as p53 blocking p53 accesses to the enzyme. This inhibition mechanism was shown to play an important role in breast cancers where

TRAF4 overexpression prevents USP7-mediated p53 stabilization and diminishes cytotoxic stress response (69).

1.1.4. Structure of USP7

Schematic representation of USP7 domain organization is shown in Figure 1.3A. USP is a large 1102 residue (135 kDa) protein that consists of seven domains including an N-terminal

TRAF (tumor necrosis factor receptor – associated factor)-like domain, a central catalytic core and five C-terminal ubiquitin-like domains (UBLs) (70,71). The N-terminal TRAF-like domain encompasses the protein residues 53-205 and has a characteristic for all TRAF domains fold

(58). Specifically, it adopts a conformation of an eight-stranded, antiparallel β-sandwich (Figure

1.3B). The function of the USP7 TRAF-like domain is to bind USP7 substrates. Transcription factor p53 was the first protein shown to be recognized by this region of USP7 (72) and later studies revealed many others such as HDM2, UbE2E1 and MCM-BP. Interestingly all these interactions occur through the same molecular mechanism. Briefly, all the substrates contain a

Pro/Ala-X-X-Ser motif in their protein sequences that binds to a shallow grove on the surface of the TRAF-like domain (Figure 1.3C) (27,37,73,74). The almost identical manner of the binding in all cases allows some substrates to compete for interaction with USP7 as was shown, for example, for p53 and HDM2 (27).

The function of the DUB catalytic core is to bind ubiquitin and to cleave the isopeptide bond between ubiquitin and a modified protein substrate. The catalytic core of USP7 is the largest domain within the protein (residues 208-560) and is centrally located between the TRAF-like and UBL domains (Figure 1.3A). It has characteristic for all USP enzymes fold that resembles a

10 right hand (Figure 1.3D). Residues Cys223, His464 and Asp481 form the catalytic site of the enzyme that is located in a cleft between the two protein regions referred to as Thumb and Palm.

The third region, Fingers, is responsible for binding to ubiquitin (72,75). Remarkably, the structure of isolated USP7 catalytic domain differs from available structures of other USPs in the conformation of its active site. While the catalytic residues are properly positioned for catalysis in USP4, USP8, USP14 and CYLD (76-79), the USP7 catalytic triad is found in an unproductive conformation. The catalytically competent conformation of the active site is formed only upon binding to ubiquitin as was shown in the crystal structure of USP7 in complex with ubiquitin aldehyde (72) (Figure 1.3E). These intramolecular rearrangements represent a hypothetical mechanism of USP7 autoregulation, which ensures the inactive state of the enzyme in the absence of a substrate.

The C-terminal region of USP7 (C-USP7) encompasses residues 564-1102 and consists of five ubiquitin-like domains (UBLs 1-5) that are arranged in the following manner: UBL1 closely interacts with UBL2 (UBL12 tandem), UBL4 – with UBL5 (UBL45 tandem) and UBL3 is separated from UBL2 and UBL4 by linkers (Figure 1.4A) (71). These domains adopt a characteristic β-grasp ubiquitin fold consisting of β-β-α-β-β-α-β secondary structure elements

(Figure 1.4B). Besides USP7, UBL domains are found in many cellular proteins and play roles in diverse biological processes (80). Within the USP family, 19 enzymes are predicted to have

UBLs (16). Interestingly, the numbers of the domains varies from one in USP9X/Y, USP19,

USP31 and several others to five in USP7, USP40 and USP47 (Figure 1.5). A position of the

UBLs can also be different within a protein with some domains located either N- or C-terminally and others embedded in the catalytic core. The reason for such divergence within one protein

11 family is unclear since functions of the UBLs have been elucidated for only a couple of USPs

(76,81,82).

The functions of USP7 UBL domains remained unknown until recently. The new findings indicate that C-USP7 constitutes an additional substrate-binding region along with the N- terminal TRAF-like domain. Interestingly, despite overall structural similarity, the protein sequences of USP7 UBLs significantly vary from ubiquitin and from each other resulting in surface charge distributions unique for each domain. The presence of these differently charged surfaces implies that in contrast to the TRAF-like domain that contains a single substrate-binding site, several distinct sites might be harbored within the C-USP7. Indeed, mounting evidence suggests that different USP7 substrates are recognized by its C-terminal region that bind different UBL domains. Thus, ICP0 from HSV-1 interacts with the first three UBL domains and human XPC, and RNF168 bind to UBL12 tandem (29,47,70). UBL3 and UBL45 were recently shown to contain second binding sites for HDM2 and p53 respectively (83), and UBL45 was also reported to recognize FOX(O)4 (50). However, the mechanism of substrate binding to C-USP7 remains to be defined since no structural information is available for any of these complexes.

In addition to five UBL domains, C-USP7 contains a conserved 23 amino residue peptide located at the extreme C-terminus (residues 1080-1102). Intriguingly, this non-structured peptide is required for activation of USP7 as its deletion significantly decreases the deubiquitinase activity of the enzyme (71,83). The molecular mechanism of activation, however, raised controversy in the literature. First, according to the crystal structure, five UBLs are in the extended conformation (Figure 1.4A), which positions the extreme C-terminus ~80 Å away from the active site. Next, as reported in the same study, the isolated USP7 C-terminal peptide does not directly interact with the catalytic core and only slightly restores its enzymatic activity in

12 trans (71). In contrast, another study found that a USP7 construct containing the C-terminal peptide attached to the catalytic domain through a 10 amino residue linker has almost full deubiquitinase activity, compared to the full-length protein (84). Furthermore, according to the crystal structure of this USP7 construct in complex with ubiquitin aldehyde, USP7 C-terminal peptide binds to a cleft on the surface of its catalytic core formed after active site rearrangement upon ubiquitin binding. Nevertheless, both groups highlighted the significance of UBL domains in activation of the enzyme. They proposed a model of the full-length protein where UBLs rearrange from an extended conformation into a more compact structure upon activation of USP7 in order to bring the C-terminal peptide towards the catalytic domain (Figure 1.4C). In support of this hypothesis, it was recently shown that UBLs 1-3 can adopt a conformation that is more compact than previously reported in the structure of C-USP7 (85).

1.1.5. USP7 in human disease

USP7 expression is often dysregulated in human malignancies. Its overexpression contributes to tumor progression through aberrant changes in cancer related signaling pathways such as the

DNA damage response, apoptosis and cell cycle control. In particular, overexpression of USP7 in human prostate cancer correlates with tumor aggressiveness. This correlation is at least partially explained by USP7-dependent deactivation of tumor suppressor PTEN (51). USP7 is also overexpressed in breast carcinomas (45) as well as in lung squamous cell carcinoma and large cell carcinoma (86). Its dysregulation in non-small cell lung cancers leads to disruption of the HDM2 – p53 axis and is associated with induced cell epithelial mesenchymal transition, metastasis and overall poor prognosis (86). Similarly, high USP7 expression in patients with epithelial ovarian cancer was found to induce cell invasion ability and correlate with poor

13 survival (87,88). Finally, USP7 expression was shown to gradually increase with tumor progression form grade I to grade IV in glioma patients (89).

Because of USP7 aberrant expression in many human cancers and its role in important signaling pathways, this enzyme recently emerged as a promising pharmaceutical target.

1.1.6. Pharmaceutical potential of USP7

Manipulating the stability of proteins that are mutated, overexpressed or downregulated in human malignancies represents a promising therapeutic strategy for cancer treatment. Between the two protein degradation pathways, lysosomal proteolysis and UPS, the latter is highly selective and, therefore, its inhibition has a potential for development of highly specific targeted therapies. E3 ligases and DUBs are of special interest since they determine the selectivity of

UPS, although other components of UPS such as E2s and proteasome have also emerged as pharmaceutical targets (90).

USP7 pharmaceutical potential emerged in recent years because of its role in cellular pathways involving oncogenes and tumor suppressors as well as evidence of its aberrant expression in various cancer cell lines. Early studies searching for small-molecule inhibitors of

USP7 led to discovery of compound HBX 41,108, a cyano-indenopyrazine derivative, that inhibits USP7 in an uncompetitive and reversible manner and induces p53-dependent apoptosis in HCT116 colon cancer cells. However, HBX 41,108 activity against USP7 is relatively low

(high nM range) and its specificity is modest (91). More recently, specific USP7 compounds

HBX 19,818 and HBX 28,258 were reported. Treatment with HBX 19,818 leads to p53 stabilization and G1 cell cycle arrest in HCT116 colon cancer cells (92). Finally, a family of dual

USP7/USP47 inhibitors (compounds 1 - 14) was recently discovered (93). The lead compound 1

(P5091) was shown to induce apoptosis in multiple myeloma cells resistant to bortezamib and to

14 prolong survival in mouse xenograft models (94). The mechanism of P5091 cytotoxic activity is p53-independent and at least partially relies on endoplasmic reticulum stress caused by accumulation of polyubiquitinated proteins (95). In contrast, compound 4 (DUB inhibitor VI,

P22077) induces p53 stabilization and inhibits cell growth in neuroblastoma cell lines with intact, but not with the mutant p53 (96). Overall, all the reported USP7 inhibitors exhibit cytotoxic activity in several cancer models but their further optimization is needed in order to create highly specific and more potent compounds.

It is worth mentioning that to date, over 30 proteins have been described as substrates of

USP7 (Table 1.1) but, given the fact that only ~100 DUBs in humans deubiquitinate thousands of different proteins, it is likely that new USP7 substrates will be discovered in the near future.

Importantly, all the above compounds target the catalytic core of USP7 and, therefore, affect ubiquitination status of its numerous binding partners. Such full inhibition of the enzyme might result in severe side effects due to changes in stability of multiple cellular proteins. Another potential way to manipulate USP7 activity is to block its interactions with a particular substrate such as oncogene HDM2 or viral protein ICP0. The discovery of such compounds would allow controlling deubiquitinase activity of the enzyme towards a specific protein without affecting others.

In conclusion, although the design of compounds capable of breaking protein-protein interactions was considered to be challenging for a long time, recent advances in technology have allowed the pharmaceutical industry to successfully pursue this goal (97). However, the development of substrate-specific therapeutics for USP7, first, requires deep understanding of the molecular mechanisms of its activity and regulation as well as structural characterization of

USP7/substrate binding interfaces. In this light, solution nuclear magnetic resonance

15 spectroscopy (NMR) might be a useful tool for studying this enzyme. NMR is the state-of-the-art structural biology method that might provide valuable information about USP7 itself and its interactions with either substrates or small-molecule compounds. Furthermore, only NMR may provide insight into USP7 dynamic behavior such as conformational changes upon enzyme activation or binding to a substrate.

1.2. NMR spectroscopy as a powerful method for studies of proteins and protein-protein

interactions

1.2.1. Basics of NMR

Several methods including NMR spectroscopy, X-ray crystallography, electron microscopy and small-angle X-ray scattering are widely used to gain structural information about macromolecules. However, only two of these methods, X-ray crystallography and NMR spectroscopy, provide information about the three-dimensional structure of proteins at the atomic level. Among those, only NMR spectroscopy can be used to determine protein structure in solution, which closer resembles physiological conditions. Moreover, NMR spectroscopy is the only method sensitive to very weak and transient intermolecular interactions invisible to other techniques. Finally, in contrast to X-ray crystallography, NMR can provide information about protein dynamics and folding.

The method of NMR relies on interaction of the magnetic moment of a nucleus (µ) with an external magnetic field. The nuclear magnetic moment is an intrinsic property of a nucleus such as charge and mass and is proportional to the spin angular momentum:

휇⃗ = 훾ħ퐼⃗

16 where γ is gyromagnetic ratio, a fundamental property of a given isotope, ħ = h/2π (h is Plank constant) and I is spin angular momentum quantum number. The nuclei with I = ½ (such as 1H,

15N, 13C, 19F and 31P) have the most favorable for NMR magnetic properties and are commonly observed in studies of biomolecules.

When placed in a magnetic field (B0), nuclear spins precess about the direction of B0 with a characteristic frequency:

ω0 = -γB0, which is called the Larmor frequency. As it follows from the above equation, ω0 is different for different nuclei and is proportional to the applied magnetic field. The orientation of the magnetic moments of a spin ½ nucleus may be either parallel or antiparallel to the field. The parallel orientation is characterized by a magnetic quantum number m = +½ (referred to as α-spin state) and antiparallel orientation is characterized by m = -½ (referred to as β-spin state). Importantly, while the energies of α- and β-spin states are the same in the absence of an external magnetic field, in the presence of the magnetic field they are slightly different (effect called Zeeman splitting):

1 1 퐸 = − ħ훾퐵 퐸 = ħ훾퐵 . 훼 2 0 훽 2 0

The energy difference ∆E is given by

퐸훽 − 퐸훼 = ħ훾퐵0 = ħ휔0 and is proportional to the resonance frequency of a nucleus ω0 = γB0. In NMR spectroscopy, radio-frequency pulses (rf-pulses) and/or irradiation are applied to a nucleus placed in a static magnetic field B0 at a carrier frequency ω matching ∆E.

17

Notably, the low energy spin state is only slightly more populated in the magnetic field relative to the high energy state, with a ratio of spins in α- and β-spin states, Nα and Nβ, respectively, given by Boltzmann distribution:

훥퐸 푁훽 − = 푒 푘푏푇 푁훼 where kb is Boltzmann’s constant and T is the temperature. The modest excess of spins in the low energy state results in only a small fraction of spins excited during NMR experiments and contributing to the NMR signal, which explains the relatively low sensitivity of the NMR method (98).

Critical for structural biology, in context of biomolecules, Larmor frequency depends on the electronic environment of a nucleus. In proteins, adopting a three-dimensional structure electronic environment is unique for every nucleus. Elections induce currents and produce local magnetic fields that oppose B0 or, in other words, shield the nucleus from the field. Because of this chemical shielding, nuclei of the same type experience slightly different magnetic fields, which results in a range of resonance frequencies for each nucleus in an NMR spectrum. These frequency differences normalized to resonance frequency of the nucleus are referred as to chemical shifts.

The above description explains principles of NMR spectroscopy using an example of a single spin ½ system. Biological macromolecules contain complex multi-spin systems where spins of different nuclei are coupled through bonds (J-coupled spins) and through space (dipole-dipole coupled spins). Therefore, NMR experiments use various sequences of rf-pulses applied to nuclei of one or different types. Beyond recording one dimensional (1D) 1H, 15N or 13C NMR spectra, one can collect multi-dimensional (2D, 3D and 4D) NMR experiments to observe different types of correlations between coupled nuclei.

18

1.2.2. Protein structure determination using NMR spectroscopy

NMR data does not immediately provide direct structural information about a protein unlike, for example, protein images obtained using electron microscopy. Instead, a set of structure related parameters including dihedral angles and distances between atoms within the molecule can be derived from analysis of NMR data and then this indirect experimental information can be used for protein structure calculation. In brief, the NMR structure determination process consists of the following steps: 1) sample preparation; 2) NMR data acquisition, illustrated in Figure 1.6A and described elsewhere (98); 3) NMR data analysis, including NMR resonance assignment and obtaining distance restraints based on NOEs (Nuclear Overhauser Effect) as well as dihedral angle restraints, and 4) calculation of the structure satisfying the restraints.

Sample preparation is a first and important step. As mentioned above, only nuclei with spin

½ are routinely observable in NMR spectroscopy meaning that a protein sample has to be produced using spin – ½ 15N, 13C and in some cases spin – 1 2H isotope labeled precursors as sources for nitrogen, carbon and hydrogen. The bacterium E. coli is the most suitable protein expression system for NMR purposes since it allows introducing isotopes and obtaining a protein of interest in high quantities in a relatively short time. Proteins overexpressed in E. coli must be purified prior NMR experiments, which sometimes can be difficult due to low expression levels or insolubility of certain proteins in solution. Another challenge is that not all proteins are stable at high concentration (e.g. ~ 1mM) and/or for the time necessary to perform NMR measurements. Overall, all the requirements can make protein sample preparation a time consuming and challenging task.

NMR resonance assignment includes chemical shift assignments of backbone atoms, side- chain carbons and all the hydrogens.

19

Assigning resonances for backbone 15N, 1H, 13Cα, 13CO and, additionally, 13Cβ atoms is called the backbone chemical shift assignment. The first spectrum collected for the backbone assignment is usually 2D 1H-15N HSQC (Heteronuclear Single-Quantum Correlation) that correlates directly bonded N and H atoms (99). The majority of peaks in this spectrum correspond to the backbone amide groups. The number of amide peaks should be the same as the number of residues minus prolines and the first amino acid residue. Signals from the 15Nε-1Hε

15 1 pair of tryptophan indole groups and N H2 groups of asparagine and arginine side-chains are also visible (Figure 1.6B). Importantly, the number of peaks and the range of resonance frequencies in the 1H-15N HSQC spectrum provides valuable information about whether the protein is folded or disordered (100). Also, the 1H-15N HSQC spectrum can be used to monitor protein – ligand interactions (described in subchapter 1.2.3). In addition to conventional 1H-15N

HSQC, 1H-15N TROSY-HSQC spectrum (TROSY – transverse relaxation optimized spectroscopy) was designed for large macromolecules (101).

The commonly used 3D triple-resonance spectra for backbone resonance assignment are:

 HNCACO and HNCO that correlate amide Hi-Ni chemical shifts with COi and COi-1 chemical shifts, and COi-1 chemical shifts, respectively;

 HNCA and HNCOCA that correlate amide Hi-Ni chemical shifts with Cαi and Cαi-1 chemical shifts, and Cαi-1 chemical shifts, respectively;

 HNCACB and HNCOCACB that correlate amide Hi-Ni chemical shifts with Cαi, Cβi, Cαi-1, and Cβi-1 chemical shifts, and Cαi-1 as well as Cβi-1 chemical shifts, respectively; where i is a residue in the protein sequence and i-1 is the preceding residue (102). During the process of backbone resonance assignment, first, sequential chains of strips (parts of the spectrum showing the particular frequencies in 1H and 15N dimensions but the whole range of

20 frequencies in 13C dimension) are build using information in 3D spectra (Figure 1.6C). Briefly, one looks for a strip with 13C resonances for residue i of the same frequency as 13C resonances for residue i-1 in the analyzed strip (Figure 1.6D). As a next step, assignment of resonances to specific amino-residues in the protein sequence is done based on the fact that 13Cα and 13Cβ chemical shifts differ between various amino-acids (103). Finally, Hα and Hβ atoms are assigned using experiments HNCAHA and HN(CO)CAHA that correlate Hi-Ni pairs with Hαi and Hαi-1 chemical shifts, and Hαi-1 chemical shifts, respectively; and experiments HBHA(CO)NH and

HAHBNH that correlate Hi-Ni pairs with Hαi-1 and Hβi-1 chemical shifts, and Hαi, Hβi, Hαi-1 and

Hβi-1, respectively (98).

Once the backbone resonance assignment is complete, side-chain assignment is performed using 3D double-resonance experiments H(C)CH-TOCSY and (H)CCH-TOCSY that provide correlations of 1H-13C pairs to all the 1H and 13C atoms in the same amino residue sidechain, respectively (104).

Importantly, chemical shifts contain information about secondary structure of a protein that can be derived using several available algorithms. 13Cα, 13Cβ and 13CO chemical shifts can be used to predict secondary structure elements present in the protein fold using the Chemical shift index (CSI) method (105). This approach is based on the finding that the chemical shifts of Cα and CO nuclei located in α-helices and β-strands shift downfield and upfield, respectively

(compared to the chemical shifts in random coil) and the resonances of Cβ nuclei shift in the opposite direction. Also, given the fact that secondary chemical shifts are dependent on local secondary structure, assigned backbone resonances can be used for prediction of dihedral angle restraints for angles φ and ψ (Figure 1.6E). The prediction is often made in program TALOS+

21

(106) that compares chemical shifts of a given residue and its adjacent neighbors to a database of high resolution protein structures with known secondary chemical shifts and torsion angles.

Another set of restraints is obtained from the analysis of 3D 15N- and 13C-edited NOESY

(Nuclear Overhauser Effect Spectroscopy) spectra that correlate protons from amide groups and

C-H groups, respectively, with all nearby protons located within distance up to 5Å (102). Since inter-proton distances are limited by protein secondary and tertiary structure, 1H – 1H distance restraints obtained from NOEs are pivotal for structure calculation (Figure 1.6F). Briefly, NOE effect is a magnetization transfer between adjacent nuclei caused by cross relaxation in a dipolar coupled spin system. Important for structural biology, the cross relaxation rate is inversely proportional to the sixth power of the distance between the two interacting protons (98). Thus, the inter-proton distances in principle can be estimated from intensities of cross-peaks in

NOESY spectra. However, because of the internal protein motions and complexity of the 1H-1H interaction network, it is impossible to measure the distances precisely, instead the peak intensities are clustered in three groups of strong, medium and weak. Each group is associated with upper distance limits: strong – 2.7 Å, medium – 3.3Å and weak – 5Å (103).

Because of the complexity of the 15N- and 13C- edited 3D NOESY spectra, their assignment is done automatically in the beginning of the protein structure calculation process. Briefly, one of several available programs including CYANA, XPLOR-NIH, ARIA or UNIO (107-111) uses the resonance assignments to generate preliminary assignments for NOE peaks followed by their conversion into distance restraints. Then, a program generates an initial set of structures, which is used to confirm the NOE assignments that are satisfied in these structures and to discard ones that are violated. Once NOEs are unambiguously assigned and upper limits for inter-proton distances are defined, a program attempts to fold the protein structure in a way that every atom

22 will have space coordinates satisfying the given restraints. Fitness of the structure is characterized by its potential energy with lower energy indicating a smaller amount of the restraints violated. In practice, structure determination is an iterative process, in which one examines the distance (angle) restraints that have been violated during calculation, corrects the resonance and NOEs assignments and repeats the calculation. Importantly, NMR based restraints do not provide exact values but rather define the range of permitted distances and angle rotations.

Thus, NMR structure calculation cannot result in the protein structure adopting a single conformation. Usually, 100 to 200 structures are calculated and 20 structures of the lowest energy are chosen for final water refinement (103).

1.2.3. NMR studies of protein dynamics

Knowing protein structure often helps one to elucidate functions. However, atomic coordinates do not provide insight into internal motions and conformational exchange inherent for all the proteins. Studying time-dependent changes in protein structure (referred to as protein dynamics) is important for understanding key biological processes including catalysis, ligand binding as well as protein folding/misfolding and aggregation. Solution NMR spectroscopy is a powerful tool for studying protein dynamics and to date many NMR experiments have been developed to probe protein motions on time scales ranging from femtoseconds to seconds (112).

NMR spin relaxation, Carr-Purcell Meiboom-Gill relaxation dispersion (CPMG) and rotating frame relaxation (R1ρ) methods used in this work are briefly discussed below.

Heteronuclear spin system excited by an rf-pulse will relax, e.g. return to equilibrium, with characteristic times T1 (recovery of M0) and T2 (loss of coherence between individual spins persessing perpendicular to the magnetic field). R1 = 1/T1, R2 = 1/T2 rate constants and the values of heteronuclear NOEs (hrNOEs) related to the rate of dipole-dipole cross-relaxation between

23 nuclei, are widely used to elucidate ps – ns protein dynamics. The reason is that the protein random tumbling occurring on ns time scale and internal motions such as rotation of torsion angles and loop motions occurring on ps – ns time scale cause fluctuations in local magnetic fields generated by spins that lead to relaxation of 1H-X pairs (X – 15N or 13C). Two major mechanisms relating these motions to time-dependent oscillating fields at MHz frequencies are dipolar coupling and chemical shift anisotropy, which is the dependency of chemical shift on orientation of the molecule relative to the applied magnetic field. R1, R2 rate constants and hrNOEs are given by:

2 2 푅1 = (푑 /4)[퐽(휔퐻 − 휔푋) + 3퐽(휔푋) + 6퐽(휔퐻 + 휔푋)] + 푐 퐽(휔푥)

2 푅2 = (푑 /8)[4퐽(0) + 퐽(휔푋 − 휔퐻) + 3퐽(휔푋) + 6퐽(휔퐻) + 6퐽(휔퐻 + 휔푋)]

2 + (푐 /6)[4퐽(0) + 3퐽(휔푋)] + 푅푒푥

2 푁푂퐸 = 1 + (푑 /4푅1)(훾푋/훾퐻)[6퐽(휔퐻 + 휔푋) − 퐽(휔퐻 − 휔푋)]

−3 2 where d = μ0hγXγH〈푟푋퐻 〉/(8π ); c = ωXΔσ/√3, µ0 is the magnetic permeability of free space; γH

1 and γX are the gyromagnetic ratios of spins H and X, respectively; ωH and ωX are the Larmor

1 1 frequencies of spins H and X, respectively; rXH is the X- H bond length; ∆σ is the chemical shift anisotropy of spin X; Rex is conformational exchange term due to μs – ms time scale processes accompanied by changes in chemical shifts and J(ω) is spectral density function that quantifies the amplitude of local field fluctuations at frequency ω (113). J(ω) is obtained by a cosine

Fourier transform of correlation function, C(t), which describes how rapidly bond vector

“forgets” its prior orientation:

퐶(푡) = ⟨푃2(휇(0) ∙휇(푡))⟩ where μ(0) and μ(t) are orientations of a bond vector at time zero and t, respectively and P2 =

(3x2 - 1)/2 is the Legendre polynomial of rank 2 (114).

24

Obtained backbone 15N NMR relaxation data are most commonly interpreted using model- free data analysis originally introduced by Lipari and Szabo (115), and later extended by Clore et al. (116). This approach describes molecular motion using the correlation function, C(t), given by a product of two correlation functions Co(t) and CI(t) describing overall molecular tumbling and internal motions, respectively, that are assumed to be independent of each other:

퐶(푡) = 퐶표(푡)퐶퐼(푡).

Co(t) for a spherical molecule is given by:

퐶(푡) = 푒−푡/휏푅 where 휏푅is the effective overall rotational correlation time. CI(t) depends on the timescale of the internal motions. If the internal motions, 휏푒 , are much faster than the overall rotation of the molecule and are in the extreme narrowing limit (ωτe ≪ 1), CI(t) is given by:

2 2 −푡/휏푒 퐶퐼(푡) = 푆 + (1 − 푆 )푒

2 where τe is effective correlation time and S is a generalized order parameter that describes a range of main chain motions and varies from 1 to 0 for completely restricted and unrestricted isotropic motion, respectively (115). The spectral density function J(ω) in this case is:

2 2 ′ 푆 휏푅 (1 − 푆 )휏푒 퐽(휔) = + ′ 2 1 + (휔휏푅) 1 + (휔휏푒)

′ where 휏푒 = 휏푅휏푒/(휏푅 + 휏푒). If the internal motions are not in the extreme narrowing limit and fast, 휏푓, and slow, 휏푠, components significantly differ, CI(t) is given by:

2 2 −푡/휏푓 2 2 −휏/휏푠 퐶퐼(푡) = 푆 + (1 − 푆푓 )푒 + (푆푓 − 푆 )푒

2 2 2 where 푆 = 푆푓 푆푠 (116). J(ω) in this case is:

2 2 2 ′ 2 2 ′ 푆푓 푆푠 휏푅 (1 − 푆푓 )휏푓 푆푓 (1 − 푆푠 )휏푠 퐽(휔) = + 2 + ′ 2 1 + (휔휏푅) ′ 1 + (휔휏푠) 1 + (휔휏푓)

25

′ ′ where 휏푓 = 휏푅휏푓/(휏푅 + 휏푓) and 휏푠 = 휏푅휏푠/(휏푅 + 휏푠) . Parameters of internal motions are obtained from least square fit of relaxation data to equations for R1, R2 rates and hrNOEs (113) constructed using spectral density functions appropriate for the particular case. For more detailed and comprehensive explanation of the model-free approach and the description of a case of anisotropic motion see (114).

CPMG and R1ρ experiments are used to study relatively slow μs-ms protein dynamics (114).

Conformational exchange in this time range accompanied by changes in chemical shifts is known to contribute to R2 rates:

0 푅2 = 푅2 + 푅푒푥

0 where 푅2 is the R2 rate in the absence of chemical exchange and Rex is the contribution from conformational exchange. A CPMG experiment exploits the dependence of Rex on repetition rate of series of refocusing 180° pulses. In the constant relaxation time CPMG experiment, a number of spin-echo refocusing elements (δ-180°-δ) are applied during a fixed relaxation period T with different values of delay δ (117). A R1ρ experiment measures longitudal relaxation in the doubly rotating frame as a function of the strength of the spin-locking field and the offset of nucleus resonance frequency from the carrier frequency of the rf-pulse, ω0 (118,119). In either case, exchange rates kex and/or populations of states A and B, pA and pB, respectively, are obtained by a least square fit of measured R2/R1ρ rates to theoretical two-state or more complex models of exchange process as described in (114,120).

1.2.4. NMR titration experiments in studies of protein –ligand interactions

NMR spectroscopy is widely used for probing protein – ligand interactions with the ligand being another protein, DNA/RNA or a small-molecule compound. NMR titration experiments,

26 also called chemical shift perturbation experiments, are based on the fact that chemical shifts are extremely sensitive to the electronic environment of a nucleus. The most common NMR method for testing the binding includes gradual addition of an unlabeled ligand to a 15N-labeled protein and recording 1H-15N HSQC spectra of the protein at each point of titration. In the case of direct interaction, the proximity of the ligand and subtle conformational changes in the binding site lead to local changes in the electronic environment of the protein residues directly involved in binding. Therefore, corresponding peaks move in the NMR spectrum. In addition, if ligand binding induces a structural rearrangement in the protein, chemical shift perturbations for residues not involved in interaction but participating in conformational changes may also be observed. It is notable that strong dependence of chemical shifts on the electronic environment makes NMR spectroscopy sensitive to even very weak interactions with millimolar affinities that cannot be observed by other methods.

For a simple case of the ligand L reversibly binding to a single site within the protein P, interaction can be described by 푃 + 퐿 ↔ 푃퐿. Importantly, the exchange rate between free (P) and bound states (PL) of the protein kex = kon + koff (kon and koff are rates of the association and dissociation reactions, respectively) affects the observable chemical shift changes in 1H-15N

HSQC spectra. If the binding process is fast on the NMR timescale (kex ≫ Δω) meaning that the complex association and dissociation occur rapidly during NMR signal detection, the resonances at each titration point will gradually move with increasing amount of the ligand. The frequency of each signal is the population weighted average of the chemical shifts for free and saturated protein states (Figure 1.6G, top). If the interaction is slow (kex ≪ Δω), the peaks corresponding to the free protein will gradually disappear and the peaks corresponding to the bound state will appear. Thus, in this case, two separate sets of the peaks are observed for the free protein and its

27 complex with the ligand, and the intensity of the signals reflects relative concentrations of the free and bound protein (Figure 1.6G, bottom). Finally, if exchange is in the intermediate regime

(kex ~ Δω), the peaks will gradually move and broaden at the same time (Figure 1.6G, middle).

Notably, NMR spectroscopy is the only method that allows estimation of the dissociation constant KD and the mapping of the binding site from a single experiment, provided that the structure and the backbone resonance assignment of the protein are available. The binding site identification is straightforward. First, one maps residues exhibiting most significant chemical shift perturbations upon ligand binding on the protein structure. Next, considering the nature of interaction (hydrophobic or electrostatic) and the size of the ligand, one looks for a cluster of residues affected by binding, which is thought to form the binding site. The presence of more than one cluster may be indicative of two or more binding sites or allosteric conformational changes occurring upon interaction. The dissociation constant KD for a two-state binding process in thermodynamic equilibrium is equal to

[푃][퐿] 퐾 = 퐷 [푃퐿] where [P], [L] and [PL] are concentrations of the protein, the ligand and the complex, respectively. KD can be estimated for protein – ligand interactions in the fast to intermediate regime on the NMR time scale by nonlinear least-square fitting of observed chemical shifts Δωobs as a function of ligand concentration to the following equation:

2 (퐾D + [L]t + [P]t) − √([푃]t + [L]t + 퐾D) − 4[P]t[L]t ∆ωobs = ∆ωmax 2[P]t where [P]t and [L]t are the total protein and ligand concentrations and Δωmax is the maximum chemical shift difference at saturation (121).

28

1.3. Thesis objectives

USP7 is a deubiquitinating enzyme involved in regulation of several oncogenic pathways and its inhibition represents a promising pharmaceutical strategy for treatment of human malignancies. All existing USP7 inhibitors target its catalytic core and, therefore, may affect protein levels of its numerous substrates leading to unpredictable side effects. Blocking USP7 interaction with a specific substrate may present a better strategy for manipulating the enzyme activity but given that USP7 deubiquitinates dozens of proteins, it is unclear how its substrate specificity is achieved. Several previously identified binding partners of USP7 share a Pro/Ala-

X-X-Ser motif recognized by its N-terminal TRAF-like domain, while others lack this sequence and are recognized by non-homologous C-terminal UBL domains. Interestingly, different UBL domains can recognize different substrates, suggesting that the C-USP7 may serve as a substrate- specific binding platform. However, the precise location of the binding sites harbored within C-

USP7 is unknown. Furthermore, it is unclear whether the substrates recognized by C-USP7 share any common structural features or motifs recognized by USP7. In addition, USP7 is the only enzyme among USPs that has the catalytic triad in an unproductive conformation. The rearrangement into a catalytically competent conformation is thought to occur upon binding to ubiquitin as seen in the crystal structure of the USP7 catalytic core in a complex with ubiquitin aldehyde, but such an active site rearrangement has not been demonstrated for USP7 interacting with free ubiquitin and in solution. Answering the questions of “What determines USP7 substrate specificity?” and “How is USP7 activated?” will provide a structural basis for rational development of potent and highly targeted cancer therapeutics.

Although all reported USP7 inhibitors are known to target its catalytic domain, no structural information characterizing their binding to the enzyme is currently available, which hinders

29 further optimization of the lead compounds. Thus, the third question we addressed here is: “What is the molecular mechanism of USP7 inhibition by small-molecule compounds that show anti- tumor potential in cancer cell lines?” In order to answer these questions, we, for the first time, utilized NMR spectroscopy to study this fascinating and pharmaceutically relevant biological system. The overall goal of this thesis was to provide structural insights into the molecular mechanisms of USP7 substrate specificity, regulation and inhibition using solution NMR in combination with other biophysical and biological methods. The specific aims were the following1:

Aim 1. Structurally characterize the C-USP7 interaction with viral ICP0 protein that is missing the conventional USP7 recognition motif in order to identify new determinants of USP7 substrate specificity.

a) Identify domains/regions of C-USP7 responsible for binding to ICP0.

b) Map the binding site for ICP0 on the USP7 structure.

c) Explore the significance of ICP0 interaction with USP7 during HSV-1 lytic infection in

vivo.

Aim 2. Identify the molecular mechanism of USP7 activation upon ubiquitin binding in order to provide new means for manipulating the enzyme.

a) Test binding of ubiquitin to USP7 in solution.

b) Track conformational changes occurring in the catalytic domain of USP7 upon

interaction with ubiquitin.

1 Note that in order to reflect the work described in the thesis, the overall goal and the specific aims were modified from those presented in the prospectus.

30

Aim 3. Identify the molecular and chemical mechanism of USP7 inhibition by small-molecule compounds to allow rational optimization of inhibitors targeting the enzyme.

a) Map the binding site for USP7 inhibitors P22077 and P50429 on the structure of the

USP7 catalytic core.

b) Characterize the mode of action of USP7 inhibitors P22077 and P50429.

In addition, chapter 4 of the thesis is not related to the USP7 project and represents another example of using solution NMR spectroscopy in studies of complex large biological systems. It describes structural studies of Y-family polymerase REV1 essential for DNA damage tolerance pathway translesion synthesis (TLS) and explores the mechanisms of REV1 interactions with other TLS polymerases.

31

Table 1.1. List of USP7 substrates.

Substrate function Substrate name Refs Epigenetic control of MEL18 (42) Transcription factors RING1B (41) p53 (24) Bmi1 (42) HDM2 (25) Histone H2B (49) Rb (122) DNMT1 (129) FOX(O)4 (50) UHRF1 (39) FOX(O)3 RNF168 (47) PTEN (51) MLL5 (46) β-catenin (67) Tip60 (44) PPARγ (123) UbE2E1 (74) N- (124) PHF8 (45) DNA replication/ Cell cycle Telomere proteins CHFR (125) TPP1 (130) Bub3 (126) Immune response Sumo (38) NFκB (48) DNA damage TRAF6 (131) response/ DNA repair IKKγ (131) DAXX (127) TRIM27 (132) Polymerase η (36) Viral proteins HLTF (33) EBNA1 from EBV (56) Rad18 (34) ICP0 from HSV-1 (55) Clapsin (31) vIRF4 from KSHV (59) Chk1 (30) LANA from (57) UVSSA (32) KSHV UL35 from HCMV (60) Mule/ARF-BP1 (128) E1B from (61) XPC (29) Adenovirus

32

Figure 1.1. Functions of DUBs in the ubiquitin-proteasome system. A protein substrate mono- and polyubiquitination is catalyzed by consecutive activity of E1, E2 and E3 enzymes. The polyubiquitination targets the substrate for proteasomal degradation while monoubiquitination results in a different functional outcome. DUBs deubiquitinate both poly- and monoubiquitinated proteins and redirect their fate. Also DUBs edit polyubiquitin chins and recycle them Finally, DUBs participate in maturation of free ubiquitin. Ub – ubiquitin; E1 – ubiquitin activating enzyme; E2 – ubiquitin conjugating enzyme; E3 – ; DUB – deubiquitinating enzyme.

33

Figure 1.2. Human DUBs. DUBs are classified into two classes of cysteine proteases and metalloproteases. Cysteine proteases are further subdivided into five families of ubiquitin- specific proteases (USPs) ubiquitin carboxyl-terminal hydrolases (UCHs), ovarian tumor proteases (OTUs), Machado-Joseph (Josephin) domain containing (MJD) proteases and MINDY (motif interacting with Ub-containing novel DUB family). Adapted from (133). The class of metalloproteases consists of the only family of JAB1/MPN/MOV34 domain containing proteases (JAMMs).

34

Figure 1.3. USP7 domain organization. A. Schematic representation of USP7 domains with borders specified. UBL – ubiquitin-like domain. B, C. Crystal structures of free USP7 TRAF- like domain (PDB ID: 2F1W) and its complex with a p53364-367 peptide shown in yellow (PDB ID: 2FOJ) (27,73). D, E. Crystal structures of free USP7 catalytic core (PDB ID: 1NB8) and its complex with ubiquitin aldehyde (PDB ID: 1NBF) (72). The side-chains of the catalytic residues Cys223, His464 and Asp481 are shown in green. The distances between the Cγ atom in Cys223 and the Nε2 atom in His464 are also shown. In E ubiquitin is shown in cyan.

35

Figure 1.4. C-terminal region of USP7. A. Crystal structure of the C-terminal region of USP7 (PDB ID: 2YLM) consisting of five ubiquitin-like domains (UBL1 – 5) (71). Here and later, UBL1 is colored in blue, UBL2 – in green, UBL3 – in grey, UBL4 – in orange and UBL5 – in red. B. Overlay of the UBL1 NMR structure (PDB ID: 2KVR) (134) onto the structure of ubiquitin (PDB ID: 1UBQ) (135). C. Current model of USP7 activation by its C-terminal UBL domains. The model was adapted from (71,84).

36

Figure 1.5. Human USPs predicted to have at least one ubiquitin-like domain. The protein domains are drawn approximately to scale. The catalytic domain is colored in red, UBL – in green and DUSP (domain present in ubiquitin-specific protease) – in yellow. Other domains are also indicated. Adapted from (16).

37

Figure 1.6. NMR spectroscopy in structural biology. A. Schematic representation of a basic NMR experiment. d1 – relaxation delay, pw – pulse width, at – acquisition time and FID –free induction decay. B. Schematic representation of a 1H-15N HSQC spectrum. Amide peaks are shown in blue, side-chain NH2 groups of asparagines and glutamines are shown in green and Nε- Hε pairs of tryptophans are shown in magenta. C. Schematic representation of 3D HNCA spectrum where CA peaks (colored in red) of a given and preceding residue are visible in every strip shown in orange. D. Representation of the sequential backbone assignment process. E. Backbone torsion angles. F. NMR Structure of UBL1 (PDB ID: 2KVR) with all inter-proton distances less than 5 Å shown. G. Examples of the chemical shift perturbations in 1H-15N TROSY-HSQC spectra caused by binding to a ligand. Three different experiments are shown with binding in fast (top), intermediate (middle) and slow (bottom) regimes on the NMR timescale. The direction of chemical shifts movement is indicated by arrows.

38

CHAPTER 2

Structural characterization of interaction between USP7 and ICP0 E3 ligase from HSV-1

Alexandra Pozhidaeva, Kareem N. Mohni, Sirano Dhe-Paganon, Cheryl H. Arrowsmith, Sandra

K. Weller, Dmitry M. Korzhnev and Irina Bezsonova

This research was originally published in Journal of Biological Chemistry. Pozhidaeva A.K.,

Mohni K.N., Dhe-Paganon S., Arrowsmith C.H., Weller A.K., Korzhnev D.M., Bezsonova I.

Structural Characterization of Interaction between Human Ubiquitin-specific Protease 7 and

Immediate-Early Protein ICP0 of Herpes Simplex Virus-1. J Biol Chem. 2015; 290(38):22907-

18. © the American Society for Biochemistry and Molecular Biology.

Authors contributions: AKP performed all experiments and data analysis associated with

USP7-ICP0 binding and wrote a manuscript. KHM performed in vivo experiment shown in

Figure 2.4 and KHM, and SKW analyzed the results. IB, SDP and CHA contributed to NMR structure determination of the UBL1 domain. AKP and DMK performed and analyzed NMR dynamics experiments. All authors reviewed the results and approved the final version of the manuscript.

39

2.1. Abstract

Human ubiquitin-specific protease 7 (USP7) is a deubiquitinating enzyme that prevents protein degradation by removing poly-ubiquitin chains from its substrates. It regulates stability of a number of human transcription factors and tumor suppressors and plays a critical role in the development of several types of cancer, including prostate and small cell lung cancer. In addition, human USP7 is targeted by several viruses of the family and is required for effective Herpesvirus infection. The USP7 C-terminal region (C-USP7) contains five

Ubiquitin-Like domains (UBL1-5) that interact with several USP7 substrates. Although structures of the USP7 C-terminus bound to its substrates could provide vital information for understanding USP7 substrate specificity, no such data has been available to date.

In this work we have demonstrated that the USP7 UBL domains can be studied in isolation by solution NMR spectroscopy and have determined the structure of the UBL1 domain.

Furthermore, we have employed NMR and viral plaque assays to probe the interaction between the C-USP7 and HSV-1 immediate-early protein ICP0 (infected cell protein 0), which is essential for efficient lytic infection and virus reactivation from latency. We have shown that depletion of the USP7 in HFF-1 cells negatively affects the efficiency of HSV-1 lytic infection. We have also found that USP7 directly binds ICP0 via its C-terminal UBL1-2 domains and mapped the USP7 binding site for ICP0. This study, therefore, represents a first step toward understanding the molecular mechanism of C-USP7 specificity toward its substrates and may provide the basis for future development of novel anti-viral and anti-cancer therapies.

40

2.2. Introduction

Conjugation of ubiquitin to target proteins is important in multiple cellular processes, including signaling cascades, proteasomal degradation and protein localization. Mono- and polyubiquitination can be reversed by deubiquitinating enzymes (DUBs), which cleave the isopeptide bond between ubiquitin and a substrate. Ubiquitin-specific protease 7 (USP7), also known as Herpes virus-associated ubiquitin-specific protease (HAUSP), is a 135 kDa cysteine isopeptidase (11). Misregulation of USP7 was linked to multiple human pathologies, including prostate and non-small cell lung cancers (51,136).

A number of USP7 substrates have been identified, and USP7 knockdown was shown to affect cellular levels of at least 36 proteins in LS174T colon carcinoma cells (23). Among a variety of substrates, USP7 deubiquitinates both p53 tumor suppressor and its main regulator E3 ligase Hdm2 (human double minute 2) and thus plays a central role in maintaining an appropriate p53 level in the cell (24-27). Other important targets of USP7 include several transcription factors such as FOX(O)4, PTEN, NF-kB and HLTF; DNA replication and repair proteins such as

UbE2E1, claspin and MCM-BP; and proteins involved in epigenetic control of gene expression such as DNMT1 and PRC1 (33,37,39,41,48,50,51,74).

In addition to cellular substrates, USP7 is known to interact with proteins from all three subfamilies of the Herpesviridae family of viruses. Specifically, it interacts with immediate early protein ICP0 from Herpes Simplex Virus-1 (HSV-1), which is required for efficient lytic infection and reactivation of latent and quiescent viral genomes (53). USP7 also binds two proteins from Kaposi’s sarcoma-associated herpes virus (KSHV) LANA and vIRF4 (57,59).

LANA is the major viral protein expressed in all forms of KSHV-associated malignancies, and vlRF4 is the key player in the switch from latency to lytic reactivation. USP7 also interacts with

41 the EBNA1 protein from Epstein-Barr virus (EBV), which competes with the p53 transcription factor for USP7 binding (58,70). Thus, USP7 binding to EBNA1 results in p53 destabilization and blockage of the DNA damage response caused by EBV infection. Finally, USP7 binds the

UL35 protein from cytomegalovirus essential for viral replication and particle formation (60). In addition to Herpesvirus targets, USP7 was recently shown to interact with at least one adenoviral protein, E1B-55K, a component of the E1B-55K/E4-orf6 E3 ligase complex that degrades a wide range of intracellular proteins. Downregulation or specific inhibition of USP7 destabilizes E1B-

55K and negatively affects viral replication (61,137). Due to the central role USP7 plays in viral infection and cancer development pathways, it represents an attractive target for the design of new anti-viral and anti-cancer therapies. Although knowledge of molecular mechanisms of USP7 specificity towards its diverse substrates is essential for drug development, to date these mechanisms remain poorly understood.

Structurally, USP7 consists of an N-terminal substrate-binding TRAF-like domain followed by a ubiquitin-binding catalytic domain and a large 64 kDa C-terminal region (C-USP7, residues

557-1102) that contains five ubiquitin-like domains (UBL 1-5) shown in Figure 2.1A

(70,71,138). USP7 complexes with several substrates have been previously characterized, including peptides derived from p53, Hdm2, HdmX, UbE2E1, MCM-BP, vlRF4 and EBNA1

(27,37,58,59,73,74). Remarkably, all these substrates share a common USP7-binding Pro/Ala-X-

X-Ser motif and interact with the same site located in the TRAF-like domain. On the other hand, the function of the USP7 C-terminal UBL domains remains largely unknown. Interestingly, UBL domains were found in at least 16 members of the USP family. All of them share a characteristic

β-grasp ubiquitin fold, although their sequence similarity to ubiquitin may be very limited.

Although a few of these UBLs have been characterized, the functions of others, including UBLs

42 of the USP9, USP24, USP34, USP47 and USP48 remain elusive (139). The ICP0 E3 ligase from

HSV-1 is the first USP7 substrate shown to interact with C-USP7, suggesting that the UBL domains may function as a platform for substrate binding. Recent studies revealed that the USP7

C-terminal region contains additional binding sites for p53 and Hdm2, as well as for the transcription factor FOX(O)4, which interacts exclusively with the C-terminus (50,83). However, at this time no structural data is available for any of these C-USP7 complexes.

Here, we report a structural study of the C-USP7 – substrate interaction. With the use of

NMR spectroscopy we have shown that individual USP7 UBL domains and/or two-domain constructs are stable in isolation and can be studied independently in solution. Furthermore, we have determined solution NMR structure of the USP7 UBL1 domain and demonstrated that it is nearly identical to the crystal structure of the UBL1 in the context of the intact C-terminus (71).

Finally, we have characterized the interaction between C-USP7 and the ICP0 protein from HSV-

1 virus and mapped the ICP0 binding interface on the surface of the UBL1-UBL2 domains. Our work represents the first structural study of the USP - substrate recognition mediated by its UBL domains. Since many of the USPs contain multiple UBLs (139), it is likely that substrate recognition by the UBL domains may prove to be a common feature shared among a class of

UBL-containing USPs.

2.3. Experimental procedures

In vitro studies:

Protein expression and purification. Sequences encoding UBL1 (residues 537-664), UBL12

(residues 535-775), UBL3 (residues 775-888) and UBL45 (residues 884-1102) domains of the human USP7 were sub-cloned into pET28a-LIC Vector (Structural Genomics Consortium). All

43 proteins were purified following the same protocol. Plasmids containing recombinant proteins with the N-terminal His6-Tag were transformed into Escherichia coli BL21 (DE3). Proteins were

15 13 expressed in M9 minimal medium containing NH4Cl (and C-glucose for production of the

15 13 15 13 N/ C-labeled UBL1). N/ C-labeled deuterated UBL12 was expressed in 100% D2O-based

M9 medium using uniformly 13C/2H-glucose as the sole carbon source. Transformed cells were grown at 37°C to an OD600 of 0.6-0.8. Protein expression was induced by adding 1mM IPTG followed by incubation overnight at 20°C. Cells were harvested, resuspended in buffer containing 20 mM NaH2PO4, 250 mM NaCl, 10 mM imidazole, 1 mM PMSF, pH 7.4, lysed by sonication and centrifuged at 15000 rpm for 45 min. The supernatant was applied to a Ni-NTA column and proteins were eluted with a buffer containing 20 mM NaH2PO4, 250 mM NaCl, 250 mM imidazole, pH 7.4. Following thrombin or TEV digestion to remove the His6-tag, proteins were additionally purified using either HiLoad Superdex 75 (for UBL1, UBL3) or HiLoad

Superdex 200 (for UBL12, UBL45) columns (GE Healthcare). Final samples used for NMR experiments contained 0.8-1.2 mM of purified protein, 20 mM NaH2PO4, 250 mM NaCl, 2 mM

DTT, pH 7.0. The 15N/13C-labeled UBL1 sample additionally contained 0.5 mM PMSF, 1 mM benzamidine and 1 mM TCEP.

NMR spectroscopy and structure calculation. 1H-15N HSQC spectra for all C-USP7 constructs were recorded on 800 MHz (1H) Agilent VNMRS spectrometer at 25°C. All experiments for the UBL1 structure calculation were acquired on 600 MHz and 800 MHz (1H)

Bruker Avance spectrometers at 25ºC. The data were processed with NMRPipe (140) and analyzed with Sparky (141) and CARA (142). The backbone and side-chain resonance assignments of the uniformly 15N/13C-labeled UBL1 domain were performed using standard 2D

1H-15N HSQC, 1H-13C HSQC and 3D HNCA, HNCO, HBHA(CO)NH, CBCA(CO)NH,

44

HC(C)H- and (H)CCH-TOCSY experiments (102). Sequence specific resonance assignment of the UBL1 was done using ABACUS software (143,144). Nearly complete UBL1 NMR resonance assignment was obtained with 93.3% of all expected chemical shifts assigned. NMR distance restraints were obtained from 3D 15N- and 13C-edited NOESY-HSQC experiments

(102). H-bond restraints were added on the basis of NOE analysis. Backbone dihedral φ and ψ angle restraints were derived from the analysis of 15N, 1HN, 13C, 13C and 13CO chemical shifts using TALOS+ (106). Structures were calculated using CYANA (107). A total of

200 structures of UBL1 were calculated and 20 structures with the lowest energy were selected and refined using CNS (145).

The backbone resonance assignments of the 15N/13C-labeled deuterated UBL12 construct in its free and ICP0-bound forms were performed using 2D 1H-15N TROSY and TROSY-based 3D

HNCA, HNCOCA, HNCO, HNCACO, HNCACB and 15N-edited NOESY-HSQC (102) recorded on 800 MHz (1H) Agilent VNMRS spectrometer at 25°C. The data were processed with

NMRPipe (140) and analyzed with CcpNmr Analysis (146).

NMR Studies of Protein Dynamics. The backbone 15N relaxation experiments for the 15N- labeled UBL1 were performed at 25°C on 500 MHz (1H) Agilent VNMRS spectrometer as described elsewhere (118,147). All experiments were performed on a C576S UBL1 mutant

15 generated using QuikChange Kit (Agilent) to avoid slow cysteine-induced dimerization. N R1 and R1ρ relaxation rates were measured from a series of eight spectra recorded at different relaxation delays. Relaxation rates were extracted from exponential fits of peak intensities. 15N

R2 were obtained from longitudinal R1 and rotating-frame R1ρ rates (119) measured at a spin-lock field strength of 1592.4 Hz. The 15N{1H} NOE experiment was carried out in an interleaved manner by recording 2D 15N-1H spectra with and without 1H saturation period of 2.5 s using an

45 inter-scan delay of 8 s. Heteronuclear NOE values were calculated as ratios of peak intensities in corresponding 2D 1H–15N correlation spectra. Errors in NOE values were calculated based on peak intensities and the measured spectral noise levels.

NMR binding experiments. ICP0 binding to individual USP7 UBL domains was assessed by a series of titration experiments monitored by recording 800 MHz 1H-15N HSQC spectra of the domains. Peptide GPRKCARKTRH corresponding to the ICP0 residues 617-627 was custom synthesized (GenScript). The titrations were performed by gradual addition of the unlabeled peptide to ~0.5 mM 15N-labeled UBL domain samples (up to 1:10 protein to peptide ratio). Both the wild-type UBL1 and C576S mutant constructs were tested, and gave the identical results.

Frequency (chemical shift) perturbations were calculated for each amino acid residue as

2 2 1/2 15 1 i=(N +H ) , where N and H are N and H frequency differences between ICP0- bound and free states, respectively. The obtained i values were mapped onto the surface of the crystal structure of the UBL12 tandem (PDB ID: 2YLM) (71).

Isothermal Titration Calorimetry (ITC). The affinity of USP7 UBL12 – ICP0 interaction was determined by ITC. Measurements were performed with a Nano ITC instrument (TA

Instruments) in a buffer containing 20 mM NaH2PO, 100 mM NaCl, 2 mM β-mercaptoethanol, pH 7 at 25°C. ICP0 peptide at a concentration of 686 μM was titrated into 85 μM UBL12. A total of 20 injections were made with 300 second time intervals in between. ITC data were analyzed with NanoAnalyze software (TA Instruments). Data fitting was performed using an

“independent” model after correcting for the heat produced by ICP0 dilution.

46

In vivo studies:

Cell lines and Viruses. Vero and HFF-1 cells were purchased from the ATCC. HFF-1 cells were maintained in DMEM supplemented with 10% fetal bovine serum and used between passages 5 and 10. Vero cells were maintained in DMEM supplemented with 5% fetal bovine serum. Strain 17+ was used as wild type HSV-1. Virus in1863 is derived from strain 17+ and was obtained from Chris Preston (MRC Virology Unit, Glasgow, Scotland). This virus contains the lacZ gene under the control of the HCMV promoter/enhancer inserted in the tk gene.

Growth yield of HSV-1 on shUSP7 cells. Lentivirus generation and virus infection were done as previously described (148). Briefly, the pLKO.1 system was used to package lentiviruses and deliver shRNA into target cells. HFF-1 cells were infected with lentiviruses expressing shRNA against GFP (GCAAGCUGACCCUGAAGUUCA) or USP7 targeting sequence

(GGCAACCTTTCAGTTCACT) and selected with 2 g/mL puromycin. 72 hours post-lentiviral infection cells were infected with HSV-1 (in1863) at multiplicities of infection (MOI) of 0.1

PFU/cell. Progeny virus was collected at 24 hours post infection and titers were determined on

Vero cells by staining with crystal violet or X-gal for β-galactosidase-positive plaques as previously described (148).

Western blot analysis was used to visualize USP7 and ICP0. Ku70 was used as a loading control. After lentiviral transfection cells were infected with HSV-1 at an MOI of 2 PFU/cell.

Four hours post infection cells in 35mm dishes were lysed in 2X SDS sample buffer (4% SDS,

20% glycerol, 100mM Tris pH 6.8, 100mM DTT, 10% β-mercaptoethanol, 1X protease inhibitor cocktail (Roche), and 0.1% bromophenol blue) and boiled for 5 minutes. Proteins were resolved by SDS-PAGE and transferred to PVDF membranes. Membranes were blocked for 1 hour in 5% non-fat dry milk dissolved in TBST. Primary antibodies were diluted in blocking solution and

47 incubated overnight at 4°C. Primary antibodies used include monoclonal mouse ICP0 (5H7)

(1:1,000; Abcam), polyclonal rabbit USP7 (H200) (1:1,000; Santa Cruz), and monoclonal mouse

Ku70 (Ab-4) (1:5,000; NeoMarker).

2.4. Results

2.4.1. Isolated USP7 UBL domains can be studied independently in solution

The full-length USP7 as well as the C-USP7 have long represented a challenge for structural studies because of their rapid degradation during recombinant expression in E. coli (149). As a consequence, difficulties in C-USP7 production hampered investigation of its structure and function. The recent crystal structure of the isolated C-USP7 produced in insect cells (71) revealed the domain organization of the C-USP7 and confirmed that it consists of five ubiquitin- like domains. In order to produce stable constructs of the C-USP7 suitable for studies in solution we have tested bacterial expression of all individual UBL domains and their combinations in E. coli. We found that the isolated UBL1 and UBL3 express well and result in stable soluble proteins. In contrast, UBL2, UBL4 and UBL5 are unstable and degrade rapidly when expressed on their own. However, the two-domain combinations UBL12 and UBL45 express well and are stable and soluble. These results are in agreement with the C-USP7 crystal structure, which shows extensive interactions between UBL1 and UBL2 as well as between UBL4 and UBL5.

1H-15N HSQC NMR spectra of UBL1, UBL12, UBL3 and UBL45 are shown in Figure 2.1B.

The high quality of spectra and good chemical shift dispersion suggest that isolated UBL domains and/or their combinations described above represent independently folding units that can be used for structural and functional studies.

48

2.4.2. Solution structure of the USP7 UBL1 domain

Figure 2.2 shows the solution NMR structure of the USP7 UBL1 domain (residues 537-664).

According to secondary structure prediction obtained from the backbone 1H, 15N and 13C NMR chemical shifts using TALOS+ program (106), UBL1 adopts a characteristic ubiquitin-like β- grasp fold formed by β1-β2-α3-310-β3-β4-α4-α5-β5 elements (Figure 2.2A, top panel). The bottom panel on Figure 2.2A shows order parameters S2 for the backbone amide groups of the domain predicted from NMR chemical shifts using Random Coil Index (RCI) approach (150).

High S2 values of 0.8 – 0.9, indicative of limited backbone flexibility, were obtained for a majority of the residues. The exceptions are a linker connecting the first two helices with the first

β-strands of the β-grasp fold and two large loops: loop I and loop II. The final ensemble of the 20 lowest energy structures (Figure 2.2B) displays root mean square deviation (rmsd) of 1.96 ± 0.44

Å for the backbone and 2.70 ± 0.38 Å for all heavy atoms (see Table 2.1 for structural statistics).

As expected, the overall structure of the domain is well defined in the NMR ensemble with the exception of the two loops. (Figure 2.2B). The structure consists of a long kinked N-terminal α- helix (helices α1 and α2) followed by a β-grasp fold (Figures 2.2C and D). The α3-helix is packed against an anti-parallel β-sheet ( that forms a barrel-like structure

(Figure 2.2D). In this barrel, the β4-strand is very short and is stabilized by a single hydrogen bond between the backbone amide group of Leu638 and carboxyl of Leu622 from the β3-strand.

The N-terminal helix α1 does not belong to the ubiquitin-like fold and, in the context of the full- length USP7, connects C-USP7 with the catalytic domain of the enzyme. Although the sequence similarity of the USP7 UBL1 to ubiquitin is only 21%, it closely resembles a ubiquitin fold (PDB

ID: 1UBQ) with the backbone rmsd between the UBL1 and ubiquitin of 2.4 Å for regular secondary structure elements. Significant differences, however, are observed in the loop regions.

49

Thus, the UBL1 loop I that connects the β1- and β2-strands (residues 571-591) as well as loop II between the β3- and β4-strands (residues 625-637) are notably longer than those of ubiquitin.

The hydrophobic patch equivalent to the ubiquitin Ile44 patch (151) is also present in UBL1 and is centered around Trp623 (Figure 2.2D).

Figure 2.2E shows an overlay of the solution (red) and crystal (blue; PDB ID: 2YLM) structures of the UBL1 domain. The UBL1 NMR structure is in good agreement with the crystal structures of the entire C-USP7 (PDB ID: 2YLM) and the UBL12 tandem (PDB ID: 4PYZ) with the backbone rmsd between NMR and crystal structures of 2.40 Å and 2.35 Å, respectively. The most significant difference is in the loop conformations. The long loop II connecting the β3- and

β4-strands is poorly defined in the solution structure with only short and medium-range (|i-j|<5)

NOEs observed in this region (Figure 2.2F). This is in contrast to the crystal structure where residues 625-635 form type-I -hairpin. The position of the hairpin is stabilized by interaction with the UBL2 domain, including inter-domain salt bridges formed by residues Arg628 and

Asp764, as well as Arg634 and Glu737. Similarly, the position of loop I connecting - and β2- strands (residues 571-591) is poorly defined in our solution structure, while in the crystal structure it is stabilized by several inter-domain hydrogen bonds with the UBL2 residues.

Additionally, in the crystal structure, the relative position of the two long loops is constrained by a hydrogen bond between the backbone HN group of Lys633 (loop II) and the OD2 group of

Asp582 (loop I), while in the NMR ensemble the loops sample multiple conformations. Overall, the observed differences between the two structures may result from additional domains present in the X-ray structure and the fact that NMR is sensitive to protein dynamics in solution.

50

2.4.3. Backbone dynamics of the USP7 UBL1 domain

To confirm flexible conformations of the loops we have investigated the extent of conformational dynamics of the UBL1 domain using backbone 15N NMR relaxation

15 15 1 measurements, including N R1 and R2 rates and heteronuclear N{ H} NOE values

(118,119,147). These data report on Brownian rotational diffusion of a protein and its

15 conformational flexibility on the pico- to nanosecond time scales. N R1 and R2 relaxation rates and 15N{1H} NOEs suggest that UBL1 is a rigid molecule with increased sub-nanosecond time- scale flexibility of the two long UBL1 loops, as evidenced by a decrease in 15N{1H} NOE values, lower R2 rate and elevated R1 rates for amino acid residues located in loops I and II (Figure 2.3).

Additionally, relaxation rates and NOEs suggest that the N-terminal α-helix that does not belong to the β–grasp fold is also highly dynamic on the sub-nanosecond timescale.

2.4.4. USP7 is required for the efficient HSV-1 infection in HFF-1 cells

As a first step toward understanding the mechanism of substrate recognition by the C-USP7, we have characterized its interaction with the ICP0 E3 ligase from HSV-1. ICP0 was the first discovered and remains the best studied substrate of the USP7 that interacts exclusively with the enzyme's C-terminus (70). ICP0 is a 775 amino acid residue long immediate early protein that is expressed immediately after infection. Although ICP0 doesn’t interact with DNA directly, it acts as a promiscuous activator for all three temporally regulated kinetic classes of viral

(reviewed in (54)). It was previously shown that ICP0 self-ubiquitination promotes its own degradation; however, the formation of a complex with cellular USP7 stabilizes ICP0 and rescues it from proteasomal degradation (55,152). Mutations that abolish the USP7 binding lead to ICP0 destabilization during HSV-1 infection in U2O2 cells (55). Moreover, treatment of HeLa

51 cells with siRNA against USP7 leads to a defect in ICP0 accumulation and reduced expression of many ICP0 target genes, including ICP4 and UL42 (153). Furthermore, ICP0 interaction with

USP7 causes migration of the latter from the nucleus to the cytoplasm where it inhibits the NF- kB dependent inflammatory cytokine response to protect HSV-1 from innate immunity (131).

Altogether, these data suggest that USP7 not only plays an essential role in regulation of

ICP0 levels in the infected cell, but also is required for establishment of successful HSV-1 infection. This makes USP7/ICP0 binding interface an attractive target for the development of antiviral therapeutics. Therefore, we have explored the effect of a USP7 knockdown on the efficiency of HSV-1 infection. Although ICP0 is not essential for virus replication in cultured cells, ICP0-null mutant viruses grow poorly at low MOI. Because this effect is cell type dependent, we have utilized limited-passage human diploid fibroblasts since they are sensitive to the presence of ICP0 (reviewed in (54)). HFF-1 cells were infected with lentiviruses expressing shRNA against either USP7 or GFP (control) followed by HSV-1 infection at an MOI 2

PFU/cell. The knockdown of USP7 was confirmed by a Western blot (Figure 2.4C). As expected, ICP0 levels were severely decreased in cells lacking USP7 compared to cells expressing shRNA against GFP. In addition, cells were infected with HSV-1 at a lower MOI of

0.1 PFU/cell. Progeny virus was collected at 24 hours post infection and formation of plaques was monitored on Vero cells (Figures 2.4A and B). We observed that viral growth was reduced

100-fold in cells expressing shRNA against USP7 compared to control cells treated with shRNA against GFP. These results suggest that the presence of USP7 in the cell and its interaction with

ICP0 are essential for establishment of successful lytic infection at low MOI in HFF-1 cells.

52

2.4.5. The first two USP7 UBL domains (UBL12) are responsible for interaction with ICP0

The minimal ICP0 region required for interaction with USP7 was previously defined by R.

Everett et al. as residues 619-626 (152). This short peptide contains several Lysine and Arginine residues, which make it highly positively charged. On the other hand, the region of USP7 responsible for interaction with ICP0 was roughly defined using limited proteolysis to reside within residues 599-801 (70). This region contains a part of UBL1, the entire UBL2 and a part of the UBL3 domains. To determine the domain(s) of C-USP7 responsible for interaction with ICP0 we have performed a series of NMR binding experiments. Isolated 15N-labeled UBL1, UBL12 and UBL3 domains of the USP7 were gradually titrated with the unlabeled ICP0 peptide

(residues 617-GPRKCARKTRH-627) and binding-induced changes in peak positions in 1H-15N

HSQC spectra of the domains were monitored. As seen in Figures 2.5B and C, no peak shifts were observed for the UBL1 and UBL3, indicating that these domains alone are not sufficient for interaction with the ICP0 peptide. In contrast, the addition of ICP0 peptide to the 15N UBL12 caused progressive changes in the positions and intensities of a number of peaks in its 1H-15N

HSQC spectrum indicative of specific ICP0 binding (Figure 2.5A).

In the course of titration with ICP0, the peaks in the NMR spectrum of UBL12 do not move gradually, but rather disappear and then show up in new positions in the spectrum, indicative of a slow on the NMR time-scale binding process. Therefore, to measure UBL12 – ICP0 binding affinity and determine thermodynamic parameters for this interaction we have employed isothermal titration calorimetry (ITC) (Figure 2.5D). The results indicate that UBL12 binds ICP0 in 1:1 stoichiometry. The best fit of the titration curve was obtained with an independent site model with a dissociation constant (KD) of 14.6 ± 1.3 μM and enthalpy (∆H) of -24.5 ± 0.7 kcal/mol, resulting in entropy (ΔS) of -60.2 cal/(mol K) and Gibbs Free energy (ΔG) of -6.6

53 kcal/mol at 25°C. The high ΔH value suggests that the reaction is enthalpy-driven with primary contributions to the complex stabilization likely resulting from electrostatic interactions and/or hydrogen bonds.

2.4.6. Mapping of the UBL12-ICP0 binding site

As seen in Figure 2.5A, a large number of peaks have shifted in 1H-15N HSQC spectrum of the

USP7 UBL12 as a result of ICP0 peptide binding. Due to slow on the NMR time scale binding, the backbone resonances of the UBL12 in complex with ICP0 cannot be assigned by tracing peak shifts in the UBL12 spectrum during the titration. Therefore, to quantify ICP0-induced changes in peak positions and thus identify the UBL12 amino acid residues responsible for ICP0 recognition we have assigned the backbone resonances of the free and ICP0-bound forms of the

UBL12.

Figure 2.6A summarizes the extent of the UBL12 chemical shift assignment: circles below the amino acid sequence correspond to assigned peaks in the 1H-15N HSQC spectra of the free and ICP0-bound UBL12 constructs. UBL1 and UBL2 portions of UBL12 are colored in blue and green, respectively. A total of 74% of expected NMR resonances of free UBL12 and 82% resonances of ICP0-bound UBL12 have been assigned. The remaining peaks were broadened beyond detection. The residues with missing assignments in either free or ICP0-bound UBL12 are marked by stars and arrows, respectively, above the amino acid sequence. Most of these residues are missing in the spectrum of free UBL12 and appear only after ICP0 binding. Such a behavior points to changes in the UBL12 conformation and dynamics as well as changes in chemical environment of the nuclei on the interaction interface associated with the substrate recognition.

54

ICP0 binding led to substantial frequency (chemical shift) changes for the vast majority of the UBL12 amide resonances (Figures 2.5A and 2.6F) with the largest perturbations (>150 Hz) observed for residues Gly631, Leu638, Val685, Leu699, Ile709, Ser710, Ser752, Leu757 and

Asp758. The complex formation noticeably improved the quality of the spectrum, suggesting that ICP0 binding stabilizes the UBL12 tandem (Figure 2.5A, bottom panel). As a result, a number of intense peaks appeared in the UBL12 spectrum upon ICP0 binding. While the backbone amide resonances for 171 out of 230 non-proline residues were observed for the free protein, 25 additional peaks were assigned for the complex (Figure 2.6A). The vast majority of these new peaks correspond to residues located either on the UBL1/UBL2 interface or in its close proximity (Glu572, Phe575, His578, Gln579, Asn581, Asp582, Ile620, Gln626, Arg628,

Asn630, Thr632, Glu663, Thr664, Val665, Met686, Asn700, Tyr701, Tyr706 and Val738) (Figure

2.6D). In addition, 8 peaks in 1H-15N HSQC spectrum of the UBL12 disappeared upon ICP0 binding, pointing to changes in the UBL12 conformation and/or dynamics caused by complex formation. These peaks correspond to residues Asp680, Lys681, Asp682, Gly763, Asp764 and

Ile765 located on the UBL1/UBL2 interface and residues Glu759 and Leu760 located in the loop between the β4- and β5-strands of the UBL2.

Overall, mapping binding-induced frequency (chemical shift) changes onto the UBL12 structure (71) identified residues sensitive to ICP0 binding in both UBL1 and UBL2 domains.

The regions of the UBL1 and UBL2 with NMR peaks affected by ICP0 addition are shown on the protein surface in Figures 2.6C and D. The UBL1 residues exhibiting pronounced frequency

(chemical shift) changes are mainly located in loops I and II. The residues of UBL2 affected by

ICP0 binding are located in two distinct regions. The first region includes a hydrophobic patch formed by residues in strands β1, β2, β3 and β5 of the UBL2 β-sheet. This face of the UBL2

55 domain makes extensive contacts with UBL1, which protect this patch from solvent exposure

(Figure 2.6D). The second region is solvent exposed and includes residues from the two loops: the UBL2 loop connecting the β2-strand with the following -helix, and the loop connecting the

β4- and β5-strands (Figures 2.6C and E). These loops form a negatively charged cluster on the

UBL2 surface, which provides a binding interface for the positively charged ICP0 peptide.

2.5. Discussion

Owing to USP7 involvement in maintenance of several important tumor suppressors and key viral proteins, it has emerged as a potential therapeutic target for treatment of cancer and infections caused by DNA viruses (154). Therefore, the mechanisms of USP7 specificity toward its diverse substrates are of significant interest.

While functions of the USP7 N-terminal TRAF and catalytic domains were previously characterized, the functions of its UBL domains remain largely unknown. Roughly, UBLs within the USP family of proteins can be divided into three groups: N-terminal, C-terminal or embedded in the USP catalytic core (155). The USP7 UBL domains belong to the third group of the C-terminal UBLs. Previous studies revealed that UBLs of USP7 enhance its enzymatic activity, since even small truncations of the extreme C-terminus dramatically decrease USP7 catalytic activity (71,83). Furthermore, it was recently shown that the UBL domains can function as a binding platform for several USP7 substrates. However, the precise location of the C-USP7 substrate binding sites remained unknown (50,70,83).

USP7 is unique within the USP family, since, unlike other family members, it contains a tandem array of five UBL domains. As a consequence of the large size of the USP7 C-terminus, it is challenging to study all five tandem UBL domains together. Here we show that this complex

56 system can be simplified by the isolation of stable individual UBL domains and/or constructs containing two consecutive UBL domains. Our NMR data reveal that these constructs can be studied in isolation in solution, providing new insights into the C-USP7 structure, dynamics and substrate specificity. Although there are two available X-ray crystal structures of C-USP7 (PDB

IDs: 2YLM, 4PYZ), these structures provide little information about protein dynamics. In this study we focused on the UBL1 domain and UBL12 tandem. We have: (i) determined the three- dimensional NMR structure of the UBL1 domain and characterized its dynamic properties, (ii) narrowed down the ICP0-binding region of the USP7 to the UBL12 domains, and (iii) obtained high-resolution NMR mapping of the USP7 binding site for ICP0.

Once inside the cell, HSV-1 encounters a hostile environment created by host proteins that are intrinsically antiviral. HSV-1 encodes proteins such as ICP0 intended to inactivate cellular antiviral mechanisms (54). The interaction between ICP0 and USP7 is essential to prevent ICP0 degradation during the infection, and in this study, we have structurally and functionally characterized this interaction. As a first step, we have investigated the effect of USP7 ablation on the efficiency of HSV-1 lytic infection. We utilized shRNA to generate USP7 knockdown cells and demonstrated that in cells lacking USP7 virus growth is severely reduced. These results suggest that the USP7-ICP0 interaction is important for the establishment of effective lytic HSV-

1 infection in HFF-1 cells.

Next, we used solution NMR to structurally characterize the ICP0 – USP7 interaction, and have shown that USP7 binds ICP0 via its first two C-terminal UBL domains, providing evidence that the USP7 UBL domains function as a substrate interacting platform. The ICP0 peptide used in this study was previously identified as a minimal USP7-binding region of ICP0 (152). This peptide is composed of eleven amino acid residues, six of which are positively charged and three

57 are polar (Figure 2.7B). Thus, of the two USP7 regions affected by ICP0 binding, the one located at the UBL2 β-sheet is unlikely to directly interact with ICP0 because it consists of mainly hydrophobic residues buried on the UBL1/UBL2 interface. On the other hand, the second region encompassing the UBL2 loops, one between the β2-strand and the following α-helix and the other between the β4- and β5-strands, has a negatively charged surface and is more likely to directly bind positively charged ICP0 peptide. The high enthalpy of complex formation (-24.5 ±

0.7 kcal/mol) measured by ITC provides further evidence of the electrostatic nature of UBL12 –

ICP0 interaction.

NMR chemical shift mapping revealed that the binding of the two proteins is likely mediated by electrostatic interactions. This is in agreement with previous mutational analysis of ICP0, which identified Arg619, Lys620, Lys624 and Arg626 as residues critical for interaction with

USP7 (152) (Figure 2.7A). Interestingly, although the UBL1 domain alone does not interact with

ICP0 (Figure 2.5B), it exhibits extensive binding-induced chemical shift changes in the context of the UBL12 construct (Figure 2.6F). Notably, UBL1 residues involved in ICP0 binding belong to the two long flexible loops located in the close proximity of the UBL1 - UBL2 interface, supporting the notion that these two domains may function in tandem. The presence of UBL2 may constrain conformation of the UBL1 flexible loops, as seen in crystal structures of the C-

USP7 UBL domains. Overall, our data suggests that UBL12 may undergo micro- to millisecond opening/closing events, resulting in the observed line broadening and/or disappearance of peaks on the interface of UBL1 and UBL2 domains. As evident from chemical shift mapping, most of the UBL12 biding site for ICP0 is located on the UBL2 surface with an additional participation of residues from loop II of the UBL1. We speculate that the substrate binding may lock the two domains in a closed conformation, thus limiting micro- to millisecond conformational sampling

58 and quenching exchange contributions to transverse relaxation. This model could explain the chemical shifts changes and line narrowing on the inter-domain interface observed upon ICP0 binding.

The HSV-1 ICP0 has never been structurally characterized, although it was shown to contain an N-terminal RING domain similar to one from an immediate-early protein of equine herpes virus. (156,157). The USP7-binding motif of ICP0 is located within the molecule’s C-terminus

(residues 619-626). We used Jpred (158) to predict the secondary structure of the HSV-1 ICP0 and found that its C-terminal region is predicted to contain a structured domain with an

 arrangement (Figure 2.7A). Furthermore, we have compared the HSV-1 ICP0 protein with its orthologues from other Alphaherpesvirinae subfamily viruses using SIB Blast followed by multiple sequence alignment in ClustalW (159) and conservation score calculation using ConSurf server (160,161) and found that the ICP0 sequence has two highly conserved regions. The first region includes the N-terminal RING domain characteristic of E3 ubiquitin ligases, while the second contains a yet uncharacterized C-terminal domain (CTD). Figure 2.7A shows a multiple sequence alignment of the ICP0 CTD. Remarkably, the USP7 binding motif

(residues 619-626) is one of the best conserved regions within the ICP0 C-terminus and it is predicted to form an α-helix. According to a helix wheel projection, this helix is amphipathic

(positively charged on one side and neutral on the other), which may facilitate its interaction with a negatively charged cluster on the UBL2 surface (Figure 2.7B). Other predicted secondary structure elements of the ICP0 CTD are also conserved, supporting the notion that ICP0-like proteins harbor a structured CTD in addition to the N-terminal RING domain. High amino acid conservation in the ICP0 region involved in USP7 binding implies that ICP0 interaction with the

59

USP7 is not unique to HSV-1 but may represent a common feature shared by other members of the Herpesvirus family.

The fact that the five non-homologous USP7 UBL domains can interact with multiple USP7 targets suggests that the UBL domains serve as a versatile binding platform and endow the enzyme with specificity to its diverse substrates. However, until now, none of the C-USP7 substrate binding sites has been characterized at high resolution. In this work, for the first time, we have structurally and functionally characterized the interaction of the C-USP7 with its first identified target – HSV-1 ICP0 protein. Our results, along with future studies of other C-USP7 substrate-binding sites, may unravel determinants of the USP7 specificity towards its substrates and provide a structural basis for the future development of new therapeutic strategies to prevent livelong HSV-1 infection and virus expansion among human population.

ACKNOWLEDGMENTS

We thank Dr. Alexander Lemak for his help with UBL12 chemical shift assignment and

Khiem Nguyen and Dr. Olga Vinogradova for assistance with ITC experiment. Also we thank

Dr. Yulia Pustovalova and Dr. Mariana Quezado for helpful discussions. This work was supported by Charles H. Hood Foundation Child Health Research Award and State of

Connecticut Department of Public Health Biomedical Research Grant to I.B. Chemical shift assignments of UBL1 have been deposited to the Biological Magnetic Resonance Data Bank

(BMRB) under accession number # 16789. 20 lowest energy structures of the USP7 UBL1 domain were deposited to Protein Data Bank (PDB) under the code # 2KVR.

60

Table 2.1. NMR structural statistics of USP7 UBL1 domain.

Summary of restraints NOE distance restraints Short range (|i-j|<=1) 1358 Medium range (1<|i-j|<5) 576 Long range (|i-j|>=5) 1107 Total 3041 Dihedral angles (φ and ψ) 187 Hydrogen bonds 32 Deviation from experimental restraints NOE (Å) 0.022±0.0 03 Dihedral restraints 1.757±0.1 (degrees) 23 Deviation from idealized geometry Bonds (Å) 0.019 Angles (degrees) 1.3 Ramachandran plot statistics Most favored regions 77.2% Additionally allowed regions 21.8% Generously allowed regions 0.8% Disallowed regions 0.2% RMSD from mean structure (Å) All residues Backbone atoms 1.96±0.44 Heavy atoms 2.70±0.38 Residues in ordered regions1 Backbone atoms 0.62±0.09 Heavy atoms 1.25±0.14

1The following secondary structure elements were included as ordered regions in the RMSD calculation: 539-548, 551-554, 556-570, 592-595, 602-613, 617-619, 621-624, 649-652, 658- 663.

61

Figure 2.1. Isolated USP7 UBL domains can be studied independently in solution. A. Schematic representation of the domain arrangement of the human USP7 protein. B. 2D 1H-15N HSQC spectra of the UBL1 (blue), UBL12 (green), UBL3 (black) and UBL45 (red) domains.

62

Figure 2.2. Solution NMR structure of the USP7 UBL1 (residues 537-664). A. The top panel shows TALOS+ derived probabilities of α-helix and β-sheet formation, marked as P(α) and P(β), respectively. The bottom panel shows RCI-based backbone amide order parameters, S2. B. The backbone representation of the ensemble of 20 NMR lowest-energy structures of the USP7 UBL1. C. Amino acid sequence of the USP7 UBL1 domain with cartoon representation of its secondary structure elements shown at the top. Helices are colored in red, β-strands are shown in green. D. Ribbon representation of the UBL1 structure with secondary structure elements and loops I and II labeled. E. Superposition of solution (red) and crystal (blue; (PDB ID: 2YLM) structures of the UBL1. F. NOE connectivity plot of UBL1. Flexible loops I and II are labeled and shown in grey.

63

15 15 Figure 2.3. The backbone dynamics of UBL1 from N relaxation. A. N R1, R2 relaxation rates and 15N{1H} heteronuclear NOE values shown as a function of the UBL1 residue number. Secondary structure elements are shown at the bottom. B. Mapping of heteronuclear 1-NOE values onto the UBL1 structure. 1-NOE values are mapped as a color gradient (blue to magenta) and ribbon radius. Dynamic regions correspond to intense magenta color and thicker ribbon. All experimental relaxation data were acquired at 500 MHz (1H) and 25 °C.

64

Figure 2.4. USP7 is required for successful establishment of HSV-1 lytic infection. HFF-1 cells were infected with lentiviruses expressing shRNA against USP7 (shUSP7) in order to downregulate cellular USP7. shRNA against GFP (shGFP) was used as a control. A-B. After selection, knockdown cells were infected with HSV-1 at an MOI of 0.1 PFU/cell. Progeny virus was collected at 24 hours post infection and titers were determined on Vero cells by staining with either crystal violet (A) or X-gal for β-galactosidase-positive plaques (B). In B, values correspond to the average of three independent experiments and error bars correspond to one standard deviation. C. Efficiency of USP7 knockdown and ICP0 stability were tested by Western blot analysis. USP7 knockdown cells were infected with HSV-1 at an MOI of 2 PFU/cell and cell lysates were prepared at 4 hours post infection and subjected to Western blot analysis for ICP0 and USP7. Ku70 serves as a loading control.

65

Figure 2.5. NMR and ITC analysis of the USP7 UBL domain interactions with the ICP0617- 1 15 627 peptide. Overlay of H- N HSQC spectra of the UBL12 (A), UBL1 (B), and UBL3 (C) domains before (red) and after addition of the unlabeled ICP0 peptide (residues 617-627) to a final protein:peptide ratio of 1:10 (blue). In A, the signals of UBL12 with chemical shift perturbations over 100 Hz are labeled. Among these, the peaks corresponding to residues located on the ICP0-binding interface are labeled with asterisks. The bottom panel shows changes in line shapes for selected UBL12 peaks upon ICP0 binding. D. ITC profiles for UBL12 binding to ICP0. The left panel shows heat change upon ligand addition; the right panel shows an integrated ITC isotherm and its best fit to an independent site model.

66

67

Figure 2.6. Analysis of ICP0 binding to the USP7 UBL12 tandem. A. Completeness of the backbone assignment for the free and ICP0-bound UBL12. Here and later UBL1 and UBL2 domains are colored in shades of blue and green, respectively. Circles below the protein sequence represent amino acid residues with assigned NMR resonances in 1H-15N HSQC spectra of the free (light circles) and ICP0-bound UBL12 (dark circles). Open spaces indicate residues with unassigned amide peaks. Asterisks indicate UBL12 residues with assigned amide peaks in the bound state only, while arrows indicate residues with assigned amide peaks in the free state only. B. Ribbon representation of the UBL12 (pdb ID: 2YLM) (71). C. Mapping of the 2 2 1/2 frequency differences between the free and ICP0-bound UBL12 i=(N +H ) on the UBL12 surface. The color gradient corresponds to values, with larger changes shown by more intense color. D. Mapping of the UBL2 residues located on the UBL1/UBL2 interface and affected by ICP0 binding. As in C (top view) but rotated about the z-axis by 90°, for clarity, UBL1 is shown as orange ribbon. E. Close-up view of the UBL12 ICP0-binding interface. F. Per-residue amide frequency differences between the free and ICP0-bound states of the UBL12 (Hz as a function of residue number. Residues with peaks missing in one of the two spectra (free or bound) are shown as bars with the maximum  of 650 Hz to indicate that these residues are sensitive to ICP0 binding. Prolines and residues with missing amide group assignment in both spectra are shown as empty spaces. Secondary structure elements are shown at the top.

68

Figure 7. The C-terminal region of ICP0 may contain a structured domain. A. Sequence alignment of the HSV-1 ICP0 (residues 604-775) and a number of its viral orthologues performed in ClustalW (159). Viruses are Human Herpesvirus 1 (HSV1), Chimpanzee alpha-1 herpesvirus (ChHV), Human Herpesvirus 2 (HSV2), Cercopithecine Herpesvirus 1 (CeHV1), Cercopithecine Herpesvirus 16 (CeHSV16), Macropodid Herpesvirus 1 (MaHV1) and Saimiriine Herpesvirus 1 (SHV1). Residues are colored according to conservation scores calculated using the ConSurf server (160,161). Secondary structure elements predicted in Jpred (158) for the HSV-1 ICP0 are shown at the top. Arrows above the alignment indicate residues critical for USP7 binding identified in previous mutagenesis analysis (152). B. HSV-1 ICP0 peptide (residues 617-627) shown in a helical wheel projection created using the EMBOSS Pepwheel program. Hydrophobic residues are shown as grey spheres, polar residues as diamonds and basic residues as blue spheres.

69

CHAPTER 3

Mechanism of USP7 inhibition by small-molecule compounds

Alexandra Pozhidaeva, Feng Wang, Jian Wu, Phuong Nguyen, Joseph Weinstock, Suresh

Kumar, Jean Kanyo, Dennis Wright and Irina Bezsonova

This chapter was submitted as a research article titled "USP7-specific Inhibitors Target and

Modify the Enzyme’s Active Site via Distinct Chemical Mechanisms" to Cell Chemical Biology.

Authors contributions: AP and IB conceived the project and designed in vitro experiments. AP performed protein purification, carried out and analyzed all NMR experiments, and wrote the manuscript. AP and IB performed computational molecular docking. J Weinstock designed USP7 inhibitors used in the study. J Wu synthesized the inhibitors. SK designed cellular experiments while FW and PN performed and analyzed them. JK performed LC-MS/MS experiments and JK,

AP and IB analyzed the results. DW proposed the chemical mechanism of USP7 inhibition. JK,

SK, J Weinstock and IB reviewed and edited the manuscript.

70

3.1. Abstract

USP7 is a deubiquitinating enzyme that plays a pivotal role in multiple oncogenic pathways and therefore is a desirable target for new anti-cancer therapies. However, the lack of structural information about the USP7 – inhibitor interactions has been a critical gap in the development of potent inhibitors. Using a multidisciplinary approach, we obtained new and unexpected insights into the mechanisms of USP7 activation and inhibition by small-molecule compounds that can aid this development. USP7 is unique among other USPs in that its active site is catalytically incompetent and is postulated to rearrange into a productive conformation only upon binding to ubiquitin. Here we show that, contrary to previous reports, ubiquitin binding alone is not sufficient to induce active site rearrangement and leaves the enzyme in an unproductive conformation. Using a combination of NMR spectroscopy, mass spectrometry and computational modeling we found that inhibitors P22077 and P50429 covalently modify the catalytic cysteine of USP7 via distinct chemical mechanisms and induce a conformational switch in the catalytic domain associated with active site rearrangement. Cell-based enzymatic activity assays confirmed that the inhibitors inactivate the full-length protein in cells. This work represents the first experimental insights into USP7 activation and inhibition and provides a structural basis needed for rational development of highly potent anti-cancer therapeutics.

3.2. Introduction

The Ubiquitin-proteasome system (UPS) is required for the tightly controlled degradation and turnover of a majority of the proteins in eukaryotic cells. An enzymatic cascade of three classes of enzymes: E1 activating enzyme, E2 conjugating enzyme and E3 ligase work together to ubiquitinate a substrate and target it to the proteasome, where it is cleaved into short peptides.

71

The targeting of a protein to degradation in turn can be reversed by deubiquitinating enzymes

(DUBs) that remove ubiquitin chains from their substrates (2). Inhibition of UPS components represents a promising therapeutic strategy to control cellular levels of proteins that lack enzymatic activity and thus cannot be inhibited (90). E3 ligases and DUBs directly interact with proteins targeted for (de)ubiquitination and determine substrate specificity of UPS. Therefore, their inhibition has emerged as a new approach to modulate stability of distinct proteins, such as oncoproteins and tumor suppressors (22,162,163). Ubiquitin-specific protease 7 (USP7), also known as Herpes virus-associated ubiquitin-specific protease (HAUSP), is a cysteine protease that belongs to the USP family of DUBs in humans (53). It is widely known for its central role in the DNA damage response in which it regulates protein levels of the tumor suppressor p53 in response to genotoxic stress (24). In unstressed cells, USP7 preferably interacts with and stabilizes HDM2, an E3 ligase responsible for poly-ubiquitination of p53 and targeting it for degradation. During DNA damage, however, ATM-dependent phosphorylation of HDM2 reduces its affinity for USP7. As a consequence, p53 is available to interact with USP7, resulting in stabilization of p53 and initiation of the p53-dependent DNA damage response (25-27). In addition to p53, USP7 has been shown stabilize proteins involved in DNA replication, epigenetic

DNA alterations, apoptosis and cell cycle control (30,34,37,39,47,63,125,164,165).

Microdeletions and mutations in the USP7 gene lead to neurodevelopmental disorders in humans characterized by intellectual disability, autism spectrum disorder, and hypogonadism

(132).

Dysregulation of USP7 expression has been reported in number of human malignancies, including human prostate cancer (51), ovarian cancer (87) and non-small cell lung cancer

(NSCLC) (86,136). The role of USP7 in oncogenesis and tumor suppression makes it an

72 attractive pharmaceutical target for inhibition by small-molecule compounds (154). Early studies in human colon cancer xenograft models showed that downregulation of USP7 suppresses cell proliferation and delays tumor growth due to p53 stabilization in the absence of cellular stress

(166). High-throughput screening (HTS) efforts led to the discovery of the first semi-specific, uncompetitive and reversible inhibitor of USP7 (HBX 41,108) that stabilizes p53 and inhibits cell growth in HCT116 colon cancer cells (91). Later, more specific USP7 compounds were reported (92) along with a family of dual USP7/USP47 inhibitors (compounds 1 - 14) (93).

Inhibition of USP7 by P5091 (compound 1) was shown to cause apoptosis of multiple myeloma cells and prolonged survival in animal xenograft models (94). Optimization of P5091 led to the discovery of P22077 (compound 4) that inhibited neuroblastoma growth (96) and P50429

(compound 14) that inhibited the proliferation of HCT-116 cells (93). Although all of the above compounds show antitumor properties in various cancer cell lines and animal models, none is very potent and all require further optimization.

USP7 consists of an N-terminal substrate-binding TRAF-like domain, five C-terminal regulatory ubiquitin-like (UBLs) domains, and a central catalytic core that binds ubiquitin and removes it from the substrate (70,71,138). The catalytic domain of USP7 adopts a papain protease-like fold that binds ubiquitin. This fold contains Thumb, Palm and extended Fingers regions similar to those of other USPs (72,75). Importantly, because all previously reported

USP7 inhibitors were discovered using HTS, it is not clear whether the inhibitors bind to the active site of the enzyme or whether they bind to a region distinct from the active site, acting allosterically to induce a conformational change that leads to enzyme inactivation. Furthermore, given the structural conservation of the catalytic domains within the family of 58 USP enzymes, it is not clear how specificity for USP7 can be achieved (11).

73

In this study we uncover the mechanism of USP7 inhibition by small-molecule compounds as well as provide new insights into the mechanism of the enzyme’s activation. We have taken advantage of NMR spectroscopy, a powerful tool for studying protein-ligand interactions, and developed a robust assay to test binding of USP7 inhibitors to its catalytic domain in solution.

We performed a backbone NMR resonance assignment of the catalytic domain and mapped surface binding sites for two USP7 inhibitors P22077 and P50429, the derivatives of a lead compound that previously showed antitumor potential (93). Interestingly, our NMR and mass spectrometry data reveal that both compounds occupy the active site of the enzyme and covalently modify the catalytic Cys223 residue. We suggest that these small molecules directly inhibit the catalytic activity of USP7 by preventing the cleavage of the isopeptide bond between the substrate and ubiquitin. The efficacy of these irreversible inhibitors was also confirmed using full-length USP7 in Jurkat cells. Our study, the first elucidation of the mechanism of action for any USP7 inhibitor, provides the foundation for rational design of more potent and selective

USP7 inhibitor-based therapeutic agents.

3.3. Experimental procedures

Protein expression and purification. cDNA encoding the catalytic domain of the human

USP7 (residues 208-560) was sub-cloned into pET28a-LIC Vector (Structural Genomics

Consortium) downstream from an N-terminal His6-Tag and a thrombin cleavage site. A

2H/13C/15N – labeled catalytic domain was expressed in Escherichia coli BL21 (DE3) grown at

15 13 2 37°C in 100% D2O-based M9 minimal medium supplemented with NH4Cl and C/ H -glucose as the sole source for nitrogen and carbon respectively. Protein expression was induced at an

OD600 ~ 0.8 by adding 1mM IPTG. Cells were harvested after 16 hours of incubation at 20°C

74 and lysed by sonication in extraction buffer containing 20 mM NaH2PO4, 250 mM NaCl, 10 mM imidazole, pH 7.4. Cell lysates were clarified by centrifugation at 15000 rpm for 45 min and applied to a Ni-NTA resin (Thermo Scintific). Recombinant protein was eluted with extraction buffer containing 250 mM imidazole. Following thrombin digestion to remove the His6-tag, the catalytic domain was additionally purified by size exclusion chromatography using HiLoad

Superdex 200 column (GE Healthcare) in sample buffer containing 20mM Tris-HCl, 100mM

NaCl, 2mM DTT, pH 7.5. For NMR experiments the protein sample additionally contained 10%

D2O.

NMR spectroscopy. All NMR data including spectra for backbone resonance assignment and chemical shift perturbation experiments were collected on an 800 MHz (1H) Agilent VNMRS spectrometer equipped with a cryoprobe at 30°C. The data were processed with NMRPipe (140) and analyzed with CcpNmr Analysis (146). NMR experiments collected for backbone resonance assignment of USP7 catalytic domain included standard 2D 1H-15N TROSY-HSQC, TROSY- based 3D HNCA, HNCOCA, HNCO, HNCACO, HNCACB and 15N-edited NOESY-HSQC

(101,102). To assess ubiquitin binding, unlabeled ubiquitin was gradually added to 0.3 mM

2H/13C/15N – labeled catalytic domain of USP7 at ratios of up to 1:3 (protein to substrate). To monitor interactions of USP7 with its inhibitors, compounds P22077 or P50429 were dissolved in 100% DMSO and gradually added to 0.42 mM 2H/13C/15N – labeled catalytic domain up to a

1:5 protein-to-inhibitor ratio and a final DMSO concentration of 5%. In all experiments changes in chemical shifts were monitored by acquiring 2D 1H-15N TROSY spectra. During inhibitor binding experiments a spectrum of the free catalytic domain in sample buffer additionally containing 5% DMSO was used as a reference spectrum.

75

For the USP7 catalytic core – ubiquitin complex, observed frequency perturbations for each amino acid residue were calculated using combined shift change of the amide nitrogen and

2 2 1/2 15 1 proton obs=(N +H ) , where N and H are N and H frequency differences between free and bound states in Hz. The dissociation constant for the complex of the catalytic domain with ubiquitin KD=[P][L]/[PL] (where [P], [L] and [PL] are concentrations of the free protein, free substrate and the complex respectively) was extracted by nonlinear least-square fitting of global chemical shift change plotted versus ligand concentration using the following equation:

2 (퐾D + [L]t + [P]t) − √([푃]t + [L]t + 퐾D) − 4[P]t[L]t ∆ωobs = ∆ωmax 2[P]t where [P]t and [L]t are the total protein and ligand concentrations and ∆max is the chemical shift difference at saturation. A total of 199 assigned, well resolved non-overlapped peaks were used for the analysis. The titration data analysis was performed using SciDAVis.

Protein – ligand docking. Induced fit (flexible) and covalent docking of the complexes of the

USP7 catalytic domain with DUB inhibitors P22077 and P50429 was performed using ICM-Pro

(Molsoft). ICM method uses Monte-Carlo simulations based global optimization of flexible ligand positions in the space of grid potential energy maps calculated for a protein receptor

(167). During induced fit docking, pockets were first automatically identified on the X-Ray structure of the USP7 catalytic core (PDB ID: 4M5W) (75). The inhibitor binding pocket for calculation of grid potential maps was chosen and further adjusted based on experimental NMR results. Full side-chains of residues within the search box were allowed to be flexible including residues 218-219, 222-224, 226, 291-292, 461-465 and 482 of P22077 and residues 218-227,

291-296, 351, 405-411, 416-418, 456, 460-466, 481-482 and 514-515 – during docking of

76

P50429. The thoroughness level was set to 5. For covalent docking, chemical reactions for both of the compounds were sketched manually. Covalent adducts were identified based on MS results and represented products of the chemical reactions. Cys223 was chosen as the modified residue. Structures with the lowest predicted ICM scoring function accounting for ligand size, protein – ligand van der Waals interactions, conformational changes, changes in solvation electrostatic energy and hydrophobic free energy gain were chosen for analysis.

Mass Spectrometry. Samples used for intact mass determination and LC-MS/MS were prepared by incubating 300 μg of the purified catalytic domain with either a 5-fold molar excess of P22077 or P50429 in the sample buffer containing 5% DMSO, or with vehicle (5% DMSO) overnight at room temperature. Following incubation, the protein samples were dialyzed in water to remove salts and excess of the small molecules. To determine the intact mass of the USP7 catalytic core and its complexes with inhibitors, samples (~100 µg, 30 µl) were mixed with 200

µl of 50% methanol containing 0.1% formic acid and analyzed by direct infusion on a Thermo

Scientific Orbitrap Fusion Tribrid mass spectrometer. The instrument was operated in Intact

Protein mode with an ion routing multipole pressure of 1mTorr. Spectra were acquired at 120K and 240K resolution. The isotopically resolved data were processed using Thermo Scientific

Protein Deconvolution 4.0 software.

For LC-MS/MS, the intact protein samples (~10ug) were dried using SpeedVac prior to dissolving and denaturing in 8M urea, 0.4M ammonium bicarbonate. The urea concentration was adjusted to 2M by the addition of water, and the proteins were digested with trypsin (Promega) at

37ºC for 16 hours. Samples were desalted using a C18 Ultra microspin column (The Nest

Group) and peptides were eluted with 80% acetonitrile containing 0.1% trifluoroacetic acid

(TFA), dried and reconstituted in 7% formic acid/0.1% TFA and then diluted with additional

77

0.1% TFA to a final protein concentration of 0.01µg/µl. LC-MS/MS analysis was performed on a Thermo Scientific Q Exactive Plus equipped with a Waters nanoAcquity UPLC system using a

Waters Symmetry® C18 180µm x 20mm trap column and a ACQUITY UPLC PST (BEH) C18 nanoACQUITY Column (1.7 µm, 75 µm x 250 mm) for peptide separation. Trapping was done at flow rate of 5µl/min in 97% Buffer A (100% water, 0.1% formic acid) for 3 min. Peptide separation was performed at 330 nl/min with the following gradient of Buffer B (100% acetonitrile, 0.1% formic acid) in Buffer A for 90 minutes: 3% B at initial conditions; 5% B at 1 minute; 35% B at 50 minutes; 50% B at 60 minutes; 90% B at 65-70; and back to initial conditions at 71 minutes. MS was acquired in profile mode over the 300-1,700 m/z range using 1 microscan; 70,000 resolution; AGC target of 3E6; and a full max ion time of 45 ms. MS/MS was acquired in centroid mode using 1 microscan; 17,500 resolution; AGC target of 1E5; full max ion time of 100 ms; 1.7 m/z isolation window; normalized collision energy of 28; and 200-2,000 m/z scan range. Up to 20 MS/MS were collected per MS scan on species with an intensity threshold of 2E4, charge states 2-6, peptide match preferred, and dynamic exclusion set to 20 seconds. MASCOT Distiller software was used to generate peak lists which were searched against the UBP7_HUMAN sequence as well as the Swiss-Protein database with taxonomy restricted to E. coli using the MASCOT algorithm (Matrix Science). Search parameters used were peptide mass tolerance of 10 ppm; MS/MS fragment tolerance of +0.25 Da; allow up to two missed cleavages; variable modification of methionine oxidation and custom modifications configured for the moieties added by the compounds.

Cell Culture. Human T lymphocyte cell line Jurkat (p53MUT) was maintained in Roswell

Park Memorial Institute (RPMI) 1640 Medium, supplemented with 10% FBS and 2mM L- glutamine, and 50 units/mL penicillin/streptomycin.

78

Western Blotting. Jurkat cells were treated with USP7 inhibitors for 24 hours with indicated doses, harvested and washed once with ice-cold PBS. Cell pellets were lysed using modified

RIPA lysis buffer (50mM Tris-HCl, pH 7.5, 150mM NaCl, 1% NP40, 1% sodium deoxycholate,

2mM EDTA, 10% glycerol, Protease inhibitor cocktail (Sigma, 1:500), 50µM PR-619 (Pan DUB inhibitor, LifeSensors). Protein estimation was carried out using the Bradford method (Bio-Rad).

Twenty five µg of cell lysates were separated on 10% SDS-PAGE gel, transferred to PVDF membrane and probed with primary antibodies against DNMT1 (Cell Signaling) HDM2 (Santa

Cruz), UHRF1 (Santa Cruz) and Actin (Sigma). Blots were developed using Millipore ECL or

Pierce ECL depending on the sensitivity requirements and images were captured and quantified using the LI-COR Odyssey Fc imaging system. For repeated analysis of Western blots, membranes were stripped using Restore Western Blot Stripping buffer (ThermoFisher).

Immunoprecipitation. Immunoprecipitation and measurement of USP7 activity were performed as reported previously (94) with slight modifications. Jurkat cells treated with DMSO or USP7 inhibitors were washed with ice cold PBS and lysed using NP40 lysis buffer (50mM

Tris-HCl, pH 7.5, 150mM NaCl, 1%(v/v) NP40, 10%(v/v) glycerol, 1mM PMSF) and pre- cleared using 30µl of protein G-sepharose (Invitrogen) for 1 hour at 4oC. 500µg of pre-cleared cell lysates were incubated with Protein G sepharose beads pre-loaded with 0.5µg of anti-USP7 antibody (Progenra) for 1 hour. Beads were washed twice with lysis buffer containing 2mM β- mercaptoethanol and USP7 activity was determined using Ub-CHOP2 reporter assays

(LifeSensors). DMSO treated cell lysates were incubated with beads alone as a control. 5µl of beads from the immunoprecipitation was analyzed by Western blotting with anti-USP7 antibody to determine relative levels of USP7.

79

3.4. Results

3.4.1. NMR studies of the 41kDa catalytic core of USP7 in solution

Structural information about a protein target and its interaction with lead compounds is the key for successful rational drug design and optimization. Although several crystal structures are available for the catalytic core of USP7 and USP7-ubiquitin complexes (PDB IDs: 1NB8, 1NBF,

4M5W) (72,75), structures of the enzyme with any known inhibitors have not been obtained. The relatively large size of the catalytic domain (41kDa) has also hindered its investigation in solution. We have overcome this hurdle by using full protein deuteration in combination with

TROSY-based NMR experiments designed for large biological systems (101). A 2D 1H-15N

TROSY-HSQC spectrum displayed good signal dispersion indicative of well-structured domains

(Figure 3.1). The quality of the NMR data facilitated backbone resonance assignment, a prerequisite for mapping of inhibitor binding sites.

Sequential assignment using a standard triple-resonance NMR spectroscopy approach was achieved for a total of 83% of the backbone atoms (NH, CO, Cα and Cβ), with 81 % of amides assigned (Figure 3.2A). The absence of several peaks and the presence of weak peaks prevented a higher percentage of resonance assignments. In particular, 61 of the 343 non-proline residues resonances could not be identified in the 1H-15N TROSY-HSQC spectrum. Approximately one- half of the missing signals correspond to residues located in β-strands β8, β10 and β11 that are a part of an anti-parallel β-sheet buried in the interior of the protein. The absence of these signals is likely explained by the difficulty of deuterium – proton exchange in the hydrophobic core of the protein. Another cluster of missing resonances belongs to residues Lys217 – Ser227 from the non-structured and likely flexible N-terminus. Overall, the stability and behavior of the USP7 catalytic domain in solution indicate that this domain is well suited for NMR spectroscopy.

80

Furthermore, the nearly complete backbone assignment provides confidence that NMR spectroscopy is a powerful tool for probing USP7 – substrate/ligand interactions.

3.4.2. Interaction of the USP7 catalytic core with ubiquitin

USP7 is postulated to recognize the ubiquitin moiety conjugated to its substrates via a direct interaction between the catalytic domain of the enzyme and ubiquitin. According to previous structural studies, this interaction causes a dramatic structural rearrangement of the USP7 active site, resulting in activation of the enzyme (72). However, despite the report of a structure for the

USP7 catalytic core covalently modified with ubiquitin aldehyde, an interaction with free ubiquitin could not be confirmed using SPR experiments, suggesting that the interaction is very weak (71). Compared to other biophysical approaches, NMR spectroscopy has the advantage of being sensitive to weak and transient interactions (168). Consequently, we used this method to detect USP7/ubiquitin binding in solution and follow the anticipated structural changes. Our chemical shift perturbation experiments showed that the direct interaction between USP7 catalytic domain and ubiquitin occurs on a fast-to-intermediate exchange timescale, resulting in gradual changes in the NMR spectra of the 2H/15N/13C-labeled catalytic core upon titration with unlabeled ubiquitin (Figure 3.2B). These gradual chemical shift changes in the 1H-15N TROSY-

HSQC spectrum simplified the assignment of resonances of the catalytic domain complexed with ubiquitin and allowed quantification of binding affinity and mapping of the binding site. Overall, addition of ubiquitin caused prominent peak perturbations for a number of USP7 residues with peaks for Tyr331, Lys378, Arg343, Asp346, Glu371, Gln372, Ala381, Gln387 and Val 393 exhibiting shifts over 100 Hz (Figure 3.2C). The NMR titration curve fit a two-state binding model (Figure 3.2E), confirming that the USP7 catalytic domain can bind to ubiquitin with low affinity (KD of 105.7 ± 15.9 μM). NMR chemical shift mapping revealed that the residues most

81 sensitive to ubiquitin binding are located in β-strands β1, β2, β4, β5, β6 and β7 that form the ubiquitin-binding Fingers of the USP7 catalytic domain (Figure 3.2D), consistent with the crystal structure of the domain complexed with ubiquitin aldehyde (PDB ID: 1NBF) (72). Surprisingly, the active site of the enzyme was not perturbed by ubiquitin, suggesting that ubiquitin binding alone does not cause the active site rearrangement necessary for catalysis (Figure 3.2C).

3.4.3. Mapping of the small-molecule binding site

The USP7/USP47 inhibitors P22077 and P50429 are analogs of P5091, a lead compound that showed antitumor potential in cancer cell lines and mice xenograft models (93,94). Both inhibitors are trisubstituted thiophenes with 5-(2,4-difluorophenyl)thio, 4-nitro, and 2-acetyl substituents in P22077 and 5-(3,5-dichloropyridyl)thio, 4-cyano and 2-(4-

(methylsulfonyl)phenyl)-NH substituents in P50429 (insets in Figures 3.3A and B, respectively).

P50429 exhibits ten-fold more potent inhibition of USP7 compared to P22077 (IC50 = 0.42 μM versus 8.0 μM) (93). In order to understand the reason for the increased potency of P50429 and to develop compounds with even greater potency and target specificity, one must know how these compounds bind to USP7. We, therefore, set out to structurally characterize the binding of

P22077 and P50429 to the enzyme’s catalytic core by performing NMR titrations in which increasing amounts of each compound were gradually added to the 2H/15N/12C-labeled catalytic domain up to a final protein-to-inhibitor molar ratio of 1:5. Upon addition of P22077, residues

Gly215, Leu229, Glu248, Gly284, Thr287, Leu304, Asp305, Asp349 and Gly458 exhibited the most significant chemical shift changes (Figure 3.3A). Compound P50429 caused large chemical shift changes for residues Thr280, Leu304, Met407, Phe409, Met410, Gly458, Gly462 and

Asp483 (Figure 3.3B). Notably, the intensity of a few peaks significantly decreased (>60%) upon addition of either of the compounds indicating that the corresponding amino acid residues are

82 also sensitive to inhibitor binding. These residues included Lys281, Ser282, Phe283, Trp285,

Asn306, Asn460, Gly462, Gly463 and Tyr465 sensitive to P22077 and Lys281, Ser282, Phe283,

Gly284, Trp285, Asn306, Asn460, His464 and Tyr465 sensitive to P50429. Although these residues could not be unambiguously assigned in the bound state, they are shown on histograms in Figure 3.3 as having negative chemical shift values of -10 Hz to emphasize their role in binding. Overall, addition of either compound caused chemical shift perturbations and decreases in intensity only for a few peaks in the spectra (18 and 17 peaks for P22077 and P50429 respectively) indicating a small binding footprint on the surface of USP7 and suggesting that no major global conformational changes occur upon binding. Moreover, a very similar set of residues was affected by addition of both compounds with the exception of residues Met407,

Phe409 and Met410 that lie within a loop connecting β8- and β9-strands and are sensitive to

P50429 only.

Chemical shift perturbations of more than 40 Hz for P22077 and more than 30 Hz for P50429 were considered significant and were mapped onto the structure of USP7 catalytic domain (PDB

ID: 1NBF). Chemical shift mapping revealed that residues sensitive to addition of either of the two DUB inhibitors cluster in two small regions on the surface of the protein (Figures 3.3C and

D). The first set of residues affected by binding surrounds the catalytic cleft (Figure 3.3E). These include Gly458, Asn460, Gly462, Gly463 and Tyr465 located in β10- and β11-strands as well as the loop connecting them, and Asp349 from the loop between strands β2 and β3. Additionally, this region contains two residues from the flexible N-terminus, Gly215 and Leu216, as well as

Leu229 from the first helix α1, indicating that the mostly unassigned N-terminus is also sensitive to inhibitor binding (Figure 3.3E). Importantly, the catalytic cleft is mostly formed by hydrophobic residues with the catalytic Cys223 situated at its bottom. Consistent with this

83 finding, both compounds are hydrophobic in nature. Furthermore, the pocket is large enough

(~1075 Å3) to easily accommodate either of these small molecules. The second region consists of residues 281 – 285 and Thr287 located in the loop connecting helices α4- and α5 (referred to as a

“switching loop”) and residues 304 – 306 located in helix α5, which is involved in active site rearrangement into the catalytically productive conformation (72). Although this region is sensitive to the presence of the inhibitors, only the catalytic cleft is involved in a direct interaction, as was confirmed by MS (see below).

To elucidate the position of the inhibitors in the active site of USP7, we utilized an induced fit docking approach (ICM-Pro) that allows flexibility of side-chains of residues located in the binding site. The resulting structural models of the complexes of the catalytic core with P22077 and P50429 are shown in Figures 3.3F and G, respectively. Overall, the compounds are similarly positioned in both models. The 2,4-difluorobiphenyl group in P22077 (3,5-dichloropyridyl group in P50429) inserts deeply into a cavity formed by residues Asn218, Gly220, Ala221, Thr222,

Cys223, Asn226, His464, Tyr465 and Asp482. Thiophene is perpendicular to the aromatic rings and turned such that the sulfur is located in the bottom of the pocket and the electron- withdrawing groups, -nitro in P22077 and -cyano in P50429, are exposed to solvent. The second aromatic, methylsulfonyl phenyl group in P50429 further penetrates the catalytic cleft and interacts with residues from the loop connecting strands β8 and β9. This is consistent with the data obtained from the NMR titration experiments in which chemical shift perturbations for residues Met407, Phe409 and Met410 located in the loop connecting β8- and β9-strands were observed only upon binding to P50429. These additional contacts might contribute to tighter binding of P50429 compared to P22077, suggesting new strategies for improvement of potency.

84

3.4.4. Mechanism of USP7 inhibition by DUB inhibitors P22077 and P50429

The presence of a doubly deactivated, highly electron-deficient thiophene ring in both of the

USP7 inhibitors used in this study suggested that the molecules might react directly with nucleophilic residues, such as cysteine, thus leading to covalently modified adducts. The USP7 catalytic domain has seven cysteine residues that could potentially be modified. Addition of an excess of DTT to the catalytic core incubated with either P22077 or P50429 compounds did not cause any changes in the NMR spectra (data not shown) excluding the possibility of disulfide bond formation between a cysteine thiol group in the protein and the thioether linker in the small-molecule inhibitors. To investigate whether these compounds bind USP7 reversibly or covalently modify the protein we performed intact protein MS analysis of the free and inhibitor- treated catalytic domain (Figure 3.4A). The deconvoluted spectrum of the free catalytic domain showed a major peak with an average mass of 41062.25 Da (calculated mass is 41088.4 Da). The intact masses of both complexes with the inhibitors were greater relative to the free protein.

These results indicate that both compounds covalently modify USP7, since the measurements were taken in 50% methanol, where the protein would be expected to be largely denatured.

Moreover, mass shifts of 169 Da and 146 Da for P22077 and P50429, respectively, were smaller than the corresponding molecular weights of the compounds (315.322 Da for P22077, 484.411

Da for P50429), suggesting that significant parts of the small molecules are lost during final modification of the enzyme.

To investigate whether these compounds covalently modify the catalytic and/or other cysteine residues within USP7 we performed LC-MS/MS experiments on tryptically digested samples. The fragment ions in the MS/MS scans were compared to theoretical fragments via a database search to identify peptides. A variable modification of cysteine was configured for each

85 inhibitor based on the added net mass determined by the intact mass study. Over 90% of the protein sequence coverage was achieved for the free and inhibitor-treated protein samples, including all seven cysteine residues. Virtually all identified tryptic peptides 218-

NQGATCYMNSLLQTLFFTNQLR-239 from samples treated with either P22077 or P50429 exhibited a mass shift of 169 Da and 146 Da, respectively, compared to the same peptide from the unmodified protein. Importantly, the residue harboring the modification was almost exclusively assigned to catalytic Cys223 (Figures 3.4B and C). The data for P50429 was remarkably clear cut, with every peptide encompassing Cys223 identified as containing the inhibitor-based modification, while peptides containing other cysteine residues were rarely identified as modified. The putative 169 Da adduct for P22077 was more promiscuous, with

Cys223 modified in all but one instance and other cysteines occasionally modified as well.

Altogether, the MS results in combination with NMR-based mapping of the binding site strongly suggest that both of the compounds occupy the active site of USP7 catalytic domain and further selectively covalently modify the catalytic Cys223 residue, irreversibly blocking its enzymatic activity. Figure 3.5 shows the proposed mechanisms of USP7 active site modification.

The observed mass shift of 169 Da for P22077-modified protein is consistent with an average molecular weight of 1-(4-nitro-thiophen-2-yl)-ethanone group (3 in Figure 3.5). The mass shift of 146 Da in the case of P50429 corresponds to an average molecular weight of 3,5- dichloropyridyl group of the molecule (7 in Figure 3.5). Element composition of the proposed modifications was supported by analysis of MS/MS isotope patterns. Therefore, despite the overall scaffold structural similarity of the compounds (93), each compound modified the USP7 catalytic core in a specific manner. P22077 (1) transfers the central thiophene onto the catalytic cysteine though a postulated mechanism involving initial nucleophilic attack on the thiophene to

86 give an intermediate thio-orthoester (2) which collapses though expulsion of difluorothiophenol

(4) to form the re-aromatized adduct (3). Although P50429 (5) is also poised for a similar substitution reaction, the reaction follows a different course as a result of the substitution of the phenyl ring in (1) for the pyridyl ring in (5), which renders the ring prone to nucleophilic attack.

A similar addition/elimination process occurs through the intermediacy of (6) to transfer the pyridyl ring to the enzyme as in (7), with the thiophenyl moiety (8) serving as the leaving group.

3.4.5. Inhibition of endogenous full-length USP7 by DUB inhibitors P2077 and P50429 in

Jurkat cells

To determine if irreversible inhibition is also achieved against full-length USP7 in the context of live cells, we pre-treated Jurkat cells with either DMSO or graded concentrations of the compounds for 4 hours and monitored the activity of immunoprecipitated USP7 using Ub-

EKL reporter assay. As seen in Figure 3.6A, the inhibitors showed dose dependent inhibition of

USP7 catalytic activity, with strong inhibition at a dose of 10 μM. Notably, total DUB activity measured in cell lysates was not significantly affected by these treatments (Figure 3.6B), indicating their cellular selectivity, in agreement with previously published results (93,169).

The cellular effect of USP7 inhibition can be assessed by monitoring levels of its known protein targets. We used Western blotting of Jurkat cells to evaluate the effect of P22077 and

P50429 on the stability of DNA methyltransferase DNMT1 and E3 ligase UHRF1 (39), both essential for maintenance of genomic DNA methylation, as well as HDM2. Compared to cells treated with DMSO, treatment with either compound resulted in dose-dependent decreases for all three substrates (Figure 3.6C). Thus, these compounds not only inhibit the isolated catalytic domain in vitro but also effectively inhibit full-length USP7 in cells, as evidenced by the destabilization of USP7 substrates.

87

3.5. Discussion

Dysregulation of the ubiquitin-proteasome system has been reported in multiple human pathologies. Thus, targeting the UPS is considered a promising therapeutic strategy to treat cancer, cardiovascular, neurodegenerative and immunological disorders. Two FDA approved proteasome inhibitors, Bortezamib and Carfilzomib, are currently used for treatment of multiple myeloma (170,171). More recently, other components of the UPS, such as E2 conjugating enzymes, E3 ligases and DUBs have been exploited as potential targets for drug discovery.

Inhibition of USP7 is of special interest because of its role in turnover of multiple proteins involved in DNA replication and repair, epigenetic regulation of gene transcription, immune responses and cell cycle control. Here we report the first structural characterization of the molecular mechanism of USP7 inhibition by small-molecule compounds.

Ubiquitin binding alone is not sufficient for rearrangement of the catalytic core of USP7. By assigning NMR resonances of the catalytic domain of USP7, we developed a powerful tool for studying this challenging dynamic system in solution. We have shown that even transient and short lived interactions of USP7 can now be detected, along with evidence of conformational dynamics. Specifically, our NMR studies of the binding between the isolated catalytic domain and its only known substrate, ubiquitin, revealed that, contrary to previous reports, this interaction alone is not sufficient for active site rearrangement. According to the crystal structures, binding of ubiquitin to Fingers causes conformational changes in the catalytic domain that result in realigning of the catalytic triad into a competent conformation (72). In contrast, our results show that residues involved in such rearrangement are not sensitive to ubiquitin binding in solution. Specifically, no significant chemical shift changes were detected for residues located in helix α5, which moves by ~3Å during active site reorganization, as well as for the loops

88 connecting helices α4- and α5, and strands β10 and β11. Furthermore, upon ubiquitin binding the chemical shifts of residues surrounding the catalytic cleft are not perturbed. This indicates that the C-terminal tail of ubiquitin alone does not enter the catalytic cavity of the enzyme in solution. The active conformation of the catalytic domain in complex with ubiquitin aldehyde captured in the crystal structure (72) is likely induced by the covalent attachment of ubiquitin to the catalytic cysteine of USP7, which irreversibly traps the C-terminal tail of ubiquitin inside the catalytic cleft. There might be several reasons why free ubiquitin does not rearrange the active site of the isolated catalytic domain. Perhaps, presence of other USP7 domains or a ubiquitinated substrate, or both are required. Indeed, there is mounting evidence that the C-terminal UBL domains of USP7 may play an important role in activation of the enzyme (71,83,84).

Furthermore, binding of a ubiquitinated substrate rather than isolated ubiquitin might be needed for efficient activation of the enzyme. In the absence of a protein substrate, ubiquitin binds to the

Fingers of the catalytic core of USP7 but does not insert its flexible tail into the catalytic cleft and, therefore, does not cause the rearrangement of the active site required for catalytic activity.

The presence of a substrate may ensure proper positioning of the isopeptide bond between ubiquitin and the substrate inside the catalytic cleft. The importance of the ubiquitinated substrate for the enzyme’s activation suggests that targeting the USP7 substrate binding sites may provide an effective and highly specific approach to inhibiting USP7 in the future.

USP7 inhibitors selectively target and modify its active site. To investigate the molecular mechanism of USP7 inhibition we used NMR spectroscopy to map a binding site for two

USP7/USP47-specific DUB inhibitors P22077 and P50429 on the structure of its catalytic domain. The results of chemical shift perturbation experiments showed that both compounds share the binding site located in the USP7 catalytic cleft. Surprisingly, residues from helix α5

89 and the “switching loop” were also affected by addition of the inhibitors. As mentioned above, these residues are involved in the active site realignment observed in the crystal structure of the

USP7 catalytic core in complex with ubiquitin aldehyde. Remarkably, NMR frequency changes in this region of the protein were unique for the complexes with the inhibitors and were not detected in titrations with ubiquitin. The fact that the inhibitors bind to the cavity that accommodates the C-terminus of ubiquitin in the crystal structure together with the observed chemical shift changes in helix α5 and the “switching loop” suggest that binding of the compounds to USP7 causes a conformational rearrangement of the active site similar to that caused by ubiquitin aldehyde (72). These findings additionally highlight the significance of the interaction between the ubiquitin flexible tail and the active site for realignment of the latter. The inhibitors appear to mimic this interaction, causing conformational changes in the active site.

The chemical structure of DUB inhibitors P22077 and P50429 suggest that these compounds can covalently modify cysteine residues. Our LC-MS/MS and intact mass determination experiments have confirmed this suggestion. The catalytic domain of USP7 contains seven cysteines, four of them surface exposed and, therefore, have a chance of being modified by reactive small-molecule compounds. Strikingly, we have shown that both compounds specifically modify only the catalytic Cys223. This result is consistent with NMR experiments showing that the compounds cause chemical shift perturbations for residues in the catalytic cavity. The molecular scaffold of the compounds allows the specific binding and presence of doubly deactivated thiophene, leading to modification of Cys223 situated at the bottom of the catalytic cleft. The reaction is possible due to the realignment of the active site upon inhibitor binding, which results in deprotonation of the cysteine thiol group by catalytic His464. Structural models of the complexes obtained with an induced fit docking algorithm provide strong

90 complimentary evidence of the inhibitors position in the catalytic cleft favorable for nucleophilic attack of deprotonated Cys223 on the thiophene ring. Notably, although both compounds are chemically similar and possess the basic structure of a trisubstituted thiophene, MS experiments revealed that specific substructures of each inhibitor had been transferred to the protein and indicated an intriguing divergence in the reactivity profile of the two inhibitors (Figure 3.5). The preference for attack on the pyridine in P50429 may relate to either reduced electrophilicity of its thiophene relative to the thiophene in P22077 (nitro/acetyl versus cyano/carboxamido) or a more optimal trajectory for attack in the bound state. Figures 3.7A and B depict Cys223 modified by compounds P22077 and P50429 and surrounding residues. In general, the docked poses of thiophene and pyridyl rings are similar. The rings have hydrophobic interactions with surrounding residues Tyr224, Met292 and Phe409 and an electrostatic interaction with His456.

The only hydrogen bond is between the oxygen atom within the acetyl group and the alpha- hydrogen atom of Gly462 (Figure 3.7A). Notably, as both reactions proceed through a nucleophilic aromatic substitution reaction to deliver an S-arylated cysteine, the adducts are likely much more stable than those formed from more reversible Michael-type reactions on deactivated alkenes and, therefore, are expected to produce a longer duration of inhibition.

DUB inhibitors P22077 and P50429 selectively inhibit full-length USP7 in cells. MS data revealed that P22077 and P50429 form covalent adducts to the active site cysteine in in vitro reactions using the isolated catalytic domain of USP7. Full-length USP7 is a large (1102 amino acid residues) protein consisting of seven domains proposed to adopt a compact conformation during the enzyme activation (71) that may possibly interfere with the inhibitor binding. Our experiments in Jurkat cells showed that both compounds inhibit the catalytic activity of endogenous full-length USP7. Furthermore, P50429 proved to have more potency against USP7,

91 in agreement with previous studies (93). In particular, at 10 μM, P50429 inhibited USP7 by

~70% compared to ~55% for P22077.

Stabilization of p53, along with growth arrest of tumor cells, are well-known effects of USP7 inhibition (91,92,94). However, it was suggested that p53-independent pathways might also be involved in the cytotoxic effects of the USP7 inhibitors (94). UHRF1 and DNMT1, recently discovered substrates of USP7, are critical for maintenance of genomic DNA methylation during replication (39). Therefore, we explored the effect of P22077 and P50429 on the stability of

UHRF1, DNMT1, and HDM2 in Jurkat cells. The dose dependent decreases in the levels of all three substrates suggests that, their stability is compromised by deubiquitination. While degradation of HDM2 leads to stabilization of p53, the degradation of UHRF1 and DNMT1 might cause methylation defects in newly synthetized DNA. These downstream effects may contribute to the overall cytotoxic activity of these compounds.

Unique catalytic site environment may explain USP7/USP47 inhibitor selectivity. USP7 belongs to a superfamily of USP cysteine proteases consisting of 58 proteins with structurally conserved catalytic cores, which makes development of USP7-specific inhibitors challenging. Indeed, the two compounds tested in this work are the most specific among the currently available USP7 inhibitors; however, they both can inhibit USP7 and its close homolog USP47 (172). The fact that inhibitors P22077 and P50429 bind to the conserved catalytic cleft of USP7 is intriguing and raises the question of how this specificity is achieved. In an effort to address this question, we performed protein sequence alignment using PROMAL (173), followed by calculation of conservation score (ConSurf) (161) for a set of USP enzymes that are selectively inhibited by

P22077 (USP7 and USP47), as well as enzymes USP2, USP4, USP5, USP8, USP9X, USP15,

USP20 and USP28, not susceptible to inhibition by P22077 (169). As expected, the majority of

92 the residues in the catalytic cavity are highly conserved with calculated ConSurf scores of 8 and

9 out of 9 (Figure 3.7C). Interestingly, the alignment also revealed a few residues that are conserved between USP7 and USP47 but varied in other USPs. Specifically, Gln219 of USP7 is present only in USP47, while other proteins contain a hydrophobic amino acid residue at the equivalent position. Acidic amino acid residues equivalent to Asp289 in USP7 are also found in

USP47 (Glu246) but replaced by residues with a neutral sidechain in USP2, USP4, USP5, USP8,

USP9X, USP15 and USP28, and by basic arginine in USP20. Additionally, the positively charged His294 in USP7 has a ConSurf score of 8 and is conserved in USP47 and USP9X, in contrast to glutamine at the equivalent position in all other analyzed USPs. The hydrophobic side chain of USP7 Ala221 that is near the catalytic Cys223 is preserved in USP47 (Met175) and

USP9X (Ala1564) but changed to polar asparagine in others. Finally, Gln405 in USP7 is conserved among USP47, USP9X and USP5. Other proteins have histidine at the equivalent positions, with the exception of USP28, which has aspartate. Overall, although some of these residues are not unique for USP7 and USP47, only these two enzymes contain all five. This distinct residue combination creates a network of hydrophobic and electrostatic interactions unique to the active sites of USP7 and USP47, which may provide the basis for the observed specificity of inhibitors.

It is also possible that USP7-specific inhibitors preferentially target its unique, misaligned active site for initial binding. Indeed, catalytic triads in all existing structures of the USP catalytic domains, with the exception of USP7, are found in the catalytically competent conformation even in the absence of ubiquitin (76-79). In this light, the structure of the catalytic core of USP47 also inhibited by P22077 and P50429 is a subject of great interest.

93

Altogether, our results reveal the molecular mechanisms of USP7 inhibition by DUB inhibitors P22077 and P50429. We showed that inhibitors specifically bind to the catalytic cleft of the enzyme and subsequently modify the catalytic cysteine residue. The modification of the cysteine thiol group irreversibly inhibits the enzymatic activity of USP7 by preventing formation of anionic sulfur and its following nucleophilic attack on the C-terminal carbonyl carbon of ubiquitin.

Previously, only X-ray crystallography was used to structurally characterize the catalytic domains of USP proteins. We have shown that these large enzymes can be successfully studied in solution. Such studies may provide a unique insight into dynamic conformational changes and transient interactions within the molecule. More importantly, these studies will facilitate compound optimization and the discovery of more potent and specific inhibitors of USP7 and other pharmacologically important USPs.

ACKNOWLEDGMENTS

We thank Navin Rauniyar for his help with mass spectrometry data collection and analysis,

Toronto Structural Genomics Consortium for providing a plasmid of the USP7 catalytic domain.

Finally, we thank Dmitry Korzhnev, Sandra Weller, Justin Radolf and Kyle Hadden for helpful discussions and critical reading of the manuscript. This work was supported by the National

Science Foundation, the Connecticut Regenerative Medicine Research Fund and the Connecticut

Department of Public Health Biomedical Research Grant (IB).

94

Figure 3.1. Backbone resonance assignment of USP7 catalytic domain. 2D 1H-15N TROSY- HSQC spectrum of the USP7 catalytic domain with sequence specific assignments indicated. A central region of the spectrum indicated by a box is expanded in the inset for clarity.

95

Figure 3.2. The isolated USP7 catalytic domain binds ubiquitin. A. Completeness of the backbone assignment of the catalytic domain. Circles below the protein sequence represent amino acid residues with assigned amide NMR resonances. Secondary structure elements are shown and labeled above the protein sequence. B. NMR chemical shift perturbations, in1H- 15N TROSY-HSQC spectra of the catalytic domain caused by addition of increasing amounts of ubiquitin. C. A plot of per-residue NMR chemical shift differences between free and ubiquitin- bound states of USP7. The Palm, Fingers and Thumb regions of the catalytic domain are indicated above the plot. D. Solvent accessible surface of the Fingers region of the catalytic domain colored according to NMR chemical shift perturbations from C. The shade of violet represents the degree of chemical shift changes in response to ubiquitin binding. Ubiquitin is shown in cyan. E. Normalized global chemical shift change corresponding to the fraction of bound catalytic domain plotted as a function of catalytic domain to ubiquitin ratio. KD of the binding was estimated from least-square fit of cumulative chemical shift change.

96

Figure 3.3. DUB inhibitors P22077 and P502429 occupy the active site of USP7. A and B. Per-residue NMR chemical shift perturbations,, caused by addition of inhibitors P22077 (A) and P50429 (B) to the USP7 catalytic core. Amino acid residues exhibiting chemical shift changes greater than 40 Hz for P22077 and 30 Hz for P50429 are labeled. Residues unassigned in the bound state, for which NMR peaks gradually disappear upon binding to P22077 and P50429, are shown with a negative value of -10 Hz and colored in red and blue, respectively. Chemical structures of the compounds are shown as insets. C. Mapping of chemical shift differences from A on the surface of the catalytic domain. The color gradient from grey to magenta corresponds to  values observed upon P22077 addition. The residues with negative values are shown in red D. Mapping of  from B on the surface of the catalytic domain. Chemical shift differences are colored with a gradient from gray to cyan according to the  value. The residues with negative are shown in blue. E. Close-up view of the USP7 region affected by P22077 addition. Residues sensitive to binding are labeled and their side-chains are shown as spheres. The color code is the same as in C, ubiquitin is shown in beige. F and G. The structural models of the USP7 catalytic core in complex with P22077 (F) and P50429 (G). The binding pocket is shown as a surface and colored by electrostatic charge distribution (red is negative, blue is positive and green is hydrophobic). The compounds are colored by element: C – gray, S – yellow, N – blue, O – red, F – orange, Cl – green.

97

Figure 3.4. Inhibitors P22077 and P50429 covalently modify catalytic Cys223 residue in the USP7 active site. A. Centroid mode deconvoluted mass spectra of intact free (top), P22077- bound (middle) and P50429-bound (bottom) catalytic domain. Average intact masses of the significant peaks are shown. The mass difference between unmodified and the inhibitor – modified protein is labeled. B. MS/MS spectrum of peptide corresponding to residues 218 – 239 modified by P22077 (m/z = 1366.6, [M+2H]2+). C. MS/MS spectrum of the same peptide modified by P50429, (m/z = 903.4, [M+3H]3+). The b-ions denote N-terminal fragment ions and y-ions represent C-terminal fragment ions. Ions matched to the peptide sequence are shown in red.

98

Figure 3.5. Proposed mechanisms of chemical reaction between Cys223 of USP7 and inhibitors P22077 (1-4) and P50429 (5-8). See text for details.

99

Figure 3.6. P22077 and P50429 inhibit USP7 activity in cells. A. Jurkat cells were pretreated with either DMSO or the compounds at indicated concentrations for 2 hours and cell lysates were subjected to USP7 immunoprecipitation followed by western blotting (top panel) and USP7 activity measurement using the Ub-CHOP2 reporter assay (bottom panel). B. Total DUB activity in cell lysates from A measured in a similar manner. C. Endogenous levels of USP7 substrates DNMT1, UHRF1, HDM2 monitored in Jurkat cells pretreated with either DMSO or the compounds at indicated concentrations for 24 hours. Western blotting with antibodies against Actin serves as a loading control.

100

Figure 3.7. Specificity determinants of dual USP7/USP47 inhibitors. A and B. Structural models of the USP7 active site with Cys223 covalently modified by P22077 (A) and P50429 (B) obtained by covalent docking (ICM-Pro). The surrounding residues are labeled and their side- chains are shown in green. The small molecules are colored by element as in Figures 2F and G. C. Surface representation of the USP7 catalytic domain with residues conserved among USPs shown in maroon and residues unique for USP7 and USP47 shown in blue. Conservation scores were calculated using ConSurf server (161) and only highly conserved residues (with scores 9 and 8) are shown. The USP proteins used for the score calculation included USP2, USP4, USP5, USP7, USP8, USP9X, USP15, USP20, USP28, and USP47.

101

CHAPTER 4

Structural NMR studies of other complex biological systems (example of translesion

synthesis DNA polymerase REV1)

Alexandra Pozhidaeva#, Yulia Pustovalova#, Sanjay D’Souza, Irina Bezsonova, Graham C.

Walker, and Dmitry M. Korzhnev

Reprinted with permission from Pozhidaeva A., Pustovalova Y., D’Souza S., Bezsonova I.,

Walker G.C., Korzhnev D.M. NMR Structure and Dynamics of the C-Terminal Domain from

Human Rev1 and Its Complex with Rev1 Interacting Region of DNA Polymerase η.

Biochemistry. 2012; 51(27):5506-20. Copyright 2012 American Chemical Society

Authors contributions: # – shared the first authorship. SDS performed an experiment shown in

Figure 4.5. DMK collected NMR data. AP and YP analyzed all NMR experiments, determined protein structures and drafted the manuscript. DMK and IB helped with data analysis. GCW and

DMK reviewed and edited the manuscript.

102

4.1. Abstract

Rev1 is a translesion synthesis (TLS) DNA polymerase essential for DNA damage tolerance in eukaryotes. In the process of TLS, stalled high-fidelity replicative DNA polymerases are temporarily replaced by specialized TLS enzymes that can bypass sites of DNA damage

(lesions), thus allowing replication to continue or postreplicational gaps to be filled. Despite its limited catalytic activity, human Rev1 plays a key role in TLS by serving as a scaffold that provides access of Y-family TLS polymerases pol, , and  to their cognate DNA lesions and facilitates their subsequent exchange to pol that extends the distorted DNA primer-template.

Rev1 interaction with the other major human TLS polymerases, pol, ,  and the regulatory subunit Rev7 of pol, is mediated by the Rev1 C-terminal domain (Rev1-CT). We used NMR spectroscopy to determine the spatial structure of the Rev1-CT domain (residues 1157-1251) and its complex with Rev1 interacting region (RIR) from pol (residues 524-539). The domain forms a four-helix bundle with a well-structured N-terminal -hairpin docking against helices 1 and 2, creating a binding pocket for the two conserved Phe residues of the RIR motif that upon binding folds into an -helix. NMR spin-relaxation and NMR relaxation dispersion measurements suggest that free Rev1-CT and Rev1-CT/pol-RIR complex exhibit s-ms conformational dynamics encompassing the RIR binding site, which might facilitate selection of the molecular configuration optimal for binding. These results offer new insights into the control of TLS in human cells by providing a structural basis for understanding the recognition of the Rev1-CT by

Y-family DNA polymerases.

103

4.2. Introduction

Reactive products of cellular metabolism and external genotoxic agents such as ultraviolet

(UV) irradiation cause persistent damage to the genomic DNA, which is constantly removed through various DNA repair mechanisms (174). Unavoidably, however, some DNA modifications (lesions) are present during S-phase, creating blocks for progression of the DNA replication machinery since spatially constrained active sites of high-fidelity replicative DNA polymerases cannot accommodate most types of DNA damage. To circumvent this problem, organisms in all kingdoms of life have evolved DNA damage tolerance pathways employing specialized translesion synthesis (TLS) DNA polymerases that can insert nucleotides across

DNA lesions, thereby allowing replication to proceed while temporarily leaving DNA damage unrepaired (35,175-179). Growing evidence suggests that an additional important component of

TLS DNA polymerase action is to fill postreplicational gaps opposite lesions at the end of the cell cycle so that double strand breaks are not generated during the next round of DNA replication (180-184).

The most intensively studied TLS polymerases are the Y-family enzymes Rev1, pol, pol, pol and the B-family polymerase pol (a complex of the catalytic Rev3 subunit with Rev7)

(35,175-179). TLS polymerases lack 3'5' proofreading exonuclease activity, possess more accommodating active sites than replicative DNA polymerases, and make a relatively limited number of contacts with the template base and incoming nucleotide (185). These features allow them to insert nucleotides across from a wide variety of DNA lesions that would stall a replicative DNA polymerase. However, the ability to replicate through altered bases comes at an expense of fidelity. The nucleotide misincorporation rates of TLS polymerases copying undamaged DNA are ca.10-1-10-4, much higher than the misincorporation rates of 10-6-10-8

104 observed for pol and pol, the main replicative DNA polymerases (175,176,186). Consequently, the access of these error-prone TLS enzymes to primer termini is carefully regulated.

Certain TLS polymerases are specialized for the efficient and relatively accurate insertion of nucleotides opposite particular lesions (35,178,179), termed cognate lesions. Examples of cognate lesions include thymine-thymine cyclobutane pyrimidine dimers (T-T CPDs) for pol(187,188) and N2-dG adducts, such as N2-benzo[a]pyrene-dG (BaP-dG) adducts, for pol(189,190) Although these TLS polymerases can also replicate over certain non cognate lesions as well, they do so in a more error-prone manner. Furthermore, the bypass of many DNA lesions, including bulky BaP-dG adducts, cisplatin (cisPt) adducts, and [6-4] photoproducts ([6-

4]PP), is accomplished via the coordinated consecutive action of two different TLS DNA polymerases (191-193). An inserter TLS polymerase (e.g. pol,  or ) incorporates a nucleotide across the site of DNA damage. This inserter polymerase is then replaced by another TLS enzyme that extends from the distorted primer terminus positioned across from the lesion. This extender role is frequently carried out by pol, which is recruited to the primer terminus in a process mediated by Rev1 (176,194,195). Rev1/pol-dependent TLS is responsible for most of the DNA damage-induced mutagenesis in eukaryotic cells and contributes to spontaneous mutagenesis as well (194,195). On the other hand, certain DNA lesions such as cyclobutane pyrimidine dimers (CPDs) or abasic sites can be effectively bypassed by a single TLS enzyme.

For example, pol alone can efficiently replicate through T-T CPD, one of the common types of

UV induced lesions, in an essentially error-free manner (187,188). Xeroderma pigmentosum variant (XPV) individuals, who lack pol function are highly UV-sensitive and cancer-prone

(188). They exhibit increased UV-induced mutagenesis due to more error-prone polymerases carrying out TLS in the absence of the more accurate pol (196).

105

Protein-protein interactions play critical roles in controlling the access of TLS polymerases to their cognate DNA lesions and facilitating polymerase switching during bypass replication

(35,177-179). Beyond their catalytic cores, Y-family TLS enzymes possess auxiliary domains and motifs that are important for their efficient interactions with one another and with the DNA

(35,178,179). TLS polymerases possess ubiquitin-binding motif (UBM) or ubiquitin-binding zinc finger (UBZ) domains in addition to the proliferating cell nuclear antigen (PCNA) interacting motifs (197). These ubiquitin-binding domains play a role in the recruitment of TLS

DNA polymerases to PCNA processivity clamps that have been mono-ubiquitinated at Lys164 in a Rad6/Rad18-dependent fashion in response to DNA damage (198). The PCNA-interacting protein (PIP) box motifs of pol, , and  additionally contribute to the interaction with mono- ubiquitinated PCNA, while Rev1, which lacks a consensus PIP-box motif, instead utilizes its N- terminal BRCT (BRCA1 C-terminus), its polymerase associated (PAD) or UBM domains for

PCNA binding (199-201). In addition, as discussed in the detail in this paper, the C-terminal parts of vertebrate pol, ,  contain Rev1-interacting regions (RIRs), consisting of short peptide motifs containing two consecutive phenylalanines that bind to the Rev1 C-terminal domain

(Rev1-CT) (202).

The human Y-family Rev1 polymerase is a 1251 amino acid (aa) protein that is notable for its role, along with pol, in most DNA damage induced mutagenesis (203-205). The Rev1 catalytic activity, however, is limited to inserting dCMP opposite a G template or certain types of

DNA lesions, and thus is insufficient to explain its role in introducing mutations (205-207).

Subsequent studies revealed that Rev1 possesses a “second function” besides its catalytic ability that is important for TLS and mutagenesis (208). This second, and more important, function of

Rev1 proved to be the recruitment and coordination of other DNA polymerases during the

106 process of TLS via specific protein-protein interactions (35,178,179). Thus, Rev1 serves as a scaffold that provides access for the Y-family polymerases , , and to their cognate DNA lesions, and can also facilitate the subsequent exchange to pol, which then extends the distorted

DNA primer terminus opposite the lesion. Remarkably, previous studies of intermolecular interactions of Rev1 fragments using yeast two-hybrid assays, co-immunoprecipitation and cellular co-localization revealed that Rev1’s interaction with the other three Y-family TLS polymerases, pol, , and  and with pol as well is mediated by a relatively small 100 aa Rev1-

CT domain (202,209-211). The human Y-family polymerases ,  and  bind the Rev1-CT with

RIR peptide motifs that contain two consecutive Phe residues (202). Such motifs are absent in the Rev3 and Rev7 subunits of pol also known to interact with the Rev1-CT in vertebrates (209-

212), suggesting that their mode of Rev1-CT binding is different from that of polymerases ,  and . Previous reports suggest that the loss of S. cerevisiae Rev1-CT has a detrimental effect on cell survival and results in a decrease of mutagenesis after DNA damage (213). Although the crystal structure of the central polymerase domain of both human and yeast Rev1 have been solved, the critically important protein-protein interaction Rev1-CT domain was removed to facilitate crystallization (214,215).

The versatility of Rev1-CT interactions renders characterization of this module critical for understanding the underlying mechanisms of TLS. Here we report high-resolution spatial structures of the human Rev1-CT domain (residues 1157-1251) and its complex with Rev1 interacting region (RIR) from pol (residues 524-539) determined by solution NMR spectroscopy, as well as a detailed characterization of conformational flexibility of the free

Rev1-CT domain and the complex on various time-scales using 15N NMR spin-relaxation (216) and 15N NMR relaxation dispersion measurements (217). The functional importance of particular

107 amino acids to this interaction is demonstrated through the use of yeast two-hybrid analysis.

Taken together, these data provide a rigorous structural basis for understanding Rev1 interactions with human Y-family TLS polymerases.

4.3. Experimental Procedures

Protein sample preparation for NMR spectroscopy. The gene encoding human Rev1-CT domain (residues 1158-1251 of Rev1 inserted after the cleavage site for TEV protease -

ENLYFQG), codon optimized for expression in E. coli, was custom synthesized (GenScript) and sub-cloned into pET-28b(+) vector (Novagen) using NdeI - BamHI restriction sites.

The recombinant human Rev1-CT was over-expressed in E. coli BL21(DE3) cells. 15N/13C labeled protein for NMR spectroscopy was produced by growing cells transformed with Rev1-

15 13 CT plasmid in minimal M9 medium using NH4Cl and C-glucose as sole nitrogen and carbon

o sources, respectively. Cells were grown at 37 C to OD600 of 0.6-0.8 followed by induction of protein expression by 1 mM IPTG overnight at 20 oC. After harvesting, the bacterial pellets were re-suspended in loading buffer containing 50 mM NaH2PO4, 250 mM NaCl, 10 mM imidazole, 1 mM PMSF, pH 7.4. After the cells were lysed by sonication, the soluble fraction was loaded onto a Ni2+ affinity column and incubated for 1 h at 4oC. After an extensive wash with the loading buffer, the protein was eluted with the buffer of the same composition, but containing 250 mM imidazole. The His-tag was removed by TEV cleavage for 4 h at 20oC or overnight at 4oC in the presence of 2 mM DTT and 0.5 mM EDTA, followed by protein purification on a HiLoad

Superdex 75 column (GE Healthcare). After concentration, the final sample of free Rev1-CT

15 13 domain contained 0.9 mM N/ C labeled protein, 50 mM NaH2PO4, 100 mM NaCl, 0.25 mM

EDTA, 5 mM DTT, 0.05% NaN3, 10% D2O, pH 7.0. Protein cleavage with TEV protease leaves

108 an N-terminal Gly residue, so that the resulting Rev1-CT domain included residues 1157-1251 of human Rev1 with Pro1157 mutated to Gly. The consistency of the protein sample was confirmed by MALDI-TOF mass spectroscopy.

The complex of 15N/13C labeled Rev1-CT domain with a custom synthesized (GenScript) 16 aa unlabeled peptide that encompasses one of the two RIR regions of pol (residues 524-539;

QSTGTEPFFKQKSLLL) (202) was prepared by gradually titrating 25 mM peptide solution into

0.9 mM protein sample. The binding was monitored by recording 1H-15N HSQC spectra at each step of the titration. The final sample of the complex contained 0.9 mM 15N/13C Rev1-CT - unlabeled pol-RIR mixed in 1:1 molar ratio, 50 mM NaH2PO4, 100 mM NaCl, 0.25 mM

EDTA, 5 mM DTT, 0.05% NaN3, 10% D2O, pH 7.0. At this protein concentration, about 89% of the protein and the peptide are in the bound state, estimated assuming dissociation constant KD =

13 M for the complex reported in previous studies (202).

NMR resonance assignment and protein structure calculation. NMR spectra for the free

15N/13C Rev1-CT domain and its complex with the unlabeled pol-RIR peptide were collected at

15 oC on Agilent VNMRS spectrometers equipped with cold probes operating at 11.7 and 18.8 T magnetic fields (500 and 800 MHz 1H frequencies). The backbone and side-chain 15N, 13C and

1H resonances of Rev1-CT domain were assigned from 2D 1H-15N HSQC, 1H-13C HSQC and 3D

HNCA, HNCACB, HNCO, HBHA(CO)NH, HC(C)H-TOCSY, (H)CCH-TOCSY, 15N-edited

NOESY-HSQC and 13C-edited NOESY-HSQC (150 ms mixing time) spectra (102). Two sets of experiments were performed (i) for the free 15N/13C Rev1-CT domain and (ii) for 15N/13C Rev1-

CT - unlabeled pol-RIR complex. The assignments of 1H NMR resonances for pol-RIR peptide in complex with 15N/13C Rev1-CT domain were obtained from 2D 15N,13C-filtered

TOCSY, COSY and NOESY (250 ms mixing time) spectra (218), containing intra-molecular 1H-

109

1 H correlations from the unlabeled peptide, collected in both 90% H2O/10% D2O and in 100%

D2O buffers (COSY spectrum was recorded in D2O buffer only). In addition, we have recorded

3D 13C-edited 15N,13C-filtered NOESY-HSQC spectrum, containing inter-molecular NOE correlations between 15N/13C Rev1-CT domain and the unlabeled pol-RIR peptide used for structure determination of the complex (218). NMR spectra were processed with NMRPipe (140) and analyzed with CARA software (219). Nearly complete backbone 95% (94%) and side-chain

90% (90%) resonance assignments were obtained for the free Rev1-CT domain (the complex).

Structure calculations for the Rev1-CT domain and Rev1-CT/pol-RIR complex were performed in CYANA software (107). The restraints for the backbone dihedral  and  angles were derived from the backbone 1H, 15N and 13C chemical shifts using TALOS+ program (106).

Intra-molecular 1H-1H distance restraints for the free and bound forms of Rev1-CT were obtained from 3D 15N- and 13C-edited NOESY-HSQC spectra, while inter-molecular distance restraints and restraints within the pol-RIR peptide were derived from 3D 13C-edited 15N,13C- filtered NOESY-HSQC and 2D 15N,13C-filtered NOESY experiments, respectively. Intra- molecular NOE correlations for Rev1-CT domain in free and bound states were assigned automatically in CYANA, while protein-peptide and inter-peptide NOEs were assigned manually. Hydrogen bond restraints were added on the basis of NOE analysis. A total of 200 structures of the free Rev1-CT domain and the complex were generated, followed by refinement of 20 final lowest-energy structures by short constrained molecular dynamic simulations in explicit solvent using CNS software (145). Table 4.1 summarizes NMR based restraints used for structure calculation of Rev1-CT domain and the complex along with structure refinement statistics.

110

15 15 1 NMR characterization of protein dynamics. The backbone N R1, R2 (CPMG) and N{ H}

NOE NMR relaxation measurements for the Rev1-CT domain and its complex with pol-RIR peptide were performed on a 11.7 T Agilent VNMRS spectrometer at 15 oC using the pulse- sequences of Farrow et al. (117). The delay between 180o refocusing pulses of CPMG sequence

15 in N R2 experiment was set to  = 1 ms, corresponding to CPMG frequency CPMG = 1/(2) of

15 500 Hz. N R1 and R2 rates and their uncertainties were obtained from exponential fits of peak intensities in a series of 8 2D 1H-15N correlation spectra recorded at different relaxation delays.

15 N R2 rates were numerically corrected to take into account small systematic contributions due to off-resonance effects of CPMG refocusing pulses (220). 15N{1H} NOE values were calculated as ratios of peak intensities in corresponding 2D 1H-15N correlation spectra recorded with and without 1H saturation using 8 s delay between scans to ensure full recovery of 15N longitudinal magnetization. To account for possible systematic uncertainties in measured NMR relaxation

15 15 1 data the minimal errors of 2% for N R1, R2 rates and 0.05 for N{ H} NOE values were assumed.

Overall rotation diffusion tensors for the free Rev1-CT domain and Rev1-CT/pol-RIR complex and overall rotation correlation time R = 1/(2(Dx + Dy + Dz)) (Di, i = x,y,z are

15 eigenvalues of rotation diffusion tensor) were calculated from N R1 and R2 data using the

15 program DASHA (221). Specifically, N R1 and R2 rates were globally fit to the models assuming an isotropic, axially symmetric and fully anisotropic rotational diffusion tensor (216).

Included in these calculations were the residues with (i) non-overlapped cross-peaks in 1H-15N

15 1 15 correlation spectra, (ii) N{ H} NOE > 0.6 and (iii) N R2/R1 ratios within mean  2 standard deviations. In both cases the model of axially symmetric anisotropic rotation diffusion was selected according to F-test at 10% significance level, resulting in R = 8.96  0.07 ns (10.45 

111

0.06 ns) and Dx/Dz = Dy/Dz = 0.91  0.02 (0.84  0.01) for the free Rev1-CT domain (Rev1-

CT/pol-RIR complex).

Pico- to nanosecond conformational dynamics of the Rev1-CT domain in free and bound forms were assessed from the analysis of 15N relaxation data using the Lipari-Szabo model-free approach (115,216) performed in DASHA software (221). At this stage of the data analysis the

15 parameters of molecular overall rotation were fixed to previously determined values, and N R1,

15 1 R2 and N{ H} NOE data were fit on a per-residue basis using the models of spectral density

2 2 2 function including the following adjustable parameters: (i) {S }, (ii) {S , e}, (iii) {S , Rex}, (iv)

2 2 2 2 2 2 2 2 {Sf , Ss , s} and (v) {S , e, Rex}, where S , Sf and Ss (S2 = Sf Ss ) are generalized order parameters describing angular amplitudes of internal motions of N-H vectors, e, s are correlation times for fast ps and slower ns time-scale internal motions, Rex (below referred to as

Rex,mf) is the contribution to transverse relaxation rate R2 due to s-ms conformational exchange.

Model selection was performed based on the values of 2 penalty function, describing goodness of relaxation data fit, according to the protocol of Mandel et al. (222) (significance levels of 5% and 20%, respectively, were assumed for 2 and F-test).

The residues in free Rev1-CT domain and its complex with pol-RIR peptide exhibiting micro- to millisecond conformational exchange were identified based on Rex,mf values obtained

15 15 in model-free analysis of N relaxation data (115,216). The CPMG sequence used in the N R2 experiment suppresses exchange contribution to R2 provided that 2CPMG >> kex and 2CPMG

>> , where kex and  are exchange rate constant and frequency difference between the exchanging states (223). Therefore, Rex,mf primarily reflect contributions to R2 due to fast s-ms conformational exchange with kex ≥ 2CPMG accompanied by changes in NMR chemical shifts.

15 To quantify Rex contributions to N R2 due to slower ms time-scale exchange we have

112 performed 15N CPMG relaxation dispersion measurements (217,224), where the effective

15 transverse relaxation rates R2,eff were obtained as a function of CPMG. N relaxation dispersion profiles for the free Rev1-CT and Rev1-CT/pol-RIR complex were recorded at 11.7 T (the free domain and the complex) and 18.8 T (the complex only) magnetic fields with the pulse-sequence of Hansen et al. (225), using a constant relaxation period T = 30 ms and CPMG ranging from 33 to 1000 Hz. Relaxation dispersion data measured at 11.7 T were least-square fit to a model of two-state conformational exchange as described elsewhere (226). The best-fit dependencies of

clc clc R2,eff (CPMG) were used to estimate exchange contributions Rex,rd = R2,eff (33Hz)-

clc 15 R2,eff (533Hz) to N R2 that complement model-free derived Rex,mf measured at CPMG of 500

Hz.

Yeast two-hybrid analysis. Studies of protein-protein interactions in the yeast two-hybrid system were performed in the PJ69-4A strain of yeast (227). DNA fragments encoding the C- terminal fragment of human Rev1 (amino acids from 1131-1249) and full-length human polη

(amino acids from 1-713), were subcloned into the pGAD-C1 (GAL4 AD) and the pGBD-C1

(GAL4 BD) plasmids marked with leucine and tryptophan auxotrophy, respectively. The assay was performed by growing strains harboring the two plasmids in 3 ml of media lacking leucine and tryptophan for 2 days at 30°C and spotting 5µL of cells on selective media plates lacking leucine and tryptophan (-LW) and on medium also lacking adenine and histidine (-AHLW) to score positive interactions. Interactions were scored after 3 days of growth at 30 °C.

Site-directed mutations for yeast two-hybrid analysis were generated according to the protocol of the Quikchange Mutagenesis kit (Stratagene), except that we used an annealing temperature of 50 °C and an extension time of 2min/kb. Mutations were verified by sequencing.

113

4.4. Results and Discussion

4.4.1. The Rev1 C-terminus is a structured domain that binds human Y-family DNA polymerases

Previous studies established that the C-terminal ~100 aa region of DNA polymerase Rev1

(Rev1-CT) is critical for DNA damage tolerance in eukaryotes (209-211). However, removal of a fragment of Rev1 containing this region does not alter the dCMP transferase activity of yeast

Rev1 in vitro (228), supporting the notion that Rev1 plays a non-catalytic, structural role in translesion synthesis (TLS) by coordinating the action of other TLS polymerases through protein-protein interactions (35,177-179,229). The binding of the human and mouse Rev1-CT domain with each of the other vertebrate Y-family polymerases, pol,  , and with Rev7, the regulatory subunit of pol has been demonstrated by multiple experimental techniques, including yeast two-hybrid assays, co-immunoprecipitation, and cellular co-localization (202,209-211).

Although the Rev1-CT domain is highly conserved in eukaryotes, its ability to bind other Y- family DNA polymerases has only evolved in higher organisms (230), in contrast to the Rev1-

CT/Rev7 interaction, which has been reported in eukaryotes from yeast to humans (35).

Secondary structure prediction and sequence alignment of Rev1-CT from different species suggested that the domain likely consists of four amphipathic helices, which would have the potential for forming a four-helix bundle, and identified several conserved motifs that seemed likely to be involved in the hydrophobic core of the domain and whose mutation impairs the function of yeast REV1 gene in vivo (213).

Despite the wealth of biochemical and mutational data, the detailed atomic-level picture of

Rev1-CT interactions with other TLS polymerases has remained obscure due to the lack of a high-resolution spatial structure of this domain. In order to determine the solution NMR structure

114 of the human Rev1-CT domain (residues 1157-1251) we have expressed and purified 15N/13C labeled protein and performed its backbone and side-chain resonance assignment using a standard set of triple-resonance NMR experiments (102). Figure 4.1A shows the 1H-15N HSQC spectrum of the free Rev1-CT domain (red) recorded at 11.7 T field at 15 oC, revealing a single set of well dispersed NMR resonances indicative of a structured protein domain. NMR chemical shifts obtained at this stage of the data analysis provide a plentiful source of information on local protein structure and conformational dynamics. Thus, secondary structure prediction based on the backbone 1H, 15N and 13C chemical shifts using TALOS+ program (106) (Figure 4.1B, red) confirmed that the domain consists of four -helices, referred to below as H1 to H4. Order parameters S2 for the backbone amide groups of the free Rev1-CT (Figure 4.1C, red) that describe the main chain conformational flexibility (S2 = 1 for fully restricted, 0 for unconstrained motions) were empirically predicted from the backbone chemical shifts using Random Coil

Index (RCI) approach (150,231). High S2 observed throughout the protein sequence suggest rather limited mobility of the domain on ps-ns to s time-scale (231). In contrast, the free Rev1-

CT domain seems to be dynamic on a time-scale from s to ms. At temperatures above 20-25 oC the quality of Rev1-CT spectra deteriorate significantly due to extensive line-broadening indicative of s-ms conformational exchange. Decreasing the temperature to 15 oC improves spectra quality, although the resonances from loops between helices H1-H2 and H3-H4 still remain broadened, some beyond the level of detection.

The Rev1-CT domain has been previously shown to form medium to weak affinity complexes with short peptide motifs (Rev1 interacting regions, RIRs) from human Y-family

DNA polymerases ,  and  containing two consecutive Phe residues (Figure 4.1D) with dissociation constants (KD) ranging from 8 to 69 M (202). One RIR motif was identified in

115 each of human pol and pol, while two motifs were found in human pol (Figure 4.1D). In order to elucidate the structural basis for Rev1-CT interactions with human Y-family DNA polymerases, we have performed NMR studies of the Rev1-CT complex with a 16 amino acid peptide containing one of the two RIR motifs of human pol (residues 524-539, line 1 in Figure

4.1D). Figure 4.1A shows 1H-15N HSQC spectrum of the Rev1-CT/pol-RIR complex (blue) superimposed on the spectrum of the free domain (red). The backbone and side-chain NMR resonance assignments for 15N/13C labeled Rev1-CT in complex with the unlabeled pol peptide were obtained as described above for the free domain, while 1H resonances of the peptide were assigned from a set of filtered 2D 1H-1H experiments. The analysis of the backbone 1H, 15N and

13C chemical shifts of Rev1-CT domain bound to pol-RIR peptide suggests that neither secondary structure (Figure 4.1B, blue) nor backbone amide order parameters S2 that report on main chain dynamics (Figure 4.1C, blue) change significantly upon the complex formation. On the other hand, NMR resonances from H1-H2 and H3-H4 loops exhibit somewhat less broadening due to s-ms conformational exchange. Chemical shift changes upon complex formation are observed in the region of the Rev1-CT domain comprising helices H1 and H2, as well as in the N-terminal part preceding helix H1, suggesting that this part of the domain is involved in Y-family polymerase binding.

Titrations of Rev1-CT domain with increasing amounts of pol-RIR peptide monitored by

1H-15N HSQC spectra suggest that the binding process is slow on the NMR time-scale. Namely, peaks corresponding to free Rev1-CT decrease in intensity and ultimately disappear, while peaks of the bound form show up in new places in spectra and increase in intensity with increasing peptide concentration (as illustrated in the inset to Figure 4.1A for Leu1171). In the case of two- state chemical exchange between equally populated states, two peaks are observed in the NMR

116 spectrum of a nucleus if the forward and reverse rates k for the process are slower than /23/2

(coalescence point), where  is frequency difference between the exchanging states (232).

Separate peaks as close as 0.03 ppm (1H) corresponding to free and bound Rev1-CT are observed in 11.7 T 1H-15N HSQC spectra recorded at about 1:0.5 protein/peptide ratio, suggesting that association kon[L] ([L] is concentration of free peptide) and dissociation koff rates for Rev1-

-1 CT/pol-RIR interaction are slower than ~33 s . Assuming the previously reported KD = 13 M

(202) and total protein and peptide concentrations [P0] = 0.9 mM and [L0] = 0.45 mM, respectively, one can estimate the upper limit for kon of Rev1-CT/pol-RIR association as

2.7∙106 M-1s-1, which is of the same order of ~106 M-1s-1 expected for diffusion controlled protein association (233). This observation suggests that the Rev1-CT/pol-RIR association is likely somewhat slower than diffusion controlled, potentially pointing to a barrier crossing event due to rearrangements of interaction partners.

4.4.2. Structure of the free Rev1-CT domain

Figure 4.2 shows the spatial structure of the human Rev1-CT domain (residues 1157-1251) as determined by solution NMR spectroscopy. The ensemble of 20 least-energy structures of the domain (Figure 4.2A) is in excellent agreement with the input experimental data, displaying mean pairwise RMSD for the backbone atoms of regular secondary structure elements of 0.53 Å, and exhibiting minimal violations of distance/torsion angle restraints and deviations from idealized molecular geometry (Table 4.1). As expected, the domain adopts a four-helix bundle fold with helices H1, H2, H3 and H4 spanning the residues 1165-1178, 1184-1199, 1203-1219 and 1224-1243, respectively (Figure 4.2B). The N-terminal 8 residues (1157-1164) of the domain preceding -helix H1 form a rigid type I' -hairpin stabilized by two hydrogen bonds

117 between Leu1159 and Ala1162 (NH1159-CO1162, CO1159-NH1162). The -hairpin tightly packs against helices H1 and H2, resulting in multiple long-range NOE contacts between Leu1159 -

Val1163 and Asn1166, Asp1167, Val1168, Leu1172 in -helix H1, and between Leu1159 and

Gln1189 in -helix H2. Two hydrophobic residues in the -turn, Ala1160 and Gly1161, together with hydrophobic side chains of Leu1171, Trp1175 from helix H1 and negatively charged

Asp1186 from helix H2 form a pocket on the surface of the domain (Figure 4.2C, left plot) capable of accommodating the FF residues of RIR motifs of Y-type DNA polymerases (see below). It is notable that this region comprised of residues from the N-terminal -hairpin and helices H1 and H2, exhibits the largest NMR chemical shift change upon pol-RIR peptide binding. Figure 4.2C shows electrostatic charge distribution color mapped onto the surface of

Rev1-CT domain. In addition to hydrophobic/negatively charged pocket described above located between helices H1 and H2 there is a large positively charged patch on the opposite side of the molecule formed by side-chains of Lys1169 and Arg1173 from helix H1 and Lys1211, Lys1214 and Arg1215 from helix H3.

Side-chains of residues with solvent exposure less than 5% and an accessible surface area less than 10 Å2 that form the core of the Rev1-CT domain are shown in Figure 4.2B and are listed in the figure caption. Beyond side-chains buried in the hydrophobic core of the four-helix bundle, several residues form the interface between the N-terminal -hairpin and -helices H1 and H2. Recently, extensive sequence alignment of Rev1-CT domains from multiple eukaryotic species identified five conserved 5-6 residue regions within Rev1 C-terminus, referred to as rev1-108 to rev-112 (213). Alanine-patch mutations of four of these regions (rev1-108 - rev1-

111) in yeast Rev1-CT completely abrogate the function of the REV1 gene in cell survival after

DNA damage induced by UV or methanesulfonate (MMS) exposure (213). It is interesting to

118 note that each of the rev1-108, rev1-110 and rev1-111 regions in the human Rev1-CT contain 2 to 4 residues from the core of the domain (see caption to Figure 4.2). Simultaneous mutations of multiple core residues, therefore may cause significant destabilization of the domain. Provided that the core of Rev1-CT is conserved among eukaryotic species, this observation explains why mutations of rev1-108, rev1-110 and rev1-111 in yeast are equivalent to loss of the whole domain, and, once again, underlines the importance of the Rev1-CT domain for DNA damage tolerance in eukaryotes.

4.4.3. Structure of the Rev1-CT/pol-RIR complex

The finding that vertebrate Rev1-CT directly interacts with the three Y-family DNA polymerases, pol,  and  sparked a search for Rev1 binding sequences (RIR motifs) in eukaryotic Y-family DNA polymerases (202,209-211). This search proved non-trivial. As shown in Figure 4.1D, there is limited among the four RIR regions found in human

Y-family polymerases. Based on mutational analysis and truncations of RIR containing peptides

Ohashi et al. (202) described the minimal RIR region as yyyFFxxxx (underlined in Figure 4.1D), where F is a phenylalanine residue, y is any amino-acid, and x is any amino acid but a proline.

The two consecutive Phe residues are conserved among RIR regions, and substitution of any of them to Ala results in complete abrogation of RIR - Rev1-CT binding. In 3 of 4 RIRs found in human Y-family enzymes (Figure 4.1D) the FF sequence is preceded by Ser whose mutation in pol decreases, but does not abolish Rev1-CT interaction (202). Intriguingly, while there is no obvious sequence conservation in the region C-terminal to the FF motif, studies of the truncated

RIR peptides suggest that at least four residues must follow the FF sequence in order to efficiently bind Rev1-CT. Mutation of any of the four residues to Ala has no effect on Rev1

119 interaction, yet substitution of any of them to Pro results in a complete loss of Rev1 interaction, suggesting that main chain conformation of RIR peptides is important for binding rather than specific side-chain contacts (beyond those formed by FF residues) (202).

To ascertain the mechanism of Rev1 recognition of Y-family DNA polymerases, we have determined the solution NMR structure of the Rev1-CT domain complex with an RIR containing peptide from human pol (residues 524-539) shown in Figure 4.3. The ensemble of 20 least- energy structures (Figure 4.3A) agrees well with the input experimental restraints (Table 4.1), exhibiting mean pairwise RMSD for the backbone atoms of regular secondary structure elements of 0.62 Å (Rev1-CT)/0.27 Å (pol peptide). Structure comparison of the free and bound forms of

Rev1-CT domain (Figure 4.3B) reveals no noticeable structural changes in the domain upon pol-RIR binding. Interestingly, in the complex with Rev1-CT, the pol-RIR peptide forms an

-helix starting at Phe531 and continuing all the way to the C-terminal Leu539. The two phenylalanine residues, Phe531 and Phe532, are located in the first turn of the helix facing Rev1-

CT domain surface. As anticipated from the mutational analysis of Rev1-CT/pol-RIR interaction, the majority of contacts between Rev1-CT and pol-RIR peptide involve the two

Phe residues, displaying extensive inter-molecular NOE connectivities in 13C-edited 15N,13C- filtered NOESY-HSQC spectrum (218) (Figure 4.4). Both aromatic side-chains become buried upon Rev1-CT binding. The ring of Phe532 deeply penetrates into the binding pocket on Rev1-

CT surface formed by residues Ala1160, Gly1161 from the N-terminal -turn, Leu1171,

Trp1175 from helix H1, and Asp1186, Gln1189 from helix H2, while the ring of Phe531 packs against Thr1178 and Ile1179 on the domain surface (Figures 4.3C and D). Notably, the pocket is already preformed in the free form of Rev1-CT domain (Figure 4.2C). As a result, Phe531 and

Phe532 contribute 110 and 180 Å2, respectively, to the ~600 Å2 interaction surface between

120

Rev1-CT domain and pol-RIR peptide. The pol-RIR -helix is tilted towards helices H1 and

H2 of Rev1-CT (Figures 4.3A and B), with the backbone amides of Phe531, Phe532 and Lys533 in the first turn of the -helix facing the negatively charged surface of the Rev1-CT domain

(Figure 4.2C). The key residue in Rev1-CT that interacts with the above amide groups of pol-

RIR peptide is a highly conserved Asp1186 from the helix H2, which is surface exposed in the free domain, but becomes completely buried upon complex formation. Strikingly, as also discussed more fully below, mutation of Asp1186 to alanine completely eliminates the binding of the full-length pol to the Rev1-CT in the yeast two-hybrid system (Figure 4.5). The carboxyl of the Asp1186 side-chain participates in charge-dipole interactions with the backbone amides of

Phe531 (whose 1HN resonance is shifted to 11.1 ppm) and Phe532 of pol-RIR peptide, and is also proximal to the side-chain NH group of Trp1175 and the backbone amide of Met1183 from the H1-H2 loop of Rev1-CT domain (Figure 4.3D). It is not an unusual feature for a buried ionizable side-chain in a protein to be involved in interactions with the backbone / side-chain dipole or oppositely charged side-chain group (234). Residues Lys531, Ser532 and Leu539 from the two turns of the pol-RIR -helix following the FF sequence interact with Ala1160 and

Gly1161 from the N-terminal -hairpin of Rev1-CT, while Thr528 and Glu529 preceding the FF pair pack against the side-chains of Pro1182 and Met1183 in the H1-H2 loop.

The spatial structure of the Rev1-CT/pol-RIR complex (Figure 4.3) reveals that the two consecutive phenylalanine residues, Phe531 and Phe532, of the pol RIR motif are responsible for the majority of specific inter-molecular interactions and are crucial for the complex formation. Accordingly, mutation of these critical Phe531 and Phe532 residues to glycine within the pol RIR nearly completely eliminates interaction with the Rev1-CT via the yeast two- hybrid system described below (Figure 4.5A). We also observe a relatively weaker binding when

121 the Phe483 and Phe484 residues within the first RIR of pol were mutated to glycine, consistent with an earlier report that all four phenylalanine residues within the two pol RIRs contribute to

Rev1-binding (202).

Another important interaction involves the backbone amide groups of Phe531 and Phe532 from the N-terminal face of the pol-RIR -helix and negatively charged Asp1186 of Rev1-CT completely buried on the binding interface. The fact that the pol-RIR peptide adopts an - helical conformation provides a rationale for the previously published mutational data (202). It is clear that the helix is needed to maintain proper orientation of the two aromatic rings of the FF sequence that fit into the binding pocket on the Rev1-CT surface, as well as to position the backbone amides of the first -helix turn towards the side-chain of Asp1186 in Rev1-CT.

Mutational data suggest that at least four residues following the FF sequence are necessary to maintain the -helical conformation secured by hydrogen bonds between residues i and i+4.

Alanine is highly favorable in -helices (235), so that substitution of residues C-terminal to the

FF sequence with Ala maintains -helical conformation of RIR peptide and, therefore, has little effect on Rev1-CT binding. In contrast, mutation to a helix-breaking Pro (235) disrupts the - helix of RIR motif and thereby abolishes Rev1-CT interaction. The -helix of RIR motifs found in Y-family DNA polymerases starts with the FF sequence following the N-cap Ser or Pro

(Figure 4.1D). Pro is often found in N-cap positions of protein -helices (236), while Ser is known as one of the most favorable N-cap residues (along with Asn, Asp, Cys and Thr) (237).

Therefore, it is not surprising that Ser to Ala substitution in the pol-RIR motif results in weakening of Rev1-CT binding (202). Although Thr528 and Glu529 preceding the FF sequence interact with the side-chain of Met1183 of Rev1-CT (Figure 4.3), these and other residues from the N-terminal part of pol-RIR peptide exhibit no well-defined conformation and are likely of

122 little importance for interaction with Rev1-CT. Therefore, based on our analysis of the Rev1-

CT/pol-RIR structure (Figure 4.3) Rev1 binding motif can be re-defined as nFF(4h), where n is an N-cap residue, F is phenylalanine and 4h are at least four residues that form an -helix.

4.4.4. Corroboration of Rev1-CT/Pol-RIR interface by yeast two-hybrid assays

Having elucidated the structure of the human Rev1 C-terminus in complex with the polη-

RIR, we employed a yeast two-hybrid system to monitor the interaction between the proteins.

The region encoding the human Rev1-CT and full-length polη was fused to the Activation

Domain (AD) and the Binding Domain (BD) of the GAL4 transcription factor, respectively. The presence of the constructs after transformation into the PJ69-4A strain of yeast was ensured by growth on selective media plates lacking leucine and tryptophan (-LW). Interactions between the two proteins activate the expression of the HIS3 and ADE2 reporter genes, both driven by promoters responsive to the Gal4 transcription factor. These interactions were scored by growth on selective media plates additionally lacking adenine and histidine (-AHLW). We determined that a longer Rev1-CT construct of ~120 amino acids, rather than the 95 amino acid Rev1-CT fragment that was used in structure determination, reproducibly recapitulated the interaction with polη. Consistent with earlier data, the interaction between the Rev1-CT and full-length polη resulted in robust growth of the strains in –AHLW selective media plates (Figure 4.5A). In contrast, no growth was observed in strains expressing the Rev1-CT and the BD alone, showing that the interaction is specific between the two proteins.

Having established a strong interaction between the Rev1-CT and polη we asked if mutations of residues within the N-terminal β-hairpin and -helices H1, H2 of Rev1-CT affect the interaction with polη. To extend and validate our predictions from the Rev1-CT/polη-RIR

123 complex, we created a series of point mutations within the Rev1-CT and examined its interaction with polη via the yeast two-hybrid assay. As shown in Figure 4.5B, the Rev1-CT mutations

L1159A, L1171A, L1172A, W1175A and D1186A completely abrogate the interaction, whereas the mutations E1174A, T1178A and M1183A severely diminish binding to polη. The mutant proteins containing A1160G, I1179A and Q1189A substitutions, however, show an interaction similar to wild-type Rev1-CT. Among the set of residues in Rev1-CT that show complete loss of binding, Leu1159 lies on the interface between the N-terminal β-hairpin and helices H1 and H2 that create a binding pocket for the Phe531 and Phe532 residues of the pol-RIR, and likely is important for stabilization of the β-hairpin. The key residue Leu1171 interacts with Phe532 and is exposed in the free Rev1-CT but becomes buried upon binding the polη-RIR. Its mutation to alanine, therefore, completely eliminates the interaction with the polη-RIR. Leu1172, however, forms part of the hydrophobic core of the Rev1-CT so a change to alanine, while rendering it incompetent to bind polη, is likely due to a destabilization of the Rev1-CT itself. Similarly

Trp1175, one of the key residues on the binding interface between the Rev1-CT and polη interacts with Phe532, but is partially buried in the free domain and, therefore, participates in stabilization of the hydrophobic core. Like L1172A, the W1175A mutation likely significantly destabilizes the Rev1-CT domain. Strikingly, Asp1186, a crucial residue for the Rev1-CT - pol-

RIR interaction is almost completely buried on the binding interface. It interacts with the backbone amide groups of Phe531 and Phe532 from the first turn of pol-RIR -helix and, consequently, mutation of this negatively charged amino acid to alanine abrogates interaction with the polη-RIR, as evidenced by a complete lack of growth of yeast strains harboring these plasmids on selective media plates lacking AHLW (Figure 4.5B).

124

The weaker polη-RIR interacting residues Glu1174 and Thr1178 in the Rev1-CT are located in the -helix H1 at the periphery of pol-RIR binding interface, while Met1183 is located in the loop between helices H1 and H2 and interacts with N-terminal residues of the pol-RIR preceding the conserved Phe531 and Phe532. Their mutation to alanine results in a significantly reduced growth of yeast strains carrying the plasmids on –AHLW plates. Surprisingly, mutation of Ala1160, which makes contacts with the two turns of the pol-RIR -helix following the FF motif, and Ile1179, Gln1189 located on the periphery of the interface and interacting directly with Phe531 and Phe532, had no effect on binding to the Rev1-CT in the yeast two-hybrid assay

(Figure 4.5B). Taken together, these results show that residues Leu1171, Glu1174, Thr1178,

Met1183 and Asp1186 of Rev1-CT make contacts with pol that contribute to stabilization of the complex between the two proteins.

4.4.5. Conformational dynamics of Rev1-CT domain and Rev1-CT/pol-RIR complex

Protein conformational flexibility plays an important role in molecular recognition, often conferring upon protein domains the ability to bind multiple targets (238,239). Pre-sampling of multiple interconverting conformers enables rapid selection of molecular configuration optimal for binding. On the other hand, loss of conformational mobility and/or ordering upon binding imposes an entropic penalty that weakens molecular interaction, thereby modulating specificity of recognition by allowing more selective binding. Not surprisingly, a number of molecular complexes have been described involving flexible and/or disordered interaction partner(s) that only fold into an ordered structure upon binding (239). The interaction between the Rev1-CT and the Y-family DNA polymerases pol, , and  seems to fall into this category. The Rev1 interacting regions (RIRs) of pol, pol, pol are mapped outside structured domains of the

125 enzymes (35,178,179) and likely adopt -helical conformation only upon binding the Rev1-CT domain. Although the Rev1-CT is shown to be well structured, NMR line broadening in the H1-

H2 and H3-H4 loops, as well as in parts of helices H1 and H2 points to ms-s conformational rearrangements of RIR interaction interface. Therefore, to elucidate the role of Rev1-CT flexibility in Y-family DNA polymerase recognition we have examined ps-ns and ms-s time- scale conformational dynamics of free and pol-RIR bound forms of the domain using 15N spin- relaxation (216) and 15N CPMG relaxation dispersion (217) measurements.

15 15 1 Figure 4.6A (top plots) shows N R1, R2 relaxation rates and N{ H} NOE values for the

Rev1-CT domain (red) and Rev1-CT/pol-RIR complex (blue) measured at 11.7 T magnetic field, 15 oC. NMR spin-relaxation data were analyzed using the Lipari-Szabo model-free approach (115,216) to extract order parameters S2 (Figure 4.6A, bottom plot) and correlation times (data not shown), describing angular amplitudes and time-scales of fast ps-ns internal motions of the backbone NH vectors, respectively, along with parameters of overall rotational diffusion of the molecule. As expected, overall rotation correlation time R for the complex

(10.45  0.06 ns) exceeds R for the free domain (8.96  0.07 ns) by 17%, closely matching the increase of molecular size upon Rev1-CT (95 aa) - pol-RIR (16 aa) complex formation.

Uniformly high order parameters S2 > 0.8 are observed for the backbone amide groups of the free Rev1-CT domain and the complex (Figure 4.6A), stemming from fast (fs-ps) restricted thermal flexibility of the protein backbone (S2 = 1 for fully constrained, 0 for unrestricted motions). The exceptions are 5-6 residues in the C-terminal part of the domain and residues from the H3-H4 loop, exhibiting a lower S2 pointing to increased ps-ns mobility. The experimental S2 values (Figure 4.6A) are in line with those predicted from the backbone chemical shifts using

RCI approach (150,231) (Figure 4.1C), albeit the later are underestimated in loop regions.

126

Interestingly, the experimental S2 for the backbone amides in the N-terminal -hairpin are about as high as in the regular secondary structure elements (Figure 4.6A), confirming that the N- terminal part adopts a rigid conformation in both free and bound forms of the Rev1-CT domain.

Overall, 15N relaxation data analysis suggests that ps-ns time-scale internal motions of the Rev1-

CT domain are restricted and that ps-ns dynamics of the domain does not change upon pol-RIR binding.

Along with order parameters S2 and correlation times of internal motions, model-free

15 analysis of N relaxation data allows extraction of exchange contributions Rex,mf to transverse relaxation rates R2 that report on s-ms conformational dynamics accompanied by changes in

15 N chemical shifts (216). The extracted Rex,mf primarily originate from faster s-ms processes,

15 since the CPMG sequence in our N R2 experiment suppresses contributions to R2 due to ms exchange with kex << 2CPMG and  << 2CPMG (where kex is exchange rate constant,  is

15 frequency difference between the exchanging states; CPMG frequency CPMG in our N R2

15 experiment was set to 500 Hz) (223). To additionally quantify contributions to N R2 rates due to slower ms time-scale conformational exchange, we have performed 15N CPMG relaxation dispersion experiments (217,224,225) that measure effective transverse relaxation rates R2,eff as a

clc clc function of CPMG, and quantified Rex,rd = R2,eff (33Hz) – R2,eff (533Hz) that complement Rex,mf

15 obtained in model-free analysis of N relaxation data. The sum Rex,mf + Rex,rd provides an

15 approximation for total exchange contribution Rex to N R2 in free-precession limit.

15 N R2 rates plotted vs. residue number (Figure 4.6A) reveal that residues in (and adjacent to)

H1-H2 and H3-H4 loops in the free Rev1-CT domain and in the complex exhibit enhanced 15N

R2 caused by significant s-ms exchange contributions Rex,mf that were quantified using model- free analysis of 15N spin-relaxation data (115,216). 15N CPMG relaxation dispersion

127 measurements (217,224,225) that were used to identify amide groups involved in slower ms time-scale exchange not captured in Rex,mf significantly extend the list of residues in free and bound forms of Rev1-CT undergoing s-ms conformational rearrangements. Figure 4.6B shows examples of 15N CPMG relaxation dispersion profiles for the backbone amide groups of

Gly1161, Leu1171, Trp1175 and Asp1186 that form the pol binding site of Rev1-CT. In is clear that for these residues R2,eff (CPMG) profiles reach a plateau at CPMG~500 Hz where the exchange contribution to R2 is effectively suppressed by the CPMG sequence. Figure 4.6C

15 shows total exchange contribution Rex = Rex,mf + Rex,rd to N R2 rates (at 11.7 T magnetic field) mapped onto structures of free Rev1-CT domain (top) and the complex (bottom), with residues involved in faster s-ms and slower ms dynamics marked, respectively, in cyan and magenta.

Remarkably, the region exhibiting extensive s-ms conformational exchange in both free and bound states of Rev1-CT encompasses the binding site for the pol-RIR motif formed by residues from the N-terminal -hairpin, the two last turns of -helix H1, the H1-H2 loop, and two first turns of -helix H2. Although s-ms exchange contributions Rex generally decrease upon complex formation, the binding interface remains dynamic on s-ms time-scale in Rev1-

15 CT/pol-RIR complex. Interestingly, Rex contribution to N R2 increase upon complex formation for Gly1161 from the N-terminal -hairpin of Rev1-CT (Figure 4.6B), which directly interacts with pol-RIR -helix.

Beyond intramolecular processes, ms time-scale exchange observed in the Rev1-CT/pol-

RIR complex may be caused by an on-off equilibrium between free and bound forms of the

6 -1 -1 -1 protein (240). From the upper limits of kon (2.7x10 M s ) and koff (33 s ) obtained as described above one can derive the upper estimate for the exchange rate constant of the on-off equilibrium

-1 15 kex = kon[L] + koff ~ 300 s at the condition of our N CPMG dispersion experiment (0.9 mM

128 total protein, 0.9 mM total peptide concentrations; free ligand concentration [L] ~ 11% of 0.9

-1 mM at KD = 13 M (202)). Even at a relatively slow kex of 100-300 s the exchange between free and bound forms of Rev1-CT may cause noticeable exchange contributions to transverse relaxation rates. To elucidate whether the observed ms time-scale exchange in the complex can be solely explained by on-off equilibrium between free and bound forms of Rev1-CT we have additionally recorded 15N CPMG dispersion profiles at 18.8 T magnetic field, and globally fit

11.7 and 18.8 T relaxation dispersion data to a model of two-state conformational exchange

(Figure 4.7A), resulting in a population of the second (minor) state pB = 5.9 ± 0.4% and

-1 - exchange rate constant kex = 585 ± 23 s . This is about two times faster than the estimated 300 s

1 upper limit of the exchange rate constant for the Rev1-CT/pol-RIR binding equilibrium.

Interestingly, 15N chemical shift differences obtained from CPMG relaxation dispersion data,

15 disp, are in good agreement with N chemical shift differences between free and pol-RIR bound forms of Rev1-CT, spec (Figure 4.7B), with the exception of few residues on the

15 binding interface (such as Trp1175). In addition, significant contributions Rex,mf to N transverse relaxation rates R2 due to faster s-ms dynamics have been obtained in 'model-free' analysis of

15N relaxation data for the binding site residues 1175-1179 (Figures 4.6A and C). Taken together, these data suggest that conformational exchange observed on the binding interface of the Rev1-CT - pol-RIR complex may include contributions from both on-off equilibrium between free and bound forms of Rev1-CT and intra-molecular dynamic process(es).

The analysis of 15N spin-relaxation and 15N CPMG relaxation dispersion data established that free and pol-RIR bound forms of Rev1-CT domain are rigid on ps-ns time scale, but exhibit significant s-ms conformational dynamics encompassing the RIR binding site. Transitions to low-populated states occurring on s-ms time-scale observed on the intermolecular interface of

129 the complex may point to the existence of multiple modes of Rev1-CT/pol-RIR binding. Such a plasticity of interaction interface in the bound state might prove beneficial for minimizing the entropic penalty of the complex formation (239). On the other hand, s-ms sampling of multiple conformational states in the free Rev1-CT domain may facilitate selection of a molecular configuration(s) that enables efficient binding of RIR motifs (238,239), exhibiting surprisingly little sequence conservation among Y-family DNA polymerases (Figure 4.1D).

4.5. Concluding Remarks

In this study, we have made a significant contribution to our understanding of how Rev1 regulates TLS in human cells through its interactions with the Y-family TLS polymerases pol,

, and  by (i) solving the structure of Rev1’s important C-terminal domain, (ii) revealing the structural details of how the RIR of pol binds to Rev1’s C-terminal domain, and (iii) showing that the RIR binding site of the Rev1 C-terminal domain displays conformational dynamics on the s-ms time scale that may facilitate the selection of the molecular configuration optimal for binding. We have also been able to redefine the Rev1 binding motif, RIR, as nFF(4h), where n is an N-cap residue, F is phenylalanine and 4h are at least four residues that form an -helix.

The interactions that Rev1 makes with the Y-family DNA polymerases pol, , and  enable a later acting form of TLS control that acts subsequent to the Rad6/Rad18-mediated monoubiquitination of PCNA that is triggered by a variety of types of DNA damage (184,241).

The Rev1-CT is not important for the recognition of monoubiquitinated PCNA nor is it necessary for recruitment to nuclear foci after DNA damage (229,242). Reciprocally, the RIR sequences of pol are not necessary for recruitment to UV-induced nuclear foci, since its C- terminal 120 amino acids which do not include either of the RIRs are sufficient for its

130 localization (243). Nor are the RIR sequences necessary for damage-induced foci formation by pol (202) or pol (244). Thus the interactions that the Y-family TLS DNA polymerases pol, , and  make with the Rev1-CT represent another level of regulation beyond that of PCNA monoubiquitination that helps to ensure that these inherently error-prone DNA polymerases do not introduce mutations by gaining access to primer termini in undamaged DNA. Subsequent polymerase exchanges that are mediated through the interactions that each of these TLS polymerases make with Rev1’s C-terminal domain could possibly also help with the process of matching each polymerase with its cognate lesion over which it can replicate relatively accurately. Example of cognate lesions include thymine-thymine cyclobutane pyrimidine dimers for pol, N2-BPDE-dG lesions for pol, and dA for pol (35,174). Polymerase exchange mediated by the Rev1 C-terminal domain could also help exchange a non functional TLS polymerase for an alternative TLS polymerase (245). Evidence has been reported indicating that, in higher eukaryotes, a component of TLS can also proceed in the absence of PCNA monoubiquitination (246). In these cases, the interactions that Rev1 makes with other TLS DNA polymerases might be particularly important in facilitating and regulating TLS (209,211,247).

The exchanges that Rev1 likely facilitates between pol, , and  are apparently made possible by the s-ms conformational dynamics of the region of the Rev1-CT that interacts with their

RIRs.

In TLS events requiring the coordinated consecutive action of two DNA polymerases, the

Rev1 C-terminal domain can also serve to facilitate the exchange from an inserter DNA polymerase to the extender DNA polymerase pol via its interaction with the accessory subunit

Rev7. Recent evidence indicates that human Rev1 recruits pol through an interaction with Rev7 as it does in yeast (248). Rev7 clearly interacts differently with the Rev1-CT than do pol, , and

131

 since it lacks an RIR and the Rev1-binding interface of Rev7 appears be a portion of the surface of a five-stranded -sheet of Rev7 rather than a disordered peptide (249). Although pol can accurately replicate UV-irradiated DNA independently of Rev1, pol-Rev1 interactions are involved in suppressing spontaneous mutations, probably by promoting accurate TLS past endogenous lesions (250).

Although TLS DNA polymerases can act at the replication fork, there is a growing body of evidence in both yeast and mammalian cells that a substantial amount of TLS occurs at postreplicational gaps caused by lesions rather than at active replication forks (180-184,251). In fact, in yeast TLS occurs efficiently if it is constrained to occur during G2/M by manipulating the cell cycle dependence of Rad6/Rad18-dependent monoubiquitination (180,182). Of particular interest is the observation that in S. cerevisiae Rev1 is synthesized at extremely low levels during

G1 and most of S phase, but then is expressed at fifty-fold higher levels during G2/M (251).

Thus, most of the Rev1-pol interactions that occur in S. cerevisae via interactions of the Rev1

C-terminal domain with Rev7 are naturally confined to postreplicational gaps during G2/M

(183,252). Given the growing body of evidence that a significant amount of TLS takes place during G2/M in mammalian cells (181), it seems likely that a substantial fraction of the interactions between the Rev1 C-terminal domain and other TLS DNA polymerases similarly take place at postreplicational gaps.

Knowledge of the structure of the Rev1-CT should also make it possible to clarify the molecular details of the interactions that Rev1 makes during interstrand crosslink repair and may make it possible to gain additional insights into the relationship between interstrand crosslink repair and Fanconi anemia (253). Since inhibition of TLS can help to improve chemotherapy

(254,255), it is possible that searching for inhibitors that target specific protein-protein

132 interactions required for TLS might be an alternative to searching for inhibitors that selectively inhibit only one TLS DNA polymerase out of the 15 DNA polymerases found in human cells

(256).

Accession codes

Atomic coordinates of the Rev1-CT domain and the Rev1-CT/polh-RIR complex were deposited to Protein Data Bank (PDB entries 2LSY, 2LSK). The backbone and side chain NMR resonance assignments were deposited to BioMagResBank (BMRB entries 18455, 18434).

133

Table 4.1. NMR-based restraints used for structure calculation of Rev1-CT domain and Rev1- CT/pol-RIR complex, and structure refinement statistics.

Free Rev1-CT Rev1-CT/pol complex Summary of restraints Total NOE distance restraints 1853 1526 Short range (|i-j|<=1) 1013 856/53a Medium range (1<|i-j|<5) 468 315/9a Long range (|i-j|>=5) 372 221/0a Intermolecular - 51/21b Dihedral angles restrains (φ and ψ) 180 152 / 26a Hydrogen bonds 46 40/4a Structure refinement statistics Deviation from NMR-based restraints NOE (Å) 0.02 0.02 Dihedral restraints (degrees) 1.1 1.3 Deviation from idealized geometry Bonds (Å) 0.02 0.02 Angles (degrees) 1.1 1.2 Ramachandran plot favorable 90.1% 90.3% RMSD, pairwise (Å) All residues Rev1-CT pol-RIR Backbone atoms 0.87±0.16 1.08±0.19 2.58±1.03 Heavy atoms 1.35±0.14 1.83±0.23 3.40±1.00 Residues in regular secondary structure elementsc Backbone atoms 0.53±0.11 0.62±0.12 0.27±0.08 Heavy atoms 1.08±0.10 1.51±0.22 1.38±0.26

a - intra-molecular restraints for Rev1-CT / pol peptide b - upper/low distance restraints c - residues 1165-1178, 1184-1199, 1203-1219, 1224-1243 in Rev1-CT, 532-539 in pol.

134

Figure 4.1. The Rev1-CT forms a structured domain that binds DNA polymerase η. A. 1H- 15N HSQC spectra of free 15N/13C Rev1-CT domain (red) and the 15N/13C Rev1-CT complex with the unlabeled pol-RIR peptide (blue) recorded at 11.7 T magnetic field, 15 oC. The inset shows the peak of Leu1171 in 1H-15N HSQC spectra of a titration series recorded at different ratios of peptide to protein. B. Probability of -helix, P(helix), in the free (red) and pol-RIR bound (blue) Rev1-CT domain predicted from backbone chemical shifts using TALOS+ program (106) plotted vs. residue number. C. Order parameters S2 for the backbone amide groups of the free (red) and pol-RIR bound (blue) Rev1-CT domain predicted from the backbone chemical shifts using RCI approach (150,231). D. Alignment of 16 aa peptides comprising RIR regions of human pol, ,  used by Ohashi et al. (202) to identify the minimal motif required for Rev1-CT interaction (underlined). The two consecutive Phe residues shown in the box are the only residues conserved in all four RIR regions found in human Y-type DNA polymerases.

135

Figure 4.2. Three-dimensional structure of the human Rev1-CT domain (residues 1157- 1251) determined by solution NMR spectroscopy. A. Superposition of the backbone folds of 20 final lowest-energy structures obtained in CYANA (107) followed by refinement in explicit solvent by CNS software (145). B. Ribbon representation of the Rev1-CT structure showing the positions of four -helixes of the domain. Side chains are shown for residues that form core of the domain with less than 10% surface exposure and 10 Å2 accessible surface area (averaged over 20 best structures), including Leu1159, Ala1162 from the N-terminal -hairpin, Val1168, Leu1172 (rev1-108), Ile1176 (rev1-108) from -helix H1, Val1190 (rev1-110), Tyr1193 (rev1- 110), Cys1194 (rev1-110), Leu1197, Ile1198 from -helix H2, Leu1206 (rev1-111), Val1209 (rev1-111), Ile1210 (rev1-111), Met1213, Leu1216, Met1217 from -helix H3, Trp1225, Ala1228, Ile1232, Val1236, Leu1240 from -helix H4; underlined are residues from four conserved 5-6 aa residue regions, referred to as rev1-108 to -111, identified by extensive sequence alignment of Rev1-CT domains from different eukaryotic species (213). C. Electrostatic charge distribution on the surface of the Rev1-CT domain (blue is positive, red is negative). Encircled is a pocket on the surface of the domain that accommodates the FF residues of RIR motifs of Y-family polymerases.

136

Figure 4.3. Spatial structure of the complex of the human Rev1-CT domain (residues 1157- 1251) with the RIR containing peptide from DNA polymerase  (residues 524-539). A. Superposition of the backbone folds of 20 final least energy structures. Also shown are side- chains of Phe531 and Phe532 of pol critical for interaction with Rev1-CT. B. Superposition of NMR structures of the free Rev1-CT domain (light colors) and the Rev1-CT/pol-RIR complex (saturated colors). C and D. Interaction interface of the Rev1-CT domain complex with the pol- RIR peptide. Side chains of Rev1-CT residues that contribute 5 Å2 or more (averaged over 20 best structures) to the intermolecular interaction interface are shown on plot D.

137

Figure 4.4. Inter-molecular NOEs observed for the Rev1-CT/polη-RIR complex. 1H-1H cross-sections from a 13C-edited 15N,13C-filtered NOESY-HSQC spectrum of the Rev1-CT/polη- RIR complex, displaying intermolecular protein-peptide NOE cross-peaks between selected groups of the 13C/15N labeled Rev1-CT domain and 1H nuclei of the unlabeled polη-RIR peptide. Multiple NOE connectivities are observed between the protons from the polη-RIR binding site of the Rev1-CT domain and aromatic 1H of Phe residues of the polη-RIR motif.

138

Figure 4.5. Mapping of the Rev1-CT/pol-RIR interface by yeast two-hybrid assays. The region encoding the human Rev1 C-terminal 120 amino acid fragment (Rev1-CT) was fused to the GAL4 Activation Domain (AD). The full-length polη-encoding region was appended to the DNA-binding Domain (BD) of GAL4. A. Rev1 plasmids were transformed into strains harboring BD-fused wild-type (WT) polη and the indicated mutations or into strains harboring the empty expression plasmids (indicated by the – symbol in the figure). B. AD-appended WT Rev-CT and the indicated mutations were transformed into strains containing BD-WT polη or the empty expression plasmid (-). The plasmids were selected for by growth on media lacking leucine and tryptophan (-LW) and interactions were scored by growth on plates also lacking Adenine and Histidine (-AHLW).

139

Figure 4.6. Conformational dynamics of the Rev1-CT domain and the Rev1-CT/polη-RIR 15 15 1 complex. A. N R1, R2 relaxation rates and N{ H} NOE values measured at 11.7 T magnetic o 2 2 2 2 field, 15 C, along with order parameters S (or S = Sf Ss ) for the backbone amide groups obtained in model-free analysis of 15N relaxation data for the free Rev1-CT domain (red) and the Rev1-CT/pol-RIR complex (blue). B. Examples of experimental 15N CPMG dispersion profiles R2,eff(CPMG) measured at 11.7 T magnetic field for selected residues of free (top plot) and pol- RIR bound (bottom plot) forms of Rev1-CT. C. Exchange contributions Rex = Rex,mf + Rex,rd to 15 N R2 rates measured at 11.7 T magnetic field mapped onto structures of the free Rev1-CT -1 domain (top) and the complex (bottom). Labeled numbers are residues with Rex > 5 s . Residues exhibiting fast s-ms time scale exchange identified in model free-analysis of 15N relaxation data (115,216) with Rex,mf contributing over 60% to a total Rex are marked in cyan. Residues displaying slower ms time-scale exchange detected by 15N CPMG dispersion measurements (217,224,225) with Rex,rd contributing over 60% to a total Rex are marked in magenta. Shown in blue are residues with comparable Rex,mf and Rex,rd contributions. Phe1165, Ser1223 (free domain and the complex) and Val1224 (the complex only) whose resonances are not detected in NMR spectra presumably due to severe s-ms exchange line-broadening are assumed to have high Rex values and are marked in grey.

140

Figure 4.7. On-off equilibrium between free and bound forms of Rev1-CT contributes to ms time-scale conformational exchange observed for the Rev1-CT/polη-RIR complex. A. 15N CPMG relaxation dispersion data for the backbone amides of Trp1175 and Gln1189 of Rev1-CT domain bound to the polη-RIR peptide recorded at 11.7T (red circles) and 18.8T (blue circles) magnetic fields, and their best fits to a model of 2-state conformational exchange (solid lines). Relaxation dispersion data for 77 residues of a bound form of the Rev1-CT domain measured at two magnetic field strengths were globally fit to the model of 2-state exchange, resulting in a population of the second (minor) state pB = 5.9 ± 0.4% and exchange rate constant -1 15 15 kex = 585 ± 23 s . B. N chemical shift differences obtained in the global fit of N CPMG 15 relaxation dispersion data for the Rev1-CT/polη-RIR complex, Δϖdisp, (red circles), and N chemical shift differences between the free and polη-RIR bound forms of the Rev1-CT domain, Δϖspec (grey impulses), plotted vs. residue number.

141

CHAPTER 5

Summary and future directions

142

5.1. Summary

The overall goal of this thesis was to elucidate the molecular mechanisms of USP7 substrate specificity, its activation and inhibition. To address the specific aims, we explored the possibility of studying USP7 using NMR spectroscopy. We optimized expression and purification conditions for USP7 fragments including the catalytic core as well as the UBL1, UBL12, UBL3 and UBL45 C-terminal domains, performed backbone resonance assignment of the catalytic core

(Chapter 3), the UBL1 and the UBL12 domains (Chapter 2), and determined the solution structure of the UBL1 domain described in Chapter 2. Almost complete backbone NMR resonance assignment of the above USP7 fragments enabled structural characterization of the enzyme interactions with its substrates and small-molecule compounds.

In Chapter 2, using NMR spectroscopy, we characterized a novel substrate-binding site within USP7. In particular, we showed that ICP0 from HSV-1 interacts with the first two UBL domains of USP7. The binding site is located on the surface of UBL2, which makes it the first substrate-recognizing site discovered outside the N-terminal TRAF-like domain. In addition, using cellular assays, we found that knockout of USP7 in HFF-1 cells negatively affects the efficiency of HSV-1 lytic infection. Together these results provide a rationale for designing small-molecule compounds for breaking the USP7 – ICP0 interaction that might be used for treatment of HSV-1 infection in humans.

In Chapter 3, we described an unexpected insight into the mechanism of regulation of USP7 catalytic activity. We, for the first time, detected a weak (KD = 105.7 ± 15.9 μM) but specific interaction between the isolated catalytic core of USP7 and free ubiquitin in solution. Consistent with previous reports (72), we showed that ubiquitin binds to Fingers region of the enzyme.

However, strikingly, we discovered that this interaction does not induce conformational

143 rearrangement of the USP7 active site that is critical for the deubiquitinase activity of the enzyme. Therefore, contrary to previous findings (72), we revealed that ubiquitin binding alone is not sufficient for USP7 activation and a substrate binding is likely required.

In Chapter 3, we presented the first structural analysis of USP7 inhibition by two small- molecule compounds. We demonstrated that two USP7 inhibitors P22077 and P50429 (93) target the catalytic cleft of the enzyme in a similar manner. Further, we uncovered the mode of action for these inhibitors and, surprisingly, found that although P22077 and P50429 are structurally related, they act via distinct chemical mechanisms to covalently modify the catalytic Cys223 residue. We complemented the obtained in vitro data with cellular studies showing that P22077 and P50429 specifically inhibit endogenous full-length USP7 in cells. Overall, these results provide a structural basis for development of a new class of anti-cancer therapeutics.

Altogether, the results presented in this work aided to our understanding of how USP7 recognizes its substrates and how its catalytic activity is regulated in the cell and by available small-molecular compounds.

Another goal of this work was to characterize the mechanism by which TLS DNA polymerase REV1 recognizes other Y-family polymerases. We determined the spatial structures of the free Rev1-CT domain and its complex with Polη-RIR. NMR characterization of the protein structures and conformational dynamics, along with yeast two-hybrid assays revealed that Polη binding to Rev1 is primarily mediated by the two conservative phenylalanine residues of the Polη-RIR motif and amino-acid residues from the N-terminal β-hairpin of Rev1-CT. The results allowed us to define the RIR motif as nFF(4h) where n is N-cap serine or proline residue,

F is phenylalanine and 4h are at least four residues that form an α-helix.

144

5.2. Closing remarks and future directions

5.2.1. NMR spectroscopy is a powerful tool for studies of USP7

The work presented in this thesis differs from previous structural studies of USP7 in that it employs solution NMR spectroscopy to study the enzyme. Several crystal structures of the isolated USP7 domains were available in the literature (58,71,72) but we were the first to determine the solution structure of one of the domains, UBL1. Most importantly, we proved the feasibility of investigating USP7 interactions with its substrates and small-molecule compounds in solution using NMR spectroscopy.

Here, we have developed a robust system for studies of USP7 in solution (Figure 5.1). In addition to the assignments of USP7 catalytic core and UBL1, and UBL12 domains described above, we also performed backbone resonance assignment of UBL3 (Figure 5.2) and UBL45

(Figure 5.3) domains. The TRAF-like domain has been previously assigned by Saridakis at al.

(58). Thus, our “divide and conquer” strategy provides nearly complete backbone resonance assignment of the entire enzyme, which creates a powerful tool for future studies of USP7.

NMR sequence specific assignments accompanied by available crystal structures made it possible to map substrate/inhibitor binding interfaces on the structure of USP7. Moreover, solution NMR spectroscopy can now be used to investigate the molecular mechanisms of USP7 regulation such as inter-domain rearrangements and interactions with proteins altering the activity of the enzyme. Furthermore, studies of the USP7 conformational dynamics are now possible. Revealing the structural changes that occur during USP7 binding to its substrates, regulatory factors or inhibitors will gain our understanding of how the enzyme functions and will provide new possibilities for modulating its activity.

145

5.2.2. Identification of novel substrate-binding sites within USP7

In Chapter 2, we described a novel substrate-binding site within USP7 located in its C- terminal region. Although it was previously known that C-USP7 is involved in protein-protein interactions, the precise location of the binding sites remained unknown. We showed that a negatively charged patch on the surface of UBL2 domain serves as an interaction site for a positively charged peptide from a viral protein ICP0.

Our findings provided the first structural evidence that C-USP7 functions as a substrate- binding platform along with the N-terminal TRAF-like domain. Two other research groups have reported crystal structures of UBL12 in complex with ICP0 that confirmed our findings (85,257).

In addition, crystal structures of UBL1-3 complexes with UHRF1 and DNMT1 were recently published (258,259). Interestingly, these structures revealed that both UHRF1 and DNMT1 recognize C-USP7 in a manner similar to ICP0. Specifically, these proteins have a positively charged motif in their sequence that binds to the same negatively charged surface of UBL2 as

ICP0. Thus, our and others’ studies combined discovered the molecular mechanism of substrate recognition by UBL12 tandem. Moreover, provided the fact that these three proteins harbor short, ~10 amino residue sequence containing several lysines or/and arginines, the presence of such a sequence in a USP7 substrate can be now used to predict its interaction with UBL12 tandem. Along with a search for P/AxxS motifs that bind to the USP7 TRAF-like domain, a search for this newly described substrate feature might simplify the future mapping of

USP7/substrate binding interfaces.

USP7 interacts with and deubiquitinates over cellular 30 proteins (Table 1.1). Although the majority of these interactions have been biochemically and functionally characterized, only five complexes have been characterized structurally. These include complexes with p53, HDM2,

146

DNMT1, UHRF1 and UbE2E1 (27,73,74,258,259). The lack of structural information about

USP7/substrate binding interfaces is a critical gap in the development of highly targeted compounds modulating activity of USP7 towards a particular protein. Likely, several of known

USP7 binding partners will be found to interact with either the TRAF-like domain or UBL12 tandem via discovered mechanisms. However, it is already clear that structural studies of some interactions such as HDM2 binding to the C-terminal UBL3 domain or FOX(O)4 and p53 binding to UBL45 tandem will reveal new determinants of USP7 substrate specificity. Among others, USP7 interaction with TRIM27 E3 ligase is of the special interest. According to a recently published study, the C-terminal region of TRIM27 directly binds to the USP7 catalytic domain that previously was known to recognize only ubiquitin (132). Interestingly, this E3 ligase is both the substrate of USP7 and the regulator of its catalytic activity (63,132). Therefore, structural characterization of the complex might not only identify a novel substrate-binding site within the USP7 catalytic domain but also uncover the molecular mechanism of USP7 regulation by co-factors. Finally, structural characterization of USP7 interactions with histone-associated proteins represents another interesting subject for future investigations. USP7 interacts with all

PRC1 components in vitro (reviewed in subchapter 1.1.3) but it is not clear whether these multiple interactions occur simultaneously or competitively in the context of the intact complex in cells. Moreover, USP7 binds several other proteins involved in post-translational modifications of histones; however, it is not known how all these interactions are organized in space and time. Mapping of the binding interfaces between USP7 and its substrates located on chromatin will help to answer these questions and provide insight into not yet known functions of USP7.

147

5.2.3. Mechanism of USP7 activation

In Chapter 3 of this thesis, we found that USP7 interaction with free ubiquitin does not cause the rearrangement of the catalytic residues into productive conformation contrary to what was previously postulated (72). These results suggest that binding to a ubiquitinated substrate might be needed for efficient activation of the enzyme. We hypothesize that the simultaneous interactions of the substrate with TRAF-like/UBL domains and ubiquitin with the catalytic core ensure the correct placement of the flexible ubiquitin tail in the catalytic cleft and promote conformational changes in the active site necessary for cleavage of the isopeptide bond.

Furthermore, there is mounting evidence that C-terminal region of USP7 is critical for efficient ubiquitin binding and enzymatic activity (62,71). Interestingly, even in the absence of a substrate, a USP7 construct containing the catalytic core and C-terminal UBL domains is active on di-ubiquitin and ubiquitin-AMC (ubiquitin fused to 7-amido-4-methylcoumarin) while the isolated catalytic domain is not (71). These observations together with our results suggest that C-

USP7 might play a role in the active site rearrangement. In addition, there is a possibility that both, presence of the C-terminal domains and an interaction with a ubiquitinated substrate are required for efficient activation of USP7. Thus, to fully answer the question “How is USP7 activated?” we need to determine the structure of the full-length protein and investigate the effect of a substrate binding on the overall conformation of the enzyme.

The current model of the full-length USP7 suggests that UBL domains fold back in order to bring the extreme C-terminal peptide required for the deubiquitinase activity towards the catalytic domain (Figure 1.4C) (71,84). The large size of USP7 and inter-domain dynamics makes structure determination using X-Ray crystallography challenging. Figure 5.4A illustrates the alternative multi-disciplinary approach that might be used to overcome the challenge. We

148 propose to initially identify possible inter-domain interactions using methods such pull-downs,

SPR, ITC and mutagenesis. NMR might be further used to map the inter-domain binding interfaces. To investigate whether USP7 adopts a compact or extended conformation, or samples both conformations, one can use small-angle X-Ray scattering (SAXS). Finally, a high- resolution structure of full-length USP7 can be determined by fitting the structures of individual domains into density maps obtained with cryo-electron microscopy (cryo-EM). A structure of intact USP7 complexed with one of its ubiquitinated substrates might also be determined using cryo-EM. Together, the structures of free full-length USP7 and its complexes with substrates will provide a detailed picture of the enzyme’s action.

As the first step towards determination of the structure of fill-length USP7, we used negative stain EM and collected microphotographs of the protein purified from insect cells. Our results indicate that USP7 adopts a compact structure as evidenced by observed doughnut shaped molecules (Figure 5.4B). These preliminary results provide further support for a model of the full-length enzyme where its C-terminal and N-terminal regions closely interact with each other.

However, the high-resolution structure of intact USP7 is needed to elucidate the exact positions of its domains relative to each other and to explain how its C-terminal peptide affects the catalytic activity.

5.2.4. Future of USP7 inhibition by small-molecule compounds

In Chapter 3, we uncovered the mechanism of USP7 inhibition by two dual USP7/USP47 inhibitors. Despite the success of high throughput screening, the lack of structural biophysical characterization hindered their further optimization. The work described in Chapter 3 is the first comprehensive study that characterizes binding of the inhibitors to the enzyme as well as their mechanism of action. Since we found that compounds P22077 and P50429 used in this study

149 covalently modify the catalytic Cys223 residue, our results raise the critical for drug development question “If the inhibitors target the highly conservative active site, how is the specificity towards USP7 and USP47 achieved?” Our findings do not answer this question but provide a few interesting clues. In particular, we show that the inhibitors’ binding induces conformational changes in the unique misaligned active site of USP7 similar to those seen in the enzyme modified by ubiquitin aldehyde (72). One possible explanation for P22077 and P50429 specificity is that they target the catalytically unproductive conformation of the USP7 catalytic core. In this light, it would be interesting to see whether the USP47 active site is also misaligned.

Furthermore, our structural models of USP7 – inhibitor complexes in combination with identification of conserved residues revealed a few residues in the proximity of the active site that are unique to USP7 and USP47. However, mutagenesis studies are required to check if these unique residues are significant for initial binding. In addition, at least one another inhibitor of

USP7, HBX 19,818, was shown to target its active site (92). This compound is highly specific for USP7, thus its structural characterization in complex with the enzyme would significantly contribute to the understanding of the molecular mechanism of USP7 inhibition and might provide novel approaches to making compounds more specific.

The mapping of USP7/substrate interfaces opens a new alternative strategy for manipulating the enzyme by targeting one of its substrate-binding sites instead of the catalytic domain. Such compounds can potentially control stability of a specific USP7 substrate or a group of substrates.

Importantly, the success of designing compounds that break protein-protein interactions depends on the size and structural complexity of a binding interface. It is easier to design a small molecule resembling a linear short peptide rather than a molecule mimicking interaction with secondary or tertiary structure elements (97). The fact that both the TRAF-like domain and

150

UBL12 recognize short extended peptides is encouraging for discovery of compounds that can interrupt USP7 binding to specific substrates. NMR spectroscopy can be a useful tool in the development such compounds. It is widely used in fragment based drug design (FBDD) that was proved to be a more effective approach to targeting protein-protein interactions than high- throughput screening (260). FBDD identifies initial low-complexity molecules that weakly bind to protein/substrate interfaces; therefore, NMR backbone resonance assignment of the isolated

USP7 domains presented in this work might simplify the identification of such small molecules in the future.

Manipulation of the ubiquitin – proteasome system might be an effective strategy for treatment of human cancers (90); yet before targeting any of the E1, E2, E3 or DUB enzymes in humans, we need to fully comprehend how these enzymes function and how their activity is controlled. This thesis expands our knowledge of molecular mechanisms underlying substrate specificity and regulation of deubiquitinating enzyme USP7. Nevertheless, this work is only the beginning. There is still much to be learned about USP7 on the molecular level as well as in context of the cell and the whole organism. Furthermore, there are dozens of other DUBs and other components of UPS that are involved in vital cellular processes and need to be extensively studied in terms of their pharmaceutical potential. Once we have a complete picture of the interplay between various UPS participants, we will be able to take full advantage of this fascinating cellular system to treat life-threatening human diseases.

151

Figure 5.1. NMR resonance assignment of all USP7 domains provides new opportunities for studying the enzyme in solution. 2D 1H-15N HSQC spectra of USP7 fragments including the TRAF-like domain (pink), catalytic domain (purple), UBL1 (blue), UBL12 (green), UBL3 (black) and UBL45 (red) domains, for which backbone resonance assignment is available. NMR resonances of TRAF-like domain have been assigned previously (58); resonance assignment of all other fragments has been performed during studies presented in this thesis. The schematic representation of USP7 domain organization is shown above the spectra.

152

Figure 5.2. Backbone resonance assignment of the USP7 UBL3 domain. The 1H-15N HSQC spectrum of USP7 C-terminal UBL3 domain with residue-specific assignments indicated.

153

Figure 5.3. Backbone resonance assignment of the USP7 UBL45 domain. The 1H-15N TROSY-HSQC spectrum of USP7 C-terminal UBL45 tandem with residue-specific assignments indicated.

154

Figure 5.4. Pathway towards the structure of full-length USP7. A. Multi-method approach for determining the structure of full-length USP7. B. Negative stain electron microscopy of full- length USP7. Left panel shows an example photomicrograph with three selected particles boxed. The inset shows the same particles at larger magnification (100Å scale bar). Right panel shows class averages obtained using EMAN2 (261).

155

REFERENCES 1. Cohen-Kaplan, V., Livneh, I., Avni, N., Cohen-Rosenzweig, C., and Ciechanover, A. (2016) The ubiquitin-proteasome system and autophagy: Coordinated and independent activities. Int. J. Biochem. Cell Biol. 2. Glickman, M. H., and Ciechanover, A. (2002) The ubiquitin-proteasome proteolytic pathway: destruction for the sake of construction. Physiol. Rev. 82, 373-428 3. Deshaies, R. J., and Joazeiro, C. A. (2009) RING domain E3 ubiquitin ligases. Annu. Rev. Biochem. 78, 399-434 4. Kee, Y., and Huibregtse, J. M. (2007) Regulation of catalytic activities of HECT ubiquitin ligases. Biochem. Biophys. Res. Commun. 354, 329-333 5. Komander, D., and Rape, M. (2012) The ubiquitin code. Annu. Rev. Biochem. 81, 203- 229 6. Xu, P., Duong, D. M., Seyfried, N. T., Cheng, D., Xie, Y., Robert, J., Rush, J., Hochstrasser, M., Finley, D., and Peng, J. (2009) Quantitative proteomics reveals the function of unconventional ubiquitin chains in proteasomal degradation. Cell 137, 133- 145 7. Panier, S., and Durocher, D. (2009) Regulatory ubiquitylation in response to DNA double-strand breaks. DNA repair 8, 436-443 8. Erpapazoglou, Z., Walker, O., and Haguenauer-Tsapis, R. (2014) Versatile roles of k63- linked ubiquitin chains in trafficking. Cells 3, 1027-1088 9. Iwai, K., Fujita, H., and Sasaki, Y. (2014) Linear ubiquitin chains: NF-kappaB signalling, cell death and beyond. Nat. Rev. Mol. Cell Biol. 15, 503-508 10. Sigismund, S., Polo, S., and Di Fiore, P. P. (2004) Signaling through monoubiquitination. Curr. Top. Microbiol. Immunol. 286, 149-185 11. Nijman, S. M., Luna-Vargas, M. P., Velds, A., Brummelkamp, T. R., Dirac, A. M., Sixma, T. K., and Bernards, R. (2005) A genomic and functional inventory of deubiquitinating enzymes. Cell 123, 773-786 12. Komander, D. (2010) Mechanism, specificity and structure of the deubiquitinases. Subcell. Biochem. 54, 69-87 13. Storer, A. C., and Menard, R. (1994) Catalytic mechanism in papain family of cysteine peptidases. Methods Enzymol. 244, 486-500 14. Eletr, Z. M., and Wilkinson, K. D. (2014) Regulation of proteolysis by human deubiquitinating enzymes. Biochim. Biophys. Acta 1843, 114-128 15. Abdul Rehman, S. A., Kristariyanto, Y. A., Choi, S. Y., Nkosi, P. J., Weidlich, S., Labib, K., Hofmann, K., and Kulathu, Y. (2016) MINDY-1 is a member of an evolutionarily conserved and structurally distinct new family of deubiquitinating enzymes. Mol. Cell 63, 146-155 16. Clague, M. J., Barsukov, I., Coulson, J. M., Liu, H., Rigden, D. J., and Urbe, S. (2013) Deubiquitylases from genes to organism. Physiol. Rev. 93, 1289-1315 17. Komander, D., Clague, M. J., and Urbe, S. (2009) Breaking the chains: structure and function of the deubiquitinases. Nat. Rev. Mol. Cell Biol. 10, 550-563 18. Sato, Y., Yoshikawa, A., Yamagata, A., Mimura, H., Yamashita, M., Ookata, K., Nureki, O., Iwai, K., Komada, M., and Fukai, S. (2008) Structural basis for specific cleavage of Lys 63-linked polyubiquitin chains. Nature 455, 358-362

156

19. Cooper, E. M., Cutcliffe, C., Kristiansen, T. Z., Pandey, A., Pickart, C. M., and Cohen, R. E. (2009) K63-specific deubiquitination by two JAMM/MPN+ complexes: BRISC- associated Brcc36 and proteasomal Poh1. EMBO J. 28, 621-631 20. Sahtoe, D. D., and Sixma, T. K. (2015) Layers of DUB regulation. Trends Biochem. Sci. 40, 456-467 21. Ristic, G., Tsou, W. L., and Todi, S. V. (2014) An optimal ubiquitin-proteasome pathway in the nervous system: the role of deubiquitinating enzymes. Front. Mol. Neurosci. 7, 72 22. D'Arcy, P., Wang, X., and Linder, S. (2015) Deubiquitinase inhibition as a cancer therapeutic strategy. Pharmacol. Ther. 147, 32-54 23. Kessler, B. M., Fortunati, E., Melis, M., Pals, C. E., Clevers, H., and Maurice, M. M. (2007) Proteome changes induced by knock-down of the deubiquitylating enzyme HAUSP/USP7. J. Proteome Res. 6, 4163-4172 24. Li, M., Chen, D., Shiloh, A., Luo, J., Nikolaev, A. Y., Qin, J., and Gu, W. (2002) Deubiquitination of p53 by HAUSP is an important pathway for p53 stabilization. Nature 416, 648-653 25. Li, M., Brooks, C. L., Kon, N., and Gu, W. (2004) A dynamic role of HAUSP in the p53- pathway. Mol. Cell 13, 879-886 26. Brooks, C. L., Li, M., Hu, M., Shi, Y., and Gu, W. (2007) The p53--Mdm2--HAUSP complex is involved in p53 stabilization by HAUSP. Oncogene 26, 7262-7266 27. Hu, M., Gu, L., Li, M., Jeffrey, P. D., Gu, W., and Shi, Y. (2006) Structural basis of competitive recognition of p53 and MDM2 by HAUSP/USP7: implications for the regulation of the p53-MDM2 pathway. PLoS Biol. 4, e27 28. Khoronenkova, S. V., Dianova, II, Ternette, N., Kessler, B. M., Parsons, J. L., and Dianov, G. L. (2012) ATM-dependent downregulation of USP7/HAUSP by PPM1G activates p53 response to DNA damage. Mol. Cell 45, 801-813 29. He, J., Zhu, Q., Wani, G., Sharma, N., Han, C., Qian, J., Pentz, K., Wang, Q. E., and Wani, A. A. (2014) Ubiquitin-specific protease 7 regulates nucleotide excision repair through deubiquitinating XPC protein and preventing XPC protein from undergoing ultraviolet light-induced and VCP/p97 protein-regulated proteolysis. J. Biol. Chem. 289, 27278-27289 30. Alonso-de Vega, I., Martin, Y., and Smits, V. A. (2014) USP7 controls Chk1 protein stability by direct deubiquitination. Cell cycle, 3921-3926 31. Faustrup, H., Bekker-Jensen, S., Bartek, J., Lukas, J., and Mailand, N. (2009) USP7 counteracts SCFbetaTrCP- but not APCCdh1-mediated proteolysis of Claspin. J. Cell Biol. 184, 13-19 32. Schwertman, P., Lagarou, A., Dekkers, D. H., Raams, A., van der Hoek, A. C., Laffeber, C., Hoeijmakers, J. H., Demmers, J. A., Fousteri, M., Vermeulen, W., and Marteijn, J. A. (2012) UV-sensitive syndrome protein UVSSA recruits USP7 to regulate transcription- coupled repair. Nat. Genet. 44, 598-602 33. Qing, P., Han, L., Bin, L., Yan, L., and Ping, W. X. (2011) USP7 regulates the stability and function of HLTF through deubiquitination. J. Cell. Biochem. 112, 3856-3862 34. Zlatanou, A., Sabbioneda, S., Miller, E. S., Greenwalt, A., Aggathanggelou, A., Maurice, M. M., Lehmann, A. R., Stankovic, T., Reverdy, C., Colland, F., Vaziri, C., and Stewart, G. S. (2015) USP7 is essential for maintaining Rad18 stability and DNA damage tolerance. Oncogene

157

35. Waters, L. S., Minesinger, B. K., Wiltrout, M. E., D'Souza, S., Woodruff, R. V., and Walker, G. C. (2009) Eukaryotic translesion polymerases and their roles and regulation in DNA damage tolerance. Microbiol. Mol. Biol. Rev. 73, 134-154 36. Qian, J., Pentz, K., Zhu, Q., Wang, Q., He, J., Srivastava, A. K., and Wani, A. A. (2014) USP7 modulates UV-induced PCNA monoubiquitination by regulating DNA polymerase eta stability. Oncogene 37. Jagannathan, M., Nguyen, T., Gallo, D., Luthra, N., Brown, G. W., Saridakis, V., and Frappier, L. (2014) A role for USP7 in DNA replication. Mol. Cell. Biol. 34, 132-145 38. Lecona, E., Rodriguez-Acebes, S., Specks, J., Lopez-Contreras, A. J., Ruppen, I., Murga, M., Munoz, J., Mendez, J., and Fernandez-Capetillo, O. (2016) USP7 is a SUMO deubiquitinase essential for DNA replication. Nat. Struct. Mol. Biol. 23, 270-277 39. Felle, M., Joppien, S., Nemeth, A., Diermeier, S., Thalhammer, V., Dobner, T., Kremmer, E., Kappler, R., and Langst, G. (2011) The USP7/Dnmt1 complex stimulates the DNA methylation activity of Dnmt1 and regulates the stability of UHRF1. Nucleic Acids Res. 39, 8355-8365 40. Qin, W., Leonhardt, H., and Spada, F. (2011) Usp7 and Uhrf1 control ubiquitination and stability of the maintenance DNA methyltransferase Dnmt1. J. Cell. Biochem. 112, 439- 444 41. de Bie, P., Zaaroor-Regev, D., and Ciechanover, A. (2010) Regulation of the Polycomb protein RING1B ubiquitination by USP7. Biochem. Biophys. Res. Commun. 400, 389-395 42. Maertens, G. N., El Messaoudi-Aubert, S., Elderkin, S., Hiom, K., and Peters, G. (2010) Ubiquitin-specific proteases 7 and 11 modulate Polycomb regulation of the INK4a tumour suppressor. EMBO J. 29, 2553-2565 43. Lecona, E., Narendra, V., and Reinberg, D. (2015) Usp7 cooperates with Scml2 to regulate the activity of Prc1. Mol. Cell. Biol. 44. Dar, A., Shibata, E., and Dutta, A. (2013) Deubiquitination of Tip60 by USP7 determines the activity of the p53-dependent apoptotic pathway. Mol. Cell. Biol. 33, 3309-3320 45. Wang, Q., Ma, S., Song, N., Li, X., Liu, L., Yang, S., Ding, X., Shan, L., Zhou, X., Su, D., Wang, Y., Zhang, Q., Liu, X., Yu, N., Zhang, K., Shang, Y., Yao, Z., and Shi, L. (2016) Stabilization of histone demethylase PHF8 by USP7 promotes breast . J. Clin. Invest. 126, 2205-2220 46. Ding, X., Jiang, W., Zhou, P., Liu, L., Wan, X., Yuan, X., Wang, X., Chen, M., Chen, J., Yang, J., Kong, C., Li, B., Peng, C., Wong, C. C., Hou, F., and Zhang, Y. (2015) Mixed lineage leukemia 5 (MLL5) protein stability is cooperatively regulated by O-GlcNac transferase (OGT) and ubiquitin specific protease 7 (USP7). PLoS One 10, e0145023 47. Zhu, Q., Sharma, N., He, J., Wani, G., and Wani, A. A. (2015) USP7 deubiquitinase promotes ubiquitin-dependent DNA damage signaling by stabilizing RNF168. Cell cycle 14, 1413-1425 48. Colleran, A., Collins, P. E., O'Carroll, C., Ahmed, A., Mao, X., McManus, B., Kiely, P. A., Burstein, E., and Carmody, R. J. (2013) Deubiquitination of NF-kappaB by ubiquitin- specific Protease-7 promotes transcription. Proc. Natl. Acad. Sci. U. S. A. 110, 618-623 49. van der Knaap, J. A., Kumar, B. R., Moshkin, Y. M., Langenberg, K., Krijgsveld, J., Heck, A. J., Karch, F., and Verrijzer, C. P. (2005) GMP synthetase stimulates histone H2B deubiquitylation by the epigenetic silencer USP7. Mol. Cell 17, 695-707 50. van der Horst, A., de Vries-Smits, A. M., Brenkman, A. B., van Triest, M. H., van den Broek, N., Colland, F., Maurice, M. M., and Burgering, B. M. (2006) FOXO4

158

transcriptional activity is regulated by monoubiquitination and USP7/HAUSP. Nat. Cell Biol. 8, 1064-1073 51. Song, M. S., Salmena, L., Carracedo, A., Egia, A., Lo-Coco, F., Teruya-Feldstein, J., and Pandolfi, P. P. (2008) The deubiquitinylation and localization of PTEN are regulated by a HAUSP-PML network. Nature 455, 813-817 52. Sarkari, F., Wang, X., Nguyen, T., and Frappier, L. (2011) The herpesvirus associated ubiquitin specific protease, USP7, is a negative regulator of PML proteins and PML nuclear bodies. PLoS One 6, e16598 53. Everett, R. D., Meredith, M., Orr, A., Cross, A., Kathoria, M., and Parkinson, J. (1997) A novel ubiquitin-specific protease is dynamically associated with the PML nuclear domain and binds to a herpesvirus regulatory protein. EMBO J. 16, 1519-1530 54. Boutell, C., and Everett, R. D. (2013) Regulation of alphaherpesvirus infections by the ICP0 family of proteins. J. Gen. Virol. 94, 465-481 55. Canning, M., Boutell, C., Parkinson, J., and Everett, R. D. (2004) A RING finger ubiquitin ligase is protected from autocatalyzed ubiquitination and degradation by binding to ubiquitin-specific protease USP7. J. Biol. Chem. 279, 38160-38168 56. Holowaty, M. N., Zeghouf, M., Wu, H., Tellam, J., Athanasopoulos, V., Greenblatt, J., and Frappier, L. (2003) Protein profiling with Epstein-Barr nuclear antigen-1 reveals an interaction with the herpesvirus-associated ubiquitin-specific protease HAUSP/USP7. J. Biol. Chem. 278, 29987-29994 57. Jager, W., Santag, S., Weidner-Glunde, M., Gellermann, E., Kati, S., Pietrek, M., Viejo- Borbolla, A., and Schulz, T. F. (2012) The ubiquitin-specific protease USP7 modulates the replication of Kaposi's sarcoma-associated herpesvirus latent episomal DNA. J. Virol. 86, 6745-6757 58. Saridakis, V., Sheng, Y., Sarkari, F., Holowaty, M. N., Shire, K., Nguyen, T., Zhang, R. G., Liao, J., Lee, W., Edwards, A. M., Arrowsmith, C. H., and Frappier, L. (2005) Structure of the p53 binding domain of HAUSP/USP7 bound to Epstein-Barr nuclear antigen 1 implications for EBV-mediated immortalization. Mol. Cell 18, 25-36 59. Lee, H. R., Choi, W. C., Lee, S., Hwang, J., Hwang, E., Guchhait, K., Haas, J., Toth, Z., Jeon, Y. H., Oh, T. K., Kim, M. H., and Jung, J. U. (2011) Bilateral inhibition of HAUSP deubiquitinase by a viral interferon regulatory factor protein. Nat. Struct. Mol. Biol. 18, 1336-1344 60. Salsman, J., Jagannathan, M., Paladino, P., Chan, P. K., Dellaire, G., Raught, B., and Frappier, L. (2012) Proteomic profiling of the human cytomegalovirus UL35 gene products reveals a role for UL35 in the DNA repair response. J. Virol. 86, 806-820 61. Ching, W., Koyuncu, E., Singh, S., Arbelo-Roman, C., Hartl, B., Kremmer, E., Speiseder, T., Meier, C., and Dobner, T. (2013) A ubiquitin-specific protease possesses a decisive role for adenovirus replication and oncogene-mediated transformation. PLoS Pathog. 9, e1003273 62. Fernandez-Montalvan, A., Bouwmeester, T., Joberty, G., Mader, R., Mahnke, M., Pierrat, B., Schlaeppi, J. M., Worpenberg, S., and Gerhartz, B. (2007) Biochemical characterization of USP7 reveals post-translational modification sites and structural requirements for substrate processing and subcellular localization. The FEBS journal 274, 4256-4270

159

63. Zaman, M. M., Nomura, T., Takagi, T., Okamura, T., Jin, W., Shinagawa, T., Tanaka, Y., and Ishii, S. (2013) Ubiquitination-deubiquitination by the TRIM27-USP7 complex regulates tumor necrosis factor alpha-induced apoptosis. Mol. Cell. Biol. 33, 4971-4984 64. Morotti, A., Panuzzo, C., Crivellaro, S., Pergolizzi, B., Familiari, U., Berger, A. H., Saglio, G., and Pandolfi, P. P. (2014) BCR-ABL disrupts PTEN nuclear-cytoplasmic shuttling through phosphorylation-dependent activation of HAUSP. Leukemia 28, 1326- 1333 65. Tang, J., Qu, L. K., Zhang, J., Wang, W., Michaelson, J. S., Degenhardt, Y. Y., El-Deiry, W. S., and Yang, X. (2006) Critical role for Daxx in regulating Mdm2. Nat. Cell Biol. 8, 855-862 66. Giovinazzi, S., Morozov, V. M., Summers, M. K., Reinhold, W. C., and Ishov, A. M. (2013) USP7 and Daxx regulate mitosis progression and taxane sensitivity by affecting stability of Aurora-A kinase. Cell Death Differ. 20, 721-731 67. Ma, P., Yang, X., Kong, Q., Li, C., Yang, S., Li, Y., and Mao, B. (2014) The ubiquitin ligase RNF220 enhances canonical Wnt signaling through USP7-mediated deubiquitination of beta-catenin. Mol. Cell. Biol. 34, 4355-4366 68. Meng, H., Harrison, D. J., and Meehan, R. R. (2015) MBD4 interacts with and recruits USP7 to heterochromatic foci. J. Cell. Biochem. 116, 476-485 69. Yi, P., Xia, W., Wu, R. C., Lonard, D. M., Hung, M. C., and O'Malley, B. W. (2013) SRC-3 coactivator regulates cell resistance to cytotoxic stress via TRAF4-mediated p53 destabilization. Genes Dev. 27, 274-287 70. Holowaty, M. N., Sheng, Y., Nguyen, T., Arrowsmith, C., and Frappier, L. (2003) Protein interaction domains of the ubiquitin-specific protease, USP7/HAUSP. J. Biol. Chem. 278, 47753-47761 71. Faesen, A. C., Dirac, A. M., Shanmugham, A., Ovaa, H., Perrakis, A., and Sixma, T. K. (2011) Mechanism of USP7/HAUSP activation by its C-terminal ubiquitin-like domain and allosteric regulation by GMP-synthetase. Mol. Cell 44, 147-159 72. Hu, M., Li, P., Li, M., Li, W., Yao, T., Wu, J. W., Gu, W., Cohen, R. E., and Shi, Y. (2002) Crystal structure of a UBP-family deubiquitinating enzyme in isolation and in complex with ubiquitin aldehyde. Cell 111, 1041-1054 73. Sheng, Y., Saridakis, V., Sarkari, F., Duan, S., Wu, T., Arrowsmith, C. H., and Frappier, L. (2006) Molecular recognition of p53 and MDM2 by USP7/HAUSP. Nat. Struct. Mol. Biol. 13, 285-291 74. Sarkari, F., Wheaton, K., La Delfa, A., Mohamed, M., Shaikh, F., Khatun, R., Arrowsmith, C. H., Frappier, L., Saridakis, V., and Sheng, Y. (2013) Ubiquitin-specific protease 7 is a regulator of ubiquitin-conjugating enzyme UbE2E1. J. Biol. Chem. 288, 16975-16985 75. Molland, K., Zhou, Q., and Mesecar, A. D. (2014) A 2.2 A resolution structure of the USP7 catalytic domain in a new space group elaborates upon structural rearrangements resulting from ubiquitin binding. Acta Crystallogr F Struct Biol Commun 70, 283-287 76. Hu, M., Li, P., Song, L., Jeffrey, P. D., Chenova, T. A., Wilkinson, K. D., Cohen, R. E., and Shi, Y. (2005) Structure and mechanisms of the proteasome-associated deubiquitinating enzyme USP14. EMBO J. 24, 3747-3756 77. Avvakumov, G. V., Walker, J. R., Xue, S., Finerty, P. J., Jr., Mackenzie, F., Newman, E. M., and Dhe-Paganon, S. (2006) Amino-terminal dimerization, NRDP1-rhodanese

160

interaction, and inhibited catalytic domain conformation of the ubiquitin-specific protease 8 (USP8). J. Biol. Chem. 281, 38061-38070 78. Komander, D., Lord, C. J., Scheel, H., Swift, S., Hofmann, K., Ashworth, A., and Barford, D. (2008) The structure of the CYLD USP domain explains its specificity for Lys63-linked polyubiquitin and reveals a B box module. Mol. Cell 29, 451-464 79. Clerici, M., Luna-Vargas, M. P., Faesen, A. C., and Sixma, T. K. (2014) The DUSP-Ubl domain of USP4 enhances its catalytic efficiency by promoting ubiquitin exchange. Nature communications 5, 5399 80. Burroughs, A. M., Iyer, L. M., and Aravind, L. (2012) Structure and evolution of ubiquitin and ubiquitin-related domains. Methods Mol. Biol. 832, 15-63 81. Zhao, B., Velasco, K., Sompallae, R., Pfirrmann, T., Masucci, M. G., and Lindsten, K. (2012) The ubiquitin specific protease-4 (USP4) interacts with the S9/Rpn6 subunit of the proteasome. Biochem. Biophys. Res. Commun. 427, 490-496 82. Zhang, Q., Harding, R., Hou, F., Dong, A., Walker, J. R., Bteich, J., and Tong, Y. (2016) Structural basis of the recruitment of ubiquitin-specific protease USP15 by spliceosome recycling factor SART3. J. Biol. Chem. 291, 17283-17292 83. Ma, J., Martin, J. D., Xue, Y., Lor, L. A., Kennedy-Wilson, K. M., Sinnamon, R. H., Ho, T. F., Zhang, G., Schwartz, B., Tummino, P. J., and Lai, Z. (2010) C-terminal region of USP7/HAUSP is critical for deubiquitination activity and contains a second mdm2/p53 binding site. Arch. Biochem. Biophys. 503, 207-212 84. Rouge, L., Bainbridge, T. W., Kwok, M., Tong, R., Di Lello, P., Wertz, I. E., Maurer, T., Ernst, J. A., and Murray, J. (2016) Molecular understanding of USP7 substrate recognition and C-terminal activation. Structure 24, 1335-1345 85. Pfoh, R., Lacdao, I. K., Georges, A. A., Capar, A., Zheng, H., Frappier, L., and Saridakis, V. (2015) Crystal structure of USP7 ubiquitin-like domains with an ICP0 peptide reveals a novel mechanism used by viral and cellular proteins to target USP7. PLoS Pathog. 11, e1004950 86. Zhao, G. Y., Lin, Z. W., Lu, C. L., Gu, J., Yuan, Y. F., Xu, F. K., Liu, R. H., Ge, D., and Ding, J. Y. (2015) USP7 overexpression predicts a poor prognosis in lung squamous cell carcinoma and large cell carcinoma. Tumour Biol. 36, 1721-1729 87. Ma, M., and Yu, N. (2016) Ubiquitin-specific protease 7 expression is a prognostic factor in epithelial ovarian cancer and correlates with lymph node metastasis. Onco Targets Ther. 9, 1559-1569 88. Zhang, L., Wang, H., Tian, L., and Li, H. (2016) Expression of USP7 and MARCH7 is correlated with poor prognosis in epithelial ovarian cancer. Tohoku J. Exp. Med. 239, 165-175 89. Cheng, C., Niu, C., Yang, Y., Wang, Y., and Lu, M. (2013) Expression of HAUSP in gliomas correlates with disease progression and survival of patients. Oncol. Rep. 29, 1730-1736 90. Buckley, D. L., and Crews, C. M. (2014) Small-molecule control of intracellular protein levels through modulation of the ubiquitin proteasome system. Angew. Chem. Int. Ed. Engl. 53, 2312-2330 91. Colland, F., Formstecher, E., Jacq, X., Reverdy, C., Planquette, C., Conrath, S., Trouplin, V., Bianchi, J., Aushev, V. N., Camonis, J., Calabrese, A., Borg-Capra, C., Sippl, W., Collura, V., Boissy, G., Rain, J. C., Guedat, P., Delansorne, R., and Daviet, L. (2009)

161

Small-molecule inhibitor of USP7/HAUSP ubiquitin protease stabilizes and activates p53 in cells. Mol. Cancer Ther. 8, 2286-2295 92. Reverdy, C., Conrath, S., Lopez, R., Planquette, C., Atmanene, C., Collura, V., Harpon, J., Battaglia, V., Vivat, V., Sippl, W., and Colland, F. (2012) Discovery of specific inhibitors of human USP7/HAUSP deubiquitinating enzyme. Chem. Biol. 19, 467-477 93. Weinstock, J., Wu, J., Cao, P., Kingsbury, W. D., McDermott, J. L., Kodrasov, M. P., McKelvey, D. M., Suresh Kumar, K. G., Goldenberg, S. J., Mattern, M. R., and Nicholson, B. (2012) Selective dual inhibitors of the cancer-related deubiquitylating proteases USP7 and USP47. ACS Med. Chem. Lett. 3, 789-792 94. Chauhan, D., Tian, Z., Nicholson, B., Kumar, K. G., Zhou, B., Carrasco, R., McDermott, J. L., Leach, C. A., Fulcinniti, M., Kodrasov, M. P., Weinstock, J., Kingsbury, W. D., Hideshima, T., Shah, P. K., Minvielle, S., Altun, M., Kessler, B. M., Orlowski, R., Richardson, P., Munshi, N., and Anderson, K. C. (2012) A small molecule inhibitor of ubiquitin-specific protease-7 induces apoptosis in multiple myeloma cells and overcomes bortezomib resistance. Cancer Cell 22, 345-358 95. Lee, G., Oh, T. I., Um, K. B., Yoon, H., Son, J., Kim, B. M., Kim, H. I., Kim, H., Kim, Y. J., Lee, C. S., and Lim, J. H. (2016) Small-molecule inhibitors of USP7 induce apoptosis through oxidative and endoplasmic reticulum stress in cancer cells. Biochem. Biophys. Res. Commun. 470, 181-186 96. Fan, Y. H., Cheng, J., Vasudevan, S. A., Dou, J., Zhang, H., Patel, R. H., Ma, I. T., Rojas, Y., Zhao, Y., Yu, Y., Zhang, H., Shohet, J. M., Nuchtern, J. G., Kim, E. S., and Yang, J. (2013) USP7 inhibitor P22077 inhibits neuroblastoma growth via inducing p53-mediated apoptosis. Cell Death Dis. 4, e867 97. Arkin, M. R., Tang, Y., and Wells, J. A. (2014) Small-molecule inhibitors of protein- protein interactions: progressing toward the reality. Chem. Biol. 21, 1102-1114 98. Rule, G. S., and Hitchens, T. K. (2006) Fundamentals of Protein NMR Spectroscopy. Springer 99. Bodenhausen, G., and Ruben, D. J. (1980) Natural abundance nitrogen-15 NMR by enhanced heteronuclear spectroscopy. Chem. Phys. Lett. 69, 185-189 100. Wüthrich, K. (1986) NMR of Proteins and Nucleic Acids, Wiley 101. Pervushin, K., Riek, R., Wider, G., and Wuthrich, K. (1997) Attenuated T2 relaxation by mutual cancellation of dipole-dipole coupling and chemical shift anisotropy indicates an avenue to NMR structures of very large biological macromolecules in solution. Proc. Natl. Acad. Sci. U. S. A. 94, 12366-12371 102. Sattler, M., Schleucher, J., and Griesinger, C. (1999) Heteronuclear multidimensional NMR experiments for the structure determination of proteins in solution employing pulsed field gradients. Prog Nucl Magn Reson Spectrosc. 34, 93-158 103. Cavanagh, J., Fairbrother, W. J., III, A. G. P., Rance, M., and Skelton, N. J. (2007) Protein NMR Spectroscopy: Principles and Practice. Second Edition. Elsevier Academic press 104. Bax, A., Clore, G. M., and Gronenborn, A. M. (1990) 1H-1H correlation via isotropic mixing of 13C magnetization, a new three-dimensional approach for assigning 1H and 13C spectra of 13C-enriched proteins. J. Magn. Reson. 88, 425-431 105. Wishart, D. S., and Sykes, B. D. (1994) The 13C chemical-shift index: a simple method for the identification of protein secondary structure using 13C chemical-shift data. J. Biomol. NMR 4, 171-180

162

106. Shen, Y., Delaglio, F., Cornilescu, G., and Bax, A. (2009) TALOS plus : a hybrid method for predicting protein backbone torsion angles from NMR chemical shifts. J. Biomol. NMR 44, 213-223 107. Guntert, P. (2003) Automated NMR protein structure calculation. Prog Nucl Magn Reson Spectrosc. 43, 105-125 108. Schwieters, C. D., Kuszewski, J. J., Tjandra, N., and Clore, G. M. (2003) The Xplor-NIH NMR molecular structure determination package. J. Magn. Reson. 160, 65-73 109. Rieping, W., Habeck, M., Bardiaux, B., Bernard, A., Malliavin, T. E., and Nilges, M. (2007) ARIA2: automated NOE assignment and data integration in NMR structure calculation. Bioinformatics 23, 381-382 110. Fiorito, F., Herrmann, T., Damberger, F. F., and Wuthrich, K. (2008) Automated amino acid side-chain NMR assignment of proteins using (13)C- and (15)N-resolved 3D [ (1)H, (1)H]-NOESY. J. Biomol. NMR 42, 23-33 111. Volk, J., Herrmann, T., and Wuthrich, K. (2008) Automated sequence-specific protein NMR assignment using the memetic algorithm MATCH. J. Biomol. NMR 41, 127-138 112. Kleckner, I. R., and Foster, M. P. (2011) An introduction to NMR-based approaches for measuring protein dynamics. Biochim. Biophys. Acta 1814, 942-968 113. Abragam, A. (1961) The Principles of Nuclear Magnetism. Clarendon Press 114. Korzhnev, D. M., Billeter, M., Arseniev, A. S., and Orekhov, V. Y. (2001) NMR studies of Brownian tumbling and internal motions in proteins. Prog Nucl Magn Reson Spectrosc. 38, 197-266 115. Lipari, G., and Szabo, A. (1982) Model-free approach to theinterpretation of nuclear magnetic resonance relaxation in macromolecules .1. Theory and range of validity. J. Am. Chem. Soc. 104, 4546-4559 116. Clore, G. M., Szabo, A., Bax, A., Kay, L. E., Driscoll, P. C., and Gronenborn, A. M. (1990) Deviations from the simple two-parameter model-free approach to the interpretation of nitrogen-15 nuclear magnetic relaxation of proteins. J. Am. Chem. Soc. 112, 4989–4991 117. Farrow, N. A., Muhandiram, R., Singer, A. U., Pascal, S. M., Kay, C. M., Gish, G., Shoelson, S. E., Pawson, T., Formankay, J. D., and Kay, L. E. (1994) Backbone dynamics of a free and a phosphopeptide complexed Src homology-2 domain studied by 15N NMR relaxation. Biochemistry 33, 5984-6003 118. Korzhnev, D. M., Skrynnikov, N. R., Millet, O., Torchia, D. A., and Kay, L. E. (2002) An NMR experiment for the accurate measurement of heteronuclear spin-lock relaxation rates. J. Am. Chem. Soc. 124, 10743-10753 119. Massi, F., Johnson, E., Wang, C., Rance, M., and Palmer, A. G., 3rd. (2004) NMR R1 rho rotating-frame relaxation with weak radio frequency fields. J. Am. Chem. Soc. 126, 2247-2256 120. Lisi, G. P., and Loria, J. P. (2016) Solution NMR Spectroscopy for the Study of Enzyme Allostery. Chem. Rev. 116, 6323-6369 121. Williamson, M. P. (2013) Using chemical shift perturbation to characterise ligand binding. Prog. Nucl. Magn. Reson. Spectrosc. 73, 1-16 122. Bhattacharya, S., and Ghosh, M. K. (2014) HAUSP, a novel deubiquitinase for Rb - MDM2 the critical regulator. The FEBS journal 281, 3061-3078 123. Lee, K. W., Cho, J. G., Kim, C. M., Kang, A. Y., Kim, M., Ahn, B. Y., Chung, S. S., Lim, K. H., Baek, K. H., Sung, J. H., Park, K. S., and Park, S. G. (2013) Herpesvirus-

163

associated ubiquitin-specific protease (HAUSP) modulates peroxisome proliferator- activated receptor gamma (PPARgamma) stability through its deubiquitinating activity. J. Biol. Chem. 288, 32886-32896 124. Tavana, O., Li, D., Dai, C., Lopez, G., Banerjee, D., Kon, N., Chen, C., Califano, A., Yamashiro, D. J., Sun, H., and Gu, W. (2016) HAUSP deubiquitinates and stabilizes N- Myc in neuroblastoma. Nat. Med. 22, 1180-1186 125. Oh, Y. M., Yoo, S. J., and Seol, J. H. (2007) Deubiquitination of Chfr, a checkpoint protein, by USP7/HAUSP regulates its stability and activity. Biochem. Biophys. Res. Commun. 357, 615-619 126. Giovinazzi, S., Sirleto, P., Aksenova, V., Morozov, V. M., Zori, R., Reinhold, W. C., and Ishov, A. M. (2014) Usp7 protects genomic stability by regulating Bub3. Oncotarget 5, 3728-3742 127. Tang, J., Qu, L., Pang, M., and Yang, X. (2010) Daxx is reciprocally regulated by Mdm2 and Hausp. Biochem. Biophys. Res. Commun. 393, 542-545 128. Khoronenkova, S. V., and Dianov, G. L. (2013) USP7S-dependent inactivation of Mule regulates DNA damage signalling and repair. Nucleic Acids Res. 41, 1750-1756 129. Du, Z., Song, J., Wang, Y., Zhao, Y., Guda, K., Yang, S., Kao, H. Y., Xu, Y., Willis, J., Markowitz, S. D., Sedwick, D., Ewing, R. M., and Wang, Z. (2010) DNMT1 stability is regulated by proteins coordinating deubiquitination and acetylation-driven ubiquitination. Science signaling 3, ra80 130. Zemp, I., and Lingner, J. (2014) The shelterin component TPP1 is a binding partner and substrate for the deubiquitinating enzyme USP7. J. Biol. Chem. 289, 28595-28606 131. Daubeuf, S., Singh, D., Tan, Y., Liu, H., Federoff, H. J., Bowers, W. J., and Tolba, K. (2009) HSV ICP0 recruits USP7 to modulate TLR-mediated innate response. Blood 113, 3264-3275 132. Hao, Y. H., Fountain, M. D., Jr., Fon Tacer, K., Xia, F., Bi, W., Kang, S. H., Patel, A., Rosenfeld, J. A., Le Caignec, C., Isidor, B., Krantz, I. D., Noon, S. E., Pfotenhauer, J. P., Morgan, T. M., Moran, R., Pedersen, R. C., Saenz, M. S., Schaaf, C. P., and Potts, P. R. (2015) USP7 acts as a molecular rheostat to promote WASH-dependent endosomal protein recycling and is mutated in a human neurodevelopmental disorder. Mol. Cell 59, 956-969 133. Hanpude, P., Bhattacharya, S., Dey, A. K., and Maiti, T. K. (2015) Deubiquitinating enzymes in cellular signaling and disease regulation. IUBMB life 67, 544-555 134. Pozhidaeva, A. K., Mohni, K. N., Dhe-Paganon, S., Arrowsmith, C. H., Weller, S. K., Korzhnev, D. M., and Bezsonova, I. (2015) Structural characterization of interaction between human ubiquitin-specific protease 7 and immediate-early protein ICP0 of Herpes Simplex Virus-1. J. Biol. Chem. 290, 22907-22918 135. Vijay-Kumar, S., Bugg, C. E., and Cook, W. J. (1987) Structure of ubiquitin refined at 1.8 A resolution. J. Mol. Biol. 194, 531-544 136. Masuya, D., Huang, C., Liu, D., Nakashima, T., Yokomise, H., Ueno, M., Nakashima, N., and Sumitomo, S. (2006) The HAUSP gene plays an important role in non-small cell lung carcinogenesis through p53-dependent pathways. J. Pathol. 208, 724-732 137. Cheng, C. Y., Gilson, T., Wimmer, P., Schreiner, S., Ketner, G., Dobner, T., Branton, P. E., and Blanchette, P. (2013) Role of E1B55K in E4orf6/E1B55K E3 ligase complexes formed by different human adenovirus serotypes. J. Virol. 87, 6232-6245

164

138. Zapata, J. M., Pawlowski, K., Haas, E., Ware, C. F., Godzik, A., and Reed, J. C. (2001) A diverse family of proteins containing tumor necrosis factor receptor-associated factor domains. J. Biol. Chem. 276, 24242-24252 139. Faesen, A. C., Luna-Vargas, M. P., and Sixma, T. K. (2012) The role of UBL domains in ubiquitin-specific proteases. Biochem. Soc. Trans. 40, 539-545 140. Delaglio, F., Grzesiek, S., Vuister, G. W., Zhu, G., Pfeifer, J., and Bax, A. (1995) NMRPIPE - a multidimensional spectral processing system based om UNIX pipes. J. Biomol. NMR 6, 277-293 141. Kneller, D. G., and Kuntz, I. D. (1993) UCSF SPARKY - an NMR display, annotation and assignment tool. J. Cell. Biochem., 254-254 142. Keller, R. (2004) Optimizing the process of nuclear magnetic resonance spectrum analysis and computer aided resonance assignment. PhD Thesis, ETH, Zurich 143. Grishaev, A., Steren, C. A., Wu, B., Pineda-Lucena, A., Arrowsmith, C., and Llinas, M. (2005) ABACUS, a direct method for protein NMR structure computation via assembly of fragments. Proteins 61, 36-43 144. Lemak, A., Steren, C. A., Arrowsmith, C. H., and Llinas, M. (2008) Sequence specific resonance assignment via Multicanonical Monte Carlo search using an ABACUS approach. J. Biomol. NMR 41, 29-41 145. Brunger, A. T., Adams, P. D., Clore, G. M., DeLano, W. L., Gros, P., Grosse-Kunstleve, R. W., Jiang, J. S., Kuszewski, J., Nilges, M., Pannu, N. S., Read, R. J., Rice, L. M., Simonson, T., and Warren, G. L. (1998) Crystallography & NMR system: A new software suite for macromolecular structure determination. Acta Crystallogr D Biol Crystallogr. 54, 905-921 146. Vranken, W. F., Boucher, W., Stevens, T. J., Fogh, R. H., Pajon, A., Llinas, M., Ulrich, E. L., Markley, J. L., Ionides, J., and Laue, E. D. (2005) The CCPN data model for NMR spectroscopy: development of a software pipeline. Proteins 59, 687-696 147. Farrow, N. A., Muhandiram, R., Singer, A. U., Pascal, S. M., Kay, C. M., Gish, G., Shoelson, S. E., Pawson, T., Forman-Kay, J. D., and Kay, L. E. (1994) Backbone dynamics of a free and phosphopeptide-complexed Src homology 2 domain studied by 15N NMR relaxation. Biochemistry 33, 5984-6003 148. Mohni, K. N., Livingston, C. M., Cortez, D., and Weller, S. K. (2010) ATR and ATRIP are recruited to herpes simplex virus type 1 replication compartments even though ATR signaling is disabled. J. Virol. 84, 12152-12164 149. Wrigley, J. D., Eckersley, K., Hardern, I. M., Millard, L., Walters, M., Peters, S. W., Mott, R., Nowak, T., Ward, R. A., Simpson, P. B., and Hudson, K. (2011) Enzymatic characterisation of USP7 deubiquitinating activity and inhibition. Cell Biochem. Biophys. 60, 99-111 150. Berjanskii, M. V., and Wishart, D. S. (2005) A simple method to predict protein flexibility using secondary chemical shifts. J. Am. Chem. Soc. 127, 14970-14971 151. Callis, J. (2014) The ubiquitination machinery of the ubiquitin system. Arabidopsis Book 12, e0174 152. Everett, R. D., Meredith, M., and Orr, A. (1999) The ability of herpes simplex virus type 1 immediate-early protein Vmw110 to bind to a ubiquitin-specific protease contributes to its roles in the activation of gene expression and stimulation of virus replication. J. Virol. 73, 417-426

165

153. Boutell, C., Canning, M., Orr, A., and Everett, R. D. (2005) Reciprocal activities between herpes simplex virus type 1 regulatory protein ICP0, a ubiquitin E3 ligase, and ubiquitin- specific protease USP7. J. Virol. 79, 12342-12354 154. Nicholson, B., and Suresh Kumar, K. G. (2011) The multifaceted roles of USP7: new therapeutic opportunities. Cell Biochem. Biophys. 60, 61-68 155. Zhu, X., Menard, R., and Sulea, T. (2007) High incidence of ubiquitin-like domains in human ubiquitin-specific proteases. Proteins 69, 1-7 156. Everett, R. D., Barlow, P., Milner, A., Luisi, B., Orr, A., Hope, G., and Lyon, D. (1993) A novel arrangement of zinc-binding residues and secondary structure in the C3HC4 motif of an alpha herpes virus protein family. J. Mol. Biol. 234, 1038-1047 157. Barlow, P. N., Luisi, B., Milner, A., Elliott, M., and Everett, R. (1994) Structure of the C3HC4 domain by 1H-nuclear magnetic resonance spectroscopy. A new structural class of zinc-finger. J. Mol. Biol. 237, 201-211 158. Cole, C., Barber, J. D., and Barton, G. J. (2008) The Jpred 3 secondary structure prediction server. Nucleic Acids Res. 36, W197-201 159. Larkin, M. A., Blackshields, G., Brown, N. P., Chenna, R., McGettigan, P. A., McWilliam, H., Valentin, F., Wallace, I. M., Wilm, A., Lopez, R., Thompson, J. D., Gibson, T. J., and Higgins, D. G. (2007) Clustal W and Clustal X version 2.0. Bioinformatics 23, 2947-2948 160. Berezin, C., Glaser, F., Rosenberg, J., Paz, I., Pupko, T., Fariselli, P., Casadio, R., and Ben-Tal, N. (2004) ConSeq: the identification of functionally and structurally important residues in protein sequences. Bioinformatics 20, 1322-1324 161. Ashkenazy, H., Erez, E., Martz, E., Pupko, T., and Ben-Tal, N. (2010) ConSurf 2010: calculating evolutionary conservation in sequence and structure of proteins and nucleic acids. Nucleic Acids Res. 38, W529-533 162. Ronau, J. A., Beckmann, J. F., and Hochstrasser, M. (2016) Substrate specificity of the ubiquitin and Ubl proteases. Cell Res. 26, 441-456 163. Ndubaku, C., and Tsui, V. (2015) Inhibiting the deubiquitinating enzymes (DUBs). J. Med. Chem. 58, 1581-1595 164. Meng, H., Harrison, D. J., and Meehan, R. R. (2014) MBD4 Interacts with and Recruits USP7 to heterochromatic foci. J. Cell. Biochem. 165. Luo, M., Zhou, J., Leu, N. A., Abreu, C. M., Wang, J., Anguera, M. C., de Rooij, D. G., Jasin, M., and Wang, P. J. (2015) Polycomb protein SCML2 associates with USP7 and counteracts histone H2A ubiquitination in the XY chromatin during male meiosis. PLoS Genet 11, e1004954 166. Becker, K., Marchenko, N. D., Palacios, G., and Moll, U. M. (2008) A role of HAUSP in tumor suppression in a human colon carcinoma xenograft model. Cell cycle 7, 1205-1213 167. Neves, M. A., Totrov, M., and Abagyan, R. (2012) Docking and scoring with ICM: the benchmarking results and strategies for improvement. J. Comput. Aided Mol. Des. 26, 675-686 168. O'Connell, M. R., Gamsjaeger, R., and Mackay, J. P. (2009) The structural analysis of protein-protein interactions by NMR spectroscopy. Proteomics 9, 5224-5232 169. Altun, M., Kramer, H. B., Willems, L. I., McDermott, J. L., Leach, C. A., Goldenberg, S. J., Kumar, K. G., Konietzny, R., Fischer, R., Kogan, E., Mackeen, M. M., McGouran, J., Khoronenkova, S. V., Parsons, J. L., Dianov, G. L., Nicholson, B., and Kessler, B. M.

166

(2011) Activity-based chemical proteomics accelerates inhibitor development for deubiquitylating enzymes. Chem. Biol. 18, 1401-1412 170. Kouroukis, T. C., Baldassarre, F. G., Haynes, A. E., Imrie, K., Reece, D. E., and Cheung, M. C. (2014) Bortezomib in multiple myeloma: systematic review and clinical considerations. Curr. Oncol. 21, e573-603 171. Meng, L., Mohan, R., Kwok, B. H., Elofsson, M., Sin, N., and Crews, C. M. (1999) Epoxomicin, a potent and selective proteasome inhibitor, exhibits in vivo antiinflammatory activity. Proc. Natl. Acad. Sci. U. S. A. 96, 10403-10408 172. Love, K. R., Catic, A., Schlieker, C., and Ploegh, H. L. (2007) Mechanisms, biology and inhibitors of deubiquitinating enzymes. Nat. Chem. Biol. 3, 697-705 173. Pei, J., Kim, B. H., and Grishin, N. V. (2008) PROMALS3D: a tool for multiple protein sequence and structure alignments. Nucleic Acids Res. 36, 2295-2300 174. Friedberg, E. C., Walker, G. C., Siede, W., Wood, R. D., Schultz, R.A., and Ellenberger, T. (2005) DNA Repair and Mutagenesis, ASM Press, Washington, D. C. 175. Goodman, M. F. (2002) Error-prone repair DNA polymerases in prokaryotes and eukaryotes. Annu. Rev. Biochem. 71, 17-50 176. Prakash, S., Johnson, R. E., and Prakash, L. (2005) Eukaryotic translesion synthesis DNA polymerases: specificity of structure and function. Annu. Rev. Biochem. 74, 317-353 177. Lehmann, A. R., Niimi, A., Ogi, T., Brown, S., Sabbioneda, S., Wing, J. F., Kannouche, P. L., and Green, C. M. (2007) Translesion synthesis: Y-family polymerases and the polymerase switch. DNA repair 6, 891-899 178. Guo, C. X., Kosarek-Stancel, J., Tang, T. S., and Friedberg, E. C. (2009) Y-family DNA polymerases in mammalian cells. Cell. Mol. Life Sci. 66, 2363-2381 179. Sale, J. E., Lehmann, A. R., and Woodgate, R. (2012) Y-family DNA polymerases and their role in tolerance of cellular DNA damage. Nat. Rev. Mol. Cell Biol. 13, 141-152 180. Daigaku, Y., Davies, A. A., and Ulrich, H. D. (2010) Ubiquitin-dependent DNA damage bypass is separable from genome replication. Nature 465, 951-955 181. Diamant, N., Hendel, A., Vered, I., Carell, T., Reissner, T., de Wind, N., Geacinov, N., and Livneh, Z. (2012) DNA damage bypass operates in the S and G2 phases of the cell cycle and exhibits differential mutagenicity. Nucleic Acids Res. 40, 170-180 182. Karras, G. I., and Jentsch, S. (2010) The RAD6 DNA damage tolerance pathway operates uncoupled from the replication fork and is functional beyond S phase. Cell 141, 255-267 183. Lopes, M., Foiani, M., and Sogo, J. M. (2006) Multiple mechanisms control integrity after replication fork uncoupling and restart at irreparable UV lesions. Mol. Cell 21, 15-27 184. Ulrich, H. D. (2011) Timing and spacing of ubiquitin-dependent DNA damage bypass. FEBS Lett. 585, 2861-2867 185. Yang, W. (2005) Portraits of a Y-family DNA polymerase. FEBS Lett. 579, 868-872 186. Kunkel, T. A. (2004) DNA replication fidelity. J. Biol. Chem. 279, 16895-16898 187. Johnson, R. E., Prakash, S., and Prakash, L. (1999) Efficient bypass of a thymine- thymine dimer by yeast DNA polymerase, Poleta. Science 283, 1001-1004 188. Masutani, C., Araki, M., Yamada, A., Kusumoto, R., Nogimori, T., Maekawa, T., Iwai, S., and Hanaoka, F. (1999) Xeroderma pigmentosum variant (XP-V) correcting protein from HeLa cells has a thymine dimer bypass DNA polymerase activity. EMBO J. 18, 3491-3501

167

189. Avkin, S., Goldsmith, M., Velasco-Miguel, S., Geacintov, N., Friedberg, E. C., and Livneh, Z. (2004) Quantitative analysis of translesion DNA synthesis across a benzo[a]pyrene-guanine adduct in mammalian cells: the role of DNA polymerase kappa. J. Biol. Chem. 279, 53298-53305 190. Jarosz, D. F., Godoy, V. G., Delaney, J. C., Essigmann, J. M., and Walker, G. C. (2006) A single amino acid governs enhanced activity of DinB DNA polymerases on damaged templates. Nature 439, 225-228 191. Friedberg, E. C., Lehmann, A. R., and Fuchs, R. P. P. (2005) Trading places: How do DNA polymerases switch during translesion DNA synthesis? Mol. Cell 18, 499-505 192. Livneh, Z., Ziv, O., and Shachar, S. (2010) Multiple two-polymerase mechanisms in mammalian translesion DNA synthesis. Cell cycle 9, 729-735 193. Shachar, S., Ziv, O., Avkin, S., Adar, S., Wittschieben, J., Reissner, T., Chaney, S., Friedberg, E. C., Wang, Z. G., Carell, T., Geacintov, N., and Livneh, Z. (2009) Two- polymerase mechanisms dictate error-free and error-prone translesion DNA synthesis in mammals. EMBO J. 28, 383-393 194. Lawrence, C. W. (2002) Cellular roles of DNA polymerase zeta and Rev1 protein. DNA repair 1, 425-435 195. Lawrence, C. W. (2004) Cellular functions of DNA polymerase zeta and Rev1 protein. Adv. Protein Chem. 69, 167-203 196. Wang, Y., Woodgate, R., McManus, T. P., Mead, S., McCormick, J. J., and Maher, V. M. (2007) Evidence that in xeroderma pigmentosum variant cells, which lack DNA polymerase eta, DNA polymerase iota causes the very high frequency and unique spectrum of UV-induced mutations. Cancer Res. 67, 3018-3026 197. Bienko, M., Green, C. M., Crosetto, N., Rudolf, F., Zapart, G., Coull, B., Kannouche, P., Wider, G., Peter, M., Lehmann, A. R., Hofmann, K., and Dikic, I. (2005) Ubiquitin- binding domains in Y-family polymerases regulate translesion synthesis. Science 310, 1821-1824 198. Hoege, C., Pfander, B., Moldovan, G. L., Pyrowolakis, G., and Jentsch, S. (2002) RAD6- dependent DNA repair is linked to modification of PCNA by ubiquitin and SUMO. Nature 419, 135-141 199. Guo, C., Sonoda, E., Tang, T. S., Parker, J. L., Bielen, A. B., Takeda, S., Ulrich, H. D., and Friedberg, E. C. (2006) REV1 protein interacts with PCNA: significance of the REV1 BRCT domain in vitro and in vivo. Mol. Cell 23, 265-271 200. Sharma, N. M., Kochenova, O. V., and Shcherbakova, P. V. (2011) The non-canonical protein binding site at the monomer-monomer interface of yeast proliferating cell nuclear antigen (PCNA) regulates the Rev1-PCNA interaction and Polzeta/Rev1-dependent translesion DNA synthesis. J. Biol. Chem. 286, 33557-33566 201. Wood, A., Garg, P., and Burgers, P. M. (2007) A ubiquitin-binding motif in the translesion DNA polymerase Rev1 mediates its essential functional interaction with ubiquitinated proliferating cell nuclear antigen in response to DNA damage. J. Biol. Chem. 282, 20256-20263 202. Ohashi, E., Hanafusa, T., Kamei, K., Song, I., Tomida, J., Hashimoto, H., Vaziri, C., and Ohmori, H. (2009) Identification of a novel REV1-interacting motif necessary for DNA polymerase kappa function. Genes Cells 14, 101-111 203. Gibbs, P. E., Wang, X. D., Li, Z., McManus, T. P., McGregor, W. G., Lawrence, C. W., and Maher, V. M. (2000) The function of the human homolog of Saccharomyces

168

cerevisiae REV1 is required for mutagenesis induced by UV light. Proc. Natl. Acad. Sci. U. S. A. 97, 4186-4191 204. Lawrence, C. W., and Maher, V. M. (2001) Mutagenesis in eukaryotes dependent on DNA polymerase zeta and Rev1p. Philos. Trans. R. Soc. Lond. B Biol. Sci. 356, 41-46 205. Masuda, Y., Takahashi, M., Tsunekuni, N., Minami, T., Sumii, M., Miyagawa, K., and Kamiya, K. (2001) Deoxycytidyl transferase activity of the human REV1 protein is closely associated with the conserved polymerase domain. J. Biol. Chem. 276, 15051- 15058 206. Haracska, L., Prakash, S., and Prakash, L. (2002) Yeast Rev1 protein is a G template- specific DNA polymerase. J. Biol. Chem. 277, 15546-15551 207. Nelson, J. R., Lawrence, C. W., and Hinkle, D. C. (1996) Deoxycytidyl transferase activity of yeast REV1 protein. Nature 382, 729-731 208. Nelson, J. R., Gibbs, P. E., Nowicka, A. M., Hinkle, D. C., and Lawrence, C. W. (2000) Evidence for a second function for Saccharomyces cerevisiae Rev1p. Mol. Microbiol. 37, 549-554 209. Guo, C., Fischhaber, P. L., Luk-Paszyc, M. J., Masuda, Y., Zhou, J., Kamiya, K., Kisker, C., and Friedberg, E. C. (2003) Mouse Rev1 protein interacts with multiple DNA polymerases involved in translesion DNA synthesis. EMBO J. 22, 6621-6630 210. Ohashi, E., Murakumo, Y., Kanjo, N., Akagi, J., Masutani, C., Hanaoka, F., and Ohmori, H. (2004) Interaction of hREV1 with three human Y-family DNA polymerases. Genes Cells 9, 523-531 211. Tissier, A., Kannouche, P., Reck, M. P., Lehmann, A. R., Fuchs, R. P., and Cordonnier, A. (2004) Co-localization in replication foci and interaction of human Y-family members, DNA polymerase pol eta and REV1 protein. DNA repair 3, 1503-1514 212. Sharma, S., Hicks, J. K., Chute, C. L., Brennan, J. R., Ahn, J. Y., Glover, T. W., and Canman, C. E. (2012) REV1 and polymerase zeta facilitate homologous recombination repair. Nucleic Acids Res. 40, 682-691 213. D'Souza, S., Waters, L. S., and Walker, G. C. (2008) Novel conserved motifs in Rev1 C- terminus are required for mutagenic DNA damage tolerance. DNA repair 7, 1455-1470 214. Nair, D. T., Johnson, R. E., Prakash, L., Prakash, S., and Aggarwal, A. K. (2005) Rev1 employs a novel mechanism of DNA synthesis using a protein template. Science 309, 2219-2222 215. Swan, M. K., Johnson, R. E., Prakash, L., Prakash, S., and Aggarwal, A. K. (2009) Structure of the human Rev1-DNA-dNTP ternary complex. J. Mol. Biol. 390, 699-709 216. Korzhnev, D. M., Billeter, M., Arseniev, A. S., and Orekhov, V. Y. (2001) NMR studies of Brownian tumbling and internal motions in proteins. Prog. Nucl. Magn. Reson. Spectrosc. 38, 197-266 217. Palmer, A. G., Kroenke, C. D., and Loria, J. P. (2001) Nuclear magnetic resonance methods for quantifying microsecond-to-millisecond motions in biological macromolecules. Methods Enzymol. 339, 204-238 218. Zwahlen, C., Legault, P., Vincent, S. J. F., Greenblatt, J., Konrat, R., and Kay, L. E. (1997) Methods for measurement of intermolecular NOEs by multinuclear NMR spectroscopy: Application to a bacteriophage lambda N-peptide/boxB RNA complex. J. Am. Chem. Soc. 119, 6711-6721 219. Keller, R. L. J. (2004) The computer aided resonance assignment tutorial, Cantina

169

220. Korzhnev, D. M., Tischenko, E. V., and Arseniev, A. S. (2000) Off-resonance effects in 15 N T2 CPMG measurements. J. Biomol. NMR 17, 231-237 221. Orekhov, V. Y., Nolde, D. E., Golovanov, A. P., Korzhnev, D. M., and Arseniev, A. S. (1995) Processing of heteronuclear NMR relaxation data with the new software DASHA. Appl. Magn. Reson. 9, 581-588 222. Mandel, A. M., Akke, M., and Palmer, A. G. (1995) Backbone dynamics of Escherichia coli ribonuclease HI: correlations with structure and function in an active enzyme. J. Mol. Biol. 246, 144-163 223. Ishima, R., and Torchia, D. A. (1999) Estimating the time scale of chemical exchange of proteins from measurements of transverse relaxation rates in solution. J. Biomol. NMR 14, 369-372 224. Loria, J. P., Rance, M., and Palmer, A. G. (1999) A relaxation-compensated Carr-Purcell- Meiboom-Gill sequence for characterizing chemical exchange by NMR spectroscopy. J. Am. Chem. Soc. 121, 2331-2332 225. Hansen, D. F., Vallurupalli, P., and Kay, L. E. (2008) An improved 15N relaxation dispersion experiment for the measurement of millisecond time-scale dynamics in proteins. J. Phys. Chem. B 112, 5898-5904 226. Korzhnev, D. M., Salvatella, X., Vendruscolo, M., Di Nardo, A. A., Davidson, A. R., Dobson, C. M., and Kay, L. E. (2004) Low-populated folding intermediates of Fyn SH3 characterized by relaxation dispersion NMR. Nature 430, 586-590 227. James, P., Halladay, J., and Craig, E. A. (1996) Genomic libraries and a host strain designed for highly efficient two-hybrid selection in yeast. Genetics 144, 1425-1436 228. Guo, D., Xie, Z., Shen, H., Zhao, B., and Wang, Z. (2004) Translesion synthesis of acetylaminofluorene-dG adducts by DNA polymerase zeta is stimulated by yeast Rev1 protein. Nucleic Acids Res. 32, 1122-1130 229. Ross, A. L., Simpson, L. J., and Sale, J. E. (2005) Vertebrate DNA damage tolerance requires the C-terminus but not BRCT or transferase domains of REV1. Nucleic Acids Res. 33, 1280-1289 230. Kosarek, J. N., Woodruff, R. V., Rivera-Begeman, A., Guo, C. X., D'Souza, S., Koonin, E. V., Walker, G. C., and Friedberg, E. C. (2008) Comparative analysis of in vivo interactions between Rev1 protein and other Y-family DNA polymerases in animals and yeasts. DNA repair 7, 439-451 231. Berjanskii, M. V., and Wishart, D. S. (2008) Application of the random coil index to studying protein flexibility. J. Biomol. NMR 40, 31-48 232. Bain, A. D. (2003) Chemical exchange in NMR. Prog. Nucl. Magn. Reson. Spectrosc. 43, 63-103 233. Schreiber, G. (2002) Kinetic studies of protein-protein interactions. Curr. Opin. Struct. Biol. 12, 41-47 234. Kim, J., Mao, J., and Gunner, M. R. (2005) Are acidic and basic groups in buried proteins predicted to be ionized? J. Mol. Biol. 348, 1283-1298 235. Chakrabartty, A., and Baldwin, R. L. (1995) Stability of alpha-helices. Adv. Protein Chem. 46, 141-176 236. Viguera, A. R., and Serrano, L. (1999) Stable proline box motif at the N-terminal end of alpha-helices. Protein Sci. 8, 1733-1742 237. Doig, A. J., and Baldwin, R. L. (1995) N- and C-capping preferences for all 20 amino- acids in alpha-helical peptides. Protein Sci. 4, 1325-1336

170

238. Boehr, D. D., Nussinov, R., and Wright, P. E. (2009) The role of dynamic conformational ensembles in biomolecular recognition. Nat. Chem. Biol. 5, 789-796 239. Mittag, T., Kay, L. E., and Forman-Kay, J. D. (2010) Protein dynamics and conformational disorder in molecular recognition. J. Mol. Recognit. 23, 105-116 240. Sugase, K., Lansing, J. C., Dyson, H. J., and Wright, P. E. (2007) Tailoring relaxation dispersion experiments for fast-associating protein complexes. J. Am. Chem. Soc. 129, 13406-13407 241. Zhang, W., Qin, Z., Zhang, X., and Xiao, W. (2011) Roles of sequential ubiquitination of PCNA in DNA-damage tolerance. FEBS Lett. 585, 2786-2794 242. Guo, C., Tang, T. S., Bienko, M., Parker, J. L., Bielen, A. B., Sonoda, E., Takeda, S., Ulrich, H. D., Dikic, I., and Friedberg, E. C. (2006) Ubiquitin-binding motifs in REV1 protein are required for its role in the tolerance of DNA damage. Mol. Cell. Biol. 26, 8892-8900 243. Kannouche, P., Broughton, B. C., Volker, M., Hanaoka, F., Mullenders, L. H., and Lehmann, A. R. (2001) Domain structure, localization, and function of DNA polymerase eta, defective in xeroderma pigmentosum variant cells. Genes Dev. 15, 158-172 244. Vidal, A. E., Kannouche, P., Podust, V. N., Yang, W., Lehmann, A. R., and Woodgate, R. (2004) Proliferating cell nuclear antigen-dependent coordination of the biological functions of human DNA polymerase iota. J. Biol. Chem. 279, 48360-48368 245. Ito, W., Yokoi, M., Sakayoshi, N., Sakurai, Y., Akagi, J. I., Mitani, H., and Hanaoka, F. (2012) Stalled Poleta at its cognate substrate initiates an alternative translesion synthesis pathway via interaction with REV1. Genes Cells 246. Edmunds, C. E., Simpson, L. J., and Sale, J. E. (2008) PCNA ubiquitination and REV1 define temporally distinct mechanisms for controlling translesion synthesis in the avian cell line DT40. Mol. Cell 30, 519-529 247. Murakumo, Y., Ogura, Y., Ishii, H., Numata, S., Ichihara, M., Croce, C. M., Fishel, R., and Takahashi, M. (2001) Interactions in the error-prone postreplication repair proteins hREV1, hREV3, and hREV7. J. Biol. Chem. 276, 35644-35651 248. Hashimoto, K., Cho, Y., Yang, I. Y., Akagi, J., Ohashi, E., Tateishi, S., de Wind, N., Hanaoka, F., Ohmori, H., and Moriya, M. (2012) The vital role of polymerase zeta and REV1 in mutagenic, but not correct, DNA synthesis across benzo[a]pyrene-dG and recruitment of polymerase zeta by REV1 to replication-stalled site. J. Biol. Chem. 287, 9613-9622 249. Hara, K., Hashimoto, H., Murakumo, Y., Kobayashi, S., Kogame, T., Unzai, S., Akashi, S., Takeda, S., Shimizu, T., and Sato, M. (2010) Crystal structure of human REV7 in complex with a human REV3 fragment and structural implication of the interaction between DNA polymerase zeta and REV1. J. Biol. Chem. 285, 12299-12307 250. Akagi, J., Masutani, C., Kataoka, Y., Kan, T., Ohashi, E., Mori, T., Ohmori, H., and Hanaoka, F. (2009) Interaction with DNA polymerase eta is required for nuclear accumulation of REV1 and suppression of spontaneous mutations in human cells. DNA repair 8, 585-599 251. Waters, L. S., and Walker, G. C. (2006) The critical mutagenic translesion DNA polymerase Rev1 is highly expressed during G(2)/M phase rather than S phase. Proc. Natl. Acad. Sci. U. S. A. 103, 8971-8976 252. D'Souza, S., and Walker, G. C. (2006) Novel role for the C terminus of Saccharomyces cerevisiae Rev1 in mediating protein-protein interactions. Mol. Cell. Biol. 26, 8173-8182

171

253. Kim, H., Yang, K. L., Dejsuphong, D., and D'Andrea, A. D. (2012) Regulation of Rev1 by the Fanconi anemia core complex. Nat. Struct. Mol. Biol. 19, 164-170 254. Doles, J., Oliver, T. G., Cameron, E. R., Hsu, G., Jacks, T., Walker, G. C., and Hemann, M. T. (2010) Suppression of Rev3, the catalytic subunit of Pol{zeta}, sensitizes drug- resistant lung tumors to chemotherapy. Proc. Natl. Acad. Sci. U. S. A. 107, 20786-20791 255. Xie, K., Doles, J., Hemann, M. T., and Walker, G. C. (2010) Error-prone translesion synthesis mediates acquired chemoresistance. Proc. Natl. Acad. Sci. U. S. A. 107, 20792- 20797 256. Lange, S. S., Takata, K., and Wood, R. D. (2011) DNA polymerases and cancer. Nat. Rev. Cancer 11, 96-110 257. Cheng, J., Li, Z., Gong, R., Fang, J., Yang, Y., Sun, C., Yang, H., and Xu, Y. (2015) Molecular mechanism for the substrate recognition of USP7. Protein & cell 6, 849-852 258. Cheng, J., Yang, H., Fang, J., Ma, L., Gong, R., Wang, P., Li, Z., and Xu, Y. (2015) Molecular mechanism for USP7-mediated DNMT1 stabilization by acetylation. Nature communications 6, 7023 259. Zhang, Z. M., Rothbart, S. B., Allison, D. F., Cai, Q., Harrison, J. S., Li, L., Wang, Y., Strahl, B. D., Wang, G. G., and Song, J. (2015) An Allosteric Interaction Links USP7 to Deubiquitination and Chromatin Targeting of UHRF1. Cell reports 12, 1400-1406 260. Wu, B., Barile, E., De, S. K., Wei, J., Purves, A., and Pellecchia, M. (2015) High- Throughput Screening by Nuclear Magnetic Resonance (HTS by NMR) for the Identification of PPIs Antagonists. Curr. Top. Med. Chem. 15, 2032-2042 261. Tang, G., Peng, L., Baldwin, P. R., Mann, D. S., Jiang, W., Rees, I., and Ludtke, S. J. (2007) EMAN2: an extensible image processing suite for electron microscopy. J. Struct. Biol. 157, 38-46

172