UNDERSTANDING THE MOLECULAR MECHANISMS OF THE

RNA HELICASES DHX36 AND DDX41

by

SUKANYA SRINIVASAN

Submitted in partial fulfillment of the requirements for the degree of

Doctor of Philosophy

Dissertation advisor: Dr. Eckhard Jankowsky

Department of Biochemistry

CASE WESTERN RESERVE UNIVERSITY

May 2020

CASE WESTERN RESERVE UNIVERSITY

SCHOOL OF GRADUATE STUDIES

We hereby approve the thesis/dissertation of

Sukanya Srinivasan

Candidate for the Doctor of Philosophy degree*

William C. Merrick (chair of the committee)

Eckhard Jankowsky

Derek J. Taylor

Tsan Sam Xiao

(date) March 18, 2020

*We also certify that written approval has been obtained for any proprietary material contained therein.

Table of contents

List of figures

List of tables

Acknowledgment

List of abbreviations

Abstract

Chapter 1: General Introduction to SF2 RNA helicases

1.1 Introduction to RNA helicases

1.2 Classification of helicases

1.3 Structural architecture and conserved sequence motifs of SF2 RNA helicases

1.3.1 Helicase core and its conserved sequence motifs

1.3.2 Terminal accessory domains

1.3.3 β-hairpin

1.4 Distinct modes of duplex unwinding by SF2 RNA helicases

1.5 Biochemical activities & cellular functions of SF2 RNA helicases

1.5.1 Diverse cellular functions

1.5.2 RNA unwinding

1.5.3 displacement

1.5.4 Strand annealing

1.5.5 RNA structure conversion and chaperone activity

1.5.6 Viral nucleic-acid and bacterial pathogen sensing

1.6 Influence of cofactors on RNA helicase function

1.7 Studies on DHX36 and DDX41 in this thesis

1.7.1 The DEAH/RHA helicase DHX36

1.7.2 The DEAD-box protein DDX41

Chapter 2: Introduction-The DEAH/RHA helicase is involved in the regulation of expression

2.1 Gene regulation in eukaryotes

2.1.1 Transcriptional regulation

2.1.2 Post-transcriptional regulation

2.2 Nucleic acid structures and gene regulation

2.2.1 Classical DNA/RNA structures and their functions

2.2.2 G-quadruplexes

2.2.2.1 Genomic mapping of G-quadruplexes

2.2.2.2 In vivo existence of G-quadruplexes

2.2.2.3 Biological roles of G-quadruplexes

2.2.2.4 G-quadruplex disease connections

2.2.2.5 G-quadruplex drug targeting

2.3 NA binding

2.3.1 General aspects of NA-binding proteins

2.3.2 G-quadruplex (GQ) interacting proteins

2.4 The DEAH/RHA helicase, DHX36

2.5 Biological roles of DHX36

2.5.1 Disease relevance

2.6 NA remodeling mechanism of DHX36

2.7 Structural basis for NA remodeling by DHX36

2.8 Structural and biochemical analyses of RNA remodeling activity of DHX36

Chapter 3: Introduction-The DEAD-box helicase DDX41, a myelodysplastic syndrome implicated splicing factor

3.1 Conservation and structure of human DEAD-box protein DDX41

3.2 Cellular functions of DDX41

3.2.1 Abstrakt, the Drosophila homolog of DDX41

3.2.2 DDX41 in the spliceosomal complex

3.2.3 DDX41 in innate immune signaling

3.2.4 DDX41 and tumor development

3.2.5 DDX41 in post-transcriptional gene regulation

3.3 Myelodysplastic syndrome

3.3.1 Myelodysplastic syndrome associated protein factors

3.3.1.1 Splicing factors in myeloid neoplasms

3.3.1.2 DDX41 in myeloid neoplasms

3.4 Cellular RNA targets of DDX41 and cellular consequences of DDX41 perturbations

3.5 Biochemical characterization of DDX41

Chapter 4: Function of auxiliary domains of the DEAH/RHA helicase DHX36 in RNA remodeling

4.1 Introduction

4.2 Results

4.2.1 Crystal structure of mouse DHX36

4.2.2 Remodeling of RNA duplexes and quadruplexes by mDHX36

4.2.3 The DSM promotes remodeling of RNA quadruplexes and duplexes and binding to RNA quadruplexes

4.2.4 The OB-fold promotes binding and remodeling of quadruplex and duplex structures

4.2.5 The β-hairpin promotes binding and remodeling of quadruplex and duplex structures

4.3 Discussion

Chapter 5: The nucleotide selectivity of DHX36 influences its RNA substrate selectivity and vice versa

5.1 Introduction

5.2 Results

5.2.1 Nucleotide of mouse DHX36

5.2.2 NTP-dependent remodeling of RNA duplexes and quadruplexes by mDHX36

5.2.3 Physiological ATP levels inhibit mDHX36 remodeling of RNA duplexes but not quadruplexes

5.2.4 The molecular mechanism of ATP-mediated mDHX36 RNA duplex remodeling inhibition

5.3 Conclusion

Chapter 6: Future Directions-Structural and biochemical analysis of DHX36

6.1 Introduction

6.2 DHX36 recognition of RNA length, sequence, and structure

6.2.1 Minimum length of ssRNA bound by DHX36

6.2.2 Sequence preference of DHX36

6.3 Coupling of NA binding to ATP binding/hydrolysis

6.3.1 Modulation of DHX36 ATPase activity by RNA sequence, structure, and length

6.3.2 NTPase activities of DHX36

6.3.3 Influence of NTP base stacking on the biochemical activities of DHX36

6.3.4 Role of ATP hydrolysis in DHX36-mediated RNA remodeling

6.3.5 Function of auxiliary domains of DHX36 in the coupling of NA binding to ATP binding/hydrolysis

6.4 The ligand-induced assembly state (i.e., monomer, dimer, oligomer) of DHX36

6.5 Nucleotide induced inhibition of mDHX36 RNA duplex remodeling activity

6.6 Effect of cofactors and PTMs on DHX36 activity

6.7 DHX36’s ability to form reversible aggregates and phase separate in solution

Chapter 7: Biochemical analysis of the DEAD-box helicase DDX41

7.1 Introduction

7.2 Results

7.2.1 Purification of recombinant human DDX41

7.2.2 ATP-dependent RNA remodeling activity of DDX41 masked by a potential nuclease contamination issue

7.2.3 Characterization of the potential nuclease contamination in recombinant DDX41 preparations

7.2.4 Characterization of the RNA remodeling activity of DDX41 using a modified RNA substrate

7.2.5 Biochemical activity of WT DDX41 and mutant DDX41R525H

7.3 Discussion

7.4 Future directions

Chapter 8: Material and Methods

8.1 Construction of DHX36 expression plasmids

8.2 DHX36 expression and purification

8.3 Crystallization, data collection, and structure determination

8.4 Preparation of radiolabeled RNA substrates for DHX36 remodeling reactions

8.5 DHX36 remodeling reactions

8.6 DHX36-RNA equilibrium binding reactions

8.7 DHX36 ATPase reactions

8.8 Circular Dichroism spectroscopy

8.9 Accession Codes

8.10 Protein-protein crosslinking

8.11 Kinetic Simulations of DHX36 remodeling reactions

8.12 Construction of DDX41 expression plasmids

8.13 DDX41 expression and purification

8.14 RNA substrate preparation for DDX41 unwinding reactions

8.15 Pre-steady state Nuclease reactions

8.16 DDX41 ATPase reactions

8.17 DDX41 Unwinding reactions

8.18 Data analysis for DDX41 unwinding reactions

Appendix 1

Appendix 2

Bibliography

List of figures

Figure 1.1. SF1 and SF2 helicase family classification.

Figure 1.2. Sequence and structural organization of the helicase core of SF2 proteins.

Figure 1.3. Position of the characteristic motifs in three-dimensional structures of SF2 helicases.

Figure 1.4. Domain organization and architecture of SF2 RNA helicase families.

Figure 1.5. The general OB-fold topology.

Figure 1.6. β-hairpin structures in SF2 RNA helicases.

Figure 1.7. Distinct modes of duplex unwinding by DEAD-box and DEAH helicases.

Figure 1.8. Cellular functions of eukaryotic RNA helicases.

Figure 2.1. RNA secondary structural elements: junctions, stems, loops and bulges.

Figure 2.2. G-quadruplex (GQ) structures and their topological variants.

Figure 2.3. Location and function of GQs involved in mRNA translation regulation.

Figure 2.4. Regulation of telomerase activity by RNA GQs associated with telomeres, TERRA, and TERC.

Figure 2.5. The overall structure of B. taurus DHX36–G-quadruplex DNA complex.

Figure 2.6. Expression profiles of DHX36 in various cancers.

Figure 2.7. The overall structure of D. melanogaster DHX36-DNA complex.

Figure 3.1. Phylogenetic tree showing DDX41 conservation across species.

Figure 3.2. Domain organization of human DDX41.

Figure 3.3. Structure of DDX41 helicase core.

Figure 3.4. Co-mutational landscape of DDX41 patient cohort.

Figure 3.5. Association of DDX41 with the spliceosome.

Figure 3.6. DDX41 as a PRR in innate immune signaling.

Figure 3.7. Topography, types, and configuration of DDX41 mutations in myeloid malignancies.

Figure 3.8. Bi-allelic DDX41 mutations in myeloid malignancies.

Figure 3.9. Structural model of the helicase core of DDX41.

Figure 4.1. Crystal structure of mouse DHX36 bound to ADP.

Figure 4.2. Differences in the arrangement of conserved domains in mouse DHX36 bound to ADP, compared to other DHX36 structures.

Figure 4.3. Differences in the arrangement of conserved motifs in mouse DHX36 bound to ADP, compared to other DHX36 structures.

Figure 4.4. Circular Dichroism spectra of intermolecular GQ4 and intramolecular GQ1.

Figure 4.5. Remodeling of G-quadruplex and RNA duplex substrates by WT mDHX36.

Figure 4.6. Quantitative analysis of the remodeling of G-quadruplex and RNA duplex substrates by WT mDHX36.

Figure 4.7. Impact of the DNA trap on the remodeling reaction by mDHX36.

Figure 4.8. The N-terminal DSM motif in DHX36.

Figure 4.9. Impact of deletion of the DSM.

Figure 4.10. Impact of deletion of the OB-fold on the RNA remodeling activity.

Figure 4.11. Impact of deletion of the OB-fold on the ATPase activity of mDHX36.

Figure 4.12. Impact of deletion of the OB-fold on RNA binding.

Figure 4.13. Impact of deletion and truncation of the 5’-β-HP on RNA remodeling.

Figure 4.14. Impact of deletion and truncation of the 5’-β-HP on the ATPase activity.

Figure 4.15. Impact of deletion and truncation of the 5’-β-HP on the RNA binding affinity.

Figure 4.16. Energetic differences of WT and mutant mDHX36 for binding to the different RNA substrates.

Figure 4.17. Circular dichroism analysis of mDHX36 variants used in this study.

Figure 5.1. ATP-binding site of mouse DHX36 and fly Vasa.

Figure 5.2. RNA remodeling activity of mDHX36 in the presence of different nucleoside triphosphates.

Figure 5.3. ATP vs. CTP preference in DHX36 remodeling of RNA structures.

Figure 5.4. Effect of ADP on DHX36 remodeling of RNA duplex.

Figure 5.5. Models for RNA duplex unwinding by mDHX36.

Figure 5.6. mDHX36 remodeling activity on 16-bp RNA duplexes with shorter 3’ ss extensions.

Figure 6.1. Effects of 3’-ss extension length on DHX36 RNA duplex remodeling activity.

Figure 6.2. Effects of 3’-ss extension sequence on DHX36 RNA duplex remodeling activity.

Figure 6.3. Steady-state ATPase activity of DHX36 with RNA.

Figure 6.4. Analysis of nucleotide base stacking interactions in mDHX36 structure.

Figure 6.5. DHX36 mediated RNA remodeling with ATP analogs.

Figure 6.6. Conserved ratchet domain residues of DHX36 interact with NA.

Figure 6.7. Hook loop and Hook-turn elements in DHX36.

Figure 6.8. Prediction of post-translational modifications and disordered regions in mDHX36.

Figure 7.1. Purification of recombinant WT and R525H mutant DDX41 from insect cells.

Figure 7.2. RNA unwinding by WT DDX41.

Figure 7.3. WT DDX41 unwinding of a 13 bp RNA duplex under regular and strand exchange conditions.

Figure 7.4. Analysis of RNA degradation products accumulated in the presence of WT DDX41.

Figure 7.5. WT DDX41 unwinding reactions with a 13 bp RNA duplex containing a radiolabel at 3'-end of the bottom strand.

Figure 7.6. WT DDX41 unwinding reactions with a 16 bp RNA duplex containing a radiolabel at 3'-end of the bottom strand.

Figure 7.7. WT DDX41 ATP hydrolysis reactions.

Figure 7.8. RNA unwinding by WT DDX41 and DDX41R525H.

Figure 7.9. RNA unwinding by WT DDX41 and DDX41R525H as a function of ATP concentration.

Figure 7.10. Schematic representation of the key residues in the helicase core domain that mediate ATP binding and hydrolysis, based on the structure of Vasa 2.

Figure 7.11. Non-phosphorylated and mono-phosphorylated RNA duplex unwinding by three helicases.

List of tables

Table 2.1. Biological roles of RNA GQs.

Table 2.2. RNA GQs and hallmarks of cancer.

Table 2.3. with potential GQ motif expansions implicated in neurological diseases.

Table 2.4. List of helicases and selected features implicated in GQ regulation in vitro and/or in vivo.

Table 5.1. Initial parameters for data simulation using various models.

Table 5.2. Kinetic parameters derived by fitting kobs values to a binding isotherm invoking substrate inhibition.

Table A1. X-ray Crystallographic Data Collection and Refinement Statistics

Table A2. Apparent unwinding rate constant (kobs) values from ‘afit’ function of kintek simulation.

Acknowledgment

Foremost, I would like to thank my thesis advisor, Dr. Eckhard Jankowsky, for giving me an opportunity to develop as a young scientist. I have greatly benefited from his hands-off but attentive mentoring style. His belief in my abilities and his constant reminders to keep pushing the boundaries of my capabilities have helped me endure the meandering path of research. His critical and insightful discussions have significantly improved my critical thinking and scientific communication skills. I owe a big part of my solid scientific training to his wisdom and mentorship. His influence will continue to guide me throughout the rest of my scientific career.

I sincerely thank my thesis committee members, Drs. William Merrick, Derek Taylor, and Tsan S. Xiao, for their contributions and feedback on my work. I have benefitted from their involvement in my graduate career. Special thanks to Dr. William Merrick for reading the draft of this thesis. I would also like to acknowledge Drs. Tsan S. Xiao and Zhonghua Liu for their collaboration and experimental contributions.

I am thankful to all the current and former members of the lab, and to Drs. Huijue Jia, Andrea Putnam, and Zhaofeng Gao for welcoming me into the lab, tolerating all my silly questions, and for being fantastic troubleshooting experts. They trained me, in one way or another, on most techniques. I am especially appreciative of my fellow lab members for their camaraderie and for maintaining a fun lab atmosphere. I have never once dreaded coming into the lab, and I owe much of this to the attitudes and personalities of my Ph.D. mentor and all my wonderful lab mates. I also thank all my friends and colleagues outside the group, with whom I enjoyed many insightful discussions during our lunches and coffee breaks. I thank Dr. Mengyuan Xu for her help with the insect cell expression and CD experiments.

Special thanks to my partner Ram, for being supportive and understanding, and for pushing me out of my comfort zone. His valuable time, constructive criticism, and constant encouragement have helped me succeed and navigate this long and seemingly endless journey. I would also like to thank my family, especially my grandpa, for his wisdom, love, and support throughout my life. Lastly, I’m incredibly grateful to my mom, who, despite the circumstances, went out of her way to provide the best education for me during my formative years. I dedicate this thesis to her.


List of Abbreviations

ABC transporter, ATP-binding cassette transporter

ADP, adenosine diphosphate

ADP-AlF4, ADP-aluminum fluoride

ADP-BeFx, ADP-beryllium fluoride

ADPNP, adenylyl imidodiphosphate

AML, acute myeloid leukemia

ATP, adenosine triphosphate

ATPase, adenosine triphosphatase

bp, base pair

β-HP, beta-hairpin

CARD, Caspase activation and recruitment domains

CLIP, cross-linking immunoprecipitation

C-terminal/terminus, carboxyl-terminal/terminus

DLS, dynamic light scattering

DNA, deoxyribonucleic acid

dsRBD, double-stranded RNA binding domain

dsRNA, double-stranded RNA

DSM, DHX36-specific motif

DTT, dithiothreitol

EDTA, ethylenediaminetetraacetic acid

EJC, exon junction complex

FRET, Förster resonance energy transfer


GTP, guanosine triphosphate

GQ4, intermolecular G-quadruplex

GQ1, intramolecular G-quadruplex

HA2, helicase associated domain 2

HCV, hepatitis C virus

HDAC1, histone deacetylase 1

IDRs, intrinsically disordered regions

IFN, interferon

IP6, inositol hexakisphosphate

ITP, inosine triphosphate

LLPS, liquid-liquid phase separation

MDS, myelodysplastic syndrome

mRNA, messenger RNA

mt, mutant

NA, nucleic acid

NLS, nuclear localization signal

NMD, nonsense-mediated decay

nt, nucleotide

N-terminal/terminus, amino-terminal/terminus

NTP, nucleoside triphosphate

OB-fold, oligonucleotide/oligosaccharide binding-fold

PAGE, polyacrylamide gel electrophoresis

PAP, poly(A) polymerase


PAZ domain, Piwi, Argonaute, Zwille domain

P-body, processing body

P-Loop, phosphate-binding loop

Pol I, II, III, RNA polymerase I, II, III

poly(A), polyadenylic acid

PRR, pattern recognition receptor

PAMP, pathogen-associated molecular pattern

RNA, ribonucleic acid

RNP, ribonucleoprotein

rRNA, ribosomal RNA

RRM, RNA recognition motif

SDS, sodium dodecyl sulfate

ssRNA, single-stranded RNA

snRNA, small nuclear RNA

snoRNA, small nucleolar RNA

SF, superfamily

SNX2, sorting nexin 2

Tris, tris(hydroxymethyl)aminomethane

tRNA, transfer RNA

UTP, uridine triphosphate

WH domain, winged-helix domain

wt, wildtype


Understanding The Molecular Mechanisms Of The RNA Helicases DHX36 And DDX41

Abstract

by

SUKANYA SRINIVASAN

The DEAH/RHA helicase DHX36 is an essential protein linked to cellular RNA and DNA quadruplex structures and AU-rich RNA elements. DHX36 contains the superfamily 2 helicase core and several auxiliary domains that are conserved in DHX36 orthologs. The role of these auxiliary domains in the enzymatic function of DHX36 is not well understood. The molecular basis for RNA structure remodeling and the biochemical features of DHX36 substrate selectivity also remain elusive. To address these gaps, I combine structural and biochemical studies of mouse DHX36. Our crystal structure of mouse DHX36 bound to ADP shows conformational changes that accompany stages of the ATP-binding and hydrolysis cycle. I demonstrate and characterize RNA duplex unwinding for DHX36 and examine the remodeling of inter- and intramolecular RNA quadruplex structures. I find that the DHX36-specific motif (DSM) not only functions as a quadruplex binding adaptor but also promotes the remodeling of RNA duplex and quadruplex structures. The highly conserved OB-fold and β-HP domains contribute to RNA binding but are also essential for remodeling RNA quadruplex and duplex structures. Our data reveal the roles of auxiliary domains for multiple steps of the nucleic acid remodeling reactions. I also show that DHX36 utilizes all nucleotides, but not indiscriminately. ATP at physiological levels negatively regulates DHX36 unwinding of duplexes and positively regulates DHX36 remodeling of quadruplexes. Thus, DHX36 utilizes a unique nucleotide-dependent mechanism to achieve RNA substrate selectivity.

Mutations of the spliceosome-associated DEAD-box helicase DDX41 are frequently found in myeloid malignancies. Germline mutations result in a truncated protein, while an R525H missense mutation dominates the somatic profile. The biochemical mechanism of DDX41 and the effects of DDX41 perturbations are unknown.

To address this, I characterize the biochemical activities of WT DDX41 and mutant DDX41R525H in vitro. I demonstrate modest defects in the RNA remodeling activity of the mutant. DDX41 has been shown to be an essential gene, and homozygous truncating mutations of DDX41 are not observed in humans. This is consistent with my biochemical observations, suggesting that DDX41R525H must maintain a nonlethal level of functionality in the presence of a truncating germline allele.


Chapter 1

General introduction to SF2 RNA helicases

1.1 Introduction to RNA helicases

RNA molecules typically fold and adopt well-defined secondary and tertiary structures 18. Additionally, an RNA is usually bound to proteins in ribonucleoprotein complexes (RNPs). All the steps of post-transcriptional gene regulation involve a coordinated interplay between RNA and proteins, and RNAs and RNPs are subject to constant reorganization. To maintain normal cellular function, this reorganization needs to be carefully controlled and is often mediated by RNA helicases 22. Functionally, RNA helicases are enzymes that use ATP to bind and remodel RNA and RNA-protein complexes and are involved in most aspects of RNA metabolism, such as splicing, export, biogenesis, translation, and mRNA decay 30. RNA helicases are present in all forms of life and are the largest class of enzymes in eukaryotic RNA metabolism 31. Structurally, RNA helicases are related to DNA helicases, and both are members of the phosphate-binding loop (P-loop) nucleoside triphosphatase (NTPase) class of proteins, which contain a signature beta sheet followed by an alpha helix. Other P-loop NTPases include F- and V-type ATPases, ABC transporters, kinases, G-proteins, and helicases 32. The P-loop, also referred to as the Walker A motif, binds the β- and γ-phosphates of NTPs.

Studies indicate that, despite the structural similarity to DNA helicases, RNA helicases work differently from their DNA counterparts 17.

1.2 Classification of helicases

Based on sequence and structural homology, helicases, both those that form hexameric toroidal structures and those that do not, are classified into six superfamilies (SF1-6) 33. SF1 and SF2 are the largest superfamilies and include eukaryotic RNA helicases together with bacterial and viral RNA and DNA helicases, all of which contain the characteristic tandem RecA-like folds 17. Conversely, SF3-6 helicases form hexameric rings and are found only in bacteria and viruses 33. In humans, there are at least 17 SF1 and 103 SF2 helicases. The SF1 and SF2 helicases contain a conserved core of two highly similar globular domains (RecA1 and RecA2), each consisting of a parallel β-sheet surrounded by α-helices and named for their similarity to the Escherichia coli protein RecA, which is involved in homologous recombination 34. Based on sequence and structural characteristics, SF1 helicases are further divided into three families, and SF2 helicases into nine families and one group 10 (Fig. 1.1). The SF2 families with RNA helicases include the DEAD-box, DEAH/RHA, Ski2-like, RIG-I-like, and NS3/NPH-II families 10. Some families contain helicases that work on both DNA and RNA. Additionally, some mechanistic features, such as indiscriminate NTP usage versus selectivity for ATP, are typically shared by helicases within a family 10.

Figure 1.1. SF1 and SF2 helicase family classification. Unrooted cladogram showing the families of SF1 (right) and SF2 (left) according to Ref: 10. Branch lengths are not to scale. The oval indicates significant uncertainty in cladogram topology in this region. Boldfaced names show families harboring RNA helicases (non-standard abbreviations: T1R – type 1 restriction enzymes, RHA – RNA helicase A). Adapted from (Ref: 22), with permission to reuse from Elsevier.

1.3 Structural architecture and conserved sequence motifs of SF2 RNA helicases

1.3.1 Helicase core and its conserved sequence motifs

As mentioned previously, the helicase core is made up of two similar RecA-like domains. One of the most notable features of SF2 helicases is the set of thirteen characteristic sequence motifs with roles in ATP binding/hydrolysis, RNA binding, and the coupling between the two 10 (Fig. 1.2). The Q-motif, which provides adenine base specificity through hydrogen bonding and stacking interactions, is relatively less conserved across SF1-2 35. This motif is present in all SF1-2 families except the DEAH/RHA and NS3/NPH-II families, whose helicases are not specific for ATP 10.

Figure 1.2. Sequence and structural organization of the helicase core of SF2 proteins. (a) Sequence organization of the helicase core in SF2. Characteristic sequence motifs are colored according to their predominant biochemical function: red, ATP binding and hydrolysis; yellow, coordination between nucleic acid and NTP binding sites; blue, nucleic acid binding. Green circled asterisks mark insertions of additional domains. The lengths of the blocks and the distances between the conserved domains are not to scale. Characteristic motifs were identified from the alignment of all SF2 proteins from human, S. cerevisiae, E. coli, and selected viruses. Considering numbering schemes already in use 17, motifs were numbered consecutively. The Q-motif is equivalent to motif 0 in RecQ proteins. Motif IVa in SF2 proteins is frequently marked QxxR, motif Ic often TPGR. The asterisk on motif Ib indicates that in some proteins this motif is replaced by an additional domain. (b) Sequence conservation within the characteristic helicase motifs. The height of the amino acids reflects the level of conservation at a given position; taller letters indicate higher conservation. The universally conserved E in motif II corresponds to 4 bits. Coloring marks the chemical properties of a given amino acid position: green and purple, polar; blue, basic; red, acidic; black, hydrophobic. Sequence logos were created from the alignment of SF2 proteins according to Ref: 24. Circles under the letters are for visual guidance. (c) Position of the characteristic motifs in the RecA-like folds of the helicase core domains. The β-strands are indicated by arrows, α-helices by cylinders. The β-strands of the first RecA-like domain are numbered according to their position in the primary structure. The position of the characteristic motifs is indicated by numbered circles, colored as in panel (a). The position of inserted domains is marked by green circled asterisks, as in panel (a). Blue coloring of the rightmost β-strand and α-helix indicates the absence of this part in several SF2 protein families. The figures are modified with permission from Elsevier (Ref: 10).

Residues that coordinate NTP binding and hydrolysis (motifs I, II, and VI; Fig. 1.2) display the highest level of sequence conservation across SF1-2. Motif I (Walker A) contacts the β-phosphate as well as the magnesium ion bound between the β- and γ-phosphates 36. Motif II (Walker B) coordinates the magnesium ion and the catalytic water molecule. Motif VI residues interact with the α-, β-, and γ-phosphates of ATP, and the second arginine of this motif helps stabilize the transition state of ATP hydrolysis 28. This arginine is also referred to as the arginine finger and has been found to have similar functions in other P-loop proteins, such as AAA+ proteins 37.

Residues from motifs I, II, and VI are located in the cleft between the conserved helicase core domains (Fig. 1.3). Residues from motifs Ia, Ib, Ic, IV, V, and Vb are involved in nucleic acid (NA) binding by contacting the sugar-phosphate backbone. Some RNA-specific helicases are known to recognize the 2’-hydroxyl groups as well 10. Interestingly, in certain helicases, residues outside the conserved motifs are involved in base-specific contacts 38-39. This variable mode of NA binding contributes to the functional diversity in helicases. Residues in motifs III, IV, and Va are critical for coupling between ATP binding and hydrolysis and NA binding 40. Motif Va is often referred to as the “sensor” motif because of its role in sensing γ-phosphate release during ATP hydrolysis 41. However, how exactly the NTP and NA binding sites communicate with each other is still not well understood. The NA binding motifs are located on the opposite side of the ATP binding site, spreading across both helicase domains (Fig. 1.3).

Figure 1.3. Position of the characteristic motifs in three-dimensional structures of SF2 helicases. Structures of the DEAH/RHA helicase Prp43p (left) and the DEAD-box helicase Vasa (right). The bound ATP analog is colored magenta, the nucleic acid is colored wheat. Conserved sequence motifs are colored as in Figure 1.2. The auxiliary domains are in light-pink and light-green. The figures are modified with permission from Elsevier 10.

The RecA domains are connected via a flexible linker region, thus allowing for relative movement of these domains 17. Reported crystal structures of helicases from different families illustrate different conformations of the RecA domains in the presence and absence of ADP or ATP analogs 39, 42-43. These studies show that the cleft between the two RecA domains must be closed to productively bind and hydrolyze ATP. In all these instances, the opening and closing of the RecA domains is mediated by the absence and presence of nucleotide analogs. The associated NA binds to a large basic surface formed by the RecA domains, opposite to the ATP binding site. Hence, this defined arrangement ensures coordination between ATP binding/hydrolysis and NA binding.

1.3.2 Terminal accessory domains

In addition to the helicase core, the majority of the SF2 helicases contain N- and C- terminal accessory sequences, which may adopt defined folds with specific functions such as RNA/DNA binding (e.g., OB-fold, dsRBD), or protein-protein interactions (e.g.,

CARD-domains) (Fig. 1.4a-e 10). These terminal regions are thought to be critical for additional enzymatic activities, promoting oligomerization, and providing physiological specificity of helicases 23, 44-45. Consistent with this function, these regions are typically not conserved within a family. However, some structural conservation of the C-term domains is observed within and between families (e.g., Hel308 and Brr2p in Ski2-like family; and

Prp43p, MLE (H. sapiens DHX9) and DHX36 in DEAH/RHA family 6, 29, 46 (Fig. 1.4f-g).

Hel308 features a C-term HLH (helix-loop-helix) domain that is positioned near the exit of the helicase channel. Prp43p has, instead, a C-terminal OB-fold (oligonucleotide/oligosaccharide binding-fold) domain that is positioned on the opposite surface, at the entrance of the helicase channel. The shared homology region in the extended C-term of the DEAH/RHA helicases consists of two conserved stretches of amino acids, with variable length and placement. In the spliceosomal yeast DEAH/RHA helicases, this region is essential for viability 47-49. The most important conserved feature in the extended C-term of DEAH/RHA helicases is the OB-fold domain. The OB-folds were first identified in bacterial and yeast proteins as domains that bound to oligonucleotides or oligosaccharides 50. Subsequently, they were found to be involved in protein-NA or protein interactions 51-53. The OB-folds in different proteins vary in length (from 70–150 amino acids) and have little sequence similarity. However, all OB-folds share structural features, typically containing five highly coiled, antiparallel β-strands 50 (Fig. 1.5). In Prp43p, the deletion of this domain impairs the RNA binding activity and the stimulation of the RNA-dependent ATPase activity of the helicase 43. Likewise, in MLE, point mutations in OB-fold residues were shown to affect the RNA-dependent ATPase activity of the helicase 46. Biochemical studies suggest that the OB-fold domain plays a role in regulating the helicase activity and RNA-stimulated ATPase activity of Prp43p and DHX15, via interaction with RNA and G-patch domain containing proteins 54-55. Although the role of the OB-fold domain was examined in helicases, its impact on RNA structure remodeling activity was never thoroughly investigated.

Figure 1.4. Domain organization and architecture of SF2 RNA helicase families. Domains are not to scale. (a) C-termini and N-termini of DEAD-box proteins usually include RRMs, Zn-fingers, tudor domains and others 11. (b) The family-typical domain inserted between the helicase domains is shown in grey. RIG-I-like proteins vary in their terminal domains 23. Prominent RIG-I-like proteins are shown, Mph1p/FancM-related proteins are not shown (CARD: caspase recruitment domain, RD: regulatory domain, a Zn-binding domain, PAZ: PIWI, Argonaute, Zwille, dsRBD: double-stranded RNA binding domain). (c) Domain organization of Hel308 26. The organization of the C-terminal domains is conserved in Brr2 27 (WH: winged helix, H1: helical 1, H2: helical 2, FN3: fibronectin 3). (d) DEAH/RHA proteins have varying N-terminal domains, but show a very high degree of conservation in their C-termini, especially among the spliceosomal DEAH proteins. It is possible that most DEAH/RHA proteins show a conserved domain organization of their C-termini. Shown is the domain organization of Prp43p 29. The domain organization of the C-terminus, with the exception of the OB-fold domain, resembles that of Ski2-like proteins 26 (WH*: degenerated winged helix, Ratchet corresponds to H1 in the Ski2-like proteins). (f) A cartoon representation of Hel308 architecture from the Ski2-like family. In all structures, the RecA/helicase domain 1 (H1) is green, RecA/helicase domain 2 (H2) cyan, and nucleic acid, where present, beige. Terminal and inserted accessory domains are colored according to their respective folds, as indicated. All structures are oriented in a similar fashion, right panels show the structures rotated by roughly 90°, as indicated in panel (f) (αH: ratchet helical domain, WH: winged helix domain, OB: OB fold). (g) A cartoon representation of Prp43p architecture from the DEAH/RHA family. The figures are modified with permission from Elsevier 10.

Figure 1.5. The general OB-fold topology. The β-strands are labeled as B1 to B5. The loops are labeled as RT, n-src, distal and omega. The N- and C-termini are labeled. Note the N-terminal starts with strand B2 and strand B1 is between omega helix and RT-loop. RT-loop is connecting B1 with B5. This figure was adapted from Ref: 12, with permission from Springer Nature.

The other conserved feature in the extended C-term of DEAH/RHA helicases comprises the winged-helix (WH) and ratchet domains, together referred to as the Helicase Associated (HA2) domain. The HA2 domain is also conserved in the Ski2-like helicase family, where the ratchet domain is thought to facilitate DNA translocation in Hel308 26. Consistent with this idea, point mutations in ratchet residues of Mtr4 and other Ski2-like helicases cause slow growth phenotypes in vivo and loss of unwinding activity in vitro 27, 56. However, recent structures of MLE and Prp43p (DEAH/RHA helicases) revealed only minor interactions of the ratchet residues with bound RNA; hence, the ratcheting role of this domain remains controversial 46, 57.

In contrast to the C-term domain, the N-terminal regions are unique and show little sequence conservation between family members. The spliceosomal DEAH/RHA helicase Prp22p features an RNA-binding domain (RBD) in its N-term, and truncations in this region are lethal in yeast. However, this truncation does not affect the RNA-dependent ATPase activity or the ability to displace the mRNP from the spliceosome in vitro 58. Other non-spliceosomal DEAH/RHA helicases also contain canonical RBDs in their N-term: MLE contains two dsRBDs 46; DHX57 contains a C3H1-type Zn finger domain. On the other hand, DHX36 features a non-canonical NA-binding domain, DSM (DHX36 specific motif), which is involved in recognizing G-quadruplex NA 6, 45.

More recent evidence suggests that the N- and C-termini of certain SF2 helicases consist of intrinsically disordered regions (IDRs) that are primarily composed of RGG, G/Q, or G/S amino acids 59-60. These low-complexity regions are believed to play a role in RNA binding and to have a tendency to self-aggregate 61.

1.3.3 β-hairpin

A prominent β-hairpin (β-HP) structure is present between motifs Va and VI in helicases that belong to the Ski2-like, NS3/NPH-II, and DEAH/RHA families (Fig. 1.6). The β-HP structure is, however, absent in DEAD-box and RIG-I-like families 10 (Fig. 1.6b). The structure of Hel308 (a Ski2-like helicase) indicated the partial opening of the bound DNA duplex due to the presence of the β-HP at the junction between duplex and single-stranded regions 26 (Fig. 1.6b). In NS3/NPH-II, the β-HP acts as a bookend at the 5′ end of a central stack of bases in the RNA substrate and is consequently designated the 5′ β-HP 39. In Prp43p (DEAH/RHA helicase), point mutations in 5’ β-HP residues caused cold-sensitive and slow growth phenotypes in vivo 62. Likewise, in DHX29, point mutations of two 5’ β-HP residues that contact the auxiliary domains lead to defects in 48S formation in vivo 63. In vitro, mutations in 5’ β-HP residues lead to a substantial reduction in the RNA-stimulated ATPase activity of DHX29 63. The 5’ β-HP was proposed to play a key role in translocation of Prp43p on RNA, by defining the dynamic boundaries of the RNA stack 41.

Interestingly, the presence of the β-HP in these helicase families seems to correlate with their 3’-5’ unwinding polarity. However, structural studies suggest subtle differences in the function of the 5’ β-HP among these helicase families. For instance, in MLE and Prp43 (DEAH/RHA family helicases), the nucleotide bases undergo a flip when they encounter the 5’ β-HP so that the bases are pointing away from the HP (Fig. 1.6c,d). This flip is not observed in Ski2-like and NS3/NPH-II family helicases, where the bases point towards the HP (Fig. 1.6e). Nevertheless, the position of the β-HP is conserved, and the hairpin likely plays a role in unwinding by providing strain directly 3’ to any NA structure 25. Chapter 3 provides experimental evidence for the critical role of the 5’ β-HP in RNA structure remodeling.

Figure 1.6. β-hairpin structures in SF2 RNA helicases. (a) Position of the 5’ β-hairpin relative to characteristic helicase motifs. Motifs colored as in Figure 1.1. The hairpin is represented by the asterisk in green circle. (b) Cartoon representation of the helicase structures from different families, zoomed into the region containing the 5’ β-hairpin. The RecA1 domain is in blue, RecA2 domain in pink, accessory/inserted domains are light beige, and the 5’ β-hairpin is green in all structures. ATP and nucleic acid are magenta and orange, respectively. The panel on the far right shows the DEAD-box helicase Vasa as an example for a family lacking the “hairpin”. The following structures are shown from left to right: S. cerevisiae Prp43p (PDB 3KX2); Archaeoglobus fulgidus Hel308 (PDB 2P6R); Hepatitis C virus HCV NS3 (PDB 1CU1); D. melanogaster Vasa (PDB 2DB3). The figures in a) and b) are modified with permission from Elsevier 10. (c-e) Interactions between a bound single-stranded nucleic acid substrate (pink, sticks) and the RecA1 (brown), RecA2, and 5′ β-hairpin domains (green) of (c) C. thermophilum Prp43p (PDB 5LTA); (d) MLE (PDB 5AOR); and (e) HCV NS3 (PDB 3KQL). (HP stands for 5′ β-hairpin). The figures in (c), (d), and (e) are modified from 25, distributed under the terms of the Creative Commons License (http://creativecommons.org/licenses/by-nc/4.0/).

1.4 Distinct modes of duplex unwinding by SF2 RNA helicases

In DEAD-box RNA helicases, ATP and NA binding is highly cooperative. By contrast, in the Ski2-like, NS3/NPH-II, and DEAH/RHA families, the ATP- and NA-binding sites are preformed; hence, they bind ATP or NA independently of each other 17. Although the single-stranded NA spans across the two RecA domains with similar overall orientation and path, the conformation of NA bound to DEAD-box proteins differs from that of NS3/NPH-II and DEAH/RHA proteins. In the latter, the NA chain is embedded in a channel formed between the RecA domains and the extended C-terminal domains. Additionally, the NA backbone is in an extended conformation in these helicases, whereas the backbone of the NA bound to DEAD-box proteins is severely bent 2, 41, 64. Also, DEAD-box proteins bind RNA exclusively at the sugar-phosphate backbone, whereas proteins from other families contact nucleobases too 17. As a result of these variations in NA binding, strand separation occurs via distinct mechanisms in these helicases.

At least two modes of duplex unwinding exist for SF2 RNA helicases: canonical translocation-based unwinding and local strand separation (Fig. 1.7). The NS3/NPH-II and DEAH/RHA family helicases unwind duplexes by first loading onto a single-stranded region and then translocating in a unidirectional, processive manner 39, 41, 65. Here, translocation is tightly coupled to the ATP binding and hydrolysis cycle. On the other hand, DEAD-box family helicases like Ded1p and Mss116p unwind duplexes by local strand separation, an unwinding mode not based on translocation 66-67. Single-stranded regions, regardless of orientation, facilitate loading of the helicase directly onto duplex regions 67. The helicase then opens a limited number of base pairs, and the remaining base pairs dissociate spontaneously (supported by the slower unwinding of more stable duplexes 67-70). The DEAD-box proteins, S. cerevisiae Mss116p, eIF4A, and Ded1p, were able to unwind duplexes in the presence of ADP-BeFx (a non-hydrolyzable ground state ATP analog). Similarly, the Neurospora crassa CYT19 was shown to unwind a 6 bp duplex under low magnesium conditions, with less than 1 ATP molecule hydrolyzed per strand separation event 68. These data indicate that ATP binding is sufficient for duplex unwinding; ATP hydrolysis is only necessary for enzyme release from the RNA 71. Nevertheless, it is still possible that ATP hydrolysis may further disrupt additional base pairs and thus accelerate duplex unwinding, compared with the reaction driven by ATP binding.

Figure 1.7. Distinct modes of duplex unwinding by DEAD-box and DEAH helicases. Top panel: Schematic view of the main steps of translocation-based duplex unwinding. Lines represent RNA strands, the oval marks the helicase and the black rectangle indicates the ATP. Only a monomeric enzyme is displayed, but canonical helicases have also been shown to function as oligomers 15. Only selected, main intermediates are shown. Bottom panel: Schematic view of the steps of unwinding by local strand separation. Lines represent RNA strands, the ovals mark the helicase and the small rectangles indicate the ATP/ADP. The different colors of the helicase protomers emphasize their distinct roles in the unwinding process. The asterisk after step 7 highlights the transient nature of the RNA species with a partially opened helix. Adapted, with permission, from 22.

1.5 Biochemical activities & cellular functions of SF2 RNA helicases

1.5.1 Diverse cellular functions

In the cell, SF2 RNA helicases are involved in most aspects of RNA metabolism, including pre-mRNA splicing and translation (Fig. 1.8). Some RNA helicases participate solely in one cellular process, such as Suv3 in mitochondrial RNA processing 72. Others, like DDX5, are implicated in multiple processes like transcription 73, RNA decay 74, pre-mRNA splicing 75, and mRNA export 76. Certain cellular processes, like pre-mRNA splicing and translation, utilize RNA helicases from multiple families, while other processes like mRNA export involve only the DEAD-box helicase family. Some RNA helicases like RIG-I 77, DDX41 78, DHX9, and DHX36 have been implicated in cytoplasmic pathogen sensing.

RNA helicases generally unwind RNA duplexes in vitro provided appropriate substrates are used 79; however, this does not imply that the given helicase necessarily unwinds duplexes in the cell. Despite being defined by characteristic sequence motifs, not all RNA helicases disrupt duplexes. In addition to duplex unwinding, RNA helicases have been shown to remove proteins from RNA in an ATP-dependent fashion 80-81, promote strand annealing 60, 70, and catalyze RNA structural conversion 66, 82. Hence, developing an understanding of these biochemical functions is essential for devising physical models of how these helicases function in the cell.

Figure 1.8. Cellular functions of eukaryotic RNA helicases. Selected, basic processes of eukaryotic RNA metabolism are represented by the white circles, as indicated by the callouts (NMD: nonsense mediated decay). The grey lines mark connections between processes. The colored circles represent the number of individual RNA helicases involved in a given process. RNA helicases (yeast and human orthologs) are grouped and color-coded according to their families (see legend at left lower corner). Connectors indicate involvement in one or more processes of RNA metabolism. Clear assignment of Suv3 to either SF is not possible, even though the protein is highly conserved throughout evolution 10. Circles with bold lines emphasize the three RNA helicases (Prp22p, Prp43p, eIF4A-III) for which specific binding site information is available. Adapted from 22, with permission to reuse from Elsevier.

1.5.2 RNA unwinding

In vitro measurement of RNA unwinding with defined model substrates is considered an excellent proxy for the ATP-dependent binding and remodeling of more complex RNA and RNP structures that RNA helicases are thought to perform in the cell 28. For the DEAD-box helicase Dbp4p/DDX10, duplex unwinding is a key task during its involvement in the process of ribosome biogenesis 83. The spliceosomal DEAH/RHA helicases have been assigned to specific steps in the pre-mRNA splicing reaction. For instance, Prp22 participates in the exon ligation step and promotes the release of the mRNA product by binding downstream of the exon junction and then translocating upstream along the mRNA 84. Similarly, the Ski2-like helicase Brr2 is necessary for unwinding the U4/U6 duplex, a step essential for catalytic activation of the spliceosome 85.

1.5.3 Protein displacement

In addition to unwinding, RNA helicases also play a major role in the displacement of RNA-binding proteins. This activity is critical for the function of RNA helicases since RNAs are generally bound to other proteins in vivo. Moreover, proteins can be displaced from structured and from unstructured RNA, suggesting that this activity can be catalyzed independently of duplex unwinding 80. The DEAD-box (Ded1p/DDX3), DEAH/RHA (Prp2p/DHX16), and NS3/NPH-II (NPH-II) family helicases have been demonstrated to disrupt RNA-protein interactions 80, 86-88. Some RNA helicases like p68/DDX5 have been shown to disrupt protein-protein contacts as well 89.


1.5.4 Strand annealing

RNA helicases are also known to facilitate strand annealing 17. Strong strand annealing activity has been reported for two DEAD-box helicases, Mss116p and Ded1p/DDX3X 70, 90. Except for the cyanobacterial DEAD-box helicase CrhR, the strand annealing activity of DEAD-box helicases does not require ATP. Strand annealing activity is also observed in DEAH/RHA helicases, as DHX9 has been shown to promote the annealing of HIV-1 genomic RNA to tRNALys3, the primer for reverse transcriptase in HIV 91. Single-molecule studies on DHX36, another DEAH/RHA helicase, revealed a helicase mechanism that involved repetitive cycles of ATP-independent unfolding and ATP-dependent refolding (annealing) of RNA G-quadruplex structures 6.

The strand annealing activity, along with duplex unwinding or protein displacement, is thought to enable RNA helicases to catalyze RNA or RNP structure remodeling.

1.5.5 RNA structure conversion and chaperone activity

Some DEAD-box helicases are thought to promote proper folding of RNA structures 92. The DEAD-box proteins Mss116p and CYT-19 have been shown to promote the proper folding of group I and II introns 93-94. In vitro, single-molecule studies have shown that Mss116p promotes folding through a multi-step process that involves discrete ATP-independent and ATP-dependent steps 95. Ded1p has also been shown to promote RNA structure conversion in an ATP-dependent manner. This folding activity of Ded1p is believed to involve both RNA unwinding and strand annealing activities 96. Likewise, the DEAH/RHA helicase Dhr1p/DHX37 is required for the structural reorganization of the 18S rRNA during ribosome biogenesis 97.


Eukaryotic translation requires multiple helicases, principally for rearrangements of the 5’ untranslated region (5’ UTR) in cap-dependent initiation. Three helicases, eIF4A, Ded1, and DHX29, are thought to clear the way for the pre-initiation complex by resolving RNA structures in the 5’ UTRs 98-99. Recently, DHX9 and DHX36 were also found to interact with 5’ UTR structures, including G-quadruplexes, and presumably remodel these structures to give the observed increases in polysome loading and translation 8.

1.5.6 Viral nucleic-acid and bacterial pathogen sensing

RNA helicases like RIG-I have been implicated in unanticipated cellular processes, such as innate immunity, where RIG-I functions as a pattern recognition receptor that identifies viral RNAs in the cytoplasm. RIG-I was shown to translocate on dsRNA in an ATP-dependent fashion, without unwinding the duplex 100. This activity of RIG-I is thought to aid the detection of viral RNAs, which can form long dsRNA during viral replication. In two independent studies, the DEAD-box helicase DDX41 was reported to sense viral DNA and bacterial cyclic di-nucleotides in the cytoplasm and subsequently activate the immune response via the STING adaptor 78, 101. Hence, RNA helicases may also act as pathogen sensors in cells.

1.6 Influence of cofactors on RNA helicase function

RNA helicases frequently interact with other proteins in the context of larger multi-protein complexes. Cofactors can positively or negatively influence helicase activity, and hence studies on these interactions are crucial for the understanding of in vivo activities of these proteins. Often, the helicase auxiliary domains aid in the interactions with other cofactors and facilitate the recruitment of the helicases to specific targets. Usually, cofactors are known to increase unwinding activity or NA-stimulated ATPase activity of the helicases. For instance, the DEAD-box helicase eIF4A is stimulated by eIF4B, eIF4H, and eIF4F in unwinding and ATPase activities 102-104.

A prominent example of cofactor-mediated stimulation of an RNA helicase was demonstrated for the DEAD-box helicase Dbp5p/DDX19, which is involved in mRNA export. The RNA binding and ATPase activity of Dbp5p is stimulated by the protein Gle1p and inositol hexakisphosphate (IP6) 105. Similarly, the G-patch protein Ntr1p stimulates the unwinding activity of the DEAH/RHA helicase Prp43p and aids in the release of the excised lariat intron during pre-mRNA splicing 62. Likewise, the unwinding activity of the Ski2-like helicase Brr2p is stimulated by Prp8p but, interestingly, its ATPase activity is partially inhibited 106. On the other hand, cofactors can also inhibit ATPase activity and can interfere with RNA binding or prevent association with other proteins. Prominent examples are the inhibition of eIF4A-III ATPase activity in the exon-junction complex by MAGOH and Y14 107, and the inhibition of the RNA binding activity of Dbp5p by NUP214 108.

1.7 Studies on DHX36 and DDX41 in this thesis

In this thesis, I address the mechanistic features of two distinct RNA helicases, one a DEAH/RHA family helicase called DHX36, and the other a DEAD-box family helicase called DDX41. To devise physical models for how these helicases act, it is crucial to define where a given helicase binds its RNA target, whether and how the RNA structure is changed, and the manner in which ATP is utilized for this reaction. Besides, it is critical to elucidate how the helicases achieve substrate selectivity in the cell, and how the biochemical functions of these helicases are modulated by other factors, which invariably surround the helicase in large multicomponent complexes.


1.7.1 The DEAH/RHA helicase DHX36

The DEAH/RHA helicase DHX36 is highly conserved in metazoans and involved in several aspects of gene regulation (highlighted in Chapter 2). DHX36 can remodel a range of NA structures, including G-quadruplexes, and has been increasingly studied in recent years owing to the many biological functions in which it participates (transcription, RNA processing, translation). DHX36 is composed of the SF2 helicase core and several conserved auxiliary domains. The roles of the auxiliary domains in the NA remodeling process of DHX36 are unclear. In Chapter 3, I defined the function of three auxiliary domains of DHX36 using a combination of structural and biochemical approaches. Additionally, I characterized a previously unexplored activity of DHX36, its ability to unwind RNA duplex structures. In Chapter 4, I investigated the coordination between nucleotide binding/hydrolysis and NA remodeling of DHX36, and demonstrated how DHX36 utilizes a nucleotide-dependent mechanism to achieve RNA substrate selectivity.

1.7.2 The DEAD-box protein DDX41

The later part of my thesis work focused on the human DEAD-box protein DDX41. We chose to study DDX41 due to the recent identification of some relatively common somatic mutations in DDX41 that promote malignant transformation of certain blood cells (highlighted in Chapter 6). The DDX41 gene is also frequently deleted/inactivated in leukemia patients. DDX41 belongs to the classical DEAD-box family of RNA helicases and has been implicated in pre-mRNA splicing. However, there is a pressing need for quantitative biochemical insights into the mechanism of action of this helicase. Additionally, its cellular targets are largely unknown. In Chapter 7, I detail a collaborative effort aimed at understanding the role of DDX41 in pre-mRNA splicing and clarifying the effects of DDX41 mutations on helicase function, splicing, and leukemia using in vitro and cell culture systems. In addition, I report on the biochemical characterization of DDX41 using established in vitro approaches.


Chapter 2

Introduction

The DEAH/RHA helicase DHX36 is involved in the regulation of gene expression

2.1 Gene regulation in eukaryotes

Even though all cells in the human body contain the same genetic code, differential gene regulation is responsible for the specific functions of different cell types. The orchestration of cellular processes involves the ability to tightly control protein levels at any given point, which ultimately enables the cell to differentiate, proliferate, and adapt to environmental conditions. Gene expression is controlled at multiple levels during the developmental process. First, at the transcriptional level, the amount of mRNA that is produced from a particular gene is controlled. Second, at the post-transcriptional level, the fate of the mRNA is controlled via multiple processes. The next level of control is through events that regulate the translation of mRNA into functional proteins. Finally, even after the protein is made, post-translational modifications can affect its activity. Although higher eukaryotes respond to environmental signals via gene regulation, organismal development is orchestrated through an additional layer of regulation from cell-to-cell interactions within the organism. This highly controlled network of regulatory interactions is necessary to produce a complex organism or phenotype.

2.1.1 Transcriptional regulation

DNA transcription is the chemical process through which information is transferred from DNA to RNA with the help of transcription factors and enzymes such as RNA polymerases. In eukaryotes, DNA is compacted into chromatin, the basic unit of which is the nucleosome that contains DNA wrapped around octameric histone complexes 109. The first level of transcriptional regulation is achieved through the accessibility of packaged DNA for transcription factors and polymerases. There are many varieties of active and inactive chromatin present in a given cell type. Through processes such as methylation and demethylation of DNA and post-translational modifications (PTMs) of histones, chromatin architecture can be modified, thus resulting in the regulation of gene expression 110. For instance, transcriptionally inactive chromatin is associated with repressive histone modifications, whereas transcriptionally active chromatin is relatively accessible and associated with actively transcribed genes and active histone modifications 111.

Another level of transcriptional regulation is accomplished via the binding of transcription factors (TFs) to specific cis-acting DNA sequences. Promoter regions on DNA are one type of cis-acting sequence that bind basal transcriptional machinery, RNA polymerase, and general TFs 112. Enhancer regions bind activator proteins and upregulate the transcription of a gene by enhancing the affinity of RNA polymerase for the promoters. In contrast, silencer regions bind repressor proteins and function to downregulate gene transcription. In addition, PTMs control every aspect of TF function. PTMs control sub-cellular localization of TFs, the binding strength of TFs to DNA, and the activity of their activation/silencing functions 113. Upon a distinct stimulus, many cytoplasmically localized TFs are activated and subsequently translocated to the nucleus where they bind their DNA targets and regulate gene expression. Another important layer of regulation is achieved via control of RNA polymerase II (RNAP II) activity, which is determined by the phosphorylation status of the serine residues in the CTD of RNAP II 114.


While the regulatory processes listed above determine the synthesis of mRNA, the fate of the individual mRNA is under the control of processes summarized by post-transcriptional gene regulation.

2.1.2 Post-transcriptional regulation

mRNA, along with other RNA molecules (tRNA and rRNA), is part of the machinery used to synthesize proteins. Post-transcriptional regulation involves a complex and diverse set of processes that dictate the lifecycle of an mRNA, including splicing, capping, polyadenylation, mRNA export, stability, translation, and mRNA decay 115. Like transcriptional regulation, post-transcriptional regulation has a major impact on protein translation. RNA-binding proteins (RBPs) and processing factors that associate with RNAs during their lifetime are key players in post-transcriptional gene regulation 116. Studies suggest that the coupling of transcriptional and subsequent post-transcriptional steps by RBPs is important in determining how, when, and where to translate functionally related subpopulations of mRNAs 117.

Even though all mRNAs go through the same choreographed set of events, each mRNA is presumed to be packaged into a unique mRNP 118. For instance, the PTM status of the CTD of RNAP II dictates the RBPs that are destined to associate with the nascent mRNA. The process of splicing is regulated by an interplay of cis-acting sequences on the pre-mRNA and several trans-acting protein factors. Alternative splicing can lead to altered function, localization, activity, or stability of the resulting protein 119. Splicing regulation can result in the expression of different protein isoforms, often in a tissue- or cell-type-specific manner. The 3’-end processing of mRNAs, which includes endonucleolytic cleavage and poly(A) tail addition, is crucial for ensuring mRNA stability and efficient translation. Recent discoveries have revealed that some mRNAs contain more than one polyadenylation site within their 3’ UTRs. Alternative polyadenylation (APA) can generate mRNAs with alternative 3’-ends, thus expanding transcript diversity. APA has expanded the complexity of the transcriptome and proteome, thereby potentially regulating the localization, translation efficiency, function, and stability of the target RNA 120.

Another critical step of regulation is by covalent alteration of the mRNA by deadenylation, adenylation, uridylation, editing, and/or base modifications. Such modifications can change the binding of mRNP components and, ultimately, the function of the mRNP. The mRNA export process can also be regulated since improperly processed mRNAs are retained in the nucleus, and defective transcripts are degraded, while the export of fully processed mature mRNAs is ensured. Once in the cytoplasm, the primary function of mRNPs is their translation into proteins. Dramatic mRNP remodeling occurs during this stage where translationally incompetent mRNPs are degraded by the decay machinery, while mRNAs deemed translationally competent are ready for the next step. Regulators controlling the translation of specific mRNAs (miRNAs or RBPs) typically affect multiple steps of the translation process. Most of the regulation happens at the translation initiation stage, wherein several active and passive factors interfere with the cap-dependent assembly of the pre-initiation complex (PIC).

The eukaryotic translation initiation factor 4E (eIF4E)-binding protein 1 (4E-BP1) interferes with translation initiation by binding to eIF4E and preventing its interaction with eIF4G, thereby inhibiting PIC assembly and repressing translation. Translation is often controlled by specific signals arriving from inside or outside the cell. For instance, global regulation of protein synthesis occurs via the mTOR pathway, where mTORC1 phosphorylates 4E-BP1 in response to upstream stimuli 121. The phosphorylation of this protein results in its dissociation from eIF4E and the recruitment of eIF4G to the 5′-cap, thereby allowing cap-dependent translation initiation to proceed. Additionally, during cellular stress, global repression of translation occurs via eIF2α phosphorylation and control of translation factors 122. In another mode of regulation, translation of only defined groups of mRNAs is modulated. This is often effected by sequence/structural elements in 5′ UTRs, which can decrease the efficiency of the recruitment of the 43S complex, decelerate the scanning process, or altogether abolish 43S progression 123.

Recent developments point to another interesting mode of translation regulation involving the presence of an upstream open reading frame (uORF) in the 5’ UTRs. uORFs, often coding for a nonfunctional polypeptide, have been shown to decrease translation of the downstream-located major coding sequence 124,98,8. Translational silencing is also achieved by a mechanism known as RNA interference (RNAi). (miRNAs) – small non-coding RNAs of 22 nucleotides in length – are known to repress translation via base pairing to sequences located in 3’ UTRs of target mRNAs, thereby exerting translational control on specific mRNAs 123. Protein synthesis depends not just on the rate of translation but also on the stability of the mRNA. Hence, mRNA decay offers another layer of regulation, which is often induced by miRNAs, quality control factors, and RBPs binding to specific regulatory elements on the 3’ UTRs. An example of such regulatory elements are the AU-rich elements (AREs) found in the 3’ UTR of about 5% of human genes. Typically, AREs recruit specific RBPs that either stabilize the RNA or promote the degradation of the transcript by recruiting the mRNA decay machinery 125-126. The final layer of regulation is offered by sequestration of translationally repressed mRNPs (or those

undergoing decay) into stress granules and P-bodies. On the other hand, mRNAs can also be transported to specific parts of the cell for localized translation. These mechanisms allow the cell to respond to its needs and to external stimuli 125.

The proper functioning of developmental and cell cycle programs relies on the coordination of all of these events with high spatial, temporal, and tissue-specific precision 127. The interactions of RBPs with sequence and structural elements on the mRNAs form the basis of all these events. In the next sections, the structural aspects of NAs and the involvement of RBPs in gene regulation are discussed in greater detail.

2.2 Nucleic acid structures and gene regulation

Eukaryotic cells contain thousands of different NAs, and DNA/RNA structures in general influence nearly every step in eukaryotic gene expression. It is essential to understand how DBPs/RBPs interact with different DNA/RNA structures and how they discriminate between different potential binding sites in these NAs.

2.2.1 Classical DNA/RNA structures and their functions

In addition to the canonical right-handed DNA double helix, other unusual DNA secondary structures are predicted to have functions during transcriptional regulation in eukaryotes. Some of these include Z-DNA, cruciforms, triplexes, and G-quadruplex structures 128. Sequence motifs that form Z-DNA fold into a left-handed helix and are enriched at transcription start sites (TSS) of genes. Due to negative supercoiling, B-DNA that contains ≥6-nucleotide inverted repeats (cruciform motif) can adopt a four-armed cruciform secondary structure that resembles a Holliday junction. Such motifs are located near replication origins, breakpoint junctions, and near sites of gross chromosomal rearrangements 129. Triplexes in which the third strand is antiparallel to the DNA duplex

occur when single-stranded DNA forms Hoogsteen hydrogen bonds in the major groove of purine-rich double-stranded B-DNA 130. In mammals, these structural motifs are enriched in the introns of a variety of essential genes. Most of these non-B-form secondary structures are hypothesized to cause genomic instability by causing deletions, double-strand breaks, and translocations. These structures may impact chromatin architecture, and the onset and regulation of replication, by affecting the level of superhelicity and thus the binding of specific protein factors.

Single-stranded RNA has a greater propensity to form secondary structures than double-stranded DNA due to the absence of a competing complementary strand. The four basic secondary structure elements in RNA are helices, loops, bulges, and junctions 18 (Fig. 2.1). The helices are A-form duplexes formed in self-complementary base-paired regions. Loops and bulges are regions in the double-stranded RNA with mismatched (e.g., AG, UC) or unmatched (unpaired) bases terminated by one or more helices 131. RNA junctions are constructs where two or more helices meet, and they usually contain unmatched bases. Helices combined with a loop on top form stem-loops or hairpin structures. The overall molecular architecture of the secondary structure is stabilized by Watson-Crick (GC and AU) and other (e.g., GU wobble) base pairing motifs 18.

Figure 2.1. RNA secondary structural elements: junctions, stems, loops and bulges. The double-strands (black) of RNA stems are stabilized by complementary base pairs (e.g., AU, GC, GU), shown schematically as green lines. The number of unmatched bases (e.g., A, C, U, or G, red lines) and mismatched base pairs (e.g., AG, GA, AC, CA, red lines) in junctions, bulges and loops can vary. Several types of RNA junctions and internal loops are depicted. The figure is from Ref: 18, with permission to reuse.

RNA can sometimes adopt three-dimensional assemblies as complex as tertiary protein structures. Tertiary RNA structures are the consequence of higher-ordered interactions between distinct secondary structural elements. The L-shaped tertiary structure of tRNAs plays a crucial role during translation by binding proteins and by fitting into the ribosomal A-, P-, and E-sites. Among the most prevalent RNA structures is a motif known as the pseudoknot, in which a stretch of nucleotides within a hairpin loop pairs with nucleotides external to that loop 132. Pseudoknots play diverse roles in biology, like forming the catalytic core of various ribozymes 133, self-splicing introns 134, and telomerase 135. Additionally, pseudoknots play critical roles in altering gene expression by causing ribosomal frameshifting in many viruses 136.

Most RNA structures serve as interaction platforms for proteins and in turn influence the function of the polypeptide. For example, structured internal ribosomal entry sites (IRESs) on mRNAs can positively affect and promote translation by serving as alternative ribosome recruiting sites (cap-independent translation initiation) 137. Sometimes, processing and maturation of RNAs is mediated via enzyme binding and recognition of specific RNA structures, as in the processing of pri- and pre-miRNAs by the Drosha and Dicer ribonucleases. RNA structures can inhibit or aid the binding of spliceosomal components to the pre-mRNA or can increase splicing efficiency by bringing important sequences into proximity 138. Importantly, numerous pieces of evidence suggest that 5′ UTR structures in mRNAs may block or recruit ribosomes and other regulatory factors to control gene expression and enable a rapid, dynamic response to diverse cellular conditions 139. Lastly, RNA structures play a critical role in mRNA localization 140, and also positively or negatively influence mRNA stability and decay by acting as functional obstacles or as protein recruitment platforms 141. RNA secondary structure thus influences nearly every step in eukaryotic gene expression, and the possibilities are further extended by another class of nucleic acid secondary structures, the G-quadruplexes.

2.2.2 G-quadruplexes

One of the earliest reports on the unusual nature of guanosine came from Bang et al. in 1910, who found that highly concentrated solutions of guanylic acid formed a gel. Years later, the assembly of guanylic acids into tetrameric structures was demonstrated through X-ray diffraction studies 142. Four guanines can form a square planar arrangement in which each guanine is connected to neighboring guanines via Hoogsteen base-pairing to form a G-tetrad with eight hydrogen bonds (also known as a G-quartet, Fig. 2.2a). When two or more G-tetrads stack onto one another, they form a stable helical structure called the G-quadruplex (Fig. 2.2a), with intervening sequences forming single-stranded loops. The structure is stabilized by physiologically relevant monovalent cations like K+, Na+, and NH4+ that intercalate between the G-tetrads, neutralizing the electrostatic repulsion of guanine O6 oxygens (Fig. 2.2a). Larger cations coordinate more oxygen atoms than smaller ones, thus contributing differently to G-quadruplex (GQ) stability 143. Li+ is often thought to destabilize GQ structures, and under our experimental conditions, the GQ structures formed in the presence of Li+ have relatively lower stability. Other research groups observe that Li+ plays a neutral role wherein it neither stabilizes nor destabilizes GQ structures 143-144. GQ stability is dependent on several other factors; for instance, smaller loops (1-2 nt) and a larger number of stacked G-tetrads result in more stable GQ structures 143.


Figure 2.2. G-quadruplex (GQ) structures and their topological variants. (a) A G-tetrad is formed by four guanosines that are hydrogen bonded (dashed lines) by Hoogsteen base pairings, and a monovalent cation (M+) interacts with the O6 atoms. Stacking of ≥3 G-tetrads with intervening loop regions in the GQ structure. (b) Schematic representations of an intramolecular parallel and antiparallel 4-tiered GQ with one NA strand; an intermolecular parallel 4-tiered GQ with four NA strands; and a bimolecular antiparallel 4-tiered GQ with two NA strands. (c) Side view and (d) bottom view of the structure of the GQ in the nuclease hypersensitive element (NHE) III1 region of the human c-MYC promoter (PDBid: 1XAV). The sugar-phosphate backbone is represented by the orange ribbon. The figures in (c) & (d) are from Ref: 21, distributed under the terms of the Creative Commons License (http://creativecommons.org/licenses/by-nc/4.0/).

GQ structures can adopt diverse topologies depending on the conformation of the glycosidic bond (syn or anti); the monovalent cation present (K+ or Na+); the number of participating NA molecules (intramolecular, bimolecular, or intermolecular; Fig. 2.2b); the directionality of the strands, leading to parallel, antiparallel, or hybrid structures; and the length and orientation of the bridging loops (propeller, lateral, diagonal) 143-145 (Fig. 2.2b).

GQ structures can form in DNA and RNA when harboring the classic putative GQ forming sequence motif of GxL1-7-GxL1-7-GxL1-7-GxL1-7, where x = 3-6 and L corresponds to any nucleotide (A, G, C, T, or U) 144. Despite numerous similarities, RNA GQs are often more thermodynamically stable, more compact, and less hydrated than DNA GQs 146. The presence of a 2’-OH on the ribose sugar results in extended hydrogen bonding interactions within the RNA GQ, making them more stable. Additionally, the 2’-OH strongly favors the orientation of the base in the anti-conformation and a C3’-endo puckering of the ribose sugar, thus preventing structural heterogeneity in RNA GQs. Consequently, RNA GQs adopt a parallel conformation where all four strands are in the same direction, whereas DNA GQs can be in parallel, anti-parallel, or mixed conformations 145.
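The putative GQ-forming motif given above lends itself to a simple pattern search. The following is a minimal sketch, not the procedure used in this thesis or in any cited tool: it assumes the motif is read as four G-tracts of 3-6 guanines separated by three loops of 1-7 nucleotides of any base, and the names PQS_PATTERN and find_putative_gq are hypothetical.

```python
import re

# Assumption: four G-tracts (3-6 G each) separated by three loops (1-7 nt, any base).
# Loops may themselves contain G, as in the motif definition ("any nucleotide").
PQS_PATTERN = re.compile(r"(?:G{3,6}[ACGTU]{1,7}){3}G{3,6}")


def find_putative_gq(seq):
    """Return (start, end, matched_sequence) for each non-overlapping motif hit.

    Works on DNA or RNA; the input is upper-cased before searching.
    Coordinates are 0-based, end-exclusive.
    """
    seq = seq.upper()
    return [(m.start(), m.end(), m.group()) for m in PQS_PATTERN.finditer(seq)]


if __name__ == "__main__":
    # Illustrative sequence only; not taken from any gene discussed in the text.
    for hit in find_putative_gq("TTGGGAGGGTGGGGAGGGTT"):
        print(hit)
```

A sequence match of this kind only indicates folding potential; whether a given site actually folds depends on cations, loop length, and cellular context, as discussed in the surrounding text.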

2.2.2.1 Genomic mapping of G-quadruplexes

Systematic computational approaches using simple algorithms (Quadparser) have identified > 370,000 putative sequence motifs (of the type G≥3L1-7-G≥3L1-7-G≥3L1-7-G≥3L1-7) within the human genome with potential to form GQs 147-148. However, additional sequence motifs, forming GQs consisting of only two G-quartets or with a more variable loop length, have been found recently 149. A high throughput analysis that involves GQ-dependent polymerase stalling and next-generation sequencing identified a genome-wide map of DNA GQ structures. This technique, combined with the addition of GQ stabilizing ligands, resulted in the identification of more than 700,000 DNA GQ (dGQ) forming sites in the human genome 150. These studies have revealed that dGQ motifs are not randomly distributed across the genome, but are instead clustered in functional genomic regions. They are over-represented in replication origins, promoter regions, transcription start sites, mitotic and meiotic double-strand break sites, and telomeres 148-150. Importantly, they were found to be enriched at the sites of proto-oncogenes while being underrepresented in tumor suppressor genes 148, 151. Additionally, genome-wide mapping of dGQs was done in situ by using GQ-structure specific antibodies as probes for GQ-specific ChIP-Seq analysis 152. Using this approach, dGQs were detected in regulatory, nucleosome-depleted chromatin regions that are usually highly transcribed 153.

Similarly, transcriptome-wide mapping of RNA GQ structures identified ~4000 RNA GQ (rGQ) forming sites in HeLa cells, made possible by combining rGQ-induced reverse-transcriptase stalling and high-throughput sequencing (RT-stop profiling). This method was able to map ~13,500 additional rGQ sites by the addition of pyridostatin, a GQ stabilizing ligand 154. The transcriptome-wide distribution of rGQ sequence motifs revealed their enrichment in 5´ and 3´ UTRs of mRNAs 154. Other studies showed that many non-coding RNAs, especially TERC (telomerase RNA component) and TERRA (telomeric repeat-containing RNA), harbor rGQ motifs 155. However, the above method used purified cellular RNA and hence did not reflect the in vivo folding states of endogenous potential rGQ forming sequences.

To overcome this, the Bartel laboratory combined RT-stop profiling with elements of chemical probing techniques like DMS-seq (dimethyl-sulfate modification of RNA and subsequent sequencing) 156. These analyses showed that rGQs were largely unfolded in eukaryotic cells, suggesting that RNA-binding proteins (RBPs) and helicases were involved in actively maintaining them in an unfolded state. However, the studies mentioned above fail to identify transcripts that are expressed at low levels and sequences that may transiently fold to form rGQs only during a fraction of the transcript’s lifetime. In addition, these methods selected for polyadenylated RNAs, missing other classes of RNAs with the potential to form rGQs. To overcome these limitations, Monchaud et al. used a small-molecule-based chemical crosslinking approach to capture global snapshots of transiently folded rGQs in the human transcriptome 157.


2.2.2.2 In vivo existence of G-quadruplexes

Foremost, the existence of GQ specific DNA and RNA helicases provides circumstantial evidence for the presence of GQs in vivo. More recently, cellular imaging studies using GQ-specific antibodies and probes have provided evidence for GQ formation in vivo. The first evidence came from immunofluorescence experiments using telomeric GQ-specific antibodies (Sty3 and Sty49) for the ciliate model organism Stylonychia lemnae 158. Recently, two independent studies on human cell lines using the GQ-specific antibodies BG4 and 1H6, carried out on in situ fixed nuclei, showed punctate GQ staining of genomic DNA that was DNase I sensitive 159-160. dGQs were predominantly found to occur during S phase, suggesting a cell cycle dependence of GQ dynamics 159. Studies on rGQs are less extensive compared to their DNA counterparts. Nevertheless, immunofluorescence experiments using the BG4 antibody showed RNase A sensitive, but DNase I insensitive, cytoplasmic and nuclear staining, hence suggesting their existence 159. Despite the usefulness of these immunostaining approaches, limitations like the possibility of GQ folding during the process of fixing, permeabilizing, or staining cells have provoked skepticism in the community. Another recent approach called “in cell-NMR” has enabled the identification of GQs by injecting 13C- and 15N-labelled oligonucleotides into Xenopus oocytes 153.

2.2.2.3 Biological roles of G-quadruplexes

Given the in vivo identification of dGQs near replication origins, promoters, transcription start sites, and telomeres, functional roles for dGQs in replication and transcription related processes could be postulated. Importantly, the enrichment of rGQs in mRNA regions with regulatory functions (UTRs) hints that rGQs regulate RNA metabolism. Below, the proposed biological functions of DNA and RNA GQs are discussed in more detail.

Replication and genome instability: Evidence for the presence of dGQs near replication origins has led to the hypothesis that dGQs may be involved in the regulation of replication initiation. In the absence of helicases that remodel dGQs, DNA polymerase movement is hindered, which in turn leads to replication stalling, DNA damage, and genome instability. For instance, cells from patients with Fanconi anemia carrying FANCJ helicase mutations displayed genome-wide deletions in G-rich sequences with dGQ forming potential 161. Several genetic assays have shown that in the absence of the yeast helicase Pif1, which acts at dGQ motifs, DNA replication slows and double-strand breaks occur 162-163. The regulator of telomere elongation helicase 1 (RTEL1) was shown to remodel telomeric dGQ structures and to maintain telomere integrity in mouse cells 164. Despite these findings, direct experimental evidence supporting the involvement of dGQs in replication related processes is still required to advance our understanding.

Transcription: The mammalian c-MYC gene is the best-studied system for dGQ involvement in transcription. MYC and KRAS are proto-oncogenes whose expression is linked to cell proliferation. Over-activation of these genes is observed in >80% of human cancer cells and has been shown to promote tumor formation 165-166. These proto-oncogenes contain dGQ motifs in their promoters, which form dGQ structures in vitro 167. Using reporter assays and footprinting experiments, the dGQ motif in the c-MYC promoter was shown to repress transcription 168. In two separate studies, a GQ-stabilizing ligand reduced mRNA levels of MYC and KRAS and showed antitumour activity in mice 168-170. More recently, the GQ-ChIP-seq data, the genomic binding site data of the transcription factor SP1 153 and of the human DNA helicases XPB and XPD, which are involved in transcription 171, and the co-localization of GQ-specific antibodies within transcriptionally active regions 153 together support a role for dGQs in transcriptionally active regions. Mutations in the dGQ remodeling DNA helicases Werner syndrome protein (WRN) and Bloom syndrome protein (BLM) have been associated with cancer predisposition and premature aging. Mutations in these helicases have been shown to alter the regulation of certain genes that contain promoters enriched with dGQ motifs, consistent with a link to transcription 172-174. Overall, dGQs are postulated to regulate transcription by slowing the polymerase, or by recruiting protein factors that activate or repress transcription. Further studies are needed to understand the mechanistic details of these associations.

Transcription termination and pre-mRNA 3’ end processing: rGQs may function in both the nucleus and cytoplasm and exert a wide range of effects on RNA metabolism. Table 2.1 provides a summary of the biological roles of rGQs. The RNA that emerges from the transcriptional machinery can hybridize with the DNA template to form RNA-DNA hybrids, R-loops. When R-loops contain G-C rich sequences, they can fold into hybrid GQs. This kind of GQ has been implicated in the inhibition of transcription and transcriptional termination. The helicase senataxin (SETX) and exoribonuclease Xrn2 have been implicated in nascent RNA release and Pol II termination 175-176. Transcriptional termination is tightly linked to mRNA 3’ end polyadenylation, wherein the cleavage/polyadenylation machinery 177 recognizes cis-acting sequence motifs and/or structural elements (rGQs). This rGQ mediated regulation of alternative polyadenylation was demonstrated for Fragile X mental retardation autosomal homolog 1 (FXR1) mRNA, resulting in the production of either a shorter or longer isoform 178.

Table 2.1. Biological roles of RNA GQs.

pre-mRNA splicing: rGQ forming regions are common in mammalian introns, especially near 5’-splice sites (5’-SS), and as cis-acting regions, they can regulate pre-mRNA splicing by acting as splicing enhancers or silencers. For instance, intron 6 of the human telomerase reverse transcriptase pre-mRNA (hTERT) contains an rGQ-forming motif that was proposed to act as an intron-splicing silencer. Stabilization of this rGQ impaired hTERT splicing and thus telomerase activity 179. Similarly, the Bcl-X pre-mRNA contains two rGQ forming regions near two alternative 5’-SS. One study showed that certain rGQ stabilizing ligands antagonized the major 5’-SS that expresses the anti-apoptotic isoform of Bcl-X and instead activated the alternative 5’-SS that expresses the pro-apoptotic isoform 180. In contrast, another study showed that an rGQ located in intron 3 of the p53 pre-mRNA enhanced the splicing of the adjacent intron 2, leading to differential expression of distinct p53 isoforms 181. Although our mechanistic understanding of these processes is limited, rGQ mediated splicing regulation is thought to involve the recruitment of effector proteins involved in splicing, like hnRNP H/F and FMRP.

mRNA localization: In specialized cells like neurons, transport of transcribed mRNAs to localized sites of protein synthesis, like synapses, is a key post-transcriptional mechanism. rGQs found in 3’ UTRs are emerging candidate cis-elements involved in subcellular sorting of mRNAs. Using reporter assays, Subramanian et al. showed that rGQs in the 3’ UTRs of postsynaptic density protein 95 and Ca2+/calmodulin-dependent kinase II contribute to the transport of these mRNAs in neurites in vivo 182. Several GQ-associated RBPs were found to co-localize to molecular motor-associated neuronal granules, for transport along microtubules to localized sites of mRNA translation 183.

mRNA translation: In eukaryotes, protein synthesis is a complex energy-consuming process that is highly regulated at the translation initiation stage. The 5’ UTR is a crucial element for translation initiation and translational control, wherein RNA secondary structures are thought to play important regulatory roles. Extremely stable structures like rGQs are thought to regulate translation initiation by mechanisms that may involve inhibition of 43S pre-initiation complex scanning, interference with cap binding, or recruitment of different RBPs 184-186. Translational inhibition was observed in vitro and in vivo when 5’ UTRs of mRNA reporters contained GQs from endogenous transcripts like FMR1, NRAS, Zic-1, MT3-MMP, Bcl-2, and TRF2 187. The possible roles of rGQs in mRNA translation are shown schematically in Fig. 2.3. In rare instances, rGQs in 5’ UTRs have also been shown to stimulate translation, as in the case of VEGF and FGF2, wherein rGQs are part of internal ribosome entry sites (IRESes) and facilitate non-canonical cap-independent translation 188-189.


Figure 2.3. Location and function of GQs involved in mRNA translation regulation. Repression of cap-dependent translation and enhancement of cap-independent translation due to rGQs in 5’ UTRs, ORFs, and 3’ UTRs. The genes containing rGQs in the regulatory regions are listed below. This figure is from Ref: 3 which is distributed under the terms of the Creative Commons License (http://creativecommons.org/licenses/by-nc/3.0/).

Although less investigated, rGQs within ORFs and 3’ UTRs have also been associated with translational regulation. rGQs in ORFs are proposed to act as roadblocks for elongating ribosomes, and as factors that enhance ribosomal frameshifting 190-193. Even though 3’ UTR rGQs have been associated with facilitating localized mRNA translation, repression of translation has been reported for a 3’ UTR rGQ in the proto-oncogene PIM1 194.

Non-coding RNAs and rGQs: A recent analysis identified at least 700 non-coding transcripts with potential rGQ forming motifs 195. TERRA (telomeric repeat-containing RNA), which is formed by RNAPol-II-dependent transcription of the C-rich strand of telomeric DNA, is an excellent example of a lncRNA that can fold into rGQs. Evidence shows that TERRA functions as an inhibitor of telomerase activity by interacting directly with both the telomerase holoenzyme and the RNA moiety TERC 196-197 (Fig. 2.4). The 5´-end of human TERC (RNA template of telomerase) contains an rGQ-motif. Recent studies showed that the formation of this rGQ impedes the folding of a helical motif named P1 that is crucial for boundary definition and accurate reverse transcription by the telomerase enzyme. When this GQ structure was actively resolved by a helicase, the P1 helix could form, and the telomerase was proficient for telomere extension 198.

Figure 2.4. Regulation of telomerase activity by RNA GQs associated with telomeres, TERRA, and TERC. The lncRNA TERRA folds into GQs, interacts with telomerase and TERC, and inhibits telomere extension by telomerase. The 5’-end RNA moiety of telomerase, TERC, can fold into a GQ that is recognized and resolved by RHAU/DHX36, resulting in the formation of a P1 helix template boundary required for optimal telomerase activity. The figure is reproduced from Ref: 14, with permission to reuse.

Additionally, many predicted 3’ UTR microRNA (miRNA) binding sites overlap with potential rGQ sites; for example, in the FADS2 mRNA, an rGQ was shown to prevent binding of miRNA 331-3p to its target site on this transcript 199. Another recent study showed that rGQs could affect the processing of primary-microRNA (pri-miRNA). The processing of pri-mir200c, pri-mir451a, and pri-mir497 was shown to be affected positively or negatively by rGQs located near the Drosha cleavage site 200.


2.2.2.4 G-quadruplex disease connections

As shown in section 2.2.2.3, the involvement of GQs in several steps of DNA and RNA metabolism highlights the importance of these secondary structures in the regulation of physiological (or pathological) processes. For instance, maintenance of genome stability is essential to prevent the development of diseases, including developmental defects, immune deficiency, cancer, and neurodegenerative disorders. Fanconi anemia, Werner syndrome, and Bloom syndrome all result from mutations in DNA GQ resolving helicases and display genome instability with a predisposition to cancer 201-202. Furthermore, loss of additional GQ resolving helicases like ATRX and XPD is associated with instability and loss of DNA repair in pancreatic neuroendocrine tumors and in cancer-prone diseases 203-204.

Recent evidence suggests that the proposed physiological roles of GQs are altered in disease states like cancer and neurological disorders. Six known vital cellular processes are aberrantly regulated in malignancies, and for each of these, there is at least one crucial gene with a GQ in the promoter or UTRs (Table 2.2), including vascular endothelial growth factor A (VEGFA) (angiogenesis) and telomerase reverse transcriptase (TERT) (limitless replication) 205. Notably, the GQs found in these genes exhibit diverse topologies, thus containing distinctive binding pockets that offer sites for specific drug targeting.

Table 2.2. RNA GQs and hallmarks of cancer. Examples of GQ containing genes implicated in cancer development and progression. Contents of the table adapted from Ref: 13 with permission to reuse.

GQs have been associated with neurological diseases (Table 2.3) via two distinct mechanisms: a) expansions of G-rich sequences (predicted to form GQs) that sequester RNA binding proteins, and b) mutations that affect the expression of GQ binding proteins 1. Both amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD) are caused by an expansion of GGGGCC (G4C2) repeats (>500 repeats) within the 1st intron of the C9orf72 gene 183, 206, whereas healthy individuals carry an average of just 2 repeats. The pathological expansion of G4C2 repeats leads to the production of abortive transcripts containing GQs. These GQ containing transcripts accumulate in RNA foci in the nucleus, sequestering several RBPs. The accumulation of nucleolin at these foci results in altered biogenesis of ribosomal RNA and nucleolar stress 207. Other proteins like splicing factors (hnRNPA1, hnRNPH), RNA editing factor ADARB2, and RBP hnRNPA3 are also sequestered in these foci, collectively resulting in disease through an RNA gain of function mechanism 208. On the other hand, these C9orf72 expanded repeats can also undergo repeat associated non-ATG (RAN) translation in all frames, producing dipeptide repeat (DPR) proteins 209. DPR proteins can aggregate in neuronal cytoplasmic and nuclear inclusions, further contributing to pathogenicity 210.
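The contrast between normal and expanded G4C2 alleles can be illustrated by counting uninterrupted GGGGCC units in a sequence. This is a minimal sketch for illustration only: the function name longest_g4c2_run is hypothetical, the input is assumed to be a plain sequence string, and real repeat sizing in patients relies on dedicated experimental assays rather than simple string matching.

```python
import re


def longest_g4c2_run(seq):
    """Return the largest number of consecutive GGGGCC units found in seq.

    Only perfect, uninterrupted repeats are counted; interrupted or
    imperfect repeats are treated as separate runs (an assumption of
    this sketch).
    """
    runs = re.findall(r"(?:GGGGCC)+", seq.upper())
    return max((len(run) // 6 for run in runs), default=0)


if __name__ == "__main__":
    # Synthetic example sequences, not real C9orf72 alleles.
    short_allele = "AT" + "GGGGCC" * 2 + "TT"
    long_allele = "AT" + "GGGGCC" * 600 + "TT"
    print(longest_g4c2_run(short_allele), longest_g4c2_run(long_allele))
```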

Table 2.3. Genes with potential GQ motif expansions implicated in neurological diseases. Table adapted from Ref: 1, with permission to reuse.


Fragile X syndrome (FXS) is caused by CGG repeat expansions in the 5’ UTR of the Fragile X Mental Retardation gene 1 (FMR1), which results in methylation and silencing of FMR1 and consequently the loss of expression of FMRP, a GQ binding protein 211. The FMRP protein is involved in GQ-mediated regulation of dendritic mRNA translation and localization, and thus misregulation of these processes is also linked to neurological disorders 212.

2.2.2.5 G-Quadruplex drug targeting

Since GQs are linked to genome instability, telomere biology, and gene regulation in cancer, targeting GQs with small molecules is being pursued as an attractive therapeutic strategy. The diversity observed in GQ folding patterns and loop lengths makes them amenable candidates for specific drug targeting. For instance, the c-MYC oncogene is overexpressed in > 80% of tumors, including gastrointestinal, ovarian, breast, and non-Hodgkin’s lymphoma tumors 205, 213. The GQ ligand TMPyP4 was shown to stabilize the GQ formed in the c-MYC promoter element, subsequently resulting in the inhibition of c-MYC expression 168.

Highly specific GQ ligands like PDS and RHPS4 have been shown to induce DNA double-stranded breaks, to activate DNA repair pathways, and to activate polyADP-ribose polymerase 1 (PARP1) 214. Hence, these GQ ligands are being used in combination with other small-molecule inhibitors for cancer therapy, to produce synergistic effects. A recent study showed that the GQ ligand TMPyP4 could bind C9orf72 G4C2 repeat expansions and distort their structure, in turn disrupting interactions with hnRNPA1 and SF2 215. Another study showed that GQ ligands targeting G4C2 repeats were able to decrease RAN translation and reduce RNA foci in patient derived neurons and transfected cultured cells 216. These examples establish a proof of principle that targeting GQ repeats with small molecules is a viable therapeutic strategy for certain neurological disorders.

Recent advances have provided substantial evidence for the existence of GQs in cells and their roles in gene regulation. Nevertheless, the existence and relevance of rGQs have been a subject of controversy for a very long time 217. As mentioned earlier, genome-wide chemical mapping studies in mammalian cells 156 hinted at the existence of a cellular network responsible for the global unfolding of rGQs. There is thus more to be learned about the biological functions of GQs, and understanding the molecular mechanisms of GQ mediated gene regulation is very important. Moreover, GQ formation and especially GQ disruption are assisted by NA-binding proteins. Hence, insights into this network will help us understand why, when, and how GQs form and what their cellular consequences are.

2.3 NA binding proteins

Most biological processes are governed by protein-NA interactions. Uncovering the roles that protein-NA complexes play in the regulation of various stages of DNA and RNA metabolism continues to revolutionize our understanding of cell biology, normal cell development, and disease mechanisms. Proteins interact with DNA and RNA via electrostatic interactions, base stacking, hydrogen bonding, and hydrophobic interactions. These forces contribute to the sequence/structure specificity of protein-NA interactions to varying degrees. Additionally, protein oligomerization and/or multi-protein complex formation (e.g., transcription initiation factors, splicing complexes, miRNA processing machinery, etc.) can influence the affinity and specificity of protein-NA interactions. Importantly, the secondary and tertiary structures formed by NA sequences (particularly in RNA) provide an additional mechanism by which proteins recognize and regulate NAs.

2.3.1 General aspects of NA-binding proteins

The NA-binding function of a protein is usually found in distinct conserved domains within its tertiary structure. An NA-binding protein can have several different domains, or it can have multiple repeats of the same domain to achieve this function. Hence, the identity and relative arrangement of these domains are crucial for the function of the protein. Often, NA-binding domains present within different proteins can act in combination to achieve better binding specificity and affinity 218. Some proteins also utilize flexible regions outside the NA-binding domain to facilitate specific and non-specific interactions. The NA-binding domains are generally prone to post-translational modifications and are hot spots for disease mutations 219.

Proteins use a wide range of DNA-binding structural domains/motifs, including the homeodomain (HD), helix-turn-helix (HTH), helix-loop-helix (HLH), zinc fingers, winged helix, leucine zipper, high mobility group box (HMG), and the oligonucleotide-binding fold (OB-fold), to recognize DNA 220. Central to the field of protein-DNA interactions is the recognition of target DNA sites by transcription factors (TFs) that bind to promoters and control target gene expression by activating or repressing RNA polymerases. Recent advances in computational and structural studies have provided a vast amount of information about the protein-DNA recognition code. TFs find their cognate sites via several mechanisms, including sliding, hopping, and intersegmental jumping 220. The binding of a TF to its cognate site is based on physical interactions between the amino acid side chains of the TF and the atoms of DNA base pairs 221. Other mechanisms involve recognition of DNA structural features like major and minor grooves, backbone features, hydration shells, the flexibility of DNA bending, and unwinding 222.

On the other hand, proteins that recognize RNA are usually referred to as RNA-binding proteins (RBPs), and they generally participate in the formation of ribonucleoprotein complexes (RNPs) that interact with RNA to splice, protect, translate or degrade the mRNA. However, recent findings on the function of lncRNAs have challenged this convention by showing that RNAs may also regulate the function of RBPs 219. Usually, RBPs are evolutionarily conserved and contain structurally well-defined RNA binding domains (RBDs) such as a zinc finger, K-homology domain (KH), PAZ, PUF, PIWI, double-stranded RNA binding domain (ds-RBD), the oligonucleotide-binding fold (OB-fold), and the RNA recognition motif (RRM). Novel proteome-wide experimental approaches have expanded the number of proteins implicated in RNA binding. Importantly, these studies have uncovered additional RBPs that do not require classical RBDs. One such mode of binding is through intrinsically disordered regions (IDRs) in proteins, often in the form of repeats such as RGG, YGG, SR, DE, or KK 219. These motifs tend to engage in dynamic liquid-liquid phase separation in vivo, and can also aggregate in vitro, thereby forming hydrogels and amyloid-like fibers 223. Another unusual mode of binding is through shape complementarity, wherein the right spatial configuration of molecular interactions forms the basis of assembly in ribosomal and spliceosomal machinery 219. A significant fraction of cellular proteins (2%) can bind both DNA and RNA (DRBPs); these proteins have greater flexibility in generating cellular responses, thus playing a pivotal role in modulating gene expression, cell survival and homeostasis 224.


2.3.2 G-quadruplex (GQ) interacting proteins

As discussed in section 2.2.2, non-canonical NA secondary structures like GQs exhibit a high degree of structural polymorphism, which makes them suitable for differential recognition by proteins. The G4-interacting protein database (G4IPDB) lists over 70 human DNA and RNA GQ binding proteins identified from several individual studies 225. Given the complexity of the human proteome and technical limitations, this list may not be complete, and more GQ interacting factors remain to be identified. The biological consequences of interactions between proteins and GQs were already presented in section 2.2.2.3, indicating that GQ formation, resolution, and function in the cell must be tightly controlled. This regulation can be achieved by factors that promote the formation of GQs, prevent their folding, or actively resolve them.

In eukaryotes, telomeric DNA consists of repetitive sequences with the potential to form GQ structures. Many protein factors are postulated to play a role in the folding-unfolding equilibrium of telomeric DNA. These factors could act as chaperones that favor GQ formation, or could preferentially bind to an unfolded form and shift the folding equilibrium. The components of the shelterin protein complex, which include POT1 (protection of telomeres 1), TRF1 and TRF2 (telomere repeat binding factor 1 and 2), TPP1, TIN2 (TRF1 interacting protein 2) and RAP1 (repressor activation protein 1), are telomere end capping proteins and are also involved in the regulation of telomerase activity 21. Besides these factors, other proteins are recruited to telomere ends during distinct phases of the cell cycle. Replication protein A (RPA) has been associated with the unfolding of telomeric GQ structures. Similarly, the heterogeneous nuclear ribonucleoprotein A1 (hnRNP A1) and its proteolytic derivative, unwinding protein 1 (UP1), were shown to bind and destabilize GQ structures formed at telomere ends 21.

Given the abundance of GQ forming motifs in promoter regions of some proto-oncogenes, proteins like nucleolin, poly-[ADP-ribose] polymerase 1 (PARP-1), the mutant p53 protein, cellular nucleic-acid binding protein (CNBP), and hnRNPA1 have been implicated in GQ binding/processing in these regions 21. An important class of rGQ binding proteins is the heterogeneous nuclear ribonucleoproteins (hnRNPs), including hnRNPA1 and A2, hnRNP D, hnRNP BD1, and BD2 226-227. These RBPs have an RBD containing an Arg-Gly-Gly (RGG) box and an auxiliary domain rich in specific amino acids such as Gly, Asp/Glu or Pro, and are usually involved in multiple stages of NA metabolism (packaging, transport, splicing of the pre-mRNA, and translational regulation) 227. The AFF family of genes includes four members: AFF1/AF4, AFF2/FMR2, AFF3/LAF4, and AFF4/AF5q31. AFF2/FMR2, which is silenced in Fragile XE syndrome (FRAXE) associated intellectual disability, localizes to nuclear speckles and modulates alternative splicing via its interaction with rGQ structures. Additionally, AFF family proteins modulate the in vivo splicing efficiency of a mini-gene containing a GQ in one alternatively spliced exon 228. The silencing of the FMR1 gene due to CGG repeat expansions leads to loss of the fragile X mental retardation protein (FMRP) and is associated with fragile X syndrome. The FMRP protein itself can bind rGQs through its RGG box domain 229. Xrn1p/mXRN1p, the primary 5’-3’ exoribonuclease in cytoplasmic mRNA turnover, exhibited a preference for rGQ substrates in vitro 230.

On the other hand, several helicases have been shown to remodel GQ structures in vitro 230-231. Most helicases bind and remodel NA and NA-protein complexes in an ATP-dependent manner and play essential roles in virtually all aspects of NA metabolism 22. When such helicase-mediated pathways are misregulated, genome integrity is affected, resulting in defects in replication and gene expression, which in turn lead to mutagenesis, carcinogenesis, cell death/aging, and neurodegeneration. Therefore, identifying the NA substrates acted upon by helicases and understanding how the interactions involving helicases contribute to genome homeostasis has become an essential field of inquiry.

So far, at least 18 helicases have been shown to bind and/or remodel GQ structures, each with differing specificity for strand composition, orientation, and topology (Table 2.4) 231-235. Hence, a considerable level of redundancy in the GQ-interacting machinery can be expected. This is also consistent with the hypothesis from Bartel’s group that a robust and redundant cellular network of rGQ-disrupting proteins (helicases and non-ATP-dependent factors) is responsible for the global unfolded state of rGQs in eukaryotic cells 156. One of the key players in this network is the RNA helicase DHX36.

Table 2.4. List of helicases and selected features implicated in GQ regulation in vitro and/or in vivo. a Indicates whether the helicase requires substrates with a single-stranded overhang 3’ (3’-5’) or 5’ (5’-3’) to the structured region.

2.4 The DEAH/RHA helicase, DHX36

In the literature, this protein is referred to as RNA helicase associated with AU-rich element (RHAU) 126, due to its initial characterization as an AU-rich element binding protein involved in mRNA decay. It is also known as MLE-like protein 1 (MLEL1), named for its sequence similarity to Drosophila MLE (maleless), a gene required for dosage compensation 236, or as G4 resolvase 1 (G4R1) 237, because of its ability to bind and remodel G4 (GQ) structures in vitro and in vivo. DHX36 belongs to the DEAH/RHA family of the superfamily 2 (SF2) helicases 17, 238. DHX36 is conserved in metazoans and contains all the DEAH/RHA signature sequence motifs in its helicase core. It also contains a shared homology region in its extended C-terminus, consisting of a “winged helix,” a “ratchet domain,” and an oligonucleotide binding (OB) fold 6 (Fig. 2.5). The N-terminus of DHX36 contains a glycine-rich region, a di-RG motif, and a unique DSM motif (DHX36 specific motif), which is conserved among DHX36 orthologues 45 and endows the helicase with specificity for binding GQs (Fig. 2.5) 45.

2.5 Biological roles of DHX36

DHX36 has been implicated in the regulation of key cellular processes, including transcription, translation, the cellular stress response 239, and interferon response to bacterial and viral nucleic acids 240. In HeLa cells, two alternatively spliced isoforms of DHX36 were detected, with differential but not exclusive subcellular localization. Using fluorescence microscopy, the longer isoform 1 was shown to preferentially localize to the nucleus, and the shorter isoform 2, which lacked 14 amino acids within the helicase core region, was shown to localize to the cytoplasm 126. A separate study showed that DHX36 contained a nuclear localization signal in its N-terminus and predominantly localized to the nucleus, but with a significant presence in the cytoplasm 241. This study reported the presence of DHX36 in nuclear speckles, which are rich in pre-mRNA and RNA-binding proteins. DHX36 was also reported to interact with transcriptional regulators, including p68, p72, and HDACs, and to localize around nucleolar caps in transcription-arrested cells 241. A recent study used a more biochemical approach by fractionating HEK293 cells overexpressing isoform 1 or isoform 2, followed by immunoblotting. This study identified a cytoplasmic localization of both DHX36 isoforms, which was also consistent with its finding that DHX36 predominantly binds mature mRNAs 242. The discrepancy between these individual studies could be due to the usage of different cell lines. It could also be due to misfolding and/or mislocalization of transiently overexpressed tagged/fusion forms of DHX36 used in the fluorescence microscopy-based studies. However, endogenous DHX36 may be present in the nucleus at low levels, and it is possible that under certain conditions, such as high DHX36 levels or specific stimuli, the helicase partially translocates to the nucleus. The ability of DHX36 to remodel both DNA and RNA structures and to shuttle between the nucleus and cytoplasm suggests one or more roles in the regulation of gene expression at both the DNA and RNA levels.

Figure 2.5. The overall structure of B. taurus DHX36–G-quadruplex DNA complex. (a) Domain organization; G-quadruplex (G4) and ssDNA-interacting regions indicated. (b) Cartoon representation of the co-crystal structure of DHX36 bound to DNAMyc, color-coded as in (a). Spheres denote two disordered segments (blue, 20 and 53 residues in the crystallization construct and wild-type, respectively; and green, 13 residues). OB loops I and II (OI and OII) contact DNA (PDB 5VHE). This figure was adapted from Ref: 6 with permission from Springer Nature.

DHX36 is essential for embryogenesis and organogenesis in mice 243-244. Germline deletion of DHX36 causes embryonic lethality in mice 243, and its deletion in specific cell lineages results in strong developmental defects, including hemolytic anemia 243, severe heart defects 244, and azoospermia 245. A common underlying pattern observed in these conditions was the deregulation of genes involved in cell-cycle regulation, proliferation, and differentiation. Nevertheless, DHX36 is non-essential in HEK293 cells: knockout (KO) of DHX36 in these cells was tolerated, although it resulted in growth and morphological defects when compared to parental cells 242.

DHX36 was initially characterized as a regulator of mRNA stability via its binding to the AU-rich element of urokinase plasminogen activator (uPA) mRNA. This study demonstrated the interaction of DHX36 with exosome components, suggesting a role in mRNA degradation 126. Yet, a global microarray gene expression analysis in DHX36 knockdown cells revealed that changes in steady-state mRNA levels were only partially influenced by mRNA decay, suggesting that DHX36 might be involved in the synthesis and transcriptional regulation of many mRNAs 241. In a seminal study, DHX36 was shown to be responsible for most of the inter-molecular dGQ and rGQ remodeling activity in HeLa cell lysates 237, 246. This shifted the focus of the DHX36 field to a role in quadruplex biology. The genes regulated by DHX36 were initially identified through traditional immunoprecipitation and microarray analyses. DHX36 was linked to various cellular mechanisms, including telomere maintenance, transcription, pre-mRNA 3’-end processing, translation, miRNA-mediated gene regulation, lncRNA interaction, neuronal mRNA localization, and the response to stress and viral infection. In most of these studies, DHX36 was suggested to act by a GQ-mediated mechanism. However, the likelihood of DHX36 functioning in a GQ-independent fashion should not be overlooked.

In one such study, TERC (telomerase RNA component, hTR) was identified as the most enriched target among RNAs bound to DHX36 247. The 5’-end of TERC contains an rGQ-motif which, when folded into an rGQ structure, may impede the folding of another structure, the P1 helix 248. This P1 helix is crucial for template definition and hence accurate reverse transcription by the telomerase 198, 249. DHX36 was shown to interact with the 5’-end rGQ on TERC, and with the telomerase holoenzyme 198, 247, 250. It was shown that in vitro, DHX36 bound and remodeled the rGQ of TERC, thereby enabling the formation of the P1 helix critical for telomere elongation 198, 250. Consistent with this, DHX36 knockdown in HEK293 cells resulted in shortened telomeres and down-regulation of telomerase activity 198. Hence, a model for the role of DHX36 in the regulation of telomere homeostasis was postulated.

DHX36 was implicated in the transcriptional up-regulation of YY1 (a transcription factor overexpressed in breast, prostate, and cervical cancers) by binding and remodeling dGQs located in the promoter region of YY1. Further, in support of this notion, microarray data obtained from different breast cancer lines indicated a significant correlation between DHX36 and YY1 expression 251. The spermatogonia differentiation defects observed in DHX36-deficient germ cells were linked to the absence of dGQ remodeling and subsequent downregulation of the cell differentiation gene c-kit 245. DHX36 was associated with the regulation of 3’-end processing of p53 mRNA upon UV-induced DNA damage. It was shown to bind and remodel an rGQ within the 3’ UTR of p53, thereby providing a binding site for factors like hnRNP H/F that ensure correct end processing 252.

On a post-transcriptional level, DHX36 regulates the protein levels of the transcription factor Nkx2-5, which is important for cardiomyocyte proliferation. DHX36 was shown to bind and remodel rGQs located in the 5’ and 3’ UTRs of Nkx2-5 mRNA, and in turn, regulate translation and decay of this mRNA 244. In contrast, DHX36 was described to reduce the protein levels of a tumor suppressor called PITX1, which is downregulated in various cancers. Although DHX36 can bind an rGQ in the 3’ UTR of PITX1, gene repression was likely achieved via an independent miRNA-mediated pathway whose exact mechanism is still elusive 253. Aven, an oncoprotein, was shown to interact with DHX36 and remodel rGQs in the ORFs of mixed-lineage leukemia (MLL1, MLL4) proto-oncogenes, thereby regulating translation of these genes 192.

Interestingly, DHX36 was also shown to mediate the dendritic localization of neuronal precursor miRNA-134 254. Interactions of DHX36 with lncRNAs that are frequently dysregulated in human cancers have also been reported 255-256. GSEC, a lncRNA that contains a G-quadruplex-forming sequence and is upregulated in colon cancer, was shown to modulate cancer cell motility by acting as a molecular decoy that blocks DHX36 rGQ remodeling activity by preventing DHX36 binding to target RNAs 256. Finally, DHX36 was linked to the antiviral innate immune response, as DHX36, along with DHX9, DDX1, and DDX21, was identified as a cytoplasmic dsRNA sensor that uses the TRIF pathway to activate the type I IFN response in myeloid dendritic cells (mDCs) 240. A separate study in mouse embryonic fibroblasts (MEFs) emphasized the critical role of DHX36 in viral RNA recognition via activation of the RIG-I-PKR pathway, subsequent antiviral stress granule formation, and recruitment of the TRIM25 signaling molecule 257.

Recently, Sauer et al. used transcriptome-wide sequencing-based approaches to obtain unbiased, comprehensive insights into DHX36 targets and its role in gene regulation. DHX36 predominantly bound G-rich sites on more than 4500 mRNAs, many of which formed rGQ structures in vitro 242. Nevertheless, AU-rich mRNA regions were also identified as bona fide DHX36 binding sites. In HEK293 cells, loss of DHX36 resulted in the stabilization of target mRNAs in a helicase-dependent manner; however, the stabilized transcripts were found to be translationally incompetent and were instead sequestered into stress granules 242. Of note, this study showed that only a fraction of DHX36 is associated with actively translating polyribosomes. Interestingly, a separate study also indicated that DHX36 localizes (via its N-terminal region) to cytoplasmic stress granules in response to various cellular stresses 239. Despite earlier observations in HeLa cells suggesting the involvement of DHX36 in the synthesis and transcriptional regulation of many mRNAs 126, 241, the recent investigation by Sauer et al. in HEK293 cells showed no changes in the transcription rates of target mRNAs upon DHX36 loss. Along with the predominant cytoplasmic localization of DHX36 in HEK293 cells, this study provided strong evidence for the participation of DHX36 in post-transcriptional gene regulation 242.


In a parallel independent study in HeLa cells, Murat et al. observed a stronger association of DHX36 with actively translating monosome and polysome fractions. Using transcriptome-wide ribosome profiling in HeLa cells, they showed that the translational efficiency (TE) of around 1000 transcripts (comprising proto-oncogenes, transcription factors, and epigenetic regulators) was modulated by the DHX36-dependent remodeling of rGQs within their 5’ UTRs 8. However, this study did not investigate whether the effects on these transcripts resulted from direct binding of DHX36 within their 5’ UTRs. The conflicting data between the two recent studies might reflect the usage of different cell lines. In certain cancer cell types, e.g., HeLa cells, there might be a stronger association of DHX36 with the translational machinery to enable higher translational levels. This further highlights the importance of investigating DHX36 function in both rapidly proliferating cells and “normal” somatic cells, as this might provide insights into the unique functions of this helicase during cancer development.

2.5.1 Disease relevance

DHX36 was upregulated in more than 30% of human lung cancers deposited on the cBioPortal 258 and was shown to be upregulated in most human breast cancer cell lines 251. The DHX36-dependent transcripts identified in recent transcriptome-wide studies included many genes with established roles in cancer pathways, like MAPK3/ERK1, FOXM1, YY1, and ELAVL1 8, 242. DHX36 did not contain recurrent mutations associated with cancers, but it showed altered expression levels in nearly 8 out of 15 human cancers analyzed, suggesting its involvement in stimulating cancer pathways 8 (Fig. 2.6). As mentioned before, DHX36 is implicated in the host innate immune signaling response following viral infection. A more recent study demonstrated the upregulation of DHX36 upon PRRSV (porcine reproductive and respiratory syndrome virus) infection, where DHX36 is involved in the MyD88-p95 signaling cascade resulting in the activation of the pro-inflammatory NF-κB pathway 259. In addition, pathogenic viruses, including human immunodeficiency virus (HIV), human papillomavirus (HPV), and Epstein-Barr virus (EBV), have sequences with rGQ forming potential at genomic sites critical for their replication 260. Hence, an as yet unidentified connection between DHX36 and viral rGQs seems like an exciting possibility.

Figure 2.6. Expression profiles of DHX36 in various cancers. a) Mutation frequencies were recovered from the COSMIC database (COSMIC v82, http://cancer.sanger.ac.uk/cosmic). Identified mutations in BRAF, NRAS (frequently mutated in cancers), DHX36, DHX9, GAPDH and ACTB (non-mutated in cancers) in all human cancers were stratified into six classes: synonymous, nonsense and missense substitutions, frameshifts, in-frame deletions or insertions and others. The insert shows that DHX36 and DHX9 do not show any frequent mutations in human cancers. b) Expression profiles of DHX36 in normal (N) and cancer (C) tissues were recovered from the GENT database (http://medicalgenome.kribb.re.kr/GENT/) using data generated by Affymetrix U133plus2 platforms. DHX36 showed altered expression levels in brain, cervix, colon, kidney, liver and skin cancerous tissues. c) Expression profiles of DHX36 in normal and cancer tissues when combining data from tissues presenting altered expression profiles. The plot shows that DHX36 is overexpressed in cancer tissues. In the box plots, the central lines represent the medians and the other lines represent quartile boundaries. Points represent individual values outside the 10-90 percentiles. P-values were assessed using two-tailed Student’s t-tests. ns: non significant, *P < 0.05, **P < 0.01, ***P < 0.001 and ****P < 0.0001. This figure is from Murat et al. 8, which is distributed under the terms of the Creative Commons License (http://creativecommons.org/licenses/by-nc/4.0/).

2.6 NA remodeling mechanism of DHX36

DHX36 has an interesting ability to remodel both dGQ and rGQ structures. DHX36 was shown to be responsible for most of the inter-molecular dGQ and rGQ (GQ4) remodeling activity in HeLa cell lysates 237, 246. A monoclonal antibody raised against DHX36 caused a significant (> 50%) reduction in helicase activity of cell lysates towards dGQ4 237. Additionally, DHX36 knockdown in HeLa cells led to an 8-fold reduction in rGQ4 remodeling activity of cell lysates, compared to the activity in WT lysates 246. In vitro, DHX36 bound rGQ4 (apparent KD ~40 pM) with higher affinity than dGQ4 (apparent KD 75 pM) 246. In competition studies, rGQ4 inhibited dGQ4 remodeling by DHX36; additionally, dGQ4 were bound and remodeled more efficiently than regular duplex structures 237, 246. ATP-dependent remodeling of intra-molecular dGQs (GQ1) was demonstrated for DHX36 using a peptide nucleic acid (PNA) trap assay, and competition studies indicated that dGQ1 were bound and resolved better than dGQ4 261. A separate study reported that DHX36 uses a local non-processive mechanism to unwind dGQ structures, mimicking DEAD-box RNA helicases. Steady-state measurements of initial unwinding rates showed that DHX36 GQ-unwinding activity scaled with the stability of the DNA GQ substrate, but the ATPase activity was mostly independent of substrate stability 262.
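To put the reported apparent KD values in perspective, the short sketch below estimates the fraction of GQ substrate bound at a given DHX36 concentration. It assumes a simple 1:1 equilibrium binding model with trace substrate (so that free protein is approximately total protein); this model, the function name fraction_bound, and the example protein concentrations are illustrative assumptions and are not taken from the cited studies. Only the two apparent KD values (~40 pM and 75 pM) come from the text.

```python
def fraction_bound(protein_pM: float, kd_pM: float) -> float:
    """Fraction of substrate bound for simple 1:1 binding.

    Assumes trace substrate, so free protein ~ total protein:
        theta = [P] / ([P] + KD)
    """
    if protein_pM < 0 or kd_pM <= 0:
        raise ValueError("protein must be >= 0 and KD must be > 0")
    return protein_pM / (protein_pM + kd_pM)


# Apparent KD values quoted in the text (ref. 246), in pM.
KD_RGQ4_PM = 40.0
KD_DGQ4_PM = 75.0

# Hypothetical DHX36 concentrations (pM), chosen only for illustration.
for conc in (10.0, 40.0, 75.0, 500.0):
    r = fraction_bound(conc, KD_RGQ4_PM)
    d = fraction_bound(conc, KD_DGQ4_PM)
    print(f"[DHX36] = {conc:6.1f} pM  rGQ4 bound: {r:.2f}  dGQ4 bound: {d:.2f}")
```

Under this simplified model, half of the substrate is bound when the protein concentration equals the KD, so the roughly two-fold difference in apparent KD matters most at sub-nanomolar enzyme concentrations and becomes negligible once the protein is in large excess over both KD values.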

Significant progress has been made towards understanding how DHX36 targets and disrupts GQs. DHX36 was shown to specifically disrupt DNA and RNA GQ4 via ATP-driven translocation along the NAs 65. A separate study showed that DmDHX36 uses a combination of passive (destabilization) and active (translocation) mechanisms to remodel dGQ1 structures; this study also demonstrated processivity of DmDHX36 on DNA duplex structures, for which a kinetic step size of 1 bp and an average duplex unwinding velocity of 14.5 bp s⁻¹ were reported 38.

While the above findings were from bulk ensemble assays, single-molecule studies report ATP-independent mechanisms for DHX36 remodeling of intramolecular GQ1, which are distinctly different for DNA and RNA GQ1. The first study shows that a single molecule of DHX36 recognizes the parallel dGQ1, and the mechanism involves ATP-independent unfolding and refolding of the GQ that continues for many cycles without dissociation of DHX36 from the DNA 263. The second study, in contrast, shows a highly asymmetric pattern of DHX36 activity on the parallel rGQ1. The mechanism involves ATP-independent unfolding followed by ATP-dependent refolding of the GQ that continues for many cycles, leading to the dissociation of DHX36 from the RNA 264.

It should be noted that single-molecule studies demonstrating ATP-independent GQ1 remodeling involved substrates with three G-tetrads, whereas bulk ensemble assays demonstrating ATP-dependent GQ4 and GQ1 remodeling involved more stable GQ substrates (with ≥ three G-tetrads). This suggests that an ATP-hydrolysis-driven translocation event may be necessary for the complete unfolding of highly stable GQ structures.

Collectively, our current understanding regarding the GQ remodeling mechanism is that the helicase core binds to an unpaired region of the substrate, which needs to be longer than 5 nt and located 3’ to the quadruplex structure 65, 263. DHX36 partially destabilizes the quadruplex in a non-ATP-dependent fashion 6, 38, 263-264. ATP-driven translocation of the helicase core on the NA then threads the unpaired NA through the NA-binding tunnel, thereby resolving the structure 6, 38, 65.

Despite the large number of recent contributions on DHX36, a surprising number of structural and mechanistic features are not well understood. DHX36 has been linked to AU-rich RNA elements and DNA and RNA quadruplex structures in cells 126, 242, but how the enzyme functions on such diverse substrates remains unclear. Although DHX36 has been functionally and physically associated with RNA in cells 242, most biochemical studies have been conducted on DNA. To our knowledge, the remodeling of RNA duplexes by DHX36 has not been experimentally tested. How DHX36 discriminates between its targets and how it balances its GQ remodeling and duplex unwinding activities remain unclear.

2.7 Structural basis for NA remodeling by DHX36

Crystal structures have been reported for DHX36 from bovine 6 and fly 38 (Fig. 2.5, Fig. 2.7a). DHX36 contains multiple structural domains that are characteristic of DEAH/RHA helicases, including two RecA-like helicase domains that form the helicase core, a winged-helix (WH), a ratchet, and an OB-domain 25, 29-30. These domains are arranged in a compact pyramidal structure, as seen in other DEAH/RHA helicases 25, 29. The structure of fly DHX36 in complex with several ss and ds NA showed that the helicase specifically recognizes G-rich oligonucleotides 38 (Fig. 2.7b). The structure of bovine DHX36 in complex with DNA GQ underlined the role of the N-terminal DHX36 specific motif (DSM) in GQ-recognition 6. However, contacts to NA outside of the GQ are also made by other domains, including the OB-fold, the ratchet, and the helicase core 6, 38.


Figure 2.7. The overall structure of D. melanogaster DHX36-DNA complex. a) The N-terminal domain (NTD) is in cyan, RecA1 and RecA2 in yellow and yellow-orange, respectively, HA2 in red, the OB-like domain in blue, and the C-terminal domain (CTD) in dark gray. The 12-nt ssDNA (3G10) is in green. b) Schematic diagrams of the contacts of DmDHX36 with poly(G) oligonucleotide. It is from a composite of different crystal structures of DmDHX36 in complex with different oligonucleotides bearing G-rich sequences. Hydrogen-bonding interactions are indicated by dotted lines. The aromatic residues implicated in stacking interactions with the bases are indicated with gray lines. In the right panel, to highlight the amino acids specifically implicated in binding with poly(G), interactions that are identical with other bound oligos are presented with the related amino acids in gray, and only the newly identified interactions are presented with the related amino acids in colors. This figure was adapted from Ref: 6 with permission from Elsevier.

Other DEAH/RHA helicases remodel RNA and DNA structures by an ATP-driven translocation-based mechanism 41, 46, 265. By comparing several conformational snapshots of bovine DHX36 with known ones from other DEAH/RHA helicases, a translocation-based molecular model of GQ remodeling by DHX36 was proposed 6, 265. DHX36 in the open ATP-independent conformation binds the DNA GQ and pulls it in the 3’ direction via concerted but opposite rotations of the RecA2 and C-terminal domains. This may result in repetitive GQ unfolding, as seen in single-molecule assays 263. This open ATP-free conformation accommodates a five-nucleotide stack in the NA-binding tunnel. ATP binding to the cleft between the RecA domains induces domain closure and encloses a stack of four nucleotides in the NA-binding tunnel. Continuous cycling between these states enables DHX36 to translocate in the 3’ to 5’ direction along the ss NA with a step size of one base pair (bp) per hydrolyzed ATP.

The roles of the DSM and the helicase core in GQ recognition and remodeling have been investigated. What roles, if any, other conserved domains play in the remodeling of NA structures is not clear. Given the high degree of conservation of DHX36, it is important to delineate whether and how these domains impact the biochemical activity of DHX36. Additionally, it is not understood how exactly NTP and NA binding sites communicate in DHX36 as well as in other DEAH/RHA helicases. Thus, dedicated structure-function studies are needed to validate and complement structural models with dynamic information and to define regions in DHX36 that are critical for its biochemical activities.

2.8 Structural and biochemical analyses of RNA remodeling activity of DHX36

In chapters 3 and 4, I describe structural and biochemical studies on mouse DHX36 (accomplished through a collaborative effort) and address several unanswered questions on this helicase. We report the crystal structure of mouse DHX36 bound to ADP. I address the similarities and differences between our mouse DHX36 structure and previously reported fly and bovine DHX36 structures. I systematically characterize the remodeling activity of mouse DHX36 on RNA duplex structures, as well as both intermolecular and intramolecular RNA G-quadruplex structures. Building on this, I investigate the roles of the DHX36 specific motif (DSM), the β-hairpin motif, and the OB-fold domain in the binding and remodeling of RNA duplex and quadruplex structures. Additionally, I thoroughly examine the interplay between nucleotide selectivity and RNA substrate selectivity of DHX36 during the RNA remodeling process. Finally, I develop a kinetic model for the NTP-mediated regulation of RNA duplex versus RNA G-quadruplex remodeling activities of DHX36.


Chapter 3

Introduction

The DEAD-box helicase DDX41, a myelodysplastic syndrome-implicated splicing factor

3.1 Conservation and structure of human DEAD-box protein DDX41

DDX41 is a member of the DEAD-box family of RNA helicases. Phylogenetic analysis based on an alignment of human DEAD-box helicases shows that DDX41 is most similar to DDX59 and DDX54 7. DDX41 is highly conserved in metazoans. The phylogenetic tree (Fig. 3.1) shows that the amino acid sequence is conserved among species 5. DDX41 is also known as ABS, as the Drosophila homolog Abstrakt was originally identified in a genetic screen for mutants that disrupted normal axonal growth in the fly 266.

Figure 3.1. Phylogenetic tree showing DDX41 conservation across species. The tree was generated using MEGA5 software (http://www.megasoftware.net/) with the Neighbor Joining method. The amino acid sequences of DDX41 from different species were downloaded from GenBank. Branch length represents divergence, and numbers on branches indicate bootstrap values as percentages of 1000 sampling replicates. This figure is from Jiang et al. 5, which is distributed under the terms of the Creative Commons License (http://creativecommons.org/licenses/by/4.0/).

DDX41 contains all the conserved motifs of DEAD-box proteins (Section 1.3.1, Fig. 1.3). In DDX41 the 13 signature conserved sequence motifs are present in their appropriate spacing, within the RecA1 and RecA2 helicase core domains. However, two of the motifs which have been implicated in oligonucleotide binding, PTRELA (Motif Ia) and RG-D (Motif V), show conservative substitutions in their amino acid sequence: serine for threonine in the PTRELA motif and lysine for arginine in the RG-D motif 266 (Fig. 3.2). The potential functional or structural significance of these substitutions is unknown.

Figure 3.2. Domain organization of human DDX41. The organization of conserved sequence motifs in DDX41 (upper line) as compared to the DEAD-box helicase consensus (lower line). DDX41 shows all the hallmarks of a DEAD-box helicase, including proper spacing of the conserved motifs 9. The conservative substitutions in the PTRELA and RG-D motifs are indicated by double arrows.

The X-ray crystal structure of the DDX41 RecA1 domain in the apo form was reported in two separate studies 5, 267 (Fig. 3.3a). The RecA1 domain contains motif Q, the P-loop, motif Ia, motif Ib, motif II, and motif III, revealing structural similarities to RecA1 domains of other DEAD-box proteins. The X-ray crystal structure of the DDX41 RecA2 domain was reported in another study 7. The RecA2 domain contributes to nucleotide coordination via motifs V and VI. Upon superposition of the DDX41 RecA2 structure with the RecA2 structure of another DEAD-box helicase, DDX19 (in complex with RNA and ADPNP), the DDX41 structure showed a lack of electron density for part of motif VI, indicating its flexible nature. The conserved motifs IV and VI superposed well, whereas motif V showed different conformations in the two structures (Fig. 3.3b). Overall, the crystal structures of the isolated RecA1 and RecA2 domains of DDX41 revealed structural similarities to the RecA1 and RecA2 domains of other DEAD-box proteins.

Figure 3.3. Structure of DDX41 helicase core. (a) Cartoon representation of the RecA1 domain of DDX41 (PDB: 5GVR; grey). The positions of conserved motifs Q, I, Ia, Ib, II, and III are indicated and highlighted in yellow, green, cyan, lime green, red, and magenta, respectively. (b) Cartoon representation of the RecA2 domain of DDX41 (PDB: 2P6N; light orange). The positions of conserved motifs IV, V, and VI are indicated and highlighted in magenta, pink, and blue, respectively. (c) Superposition of the RecA2 domains of DDX19 (light blue), DDX25 (grey), and DDX41 (orange). The positions of conserved motifs IV–VI (black) are indicated. Panel (c) in this figure is from 7 with permission to reuse.

Apart from the conserved helicase core, most DEAD-box helicases have variable N- and C-terminal regions that confer functional specificity to individual helicases. In that respect, DDX41 contains a CCHC-type zinc finger motif in the C-terminal region (Fig. 3.2). Proteins that contain zinc fingers are classified into distinct families according to the type of zinc finger, each with a unique three-dimensional architecture. The vast majority typically function as interaction modules that bind DNA/RNA, proteins, or other small molecules. Although the function of the Zn-finger motif in DDX41 is unclear, it may serve as an additional nucleic acid binding domain or be important for interactions with other proteins.

Secondary structure prediction of human DDX41 via the XtalPred-RF server reveals that the N-terminal region (aa 1–160) is disordered 268. In the Drosophila homolog Abstrakt, the N-terminal region (aa 1–194) was shown to contain a nuclear localization signal (NLS) capable of translocating the protein into the nucleus 269. Alignment of the N-terminal regions of the two proteins shows that they share 41.9% identity, suggesting one or more similar roles for this region in the two DDX41 homologs 268. This was further supported by cellular localization experiments performed with human DDX41 268, which showed that a potential NLS in the N-terminus can target the protein to the nucleus.
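As an illustration of how a percent identity value such as the one quoted above can be derived from a pairwise alignment, the following minimal sketch counts identical residues over the aligned columns in which neither sequence has a gap. This is only one of several conventions for the denominator and is not necessarily the one used in the cited study; the function name and the toy sequences are hypothetical and do not represent DDX41 or Abstrakt.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: the denominator is the number of ungapped columns. Other
    conventions (alignment length, shorter sequence length) give other values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Hypothetical toy alignment, not real DDX41 or Abstrakt sequence.
toy_a = "MEE-SKRAR"
toy_b = "MDEPSK-AK"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Because the result depends on both the alignment and the denominator convention, a reported identity value is best read together with the alignment method used to obtain it.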

3.2 Cellular Functions of DDX41

3.2.1 Abstrakt, the Drosophila homolog of DDX41

Abstrakt shares 66% identity and 80% similarity with DDX41 at the amino acid level. In Drosophila the gene was identified due to a specific phenotype, the failure of the Bolwig nerve to fasciculate and project normally 270. Abstrakt is essential for survival throughout the life cycle of the fly. Mutants show specific defects in many developmental processes, including cell-shape changes, localization of RNA, and apoptosis 271. Irion et al. showed that Abstrakt is required for oogenesis, for maintaining cell polarity, and for asymmetric division in neural and muscle progenitors 272. Abstrakt also plays an important developmental role, as it is required for the development of the embryonic CNS and the larval and adult visual system 266. The Abstrakt protein interacts with insc RNA. Abstrakt mutants show loss of Insc protein levels but no change in insc RNA levels, thus demonstrating a novel role for Abstrakt in the post-transcriptional regulation of insc expression 272. Interestingly, Abstrakt was also shown to participate in protein sorting via its interaction with Sorting Nexin-2 (SNX2). The N-terminal domain of Abstrakt was shown to interact with the phox homology (PX) domain of SNX2 269. The same study also found that Abstrakt might shuttle between the nucleus and cytoplasm with a bias towards the nucleus, and that the N-terminal domain is responsible for its nuclear localization. Deletion of this domain in Abstrakt (aa 194–622) resulted in a distinct punctate cytoplasmic distribution and loss of nuclear localization 269.

3.2.2 DDX41 in the spliceosomal complex

The molecular mechanism and/or function of DDX41 remains largely unclear. RNA helicases are usually involved in several aspects of RNA metabolism, including ribosome biogenesis, pre-mRNA splicing, mRNA export, and translation 22. For instance, the spliceosome undergoes profound conformational changes during the splicing cycle to achieve splicing efficiency and accuracy 273. So far, around eight DExD/H-box RNA helicases have been identified as essential players in facilitating remodeling of the spliceosome 274. They mediate rearrangements of RNA–RNA or RNA–protein interactions during the progression of the splicing pathway, using the energy from ATP hydrolysis 30. DDX41 is proposed to play a role in pre-mRNA splicing during the catalytic step, as it was found in the spliceosomal C-complex during purification and characterization of native spliceosomes 275. Moreover, mutations of several spliceosomal proteins are common in myeloid neoplasms 276 and are mutually exclusive with DDX41 mutations (Fig. 3.4).

Thus, in 2015, Polprasert et al. investigated a possible association of DDX41 with spliceosomal proteins. Through large-scale proteomics and mass spectrometric analysis, they showed that affinity-tagged DDX41 co-immunoprecipitated with several splicing factors, specifically components of the U2 and U5 snRNPs, including SF3B1 and PRPF8. It was also shown that DDX41 mutations impaired its interactions with these components 4. Investigation of the impact of DDX41 defects on pre-mRNA splicing through RNA-seq analysis revealed aberrant exon skipping or exon retention events within functionally relevant genes like IKZF1, a transcription factor associated with lymphoid differentiation, and ZMYM2, a component of the histone deacetylase complex 4. One prevailing hypothesis in this field is that mutations in splicing factors may result in mis-splicing of target genes (oncogenes/tumor suppressors), leading to their activation/inhibition and the subsequent establishment or progression of MDS and other leukemias (Fig. 3.5). However, the key downstream targets of mutant splicing factors have not been identified, so alternative mechanisms cannot be disregarded.

Figure 3.4. Co-mutational landscape of DDX41 patient cohort. Comparison of co-mutations of MDS/AML patients with DDX41 mutations. Mutations of other DEAD/H-box helicases and spliceosomal factors (SF3B1, U2AF1, SRSF2, PRPF8, and LUC7L2) in the Polprasert et al. patient cohort (n=846) and the TCGA database (n=197). This figure is from Polprasert et al. 4 with permission to reuse.

3.2.3 DDX41 in innate immune signaling

To initiate host defense against pathogenic infection, the innate immune system detects pathogen-associated nucleic acids (bacterial or viral DNAs and RNAs) via pattern-recognition receptors (PRRs). This results in the activation of various innate immune signaling pathways that produce inflammatory cytokines and IFNs 277. Several PRRs have been identified, which include the toll-like receptors (TLRs), the retinoic acid-inducible gene I (RIG-I)-like receptors (RLRs), the nucleotide oligomerization domain-like receptors (NLRs), the C-type lectin receptors, and the family of cytosolic DNA sensors, such as DNA-dependent activator of IFN-regulatory factors (DAI), absent in melanoma 2 (AIM2), IFNγ-inducible protein 16 (IFI16), cyclic GMP-AMP synthase (cGAS), and DExD/H (Asp-Glu-Ala-Asp/His) box helicases 78, 101, 277-280.

Figure 3.5. Association of DDX41 with the spliceosome. WT-DDX41 physically interacts with the U2 and U5 small nuclear ribonucleoprotein complexes (snRNPs). Mutation of DDX41 may result in disruption of these interactions and in splicing defects, like exon skipping or exon retention. Abbreviations: ex, exon. This figure is reproduced from Steidl et al. 16 with permission to reuse.

Several members of the DExD/H-box helicases other than the RLRs have emerged as important players in innate immune signaling and the control of virus infection. Though some may act as bona fide RNA/DNA sensors, others may instead act as accessory factors required to promote innate immune signaling. It is therefore important to carefully distinguish between helicases whose presence is required for signaling in response to NA stimuli and those that engage PAMP (pathogen-associated molecular pattern) NAs with specificity. However, this is challenging, as most if not all of the DExD/H-box helicases have the ability to associate with NA with some specificity. Further investigation will be necessary to elucidate the mechanism of signaling by these helicases, and to precisely determine whether they distinguish between self and non-self RNA to signal innate immunity. It also remains to be seen whether these helicases participate in canonical innate immune signaling pathways and whether any of the DExD/H-box helicases play redundant roles in signaling innate immunity to control virus infection 281.

Through an siRNA screen, DDX41, a member of the DEAD-box family of RNA helicases, was identified as an intracellular DNA sensor in myeloid dendritic cells 101. Activation of the mitogen-activated protein kinase TBK1 (TANK-binding kinase 1), the transcription factor NF-κB, and IRF3 (interferon regulatory factor 3) in response to cytosolic DNA was impaired in murine myeloid dendritic cells (mDCs) lacking DDX41 101. In the same study, it was shown that both DDX41 and IFI16 were required for the response to cytosolic DNA and DNA viruses in the human monocytic THP1 cell line. It was suggested that the constitutively expressed DDX41 might act as an early sensor of viral DNA, while IFI16 is upregulated at later stages of viral infection in a DDX41-dependent manner 101.

The stimulator of interferon genes (STING) has been shown to act as a crucial scaffolding and adaptor protein that facilitates the signaling initiated from upstream cytosolic DNA sensors to downstream effectors like TBK1 and IRF3, leading to the expression of type I IFNs 282. DDX41 was shown to directly bind DNA and STING via its RecA1 domain, and in turn, trigger signaling by downstream effectors of STING 101 (Fig. 3.6).


Figure 3.6. DDX41 as a PRR in innate immune signaling. DDX41 recognition of viral DNA and bacterial second messengers following its phosphorylation by BTK kinase, results in activation of the STING-TBK1 pathway and induction of type I interferon response in the nucleus. Uncontrolled sensing of DNA is prevented by ubiquitination of DDX41 by TRIM21 at Lys9, Lys115 sites, leading to proteasomal degradation of DDX41. This figure is from Jiang et al. 5 which is distributed under the terms of the Creative Commons License (http://creativecommons.org/licenses/by/4.0/).

DDX41 has also been proposed to directly bind cyclic dinucleotides (CDNs), such as cyclic di-GMP (c-di-GMP) or cyclic di-AMP (c-di-AMP), and induce a STING-TBK1-IRF3-dependent type I IFN response (Fig. 3.6) 78. CDNs are bacterial second messengers that act as PAMPs, signal the presence of intracellular bacterial infection, and induce a type I IFN response 283. STING itself can directly bind CDNs and can sense their presence without the need for an upstream receptor 284. Detection of c-di-GMP by DDX41 was proposed to enhance the DDX41-STING interaction, which in turn could also increase the binding affinity of STING for c-di-GMP; these molecular interactions may lead to activation of the interferon response 78.

Excessive production of type I interferon following uncontrolled sensing of DNA/RNA could lead to autoimmune diseases 277. The immune system has developed mechanisms to prevent this, which usually involve direct blockade of the function of key signaling molecules. For example, NLRX1 (a member of the Nod-like receptor family) inhibits MAVS (mitochondrial antiviral signaling protein)-mediated interferon-β (IFN-β) responses by disrupting the interaction of RIG-I with MAVS 285. Another mechanism is the enhancement of ubiquitination and degradation of key signaling molecules by the TRIM family of proteins. The E3 ubiquitin ligase TRIM21 was identified as a DDX41-interacting protein. Overexpression of TRIM21 leads to enhanced degradation of DDX41 and reduced production of IFN-β in response to intracellular dsDNA 286. The SPRY-PRY domain of TRIM21 interacts with the RecA1 domain of DDX41, and Lys9 and Lys115 of DDX41 are the sites ubiquitinated by TRIM21 (Fig. 3.6) 286.

Nevertheless, the exact mechanism of how DDX41 functions as a PRR is not entirely understood. Recently, it was shown that Tyr364 and Tyr414 of DDX41 are important for its recognition of DNA and binding to STING 287. Importantly, Tyr414 was identified as the BTK phosphorylation site (Fig. 3.6) 287. Bruton’s tyrosine kinase (BTK) belongs to the Tec family of cytoplasmic tyrosine kinases and plays a role in B-cell receptor signaling and lymphopoiesis 288. BTK is also involved in TLR signaling. For instance, it phosphorylates the Mal adaptor involved in TLR4 signaling (Ref). BTK’s kinase domain and its SH3/SH2 interaction domains bind the RecA1 domain of DDX41 and the transmembrane region of STING, respectively. Thus, BTK was shown to play a critical role in the activation of DDX41 helicase and STING signaling 288. Lastly, the function of DDX41 in innate immune regulation was also studied in teleosts. Quynh et al. cloned and characterized the DDX41-encoding gene from the olive flounder. They found that olive flounder DDX41 is also a cytosolic viral DNA sensor and showed antiviral function similar to that of DDX41 in mammals 289.

3.2.4 DDX41 and tumor development

Several RNA helicases are deregulated in cancer via involvement in chromosomal translocation, downregulation, and over-expression 290. Although the exact role of helicases in carcinogenesis is unclear, disruption of the normal function of RNA helicases can potentially result in abnormal RNA processing with harmful effects on the expression/function of key proteins. In 2012, Ding et al. reported somatic mutations in DDX41 in a study of sporadic acute myeloid leukemia (AML) 291. In 2015, our collaborators, Polprasert et al., identified a familial MDS/AML syndrome characterized by long latency and germline mutations in the DDX41 gene. This study proposed a tumor suppressive function for DDX41 in familial MDS/AML, as reduced DDX41 expression, as well as deletions and frameshift mutations in DDX41, were reported 4. In fact, the knockdown of DDX41 resulted in increased proliferation and a differentiation block, whereas its overexpression decreased proliferation and enhanced cellular differentiation in cell lines as well as in primary patients’ diseased cells. Additionally, DDX41 knockdown led to an increase in tumor growth in a xenotransplantation model in vivo 4. However, the molecular mechanism of DDX41 and its role in hematopoiesis remain unclear. Somatic mutations of DDX41 occur but are rare in non-hematological malignancies 20.


3.2.5 DDX41 in post-transcriptional gene regulation

DDX41 was identified as a negative regulator of p21 protein translation via association with the 3’ UTR of p21 mRNA 292. The cyclin-dependent kinase inhibitor p21 (CDKN1A) functions in several cancer cell lines in an antiapoptotic manner 292. This study supported an oncogenic role for DDX41, and this view is consistent with data from a genome-wide RNA-interference screen in HeLa cells, demonstrating reduced cell numbers following knockdown of DDX41 293. The study demonstrating DDX41-mediated post-transcriptional regulation of p21 also showed that the p53 mutation status of the cells might influence this regulation by an as yet undefined mechanism. Although this study contradicts the tumor suppressive function of DDX41 in myeloid leukemia cells, it should be noted that such opposing functions are not uncommon among DEAD-box proteins. Depending on the cancer type, treatment modalities, and various co-factors, several DEAD-box proteins possess both oncogenic and tumor suppressor functions 294, 295.

3.3 Myelodysplastic syndrome

Myelodysplastic syndrome (MDS) encompasses a heterogeneous group of myeloid neoplasms characterized by dysplasia (abnormal cell morphology), ineffective hematopoiesis, cytopenias (reduction in the number of mature cells), and evolution to secondary acute myeloid leukemia (sAML) 296. Most cases of adult MDS are sporadic, but some may arise due to exposure to chemotherapy or ionizing radiation.

In normal hematopoiesis, hematopoietic stem cells (HSCs) undergo trilineage differentiation into erythrocytes, granulocytes, and megakaryocytes, where the relative abundance of the myeloid lineages in the bone marrow and the number of cells in the blood are maintained within relatively narrow limits 296. Myeloid neoplasms are characterized by excess proliferation in one or more myeloid lineages, and by inefficient hematopoiesis. Patients with MDS exhibit aberrant stem cell proliferation in the marrow, whereas they have decreased numbers of erythrocytes, granulocytes, and/or platelets in the blood. Dysplasia, or abnormal cell morphology at a given stage of differentiation, is a hallmark of MDS. The defining feature of AML is the presence of many immature white blood cells (blast cells) in the marrow of affected patients. In AML, reduced apoptosis is observed and differentiation is inhibited, leading to a reduction in all three lineages. In MDS, on the other hand, there might be a loss of one or two lineages. The lack of terminal cells in MDS is due to a failure of differentiation and increased apoptosis 297.

Familial myeloid malignancies are a group of rare inherited disorders that represent a unique resource to investigate initial steps to leukemia progression. In MDS/AML the initial germline mutation could be a preleukemic event, required but not sufficient for leukemia initiation. Among the most studied ones are familial AML with mutated transcription factors like CEBPA, GATA2, and RUNX1. It was also shown that mutations in these factors play important roles in sporadic AML. Thus, the study of patients with familial MDS/AML is valuable to understand leukemia progression, including sporadic leukemias 298.

3.3.1 Myelodysplastic syndrome associated protein factors

The advent of whole exome sequencing has allowed for improved analysis of genomic defects in leukemia, including MDS/AML. These defects include mutations in genes encoding transcriptional regulators (RUNX1, CEBPA) and epigenetic regulators (TET2, DNMT3A), and mutations in factors involved in signal transduction and DNA repair (JAK2, KIT, KRAS/NRAS, TP53) 299, 300, 4. Several groups have also identified somatic mutations of several spliceosomal genes (SF3B1, U2AF1, and SRSF2), and frequent somatic mutations in DDX41, a gene encoding a DEAD-box RNA helicase 276, 301, 302.

3.3.1.1 Splicing factors in myeloid neoplasms

Loss of function mutations in splicing factor genes like SRSF2, SF3B1, and U2AF1 are frequent in myeloid neoplasms, and they are acquired early during disease evolution. The most frequently detected splicing factor mutations were found in the SF3B1 gene on chromosome 2q33.1. This gene codes for subunit 1 of the splicing factor 3b protein complex (SF3B1), which is involved in the early stages of spliceosomal assembly. The SF3B1 complex is part of the functional form of the U2 small nuclear ribonucleoprotein (snRNP) that binds to the branch site near the 3′ end of introns and helps to specify the site of splicing. SF3B1 also plays a role in the minor U12-dependent spliceosome. SF3B1 haploinsufficiency is linked to ring sideroblasts in MDS/MPN. In addition, mice with heterozygous SF3B1 knockout accumulate ring sideroblasts in their bone marrow. However, the abnormally spliced gene or genes responsible for this phenotype remain controversial 303.

SRSF1 and SRSF2 proteins belong to the SR splicing regulatory factor family. Each of these factors contains an RNA recognition motif (RRM) for binding RNA and an RS domain for binding other proteins. SRSF1 and SRSF2 play a role in preventing exon skipping, ensuring the accuracy of splicing, and regulating alternative splicing. In mice, SRSF2 mutations modify the sequence-specific RNA binding activity of SRSF2, thereby altering recognition of exon splicing enhancer motifs and driving mis-splicing of key hematopoietic regulators. For instance, one SRSF2 mutation leads to mis-splicing and nonsense-mediated decay of EZH2 pre-mRNA with inactivation of PRC2 304.

Since all the affected splicing factors are involved in 3’ splice site recognition, it has been hypothesized that some commonly affected splicing events may underlie the disease 305. Strikingly, the same mutation in SF3B1 induced a unique non-overlapping set of splicing changes in mouse versus human cells 306. Furthermore, all major splicing factor mutations seem to co-evolve with mutations in genes that function at the chromatin level, such as RUNX1, DNMT3A, and TET2 307. These findings raised the possibility that a unifying mechanism in addition to or independent of splicing changes might contribute to MDS/MPN. In 2018, Zhang et al. showed that high-risk mutations in SRSF2 and U2AF1 augment R-loop formation and subsequent activation of the ATR pathway. It was demonstrated that MDS may result, at least in part, from chronic insult to the genome 308.

3.3.1.2 DDX41 in myeloid neoplasms

DDX41 mutations were discovered in myeloid leukemias by a combination of whole exome sequencing applied to families suspected of carrying disease-prone variants, as well as cross-sectional studies of paired tumor and germline DNA samples 4. Although not as frequent, this study also identified somatic mutations in other DEAD/H-box RNA helicases like DHX29, DDX4, and DDX11, to name a few. Unlike other splicing factors frequently mutated in MDS, which carry exclusively somatic mutations, DDX41 mutations can be germline, somatic, or both 4, 20, 309. The sites and types of mutations in the DDX41 gene, located at 5q35.3, differ between the germline and somatic cases. The currently identified DDX41 mutations are summarized in Fig. 3.7. Among multiple cohorts, germline mutations were usually dominated by p.D140fs (fs denotes frame-shift), and somatic mutations were dominated by p.R525H. In about 50% of the cases, tumors in such patients combine the germline frameshift mutation (p.D140fs) on one allele, which results in the loss of full-length protein, and a recurrent somatic missense mutation (p.R525H) on the other allele (Fig. 3.8). This suggests that patients with germline mutations on one allele are predisposed to secondary hits on the second allele. The germline mutations seem to be dominant, leading to loss of protein function in most cases, whereas somatic variants were usually missense mutations, not abrogating protein function. This likely indicates the need for a balance between cell growth inhibition and clonal expansion.

In MDS, somatic (p.R525H) DDX41 mutations without germline DDX41 mutations have also been observed in conjunction with other genetic alterations. Interestingly, additional somatic mutations were always acquired by the other WT allele. Another interesting observation is that ~25% of MDS cases have the del5q abnormality, a deletion of chromosome 5q regions that usually involves the DDX41 locus.

Figure 3.7. Topography, types, and configuration of DDX41 mutations in myeloid malignancies. DDX41 is located at the distal end of chromosome 5q, 5q35.3, and codes for an RNA helicase that contains two known domains containing the conserved sequence motifs, as illustrated. The positions of the germline (red triangles) and somatic (blue circles) alterations reported in the literature are shown. The most recurrent germline mutations in Caucasian populations are D140fs and M1I, whereas in Asian populations A500fs and Y259C are most common 20. The somatic R525H mutation is commonly seen in both populations. fs denotes frame-shift mutations.

Figure 3.8. Bi-allelic DDX41 mutations in myeloid malignancies. Carriers of inactivating germline DDX41 mutation in one allele (Red) often acquire a somatic alteration in the other allele (Blue), as shown.

Patients showed late disease onset with the median age around 65, with most carriers showing normal blood count until malignancy develops. In general, the initial phenotypic changes are subtle and include immature myeloid cells that do not differentiate, cytopenias, erythroid dysplasia, and normal cytogenetics. The long latency of disease in patients with germline DDX41 mutations does not support the notion that DDX41 is a strong driver of myeloid neoplasia but instead points towards additional processes associated with aging bone marrow. To that end, the cellular functions of DDX41, and consequences of DDX41 mutations or haploinsufficiency, including the downstream effects on the transcriptome (if any) need to be firmly established.

Previous studies by our collaborators on the effects of DDX41 mutations on splicing did not produce conclusive answers 4. Since the downstream effects of DDX41 perturbations are unreported, in our present collaborative study, we developed a strategy to determine the biochemical and cellular functions of DDX41 and to study the effects of haploinsufficiency and the recurrent missense mutation p.R525H. The results from this study (Hiznay et al., manuscript submitted) are discussed in sections 3.4 and 3.5, and in Chapter 7 of this thesis.

3.4 Cellular RNA targets of DDX41 and cellular consequences of DDX41 perturbations

To understand the putative function of DDX41 in splicing, our collaborators Hiznay et al. (manuscript submitted) identified the cellular RNA targets of DDX41WT and DDX41R525H using cross-linking immunoprecipitation followed by high-throughput sequencing (CLIP-Seq). Their analyses in HEK293 cells identified binding of DDX41 isoforms to RNAs at regions proximal to 5’ and 3’ splice sites, similar to other core splicing factors. Both WT and mutant DDX41 produced a similar set of CLIP peaks or binding patterns on their target genes. Additionally, DDX41 CLIP peaks were also observed on U2, U6, and U12 snRNAs (small nuclear RNAs of the major and minor spliceosomes). Although DDX41WT and DDX41R525H showed similar binding patterns on U2 snRNAs, the binding to U12 snRNA was different for the two isoforms. This suggests that the R525H mutant DDX41 could bind and stabilize a structurally distinct form of the minor spliceosome.

Additionally, RNA-Sequencing was used to examine the functional impact of DDX41-RNA binding in the cell, and to characterize the effects of DDX41 haploinsufficiency/mutation on splicing events. The knockdown of DDX41 did not lead to a significant alteration in gene expression levels, but it produced altered splicing events. In particular, it led to improved splicing of 81 introns and reduced splicing of 14 introns. In contrast, overexpression of WT DDX41 led to altered expression levels of a subset of transcripts, while also producing altered splicing events. In terms of splicing changes, it led to improved splicing of only 19 introns but decreased splicing of 118 introns. These observations suggest that DDX41’s role in the spliceosome may be to inhibit or delay some steps in the splicing reaction. On the other hand, overexpression of DDX41R525H showed an even distribution of improved and reduced intron splicing. Compared with DDX41WT overexpression, DDX41R525H overexpression showed a relatively weaker inhibitory effect, suggesting that the function of the mutant was impaired in one or more ways.
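The direction of the splicing changes reported above can be made explicit with a short calculation. The sketch below uses only the intron counts quoted in this section and computes, for each condition, the fraction of altered introns that showed reduced splicing; the dictionary layout and function name are illustrative assumptions and are not part of the collaborators' analysis pipeline.

```python
# Intron counts as quoted in the text (Hiznay et al., manuscript submitted).
# The data layout and helper name are hypothetical, for illustration only.
reported_counts = {
    "DDX41 knockdown": {"improved": 81, "reduced": 14},
    "WT DDX41 overexpression": {"improved": 19, "reduced": 118},
}


def fraction_reduced(counts: dict) -> float:
    """Fraction of altered introns whose splicing was reduced."""
    total = counts["improved"] + counts["reduced"]
    return counts["reduced"] / total


for condition, counts in reported_counts.items():
    share = fraction_reduced(counts)
    print(f"{condition}: {share:.0%} of altered introns show reduced splicing")
```

Under these counts, reduced splicing accounts for a small minority of the altered introns after knockdown and for the large majority after overexpression of WT DDX41, which is the asymmetry underlying the proposed inhibitory or delaying role.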

In addition, on combining the RNA-Seq and CLIP-Seq datasets, they found that DDX41 crosslinked near introns that were retained more in DDX41-overexpressing cells and retained less in DDX41 knockdown cells. Pathway analyses showed significant alterations in spliceosomal factors (either upregulated or downregulated) in DDX41-perturbed cells, suggesting the existence of a common gene-regulatory pathway affected by DDX41R525H. In DDX41-perturbed cells, they observed significant de-regulation of genes typically regulated by an epigenetic modifier, EZH2 (catalytic component of the polycomb repressive complex 2). Somatic loss of function mutations of EZH2 are frequently found in MDS patients 310. In addition, mutations in splicing factors like SF3B1 and U2AF1 are frequently associated with co-mutations in epigenetic factors, like TET2, DNMT3A, and EZH2 (Hiznay et al., manuscript submitted). But in the DDX41 patient cohort, there is a striking lack of co-mutation in epigenetic modifiers, e.g., EZH2, suggesting that DDX41 may play a role in regulating this pathway.

DDX41 has not been associated with any of the major RNA rearrangement events catalyzed by other spliceosomal DEAD-box RNA helicases. Although it is found in stoichiometric amounts in late splicing complexes, it has not been modeled in any cryo-electron microscopy structures to date. The most recent set of data from our collaborators, Padgett et al., suggests a role for DDX41 in one or more late proofreading steps of pre-mRNA splicing. In summary, the CLIP-seq data show DDX41 binding near 5’ and 3’ splice sites and on snRNAs, and analyses of altered splicing events suggest DDX41 may impede intron release. Thus, DDX41 may have evolved due to a need for more dynamic and flexible spliceosomes required for alternative splicing.

3.5 Biochemical characterization of DDX41

The somatic p.R525H mutation in DDX41 alters a highly conserved arginine residue in the predicted ATP-binding pocket of the helicase. Based on structures of other DEAD-box helicases, Arg525 (part of conserved Motif VI) in DDX41 likely forms hydrogen bonds to the α and β phosphates of ATP 4 (Fig. 3.9). Consequently, the conversion of this arginine to histidine could alter the catalytic functions of the helicase. In Chapter 7 of this thesis, I investigate the biochemical functions of WT DDX41 and R525H mutant DDX41 to study the effects of haploinsufficiency and the recurrent missense mutation p.R525H.

Figure 3.9. Structural model of the helicase core of DDX41. Mutations in the helicase core (p.R525H and p.P321L) are highlighted in orange. The structural model was created by combining the existing structure of Drosophila Vasa (PDB: 2DB3) and the partial structure of the helicase domain of DDX41. Color scheme: RNA in gold; ATP in green; conserved motifs as orange ribbons; significant residues as red spheres. This figure is from Ref. 4, with permission to reuse.


Chapter 4

Function of Auxiliary Domains of the DEAH/RHA Helicase DHX36 in RNA Remodeling

4.1 Introduction

Here, the crystal structure of the mouse DHX36-ADP complex is reported, and similarities and differences to the existing structures are highlighted, thus providing a better understanding of the conformational changes accompanying the NA remodeling cycle. I used purified protein and radiolabeled RNA substrates to quantitatively analyze the remodeling activity of mouse DHX36 on RNA duplex structures, as well as on both intermolecular and intramolecular RNA G-quadruplex structures. Various aspects of the RNA remodeling reaction, including remodeling rate constants and affinities for RNA substrates, were examined for each substrate. Finally, guided by the structure, we interrogated the roles of several conserved domains and structural elements in the remodeling of RNA duplex and quadruplex structures. Recombinant mouse DHX36 (WT and mutants) was overexpressed in and purified from E. coli by Dr. Zhonghua Liu (CWRU) and Dr. Watchalee Chuenchor (NIH).

4.2 Results

4.2.1 Crystal structure of mouse DHX36

To provide a firm basis for a structure-function analysis, we determined the structure of mouse DHX36 (mDHX36). The crystal structure of an mDHX36 construct lacking the 45 N-terminal and the 26 C-terminal residues (mDHX36$_{46-982}$), in complex with ADP, is shown in Fig. 4.1a,b. The structure was solved by Dr. Zhonghua Liu, CWRU. The N-terminal residues (aa 46-153), which include the DSM, were not visible in the electron density maps, most likely due to disorder. The overall structure of mDHX36 closely resembles the previously reported structures of bovine and fly DHX36, with the six domains typical for DEAH/RHA helicases (Fig. 4.1c,d) 6,38,29,25: an N-terminal domain (NTD, aa 46-203), the two RecA-like helicase domains (aa 204-619), a winged-helix (WH) and ratchet domain, together referred to as the HA2 domain (aa 620-834), an OB-fold (aa 835-895), and a C-terminal domain (CTD, aa 896-982). Similar to the fly and bovine DHX36, mDHX36 adopts a pyramidal structure. The helicase core forms a rectangular base. The extended C-terminus with the HA2, the OB-fold, and the CTD forms the apex (Fig. 4.1b).

Figure 4.1. Crystal structure of mouse DHX36 bound to ADP. (a) Domain organization of mouse DHX36. The bottom bar indicates the construct (WT$_{46-982}$) used for crystallization. (b) Cartoon representation of the crystal structure of mouse DHX36 in complex with ADP (6UP4). Domains are colored and labeled as in panel (a). ADP is shown in green. (c) Mouse DHX36 structure in complex with ADP (yellow, 6UP4), superimposed with the bovine DHX36 structures in complex with ADP.BeF$_3$ (gray, 5VHC) and quadruplex DNA (cyan, 5VHE). Bound nucleotides are shown as sticks (Mg2+ ions: magenta; K+ ions: purple; quadruplex DNA: orange). (d) Detailed view of the ADP binding site (mDHX36: yellow ribbon). ADP and residues engaged in ADP binding are shown as sticks, with electron density (2Fo-Fc) for the ADP molecule shown as blue mesh contoured at 1σ. Hydrogen bonds are shown as gray dotted lines. Figures c, d were provided by Dr. Zhonghua Liu and Dr. Tsan Xiao, CWRU.

The HA2 and OB-fold together represent one side of the nucleic acid binding site. Both domains and the CTD are slightly rotated, compared to structures of DHX36 bound to nucleic acids 6,38 (Fig. 4.2a). The rotation of these domains is even more pronounced in the bovine DHX36 structure with ADP-BeF$_3^-$ 6 (Fig. 4.2a). We speculate that the rotation of the HA2, OB-fold, and CTD reflects conformational changes dictated by the state of the ATPase cycle and the bound nucleic acids. Consistent with this notion, several residues in the OB-fold that contact nucleic acids are reoriented in our ADP-bound structure (Fig. 4.2b). We also observe movement of the RecA2 domain away from the RecA1 domain in our structure, compared to structures with nucleic acids and with the ATP-ground state analog, ADP-BeF$_3^-$ 6 (Fig. 4.3a). Other differences between our and previous DHX36 structures include an altered arrangement of residues in the helicase motifs Ia, III, and IV, in the loop that connects the two helicase domains (Fig. 4.3b-e), and in the conserved 3’-β-hairpin element, which is thought to act as a 3’ bookend for the nucleic acid 41. Specifically, Arg 287 is reoriented in the ADP-bound state, compared to the nucleic acid-bound states (Fig. 4.3f).

Figure 4.2. Differences in the arrangement of conserved domains in mouse DHX36 bound to ADP, compared to other DHX36 structures. (a) Rotation of the C-terminal domains in different nucleotide- and nucleic acid-bound DHX36 structures (mouse DHX36 bound to ADP: hot pink; bovine DHX36 bound to ADP.BeF$_3^-$ (PDB: 5VHC): silver; bovine DHX36 bound to a DNA quadruplex (PDB: 5VHE): pale teal). The structures were aligned by superimposition of the RecA1 domain (Fig. 4.1). Alpha-helices are represented as cylinders. The RecA1 and RecA2 domains have been omitted from the structures. (b) Structures are colored as in (a). (Central panel) Rearrangement of the OI and OII loops of the OB-fold. Small panels surrounding the central panel depict details of the rearrangement of OB-fold residues. Residues in the OI loop (N 841) and the OII loop (H 862, Y 890) that undergo the most prominent rearrangements are highlighted.

Figure 4.3. Differences in the arrangement of conserved motifs in mouse DHX36 bound to ADP, compared to other DHX36 structures. (a) Movement of the RecA2 domain relative to the RecA1 domain in the different nucleotide- and nucleic acid-bound DHX36 structures. Structures and alignment as in Fig. 4.2. Alpha-helices are represented as cylinders. For clarity, the extended C-terminal domains are not depicted. (b) Rearrangement of R257 and R258 (motif Ia, involved in nucleic acid binding). (c) Rearrangement of T360 and L361 (motif III, coupling of ATP hydrolysis to RNA unwinding). (d, e) Rearrangement of helicase motif IV and the linker loop connecting the RecA1 and RecA2 domains. (f) Rearrangement of the conserved hook loop motif and R287.

Collectively, our observations are consistent with conformational changes that accompany different stages of the ATP binding and hydrolysis cycle and of the RNA remodeling cycle. Besides establishing the arrangement of the domains in mDHX36, our structure highlights movements of the helicase core that accompany the ATP-binding and hydrolysis cycle and RNA remodeling.

4.2.2 Remodeling of RNA duplexes and quadruplexes by mDHX36

Having determined the architecture of mouse DHX36, we set out to interrogate the function of conserved auxiliary domains in the remodeling of RNA structures. To provide a basis for this structure-function analysis, we first characterized the ATP-dependent remodeling activity of wild type mDHX36$_{46-982}$ (subsequently referred to as WT mDHX36). We examined the activity on three different RNA substrates: (i) a 16-bp duplex with 15 unpaired A’s 3' to the duplex, (ii) an intermolecular G-quadruplex (GQ4) with five G-tetrads formed from four separate RNA oligonucleotides (3'A15G5UUA), and (iii) an intramolecular G-quadruplex (GQI) with five G-tetrads formed from a single RNA oligonucleotide. The RNA duplex and the intramolecular GQI each had a single unpaired region of 15 A’s 3' to the RNA structure; the intermolecular GQ4 had four unpaired regions of 15 A’s 3' to the RNA structure (Fig. 4.5, Materials and Methods). Circular dichroism indicated that both quadruplexes formed in the parallel orientation, as expected (Fig. 4.4, 311,262).


Figure 4.4. Circular dichroism spectra of intermolecular GQ4 and intramolecular GQ1. (a) CD spectrum of intermolecular GQ4 prepared in 10 mM MOPS (pH 6.5) and 50 mM KCl (blue: buffer; green: GQ4). (b) CD spectrum of intramolecular GQ1 prepared in 10 mM MOPS (pH 6.5) and 50 mM KCl (blue: buffer; red: GQ1). The positive peak at 260 nm and the negative peak at 240 nm both indicate a parallel G-quadruplex structure.

We measured RNA remodeling under pre-steady state conditions with an excess of DHX36 over RNA (Fig. 4.5a). These reaction conditions permit multiple cycles of enzyme-substrate binding 312. The kinetic interpretation of this reaction regime is simpler than for steady-state reactions, which involve multiple substrate turnovers 312. We detected clear, ATP-dependent duplex remodeling by mDHX36 (Fig. 4.5b, left panel). No remodeling was observed without ATP or without mDHX36 (Fig. 4.5b, middle and right panels). These data demonstrate that mDHX36 unwinds RNA duplexes.

mDHX36 also resolved the intermolecular GQ4 in an ATP-dependent manner (Fig. 4.5c, left panel). No reaction was seen without ATP or without mDHX36 (Fig. 4.5c, middle and right panels). Remodeling of GQI was measured in the presence of a DNA trap, a DNA oligonucleotide that hybridizes to the quadruplex region and thereby prevents re-formation of the quadruplex upon unfolding (Fig. 4.5d, 311). mDHX36 resolved the intramolecular GQI with, but not without, ATP (Fig. 4.5e).


Figure 4.5. Remodeling of G-quadruplex and RNA duplex substrates by WT mDHX36. (a) Reaction scheme for pre-steady state remodeling reactions for the RNA duplex and RNA GQ4. (b) Left panel: Representative PAGE for RNA duplex remodeling reaction (100 nM WT DHX36, 2 mM ATP-Mg2+, 0.5 nM 16-bp RNA duplex). Middle and right panels: Reaction without ATP, and without protein. Cartoons mark RNA substrate and unwound product. The asterisk shows the radiolabel. (c) Left panel: Representative PAGE for RNA GQ4 remodeling reaction (100 nM WT DHX36, 2 mM ATP-Mg2+, 0.5 nM 5G-tetrad RNA GQ4). Middle and right panels: Reaction without ATP, and without protein. Cartoons mark RNA substrate and unwound product. The asterisks show the radiolabel. (d) Reaction scheme for pre-steady state remodeling reactions for the intramolecular RNA GQI. (e) Left panel: Representative PAGE for the RNA GQI remodeling reaction (100 nM WT DHX36, 2 mM ATP-Mg2+, 0.5 nM 5G-tetrad RNA GQI, 100 nM DNA trap). Middle and right panels: Reaction without ATP, and without protein. Cartoons mark RNA substrate and unwound product. The asterisk shows the radiolabel.

Under identical conditions, mDHX36 resolved the quadruplex substrates markedly faster than the duplex (Fig. 4.6a). To understand whether these variations were caused by differing affinities of mDHX36 for the substrates, by differences in the remodeling rate constants, or by a combination of both, we measured apparent remodeling rate constants ($k_{obs}$) with increasing mDHX36 concentrations (Fig. 4.6b-d). We observed a weaker affinity of mDHX36 for the duplex, compared to the quadruplex substrates (Fig. 4.6b-d). The duplex remodeling rate constant at enzyme saturation ($k_{obs}^{max}$) was also lower than the quadruplex remodeling rate constants at enzyme saturation (Fig. 4.6b-d). In addition, we measured a lower affinity of mDHX36 for the intermolecular GQ4 with four unpaired regions, compared to the intramolecular GQ1 with only one unpaired region (Fig. 4.6c,d). Remodeling rate constants at enzyme saturation were roughly similar for both quadruplexes.

Functional binding isotherms for duplex remodeling were sigmoidal, with a Hill coefficient of H = 1.8 ± 0.3 (Fig. 4.6b). This observation suggests that multiple protomers of mDHX36 cooperate to unwind a duplex. In contrast, the functional binding isotherms for the quadruplex substrates were hyperbolic, even for the intermolecular substrate with four unpaired regions (Fig. 4.6c). This result suggests that remodeling of a quadruplex does not require the cooperative function of multiple DHX36 protomers, regardless of the number of unpaired RNA regions. In the intramolecular GQ1 remodeling reactions, the DNA trap could compete with the GQ1 substrate for DHX36 binding and thereby affect the remodeling activity of the enzyme. To test this, we measured mDHX36 remodeling of the intermolecular GQ4 substrate in the presence and absence of the DNA trap and found that the trap did not measurably impact mDHX36 activity on this substrate (Fig. 4.7). In sum, our observations showed ATP-dependent remodeling activity by mDHX36 on different RNA structures, including an RNA duplex. mDHX36 preferentially bound and remodeled substrates with a quadruplex structure, compared to those with an RNA duplex structure.

Figure 4.6. Quantitative analysis of the remodeling of G-quadruplex and RNA duplex substrates by WT mDHX36. (a) Representative remodeling time courses of WT mDHX36 for RNA duplex (red diamonds), GQ4 (green circles) and GQI (blue circles). Reaction conditions as in Fig. 4.5. Data points are averages from multiple independent reactions (N ≥ 4). Error bars mark one standard deviation. Lines show best fits to the integrated first order rate law, yielding observed rate constants ($k_{obs}$): for duplex RNA, $k_{obs}$ = 0.15 ± 0.04 min$^{-1}$; for GQ4, $k_{obs}$ = 3.27 ± 0.88 min$^{-1}$; and for GQI, $k_{obs}$ = 3.73 ± 0.55 min$^{-1}$. (b) Dependence of the remodeling rate constant (2 mM ATP-Mg2+) on mDHX36 concentrations for the duplex substrate. Data points are averages from multiple independent reactions (N ≥ 6). Error bars mark one standard deviation. The line represents the best fit to the Hill equation $k_{obs} = k_{obs}^{max} \, [\mathrm{DHX36}]^{H} \cdot \left( (K_{1/2,\mathrm{DHX36}})^{H} + [\mathrm{DHX36}]^{H} \right)^{-1}$ ($k_{obs}$: observed remodeling rate constant; $k_{obs}^{max}$: remodeling rate constant at DHX36 saturation; $K_{1/2,\mathrm{DHX36}}$: apparent functional binding constant; $H$: Hill coefficient), yielding $k_{obs}^{max}$ = 1.1 ± 0.1 min$^{-1}$, $K_{1/2}$ = 363 ± 54 nM, H = 1.8 ± 0.3. (c) Dependence of the remodeling rate constant (2 mM ATP-Mg2+) on mDHX36 concentrations for the intermolecular GQ4 substrate. Data points are averages from multiple independent reactions (N ≥ 3). Error bars mark one standard deviation. Curves represent the best fit to a binding isotherm $k_{obs} = k_{obs}^{max} \, [\mathrm{DHX36}] \cdot \left( K_{1/2,\mathrm{DHX36}} + [\mathrm{DHX36}] \right)^{-1}$, yielding $k_{obs}^{max}$ = 4.6 ± 0.3 min$^{-1}$, $K_{1/2}$ = 20.9 ± 4.4 nM. (d) Dependence of the remodeling rate constant (2 mM ATP-Mg2+, 100 nM DNA trap) on mDHX36 concentrations for the intramolecular GQI substrate. Data points are averages from multiple independent reactions (N ≥ 4). Error bars mark one standard deviation. DHX36 was saturated at the lowest experimentally accessible concentration. The average of $k_{obs}$ values at all shown concentrations is $k_{obs}^{max\prime}$ = 4.6 ± 0.4 min$^{-1}$, $K_{1/2}$ < 0.06 nM.
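As a rough numerical illustration of the two functional binding models named in the caption of Fig. 4.6, the following Python sketch evaluates the Hill equation and its hyperbolic special case (H = 1) with the fit parameters reported there. The function name `k_obs_hill` and the chosen concentrations are illustrative assumptions and are not part of the original analysis.

```python
def k_obs_hill(conc_nM, k_max, k_half_nM, hill=1.0):
    """Observed rate constant (min^-1) from the Hill equation.

    With hill=1.0 this reduces to the hyperbolic binding isotherm.
    """
    return k_max * conc_nM**hill / (k_half_nM**hill + conc_nM**hill)


# Fit parameters as reported in the caption of Fig. 4.6 (central values only).
duplex = dict(k_max=1.1, k_half_nM=363.0, hill=1.8)  # sigmoidal isotherm
gq4 = dict(k_max=4.6, k_half_nM=20.9, hill=1.0)      # hyperbolic isotherm

# Concentrations (nM) chosen for illustration only.
for conc in (10.0, 100.0, 363.0, 1000.0):
    print(conc, round(k_obs_hill(conc, **duplex), 3), round(k_obs_hill(conc, **gq4), 3))
```

At a concentration equal to $K_{1/2}$, either model returns half of $k_{obs}^{max}$, which is a convenient sanity check on any fitted parameter set.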

Figure 4.7. Impact of the DNA trap on the remodeling reaction by mDHX36. (a) Representative PAGE for remodeling reactions of the intermolecular GQ4 by 10 nM WT mDHX36 (2 mM ATP-Mg2+, 0.5 nM 5G-tetrad RNA GQ4) without (left panel) and with (right panel) 100 nM DNA trap. Cartoons mark substrate and unwound product, asterisks show the radiolabel. (b) Representative remodeling time courses (conditions as in panel (a)) without (filled circles) and with 100 nM DNA trap (open circles). Data represent the average of 3 independent measurements. The error bars mark one standard deviation. Curves represent best fits to the integrated first order rate law, yielding observed rate constants ($k_{obs}$): without DNA trap, $k_{obs}$ = 0.90 ± 0.17 min$^{-1}$; with DNA trap, $k_{obs}$ = 0.85 ± 0.16 min$^{-1}$.
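The time courses in Fig. 4.6a and Fig. 4.7b are described by the integrated first order rate law. A minimal sketch of that rate law, and of one simple way to recover $k_{obs}$ from a time course, is given below. It assumes a known reaction amplitude (set to 1.0 here) and uses noise-free synthetic data; the function names, time points, and the linearized estimator are illustrative assumptions and do not represent the fitting procedure used for the figures.

```python
import math


def fraction_unwound(t_min, k_obs, amplitude=1.0):
    """Integrated first order rate law: F(t) = A * (1 - exp(-k_obs * t))."""
    return amplitude * (1.0 - math.exp(-k_obs * t_min))


def estimate_k_obs(times, fractions, amplitude=1.0):
    """Estimate k_obs as the least-squares slope (through the origin)
    of -ln(1 - F/A) versus t. Requires every F < A."""
    y = [-math.log(1.0 - f / amplitude) for f in fractions]
    return sum(t * yi for t, yi in zip(times, y)) / sum(t * t for t in times)


# Synthetic, noise-free time course generated with k_obs = 0.90 min^-1,
# the central value reported for the reaction without DNA trap (Fig. 4.7b).
times = [0.5, 1.0, 2.0, 4.0]  # minutes, hypothetical sampling times
synthetic = [fraction_unwound(t, 0.90) for t in times]
k_fit = estimate_k_obs(times, synthetic)
print(round(k_fit, 3))
```

With real data, a nonlinear fit that also floats the amplitude is generally preferable, because the logarithmic transform amplifies noise near the reaction endpoint.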

4.2.3 The DSM promotes remodeling of RNA quadruplexes and duplexes and binding to RNA quadruplexes

Utilizing the substrates and conditions described above, we next examined the role of the DHX36-specific motif (DSM) for functional binding and remodeling of RNA substrates. The N-terminal DSM, which is conserved among DHX36 orthologues (Fig. 4.8a), contacts the quadruplex in a recent crystal structure (6, Fig. 4.8b). The DSM promoted DHX36 binding to quadruplexes in several biochemical studies 45,313,264, but appeared dispensable for quadruplex binding in other work 38,314. Whether and how the DSM impacts the remodeling of RNA duplex substrates by DHX36 is not known.

Figure 4.8. The N-terminal DSM motif in DHX36. (a) Sequence alignment of the DSM in DHX36 orthologues. Conservation of the DSM among DHX36 orthologues (Mus musculus, NP_082412; Homo sapiens, NP_065916; Bos taurus, NP_001073720; Drosophila melanogaster, NP_610056; Gallus gallus, NP_422834; Xenopus tropicalis, ENXSETP00000016958; and Danio rerio, NP_001122016). Sequence alignment was performed with Clustal Omega 19 and visualized with BoxShade (http://sourceforge.net/projects/boxshade/). Identical residues: black; similar residues: grey. (b) Location of the DSM (green ribbon, box) in bovine DHX36 bound to a DNA quadruplex structure (PDB: 5VHE).

To systematically characterize the contribution of the DSM to RNA quadruplex and duplex remodeling, we deleted the DSM from mDHX36$_{46-982}$ (mDHX36Δ-DSM) and measured remodeling of the three substrates tested above (Fig. 4.5-4.6). mDHX36Δ-DSM remodeled all substrates in an ATP-dependent manner (Fig. 4.9). Differences between wild type (WT) mDHX36 and mDHX36Δ-DSM were apparent (Fig. 4.9). For the duplex substrate, the remodeling rate constant at enzyme saturation ($k_{obs}^{max}$) was lower for mDHX36Δ-DSM, compared to WT mDHX36 (Fig. 4.9a, lower left panel). The functional affinity ($K_{1/2}$) was similar for both enzyme variants, within error (Fig. 4.9a, lower right panel). The functional binding isotherm for mDHX36Δ-DSM was sigmoidal (H = 1.9 ± 0.6), further supporting the notion that multiple protomers of mDHX36 cooperate to unwind the duplex substrate.

Figure 4.9. Impact of deletion of the DSM. (a) Top panel: Representative PAGE for RNA duplex remodeling reactions (100 nM mDHX36Δ-DSM, 2 mM ATP-Mg2+, 0.5 nM RNA substrate). Middle panel: Dependence of the remodeling rate constant (2 mM ATP-Mg2+) on mDHX36Δ-DSM (red) concentrations for the duplex substrate. Reactions for WT mDHX36 (blue, Fig. 4.6b) are shown for comparison. Data points are averages from multiple independent reactions (N ≥ 4). Error bars mark one standard deviation. The lines represent the best fit to the Hill equation (Fig. 4.6b), yielding for mDHX36Δ-DSM: $k_{obs}^{max}$ = 0.65 ± 0.15 min$^{-1}$, $K_{1/2}$ = 460 ± 153 nM, H = 1.9 ± 0.6. Bottom panel: Comparison of $k_{obs}^{max}$ and $K_{1/2}$ values for WT mDHX36 (Fig. 4.6b) and mDHX36Δ-DSM for the RNA duplex substrate. Error bars indicate the standard error of the data fit to the Hill equation. (b) Top panel: Representative PAGE for RNA GQ4 remodeling reactions (conditions as in a). Middle panel: Dependence of the remodeling rate constant (2 mM ATP-Mg2+) on mDHX36Δ-DSM (red) concentrations for the GQ4 substrate. Reactions for WT mDHX36 (blue, Fig. 4.6c) are shown for comparison. Data points are averages from multiple independent reactions (N ≥ 4). Error bars mark one standard deviation. The lines represent the best fit to the binding isotherm (Fig. 4.6c), yielding for mDHX36Δ-DSM: $k_{obs}^{max}$ = 1.97 ± 0.33 min$^{-1}$, $K_{1/2}$ = 120 ± 61 nM. Bottom panel: Comparison of $k_{obs}^{max}$ and $K_{1/2}$ values for WT mDHX36 (Fig. 4.6c) and mDHX36Δ-DSM for the GQ4 substrate. Error bars indicate the standard error of the data fit to the binding isotherm. (c) Top panel: Representative PAGE for RNA GQ1 remodeling reactions (conditions as in a,b). Middle panel: Dependence of the remodeling rate constant (2 mM ATP-Mg2+) on mDHX36Δ-DSM (red) concentrations for the GQ1 substrate. Reactions for WT mDHX36 (blue, Fig. 4.6d) are shown for comparison. Data points are averages from multiple independent reactions (N ≥ 4). Error bars mark one standard deviation. The lines represent the best fit to the binding isotherm (Fig. 4.6d), yielding for mDHX36Δ-DSM: $k_{obs}^{max}$ = 1.2 ± 0.1 min$^{-1}$, $K_{1/2}$ = 75 ± 24 nM. Bottom panel: Comparison of $k_{obs}^{max}$ and $K_{1/2}$ values for WT mDHX36 (Fig. 4.6d) and mDHX36Δ-DSM for the GQ1 substrate. Error bars indicate the standard error of the data fit to the Hill equation.

For the intermolecular quadruplex substrate (GQ4), $k_{obs}^{max}$ and functional affinity ($K_{1/2}$) were markedly reduced for mDHX36Δ-DSM, compared to WT mDHX36 (Fig. 4.9b). For the intramolecular quadruplex substrate (GQI), $k_{obs}^{max}$ and functional affinity were also reduced for mDHX36Δ-DSM, compared to WT mDHX36 (Fig. 4.9c). Deletion of the DSM reduced the functional affinity for GQI by almost four orders of magnitude (Fig. 4.9c, lower right panel). Yet, the functional affinity of mDHX36Δ-DSM for both quadruplex substrates was similar, even though WT mDHX36 binds significantly tighter to the intramolecular than to the intermolecular quadruplex substrate (Fig. 4.9b,c).

The data thus show that the DSM promotes binding to RNA quadruplexes, but not to duplexes. The differences in the functional affinity of WT mDHX36 for the intra- and the intermolecular quadruplex structures suggest that mDHX36 prefers one or more specific orientations of unpaired RNA in the quadruplex structures, that multiple options for mDHX36 binding to a single quadruplex are disadvantageous for a functional association, or both. The DSM appears to confer this characteristic on mDHX36, because deletion of the DSM largely eliminates the differences in the functional affinity for the two quadruplex substrates. However, the functional affinity of mDHX36Δ-DSM for the duplex substrate is still lower than that for the quadruplex substrates, indicating that removal of the DSM does not entirely abolish the preferential binding of mDHX36 to quadruplexes.
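The size of these effects can be tabulated directly from the fit parameters reported in the captions of Fig. 4.6 and Fig. 4.9. The sketch below computes fold changes from the central values only and ignores the reported uncertainties. The dictionary layout and function name are illustrative assumptions. Because the WT value for the intramolecular quadruplex is an upper limit ($K_{1/2}$ < 0.06 nM), the corresponding fold change should be read as a lower bound.

```python
import math

# (k_obs_max in min^-1, K_1/2 in nM), central values from Fig. 4.6 and Fig. 4.9.
wt = {"duplex": (1.1, 363.0), "GQ4": (4.6, 20.9), "GQ1": (4.6, 0.06)}  # GQ1 K_1/2 is an upper limit
delta_dsm = {"duplex": (0.65, 460.0), "GQ4": (1.97, 120.0), "GQ1": (1.2, 75.0)}


def fold_changes(wt_params, mutant_params):
    """Return {substrate: (fold decrease in k_obs_max, fold increase in K_1/2)}."""
    result = {}
    for substrate, (k_wt, khalf_wt) in wt_params.items():
        k_mut, khalf_mut = mutant_params[substrate]
        result[substrate] = (k_wt / k_mut, khalf_mut / khalf_wt)
    return result


changes = fold_changes(wt, delta_dsm)
for substrate, (k_fold, khalf_fold) in changes.items():
    print(substrate, round(k_fold, 2), round(khalf_fold, 1), round(math.log10(khalf_fold), 2))
```

On these central values, the change in $K_{1/2}$ is small for the duplex, several-fold for the intermolecular quadruplex, and at least three orders of magnitude for the intramolecular quadruplex.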

Our data further reveal a previously unappreciated role of the DSM in the remodeling step of quadruplex and duplex substrates, reflected in the reduced $k_{obs}^{max}$ values for mDHX36Δ-DSM, compared to WT mDHX36 (Fig. 4.9a-c). Although the size of this effect varies for the different substrates, the data suggest a role for the DSM in the conformational rearrangement of the DHX36 domains that accompanies the RNA remodeling step, translocation events, or both 6. In sum, our data show that (i) the DSM is a binding adaptor for quadruplex, but not for duplex structures, and (ii) the DSM plays a role in the remodeling step of both quadruplex and duplex substrates.

4.2.4 The OB-fold promotes binding and remodeling of quadruplex and duplex structures.

We next probed the role of the OB-fold (Fig. 4.10a). This domain is located C-terminal relative to the helicase core and is conserved across the DEAH/RHA family 29,25,265. The OB-fold in DHX36 contacts the nucleic acid (6,38, Fig. 4.10a). Based on these contacts, a role of the domain in quadruplex recognition was proposed 6,38. However, mutations of residues that contact the nucleic acid did not impact remodeling 6.

Figure 4.10. Impact of deletion of the OB-fold on the RNA remodeling activity. (a) Location of the OB-fold in bovine DHX36 bound to a DNA quadruplex structure (PDB: 5VHE). (b) Representative PAGEs for RNA remodeling reactions (100 nM mDHX36∆OB-fold, 2 mM ATP-Mg2+, 0.5 nM RNA substrate) for the RNA duplex (left), the GQ4 (middle) and the GQ1 (right) substrates.

To clarify the role of the OB-fold for DHX36 activity, we deleted the domain from mDHX36$_{46-982}$ (mDHX36ΔOB-fold). We examined the remodeling activity of mDHX36ΔOB-fold on the RNA substrates tested above. We did not detect significant remodeling for any of the substrates tested (Fig. 4.10b). To nevertheless obtain insight into functional characteristics of mDHX36ΔOB-fold, we measured the RNA-stimulated ATPase activity (Fig. 4.11). Without RNA, mDHX36ΔOB-fold showed basal ATPase activity comparable to WT mDHX36 (Fig. 4.11b). In contrast to WT mDHX36, no RNA stimulation of the ATPase activity was seen with mDHX36ΔOB-fold (Fig. 4.11a,c). The ATPase activity remained at the level of the unstimulated activity (Fig. 4.11c). These data suggested that mDHX36ΔOB-fold retained the capacity to hydrolyse ATP, but lost the ability to bind RNA, to couple RNA binding to ATP turnover, or both.

Figure 4.11. Impact of deletion of the OB-fold on the ATPase activity of mDHX36. (a) Left panel: Reaction scheme for steady-state ATPase reactions of mDHX36 with RNA substrates. Right panel: Representative TLC images of ATPase reactions with WT mDHX36 (left) and mDHX36ΔOB-fold (right) (260 nM mDHX36, 0.5 mM equimolar ATP-Mg2+, trace amounts of [γ-32P] ATP) with and without RNA (2 µM), as indicated on the right. (b) Representative ATP hydrolysis time courses indicating basal ATP hydrolysis rates without RNA (260 nM WT mDHX36 (blue circles), with and without 1 U/µl RNase; 260 nM mDHX36ΔOB-fold (orange circles), with and without 1 U/µl RNase). Data points represent the average of two independent measurements. Error bars mark one standard deviation. Lines represent linear least squares fits to the initial phase of the reaction, yielding initial rates ($r_0$). Without mDHX36, $r_0$ = 0.00011 ± 0.00008 min$^{-1}$; with WT mDHX36, $r_0$ = 0.0018 ± 0.0001 min$^{-1}$; with WT mDHX36 and RNase, $r_0$ = 0.0014 ± 0.0006 min$^{-1}$; with mDHX36ΔOB-fold, $r_0$ = 0.0016 ± 0.0001 min$^{-1}$; with mDHX36ΔOB-fold and RNase, $r_0$ = 0.0016 ± 0.0002 min$^{-1}$. (c) Initial reaction velocity ($V_0$) of WT mDHX36 (blue) and mDHX36ΔOB-fold (orange), without or with RNA as indicated (266 nM mDHX36, trace amounts of [γ-32P] ATP, 0.5 mM ATP-Mg2+, 2 µM RNA). Values for $V_0$ were obtained by multiplying the initial reaction rates with the total ATP concentration for the linear part of the progress curve (< 20% product formation). Data represent the average of 3 independent measurements. The error bars mark one standard deviation. $V_0$ values for WT mDHX36 were: no RNA, $V_0$ = 0.9 ± 0.1 µM min$^{-1}$; no RNA and RNase, $V_0$ = 1.0 ± 0.3 µM min$^{-1}$; with ssRNA, $V_0$ = 10.2 ± 1.3 µM min$^{-1}$; with the RNA duplex, $V_0$ = 7.8 ± 2.8 µM min$^{-1}$; with the GQ4, $V_0$ = 6.6 ± 2.1 µM min$^{-1}$; and with the GQ1, $V_0$ = 10.6 ± 0.8 µM min$^{-1}$. For mDHX36ΔOB-fold: no RNA, $V_0$ = 1.1 ± 0.5 µM min$^{-1}$; no RNA and RNase, $V_0$ = 0.9 ± 0.1 µM min$^{-1}$; with ssRNA, $V_0$ = 1.0 ± 0.2 µM min$^{-1}$; with the RNA duplex, $V_0$ = 1.3 ± 0.3 µM min$^{-1}$; with the GQ4, $V_0$ = 1.3 ± 0.5 µM min$^{-1}$; and with the GQ1, $V_0$ = 0.8 ± 0.2 µM min$^{-1}$.
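The conversion from initial rates to initial velocities described in the caption of Fig. 4.11 is a single multiplication, sketched below in Python. The total ATP concentration of 500 µM corresponds to the stated 0.5 mM ATP-Mg2+; the function name and dictionary keys are illustrative assumptions. The initial rates are the central values from panel (b), which derive from a different set of replicates than panel (c), so the converted values can be expected to agree with the reported $V_0$ values only approximately.

```python
ATP_TOTAL_UM = 500.0  # 0.5 mM ATP-Mg2+, expressed in micromolar


def initial_velocity(r0_per_min, atp_total_um=ATP_TOTAL_UM):
    """Convert an initial rate (fraction of ATP hydrolysed per min)
    into an initial velocity in micromolar ATP per min."""
    return r0_per_min * atp_total_um


# Central values of the basal initial rates from Fig. 4.11b (min^-1).
basal_r0 = {
    "no enzyme": 0.00011,
    "WT mDHX36": 0.0018,
    "mDHX36 dOB-fold": 0.0016,
}

basal_v0 = {name: initial_velocity(r0) for name, r0 in basal_r0.items()}
for name, v0 in basal_v0.items():
    print(name, round(v0, 2))
```

For WT mDHX36 without RNA, the converted value of about 0.9 µM per minute matches the $V_0$ reported in panel (c).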


To examine the role of the OB-fold for RNA binding, we measured equilibrium binding of mDHX36ΔOB-fold to the RNA substrates tested above and to a 31 nt single-stranded RNA (Fig. 4.12). We observed reduced binding of mDHX36ΔOB-fold to all substrates, compared to WT mDHX36 (Fig. 4.12). WT mDHX36 showed the lowest affinity for ssRNA, followed by the duplex substrate and the quadruplex substrates (Fig. 4.12). Affinities of mDHX36ΔOB-fold followed the same trend but were generally lower than for WT mDHX36 (Fig. 4.12). For the intermolecular GQ4 substrate, which contains four unpaired RNA regions, we noticed multiple bound species (Fig. 4.12), consistent with the binding of several mDHX36 protomers to the RNA. Apparent equilibrium constants of $K_{1/2}$ ~ 50 nM for mDHX36ΔOB-fold binding to the quadruplex substrates indicated a clear ability to bind the quadruplex. A comparison of the apparent affinities of WT mDHX36 and mDHX36ΔOB-fold for the three substrates suggested that deletion of the OB-fold results in a loss of binding energy that is similar for all substrates (Fig. 4.16). This observation implies that the OB-fold contributes a binding free energy that is roughly similar for all substrates and largely additive to the binding free energy provided by other nucleic acid binding sites in mDHX36.

Figure 4.12. Impact of deletion of the OB-fold on RNA binding. Representative PAGEs of equilibrium binding reactions with increasing concentrations of WT mDHX36 (left panels) and mDHX36ΔOB-fold (middle panels) and RNAs as in Fig. 4.10-4.11. Cartoons mark free and protein-bound RNA, asterisks show the radiolabel. Right panels: plots of fraction of bound RNA as a function of enzyme concentration. Data points represent an average of three independent measurements. Error bars mark one standard deviation. Lines represent trends.
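The link between a fold change in apparent affinity and a loss of binding free energy can be made explicit with the standard relation ΔΔG = RT ln(K_mutant / K_WT). The sketch below is illustrative only: the affinities passed to the function are hypothetical placeholders (the measured values are not listed in this section), the temperature of 298.15 K is an assumption, and apparent $K_{1/2}$ values are treated as if they were equilibrium dissociation constants.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def ddg_binding(k_mutant_nM, k_wt_nM, temp_K=298.15):
    """Loss of binding free energy (kcal/mol) implied by a change in
    apparent affinity: ddG = R * T * ln(K_mutant / K_WT)."""
    return R_KCAL * temp_K * math.log(k_mutant_nM / k_wt_nM)


# Hypothetical affinities (nM): the same 10-fold loss on a tight and a weak binder.
tight = ddg_binding(k_mutant_nM=50.0, k_wt_nM=5.0)
weak = ddg_binding(k_mutant_nM=5000.0, k_wt_nM=500.0)
print(round(tight, 2), round(weak, 2))
```

The point of the example is that an identical fold loss in affinity corresponds to an identical ΔΔG regardless of the absolute affinity, which is the sense in which a domain can contribute a similar, additive binding energy to substrates that are bound with very different strengths.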

Taken together, the RNA binding data and the lack of ATPase stimulation of mDHX36ΔOB-fold by any RNA substrate indicate a role of the OB-fold in enabling the coupling of the ATPase cycle to nucleic acid binding, and thus a function beyond RNA binding. If the OB-fold affected only RNA binding, the quadruplex substrates, which mDHX36ΔOB-fold still binds with $K_{1/2}$ ~ 50 nM, should measurably stimulate the ATPase activity at the RNA concentration used (2 µM). However, no stimulation was observed (Fig. 4.11c). Our data thus collectively indicate that the OB-fold, beyond a role in promoting binding to nucleic acids, is essential for coupling of the ATPase cycle to nucleic acid binding and thus for the ATP-driven remodeling of quadruplex and duplex structures. The impact of the OB-fold deletion on the binding to quadruplex structures further suggests that the DSM alone is not sufficient to confer the high affinity for quadruplexes to mDHX36 (Fig. 4.10a). Instead, high affinity binding of mDHX36 to quadruplex structures appears to involve multiple mDHX36 domains.
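The argument that residual RNA binding should have produced ATPase stimulation can be checked with a back-of-the-envelope occupancy estimate. The sketch below assumes simple 1:1 binding with RNA in excess over enzyme (2 µM RNA versus 266 nM enzyme, as stated in the caption of Fig. 4.11), and it assumes that the apparent $K_{1/2}$ of about 50 nM from the equilibrium binding measurements also applies under ATPase reaction conditions. Both are simplifying assumptions, and the function name is illustrative.

```python
def fraction_bound(rna_nM, k_half_nM):
    """Fraction of enzyme bound to RNA for simple 1:1 binding,
    assuming RNA is in excess so that free RNA ~ total RNA."""
    return rna_nM / (k_half_nM + rna_nM)


# 2 uM quadruplex RNA and K_1/2 ~ 50 nM for the OB-fold deletion variant.
occupancy = fraction_bound(rna_nM=2000.0, k_half_nM=50.0)
print(f"fraction of enzyme bound: {occupancy:.2f}")
```

Under these assumptions, nearly all of the enzyme would be RNA-bound, so the absence of stimulation is difficult to explain by a binding defect alone.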

4.2.5 The β-hairpin promotes binding and remodeling of quadruplex and duplex structures.

The OB-fold interacts with another conserved structural feature of the DEAH/RHA family: a β-hairpin (β-HP), which is inserted into the RecA2 domain and contacts the OB-fold across the nucleic acid (Fig. 4.13a, 29,25,265). The β-HP is also present in Ski2-like helicases, where it has been proposed to promote translocation and strand separation 26. However, the function of the β-HP has not yet been experimentally validated in any helicase. To determine the role of the β-HP for DHX36 activity, we generated an mDHX36 variant in which we deleted this domain from the mDHX36$_{46-982}$ construct (mDHX36Δβ-HP, Fig. 4.13a). In addition, we generated a variant in which we replaced the top of the hairpin, which contacts the OB-fold, with a flexible glycine linker, leaving a shortened β-HP (mDHX36short β-HP, Fig. 4.13a). We next examined the remodeling activity of both mDHX36Δβ-HP and mDHX36short β-HP on the RNA substrates tested above. No significant remodeling for any substrate was detected (Fig. 4.13b).

Figure 4.13. Impact of deletion and truncation of the 5’-β-HP on RNA remodeling. (a) Top panel: Residues comprising the 5’-β-HP. Residues deleted to generate the mDHX36∆β-HP construct are marked in red. Residues replaced by a glycine to generate the mDHX36short β-HP construct are marked in brown. Bottom Panel: Location of the 5’-β-HP in DHX36 bound to a DNA quadruplex structure (PDB: 5VHE). (b) Representative PAGEs for RNA remodeling reactions. Left panels: Reactions with 100 nM mDHX36∆β-HP, 2 mM ATP-Mg2+, 0.5 nM RNA substrates as indicated by the cartoons on the left. Right panels: Reactions with 100 nM mDHX36short β-HP, 2 mM ATP-Mg2+, 0.5 nM RNA substrates as indicated.

We then measured the RNA-stimulated ATPase activity of mDHX36Δβ-HP and mDHX36short β-HP (Fig. 4.14). In the absence of RNA, ATPase activity was higher for mDHX36Δβ-HP than for WT mDHX36 (Fig. 4.14). Addition of RNA did not significantly stimulate the ATPase activity of mDHX36Δβ-HP, although it did for WT mDHX36 (Fig. 4.14). The increase in basal ATPase activity in mDHX36Δβ-HP, compared to WT mDHX36, raises the possibility that complete removal of the hairpin relaxes the coupling of RNA binding and ATP binding or turnover. mDHX36short β-HP showed reduced ATPase activity under all conditions, with no stimulation by RNA (Fig. 4.14). Together, these data suggested that mDHX36Δβ-HP and mDHX36short β-HP could still hydrolyse ATP, but were defective in RNA binding, in the ability to couple RNA binding to ATP turnover, or in both.

Figure 4.14. Impact of deletion and truncation of the 5’-β-HP on the ATPase activity. Initial ATPase velocities (V0) for WT mDHX36 (blue, for comparison, Fig. 4.11c), mDHX36short β-HP (purple) and mDHX36∆β-HP (green) with and without RNA substrates as indicated (266 nM mDHX36, trace amounts of [γ-32P] ATP, 0.5 mM ATP-Mg2+, 2 µM RNA substrate). Data represent the average of three independent measurements. Error bars mark one standard deviation. Values for V0 were, for mDHX36∆β-HP: without RNA, V0 = 2.9 ± 0.3 µM min⁻¹; without RNA and with RNase, V0 = 2.3 ± 0.1 µM min⁻¹; with ssRNA, V0 = 3.3 ± 0.6 µM min⁻¹; with RNA duplex substrate, V0 = 3.1 ± 1.2 µM min⁻¹; with GQ4 substrate, V0 = 5.0 ± 1.1 µM min⁻¹; and with GQ1 substrate, V0 = 3.3 ± 0.7 µM min⁻¹. For mDHX36short β-HP: without RNA, V0 = 0.6 ± 0.2 µM min⁻¹; without RNA and with RNase, V0 = 0.4 ± 0.2 µM min⁻¹; with ssRNA, V0 = 0.6 ± 0.1 µM min⁻¹; with RNA duplex substrate, V0 = 0.6 ± 0.1 µM min⁻¹; with GQ4 substrate, V0 = 0.8 ± 0.3 µM min⁻¹; and with GQ1 substrate, V0 = 0.6 ± 0.1 µM min⁻¹.

We next measured the equilibrium binding of mDHX36Δβ-HP and mDHX36short β-HP to the RNA substrates tested above (Fig. 4.15). For the ssRNA, we detected much reduced binding of mDHX36Δβ-HP, compared to WT mDHX36 (Fig. 4.15, left and right panel in upper row). Virtually no binding of mDHX36short β-HP to the ssRNA was seen (Fig. 4.15d, middle and right panel in upper row). For all other substrates, we observed similarly reduced binding of mDHX36Δβ-HP and mDHX36short β-HP, compared to WT mDHX36 (Fig. 4.15). Of note, removal or shortening of the β-HP (mDHX36Δβ-HP and mDHX36short β-HP) lowered affinity for structured RNAs less than removal of the OB-fold (Fig. 4.12), even though deletion of either domain abrogated remodeling activity.

Figure 4.15. Impact of deletion and truncation of the 5’-β-HP on the RNA binding affinity. Representative PAGEs of equilibrium binding reactions with increasing concentrations of mDHX36∆β-HP (left panels) and mDHX36short β-HP (middle panels) for RNA structures as indicated. Cartoons mark free and protein-bound RNA, asterisks show the radiolabel. Right panels: plots of fraction of bound RNA as a function of enzyme concentration. Data points represent an average of three independent measurements. Error bars mark one standard deviation. Lines represent smoothed trends.

Comparison of the apparent affinities of WT mDHX36, mDHX36Δβ-HP, and mDHX36short β-HP for the three substrates suggested that both removal and shortening of the β-HP resulted in a loss of binding energy that is similar for all substrates (Fig. 4.16). As seen for the OB-fold, this observation implies that the β-HP contributes free binding energy to nucleic acid binding that is roughly similar for all substrates and largely additive to the free binding energy provided by other nucleic acid binding sites in mDHX36.


Figure 4.16. Energetic differences of WT and mutant mDHX36 for binding to the different RNA substrates. Effects of the OB-fold and β-HP deletions on the free energies of equilibrium binding for each RNA substrate. The effect is expressed as the difference in free energies, ΔΔG = ΔG(WT DHX36) - ΔG(DHX36mutant). Free energies for equilibrium RNA binding were calculated according to $\Delta G^{\circ} = -RT \ln(K_{1/2}^{RNA})$, using the K1/2 determined in Fig. 4.12d (mDHX36WT, mDHX36∆OB-fold) and Fig. 4.15d (mDHX36∆β-HP, mDHX36short β-HP).
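The free-energy comparison described in the legend of Fig. 4.16 can be sketched in a few lines of Python. This is only an illustration of the arithmetic, following the legend's stated convention (ΔG° = -RT·ln K1/2 and ΔΔG = ΔG(WT) - ΔG(mutant)). The temperature, the function names, and the K1/2 values are assumptions introduced here for illustration; they are not the measured values from Figs. 4.12d and 4.15d.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3   # gas constant, kcal mol^-1 K^-1
TEMPERATURE_K = 303.15        # assumption: 30 °C; substitute the actual assay temperature


def delta_g(k_half_molar, temperature_k=TEMPERATURE_K):
    """Free energy in kcal/mol, using the legend's convention dG = -RT ln(K1/2)."""
    return -R_KCAL_PER_MOL_K * temperature_k * math.log(k_half_molar)


def delta_delta_g(k_half_wt_molar, k_half_mutant_molar, temperature_k=TEMPERATURE_K):
    """ddG = dG(WT) - dG(mutant), as defined in the legend."""
    return (delta_g(k_half_wt_molar, temperature_k)
            - delta_g(k_half_mutant_molar, temperature_k))


# Hypothetical K1/2 values in molar units (placeholders, NOT measured data):
# each substrate is given a 10-fold weaker affinity in the mutant.
HYPOTHETICAL_K_HALF = {
    "substrate A": (5e-9, 5e-8),
    "substrate B": (2e-8, 2e-7),
}

if __name__ == "__main__":
    for name, (k_wt, k_mut) in HYPOTHETICAL_K_HALF.items():
        print(f"{name}: ddG = {delta_delta_g(k_wt, k_mut):.2f} kcal/mol")
```

Because ΔΔG depends only on the ratio of the two K1/2 values, equal fold changes in affinity give equal ΔΔG regardless of the absolute affinity for each substrate. This is the sense in which a deleted element can contribute a similar binding energy for all substrates.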

We also examined the overall folding of all the mutant variants by circular dichroism (CD, Fig. 4.17). At 30 °C, mDHX36ΔDSM, mDHX36ΔOB-fold, mDHX36Δβ-HP, mDHX36short β-HP, and WT mDHX36 had highly similar spectra, indicating significant α-helical content and thus overall folding of the constructs (Fig. 4.17).

Figure 4.17. Circular dichroism analysis of mDHX36 variants used in this study. Overlaid CD spectra of WT mDHX36 (blue), mDHX36∆OB-fold (red), mDHX36∆β-HP (green), mDHX36short β-HP (lilac), and mDHX36∆DSM (brown). The spectra of the wild-type protein and all four DHX36 variants are similar.

Collectively, our data indicate that the β-HP, much like the OB-fold, is essential for remodeling of quadruplex and duplex structures and promotes binding to nucleic acids. Together with the data for mDHX36ΔOB-fold, these observations suggest that the conserved domains contacting the nucleic acid are not only critical for promoting binding of mDHX36 to the RNA, but also for organizing the helicase domains to allow for coupling of RNA binding to the ATPase cycle and thus for the ATP-driven remodeling of duplex and quadruplex structures.

4.3 Discussion

In this structure-function analysis of mouse DHX36, we report the crystal structure of the protein bound to ADP and define the biochemical function of three auxiliary domains that are conserved among DHX36 orthologs. Our crystal structures of mDHX36 bound to ADP show a domain architecture resembling fly and bovine DHX36 and structures of other DEAH/RHA helicases 6,38,265,57. Differences from previous DHX36 structures include a rotation of the WH, ratchet, and OB-fold, compared to structures with bound nucleic acids and/or nucleotide analogs, and a closed conformation of the two helicase core domains (Fig. 4.1). These structural differences are consistent with conformational changes that accompany stages of ATP binding and hydrolysis and the RNA remodeling cycle. While these structural differences are predominantly in highly conserved regions, it is possible that differences, especially in less conserved regions, could also reflect variations between the bovine and the mouse proteins.

Our structure-function analysis focused on roles of the DSM, the OB-fold and the β-HP, three conserved domains that contact nucleic acids. We examined the roles of the domains in the remodeling of RNA structures. Although DHX36 is functionally and physically linked to RNA in cells 242,8, the majority of biochemical studies had been conducted on DNA substrates. Remodeling of an intermolecular RNA GQ4 substrate by DHX36 had been demonstrated previously 65, but differences in experimental conditions between that study and our work preclude a direct comparison of obtained values. Yet, the biochemical characterization of RNA quadruplex and duplex remodeling activities is, to our knowledge, the first comparative biochemical analysis of remodeling of different RNA structures and the first demonstration of RNA helicase activity by DHX36.

The functional affinity of mDHX36 for quadruplex substrates is higher than for duplex substrates. This finding is consistent with the preferential association of DHX36 to quadruplex regions in cells 242 and mirrors observations for DNA substrates 45,246. The functional affinity for the intramolecular quadruplex substrate was higher than for the intermolecular quadruplex substrate with an identical number of G-tetrads, suggesting that mDHX36 prefers a substrate with a single unpaired region over substrates with multiple unpaired tails. At saturating concentrations of mDHX36, both quadruplex substrates are unwound at virtually identical rates. This observation implies one or more similar remodeling steps for quadruplex substrates with different architectures.

The remodeling stage of the duplex substrate is traversed slower than that for the quadruplex substrates, even though the thermodynamic stability of the quadruplex structures is higher than that of the duplex substrate 311,315. We speculate that this difference is caused by the partial ATP-independent destabilization of the quadruplex, which leaves fewer G-tetrads for the ATP-dependent, translocation-based remodeling step 6,38,264. To remodel the duplex, DHX36 presumably has to translocate more nucleotides, which could conceivably result in a slower remodeling rate constant. However, rigorous mechanistic interpretations of the differences in remodeling rate constants between quadruplex and duplex substrates have to await a model for the mechanism of duplex remodeling by

DHX36. In this context, it is likely important to investigate the of DHX36

during duplex remodeling. Our data suggest that multiple mDHX36 protomers cooperate to unwind the duplex (Fig. 4.6). By contrast, quadruplex remodeling requires only a single protomer 264.

Our structure-function analysis of the DSM, the OB-fold, and the β-HP reveals functions of these domains in the RNA remodeling cycle beyond RNA binding. This finding is particularly remarkable for the DSM, a small segment that contacts the quadruplex 6. Consistent with these contacts, deletion of the DSM confers binding defects for quadruplex, but not for duplex substrates (Fig. 4.9), indicating that the DSM is a binding adaptor for quadruplex, but not for duplex structures. Yet, deletion of the DSM also decreases the rate constant for the remodeling step(s), notably for duplex remodeling (Fig. 4.9). These findings reveal a role for the DSM in the remodeling step of both quadruplex and duplex substrates. We speculate that the impact on the remodeling step(s) arises from the linker that connects the DSM to the helicase domains and thereby presumably contacts several DHX36 domains, akin to a brace. This linker might not be appropriately positioned across DHX36 without the DSM, thereby impairing the remodeling step.

The OB-fold, which is conserved among DEAH/RHA helicases, is an integral part of the DHX36 architecture and contacts unpaired nucleotides of the substrate 6, 38. Accordingly, we find that the OB-fold promotes RNA binding for all substrates, contributing approximately the same free binding energy for all substrates. However, we find that the OB-fold is also essential for coupling of the ATPase cycle to nucleic acid binding and thus for the ATP-driven remodeling of quadruplex and duplex structures (Fig. 4.10-4.12). Our findings mirror observations for the DEAH helicase Prp43p, where the OB-fold is essential for activity in vivo 43,316, and deletion of this domain impairs RNA binding and ATPase stimulation by RNA 43. In the RHA helicase MLE, mutations in the OB-fold residues diminish RNA-stimulated ATPase activity of the helicase 46. While the defects in nucleic acid binding in DHX36 and other DEAH/RHA helicases are readily explained by the loss of the RNA contacts, delineation of the exact function of the OB-fold in the coupling of RNA/DNA binding to the ATPase cycle requires more experimentation. We speculate that the OB-fold, together with other domains, is essential for the conformational changes needed to couple RNA binding to the ATPase cycle. This notion would be consistent with the conformational changes in our structure of mDHX36 bound to ADP, relative to structures at different stages of the ATP hydrolysis cycle 6.

We also examined the role of the 5’ β-hairpin (β-HP), which protrudes from the RecA2 domain between motifs V and VI and interacts with the OB-fold (Fig. 4.1). The β-HP is conserved across the DEAH/RHA family and also present in Ski2-like and NS3/NPH-II helicases 26,39. Our data show that the β-HP, like the OB-fold, is essential for remodeling of quadruplex and duplex structures and also promotes binding to nucleic acids. The β-HP contributes approximately the same free binding energy to RNA binding, regardless of which substrate is bound. This observation is consistent with contacts of the β-HP to the unpaired nucleic acid region 6. In addition, both deletion and shortening of the β-HP abrogate ATP-driven RNA remodeling and coupling of the ATPase cycle to RNA binding (Fig. 4.13-4.14). These observations suggest that the β-HP also organizes conformational changes of perhaps multiple mDHX36 domains in response to RNA and ATP binding and ATP hydrolysis. This function appears sensible, because the β-HP bridges several enzyme domains (RecA2, HA2, OB-fold) and also contacts the NA 6. Deletion of the β-HP, which eliminates contacts between the RecA2 and the extended CTD domain, increases the basal, non-RNA stimulated ATPase activity of mDHX36, compared to WT and other mDHX36 variants (Fig. 4.14). This observation suggests a role of the β-HP in organizing mDHX36 domains even without RNA bound. A critical role of the β-HP for enzyme activity, while not previously demonstrated for DHX36, is consistent with findings for other helicases with a β-HP. In Prp43p, point mutations in the β-HP confer cold-sensitive and slow growth phenotypes 316,62. In DHX29, point mutations in the β-HP lead to defects in 48S formation in vivo, and a reduction in RNA-stimulated ATPase activity in vitro 63.

Collectively, our structure-function analysis indicates that mDHX36 domains which contact nucleic acids have roles beyond creating the DNA or RNA binding site. Consistent with the compact architecture of mDHX36, the DSM, the OB-fold and the β-HP domains contribute to the coupling of RNA binding to the ATPase cycle and thus to RNA remodeling. Although the architecture of mDHX36 consists of clear modules, all of these appear to work closely together during several reaction steps. The degree of conservation of the domain architecture in DEAH/RHA helicases raises the possibility that the domains in other enzymes of this family play similar, tightly integrated roles. Despite this integration, each helicase appears adapted to specific functions, at least in cells. For DHX36, this is an association with quadruplex substrates. Yet, the enzyme readily unwinds RNA duplexes.


Chapter 5

The nucleotide selectivity of DHX36 influences its RNA substrate selectivity and vice versa

5.1 Introduction

The RNP remodeling activity of DHX36 (aka RHAU/G4R1) has been implicated in the regulation of key cellular processes, including transcription, translation, and the cellular stress response 239, 242-244. DHX36 has been associated with two antagonistic in vivo binding motifs: it binds to A/U-rich elements in the 3’ UTR of the uPA mRNA and recruits destabilizing factors, such as PARN 126, and it also interacts with and remodels DNA or RNA GQ structures 237, 246. A recent transcriptome-wide study reconciled these differing binding preferences by showing that the major sites of action of DHX36 are G-rich sequence elements and associated structures 242. However, transient and fast interactions with (less structured) AU-rich sites were also identified, and these regions were thought to serve as recruitment sites for DHX36 242. In vitro, the helicase readily unwinds RNA duplexes as well as GQ structures. However, the actual remodeling step of the GQ is traversed faster than that for the duplex substrate. Additionally, DHX36 preferentially bound RNA substrates with a GQ structure, compared to those with a duplex structure, and this high affinity binding appears to involve the DSM and OB-fold domains of DHX36 317.

Both biochemical and cellular data have shown differing DHX36 activities towards different RNA substrates. In a typical cell with thousands of different RNAs, the interaction between DHX36 and a specific RNA site is not just dictated by the inherent affinity of DHX36 for that RNA site. Other factors, such as the concentrations of the RNA and of DHX36, the competition from other RNAs for association with DHX36, the competition from other proteins for the binding site on the RNA, and proteins that interact with and modify DHX36, can all strongly affect RNA-binding patterns 318. Thus, it is crucial to identify parameters that dictate the substrate selectivity of DHX36 and understand the mechanistic basis for the activity variations of DHX36 towards different RNA substrates.

Another question that is often posed for molecular motors is whether they take advantage of thermal energy to move: are they passive motors, or do they actively destabilize base pairs? Many of the DEAH/RHA helicases hydrolyze ATP in a manner that is either stimulated by or dependent on a nucleic acid cofactor 319-324; several can unwind RNA duplexes in an ATP-dependent fashion 30, 46, 319-320, 325. ATP hydrolysis is essential for the in vivo functions of Brr2, Prp2, Prp16, Prp22, and Prp43, insofar as mutations that abolish ATPase activity in vitro are invariably lethal in vivo 58, 319, 326-327. Single-molecule biochemical assays have demonstrated ATP-independent remodeling of DNA and RNA GQs (three G-tetrads) by DHX36 38, 328-329, 264. Consequently, DHX36 has been described to follow a local-strand separation mechanism reminiscent of most DEAD-box helicases, wherein DHX36 loads directly on the GQ and unfolds the structure without ATP hydrolysis. In this mechanism, the helicase does not translocate on the NA, and ATP hydrolysis is only required for dissociation of the helicase from the NA 71. However, this description does not fully apply to DHX36, as the helicase was shown to remodel GQ structures by ATP-driven translocation along the nucleic acid 65. DHX36 also belongs to the DEAH/RHA family of helicases, which are known to remodel RNA/DNA structures through an ATP-dependent, translocation-based mechanism 43, 46, 57. Additionally, ATP-dependent remodeling of an RNA duplex and of an RNA GQ (five G-tetrads) by DHX36 has been demonstrated in bulk ensemble assays 317,65.


We speculate that DHX36 remodeling of GQ substrates involves two steps: the partial ATP-independent destabilization of the quadruplex, which leaves fewer G-tetrads for the ATP-dependent, translocation-based remodeling step (section 4.3). However, the mechanism of ATP-dependent RNA duplex remodeling by DHX36 is still unclear and requires more rigorous mechanistic interpretations. Central questions include how DHX36 coordinates ATP binding and hydrolysis with the remodeling of RNA duplex and RNA GQ structures, and if/how this subsequently influences the RNA substrate selectivity of DHX36.

Here, through bulk ensemble biochemical assays, we report that mouse DHX36 can utilize not just ATP, but all NTPs for RNA duplex, as well as quadruplex structure, remodeling. We further show that mDHX36 utilization of NTP is not indiscriminate. Most notably, our results show that physiological ATP levels inhibit mDHX36 remodeling of RNA duplexes but not quadruplexes. Furthermore, a kinetic model for the ATP-induced inhibition of RNA duplex remodeling by mDHX36 is presented. Collectively, our data provide an unexpected perspective on substrate selectivity of DHX36 by showing that it utilizes a unique nucleotide-dependent mechanism to selectively remodel quadruplex structures over duplex structures.

5.2 Results

5.2.1 Nucleotide binding site of mouse DHX36

We previously reported the X-ray crystal structure of mouse DHX36 bound to ADP 317. To provide a firm basis for our biochemical analysis, here we analyze the nucleotide binding of mouse DHX36 (mDHX36) in significant detail. The mDHX36-ADP complex contains all the prototypical domain motifs of SF2 helicases 317,10,43. Of note, residues from Motif I (aka P-loop), II, V, and VI are involved in nucleotide binding and catalysis, and are arranged in a manner typical of DEAH/RHA family helicases 57,330 (Fig. 5.1a). The ADP molecule bound to mDHX36 is sandwiched between the RecA1 and RecA2 domains. Here, the adenine moiety interacts with the F-motif (Phe 529) via π-electron stacking and with the R-motif (Arg 267) via cation-π interactions (Fig. 5.1a). These two residues are not part of the classical SF2 motifs, but recent studies in Prp43p (a DEAH/RHA family helicase) highlight their relevance in nucleotide recognition 331-332. In DHX36, this phenylalanine residue is reoriented in the ADP bound state compared to the ADP-BeF3- and nucleic acid bound states (data not shown, 317,6). This observation is consistent with conformational changes that accompany different stages of the ATP binding and hydrolysis cycle and the RNA remodeling cycle.

Figure 5.1. ATP-binding site of mouse DHX36 and fly Vasa. (a) In mDHX36 (PDB: 6UP4), the adenine moiety is sandwiched between the RecA1 and RecA2 domains and is bound via π-electron stacking between Arg267 and Phe529. (b) In Vasa (PDB: 2DB3), the adenine moiety is buried in the RecA1 domain and Gln272 from the Q motif (yellow) coordinates the N6 and N7 positions of adenine. ADP/ADPNP are shown as sticks with C atoms in green, N atoms in blue, O atoms in red, P atoms in orange, and the Mg2+ ion as a light green sphere. Conserved residues from Motifs I (cyan), II (grey), and VI (dark blue) which are involved in ADP, water or Mg2+ binding are shown as sticks and are labelled. To facilitate viewing, not all the conserved residues of the Q motif or motifs I, II and VI are shown.

In comparison to the DHX36 conformation, the nucleotide in the DEAD-box protein Vasa is buried inside the RecA1 domain, and the ribose/base moiety is rotated by about 150°, but the conformation of the phosphate backbone is conserved 2 (Fig. 5.1b). It should be noted that the interaction of different motifs with the bound nucleotides is usually influenced by NA binding; hence the observed interactions in mDHX36 could change upon RNA binding.

In certain SF2 RNA helicases, such as those belonging to the DEAD-box and Ski2-like families, the presence of a conserved glutamine (Q-motif) elicits adenine specificity 2, 10, 35 (Fig. 5.1b, section 1.3.1). In vivo analyses in yeast show that the Q motif is important for cell viability, and in vitro analyses of purified proteins show that these elements are needed for ATP binding and hydrolysis 333. In contrast, the structures of DHX36 and DEAH/RHA helicases, and also viral NS3 helicase, revealed the absence of the Q-motif, and thus, the base of the nucleoside triphosphate is not specifically recognized by these helicases 332, 334-39 (Fig. 5.1). Due to the absence of specific base recognition, certain DEAH/RHA helicases (Prp43 and Prp22) display in vitro biochemical activity with all four NTPs 332, 334.

5.2.2 NTP-dependent remodeling of RNA duplexes and quadruplexes by mDHX36

We previously demonstrated the ATP-dependent RNA remodeling activity of wild-type mDHX36 (truncated construct mDHX36(46-982)) on three radiolabeled RNA substrates 317. NTP binding and hydrolysis is thought to induce conformational changes in the helicase, which modulate the affinity of the helicase for NAs 39. To better define the biochemical properties of mDHX36, we investigated whether the RNA remodeling activity of mDHX36 is promoted by all four NTPs. For this, we characterized the NTP-dependent remodeling activity of mDHX36 on the three RNA substrates used previously 317: (i) a five G-tetrad intermolecular G-quadruplex (GQ4), (ii) a five G-tetrad intramolecular G-quadruplex (GQ1), and (iii) a 16-bp duplex, each containing a 15 nt unpaired A stretch 3' to the structure (Fig. 5.2). We measured RNA remodeling under pre-steady state conditions with excess of DHX36 over RNA. These reaction conditions permit multiple cycles of enzyme-substrate binding 317,312. The kinetic interpretation of this reaction regime is more straightforward than for steady-state reactions, which involve multiple substrate turnovers 312.

We found that all four NTPs supported mDHX36-dependent remodeling of duplex and quadruplex substrates (Fig. 5.2). However, among the four NTPs, differences in the apparent remodeling rate constant (kobs) values were evident. Under identical conditions, mDHX36 showed a preference for ATP when remodeling the quadruplex substrates (Fig. 5.2a, b, bar graphs). In contrast, mDHX36 showed a preference for CTP when remodeling the duplex substrate (Fig. 5.2c, bar graphs). In sum, our observations showed that mDHX36 could utilize all NTPs for RNA structure remodeling, but not indiscriminately. mDHX36 showed contrasting ATP and CTP preferences when remodeling RNA quadruplex and duplex structures.

Figure 5.2. RNA remodeling activity of mDHX36 in the presence of different nucleoside triphosphates. (a) Left: Representative PAGE for RNA GQ4 remodeling reactions by mDHX36 with indicated NTP-Mg2+ (2 mM each) (20 nM mDHX36, 0.5 nM RNA GQ4). Right: Comparison of kobs values for mDHX36 remodeling of GQ4 with different NTPs (Purple: ATP; Grey: UTP, GTP; Red: CTP), kobs,ATP = 2.04 ± 0.02 min⁻¹; kobs,UTP = 1.17 ± 0.34 min⁻¹; kobs,GTP = 1.25 ± 0.05 min⁻¹; and kobs,CTP = 0.69 ± 0.01 min⁻¹. Values represent averages from 2 independent experiments. Error bars mark 1 SD. (b) Left: Representative PAGE for RNA GQ1 remodeling reactions by mDHX36 with indicated NTP-Mg2+ (2 mM each) (2 nM mDHX36, 0.5 nM RNA GQ1). Right: Comparison of kobs values for mDHX36 remodeling of GQ1 with different NTPs (Purple: ATP; Grey: UTP, GTP; Red: CTP), kobs,ATP = 4.50 ± 2.1 min⁻¹; kobs,UTP = 4.52 ± 0.54 min⁻¹; kobs,GTP = 6.08 ± 0.22 min⁻¹; and kobs,CTP = 2.90 ± 0.88 min⁻¹. Values represent averages from 2 independent experiments. Error bars mark 1 SD. (c) Left: Representative PAGE for RNA duplex remodeling reactions by mDHX36 with indicated NTP-Mg2+ (2 mM each) (400 nM mDHX36, 0.5 nM duplex). Right: Comparison of kobs values for mDHX36 remodeling of duplex with different NTPs (Purple: ATP; Grey: UTP, GTP; Red: CTP), kobs,ATP = 0.57 ± 0.21 min⁻¹; kobs,UTP = 0.99 ± 0.18 min⁻¹; kobs,GTP = 0.29 ± 0.08 min⁻¹; and kobs,CTP = 1.19 ± 0.20 min⁻¹. Values represent averages from 2 independent experiments. Error bars mark 1 SD. Cartoons mark RNA substrate and unwound product. The asterisk shows the radiolabel.

5.2.3 Physiological ATP levels inhibit mDHX36 remodeling of RNA duplexes but not quadruplexes

To address mechanistic differences in the NA-dependent ATP/CTP preference of mDHX36, it is crucial to investigate whether these variations were caused by differing affinities of mDHX36 for these NTPs, by differences in the remodeling rate constants, or by a combination of both. To this end, we measured apparent remodeling rate constants (kobs) with increasing ATP/CTP concentrations (Fig. 5.3).

Among the three RNA structures, differences between ATP- and CTP-dependent remodeling were apparent (Fig. 5.3a-c). For the intermolecular quadruplex substrate (GQ4), the remodeling rate constant at enzyme and NTP saturation ($k_{obs}^{max}$) was lower with CTP, compared to ATP (Fig. 5.3a, lower left panel). The functional affinity (K1/2) of mDHX36 for CTP was six-fold weaker, compared to ATP (Fig. 5.3a, lower right panel). For the intramolecular quadruplex substrate (GQ1), the remodeling rate constant ($k_{obs}^{max}$) and functional affinity (K1/2) were also reduced with CTP, compared to ATP (Fig. 5.3b).

For the duplex substrate, we made an interesting observation. As shown in Fig. 5.3c, the mDHX36-dependent remodeling of the duplex substrate showed a bidirectional response to increasing concentrations of ATP. The kobs values initially rose with increasing [ATP], reaching a maximum at ~0.1 mM ATP. Unexpectedly, the kobs values decreased when [ATP] was increased further, indicating that high concentrations of ATP were inhibitory (Fig. 5.3c, blue). The kinetic profile of ATP-dependent mDHX36 duplex remodeling can, therefore, be divided into two parts, an incline followed by a decline, with a peak at ~0.1 mM ATP. In contrast, duplex remodeling did not show a bidirectional response to increasing concentrations of CTP. The kobs values steadily rose with increasing [CTP] and subsequently reached a plateau at saturating [CTP] (Fig. 5.3c, red).

Figure 5.3. ATP vs. CTP preference in DHX36 remodeling of RNA structures. Dependence of unwinding rate constants on ATP (filled blue circles) and CTP (filled red circles) concentrations for: (a) RNA GQ4, WT-DHX36 (400 nM). Lines represent the best fit to a binding isotherm $k_{obs} = k_{obs,NTP}^{max}\,[NTP] / (K_{1/2,NTP} + [NTP])$. With ATP, $k_{obs,ATP}^{max}$ = 3.57 ± 0.46 min⁻¹, $K_{1/2,ATP}$ = 0.12 ± 0.07 mM. With CTP, $k_{obs,CTP}^{max}$ = 2.33 ± 0.24 min⁻¹, $K_{1/2,CTP}$ = 0.71 ± 0.19 mM. Data points represent an average of at least 3 independent measurements; error bars mark 1 SD. (b) RNA GQ1, WT-DHX36 (2 nM). Lines represent the best fit to a binding isotherm as in (a). With ATP, $k_{obs,ATP}^{max}$ = 4.92 ± 0.35 min⁻¹, $K_{1/2,ATP}$ = 0.07 ± 0.02 mM. With CTP, $k_{obs,CTP}^{max}$ = 2.92 ± 0.74 min⁻¹, $K_{1/2,CTP}$ = 0.67 ± 0.20 mM. Data points represent an average of at least 4 independent measurements; error bars mark 1 SD. (c) RNA duplex, WT-DHX36 (700 nM). For ATP, lines represent the best fit to a binding isotherm that invokes substrate inhibition, $k_{obs} = k_{obs,NTP}^{max}\,[NTP] / \left(K_{1/2,NTP} + [NTP]\,(1 + [NTP]/K_i)\right)$, where $k_{obs,ATP}^{max}$ = 2.99 ± 0.55 min⁻¹, $K_{1/2,ATP}$ = 0.013 ± 0.006 mM and $K_i$ = 0.26 ± 0.12 mM. Data points represent an average of at least 8 independent measurements; error bars mark 1 SD. For CTP, lines represent the best fit to a binding isotherm $k_{obs} = k_{obs,NTP}^{max}\,[NTP] / (K_{1/2,NTP} + [NTP])$, where $k_{obs,CTP}^{max}$ = 2.36 ± 0.15 min⁻¹, $K_{1/2,CTP}$ = 0.055 ± 0.02 mM. Data points represent an average of at least 3 independent measurements; error bars mark 1 SD.
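The two isotherms given in the legend of Fig. 5.3 can be evaluated with the fitted duplex parameters to see how the ATP and CTP curves differ in shape. The sketch below is illustrative only: the function and variable names are introduced here, the parameter values are the best-fit means quoted in the legend (uncertainties are ignored), and the closed-form peak position is a property of the substrate-inhibition equation rather than a value reported in the text.

```python
import math


def hyperbolic(ntp_mm, k_max, k_half_mm):
    """Simple binding isotherm: kobs = kmax*[NTP] / (K1/2 + [NTP])."""
    return k_max * ntp_mm / (k_half_mm + ntp_mm)


def substrate_inhibition(ntp_mm, k_max, k_half_mm, k_i_mm):
    """Isotherm with substrate inhibition: kobs = kmax*[NTP] / (K1/2 + [NTP]*(1 + [NTP]/Ki))."""
    return k_max * ntp_mm / (k_half_mm + ntp_mm * (1.0 + ntp_mm / k_i_mm))


def peak_concentration(k_half_mm, k_i_mm):
    """[NTP] at which the substrate-inhibition curve is maximal: sqrt(K1/2 * Ki)."""
    return math.sqrt(k_half_mm * k_i_mm)


# Best-fit means for the duplex substrate from the Fig. 5.3c legend
# (rate constants in min^-1, concentrations in mM).
ATP_DUPLEX = {"k_max": 2.99, "k_half_mm": 0.013, "k_i_mm": 0.26}
CTP_DUPLEX = {"k_max": 2.36, "k_half_mm": 0.055}

if __name__ == "__main__":
    for conc in (0.01, 0.05, 0.1, 0.5, 1.0, 2.0):
        k_atp = substrate_inhibition(conc, **ATP_DUPLEX)
        k_ctp = hyperbolic(conc, **CTP_DUPLEX)
        print(f"[NTP] = {conc:5.2f} mM  kobs(ATP) = {k_atp:.2f}  kobs(CTP) = {k_ctp:.2f}")
    peak = peak_concentration(ATP_DUPLEX["k_half_mm"], ATP_DUPLEX["k_i_mm"])
    print(f"analytical maximum of the ATP curve at {peak:.3f} mM")
```

With these parameter means, the analytical maximum of the ATP curve falls near 0.06 mM, the same order of magnitude as the peak at ~0.1 mM read from the data. The CTP curve rises monotonically toward its plateau.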

The declining slope at higher [ATP] suggests that ATP inhibits mDHX36 remodeling of the duplex substrate. However, during the assay, ATP is hydrolyzed to ADP, which accumulates over time. ADP is a structural analog of ATP; hence the accumulated ADP might have inhibited duplex remodeling at high [ATP]. In addition, trace amounts of ADP contamination from the ATP stock used in the experiment could also result in ADP accumulation. To assess if ATP-dependent mDHX36 duplex remodeling displays ADP (by-product) inhibition, we titrated increasing amounts of ADP into the reaction. At saturating [ATP], we observed a 50% reduction in remodeling activity at 3 mM [ADP] (Ki,ADP = 3 mM, Fig. 5.4). It is unlikely that the ADP accumulated during the reaction reaches this level.

Figure 5.4. Effect of ADP on DHX36 remodeling of RNA duplex. Left: Reaction scheme for RNA duplex remodeling reactions with ADP. Right: Dependence of unwinding rate constants on ADP concentrations for RNA duplex, (700 nM WT-DHX36, 2 mM ATP, 0.5 nM RNA). Lines represent interpolation of data points, which are an average of at least 3 independent measurements; error bars mark 1 SD.

When [ATP] was varied at a fixed concentration of mDHX36 and duplex substrate, a bell-shaped dependence of the kobs values is observed; hence the data cannot be fit using the already established binding isotherm. Instead, the data were fit to a binding isotherm

116 invoking substrate inhibition (Fig. 5.3c, Blue). The Kintek Explorer modeling program335 allowed us to fit data sets to multiple models for duplex remodeling by mDHX36. We performed simulations and data fitting using kinetic models describing (i) product (ADP) inhibition, (ii) non-productive substrate (ATP) binding, and (iii) substrate (ATP) inhibition of complete and partial types (Fig. 5.5a). The initial kinetic parameters for the simulation were estimated from experimental data. The initial parameters of the respective models are listed in Table 5.1.

The simulated data for each model was fit using the analytical fit (‘afit’) function of the kintek program, from which the kobs values for each fit was determined. To assess the quality of fit for the different models, experimental data were plotted versus simulated

Table 5.1. Initial parameters for data simulation using various models. Reaction steps correspond to those shown in Fig. 5.3a. Initial parameters estimated from experimental data are noted as ‘exp’. Other initial parameters are extrapolated from experimental values. Letters following the ‘exp’ values indicate the experimental source: a, RNA affinity (K1/2) measured during pre-steady state unwinding of the 16-bp duplex (Fig. 4.6b); b, maximal unwinding rate constant ($k_{obs}^{max,ATP}$) and ATP affinity values ($K_{1/2,ATP,Prod}$ and $K_{1/2,ATP^{\dagger}}$) from functional binding isotherms invoking substrate inhibition, measured during pre-steady state unwinding of the 16-bp duplex (Fig. 5.3c).


Figure 5.5. Models for RNA duplex unwinding by mDHX36. (a) Kinetic schemes for RNA/ATP binding and unwinding/ATPase activities of DHX36, depicting by-product competitive inhibition (model 1), a non-productive ATP binding event (model 2), and ATP-mediated substrate inhibition in which binding of an additional ATP leads to complete (model 3a) or partial (model 3b) inhibition. Binding events: E, DHX36; R, RNA; P, remodeled RNA product; E.R.ATPProd, ternary complex formed by productive ATP binding; E.R.ATPNon-Prod, ternary complex formed by non-productive ATP binding; ATP†, additional ATP molecule. Association and dissociation rate constants are indicated for DHX36 binding RNA without ATP (kR), for binding of ATP and ADP (kA and kD), and for binding of the additional ATP (kA†). A negative sign indicates the dissociation step, and numbers following the rate constant indicate the binding step. Forward rate constants for complexes resulting in product formation are denoted kobs. (b) Visualization of the quality of fit for each of the above kinetic models. For fitting parameters see Tables 4.1. Experimental rate constants (kobs) from unwinding assays are plotted versus the simulated rate constants obtained for each model. Black lines represent a linear fit with the intercept set at zero (R2: correlation coefficient). (c) Pre-steady state unwinding rate constants versus ATP concentration for the 16 bp RNA duplex. Experimental data (black filled circles) are overlaid with data generated from each model (model 1: blue trace; model 2: green trace; model 3a: red trace; model 3b: pink trace). Refer to Table A2 (appendix 1) for kobs values derived from each model. Experimental data were fit to a binding isotherm as described in Fig. 5.3c, blue.

data predicted from the models (Fig. 5.5b). In addition, the data calculated for each model were overlaid with the experimental data set (Fig. 5.5c). We found that Model 3 described the experimental data set well. Model 1, which allowed a reaction by-product (ADP) to compete for the active site of the E-R complex, failed to describe the decrease in kobs values at high [ATP] and fit very poorly. Model 2, which incorporated a non-productive ATP binding step in the kinetic framework, also fit the experimental data poorly. In contrast, Model 3 invoked a kinetic scheme that accounts for substrate (ATP) inhibition, possibly explained by the presence of more than one substrate binding site. Here, substrate binding to a catalytic site results in an ER.ATPProd complex that can generate reaction products; however, the binding of an additional substrate to a non-catalytic or allosteric site can result in an ER.ATPProd.ATP† complex that either fails to produce reaction products (complete inhibition, Model 3a) or generates reaction products at a reduced rate (partial inhibition, Model 3b). The model of partial substrate inhibition (Model 3b) showed a good fit, but a better fit was obtained with the model describing complete substrate inhibition (Model 3a).
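The difference between complete and partial substrate inhibition can be illustrated with a minimal sketch. It assumes that both ATP binding steps are in rapid equilibrium, which is a simplification and not the full scheme simulated in KinTek Explorer. The function name `k_obs` and the parameter `beta` (relative activity of the doubly ATP-bound complex) are hypothetical names introduced here, and the numerical values are placeholders, not fitted parameters.

```python
def k_obs(atp, k_max, k_half, k_i, beta=0.0):
    """Observed rate constant for substrate inhibition (rapid-equilibrium sketch).

    atp    : ATP concentration (same units as k_half and k_i)
    k_max  : rate constant of the productive E.R.ATP complex
    k_half : functional affinity for ATP at the productive site
    k_i    : functional affinity for the additional ATP
    beta   : relative activity of the complex carrying the additional ATP
             (0 = complete inhibition, 0 < beta < 1 = partial inhibition)
    """
    doubly_bound = atp * atp / k_i  # weight of the complex with two ATP bound
    return k_max * (atp + beta * doubly_bound) / (k_half + atp + doubly_bound)


if __name__ == "__main__":
    # Placeholder parameters for illustration only (concentrations in mM).
    for atp in (0.01, 0.1, 1.0, 10.0):
        complete = k_obs(atp, 1.0, 0.01, 0.3)
        partial = k_obs(atp, 1.0, 0.01, 0.3, beta=0.2)
        print(f"[ATP] = {atp:5.2f} mM  complete = {complete:.3f}  partial = {partial:.3f}")
```

With `beta = 0` the expression reduces to the standard substrate-inhibition isotherm and the rate constant falls toward zero at high [ATP]; with `beta > 0` it levels off at `beta * k_max` instead.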

The kinetic parameters derived from fitting the simulated and experimental datasets to a binding isotherm invoking substrate inhibition are shown in Table 5.2. On comparing the kinetic parameters obtained from ATP and CTP titrations (Fig. 5.3c, Table 5.2), we observed that the duplex remodeling rate constants at enzyme and NTP saturation ($k_{obs}^{max}$) were roughly similar with ATP and CTP. The functional affinity (K1/2) of mDHX36 for CTP was five-fold weaker than for ATP. In sum, our data show that ATP at high concentrations (>0.1 mM) inhibits mDHX36 remodeling of duplex, but not quadruplex, structures.


Table 5.2. Kinetic parameters derived by fitting kobs values to a binding isotherm invoking substrate inhibition. The kobs values obtained from the simulations (models 3a and 3b) and the experiment were plotted as a function of [ATP] and subsequently fit to a binding isotherm that invokes substrate inhibition: $k_{obs} = k_{obs}^{max,ATP}\,[ATP] / (K_{1/2,ATP,Prod} + [ATP]\,(1 + [ATP]/K_{1/2,ATP^{\dagger}}))$.
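As a worked illustration of why this isotherm is bell-shaped, the position and height of its maximum follow directly from the equation. The short derivation below is a sketch that abbreviates $K_{1/2,ATP,Prod}$ as $K$ and $K_{1/2,ATP^{\dagger}}$ as $K_i$ (shorthand introduced here only). The numerical example uses the duplex unwinding values quoted in Section 6.3.1 (K1/2 = 0.013 mM and Ki = 0.26 mM) and ignores their uncertainties.

```latex
% Substrate-inhibition isotherm, with S = [ATP]
k_{obs}(S) = \frac{k_{obs}^{max,ATP}\, S}{K + S + S^{2}/K_i}

% Setting the derivative to zero locates the maximum
\frac{d k_{obs}}{dS} = 0
\;\Longleftrightarrow\;
\left(K + S + \frac{S^{2}}{K_i}\right) - S\left(1 + \frac{2S}{K_i}\right) = 0
\;\Longleftrightarrow\;
S_{opt} = \sqrt{K\,K_i}

% Rate constant at the maximum
k_{obs}(S_{opt}) = \frac{k_{obs}^{max,ATP}}{1 + 2\sqrt{K/K_i}}

% Example: K = 0.013 mM, K_i = 0.26 mM
S_{opt} = \sqrt{0.013 \times 0.26}\ \mathrm{mM} \approx 0.058\ \mathrm{mM},
\qquad
k_{obs}(S_{opt}) \approx 0.69\, k_{obs}^{max,ATP}
```

Under these assumptions the maximum lies below 0.1 mM ATP, which is in line with the observation that ATP concentrations above 0.1 mM inhibit duplex remodeling.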

5.2.4 The molecular mechanism of ATP-mediated mDHX36 RNA duplex remodeling inhibition

Earlier experiments (Section 4.2.2, 317) that involved RNA duplex remodeling at ATP saturation (2 mM) suggested that more than one molecule of mDHX36 cooperates to remodel this RNA structure (Fig 4.6, 4.9a). In contrast, we observed that remodeling of a quadruplex does not require the cooperative function of multiple DHX36 protomers, regardless of the number of unpaired RNA extensions (Fig 4.6c, 4.9b). The substrate inhibition model presented above predicts the presence of more than one substrate (ATP) binding site during RNA duplex remodeling by mDHX36. One possibility is that the second molecule of mDHX36 may provide the additional ATP binding site for mDHX36 on the RNA duplex. Here, the second molecule of mDHX36 likely has a lower affinity for

ATP and/or it may accommodate ATP in a configuration, which is incompatible with .

This scenario can be experimentally verified by monitoring ATP utilization during the remodeling of an RNA duplex that can only accommodate a single molecule of mDHX36. We reasoned that 16-bp duplexes with shorter 3’ ss extensions (8 nt and 6 nt unpaired adenylates) would be unable to bind more than one molecule of mDHX36.


However, preliminary remodeling experiments with a 16-bp duplex containing an 8 nt extension (2 mM ATP) showed sigmoidal functional binding isotherms (Fig. 5.6a), suggesting that more than one molecule of mDHX36 still cooperates to remodel this duplex. An RNA duplex containing a 6 nt extension did not show a sigmoidal binding isotherm (Fig. 5.6b). However, given the experimental errors associated with both datasets, we consider these results inconclusive. Hence, choosing an ideal RNA duplex substrate for these purposes will require further investigation. Alternatively, we can test whether a single molecule of mDHX36 is involved in remodeling the 16-bp RNA duplexes at very low ATP or very high CTP concentrations, and whether this leads to improved activity. This will tell us whether nucleotide concentrations influence the cooperation between multiple DHX36 molecules when remodeling RNA duplexes.

Figure 5.6. mDHX36 remodeling activity on 16-bp RNA duplexes with shorter 3’ ss extensions. (a) Dependence of the remodeling rate constant (2 mM ATP-Mg2+) on mDHX36 concentration for the duplex containing an 8 nt ss-adenylate extension. Data points (blue circles) are averages from multiple independent reactions (N = 5). Error bars mark one standard deviation. The line represents the best fit to the Hill equation $k_{obs} = k_{obs}^{max}\,[DHX36]^{H}\,((K_{1/2,DHX36})^{H} + [DHX36]^{H})^{-1}$ ($k_{obs}$: observed remodeling rate constant; $k_{obs}^{max}$: remodeling rate constant at DHX36 saturation; $K_{1/2,DHX36}$: apparent functional binding constant; $H$: Hill coefficient). Here, $k_{obs}^{max}$ = 0.8 ± 0.05 min$^{-1}$, $K_{1/2,DHX36}$ = 295 ± 26.8 nM, H = 3.4 ± 0.7. (b) Dependence of the remodeling rate constant (2 mM ATP-Mg2+) on mDHX36 concentration for the duplex containing a 6 nt ss-adenylate extension. Most data points (black circles) are averages from two independent reactions. Error bars mark one standard deviation.
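The Hill equation from the legend above can be evaluated directly. The snippet below is a minimal sketch that uses the best-fit values reported for the 8 nt extension and ignores their uncertainties. The function name `hill_k_obs` is introduced here for illustration, and the comparison with H = 1 is a hypothetical reference curve, not a fit to any data.

```python
def hill_k_obs(enzyme_nM, k_max, k_half_nM, hill):
    """Hill equation: k_obs = k_max * [E]^H / (K_half^H + [E]^H)."""
    return k_max * enzyme_nM**hill / (k_half_nM**hill + enzyme_nM**hill)


# Best-fit values for the 8 nt extension (Fig. 5.6a); uncertainties omitted.
K_MAX = 0.8      # min^-1
K_HALF = 295.0   # nM
HILL = 3.4

if __name__ == "__main__":
    for enzyme in (100.0, 200.0, 295.0, 400.0, 700.0):
        cooperative = hill_k_obs(enzyme, K_MAX, K_HALF, HILL)
        # Hypothetical non-cooperative reference with the same k_max and K_half.
        hyperbolic = hill_k_obs(enzyme, K_MAX, K_HALF, 1.0)
        print(f"{enzyme:6.0f} nM  H=3.4: {cooperative:.3f}  H=1: {hyperbolic:.3f} min^-1")
```

Both curves pass through half of the maximal rate constant at 295 nM, but the curve with H = 3.4 stays much lower below that concentration and rises more steeply through it, which is the sigmoidal shape interpreted in the text as cooperation between mDHX36 molecules.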


An alternative explanation for the presence of more than one ATP binding site is that the structure of mDHX36 can accommodate ATP in an allosteric site distant from the active site, which in turn inhibits the active site allosterically. In either case, a thorough kinetic characterization of mDHX36 is required to fully describe the complexity of the inhibition mechanism. Structure determination of ternary complexes with non-hydrolyzable ATP analogs and/or the use of allosteric site mutants will be helpful in this elucidation.

5.3 Conclusion

Our study examines the interplay between nucleotide selectivity and RNA substrate selectivity of DHX36 during the RNA remodeling process. We show that mDHX36 lacks a conserved Q-motif, which is involved in selective recognition of ATP in certain other SF2 helicase families. In line with this observation, we show that in vitro, mDHX36 is able to use not just ATP but all other NTPs for RNA structure remodeling (Fig. 5.2). This is consistent with similar observations made for other DEAH/RHA helicases 332, 334. However, we find that mDHX36 is not indiscriminate in NTP usage. During our efforts to characterize the NTP-dependent RNA structure remodeling mechanism of mDHX36, we were struck by the weak RNA duplex remodeling activity with ATP compared to CTP as a phosphoryl donor substrate in in vitro assays. Subsequent efforts to establish the ATP vs. CTP selectivity of mDHX36 provided several insights. 1) When considering nucleotide affinity values, the nucleotide selectivity of mDHX36 generally favors ATP during duplex as well as quadruplex structure remodeling (Fig. 5.3, bar graphs). 2) The RNA GQ remodeling activity of mDHX36 is more effective with ATP than with CTP, once again suggesting ATP selectivity (Fig. 5.3). 3) In contrast, the RNA duplex remodeling activity of mDHX36 shows more complex ATP/CTP preferences. a) The activity is more effective with CTP than with ATP at saturating experimental concentrations of these NTPs (Fig. 5.3). b) However, this observation reflects the substrate (ATP) inhibition kinetics displayed by mDHX36 at saturating [ATP]. c) The duplex remodeling activity is equally effective with ATP and CTP when ATP inhibition is accounted for in the calculation of remodeling rate constants.

Even if in vitro kinetic data demonstrate CTP utilization by mDHX36, it is crucial to consider NTP usage relative to NTP concentrations in the cell. Typically, the intracellular concentration of ATP is ten-fold higher than that of other NTPs 336-337. Therefore, the ability to use CTP as a phosphoryl donor in in vitro kinetic experiments may not translate to a preference in the cell, where ATP dominates. Although the in vivo NTP usage of helicases is unknown, this prompts us to speculate why certain helicases evolved to use all NTPs. One possibility is that the ability of the helicase to use more than one NTP would be advantageous in competing with host enzymes for NTPs.
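The concentration argument above can be made concrete with a rough estimate. The sketch below assumes that ATP and CTP compete for the same site, so that their relative usage follows the ratio of specificity constants multiplied by the ratio of concentrations, and it uses $k_{obs}^{max}$/K1/2 as a stand-in for the specificity constant. It deliberately ignores ATP-mediated substrate inhibition and uses only the relative values stated in this chapter (similar maximal rate constants, five-fold weaker CTP affinity, ten-fold higher ATP concentration). The function name `relative_usage` is hypothetical.

```python
def relative_usage(k_max_a, k_half_a, conc_a, k_max_b, k_half_b, conc_b):
    """Ratio of flux through substrate a over substrate b for two substrates
    competing for one site: (k_max/K_half)_a [a] / ((k_max/K_half)_b [b])."""
    return (k_max_a / k_half_a * conc_a) / (k_max_b / k_half_b * conc_b)


# Relative (unitless) values: ATP is the reference and is set to 1 where possible.
atp_over_ctp = relative_usage(
    k_max_a=1.0, k_half_a=1.0, conc_a=10.0,  # ATP: ten-fold higher concentration
    k_max_b=1.0, k_half_b=5.0, conc_b=1.0,   # CTP: five-fold weaker affinity
)
ctp_fraction = 1.0 / (1.0 + atp_over_ctp)

if __name__ == "__main__":
    print(f"ATP used over CTP by a factor of about {atp_over_ctp:.0f}")
    print(f"Estimated fraction of events using CTP: {ctp_fraction:.3f}")
```

Under these simplifying assumptions, only a small fraction of remodeling events would use CTP in the cell, which is consistent with the caution expressed above about extrapolating in vitro CTP usage to physiological conditions.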

Nevertheless, the absence of substrate inhibition during CTP-dependent mDHX36 RNA duplex remodeling provides us with a useful handle to understand the mechanism of substrate inhibition observed with ATP. The observation suggests that the second site is specific for ATP and not other NTPs. The molecular mechanism of ATP-mediated inhibition of mDHX36 RNA duplex remodeling activity warrants further investigation. Our kinetic models suggesting substrate inhibition by ATP may be oversimplified and will likely require the incorporation of more intermediate steps (Fig. 5.5). Alternative explanations for the observed ATP-dependent inhibition of mDHX36 duplex remodeling activity may also exist.


As discussed in section 5.2.4, it will be important to investigate the differences in the cooperativity of mDHX36 during duplex and quadruplex remodeling. In addition to data from our ensemble experiments, recent single-molecule studies suggest that quadruplex remodeling requires only a single protomer of DHX36 317, 328-329. Also, the ten-fold tighter ATP affinity of mDHX36 when remodeling the duplex compared to the quadruplex likely plays a role in the inhibition of duplex remodeling at high [ATP]. Further possibilities are discussed in section 6.5 of the future directions chapter.

Regardless of the molecular mechanism underlying the ATP-mediated inhibition of RNA duplex remodeling, our data suggest that mDHX36 uses a nucleotide-dependent mechanism to remodel RNA structures selectively. Specifically, ATP at physiological levels (>0.5 mM) negatively regulates RNA duplex remodeling but positively regulates RNA GQ remodeling. Hence, we note that DHX36 may achieve GQ selectivity in more than one way: a) by selectively recognizing quadruplexes over duplexes, and b) by using ATP as a preferred nucleotide to selectively remodel quadruplexes and/or inhibit remodeling of duplexes. This may be advantageous for DHX36 under physiological conditions, where other RNA conformations may outnumber GQ conformations. Under this scenario, selective recognition of RNA substrates alone may not be enough.


Chapter 6

Future Directions

Structural and biochemical analysis of DHX36

6.1 Introduction

In chapter 4, I reported the crystal structure of mouse DHX36 bound to ADP and systematically characterized the ATP-dependent remodeling activity of mDHX36 on RNA duplex structures, as well as on both intermolecular and intramolecular RNA G-quadruplex structures. I demonstrated the functional integration of auxiliary domains and helicase core domains of DHX36 for multiple steps of the nucleic acid remodeling reaction. In chapter 5, I examined the interplay between nucleotide selectivity and RNA substrate selectivity of DHX36 during the RNA remodeling process. This chapter outlines possible future investigations on substrate recognition and NTP utilization by DHX36.

6.2 DHX36 recognition of RNA length, sequence, and structure

DHX36 targets a wide range of RNAs for regulation in vivo (Section 2.5). It is not clear what defines a DHX36 substrate; most likely, the information lies mainly in the RNA itself. My analysis of the RNA remodeling activity of DHX36 identified the effects of RNA structure (Section 4.2). The experiments proposed below may illustrate how features in an RNA determine recognition by, and the activities of, DHX36.

6.2.1 Minimum length of ssRNA bound by DHX36

Previous studies imply that an ssDNA extension of >5 nt, 3’ to the GQ structure, is minimally required for productive binding and remodeling by DHX36 65, 314, 338. Although it is reasonable to assume that DHX36 has a similar requirement for 3’ ss extensions on RNA structures, this can be experimentally verified by varying the length of 3’ extensions on RNA duplex and GQ substrates. Preliminary experiments with a 16 bp RNA duplex containing 3’ ss adenylates of varying lengths suggest that a minimum length of 8 nt may be preferred (Fig. 6.1a). We also observed robust remodeling for substrates with 25- and 15-nt extensions, with no significant difference between them (Fig. 6.1b). Additionally, my results suggest that more than one protomer of DHX36 may bind the RNA substrate.

To identify the minimum length of ssRNA bound by DHX36, ATPase and equilibrium binding experiments can be performed with different lengths of ssRNA. Alternatively, the binding site of DHX36 can be examined by RNA footprinting, which has been used with numerous RNA binding proteins.

Figure 6.1 Effects of 3’ ss extension length on DHX36 RNA duplex remodeling activity. a) Dependence of the remodeling rate constant (2 mM ATP-Mg2+) on mDHX36 concentration for 16 bp RNA duplexes containing ss adenylate extensions of 15 nt (blue circles), 8 nt (green diamonds), or 6 nt (red circles). Data points are averages from multiple independent reactions (N ≥ 2). Error bars mark one standard deviation. The lines represent the best fit to the Hill equation $k_{obs} = k_{obs}^{max}\,[DHX36]^{H}\,((K_{1/2,DHX36})^{H} + [DHX36]^{H})^{-1}$ ($k_{obs}$: observed remodeling rate constant; $k_{obs}^{max}$: remodeling rate constant at DHX36 saturation; $K_{1/2,DHX36}$: apparent functional binding constant; $H$: Hill coefficient). For the substrate with the 15 nt extension, $k_{obs}^{max}$ = 1.1 ± 0.1 min$^{-1}$, $K_{1/2,DHX36}$ = 363 ± 54 nM, H = 1.8 ± 0.3. For the substrate with the 8 nt extension, $k_{obs}^{max}$ = 0.8 ± 0.05 min$^{-1}$, $K_{1/2,DHX36}$ = 295 ± 26.8 nM, H = 3.4 ± 0.7. For the substrate with the 6 nt extension, $k_{obs}^{max}$ = 0.17 ± 0.04 min$^{-1}$, $K_{1/2,DHX36}$ = 546.97 ± 289 nM. b) Dependence of the remodeling rate constant (2 mM ATP-Mg2+) on mDHX36 concentration for 16 bp RNA duplexes containing ss extensions of adenylates mixed with cytidines of 25 nt (brown circles) or 15 nt (purple diamonds). Data points are averages from multiple independent reactions (N ≥ 2). Error bars mark one standard deviation. The lines represent the best fit to the Hill equation, where for the substrate with the 25 nt extension, $k_{obs}^{max}$ = 4.2 ± 0.6 min$^{-1}$, $K_{1/2,DHX36}$ = 589 ± 115 nM, H = 1.9 ± 0.4, and for the substrate with the 15 nt extension, $k_{obs}^{max}$ = 2.8 ± 0.1 min$^{-1}$, $K_{1/2,DHX36}$ = 395 ± 31.7 nM, H = 1.9 ± 0.3.


6.2.2 Sequence preference of DHX36

Of note, DHX36 showed some sequence preference when unwinding the 16 bp duplex with 15-nt extensions (Fig. 6.2). We find that an extension sequence containing CAAA-rich regions is preferred over a sequence with only adenylates. This observation is interesting, given that a previous study showed that DHX36 prefers A-rich extensions over T-rich sequences for GQ disruption 65. However, for effective comparison, remodeling of GQs with 15-nt extensions containing CAAA-rich regions needs to be investigated.

Figure 6.2 Effects of 3’- ss extension sequence on DHX36 RNA duplex remodeling activity. Dependence of the remodeling rate constant (2 mM ATP-Mg2+) on mDHX36 concentrations for 16 bp RNA duplexes containing 15-nt ss extension of adenylates (blue circles), and adenylates mixed with cytidines (purple diamonds). The data for these substrates were the same as in Fig. 6.1 a, b.

Importantly, given that DHX36 is linked to AREs in mRNA 3’ UTRs, the majority of which are stretches of uridylates interrupted by adenylates, RNA substrates containing this sequence and/or structure can be used to compare DHX36 remodeling activity.

Additionally, recent structural and functional studies with fly DHX36 indicated preferential recognition and binding to G-rich DNA sequences (Fig. 2.6). Hence, the sequence preferences of DHX36 in RNA remodeling activity should be explored further.

Given that most of these experiments involved the use of synthetic constructs, it would be of value to use a natural sequence as well (RNA binding sites of DHX36 identified through recent transcriptome-wide studies). This will help put the biochemical data in the context of the physiological function of DHX36.

6.3 Coupling of NA binding to ATP binding/hydrolysis

SF2 RNA helicases differ in how they couple ATP binding and hydrolysis to RNA binding and enzyme release. For example, the DExH viral helicase NS3 binds ss-NA with high affinity in the absence of ATP or the presence of ADP, but with lower affinity in the presence of an ATP analog 80. Hence, ATP binding brings NS3 out of a tightly bound state to facilitate translocation, whereas ATP hydrolysis and product release promote tight re-binding of the helicase to the NA 80. In contrast, DEAD-box proteins bind RNA more tightly in the presence of ATP and more weakly in its absence or in the presence of ADP 339-340. Few studies have addressed the coupling of ATP binding and hydrolysis with duplex unwinding.

For a thorough understanding of the mechanism of NA unwinding and translocation by DHX36, we need information about the rate, processivity, step size, the average number of ATP/NTP molecules hydrolyzed per bp unwound (ATP-coupling stoichiometry), and the functional oligomeric state of the helicase 15. A combination of structural, ensemble pre-steady-state and steady-state kinetic, and single-molecule studies is required to address all of these issues.

6.3.1 Modulation of DHX36 ATPase activity by RNA sequence, structure, and length

The stimulation of DHX36 ATPase activity by some NA substrates, including a DNA GQ 262, a DNA duplex, an RNA duplex, a tRNA, and homopolymeric RNAs 45, 126, has been previously demonstrated through semi-quantitative biochemical assays. Of note, the relatively more robust stimulation of DHX36 ATPase activity by poly(U) RNA is interesting given that the majority of AREs in mRNA 3’ UTRs are in fact stretches of uridylates interrupted by adenylates. Consistent with this notion, the ATPase activity of DHX36 was shown to be essential for the function of DHX36 in promoting mRNA deadenylation and decay 126. Although these observations suggest NA sequence or structure preferences for the stimulation of DHX36’s ATPase activity, further analysis is required to characterize the detailed ATPase mechanism of this helicase.

I previously demonstrated RNA-stimulated ATPase activity of WT mDHX36 under steady-state conditions (RNA in excess over enzyme), using an ssRNA and a series of structured model RNA substrates (Section 4.2.4). To examine the role of RNA structure in the ATPase activity of DHX36, I analyzed RNA binding by DHX36 by measuring ATPase activity with these model RNA substrates. Preliminary data show that the reaction with intermolecular GQ4 was faster than with ss or duplex RNA, with a kcat more than 2-fold higher (Fig. 6.3a). Interestingly, I observed a slight stimulation of the ATPase activity of DHX36 by a low concentration of intramolecular GQ1 RNA but inhibition at a higher concentration of this RNA (Fig. 6.3b).
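The comparison of turnover numbers can be checked with a short calculation. The sketch below converts the maximal velocities reported in the legend of Fig. 6.3a into apparent kcat values by dividing by the enzyme concentration (430 nM). It assumes that all of the enzyme is active and ignores the reported uncertainties; the function name `apparent_k_cat` is introduced here for illustration.

```python
ENZYME_UM = 0.43  # 430 nM DHX36, as stated in the Fig. 6.3a legend

# Maximal velocities (uM ATP hydrolyzed per min) from the Fig. 6.3a legend.
V_MAX_UM_PER_MIN = {"ss": 20.0, "duplex": 13.3, "GQ4": 44.2}


def apparent_k_cat(v_max_uM_per_min, enzyme_uM):
    """Apparent turnover number (min^-1), assuming fully active enzyme."""
    return v_max_uM_per_min / enzyme_uM


k_cat = {rna: apparent_k_cat(v, ENZYME_UM) for rna, v in V_MAX_UM_PER_MIN.items()}

if __name__ == "__main__":
    for rna, value in k_cat.items():
        print(f"{rna:>6}: apparent kcat = {value:.1f} min^-1")
    print(f"GQ4 / ss     = {k_cat['GQ4'] / k_cat['ss']:.2f}")
    print(f"GQ4 / duplex = {k_cat['GQ4'] / k_cat['duplex']:.2f}")
```

Because the enzyme concentration is the same in all three titrations, the kcat ratios equal the Vmax ratios, and both ratios exceed two.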

I then measured the apparent ATP affinity of DHX36 in ATPase reactions, using the same RNA substrates. These reactions, however, lacked an important control: the ATPase activity of DHX36 in the absence of RNA (basal hydrolysis rates). The reaction with the intramolecular RNA GQ1 substrate remains to be investigated. Under these conditions, the ATP hydrolysis rates were broadly similar for the three RNA substrates; however, the rates were inhibited in the presence of excess ATP (Fig. 6.3c). In contrast, inhibition by RNA was not observed in RNA titrations (Fig. 6.3a). The ATP affinity of DHX36 was weakest in the presence of intermolecular GQ4 and strongest in the presence of ss RNA.


In the presence of RNA duplex, the ATP affinity of DHX36 was ten-fold weaker than the ATP affinity measured in unwinding reactions under pre-steady state conditions ($K_{1/2}^{ATP}$ = 0.14 ± 0.05 mM and $K_{i}^{ATP}$ = 1.4 ± 0.5 mM, Fig. 6.3c, vs. $K_{1/2}^{ATP}$ = 0.013 ± 0.005 mM and $K_{i}^{ATP}$ = 0.26 ± 0.11 mM for unwinding, Fig. 5.3c). A similar pattern was observed in the presence of intermolecular GQ4, where the ATP affinity of DHX36 was ten-fold weaker than the ATP affinity measured in RNA remodeling reactions under pre-steady state conditions ($K_{1/2}^{ATP}$ = 1.0 ± 0.8 mM and $K_{i}^{ATP}$ = 1.5 ± 1.2 mM, Fig. 6.3c, vs. $K_{1/2}^{ATP}$ = 0.12 ± 0.07 mM and $K_{i}^{ATP}$ = 4 ± 0.11 mM for unwinding, Fig. 5.3a).

These differences could reflect the differences in the nature of the assays; hence, the RNA-stimulated ATPase parameters for DHX36 have to be measured at a range of

Figure 6.3 Steady-state ATPase activity of DHX36 with RNA. a) ATPase activity of DHX36 measured at varying concentrations of the indicated RNA (ss, blue circle; duplex, red circle; GQ4, green diamond). Initial velocities of ATP hydrolysis were determined at 430 nM DHX36, 3 mM ATP-Mg2+, and the RNA concentrations indicated. Curves represent best fits to the binding isotherm $V_{0} = V^{max,RNA}\,[RNA] / ([RNA] + K_{1/2,RNA})$. For ss RNA, $V^{max,RNA}$ = 20.0 ± 4.6 μM min$^{-1}$, $K_{1/2,RNA}$ = 2.9 ± 1.7 μM; for duplex, $V^{max,RNA}$ = 13.3 ± 2.2 μM min$^{-1}$, $K_{1/2,RNA}$ = 0.9 ± 0.6 μM; and for intermolecular GQ4, $V^{max,RNA}$ = 44.2 ± 10.8 μM min$^{-1}$, $K_{1/2,RNA}$ = 5.4 ± 2.8 μM. b) ATPase activity of DHX36 measured at varying concentrations of intramolecular RNA GQ1. Initial velocities of ATP hydrolysis were determined at 430 nM DHX36, 3 mM ATP-Mg2+, and the RNA concentrations indicated. Error bars represent the standard deviation. c) ATPase activity of DHX36 measured at varying ATP concentrations in the presence of the indicated RNA (ss, blue circle; duplex, red circle; GQ4, green diamond). Initial velocities of ATP hydrolysis were determined at 5 μM RNA, 700 nM DHX36, and the ATP-Mg2+ concentrations indicated. Curves represent best fits to the inhibition equation $V_{0} = V^{max,ATP}\,[ATP] / ([ATP]\,(1 + [ATP]/K_{i,ATP}) + K_{1/2,ATP})$. For ss, $V^{max,ATP}$ = 10.8 ± 0.5 μM min$^{-1}$, $K_{1/2,ATP}$ = 0.02 ± 0.004 mM, and $K_{i,ATP}$ = 4.4 ± 0.9 mM; for duplex, $V^{max,ATP}$ = 23.1 ± 3.6 μM min$^{-1}$, $K_{1/2,ATP}$ = 0.14 ± 0.05 mM, and $K_{i,ATP}$ = 1.4 ± 0.5 mM; and for intermolecular GQ4, $V^{max,ATP}$ = 33.1 ± 18.5 μM min$^{-1}$, $K_{1/2,ATP}$ = 1.0 ± 0.8 mM, and $K_{i,ATP}$ = 1.5 ± 1.2 mM.

concentrations under pre-steady state conditions (enzyme excess over RNA) similar to the unwinding conditions for the same RNA substrates. This will tell us if the apparent ATP affinity of DHX36 in ATPase reactions is close to the apparent ATP affinity measured in unwinding reactions. While we believe that the pre-steady state results are more relevant in vivo, the steady-state results suggest interesting aspects of DHX36 for further study.

Thus, ATP-dependence during RNA remodeling by DHX36 needs to be investigated under steady-state conditions (RNA excess over enzyme). Further experimentation is required to adequately understand the influence of RNA structure on the mechanism of ATP hydrolysis by DHX36. Similarly, the influence of RNA sequence can be further investigated with different ss-, duplex, and GQ substrates.

Previous steady-state measurements of initial unwinding rates showed that DHX36 GQ-unwinding activity scaled with the stability of the DNA GQ substrate, but the ATPase activity was largely independent of substrate stability 262. This observation indicates only that ATP hydrolysis is not the rate-limiting step in the NA remodeling process. The crystal structures of Ski2-like and DEAH helicases provide a more rational explanation for the observed uncoupling of ATP hydrolysis and strand separation activities. The active conformation of the two RecA domains appears to be stabilized even in the absence of ligands, possibly by intramolecular interactions with the additional regions, including the N-terminal, β-hairpin, and extended C-terminal domains 64, 341, 6, 29. Although domain movements are likely to occur during ATP binding and hydrolysis and NA binding, less dramatic movements of the RecA domains have been suggested for DEAH helicases than for DEAD-box helicases 341-342. This might rationalize why DEAH helicases show significant levels of ATP hydrolysis without RNA, although RNA still stimulates ATP hydrolysis in most cases 316, 334, 343. In contrast, most DEAD-box helicases are unable to bind or hydrolyze ATP without RNA, as ATP and nucleic acid binding are highly cooperative in these helicases 340.

6.3.2 NTPase activities of DHX36

Due to the lack of specific contacts to the nucleotide base, DHX36 can use all NTPs for RNA structure remodeling (Section 5.2.1). There is no evidence that the enzyme uses other NTPs in vivo. Nevertheless, to evaluate the NTP specificity of DHX36, we can measure the steady-state kinetic parameters of the NTPase activity of DHX36 in the presence of all NTPs. In other DEAH/RHA helicases, including Prp43p, Prp22p, and DHX9, the NTPase activity with a pyrimidine base is higher than the activity with a purine base 332, 344. It will be interesting to see if DHX36 behaves similarly. Furthermore, the influence of RNA structure and/or sequence on DHX36 NTPase activity will also be an important line of investigation.

6.3.3 Influence of NTP base stacking on the biochemical activities of DHX36

In DHX36 and other DEAH/RHA helicases, the adenine moiety in the nucleotide-binding pocket is sandwiched between two conserved residues, an arginine (R-motif) and a phenylalanine (F-motif), from the RecA1 and RecA2 domains, respectively (Section 5.2.1). In Prp43p, the stacking of the nucleobase between these opposing residues influences Prp43p ATPase and RNA remodeling activities, its activation by G-patch proteins in vitro, and its activity in vivo 332. The relative strength of these interactions and their effect on the biochemical activity of DHX36 are unclear. To evaluate this, we can make DHX36 mutants impaired in stacking with the ATP base either on the RecA1 domain (R267A) or on the RecA2 domain (F529A). We can then investigate the RNA-stimulated ATPase and ATP-dependent RNA remodeling activities of these mutants and compare them to the activities of WT DHX36 (under pre-steady state and steady-state conditions). In addition, we can investigate the effect of these base stacking interactions on the RNA binding affinity of DHX36 under equilibrium conditions.

Compared to the crystal structure of Prp43p in complex with ADP, the structure of Prp43p in complex with CDP showed reduced stacking of the cytosine base on the phenylalanine (F-motif) from the RecA2 domain. The Prp43p model suggests that the loss of stacking interactions between the F-motif and the NTP base results in higher enzyme turnover (NTP hydrolysis), as the product (NDP) is retained less well in the active site 332. To consider whether DHX36 contains similar structural features, we modeled the CDP nucleotide into the mDHX36 structure, based on the structure of the Prp43p-CDP complex 332. The overall structure of mDHX36 in complex with ADP or CDP is the same, but, as in Prp43p, the cytosine in mDHX36 loses the stacking interaction with F529 (F-motif) (Fig. 6.4). We also observed higher RNA duplex remodeling activity of mDHX36 (but not RNA GQ remodeling) in the presence of CTP than with ATP (section 5.2, Fig. 5.3). Given these observations, with the experiments proposed above and in section 6.2.2, it will be interesting

Figure 6.4 Analysis of nucleotide base stacking interactions in mDHX36 structure. Close-up view of the stacking of R- and F- motifs on the ADP nucleotide (Left), the CDP nucleotide modeled into mDHX36 structure based on the Prp43p-CDP conformation (Right). F529 residue exhibits reduced stacking to the ring of CDP. The RecA1 and RecA2 domains from mouse DHX36 bound to ADP (PDB: 6UP4) colored in cyan and purple, respectively. ADP moiety shown as green sticks and CDP moiety shown as pink sticks.

to evaluate whether the influence of base stacking interactions on DHX36 activity is similar to that observed in Prp43p.

6.3.4 Role of ATP hydrolysis in DHX36-mediated RNA remodeling

ATP hydrolysis is coupled to translocation in DEAH helicases, whereas in DEAD-box proteins ATP hydrolysis is not required for unwinding and is only necessary for enzyme recycling (Section 1.3). With the prevalence of short RNA duplexes and two G-tetrad-containing GQs in the cell, there is a possibility that RNA structure remodeling by DHX36 may often not depend on ATP hydrolysis. Single-molecule biochemical assays have demonstrated ATP-independent DHX36 remodeling of three G-tetrad-containing DNA and RNA GQs 328-329, 38, 264 (Sections 2.6, 5.1). It will be interesting to see whether DHX36 displays ATP-independent unwinding of relatively unstable duplexes, such as a 10-bp RNA or a 13-bp DNA/RNA hybrid, in ensemble biochemical assays. Likewise, we can investigate whether DHX36 has ATP-independent remodeling activity on two G-tetrad-containing RNA GQs.

Additionally, the role of ATP hydrolysis in RNA remodeling by WT and mutant DHX36 can be studied with ATP analogs that represent different states of ATP hydrolysis. Preliminary analyses showed that in the presence of the non-hydrolyzable ATP analog ADPNP, DHX36 failed to remodel the previously studied RNA duplex and GQ substrates (Fig. 6.5a). Similar results were observed when DHX36 remodeling was measured in the presence of the non-hydrolyzable ATP ground-state analog ADP-BeFx, which mimics the ATP pre-hydrolysis state (Fig. 6.5b). Other commonly used analogs, such as the transition state analog ADP-AlFx, which represents the ATP post hydrolysis state, remain to be tested. It will also be useful to investigate the influence of different ATP states on DHX36 binding to structured and unstructured RNAs.

Figure 6.5 DHX36-mediated RNA remodeling with ATP analogs. a) Representative PAGE for the RNA duplex remodeling reaction (top panel), the RNA GQ4 remodeling reaction (middle panel), and the RNA GQI remodeling reaction (bottom panel) in the presence of ADPNP (360 nM WT mDHX36, 0.5 mM ADPNP-Mg2+, 0.5 nM RNA). The reaction with GQI contained 100 nM DNA TRAP. b) Representative PAGE for the RNA duplex remodeling reaction (top panel), the RNA GQ4 remodeling reaction (middle panel), and the RNA GQI remodeling reaction (bottom panel) in the presence of ADP-BeFx (360 nM WT mDHX36, 0.5 mM ADP-Mg2+, 0.5 nM RNA). The reaction with GQI contained 100 nM DNA TRAP. Cartoons mark the RNA substrate and unwound product. The asterisk shows the radiolabel.

6.3.5 Function of auxiliary domains of DHX36 in the coupling of NA binding to ATP binding/hydrolysis.

Recent crystal structures of DHX36, including apo, nucleotide-analog, and RNA-bound forms, and several related Ski2-like and DEAH/RHA-box helicase structures provide insight into the general features employed by these helicases to bind and translocate along nucleic acid substrates 6, 38, 345, 46, 41, 57, 265. While the structure-function analysis has greatly improved our understanding of the DHX36-NA remodeling mechanism, many questions remain to be explored.

To exclude the possibility that the basal non-RNA-stimulated ATPase activity of the ΔOB-fold mutant may have been from a co-purifying protein (contamination), we can make an ATPase motif point mutation in the OB-fold variant, and verify that this protein shows no ATP hydrolysis. Nevertheless, the deletion of the OB-fold in mDHX36 not only resulted in RNA binding defects, but it also decoupled ATP hydrolysis and RNA binding and thus RNA remodeling (Section 4.2.4). The OB-fold deletion may have affected the ability to properly align the helicase core domains in response to ATP binding or failed to coordinate the assembly of the ATP active site in response to RNA binding. The OB-fold, together with other domains, may be essential for the conformational changes needed to couple RNA binding to the ATPase cycle. Delineation of the exact function of the OB-fold in the coupling of RNA/DNA binding to the ATPase cycle requires more experimentation.

To begin with, we can make point mutations in critical OB-fold residues to identify their impact on coupling ATPase, NA-binding, and remodeling.

Previously I demonstrated that deletion of the β-HP, which eliminates contacts between the RecA2 and the extended CTD domain, increases the basal, non-RNA-stimulated ATPase activity of mDHX36 compared to WT (Section 4.2.5). In contrast, the shortened β-HP generated by replacing the top of the hairpin with a flexible glycine linker had basal ATPase activity comparable to WT. This observation suggests that the complete removal of β-HP organizes mDHX36 domains in a conformation that lowers the coupling of RNA binding to ATP hydrolysis. It will be interesting to test the impact of a β-HP variant wherein the bottom of the hairpin (which interacts with the RecA2 domain) is replaced with a flexible glycine linker. In addition, identification of critical β-HP residues will also be important for understanding the role of this motif in coupling ATPase, NA-binding, and remodeling.

In the structure of DHX36 and other related helicases, the winged-helix domain is tightly packed against the RecA1 domain. The ratchet domain consists of a 7-8 helix bundle and, together with the RecA1 and winged-helix domain, forms a ring around the 3′ tail of the nucleic acid. In Ski2-like helicases, the ratchet domain is thought to facilitate NA translocation and unwinding (detailed in Section 1.3.2). Importantly, deletion of the ratchet domain abolishes the helicase activity of Hel308, although the protein retains its DNA-stimulated ATPase activity 26. Although the orientation of the ratchet helix residues in DEAH/RHA helicases differs from that in DNA-bound Hel308, the ratchet helix might undergo reorientation on binding of NTP and accommodation of an NA substrate, as suggested for Prp43p 346. The crystal structures of DmDHX36 bound to RNA revealed numerous contacts between the ratchet residues and the NA backbone 38. Of note, the ratchet helix residue Q736 (equivalent to position W599 in Hel308), which is conserved among DEAH/RHA helicases, contacts the NA backbone near the 3’-end (Fig. 6.6). Hence, mutational analysis of DHX36 ratchet helix residues will be important to understand their role in the RNA structure remodeling activity of the helicase.

Similarly, conformational changes that accompany stages of ATP binding and hydrolysis and the RNA remodeling cycle result in the movement of the conserved 3’-β-hairpin element (hook-turn) in the RecA1 domain (Section 4.2.1), and of a loop (hook-loop) in the RecA2 domain. Both the hook-turn and the hook-loop are directly located at the NA-binding tunnel and thus might be involved in NA translocation (Fig. 6.7).


Figure 6.6 Conserved ratchet domain residues of DHX36 interact with NA. Semi-transparent cartoon model of the ratchet and winged-helix domains in the nucleotide and nucleic acid bound DHX36 structures (mouse DHX36 bound to ADP (PDB: 6UP4): hot pink; fly DHX36 bound to DNA (PDB: 5N90): silver; RNA: silver cartoon). The fly DHX36 residues that are implicated in NA binding are highlighted. The analogous mouse DHX36 residues are also shown. The structures were aligned by superimposition of the RecA1 domain. For clarity, RecA1, RecA2 and the rest of the extended C-terminal domains are omitted from the structures.

Consistent with this notion, mutations in hook-turn residues in Prp43p abrogated the unwinding activity without affecting the ATPase activity and the RNA binding property of the helicase. In contrast, the unwinding activity of Prp43p is not affected by mutations in its hook-loop residues 57. Interestingly, the integrity of the hook-loop is essential for the RNA unwinding activity of MLE but does not affect its NA binding capacity 46. These observations suggest that the hook-turn and hook-loop elements may link the ATPase and the NA unwinding activities of the helicase. Hence, investigating the roles of these elements in DHX36 will be crucial to understand its translocation mechanism.

Figure 6.7 Hook-loop and hook-turn elements in DHX36. The reorientation of these elements in the nucleotide and nucleic acid bound DHX36 structures (bovine DHX36 bound to DNA (PDB: 5VHE): teal; bovine DHX36 bound to ADP.BeF3- (PDB: 5VHE): teal; mouse DHX36 bound to ADP (PDB: 6UP4): hot pink). The structures were aligned by superimposition of the RecA1 domain. For clarity, the rest of the RecA1, RecA2 and the extended C-terminal domains are omitted from the structures.


6.4 The ligand-induced assembly state (i.e., monomer, dimer, oligomer) of DHX36

A common feature among helicases is the necessity to contain multiple NA binding sites to remodel the NA structure and translocate processively without dissociating from the NA 347. Multiple NA binding sites can be present within a monomer or can be provided by separate subunits of an oligomeric complex. For instance, in ring-shaped hexameric helicases (T7 gp4, RepA, SV40-LTag), the enclosure of the NA within the ring decreases the probability of the helicase falling off, thereby increasing processivity. Additionally, this arrangement enables the coupling of NTP hydrolysis cycles between the subunits, thereby increasing the efficiency of the NTPase cycle in promoting translocation 348.

Although some non-ring forming helicases function as monomers (T4 Dda, HCV NS3h, E. coli RecQ), the activity of several other helicases is greatly enhanced by the formation of dimers or higher order oligomers 349-352. Certain helicases function as monomers, do not form oligomers and do not show cooperativity in NTPase or NA binding, yet they show functional cooperativity when multiple molecules of helicases are loaded on the NA 348. When this leads to enhanced activity, it is either due to the prevention of backward helicase slips or due to the availability of additional helicase molecules when one falls off the substrate 353-354. Given the influence of oligomerization on helicase activity, delineation of the active assembly state of DHX36 is an important step towards understanding the molecular mechanism of RNA structure remodeling by the helicase.

Interestingly, the crystal structures of the mouse DHX36-ADPNP complex (unpublished, Watchalee Chuenchor, Tsan Xiao) and the fly DHX36-3G4RNA complex 38 each showed two molecules in the asymmetric unit. However, size exclusion chromatography (SE) either by itself or combined with small-angle X-ray scattering (SE-SAXS) showed single peaks (corresponding to a monomer form) of mDHX36, bovine DHX36-G4DNA complex, fly DHX36-ADP.AlF4 complex, and fly DHX36-3G4RNA complex 6, 38, 317. Additionally, single-molecule biochemical studies showed that a monomer form of DHX36 was responsible for the remodeling of dGQ and rGQ structures 264, 328.

Nevertheless, we speculate on a mechanism of ATP-induced dimerization of mDHX36 during RNA duplex remodeling (Section 5.2.4). However, there is currently no direct evidence to indicate that mDHX36 may function as a dimer on certain RNA substrates. Notably, the NTP- and NA-induced assembly state of mDHX36 is still uncertain. To analyze the molecular basis of the change in conformation and/or oligomeric state of mDHX36, size exclusion chromatography can be performed in the presence and absence of ADPNP (a non-hydrolyzable ATP analog) and/or RNA duplex. An identical set of experiments should also be done with the RNA quadruplex substrate. Additional controls include experiments performed in the presence of low concentrations of ADPNP and low/high concentrations of a non-hydrolyzable CTP analog.

To analyze the heterogeneity of the oligomeric states of DHX36 in solution, one can also use dynamic light scattering. Dynamic light scattering (DLS) measures fluctuations in scattered light from particles in solution, as a function of time. This enables the determination of the size of particles over a broad range of molecular weights, approximately 13 orders of magnitude 355. Alternatively, to investigate the number of molecules of DHX36 bound to a given RNA substrate, one can use sucrose gradient sedimentation. For this, a binding reaction can be set up with the 32P-labeled RNA substrate (duplex or quadruplex), saturating concentrations of mDHX36 and non-hydrolyzable NTP analog (ADPNP or CDPNP), following which the samples can be loaded on a 6-40 % sucrose gradient and subjected to ultracentrifugation. An RNA-only reaction will be an essential control. The samples can then be fractionated, and the amount of RNA in each fraction can be measured by scintillation counting. Three well-characterized proteins can be used as size standards. The molecular weight of the species in each sample can then be calculated based on the size standards.
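The size calculation from the standards can be sketched as a calibration curve. The following is a minimal illustration only, assuming a power-law relationship between molecular weight and the position of the peak in the gradient (fraction number counted from the top), fitted by least squares on log-transformed values. The function names, the standards, and all numerical values are hypothetical placeholders, not measured data, and the appropriate calibration model would need to be verified for the gradient actually used.

```python
import math

def fit_power_law(positions, weights_kda):
    """Least-squares fit of ln(MW) = intercept + slope * ln(position)."""
    xs = [math.log(p) for p in positions]
    ys = [math.log(w) for w in weights_kda]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def estimate_mw(position, intercept, slope):
    """Molecular weight (kDa) predicted for a peak at the given position."""
    return math.exp(intercept + slope * math.log(position))

# Hypothetical standards: (peak fraction number, molecular weight in kDa).
standards = [(5.0, 45.0), (8.0, 100.0), (12.0, 200.0)]
intercept, slope = fit_power_law(
    [position for position, _ in standards],
    [weight for _, weight in standards],
)

# Hypothetical peak position of the radiolabeled RNA-protein complex.
peak_fraction = 9.0
print(round(estimate_mw(peak_fraction, intercept, slope), 1))
```

Comparing the estimated mass of the complex with the mass expected for one or two mDHX36 molecules plus RNA would then indicate the likely stoichiometry, within the accuracy of the calibration.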

6.5 Nucleotide induced inhibition of mDHX36 RNA duplex remodeling activity

Under pre-steady state (enzyme in excess of RNA) multiple-cycle conditions, high ATP concentrations may or may not promote dimerization of mDHX36, but they inhibit the remodeling of RNA duplexes and not of quadruplexes (Section 5.2.3).

However, to describe the complexity of the inhibition mechanism, a thorough kinetic characterization of mDHX36 is required. To understand the inhibition mechanism, it will be important to study the reaction under pre-steady state, single-cycle (preventing re-association of the enzyme with RNA), and steady-state (RNA in excess of enzyme) conditions. Remarkably, under pre-steady state conditions, high CTP concentrations did not lead to dimerization and/or inhibition of mDHX36 RNA duplex remodeling activity. The effects of other NTPs (UTP and GTP) on the RNA structure remodeling activity of mDHX36 are still unknown. Although cellular ATP levels are ten-fold higher than those of other NTPs, it remains to be seen if dimerization and/or inhibition of mDHX36 on the RNA duplex is promoted by high UTP or GTP concentrations.

Also, the ten-fold tighter ATP affinity of mDHX36 when remodeling the duplex compared to the quadruplex likely plays a role in this inhibition of duplex remodeling at high [ATP]. We can investigate whether substrate inhibition can be overcome by a point mutation in the ATP binding site that reduces ATP affinity. This may be accompanied by a reduction or increase in reaction rate, which will be interesting to evaluate. Importantly, since DHX36 auxiliary domains contribute to the coupling of RNA binding to the ATPase cycle and thus to RNA remodeling (Section 4.3), the impact of mutations in these domains in overcoming substrate inhibition by ATP should be investigated. Much like the wild-type form of mDHX36, the mDHX36∆-DSM variant also showed some cooperativity when remodeling the RNA duplex substrate (at saturating ATP-Mg2+). Nevertheless, it will be interesting to measure the RNA duplex remodeling activity of this variant over a range of ATP concentrations, to test if ATP is inhibitory at high concentrations.
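For orientation, the expected shape of a rate versus [ATP] curve under substrate inhibition can be sketched with the classical substrate-inhibition rate equation, v = Vmax[ATP] / (Km + [ATP] + [ATP]^2/Ki). This model is an assumption used purely for illustration; it has not been established for mDHX36, and the function names and parameter values below are arbitrary placeholders.

```python
import math

def remodeling_rate(atp, vmax, km, ki):
    """Rate under classical substrate inhibition.

    v = vmax * [ATP] / (km + [ATP] + [ATP]^2 / ki)
    All concentrations must share the same units.
    """
    return vmax * atp / (km + atp + atp * atp / ki)

def optimal_atp(km, ki):
    """ATP concentration giving the maximal rate, sqrt(km * ki)."""
    return math.sqrt(km * ki)

# Arbitrary illustrative parameters (not fitted values).
vmax, km, ki = 1.0, 0.1, 10.0
for atp in (0.01, 0.1, 1.0, 10.0, 100.0):
    print(atp, round(remodeling_rate(atp, vmax, km, ki), 3))
print("optimum:", optimal_atp(km, ki))
```

In this scheme, a mutation that weakens ATP binding at the inhibitory site (larger Ki) shifts the optimum to higher [ATP], whereas a mutation that weakens productive ATP binding (larger Km) also lowers the rate at sub-saturating [ATP]; measuring full ATP titrations for each variant would help distinguish these cases.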

To learn more about the assembly state of mDHX36 responsible for RNA duplex remodeling inhibition, we can set up mutant doping experiments, where one can evaluate the effect of adding non-functional mDHX36 mutants to RNA remodeling and ATPase reactions with WT mDHX36. These experiments have to be performed under both multiple-cycle and single-cycle conditions. One can generate two types of mutants, one designed to disrupt helicase activity without affecting ATPase activity, and the other to disrupt ATPase activity. If a dimer is required for biochemical activity, the presence of an excess molar concentration of mutant enzyme should inhibit the reaction catalyzed by the WT enzyme. In these assays, an observed decrease in remodeling rates could suggest two things: competition between the WT and mutant enzyme for RNA substrate binding, or the formation of hetero-dimers between WT monomers and mutant monomers that unwind the substrate at a lower rate. The formation of such hetero-oligomers has been previously demonstrated for the dimeric E. coli RepA and hexameric bacteriophage T7 gp4 356-357.
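The logic of a doping experiment can be illustrated with a simple random-mixing model. The sketch below assumes that WT and mutant subunits assemble with equal probability and that an assembly is active only if every subunit is WT; it deliberately ignores competition for RNA binding and partially active hetero-oligomers, both of which are discussed above. The function name and the chosen mutant fractions are hypothetical.

```python
def relative_activity(mutant_fraction, subunits):
    """Fraction of assemblies made up only of WT subunits.

    Assumes random mixing of WT and mutant subunits and that a single
    mutant subunit inactivates the whole assembly.
    """
    if not 0.0 <= mutant_fraction <= 1.0:
        raise ValueError("mutant_fraction must be between 0 and 1")
    if subunits < 1:
        raise ValueError("subunits must be at least 1")
    return (1.0 - mutant_fraction) ** subunits

# Compare a monomer (1 subunit) with a dimer (2 subunits).
for fraction in (0.0, 0.25, 0.5, 0.75):
    print(fraction, relative_activity(fraction, 1), relative_activity(fraction, 2))
```

Under these assumptions, activity falls linearly with the mutant fraction for a monomer and more steeply for a dimer, so the curvature of the inhibition profile, rather than the mere presence of inhibition, carries the information about the assembly state.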


All the RNA remodeling experiments in this work were performed with RNA pre-incubated with mDHX36, and the reactions were started with ATP-Mg2+. To investigate the order of binding of DHX36’s substrates and products, it is important to change the order of addition of reaction components and subsequently monitor the effect on RNA remodeling. The order of substrate (ATP and RNA) binding to DHX36 may have implications for the mechanism of the enzyme. For instance, there is a possibility that the binding of RNA to DHX36 results in an inactive conformation that affects subsequent binding of ATP. This would necessitate the binding of ATP before RNA to form an active ternary complex.

An alternative explanation for mDHX36 RNA duplex remodeling inhibition at high [ATP] could be allosteric inhibition. In this scenario, ATP may bind to a site on the monomeric enzyme other than the catalytic site. To test this, the identification of allosteric ATP binding sites, followed by the use of allosteric site mutants in RNA remodeling assays, will prove useful.

Lastly, the structure determination of DHX36 ternary complexes with non-hydrolyzable ATP analogs and NAs (RNA duplex and RNA GQ) will be crucial for elucidating the inhibition mechanism. This will provide insights into the conformational changes that give rise to ATP inhibition during RNA structure remodeling.

6.6 Effect of cofactors and PTMs on DHX36 activity

In the cell, protein cofactors are likely to influence helicase activity, sub-cellular localization, and RNA substrate recognition (Section 1.6). As discussed in Section 2.5, DHX36 interacts with many cofactors, including PARN, hRrp40p, PM/scl-100, and Aven. In addition, DHX36 also interacts with the telomerase holoenzyme, histone deacetylase 1 (HDAC1), HuR, and NFAR1 in an indirect RNA-dependent manner. Co-immunoprecipitation experiments (Co-IP) performed using lysates from HeLa cells transfected with HA-tagged DHX36 identified a direct interaction of DHX36 with PARN and the exosome components 126. Similarly, Aven can recruit DHX36 to polyribosomes and increase the translation of certain mRNAs, and the interaction of DHX36 with RGG/RG domains of Aven was confirmed via Co-IP 192. However, the effect of these cofactors on the biochemical properties of DHX36 is currently unknown. Using purified recombinant proteins, one can evaluate the influence of these cofactors on DHX36 biochemical activity in RNA remodeling and ATPase assays. This type of analysis will provide more quantitative and mechanistic information on this subject.

In recent years, it has become evident that post-translational modifications (PTMs) of helicases allow a fine-tuning of their function. In a few cases, modifications have been shown to affect co-factor interaction directly. For instance, the sumoylation of p68/DDX5 promotes interaction with HDAC1 and alters transcriptional activity 358. PhosphoSite® is a curated, sequence-oriented protein database dedicated to in vivo phosphorylation sites that culls all the proteomic data into a searchable platform (http://www.phosphosite.org/). Using this database to search for PTMs detected on DHX36, several locations within the N-terminal region, conserved helicase core, and C-terminal region were identified (Fig. 6.8a). While PTMs are likely to affect DHX36 protein function either directly or through protein co-factors, the biological and biochemical consequences of these modifications remain to be studied.


Figure 6.8. Prediction of post-translational modifications and disordered regions in mDHX36. (a) PTMs (blue, phosphorylation; green, acetylation; orange, ubiquitination; grey, others) identified by PhosphoSite® are noted by colored circles above the schematic of the DHX36 structure. (b) Analysis of the mDHX36 amino acid sequence by the IUPred2A disorder prediction server. The probability to form disordered regions (y-axis) is shown with respect to the residue number (x-axis).

6.7 DHX36’s ability to form reversible aggregates and phase separate in solution

Cytoplasmic granules usually contain translationally inactive mRNAs and associated proteins and are defined by their cellular localization and protein composition 359. Granules assemble when compositionally similar mRNPs temporarily accumulate in response to various cellular events such as metabolic stress and pathogen infection, and during the cell cycle 360.

Stress granules (SGs) are dynamic complexes that form when translation initiation is blocked or inhibited in response to stress, drug treatment, and modulation of the composition of translation factors 359. Depending on conditions, SGs may contain translation initiation factors, mRNA stability factors, cell-signaling factors, and RNA helicases. Although granules are fundamental to cellular function, permanent accumulation of these assemblies can also contribute to disease pathology. An emerging theme is that the perturbation of the biologically important higher-order assemblies of RBPs can be pathogenic 361-362. As a result, there has been a growing interest in the composition, function, and assembly of RBP granules.

In vitro, RBPs containing low complexity (LC) sequences or intrinsically disordered regions (IDRs) can undergo concentration-dependent phase transition to form reversible hydrogels 223, 363. Such liquid-liquid phase transitions (LLPS) are believed to model the cellular RNA granule architecture, and may be a general mechanism underlying intracellular granule assembly 364. A few DEAD-box helicases such as Ded1p/DDX3X, p68/Dbp5p, and LAF-1/DDX3Y are stress granule components, and contain N- and C-terminal regions that are thought to be disordered 61, 365. The C. elegans helicase LAF-1 can phase separate to form liquid droplets in vitro 61; this study helped characterize the molecular interactions that drive LLPS in P-granules in vivo.

DHX36 was previously shown to localize to stress granules (SGs) in an RNA-dependent manner through its unique N-terminal domain that contains a glycine-rich region, a di-RG motif, and a DSM motif 239. Additionally, the ATPase activity of DHX36 was required for rapid shuttling into and out of SGs, and for remodeling and recruitment of RNPs in SGs 239. Another recent study showed that DHX36 target mRNAs and transcripts harboring rGQs are enriched in stress granules 242. DHX36 also contains an intrinsically disordered N-terminal domain that harbors a G-rich patch and a di-RG motif 239, 6, 38 (Fig. 6.8b). We can thus investigate whether DHX36 can undergo LLPS similar to cellular granules. Some evidence suggests that rGQs and dGQs by themselves can undergo phase transition and drive stress granule assembly 366-367. Hence, the study of phase transitions in DHX36 and GQs will shed light on the specific molecular interactions that drive phase separation and the mechanisms by which liquid properties impart cellular function.

All the biochemical experiments in this work were performed with an N- and C-terminally truncated mDHX36 (mDHX36(46-982)). However, for studying phase transitions in DHX36, we need purified full-length recombinant mouse DHX36. Then, we can use methods such as phase contrast or differential interference contrast (DIC) microscopy to study such transitions in protein solutions 61. DHX36 can also be fluorescently tagged (Cy3 or Cy5), and phase transitions can then be monitored via fluorescence microscopy 61, 363. To determine the DHX36 regions required for phase transitions, mDHX36(46-982) will still prove useful. We can then assess conditions (salt, pH, temperature, molecular crowding, and concentration of protein) that favor or disfavor droplet formation. Importantly, we can monitor if DHX36 can seed RNA into these droplets, and also test the effect of ATP/NTP on droplet formation. This will be crucial as expansions of G-rich sequences (predicted to form GQs) that sequester RBPs have been associated with neurological disorders 1 (Section 2.2.2.4).


Chapter 7

Biochemical analysis of the DEAD-box helicase DDX41

7.1 Introduction

As mentioned in Chapter 3, the p.R525H mutation in DDX41 occurs at a canonical hotspot (Fig. 3.7), suggesting that it might distinctly alter DDX41 function during oncogenesis. The highly conserved arginine-525 is located in Motif-VI of the RecA2 domain in DDX41. Motif-VI, which is conserved in the DEAD-box family of RNA helicases, is known to be important for ATP binding and hydrolysis. The 3rd arginine in this motif likely forms hydrogen bonds with the gamma phosphate of bound ATP. Consequently, the conversion of this arginine to histidine could alter the biochemical functions of DDX41. Our goal was to analyze if the R525H mutation in DDX41 caused a significant change in its biochemical activity.

DDX41 has been shown to sense microbial DNA, and in cell-based assays, when stimulated with poly(dA:dT) or poly(dG:dC), it interacts with the downstream adaptors STING and TBK1, triggering the activation of the interferon response. Double-stranded DNA, such as poly(dA:dT) or poly(dG:dC), was shown to stimulate the ATPase activities of recombinant WT and R525H mutant DDX41 368. However, in this study, defects in ATP binding and hydrolysis were not reported for the mutant DDX41R525H. With the exception of this one study, very little is known about the biochemical activities of WT DDX41, and the effect of the R525H mutation on the activity of DDX41 has not been carefully examined. Importantly, DDX41 may have both ATPase and NA remodeling activities, but the NA remodeling activity has never been reported.


In this chapter, I report the characterization of the biochemical activities of WT DDX41 and mutant DDX41R525H using in vitro approaches and demonstrate modest defects in the RNA remodeling activity of the mutant.

7.2 Results

7.2.1 Purification of recombinant human DDX41

To establish the biochemical properties of DDX41, we overexpressed and purified recombinant human DDX41 from E. coli. The purified protein showed >80 % purity on a Coomassie-stained gel. However, in our downstream in vitro assays, robust biochemical activity of DDX41 could not be detected. Subsequent efforts suggested that this was in part due to nuclease contamination of the purified protein preparations. Additionally, the purified protein may have been functionally inactive due to improper folding and/or lack of post-translational modifications. After several rounds of troubleshooting, we switched to an insect cell expression host, as a eukaryotic expression host can provide post-translational modifications necessary to obtain a functionally active protein. To this end, we expressed and purified recombinant WT DDX41 and DDX41R525H from baculovirus-infected insect cells and obtained >85% pure recombinant proteins suitable for subsequent in vitro biochemical assays (Fig. 7.1).

Figure 7.1. Purification of recombinant WT and R525H mutant DDX41 from insect cells. Coomassie stained SDS-PAGE gels showing purified WT DDX41 (a) and mutant DDX41 (b), along with pre-stained protein size standards.


7.2.2 ATP-dependent RNA remodeling activity of DDX41 masked by a potential nuclease contamination issue

First, to test if the purified DDX41 was biochemically active, pre-steady state unwinding reactions were performed with a 3’ tailed (25 nt), 16 bp RNA duplex where the 5’ end of the top strand was radiolabeled with [γ-32P]-ATP (Fig. 7.2a). In the presence of DDX41 and ATP/Mg2+, we observed a 10-20 % accumulation of unwound single strand (ss) RNA (R16) as early as 5 min; however, the unwound ssRNA was immediately degraded (yellow arrow, Fig. 7.2b). In the control reaction without ATP/Mg2+, unwound product did not accumulate and the RNA was not degraded (Fig. 7.2c). We observed a similar pattern with a 3’ tailed (25 nt), 16 bp DNA/RNA hybrid duplex, where the unwound single-stranded (ss) DNA was immediately degraded (data not shown). Despite the degradation of the unwound product, these results suggested that the purified recombinant WT DDX41 was biochemically active, as it displayed ATP-dependent RNA unwinding activity. These results also indicated that our purified recombinant protein might have been contaminated with nucleases from the insect cell lysate.

Figure 7.2. RNA unwinding by WT DDX41. Reaction scheme for pre-steady state unwinding (a). Representative PAGE for unwinding reactions (RNA: 0.5 nM, ATP: 2 mM, DDX41: 250 nM) in the presence of ATP (b), in the absence of ATP (c), and in the absence of enzyme (d). Cartoons on the left show duplex and single strand RNA substrate. The asterisk represents the radiolabel. Arrow (yellow) represents degradation product.


We next wondered if WT DDX41 would separate a smaller duplex region more efficiently and if the degradation could be minimized when using a different RNA sequence. To accomplish this, we performed unwinding reactions with a 3’ tailed (25 nt), 13 bp RNA duplex. We observed that irrespective of the length of the duplex and the sequence of the RNA, the unwound ssRNA (13 nt, in this example) that accumulated due to an ATP-dependent RNA unwinding activity of DDX41 was immediately degraded (Fig. 7.3a). In order to prevent the degradation of the unwound ssRNA and to achieve an increase in the fraction of visible unwound ssRNA product, I performed unwinding reactions under strand exchange conditions. Here, the reaction contained an excess of the unlabeled top strand. We hypothesized that due to the availability of unlabeled top strand in molar excess, the labeled strand would remain preferentially free from nuclease attack. As shown in Fig. 7.3b, this reaction regime increased the fraction of visible unwound ssRNA product. Yet, the accumulated ssRNA was eventually degraded.

We next asked whether we could use the strand exchange regime to capture the unwound ssRNA product and completely prevent its degradation by the potential nuclease. To accomplish this, our reactions included a molar excess of unlabeled RNA complementary to the labeled top strand. Under these conditions, the unwound ssRNA product base paired with its complementary strand, and the resulting blunt-ended 13 bp RNA duplex that accumulated was no longer degraded (Fig. 7.3c). This set of observations confirmed the presence of RNA remodeling activity in our WT DDX41 preparation, and the accumulation of blunt-ended RNA duplex in Fig. 7.3c also indirectly showed that DDX41 requires an ss region adjacent to the duplex for efficient RNA structure remodeling.

Figure 7.3. WT DDX41 unwinding of a 13 bp RNA duplex under regular and strand exchange conditions. Representative PAGE for unwinding reactions (RNA: 0.5 nM, ATP-Mg2+: 2 mM, DDX41: 400 nM) in the absence (a), and in the presence of 1 µM unlabeled top strand (b). Reactions in the presence of 1 µM RNA complementary to the top strand (c). Cartoons on the left show duplex and single strand RNA substrate. The asterisk represents the radiolabel. Arrow (yellow) represents degradation product.

7.2.3 Characterization of the potential nuclease contamination in recombinant DDX41 preparations

In the biochemical experiments shown so far, the accumulation of a distinct degradation product of uniform size was very puzzling. A 3’->5’ exonuclease contamination is commonly encountered in recombinant protein preparations. If the contaminating protein in our DDX41 preparations was a 3’->5’ exonuclease, it would degrade the 3’-end ss regions on the RNA duplex and stop as soon as it encountered structured RNA regions. This would result in an accumulation of a blunt-ended form of the radiolabeled duplex in our gel-based in vitro assays. However, we did not observe this in our assays. Also, we did not observe any intermediates or smearing of the degraded species (Figs. 7.2, 7.3). However, the nature of the assays was such that, even if intermediates were present, they would not be captured under the reaction conditions used. Furthermore, other equally likely possibilities, such as the presence of a contaminating 5’->3’ exonuclease and/or endonuclease, cannot be excluded.

In order to further probe the nature of this potential nuclease contamination (3’->5’ exonuclease, 5’->3’ exonuclease or endonuclease), we developed an assay to capture and analyze the intermediates (if any) of the degradation product observed in our assays. WT DDX41 was incubated with ATP/Mg2+ and radiolabeled ssRNA, 41 nt (Fig. 7.4), or ssDNA (data not shown) in a temperature-controlled heat block (30 °C), and sample aliquots were collected every 15 sec. The reaction was allowed to proceed for 5 min, and the samples were analyzed by denaturing PAGE at single-nucleotide resolution. The reactions in the presence (Fig. 7.4, left panel) and the absence (Fig. 7.4, middle panel) of ATP/Mg2+ showed an accumulation of degradation product as early as 15 sec, but without intermediates. The accumulated degradation products correspond to a size of 1-2 nt. The control reaction in the absence of DDX41 (Fig. 7.4, right panel) did not show an accumulation of degradation products. A similar pattern was observed with reactions in the presence of DNA. These observations suggested that the potential contaminant was either a 5’ nucleotide phosphatase that cleaved the 5’-end radiolabel introduced with [γ-32P]-ATP, or a 5’->3’ exonuclease that cleaved the 5’-terminal nucleotide carrying the radiolabel, the former being the more likely possibility.

Figure 7.4. Analysis of RNA degradation products accumulated in the presence of WT DDX41. Representative PAGE for reactions (ssRNA, 41 nt: 0.5 nM, ATP: 2 mM, DDX41: 100 nM) in the presence of ATP (left), in the absence of ATP (middle), and in the absence of DDX41 (right). The asterisk represents the radiolabel. Arrow (yellow) represents degradation product.


7.2.4 Characterization of the RNA remodeling activity of DDX41 using a modified RNA substrate

Given this set of observations, we next asked whether switching the radiolabel to the 3’-end of the RNA oligo would help eliminate this potential 5’->3’ exonuclease issue. To accomplish this, unwinding reactions were performed with a 13 bp RNA duplex where the 3’-end of the bottom-strand RNA was radiolabeled with [5′-32P]pCp (cytidine 3’-5’ bis-phosphate). The accumulation of a distinct degradation product was no longer observed; however, the fraction of visible unwound ss product was merely 10-20% (Fig. 7.5a). Also, unwinding of this 13 bp duplex was not observed in the absence of ATP, indicating that strand separation is ATP-dependent (Fig. 7.5a middle).

Figure 7.5. WT DDX41 unwinding reactions with a 13 bp RNA duplex containing a radiolabel at the 3’-end of the bottom strand. Representative PAGE for unwinding reactions (RNA: 0.5 nM, ATP: 2 mM, DDX41: 200 nM) with a 13 bp RNA duplex containing a 3’, 26 nt overhang and radiolabel (a), and a 5’-end monophosphorylated 13 bp RNA duplex containing a 3’, 26 nt overhang and radiolabel (b).


Importantly, this observation led us to ask whether a 5’-phosphate on the nucleotide substrate was required to activate the unwinding activity of DDX41 and/or whether DDX41 had an inherent 5’ nucleotide phosphatase activity that was required to activate its unwinding activity. To test this, I performed unwinding reactions with the 13 bp RNA duplex where the 5’-end of the top-strand RNA was mono-phosphorylated and the 3’-end of the bottom-strand RNA was radiolabeled. I observed robust unwinding activity on this substrate (Fig. 7.5b). Additionally, even with a 16 bp, 3’-end labeled RNA duplex, I observed more robust unwinding activity when the 5’-end of the top-strand RNA was mono-phosphorylated than when it was non-phosphorylated. This suggests that the pattern was the same irrespective of the length or sequence of the duplex region used (Fig. 7.6a,b).

Figure 7.6. WT DDX41 unwinding reactions with a 16 bp RNA duplex containing a radiolabel at 3'-end of the bottom strand. Representative PAGE for unwinding reactions (RNA: 0.5 nM, ATP: 2 mM, DDX41: 200 nM) with a 16 bp RNA duplex containing a 3’, 26 nt overhang and radiolabel (a), and a 5’-end monophosphorylated 16 bp RNA duplex containing a 3’, 26 nt overhang and radiolabel (b).

If the presence of a 5’-phosphate on the nucleotide substrate was required to activate the unwinding activity of DDX41, it should also increase the RNA-stimulated ATPase activity of DDX41. To investigate this possibility, I assayed the RNA-stimulated steady-state ATP hydrolysis rates of DDX41 using a series of single-stranded and double-stranded RNA substrates that were either non-phosphorylated or mono-phosphorylated at the 5’-end of one of the strands. However, I did not observe a significant difference in initial rates of hydrolysis between these substrates (Fig. 7.7b,c), suggesting that the 5’ monophosphate was required only for the activation of DDX41 duplex unwinding activity. Additionally, we observed a more robust stimulation of ATPase activity with RNA duplexes than with ssRNA, suggesting that structured RNA stimulates the ATPase activity of DDX41 more effectively than single-stranded RNA.

Figure 7.7. WT DDX41 ATP hydrolysis reactions. (a) Reaction scheme for steady-state ATPase assays. (b) Representative TLC images of ATP hydrolysis reactions with 5’-end non-phosphorylated (left) and mono-phosphorylated (right) 16-bp RNA duplex in the presence of DDX41 (RNA: 1 µM, ATP: 0.5 mM, DDX41: 200 nM). (c) Initial ATP hydrolysis reaction velocities in the presence or absence of the indicated RNA substrates. Data points represent an average of 3 independent measurements. Error bars mark 1 SD.

7.2.5 Biochemical activity of WT DDX41 and mutant DDX41R525H

In light of this finding, we decided to use a 5’-end mono-phosphorylated RNA duplex as a model substrate for subsequent unwinding reactions comparing WT DDX41 and mutant DDX41R525H. I performed pre-steady state unwinding reactions with the 3’-tailed, 16 bp RNA duplex to determine unwinding rate constants at enzyme saturation and functional affinities of WT DDX41 and DDX41R525H for RNA and ATP 312. Both WT DDX41 and DDX41R525H displayed clear and measurable unwinding activities, even at low nanomolar (10 nM) concentrations (Fig. 7.8). However, at comparable concentrations (10 nM), DDX41R525H unwound the duplex more slowly than WT DDX41 (Fig. 7.8a,b). We next measured unwinding rate constants for WT DDX41 and DDX41R525H with increasing enzyme concentrations (Fig. 7.8c). Extrapolated to saturating enzyme concentration, the unwinding rate constant of WT DDX41 was approximately 4-fold higher than that of DDX41R525H (Fig. 7.8c). This observation indicates that WT DDX41 remodels RNA more efficiently than the mutant DDX41R525H. However, the apparent functional binding constant (K1/2) of DDX41R525H for RNA, in the presence of ATP, is lower than that of WT DDX41 by a factor of roughly four (Fig. 7.8c,d). This observation indicates that the mutant protein binds RNA more tightly than WT DDX41.

Figure 7.8. RNA unwinding by WT DDX41 and DDX41R525H. (a) Representative PAGE for unwinding reactions (RNA: 0.5 nM, ATP: 2 mM, DDX41/DDX41R525H: 10 nM). Cartoons on the left show duplex and single strand RNA substrate. The asterisk represents the radiolabel. (b) Representative unwinding time courses by WT DDX41 (filled black circles) and mutant DDX41R525H (open black circles). Conditions were as in (a). Curves represent best fits to the integrated first order rate law, yielding observed rate constants ($k_{obs}$). For WT DDX41, $k_{obs}$ = 0.024 ± 0.006 min$^{-1}$; for DDX41R525H, $k_{obs}$ = 0.01 ± 0.005 min$^{-1}$. (c) Dependence of unwinding rate constants (2 mM ATP) on enzyme concentration for WT DDX41 (filled circles) and DDX41R525H (open circles). Rate constants are averages from at least 6 independent measurements. Error bars represent 1 SD. Curves represent the best fit to the binding isotherm $k_{obs} = k_{obs}^{max}\,[\mathrm{DDX41}] / (K_{1/2}^{DDX41} + [\mathrm{DDX41}])$, where $k_{obs}$: observed unwinding rate constant; $K_{1/2}^{DDX41}$: apparent functional binding constant of DDX41 to the RNA substrates. Unwinding rate constants at enzyme saturation were for WT DDX41: $k_{obs}^{max}$ = 0.096 ± 0.011 min$^{-1}$; for DDX41R525H: $k_{obs}^{max}$ = 0.024 ± 0.001 min$^{-1}$. (d) Functional binding constants of WT DDX41 and DDX41R525H for RNA duplex (2 mM ATP). $K_{1/2}^{WT}$ = 89.8 ± 22.6 nM and $K_{1/2}^{R525H}$ = 22.7 ± 4.8 nM, calculated from the data fit in panel (c); errors reflect the fitting error.
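The comparison between WT DDX41 and DDX41R525H can be restated as a short calculation. The sketch below only re-uses the fitted values quoted in the Figure 7.8 caption; the function and variable names are illustrative and not part of the original analysis.

```python
def k_obs(enzyme_nM, k_max, k_half_nM):
    """Hyperbolic binding isotherm: k_obs = k_max * [E] / (K_1/2 + [E])."""
    return k_max * enzyme_nM / (k_half_nM + enzyme_nM)

# Fitted parameters as quoted in the Figure 7.8 caption (min^-1 and nM).
WT = {"k_max": 0.096, "k_half_nM": 89.8}
R525H = {"k_max": 0.024, "k_half_nM": 22.7}

# Ratios of WT over mutant for the two fitted parameters.
rate_ratio = WT["k_max"] / R525H["k_max"]
k_half_ratio = WT["k_half_nM"] / R525H["k_half_nM"]

print(f"k_max ratio (WT/R525H): {rate_ratio:.2f}")
print(f"K_1/2 ratio (WT/R525H): {k_half_ratio:.2f}")

# By construction, the isotherm returns half of k_max when [E] equals K_1/2.
half_max_wt = k_obs(WT["k_half_nM"], **WT)
print(f"WT k_obs at [E] = K_1/2: {half_max_wt:.3f} min^-1")
```

Both ratios come out close to four, which is the basis for describing the mutant as roughly four-fold slower at saturation and roughly four-fold tighter in functional RNA binding.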

The R525H mutation affects the 3rd Arg of motif VI. The residues that make up this motif are conserved in DEAD-box RNA helicases and play an important role in ATP binding and hydrolysis 28. Therefore, I next asked whether the R525H mutation in DDX41 altered its ATP binding. For this, I compared the functional affinities of WT and mutant DDX41 for ATP during RNA remodeling. I found that the affinities of WT DDX41 and DDX41R525H for ATP, in the presence of RNA, do not differ significantly (Fig. 7.9c). Both WT and mutant DDX41 bind ATP with low micromolar affinity, which is on the high end of the ATP affinity spectrum for DEAD-box helicases 40. This affinity also implies that both WT DDX41 and DDX41R525H operate in the cell under saturating conditions with respect to ATP.
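The saturation argument can be illustrated with the same hyperbolic dependence used for the fits. This is a minimal sketch: the K1/2 values are those reported in Figure 7.9c, while the use of 2 mM ATP (the assay concentration) as a stand-in for millimolar cellular ATP is an assumption made only for illustration.

```python
def fraction_of_max(atp_uM, k_half_uM):
    """Fraction of the maximal unwinding rate reached at a given ATP concentration."""
    return atp_uM / (k_half_uM + atp_uM)

# Apparent functional binding constants for ATP from Figure 7.9c (micromolar).
K_HALF_ATP_UM = {"WT": 4.0, "R525H": 5.0}

# Assumption: 2 mM ATP, the concentration used in the unwinding assays,
# taken here as representative of a millimolar cellular ATP pool.
ATP_UM = 2000.0

saturation = {name: fraction_of_max(ATP_UM, k) for name, k in K_HALF_ATP_UM.items()}
for name, frac in saturation.items():
    print(f"{name}: {100 * frac:.1f}% of the maximal rate at 2 mM ATP")
```

Under this assumption both proteins operate above 99% of their maximal rate, so a difference of about 1 µM in K1/2 for ATP would not be expected to matter.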

Collectively, our data indicate that the R525H mutation in DDX41 causes a defect in the ability of DDX41 to remodel RNA and a modest increase in the affinity for RNA. These observations suggest that DDX41R525H may exert an antimorphic effect over WT DDX41, consistent with the genetic findings that the somatic R525H mutation can occur in the presence or absence of the wild-type allele.

7.3 Discussion

The original work was slowed by what appeared to be a nuclease problem: when the RNA duplex was unwound, the 5’-end radiolabeled single strand was degraded. This problem was overcome by using an RNA duplex with a 3’-end radiolabel. Subsequent efforts indicated that the presence of a 5’-monophosphate on the short single strand was necessary to obtain robust duplex unwinding by WT DDX41. However, the presence of the 5’-monophosphate did not seem to affect the RNA-stimulated ATPase activity of WT DDX41. The stimulation of WT DDX41 unwinding activity by a 5’-monophosphate could, in part, result from the monophosphate rendering the duplex less stable. In part, this unusual activity of DDX41 may also reflect end requirements that might be associated with pre-mRNA splicing.

Figure 7.9. RNA unwinding by WT DDX41 and DDX41R525H as a function of ATP concentration. (a) Representative PAGE for unwinding reactions (RNA: 0.5 nM, ATP: 0.0025 mM, WT DDX41/DDX41R525H: 100 nM). Cartoons on the left show duplex and single strand RNA substrate. The asterisk represents the radiolabel. (b) Dependence of unwinding rate constants (100 nM DDX41) on ATP concentration for WT DDX41 (filled blue circles) and DDX41R525H (filled red circles). Rate constants are averages from at least 6 independent measurements. Error bars represent 1 SD. Curves represent the best fit to the binding isotherm $k_{obs} = k_{obs}^{max}\,[\mathrm{ATP}] / (K_{1/2}^{ATP} + [\mathrm{ATP}])$, where $k_{obs}$: observed unwinding rate constant; $K_{1/2}^{ATP}$: apparent functional binding constant of DDX41 to ATP. Unwinding rate constants at enzyme saturation were for WT DDX41: $k_{obs}^{max}$ = 0.096 ± 0.011 min$^{-1}$; for DDX41R525H: $k_{obs}^{max}$ = 0.024 ± 0.001 min$^{-1}$. (c) Functional binding affinities of WT DDX41 and DDX41R525H for ATP. Reactions were performed and analyzed as in panel (a). For WT DDX41, $K_{1/2}^{ATP}$ = 4.0 ± 0.9 µM; for DDX41R525H, $K_{1/2}^{ATP}$ = 5.0 ± 1.3 µM.

Biochemical studies of eIF4A (a DEAD-box helicase) showed that substitution of the 3rd Arg with either Gln (R365Q) or Lys (R365K) resulted in a complete loss of unwinding activity and a partial reduction in ATPase activity 369. Overall, the eIF4A study showed that motif VI residues were required for RNA binding, unwinding, and ATP hydrolysis. The crystal structure of Vasa (a DEAD-box helicase) with ssRNA and a non-hydrolysable ATP analog, AMP-PNP, showed that the 2nd and 3rd Arg of motif VI contact the triphosphate group of AMP-PNP 2, 28 (Fig. 7.10).

Figure 7.10. Schematic representation of the key residues in the helicase core domain that mediate ATP binding and hydrolysis, based on the structure of Vasa 2. Brackets indicate the approximate binding surface for the conserved sequence domains represented by the numbers. Functionalities that coordinate the catalytic water (E, motif II), stabilize the transition state (R2, motif VI), and coordinate the Mg2+ (D, motif II; T/S, motif I) and the β-phosphate of ATP (K, motif I) are largely conserved among other helicase families 10. Sequence logos of the conserved domains involved in ATP binding and hydrolysis are shown at the bottom. These logos are constructed from sequence alignments of all of the DEAD-box and superfamily 2 (SF2) helicases from Escherichia coli, Bacillus subtilis, Saccharomyces cerevisiae and Homo sapiens 10. The colored dots below the sequence logos correspond to the residues shown in the schematic, and the colors of the dots and bonds emphasize the different residues. This figure is from Ref. 28 with permission to reuse.

However, in our DDX41 biochemical experiments, the mutation of the 3rd Arg to His (R525H) had relatively modest effects on the unwinding activity of DDX41. Perhaps, with a histidine replacement, the residue can still form some or all of the hydrogen bonds with the triphosphate, but likely in a different configuration. Alternatively, interactions of the histidine residue may strengthen RNA binding.

The R525H mutant DDX41 bound RNA approximately four-fold tighter and was nearly four-fold less active in RNA duplex unwinding than WT DDX41. These results suggest that the DDX41 R525H mutation (in myeloid neoplasms, Section 3.3.1.2) may be associated with slower unwinding of RNAs and/or stabilization of a different configuration of the spliceosome. This is consistent with genetic evidence in MDS patients, in which DDX41R525H can function in adult bone marrow either in combination with a truncating germline allele or in the presence of the wild-type allele (Section 3.3.1.2).

DDX41 appears to be an essential gene in cell lines, and homozygous truncation mutations are not observed in humans or mice 4. Consequently, the modest defects in biochemical activity observed with DDX41R525H may be necessary to maintain a nonlethal level of functionality in the presence of a truncating germline allele. Nevertheless, even in the presence of a wild-type allele, the DDX41R525H mutation appears to cause leukemogenesis. In light of our in vitro biochemical findings, this suggests that R525H mutant DDX41 may exert a dominant-negative effect over WT DDX41.

7.4 Future directions

One possible explanation for the increase in visible single strand product observed in our unwinding assays with a 5’-end phosphorylated duplex (Figs. 7.5, 7.6, and 7.8) could be a difference in stability between the phosphorylated and the non-phosphorylated duplex. The presence of a 5’ monophosphate could make a duplex less stable than a non-phosphorylated duplex and in turn lead to its more efficient separation. If this observation were truly due to the nature of the duplex substrate and not a specific requirement of DDX41, then a different helicase should produce similar results. To test this, I used DHX36 (a DEAH-family helicase) and Ded1p (a DEAD-box family helicase) and compared their unwinding activity on a non-phosphorylated and a mono-phosphorylated duplex substrate. Under the conditions tested, both DHX36 and Ded1p unwound the non-phosphorylated and mono-phosphorylated duplexes with similar unwinding rate constants (Fig. 7.11a,b). However, DDX41 unwound the mono-phosphorylated duplex more efficiently than the non-phosphorylated duplex (10-fold higher unwinding rate constant) (Fig. 7.11c). This suggests that an end-phosphorylated substrate may reflect a specific requirement of DDX41. Note that, for an accurate comparison, the activities of DHX36 and Ded1p need to be measured under enzyme- and ATP-saturating conditions.

Figure 7.11. Non-phosphorylated and mono-phosphorylated RNA duplex unwinding by three helicases. Left: Representative PAGE for unwinding reactions (RNA: 0.5 nM, ATP: 2 mM) (a) DHX36: 100 nM; (b) Ded1p: 200 nM; (c) DDX41: 100 nM. Right: Representative unwinding time courses for the non-phosphorylated 13-bp RNA duplex (blue circles) and the mono-phosphorylated 13-bp RNA duplex (red circles) with the respective helicases. Curves represent best fits to the integrated first order rate law, yielding observed rate constants ($k_{obs}$). For DHX36, Non-Phos Duplex $k_{obs}$ = 0.44 ± 0.23 min$^{-1}$; Phos Duplex $k_{obs}$ = 0.32 ± 0.04 min$^{-1}$. For Ded1p, Non-Phos Duplex $k_{obs}$ = 2.23 ± 0.36 min$^{-1}$; Phos Duplex $k_{obs}$ = 2.15 ± 0.29 min$^{-1}$. For DDX41, Non-Phos Duplex $k_{obs}$ = 0.01 ± 0.003 min$^{-1}$; Phos Duplex $k_{obs}$ = 0.18 ± 0.02 min$^{-1}$. Errors reflect fitting error. Cartoons on the left show duplex and single strand RNA substrate. The asterisk represents the radiolabel.
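For readers unfamiliar with how an observed rate constant is extracted from an unwinding time course, the following sketch fits the integrated first-order rate law to a synthetic, noise-free data set. It is not the fitting procedure used for Figure 7.11: it assumes the reaction amplitude is known and uses a simple log-linearization in place of a nonlinear fit, and the amplitude, time points, and rate constant are hypothetical values chosen for illustration.

```python
import math

def first_order(t, amplitude, k_obs):
    """Integrated first-order rate law: fraction of duplex unwound at time t."""
    return amplitude * (1.0 - math.exp(-k_obs * t))

def estimate_k_obs(times, fractions, amplitude):
    """Estimate k_obs as the slope of a least-squares line through the origin
    for -ln(1 - F/A) versus t. Assumes the amplitude A is known in advance."""
    num = 0.0
    den = 0.0
    for t, f in zip(times, fractions):
        y = -math.log(1.0 - f / amplitude)
        num += t * y
        den += t * t
    return num / den

# Synthetic time course in minutes; these are not measured data.
AMPLITUDE = 0.9   # hypothetical reaction amplitude
TRUE_K = 0.18     # min^-1, chosen for illustration
times = [0.5, 1, 2, 4, 8, 12]
fractions = [first_order(t, AMPLITUDE, TRUE_K) for t in times]

k_fit = estimate_k_obs(times, fractions, AMPLITUDE)
print(f"recovered k_obs = {k_fit:.3f} min^-1")
```

With real gel data the amplitude is usually a fitted parameter as well, so a nonlinear least-squares fit of both amplitude and rate constant would be the more appropriate choice.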

The clustering of somatic DDX41 mutations in the 3rd Arg of motif VI (R525H) and the correlation of a more severe phenotype with a mutation in this region suggest either that this site is particularly susceptible to mutation or that tumor development is, in part, driven by selection for a mutation at this location. We report only a moderately reduced biochemical activity of the R525H mutant, which does not entirely explain the strong enrichment for this mutation in patients. Also, some of the reported DDX41 R525H mutations in MDS/AML are heterozygous, and both the wild-type (WT) and mutant alleles are expressed. This raises the possibility that the p.R525H mutant exerts a dominant-negative effect over WT DDX41, which results in higher disease propensity. Thus, it is important to investigate whether the R525H mutant DDX41 is simply less biochemically active or whether it might be a dominant-negative mutant. To test this in vitro, the purified mutant and WT forms of DDX41 can be mixed in different ratios, and the biochemical activity can be examined using our established setup. The activity can then be compared to the activity expected from either WT or mutant DDX41 by itself. The activity of WT and R525H mutant DDX41 in pre-mRNA splicing could also be investigated via in vitro splicing assays.

As mentioned in Section 7.3, in eIF4A, the substitution of the 3rd Arg with either Gln (R365Q) or Lys (R365K) resulted in complete loss of unwinding activity. DDX41 appears to be an essential gene and, as also mentioned in Section 7.3, the modest defects in biochemical activity observed with DDX41R525H may be necessary to maintain a nonlethal level of functionality in the presence of a truncating germline allele. The residual activity of DDX41 seen with the Arg to His replacement suggests that the His residue can still form some or all of the hydrogen bonds with the triphosphate of ATP, but likely in a different configuration. This possibility can be examined by modeling the structure of DDX41 with a His (or any other residue) replacing the 3rd Arg residue.

Another possibility is that the R525H mutation is selected for (the 3rd Arg is always mutated to a His only) because substitution of this Arg with any other residue may result in inactivity. We could test this possibility by generating a variant of DDX41 in which the 3rd Arg is replaced by an Ala residue and examining its biochemical activity. In addition, we could make an analogous mutation (3rd Arg replaced by a His) in eIF4A, or other well-behaved DEAD-box helicases, and examine their biochemical activity.

Recent studies that analyzed cancers by amino acid substitution signatures found that Arg>His mutations are dominant in a subset of cancers 370-372. Increased intracellular pH (pHi) has become an established feature of most cancers 373. Lately, a notion has emerged that cancer-associated Arg>His mutations may confer a gain in pH sensing to mutant proteins not seen with WT protein, thus providing a fitness advantage at the increased pHi of cancer cells 371, 374. Arg>His mutations swap a positively charged amino acid for a titratable amino acid. Arginine, with a pKa of ~12, is always protonated, whereas histidine, with a pKa of ~6.5, can titrate within the narrow cellular pH range and exhibit a shift in population from the protonated to the neutral species at the higher pH of cancer cells. It was shown that increased pHi could work in concert with mutant proteins to enhance oncogenic signaling (increased activity of the EGFR-R776H substitution) and limit tumor suppression (decreased activity of the p53-R273H substitution) 375. These studies raise the possibility that the magnitude of the molecular defect caused by the R525H mutation in DDX41 may, in turn, depend on the intracellular pH.
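The titration argument can be made concrete with the Henderson-Hasselbalch relationship. The sketch below uses the approximate side-chain pKa values given in the text; the two intracellular pH values are illustrative assumptions and are not taken from the cited studies.

```python
def fraction_protonated(pH, pKa):
    """Henderson-Hasselbalch: fraction of a titratable group that is protonated."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))

# Approximate side-chain pKa values as given in the text.
PKA = {"Arg": 12.0, "His": 6.5}

# Assumption: illustrative intracellular pH values for normal and cancer cells.
PH = {"normal": 7.2, "cancer": 7.6}

for residue, pka in PKA.items():
    for label, ph in PH.items():
        frac = fraction_protonated(ph, pka)
        print(f"{residue} at pH {ph} ({label}): {100 * frac:.1f}% protonated")
```

Under these assumed pH values, arginine stays essentially fully protonated, whereas the protonated fraction of a histidine drops from roughly 17% to roughly 7%, which is the kind of shift the pH-sensing hypothesis relies on. The effective pKa of a histidine inside a folded protein can differ substantially from the free side-chain value.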

This leads to another possible scenario in which the R525H mutation has minimal consequences for the unwinding activity of DDX41 but instead decreases the stability of DDX41 and consequently renders it unavailable for its role in the spliceosome. To investigate this, circular dichroism (CD) spectroscopy and differential scanning calorimetry (DSC) can be used to determine the thermal stability of DDX41R525H versus WT DDX41 over a range of pH values.


Chapter 8

Material and Methods

8.1 Construction of DHX36 expression plasmids

The cDNA sequence of mouse DHX36 (GenBank reference BC138061, AA138062, residues 50-981) was PCR amplified using Phusion® Hot Start Flex DNA polymerase (New England Biolabs) and inserted into the pSMT3 vector (provided by Dr. Christopher Lima, Sloan-Kettering Institute) between the SalI/NotI restriction sites. The mutant constructs were also generated by the standard PCR cloning strategy. All constructs were verified by DNA sequencing. The primers used in this study were as follows, with restriction sites underlined:

1) WT, sense: 5’ GTTGTTGTCGACAAATCCCGCACACCTTAAGGGTCGCG 3’ and antisense: 5’ GTTGTTGCGGCCGCTCAAGTTTTGATCAAGTCTAGAATAGCTG 3’

2) ∆ β-HP, sense: 5’ GGAGGAGTTAGTAAAGCTAATGCCAAACAG 3’ and antisense: 5’ CTTTACTAACTCCTCCATCTATTACATAAACC 3’

3) short β-HP, sense: 5’ AAAGAAGGATCTGCTGAGTGGGTTAGTAAAGC 3’ and antisense: 5’ CAGCAGATCCTTCTTTAATTTTTCCTCCATCT 3’

4) ∆ DSM, sense: 5’ CGTCGACAAGCCAAGAAGCAGACGCAGAAGAAC 3’ and antisense: 5’ CTTGGCTTGTCGACGGAGCTCGAATTCGGATC 3’

5) ∆ OB-fold, sense: 5’ TCCCAAAACAGAAGTGTCACCATACTGCCTCC 3’ and antisense: 5’ CACTTCTGTTTTGGGATATAAACCAGCACAGAT 3’
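Because the underlining that marked the restriction sites does not survive in plain text, the sites can be located programmatically. This is a minimal sketch using the WT primer pair listed above and the standard SalI and NotI recognition sequences; the helper name is hypothetical.

```python
# Recognition sequences of the two enzymes used for cloning into pSMT3.
SITES = {"SalI": "GTCGAC", "NotI": "GCGGCCGC"}

# WT primer pair, written 5' to 3', copied from the list above.
WT_SENSE = "GTTGTTGTCGACAAATCCCGCACACCTTAAGGGTCGCG"
WT_ANTISENSE = "GTTGTTGCGGCCGCTCAAGTTTTGATCAAGTCTAGAATAGCTG"

def find_sites(primer):
    """Return {enzyme: 0-based start position} for each site present in the primer."""
    return {name: primer.find(seq) for name, seq in SITES.items() if seq in primer}

sense_sites = find_sites(WT_SENSE)
antisense_sites = find_sites(WT_ANTISENSE)
print("sense:", sense_sites)
print("antisense:", antisense_sites)
```

In both primers the site follows a six-nucleotide 5’ extension (GTTGTT), which is a common way to give the enzyme enough flanking sequence to cut near the end of a PCR product.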


8.2 DHX36 expression and purification

This was provided by Dr. Zhonghua Liu, CWRU, and Dr. Watchalee Chuenchor, NIH. Mouse -pSMT3 (46-982) and truncation mutants were expressed as N-terminal His6-SUMO fusion proteins in Escherichia coli BL21-CodonPlus (DE3)-RIPL cells (Agilent Technologies, Santa Clara, CA). The culture was grown at 37 °C in LB medium containing 50 μg/ml kanamycin until the absorbance at 600 nm reached 0.8-1.0. Protein expression was induced with 0.2% lactose (MilliporeSigma, Burlington, MA) or 0.2 mM IPTG at 18 °C for 48 h. The cells were harvested, resuspended in lysis buffer containing 50 mM Tris-HCl pH 8.0, 13% glycerol, 12 mM imidazole, and 300 mM NaCl, and stored at -80 °C until ready for use.

Frozen cells were thawed and disrupted by sonication in lysis buffer with the addition of 1 mM DTT, 1 mM PMSF, and protease inhibitor cocktail (complete, EDTA-free, Roche). The soluble protein was collected by centrifugation at 35,000 rpm for 30 min at 4 °C. The crude soluble protein was treated with 0.1% polyethyleneimine (Sigma) to eliminate endogenous nucleic acid contaminants. The nucleic acid pellet was then removed by centrifugation at 14,000 rpm for 20 min at 4 °C. The nucleic acid-free protein solution was passed through a Ni-column (HisPrep FF 16/10; GE Healthcare) equilibrated in 50 mM Tris-HCl, pH 8.0, 300 mM/1 M NaCl, 20% glycerol, 20 mM imidazole, 1 mM DTT. Protein fractions containing His6-SUMO fusion protein were incubated with Ulp1 protease (plasmid provided by Sloan-Kettering Institute) and dialyzed overnight at 4 °C against buffer containing 50 mM Tris-HCl, pH 8.0, 350 mM NaCl, 30% glycerol, 1 mM DTT. The proteolysis reaction was loaded onto a Ni-column to remove free His6-SUMO tag, uncleaved protein, and Ulp1 protease. The unbound fractions containing mDHX36 were dialyzed against a buffer containing 25 mM HEPES, pH 7.0, 20% glycerol, 1 mM DTT, and then loaded onto a Source 15S cation exchange column (GE Healthcare Biosciences, Pittsburgh, PA). mDHX36 protein was eluted by a step gradient of 0-1 M NaCl. Purified protein was precipitated by addition of 87% ammonium sulfate, resuspended in storage buffer containing 25 mM Tris-HCl, pH 7.5, 150 mM NaCl, 5% glycerol, 1 mM DTT, and loaded onto an S200 gel filtration column (Superdex200 HR 10/300 GL; GE Healthcare Biosciences, Pittsburgh, PA) pre-equilibrated with the same buffer. Purified mDHX36 was concentrated to 9-15 mg/ml and stored in 25 mM Tris-HCl, pH 7.5, 150 mM NaCl, 50% glycerol, and 1 mM DTT at -80 °C. The mutant proteins were expressed and purified in the same way as the native protein.

8.3 Crystallization, data collection, and structure determination

This was provided by Dr. Zhonghua Liu, CWRU. Crystals were grown by hanging drop vapor diffusion at 18 °C by mixing equal volumes of protein and crystallization reagent. mDHX36 in complex with 2 mM ADP was crystallized with 10% PEG 5000 MME, 0.1 M HEPES pH 7.0, 5% v/v Tacsimate. Crystals were cryoprotected with 25% (v/v) glycerol in the crystallization condition. X-ray diffraction data were collected at the SER-CAT 23IDD and IDB beamlines at the Advanced Photon Source (APS, Argonne, IL, U.S.A.). All data were processed and scaled with HKL2000 376 and XDS 377. The mDHX36 structure was determined using a multi-wavelength anomalous diffraction (MAD) dataset collected from crystals obtained with Se-Met protein samples, which will be detailed in a separate manuscript. The structure was built with the program COOT 378 and refined with PHENIX 379. The crystal structures were validated by the MolProbity server 380.


8.4 Preparation of radiolabeled RNA substrates for DHX36 remodeling reactions

RNA and DNA oligonucleotides used in this study were purchased from Sigma and Dharmacon. Sequences of oligonucleotides used in this study are as follows (G-tetrads and duplex region underlined):

GQ4: 5’ UUAGGGGGAAAAAAAAAAAAAAA 3’

GQI: 5’ UUAGGGGGAGGGGGAUGGGGGAGGGGGAAAAAAAAAAAAAAA 3’

R16 (top strand): 5’ AGCACCGUAAAGACGC 3’

R31allA (bottom strand with a 3’ end overhang containing 15 adenylates): 5’ GCGUCUUUACGGUGCUAAAAAAAAAAAAAAA 3’

R31-3’ (bottom strand with a 15 nt 3’ end overhang containing a different sequence): 5’ GCGUCUUUACGGUGCUUAAAACAAAACAAAA 3’

R41-3’ (bottom strand with a 25 nt 3’ end overhang): 5’ GCGUCUUUACGGUGCUUAAAACAAAACAAAACAAAACAAAA 3’

R24-3’allA (bottom strand with a 3’ end overhang containing 8 adenylates): 5’ GCGUCUUUACGGUGCUAAAAAAAA 3’

R22-3’allA (bottom strand with a 3’ end overhang containing 6 adenylates): 5’ GCGUCUUUACGGUGCUAAAAAA 3’

DNA TRAP used for GQI remodeling assays: 5’ TCCCCCATCCCCCTCCCCC 3’
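The relationship between the top strand and the bottom strands can be checked directly from the sequences. The sketch below computes the reverse complement of R16 and tests whether each bottom strand begins with it, which also gives the length of the 3’ overhang. The helper name is illustrative; only Watson-Crick pairs are considered.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def reverse_complement(rna):
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]

# Sequences copied from the list above (5' to 3').
R16 = "AGCACCGUAAAGACGC"
BOTTOM_STRANDS = {
    "R31allA": "GCGUCUUUACGGUGCUAAAAAAAAAAAAAAA",
    "R31-3'": "GCGUCUUUACGGUGCUUAAAACAAAACAAAA",
    "R41-3'": "GCGUCUUUACGGUGCUUAAAACAAAACAAAACAAAACAAAA",
    "R24-3'allA": "GCGUCUUUACGGUGCUAAAAAAAA",
    "R22-3'allA": "GCGUCUUUACGGUGCUAAAAAA",
}

# The 5' portion of each bottom strand should pair with the full top strand.
duplex_region = reverse_complement(R16)
for name, seq in BOTTOM_STRANDS.items():
    pairs = seq.startswith(duplex_region)
    overhang = len(seq) - len(duplex_region)
    print(f"{name}: pairs with R16 = {pairs}, 3' overhang = {overhang} nt")
```

Every bottom strand shares the same 16-nt duplex-forming region, so the substrates differ only in the length and sequence of the single-stranded 3’ overhang.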

The 5’-end of the RNA (GQ4, GQI, and R16) was radiolabeled using T4 polynucleotide kinase (NEB) followed by purification on denaturing PAGE. The duplex substrate was generated by annealing the radiolabeled top strand (R16) to its corresponding bottom strand (R31) in 10 mM MOPS, pH 6.5, 50 mM KCl, and 1 mM EDTA, followed by purification on non-denaturing PAGE. The intermolecular and intramolecular GQ substrates were formed by adding the respective radiolabeled oligos to a buffer containing 10 mM MOPS, pH 6.5, 50 mM KCl, and 1 mM EDTA. The solution was heated to 98 °C for 10 min and slowly cooled to 0 °C overnight. Following this, the structured substrates were purified on non-denaturing PAGE. The annealed radiolabeled GQ and duplex substrates were gel eluted and stored in a buffer containing 10 mM MOPS, pH 7, 50 mM KCl, and 0.1 mM MgCl2.

8.5 DHX36 remodeling reactions

Reactions were performed at 30 °C in a temperature-controlled heat block in a buffer with 40 mM Tris-HCl (pH 8.0), 100 mM KCl, 0.5 mM MgCl2, 6% glycerol (vol/vol), 0.01% IGEPAL, 2 mM DTT, 0.3 U/µl RNase Inhibitor (Roche). Prior to the reaction, radiolabeled RNA substrate (0.5 nM final concentration) was incubated with the indicated concentration of mDHX36 (WT or mutants). Reactions were initiated by addition of equimolar NTP and MgCl2 (2 mM final concentration, unless otherwise stated). For remodeling reactions with GQI, a 200-fold excess of DNA trap (100 nM) was added at reaction start. At the times indicated, aliquots were removed, and the reaction was stopped by addition of an equal volume of a buffer containing 1% SDS, 50 mM EDTA, 10% glycerol, 0.05% (w/v) xylene cyanol and 0.05% (w/v) bromophenol blue. Samples were then applied to a 15% non-denaturing PAGE, and the structured and single-stranded RNAs were separated by electrophoresis at 15 V/cm. Gels were dried. Bands were visualized on a Typhoon Phosphorimager (GE Healthcare) and quantified using ImageQuant 5.2 software (Molecular Dynamics). The fractions of single-stranded and structured RNA were determined from the relative amounts of radioactivity in the respective bands. Observed remodeling rate constants (kobs) for each single-stranded species were determined from time courses as described, considering both unwinding and annealing reactions 70.


The reactions with ATP analog (ADP-BeFx) were conducted in the presence of hexokinase (0.2 units/µl; Roche) and 1 mM D-glucose (substrate for hexokinase) to remove traces of ATP from ADP preparations 71. Similarly, reactions with intramolecular GQ1 and ATP analog (ADPNP) were conducted in the presence of hexokinase (0.2 units/µl; Roche) and 1 mM D-glucose, but reactions with the duplex and intermolecular GQ4 did not contain hexokinase.

8.6 DHX36-RNA equilibrium binding reactions

Binding reactions were performed at 30 °C in a temperature-controlled heat block in a buffer with 40 mM Tris-HCl (pH 8.0), 100 mM KCl, 0.5 mM MgCl2, 6% glycerol (vol/vol), 0.01% IGEPAL, 2 mM DTT, 0.3 U/µl RNase Inhibitor (Roche), 0.5 nM radiolabeled RNA substrate, and the indicated concentration of mDHX36 (WT or mutants), and incubated for 30 min 381. Samples were then applied to a 4% non-denaturing PAGE, and the mDHX36-RNA complexes and structured RNAs were separated by electrophoresis at 4 °C. Gels were dried, and bands were visualized on a Typhoon Phosphorimager (GE Healthcare).

8.7 DHX36 ATPase reactions

ATPase measurements were performed at 30 °C in a buffer containing 40 mM Tris-HCl (pH 8.0), 100 mM KCl, 0.5 mM MgCl2, 6% glycerol (vol/vol), 0.01% IGEPAL, 2 mM DTT, 0.3 U/µl RNase Inhibitor (Roche) and the indicated concentration of mDHX36 (WT or mutants). mDHX36 was preincubated with 2 µM RNA substrate for 5 min before reaction start. Reactions were initiated by addition of a mixture of trace amounts of [γ-32P]-ATP and 0.5 mM ATP. All reactions contained equimolar ATP and MgCl2 with 0.5 mM MgCl2 excess. At various time points, 1 µl aliquots were removed and applied to a PEI-cellulose thin-layer chromatography plate (20 cm × 20 cm; Selecto Scientific). Hydrolysis of [γ-32P]-ATP was monitored by TLC, as described 381. The PEI plate was developed with 0.5 M LiCl and 1.5 M formic acid and subsequently dried. Radioactivity was quantified with a Phosphorimager (GE) and the ImageQuant software (Molecular Dynamics). Initial rates of ATP hydrolysis were determined by linear least-squares fit to the initial phase of the reaction.
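The initial-rate determination described above amounts to an ordinary least-squares line through the early, linear part of the time course. The sketch below illustrates this on synthetic data; the sampling times and hydrolyzed fractions are hypothetical, and only the 0.5 mM total ATP concentration is taken from the protocol.

```python
def linear_fit(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

ATP_TOTAL_UM = 500.0  # 0.5 mM ATP, as in the assay

# Hypothetical sampling times (min) and synthetic fractions of ATP hydrolyzed,
# as would be obtained from the ratio of product to total signal on the TLC plate.
time_min = [0, 2, 4, 6, 8, 10]
fraction_hydrolyzed = [0.00, 0.01, 0.02, 0.03, 0.04, 0.05]

# Convert fractions to product concentration, then fit the linear phase.
product_uM = [f * ATP_TOTAL_UM for f in fraction_hydrolyzed]
v0, intercept = linear_fit(time_min, product_uM)
print(f"initial rate = {v0:.2f} uM/min")
```

In practice only points from the phase in which a small fraction of the ATP has been consumed should enter the fit, since product accumulation and substrate depletion bend the curve at later times.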

8.8 Circular Dichroism spectroscopy

RNA GQs were prepared in buffers containing 10 mM MOPS (pH 6.5) in the presence of 50 mM KCl at a concentration of 4 µM (GQ4) and 1 µM (GQ1), annealed by heating to 95 °C and then cooling slowly to room temperature. CD of RNA oligonucleotides was determined at 30 °C by an Applied Photophysics PiStar 180 spectropolarimeter equipped with a temperature controller. An average of three CD spectra ranging from 220 to 340 nm was recorded in a 10 mm path length cuvette at a scan rate of 1 nm/sec with a 1 sec response time, 1 nm bandwidth, and continuous scan mode.

CD spectra for wild-type and mutant mDHX36 variants were recorded at 30 °C in a Jasco J-815 spectrophotometer in a quartz cuvette with a path length of 1 mm. Protein stocks were diluted into a 10-fold excess of 25 mM NaH2PO4, pH 7.5, 150 mM NaF for a final volume of 200 µl and a final protein concentration of 0.05 mg/ml. For each construct, three independent CD spectra (195-260 nm) were recorded in a 1 mm path length cuvette (scan rate: 50 nm/min, response time: 2 sec, bandwidth: 1 nm, continuous scan mode). The average of the spectra was then computed. The molar ellipticity was calculated from the observed ellipticity values and protein concentrations and plotted against the standard wavelengths from 195 to 260 nm.
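The conversion from observed to molar ellipticity follows the standard relation [θ] = θ / (10 · c · l), with θ in mdeg, c in mol/l and l in cm. The sketch below applies it using the protein concentration and path length stated above; the observed ellipticity and the molecular weight are hypothetical placeholders, since neither is given in this section.

```python
def molar_ellipticity(theta_mdeg, conc_mg_per_ml, mol_weight_g_per_mol, path_cm):
    """Molar ellipticity in deg cm^2 dmol^-1 from an observed ellipticity in mdeg."""
    # mg/ml equals g/l, so dividing by g/mol gives mol/l.
    molar_conc = conc_mg_per_ml / mol_weight_g_per_mol
    return theta_mdeg / (10.0 * molar_conc * path_cm)

CONC_MG_PER_ML = 0.05      # final protein concentration from the protocol
PATH_CM = 0.1              # 1 mm path length cuvette from the protocol
MOL_WEIGHT = 100_000.0     # hypothetical molecular weight in g/mol
THETA_MDEG = -10.0         # hypothetical observed ellipticity at one wavelength

value = molar_ellipticity(THETA_MDEG, CONC_MG_PER_ML, MOL_WEIGHT, PATH_CM)
print(f"molar ellipticity = {value:.3e} deg cm^2 dmol^-1")
```

Dividing the result by the number of peptide bonds would give the mean residue ellipticity, which is the quantity more often used when comparing proteins of different length.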


8.9 Accession Codes

Coordinates and structure factors of the mouse DHX36-ADP complex (PDB ID: 6UP4) have been deposited in the Protein Data Bank.

8.10 Protein-protein crosslinking

Purified recombinant DHX36 (600 nM) was incubated with and without 60 nM of a 16 bp duplex with a 3’ 25 nt ssRNA and 2 mM ADPNP/Mg2+ (40 mM Tris (pH 8.0), 50 mM NaCl, 8.3% (v/v) glycerol, 0.01% (w/v) IGEPAL CA 630, 2 mM DTT, 0.6 U/µL RNasin (Roche), and 0.5 mM MgCl2) for 5 min at 30 °C. Glutaraldehyde (0.02% (v/v)) was added for 30 min at room temperature. Reactions were quenched with 0.5 mM Tris-Glycine (pH 6.8). Samples were diluted (100 mM Tris, pH 6.8, 24% glycerol, 8% SDS, 0.02% Coomassie Blue R250, 0.2 mM DTT) and resolved on an 8% Tricine-SDS-PAGE gel. Gels were analyzed by Coomassie staining.

8.11 Kinetic Simulations of DHX36 remodeling reactions

Modeling of RNA duplex remodeling inhibition with ATP and DHX36 was performed with Kintek Global Explorer (Kintek, Austin, TX) 335 using data from unwinding assays. A detailed description of the data used, modeling strategy, data fits, and assessment of model quality is provided in Chapter 5.

8.12 Construction of DDX41 expression plasmids

To generate plasmid pET22b-His6-TEV-DDX41, the entire human DDX41 open reading frame was amplified from pCMV6-XL5 (Origene) by Platinum Pfx polymerase (Thermo Fisher) using primers:

5’ ATACATATGCACCACCACCACCACCACGGAAGCGGAAGCGAGAATCTGTACTTTCAATCAGAGGAGTCGGAACCCGAACGG 3’

(bold letters indicate restriction site, italics represent His6-coding region and underlined region represents TEV cleavage site) and

5’ GCTAAGCTTTCAGAAGTCCATGGAGCTGTG 3’

and cloned into the pET22b(+) expression vector (Novagen) after digestion of PCR product and vector with HindIII and NdeI. This created an N-terminal His6-tagged DDX41 with a TEV cleavage site. The resulting plasmid was isolated from DH5-alpha. Sequence integrity was verified by DNA sequence analysis. The R525H mutation was introduced into pET22b-His6-TEV-DDX41 by PCR amplification using the primers:

5’ CGCACCGGGCACTCGGGAAACACAGGC 3’ and

5’ GCCTGTGTTTCCCGAGTGCCCGGTGCG 3’.

The pET22b-His6-TEV-DDX41R525H plasmid was isolated from DH5-alpha, and sequence integrity and presence of the mutation were verified by DNA sequence analysis.
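Because the typographic highlighting of the primer features does not survive in plain text, the features can be located computationally. The sketch below is only an illustrative check of the primer sequences given above: it assumes the standard genetic code, the recognition sites CATATG (NdeI) and AAGCTT (HindIII), and that the mutagenic primer is read in frame from its first base. All function and variable names are hypothetical.

```python
# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons of a DNA sequence, starting at its first base."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


FORWARD = ("ATACATATGCACCACCACCACCACCACGGAAGCGGAAGCGAGAATCTGTAC"
           "TTTCAATCAGAGGAGTCGGAACCCGAACGG")
REVERSE = "GCTAAGCTTTCAGAAGTCCATGGAGCTGTG"
MUT_FWD = "CGCACCGGGCACTCGGGAAACACAGGC"
MUT_REV = "GCCTGTGTTTCCCGAGTGCCCGGTGCG"
NDEI, HINDIII = "CATATG", "AAGCTT"

# The NdeI site contains the start codon; translate from that ATG onward.
start = FORWARD.index("ATG")
leader = translate(FORWARD[start:])

# The reverse primer anneals to the coding strand, so its reverse complement
# reads in the coding direction: last codons, stop codon, then the HindIII site.
c_terminus = translate(reverse_complement(REVERSE))

print(leader)      # His6 tag, linker, TEV site (ENLYFQS), first DDX41 residues
print(c_terminus)  # '*' marks the stop codon
print(translate(MUT_FWD), reverse_complement(MUT_FWD) == MUT_REV)
```

The same helpers could be pointed at the sequencing result of the finished plasmid to confirm that tag, cleavage site and mutation are in frame.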

8.13 DDX41 expression and purification

Full-length DDX41 (WT and R525H) with an N-terminal TEV-cleavable His6-tag was sub-cloned into a pFastBac1 vector using XbaI and HindIII sites. The resulting plasmid, pFastBac1-His6-TEV-DDX41, was isolated from DH5-alpha. Sequence integrity was verified by DNA sequence analysis. pFastBac1-His6-TEV-DDX41 was used to transform the DH10Bac strain (Thermo Fisher) for the production of recombinant bacmid, and DDX41 was expressed in Sf9 insect cells using the recombinant baculovirus expression system, as described previously 382. Cells were pelleted 24–72 h post-infection and lysed via sonication in lysis buffer containing 50 mM NaH2PO4, pH 6.0, 300 mM NaCl, 5 mM imidazole, 1% (v/v) IGEPAL, 0.4 mM phenylmethanesulfonyl fluoride, 6 µg/ml RNase A, 0.5 mM NaNO3, 1 mM NaF and 1 complete ULTRA protease inhibitor cocktail tablet (Roche Diagnostics). Cellular debris was pelleted by ultracentrifugation at 36,000 rpm for 45 min at 4° C. The supernatant was loaded onto pre-charged nickel resin (Affymetrix-USB). DDX41 was eluted with a buffer containing 250 mM imidazole. The His6-tag was cleaved using TEV protease, and DDX41 was further purified by adsorption to phosphocellulose resin (P11, Whatman) and eluted with a NaCl gradient (100-500 mM). Protein fractions were pooled, concentrated, flash-frozen and stored at −80° C. The protein concentration was determined by Coomassie staining using BSA as a standard.

8.14 RNA substrate preparation for DDX41 unwinding reactions

RNA oligonucleotides were purchased from Sigma and Dharmacon. Sequences of oligonucleotides were as follows (duplex region underlined).

R16 (top strand): 5’ AGCACCGUAAAGACGC 3’

R41 (bottom strand with 3’ end overhang): 5’ GCGUCUUUACGGUGCUUAAAACAAAACAAAACAAAACAAAA 3’

R13 (top strand): 5’ AGCACCGUAAAGC 3’

R38 (bottom strand with 3’ end overhang): 5’ GCUUUACGGUGCUUAAAACAAAACAAAACAAAACAAAA 3’
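Since the underlining that marked the duplex region is lost in plain text, the duplex and overhang lengths can be recovered from the sequences themselves. The following sketch assumes Watson-Crick pairing only and that each top strand pairs with the 5’ end of its bottom strand; the helper names are hypothetical.

```python
def rna_reverse_complement(seq):
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return seq.translate(str.maketrans("ACGU", "UGCA"))[::-1]


def duplex_and_overhang(top, bottom):
    """Return (duplex length in bp, 3' overhang length in nt).

    Assumes the top strand is fully complementary to the 5' end of the
    bottom strand; raises ValueError if that is not the case.
    """
    footprint = rna_reverse_complement(top)
    if not bottom.startswith(footprint):
        raise ValueError("top strand does not pair with the 5' end of the bottom strand")
    return len(top), len(bottom) - len(top)


R16 = "AGCACCGUAAAGACGC"
R41 = "GCGUCUUUACGGUGCUUAAAACAAAACAAAACAAAACAAAA"
R13 = "AGCACCGUAAAGC"
R38 = "GCUUUACGGUGCUUAAAACAAAACAAAACAAAACAAAA"

for name, (top, bottom) in {"R16/R41": (R16, R41), "R13/R38": (R13, R38)}.items():
    bp, overhang = duplex_and_overhang(top, bottom)
    print(f"{name}: {bp} bp duplex, {overhang} nt 3' overhang")
```

For both substrates the paired region is the first 16 or 13 nucleotides of the bottom strand, followed by the same 25 nt single-stranded 3’ overhang.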

For data shown in figures 7.2-7.4, the 5’-end of the RNA (top strand) was radiolabeled using T4 polynucleotide kinase (NEB) followed by purification on denaturing PAGE. The duplex substrate was generated by annealing the radiolabeled top strand (R16 or R13) to its corresponding bottom strand (R41 or R38) in 10 mM MOPS, pH 6.5, 50 mM KCl and 1 mM EDTA, followed by purification on non-denaturing PAGE.

For data shown in figures 7.5-7.11, the 3’-end of the bottom-strand RNA (R41 or R38) was radiolabeled as follows: T4 polynucleotide kinase (NEB) was used to phosphorylate cytidine-3′-monophosphate (Cp) with [γ-32P]-ATP at 37° C for 4 h. The kinase was inactivated at 98° C for 2 min. T4 RNA ligase (Thermo Fisher) was then used to covalently join [5′-32P]pCp to the free 3′ hydroxyl of the RNA by incubation at 4° C overnight. Unincorporated [5′-32P]pCp was then removed by gel filtration (Bio-Rad P-6 spin column), and the labeled RNA was further purified on denaturing PAGE. Duplex substrates were generated by combining the radiolabeled bottom strand with a molar excess of its complementary strand in 10 mM MOPS, pH 6.5, 50 mM KCl and 1 mM EDTA, followed by purification of the duplex on non-denaturing PAGE.

8.15 Pre-steady-state nuclease reactions

Degradation reactions were performed in 30 μl at 30° C in a temperature-controlled aluminum block in buffer containing 40 mM Tris-Cl, pH 8, 50 mM NaCl, 0.5 mM MgCl2, 5% glycerol, 0.01% Nonidet P-40, 2 mM DTT. To minimize the impact of condensation on the tube lids, reactions were spun down in a minicentrifuge every 3 min. Before the start of the reaction, DDX41 (at indicated concentrations) was incubated with or without ATP-Mg2+ (at indicated concentration) in the reaction buffer for 4 min. Pre-steady-state reactions were started by addition of radiolabeled NA substrate to a final concentration of [RNA*] = 0.5 nM. Aliquots were removed from the reaction at times indicated. The reaction was stopped by addition to an equal volume of a buffer containing 80% formamide, 0.1% xylene cyanol, and 0.1% bromophenol blue.

Samples were denatured at 95° C for 2 min and applied to denaturing PAGE (20% acrylamide:bis-acrylamide at 29:1, 7 M urea, 1× TBE). Prior to sample loading, the gels were pre-run in 1× TBE for 25 min at 15 W. After sample loading, gels were run at 15 W for ∼1 h. Following this, gels were dried and exposed overnight to a PhosphorImager cassette (Amersham Biosciences). Individual bands were visualized on a Typhoon 9400 PhosphorImager (Amersham Biosciences).

8.16 DDX41 ATPase reactions

ATPase measurements were performed at 30° C in a buffer containing 40 mM Tris-HCl (pH 8.0), 50 mM NaCl, 6% glycerol (vol/vol), 0.01% IGEPAL, 2 mM DTT, 0.3 U/µl RNase Inhibitor (Roche) and the indicated concentration of DDX41 (WT or mutants). DDX41 was pre-incubated with 1 µM RNA for 5 min before the reaction start. Reactions were initiated by the addition of a mixture of trace amounts of [γ-32P]-ATP and 0.5 mM ATP. All reactions contained equimolar ATP and MgCl2 with 0.5 mM MgCl2 excess. At various time points, 1 µl aliquots were removed and applied to a PEI-cellulose thin-layer chromatography plate (20 cm × 20 cm; Selecto Scientific). Hydrolysis of [γ-32P]-ATP was monitored by TLC, as described 381. The PEI plate was developed with 0.5 M LiCl and 1.5 M formic acid and subsequently dried. Radioactivity was quantified with a Phosphorimager (GE) and the ImageQuant software (Molecular Dynamics). Initial rates of ATP hydrolysis were determined by linear least-squares fit to the initial phase of the reaction.
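The initial-rate determination can be illustrated with a short sketch. It assumes that the quantified TLC signal has already been converted to the fraction of ATP hydrolyzed, and it restricts the fit to points below a cutoff fraction; the cutoff of 0.2 and the example time course are illustrative assumptions, not values taken from this work.

```python
def linear_fit(x, y):
    """Ordinary least-squares line through (x, y); returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def initial_rate(times_min, fraction_hydrolyzed, atp_total_uM, max_fraction=0.2):
    """Initial ATP hydrolysis rate in µM/min.

    Only points with fraction_hydrolyzed <= max_fraction are treated as the
    initial (approximately linear) phase; the cutoff is an assumed convention.
    """
    points = [(t, f) for t, f in zip(times_min, fraction_hydrolyzed) if f <= max_fraction]
    if len(points) < 2:
        raise ValueError("need at least two points in the initial phase")
    slope, _ = linear_fit([t for t, _ in points], [f for _, f in points])
    return slope * atp_total_uM


# Made-up time course at 0.5 mM (500 µM) ATP.
times = [0.0, 2.0, 4.0, 6.0, 8.0]
fractions = [0.00, 0.02, 0.04, 0.06, 0.08]
rate = initial_rate(times, fractions, atp_total_uM=500.0)
```

Dividing the rate by the enzyme concentration would give an apparent turnover number, provided the enzyme is saturated with RNA.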

8.17 DDX41 Unwinding reactions

Reactions were performed at 30° C in a temperature-controlled heat block in a buffer with 40 mM MOPS (pH 6.5), 100 mM NaCl, 0.5 mM MgCl2, 5% glycerol (v/v), 0.01% (v/v) NP-40, 2 mM DTT, 0.3 U/µl RNase Inhibitor (Roche). Equimolar ATP and MgCl2 were incubated with the indicated concentration of DDX41 (WT or R525H) for 4 min before the start of the reaction. For reactions performed under strand exchange conditions, a molar excess of the corresponding unlabeled top strand oligo (at indicated concentration) was added to the reaction mixture prior to reaction start. Reactions were started by the addition of the radiolabeled substrate. At times indicated, aliquots were removed, and the reaction was stopped by addition of an equal volume of a buffer containing 0.5% SDS, 25 mM EDTA, 10% glycerol, 0.05% (w/v) xylene cyanol and 0.05% (w/v) bromophenol blue. Samples were then applied to a 15% nondenaturing PAGE gel; the duplex and single-stranded RNAs were separated by electrophoresis at 15 V/cm. Gels were dried, and bands were visualized on a Typhoon Phosphorimager (GE) and quantified using ImageQuant 5.2 software (Molecular Dynamics).

8.18 Data analysis for DDX41 unwinding reactions

The fraction of single-stranded and duplex RNA was determined from the relative amounts of radioactivity in the respective bands. Reactions at low enzyme and ATP concentrations, especially for DDX41R525H, required reaction times over 120 min. During this time, a moderate amount of RNA degradation was observed. To obtain accurate unwinding rate constants, we explicitly considered degradation of both the substrate and the unwound product in the quantification, by applying the following kinetic model:

S → P (unwinding reaction, kobs; S: substrate, P: product)

S → X (degradation of the substrate, k2; X: degraded RNA)

P → X (degradation of the product, k3)

Corresponding rate constants were obtained by global data fit using Kintek Global Explorer 335, 383. To estimate initial values for kobs, k2 and k3 and to estimate the standard deviation in the data, time courses were fit to a biphasic exponential function:

$$\mathrm{frac}[\mathrm{species}] = a_1 e^{-b_1 t} + a_2 e^{-b_2 t} + c$$

(a1, a2 are the amplitudes for each phase; b1, b2 are observed rate constants for each phase; c is the offset). Initial values for kobs, k2, and k3 were further optimized with the dynamic simulation feature of Kintek Explorer, which allows variation of rate constants with continuous simulation 383. Obtained data were then used in several iterative rounds of experimental fitting until parameters no longer changed and the minimal χ2 value was obtained. More than 150 individual data points were used to determine the three observed rate constants at varying enzyme and ATP concentrations.
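The three-step scheme is linear and has a closed-form solution, which also shows why the product time course is a sum of two exponentials. The sketch below evaluates that solution for S(0) = 1 and P(0) = X(0) = 0. It is a minimal illustration under these assumed starting conditions and is not a substitute for the global fitting performed in Kintek Explorer; the rate constants in the example are arbitrary.

```python
import math


def species_fractions(t, k_obs, k2, k3):
    """Fractions of substrate S, product P and degraded RNA X at time t.

    Scheme: S -> P (k_obs), S -> X (k2), P -> X (k3), starting from S = 1.
      dS/dt = -(k_obs + k2) * S
      dP/dt =  k_obs * S - k3 * P
    """
    k_s = k_obs + k2  # total rate constant for loss of substrate
    s = math.exp(-k_s * t)
    if math.isclose(k_s, k3):
        # Degenerate case: both exponentials share the same rate constant.
        p = k_obs * t * math.exp(-k_s * t)
    else:
        p = k_obs / (k_s - k3) * (math.exp(-k3 * t) - math.exp(-k_s * t))
    return s, p, 1.0 - s - p


# Arbitrary example values (min^-1): slow unwinding with slower degradation.
for t in (0, 30, 60, 120):
    s, p, x = species_fractions(t, k_obs=0.05, k2=0.002, k3=0.004)
    print(f"t = {t:3d} min  S = {s:.3f}  P = {p:.3f}  X = {x:.3f}")
```

In this solution the two observed rate constants of the biphasic function correspond to kobs + k2 and k3, which is one way to obtain starting estimates for the global fit.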

Functional affinities for DDX41 and DDX41R525H to RNA were calculated by fitting the data to the functional binding isotherm:

$$k_{obs} = k_{unw}^{max} \cdot [\mathrm{DDX41}] \cdot \left(K_{1/2}^{RNA} + [\mathrm{DDX41}]\right)^{-1}$$

($k_{obs}$: observed unwinding rate constant; $k_{unw}^{max}$: unwinding rate constant at enzyme saturation; $K_{1/2}^{RNA}$: apparent functional binding constant of protein to RNA). Functional affinities for DDX41 and DDX41R525H to ATP were calculated by fitting the data to the functional binding isotherm:

$$k_{obs} = k_{unw}^{max} \cdot [\mathrm{ATP}] \cdot \left(K_{1/2}^{ATP} + [\mathrm{ATP}]\right)^{-1}$$

($k_{obs}$: observed unwinding rate constant; $k_{unw}^{max}$: unwinding rate constant at enzyme saturation; $K_{1/2}^{ATP}$: apparent functional binding constant of protein to ATP).
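Both isotherms share the same hyperbolic form, so a single function covers the enzyme and ATP titrations. The sketch below evaluates the isotherm and recovers the two parameters from noise-free synthetic data by a Hanes-Woolf linearization ([L]/kobs plotted against [L]). The linearization is an assumed stand-in chosen to keep the example dependency-free; the fitting procedure actually used is not specified here, and nonlinear regression on the untransformed data is generally preferable for noisy measurements. All numerical values are made up.

```python
def isotherm(conc, k_max, k_half):
    """k_obs = k_max * [L] / (K_1/2 + [L]) for a titrated ligand L (enzyme or ATP)."""
    return k_max * conc / (k_half + conc)


def fit_isotherm(concs, k_obs_values):
    """Estimate (k_max, K_1/2) via the Hanes-Woolf form [L]/k_obs = K/k_max + [L]/k_max."""
    y = [c / k for c, k in zip(concs, k_obs_values)]
    n = len(concs)
    mean_x = sum(concs) / n
    mean_y = sum(y) / n
    sxx = sum((c - mean_x) ** 2 for c in concs)
    sxy = sum((c - mean_x) * (yi - mean_y) for c, yi in zip(concs, y))
    slope = sxy / sxx                      # equals 1 / k_max
    intercept = mean_y - slope * mean_x    # equals K_1/2 / k_max
    return 1.0 / slope, intercept / slope


# Synthetic titration: k_max = 0.5 min^-1, K_1/2 = 200 nM (made-up values).
concentrations_nM = [50.0, 100.0, 200.0, 400.0, 800.0]
rates = [isotherm(c, 0.5, 200.0) for c in concentrations_nM]
fitted_k_max, fitted_k_half = fit_isotherm(concentrations_nM, rates)
```

At a concentration equal to K1/2 the isotherm returns half of the maximal rate constant, which gives a quick sanity check on fitted values.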

Appendix 1

Table A1. X-ray Crystallographic Data Collection and Refinement Statistics

Dataset                                    mouseDHX36-ADP complex

Data collection
  Space group                              P2₁2₁2₁
  Unit cell (Å, °)                         a = 66.2 Å, b = 115.9 Å, c = 132.6 Å
  Wavelength (Å)                           0.980
  Resolution range (Å)                     50-2.40 (2.46-2.40)
  No. of reflections (total/unique)        303342/77042
  Completeness (%)                         100 (100)
  Multiplicity                             3.9 (3.8)
  Rmeas (%) [a]                            12.0 (152.0)
  I/σI                                     10.8 (1.6)
  CC1/2 [b]                                1.00 (0.47)

Refinement
  Refinement resolution (Å)                46.83-2.40
  Rwork / Rfree (%) [c]                    18.61/23.57
  No. atoms
    Protein                                6546
    ANP/ADP                                27
    Hetero-atoms & solvent                 146
  Mean B-factor (Å²)
    Protein                                60.3
    ANP/ADP                                67.3
    Hetero-atoms & solvent                 54.2
  RMSD from ideality
    Bond lengths (Å)                       0.009
    Bond angles (°)                        1.006

Validation
  Ramachandran plot
    Favored/Outliers (%) [d]               97.9/0
  MolProbity percentile [e]                100th
  PDB code                                 6UP4

[a] $R_{meas} = \sum_h \{N_h/[N_h - 1]\}^{1/2} \sum_i |I_i(h) - \langle I(h) \rangle| \,/\, \sum_h \sum_i I_i(h)$, where $I_i(h)$ and $\langle I(h) \rangle$ are the ith and mean measurement of the intensity of reflection h, and $N_h$ is the multiplicity.

[b] $CC_{1/2} = \sum (x - \langle x \rangle)(y - \langle y \rangle) \,/\, [\sum (x - \langle x \rangle)^2 \sum (y - \langle y \rangle)^2]^{1/2}$, where x and y are randomly split half datasets. This is the Pearson’s correlation coefficient of randomly split half datasets 384.

[c] $R_{work} = \sum_h ||F_{obs}(h)| - |F_{calc}(h)|| \,/\, \sum_h |F_{obs}(h)|$, where $F_{obs}(h)$ and $F_{calc}(h)$ are the observed and calculated structure factors, respectively; Rfree is the R value obtained for a test set of reflections consisting of a randomly selected 5% subset of the data set excluded from refinement.

[d] Values from the MolProbity server (http://molprobity.biochem.duke.edu/).

[e] 100th percentile is the best among structures of comparable resolution. MolProbity score combines the clashscore, rotamer, and Ramachandran evaluations into a single score, normalized to be on the same scale as X-ray resolution.

Table A2. Apparent unwinding rate constants (kobs) obtained with the ‘afit’ function of the Kintek simulation. For each model, the simulated data points were fit to a first-order exponential using the analytical fitting (afit) function of the Kintek program, yielding kobs values for increasing [ATP]. These values are listed here along with the values obtained from the experiment.

Appendix 2

Contributions

Chapter 3 has been published as:

Srinivasan S†, Liu Z†, Chuenchor W, Xiao T, Jankowsky E. “Function of the Auxiliary domains of DEAH/RHA Helicase DHX36 in RNA Remodeling.” Journal of Molecular Biology, (2020). DOI: 10.1016/j.jmb.2020.02.005

Permission to use the content has been obtained from Elsevier.

† indicates that the authors contributed equally. Dr. Zhonghua Liu and Dr. Watchalee Chuenchor, from the laboratory of Dr. Tsan Sam Xiao (CWRU), performed the initial expression of mDHX36 and the crystal structure determination of the mDHX36-ADP complex. WT mDHX36 expression and purification were subsequently adapted by Dr. Zhonghua Liu and me. The mutant forms of mDHX36 were expressed and purified by Dr. Zhonghua Liu.

Chapter 7, section 7.2.5, is included in the following manuscript:

Hiznay J, Hershberger C, Srinivasan S, Kerr C, Ademà V, Moyer D, Nagata Y, Daniels N, DiPasquale W, Jankowsky E, Maciejewski J, Padgett R. “Molecular Functions of Myelodysplastic Syndrome–Implicated Splicing Factor DDX41.” Manuscript submitted to Journal of Clinical Investigation, (2020).

Dr. Mengyuan Xu, from the laboratory of Dr. Derek Taylor, performed the infection of Sf9 insect cells with recombinant bacmid, using the baculovirus expression system. Section 6.4 of this thesis describes the findings of the other authors listed above.

Bibliography

1. Simone, R.; Fratta, P.; Neidle, S.; Parkinson, G. N.; Isaacs, A. M., G-quadruplexes: Emerging roles in neurodegenerative diseases and the non-coding transcriptome. FEBS Lett 2015, 589 (14), 1653-68. 2. Sengoku, T.; Nureki, O.; Nakamura, A.; Kobayashi, S.; Yokoyama, S., Structural basis for RNA unwinding by the DEAD-box protein Drosophila Vasa. Cell 2006, 125 (2), 287-300. 3. Song, J.; Perreault, J. P.; Topisirovic, I.; Richard, S., RNA G-quadruplexes and their potential regulatory roles in translation. Translation (Austin) 2016, 4 (2), e1244031. 4. Polprasert, C.; Schulze, I.; Sekeres, M. A.; Makishima, H.; Przychodzen, B.; Hosono, N.; Singh, J.; Padgett, R. A.; Gu, X.; Phillips, J. G.; Clemente, M.; Parker, Y.; Lindner, D.; Dienes, B.; Jankowsky, E.; Saunthararajah, Y.; Du, Y.; Oakley, K.; Nguyen, N.; Mukherjee, S.; Pabst, C.; Godley, L. A.; Churpek, J. E.; Pollyea, D. A.; Krug, U.; Berdel, W. E.; Klein, H. U.; Dugas, M.; Shiraishi, Y.; Chiba, K.; Tanaka, H.; Miyano, S.; Yoshida, K.; Ogawa, S.; Muller-Tidow, C.; Maciejewski, J. P., Inherited and Somatic Defects in DDX41 in Myeloid Neoplasms. Cancer Cell 2015, 27 (5), 658-70. 5. Jiang, Y.; Zhu, Y.; Liu, Z. J.; Ouyang, S., The emerging roles of the DDX41 protein in immunity and diseases. Protein Cell 2017, 8 (2), 83-89. 6. Chen, M. C.; Tippana, R.; Demeshkina, N. A.; Murat, P.; Balasubramanian, S.; Myong, S.; Ferre-D'Amare, A. R., Structural basis of G-quadruplex unfolding by the DEAH/RHA helicase DHX36. Nature 2018, 558 (7710), 465-469. 7. Schutz, P.; Karlberg, T.; van den Berg, S.; Collins, R.; Lehtio, L.; Hogbom, M.; Holmberg-Schiavone, L.; Tempel, W.; Park, H. W.; Hammarstrom, M.; Moche, M.; Thorsell, A. G.; Schuler, H., Comparative structural analysis of human DEAD-box RNA helicases. PLoS One 2010, 5 (9). 8. Murat, P.; Marsico, G.; Herdy, B.; Ghanbarian, A. T.; Portella, G.; Balasubramanian, S., RNA G-quadruplexes at upstream open reading frames cause DHX36- and DHX9-dependent translation of human mRNAs. 
Genome Biol 2018, 19 (1), 229. 9. Schmid, S. R.; Linder, P., D-E-A-D of putative RNA helicases. Mol Microbiol 1992, 6 (3), 283-91. 10. Fairman-Williams, M. E.; Guenther, U. P.; Jankowsky, E., SF1 and SF2 helicases: family matters. Curr Opin Struct Biol 2010, 20 (3), 313-24. 11. Linder, P., Dead-box proteins: a family affair--active and passive players in RNP- remodeling. Nucleic Acids Res 2006, 34 (15), 4168-80. 12. Agrawal, V.; Kishan, R. K., Functional evolution of two subtly different (similar) folds. BMC Struct Biol 2001, 1, 5. 13. Cammas, A.; Millevoi, S., RNA G-quadruplexes: emerging mechanisms in disease. Nucleic Acids Res 2017, 45 (4), 1584-1595.

14. Millevoi, S.; Moine, H.; Vagner, S., G-quadruplexes in RNA biology. Wiley Interdiscip Rev RNA 2012, 3 (4), 495-507. 15. Lohman, T. M.; Tomko, E. J.; Wu, C. G., Non-hexameric DNA helicases and : mechanisms and regulation. Nat Rev Mol Cell Biol 2008, 9 (5), 391-401. 16. Antony-Debre, I.; Steidl, U., Functionally relevant RNA helicase mutations in familial and sporadic myeloid malignancies. Cancer Cell 2015, 27 (5), 609-11. 17. Jankowsky, E.; Fairman, M. E., RNA helicases--one fold for many functions. Curr Opin Struct Biol 2007, 17 (3), 316-24. 18. Gan, H. H.; Pasquali, S.; Schlick, T., Exploring the repertoire of RNA secondary motifs using graph theory; implications for RNA design. Nucleic Acids Res 2003, 31 (11), 2926-43. 19. Sievers, F.; Wilm, A.; Dineen, D.; Gibson, T. J.; Karplus, K.; Li, W.; Lopez, R.; McWilliam, H.; Remmert, M.; Soding, J.; Thompson, J. D.; Higgins, D. G., Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol 2011, 7, 539. 20. Maciejewski, J. P.; Padgett, R. A.; Brown, A. L.; Muller-Tidow, C., DDX41-related myeloid neoplasia. Semin Hematol 2017, 54 (2), 94-97. 21. Brazda, V.; Haronikova, L.; Liao, J. C.; Fojta, M., DNA and RNA quadruplex- binding proteins. Int J Mol Sci 2014, 15 (10), 17493-517. 22. Jankowsky, E., RNA helicases at work: binding and rearranging. Trends Biochem Sci 2011, 36 (1), 19-29. 23. Yoneyama, M.; Fujita, T., Structural mechanism of RNA recognition by the RIG- I-like receptors. Immunity 2008, 29 (2), 178-81. 24. Crooks, G. E.; Hon, G.; Chandonia, J. M.; Brenner, S. E., WebLogo: a sequence logo generator. Genome Res 2004, 14 (6), 1188-90. 25. Chen, M. C.; Ferre-D'Amare, A. R., Structural Basis of DEAH/RHA Helicase Activity. Crystals 2017, 7 (8). 26. Buttner, K.; Nehring, S.; Hopfner, K. P., Structural basis for DNA duplex separation by a superfamily-2 helicase. Nat Struct Mol Biol 2007, 14 (7), 647-52. 27. Zhang, L.; Xu, T.; Maeder, C.; Bud, L. 
O.; Shanks, J.; Nix, J.; Guthrie, C.; Pleiss, J. A.; Zhao, R., Structural evidence for consecutive Hel308-like modules in the spliceosomal ATPase Brr2. Nat Struct Mol Biol 2009, 16 (7), 731-9. 28. Linder, P.; Jankowsky, E., From unwinding to clamping - the DEAD box RNA helicase family. Nat Rev Mol Cell Biol 2011, 12 (8), 505-16. 29. He, Y.; Andersen, G. R.; Nielsen, K. H., Structural basis for the function of DEAH helicases. EMBO Rep 2010, 11 (3), 180-6. 30. Tanner, N. K.; Linder, P., DExD/H box RNA helicases: from generic motors to specific dissociation functions. Mol Cell 2001, 8 (2), 251-62.

31. Anantharaman, V.; Koonin, E. V.; Aravind, L., Comparative genomics and evolution of proteins involved in RNA metabolism. Nucleic Acids Res 2002, 30 (7), 1427- 64. 32. Saraste, M.; Sibbald, P. R.; Wittinghofer, A., The P-loop--a common motif in ATP- and GTP-binding proteins. Trends Biochem Sci 1990, 15 (11), 430-4. 33. Singleton, M. R.; Dillingham, M. S.; Wigley, D. B., Structure and mechanism of helicases and nucleic acid translocases. Annu Rev Biochem 2007, 76, 23-50. 34. Caruthers, J. M.; McKay, D. B., Helicase structure and mechanism. Curr Opin Struct Biol 2002, 12 (1), 123-33. 35. Tanner, N. K.; Cordin, O.; Banroques, J.; Doere, M.; Linder, P., The Q motif: a newly identified motif in DEAD box helicases may regulate ATP binding and hydrolysis. Mol Cell 2003, 11 (1), 127-38. 36. Walker, J. E.; Saraste, M.; Runswick, M. J.; Gay, N. J., Distantly related sequences in the alpha- and beta-subunits of ATP synthase, , kinases and other ATP-requiring enzymes and a common nucleotide binding fold. EMBO J 1982, 1 (8), 945-51. 37. Zhang, X.; Chaney, M.; Wigneshweraraj, S. R.; Schumacher, J.; Bordes, P.; Cannon, W.; Buck, M., Mechanochemical and transcriptional activation. Mol Microbiol 2002, 45 (4), 895-903. 38. Chen, W. F.; Rety, S.; Guo, H. L.; Dai, Y. X.; Wu, W. Q.; Liu, N. N.; Auguin, D.; Liu, Q. W.; Hou, X. M.; Dou, S. X.; Xi, X. G., Molecular Mechanistic Insights into Drosophila DHX36-Mediated G-Quadruplex Unfolding: A Structure-Based Model. Structure 2018, 26 (3), 403-415 e4. 39. Gu, M.; Rice, C. M., Three conformational snapshots of the hepatitis C virus NS3 helicase reveal a ratchet translocation mechanism. Proc Natl Acad Sci U S A 2010, 107 (2), 521-8. 40. Putnam, A. A.; Jankowsky, E., DEAD-box helicases as integrators of RNA, nucleotide and protein binding. Biochim Biophys Acta 2013, 1829 (8), 884-93. 41. He, Y.; Staley, J. P.; Andersen, G. R.; Nielsen, K. 
H., Structure of the DEAH/RHA ATPase Prp43p bound to RNA implicates a pair of hairpins and motif Va in translocation along RNA. RNA 2017, 23 (7), 1110-1124. 42. Mallam, A. L.; Del Campo, M.; Gilman, B.; Sidote, D. J.; Lambowitz, A. M., Structural basis for RNA-duplex recognition and unwinding by the DEAD-box helicase Mss116p. Nature 2012, 490 (7418), 121-5. 43. Walbott, H.; Mouffok, S.; Capeyrou, R.; Lebaron, S.; Humbert, O.; van Tilbeurgh, H.; Henry, Y.; Leulliot, N., Prp43p contains a processive helicase structural architecture with a specific regulatory domain. EMBO J 2010, 29 (13), 2194-204. 44. Klostermeier, D.; Rudolph, M. G., A novel dimerization motif in the C-terminal domain of the Thermus thermophilus DEAD box helicase Hera confers substantial flexibility. Nucleic Acids Res 2009, 37 (2), 421-30.

45. Lattmann, S.; Giri, B.; Vaughn, J. P.; Akman, S. A.; Nagamine, Y., Role of the amino terminal RHAU-specific motif in the recognition and resolution of guanine quadruplex-RNA by the DEAH-box RNA helicase RHAU. Nucleic Acids Res 2010, 38 (18), 6219-33. 46. Prabu, J. R.; Muller, M.; Thomae, A. W.; Schussler, S.; Bonneau, F.; Becker, P. B.; Conti, E., Structure of the RNA Helicase MLE Reveals the Molecular Mechanisms for Uridine Specificity and RNA-ATP Coupling. Mol Cell 2015, 60 (3), 487-99. 47. Edwalds-Gilbert, G.; Kim, D. H.; Silverman, E.; Lin, R. J., Definition of a spliceosome interaction domain in yeast Prp2 ATPase. RNA 2004, 10 (2), 210-20. 48. Hotz, H. R.; Schwer, B., Mutational analysis of the yeast DEAH-box splicing factor Prp16. Genetics 1998, 149 (2), 807-15. 49. Rajendran, R. R.; Nye, A. C.; Frasor, J.; Balsara, R. D.; Martini, P. G.; Katzenellenbogen, B. S., Regulation of nuclear receptor transcriptional activity by a novel DEAD box RNA helicase (DP97). J Biol Chem 2003, 278 (7), 4628-38. 50. Murzin, A. G., OB(oligonucleotide/oligosaccharide binding)-fold: common structural and functional solution for non-homologous sequences. EMBO J 1993, 12 (3), 861-7. 51. Agrawal, V.; Kishan, K. V., OB-fold: growing bigger with functional consistency. Curr Protein Pept Sci 2003, 4 (3), 195-206. 52. Arcus, V., OB-fold domains: a snapshot of the evolution of sequence, structure and function. Curr Opin Struct Biol 2002, 12 (6), 794-801. 53. Theobald, D. L.; Mitton-Fry, R. M.; Wuttke, D. S., Nucleic acid recognition by OB- fold proteins. Annu Rev Biophys Biomol Struct 2003, 32, 115-33. 54. Christian, H.; Hofele, R. V.; Urlaub, H.; Ficner, R., Insights into the activation of the helicase Prp43 by biochemical studies and structural mass spectrometry. Nucleic Acids Res 2014, 42 (2), 1162-79. 55. Robert-Paganin, J.; Rety, S.; Leulliot, N., Regulation of DEAH/RHA helicases by G-patch proteins. Biomed Res Int 2015, 2015, 931857. 56. Taylor, L. L.; Jackson, R. 
N.; Rexhepaj, M.; King, A. K.; Lott, L. K.; van Hoof, A.; Johnson, S. J., The Mtr4 ratchet helix and arch domain both function to promote RNA unwinding. Nucleic Acids Res 2014, 42 (22), 13861-72. 57. Tauchert, M. J.; Fourmann, J. B.; Luhrmann, R.; Ficner, R., Structural insights into the mechanism of the DEAH-box RNA helicase Prp43. Elife 2017, 6. 58. Schneider, S.; Schwer, B., Functional domains of the yeast splicing factor Prp22p. J Biol Chem 2001, 276 (24), 21184-91. 59. Hickford, D. E.; Frankenberg, S.; Pask, A. J.; Shaw, G.; Renfree, M. B., DDX4 (VASA) is conserved in germ cell development in marsupials and monotremes. Biol Reprod 2011, 85 (4), 733-43. 60. Sharma, D.; Jankowsky, E., The Ded1/DDX3 subfamily of DEAD-box RNA helicases. Crit Rev Biochem Mol Biol 2014, 49 (4), 343-60.

61. Elbaum-Garfinkle, S.; Kim, Y.; Szczepaniak, K.; Chen, C. C.; Eckmann, C. R.; Myong, S.; Brangwynne, C. P., The disordered P granule protein LAF-1 drives phase separation into droplets with tunable viscosity and dynamics. Proc Natl Acad Sci U S A 2015, 112 (23), 7189-94. 62. Tanaka, N.; Aronova, A.; Schwer, B., Ntr1 activates the Prp43 helicase to trigger release of lariat-intron from the spliceosome. Genes Dev 2007, 21 (18), 2312-25. 63. Dhote, V.; Sweeney, T. R.; Kim, N.; Hellen, C. U.; Pestova, T. V., Roles of individual domains in the function of DHX29, an essential factor required for translation of structured mammalian mRNAs. Proc Natl Acad Sci U S A 2012, 109 (46), E3150-9. 64. Kim, J. L.; Morgenstern, K. A.; Griffith, J. P.; Dwyer, M. D.; Thomson, J. A.; Murcko, M. A.; Lin, C.; Caron, P. R., Hepatitis C virus NS3 RNA helicase domain with a bound oligonucleotide: the crystal structure provides insights into the mode of unwinding. Structure 1998, 6 (1), 89-100. 65. Yangyuoru, P. M.; Bradburn, D. A.; Liu, Z.; Xiao, T. S.; Russell, R., The G- quadruplex (G4) resolvase DHX36 efficiently and specifically disrupts DNA G4s via a translocation-based helicase mechanism. J Biol Chem 2018, 293 (6), 1924-1932. 66. Yang, Q.; Del Campo, M.; Lambowitz, A. M.; Jankowsky, E., DEAD-box proteins unwind duplexes by local strand separation. Mol Cell 2007, 28 (2), 253-63. 67. Yang, Q.; Jankowsky, E., The DEAD-box protein Ded1 unwinds RNA duplexes by a mode distinct from translocating helicases. Nat Struct Mol Biol 2006, 13 (11), 981-6. 68. Chen, Y.; Potratz, J. P.; Tijerina, P.; Del Campo, M.; Lambowitz, A. M.; Russell, R., DEAD-box proteins can completely separate an RNA duplex using a single ATP. Proc Natl Acad Sci U S A 2008, 105 (51), 20203-8. 69. Rogers, G. W., Jr.; Richter, N. J.; Merrick, W. C., Biochemical and kinetic characterization of the RNA helicase activity of eukaryotic initiation factor 4A. J Biol Chem 1999, 274 (18), 12236-44. 70. 
Yang, Q.; Jankowsky, E., ATP- and ADP-dependent modulation of RNA unwinding and strand annealing activities by the DEAD-box protein DED1. Biochemistry 2005, 44 (41), 13591-601. 71. Liu, F.; Putnam, A.; Jankowsky, E., ATP hydrolysis is required for DEAD-box protein recycling but not for duplex unwinding. Proc Natl Acad Sci U S A 2008, 105 (51), 20209-14. 72. Khidr, L.; Wu, G.; Davila, A.; Procaccio, V.; Wallace, D.; Lee, W. H., Role of SUV3 helicase in maintaining mitochondrial homeostasis in human cells. J Biol Chem 2008, 283 (40), 27064-73. 73. Bates, G. J.; Nicol, S. M.; Wilson, B. J.; Jacobs, A. M.; Bourdon, J. C.; Wardrop, J.; Gregory, D. J.; Lane, D. P.; Perkins, N. D.; Fuller-Pace, F. V., The DEAD box protein p68: a novel transcriptional coactivator of the p53 tumour suppressor. EMBO J 2005, 24 (3), 543-53.

74. Geissler, V.; Altmeyer, S.; Stein, B.; Uhlmann-Schiffler, H.; Stahl, H., The RNA helicase Ddx5/p68 binds to hUpf3 and enhances NMD of Ddx17/p72 and Smg5 mRNA. Nucleic Acids Res 2013, 41 (16), 7875-88. 75. Kar, A.; Fushimi, K.; Zhou, X.; Ray, P.; Shi, C.; Chen, X.; Liu, Z.; Chen, S.; Wu, J. Y., RNA helicase p68 (DDX5) regulates tau exon 10 splicing by modulating a stem-loop structure at the 5' splice site. Mol Cell Biol 2011, 31 (9), 1812-21. 76. Ma, W. K.; Paudel, B. P.; Xing, Z.; Sabath, I. G.; Rueda, D.; Tran, E. J., Recruitment, Duplex Unwinding and Protein-Mediated Inhibition of the Dead-Box RNA Helicase Dbp2 at Actively Transcribed Chromatin. J Mol Biol 2016, 428 (6), 1091-1106. 77. Schmidt, A.; Rothenfusser, S.; Hopfner, K. P., Sensing of viral nucleic acids by RIG-I: from translocation to translation. Eur J Cell Biol 2012, 91 (1), 78-85. 78. Parvatiyar, K.; Zhang, Z.; Teles, R. M.; Ouyang, S.; Jiang, Y.; Iyer, S. S.; Zaver, S. A.; Schenk, M.; Zeng, S.; Zhong, W.; Liu, Z. J.; Modlin, R. L.; Liu, Y. J.; Cheng, G., The helicase DDX41 recognizes the bacterial secondary messengers cyclic di-GMP and cyclic di-AMP to activate a type I interferon immune response. Nat Immunol 2012, 13 (12), 1155- 61. 79. Jankowsky, E.; Fairman, M. E., Duplex unwinding and RNP remodeling with RNA helicases. Methods Mol Biol 2008, 488, 343-55. 80. Fairman, M. E.; Maroney, P. A.; Wang, W.; Bowers, H. A.; Gollnick, P.; Nilsen, T. W.; Jankowsky, E., Protein displacement by DExH/D "RNA helicases" without duplex unwinding. Science 2004, 304 (5671), 730-4. 81. Jankowsky, E.; Fairman, M. E.; Yang, Q., RNA helicases: versatile ATP-driven nanomotors. J Nanosci Nanotechnol 2005, 5 (12), 1983-9. 82. Del Campo, M.; Tijerina, P.; Bhaskaran, H.; Mohr, S.; Yang, Q.; Jankowsky, E.; Russell, R.; Lambowitz, A. M., Do DEAD-box proteins promote group II intron splicing without unwinding RNA? Mol Cell 2007, 28 (1), 159-66. 83. 
Kos, M.; Tollervey, D., The Putative RNA Helicase Dbp4p Is Required for Release of the U14 snoRNA from Preribosomes in Saccharomyces cerevisiae. Mol Cell 2005, 20 (1), 53-64.
84. Schwer, B., A conformational rearrangement in the spliceosome sets the stage for Prp22-dependent mRNA release. Mol Cell 2008, 30 (6), 743-54.
85. Raghunathan, P. L.; Guthrie, C., RNA unwinding in U4/U6 snRNPs requires ATP hydrolysis and the DEIH-box splicing factor Brr2. Curr Biol 1998, 8 (15), 847-55.
86. Bowers, H. A.; Maroney, P. A.; Fairman, M. E.; Kastner, B.; Luhrmann, R.; Nilsen, T. W.; Jankowsky, E., Discriminatory RNP remodeling by the DEAD-box protein DED1. RNA 2006, 12 (5), 903-12.
87. Jankowsky, E.; Gross, C. H.; Shuman, S.; Pyle, A. M., Active disruption of an RNA-protein interaction by a DExH/D RNA helicase. Science 2001, 291 (5501), 121-5.
88. Ohrt, T.; Prior, M.; Dannenberg, J.; Odenwalder, P.; Dybkov, O.; Rasche, N.; Schmitzova, J.; Gregor, I.; Fabrizio, P.; Enderlein, J.; Luhrmann, R., Prp2-mediated protein rearrangements at the catalytic core of the spliceosome as revealed by dcFCCS. RNA 2012, 18 (6), 1244-56.
89. Yang, L.; Lin, C.; Liu, Z. R., P68 RNA helicase mediates PDGF-induced epithelial mesenchymal transition by displacing Axin from beta-catenin. Cell 2006, 127 (1), 139-55.
90. Halls, C.; Mohr, S.; Del Campo, M.; Yang, Q.; Jankowsky, E.; Lambowitz, A. M., Involvement of DEAD-box proteins in group I and group II intron splicing. Biochemical characterization of Mss116p, ATP hydrolysis-dependent and -independent mechanisms, and general RNA chaperone activity. J Mol Biol 2007, 365 (3), 835-55.
91. Xing, L.; Liang, C.; Kleiman, L., Coordinate roles of Gag and RNA helicase A in promoting the annealing of tRNA(Lys3) to HIV-1 RNA. J Virol 2011, 85 (4), 1847-60.
92. Lorsch, J. R., RNA chaperones exist and DEAD box proteins get a life. Cell 2002, 109 (7), 797-800.
93. Mohr, S.; Matsuura, M.; Perlman, P. S.; Lambowitz, A. M., A DEAD-box protein alone promotes group II intron splicing and reverse splicing by acting as an RNA chaperone. Proc Natl Acad Sci U S A 2006, 103 (10), 3569-74.
94. Mohr, S.; Stryker, J. M.; Lambowitz, A. M., A DEAD-box protein functions as an ATP-dependent RNA chaperone in group I intron splicing. Cell 2002, 109 (6), 769-79.
95. Karunatilaka, K. S.; Solem, A.; Pyle, A. M.; Rueda, D., Single-molecule analysis of Mss116-mediated group II intron folding. Nature 2010, 467 (7318), 935-9.
96. Yang, Q.; Fairman, M. E.; Jankowsky, E., DEAD-box-protein-assisted RNA structure conversion towards and against thermodynamic equilibrium values. J Mol Biol 2007, 368 (4), 1087-100.
97. Sardana, R.; Liu, X.; Granneman, S.; Zhu, J.; Gill, M.; Papoulas, O.; Marcotte, E. M.; Tollervey, D.; Correll, C. C.; Johnson, A. W., The DEAH-box helicase Dhr1 dissociates U3 from the pre-rRNA to promote formation of the central pseudoknot. PLoS Biol 2015, 13 (2), e1002083.
98. Guenther, U. P.; Weinberg, D. E.; Zubradt, M. M.; Tedeschi, F. A.; Stawicki, B. N.; Zagore, L. L.; Brar, G. A.; Licatalosi, D. D.; Bartel, D. P.; Weissman, J. S.; Jankowsky, E., The helicase Ded1p controls use of near-cognate translation initiation codons in 5' UTRs. Nature 2018, 559 (7712), 130-134.
99. Svitkin, Y. V.; Pause, A.; Haghighat, A.; Pyronnet, S.; Witherell, G.; Belsham, G. J.; Sonenberg, N., The requirement for eukaryotic initiation factor 4A (eIF4A) in translation is in direct proportion to the degree of mRNA 5' secondary structure. RNA 2001, 7 (3), 382-94.
100. Myong, S.; Cui, S.; Cornish, P. V.; Kirchhofer, A.; Gack, M. U.; Jung, J. U.; Hopfner, K. P.; Ha, T., Cytosolic viral sensor RIG-I is a 5'-triphosphate-dependent translocase on double-stranded RNA. Science 2009, 323 (5917), 1070-4.
101. Zhang, Z.; Yuan, B.; Bao, M.; Lu, N.; Kim, T.; Liu, Y. J., The helicase DDX41 senses intracellular DNA mediated by the adaptor STING in dendritic cells. Nat Immunol 2011, 12 (10), 959-65.

102. Korneeva, N. L.; First, E. A.; Benoit, C. A.; Rhoads, R. E., Interaction between the NH2-terminal domain of eIF4A and the central domain of eIF4G modulates RNA-stimulated ATPase activity. J Biol Chem 2005, 280 (3), 1872-81.
103. Oberer, M.; Marintchev, A.; Wagner, G., Structural basis for the enhancement of eIF4A helicase activity by eIF4G. Genes Dev 2005, 19 (18), 2212-23.
104. Rogers, G. W., Jr.; Richter, N. J.; Lima, W. F.; Merrick, W. C., Modulation of the helicase activity of eIF4A by eIF4B, eIF4H, and eIF4F. J Biol Chem 2001, 276 (33), 30914-22.
105. Alcazar-Roman, A. R.; Bolger, T. A.; Wente, S. R., Control of mRNA export and translation termination by inositol hexakisphosphate requires specific interaction with Gle1. J Biol Chem 2010, 285 (22), 16683-92.
106. Maeder, C.; Kutach, A. K.; Guthrie, C., ATP-dependent unwinding of U4/U6 snRNAs by the Brr2 helicase requires the C terminus of Prp8. Nat Struct Mol Biol 2009, 16 (1), 42-8.
107. Ballut, L.; Marchadier, B.; Baguet, A.; Tomasetto, C.; Seraphin, B.; Le Hir, H., The exon junction core complex is locked onto RNA by inhibition of eIF4AIII ATPase activity. Nat Struct Mol Biol 2005, 12 (10), 861-9.
108. von Moeller, H.; Basquin, C.; Conti, E., The mRNA export protein DBP5 binds RNA and the cytoplasmic nucleoporin NUP214 in a mutually exclusive manner. Nat Struct Mol Biol 2009, 16 (3), 247-54.
109. Li, G.; Reinberg, D., Chromatin higher-order structures and gene regulation. Curr Opin Genet Dev 2011, 21 (2), 175-86.
110. Lelli, K. M.; Slattery, M.; Mann, R. S., Disentangling the many layers of eukaryotic transcriptional regulation. Annu Rev Genet 2012, 46, 43-68.
111. Jenuwein, T.; Allis, C. D., Translating the histone code. Science 2001, 293 (5532), 1074-80.
112. Levine, M.; Tjian, R., Transcription regulation and animal diversity. Nature 2003, 424 (6945), 147-51.
113. Filtz, T. M.; Vogel, W. K.; Leid, M., Regulation of transcription factor activity by interconnected post-translational modifications. Trends Pharmacol Sci 2014, 35 (2), 76-85.
114. Hsin, J. P.; Manley, J. L., The RNA polymerase II CTD coordinates transcription and RNA processing. Genes Dev 2012, 26 (19), 2119-37.
115. Culjkovic-Kraljacic, B.; Borden, K. L. B., The Impact of Post-transcriptional Control: Better Living Through RNA Regulons. Front Genet 2018, 9, 512.
116. Corbett, A. H., Post-transcriptional regulation of gene expression and human disease. Curr Opin Cell Biol 2018, 52, 96-104.
117. Keene, J. D.; Tenenbaum, S. A., Eukaryotic mRNPs may represent posttranscriptional operons. Mol Cell 2002, 9 (6), 1161-7.

118. Rodriguez, E. A.; Campbell, R. E.; Lin, J. Y.; Lin, M. Z.; Miyawaki, A.; Palmer, A. E.; Shu, X.; Zhang, J.; Tsien, R. Y., The Growing and Glowing Toolbox of Fluorescent and Photoactive Proteins. Trends Biochem Sci 2017, 42 (2), 111-129.
119. Wang, Z.; Burge, C. B., Splicing regulation: from a parts list of regulatory elements to an integrated splicing code. RNA 2008, 14 (5), 802-13.
120. Elkon, R.; Ugalde, A. P.; Agami, R., Alternative cleavage and polyadenylation: extent, regulation and function. Nat Rev Genet 2013, 14 (7), 496-506.
121. Ma, X. M.; Blenis, J., Molecular mechanisms of mTOR-mediated translational control. Nat Rev Mol Cell Biol 2009, 10 (5), 307-18.
122. Martin, C.; Zhang, Y., The diverse functions of histone lysine methylation. Nat Rev Mol Cell Biol 2005, 6 (11), 838-49.
123. Halbeisen, R. E.; Galgano, A.; Scherrer, T.; Gerber, A. P., Post-transcriptional gene regulation: from genome-wide studies to principles. Cell Mol Life Sci 2008, 65 (5), 798-813.
124. Sonenberg, N.; Hinnebusch, A. G., Regulation of translation initiation in eukaryotes: mechanisms and biological targets. Cell 2009, 136 (4), 731-45.
125. Gehring, N. H.; Wahle, E.; Fischer, U., Deciphering the mRNP Code: RNA-Bound Determinants of Post-Transcriptional Gene Regulation. Trends Biochem Sci 2017, 42 (5), 369-382.
126. Tran, H.; Schilling, M.; Wirbelauer, C.; Hess, D.; Nagamine, Y., Facilitation of mRNA deadenylation and decay by the exosome-bound, DExH protein RHAU. Mol Cell 2004, 13 (1), 101-11.
127. Keene, J. D., RNA regulons: coordination of post-transcriptional events. Nat Rev Genet 2007, 8 (7), 533-43.
128. Bochman, M. L.; Paeschke, K.; Zakian, V. A., DNA secondary structures: stability and function of G-quadruplex structures. Nat Rev Genet 2012, 13 (11), 770-80.
129. Pearson, C. E.; Zorbas, H.; Price, G. B.; Zannis-Hadjopoulos, M., Inverted repeats, stem-loops, and cruciforms: significance for initiation of DNA replication. J Cell Biochem 1996, 63 (1), 1-22.
130. Jain, A.; Wang, G.; Vasquez, K. M., DNA triple helices: biological consequences and therapeutic potential. Biochimie 2008, 90 (8), 1117-30.
131. Tinoco, I., Jr.; Bustamante, C., How RNA folds. J Mol Biol 1999, 293 (2), 271-81.
132. Staple, D. W.; Butcher, S. E., Pseudoknots: RNA structures with diverse functions. PLoS Biol 2005, 3 (6), e213.
133. Ke, A.; Zhou, K.; Ding, F.; Cate, J. H.; Doudna, J. A., A conformational switch controls hepatitis delta virus ribozyme catalysis. Nature 2004, 429 (6988), 201-5.
134. Adams, P. L.; Stahley, M. R.; Kosek, A. B.; Wang, J.; Strobel, S. A., Crystal structure of a self-splicing group I intron with both exons. Nature 2004, 430 (6995), 45-50.

135. Theimer, C. A.; Blois, C. A.; Feigon, J., Structure of the human telomerase RNA pseudoknot reveals conserved tertiary interactions essential for function. Mol Cell 2005, 17 (5), 671-82.
136. Nixon, P. L.; Rangan, A.; Kim, Y. G.; Rich, A.; Hoffman, D. W.; Hennig, M.; Giedroc, D. P., Solution structure of a luteoviral P1-P2 frameshifting mRNA pseudoknot. J Mol Biol 2002, 322 (3), 621-33.
137. Gebauer, F.; Hentze, M. W., Molecular mechanisms of translational control. Nat Rev Mol Cell Biol 2004, 5 (10), 827-35.
138. Warf, M. B.; Berglund, J. A., Role of RNA structure in regulating pre-mRNA splicing. Trends Biochem Sci 2010, 35 (3), 169-78.
139. Leppek, K.; Das, R.; Barna, M., Functional 5' UTR mRNA structures in eukaryotic translation regulation and how to find them. Nat Rev Mol Cell Biol 2018, 19 (3), 158-174.
140. Martin, K. C.; Ephrussi, A., mRNA localization: gene expression in the spatial dimension. Cell 2009, 136 (4), 719-30.
141. Goodarzi, H.; Najafabadi, H. S.; Oikonomou, P.; Greco, T. M.; Fish, L.; Salavati, R.; Cristea, I. M.; Tavazoie, S., Systematic discovery of structural elements governing stability of mammalian messenger RNAs. Nature 2012, 485 (7397), 264-8.
142. Gellert, M.; Lipsett, M. N.; Davies, D. R., Helix formation by guanylic acid. Proc Natl Acad Sci U S A 1962, 48, 2013-8.
143. Bhattacharyya, D.; Mirihana Arachchilage, G.; Basu, S., Metal Cations in G-Quadruplex Folding and Stability. Front Chem 2016, 4, 38.
144. Burge, S.; Parkinson, G. N.; Hazel, P.; Todd, A. K.; Neidle, S., Quadruplex DNA: sequence, topology and structure. Nucleic Acids Res 2006, 34 (19), 5402-15.
145. Fay, M. M.; Lyons, S. M.; Ivanov, P., RNA G-Quadruplexes in Biology: Principles and Molecular Mechanisms. J Mol Biol 2017, 429 (14), 2127-2147.
146. Arora, A.; Nair, D. R.; Maiti, S., Effect of flanking bases on quadruplex stability and Watson-Crick duplex competition. FEBS J 2009, 276 (13), 3628-40.
147. Rankin, S.; Reszka, A. P.; Huppert, J.; Zloh, M.; Parkinson, G. N.; Todd, A. K.; Ladame, S.; Balasubramanian, S.; Neidle, S., Putative DNA quadruplex formation within the human c-kit oncogene. J Am Chem Soc 2005, 127 (30), 10584-9.
148. Todd, A. K.; Johnston, M.; Neidle, S., Highly prevalent putative quadruplex sequence motifs in human DNA. Nucleic Acids Res 2005, 33 (9), 2901-7.
149. Bedrat, A.; Lacroix, L.; Mergny, J. L., Re-evaluation of G-quadruplex propensity with G4Hunter. Nucleic Acids Res 2016, 44 (4), 1746-59.
150. Chambers, V. S.; Marsico, G.; Boutell, J. M.; Di Antonio, M.; Smith, G. P.; Balasubramanian, S., High-throughput sequencing of DNA G-quadruplex structures in the human genome. Nat Biotechnol 2015, 33 (8), 877-81.
151. Poulet-Benedetti, J.; Valton, A. L.; Prioleau, M. N., [G-quadruplex: key controllers of human genome duplication]. Med Sci (Paris) 2017, 33 (12), 1063-1070.

152. Rodriguez, R.; Miller, K. M.; Forment, J. V.; Bradshaw, C. R.; Nikan, M.; Britton, S.; Oelschlaegel, T.; Xhemalce, B.; Balasubramanian, S.; Jackson, S. P., Small-molecule-induced DNA damage identifies alternative DNA structures in human genes. Nat Chem Biol 2012, 8 (3), 301-10.
153. Hansel-Hertsch, R.; Beraldi, D.; Lensing, S. V.; Marsico, G.; Zyner, K.; Parry, A.; Di Antonio, M.; Pike, J.; Kimura, H.; Narita, M.; Tannahill, D.; Balasubramanian, S., G-quadruplex structures mark human regulatory chromatin. Nat Genet 2016, 48 (10), 1267-72.
154. Kwok, C. K.; Marsico, G.; Sahakyan, A. B.; Chambers, V. S.; Balasubramanian, S., rG4-seq reveals widespread formation of G-quadruplex structures in the human transcriptome. Nat Methods 2016, 13 (10), 841-4.
155. Martadinata, H.; Phan, A. T., Structure of human telomeric RNA (TERRA): stacking of two G-quadruplex blocks in K(+) solution. Biochemistry 2013, 52 (13), 2176-83.
156. Guo, J. U.; Bartel, D. P., RNA G-quadruplexes are globally unfolded in eukaryotic cells and depleted in bacteria. Science 2016, 353 (6306).
157. Yang, S. Y.; Lejault, P.; Chevrier, S.; Boidot, R.; Robertson, A. G.; Wong, J. M. Y.; Monchaud, D., Transcriptome-wide identification of transient RNA G-quadruplexes in human cells. Nat Commun 2018, 9 (1), 4730.
158. Schaffitzel, C.; Berger, I.; Postberg, J.; Hanes, J.; Lipps, H. J.; Pluckthun, A., In vitro generated antibodies specific for telomeric guanine-quadruplex DNA react with Stylonychia lemnae macronuclei. Proc Natl Acad Sci U S A 2001, 98 (15), 8572-7.
159. Biffi, G.; Tannahill, D.; McCafferty, J.; Balasubramanian, S., Quantitative visualization of DNA G-quadruplex structures in human cells. Nat Chem 2013, 5 (3), 182-6.
160. Henderson, A.; Wu, Y.; Huang, Y. C.; Chavez, E. A.; Platt, J.; Johnson, F. B.; Brosh, R. M., Jr.; Sen, D.; Lansdorp, P. M., Detection of G-quadruplex DNA in mammalian cells. Nucleic Acids Res 2014, 42 (2), 860-9.
161. London, T. B.; Barber, L. J.; Mosedale, G.; Kelly, G. P.; Balasubramanian, S.; Hickson, I. D.; Boulton, S. J.; Hiom, K., FANCJ is a structure-specific DNA helicase associated with the maintenance of genomic G/C tracts. J Biol Chem 2008, 283 (52), 36132-9.
162. Paeschke, K.; Capra, J. A.; Zakian, V. A., DNA replication through G-quadruplex motifs is promoted by the Saccharomyces cerevisiae Pif1 DNA helicase. Cell 2011, 145 (5), 678-91.
163. Ribeyre, C.; Lopes, J.; Boule, J. B.; Piazza, A.; Guedin, A.; Zakian, V. A.; Mergny, J. L.; Nicolas, A., The yeast Pif1 helicase prevents genomic instability caused by G-quadruplex-forming CEB1 sequences in vivo. PLoS Genet 2009, 5 (5), e1000475.
164. Vannier, J. B.; Pavicic-Kaltenbrunner, V.; Petalcorin, M. I.; Ding, H.; Boulton, S. J., RTEL1 dismantles T loops and counteracts telomeric G4-DNA to maintain telomere integrity. Cell 2012, 149 (4), 795-806.

165. Pelengaris, S.; Khan, M.; Evan, G., c-MYC: more than just a matter of life and death. Nat Rev Cancer 2002, 2 (10), 764-76.
166. Pelengaris, S.; Khan, M.; Evan, G. I., Suppression of Myc-induced apoptosis in beta cells exposes multiple oncogenic properties of Myc and triggers carcinogenic progression. Cell 2002, 109 (3), 321-34.
167. Simonsson, T.; Pecinka, P.; Kubista, M., DNA tetraplex formation in the control region of c-myc. Nucleic Acids Res 1998, 26 (5), 1167-72.
168. Siddiqui-Jain, A.; Grand, C. L.; Bearss, D. J.; Hurley, L. H., Direct evidence for a G-quadruplex in a promoter region and its targeting with a small molecule to repress c-MYC transcription. Proc Natl Acad Sci U S A 2002, 99 (18), 11593-8.
169. Cogoi, S.; Xodo, L. E., G-quadruplex formation within the promoter of the KRAS proto-oncogene and its effect on transcription. Nucleic Acids Res 2006, 34 (9), 2536-49.
170. Grand, C. L.; Han, H.; Munoz, R. M.; Weitman, S.; Von Hoff, D. D.; Hurley, L. H.; Bearss, D. J., The cationic porphyrin TMPyP4 down-regulates c-MYC and human telomerase reverse transcriptase expression and inhibits tumor growth in vivo. Mol Cancer Ther 2002, 1 (8), 565-73.
171. Kwiatkowski, N.; Zhang, T.; Rahl, P. B.; Abraham, B. J.; Reddy, J.; Ficarro, S. B.; Dastur, A.; Amzallag, A.; Ramaswamy, S.; Tesar, B.; Jenkins, C. E.; Hannett, N. M.; McMillin, D.; Sanda, T.; Sim, T.; Kim, N. D.; Look, T.; Mitsiades, C. S.; Weng, A. P.; Brown, J. R.; Benes, C. H.; Marto, J. A.; Young, R. A.; Gray, N. S., Targeting transcription regulation in cancer with a covalent CDK7 inhibitor. Nature 2014, 511 (7511), 616-20.
172. Johnson, J. E.; Cao, K.; Ryvkin, P.; Wang, L. S.; Johnson, F. B., Altered gene expression in the Werner and Bloom syndromes is associated with sequences having G-quadruplex forming potential. Nucleic Acids Res 2010, 38 (4), 1114-22.
173. Nguyen, G. H.; Tang, W.; Robles, A. I.; Beyer, R. P.; Gray, L. T.; Welsh, J. A.; Schetter, A. J.; Kumamoto, K.; Wang, X. W.; Hickson, I. D.; Maizels, N.; Monnat, R. J., Jr.; Harris, C. C., Regulation of gene expression by the BLM helicase correlates with the presence of G-quadruplex DNA motifs. Proc Natl Acad Sci U S A 2014, 111 (27), 9905-10.
174. Tang, W.; Robles, A. I.; Beyer, R. P.; Gray, L. T.; Nguyen, G. H.; Oshima, J.; Maizels, N.; Harris, C. C.; Monnat, R. J., Jr., The Werner syndrome RECQ helicase targets G4 DNA in human cells to modulate transcription. Hum Mol Genet 2016, 25 (10), 2060-2069.
175. Gromak, N.; West, S.; Proudfoot, N. J., Pause sites promote transcriptional termination of mammalian RNA polymerase II. Mol Cell Biol 2006, 26 (10), 3986-96.
176. Skourti-Stathaki, K.; Proudfoot, N. J.; Gromak, N., Human senataxin resolves RNA/DNA hybrids formed at transcriptional pause sites to promote Xrn2-dependent termination. Mol Cell 2011, 42 (6), 794-805.
177. Decorsiere, A.; Cayrel, A.; Vagner, S.; Millevoi, S., Essential role for the interaction between hnRNP H/F and a G quadruplex in maintaining p53 pre-mRNA 3'-end processing and function during DNA damage. Genes Dev 2011, 25 (3), 220-5.

178. Beaudoin, J. D.; Perreault, J. P., Exploring mRNA 3' UTR G-quadruplexes: evidence of roles in both alternative polyadenylation and mRNA shortening. Nucleic Acids Res 2013, 41 (11), 5898-911.
179. Gomez, D.; Lemarteleur, T.; Lacroix, L.; Mailliet, P.; Mergny, J. L.; Riou, J. F., Telomerase downregulation induced by the G-quadruplex ligand 12459 in A549 cells is mediated by hTERT RNA alternative splicing. Nucleic Acids Res 2004, 32 (1), 371-9.
180. Weldon, C.; Dacanay, J. G.; Gokhale, V.; Boddupally, P. V. L.; Behm-Ansmant, I.; Burley, G. A.; Branlant, C.; Hurley, L. H.; Dominguez, C.; Eperon, I. C., Specific G-quadruplex ligands modulate the alternative splicing of Bcl-X. Nucleic Acids Res 2018, 46 (2), 886-896.
181. Marcel, V.; Tran, P. L.; Sagne, C.; Martel-Planche, G.; Vaslin, L.; Teulade-Fichou, M. P.; Hall, J.; Mergny, J. L.; Hainaut, P.; Van Dyck, E., G-quadruplex structures in TP53 intron 3: role in alternative splicing and in production of p53 mRNA isoforms. Carcinogenesis 2011, 32 (3), 271-8.
182. Subramanian, M.; Rage, F.; Tabet, R.; Flatter, E.; Mandel, J. L.; Moine, H., G-quadruplex RNA structure as a signal for neurite mRNA targeting. EMBO Rep 2011, 12 (7), 697-704.
183. Kanai, Y.; Dohmae, N.; Hirokawa, N., Kinesin transports RNA: isolation and characterization of an RNA-transporting granule. Neuron 2004, 43 (4), 513-25.
184. Babendure, J. R.; Babendure, J. L.; Ding, J. H.; Tsien, R. Y., Control of mammalian translation by mRNA structure near caps. RNA 2006, 12 (5), 851-61.
185. Halder, C.; Ossendorf, C.; Maran, A.; Yaszemski, M.; Bolander, M. E.; Fuchs, B.; Sarkar, G., Preferential expression of the secreted and membrane forms of tumor endothelial marker 7 transcripts in osteosarcoma. Anticancer Res 2009, 29 (11), 4317-22.
186. Shahid, R.; Bugaut, A.; Balasubramanian, S., The BCL-2 5' untranslated region contains an RNA G-quadruplex-forming motif that modulates protein expression. Biochemistry 2010, 49 (38), 8300-6.
187. Bugaut, A.; Balasubramanian, S., 5' UTR RNA G-quadruplexes: translation regulation and targeting. Nucleic Acids Res 2012, 40 (11), 4727-41.
188. Bonnal, S.; Schaeffer, C.; Creancier, L.; Clamens, S.; Moine, H.; Prats, A. C.; Vagner, S., A single internal ribosome entry site containing a G quartet RNA structure drives fibroblast growth factor 2 gene expression at four alternative translation initiation codons. J Biol Chem 2003, 278 (41), 39330-6.
189. Morris, M. J.; Negishi, Y.; Pazsint, C.; Schonhoft, J. D.; Basu, S., An RNA G-quadruplex is essential for cap-independent translation initiation in human VEGF IRES. J Am Chem Soc 2010, 132 (50), 17831-9.
190. Endoh, T.; Kawasaki, Y.; Sugimoto, N., Suppression of gene expression by G-quadruplexes in open reading frames depends on G-quadruplex stability. Angew Chem Int Ed Engl 2013, 52 (21), 5522-6.

191. Endoh, T.; Sugimoto, N., Mechanical insights into ribosomal progression overcoming RNA G-quadruplex from periodical translation suppression in cells. Sci Rep 2016, 6, 22719.
192. Thandapani, P.; Song, J.; Gandin, V.; Cai, Y.; Rouleau, S. G.; Garant, J. M.; Boisvert, F. M.; Yu, Z.; Perreault, J. P.; Topisirovic, I.; Richard, S., Aven recognition of RNA G-quadruplexes regulates translation of the mixed lineage leukemia protooncogenes. Elife 2015, 4.
193. Yu, C. H.; Teulade-Fichou, M. P.; Olsthoorn, R. C., Stimulation of ribosomal frameshifting by RNA G-quadruplex structures. Nucleic Acids Res 2014, 42 (3), 1887-92.
194. Arora, A.; Suess, B., An RNA G-quadruplex in the 3' UTR of the proto-oncogene PIM1 represses translation. RNA Biol 2011, 8 (5), 802-5.
195. Jayaraj, G. G.; Pandey, S.; Scaria, V.; Maiti, S., Potential G-quadruplexes in the human long non-coding transcriptome. RNA Biol 2012, 9 (1), 81-6.
196. Redon, S.; Reichenbach, P.; Lingner, J., The non-coding RNA TERRA is a natural ligand and direct inhibitor of human telomerase. Nucleic Acids Res 2010, 38 (17), 5797-806.
197. Schoeftner, S.; Blasco, M. A., Developmentally regulated transcription of mammalian telomeres by DNA-dependent RNA polymerase II. Nat Cell Biol 2008, 10 (2), 228-36.
198. Booy, E. P.; Meier, M.; Okun, N.; Novakowski, S. K.; Xiong, S.; Stetefeld, J.; McKenna, S. A., The RNA helicase RHAU (DHX36) unwinds a G4-quadruplex in human telomerase RNA and promotes the formation of the P1 helix template boundary. Nucleic Acids Res 2012, 40 (9), 4110-24.
199. Rouleau, S.; Glouzon, J. S.; Brumwell, A.; Bisaillon, M.; Perreault, J. P., 3' UTR G-quadruplexes regulate miRNA binding. RNA 2017, 23 (8), 1172-1179.
200. Rouleau, S. G.; Garant, J. M.; Bolduc, F.; Bisaillon, M.; Perreault, J. P., G-Quadruplexes influence pri-microRNA processing. RNA Biol 2018, 15 (2), 198-206.
201. Amor-Gueret, M., Bloom syndrome, genomic instability and cancer: the SOS-like hypothesis. Cancer Lett 2006, 236 (1), 1-12.
202. Wu, Y.; Shin-ya, K.; Brosh, R. M., Jr., FANCJ helicase defective in Fanconia anemia and breast cancer unwinds G-quadruplex DNA to defend genomic stability. Mol Cell Biol 2008, 28 (12), 4116-28.
203. Fan, L.; Fuss, J. O.; Cheng, Q. J.; Arvai, A. S.; Hammel, M.; Roberts, V. A.; Cooper, P. K.; Tainer, J. A., XPD helicase structures and activities: insights into the cancer and aging phenotypes from XPD mutations. Cell 2008, 133 (5), 789-800.
204. Marinoni, I.; Kurrer, A. S.; Vassella, E.; Dettmer, M.; Rudolph, T.; Banz, V.; Hunger, F.; Pasquinelli, S.; Speel, E. J.; Perren, A., Loss of DAXX and ATRX are associated with chromosome instability and reduced survival of patients with pancreatic neuroendocrine tumors. Gastroenterology 2014, 146 (2), 453-60 e5.

205. Brooks, T. A.; Kendrick, S.; Hurley, L., Making sense of G-quadruplex and i-motif functions in oncogene promoters. FEBS J 2010, 277 (17), 3459-69.
206. Renton, A. E.; Majounie, E.; Waite, A.; Simon-Sanchez, J.; Rollinson, S.; Gibbs, J. R.; Schymick, J. C.; Laaksovirta, H.; van Swieten, J. C.; Myllykangas, L.; Kalimo, H.; Paetau, A.; Abramzon, Y.; Remes, A. M.; Kaganovich, A.; Scholz, S. W.; Duckworth, J.; Ding, J.; Harmer, D. W.; Hernandez, D. G.; Johnson, J. O.; Mok, K.; Ryten, M.; Trabzuni, D.; Guerreiro, R. J.; Orrell, R. W.; Neal, J.; Murray, A.; Pearson, J.; Jansen, I. E.; Sondervan, D.; Seelaar, H.; Blake, D.; Young, K.; Halliwell, N.; Callister, J. B.; Toulson, G.; Richardson, A.; Gerhard, A.; Snowden, J.; Mann, D.; Neary, D.; Nalls, M. A.; Peuralinna, T.; Jansson, L.; Isoviita, V. M.; Kaivorinne, A. L.; Holtta-Vuori, M.; Ikonen, E.; Sulkava, R.; Benatar, M.; Wuu, J.; Chio, A.; Restagno, G.; Borghero, G.; Sabatelli, M.; Consortium, I.; Heckerman, D.; Rogaeva, E.; Zinman, L.; Rothstein, J. D.; Sendtner, M.; Drepper, C.; Eichler, E. E.; Alkan, C.; Abdullaev, Z.; Pack, S. D.; Dutra, A.; Pak, E.; Hardy, J.; Singleton, A.; Williams, N. M.; Heutink, P.; Pickering-Brown, S.; Morris, H. R.; Tienari, P. J.; Traynor, B. J., A hexanucleotide repeat expansion in C9ORF72 is the cause of chromosome 9p21-linked ALS-FTD. Neuron 2011, 72 (2), 257-68.
207. Haeusler, A. R.; Donnelly, C. J.; Periz, G.; Simko, E. A.; Shaw, P. G.; Kim, M. S.; Maragakis, N. J.; Troncoso, J. C.; Pandey, A.; Sattler, R.; Rothstein, J. D.; Wang, J., C9orf72 nucleotide repeat structures initiate molecular cascades of disease. Nature 2014, 507 (7491), 195-200.
208. Mizielinska, S.; Isaacs, A. M., C9orf72 amyotrophic lateral sclerosis and frontotemporal dementia: gain or loss of function? Curr Opin Neurol 2014, 27 (5), 515-23.
209. Gendron, T. F.; Bieniek, K. F.; Zhang, Y. J.; Jansen-West, K.; Ash, P. E.; Caulfield, T.; Daughrity, L.; Dunmore, J. H.; Castanedes-Casey, M.; Chew, J.; Cosio, D. M.; van Blitterswijk, M.; Lee, W. C.; Rademakers, R.; Boylan, K. B.; Dickson, D. W.; Petrucelli, L., Antisense transcripts of the expanded C9ORF72 hexanucleotide repeat form nuclear RNA foci and undergo repeat-associated non-ATG translation in c9FTD/ALS. Acta Neuropathol 2013, 126 (6), 829-44.
210. Mori, K.; Weng, S. M.; Arzberger, T.; May, S.; Rentzsch, K.; Kremmer, E.; Schmid, B.; Kretzschmar, H. A.; Cruts, M.; Van Broeckhoven, C.; Haass, C.; Edbauer, D., The C9orf72 GGGGCC repeat is translated into aggregating dipeptide-repeat proteins in FTLD/ALS. Science 2013, 339 (6125), 1335-8.
211. Pieretti, M.; Zhang, F. P.; Fu, Y. H.; Warren, S. T.; Oostra, B. A.; Caskey, C. T.; Nelson, D. L., Absence of expression of the FMR-1 gene in fragile X syndrome. Cell 1991, 66 (4), 817-22.
212. Darnell, J. C.; Jensen, K. B.; Jin, P.; Brown, V.; Warren, S. T.; Darnell, R. B., Fragile X mental retardation protein targets G quartet mRNAs important for neuronal function. Cell 2001, 107 (4), 489-99.
213. Brooks, T. A.; Hurley, L. H., Targeting MYC Expression through G-Quadruplexes. Genes Cancer 2010, 1 (6), 641-649.
214. Salvati, E.; Scarsella, M.; Porru, M.; Rizzo, A.; Iachettini, S.; Tentori, L.; Graziani, G.; D'Incalci, M.; Stevens, M. F.; Orlandi, A.; Passeri, D.; Gilson, E.; Zupi, G.; Leonetti, C.; Biroccio, A., PARP1 is activated at telomeres upon G4 stabilization: possible target for telomere-based therapy. Oncogene 2010, 29 (47), 6280-93.
215. Zamiri, B.; Reddy, K.; Macgregor, R. B., Jr.; Pearson, C. E., TMPyP4 porphyrin distorts RNA G-quadruplex structures of the disease-associated r(GGGGCC)n repeat of the C9orf72 gene and blocks interaction of RNA-binding proteins. J Biol Chem 2014, 289 (8), 4653-9.
216. Su, Z.; Zhang, Y.; Gendron, T. F.; Bauer, P. O.; Chew, J.; Yang, W. Y.; Fostvedt, E.; Jansen-West, K.; Belzil, V. V.; Desaro, P.; Johnston, A.; Overstreet, K.; Oh, S. Y.; Todd, P. K.; Berry, J. D.; Cudkowicz, M. E.; Boeve, B. F.; Dickson, D.; Floeter, M. K.; Traynor, B. J.; Morelli, C.; Ratti, A.; Silani, V.; Rademakers, R.; Brown, R. H.; Rothstein, J. D.; Boylan, K. B.; Petrucelli, L.; Disney, M. D., Discovery of a biomarker and lead small molecules to target r(GGGGCC)-associated defects in c9FTD/ALS. Neuron 2014, 83 (5), 1043-50.
217. Kwok, C. K.; Marsico, G.; Balasubramanian, S., Detecting RNA G-Quadruplexes (rG4s) in the Transcriptome. Cold Spring Harb Perspect Biol 2018, 10 (7).
218. Gerstberger, S.; Hafner, M.; Tuschl, T., A census of human RNA-binding proteins. Nat Rev Genet 2014, 15 (12), 829-45.
219. Hentze, M. W.; Castello, A.; Schwarzl, T.; Preiss, T., A brave new world of RNA-binding proteins. Nat Rev Mol Cell Biol 2018, 19 (5), 327-341.
220. Yesudhas, D.; Batool, M.; Anwar, M. A.; Panneerselvam, S.; Choi, S., Proteins Recognizing DNA: Structural Uniqueness and Versatility of DNA-Binding Domains in Stem Cell Transcription Factors. Genes (Basel) 2017, 8 (8).
221. Rohs, R.; Jin, X.; West, S. M.; Joshi, R.; Honig, B.; Mann, R. S., Origins of specificity in protein-DNA recognition. Annu Rev Biochem 2010, 79, 233-69.
222. Stella, S.; Cascio, D.; Johnson, R. C., The shape of the DNA minor groove directs binding by the DNA-bending protein Fis. Genes Dev 2010, 24 (8), 814-26.
223. Kato, M.; Han, T. W.; Xie, S.; Shi, K.; Du, X.; Wu, L. C.; Mirzaei, H.; Goldsmith, E. J.; Longgood, J.; Pei, J.; Grishin, N. V.; Frantz, D. E.; Schneider, J. W.; Chen, S.; Li, L.; Sawaya, M. R.; Eisenberg, D.; Tycko, R.; McKnight, S. L., Cell-free formation of RNA granules: low complexity sequence domains form dynamic fibers within hydrogels. Cell 2012, 149 (4), 753-67.
224. Hudson, W. H.; Ortlund, E. A., The structure, function and evolution of proteins that bind DNA and RNA. Nat Rev Mol Cell Biol 2014, 15 (11), 749-60.
225. Mishra, S. K.; Tawani, A.; Mishra, A.; Kumar, A., G4IPDB: A database for G-quadruplex structure forming nucleic acid interacting proteins. Sci Rep 2016, 6, 38144.
226. von Hacht, A.; Seifert, O.; Menger, M.; Schutze, T.; Arora, A.; Konthur, Z.; Neubauer, P.; Wagner, A.; Weise, C.; Kurreck, J., Identification and characterization of RNA guanine-quadruplex binding proteins. Nucleic Acids Res 2014, 42 (10), 6630-44.
227. Sissi, C.; Gatto, B.; Palumbo, M., The evolving world of protein-G-quadruplex recognition: a medicinal chemist's perspective. Biochimie 2011, 93 (8), 1219-30.

228. Melko, M.; Douguet, D.; Bensaid, M.; Zongaro, S.; Verheggen, C.; Gecz, J.; Bardoni, B., Functional characterization of the AFF (AF4/FMR2) family of RNA-binding proteins: insights into the molecular pathology of FRAXE intellectual disability. Hum Mol Genet 2011, 20 (10), 1873-85.
229. Melko, M.; Bardoni, B., The role of G-quadruplex in RNA metabolism: involvement of FMRP and FMR2P. Biochimie 2010, 92 (8), 919-26.
230. Bashkirov, V. I.; Scherthan, H.; Solinger, J. A.; Buerstedde, J. M.; Heyer, W. D., A mouse cytoplasmic exoribonuclease (mXRN1p) with preference for G4 tetraplex substrates. J Cell Biol 1997, 136 (4), 761-73.
231. Mendoza, O.; Bourdoncle, A.; Boule, J. B.; Brosh, R. M., Jr.; Mergny, J. L., G-quadruplexes and helicases. Nucleic Acids Res 2016, 44 (5), 1989-2006.
232. Sauer, M.; Paeschke, K., G-quadruplex unwinding helicases and their function in vivo. Biochem Soc Trans 2017, 45 (5), 1173-1182.
233. Gao, J.; Byrd, A. K.; Zybailov, B. L.; Marecki, J. C.; Guderyon, M. J.; Edwards, A. D.; Chib, S.; West, K. L.; Waldrip, Z. J.; Mackintosh, S. G.; Gao, Z.; Putnam, A. A.; Jankowsky, E.; Raney, K. D., DEAD-box RNA helicases Dbp2, Ded1 and Mss116 bind to G-quadruplex nucleic acids and destabilize G-quadruplex RNA. Chem Commun (Camb) 2019, 55 (31), 4467-4470.
234. Herdy, B.; Mayer, C.; Varshney, D.; Marsico, G.; Murat, P.; Taylor, C.; D'Santos, C.; Tannahill, D.; Balasubramanian, S., Analysis of NRAS RNA G-quadruplex binding proteins reveals DDX3X as a novel interactor of cellular G-quadruplex containing transcripts. Nucleic Acids Res 2018, 46 (21), 11592-11604.
235. McRae, E. K. S.; Booy, E. P.; Moya-Torres, A.; Ezzati, P.; Stetefeld, J.; McKenna, S. A., Human DDX21 binds and unwinds RNA guanine quadruplexes. Nucleic Acids Res 2017, 45 (11), 6656-6668.
236. Kuroda, M. I.; Kernan, M. J.; Kreber, R.; Ganetzky, B.; Baker, B. S., The maleless protein associates with the X chromosome to regulate dosage compensation in Drosophila. Cell 1991, 66 (5), 935-47.
237. Vaughn, J. P.; Creacy, S. D.; Routh, E. D.; Joyner-Butt, C.; Jenkins, G. S.; Pauli, S.; Nagamine, Y.; Akman, S. A., The DEXH protein product of the DHX36 gene is the major source of tetramolecular quadruplex G4-DNA resolving activity in HeLa cell lysates. J Biol Chem 2005, 280 (46), 38117-20.
238. Jankowsky, E.; Jankowsky, A., The DExH/D protein family database. Nucleic Acids Res 2000, 28 (1), 333-4.
239. Chalupnikova, K.; Lattmann, S.; Selak, N.; Iwamoto, F.; Fujiki, Y.; Nagamine, Y., Recruitment of the RNA helicase RHAU to stress granules via a unique RNA-binding domain. J Biol Chem 2008, 283 (50), 35186-98.
240. Zhang, Z.; Kim, T.; Bao, M.; Facchinetti, V.; Jung, S. Y.; Ghaffari, A. A.; Qin, J.; Cheng, G.; Liu, Y. J., DDX1, DDX21, and DHX36 helicases form a complex with the adaptor molecule TRIF to sense dsRNA in dendritic cells. Immunity 2011, 34 (6), 866-78.

241. Iwamoto, F.; Stadler, M.; Chalupnikova, K.; Oakeley, E.; Nagamine, Y., Transcription-dependent nucleolar cap localization and possible nuclear function of DExH RNA helicase RHAU. Exp Cell Res 2008, 314 (6), 1378-91. 242. Sauer, M.; Juranek, S. A.; Marks, J.; De Magis, A.; Kazemier, H. G.; Hilbig, D.; Benhalevy, D.; Wang, X.; Hafner, M.; Paeschke, K., DHX36 prevents the accumulation of translationally inactive mRNAs with G4-structures in untranslated regions. Nat Commun 2019, 10 (1), 2421. 243. Lai, J. C.; Ponti, S.; Pan, D.; Kohler, H.; Skoda, R. C.; Matthias, P.; Nagamine, Y., The DEAH-box helicase RHAU is an essential gene and critical for mouse hematopoiesis. Blood 2012, 119 (18), 4291-300. 244. Nie, J.; Jiang, M.; Zhang, X.; Tang, H.; Jin, H.; Huang, X.; Yuan, B.; Zhang, C.; Lai, J. C.; Nagamine, Y.; Pan, D.; Wang, W.; Yang, Z., Post-transcriptional Regulation of Nkx2-5 by RHAU in Heart Development. Cell Rep 2015, 13 (4), 723-32. 245. Gao, X.; Ma, W.; Nie, J.; Zhang, C.; Zhang, J.; Yao, G.; Han, J.; Xu, J.; Hu, B.; Du, Y.; Shi, Q.; Yang, Z.; Huang, X.; Zhang, Y., A G-quadruplex DNA structure resolvase, RHAU, is essential for spermatogonia differentiation. Cell Death Dis 2015, 6, e1610. 246. Creacy, S. D.; Routh, E. D.; Iwamoto, F.; Nagamine, Y.; Akman, S. A.; Vaughn, J. P., G4 resolvase 1 binds both DNA and RNA tetramolecular quadruplex with high affinity and is the major source of tetramolecular quadruplex G4-DNA and G4-RNA resolving activity in HeLa cell lysates. J Biol Chem 2008, 283 (50), 34626-34. 247. Lattmann, S.; Stadler, M. B.; Vaughn, J. P.; Akman, S. A.; Nagamine, Y., The DEAH-box RNA helicase RHAU binds an intramolecular RNA G-quadruplex in TERC and associates with telomerase holoenzyme. Nucleic Acids Res 2011, 39 (21), 9390-404. 248. Gros, J.; Guedin, A.; Mergny, J. L.; Lacroix, L., G-Quadruplex formation interferes with P1 helix formation in the RNA component of telomerase hTERC. Chembiochem 2008, 9 (13), 2075-9. 249. 
Li, X.; Nishizuka, H.; Tsutsumi, K.; Imai, Y.; Kurihara, Y.; Uesugi, S., Structure, interactions and effects on activity of the 5'-terminal region of human telomerase RNA. J Biochem 2007, 141 (5), 755-65. 250. Sexton, A. N.; Collins, K., The 5' guanosine tracts of human telomerase RNA are recognized by the G-quadruplex binding domain of the RNA helicase DHX36 and function to increase RNA accumulation. Mol Cell Biol 2011, 31 (4), 736-43. 251. Huang, W.; Smaldino, P. J.; Zhang, Q.; Miller, L. D.; Cao, P.; Stadelman, K.; Wan, M.; Giri, B.; Lei, M.; Nagamine, Y.; Vaughn, J. P.; Akman, S. A.; Sui, G., Yin Yang 1 contains G-quadruplex structures in its promoter and 5' UTR and its expression is modulated by G4 resolvase 1. Nucleic Acids Res 2012, 40 (3), 1033-49. 252. Newman, M.; Sfaxi, R.; Saha, A.; Monchaud, D.; Teulade-Fichou, M. P.; Vagner, S., The G-Quadruplex-Specific RNA Helicase DHX36 Regulates p53 Pre-mRNA 3' End Processing Following UV-Induced DNA Damage. J Mol Biol 2016. 253. Booy, E. P.; Howard, R.; Marushchak, O.; Ariyo, E. O.; Meier, M.; Novakowski, S. K.; Deo, S. R.; Dzananovic, E.; Stetefeld, J.; McKenna, S. A., The RNA helicase RHAU
(DHX36) suppresses expression of the transcription factor PITX1. Nucleic Acids Res 2014, 42 (5), 3346-61. 254. Bicker, S.; Khudayberdiev, S.; Weiss, K.; Zocher, K.; Baumeister, S.; Schratt, G., The DEAH-box helicase DHX36 mediates dendritic localization of the neuronal precursor-microRNA-134. Genes Dev 2013, 27 (9), 991-6. 255. Booy, E. P.; McRae, E. K.; Howard, R.; Deo, S. R.; Ariyo, E. O.; Dzananovic, E.; Meier, M.; Stetefeld, J.; McKenna, S. A., RNA Helicase Associated with AU-rich Element (RHAU/DHX36) Interacts with the 3'-Tail of the Long Non-coding RNA BC200 (BCYRN1). J Biol Chem 2016, 291 (10), 5355-72. 256. Matsumura, K.; Kawasaki, Y.; Miyamoto, M.; Kamoshida, Y.; Nakamura, J.; Negishi, L.; Suda, S.; Akiyama, T., The novel G-quadruplex-containing long non-coding RNA GSEC antagonizes DHX36 and modulates colon cancer cell migration. Oncogene 2016. 257. Yoo, J. S.; Takahasi, K.; Ng, C. S.; Ouda, R.; Onomoto, K.; Yoneyama, M.; Lai, J. C.; Lattmann, S.; Nagamine, Y.; Matsui, T.; Iwabuchi, K.; Kato, H.; Fujita, T., DHX36 enhances RIG-I signaling by facilitating PKR-mediated antiviral stress granule formation. PLoS Pathog 2014, 10 (3), e1004012. 258. Gao, J.; Aksoy, B. A.; Dogrusoz, U.; Dresdner, G.; Gross, B.; Sumer, S. O.; Sun, Y.; Jacobsen, A.; Sinha, R.; Larsson, E.; Cerami, E.; Sander, C.; Schultz, N., Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Sci Signal 2013, 6 (269), pl1. 259. Jing, H.; Zhou, Y.; Fang, L.; Ding, Z.; Wang, D.; Ke, W.; Chen, H.; Xiao, S., DExD/H-Box Helicase 36 Signaling via Myeloid Differentiation Primary Response Gene 88 Contributes to NF-kappaB Activation to Type 2 Porcine Reproductive and Respiratory Syndrome Virus Infection. Front Immunol 2017, 8, 1365. 260. Metifiot, M.; Amrane, S.; Litvak, S.; Andreola, M. L., G-quadruplexes in viruses: function and potential therapeutic applications. Nucleic Acids Res 2014, 42 (20), 12352-66. 261. Giri, B.; Smaldino, P. J.; Thys, R. G.; Creacy, S. 
D.; Routh, E. D.; Hantgan, R. R.; Lattmann, S.; Nagamine, Y.; Akman, S. A.; Vaughn, J. P., G4 resolvase 1 tightly binds and unwinds unimolecular G4-DNA. Nucleic Acids Res 2011, 39 (16), 7161-78. 262. Chen, M. C.; Murat, P.; Abecassis, K.; Ferre-D'Amare, A. R.; Balasubramanian, S., Insights into the mechanism of a G-quadruplex-unwinding DEAH-box helicase. Nucleic Acids Res 2015, 43 (4), 2223-31. 263. Koh, H. R.; Xing, L.; Kleiman, L.; Myong, S., Repetitive RNA unwinding by RNA helicase A facilitates RNA annealing. Nucleic Acids Res 2014, 42 (13), 8556-64. 264. Tippana, R.; Chen, M. C.; Demeshkina, N. A.; Ferre-D'Amare, A. R.; Myong, S., RNA G-quadruplex is resolved by repetitive and ATP-dependent mechanism of DHX36. Nat Commun 2019, 10 (1), 1855. 265. Hamann, F.; Enders, M.; Ficner, R., Structural basis for RNA translocation by DEAH-box ATPases. Nucleic Acids Res 2019, 47 (8), 4349-4362.

266. Schmucker, D.; Vorbruggen, G.; Yeghiayan, P.; Fan, H. Q.; Jackle, H.; Gaul, U., The Drosophila gene abstrakt, required for visual system development, encodes a putative RNA helicase of the DEAD box protein family. Mech Dev 2000, 91 (1-2), 189-96. 267. Omura, H.; Oikawa, D.; Nakane, T.; Kato, M.; Ishii, R.; Ishitani, R.; Tokunaga, F.; Nureki, O., Structural and Functional Analysis of DDX41: a bispecific immune receptor for DNA and cyclic dinucleotide. Sci Rep 2016, 6, 34756. 268. Jiang, Y.; Zhu, Y.; Qiu, W.; Liu, Y. J.; Cheng, G.; Liu, Z. J.; Ouyang, S., Structural and functional analyses of human DDX41 DEAD domain. Protein Cell 2017, 8 (1), 72-76. 269. Abdul-Ghani, M.; Hartman, K. L.; Ngsee, J. K., Abstrakt interacts with and regulates the expression of sorting nexin-2. J Cell Physiol 2005, 204 (1), 210-8. 270. Schmucker, D.; Jackle, H.; Gaul, U., Genetic analysis of the larval optic nerve projection in Drosophila. Development 1997, 124 (5), 937-48. 271. Irion, U.; Leptin, M., Developmental and cell biological functions of the Drosophila DEAD-box protein abstrakt. Curr Biol 1999, 9 (23), 1373-81. 272. Irion, U.; Leptin, M.; Siller, K.; Fuerstenberg, S.; Cai, Y.; Doe, C. Q.; Chia, W.; Yang, X., Abstrakt, a DEAD box protein, regulates Insc levels and asymmetric division of neural and mesodermal progenitors. Curr Biol 2004, 14 (2), 138-44. 273. Trowitzsch, S.; Weber, G.; Luhrmann, R.; Wahl, M. C., Crystal structure of the Pml1p subunit of the yeast precursor mRNA retention and splicing complex. J Mol Biol 2009, 385 (2), 531-41. 274. Wang, Y.; Wagner, J. D.; Guthrie, C., The DEAH-box splicing factor Prp16 unwinds RNA duplexes in vitro. Curr Biol 1998, 8 (8), 441-51. 275. Zhou, Z.; Licklider, L. J.; Gygi, S. P.; Reed, R., Comprehensive proteomic analysis of the human spliceosome. Nature 2002, 419 (6903), 182-5. 276. Maciejewski, J. P.; Padgett, R. A., Defects in spliceosomal machinery: a new pathway of leukaemogenesis. Br J Haematol 2012, 158 (2), 165-173. 277. 
Takeuchi, O.; Akira, S., Innate immunity to virus infection. Immunol Rev 2009, 227 (1), 75-86. 278. Broz, P.; Monack, D. M., Newly described pattern recognition receptors team up against intracellular pathogens. Nat Rev Immunol 2013, 13 (8), 551-65. 279. Gao, D.; Wu, J.; Wu, Y. T.; Du, F.; Aroh, C.; Yan, N.; Sun, L.; Chen, Z. J., Cyclic GMP-AMP synthase is an innate immune sensor of HIV and other retroviruses. Science 2013, 341 (6148), 903-6. 280. Takaoka, A.; Wang, Z.; Choi, M. K.; Yanai, H.; Negishi, H.; Ban, T.; Lu, Y.; Miyagishi, M.; Kodama, T.; Honda, K.; Ohba, Y.; Taniguchi, T., DAI (DLM-1/ZBP1) is a cytosolic DNA sensor and an activator of innate immune response. Nature 2007, 448 (7152), 501-5. 281. Chow, K. T.; Gale, M., Jr.; Loo, Y. M., RIG-I and Other RNA Sensors in Antiviral Immunity. Annu Rev Immunol 2018, 36, 667-694.

282. Liu, X.; Wang, C., The emerging roles of the STING adaptor protein in immunity and diseases. Immunology 2016, 147 (3), 285-91. 283. Hengge, R., Principles of c-di-GMP signalling in bacteria. Nat Rev Microbiol 2009, 7 (4), 263-73. 284. Burdette, D. L.; Monroe, K. M.; Sotelo-Troha, K.; Iwig, J. S.; Eckert, B.; Hyodo, M.; Hayakawa, Y.; Vance, R. E., STING is a direct innate immune sensor of cyclic di-GMP. Nature 2011, 478 (7370), 515-8. 285. Moore, C. B.; Bergstralh, D. T.; Duncan, J. A.; Lei, Y.; Morrison, T. E.; Zimmermann, A. G.; Accavitti-Loper, M. A.; Madden, V. J.; Sun, L.; Ye, Z.; Lich, J. D.; Heise, M. T.; Chen, Z.; Ting, J. P., NLRX1 is a regulator of mitochondrial antiviral immunity. Nature 2008, 451 (7178), 573-7. 286. Zhang, Z.; Bao, M.; Lu, N.; Weng, L.; Yuan, B.; Liu, Y. J., The E3 ubiquitin ligase TRIM21 negatively regulates the innate immune response to intracellular double-stranded DNA. Nat Immunol 2013, 14 (2), 172-8. 287. Lee, K. G.; Kim, S. S.; Kui, L.; Voon, D. C.; Mauduit, M.; Bist, P.; Bi, X.; Pereira, N. A.; Liu, C.; Sukumaran, B.; Renia, L.; Ito, Y.; Lam, K. P., Bruton's tyrosine kinase phosphorylates DDX41 and activates its binding of dsDNA and STING to initiate type 1 interferon response. Cell Rep 2015, 10 (7), 1055-65. 288. Khan, W. N.; Sideras, P.; Rosen, F. S.; Alt, F. W., The role of Bruton's tyrosine kinase in B-cell development and function in mice and man. Ann N Y Acad Sci 1995, 764, 27-38. 289. Quynh, N. T.; Hikima, J.; Kim, Y. R.; Fagutao, F. F.; Kim, M. S.; Aoki, T.; Jung, T. S., The cytosolic sensor, DDX41, activates antiviral and inflammatory immunity in response to stimulation with double-stranded DNA adherent cells of the olive flounder, Paralichthys olivaceus. Fish Shellfish Immunol 2015, 44 (2), 576-83. 290. Fuller-Pace, F. V., DEAD box RNA helicase functions in cancer. RNA Biol 2013, 10 (1), 121-32. 291. Ding, L.; Ley, T. J.; Larson, D. E.; Miller, C. A.; Koboldt, D. C.; Welch, J. S.; Ritchey, J. K.; Young, M. 
A.; Lamprecht, T.; McLellan, M. D.; McMichael, J. F.; Wallis, J. W.; Lu, C.; Shen, D.; Harris, C. C.; Dooling, D. J.; Fulton, R. S.; Fulton, L. L.; Chen, K.; Schmidt, H.; Kalicki-Veizer, J.; Magrini, V. J.; Cook, L.; McGrath, S. D.; Vickery, T. L.; Wendl, M. C.; Heath, S.; Watson, M. A.; Link, D. C.; Tomasson, M. H.; Shannon, W. D.; Payton, J. E.; Kulkarni, S.; Westervelt, P.; Walter, M. J.; Graubert, T. A.; Mardis, E. R.; Wilson, R. K.; DiPersio, J. F., Clonal evolution in relapsed acute myeloid leukaemia revealed by whole-genome sequencing. Nature 2012, 481 (7382), 506-10. 292. Peters, D.; Radine, C.; Reese, A.; Budach, W.; Sohn, D.; Janicke, R. U., The DEAD-box RNA helicase DDX41 is a novel repressor of p21(WAF1/CIP1) mRNA translation. J Biol Chem 2017, 292 (20), 8331-8341. 293. Kittler, R.; Pelletier, L.; Heninger, A. K.; Slabicki, M.; Theis, M.; Miroslaw, L.; Poser, I.; Lawo, S.; Grabner, H.; Kozak, K.; Wagner, J.; Surendranath, V.; Richter, C.; Bowen, W.; Jackson, A. L.; Habermann, B.; Hyman, A. A.; Buchholz, F., Genome-scale
RNAi profiling of cell division in human tissue culture cells. Nat Cell Biol 2007, 9 (12), 1401-12. 294. Chao, C. H.; Chen, C. M.; Cheng, P. L.; Shih, J. W.; Tsou, A. P.; Lee, Y. H., DDX3, a DEAD box RNA helicase with tumor growth-suppressive property and transcriptional regulation activity of the p21waf1/cip1 promoter, is a candidate tumor suppressor. Cancer Res 2006, 66 (13), 6579-88. 295. Botlagunta, M.; Vesuna, F.; Mironchik, Y.; Raman, A.; Lisok, A.; Winnard, P., Jr.; Mukadam, S.; Van Diest, P.; Chen, J. H.; Farabaugh, P.; Patel, A. H.; Raman, V., Oncogenic role of DDX3 in breast cancer biogenesis. Oncogene 2008, 27 (28), 3912-22. 296. Deininger, M. W. N.; Tyner, J. W.; Solary, E., Turning the tide in myelodysplastic/myeloproliferative neoplasms. Nat Rev Cancer 2017, 17 (7), 425-440. 297. Corey, S. J.; Minden, M. D.; Barber, D. L.; Kantarjian, H.; Wang, J. C.; Schimmer, A. D., Myelodysplastic syndromes: the complexity of stem-cell diseases. Nat Rev Cancer 2007, 7 (2), 118-29. 298. Antony-Debre, I.; Manchev, V. T.; Balayn, N.; Bluteau, D.; Tomowiak, C.; Legrand, C.; Langlois, T.; Bawa, O.; Tosca, L.; Tachdjian, G.; Leheup, B.; Debili, N.; Plo, I.; Mills, J. A.; French, D. L.; Weiss, M. J.; Solary, E.; Favier, R.; Vainchenker, W.; Raslova, H., Level of RUNX1 activity is critical for leukemic predisposition but not for thrombocytopenia. Blood 2015, 125 (6), 930-40. 299. Owen, C. J.; Toze, C. L.; Koochin, A.; Forrest, D. L.; Smith, C. A.; Stevens, J. M.; Jackson, S. C.; Poon, M. C.; Sinclair, G. D.; Leber, B.; Johnson, P. R.; Macheta, A.; Yin, J. A.; Barnett, M. J.; Lister, T. A.; Fitzgibbon, J., Five new pedigrees with inherited RUNX1 mutations causing familial platelet disorder with propensity to myeloid malignancy. Blood 2008, 112 (12), 4639-45. 300. Hahn, C. N.; Chong, C. E.; Carmichael, C. L.; Wilkins, E. J.; Brautigan, P. J.; Li, X. C.; Babic, M.; Lin, M.; Carmagnac, A.; Lee, Y. K.; Kok, C. H.; Gagliardi, L.; Friend, K. L.; Ekert, P. G.; Butcher, C. 
M.; Brown, A. L.; Lewis, I. D.; To, L. B.; Timms, A. E.; Storek, J.; Moore, S.; Altree, M.; Escher, R.; Bardy, P. G.; Suthers, G. K.; D'Andrea, R. J.; Horwitz, M. S.; Scott, H. S., Heritable GATA2 mutations associated with familial myelodysplastic syndrome and acute myeloid leukemia. Nat Genet 2011, 43 (10), 1012-7. 301. Makishima, H.; Visconte, V.; Sakaguchi, H.; Jankowska, A. M.; Abu Kar, S.; Jerez, A.; Przychodzen, B.; Bupathi, M.; Guinta, K.; Afable, M. G.; Sekeres, M. A.; Padgett, R. A.; Tiu, R. V.; Maciejewski, J. P., Mutations in the spliceosome machinery, a novel and ubiquitous pathway in leukemogenesis. Blood 2012, 119 (14), 3203-10. 302. Przychodzen, B.; Jerez, A.; Guinta, K.; Sekeres, M. A.; Padgett, R.; Maciejewski, J. P.; Makishima, H., Patterns of missplicing due to somatic U2AF1 mutations in myeloid neoplasms. Blood 2013, 122 (6), 999-1006. 303. Nikpour, M.; Scharenberg, C.; Liu, A.; Conte, S.; Karimi, M.; Mortera-Blanco, T.; Giai, V.; Fernandez-Mercado, M.; Papaemmanuil, E.; Hogstrand, K.; Jansson, M.; Vedin, I.; Wainscoat, J. S.; Campbell, P.; Cazzola, M.; Boultwood, J.; Grandien, A.; Hellstrom-Lindberg, E., The transporter ABCB7 is a mediator of the phenotype of acquired refractory anemia with ring sideroblasts. Leukemia 2013, 27 (4), 889-896.

304. Kim, E.; Ilagan, J. O.; Liang, Y.; Daubner, G. M.; Lee, S. C.; Ramakrishnan, A.; Li, Y.; Chung, Y. R.; Micol, J. B.; Murphy, M. E.; Cho, H.; Kim, M. K.; Zebari, A. S.; Aumann, S.; Park, C. Y.; Buonamici, S.; Smith, P. G.; Deeg, H. J.; Lobry, C.; Aifantis, I.; Modis, Y.; Allain, F. H.; Halene, S.; Bradley, R. K.; Abdel-Wahab, O., SRSF2 Mutations Contribute to Myelodysplasia by Mutant-Specific Effects on Exon Recognition. Cancer Cell 2015, 27 (5), 617-30. 305. Yoshida, K.; Sanada, M.; Shiraishi, Y.; Nowak, D.; Nagata, Y.; Yamamoto, R.; Sato, Y.; Sato-Otsubo, A.; Kon, A.; Nagasaki, M.; Chalkidis, G.; Suzuki, Y.; Shiosaka, M.; Kawahata, R.; Yamaguchi, T.; Otsu, M.; Obara, N.; Sakata-Yanagimoto, M.; Ishiyama, K.; Mori, H.; Nolte, F.; Hofmann, W. K.; Miyawaki, S.; Sugano, S.; Haferlach, C.; Koeffler, H. P.; Shih, L. Y.; Haferlach, T.; Chiba, S.; Nakauchi, H.; Miyano, S.; Ogawa, S., Frequent pathway mutations of splicing machinery in myelodysplasia. Nature 2011, 478 (7367), 64-9. 306. Mupo, A.; Seiler, M.; Sathiaseelan, V.; Pance, A.; Yang, Y.; Agrawal, A. A.; Iorio, F.; Bautista, R.; Pacharne, S.; Tzelepis, K.; Manes, N.; Wright, P.; Papaemmanuil, E.; Kent, D. G.; Campbell, P. C.; Buonamici, S.; Bolli, N.; Vassiliou, G. S., Hemopoietic-specific Sf3b1-K700E knock-in mice display the splicing defect seen in human MDS but develop anemia without ring sideroblasts. Leukemia 2017, 31 (3), 720-727. 307. Hirsch, C. M.; Przychodzen, B. P.; Radivoyevitch, T.; Patel, B.; Thota, S.; Clemente, M. J.; Nagata, Y.; LaFramboise, T.; Carraway, H. E.; Nazha, A.; Sekeres, M. A.; Makishima, H.; Maciejewski, J. P., Molecular features of early onset adult myelodysplastic syndrome. Haematologica 2017, 102 (6), 1028-1034. 308. Chen, L.; Chen, J. Y.; Huang, Y. J.; Gu, Y.; Qiu, J.; Qian, H.; Shao, C.; Zhang, X.; Hu, J.; Li, H.; He, S.; Zhou, Y.; Abdel-Wahab, O.; Zhang, D. E.; Fu, X. 
D., The Augmented R-Loop Is a Unifying Mechanism for Myelodysplastic Syndromes Induced by High-Risk Splicing Factor Mutations. Mol Cell 2018, 69 (3), 412-425 e6. 309. Cheah, J. J. C.; Hahn, C. N.; Hiwase, D. K.; Scott, H. S.; Brown, A. L., Myeloid neoplasms with germline DDX41 mutation. Int J Hematol 2017, 106 (2), 163-174. 310. Khan, S. N.; Jankowska, A. M.; Mahfouz, R.; Dunbar, A. J.; Sugimoto, Y.; Hosono, N.; Hu, Z.; Cheriyath, V.; Vatolin, S.; Przychodzen, B.; Reu, F. J.; Saunthararajah, Y.; O'Keefe, C.; Sekeres, M. A.; List, A. F.; Moliterno, A. R.; McDevitt, M. A.; Maciejewski, J. P.; Makishima, H., Multiple mechanisms deregulate EZH2 and histone H3 lysine 27 epigenetic changes in myeloid malignancies. Leukemia 2013, 27 (6), 1301-9. 311. Byrd, A. K.; Raney, K. D., A parallel quadruplex DNA is bound tightly but unfolded slowly by pif1 helicase. J Biol Chem 2015, 290 (10), 6482-94. 312. Jankowsky, E.; Putnam, A., Duplex unwinding with DEAD-box proteins. Methods Mol Biol 2010, 587, 245-64. 313. Heddi, B.; Cheong, V. V.; Martadinata, H.; Phan, A. T., Insights into G-quadruplex specific recognition by the DEAH-box helicase RHAU: Solution structure of a peptide-quadruplex complex. Proc Natl Acad Sci U S A 2015, 112 (31), 9608-13.

314. You, H.; Lattmann, S.; Rhodes, D.; Yan, J., RHAU helicase stabilizes G4 in its nucleotide-free state and destabilizes G4 upon ATP hydrolysis. Nucleic Acids Res 2017, 45 (1), 206-214. 315. Sen, D.; Gilbert, W., A sodium-potassium switch in the formation of four-stranded G4-DNA. Nature 1990, 344 (6265), 410-4. 316. Tanaka, N.; Schwer, B., Mutations in PRP43 that uncouple RNA-dependent NTPase activity and pre-mRNA splicing function. Biochemistry 2006, 45 (20), 6510-21. 317. Srinivasan, S.; Liu, Z.; Chuenchor, W.; Xiao, T. S.; Jankowsky, E., Function of Auxiliary Domains of the DEAH/RHA Helicase DHX36 in RNA Remodeling. J Mol Biol 2020. 318. Jankowsky, E.; Harris, M. E., Specificity and nonspecificity in RNA-protein interactions. Nat Rev Mol Cell Biol 2015, 16 (9), 533-44. 319. Martin, A.; Schneider, S.; Schwer, B., Prp43 is an essential RNA-dependent ATPase required for release of lariat-intron from the spliceosome. J Biol Chem 2002, 277 (20), 17743-50. 320. Schwer, B.; Gross, C. H., Prp22, a DExH-box RNA helicase, plays two distinct roles in yeast pre-mRNA splicing. EMBO J 1998, 17 (7), 2086-94. 321. Wagner, J. D.; Jankowsky, E.; Company, M.; Pyle, A. M.; Abelson, J. N., The DEAH-box protein PRP22 is an ATPase that mediates ATP-dependent mRNA release from the spliceosome and unwinds RNA duplexes. EMBO J 1998, 17 (10), 2926-37. 322. O'Day, C. L.; Dalbadie-McFarland, G.; Abelson, J., The Saccharomyces cerevisiae Prp5 protein has RNA-dependent ATPase activity with specificity for U2 small nuclear RNA. J Biol Chem 1996, 271 (52), 33261-7. 323. Strauss, E. J.; Guthrie, C., PRP28, a 'DEAD-box' protein, is required for the first step of mRNA splicing in vitro. Nucleic Acids Res 1994, 22 (15), 3187-93. 324. Xu, D.; Nouraini, S.; Field, D.; Tang, S. J.; Friesen, J. D., An RNA-dependent ATPase associated with U2/U6 snRNAs in pre-mRNA splicing. Nature 1996, 381 (6584), 709-13. 325. 
Schneider, S.; Campodonico, E.; Schwer, B., Motifs IV and V in the DEAH box splicing factor Prp22 are important for RNA unwinding, and helicase-defective Prp22 mutants are suppressed by Prp8. J Biol Chem 2004, 279 (10), 8617-26. 326. Kim, D. H.; Rossi, J. J., The first ATPase domain of the yeast 246-kDa protein is required for in vivo unwinding of the U4/U6 duplex. RNA 1999, 5 (7), 959-71. 327. Edwalds-Gilbert, G.; Kim, D. H.; Kim, S. H.; Tseng, Y. H.; Yu, Y.; Lin, R. J., Dominant negative mutants of the yeast splicing factor Prp2 map to a putative cleft region in the helicase domain of DExD/H-box proteins. RNA 2000, 6 (8), 1106-19. 328. Tippana, R.; Hwang, H.; Opresko, P. L.; Bohr, V. A.; Myong, S., Single-molecule imaging reveals a common mechanism shared by G-quadruplex-resolving helicases. Proc Natl Acad Sci U S A 2016, 113 (30), 8448-53.

329. Tippana, R.; Xiao, W.; Myong, S., G-quadruplex conformation and dynamics are determined by loop length and sequence. Nucleic Acids Res 2014, 42 (12), 8106-14. 330. Boneberg, F. M.; Brandmann, T.; Kobel, L.; van den Heuvel, J.; Bargsten, K.; Bammert, L.; Kutay, U.; Jinek, M., Molecular mechanism of the RNA helicase DHX37 and its activation by UTP14A in ribosome biogenesis. RNA 2019, 25 (6), 685-701. 331. Tauchert, M. J.; Ficner, R., Structural analysis of the spliceosomal RNA helicase Prp28 from the thermophilic eukaryote Chaetomium thermophilum. Acta Crystallogr F Struct Biol Commun 2016, 72 (Pt 5), 409-16. 332. Robert-Paganin, J.; Halladjian, M.; Blaud, M.; Lebaron, S.; Delbos, L.; Chardon, F.; Capeyrou, R.; Humbert, O.; Henry, Y.; Henras, A. K.; Rety, S.; Leulliot, N., Functional link between DEAH/RHA helicase Prp43 activation and ATP base binding. Nucleic Acids Res 2016. 333. Tanner, N. K., The newly identified Q motif of DEAD box helicases is involved in adenine recognition. Cell Cycle 2003, 2 (1), 18-9. 334. Tanaka, N.; Schwer, B., Characterization of the NTPase, RNA-binding, and RNA helicase activities of the DEAH-box splicing factor Prp22. Biochemistry 2005, 44 (28), 9795-803. 335. Johnson, K. A.; Simpson, Z. B.; Blom, T., Global kinetic explorer: a new computer program for dynamic simulation and fitting of kinetic data. Anal Biochem 2009, 387 (1), 20-9. 336. Shugar, D., The NTP phosphate donor in kinase reactions: is ATP a monopolist? Acta Biochim Pol 1996, 43 (1), 9-23. 337. Traut, T. W., Physiological concentrations of purines and pyrimidines. Mol Cell Biochem 1994, 140 (1), 1-22. 338. Smaldino, P. J.; Routh, E. D.; Kim, J. H.; Giri, B.; Creacy, S. D.; Hantgan, R. R.; Akman, S. A.; Vaughn, J. P., Mutational Dissection of Telomeric DNA Binding Requirements of G4 Resolvase 1 Shows that G4-Structure and Certain 3'-Tail Sequences Are Sufficient for Tight and Complete Binding. PLoS One 2015, 10 (7), e0132668. 339. Henn, A.; Cao, W.; Hackney, D. 
D.; De La Cruz, E. M., The ATPase cycle mechanism of the DEAD-box rRNA helicase, DbpA. J Mol Biol 2008, 377 (1), 193-205. 340. Lorsch, J. R.; Herschlag, D., The DEAD box protein eIF4A. 2. A cycle of nucleotide and RNA-dependent conformational changes. Biochemistry 1998, 37 (8), 2194-206. 341. Weir, J. R.; Bonneau, F.; Hentschel, J.; Conti, E., Structural analysis reveals the characteristic features of Mtr4, a DExH helicase involved in nuclear RNA processing and surveillance. Proc Natl Acad Sci U S A 2010, 107 (27), 12139-44. 342. Andersen, C. B.; Ballut, L.; Johansen, J. S.; Chamieh, H.; Nielsen, K. H.; Oliveira, C. L.; Pedersen, J. S.; Seraphin, B.; Le Hir, H.; Andersen, G. R., Structure of the exon junction core complex with a trapped DEAD-box ATPase bound to RNA. Science 2006, 313 (5795), 1968-72.

343. Shuman, S., Vaccinia virus RNA helicase: an essential enzyme related to the DE-H family of RNA-dependent NTPases. Proc Natl Acad Sci U S A 1992, 89 (22), 10935-9. 344. Schutz, P.; Wahlberg, E.; Karlberg, T.; Hammarstrom, M.; Collins, R.; Flores, A.; Schuler, H., Crystal structure of human RNA helicase A (DHX9): structural basis for unselective nucleotide base binding in a DEAD-box variant protein. J Mol Biol 2010, 400 (4), 768-82. 345. Patrick, E. M.; Srinivasan, S.; Jankowsky, E.; Comstock, M. J., The RNA helicase Mtr4p is a duplex-sensing translocase. Nat Chem Biol 2017, 13 (1), 99-104. 346. He, Y.; Andersen, G. R.; Nielsen, K. H., The function and architecture of DEAH/RHA helicases. Biomol Concepts 2011, 2 (4), 315-26. 347. Muzzolini, L.; Beuron, F.; Patwardhan, A.; Popuri, V.; Cui, S.; Niccolini, B.; Rappas, M.; Freemont, P. S.; Vindigni, A., Different quaternary structures of human RECQ1 are associated with its dual enzymatic activity. PLoS Biol 2007, 5 (2), e20. 348. Patel, S. S.; Donmez, I., Mechanisms of helicases. J Biol Chem 2006, 281 (27), 18265-8. 349. Levin, M. K.; Wang, Y. H.; Patel, S. S., The functional interaction of the hepatitis C virus helicase molecules is responsible for unwinding processivity. J Biol Chem 2004, 279 (25), 26005-12. 350. Maluf, N. K.; Ali, J. A.; Lohman, T. M., Kinetic mechanism for formation of the active, dimeric UvrD helicase-DNA complex. J Biol Chem 2003, 278 (34), 31930-40. 351. Tackett, A. J.; Chen, Y.; Cameron, C. E.; Raney, K. D., Multiple full-length NS3 molecules are required for optimal unwinding of oligonucleotide DNA in vitro. J Biol Chem 2005, 280 (11), 10797-806. 352. Xu, H. Q.; Deprez, E.; Zhang, A. H.; Tauc, P.; Ladjimi, M. M.; Brochon, J. C.; Auclair, C.; Xi, X. G., The Escherichia coli RecQ helicase functions as a monomer. J Biol Chem 2003, 278 (37), 34925-33. 353. Ali, J. A.; Maluf, N. K.; Lohman, T. M., An oligomeric form of E. coli UvrD is required for optimal helicase activity. 
J Mol Biol 1999, 293 (4), 815-34. 354. Byrd, A. K.; Raney, K. D., Increasing the length of the single-stranded overhang enhances unwinding of duplex DNA by bacteriophage T4 Dda helicase. Biochemistry 2005, 44 (39), 12990-7. 355. Moradian-Oldak, J.; Leung, W.; Fincham, A. G., Temperature and pH-dependent supramolecular self-assembly of amelogenin molecules: a dynamic light-scattering analysis. J Struct Biol 1998, 122 (3), 320-7. 356. Mechanic, L. E.; Hall, M. C.; Matson, S. W., Escherichia coli DNA helicase II is active as a monomer. J Biol Chem 1999, 274 (18), 12488-98. 357. Wong, I.; Lohman, T. M., A two-site mechanism for ATP hydrolysis by the asymmetric Rep dimer P2S as revealed by site-specific inhibition with ADP-AlF4. Biochemistry 1997, 36 (11), 3115-25.

358. Jacobs, A. M.; Nicol, S. M.; Hislop, R. G.; Jaffray, E. G.; Hay, R. T.; Fuller-Pace, F. V., SUMO modification of the DEAD box protein p68 modulates its transcriptional activity and promotes its interaction with HDAC1. Oncogene 2007, 26 (40), 5866-76. 359. Decker, C. J.; Parker, R., P-bodies and stress granules: possible roles in the control of translation and mRNA degradation. Cold Spring Harb Perspect Biol 2012, 4 (9), a012286. 360. Anderson, P.; Kedersha, N., RNA granules: post-transcriptional and epigenetic modulators of gene expression. Nat Rev Mol Cell Biol 2009, 10 (6), 430-6. 361. King, A. E.; Blizzard, C. A.; Southam, K. A.; Vickers, J. C.; Dickson, T. C., Degeneration of axons in spinal white matter in G93A mSOD1 mouse characterized by NFL and alpha-internexin immunoreactivity. Brain Res 2012, 1465, 90-100. 362. Schwartz, J. C.; Podell, E. R.; Han, S. S.; Berry, J. D.; Eggan, K. C.; Cech, T. R., FUS is sequestered in nuclear aggregates in ALS patient fibroblasts. Mol Biol Cell 2014, 25 (17), 2571-8. 363. Schwartz, J. C.; Wang, X.; Podell, E. R.; Cech, T. R., RNA seeds higher-order assembly of FUS protein. Cell Rep 2013, 5 (4), 918-25. 364. Weber, S. C.; Brangwynne, C. P., Getting RNA and protein in phase. Cell 2012, 149 (6), 1188-91. 365. Saito, M.; Hess, D.; Eglinger, J.; Fritsch, A. W.; Kreysing, M.; Weinert, B. T.; Choudhary, C.; Matthias, P., Acetylation of intrinsically disordered regions regulates phase separation. Nat Chem Biol 2019, 15 (1), 51-61. 366. Byrd, A. K.; Zybailov, B. L.; Maddukuri, L.; Gao, J.; Marecki, J. C.; Jaiswal, M.; Bell, M. R.; Griffin, W. C.; Reed, M. R.; Chib, S.; Mackintosh, S. G.; MacNicol, A. M.; Baldini, G.; Eoff, R. L.; Raney, K. D., Evidence That G-quadruplex DNA Accumulates in the Cytoplasm and Participates in Stress Granule Assembly in Response to Oxidative Stress. J Biol Chem 2016, 291 (34), 18041-57. 367. Zhang, Y.; Yang, M.; Duncan, S.; Yang, X.; Abdelhamid, M. A. S.; Huang, L.; Zhang, H.; Benfey, P. 
N.; Waller, Z. A. E.; Ding, Y., G-quadruplex structures trigger RNA phase separation. Nucleic Acids Res 2019, 47 (22), 11746-11754. 368. Yoneyama-Hirozane, M.; Kondo, M.; Matsumoto, S. I.; Morikawa-Oki, A.; Morishita, D.; Nakanishi, A.; Kawamoto, T.; Nakayama, M., High-Throughput Screening to Identify Inhibitors of DEAD Box Helicase DDX41. SLAS Discov 2017, 2472555217705952. 369. Pause, A.; Methot, N.; Sonenberg, N., The HRIGRXXR region of the DEAD box RNA helicase eukaryotic translation initiation factor 4A is required for RNA binding and ATP hydrolysis. Mol Cell Biol 1993, 13 (11), 6789-98. 370. Anoosha, P.; Sakthivel, R.; Michael Gromiha, M., Exploring preferred amino acid mutations in cancer genes: Applications to identify potential drug targets. Biochim Biophys Acta 2016, 1862 (2), 155-65.

371. Szpiech, Z. A.; Strauli, N. B.; White, K. A.; Ruiz, D. G.; Jacobson, M. P.; Barber, D. L.; Hernandez, R. D., Prominent features of the amino acid mutation landscape in cancer. PLoS One 2017, 12 (8), e0183273. 372. Tan, H.; Bao, J.; Zhou, X., Genome-wide mutational spectra analysis reveals significant cancer-specific heterogeneity. Sci Rep 2015, 5, 12566. 373. Cardone, R. A.; Casavola, V.; Reshkin, S. J., The role of disturbed pH dynamics and the Na+/H+ exchanger in metastasis. Nat Rev Cancer 2005, 5 (10), 786-95. 374. DiGiammarino, E. L.; Lee, A. S.; Cadwell, C.; Zhang, W.; Bothner, B.; Ribeiro, R. C.; Zambetti, G.; Kriwacki, R. W., A novel mechanism of tumorigenesis involving pH-dependent destabilization of a mutant p53 tetramer. Nat Struct Biol 2002, 9 (1), 12-6. 375. White, K. A.; Ruiz, D. G.; Szpiech, Z. A.; Strauli, N. B.; Hernandez, R. D.; Jacobson, M. P.; Barber, D. L., Cancer-associated arginine-to-histidine mutations confer a gain in pH sensing to mutant proteins. Sci Signal 2017, 10 (495). 376. Otwinowski, Z.; Minor, W., Processing of X-ray diffraction data collected in oscillation mode. Methods Enzymol 1997, 276, 307-26. 377. Kabsch, W., Xds. Acta Crystallogr D Biol Crystallogr 2010, 66 (Pt 2), 125-32. 378. Emsley, P.; Cowtan, K., Coot: model-building tools for molecular graphics. Acta Crystallogr D Biol Crystallogr 2004, 60 (Pt 12 Pt 1), 2126-32. 379. Adams, P. D.; Grosse-Kunstleve, R. W.; Hung, L. W.; Ioerger, T. R.; McCoy, A. J.; Moriarty, N. W.; Read, R. J.; Sacchettini, J. C.; Sauter, N. K.; Terwilliger, T. C., PHENIX: building new software for automated crystallographic structure determination. Acta Crystallogr D Biol Crystallogr 2002, 58 (Pt 11), 1948-54. 380. Chen, V. B.; Arendall, W. B., 3rd; Headd, J. J.; Keedy, D. A.; Immormino, R. M.; Kapral, G. J.; Murray, L. W.; Richardson, J. S.; Richardson, D. C., MolProbity: all-atom structure validation for macromolecular crystallography. Acta Crystallogr D Biol Crystallogr 2010, 66 (Pt 1), 12-21. 
381. Putnam, A. A.; Jankowsky, E., AMP sensing by DEAD-box RNA helicases. J Mol Biol 2013, 425 (20), 3839-45. 382. Corriveau, M.; Mullins, M. R.; Baus, D.; Harris, M. E.; Taylor, D. J., Coordinated interactions of multiple POT1-TPP1 proteins with telomere DNA. J Biol Chem 2013, 288 (23), 16361-70. 383. Putnam, A. A.; Gao, Z.; Liu, F.; Jia, H.; Yang, Q.; Jankowsky, E., Division of Labor in an Oligomer of the DEAD-Box RNA Helicase Ded1p. Mol Cell 2015, 59 (4), 541-52. 384. Karplus, P. A.; Diederichs, K., Linking crystallographic model and data quality. Science 2012, 336 (6084), 1030-3.
