Structural Studies of C9orf72-SMCR8-WDR41 Complex

Valeria Shkuratova

Department of Biochemistry, McGill University, Montreal

A thesis submitted to McGill University in partial fulfillment of the requirements of the degree of Master of Science

© Valeria Shkuratova, 2020

Table of Contents

Abstract
Résumé
Acknowledgment
Author Contribution
List of Abbreviations
List of Figures
List of Tables
Introduction
  1. Amyotrophic Lateral Sclerosis (ALS)
  2. Frontotemporal Dementia (FTD)
  3. C9orf72 gene mutation
  4. Cellular functions of C9orf72 protein and formation of CSW complex
  5. CSW structure
  6. Project goals
Results
  1. Purification optimization
    1.1. Optimization of expression and purification of His-tagged constructs
    1.2. Optimization of expression and purification of GFP-tagged constructs
  2. C9orf72-SMCR8-WDR41 complex forms a trimer
  3. Initial crystallization trials for CSW and CS complexes
  4. HDX-MS analysis for CS and CSW
  5. CSW unstructured regions are essential for the formation of the complex
  6. Crystallization trials for CSW mutants
Discussion
Materials and Methods
  1. Protein constructs
  2. Cloning into pFastBac
    2.1. Restriction enzyme digestion
    2.2. Ligation and transformation
  3. Protein expression in Sf9 insect cells
  4. Protein purification with Ni-affinity beads
  5. Expression and purification of GFPnb
  6. Preparation of GFPnb-coupled beads
  7. Generation of GFP tagged constructs
    7.1. C9orf72-SMCR8 construct
    7.2. C9orf72-SMCR8-WDR41 construct
  8. Protein purification with GFPnb-coupled beads
  9. Sedimentation-Velocity Analytical Ultracentrifugation (SV-AUC)
  10. Hydrogen-Deuterium Exchange Mass Spectrometry (HDX-MS)
  11. Generation of deletion mutants for CSW complex
References

Abstract

Hexanucleotide repeat expansions in the C9orf72 gene are the leading cause of two neurodegenerative diseases: amyotrophic lateral sclerosis and frontotemporal dementia. Previous studies have identified that C9orf72 binds SMCR8 and WDR41 to form a stable complex, which was shown to be important during autophagy in neuronal cells. However, the exact functions of the complex and its structural features are largely unknown. Here we established a purification protocol for the C9orf72-SMCR8-WDR41 protein complex. We constructed a single plasmid containing all three proteins for efficient expression in Sf9 insect cells. The two-step purification using GFP-nanobody-coupled beads was optimized to yield high quantities of pure trimeric complex (up to 7.5 mg per 1 L of cell culture). Crystallization trials were unsuccessful, suggesting that other techniques such as cryoEM are required for solving the structure of the complex. Studies using HDX-MS revealed that WDR41 binds to the SMCR8 DENN domain but not to C9orf72. However, conformational changes occur in both SMCR8 and C9orf72. Additionally, we identified unstructured regions within the SMCR8 linker region and WDR41 that could be important for complex assembly. All these results are in agreement with the recently published structure of the C9orf72-SMCR8-WDR41 complex.

Résumé

Les expansions répétées hexanucléotidiques du gène C9orf72 sont la principale cause du développement de deux maladies neurodégénératives : la sclérose latérale amyotrophique et la démence fronto-temporale. Des études antérieures ont montré que C9orf72 se lie à SMCR8 et WDR41 pour former un complexe stable. Ce complexe s’est révélé important lors de l’autophagie dans les cellules neuronales. Cependant, les fonctions exactes du complexe et ses caractéristiques structurelles sont en grande partie inconnues. Ici, nous avons établi un protocole de purification pour le complexe protéique C9orf72-SMCR8-WDR41. Nous avons construit un seul plasmide contenant les trois protéines pour une expression efficace dans les cellules d'insectes Sf9. La purification en deux étapes à l'aide de billes couplées à un nanocorps anti-GFP a été optimisée pour maximiser le rendement du complexe trimérique pur (jusqu’à 7,5 mg pour 1 L de culture cellulaire). Les essais de cristallisation ont échoué, ce qui suggère que d'autres techniques, telles que la cryoEM, sont nécessaires pour résoudre la structure du complexe. Des études utilisant HDX-MS ont révélé que WDR41 se lie au domaine DENN de SMCR8 mais pas à C9orf72. Cependant, les changements de conformation se produisent à la fois dans SMCR8 et C9orf72. De plus, nous avons identifié des régions non structurées au sein de la région de liaison de SMCR8 et de WDR41 qui peuvent être importantes pour l'assemblage du complexe. Tous ces résultats sont en accord avec la structure du complexe C9orf72-SMCR8-WDR41 récemment publiée.

Acknowledgment

I wish to express my gratitude to my supervisor Dr. Kalle Gehring for accepting me into his lab and allowing me to work on this new and exciting project. His constant involvement, immense knowledge, and valuable suggestions helped to drive my project forward. I would like to thank Dr. Guennadi Kozlov for training me when I joined the lab as a master’s student and for guiding me throughout the project. I would like to thank Katalin Kocsis Illes for joining me on the project and helping with insect cell cultures and virus generation. I also wish to thank my research advisory committee members Dr. Peter McPherson and Dr. Bhushan Nagar for their valuable insights and guidance along the way.

A special thank you to George Sung, Seby Chan, and Rayan Fakih for their support, guidance and help. I am also grateful to all past and present members of the Gehring lab for their warm welcome and support during these years. I am pleased that I had the opportunity to know them and to conduct my research in such a friendly environment.

I would like to acknowledge our collaborators on this project: Dr. John Burke and Brandon Moeller for carrying out HDX-MS experiments.

I am also extremely grateful to my family for always being there for me, cheering me up in the hardest moments and encouraging me to be my best.

Finally, I want to thank the NSERC and CIHR granting agencies for supporting this project.

Author Contribution

I designed the WDR41-strep-strep, His-C9orf72-SMCR8, and His-C9orf72-SMCR8-flag-HA pFB constructs and carried out expression and purification optimization for all protein complexes and individual proteins used in this project. I screened crystallization conditions, purified samples for HDX-MS, performed SV-AUC and wrote the manuscript.

Katalin Kocsis Illes maintained the Sf9 insect cell cultures and generated viruses for protein constructs used in this project. She also designed GFP tagged CS and CSW constructs used in this project.

Dr. John Burke and Brandon Moeller performed HDX-MS experiments.

Rayan Fakih helped with the SV-AUC data collection and analysis.

Dr. Guennadi Kozlov designed mutant constructs for CSW complex.

Dr. Kalle Gehring oversaw the project.

List of Abbreviations

AA - Amino Acid
ALS - Amyotrophic Lateral Sclerosis
bvFTD - behavioral Frontotemporal Dementia
C9orf72 - Chromosome 9 Open Reading Frame 72
CS - C9orf72-SMCR8
CSW - C9orf72-SMCR8-WDR41
CV - Column volume
DENN - Differentially Expressed in Normal and Neoplastic cells
DPR - Dipeptide repeat
E. coli - Escherichia coli
EDTA - Ethylenediaminetetraacetic acid
fALS - familial Amyotrophic Lateral Sclerosis
FTD - Frontotemporal Dementia
FUS - Fused in Sarcoma
GAP - GTPase activating protein
GEF - GDP/GTP exchange factor
GFP - Green Fluorescent Protein
GFPnb - GFP nanobody
GRN - Progranulin
HDX-MS - Hydrogen-Deuterium Exchange Mass Spectrometry
IPTG - Isopropyl β-d-1-thiogalactopyranoside
LB - Lysogeny broth
MAPT - Microtubule-Associated Protein Tau
PCR - Polymerase chain reaction
pFB plasmid - pFastBac plasmid
RAN translation - Repeat-Associated non-AUG translation
sALS - sporadic Amyotrophic Lateral Sclerosis
SDS-PAGE or SDS - Sodium dodecyl sulphate polyacrylamide gel electrophoresis
SEC - Size Exclusion Chromatography
Sf9 - Spodoptera frugiperda 9 cells
SMCR8 - Smith-Magenis syndrome chromosome region candidate 8
SOD1 - Superoxide Dismutase 1
SV-AUC - Sedimentation-Velocity Analytical Ultracentrifugation
TARDBP - Transactive Response DNA-Binding Protein
TCEP - Tris(2-carboxyethyl)phosphine
TEV - Tobacco Etch Virus
Tris - Tris(hydroxymethyl)aminomethane
ULK1 - Unc-51 Like Autophagy Activating Kinase 1
WDR41 - WD repeat-containing protein 41
WT - Wild-type

List of Figures

Figure 1: Percentage distribution of ALS associated mutations in patients
Figure 2: Schematic representation of human C9orf72 gene mutation on chromosome 9 and proposed pathways contributing to neurodegeneration
Figure 3: Domain organization of C9orf72, SMCR8 and WDR41
Figure 4: Purification profile of CS complex using His-tag purification
Figure 5: Purification profile of CSW complex using His-tag purification
Figure 6: Purification profile of CSW complex after His-tag purification and size-exclusion chromatography
Figure 7: Purification profile of CSW and CS complexes when using GFPnb
Figure 8: C9orf72, SMCR8 and WDR41 complex formation
Figure 9: HDX-MS detection of unstructured regions within CSW complex
Figure 10: Conformational changes in C9orf72 and SMCR8 upon WDR41 binding detected by HDX-MS
Figure 11: HDX-MS analysis of C9orf72 and SMCR8 proteins in CS complex
Figure 12: Purification profile of CSW mutants with the deletion in the SMCR8 linker region
Figure 13: Purification profile of CSW mutants with the deletion of WDR41 flexible regions

List of Tables

Table 1: C9orf72, SMCR8, and WDR41 constructs used for cloning and protein expression
Table 2: Buffer condition used for protein purification during optimization
Table 3: CSW mutant constructs designed based on HDX-MS results
Table 4: Mutagenesis primers used to generate CSW deletion mutants

Introduction

1. Amyotrophic Lateral Sclerosis (ALS)

ALS was first described in 1869 by the French neurologist Jean-Martin Charcot, who identified the correlation between post-mortem anatomical abnormalities of the nervous system and clinical signs and classified ALS as a separate motor disease (Kumar, 2011). ALS affects upper and/or lower motor neurons and results in neurodegeneration and subsequent impairment of voluntary muscle movement (Tiryaki, 2014). Most often the first symptoms appear between 50 and 75 years of age, with one limb experiencing muscle weakness, atrophy, spasticity, hypo- or hyperreflexia, or fasciculations. As the disease progresses, the voluntary muscles responsible for moving, eating, speaking, and breathing become affected and eventually paralyzed (Ferraiuolo, 2011; Laferriere, 2015; Tiryaki, 2014). Most patients die within 2 to 5 years of diagnosis due to respiratory muscle failure (Larson, 2018; Tiryaki, 2014; Ferraiuolo, 2011; Laferriere, 2015), making ALS the neurodegenerative disease with the shortest survival time and highest mortality rate (Steenland, 2010). About 2-3 individuals per 100,000 are diagnosed with ALS worldwide, and these numbers are predicted to increase with the aging population (Arthur et al., 2016; Laferriere, 2015). Therefore, there is a growing interest in studying the progression mechanisms, risk factors, and possible treatments of ALS.

Genetics is one of the factors associated with ALS progression. Patients can be divided into two categories: those with a family history of ALS (familial ALS, fALS), who account for 5-10% of all cases, and those with no known family history (sporadic ALS, sALS), who account for the other 90-95% of cases (Belzil, 2016; Byrne, 2011; Laferriere, 2015). Among them, ~68% of fALS and 11% of sALS cases have identified genetic mutations (Belzil, 2016). There are ~30 genes, mutations of which are associated with ALS. The most commonly mutated genes in both familial and sporadic ALS are FUS, TARDBP, SOD1 and C9orf72 (Figure 1) (Evans, 2019; Laferriere, 2015; Tiryaki, 2014).
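As a rough illustration of what these percentages imply for ALS as a whole, the short Python sketch below combines the familial and sporadic shares quoted above. The function name and the weighted-sum approach are illustrative assumptions, not a calculation taken from the cited studies; the inputs are only the approximate figures from the text.

```python
def share_with_known_mutation(familial_fraction,
                              familial_hit=0.68,
                              sporadic_hit=0.11):
    """Overall fraction of ALS cases with an identified mutation.

    familial_fraction: share of all ALS cases that are familial (0.05-0.10 in the text).
    familial_hit / sporadic_hit: approximate shares of fALS and sALS cases with an
    identified mutation (~68 % and 11 % in the text).
    """
    sporadic_fraction = 1.0 - familial_fraction
    return familial_fraction * familial_hit + sporadic_fraction * sporadic_hit


# Evaluate both ends of the 5-10 % range quoted for familial ALS.
for familial_fraction in (0.05, 0.10):
    share = share_with_known_mutation(familial_fraction)
    print(f"fALS = {familial_fraction:.0%}: {share:.1%} of all ALS cases")
```

Under these assumptions only about one case in six or seven carries an identified mutation, which is consistent with the statement that most sALS cases are not associated with known mutations.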

Figure 1: Percentage distribution of ALS associated mutations in patients. Around 10 % of ALS patients have a known family history of neurodegenerative diseases; the others have an apparently sporadic form with no known family history. C9orf72 is the most common mutation among those identified. A. Distribution of mutations found in fALS patients shows that C9orf72 gene mutation is the most frequent. B. Distribution of gene mutations found in sALS patients. Most sALS cases are not associated with known mutations (Tiryaki, 2014).

FUS (Fused in Sarcoma) gene mutations account for 4 % of fALS and <1 % of sALS cases (Ferraiuolo, 2011; Laferriere, 2015). FUS is known to bind RNA and DNA and to participate in DNA repair and RNA processing; thus, it is mainly localized within the nucleus (Ferraiuolo, 2011). Mutations occur in exons 13-15, which code for the nuclear localization signal, leading to disruption of FUS transport and protein accumulation in the cytoplasm of affected motor neurons (Ferraiuolo, 2011). As a result, neurodegeneration occurs due to the loss of function of the FUS protein and/or highly ubiquitinated toxic cytoplasmic inclusions (Ferraiuolo, 2011).

The TARDBP gene encodes Transactive Response DNA-binding protein 43 kDa (TDP-43), the primary function of which is RNA processing (Laferriere, 2015). Thirty-five TARDBP mutations, which account for 5 % of fALS and <1 % of sALS cases, have been identified (Laferriere, 2015; Tiryaki, 2014). Similarly to FUS, these mutations lead to disruption of protein translocation to the nucleus and to TDP-43 aggregation and accumulation in the cytoplasm (Laferriere, 2015; Ferraiuolo, 2011). The formation of toxic aggregates and/or the loss of TDP-43 nuclear functions results in the degeneration of the affected neurons (Ferraiuolo, 2011).

SOD1 (Superoxide Dismutase 1) gene mutations were the first identified genetic causes of ALS, with 20% of fALS patients carrying mutations (Laferriere, 2015; Tiryaki, 2014). The gene codes for the SOD1 protein, the central role of which is the dismutation of free superoxide radicals (Banci, 2008). There are about 170 SOD1 mutations that occur along the entire protein sequence. These disrupt proper folding (formation of disulfide bonds and dimerization) and/or protein activation (binding of copper and zinc ions) (Banci, 2008; Laferriere, 2015). This results in the accumulation of misfolded and ubiquitinated SOD1 proteins that form ubiquitin-positive inclusions and cause neurotoxicity (Banci, 2008; Laferriere, 2015).

The C9orf72 (chromosome 9 open reading frame 72) gene mutation on chromosome 9p21 is characterized by a hexanucleotide repeat (GGGGCC) expansion in the first intron of the gene (Ling, 2013). In individuals with no ALS symptoms, the number of GGGGCC repeats is usually below 30. Patients with ALS symptoms have been identified with 45 to several thousand repeats, with numbers varying between patients (Bourinaris, 2018; Laferriere, 2015; Oskarsson, 2018). The penetrance of the expansion mutation is highly dependent on age and on the number of GGGGCC repeats. Recent studies showed that 99.5% of patients identified with the expansion develop ALS symptoms by the age of 83, although these numbers may not be accurate due to the low availability of data for asymptomatic carriers (Murphy, 2017). Expansions of varying length are observed in 20-60 % of fALS cases and 1-7 % of sALS cases, depending on the population, making the C9orf72 gene mutation the most common ALS mutation (Belzil, 2016; Corbier, 2017; Hodges, 2012; Oskarsson, 2018).
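The repeat-number ranges quoted above (usually below 30 in unaffected individuals, 45 or more in patients) can be made concrete with a small Python sketch that counts consecutive GGGGCC units in a DNA string. The function names, the toy sequence, and the wording of the three categories are illustrative assumptions; this is not a description of how repeat length is measured clinically, and the text gives no interpretation for counts between 30 and 44.

```python
import re

REPEAT_UNIT = "GGGGCC"


def longest_repeat_run(sequence, unit=REPEAT_UNIT):
    """Return the largest number of back-to-back copies of `unit` in `sequence`."""
    runs = re.findall(f"(?:{unit})+", sequence.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


def classify(repeat_count):
    """Map a repeat count onto the ranges quoted in the text."""
    if repeat_count < 30:
        return "typical of individuals without ALS symptoms"
    if repeat_count >= 45:
        return "within the range reported in ALS patients"
    return "between the two ranges quoted in the text"


# Toy sequence (hypothetical): five repeats, a spacer, then two more repeats.
toy = "ATG" + REPEAT_UNIT * 5 + "TTT" + REPEAT_UNIT * 2
count = longest_repeat_run(toy)
print(count, "->", classify(count))
```

The sketch reports the longest uninterrupted run rather than the total number of units, since the expansion is described as a single contiguous repeat tract.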

How this expansion leads to neurodegeneration is not known. Among the proposed mechanisms are the depletion of C9orf72 protein levels, the formation of toxic RNA foci, and translation of the upstream repeat (Ling, 2013). These will be discussed later.

ALS has no cure, although two drugs (Riluzole and Edaravone) were found to be effective in slowing disease progression (Bhandari, 2018; Ling, 2013). Riluzole mediates and inhibits presynaptic glutamate release, as some ALS patients were found to have an abnormal accumulation of glutamate that could cause neurotoxicity (Ling, 2013). In fact, clinical studies showed that Riluzole can extend the survival of ALS patients by up to 3 months (Ling, 2013; Tiryaki, 2014). Edaravone, on the other hand, functions as an antioxidant and prevents the development of oxidative stress caused by disease-associated mutations or external factors (Bhandari, 2018). Taking Edaravone showed a promising 33 % drop in the ALS progression rate (Bhandari, 2018). The main shortcoming of both drugs is that they do not target the underlying causes of cellular abnormalities, such as gene mutations, but rather alleviate specific cellular abnormalities and ALS symptoms. In part, this could be due to the limited knowledge about gene mutations and their consequences. Therefore, studying genetic mutations and the functions of the related proteins (especially C9orf72) can be critical for understanding the ALS mechanism and for developing new effective treatments capable of curing ALS.

2. Frontotemporal Dementia (FTD)

The neurodegenerative disorder FTD was first described in 1892 by the Czech neurologist Arnold Pick, who identified a patient with “progressive deterioration of language associated with left temporal lobe atrophy” (Olney, 2017). FTD is classified into two main types, behavioural FTD (bvFTD) and primary progressive aphasia, both of which involve neurodegeneration in the frontal and temporal lobes of the brain (Olney et al., 2017). Around 60 % of patients develop bvFTD and experience behavioural and cognitive changes (apathy, inertia, compulsiveness, loss of sympathy/empathy, judgment impairment) (Olney et al., 2017). The other 40 % develop aphasia, which is associated with language problems affecting speaking, reading, writing and understanding (Olney et al., 2017). Over time, patients with aphasia can also develop some bvFTD symptoms and vice versa. The frequency of FTD varies across populations and is around 1.61 – 4.1 individuals per 100,000, with the average age of onset between 45 and 65 years, although cases of early (<45) and late (>65) onset are not uncommon (Olney et al., 2017).

The causes of FTD development are unexplained in most cases. Gene mutations were identified in about 25% of familial FTD (fFTD) and 10% of sporadic FTD (sFTD) cases (Belzil, 2016). The three most commonly mutated genes that may cause autosomal dominant FTD inheritance are C9orf72, MAPT, and GRN (Olney, 2017; Hodges, 2012).

The MAPT (Microtubule-Associated Protein Tau) gene on chromosome 17 codes for Tau proteins (Olney, Spina, & Miller, 2017). There are 53 missense, silent, or deletion mutations in the gene that lead to a partial loss of Tau function and preferential production of 4R tau over 3R tau (Ghetti, 2015; Olney, 2017). This imbalance in the 3R:4R ratio is thought to cause aggregation of 4R tau in the cytoplasm and formation of insoluble filaments, leading to neurodegeneration and FTD progression (Ghetti, 2015).

The GRN (Progranulin) gene is also located on chromosome 17 and codes for the secreted progranulin protein (Petkau, 2014). Sixty-eight GRN gene mutations have been identified to date in families with fFTD, with all of them leading to a 33 % decrease in GRN protein levels within cells (Petkau, 2014; Sun, 2011). The growing research into progranulin functions showed that it can participate in stress and inflammation responses, synapse activities, neurite outgrowth, and lysosomal functions (Petkau, 2014; Sun, 2011). Thus, FTD development can be a result of the misregulation of one of these pathways (Olney, Spina, & Miller, 2017; Sun, 2011).

The last and most common gene mutated in FTD is C9orf72, which is mutated in 11.7-26 % of fFTD cases and 5 % of sFTD cases (although numbers fluctuate depending on the country and population) (Belzil, 2016; Olney, 2017). The mutation is identical to the one in ALS and represents the expansion of the GGGGCC repeat (Belzil, 2016).

Between 15 and 20 % of FTD patients develop ALS, and up to 50 % of ALS patients are diagnosed with FTD symptoms (Babić Leko, 2019; Tiryaki, 2014). It is proposed that ALS and FTD share a similar pathway of neurodegeneration and represent the extremes of a single neurodegenerative disease continuum (ALS/FTD) (Belzil, 2016). Therefore, patients with the C9orf72 gene mutation can develop ALS, FTD, or a mix of ALS/FTD disorders, but the reasons for the preferential progression of either disease are still unexplained (Belzil, 2016).

FTD has no cure, with only a few drugs (antipsychotics or anti-epileptics) used to control behavioural abnormalities (Tsai, 2014). The study of gene mutations and protein functions is essential for understanding the molecular mechanisms of FTD progression and for the development of treatments.

3. C9orf72 gene mutation

In humans, the C9orf72 gene consists of 11 exons, with the hexanucleotide (GGGGCC or G4C2) sequence positioned in the first intron, as shown in Figure 2 (Ling, 2013). There are two transcription variants of the C9orf72 gene that code for the full-length C9orf72 protein (481 amino acids) (Babić Leko, 2019). In variant 1, transcription starts at exon 1a and the G4C2 repeat is positioned in the first intron of the gene, while variant 2 has its transcription initiation site at exon 1b, with the repeat in the promoter region (Figure 2) (Babić Leko, 2019). In healthy individuals with a low number of G4C2 copies, both variants are transcribed and translated to produce functional C9orf72 protein (Niblock, 2016).

There are three main hypotheses of how the hexanucleotide repeat expansion can disrupt cellular processes and lead to neurodegeneration: gain of toxicity due to the formation of RNA foci, gain of toxicity due to mRNA translation and production of dipeptide repeats (DPRs), and loss of function due to the decrease of C9orf72 mRNA levels in the cytoplasm (Figure 2) (Ling, 2013).

Figure 2: Schematic representation of human C9orf72 gene mutation on chromosome 9 and proposed pathways contributing to neurodegeneration. The hexanucleotide repeat expansion is positioned in the first intron of the gene between exons 1a and 1b, with n representing the number of repeats. Coding regions are shown in blue and untranslated regions (UTRs) in yellow. A. Repeat-containing RNAs may form RNA foci and sequester RNA-binding proteins, causing toxicity. B. RAN-initiated translation of repeat-containing RNA leads to the production of toxic DPRs. C. The expansion may also hinder the transcription of the affected allele, leading to reduced levels of C9orf72 mRNA and protein in cells. Figure adapted from (Ling, 2013).

In variant 1, RNA maturation involves splicing of all introns except intron 1, which is sometimes retained (Niblock, 2016). The presence of the G4C2 expansion does not influence intron retention, as intron 1 was observed in mRNA derived from both wild-type (wt) and mutant alleles (Niblock, 2016). Thus, the only difference between the wt and the mutated variant 1 is the presence of an enlarged 5’-untranslated region in the latter. While mRNA derived from wt alleles can leave the nucleus and be translated to produce wt C9orf72 protein, the enlarged mRNA results in abnormalities in mRNA processing (Figure 2 A and B) (Ling, 2013).

When intron 1 is retained and the repeat is positioned within the intron (Figure 2), 85 % of the incompletely spliced mRNA is retained in the nucleus for both wt and mutated alleles (Green, 2016; Ling, 2013; Niblock, 2016). When highly expanded G4C2 repeats are present, they can assemble into stable secondary structures such as G-quadruplexes and hairpins, forming RNA foci (Figure 2A) (Green, 2016; Laferriere, 2015). Structured RNAs can further sequester RNA-binding proteins and potentially disrupt their functions (Todd, 2016). Several proteins responsible for RNA processing, sorting, transport, and other nuclear functions were found to associate with RNA foci (Babić Leko, 2019; Todd, 2016). Sequestration of these proteins is thought to cause neurotoxicity.

The other 15 % of intron-containing mRNA is transported to the cytoplasm (Niblock, 2016). With no expansion present, C9orf72 protein is produced, while the presence of the expansion leads to the production of DPRs (Ling, 2013). C9orf72 RNA is transcribed from both the sense and antisense strands of the C9orf72 gene, and both forms can be exported to the cytoplasm, where each is translated in three reading frames. The translation occurs via repeat-associated non-AUG (RAN)-initiated translation (Figure 2B) (Green, 2016; Ling, 2013). Secondary RNA structures, present due to the repeat expansion, promote translation initiation in non-coding/intron regions; however, the mechanism of RAN translation is unclear (Green, 2016). As a result, five dipeptide-repeat-containing proteins (DPRs) are synthesized from the two strands (Todd, 2016). Poly-GA, poly-GP and poly-GR chains are translated from the sense strand, while poly-PR, poly-PA and again poly-GP chains are translated from the antisense strand (Balendra, 2018). All five are found in ALS/FTD patients (Balendra, 2018). These DPRs have been found to form aggregates in the cytoplasm, cause ER stress, disrupt the ubiquitin-proteasome system and affect several cellular pathways, including stress granule formation and the Notch signalling pathway (Todd, 2016). Arginine-containing repeats were shown to be associated with the majority of these effects, while the roles of poly-GA, poly-GP and poly-PA need to be studied further (Todd, 2016).
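The origin of the five DPR species can be checked by translating the repeat in each of the three reading frames on both strands. The Python sketch below does this for a short toy repeat; it is a minimal illustration that uses DNA letters, a codon table restricted to the six codons that occur in the repeat (taken from the standard genetic code), and hypothetical function names. It ignores how RAN translation is actually initiated, which the text notes is unclear.

```python
# Only the codons that can occur in (GGGGCC)n or its reverse complement.
CODON_TABLE = {
    "GGG": "G", "GGC": "G",  # glycine
    "GCC": "A",              # alanine
    "CCG": "P", "CCC": "P",  # proline
    "CGG": "R",              # arginine
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the antisense strand read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna, frame):
    """Translate complete codons of `dna` starting at offset `frame` (0, 1 or 2)."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


sense = "GGGGCC" * 4                   # toy repeat, far shorter than a real expansion
antisense = reverse_complement(sense)  # (GGCCCC)4

for name, strand in (("sense", sense), ("antisense", antisense)):
    for frame in range(3):
        print(f"{name:9s} frame {frame}: {translate(strand, frame)}")
```

The sense frames give alternating Gly-Ala, Gly-Pro and Gly-Arg, and the antisense frames give Gly-Pro, Ala-Pro (poly-PA) and Pro-Arg, so poly-GP arises from both strands and five distinct dipeptide repeats result in total.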

The mutation in variant 2, with the expansion positioned in the promoter region, results in a decreased level of C9orf72 protein and a loss-of-function pathway. Transcription from the 1b site is hindered by the large repeat expansion, which results in a drop in C9orf72 mRNA levels (Figure 2C) (Ling, 2013; Todd, 2016). The proposed mechanisms are hypermethylation of the 5’-CpG island in the promoter region, found in up to 30 % of ALS/FTD patients, or interference by the repeat expansion itself (Belzil, 2016). In both cases, promoter binding is hindered and transcription initiation is suppressed, resulting in lower mRNA levels and decreased C9orf72 protein production (Babić Leko, 2019; Belzil, 2016). The decrease in protein level is observed even when only one allele is mutated, meaning the mutation is dominant and results in haploinsufficiency (Babić Leko, 2019; Laferriere, 2015). To link the reduced C9orf72 levels to the observed neurodegeneration, understanding the role of the protein in neurons is essential. Several works showed the importance of C9orf72 in autophagy (Sellier, 2016; Sullivan, 2016; Yang, 2016).

While RNA foci and DPRs can more directly lead to neurotoxicity, the consequences of the decreased levels of C9orf72 protein are still unclear (Belzil, 2016). As a result, there is a growing body of research focused on identifying the cellular functions of C9orf72 and on how disruption of these functions may lead to neurodegeneration.

4. Cellular functions of C9orf72 protein and formation of CSW complex

C9orf72 is related to DENN (Differentially Expressed in Normal and Neoplastic cells) domain-containing proteins (Levine, 2013; Zhang, 2012) and forms a stable complex with another DENN domain-containing protein, SMCR8 (Smith-Magenis syndrome chromosome region candidate 8), and WD repeat-containing protein 41 (WDR41) (Sellier, 2016; Sullivan, 2016; Yang, 2016). Together, they form a trimeric complex, C9orf72-SMCR8-WDR41, henceforth referred to as CSW.

DENN-containing proteins are known to function as regulators of small GTPases, catalyzing the transition between inactive GDP-bound and active GTP-bound forms. GDP/GTP exchange factors (GEFs) facilitate the GDP to GTP transition, while GTPase activating proteins (GAPs) facilitate the GTP to GDP transition. In turn, GTPases act as molecular switches and regulate diverse cellular processes, including membrane trafficking, cytokinesis, cell polarization, gene expression, and cell migration (Song, 2019). Knockout experiments showed that autophagy is partially disrupted in neuronal cells in the absence of C9orf72; thus, CSW may regulate GTPases involved in autophagy, and its loss may promote neurodegeneration through autophagy attenuation (Sellier, 2016).

There are several opposing theories of how the CSW complex can influence autophagy and promote neurodegeneration, and further in vivo studies are needed to confirm any of the functions and interactions observed in vitro.

Several studies have shown that CSW acts as a GEF for GTPases. The CSW complex was found to interact with the ULK1 autophagy initiation complex and Rab1a (Webster, 2016; Yang, 2016). Thus, the CSW complex may act as a Rab1a effector and mediate autophagy initiation by controlling localization of the ULK1 initiation complex to the phagophore (Webster, 2016). Additionally, the CSW complex was shown to function as a GEF for the Rab8a and Rab39b GTPases, which are involved in membrane trafficking and autophagosome maturation (Sellier, 2016; Yang, 2016). Rab8a and Rab39b may function in a complex with CSW and the ubiquitin-binding p62/optineurin autophagy receptors to target autophagy initiation to ubiquitinated protein aggregates, damaged organelles or pathogens (Sellier, 2016). C9orf72 was also found to co-localize with Rab5, Rab7 and Rab11, which are involved in phagophore membrane elongation, autophagosome translocation and maturation, respectively, suggesting C9orf72 involvement in those pathways (Farg, 2014).

Newer studies identified GAP activity of CSW for small GTPases (Tang, 2020; Su, 2020). CSW was found to increase GTP hydrolysis by Rab8a and Rab11a, but not by Rab7a or Rab39a/b, as thought previously (Tang, 2020). GEF activity towards Rab8a and Rab11a was also tested but was not confirmed (Tang, 2020). Another study showed CSW to be a potent GAP for Arf family GTPases, especially Arf1, 5 and 6, which participate in endosomal sorting (Su, 2020). In both studies, GAP activity was observed for both the CSW and CS (C9orf72-SMCR8) complexes, suggesting that WDR41 is not essential for GTPase regulation. The primary function of WDR41 is thought to be the localization of the CS complex to lysosomes, facilitating the CS-GTPase interaction by bringing them into proximity (Amick, 2018).

5. CSW structure

Despite the growing knowledge about CSW complex functions, little was known about its structure when this project was started, although structural information could greatly aid in understanding CSW functions. Bioinformatic analyses and sequence comparisons identified longin and DENN domains in both C9orf72 and SMCR8, as shown in Figure 3 (Levine, 2013; Zhang, 2012). In C9orf72, these domains are connected by a short linker, while the linker in SMCR8 is much larger (325 aa) and is predicted to be unstructured (Figure 3). C9orf72 and SMCR8 have low primary sequence similarity with DENN domain proteins but show high structural similarity to Folliculin-Interacting Protein (FNIP) and Folliculin (FLCN), respectively (Amick, 2017). The FNIP-FLCN complex is another DENN-containing protein complex that functions as a GAP, supporting the hypothesis that CSW may also function as a GAP (Amick, 2017). WDR41, on the other hand, is a member of the WD40 repeat-containing protein family and is thus predicted to have eight conserved WD repeats (Figure 3) that assemble into a β-propeller conformation (Wang, 2015).

In a major breakthrough, two papers published during the preparation of this thesis showed the newly solved cryoEM structures of CSW complex (Su, 2020; Tang, 2020). Their findings will be discussed in comparison to our results in the Discussion.

Figure 3: Domain organization of C9orf72, SMCR8 and WDR41. C9orf72 and SMCR8 consist of two domains: Longin and DENN domains. The dashed line on SMCR8 corresponds to a disordered linker region. WDR41 has eight β-propeller domains. Figure adapted from (Tang, 2020).

6. Project goals

The first aim of this study was to develop optimal expression and purification protocols for the C9orf72-SMCR8-WDR41 and C9orf72-SMCR8 protein complexes that would give protein in high yield and high purity. The second aim was to gain more information about protein structure and protein interactions within the complex through HDX-MS experiments and protein crystallization.

Results

1. Purification optimization

This project's first objective was to establish a robust expression and purification protocol to obtain pure C9orf72-SMCR8 (CS) and C9orf72-SMCR8-WDR41 (CSW) complexes with high yield. Several groups working with these protein complexes have shown that human variants of C9orf72 and SMCR8 cannot be expressed in E. coli, suggesting that post-translational modifications or assisted folding are necessary for these proteins to assemble (Iyer, 2018). Therefore, the Bac-to-Bac expression system, which uses Sf9 insect cells, was chosen for expression, as it had proved suitable in previous studies (Iyer, 2018; Sellier, 2016).

1.1. Optimization of expression and purification of His-tagged constructs

The coding sequences from pMF plasmids 1 to 3 (Table 1), encoding the CS and WDR41 proteins, were first cloned into pFastBac (pFB) plasmids as described in Materials and Methods, yielding plasmids 4 to 6. The new constructs were then expressed in Sf9 insect cells. To obtain the CSW complex, the CS- and WDR41-containing plasmids were co-expressed.

Table 1: C9orf72, SMCR8, and WDR41 constructs used for cloning and protein expression. Constructs 1 to 3 were used to clone protein sequences into pFB plasmids. Constructs 4 to 6 were used for protein purification with Ni beads. Constructs 7 and 8 were purified using GFPnb-coupled beads.

#   Plasmid   Construct
1   pMF       WDR41-strep-strep
2   pMF       His-C9orf72-SMCR8
3   pMF       His-C9orf72-SMCR8-flag-HA
4   pFB       WDR41-strep-strep
5   pFB       His-C9orf72-SMCR8
6   pFB       His-C9orf72-SMCR8-flag-HA
7   pFB       His-strep-GFP-C9orf72-SMCR8
8   pFB       His-strep-GFP-WDR41-SMCR8-C9orf72

The first round of purifications was performed using C9orf72-SMCR8 constructs 5 and 6 (Table 1), as they had been shown to form a stable complex without WDR41. As both constructs were His-tagged, Ni-affinity beads were used for purification. Purification proceeded as in Materials and Methods, and Buffer 1 (50 mM HEPES pH 7.6, 500 mM NaCl, 5 % glycerol, 500 mM imidazole; Table 2) was used for elution from the beads. Samples were then analyzed on a 10 % SDS-PAGE gel to assess their purity. Figure 4 shows the results obtained.

Table 2: Buffer conditions used for protein purification during optimization. Buffers 1 and 3 to 5 were used for elution during His-tag purification. Buffers 2 and 6 were used for running SEC after elution from Ni beads. Buffers 7 and 8 were used during purification with GFPnb for running SEC after elution from the beads.

#   Buffer composition
1   50 mM HEPES pH 7.6, 500 mM NaCl, 5 % glycerol, 500 mM imidazole
2   50 mM HEPES pH 7.6, 150 mM NaCl, 5 % glycerol
3   50 mM HEPES pH 7.6, 500 mM NaCl, 5 % glycerol, 500 mM imidazole, 1 mM TCEP
4   50 mM HEPES pH 7.6, 500 mM NaCl, 5 % glycerol, 500 mM imidazole, 5 mM TCEP
5   50 mM HEPES pH 7.6, 500 mM NaCl, 5 % glycerol, 500 mM imidazole, 10 mM TCEP
6   20 mM Tris pH 8.0, 150 mM NaCl, 1 mM EDTA
7   20 mM Tris pH 8.0, 150 mM NaCl, 5 mM TCEP, 5 % glycerol
8   20 mM Tris pH 8.0, 150 mM NaCl, 5 mM TCEP

Figure 4: Purification profile of CS complex using His-tag purification. The Coomassie blue-stained SDS-PAGE shows the elution fractions of His-C9orf72-SMCR8 (E5) and His-C9orf72-SMCR8-flag-HA (E6) proteins after elution from Ni beads.

Both proteins were expressed and visible on the gel (Figure 4). SMCR8 migrates at 150 kDa and C9orf72 migrates at 50 kDa. Purity was a significant issue, as many other bands were observed. To improve purity, samples were concentrated and run on SEC (using Buffer 2, Table 2). The chromatogram (not shown) contained substantial aggregation peaks and only a relatively small protein peak, suggesting that most of the eluted protein had aggregated during concentration.

To see if the situation would improve in the presence of WDR41, constructs coding for His-C9orf72-SMCR8 and WDR41-strep-strep (5 and 4 in Table 1, respectively) were co-expressed and purified using the same protocol as above. Construct 6 (tagged with flag-HA) was also used for co-expression and purified similarly to construct 5 (not shown here); therefore, His-C9orf72-SMCR8 was used in further analyses. Additionally, to prevent protein aggregation during concentration, the elution Buffer 1 (Table 2) was modified by the addition of TCEP at various concentrations (1, 5 and 10 mM; Buffers 3, 4, and 5 in Table 2). Finally, the SEC buffer was changed to 20 mM Tris pH 8, 150 mM NaCl, 1 mM EDTA (Buffer 6 in Table 2). The results are presented in Figures 5 and 6.

Figure 5: Purification profile of CSW complex using His-tag purification.

E1mM, E5mM and E10mM represent the elution samples obtained after elution of His-C9orf72-SMCR8+WDR41-strep-strep from Ni beads at different TCEP concentrations in the elution buffer. Samples were run on 10 % SDS-PAGE and stained using Coomassie blue.

The CSW complex eluted with a similar amount of impurities as the CS complex (Figure 5). The addition of the reducing agent TCEP prevented some protein aggregation during elution and concentration. As a result, the size-exclusion chromatograms (Figure 6 A-C) all showed distinct peaks for the CSW complex. As expected, the sample with 10 mM TCEP showed the best separation between the aggregation and CSW peaks; therefore, fractions from this CSW peak were collected and run on an SDS gel (Figure 6 D). All three proteins were present in the sample, and the complex was less contaminated than the samples after Ni-bead elution (Figure 5), although some impurity bands could still be seen.

Figure 6: Purification profile of CSW complex after His-tag purification and size-exclusion chromatography. Samples were monitored at 280 nm using an AKTA Pure system. The first peak represents the His-C9orf72-SMCR8-WDR41-strep-strep (CSW) complex eluting between 50 and 60 mL, and the last peak represents His-C9orf72 protein eluting at 75-85 mL. A. Gel filtration profile with 1 mM TCEP in the elution buffer. B. Gel filtration profile with 5 mM TCEP in the elution buffer. C. Gel filtration profile with 10 mM TCEP in the elution buffer. D. The Coomassie blue-stained SDS-PAGE shows the peak fraction of CSW from gel filtration in C.

Sample aggregation was reduced; however, purity was still an issue, suggesting that His-tag purification was not optimal for obtaining pure CSW complex. Additionally, the purified proteins carried tags that could not be removed. As a result, His-C9orf72 and WDR41-strep-strep were indistinguishable on an SDS gel due to their similar molecular weights (55.4 kDa and 55.6 kDa, respectively), making it hard to verify their presence after purification. The non-removable tags were also a potential disadvantage for further structural analyses and crystallization trials, as they could introduce additional protein flexibility. Therefore, another purification method had to be considered.

1.2. Optimization of expression and purification of GFP-tagged constructs

Because a more selective affinity step was needed to improve sample purity, anti-GFP nanobody-coupled beads, referred to here as GFP nanobody-coupled beads (GFPnb), were considered as a possible alternative for purification. Nanobodies consist of a single variable domain of an antibody and retain the antibody's specificity towards its antigen, in this case GFP (Zhang, 2020). Thus, nanobody-coupled beads should bind a minimal number of impurities and result in cleaner protein samples after purification (Zhang, 2020).

For purification with GFPnb, new pFastBac constructs with GFP tags had to be created for the CSW and CS complexes (Table 1, constructs 8 and 7, respectively). To simplify expression of CSW, we created a single plasmid encoding all three proteins, which was used for all future expressions. Additionally, a TEV cleavage site was introduced after the GFP tag so that the tag could be cleaved. Purification was carried out for both CS and CSW complexes and proceeded as in Materials and Methods. The first step involved binding to GFPnb-coupled beads to isolate the proteins, followed by cleavage of the GFP tag with TEV protease during the elution step. After several rounds of buffer optimization, the buffer containing 50 mM Tris pH 7.5, 150 mM NaCl, 1 mM EDTA, 5 mM dithiothreitol was determined to be optimal for washing the beads. For protein elution, the same buffer was used with 75 µg of TEV added to cleave the tag and release proteins from the beads. The eluted sample was run on SEC using Buffer 7 (Table 2), composed of 20 mM Tris pH 8.0, 150 mM NaCl, 5 mM TCEP, 5 % glycerol. Collected fractions were then concentrated and run on a 10 % SDS gel. All results are shown in Figure 7.

The elution profile of CSW (Figure 7A) shows that the protein complex elutes as a single peak, suggesting that the complex is stable and that no proteins dissociated during overnight incubation and SEC, as no peaks for individual proteins were observed. In the case of the CS complex, two peaks are observed (Figure 7B), suggesting that C9orf72, which carried the GFP tag (construct 7, Table 1), was overexpressed in cells, so that free C9orf72 (not bound to SMCR8) bound to the beads and appeared on the chromatogram. Overexpression could be explained by the fact that the C9orf72 sequence was codon-optimized for better expression. The same is not observed for the CSW complex, as the GFP tag was on WDR41, which was not overexpressed.

Figure 7: Purification profile of CSW and CS complexes when using GFPnb. Samples were monitored at 280 nm using an AKTA Pure system. A. Gel filtration profile of the C9orf72-SMCR8-WDR41 (CSW) complex, which eluted between 50 and 65 mL, showed a single elution peak for the complex with a small aggregation peak between 40 and 50 mL. B. Gel filtration profile of the C9orf72-SMCR8 (CS) complex showed an aggregation peak at 40-50 mL, a CS peak at 55-70 mL and a separate peak for C9orf72 at 75-90 mL. C. The Coomassie blue-stained 10 % SDS-PAGE of concentrated CSW, CS and C9 samples collected after gel filtration.

Compared with the results obtained after His-tag purification (Figure 6 A-C), the aggregation peak is reduced and the peaks are better separated. Moreover, a single peak is observed for the CSW complex, compared to the two peaks seen in Figure 6 A-C. Protein purity (Figure 7C) was also significantly improved relative to the His-tag purifications (Figure 6D), as no impurities were observed on the SDS gels.

The first round of purifications used SEC Buffer 7 (Table 2), which included 5 % glycerol for additional protein stability. The downside of glycerol was that it could interfere with initial crystallization by acting as an antinucleation agent (Vera, 2011). Therefore, another SEC buffer without glycerol, Buffer 8 (Table 2), was tested. The CS and CSW constructs were purified once more using the new SEC buffer. The results were identical to those in Figure 7, suggesting that glycerol in the SEC buffer does not influence purification. As a result, the final SEC buffer contained only 20 mM Tris pH 8.0, 150 mM NaCl, and 5 mM TCEP, and all further protein constructs were purified using this buffer and the protocol described in Materials and Methods.

This last optimization step reduced the number of buffer components, which was important for further crystallization screening. Using this method, we were able to obtain up to 7.5 mg of pure CSW complex, 4 mg of pure CS complex, and 3 mg of pure C9orf72 per litre of cell culture.

Purification with GFPnb therefore proved to be the optimal method, giving a high quantity of pure CSW, CS and C9orf72 in a simple two-step purification with all tags removed.

2. C9orf72-SMCR8-WDR41 complex forms a trimer

During purification, the CSW protein complex eluted in the range that could correspond to both trimeric (212 kDa) and hexameric (414 kDa) forms of the complex (Figure 8A). Sedimentation-velocity analytical ultracentrifugation (SV-AUC) was performed to determine the accurate molecular mass of the CSW complex and its stoichiometry, and the results are shown in Figure 8B.

AUC analysis was performed at three protein concentrations: 0.3, 0.5 and 1.0 mg/mL (Figure 8B). A single peak is observed in each case, meaning that only one form is present in solution. The sedimentation coefficient for the complex was found to be ~5.84 S on average, corresponding to a molecular mass of ~206 kDa (table in Figure 8B). The theoretical MW of a single CSW heterotrimer is 211 kDa (C9orf72 ~54 kDa, SMCR8 ~105 kDa, WDR41 ~52 kDa). Therefore, it can be concluded that the CSW complex forms a trimeric assembly in solution at all tested concentrations, with a protein ratio of 1:1:1.
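The stoichiometry argument above is simple arithmetic, and it can be sketched as follows. This is a minimal illustration, assuming the approximate subunit masses quoted in the text; the function names are hypothetical and are not part of the AUC analysis itself.

```python
# Approximate subunit masses in kDa, taken from the values quoted in the text.
SUBUNIT_MASS_KDA = {"C9orf72": 54.0, "SMCR8": 105.0, "WDR41": 52.0}


def assembly_mass(copies: int) -> float:
    """Theoretical mass (kDa) of an assembly with `copies` of each subunit."""
    return copies * sum(SUBUNIT_MASS_KDA.values())


def closest_stoichiometry(measured_kda: float, max_copies: int = 4) -> int:
    """Number of copies of each subunit whose mass best matches the measurement."""
    return min(
        range(1, max_copies + 1),
        key=lambda n: abs(assembly_mass(n) - measured_kda),
    )


measured = 206.0  # kDa, SV-AUC estimate reported in the text
copies = closest_stoichiometry(measured)
deviation = 100.0 * (measured - assembly_mass(copies)) / assembly_mass(copies)
print(f"best match: {copies}:{copies}:{copies}, "
      f"theoretical {assembly_mass(copies):.0f} kDa, deviation {deviation:.1f} %")
```

With these inputs the measured mass lies within a few percent of one copy of each protein and far from two copies of each, which is the basis for the 1:1:1 conclusion.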

Figure 8: C9orf72, SMCR8 and WDR41 complex formation. A. Gel-filtration chromatography results for CSW complex aligned with the gel filtration standards (myoglobin, 17 kDa; ovalbumin, 44 kDa; albumin, 66 kDa; IgG, 158 kDa; ferritin, 440 kDa). B. SV-AUC results for CSW complex at three concentrations showing formation of a trimer complex.

3. Initial crystallization trials for CSW and CS complexes

After establishing the purification protocol and obtaining proteins suitable for crystallization (high purity, no tags, a minimal number of buffer components), the next step was to set up preliminary crystallization trials for both CSW and CS complexes. For CSW, the ProComplex, JCSG+, Classics II, Classics I, PACT and PEG crystallization suites from QIAGEN were used. The sample concentration varied from 2 to 14 mg/mL, and screens were set up at both 4 and 22 °C. For CS, the ProComplex, JCSG+, Classics II, Classics I, and PACT suites were used for screening at both 4 and 22 °C, with concentrations varying from 2 to 6 mg/mL. More than 2,000 conditions were tested for the CSW complex and more than 1,000 for CS. None of the conditions gave promising hits, and the protein complex precipitated in more than 50 % of the screens, even at low concentrations. One explanation for the excessive precipitation and hindered crystallization could be the presence of unstructured regions that impede nucleation.

4. HDX-MS analysis for CS and CSW

The next step was to acquire more information about the presence of unstructured regions within the CSW complex and the changes in protein conformation upon complex formation or protein binding. For this, Hydrogen-Deuterium Exchange Mass Spectrometry (HDX-MS) was used. In HDX-MS, protein samples are diluted in deuterated water and hydrogen-deuterium exchange proceeds (Fowler, 2016). The exchange is stopped at several time points, and the level of deuteration is determined by monitoring changes in the mass of the protein's peptides compared to non-deuterated samples (Fowler, 2016). Higher deuteration at short reaction times corresponds to more disordered protein regions, which are highly accessible to deuterium, while slower exchange signifies a more rigid protein structure that is better protected from exchange (Fowler, 2016). HDX-MS results can be visualized by constructing heatmaps that show the level of deuterium incorporation at each amino acid over the reaction time.
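The way a mass shift is turned into a deuteration level can be sketched as follows. This is a minimal illustration rather than the analysis pipeline used in this work: it assumes that only backbone amide hydrogens are counted, that the first residue and prolines do not contribute, and that no back-exchange correction is applied. The peptide sequence and masses are hypothetical.

```python
# Mass gained when one hydrogen is replaced by deuterium (Da).
DEUTERIUM_MASS_SHIFT = 1.006277


def exchangeable_amides(peptide: str) -> int:
    """Count backbone amide hydrogens, assuming the first residue and
    prolines do not contribute (a common simplification)."""
    return sum(1 for residue in peptide[1:] if residue != "P")


def percent_deuteration(mass_undeuterated: float,
                        mass_labelled: float,
                        peptide: str) -> float:
    """Deuterium uptake as a percentage of the theoretical maximum."""
    sites = exchangeable_amides(peptide)
    if sites == 0:
        raise ValueError("peptide has no exchangeable backbone amides")
    uptake_da = mass_labelled - mass_undeuterated
    return 100.0 * uptake_da / (sites * DEUTERIUM_MASS_SHIFT)


# Hypothetical peptide and centroid masses, for illustration only.
peptide = "AVLKPGE"
print(exchangeable_amides(peptide))
print(round(percent_deuteration(700.0, 702.5, peptide), 1))
```

Applying such a calculation to every peptide at every time point gives the values that are colour-coded in the heatmaps.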

First, HDX-MS was used to determine the presence of unstructured regions within the CSW complex. For this, HDX was performed on the CSW complex, the results for the individual proteins were analyzed, and the heatmaps shown in Figure 9 were generated for visual representation.

[Figure 9 heatmaps: A. C9orf72, B. SMCR8, C. WDR41. For each protein, the amino acid sequence is shown with deuteration at five time points (0.3 s, 3 s, 30 s, 300 s and 3000 s); the colour scale runs from < 10% to > 90% deuteration.]

Figure 9: HDX-MS detection of unstructured regions within CSW complex. Each panel shows the peptide coverage and % of deuteration (%D) of proteins at five different time points, with amino acids coloured in red having the highest deuteration and those in dark purple having the lowest. A. %D in C9orf72 showed no large unstructured regions. B. %D in SMCR8 showed several highly unstructured regions between 375 and 697 aa. C. %D in WDR41 showed an unstructured region from 351 to 395 aa.

C9orf72 (Figure 9A) showed an overall low level of deuteration. The majority of peptides had %D < 60 %, especially when the deuteration reaction was run for a short time (0.3 s). Therefore, it was concluded that C9orf72 does not have highly unstructured regions that may interfere with crystallization. Analysis of the SMCR8 results (Figure 9B) showed a high level of deuteration in the region between 375 and 697 aa. The observed flexibility agreed with the protein sequence analysis, which showed that these amino acids represent the unstructured linker region between the Longin and DENN domains of SMCR8 (Figure 3). As this extended flexible region could be the reason for hindered crystallization, the next step was to design several deletion mutants with a truncated linker region to see if crystallization could be improved. WDR41 (Figure 9C) analysis identified relatively high levels of deuteration in two regions: the N-terminus (1 – 15 aa) and 351 – 395 aa, which had %D > 80 %. As for SMCR8, the next step was to create deletion mutants lacking these regions to improve crystallization.

The second use of HDX was to study the interactions between C9orf72-SMCR8 and WDR41 within the CSW complex. For this purpose, the HDX experiment was performed on both CS and CSW complexes. The change in peptide deuteration between CS and CSW was analyzed to identify the regions directly or indirectly affected by WDR41 binding and to gain more structural information on protein interactions within the complex. A change in deuteration was considered significant when the following criteria were met: a > 5 % change in the percentage of incorporated deuterium, a > 0.4 Da difference in the number of deuterons incorporated, and a p value of less than 0.01 between triplicates. Deuteration profiles of the C9orf72 and SMCR8 regions with a significant change upon CSW complex formation are shown in Figure 10. The complete HDX heatmaps for C9orf72 and SMCR8 from the CS complex are shown in Figure 11.
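The three significance criteria can be expressed as a single filter. The sketch below is only an illustration of the thresholds stated above; it assumes that the magnitudes of the differences are compared (so that both protection and deprotection can pass), and the function name and example values are hypothetical.

```python
def is_significant(delta_percent_d: float,
                   delta_mass_da: float,
                   p_value: float) -> bool:
    """Apply the three thresholds: > 5 % change in %D, > 0.4 Da change
    in deuteron uptake, and p < 0.01 between triplicates.

    Assumption: absolute differences are used, so the sign of the change
    (protection or deprotection) does not matter.
    """
    return (
        abs(delta_percent_d) > 5.0
        and abs(delta_mass_da) > 0.4
        and p_value < 0.01
    )


# Hypothetical peptide-level differences (CSW minus CS), for illustration only.
print(is_significant(-8.2, -0.9, 0.003))  # all three criteria met
print(is_significant(-8.2, -0.3, 0.003))  # mass difference too small
print(is_significant(6.0, 0.5, 0.05))     # not statistically significant
```

Requiring all three conditions at once guards against calling a change significant when it is large only in relative terms (short peptides) or only in absolute terms (long peptides).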


Figure 10: Conformational changes in C9orf72 and SMCR8 upon WDR41 binding detected by HDX-MS. A. Changes in C9orf72 deuteration. Only one region, from 380 to 393, showed slower exchange in the CSW complex compared to CS. B. Changes in SMCR8 deuteration. Four SMCR8 regions showed changes in the % deuteration. The peptides 351-371, 725-739, 750-845, and 880-934 all showed slower exchange in the CSW complex than in CS.

[Figure 11 heatmaps: A. C9orf72, B. SMCR8. For each protein, the amino acid sequence is shown with deuteration at five time points (0.3 s, 3 s, 30 s, 300 s and 3000 s); the colour scale runs from < 10% to > 90% deuteration.]

Figure 11: HDX-MS analysis of C9orf72 and SMCR8 proteins in CS complex. Each panel shows the peptide coverage and % of deuteration (%D) of proteins at five different time points, with amino acids coloured in red having the highest deuteration and those in dark purple having the lowest. A. %D in C9orf72. B. %D in SMCR8.


During the comparison of heatmaps for C9orf72 proteins in CS and CSW complexes, the

peptide comprising residues 380-393 (C9orf72 DENN domain) was found to become significantly

protected in the presence of WDR41. Thus, WDR41 may interact with C9orf72 at that region,

shielding it and lowering the deuterium exchange; or WDR41 binding to another region could

trigger downstream conformational changes in C9, resulting in the observed protection (Figure

10).

When comparing SMCR8 results from CS and CSW complexes, four regions were found to have significant changes in deuteration. The peptides comprising residues 351-371, 725-739, 750-845, and 880-934 showed higher protection after WDR41 binding. Since SMCR8 had more protected regions than C9orf72, we suggest that WDR41 binds directly to SMCR8. The fact that the DENN domains of both SMCR8 and C9orf72 are affected suggests that they are closely positioned. Thus, WDR41 likely interacts directly with the DENN domain of SMCR8, which in turn interacts with the DENN domain of C9orf72.

5. CSW unstructured regions are essential for the formation of the complex

CSW mutant constructs lacking the flexible regions identified by HDX-MS were created to study the importance of these regions in complex formation and to improve CSW crystallization. The mutants generated are shown in Table 3. All constructs were purified using the optimized protocol with GFPnb-coupled beads followed by SEC. The size-exclusion chromatograms were then compared to see if the deletions influenced complex formation. Purity was assessed by SDS gel.

Table 3: CSW mutant constructs designed based on HDX-MS results.

CSWΔ   Construct
1      WDR41-C9orf72-SMCR8Δ381-692
2      WDR41-C9orf72-SMCR8Δ391-692
3      WDR41-C9orf72-SMCR8Δ401-692
4      WDR41-C9orf72-SMCR8Δ421-692
5      WDR41Δ1-14-SMCR8-C9orf72
6      WDR41Δ352-393-SMCR8-C9orf72
7      WDR41Δ1-14, 352-393-SMCR8-C9orf72

First, the flexible linker region identified for SMCR8 (375 – 697 aa) was studied. Four mutants were generated (1-4 in Table 3) with varying deletion lengths. The comparison of gel filtration profiles (Figure 12) showed that the longest deletion (Δ381-692) completely disrupted CSW complex formation, as no peak was observed and no proteins were identified on the gel (Figure 12A). When the deletion was shortened (Δ391-692, Δ401-692, and Δ421-692), the CSW complex was observed on the chromatograms and the gels (Figure 12B-D), although the amount of complex obtained was much lower than for WT CSW (Figure 7A). Thus, the linker region may not be structured but is vital for forming a stable CSW complex. As the linker region was not predicted to interact with WDR41 based on the HDX results (Figure 10), it could be important for binding to C9orf72.
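The sizes of the four linker deletions can be compared with a short calculation. This is a minimal sketch, assuming 1-based inclusive residue numbering for the deletions in Table 3 and taking the HDX-defined flexible region (375 – 697 aa) as the linker; the helper names and the toy sequence are hypothetical.

```python
# SMCR8 linker deletions from Table 3 (1-based, inclusive residue numbers).
SMCR8_DELETIONS = {
    "CSWΔ1": (381, 692),
    "CSWΔ2": (391, 692),
    "CSWΔ3": (401, 692),
    "CSWΔ4": (421, 692),
}

# Flexible region of SMCR8 identified by HDX-MS in the text.
LINKER = (375, 697)


def span_length(start: int, end: int) -> int:
    """Number of residues in an inclusive 1-based range."""
    return end - start + 1


def apply_deletion(sequence: str, start: int, end: int) -> str:
    """Return the sequence with residues start..end (1-based, inclusive) removed."""
    return sequence[: start - 1] + sequence[end:]


for name, (start, end) in SMCR8_DELETIONS.items():
    deleted = span_length(start, end)
    retained = span_length(*LINKER) - deleted
    print(f"{name}: {deleted} residues deleted, {retained} flexible residues retained")

# Toy sequence, only to show how the deletion helper behaves.
print(apply_deletion("ABCDEFGH", 3, 5))
```

Under these assumptions, the construct that failed to form a complex retains only about ten residues of the flexible region, while the constructs that still assembled retain progressively more.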


Figure 12: Purification profile of CSW mutants with deletions in the SMCR8 linker region. Samples were monitored at 280 nm using the AKTA Pure system. Coomassie blue-stained 10 % SDS-PAGE showed the purity and composition of the eluted peaks. A. Purification profile of WDR41-C9orf72-SMCR8Δ381-692 (CSWΔ1) showed no clear peak for the complex after gel filtration and no proteins on the gel. B. Purification profile of WDR41-C9orf72-SMCR8Δ391-692 (CSWΔ2) showed a small peak for the complex after gel filtration, and all three proteins were observed on the gel. C. Purification profile of WDR41-C9orf72-SMCR8Δ401-692 (CSWΔ3) and D. purification profile of WDR41-C9orf72-SMCR8Δ421-692 (CSWΔ4) showed results identical to CSWΔ2.

In the next step, the importance of the WDR41 flexible regions was studied. Two WDR41 regions were identified as flexible by HDX: 1-15 aa and 351-395 aa. Therefore, three CSW mutants were produced (5-7 in Table 3). Deleting regions 1-14 and 352-393 separately increased CSW stability, as aggregation peaks were reduced while the CSW peaks remained comparable in size (Figure 13A and B) to that of wild-type CSW (Figure 7A). In contrast, the double mutant showed complete disruption of CSW complex formation, as no CSW peak was observed after gel filtration (Figure 13C). Thus, at least one of the unstructured sequences is required for CSW stability.


Figure 13: Purification profile of CSW mutants with deletions of WDR41 flexible regions. Samples were monitored at 280 nm using the AKTA Pure system. Coomassie blue-stained 10 % SDS-PAGE showed the purity and composition of the eluted peaks. A. Purification profile of WDR41Δ1-14-SMCR8-C9orf72 (CSWΔ5) and B. purification profile of WDR41Δ352-393-SMCR8-C9orf72 (CSWΔ6) showed clear CSW peaks with little aggregation, and all proteins were visible on the gel. C. Purification profile of WDR41Δ1-14,Δ352-393-SMCR8-C9orf72 (CSWΔ7) showed no peaks for the complex after gel filtration.

6. Crystallization trials for CSW mutants

Crystallization screens were performed for the following CSW mutants: CSWΔ2, 3, 4, 5, and 6 (Table 3). All of them were screened with ProComplex and Classics II suites at 4 ºC at concentrations of 1 mg/mL (CSWΔ2, 3 and 4), 1-3 mg/mL (CSWΔ5), and 2-4 mg/mL (CSWΔ6).

Despite the deletion of predicted flexible regions, the results were similar to those obtained for

WT CSW and CS. None of the conditions produced promising hits, and protein precipitation was observed in the majority of screens. Therefore, further research into optimal crystallization conditions is required.

Discussion

In 2011, repeat expansions in the C9orf72 gene were found to be the leading cause of ALS and FTD (Ling, 2013). Up to 60 % of ALS patients and 26 % of FTD patients with familial disease were carriers of this mutation (Belzil, 2016). C9orf72 was found to form a stable complex with SMCR8 and WDR41. Many studies were carried out to identify the function of this

CSW complex and understand its role in the disease progression. The obtained results are ambiguous and disagree on whether CSW functions as GEF or GAP and for which GTPases

(Amick, 2018; Sellier, 2016; Su, 2020; Tang, 2020; Yang, 2016). C9orf72 and SMCR8 were each predicted to have longin and DENN domains and to act as GEF or GAP, while WDR41 was predicted to have a β-propeller domain and to function as a hub between CS and substrate GTPases

(Levine, 2013; Wang, 2015). Additional structural information about CSW assembly and protein interactions within the complex could greatly aid in determining protein functions; however, up to now, no CSW crystal structures have been solved, and little was known about protein-protein interactions within the complex. The goals of this project were to gain more structural information about CSW complex and to obtain its 3D structure.

The first step was to establish a robust purification protocol capable of producing large quantities of pure CSW. For test expression, the CS complex was first obtained by expressing it in Sf9 insect cells and purifying with Ni-affinity beads. High-level expression was observed

(Figure 4), but the sample aggregated during purification and was heavily contaminated with unspecific proteins. The CSW complex was then co-expressed in Sf9 insect cells and purified with

Ni-affinity beads. The addition of the TCEP reducing agent at 1, 5, and 10 mM concentrations to the elution buffer lowered the aggregation. SEC removed some impurities, but unspecific bands could still be seen on the gel (Figure 6). Therefore, a new purification method that used GFP

nanobodies for affinity purification was implemented. New constructs containing GFP tags were designed for CSW and CS complexes (constructs 7 and 8, Table 1). The new CSW construct contained all three proteins, which simplified protein expression, removed the need for co-expression and ensured more even protein expression. A further advantage of the new constructs was the cleavable tag: a TEV cleavage site was introduced so that GFP could be cleaved off during purification. Purification showed a high level of expression and much-improved sample purity (Figure 7C). Lastly, the SEC buffer was optimized, and the final SEC buffer for the purification of GFP-tagged protein constructs was composed of 20 mM Tris pH 8.0,

150 mM NaCl, and 5 mM TCEP. As a result, we were able to develop the expression and purification protocols for CSW, CS and C9orf72 that did not require co-expression, included only two purification steps and provided high purity proteins with high yields. Up to 7.5 mg of pure

CSW complex could be obtained from 1 L of Sf9 cell culture using this approach.

Purified untagged proteins were then used for crystallization screens. More than 3000 conditions were tested for CSW and CS, but none resulted in protein crystallization. The presence of highly unstructured regions within the complex could be one of the reasons for unsuccessful crystallization, suggesting that further protein optimization was required.

We also carried out the SV-AUC experiment to understand the CSW stoichiometry. The CSW complex was found to form a trimer, as the experimental molecular weight was determined to be

206 kDa (Figure 8B), which is close to the theoretical weight of 211 kDa. Moreover, it suggested that the stoichiometry of CSW complex is 1:1:1.
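The reasoning from the measured mass to a 1:1:1 stoichiometry can be sketched as a comparison against candidate subunit ratios. The subunit masses below are rounded approximations assumed for illustration (they are not reported in this section) and were chosen so that they sum to the stated theoretical weight of 211 kDa. The function names are hypothetical.

```python
from itertools import product

# Approximate subunit masses in kDa; assumed values for illustration only.
SUBUNIT_KDA = {"C9orf72": 54.0, "SMCR8": 105.0, "WDR41": 52.0}
OBSERVED_KDA = 206.0  # experimental mass from SV-AUC (Figure 8B)


def complex_mass(copies):
    """Mass of a complex given a mapping of subunit name to copy number."""
    return sum(SUBUNIT_KDA[name] * count for name, count in copies.items())


def rank_stoichiometries(observed_kda, max_copies=2):
    """Rank candidate stoichiometries by distance from the observed mass."""
    ranked = []
    for counts in product(range(1, max_copies + 1), repeat=len(SUBUNIT_KDA)):
        mass = complex_mass(dict(zip(SUBUNIT_KDA, counts)))
        ranked.append((abs(mass - observed_kda), counts, mass))
    return sorted(ranked)


best_error, best_counts, best_mass = rank_stoichiometries(OBSERVED_KDA)[0]
print(dict(zip(SUBUNIT_KDA, best_counts)), best_mass, best_error)
```

Under these assumed masses, every candidate with a second copy of any subunit lies at least 50 kDa above the observed value, so the single-copy assembly is the closest match.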

Next, we performed HDX-MS experiments that determined the rate of deuterium exchange for proteins within the CSW complex. Because the exchange rate depends on solvent exposure and protein stability, the results were used to identify flexible regions in each protein and to monitor changes in protein conformation upon WDR41 binding to CS. We identified SMCR8 and C9orf72

regions impacted by WDR41 by comparing HDX-MS results for the CS and CSW complexes. The comparison revealed that WDR41 most likely binds to the SMCR8 DENN domain, as four DENN domain peptides became protected upon WDR41 binding (Figure 10B). For C9orf72, only one peptide in the DENN domain was more protected in the presence of WDR41 (Figure 10A), suggesting that WDR41 binding to SMCR8 triggers conformational changes in the adjacent C9orf72 DENN domain.

HDX-MS analysis of CSW was also used to study proteins’ flexibility, which could be the reason for unsuccessful initial crystallization screens for full-length CSW and CS complexes. The results showed that peptides comprising residues 375-697 on SMCR8, and residues 1-15 and 351

– 395 on WDR41 had the highest deuteration levels, and thus are flexible. Deletion mutants were created for CSW complex (Table 3) to study the importance of these regions and to improve crystallization. Mutations in SMCR8, located in the linker region, disrupted complex formation as smaller peaks for mutant CSW compared to wild type CSW were observed (Figure 12). The unstructured SMCR8 linker is crucial for the CSW assembly. Since it was not protected upon

WDR41 binding, this region may be involved in binding to C9orf72. Deleting the two unstructured WDR41 regions individually increased CSW expression and stability, as smaller aggregation peaks were observed (Figure 13). Interestingly, deleting both regions resulted in complete disruption of the CSW complex, and further analysis is needed to understand their significance. The purified mutant CSW complexes were used for crystallization screens, with the expectation that deleting flexible regions would improve crystallization. However, after screening more than 1000 conditions, no crystallization was observed. Therefore, more conditions must be screened, including other crystallization suites, sample concentrations, temperatures and the addition of potential GTPase substrates, to identify the optimal parameters for crystal growth.

Recently, two groups have solved the 3D structure of the C9orf72-SMCR8-WDR41 complex using cryo-EM and have performed structural analyses similar to those presented in this thesis

(Su, 2020; Tang, 2020). In terms of expression and purification techniques, Tang et al. expressed and purified CS and WDR41 separately and obtained the CSW complex by incubating the proteins together in vitro. They used a total of seven steps to obtain pure CSW complex.

Su et al. co-expressed plasmids containing CS and WDR41, so the complex was formed within the cells. In terms of purification, Su et al. used only three steps, but their sample had visible impurities. Therefore, our method, which uses a single expression plasmid and a two-step purification with GFP nanobodies and yields pure CSW complex, appears to be the most efficient of the three. We have established that the CSW complex is a trimer with 1:1:1 stoichiometry. Su et al. (2020) showed the same, while Tang et al. (2020) found that the CSW complex assembles as two trimers. As a result, a more thorough analysis is still needed to determine whether the heterotrimer exists as a monomer or a dimer. Our HDX-MS findings on interactions between CS and WDR41 are in agreement with Su et al., who also performed HDX-MS experiments and identified the same SMCR8 and C9orf72 regions protected upon WDR41 binding. Both research groups confirmed that WDR41 binds to the SMCR8 DENN domain and not to C9orf72 (Protein Data Bank: 6LT0). In regard to the identified unstructured

SMCR8 and WDR41 regions (Table 3), our HDX results were confirmed by the cryo-EM structures of the complex, in which all three regions were unresolved due to high flexibility. Thus, their role in complex formation remains to be studied.

In terms of functions, both groups have shown that CSW acts as a GAP (Su, 2020; Tang, 2020).

The GAP activity is also supported by the presence of a highly conserved arginine residue on

SMCR8 (R147), which is found in human, mouse, zebrafish, worm and Xenopus variants of

SMCR8 (Su, 2020). The equivalent of R147 is also present in other GAP proteins such as folliculin, where it acts as a catalytic residue (Su, 2020). Several GTPase families were proposed to interact with CSW (Sellier, 2016; Su, 2020; Tang, 2020); thus, more studies, especially structural studies, are needed to help determine CSW functions and binding partners.

Ultimately, the structural study of the C9orf72-SMCR8-WDR41 complex is an emerging topic with many unknowns, and many structural features that could help explain CSW involvement in ALS/FTD are yet to be discovered.

Materials and Methods

1. Protein constructs

Human constructs for C9orf72, SMCR8, and WDR41 were obtained from Dr. Nicolas Charlet-Berguerand’s lab at the Institute of Genetics and Molecular and Cellular Biology (France). Three pMF plasmids encoding these proteins were obtained and are shown in Table 1 (constructs 1 to

3). C9orf72 sequence has been optimized to remove negative RNA elements and rare codons.

2. Cloning into pFastBac

For protein expression in insect cells, genes were first cloned into a pFastBac vector

(Invitrogen).

2.1. Restriction enzyme digestion

The gene coding for WDR41 protein (construct 1 in Table 1) was cloned as a BamHI-XbaI

fragment, and the gene coding for C9orf72-SMCR8 protein complex (3 and 4 from Table 1)

was cloned as an NsiI-EcoRI fragment. 5 µg of pMF plasmid was digested with BamHI and XbaI restriction enzymes for the WDR41 construct and with NsiI, EcoRI, and PvuI restriction enzymes for the CS constructs. 2 µg of pFastBac plasmid was linearized with the corresponding restriction enzymes except for PvuI, as it was used to differentiate the gene of interest from the rest of the digested

plasmid. 10x Buffer 3.1 was used for all reactions, and Calf Intestinal Alkaline Phosphatase

was added during pFB digestion. Following digestion, samples were run on 0.7 % agarose gel

and the bands corresponding to the protein constructs and digested pFB vector were extracted

using QIAquick Gel Extraction Kit (QIAGEN).

2.2. Ligation and transformation

Ligation for each construct was performed using the Quick Ligation Kit (NEB). The ligated

sample was added to 50 µL of DH5α competent cells and kept on ice for 20 minutes. Then, the sample was heat-shocked at 42 ℃ for 45 seconds, placed on ice for another minute, and 250 µL of LB was added. The reaction tube was shaken at 37 ℃ for 30 minutes, and the cells were plated

on the LB agar plate containing 0.1 mg/mL ampicillin and kept overnight at 37 ℃.

Miniprep was done, and the plasmids were extracted using the QIAprep Spin Miniprep Kit

(QIAGEN). The presence of inserts was verified by DNA sequencing using pFB sequencing

primers 5’ – CTCTACAAATGTGGTATGG – 3’ and 5’ – TATTGCCGTCATAGCGCG – 3’.

3. Protein expression in Sf9 insect cells

All proteins were expressed using the Bac-to-Bac Baculovirus Expression protocol (Invitrogen) with minor modifications. DH10MultiBac cells (Berger, Fitzgerald, & Richmond, 2004) were used for bacterial transformation. Sf9 insect cells at 4 × 10⁶ cells/mL, grown in I-Max serum-free media

(Wisent BioProducts) were infected with 3.5 % baculovirus, or 1.75 % of each baculovirus for co-infection. Expression took place at 27 ℃ for 66 h with rotation at 90 rpm. Cells were harvested by centrifugation at 2,500 g and 4 ºC for 25 minutes, and the pellets were resuspended in 50 mM HEPES pH 7.6, 500 mM NaCl, 5 % glycerol, 7 mM 2-mercaptoethanol. For samples purified with Ni-affinity beads, 5 mM imidazole was added. Samples were stored at -80 ºC.
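The infection ratios above translate into virus volumes for a given culture size. The sketch below assumes that the stated percentages are volume/volume relative to the culture volume and that a co-infection splits the total equally between viruses; neither assumption is spelled out in the protocol, and the function name is hypothetical.

```python
def virus_volumes_ml(culture_ml, total_percent=3.5, n_viruses=1):
    """Volume of each baculovirus stock to add, assuming % is v/v of the culture.

    The total virus volume is split equally between viruses, so a two-virus
    co-infection at 3.5 % total corresponds to 1.75 % of each virus.
    """
    if n_viruses < 1:
        raise ValueError("n_viruses must be at least 1")
    total_ml = culture_ml * total_percent / 100.0
    return [total_ml / n_viruses] * n_viruses


single = virus_volumes_ml(1000)                # one virus in 1 L of culture
co_infection = virus_volumes_ml(1000, n_viruses=2)  # two viruses in 1 L
print(single, co_infection)
```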

4. Protein purification with Ni-affinity beads.

Cells were lysed by two rounds of sonication with 5 × 10 seconds pulses and 50 seconds delay between each. The lysate was centrifuged at 50,000 rpm at 4 ºC for 1 hour using Beckman Coulter

Optima L-100XP Ultracentrifuge, and the supernatant was incubated with Ni Sepharose 6 Fast

Flow beads (GE Healthcare) for 1 hour at 4 ºC. The column was then washed with 4 CV of binding buffer: 50 mM HEPES pH 7.6, 500 mM NaCl, 5 % glycerol, 5 mM Imidazole, and washed with

6 CV of binding buffer containing 30 mM Imidazole. Proteins were eluted with 2 × 2 CV of binding buffer containing 500 mM Imidazole (Buffer 1 Table 2). For several purifications, TCEP was added to the elution at varying concentrations (Buffer 3, 4, and 5 in Table 2) to minimize protein aggregation. Proteins were concentrated using Amicon Ultra-15 Centrifugal Filters

(Millipore) and run on a HiLoad 16/600 Superdex 200PG column using the AKTA Pure purification system (GE Healthcare). Final buffers used for size exclusion chromatography are summarized in Table 2 (buffers 2 and 6), as different conditions were tested to find the optimal parameters for protein purification. Fractions corresponding to the protein or complex peaks were collected, concentrated and stored at -80 ºC. Protein purity was assessed by running samples on

10 % SDS-PAGE at 160 V for 55 minutes (Laemmli, 1970).

5. Expression and purification of GFPnb

Plasmid containing the GFPnb sequence (MAQVQLVESGGALVQPGGSLRLSCAASGFPVNRY

SMRWYRQAPGKEREWVAGMSSAGDRSSYEDSVKGRFTISRDDARNTVYLQMNSLKPE

DTAVYYCNVNVGFEYWGQGTQVTVSSKKLEHHHHHH) was transformed in E. coli

BL21star competent cells and plated on LB agar plates containing 0.05 mg/mL of kanamycin.

Transformed cells were grown in 12 L of LB broth with 50 mg/L kanamycin at 37 ºC, and the expression was induced with 1 mM IPTG. Cells were harvested after 16 hour incubation at 18 ºC and resuspended in 50 mM Tris pH 7.5, 500 mM NaCl, 5 mM imidazole, 8 µg/mL DNase. Cells were lysed using EmulsiFlex-C3 French Press (Avestin) and centrifuged at 20,000 rpm at 4 ºC for

45 minutes. The supernatant was filtered using gravity filtration and incubated with 8 mL Ni-beads for 1 hour. Beads were washed with 400 mL of 10 mM NaxPO4 pH 7.5, 500 mM NaCl, 20 mM

Imidazole buffer and GFPnb were eluted with 3 CV of 10 mM NaxPO4 pH 7.5, 100 mM NaCl,

250 mM Imidazole. The sample was concentrated and run on HiLoad Superdex 75PG column in

10 mM NaxPO4 pH 7.5, 100 mM NaCl buffer. The corresponding GFPnb peak was collected, and the concentration was measured at 280 nm with a NanoDrop 2000 UV spectrophotometer (Thermo Scientific) using the mass extinction coefficient of 18.91 g⁻¹ L cm⁻¹. The sample concentration was adjusted to 1 mg/mL, and the sample was stored at -80 ºC.
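The concentration measurement and the adjustment to 1 mg/mL follow the Beer-Lambert law and a simple dilution, which can be sketched as below. The extinction coefficient is used exactly as stated above, a 1 cm path length is assumed (the instrument normalization is not described here), and the function names are hypothetical.

```python
def concentration_mg_per_ml(a280, ext_coeff_l_per_g_cm=18.91, path_cm=1.0):
    """Beer-Lambert law: A = epsilon * c * l, so c = A / (epsilon * l).

    With epsilon in L g^-1 cm^-1 the result is in g/L, which equals mg/mL.
    """
    return a280 / (ext_coeff_l_per_g_cm * path_cm)


def buffer_to_add_ml(stock_mg_per_ml, stock_ml, target_mg_per_ml=1.0):
    """Buffer volume needed to dilute a stock to the target (C1*V1 = C2*V2)."""
    if target_mg_per_ml > stock_mg_per_ml:
        raise ValueError("stock is already below the target concentration")
    final_ml = stock_mg_per_ml * stock_ml / target_mg_per_ml
    return final_ml - stock_ml


# Hypothetical example: 4 mL of a 2.5 mg/mL sample diluted to 1 mg/mL.
print(buffer_to_add_ml(2.5, 4.0))
```

The calculation is only as reliable as the coefficient and its units, so it is worth checking them against the sequence-derived value before relying on the result.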

6. Preparation of GFPnb-coupled beads

25 mL of N-Hydroxysuccinimide (NHS)-activated Sepharose 4 Fast Flow (Sigma-Aldrich)

beads were washed with 150 mL of cold dH2O, followed by 150 mL of cold 1 mM HCl and 1× phosphate-buffered saline until the flowthrough pH was 7.4. Then 25 mL of 1 mg/mL GFPnb was added, and the resin was incubated with nutation overnight at 4 ºC. The flowthrough was discarded, and the resin was incubated with 1 CV of 100 mM Tris pH 8.0 for 30 min at 4 ºC. After discarding the flowthrough, the same step was repeated two more times. Another 1 CV of buffer was added, and the resin was incubated overnight at 4 ºC with nutation. Lastly, the resin was washed with 10 CV of 20 % ethanol and stored at 4 ºC. To regenerate the beads after use, they were washed with 100 mL of 6 M guanidine-HCl, or until the beads became white, then rinsed in dH2O and 20 % ethanol and stored.

7. Generation of GFP tagged constructs

7.1. C9orf72-SMCR8 construct

C9orf72 protein was amplified, and BamHI and NotI restriction sites were added by

PCR using construct 2 (Table 1) as a template and the following primers (FW: 5’-

CTAGGGATCCATGTCAACCCTGTGCCCTC-3’, Rev: 5’-TTTTTTCCTTGCGGCCGCTCAGAAAGTCATCAGCACATCC-3’). The pFB-His-Strep-GFP plasmid was

digested with BamHI and NotI and further ligated with the PCR product using the protocol

mentioned previously. The resulting pFB-His-Strep-GFP-C9orf72 and construct #2 (Table 1) were then digested with SmaI (at 25 ºC for one hour) and XhoI (at 37 ºC for another hour). The digested plasmid and SMCR8 were then ligated to produce pFB-His-Strep-GFP-C9orf72-SMCR8 (construct 7 in Table 1).

In both cases, digested samples of interest and PCR products were isolated by running

0.7 % agarose gel and extracted using the protocol mentioned previously. DNA insertions

were also verified on an agarose gel and by DNA sequencing.

7.2. C9orf72-SMCR8-WDR41 construct

Construct 1 (Table 1) was used as a template and WDR41 protein was amplified by PCR

adding BamHI and NotI restriction sites. The following primers were used: FW 5’-

CTAGGGATCCATGTTGCGATGGCTGATC-3’ and Rev 5’-TTTTTTCCTTGCGGCCGCTCAGACAGCAAGGTATAAGTCACC-3’. Then, BamHI/NotI digested pFB plasmid

was ligated with the PCR product to produce pFB-His-Strep-GFP-WDR41. This construct

and construct 2 (Table 1) were digested with SmaI and XhoI as in 7.1. Ligation of pFB and

SMCR8 was then performed to produce pFB-His-Strep-GFP-WDR41-SMCR8. Lastly,

pFB-His-Strep-GFP-C9orf72 plasmid from 7.1 was used for PCR to obtain the protein

sequence of C9orf72 with new restriction sites. NheI and AvrII restriction sites were introduced by PCR using FW: 5’-CTAGGCTAGCGTATACTCCGGAATATTAATAGATCATGG-3’ and Rev: 5’-CTAGCCTAGGCTCAAGCAGTGATCA-3’ primers. C9orf72 PCR product and pFB-His-Strep-GFP-WDR41-SMCR8 plasmid were

digested with NheI and AvrII, purified and ligated, producing the final pFB-His-Strep-GFP-WDR41-SMCR8-C9orf72 construct (construct 8 in Table 1).

PCR fragments of interest, digested plasmids and proteins were isolated by running

0.7 % agarose gel and extracted using protocol mentioned previously. DNA insertions were

also verified on an agarose gel and by DNA sequencing.
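As a quick consistency check on the cloning primers listed in sections 7.1 and 7.2, the sketch below searches each primer for the recognition sequence of the restriction enzyme it is meant to introduce. The primer sequences are copied from the text with line breaks removed; the recognition sequences are the standard ones for these enzymes, and the labels and function name are hypothetical. All four sites are palindromic, so searching one strand is sufficient.

```python
# Standard recognition sequences of the enzymes named in sections 7.1 and 7.2.
RECOGNITION_SITES = {
    "BamHI": "GGATCC",
    "NotI": "GCGGCCGC",
    "NheI": "GCTAGC",
    "AvrII": "CCTAGG",
}

# (label, primer sequence 5'->3', enzyme whose site the primer should add)
PRIMERS = [
    ("7.1 C9orf72 FW", "CTAGGGATCCATGTCAACCCTGTGCCCTC", "BamHI"),
    ("7.1 C9orf72 Rev", "TTTTTTCCTTGCGGCCGCTCAGAAAGTCATCAGCACATCC", "NotI"),
    ("7.2 WDR41 FW", "CTAGGGATCCATGTTGCGATGGCTGATC", "BamHI"),
    ("7.2 WDR41 Rev", "TTTTTTCCTTGCGGCCGCTCAGACAGCAAGGTATAAGTCACC", "NotI"),
    ("7.2 C9orf72 FW", "CTAGGCTAGCGTATACTCCGGAATATTAATAGATCATGG", "NheI"),
    ("7.2 C9orf72 Rev", "CTAGCCTAGGCTCAAGCAGTGATCA", "AvrII"),
]


def site_position(primer, enzyme):
    """Return the 0-based index of the enzyme site in the primer, or -1."""
    return primer.upper().find(RECOGNITION_SITES[enzyme])


for label, sequence, enzyme in PRIMERS:
    position = site_position(sequence, enzyme)
    status = f"found at index {position}" if position >= 0 else "NOT FOUND"
    print(f"{label}: {enzyme} site {status}")
```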

8. Protein purification with GFPnb-coupled beads

Proteins were expressed in Sf9 cells and harvested as described previously. Cells were lysed by one round of sonication with 5 × 10 seconds pulses and 50 seconds delay between each. The lysate was centrifuged at 50,000 rpm at 4 ºC for 1 hour using Beckman Coulter Optima L-100XP

Ultracentrifuge. 1 mL of GFPnb-coupled Sepharose 4 Fast Flow beads were washed with 2 CV of dH2O and incubated with supernatant for 1 hour at 4 ºC in a gravity column. The column was then washed with 95 mL of wash buffer: 50 mM Tris pH 7.5, 150 mM NaCl, 1 mM EDTA, 5 mM dithiothreitol. For the elution, 5 mL of the elution buffer containing 75 µg of TEV protease was added to the beads and incubated overnight with slow nutation at 4 ºC. The eluted sample was then run on HiLoad 16/600 Superdex 200PG column using AKTA Pure purification system (GE

Healthcare). Two buffers for running SEC were used: buffers 7 and 8 in Table 2, as different conditions were screened for optimal protein purification. The final SEC buffer used for all further protein purifications with GFPnb contained 20 mM Tris pH 8.0, 150 mM NaCl, and 5 mM TCEP.

Fractions corresponding to the protein or complex peaks were collected, concentrated and stored at -80 ºC. Protein purity was determined by running protein on 10 % SDS-PAGE at 160 V for 55 minutes (Laemmli, 1970).

9. Sedimentation-Velocity Analytical Ultracentrifugation (SV-AUC)

An SV-AUC experiment was performed as previously described (Chen et al., 2018) to determine the ratio of C9orf72, SMCR8 and WDR41 proteins in the complex. Three samples of

CSW complex were used for the analysis with concentrations of 0.3 mg/mL, 0.5 mg/mL and

1.0 mg/mL. SEC buffer was composed of 20 mM Tris pH 8.0, 150 mM NaCl, 5 mM TCEP, and

5 % glycerol. Samples were centrifuged at 51,000 g using a Beckman Coulter XL-I analytical ultracentrifuge and an An-60Ti rotor for 18 hours at 20 °C. Sednterp (Laue, 1992) was used to obtain the solvent density (1.0192 g/cm³), viscosity (0.01841 mPa·s), and the partial specific volume (0.7376 cm³/g). The c(s) distribution graph was plotted using GUSSI (Brautigam, 2015).
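The solvent density and partial specific volume enter the mass estimate through the buoyancy term of the Svedberg equation, M = sRT / (D(1 - v̄ρ)). The sketch below shows that relationship only; it is not the c(s) Lamm-equation fitting actually used for the analysis, and the sedimentation and diffusion coefficients in the usage lines are hypothetical placeholders, since neither is reported in this section.

```python
GAS_CONSTANT = 8.314462618  # J mol^-1 K^-1


def buoyancy_factor(vbar_cm3_per_g, density_g_per_cm3):
    """Dimensionless buoyancy term (1 - vbar * rho)."""
    return 1.0 - vbar_cm3_per_g * density_g_per_cm3


def svedberg_mass_kda(s_svedberg, d_cm2_per_s, temp_kelvin, vbar, density):
    """Svedberg equation, M = s*R*T / (D * (1 - vbar*rho)), in SI units.

    The result is in kg/mol, which is numerically equal to kDa.
    """
    s_seconds = s_svedberg * 1e-13   # 1 Svedberg = 1e-13 s
    d_m2_per_s = d_cm2_per_s * 1e-4  # cm^2/s -> m^2/s
    return (s_seconds * GAS_CONSTANT * temp_kelvin) / (
        d_m2_per_s * buoyancy_factor(vbar, density)
    )


# Values from the text: vbar = 0.7376 cm^3/g, density = 1.0192 g/cm^3, 20 C.
THESIS_BUOYANCY = buoyancy_factor(0.7376, 1.0192)
print(round(THESIS_BUOYANCY, 4))

# Hypothetical s and D, used only to show how the function is called.
example_mass = svedberg_mass_kda(8.0, 4.0e-7, 293.15, 0.7376, 1.0192)
print(round(example_mass, 1))
```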

10. Hydrogen-Deuterium Exchange Mass Spectrometry (HDX-MS)

HDX-MS analysis was performed as described in Jenkins et al. (2018) and Lucic et al. (2018) with a few modifications. HDX reactions were carried out for C9orf72-SMCR8 (CS) and C9orf72-SMCR8-WDR41 (CSW) samples. The reaction proceeded in a 50 µL final reaction volume with

15 pmol of protein per sample. To start the reaction, 48.9 µL of D2O buffer (100 mM NaCl, 50 mM HEPES pH 7.5, 94 % D2O (v/v)) was added to 1.1 µL of protein solution (final D2O concentration of 92 %). The reaction was carried out in triplicate for five time points: 3 s at 4 °C (equivalent to 0.3 s at 20 °C), and 3, 30, 300, and 3000 s at 20 °C. Reactions were stopped by adding acidic quench buffer to a final post-quench concentration of 0.6 M guanidine-HCl and 0.9 % formic acid.
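The stated final D2O concentration and the protein concentration in the reaction follow directly from the volumes given above. A minimal sketch of that arithmetic, with hypothetical function names:

```python
def final_d2o_percent(buffer_ul, buffer_d2o_percent, protein_ul):
    """Final D2O percentage (v/v) after mixing D2O buffer with protein solution."""
    return buffer_ul * buffer_d2o_percent / (buffer_ul + protein_ul)


def protein_concentration_um(protein_pmol, reaction_ul):
    """Protein concentration in micromolar (pmol per microlitre equals µM)."""
    return protein_pmol / reaction_ul


d2o = final_d2o_percent(buffer_ul=48.9, buffer_d2o_percent=94.0, protein_ul=1.1)
conc = protein_concentration_um(protein_pmol=15.0, reaction_ul=50.0)
print(f"final D2O: {d2o:.1f} %, protein: {conc:.2f} µM")
```

The result rounds to the 92 % quoted in the protocol and corresponds to 0.3 µM protein in the 50 µL reaction.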

Samples were flash-frozen immediately and stored at -80 °C. HDX-MS data analysis was performed exactly as in Jenkins et al. (2018), but no fully deuterated sample was generated. For the comparison between CS and CSW complexes, a change in peptide deuteration in the presence of WDR41 was considered significant only when the following criteria were

met: a > 5 % change in % D (the percentage of deuterium incorporated into the trimer and dimer over 0.3 to 3000 s), a > 0.4 Da difference in # D (the number of deuterons incorporated), and a p-value less than 0.01 between triplicates.
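The three significance criteria can be expressed as a single filter. The sketch below assumes that all three conditions must hold at once and that the thresholds apply to the magnitude of the change, so that both protection and deprotection are captured; the text does not state the sign convention. The class and function names are hypothetical, and the p-value is taken as an input rather than computed.

```python
from dataclasses import dataclass


@dataclass
class PeptideComparison:
    """Difference in deuteration for one peptide between CS and CSW."""
    percent_d_change: float    # change in % D
    deuteron_change_da: float  # change in # D, in Da
    p_value: float             # p-value between triplicates


def is_significant(comparison, min_percent=5.0, min_da=0.4, max_p=0.01):
    """Apply all three criteria; magnitudes are used (assumed convention)."""
    return (
        abs(comparison.percent_d_change) > min_percent
        and abs(comparison.deuteron_change_da) > min_da
        and comparison.p_value < max_p
    )


# Hypothetical peptides, for illustration only.
protected = PeptideComparison(-8.2, -0.9, 0.002)
too_small = PeptideComparison(-3.1, -0.5, 0.001)
print(is_significant(protected), is_significant(too_small))
```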

11. Generation of deletion mutants for CSW complex.

Based on HDX-MS results, mutant CSW constructs were created with deleted unstructured regions. Seven CSW constructs (Table 3) with deletion mutations were generated using the QuikChange Lightning Multi Site-Directed Mutagenesis kit (Agilent) following the protocol provided with the kit. The His-Strep-GFP-WDR41-SMCR8-C9orf72 construct was used as a template. Mutagenesis primers used for this experiment are shown in Table 4. Successful mutagenesis was verified by DNA sequencing. For mutations in WDR41, the reverse 5’-caataactgttctatcagcagaggc-3’ and forward 5’-catgtgatgaagagaatgtatttgc-3’ primers were used. For mutations in SMCR8, the reverse 5’-atcggaagttgaaggggc-3’ and forward 5’-accttggagtcctccacgaag-3’ primers were used.

Table 4: Mutagenesis primers used to generate CSW deletion mutants.

CSWΔ   Mutagenesis primers
1      5’-gacaggccagcaggatagtcatcgacctccacaaag-3’
2      5’-gacaggccagcaggatagggtatgctttcttgtttc-3’
3      5’-gacaggccagcaggataggaaggcggcctgtcttg-3’
4      5’-gacaggccagcaggataggacttgtaagaaccaac-3’
5      5’-ggcgccatgggatccctggccgagaaatctc-3’
6      5’-gtacgcatttgggagttatcactggagcttattggag-3’
7      Primers 5 and 6
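Primers 1-4 in Table 4 all generate SMCR8 deletions that end at residue 692, so one would expect them to share the segment that anneals next to that fixed boundary and to differ in the segment next to the variable boundary. The sketch below extracts the shared prefix to illustrate this. That the shared segment flanks residue 692 is an inference from the primer design, not something stated in the text.

```python
from os.path import commonprefix

# SMCR8 linker deletion primers copied from Table 4 (5'->3').
SMCR8_PRIMERS = {
    1: "gacaggccagcaggatagtcatcgacctccacaaag",
    2: "gacaggccagcaggatagggtatgctttcttgtttc",
    3: "gacaggccagcaggataggaaggcggcctgtcttg",
    4: "gacaggccagcaggataggacttgtaagaaccaac",
}

# commonprefix compares strings character by character.
shared_flank = commonprefix(list(SMCR8_PRIMERS.values()))
variable_flanks = {
    number: primer[len(shared_flank):]
    for number, primer in SMCR8_PRIMERS.items()
}

print(f"shared segment ({len(shared_flank)} nt): {shared_flank}")
for number, flank in variable_flanks.items():
    print(f"primer {number} variable segment ({len(flank)} nt): {flank}")
```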

References

Amick, J., & Ferguson, S. M. (2017). C9orf72: At the intersection of cell biology and neurodegenerative disease. Traffic, 18(5), 267-276. doi:10.1111/tra.12477

Amick, J., Tharkeshwar, A. K., Amaya, C., & Ferguson, S. M. (2018). WDR41 supports lysosomal response to changes in amino acid availability. Mol Biol Cell, 29(18), 2213-2227. doi:10.1091/mbc.E17-12-0703

Arthur, K. C., Calvo, A., Price, T. R., Geiger, J. T., Chiò, A., & Traynor, B. J. (2016). Projected increase in amyotrophic lateral sclerosis from 2015 to 2040. Nat Commun, 7, 12408. doi:10.1038/ncomms12408

Babić Leko, M., Župunski, V., Kirincich, J., Smilović, D., Hortobágyi, T., Hof, P. R., & Šimić, G. (2019). Molecular Mechanisms of Neurodegeneration Related to C9orf72 Hexanucleotide Repeat Expansion. Behav Neurol, 2019, 2909168. doi:10.1155/2019/2909168

Balendra, R., & Isaacs, A. M. (2018). C9orf72-mediated ALS and FTD: multiple pathways to disease. Nat Rev Neurol, 14(9), 544-558. doi:10.1038/s41582-018-0047-2

Banci, L., Bertini, I., Boca, M., Girotto, S., Martinelli, M., Valentine, J. S., & Vieru, M. (2008). SOD1 and amyotrophic lateral sclerosis: mutations and oligomerization. PLoS One, 3(2), e1677. doi:10.1371/journal.pone.0001677

Belzil, V. V., Katzman, R. B., & Petrucelli, L. (2016). ALS and FTD: an epigenetic perspective. Acta Neuropathol, 132(4), 487-502. doi:10.1007/s00401-016-1587-4

Berger, I., Fitzgerald, D. J., & Richmond, T. J. (2004). Baculovirus expression system for heterologous multiprotein complexes. Nat Biotechnol, 22(12), 1583-1587. doi:10.1038/nbt1036

Bhandari, R., Kuhad, A., & Kuhad, A. (2018). Edaravone: a new hope for deadly amyotrophic lateral sclerosis. Drugs Today (Barc), 54(6), 349-360. doi:10.1358/dot.2018.54.6.2828189

Bourinaris, T., & Houlden, H. (2018). C9orf72 and its Relevance in Parkinsonism and Movement Disorders: A Comprehensive Review of the Literature. Mov Disord Clin Pract, 5(6), 575-585. doi:10.1002/mdc3.12677

Brautigam, C. A. (2015). Calculations and Publication-Quality Illustrations for Analytical Ultracentrifugation Data. Methods Enzymol, 562, 109-133. doi:10.1016/bs.mie.2015.05.001

Byrne, S., Walsh, C., Lynch, C., Bede, P., Elamin, M., Kenna, K., Hardiman, O. (2011). Rate of familial amyotrophic lateral sclerosis: a systematic review and meta-analysis. J Neurol Neurosurg Psychiatry, 82(6), 623-627. doi:10.1136/jnnp.2010.224501

Chen, Y. S., Kozlov, G., Fakih, R., Funato, Y., Miki, H., & Gehring, K. (2018). The cyclic nucleotide-binding homology domain of the integral membrane protein CNNM mediates dimerization and is required for Mg(2+) efflux activity. J Biol Chem, 293(52), 19998-20007. doi:10.1074/jbc.RA118.005672

Corbier, C., & Sellier, C. (2017). C9ORF72 is a GDP/GTP exchange factor for Rab8 and Rab39 and regulates autophagy. Small GTPases, 8(3), 181-186. doi:10.1080/21541248.2016.1212688

Evans, C. S., & Holzbaur, E. L. F. (2019). Autophagy and mitophagy in ALS. Neurobiol Dis, 122, 35-40. doi:10.1016/j.nbd.2018.07.005

Farg, M. A., Sundaramoorthy, V., Sultana, J. M., Yang, S., Atkinson, R. A., Levina, V., Atkin, J. D. (2014). C9ORF72, implicated in amytrophic lateral sclerosis and frontotemporal dementia, regulates endosomal trafficking. Hum Mol Genet, 23(13), 3579-3595. doi:10.1093/hmg/ddu068

Ferraiuolo, L., Kirby, J., Grierson, A. J., Sendtner, M., & Shaw, P. J. (2011). Molecular pathways of injury in amyotrophic lateral sclerosis. Nat Rev Neurol, 7(11), 616-630. doi:10.1038/nrneurol.2011.152

Fischer, L., & Rappsilber, J. (2017). Quirks of Error Estimation in Cross-Linking/Mass Spectrometry. Anal Chem, 89(7), 3829-3833. doi:10.1021/acs.analchem.6b03745

Fowler, M. L., McPhail, J. A., Jenkins, M. L., Masson, G. R., Rutaganira, F. U., Shokat, K. M., Burke, J. E. (2016). Using hydrogen deuterium exchange mass spectrometry to engineer optimized constructs for crystallization of protein complexes: Case study of PI4KIIIβ with Rab11. Protein Sci, 25(4), 826-839. doi:10.1002/pro.2879

Ghetti, B., Oblak, A. L., Boeve, B. F., Johnson, K. A., Dickerson, B. C., & Goedert, M. (2015). Invited review: Frontotemporal dementia caused by microtubule-associated protein tau gene (MAPT) mutations: a chameleon for neuropathology and neuroimaging. Neuropathol Appl Neurobiol, 41(1), 24-46. doi:10.1111/nan.12213

Green, K. M., Linsalata, A. E., & Todd, P. K. (2016). RAN translation-What makes it run? Brain Res, 1647, 30-42. doi:10.1016/j.brainres.2016.04.003

Grimm, M., Zimniak, T., Kahraman, A., & Herzog, F. (2015). xVis: a web server for the schematic visualization and interpretation of crosslink-derived spatial restraints. Nucleic Acids Res, 43(W1), W362-369. doi:10.1093/nar/gkv463

Hodges, J. (2012). Familial frontotemporal dementia and amyotrophic lateral sclerosis associated with the C9ORF72 hexanucleotide repeat. Brain, 135(Pt 3), 652-655. doi:10.1093/brain/aws033

Iyer, S., Subramanian, V., & Acharya, K. R. (2018). C9orf72, a protein associated with amyotrophic lateral sclerosis (ALS) is a guanine nucleotide exchange factor. PeerJ, 6, e5815. doi:10.7717/peerj.5815

Jenkins, M. L., Margaria, J. P., Stariha, J. T. B., Hoffmann, R. M., McPhail, J. A., Hamelin, D. J., Burke, J. E. (2018). Structural determinants of Rab11 activation by the guanine nucleotide exchange factor SH3BP5. Nat Commun, 9(1), 3772. doi:10.1038/s41467-018-06196-z

Knopman, D. S., & Roberts, R. O. (2011). Estimating the number of persons with frontotemporal lobar degeneration in the US population. J Mol Neurosci, 45(3), 330-335. doi:10.1007/s12031-011-9538-y

Kumar, D. R., Aslinia, F., Yale, S. H., & Mazza, J. J. (2011). Jean-Martin Charcot: the father of neurology. Clin Med Res, 9(1), 46-49. doi:10.3121/cmr.2009.883

Laemmli, U. K. (1970). Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature, 227(5259), 680-685. doi:10.1038/227680a0

Laferriere, F., & Polymenidou, M. (2015). Advances and challenges in understanding the multifaceted pathogenesis of amyotrophic lateral sclerosis. Swiss Med Wkly, 145, w14054. doi:10.4414/smw.2015.14054

Larson, T. C., Kaye, W., Mehta, P., & Horton, D. K. (2018). Amyotrophic Lateral Sclerosis Mortality in the United States, 2011-2014. Neuroepidemiology, 51(1-2), 96-103. doi:10.1159/000488891

Laue, T., Shah, B. V., Ridgeway, T. M., & Pelletier, S. L. (1992). Computer-aided interpretation of analytical sedimentation data for proteins.

Leitner, A., Reischl, R., Walzthoeni, T., Herzog, F., Bohn, S., Forster, F., & Aebersold, R. (2012). Expanding the chemical cross-linking toolbox by the use of multiple proteases and enrichment by size exclusion chromatography. Mol Cell Proteomics, 11(3), M111.014126. doi:10.1074/mcp.M111.014126

Levine, T. P., Daniels, R. D., Gatta, A. T., Wong, L. H., & Hayes, M. J. (2013). The product of C9orf72, a gene strongly implicated in neurodegeneration, is structurally related to DENN Rab-GEFs. Bioinformatics, 29(4), 499-503. doi:10.1093/bioinformatics/bts725

Ling, S. C., Polymenidou, M., & Cleveland, D. W. (2013). Converging mechanisms in ALS and FTD: disrupted RNA and protein homeostasis. Neuron, 79(3), 416-438. doi:10.1016/j.neuron.2013.07.033

Lucic, I., Rathinaswamy, M. K., Truebestein, L., Hamelin, D. J., Burke, J. E., & Leonard, T. A. (2018). Conformational sampling of membranes by Akt controls its activation and inactivation. Proc Natl Acad Sci U S A, 115(17), E3940-E3949. doi:10.1073/pnas.1716109115

Murphy, N. A., Arthur, K. C., Tienari, P. J., Houlden, H., Chiò, A., & Traynor, B. J. (2017). Age-related penetrance of the C9orf72 repeat expansion. Sci Rep, 7(1), 2116. doi:10.1038/s41598-017-02364-1

Niblock, M., Smith, B. N., Lee, Y. B., Sardone, V., Topp, S., Troakes, C., Gallo, J. M. (2016). Retention of hexanucleotide repeat-containing intron in C9orf72 mRNA: implications for the pathogenesis of ALS/FTD. Acta Neuropathol Commun, 4, 18. doi:10.1186/s40478-016-0289-4

Olney, N. T., Spina, S., & Miller, B. L. (2017). Frontotemporal Dementia. Neurol Clin, 35(2), 339-374. doi:10.1016/j.ncl.2017.01.008

Oskarsson, B., Gendron, T. F., & Staff, N. P. (2018). Amyotrophic Lateral Sclerosis: An Update for 2018. Mayo Clin Proc, 93(11), 1617-1628. doi:10.1016/j.mayocp.2018.04.007

Petkau, T. L., & Leavitt, B. R. (2014). Progranulin in neurodegenerative disease. Trends Neurosci, 37(7), 388-398. doi:10.1016/j.tins.2014.04.003

Schuck, P. (2000). Size-distribution analysis of macromolecules by sedimentation velocity ultracentrifugation and Lamm equation modeling. Biophys J, 78(3), 1606-1619. doi:10.1016/s0006-3495(00)76713-0

Sellier, C., Campanari, M. L., Julie Corbier, C., Gaucherot, A., Kolb-Cheynel, I., Oulad-Abdelghani, M., . . . Charlet-Berguerand, N. (2016). Loss of C9ORF72 impairs autophagy and synergizes with polyQ Ataxin-2 to induce motor neuron dysfunction and cell death. EMBO J, 35(12), 1276-1297. doi:10.15252/embj.201593350

Song, S., Cong, W., Zhou, S., Shi, Y., Dai, W., Zhang, H., Zhang, Q. (2019). Small GTPases: Structure, biological function and its interaction with nanoparticles. Asian J Pharm Sci, 14(1), 30-39. doi:10.1016/j.ajps.2018.06.004

Steenland, K., MacNeil, J., Seals, R., & Levey, A. (2010). Factors affecting survival of patients with neurodegenerative disease. Neuroepidemiology, 35(1), 28-35. doi:10.1159/000306055

Su, M.-Y., Zoncu, R., & Hurley, J. H. (2020). Structure of the lysosomal SCARF (L-SCARF) complex, an Arf GAP haploinsufficient in ALS and FTD. bioRxiv, 2020.04.15.042515. doi:10.1101/2020.04.15.042515

Sullivan, P. M., Zhou, X., Robins, A. M., Paushter, D. H., Kim, D., Smolka, M. B., & Hu, F. (2016). The ALS/FTLD associated protein C9orf72 associates with SMCR8 and WDR41 to regulate the autophagy-lysosome pathway. Acta Neuropathol Commun, 4(1), 51. doi:10.1186/s40478-016-0324-5

Sun, L., & Eriksen, J. L. (2011). Recent insights into the involvement of progranulin in frontotemporal dementia. Curr Neuropharmacol, 9(4), 632-642. doi:10.2174/157015911798376361

Tang, D., Sheng, J., Xu, L., Zhan, X., Liu, J., Jiang, H., Qi, S. (2020). Cryo-EM structure of C9ORF72-SMCR8-WDR41 reveals the role as a GAP for Rab8a and Rab11a. Proc Natl Acad Sci U S A, 117(18), 9876-9883. doi:10.1073/pnas.2002110117

Tiryaki, E., & Horak, H. A. (2014). ALS and other motor neuron diseases. Continuum (Minneap Minn), 20(5 Peripheral Nervous System Disorders), 1185-1207. doi:10.1212/01.CON.0000455886.14298.a4

Todd, T. W., & Petrucelli, L. (2016). Insights into the pathogenic mechanisms of Chromosome 9 open reading frame 72 (C9orf72) repeat expansions. J Neurochem, 138 Suppl 1, 145-162. doi:10.1111/jnc.13623

Trnka, M. J., Baker, P. R., Robinson, P. J., Burlingame, A. L., & Chalkley, R. J. (2014). Matching cross-linked peptide spectra: only as good as the worse identification. Mol Cell Proteomics, 13(2), 420-434. doi:10.1074/mcp.M113.034009

Tsai, R. M., & Boxer, A. L. (2014). Treatment of frontotemporal dementia. Curr Treat Options Neurol, 16(11), 319. doi:10.1007/s11940-014-0319-0

Vera, L., Czarny, B., Georgiadis, D., Dive, V., & Stura, E. A. (2011). Practical Use of Glycerol in Protein Crystallization. Crystal Growth and Design, 11(7), 2755-2762. doi:10.1021/cg101364m

Wang, Y., Hu, X. J., Zou, X. D., Wu, X. H., Ye, Z. Q., & Wu, Y. D. (2015). WDSPdb: a database for WD40-repeat proteins. Nucleic Acids Res, 43(Database issue), D339-344. doi:10.1093/nar/gku1023

Webster, C. P., Smith, E. F., Bauer, C. S., Moller, A., Hautbergue, G. M., Ferraiuolo, L., De Vos, K. J. (2016). The C9orf72 protein interacts with Rab1a and the ULK1 complex to regulate initiation of autophagy. EMBO J, 35(15), 1656-1676. doi:10.15252/embj.201694401

Yang, M., Liang, C., Swaminathan, K., Herrlinger, S., Lai, F., Shiekhattar, R., & Chen, J. F. (2016). A C9ORF72/SMCR8-containing complex regulates ULK1 and plays a dual role in autophagy. Sci Adv, 2(9), e1601167. doi:10.1126/sciadv.1601167

Zhang, D., Iyer, L. M., He, F., & Aravind, L. (2012). Discovery of Novel DENN Proteins: Implications for the Evolution of Eukaryotic Intracellular Membrane Structures and Human Disease. Front Genet, 3, 283. doi:10.3389/fgene.2012.00283

Zhang, Z., Wang, Y., Ding, Y., & Hattori, M. (2020). Structure-based engineering of anti-GFP nanobody tandems as ultra-high-affinity reagents for purification. Sci Rep, 10(1), 6239. doi:10.1038/s41598-020-62606-7
