GENOME-WIDE APPROACHES TO INVESTIGATE RARE NEUROLOGICAL DISORDERS IN FRENCH CANADIANS

Karine Choquet

Department of Human Genetics
McGill University, Montréal, Canada
April 2018

A thesis submitted to McGill University in partial fulfillment of the requirements of the degree of Doctor of Philosophy

© Karine Choquet, 2018

ABSTRACT

The French Canadian population of Québec is defined by a unique history and genetic heritage, which has led to the regional clustering of a large number of Mendelian diseases. This includes several types of cerebellar ataxias, a heterogeneous group of neurological disorders characterized by impaired balance and coordination. In the past 20 years, the causative genes have been identified for the major forms of autosomal recessive cerebellar ataxias (ARCA) in Québec. However, approximately 30% of French Canadian ARCA patients remain without a definite molecular diagnosis. We combined whole exome and targeted sequencing and identified the underlying genetic defect in 13 families, representing more than 40% of our unresolved ARCA cohort. Specifically, we uncovered pathogenic SPG7 mutations in 12 families, demonstrating that this gene is an important cause of spastic ataxia in Québec. In the last family, we found a homozygous mutation in the PMPCA gene, a recently described cause of ARCA. To increase the number of available treatments for cerebellar ataxias, it is crucial to gain a better understanding of the pathogenic mechanisms responsible for these diseases. Thus, we next focused on one specific ataxia-related disorder, Pol III-related hypomyelinating leukodystrophy (POLR3-HLD), and on unraveling its pathophysiological processes. POLR3-HLD is characterized by deficient cerebral myelin formation and is caused by recessive mutations in the genes POLR3A, POLR3B and POLR1C, encoding three subunits of RNA Polymerase III (Pol III). Pol III is an essential enzyme responsible for the synthesis of transfer RNAs (tRNA) and many small non-coding RNAs. First, we generated two knock-in mouse models and performed an in-depth phenotypic and molecular characterization.

We found that the Polr3a c.2015G>A (p.G672E) mutation does not lead to a neurological phenotype in mouse, contrary to what is observed in human homozygotes, and that the G672E mutation causes only a mild defect in Pol III function in human cells. Next, using CRISPR-Cas9 gene editing and two complementary RNA-sequencing approaches, we observed that mutations in POLR3A lead to decreased expression of a subset of Pol III transcripts in human cell lines, with the strongest impact on nuclear-encoded tRNAs, the signal recognition particle 7SL RNA and the brain cytoplasmic BC200 RNA. Furthermore, expression profiling in patient-derived fibroblasts and genomic deletion of BC200 in an oligodendroglial cell line provide the first evidence linking this non-coding RNA to POLR3-HLD pathogenesis. Overall, our findings contribute to a better understanding of the genetic and molecular basis of three distinct ataxia-related disorders in French Canadians, while providing insights into the role of an essential enzyme in mouse and human.

RÉSUMÉ

The French Canadian population of Québec is characterized by a unique history and genetic heritage that have led to the regional clustering of numerous hereditary diseases. These include various forms of cerebellar ataxia, a heterogeneous group of neurological diseases defined by impaired balance and coordination. Over the past twenty years, the genes causing the major forms of autosomal recessive cerebellar ataxia (ARCA) in Québec have been identified. Nevertheless, approximately 30% of French Canadian patients with ARCA still lack a molecular diagnosis. By combining whole exome and targeted sequencing, we identified the genetic defect in 13 families, representing more than 40% of our cohort of unresolved ARCA cases. Specifically, we discovered pathogenic mutations in the SPG7 gene in 12 families, thereby demonstrating that it is an important cause of spastic ataxia in Québec. In the last family, we found a homozygous mutation in the PMPCA gene, a recently identified new cause of ARCA. To increase the insufficient number of treatments available for cerebellar ataxias, it is essential to better understand the pathogenic mechanisms underlying these diseases. We therefore next focused on resolving the pathophysiological processes at play in RNA polymerase III-related hypomyelinating leukodystrophies (POLR3-HLD). This group of diseases, characterized by deficient myelin formation in the central nervous system, is also associated with cerebellar ataxia and is caused by recessive mutations in the genes POLR3A, POLR3B and POLR1C, which encode three subunits of RNA polymerase III (Pol III). Pol III is an essential enzyme that synthesizes transfer RNAs (tRNA) as well as many small non-coding RNAs.

First, we generated two mouse models and characterized them in detail at the phenotypic and molecular levels. Our results show that the c.2015G>A (p.G672E) mutation in Polr3a does not cause a neurological phenotype in mice, contrary to what is observed in homozygous human patients. In addition, the G672E mutation causes only a mild defect in Pol III function in human cells. Next, using CRISPR-Cas9 genome editing and two complementary RNA sequencing methods, we discovered that mutations in POLR3A lead to decreased expression of a subset of Pol III transcripts in human cell lines. The most affected transcripts include tRNAs, the 7SL RNA of the signal recognition particle and the brain cytoplasmic RNA BC200. Furthermore, using expression measurements in patient fibroblasts and genomic deletion of BC200 in an oligodendroglial cell line, we demonstrate a first link between this non-coding RNA and the pathogenesis of POLR3-HLD. In conclusion, our data contribute to a better understanding of the genetic and molecular bases of three different ataxia-related diseases in French Canadians, while deepening our knowledge of the role of an essential enzyme in mouse and human.

Table of Contents

ABSTRACT
RÉSUMÉ
LIST OF ABBREVIATIONS
LIST OF FIGURES
LIST OF TABLES
ACKNOWLEDGEMENTS
PREFACE
    Contribution of Authors
    Original Contribution to Knowledge
CHAPTER 1: General Introduction
    1.1 Genetics of autosomal recessive cerebellar ataxias
        1.1.1 Cerebellar ataxias
        1.1.2 Autosomal recessive cerebellar ataxias (ARCA)
        1.1.3 ARCAs in the French Canadian population
        1.1.4 Next-generation sequencing in the genetic diagnosis of ARCAs
    1.2 Pol III-related leukodystrophies
        1.2.1 Myelination in the central nervous system
        1.2.2 Leukodystrophies
        1.2.3 Pol III-related leukodystrophy: a new disease entity
        1.2.4 Expansion of the phenotypic spectrum associated with POLR3A and POLR3B mutations
    1.3 Function and regulation of RNA Polymerase III and its transcripts
        1.3.1 Pol III transcription machinery and associated factors
        1.3.2 Functions of Pol III transcripts
        1.3.3 Regulation of Pol III transcription
    1.4 Quantification of Pol III transcripts
    Tables
    Figures
RATIONALE, HYPOTHESIS AND OBJECTIVES
PREFACE TO CHAPTER 2
CHAPTER 2: Identification of new causes of cerebellar ataxia in French Canadians
    Part A: SPG7 mutations explain a significant proportion of French Canadian spastic ataxia cases
        Abstract
        Introduction
        Subjects and Methods
        Results
        Discussion
        Figures and Tables
        Web Resources
        Acknowledgements
        Supplementary Materials
    Part B: Autosomal recessive cerebellar ataxia caused by a homozygous mutation in PMPCA
        Letter to the Editor
        Figures
        Acknowledgements
PREFACE TO CHAPTER 3
CHAPTER 3: Absence of neurological abnormalities in mice homozygous for the Polr3a G672E hypomyelinating leukodystrophy mutation
    Abstract
    Introduction
    Results
    Discussion
    Materials and Methods
    Figures
    Declarations
    Supplementary Materials
PREFACE TO CHAPTER 4
CHAPTER 4: Mutations in POLR3A impact a subset of Pol III transcripts including BC200 RNA
    Abstract
    Introduction
    Results
    Discussion
    Materials and Methods
    Figures
    Acknowledgements and Funding
    Supplementary Materials
CHAPTER 5: General Discussion
    5.1 Summary
    5.2 New insights into the genetic basis of ARCAs
        5.2.1 Phenotypic expansion and the ataxia-spasticity spectrum
        5.2.2 PMPCA-related ataxia, an ultra-rare genetic disorder
        5.2.3 Autosomal Recessive Spastic Ataxia with Leukoencephalopathy: not a single disease entity
    5.3 The challenges of developing a disease model for POLR3-HLD
        5.3.1 Mouse models and myelin disorders
        5.3.2 Advantages and limitations of a human cellular model
        5.3.3 Limitations of patient-derived fibroblasts
    5.4 Insights into Pol III biology and POLR3-HLD
        5.4.1 Lessons learned on Pol III biology
        5.4.2 Pol III transcripts as candidate effectors of POLR3-HLD
    5.5 Genome-wide approaches to study rare neurological diseases
        5.5.1 Whole exome sequencing and beyond
        5.5.2 Genome-wide quantification of Pol III transcripts
CHAPTER 6: Conclusion and Future Directions
    6.1 Conclusion
    6.2 Future directions
REFERENCES
APPENDIX
    Significant contributions to other projects

LIST OF ABBREVIATIONS

4H: Hypomyelination, Hypodontia and Hypogonadotropic hypogonadism
AARS: alanyl-tRNA synthetase
AIMP1: aminoacyl tRNA synthetase complex-interacting multifunctional protein 1
ALD: adrenoleukodystrophy
ANCOVA: analysis of covariance
AOA2: Ataxia oculomotor apraxia 2
AR: autosomal recessive
ARCA: autosomal recessive cerebellar ataxia
ARCA1: Autosomal recessive cerebellar ataxia type 1
ARM-seq: AlkB-facilitated RNA methylation sequencing
ARS: aminoacyl-tRNA synthetase
ARSACS: Autosomal recessive spastic ataxia of Charlevoix-Saguenay
ARSAL: Autosomal recessive spastic ataxia with leukoencephalopathy
BAF: barrier-to-autointegration factor
BDP1: B double prime 1
bp: base pairs
BRF1/2: B-related factor 1/2
cDNA: complementary DNA
ChIP-seq: chromatin immunoprecipitation followed by high-throughput sequencing
ChIP-qPCR: chromatin immunoprecipitation followed by quantitative PCR
CI: confidence interval
cm: centimeter
CMT: Charcot-Marie-Tooth disease
CNP: 2',3'-Cyclic Nucleotide 3' Phosphodiesterase
CNS: central nervous system
CRISPR: Clustered Regularly Interspaced Short Palindromic Repeats
DARS: aspartyl-tRNA synthetase
DM-tRNA-seq: Demethylase-thermostable group II intron RT tRNA sequencing
DNA: deoxyribonucleic acid
DSE: distal sequence element
eIF2B: eukaryotic initiation factor 2B
eIF4A: eukaryotic initiation factor 4A
eIF4B: eukaryotic initiation factor 4B
eIF4F: eukaryotic initiation factor 4F
eIF4G: eukaryotic initiation factor 4G
EPRS: glutamyl-prolyl-tRNA synthetase
ER: endoplasmic reticulum
EVS: Exome Variant Server
ExAC: Exome Aggregation Consortium
FC: French Canadian
FDR: false discovery rate
FMRP: Fragile X mental retardation protein
FRDA: Friedreich Ataxia
FXN: frataxin
GAPDH: glyceraldehyde 3-phosphate dehydrogenase
gDNA: genomic DNA
GO: gene ontology
HEXIM1/2: hexamethylene bisacetamide inducible 1/2
HLD: hypomyelinating leukodystrophy
hnRNP: heterogeneous nuclear ribonucleoprotein
HSP: hereditary spastic paraplegia
HSP90: heat shock protein 90
IGV: Integrative Genomics Viewer
indel: insertion / deletion
iPSC: induced pluripotent stem cell
KD: knockdown
kDa: kilodalton
KI: knock-in
KO: knock-out
LC-MS/MS: liquid chromatography-tandem mass spectrometry
LFB: Luxol Fast Blue
LO: leukodystrophy with oligodontia
MAF: minor allele frequency
MAG: myelin-associated glycoprotein
MAP1B: microtubule-associated protein 1B
MBP: myelin basic protein
MDH2: malate dehydrogenase 2
MePCE: methylphosphate capping enzyme
mg: milligram
MIR: mammalian interspersed repeat
mM: millimolar
miRNA: microRNA
mL: milliliter
MPP: mitochondrial processing peptidase
MRI: magnetic resonance imaging
mRNA: messenger RNA
MRP: mitochondrial RNA processing
MS: multiple sclerosis
mTOR: mammalian target of rapamycin
mTORC1: mammalian target of rapamycin complex 1
µg: microgram
µL: microliter
µm: micrometer
µM: micromolar
ncRNA: non-coding RNA
ng: nanogram
NG2: neural/glial antigen 2
NGS: next-generation sequencing
nt: nucleotides
OLIG2: oligodendrocyte transcription factor 2
OPC: oligodendrocyte progenitor cell
PABP: poly-A binding protein
PAR-CLIP: photoactivatable crosslinking and immunoprecipitation
PCR: polymerase chain reaction
PDGFRα: platelet derived growth factor receptor alpha
PIP3: phosphatidylinositol (3,4,5)-trisphosphate
PLP: proteolipid protein
PMD: Pelizaeus-Merzbacher disease
pmol: picomole
Pol I: RNA Polymerase I
Pol II: RNA Polymerase II
Pol III: RNA Polymerase III
POLG: DNA polymerase gamma
POLR1C: DNA-directed RNA polymerase I subunit C
POLR1D: DNA-directed RNA polymerase I subunit D
POLR2C: DNA-directed RNA polymerase II subunit C
POLR2J: DNA-directed RNA polymerase II subunit J
POLR3A: DNA-directed RNA polymerase III subunit A
POLR3B: DNA-directed RNA polymerase III subunit B
POLR3-HLD: RNA Polymerase III-related hypomyelinating leukodystrophy
POU2F1: POU domain, class 2, transcription factor 1
P-TEFb: positive transcription elongation factor b
PSE: proximal sequence element
qRT-PCR: quantitative real-time PCR
RARS: arginyl-tRNA synthetase
RNA: ribonucleic acid
RNA-seq: RNA-sequencing
RNP: ribonucleoprotein
RPAP2/3/4: RNA Polymerase II-associated protein 2/3/4
rpm: revolutions per minute
rRNA: ribosomal RNA
SCA: spinocerebellar ataxia
SEM: standard error of the mean
sgRNA: single guide RNA
SILAC: stable isotope labeling of amino acids in culture
SINE: short interspersed nuclear element
siRNA: small interfering RNA
SLSJ: Saguenay-Lac-St-Jean
SNAPc: snRNA activating protein complex
snoRNA: small nucleolar RNA
SNP: single nucleotide polymorphism
snRNA: small nuclear RNA
snRNP: small nuclear ribonucleoprotein
SPG7: spastic paraplegia type 7
SRP: signal recognition particle
SSB: single-strand DNA binding protein
ssODN: single strand oligodeoxynucleotide
SYNCRIP: synaptotagmin binding cytoplasmic RNA interacting protein
TACH: Tremor Ataxia with Central Hypomyelination
TBP: TATA-binding protein
TFIIIA/B/C: transcription factor III A/B/C
TGIRT: thermostable group II intron reverse transcriptase
tRNA: transfer RNA
uORF: upstream open reading frame
UTR: untranslated region
VWM: Vanishing White Matter disease
WES: whole exome sequencing
WGS: whole genome sequencing
WT: wild-type
ZNF143: zinc finger protein 143

LIST OF FIGURES

Chapter 1
Figure 1.1. Schematic of an axon myelinated by oligodendrocytes in the CNS
Figure 1.2. Pol III gene classes based on promoter architecture and associated transcription factors
Figure 1.3. Linear representation of BC200 RNA

Chapter 2
Figure 2.1. cDNA analysis of the splice site variant c.988-1G>A
Figure 2.2. Brain MRI of French Canadian cases with SPG7 variants
Figure 2.3. Genetic and MRI analysis
Figure 2.4. Functional analysis of MPP in patient lymphoblasts
Figure S2.1. SDS-PAGE analysis of whole cell protein extracts

Chapter 3
Figure 3.1. Yearlong study of motor function in Polr3a KI/KI and KI/KO mice
Figure 3.2. Normal myelination in Polr3a KI/KI and KI/KO mice
Figure 3.3. No Purkinje cell loss in Polr3a KI/KI and KI/KO mice
Figure 3.4. Expression levels of Pol III transcripts in the cerebrum and liver of Polr3a KI/KI and KI/KO mice
Figure 3.5. Impact of POLR3A G672E mutation on Pol III function in human cells
Figure S3.1. Generation of Polr3a transgenic mice
Figure S3.2. Representation of the Polr3a KO allele at the mRNA level
Figure S3.3. Weights of mice undergoing phenotypic tests at each time point tested
Figure S3.4. Behavioral data prior to adjustment with weight as covariate
Figure S3.5. Luxol Fast Blue staining of coronal sections of the cerebrum of 365 days old mice
Figure S3.6. Expression levels of Pol III transcripts in the brain and liver of Polr3a KI/KI and KI/KO mice

Chapter 4
Figure 4.1. The POLR3A M852V mutation leads to decreased Pol III chromatin occupancy
Figure 4.2. Characterization of POLR3A mutant clones obtained by CRISPR-Cas9
Figure 4.3. Pol III transcripts with a type 2 promoter are decreased in POLR3A mutants
Figure 4.4. Differential expression analysis shows decreased tRNA levels in mutants
Figure 4.5. Global decrease in tRNA levels in POLR3A mutants
Figure 4.6. Large Pol III transcripts with a type 2 promoter are decreased in POLR3A mutants
Figure 4.7. BC200 RNA levels are decreased in POLR3-HLD patient fibroblasts
Figure 4.8. BC200 KO causes important changes at the proteome level in MO3.13 cells
Figure S4.1. Characterization of POLR3A mutant clones
Figure S4.2. POLR3A M852V mutation in patient-derived fibroblasts
Figure S4.3. Optimization of a small RNA-seq protocol
Figure S4.4. Analysis of small RNA-seq data
Figure S4.5. Expression of large Pol III transcripts
Figure S4.6. Generation of POLR3A-M852V and BC200 KO MO3.13 cell lines

LIST OF TABLES

Chapter 1
Table 1.1. Major forms of ARCA in French Canadians
Table 1.2. Types of hypomyelinating leukodystrophies (HLD) and functions of the mutated genes
Table 1.3. Phenotypes caused by recessive mutations in genes encoding Pol III subunits
Table 1.4. Roles of Pol III transcripts in mammalian cells

Chapter 2
Table 2.1. Clinical features and variants identified in French Canadian SPG7 cases
Table S2.1. Exome sequencing summary data
Table S2.2. Filtering strategy employed to identify the causative variants in SPG7
Table S2.3. Genotypes of SNPs in patients with the p.(Ala510Val) mutation in SPG7

Chapter 3
Table S3.1. Quantification of the high-confidence interactors of the G672E mutant POLR3A against WT POLR3A

Chapter 4
Table S4.1. DESeq2 differential expression analysis of small RNA-seq data in clones C1 to C3 and M1 to M3
Table S4.2. DESeq2 differential expression analysis of small RNA-seq data in clones C3 and M2
Table S4.3. Top 30 differentially expressed genes in rRNA-depleted RNA-seq data from independent HEK293 clones
Table S4.4. Characteristics of the primary patient-derived fibroblasts used in this study
Table S4.5. SILAC analysis comparing the proteomes of POLR3A-M852V to MO3.13-WT and BC200-KO to MO3.13-WT
Table S4.6. Primers and oligonucleotides used in this study

ACKNOWLEDGEMENTS

This thesis was supported by generous scholarships from the Canadian Institutes of Health Research, the Fonds de Recherche du Québec – Santé, the Fondation du Grand Défi Pierre Lavoie and the Jewish General Hospital National Bank Molecular Pathology Scholarship. I would like to thank my supervisor for the past six years, Dr. Bernard Brais, for giving me the opportunity to work in his laboratory and for always believing in my abilities. His enthusiasm for discovery largely contributed to my passion for research. Since day one, he has given me unwavering support in my research and in the development of my career, and provided me with the independence to develop my own ideas. I would also like to express my gratitude to my co-supervisor, Dr. Claudia Kleinman, for taking me on as her first graduate student and showing me the ropes of computational genomics. Over the last four years, she has continuously pushed me to expand my limits. I am very grateful for the countless hours she has spent exploring and discussing data with me. I have learned tremendously from her expertise and knowledge and I hope to one day be a similar scientist and supervisor. None of the work described here would have been possible without the help and advice of numerous colleagues. Special thanks to Marie-Josée, who helped with many experiments and made it possible for me to share my time between two laboratories, and to Martine and Roberta, who took me under their wing from the beginning and were excellent mentors, colleagues and friends. I am grateful to Catherine, Talita, Nicolas, Roxanne, Rebecca G. and Rebecca R. for all the assistance, helpful discussions and many laughs and memories. I would also like to thank the many undergraduate students whom I had the opportunity to supervise over the years, especially Sharon and Elisabeth, who contributed significantly to the work described in this thesis.
I am grateful to the other members of the Rare Neurological Diseases Group for the many useful discussions and for creating a wonderful working atmosphere. Thank you to everyone in the Kleinman lab, especially Nicolas and Steven, for helping me develop my computational skills and patiently answering my frequent questions, and Maud and Selin for the many discussions. I would also like to thank the members of my supervisory committee, Dr. Jacek Majewski and Dr. Benoit Coulombe, for their guidance and encouragement. I am grateful to the many collaborators with whom I interacted over the years, especially Dr. Martin Teichmann, who welcomed me warmly in his laboratory, Diane Forget and Dr. Ian Willis. It was an incredible privilege to share this journey with the friends I made in the Department of Human Genetics. Special thanks to Andréanne, my running partner, Renata and Lundi for always lending an ear to my frustrations and sharing the happy moments. I am also very grateful to Ross MacKay, Rimi Joshi, Dr. Aimee Ryan and Dr. Eric Shoubridge for their wonderful support over the years and for all the opportunities that came with being in the Department of Human Genetics. Last but not least, I would like to thank my family and friends for their constant support. None of this would have been possible without my parents’ continuous encouragement, attentiveness, help and love. They instilled in me the passion for learning, always believed in me and inspired me to follow my dreams, and I would not be who I am today without them. To my brother, grandmothers, aunts, uncles and cousins, thank you for your frequent encouragement. To Catherine, Laura, Johanna and Debbie, thank you for your invaluable friendship over the past twelve years.

I would like to dedicate this thesis to my maternal grandfather, Roland Morin, for whom the pursuit of education was so important and whose hard work and perseverance were true inspirations.

PREFACE

This thesis is written according to the guidelines of the McGill University Graduate and Postdoctoral Studies Office. It is presented in the manuscript-based format for a Doctoral thesis. The studies described herein were performed under the co-supervision of Dr. Bernard Brais and Dr. Claudia Kleinman. This thesis tackles two main subjects that are both related to rare disease genetics in the French Canadian population: i) the identification of causal genetic mutations in patients affected with cerebellar ataxia, and ii) the phenotypic, molecular and transcriptional characterization of Pol III-related leukodystrophy, a disorder with frequent cerebellar ataxia, caused by mutations in the gene POLR3A. This thesis is composed of six chapters. Chapter 1 is a general introduction reviewing the key literature relevant to the thesis. Chapter 2 is composed of two separate manuscripts describing the identification of causal mutations in cerebellar ataxia. The first was published in the European Journal of Human Genetics, while the second is a Letter to the Editor in Brain in response to an article published in this journal. Chapter 3 recounts the generation and characterization of a mouse model of Pol III-related leukodystrophy and was published in Molecular Brain. Chapter 4 is a manuscript in preparation, to be submitted once a few remaining experiments have been performed, and reports the transcriptional and translational consequences of POLR3A mutations in cellular models. Chapter 5 provides a global discussion of the previous chapters, while Chapter 6 presents the general conclusion and future directions.

Contribution of Authors

Chapter 2: The study in Part A was performed with the following colleagues and collaborators: Martine Tétreault, Sharon Yang, Roberta La Piana, Marie-Josée Dicaire, Megan R. Vanstone, Jean Mathieu, Jean-Pierre Bouchard, Marie-France Rioux, Guy A. Rouleau, Kym M. Boycott, Jacek Majewski and Bernard Brais. BB and I conceived the study. I coordinated the research, contributed to WES interpretation, performed RT-PCR experiments, supervised PCR validation and mutational screening of the cohort and analyzed Sanger sequencing data. MT analyzed WES data and participated in study design and data interpretation. SY and MJD carried out PCR validation and mutational screening of the cohort. RLP reviewed clinical and MRI data. MRV, JM, JPB, MFR and GAR provided clinical data. KMB, JM and BB supported the study and contributed to data interpretation. MT, RLP and I prepared the figures and tables. I wrote the manuscript with the help of MT and BB. All authors edited and approved the manuscript.

The study in Part B was performed with the following colleagues and collaborators: Olga Zurita-Rendon, Roberta La Piana, Sharon Yang, Marie-Josée Dicaire, Kym M. Boycott, Jacek Majewski, Eric A. Shoubridge, Bernard Brais and Martine Tétreault. I designed and coordinated the project, participated in the WES interpretation, cultured lymphoblast cell lines and supervised the undergraduate student SY, who performed PCR experiments and analyzed Sanger sequencing data. MJD completed PCR experiments. OZR carried out siRNA knockdown and Western Blot experiments. RLP analyzed clinical and MRI data. KMB, JM, EAS and BB supported the study and contributed to study design and data interpretation. MT analyzed WES data and participated in study design and data interpretation. OZR, RLP and I prepared the figures. MT and I wrote the manuscript. All authors edited and approved the manuscript.

Chapter 3: This study was performed with the following colleagues and collaborators: Sharon Yang, Robyn D. Moir, Diane Forget, Roxanne Larivière, Annie Bouchard, Christian Poitras, Nicolas Sgarioto, Marie-Josée Dicaire, Forough Noohi, Timothy E. Kennedy, Joseph Rochford, Geneviève Bernard, Martin Teichmann, Benoit Coulombe, Ian M. Willis, Claudia L. Kleinman and Bernard Brais. BB and I conceived the study. RDM, RL, BC, IMW, CLK, BB and I designed the experiments. I managed and analyzed behavioral experiments. SY, MJD, NS, RL, FN and I collected and processed tissues, performed genotyping, RT-PCR and PCR for Sanger sequencing. SY, RL, FN and I performed and analyzed histology and Western Blots. RDM and IMW conducted and analyzed Northern Blot experiments. DF and AB performed and analyzed immunofluorescence, affinity purification and ChIP-qPCR experiments. CP analyzed mass spectrometry data. JR participated in the statistical analysis of behavioral experiments. TEK, GB and MT provided advice regarding experimental design and data analysis and contributed to interpretation of results. I wrote the manuscript with CLK and BB. All authors read and approved the final manuscript.

Chapter 4: This study was performed with the following colleagues and collaborators: Diane Forget, Elisabeth Meloche, Marie-Josée Dicaire, Geneviève Bernard, Marc R. Fabian, Martin Teichmann, Benoit Coulombe, Bernard Brais and Claudia L. Kleinman. I conceived the study with the guidance of CLK and BB. I coordinated the research, performed most experiments and analyzed RNA-seq and SILAC data with advice from CLK. MJD and EM assisted with CRISPR-Cas9 experiments, mutation screening and sgRNA cloning. DF and BC conceived and performed affinity purification, immunofluorescence and ChIP experiments. MRF prepared the synthetic RNA spike-ins. GB provided patient-derived fibroblasts. MT contributed to the design of the custom microarray. CLK, BB and BC supported the study and participated in data interpretation. CLK supervised the study. I prepared the figures and wrote the manuscript with guidance from CLK and BB.

Original Contribution to Knowledge

The work described in this thesis includes significant contributions to our understanding of the genetic and molecular basis of rare neurological disorders in French Canadians.

Chapter 2 Part A shows for the first time that SPG7 mutations are an important cause of spastic ataxia in French Canadians by identifying causative mutations in 22 patients, thus providing them with a long-awaited molecular diagnosis. It also shows that the SPG7 mutation c.988-1G>A causes aberrant splicing in patient-derived fibroblasts. This study contributed to the demonstration that SPG7 mutations are one of the most common causes of undiagnosed spastic ataxia worldwide.

Chapter 2 Part B reports the identification of a novel disease-causing mutation in PMPCA, c.766G>A (p.V256M). Since mutations in this gene had only recently been associated with cerebellar ataxia,1 it provided the first independent confirmation of this causal link and showed that intellectual disability is not an obligate feature of PMPCA-related ataxia.

Chapter 3 describes Polr3a G672E knock-in mice, the first animal model with a bi-allelic missense mutation in a Pol III subunit. It shows that these mice do not recapitulate the disease phenotype of human POLR3-HLD cases carrying the POLR3A G672E mutation nor do they display abnormal levels of Pol III transcripts, providing novel evidence that Pol III function may differ significantly between mouse and human.

Chapter 4 describes the generation of cellular models of POLR3-HLD and the extensive transcriptomic characterization of POLR3A mutant cells. It shows for the first time that POLR3-HLD-causing POLR3A mutations have a selective impact on the levels of Pol III transcripts, affecting specifically BC200 RNA, 7SL RNA and nuclear-encoded tRNAs. It also reports the novel findings that BC200 RNA levels are decreased in patient-derived fibroblasts and that absence of this ncRNA causes important changes at the proteome level.

CHAPTER 1: General Introduction

The French Canadian population of Québec was created through a series of successive and fairly recent regional founder effects, resulting in a disproportionate contribution of the first generations of settlers to the gene pool.2 In fact, it is estimated that two thirds of the gene pool of the modern French Canadian population was provided by only approximately 2,600 settlers that came to Nouvelle-France before 1680.2 As new areas of the territory opened for colonization, subsequent migrations and founder effects resulted in variations among the regional gene pools.2,3 These unique demographics have led to the regional clustering of a large number of Mendelian diseases in Québec due to elevated frequencies of certain pathogenic alleles.4 The best example is the Saguenay-Lac-St-Jean (SLSJ) region, which has an unusually high prevalence of multiple Mendelian diseases, such as cystic fibrosis and tyrosinemia type 1.4 Nonetheless, clusters of hereditary diseases are also observed in several other areas of Québec, for instance Bas-St-Laurent, Lanaudière and Southern Québec.2,5 Among those are many neurological disorders, including several forms of cerebellar ataxias and leukodystrophies. The first part of my thesis (Chapter 2) focuses on the genetic basis of recessive cerebellar ataxias in French Canadians. In the first section of this introduction, I will provide background information on recessive cerebellar ataxias, outline the most common forms in Québec and review the contribution of whole exome sequencing (WES) to the genetic diagnosis of ataxias in general. The second part of my thesis (Chapters 3 and 4) concentrates on one particular neurological disorder that affects French Canadians, RNA Polymerase III-related leukodystrophy, and on gaining a better understanding of its underlying pathophysiological mechanisms. 
In the second section of this introduction, I will briefly define leukodystrophies and summarize the current knowledge on the clinical, genetic and pathological characteristics of Pol III-related leukodystrophy. Finally, the third section will review the relevant literature on the function of RNA Polymerase III (Pol III) and its RNA transcripts, as it pertains to how it could be altered and lead to disease.

1.1 Genetics of autosomal recessive cerebellar ataxias

1.1.1 Cerebellar ataxias

Ataxia is characterized by impaired balance and coordination.6,7 Cerebellar ataxias, caused by dysfunction of the cerebellum or its projections, are the most frequent forms of ataxia. Symptoms include gait ataxia, dysarthria, nystagmus, tremor and cognitive dysfunction.6 The cerebellum (Latin for “little brain”) plays an essential role in the coordination of motor function through complex neuronal connections with the cerebral cortex.7,8 After receiving inputs from the motor cortex signaling an intention to move, the cerebellum adjusts discrepancies between the movement programmed and the movement being produced, based on previous experience, and returns information to the cortex via the thalamus.9 The cellular organization of the cerebellum is relatively simple, consisting of an outer cortex, inner white matter and deep cerebellar nuclei located within the white matter.8,10 Purkinje neurons are the main efferent elements of the cerebellum, integrating inputs from other regions of the central nervous system (CNS) and projecting to the cerebellar nuclei, thus returning information to the cerebral cortex via the thalamus.8,10,11 Cerebellar ataxia results from the dysfunction of the cerebellum or its related neuronal circuitry. Cerebellar atrophy is a common pathological feature of ataxia and results from the degeneration and death of cerebellar cells, predominantly the large efferent Purkinje cells.12

Cerebellar ataxia can be acquired or inherited. Acquired forms of cerebellar ataxia often present with acute symptoms and are caused by a variety of factors, including exposure to toxins, vitamin deficiency, infections and trauma.13,14 In contrast, acquired subacute or chronic cerebellar ataxia can be due to tumors, Creutzfeldt-Jakob disease, immune-mediated ataxias or multiple sclerosis.14-16 Once the above causes have been ruled out, the presence of constant and progressive cerebellar ataxia suggests an inherited neurodegenerative disorder.14

Inherited cerebellar ataxias are a clinically and genetically heterogeneous group of rare disorders characterized by progressive cerebellar degeneration due to genetic mutations.6,17 There are over 40 different genes associated with autosomal dominant forms, also referred to as spinocerebellar ataxias (SCA).6 Autosomal recessive (AR) cerebellar ataxias (ARCA) are similarly heterogeneous, with mutations identified in over 60 genes, and often present with more complex phenotypes and an earlier age of onset.18 X-linked and mitochondrial inheritances are observed in a smaller number of forms, such as Fragile X-associated tremor-ataxia syndrome and POLG-associated ataxia, respectively.6,14 Additionally, ataxia is a prominent symptom in many other neurodegenerative disorders, including complex hereditary spastic paraplegias (HSP), polyneuropathies and leukodystrophies.18,19

1.1.2 Autosomal recessive cerebellar ataxias (ARCA)

ARCAs are complex neurodegenerative disorders that generally present in children and young adults.14 In addition to cerebellar ataxia, other important neurological symptoms observed in ARCAs include spasticity, peripheral neuropathy, movement disorders, dystonia, oculomotor abnormalities, pyramidal tract dysfunction, hyperreflexia, retinopathy, hypogonadism, mental retardation, cognitive impairment and epilepsy.14,18 Most forms of ARCA display considerable clinical heterogeneity with regard to age of onset, disease progression and severity.14 In addition, several forms are characterized by a similar phenotype, making the differential diagnosis challenging.14 In a systematic review of 16 studies published between 1983 and 2013, the prevalence of ARCA was estimated to be 3.3/100,000 (95% CI: 1.8–4.9).20 However, due to the phenotypic heterogeneity and the overlap with other disorders, it is difficult to pinpoint the exact number of disorders belonging to the ARCA category. To distinguish primary ARCAs from other disorders where ataxia is an associated feature, a recent systematic review proposed a new classification containing 45 “true” ARCAs, 20 “other complex movement or multisystem recessive disorders that have prominent ataxia” and nine additional recessive disorders where ataxia is a secondary feature.18

1.1.3 ARCAs in the French Canadian population

Among the hereditary disorders more frequent in French Canadians are several major forms of ARCAs, which are briefly reviewed below (Table 1.1). While none of these disorders are specific to the French Canadian population, each is characterized by at least one founder mutation that has led to a higher prevalence in Québec.

First, Autosomal Recessive Spastic Ataxia of Charlevoix-Saguenay (ARSACS) is the most frequent type of ARCA in the French Canadian population, with over 300 affected cases living in Québec, and is caused by recessive mutations in the gene SACS.21 ARSACS was first described in a cohort of patients from the Charlevoix and SLSJ regions, where two founder mutations in the SACS gene (c.8844delT and c.7504C>T) account for the majority of carrier chromosomes.22 In fact, the carrier rate for the first mutation in SLSJ is 1/22, and it has never been found in other populations.21 Nevertheless, over 100 SACS mutations have been uncovered in cases from multiple countries, suggesting that ARSACS is also a frequent form of ARCA worldwide.21

Friedreich Ataxia (FRDA) is the second most frequent recessive ataxia in Québec and the most common type of inherited recessive ataxia worldwide.23 FRDA is caused by a GAA triplet repeat expansion in the first intron of the gene FXN.24 FRDA cases are found across Québec, including a cluster of eight families near Rimouski.25 Using genealogical information from 14 pedigrees, André Barbeau et al. originally suggested that a shared ancestral couple introduced the FRDA allele in the French Canadian population upon its arrival in Nouvelle-France in 1634,26 but this has not been confirmed by more recent and reliable genealogical methods.

Autosomal recessive cerebellar ataxia 1 (ARCA1), also known as recessive ataxia of Beauce, was originally described in French Canadian patients affected with a late-onset, slowly progressive and relatively pure cerebellar ataxia.27,28 Through linkage analysis and candidate gene sequencing, seven different loss-of-function mutations in SYNE1 were uncovered in 30 families originating from the Beauce or Bas-St-Laurent regions, making ARCA1 the third most frequent form of inherited ataxia in Québec.27 Despite the clear allelic heterogeneity, one mutation was found in the homozygous state in about one-third of cases.29,30 Although only a handful of cases from other ethnicities were initially identified, more recent series using next-generation sequencing have shown that it is in fact one of the most common ARCAs worldwide and is also often associated with spasticity.30-33 For example, screening of a large cohort of non-French Canadian ARCA patients led to the discovery of 23 index patients carrying homozygous or compound heterozygous mutations in SYNE1 (23/434, 5.3%), showing that ARCA1 is also a common form of recessive ataxia outside Québec.34 In fact, the high percentage of ARCA1 and ARSACS cases in other ethnicities indicates that there are no unique French Canadian ataxias, but rather some that are more prevalent and clinically more homogeneous in this population because of the over-representation of more common founder mutations.

Another example of an ataxia that is more frequent in a specific region is Ataxia oculomotor apraxia 2 (AOA2), which is caused by recessive mutations in the gene SETX, encoding a protein called senataxin.35 Mutations in SETX were first identified in 15 AOA2 families of various ethnicities, including one from Canada.35 Subsequently, the Brais laboratory uncovered SETX mutations in a cluster of 10 French Canadian families affected with a similar phenotype of ataxia and distal amyotrophy, but without the characteristic oculomotor apraxia.36 They identified a novel founder mutation, c.5927T>G (p.Leu1976Arg), which was homozygous in most families.36 All families originated from northeastern Québec, New Brunswick or Gaspésie,36 suggesting that this founder mutation could be of Acadian origin.37

Tremor Ataxia with Central Hypomyelination (TACH), now referred to as RNA Polymerase III-related leukodystrophy, was first identified in a cluster of patients in the Beauce-Bellechasse region, near Québec City.38 A founder mutation in the gene POLR3A, c.2015G>A (p.G672E), was identified in all but one TACH patient, who carried distinct compound heterozygous mutations in POLR3A.39 In addition to ataxia and cerebellar atrophy, all patients displayed hypomyelination on magnetic resonance imaging (MRI), prompting the radiological diagnosis of a hypomyelinating leukodystrophy.38 These cases are now seen as part of the spectrum of Pol III-related leukodystrophy, which is the focus of Chapters 3 and 4 of this thesis.

Finally, a putative novel form of ARCA, Autosomal Recessive Spastic Ataxia with Leukoencephalopathy (ARSAL), was first described in 2006 in a cohort of 23 French Canadian individuals, of which 59% had a genealogical relationship to the county of Portneuf, near Québec City.40 An important inter- and intra-familial variability was observed at the clinical and radiological levels.40 A subsequent study published in 2012 suggested that three complex rearrangements in the gene MARS2 were causative of ARSAL in 54 cases.41 However, these results have not been reproduced in additional patients and the genetic evidence was later deemed insufficient to conclude that MARS2 genomic rearrangements are truly causative of ARSAL.18

In summary, the last twenty years have been extremely fruitful for the genetics of ARCAs in Québec, with the identification of the genetic basis of the major forms observed in the province. Nonetheless, at the beginning of this thesis project, an estimated 30% of French Canadian patients affected with presumed recessive cerebellar ataxia remained unresolved.
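As a rough worked illustration of what the 1/22 carrier rate quoted above for the SACS c.8844delT mutation in SLSJ would imply, the sketch below converts a carrier rate into an expected frequency of homozygotes. It assumes Hardy-Weinberg proportions (random mating, no selection), which a founder population with regional substructure may not satisfy, and it counts only homozygotes for this single mutation, not compound heterozygotes carrying the second founder mutation. The function names are illustrative and do not come from the cited studies.

```python
from fractions import Fraction
from math import sqrt


def allele_frequency_from_carrier_rate(carrier_rate: float) -> float:
    """Solve 2q(1 - q) = carrier_rate for the smaller root q (Hardy-Weinberg assumption)."""
    return (1 - sqrt(1 - 2 * carrier_rate)) / 2


def approx_homozygote_frequency(carrier_rate: Fraction) -> Fraction:
    """Rare-allele shortcut: q is roughly carrier_rate / 2, so q^2 is roughly carrier_rate^2 / 4."""
    return (carrier_rate / 2) ** 2


carrier_rate = Fraction(1, 22)  # carrier rate quoted in the text for SLSJ

# Shortcut estimate, kept as an exact fraction
approx = approx_homozygote_frequency(carrier_rate)

# Estimate without the rare-allele shortcut (still assumes Hardy-Weinberg)
q_exact = allele_frequency_from_carrier_rate(float(carrier_rate))
exact = q_exact ** 2

print(f"shortcut estimate: 1 in {approx.denominator}")
print(f"without shortcut: about 1 in {round(1 / exact)}")
```

Under these assumptions, both estimates fall in the range of roughly one homozygote per two thousand individuals. This is an order-of-magnitude figure only, not a measured prevalence.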

1.1.4 Next-generation sequencing in the genetic diagnosis of ARCAs

Starting in the mid-1980s, patients affected with rare genetic diseases were screened for the most probable genetic defects according to their clinical presentation, generally one gene at a time. If this strategy did not reveal causative mutations, novel gene discovery was attempted through a combination of linkage analysis or homozygosity mapping, positional cloning and Sanger sequencing of candidate genes.42 In 2009, the introduction of next-generation sequencing (NGS) significantly accelerated the discovery of novel disease-gene associations for Mendelian diseases, mostly through WES.42 In fact, the average number of new genes underlying Mendelian disorders climbed from 166 per year between 2005 and 2009 to 236 per year between 2010 and 2014.43 Moreover, WES and whole genome sequencing (WGS) have surpassed conventional approaches by about three-fold in the number of novel gene discoveries and have significantly expanded the spectrum of phenotypes associated with mutations in previously known disease-causing genes.42

As with other disorders, the advent of NGS has greatly improved the genetic diagnosis of cerebellar ataxias, both in research and clinical settings. Since 2012, there have been at least 20 novel genes associated with cerebellar ataxia (12 recessive forms) and more than 12 known disease-causing genes with phenotypic expansion to include cerebellar ataxia.17,44-47 Furthermore, considering the genetic heterogeneity of cerebellar ataxias, WES allows the simultaneous sequencing of all genes associated with ataxia, which is significantly faster and more economical than traditional “one gene at a time” investigations. For example, WES performed in a cohort of 35 undiagnosed ataxia cases uncovered the likely genetic cause in 64%, all in genes previously associated with ataxia or other neurological disorders.48 In Canada, the Finding Of Rare disease GEnes Consortium reported a diagnostic yield of 46% for paediatric-onset ataxias (42 cases), including the identification of two novel genes.49 In a review of seven WES cohort studies, the definite diagnostic yield for cerebellar ataxias varied between 20 and 46%, but the average increased to 53% if variants of uncertain significance were included.50 In fact, another advantage of WES is that the data can be revisited as new disease-causing genes or functional data establishing the pathogenicity of mutations are published. Nonetheless, even after several years of WES applications, these studies demonstrate that at least 50% of cerebellar ataxia patients remain without a definite molecular diagnosis in large multiethnic cohorts, highlighting the need for continued efforts in searching for novel mutations and causal genes.

Obtaining a precise molecular diagnosis has a major positive impact on patient care, including a more accurate prognosis, informed care, reproductive counselling and access to services.42 In addition, when a therapy does exist, it is often most effective before the disease has progressed too far.14 Nevertheless, specific pharmacological treatments are currently available for only 6% of rare diseases.51 In ARCAs, a few examples include ataxia with vitamin E deficiency, Niemann-Pick type C disease and coenzyme Q10 deficiency.14 In order to improve the dismal availability of treatments, it is crucial to better understand the pathophysiological mechanisms underlying ARCAs and other rare diseases, in particular the link between each of the causative genes and their associated disease phenotype. Because of this, a major part of my PhD focused on one specific disorder, Pol III-related leukodystrophy (POLR3-HLD), and on trying to uncover the link between the mutated genes and the major phenotypes observed in patients. POLR3-HLD groups together several recessive disorders that display varying degrees of cerebellar ataxia. However, it is generally considered to be a leukodystrophy rather than an ARCA because of the frequent white matter abnormalities present on MRI. In the next section, I will briefly review the literature pertaining to myelination and leukodystrophies, and then delve more deeply into the current knowledge on POLR3-HLD.

1.2 Pol III-related leukodystrophies

1.2.1 Myelination in the central nervous system

Neurons are the functional unit of the nervous system, responsible for the transmission of information.9 They consist of three main parts: the cell body, also called soma, the axon and the dendrites (Fig. 1.1). Each neuron possesses an axon that conducts electrical signals from the cell body, and multiple dendrites responsible for receiving information from other axons and relaying it to the cell body.9 Myelin sheets consist of lipid-rich membranes that wrap around axons, forming an insulating sheath to promote the propagation of electrical signals52 (Fig. 1.1). In contrast, neuronal cell bodies and dendrites are not myelinated. In the central nervous system (CNS), oligodendrocytes are the cells responsible for myelination. Myelinated axons, their adjoining oligodendrocytes and other glial cells constitute the white matter, which makes up half of the human brain.53 The other 50% consists of the grey matter, which is composed of neuronal cell bodies, dendrites and axon terminals.9

Although myelin was originally thought to be a fatty substance secreted into the extracellular space, we now know that the myelin sheath is actually a continuous extension of the oligodendroglial plasma membrane (Fig. 1.1).54 Myelinating oligodendrocytes wrap tightly around axons to form a highly compact and multilayered stack of thick membranes with high electrical resistance, occasionally interrupted by unmyelinated regions called nodes of Ranvier (Fig. 1.1).54 This particular structure allows for the very rapid jumping-like propagation of action potentials along axons, termed saltatory conduction.54 In addition, myelin is essential for the integrity and long-term survival of axons.55 Compact myelin has a particular molecular composition, with only 40% water, 70-80% lipids by dry weight and a handful of proteins.54 Myelin Basic Protein (MBP) and Proteolipid Protein (PLP) are the two major protein constituents of myelin.54

During CNS development, oligodendrocyte progenitor cells (OPC) are generated in sequential waves from distinct germinal regions.56 They originate from multipotent neural progenitor cells and their fate is determined in large part by the expression of the transcription factor OLIG2.56 They are also characterized by the co-expression of markers NG2 and PDGFRα.57 OPCs proliferate and migrate throughout the brain until they have reached their final destinations in the future white matter tracts, where they begin to differentiate into post-mitotic pre-myelinating oligodendrocytes, and then myelin-forming oligodendrocytes.56 As oligodendrocytes mature, they start to express the two major protein components of the myelin sheath, MBP and PLP.58 Finally, oligodendrocytes begin to myelinate axons.57 In the human brain, myelination begins around 30 weeks of gestation and continues at a rapid pace during the first year of life, then proceeds more slowly throughout the rest of life.59 In mice, myelination begins after birth and is mostly completed by two months of age.56 Interestingly, pre-myelinating oligodendrocytes are maintained longer in the human brain, whereas OPC terminal differentiation and myelination happen almost coincidentally in mice.57

Myelination is regulated by various extracellular factors and signals originating from other cell types, although the initial signal to start myelination has yet to be identified in the CNS. Astrocytes are a major source of these regulatory signals and therefore play an important role in OPC survival and migration, oligodendrocyte differentiation and myelination.57 Reciprocal interactions between oligodendrocytes and neurons are also crucial for the integrity of both cell types. In addition to the essential role of the myelin sheath in the conduction of action potentials, oligodendrocytes offer important metabolic support (e.g. lactate) to axons.60,61 On the other hand, signals from neurons to oligodendrocytes are involved in all stages of myelination.60 For instance, in mice, neurons express Jagged1, a ligand of the Notch1 receptor present on OPCs, to inhibit differentiation into oligodendrocytes until OPCs are in the correct spatio-temporal context for myelination.62,63 Another notable example is that the activation of the PIP3-AKT-mTOR pathway in neurons is sufficient to trigger the recruitment of OPCs and the entire myelination program,64 in which mTORC1 is an important player.65 Furthermore, myelination is dependent on neuronal electrical activity, where the release of glutamate by electrically active axons induces their preferential myelination by oligodendrocytes.66,67 Thus, although oligodendrocytes are responsible for the actual production of myelin, neurons are essential to ensure that myelination occurs in a timely and proper manner, with the additional support of other cell types such as astrocytes.

1.2.2 Leukodystrophies

Leukodystrophies are a heterogeneous group of genetically determined disorders that primarily affect myelin in the CNS.68 The term leukodystrophy stems from the Greek words leuko (white), dys (lack of) and trophy (growth).68 Although ataxia and spasticity are part of the clinical picture in many subtypes, leukodystrophies have traditionally been considered a distinct entity from cerebellar ataxias, often diagnosed based on radiological findings rather than clinical symptoms.68 In fact, the advent of MRI in the early 1980s was a major step forward in the diagnosis of leukodystrophies, since it allowed patients to be classified based on specific patterns of MRI abnormalities.68 Leukodystrophies are generally classified into three groups: hypomyelinating leukodystrophies (HLD) are characterized by lack of myelin deposition; demyelinating leukodystrophies are defined by loss of previously deposited myelin; and dysmyelinating leukodystrophies have structurally or biochemically abnormal myelin.57 In this thesis, I will focus on HLDs, and more specifically on POLR3-HLD.

In recent years, the traditional view of leukodystrophies as disorders that primarily affect myelin through oligodendrocyte dysfunction has been challenged by the discovery of mutations in several genes involved in housekeeping processes or in the function of other cell types.57 For instance, many demyelinating leukodystrophies are now considered to be “astrocytopathies” because of the major involvement of astrocytes in their pathogenesis (e.g. Vanishing White Matter).57 Genes mutated in HLDs, listed in Table 1.2, do include a few that encode essential oligodendrocyte or myelin proteins, such as PLP1 and GJC2.68 Two other HLD genes, SOX10 and NKX6-2, code for transcription factors that regulate the expression of oligodendroglial proteins.47,69,70 However, none of the other 14 HLD genes are directly involved in myelination or oligodendrocyte biology. Instead, they play roles in ubiquitous processes, and in most cases, the link between the function of the gene and the observed phenotype is obscure.68,71 Pathways affected in these HLDs include mRNA , protein folding, phospholipid synthesis, endolysosomal trafficking and cytoskeleton (Table 1.2).71 Since most of the genes causing HLDs were identified in the last decade, information about their pathophysiological mechanisms remains scarce. Even in Pelizaeus-Merzbacher disease (PMD), the prototypical HLD for which the causative gene PLP1 was identified three decades ago, the pathological processes by which PLP1 mutations cause PMD remain unclear.72

Understanding the functional basis of HLDs is further complicated by the challenges associated with developing appropriate disease models. Primary human oligodendrocytes are difficult to obtain because of the rarity and obvious danger of brain biopsies;72 they are also laborious to maintain in culture. In addition, since HLDs are developmental disorders, the relevant stages of pathogenesis are likely to occur prior to diagnosis.72 Thus, access to brain tissue from deceased patients is of limited usefulness, as it more closely reflects end-stage disease as opposed to initial pathogenesis. In PMD, some disease processes have been modeled in oligodendroglial immortalized cell lines, such as impaired PLP1 trafficking and endoplasmic reticulum (ER) stress.73 However, although they express oligodendrocyte markers, these cell lines do not have the ability to produce myelin sheaths in vitro and thus cannot truly recapitulate the in vivo function of oligodendrocytes.74 Primary mouse or rat oligodendrocytes can produce myelin when co-cultured with neurons,75 but require the generation of transgenic animals carrying defects in the gene of interest. Several existing PMD mouse models display myelin abnormalities76 and have led to insights into possible pathogenic mechanisms, such as oxidative stress, mitochondrial dysfunction77 and perturbed axonal transport.78 However, relevant models for most HLDs are still lacking.

Furthermore, in many HLDs, it is unclear whether the lack of myelin deposition results from a primary oligodendrocyte dysfunction or whether it is secondary to impaired axonal function since axonal signals are crucial for myelination. Of course, these two options are not mutually exclusive. A recent study showed that the TUBB4A mutation that causes Hypomyelination with atrophy of the basal ganglia and cerebellum alters the morphology of both mouse cerebellar granule neurons and oligodendrocytes when they are cultured separately, as well as myelin in the oligodendrocyte culture.79 In addition, the cell type involvement may differ for mutations in the same gene, since another TUBB4A mutation that causes isolated hypomyelination selectively impaired myelin gene expression and oligodendrocyte morphology.79


1.2.3 Pol III-related leukodystrophy: a new disease entity

In 2010, our laboratory identified a cluster of French Canadian patients with a potentially new disease, which we referred to at the time as Tremor Ataxia with Central Hypomyelination (TACH), since it did not have all the features of 4H leukodystrophy (Hypomyelination, Hypodontia and Hypogonadotropic hypogonadism), a previously described HLD with very similar clinical findings. All seven TACH cases presented with motor regression, cerebellar ataxia, spasticity and tremor.38 On MRI, all patients demonstrated diffuse hypomyelination with relative sparing of optic radiations and thinning of the corpus callosum. Cerebellar atrophy was observed in the most severe cases.38 Linkage analysis and homozygosity mapping in three consanguineous families affected with TACH made it possible to define a 4.3 Mb candidate locus on 10q22.3-23.1.38 Interestingly, an overlapping clinical and radiological phenotype, Leukodystrophy with Oligodontia (LO), was described in a large Syrian inbred pedigree and also mapped to 10q22.80 The combination of the two intervals further reduced the candidate locus to a 2.99 Mb region containing 15 genes, of which seven genes were selected for sequencing of exons and intron-exon junctions.39 This led to the identification of two distinct homozygous mutations in the gene POLR3A in the French Canadian and Syrian pedigrees (c.2015G>A (p.Gly672Glu) and c.2003+18G>A (p.Tyr637CysfsX650), respectively), as well as compound heterozygous mutations in three other families.39 Mutations were also found in five individuals affected with 4H syndrome, a disorder characterized by an important clinical overlap with TACH and LO.39 POLR3A encodes the largest subunit of Pol III and forms the catalytic center of this enzyme with POLR3B, the second largest subunit.

We subsequently uncovered compound heterozygous mutations in POLR3B in three cases with 4H syndrome in which no POLR3A mutations had been found.81 In the following years, other groups, including ours, described more than 100 mutations in POLR3A and POLR3B in over 130 patients from various ethnicities.39,81-89 Furthermore, we contributed to the identification of mutations in POLR1C, encoding a third Pol III subunit, in a small group of patients with the cardinal features of POLR3-HLD but that did not carry POLR3A or POLR3B mutations.90 The different disorders caused by mutations in Pol III subunits, summarized in Table 1.3, were grouped into a new disease entity called Pol III-related hypomyelinating leukodystrophies (POLR3-HLD).

32 1.2.4 Expansion of the phenotypic spectrum associated with POLR3A and POLR3B mutations In 2014, a cross-sectional observational study of 105 POLR3-HLD mutation-proven cases described the clinical spectrum of the disorder, confirming the existence of an important phenotypic and genetic heterogeneity.89 With the exception of French Canadian cases, most cases were compound heterozygotes. One recurrent mutation in POLR3B, c.1568T>A (p.Val523Glu), was present in more than 80% of POLR3B cases, while the remainder of mutations were private or observed in a small proportion of cases. No patient carried two null mutations. At the phenotypic level, ataxia was present in all patients and was the main cause of gait limitation. Other neurological signs included intention tremor, dysmetria, gaze-evoked nystagmus, pyramidal signs and dystonia, but they were more variable across the cohort. Interestingly, faster deterioration was observed following infections in about half of the cases. Disease severity was also highly variable, ranging from children that never achieved independent walking and had intellectual disability, to patients presenting later in childhood with learning difficulties and motor clumsiness. The most frequent extra-neurological signs were dental abnormalities (87%), delayed puberty (76% of patients old enough to be assessed) and myopia (87%).89 Characteristic MRI findings included hypomyelination with relative preservation of the optic radiations, ventrolateral thalamus and dentate nucleus. Supratentorial atrophy and thinning of the corpus callosum were observed in most adult cases, while cerebellar atrophy was present in all but nine patients.89 At the time, experts thought that the triad of hypomyelination, cerebellar atrophy and variable extra-neurological involvement (abnormal dentition and/or puberty) were the key features of POLR3-HLD. 
From a pathological viewpoint, the co-existence of cerebral white matter abnormalities and cerebellar atrophy suggested a disease where oligodendrocytes and/or neurons contributed to the pathophysiology.89 In following years, the phenotypic spectrum of Pol III-related disorders continued to expand (Table 1.3). Indeed, a handful of patients were described with cerebellar atrophy only, suggesting that hypomyelination is not an obligate feature of the disease.91 This was substantiated by the identification of POLR3A mutations in approximately three percent of cases in a large cohort of sporadic and recessive spastic ataxia.92 Interestingly, over 80% of cases carried the same deep intronic mutation in POLR3A, which was found to activate a cryptic splice site in a tissue-specific manner, with mature differentiated neuroepithelial cells showing higher

33 use of this site compared to induced pluripotent stem cells (iPSC).92 Importantly, carriers of this mutation did not display the typical hypomyelination pattern of POLR3-HLD, but instead had abnormal white matter signal along the superior cerebellar peduncles.92 The phenotypic spectrum was further extended when Azmanov et al. reported a homozygous splice site mutation in POLR3A in patients of Roma ethnicity.93 This mutation causes exon skipping accompanied by partial deficiency of the full-length wild-type transcript. Strikingly, the patients had no white matter or cerebellar involvement on MRI, but rather presented with involvement of the striatum and red nuclei. Although ataxia was a prominent clinical sign, it was hypothesized to result from dorsal midbrain involvement as opposed to cerebellar dysfunction.93 On the milder end of the spectrum, variants in POLR3B were identified in individuals with isolated hypogonadotropic hypogonadism without neurological or dental abnormalities.94 Curiously, bi-allelic truncating variants in POLR3A were recently found in a single case of neonatal progeroid syndrome, implying an even broader phenotype, although replication of this observation in additional cases is necessary to conclude that POLR3A is indeed the causative gene.95 Thus, mutations in genes encoding Pol III subunits can lead to a wide spectrum of clinical presentations, although no clear genotype-phenotype correlation has been established yet. The absence of white matter abnormalities in some patients strengthens the idea that neuronal dysfunction plays a strong role in the disease pathogenesis. Histological data supports the hypothesis that POLR3-HLD is a “leuko-axonopathy”, with a primary involvement of both neurons and oligodendrocytes in the disease pathogenesis.57 Indeed, autopsy results from the brains of two POLR3-HLD cases showed important oligodendrocyte loss and myelin rarefaction, suggesting oligodendroglial dysfunction. 
In addition, axonal damage was present both in areas of myelin loss and regions of preserved myelin, especially in the cerebellum.89,96 This indicates that axonal degeneration is not only occurring as a secondary effect to the lack of myelin deposition, but likely also because of an intrinsic neuronal dysfunction. Furthermore, the recent studies showing that hypomyelination is no longer a unifying feature of the disorder, whereas neuronal degeneration is always present, support this hypothesis.91-93 It is likely that both oligodendrocytes and neurons, whose axonal integrity is crucial for myelination, play a role in the primary pathology of the disorder.89 Furthermore, a possible genotype-phenotype correlation is emerging, with certain intronic mutations associated with an absence of white matter involvement, suggesting that some residual
activity of the wild-type enzyme might better preserve oligodendrocyte function.92,93 Nonetheless, these results do not provide a clear pathophysiological mechanism. Therefore, further insight into the role of Pol III, particularly in the CNS, is necessary to better understand the pathophysiology of POLR3-HLD and the impact on individual cell types. In the last section of this introduction, I will thus review the existing knowledge on Pol III and its transcripts.

1.3 Function and regulation of RNA Polymerase III and its transcripts

1.3.1 Pol III transcription machinery and associated proteins

All eukaryotes possess three essential nuclear RNA polymerases that are responsible for the transcription of DNA into RNA. Pol I transcribes the ribosomal RNA (rRNA) 45S precursor gene. Pol II synthesizes messenger RNAs (mRNA), microRNAs (miRNA) and other non-coding RNAs (ncRNA). Finally, Pol III generates a diverse pool of essential small ncRNAs. The three RNA polymerases are highly conserved from yeast to human and are composed of multiple subunits.97 Pol III is the largest of the three, with 17 subunits, while Pol I and Pol II possess 14 and 12, respectively.98 The architecture of the catalytic core, composed of 10 subunits, is conserved among the three enzymes.97 Of the 17 Pol III subunits, five are shared by the three polymerases, two are shared with Pol I only and ten are specific to Pol III. The latter group includes POLR3A and POLR3B, which form the active center of the enzyme.97 POLR1C is a subunit of Pol I and Pol III and forms a heterodimer with POLR1D. Their yeast homologues, AC40 and AC19, are closely related to Pol II subunits RPB3 (POLR2C) and RPB11 (POLR2J), which serve as a platform for enzyme core assembly.99 Recent data suggests that biogenesis of all three RNA polymerases occurs in the cytoplasm and requires the co-chaperone RPAP3 and the chaperone HSP90.100-103 Once fully assembled, the complexes are imported into the nucleus with the help of RPAP2 and RPAP4/GPN1,100,101,104 where the basal transcription machinery specific to each polymerase assists in initiating transcription.

Pol III-transcribed genes are divided into three types, based on sequence-specific promoter elements that are bound by different transcription factors (Fig. 1.2 and Table 1.4).105 Type 1 and type 2 genes (Table 1.4) contain internal promoter elements that are recognized by TFIIIA and TFIIIC (Type 1) or TFIIIC only (Type 2).
In contrast, type 3 genes have upstream promoter elements, including a TATA element, a proximal sequence element (PSE) that is bound by SNAPc, and a distal sequence element (DSE) recognized by transcriptional activators
ZNF143 and POU2F1.105 TFIIIA, TFIIIC and SNAPc carry out the common role of guiding the transcription factor TFIIIB onto upstream DNA. Next, TFIIIB recruits Pol III and directs its transcription initiation. TFIIIB is composed of three subunits: TBP, BDP1 and either BRF1 (type 1 and 2 genes) or BRF2 (type 3 genes), both of which interact with initiation-specific Pol III subunits.105 Interestingly, mutations in BRF1 cause a disorder that includes significant clinical overlap with POLR3-HLD, namely cerebellar involvement and dental anomalies.106 BRF1 mutations lead to decreased Pol III occupancy at target genes in yeast and reduced Pol III transcription in vitro.106 There are no known Pol III elongation factors, presumably because of the short size of its transcripts (70-350 nucleotides).107 Since most Pol III-transcribed genes are expressed at very high levels, Pol III performs rapid cycles of transcription initiation, termination and reinitiation. Indeed, in eukaryotic cells, Pol III achieves more transcription initiations than Pol I or Pol II, and most of these occur through reinitiation.108 Pol III termination occurs at stretches of four or more T nucleotides, and the Pol III transcription initiation complex is then recycled for many rounds of transcription with the help of TFIIIB, which forms very stable complexes with DNA.108
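The termination rule described above (a run of four or more T nucleotides) is simple enough to illustrate computationally. The following is a minimal sketch only, assuming the sequence is given as the non-template DNA strand; the function name `find_t_stretches` and the example sequence are hypothetical and chosen for illustration, and real terminator usage also depends on sequence context that this sketch ignores.

```python
import re
from typing import List, Tuple


def find_t_stretches(seq: str, min_len: int = 4) -> List[Tuple[int, int]]:
    """Return 0-based, half-open (start, end) coordinates of every maximal
    run of at least `min_len` consecutive T nucleotides in `seq`."""
    # T{n,} is greedy, so each match covers a whole uninterrupted run of T's.
    pattern = re.compile("T{%d,}" % min_len)
    return [(m.start(), m.end()) for m in pattern.finditer(seq.upper())]


# Hypothetical toy sequence, not a real Pol III-transcribed gene.
example = "GCGTTAGCTTTTTGCAATTTTCG"
for start, end in find_t_stretches(example):
    print(start, end, example[start:end])
```

In this toy sequence, the two-nucleotide TT run is ignored, while the runs of five and four T nucleotides are reported as candidate termination sites.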

1.3.2 Functions of Pol III transcripts

Pol III synthesizes various small ncRNAs with diverse but important functions (Table 1.4). Classical Pol III transcripts include transfer RNAs (tRNA), 5S rRNA, U6 small nuclear RNA (snRNA), 7SK snRNA, 7SL RNA, RNase P RNA, RNase MRP RNA, Y RNAs and vault RNAs. A significant number of them play a role in transcription regulation, RNA maturation or translation (Table 1.4). In addition to their classical and generally housekeeping functions, new roles have started to emerge for some Pol III transcripts.109 In this section, I will briefly review the roles of Pol III transcripts in mammalian cells.

Nuclear-encoded tRNAs

Nuclear-encoded tRNAs are the largest group of Pol III transcripts and act as essential adaptors during protein synthesis.110 Charged tRNAs transfer their cognate amino acids to the nascent peptide chain by progressing through the three tRNA binding sites on the 80S ribosome.111 Furthermore, tRNA fragments can also function as signaling molecules during stress responses.112 Following heat shock, arsenite treatment or UV irradiation, tRNAs are
cleaved in their anticodon loop by angiogenin.113 The resulting 5’ tRNA halves interfere with translation initiation by displacing eIF4G and eIF4F and promote the formation of stress granules.112,114,115 Importantly, this does not significantly affect the levels of mature full-length tRNAs.112 Angiogenin can also cleave the 3’ CCA termini of tRNAs in response to severe oxidative stress, which leads to a global repression of translation that can be reversed through repair by the CCA-adding enzyme.116

In 2007, microarray profiling of cytoplasmic tRNAs in eight tissues showed that the brain has the second-highest overall tRNA expression and that tRNAs cognate for charged and polar amino acids are particularly increased in the brain compared to other tissues.117 This was one of the first lines of evidence suggesting that tRNAs may be of particular importance in the CNS. Since then, mutations in genes related to tRNA biology and/or mRNA translation have been identified in a growing number of neurological disorders, including POLR3-HLD, leading to the hypothesis that neurons and oligodendrocytes are particularly vulnerable to impaired tRNA function and/or deregulated mRNA translation.111 Mutations in genes involved in tRNA splicing, CCA trinucleotide addition or tRNA post-transcriptional modifications cause various neurological disorders that frequently involve cerebellar and/or cortical atrophy.118-123 On the other hand, mutations in four aminoacyl-tRNA synthetases (AARS, RARS, DARS, EPRS), responsible for charging mature tRNAs with their corresponding amino acids, and one of their auxiliary proteins (AIMP1), cause hypomyelinating disorders.124-130 In addition, mutations in any of the genes encoding the five subunits of the translation initiation factor eIF2B cause Vanishing White Matter (VWM) disease, a type of demyelinating leukodystrophy characterized by rapid deterioration after fever or brain injuries.111

Other Pol III transcripts

As an essential component of the large ribosomal subunit, 5S rRNA is a key player in translation.109 However, 5S rRNA was also found to stabilize Mdmx, an inhibitor of p53, by blocking its ubiquitination. Interestingly, knockdown of 5S rRNA induces Mdmx degradation without affecting formation of the large ribosomal subunit,131 suggesting that this additional function of 5S rRNA is more susceptible to perturbations in its expression level.

7SK snRNA, encoded by RN7SK, represses Pol II transcription by sequestering the positive transcription elongation factor b (P-TEFb), which regulates the transition of Pol II from
promoter-proximal pausing to active elongation.132 The functionally active core 7SK snRNP consists of 7SK snRNA, the capping enzyme MePCE and the La-related protein 7 (Larp7). The majority of 7SK snRNP forms a larger complex with HEXIM1/2 and P-TEFb, which is disassembled upon increased transcriptional demand.132 Nonetheless, recent data suggests that 7SK snRNP can also regulate Pol II transcription in a P-TEFb-independent manner. Indeed, 7SK snRNP interacts with the Little Elongation Complex to promote Pol II transcription of snRNA and snoRNA genes.133 Moreover, 7SK snRNA was found to repress transcription of enhancer RNAs by recruiting the BAF chromatin-remodeling complex to enhancers.134 Thus, 7SK snRNA can regulate multiple aspects of Pol II transcription via interaction with distinct protein partners.132 During mouse neuronal differentiation, 7SK is at its highest expression level in mature and terminally differentiated neurons.135 The authors also demonstrated that 7SK knockdown at different time points of in vitro neural differentiation led to impaired morphology and reduced expression of markers of neuronal differentiation.135 This is in contrast to an earlier study showing that 7SK knockdown in neural stem cells and OPCs leads to an increased expression of genes involved in neuronal and oligodendrocyte specification and differentiation.136 Thus, the dynamic expression of 7SK RNA during neural differentiation suggests that it is important for this process, but its exact role remains to be established.
7SL RNA, encoded by RN7SL1 and RN7SL2, is a component of the signal recognition particle (SRP), together with six proteins, and is a universally conserved molecule.137 The SRP is responsible for co-translational targeting of nascent secretory and transmembrane peptides to the ER through interaction with its SRP receptor.109,138 7SL RNA was originally thought to act simply as a scaffold to hold the SRP proteins together, but more recent work in prokaryotes suggests that the 7SL RNA plays an active role in reorganizing the SRP for cargo binding and mediating interactions with the SRP receptor.137 Depletion of SRP14, SRP54 or SRP72 in HEK293 or HeLa cells leads to decreased 7SL RNA levels, suggesting that normal levels of SRP proteins are necessary for normal expression of 7SL RNA. This also causes inefficient ER targeting and impaired post-ER membrane trafficking.139 Interestingly, both 7SL and 7SK RNAs are at their highest expression level in the hypothalamus compared to non-CNS tissues.140 Furthermore, their mouse counterparts are up-regulated in a heterogeneous population of neural and glial cells compared to neural
progenitors and embryonic cells, suggesting that these ncRNAs could be of particular importance for proper differentiation of neural lineages.141

U6 and U6atac snRNAs are small uridine-rich RNAs required for mRNA splicing. They are components of the major and the minor spliceosomes, respectively. The former removes the common U2-type introns from mRNAs, while the latter excises the rare U12-type introns.142 Specifically, U6 snRNA forms the catalytic site of the spliceosome with U2 snRNA and then coordinates the metal cations that are necessary for catalysis of the splicing reaction.142

The RNase mitochondrial RNA processing complex (MRP) is an RNP composed of the MRP RNA (RMRP) and seven proteins.143 Most RNase MRP complexes are actually located in the nucleolus, where they are involved in processing of pre-rRNA. In human cells, the RNase MRP precisely cleaves the 18S rRNA from the part of the pre-rRNA that encompasses the 5.8S rRNA and the 28S rRNA.143 In the mitochondria, RNase MRP generates primers for DNA replication by cleaving mitochondrial RNA transcripts.109 A third function emerged when MRP RNA was found to be in a complex with the human telomerase catalytic subunit.144 This newly discovered RNP has an RNA-dependent RNA polymerase activity and produces double-stranded RNAs that are processed into siRNAs by Dicer.
Importantly, some of these siRNAs regulate the expression level of RMRP, thus producing a negative feedback loop.144

H1 RNA, encoded by RPPH1, is a component of the RNase P complex, a ribonucleoprotein whose main function is to process the 5’ leader sequence of pre-tRNAs.109 Surprisingly, new evidence indicates that RNase P regulates transcription by Pol I and Pol III, since knockdown of its protein subunits or targeted cleavage of H1 RNA reduces the levels of Pol I and Pol III transcription.145,146 In fact, RNase P appears to be required for the formation of Pol III initiation complexes, although the exact mechanism remains unclear.147

Vault RNAs are members of the vault particle, which is thought to play a role in multidrug resistance.109 In addition, vault RNAs can be processed into small regulatory RNAs that resemble miRNAs.109 Similar to miRNAs, these small vault RNAs are Dicer-dependent and regulate expression of at least one mRNA, CYP3A4, through sequence-specific cleavage.148

Y RNAs assemble into Ro RNPs with the proteins Ro60 and La. The exact function of Y RNAs is not well understood, although they were shown to be required for chromosomal DNA replication in human cells.149 Furthermore, Ro RNPs are involved in RNA quality control, RNA stability and cellular response to stress.150 Y RNAs are also thought to act as miRNA precursors,
since miRNA-sized fragments from Y RNAs were detected in several mammalian healthy tissues and tumors.150

BC200 RNA, encoded by BCYRN1, is a primate-specific ncRNA of 200 nucleotides that is highly expressed in the brain.151 It consists of three major structural domains: a 5’ domain homologous to Alu repetitive elements, a central A-rich region and a unique 3’ sequence (Fig. 1.3). The 5’ domain contains consensus elements A and B of type 2 Pol III promoters (Fig. 1.3).151 In 1993, Tiedge et al. reported high expression of BC200 RNA in the human brain and absence of labeling in other organs.151 Using a probe against the 3’ unique sequence for in situ labeling, they showed a predominant signal in dendrite-rich areas of the brain, suggesting that BC200 RNA localizes to the somatodendritic part of neurons, where it is thought to regulate ectopic translation.151,152 Rodents also possess a short Pol III transcript named Bc1 RNA that is almost exclusively expressed in neurons and localizes to dendrites.153 Bc1 and BC200 have distinct evolutionary origins and are not sequence homologs.151 However, they have similar primary and secondary structure,154 comparable expression pattern and localization and are both Pol III transcripts. Thus, they have traditionally been considered as functional analogs151 and much of the early knowledge on the role of BC200 RNA was derived from studies on Bc1 RNA.
Both Bc1 and BC200 RNAs were found to interact with members of the translation machinery or translational regulators, namely poly-A binding protein (PABP), translation initiation factor eIF4A, fragile-X mental retardation protein (FMRP), Pur alpha, and La protein.154-159 Several lines of evidence suggest that Bc1 plays a role in translational repression, but the exact in vivo mechanism remains unclear.160-162 Studies from Henri Tiedge’s group strongly suggest that both Bc1 and BC200 RNAs repress translation through a dual mode of action by inhibiting the helicase activity of eIF4A and disrupting the interaction of eIF4B with 18S rRNA.163,164 In contrast, Zalfa et al. observed direct binding of Bc1 and FMRP and demonstrated that Bc1 participates in the translational repression of a subset of mRNAs by mediating their interaction with FMRP.154 Interestingly, more recent data indicates that BC200 RNA is also expressed in human immortalized and primary cell lines, albeit at much lower levels than in the brain.165

Finally, Pol III also transcribes short interspersed nuclear elements (SINE), a class of repetitive transposable elements.166 The majority of SINEs are Alu elements, a family present in >1 million copies and constituting approximately 11% of the human genome, although only a
small proportion are still active.166,167 The body of Alu elements is formed by two dimers that are ancestrally derived from the 7SL RNA gene.166 Many Alu elements have inserted in genes and are co-transcribed with their host gene by Pol II. However, a small number of Alu elements have retained their own Pol III promoters and can be transcribed as free RNAs, although they are usually expressed at very low levels.167 Interestingly, free Alu elements are upregulated under stress conditions such as heat shock or viral infections.168 Pol III-transcribed Alu elements can repress gene expression by binding to the Pol II initiation complex or by interacting with mRNA through sequence complementarity.167 Evidence suggests that Alu RNAs can also positively regulate translation, while Alu RNPs, which consist of Alu RNA and SRP9/14 proteins, repress translation.169 This discrepancy is thought to result from different conformations of Alu RNA with and without SRP9/14.167

In addition to the conventional Pol III transcripts, a computational search of the human genome for upstream promoter elements typical of type 3 promoters revealed 32 putative novel Pol III transcripts.170 Experimental validation of a subset of candidates confirmed Pol III-dependent transcription. A significant proportion of these transcripts were found to play a role in neuronal cell lines and to be expressed in the human brain. In particular, several novel transcription units are embedded in introns of protein-coding genes in an antisense configuration. Their overexpression shifts the ratio of alternatively spliced isoforms of their “co-gene”, resulting in a higher expression of isoforms associated with processes related to neurodegeneration, such as amyloid protein processing.171-174

1.3.3 Regulation of Pol III transcription

The development of ChIP-seq confirmed Pol III binding at loci encoding all its previously known transcripts in mouse liver and in various human cultured cells (H1 embryonic stem cells, IMR90 fibroblasts, HEK293, HeLa, K562, GM12878, Jurkat T, CD4+ T cells).175-181 However, these datasets varied widely in the number of novel Pol III-bound loci detected (between 6 and 124), with an important proportion corresponding to SINE elements (MIR or Alu).176,177,179 This discrepancy could be explained by tissue-specific expression of Alu elements or artifacts due to the challenges associated with mapping reads corresponding to Pol III transcripts.176 One important conclusion from these ChIP-seq experiments is that only 50-60% of tRNA genes are occupied in any given cell type. Indeed, the human genome contains almost 600
tRNA genes, resulting in high redundancy, with multiple copies encoding the same tRNA isoacceptor. The number of gene copies for each isoacceptor varies widely, from only one (e.g. tRNA-SerACT) to 39 (e.g. tRNA-CysGCA) (GtRNAdb). Importantly, the identity of occupied tRNA genes varies across cell types and developmental stages.175-179,181,182 In several human cell lines, Pol III occupancy coincides with chromatin accessibility and epigenetic marks of active chromatin (e.g. H3K4me2, H3K4me3, H3K9ac, H3K27ac).175,177,178,183 For example, in H1 embryonic stem cells, Alla et al. found that a fraction of active tRNA genes (16%) are located in promoters of known Pol II-transcribed genes, while the remainder are in enhancer-like regions marked by H3K4me1 and H3K27ac.175 Furthermore, coding regions of Pol III-transcribed genes are fully depleted in nucleosomes, contrary to Pol II-transcribed genes, presumably because of their short length.177,179,184 Strikingly, multiple reports have described Pol II occupancy on or near active Pol III-transcribed genes, both for tRNA and type 3 genes.175,178,180,185 Of note, type 3 genes and small ncRNA genes transcribed by Pol II have similar promoter elements and both use the transcription factor SNAPc.185 For most Pol III-transcribed genes, Pol II occupancy was much lower than that of Pol III (e.g. 19-33 fold difference in mouse liver).181,185 One notable exception is the RPPH1 gene, which was bound by similar amounts of Pol II and Pol III.
Thus, the authors concluded that it can be transcribed by both complexes, although this was not shown directly.185 In a different study, inhibition of Pol II transcription with α-amanitin led to reduced expression of type 3 genes and several transcription factors associated with Pol II, such as c-Fos, c-Jun and c-Myc, were found at active Pol III-transcribed loci.180 Altogether, these data suggest that Pol II occupancy plays a role in the regulation of Pol III transcription, although the exact mechanisms remain unclear.

In turn, Pol III activity can also regulate Pol II transcription. In yeast, these so-called “extra-transcriptional” effects of Pol III complexes have been extensively studied, especially at tRNA genes.186 Specifically, Pol III and TFIIIC-bound tRNA genes exert partial inhibition on transcription from adjacent Pol II promoters by acting as insulators, a phenomenon called tRNA-mediated gene silencing.186 Similar observations were made in human cells, where a cluster of tRNA genes on chromosome 17 could act as enhancer-blocking insulators in a CTCF-independent and TFIIIC-dependent manner.187 Alternatively, Pol III complexes also act as boundary elements to prevent the spread of heterochromatin via transcriptional interference of the Pol III machinery, thus positively
regulating nearby gene expression.186 Several studies in yeast have shown that mutation or deletion of a tRNA gene causes partial repression of the adjacent downstream gene, indicating that Pol III binding to tRNA genes is necessary to maintain normal expression levels of nearby genes.186 A reporter gene assay in mouse cells provided the first evidence of a similar function of tRNA genes in mammals, demonstrating that clusters of 2 or 4 mouse tRNA genes prevent heterochromatin-mediated silencing when inserted upstream of the reporter gene.188 In humans, genome-wide prediction of boundary elements using ChIP-seq and RNA-seq data identified putative boundaries encoded by tRNA genes.189 In a separate study, chromosome conformation capture analysis showed that both tRNA genes and Alu/SINE repeat elements are enriched at boundaries of topologically associated domains, supporting that Pol III transcription contributes to the formation of boundaries and the regulation of chromatin structure.190 In concordance with this data, multiple groups have reported that active tRNA genes are often in proximity to active Pol II-transcribed genes.178,179,181 Specifically, tRNA genes are enriched near the transcription start site of Pol II-transcribed genes, but not near the 3’ end.188 During macrophage differentiation, changes in nascent tRNA levels were mirrored by similar changes in the expression of nearby protein-coding genes.183 A particularly interesting example of the interplay between Pol II and Pol III transcription is a Pol III-transcribed Mammalian Interspersed Element (MIR) located in antisense configuration in the first intron of the Pol II-transcribed POLR3E gene. Yeganeh et al. found that active transcription of the MIR creates a roadblock to Pol II elongation along the mouse Polr3e gene, consistent with a mechanism of transcriptional interference.
Since POLR3E encodes a Pol III subunit, this suggests a negative feedback loop, where elevated Pol III activity would lead to increased MIR transcription, causing decreased expression of POLR3E and bringing Pol III activity back to lower levels.191 Because of the abundance of its transcripts, Pol III transcription has a high energetic cost in growing cells. Upon unfavorable growth conditions such as nutrient deprivation or stress, its master regulator MAF1 rapidly represses Pol III transcription in order to conserve metabolic energy.192 Upon return to favorable growth conditions, human MAF1 is inactivated via direct phosphorylation by mTORC1.193 In Maf1 knock-out (KO) mice, failure to repress Pol III transcription leads to a lean body weight and resistance to diet-induced obesity due to increased energy expenditure.192,194 In human fibroblasts, it was recently shown that MAF1 represses
transcription by preventing Pol III recruitment to target promoters.193 Strikingly, despite acting directly on Pol III rather than on different promoters, the repressive action of MAF1 is not uniform for all Pol III-transcribed genes. In fact, two distinct classes of genes were identified based on their response to serum starvation and rapamycin treatment. The first category contains genes that are expressed at higher levels in favorable conditions and repressed in stress conditions via the action of MAF1. The second group comprises 49 genes that show a high and stable expression across all conditions and includes genes encoding most tRNA isotypes, as well as other housekeeping Pol III transcripts.193 This suggests that a small group of Pol III transcripts are constitutively expressed to maintain a basal level of expression during stress conditions.192

1.4 Quantification of Pol III transcripts

Simultaneous measurement of expression levels of all Pol III transcripts, which are numerous and scattered across the genome, requires high-throughput techniques such as RNA-sequencing (RNA-seq) or chromatin immunoprecipitation followed by sequencing (ChIP-seq). In the past several years, RNA-seq has become the gold standard to study genome-wide RNA expression, especially for mRNAs and microRNAs. However, the direct quantification of Pol III transcripts using RNA-seq is associated with several challenges. First, most Pol III transcripts are smaller than 200 nucleotides and are therefore not well represented in standard RNA-seq protocols. Moreover, existing small RNA-seq protocols show a bias towards miRNAs and do not allow proper coverage of Pol III transcripts. Next, Pol III transcripts, in particular mature tRNAs, have a strong and complex secondary structure and post-transcriptional modifications that interfere with cDNA library preparation. Finally, the highly repetitive sequences of Pol III transcripts lead to ambiguity during alignment and subsequent quantification of expression levels. Because of these challenges, several groups have used ChIP-seq, which measures Pol III binding to DNA, as a proxy to quantify Pol III transcript expression.176,180,193,195 Indeed, alignment of ChIP-seq reads is computationally easier than that of RNA-seq reads because the flanking DNA sequences are more unique. However, while it was demonstrated that Pol III occupancy correlates with transcription rate and unprocessed RNA levels,196 this is insufficient to predict the final amount of RNA product, especially in the case of Pol III mutants. For instance, a mutated
polymerase may be correctly bound to the DNA but have impaired transcription elongation, thus breaking the correlation between occupancy and transcript level. In contrast, RNA-seq measures the steady-state levels of RNA. Recently, several protocols have been published to overcome the challenges associated with RNA-seq of Pol III transcripts, in particular to improve library preparation for sequencing of mature tRNA.120,197-200 TGIRT-seq relies on the use of a thermostable group II intron reverse transcriptase (TGIRT), which has higher processivity than traditional reverse transcriptases and thus enables more efficient cDNA synthesis of highly structured small RNAs.201 DM-tRNA-seq and ARM-seq use enzymatic or chemical treatment, respectively, to remove tRNA base methylations that interfere with cDNA synthesis.197,198 In DM-tRNA-seq, this is coupled with the use of a highly processive TGIRT.198 To specifically study tRNA gene transcription, van Bortle et al. combined demethylation with biotin-capture of nascent tRNAs.183 Hydro-tRNAseq is another alternative that generates smaller tRNA fragments (19-35 nt) by limited alkaline hydrolysis of purified tRNAs, to reduce the potential to form secondary structures and the number of modifications per fragment, which facilitates sequencing.199,202 Finally, Gogakos et al. used photoactivatable ribonucleoside-enhanced crosslinking and immunoprecipitation (PAR-CLIP) on SSB, a protein involved in processing of tRNA 3’ ends, to reliably sequence pre-tRNAs.202 Using both Hydro-tRNAseq and SSB PAR-CLIP, they confirmed the absence of a strong correlation between pre-tRNA abundance and mature tRNA abundance.202 Although these techniques have greatly improved the high-throughput quantification of tRNAs, their specific focus on tRNAs is limiting in the context of studying Pol III function, since they do not address other Pol III transcripts. Thus, a method that allows simultaneous quantification of all Pol III transcripts using RNA-seq is still lacking.
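The alignment ambiguity mentioned in this section arises because a read derived from one gene copy matches every other copy with the same sequence equally well. The sketch below illustrates one simple way to make this visible, by grouping gene copies whose sequences are identical so that they could be quantified as a single unit. It is an illustration under stated assumptions, not a method described in the cited studies: the function names `group_identical_copies` and `ambiguous_copies`, the gene names and the sequences are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def group_identical_copies(genes: Dict[str, str]) -> List[List[str]]:
    """Group gene names whose sequences are identical (case-insensitive)."""
    by_sequence = defaultdict(list)
    for name, sequence in genes.items():
        by_sequence[sequence.upper()].append(name)
    # Sort names within each group, then the groups, for a stable result.
    return sorted(sorted(names) for names in by_sequence.values())


def ambiguous_copies(genes: Dict[str, str]) -> List[str]:
    """Names of gene copies that share their exact sequence with another copy."""
    return sorted(
        name
        for group in group_identical_copies(genes)
        if len(group) > 1
        for name in group
    )


# Hypothetical toy sequences, not real tRNA genes.
toy_genes = {
    "copyA": "GCATTGGTGG",
    "copyB": "gcattggtgg",
    "copyC": "GCATTGGTGA",
}
print(group_identical_copies(toy_genes))
print(ambiguous_copies(toy_genes))
```

Reads from the two identical toy copies cannot be assigned to one locus from the transcript sequence alone, whereas the third copy differs by one nucleotide and remains distinguishable in principle.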

Tables

Table 1.1. Major forms of ARCA in French Canadians, ranked in order of frequency (most common to least common). *Unpublished evidence suggests that MARS2 rearrangements are not causative of ARSAL.

Name | Causative gene | Main clinical features | Regional clustering | Founder mutation(s) | Protein function
ARSACS | SACS [22] | Ataxia, spasticity, polyneuropathy, hypermyelinated retinal fibers | Charlevoix, Saguenay-Lac-St-Jean | c.8844delT, c.7504C>T | Putative organizer of cytoskeletal network [203]
Friedreich Ataxia | FXN [24] | Gait and limb ataxia, absent tendon reflexes in lower limbs, axonal neuropathy | Across Québec, cluster near Rimouski | GAA triplet repeat expansion | Mitochondrial protein involved in iron-sulfur cluster formation [204]
ARCA1 | SYNE1 [27] | Late-onset pure cerebellar ataxia | Beauce and Bas-St-Laurent | c.15705-12A>G | Links plasma membrane to actin cytoskeleton [27]
AOA2 | SETX [35] | Ataxia, axonal sensorimotor neuropathy, elevated alpha-fetoprotein, oculomotor apraxia | Northeastern Québec, Gaspésie, New Brunswick | c.5927G>T | Putative RNA:DNA helicase [35,205]
TACH | POLR3A [39] | Childhood-onset progressive motor deterioration, gait ataxia, spasticity, tremor, hypomyelination on MRI | Beauce | c.2015G>A | Catalytic subunit of RNA polymerase III [39]
ARSAL | *MARS2 [41] | Gait ataxia, spasticity, hyperreflexia | Portneuf | Complex genomic rearrangements | Mitochondrial tRNA-aminoacyl synthetase [41]

Table 1.2. Types of hypomyelinating leukodystrophies (HLD) and functions of the mutated genes. The table is divided into four sections: genes encoding i) proteins directly involved in myelin or oligodendrocyte function, ii) transcription factors involved in oligodendrocyte maturation, iii) proteins that are directly or indirectly involved in mRNA translation, and iv) other proteins involved in ubiquitous processes. The table was adapted from references 206, 57 and the Online Mendelian Inheritance in Man (OMIM) database.207 *These diseases are not unanimously classified as an HLD.

Disease | Gene | Type of mutation | Gene function | OMIM Phenotype Number
i) Proteins directly involved in myelin or oligodendrocyte function
Pelizaeus-Merzbacher Disease | PLP1 | Copy number variations, point mutations | One of the two main proteins in myelin sheath | 312080
Hypomyelination of early myelinating structures | PLP1 | Splice site | One of the two main proteins in myelin sheath | -
Pelizaeus-Merzbacher-like Disease | GJC2 | Missense, frameshift | Oligodendrocyte-specific gap junction protein | 608804
ii) Transcription factors involved in oligodendrocyte maturation
Peripheral neuropathy, central hypomyelination, Waardenburg-Hirschsprung | SOX10 | Nonsense, frameshift | Transcription factor involved in oligodendrocyte differentiation [208] | 609136
NKX6-2-related disorders | NKX6-2 | Missense, nonsense | Transcription factor putatively involved in oligodendrocyte maturation [47] | 617560
iii) Proteins directly or indirectly involved in mRNA translation
Hypomyelinating leukodystrophy 3 | AIMP1 | Nonsense, frameshift | Auxiliary protein of the aminoacyl-tRNA synthetase complex; regulator of neurofilament phosphorylation | 260600
Hypomyelinating leukodystrophy 9 | RARS | Missense, frameshift | Arginyl-tRNA synthetase | 616140
Hypomyelination with brainstem and spinal cord involvement and leg spasticity* | DARS | Missense | Aspartyl-tRNA synthetase | 615281
EPRS-related leukodystrophy [130] | EPRS | Missense, nonsense, frameshift | Glutamyl-prolyl-tRNA synthetase | -
Pol III-related leukodystrophies | POLR3A, POLR3B, POLR1C | Missense, nonsense, frameshift, splice site | Subunits of RNA Pol III, which synthesizes small non-coding RNAs | 607694, 614381, 616494
iv) Other proteins involved in ubiquitous processes
Hypomyelinating leukodystrophy 4 | HSPD1 | Missense | Mitochondrial protein folding and assembly | 612233
Hypomyelinating leukodystrophy 13 | HIKESHI | Missense | Nuclear import of heat shock proteins | 616881
Hypomyelination and congenital cataracts | FAM126A | Missense | Regulates synthesis of phosphatidylinositol 4-phosphate | 610532
Hypomyelination with atrophy of the basal ganglia and cerebellum | TUBB4A | Missense (de novo) | Tubulin, essential component of cytoskeleton | 612438
Hypomyelinating leukodystrophy 10 | PYCR2 | Missense | Proline biosynthesis | 616420
Hypomyelinating leukodystrophy 12 | VPS11 | Missense | Vesicle-mediated delivery of proteins to lysosome | 616683
Hypomyelinating leukodystrophy 14 | TMEM106B | Missense (de novo) | Located in lysosomal and endosomal membranes, unclear function | -
Allan-Herndon-Dudley syndrome* | SLC16A2 | Missense, frameshift, large deletion | Thyroid hormone transporter | 300523

Table 1.3. Phenotypes caused by recessive mutations in genes encoding Pol III subunits. The original diseases that now constitute POLR3-HLD are shown at the top, while recently identified atypical phenotypes are indicated at the bottom.

Phenotype | Geographical origin / Ethnicity | Causative gene | Founder / common mutation | Number of cases

Pol III-related leukodystrophies89 (number of cases: > 130)
Tremor ataxia with central hypomyelination | French Canadian | POLR3A | c.2015G>A |
Leukodystrophy with Oligodontia | Syria | POLR3A | c.2003+18G>A |
4H syndrome | US, Europe | POLR3A, POLR3B, POLR1C | - POLR3B c.1568T>A |
Hypomyelination with cerebellar atrophy and hypoplasia of the corpus callosum | Japan | POLR3A, POLR3B | - |

Atypical phenotypes
Ataxia with striatal involvement93 | Roma | POLR3A | c.1771-6C>G | 3
Spastic ataxia92 | Europe | POLR3A | c.1909+22G>A | 29
Isolated hypogonadotropic hypogonadism94 | US | POLR3B | - | 4
Neonatal progeroid syndrome95 | US | POLR3A | - | 1

Table 1.4. Roles of Pol III transcripts in mammalian cells. Abbreviations: nt: nucleotides. *Does not include pseudogenes of Pol III transcripts that are not or lowly transcribed.

Pol III transcript | Promoter type | Number of genes in human genome* | Size (nt) | Expression | Functions
5S rRNA | 1 | 17 | 120 | Ubiquitous | mRNA translation (component of ribosome)109, stabilization of Mdmx131
7SK snRNA | 3 | 1 | 331 | Ubiquitous | Regulation of Pol II transcription132
7SL RNA | 2 | 2 | 300 | Ubiquitous | Protein translocation to the ER (component of SRP complex)109
BC200 RNA | 2 | 1 | 200 | Brain | Regulation of ectopic translation in dendrites151,163
RNase MRP RNA | 3 | 1 | 266 | Ubiquitous | Processing of rRNA precursor143, mitochondrial DNA replication109, production of dsRNAs144
RNase P RNA (H1 RNA) | 3 | 1 | 340 | Ubiquitous | Processing of tRNA 5' ends109, regulation of Pol I and Pol III transcription145-147
SINE elements | 2 | > 1 million | ~300 | Cell-type specific | Stress response, regulation of gene expression167,168
tRNAs | 2 | 610 | 70-94 | Ubiquitous | mRNA translation111, stress response110
U6 snRNA | 3 | 9 | 107 | Ubiquitous | mRNA splicing142 (component of spliceosome)
U6atac snRNA | 3 | 1 | 126 | Ubiquitous | mRNA splicing142 (component of minor spliceosome)
Vault RNAs | 2 | 3 | 89-99 | Ubiquitous | Multidrug resistance109, generation of miRNA-like RNAs148
Y RNAs | 3 | 4 | 79-113 | Ubiquitous | DNA replication, RNA stability, miRNA precursors149,150

Figures

Figure 1.1 Schematic of an axon myelinated by oligodendrocytes in the CNS. Myelinated regions are interrupted by unmyelinated nodes of Ranvier, which allows for saltatory conduction of action potentials. Reproduced from OpenStax CNX (https://cnx.org/contents/c9j4p0aj@4/Neurons-and-Glial-Cells).


Figure 1.2 Pol III gene classes based on promoter architecture and associated transcription factors. IE – Internal Element, DSE – Distal Sequence Element, PSE – Proximal Sequence Element. Figure reproduced with permission from reference #175 under the Creative Commons Attribution license.


[Figure 1.3 schematic; labels from 5’ to 3’: Alu domain, A-rich domain, Unique domain; promoter boxes A and B]

Figure 1.3. Linear representation of BC200 RNA. The RNA is composed of a 5’ Alu domain, an internal A-rich domain and a 3’ unique domain. The two elements of the type 2 Pol III promoter (boxes A and B) are shown in blue. Figure drawn from information in reference #151.

RATIONALE, HYPOTHESIS AND OBJECTIVES

For patients affected with rare inherited ataxias, the end of the diagnostic odyssey has major psychological, sociological and economic benefits. It eliminates the need for further investigations, decreases prognostic uncertainty, allows for accurate genetic counseling and informed reproductive choices, and improves psychosocial health in affected families.42 Furthermore, the identification of the precise genetic defect can occasionally point to specific treatments or provide access to therapies that are under investigation.209 Nevertheless, at the beginning of this project, it was estimated that approximately 30% of French Canadian patients affected with ARCA remained without a precise molecular diagnosis. It is presumed that these individuals are affected with rarer and more heterogeneous forms of ARCA. In our laboratory, traditional approaches such as linkage analysis have failed to identify candidate genes or loci, even for large cohorts with a common geographical origin (i.e. Portneuf region). Moreover, there is a large phenotypic heterogeneity among unresolved French Canadian ARCA cases, sparking the hypothesis that several genes are responsible for ARCA in these patients. Thus, the first objective of this thesis was to identify new genetic causes of ARCA in French Canadians.

With the accelerating pace of rare disease genetics, it is not far-fetched to imagine a time when the causative gene has been uncovered in most Mendelian disorders. However, an even bigger challenge is to act beyond gene identification, in order to unravel pathophysiological mechanisms and to develop potential therapies. Following that idea, the remainder of my PhD thesis focused on the broadening spectrum of Pol III-related ataxias with frequent hypomyelinating leukodystrophies, caused by mutations in genes encoding Pol III subunits. The downstream effects of Pol III dysfunction responsible for the different pathological phenotypes are poorly understood, which represents a major bottleneck for therapy development. Specifically, it remains enigmatic how mutations in a ubiquitously expressed and essential enzyme such as Pol III lead to a CNS-specific disease. Pol III synthesizes all tRNAs and many non-coding RNAs involved in transcription regulation, splicing and protein synthesis. Our hypothesis is that mutations in Pol III subunits lead to decreased levels of some or all of its transcripts, which preferentially perturbs the expression and/or translation of mRNAs that are essential for the development and survival of oligodendrocytes and neurons. Therefore, my second objective was to generate and characterize a mouse model of POLR3-HLD, in which we could have easy access to the relevant CNS tissues, in order to better understand their specific vulnerability to Pol III dysfunction. This mouse model carried the French Canadian founder mutation in Polr3a, c.2015G>A (p.G672E), making it especially pertinent to Québec patients.

The majority of Pol III transcripts play critical roles in housekeeping processes related to RNA and protein homeostasis. However, the impact of Pol III hypofunction on the expression of its own transcripts, as well as on the rest of the transcriptome and the proteome, remains enigmatic. The identification of deregulated Pol III transcripts or pathways that underpin Pol III dysfunction could pinpoint potential pathophysiological mechanisms and eventually provide the molecular basis for developing therapies. Thus, my third objective was to uncover the transcriptional consequences of Pol III hypofunction by performing extensive transcriptomic profiling of disease models and patient-derived samples.

PREFACE TO CHAPTER 2

As part of the work performed in the Brais laboratory to identify new genetic causes of ARCA in French Canadians, patients presenting with cerebellar ataxia, spasticity and a presumed recessive mode of inheritance were recruited in neuromuscular clinics across Québec. Chapter 2 is divided into two parts and relates the genetic findings in 13 French Canadian ARCA families, in whom we identified mutations in the hereditary spastic paraplegia gene SPG7 (Part A) or in a novel disease-causing gene, PMPCA (Part B).

CHAPTER 2: Identification of new causes of cerebellar ataxia in French Canadians

Part A: SPG7 mutations explain a significant proportion of French Canadian spastic ataxia cases

Karine Choquet1,2*, Martine Tétreault2,3*, Sharon Yang1, Roberta La Piana1, Marie-Josée Dicaire1, Megan R. Vanstone4, Jean Mathieu5, Jean-Pierre Bouchard6, Marie-France Rioux7, Guy A. Rouleau8, Care4Rare Canada Consortium, Kym M. Boycott4, Jacek Majewski2,3, Bernard Brais1,2

Affiliations: 1) Neurogenetics of Motion Laboratory, Department of Neurology and Neurosurgery, Montreal Neurological Institute, McGill University, Montreal, Québec, Canada 2) Department of Human Genetics, McGill University, Montreal, Québec, Canada 3) McGill University and Genome Quebec Innovation Center, Montreal, Québec, Canada 4) Children’s Hospital of Eastern Ontario Research Institute, University of Ottawa, Ottawa, Ontario, Canada 5) Complexe hospitalier de la Sagamie et Faculté de Médecine et des Sciences de la Santé de l’Université de Sherbrooke, Jonquière, Québec, Canada 6) Hôpital Enfant-Jésus, CHU de Québec et Département des Sciences Neurologiques, Faculté de Médecine de l’Université Laval, Québec, Québec, Canada 7) Centre hospitalier universitaire de Sherbrooke – Hôpital Fleurimont, Sherbrooke, Québec, Canada 8) Montreal Neurological Institute and Hospital and Department of Neurology and Neurosurgery, McGill University, Montreal, Québec, Canada

*These authors contributed equally to this work.

Published in Eur J Hum Genet. 2016 Jul;24 (7):1016-21. doi: 10.1038/ejhg.2015.240. Epub 2015 Dec 2. Reproduced with permission

Abstract

Hereditary cerebellar ataxias and hereditary spastic paraplegias are clinically and genetically heterogeneous and often overlapping neurological disorders. Mutations in SPG7 cause the autosomal recessive spastic paraplegia type 7 (SPG7), but recent studies indicate that they are also one of the most common causes of recessive cerebellar ataxia. In Quebec, a significant number of patients affected with cerebellar ataxia and spasticity remain without a molecular diagnosis. We performed whole exome sequencing in three French Canadian (FC) patients affected with spastic ataxia and uncovered compound heterozygous variants in SPG7 in all three. Sanger sequencing of SPG7 exons and exon/intron boundaries was used to screen additional patients. In total, we identified recessive variants in SPG7 in 22 FC patients belonging to 12 families (38.7% of the families screened), including two novel variants. The p.(Ala510Val) variant was the most common in our cohort. Cerebellar features, including ataxia, were more pronounced than spasticity in this cohort. These results strongly suggest that variants affecting the function of SPG7 are the fourth most common form of recessive ataxia in FC patients. Thus, we propose that SPG7 mutations explain a significant proportion of FC spastic ataxia cases and that this gene should be considered in unresolved patients.

Introduction

Hereditary cerebellar ataxias and hereditary spastic paraplegias (HSPs) are clinically and genetically heterogeneous and often overlapping neurological disorders. HSPs are characterized by a predominant progressive spasticity and weakness in the lower limbs due to degeneration of the corticospinal tracts,210 while the main feature of cerebellar ataxias is progressive cerebellar degeneration leading to impaired balance, gait and speech.211 Both HSPs and hereditary ataxias can be associated with other neurological and non-neurological features, resulting in complex phenotypes with frequent intra- and inter-familial variability. Significantly, cerebellar ataxias are very often associated with pyramidal involvement, so that more than 50% of recessive ataxias manifest as spastic ataxias.20 Due to the individual rarity and genetic heterogeneity of these conditions, their molecular diagnosis remains challenging and time-consuming.

Mutations in the gene SPG7 were the first identified genetic cause of autosomal recessive HSP in 1998 (MIM602783).212 Since then, a significant number of causative mutations were found in several HSP cohorts from different populations.213-222 SPG7 can be characterized by a pure or complex HSP phenotype. Increasingly, reports documented that cerebellar ataxia and/or cerebellar atrophy on magnetic resonance imaging (MRI) are the most frequent additional features in complex SPG7 cases.214,215,217,220,223 In a study of a large Dutch cohort, cerebellar ataxia was found in 57% of cases and it was even the dominating clinical symptom in some of these patients.214 In addition, a recent report suggests that SPG7 mutations are a frequent cause of adult-onset undiagnosed cerebellar ataxia in patients of British descent.224 The sequencing of SPG7 in next-generation sequencing (NGS) panels has further identified cases of spastic ataxia carrying compound heterozygous variants, supporting that SPG7 may be one of the most common forms of recessive ataxias worldwide.216 Thus, a growing number of studies indicate that SPG7 should be considered in the differential diagnosis of recessive cerebellar ataxia.214,224,225

In Quebec, Friedreich Ataxia (MIM229300), Autosomal Recessive Spastic Ataxia of Charlevoix-Saguenay (MIM270550) and Autosomal Recessive Spinocerebellar Ataxia Type 8 (MIM610743) account for the majority of autosomal recessive cerebellar ataxia cases and, except for the latter, they usually present as spastic ataxias.28 However, a large number of French Canadian (FC) cerebellar ataxia cases remain unresolved, many of which have associated spasticity and often milder adult-onset phenotypes. Only recently were three SPG7 FC cases reported from the province of Ontario.226 The relative prevalence of SPG7 in the FC population is unknown.

As for many other rare diseases, the implementation of whole exome sequencing (WES) in research and in clinical settings has greatly accelerated the identification of disease-causing genes in the field of ataxias. Here, we report on the identification of causative SPG7 variants in 22 unresolved FC spastic ataxia cases belonging to 12 families, making it the fourth most common recessive ataxia in this population.

Subjects and Methods

Subjects

Patients presenting with cerebellar ataxia, spasticity and a family history suggestive of autosomal recessive or sporadic inheritance were seen at several neuromuscular clinics in the province of Quebec and Eastern Ontario between 2002 and 2014. Mutations in SACS and FRDA were ruled out in the majority of cases. All participating family members signed an informed consent form approved by the institutional ethics committee of the Centre de Recherche du Centre Hospitalier de l’Université de Montréal (CRCHUM) or the Children’s Hospital of Eastern Ontario.

Molecular analyses

Genomic DNA was extracted from peripheral blood cells using standard methods. WES was performed on individuals 1, 5 and 9 using the SureSelect exome capture kit v.5 (Agilent Technologies, Santa Clara, CA). The libraries were sequenced on an Illumina HiSeq 2000 (Illumina, San Diego, CA) with paired-end 100-bp reads at the McGill University and Genome Quebec Innovation Center (Montreal, Quebec, Canada). Sequences were aligned to the human reference genome (UCSC hg19) using the BWA (Burrows-Wheeler Aligner) algorithm, variant calling was performed using SAMtools227 and annotation was done using ANNOVAR228 and custom scripts, as previously described. For Sanger sequencing, polymerase chain reaction (PCR) was used to amplify selected individual exons and intron-exon boundaries of SPG7. PCR products were sent to McGill University and Genome Quebec Innovation Center for sequencing, using a 3730XL DNA Analyzer (Applied Biosystems, Foster City, CA). Mutation detection analysis was performed using SeqMan v.4.03 (DNASTAR Inc., Madison, WI) and 4Peaks (A. Griekspoor and Tom Groothuis, mekentosj.com).

Patient 22 was analyzed using the HSP Sanger panel at the Hospital for Sick Children (Toronto, Ontario, Canada), as previously described.226 Variants identified were submitted to ClinVar (http://www.ncbi.nlm.nih.gov/clinvar/).

For complementary DNA (cDNA) sequencing, skin fibroblasts were derived from individual 19 and a healthy control and were grown according to standard protocols. Total RNA was extracted using Trizol reagent (Ambion, Foster City, CA) and reverse transcribed into cDNA using the Superscript® III Reverse Transcriptase (Invitrogen, Carlsbad, CA) according to the manufacturer’s instructions. Exons 7 and 8 of SPG7 were amplified by PCR from the cDNA and sequenced. Sequencing and variant detection analysis were performed as described above.

Results

Exome sequencing

A group of nine unrelated patients with unresolved autosomal recessive spastic ataxia were sent for WES. Since there was some phenotypic heterogeneity within the cohort regarding age of onset, severity and associated neurological symptoms, it was expected that causative variants in different genes would be uncovered. We searched for homozygous or compound heterozygous variants with a minor allele frequency (MAF) lower than 3% in the 1000 Genomes and Exome Variant Server (EVS) databases. As expected, we identified distinct candidate genes in several individuals, confirming the genetic heterogeneity within our cohort. Nevertheless, in three patients (1, 5 and 9), we uncovered compound heterozygous variants in SPG7 (NM_003119, NG_00808.2.1) (Table 2.1, Table S2.1, Table S2.2). The three cases carried the most frequent previously reported variant c.1529C>T (p.(Ala510Val)) on one allele.223 To assess the possibility that the p.(Ala510Val) variant arose from a single event, we looked at genotypes of known SNPs surrounding this shared variant. Single nucleotide polymorphisms were selected if they were reported in dbSNP138, had a mapping quality greater than 50 and coverage higher than 10X. Carriers of this variant shared a common 1.33 Mb haplotype (rs1107678-rs7196459), suggesting that p.(Ala510Val) derives from a single ancestral event (Table S2.3). In addition, patients 1 and 5 each carried a different previously reported pathogenic missense variant on the second allele (c.2249C>T (p.(Pro750Leu)) and c.1715C>T (p.(Ala572Val)), respectively).213,215 In patient 9, we found a novel intronic variant, c.988-1G>A, located one nucleotide before exon 8. This variant was absent from the 1000 Genomes and EVS databases and present at a frequency of 1.66e-05 in the Exome Aggregation Consortium. In addition, it was found at a frequency of 2/2000 in our in-house exome database. Furthermore, it was predicted by Mutation Taster to be disease-causing due to the loss of an acceptor splice site. Sanger sequencing validated the presence of the corresponding variants in the three patients and co-segregation with disease status was confirmed in families A and B.
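The filtering logic described above (rare variants, then genes carrying either one homozygous variant or at least two heterozygous variants) can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the `Variant` record, the function names and the toy frequencies are all hypothetical, and only the 3% cutoff comes from the text. Two heterozygous variants in one gene are only putatively compound heterozygous; phase still has to be confirmed by segregation analysis, as was done here by Sanger sequencing in relatives.

```python
from collections import defaultdict
from dataclasses import dataclass

MAF_CUTOFF = 0.03  # 3% threshold stated in the text


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal variant record (not the study's data model)."""
    gene: str
    hgvs: str
    zygosity: str      # "hom" or "het"
    maf_1000g: float   # allele frequency in 1000 Genomes (0.0 if absent)
    maf_evs: float     # allele frequency in EVS (0.0 if absent)


def is_rare(v, cutoff=MAF_CUTOFF):
    """A variant is kept only if it is below the cutoff in both databases."""
    return v.maf_1000g < cutoff and v.maf_evs < cutoff


def recessive_candidates(variants, cutoff=MAF_CUTOFF):
    """Return genes compatible with a recessive model after MAF filtering.

    A gene qualifies with one rare homozygous variant, or with two or more
    rare heterozygous variants (putative compound heterozygosity).
    """
    by_gene = defaultdict(list)
    for v in variants:
        if is_rare(v, cutoff):
            by_gene[v.gene].append(v)

    candidates = {}
    for gene, hits in by_gene.items():
        has_hom = any(v.zygosity == "hom" for v in hits)
        n_het = sum(v.zygosity == "het" for v in hits)
        if has_hom or n_het >= 2:
            candidates[gene] = hits
    return candidates


# Toy input: gene names GENE_X / GENE_Y and all frequencies are made up.
toy = [
    Variant("SPG7", "c.1529C>T", "het", 0.004, 0.003),
    Variant("SPG7", "c.2249C>T", "het", 0.0, 0.0001),
    Variant("GENE_X", "c.100A>G", "het", 0.0, 0.0),     # single het: dropped
    Variant("GENE_Y", "c.200C>T", "hom", 0.25, 0.20),   # too common: dropped
]

result = recessive_candidates(toy)
print(sorted(result))
```

Calling `recessive_candidates(toy, cutoff=0.001)` on the same toy data returns no candidate gene, because the more frequent of the two SPG7 variants no longer passes the frequency filter.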

Identification of SPG7 variants in additional spastic ataxia patients

To identify additional patients with SPG7 variants, we screened 21 additional unrelated patients from our cohort of unresolved cases. Because of the genetic homogeneity observed in FC, we focused our analysis on SPG7 exons in which we had previously identified variants in patients. Strikingly, we uncovered rare variants in eight additional unrelated patients (38.1%, 8/21). We confirmed the presence of the identified variants in four affected relatives, which brought the total to 12 cases belonging to eight families (Table 2.1). In parallel, we also identified variants in SPG7 in subject 22 using a Sanger sequencing HSP panel. This included five missense variants that were previously reported to affect the protein function,213,215,218,223,229 and two novel variants (Table 2.1). In fact, five families (eight individuals) carried the novel splice site variant c.988-1G>A described above, making it the second most common variant in our cohort. Furthermore, patient 22 carried a novel variant (c.473_474del, p.(Leu158GlnfsTer30)), which is predicted to induce a frameshift and a premature stop codon at position 187 of SPG7. We confirmed co-segregation of all variants with the disease status in family members for whom DNA was available. Thus, we uncovered rare homozygous or compound heterozygous variants in SPG7 in 38.7% of the families screened (12/31).

cDNA sequencing

To confirm the pathogenicity of the splice site variant c.988-1G>A, we sequenced exons 7 and 8 of the cDNA of patient 19, who was homozygous for this variant. The results confirm that the substitution of G to A causes the loss of the acceptor splice site and leads to the use of two alternative cryptic acceptor splice sites within exon 8 (Fig. 2.1). This leads to frameshifts and premature stop codons that likely produce two truncated SPG7 proteins (p.Ser330ProfsTer65 and p.Ser330LeufsTer460). Thus, these results strongly suggest that the SPG7 variant c.988-1G>A is indeed pathogenic, as it affects the function of the SPG7 protein.
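The frameshift notation used above encodes the position of the premature stop codon: under the HGVS convention, the number after `fsTer` counts codons in the new reading frame starting with the first changed amino acid as 1. The short sketch below works through that arithmetic for p.(Leu158GlnfsTer30), which gives the position 187 quoted in the text. The function name and the regular expression are illustrative assumptions and cover only this simple three-letter form of the notation.

```python
import re

# Matches e.g. "p.(Leu158GlnfsTer30)" or "p.Ser330ProfsTer65".
# Illustrative pattern only; it does not cover every HGVS protein variant form.
_FRAMESHIFT = re.compile(
    r"p\.\(?([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})fsTer(\d+)\)?$"
)


def frameshift_stop_position(hgvs_p):
    """Return the residue position of the premature stop codon.

    The first changed amino acid counts as position 1 of the shifted frame,
    so the stop lies at first_changed + ter_offset - 1.
    """
    match = _FRAMESHIFT.match(hgvs_p)
    if match is None:
        raise ValueError(f"not a simple frameshift description: {hgvs_p!r}")
    first_changed = int(match.group(2))
    ter_offset = int(match.group(4))
    return first_changed + ter_offset - 1


print(frameshift_stop_position("p.(Leu158GlnfsTer30)"))  # 158 + 30 - 1 = 187
```

The same function applied to p.Ser330ProfsTer65 returns 394, that is, 330 + 65 - 1.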

Clinical features of patients with SPG7 variants

Despite the allelic heterogeneity, this FC cohort supports a more homogeneous core phenotype in which spasticity and ataxia are both constant features (Table 2.1). The fact that 100% of cases presented with ataxic features explains why SPG7 was not screened initially in these patients. This cohort confirms the variable but generally adult age of onset (mean 34.2 years, range 15-55) and the intra-familial variability.223 Urinary urgency is a very common symptom that required medical treatment in many cases (14/22; 63.6%). In contrast to other series, chronic external ophthalmoplegia was never documented, even in the older cases, though some pursuit difficulties, abnormal saccades and nystagmus (15/22; 68.2%) clearly seem to appear with age.230 Only one patient demonstrated optic atrophy, as has been reported in other cases.230,231 Though proximal lower limb weakness was a rare finding at initial evaluation (3/22; 13.6%), it clearly develops with time (mean age 54.4, 7/21; 33.3%). Ambulatory loss appears to be exceedingly rare in patients. MRI data were available for 16 patients. A variable degree of cerebellar atrophy was present in the majority of them (14/16; 87.5%) and was associated with mild supratentorial atrophy in a few cases (Table 2.1, Fig. 2.2). Intra-familial variability was also observed on MRI, as cerebellar atrophy was moderate in the more ataxic siblings but absent or milder in others with less ataxia (Table 2.1, Fig. 2.2). Follow-up MRI was available for one case presenting mild cerebellar atrophy: over a period of seven years (age 51 to 58), subject 13 did not show marked progression of cerebellar atrophy despite clinical progression of his ataxia without loss of independent ambulation (Fig. 2.2). However, more serial MRI data would be necessary to establish a correlation between the increasing ataxia and the cerebellar atrophy. The evolution of the gait difficulty in patients is clearly due to the progression of both the ataxia and the spasticity. Most patients were followed in a rehabilitation clinic on a yearly basis. Lioresal, 10 mg to 90 mg per day in divided doses, was given to many of the more spastic patients. In a few rare cases, Botox injections were given in the lower extremities. Urinary urgency was medically treated in the more than 50% of cases who presented this symptom.

Discussion

We report the identification of causative variants in SPG7 in 22 FC patients from 12 families affected with autosomal recessive spastic ataxia. This is only the second report of SPG7 mutations in FC patients,226 and the first report of a large cohort. These results strongly suggest that homozygous or compound heterozygous SPG7 variants explain a significant proportion of FC spastic ataxia cases (38.7% of families in our cohort). Sanger sequencing detected a slightly higher frequency of SPG7 cases (9/22, 40.9%) than WES (3/9, 33.3%). This may be due to a larger phenotypic heterogeneity in the patients sent for WES compared to the ones screened by Sanger sequencing. SPG7 should be considered in spastic ataxia patients lacking a genetic diagnosis.

The SPG7 c.1529C>T (p.(Ala510Val)) variant was present in six families (12 patients), including one homozygote, making it the most frequent variant identified in our cohort. It was also found to be the most common SPG7 variant in several other populations.217,223,224,232 While it was first thought to be a benign variant, its pathogenicity was later demonstrated223,229 and its presence in our FC spastic ataxia cohort supports this claim. In addition, this variant was present at a frequency slightly above one percent in our in-house exome database. This is significantly higher than in public databases (1000 Genomes, EVS and ExAC), but is comparable to what has been reported in the literature for British, Spanish, Italian and German populations.217,218,223,229 A threshold of one percent is frequently used for the filtering of recessive variants. However, “not so rare” variants have been identified as causing recessive diseases and this should be taken into account when searching for disease variants altering protein function. Stringent filtering parameters might prevent the uncovering of disease variants, as would have been the case in our SPG7 cohort.

In addition to this more common variant, we identified two previously unreported SPG7 variants, including a novel splice site variant present in five families (eight patients), making it the second most frequent variant in our cohort. Despite the two more common variants in our cohort, the identification of seven distinct SPG7 variants supports the growing documentation that there is allelic heterogeneity in the FC founder population, even for rare diseases.21 Additional patients are currently being investigated for SPG7 mutations in a clinical setting, which may lead to the identification of more variants in the FC population. Furthermore, considering the recurrence of several variants in our cohort and the relatively high frequency of the p.(Ala510Val) variant in the FC population, our results highlight the importance of genetic counseling for SPG7 variant carriers. Variants in three of the patients were identified through WES, again illustrating the efficiency of exome sequencing for the diagnosis of inherited ataxias, as it identified the genetic cause in patients for whom SPG7 would not have been considered based solely on clinical symptoms.

This study, as well as more recent publications on SPG7 mutation carriers, supports the view that cerebellar ataxia is a very frequent feature.213-215,217,223 When present in milder adult-onset cases with spasticity, it suggests this diagnosis if not associated with cognitive deficit or peripheral neuropathy. In fact, cerebellar features, including ataxia, are more pronounced than spasticity in our cohort. This agrees with more recent literature suggesting that cerebellar ataxia can be the dominating clinical symptom in SPG7 and that SPG7 should be considered in the differential diagnosis of inherited cerebellar ataxia.

In conclusion, we show for the first time that mutations in SPG7 are an important cause of autosomal recessive spastic ataxia in the French Canadian population.

Figures and Tables

Table 2.1. Clinical features and variants identified in French Canadian SPG7 cases.

Columns: Individual ID; Family ID; Mutations; Age at onset; Age at exam; Urinary urgency; CPEO; Optic atrophy; Nystagmus; Dysarthria; Spasticity; Hyper-reflexia; Babinski; Ataxia; Dysmetria; LE proximal weakness; Age documented proximal weakness; Cerebellar atrophy on MRI.

Variants listed in the Mutations column (exon, cDNA change, protein change):
Ex 2 | c.233T>A | p.(Leu78Ter)
Ex 4 | c.473_474del | p.(Leu158GlnfsTer30) (novel)
Ex 8 | c.988-1G>A | (novel)
Ex 8 | c.1045G>A | p.(Gly349Ser)
Ex 11 | c.1529C>T | p.(Ala510Val)
Ex 13 | c.1715C>T | p.(Ala572Val)
Ex 17 | c.2249C>T | p.(Pro750Leu)

[Per-patient entries for individuals 1 to 22 are not legible in this copy of the table.]

Abbreviations: +, mild; ++, moderate; +++, severe; Ex, exon; CPEO, chronic external ophthalmoplegia; LE, lower extremities; MRI, magnetic resonance imaging; UE, upper extremities; hmz, homozygous. Exons are numbered according to reference NG_00808.2.1.



Figure 2.1. cDNA analysis of the splice site variant c.988-1G>A. A) Complementary DNA sequence chromatograms are shown for a non-affected control and patient 19 who is homozygous for the c.988-1G>A variant. The G to A substitution leads to the loss of the acceptor splice site and the use of two alternate cryptic acceptor sites located within exon 8, causing the deletion of 2 and 21 nucleotides from the beginning of exon 8, respectively, as well as a frameshift and premature stop codon. B) Schematic representation of normal splicing of SPG7 (upper panel) and aberrant splicing as seen in case 19 (middle and lower panel).


Figure 2.2. Brain MRI of FC cases with SPG7 variants. A-F) Sagittal T1-weighted (A-B, D-E) and coronal T2-weighted (C, F) images show moderate atrophy of the cerebellar vermis and hemispheres in subject 2 at age 57 (A-C) while cerebellar atrophy is milder in subject 3 at age 51 (D-F). G-H) Sagittal T1- (G) and T2- (H) weighted images show mild cerebellar atrophy on the initial MRI of subject 13 at age 51 (G) without significant progression at the follow-up MRI at age 58 (H).

Web Resources

Exome Variant Server: https://evs.gs.washington.edu/EVS/
ExAC: Exome Aggregation Consortium (ExAC), Cambridge, MA (URL: http://exac.broadinstitute.org) [March 2015]
Mutation Taster: www.mutationtaster.org

Acknowledgements

We thank the patients and their relatives who agreed to take part in this study. This project was financially supported by the Fondation Monaco. This work was selected for study by the Care4Rare (Enhanced Care for Rare Genetic Diseases in Canada) Consortium Gene Discovery Steering Committee: Kym Boycott (lead; University of Ottawa), Alex MacKenzie (co-lead; University of Ottawa), Jacek Majewski (McGill University), Michael Brudno (University of Toronto), Dennis Bulman (University of Ottawa), and David Dyment (University of Ottawa) and was funded in part by Genome Canada, the Canadian Institutes of Health Research, the Ontario Genomics Institute, Ontario Research Fund, Genome Quebec and the Children’s Hospital of Eastern Ontario Foundation. The authors wish to acknowledge the contribution of the high throughput sequencing platform of the McGill University and Génome Québec Innovation Centre, Montréal, Canada. KC and RLP received a Doctoral Award from the Fonds de recherche du Québec - Santé (FRQS). MT received a post-doctoral award from the Réseau de Médecine Génétique Appliquée and FRQS.

Supplementary Materials

Table S2.1. Exome sequencing summary data

Metric | Case 1 | Case 5 | Case 9
Total reads | 105 898 538 | 140 882 234 | 120 519 058
Total aligned reads | 101 729 359 | 139 641 721 | 119 375 959
Mean read length | 93.05 | 97.99 | 96.92
Mean coverage | 107.01X | 149.8X | 131.45X
Coverage in CCDS region, 5X | 97.70% | 97.90% | 96.90%
Coverage in CCDS region, 10X | 97.00% | 97.40% | 95.20%
Coverage in CCDS region, 20X | 94.80% | 96.20% | 91.80%
Coverage in CCDS region, 30X | 91.20% | 94.40% | 88.50%

Table S2.2. Filtering strategy employed to identify the causative variants in SPG7

Filtering step | Case 1 | Case 5 | Case 9
Total variants | 195 088 | 285 572 | 278 372
Non-synonymous/splicing/indel variants | 11 876 | 11 756 | 11 955
After excluding variants reported in 1000genomes and EVS datasets (frequency >3%) | 1 878 | 1 761 | 1 926
After excluding variants observed in our in-house database (frequency >1.5%) | 484 | 357 | 430
Genes with homozygous or compound heterozygous variants | 43 | 18 | 24
Genes with variants not homozygous or compound heterozygous in our in-house database | 22 | 12 | 14
Gene associated with an ataxia phenotype | 1 | 2* | 1

*SYNE1 and SPG7
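The filtering cascade in Table S2.2 can be illustrated with a short Python sketch. It is not the pipeline used in the study: the record fields (`gene`, `effect`, `zygosity`, `public_freq`, `inhouse_freq`), the function name and the toy variants are all hypothetical, and only the two frequency thresholds come from the table. The sketch also omits the step that removes genes already homozygous or compound heterozygous in the in-house database, and it treats two heterozygous variants in one gene as a possible compound heterozygote without checking phase.

```python
from collections import defaultdict

# Thresholds from Table S2.2; every field name below is a hypothetical choice.
MAX_PUBLIC_FREQ = 0.03    # 1000 Genomes / EVS
MAX_INHOUSE_FREQ = 0.015  # in-house exome database
PROTEIN_ALTERING = {"nonsynonymous", "splicing", "indel"}


def candidate_genes(variants, ataxia_genes):
    """Return ataxia genes carrying rare, potentially biallelic variants."""
    # Steps 1-3: keep protein-altering variants that are rare everywhere.
    rare = [
        v for v in variants
        if v["effect"] in PROTEIN_ALTERING
        and v["public_freq"] <= MAX_PUBLIC_FREQ
        and v["inhouse_freq"] <= MAX_INHOUSE_FREQ
    ]
    # Step 4: group by gene and keep genes with one homozygous variant
    # or at least two heterozygous variants (phase is not verified here).
    zygosities = defaultdict(list)
    for v in rare:
        zygosities[v["gene"]].append(v["zygosity"])
    biallelic = {
        gene for gene, zyg in zygosities.items()
        if "hom" in zyg or zyg.count("het") >= 2
    }
    # Final step: intersect with genes already linked to ataxia.
    return sorted(biallelic & set(ataxia_genes))


# Toy records, invented for illustration only.
toy_variants = [
    {"gene": "SPG7", "effect": "nonsynonymous", "zygosity": "het",
     "public_freq": 0.003, "inhouse_freq": 0.002},
    {"gene": "SPG7", "effect": "nonsynonymous", "zygosity": "het",
     "public_freq": 0.0, "inhouse_freq": 0.0},
    {"gene": "SYNE1", "effect": "nonsynonymous", "zygosity": "het",
     "public_freq": 0.001, "inhouse_freq": 0.0},   # single heterozygous hit
    {"gene": "GENE_A", "effect": "nonsynonymous", "zygosity": "hom",
     "public_freq": 0.20, "inhouse_freq": 0.10},   # too common
    {"gene": "GENE_B", "effect": "synonymous", "zygosity": "hom",
     "public_freq": 0.0, "inhouse_freq": 0.0},     # not protein-altering
    {"gene": "GENE_C", "effect": "indel", "zygosity": "hom",
     "public_freq": 0.0, "inhouse_freq": 0.0},     # rare, but not an ataxia gene
]

print(candidate_genes(toy_variants, {"SPG7", "SYNE1"}))
```

Keeping each step as a separate filter mirrors the row-by-row attrition reported in the table, so the number of surviving variants or genes can be logged after every stage.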

Table S2.3. Genotypes of SNPs in patients with the p.(Ala510Val) mutation in SPG7

SNP ID    | rs11076708 | rs564705 | rs12447947 | A510V    | rs258327 | rs461115 | rs7200990 | rs3809643 | rs7196459
Position* | 88808141   | 88946955 | 89199651   | 89613145 | 89629218 | 89703519 | 89836507  | 90124273  | 90141477
Case 1    | C          | A        | G          | C        | G        | T        | T         | T         | G
          | C          | G        | A          | T        | G        | T        | C         | T         | G
Case 5    | C          | G        | A          | C        | G        | T        | C         | G         | G
          | C          | G        | A          | T        | G        | T        | C         | T         | G
Case 9    | T          | G        | A          | C        | G        | T        | T         | T         | G
          | T          | G        | A          | T        | G        | T        | C         | T         | G

*Reference sequence hg19

Part B: Autosomal recessive cerebellar ataxia caused by a homozygous mutation in PMPCA

Karine Choquet1,2, Olga Zurita-Rendón1, Roberta La Piana1, Sharon Yang1, Marie-Josée Dicaire1, Care4Rare Consortium, Kym M. Boycott3, Jacek Majewski2,4, Eric A. Shoubridge1,2, Bernard Brais1,2, Martine Tétreault2,4

Affiliations: 1) Montreal Neurological Institute, McGill University, Montreal, Québec, Canada 2) Department of Human Genetics, McGill University, Montreal, Québec, Canada 3) Children’s Hospital of Eastern Ontario Research Institute, University of Ottawa, Ottawa, Ontario, Canada 4) McGill University and Genome Quebec Innovation Center, Montreal, Québec, Canada

Running title: New PMPCA mutation in cerebellar ataxia

Keywords: Cerebellar ataxia, exome sequencing, PMPCA

Published in Brain. 2016 Mar;139 (Pt 3):e19. doi: 10.1093/brain/awv362. Epub 2015 Dec 10. (Oxford University Press) Reproduced with permission

Letter to the Editor

Dear Editor,

Recently, Jobling et al. (2015) reported in Brain the identification of mutations in PMPCA in 17 patients from four families affected with autosomal recessive non-progressive cerebellar ataxia. A homozygous missense mutation in PMPCA (c.1129G>A, p.Ala377Thr) was uncovered in three families of Christian Lebanese Maronite origin, while compound heterozygous mutations (c.287C>T, p.Ser96Leu; c.1543G>A, p.Gly515Arg) were identified in a French patient.1 PMPCA encodes the alpha subunit of the mitochondrial processing peptidase (MPP), which cleaves the targeting peptide of nuclear-encoded mitochondrial precursor proteins upon their import into mitochondria.233 Jobling et al. presented compelling evidence for the causality of PMPCA variants. First, they demonstrated co-segregation of the variants with the disease among the four families, including a large consanguineous family of 32 individuals. Moreover, they observed decreased levels of MPPα in lymphoblasts from two affected members compared to heterozygote carriers and healthy controls. Interestingly, they also showed impaired processing of frataxin (FXN), a mitochondrial protein that is a known substrate of MPP.234,235 The paper by Jobling et al. is the first to associate defects in PMPCA with a human disease, which suggests a new mechanism for the pathogenesis of non-progressive cerebellar ataxias.

We wish to complement this study with our own identification of two brothers affected by a juvenile-onset recessive cerebellar ataxia caused by a homozygous mutation in PMPCA. These two patients were born of French Canadian parents who are distantly related (Fig. 2.3A). Considerable clinical variability was present between the two affected cases. Patient II.2 started developing impaired gait, dysarthria, dysmetria and mild distal atrophy during adolescence. He was diagnosed with a slowly progressive spinocerebellar ataxia. He did not have intellectual deficiency but had learning difficulties in school. Patient II.3 had an earlier onset at approximately five years of age and a more severe form of the disease than his older brother. He presented for consultation at the age of 15 with ataxia, hemidystonia, dysarthria and sensorineural hearing loss. The MRI exams of both siblings (Fig. 2.3D-G) documented the presence of global cerebellar atrophy, involving both the vermis and the cerebellar hemispheres. In addition, MRI of one of the two brothers (II.2) demonstrated a bilateral and symmetric T2 hyperintense signal at the level of the deep cerebellar white matter. No structural or signal abnormalities were noted in the supratentorial structures.

To uncover the genetic cause of disease in this family, we performed whole-exome sequencing on individual II.3 (SureSelect All Exon v.5 and Illumina HiSeq 2000 with 100-bp paired-end sequencing reads). We filtered for non-synonymous, splicing, and coding indel variants that were present at a minor allele frequency lower than 5% in 1000 Genomes, EVS and ExAC or present in fewer than 30 in-house exomes. Using this filtering strategy, we identified nine rare homozygous variants. Based on gene function, the homozygous missense variant c.766G>A (p.Val256Met) in PMPCA (NM_015160) stood out as a good candidate. This variant is predicted to be pathogenic and is present at a conserved position (Fig. 2.3C). In addition, it is located in a large homozygous region of 6.5 Mb, and Sanger sequencing showed that it co-segregates with disease status in the family (Fig. 2.3B).
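As a quick consistency check on the variant nomenclature used in this letter, the coding-sequence position of a substitution can be mapped to its codon with simple integer arithmetic. The helper below is an illustrative sketch only (the function name is hypothetical and it assumes 1-based coding positions counted from the A of the start codon, as in standard c. notation); it is not part of the analysis pipeline.

```python
def codon_of(cds_pos):
    """Map a 1-based coding-sequence position to
    (codon number, 1-based position within that codon)."""
    if cds_pos < 1:
        raise ValueError("coding positions are 1-based")
    return (cds_pos - 1) // 3 + 1, (cds_pos - 1) % 3 + 1


# PMPCA substitutions quoted in the text: (c. position, reported codon).
reported = {
    766: 256,    # c.766G>A,  p.Val256Met (this report)
    1129: 377,   # c.1129G>A, p.Ala377Thr (Jobling et al.)
    287: 96,     # c.287C>T,  p.Ser96Leu  (Jobling et al.)
    1543: 515,   # c.1543G>A, p.Gly515Arg (Jobling et al.)
}

for cds_pos, codon in reported.items():
    number, offset = codon_of(cds_pos)
    print(f"c.{cds_pos}: codon {number}, base {offset} of 3, "
          f"matches reported codon: {number == codon}")
```

For c.766G>A the substitution falls on the first base of codon 256, which is compatible with a valine codon (GTN) changing to the methionine codon ATG.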

To further link this variant to the ataxia in this family, we assessed the functional impact of the p.Val256Met mutation on MPP function. We performed immunoblots on protein extracts from immortalized lymphoblasts of the two patients, their parents, a sibling and two unrelated healthy controls. Steady-state levels of MPPα were comparable amongst patients, carriers and controls. Using a commercial antibody (14147-1-AP, Proteintech) to detect the steady-state levels of both human frataxin isoforms (FXN81-210, FXN42-210), we showed that all family members accumulate FXN42-210 compared to controls. Moreover, both patients presented a slight increase in FXN42-210 compared to heterozygous carriers (Fig. 2.4). These results were further validated using a second antibody (PAC2518, Dr. Isaya; Fig. S2.1A). In addition, we did not observe a specific pattern of accumulation in the steady-state levels of FXN1-210 or FXN81-210 (Fig. 2.4, Fig. S2.1A). We also evaluated the processing profile of malate dehydrogenase (MDH2), an established substrate of MPP,236 which remained unaffected in all samples. To further investigate the effect of MPPα depletion, we used a siRNA construct to knock down MPPα for 2, 3 and 4 days in a control fibroblast cell line. Western blot analysis of these cell lines showed that the complete loss of MPPα at 4 days resulted in a significant accumulation of FXN42-210 and depletion of FXN81-210 (Fig. S2.1B).

As mentioned above, recessive mutations in PMPCA have very recently been associated with non-progressive cerebellar ataxia.1 Considering the clinical and MRI similarities between our patients and the ones reported in Jobling et al., this homozygous variant in PMPCA appears to be the cause of cerebellar ataxia in this family. Although inter- and intra-familial variability is observed, there is a clear clinical overlap between the previously reported patients and ours. Gait ataxia, dysarthria and dysmetria are observed in all PMPCA patients. However, intellectual disability was not observed in our patients although it is a frequent phenotype in the previously reported patients. In contrast to the majority of patients described by Jobling et al., our patients seem to have a milder course of disease, resembling patient F4-II.1 in their study.1 Importantly, the cerebellar ataxia is slowly progressive in our patients, unlike the non-progressive course reported by Jobling et al., which extends the phenotype associated with PMPCA mutations to slowly progressive cerebellar ataxia.

The variability in phenotype could potentially be explained by a different functional impact of the mutations. The previously reported p.Ala377Thr mutation is in close proximity to the glycine-rich loop, which is the most conserved part of MPPα and is crucial for initial interaction with and binding to the substrate.237-239 Jobling et al. showed that the mutation resulted in a decreased amount of MPPα in patients and that it altered the processing of FXN.1 The other two mutations (p.Ser96Leu and p.Gly515Arg) are located in two other conserved regions of the peptidase.1 The variant p.Val256Met identified in the two affected siblings reported here also localizes to a conserved region away from the glycine-rich loop. Immunoblot analysis in our patients did not reveal decreased levels of MPPα; however, we observed a consistent increase in FXN42-210 with no obvious changes in FXN81-210. Together, these results imply that the initial cleavage event in the processing of frataxin proceeds normally until the generation of FXN42-210, which cannot be efficiently recognized for the second cleavage step, resulting in its accumulation. As suggested for the two missense variants found in patient F4-II.1,1 the p.Val256Met variant may affect protein folding or stability, given the longer side chain of methionine compared to valine. The different functional impact could potentially explain the milder phenotype observed in patient F4-II.1 and in our patients.

In this and in other studies where MPP activity is impaired,1,240 the processing profile of several mitochondrial proteins other than frataxin remained normal. Based on the evidence presented here and in Jobling et al., the kinetics of processing are likely to be substrate-specific. Thus, in the context of a mutant form of the protein, some substrates may be more easily processed than others. The challenge that remains is to identify the substrates that are poorly processed in affected cell types that are relevant to the clinical phenotype. In fact, as suggested by Horvath and Chinnery,241 the biochemical defect caused by the PMPCA mutation p.Val256Met may be mild enough that it only affects selected cell types or tissues. For instance, levels of MPPα and processing of FXN and/or other mitochondrial proteins may be more severely altered in the cerebellum. This mild impact may also be consequential only during earlier brain development, when the demand for mitochondrial processing is higher.241 Both alternatives would explain the absence of a strong biochemical phenotype in adult patient lymphoblasts.

In conclusion, we report an additional family, from an independent study, with a mutation in PMPCA, confirming the implication of this gene in cerebellar ataxias. Our report highlights key clinical and radiological features associated with PMPCA-related ataxias and also the presence of clinical heterogeneity by extending the phenotype to slowly progressive cerebellar ataxias. We believe these findings will help clinicians to make an accurate diagnosis and that PMPCA should be considered for both non-progressive and slowly progressive cerebellar ataxias.

Figures

Figure 2.3. Genetic and MRI analysis. A) Pedigree of the family. The red arrow indicates the individual sent for whole exome sequencing. B) Genomic sequence chromatograms of one affected individual, one heterozygous carrier and one unrelated control. The arrow indicates the substitution of the G for an A at position 766 of the coding sequence of PMPCA. C) Amino acid conservation for the PMPCA p.Val256Met variant. D-G) MRI findings in subjects with PMPCA mutations. Sagittal T1- (D) and T2- (E) weighted images of subjects II.3 and II.2 respectively showing the presence of cerebellar atrophy and cerebellar white matter abnormalities (E). Axial T2-weighted images showing no supratentorial abnormalities (F) and hyperintense signal at the level of the peridentate cerebellar white matter (G) in subject II.2.


Figure 2.4. Functional analysis of MPP in patient lymphoblasts. SDS-PAGE analysis of whole cell protein extracts in lymphoblast from controls (Control-1 and Control-2), carriers (I-1, I-2 and II-1) and patients (II-2 and II-3). Immunoblots show the protein steady state levels of the MPP alpha (MPPα) and beta (MPPβ) subunits, frataxin isoform-1 (FXN42-210) and frataxin isoform-2 (FXN81-210), Malate dehydrogenase (MDH2) and Actin (loading control).


Figure S2.1. SDS-PAGE analysis of whole cell protein extracts in (A) Lymphoblast from controls (Control-1 and Control-2), carriers (I-1, I-2 and II-1) and patients (II-2 and II-3). Immunodetection of frataxin was done using the antibody PAC2518 (B) Fibroblasts from control (Control-3) and MPPα knock down for 2, 3 and 4 days (MPPα-KD-2D,3D,4D). Immunodetection of frataxin was done using the antibody 14147-1-AP. The outer mitochondrial membrane protein, porin, was used as loading control in (A) and (B).

Acknowledgements

We thank the patients and their relatives who agreed to take part in this study. This work was supported by the Fondation Groupe Monaco and the Care4Rare Canada Consortium, funded in part by Genome Canada, the Canadian Institutes of Health Research (CIHR), the Ontario Genomics Institute, Ontario Research Fund, Genome Quebec and the Children’s Hospital of Eastern Ontario Foundation. We would also like to thank Dr. Grazia Isaya for kindly providing us with the FXN antibody PAC2518. KC and RLP received a Doctoral Award from the Fonds de recherche du Québec - Santé (FRQS). MT received a post-doctoral award from the Réseau de Médecine Génétique Appliquée, FRQS and CIHR.

PREFACE TO CHAPTER 3

In Chapter 2, I described the identification of disease-causing mutations in two different genes in French Canadian patients affected with ARCA. In both projects, the genetic evidence showing that the identified genes cause ARCA is highly compelling. However, as is the case for many inherited neurological diseases, there is no direct link between the function of these genes and the clinical symptoms. In fact, both SPG7 and PMPCA are ubiquitously expressed genes and their encoded proteins play important roles in the mitochondria in all cell types. Why they lead specifically to ataxia remains enigmatic, which is a major barrier to the development of therapies. Pol III-related leukodystrophy (POLR3-HLD) is another ataxia-related disorder, caused by mutations in the ubiquitously expressed genes POLR3A, POLR3B and POLR1C, and for which the link between the mutated genes and the phenotype remains enigmatic. The remainder of my thesis is focused on trying to gain a better understanding of the mechanisms underlying this disorder. Since animal models can be powerful tools to study disease pathogenesis, we developed a Polr3a knock-in (KI) mouse model in which we could have easy access to the relevant CNS tissue. Our ultimate goal was to help shed light on the link between impaired Pol III function and abnormal myelination as well as cerebellar function. Chapter 3 describes the phenotypic, histological and molecular characterization of this Polr3a KI mouse generated as a potential model of POLR3-HLD.


CHAPTER 3: Absence of neurological abnormalities in mice homozygous for the Polr3a G672E hypomyelinating leukodystrophy mutation

Karine Choquet1,2,3, Sharon Yang1, Robyn D. Moir4, Diane Forget5, Roxanne Larivière1, Annie Bouchard5, Christian Poitras5, Nicolas Sgarioto1, Marie-Josée Dicaire1, Forough Noohi1,2, Timothy E. Kennedy1, Joseph Rochford6, Geneviève Bernard7,8,9, Martin Teichmann10, Benoit Coulombe5,11, Ian M. Willis4, Claudia L. Kleinman2,3, Bernard Brais1,2

Affiliations: 1) Montreal Neurological Institute, McGill University, Montréal, Québec, Canada. 2) Department of Human Genetics, McGill University, Montréal, Québec, Canada. 3) Lady Davis Institute for Medical Research, Jewish General Hospital, Montréal, Québec, Canada. 4) Department of Biochemistry, Albert Einstein College of Medicine, Bronx, New York, USA. 5) Translational Proteomics Laboratory, Institut de recherches cliniques de Montréal (IRCM), Montréal, Québec, Canada. 6) Douglas Institute Research Center, Montréal, Québec, Canada. 7) Departments of Neurology and Neurosurgery, and Pediatrics, McGill University, Montreal, Canada. 8) Department of Medical Genetics, Montreal Children’s Hospital, McGill University Health Center, Montreal, Canada. 9) Child Health and Human Development Program, Research Institute of the McGill University Health Center, Montreal, Canada. 10) INSERM U1212 – CNRS UMR5320, Université de Bordeaux, Bordeaux, France. 11) Département de biochimie et médecine moléculaire, Université de Montréal, Montréal, Québec, Canada.

Published in Mol Brain. 2017 Apr 13;10(1):13. doi: 10.1186/s13041-017-0294-y. Reproduced under the Creative Commons Attribution license

Abstract

Recessive mutations in the ubiquitously expressed POLR3A gene cause one of the most frequent forms of childhood-onset hypomyelinating leukodystrophy (HLD): POLR3-HLD. POLR3A encodes the largest subunit of RNA Polymerase III (Pol III), which is responsible for the transcription of transfer RNAs (tRNAs) and a large array of other small non-coding RNAs. In order to study the central nervous system pathophysiology of the disease, we introduced the French Canadian founder Polr3a mutation c.2015G>A (p.G672E) in mice, generating homozygous knock-in (KI/KI) as well as compound heterozygous mice for one Polr3a KI and one null allele (KI/KO). Both KI/KI and KI/KO mice are viable and are able to reproduce. To establish if they manifest a motor phenotype, WT, KI/KI and KI/KO mice were submitted to a battery of behavioral tests over one year. The KI/KI and KI/KO mice have overall normal balance, muscle strength and general locomotion. Cerebral and cerebellar Luxol Fast Blue staining and measurement of levels of myelin proteins showed no significant differences between the three groups, suggesting that myelination is not overtly impaired in Polr3a KI/KI and KI/KO mice. Finally, expression levels of several Pol III transcripts in the brain showed no statistically significant differences. We conclude that the first transgenic mice with a leukodystrophy-causing Polr3a mutation do not recapitulate the childhood-onset HLD observed in the majority of human patients with POLR3A mutations, and provide essential information to guide selection of Polr3a mutations for developing future mouse models of the disease.

Keywords: Leukodystrophy, POLR3A, mouse model, hypomyelination, RNA Polymerase III, transfer RNAs

Introduction

Hypomyelinating leukodystrophies are a heterogeneous group of neurodegenerative diseases characterized by impaired cerebral myelin formation. POLR3-related hypomyelinating leukodystrophy (POLR3-HLD), also called 4H leukodystrophy, is caused by recessive mutations in POLR3A, POLR3B or POLR1C.39,81,83,90 Patients usually present in early childhood or adolescence with motor regression, cerebellar features and/or cognitive dysfunction.39,89 In many cases, they also display hypogonadotropic hypogonadism and/or hypodontia.39,89 Diffuse hypomyelination with relative preservation (T2 hypointensity) of myelination of the dentate nuclei, anterolateral nuclei of the thalami, globi pallidi and optic radiations, as well as a thin corpus callosum and cerebellar atrophy are observed on magnetic resonance imaging (MRI) in the majority of POLR3-mutated patients.89,91,242,243

POLR3A, POLR3B and POLR1C encode subunits of RNA Polymerase III (Pol III), one of the three essential eukaryotic RNA polymerases. Specifically, Pol III is responsible for the synthesis of several types of non-coding RNAs (ncRNAs), including transfer RNAs (tRNAs), 5S ribosomal RNA (rRNA), U6 small nuclear RNA and BC200 RNA.244 Pol III is a large enzymatic complex composed of 17 subunits. POLR3A and POLR3B, the two largest subunits, form the catalytic center of the enzyme.

Since the initial identification of mutations in POLR3A,39 more than 100 mutations in POLR3A, POLR3B and POLR1C have been identified in over 130 patients with POLR3-HLD.39,81-91,245-248 The majority of mutations are private or present in only a handful of patients.89 While most international POLR3-HLD patients are compound heterozygotes, the majority of French Canadian cases are homozygous for the c.2015G>A (p.G672E) mutation in POLR3A, suggesting a founder effect in this population.39,89 In addition to this genetic heterogeneity, POLR3-HLD is characterized by important inter- and rarely intra-familial clinical variability, both in symptoms and severity, and its phenotypic spectrum continues to expand.91,93,94 Notably, two recent studies described patients with cerebellar atrophy only91 or with involvement of the striatum and red nuclei but normally myelinated white matter,93 suggesting that diffuse hypomyelination is not an obligate feature of the disorder.91

Despite the major advances in the clinical and genetic characterization of POLR3-HLD, the molecular basis of its pathophysiology remains poorly understood. Mutations are located throughout the three genes and are likely to impact different functional aspects of Pol III, which would in all cases lead to enzyme hypofunction and decreased expression of ncRNAs synthesized by Pol III.39,81,90,105 Indeed, a recent study using FLAG-tagged POLR1C mutants transfected in HeLa cells demonstrated that two POLR1C missense mutations cause impaired assembly of the Pol III complex, accumulation of the mutated subunits in the cytoplasm and reduced Pol III occupancy at its target promoters, suggesting decreased transcription of the corresponding genes.90 In addition, overexpression of missense alleles of Rpc1, the yeast ortholog of POLR3A, in S. pombe, led to reduced precursor tRNA levels, a proxy for transcription, and changes in tRNA modification and translation fidelity.199 A key question is how mutations in such an essential and ubiquitously expressed enzymatic complex lead to a central nervous system (CNS)-specific disease. Most Pol III transcripts are ubiquitously expressed and several of them are at their highest expression level in the CNS.117,140 Moreover, POLR3-HLD belongs to a growing number of neurological diseases, including several leukodystrophies, caused by mutations in genes that are also related to tRNA biology,106,118-120,124,125,129 suggesting that impaired tRNA biogenesis could be particularly detrimental to the CNS.

In this study, we generated and characterized a knock-in (KI) mouse model carrying the common French Canadian Polr3a c.2015G>A (p.G672E) mutation in order to determine if it recapitulates POLR3-HLD features. Herein, we describe the results from a yearlong study of motor function in this first transgenic exploratory model of POLR3-HLD, as well as its molecular and histological characterization.

Results

Generation of Polr3a KI/KI and KI/KO mouse models

To obtain a relevant model of POLR3-HLD, we generated a KI mouse carrying the c.2015G>A (p.G672E) mutation in Polr3a, a mutation chosen based on its frequency in French Canadian cases and on the report of several human homozygous cases.39,89 We obtained viable KI/KI mice and confirmed the expression of the homozygous c.2015G>A (p.G672E) mutation in these animals by Sanger sequencing of tail genomic DNA as well as brain cDNA (Fig. S3.1A). We also generated a compound heterozygous Polr3a mouse line carrying one KI allele and one null allele (KI/KO). Heterozygous Polr3a knockout (KO) mice were produced by insertion of a gene trap cassette in intron 21. The portion of intron 21 upstream of the cassette is retained in the mRNA of these mice, leading to a frameshift and premature stop codon (p.E968VfsX12) (Fig. S3.2). As expected, homozygous Polr3a KO mice are embryonic lethal (Fig. S3.1B). KI/KI mice were bred with heterozygous Polr3a KO mice to create the KI/KO mouse line. Both KI/KI and KI/KO mice reproduce normally and do not display a grossly abnormal phenotype at 12 months of age. At the protein level, full-length POLR3A levels were comparable in the cerebrum of one-year-old KI/KI, KI/KO and WT mice (Fig. S3.1D). The KO allele is predicted to cause a frameshift leading to premature termination at amino acid 980 and resulting in a protein of approximately 100 kDa. We did not observe a band of that size accumulating in KI/KO mice (Fig. S3.1D), implying that the KO mRNA and/or protein is rapidly degraded. In addition, the normal levels of full-length protein in KI/KI and KI/KO mice indicate that the G672E mutation does not impair stability of the POLR3A protein.

Characterization of motor function over one year

Individuals with POLR3A mutations, including those homozygous for the c.2015G>A (p.G672E) mutation, manifest cerebellar and upper motor neuron signs leading to impaired gait, coordination and balance as well as cognitive dysfunction.39 We thus performed balance beam, rotarod, open field and inverted grid tests to assess balance, coordination, general locomotion, and muscle strength (Fig. 3.1). Since the body weights of the mice were variable, especially at later time points (Fig. S3.3), we used one-way analysis of covariance (ANCOVA) with weight as the covariate to compare behavioral measures between genotypes (Fig. 3.1). At 40 and 90 days old, there were no significant differences between the three groups, implying that Polr3a KI/KI and KI/KO mice do not develop an early-onset motor phenotype. While some differences were observed on the beam test at 270 and 365 days old (Fig. S3.4), those were largely attributable to weight and did not remain after adjustment of the data for this variable (Fig. 3.1). To complement the beam test, we performed gait analysis (Fig. 3.1J-K). Both KI/KI and KI/KO mice displayed a small but statistically significant reduction in the width between their hind paws compared to WT mice (p-value < 0.01) at 270 days old. The test was repeated at 365 days old and showed the same trend but the difference between groups was not statistically significant (Fig. 3.1J). This may reflect a very mild phenotype that would require testing of older mice for confirmation. In summary, the extensive panel of tests performed strongly suggests that Polr3a KI/KI and KI/KO mice do not display motor dysfunction at one year of age.
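The weight-adjusted comparison can be made concrete with a minimal one-way ANCOVA sketch in plain Python. It is illustrative only: the text does not state which software was used, the function names are hypothetical, and the toy weights and scores are invented rather than taken from the behavioral data. The sketch assumes a single covariate with a common slope across genotypes and returns the F statistic for the genotype effect with its degrees of freedom; a p-value would then be read from the F distribution, which is not computed here.

```python
def _sums(pairs):
    """Centered sums of squares and cross-products for (covariate, response) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxx, sxy, syy


def ancova_f(groups):
    """Return (F, df1, df2) for the group effect adjusted for one covariate.

    groups maps a genotype label to a list of (weight, score) pairs.
    """
    everyone = [pair for pairs in groups.values() for pair in pairs]
    k, n = len(groups), len(everyone)

    # Full model: one intercept per genotype, common slope on weight.
    wxx = wxy = wyy = 0.0
    for pairs in groups.values():
        sxx, sxy, syy = _sums(pairs)
        wxx, wxy, wyy = wxx + sxx, wxy + sxy, wyy + syy
    sse_full = wyy - wxy ** 2 / wxx

    # Reduced model: a single regression line on weight, ignoring genotype.
    txx, txy, tyy = _sums(everyone)
    sse_reduced = tyy - txy ** 2 / txx

    df1, df2 = k - 1, n - k - 1
    f_stat = ((sse_reduced - sse_full) / df1) / (sse_full / df2)
    return f_stat, df1, df2


# Toy data (weight in g, arbitrary score): scores track weight, not genotype.
toy = {
    "WT":    [(20, 10.1), (22, 10.9), (24, 11.9), (26, 13.1)],
    "KI/KI": [(21, 10.4), (23, 11.6), (25, 12.6), (27, 13.4)],
    "KI/KO": [(20, 10.1), (23, 11.4), (26, 13.1), (29, 14.4)],
}
# Same data with an artificial genotype effect added to one group.
shifted = dict(toy)
shifted["KI/KO"] = [(w, s + 5.0) for w, s in toy["KI/KO"]]

print(ancova_f(toy))      # small F: differences are explained by weight
print(ancova_f(shifted))  # large F: a genotype effect remains after adjustment
```

In the first toy set the raw group means differ, but the difference disappears once weight is accounted for, which is the situation described for the beam test at 270 and 365 days.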

Analysis of myelination and cerebellar integrity

Hypomyelination is the main pathological feature of POLR3-HLD.89,96 Thus, to assess whether Polr3a KI/KI and KI/KO mice display hypomyelination, we stained coronal brain sections from 90- and 365-day-old mice with Luxol Fast Blue (LFB), which is commonly used to detect myelin in the CNS. We observed normal and complete myelination in the brain and cerebellum of KI/KI and KI/KO mice, where the staining was indistinguishable from age-matched WT mice (Fig. 3.2A, 3.2B and Fig. S3.5). In addition, we measured the levels of the major protein components of myelin by western blot in the cerebellum of 90-day-old mice. Protein levels of Myelin Basic Protein (MBP), Proteolipid Protein (PLP), Myelin-associated Glycoprotein (MAG) and 2',3'-Cyclic Nucleotide 3' Phosphodiesterase (CNP) were comparable between WT, KI/KI and KI/KO mice (Fig. 3.2C). These results suggest that Polr3a KI/KI and KI/KO mice undergo normal gross myelination and do not experience major demyelination at one year of age. Since cerebellar atrophy and Purkinje cell loss are major features of POLR3-HLD,89,96 we then evaluated cerebellar morphology using Nissl staining followed by Purkinje cell counts in 365-day-old mice. Cerebellar morphology was overall normal (Fig. 3.3A) as were Purkinje cell numbers (Fig. 3.3B), implying that KI/KI and KI/KO mice do not present cerebellar atrophy.

Evaluation of Pol III transcription levels

Despite the lack of severe abnormalities at the phenotypic and histological levels, the homozygous c.2015G>A (p.G672E) substitution in Polr3a may alter Pol III function. Because of their short half-lives, precursor tRNA levels provide a reliable estimate of Pol III transcription.194,249,250 To evaluate the impact of the Polr3a G672E mutation on Pol III transcription, we measured the levels of one precursor tRNA and two mature tRNAs in the cerebrum and liver of 90-day-old and one-year-old WT, KI/KI and KI/KO mice (Fig. 3.4A and S3.6). While there were no statistically significant differences in tRNA levels among the three groups, there was a trend towards a small decrease of pre-tRNAIle(TAT) in one-year-old KI/KO mice (Fig. 3.4A). We then reasoned that brain-specific transcripts, such as Bc1 RNA and n-Tr20 tRNA,251 might be more sensitive to Pol III mutations. We first confirmed the brain-specific expression of both transcripts (Fig. S3.6). We then measured the levels of Bc1 RNA, precursor n-Tr20 as well as mature n-Tr20 in the cerebrum of WT, KI/KI and KI/KO mice, but we did not detect differences between groups (Fig. 3.4 and S3.6). Therefore, our results suggest that the Polr3a G672E mutation does not significantly alter Pol III transcript levels, although it may have a minor effect on the transcription of tRNA genes in whole cerebrum of one-year-old Polr3a KI/KO hypomorphic mice.

Impact of the POLR3A G672E mutation in human cells

Because of the absence of dysfunction resulting from the c.2015G>A (p.G672E) mutation in mouse, we sought to evaluate its impact on Pol III function in human cells. We stably expressed FLAG-tagged versions of WT and mutant (G672E) POLR3A in HeLa cells. We first examined the impact of the G672E mutation on POLR3A cellular localization by performing anti-FLAG immunofluorescence. As expected, POLR3A-WT showed a predominant nuclear localization. Similarly, the majority of POLR3A-G672E was also in the nucleus, albeit with slightly more of the protein in the cytoplasm compared to WT. This suggests that the mutant Pol III complex is generally correctly assembled and imported into the nucleus (Fig. 3.5A). To further confirm this, we performed anti-FLAG affinity purification on cell extracts from cell lines expressing FLAG-tagged POLR3A-WT and POLR3A-G672E and analyzed the purified proteins using shotgun proteomics. The mutant POLR3A-G672E subunit was able to pull down all detectable Pol III subunits with levels that did not significantly differ from the WT subunit, indicating that the mutation does not globally impair Pol III complex assembly (Fig. 3.5B, Table S3.1). Finally, we performed chromatin immunoprecipitation followed by quantitative PCR (ChIP-qPCR) to evaluate Pol III occupancy at two target loci after transient transfection of POLR3A-WT or POLR3A-G672E in HEK293 cells. This showed a mild reduction in Pol III occupancy for POLR3A-G672E compared to POLR3A-WT, but the difference was not statistically significant (Fig. 3.5C). These results suggest that the impact of the POLR3A c.2015G>A (p.G672E) mutation on Pol III function is also mild in human cultured cells.

Discussion

In this study, we describe the first transgenic mice with bi-allelic mutations in Polr3a, encoding the largest subunit of Pol III. We report that these mice do not display gross hypomyelination or cerebellar atrophy. In fact, our data show apparently normal myelin staining and myelin protein levels in KI/KI and KI/KO mice at 90 days of age, thereby excluding the presence of hypomyelination in these mice. Furthermore, we did not find evidence of demyelination, gross cerebellar atrophy or Purkinje cell loss in the brains of mice at one year of age. This is consistent with the absence of statistically significant motor dysfunction in one-year-old Polr3a KI/KI and KI/KO mice. Altogether, our results are in stark contrast with the observations in the majority of human patients affected with POLR3-HLD described to date, who manifest diffuse hypomyelination and cerebellar atrophy on MRI, childhood-onset ataxia often leading to loss of gait and speech, and death in adolescence or early adulthood.89

This first report of a missense mutation in a subunit of Pol III in a vertebrate model organism thus demonstrates that bi-allelic mutations in Polr3a do not necessarily lead to leukodystrophy and/or cerebellar dysfunction in mice, and that Pol III vulnerability to mutations may vary between species. Indeed, a previous report showed that a splice site substitution in zebrafish polr3b, leading to an in-frame deletion of 41 amino acids, resulted in impaired intestinal and exocrine pancreas development in the larvae, with no CNS or myelination defects reported.252 Importantly, instead of the expected 50% decrease in POLR3A protein level in KI/KO mice, we observed normal levels of the full-length protein. This may be due to a compensation mechanism that allows overcoming the loss of one allele by maintaining normal levels of full-length protein, but further experiments are warranted to establish whether this is true. Of note, in one deceased POLR3-HLD patient carrying a heterozygous nonsense mutation in POLR3A, there were only 26.8% and 6.8% decreases in POLR3A protein levels in the white matter and the cortex, respectively, compared to a healthy control.39 This is in agreement with our observation that a heterozygous premature stop codon in POLR3A does not necessarily lead to a 50% loss of full-length protein. Furthermore, this may account for the fact that the KI/KO mice are not more severely affected than KI/KI mice, but does not explain the lack of myelin-related phenotype in G672E mutant mice. At the molecular level, we show that the levels of five Pol III transcripts are largely unaffected in the cerebrum and liver of Polr3a KI/KI and KI/KO mice, although there was a trend towards a small decrease in precursor tRNA-Ile levels in one-year-old KI/KO mice. While this is consistent with the absence of a clinical or histological phenotype, it surprisingly implies that certain Pol III mutants can function well enough to maintain overall normal levels of Pol III transcripts and general homeostasis in mice. Of note, we cannot exclude effects on the expression of other Pol III targets, especially since Pol III-transcribed genes vary in their promoter structure and associated transcription machinery.253 Recently, the Pol III transcriptome was investigated in the blood of patients with a homozygous POLR3A splice site mutation. This mutation produces an aberrant POLR3A mRNA, which is reduced in abundance by 37% relative to the wild-type mRNA.93 In addition to the technical limitations of assessing tRNA levels in a heterogeneous cell population such as blood, the results suggest that there is only a modest defect in Pol III function in these patients, with 7/46 tRNA isoacceptors showing statistically significant changes. Furthermore, the study reports an increase in the levels of 5S rRNA, RMRP and RPPH1 in patients,93 which is difficult to reconcile with a reduced level of functional POLR3A protein. Investigation of Pol III transcript levels in skin fibroblasts of a classical POLR3-HLD case did not uncover differences in 7SL RNA levels between the patient and a control,245 but this could be due to the fact that fibroblasts, just as blood, are not affected in the disease. Levels of Pol III transcripts were not analyzed in the brain of two deceased POLR3-HLD patients.89,96

To better understand the phenotypic discrepancy between human POLR3-HLD cases and our mutant mice, we analyzed the impact of the POLR3A G672E mutation on Pol III function in human cells, as was previously done for leukodystrophy-causing POLR1C mutations.90 Our results suggest that the effect of the POLR3A G672E mutation is milder than that of the aforementioned POLR1C mutations. In fact, we show that the Pol III complex containing the FLAG-tagged POLR3A-G672E is properly assembled and has a predominant nuclear localization. This is in contrast to the severe complex assembly defect and cytoplasmic localization of both reported POLR1C mutants.90 Although we observed a mild reduction in Pol III occupancy on chromatin for POLR3A-G672E, this was much more pronounced for the POLR1C mutants. Some of the difference may be explained by the different techniques used (ChIP-qPCR vs. ChIP-Seq, transient vs. stable transfections).90 Thus, we cannot exclude that a more substantial defect in Pol III occupancy could be uncovered for POLR3A-G672E using genome-wide techniques. Furthermore, it is possible that the G672E mutation impairs downstream processes such as transcriptional elongation or termination, but this could not be evaluated in the transfected cells since they still express the endogenous POLR3A. The mild impact of the G672E mutation in human cells perhaps explains why this mutation is viable in the homozygous state in humans, and appears to be in agreement with the milder phenotype observed in a subset of patients with this mutation.39

The human and mouse POLR3A proteins share 97.99% sequence identity and the region surrounding the G672E mutated site (20 amino acids on each side) is perfectly conserved (multiple protein alignment by Clustal Omega254). The lack of a strong phenotype in Polr3a KI/KI and KI/KO mice could potentially be explained by the much higher proportion of white matter in the human brain (more than 50%) compared to other species (around 10% in mouse).255-257 This might make the human brain more vulnerable than the mouse brain to mutations in genes important for myelination. Furthermore, oligodendrogliogenesis is thought to occur at a different pace in humans and mice, perhaps leading to differences in susceptibility to myelin abnormalities.58 In fact, the disruption of genes causing leukodystrophies in humans does not always produce the same phenotype in mouse. Mouse models have been published for a number of leukodystrophies, but in several of them, the CNS myelin defects are milder than in humans or even completely absent. For example, null mice for Cx47 or homozygous Cx47M282T/M282T mice show only mild myelin deficits, contrary to human patients with mutations in its human ortholog, GJC2, which causes Pelizaeus-Merzbacher-like disease.257,258 Inactivation of Abcd1 in mice, associated with human adrenoleukodystrophy (ALD), leads to a late-onset neurological phenotype that resembles adrenomyeloneuropathy, with abnormal myelin and axonal loss in the spinal cord and the sciatic nerve, but these mice do not display the cerebral demyelination characteristic of cerebral ALD.259

Another possible explanation for the absence of a phenotype in mice is the existence of primate-specific Pol III transcripts. The best example is BC200 RNA, a brain-specific Pol III transcript that is thought to regulate subcellular translation in dendrites.260 BC200 is only present in primates. Although there is a functional analog (Bc1) in mouse, Bc1 and BC200 have different evolutionary origins.151 While we did not observe differences in Bc1 levels in KI/KI and KI/KO mice, we cannot exclude the possibility that BC200 may be sensitive to Pol III mutations and have a unique function in the human CNS that is not recapitulated by Bc1 in the mouse. In this scenario, the brain-specific expression of BC200 could explain the mainly CNS manifestations of POLR3-HLD. In recent years, several novel Pol III transcriptional units have been discovered in the human genome. Some of these transcripts are specifically expressed in neuronal cell lines where they have been found to regulate , proliferation, differentiation or cell cycle progression.170-173,261,262 Although functional homologs have been identified in the mouse genome for some of these transcripts, they are not orthologous to their human counterparts. Deregulation of such transcripts could be responsible for the phenotype in humans, but their absence or different evolutionary origin in rodents would not lead to the same manifestations in mice.

The phenotypic spectrum of POLR3-HLD in humans is wide, regarding severity, age of onset and nature of symptoms.89 Among individuals homozygous for the c.2015G>A (p.G672E) mutation, the disease severity is highly variable, even within families, with 2/5 cases still ambulatory in early adulthood (unpublished data). One possibility is that Polr3a KI/KI and KI/KO mice, which carry the same c.2015G>A (p.G672E) mutation, more closely resemble these milder cases. On the other hand, these patients still displayed hypomyelination on MRI,39 while our histology data show normal myelination in our mice. A recent study described patients with mutations in POLR3A or POLR3B without hypomyelination, suggesting that this feature is not obligate for POLR3-related disorders.91 However, cerebellar atrophy and corresponding clinical symptoms were evident in these individuals,91 contrary to the observations in Polr3a KI/KI and KI/KO mice. Nonetheless, it is important to note that our experiments were aimed at detecting major differences in the transgenic mice compared to WT mice. The LFB and Nissl stains and immunoblots we performed do not exclude the possibility of altered myelin ultrastructure or physiological dysfunction of Purkinje cell neurons.263,264 It remains possible that KI/KI and KI/KO mice will develop later-onset phenotypic abnormalities due to mild pathological mechanisms, but uncovering these events is beyond the scope of the present study, which focused on establishing whether KI/KI and KI/KO mice represent a good model for childhood-onset POLR3-HLD. Nevertheless, the important phenotypic heterogeneity observed in POLR3-HLD suggests the existence of additional genetic or epigenetic factors that can modify the presentation of the disease. The presence of certain genetic variants may be necessary in order to develop a severe form of the disease. If this is the case, introducing the Polr3a KI mutation in different mouse backgrounds could lead to a neurological phenotype. Mouse genetic backgrounds often influence the severity of a gene KO or KI,265,266 and comparison of the same mutation in different strains could allow identification of genetic modifiers.267 Environmental stressors could also accentuate the disease presentation. This is the case in Vanishing White Matter (VWM) disease, which is caused by mutations in genes encoding the initiation factor 2B.268 An increasing amount of evidence suggests that the expression of different pools of tRNAs is important for protein homeostasis under normal and stress conditions.110,193,269 One can imagine that POLR3-HLD cases would be more susceptible to certain stressors during CNS development because of Pol III dysfunction, and those would vary among individuals. Since laboratory mice are housed in a relatively stress-free and sterile environment, exposure to environmental stressors during early development might produce a more severe phenotype.

To our knowledge, even the most mildly affected POLR3-HLD patients manifest some aspects of the disease. Thus, the lack of a phenotype in our mutant mice may not solely be explained by the effects of genetic modifiers or environmental stressors. Perhaps a combination of these or other factors discussed herein could account for the normal CNS development of mice carrying the POLR3A c.2015G>A (p.G672E) mutation. This mutation was chosen for this mouse model based on its frequency in the French Canadian patient population and the viability of homozygous carriers.39,89 In light of our results, and considering the genetic and phenotypic heterogeneity in POLR3-HLD, it is possible that other POLR3A, POLR3B or POLR1C mutations may produce a more severe phenotype in mice. Thus, the choice of future mutations for insertion into mice should also consider the location of the mutation within important structural elements, such as the bridge helix or the trigger loop, and mutations known to have an impact on Pol III function.105 Previous studies have found that non-lethal point mutations in conserved regions of yeast Pol III subunits, including two POLR3-HLD-causing mutations, impair Pol III transcription.199,270-272 Similar studies on a range of mutations in yeast could aid in selecting mutations to introduce in mice.199 Alternatively, expression of different mutated forms of Pol III subunits in HeLa cells, as performed with G672E, could also be a helpful tool to choose appropriate mutations. For instance, mutations that cause the accumulation of POLR3A in the cytoplasm may result in a more severe phenotype in mice. However, there is a need for caution since most POLR3-HLD mutations have not been reported in the homozygous state and might be lethal.

Hypomyelination in humans may be due to minor differences in Pol III activity that do not produce the same effect in mice because of a higher tolerance to mutations affecting myelin formation. On the other hand, more severe mutations, such as the ones that are only observed as compound heterozygotes in humans or that cause accumulation of the mutated subunit in the cytoplasm, may be too severe as homozygous alleles and involve other organs or cause developmental failure or embryonic lethality. Nonetheless, both POLR1C mutations that were found to impair complex assembly and nuclear import were present in the homozygous state in patients,90 suggesting that these two features are compatible.

In conclusion, this study illustrates the challenges of developing mouse models for HLD. However, the phenotypic and genetic heterogeneity characteristic of POLR3-HLD patients raises the possibility that introducing other mutations in genes encoding Pol III subunits could lead to a more severe early onset phenotype.

Materials and Methods

Animals
All experiments were carried out according to good practice of handling laboratory animals consistent with the Canadian Council on Animal Care and approved by the University Animal Care Committee. Polr3aKI/KI mice were generated by Ozgene (Bentley, Australia) on a C57BL/6J genetic background using a conditional mutagenesis strategy modeled after the FLEX switch (see Supplementary Methods).273 To generate whole-body Polr3aKI/+ mice, Polr3aFL/+ mice were crossed with transgenic mice expressing CMV-Cre (Jackson Laboratory #006054). Polr3aKI/+ mice were bred together to obtain homozygous Polr3aKI/KI mice (KI/KI). Full-body Polr3a+/- mice were obtained from the Riken Bioresource Center (#RBRC03817, strain B6D2F1- Polr3a), where they were produced by gene trap (Fig. 3.S2). To produce Polr3aKI/- mice (KI/KO), we bred KI/KI mice with Polr3a+/- mice. The resulting KI/KO mice were bred with KI/KI mice to generate litters with the predicted 50% KI/KI and 50% KI/KO mice.
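The predicted genotype ratios for the crosses described above follow from single-locus Mendelian segregation. The short sketch below enumerates the allele combinations for two of these crosses. The function name `cross` and the allele labels are illustrative choices, not part of the original methods, and the calculation assumes equal transmission of both parental alleles.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent_a, parent_b):
    """Expected offspring genotype fractions for a single-locus cross.

    Each parent is a 2-tuple of allele labels, e.g. ("KI", "KO").
    Assumes each allele is transmitted with probability 1/2.
    """
    counts = Counter(tuple(sorted(pair)) for pair in product(parent_a, parent_b))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# KI/KO x KI/KI: the breeding scheme used to generate the experimental litters.
ki_ko_x_ki_ki = cross(("KI", "KO"), ("KI", "KI"))

# +/- intercross: Mendelian expectation before any embryonic lethality.
het_intercross = cross(("WT", "KO"), ("WT", "KO"))

for genotype, fraction in ki_ko_x_ki_ki.items():
    print("/".join(genotype), fraction)
```

For the Polr3a+/- intercross, the same enumeration gives an expected one quarter of Polr3a-/- offspring, which is the proportion against which the absence of Polr3a-/- embryos in Figure S3.1 is judged.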

Behavioral tests
Mice were tested for general locomotion, balance, coordination and strength. We generated a cohort of 15 female mice per group (WT, KI/KI, KI/KO), all born within seven days of one another, and submitted them to the following behavioral tests over one year. Phenotyping tests were performed at 40, 90 and 270 days of age. The balance beam was also repeated at 365 days of age, while the gait analysis was done at 270 and 365 days old. The balance beam, rotarod and inverted grid tests were performed as previously reported.203,274 For the open field test, locomotor activity was assessed over 90 minutes in a bank of 8 Versamax Animal Activity Monitor chambers (Accuscan Model RS2USB v4.00, Columbus, OH). Each chamber consisted of a clear acrylic open-field (40 cm L × 40 cm W × 30 cm H), divided into two equal sized chambers (20 cm L × 20 cm W × 30 cm H) by an acrylic partition and was covered by an acrylic lid with air holes. Activity was detected via a grid of infrared photo sensors spaced 2.5 cm apart and 6 cm above the floor along the perimeter of the box. All activity chambers were connected to a Versamax data analyzer (Accuscan Model VMX 1.4B, Columbus, OH), which then transmitted data to an HP Compaq Pentium 4 computer for further analysis. Locomotor activity and its distribution within the two chambers were quantified using the Versamax Software System (Version 4.00, Accuscan, Columbus, OH). The following measures were recorded for each interval of 10 minutes: total distance covered, number of movement bouts, time moving, stereotypy bouts and stereotypy time. For gait analysis, which was performed at 270 and 365 days old only, we used footprint patterns or walking tracks to analyze different parameters.275 The fore and hind paws of the mice were stained with red and blue washable color paint, respectively, and the mice were trained to walk on a paper-covered narrow runway (85 cm long, 6 cm wide with clear Plexiglas walls) until they reached a dark box at the end of the runway. If the animal stopped in the middle of the track, the test was repeated. The first and last 10 cm of the footprint were excluded. For analyses, at least four steps from each side per print were measured. Stride length, front and hind limb width and inter limb coordination were measured bilaterally. Upon completion of the last tests, 1-year-old mice were sacrificed and tissues were harvested for histology or for RNA extraction (see below).

Histology
For tissue preparation, mice were anesthetized with mouse anesthetic cocktail (ketamine (100 mg/mL), xylazine (20 mg/mL) and acepromazine (10 mg/mL)), perfused transcardially with 0.9% NaCl followed by 4% paraformaldehyde. Brains were dissected and post-fixed for 24h at 4 °C in the same fixative. For Luxol Fast Blue (LFB), tissue processing, embedding, sectioning and staining were performed at the Goodman Cancer Research Centre Histology Facility (McGill University, Montreal, Canada). Briefly, tissues were embedded in paraffin and sectioned on a microtome. Sections of 15 microns were stained with LFB according to standard procedures. Nissl stains and Purkinje cell counts were performed as previously reported.276

Western Blots
Cerebellar or cerebral hemispheres were harvested, snap-frozen in liquid nitrogen and homogenized with a Teflon potter in extraction buffer [10 mM Tris–HCl, pH 7.5, 150 mM NaCl, 1 mM 6 EDTA, 1% Triton X-100 and protease inhibitors (Roche)] followed by centrifugation at 12,000×g for 30 minutes and collection of the supernatant. Protein quantification was determined using DC Protein assay (Bio-Rad). Protein samples were separated onto a 4-12% NuPAGE Bis Tris gel (ThermoFisher) and transferred onto a nitrocellulose membrane (Bio-Rad). For assessment of myelin protein levels, immunoblots were probed with anti-MBP (Aves Labs Inc. #MBP), anti-PLP (Abcam #ab28486), anti-CNPase (Millipore, #MAB326R), anti-MAG (gift of Dr. David Colman, McGill University) and anti-tubulin (Sigma-Aldrich #T5168) primary antibodies. For measurement of POLR3A protein levels, immunoblots were probed with anti-POLR3A (Abcam #ab96328) and anti-actin (Abcam #ab3280).

RNA extraction, Northern Blots and qRT-PCR
Cerebral hemispheres and livers were harvested and snap-frozen in liquid nitrogen. Of note, one-year-old mice were fed ad libitum for their entire life, while 90-day-old mice used for Northern Blots were fasted for 16 hours and refed for 5 hours prior to sacrifice and tissue collection in order to stimulate Pol III transcription. Tissues were homogenized in Qiazol lysis reagent (Qiagen). Total RNA was extracted with the miRNeasy kit (Qiagen) and treated with DNase I (Qiagen) according to the manufacturer’s instructions. RNA quality was assessed on an Agilent 2100 Bioanalyzer and RNA Integrity Numbers (RIN) were routinely above 9. For Northern Blots, RNA samples (7.5 µg or 10 µg) were separated by denaturing polyacrylamide gel electrophoresis and transferred to Nytran Plus membranes (GE Healthcare). The resulting blots were sequentially hybridized with [32P]-end labelled probes detecting precursor tRNAIle(TAT), mature tRNALeu(AAG) and mature tRNAGlu(CTC) at 42 °C. Bc1 RNA levels were subsequently detected with a probe mapping to the 5’ portion of the RNA, as previously described.277 For n-Tr20, blots were sequentially hybridized with a probe targeting the 3’ trailer sequence of precursor n-Tr20 and a probe targeting both precursor and mature n-Tr20.251 All Pol III transcript levels were quantified and normalized to U3 snRNA levels.
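The normalization to U3 snRNA described above amounts to a per-lane ratio of band intensities. The sketch below illustrates the arithmetic. All intensity values are hypothetical, the function names are illustrative, and the second step (expressing each ratio relative to the mean WT ratio) is an assumption about how relative levels might be plotted rather than a documented step of the analysis.

```python
from statistics import mean


def normalize_to_control(transcript_signal, control_signal):
    """Divide each lane's transcript band intensity by its U3 band intensity."""
    if len(transcript_signal) != len(control_signal):
        raise ValueError("each lane needs one transcript and one control value")
    return [t / c for t, c in zip(transcript_signal, control_signal)]


def relative_to_reference(ratios, reference_ratios):
    """Express normalized ratios relative to the mean of a reference group."""
    reference_mean = mean(reference_ratios)
    return [r / reference_mean for r in ratios]


# Hypothetical band intensities (arbitrary units), one value per lane.
wt_ratios = normalize_to_control([100.0, 120.0], [50.0, 60.0])
ki_ratios = normalize_to_control([90.0], [50.0])

ki_relative = relative_to_reference(ki_ratios, wt_ratios)
print(wt_ratios, ki_ratios, ki_relative)
```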

Immunofluorescence, affinity purification and mass spectrometry
HeLa cell lines stably expressing the FLAG-tagged POLR3A subunit (WT or G672E-mutated) were generated by transfection with Lipofectamine according to the manufacturer’s instructions (ThermoFisher). Immunofluorescence was performed using an anti-FLAG antibody, as previously described.90 For affinity purification, cytoplasm and nuclei were prepared as reported before.278 Briefly, cells were lysed by mechanical homogenization in lysis buffer [10 mM Tris-HCl (pH 8), 0.34 M sucrose, 3 mM CaCl2, 2 mM MgOAc, 0.1 mM EDTA, 1 mM DTT, 0.5% Nonidet P-40 and protease inhibitors]. Whole cell extracts were centrifuged at 3,500×g for 15 minutes and the supernatant, which represents the cytoplasmic fraction, was saved. The pellet containing the nuclei was resuspended, lysed by mechanical homogenization in lysis buffer [20 mM HEPES (pH 7.9), 1.5 mM MgCl2, 150 mM KOAc, 3 mM EDTA, 10% glycerol, 1 mM DTT, 0.1% Nonidet P-40 and protease inhibitors] and centrifuged at 15,000×g for 30 minutes. The supernatant, which corresponds to the nucleoplasmic fraction, was saved. Cytoplasm and nuclei were mixed; fractions were centrifuged at 124,000×g and dialyzed overnight in dialysis buffer [10 mM HEPES (pH 7.9), 0.1 mM EDTA (pH 8), 0.1 mM DTT, 0.1 M KOAc and 10% glycerol]. The following day, the fractions were clarified by centrifugation at 20,000×g for 30 min, and the supernatants containing the solubilized proteins were collected. Anti-FLAG affinity purification and mass spectrometry were performed as previously reported.90 Data analysis is described in Supplementary Methods.

ChIP-qPCR
For ChIP-qPCR, FLAG-tagged POLR3A variants (WT or G672E) were transiently transfected in HEK293 cells for 24 hours with Lipofectamine. Transfections were performed in triplicate. Cells were crosslinked with 1% formaldehyde directly in the cell medium for 5 minutes followed by quenching for 5 minutes in 125 mM glycine. ChIP was performed as reported previously.90 For qPCR, 10 ng of chromatin were used to amplify two Pol III target genes (VTRNA1-1 and tRNA-iMet) and a control locus on chromosome 13 that is not bound by Pol III. The following primers were used: VTRNA1-1: 5’-GGC TGG CTT TAG CTC AGC G-3’ and 5’-TCT CGA ACA ACC CAG ACA GGT-3’, tRNA-iMet: 5’-AGA GTG GCG CAG CGG AA-3’ and 5’-TAG CAG AGG ATG GTT TCG ATC C-3’, unbound locus: 5’-GGC ACT GTC TTG TCA CTG CAC ATT-3’ and 5’-TGG AAA CAG CCA TTG AGA ACA CC-3’.
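Enrichment at a Pol III target relative to the unbound control locus can be illustrated with a standard ΔCt calculation. The exact formula used in this study is not stated here, so the sketch below is an assumption: it takes one Ct value per locus per replicate, assumes a fixed amplification efficiency of 2 per cycle, and uses hypothetical Ct values and function names throughout.

```python
from math import sqrt
from statistics import mean, stdev


def fold_enrichment(ct_target, ct_control, efficiency=2.0):
    """Enrichment of a target locus over an unbound control locus.

    A lower Ct means more template, so enrichment grows with
    (ct_control - ct_target). Assumes equal efficiency for both primer pairs.
    """
    return efficiency ** (ct_control - ct_target)


def mean_sem(values):
    """Mean and standard error of the mean for at least two replicates."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical (target Ct, control Ct) pairs for three biological replicates.
replicate_cts = [(24.0, 27.0), (25.0, 28.0), (23.5, 26.5)]
enrichments = [fold_enrichment(t, c) for t, c in replicate_cts]
average, sem = mean_sem(enrichments)
print(enrichments, average, sem)
```

Comparing the resulting mean enrichment between POLR3A-WT and POLR3A-G672E samples would then correspond to the occupancy comparison reported for the two target loci.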

Statistical analyses
Data are presented as mean +/- SEM. Preliminary analysis revealed significant differences in weight among the genotypes, particularly at the later ages (see Fig. S3.3). As weight can significantly impact motor performance, behavioral measures at each age were examined by one-way analysis of covariance (ANCOVA) with weight as the covariate. Behavioral measures are reported as ANCOVA (weight) adjusted least squares means +/- SEMs in Figure 3.1. Unadjusted means +/- SEMs are shown in Figure S3.4. Purkinje cell counts and RNA quantifications were compared using one-way ANOVA. ANOVAs were performed with GraphPad Prism and ANCOVAs with JMP (Version 13). When required, differences between genotypes were assessed using Tukey’s honestly significant difference (HSD) test. In all cases, the threshold for statistical significance was set at p-value < 0.05.
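To make the weight adjustment concrete, the sketch below computes covariate-adjusted group means for a one-way ANCOVA with a common slope: the pooled within-group slope of the response on weight is estimated, and each group mean is then shifted to the overall mean weight. This is a minimal illustration in plain Python, not a reproduction of the JMP analysis; it omits the F-test and standard errors, and the data and function name are hypothetical.

```python
from statistics import mean


def ancova_adjusted_means(groups):
    """Pooled slope and covariate-adjusted means for a one-way ANCOVA.

    groups maps a group name to a list of (covariate, response) pairs.
    Assumes a single slope shared by all groups.
    """
    all_x = [x for pairs in groups.values() for x, _ in pairs]
    grand_x = mean(all_x)
    sxx = 0.0
    sxy = 0.0
    group_means = {}
    for name, pairs in groups.items():
        xbar = mean([x for x, _ in pairs])
        ybar = mean([y for _, y in pairs])
        group_means[name] = (xbar, ybar)
        # Accumulate within-group sums of squares and cross-products.
        sxx += sum((x - xbar) ** 2 for x, _ in pairs)
        sxy += sum((x - xbar) * (y - ybar) for x, y in pairs)
    if sxx == 0:
        raise ValueError("covariate has no within-group variation")
    slope = sxy / sxx
    adjusted = {
        name: ybar - slope * (xbar - grand_x)
        for name, (xbar, ybar) in group_means.items()
    }
    return slope, adjusted


# Hypothetical (weight in g, latency in s) pairs for two genotypes.
data = {
    "WT": [(25.0, 10.0), (27.0, 12.0), (29.0, 14.0)],
    "KI/KI": [(30.0, 15.0), (32.0, 17.0), (34.0, 19.0)],
}
slope, adjusted = ancova_adjusted_means(data)
print(slope, adjusted)
```

In this constructed example the unadjusted means differ by 5 s, but both adjusted means are equal, showing how an apparent genotype difference can be fully accounted for by weight.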

Figures

Figure 3.1. Yearlong study of motor function in Polr3a KI/KI and KI/KO mice. Results from the 12 mm (A, D) and 6 mm (B, E) beam test at four time points consisting of three trials per mouse. Latencies to cross (A, B) and number of foot slips (D, E) were recorded for both beam sizes. C, F) Results from the rotarod (C) and inverted grid (F) tests performed at three time points. The rotarod and inverted grid consisted of three trials per mouse. G-I) Results from the open field test performed at three time points. The open field test was run for 90 minutes per mouse during which total distance traveled (G), number of movement bouts (H) and total time spent moving (I) were recorded for each 10-minute interval. The results represent the sum of all 10-minute intervals. J-K) Results from gait analysis performed at the two latest time points. Paws were covered in color paint and mice were allowed to walk on a white paper-covered narrow runway. Distance between fore limbs and hind limbs was measured. All tests were performed on ≥14 female mice per group. For the beam test, rotarod and inverted grid, data are represented as adjusted least squares means +/- SEM of the sum of the three trials for each group. Groups were compared with one-way ANCOVA for each time point. #: p < 0.01.

Figure 3.2. Normal myelination in Polr3a KI/KI and KI/KO mice. A-B) Luxol Fast Blue staining of coronal sections (A) showing the corpus callosum (long arrow) and dorsal fornix (short arrow), both myelinated, and of sagittal sections (B) of the cerebellum. Staining was performed on three 90 days old mice per group and representative images are shown for each group. Scale bar = 100 µm. C) Immunoblots of myelin proteins using total protein extracts from the brain of 90 days old WT, KI/KI and KI/KO mice. Mag: Myelin Associated Glycoprotein, Cnp: 2',3'-Cyclic Nucleotide 3' Phosphodiesterase, Plp: Proteolipid Protein, Mbp: Myelin Basic Protein.

Figure 3.3. No Purkinje cell loss in Polr3a KI/KI and KI/KO mice. A) Nissl staining of sagittal cerebellar sections of 365 days old mice. Staining was performed on four mice per group and representative images are shown for each group. Scale bar = 100 µm (top) and 50 µm (bottom). B) Purkinje cell counts of mid-sagittal cerebellar sections of 365 days old mice (n=4 per group). Data are represented as mean +/- SEM.

Figure 3.4. Expression levels of Pol III transcripts in the cerebrum and liver of Polr3a KI/KI and KI/KO mice. A) Top: Northern blots of precursor (pre) and mature (m) tRNA species from the cerebrum (left) and liver (right) of 365 days old mice. U3 snRNA was used as a loading control. Bc1 RNA was probed in the cerebrum only. Mean +/- SEM of tRNA or Bc1 levels normalized to U3 snRNA levels are indicated below the blot for each transcript. Bottom: Quantification of Pol III transcripts surveyed by Northern Blot. tRNA levels were normalized to U3 snRNA levels. Data are represented as mean +/- SEM. B) Left: Northern blot of precursor (pre) and mature (m) n-Tr20 tRNAArg(UCU) in the cerebrum of 3-months-old mice, demonstrating low levels of n-Tr20, consistent with these mice having the C57BL/6J n-Tr20 genotype (see also Additional File 1: Fig. S6B). Right: Quantification of precursor and mature n-Tr20 levels, normalized to U3 snRNA levels.

Figure 3.5. Impact of POLR3A G672E mutation on Pol III function in human cells. A) Immunofluorescence experiment showing the predominant nuclear localization of FLAG-tagged variants of POLR3A (WT or G672E). Scale bar = 20 µm. B) FLAG-tagged variants of POLR3A (WT or G672E) were expressed at equivalent levels in HeLa cells and purified using anti-FLAG affinity chromatography. The co-purified proteins were identified by LC-MS/MS. The heatmap contains the log2-transformed average spectral count ratios of G672E/WT across both replicates. Spectral counts were computed with Mascot. Specific and shared (with Pol I and/or Pol II) subunits are identified on the left. POLR3A (the bait) is identified by an asterisk. C) ChIP-qPCR performed against FLAG-tagged variants POLR3A-WT and POLR3A-G672E expressed transiently at equivalent levels in HEK293 cells. The chromatin was quantified by qPCR with primers for two Pol III target gene promoters (VTRNA1-1 and tRNA-iMet). Pol III enrichment at these loci was calculated relative to a locus on chromosome 13 that is not bound by Pol III. Data are represented as mean +/- SEM of biological triplicates.

Declarations

Abbreviations
HLD: hypomyelinating leukodystrophy; Pol III: RNA Polymerase III; tRNA: transfer RNA; MRI: magnetic resonance imaging; ncRNA: non-coding RNA; rRNA: ribosomal RNA; CNS: central nervous system; KI: knock-in; KO: knock-out; WT: wild-type; LFB: Luxol Fast Blue; cDNA: complementary DNA; SEM: standard error of the mean; MS: mass spectrometry; ChIP: chromatin immunoprecipitation.

Ethics approval and consent to participate
All experiments were performed according to good practice of handling laboratory animals consistent with the Canadian Council on Animal Care and approved by the McGill University Animal Care Committee.

Consent for publication
Not applicable.

Availability of data and materials
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Competing interests
The authors declare that they have no competing interests.

Funding
This project was supported by grants from the Fondation Leuco Dystrophie, the European Leukodystrophy Association and the Canadian Institutes of Health Research (CIHR) (MOP #126141). KC receives a Doctoral Award from the Fonds de recherche du Québec – Santé (FRQS). GB has received a Research Scholar Junior 1 (2012-2016) salary award from the FRQS and the New Investigator salary award (2017-2022) from the CIHR (MOP-G-287547). MT was supported by grants from the INSERM and the Ligue Nationale contre le Cancer (Equipe labellisée). CLK receives salary awards from the FRQS.

Authors’ contributions
KC and BB conceived the study. KC, RDM, RL, BC, IMW, CLK and BB designed the experiments. KC managed and analyzed behavioral experiments. SY, MJD, NS, RL, FN and KC collected and processed tissues, performed genotyping, RT-PCR and PCR for Sanger sequencing. KC, SY, RL and FN performed and analyzed histology and Western Blots. RDM and IMW conducted and analyzed Northern Blot experiments. DF and AB performed and analyzed immunofluorescence, affinity purification and ChIP-qPCR experiments. CP analyzed mass spectrometry data. JR participated in the statistical analysis of behavioral experiments. TEK, GB and MT provided advice regarding experimental design and data analysis and contributed to interpretation of results. KC, CLK and BB wrote the manuscript. All authors read and approved the final manuscript.

Acknowledgements
We are grateful to Dr. Roberta La Piana for critical reading of the manuscript and insightful comments. We would like to thank Eve-Marie Charbonneau and Geneviève Hamel from the Douglas Neurophenotyping Platform for their help with mouse colony management and behavioral assays, as well as the Goodman Cancer Research Centre Histology Facility for performing LFB stains and the McGill University and Génome Québec Innovation Center for Sanger sequencing.

Supplementary Materials

[Figure S3.1 image not reproduced. Recoverable panel labels: chromatograms of gDNA (tail) and cDNA (brain), reading TGG GGG CAG (Trp Gly Gln) in WT and TGG GAG CAG (Trp Glu Gln) in KI/KI; genotyping gel with paired KI and KO reactions for +/+, +/KI, KI/KI, +/-, KI/- and a negative control, with ladder marks at 1500, 600, 400, 300, 200 and 100 and labeled KO, KI and WT bands; immunoblot of Polr3a and Actin in WT, KI/KI and KI/KO, with markers at 250, 130, 100, 70 and 55 kDa.]

Figure S3.1. Generation of Polr3a transgenic mice. A) Sequence chromatograms of genomic DNA (left) and complementary DNA (right) for a WT mouse and a Polr3a KI/KI mouse, showing the presence of the homozygous mutation c.2015G>A (p.G672E) in the KI/KI mice. B) Genotyping of pups from Polr3a+/- intercrosses shows that the Polr3a-/- genotype is not compatible with embryo development. Embryos were collected at E13.5 from six pregnancies. Genotyping of embryos shows normal ratio of Polr3a+/+ and Polr3a+/- mice but no Polr3a-/- mice could be obtained. C) Agarose gel (3%) showing the PCR products from the amplification of genomic DNA for genotyping of Polr3a WT, KI and KO alleles. For each mouse, two PCRs were performed and resolved side by side: the first one detects WT and KI alleles and the second detects the KO allele. gDNA: genomic DNA; cDNA: complementary DNA. D) Immunoblot of POLR3A protein levels using protein extracts from the cerebrum of 1-year-old WT, KI/KI and KI/KO mice (n=3 per group). Actin was used as a loading control. Quantification of POLR3A protein levels normalized to Actin levels is shown on the right. Data are represented as mean +/- SEM.
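The correspondence between the nucleotide change c.2015G>A and the amino acid change p.G672E can be checked with simple codon arithmetic. The sketch below maps the coding position to its codon and applies the substitution to the GGG codon read from the WT chromatogram in panel A. The function names are illustrative, and only the four codons visible in that panel are included in the lookup table.

```python
# Subset of the standard genetic code covering the codons shown in panel A.
CODON_TABLE_SUBSET = {"TGG": "Trp", "GGG": "Gly", "GAG": "Glu", "CAG": "Gln"}


def codon_position(coding_pos):
    """Map a 1-based coding position to (codon number, 1-based base in codon)."""
    return (coding_pos - 1) // 3 + 1, (coding_pos - 1) % 3 + 1


def apply_substitution(codon, base_in_codon, ref, alt):
    """Replace one base of a codon after checking the reference base."""
    if codon[base_in_codon - 1] != ref:
        raise ValueError("reference base does not match the codon")
    return codon[: base_in_codon - 1] + alt + codon[base_in_codon:]


codon_number, base_in_codon = codon_position(2015)
wt_codon = "GGG"
mutant_codon = apply_substitution(wt_codon, base_in_codon, "G", "A")

print(
    f"codon {codon_number}, base {base_in_codon}: "
    f"{wt_codon} ({CODON_TABLE_SUBSET[wt_codon]}) -> "
    f"{mutant_codon} ({CODON_TABLE_SUBSET[mutant_codon]})"
)
```

Position 2015 falls on the second base of codon 672, and changing that base from G to A converts a glycine codon to a glutamate codon, in line with the Trp-Gly-Gln to Trp-Glu-Gln change labeled on the chromatograms.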

Figure S3.2. Representation of the Polr3a KO allele at the mRNA level. Screenshot of the SnapGene sequence file corresponding to exons 20 to 22 of the Polr3a KO mRNA, showing the retention of a portion of intron 21, leading to a frameshift and premature stop codon (red asterisk). Open reading frame is shown below the nucleotide sequence. Amino acid numbers are indicated in red. Purple arrows show the primers used for PCR amplification and sequencing.

Figure S3.3. Weights of mice undergoing phenotypic tests at each time point tested. Boxplots show the variability in weight, especially in the mutant groups.

Figure S3.4. Behavioral data prior to adjustment with weight as covariate (see Figure 3.1). Results from the 12 mm (A, D) and 6 mm (B, E) beam test at four time points consisting of three trials per mouse. Latencies to cross (A, B) and number of foot slips (D, E) were recorded for both beam sizes. C, F) Results from the rotarod (C) and inverted grid (F) tests performed at three time points. The rotarod and inverted grid consisted of three trials per mouse. G-I) Results from the open field test performed at three time points. The open field test was run for 90 minutes per mouse during which total distance traveled (G), number of movement bouts (H) and total time spent moving (I) were recorded for each 10-minute interval. The results represent the sum of all 10-minute intervals. J-K) Results from gait analysis performed at the two latest time points. Paws were covered in color paint and mice were allowed to walk on a white paper-covered narrow runway. Distance between fore limbs and hind limbs was measured. All tests were performed on ≥14 female mice per group. For the beam test, rotarod and inverted grid, data are represented as adjusted least squares means +/- SEM of the sum of the three trials for each group. Groups were compared with one-way ANOVA for each time point. *: p < 0.05, #: p < 0.01.


Figure S3.5. Luxol Fast Blue staining of coronal sections of the cerebrum of 365-day-old mice. Staining was performed on four mice per group and representative images are shown for each group. Pictures were taken with the 2.5X objective.

[Figure S3.6 image. Panel A: Northern blots of brain and liver from WT, KI/KI and KI/KO mice, probed for pre-tRNAIle(TAT), m-tRNALeu(AAG), Bc1 and U3, with quantification plotted as relative RNA/U3 ratio. Panel B: Northern blots of brain and liver from B6J and B6N mice, probed for Bc1, pre-n-Tr20, m-n-Tr20 (tRNAArg(UCU)) and U3.]

Figure S3.6. Expression levels of Pol III transcripts in the brain and liver of Polr3a KI/KI and KI/KO mice. A) Top: Northern blots of one precursor (pre) tRNA, one mature (m) tRNA and Bc1 RNA from the cerebrum and liver of 90-day-old mice. U3 snRNA was used as a loading control. Bottom: Quantification of transcripts surveyed by Northern blot. Pol III transcript levels were normalized to U3 snRNA levels. No statistically significant differences were uncovered. Data are represented as mean +/- SEM. B) Northern blot of Bc1 and n-Tr20 in the brain and liver of C57BL/6J (B6J) and C57BL/6N (B6N) mice, showing the brain-specific expression of these transcripts, as well as the much higher expression of n-Tr20 in B6N mice. U3 snRNA was used as a loading control.

Table S3.1. Quantification of the high-confidence interactors of the G672E mutant POLR3A against WT POLR3A. The ratio represents the average of the two spectral count ratios from the duplicate pairs of POLR3A G672E and POLR3A WT. The adjusted p-values were obtained by performing a two-tailed one-sample t-test followed by multiple testing correction using the Benjamini-Hochberg procedure. Specific Pol III subunits are shown in blue and subunits shared with Pol I and/or Pol II are indicated in green.

Protein symbol   Ratio (G672E/WT)   Adjusted p-value
C19orf60         0.500              0.80
CRCP             0.227              0.73
FGD2             0.619              0.81
MAF1             0.378              0.73
PDRG1            1.299              0.97
PFDN2            0.577              0.81
PFDN6            0.224              0.73
PIH1D1           0.993              0.97
POLR1C           0.677              0.81
POLR1D           0.300              0.73
POLR2E           0.559              0.81
POLR2H           0.389              0.73
POLR2L           0.410              0.73
POLR3A           1.000              0.97
POLR3B           0.787              0.89
POLR3C           0.479              0.80
POLR3D           0.622              0.81
POLR3E           0.660              0.81
POLR3F           0.419              0.73
POLR3G           0.333              0.73
POLR3H           0.522              0.80
RFC3             0.250              0.73
RPAP3            1.335              0.97
RUVBL1           1.314              0.97
RUVBL2           1.158              0.97
SPTAN1           0.103              0.73
SSB              0.283              0.73
TRMT1L           0.692              0.81
URI1             1.200              0.97
UXT              0.958              0.97
WDR92            0.792              0.89

Supplementary Methods

Generation of Polr3a KI mice

The targeting vector for conditional mutagenesis was modeled on the FLEX switch.273 Briefly, a lox2272 site was inserted in forward orientation into intron 14 of Polr3a, approximately 100bp upstream of the exon 15 splice acceptor. A neomycin (neo) cassette was inserted into intron 15, approximately 150bp downstream of the splice donor. The neo cassette was flanked with FRT sites for removal by Flpe-mediated recombination. The neo cassette was followed by a loxP site in forward orientation. The mutant exon 15 (c.2015G>A, p.G672E) was inserted in reverse orientation 5’ to the loxP site. The inverted exon was followed by a FLEX switch consisting of lox2272 and loxP in reverse orientation.

DNA extraction, genotyping and Sanger sequencing

Genomic DNA was extracted from tail biopsies or whole embryos using the Gentra Puregene Tissue Kit (Qiagen). For genotyping, two PCR reactions were performed for each mouse and resolved on a 3% agarose gel (Fig. S3.1C). For the WT and KI alleles, the following primers were used: 5’-ATC ATC CGG GTG GAA TGT AA-3’ and 5’-TAA GTG TGC TCT CCC ACA CG-3’, producing bands of 246bp and 286bp for the wild-type (WT) and KI alleles, respectively. For the KO allele, the following primers were used: 5’-GTC ACT CAA TCC TCT GCC TTT G-3’ and 5’-GAT CTC TAG TTA CCA GAG TCA-3’. The latter primer is located in the gene trap cassette, thus producing an amplicon of 320bp only when the KO allele is present (Fig. S3.1C). For Sanger sequencing of Polr3a exon 15, genomic DNA was amplified by PCR using the following primers: 5’-GGA TGT AAA CAT TAT TCT CCA CCA G-3’ and 5’-ACT AAG CCT TTC CCT CTG CG-3’. PCR products were sequenced at the McGill University and Genome Quebec Innovation Center, using a 3730XL DNA Analyzer (Applied Biosystems). Sequence chromatograms were analyzed using SnapGene v2.7.2 (GSL Biotech LLC) and 4Peaks (A. Griekspoor and Tom Groothuis, mekentosj.com). To determine the lethality of homozygous KO mice, heterozygous Polr3a+/- mice were interbred and gestating females were sacrificed at E13.5 to extract embryos for genotyping.
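The two-PCR genotyping scheme described above can be read as a small decision rule: the first PCR reports the WT (246bp) and KI (286bp) alleles, and the second reports the KO allele (320bp). The Python sketch below is only an illustration of that logic, not the procedure used in the thesis, where gels were read by eye. The function names, the 10bp matching tolerance and the "uncallable" label are assumptions introduced here.

```python
"""Illustrative genotype call from the band sizes of the two genotyping PCRs."""

# Expected amplicon sizes in bp, taken from the genotyping description above.
WT_BAND, KI_BAND, KO_BAND = 246, 286, 320


def has_band(observed, expected, tolerance=10):
    """Return True if any observed band lies within `tolerance` bp of `expected`.

    The 10 bp tolerance is a hypothetical value chosen for this sketch.
    """
    return any(abs(size - expected) <= tolerance for size in observed)


def call_genotype(pcr1_bands, pcr2_bands, tolerance=10):
    """Combine PCR 1 (WT/KI primers) and PCR 2 (KO primers) into a genotype label."""
    alleles = []
    if has_band(pcr1_bands, WT_BAND, tolerance):
        alleles.append("WT")
    if has_band(pcr1_bands, KI_BAND, tolerance):
        alleles.append("KI")
    if has_band(pcr2_bands, KO_BAND, tolerance):
        alleles.append("KO")
    # A diploid mouse carries at most two distinct alleles; zero or three
    # detected alleles point to a failed or contaminated reaction.
    if not alleles or len(alleles) > 2:
        return "uncallable"
    if len(alleles) == 1:
        # A single detected allele is read as homozygous.
        alleles = alleles * 2
    return "/".join(alleles)


if __name__ == "__main__":
    print(call_genotype([286], [320]))  # KI band plus KO band
```

For example, a mouse showing only the 286bp band in the first PCR and the 320bp band in the second would be called KI/KO under this rule.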

RT-PCR and cDNA sequencing

For RT-PCR, brain RNA was reverse transcribed into complementary DNA (cDNA) using the Superscript III Reverse Transcriptase (ThermoFisher) according to the manufacturer’s instructions. For sequencing of the c.2015G>A (p.G672E) mutation, exons 14 to 16 of Polr3a were amplified by PCR using the following primers: 5’-TCA CCC TCA AGG ACA CCT TC-3’ and 5’-CGG ATC ACA GAC AGC TCC TT-3’. For sequencing of the KO allele, exons 20 to 22 of Polr3a were amplified by PCR in KI/KO mice using the following primers: KO_F1: 5’-TCT GGC TTA ACG CCT ACT GA-3’, KO_F2: 5’-GGC TAA TTC ACT CCC AAC GA-3’ (located in the gene trap cassette), KO_R1: 5’-GAT CTC TAG TTA CCA GAG TCA-3’ (located in the gene trap cassette) and KO_R2: 5’-GGT CCA ACT GGT ACA GCA CA-3’ (see Figure S3.2). PCR products were Sanger sequenced as described above.

Analysis of mass spectrometry data

Protein database searching and protein spectral count quantification were performed with Mascot (version 2.3.02).279 The NCBI_Human protein sequence database was downloaded on 20 February 2014. Known protein contaminants such as keratins, which are not expressed in HeLa cells, were excluded from the data set. Indistinguishable protein isoforms were considered as a single protein. For each LC-MS/MS analysis, protein spectral counts were normalized by the spectral count of FLAG-POLR3A in order to allow the comparison of different purifications. Each replicate LC-MS/MS analysis of the affinity purifications of the FLAG-POLR3A mutant (G672E) was paired with an LC-MS/MS analysis of the WT FLAG-POLR3A that was performed at the same time. The set of high-confidence interactors of POLR3A for a given mutant analysis was identified by comparing the spectral counts of the interactors obtained from the purifications of the paired WT POLR3A and the G672E POLR3A to those of the proteins purified with an empty vector (EV) of the FLAG tag (nonspecific interaction). A protein was labeled as a high-confidence interactor if it was identified and quantified in both replicates of WT POLR3A and G672E POLR3A and if the ratio of the average spectral counts across the two replicates (WT/EV or G672E/EV) was greater than 5. These stringent criteria allow us to eliminate the vast majority of nonspecific interactors of POLR3A for the analysis of the interactions of the mutant. For each high-confidence interacting protein, a two-tailed one-sample t-test was performed on the spectral count ratios (G672E/WT) and the resulting p-values were adjusted for multiple hypothesis testing using the Benjamini–Hochberg procedure. To maximize the specificity of our approach, a protein is deemed to show a level of differential interaction with POLR3A that is statistically significant when its adjusted p-value is < 0.05.
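The analysis steps described above (bait normalization, the high-confidence filter, the one-sample t-test and the Benjamini-Hochberg adjustment) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the thesis. All function names are hypothetical. The text does not state the null value of the t-test, so a G672E/WT ratio of 1 is assumed. The "WT/EV or G672E/EV" criterion is read as "either ratio exceeds 5", and a protein absent from the empty-vector control is assumed to pass. Because each protein has exactly two paired ratios, the test has one degree of freedom, where Student's t distribution coincides with the Cauchy distribution and the p-value has a closed form.

```python
import math
from statistics import mean, stdev


def normalize_to_bait(counts, bait="POLR3A"):
    """Divide every spectral count in one purification by the bait's count."""
    bait_count = counts[bait]
    return {protein: count / bait_count for protein, count in counts.items()}


def is_high_confidence(wt, mutant, ev, threshold=5.0):
    """Apply the filter to one protein, given its counts in the two replicates.

    Assumptions: either ratio (WT/EV or mutant/EV) above the threshold is
    enough, and a protein never seen in the EV control passes automatically.
    """
    if min(wt) <= 0 or min(mutant) <= 0:
        return False  # must be quantified in both replicates of both baits
    ev_mean = mean(ev)
    if ev_mean == 0:
        return True
    return mean(wt) / ev_mean > threshold or mean(mutant) / ev_mean > threshold


def one_sample_t_pvalue_two_replicates(ratios, null_value=1.0):
    """Two-tailed one-sample t-test for exactly two ratios (1 degree of freedom)."""
    if len(ratios) != 2:
        raise ValueError("closed form is only valid for two replicates")
    diff = mean(ratios) - null_value
    spread = stdev(ratios)
    if spread == 0:
        # Convention chosen for this sketch when the t statistic is undefined.
        return 1.0 if diff == 0 else 0.0
    t = diff / (spread / math.sqrt(2))
    return 1.0 - (2.0 / math.pi) * math.atan(abs(t))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted
```

With only two replicates per protein, even large deviations from a ratio of 1 yield modest p-values, which is consistent with none of the adjusted p-values in Table S3.1 falling below 0.05.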

PREFACE TO CHAPTER 4

In Chapter 3, I described the absence of neurological abnormalities in a Polr3a KI mouse model and discussed the possibility that the G672E mutation may be at the mildest end of the phenotypic spectrum associated with Pol III-related leukodystrophy. In contrast, homozygosity for the Polr3b R103H mutation is embryonic lethal in mice (unpublished data, K. Choquet and B. Brais), suggesting that other Polr3a or Polr3b mutations could possibly cause a phenotype that falls in the middle of this severity spectrum. On the other hand, obtaining a viable mouse model with an appropriate phenotype may simply not be possible for POLR3-HLD. Because of inter-species differences in myelination and in expression of Pol III transcripts, we next opted to take a step back from model organisms and to use human cells to investigate the molecular impact of POLR3A mutations. Chapter 4 is a manuscript in preparation, in which we apply functional genomics technologies in several complementary cellular models, including one of the only two existing human oligodendroglial cell lines, to determine whether abnormal expression of Pol III transcripts is indeed a feature of POLR3-HLD.

CHAPTER 4: Mutations in POLR3A impact a subset of Pol III transcripts including BC200 RNA

Karine Choquet1,2,3, Diane Forget4, Elisabeth Meloche3, Marie-Josée Dicaire3, Geneviève Bernard5,6,7,8, Marc R. Fabian2, Martin Teichmann9, Benoit Coulombe4,10, Bernard Brais1,3,5, Claudia L. Kleinman1,2

Affiliations: 1) Department of Human Genetics, McGill University, Montréal, Canada. 2) Lady Davis Institute for Medical Research, Jewish General Hospital, Montréal, Canada. 3) Montreal Neurological Institute, McGill University, Montréal, Canada. 4) Translational Proteomics Laboratory, Institut de recherches cliniques de Montréal (IRCM), Montréal, Canada. 5) Department of Neurology and Neurosurgery, and Pediatrics, McGill University, Montréal, Canada. 6) Department of Pediatrics, McGill University, Montréal, Canada. 7) Department of Medical Genetics, Montreal Children’s Hospital, McGill University Health Center, Montréal, Canada. 8) Child Health and Human Development Program, Research Institute of the McGill University Health Center, Montréal, Canada. 9) INSERM U1212 – CNRS UMR5320, Université de Bordeaux, Bordeaux, France. 10) Département de biochimie et médecine moléculaire, Université de Montréal, Montréal, Canada.

Manuscript in preparation

Abstract

RNA Polymerase III (Pol III) is an essential enzyme responsible for the synthesis of transfer RNAs (tRNA) and many non-coding RNAs. Recessive mutations in POLR3A, POLR3B and POLR1C, encoding subunits of Pol III, cause Pol III-related hypomyelinating leukodystrophy (POLR3-HLD), a neurodegenerative disorder characterized by a severe defect in central nervous system myelination. While the genetic cause of the disease has been well established, the mechanisms involved are unknown. Here, we used CRISPR-Cas9 to introduce the disease-causing mutation c.2554A>G (p.M852V) in the endogenous POLR3A gene in human cell lines. Using two complementary RNA-sequencing approaches, we found that the POLR3A mutation leads to significantly decreased levels of a subset of Pol III transcripts, all of which are transcribed through type 2 Pol III promoters. Notably, we observed a global reduction in nuclear-encoded tRNA levels and decreased expression of the brain cytoplasmic BC200 RNA in POLR3A mutant cells. BC200 RNA levels were also reduced in POLR3-HLD patient-derived fibroblasts. Genomic deletion of BC200 in the oligodendroglial MO3.13 cell line led to major changes at the proteome level, which were partially recapitulated in POLR3A mutant cells. Altogether, our findings provide the first evidence of decreased Pol III transcript expression in POLR3-HLD, and identify candidate effectors, including BC200 RNA, to be further investigated for their involvement in the disease and their role in myelination.

Introduction

Hypomyelinating leukodystrophies (HLD) are a heterogeneous group of inherited neurodegenerative disorders characterized by deficient cerebral myelin formation.68 Recessive mutations in the genes POLR3A, POLR3B and POLR1C cause POLR3-related hypomyelinating leukodystrophy (POLR3-HLD), the second most common form of childhood-onset HLD.39,81,90 Patients present in early childhood or adolescence with motor regression, cerebellar features including ataxia and extra-neurological manifestations such as hypodontia and/or hypogonadotropic hypogonadism.89 Histopathological findings in the brains of two deceased POLR3-HLD patients support a contribution of oligodendrocytes, the myelin-producing cells, as well as neurons, to disease pathogenesis.89,96 Nonetheless, the molecular basis of the disease pathophysiology remains poorly understood.

POLR3A, POLR3B and POLR1C encode subunits of RNA Polymerase III (Pol III), one of the three essential eukaryotic RNA polymerases. POLR3A and POLR3B are the two largest subunits of Pol III and form the active center of the enzyme,98 while POLR1C is a shared subunit of Pol I and Pol III and is thought to serve as a scaffold for the assembly of the enzyme core.99 Pol III synthesizes a diverse group of small non-coding RNAs (ncRNA), including nuclear-encoded transfer RNAs (tRNA), 5S ribosomal RNA (rRNA), U6 small nuclear RNA (snRNA), and many others.244 The majority of Pol III transcripts are key players in housekeeping processes such as regulation of transcription, RNA processing, mRNA splicing and translation.244 Pol III also transcribes short interspersed nuclear elements (SINE), including Alu elements.166 Pol III-transcribed genes are broadly grouped into three types based on their promoter elements and associated transcription factors.175 Type 1 promoters, which are exclusive to 5S rRNA genes, and type 2 promoters (tRNAs and a few others) contain internal elements located within the transcribed sequence.175 In contrast, type 3 genes (U6, RN7SK, RPPH1, RMRP) possess upstream promoter elements.175,280

Despite its crucial role in the cell, the general consequences of Pol III hypofunction in mammalian cells remain obscure. In fact, it is unknown whether mutations in Pol III subunits impair the transcription of its target genes. Missense and nonsense mutations causing POLR3-HLD are scattered throughout POLR3A, POLR3B and POLR1C and are located in multiple functional domains.105 We previously showed that exogenous expression of two POLR1C mutants in HeLa cells impairs Pol III complex assembly, leading to an accumulation of the mutated subunits in the cytoplasm and decreased Pol III binding at most target genes.90 However, whether this widespread effect on Pol III occupancy is identical for all mutations remains to be established. Of particular interest is the disease-causing POLR3A c.2554A>G (p.M852V) mutation, located in the cleft domain of Pol III, which harbors the template DNA during transcription,98,105,281 and more specifically on the bridge helix, which is important for the translocation of the RNA-DNA duplex during transcription.282 As such, it is predicted to directly impair Pol III DNA binding and/or transcription elongation.

Another key question is why the central nervous system (CNS) is specifically affected in POLR3-HLD, since most Pol III transcripts are highly abundant and expressed ubiquitously. One notable exception is BC200 RNA, a primate-specific transcript that is almost exclusively expressed in the brain, with much lower expression in certain other tissues and in various cell lines.151,165 This ncRNA localizes to neuronal dendrites, where it is thought to regulate local translation.151,154,163 Although the expression pattern of BC200 RNA makes it an attractive candidate for a role in POLR3-HLD pathogenesis, the potential involvement of BC200 RNA in myelination or oligodendrocyte biology has never been investigated.

In this study, we performed comprehensive profiling of the Pol III transcriptome in human cells carrying the POLR3A M852V mutation. We find that a subset of Pol III transcripts is more vulnerable to Pol III hypofunction. In particular, we observe a global decrease in tRNA levels, as well as downregulation of BC200 RNA in multiple datasets including POLR3-HLD patient-derived fibroblasts. Upon knock-out (KO) of BC200 in an oligodendroglial-derived cell line, we detect major changes at the proteome level, suggesting that this ncRNA does play a role in this cell type. Altogether, our results indicate that tRNAs, BC200 RNA and its downstream targets may be important effectors of POLR3-HLD pathogenesis.

Results

The POLR3A M852V mutation leads to decreased Pol III chromatin occupancy

To evaluate the impact of the POLR3A M852V mutation on Pol III complex assembly and chromatin occupancy, we first stably expressed FLAG-tagged versions of the wild-type (WT) and mutant (M852V) POLR3A in HeLa cells and performed anti-FLAG affinity purification followed by mass spectrometry as well as immunofluorescence. POLR3A-M852V pulled down similar levels of Pol III complex subunits compared to the WT and did not accumulate in the cytoplasm (Fig. 4.1a, b), indicating that this mutation does not impair Pol III complex assembly. Next, we assessed the impact of the mutation on Pol III chromatin occupancy in HEK293 cells transiently transfected with the FLAG-tagged POLR3A variants. ChIP followed by quantitative PCR (ChIP-qPCR) at two genes with type 2 Pol III promoters showed a 40-50% reduction of POLR3A occupancy in POLR3A-M852V compared to WT (Fig. 4.1c). This suggests that the M852V mutation causes impaired recruitment by the basal transcription machinery and/or binding to DNA and provides an opportunity to examine the downstream effects of decreased Pol III chromatin occupancy without the confounding effects of impaired Pol III complex assembly.

Introduction of disease-causing mutations in POLR3A using CRISPR-Cas9

Since Pol III occupancy is correlated to nascent RNA levels,193 the above results suggest that POLR3A mutations could lead to decreased levels of Pol III transcripts. To explore this hypothesis in a cellular model where physiological Pol III subunit levels are produced, we introduced the M852V mutation in the endogenous POLR3A gene in HEK293 cells using the CRISPR-Cas9 system.283 We obtained one homozygous mutant clone and two compound heterozygous clones that carry the M852V mutation on one allele and an indel causing a frameshift and premature stop codon on the other allele (Fig. 4.2a, Fig. S4.1a). These clones mimic the genotypes observed in POLR3-HLD cases, since four of the five reported patients who are heterozygous for the c.2554A>G (p.M852V) mutation carry a premature stop codon on the other allele.39,89 The transcript from this null allele is mostly degraded (Fig. S4.2) in patient-derived fibroblasts, resulting in decreased levels of POLR3A protein.39 Similarly, POLR3A protein levels were decreased in all three HEK293 mutant clones compared to controls (Fig. 4.2c). POLR3A mRNA levels were reduced in clones M1 and M2 compared to wild-type clones, but not in the homozygous mutant clone M3 (Fig. 4.2b). Upon inspection of the cDNA, we observed partial (~40%) in-frame skipping of exon 19, carrying the M852V mutation, in clone M3 (Fig. 4.2a, Fig. S4.1b, c). It was previously shown that in-frame skipping of exons carrying an indel sometimes occurs following CRISPR-Cas9 editing,284 and our observations suggest that this can also happen with point mutations. In sum, we generated three HEK293 clonal cell lines carrying the POLR3A M852V mutation and showing reduced levels of POLR3A protein, allowing us to investigate the downstream impact of POLR3A haploinsufficiency.

POLR3A mutations lead to reduced levels of a subset of its transcripts

Pol III occupancy levels measured by ChIP-Seq have traditionally been used as a proxy for Pol III transcription levels,181,195 since Pol III transcripts are strongly structured, chemically modified and highly repetitive, which makes their levels challenging to measure directly. However, if the M852V mutation were to affect transcription elongation as well as DNA binding, the direct correlation between Pol III occupancy and nascent RNA levels would be disrupted. To directly determine whether this mutation is associated with reduced levels of specific Pol III transcripts, we used two complementary RNA-seq approaches in the three mutant and three control clones. Since Pol III transcripts range in size from 70 to 330 nucleotides, we developed a custom small RNA-seq method to properly quantify Pol III transcripts smaller than 200 nucleotides (Supplementary Methods, Fig. S4.3, S4.4) and applied it in combination with rRNA-depleted RNA-seq, which is more appropriate for Pol III transcripts larger than 200 nucleotides. In both RNA-seq datasets, Pol III transcripts showed a global trend towards decreased expression in mutants compared to controls (Fig. 4.3a). Furthermore, stratifying according to their promoter type revealed that this decrease was mostly attributable to type 2 Pol III transcripts, while type 3 transcripts were less affected (Fig. 4.3b).

Since nuclear-encoded tRNAs represent the majority of transcripts with a type 2 promoter, we examined their expression in the small RNA transcriptome data. Indeed, we observed a global reduction of tRNA levels in the three mutants compared to controls (Fig. 4.4a). However, due to the variability in the genotypes of the three mutants, only four tRNA genes demonstrated a statistically significant decrease (FDR < 0.05), although the majority of expressed tRNAs showed a trend towards a decrease (Fig. 4.4a, Table S4.1). To reduce variability and analyze the effects of the potentially most detrimental POLR3A mutations on tRNA expression, we also analyzed biological triplicates of mutant M2, the clone with the lowest POLR3A mRNA expression. Indeed, the large majority of nuclear-encoded tRNAs showed decreased expression in M2 compared to control C3 (Fig. 4.4b), with 77 tRNAs reaching statistical significance (Table S4.2). In contrast, levels of other categories of small RNAs (snRNAs, snoRNAs) were unaffected, both in individual clones (Fig. 4.4a) and in replicates of clone M2 (Fig. 4.4b). Decreased tRNAs belonged to most tRNA isoacceptor families (Fig. 4.5a), suggesting a widespread effect on this transcript type and arguing against a specific impact on tRNAs cognate for certain codons or amino acids. To confirm the decreased tRNA levels using an independent method, we measured the expression levels of three tRNAs by qRT-PCR and observed a significant decrease in mutants compared to controls (Fig. 4.5b). Thus, these results provide the first evidence that a leukodystrophy-causing POLR3A mutation causes a global reduction in nuclear-encoded tRNA levels.

To determine whether the selective effect on type 2 transcripts was only driven by differences in tRNA levels, we examined the other members of this group, which include BC200 RNA, 7SL RNA, vault RNAs and SINE elements (including Alu elements).175,244 BC200 (also called BCYRN1) and genes encoding 7SL RNA were among the most significantly downregulated genes in the large RNA transcriptome (Fig. 4.6a, Table S4.3). Furthermore, we observed a statistically significant decrease in the total expression of Pol III-transcribed Alu elements in POLR3A mutants (Fig. 4.6b). In contrast, vault RNA levels were comparable across both groups, making them the only type 2 transcripts that were not decreased in mutants. Consistent with a selective effect on type 2 transcripts, none of the genes coding for type 1 (5S rRNA) or type 3 transcripts (U6 RNAs, Y RNAs, RPPH1, RMRP and RN7SK) displayed significant differences between control and mutant samples, although some showed a small trend towards decreased expression in mutants (Fig. S4.5a). Taken together, these results strongly suggest that type 2 transcripts are the most vulnerable to disease-causing POLR3A mutations.
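The stratification by promoter type discussed above amounts to grouping transcripts by class and summarizing their fold changes per group. The Python sketch below illustrates one way to do this, assuming a median summary. The class-to-type mapping follows the grouping given in the text (type 1: 5S rRNA; type 2: tRNAs, BC200, 7SL, vault RNAs, Alu elements; type 3: U6, Y RNAs, RPPH1, RMRP, RN7SK), but the function name, the class labels and the input format are hypothetical, and the numbers in the usage example are placeholders rather than values from this study.

```python
from statistics import median

# Promoter type of each Pol III transcript class, as grouped in the text.
PROMOTER_TYPE = {
    "5S rRNA": 1,
    "tRNA": 2, "BC200": 2, "7SL": 2, "vault": 2, "Alu": 2,
    "U6": 3, "Y RNA": 3, "RPPH1": 3, "RMRP": 3, "RN7SK": 3,
}


def median_log2fc_by_promoter_type(records):
    """Summarize mutant-vs-control log2 fold changes per promoter type.

    `records` is an iterable of (transcript_class, log2_fold_change) pairs.
    Classes missing from PROMOTER_TYPE are skipped.
    """
    groups = {}
    for transcript_class, log2fc in records:
        promoter_type = PROMOTER_TYPE.get(transcript_class)
        if promoter_type is None:
            continue
        groups.setdefault(promoter_type, []).append(log2fc)
    return {ptype: median(values) for ptype, values in sorted(groups.items())}


if __name__ == "__main__":
    # Placeholder values for illustration only, not measurements.
    toy = [("tRNA", -1.0), ("BC200", -2.0), ("7SL", -0.5),
           ("U6", -0.1), ("RMRP", 0.1), ("5S rRNA", 0.0)]
    print(median_log2fc_by_promoter_type(toy))
```

A median is used here because a handful of strongly affected transcripts could otherwise dominate a per-type mean; the thesis itself does not specify which summary was plotted.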

BC200 RNA is downregulated in patient-derived fibroblasts

Among the large RNA transcriptome, BC200 RNA, the only Pol III transcript observed to have a predominant CNS expression,165 was the most downregulated transcript in HEK293 mutant clones (Fig. 4.6a, 4.7a), suggesting a particular sensitivity to Pol III hypofunction. To determine if this transcript was also affected in patient cells, we examined two existing transcriptomic datasets of patient-derived fibroblasts. In RNA-seq from four POLR3A patients (see genotypes in Table S4.4), BC200 RNA was also the top differentially expressed Pol III transcript in patients compared to controls (Fig. 4.7b, Fig. S4.5b), albeit not quite reaching statistical significance after multiple testing correction (p = 0.004, FDR = 0.102). Similarly, in microarray data from four POLR3A cases (including two cases carrying the M852V mutation, see Table S4.4), BC200 displayed a statistically significant reduction in patients compared to controls (Fig. 4.7c). Altogether, we show that BC200 RNA is the top differentially expressed large Pol III transcript in three separate datasets of POLR3A mutant cells, including carriers of several distinct mutations, emphasizing its vulnerability to Pol III hypofunction and suggesting a possible role in the pathophysiology of POLR3-HLD.

Identification of putative downstream targets of Pol III hypofunction

Given the expression of BC200 RNA in the CNS and its consistent downregulation in POLR3A mutant cells and patient-derived fibroblasts, we next explored the downstream consequences of decreased BC200 RNA expression. Because of their crucial function in CNS myelination, oligodendrocytes are likely to be one of the main dysfunctional cell types in POLR3-HLD. Thus, we opted to investigate the effects of the POLR3A M852V mutation and the role of BC200 RNA in the MO3.13 cell line, which displays features of oligodendrocyte progenitor cells (OPC) and is one of only two existing human oligodendroglial immortalized cell lines.74,285 First, we used CRISPR-Cas9 to delete the entire BC200 gene in MO3.13 cells (Fig. S4.6d-h). Second, using the same approach as above, we generated MO3.13 clones carrying the POLR3A c.2554A>G (p.M852V) mutation in compound heterozygosity with a null allele (Fig. S4.6a). We verified that mutant MO3.13 cells recapitulate the effects observed in HEK293 cells. Indeed, they display reduced levels of POLR3A protein (Fig. S4.6b). Consistent with our above results, they also show significantly decreased levels of BC200 RNA (Fig. S4.6c). Thus, we obtained two mutant cell lines with different degrees of BC200 RNA loss (~50% reduction in POLR3A-M852V and complete absence in BC200-KO).

Given the documented role of BC200 RNA, as well as tRNAs and 7SL RNA, in translation regulation and protein homeostasis, we next sought to determine if the POLR3A M852V mutation and/or BC200 KO would impact steady-state protein levels. We applied quantitative mass spectrometry by stable isotope labeling of amino acids in culture (SILAC) and detected 1,272 proteins in at least 4 out of 6 biological triplicates in the three conditions (WT, POLR3A-M852V, BC200-KO; Fig. 4.8a). We compared the proteome of POLR3A-M852V and BC200-KO cells to that of MO3.13 WT cells and uncovered 93 and 276 differentially abundant proteins, respectively (FDR < 0.05), with significant effect size (log2 fold change > 0.5) (Fig. 4.8b, Table S4.5). BC200 KO had a more potent effect on the proteome than the POLR3A M852V mutation did, leading to differential abundance of a higher number of proteins (Fig. 4.8b) and to more proteins showing a large effect size (Fig. 4.8c). These observations suggest that BC200 RNA does have a role in MO3.13 cells, since its inactivation leads to significant changes for more than 20% of detected proteins (276/1272). Gene Ontology (GO) analysis showed an over-representation of proteins located in the plasma membrane, ER and cytoskeleton among the upregulated proteins. In contrast, downregulated proteins were enriched for ones involved in the G1/S transition, including members of the condensin and minichromosome maintenance complexes, possibly implying slower cell growth or cell cycle defects. We observed a mild correlation (R2 = 0.28) between the fold change in BC200-KO and in POLR3A-M852V for proteins that showed statistically significant differences (FDR < 0.05) in both conditions (Fig. 4.8d). When we focused on proteins that have a significant effect size in BC200-KO (log2 fold change > 0.5), we found that proteins that were increased in BC200-KO tended to follow the same direction in POLR3A-M852V, and vice versa for decreased proteins (Fig. 4.8e, 4.8f). This suggests that the changes in protein levels observed in POLR3A mutant cells could be at least partially explained by reduced expression of BC200 RNA, and further supports a contribution of this ncRNA to POLR3-HLD pathogenesis. Furthermore, our data pinpoint several putative downstream targets of BC200 RNA, which will be further investigated in future experiments.
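The selection criteria used in this proteomic comparison can be expressed as a short sketch. The Python example below classifies a protein from its FDR and log2 fold change, reproduces the 276/1272 proportion quoted above, and computes the fraction of proteins that change in the same direction in two conditions. It is an illustration only: the function names are hypothetical, and the effect-size threshold is assumed to apply to the absolute log2 fold change, since both increased and decreased proteins are reported.

```python
def classify_protein(log2_fold_change, fdr, fdr_cutoff=0.05, effect_cutoff=0.5):
    """Label a protein as increased, decreased or unchanged versus WT.

    Assumption: the effect-size threshold is applied to |log2 fold change|.
    """
    if fdr >= fdr_cutoff or abs(log2_fold_change) <= effect_cutoff:
        return "unchanged"
    return "increased" if log2_fold_change > 0 else "decreased"


def percent_changed(n_changed, n_detected):
    """Percentage of detected proteins that are differentially abundant."""
    return 100.0 * n_changed / n_detected


def direction_concordance(pairs):
    """Fraction of proteins whose log2 fold changes share the same sign.

    `pairs` holds (log2fc_condition_1, log2fc_condition_2) for each protein,
    for example BC200-KO and POLR3A-M852V, each relative to WT.
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one protein is required")
    same_direction = sum(1 for first, second in pairs if first * second > 0)
    return same_direction / len(pairs)


if __name__ == "__main__":
    # Counts taken from the text: 276 of 1,272 detected proteins in BC200-KO.
    print(round(percent_changed(276, 1272), 1))  # 21.7
```

The concordance measure is a simpler summary than the R2 reported in the text; it only asks whether the two conditions shift a protein in the same direction, not by how much.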

Discussion

Mutations in genes encoding Pol III subunits cause POLR3-HLD, a devastating neurodegenerative disorder. However, the extent to which Pol III mutations affect expression of Pol III transcripts remains unclear. Identification of specific Pol III transcripts that are deregulated in POLR3-HLD could provide clues to the downstream disease mechanisms. Herein, we combined CRISPR-Cas9 gene editing and transcriptomic profiling in human cells to uncover Pol III transcripts that are most vulnerable to POLR3-HLD disease-causing mutations.

We found that the POLR3A M852V mutation leads to significantly decreased expression of some but not all Pol III transcripts, namely BC200 RNA, tRNAs, 7SL RNAs and Alu elements. The common feature of these transcripts is their possession of a type 2 Pol III promoter,175 suggesting a promoter-specific vulnerability to Pol III dysfunction, although not all members of this group were downregulated (i.e. vault RNAs). This type of promoter is composed of control elements located within the transcribed sequence, called A and B boxes.280 It was recently shown that BC200 RNA expression requires an upstream TATA-like element in addition to these internal elements.286 In contrast, type 3 promoters are entirely located upstream of the transcription start site and include a TATA element as well as proximal and distal elements that are highly similar to the ones controlling the transcription of other snRNA genes by Pol II and are recognized by the same transcription factors.105,280 In human fibroblasts, at least one type 3 gene, RPPH1, was occupied by equivalent amounts of the Pol III subunit POLR3D and the Pol II subunit POLR2B, leading the authors to conclude that it can be transcribed by both polymerases.185 Considering the similarity between promoters, one possibility is that Pol II can compensate for decreased Pol III transcription at type 3 genes, thus stabilizing the levels of these transcripts in POLR3A mutants. However, differences in termination between the two polymerases suggest that such Pol II-transcribed products would be of different lengths and highly unstable.185 Alternatively, the internal nature of type 2 promoters could be particularly detrimental to a mutated Pol III. A- and B-boxes are bound by the transcription factor TFIIIC, which is required for transcription initiation but presumably also acts as a block on transcription that must be displaced during Pol III elongation.287 A recent study in S. cerevisiae found an uneven distribution of Pol III across tRNA genes, with the highest occupancy at the beginning of the A-box, suggesting that promoter escape could be a rate-limiting step during Pol III transcription that requires TFIIIC displacement, possibly through conformational changes induced by the elongating Pol III.287,288 Thus, one can hypothesize that impaired elongation due to POLR3A mutations could have a more detrimental effect on type 2 genes, compared to type 3 genes where this physical obstacle to transcription is not present. Of note, type 1 genes encoding 5S rRNA also contain internal elements, but those are bound by TFIIIA, which may not exert the same effect on Pol III transcription. High-resolution mapping of Pol III location on nascent transcripts in POLR3A mutants and controls, using native elongating transcript sequencing (NET-seq)289 or UV crosslinking and analysis of cDNA (CRAC),288 would make it possible to determine whether there are differences in accumulation of the mutant Pol III at the beginning of type 2 genes.

The affected transcripts in our POLR3A mutant cells partially overlap with the ones that were decreased in the blood of patients with an atypical Pol III-related disease without myelin abnormalities, caused by partial skipping of POLR3A exon 14.93 The authors also observed significantly decreased expression of 7SL RNA and a small number of tRNAs. However, they reported significantly increased expression of RN7SK, RMRP and RPPH1,93 whereas those transcripts do not show significant differences in our data. This discrepancy could be explained by the remaining POLR3A WT expression in their patients.93 Nonetheless, the same transcripts appeared to be most vulnerable to Pol III dysfunction in both studies, further strengthening our results. It is important to note that type 3 Pol III transcripts are not completely resistant to the POLR3A M852V mutation, but that their downregulation appears to require a higher degree of Pol III dysfunction. For instance, RN7SK and RPPH1 have decreased expression in mutant M2 only, which also showed the lowest expression of BC200 RNA and tRNAs, suggesting that it has the lowest level of residual Pol III function. This indicates that Pol III transcripts have different sensitivities to POLR3A haploinsufficiency. Similarly, repression of Pol III transcription by rapamycin treatment or serum deprivation led to decreased Pol III occupancy and transcription for the majority of Pol III target genes, while a handful of “stable” genes conserved normal expression.193 Those include a small proportion of tRNAs and representatives from vault RNAs and most type 3 transcripts, but not 7SL RNA and the remainder of tRNAs,193 again suggesting a higher resistance of certain genes to modulations in Pol III activity.

Among differentially expressed transcripts, BC200 RNA stands out because of its high expression in the brain,151 and the fact that it was also downregulated in patient-derived fibroblasts carrying different POLR3A mutations. Although BC200 RNA was originally exclusively detected in the brain, recent data indicate that it is also expressed in various primary and immortalized human cell lines,165 albeit at lower levels than in the brain. This is consistent with our own detection of BC200 RNA expression in fibroblasts and HEK293 cells. Despite being a primate-specific transcript, BC200 RNA does have a functional analog in rodents, Bc1 RNA, which is of different evolutionary origin but is also synthesized by Pol III and displays a similar expression pattern, dendritic localization and function.151 Bc1 KO in mice does not lead to hypomyelination or impaired motor function.277 However, considering the documented difficulties of recapitulating human myelin disorders in mice, including by our group in a POLR3-HLD mouse model,290 the absence of a relevant phenotype in Bc1 KO mice is not sufficient to exclude BC200 as a potential mediator of the disease.

In the brain, BC200 is thought to regulate local translation in neuronal dendrites by repressing the function of translation initiation factors eIF4A and eIF4B and/or by mediating the regulatory action of Fragile X mental retardation protein (FMRP).154,163 In human cell lines, it has also been implicated in the regulation of alternative splicing and mRNA stability.291,292 In the oligodendroglial MO3.13 cells, we found that BC200 KO led to alterations in steady-state protein levels that were partially recapitulated in POLR3A mutant cells, suggesting a dose-dependent effect of decreased BC200 RNA expression. For example, MAP1B, which has sequence complementarity with BC200 and is one of its putative targets,154 was significantly increased in BC200-KO by almost 2-fold and in POLR3A-M852V by 50% in our proteomics data (Fig. 4.8f), suggesting that its translation may be directly impacted by reduced BC200 RNA expression. Although BC200 expression was not detected in the white matter of an adult human brain using radioactive in situ hybridization,151 the important protein changes observed in BC200-KO cells suggest that this ncRNA plays a role in the OPC-like MO3.13 cells. Interestingly, expression of FMRP, an interacting partner of BC200, is highest in OPCs early in development and declines as oligodendrocytes differentiate.293 This could also be the case for BC200 RNA, and its reduced expression in developing OPCs could have a detrimental impact and contribute to POLR3-HLD pathogenesis. Nonetheless, considering that BC200 RNA is upregulated and essential for cell growth in several cancers165 and that the MO3.13 cell line was established from a tumor,285 we cannot exclude that its expression or function is specific to this cell line. Thus, validation of our results in primary OPCs or oligodendrocytes represents an important step to confirm the potential involvement of BC200 RNA in myelination. Alternatively, the function of BC200 RNA in dendrites could also contribute to hypomyelination through impaired neuron-to-glia signaling52 or to the loss of cerebellar neurons observed in POLR3-HLD cases,89 considering the massive dendritic arborization of cerebellar Purkinje cells.11

POLR3-HLD is likely the result of impaired expression of several transcripts, which all contribute to the observed phenotypes. In addition to BC200 RNA, we also detected decreased levels of 7SL RNA, an essential component of the signal recognition particle (SRP) that targets nascent peptides to the ER. Depletion of protein components of the SRP causes inefficient ER targeting.139 Decreased levels of 7SL RNA could exert a similar effect, perhaps on proteolipid protein (PLP), one of the major myelin proteins, which transits through the ER and Golgi to reach the cell surface.294 Moreover, we also observed a global reduction in tRNA levels in POLR3A mutants. This lends support to the current leading hypothesis to explain POLR3-HLD pathogenesis, which is that Pol III hypofunction impairs the transcription of certain tRNAs that are essential for the synthesis of proteins involved in CNS myelin development.90,111 In our study, downregulated tRNAs belonged to most tRNA isoacceptor families, and we did not observe a selective effect on certain tRNAs, such as the ones that show higher expression in the brain compared to other tissues117 or the ones that correspond to aminoacyl-tRNA synthetases mutated in other HLDs.124-127 Furthermore, follow-up experiments will be required to determine whether the observed decreased tRNA levels are sufficient to negatively affect mRNA translation. Based on our SILAC experiment, steady-state protein levels were not globally altered in POLR3A mutant cells. Nonetheless, a more informative way to establish whether translation rate is affected would be to label and quantify only nascent peptides or to perform ribosome profiling.251,295 In fact, SILAC on its own cannot distinguish indirect changes in protein levels due to modulations of mRNA expression/stability from direct changes that could be the consequence of altered translation. Thus, a crucial next step will be to acquire RNA-seq data from WT, POLR3A-M852V and BC200-KO MO3.13 cells to compare differences in the transcriptome and the proteome. We must also consider the possibility that reduced tRNA levels only have a detrimental effect at times when the need for newly synthesized proteins is very high, such as during myelinogenesis,54 and remain inconsequential in other spatio-temporal contexts.

In conclusion, our study shows for the first time that a disease-causing POLR3-HLD mutation causes selective alterations in Pol III transcript levels and that BC200 RNA may play a central role in the pathophysiology of POLR3-HLD. Our data pinpoint several Pol III transcripts and candidate downstream targets of BC200 RNA that can be further investigated for their role in oligodendrocytes, their involvement in the disease and their potential as future therapeutic targets.

Materials and Methods

Cell lines
Stable HeLa cell lines expressing FLAG-tagged POLR3A variants (WT, M852V or R1005C) were produced by transfection with Lipofectamine (ThermoFisher) according to the manufacturer’s instructions. For transient expression of the FLAG-tagged POLR3A variants, HEK293 cells were transfected with Lipofectamine for 24 hours. MO3.13 cells were purchased from Cedarlane CELLutions Biosystems (#CLU301). Unless otherwise specified, all cells were grown in DMEM 1X (Wisent #319-005-CL) supplemented with 10% fetal bovine serum (Wisent #080-150) and 1% penicillin-streptomycin-L-glutamine (Wisent #450-202-EL).

Experiments in FLAG-tagged cell lines
Affinity purification, mass spectrometry and immunofluorescence in cell lines expressing FLAG-tagged POLR3A subunits were performed as previously described.90,290 For measurement of POLR3A occupancy at the VTRNA1-1 and tRNA-iMet genes, ChIP-qPCR was performed as previously described.290

CRISPR-Cas9 gene editing
POLR3A mutant cell lines carrying the c.2554A>G (p.M852V) mutation were generated using CRISPR-Cas9 coupled with homology-directed repair (HDR), as previously described.283 BC200 KO cell lines were generated using an approach adapted from Singh et al.,291 with dual sgRNAs targeting upstream and downstream of the BC200 gene (Fig. S4.6d). Detailed methods are described in Supplementary Methods.

RT-PCR
For RT-PCR, 1 µg of RNA was reverse transcribed using the Superscript III Reverse Transcriptase (ThermoFisher #18080-044). For detection of the M852V mutation, exons 17-21 of POLR3A were amplified (see Table S4.6 for primers) and Sanger sequenced (McGill University and Génome Québec Innovation Centre).

Western Blot
Cells were homogenized in lysis buffer [10 mM Tris–HCl, pH 7.5, 150 mM NaCl, 1 mM EDTA, 1% Triton X-100 and protease inhibitors (Roche)] and incubated on ice for 30 minutes, followed by centrifugation at 14,000 rpm for 10 minutes and collection of the supernatant. Protein concentration was determined using DC Protein assay (Bio-Rad). Protein samples were separated on a 4-12% NuPAGE Bis-Tris gel (ThermoFisher #NP0321) and transferred onto a nitrocellulose membrane (Bio-Rad). Immunoblots were probed with anti-POLR3A (Abcam #ab96328), anti-GAPDH (GeneTex #GTX627408) or anti-actin (Abcam #ab3280).

Small RNA-Sequencing
Small RNAs (< 200 nucleotides) were enriched from total RNA using Qiagen miRNeasy (#217004). Libraries were prepared with the KAPA stranded RNA-seq library preparation kit and sequenced on an Illumina HiSeq 2500 with 100 bp single-end reads at the McGill University and Génome Québec Innovation Centre, resulting in an average of 25 million reads/sample. Detailed protocols and data analysis are described in Supplementary Methods.

RNA-Sequencing
Total RNA was extracted using miRNeasy (Qiagen) or Trizol (ThermoFisher) and submitted to Illumina TruSeq rRNA-depleted stranded library preparation. HEK293 libraries were sequenced on Illumina HiSeq 2500 with 125 bp paired-end reads at an average of 42 million reads/sample. Fibroblast libraries were sequenced on Illumina HiSeq 2000 with 50 bp paired-end reads at an average of 60 million reads/sample. Quality control and trimming were performed as previously described.296 Trimmed reads were aligned to the reference genome hg19 using STAR v2.3.0e,297 eliminating reads mapping to more than 10 locations. Expression levels were estimated with featureCounts298 using exonic reads and normalized using DESeq2.299 Differential expression analysis between control and mutant/patient groups was performed with DESeq2 for genes in the Ensembl v75 annotation.299 Genes were considered differentially expressed if they had an adjusted p-value < 0.05 and a base mean > 100. Since Pol III transcripts are highly repetitive, alignment and subsequent analyses were repeated with the inclusion of all multimapping reads to confirm consistency in results.
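The differential expression call described above can be sketched as a simple filter. This is a minimal illustration in Python, not the analysis code used in this study: it assumes a results table with the standard DESeq2 column names (baseMean, log2FoldChange, padj), and the gene names and values are hypothetical.

```python
from typing import Any, Dict, List

# Thresholds taken from the text: adjusted p-value < 0.05 and base mean > 100.
PADJ_CUTOFF = 0.05
BASE_MEAN_CUTOFF = 100.0


def is_differentially_expressed(row: Dict[str, Any]) -> bool:
    """Return True when a gene passes both thresholds.

    `row` mimics one line of a DESeq2 results table. Genes without an
    adjusted p-value (None) are treated as not significant (assumption).
    """
    padj = row.get("padj")
    if padj is None:
        return False
    return padj < PADJ_CUTOFF and row["baseMean"] > BASE_MEAN_CUTOFF


def filter_results(rows: List[Dict[str, Any]]) -> List[str]:
    """Return the identifiers of genes called differentially expressed."""
    return [r["gene"] for r in rows if is_differentially_expressed(r)]


# Hypothetical values, for illustration only.
example = [
    {"gene": "geneA", "baseMean": 850.0, "log2FoldChange": -0.9, "padj": 0.001},
    {"gene": "geneB", "baseMean": 40.0, "log2FoldChange": -1.2, "padj": 0.001},
    {"gene": "geneC", "baseMean": 500.0, "log2FoldChange": 0.1, "padj": 0.6},
    {"gene": "geneD", "baseMean": 300.0, "log2FoldChange": 0.4, "padj": None},
]
print(filter_results(example))
```

In this sketch, geneB fails the base mean threshold despite its low adjusted p-value, which shows how the expression filter removes lowly expressed genes from the list.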

qRT-PCR
For qRT-PCR of tRNAs and BC200 RNA, 1 µg of RNA was reverse transcribed using the Superscript III Reverse Transcriptase. Real-time PCR was performed in technical duplicates using the SYBR GreenER™ qPCR SuperMix (ThermoFisher #11760100) on a BioRad CFX96. The ΔΔCt method was used to calculate relative RNA expression, with normalization to genes PSMB6, SDHA and COPS7A (HEK293 cells) or NDUFS2 and PMM1 (MO3.13 cells). All primers used are indicated in Table S4.6.
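For readers unfamiliar with the ΔΔCt calculation, a minimal Python sketch follows. It assumes that the reference genes are combined by averaging their Ct values and that amplification efficiency is perfect (a factor of 2 per cycle); neither choice is specified in the text, and the Ct values are hypothetical.

```python
from statistics import mean
from typing import Sequence


def delta_ct(target_ct: float, reference_cts: Sequence[float]) -> float:
    """Ct of the target minus the mean Ct of the reference genes.

    Averaging the reference Ct values is an assumption of this sketch.
    """
    return target_ct - mean(reference_cts)


def relative_expression(
    sample_target_ct: float,
    sample_reference_cts: Sequence[float],
    calibrator_target_ct: float,
    calibrator_reference_cts: Sequence[float],
    efficiency: float = 2.0,
) -> float:
    """Relative expression of the sample versus the calibrator (2^-ddCt)."""
    ddct = delta_ct(sample_target_ct, sample_reference_cts) - delta_ct(
        calibrator_target_ct, calibrator_reference_cts
    )
    return efficiency ** (-ddct)


# Hypothetical Ct values: one mutant compared with one control, with three
# reference genes (e.g. PSMB6, SDHA and COPS7A in HEK293 cells).
mutant_vs_control = relative_expression(
    sample_target_ct=21.0,
    sample_reference_cts=[18.0, 19.0, 20.0],
    calibrator_target_ct=20.0,
    calibrator_reference_cts=[18.0, 19.0, 20.0],
)
print(mutant_vs_control)
```

With these values the target amplifies one cycle later in the mutant while the reference genes are unchanged, so the relative expression is halved.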

SILAC
Cells were grown in DMEM for SILAC (ThermoFisher, #88364) supplemented with 10% dialyzed fetal bovine serum (ThermoFisher, #A3382001) and light, medium or heavy amino acids for MO3.13 WT, BC200-KO and POLR3A-M852V, respectively (Fig. 4.8a). Light amino acids were unlabeled lysine (ThermoFisher, #88429) and arginine (ThermoFisher, #89989), medium amino acids were Lys4 (ThermoFisher, #88437) and Arg6 (ThermoFisher, #88210) and heavy amino acids were Lys8 (ThermoFisher, #88209) and Arg10 (ThermoFisher, #89990). The three media were also supplemented with L-proline (ThermoFisher, #88211). After > 6 passages in the SILAC media, cells were harvested, washed in cold PBS and lysed in 1.5% n-dodecyl-β-D-maltoside (DDM) in PBS. Protein concentration was determined using DC Protein assay (Bio-Rad) and equal quantities of protein (25 or 30 µg) from the three conditions were mixed. Proteins were precipitated with methanol-chloroform and pellets were air-dried and stored at −80 °C. Proteins were digested with trypsin and the resulting peptides were purified using Oasis MCX (Waters). Peptides were analyzed via LC-MS/MS with a microcapillary reversed-phase, high-pressure LC-coupled LTQ-Orbitrap (ThermoElectron) mass spectrometer. Raw data files were processed with MaxQuant300 v.1.6.0.16. MS/MS spectra were matched against the human Uniprot annotation (downloaded on November 11, 2017). Methionine oxidation was set as a variable modification and Lys4/Arg6 and Lys8/Arg10 were set as medium and heavy labels, respectively. Trypsin/P was defined as enzyme, allowing for two missed cleavages. FDR threshold was set to 0.01 for peptide and protein identifications. Minimal ratio count was set to 2 for protein quantification and the functions “match between runs”, “requantify” and “match from and to” were enabled. Normalized MaxQuant ratios were used for subsequent analyses with Perseus v.1.5.6.0. Known protein contaminants were removed from the analysis. Protein groups were kept for analysis if they were detected in at least four out of six biological replicates. Conditions (POLR3A-M852V or BC200-KO vs. MO3.13-WT) were compared in pairs using a one-sample t-test on log2 ratios. Multiple testing correction was performed using the Benjamini-Hochberg method. The threshold for statistical significance was set at FDR < 0.05. Subsequent analyses were performed using custom scripts in R. We used the software GOrilla301 for GO analysis, with all detected proteins as background.
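The statistical steps applied to the SILAC ratios (replicate filter, one-sample t-test on log2 ratios, Benjamini-Hochberg correction) can be illustrated with the following Python sketch. It is not the Perseus implementation used in this study: function names and example values are hypothetical, and only the t statistic is computed, since deriving its p-value would require a statistics library.

```python
import math
from statistics import mean, stdev
from typing import List, Optional, Sequence


def enough_replicates(log2_ratios: Sequence[Optional[float]], minimum: int = 4) -> bool:
    """Keep a protein group only if it was quantified in >= `minimum` replicates."""
    return sum(v is not None for v in log2_ratios) >= minimum


def one_sample_t(log2_ratios: Sequence[Optional[float]], mu: float = 0.0) -> float:
    """t statistic for the null hypothesis that the mean log2 ratio equals mu."""
    values = [v for v in log2_ratios if v is not None]
    standard_error = stdev(values) / math.sqrt(len(values))
    return (mean(values) - mu) / standard_error


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical log2 (mutant/WT) ratios for one protein group in six replicates;
# None marks a replicate in which the protein group was not quantified.
ratios = [0.5, 0.7, 0.6, None, 0.4, 0.8]
if enough_replicates(ratios):
    print(round(one_sample_t(ratios), 3))

# Hypothetical raw p-values for four protein groups.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Testing log2 ratios against zero is what makes this a one-sample design: each SILAC ratio already encodes the mutant-to-WT comparison within a replicate.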

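Figures

[Figure 4.1 image. Panel a: heatmap of co-purified Pol III subunits (specific Pol III subunits: POLR3A, POLR3B, POLR3C, POLR3D, POLR3E, POLR3F, POLR3G, POLR3H, CRCP; shared subunits: POLR1C, POLR1D, POLR2E, POLR2H). Panel b: immunofluorescence images (FLAG, DNA, Merge) for WT and M852V. Panel c: ChIP-qPCR. Scale ticks shown in the figure: −10 to 2.]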

Figure 4.1. The POLR3A M852V mutation leads to decreased Pol III chromatin occupancy. a) FLAG-tagged variants of POLR3A (WT or M852V) were expressed at equivalent levels in HeLa cells and purified using anti-FLAG affinity chromatography. The co-purified proteins were identified by LC-MS/MS. The heatmap contains the log2-transformed average spectral count ratios of mutant/WT across both replicates. Spectral counts were computed with Mascot. Specific and shared (with Pol I and/or Pol II) subunits are identified on the left. b) Immunofluorescence experiment showing the subcellular localization of FLAG-tagged variants of POLR3A. Scale bar = 20µm. c) ChIP-qPCR performed against FLAG-tagged variants expressed transiently at equivalent levels in HEK293 cells. The chromatin was quantified by qPCR with primers for two Pol III target gene promoters (VTRNA1-1 and tRNA-iMet). Pol III enrichment at these loci was calculated relative to a locus on chromosome 13 that is not bound by Pol III. Data are represented as mean +/− SEM of biological duplicates.

[Figure 4.2 image. Panel a: schematic of POLR3A exon 19 at position c.2554, with labels Control (A); M1 (G, p.M852V; delT, p.F833LfsX10); M2 (G, p.M852V; ins A, p.M852RfsX8); M3 (G, p.M852V, homozygote). Panel b: bar plot of POLR3A expression (RPKM) per sample (C1-C3, M1-M3). Panel c: Western blot of POLR3A and GAPDH in C1-C3 and M1-M3, with size markers in kDa. Panel d: ratio POLR3A / GAPDH for Controls, M1, M2 and M3.]

Figure 4.2. Characterization of POLR3A mutant clones obtained by CRISPR-Cas9. a) Schematic of the genotypes in POLR3A mutant clones. Mutants M1 and M2 are compound heterozygous for the M852V mutation and a deletion/insertion leading to a premature stop codon. The predicted protein change is indicated on the right. Mutant M3 is homozygous for the M852V mutation, but this results in partial exon 19 skipping at the mRNA level (see Fig. S4.1 for details). Exon and intron sizes are not to scale. b) POLR3A mRNA levels in control and mutant clones, quantified by RNA-seq. Mutant M3 has normal levels of POLR3A mRNA, suggesting exon 19 skipping does not affect mRNA stability. c) POLR3A protein levels in control and mutant clones. GAPDH was used as a loading control. d) Quantification of POLR3A protein levels normalized to GAPDH levels. The three controls were grouped together and represented as mean +/− SEM. Abbreviations: ins: insertion, delT: deletion of a T nucleotide.
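The quantification in panel d can be summarized with a short Python sketch: each POLR3A band intensity is divided by the GAPDH intensity of the same lane, and the three control ratios are then summarized as mean and SEM. The intensities below are hypothetical and the function names are illustrative, not taken from the analysis scripts.

```python
import math
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def normalize_to_loading_control(
    target: Sequence[float], loading: Sequence[float]
) -> List[float]:
    """Divide each target band intensity by the loading control of its lane."""
    return [t / l for t, l in zip(target, loading)]


def mean_and_sem(values: Sequence[float]) -> Tuple[float, float]:
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    return mean(values), stdev(values) / math.sqrt(len(values))


# Hypothetical band intensities (arbitrary units) for controls C1-C3.
polr3a_controls = [1000.0, 1200.0, 1100.0]
gapdh_controls = [1000.0, 1000.0, 1000.0]

control_ratios = normalize_to_loading_control(polr3a_controls, gapdh_controls)
control_mean, control_sem = mean_and_sem(control_ratios)
print(control_ratios, round(control_mean, 3), round(control_sem, 3))
```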

[Figure 4.3 image: violin plots of log2 fold change. Panel a, large RNAs (≥ 200 nt): Other (n=14,034) vs. Pol III transcripts (n=25); small RNAs (< 200 nt): Other (n=216) vs. Pol III transcripts (n=301). Panel b, large RNAs (≥ 200 nt): Other (n=14,034), Type 2 (n=12), Type 3 (n=13); small RNAs (< 200 nt): Other (n=216), Type 1 (n=31), Type 2 (n=223), Type 3 (n=47).]

Figure 4.3. Pol III transcripts with a type 2 promoter are decreased in POLR3A mutants. a) Violin plots representing the distribution of expression log2 fold change (mutants/controls) for Pol III transcripts detected in rRNA-depleted RNA-seq (left) or small RNA-seq (right) compared to all other expressed genes in the respective datasets. b) Same as a), but Pol III transcripts were partitioned into the three types of Pol III promoters. White dots indicate the median for each group. ***: p-value < 0.0001, Wilcoxon rank-sum test, ns: non-significant.

[Figure 4.4 image: MA plots (log2 fold change vs. log10 mean of normalized counts) for tRNAs, snRNAs and snoRNAs in panels a and b. Points with FDR < 0.05 are highlighted.]

Figure 4.4. Differential expression analysis shows decreased tRNA levels in mutants. MA plots for all detected tRNAs, snRNAs and snoRNAs in the small RNA-seq data, showing a general decrease in tRNA levels in a) three mutants (M1-M3) compared to three controls (C1-C3) and b) biological triplicates of mutant M2 and control C3. In contrast, snRNAs and snoRNAs are not affected by the POLR3A mutation, consistent with the fact that the majority are not synthesized by Pol III.

[Figure 4.5 image. Panel a: "Differentially expressed tRNAs by isoacceptor", a bar plot of the number of detected tRNA genes per isoacceptor, classed as significant decrease (FDR < 0.05), non-significant decrease (FDR > 0.05), non-significant increase (FDR > 0.05) or low coverage. Panel b: qRT-PCR results.]

Figure 4.5. Global decrease in tRNA levels in POLR3A mutants. a) Differential expression results from small RNA-seq for detected tRNA genes, categorized by isoacceptor family, showing that affected tRNAs belong to most families. tRNAs with low coverage have mean expression < 10. b) Expression of three tRNA genes measured by qRT-PCR in total RNA from three controls (C1-C3) and three mutants (M1-M3). tRNA gene expression was normalized to the stable genes SDHA, PSMB6 and COPS7A using the ΔΔCt method.

[Figure 4.6 image. Panel a: volcano plot (−log10 p-value vs. log2 fold change) highlighting Pol III transcripts with a type 2 promoter (BC200 RNA, 7SL RNA) and a type 3 promoter (RN7SK, RPPH1, RMRP). Panel b: total normalized expression of expression-positive Alu elements per sample (C1-C3, M1-M3).]

Figure 4.6. Large Pol III transcripts with a type 2 promoter are decreased in POLR3A mutants. a) Volcano plot representing the results of differential expression analysis between controls and mutants using rRNA-depleted RNA-seq (long RNA transcriptome). Expressed genes and pseudogenes encoding Pol III transcripts are shown in distinct colours based on their promoter type. b) Expression levels of Alu elements predicted to be Pol III-transcribed (see Supplementary Methods for details). Reads mapping to Alu elements were summed and normalized to library size.
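The normalization used in panel b (summing reads over Alu elements and scaling by library size) can be sketched as follows in Python. The per-million scaling factor is an assumption of this sketch, and the element names and read counts are hypothetical.

```python
from typing import Dict


def normalized_alu_expression(
    alu_counts: Dict[str, int], library_size: int, scale: float = 1_000_000
) -> float:
    """Sum reads over all Alu elements and express them per `scale` reads.

    The per-million scaling factor is an assumption of this sketch.
    """
    if library_size <= 0:
        raise ValueError("library_size must be positive")
    return sum(alu_counts.values()) * scale / library_size


# Hypothetical read counts for three Alu elements in one sample.
counts = {"AluY_1": 120, "AluSx_2": 80, "AluJb_3": 50}
total_alu = normalized_alu_expression(counts, library_size=50_000_000)
print(total_alu)
```

Summing before normalizing yields one value per sample, which is how the per-sample comparison between controls and mutants in panel b is framed.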

[Figure 4.7 image. Panels a and b: IGV tracks of the BC200 locus on chr2 (47,562,400 to 47,562,600) for controls vs. mutants and controls vs. patients; log2FC values shown in the figure: −0.57 and −0.87. Panel c: microarray signal (a.u.) in controls and patients, with carriers of the M852V mutation marked.]

Figure 4.7. BC200 RNA levels are decreased in POLR3-HLD patient fibroblasts. a) IGV screenshot of normalized BC200 expression in HEK293 controls and POLR3A mutants. All samples are represented on the same scale (0-2900). b) IGV screenshot of normalized BC200 expression in primary fibroblasts from controls and POLR3-HLD patients (see genotypes in Table S4.4). All samples are represented on the same scale (0-2140). c) Microarray signal for a probe targeting the 3’ unique sequence of BC200 RNA in fibroblasts from controls and POLR3-HLD patients. The two carriers of the M852V mutation are indicated by a black dot. The two groups were compared with a one-sided Student’s t-test. *p-value < 0.05, a.u.: arbitrary units.

[Figure 4.8 image. Panel a: SILAC workflow (MO3.13-WT, light; BC200-KO, medium; POLR3A-M852V, heavy; mixed 1:1:1, trypsin digestion, LC-MS/MS). Panel b: volcano plots (−log10 p-value vs. mutant/control log2 ratio) for POLR3A-M852V and BC200-KO, highlighting proteins with FDR < 0.05 and |log2 FC| > 0.5. Panel c: density of absolute log2 (MUT/WT). Panel d: log2 FC in BC200-KO vs. log2 FC in POLR3A-M852V. Panel e: log2 FC in POLR3A-M852V for proteins with negative or positive log2 FC in BC200-KO. Panel f: log2 FC of MAP1B, PPIB, NGFR and LAMB1 in both mutants.]

Figure 4.8. BC200 KO causes important changes at the proteome level in MO3.13 cells. a) Overview of the SILAC protocol. Light refers to unlabeled amino acids, medium to Lys4 and Arg6 and heavy to Lys8 and Arg10. b) Identification of proteins that are differentially abundant in POLR3A-M852V or in BC200-KO compared to MO3.13-WT. c) Distribution of the absolute log2 fold change for POLR3A-M852V/WT and BC200-KO/WT. More proteins have high fold changes in BC200-KO. d) Correlation in fold change of POLR3A-M852V and BC200-KO for proteins that show statistically significant differences (FDR < 0.05) in both conditions compared to MO3.13-WT. Proteins are coloured based on their absolute fold change being greater than 0.5 in both conditions (orange), in one condition (shades of blue) or in neither (grey). e) Distribution of fold change in POLR3A-M852V for proteins that showed statistically significant differences in both conditions (FDR < 0.05) and had a substantial fold change in BC200-KO (|log2 FC| > 0.5). Proteins tend to change in the same direction in both conditions, but with lower effect size in POLR3A-M852V compared to BC200-KO. ***p = 5.52 × 10⁻⁸, Wilcoxon rank-sum test. f) Example of four proteins showing significant differences in both mutants and with higher effect size in BC200-KO than in POLR3A-M852V. In all panels, fold change (FC) refers to the ratios POLR3A-M852V/MO3.13-WT or BC200-KO/MO3.13-WT.

Acknowledgements and Funding
We would like to thank the McGill University and Génome Québec Innovation Centre for Sanger and next-generation sequencing services, the Genomics Platform of the Institute for Research in Immunology and Cancer for qRT-PCR services, LC Sciences for microarray experiments and the IRCM Proteomics Discovery Platform for mass spectrometry services. We are grateful to Drs. Adeline Vanderver and Raphael Schiffmann for providing us with patient-derived fibroblasts, and wish to thank all patients for their participation in this research program. We would also like to acknowledge the Fondation Leuco Dystrophie for their constant encouragement and for supporting earlier expression analysis studies. This study was supported by grants from the Canadian Institutes of Health Research (CIHR) (MOP#126141) and the European Leukodystrophy Association and by computing resources from Compute Canada. KC received doctoral awards from the Fonds de Recherche en Santé – Québec (FRQS) and the Fondation Grand Défi Pierre Lavoie, and is a recipient of the Jewish General Hospital National Bank Molecular Pathology Scholarship. GB and CLK receive salary awards from the CIHR and the FRQS, respectively.

Supplementary Materials

Supplementary Methods

CRISPR-Cas9 gene editing
For POLR3A mutant cell lines carrying the c.2554A>G (p.M852V) mutation, we used CRISPR-Cas9 with homology-directed repair (HDR), as previously described.283 Briefly, we selected two single guide RNAs (sgRNA) targeting the region neighboring the POLR3A c.2554A>G position using the CRISPR design tool http://crispr.mit.edu/ (Table S4.6). Oligonucleotides corresponding to both strands of the sgRNA sequences were annealed and cloned into the plasmid pSpCas9(BB)-2A-Puro (PX459) (#48139, Addgene) as previously described.283 HDR templates containing the mutated nucleotide were synthesized (IDT) as single-strand DNA oligonucleotides (ssODN) with flanking homology arms of 40 or 90 nucleotides. Synonymous mutations in the PAM or sgRNA target sequences were also added to the ssODNs to prevent re-editing.302 HEK293 or MO3.13 cells (1 × 10⁶ cells per reaction) were re-suspended in 100 µL of Ingenio® Electroporation Solution (Mirus Bio LLC) and nucleofected with the sgRNA-containing plasmid (2.5 µg) and ssODN (75 pmol) using a Lonza-Amaxa® Nucleofector Device, then seeded into two wells of a 6-well plate. Forty-eight hours later, puromycin was added to the media at a concentration of 1 µg/mL to select for cells having integrated the plasmid. Puromycin-resistant clonal cell colonies were subsequently screened using PCR amplification of the corresponding exon followed by Sanger sequencing (McGill University and Génome Québec Innovation Centre). Clones were considered positive if they carried the M852V mutation on all alleles (homozygous) or in compound heterozygosity with an indel causing a frameshift and premature stop codon. Clones that did not acquire mutations were used as controls for subsequent experiments. For compound heterozygous clones in MO3.13 cells, PCR products were sequenced on an Illumina MiSeq (McGill University and Génome Québec Innovation Centre) to confirm that the missense mutation and the deletion were on different alleles.

For BC200 KO cell lines, we used a CRISPR-Cas9 approach adapted from Singh et al.,291 with dual sgRNAs targeting upstream and downstream of the BC200 gene (Fig. S4.6d). We tested four different combinations of sgRNAs: g1+g3, g1+g4, g2+g3, g2+g4 (Table S4.6). sgRNA sequences were cloned into the plasmid pSpCas9(BB)-2A-Puro (PX459) as described above. Plasmids (2.5 µg) were transfected into MO3.13 cells with Lipofectamine 3000 (ThermoFisher #L3000008) and puromycin was added to the media at a concentration of 1 µg/mL after forty-eight hours. Puromycin-resistant clonal cell colonies were screened by PCR for the presence of a band of approximately 200 bp corresponding to the targeted region without the BC200 gene (Fig. S4.6e). To confirm complete deletion of the BC200 gene, PCR products were sequenced on an Illumina MiSeq (Fig. S4.6f). We identified two clones with deletion of BC200 on all alleles (Fig. S4.6f). Although qRT-PCR showed low levels of remaining BC200 RNA expression in these clones (Fig. S4.6g), closer inspection of the melt curve suggests that this is due to non-specific amplification and/or primer dimers (Fig. S4.6h). All primers used for gRNA cloning, PCR and sequencing are indicated in Table S4.6. For each sgRNA targeting POLR3A or BC200, we screened the top five predicted off-target sites identified by http://crispr.mit.edu/ by PCR and Sanger sequencing.

Genome-wide profiling of small Pol III transcripts

Sample preparation

Since Pol III transcripts range in size from 70 to 330 nucleotides, we used two complementary RNA-seq approaches, consisting of rRNA-depleted RNA-seq for transcripts ≥ 200 nt and a modified small RNA-seq approach aimed at measuring the levels of Pol III transcripts < 200 nt, with a special focus on tRNA precursors (pre-tRNAs). Because of their short half-lives, pre-tRNAs have been shown to provide a more reliable estimate of Pol III transcription compared to mature tRNAs.193,194,249 They are also easier to quantify by RNA-seq because they have not yet acquired the post-transcriptional modifications or complex secondary structure that can interfere with reverse transcription. However, pre-tRNAs are only present at low coverage in standard RNA-seq data because of their size (~100 nt). Conversely, commercial small RNA-seq kits are often biased towards Dicer or Drosha-processed small RNAs and do not offer optimal coverage of Pol III transcripts. To overcome these limitations, we enriched total RNA extractions for small RNAs (< 200 nt), directly followed by random priming, cDNA synthesis and next-generation sequencing (Fig. S4.3a). Small RNA enrichments were performed using the miRNeasy kit (Qiagen) with the modifications outlined in Appendix A of the manufacturer’s protocol to allow for separation of RNAs smaller than 200 nucleotides.

To monitor the level of small RNA enrichment in each sample, we generated three synthetic spike-in RNAs of different sizes (70, 94 and 250 nt) selected from previous publications.303-305 The two small spike-in RNAs were chosen for their similar size to that of Pol III transcripts: 70 nt for SS-70 in Locati et al.303 and 94 nt for the synthetic spike-in from Zhong et al.304 The larger 250 nt spike-in RNA corresponds to ERCC-00051 from the ERCC spike-in set,305 but without the polyA tail. PCR products to be used for in vitro transcription reactions were generated using a G-block (IDT) template and primer pairs corresponding to each spike-in RNA (Table S4.6). In vitro transcription was performed using PCR products as individual templates with a MAXIscript T7 in vitro Transcription Kit (Ambion). Completed reactions were treated with TurboDNase (Ambion) and subsequently loaded onto mini Quick Spin RNA Columns (Roche) to remove unincorporated nucleotides. RNA was phenol-chloroform extracted and analyzed by 7% denaturing PAGE. Full-length RNA molecules were then eluted from the gel and quantified by Nanodrop. Synthetic spike-in RNAs were mixed at an equimolar concentration of 4 × 10⁻⁹ mol/L and 0.5 µL of this mix was added to miRNeasy Qiazol lysis buffer after cells were homogenized.
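As a quick check of the spike-in quantities above, the amount of each spike-in RNA added per sample can be worked out from the stated concentration and volume. The sketch below assumes that 4 × 10⁻⁹ mol/L is the concentration of each individual spike-in in the mix; the function name `spike_in_amount` is illustrative only.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def spike_in_amount(conc_mol_per_l: float, volume_ul: float):
    """Return (moles, molecules) of one spike-in RNA contained in the added volume."""
    volume_l = volume_ul * 1e-6            # convert microlitres to litres
    moles = conc_mol_per_l * volume_l
    return moles, moles * AVOGADRO


# Assumption: each spike-in is at 4e-9 mol/L in the mix, and 0.5 µL is added per sample.
moles, molecules = spike_in_amount(4e-9, 0.5)
print(f"{moles * 1e15:.1f} fmol, about {molecules:.2e} molecules per spike-in")
```

Under that assumption, each sample receives about 2 fmol of each spike-in RNA.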

Small RNA enrichment was confirmed using an Agilent Bioanalyzer (Fig. S4.3a). Libraries from three HEK293 mutant clones (M1-M3) and three control clones (C1-C3) were prepared with the KAPA stranded RNA-seq library preparation kit and sequenced on an Illumina HiSeq 2500 with 100 bp single-end reads. Since the three mutant clones have slightly different genotypes (Fig. 4.2a), we also sequenced small RNA-seq libraries from biological triplicates of the mutant clone with the lowest POLR3A expression (M2, see Fig. 4.2b) and a control clone (C3), in order to assess the impact of POLR3A hypofunction in the worst-case scenario.

Data analysis

Quality control and trimming were performed as previously described.296 Trimmed reads were aligned to the reference genome hg19 using STAR v2.3.0e,297 including reads mapping to up to 100 locations. tRNA gene body coverage was assessed with RSeQC306 using windows encompassing each tRNA gene and 20 bp upstream and downstream. This showed good coverage of the tRNA gene body (Fig. S4.4a), with only a slight drop in coverage near the 3’ end. Expression levels were estimated with featureCounts298 using exonic reads in three successive runs with different parameters to treat multimapping reads: i) uniquely mapped reads only; ii) all multimapping reads, counting primary alignments only; and iii) all multimapping reads, counting primary and secondary alignments. All subsequent analyses were performed with the three types of counts. Unless otherwise specified, results are reported for option ii), but general agreement of results was verified with the three options. Expression levels of tRNA precursors were estimated by counting reads mapping at least partially to tRNA introns, leader (20 bp upstream) or trailer (20 bp downstream) sequences, using featureCounts298 and custom scripts. Pre-tRNA reads represented on average 74.9% of the total number of uniquely mapped tRNA reads and 37.8% of all mapped tRNA reads, while the remainder were exonic reads that could not distinguish between mature and pre-tRNAs (Fig. S4.4b). We used both types of reads for further analyses.
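The distinction between precursor and exonic tRNA reads described above can be illustrated with a simple interval check. This is a minimal sketch, not the custom scripts used in the study: the function names, the half-open coordinate convention and the single optional intron are assumptions made for the example.

```python
from typing import Optional, Tuple

Interval = Tuple[int, int]  # half-open [start, end) coordinates on one chromosome


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def classify_trna_read(read: Interval, gene: Interval,
                       intron: Optional[Interval] = None,
                       flank: int = 20) -> str:
    """Label a read as 'precursor', 'exonic' or 'outside' relative to one tRNA gene.

    A read is called 'precursor' if it maps at least partially to the intron or
    to the 20 bp flanks (leader or trailer). Both flanks are treated the same
    way, so the gene strand does not change the label.
    """
    flank_before = (gene[0] - flank, gene[0])
    flank_after = (gene[1], gene[1] + flank)
    precursor_regions = [flank_before, flank_after]
    if intron is not None:
        precursor_regions.append(intron)

    if any(overlaps(read, region) for region in precursor_regions):
        return "precursor"
    if overlaps(read, gene):
        return "exonic"   # cannot distinguish mature tRNA from pre-tRNA
    return "outside"
```

For example, with a gene at `(100, 172)`, a read at `(90, 130)` touches the upstream flank and is labeled as precursor, whereas a read at `(110, 160)` is labeled as exonic unless the gene has an intron within that range.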

To assess small RNA enrichment, we calculated the ratio of counts from the small spike-ins over counts from the large spike-in. As a second measure of enrichment, we quantified the library size factors with DESeq2 for small (< 200 nt) and large RNAs (≥ 200 nt), respectively. The ratio of size factors was highly correlated to the ratio of spike-in counts (Fig. S4.4c), indicating that both measures can be used to assess small RNA enrichment level. Thus, to account for small RNA enrichment variability during subsequent analyses, expression levels of small and large RNAs were normalized with their respective size factors, using DESeq2.299 For transcripts with multiple isoforms, the longest isoform was used. For small RNAs, tRNAs were excluded from the size factor calculation since they represent ~38% of expressed small transcripts and could thus skew the size factors if they are differentially expressed in mutants. After normalization, small and large transcripts were combined for the remainder of the differential expression analysis workflow with DESeq2. Differentially expressed genes were considered statistically significant if adjusted p-value (FDR) < 0.05 and mean expression >10.
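To make the normalization step above concrete, the following sketch computes median-of-ratios size factors (the estimator used by DESeq2) separately for small and large transcripts, excluding tRNAs from the small-RNA calculation, and returns their ratio per sample. It is a plain-Python illustration rather than the DESeq2 implementation; the function names, the input dictionaries and the `trna_genes` set are hypothetical.

```python
import math
from statistics import median
from typing import Dict, List, Set


def size_factors(counts: Dict[str, List[float]]) -> List[float]:
    """Median-of-ratios size factors; counts maps gene -> per-sample counts."""
    n_samples = len(next(iter(counts.values())))
    # Log geometric mean per gene, using only genes with non-zero counts everywhere
    log_geo_means = {
        gene: sum(math.log(c) for c in row) / n_samples
        for gene, row in counts.items()
        if all(c > 0 for c in row)
    }
    if not log_geo_means:
        raise ValueError("no gene has non-zero counts in all samples")
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(counts[g][j]) - m for g, m in log_geo_means.items()]
        factors.append(math.exp(median(log_ratios)))
    return factors


def size_factor_ratio(counts: Dict[str, List[float]],
                      lengths: Dict[str, int],
                      trna_genes: Set[str],
                      cutoff: int = 200) -> List[float]:
    """Per-sample ratio SF_small / SF_large, with tRNAs left out of SF_small."""
    small = {g: c for g, c in counts.items()
             if lengths[g] < cutoff and g not in trna_genes}
    large = {g: c for g, c in counts.items() if lengths[g] >= cutoff}
    sf_small = size_factors(small)
    sf_large = size_factors(large)
    return [s / l for s, l in zip(sf_small, sf_large)]
```

In this toy setting, a sample with twice as many small-RNA counts as another, and identical large-RNA counts, yields a ratio twice as high, which is the behavior expected of an enrichment measure.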

Expression of Pol III-transcribed Alu elements

Expression of Pol III-transcribed Alu elements was measured as previously reported.307 Briefly, to restrict our analysis to Pol III-transcribed elements, we filtered out Alu elements with expression immediately upstream or downstream of the Alu gene body, since they are more likely to be embedded in Pol II-transcribed genes and therefore not be transcribed by Pol III. To do so, we defined windows of 350 nt upstream and downstream of each expressed Alu gene body (average number of reads in control samples > 10). Alu elements were filtered out if they met the following criteria for being likely embedded into Pol II-transcribed genes: upstream window coverage > 1/7 of Alu body coverage or downstream window coverage > 1/2 of Alu body coverage.307 The counts for the remaining “expression-positive” Alu elements (n=282) were summed in each sample and normalized using DESeq2.
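The filtering criteria above translate directly into a small predicate. The sketch below is illustrative; the function and parameter names are hypothetical, and coverage values are assumed to have been computed beforehand for the Alu body and the two 350 nt flanking windows.

```python
def is_pol3_alu(body_cov: float, upstream_cov: float, downstream_cov: float,
                mean_control_reads: float, min_reads: float = 10) -> bool:
    """Return True if an Alu element is kept as 'expression-positive'.

    An element must be expressed (average reads in controls > min_reads) and
    must not look embedded in a Pol II-transcribed gene, i.e. neither
    upstream coverage > 1/7 of body coverage nor downstream coverage > 1/2
    of body coverage.
    """
    if mean_control_reads <= min_reads:
        return False  # not considered expressed
    embedded = (upstream_cov > body_cov / 7) or (downstream_cov > body_cov / 2)
    return not embedded
```

The asymmetric thresholds are taken as stated in the text; elements passing this check would then be summed per sample before normalization.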

Microarray

Total RNA was extracted from control and patient fibroblasts in triplicate using miRNeasy (Qiagen). LC Sciences (Houston, TX, USA) generated a custom microarray including three different probes (~22 nt) for each known Pol III transcript and pseudogene. Briefly, custom probes were synthesized on Paraflo® microfluidic chips. 5 µg of total RNA was reverse transcribed and hybridized to the microarray in biological triplicates (LC Sciences). Following cross-array normalization of samples, differential expression between controls and patients was assessed using a Student’s t-test with a threshold p-value of 0.05. Probes with a mean intensity signal < 100 were excluded from analysis. Of the three probes targeting BC200 RNA, two had very low signal intensity and were excluded. The remaining probe, 5’-CGTAACTTCCCTCAAAGCAACAACCCC-3’, targeted the unique 3’ region of the transcript and showed a statistically significant difference between patients and controls.

[Figure S4.1 image. a) Chromatograms at c.2554A>G: control, compound heterozygous (M1), homozygous (M3). b) Gel with lanes C1, C2, C3, C4, M1, M2, M3 and NO-RT; size marker in nt (200 to 500). c) Sashimi plot, chr10:79754891-79759149, POLR3A exon 19: controls, M1, M2, M3.]

Figure S4.1. Characterization of POLR3A mutant clones. a) Genomic DNA sequence chromatograms of control and POLR3A mutant clones. Red arrows indicate the mutation of interest (POLR3A c.2554A>G). As shown by the double sequence, compound heterozygous clones also carry an indel causing a frameshift and a premature stop codon. b) Electrophoresis of the PCR product from the amplification of exons 17-21 of POLR3A cDNA in four control clones (C1-C4) and three POLR3A mutant clones (M1-M3) in HEK293 cells. Mutant M3 shows a lower band corresponding to exon 19 skipping (orange arrow). c) Sashimi plot showing partial exon 19 skipping in mutant M3 (highlighted by the orange oval), while it has 100% inclusion in other mutants and controls. Exon 19 is where the M852V mutation is located.

[Figure S4.2 image. gDNA and cDNA chromatograms at position 2554 for a control and a patient.]

Figure S4.2. POLR3A M852V mutation in patient-derived fibroblasts. gDNA and cDNA sequence chromatograms of a control and a patient carrying the c.2554A>G mutation in compound heterozygosity with a null allele. Both alleles are visible in the gDNA, while the missense allele is predominant in the cDNA of the patient, suggesting that the null allele is degraded.

[Figure S4.3 image. a) Workflows of three protocols. 1) Custom small RNA-seq: enrichment of RNAs < 200 nt (Bioanalyzer trace, signal intensity in FU versus size in nt), random priming, cDNA synthesis, end repair, A-tailing, adaptor ligation, PCR, sequencing (SE100). 2) Small RNA-seq: total RNA (including small RNAs), adaptor ligation (targets small RNAs with 5’ phosphate and 3’ OH), cDNA synthesis + PCR, size selection, sequencing (SE50). 3) rRNA-depleted RNA-seq: total RNA, rRNA depletion, fragmentation, cDNA synthesis, end repair, A-tailing, adaptor ligation, PCR, sequencing (PE125). b) All transcripts: proportion of read counts for transcripts ≥ 200 nt and < 200 nt in Protocol 1 and Protocol 3. c) Pol III transcripts: proportion of read counts for transcripts ≥ 200 nt, other small transcripts (< 200 nt) and tRNAs (< 200 nt) in Protocol 1 and Protocol 3.]

Figure S4.3. Optimization of a small RNA-seq protocol. a) Overview of the workflow for the custom small RNA-seq protocol used in this study compared to traditional small RNA-seq and rRNA-depleted RNA-seq approaches. b) Proportions of read counts mapping to small and large transcripts using custom small RNA-seq (Protocol 1) compared to rRNA-depleted RNA-seq (Protocol 3). c) Proportion of read counts mapping to tRNAs, other small Pol III transcripts or large Pol III transcripts in the two protocols. b) and c) show an enrichment of small transcripts in Protocol 1 compared to Protocol 3. SE: single-end, PE: paired-end, nt: nucleotide.

[Figure S4.4 image. a) Normalized coverage versus tRNA gene body percentile (5’ to 3’) for M1 (MUT) and C1 (WT). b) Proportion of tRNA reads that are exonic or precursor, for primary reads and uniquely mapped reads. c) Ratio of size factors (SF_small / SF_large) versus ratio of spike-in counts for controls and mutants; R = 0.96.]

Figure S4.4. Analysis of small RNA-seq data. a) Representative image of the aggregate coverage at all tRNA gene bodies in samples M1 and C1. Coverage was computed with RSeQC for each tRNA gene, with 20 bp upstream and downstream. b) Proportion of tRNA reads mapping to precursor or exonic regions of tRNAs, using all reads (each read counted once only) or uniquely mapped reads. c) Correlation between the ratio of spike-in counts and the ratio of DESeq2 size factors calculated for small (< 200 nt) and large (≥ 200 nt) transcripts.

[Figure S4.5 image. Expression (RPKM) of BCYRN1, RN7SL1 and RN7SL2 (type 2, top row) and RPPH1, RN7SK and RMRP (type 3, bottom row) in a) CTRL versus MUT and b) controls versus patients.]

Figure S4.5. Expression of large Pol III transcripts in a) HEK293 control (CTRL) and mutant (MUT) clones and b) primary fibroblasts derived from controls and POLR3-HLD patients. In both datasets, BCYRN1 (encoding BC200 RNA) shows the clearest difference in expression between normal and POLR3A-mutated samples. Type 2 Pol III transcripts are on the top row of each panel, and type 3 Pol III transcripts are on the bottom row.

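[Figure S4.6 image. a) Sequence view at c.2554. b) Immunoblot of POLR3A and ACTIN in WT and M852V cells; size markers at 130 and 55 kDa. c) Relative BC200 expression in MO3.13-WT and POLR3A-M852V. d) Schematic of BC200 with sgRNAs g1, g2, g3 and g4. e) Gel with size marker in nt (100 to 500); lanes A: MO3.13 WT, B: KO #1, #2 (g2g3), C: KO #3, #4 (g2g4), D: KO #5 (g1g3), E: blank. f) Sequence view at BC200, positions 47,562,400 to 47,562,600. Panels g) and h) have no recoverable labels.]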

Figure S4.6. Generation of POLR3A-M852V and BC200 KO MO3.13 cell lines. a) IGV view of the gDNA sequence of an MO3.13 POLR3A mutant clone surrounding the c.2554A>G position, obtained by MiSeq, showing compound heterozygosity for the M852V mutation and a one bp indel causing a frameshift and premature stop codon. b) POLR3A protein levels in the MO3.13 parental and POLR3A-M852V cell lines. ACTIN was used as a loading control. c) Expression of BC200 RNA measured by qRT-PCR in the parental MO3.13 and POLR3A-M852V cell lines. BC200 RNA expression was normalized to PMM1 and NDUSF2 using the ΔΔCt method. d) Schematic representation of the sgRNAs used to knock out BC200 RNA. sgRNAs are shown in orange, primers for deletion screening are in green and qRT-PCR primers are in pink. sgRNAs were used in the following combinations: g1g3, g1g4, g2g3 and g2g4. e) Example of PCR products obtained in single clones generated with each sgRNA combination. The g1g4 combination only resulted in WT clones. The full-size PCR product including BC200 RNA is shown in sample MO3.13 WT. The arrow indicates the expected PCR product without BC200 RNA. f) IGV view of the gDNA sequence of BC200-KO clone #4 obtained by MiSeq, showing complete deletion of the BC200 gene. g) Expression of BC200 RNA measured by qRT-PCR in the parental MO3.13 cell line and five clones. BC200 RNA expression was normalized to PMM1 and NDUSF2 using the ΔΔCt method. h) Melt curve obtained from the qRT-PCR performed in g), showing that KO #3-5 do not have the profile observed in the WT cells, suggesting non-specific amplification or primer dimers and complete absence of BC200 RNA expression.
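For readers unfamiliar with the ΔΔCt normalization mentioned in panels c) and g), the calculation can be sketched as follows. This is a generic illustration, not the analysis code used for the figure: it assumes 100% amplification efficiency and averages the Ct values of the two reference genes, and the function names are hypothetical.

```python
from statistics import mean
from typing import Sequence


def delta_ct(ct_target: float, ct_refs: Sequence[float]) -> float:
    """Ct of the target minus the mean Ct of the reference genes."""
    return ct_target - mean(ct_refs)


def relative_expression(sample_target: float, sample_refs: Sequence[float],
                        control_target: float, control_refs: Sequence[float]) -> float:
    """Relative expression of the target in a sample versus a control (2^-ΔΔCt)."""
    ddct = delta_ct(sample_target, sample_refs) - delta_ct(control_target, control_refs)
    return 2.0 ** (-ddct)
```

With made-up Ct values, a target detected two cycles later in the sample than in the control, at unchanged reference Ct values, corresponds to a relative expression of 0.25.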

Table S4.1. DESeq2 differential expression analysis of small RNA-seq data in clones C1 to C3 and M1 to M3. Genes with p-value < 0.05 are shown.

Gene  Mean expression  log2 fold change  p-value  Adjusted p-value
tRNA-Ser-GCT-6-1  85.48  -1.10  5.23E-07  0.002928703
tRNA-His-GTG-1-1  233.72  -0.87  1.45E-05  0.037469671
tRNA-Leu-CAA-3-1  1546.12  -0.87  2.68E-05  0.037469671
tRNA-Tyr-GTA-5-1  433.01  -0.81  2.34E-05  0.037469671
tRNA-Leu-AAG-4-1  25.43  -0.93  5.32E-05  0.05961999
BCYRN1  100.26  -0.75  6.93E-05  0.064688479
tRNA-Ala-TGC-7-1  1181.90  -0.65  8.88E-05  0.071088568
tRNA-Ile-AAT-8-1  58.96  -0.83  0.000121073  0.075361479
tRNA-Ser-CGA-2-1  21.15  -0.89  0.000110673  0.075361479
tRNA-Pro-AGG-2-8  60.94  -0.83  0.000148152  0.075449772
tRNA-Tyr-GTA-5-4  45.65  -0.84  0.000143596  0.075449772
tRNA-Ile-TAT-2-3  385.05  -0.75  0.000226203  0.097476277
TRNAI6  385.05  -0.75  0.000226203  0.097476277
tRNA-Ala-AGC-8-2  29.15  -0.81  0.000303321  0.121371732
tRNA-Val-TAC-4-1  52.46  -0.77  0.000347712  0.129858791
tRNA-His-GTG-1-9  144.03  -0.71  0.000378896  0.132660904
tRNA-Leu-CAA-1-2  334.55  -0.68  0.000431381  0.142152718
RN7SL3  3815.42  -0.59  0.000965789  0.264685493
RP11-242C19.2  32.90  0.75  0.000876563  0.264685493
STX16  94.59  0.62  0.000911466  0.264685493
tRNA-Arg-ACG-2-1  166.92  -0.66  0.001013129  0.264685493
tRNA-Ile-TAT-2-2  26.77  -0.75  0.001086713  0.264685493
TRNAI2  26.77  -0.75  0.001086713  0.264685493
RN7SL2  6542.94  -0.57  0.001223235  0.285523408
RN7SL5P  15093.70  -0.57  0.00130217  0.291790348
tRNA-Asn-GTT-6-1  85.23  -0.64  0.001476158  0.307110501
tRNA-Leu-CAA-1-1  1683.22  -0.60  0.001575405  0.307110501
tRNA-Leu-CAA-2-1  105.77  -0.62  0.001589826  0.307110501
tRNA-Tyr-GTA-4-1  145.92  -0.66  0.001561753  0.307110501
ZNF586  22.65  0.72  0.001831211  0.341948073
RN7SL1  15045.86  -0.55  0.001960464  0.352321264
tRNA-Asn-GTT-4-1  26.67  -0.70  0.002075438  0.352321264
tRNA-Cys-GCA-2-4  229.88  -0.60  0.002017007  0.352321264
tRNA-Gln-TTG-2-1  113.24  -0.64  0.002413362  0.39763693
tRNA-Ile-AAT-5-3  130.44  -0.65  0.00255051  0.408227389
tRNA-Arg-TCT-3-2  122.28  -0.64  0.00282851  0.42825171
ZNF542  31.31  0.69  0.002766431  0.42825171
tRNA-Ala-CGC-5-1  44.60  -0.67  0.003068985  0.452433054
tRNA-Tyr-GTA-3-1  53.87  -0.64  0.003255527  0.467627274
AC074363.1  29.19  0.67  0.003965497  0.482928544
tRNA-Glu-CTC-1-3  105.37  -0.61  0.003781997  0.482928544
tRNA-Leu-TAA-3-1  41.73  -0.63  0.003850632  0.482928544
tRNA-Leu-TAG-1-1  100.94  -0.60  0.003708863  0.482928544
tRNA-Lys-TTT-3-1  112.22  -0.63  0.003937706  0.482928544
tRNA-Met-CAT-2-1  46.88  -0.63  0.003613955  0.482928544
tRNA-Ser-AGA-2-5  40.03  -0.63  0.003486393  0.482928544
U1  660.09  -0.55  0.004230701  0.504263515
tRNA-Ala-CGC-2-1  58.56  -0.60  0.004371028  0.510135372
RN7SL4P  277.53  -0.51  0.004629685  0.529295872
SNORD118  831.30  0.44  0.005787652  0.531543172
tRNA-Ala-TGC-1-1  33.83  -0.63  0.005270328  0.531543172
tRNA-Ala-TGC-2-1  24.74  -0.63  0.005625419  0.531543172
tRNA-Arg-CCT-3-1  117.36  -0.55  0.005214351  0.531543172
tRNA-Arg-TCT-2-1  108.12  -0.58  0.005277082  0.531543172
tRNA-Asn-GTT-10-1  177.75  -0.64  0.004760047  0.531543172
tRNA-Asn-GTT-3-2  733.77  -0.51  0.005809045  0.531543172
tRNA-Asp-GTC-2-8  25711.12  -0.56  0.005552267  0.531543172
tRNA-Asp-GTC-3-1  4483.41  -0.56  0.005383913  0.531543172
tRNA-Glu-TTC-4-1  591.10  -0.54  0.005908091  0.531543172
tRNA-Leu-AAG-1-2  31.46  -0.64  0.005770901  0.531543172
tRNA-Lys-TTT-6-1  80.51  -0.60  0.005977726  0.531543172
tRNA-Lys-TTT-7-1  49.45  -0.62  0.004991075  0.531543172
tRNA-Ser-AGA-2-4  27.80  -0.64  0.005920228  0.531543172
tRNA-Ile-AAT-3-1  101.41  -0.58  0.006105153  0.534391675
tRNA-Gln-CTG-9-1  94.65  -0.57  0.006525818  0.562425107
tRNA-Glu-TTC-3-1  59.64  -0.56  0.006830225  0.571088335
tRNA-Ile-AAT-5-2  51.97  -0.60  0.006818169  0.571088335
tRNA-Ile-TAT-1-1  81.75  -0.60  0.007688748  0.633417128
LINC00240  31.97  -0.60  0.008513628  0.662407568
tRNA-Gln-CTG-6-1  38.94  -0.59  0.008349806  0.662407568
tRNA-Lys-CTT-2-2  419.78  -0.51  0.008240856  0.662407568
tRNA-Thr-CGT-4-1  61.11  -0.55  0.00844096  0.662407568
MRFAP1L1  38.22  -0.57  0.009232632  0.677460331
RP11-11N9.4  24.54  0.60  0.009066794  0.677460331
SPINT2  75.22  0.53  0.009236134  0.677460331
tRNA-Asn-GTT-2-1  650.50  -0.51  0.008870391  0.677460331
tRNA-Thr-TGT-3-1  22.44  -0.60  0.009311754  0.677460331
LINC00958  37.69  -0.59  0.009530902  0.684514244
tRNA-Ser-CGA-3-1  72.70  -0.56  0.009974162  0.707281682
ERVK3-1  59.37  0.53  0.010482152  0.73401266
tRNA-His-GTG-1-7  217.42  -0.47  0.010829375  0.748964913
ZNF331  53.41  0.53  0.01098457  0.750433653
tRNA-Cys-GCA-21-1  31.25  -0.58  0.011441155  0.772209031
BX322557.10  22.24  0.57  0.013233511  0.87216622
tRNA-Ser-TGA-3-1  35.83  -0.55  0.013083475  0.87216622
tRNA-Lys-CTT-4-1  781.62  -0.49  0.013405298  0.873214893
LINC00511  55.51  0.53  0.013629237  0.877597545
tRNA-Arg-CCG-2-1  58.43  -0.50  0.014268432  0.908315399
tRNA-Ser-GCT-5-1  38.90  -0.55  0.015075923  0.947484797
tRNA-Tyr-GTA-5-3  161.81  -0.56  0.015221998  0.947484797
tRNA-Pro-TGG-3-2  49.36  -0.55  0.015467432  0.952181925

Table S4.2. DESeq2 differential expression analysis of small RNA-seq data in clones C3 and M2. Genes with adjusted p-value (FDR) < 0.05 are shown.

Gene  Mean expression  log2 fold change  p-value  Adjusted p-value
LINC01036  74.60  3.72  5.90E-43  4.89E-39
RP6-99M1.2  110.71  2.98  4.48E-39  1.86E-35
RPPH1  5408.27  -1.24  1.19E-20  3.30E-17
RN7SL3  3075.79  -1.25  1.64E-20  3.40E-17
RN7SL5P  12462.35  -1.21  2.41E-20  3.99E-17
RN7SL2  5278.64  -1.23  5.15E-20  7.11E-17
RN7SL1  12320.07  -1.19  1.27E-19  1.50E-16
tRNA-Leu-CAA-3-1  705.30  -1.70  5.69E-18  5.89E-15
tRNA-Ala-TGC-6-1  179.39  -1.64  5.71E-17  5.26E-14
tRNA-Tyr-GTA-5-3  71.87  -2.02  1.24E-15  1.03E-12
AC074363.1  39.21  2.02  9.32E-15  7.02E-12
tRNA-Leu-AAG-1-1  764.62  -1.67  3.53E-14  2.44E-11
tRNA-Tyr-GTA-5-1  213.85  -1.54  4.48E-14  2.86E-11
tRNA-Ala-TGC-7-1  860.80  -1.21  6.74E-14  3.99E-11
tRNA-Ile-TAT-1-1  38.42  -1.87  3.41E-13  1.89E-10
tRNA-Gly-TCC-2-2  197.17  -1.44  1.06E-12  5.47E-10
TRNAI6  174.57  -1.55  1.21E-12  5.88E-10
RN7SL4P  230.52  -1.16  1.40E-12  6.09E-10
tRNA-Ile-TAT-2-3  174.39  -1.55  1.32E-12  6.09E-10
RP11-475O6.1  19.21  2.02  2.60E-11  1.08E-08
tRNA-His-GTG-1-1  108.50  -1.51  7.14E-11  2.82E-08
tRNA-Pro-AGG-2-8  35.47  -1.73  1.35E-10  5.09E-08
tRNA-Leu-CAA-1-2  176.19  -1.20  1.57E-10  5.66E-08
tRNA-Leu-CAA-1-1  839.20  -1.16  5.14E-10  1.78E-07
TSC22D3  104.26  1.24  7.75E-10  2.57E-07
tRNA-Glu-CTC-1-3  87.05  -1.27  9.83E-10  3.13E-07
FOXQ1  25.63  -1.70  1.40E-09  4.31E-07
COL6A1  71.88  -1.32  1.70E-09  5.04E-07
GUCY1B2  10.33  -1.89  3.81E-09  1.09E-06
tRNA-Arg-ACG-2-1  78.32  -1.35  7.95E-09  2.20E-06
BCYRN1  95.81  -1.21  1.05E-08  2.81E-06
tRNA-Val-TAC-3-1  84.82  -1.32  1.31E-08  3.38E-06
HSPA1B  2151.06  0.75  3.54E-08  8.90E-06
SNORA46  90.49  1.21  4.10E-08  9.86E-06
tRNA-Tyr-GTA-5-4  20.55  -1.61  4.16E-08  9.86E-06
tRNA-Tyr-GTA-5-5  40.81  -1.34  5.29E-08  1.22E-05
RP11-657O9.1  12.71  -1.71  5.59E-08  1.25E-05
tRNA-Lys-CTT-2-2  201.43  -1.09  7.59E-08  1.66E-05
NDRG1  113.17  1.04  8.92E-08  1.85E-05
tRNA-Asn-GTT-10-1  53.04  -1.34  8.89E-08  1.85E-05
tRNA-Thr-AGT-1-1  30.58  -1.43  9.14E-08  1.85E-05
MIR3676  32.02  -1.41  1.05E-07  2.07E-05
LINC00478  80.11  1.12  1.34E-07  2.58E-05
RP11-168O22.1  21.74  1.54  1.38E-07  2.59E-05
tRNA-Cys-GCA-2-4  111.87  -1.09  2.52E-07  4.64E-05
FTL  203.82  0.87  2.66E-07  4.75E-05
SLC25A6  220.26  -0.84  2.69E-07  4.75E-05
RP11-282I1.1  47.43  -1.23  4.12E-07  7.11E-05
tRNA-Gln-CTG-2-1  204.67  -1.09  5.69E-07  9.62E-05
tRNA-His-GTG-1-9  64.59  -1.20  5.98E-07  9.91E-05
SNORA31  271.55  -0.90  6.77E-07  0.000109973
tRNA-Ile-AAT-5-2  28.07  -1.35  7.17E-07  0.00011427
FSCN1  84.13  -1.04  7.90E-07  0.000123567
COL6A2  23.29  -1.40  9.18E-07  0.000140908
tRNA-Arg-ACG-1-3  23.77  -1.37  1.01E-06  0.000151492
TMBIM1  15.63  -1.48  1.08E-06  0.000159463
tRNA-Tyr-GTA-4-1  60.52  -1.24  1.19E-06  0.000172448
AP003900.6  20.31  -1.44  1.24E-06  0.000176497
tRNA-Val-AAC-6-1  16.40  -1.45  1.77E-06  0.000248671
tRNA-Lys-TTT-2-1  38.39  -1.31  1.84E-06  0.000254488
POLR3A  78.85  -0.98  2.43E-06  0.000330594
tRNA-Ala-TGC-1-1  16.40  -1.43  2.59E-06  0.000343779
tRNA-Gln-CTG-9-1  27.72  -1.30  2.61E-06  0.000343779
ZNF542  27.52  1.30  2.69E-06  0.000348319
tRNA-Ser-CGA-2-1  9.72  -1.49  3.30E-06  0.000420108
MSI1  45.88  -1.10  3.96E-06  0.000491601
tRNA-Ser-GCT-6-1  42.93  -1.17  3.98E-06  0.000491601
tRNA-Lys-TTT-6-1  37.22  -1.22  4.20E-06  0.000511213
EEF1A2  49.50  -1.07  5.76E-06  0.000691179
tRNA-Leu-TAG-3-1  208.45  -0.86  5.86E-06  0.000694139
GLA  89.30  0.90  6.32E-06  0.000737169
RP11-834C11.14  118.33  0.84  6.91E-06  0.000795087
tRNA-Val-TAC-2-1  144.85  -0.92  8.14E-06  0.00092344
tRNA-Ile-AAT-8-1  27.52  -1.22  8.70E-06  0.000963184
tRNA-Leu-CAA-2-1  55.50  -1.09  8.72E-06  0.000963184
tRNA-Asn-GTT-3-2  348.95  -0.89  9.28E-06  0.001011601
tRNA-Ala-CGC-2-1  33.19  -1.13  1.03E-05  0.00111307
DNAJA2  72.12  0.94  1.19E-05  0.001246665
tRNA-Leu-TAG-2-1  21.29  -1.28  1.19E-05  0.001246665
RP11-11N9.4  24.30  1.21  1.22E-05  0.001260816
tRNA-Lys-TTT-3-1  48.40  -1.14  1.38E-05  0.001410473
tRNA-Leu-AAG-2-2  61.61  -1.00  1.69E-05  0.001707405
RP11-435B5.4  351.86  0.67  1.84E-05  0.001833761
RP11-298I3.1  27.28  -1.17  2.16E-05  0.00212592
RP1-168P16.2  35.99  1.08  2.41E-05  0.002350058
ZC3H13  176.26  -0.71  2.79E-05  0.002683301
tRNA-Tyr-GTA-1-1  7.35  -1.34  2.93E-05  0.002792151
AKAP11  114.13  -0.83  3.41E-05  0.003215034
USP25  23.67  1.16  3.46E-05  0.003219506
RN7SL674P  50.39  -0.98  3.69E-05  0.003399767
tRNA-Ile-AAT-4-1  16.93  -1.23  3.94E-05  0.003584026
LINC00355  49.79  0.96  4.16E-05  0.003747
GPT2  55.99  0.90  4.85E-05  0.004322245
TRIM28  494.21  0.60  5.16E-05  0.004547358
AC012531.25  19.32  1.17  5.53E-05  0.004721475
AL589743.1  183.12  -0.69  5.44E-05  0.004721475
SOGA1  102.07  -0.77  5.49E-05  0.004721475
tRNA-Ser-GCT-5-1  20.07  -1.16  5.93E-05  0.005012516
tRNA-Ile-AAT-5-3  53.07  -1.05  6.65E-05  0.005533289
tRNA-Ser-AGA-2-4  15.82  -1.20  6.68E-05  0.005533289
AC009120.6  111.39  0.76  7.16E-05  0.00573209
COL4A1  110.07  -0.77  7.09E-05  0.00573209
GSTP1  23.89  -1.10  7.30E-05  0.00573209
tRNA-Ala-CGC-5-1  19.50  -1.19  7.16E-05  0.00573209
tRNA-Leu-TAA-4-1  24.06  -1.10  7.40E-05  0.00573209
tRNA-Ser-CGA-3-1  30.13  -1.13  7.35E-05  0.00573209
VPS4A  84.04  0.80  7.22E-05  0.00573209
RN7SKP255  1059.17  -0.55  7.57E-05  0.00580766
DHX34  66.91  0.84  7.65E-05  0.005812225
TAGLN2  179.28  -0.70  8.06E-05  0.006046376
USP10  123.01  0.73  8.10E-05  0.006046376
RP1-170O19.22  291.63  0.63  8.59E-05  0.00635472
RP1-122P22.2  9.85  -1.25  9.41E-05  0.006777446
RP11-252A24.2  48.96  0.94  9.28E-05  0.006777446
tRNA-Gln-TTG-2-1  58.09  -0.97  9.41E-05  0.006777446
TNFRSF10D  48.50  0.92  9.92E-05  0.007082186
tRNA-Pro-TGG-3-2  21.30  -1.13  0.000100167  0.007093055
tRNA-Leu-TAA-3-1  28.04  -1.03  0.000104394  0.00732972
KIAA1522  53.76  -0.88  0.000107641  0.007456003
RP11-66N24.4  28.54  -1.05  0.000107993  0.007456003
RPL35A  286.04  -0.63  0.000109423  0.007492307
RP11-782C8.1  69.18  0.82  0.000129314  0.008710301
TYSND1  39.48  -0.93  0.000128793  0.008710301
LINC00511  41.52  0.92  0.000140438  0.00938333
tRNA-His-GTG-1-5  12.21  -1.19  0.000144741  0.009593407
AC069277.2  18.71  1.12  0.000147276  0.009607701
Y_RNA  1601.18  -0.63  0.000146225  0.009607701
LONRF3  41.83  -0.94  0.00015056  0.009669702
RP11-417J8.3  76.17  0.81  0.000149862  0.009669702
NIP7  33.02  0.97  0.0001545  0.009771261
ZNF587  90.02  0.75  0.000154161  0.009771261
tRNA-Arg-TCT-3-2  50.49  -0.94  0.000163246  0.010185119
tRNA-Lys-TTT-7-1  25.17  -1.03  0.000163503  0.010185119
SLC1A5  301.69  0.64  0.000168227  0.01025808
tRNA-Ile-TAT-2-2  12.89  -1.16  0.000168389  0.01025808
TRNAI2  12.89  -1.16  0.000168389  0.01025808
KARS  119.53  0.70  0.000170231  0.010291131
PANX2  10.01  -1.20  0.000171415  0.010291131
tRNA-Leu-AAG-4-1  9.71  -1.19  0.000185814  0.011075311
CHCHD10  49.06  -0.86  0.000243388  0.014300849
ERVK3-1  58.09  0.83  0.000243534  0.014300849
tRNA-Ala-AGC-4-1  122.24  -0.86  0.000245108  0.014300849
LIMD1-AS1  77.03  -0.94  0.00025127  0.014459847
PSMC4  176.83  0.66  0.000252921  0.014459847
tRNA-Ser-AGA-2-5  23.13  -1.02  0.000253069  0.014459847
RUVBL2  107.78  0.70  0.000275094  0.01561066
CROCCP3  47.38  -0.89  0.000281124  0.015844274
HELZ2  22.75  -1.03  0.000284985  0.015953377
tRNA-Tyr-GTA-3-1  20.56  -1.10  0.000298583  0.0166024
RN7SK  10061.17  -0.49  0.000305893  0.016789222
tRNA-Val-TAC-4-1  22.49  -1.07  0.000305995  0.016789222
COL4A6  28.20  -0.95  0.000319979  0.017440944
PDLIM1  50.18  -0.84  0.000333657  0.01806762
AKAP17A  53.67  -0.81  0.000336813  0.018120108
DLEU2  178.37  -0.62  0.000340372  0.018193413
CIAPIN1  50.89  0.82  0.000365732  0.018938086
GRWD1  95.81  0.69  0.000363036  0.018938086
MYC  96.07  0.74  0.000357466  0.018938086
TFDP1  225.55  -0.58  0.000359405  0.018938086
tRNA-Ser-TGA-3-1  17.57  -1.07  0.000363947  0.018938086
ATXN1L  57.66  0.82  0.00037456  0.019124307
FAM192A  59.86  0.78  0.000373493  0.019124307
tRNA-Thr-TGT-5-1  31.94  -0.95  0.000376254  0.019124307
tRNA-Ala-TGC-5-1  352.64  -0.70  0.000386716  0.019536253
PPP2R3B  26.12  -0.99  0.000397616  0.019965143
CD99  63.86  -0.80  0.000406539  0.020146492
DZIP1  132.53  -0.64  0.000406175  0.020146492
KIF11  139.74  -0.64  0.000408523  0.020146492
ATP7B  49.39  -0.84  0.000419402  0.020492539
USP9X  587.18  -0.50  0.000420487  0.020492539
ZC3H18  114.12  0.70  0.000443252  0.021475671
TIMM50  191.04  0.60  0.000464008  0.022221408
XIST  17117.95  -0.47  0.000463483  0.022221408
LZTS2  34.34  -0.93  0.000475848  0.022485352
RP11-6N13.1  17.21  1.07  0.000477661  0.022485352
tRNA-Gln-CTG-3-1  86.54  -1.08  0.000475886  0.022485352
HNRNPF  225.06  -0.57  0.000484961  0.022699995
tRNA-Gly-GCC-1-2  856.02  -0.57  0.000495463  0.02306129
FAM101B  64.15  -0.75  0.00050737  0.0234836
CLPTM1  94.29  0.69  0.000522525  0.024050672
VPS35  119.04  0.64  0.000525615  0.02405923
RP11-343C2.11  35.44  0.88  0.000554465  0.025240323
tRNA-Leu-CAA-4-1  36.03  -0.87  0.000564067  0.025537138
PDF  30.43  0.92  0.000570596  0.025553462
RN7SKP71  4113.08  -0.48  0.00056776  0.025553462
CHAMP1  83.56  -0.71  0.000595413  0.026379677
PRKCB  23.54  -0.96  0.000593762  0.026379677
HOXA-AS3  25.91  -0.94  0.000599206  0.026406516
CMTM4  112.02  0.64  0.000617422  0.02706531
RFWD3  139.56  0.66  0.000647208  0.028221684
EIF2S3  305.66  -0.54  0.00067345  0.029212217
TSPYL2  88.31  0.68  0.000704543  0.030401778
MTSS1L  104.36  0.65  0.000725732  0.031153815
FOXC1  343.76  -0.52  0.000751635  0.031875855
tRNA-Thr-CGT-4-1  32.54  -0.87  0.000754094  0.031875855
UBE2S  181.06  0.61  0.000751178  0.031875855
SAE1  153.93  0.60  0.000772902  0.032176843
tRNA-Ala-AGC-6-1  37.37  -1.08  0.000776749  0.032176843
tRNA-Ala-TGC-4-1  23.61  -0.97  0.00077399  0.032176843
tRNA-Pro-TGG-1-1  95.75  -0.83  0.000767932  0.032176843
ALOX12P2  115.09  0.63  0.000782673  0.03226094
DHX38  74.25  0.69  0.000814698  0.033414708
ANXA11  80.81  -0.68  0.00082761  0.033521994
C21orf58  105.89  -0.65  0.000824573  0.033521994
C21orf91-OT1  7.71  1.07  0.000832274  0.033521994
RP11-343C2.12  29.80  0.90  0.000833498  0.033521994
NUP93  87.67  0.66  0.000877424  0.035118166
ARHGEF7-IT1  7.48  -1.07  0.000884161  0.03521765
tRNA-Glu-TTC-2-1  548.38  -0.98  0.000900966  0.035715344
tRNA-Ile-AAT-2-1  20.30  -0.96  0.000910181  0.035908798
E2F4  62.68  0.71  0.000932291  0.036295067
RAC3  39.77  -0.81  0.000948423  0.036295067
RBM23  84.04  -0.66  0.000949867  0.036295067
RP11-308N19.1  20.13  0.97  0.000947173  0.036295067
SDC3  28.24  -0.89  0.000950637  0.036295067
SNRPD2  108.69  0.65  0.000949826  0.036295067
tRNA-Gly-CCC-1-1  324.21  -0.54  0.000940852  0.036295067
AL121578.2  12.90  -1.02  0.000981838  0.037314361
TPT1-AS1  61.33  -0.73  0.00098687  0.037334312
NOTCH3  24.42  -0.92  0.00104051  0.039184674
tRNA-Cys-GCA-21-1  18.47  -0.95  0.001050039  0.03936458
tRNA-Gln-TTG-1-1  186.01  -0.74  0.001059435  0.03953794
OPTN  14.63  -1.01  0.001088743  0.040319683
RN7SKP203  1850.69  -0.45  0.001090116  0.040319683
RP11-412P11.1  7.12  1.05  0.001095665  0.04034481
MKI67  875.03  -0.46  0.001118021  0.040985849
SNORD3D  5359.99  0.54  0.001189091  0.043399188
tRNA-Ser-GCT-3-1  14.40  -0.99  0.001201294  0.043652285
PNKP  29.65  0.86  0.001328032  0.048046929
TET1  61.72  -0.70  0.001335182  0.04809559
tRNA-Arg-CCT-3-1  56.76  -0.78  0.001361795  0.048841879
EIF4EBP2  135.65  -0.57  0.001380525  0.049300211
ASNS  114.92  0.63  0.001400396  0.049681802
CTD-2571L23.8  124.56  0.62  0.001403204  0.049681802

168 Table S4.3. Top 30 differentially expressed genes in rRNA-depleted RNA-seq data from independent HEK293 clones (C1-C3, M1-M3). Genes with p-value < 0.05 and mean expression > 100 are shown. Pol III transcripts are highlighted in green. Differential expression analysis was performed with DESeq2. BCYRN1 is an alternate name for the gene encoding BC200 RNA.

Gene | Mean expression | log2FoldChange | pvalue | Adjusted pvalue
ENSG00000247134:RP11-11N9.4 | 228.9651005 | 0.911167765 | 7.89E-12 | 2.83E-07
ENSG00000174460:ZCCHC12 | 996.5777299 | 0.570678683 | 1.21E-06 | 0.016197608
ENSG00000236824:BCYRN1 | 1687.345263 | -0.573872857 | 1.36E-06 | 0.016197608
ENSG00000263740:RN7SL4P | 62994.57097 | -0.530663278 | 8.65E-06 | 0.077487817
ENSG00000239899:RN7SL674P | 853.534484 | -0.497291421 | 1.69E-05 | 0.10839386
ENSG00000173480:ZNF417 | 1157.110523 | 0.508551638 | 2.03E-05 | 0.10839386
ENSG00000156521:TYSND1 | 1567.822667 | -0.509290551 | 2.19E-05 | 0.10839386
ENSG00000266037:RN7SL3 | 131395.7365 | -0.517422699 | 2.72E-05 | 0.10839386
ENSG00000175832:ETV4 | 533.6280187 | 0.521624765 | 6.04E-05 | 0.216200677
ENSG00000187678:SPRY4 | 348.8592143 | 0.488654031 | 8.89E-05 | 0.273426723
ENSG00000197016:ZNF470 | 581.7403705 | 0.44889916 | 9.16E-05 | 0.273426723
ENSG00000265150:RN7SL2 | 576589.4215 | -0.473755729 | 0.000125444 | 0.345656446
ENSG00000268163:AC004076.9 | 157.8305878 | 0.508665374 | 0.000158178 | 0.375020853
ENSG00000244642:RN7SL396P | 897.4843711 | -0.48747051 | 0.000158452 | 0.375020853
ENSG00000240869:RN7SL128P | 504.1094719 | -0.464882444 | 0.000167509 | 0.375020853
ENSG00000258486:RN7SL1 | 670373.8699 | -0.450344474 | 0.000225734 | 0.429090968
ENSG00000197928:ZNF677 | 445.0518712 | 0.507922586 | 0.000227596 | 0.429090968
ENSG00000265735:RN7SL5P | 250715.3676 | -0.440904583 | 0.000258196 | 0.434056263
ENSG00000154642:C21orf91 | 803.4118706 | 0.496931755 | 0.000282907 | 0.434056263
ENSG00000260035:CTD-2651B20.6 | 138.1733338 | -0.498782508 | 0.000292522 | 0.434056263
ENSG00000267040:RP11-35G9.3 | 161.641118 | -0.488299929 | 0.000295665 | 0.434056263
ENSG00000151240:DIP2C | 930.8660083 | -0.39177914 | 0.000302934 | 0.434056263
ENSG00000204524:ZNF805 | 549.5521849 | 0.431193374 | 0.00040094 | 0.48059228
ENSG00000182180:MRPS16 | 3999.591101 | -0.332628006 | 0.000403078 | 0.48059228
ENSG00000139318:DUSP6 | 683.72636 | 0.436441963 | 0.000405514 | 0.48059228
ENSG00000175093:SPSB4 | 193.6445625 | -0.477400086 | 0.000408881 | 0.48059228
ENSG00000241529:RN7SL767P | 181.0756281 | -0.481341069 | 0.000417273 | 0.48059228
ENSG00000152454:ZNF256 | 591.1021268 | 0.426258345 | 0.000429328 | 0.48059228
ENSG00000067057:PFKP | 6529.426721 | -0.347541951 | 0.000454833 | 0.493714862
ENSG00000142556:ZNF614 | 1322.098675 | 0.39108226 | 0.000482084 | 0.507904067
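The selection rule stated in the caption of Table S4.3 (p-value < 0.05, mean expression > 100, top 30 genes) can be sketched as follows. This is a minimal illustration, not the analysis code used in the thesis: it assumes that "top" means ranked by ascending p-value, the function name `top_genes` is hypothetical, and the last example row is invented to show the effect of the expression filter.

```python
# Three rows copied from Table S4.3, plus one invented low-expression row.
rows = [
    # (gene, mean_expression, log2_fold_change, pvalue)
    ("BCYRN1", 1687.345263, -0.573872857, 1.36e-06),
    ("RP11-11N9.4", 228.9651005, 0.911167765, 7.89e-12),
    ("ZCCHC12", 996.5777299, 0.570678683, 1.21e-06),
    ("HYPOTHETICAL_LOW", 12.0, 1.5, 1.0e-08),  # invented; fails mean > 100
]


def top_genes(rows, max_pvalue=0.05, min_mean=100.0, n=30):
    """Keep rows passing both thresholds, then return the n smallest p-values.

    Ranking by ascending p-value is an assumption; the caption only says
    "top 30".
    """
    kept = [r for r in rows if r[3] < max_pvalue and r[1] > min_mean]
    return sorted(kept, key=lambda r: r[3])[:n]


for gene, mean, lfc, pvalue in top_genes(rows):
    print(f"{gene}\t{mean:.1f}\t{lfc:+.2f}\t{pvalue:.2e}")
```

The invented row has the second-smallest p-value but is dropped by the expression threshold, which is the behaviour the caption describes.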

Table S4.4. Characteristics of the primary patient-derived fibroblasts used in this study.

ID | Mutated gene | Mutation 1 (DNA) | Mutation 1 (Protein) | Mutation 2 (DNA) | Mutation 2 (Protein) | Sex | Age at biopsy (years) | Experiments performed (Microarray / RNA-Seq) | Reference
P1 | POLR3A | c.2554A>G | p.M852V | c.2617-1G>A | p.R873AfsX5 | F | 23 | X | Ref 39 (Individual 7)
P2 | POLR3A | c.1114G>A | p.D372N | c.2324A>T | p.N775I | F | 1.5 | X | Ref 39 (Individual 8)
P3 | POLR3A | c.2830G>T | p.E944X | c.3013C>T | p.R1005C | M | 26 | X | Ref 39 (Individual 9)
P4 | POLR3A | c.2554A>G | p.M852V | c.2617-1G>A | p.R873AfsX5 | F | 30 | X | Ref 39 (Individual 10)
P5 | POLR3A | c.2015G>A | p.G672E | c.2015G>A | p.G672E | M | 18 | X | Ref 39 (Individual 2)
P6 | POLR3A | c.2015G>A | p.G672E | c.2015G>A | p.G672E | M | 15 | X | Ref 39 (Individual 3)
P7 | POLR3A | c.1674C>G | p.F558L | c.3742insACC | p.1248insT | M | 15 | X | Ref 39 (Individual 1)
P8 | POLR3A | c.2015G>A | p.G672E | c.3718G>A | p.G1240S | M | 7 | X | Ref 89 (Individual 4H-42)

Table S4.5. SILAC analysis comparing the proteomes of POLR3A-M852V to MO3.13-WT and BC200-KO to MO3.13-WT. Proteins or protein groups were included if they displayed valid values in at least 4 out of 6 replicates for each condition. A one-sample t-test was used to assess differential abundance in POLR3A-M852V vs. MO3.13-WT or BC200-KO vs. MO3.13-WT. FDR was used to correct for multiple testing and the threshold for statistical significance was set at FDR < 0.05 (indicated by a “+” sign in the table). Proteins with FDR < 0.05 in both pairwise comparisons are shown, identified by their corresponding gene name. Fold change refers to the ratios POLR3A-M852V/MO3.13-WT or BC200-KO/MO3.13-WT.

Gene name | FDR significant (POLR3A-M852V / MO3.13-WT) | pvalue (POLR3A-M852V / MO3.13-WT) | log2 fold change (POLR3A-M852V / MO3.13-WT) | FDR significant (BC200-KO / MO3.13-WT) | pvalue (BC200-KO / MO3.13-WT) | log2 fold change (BC200-KO / MO3.13-WT)
AARS | + | 0.000750397 | 0.27 | + | 8.39E-05 | -0.48
ABCE1 | + | 2.00E-05 | -0.36 | + | 0.000653628 | -0.42
ACAT1 | + | 0.014327048 | -0.22 | + | 0.003482376 | 0.37
ACLY | + | 0.017541615 | -0.15 | + | 2.53E-06 | -0.83
ACTC1;ACTA2;ACTG2;ACTA1 | + | 6.76E-05 | -1.03 | + | 4.52E-05 | 1.43
ACTN1 | + | 4.46E-06 | 0.49 | + | 1.21E-05 | 1.03
ADD1 | + | 0.003303696 | 0.56 | + | 0.004281238 | 0.55
AGPS | + | 0.003579293 | -0.46 | + | 0.006162878 | -0.39
AHSA1 | + | 0.000360954 | 0.30 | + | 0.000870781 | -0.25
AKR1B1 | + | 0.000148611 | 0.38 | + | 3.67E-05 | -0.93
ALDOA | + | 0.007333697 | 0.13 | + | 1.90E-06 | 0.65
ANXA2;ANXA2P2 | + | 0.013122185 | 0.29 | + | 0.000730098 | 1.44
ANXA5 | + | 8.16E-07 | 0.25 | + | 0.000185938 | 0.65
ANXA6 | + | 3.29E-05 | 0.38 | + | 0.000493217 | 0.38
AP2A2 | + | 0.00205428 | 0.17 | + | 0.001030865 | 0.39
AP2M1 | + | 0.014017793 | -0.11 | + | 0.001500042 | 0.33
ARF4 | + | 0.004101471 | -0.12 | + | 0.001466202 | 0.24
ARFGAP1 | + | 0.003385618 | 0.35 | + | 3.31E-07 | 0.59
ARHGDIA | + | 0.008367015 | -0.38 | + | 0.003365146 | -0.23
ARPC4 | + | 0.007468147 | 0.13 | + | 0.001447202 | -0.15
ASNS | + | 0.000586848 | -0.50 | + | 8.75E-07 | -1.03
ATL3 | + | 0.012855543 | 0.61 | + | 0.003481747 | 1.00
ATP2B1 | + | 0.013430183 | 0.21 | + | 7.73E-05 | 0.64
ATP5C1 | + | 0.005345041 | 0.26 | + | 0.00374655 | 0.34
ATP5O | + | 0.002800653 | 0.20 | + | 0.000478018 | 0.45
BCAT1 | + | 0.0007821 | 0.23 | + | 6.11E-05 | -0.57
BLMH | + | 0.005553448 | -0.37 | + | 7.11E-06 | -1.00
C1QBP | + | 6.05E-06 | -0.43 | + | 0.003057831 | -0.19
CACYBP | + | 0.008589583 | -0.08 | + | 7.08E-05 | -0.37
CALU | + | 0.008828601 | 0.29 | + | 0.00582386 | 0.25
CANX | + | 0.000462057 | 0.28 | + | 0.0002485 | 0.43
CAPN2 | + | 0.0096556 | 0.34 | + | 3.73E-08 | 1.22
CCT2 | + | 0.003937854 | -0.15 | + | 6.26E-05 | -0.29
CCT3 | + | 0.00031761 | -0.25 | + | 0.000132811 | -0.29
CCT4 | + | 0.001391774 | -0.21 | + | 8.15E-05 | -0.33
CCT5 | + | 0.000942746 | -0.22 | + | 6.61E-05 | -0.32
CCT6A | + | 0.000177742 | -0.24 | + | 1.66E-05 | -0.31
CCT7 | + | 0.000656092 | -0.19 | + | 2.39E-05 | -0.29
CCT8 | + | 0.000273066 | -0.23 | + | 8.71E-05 | -0.33
CD59 | + | 0.01313212 | -0.63 | + | 7.06E-05 | 1.65
CDC37 | + | 0.001760092 | -0.26 | + | 0.008917712 | -0.27
CDK1 | + | 0.000170935 | 0.16 | + | 0.024499861 | -0.16
CDK6 | + | 0.010587567 | -0.25 | + | 5.14E-05 | -0.47
CFL1 | + | 0.010009036 | 0.24 | + | 0.003839012 | 0.12
CHID1 | + | 0.001314315 | 0.52 | + | 0.009048361 | 0.42
CHORDC1 | + | 1.57E-05 | -0.33 | + | 2.69E-05 | -0.67
CKAP4 | + | 2.00E-05 | 0.81 | + | 2.52E-05 | 1.02
CKAP5 | + | 0.012016594 | -0.21 | + | 0.013262756 | 0.17
CLIC4 | + | 3.41E-06 | -0.28 | + | 0.001241137 | -0.21
CNP | + | 0.007353265 | -0.22 | + | 0.002402113 | -0.71
COPS3 | + | 0.002948632 | -0.36 | + | 0.017539059 | -0.26
COPS8 | + | 0.001152692 | -0.54 | + | 0.000172078 | -0.53
CORO1B | + | 0.008145648 | 0.30 | + | 2.94E-06 | 0.76
CORO1C | + | 7.74E-05 | 0.42 | + | 5.27E-05 | 0.89
CPNE1 | + | 0.002157837 | 0.19 | + | 0.003724878 | 0.20
CRIP2 | + | 0.00421892 | 0.49 | + | 0.003255572 | 0.33
CRTAP | + | 0.000571387 | 0.35 | + | 0.002080535 | 0.32
CS | + | 0.000884289 | 0.31 | + | 8.08E-05 | 0.57
CTAG1A | + | 0.006341142 | -0.68 | + | 0.002518429 | -0.47
CTHRC1 | + | 0.000591489 | 0.31 | + | 0.001098284 | -0.53
CTNNA2 | + | 0.000830941 | 0.16 | + | 0.011893582 | 0.21
CTNNB1 | + | 0.003374459 | 0.17 | + | 0.000153412 | 0.14
CTSD | + | 0.000206823 | 0.40 | + | 0.011532661 | 0.51
CTTN | + | 0.004543349 | 0.22 | + | 0.000248216 | 0.51
CYB5R1 | + | 0.006688187 | 0.21 | + | 0.003031281 | 1.36
CYC1 | + | 0.001879155 | -0.28 | + | 0.009367883 | -0.40
CYCS | + | 0.00146999 | -0.33 | + | 0.000735953 | 0.66
CYP51A1 | + | 5.37E-05 | 0.49 | + | 0.023107903 | -0.23
DARS | + | 0.000784181 | -0.39 | + | 0.014116594 | -0.30
DBN1 | + | 1.96E-05 | 1.08 | + | 0.000537176 | 0.57
DCTN2 | + | 0.001242875 | 0.31 | + | 0.010262503 | 0.17
DDOST | + | 0.000350557 | 0.29 | + | 0.004631791 | 0.30
DDT;DDTL | + | 0.014968876 | -0.07 | + | 0.000105034 | 0.62
DDX39B | + | 0.000543409 | -0.41 | + | 0.00318199 | -0.36
DES | + | 0.005168299 | -0.60 | + | 0.000323012 | 1.51
DHCR24 | + | 1.75E-05 | 0.33 | + | 0.004813045 | 0.24
DNAJA1 | + | 1.46E-05 | -0.42 | + | 8.80E-05 | -0.46
DNM1L | + | 0.003408009 | 0.24 | + | 0.00011643 | 0.29
DNMT1 | + | 2.88E-05 | -0.41 | + | 5.03E-05 | -0.75
DPYSL2 | + | 1.34E-05 | 0.58 | + | 1.59E-08 | -1.20
DUT | + | 0.011463245 | -0.25 | + | 0.00970794 | -0.61
DYNC1H1 | + | 2.33E-06 | 0.38 | + | 5.17E-05 | 0.37
DYNC1I2 | + | 0.009535033 | 0.32 | + | 0.013925093 | 0.28
DYNC1LI2 | + | 0.015180611 | 0.26 | + | 0.002221278 | 0.30
ECHS1 | + | 0.002545518 | 0.23 | + | 0.004484362 | 0.37
ECM29 | + | 0.001982521 | -0.24 | + | 0.011225241 | 0.25
EDF1 | + | 0.011313101 | 0.36 | + | 0.012687797 | -0.40
EEF1A1;EEF1A1P5 | + | 0.000123208 | -0.36 | + | 0.01486603 | -0.11
EEF1B2 | + | 0.010835535 | 0.10 | + | 0.000126599 | -0.38
EIF3B | + | 0.000227331 | -0.34 | + | 0.002672378 | -0.26
EIF3E | + | 2.33E-05 | -0.28 | + | 0.000571007 | -0.19
EIF3G | + | 0.000452747 | -0.29 | + | 0.002072372 | -0.39
EIF3I | + | 2.25E-05 | -0.32 | + | 0.003198488 | -0.32
EIF4A1 | + | 0.000487591 | -0.41 | + | 0.001542742 | -0.48
EIF4A2 | + | 0.011000377 | -0.33 | + | 0.025120176 | -0.18
EIF4A3 | + | 0.000212761 | -0.31 | + | 5.37E-05 | -0.35
EIF4G1 | + | 0.000113802 | -0.40 | + | 0.000133167 | -0.44
EIF5A;EIF5AL1 | + | 0.002139354 | -0.40 | + | 0.001171779 | -0.44
ELAC2 | + | 3.93E-05 | -0.65 | + | 0.000288949 | -0.75
EPB41L3 | + | 0.002307787 | -0.28 | + | 0.006469851 | 0.17
EPRS | + | 0.000524659 | 0.14 | + | 0.000246296 | 0.19
ERP44 | + | 0.007124413 | 0.14 | + | 0.000306277 | 0.63
ESYT1 | + | 0.002186182 | 0.32 | + | 0.000496212 | 0.71
EYA3 | + | 0.006811659 | -0.68 | + | 0.01039315 | -0.44
EZR | + | 2.49E-05 | -0.71 | + | 0.000378968 | 0.49
FANCI | + | 0.00027498 | -0.46 | + | 0.000649482 | -0.50
FH | + | 0.00090579 | -0.41 | + | 0.000530039 | 0.41
FKBP10 | + | 1.11E-06 | 0.52 | + | 0.0032649 | 0.42
FKBP4 | + | 0.014219581 | 0.13 | + | 0.00341986 | -0.20
FLNC | + | 0.00816654 | -0.19 | + | 0.000146265 | 0.69
FSCN1 | + | 0.000171027 | 0.38 | + | 0.000120598 | -0.20
GABARAPL2 | + | 0.009897539 | -0.17 | + | 0.003679403 | -0.26
GAPDH | + | 6.73E-05 | 0.38 | + | 1.93E-06 | 0.71
GARS | + | 0.004786184 | 0.13 | + | 0.023149555 | -0.10
GBP1;GBP2 | + | 0.000498066 | -3.16 | + | 0.01767033 | -3.25
GDI1 | + | 0.012734512 | 0.23 | + | 0.014197309 | 0.38
GEMIN2 | + | 0.000553995 | 0.24 | + | 8.99E-07 | 1.16
GMPS | + | 0.000824529 | -0.27 | + | 1.02E-05 | -0.46
GOT2 | + | 0.00199968 | -0.21 | + | 0.000150362 | 0.49
HACD3 | + | 0.000377737 | 0.52 | + | 0.005549531 | 0.42
HADHA | + | 0.003934586 | 0.44 | + | 0.004647366 | 0.69
HIGD1A | + | 0.001496463 | -0.46 | + | 4.32E-06 | 0.56
HINT1 | + | 3.54E-05 | 0.43 | + | 0.002275643 | 0.13
HK1 | + | 0.000163284 | 0.62 | + | 3.86E-05 | 1.07
HM13 | + | 0.000141665 | 0.36 | + | 1.50E-05 | 0.77
HSD17B12 | + | 0.00357258 | 0.25 | + | 0.000148023 | 0.76
HSD17B4 | + | 0.012207324 | 0.51 | + | 0.014294355 | 0.29
HSP90AA1 | + | 4.96E-05 | -0.17 | + | 0.000468663 | -0.41
HSP90AB1 | + | 0.007907544 | -0.12 | + | 0.000175866 | -0.29
HSP90AB2P | + | 0.002598617 | -0.14 | + | 0.002429652 | -0.19
HSP90B1 | + | 2.74E-06 | 0.58 | + | 4.78E-05 | 0.41
HSPA5 | + | 0.002964103 | 0.11 | + | 0.00013888 | 0.38
HYOU1 | + | 1.08E-05 | 0.46 | + | 8.19E-05 | 0.37
IDH1 | + | 1.57E-05 | 0.59 | + | 7.42E-05 | -0.85
IDH2 | + | 0.00099711 | 0.29 | + | 4.76E-05 | 0.57
IGF2BP1 | + | 0.001114184 | -0.23 | + | 3.88E-06 | -0.35
IGF2R | + | 0.007613144 | -0.53 | + | 4.47E-05 | 0.77
ILKAP | + | 0.000607056 | -0.40 | + | 8.97E-05 | -0.91
IMPA2 | + | 0.011465876 | -0.23 | + | 0.000557936 | -1.39
IMPDH2 | + | 2.68E-06 | -0.58 | + | 0.000563618 | -0.31
IPO4 | + | 0.006037971 | -0.39 | + | 0.000533096 | -0.61
IPO5 | + | 2.89E-05 | -0.55 | + | 0.002607503 | -0.29
IPO7 | + | 3.91E-06 | -0.17 | + | 1.72E-05 | -0.30
IPO9 | + | 0.013372179 | 0.21 | + | 0.000212604 | -0.30
ITGA1 | + | 0.000363545 | -0.38 | + | 7.25E-07 | 1.17
ITGB1 | + | 1.83E-05 | 0.59 | + | 4.02E-06 | 0.44
KCTD12 | + | 9.58E-05 | -0.99 | + | 0.012253846 | -0.40
KDELR1 | + | 0.015103229 | 0.13 | + | 0.000123491 | 0.37
KDELR2 | + | 0.006045607 | 0.20 | + | 0.001378497 | 0.44
KHSRP | + | 0.009762763 | -0.23 | + | 0.001042133 | -0.22
KIF5B | + | 1.03E-05 | 0.45 | + | 0.002995922 | 0.23
KLC1 | + | 0.00077164 | 0.40 | + | 0.021230363 | 0.31
KPNA2 | + | 5.12E-07 | -0.72 | + | 1.42E-05 | -0.52
KPNA4 | + | 0.000145457 | -0.29 | + | 9.23E-05 | -0.28
KPNB1 | + | 2.76E-05 | -0.32 | + | 3.76E-06 | -0.64
LAMB1 | + | 3.28E-05 | 0.52 | + | 4.10E-05 | 1.02
LAMC1 | + | 0.004412743 | 0.60 | + | 0.000270209 | 1.50
LAMP2 | + | 0.002066227 | 0.65 | + | 0.000397255 | 0.71
LDHB | + | 3.13E-06 | 0.40 | + | 0.002762873 | -0.15
LEPRE1 | + | 0.00086333 | 0.74 | + | 0.007029252 | 0.52
LGALS1 | + | 4.57E-05 | 0.45 | + | 6.69E-06 | 0.97
LMAN1 | + | 3.79E-05 | 0.57 | + | 0.001553425 | 0.45
LMAN2 | + | 0.000241559 | 0.45 | + | 0.000486842 | 0.29
LOXL3 | + | 0.018820293 | 0.54 | + | 0.001206341 | 0.40
MACF1 | + | 0.010744555 | -0.14 | + | 0.001220924 | 0.66
MANF | + | 0.007694984 | 0.36 | + | 0.002724979 | 0.27
MAP1B | + | 0.001477849 | 0.57 | + | 4.85E-05 | 0.99
MCAM | + | 0.008744828 | -0.46 | + | 0.000335181 | 0.98
MCM2 | + | 5.14E-06 | -0.40 | + | 9.26E-07 | -1.03
MCM3 | + | 2.88E-06 | -0.41 | + | 1.71E-05 | -0.57
MCM4 | + | 7.63E-06 | -0.40 | + | 2.05E-07 | -1.08
MCM5 | + | 0.00132745 | -0.20 | + | 0.000191787 | -0.47
MCM6 | + | 0.000334892 | -0.22 | + | 1.42E-06 | -0.98
MCM7 | + | 3.71E-05 | -0.35 | + | 2.08E-07 | -0.91
MDH1 | + | 0.000450112 | 0.13 | + | 0.012254448 | 0.12
MGST3 | + | 0.000125931 | 0.58 | + | 0.000730127 | 1.03
MIF | + | 0.00093852 | 0.26 | + | 7.36E-05 | -0.23
MLEC | + | 0.001566609 | 0.37 | + | 0.017405668 | 0.51
MSTN | + | 4.05E-06 | -1.06 | + | 1.45E-06 | -1.37
MTCH2 | + | 0.01306732 | 0.32 | + | 0.000840403 | 0.46
MTHFD1 | + | 0.000110861 | -0.24 | + | 0.001450304 | -0.25
MYDGF | + | 0.001717084 | 0.24 | + | 0.000108648 | 0.53
MYH10 | + | 0.002644973 | -0.18 | + | 0.004815469 | -0.18
MYH13 | + | 0.001317768 | 0.46 | + | 2.20E-06 | 0.98
MYL1 | + | 3.29E-05 | -1.31 | + | 4.13E-06 | -1.59
NAP1L1 | + | 8.09E-05 | 0.31 | + | 0.000503102 | -0.14
NCAPD2 | + | 0.000111525 | -0.31 | + | 2.74E-06 | -0.81
NCAPG | + | 0.000149725 | -0.42 | + | 3.65E-05 | -0.90
NCAPH | + | 8.54E-05 | -0.57 | + | 5.53E-06 | -0.89
NGFR | + | 0.018318147 | -0.62 | + | 2.40E-06 | -1.69
NMT1 | + | 0.001111925 | -0.51 | + | 0.000275191 | -0.47
NOMO1;NOMO3;NOMO2 | + | 9.55E-05 | 0.66 | + | 0.003183707 | 0.32
NPEPPS | + | 0.000388428 | -0.25 | + | 0.002265235 | -0.46
NSDHL | + | 8.75E-06 | 0.56 | + | 4.62E-05 | 0.60
NUDC | + | 0.007979596 | -0.10 | + | 1.17E-06 | -0.79
NUDT5 | + | 0.000351323 | 0.72 | + | 0.004676246 | -0.51
NUP155 | + | 0.000685357 | -0.45 | + | 0.015596359 | -0.38
NUP93 | + | 0.010257785 | -0.31 | + | 0.004274247 | -0.31
OLA1 | + | 0.001335852 | -0.15 | + | 0.001927578 | -0.39
OXSR1 | + | 0.010747854 | -0.23 | + | 0.000526475 | 0.26
P4HA1 | + | 2.69E-05 | 0.70 | + | 0.001255952 | 0.46
P4HB | + | 0.00090198 | 0.16 | + | 0.000773646 | 0.41
PAFAH1B2 | + | 0.002086175 | 0.11 | + | 0.026330957 | 0.14
PARVA | + | 0.017201913 | -0.26 | + | 0.013304738 | 0.23
PCBP1 | + | 0.007428333 | -0.05 | + | 0.000186977 | -0.17
PDCD5 | + | 0.000383541 | -0.57 | + | 0.000374158 | -0.35
PDCD6IP | + | 0.000148118 | -0.34 | + | 0.000985242 | -0.16
PDIA3 | + | 0.00056522 | 0.26 | + | 0.000360544 | 0.44
PDIA4 | + | 5.00E-06 | 0.52 | + | 0.011453102 | 0.15
PDIA6 | + | 0.006832968 | 0.14 | + | 0.016069805 | 0.12
PEBP1 | + | 0.000703726 | 0.29 | + | 0.008080157 | -0.09
PFAS | + | 0.001377578 | -0.36 | + | 2.21E-05 | -0.73
PFKP | + | 0.001108193 | 0.77 | + | 0.000936876 | 0.87
PGAM1 | + | 6.38E-06 | 0.54 | + | 4.72E-05 | 0.24
PGD | + | 0.008938366 | 0.16 | + | 0.003301683 | -0.14
PGLS | + | 0.008334977 | 0.25 | + | 0.000833538 | -0.31
PHGDH | + | 2.71E-05 | 0.31 | + | 1.46E-06 | -1.02
PKM | + | 0.002226732 | 0.18 | + | 3.45E-06 | 1.12
PLOD3 | + | 0.000584472 | 0.52 | + | 5.64E-05 | 0.59
POFUT1 | + | 0.00278567 | 0.33 | + | 5.90E-05 | 0.64
POTEE;POTEF;POTEI | + | 0.008664356 | 0.17 | + | 0.000182206 | 0.31
PPIA | + | 0.000700146 | 0.14 | + | 2.15E-05 | -0.27
PPIB | + | 1.51E-07 | 0.61 | + | 1.94E-06 | 0.98
PPP1CA | + | 0.002731141 | 0.10 | + | 0.006165678 | 0.12
PRDX1 | + | 1.14E-05 | 0.27 | + | 0.000341401 | -0.20
PRDX3 | + | 0.00589608 | -0.20 | + | 0.00316731 | 0.23
PRDX4 | + | 3.33E-05 | 0.31 | + | 0.005315837 | 0.25
PRKACA;PRKACB | + | 0.000146357 | -0.50 | + | 0.000219907 | 0.52
PRKAR1A | + | 0.001533793 | -0.42 | + | 0.000167613 | 0.27
PRPS1;PRPS1L1 | + | 0.016330573 | -0.19 | + | 0.0008596 | -0.65
PSMA2 | + | 0.005892297 | 0.17 | + | 0.003923498 | 0.15
PSMC1 | + | 4.54E-05 | -0.14 | + | 0.004989928 | -0.12
PSMC2 | + | 0.001039239 | -0.13 | + | 0.000785335 | -0.24
PSMC4 | + | 0.019280749 | -0.15 | + | 0.000112875 | -0.33
PSMC5 | + | 0.01739092 | -0.14 | + | 0.000817882 | -0.22
PSMD1 | + | 0.010702584 | -0.13 | + | 0.003647452 | -0.32
PSMD11 | + | 0.000199165 | -0.32 | + | 2.93E-05 | -0.45
PSMD12 | + | 0.006649667 | -0.34 | + | 0.002013175 | -0.25
PSMD13 | + | 0.000554423 | -0.25 | + | 0.001468379 | -0.22
PSMD2 | + | 6.96E-05 | -0.22 | + | 0.008340985 | -0.21
PSMD3 | + | 2.86E-05 | -0.42 | + | 0.000227763 | -0.33
PSMD7 | + | 0.001758344 | -0.32 | + | 6.70E-05 | -0.22
PSME3 | + | 0.001699502 | -0.31 | + | 0.000275991 | -0.74
PTGES3 | + | 0.000262198 | 0.18 | + | 5.86E-05 | -0.20
QPRT | + | 3.88E-06 | 0.86 | + | 0.007255773 | 0.41
RAB18 | + | 0.000325265 | 0.57 | + | 0.016939267 | 0.55
RAB1A | + | 0.00702251 | 0.12 | + | 0.001393495 | 0.23
RAB21 | + | 0.008749727 | 0.36 | + | 0.000125491 | 0.62
RAB8A | + | 0.000339709 | -0.53 | + | 0.019612316 | -0.18
RAC1;RAC3;RAC2 | + | 0.000404027 | -0.25 | + | 0.020553517 | 0.05
RAN | + | 0.001735314 | 0.22 | + | 0.002263739 | -0.18
RANBP1 | + | 0.001277215 | -0.26 | + | 0.000787848 | -0.56
RAP1B;RAP1A | + | 0.000657451 | 0.28 | + | 0.001153127 | 0.21
RCC2 | + | 0.001630462 | -0.24 | + | 3.78E-05 | -0.77
RCN1 | + | 0.001637904 | 0.36 | + | 0.01212461 | 0.27
RDX | + | 0.003806665 | 0.08 | + | 0.011315366 | 0.25
RHOC | + | 0.018236684 | -0.15 | + | 0.00548788 | 0.14
RPN2 | + | 0.003004636 | 0.27 | + | 0.000675876 | 0.29
RSU1 | + | 0.004608673 | 0.36 | + | 0.000212566 | 0.42
RTN4 | + | 0.000410691 | -0.37 | + | 0.000436675 | 0.50
RUVBL1 | + | 0.001368921 | -0.47 | + | 4.89E-05 | -0.30
RUVBL2 | + | 0.0002757 | -0.42 | + | 6.31E-05 | -0.41
S100A11 | + | 6.91E-05 | 0.54 | + | 2.52E-06 | 0.65
S100A4 | + | 0.004707559 | 0.26 | + | 0.004650759 | 0.26
SAR1A | + | 0.000135553 | 0.35 | + | 0.003281364 | 0.23
SBDS | + | 0.01140656 | -0.30 | + | 0.000231646 | 0.33
SDHA | + | 0.010201645 | -0.26 | + | 0.002699125 | 0.43
SERPINH1 | + | 1.65E-05 | 0.68 | + | 0.001185 | 0.48
SET | + | 0.011784556 | -0.12 | + | 3.02E-05 | -0.23
SF3A1 | + | 0.002758429 | -0.18 | + | 0.000217255 | -0.40
SF3B1 | + | 0.00011301 | -0.19 | + | 6.03E-06 | -0.81
SF3B3 | + | 0.002537388 | -0.18 | + | 0.000119759 | -0.36
SH3GL1 | + | 0.00919166 | -0.21 | + | 0.007939522 | 0.20
SLC16A1 | + | 7.19E-07 | -0.44 | + | 8.13E-05 | -0.19
SLC25A11 | + | 0.000438872 | -0.40 | + | 0.000660368 | 0.51
SLC25A3 | + | 0.001177185 | 0.23 | + | 0.000397287 | 0.47
SLC29A1 | + | 0.000226604 | -0.50 | + | 3.14E-05 | 0.49
SLC2A1 | + | 0.002743392 | -0.41 | + | 0.000707413 | 1.55
SMAP | + | 0.016093094 | -0.14 | + | 0.000319401 | -0.52
SMC1A | + | 0.013387792 | 0.22 | + | 0.019867307 | 0.26
SMC2 | + | 0.003671627 | -0.50 | + | 2.16E-05 | -1.12
SMC4 | + | 0.001733657 | -0.45 | + | 4.23E-05 | -0.95
SND1 | + | 0.00183935 | 0.33 | + | 0.005718783 | 0.11
SNRNP200 | + | 0.015635202 | -0.23 | + | 0.000391536 | -0.43
SNRPN;SNRPB | + | 0.0166603 | -0.09 | + | 4.77E-07 | -0.53
SOD2 | + | 0.00195655 | -0.81 | + | 0.000308843 | 0.78
SRM | + | 0.009401395 | -0.23 | + | 0.009233411 | -0.47
SRRM2 | + | 0.000384898 | -0.23 | + | 2.04E-06 | -0.54
SSB | + | 0.01788731 | -0.12 | + | 5.02E-05 | -0.70
SSR4 | + | 0.002481075 | 0.23 | + | 0.014494386 | 0.41
STC2 | + | 0.014577076 | 0.21 | + | 0.000970432 | 1.04
STIP1 | + | 0.000494791 | -0.10 | + | 4.06E-05 | -0.26
STMN1 | + | 0.000826133 | 0.40 | + | 2.47E-05 | -0.81
STT3A | + | 0.000916991 | 0.35 | + | 0.000118529 | 0.36
SUCLA2 | + | 0.0139496 | -0.24 | + | 0.002962763 | 0.36
SUGT1 | + | 0.001756179 | -0.17 | + | 5.18E-06 | -0.26
SUMO2;SUMO4;SUMO3 | + | 0.002281506 | -0.33 | + | 0.008430544 | -0.67
SURF4 | + | 0.001430669 | 0.41 | + | 0.00012089 | 0.71
TALDO1 | + | 6.62E-06 | 0.25 | + | 0.024182054 | 0.12
TBCA | + | 0.000748621 | 0.22 | + | 0.000768294 | 0.47
TCEA1 | + | 0.005402814 | 0.08 | + | 6.77E-05 | -0.82
TCP1 | + | 0.000191855 | -0.33 | + | 2.73E-05 | -0.30
TK1 | + | 0.000174403 | -0.65 | + | 0.024631136 | -0.70
TKT | + | 0.003161526 | 0.20 | + | 0.000209394 | -0.30
TMED10 | + | 0.000640803 | 0.45 | + | 0.0033727 | 0.25
TMED7 | + | 0.000605195 | 0.43 | + | 0.001257223 | 0.46
TMEM43 | + | 0.004361021 | 0.28 | + | 0.000309317 | 0.45
TMX2 | + | 0.01482896 | 0.28 | + | 3.51E-06 | 0.49
TNPO3 | + | 0.000148068 | 0.09 | + | 0.000208802 | -0.49
TOMM40 | + | 0.000452816 | -0.31 | + | 0.023469684 | 0.17
TPD52L2 | + | 2.40E-05 | 0.51 | + | 3.12E-06 | 0.35
TPI1 | + | 5.50E-05 | 0.25 | + | 0.000232628 | 0.40
TPT1 | + | 8.47E-05 | -0.34 | + | 0.000102524 | -0.27
TRIM28 | + | 0.000334257 | -0.55 | + | 0.000407075 | -0.41
TRIP13 | + | 0.001895594 | -0.38 | + | 5.08E-05 | -0.73
TRMT1 | + | 0.002204533 | -0.57 | + | 0.004501115 | -0.47
TUBA1A;TUBA3C;TUBA3E | + | 2.35E-05 | 1.20 | + | 7.76E-05 | 0.49
TUBB3 | + | 1.97E-05 | 0.67 | + | 0.000146745 | 0.80
TXN | + | 0.000229614 | 0.37 | + | 0.023475404 | -0.19
TXNDC17 | + | 0.000140029 | -0.41 | + | 0.00010217 | -0.31
UBE2C | + | 4.78E-07 | -0.47 | + | 0.004333087 | -0.53
UBE2D2;UBE2D3 | + | 0.001277212 | -0.13 | + | 0.014056881 | -0.17
UBE2N;UBE2NL | + | 0.003326821 | 0.30 | + | 0.002860631 | 0.17
UCHL1 | + | 3.74E-06 | 0.92 | + | 0.001107318 | 0.28
UGDH | + | 0.005207729 | 0.19 | + | 0.004641151 | -0.20
UGGT1 | + | 0.000439793 | 0.28 | + | 0.019427453 | -0.31
USO1 | + | 0.001010276 | -0.16 | + | 0.000395897 | 0.58
USP5 | + | 1.41E-05 | 0.23 | + | 0.000150941 | 0.49
VAMP3;VAMP2;VAMP1 | + | 1.21E-05 | -0.64 | + | 0.002897649 | 0.43
VASP | + | 0.000184744 | -0.70 | + | 3.29E-05 | 0.46
VCL | + | 0.002290981 | 0.29 | + | 8.17E-06 | 0.78
VCP | + | 0.000129224 | 0.31 | + | 0.000268027 | 0.18
VDAC1 | + | 0.000228602 | 0.35 | + | 0.000217987 | 0.68
VDAC2 | + | 0.000284804 | 0.43 | + | 8.96E-06 | 0.68
VIM | + | 0.006314736 | 0.96 | + | 0.001367872 | 1.10
VKORC1 | + | 0.013556343 | 0.41 | + | 0.004280212 | 0.60
VPS29 | + | 5.40E-05 | 0.30 | + | 7.45E-05 | 0.36
VPS35 | + | 0.000102964 | 0.17 | + | 1.15E-05 | 0.23
XPO1 | + | 0.001188856 | -0.15 | + | 6.21E-06 | -0.40
YBX1 | + | 0.005219131 | -0.21 | + | 0.002672687 | -0.20
YWHAZ | + | 6.63E-06 | 0.29 | + | 1.34E-05 | -0.65
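The inclusion and significance criteria given in the caption of Table S4.5 can be sketched as below. This is a minimal illustration under stated assumptions, not the analysis pipeline used in the thesis: the caption says only that "FDR was used", so the Benjamini-Hochberg procedure is assumed here; the one-sample t-test is not recomputed and p-values are taken as input; the function names and example values are hypothetical.

```python
def enough_valid_values(ratios, min_valid=4):
    """True if at least min_valid replicate ratios are present (not None)."""
    return sum(r is not None for r in ratios) >= min_valid


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Assumption: the thesis caption does not name the FDR procedure.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical log2 ratios for one protein across 6 replicates (None = missing).
replicate_ratios = [0.25, 0.31, None, 0.22, 0.30, None]
print(enough_valid_values(replicate_ratios))

# Hypothetical p-values for four proteins in one pairwise comparison.
pvalues = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(pvalues)
print([q < 0.05 for q in adjusted])
```

Under these assumptions, a protein would be listed in the table only if it passes the valid-value filter in each condition and its adjusted p-value is below 0.05 in both pairwise comparisons.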

Table S4.6. Primers and oligonucleotides used in this study.

Name | Sequence | Application
hVTRNA1-1_ChIP-F | GGCTGGCTTTAGCTCAGCG | ChIP-qPCR
hVTRNA1-1_ChIP-R | TCTCGAACAACCCAGACAGGT | ChIP-qPCR
htRNA-iMet_ChIP-F | AGAGTGGCGCAGCGGAA | ChIP-qPCR
htRNA-iMet_ChIP-R | TAGCAGAGGATGGTTTCGATCC | ChIP-qPCR
unbound_hchr13_ChIP-F | GGCACTGTCTTGTCACTGCACATT | ChIP-qPCR
unbound_hchr13_ChIP-R | TGGAAACAGCCATTGAGAACACC | ChIP-qPCR
POLR3A_M852V_g1_cloning-F | CACCgcacacaatggccggccggga | Cloning of sgRNA sequence into pSpCas9(BB)-2A-Puro
POLR3A_M852V_g1_cloning-R | AAACtcccggccggccattgtgtgc | Cloning of sgRNA sequence into pSpCas9(BB)-2A-Puro
POLR3A_M852V_g2_cloning-F | CACCgCTTCCCGGCCGGCCATTGTG | Cloning of sgRNA sequence into pSpCas9(BB)-2A-Puro
POLR3A_M852V_g2_cloning-R | AAACCACAATGGCCGGCCGGGAAGc | Cloning of sgRNA sequence into pSpCas9(BB)-2A-Puro
POLR3A_M852V_seq_F | aaggaaagggcaaagaaagc | PCR and Sanger sequencing to screen for mutation
POLR3A_M852V_seq_R | tcatttccagggcatctacc | PCR and Sanger sequencing to screen for mutation
POLR3A_M852V_MiSeq_F | ACACTGACGACATGGTTCTACAaaggaaagggcaaagaaagc | PCR and MiSeq sequencing to confirm compound heterozygosity
POLR3A_M852V_MiSeq_R | TACGGTAGCAGAGACTTGGTCTtcatttccagggcatctacc | PCR and MiSeq sequencing to confirm compound heterozygosity
POLR3A_M852V_ssODN-1 | tttttctacctctctttagCTCCCAGCTGCCAAAGGCTTTGTGGCTAATAGCTTTTATTCCGGTTTGACACCAACTGAGTTTTTCTTCCATACAGTGGCTGGTCGAGAAGGTCTAGTCGACACGGCTGTAAAGACAGCTGAAACGGGATACATGCAGgtaacctgaagaaatgtaggccattagcaacagagcagg | Homology-directed repair template
POLR3A_M852V_ssODN-2 | ATAGCTTTTATTCCGGTTTGACACCAACTGAGTTTTTCTTTCATACAGTGGCTGGCCGGGAAGGTCTAGTCGACACGGCTGTAAAGACAGCTG | Homology-directed repair template
POLR3A_M852V_cDNA_seq_F | GGCTCCAAAGGTTCCTTCAT | PCR and sequencing of POLR3A cDNA exons 17-21
POLR3A_M852V_cDNA_seq_R | AGCTCGTTTTTGCTGAGAGC | PCR and sequencing of POLR3A cDNA exons 17-21
BC200_g1_F | CACCgtgatgagctatataacccta | Cloning of sgRNA sequence into pSpCas9(BB)-2A-Puro
BC200_g1_R | AAACtagggttatatagctcatcac | Cloning of sgRNA sequence into pSpCas9(BB)-2A-Puro
BC200_g2_F | CACCgataaccctatggccagcaga | Cloning of sgRNA sequence into pSpCas9(BB)-2A-Puro
BC200_g2_R | AAACtctgctggccatagggttatc | Cloning of sgRNA sequence into pSpCas9(BB)-2A-Puro
BC200_g3_F | CACCgttaagaagctgaggaaagca | Cloning of sgRNA sequence into pSpCas9(BB)-2A-Puro
BC200_g3_R | AAACtgctttcctcagcttcttaac | Cloning of sgRNA sequence into pSpCas9(BB)-2A-Puro
BC200_g4_F | CACCgttcctcagcttcttaaagct | Cloning of sgRNA sequence into pSpCas9(BB)-2A-Puro
BC200_g4_R | AAACagctttaagaagctgaggaac | Cloning of sgRNA sequence into pSpCas9(BB)-2A-Puro
BC200_screening-F | gggtgtgagggttggaaaat | PCR screening for BC200 deletion
BC200_screening_R | gcagtagcagcagcatttca | PCR screening for BC200 deletion
BC200_MiSeq-F | ACACTGACGACATGGTTCTACAgggtgtgagggttggaaaat | PCR and MiSeq sequencing to confirm BC200 deletion
BC200_MiSeq_R | TACGGTAGCAGAGACTTGGTCTgcagtagcagcagcatttca | PCR and MiSeq sequencing to confirm BC200 deletion
hBC200_qRT-PCR-F | GGTGGCTCACGCCTGTAATC | qRT-PCR
hBC200_qRT-PCR-R | GAACTCCTGGGCTCAAGCTATC | qRT-PCR
tRNA-Leu-CAA-1-2-F | CTCAAGCTTGGCTTCCTCGT | qRT-PCR
tRNA-Leu-CAA-1-2-R | GAACCCACGCCTCCATTG | qRT-PCR
tRNA-Tyr-GTA-8-1-F | AGCGGAGGACTGTAGGTTCA | qRT-PCR
tRNA-Tyr-GTA-8-1-R | GATTCGAACCAGCGACCTAA | qRT-PCR
tRNA-Ala-TGC-1-1-F | GAACCCGGGACCTCATACAT | qRT-PCR
tRNA-Ala-TGC-1-1-R | GGGGGTGTAGCTCAGTGGTA | qRT-PCR
PMM1-F | gacagcttcgacaccatcca | qRT-PCR
PMM1-R | cggcaaagatctcaaagtcgtt | qRT-PCR
NDUFS2-F | ggaatgggcacagcagtt | qRT-PCR
NDUFS2-R | ctttggagggtccacatca | qRT-PCR
PSMB6-F | caagaaggagggcaggtgtact | qRT-PCR
PSMB6-R | cctccaatggcaaaggactg | qRT-PCR
SDHA-F | ctgtcttcatacgcttctgcactc | qRT-PCR
SDHA-R | ccagccactaggtgccaatc | qRT-PCR
COPS7A-F | ctggccacactcatccatc | qRT-PCR
COPS7A-R | aggtagaggcaaagtcactctca | qRT-PCR
T7-ERCC-00051-F | GGCGCCTAATACGACTCACTATAGGTTACAATAGATGCGTTTAGAG | Production of template for in vitro transcription of spike-in RNAs
ERCC-00051-R | ATAGAGAGAACAACAACCCCTTTC | Production of template for in vitro transcription of spike-in RNAs
T7-TR94-F | GGCGCCTAATACGACTCACTATAGGCGAAATTAATACGACTCAC | Production of template for in vitro transcription of spike-in RNAs
TR94-R | TGGTATATCTCCTTCTTAAAG | Production of template for in vitro transcription of spike-in RNAs
T7-SRQC70-F | GGCGCCTAATACGACTCACTATAGGTCTAGTACTATGTATATGATATC | Production of template for in vitro transcription of spike-in RNAs
SRQC70-R | ATCGCGTGCATAGTGTAGCATAG | Production of template for in vitro transcription of spike-in RNAs
G-block | TTACAATAGATGCGTTTAGAGTAGCTGGGGGAATTTTGCTCTTTAAAATAGCTTGGGACATGCTTCACGCAGAAATTCCAAAAACAAAGCACAAACCAGATGAAAGATTAGACCTTGAAGATATTGATAGTATAGTTTATGTCCCATTGGCTATTCCTTTAATCTCTGGCCCTGGAGCTATAACAACAACCATGATTTTGATTAGCAAAACCCAGAGTATCTTAGAGAAAGGGGTTGTTGTTCTCTCTATCGAAATTAATACGACTCACTATAGGGGAATTGTGAGCGGATAACTGACTGACTGACTAAATAATTTTGTTTAACTTTAAGAAGGAGATATACCATCTAGTACTATGTATATGATATCACACTCAGCACTACGTACTCATAGCTATGCTACACTATGCACGCGAT | Production of template for in vitro transcription of spike-in RNAs
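The sgRNA cloning oligonucleotides in Table S4.6 follow a visible pattern: the forward oligo is CACC, an extra g, then the protospacer, and the reverse oligo is AAAC, the reverse complement of the protospacer, then c. The sketch below reproduces that pattern as an illustration only; the function names are hypothetical, the 20-nt protospacer length is inferred from the table entries, and nothing here is taken from the thesis methods.

```python
# Complement table that preserves upper/lower case, as in Table S4.6.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse-complement a DNA sequence, preserving letter case."""
    return seq.translate(COMPLEMENT)[::-1]


def sgrna_cloning_oligos(protospacer):
    """Build the forward/reverse oligo pair for a 20-nt protospacer.

    Pattern inferred from Table S4.6 (an assumption, not a stated protocol):
    forward = CACC + g + protospacer
    reverse = AAAC + reverse complement of protospacer + c
    """
    if len(protospacer) != 20:
        raise ValueError("expected a 20-nt protospacer")
    forward = "CACC" + "g" + protospacer
    reverse = "AAAC" + reverse_complement(protospacer) + "c"
    return forward, reverse


# Protospacer taken from POLR3A_M852V_g1_cloning-F without the CACCg prefix.
forward, reverse = sgrna_cloning_oligos("cacacaatggccggccggga")
print(forward)
print(reverse)
```

Applied to the protospacers of the two POLR3A guides, this pattern yields the same oligonucleotide pairs as listed in the table, which is a quick consistency check on the transcribed sequences.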

CHAPTER 5: General Discussion

5.1 Summary

This thesis focused on the use of genome-wide approaches to gain a better understanding of rare neurological disorders present in the French Canadian population. It led to progress on two equally important fronts for ataxia-related disorders: identification of new genetic causes and insights into pathophysiological mechanisms. In Chapter 2, we used WES to identify causal recessive mutations in two different genes, SPG7 and PMPCA, resulting in a genetic diagnosis in approximately 40% of our unresolved ARCA cohort and ending the protracted diagnostic odyssey of 24 individuals. In Chapter 3, we showed that the POLR3A G672E French Canadian founder mutation causing POLR3-HLD does not lead to impaired motor function, hypomyelination or cerebellar atrophy in mice, suggesting that Pol III vulnerability to mutations varies between species. In Chapter 4, we created a cellular model of POLR3-HLD and used two complementary RNA-seq approaches to show that a subset of Pol III transcripts is more vulnerable to POLR3A mutations. Notably, we found that BC200 RNA is one of the most affected transcripts and that it may play an important role in cells of the oligodendroglial lineage.

5.2 New insights into the genetic basis of ARCAs

5.2.1 Phenotypic expansion and the ataxia-spasticity spectrum

In Chapter 2, we showed that mutations in two different genes, SPG7 and PMPCA, are novel causes of ARCA in the French Canadian population. The first gene, SPG7, was already known to cause recessive hereditary spastic paraplegia (HSP), a disorder characterized by spasticity due to progressive degeneration of the corticospinal tracts.19 We uncovered bi-allelic SPG7 mutations in 22 cases belonging to 12 families (38.7% of the families screened) using a combination of WES and Sanger sequencing. Importantly, our study was among the first214,224 to show that SPG7 mutations can cause predominant or pure cerebellar ataxia and that they are one of the most common causes of ARCA.19 Two years later, SPG7 is now widely considered in the differential diagnosis of ARCAs, both in clinical and research settings. In fact, in a recent WES study of the largest cohort of unresolved cerebellar ataxia cases to date, SPG7 mutations explained the highest number of cases (14/319).33

On a larger scale, we contributed to the growing evidence that mutations in the same gene can lead to predominant HSP in some cases and predominant ataxia in others. This has prompted experts to suggest the existence of a phenotypic continuum, the ataxia-spasticity spectrum, as opposed to two distinct entities.19 Other examples include PNPLA6, which causes disorders at both extremes of the ataxia-spasticity spectrum,308-310 and KIF1C, associated with HSP with ataxic features and cerebellar ataxia with variable spasticity.311,312 In addition to this clinical overlap, a recent review by Synofzik and Schüle19 points out the existence of shared molecular pathways between the two groups and suggests that the corticospinal tracts and cerebellar circuits could be vulnerable to dysfunction of the same molecular processes, such as lipid metabolism, acid metabolism and cytoskeleton or dendritic intracellular transport.19 Mitochondrial dysfunction is also common, including in SPG7 where the encoded protein, paraplegin, is involved in mitochondrial protein quality control.17

Importantly, our experience with SPG7 is part of a more widespread phenomenon of phenotype expansion, brought on by the WES revolution and observed in many rare diseases. Before WES, genes associated with similar but clinically distinct diseases were rarely screened. Indeed, SPG7 had never been considered as a potential causal gene in the French Canadian ARCA cohort until we uncovered the first mutations using WES. By allowing the simultaneous sequencing of most protein-coding genes, WES revealed a plethora of mutations in genes that were already associated with other rare diseases. In fact, since 2012, novel phenotypes associated with known disease-causing genes make up almost half of all new gene-phenotype relationships.42 While there is often phenotypic overlap among diseases caused by mutations in the same gene, as is the case for SPG7, this is not always true. For example, mutations in TUBB4A can cause dystonia or leukodystrophy, two entirely distinct disorders,313,314 and the phenotypic spectrum associated with the gene continues to expand.315 Interestingly, a genotype-phenotype correlation is emerging, with a specific mutation causing dystonia while others are only observed in leukodystrophy.79 However, this is not always the case in other diseases with phenotypic expansion. In fact, most of the SPG7 mutations found in our ARCA patients had been previously observed in HSP cases. Thus, further studies will be required to determine the factors responsible for the spectrum of phenotypes associated with the same gene or the same mutation.

5.2.2 PMPCA-related ataxia, an ultra-rare genetic disorder

In the second part of Chapter 2, we described the identification of a homozygous missense mutation in PMPCA in a single pedigree affected with childhood-onset ARCA. Our experience with this family illustrates well the challenges of working on ultra-rare genetic diseases. In fact, the strongest proof of causality for a new disease-causing gene is the identification of multiple unrelated families with the same rare disease and mutations in the same gene.42 Thus, although the PMPCA mutation was the only strong candidate in our patient’s WES data, we could not confirm the genetic diagnosis in the absence of a second family. Almost one year later, Jobling et al.1 published their identification of PMPCA mutations in four independent families affected with a very similar form of ARCA, which convinced us that we had truly found the causative gene in our family. Therefore, we were able to report our novel PMPCA mutation in a Letter to the Editor in response to Jobling et al.1 and to describe some phenotypic differences. More importantly, this allowed us to provide a long-awaited genetic diagnosis to our patients.

However, we still have other single ARCA families with promising variants that remain unresolved because of the absence of a second family. In fact, it is estimated that there are more than 1,000 unpublished patients affected with rare diseases for which a single candidate gene has been identified with supporting segregation and functional data.42 Even after publication of such a single family, it frequently takes 2-3 years to identify additional unrelated individuals with likely pathogenic mutations in the same gene.42 To overcome these limitations, multiple collaborative “matchmaking” platforms have been developed to identify researchers with interest in a common gene and to put them in contact (e.g. PhenomeCentral,316 GeneMatcher317). While our group has yet to have a successful “match” in any of our ARCA families, several novel disease-causing genes have been recently published due to collaborations that were initiated on these platforms.318-320 For example, two independent groups found the same de novo mutation in the gene TMEM106B in four patients with hypomyelinating leukodystrophy, resulting in a “match” on GeneMatcher and a collaborative publication of this novel gene.318 Shortly after, a Chinese group published a fifth case,321 suggesting that use of these data-sharing infrastructures needs to be increased to link all international laboratories working on similar rare diseases.

Two years after the publication of our article on PMPCA mutations in ARCA, it remains an ultra-rare genetic disorder. Only one other family has been published with PMPCA mutations, but with a much more severe phenotype, representing another case of phenotypic expansion. Indeed, two other missense mutations in PMPCA were uncovered in a family with severe mitochondrial disease and multisystem involvement (developmental delay, hypotonia, weakness, respiratory insufficiency, blindness, etc.).322 This discrepancy in phenotype is thought to result from one of the new mutations, which directly perturbs the glycine-rich catalytic loop,322 as opposed to the ARCA-causing mutations located further away from this important domain. This possible genotype-phenotype correlation will only be confirmed if additional cases with ARCA or mitochondrial disease are identified, which may take time considering the apparent rarity of PMPCA mutations.

5.2.3 Autosomal Recessive Spastic Ataxia with Leukoencephalopathy: not a single disease entity

It is important to note that affected individuals from seven families with SPG7 mutations and from the PMPCA family were originally diagnosed with Autosomal Recessive Spastic Ataxia with Leukoencephalopathy (ARSAL), a putative novel form of ARCA with geographical links to Portneuf County south of Québec City.40 In 2012, 54 cases, including affected members from these eight families, were reported to have rearrangements in the gene MARS2 that were causative of ARSAL.41 However, these findings were never replicated in an independent cohort, and the important clinical and radiological variability observed in ARSAL raised the likelihood that the cohort was genetically heterogeneous. Our identification of pathogenic SPG7 and PMPCA mutations in eight ARSAL families adds to other unpublished evidence from our laboratory showing that MARS2 mutations are not a cause of ARCA, that ARSAL is not a single disease entity, and that it should not be used as a diagnostic label.

5.3 The challenges of developing a disease model for POLR3-HLD

In Chapters 3 and 4, I focused on POLR3-HLD, a disease that is classified as a leukodystrophy because of prominent myelination deficits, but where patients often display ataxia and other cerebellar features. With recent reports of POLR3A mutations in patients with spastic ataxia but without typical white matter abnormalities,91,92 POLR3-HLD is another perfect example of phenotype expansion, and also belongs to the aforementioned ataxia-spasticity spectrum. Although understanding the molecular basis for these phenotypic differences would be fascinating, in this thesis I concentrated on the “classical” POLR3-HLD cases and on trying to explain the link between Pol III function and myelination.

5.3.1 Mouse models and myelin disorders One of the main challenges of my thesis was to develop a disease model for POLR3- HLD in which we could study the pathophysiological mechanisms responsible for the disease. We tried two alternative approaches: an in vivo mouse model and an in vitro cellular model, both presenting their advantages and limitations. In Chapter 3, we described two Polr3a mouse models, a KI/KI and a KI/KO mouse, carrying the c.2015G>A (p.G672E) mutation in homozygosity or in compound heterozygosity with a null allele, respectively. In certain rare neurological disorders, generation of mice that mirror the clinical features observed in humans has allowed to gain important insights into the pathophysiology of the disease,203 in large part because of the accessibility and virtually unlimited supply of the relevant CNS tissues. Unfortunately, our Polr3a KI/KI and KI/KO mice did not show any neurological or molecular abnormalities, thus greatly limiting their usefulness. Nonetheless, these negative data contribute to the growing evidence that myelin disorders are difficult to recapitulate in mice. 
When we submitted the article relating our findings, most of the other documented examples were not of hypomyelinating but of demyelinating leukodystrophies.323 In the group of hypomyelinating leukodystrophies (HLD), Pelizaeus-Merzbacher disease (PLP1 gene) appears to be the only form that is properly recapitulated by several mouse models,76 while Pelizaeus-Merzbacher disease-like models (GJC2 gene) only display mild myelin deficits.257,258 Other recent HLD mouse models published after our own show mixed results: a Fam126a KO mouse, a model for Hypomyelination with congenital cataracts, does not display an abnormal phenotype or defects in myelination;324 an HSPD1 mutant transgene under the control of the MBP promoter causes some myelin defects, although these were not described in much detail;325 at the other end of the spectrum, Dars KO mice, a model for Hypomyelination with Brain stem and Spinal cord involvement and Leg spasticity, die in utero before embryonic day 11,326 as would be expected with the complete absence of an essential aminoacyl-tRNA synthetase. Thus, the absence of a phenotype in Polr3a KI/KI and KI/KO mice is not an anomaly among mouse models for HLDs. In addition, double mutant mice carrying a Polr3a homozygous mutation (c.2015G>A) and a Polr3b heterozygous mutation (c.308G>A) are also normal (unpublished results, K. Choquet and B. Brais), providing further evidence that mice may not be the proper model for this disease, although we cannot exclude that other mutations in Polr3a or Polr3b would cause a phenotype.

5.3.2 Advantages and limitations of a human cellular model

In Chapter 4, we took a different approach, using a cellular model to gain more insight into the functional impact of POLR3A mutations. The main advantage is that human cell lines express the full human Pol III transcriptome, including primate-specific transcripts that are not present in other model organisms, which allowed us to identify a deregulation of BC200 RNA in POLR3A mutants. Furthermore, mutant cell lines are much faster to produce and maintain than animal models. On the other hand, while CRISPR-Cas9 is an extremely powerful technique, it is not without limitations. First, introducing KI mutations with CRISPR-Cas9 is much less efficient than introducing KO mutations.302 This was well illustrated in Chapter 4, where we mostly obtained compound heterozygous cells, even after screening more than 100 clones. Second, off-target mutations remain a possible issue.283 We addressed this by screening the top 5 or 10 predicted off-target sites in each clone used for subsequent experiments, but short of performing WGS in every clone, we cannot rule out that a modification occurred elsewhere and could affect experimental results. Another major limitation of our cellular model is that HEK293 and MO3.13 cell lines are immortalized and do not necessarily recapitulate the in vivo context of POLR3-HLD. While MO3.13 cells resemble OPCs, they cannot achieve myelination,74 making it impossible to evaluate the impact of POLR3A mutations on this aspect of oligodendrocyte biology. Although we identified some differences in the transcriptome and the proteome of POLR3A mutant cells, we cannot exclude that the effects of POLR3A mutations are different or more pronounced in vivo, perhaps especially at certain crucial time points in CNS development.
Thus, while our in vitro model was very useful in allowing us to identify transcripts that are more vulnerable to Pol III dysfunction, it remains crucial to validate these results in vivo, or at least in a model where myelination occurs. The fact that BC200 RNA, our top candidate for a role in the disease, is only expressed in primates is another limitation. In a recent study, Pelizaeus-Merzbacher disease, the most common form of HLD, was modeled using human induced pluripotent stem cells (iPSC) differentiated into oligodendrocytes. By generating iPSCs from the fibroblasts of 12 patients and initiating their differentiation into the oligodendrocyte lineage, the authors were able to identify defects in OPC numbers and oligodendrocyte morphology and to assess early myelination by co-culturing the oligodendrocytes with rat dorsal root ganglion neurons.72 Considering that we have access to primary fibroblasts from several POLR3-HLD patients, using iPSC-derived oligodendrocytes would be the next logical step for evaluating the impact of POLR3A mutations on human oligodendrocyte biology in cells that express the same Pol III transcripts that are present in the CNS of POLR3-HLD patients, including BC200 RNA. Furthermore, although we have mostly focused on oligodendrocytes in this thesis, neurons could play an equally important role in POLR3-HLD pathogenesis. Thus, repeating the same experiment with iPSC-derived neurons would make it possible to identify cell-type specific effects, and perhaps to distinguish the contribution of each cell type to the pathology. However, the generation of iPSC lines remains expensive, and the need for isogenic controls requires correction of the mutant cells by CRISPR-Cas9 before any comparative experiment can be performed. Furthermore, there is no evidence to show that the in vivo levels of the Pol III transcriptome are properly recapitulated in iPSC-derived cells, and thus the functional defects responsible for POLR3-HLD might not be reproduced. Ultimately, a model organism will likely be necessary to fully understand the interplay between neurons and oligodendrocytes in the disease, and to eventually serve for pre-clinical testing of potential therapies. If BC200 RNA does play an important part in the disease, then its primate-specific expression will remain a limitation to modeling the disease in a small organism, and generating a POLR3-HLD primate model might be the only option.
Generation of mutant primates is possible using CRISPR-Cas9 gene editing, but it poses greater technical challenges and ethical considerations than for smaller organisms, and would require a much larger financial expenditure.327

5.3.3 Limitations of patient-derived fibroblasts

Finally, the third type of model used in this thesis was POLR3-HLD patient-derived primary fibroblasts. Here, they helped support our findings from the CRISPR-Cas9 cellular model, showing that BC200 RNA is also downregulated in patient cells. However, the microarray and RNA-seq data from fibroblasts had actually been produced several years beforehand but were difficult to interpret on their own due to the limitations of using patient-derived fibroblasts. Beyond the obvious fact that fibroblasts are not a known affected tissue in POLR3-HLD, other issues inherent to rare diseases made the work with these samples challenging. First, small sample sizes limit the statistical power to identify differences between patients and controls. Second, differences in genetic background can have a major impact on gene expression and increase the variability among individuals. For instance, it was shown that POLR3A mRNA levels vary widely among healthy individuals.92 Similarly, we observed high inter-individual variability in the levels of Pol III transcripts among both control and patient groups in our microarray data, resulting in a very small number of statistically significant differences, including BC200 RNA. In the second set of experiments, where we performed RNA-seq, we obtained fibroblasts from a more homogeneous group of cases composed of four French Canadian patients, and compared them to three age- and sex-matched controls of presumed French Canadian background. BC200 RNA also emerged as a decreased transcript in patients, but inter-individual variability was still substantial. Thus, until we observed the same trend in isogenic HEK293 POLR3A mutant clones, it was unclear whether the differences in BC200 RNA were biologically significant or the result of inter-individual heterogeneity. However, now that all our datasets can be compared, the fact that this transcript is always the most significantly decreased one strongly suggests that this result is biologically meaningful. Nonetheless, our difficulties in comparing data from different patient-derived lines highlight the need to produce an isogenic control line for each patient, which is now possible with CRISPR-Cas9, albeit challenging. Of course, this is not always feasible for rare diseases, depending on the available and affected tissues. As an example, we also have brain tissue from two deceased POLR3-HLD patients, which produced high quality RNA-seq data, but the lack of appropriate controls has prevented us from drawing any solid conclusions.
Interestingly, we do see a trend towards decreased expression of BC200 RNA in the patients compared to healthy brain RNA-seq data obtained from a collaborator (unpublished data, K. Choquet, C.L. Kleinman, G. Turecki), but this would have to be corroborated in patient and control samples that are processed simultaneously, ideally from a similar genetic background.
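
To give a sense of how much a design with four patients and three controls constrains statistical testing, the short Python sketch below computes the smallest two-sided p-value attainable with an exact rank-based permutation test, which is reached only when the two groups are completely separated. This test is chosen purely for illustration and is an assumption; it is not necessarily the analysis applied to the microarray or RNA-seq data, and the function name is hypothetical.

```python
from math import comb


def min_two_sided_p(n_patients, n_controls):
    """Smallest two-sided p-value of an exact rank-based permutation test.

    Assumes no ties. The minimum is reached when every value in one group
    is more extreme than every value in the other: one arrangement in each
    tail out of all possible ways to assign samples to the two groups.
    """
    arrangements = comb(n_patients + n_controls, n_controls)
    return min(1.0, 2 / arrangements)


# Design described above: four patients versus three controls
p_min = min_two_sided_p(4, 3)
print(f"{p_min:.4f}")
```

With four patients and three controls there are 35 possible group assignments, so under this particular test the smallest attainable two-sided p-value is 2/35 (about 0.057). A transcript with perfectly separated groups would therefore still not pass a 0.05 threshold, even before any correction for multiple testing.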

5.4 Insights into Pol III biology and POLR3-HLD

5.4.1 Lessons learned on Pol III biology

Although the work described in Chapters 3 and 4 did not provide definitive answers regarding POLR3-HLD pathophysiology, it did lead to some insights into Pol III biology. Until recently, Pol III function in mammalian cells had mostly been studied in the context of cancer, metabolism and immunity.194,328-331 Furthermore, the downstream impact of direct alterations in Pol III function had only been examined for a few select transcripts.245 Herein, we found that Pol III transcript levels are quite resistant to genetic perturbations in its largest catalytic subunit, POLR3A. In mice, homozygosity for the Polr3a G672E mutation or compound heterozygosity for this mutation and a null allele did not alter the levels of tRNAs and Bc1 RNA. In POLR3A mutant human cells, only a proportion of Pol III transcripts were downregulated, while others showed no change. Even for the transcripts that were significantly decreased in mutants, this reduction generally did not go beyond 50%. This was somewhat surprising considering that both the M852V mutation and the accompanying null mutations are directly located in the bridge helix,105 a crucial structural element of the polymerase with an important role in transcription elongation.282 Thus, we would have expected a more robust negative effect on transcription elongation, since the mutant cells do not express any wild-type POLR3A. It is possible that there are other mechanisms at play to ensure a relative stability of Pol III transcripts. For example, RMRP, a transcript for which we actually observed increased expression in some mutants, is part of an RNP that produces siRNAs, which in turn target RMRP and regulate its expression level through a negative feedback loop.144 One can imagine that reduced RMRP transcription by Pol III would lead to decreased levels of the corresponding siRNAs, thus stabilizing the levels of its transcripts. Another limitation of the CRISPR-Cas9 approach is that surviving cells can find ways to adapt to their new genetic mutations during the several weeks of clonal selection before they are subjected to experiments. Thus, it is difficult to dissociate the effects of Pol III hypofunction from possible compensatory mechanisms, especially some that may not be present in vivo.
An alternative would be to use conditional rapid POLR3A depletion with the auxin inducible degron system,332 which would make it possible to examine the direct effects of POLR3A loss on the levels of its transcripts, for example to determine which transcripts are most stable or affected first. An additional takeaway from this thesis is that Pol III function and/or vulnerability to mutations varies between species. Although the POLR3A G672E mutation can cause a milder phenotype in humans, it is also associated with POLR3-HLD cases that have a more typical progression of the disease. While the inter-species differences in myelination probably contribute to the lack of neurological phenotype in Polr3a KI/KI and KI/KO mice, they are likely not the only reason. In fact, cerebellar atrophy, which is thought to occur independently of hypomyelination in POLR3-HLD patients,89,91,92 was also absent in our mice. Contrary to myelin deficiencies, cerebellar dysfunction is generally well recapitulated in mouse models,44,203 so its absence in Polr3a mutant mice suggests the existence of intrinsic differences in Pol III function, regulation and/or transcripts between the two species.

5.4.2 Pol III transcripts as candidate effectors of POLR3-HLD

• BC200 RNA

In Chapter 4, BC200 RNA emerged as the most significantly downregulated Pol III transcript in both POLR3A mutant cells and patient-derived fibroblasts. BC200 is a primate-specific transcript and is highly expressed in the brain, where it localizes to neuronal dendrites.151 These characteristics make BC200 an attractive candidate for an involvement in POLR3-HLD, since they could at least partially explain both the specific vulnerability of the CNS to Pol III dysfunction and the lack of phenotype in Polr3a KI/KI and KI/KO mice. Nevertheless, rodents express Bc1 RNA, which is thought to be a functional analog of BC200 despite different evolutionary origins.151 Targeted deletion of Bc1 in mice does not alter general health or gross brain morphology, but leads to mild behavioural, neurophysiological and molecular phenotypes.277,333-336 These reports suggest that Bc1 plays a role in synaptic plasticity, social behavior and learning. No motor deficits or brain anomalies that would be reminiscent of POLR3-HLD have been reported in these mice. Thus, if BC200 RNA indeed has a role similar to that of Bc1, this argues against its involvement as an effector in the disease. However, considering the fact that myelin deficits are difficult to model in mice, we cannot draw real conclusions regarding the potential role of BC200 RNA in myelination simply from the fact that it is normal in Bc1 KO mice. Furthermore, Bc1 and BC200 seem to be differentially affected by mutations in POLR3A: mice homozygous for the G672E mutation have normal Bc1 levels (Chapter 3), while patient-derived fibroblasts that are homozygous for the same mutation all show decreased levels of BC200 (Chapter 4). This points back to possible differences in Pol III vulnerability to mutations in mice and humans, or to different regulation of Bc1 and BC200 RNAs.
Furthermore, while BC200 and Bc1 were found to have a similar role in translational repression in vitro,163 the specific function or targets of BC200 have largely been unexplored in vivo. As mentioned before, BC200 expression was not detected in the white matter using in situ hybridization in adult brains,151 but it could perhaps be expressed in cells of the oligodendrocyte lineage earlier in development. Alternatively, even a low level of expression could be important for oligodendroglial function. In fact, several recent studies have shown that BC200 RNA is also expressed in other tissues and in various immortalized and primary cell lines, although at much lower levels than in the brain.165 In cell lines, BC200 was found to interact with different proteins to regulate alternative splicing,291 mRNA stability,292 and translation in p-bodies,337 perhaps in a cell-type dependent fashion, suggesting that its role does go beyond the analogies drawn from studying Bc1 RNA. In vitro experiments have shown that BC200 represses translation by inhibiting eIF4A’s helicase activity, indicating that it may be especially important for mRNAs with complex 5’ UTRs, which are enriched among mRNAs located at synapses.334 BC200 also interacts with eIF4B and disrupts its interaction with 18S rRNA.163,164 These mechanisms have been demonstrated in vivo for Bc1 RNA, but not for BC200 RNA. Furthermore, analysis of human brain extracts has shown a direct interaction of BC200 with synaptotagmin-binding cytoplasmic RNA protein (SYNCRIP), a component of mRNA granules that are shuttled to post-synaptic sites, suggesting an additional role in mRNA transport.152,260 In Chapter 4, we showed that BC200 KO in the MO3.13 cell line, which resembles OPCs, leads to major changes at the proteome level, suggesting a function in this cell type, although our findings will have to be confirmed in primary OPCs and oligodendrocytes. Since local translation also occurs in oligodendrocytes at the site of myelination, away from the cell body, BC200 RNA could be implicated in the regulation of this process or in mRNA transport, as it is thought to be in dendrites.
Notably, MBP mRNA, encoding one of the major CNS myelin proteins, is transported in RNA granules and synthesized locally at the site of myelination.338 Repression of MBP translation during transport is mediated in part by hnRNP-E1,339 which was also shown to regulate BC200 RNA function in vitro.340 A possible role of BC200 RNA in regulating MBP translation could provide a link to the hypomyelination observed in POLR3-HLD. Alternatively, considering the dependence of myelination on neuronal activity,66,341 impaired BC200 RNA function in neuronal dendrites could indirectly affect myelination by disrupting signaling to oligodendrocytes. Although not the main focus of this thesis, cerebellar atrophy is a major feature of POLR3-HLD and other Pol III-related ataxias.89,91,92 BC200 RNA is predominantly located in dendrites; as such it could play an important role in the integrity of cerebellar Purkinje cells, which have extensive dendritic arbors,11 by regulating local translation, as was described for Bc1 RNA in other neuronal types.154,334,335 In this scenario, downregulation of BC200 RNA caused by mutations in Pol III subunits could deregulate the translation and/or transport of mRNAs important for cerebellar dendritic function, impact neuronal integrity and lead to neurodegeneration. Profiling BC200 RNA expression in POLR3A cases that have spastic ataxia but no hypomyelination92 could help to determine whether BC200 is implicated in the cerebellar phenotype in POLR3-HLD.

• Other Pol III transcripts

We speculate that the pathogenesis of POLR3-HLD likely results from the perturbed action of multiple Pol III transcripts. As such, BC200 RNA may be responsible for the cerebellar phenotype, while other transcripts are important in myelination, or vice versa. Most likely, combinations of transcripts produce synergistic effects that explain each aspect of the phenotype. In Chapter 4, we found that aside from BC200 RNA, the other transcripts most affected by POLR3A mutations were nuclear-encoded tRNAs and the signal recognition particle 7SL RNA. Of course, it is possible that the effects of POLR3A mutations are specific to each cell type or developmental time point and that other Pol III transcripts are downregulated in other spatio-temporal contexts. Nonetheless, both of these groups of transcripts have functions that could be relevant to the disease pathophysiology. The field of leukodystrophy research favors a tRNA-centric hypothesis to explain the pathophysiology of POLR3-HLD, in part because a significant number of other disorders related to tRNA biology include white matter abnormalities and/or cerebellar atrophy.106,111 Based on our transcriptome analysis, tRNAs are indeed among the most affected Pol III transcripts in POLR3A mutant cells. The crucial question is whether a global reduction in tRNA levels is sufficient to impair protein synthesis.
One of the main reasons why tRNAs are thought to play a central role in POLR3-HLD is that mutations in four aminoacyl-tRNA synthetases (ARS) and one of the auxiliary proteins of the multisynthetase complex (AIMP1) cause hypomyelination and clinical features that partially overlap with POLR3-HLD.124-127,130 Thus, the leading hypothesis to explain these diseases is that defects in tRNA or ARS function lead to abnormal translation, perhaps especially affecting the production of the most abundant CNS myelin proteins, PLP and MBP, during myelination.130 In lymphoblasts of patients with EPRS mutations, the prolyl-tRNA synthetase activity of the enzyme was significantly decreased compared to control.130 However, as in POLR3-HLD, the direct impact of ARS mutations on mRNA translation, especially of myelin proteins, remains to be established. In neurological disorders caused by mutations in genes related to tRNA splicing or modification, the pathology appears to result from an accumulation of tRNA fragments that trigger the activation of stress pathways.118,119 Although changes in the levels of mature tRNAs are observed as a result of mutations in the RNA kinase CLP1, the impact on translation elongation was not investigated.119 In mice, the combination of a SNP in a brain-specific tRNA-Arg gene with an inactivating mutation in the gene Gtpbp2, encoding a putative ribosome-recycling factor, causes neurodegeneration and dramatically increased ribosome stalling at AGA codons.251 One could predict a similar effect if POLR3A mutations had a selective effect on a subset of important tRNA isoacceptors, but this was not the case in our study, where all tRNA isoacceptors seemed to be similarly affected. While we did not identify major changes in the proteome of POLR3A mutant cells, this is not unexpected if defective translation as a result of POLR3A mutations occurs in a very specific spatio-temporal window during which the tRNA demand is higher, again highlighting the need for a model where myelination occurs. In zebrafish, differentiating oligodendrocytes produce new myelin sheaths in only 5 hours, a very short period in the lifespan of cells that can persist throughout organismal life.342 This requires a very high rate of protein synthesis,54 which could become too demanding for oligodendrocytes with reduced mature tRNA levels due to POLR3A mutations, thus resulting in hypomyelination.
Our third downregulated candidate is 7SL RNA, a component of the signal recognition particle (SRP) responsible for co-translational targeting of nascent proteins to the ER.138 In a recent study, whole transcriptome profiling of compartmentalized motor neurons showed that 7SL RNA is enriched in the axon compared to the somatodendritic compartment, indicating that the SRP complex is not only present in axons, but perhaps more abundant there than in the soma.343 This is in line with the fact that translation of several axonal mRNAs occurs locally.338 Thus, one possibility is that decreased 7SL RNA levels impair translocation of axonal and secreted proteins to the ER and interfere with axon-oligodendrocyte signaling during myelination. On the oligodendroglial side, PLP is a major myelin protein that is targeted to the ER and follows the secretory pathway to reach the site of myelination.294 Reduced SRP function could potentially impair PLP trafficking and contribute to POLR3-HLD pathogenesis.

In this thesis, I focused on the role of Pol III in transcription, and on how its transcripts are affected and could possibly be implicated in the POLR3-HLD phenotype. While we did identify some downregulated transcripts in POLR3A mutants, further experiments will be necessary to determine if this decrease is sufficient to alter their respective functions in the cell. Furthermore, we cannot exclude that the disease results not from decreased Pol III transcript function, but from the involvement of Pol III in another yet-to-be-identified process. Alternatively, a non-specific pathological mechanism could be at play, such as aggregation of the unassembled mutant subunits in the cytoplasm that could be detrimental to oligodendrocyte or neuronal survival. However, the fact that some mutations do not impair complex assembly in vitro (POLR3A M852V, G672E) argues against this hypothesis. Nonetheless, some chaperones and co-chaperones responsible for Pol III assembly, such as the R2TP-prefoldin complex, are also involved in the biogenesis of other large protein complexes.344 These could potentially be indirectly affected by the more difficult task of assembling a Pol III complex containing mutant subunits.

5.5 Genome-wide approaches to study rare neurological diseases

5.5.1 Whole exome sequencing and beyond

WES was the first genome-wide approach applied in this thesis and was successful in identifying the genetic basis of ARCAs in multiple families. As mentioned above, WES has greatly improved the diagnostic yield and expanded the phenotypic spectrum for many forms of ARCAs,50 as well as rare neurological diseases in general. Nonetheless, the limitations of WES are becoming more apparent, as the rate of new gene discoveries is starting to decline.42 In our own cohort, ARCA patients from eight families have undergone WES but remain without a diagnosis (unpublished data, K. Choquet, B. Brais). As discussed above, the extreme rarity of certain genetic causes is certainly a contributing factor, but in some cases there are no good candidate variants. An obvious limitation of WES is that it only captures a small proportion of the whole genome, albeit the one found to be the most frequently mutated to date in rare diseases.42 The role of deep intronic mutations in rare diseases is starting to emerge through a combination of WGS and RNA-seq.345 Mutations in non-coding RNA genes, promoters and enhancers have also been found to be causative of rare human or animal diseases.346-349 However, interpretation of the thousands of rare variants identified by WGS remains difficult and requires improvements in non-coding variant annotation, functional prediction and prioritization.42 Future directions include the use of transcriptome-wide techniques such as RNA-seq as a complementary tool to evaluate the impact of variants on gene expression. Since their application requires a tissue relevant to the pathology studied, iPSC-derived cells are starting to be used to help prove the pathogenicity of non-coding variants.350

5.5.2 Genome-wide quantification of Pol III transcripts

The usefulness of exome- and genome-wide DNA sequencing in the diagnosis of rare diseases is indisputable. Transcriptome-wide techniques are also helpful tools, both for identifying and studying the functional impact of rare mutations. Considering the role of Pol III in the transcription of multiple ncRNA genes, RNA-seq appeared to be an ideal method to help unravel the transcriptional defects in POLR3-HLD. Thus, a significant proportion of my thesis work was spent on developing methods to accurately quantify Pol III transcripts at the transcriptome-wide level. This was challenging because of their short size, highly repetitive sequences, strong secondary structures and post-transcriptional modifications that interfere with reverse transcription, especially for mature tRNAs. Northern blots have traditionally been preferred over qRT-PCR to quantify tRNA levels, since the probe directly targets RNA without the need for reverse transcription. This is the approach that we took in Chapter 3 to measure the levels of tRNAs and Bc1 in mouse brain and liver. However, while very accurate, the same Northern blots cannot easily be reprobed for many genes, so the simultaneous measurement of expression levels of all Pol III transcripts, which are numerous and scattered across the genome, requires a technique with higher throughput. While many groups have successfully used ChIP-seq as a proxy to quantify Pol III transcript expression,176,180,193,195 for this project we wanted to directly quantify steady-state levels of RNAs, since a mutated Pol III may be correctly bound to the DNA but have impaired transcription elongation, thus breaking the correlation between occupancy and transcription rate. In Chapter 4, we achieved direct quantification of all Pol III transcripts by combining rRNA-depleted RNA-seq for larger ones (≥ 200 nt) and a custom small RNA-seq approach for smaller ones (< 200 nt).
In particular, by concentrating on pre-tRNAs, we obtained good coverage of the tRNA gene body, as well as other small Pol III transcripts. Furthermore, the use of longer reads (100 nt) than in typical small RNA-seq experiments reduced the number of reads that map to multiple locations. However, analysis of these “multimapping” reads remains an important limitation to the quantification of Pol III transcripts. Other groups have used a “weighting” approach, where fractions of reads are assigned to each locus based on the total number of loci to which the reads map.194 In this work, we opted to perform the analyses using various thresholds for filtering of multimapping reads, verifying in each case that our conclusions hold in these different scenarios. It is noteworthy that except for pre-tRNAs, our approach focuses on steady-state levels of Pol III transcripts, which are the result of transcription rate, but also of post-transcriptional regulation, stability and degradation. Thus, the absence of differences in the steady-state levels of some transcripts, such as vault RNAs or RMRP, does not exclude the possibility of a slower transcription rate that would be masked by other mechanisms. To truly assess Pol III transcription efficiency, other transcriptome-wide methods that quantify nascent transcripts, such as native elongating transcript sequencing (NET-seq)289 or sequencing of newly synthesized EU-labeled small RNAs (neusRNA-seq),193 could be applied. However, in the context of POLR3-HLD, if the resulting steady-state levels of the transcripts are not perturbed, they are unlikely to have a downstream impact that is relevant to the disease. Thus, measuring steady-state RNA levels is sufficient to understand which transcripts are deregulated in the disease, but may not provide all the mechanistic information necessary to pinpoint the exact effect on Pol III function.
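
As a rough illustration of the two strategies for multimapping reads discussed above, the following Python sketch contrasts fractional weighting (a read that aligns to n loci contributes 1/n to each of them) with threshold-based filtering (reads aligning to more than a chosen number of loci are discarded). It is not the pipeline used in Chapter 4: the function name `count_reads`, its parameters, the toy read-to-locus table and the choice to count retained reads fractionally are all assumptions made for the example.

```python
from collections import defaultdict


def count_reads(alignments, max_loci=None, fractional=True):
    """Count reads per locus from a read -> loci mapping (hypothetical helper).

    alignments: dict mapping a read ID to the loci that read aligns to.
    max_loci:   if set, discard reads aligning to more than this many
                distinct loci (threshold-based filtering).
    fractional: if True, a read aligning to n loci adds 1/n to each locus;
                otherwise it adds a full count to every locus it hits.
    """
    counts = defaultdict(float)
    for loci in alignments.values():
        distinct = set(loci)
        if not distinct:
            continue  # unaligned read
        if max_loci is not None and len(distinct) > max_loci:
            continue  # too ambiguous under the chosen threshold
        weight = 1.0 / len(distinct) if fractional else 1.0
        for locus in distinct:
            counts[locus] += weight
    return dict(counts)


# Toy data with made-up locus names: three reads from a multi-copy tRNA
# family and one read from a single-copy transcript.
toy = {
    "read1": ["tRNA-Ala-1"],
    "read2": ["tRNA-Ala-1", "tRNA-Ala-2"],
    "read3": ["tRNA-Ala-1", "tRNA-Ala-2", "tRNA-Ala-3", "tRNA-Ala-4"],
    "read4": ["BC200"],
}

weighted = count_reads(toy)                   # fractional weighting, no filter
unique_only = count_reads(toy, max_loci=1)    # strictest filtering threshold
up_to_two = count_reads(toy, max_loci=2)      # more permissive threshold
```

Repeating a downstream comparison over several values of `max_loci`, as in the last three lines, is one simple way to check whether a conclusion depends on how ambiguous reads are handled. With fractional weighting the per-locus counts still sum to the number of aligned reads, whereas filtering trades read depth for unambiguous assignment.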

CHAPTER 6: Conclusion and Future Directions

6.1 Conclusion

The French Canadian population of Québec is characterized by the regional clustering of a large number of Mendelian diseases. Identifying the causal genetic defects and unraveling the pathophysiological mechanisms for each disease are crucial steps towards developing therapies. The work described herein employed a combination of genome-wide and targeted experimental approaches to improve our understanding of ARCAs and leukodystrophies. At the genetic level, we showed that SPG7 mutations are a major cause of unresolved ARCA cases and that PMPCA mutations underlie an ultra-rare form of ARCA. On the functional side, we demonstrated that POLR3A mutations cause divergent phenotypes in human and mouse. Despite the challenges associated with developing an appropriate model of POLR3-HLD and with the quantification of Pol III transcripts, we were able to show that a subset of Pol III transcripts is more susceptible to Pol III dysfunction, including nuclear-encoded tRNAs and the ncRNAs 7SL and BC200. Combined with high-throughput proteomics, these results suggest candidate transcripts and proteins to be further investigated for their involvement in the disease, their role in myelination and their potential as future therapeutic targets. Altogether, our findings provide functional insights into an essential and ubiquitous enzyme, improve our understanding of the pathophysiological mechanisms responsible for one of the devastating neurodegenerative disorders affecting French Canadians, and expand the genetic and mutational landscape of rare neurological disorders in this unique population.

6.2 Future directions

For the remaining unresolved ARCA patients, future directions include maximizing collaborations and leveraging matchmaking programs to identify other cases with variants in common genes. In the cases where a promising candidate variant exists, basic functional studies will be performed to confirm the pathogenic impact of the variant. Finally, when candidate variants were not uncovered in WES, we will explore the possibility of using WGS and RNA-seq to assess the contribution of non-coding variants to the genetic landscape of ARCAs, possibly in combination with iPSC-derived neurons from patients.

For POLR3-HLD, the first steps will be to validate the protein changes identified by SILAC using targeted methods. Profiling the mRNA transcriptome will also allow us to distinguish changes that result from regulation at the level of gene expression from those that result from regulation at the level of translation. The most promising candidate proteins will be selected to further interrogate their relationship to BC200 RNA and to myelination. We will also investigate whether decreased levels of tRNAs affect translation rate. A crucial future step will be to determine whether other mutations in POLR3A, POLR3B or POLR1C lead to the same transcriptional and protein defects as those we observed. Finally, we will move to an alternative model such as iPSC-derived oligodendrocytes and neurons to explore the role of BC200 RNA and other Pol III transcripts in OPC function, myelination and neuronal integrity. Ultimately, modeling the mutations that cause different Pol III-related phenotypes in oligodendrocytes and neurons may help to dissect the molecular and cellular differences that underlie the distinct human phenotypes.

195 REFERENCES

1 Jobling, R. K. et al. PMPCA mutations cause abnormal mitochondrial protein processing in patients with non-progressive cerebellar ataxia. Brain : a journal of neurology 138, 1505-1517, doi:10.1093/brain/awv057 (2015). 2 Laberge, A. M. et al. Population history and its impact on medical genetics in Quebec. Clinical genetics 68, 287-301, doi:10.1111/j.1399-0004.2005.00497.x (2005). 3 Moreau, C., Vezina, H. & Labuda, D. [Founder effects and genetic variability in Quebec]. Med Sci (Paris) 23, 1008-1013, doi:10.1051/medsci/200723111008 (2007). 4 Scriver, C. R. Human genetics: lessons from Quebec populations. Annu Rev Genomics Hum Genet 2, 69-101, doi:10.1146/annurev.genom.2.1.69 (2001). 5 Roddier, K. et al. Two mutations in the HSN2 gene explain the high prevalence of HSAN2 in French Canadians. Neurology 64, 1762-1767, doi:10.1212/01.WNL.0000161849.29944.43 (2005). 6 Teive, H. A. & Ashizawa, T. Primary and secondary ataxias. Curr Opin Neurol 28, 413- 422, doi:10.1097/WCO.0000000000000227 (2015). 7 Akbar, U. & Ashizawa, T. Ataxia. Neurol Clin 33, 225-248, doi:10.1016/j.ncl.2014.09.004 (2015). 8 Butts, T., Green, M. J. & Wingate, R. J. Development of the cerebellum: simple steps to make a 'little brain'. Development 141, 4031-4041, doi:10.1242/dev.106559 (2014). 9 Bear, M. F., Connors, B. W. & Paradiso, M. A. Neurosciences: à la découverte du cerveau. 3rd edn, (Éditions Pradel, 2007). 10 Ramnani, N. The primate cortico-cerebellar system: anatomy and function. Nat Rev Neurosci 7, 511-522, doi:10.1038/nrn1953 (2006). 11 Eccles, J. C., Ito, M. & Szentagothai, J. The Cerebellum as a Neuronal Machine. (Springer, 1967). 12 Matilla-Duenas, A. et al. Consensus paper: pathological mechanisms underlying neurodegeneration in spinocerebellar ataxias. Cerebellum 13, 269-302, doi:10.1007/s12311-013-0539-y (2014). 13 Nachbauer, W., Eigentler, A. & Boesch, S. Acquired ataxias: the clinical spectrum, diagnosis and management. 
Journal of neurology 262, 1385-1393, doi:10.1007/s00415- 015-7685-8 (2015). 14 Anheim, M., Tranchant, C. & Koenig, M. The autosomal recessive cerebellar ataxias. N Engl J Med 366, 636-646, doi:10.1056/NEJMra1006610 (2012). 15 Mitoma, H., Manto, M. & Hampe, C. S. Immune-mediated cerebellar ataxias: from bench to bedside. Cerebellum Ataxias 4, 16, doi:10.1186/s40673-017-0073-7 (2017). 16 Wilkins, A. Cerebellar Dysfunction in Multiple Sclerosis. Front Neurol 8, 312, doi:10.3389/fneur.2017.00312 (2017). 17 Coutelier, M., Stevanin, G. & Brice, A. Genetic landscape remodelling in spinocerebellar ataxias: the influence of next-generation sequencing. Journal of neurology 262, 2382- 2395, doi:10.1007/s00415-015-7725-4 (2015). 18 Beaudin, M., Klein, C. J., Rouleau, G. A. & Dupre, N. Systematic review of autosomal recessive ataxias and proposal for a classification. Cerebellum Ataxias 4, 3, doi:10.1186/s40673-017-0061-y (2017).

196 19 Synofzik, M. & Schule, R. Overcoming the divide between ataxias and spastic paraplegias: Shared phenotypes, genes, and pathways. Mov Disord 32, 332-345, doi:10.1002/mds.26944 (2017). 20 Ruano, L., Melo, C., Silva, M. C. & Coutinho, P. The global epidemiology of hereditary ataxia and spastic paraplegia: a systematic review of prevalence studies. Neuroepidemiology 42, 174-183, doi:10.1159/000358801 (2014). 21 Thiffault, I. et al. Diversity of ARSACS mutations in French-Canadians. The Canadian journal of neurological sciences. Le journal canadien des sciences neurologiques 40, 61- 66 (2013). 22 Engert, J. C. et al. ARSACS, a spastic ataxia common in northeastern Quebec, is caused by mutations in a new gene encoding an 11.5-kb ORF. Nature genetics 24, 120-125, doi:10.1038/72769 (2000). 23 Alper, G. & Narayanan, V. Friedreich's ataxia. Pediatr Neurol 28, 335-341 (2003). 24 Campuzano, V. et al. Friedreich's ataxia: autosomal recessive disease caused by an intronic GAA triplet repeat expansion. Science 271, 1423-1427 (1996). 25 Bouchard, J. P., Barbeau, A., Bouchard, R., Paquet, M. & Bouchard, R. W. A cluster of Friedreich's ataxia in Rimouski, Quebec. The Canadian journal of neurological sciences. Le journal canadien des sciences neurologiques 6, 205-208 (1979). 26 Barbeau, A. et al. Origin of Friedreich's disease in Quebec. The Canadian journal of neurological sciences. Le journal canadien des sciences neurologiques 11, 506-509 (1984). 27 Gros-Louis, F. et al. Mutations in SYNE1 lead to a newly discovered form of autosomal recessive cerebellar ataxia. Nature genetics 39, 80-85, doi:10.1038/ng1927 (2007). 28 Dupre, N., Gros-Louis, F., Bouchard, J. P., Noreau, A. & Rouleau, G. A. in GeneReviews(R) (eds R. A. Pagon et al.) (1993). 29 Dupre, N. et al. Clinical and genetic study of autosomal recessive cerebellar ataxia type 1. Annals of neurology 62, 93-98, doi:10.1002/ana.21143 (2007). 30 Noreau, A. et al. SYNE1 mutations in autosomal recessive cerebellar ataxia. 
JAMA Neurol 70, 1296-1231, doi:10.1001/jamaneurol.2013.3268 (2013). 31 Izumi, Y. et al. Cerebellar ataxia with SYNE1 mutation accompanying motor neuron disease. Neurology 80, 600-601, doi:10.1212/WNL.0b013e3182815529 (2013). 32 Hamza, W. et al. Molecular and clinical study of a cohort of 110 Algerian patients with autosomal recessive ataxia. BMC medical genetics 16, 36, doi:10.1186/s12881-015-0180- 3 (2015). 33 Coutelier, M. et al. Efficacy of Exome-Targeted Capture Sequencing to Detect Mutations in Known Cerebellar Ataxia Genes. JAMA Neurol, doi:10.1001/jamaneurol.2017.5121 (2018). 34 Synofzik, M. et al. SYNE1 ataxia is a common recessive ataxia with major non- cerebellar features: a large multi-centre study. Brain : a journal of neurology 139, 1378- 1393, doi:10.1093/brain/aww079 (2016). 35 Moreira, M. C. et al. Senataxin, the ortholog of a yeast RNA helicase, is mutant in ataxia- ocular apraxia 2. Nature genetics 36, 225-227, doi:10.1038/ng1303 (2004). 36 Duquette, A. et al. Mutations in senataxin responsible for Quebec cluster of ataxia with neuropathy. Annals of neurology 57, 408-414, doi:10.1002/ana.20408 (2005). 37 Dupre, N., Bouchard, J. P., Brais, B. & Rouleau, G. A. Hereditary ataxia, spastic paraparesis and neuropathy in the French-Canadian population. The Canadian journal of

197 neurological sciences. Le journal canadien des sciences neurologiques 33, 149-157 (2006). 38 Bernard, G. et al. Tremor-ataxia with central hypomyelination (TACH) leukodystrophy maps to chromosome 10q22.3-10q23.31. Neurogenetics 11, 457-464, doi:10.1007/s10048-010-0251-8 (2010). 39 Bernard, G. et al. Mutations of POLR3A encoding a catalytic subunit of RNA polymerase Pol III cause a recessive hypomyelinating leukodystrophy. American journal of human genetics 89, 415-423, doi:10.1016/j.ajhg.2011.07.014 (2011). 40 Thiffault, I. et al. A new autosomal recessive spastic ataxia associated with frequent white matter changes maps to 2q33-34. Brain : a journal of neurology 129, 2332-2340, doi:10.1093/brain/awl110 (2006). 41 Bayat, V. et al. Mutations in the mitochondrial methionyl-tRNA synthetase cause a neurodegenerative phenotype in flies and a recessive ataxia (ARSAL) in humans. PLoS biology 10, e1001288, doi:10.1371/journal.pbio.1001288 (2012). 42 Boycott, K. M. et al. International Cooperation to Enable the Diagnosis of All Rare Genetic Diseases. American journal of human genetics 100, 695-705, doi:10.1016/j.ajhg.2017.04.003 (2017). 43 Chong, J. X. et al. The Genetic Basis of Mendelian Phenotypes: Discoveries, Challenges, and Opportunities. American journal of human genetics 97, 199-215, doi:10.1016/j.ajhg.2015.06.009 (2015). 44 Wang, Y. et al. Defects in the CAPN1 Gene Result in Alterations in Cerebellar Development and Cerebellar Ataxia in Mice and Humans. Cell Rep 16, 79-91, doi:10.1016/j.celrep.2016.05.044 (2016). 45 Mendoza-Ferreira, N. et al. Biallelic CHP1 mutation causes human autosomal recessive ataxia by impairing NHE1 function. Neurol Genet 4, e209, doi:10.1212/NXG.0000000000000209 (2018). 46 Coutelier, M. et al. A Recurrent Mutation in CACNA1G Alters Cav3.1 T-Type Calcium- Channel Conduction and Causes Autosomal-Dominant Cerebellar Ataxia. American journal of human genetics 97, 726-737, doi:10.1016/j.ajhg.2015.09.007 (2015). 47 Chelban, V. et al. 
Mutations in NKX6-2 Cause Progressive Spastic Ataxia and Hypomyelination. American journal of human genetics 100, 969-977, doi:10.1016/j.ajhg.2017.05.009 (2017). 48 Pyle, A. et al. Exome sequencing in undiagnosed inherited and sporadic ataxias. Brain : a journal of neurology 138, 276-283, doi:10.1093/brain/awu348 (2015). 49 Sawyer, S. L. et al. Exome sequencing as a diagnostic tool for pediatric-onset ataxia. Human mutation 35, 45-49, doi:10.1002/humu.22451 (2014). 50 Galatolo, D., Tessa, A., Filla, A. & Santorelli, F. M. Clinical application of next generation sequencing in hereditary spinocerebellar ataxia: increasing the diagnostic yield and broadening the ataxia-spasticity spectrum. A retrospective analysis. Neurogenetics 19, 1-8, doi:10.1007/s10048-017-0532-6 (2018). 51 Austin, C. P. et al. Future of Rare Diseases Research 2017-2027: An IRDiRC Perspective. Clin Transl Sci 11, 21-27, doi:10.1111/cts.12500 (2018). 52 Ahrendsen, J. T. & Macklin, W. Signaling mechanisms regulating myelination in the central nervous system. Neurosci Bull 29, 199-215, doi:10.1007/s12264-013-1322-2 (2013).

198 53 Saab, A. S. & Nave, K. A. Myelin dynamics: protecting and shaping neuronal functions. Curr Opin Neurobiol 47, 104-112, doi:10.1016/j.conb.2017.09.013 (2017). 54 Simons, M. & Nave, K. A. Oligodendrocytes: Myelination and Axonal Support. Cold Spring Harb Perspect Biol 8, a020479, doi:10.1101/cshperspect.a020479 (2015). 55 Nave, K. A. Myelination and the trophic support of long axons. Nat Rev Neurosci 11, 275-283, doi:10.1038/nrn2797 (2010). 56 Mitew, S. et al. Mechanisms regulating the development of oligodendrocytes and central nervous system myelin. Neuroscience 276, 29-47, doi:10.1016/j.neuroscience.2013.11.029 (2014). 57 van der Knaap, M. S. & Bugiani, M. Leukodystrophies: a proposed classification system based on pathological changes and pathogenetic mechanisms. Acta Neuropathol 134, 351-382, doi:10.1007/s00401-017-1739-1 (2017). 58 Jakovcevski, I., Filipovic, R., Mo, Z., Rakic, S. & Zecevic, N. Oligodendrocyte development and the onset of myelination in the human fetal brain. Front Neuroanat 3, 5, doi:10.3389/neuro.05.005.2009 (2009). 59 Tau, G. Z. & Peterson, B. S. Normal development of brain circuits. Neuropsychopharmacology 35, 147-168, doi:10.1038/npp.2009.115 (2010). 60 Shimizu, T., Osanai, Y. & Ikenaka, K. Oligodendrocyte-Neuron Interactions: Impact on Myelination and Brain Function. Neurochemical research 43, 181-185, doi:10.1007/s11064-017-2387-5 (2018). 61 Lee, Y. et al. Oligodendroglia metabolically support axons and contribute to neurodegeneration. Nature 487, 443-448, doi:10.1038/nature11314 (2012). 62 Wang, S. et al. Notch receptor activation inhibits oligodendrocyte differentiation. Neuron 21, 63-75 (1998). 63 Givogri, M. I. et al. Central nervous system myelination in mice with deficient expression of Notch1 receptor. Journal of neuroscience research 67, 309-320, doi:10.1002/jnr.10128 (2002). 64 Goebbels, S. et al. A neuronal PI(3,4,5)P3-dependent program of oligodendrocyte precursor recruitment and myelination. 
Nat Neurosci 20, 10-15, doi:10.1038/nn.4425 (2017). 65 Figlia, G., Gerber, D. & Suter, U. Myelination and mTOR. Glia 66, 693-707, doi:10.1002/glia.23273 (2018). 66 Wake, H., Lee, P. R. & Fields, R. D. Control of local protein synthesis and initial events in myelination by action potentials. Science 333, 1647-1651, doi:10.1126/science.1206998 (2011). 67 Wake, H. et al. Nonsynaptic junctions on myelinating glia promote preferential myelination of electrically active axons. Nature communications 6, 7844, doi:10.1038/ncomms8844 (2015). 68 Kevelam, S. H. et al. Update on Leukodystrophies: A Historical Perspective and Adapted Definition. Neuropediatrics 47, 349-354, doi:10.1055/s-0036-1588020 (2016). 69 Pouwels, P. J. et al. Hypomyelinating leukodystrophies: translational research progress and prospects. Annals of neurology 76, 5-19, doi:10.1002/ana.24194 (2014). 70 Dorboz, I. et al. Biallelic mutations in the homeodomain of NKX6-2 underlie a severe hypomyelinating leukodystrophy. Brain : a journal of neurology 140, 2550-2556, doi:10.1093/brain/awx207 (2017).

199 71 Zhou, X. & Rademakers, R. TMEM106B and myelination: rare leukodystrophy families reveal unexpected connections. Brain : a journal of neurology 140, 3069-3080, doi:10.1093/brain/awx318 (2017). 72 Nevin, Z. S. et al. Modeling the Mutational and Phenotypic Landscapes of Pelizaeus- Merzbacher Disease with Human iPSC-Derived Oligodendrocytes. American journal of human genetics 100, 617-634, doi:10.1016/j.ajhg.2017.03.005 (2017). 73 Numata, Y. et al. Depletion of molecular chaperones from the endoplasmic reticulum and fragmentation of the Golgi apparatus associated with pathogenesis in Pelizaeus- Merzbacher disease. The Journal of biological chemistry 288, 7451-7466, doi:10.1074/jbc.M112.435388 (2013). 74 De Vries, G. H. & Boullerne, A. I. Glial cell lines: an overview. Neurochemical research 35, 1978-2000, doi:10.1007/s11064-010-0318-9 (2010). 75 Jarjour, A. A., Zhang, H., Bauer, N., Ffrench-Constant, C. & Williams, A. In vitro modeling of central nervous system myelination and remyelination. Glia 60, 1-12, doi:10.1002/glia.21231 (2012). 76 Griffiths, I. R., Schneider, A., Anderson, J. & Nave, K. A. Transgenic and natural mouse models of proteolipid protein (PLP)-related dysmyelination and demyelination. Brain pathology 5, 275-281 (1995). 77 Ruiz, M. et al. Oxidative stress and mitochondrial dynamics malfunction are linked in Pelizaeus-Merzbacher disease. Brain pathology, doi:10.1111/bpa.12571 (2017). 78 Edgar, J. M. et al. Demyelination and axonal preservation in a transgenic mouse model of Pelizaeus-Merzbacher disease. EMBO Mol Med 2, 42-50, doi:10.1002/emmm.200900057 (2010). 79 Curiel, J. et al. TUBB4A mutations result in specific neuronal and oligodendrocytic defects that closely match clinically distinct phenotypes. Human molecular genetics 26, 4506-4518, doi:10.1093/hmg/ddx338 (2017). 80 Chouery, E. et al. A whole-genome scan in a large family with leukodystrophy and oligodontia reveals linkage to 10q22. 
Neurogenetics 12, 73-78, doi:10.1007/s10048-010- 0256-3 (2011). 81 Tetreault, M. et al. Recessive mutations in POLR3B, encoding the second largest subunit of Pol III, cause a rare hypomyelinating leukodystrophy. American journal of human genetics 89, 652-655, doi:10.1016/j.ajhg.2011.10.006 (2011). 82 Daoud, H. et al. Mutations in POLR3A and POLR3B are a major cause of hypomyelinating leukodystrophies with or without dental abnormalities and/or hypogonadotropic hypogonadism. Journal of medical genetics 50, 194-197, doi:10.1136/jmedgenet-2012-101357 (2013). 83 Saitsu, H. et al. Mutations in POLR3A and POLR3B encoding RNA Polymerase III subunits cause an autosomal-recessive hypomyelinating leukoencephalopathy. American journal of human genetics 89, 644-651, doi:10.1016/j.ajhg.2011.10.003 (2011). 84 Terao, Y. et al. Diffuse central hypomyelination presenting as 4H syndrome caused by compound heterozygous mutations in POLR3A encoding the catalytic subunit of polymerase III. Journal of the neurological sciences 320, 102-105, doi:10.1016/j.jns.2012.07.005 (2012). 85 Cayami, F. K. et al. POLR3A and POLR3B Mutations in Unclassified Hypomyelination. Neuropediatrics 46, 221-228, doi:10.1055/s-0035-1550148 (2015).

200 86 Potic, A., Brais, B., Choquet, K., Schiffmann, R. & Bernard, G. 4H syndrome with late- onset growth hormone deficiency caused by POLR3A mutations. Archives of neurology 69, 920-923, doi:10.1001/archneurol.2011.1963 (2012). 87 Jurkiewicz, E. et al. Recessive Mutations in POLR3B Encoding RNA Polymerase III Subunit Causing Diffuse Hypomyelination in Patients with 4H Leukodystrophy with Polymicrogyria and Cataracts. Clinical neuroradiology, doi:10.1007/s00062-015-0472-1 (2015). 88 Gutierrez, M. et al. Large exonic deletions in POLR3B gene cause POLR3-related leukodystrophy. Orphanet journal of rare diseases 10, 69, doi:10.1186/s13023-015- 0279-9 (2015). 89 Wolf, N. I. et al. Clinical spectrum of 4H leukodystrophy caused by POLR3A and POLR3B mutations. Neurology 83, 1898-1905, doi:10.1212/WNL.0000000000001002 (2014). 90 Thiffault, I. et al. Recessive mutations in POLR1C cause a leukodystrophy by impairing biogenesis of RNA polymerase III. Nature communications 6, 7623, doi:10.1038/ncomms8623 (2015). 91 La Piana, R. et al. Diffuse hypomyelination is not obligate for POLR3-related disorders. Neurology 86, 1622-1626, doi:10.1212/WNL.0000000000002612 (2016). 92 Minnerop, M. et al. Hypomorphic mutations in POLR3A are a frequent cause of sporadic and recessive spastic ataxia. Brain : a journal of neurology 140, 1561-1578, doi:10.1093/brain/awx095 (2017). 93 Azmanov, D. N. et al. Transcriptome-wide effects of a POLR3A gene mutation in patients with an unusual phenotype of striatal involvement. Human molecular genetics, doi:10.1093/hmg/ddw263 (2016). 94 Richards, M. R. et al. Phenotypic spectrum of POLR3B mutations: isolated hypogonadotropic hypogonadism without neurological or dental anomalies. Journal of medical genetics, doi:10.1136/jmedgenet-2016-104064 (2016). 95 Jay, A. M. et al. Neonatal progeriod syndrome associated with biallelic truncating variants in POLR3A. Am J Med Genet A 170, 3343-3346, doi:10.1002/ajmg.a.37960 (2016). 96 Vanderver, A. et al. 
More than hypomyelination in Pol-III disorder. Journal of neuropathology and experimental neurology 72, 67-75, doi:10.1097/NEN.0b013e31827c99d2 (2013). 97 Vannini, A. & Cramer, P. Conservation between the RNA polymerase I, II, and III transcription initiation machineries. Molecular cell 45, 439-446, doi:10.1016/j.molcel.2012.01.023 (2012). 98 Hoffmann, N. A. et al. Molecular structures of unbound and transcribing RNA polymerase III. Nature 528, 231-236, doi:10.1038/nature16143 (2015). 99 Fernandez-Tornero, C. et al. Crystal structure of the 14-subunit RNA polymerase I. Nature 502, 644-649, doi:10.1038/nature12636 (2013). 100 Forget, D., Cloutier, P., Domecq, C. & Coulombe, B. in Systems Analysis of Chromatin- Related Protein Complexes in Cancer 227-238 (Springer New York, 2014). 101 Forget, D. et al. Nuclear import of RNA polymerase II is coupled with nucleocytoplasmic shuttling of the RNA polymerase II-associated protein 2. Nucleic acids research 41, 6881-6891, doi:10.1093/nar/gkt455 (2013).

201 102 Boulon, S. et al. HSP90 and its R2TP/Prefoldin-like cochaperone are involved in the cytoplasmic assembly of RNA polymerase II. Molecular cell 39, 912-924, doi:10.1016/j.molcel.2010.08.023 (2010). 103 Cloutier, P. et al. High-resolution mapping of the protein interaction network for the human transcription machinery and affinity purification of RNA polymerase II-associated complexes. Methods 48, 381-386, doi:10.1016/j.ymeth.2009.05.005 (2009). 104 Forget, D. et al. The protein interaction network of the human transcription machinery reveals a role for the conserved GTPase RPAP4/GPN1 and microtubule assembly in nuclear import and biogenesis of RNA polymerase II. Molecular & cellular proteomics : MCP 9, 2827-2839, doi:10.1074/mcp.M110.003616 (2010). 105 Arimbasseri, A. G. & Maraia, R. J. RNA Polymerase III Advances: Structural and tRNA Functional Views. Trends in biochemical sciences 41, 546-559, doi:10.1016/j.tibs.2016.03.003 (2016). 106 Borck, G. et al. BRF1 mutations alter RNA polymerase III-dependent transcription and cause neurodevelopmental anomalies. Genome research 25, 155-166, doi:10.1101/gr.176925.114 (2015). 107 Gavazzo, P., Vassalli, M., Costa, D. & Pagano, A. Novel ncRNAs transcribed by Pol III and elucidation of their functional relevance by biophysical approaches. Frontiers in cellular neuroscience 7, 203, doi:10.3389/fncel.2013.00203 (2013). 108 Arimbasseri, A. G., Rijal, K. & Maraia, R. J. Comparative overview of RNA polymerase II and III transcription cycles, with focus on RNA polymerase III termination and reinitiation. Transcription 5, e27639, doi:10.4161/trns.27369 (2014). 109 Hu, S., Wu, J., Chen, L. & Shan, G. Signals from noncoding RNAs: unconventional roles for conventional pol III transcripts. The international journal of biochemistry & cell biology 44, 1847-1851, doi:10.1016/j.biocel.2012.07.013 (2012). 110 Orioli, A. tRNA biology in the omics era: Stress signalling dynamics and cancer progression. 
Bioessays 39, doi:10.1002/bies.201600158 (2017). 111 Kapur, M., Monaghan, C. E. & Ackerman, S. L. Regulation of mRNA Translation in Neurons-A Matter of Life and Death. Neuron 96, 616-637, doi:10.1016/j.neuron.2017.09.057 (2017). 112 Kirchner, S. & Ignatova, Z. Emerging roles of tRNA in adaptive translation, signalling dynamics and disease. Nature reviews. Genetics 16, 98-112, doi:10.1038/nrg3861 (2015). 113 Yamasaki, S., Ivanov, P., Hu, G. F. & Anderson, P. Angiogenin cleaves tRNA and promotes stress-induced translational repression. The Journal of cell biology 185, 35-42, doi:10.1083/jcb.200811106 (2009). 114 Ivanov, P., Emara, M. M., Villen, J., Gygi, S. P. & Anderson, P. Angiogenin-induced tRNA fragments inhibit translation initiation. Molecular cell 43, 613-623, doi:10.1016/j.molcel.2011.06.022 (2011). 115 Emara, M. M. et al. Angiogenin-induced tRNA-derived stress-induced RNAs promote stress-induced stress granule assembly. The Journal of biological chemistry 285, 10959- 10968, doi:10.1074/jbc.M109.077560 (2010). 116 Czech, A., Wende, S., Morl, M., Pan, T. & Ignatova, Z. Reversible and rapid transfer- RNA deactivation as a mechanism of translational repression in stress. PLoS genetics 9, e1003767, doi:10.1371/journal.pgen.1003767 (2013).

202 117 Dittmar, K. A., Goodenbour, J. M. & Pan, T. Tissue-specific differences in human transfer RNA expression. PLoS genetics 2, e221, doi:10.1371/journal.pgen.0020221 (2006). 118 Blanco, S. et al. Aberrant methylation of tRNAs links cellular stress to neuro- developmental disorders. The EMBO journal 33, 2020-2039, doi:10.15252/embj.201489282 (2014). 119 Schaffer, A. E. et al. CLP1 founder mutation links tRNA splicing and maturation to cerebellar development and neurodegeneration. Cell 157, 651-663, doi:10.1016/j.cell.2014.03.049 (2014). 120 Karaca, E. et al. Human CLP1 mutations alter tRNA biogenesis, affecting both peripheral and central nervous system function. Cell 157, 636-650, doi:10.1016/j.cell.2014.02.058 (2014). 121 Breuss, M. W. et al. Autosomal-Recessive Mutations in the tRNA Splicing Endonuclease Subunit TSEN15 Cause Pontocerebellar Hypoplasia and Progressive Microcephaly. American journal of human genetics 99, 228-235, doi:10.1016/j.ajhg.2016.05.023 (2016). 122 Sasarman, F. et al. The 3' addition of CCA to mitochondrial tRNASer(AGY) is specifically impaired in patients with mutations in the tRNA nucleotidyl transferase TRNT1. Human molecular genetics 24, 2841-2847, doi:10.1093/hmg/ddv044 (2015). 123 Chakraborty, P. K. et al. Mutations in TRNT1 cause congenital sideroblastic anemia with immunodeficiency, fevers, and developmental delay (SIFD). Blood 124, 2867-2871, doi:10.1182/blood-2014-08-591370 (2014). 124 Taft, R. J. et al. Mutations in DARS cause hypomyelination with brain stem and spinal cord involvement and leg spasticity. American journal of human genetics 92, 774-780, doi:10.1016/j.ajhg.2013.04.006 (2013). 125 Wolf, N. I. et al. Mutations in RARS cause hypomyelination. Annals of neurology 76, 134-139, doi:10.1002/ana.24167 (2014). 126 Simons, C. et al. Loss-of-function alanyl-tRNA synthetase mutations cause an autosomal-recessive early-onset epileptic encephalopathy with persistent myelination defect. 
American journal of human genetics 96, 675-681, doi:10.1016/j.ajhg.2015.02.012 (2015). 127 Nakayama, T. et al. Deficient activity of alanyl-tRNA synthetase underlies an autosomal recessive syndrome of progressive microcephaly, hypomyelination, and epileptic encephalopathy. Human mutation 38, 1348-1354, doi:10.1002/humu.23250 (2017). 128 Nafisinia, M. et al. Mutations in RARS cause a hypomyelination disorder akin to Pelizaeus-Merzbacher disease. European journal of human genetics : EJHG 25, 1134- 1141, doi:10.1038/ejhg.2017.119 (2017). 129 Feinstein, M. et al. Pelizaeus-Merzbacher-like disease caused by AIMP1/p43 homozygous mutation. American journal of human genetics 87, 820-828, doi:10.1016/j.ajhg.2010.10.016 (2010). 130 Mendes, M. I. et al. Bi-allelic Mutations in EPRS, Encoding the Glutamyl-Prolyl- Aminoacyl-tRNA Synthetase, Cause a Hypomyelinating Leukodystrophy. American journal of human genetics, doi:10.1016/j.ajhg.2018.02.011 (2018). 131 Li, M. & Gu, W. A critical role for noncoding 5S rRNA in regulating Mdmx stability. Molecular cell 43, 1023-1032, doi:10.1016/j.molcel.2011.08.020 (2011).

203 132 Egloff, S., Studniarek, C. & Kiss, T. 7SK small nuclear RNA, a multifunctional transcriptional regulatory RNA with gene-specific features. Transcription, 1-7, doi:10.1080/21541264.2017.1344346 (2017). 133 Egloff, S. et al. The 7SK snRNP associates with the little elongation complex to promote snRNA gene expression. The EMBO journal 36, 934-948, doi:10.15252/embj.201695740 (2017). 134 Flynn, R. A. et al. 7SK-BAF axis controls pervasive transcription at enhancers. Nature structural & molecular biology 23, 231-238, doi:10.1038/nsmb.3176 (2016). 135 Bazi, Z. et al. Rn7SK small nuclear RNA is involved in Neuronal Differentiation. J Cell Biochem, doi:10.1002/jcb.26472 (2017). 136 Castelo-Branco, G. et al. The non-coding snRNA 7SK controls transcriptional termination, poising, and bidirectionality in embryonic stem cells. Genome biology 14, R98, doi:10.1186/gb-2013-14-9-r98 (2013). 137 Akopian, D., Shen, K., Zhang, X. & Shan, S. O. Signal recognition particle: an essential protein-targeting machine. Annu Rev Biochem 82, 693-721, doi:10.1146/annurev- biochem-072711-164732 (2013). 138 Lakkaraju, A. K., Mary, C., Scherrer, A., Johnson, A. E. & Strub, K. SRP keeps polypeptides translocation-competent by slowing translation to match limiting ER- targeting sites. Cell 133, 440-451, doi:10.1016/j.cell.2008.02.049 (2008). 139 Lakkaraju, A. K., Luyet, P. P., Parone, P., Falguieres, T. & Strub, K. Inefficient targeting to the endoplasmic reticulum by the signal recognition particle elicits selective defects in post-ER membrane trafficking. Experimental cell research 313, 834-847, doi:10.1016/j.yexcr.2006.12.003 (2007). 140 Castle, J. C. et al. Digital genome-wide ncRNA expression, including SnoRNAs, across 11 human tissues using polyA-neutral amplification. PloS one 5, e11779, doi:10.1371/journal.pone.0011779 (2010). 141 Skreka, K. et al. Identification of differentially expressed non-coding RNAs in embryonic stem cell neural differentiation. 
Nucleic acids research 40, 6001-6015, doi:10.1093/nar/gks311 (2012). 142 Mroczek, S. & Dziembowski, A. U6 RNA biogenesis and disease association. Wiley Interdiscip Rev RNA 4, 581-592, doi:10.1002/wrna.1181 (2013). 143 Goldfarb, K. C. & Cech, T. R. Targeted CRISPR disruption reveals a role for RNase MRP RNA in human preribosomal RNA processing. Genes & development 31, 59-71, doi:10.1101/gad.286963.116 (2017). 144 Maida, Y. et al. An RNA-dependent RNA polymerase formed by TERT and the RMRP RNA. Nature 461, 230-235, doi:10.1038/nature08283 (2009). 145 Reiner, R., Ben-Asouli, Y., Krilovetzky, I. & Jarrous, N. A role for the catalytic ribonucleoprotein RNase P in RNA polymerase III transcription. Genes & development 20, 1621-1635, doi:10.1101/gad.386706 (2006). 146 Reiner, R., Krasnov-Yoeli, N., Dehtiar, Y. & Jarrous, N. Function and assembly of a chromatin-associated RNase P that is required for efficient transcription by RNA polymerase I. PloS one 3, e4072, doi:10.1371/journal.pone.0004072 (2008). 147 Serruya, R. et al. Human RNase P ribonucleoprotein is required for formation of initiation complexes of RNA polymerase III. Nucleic acids research 43, 5442-5450, doi:10.1093/nar/gkv447 (2015).

204 148 Persson, H. et al. The non-coding RNA of the multidrug resistance-linked vault particle encodes multiple regulatory small RNAs. Nat Cell Biol 11, 1268-1271, doi:10.1038/ncb1972 (2009). 149 Christov, C. P., Gardiner, T. J., Szuts, D. & Krude, T. Functional requirement of noncoding Y RNAs for human chromosomal DNA replication. Molecular and cellular biology 26, 6993-7004, doi:10.1128/MCB.01060-06 (2006). 150 Kowalski, M. P. & Krude, T. Functional roles of non-coding Y RNAs. The international journal of biochemistry & cell biology 66, 20-29, doi:10.1016/j.biocel.2015.07.003 (2015). 151 Tiedge, H., Chen, W. & Brosius, J. Primary structure, neural-specific expression, and dendritic location of human BC200 RNA. The Journal of neuroscience : the official journal of the Society for Neuroscience 13, 2382-2390 (1993). 152 Sosinska, P., Mikula-Pietrasik, J. & Ksiazek, K. The double-edged sword of long non- coding RNA: The role of human brain-specific BC200 RNA in translational control, neurodegenerative diseases, and cancer. Mutat Res Rev Mutat Res 766, 58-67, doi:10.1016/j.mrrev.2015.08.002 (2015). 153 Tiedge, H., Fremeau, R. T., Jr., Weinstock, P. H., Arancio, O. & Brosius, J. Dendritic location of neural BC1 RNA. Proceedings of the National Academy of Sciences of the United States of America 88, 2093-2097 (1991). 154 Zalfa, F. et al. The fragile X syndrome protein FMRP associates with BC1 RNA and regulates the translation of specific mRNAs at synapses. Cell 112, 317-327 (2003). 155 Muddashetty, R. et al. Poly(A)-binding protein is associated with neuronal BC1 and BC200 ribonucleoprotein particles. J Mol Biol 321, 433-445 (2002). 156 West, N., Roy-Engel, A. M., Imataka, H., Sonenberg, N. & Deininger, P. Shared protein components of SINE RNPs. J Mol Biol 321, 423-432 (2002). 157 Wang, H. et al. Dendritic BC1 RNA: functional role in regulation of translation initiation. 
The Journal of neuroscience : the official journal of the Society for Neuroscience 22, 10232-10241 (2002). 158 Johnson, E. M. et al. Role of Pur alpha in targeting mRNA to sites of translation in hippocampal neuronal dendrites. Journal of neuroscience research 83, 929-943, doi:10.1002/jnr.20806 (2006). 159 Kremerskothen, J., Nettermann, M., op de Bekke, A., Bachmann, M. & Brosius, J. Identification of human autoantigen La/SS-B as BC1/BC200 RNA-binding protein. DNA Cell Biol 17, 751-759, doi:10.1089/dna.1998.17.751 (1998). 160 Iacoangeli, A. et al. Reply to Bagni: On BC1 RNA and the fragile X mental retardation protein. Proceedings of the National Academy of Sciences of the United States of America 105, E29, doi:10.1073/pnas.0803737105 (2008). 161 Iacoangeli, A. et al. On BC1 RNA and the fragile X mental retardation protein. Proceedings of the National Academy of Sciences of the United States of America 105, 734-739, doi:10.1073/pnas.0710991105 (2008). 162 Bagni, C. On BC1 RNA and the fragile X mental retardation protein. Proceedings of the National Academy of Sciences of the United States of America 105, E19, doi:10.1073/pnas.0801034105 (2008). 163 Eom, T., Berardi, V., Zhong, J., Risuleo, G. & Tiedge, H. Dual nature of translational control by regulatory BC RNAs. Molecular and cellular biology 31, 4538-4549, doi:10.1128/MCB.05885-11 (2011).

164 Lin, D., Pestova, T. V., Hellen, C. U. & Tiedge, H. Translational control by a small RNA: dendritic BC1 RNA targets the eukaryotic initiation factor 4A helicase mechanism. Molecular and cellular biology 28, 3008-3019, doi:10.1128/MCB.01800-07 (2008).
165 Booy, E. P., McRae, E. K., Koul, A., Lin, F. & McKenna, S. A. The long non-coding RNA BC200 (BCYRN1) is critical for cancer cell survival and proliferation. Mol Cancer 16, 109, doi:10.1186/s12943-017-0679-7 (2017).
166 Deininger, P. Alu elements: know the SINEs. Genome biology 12, 236, doi:10.1186/gb-2011-12-12-236 (2011).
167 Chen, L. L. & Yang, L. ALUternative Regulation for Gene Expression. Trends Cell Biol 27, 480-490, doi:10.1016/j.tcb.2017.01.002 (2017).
168 Liu, W. M., Chu, W. M., Choudary, P. V. & Schmid, C. W. Cell stress and translational inhibitors transiently increase the abundance of mammalian SINE transcripts. Nucleic acids research 23, 1758-1765 (1995).
169 Hasler, J. & Strub, K. Alu RNP and Alu RNA regulate translation initiation in vitro. Nucleic acids research 34, 2374-2385, doi:10.1093/nar/gkl246 (2006).
170 Pagano, A. et al. New small nuclear RNA gene-like transcriptional units as sources of regulatory transcripts. PLoS genetics 3, e1, doi:10.1371/journal.pgen.0030001 (2007).
171 Massone, S. et al. RNA polymerase III drives alternative splicing of the potassium channel-interacting protein contributing to brain complexity and neurodegeneration. The Journal of cell biology 193, 851-866, doi:10.1083/jcb.201011053 (2011).
172 Penna, I. et al. A novel snRNA-like transcript affects amyloidogenesis and cell cycle progression through perturbation of Fe65L1 (APBB2) alternative splicing. Biochimica et biophysica acta 1833, 1511-1526, doi:10.1016/j.bbamcr.2013.02.020 (2013).
173 Massone, S. et al. 17A, a novel non-coding RNA, regulates GABA B alternative splicing and signaling in response to inflammatory stimuli and in Alzheimer disease. Neurobiology of disease 41, 308-317, doi:10.1016/j.nbd.2010.09.019 (2011).
174 Ciarlo, E. et al. An intronic ncRNA-dependent regulation of SORL1 expression affecting Abeta formation is upregulated in post-mortem Alzheimer's disease brain samples. Disease models & mechanisms 6, 424-433, doi:10.1242/dmm.009761 (2013).
175 Alla, R. K. & Cairns, B. R. RNA polymerase III transcriptomes in human embryonic stem cells and induced pluripotent stem cells, and relationships with pluripotency transcription factors. PloS one 9, e85648, doi:10.1371/journal.pone.0085648 (2014).
176 Canella, D., Praz, V., Reina, J. H., Cousin, P. & Hernandez, N. Defining the RNA polymerase III transcriptome: Genome-wide localization of the RNA polymerase III transcription machinery in human cells. Genome research 20, 710-721, doi:10.1101/gr.101337.109 (2010).
177 Barski, A. et al. Pol II and its associated epigenetic marks are present at Pol III-transcribed noncoding RNA genes. Nature structural & molecular biology 17, 629-634, doi:10.1038/nsmb.1806 (2010).
178 Oler, A. J. et al. Human RNA polymerase III transcriptomes and relationships to Pol II promoter chromatin and enhancer-binding factors. Nature structural & molecular biology 17, 620-628, doi:10.1038/nsmb.1801 (2010).
179 Moqtaderi, Z. et al. Genomic binding profiles of functionally distinct RNA polymerase III transcription complexes in human cells. Nature structural & molecular biology 17, 635-640, doi:10.1038/nsmb.1794 (2010).

180 Raha, D. et al. Close association of RNA polymerase II and many transcription factors with Pol III genes. Proceedings of the National Academy of Sciences of the United States of America 107, 3639-3644, doi:10.1073/pnas.0911315106 (2010).
181 Canella, D. et al. A multiplicity of factors contributes to selective RNA polymerase III occupancy of a subset of RNA polymerase III genes in mouse liver. Genome research 22, 666-680, doi:10.1101/gr.130286.111 (2012).
182 Schmitt, B. M. et al. High-resolution mapping of transcriptional dynamics across tissue development reveals a stable mRNA-tRNA interface. Genome research 24, 1797-1807, doi:10.1101/gr.176784.114 (2014).
183 Van Bortle, K., Phanstiel, D. H. & Snyder, M. P. Topological organization and dynamic regulation of human tRNA genes during macrophage differentiation. Genome biology 18, 180, doi:10.1186/s13059-017-1310-3 (2017).
184 Helbo, A. S., Lay, F. D., Jones, P. A., Liang, G. & Gronbaek, K. Nucleosome Positioning and NDR Structure at RNA Polymerase III Promoters. Sci Rep 7, 41947, doi:10.1038/srep41947 (2017).
185 James Faresse, N. et al. Genomic study of RNA polymerase II and III SNAPc-bound promoters reveals a gene transcribed by both enzymes and a broad use of common activators. PLoS genetics 8, e1003028, doi:10.1371/journal.pgen.1003028 (2012).
186 Donze, D. Extra-transcriptional functions of RNA Polymerase III complexes: TFIIIC as a potential global chromatin bookmark. Gene 493, 169-175, doi:10.1016/j.gene.2011.09.018 (2012).
187 Raab, J. R. et al. Human tRNA genes function as chromatin insulators. The EMBO journal 31, 330-350, doi:10.1038/emboj.2011.406 (2012).
188 Ebersole, T. et al. tRNA genes protect a reporter gene from epigenetic silencing in mouse cells. Cell Cycle 10, 2779-2791, doi:10.4161/cc.10.16.17092 (2011).
189 Wang, J., Lunyak, V. V. & Jordan, I. K. Genome-wide prediction and analysis of human chromatin boundary elements. Nucleic acids research 40, 511-529, doi:10.1093/nar/gkr750 (2012).
190 Dixon, J. R. et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376-380, doi:10.1038/nature11082 (2012).
191 Yeganeh, M., Praz, V., Cousin, P. & Hernandez, N. Transcriptional interference by RNA polymerase III affects expression of the Polr3e gene. Genes & development 31, 413-421, doi:10.1101/gad.293324.116 (2017).
192 Willis, I. M. & Moir, R. D. Signaling to and from the RNA Polymerase III Transcription and Processing Machinery. Annu Rev Biochem, doi:10.1146/annurev-biochem-062917-012624 (2018).
193 Orioli, A., Praz, V., Lhote, P. & Hernandez, N. Human MAF1 targets and represses active RNA polymerase III genes by preventing recruitment rather than inducing long-term transcriptional arrest. Genome research, doi:10.1101/gr.201400.115 (2016).
194 Bonhoure, N. et al. Loss of the RNA polymerase III repressor MAF1 confers obesity resistance. Genes & development 29, 934-947, doi:10.1101/gad.258350.115 (2015).
195 Kutter, C. et al. Pol III binding in six mammals shows conservation among amino acid isotypes despite divergence among tRNA genes. Nature genetics 43, 948-955, doi:10.1038/ng.906 (2011).
196 Orioli, A., Praz, V., Lhote, P. & Hernandez, N. Human MAF1 targets and represses active RNA polymerase III genes by preventing recruitment rather than inducing long-term transcriptional arrest. Genome research 26, 624-635, doi:10.1101/gr.201400.115 (2016).
197 Cozen, A. E. et al. ARM-seq: AlkB-facilitated RNA methylation sequencing reveals a complex landscape of modified tRNA fragments. Nature methods 12, 879-884, doi:10.1038/nmeth.3508 (2015).
198 Zheng, G. et al. Efficient and quantitative high-throughput tRNA sequencing. Nature methods 12, 835-837, doi:10.1038/nmeth.3478 (2015).
199 Arimbasseri, A. G. et al. RNA Polymerase III Output Is Functionally Linked to tRNA Dimethyl-G26 Modification. PLoS genetics 11, e1005671, doi:10.1371/journal.pgen.1005671 (2015).
200 Wilusz, J. E. Removing roadblocks to deep sequencing of modified RNAs. Nature methods 12, 821-822, doi:10.1038/nmeth.3516 (2015).
201 Nottingham, R. M. et al. RNA-seq of human reference RNA samples using a thermostable group II intron reverse transcriptase. Rna 22, 597-613, doi:10.1261/rna.055558.115 (2016).
202 Gogakos, T. et al. Characterizing Expression and Processing of Precursor and Mature Human tRNAs by Hydro-tRNAseq and PAR-CLIP. Cell Rep 20, 1463-1475, doi:10.1016/j.celrep.2017.07.029 (2017).
203 Lariviere, R. et al. Sacs knockout mice present pathophysiological defects underlying autosomal recessive spastic ataxia of Charlevoix-Saguenay. Human molecular genetics 24, 727-739, doi:10.1093/hmg/ddu491 (2015).
204 Bradley, J. L. et al. Clinical, biochemical and molecular genetic correlations in Friedreich's ataxia. Human molecular genetics 9, 275-282 (2000).
205 Cohen, S. et al. Senataxin resolves RNA:DNA hybrids forming at DNA double-strand breaks to prevent translocations. Nature communications 9, 533, doi:10.1038/s41467-018-02894-w (2018).
206 Charzewska, A. et al. Hypomyelinating leukodystrophies - a molecular insight into the white matter pathology. Clinical genetics 90, 293-304, doi:10.1111/cge.12811 (2016).
207 Online Mendelian Inheritance in Man, OMIM®, (
208 Pozniak, C. D. et al. Sox10 directs neural stem cells toward the oligodendrocyte lineage by decreasing Suppressor of Fused expression. Proceedings of the National Academy of Sciences of the United States of America 107, 21795-21800, doi:10.1073/pnas.1016485107 (2010).
209 Jayadev, S. & Bird, T. D. Hereditary ataxias: overview. Genet Med 15, 673-683, doi:10.1038/gim.2013.28 (2013).
210 Noreau, A., Dion, P. A. & Rouleau, G. A. Molecular aspects of hereditary spastic paraplegia. Experimental cell research 325, 18-26, doi:10.1016/j.yexcr.2014.02.021 (2014).
211 Hersheson, J., Haworth, A. & Houlden, H. The inherited ataxias: genetic heterogeneity, mutation databases, and future directions in research and clinical diagnostics. Human mutation 33, 1324-1332, doi:10.1002/humu.22132 (2012).
212 Casari, G. et al. Spastic paraplegia and OXPHOS impairment caused by mutations in paraplegin, a nuclear-encoded mitochondrial metalloprotease. Cell 93, 973-983 (1998).
213 Wilkinson, P. A. et al. A clinical, genetic and biochemical study of SPG7 mutations in hereditary spastic paraplegia. Brain : a journal of neurology 127, 973-980, doi:10.1093/brain/awh125 (2004).

214 van Gassen, K. L. et al. Genotype-phenotype correlations in spastic paraplegia type 7: a study in a large Dutch cohort. Brain : a journal of neurology 135, 2994-3004, doi:10.1093/brain/aws224 (2012).
215 Klebe, S. et al. Spastic paraplegia gene 7 in patients with spasticity and/or optic neuropathy. Brain : a journal of neurology 135, 2980-2993, doi:10.1093/brain/aws240 (2012).
216 Kumar, K. R. et al. Targeted next generation sequencing in SPAST-negative hereditary spastic paraplegia. Journal of neurology 260, 2516-2522, doi:10.1007/s00415-013-7008-x (2013).
217 Sanchez-Ferrero, E. et al. SPG7 mutational screening in spastic paraplegia patients supports a dominant effect for some mutations and a pathogenic role for p.A510V. Clinical genetics 83, 257-262, doi:10.1111/j.1399-0004.2012.01896.x (2013).
218 Arnoldi, A. et al. A clinical, genetic, and biochemical characterization of SPG7 mutations in a large cohort of patients with hereditary spastic paraplegia. Human mutation 29, 522-531, doi:10.1002/humu.20682 (2008).
219 Doi, H. et al. Identification of a novel homozygous SPG7 mutation in a Japanese patient with spastic ataxia: making an efficient diagnosis using exome sequencing for autosomal recessive cerebellar ataxia and spastic paraplegia. Internal medicine 52, 1629-1633 (2013).
220 Orsucci, D. et al. Hereditary spastic paraparesis in adults. A clinical and genetic perspective from Tuscany. Clinical neurology and neurosurgery 120, 14-19, doi:10.1016/j.clineuro.2014.02.002 (2014).
221 Racis, L. et al. The high prevalence of hereditary spastic paraplegia in Sardinia, insular Italy. Journal of neurology 261, 52-59, doi:10.1007/s00415-013-7151-4 (2014).
222 Elleuch, N. et al. Mutation analysis of the paraplegin gene (SPG7) in patients with hereditary spastic paraplegia. Neurology 66, 654-659, doi:10.1212/01.wnl.0000201185.91110.15 (2006).
223 Roxburgh, R. H. et al. The p.Ala510Val mutation in the SPG7 (paraplegin) gene is the most common mutation causing adult onset neurogenetic disease in patients of British ancestry. Journal of neurology 260, 1286-1294, doi:10.1007/s00415-012-6792-z (2013).
224 Pfeffer, G. et al. SPG7 mutations are a common cause of undiagnosed ataxia. Neurology 84, 1174-1176, doi:10.1212/WNL.0000000000001369 (2015).
225 van de Warrenburg, B. P. et al. EFNS/ENS Consensus on the diagnosis and management of chronic ataxias in adulthood. European journal of neurology : the official journal of the European Federation of Neurological Societies 21, 552-562, doi:10.1111/ene.12341 (2014).
226 Yoon, G. et al. Autosomal recessive hereditary spastic paraplegia-clinical and genetic characteristics of a well-defined cohort. Neurogenetics 14, 181-188, doi:10.1007/s10048-013-0366-9 (2013).
227 Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078-2079, doi:10.1093/bioinformatics/btp352 (2009).
228 Wang, K., Li, M. & Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic acids research 38, e164, doi:10.1093/nar/gkq603 (2010).

229 Bonn, F., Pantakani, K., Shoukier, M., Langer, T. & Mannan, A. U. Functional evaluation of paraplegin mutations by a yeast complementation assay. Human mutation 31, 617-621, doi:10.1002/humu.21226 (2010).
230 Pfeffer, G. et al. Mutations in the SPG7 gene cause chronic progressive external ophthalmoplegia through disordered mitochondrial DNA maintenance. Brain : a journal of neurology 137, 1323-1336, doi:10.1093/brain/awu060 (2014).
231 Marcotulli, C. et al. Early-onset optic neuropathy as initial clinical presentation in SPG7. Journal of neurology 261, 1820-1821, doi:10.1007/s00415-014-7432-6 (2014).
232 Brugman, F. et al. Paraplegin mutations in sporadic adult-onset upper motor neuron syndromes. Neurology 71, 1500-1505, doi:10.1212/01.wnl.0000319700.11606.21 (2008).
233 Teixeira, P. F. & Glaser, E. Processing peptidases in mitochondria and chloroplasts. Biochimica et biophysica acta 1833, 360-370, doi:10.1016/j.bbamcr.2012.03.012 (2013).
234 Cavadini, P., Adamec, J., Taroni, F., Gakh, O. & Isaya, G. Two-step processing of human frataxin by mitochondrial processing peptidase. Precursor and intermediate forms are cleaved at different rates. The Journal of biological chemistry 275, 41469-41475, doi:10.1074/jbc.M006539200 (2000).
235 Schmucker, S., Argentini, M., Carelle-Calmels, N., Martelli, A. & Puccio, H. The in vivo mitochondrial two-step maturation of human frataxin. Human molecular genetics 17, 3521-3531, doi:10.1093/hmg/ddn244 (2008).
236 Ogishima, T., Niidome, T., Shimokata, K., Kitada, S. & Ito, A. Analysis of elements in the substrate required for processing by mitochondrial processing peptidase. The Journal of biological chemistry 270, 30322-30326 (1995).
237 Nagao, Y. et al. Glycine-rich region of mitochondrial processing peptidase alpha-subunit is essential for binding and cleavage of the precursor proteins. The Journal of biological chemistry 275, 34552-34556, doi:10.1074/jbc.M003110200 (2000).
238 Kucera, T. et al. A computational study of the glycine-rich loop of mitochondrial processing peptidase. PloS one 8, e74518, doi:10.1371/journal.pone.0074518 (2013).
239 Dvorakova-Hola, K. et al. Glycine-rich loop of mitochondrial processing peptidase alpha-subunit is responsible for substrate recognition by a mechanism analogous to mitochondrial receptor Tom20. Journal of molecular biology 396, 1197-1210, doi:10.1016/j.jmb.2009.12.054 (2010).
240 Greene, A. W. et al. Mitochondrial processing peptidase regulates PINK1 processing, import and Parkin recruitment. EMBO reports 13, 378-385, doi:10.1038/embor.2012.14 (2012).
241 Horvath, R. & Chinnery, P. F. Nuclear-mitochondrial proteins: too much to process? Brain : a journal of neurology 138, 1451-1453, doi:10.1093/brain/awv072 (2015).
242 La Piana, R. et al. Brain magnetic resonance imaging (MRI) pattern recognition in Pol III-related leukodystrophies. Journal of child neurology 29, 214-220, doi:10.1177/0883073813503902 (2014).
243 Steenweg, M. E. et al. Magnetic resonance imaging pattern recognition in hypomyelinating disorders. Brain : a journal of neurology 133, 2971-2982, doi:10.1093/brain/awq257 (2010).
244 Dieci, G., Fiorino, G., Castelnuovo, M., Teichmann, M. & Pagano, A. The expanding RNA polymerase III transcriptome. Trends in genetics : TIG 23, 614-622, doi:10.1016/j.tig.2007.09.001 (2007).

245 Shimojima, K. et al. Novel compound heterozygous mutations of POLR3A revealed by whole-exome sequencing in a patient with hypomyelination. Brain & development 36, 315-321, doi:10.1016/j.braindev.2013.04.011 (2014).
246 Battini, R. et al. Longitudinal follow up of a boy affected by Pol III-related leukodystrophy: a detailed phenotype description. BMC medical genetics 16, 53, doi:10.1186/s12881-015-0203-0 (2015).
247 Billington, E., Bernard, G., Gibson, W. & Corenblum, B. Endocrine Aspects of 4H Leukodystrophy: A Case Report and Review of the Literature. Case reports in endocrinology 2015, 314594, doi:10.1155/2015/314594 (2015).
248 Synofzik, M., Bernard, G., Lindig, T. & Gburek-Augustat, J. Teaching neuroimages: hypomyelinating leukodystrophy with hypodontia due to POLR3B: look into a leukodystrophy's mouth. Neurology 81, e145, doi:10.1212/01.wnl.0000435300.64776.7e (2013).
249 Upadhya, R., Lee, J. & Willis, I. M. Maf1 is an essential mediator of diverse signals that repress RNA polymerase III transcription. Molecular cell 10, 1489-1494 (2002).
250 Michels, A. A. et al. mTORC1 directly phosphorylates and regulates human MAF1. Molecular and cellular biology 30, 3749-3757, doi:10.1128/MCB.00319-10 (2010).
251 Ishimura, R. et al. RNA function. Ribosome stalling induced by mutation of a CNS-specific tRNA causes neurodegeneration. Science 345, 455-459, doi:10.1126/science.1249749 (2014).
252 Yee, N. S. et al. Mutation of RNA Pol III subunit rpc2/polr3b Leads to Deficiency of Subunit Rpc11 and disrupts zebrafish digestive development. PLoS biology 5, e312, doi:10.1371/journal.pbio.0050312 (2007).
253 White, R. J. Transcription by RNA polymerase III: more complex than we thought. Nature reviews. Genetics 12, 459-463, doi:10.1038/nrg3001 (2011).
254 Sievers, F. et al. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol 7, 539, doi:10.1038/msb.2011.75 (2011).
255 Ornelas, I. M. et al. Heterogeneity in oligodendroglia: Is it relevant to mouse models and human disease? Journal of neuroscience research 94, 1421-1433, doi:10.1002/jnr.23900 (2016).
256 Fields, R. D. White matter matters. Scientific American 298, 42-49 (2008).
257 Tress, O. et al. Pathologic and phenotypic alterations in a mouse expressing a connexin47 missense mutation that causes Pelizaeus-Merzbacher-like disease in humans. PLoS genetics 7, e1002146, doi:10.1371/journal.pgen.1002146 (2011).
258 Odermatt, B. et al. Connexin 47 (Cx47)-deficient mice with enhanced green fluorescent protein reporter gene reveal predominant oligodendrocytic expression of Cx47 and display vacuolized myelin in the CNS. The Journal of neuroscience : the official journal of the Society for Neuroscience 23, 4549-4559 (2003).
259 Pujol, A. et al. Late onset neurological phenotype of the X-ALD gene inactivation in mice: a mouse model for adrenomyeloneuropathy. Human molecular genetics 11, 499-505 (2002).
260 Duning, K., Buck, F., Barnekow, A. & Kremerskothen, J. SYNCRIP, a component of dendritically localized mRNPs, binds to the translation regulator BC200 RNA. Journal of neurochemistry 105, 351-359, doi:10.1111/j.1471-4159.2007.05138.x (2008).

261 Castelnuovo, M. et al. An Alu-like RNA promotes cell differentiation and reduces malignancy of human neuroblastoma cells. FASEB J 24, 4033-4046, doi:10.1096/fj.10-157032 (2010).
262 Gigoni, A. et al. Down-regulation of 21A Alu RNA as a tool to boost proliferation maintaining the tissue regeneration potential of progenitor cells. Cell Cycle 15, 2420-2430, doi:10.1080/15384101.2016.1181242 (2016).
263 Mack, J. T. et al. "Skittish" Abca2 knockout mice display tremor, hyperactivity, and abnormal myelin ultrastructure in the central nervous system. Molecular and cellular biology 27, 44-53, doi:10.1128/MCB.01824-06 (2007).
264 De Munter, S. et al. Early-onset Purkinje cell dysfunction underlies cerebellar ataxia in peroxisomal multifunctional protein-2 deficiency. Neurobiology of disease 94, 157-168, doi:10.1016/j.nbd.2016.06.012 (2016).
265 Coley, W. D. et al. Effect of genetic background on the dystrophic phenotype in mdx mice. Human molecular genetics 25, 130-145, doi:10.1093/hmg/ddv460 (2016).
266 Hatzipetros, T. et al. C57BL/6J congenic Prp-TDP43A315T mice develop progressive neurodegeneration in the myenteric plexus of the colon without exhibiting key features of ALS. Brain research 1584, 59-72, doi:10.1016/j.brainres.2013.10.013 (2014).
267 Frankel, W. N., Mahaffey, C. L., McGarr, T. C., Beyer, B. J. & Letts, V. A. Unraveling genetic modifiers in the gria4 mouse model of absence epilepsy. PLoS genetics 10, e1004454, doi:10.1371/journal.pgen.1004454 (2014).
268 van der Knaap, M. S., Pronk, J. C. & Scheper, G. C. Vanishing white matter disease. Lancet Neurol 5, 413-423, doi:10.1016/S1474-4422(06)70440-9 (2006).
269 Gingold, H. et al. A dual program for translation regulation in cellular proliferation and differentiation. Cell 158, 1281-1292, doi:10.1016/j.cell.2014.08.011 (2014).
270 Dieci, G. et al. A universally conserved region of the largest subunit participates in the active site of RNA polymerase III. The EMBO journal 14, 3766-3776 (1995).
271 Thuillier, V., Brun, I., Sentenac, A. & Werner, M. Mutations in the alpha-amanitin conserved domain of the largest subunit of yeast RNA polymerase III affect pausing, RNA cleavage and transcriptional transitions. The EMBO journal 15, 618-629 (1996).
272 Brun, I., Sentenac, A. & Werner, M. Dual role of the C34 subunit of RNA polymerase III in transcription initiation. The EMBO journal 16, 5730-5741, doi:10.1093/emboj/16.18.5730 (1997).
273 Schnutgen, F. & Ghyselinck, N. B. Adopting the good reFLEXes when generating conditional alterations in the mouse genome. Transgenic research 16, 405-413, doi:10.1007/s11248-007-9089-8 (2007).
274 Luong, T. N., Carlisle, H. J., Southwell, A. & Patterson, P. H. Assessment of motor balance and coordination in mice using the balance beam. Journal of visualized experiments : JoVE, doi:10.3791/2376 (2011).
275 Carter, R. J., Morton, J. & Dunnett, S. B. Motor coordination and balance in rodents. Current protocols in neuroscience Chapter 8, Unit 8 12, doi:10.1002/0471142301.ns0812s15 (2001).
276 Girard, M. et al. Mitochondrial dysfunction and Purkinje cell loss in autosomal recessive spastic ataxia of Charlevoix-Saguenay (ARSACS). Proceedings of the National Academy of Sciences of the United States of America 109, 1661-1666, doi:10.1073/pnas.1113166109 (2012).

277 Skryabin, B. V. et al. Neuronal untranslated BC1 RNA: targeted gene elimination in mice. Molecular and cellular biology 23, 6435-6441 (2003).
278 Lavallee-Adam, M. et al. Discovery of cell compartment specific protein-protein interactions using affinity purification combined with tandem mass spectrometry. Journal of proteome research 12, 272-281, doi:10.1021/pr300778b (2013).
279 Perkins, D. N., Pappin, D. J., Creasy, D. M. & Cottrell, J. S. Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 20, 3551-3567, doi:10.1002/(SICI)1522-2683(19991201)20:18<3551::AID-ELPS3551>3.0.CO;2-2 (1999).
280 Orioli, A., Pascali, C., Pagano, A., Teichmann, M. & Dieci, G. RNA polymerase III transcription control elements: themes and variations. Gene 493, 185-194, doi:10.1016/j.gene.2011.06.015 (2012).
281 Khatter, H., Vorlander, M. K. & Muller, C. W. RNA polymerase I and III: similar yet unique. Curr Opin Struct Biol 47, 88-94, doi:10.1016/j.sbi.2017.05.008 (2017).
282 Hein, P. P. & Landick, R. The bridge helix coordinates movements of modules in RNA polymerase. BMC Biol 8, 141, doi:10.1186/1741-7007-8-141 (2010).
283 Ran, F. A. et al. Genome engineering using the CRISPR-Cas9 system. Nature protocols 8, 2281-2308, doi:10.1038/nprot.2013.143 (2013).
284 Lalonde, S. et al. Frameshift indels introduced by genome editing can lead to in-frame exon skipping. PloS one 12, e0178700, doi:10.1371/journal.pone.0178700 (2017).
285 McLaurin, J., Trudel, G. C., Shaw, I. T., Antel, J. P. & Cashman, N. R. A human glial hybrid cell line differentially expressing genes subserving oligodendrocyte and astrocyte phenotype. Journal of neurobiology 26, 283-293, doi:10.1002/neu.480260212 (1995).
286 Kim, Y. et al. Biosynthesis of brain cytoplasmic 200 RNA. Sci Rep 7, 6884, doi:10.1038/s41598-017-05097-3 (2017).
287 Ramsay, E. P. & Vannini, A. Structural rearrangements of the RNA polymerase III machinery during tRNA transcription initiation. Biochimica et biophysica acta, doi:10.1016/j.bbagrm.2017.11.005 (2017).
288 Turowski, T. W. et al. Global analysis of transcriptionally engaged yeast RNA polymerase III reveals extended tRNA transcripts. Genome research 26, 933-944, doi:10.1101/gr.205492.116 (2016).
289 Mayer, A. et al. Native elongating transcript sequencing reveals human transcriptional activity at nucleotide resolution. Cell 161, 541-554, doi:10.1016/j.cell.2015.03.010 (2015).
290 Choquet, K. et al. Absence of neurological abnormalities in mice homozygous for the Polr3a G672E hypomyelinating leukodystrophy mutation. Mol Brain 10, 13, doi:10.1186/s13041-017-0294-y (2017).
291 Singh, R. et al. Regulation of alternative splicing of Bcl-x by BC200 contributes to breast cancer pathogenesis. Cell death & disease 7, e2262, doi:10.1038/cddis.2016.168 (2016).
292 Shin, H. et al. Knockdown of BC200 RNA expression reduces cell migration and invasion by destabilizing mRNA for calcium-binding protein S100A11. RNA Biol 14, 1418-1430, doi:10.1080/15476286.2017.1297913 (2017).
293 Wang, H. et al. Developmentally-programmed FMRP expression in oligodendrocytes: a potential role of FMRP in regulating translation in oligodendroglia progenitors. Human molecular genetics 13, 79-89, doi:10.1093/hmg/ddh009 (2004).

294 Woodward, K. J. The molecular and cellular defects underlying Pelizaeus-Merzbacher disease. Expert Rev Mol Med 10, e14, doi:10.1017/S1462399408000677 (2008).
295 Wisse, L. E. et al. Proteomic and Metabolomic Analyses of Vanishing White Matter Mouse Astrocytes Reveal Deregulation of ER Functions. Frontiers in cellular neuroscience 11, 411, doi:10.3389/fncel.2017.00411 (2017).
296 Antonicka, H. et al. A pseudouridine synthase module is essential for mitochondrial protein synthesis and cell viability. EMBO Rep 18, 28-38, doi:10.15252/embr.201643391 (2017).
297 Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15-21, doi:10.1093/bioinformatics/bts635 (2013).
298 Liao, Y., Smyth, G. K. & Shi, W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923-930, doi:10.1093/bioinformatics/btt656 (2014).
299 Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome biology 15, 550, doi:10.1186/s13059-014-0550-8 (2014).
300 Cox, J. & Mann, M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nature biotechnology 26, 1367-1372, doi:10.1038/nbt.1511 (2008).
301 Eden, E., Navon, R., Steinfeld, I., Lipson, D. & Yakhini, Z. GOrilla: a tool for discovery and visualization of enriched GO terms in ranked gene lists. BMC bioinformatics 10, 48, doi:10.1186/1471-2105-10-48 (2009).
302 Paquet, D. et al. Efficient introduction of specific homozygous and heterozygous mutations using CRISPR/Cas9. Nature 533, 125-129, doi:10.1038/nature17664 (2016).
303 Locati, M. D. et al. Improving small RNA-seq by using a synthetic spike-in set for size-range quality control together with a set for data normalization. Nucleic acids research 43, e89, doi:10.1093/nar/gkv303 (2015).
304 Zhong, J. et al. Transfer RNAs Mediate the Rapid Adaptation of Escherichia coli to Oxidative Stress. PLoS genetics 11, e1005302, doi:10.1371/journal.pgen.1005302 (2015).
305 Baker, S. C. et al. The External RNA Controls Consortium: a progress report. Nature methods 2, 731-734, doi:10.1038/nmeth1005-731 (2005).
306 Wang, L., Wang, S. & Li, W. RSeQC: quality control of RNA-seq experiments. Bioinformatics 28, 2184-2185, doi:10.1093/bioinformatics/bts356 (2012).
307 Conti, A. et al. Identification of RNA polymerase III-transcribed Alu loci by computational screening of RNA-Seq data. Nucleic acids research 43, 817-835, doi:10.1093/nar/gku1361 (2015).
308 Wiethoff, S. et al. Pure Cerebellar Ataxia with Homozygous Mutations in the PNPLA6 Gene. Cerebellum 16, 262-267, doi:10.1007/s12311-016-0769-x (2017).
309 Synofzik, M. et al. PNPLA6 mutations cause Boucher-Neuhauser and Gordon Holmes syndromes as part of a broad neurodegenerative spectrum. Brain : a journal of neurology 137, 69-77, doi:10.1093/brain/awt326 (2014).
310 Rainier, S. et al. Neuropathy target esterase gene mutations cause motor neuron disease. American journal of human genetics 82, 780-785, doi:10.1016/j.ajhg.2007.12.018 (2008).
311 Dor, T. et al. KIF1C mutations in two families with hereditary spastic paraparesis and cerebellar dysfunction. Journal of medical genetics 51, 137-142, doi:10.1136/jmedgenet-2013-102012 (2014).

312 Novarino, G. et al. Exome sequencing links corticospinal motor neuron disease to common neurodegenerative disorders. Science 343, 506-511, doi:10.1126/science.1247363 (2014).
313 Simons, C. et al. A de novo mutation in the beta-tubulin gene TUBB4A results in the leukoencephalopathy hypomyelination with atrophy of the basal ganglia and cerebellum. American journal of human genetics 92, 767-773, doi:10.1016/j.ajhg.2013.03.018 (2013).
314 Lohmann, K. et al. Whispering dysphonia (DYT4 dystonia) is caused by a mutation in the TUBB4 gene. Annals of neurology 73, 537-545, doi:10.1002/ana.23829 (2013).
315 Blumkin, L. et al. Expansion of the spectrum of TUBB4A-related disorders: a new phenotype associated with a novel mutation in the TUBB4A gene. Neurogenetics 15, 107-113, doi:10.1007/s10048-014-0392-2 (2014).
316 Buske, O. J. et al. PhenomeCentral: a portal for phenotypic and genotypic matchmaking of patients with rare genetic diseases. Human mutation 36, 931-940, doi:10.1002/humu.22851 (2015).
317 Sobreira, N., Schiettecatte, F., Valle, D. & Hamosh, A. GeneMatcher: a matching tool for connecting investigators with an interest in the same gene. Human mutation 36, 928-930, doi:10.1002/humu.22844 (2015).
318 Simons, C. et al. A recurrent de novo mutation in TMEM106B causes hypomyelinating leukodystrophy. Brain : a journal of neurology 140, 3105-3111, doi:10.1093/brain/awx314 (2017).
319 Kernohan, K. D. et al. Matchmaking facilitates the diagnosis of an autosomal-recessive mitochondrial disease caused by biallelic mutation of the tRNA isopentenyltransferase (TRIT1) gene. Human mutation 38, 511-516, doi:10.1002/humu.23196 (2017).
320 Faden, M. et al. Identification of a Recognizable Progressive Skeletal Dysplasia Caused by RSPRY1 Mutations. American journal of human genetics 97, 608-615, doi:10.1016/j.ajhg.2015.08.007 (2015).
321 Yan, H. et al. The recurrent mutation in TMEM106B also causes hypomyelinating leukodystrophy in China and is a CpG hot spot. Brain : a journal of neurology, doi:10.1093/brain/awy029 (2018).
322 Joshi, M. et al. Mutations in the substrate binding glycine-rich loop of the mitochondrial processing peptidase-alpha protein (PMPCA) cause a severe mitochondrial disease. Cold Spring Harb Mol Case Stud 2, a000786, doi:10.1101/mcs.a000786 (2016).
323 Elsea, S. H. & Lucas, R. E. The mousetrap: what we can learn when the mouse model does not mimic the human disease. ILAR journal 43, 66-79 (2002).
324 Baskin, J. M. et al. The leukodystrophy protein FAM126A (hyccin) regulates PtdIns(4)P synthesis at the plasma membrane. Nat Cell Biol 18, 132-138, doi:10.1038/ncb3271 (2016).
325 Miyamoto, Y., Kawahara, K., Torii, T. & Yamauchi, J. Defective myelination in mice harboring hypomyelinating leukodystrophy-associated HSPD1 mutation. Mol Genet Metab Rep 11, 6-7, doi:10.1016/j.ymgmr.2017.03.003 (2017).
326 Frohlich, D. et al. In vivo characterization of the aspartyl-tRNA synthetase DARS: Homing in on the leukodystrophy HBSL. Neurobiology of disease 97, 24-35, doi:10.1016/j.nbd.2016.10.008 (2017).
327 Luo, X., Li, M. & Su, B. Application of the genome editing tool CRISPR/Cas9 in non-human primates. Dongwuxue Yanjiu 37, 214-219, doi:10.13918/j.issn.2095-8137.2016.4.214 (2016).

215 328 Woiwode, A. et al. PTEN represses RNA polymerase III-dependent transcription by targeting the TFIIIB complex. Molecular and cellular biology 28, 4204-4214, doi:10.1128/MCB.01912-07 (2008). 329 Haurie, V. et al. Two isoforms of human RNA polymerase III with specific functions in cell growth and transformation. Proceedings of the National Academy of Sciences of the United States of America 107, 4176-4181, doi:10.1073/pnas.0914980107 (2010). 330 Chiu, Y. H., Macmillan, J. B. & Chen, Z. J. RNA polymerase III detects cytosolic DNA and induces type I interferons through the RIG-I pathway. Cell 138, 576-591, doi:10.1016/j.cell.2009.06.015 (2009). 331 Johnson, S. A., Dubeau, L. & Johnson, D. L. Enhanced RNA polymerase III-dependent transcription is required for oncogenic transformation. The Journal of biological chemistry 283, 19184-19191, doi:10.1074/jbc.M802872200 (2008). 332 Natsume, T., Kiyomitsu, T., Saga, Y. & Kanemaki, M. T. Rapid Protein Depletion in Human Cells by Auxin-Inducible Degron Tagging with Short Homology Donors. Cell Rep 15, 210-218, doi:10.1016/j.celrep.2016.03.001 (2016). 333 Lewejohann, L. et al. Role of a neuronal small non-messenger RNA: behavioural alterations in BC1 RNA-deleted mice. Behav Brain Res 154, 273-289, doi:10.1016/j.bbr.2004.02.015 (2004). 334 Zhong, J. et al. BC1 regulation of metabotropic glutamate receptor-mediated neuronal excitability. The Journal of neuroscience : the official journal of the Society for Neuroscience 29, 9977-9986, doi:10.1523/JNEUROSCI.3893-08.2009 (2009). 335 Maccarrone, M. et al. Abnormal mGlu 5 receptor/endocannabinoid coupling in mice lacking FMRP and BC1 RNA. Neuropsychopharmacology 35, 1500-1509, doi:10.1038/npp.2010.19 (2010). 336 Briz, V. et al. The non-coding RNA BC1 regulates experience-dependent structural plasticity and learning. Nature communications 8, 293, doi:10.1038/s41467-017-00311-2 (2017). 337 Shin, H. et al. 
Identifying the cellular location of brain cytoplasmic 200 RNA using an RNA-recognizing antibody. BMB Rep 50, 318-322 (2017). 338 Wang, W., van Niekerk, E., Willis, D. E. & Twiss, J. L. RNA transport and localized protein synthesis in neurological disorders and neural repair. Dev Neurobiol 67, 1166- 1182, doi:10.1002/dneu.20511 (2007). 339 Torvund-Jensen, J., Steengaard, J., Reimer, L., Fihl, L. B. & Laursen, L. S. Transport and translation of MBP mRNA is regulated differently by distinct hnRNP proteins. J Cell Sci 127, 1550-1564, doi:10.1242/jcs.140855 (2014). 340 Jang, S. et al. Regulation of BC200 RNA-mediated translation inhibition by hnRNP E1 and E2. FEBS Lett 591, 393-405, doi:10.1002/1873-3468.12544 (2017). 341 De Biase, L. M., Nishiyama, A. & Bergles, D. E. Excitability and synaptic communication within the oligodendrocyte lineage. The Journal of neuroscience : the official journal of the Society for Neuroscience 30, 3600-3611, doi:10.1523/JNEUROSCI.6000-09.2010 (2010). 342 Czopka, T., Ffrench-Constant, C. & Lyons, D. A. Individual oligodendrocytes have only a few hours in which to generate new myelin sheaths in vivo. Dev Cell 25, 599-609, doi:10.1016/j.devcel.2013.05.013 (2013). 343 Briese, M. et al. Whole transcriptome profiling reveals the RNA content of motor axons. Nucleic acids research 44, e33, doi:10.1093/nar/gkv1027 (2016).

216 344 Houry, W. A., Bertrand, E. & Coulombe, B. The PAQosome, an R2TP-Based Chaperone for Quaternary Structure Formation. Trends in biochemical sciences 43, 4-9, doi:10.1016/j.tibs.2017.11.001 (2018). 345 Cummings, B. B. et al. Improving genetic diagnosis in Mendelian disease with transcriptome sequencing. Sci Transl Med 9, doi:10.1126/scitranslmed.aal5209 (2017). 346 Jia, Y., Mu, J. C. & Ackerman, S. L. Mutation of a U2 snRNA gene causes global disruption of alternative splicing and neurodegeneration. Cell 148, 296-308, doi:10.1016/j.cell.2011.11.057 (2012). 347 Gotoh, L. et al. GJC2 promoter mutations causing Pelizaeus-Merzbacher-like disease. Mol Genet Metab 111, 393-398, doi:10.1016/j.ymgme.2013.12.001 (2014). 348 Plassais, J. et al. A Point Mutation in a lincRNA Upstream of GDNF Is Associated to a Canine Insensitivity to Pain: A Spontaneous Model for Human Sensory Neuropathies. PLoS genetics 12, e1006482, doi:10.1371/journal.pgen.1006482 (2016). 349 Jang, Y. J. et al. Disease-causing mutations in the promoter and enhancer of the ornithine transcarbamylase gene. Human mutation 39, 527-536, doi:10.1002/humu.23394 (2018). 350 Albert, S. et al. Identification and Rescue of Splice Defects Caused by Two Neighboring Deep-Intronic ABCA4 Mutations Underlying Stargardt Disease. American journal of human genetics, doi:10.1016/j.ajhg.2018.02.008 (2018).

APPENDIX

Significant contributions to other projects

a) Peer-reviewed articles

• Calabretta S, Vogel G, Yu Z, Choquet K, Darbelli L, Nicholson TB, Kleinman CL, Richard S. PRMT5 regulates PDGFRα plasma membrane availability during myelination by modulating Cbl recruitment. Developmental Cell (2018). Accepted. KC analyzed the RNA-sequencing data, made figures and reviewed the manuscript.

• Darbelli L*, Choquet K*, Richard S, Kleinman CL. Transcriptome profiling of mouse brains with qkI-deficient oligodendrocytes reveals major alternative splicing defects including self-splicing. Sci Reports 7, 7554 (2017). *Co-first authors. KC analyzed the RNA-sequencing data and performed other bioinformatics analyses, made figures and contributed to writing the manuscript.

• Antonicka H, Choquet K, Lin ZY, Gingras AC, Kleinman CL, Shoubridge EA. A pseudouridine synthase module is essential for mitochondrial protein synthesis and cell viability. EMBO Rep 18, 28-38 (2016). KC analyzed the RNA-sequencing data, performed computational analyses, made figures and contributed to writing the manuscript.

• Vasli N, Harris E, Karamchandani J, Bareke E, Majewski J, Romero NB, Stojkovic T, Barresi R, Malfatti E, Bohm J, Marini-Bettolo C, Choquet K, Dicaire MJ, Shao YH, Topf A, O’Ferrall E, Eymard B, Straub V, Blanco G, Lochmuller H, Brais B, Laporte J, Tétreault M. Recessive mutations in the kinase ZAK cause a congenital myopathy with fiber type disproportion. Brain 140, 37-48 (2016). KC performed RNA extractions and reviewed the manuscript.

• Binan L, Mazzaferri J, Choquet K, Lorenzo LE, Wang YC, Affar el B, De Koninck Y, Ragoussis J, Kleinman CL, Costantino S. Live single-cell laser tag. Nat Commun 7, 11636 (2016). KC performed the analysis of the single-cell RNA-sequencing data, made figures and contributed to writing the manuscript.

• Shao YH*, Choquet K*, La Piana R, Tétreault M, Dicaire MJ, Care4Rare Canada Consortium, Boycott KM, Majewski J, Brais B. Mutations in GALC cause late-onset Krabbe disease with predominant cerebellar ataxia. Neurogenetics 17, 137-141 (2016). *Co-first authors. KC supervised the experiments, analyzed data, wrote and reviewed the manuscript.

• Thiffault I, Wolf NI, Forget D, Guerrero K, Tran LT, Choquet K, Lavallée-Adam M, Poitras C, Brais B, Yoon G, Sztriha L, Webster RI, Timmann D, van de Warrenburg BP, Seeger J, Zimmermann A, Maté A, Goizet C, Fung E, van der Knaap MS, Fribourg S, Vanderver A, Simons C, Taft RJ, Yates JR 3rd, Coulombe B, Bernard G. Recessive mutations in POLR1C cause a leukodystrophy by impairing biogenesis of RNA polymerase III. Nat Commun 6, 7623 (2015). KC analyzed the ChIP-Seq data, produced the figures, wrote the section of the manuscript pertaining to this data and reviewed the manuscript.

• Choquet K, La Piana R, Brais B. A novel frameshift mutation in FGF14 causes an autosomal dominant episodic ataxia. Neurogenetics 16, 233-236 (2015). KC analyzed the exome sequencing data, performed the validation and wrote the manuscript.

b) Articles under review or in preparation

• Tétreault M, Nicolau S, Choquet K, Bareke E, Shao YH, Brais B, O’Ferrall EK, Karamchandani J. A molecular diagnosis of LGMD2A established by RNA sequencing. Submitted to Neuromuscular Disorders (2018). KC provided advice regarding experimental design, supervised the student doing experiments and reviewed the manuscript.

• Larivière R, Sgarioto N, Toscano Marquez B, Gaudet R, Choquet K, McKinney RA, Watts A, Brais B. Sacs R272C missense homozygous mice develop a milder ataxia phenotype than knock out mice. Submitted to Molecular Brain (2018). KC performed RNA extractions and analyzed qRT-PCR data.
