The Regulation of Incorrect Splicing of ISCU in Hereditary Myopathy with Lactic Acidosis

Denise F. R. Rawcliffe

Department of Medical Biosciences Umeå 2018 This work is protected by the Swedish Copyright Legislation (Act 1960:729) Dissertation for PhD ISBN: 978-91-7601-954-2 ISSN: 0346-6612 New series number: 1978 Cover design: Denise Rawcliffe Electronic version available at: http://umu.diva-portal.org/ Printed by: UmU Print Service, Umeå University Umeå, Sweden 2018

To Bethie

Ancora Imparo

Table of Contents

Abstract ...... iii Original Papers ...... iv Abbreviations ...... v Enkel sammanfattning på svenska ...... vii Splitsning ...... vii Linderholms myopati ...... vii Studie I ...... viii Studie II ...... viii Studie III ...... ix Slutsatser ...... x Introduction ...... 1 Splicing ...... 1 Constitutive and Alternative Splicing ...... 1 The Spliceosome ...... 3 Cis-acting Elements ...... 3 Exon Definition ...... 5 Trans-acting Factors ...... 6 Secondary Structure ...... 6 RNA-Binding ...... 7 Splicing Factors ...... 7 Nonsense-Mediated Decay ...... 8 Diseases Caused by Aberrant Splicing ...... 9 Pseudoexons ...... 10 Splicing Prediction ...... 10 Muscle-specific Splicing ...... 11 Muscles ...... 11 Myopathies ...... 13 Hereditary Myopathy with Lactic Acidosis...... 13 Symptoms and Inheritance ...... 13 Incorrect Splicing of ISCU ...... 15 Muscle-specific Incorrect Splicing ...... 16 The ISCU and FeS Cluster Assembly ...... 17 ISCU and Oxidative Phosphorylation in Cancer ...... 18 Muscle-specific Loss of ISCU ...... 18 PTBP1 ...... 19 IGF2BP1 ...... 20 RBM39 ...... 21 MATR3 ...... 22 Other ISCU Mutations ...... 23 Aims of Thesis ...... 25 Materials and Methods ...... 26

i

Human Materials ...... 26 Mice ...... 26 Cell Culture ...... 26 Lentiviral Transduction ...... 26 Magnetofection ...... 27 Cycloheximide Treatment ...... 27 RNA Extraction and cDNA Synthesis ...... 27 Semi-qRTPCR and qRTPCR ...... 27 Protein Samples and Nuclear Extracts ...... 27 Pull-down Assay ...... 27 Western Blot Analysis...... 28 Statistics ...... 28 Results and Discussion ...... 29 Paper I ...... 29 Muscle-specific Splicing ...... 29 SRSF3 as a Muscle-specific Activator of the Incorrect Splicing ...... 29 Paper II...... 30 Tissue-specificity of PTBP1 ...... 30 PTBP1 inhibition of the Incorrect Splicing ...... 31 Tissue Specificity ...... 31 Paper III ...... 32 K-homology Domain Candidates ...... 32 Other Splicing Factor Candidates ...... 33 U2AF ...... 33 MBNL1 ...... 34 MBNL1 and RBM39 Activates the Incorrect Splicing of ISCU ...... 35 The Incorrect ISCU RNA Transcripts are Targets for NMD ...... 36 Conclusions ...... 38 Paper I ...... 38 Paper II...... 38 Paper III ...... 38 Overall Discussion ...... 39 The Exercise Sensitivity of HML Patients ...... 39 Effect of HML Mutation on Splicing Elements ...... 40 Splicing Factor Binding Prediction ...... 42 Tissue-Specific Regulation of ISCU Splicing ...... 42 Splicing Disease Therapy ...... 45 Antisense Oligonucleotide Therapy ...... 45 Muscular Dystrophy Treatments ...... 46 Therapies for HML Patients ...... 47 Antisense Oligonucleotides ...... 47 Other Therapies ...... 48 Acknowledgement ...... 50 References ...... 53

ii

Abstract

Patients suffering from hereditary myopathy with lactic acidosis (HML) can be found in the northern Swedish counties of Ångermanland and Västerbotten. HML is a rare autosomal recessive disease where patients display a low tolerance to exercise at an early age. Exercise can trigger symptoms such as palpitations, tachycardia, muscle cramps and dyspnoea. Extensive exercise or strict diets can result in myoglobinuria and life-threatening levels of lactic acid. The disease is caused by a nonsense G > C mutation (c.418 + 328G < C) in the last intron of the iron-sulphur (FeS) cluster assembly (ISCU), resulting in nonsense-mediated decay (NMD) of the transcript due to incorrect splicing. The ISCU protein is involved in the assembly of FeS clusters, which are essential cofactors for a wide range of proteins. Patient muscles display decreased levels of several FeS cluster proteins: mitochondrial aconitase in the tricarboxylic acid (TCA) cycle and Complex I, II (succinate dehydrogenase [SDH]) and III in the electron transport chain (ETC). The incorrect splicing of ISCU occurs to the highest extent in HML patient skeletal muscle, restricting the loss of ISCU protein to muscles, thereby preventing a more severe phenotype.

We found that the incorrect splicing occurs to the highest extent in slow-fibre muscle, which may be caused by the serine/arginine-rich splicing factor (SRSF3) as it is expressed at higher levels in slow-fibre muscle compared to other muscles, and since it is able to activate the incorrect splicing of ISCU. Following muscle, there is a gradual decrease of the incorrect splicing in heart, brain, liver and kidney, which is negatively correlated with the levels of the splicing inhibitor polypyrimidine-tract binding protein 1 (PTBP1). Overexpression of PTBP1 in HML patient myoblasts resulted in a drastic decrease in the incorrect splicing, while a PTBP1 knockdown had the opposite effect. Our results suggest that PTBP1 acts as a dominant inhibitor of the incorrect splicing and is likely the main cause for the tissue-specific splicing of ISCU in HML. We also identified RBM39 and MBNL1 as activators of the incorrect splicing of ISCU, which, together with the low levels of PTBP1, could explain the high levels of incorrect splicing in muscle.

Since almost 95% of all human gene transcripts are alternatively spliced, it is not surprising that a wide range of diseases are caused by mutations that affect splicing. Further knowledge of the function of splicing, such as tissue-specific splicing, can provide vital information for the development of therapies for diseases caused by splicing.

iii

Original Papers

The thesis is based on the following papers, referred to in the text by their Roman numeral:

I. Denise F.R. Rawcliffe, Lennart Österman, Hans Lindsten and Monica Holmberg. The High Level of Aberrant Splicing of ISCU in Slow-Twitch Muscle May Involve the Splicing Factor SRSF3. (2016) PLoS ONE 11 (10):e0165453. II. Denise F.R. Rawcliffe, Lennart Österman, Angelica Nordin and Monica Holmberg. PTBP1 Acts as a Dominant Repressor of the Aberrant Tissue-specific Splicing of ISCU in Hereditary Myopathy with Lactic Acidosis. (2018) Mol Genet Genomic Med. 00:1-11 III. Denise F.R. Rawcliffe, Malin Johansson, Lennart Österman and Monica Holmberg. MBNL1 and RBM39 can Activate the Incorrect Splicing of ISCU and the Aberrant Transcript is a Target for Nonsense- mediated Decay (Manuscript).

Papers I and II did not require permission to be reprinted from the publisher.

iv

Abbreviations

ASOs Antisense oligonucleotides C Healthy control C2C12 An immortalized mouse myoblast cell line CHX Cycloheximide cTNT Cardiac troponin T DMD Duchenne muscular dystrophy EJC Exon-junction-complex ESE Exonic splicing enhancer ESS Exonic splicing silencer ETC Electron transport chain FeS Iron-sulphur FGF21 Fibroblast growth factor 21 HIF1α Hypoxia inducible factor 1α HML Hereditary myopathy with lactic acidosis hnRNP Heterogenous ribonucleoprotein IGF2 Insulin-like growth factor 2 IGF2BP1 Insulin-like growth factor 2 mRNA-binding protein 1 ISCU Iron-sulphur cluster assembly ISE Intronic splicing enhancer ISS Intronic splicing silencer KH K-homology MATR3 MATRIN3 MD Myotonic dystrophy MBNL1/2/3 Muscle-blind like protein 1/2/3 miR microRNA NF1 Neurofibromatosis type 1 NMD Nonsense-mediated decay P1 Patient 1

v

P2 Patient 2 PMO Phosphorodiamidate morpholino PPT Polypyrimidine tract PTBP1 Polypyrimidine tract binding protein PTC Premature termination codon RBM39 RNA-binding motif protein 39 RBPs RNA-binding proteins RD Rhabdomyosarcoma RRM RNA-recognition motif SDH Succinate dehydrogenase SOL soleus snRNAs Small nuclear RNAs SR Serine/arginine rich SREs Splicing regulatory elements SRSF3 Serine/arginine-rich splicing factor 3 TCA Tricarboxylic acid U2AF U2 auxiliary factor UPF1 Up-frameshift 1 UTCF Umeå Transgene Core Facility UTRs Untranslated regions VEGFB Vascular endothelial growth factor B

vi

Enkel sammanfattning på svenska

Splitsning Från protein-kodande gener i vårt DNA transkriberas mRNA transkript. mRNA transkriptet innehåller exonsekvens som kodar för protein men även intronsekvens som ska tas bort innan proteinproduktionen kan börja. Arbetet med att ta bort intronerna och sätta ihop exonerna till ett färdigt mRNA transkript genomförs av spliceosomen och ett antal andra splitsningsproteiner i en process som kallas splitsning. Till skillnad från bakterier som är prokaryoter så har alla eukaryota organismer, inklusive människan, ett system i cellen som splitsar mRNA transkript.

Fördelen med splitsning är att det skapar en väldig flexibilitet i hur olika varianter av ett protein kan uttryckas i olika vävnader i kroppen. Beroende på hur man sätter ihop de olika exonerna så kan man få olika varianter av proteinet. De olika splitsningsproteinerna hjälper till att dirigera vilka exoner som ska sättas ihop och det är den unika konstellation av splitsningsproteiner i en viss vävnad som bestämmer vilken proteinvariant som faktiskt kommer att uttryckas.

Linderholms myopati Patienter med Linderholms myopati (LM) har sitt ursprung i Ångermanland och Västerbotten. Från tidig ålder kan vanlig muskelansträngning orsaka hjärtklappning med en onormalt hög hjärtrytm samt andnöd och muskelkramp. Tyngre träning eller strikta dieter kan orsaka allvarliga skov som innefattar nedbrytning av muskelmassa men även potentiellt livsfarliga nivåer av mjölksyra. Sjukdomen är orsakad av en mutation i den sista intronen i genen ISCU. Den genen kodar för Iron-sulphur cluster assembly (ISCU) proteinet som är en viktig komponent i produktionen av järn-svavel kluster som återfinns i ett flertal av de proteiner som är ansvariga för att producera ATP, cellens energi, i cellens mitokondrier.

Mutationen i genen ISCU gör att det inkluderas intronsekvens i det färdiga ISCU mRNA:t. Detta gör att detta mRNA bryts ner vilket orsakar låga nivåer av ISCU protein. Trots att cellens kapacitet att producera livsviktig energi är beroende av fungerande ISCU protein så påvisar patienterna enbart muskelspecifika symptom. Detta beror på att den felaktiga splitsningen som orsakas av mutationen sker till en ovanligt hög grad i skelettmuskelvävnad jämfört med andra organ.

vii

Studie I Tidigare forskning har mätt nivåerna av felaktig splitsning av ISCU i muskel, hjärta och levervävnad från LM patienter och fann att den felaktiga splitsningen skedde till högst grad i muskel. För att undersöka nivåerna av felaktig splitsning i fler vävnader så uttryckte vi den mänskliga ISCU genen med LM mutationen i möss. Genom att samla vävnader från dessa möss så kunde vi visa att den felaktiga splitsningen sker till högst grad i skelettmuskeln soleus, som består mestadels av långsamma muskelfibrer. Efter muskelvävnader så minskade nivån av felaktig splitsning gradvis i hjärta, hjärna, lever och njure.

För att förstå varför den felaktiga splitsningen skedde till en högre grad i soleus jämfört med gastrocnemius muskeln och quadriceps muskeln så sökte vi efter skillnader i nivåer av splitsningsproteiner i muskler av olika fiberkomposition. Vi fann att splitsningsproteinet Serine/arginine-rich splicing factor 3 (SRSF3) fanns till högre nivåer i soleus jämfört med gastrocnemius och quadriceps. När vi sedan ökade eller minskade nivåerna av SRSF3 i muskelceller från patienter så ökade och minskade nivåerna av felaktig splitsning. Detta påvisar att SRSF3 kan aktivera den felaktiga splitsningen och kan vara orsaken till att det blir mer felaktig splitsning av ISCU i soleus jämfört med gastrocnemius och quadriceps. SRSF3 proteinet kunde även binda till ISCU mRNA, dock med liknande styrka till både den icke-muterade sekvensen som den muterade ISCU sekvensen. Detta betyder att SRSF3 inte kan vara det splitsningsprotein som binder in specifikt till den mutade ISCU sekvensen och orsakar de höga nivåerna av felaktig splitsning, utan det måste vara fler splitsningsproteiner som är involverade.

Studie II Tidigare LM forskning har visat att Polypyrimidine tract-binding protein 1 (PTBP1) har en inhiberande effekt på den felaktiga splitsningen av ISCU. Vi mätte nivån av PTBP1 i de olika musvävnaderna vi samlat och såg att nivån av PTBP1 hade en negativ korrelation med nivåerna av felaktig splitsning. I muskelvävnaderna, där den felaktiga splitsningen var högst, fanns det väldigt lite PTBP1 medan i lever och njure, där den felaktiga splitsningen var som lägst, fanns det tydligt högre nivåer av PTBP1. När vi ökade och minskade nivåerna av PTBP1 i LM patienters muskelceller såg vi att nivåerna av felaktig splitsning drastiskt minskade, respektive, ökade. Sammanfattningsvis betyder detta att PTBP1 inhiberar den felaktiga splitsningen av ISCU i LM och att nivåerna av PTBP1 i en viss vävnad kan vara en dominant faktor som bestämmer vilken nivå av felaktig splitsning som kommer nås i den vävnaden. PTBP1 kunde även binda till ISCU mRNA och band faktiskt starkare till det ISCU mRNA som innehöll LM mutationen jämfört med det ISCU mRNA som saknade mutationen. Detta kan förklaras av det faktum att mutationen skapar en ny sekvens som är väldigt lik den sekvens som PTBP1 brukar binda till. Trots att splitsningsinhibitorn PTBP1

viii

binder den muterade sekvensen starkare så har patienterna ändå mycket höga nivåer av felaktigt splitsat ISCU mRNA. Då nivån av PTBP1 är låg i muskelvävnad hos både patienter och friska, så kan inte de låga nivåerna av PTBP1 i muskel vara den enda orsakan till de höga nivåerna av felaktig splitsning. Det måste finnas ett splitsningsprotein som binder till den nya mRNA sekvens som LM mutationen skapar och som kan aktivera den felaktiga splitsningen av ISCU hos LM patienter.

Studie III I tidigare LM forskning så visade det sig att proteinet IGF2BP1 kunde binda starkare till den mRNA sekvens som skapas av LM mutationen jämfört med den icke-muterade sekvensen. I de studierna så visade de sig också att IGF2BP1 kunde motarbeta effekten av PTBP1 och kunde hindra att den felaktiga splitsningen minskade om PTBP1 nivåerna ökades. Då IGF2BP1 enbart uttrycks embryonalt i frisk vävnad samt att proteinet inte är känt för att vara involverat i splitsning, så undersökte vi tre splitsningsproteiner som innehöll liknande RNA- bindningselement som IGF2BP1: NOVA1 och PCBP1. Vi ökade nivåerna av dessa proteiner i muskelcancerceller som uttryckte ISCU mRNA med LM mutationen. Ökade nivåer av NOVA1 gav ingen effekt medan ökade nivåer av PCBP1 ledde till en diskret ökning av nivåerna av felaktigt splitsat ISCU.

Vi testade även att minska nivån av tre andra splitsningsproteiner i muskelceller från LM patienter: U2AF65, RBM39, MBNL1. Tidigare forskning har visat att RBM39 kan binds till ISCU mRNA sekvens medan U2AF65 är känd för att tävla med PTBP1 om att binda specifika splitsningssekvenser. MBNL1 är känd för att ha ett högt uttryckt i muskelvävnad och för att vara involverad i muskel-specifik reglering. Överaskande nog så fann vi att minskade nivåer av PTBP1-rivalen U2AF65 resulterade i en liten ökning av nivåerna av felaktig splitsning. Däremot fann vi även att minskade nivåer av RBM39 och MBNL1 ledde till minskade nivåer av felaktig splitsning av ISCU.

Vi kunde även visa att det felaktigt splitsade ISCU mRNA transkriptet blir nedbrutet av ett system som kallas nonsense-mediated decay. Det systemet känner igen mRNA transkript som har splitsats felaktigt och ser till att de bryts ner. De transkript som känns igen innehåller sekvens som kodar för att avsluta proteinuppbyggnaden innan hela proteinet har byggts upp. Om ett sådant transkript translateras så skulle det resultera i ett protein som var för kort, vilket ofta betyder att proteinet inte kommer kunna uppfylla sin funktion och i värsta fall kan ha skadliga effekter på cellen.

ix

Slutsatser Våra studier har visat att den felaktiga splitsningen av ISCU mRNA hos LM patienter verkar ske till högst grad i skelettmuskulatur bestående av mestadels långsamma muskelfibrer och till minst grad i njure. Splitsningsproteinet SRSF3 skulle kunna vara orsaken till att nivåerna är högre i muskler med långsamma muskelfibrer jämfört med snabba muskelfibrer. Den felaktiga splitsningen inhiberas till störst grad av PTBP1 där nivån av PTBP1 i en vävnad verkar vara avgörande för den slutgiltiga nivån av felaktig splitsning av ISCU i den vävnaden. Den felaktiga splitsningen av ISCU verkar dock kunna aktiveras av RBM39 och MBNL1, där MBNL1 uttryckts till hög grad i den drabbade skelettmuskulaturen.

Mutationer som påverkar splitsningsprocessen är orsaken till en rad olika sjukdomar och sedan upptäckten av splitsning för 40 år sedan finns idag de första fåtal medicinerna som riktas mot felaktig splitsning. För 10 år sedan visade forskning att ungefär 95 % av alla mRNA transkript kan splitsas till olika alternativa transkript, en siffra som man innan dess trodde låg på 60 %. Studier kring splitsning och vävnadsspecifik splitsning, så som i LM, kan bidra med viktig information kring hur man kan utveckla terapier för att behandla sjukdomar som orsakas av felaktig splitsning.

x

Introduction

Splicing Splicing is a process that is performed in eukaryotic cells but not in prokaryotic cells, where transcription and translation occur simultaneously. The pre-mRNA contains both introns and exons, where splicing removes the introns and fuses the exons, resulting in a final mRNA transcript that is used for protein production. Human exons have an average size of almost 150 nucleotides while the average human intron is approximately 3,400 nucleotides long1. The considerable difference in size combined with the fact that the introns are speckled with potential termination codons, which, if included, would result in truncated proteins, makes the task of intron removal quite daunting1. Including the untranslated regions (UTRs) that are not translated but transcribed, an average human gene is nearly 27,000 nucleotides long with an average of nine exons with intervening introns1. In the end, more than 90% of the pre-mRNA is removed during splicing, which at first might seem like a huge waste of energy and resources. It was initially believed that introns lack a regulatory role and are merely degraded, but it has been shown that introns can act as precursors for non-coding RNA molecules, such as microRNA (miR), as well as regulating translation2. However, the actual benefit from the usage of exons and introns, is the mechanism of alternative splicing.

Constitutive and Alternative Splicing A constitutive exon is always included in the different splice variants of a pre- mRNA while a so-called alternative exon can be excluded. Approximately 95% of all pre-mRNA transcripts are alternatively spliced, where the most common type of alternative exon is called an alternative cassette exon3,4 (FIGURE 1). When only a segment of an exon is excluded, it is referred to as an alternative 5´ or 3´ splice site, depending on which end of the exon is affected (FIGURE 1). Other modes of alternative splicing are when an entire intron is included in the final mRNA transcript or when two exons are included in a mutually exclusive fashion (FIGURE 1). The most common alternative splicing event in higher eukaryotes is the skipping of an exon while intron retention is quite common in non-vertebrates5.

1

Figure 1. Major forms of alternative splicing. Boxes represents exons while black lines represent introns. Schemtics of pre-mRNA to the left and the final mRNA outcomes to the right. The arrows represent the splicing process.

Together with alternative transcription/translation initiation and alternative polyadenylation, alternative splicing enables the production of multiple splice variants from each gene, which expands the protein coding capacity and contributes to the proteomic complexity in higher organisms. The approximately 20,000 protein-coding in the are transcribed into more than 80,000 transcripts, which then result in a proteome of approximately 250,000 to 1,000,000 proteins6. This protein diversity allows proteins to exist in several isoforms, where the different isoforms share certain constant domains but differ in other variable domains. Each protein isoform displays qualitatively distinct functions to serve the needs of the cell type in which they are expressed. This is a flexibility that, for example, cancer cells take advantage of when the alternative splicing is re-directed in favour of cancer cell survival despite hostile conditions7.

2

An example of a tissue-specific alternative splicing event is the muscle-specific inclusion of exon 7 in the β-tropomyosin mRNA. The inclusion of this exon in non-muscle tissues is prevented by the splicing regulator PTBP18,9. PTBP1 is also involved in preventing the inclusion of exon 3 in the α-tropomyosin mRNA in smooth muscle tissues10,11. Also, the tissue-specific expression of the cardiac troponin T (cTNT) gene is regulated by alternative splicing, where exon 5 is only included in cardiac or embryonic skeletal muscle cells in which cTNT needs to be expressed, and is skipped in other non-muscle cell types such as cervical cancer cells, fibroblasts and oocytes12,13.

The Spliceosome The spliceosome is a protein complex consisting of more than 200 polypeptides and RNA-binding proteins as well as five small nuclear RNAs (snRNAs): U1, U2, U4, U5 and U61,14,15. The spliceosome is recruited by splicing factors, the trans- acting factors, to specific sites on pre-mRNA transcripts known as cis-acting elements. Besides the so-called U2 dependent major spliceosome, there is also a U12 dependent minor spliceosome that recognizes less than 0.5% of all human introns16,17. The minor spliceosome performs the same splicing mechanism but uses a set of different but homologous snRNPs and recognises different splicing consensus elements18-20.

Cis-acting Elements The cis-acting elements are crucial consensus sequences for splicing and include the branchpoint, the polypyrimidine tract (PPT), the 3´ splice site and the 5´ splice site12,21 (FIGURE 2). The branchpoint is a variable heptanucleotide at the site of lariat formation12,22 (FIGURE 2). A recent study found that 90% of human branchpoints occur 19–37 nucleotides upstream of the 3´ splice site, which fits within previous estimations of 20–40 nucleotides, and that many branchpoint sequences match the UCCUURAY motif (R: A or G, Y: C or U)22,23. The role of the branchpoint is to interact with the G and U-rich U2 snRNA where the central A nucleotide remains unpaired to allow a trans-esterification reaction with the 5´end of the intron during lariat formation22 (FIGURE 2). Due to the high sequence variation, besides the central A nucleotide, it is hard to predict human branchpoints based only on sequence information. Within an intron, one can find several branchpoints or non-canonical branchpoints that lack the A nucleotide24. In 10% of all human introns, very distant branchpoints are utilized (up to 100 nucleotides or more upstream of the 3´ splice site), and the region between the regulated exon and the upstream distant branchpoint has been termed an AG exclusion zone, where the inclusion of an AG could be used as a cryptic, non- authentic 3´ splice site25.

3

Downstream of the branchpoint is the PPT, which is a stretch of less than six to dozens of nucleotides enriched in U and C, also known as pyrimidines26,27 (FIGURE 2). The 3´ splice site is located approximately 3–40 nucleotides downstream of the PPT where the last three nucleotides before the start of the exon consists of the consensus sequence of YAG (Y: C or T) in 90% of mammalian introns18,28-30. The 5´ splice site occurs at the end of the exon and consists of nine nucleotides where the consensus sequence for the last six positions is GURAGU (R: A or G) (FIGURE 2). The essential di-nucleotide GU of the 5´ splice site make up the first two nucleotides of the downstream intron18,19 (FIGURE 2).

Figure 2. Cis-acting elements and intron removal. Pre-mRNA with cis-acting elements: 5´ splice site (donor site), branchpoint sequence, the polypyrimidine tract and 3´ splice site (acceptor). Intron removal, including lariat formation and joining of acceptor and donor sites (Modified after Cieply et al. 2015).

When an intron is removed, it is folded back onto itself to create a lariat loop, which is then removed (FIGURE 2). Initially, the branchpoint A nucleotide and the 5´ site GU nucleotides of the upstream exon are bound together by a trans- esterification reaction22. The resulting lariat is then still attached to the downstream exon while the spliceosome continues downstream to find the 3´ splice site. The exons are then fused together by a second trans-esterification reaction, as the upstream exon has replaced the AG nucleotides at the 3´ splice site of the downstream exon (FIGURE 2). The GU site is therefore known as the donor splice site while the AG site is known as the acceptor site. The resulting products are two merged exons and a removed intron in the form of a lariat loop15,31.

4

Exon Definition The U1 snRNA of the spliceosome directly base pairs with the 5´ splice site while the U2 snRNA binds to the branchpoint to define an exon and to initiate splicing15,32 (FIGURE 3). Initially, SF1 binds to the mammalian UnA (n: any nucleotide) sequence of the branchpoint along with the U2 auxiliary factor (U2AF), where the U2AF65 unit binds to the PPT while the U2AF35 unit interacts with the 3´ splice site, resulting in U2 snRNA recruitment (FIGURE 3). Other trans-acting factors can also bind to different regulatory elements and affect the binding of U1 and U2 snRNA to the splice sites.

Figure 3. Exon and intron definition by U1 and U2 snRNP binding. Binding of U1 snRNP to 5´splice. U2AF binding to the 3´splice site and recruitment of U2 snRNP.

Cis-acting elements are necessary for splicing but are not in themselves sufficient for splice site selection. As their sequences are far from unique, these elements alone do not supply the spliceosome with sufficient information in order to identify an authentic splice site33. Non-authentic splice sites, known as cryptic splice sites, occur within both exons and introns in almost every pre-mRNA transcript34,35. Even if a splice site is well-defined, there is always a competition with cryptic splice sites, and the efficiency of any site can be altered by mutations that affect nearby regulatory splicing elements36-38. Strong splice site elements are thereby associated with constitutive splicing, while the usage of weaker splice sites is associated with alternative splicing22. The exons of a pre-mRNA can be combined in a variety of ways. The true number of splice variants for a given gene is often not known, and a recent study found novel, less common combinations of exons for a wide range of genes, including for example PTBP139.

5

Trans-acting Factors To ensure proper recognition of authentic splice sites, the spliceosome is guided by trans-acting splicing factors that bind to splicing regulatory elements (SREs). There are exonic splicing enhancers (ESE) and silencers (ESS) but also intronic splicing enhancers (ISE) and silencers (ISS) (FIGURE 4). A certain element can act as either an enhancer or a silencer of splicing and can be localised in both introns and exons. The actual functionality of an SRE is highly dependent on the local RNA sequence and the cellular environment. For example, a certain sequence motif in an intron, in respect to the nearby exon, can function as an enhancer in one tissue but as a silencer in another tissue40. Typically, splicing factors that are known as serine/arginine-rich (SR) proteins bind to enhancer SREs while members of the heterogeneous ribonucleoprotein (hnRNP) family bind to silencer SREs1,5 (FIGURE 4). SR proteins are known to recruit the splicing machinery to splice sites and are involved in both constitutive as well as alternative splicing41. The splicing outcome is not solely determined by the interaction of splicing factors with SREs but is also affected by the RNA secondary structure, short noncoding RNAs and cellular stresses42-44.

Figure 4. SREs: ESE, ESS, ISE and ISS position and action on nearby splice site. Binding of hnRNP splicing factors to ISSs or ESSs usually result in an inhibitory effect on nearby splice sites. Binding of SR splicing factors to ISEs or ESEs usually result in an activating effect on nearby splice sites. The SREs can also be bound by other trans-acting splicing factors.

Secondary Structure All pre-mRNA transcripts form sequence-specific secondary structures, which serves as an important regulatory mechanism of alternative splicing. Distant sequences of a pre-mRNA transcript can interact in order to form hairpin structures with stems and loops. The secondary structure of a pre-mRNA can be predicted using the thermodynamic stability of RNA-RNA bonds, identifying complementary sequence regions as well as chemical and enzymatic probing1 of phosphodiester bonds in RNA45,46. The secondary structures can bring essential splicing sites too close together, blocking spliceosome binding. A single nucleotide change within a pre-mRNA transcript can have drastic effects on the overall secondary structure of the entire transcript, which can directly affect the splicing of the pre-mRNA42,46. For example, when the 3´ end of intron 4 of the

1 Using chemicals and enzymes to break down macromolecules in order to determine their structure.

6

cTNT pre-mRNA is in a single stranded formation in the secondary structure, the U2AF65 splicing factor can bind and recruit U2 snRNP, which results in the inclusion of exon 5. When the 3´ end of intron 4 ends up in a hairpin structure due to a mutation that affects the secondary structure, the splicing factor MBNL1 binds instead, which prevents U2AF65 binding and results in exon 5 exclusion47.

RNA-Binding Proteins More than 1,500 RNA-binding proteins (RBPs) have been identified thus far in the human genome, which includes a subgroup of proteins known as splicing factors due to their splicing regulatory abilities48,49,50. The final outcome of a splicing event is determined by the unique collection of splicing factors and other RBPs in a certain cell type, as their combined interactions decide if a particular splicing site is recognised or not. The effect that a splicing factor exerts on splicing is highly dependent on the local cellular and sequence environment, making it hard to strictly define a splicing factor as an activator or an inhibitor of splicing51.

RBPs contain single or multiple domains for RNA binding. A very common RNA- binding domain is the RNA-recognition motif (RRM), which can be found in a wide variety of splicing factors, including SR proteins, hnRNP and in the U2AF65 component of the core spliceosomal machinery52. Another common RNA-binding domain is the K-homology (KH) domain, which can be found in a wide range of splicing factors, including NOVA1, but also in the branchpoint binding protein SF153,54. Zinc finger domains can bind both RNA and DNA and are relatively small domains. Zinc finger domains rely on zinc or other metal ions for a functional binding pocket52,55,56. U2AF35 and the muscleblind-like (MBNL) splicing factors MBNL1 and MBNL2 all contain zinc finger domains57.

Splicing Factors Splicing factors can either act as activators of splicing, resulting in inclusion of exons, or splicing inhibitors, where exons are skipped. SR protein splicing factors, such as SRSF3 or RBM39, tend to act as splicing activators while hnRNP splicing factors, such as PTBP1, generally act as inhibitors. However, there are always exceptions to this general rule where the binding of the splicing factor in relation to the regulated exon can determine if the splicing factor will act as an activator or an inhibitor of the splicing. Each splicing factor can be associated with one or several so-called splicing motifs, which represent the sequence of RNA that the splicing factor tends to bind to. The sequence of a splicing motif can be quite degenerate but usually consists of four to eight nucleotides58-60.

Some splicing factors can be considered constitutive, such as U2AF65, which plays an essential role in the recruitment of the spliceosome. Other splicing factors have a more regulatory role, such as PTBP1 and SRSF3. Some splicing

7

factors also have a ubiquitous expression pattern while others, such as NOVA1, display a tissue-specific expression pattern. Splicing factors can also be characterised as oncoproteins or tumour suppressors since cancer cells can re- direct alternative splicing in order to survive hostile conditions. These RNA splicing factors control the occurrence of oncogenic splice variants that help the cancer cells to survive7,61.

Nonsense-Mediated Decay A single nucleotide change or a shift in the translation reading frame due to insertions, deletions or alternative splicing can result in the occurrence of a stop codon referred to as a premature termination codons (PTC)62,63. Approximately one third of all alternatively spliced transcripts contain PTCs and are known as nonsense transcripts64,65. In order to prevent the production of non-functional and potentially harmful truncated proteins, a cellular surveillance and quality control system exists entitled nonsense-mediated decay (NMD), which detects and degrades nonsense transcripts66,67. NMD can be found in all eukaryotic cells and is a well-needed system since approximately one third of mutations in inherited genetic diseases result in PTCs65,68,69.

By coupling alternative splicing with NMD, the expression of a gene can be turned on or off by selectively changing the splicing pattern to give rise to either a mature transcript or a nonsense transcript, respectively62,65. The expression of certain SR splicing factors as well as the splicing factor PTBP1 are, for example, regulated in this way70-73. PTBP1 autoregulates its own expression by repressing the inclusion of exon 11, where PTBP1 transcripts that lack exon 11 are degraded by NMD70.

The predominant model for PTC recognition in the mammalian NMD pathway is the exon junction complex (EJC) dependent model. The EJC is a multi-protein complex that binds approximately 20–24 nucleotides upstream of exon-exon junctions during splicing67,74-77. For human mRNAs, it has been reported that approximately 80% of all exon-exon junctions are marked by an EJC78. The EJCs are displaced by the ribosome as it scans the entire mRNA during the first round of translation until it reaches the translation termination codon. An mRNA that lacks EJCs is not susceptible to NMD; however, if a PTC stops the ribosome prematurely, there will still be one or more EJCs downstream of the PTC. This will trigger NMD, and the PTC containing mRNA will be rapidly degraded79,80. The size of the ribosome and the distance between an EJC and the exon-exon junction has led to the mammalian ‘50–55 nt rule’, which states that mRNAs with a PTC localised more than 50–55 nucleotides upstream of the most 3´ exon-exon junction will be degraded by NMD81,82 (FIGURE 5). If the distance is less than 50– 55 nucleotides, the idea is that the downstream EJC would be displaced by the large ribosome when in such close proximity (FIGURE 5). There are, however,

8

exceptions to this rule as well due to the potential for variability in the outcome when the distance is between 50–55 nucleotides83,84.

Figure 5. 50–55 nt rule for NMD. The distance between a PTC (STOP) and the most 3’ exon-exon junction is a strong determinant for the characterisation of a transcript as an NMD or non-NMD target (Modified after Zhang et al., 2009).

The NMD core components are the Up-frameshift 1 (UPF1), UPF2 and UPF3 where UPF1 becomes phosphorylated on specific serine residues following the interaction with UPF2 and UPF385,86. The presence of an EJC more than 50–55 nucleotides downstream of a stop codon will result in the formation of the SURF complex onto the mRNA. The phosphorylation of UPF1 recruits other factors and results in RNA decapping and deadenylation or transcript cleavage near the PTC, followed by transcript degradation by cellular nucleases65,85,86.

Diseases Caused by Aberrant Splicing Altered splicing is known to play a crucial role in the development of many human genetic diseases. Widespread splicing changes can be observed in complex diseases such as cancer and neurodegeneration and more targeted changes in monogenic diseases18,38,87. One can classify human splicing diseases into two categories depending on where the mutation is located. The first category consists of splicing diseases caused by mutations located within SREs, while the second category consists of splicing diseases caused by mutations that affect splicing regulators, which include the components of the spliceosome and splicing factors88,89.

The first category is estimated to represent as much as 50% of human genetic diseases that are caused by point mutations90,91. Mutations can disrupt the exon definition of authentic exons by reducing the strength of the 3´ or 5´ splice site18,88,92. Also, mutations in SREs can either have a gain- or loss-of-function, which includes creation and enhancement of SREs or destruction and weakening of SREs, respectively. Disrupting enhancer SREs can result in a splicing failure,

9

resulting in exon skipping, which is one of the most common causes of splicing disease22,88,91. In general, single nucleotide substitutions that affect 5´ or 3´ splice sites are the most common cause of splicing disease, while mutations that cause intron retention are much less common88. Mutations that affect the RNA secondary structure can also result in aberrant splicing and disease. Also, the loss of a natural splice site is the most common splicing defect in inherited diseases, while the loss of an ESE or the gain of an ESS resulting in exon skipping is more frequent in cancer88.

The second category of splicing diseases also includes a wide variety of human diseases18,88. This category is however smaller and there are quite few diseases that have been described to be caused by mutations in core components of the splicing machinery89. When core components of the splicing machinery are affected one might expect potential systemic consequences. However, the resulting diseases are often caused by quite specific splicing changes. Interestingly, the majority of all splicing events are not noticeably affected and the splicing machinery is just operating at a less efficient rate87. Mutations in RBPs have been associated with for example muscular atrophy but also more complex diseases such as cancer and neurodegeneration38. Within this category one can also include splicing factor sequestration due to the repetitive elements, such as in Myotonic dystrophy which is caused by MBNL1 sequestration due to CUG repeats88,93.

Pseudoexons One subtype of incorrect splicing associated with diseases is the inclusion of intronic sequence within the final mRNA transcript94,95. These inclusions are known as pseudoexons and are flanked by so-called cryptic acceptor and donor splice sites within introns, which for some reason are not properly recognised by the spliceosomal machinery96,97. The cryptic splicing elements somehow lack authenticity and proper splice-enabling RNA secondary structures might not form94,96,98. Mutations that somehow alter the surrounding splicing elements – such as splice sites, branchpoints or SREs – can activate pseudoexons, leading to their inclusion in the final mRNA transcript18,94. A pseudoexon inclusion often results in a shifted reading frame, the inclusion of novel amino acids during translation and a non- or weak-functioning protein which can lead to the development of disease62,95,99. The opposite of pseudoexons is pseudointrons which are intron-like sequence that can be found within exons94.

Splicing Prediction Although a lot is known about consensus motifs for splicing, it can be quite challenging to reliably predict splicing processes based only on mRNA sequence100,101. It is, therefore, a challenging task to characterise new mutations

10

as pathological or not, in terms of their effect on splicing. Despite the knowledge about splicing factor motifs and SREs, it is hard to adapt that knowledge to a specific biological context where the final splicing outcome is dependent on a multitude of factors, such as RNA secondary structure, DNA conformation, transcription activity, non-coding RNAs and the extracellular signalling pathways18,38,46.

Muscle-specific Splicing Muscle and brain tissue are known for their high complexity in the alternative splicing of pre-mRNA transcripts, most likely due to the unique structural requirements of both tissues types3,4,6,102,103. Most of the alternative exons that are regulated in a brain- or muscle-specific manner are associated with conserved adjacent intron sequences102.

Genes coding for muscle-specific products related to the mechanical properties of muscle, such as tension and contractility, display a clear muscle-specific splicing pattern. Examples include the largest human cellular protein titin as well as myosin and troponin T genes3,103,104. Other genes that are closely related to muscle development and maintenance, such as genes coding for myogenic transcriptions factors and metabolic enzymes, also show a muscle-specific alternative splicing103,105. The two isoforms of the myofibrillar protein cTNT can determine the contractile sensitivity of maturing muscles by providing different levels of calcium sensitivity of the myofilament106,107. Exon 5 of the cTNT gene is, for example, included in a muscle-specific manner in embryonic skeletal muscle cultures108. PTBP1 as well as the muscleblind family of proteins have been shown to be involved in muscle-specific splicing where PTBP1 usually acts as an inhibitor of exons that are included in muscle-specific splicing1,102,109.

Muscles Mesodermal stem cells turn into myogenic precursors that then differentiate into multinucleated myotubes due to the expression of myogenic transcription factors, which in the end give rise to myofilaments and myofibres110,111. The long myofibres are grouped into bundles and make up our skeletal muscles. The myoblast precursors differ from the muscle fibres in that they do not express genes that code for the cellular apparatus required for muscle contractions110. MyoD is a required transcription factor that can trigger differentiation in myoblasts, and myogenin is required for the formation of myotubes and muscle fibres. MyoD is, therefore, seen as an early marker of differentiation while myogenin is a late marker112. In embryonic muscle, both MyoD and myogenin are expressed, which is also seen in myotubes in a cell culture environment113.

11

The myofibres, also called muscle fibres, can be divided into two subgroups: slow and fast fibres, depending on which myosin heavy chain isoform the fibres express114,115. Early populations of myogenic precursor cells develop into either slow or fast fibres in response to postnatal cues such as muscle innervation as well as endocrine or nutritional cues, where the final individual ratio of fibres in a specific muscle varies significantly between individuals114,116. The slow fibres are rich in oxygen-binding myoglobin and rely mostly on oxidative metabolism and are specialised for continuous activity. Fast fibers rely mostly on glycolytic metabolism and are suited for rapid activity. MyoD is more expressed in fast muscle while myogenin is more expressed in slow muscle113,117. Fast fibres can also be classified as oxidative or glycolytic, depending on the nature of their major energy source113,118. The growth of muscle, as well as muscle repair, relies on stem cells known as satellite cells that can be activated and proliferated, differentiated and fused with existing myofibres119,120. The myosin heavy chain is also used as a molecular marker for differentiated myotubes where MyoD and myogenin are markers for undifferentiated and committed satellite cells121. The promoter for the myosin light chain is silent in non-muscle and undifferentiated muscle cells but becomes activated in differentiated muscle fibres110,121.

The slow oxidative fibres are called type I and maintain intracellular calcium concentrations at high levels (100–300 nM)114,118,122. Fast glycolytic fibres are called type II and have lower ambient calcium levels (less than 50 nM) but are used for sudden bursts of contractions, which rely on a transient high level of calcium114,118,122. When comparing marathon runners and sprinters to the general population, they have a higher number of slow and fast fibres, respectively118,123. The fibre composition of their muscles represents the type of training that they perform. However, in general, exercise induces a muscle fibre-type transition that results in increased oxidative metabolism of the muscle114,124. The transition is from fast to slow fibres, and if this transition is not maintained by continuous exercise, the transition will be reversed. A change in muscle fibre composition can also be achieved by altered thyroid hormone levels113. A muscle disorder in which the fibres are discriminated in terms of disease state is oculopharyngeal muscular dystrophy, where the severe atrophy is restricted to the fast fibres while the slow fibres are spared125.

It is, however, important to remember that muscles also include fibroblasts in the connective layers, endothelial and smooth muscle cells in the vessel walls as well as nerves and Schwann cells around the axons and blood vessels with blood cells120,121. Also, muscle cells possess the impressive ability to store oxygen due to the oxygen-binding ability of myoglobin, as compared to haemoglobin in the blood126.

12

Myopathies Diseases that affect the muscles are known as myopathies127. The ATP production factories of cells, known as mitochondria, have their own genomes that code for 13 of the approximately 90 proteins that are involved in the ETC128. If a gene in the mitochondrial genome is faulty and causes a myopathy, it is described as a mitochondrial myopathy. Mutations in nuclear DNA that have a negative effect on mitochondrial function can also cause myopathies but are instead referred to as metabolic myopathies128.

Hereditary Myopathy with Lactic Acidosis

Symptoms and Inheritance Patients with hereditary myopathy with lactic acidosis (HML) originate from Västerbotten and Ångermanland in Northern Sweden and were first described in the 1960s129,130. The patients displayed a low tolerance to exercise despite normal muscle contraction strength. Even light physical exertion resulted in symptoms such as dyspnoea, palpitations, muscle cramps and tachycardia along with an increased release of pyruvate and lactate. The patients performed poorly in physical tests and displayed a low oxygen uptake in relation to their cardiac output, which is defined as a hyperkinetic circulation. Extensive exercise or strict diets could cause severe episodes characterised by lactic acidosis, which in the worst case could be fatal. Dark urine, which is a sign of myoglobinurea, i.e. the presence of myoglobin in the urine, is another symptom of a severe episode.

The disease displays an autosomal recessive inheritance pattern where positional cloning identified the iron-sulphur (FeS) cluster assembly gene (ISCU) located on 12 as the disease-causing gene131,132. The heterozygous carriers are unaffected, and the allele frequency in the region was estimated to be 1/188131.

Biochemical disease characteristics of patient muscle tissue include low levels and activity of several FeS cluster proteins, including cytosolic aconitase, mitochondrial aconitase in the tricarboxylic acid (TCA) cycle, as well as Complex I, II (succinate dehydrogenase: SDH) and III of the oxygen-dependent electron transport chain (ETC) in mitochondria131,133-137 (FIGURE 6). Heterozygote muscle displays normal levels and activity of Complex II subunit B as well as cytosolic and mitochondrial aconitase131,137. In heterozygous myotubes, however, there is an observable decrease in Complex II subunit B levels and Complex II activity as well as a slight increase in succinate levels compared to a healthy control138.

In patient fibroblasts, one can also observe a decrease in Complex II activity, and in patient myotubes, the activity reached 50% of the heterozygote levels and 25% of the healthy control levels138. In patient myotubes, there is also a decrease in the

13

activity of both cytosolic and mitochondrial aconitase and a two-fold increase in succinate levels compared to a heterozygous and a healthy control138. Both fibroblasts and myoblasts from patients have also been shown to be more sensitive to oxidative stress compared to healthy control cells137. Another consequence of the disease that could be related to impaired FeS cluster production was the observed iron overload in the mitochondria of patient muscles compared to healthy controls and heterozygote carriers131,135,139.

Figure 6. FeS proteins in the TCA cycle and the ETC. Oxygen-independent production of ATP occurs by glycolysis in the cellular cytoplasm while the more efficient oxygen-dependent ATP production occurs in the mitochondria. The glycolysis end-product, Acetyl CoA, is imported into the mitochondrial matrix and processed by the TCA cycle to produce electron donors. The electron transport of the ETC builds up a proton gradient in the mitochondrial intermembrane space that is used for ATP production by Complex V.

An RNA microarray analysis of HML patient skeletal muscle indicated changes in expression of myofibril structural components, suggesting a shift in muscle fibre composition137. Genes related to fast fibres were downregulated while genes related to slow fibres were upregulated, similar to what can be observed in the muscles of an endurance-trained athlete137. Fibre staining of vastus lateralis

14

muscle biopsy sections showed a 22% increase in slow fibre composition in HML muscle compared to control biopsies137 . Increased levels of vascular endothelial growth factor B (VEGFB) protein was also observed in patient vastus lateralis and gastrocnemius muscle biopsies, and in the gastrocnemius sections, patients showed a 50% increase in capillaries/muscle fibre compared to a biopsy from a healthy heterozygote control137 . Increased expression of the secreted endocrine hormone fibroblast growth factor 21 (FGF21) was also observed in muscle as well as elevated levels of the hormone in serum from both HML and un-related mitochondrial myopathy patients137 . Heterozygote levels clustered with control levels for all the examined criteria in the RNA microarray study, except for perhaps the expression levels of monocarboxylate transporter 2, but those expression levels were also elevated in non-ISCU myopathy samples137.

Overall, patients suffering from HML display a pathological energy production in muscle, while other energy-consuming organs, such as the heart and the central nervous system, are unaffected129,130.

Incorrect Splicing of ISCU Sequence analysis of the ISCU gene from HML patients revealed an intronic G > C mutation, 382 bp downstream of exon 4 (c.418+382G > C). Analysis of the ISCU transcript showed that the mutation resulted in the activation of cryptic splice sites and the inclusion of part of the intron in the final mRNA transcript131,132 (FIGURE 7). The incorrect splicing results in an inclusion of either 86 or 100 bp of intronic sequence due to the recognition of two donor splice sites131,132. In both scenarios, the same acceptor splice site is recognized, located only 6 bp downstream of the HML mutation (FIGURE 7). The mutation increases the strength of the acceptor splice site according to several splicing prediction algorithms, while the strengths of the donor splice sites remains the same (Paper III, Supplementary Figure S2, Supplementary Table 1).

The inserted intronic sequence in the final ISCU mRNA transcript of HML patients results in a frameshift and the introduction of 15 novel amino acids followed by a PTC (FIGURE 7). Translation of incorrectly spliced ISCU mRNA results in a 13 amino acid truncation that disrupts the last α-helix of the ISCU protein131,132,140 (FIGURE 7). It has been suggested that the incorrectly spliced transcript is a target for NMD131, and this is further explored in Paper III of this thesis. Western blot analysis revealed a protein that could represent the truncated ISCU protein in muscle mitochondria, heart tissue as well as myoblasts, fibroblasts and lymphoblasts from HML patients135,136,140. The half-life of the mutant protein is, however, four-fold shorter than the full-length ISCU protein, which indicates that it is degraded quite rapidly136.

15

Figure 7. Incorrect splicing of ISCU pre-mRNA in HML. The HML mutation (MUT) activates cryptic splice sites (acceptor [AG] and donor [GU]) in the ISCU pre-mRNA and results in the inclusion of a pseudoexon in the final mRNA transcript. Protein sequence of wildtype (WT) and truncated (MUT) ISCU protein with a dashed line under the novel amino acids in the end (STOP) of the truncated ISCU protein.

Muscle-specific Incorrect Splicing When comparing skeletal muscle, liver and heart tissue from HML patients, skeletal muscle showed the highest level of incorrectly spliced ISCU mRNA135. Almost 80% of the ISCU mRNA in muscle tissue was incorrectly spliced, while in heart and liver tissues, only 30% and 46% of the ISCU mRNA was incorrectly spliced, respectively135. Skeletal muscle, heart and liver tissues from healthy individuals also contain incorrectly spliced ISCU mRNA but to a very low extent135. The high levels of ISCU protein in healthy control heart tissue is clearly decreased in HML patient heart tissue, and the low levels of ISCU protein in healthy control muscle tissue is just barely detectable in HML patient muscle tissue135. When comparing different patient tissues, ISCU protein levels in kidney were the highest followed by moderate levels in heart and liver, and again, practically undetectable levels in muscle tissue135. Incorrect splicing of ISCU also occurs in patient myoblasts and myotubes as well as fibroblasts to some extent; however, the drastic decrease in ISCU protein seen in myotubes and myoblasts is not seen in fibroblasts136,138 (Paper I, II).

16

Compared to healthy controls, the levels of incorrect splicing in heterozygotes are two-fold higher in myotubes and almost four-fold higher in muscle tissue, but they are still clearly lower than that observed in patients131,137,138. In other words, the levels of incorrect splicing in heterozygotes are closer to patient levels in muscle tissues but closer to healthy control levels in myotubes131,137,138. However, the levels of correctly spliced ISCU in heterozygote muscle tissue and myotubes is relatively close to healthy control levels, and there is no observable decrease of ISCU protein in heterozygote muscle tissue but perhaps a marginal decrease in myotubes131,138.

Overall, HML patients can live fairly normal lives if they make sure to avoid physical exertion as this can trigger severe episodes, including rhabdomyolysis, which is the rapid breakdown of muscle fibres and the release of fibre content into the bloodstream. A case study following an HML patient who had experienced a severe attack of rhabdomyolysis showed that the regenerating muscle displayed normal Complex II activity and low levels of iron accumulation141. It was shown that the transient improvement was due to a decrease in the incorrect splicing of ISCU, which resulted in increased levels of ISCU protein in the muscle. However, nine years after the attack, these transient improved values were back to the typical HML state141.

The ISCU Protein and FeS Cluster Assembly A complete loss of the ISCU gene in mouse results in embryonic lethality, indicating that the ISCU protein is essential for survival135. When examining levels of ISCU protein in mouse tissues, it is present at the highest level in heart, at moderate levels in brain, liver and kidney, and at a lower level in skeletal muscle136. The ISCU protein is a highly conserved protein and functions as a scaffold protein involved in the production of FeS clusters142,143.

FeS clusters exist in almost every living organism and are essential cofactors for a wide variety of proteins144. Theses clusters consist of iron and sulphur combined 145 in different conformational ratios: Fe4S4, Fe2S2 and Fe3S4 . FeS cofactors endow the host protein with electron transferring abilities as well as enzyme-substrate abilities. Complex I, II and III in the ETC include at least 12 known FeS clusters145,146 (TABLE 1) where subunit B of Complex II contains three FeS clusters, 145 one of each combination: Fe4S4, Fe2S2 and Fe3S4 (TABLE 1). Levels of Complex II in mouse tissues is high in heart and kidney, moderate in skeletal muscle and quite low in brain and liver136. For mitochondrial aconitase, heart tissue displays the highest level followed by kidney, muscle and brain tissues, while liver has the lowest levels136.

17

Table 1. FeS cluster proteins in mitochondria and the cytoplasm145,147.

In mitochondria: In cytoplasm: Ferrochelatase Cytoplasmic aconitase (Iron regulatory Mitochondrial aconitase (1xFeS) protein 1) Complex I subunits: IOP1 (CIA factor) NDUFS8 (2xFeS) Glutamine phosphoribosylpyrophosphate NDUFS7 (1xFeS) amidotransferase NDUFS1 (3xFeS) Dihydropyrimidine dehydrogenase NDUFV2 (1xFeS) POLD1 (catalytic subunit of DNA NDUFV1 (1xFeS) polymerase δ) Complex II (SDH) subunit B (3xFeS) Nth-like DNA glycosylase 1 Complex III subunit UQCRFS1/Rieske (1xFeS)

Mitochondria contain a high level of FeS clusters and are the major site for FeS cluster assembly. The ISCU protein acts as a scaffold protein in the formation of FeS clusters together with the cysteine desulfurase NFS1 and ISD11. Frataxin, the affected protein in the human neurodegenerative disease Friedreich ataxia, induces conformational changes, which enables the transfer of inorganic sulphur from NFS1 to the nascent FeS cluster on ISCU145,148. Interestingly, the levels of NFS1 in mouse muscle are significantly lower than the levels in heart, brain, liver and kidney, and the expression of NFS1 is increased in HML patient muscles compared to healthy controls136,137. After FeS clusters have been assembled, they are exported from mitochondria to the cytoplasm, where they are delivered to relevant protein recipients146,149,150.

ISCU and Oxidative Phosphorylation in Cancer During normal oxygen conditions (normoxia), the oxygen-dependent energy production, known as oxidative phosphorylation, is activated while glycolysis, the oxygen-independent energy production, is inhibited. In cancer cells, however, the process of glycolysis is still active, despite the presence of oxygen, a situation known as the Warburg effect. During low oxygen conditions (hypoxia), the hypoxia inducible factor 1α (HIF1 α) is activated, resulting in increases levels of miR-210151,152. MiRs are short RNA sequences that can target mRNAs and accelerate their degradation by blocking their translation. In cancer cells, miR- 210 targets the 3´ UTR of ISCU mRNA, resulting in decreased levels of ISCU protein, which in turn impedes oxidative phosphorylation151. The suppression of ISCU in cancer has been correlated with a poor prognosis151.

Muscle-specific Loss of ISCU Patients suffering from HML show a disabled ability to adequately handle the cellular consequences of normal levels of skeletal muscle exertion. The drastic decrease of ISCU protein negatively affects the oxygen-dependent energy production in skeletal muscle due to the disabled FeS cluster production. To be

18

able to understand why the incorrect splicing of ISCU occurs to such a high extent in muscle tissue, a previous study identified PTBP1, IGF2BP1, RBM39 and MATR3 as proteins that interacted with ISCU RNA with or without the HML mutation153. Overexpression of these splicing factors in rhabdomyosarcoma (RD4) cells expressing an ISCU minigene with the HML mutation showed that PTBP1 inhibited the incorrect splicing while IGF2B1 and RBM39 activated the incorrect splicing153. Also, when PTBP1 overexpression was coupled with the overexpression of either IGF2BP1 or RBM39, the inhibitory effect of PTBP1 was reduced153.

PTBP1 Polypyrimidine tract-binding protein 1 (PTBP1) is a member of the hnRNP family and is a well-known inhibitor of splicing, but it has also been shown to be able to enhance splicing154-158. PTBP1 has four RRMs and can act as a monomer or a homodimer57,159,160. Knockout of PTBP1 leads to embryonic lethality, where the expression of its paralogue PTBP2 is not enough to rescue the knockout161. During neuron differentiation, PTBP1 expression is downregulated while the expression of PTBP2 is upregulated162,163. This neuronal connection was discovered in the PTBP1 splicing regulation of the neuronal N1 exon of the c-Src pre-mRNA. In non-neuronal HeLa cells, the N1 exon inclusion is repressed by PTBP1, while in the neuronal WERI-1 retinoblastoma cells, the N1 exon is included163,164. PTBP1 and PTBP2 display slightly different repression patterns but do share many target sights158. The two paralogues are 74% similar in full-sequence, but that similarity increases to 82% when only the four RRMs that they both have are considered158.

PTBP1 binds, as its name suggests, the PPT of pre-mRNAs. It was, therefore, thought that PTBP1 represses splicing by blocking U2AF65 from binding the PPT9,165,166. However, this is not the only explanation of PTBP1 repression, as most PTBP1 regulated exons have multiple PTBP1 sites, and in some cases, the sites are not even in the PPT167-169. For example, during the splicing of exon 6 in the Fas gene, it has been shown that PTBP1 interferes with the interactions between U1 snRNP and U2AF65, which disrupts the process of exon definition170. PTBP1, just as NOVA1, can either activate or inhibit splicing, depending on where the factor binds in relation to the regulated exon155,171,172. Therefore, PTBP1 knockdown can result in an increase of both exon inclusion and exclusion, depending on the pre-mRNA of interest, however, PTBP1 knockdown most often results in an increase of exon inclusion154,162,173,174. PTBP1 is a well-studied and wide-acting splicing factor as PTBP1 binding clusters have been found on approximately half of all annotated human genes, and approximately 28% of these binding events were associated with alternative splicing155.

19

The expression of PTBP1 is also regulated by alternative splicing, where exon 9 is an alternative exon, while exon 11 exclusion, similarly to exon 10 exclusion in PTBP2, results in NMD70,175. There are three known splice variants of PTBP1 where PTBP1.1 excludes exon 9 (531 amino acids), PTBP1.2 includes a short exon 9 and PTBP1.4 includes all of exon 9 (557 amino acids)26,158. Exon 9 codes for a region of the final protein that is a linker region between the second and third RRM26,158. The binding motif for PTBP1 is CUCUCU, and the longer a PPT is, the stronger PTBP1 binds26,40. An interruption of individual G bases is tolerated, but an interruption of the tract by an A is more deleterious for the binding affinity26,176.

PTBP1 is known to be involved in muscle-specific splicing where it inhibits inclusion of muscle-specific exons175,177. The levels of PTBP1 actually decrease due to post-transcriptional mechanisms during muscle differentiation, which results in the inclusion of muscle-specific exons in a variety of transcripts, including NCAM1, ITGA7, CAPN3, α-actinin as well as α- and β-tropomyosin9,11,178,179. The PTBP1 inhibition of muscle-specific exon 11 in the ABLIM transcript occurs by interference with the well-known muscle-specific splicing factor MBNL1180. There are, however, exceptions to this rule, where PTBP1 inhibits the inclusion of an exon that is used in a non-muscle situation166,181 (Sharma_2005; Llorian_2016). An example is the inclusion of exon 3 in the Tpm1 transcript, which is inhibited by PTBP1 in smooth muscle cells to allow the inclusion of the mutually exclusive exon 2181.

A role of PTBP1 in the Warburg effect of cancer cells has also been found. The Warburg effect describes the metabolic switch from the oxygen-dependent oxidative phosphorylation to aerobic glycolysis. This switch is beneficial for cancer cells and is achieved in part by PTBP1 regulated splicing of pyruvate kinase transcript, resulting in the PKM2 glycolysis promoting isoform7,182,183. The oncogenic transcriptional factor c-MYC is responsible for the upregulation of PTBP1 as well as other splicing factors184.

IGF2BP1 Insulin-like growth factor 2 mRNA-binding protein 1 (IGF2BP1) was shown in our previous research to increase the incorrect splicing of ISCU when overexpressed in RD4 cells expressing an ISCU minigene with the HML mutation153. Also, overexpression of IGF2BP1 leads to an almost complete abolishment of the inhibitory effect of PTBP1 overexpression, indicating that the RNA-binding protein IGF2BP1 is able to prevent most of the inhibitory effect of PTBP1 for the incorrect splicing of ISCU153. IGF2BP1 contains two N-terminal RRMs and four C-terminal KH domains, but it is not known to act as a splicing factor57.

20

IGF2BP1 is known for binding to the mRNA of the essential growth factor insulin- like growth factor 2 (IGF2), which, just like insulin, can induce cellular changes such as a metabolic switch to glycolysis in differentiated myotubes122. IGF2 is a major foetal growth factor, and the IGF2 binding proteins are expressed during the foetal life of humans and mice48. Their expression is particularly high in developing epithelia, placenta and muscle of both human and mouse embryos. In mouse embryos, IGF2BP1 can be detected in both myoblasts and myotubes and is expressed at high levels in developing muscle and in the placenta but is absent in the brain48. The expression of IGF2BP1 is limited to embryogenesis and is, therefore, not detectable in healthy tissue185,186. However, IGF2BP1 expression is re-activated in several types of cancers, and its expression has been correlated to an increased metastasis and poor prognosis, making it quite a potent oncogenic factor185,186. Interestingly, the translation levels of IGF2 mRNA is correlated with the growth of RD4 cells where one can detect high levels of IGF2BP248,185.

RBM39 RNA-binding motif protein 39 (RBM39) is an SR protein that has been shown to increase the incorrect splicing of ISCU when overexpressed in RD4 cells expressing an ISCU minigene with the HML mutation153. RBM39 was also able to inhibit the repression of the incorrect splicing by PTBP1153. RBM39 was initially identified as an antigen in a hepatocellular carcinoma patient and is known to be upregulated in small-cell lung and breast cancers, colorectal adenomas and carcinomas187. There are three human isoforms of RBM39, and besides being a regulator of splicing, RBM39 can also function as a transcriptional co- activator188,189. RBM39 contains one long SR domain and three RRM domains, and its function is dependent on the phosphorylation status of its SR domain190,191. RBM39 is highly homologous to the essential splicing factor U2AF65, but its role in alternative splicing is still not well understood188,191. A recent study focusing on genome-wide RBM39 interactions with RNA showed that RBM39 binds both intronic and exonic sequences with the majority of sites in proximity to 5´ and 3´ splicing sites, where RBM39 can induce both inclusion and skipping of exons, depending on its binding position188. The study also showed that approximately 20% of the alternative splicing events found to be regulated by RBM39 were similarly regulated by U2AF65. RBM39 could thereby either bind to RNA directly to regulate splicing or act as a recruiter of U2AF65. Besides RBM39 lacking the U2AF ligand motif, RBM39 shares a relatively high sequence similarity with U2AF65 where the RRMs share a 37%–56% similarity191. The C-terminal RRM domain of RBM39 belongs to the U2AF homology motif family, and can also be found in both U2AF65 and U2AF35187. Even though U2AF65 and RBM39 share homology, U2AF65 binds PPTs while RBM39 binding is found close to 5´ or 3´ UTRs, transcriptional start and stop sites as well as 5´ and 3´ splice sites188.

21

U2AF65 is also an SR protein but is considered to be a constitutive splicing factor and is known to be ubiquitously expressed, while RBM39 displays a more tissue- specific expression pattern. An example of RBM39 splicing regulation in muscle cells was described in a recent study where knockdown of RBM39 in mouse muscle cells (C2C12) resulted in an increase of the long isoform of the Sin3b transcript192. The role of SIN3B is to decrease by directing histone deacetylases towards specific chromatin locations. This function is disabled in the short SIN3B isoform where the domain that interacts with histone acetylases is absent. Another connection with muscle tissues and RBM39 was shown in the muscle-enriched RNA-binding protein ZFP106193. ZFP106, which is required for the correct myofibre innervation by motor neurons, is expressed in both slow and fast muscles and is upregulated during C2C12 myoblast differentiation193. RBM39 was shown to interact with ZFF106, and they co- localised in nuclear speckles in proximity to spliceosomes193.

MATR3 MATRIN3 (MATR3) has previously been shown to bind to ISCU RNA sequence; however, the overexpression of MATR3 in RD4 cells carrying an ISCU minigene did not change the levels of incorrect splicing153. MATR3 is an RNA-binding factor that is known to regulate a wide range of splicing events, where a proportion of those events are co-regulated with PTBP1173. MATR3 has two DNA-binding C2H2 zinc finger domains and two RNA-binding RRM domains and is a highly conserved nuclear matrix protein173,194. MATR3 as well as NOVA1, SFRS14 and RAVER1 have all been shown to interact with PTBP1, and MATR3 has also been shown to interact with both NOVA1 and NOVA257,173,195. Compared to other RBPs with well-known binding motifs, MATR3 binds to quite an extended region, often within introns in order to inhibit the inclusion of adjacent exons, and seems to prefer pyrimidine rich regions such as UUUCUNUUU as well as PPTs173,196. The binding efficiency of MATR3 also decreases when a pyrimidine-rich region is shortened196. Knockdown of MATR3 in HeLa and SH-SY5Y cells results in a general increase in exon inclusion, indicating that MATR3 in most cases acts as a splicing inhibitor173. The knockdown in HeLa cells also resulted in a general increase in PTC containing mRNA isoforms173. When both MATR3 and PTBP1 (as well as PTBP2 to avoid any compensatory effects) were knocked down, the outcoming events indicated that MATR3 and PTBP1 could act as both repressors and activators. However, for the splicing events where the PTBP1 and MATR3 activities were opposed, the knockdown of the inhibitory factor showed a dominant effect compared to the knockdown of the activator173. Exons that were repressed by MATR3 tended to include pyrimidine motifs also associated with PTBP1 (UUCUU and UCCCC)173. The majority of all PTBP1 regulated splicing events also seem to be regulated by MATR3196. The exact mechanism of MATR3 splicing regulation is, however, still a bit unclear, but MATR3 tends to control the

22

alternative splicing of exons by binding to the adjacent introns in a manner that is semi-independent of PTBP1, but its interaction with PTBP1 might also result in an antagonised PTBP1 activity.

Missense mutations in MATR3 have been shown to be associated with amyotrophic lateral sclerosis and asymmetric myopathy with vocal cord paralysis197,198. Also, both MATR3 and PTBP1 can inhibit the inclusion of exon 78 in the DMD gene, which regulates the cellular localisation of dystrophin protein isoforms, an important determinant for phenotype outcome for Duchenne’s and Becker’s muscular dystrophy patients199.

Other ISCU Mutations HML is a very rare disease, but it can still be counted as the second most prevalent disease with a pathophysiology that includes an impaired FeS cluster formation after the more well-known Friedreich ataxia143,200. Besides the HML mutation, two other ISCU mutations are reported to result in myopathies. Both mutations are missense mutations in exon 3, and both patients display more severe myopathic symptoms compared to HML patients140,147.

A male of Finnish/Swedish descent was identified as a compound heterozygote with the HML mutation on one chromosome and a missense mutation (c.149G > A) in exon 3 on the other140. The G > A was inherited from his healthy Finnish mother and is found in 1/100 Finns. The compound heterozygote patient had a healthy brother carrying one copy of the missense mutation, indicating that it is a recessive mutation. The biochemical and histochemical phenotype of the patient was more or less identical to HML, but the clinical phenotypes were more severe. The patient had an early onset of severe slowly progressive muscle weakness and muscle wasting. In comparison, HML patients have normal or near normal muscle strength at rest. Another differing symptom was cardiomyopathy as well as mild cardiac hypertrophy without dilation. The patient did not, however, suffer from dyspnoea, tachycardia or palpitations. The missense mutation changes a conserved glycine to a glutamate (G50E), and compared to HML patients, the decrease of ISCU protein in patient muscle was less prominent. However, the levels of ISCU protein in fibroblasts from HML patients, the compound heterozygous patient and healthy family members did not differ140. It was also shown that the G50E amino acid change directly affects the ability of the ISCU protein to interact with the sulphur donor NFS1 and the molecular chaperone J-protein HSCB, which negatively affects the rate of FeS cluster assembly201.

The other ISCU missense mutation was a de novo G > T mutation (c.287G > T) in exon 3 found in an Italian male147. It is a dominant mutation that changes a highly

23

conserved glycine to a valine. It has been suggested that the amino acid substitution prevents the correct orientation of a cysteine residue, which is crucial for the binding of Fe by ISCU. The heterozygous patient displayed a more severe phenotype than the compound heterozygote patient with slowly progressive severe muscle weakness and partial recovery between episodes of exercise intolerance as well as ptosis. The patient did display tachycardia but no dyspnoea and was unable to walk during the weak episodes. The biochemical status of patient muscle tissue showed iron overload and that all ETC complexes were affected. Compared to the Swedish/Finnish patient, the Italian patient did not show any symptoms related to heart function. The patient did, however, have high levels of lactic acid in the blood, displayed muscle hypotrophy at 17 years of age and was unable to run. Again, patient fibroblasts displayed normal levels of ISCU protein147.

All three ISCU mutations result in myopathies with exercise intolerance and a varying degrees of muscle weakness or wasting. The muscle-specific symptoms of HML can be explained by the higher levels of incorrect splicing in muscle tissue, but a similar explanation for the other two myopathies cannot be made. Muscle tissue might display a greater sensitivity to ISCU dysfunction due to its unique demand of oxygen and energy in response to exercise.

24

Aims of Thesis

To examine the levels of incorrect splicing of ISCU in various tissues and to identify the splicing factors responsible for the high levels of incorrect splicing of ISCU in the skeletal muscle of HML patients.

Paper I

 Investigate the levels of incorrect splicing of ISCU in various tissues from transgenic mice expressing a human ISCU transgene with the HML mutation.  Examine the regulatory role of SRSF3 in the splicing of ISCU pre-mRNA.

Paper II

 Confirm the role of PTBP1 as an inhibitor of the incorrect splicing of ISCU in an endogenous context by using myoblasts from HML patients.  Determine the importance of PTBP1 for the tissue-specific splicing of ISCU.

Paper III

 Identify a muscle-specific splicing factor that activates the incorrect splicing of ISCU, thereby causing the muscle-specific high levels of incorrect splicing of ISCU in HML patients.  Determine if the incorrectly spliced ISCU transcripts are NMD targets.

25

Materials and Methods

The listed publications contain detailed descriptions of the materials and methods used in each study. A general overview is provided in the following sections.

Human Materials Biopsies were taken from the tibialis anterior of two HML patients (P1, P2) and a healthy control (C). The biopsies were used to establish myoblast primary cultures that were used in all three studies. Written consent was obtained from all participants, and the usage of human material was approved by the Regional Ethics Committee for Medical Research at Umeå University (09-105M).

Mice Mouse tissues used to measure splicing factor expression were obtained from CBA/B6 mice. To measure the levels of incorrect splicing of a human ISCU transgene carrying the ISCU mutation, transgenic mice on a CBA/B6 background were generated and inter-crossed to obtain homozygotes. The injection of the DNA construct into pronuclear stage mouse embryos was performed by the Umeå Transgene Core Facility (UTCF). Mice were kept in standard cages with free access to food and water. To collect mouse tissues, the animals were sacrificed by cervical dislocation. All procedures were approved by the Ethical Committee for Animal Research at Umeå University (A5-12, A74-14).

Cell Culture Primary myoblasts as well as HEK293T LentiX, RD4 and HeLa cells were used. The HEK293T LentiX cells were used for the production of lentivirus. RD4 cells were a generous gift from Tracey A. Rouault MD, NICHD/NIH, Bethesda, MD, USA. RD4 cells were used to study the splicing of ISCU minigenes containing all ISCU exons as well as the last intron with or without the mutation153.

Lentiviral Transduction In order to efficiently transfect the primary myoblasts, they were infected with lentivirus carrying the DNA of interest. Lentivirus was produced by calcium phosphate transfection of HEK293T LentiX cells by lentiviral DNA vectors along with the DNA vector of interest. The lentivirus were collected from the media of the infected HEK293T LentiX cells.

26

Magnetofection Magnetofection transfection was used for transfection of esiRNA into primary myoblasts. The technique involves the charge-based interaction between magnetic nanoparticle beads and siRNA or DNA. The mixture is then added onto cells while placed on a strong magnetic plate. The magnetic force pulls the magnetic nanoparticles down along with the nucleotide load into close proximity of the cells, which then engulf the nearby complexes.

Cycloheximide Treatment Cycloheximide (CHX) is a well-known inhibitor of protein synthesis by blocking the translational elongation. CHX can be used to indirectly inhibit NMD since NMD is a translation-dependent process. Primary myoblasts were treated with CHX for 24 h to observe the effect of an inhibited NMD on the levels of incorrectly spliced ISCU transcripts.

RNA Extraction and cDNA Synthesis RNA was isolated from mouse tissues or human cells. The RNA was used to synthesise cDNA, which was subsequently used in qRT or RTPCRs.

Semi-qRTPCR and qRTPCR RTPCR was performed using TaqMan polymerase, and the resulting product was size-separated by electrophoresis either on an agarose gel or capillaries. The use of FAM-labelled primers allowed the quantification of specific PCR amplicons by fragment analysis, termed semi-quantitative RTPCR. qRTPCR was performed using SYBR green, using triplicate wells for each sample. For the quantification of incorrectly spliced ISCU, the 2(-Ct) value for the PCR amplicon representing incorrectly spliced ISCU was divided by the 2(-Ct) value for the PCR amplicon representing the total amount of ISCU transcript.

Protein Samples and Nuclear Extracts Protein was isolated from both mouse tissues and human cells using a standard protein lysis buffer. Nuclear extracts from mouse tissues and cells involved the specific isolation of nuclei and the final disruption of the nuclei to obtain a pool of nuclear proteins, which includes splicing factors since splicing takes place in the nucleus.

Pull-down Assay To determine if a splicing factor of interest can bind to an RNA sequence of interest, a nuclear extract containing splicing factors was passed through a

27

column with biotin-fixed RNA sequences. RNA-oligos with a 20 bp ISCU RNA sequence, with or without the HML mutation, were used to investigate the mutation specificity of involved splicing factors. The bound factors were eluted from the column and were used in a western blot analysis to observe differences in the levels of protein that bound to either oligo.

Western Blot Analysis Western blot analysis was performed to measure the levels of proteins of interest in mouse tissues, human cells as well as nuclear extracts and elute fractions from pull-down experiments. Horse radish peroxidase conjugated primary antibodies were used during overnight incubations at 4°C.

Statistics Non-paired two-tailed Student’s t-tests were performed for the qRTPCR and fragment analysis data. The significance levels were depicted by * (p-value < 0.05), ** (p-value < 0.01) or *** (p-value < 0.001).

28

Results and Discussion

Paper I The High Level of Aberrant Splicing of ISCU in Slow-Twitch Muscle May Involve the Splicing Factor SRSF3

Muscle-specific Splicing It has been previously shown that the incorrect splicing of ISCU is highest in patient muscle compared to heart and liver135. To further examine the levels of incorrect splicing in different tissues, we used a transgenic mouse model expressing the human ISCU transgene with the HML mutation. We collected three different types of muscles based on fibre composition: quadriceps, gastrocnemius and soleus. The soleus (SOL) muscle consists predominantly of slow fibres, while the gastrocnemius is a mixed-fibre muscle and quadriceps consist of mostly fast fibres. SOL showed the highest level of incorrect splicing compared to the mixed-fibre gastrocnemius and fast-fibre quadriceps. The percentage of incorrect splicing then dropped along a gradient for heart, brain, liver and kidney.

SRSF3 as a Muscle-specific Activator of the Incorrect Splicing It was reported earlier that the serine/arginine-rich splicing factor 3 (SRSF3) is expressed at higher levels in slow-fibre muscles compared to fast-fibre muscles120,202. We could confirm this by a western blot using SOL, gastrocnemius and quadriceps muscle extracts from mice. SRSF3 belongs to the SR protein family, which consists of essential splicing factors. The general structure of these proteins includes one or two RRM domains in the N-terminal and a C-terminal SR domain rich in serine and arginine.

To determine if SRSF3 has a regulatory role in the splicing of ISCU, we overexpressed and knocked down its expression in HML patient myoblasts. The overexpression resulted in an increase in incorrect splicing while the knockdown decreased the incorrect splicing, indicating that SRSF3 can act as an activator of the incorrect splicing of ISCU. A regulatory mechanism that could weaken the effect of the overexpression is the self-regulation that SRSF3 exerts on its own mRNA through the inclusion of exon 4, which introduces an in-frame stop codon resulting in a truncated protein that lacks the SR domain72,174. Overexpression of SRSF3 can, therefore, result in an increase in exon 4 inclusion, which results from the SRSF3 enhancement of an otherwise unused, weak splice acceptor of exon 471,72. However, the ratio of SRSF3 transcript including exon 4 is less than 1% in actively growing cells but can increase to 10% in cells in the G0 cell cycle phase, making the overall effect potentially neglectable71,72.

29

SRSF3 has only one RRM and it binds to the consensus sequence CNNC according to a protein crystal structure where only the 5´ C is specifically recognised57,203. However, in vitro and in vivo studies suggest longer sequence motifs that are CU-rich or follow the patterns [A/U]C[A/U][A/U]C or CTC[T/G]TC[C/T]57,204,205. To investigate if the exerted effect is due to a direct interaction with ISCU RNA, we performed a biotin pulldown with ISCU RNA oligomers with or without the HML mutation. The pulldown showed that SRSF3 interacted equally with both RNA sequences, suggesting that SRSF3 exerts its effect through a direct interaction with the ISCU RNA sequence but a lack of mutation specificity, which suggests a more general activating splicing role for SRSF3. In general, SRSF3 has been shown to bind both exonic and intronic sequences, and it has more than one in silico binding hit for the mutated ISCU sequence (Paper III; Supplementary Figure S2, Supplementary Table 2), further supporting its involvement in the splicing of ISCU204-206.

Our results show that the higher levels of SRSF3 in SOL may be caused by its higher levels of SRSF3 compared to gastrocnemius and quadriceps. The location of the TATA box in the SRSF3 promoter indicates a tissue-specific expression, and SRSF3 has been shown to be expressed at high levels in thymus, testis and spleen tissues but at low levels in liver, lung and kidney tissues, which correlates well with the low levels of incorrectly spliced ISCU seen in liver and kidney71,72,207.

Paper II PTBP1 Acts as a Dominant Repressor of the Aberrant Tissue-Specific Splicing of ISCU in Hereditary Myopathy with Lactic Acidosis

Tissue-specificity of PTBP1 PTBP1 has previously been shown to act as an inhibitor in the incorrect splicing of ISCU. PTBP1 overexpression in RD4 cells resulted in a clear decrease of the incorrect splicing of an ISCU minigene carrying the HML mutation153. PTBP1 shows a tissue-specific expression and is downregulated in the brain, originally demonstrated by the skipping of a neuron-specific N1 exon in the mouse c-Src gene in non-neuronal cells208. The neuron-specific skipping is caused by conserved CUCUCU elements within the PPT and the downstream intron, which are typical PTBP1 binding motifs208. When examining mouse RNA-seq data to identify enhancer or silencer splicing elements, a computational study found that the UCUCUC and CUCUCU motifs can be identified as intronic splicing silencing motifs in muscle and liver but not in brain40. In other words, the absence of PTBP1 in brain results in the loss of the silencing status of the UCUCUC and CUCUCU motifs.

30

We confirmed the absence of PTBP1 in mouse muscle, and we also observed very low levels of PTBP1 in myoblast cells in comparison to RD4 and HeLa cells. PTBP1 RNA is expressed at low levels in muscle tissues, intermediate levels in heart and brain tissues and high levels in liver and kidney tissues. Its paralogue, PTBP2, is as expected, expressed at low levels in all tissues except for brain. The inhibitor role of PTBP1 alongside the low levels of PTBP1 in muscle tissue led us to compare PTBP1 expression levels with the levels of incorrect splicing of ISCU in the transgenic ISCU mice used in Paper I. Indeed, we observed a negative correlation between PTBP1 expression and the levels of incorrect splicing, indicating that the level of PTBP1 in a tissue is a potential determinant of the magnitude of incorrect splicing.

PTBP1 inhibition of the Incorrect Splicing To further confirm the inhibitory role of PTBP1 in an in vivo context, PTBP1 was overexpressed and knocked down in HML patient myoblasts. The knockdown resulted in a clear increase of incorrect splicing while the overexpression resulted in drastically decreased levels of incorrect splicing, almost as low as seen in healthy control cells. Our results clearly indicate that PTBP1 acts as a strong inhibitor of the incorrect splicing. We also show that PTBP1 binds to the ISCU RNA sequence by RNA-pulldown using ISCU sequence with or without the HML mutation. In a previous study, using nuclear extract from RD4 cells, an equal and strong affinity was found for both the wt and mut sequences153. Using nuclear extracts from myoblasts, which contain lower levels of PTBP1 compared to RD4 cells, we observed a slightly stronger affinity of PTBP1 binds for the mutated sequence. This is, however, not surprising, since PTBP1 has a binding preference for pyrimidines, and the mutated sequence contains one more pyrimidine than the wt sequence. The mut binding preference of PTBP1, along with the fact that the PTBP1 levels are equally low in both control and patient myoblasts, indicates that the low level of PTBP1 is not the sole determinant of the high levels of incorrect splicing in patient muscle. Instead, there must also exist a muscle- specific splicing activator, which binds with a higher affinity to the HML mutation and specifically triggers the incorrect splicing in patient muscles.

Tissue Specificity miRs are known to repress translation of mRNAs by binding to specific mRNA targets, often in their 3´ UTR209,210. For example, in muscle tissue, miR-133 targets PTBP2 mRNA and suppresses its translation182,211. The tissue-specific expression of PTBP1 is in part controlled by miR-124, which inhibits PTBP1182,183. miR-124 levels are high in both human brain, skeletal muscle and heart as expected, but it is also high in bone marrow and kidney182,183. The low levels of incorrect splicing in kidney alongside the high levels of miR-124 could be partly explained by the fact that the specific miR profile for kidney differs greatly

31

compared to the other examined tissues212. When observing the RNA pool in various tissues, kidney is also an outlier in correlation to liver, heart and skeletal muscle103. A transcriptomics profile for tissue-specific splicing groups skeletal muscle close to heart and shares a pattern with liver but shows the least similarity to brain4. miR-124 levels in mouse are also higher in brain compared to skeletal muscle but to a higher extent, indicating potentially different inter-tissue ratios of PTBP1 and other splicing factors in humans and mice182.

Paper III MBNL1 and RBM39 can Activate the Incorrect Splicing of ISCU and the Aberrant Transcript is a Target for Nonsense-mediated Decay

The high levels of incorrect splicing of ISCU in patient muscle cannot only depend on the low levels of PTBP1 but must also be caused by a splicing activator that binds the HML mutation site with a higher affinity than the wildtype sequence. Previous research has shown that the KH domain RNA-binding protein IGF2BP1 showed a higher affinity for the HML mutation site. The expression of IGF2BP1 is, however, limited to embryonic stages in healthy tissues but can be found at high levels in rhabdomyosarcomas48,185,186. The high levels of IGF2BP1 in RD4 cells might be the reason for its identification in the ISCU RNA pulldown experiments in previous research153. To investigate the potential of other more relevant KH domain splicing factors we overexpressed NOVA1 and PCBP1 in RD4 cells carrying the ISCU minigene with or without the HML mutation.

K-homology Domain Candidates NOVA1 is a neuron-specific splicing factor that contains three KH domains where each domain binds to RNA regions containing one or multiple YCAY (Y: C or T) sequence binding motifs57,213. The NOVA1 paralogue, NOVA2, also recognises the YCAY motif that often occurs in the proximity of exons regulated by NOVA proteins57. NOVA proteins can bind both introns and exons but are also involved in cytoplasmic regulation of mRNA transcripts by binding to 3´ UTR regions172,214. Just like PTBP1, the NOVA binding motif is quite short, which indicates that it will appear quite frequently in the genome172. Using the two in silico splicing prediction tools SpliceAid and SFmap, NOVA1 appeared as a hit for the mutated ISCU sequence due to the perfect sequence match of the HML mutation and the NOVA1 binding motif YCAY59,215. However, when examining the expression levels of NOVA1 in various mouse tissues, the expression was very low in all tissues except for in brain, and the overexpression of NOVA1 did not have any effect on the incorrect splicing of ISCU. Also, the more recent in silico tool RBPmap did not predict NOVA1 as a hit for the mutated ISCU sequence60. NOVA1 might be a contributor for the incorrect splicing of ISCU in brain tissue, but any effect is most likely neutralised by the much higher levels of PTBP2.

32

PolyC-binding proteins are a group of proteins that each contain three highly conserved KH domains and show a high affinity to C-rich PPTs216,217. They can shuttle between the nucleus and the cytoplasm, and PCBP1 and PCBP2 are the most highly expressed of the PCBP proteins216. It has been shown that these splicing factors enhance the inclusion of exons with a C-rich PPT217. PCBP1 and PCBP2 show a clear overlap in their functionalities, indicating a complementary regulatory relationship217. PCBP1 can also interact with U2AF65, indicating that the binding of PCBP1 to a C-rich PPT might enhance the recruitment of U2 snRNP217. The fact that PCBP1 contains KH domains and shows a strong affinity to C-rich PPTs made it a potential candidate for regulating ISCU splicing. The muscle-specific expression of PCBP1 was, however, quite low in comparison to the other tissues, and an in silico search predicted no binding preference to the mutated sequence (Paper III; Supplementary Table 2, Figure S2). The overexpression of PCBP1 did, however, result in a discrete increase in the incorrect splicing of the mut ISCU minigene. However, taken together the low expression in muscle and this discrete effect suggest that PCBP1 is not a strong or relevant activator of the incorrect splicing of ISCU. Interestingly, both PCBP1 and PCBP2 can act as adaptor proteins that bind iron and deliver it to ferritin for storage218. The depletion of both PCBP1 and PCBP2 did affect the activity of cytosolic aconitase but had no effect on some other FeS cluster proteins218. It is interesting to speculate how the impaired Fe metabolism in patient myoblasts affects the functionality of PCBP proteins.

In summary, the investigated KH domain splicing factors are most likely not involved in the muscle-specific incorrect splicing of ISCU.

Other Splicing Factor Candidates RBM39 and MATR3 have previously been shown to bind ISCU RNA, and RBM39 was also shown to increase the incorrect splicing of an ISCU minigene, even in the presence of high levels of PTBP1153. RBM39, MATR3 as well as the essential splicing factor U2AF65 and the highly muscle-specific splicing factor MBNL1 were included in a screen for splicing activators using HML patient myoblasts.

U2AF The U2 RNA auxiliary factor (U2AF) complex is an essential member of the spliceosome recruitment, and it consists of a 35 (U2AF35) and a 65 kDa subunit (U2AF65). U2AF65 binds to the PPT to promote base pairing between the branchpoint and the U2 snRNA, while U2AF35 interacts with the 3´ splice site to initiate splicing9,70,219,220. U2AF35 binding is, however, not essential for in vitro splicing when the PPT is strong, but it is necessary for the in vitro splicing involving weak or non-canonical PPTs187,221. Furthermore, it has also been shown that the dependency on U2AF for spliceosomal recruitment can be overridden by

33

SR proteins222. Also, one of the modes of action for PTBP1 splicing repression is by competing with U2AF65 for PPT binding9,70,166,222. During this competition, however, PTBP1 has been shown to be the more dominant in PPT binding166. The levels of U2AF65 have also been shown to increase as the levels of PTB decrease9.

U2AF65 contains an SR domain followed by two RRMs and a third non-canonical RRM in its C terminal, which is known as an U2AF homology motif and is responsible for the interaction with U2AF35191,223. U2AF65 prefers a CU-rich PPT, and the strength of the tract determines the overall conformational changes that U2AF65 undergoes in terms of its RRMs during RNA-binding219,224-227. A recent genome-wide analysis has shown that the binding of U2AF65 to an upstream intron interferes with the 3´ splice site of the downstream exon, resulting in either exon skipping or inclusion222. It was also shown that U2AF65 regulates the alternative splicing linked to PPTs for pseudo-splice sites in introns222. As a subunit of the U2AF complex, one would expect the involvement of U2AF65 to result in an exon inclusion, but as shown for the Fas6 transcript, this is not always the case226. Also, in spinal muscular atrophy, U2AF65 stimulates the alternative exon skipping in survival motor neuron pre-mRNA226. Additionally, the expression levels of U2AF65 have been found to be related to diseases, such as myotonic dystrophy, cystic fibrosis as well as cancers228,229.

MBNL1 The three muscleblind-like proteins in mammals, MBNL1, MBNL2 and MBNL3, are all tissue-specific splicing factors and contain four zinc-finger domains each230. MBNL1 and MBNL2 are ubiquitously expressed where MBNL1 is dominant in most tissues while MBNL2 is dominant in brain3,4,230. MBNL3, however, functions specifically during muscle cell differentiation and regeneration230. MBNL1 has 12 exons, out of which six are alternatively spliced, resulting in at least seven MBNL1 mRNA variants231,232. MBNL1 was initially discovered in Drosophila for its important role in the terminal differentiation of muscles and is the most studied of the MBNL proteins, in part due to its predominant expression in muscle tissue230,233. During a significant decrease of MBNL1, the expression of MBNL2 is known to increase, which indicates a compensatory relationship between the two splicing factors234,235. When levels of MBNL proteins decrease, a general shift in splicing occurs, resulting in the expression of isoforms linked to foetal requirements instead of the expected adult isoform230. Such an effect can be observed in the most common type of muscular dystrophy known as myotonic dystrophy236.

MBNL1 has a minimal binding motif of YGCY where the UGCU variant seems to be the most common motif230,237. A more specific sequence that MBNL1 was reported to recognise adjacent to exon 5 in chicken and human cardiac troponin

34

T pre-mRNA was YGCU(U/G)Y47,230. When MBNL1 binds to a motif in an alternative exon or an intronic motif upstream of the exon, it usually results in exon exclusion, while MBNL1 binding downstream of the exon usually results in exon inclusion230,237. This pattern of regulation suggests that MBNL1 repression of exon inclusion is due to blockage of exonic enhancers, 3´ splice sites, PPTs or intronic branch sites230,235. MBNL1 activation would then be caused by MBNL1 competing with splicing silencers, activating intronic splicing enhancers or enhancing the spliceosomal recognition of 5´splice sites230,235. However, several other modes of MBNL1 regulation, depending on motif location in relation to the regulated exon, have been reported230,235.

MBNL1 and RBM39 Activates the Incorrect Splicing of ISCU In a recent study where the expression of U2AF65 was knocked down to approximately 50% in MCF-7 breast cancer cells, 330 altered alternative splicing events were observed where most of the events were exon skipping, indicating that U2AF65 predominantly acts as an activator of splicing188. The interaction between RBM39 and U2AF65 as well as their high levels of shared splicing events with similar outcomes, together with our results showing that RBM39 is an activator of the incorrect splicing of ISCU, suggests that RBM39 might play an important role in the selective recognition of weak PPTs by recruiting U2AF65. However, the knockdown of U2AF65 resulted in a discrete increase in the incorrect splicing, which might indicate that RBM39 is able to bind the activated PPT in ISCU without the interference of U2AF65, resulting in a more potent activation of the pseudoexon inclusion. RBM39 is an SR protein, just like SRSF3, and it has been shown that SR proteins can override the need for U2AF65 to initiate splicing222.

An in silico search for MBNL1 binding sites in the ISCU pre-mRNA sequence of intron 4 using the RBPmap algorithm60 resulted in three new potential binding sites due to the HML mutation. The hits were fewer in the wildtype sequence even though the score for one of the hits was higher than all the mutant sequence hits. In the predecessor prediction site SFmap59, this average decrease of MBNL1 binding score for the mutated sequence was also observed, but the number of individual binding motifs was not displayed in the search output. The general observation for MBNL1 exon regulation is that MBNL1 binding upstream of an exon leads to exon exclusion. However, in the case of ISCU regulation, knockdown of MBNL1 results in a decrease of pseudoexon inclusion even though the predicted MBNL1 binding sites are located upstream of the pseudoexon. Therefore, the potential regulatory mechanism of MBNL1 for ISCU goes against the most common pattern for MBNL1 splicing regulation; however, cases where upstream binding of MBNL1 resulted in exon inclusion have also been reported230,235. CLIP-seq predicted tetra-nucleotide binding sites for MBNL1 were

35

CGCU, UUGC, CUGC, UGCU and GCUU, while an in vitro screen reported the heptamer GCUUGCU235,238. The in vitro screen also found that MBNL1 has a strong preference for unpaired U bases, but it could tolerate both paired and unpaired GC sequences such as UGCUGC, UGCUU, GCUUGC, CGCUU and GCUGCU238. MBNL1 has also been shown to interact with CAG and CCG repeats230.

As with PTBP1, one of the mechanisms suggested for MBNL1 repression of exon inclusion is by competing with U2AF65 binding to the 3´ splice site239. An example of this is the MBNL1 induced skipping of exon 5 in the cardiac troponin T pre-mRNA239. MBNL1 binds to a hairpin structure of the pre-mRNA directly upstream of exon 5, thereby blocking U2AF65 from binding to the PPT, leading to exon 5 skipping. Furthermore, when comparing the distribution of binding motifs for MBNL1 and PTBP1, most motifs for PTBP1 are found in introns while most motifs for MBNL1 are found in 3´ UTR regions240. The second most common location for MBNL1 motifs is a three-way tie between intergenic regions, coding sequences and introns240. In muscle tissue, where PTBP1 is scarce and MBNL1 is abundant, MBNL1 might be present at introns with a higher frequency at any given moment compared to PTBP1, despite the higher general involvement of PTBP1 in intron regulation.

The Incorrect ISCU RNA Transcripts are Targets for NMD The abundance of a transcript that is normally degraded by NMD is expected to increase if the NMD pathway is disabled. Besides knocking down individual UPF components (UPF1, 2 or 3), NMD can also be disabled by treatment with the protein biosynthesis inhibitors puromycin or CHX. The treatment of patient myoblasts with CHX resulted in a clear increase in the incorrectly spliced ISCU transcript, indicating that it is a target for NMD. The distance between a PTC and the most 3´ exon-exon junction is a strong determinant in the classification of a pre-mRNA as an NMD target, an observation known as the ‘50–55 nt rule’. The distance between the PTC and the downstream exon-exon junction for the two incorrectly spliced ISCU mRNA transcripts MUT86 and MUT100 is 39 and 53, respectively, indicating that MUT86 is not a NMD target. The 53 nt distance for the MUT100 transcript creates an ambivalence in the usage of the 50–55 nt rule82. It would, therefore, be interesting to perform a fragment analysis of the CHX treated samples to observe the change in the ratio of the two splice variants.

When CHX treatment was combined with knockdown of RBM39 or MBNL1, the previously observed decrease in incorrect splicing due to esiRNA treatment was abolished. The strong effect, resulting in an increase of incorrect splicing to approximately 80%, was the same for all CHX treated samples, despite splicing factor knockdown of either MBNL1 or RBM39. However, when PTBP1

36

knockdown was combined with CHX treatment, the resulting high levels of incorrect splicing were discretely higher than CHX treatment alone. These results further support PTBP1 as a dominant inhibitor of the incorrect splicing of ISCU. The reason for the abolished effect of RBM39 and MBNL1 knockdown could simply be because of their less potent effect on incorrect splicing of ISCU, compared to PTBP1.

37

Conclusions

Paper I  Slow-fibre muscles display the highest level of incorrect splicing of ISCU compared to fast- or mixed-fibre muscles while heart and brain show intermediate levels and liver and kidney show low levels.  SRSF3 acts as an activator of the incorrect splicing of ISCU and might be responsible for the higher levels of incorrect splicing in slow-fibre muscles.  SRSF3 binds with an equal affinity to ISCU pre-mRNA sequence of intron 4 with or without the HML mutation.

Paper II  The expression of PTBP1 in tissues is negatively correlated with the incorrect splicing of ISCU.  PTBP1 acts as a dominant inhibitor of the incorrect splicing of ISCU.  PTBP1 binds with a higher affinity to ISCU pre-mRNA sequence of intron 4 with the HML mutation.

Paper III  NOVA1 and MATR3 are likely not involved in the splicing of ISCU.  MBNL1, RBM39 and PCBP1 act as activators of the incorrect splicing of ISCU.  Incorrectly spliced ISCU mRNA transcripts are targets for NMD.

The G > C HML mutation activates a cryptic acceptor site due to the increase of pyrimidines in the essential PPT and leads to the incorrect splicing of the ISCU pre-mRNA followed by NMD. The unique repertoire of splicing factors in skeletal muscle tissue can account for the high levels of incorrect splicing, causing the myopathic state. We suggest that the causative composition of splicing factors in muscle includes the scarce presence of the inhibitor PTBP1 alongside the presence of the activators SRSF3 and RBM39 combined with the abundant presence of the activator MBNL1.

38

Overall Discussion

The Exercise Sensitivity of HML Patients Everyday movement that does not require much exertion mostly relies on slow- fibre muscles, while the fast-fibre muscles are not engaged until a higher force is required. Slow fibres mostly rely on oxidative phosphorylation as their main energy source and, therefore, contain more mitochondria and capillaries than fast fibres, which are more reliant on glycolysis. A potential connection might exist between the higher levels of incorrect splicing of ISCU in slow-fibre muscles (Paper I) and the importance of the ISCU protein for proper mitochondrial function. HML patient muscle has been shown to be composed of a higher proportion of slow fibres compared to healthy muscle as well as more mitochondria and capillaries, which describes structural alterations normally induced by regular exercise137. The changes might be a cellular response mechanism, where the increase in slow fibres and mitochondria compensates for the disabled oxidative phosphorylation to obtain a sustainable net output of ATP. The increased abundance of mitochondria might result in an overall increase in ISCU expression, which, despite NMD, might explain the increased levels of incorrect splicing in slow fibres. An increased ISCU expression should also result in increased levels of correctly spliced transcript and an increase in ISCU protein. The levels of ISCU protein are, however, very low in patient muscle, but they might be even lower if this change in muscular composition did not occur. However, the observed changes in expression in patient muscle suggests a decrease in oxygen consumption and TCA cycle activity alongside an increase in mitochondrial fatty acid uptake137. Transcriptional activation of PGC-1α was also observed, which can be activated by low ATP levels, high levels of the electron acceptor NAD+ or high levels of FGF-21, a starvation-induced hormone that is found in high levels in patient muscle and serum137.

During myoblast differentiation, the classical UPF2 dependent NMD system is attenuated to allow a needed increase in the NMD target myogenin, which promotes myogenesis241. The Staufen homolog protein 1 protein expression is upregulated in the differentiating myoblasts, which prevents the interaction between UPF1 and UPF2241. The alternative Staufen-mediated mRNA decay pathway is instead activated241. As myogenin is expressed at high levels in slow fibres, the active Staufen-mediated mRNA decay may not degrade the incorrectly spliced ISCU to the same extent as the UPF2-dependent NMD, causing the higher levels of incorrect splicing. Important to note is that the efficiency of NMD can be affected by hypoxia, and that NMD targets as well as the alternative splicing events that induce NMD are not commonly conserved between humans and mice82.

39

Additionally, to obtain an even clearer picture of what is causing the high levels of incorrect splicing in HML muscle, it might be an advantage to use dissociated myofibres, thereby excluding associated fibroblasts, neurons and blood cells to obtain a more precise tissue characteristic in terms of splicing factor abundance125.

Effect of HML Mutation on Splicing Elements The strength of a PPT is dependent on its proportion of pyrimidines, i.e. C or U. Therefore, the G > C HML mutation where a purine base, i.e. A or G, is substituted by a C naturally strengthens the non-authentic PPT, which activates the cryptic acceptor site within the intron. Several web-based in silico programs can predict this strengthened site as shown in Paper III (Supplementary Table 1). According to the HumanSplicingFinder site, the predicted strength for the 5´splice site of the incorrectly spliced MUT100 variant is slightly higher than the 5´splice site for the MUT86 variant.

There is an interesting difference in the proportion of the two incorrectly spliced mRNA variants in the context of either human or mouse splicing regulation. The MUT86 variant is the minor variant in the incorrect splicing of the human ISCU transgene in examined mouse tissues, except for in liver, while the two incorrect splice variants are present at equal levels in patient tissues and patient myoblasts135 (Paper I) (FIGURE 8). The difference may be due to intrinsic differences in the splicing regulation between mouse and human, where the MUT86 5´splice site appears stronger in human. When comparing the two cryptic 5´splice sites to the percentage occurrence of bases at mammalian 5´splice sites and calculating the average percentage for each variant (not considering the “any base is accepted rule” for position -1 and +5) the percentages are 47% for MUT86 and 46% for MUT100 (FIGURE 8). This theoretical equality in strength correlated well with the approximate 50/50 distribution of the two variants in patient tissues and myoblasts135 (FIGURE 8). It is also important to mention that the in silico predicted strengths of the 5´splice sites that are used in MUT86 and MUT100 do not change when the wt and the mut ISCU sequences are compared (Paper III, Supplementary Table 1). The increased strength of the 3´ splice site is what pushes the overall strength of the combined cryptic splice sites above a specific splicing threshold, resulting in aberrant splicing. The specific threshold for this particular splicing event, appears to be lower in muscle tissues compared to other tissues.

The weakness of prediction algorithms is to deduce whether a splicing event will occur or not. The HumanSplicingFinder algorithm suggested ‘Probably no impact on splicing’ for the mutated ISCU sequence, and the SPiCE program also predicted a ‘low’ effect on splicing. The SPiCE authors did, however, mention that

40

partial splicing effects often result in a ‘low’ SPiCE output, and that they were not able to define a threshold for those cases that would fall between low and high intensity effects101. The incorrect splicing of ISCU would probably be defined as a ‘partial splicing’ as a ‘high intensity effect’ of splicing only occurs in muscle tissues.

A) WT MUT 86 MUT 100 100% 80% 60% 40% 20% 0% C P1 P2

B) -3 -2 -1 +1 +2 +3 +4 +5 +6 MUT86 5’ss: A U G* G U A U U C 33% 14% 80% 100% 100% 59% 11%, 8% 15% 2nd 2nd 1st 1st 3rd 3rd 4th MUT100 5’ss: G G G* G U C U G* C 19% 11% 80% 100% 100% 3% 11% 78% 15% 3rd 3rd 1st 3rd 3rd 1st 4th *G at -1 allows any nucleotide at +5 *G at +5 allows any nucleotide at -1.

Figure 8. Proportions of the two incorrectly spliced ISCU mRNA transcripts. A) Percentage of incorrect splicing in healthy (C) and HML patient myoblasts (P1, P2) as obtained by fragment analysis, showing the proportions of the two incorrectly spliced variants, MUT86 and MUT100. B) The sequence of the cryptic 5´splice sites of the two splice variants, MUT86 and MUT100. Percentage of occurrence of nucleotides in mammalian 5´splice sites including ranking for most common nucleotide, connected to U1 snRNP pairing. Data is based on pairwise-dependencies and is taken from Roca et al. 2008.

41

Splicing Factor Binding Prediction When examining structures of RNA bound by RRMs or KH domains, it has been determined that only a few nucleotides are in contact with the domain57. Therefore, the reliability of predicted splicing motifs can be questioned, which adds to the underlying difficulties in splicing predictions. Additionally, RBPs that are not considered splicing factors can compete for splicing motifs, and RBPs and/or splicing factors can form loops or bridge structures, which can affect the final splicing outcome. The phosphorylation of certain splicing factors can also determine if they will act as an enhancer or repressor of exon inclusion, which also makes the tissue-specific expression of splicing factor kinases an important determinant242.

The total number of splicing factors and RBPs that are involved in regulating the splicing of a single mRNA transcript is challenging, to say the least. Predicting splicing is further complicated by the fact that some splicing factors are ubiquitously expressed, such as SR proteins and PTBP1, while the expression of other splicing factors, such as NOVA1, is restricted to a certain tissue.

Tissue-Specific Regulation of ISCU Splicing Exons with a weak PPT are highly dependent on splice-enhancer elements where the levels of regulatory splicing factors that are involved in the 3´ splice site selection have a determinant role for the splicing outcome243-246.

SRSF3 is a well-known general activator of splicing and had two in silico hits for the mutant sequence (Paper III). SRSF3 was found to be present at higher levels in the slow-fibre soleus muscle compared to gastrocnemius and quadriceps and was suggested to be a part of the reason why the levels of incorrect splicing of ISCU were higher in soleus (Paper I). A greater determinant for the incorrect splicing of ISCU in a wider tissue perspective is the level of the splicing inhibitor PTBP1. The decrease in PTBP1 levels during muscle development could, for example, partly explain the higher level of incorrect splicing in HML patient tissue compared to HML patient myoblasts135,154,211 (Papers II, III). Further evidence that PTBP1 and PTBP2 have a general role in repressing weak exons, including pseudoexons, can be found in the partial inclusion of a pseudoexon of the NF1 gene due to an intronic A to G mutation167,247. The c.31-279 A > G mutation in intron 30 creates a new acceptor splice site and activates a cryptic 5´splice site where the inclusion of the pseudoexon is repressed by PTBP1 and PTBP2. Also, the levels of incorrect splicing of an ISCU minigene in HeLa cells is only 35% (Unpublished data) compared to 55% in muscle cancer RD4 cells (Paper III). The decrease in incorrect splicing in HeLa cells could in part be attributed to the higher levels of PTBP1 (Paper II) as well as the absence of IGF2BP1 (Paper III).

42

The knockdown of U2AF65, to about 20 % of the endogenous levels in myoblasts, resulted in a slight increase in the incorrect splicing of ISCU (Paper III). However, based on the role of U2AF65 as a splicing activator we expected the knockdown of U2AF65 to result in a decrease in incorrect splicing. The unexpected effect might be in part due to the compensatory U65-independent minor spliceosome pathway that exists in parallel with the major spliceosome pathway17,20,248. To further investigate the role of U2AF65 in the incorrect splicing of ISCU, it would be necessary to perform an overexpression of both U2AF65 and U2AF35 in HML patient myoblasts. It is however interesting to note that U2AF65 appears as an RBPmap in silico hit in the sense of new splicing factor binding sites due to the HML mutation (Paper III, Supplementary Figure S2, Table 2).

RBM39 as well as the muscle-specific MBNL1 were found to act as activators of the incorrect splicing of ISCU (Paper III). The knockdown studies were performed in HML patient myoblasts, where the RNA expression of RBM39 turned out to be relatively low, at approximately half the expression of PTBP1 and almost ten-fold lower the highly abundant MBNL1. In mouse muscle tissues, the RNA levels of RBM39 are instead twice that of PTBP1 while MBNL1 levels are again approximately 10 times higher than RBM39. This difference in the ratio of splicing factor expressions between myoblasts and muscle tissue represents a weakness in the use of myoblasts to examine the muscle-specific splicing. However, a screening of multiple splicing factors to identify the most influential splicing regulators is most suitably performed using a cell-based approach. Once the main regulators have been identified, a mouse model with the muscle-specific expression or knockdown of one or multiple regulators could be constructed to analyse the in vivo effect on the incorrect splicing.

Besides RBM39, other RBM splicing factors might be involved in the splicing of ISCU based on their connection to muscle-specific regulation. RBM4 has a high expression in muscle, promotes muscle cell differentiation and is known to compete for PTBP1 for binding sites249,250, while RBM24 is required for M-band formation in skeletal muscles and has shown to be essential for heart development251. A strong candidate is also the SR-related splicing factor RBM20, which is predominantly expressed in striated muscle19,104. RBM20 contains two zinc-finger domains, one SR domain and one RRM domain, and it binds to UCUU binding motifs252. Mutations in RBM20 can cause dilated cardiomyopathy since the RBM20 splicing regulation of titin pre-mRNA directly determines the ventricular filling ability of the heart19,104. RBM20 has been shown to compete with PTBP1 during titin splicing regulation, but it has also been shown to colocalise with U2AF65 in the nucleus104. Also, RBM20 splicing is not reliant on a cardiac-specific co-factor as it can regulate splicing of a reporter mini-gene system both in C2C12 myoblasts and fibroblasts104. It has been shown to regulate mutually exclusive exons as well as both repressing and activating exon

43

inclusion252. Other splicing factors that are known to regulate muscle-specific splicing and could be of interest are RBFOX1 and RBFOX2, which are expressed at high levels in brain, heart and skeletal muscle1,18,19.

As mentioned previously, the levels of incorrect splicing are higher in patient muscle compared to patient myoblasts135 (Paper I). Such a positive correlation between levels of incorrect splicing with differentiation status is in line with the observed increase in incorrect splicing due to MyoD-induced differentiation of patient myoblasts136. Also, the levels of PTBP1 have been shown to decrease during the differentiation of mouse muscle cells (C2C12), while the levels of MBNL1 and MBNL2 have been shown to increase during the differentiation of most tissues, especially in brain, heart and skeletal muscle154,211,230. Other important regulators of ISCU missplicing might also display a differential expression in myoblasts and muscle tissues. Additionally, the MBNL1 paralogue, MBNL3, is known to be expressed during muscle regeneration, and perhaps this is part of the explanation for the decreased levels of incorrect splicing during muscle regeneration in response to rhabdomyolysis in HML141.

In general, however, the auto- and cross-regulation of RBPs and splicing factors create a very complex situation. Additionally, splicing is a highly intricate process, involving a wide range of proteins where the unique repertoire of splicing factors within a certain tissue decides the level of incorrect splicing of a particular gene.

44

Splicing Disease Therapy

Besides alternative splicing, mechanisms such as post-translational modifications and genetic variation also contribute to the existence of protein isoforms. Splicing is, however, a daunting task when considering that 3,000 genes are transcribed simultaneously and each gene can produce at least three transcripts. Since human genes consist of 8–9 exons on average and 7–8 introns, this means that more than 60,000 introns need to be spliced out simultaneously253. Therefore, the occurrence of splicing errors is not surprising, and mutations that affect alternative splicing cause several human diseases, which can be classified according to the type of element that is disrupted by the mutation, either cis or trans5,18,19,254.

Antisense Oligonucleotide Therapy A therapeutic approach for splicing diseases is the usage of antisense oligonucleotides (ASOs). An ASO is designed to target a complementary region of 15–20 nucleotides of a target pre-mRNA in order to sterically block the binding of the spliceosome or other crucial components that would result in spliceosome recruitment18 (FIGURE 9). If an ASO targets a splice site or an enhancer element, this will induce exon skipping, but if it targets a silence element, this will lead to exon inclusion (FIGURE 9). When faced with a splicing disease caused by either pseudoexons or pseudointrons, the oligonucleotide would be designed to target the affected intron sequence and sterically hinder the spliceosome, thereby preventing the aberrant splicing event. The fact that the oligo targets the intronic sequence is a technical advantage since it ensures that the oligo will not remain bound to the final mRNA transcript and, thus, will not result in side effects in the later stages of RNA processing94.

ASOs were firstly investigated 25 years ago in the context of beta-thalassemia splicing when exon skipping was successfully induced255. Induction of exon skipping has also been explored in many other diseases, including Duchenne muscular dystrophy (DMD)256-258. Inhibition of pseudoexon inclusion by ASOs was shown in cell lines from patients with mutations in the neurofibromatosis type 1 (NF1) gene. The NF1 ASOs were designed to target the new 5´ or 3´ splice sites deep within an intron of the NF1 gene259,260. ASOs have also been successfully used to block the skipping of exon 7 of SMN2 pre-mRNA splicing, which is associated with spinal muscular atrophy261. The skipping of exon 7 can occur due to a complex secondary structure of the pre-mRNA, but the use of ASOs can block such a formation and, thereby, the exon skipping. Another example is ASOs targeting intron 12 of BRCA2 to induce pseudoexon inclusion262.

45

Figure 9. ASO or siRNA mechanisms. The ASO is designed to target crucial splicing elements, which prevents incorrect splicing by sterically hindering the binding of the spliceosomal machinery. siRNA is desgined to target incorrectly spliced mRNA transcripts in the cytoplasm results in transcript degradation or translation inhibition (Figure inspiration from Sune-Pou et al., 2017).

Muscular Dystrophy Treatments Muscular dystrophies are characterised by progressive weakness and wasting of skeletal muscles263. The X-linked recessive disease DMD affects approximately 1 in 3,500 to 5,000 males born worldwide, making it the most common form of severe childhood muscular dystrophy264,265. The ASO strategy for DMD is to induce exon skipping to restore the reading frame, which occurs in the milder form of DMD, Becker MD, where truncated but still functional dystrophin protein is produced. Dystrophin is the largest gene in the human genome, comprising 79 exons and codes for the dystrophin protein, which is essential for muscle cell membrane integrity. DMD is caused by mutations in the dystrophin gene where the most frequent deletions are found around exon 47 to exon 50 and result in a disrupted reading frame, which prevents the production of dystrophin protein. The lack of dystrophin protein causes a progressive muscle breakdown and death of DMD patients just in their mid-twenties256.

The skipping of exon 51 restores the reading frame and can be beneficial for approximately 13%–14 % of DMD patients265. Two ASOs that have been shown to induce skipping of exon 51 in DMD are drisapersen and eteplirsen. However, drisapersen failed its clinical trials as it triggered an immune response, which was thought to be caused by the negative charge of the drug256,265. Drisapersen is a 2´ O-methyl-phosphorothioate-based (2OMeP) ASO, which makes it charged at body pH. Eteplirsen, however, is a phosphorodiamidate morpholino (PMO), which has a neutral charge at body pH. Eteplirsen passed its clinical trials and obtained FDA approval in September 2016 making it the first FDA-approved drug for DMD treatment. Eteplirsen is given intravenously, and the treatment does mostly delay disease progression 264,265.

46

Another muscular dystrophy is myotonic dystrophy (MD), which is caused by a CUG-nucleotide repeat expansion that sequesters MBNL193,266. MD is one of the most common types of muscular dystrophy in adults, affecting approximately 1 in 8,000 people worldwide1,236,267. The expanded CUG repeats are localised to the 3´ UTR of the DMPK gene and folds into a very stable and long RNA hairpin structure, which together with MBNL1 form nuclear foci230,268. The decrease of MBNL1 results in incorrect splicing of many pre-mRNAs in muscles and neural cells, which also includes the aberrant inclusion of exon 5 in MBNL1 pre- mRNA237. The MBNL1 isoform that includes exon 5 is primarily found in the nucleus while the isoform that lacks exon 5 can be found both in the cytoplasm and the nucleus232. Overexpression of MBNL1 or treatment with small molecules or antisense reagents that result in MBNL1 release from the CUG repeat hairpin structure can reverse the pathology of MD by restoring the cellular availability of MBNL1231,269. The 25-mer morpholino, CAG25, is an antisense oligomer that can bind the CUG repeats to prevent MBNL1 sequestration269. Local injection of CAG25 into skeletal muscles of myotonic dystrophy mice resulted in an increase in MBNL1 availability and a restored MBNL1 splicing regulation as well as a decreased number of nuclear foci and a reduced myotonia in the mice269.

Therapies for HML Patients

Antisense Oligonucleotides The use of ASOs to decrease the incorrect splicing of ISCU due to the HML mutation has been investigated in HML patient cells. The first study used a 25- mer PMO ASO at a concentration of 5 µmol/L to treat patient fibroblasts for 48 h, which resulted in a complete restoration of the normal splicing compared to the distribution of approximately 50/50 of incorrect and correct spliced transcripts in untreated cells270. Seven days after the treatment, one could observe low levels of incorrect splicing, but the correctly spliced transcript remained the major splice form for up to 21 days post-treatment270. The structure of the PMO morpholino group makes the molecules unsusceptible to degradation by endogenous cellular nucleases, which could explain the observed long-term effect.

A more recent study used both HML patient fibroblasts and myotubes, and a dose-dependent decrease of aberrant splicing as well as an increase in ISCU protein in response to ASO treatment were observed138. They used an 18-mer PMO ASO with the sequence GATTCTGAAATGAAAGAT, which targets the mutation site with a preferred binding to the mut C base rather than the wt G base and overlaps the start of the pseudoexon. The levels of Complex II in ASO-treated patient myotubes increased to control levels, and the Complex II activity for

47

treated fibroblasts (30 nM ASO) and myotubes (200 nM ASO) were restored to healthy levels138.

Early ASO treatment is crucial to optimally hinder the progressive muscle breakdown in DMD, but for treatment of the non-progressive muscle inadequacy of HML, a late intervention would also be quite beneficial. Also, if the incorrect splicing was only decreased to the extent of the levels observed in heart tissues where clear symptoms are absent, this might be enough to reach below the symptomatic threshold, which might relieve much of the muscle-specific symptoms. The riddle that needs to be solved for the development of treatments for splicing diseases is the challenge of delivering these nucleotide molecules into the human body. An approach that seemed to work for ASO treatment of MD mice was local injection into muscular tissues269. Intramuscular injections were tested for the FDA-failed drisapersen for treating MDM and showed promising results, but the continued clinical trials used subcutaneous injections265. As with other treatments, the resulting toxicity of the substance and its potential to trigger an immune response are essential factors to take into consideration. In conclusion, the use of ASOs to treat HML patients is a suitable option where ASO treatment of patient cells have been shown to decrease the disease-causing levels of incorrect splicing of ISCU.

Other Therapies In Europe, the drug Translarna was approved for DMD treatment by the European Medical Agency already in 2014. Translarna is a so-called nonsense suppressor and increases the dystrophin protein production by allowing the translational machinery to bypass PTCs in the dystrophin mRNA264. A PTC bypass represents an interesting approach to explore for HML treatment as it is caused by a nonsense mutation. The truncated ISCU protein does display a decreased stability and functionality; thus, the question remains if increased levels of the truncated ISCU protein will improve the HML symptoms at all. The use of siRNAs to target incorrectly spliced mRNA with a pseudoexon has been shown to induce mRNA degradation271-273 (FIGURE 9). The weakness of this approach is that the efficiency of the treatment is directly dependent on the percentage of pseudoexon inclusion since the treatment is not able to prevent the incorrect splicing. In the case of HML patient muscle, where the levels of incorrect splicing are approximately at 80%, the effect of such an approach might not be the most effective.

Another approach would be to target the levels of MBNL1, perhaps by taking advantage of the MBNL1 binding preference to CUG repeats. Overexpression of MBNL1 has been shown to be a working therapy for DMD, but for HML, the strategy could instead be the overexpression of PTBP1. However, a potentially

48

less invasive and molecularly complicated approach might be to use drugs or chemicals that have splicing altering effects. Drugs that have been shown to inhibit splicing, such as spliceostatin A, amiloride and herboxidiene, have all been actively investigated for their potential use in cancer treatment18. Herboxidiene is thought to interfere with U2 snRNP function while spliceostatin A is thought to inhibit the formation of complex A of the spliceosome assembly18. A potential drug for HML treatment might be amiloride as one of its effects is reduced levels of SRSF3, which has also been a known outcome from digitoxin treatment, a commonly used drug in cardiac disorders274,275.

One could also try to target SR protein kinases that are responsible for RS domain phosphorylation, which is essential for SR protein functions, such as SRSF3 and RBM39. When myoblasts from a DMD patient with a nonsense G > T mutation in exon 31 of the DMPK gene were treated with a chemical compound (TG003), which is known to specifically inhibit SR protein kinases, it resulted in enhanced exon skipping and increased levels of internally deleted but partially functional dystrophin protein276.

Once a treatment has been identified, a way to biochemically monitor progression in HML patients is needed, besides potential improvements in exercise tolerance. To avoid any invasive muscle biopsies, one could instead measure the levels of FGF21 in patient serum, as those levels have been shown to be elevated in HML patients, along with lactate and pyruvate levels137.

49

Acknowledgement

They always say that the acknowledgement section is one of the hardest sections to write in your thesis, but somehow I doubt it.

One of the main reasons I ended up getting a PhD position and embarking on this fun, but challenging journey is thanks to my strong and supportive wife. I had performed me degree project in Monica Holmberg’s group at the end of my Civilingenjörsutbildning and I was interested in doing a PhD. I have always gladly asked people questions, even strangers on the streets, and it’s even easier if I am asking on behalf of someone else. Anything to help out. But when it comes to asking for something on my own behalf, where I dread of getting denied, I encounter quite a lot of hesitance. In such situations, my wife often steps in and gives me the final push, and then I usually get the final encouragement that I need to ask that dreaded question. But I’ll get back to you in the end, Bethie 333.

As she has already been mentioned by name, I must start by thanking my clear spoken, to-the-point, on-time and ready-to-go supervisor Monica Holmberg. Besides all of your genetics and research skills, what I have valued the most is your honesty and clearness. Neither of us like to “tassa som katter kring varm gröt” but instead we like to get to the point and there aren’t unnecessary “joller”. You have been a great supervisor in that you have almost always been available for questions – no matter how small – both concerning our actual project as well as questions of life. We have also managed to discover how small the Umea Medical Faculty was back when you, your husband, your sister and my parents studied up here, since it turns out that you all kind of knew, or knew of each other back then. Now it’s time for me to leave the safe PhD bubble and take the next step, and we will see if it will be “Aftonbladet eller Expressen” or maybe even “en spik i foten”.

Besides Monica, my right hand man Lelle was there to discuss “allt mellan himmel och jord” and I have slowly realized, as I have become a senior PhD student, that it can be quite time-consuming to help junior researchers out – so I am shocked as to how you got anything done at all with me next to you, asking constant questions and just generally, not being quiet. I did my best to try and answer your questions or at least your contemplations concerning parenting, and now as a parent of a 2-year-old I have realized that it’s a “you learn as you go” kind of thing where each individual child will have its own set of rules. It’s kind of like science, you have that one gene/protein/disease that you are studying, and you just have to roll with the punches and adapt to move forward.

50

An early member of the group who is now a busy working medical doctor, was my fellow dog-lover Lisbet. She was my primer design guru and a fellow chatterbox with an IQ of God-knows how high. The fact that you can stay cheerful and positive after breaking bones and having to deal with delays in operations of both limbs and glands is immensely impressive, I wish I had your spunk. Not to be forgotten, I would literally not be here writing this thesis on this subject unless you hadn’t laid the foundation of the HML project.

I never had other PhD students or post-docs in our research group during my PhD but that didn’t mean I was alone. I have had great office buddies in Uma, Ann, Ash, Elin and now towards the end Anton and Ida-Maria. I would also like to thank all the students who have done projects in Monica’s group (three), or Jonathan’s group (∞). Uma, thank you for letting me borrow your Western blot equipment and for all the kids’ stuff that I got from you and Bala when Bethie and I were expecting Rose. Ann and Ash, the funniest two people ever, with that discrete and subtle type of British humor that I adore and that I will never master myself. Elin, what can I say, my buddy buddy who I could hang with in the cell culture room and who could teach me northern Swedish expressions. I can now also officially welcome you and big-dog Bovinder Erik to the parent-club – ask me whatever you can think of concerning being a parent and I will do my best to answer. Another fellow-parent is the all-knowing Anton. Your body might be fighting against itself on a cellular level, but you mind is impressive. It was fun to complain about our constant lack of sleep as fellow toddler parents, but we will just have to keep in mind that they will grow up sooner than we know. At last Ida- Maria, my padawan of science who might be an expert at genetics but who is venturing into the confuzzling world of protein functional studies. You will also one day master to move that gel – it’s only a matter of time.

To all my fellow lunch buddies, this list is too long to write, but if you ever have shared lunch with me in the 3rd floor lunch room for a longer period of time you can insert your name here. To mention as many as my poor goldfish memory can manage, in no particular order: Zahra, Mona, Xingru, Isil, Björn, Johan, Britta, Tommy, Mattias, Oskari, Erik, Elin, Lee-Ann, Priscilla, Rike, Samuel, Erik, Oskari, Hélène, Vicky, Helena, Manu etc. but also from the early ages when I sat at a bigger table: Malin, Mikael and Stefan etc. I love food and I love talking so I do love a good lunchbreak. Mona, I hope you come to my party and I hope you hold a speech because all your speeches always make me cry (in a good way ). Lee-Ann, you have become my sister from another mister, and I look forward to seeing where in the world you and I will be by the start of 2019. Tommy and Britta, you are like a fun Yin and Yang show, may the force be with you both. Priscilla, you’re like the smartest person I know, good luck back home in the Netherlands!

51

A thanks to the fourth-floor researchers that I have gotten to know: the whole ALS crew up here and UCMM, without you the fourth floor would literally and spiritually empty. Agnes, who is now back in town, thanks for all the Ultimate Frisbee and for being the strong and determined person you are. Also, Christine, the smartest and funniest scot I’ve ever met, you are so easy to chat with and you and James are so cute together. A special mention to Ulrika; your laughter just spreads so much joy, at least in me, so don’t ever stop laughing.

I would also like to thank the administrators of the department. Clas, I will always be Dee-Nice in your eyes. Ida, you are a hilarious person and and so easy to talk to. Terry, I see you as my inofficial Irish aunt and without you the department would surely crumble.

I would also like to thank my family. My mom who has always supported me no matter what and who is my rock in life. Anneli, my Umeå-mom who impresses me with her energy and honesty. My dad, who I can chat with for hours, whether it will be about music, movies or qPCRs, our stories never stop and I hope they never will. My big sister, who has always looked out for me and who was my backup-rock growing up. I might live far away but I am also just a phone call away. My daughter Rose, who is the sunshine of basically everything, as well as my daily follower IRL. Rose, it is already clear, at a mere age of 2, that you are smart, determined and confident and I am convinced that you will keep those characteristics as you grow up to become a strong and amazing woman. I’m so proud of you, and I know it might take you a couple years before you can read this on your own, but as usual, you can always ask your mommy or me for help. Jag älskar er!

Lastly but also most importantly, my wife. You know who you are, and you also know who I am. With baggage and bad sides and all. But you stick around, and I am amazed. While the rest of the world gets the most-of-the-time cheerful Denise, you also get stuck with the stubborn and moody Denise and she is a force to be reckoned with. A force that is only able to be matched by an equally strong force, which you possess. We might both be strong and determined but I wouldn’t have it any other way. If you want to spend yet another decade in another part of the world and learn yet another language, I’m up for it. As long as you are there with me. 333

I would aslo like to personally thank the patients and their families for their contribution to this work. I hope that there will be long-term benefits of the conclusions that I have drawn in this thesis, which one day might lead to the development of a treatment for HML or other diseases caused by aberrant splicing.

52

References

1 Pistoni, M., Ghigna, C. & Gabellini, D. Alternative splicing and muscular dystrophy. RNA biology 7, 441-452 (2010). 2 Hesselberth, J. R. Lives that introns lead after splicing. Wiley interdisciplinary reviews. RNA 4, 677-691, doi:10.1002/wrna.1187 (2013). 3 Pan, Q., Shai, O., Lee, L. J., Frey, B. J. & Blencowe, B. J. Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nature genetics 40, 1413-1415, doi:10.1038/ng.259 (2008). 4 Wang, E. T. et al. Alternative isoform regulation in human tissue transcriptomes. Nature 456, 470-476, doi:10.1038/nature07509 (2008). 5 Suñé-Pou, M. et al. Targeting Splicing in the Treatment of Human Disease. Genes 8, 87, doi:10.3390/genes8030087 (2017). 6 de Klerk, E. & t Hoen, P. A. Alternative mRNA transcription, processing, and translation: insights from RNA sequencing. Trends in genetics : TIG 31, 128-139, doi:10.1016/j.tig.2015.01.001 (2015). 7 Calabretta, S. et al. Modulation of PKM alternative splicing by PTBP1 promotes gemcitabine resistance in pancreatic cancer cells. Oncogene 35, 2031-2039, doi:10.1038/onc.2015.270 (2016). 8 Mulligan, G. J., Guo, W., Wormsley, S. & Helfman, D. M. Polypyrimidine tract binding protein interacts with sequences involved in alternative splicing of beta- tropomyosin pre-mRNA. J Biol Chem 267, 25480-25487 (1992). 9 Saulière, J., Sureau, A., Expert-Bezançon, A. & Marie, J. The Polypyrimidine Tract Binding Protein (PTB) Represses Splicing of Exon 6B from the β- Tropomyosin Pre-mRNA by Directly Interfering with the Binding of the U2AF65 Subunit. Mol Cell Biol 26, 8755-8769, doi:10.1128/mcb.00893-06 (2006). 10 Gooding, C., Roberts, G. C. & Smith, C. W. Role of an inhibitory pyrimidine element and polypyrimidine tract binding protein in repression of a regulated alpha-tropomyosin exon. RNA (New York, N.Y.) 4, 85-100 (1998). 11 Spellman, R. et al. Regulation of alternative splicing by PTB and associated factors. Biochemical Society transactions 33, 457-460, doi:10.1042/bst0330457 (2005). 12 Xu, R., Teng, J. & Cooper, T. A. The cardiac troponin T alternative exon contains a novel purine-rich positive splicing element. Mol Cell Biol 13, 3660-3674 (1993). 13 Charlet-B, N., Singh, G., Cooper, T. A. & Logan, P. Dynamic Antagonism between ETR-3 and PTB Regulates Cell Type-Specific Alternative Splicing. Molecular cell 9, 649-658, doi:10.1016/S1097-2765(02)00479-3 (2002). 14 Jurica, M. S. & Moore, M. J. Pre-mRNA splicing: awash in a sea of proteins. Molecular cell 12, 5-14 (2003). 15 Will, C. L. & Luhrmann, R. Spliceosome structure and function. Cold Spring Harbor perspectives in biology 3, doi:10.1101/cshperspect.a003707 (2011). 16 Padgett, R. A. New connections between splicing and human disease. Trends in Genetics 28, 147-154, doi:10.1016/j.tig.2012.01.001 (2012). 17 Buratti, E. The minor spliceosome could be the major key for FUS/TLS mutants in ALS. The EMBO journal 35, 1486-1487, doi:10.15252/embj.201694763 (2016). 18 Singh, R. K. & Cooper, T. A. Pre-mRNA splicing in disease and therapeutics. Trends in molecular medicine 18, 472-482, doi:10.1016/j.molmed.2012.06.006 (2012). 19 Cieply, B. & Carstens, R. P. Functional roles of alternative splicing factors in human disease. Wiley interdisciplinary reviews. RNA 6, 311-326, doi:10.1002/wrna.1276 (2015).

53

20 Verma, B., Akinyi, M. V., Norppa, A. J. & Frilander, M. J. Minor spliceosome and disease. Seminars in Cell & Developmental Biology 79, 103-112, doi:10.1016/j.semcdb.2017.09.036 (2018). 21 Sheth, N. et al. Comprehensive splice-site analysis using comparative genomics. Nucleic Acids Research 34, 3955-3967, doi:10.1093/nar/gkl556 (2006). 22 Mercer, T. R. et al. Genome-wide discovery of human splicing branchpoints. Genome Res 25, 290-303, doi:10.1101/gr.182899.114 (2015). 23 Reed, R. The organization of 3' splice-site sequences in mammalian introns. Genes & development 3, 2113-2123, doi:10.1101/gad.3.12b.2113 (1989). 24 Bitton, D. A. et al. LaSSO, a strategy for genome-wide mapping of intronic lariats and branch points using RNA-seq. Genome Res 24, 1169-1179, doi:10.1101/gr.166819.113 (2014). 25 Gooding, C. et al. A class of human exons with predicted distant branch points revealed by analysis of AG dinucleotide exclusion zones. Genome biology 7, R1, doi:10.1186/gb-2006-7-1-r1 (2006). 26 Keppetipola, N., Sharma, S., Li, Q. & Black, D. L. Neuronal regulation of pre- mRNA splicing by polypyrimidine tract binding proteins, PTBP1 and PTBP2. Critical reviews in biochemistry and molecular biology 47, 360-378, doi:10.3109/10409238.2012.691456 (2012). 27 Coolidge, C. J., Seely, R. J. & Patton, J. G. Functional analysis of the polypyrimidine tract in pre-mRNA splicing. Nucleic Acids Research 25, 888-896 (1997). 28 Reed, R. Initial splice-site recognition and pairing during pre-mRNA splicing. Current opinion in genetics & development 6, 215-220 (1996). 29 Gao, K., Masuda, A., Matsuura, T. & Ohno, K. Human branch point consensus sequence is yUnAy. Nucleic Acids Research 36, 2257-2267, doi:10.1093/nar/gkn073 (2008). 30 Zhang, M. Q. Statistical features of human exons and their flanking regions. Hum Mol Genet 7, 919-932 (1998). 31 Smith, C. W., Porro, E. B., Patton, J. G. & Nadal-Ginard, B. Scanning from an independently specified branch point defines the 3' splice site of mammalian introns. Nature 342, 243-247, doi:10.1038/342243a0 (1989). 32 Andreadis, A., Gallego, M. E. & Nadal-Ginard, B. Generation of protein isoform diversity by alternative splicing: mechanistic and biological implications. Annual review of cell biology 3, 207-242, doi:10.1146/annurev.cb.03.110187.001231 (1987). 33 Ladd, A. N. & Cooper, T. A. Finding signals that regulate alternative splicing in the post-genomic era. Genome biology 3, reviews0008.0001, doi:10.1186/gb- 2002-3-11-reviews0008 (2002). 34 Buratti, E. et al. Aberrant 5′ splice sites in human disease genes: mutation pattern, nucleotide structure and comparison of computational tools that predict their utilization. Nucleic Acids Research 35, 4250-4263, doi:10.1093/nar/gkm402 (2007). 35 Královičová, J., Christensen, M. B. & Vořechovský, I. Biased exon/intron distribution of cryptic and de novo 3′ splice sites. Nucleic Acids Research 33, 4882-4898, doi:10.1093/nar/gki811 (2005). 36 Cáceres, E. F. & Hurst, L. D. The evolution, impact and properties of exonic splice enhancers. Genome biology 14, R143, doi:10.1186/gb-2013-14-12-r143 (2013). 37 Wang, Y., Ma, M., Xiao, X. & Wang, Z. Intronic splicing enhancers, cognate splicing factors and context-dependent regulation rules. Nature Structural &Amp; Molecular Biology 19, 1044, doi:10.1038/nsmb.2377 (2012). 38 Baralle, D. & Buratti, E. RNA splicing in human disease and in the clinic. Clinical science (London, England : 1979) 131, 355-368, doi:10.1042/cs20160211 (2017).

54

39 Vaquero-Garcia, J. et al. A new view of transcriptome complexity and regulation through the lens of local splicing variations. eLife 5, e11752, doi:10.7554/eLife.11752 (2016). 40 Wen, J., Chiba, A. & Cai, X. Computational identification of tissue-specific alternative splicing elements in mouse genes from RNA-Seq. Nucleic Acids Research 38, 7895-7907, doi:10.1093/nar/gkq679 (2010). 41 Black, D. L. Mechanisms of alternative pre-messenger RNA splicing. Annual review of biochemistry 72, 291-336, doi:10.1146/annurev.biochem.72.121801.161720 (2003). 42 Buratti, E. & Baralle, F. E. Influence of RNA secondary structure on the pre- mRNA splicing process. Mol Cell Biol 24, 10505-10514, doi:10.1128/mcb.24.24.10505-10514.2004 (2004). 43 Luco, R. F. & Misteli, T. More than a splicing code: integrating the role of RNA, chromatin and non-coding RNA in alternative splicing regulation. Current opinion in genetics & development 21, 366-372, doi:10.1016/j.gde.2011.03.004 (2011). 44 Dutertre, M., Sanchez, G., Barbier, J., Corcos, L. & Auboeuf, D. The emerging role of pre-messenger RNA splicing in stress responses: sending alternative messages and silent messengers. RNA biology 8, 740-747, doi:10.4161/rna.8.5.16016 (2011). 45 Vinogradova, S. V., Sutormin, R. A., Mironov, A. A. & Soldatov, R. A. Probing- directed identification of novel structured RNAs. RNA biology 13, 232-242, doi:10.1080/15476286.2015.1132140 (2016). 46 Rubtsov, P. M. [Role of pre-mRNA secondary structures in the regulation of alternative splicing]. Molekuliarnaia biologiia 50, 935-943, doi:10.7868/s0026898416060173 (2016). 47 Ho, T. H. et al. Muscleblind proteins regulate alternative splicing. The EMBO journal 23, 3103-3112, doi:10.1038/sj.emboj.7600300 (2004). 48 Nielsen, J. et al. A family of insulin-like growth factor II mRNA-binding proteins repressed translation in late development. Mol Cell Biol 19, 1262-1270 (1999). 49 Gabut, M., Chaudhry, S. & Blencowe, B. J. SnapShot: The splicing regulatory machinery. Cell 133, 192.e191, doi:10.1016/j.cell.2008.03.010 (2008). 50 Wang, Z. L. et al. Comprehensive Genomic Characterization of RNA-Binding Proteins across Human Cancers. Cell reports 22, 286-298, doi:10.1016/j.celrep.2017.12.035 (2018). 51 Fu, X. D. & Ares, M. Context-dependent control of alternative splicing by RNA- binding proteins. Nat Rev Genet 15, doi:10.1038/nrg3778 (2014). 52 Letunic, I., Doerks, T. & Bork, P. SMART 6: recent updates and new developments. Nucleic Acids Research 37, D229-D232, doi:10.1093/nar/gkn808 (2009). 53 Valverde, R., Edwards, L. & Regan, L. Structure and function of KH domains. The FEBS journal 275, 2712-2726, doi:10.1111/j.1742-4658.2008.06411.x (2008). 54 Hollingworth, D. et al. KH domains with impaired nucleic acid binding as a tool for functional analysis. Nucleic Acids Research 40, 6873-6886, doi:10.1093/nar/gks368 (2012). 55 Hall, T. M. Multiple modes of RNA recognition by zinc finger proteins. Current opinion in structural biology 15, 367-373, doi:10.1016/j.sbi.2005.04.004 (2005). 56 Cassandri, M. et al. Zinc-finger proteins in health and disease. Cell Death Discovery 3, 17071, doi:10.1038/cddiscovery.2017.71 (2017). 57 Anko, M. L. & Neugebauer, K. M. RNA-protein interactions in vivo: global gets specific. Trends in biochemical sciences 37, 255-262, doi:10.1016/j.tibs.2012.02.005 (2012).

55

58 Blencowe, B. J. Alternative splicing: new insights from global analyses. Cell 126, 37-47, doi:10.1016/j.cell.2006.06.023 (2006). 59 Paz, I., Akerman, M., Dror, I., Kosti, I. & Mandel-Gutfreund, Y. SFmap: a web server for motif analysis and prediction of splicing factor binding sites. Nucleic Acids Research 38, doi:10.1093/nar/gkq444 (2010). 60 Paz, I., Kosti, I., Ares, M., Cline, M. & Mandel-Gutfreund, Y. RBPmap: a web server for mapping binding sites of RNA-binding proteins. Nucleic Acids Research 42, W361-W367, doi:10.1093/nar/gku406 (2014). 61 Dvinge, H., Kim, E., Abdel-Wahab, O. & Bradley, R. K. RNA splicing factors as oncoproteins and tumour suppressors. Nature reviews. Cancer 16, 413-430, doi:10.1038/nrc.2016.51 (2016). 62 da Costa, P. J., Menezes, J. & Romão, L. The role of alternative splicing coupled to nonsense-mediated mRNA decay in human disease. The International Journal of Biochemistry & Cell Biology 91, 168-175, doi:10.1016/j.biocel.2017.07.013 (2017). 63 Isken, O. & Maquat, L. E. The multiple lives of NMD factors: balancing roles in gene and genome regulation. Nat Rev Genet 9, 699-712, doi:10.1038/nrg2402 (2008). 64 Ochs, M. J. et al. Post-transcriptional regulation of 5-lipoxygenase mRNA expression via alternative splicing and nonsense-mediated mRNA decay. PLoS One 7, e31363, doi:10.1371/journal.pone.0031363 (2012). 65 Nickless, A., Bailis, J. M. & You, Z. Control of gene expression through the nonsense-mediated RNA decay pathway. Cell & bioscience 7, 26-26, doi:10.1186/s13578-017-0153-7 (2017). 66 Isken, O. & Maquat, L. E. Quality control of eukaryotic mRNA: safeguarding cells from abnormal mRNA function. Genes & development 21, 1833-1856, doi:10.1101/gad.1566807 (2007). 67 Raimondeau, E., Bufton, J. C. & Schaffitzel, C. New insights into the interplay between the translation machinery and nonsense-mediated mRNA decay factors. Biochemical Society transactions, doi:10.1042/bst20170427 (2018). 68 Fang, Y., Bateman, J. F., Mercer, J. F. & Lamande, S. R. Nonsense-mediated mRNA decay of collagen -emerging complexity in RNA surveillance mechanisms. J Cell Sci 126, 2551-2560, doi:10.1242/jcs.120220 (2013). 69 Kuzmiak, H. A. & Maquat, L. E. Applying nonsense-mediated mRNA decay research to the clinic: progress and challenges. Trends in molecular medicine 12, 306-316, doi:10.1016/j.molmed.2006.05.005 (2006). 70 Wollerton, M. C., Gooding, C., Wagner, E. J., Garcia-Blanco, M. A. & Smith, C. W. Autoregulation of polypyrimidine tract binding protein by alternative splicing leading to nonsense-mediated decay. Molecular cell 13, 91-100 (2004). 71 Jumaa, H., Guénet, J. L. & Nielsen, P. J. Regulated expression and RNA processing of transcripts from the Srp20 splicing factor gene during the cell cycle. Molecular and Cellular Biology 17, 3116-3124 (1997). 72 Jumaa, H. & Nielsen, P. J. The splicing factor SRp20 modifies splicing of its own mRNA and ASF/SF2 antagonizes this regulation. The EMBO journal 16, 5077- 5085, doi:10.1093/emboj/16.16.5077 (1997). 73 Lareau, L. F., Inada, M., Green, R. E., Wengrod, J. C. & Brenner, S. E. Unproductive splicing of SR genes associated with highly conserved and ultraconserved DNA elements. Nature 446, 926-929, doi:10.1038/nature05676 (2007). 74 Le Hir, H., Izaurralde, E., Maquat, L. E. & Moore, M. J. The spliceosome deposits multiple proteins 20-24 nucleotides upstream of mRNA exon-exon junctions. The EMBO journal 19, 6860-6869, doi:10.1093/emboj/19.24.6860 (2000). 75 Le Hir, H., Gatfield, D., Izaurralde, E. & Moore, M. J. The exon-exon junction complex provides a binding platform for factors involved in mRNA export and

56

nonsense-mediated mRNA decay. The EMBO journal 20, 4987-4997, doi:10.1093/emboj/20.17.4987 (2001). 76 Singh, G., Jakob, S., Kleedehn, M. G. & Lykke-Andersen, J. Communication with the exon-junction complex and activation of nonsense-mediated decay by human Upf proteins occur in the cytoplasm. Molecular cell 27, 780-792, doi:10.1016/j.molcel.2007.06.030 (2007). 77 Singh, G., Rebbapragada, I. & Lykke-Andersen, J. A Competition between Stimulators and Antagonists of Upf Complex Recruitment Governs Human Nonsense-Mediated mRNA Decay. PLOS Biology 6, e111, doi:10.1371/journal.pbio.0060111 (2008). 78 Singh, G. et al. The Cellular EJC Interactome Reveals Higher-Order mRNP Structure and an EJC-SR Protein Nexus. Cell 151, 750-764, doi:10.1016/j.cell.2012.10.007 (2012). 79 Ishigaki, Y., Li, X., Serin, G. & Maquat, L. E. Evidence for a Pioneer Round of mRNA Translation: mRNAs Subject to Nonsense-Mediated Decay in Mammalian Cells Are Bound by CBP80 and CBP20. Cell 106, 607-617, doi:10.1016/S0092- 8674(01)00475-5 (2001). 80 Culbertson, M. R. & Neeno-Eckwall, E. Transcript selection and the recruitment of mRNA decay factors for NMD in Saccharomyces cerevisiae. RNA (New York, N.Y.) 11, 1333-1339, doi:10.1261/rna.2113605 (2005). 81 Nagy, E. & Maquat, L. E. A rule for termination-codon position within intron- containing genes: when nonsense affects RNA abundance. Trends in biochemical sciences 23, 198-199 (1998). 82 Zhang, Z. et al. Noisy splicing, more than expression regulation, explains why some exons are subject to nonsense-mediated mRNA decay. BMC biology 7, 23, doi:10.1186/1741-7007-7-23 (2009). 83 Willing, M. C., Deschenes, S. P., Slayton, R. L. & Roberts, E. J. Premature chain termination is a unifying mechanism for COL1A1 null alleles in osteogenesis imperfecta type I cell strains. Am J Hum Genet 59, 799-809 (1996). 84 Liu, T. & Lin, K. The distribution pattern of genetic variation in the transcript isoforms of the alternatively spliced protein-coding genes in the human genome. Molecular BioSystems 11, 1378-1388, doi:10.1039/C5MB00132C (2015). 85 Kashima, I. et al. Binding of a novel SMG-1-Upf1-eRF1-eRF3 complex (SURF) to the exon junction complex triggers Upf1 phosphorylation and nonsense- mediated mRNA decay. Genes & development 20, 355-367, doi:10.1101/gad.1389006 (2006). 86 Unterholzner, L. & Izaurralde, E. SMG7 acts as a molecular link between mRNA surveillance and mRNA decay. Molecular cell 16, 587-596, doi:10.1016/j.molcel.2004.10.013 (2004). 87 Carey, K. T. & Wickramasinghe, V. O. Regulatory Potential of the RNA Processing Machinery: Implications for Human Disease. Trends in genetics : TIG 34, 279- 290, doi:10.1016/j.tig.2017.12.012 (2018). 88 Daguenet, E., Dujardin, G. & Valcarcel, J. The pathogenicity of splicing defects: mechanistic insights into pre-mRNA processing inform novel therapeutic approaches. EMBO reports 16, 1640-1655, doi:10.15252/embr.201541116 (2015). 89 Cooper, T. A., Wan, L. & Dreyfuss, G. RNA and disease. Cell 136, 777-793, doi:10.1016/j.cell.2009.02.011 (2009). 90 Lopez-Bigas, N., Audit, B., Ouzounis, C., Parra, G. & Guigo, R. Are splicing mutations the most frequent cause of hereditary disease? FEBS Lett 579, 1900- 1903, doi:10.1016/j.febslet.2005.02.047 (2005). 91 Wang, G. S. & Cooper, T. A. Splicing in disease: disruption of the splicing code and the decoding machinery. Nat Rev Genet 8, 749-761, doi:10.1038/nrg2164 (2007).

57

92 Roca, X. et al. Features of 5'-splice-site efficiency derived from disease-causing mutations and comparative genomics. Genome research 18, 77-87, doi:10.1101/gr.6859308 (2008). 93 Chen, G. et al. Phenylbutazone induces expression of MBNL1 and suppresses formation of MBNL1-CUG RNA foci in a mouse model of myotonic dystrophy. Scientific reports 6, 25317, doi:10.1038/srep25317 (2016). 94 Romano, M., Buratti, E. & Baralle, D. Role of pseudoexons and pseudointrons in human cancer. International journal of cell biology 2013, 810572, doi:10.1155/2013/810572 (2013). 95 Dhir, A. & Buratti, E. Alternative splicing: role of pseudoexons in human disease and potential therapeutic strategies. The FEBS journal 277, 841-855, doi:10.1111/j.1742-4658.2009.07520.x (2010). 96 Sun, H. & Chasin, L. A. Multiple splicing defects in an intronic false exon. Mol Cell Biol 20, 6414-6425 (2000). 97 Kapustin, Y. et al. Cryptic splice sites and split genes. Nucleic Acids Research 39, 5837-5844, doi:10.1093/nar/gkr203 (2011). 98 Buratti, E., Dhir, A., Lewandowska, M. A. & Baralle, F. E. RNA structure is a key regulatory element in pathological ATM and CFTR pseudoexon inclusion events. Nucleic Acids Research 35, 4369-4383, doi:10.1093/nar/gkm447 (2007). 99 Pérez, B. et al. Present and future of antisense therapy for splicing modulation in inherited metabolic disease. J Inherit Metab Dis 33, 397-403, doi:10.1007/s10545-010-9135-1 (2010). 100 Zheng, Z.-M. Regulation of alternative RNA splicing by exon definition and exon sequences in viral and mammalian gene expression. Journal of Biomedical Science 11, 278-294, doi:10.1007/BF02254432 (2004). 101 Leman, R. et al. Novel diagnostic tool for prediction of variant spliceogenicity derived from a set of 395 combined in silico/in vitro studies: an international collaborative effort. Nucleic Acids Research 46, 7913-7923, doi:10.1093/nar/gky372 (2018). 102 Sugnet, C. W. et al. Unusual Intron Conservation near Tissue-Regulated Exons Found by Splicing Microarrays. PLOS Computational Biology 2, e4, doi:10.1371/journal.pcbi.0020004 (2006). 103 Lindholm, M. E. et al. The human skeletal muscle transcriptome: sex differences, alternative splicing, and tissue homogeneity assessed with RNA sequencing. FASEB journal : official publication of the Federation of American Societies for Experimental Biology 28, 4571-4581, doi:10.1096/fj.14-255000 (2014). 104 Guo, W. et al. RBM20, a gene for hereditary cardiomyopathy, regulates titin splicing. Nature medicine 18, 766-773, doi:10.1038/nm.2693 (2012). 105 Ladd, A. N., Charlet, N. & Cooper, T. A. The CELF family of RNA binding proteins is implicated in cell-specific and developmentally regulated alternative splicing. Mol Cell Biol 21, 1285-1296, doi:10.1128/mcb.21.4.1285-1296.2001 (2001). 106 Gomes, A. V., Guzman, G., Zhao, J. & Potter, J. D. Cardiac troponin T isoforms affect the Ca2+ sensitivity and inhibition of force development. Insights into the role of troponin T isoforms in the heart. J Biol Chem 277, 35341-35349, doi:10.1074/jbc.M204118200 (2002). 107 Gomes, A. V. et al. Cardiac troponin T isoforms affect the Ca(2+) sensitivity of force development in the presence of slow skeletal troponin I: insights into the role of troponin T isoforms in the fetal heart. J Biol Chem 279, 49579-49587, doi:10.1074/jbc.M407340200 (2004). 108 Cooper, T. A. Muscle-specific splicing of a heterologous exon mediated by a single muscle-specific splicing enhancer from the cardiac troponin T gene. Molecular and cellular biology 18, 4519-4525 (1998).

58

109 Castle, J. C. et al. Expression of 24,426 human alternative splicing events and predicted cis regulation in 48 tissues and cell lines. Nature genetics 40, 1416- 1425, doi:10.1038/ng.264 (2008). 110 Hughes, S. M. Muscle Differentiation: A Gene for Slow Muscle? Current Biology 14, R156-R157, doi:10.1016/j.cub.2004.01.046 (2004). 111 Kitamura, T. et al. A Foxo/Notch pathway controls myogenic differentiation and fiber type specification. The Journal of clinical investigation 117, 2477-2485, doi:10.1172/JCI32054 (2007). 112 Owens, J., Moreira, K. & Bain, G. Characterization of primary human skeletal muscle cells from multiple commercial sources. In vitro cellular & developmental biology. Animal 49, 695-705, doi:10.1007/s11626-013-9655-8 (2013). 113 Hughes, S. M. et al. Selective accumulation of MyoD and myogenin mRNAs in fast and slow adult skeletal muscle is controlled by innervation and hormones. Development (Cambridge, England) 118, 1137 (1993). 114 Bassel-Duby, R. & Olson, E. N. Signaling pathways in skeletal muscle remodeling. Annu. Rev. Biochem. 75, 19-37 (2006). 115 Schiaffino, S. & Reggiani, C. Molecular diversity of myofibrillar proteins: gene regulation and functional significance. Physiol Rev 76, 371-423, doi:10.1152/physrev.1996.76.2.371 (1996). 116 Liew, H. P., Choksi, S. P., Wong, K. N. & Roy, S. Specification of vertebrate slow- twitch muscle fiber fate by the transcriptional regulator Blimp1. Developmental biology 324, 226-235, doi:10.1016/j.ydbio.2008.09.020 (2008). 117 Tai, P. W. et al. Differentiation and fiber type-specific activity of a muscle creatine kinase intronic enhancer. Skeletal muscle 1, 25, doi:10.1186/2044-5040-1-25 (2011). 118 Spletter, M. L. & Schnorrer, F. Transcriptional regulation and alternative splicing cooperate in muscle fiber-type specification in flies and mammals. Experimental cell research 321, 90-98, doi:10.1016/j.yexcr.2013.10.007 (2014). 119 Baldwin, K. M. & Haddad, F. Skeletal muscle plasticity: cellular and molecular responses to altered physical activity paradigms. American journal of physical medicine & rehabilitation 81, S40-51, doi:10.1097/01.Phm.0000029723.36419.0d (2002). 120 Chemello, F. et al. Microgenomic Analysis in Skeletal Muscle: Expression Signatures of Individual Fast and Slow Myofibers. PLoS ONE 6, e16807, doi:10.1371/journal.pone.0016807 (2011). 121 Carosio, S. et al. Generation of eX vivo-vascularized Muscle Engineered Tissue (X-MET). Scientific reports 3, 1420, doi:10.1038/srep01420 (2013). 122 Berchtold, M. W., Brinkmeier, H. & Muntener, M. Calcium ion in skeletal muscle: its crucial role for muscle function, plasticity, and disease. Physiol Rev 80, 1215- 1265, doi:10.1152/physrev.2000.80.3.1215 (2000). 123 Robson, H. E. The Olympic Book of Sports Medicine. British Journal of Sports Medicine 22, 155-155 (1988). 124 Baar, K. et al. Adaptations of skeletal muscle to exercise: rapid increase in the transcriptional coactivator PGC-1. FASEB journal : official publication of the Federation of American Societies for Experimental Biology 16, 1879-1886, doi:10.1096/fj.02-0367com (2002). 125 Trollet, C. et al. Molecular and phenotypic characterization of a mouse model of oculopharyngeal muscular dystrophy reveals severe muscular atrophy restricted to fast glycolytic fibres. Hum Mol Genet 19, 2191-2207, doi:10.1093/hmg/ddq098 (2010). 126 Kamga, C., Krishnamurthy, S. & Shiva, S. Myoglobin and mitochondria: a relationship bound by oxygen and nitric oxide. Nitric oxide : biology and chemistry 26, 251-258, doi:10.1016/j.niox.2012.03.005 (2012).

59

127 Guan, X., Goddard, M. A., Mack, D. L. & Childers, M. K. Gene therapy in monogenic congenital myopathies. Methods 99, 91-98, doi:10.1016/j.ymeth.2015.10.004 (2016). 128 Chinnery, P. F. Mitochondrial disease in adults: what's old and what's new? EMBO molecular medicine 7, 1503-1512, doi:10.15252/emmm.201505079 (2015). 129 Larsson, L. E., Linderholm, H., Müller, R., Ringqvist, T. & Sörnäs, R. Hereditary metabolic myopathy with paroxysmal myoglobinuria due to abnormal glycolysis. J. Neurol. Neurosurg. Psychiat. 27, 361-380 (1964). 130 Linderholm, H., MüllerT, R., Ringqvist, T. & Sörnäs, R. Hereditary abnormal muscle metabolism with hyperkinetic circulation during exercise. Acta Medica Scandinavica 185, 153-166, doi:10.1111/j.0954-6820.1969.tb07314.x (1969). 131 Mochel, F. et al. Splice Mutation in the Iron-Sulfur Cluster Scaffold Protein ISCU Causes Myopathy with Exercise Intolerance. American Journal of Human Genetics 82, 652-660, doi:10.1016/j.ajhg.2007.12.012 (2008). 132 Olsson, A., Lind, L., Thornell, L.-E. & Holmberg, M. Myopathy with lactic acidosis is linked to chromosome 12q23.3-24.11 and cause by an intron mutation in the ISCU gene resulting in a splicing defect. Hum. Mol. Gen. 17, 1666-1672 (2008). 133 Linderholm, H., Essen-Gustavsson, B. & Thornell, L. E. Low succinate dehydrogenase (SDH) activity in a patient with a hereditary myopathy with paroxysmal myoglobinuria. Journal of internal medicine 228, 43-52 (1990). 134 Sanaker, P. S. et al. Differences in RNA processing underlie the tissue specific phenotype of ISCU myopathy. Biochim. et Biophys. Acta. 1802, 539-544 (2010). 135 Nordin, A., Larsson, E., Thornell, L.-E. & Holmberg, M. Tissue-specific splicing of ISCU results in a skeletal muscle phenotype in myopathy with lactic acidosis, while complete loss of ISCU results in early embryonic death in mice. Hum. Genet. 129, 371-378, doi:10.1007/s00439-010-0931-3 (2011). 136 Crooks, D. R. et al. Tissue specificity of a human mitochondrial disease. J. Biol. Chem. 287, 40119-40130 (2012). 137 Crooks, D. R. et al. Elevated FGF21 secretion, PGC-1alpha and ketogenic enzyme expression are hallmarks of iron-sulfur cluster depletion in human skeletal muscle. Hum Mol Genet 23, 24-39, doi:10.1093/hmg/ddt393 (2014). 138 Holmes-Hampton, G. P. et al. Use of antisense oligonucleotides to correct the splicing error in ISCU myopathy patient cell lines. Human Molecular Genetics 25, 5178-5187, doi:10.1093/hmg/ddw338 (2016). 139 Haller, R. G. et al. Deficiency of skeletal muscle succinate dehydrogenase and aconitase. J. Clin. Invest. 88, 1197-1206 (1991). 140 Kollberg, G. et al. Clinical manifestation and a new ISCU mutation in iron- sulphur cluster deficiency myopathy Brain 132, 2170-2179 (2009). 141 Kollberg, G., Melberg, A., Holme, E. & Oldfors, A. Transient restoration of succinate dehydrogenase activity after rhabdomyolysis in iron-sulphur cluster deficiency myopathy. Neuromuscular disorders : NMD 21, 115-120, doi:10.1016/j.nmd.2010.11.010 (2011). 142 Hwang, D. M., Dempsey, A., Tan, K. T. & Liew, C. C. A modular domain of NifU, a nitrogen fixation cluster protein, is highly conserved in evolution. Journal of molecular evolution 43, 536-540 (1996). 143 Rouault, T. A. & Tong, W. H. Iron–sulfur cluster biogenesis and human disease. Trends in Genetics 24, 398-407, doi:10.1016/j.tig.2008.05.008 (2008). 144 Beinert, H. Iron-sulfur proteins: ancient structures, still full of surprises. Journal of biological inorganic chemistry : JBIC : a publication of the Society of Biological Inorganic Chemistry 5, 2-15 (2000). 145 Rouault, T. A. & Maio, N. Biogenesis and functions of mammalian iron-sulfur proteins in the regulation of iron homeostasis and pivotal metabolic pathways. J Biol Chem 292, 12744-12753, doi:10.1074/jbc.R117.789537 (2017).

60

146 Lill, R. et al. Mechanisms of iron-sulfur protein maturation in mitochondria, cytosol and nucleus of eukaryotes. Biochim Biophys Acta 1763, 652-667, doi:10.1016/j.bbamcr.2006.05.011 (2006). 147 Legati, A. et al. A novel de novo dominant mutation in ISCU associated with mitochondrial myopathy. J Med Genet, doi:10.1136/jmedgenet-2017-104822 (2017). 148 Lill, R. et al. The role of mitochondria in cellular iron-sulfur protein biogenesis and iron metabolism. Biochim Biophys Acta 1823, 1491-1508, doi:10.1016/j.bbamcr.2012.05.009 (2012). 149 Muhlenhoff, U. & Lill, R. Biogenesis of iron-sulfur proteins in eukaryotes: a novel task of mitochondria that is inherited from bacteria. Biochim Biophys Acta 1459, 370-382 (2000). 150 Tong, W. H. & Rouault, T. Distinct iron-sulfur cluster assembly complexes exist in the cytosol and mitochondria of human cells. The EMBO journal 19, 5692- 5700, doi:10.1093/emboj/19.21.5692 (2000). 151 Favaro, E. et al. MicroRNA-210 regulates mitochondrial free radical response to hypoxia and krebs cycle in cancer cells by targeting iron sulfur cluster protein ISCU. PLoS One 5, e10345, doi:10.1371/journal.pone.0010345 (2010). 152 Ullmann, P. et al. Hypoxia-responsive miR-210 promotes self-renewal capacity of colon tumor-initiating cells by repressing ISCU and by inducing lactate production. Oncotarget 7, 65454-65470, doi:10.18632/oncotarget.11772 (2016). 153 Nordin, A., Larsson, E. & Holmberg, M. The defective splicing caused by the ISCU intron mutation in patients with myopathy with lactic acidosis is repressed by PTBP1 but can be derepressed by IGF2BP1. Hum Mutat 33, 467-470, doi:10.1002/humu.22002 (2012). 154 Boutz, P. L. et al. A post-transcriptional regulatory switch in polypyrimidine tract-binding proteins reprograms alternative splicing in developing neurons. Genes & development 21, 1636-1652, doi:10.1101/gad.1558107 (2007). 155 Xue, Y. et al. Genome-wide analysis of PTB-RNA interactions reveals a strategy used by the general splicing repressor to modulate exon inclusion or skipping. Molecular cell 36, 996-1006, doi:10.1016/j.molcel.2009.12.003 (2009). 156 Llorian, M. & Smith, C. W. Decoding muscle alternative splicing. Current opinion in genetics & development 21, 380-387, doi:10.1016/j.gde.2011.03.006 (2011). 157 Sharma, S., Maris, C., Allain, F. H. & Black, D. L. U1 snRNA directly interacts with polypyrimidine tract-binding protein during splicing repression. Molecular cell 41, 579-588, doi:10.1016/j.molcel.2011.02.012 (2011). 158 Keppetipola, N. M. et al. Multiple determinants of splicing repression activity in the polypyrimidine tract binding proteins, PTBP1 and PTBP2. RNA (New York, N.Y.) 22, 1172-1180, doi:10.1261/rna.057505.116 (2016). 159 Monie, T. P. et al. The polypyrimidine tract binding protein is a monomer. RNA (New York, N.Y.) 11, 1803-1808, doi:10.1261/rna.2214405 (2005). 160 Perez, I., McAfee, J. G. & Patton, J. G. Multiple RRMs contribute to RNA binding specificity and affinity for polypyrimidine tract binding protein. Biochemistry 36, 11881-11890, doi:10.1021/bi9711745 (1997). 161 Suckale, J. et al. PTBP1 is required for embryonic development before gastrulation. PLoS One 6, e16992, doi:10.1371/journal.pone.0016992 (2011). 162 Romanelli, M. G., Diani, E. & Lievens, P. M.-J. New insights into functional roles of the polypyrimidine tract-binding protein. International journal of molecular sciences 14, 22906-22932, doi:10.3390/ijms141122906 (2013). 163 Sharma, S., Kohlstaedt, L. A., Damianov, A., Rio, D. C. & Black, D. L. Polypyrimidine tract binding protein controls the transition from exon definition to an intron defined spliceosome. Nature structural & molecular biology 15, 183- 191, doi:10.1038/nsmb.1375 (2008).

61

164 Chan, R. C. & Black, D. L. The polypyrimidine tract binding protein binds upstream of neural cell-specific c-src exon N1 to repress the splicing of the intron downstream. Mol Cell Biol 17, 4667-4676 (1997). 165 Zheng, X. et al. Polypyrimidine tract binding protein inhibits IgM pre-mRNA splicing by diverting U2 snRNA base-pairing away from the branch point. RNA (New York, N.Y.) 20, 440-446, doi:10.1261/rna.043737.113 (2014). 166 Sharma, S., Falick, A. M. & Black, D. L. Polypyrimidine tract binding protein blocks the 5' splice site-dependent assembly of U2AF and the prespliceosomal E complex. Molecular cell 19, 485-496, doi:10.1016/j.molcel.2005.07.014 (2005). 167 Wagner, E. J. & Garcia-Blanco, M. A. Polypyrimidine tract binding protein antagonizes exon definition. Mol Cell Biol 21, 3281-3288, doi:10.1128/mcb.21.10.3281-3288.2001 (2001). 168 Lamichhane, R. et al. RNA looping by PTB: Evidence using FRET and NMR spectroscopy for a role in splicing repression. Proceedings of the National Academy of Sciences 107, 4105-4110, doi:10.1073/pnas.0907072107 (2010). 169 Amir-Ahmady, B., Boutz, P. L., Markovtsov, V., Phillips, M. L. & Black, D. L. Exon repression by polypyrimidine tract binding protein. RNA (New York, N.Y.) 11, 699-716, doi:10.1261/rna.2250405 (2005). 170 Izquierdo, J. M. et al. Regulation of Fas alternative splicing by antagonistic effects of TIA-1 and PTB on exon definition. Molecular cell 19, 475-484, doi:10.1016/j.molcel.2005.06.015 (2005). 171 Llorian, M. et al. Position-dependent alternative splicing activity revealed by global profiling of alternative splicing events regulated by PTB. Nature structural & molecular biology 17, 1114-1123, doi:10.1038/nsmb.1881 (2010). 172 Ule, J. et al. An RNA map predicting Nova-dependent splicing regulation. Nature 444, 580-586, doi:10.1038/nature05304 (2006). 173 Coelho, M. B. et al. Nuclear matrix protein Matrin3 regulates alternative splicing and forms overlapping regulatory networks with PTB. The EMBO journal 34, 653 (2015). 174 Guo, J., Jia, J. & Jia, R. PTBP1 and PTBP2 impaired autoregulation of SRSF3 in cancer cells. Scientific reports 5, 14548, doi:10.1038/srep14548 (2015). 175 Spellman, R., Llorian, M. & Smith, C. W. J. Crossregulation and Functional Redundancy between the Splicing Regulator PTB and Its Paralogs nPTB and ROD1. Molecular cell 27, 420-434, doi:10.1016/j.molcel.2007.06.016 (2007). 176 Han, A. et al. De novo prediction of PTBP1 binding and splicing targets reveals unexpected features of its RNA recognition and function. PLoS Comput Biol 10, e1003442, doi:10.1371/journal.pcbi.1003442 (2014). 177 Gooding, C., Kemp, P. & Smith, C. W. A novel polypyrimidine tract-binding protein paralog expressed in smooth muscle cells. J Biol Chem 278, 15201-15207, doi:10.1074/jbc.M210131200 (2003). 178 Bland, C. S. et al. Global regulation of alternative splicing during myogenic differentiation. Nucleic Acids Research 38, 7651-7664, doi:10.1093/nar/gkq614 (2010). 179 Lustig, Y. et al. RNA-binding protein PTB and microRNA-221 coregulate AdipoR1 translation and adiponectin signaling. Diabetes 63, 433-445, doi:10.2337/db13-1032 (2014). 180 Ohsawa, N., Koebis, M., Mitsuhashi, H., Nishino, I. & Ishiura, S. ABLIM1 splicing is abnormal in skeletal muscle of patients with DM1 and regulated by MBNL, CELF and PTBP1. Genes to cells : devoted to molecular & cellular mechanisms 20, 121-134, doi:10.1111/gtc.12201 (2015). 181 Llorian, M. et al. The alternative splicing program of differentiated smooth muscle cells involves concerted non-productive splicing of post-transcriptional regulators. Nucleic Acids Research 44, 8933-8950, doi:10.1093/nar/gkw560 (2016).

62

182 Taniguchi, K. et al. Organ-specific PTB1-associated microRNAs determine expression of pyruvate kinase isoforms. Scientific reports 5, 8647, doi:10.1038/srep08647 (2015). 183 Taniguchi, K. et al. MicroRNA-124 inhibits cancer cell growth through PTB1/PKM1/PKM2 feedback cascade in colorectal cancer. Cancer letters 363, 17-27, doi:10.1016/j.canlet.2015.03.026 (2015). 184 David, C. J., Chen, M., Assanah, M., Canoll, P. & Manley, J. L. HnRNP proteins controlled by c-Myc deregulate pyruvate kinase mRNA splicing in cancer. Nature 463, 364-368, doi:10.1038/nature08697 (2010). 185 Stohr, N. & Huttelmaier, S. IGF2BP1: a post-transcriptional "driver" of tumor cell migration. Cell adhesion & migration 6, 312-318, doi:10.4161/cam.20628 (2012). 186 Manieri, N. A., Drylewicz, M. R., Miyoshi, H. & Stappenbeck, T. S. Igf2bp1 is required for full induction of Ptgs2 mRNA in colonic mesenchymal stem cells in mice. Gastroenterology 143, 110-121.e110, doi:10.1053/j.gastro.2012.03.037 (2012). 187 Stepanyuk, G. A. et al. UHM-ULM interactions in the RBM39-U2AF65 splicing- factor complex. Acta crystallographica. Section D, Structural biology 72, 497- 511, doi:10.1107/s2059798316001248 (2016). 188 Mai, S. et al. Global regulation of alternative RNA splicing by the SR-rich protein RBM39. Biochim Biophys Acta 1859, 1014-1024, doi:10.1016/j.bbagrm.2016.06.007 (2016). 189 Jung, D. J., Na, S. Y., Na, D. S. & Lee, J. W. Molecular cloning and characterization of CAPER, a novel coactivator of activating protein-1 and estrogen receptors. J Biol Chem 277, 1229-1234, doi:10.1074/jbc.M110417200 (2002). 190 Long, J. C. & Caceres, J. F. The SR protein family of splicing factors: master regulators of gene expression. The Biochemical journal 417, 15-27, doi:10.1042/bj20081501 (2009). 191 Kielkopf, C. L., Lücke, S. & Green, M. R. U2AF homology motifs: protein recognition in the RRM world. Genes & development 18, 1513-1526, doi:10.1101/gad.1206204 (2004). 192 Faherty, N. et al. Negative autoregulation of BMP dependent transcription by SIN3B splicing reveals a role for RBM39. Scientific reports 6, 28210, doi:10.1038/srep28210 (2016). 193 Anderson, D. M. et al. Severe muscle wasting and denervation in mice lacking the RNA-binding protein ZFP106. Proceedings of the National Academy of Sciences of the United States of America 113, E4494-4503, doi:10.1073/pnas.1608423113 (2016). 194 Hibino, Y. et al. Molecular properties and intracellular localization of rat liver nuclear scaffold protein P130. Biochim Biophys Acta 1759, 195-207, doi:10.1016/j.bbaexp.2006.04.010 (2006). 195 Polydorides, A. D., Okano, H. J., Yang, Y. Y., Stefani, G. & Darnell, R. B. A brain- enriched polypyrimidine tract-binding protein antagonizes the ability of Nova to regulate neuron-specific alternative splicing. Proceedings of the National Academy of Sciences of the United States of America 97, 6350-6355, doi:10.1073/pnas.110128397 (2000). 196 Uemura, Y. et al. Matrin3 binds directly to intronic pyrimidine-rich sequences and controls alternative splicing. Genes to cells : devoted to molecular & cellular mechanisms 22, 785-798, doi:10.1111/gtc.12512 (2017). 197 Senderek, J. et al. Autosomal-dominant distal myopathy associated with a recurrent missense mutation in the gene encoding the nuclear matrix protein, matrin 3. American journal of human genetics 84, 511-518, doi:10.1016/j.ajhg.2009.03.006 (2009).

63

198 Johnson, J. O. et al. Mutations in the Matrin 3 gene cause familial amyotrophic lateral sclerosis. Nature neuroscience 17, 664-666, doi:10.1038/nn.3688 (2014). 199 González, E. et al. Alternative splicing regulates the nuclear or cytoplasmic localization of dystrophin Dp71. FEBS Letters 482, 209-214, doi:10.1016/S0014- 5793(00)02044-5 (2000). 200 Delatycki, M. B. et al. Direct evidence that mitochondrial iron accumulation occurs in Friedreich ataxia. Annals of neurology 45, 673-675 (1999). 201 Saha, P. P. et al. The presence of multiple cellular defects associated with a novel G50E iron-sulfur cluster scaffold protein (ISCU) mutation leads to development of mitochondrial myopathy. J Biol Chem 289, 10359-10377, doi:10.1074/jbc.M113.526665 (2014). 202 Nierobisz, L. S. et al. Differential expression of genes characterizing myofibre phenotype. Animal Genetics 43, 298-308, doi:10.1111/j.1365-2052.2011.02249.x (2012). 203 Hargous, Y. et al. Molecular basis of RNA recognition and TAP binding by the SR proteins SRp20 and 9G8. The EMBO journal 25, 5126-5137, doi:10.1038/sj.emboj.7601385 (2006). 204 Cavaloc, Y., Bourgeois, C. F., Kister, L. & Stevenin, J. The splicing factors 9G8 and SRp20 transactivate splicing through different and specific enhancers. RNA (New York, N.Y.) 5, 468-483 (1999). 205 Schaal, T. D. & Maniatis, T. Selection and characterization of pre-mRNA splicing enhancers: identification of novel SR protein-specific enhancer sequences. Mol Cell Biol 19, 1705-1719 (1999). 206 Brugiolo, M., Botti, V., Liu, N., Muller-McNicoll, M. & Neugebauer, K. M. Fractionation iCLIP detects persistent SR protein binding to conserved, retained introns in chromatin, nucleoplasm and cytoplasm. Nucleic Acids Research 45, 10452-10465, doi:10.1093/nar/gkx671 (2017). 207 Ayane, M., Preuss, U., Köhler, G. & Nielsen, P. J. A differentially expressed murine RNA encoding a protein with similarities to two types of nucleic acid binding motifs. Nucleic Acids Research 19, 1273-1278 (1991). 208 Chan, R. C. & Black, D. L. Conserved intron elements repress splicing of a neuron- specific c-src exon in vitro. Molecular and cellular biology 15, 6377-6385 (1995). 209 Olsen, P. H. & Ambros, V. The lin-4 regulatory RNA controls developmental timing in Caenorhabditis elegans by blocking LIN-14 protein synthesis after the initiation of translation. Developmental biology 216, 671-680, doi:10.1006/dbio.1999.9523 (1999). 210 Bartel, D. P. MicroRNAs: target recognition and regulatory functions. Cell 136, 215-233, doi:10.1016/j.cell.2009.01.002 (2009). 211 Boutz, P. L., Chawla, G., Stoilov, P. & Black, D. L. MicroRNAs regulate the expression of the alternative splicing factor nPTB during muscle development. Genes & development 21, 71-84, doi:10.1101/gad.1500707 (2007). 212 Akkina, S. & Becker, B. N. MicroRNAs in kidney function and disease. Translational research : the journal of laboratory and clinical medicine 157, 236-240, doi:10.1016/j.trsl.2011.01.011 (2011). 213 Jensen, K. B., Musunuru, K., Lewis, H. A., Burley, S. K. & Darnell, R. B. The tetranucleotide UCAY directs the specific recognition of RNA by the Nova K- homology 3 domain. Proceedings of the National Academy of Sciences of the United States of America 97, 5740-5745, doi:10.1073/pnas.090553997 (2000). 214 Ule, J. et al. CLIP identifies Nova-regulated RNA networks in the brain. Science 302, 1212-1215, doi:10.1126/science.1090095 (2003). 215 Piva, F., Giulietti, M., Burini, A. B. & Principato, G. SpliceAid 2: a database of human splicing factors expression data and RNA target motifs. Hum Mutat 33, doi:10.1002/humu.21609 (2012).

64

216 Chkheidze, A. N. & Liebhaber, S. A. A novel set of nuclear localization signals determine distributions of the alphaCP RNA-binding proteins. Mol Cell Biol 23, 8405-8415 (2003). 217 Ji, X. et al. alphaCP binding to a cytosine-rich subset of polypyrimidine tracts drives a novel pathway of cassette exon splicing in the mammalian transcriptome. Nucleic Acids Research 44, 2283-2297, doi:10.1093/nar/gkw088 (2016). 218 Frey, A. G. et al. Iron chaperones PCBP1 and PCBP2 mediate the metallation of the dinuclear iron enzyme deoxyhypusine hydroxylase. Proceedings of the National Academy of Sciences of the United States of America 111, 8031-8036, doi:10.1073/pnas.1402732111 (2014). 219 Hoffman, B. E. & Grabowski, P. J. U1 snRNP targets an essential splicing factor, U2AF65, to the 3' splice site by a network of interactions spanning the exon. Genes & development 6, 2554-2568 (1992). 220 Wu, S., Romfo, C. M., Nilsen, T. W. & Green, M. R. Functional recognition of the 3' splice site AG by the splicing factor U2AF35. Nature 402, 832-835, doi:10.1038/45590 (1999). 221 Guth, S., Martinez, C., Gaur, R. K. & Valcarcel, J. Evidence for substrate-specific requirement of the splicing factor U2AF(35) and for its function after polypyrimidine tract recognition by U2AF(65). Mol Cell Biol 19, 8263-8271 (1999). 222 Shao, C. et al. Mechanisms for U2AF to define 3' splice sites and regulate alternative splicing in the human genome. Nature structural & molecular biology 21, 997-1005, doi:10.1038/nsmb.2906 (2014). 223 Mollet, I., Barbosa-Morais, N. L., Andrade, J. & Carmo-Fonseca, M. Diversity of human U2AF splicing factors. The FEBS journal 273, 4807-4816, doi:doi:10.1111/j.1742-4658.2006.05502.x (2006). 224 Wu, J. Y. & Maniatis, T. Specific interactions between proteins implicated in splice site selection and regulated alternative splicing. Cell 75, 1061-1070 (1993). 225 Sickmier, E. A. et al. Structural basis for polypyrimidine tract recognition by the essential pre-mRNA splicing factor U2AF65. Molecular cell 23, 49-59, doi:10.1016/j.molcel.2006.05.025 (2006). 226 Cho, S. et al. Splicing inhibition of U2AF(65) leads to alternative exon skipping. Proceedings of the National Academy of Sciences of the United States of America 112, 9926-9931, doi:10.1073/pnas.1500639112 (2015). 227 Mackereth, C. D. et al. Multi-domain conformational selection underlies pre- mRNA splicing regulation by U2AF. Nature 475, 408-411, doi:10.1038/nature10171 (2011). 228 Lim, K. H., Ferraris, L., Filloux, M. E., Raphael, B. J. & Fairbrother, W. G. Using positional distribution to identify splicing elements and predict pre-mRNA processing defects in human genes. Proceedings of the National Academy of Sciences of the United States of America 108, 11093-11098, doi:10.1073/pnas.1101135108 (2011). 229 Yoshida, K. et al. Frequent pathway mutations of splicing machinery in myelodysplasia. Nature 478, 64-69, doi:10.1038/nature10496 (2011). 230 Konieczny, P., Stepniak-Konieczna, E. & Sobczak, K. MBNL proteins and their target RNAs, interaction and splicing regulation. Nucleic Acids Research 42, 10873-10887, doi:10.1093/nar/gku767 (2014). 231 Kanadia, R. N. et al. Reversal of RNA missplicing and myotonia after muscleblind overexpression in a mouse poly(CUG) model for myotonic dystrophy. Proceedings of the National Academy of Sciences of the United States of America 103, 11748-11753, doi:10.1073/pnas.0604970103 (2006). 232 Tran, H. et al. Analysis of exonic regions involved in nuclear localization, splicing activity, and dimerization of Muscleblind-like-1 isoforms. J Biol Chem 286, 16435-16446, doi:10.1074/jbc.M110.194928 (2011).

65

233 Artero, R. et al. The muscleblind gene participates in the organization of Z-bands and epidermal attachments of Drosophila muscles and is regulated by Dmef2. Developmental biology 195, 131-143, doi:10.1006/dbio.1997.8833 (1998). 234 Lee, K.-Y. et al. Compound loss of muscleblind-like function in myotonic dystrophy. EMBO molecular medicine 5, 1887-1900, doi:10.1002/emmm.201303275 (2013). 235 Wang, E. T. et al. Transcriptome-wide regulation of pre-mRNA splicing and mRNA localization by muscleblind proteins. Cell 150, 710-724, doi:10.1016/j.cell.2012.06.041 (2012). 236 Udd, B. & Krahe, R. The myotonic dystrophies: molecular, clinical, and therapeutic challenges. The Lancet. Neurology 11, 891-905, doi:10.1016/s1474- 4422(12)70204-1 (2012). 237 Gates, D. P., Coonrod, L. A. & Berglund, J. A. Autoregulated splicing of muscleblind-like 1 (MBNL1) Pre-mRNA. J Biol Chem 286, 34224-34233, doi:10.1074/jbc.M111.236547 (2011). 238 Lambert, N. et al. RNA Bind-n-Seq: quantitative assessment of the sequence and structural binding specificity of RNA binding proteins. Molecular cell 54, 887- 900, doi:10.1016/j.molcel.2014.04.016 (2014). 239 Warf, M. B., Diegel, J. V., von Hippel, P. H. & Berglund, J. A. The protein factors MBNL1 and U2AF65 bind alternative RNA structures to regulate splicing. Proceedings of the National Academy of Sciences of the United States of America 106, 9203-9208, doi:10.1073/pnas.0900342106 (2009). 240 Masuda, A. et al. CUGBP1 and MBNL1 preferentially bind to 3' UTRs and facilitate mRNA decay. Scientific reports 2, 209, doi:10.1038/srep00209 (2012). 241 Gong, C., Kim, Y. K., Woeller, C. F., Tang, Y. & Maquat, L. E. SMD and NMD are competitive pathways that contribute to myogenesis: effects on PAX3 and myogenin mRNAs. Genes & development 23, 54-66, doi:10.1101/gad.1717309 (2009). 242 Feng, Y., Chen, M. & Manley, J. L. Phosphorylation switches the general splicing repressor SRp38 to a sequence-specific activator. Nature structural & molecular biology 15, 1040-1048, doi:10.1038/nsmb.1485 (2008). 243 Blencowe, B. J. Exonic splicing enhancers: mechanism of action, diversity and role in human genetic diseases. Trends in biochemical sciences 25, 106-110 (2000). 244 Graveley, B. R. Sorting out the complexity of SR protein functions. RNA (New York, N.Y.) 6, 1197-1211 (2000). 245 Konarska, M. M. & Query, C. C. Insights into the mechanisms of splicing: more lessons from the ribosome. Genes & development 19, 2255-2260, doi:10.1101/gad.1363105 (2005). 246 Crotti, L. B. & Horowitz, D. S. Exon sequences at the splice junctions affect splicing fidelity and alternative splicing. Proceedings of the National Academy of Sciences 106, 18954 (2009). 247 Raponi, M. et al. Polypyrimidine tract binding protein regulates alternative splicing of an aberrant pseudoexon in NF1. The FEBS journal 275, 6101-6108, doi:10.1111/j.1742-4658.2008.06734.x (2008). 248 Turunen, J. J., Niemela, E. H., Verma, B. & Frilander, M. J. The significant other: splicing by the minor spliceosome. Wiley interdisciplinary reviews. RNA 4, 61- 76, doi:10.1002/wrna.1141 (2013). 249 Lin, J. C. & Tarn, W. Y. RBM4 down-regulates PTB and antagonizes its activity in muscle cell-specific alternative splicing. The Journal of cell biology 193, 509- 520, doi:10.1083/jcb.201007131 (2011). 250 Lin, J. C. & Tarn, W. Y. Multiple roles of RBM4 in muscle cell differentiation. Frontiers in bioscience (Scholar edition) 4, 181-189 (2012).

66

251 Yang, J. et al. RBM24 is a major regulator of muscle-specific alternative splicing. Developmental cell 31, 87-99, doi:10.1016/j.devcel.2014.08.025 (2014). 252 Dauksaite, V. & Gotthardt, M. Molecular basis of titin exon exclusion by RBM20 and the novel titin splice regulator PTB4. Nucleic Acids Research 46, 5227-5238, doi:10.1093/nar/gky165 (2018). 253 Lander, E. S. et al. Initial sequencing and analysis of the human genome. Nature 409, doi:10.1038/35057062 (2001). 254 Scotti, M. M. & Swanson, M. S. RNA mis-splicing in disease. Nat Rev Genet 17, 19-32, doi:10.1038/nrg.2015.3 (2016). 255 Dominski, Z. & Kole, R. Restoration of correct splicing in thalassemic pre-mRNA by antisense oligonucleotides. Proceedings of the National Academy of Sciences of the United States of America 90, 8673-8677 (1993). 256 Kole, R. & Krieg, A. M. Exon skipping therapy for Duchenne muscular dystrophy. Advanced drug delivery reviews 87, 104-107, doi:10.1016/j.addr.2015.05.008 (2015). 257 Wilton, S. D. et al. Specific removal of the nonsense mutation from the mdx dystrophin mRNA using antisense oligonucleotides. Neuromuscular Disorders 9, 330-338, doi: 10.1016/S0960-8966(99)00010-3 (1999). 258 Muntoni, F. & Wood, M. J. A. Targeting RNA to treat neuromuscular disease. Nature Reviews Drug Discovery 10, 621, doi:10.1038/nrd3459 (2011). 259 Fernandez-Rodriguez, J. et al. A mild neurofibromatosis type 1 phenotype produced by the combination of the benign nature of a leaky NF1-splice mutation and the presence of a complex mosaicism. Hum Mutat 32, 705-709, doi:10.1002/humu.21500 (2011). 260 Pros, E. et al. Antisense therapeutics for neurofibromatosis type 1 caused by deep intronic mutations. Hum Mutat 30, 454-462, doi:10.1002/humu.20933 (2009). 261 Singh, N. N. et al. An intronic structure enabled by a long-distance interaction serves as a novel target for splicing correction in spinal muscular atrophy. Nucleic Acids Research 41, 8144-8165, doi:10.1093/nar/gkt609 (2013). 262 Anczukow, O. et al. BRCA2 deep intronic mutation causing activation of a cryptic exon: opening toward a new preventive therapeutic strategy. Clinical cancer research : an official journal of the American Association for Cancer Research 18, 4903-4909, doi:10.1158/1078-0432.Ccr-12-1100 (2012). 263 Emery, A. E. The muscular dystrophies. Lancet (London, England) 359, 687- 695, doi:10.1016/s0140-6736(02)07815-7 (2002). 264 Lim, K. R. Q., Maruyama, R. & Yokota, T. Eteplirsen in the treatment of Duchenne muscular dystrophy. Drug Design, Development and Therapy 11, 533-545, doi:10.2147/DDDT.S97635 (2017). 265 Mendell, J. R., Sahenk, Z. & Rodino-Klapac, L. R. Clinical trials of exon skipping in Duchenne muscular dystrophy. Expert Opinion on Orphan Drugs 5, 683-690, doi:10.1080/21678707.2017.1366310 (2017). 266 Miller, J. W. et al. Recruitment of human muscleblind proteins to (CUG)(n) expansions associated with myotonic dystrophy. The EMBO journal 19, 4439- 4448, doi:10.1093/emboj/19.17.4439 (2000). 267 Cho, D. H. & Tapscott, S. J. Myotonic dystrophy: emerging mechanisms for DM1 and DM2. Biochim Biophys Acta 1772, 195-204, doi:10.1016/j.bbadis.2006.05.013 (2007). 268 Fardaei, M. et al. Three proteins, MBNL, MBLL and MBXL, co-localize in vivo with nuclear foci of expanded-repeat transcripts in DM1 and DM2 cells. Hum Mol Genet 11, 805-814 (2002). 269 Wheeler, T. M. et al. Reversal of RNA dominance by displacement of protein sequestered on triplet repeat RNA. Science 325, 336-339, doi:10.1126/science.1173110 (2009).

67

270 Kollberg, G. & Holme, E. Antisense oligonucleotide therapeutics for iron-sulphur cluster deficiency myopathy. Neuromuscular disorders : NMD 19, 833-836, doi:10.1016/j.nmd.2009.09.011 (2009). 271 Ryther, R. C., Flynt, A. S., Harris, B. D., Phillips, J. A., 3rd & Patton, J. G. GH1 splicing is regulated by multiple enhancers whose mutation produces a dominant-negative GH isoform that can be degraded by allele-specific small interfering RNA (siRNA). Endocrinology 145, 2988-2996, doi:10.1210/en.2003- 1724 (2004). 272 Bolduc, V., Zou, Y., Ko, D. & Bonnemann, C. G. siRNA-mediated Allele-specific Silencing of a COL6A3 Mutation in a Cellular Model of Dominant Ullrich Muscular Dystrophy. Molecular therapy. Nucleic acids 3, e147, doi:10.1038/mtna.2013.74 (2014). 273 Allo, M. et al. Control of alternative splicing through siRNA-mediated transcriptional gene silencing. Nature structural & molecular biology 16, 717- 724, doi:10.1038/nsmb.1620 (2009). 274 Anderson, E. S. et al. The cardiotonic steroid digitoxin regulates alternative splicing through depletion of the splicing factors SRSF3 and TRA2B. RNA (New York, N.Y.) 18, 1041-1049, doi:10.1261/rna.032912.112 (2012). 275 Chang, J. G. et al. Small molecule amiloride modulates oncogenic RNA alternative splicing to devitalize human cancer cells. PLoS One 6, e18643, doi:10.1371/journal.pone.0018643 (2011). 276 Nishida, A. et al. Chemical treatment enhances skipping of a mutated exon in the dystrophin gene. Nat Commun 2, 308, doi:10.1038/ncomms1306 (2011).

68