Splicing and Disease.

Emanuele Buratti and Francisco E Baralle*

International Centre of Genetic Engineering and Biotechnology (ICGEB), Trieste, Italy.

*address correspondence to: Prof. Francisco E. Baralle, Padriciano 99, 34149 Trieste, Italy,

Phone: +39-040-3757337, Fax: +39-040-3757361, E-mail: [email protected].

ABSTRACT

The observation that impaired splicing may cause human diseases dates from the earliest days of splicing research, in particular from the pioneering studies on haemoglobin genes. However, in the last fifteen years or so the increased knowledge of the process itself coupled with the great advancements in diagnostic screening techniques has greatly expanded that initial awareness. It is now clear, in fact, that splicing mutations can occur in virtually any human intron-containing gene and that the resulting splicing alterations may cause disease.

The pathological penetrance of these mutations may vary depending on the individual genetic background. To date, the most studied examples are classical genetic diseases linked to altered splicing regulation of a single gene. However, it is increasingly clear that splicing alterations play an equally important role in the origin and progression of complex diseases, such as tumour formation or neurological defects. The aim of this review is to provide basic pointers on splicing alterations and disease, with a particular focus on the consequences of genomic variations.

Keywords: splicing, mutation, RNA, 5’ splice site, 3' splice site, bioinformatics

INTRODUCTION

In order to ensure accurate gene expression, the pre-mRNA splicing process has the task of removing intervening sequences (or introns) from eukaryotic precursor messenger RNAs (pre-mRNA) [1]. Apart from joining consecutive exons together, this process is capable of selective removal or inclusion of exonic and intronic sequences in mRNA, generating several transcripts from a single gene, often in a cell type-specific, developmental, and even gender-dependent manner [2-8]. This process is called alternative splicing and, during the course of evolution, has achieved a very high degree of complexity, as recently reviewed in a number of publications [9,10]. This complexity is aimed at maintaining proper exon/intron recognition and is one of the essential factors that influence the shape of human genes [11]. In keeping with this, many recent reports have consistently highlighted the observation that even apparently neutral changes in the sequence composition of exons may alter splicing, revealing evolutionary mechanisms aimed at maintaining proper splicing regulatory pathways [12-14].

Both constitutive and alternative splicing pathways are carried out by a large ribonucleoprotein complex named the spliceosome [1,15]. Assembly of this very sophisticated cellular machinery [16,17] at every exon-intron or intron-exon junction is controlled by conserved but rather degenerate sequence elements that include the 5’ splice sites (5’ss), the 3’ splice sites (3’ss) and, upstream of the 3’ss, the polypyrimidine tract and the branch point sequence (BPS) (Fig.1). Because of their degeneracy, however, these consensus splicing signals contain only approximately half of the information necessary for accurate splice-site selection [18]. The remaining information is provided by auxiliary signals in introns and exons, termed splicing regulatory elements (SREs), that can also be referred to as enhancer and silencer sequences depending on their effect on exon recognition [19-21]. In most cases, these sequences work by interacting with trans-acting factors whose number is steadily increasing over time [22,23]. In parallel, a considerable degree of splicing regulation can also occur in a protein-free fashion, with low-molecular-weight ligands [24], snoRNAs [25], and modification of RNA secondary structure [26,27] all being capable of affecting this process.

Finally, as the spliceosome and transcription machineries are tightly linked, splicing can be influenced by pre-mRNA processing kinetics and transcription [28,29], cellular stress [30], and extracellular signals [31]. As a result, splicing mutations may affect not only RNA processing, but also transcription [32] and downstream gene expression pathways, including translation, largely by creating or eliminating exons containing upstream open reading frames [33,34]. The combination of all the factors influencing splicing contributes to what is now commonly known as the “splicing code” [35-37]. As expected from all this complexity, mutations in any of these cis-elements or factors can dramatically alter splicing efficiency, lead to aberrant splicing, and eventually to human disease, particularly in large genes with many introns [38,39].

Splicing and Disease

In recent years the topic of splicing and disease has been reviewed several times, most recently by Tazi et al. [40], Cooper et al. [41], and Baralle et al. [42]. All these reviews, with different emphasis on particular topics, have surveyed the general field in the light of the latest discoveries concerning the splicing process. The reader is therefore referred to these publications for a general, up-to-date overview of the subject.

At the same time, it is interesting to note that the huge amount of new information being published every year on the relationship between splicing and disease has resulted in the appearance of reviews that focus on particular kinds of diseases. For example, starting from the initial overview by Venables [43] on potential connections between splicing and cancer, several reviews have followed up on this specific subject [44-48]. In general, the mechanisms through which aberrant alternative splicing can bring about a tumorigenic transformation involve rather expected events, such as the production of protein isoforms with oncogenic properties (or with impaired anti-oncogenic properties). The genes involved in these cases predominantly encode factors that control processes such as apoptosis, cell-cycle regulation, and angiogenesis. One important factor that is also emerging from these studies on human cancers is the central role played by alterations in the expression of the splicing factors themselves, rather than by mutations in the splicing regulatory regions of individual genes, as recently reviewed by Grosso et al. [49]. Of particular interest is the recent identification of one of the best-known splicing factors (SF2/ASF, also referred to as ASF/SF2) as a potential proto-oncogene upon its overexpression in rodent fibroblasts [50]. In keeping with this conclusion, the same study has shown that SF2/ASF is overexpressed in a variety of human tumours. The mechanisms through which transformation comes about are still under study. The study by Karni et al. [50] has identified several likely targets whose alternative splicing patterns can be adversely affected by upregulation of SF2/ASF, such as the tumour suppressor BIN1 and the MNK2 and S6K1 kinases. Moreover, it was previously reported that SF2/ASF expression levels can powerfully affect the alternative splicing of the Ron oncogene [51]. Other well-known splicing factors whose expression levels are altered in cancer cells are hnRNP A1, Tra-2, YB-1, and a host of other factors that were previously known just for their splicing regulatory abilities [52]. Many of these connections, especially with regard to their potential functional significance in tumour origin and progression, still need to be tested further. Even so, the rapid development of the therapeutic field aimed at correcting splicing defects, and the need to find new targets for the therapeutic treatment of this type of ailment, will certainly drive the field swiftly forward in the near future.

All these observations have increased the use of microarray analysis of transcript alterations as "biomarkers" for the diagnosis and prognosis of particular types of cancers. For example, attempts have been made to use splicing variations in transcripts to classify Hodgkin lymphoma tumours [53], ovarian cancers [54], leukemia cell lines [55], and human breast cancer cells [56].

It should also be pointed out that human tumours are not the only complex diseases in which the importance of splicing is being carefully evaluated. In particular, the potential role played by splicing alterations has also gained a lot of attention in neurological diseases [57,58]. For example, considerable effort has recently been devoted to clarifying the role of Nova, a neuron-specific splicing factor. This factor regulates synapse formation during the development of the human brain by controlling the levels of alternatively spliced isoforms of several neurotransmitter receptors, adhesion molecules, cation exchangers, and scaffold proteins [59,60]. Another recent addition to the list of splicing factors involved in neurodegeneration is TDP-43, a protein that was previously described to control CFTR exon 9 splicing [61] but has recently been described as the major protein accumulating in patients affected by Frontotemporal Lobar Degeneration and Amyotrophic Lateral Sclerosis [62].

Splicing Therapeutics

Current possibilities for modifying aberrant splicing patterns have also been the subject of several recent reviews [63-67]. The strategies used to modify splicing profiles are quite varied. The most successful approaches up to now have involved the use of antisense oligonucleotides that target splicing control regions. These oligos can be used to inhibit the inclusion of unwanted exons and/or promote the production of a truncated but functional protein [64,68]. Antisense oligonucleotides can also be modified to contain a complementary targeting region and an effector region that can recruit or mimic splicing factor activities [69,70]. Alternatively, one promising research line that is gaining a lot of attention is the use of small molecules that act by selectively modifying the activity of splicing regulatory proteins through altered cellular distribution or changes in phosphorylation state [71]. Other approaches have also been described that use siRNAs to specifically knock down aberrant splicing isoforms, exploit trans-splicing strategies (SMaRT) [72], or use modified U7-U1 snRNP molecules either to block aberrant splice site sequences (i.e. acting as antisense oligonucleotides) or to reverse missplicing by carrying compensatory mutations in the 5' end of their U1 snRNA sequence [73-76]. For the interested reader, it should be noted that many of these specific approaches are discussed in Section II.E of this book.

Generation of aberrant transcripts

From a biomedical point of view, one of the most important aspects to be considered by researchers is what kind of aberrant transcript may be generated by typical splicing-affecting mutations. A schematic diagram of the possible consequences of mutations in the basic splicing regulatory elements is reported in Fig.1. Mutations in enhancer or silencer elements that lead to their disruption (or creation) can lead to the same consequences described for the basic regulatory elements, with the addition that enhancer creation or silencer loss can lead to increased levels of exon inclusion (Fig.2).

Exon skipping.

In general, the vast majority of mutations in 5’ss, 3’ss and regulatory elements result in the skipping of the affected exon from the splicing queue [77] (Fig. 1A). Although the skipping of an exon is by itself a straightforward process, it should be noted that quite often the skipping event is not confined to the exon carrying the splicing mutation but can also extend to neighbouring exons (either upstream or downstream). This has suggested that the importance of the genomic milieu should never be underestimated [78].

Cryptic splice site activation.

Cryptic splice site activation usually occurs when the natural donor or acceptor site is inactivated or weakened by a particular mutation. In this case, depending on the local sequence context, one or more splice sites are used that would normally be ignored by the splicing machinery (Fig.1B). These events result in either the addition or the subtraction of nucleotide sequences from the original exon. In these cases there is a 2 in 3 chance of disrupting the reading frame and introducing aberrant translation stop codons in the final transcript, which can cause either degradation of the mRNA transcript through Nonsense Mediated Decay (NMD) or synthesis of a truncated protein. Furthermore, even when the reading frame remains unchanged, the addition/removal of a number of amino acid residues from the resulting protein may well prove to be harmful with regard to its biological properties or regulation.
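The "2 in 3" figure follows from simple arithmetic on the triplet code: a cryptic site that adds or removes n nucleotides preserves the reading frame only when n is a multiple of three. The minimal Python sketch below illustrates this point. The function name and the range of shift lengths are hypothetical, and the calculation assumes, purely for illustration, that all shift lengths are equally likely.

```python
def preserves_reading_frame(nt_change: int) -> bool:
    """Return True if adding (+) or removing (-) nt_change nucleotides keeps the frame."""
    return nt_change % 3 == 0


# Hypothetical shift lengths (in nucleotides) caused by cryptic splice site usage.
# Assumption: every length from 1 to 300 is equally likely.
shifts = range(1, 301)
disrupting = [n for n in shifts if not preserves_reading_frame(n)]
fraction_disrupting = len(disrupting) / len(shifts)

print(f"Fraction of shifts that disrupt the frame: {fraction_disrupting:.3f}")
```

Under the stated assumption the fraction comes out at two thirds; real distributions of cryptic site positions need not be uniform, so the observed proportion in a given gene may differ.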

A bioinformatics analysis of several hundred cryptic splice site activation events [79,80] has confirmed that cryptic splice sites are, on average, intrinsically stronger than the mutated authentic counterparts but generally weaker than their authentic, wild-type counterparts [81]. However, in ~10-15% of cases, the wild-type authentic splice site was weaker than the corresponding cryptic site. This indicates that there are additional signals in the pre-mRNA that repress the use of cryptic sites, and several experimental observations have confirmed this hypothesis. First of all, at the bioinformatics level, the analysis of auxiliary sequences between authentic and aberrant splice sites showed that one particular type of silencer, termed PESS for putative exonic splicing silencer [19-21], was particularly informative for predicting aberrant splice-site activation [82]. Secondly, in genes such as FGB it has been reported that an ASF/SF2 binding sequence that does not normally participate in the recognition of the constitutively recognized exon 7 can nonetheless profoundly influence the activation and type of cryptic splice site sequences being used by the splicing machinery following inactivation of the wild-type donor site [83].
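To make the notion of intrinsic splice site "strength" more concrete, the following Python sketch compares a wild-type donor site, its mutated version, and a cryptic donor site. It is a deliberately crude toy model and not the method used in the cited analyses, which rely on weight-matrix or related scoring tools. The score simply counts matches to the commonly quoted 5'ss consensus MAG|GTRAGT and treats loss of the intronic GT as inactivation (a simplification, since rare non-GT donor sites exist). The function name and all three sequences are hypothetical.

```python
# Allowed nucleotides at each position of a 9-nt donor window
# (3 exonic + 6 intronic positions), following the consensus MAG|GTRAGT.
CONSENSUS_5SS = ["AC", "A", "G", "G", "T", "AG", "A", "G", "T"]


def toy_donor_score(window: str) -> int:
    """Count consensus matches; return 0 if the intronic GT dinucleotide is absent."""
    site = window.upper().replace("U", "T")
    if len(site) != len(CONSENSUS_5SS):
        raise ValueError("expected a 9-nucleotide window")
    if site[3:5] != "GT":
        return 0  # simplifying assumption: no GT means an inactive donor site
    return sum(base in allowed for base, allowed in zip(site, CONSENSUS_5SS))


# Hypothetical sequences, invented for illustration only.
authentic = "CAGGTAAGT"          # wild-type donor site
mutated_authentic = "CAGATAAGT"  # same site after a G>A change at intron position +1
cryptic = "AAGGTGAGC"            # nearby cryptic donor site

scores = {
    "authentic": toy_donor_score(authentic),
    "mutated_authentic": toy_donor_score(mutated_authentic),
    "cryptic": toy_donor_score(cryptic),
}
print(scores)
```

In this invented example the cryptic site scores above the mutated authentic site but below the wild-type site, which is the average pattern described in the text. A score of this kind says nothing about the auxiliary silencer and enhancer signals that the text identifies as additional determinants.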

There are two important databases that collect disease-related cryptic splice site activation events following either acceptor or donor site inactivation: DBASS3 and DBASS5, respectively [79,80]. Both databases are freely available at www.dbass.org.uk/. Finally, an in silico tool (Cryp-Skip, available at www.dbass.org.uk/cryp-skip/) has recently been developed to predict the potential occurrence of cryptic splice site activation versus exon skipping following the introduction of mutations in any given donor or acceptor site [84].

Intron retention.

Intron retention events are usually defined as the retention of entire intronic sequences in the final processed mRNA (Fig.1C). The frequency of normal intron retention events in the human genome has recently been estimated to be around 15% in a set of more than 21000 annotated genes [85]. In many cases, the biological role of these events is currently unknown; however, it is known that they preferentially occur in the untranslated regions of the RNA [85,86]. Their potential regulatory role, however, is established by some well-described examples, such as the generation of the P element and Msl2 transcripts in Drosophila [87,88], the developmental regulation of the proinsulin messenger RNA in chicken embryos [89], the generation of a novel adhesion molecule in rat testis [90], and the control of Apolipoprotein E expression levels in the central nervous system [91]. As expected, aberrant intron retention events following the introduction of mutations in splicing regulatory elements have also been shown to be associated with human diseases such as pheochromocytomas [92], long QT syndrome [93], Leigh syndrome [94], Arthrogryposis multiplex congenita (AMC) [95], and B lineage human cancers [96].

Pseudoexon inclusion.

The term "pseudoexon" usually refers to any nucleotide sequence between 50 and 300 nucleotides in length with apparently viable 5'ss and 3'ss sites at either end. Because of the degeneracy of the splicing code it is expected that many such sequences would be present in most human genes. Indeed, in the hprt gene it has been estimated that pseudoexon sequences largely outnumber the "real" exons [97]. The evidence available up to now has pointed to several factors that can help the spliceosome discriminate real exons from these false targets.

First of all, inclusion of many of these sequences is actively inhibited due to the presence of intrinsic defects [97], the presence of silencer elements [20,98,99], or the formation of inhibiting RNA secondary structures [100].
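The working definition of a pseudoexon given above (a stretch of 50 to 300 nucleotides flanked by apparently viable splice sites) can be illustrated with a naive scan of an intronic sequence. The Python sketch below is only an illustration under strong assumptions: "apparently viable" is reduced to the minimal AG (acceptor) and GT (donor) dinucleotides, with no scoring of the polypyrimidine tract, branch point, silencers, or secondary structure. The function name and the example sequence are hypothetical.

```python
from typing import Iterator, Tuple


def candidate_pseudoexons(
    intron: str, min_len: int = 50, max_len: int = 300
) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) of segments preceded by AG and followed by GT.

    Coordinates are 0-based and end-exclusive and describe the candidate
    segment itself, not the flanking dinucleotides.
    """
    seq = intron.upper()
    # A candidate starts right after an upstream "AG" (minimal 3'ss signal).
    starts = [i + 2 for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    # A candidate ends right before a downstream "GT" (minimal 5'ss signal).
    ends = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    for start in starts:
        for end in ends:
            if min_len <= end - start <= max_len:
                yield start, end


# Hypothetical intronic sequence: AG, 60 nucleotides, then GT.
example_intron = "CCCAG" + "C" * 60 + "GTCCC"
hits = list(candidate_pseudoexons(example_intron))
print(hits)
```

Because AG and GT dinucleotides are so frequent, a scan of this kind returns very large numbers of candidates on real introns, which is consistent with the observation that pseudoexon sequences outnumber real exons and that additional signals are needed to discriminate between them.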

Nonetheless, the number of reported pseudoexon events involved in human disease is steadily increasing. Usually, this is due to de novo creation of the classical splicing consensus sequences: donor, acceptor, and branch site sequences (Fig.1D). The second most frequent mechanism leading to pseudoexon activation involves the creation/deletion of splicing regulatory sequences. Finally, two individual cases of pseudoexon inclusion have been described that arose from rearrangements of genomic regions: a gross deletion that brought two viable donor and acceptor sites close to each other [101], and a genomic inversion that activated exons in what would normally have been the antisense genomic strand [102].

Unexpected splicing outcomes following disruption of classical splicing sequences.

It should also be noted, however, that these possibilities do not rule out other kinds of outcomes, such as those schematically reported in Fig.3. In this case, it has been observed that disease-associated inactivating mutations in the 3' acceptor sequences of the TP and XPA genes not only cause skipping of the affected exon but also determine a shift in donor acceptor usage of the preceding exon [103,104]. These "atypical" outcomes are not confined to 3'ss sequences, as donor site inactivation in the COL1A1 and CLN6 genes has yielded very similar results [105,106]. For this reason, in order to accurately determine aberrant splicing events it is always advisable to use the full range of diagnostic possibilities (most of which are fully described in other chapters of this book).

Acknowledgements

This work was supported by the Telethon Onlus Foundation (GGP06147) and by the EC grant EURASNET-LSHG-CT-2005-518238.

REFERENCES

[1] Sharp, P.A. (1994). Split genes and RNA splicing. Cell 77, 805-815.
[2] Black, D.L. (2003). Mechanisms of alternative pre-messenger RNA splicing. Annu Rev Biochem 72, 291-336.
[3] Boue, S., Letunic, I. and Bork, P. (2003). Alternative splicing and evolution. Bioessays 25, 1031-1034.
[4] Stetefeld, J. and Ruegg, M.A. (2005). Structural and functional diversity generated by alternative mRNA splicing. Trends Biochem Sci 30, 515-521.
[5] Graveley, B.R. (2001). Alternative splicing: increasing diversity in the proteomic world. Trends Genet 17, 100-107.
[6] Maniatis, T. and Tasic, B. (2002). Alternative pre-mRNA splicing and proteome expansion in metazoans. Nature 418, 236-243.
[7] Blencowe, B.J. (2006). Alternative splicing: new insights from global analyses. Cell 126, 37-47.
[8] Moroy, T. and Heyd, F. (2007). The impact of alternative splicing in vivo: Mouse models show the way. RNA 13, 1155-1171.
[9] Park, J.W. and Graveley, B.R. (2007). Complex alternative splicing. Adv Exp Med Biol 623, 50-63.
[10] Stamm, S., Ben-Ari, S., Rafalska, I., Tang, Y., Zhang, Z., Toiber, D., Thanaraj, T.A. and Soreq, H. (2005). Function of alternative splicing. Gene 344, 1-20.
[11] Zhang, C., Li, W.H., Krainer, A.R. and Zhang, M.Q. (2008). RNA landscape of evolution for optimal exon and intron discrimination. Proc Natl Acad Sci U S A 105, 5797-5802.
[12] Pagani, F. and Baralle, F.E. (2004). Genomic variants in exons and introns: identifying the splicing spoilers. Nat Rev Genet 5, 389-396.
[13] Parmley, J.L. and Hurst, L.D. (2007). How do synonymous mutations affect fitness? Bioessays 29, 515-519.
[14] Chamary, J.V., Parmley, J.L. and Hurst, L.D. (2006). Hearing silence: non-neutral evolution at synonymous sites in mammals. Nat Rev Genet 7, 98-108.
[15] Nilsen, T.W. (2003). The spliceosome: the most complex macromolecular machine in the cell? Bioessays 25, 1147-1149.
[16] Sperling, J., Azubel, M. and Sperling, R. (2008). Structure and function of the Pre-mRNA splicing machine. Structure 16, 1605-1615.
[17] Matlin, A.J. and Moore, M.J. (2007). Spliceosome assembly and composition. Adv Exp Med Biol 623, 14-35.
[18] Yeo, G., Hoon, S., Venkatesh, B. and Burge, C.B. (2004). Variation in sequence and organization of splicing regulatory elements in vertebrate genes. Proc Natl Acad Sci U S A 101, 15700-15705.
[19] Wang, Z., Rolish, M.E., Yeo, G., Tung, V., Mawson, M. and Burge, C.B. (2004). Systematic identification and analysis of exonic splicing silencers. Cell 119, 831-845.
[20] Zhang, X.H. and Chasin, L.A. (2004). Computational definition of sequence motifs governing constitutive exon splicing. Genes Dev 18, 1241-1250.
[21] Cartegni, L., Chew, S.L. and Krainer, A.R. (2002). Listening to silence and understanding nonsense: exonic mutations that affect splicing. Nat Rev Genet 3, 285-298.
[22] Jurica, M.S. and Moore, M.J. (2003). Pre-mRNA splicing: awash in a sea of proteins. Mol Cell 12, 5-14.
[23] Chen, Y.I., Moore, R.E., Ge, H.Y., Young, M.K., Lee, T.D. and Stevens, S.W. (2007). Proteomic analysis of in vivo-assembled pre-mRNA splicing complexes expands the catalog of participating factors. Nucleic Acids Res 35, 3928-3944.

[24] Cheah, M.T., Wachter, A., Sudarsan, N. and Breaker, R.R. (2007). Control of alternative RNA splicing and gene expression by eukaryotic riboswitches. Nature 447, 497-500.
[25] Kishore, S. and Stamm, S. (2006). The snoRNA HBII-52 regulates alternative splicing of the serotonin receptor 2C. Science 311, 230-232.
[26] Buratti, E. and Baralle, F.E. (2004). Influence of RNA Secondary Structure on the Pre-mRNA Splicing Process. Mol Cell Biol 24, 10505-10514.
[27] Hiller, M., Zhang, Z., Backofen, R. and Stamm, S. (2007). Pre-mRNA Secondary Structures Influence Exon Recognition. PLoS Genet 3, e204.
[28] Tasic, B. et al. (2002). Promoter choice determines splice site selection in protocadherin alpha and gamma pre-mRNA splicing. Mol Cell 10, 21-33.
[29] Kornblihtt, A.R. (2007). Coupling transcription and alternative splicing. Adv Exp Med Biol 623, 175-189.
[30] Biamonti, G. and Caceres, J.F. (2009). Cellular stress and RNA splicing. Trends Biochem Sci 34, 146-153.
[31] Blaustein, M., Pelisch, F. and Srebrow, A. (2007). Signals, pathways and splicing regulation. Int J Biochem Cell Biol 39, 2031-2048.
[32] Furger, A., O'Sullivan, J.M., Binnie, A., Lee, B.A. and Proudfoot, N.J. (2002). Promoter proximal splice sites enhance transcription. Genes Dev 16, 2792-2799.
[33] Cazzola, M. and Skoda, R.C. (2000). Translational pathophysiology: a novel molecular mechanism of human disease. Blood 95, 3280-3288.
[34] Kralovicova, J., Gaunt, T.R., Rodriguez, S., Wood, P.J., Day, I.N. and Vorechovsky, I. (2006). Variants in the human insulin gene that affect pre-mRNA splicing: is -23HphI a functional single nucleotide polymorphism at IDDM2? Diabetes 55, 260-264.
[35] Matlin, A.J., Clark, F. and Smith, C.W. (2005). Understanding alternative splicing: towards a cellular code. Nat Rev Mol Cell Biol 6, 386-398.
[36] Buratti, E., Baralle, M. and Baralle, F.E. (2006). Defective splicing, disease and therapy: searching for master checkpoints in exon definition. Nucleic Acids Res 34, 3494-3510.
[37] Hertel, K.J. (2008). Combinatorial control of exon recognition. J Biol Chem 283, 1211-1215.
[38] Ars, E., Serra, E., Garcia, J., Kruyer, H., Gaona, A., Lazaro, C. and Estivill, X. (2000). Mutations affecting mRNA splicing are the most common molecular defects in patients with neurofibromatosis type 1. Hum Mol Genet 9, 237-247.
[39] Teraoka, S.N. et al. (1999). Splicing defects in the ataxia-telangiectasia gene, ATM: underlying mutations and consequences. Am J Hum Genet 64, 1617-1631.
[40] Tazi, J., Bakkour, N. and Stamm, S. (2009). Alternative splicing and disease. Biochim Biophys Acta 1792, 14-26.
[41] Cooper, T.A., Wan, L. and Dreyfuss, G. (2009). RNA and disease. Cell 136, 777-793.
[42] Baralle, D., Lucassen, A. and Buratti, E. (2009). Missed threads. The impact of pre-mRNA splicing defects on clinical practice. EMBO Rep 10, 810-816.
[43] Venables, J.P. (2004). Aberrant and alternative splicing in cancer. Cancer Res 64, 7647-7654.
[44] Venables, J.P. (2006). Unbalanced alternative splicing and its significance in cancer. Bioessays 28, 378-386.
[45] Srebrow, A. and Kornblihtt, A.R. (2006). The connection between splicing and cancer. J Cell Sci 119, 2635-2641.
[46] Kalnina, Z., Zayakin, P., Silina, K. and Line, A. (2005). Alterations of pre-mRNA splicing in cancer. Genes Chromosomes Cancer 42, 342-357.
[47] Pajares, M.J., Ezponda, T., Catena, R., Calvo, A., Pio, R. and Montuenga, L.M. (2007). Alternative splicing: an emerging topic in molecular and clinical oncology. Lancet Oncol 8, 349-357.

[48] Ghigna, C., Valacca, C. and Biamonti, G. (2008). Alternative splicing and tumor progression. Curr Genomics 9, 556-570.
[49] Grosso, A.R., Martins, S. and Carmo-Fonseca, M. (2008). The emerging role of splicing factors in cancer. EMBO Rep 9, 1087-1093.
[50] Karni, R., de Stanchina, E., Lowe, S.W., Sinha, R., Mu, D. and Krainer, A.R. (2007). The gene encoding the splicing factor SF2/ASF is a proto-oncogene. Nat Struct Mol Biol 14, 185-193.
[51] Ghigna, C. et al. (2005). Cell motility is controlled by SF2/ASF through alternative splicing of the Ron protooncogene. Mol Cell 20, 881-890.
[52] Grosso, A.R., Gomes, A.Q., Barbosa-Morais, N.L., Caldeira, S., Thorne, N.P., Grech, G., von Lindern, M. and Carmo-Fonseca, M. (2008). Tissue-specific splicing factor gene expression signatures. Nucleic Acids Res 36, 4823-4832.
[53] Relogio, A., Ben-Dov, C., Baum, M., Ruggiu, M., Gemund, C., Benes, V., Darnell, R.B. and Valcarcel, J. (2005). Alternative splicing microarrays reveal functional expression of neuron-specific regulators in Hodgkin lymphoma cells. J Biol Chem 280, 4779-4784.
[54] Klinck, R. et al. (2008). Multiple alternative splicing markers for ovarian cancer. Cancer Res 68, 657-663.
[55] Milani, L., Fredriksson, M. and Syvanen, A.C. (2006). Detection of alternatively spliced transcripts in leukemia cell lines by minisequencing on microarrays. Clin Chem 52, 202-211.
[56] Li, C., Kato, M., Shiue, L., Shively, J.E., Ares, M., Jr. and Lin, R.J. (2006). Cell type and culture condition-dependent alternative splicing in human breast cancer cells revealed by splicing-sensitive microarrays. Cancer Res 66, 1990-1999.
[57] Dredge, B.K., Polydorides, A.D. and Darnell, R.B. (2001). The splice of life: alternative splicing and neurological disease. Nat Rev Neurosci 2, 43-50.
[58] Licatalosi, D.D. and Darnell, R.B. (2006). Splicing regulation in neurologic disease. Neuron 52, 93-101.
[59] Ule, J. and Darnell, R.B. (2007). Functional and mechanistic insights from genome-wide studies of splicing regulation in the brain. Adv Exp Med Biol 623, 148-160.
[60] Ule, J. et al. (2005). Nova regulates brain-specific splicing to shape the synapse. Nat Genet 37, 844-852.
[61] Buratti, E., Dork, T., Zuccato, E., Pagani, F., Romano, M. and Baralle, F.E. (2001). Nuclear factor TDP-43 and SR proteins promote in vitro and in vivo CFTR exon 9 skipping. EMBO J 20, 1774-1784.
[62] Neumann, M. et al. (2006). Ubiquitinated TDP-43 in frontotemporal lobar degeneration and amyotrophic lateral sclerosis. Science 314, 130-133.
[63] Buratti, E., Baralle, F.E. and Pagani, F. (2003). Can a 'patch' in a skipped exon make the pre-mRNA splicing machine run better? Trends Mol Med 9, 229-232.
[64] Garcia-Blanco, M.A. (2005). Making antisense of splicing. Curr Opin Mol Ther 7, 476-482.
[65] Garcia-Blanco, M.A., Baraniak, A.P. and Lasda, E.L. (2004). Alternative splicing in disease and therapy. Nat Biotechnol 22, 535-546.
[66] Tazi, J., Durand, S. and Jeanteur, P. (2005). The spliceosome: a novel multi-faceted target for therapy. Trends Biochem Sci 30, 469-478.
[67] Dery, K.J., Gusti, V., Gaur, S., Shively, J.E., Yen, Y. and Gaur, R.K. (2009). Alternative splicing as a therapeutic target for human diseases. Methods Mol Biol 555, 127-144.
[68] Aartsma-Rus, A. and van Ommen, G.J. (2007). Antisense-mediated exon skipping: a versatile tool with therapeutic and research applications. RNA 13, 1609-1624.
[69] Cartegni, L. and Krainer, A.R. (2003). Correction of disease-associated exon skipping by synthetic exon-specific activators. Nat Struct Biol 10, 120-125.

[70] Skordis, L.A., Dunckley, M.G., Yue, B., Eperon, I.C. and Muntoni, F. (2003). Bifunctional antisense oligonucleotides provide a trans-acting splicing enhancer that stimulates SMN2 gene expression in patient fibroblasts. Proc Natl Acad Sci U S A 100, 4114-4119.
[71] Soret, J. et al. (2005). Selective modification of alternative splicing by indole derivatives that target serine-arginine-rich protein splicing factors. Proc Natl Acad Sci U S A 102, 8764-8769.
[72] Liu, X. et al. (2002). Partial correction of endogenous DeltaF508 CFTR in human cystic fibrosis airway epithelia by spliceosome-mediated RNA trans-splicing. Nat Biotechnol 20, 47-52.
[73] Gorman, L., Mercatante, D.R. and Kole, R. (2000). Restoration of correct splicing of thalassemic beta-globin pre-mRNA by modified U1 snRNAs. J Biol Chem 275, 35914-35919.
[74] Abad, X., Vera, M., Jung, S.P., Oswald, E., Romero, I., Amin, V., Fortes, P. and Gunderson, S.I. (2008). Requirements for gene silencing mediated by U1 snRNA binding to a target sequence. Nucleic Acids Res 36, 2338-2352.
[75] Goraczniak, R., Behlke, M.A. and Gunderson, S.I. (2009). Gene silencing by synthetic U1 adaptors. Nat Biotechnol 27, 257-263.
[76] Goyenvalle, A., Vulin, A., Fougerousse, F., Leturcq, F., Kaplan, J.C., Garcia, L. and Danos, O. (2004). Rescue of dystrophic muscle through U7 snRNA-mediated exon skipping. Science 306, 1796-1799.
[77] Krawczak, M., Thomas, N.S., Hundrieser, B., Mort, M., Wittig, M., Hampe, J. and Cooper, D.N. (2007). Single base-pair substitutions in exon-intron junctions of human genes: nature, distribution, and consequences for mRNA splicing. Hum Mutat 28, 150-158.
[78] Baralle, M. et al. (2006). NF1 mRNA biogenesis: Effect of the genomic milieu in splicing regulation of the NF1 exon 37 region. FEBS Lett 580, 4449-4456.
[79] Vorechovsky, I. (2006). Aberrant 3' splice sites in human disease genes: mutation pattern, nucleotide structure and comparison of computational tools that predict their utilization. Nucleic Acids Res 34, 4630-4641.
[80] Buratti, E., Chivers, M., Kralovicova, J., Romano, M., Baralle, M., Krainer, A.R. and Vorechovsky, I. (2007). Aberrant 5' splice sites in human disease genes: mutation pattern, nucleotide structure and comparison of computational tools that predict their utilization. Nucleic Acids Res 35, 4250-4263.
[81] Roca, X., Sachidanandam, R. and Krainer, A.R. (2003). Intrinsic differences between authentic and cryptic 5' splice sites. Nucleic Acids Res 31, 6321-6333.
[82] Kralovicova, J. and Vorechovsky, I. (2007). Global control of aberrant splice-site activation by auxiliary splicing sequences: evidence for a gradient in exon and intron definition. Nucleic Acids Res 35, 6399-6413.
[83] Spena, S., Tenchini, M.L. and Buratti, E. (2006). Cryptic splice site usage in exon 7 of the human fibrinogen Bβ-chain gene is regulated by a naturally silent SF2/ASF binding site within this exon. RNA 12, 948-958.
[84] Divina, P., Kvitkovicova, A., Buratti, E. and Vorechovsky, I. (2009). Ab initio prediction of mutation-induced cryptic splice-site activation and exon skipping. Eur J Hum Genet 17, 759-765.
[85] Galante, P.A., Sakabe, N.J., Kirschbaum-Slager, N. and de Souza, S.J. (2004). Detection and evaluation of intron retention events in the human transcriptome. RNA 10, 757-765.
[86] Stamm, S., Zhu, J., Nakai, K., Stoilov, P., Stoss, O. and Zhang, M.Q. (2000). An alternative-exon database and its statistical analysis. DNA Cell Biol 19, 739-756.
[87] Laski, F.A. and Rubin, G.M. (1989). Analysis of the cis-acting requirements for germ-line-specific splicing of the P-element ORF2-ORF3 intron. Genes Dev 3, 720-728.

[88] Gebauer, F., Merendino, L., Hentze, M.W. and Valcarcel, J. (1998). The Drosophila splicing regulator sex-lethal directly inhibits translation of male-specific-lethal 2 mRNA. RNA 4, 142-150.
[89] Mansilla, A., Lopez-Sanchez, C., de la Rosa, E.J., Garcia-Martinez, V., Martinez-Salas, E., de Pablo, F. and Hernandez-Sanchez, C. (2005). Developmental regulation of a proinsulin messenger RNA generated by intron retention. EMBO Rep 6, 1182-1187.
[90] Kurio, H., Murayama, E., Kaneko, T., Shibata, Y., Inai, T. and Iida, H. (2008). Intron retention generates a novel isoform of CEACAM6 that may act as an adhesion molecule in the ectoplasmic specialization structures between spermatids and sertoli cells in rat testis. Biol Reprod 79, 1062-1073.
[91] Xu, Q., Walker, D., Bernardo, A., Brodbeck, J., Balestra, M.E. and Huang, Y. (2008). Intron-3 retention/splicing controls neuronal expression of apolipoprotein E in the CNS. J Neurosci 28, 1452-1459.
[92] Le Hir, H., Charlet-Berguerand, N., de Franciscis, V. and Thermes, C. (2002). 5'-End RET splicing: absence of variants in normal tissues and intron retention in pheochromocytomas. Oncology 63, 84-91.
[93] Zhang, L. et al. (2004). An intronic mutation causes long QT syndrome. J Am Coll Cardiol 44, 1283-1291.
[94] Pequignot, M.O. et al. (2001). Mutations in the SURF1 gene associated with Leigh syndrome and cytochrome C oxidase deficiency. Hum Mutat 17, 374-381.
[95] Attali, R. et al. (2009). Mutation of SYNE-1, encoding an essential component of the nuclear lamina, is responsible for autosomal recessive arthrogryposis. Hum Mol Genet 18, 3462-3469.
[96] Ghosh, A., Kuppusamy, H. and Pilarski, L.M. (2009). Aberrant splice variants of HAS1 (Hyaluronan Synthase 1) Multimerize with and modulate normally spliced HAS1 protein: a potential mechanism promoting human cancer. J Biol Chem 284, 18840-18850.
[97] Sun, H. and Chasin, L.A. (2000). Multiple splicing defects in an intronic false exon. Mol Cell Biol 20, 6414-6425.
[98] Sironi, M., Menozzi, G., Riva, L., Cagliani, R., Comi, G.P., Bresolin, N., Giorda, R. and Pozzoli, U. (2004). Silencer elements as possible inhibitors of pseudoexon splicing. Nucleic Acids Res 32, 1783-1791.
[99] Fairbrother, W.G. and Chasin, L.A. (2000). Human genomic sequences that inhibit splicing. Mol Cell Biol 20, 6816-6825.
[100] Zhang, X.H., Leslie, C.S. and Chasin, L.A. (2005). Dichotomous splicing signals in exon flanks. Genome Res 15, 768-779.
[101] Lucien, N., Chiaroni, J., Cartron, J.P. and Bailly, P. (2002). Partial deletion in the JK locus causing a Jk(null) phenotype. Blood 99, 1079-1081.
[102] Madden, H.R., Fletcher, S., Davis, M.R. and Wilton, S.D. (2009). Characterization of a complex Duchenne muscular dystrophy-causing dystrophin gene inversion and restoration of the reading frame by induced exon skipping. Hum Mutat 30, 22-28.
[103] Satokata, I., Uchiyama, M. and Tanaka, K. (1995). Two novel splicing mutations in the XPA gene in patients with group A xeroderma pigmentosum. Hum Mol Genet 4, 1993-1994.
[104] Szigeti, K. et al. (2004). MNGIE with lack of skeletal muscle involvement and a novel TP splice site mutation. J Med Genet 41, 125-129.
[105] Siintola, E., Topcu, M., Kohlschutter, A., Salonen, T., Joensuu, T., Anttonen, A.K. and Lehesjoki, A.E. (2005). Two novel CLN6 mutations in variant late-infantile neuronal ceroid lipofuscinosis patients of Turkish origin. Clin Genet 68, 167-173.
[106] Bateman, J.F., Chan, D., Moeller, I., Hannagan, M. and Cole, W.G. (1994). A 5' splice site mutation affecting the pre-mRNA splicing of two upstream exons in the collagen COL1A1 gene. Exon 8 skipping and altered definition of exon 7 generates truncated pro alpha 1(I) chains with a non-collagenous insertion destabilizing the triple helix. Biochem J 302 (Pt 3), 729-735.

Figure Legends

Fig. 1 Classical outcomes of mutation-induced aberrant transcripts

The upper panel shows a pre-mRNA molecule with exons (boxes) separated by introns (lines). Splice-site consensus sequences (5'ss, 3'ss, polypyrimidine tract and BPS) are indicated for the central exon. Inactivation of the 5'ss and 3'ss sequences can lead to exon skipping (of a single exon or of multiple exons) (A), activation of downstream or upstream (cryptic) splice sites (B), or full intron retention (C). Mutations that create strong splice sites in intronic regions can lead to pseudoexon activation (D).

Fig. 2 Mutation-induced aberrant transcripts following inactivation/activation of SRE elements.

Mutations that disrupt (or create) enhancer or silencer elements can have the same consequences described for the basic regulatory elements, with the addition that enhancer creation or silencer loss can increase the level of exon inclusion.

Fig. 3 Unexpected splicing outcomes in disease

These schematic panels show some unexpected splicing events that might be associated with the introduction of disease-associated mutations in classical splicing signals, such as the acceptor or donor site of exons.
