HUMAN MUTATION 28(1), 41–53, 2007

RESEARCH ARTICLE

Weak Definition of IKBKAP Exon 20 Leads to Aberrant Splicing in Familial Dysautonomia

El Chérif Ibrahim,1 Matthew M. Hims,2 Noam Shomron,3 Christopher B. Burge,3 Susan A. Slaugenhaupt,2* and Robin Reed1*

1Department of Cell Biology, Harvard Medical School, Boston, Massachusetts; 2Center for Human Genetic Research, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts; 3Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts

Communicated by Michel Goossens

Splicing mutations that lead to devastating genetic diseases are often located in nonconserved or weakly conserved sequences that normally do not affect splicing. Thus, the underlying reason for the splicing defect is not immediately obvious. An example of this phenomenon is observed in the neurodevelopmental disease familial dysautonomia (FD), which is caused by a single-base change in the 5′ splice site (5′ss) of intron 20 in the IKBKAP gene (c.2204+6T>C). This mutation, which is in the sixth position of the intron and results in exon 20 skipping, has no effect in many other introns. To determine why the position 6 mutation causes aberrant splicing only in certain cases, we first used an in silico approach to identify potential sequences involved in exon 20 skipping. Computational analyses of the exon 20 5′ss itself predicted that this nine-nucleotide splicing signal, even when it contains the T>C mutation, is not sufficiently weak to explain the FD phenotype. However, the computational analysis predicted that both the upstream 3′ splice site (3′ss) and exon 20 contain weak splicing signals, indicating that the FD 5′ss, together with the surrounding splicing signals, is not adequate for defining exon 20. These in silico predictions were corroborated using IKBKAP minigenes in a new rapid and simple in vitro coupled RNA polymerase (RNAP) II transcription/splicing assay. Finally, the weak splicing signals that flank the T>C mutation were validated as the underlying cause of familial dysautonomia in vivo using transient transfection assays. Together, our study demonstrates the general utility of combining in silico data with an in vitro RNAP II transcription/splicing system for rapidly identifying critical sequences that underlie the numerous splicing diseases caused by otherwise silent mutations. Hum Mutat 28(1), 41–53, 2007. © 2006 Wiley-Liss, Inc.

KEY WORDS: RNA splicing; 5′ splice site; exon skipping; familial dysautonomia; IKBKAP

The Supplementary Material referred to in this article can be accessed at http://www.interscience.wiley.com/jpages/1059-7794/suppmat.
Received 10 May 2006; accepted revised manuscript 28 June 2006.
*Correspondence to: Robin Reed, Harvard Medical School, Department of Cell Biology, LHRRB 501, 240 Longwood Avenue, Boston, MA 02115. E-mail: [email protected]; and Susan A. Slaugenhaupt, Center for Human Genetic Research, Simches Research Center Room 5254, 185 Cambridge Street, Boston, MA 02114. E-mail: [email protected]
Grant sponsors: Harvard Center for Neurodegeneration and Repair; Dysautonomia Foundation, Inc.
El Chérif Ibrahim's current address: NICN-UMR 6184 CNRS, Faculté de Médecine Nord-IFR Jean Roche, Bd Pierre Dramard, 13916 Marseilles Cedex 20, France.
DOI 10.1002/humu.20401
Published online 8 September 2006 in Wiley InterScience (www.interscience.wiley.com).
© 2006 WILEY-LISS, INC.

INTRODUCTION

Approximately 10% of all mutations that cause human genetic disease are due to splicing defects [Stenson et al., 2003]. Of these mutations, 60% are located in the highly conserved GU and AG dinucleotides in the 5′ and 3′ splice sites (5′ss and 3′ss). In these cases, the underlying basis for the splicing defect is understood, as the GU and AG are essential for splicing catalysis. Single-base changes in the less well-conserved nucleotides in the splice sites are also the cause of numerous genetic diseases. For example, a T>C change in position 6 of the 5′ splice site (5′ss) results in β-thalassemia, familial growth hormone deficiency, and Ehlers-Danlos syndrome type IV [Cogan et al., 1994; Lloyd et al., 1993; Treisman et al., 1983]. The autosomal recessive neurodegenerative disease familial dysautonomia (FD; MIM# 223900) is another example of such a splicing defect. More than 99.5% of FD patients are homozygous for an intron 20 (c.2204+6T>C) mutation in the inhibitor of kappa light polypeptide gene enhancer in B-cells, kinase complex-associated protein gene (HUGO symbol, IKBKAP; MIM# 603722) [Anderson et al., 2001; Slaugenhaupt et al., 2001]. This gene, which encodes the 160-kD IKAP protein, contains 36 constitutively spliced introns. IKAP is a component of a six-subunit complex, designated Elongator, which is conserved from yeast to human, and is thought to function in transcription [Hawkes et al., 2002; Li et al., 2001; Otero et al., 1999]. At present, the role of IKAP and the human elongator complex is not clear [Krogan et al., 2002; Pokholok et al., 2002]. Recently, it was shown that the yeast homolog of IKBKAP, Elp1, physically interacted with the yeast Rab guanine nucleotide exchange factor Sec2 [Rahl et al., 2005]. The Sec2 interaction domain of Elp1 was necessary for both Elp1 function and for the polarized localization of Sec2. Thus, it was proposed that FD may be caused by a defect in polarized exocytosis in neurons [Rahl et al., 2005].

The c.2204+6T>C FD mutation in IKBKAP results in exon 20 skipping [Cuajungco et al., 2003; Slaugenhaupt et al., 2001]. As this nucleotide is a C in many introns that are spliced normally, the underlying basis for the splicing defect is not clear [Okubo et al., 2002; Yabuta et al., 2002]. Insight into how the +6T>C affects splicing in FD came from recent studies by Carmel et al. [2004], who transfected IKBKAP minigenes containing mutant and FD 5′ss into a human embryonic kidney cell line. This analysis revealed that exon 20 skipping can be corrected by improving base-pairing of the FD 5′ss to U1 small nuclear RNA (snRNA) but not to U6 snRNA [Carmel et al., 2004]. Thus, at least for FD and possibly for the other diseases, the T>C change impacts splicing during the initial recognition of the 5′ss by U1 snRNA, one of the earliest steps in spliceosome assembly. This result indicates that the other nucleotides that comprise the exon 20 5′ss itself are inherently weak for recruiting U1 snRNP, and/or that remote signals in the pre-mRNA are weak for recruiting this snRNP to the exon 20 5′ss.

To identify as many therapeutic targets as possible for splicing diseases, it is critical to identify all of the sequences in the pre-mRNA that contribute to the aberrant splicing. In the case of splicing diseases caused by mutation in the essential GU or AG at the splice sites, the only viable option is to restore the function of these nucleotides, as they are required for splicing catalysis. In marked contrast, splicing mutations in weakly conserved elements are potentially more tractable for intervention because many other sequences, and hence druggable targets, can affect function of the weak element. At present, however, a significant stumbling block has been the lack of a rapid and reliable method for pinpointing and validating the sequences that underlie otherwise silent splicing mutations. Here we carried out an in silico analysis of sequences in the IKBKAP gene to identify potential sequences that contribute to exon 20 skipping in FD. These data predicted that the FD 5′ss itself is not inherently weak compared to many 5′ss but is instead inactive because it is surrounded by weak splicing signals in intron 19 as well as in exon 20. These in silico predictions were tested in a new efficient in vitro RNA polymerase (RNAP) II transcription/splicing assay [Das et al., 2006]. The role of the weak signals in causing FD was then validated using in vivo minigene transfection assays. Finally, the possibility of using the weak signals as targets for therapeutic intervention was established by showing that exon 20 is included when the FD mutant pre-mRNA is hybridized to a bifunctional RNA oligonucleotide, which base-pairs to exon 20 and also contains serine/arginine-rich (SR) protein binding sites. Our study demonstrates a rapid approach for identifying weak splicing signals as potential therapeutic targets for the many splicing diseases, such as familial dysautonomia, that are caused by otherwise silent mutations.

MATERIALS AND METHODS

Minigene Constructs

Constructs containing IKBKAP genomic DNA (RefSeq NM_003640.2) from exon 19 to the 16 first nucleotides of intron 21, with or without the FD mutation, were previously described [Slaugenhaupt et al., 2004]. The constructs are designated 19–21 wild-type-control (19–21WT-control) and 19–21FD-control, respectively, and were linearized by digestion with Xho I (New England Biolabs, Ipswich, MA; www.neb.com). The adenovirus 2 major late intron was amplified from a plasmid (pAdML) [Bennett et al., 1992] by PCR and was inserted into the WT-control and FD-control constructs to replace IKBKAP intron 20 and to generate 19–21/AdMLWT-control and 19–21/AdMLFD-control. These constructs were linearized by digestion with Xba I (New England Biolabs). A series of artificial point mutations in the control and AdML-control constructs were created by PCR-mediated site-directed mutagenesis using PfuUltra polymerase (Stratagene, La Jolla, CA; www.strtagene.com). Details of the constructs are shown in Figures 2–6.

The Hind III-BamH I fragment containing the first two exons and the first intron sequence of the human β-globin gene was subcloned into the Hind III-BamH I multicloning site of a pCDNA3.1 vector containing both a cytomegalovirus (CMV) promoter and a bacteriophage T7 promoter and was named βG. Nucleotides 39–56 of β-globin exon 2 were replaced by PCR-mediated site-directed mutagenesis with size-matched and overlapping fragments of IKBKAP exon 20, covering exon positions 2 to 71 (see Fig. 5). The following constructs were generated and linearized with BamHI: 2–19, 12–29, 22–39, 32–49, 42–59, and 54–71. All of the minigenes were sequenced.

5′ss Analysis

The Hollywood database (http://hollywood.mit.edu) was used to query 151,199 internal exons for their 5′ss sequences and their splicing patterns based on expressed sequence tag (EST) evidence [Holste et al., 2006].

Calculation of Splice-Site Strength, Prediction of Exonic Splicing Enhancers, and Motifs Involved in Exon Skipping

Splice-site scores were calculated using either the algorithm from Shapiro and Senapathy [1987] and the tables based on the analysis of 45,552 splice sites [Carmel et al., 2004] (http://ast.bioinfo.tau.ac.il) or the maximum entropy model (MAXENT) method [Yeo and Burge, 2004] (http://genes.mit.edu/burgelab/maxent/Xmaxentscan_scoreseq.html). Exonic splicing enhancer (ESE) predictions were based on previously described matrices [Cartegni et al., 2003] and were obtained using the web tool ESEfinder 2.0 (http://rulai.cshl.edu/tools/ESE). The relative ESE frequency was defined as the number of ESE motifs whose scores are above the thresholds (SF2/ASF, 1.956; SC35, 2.383; SRp40, 2.67; SRp55, 2.676), divided by the exon size.

Coupled RNAP II Transcription and Splicing

The coupled RNAP II transcription and splicing system has been previously described [Das et al., 2006]. Briefly, 200 ng of linearized DNA template was incubated in a 25-μl reaction mixture containing 15 μl HeLa-S3 nuclear extract (National Cell Culture Center, Minneapolis, MN; www.nccc.com), 0.5 mM ATP, 20 mM creatine phosphate, 3.2 mM MgCl2, 60 mM KCl, 12 mM Tris-HCl pH 7.6, and 10 μCi of [α-32P]UTP (800 Ci/mmol; Perkin Elmer, Wellesley, MA; perkinelmer.com) at 30°C for 15–30 min before adding 1 μl of 250 μg/ml actinomycin D (Sigma, St. Louis, MO; www.sigmaaldrich.com) to block transcription. The splicing reaction was allowed to proceed for an additional 20–75 min at 30°C. Total RNA was then isolated and resuspended in water or in formamide dye before fractionating on 6.5% (19:1) 7 M urea polyacrylamide gels. RNAs were visualized and quantified using a Molecular Imager device (Bio-Rad, Hercules, CA; www.bio-rad.com) and Quantity One software (Bio-Rad). The identity of the RNA species generated from the two intron–three exon


FIGURE 1. In silico analysis of the splicing signals surrounding the FD mutation. A: Schematic of the splicing signals in the 1.7-kb region surrounding the FD mutation. Boxes and uppercase letters represent exons. Lowercase letters and lines represent introns. An arrow designates the FD mutation. The 5′ss and 3′ss scores calculated by the Shapiro and Senapathy matrix (S&S) or the maximum entropy model (MAXENT) are indicated below the exons. For comparison, maximal scores for 5′ss are 100 and 11.8 according to S&S and MAXENT, respectively. Maximal scores for 3′ss are 100 and 15.9 according to S&S and MAXENT, respectively. ESE frequency was predicted using the ESEfinder web tool. Putative weak splicing signals are indicated in bold. B: Sequence of IKBKAP exon 20. Putative ESEs are underlined (ESE1–4 are putative binding sites for SRp40, SRp55, SC35, and SF2/ASF, respectively) and the two PyUAG motifs (ESSs) are overlined. ESS1 and ESE1 are predicted to be overlapping, as observed previously [Han et al., 2005; Hovhannisyan and Carstens, 2005; Pagani et al., 2003; Zahler et al., 2004; Zhu et al., 2001].

substrates, including a number of lariat species, was determined by cloning and analyzing separately the in vitro spliced products of substrates lacking either one or both introns (data not shown). Statistical comparisons were made using a one-tailed Student's t test. Criteria for significance were calculated using a Bonferroni correction for multiple comparisons by dividing the initial P value of 0.05 by the number of comparisons made.

Cell Culture and Transfection

Minigene constructs were transiently transfected into HEK293 cells using Genejuice (Novagen, San Diego, CA; emdbiosciences.com) as directed by the manufacturer. Cells were harvested 24 hr posttransfection and RNA was isolated using the RNeasy extraction kit (Qiagen, Valencia, CA; www.qiagen.com).

RT-PCR

First-strand cDNA was synthesized either by random hexamer-primed reverse transcription with Moloney murine leukemia virus reverse transcriptase (Invitrogen, Carlsbad, CA; www.invitrogen.com) for in vitro experiments, or by using Superscript II reverse transcriptase (Invitrogen) and a gene-specific primer, bGH reverse (TAGAAGGCACAGTCGAGG), for in vivo experiments. Twenty-six cycles of PCR incorporating [α-32P]dCTP were then performed with Herculase DNA polymerase (Stratagene) using either IKBKAP exon 19–specific primer 2652 (5′-CCTGAGCAGCAATCATGTG) and the IKBKAP exon 21–intron 21–specific primer 2914 (5′-CAGCTTAGAAAGTTACCTTAG) or the universal T7 and bGH reverse primers as follows: 94°C for 30 s; 56°C for 30 s; and 72°C for 35 s. The products were resolved on an 8% polyacrylamide gel.

Bifunctional RNA Oligonucleotide

A series of RNA oligonucleotides containing both 2′-O-methyl and phosphorothioate modifications were obtained from Dharmacon (Lafayette, CO; www.dharmacon.com). These oligonucleotides (38 nt) were complementary to nucleotides +39 to +53 of IKBKAP exon 20 and also carried a tail containing two SR protein (SF2/ASF) binding sites. Controls had impaired or no base-pairing potential with the IKBKAP exon 20 or lacked the SR binding sites. The oligonucleotides were added at concentrations of 0, 125, 250, 500, 750, 1,000, 1,500, and 2,000 nM at the same time as CMV DNA constructs in the coupled RNAP II transcription-splicing system. The complementary (antisense) RNA sequence is in uppercase and the tail region is in lowercase. The letters "m" and "*" refer to 2′-O-methyl and phosphorothioate chemical groups, respectively:

Exonic splicing silencer (ESS)-SR, 5′-mUmG*mA*mG*mCmUmAmAmA*mA*mC*mC*mA*mGmGa*caggaggcaggaggcagg*a*g*g*a-3′;
mut4ESS-SR, 5′-mUmG*mA*mG*mGmUmAmUmA*mA*mC*mG*mA*mCmGa*caggaggcaggaggcagg*a*g*g*a-3′;
mut7ESS-SR, 5′-mUmC*mA*mC*mGmUmAmUmA*mA*mG*mG*mA*mCmGa*caggaggcaggaggcagg*a*g*g*a-3′;
scram-SR, 5′-mCmA*mG*mA*mUmGmCmGmG*mC*mA*mU*mC*mAmUa*caggaggcaggaggcagg*a*g*g*a-3′;
NT, 5′-mUmG*mA*mG*mCmUmAmAmA*mA*mC*mC*mA*mGmG-3′;
ESS-A1, 5′-mUmG*mA*mG*mCmUmAmAmA*mA*mC*mC*mA*mGmGu*uaggguuagggguuaggg*u*u*a*g-3′.

RESULTS

Computational Analysis of the FD and Normal Exon 20 5′ss

To determine why the intron +6T>C mutation causes aberrant splicing when present in IKBKAP pre-mRNA, we first carried out a computational analysis of the exon 20 5′ss. We used the Shapiro and Senapathy matrix (S&S) method, which reflects the degree of conservation in different positions resulting from alignment of >45,000 5′ss [Carmel et al., 2004; Shapiro and Senapathy, 1987]. This well-established scoring method assumes independence between individual positions of the 5′ss motif [Senapathy et al., 1990]. MAXENT, which integrates position-specific biases with dependencies between different positions, was also used [Yeo and Burge, 2004]. As expected, these analyses revealed that the scores of the FD 5′ss (CAAguaagc, 81-S&S; 7.5-MAXENT) are lower than those of its WT counterpart (CAAguaagu, 88-S&S; 10.1-MAXENT) (Fig. 1A).
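As a small illustration of the oligonucleotide design described in Materials and Methods, the sketch below takes the antisense (uppercase) portion of the NT and ESS-SR sequences, strips the "m" and "*" modification marks, and derives the exon sequence it would base-pair with. It then scans that target for the PyUAG core motif. The function names and the regular expression are illustrative choices, not part of the published methods, and the sketch does not determine which of the two exon 20 PyUAG motifs this is.

```python
import re

# Antisense (exon-binding) portion shared by the NT and ESS-SR oligonucleotides,
# as listed in Materials and Methods, with "m" and "*" modification marks removed.
ANTISENSE = "UGAGCUAAAACCAGG"

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the RNA strand that base-pairs with `rna` (Watson-Crick only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


# PyUAG core motif: a pyrimidine (C or U) followed by UAG.
# The lookahead lets overlapping occurrences be reported as well.
PYUAG = re.compile(r"(?=([CU]UAG))")


def find_pyuag(rna: str) -> list[tuple[int, str]]:
    """Return (0-based offset, matched sequence) for every PyUAG occurrence."""
    return [(m.start(), m.group(1)) for m in PYUAG.finditer(rna.upper())]


target = reverse_complement(ANTISENSE)
print(target)              # CCUGGUUUUAGCUCA
print(find_pyuag(target))  # [(7, 'UUAG')]
```

Under these assumptions the 15-nt target contains a single PyUAG occurrence, which is consistent with the "ESS" label given to these oligonucleotides.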


To obtain additional information on the strength of the FD and WT 5′ss, we extracted sequences identical to these sites from the Hollywood database, which is a compilation of 151,000 5′ss, of which more than 114,000 flank constitutive exons [Holste et al., 2006]. Analysis of this dataset revealed that both the WT and FD sequences are frequently used as constitutive 5′ss (Table 1), ranking 20th and 225th most common, respectively, out of the set of more than 8,000 9-mers that are used as 5′ss in the constitutive exon dataset (data not shown). In this dataset, the WT sequence is found a total of 747 times adjacent to constitutive exons, whereas the FD sequence is present 127 times (Supplementary Table S1; available online at http://www.interscience.wiley.com/jpages/1059-7794/suppmat). Thus, the FD 5′ss is functional in many introns. Both the WT and the FD sequences are also more frequently found in constitutive exons than in alternatively spliced (skipped) exons (Table 1). This observation supports the conclusion that the FD site is not the sole reason for exon 20 skipping. We conclude that sequence elements in IKBKAP pre-mRNA, other than those immediately present within the FD 5′ss, underlie this splicing disease.

In Silico Analysis Predicts That Weak Splicing Signals Flank the FD Mutation

To identify potential splicing signals that may be the underlying cause of the inefficient use of the FD 5′ss, we first calculated the strength of the 3′ss upstream of the FD mutation using the S&S matrix and MAXENT (Fig. 1B) [Shapiro and Senapathy, 1987; Yeo and Burge, 2004]. This analysis revealed that the intron 19 3′ss is clearly weak, as its MAXENT score of 6.4 is well below the average score of 8.4 for internal constitutive exons in the Hollywood database and even well below the MAXENT score of 7.4 for internal skipped exons [Holste et al., 2006]. In contrast, the intron 20 3′ss (86-S&S; 12-MAXENT), which is not predicted to affect exon 20 definition, has scores that are well above average in this dataset. We conclude that the intron 19 3′ss is a candidate weak element for exon 20 inclusion in FD.

We next asked whether exon 20 itself contains splicing signals that may impact use of the FD 5′ss. To predict ESEs, we used the ESEfinder matrices (http://rulai.cshl.edu/tools/ESE). This program predicts binding sites for the canonical SR members, SF2/ASF, SC35, SRp40, and SRp55 [Cartegni et al., 2003; Wang et al., 2005]. In the Hollywood database, the average ESE frequency in the 747 constitutive exons that are flanked by a sequence that is identical to the exon 20 WT 5′ss (CAAguaagu) is 0.127 (frequency is calculated as the number of ESEs divided by exon length; data not shown). Significantly, the ESE frequency in exon 20 of IKBKAP, which flanks the FD 5′ss, is much lower, at only 0.054 (Fig. 1A), indicating that this exon is weak. We also looked for the ESS core motif pyrimidine (Py)UAG in exon 20 [Bilodeau et al., 2001; Si et al., 1997, 1998]. Notably, two copies of this motif are present in this exon, resulting in a PyUAG frequency of 0.027 (Fig. 1A and B). This is the highest PyUAG frequency of all of the IKBKAP exons, most of which contain none or only one PyUAG motif. Moreover, in the Hollywood database, the average PyUAG frequency within the 127 constitutive exons flanked by an FD 5′ss (CAAguaagc) is 0.0026 (data not shown), and only one exon was identified with a higher PyUAG frequency than exon 20 (data not shown). We conclude that the PyUAG frequency of 0.027 in exon 20 of IKBKAP is much higher than found in most exons.

Collectively, our in silico data revealed that the FD 5′ss itself is somewhat weaker than the average 5′ss. In addition, the sequence elements that are critical for recognition of this splice site, and hence for defining exon 20, are weak. In particular, exon 20 itself is predicted to be weak, as it contains a lower than average ESE frequency combined with a higher than average PyUAG frequency. In addition, the 3′ss of intron 19, which borders the weak exon 20, is also weak. We next sought to test these in silico predictions by carrying out splicing assays using minigenes containing specific sequence alterations in the regions of interest.

Coupled RNAP II Transcription/Splicing Assay System to Detect Effects of IKBKAP Pre-mRNA Alterations on Inclusion of Exon 20

As an initial assay to monitor IKBKAP exon 20 inclusion, the CMV promoter was fused to a three-exon minigene that contains the FD or WT genomic sequence of IKBKAP exons 19–21 (Fig. 2A). Consistent with previous studies [Slaugenhaupt et al., 2004], when these 19–21 minigenes were transfected into HEK 293 cells, exon 20 is efficiently included in the WT minigenes, but mostly skipped in the FD minigenes (Fig. 2B, lanes 1 and 4). Moreover, as reported previously [Carmel et al., 2004], the level of exon 20 inclusion is increased when the base-pairing potential with U1 snRNA (U1+) is increased, but not when the base-pairing potential with U6 snRNA (U6+) is increased (Fig. 2A; Fig. 2B, lanes 4–6).

FIGURE 2. Splicing of FD minigenes in vivo and in vitro. A: Minigene schematic. The region where nucleotide changes were made is indicated by the circle. An alignment of the control and mutant sequences at the exon/intron 20 boundary is shown below. Uppercase letters represent exon 20 nucleotides. The scores of the 5′ss are indicated in the normal (WT) and the FD context (FD). An arrow designates the FD mutation. B: In vivo spliced products were amplified by RT-PCR and resolved on native polyacrylamide gels. The percentage of exon 20 inclusion is indicated. The upper band is the mRNA containing exon 20, and the lower band is the mRNA lacking exon 20. C: In vitro spliced products were amplified by RT-PCR and resolved on native polyacrylamide gels. D: Spliced products were directly separated on a denaturing polyacrylamide gel. The schematics between the gels indicate the pre-mRNA, the splicing final products, and the lariat intermediates.

After this analysis validated the CMV IKBKAP minigenes for studies of mutations that affect FD splicing, we used them in a new coupled RNAP II in vitro transcription/splicing system, which splices pre-mRNA with significantly higher efficiency than the standard T7 transcript in vitro splicing system (see Materials and Methods) [Das et al., 2006]. In the coupled transcription/splicing system, the CMV promoter is used to transcribe pre-mRNAs with RNAP II, and these transcripts are then spliced. As shown in Figure 2C, we found that both the WT and FD pre-mRNAs containing the 19–21 minigene were not efficiently spliced in vitro, even using the coupled system. Indeed, the splicing products were only detectable by RT-PCR. Moreover, in contrast to the in vivo results, we found a high level of exon 20 skipping with the WT pre-mRNA, even in the presence of the mutation that increases base-pairing to U1 snRNA (Fig. 2C, lanes 1 and 2). The skipping was about three-fold greater in the presence of the FD mutation (Fig. 2C, lanes 4 and 5). We note, however, that this high level of exon 20 skipping observed even with the WT pre-mRNA is consistent with our in silico analysis, indicating the splicing signals in and around exon 20 are weak (Fig. 1A). Less efficient splicing in vitro vs. in vivo is usually observed with pre-mRNAs other than well-characterized model splicing substrates with strong splice sites [Cartegni and Krainer, 2002; Côté et al., 1999; Gabut et al., 2005; Jiang et al., 2000; Liu et al., 2001; Skordis et al., 2003]. This may be due to lower concentrations of splicing factors or a different ratio of splicing factors to inhibitory factors in vitro vs. in vivo.

FIGURE 3. Coupled RNAP II transcription/splicing system for analysis of FD exon 20 skipping in vitro. A: Alignment of the sequences at the exon/intron 20 and exon/intron 21 boundaries. Uppercase letters represent exon nucleotides. The scores of the 5′ss and 3′ss are indicated. B: Spliced products were directly separated on a denaturing polyacrylamide gel. The schematics on the left of the gel indicate the pre-mRNA, the splicing products, and the lariat intermediates. In the minigene schematics left of the gel, circles flanking exon 20 and 21 indicate where the nucleotide changes were introduced. The percentage of exon 20 inclusion is indicated. C: Spliced products were directly separated on a denaturing polyacrylamide gel. The schematics between the gels indicate the pre-mRNA, the splicing final products, and the lariat intermediates.

To determine whether we could increase the efficiency of in vitro splicing of the 19–21 minigene, we substituted the whole IKBKAP intron 20 (1,072 nt) with the 120-nt AdML intron (Fig. 3A; 5′ss+3′ss AdML). As shown in Fig. 3B, the pre-mRNA was now so efficiently spliced in the in vitro RNAP II coupled transcription/splicing system that we were able to detect splicing products directly on a gel (Fig. 3B, lane 1). The increased efficiency may be due to the smaller size and/or the sequence of the AdML intron. In light of this observation, we next created minigenes designated 19–21/AdML, in which the entire intron 20 is replaced with the AdML intron but the WT and FD IKBKAP 5′ss are now present (Fig. 3A, WT cntl and FD cntl, respectively). For both the WT and FD 19–21/AdML constructs, the splicing products are readily detectable on a gel (Fig. 3B, lanes 2 and 3). As observed with the 19–21 minigenes, exon 20 skipping is greater


in the presence of the FD 5′ss than the WT 5′ss (Fig. 3B; compare lanes 2 and 3). This observation suggested that the 19–21/AdML minigenes were valid models to investigate FD exon skipping in vitro. This minigene model was further validated by the observation that exon 20 inclusion is affected similarly by the intron 20 5′ss changes that affect inclusion both in vivo and in vitro with the full-length IKBKAP intron 20. That is, an increase in inclusion was observed for U1+, and exon skipping was observed for U6+ (compare Fig. 2B and C with Fig. 3C). We conclude that our in vitro system is sufficiently sensitive for identifying sequences that affect exon 20 inclusion. The advantage of the in vitro RNAP II coupled transcription/splicing assay using the 19–21/AdML constructs is that it is rapid, efficient, and simple. The sequences identified in this assay can then be tested both in vitro and in vivo using the minigenes containing the natural IKBKAP intron 20.

FIGURE 4. 3′ss changes in intron 19 promote exon 20 inclusion. A: Alignment of the control and mutant sequences at the intron 19/exon 20 boundaries. The uppercase letter represents the first nucleotide of exon 20. The scores of the 3′ss are indicated. B: In vitro spliced products were directly separated on a denaturing polyacrylamide gel. The schematics between the gels indicate the pre-mRNA, the splicing final products, and the lariat intermediates. C: In vitro spliced products were amplified by RT-PCR and resolved on native polyacrylamide gels. The upper band is the mRNA containing exon 20, and the lower band is the mRNA lacking exon 20. D: In vivo spliced products were amplified by RT-PCR and resolved on native polyacrylamide gels. The percentage of exon 20 inclusion is indicated. In the minigene schematics above the gels, a circle flanking exon 20 indicates where the nucleotide changes were introduced.

The Intron 19 3′ss Contributes to Exon 20 Skipping

After validating the in vitro system for our analysis, we investigated the signals identified as weak using the in silico data. To determine whether the intron 19 3′ss is a contributing element in exon 20 skipping, we replaced one, three, or four of the adenosines in the pyrimidine tract (PyT) with pyrimidines (Fig. 4A). These changes were introduced into the 19–21/AdML minigenes and assayed using the in vitro RNAP II coupled transcription/


FIGURE 5. Replacement of specific nucleotides within exon 20 promotes its inclusion. A: Alignment of the control and mutant sequences of exon 20. PyUAG motifs (ESS) are circled, and overlined sequences indicate putative ESEs. B: In vitro spliced products derived from constructs with AdML intron were directly separated on denaturing polyacrylamide gels. Schematics between the gels indicate the pre-mRNA and the splicing final products. C: In vitro spliced products derived from constructs with IKBKAP intron 20 and including (upper band) or skipping (lower band) exon 20 were amplified by RT-PCR and separated on native polyacrylamide gels. D: In vivo spliced products were amplified by RT-PCR and resolved on native polyacrylamide gels. The percentage of exon 20 inclusion is indicated.


FIGURE 6. Multiple sequence elements affect exon 20 inclusion. A: For each construct, the mutations introduced and the resulting splice-site scores are indicated. B: Spliced products derived from constructs with AdML intron were directly separated on denaturing polyacrylamide gels. Schematics between the gels indicate the pre-mRNA and the splicing final products. The percentage of exon 20 inclusion is indicated. C: Spliced products derived from constructs with IKBKAP intron 20 were amplified by RT-PCR and resolved on native polyacrylamide gels.

splicing assay. For the WT pre-mRNA, exon 20 inclusion was dramatically enhanced with all of the changes, with the increases directly correlating with the respective 3′ss scores (Fig. 4A and B, lanes 1–4). Specifically, exon 20 inclusion was the highest with Py4 (96%) and Py1 (92%), with Py3 much less (64%). For the FD pre-mRNA, Py4 and Py1 promoted high levels of exon 20 inclusion, whereas the effect of Py3 was significantly less (Fig. 4B, lanes 5–8). This result is consistent with previous studies showing that the nucleotide immediately upstream of the AG is more highly conserved than the pyrimidine tract [Lim and Burge, 2001] and is critical for efficient splicing, with YAG being much more efficient than RAG [Smith et al., 1993]. The exact sequence of the pyrimidine tract itself is less important [Roscigno et al., 1993].

TABLE 1. Comparative Frequency of WT and FD-Type of 5′ss Among the Different Categories of Spliced Exons*

Exon category               WT CAAguaagu    FD CAAguaagc
Constitutively spliced      747             127
Alternatively skipped       60              16
5′ alternatively spliced    90              12
3′ alternatively spliced    139             36

*Around 151,000 internal exons were queried for their 5′ss sequence and their splicing history, i.e., whether they have been observed to be constitutively or alternatively spliced.

To determine whether the intron 19 3′ss changes would have the same effect in the context of the natural intron 20, we examined these mutations in the 19–21 minigenes using the in vitro coupled transcription/splicing system and RT-PCR. As shown

Human Mutation DOI 10.1002/humu HUMAN MUTATION 28(1), 41^53, 2007 49

FIGURE 7. Bifunctional RNA oligonucleotides promote exon 20 inclusion. A: Schematic representation of RNA oligonucleotides. B: The DNA construct was transcribed and spliced in vitro in the presence of increasing amount of ESS-SR (0,125,250,500,750,1,000, 1,500, and 2,000 nM). Spliced products were directly separated on a denaturing polyacrylamide gel (left panel). Schematics on the right of the gel indicate the pre-mRNA and splicing products. C: In vitro coupled transcription and splicing in the presence of increas- ing amounts (125,250,500,750,1,000,1,500, and 2,000 nM) of oligonucleotides not able to hybridize perfectly to the pre-mRNA and/ or to recruit SR . D:The ratios of spliced products including and skipping exon 20 after addition of each RNA oligonucleotide at increasing concentrations and in three independent experiments are plotted. in Figure 4C, the same results were obtained as with the 19–21/ Significantly, all three of the 30ss changes result in inclusion AdML minigenes. We conclude that the intron 19 30ss is a key of exon 20 (Fig. 4D). As was observed in vitro, the Py4 and Py1 element for inclusion of exon 20 during splicing in vitro. mutations have the most potent effects. We conclude that the We next investigated whether these intron 19 30ss changes weak 30ss of intron 19 is an important element that contributes to affect exon 20 inclusion during splicing of the FD minigene in vivo exon 20 skipping during splicing of FD pre-mRNA in vivo as well after transfection of the appropriate FD 19–21 minigenes. as in vitro.
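The counts in Table 1 can be converted into the share of FD-type 5′ss within each exon category, which makes the categories easier to compare. The sketch below is a descriptive calculation only and is not the authors' analysis; the names `TABLE_1` and `fd_type_share` are hypothetical, and the numbers are copied from the table.

```python
# Counts copied from Table 1: (WT CAAguaagu, FD-type CAAguaagc).
TABLE_1 = {
    "Constitutively spliced": (747, 127),
    "Alternatively skipped": (60, 16),
    "5' alternatively spliced": (90, 12),
    "3' alternatively spliced": (139, 36),
}


def fd_type_share(wt_count: int, fd_count: int) -> float:
    """Percentage of exons in a category that carry the FD-type 5'ss."""
    total = wt_count + fd_count
    if total == 0:
        raise ValueError("category has no exons")
    return 100.0 * fd_count / total


for category, (wt, fd) in TABLE_1.items():
    print(f"{category}: {fd_type_share(wt, fd):.1f}% FD-type")
```

With the Table 1 counts, this gives roughly 14.5% for constitutively spliced exons and 21.1% for alternatively skipped exons. The calculation implies nothing about statistical significance.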


Sequence Elements in Exon 20 Contribute to Exon Skipping

Our in silico analysis predicted that exon 20 is inherently weak, containing at least two ESSs and only four ESEs (Fig. 1C). To test for a role of this exon in its own inclusion, we disrupted the potential ESSs and/or strengthened ESE1 to generate the minigenes shown in Figure 5A. In the absence of any changes, exon 20 inclusion is 12% for WT and 0.4% for FD, using the 19–21/AdML minigenes for coupled in vitro RNAP II transcription/splicing (Fig. 5B, lanes 1 and 7). In the presence of a single-base change in ESS1, exon 20 inclusion is stimulated two- to three-fold for both WT and FD (Fig. 5B, lanes 1, 2, 7, and 8). A single-base change that disrupts ESS2 has a much more potent effect, stimulating exon 20 inclusion approximately four-fold for WT and 15-fold for FD (Fig. 5B, lanes 1, 3, 7, and 9). Finally, combining these changes has the strongest effect, resulting in approximately six-fold and 25-fold increases in inclusion for WT and FD, respectively (Fig. 5B, lanes 1, 4, 7, and 10).

As shown in the schematic (Fig. 5A), ESS1 and ESE1 consist of a putative overlapping silencer and enhancer. When a 2-nt substitution was made in this ESS1/ESE1 sequence that is predicted both to disrupt the silencer and to strengthen the enhancer, a strong enhancement of exon 20 inclusion was observed, yielding 67% and 12% inclusion for WT and FD, respectively (Fig. 5B, lanes 5 and 11). This enhancement was further increased by combining the ESS1/ESE1 and ESS2 mutations, which resulted in almost complete inclusion of exon 20 for WT and 50% inclusion for FD (Fig. 5B, lanes 6 and 12).

To determine how the exon 20 changes would affect exon 20 inclusion in the presence of natural intron 20, we examined these changes in the 19–21 minigenes using the coupled RNAP II in vitro transcription/splicing system and RT-PCR. As was observed for both the 5′ss and 3′ss changes (Figs. 3 and 4), the exon changes all had the same relative effects on exon 20 inclusion with the 19–21 pre-mRNAs (Fig. 5C). We conclude that strengthening exon 20 by eliminating ESSs and/or strengthening an ESE promotes inclusion of exon 20 in vitro.

Analysis of the exon changes in vivo by transfecting the 19–21 FD minigenes showed that ESS2, ESS1/ESE1, and ESS1/ESE1/ESS2 had relative effects similar to those observed in vitro (Fig. 5D, lanes 3, 5, and 6). In contrast, ESS1-2, which had an effect similar to ESS2 and ESS1/ESE1 in vitro (Fig. 5C, lanes 3–5 and 9–11), was as potent in stimulating inclusion as ESS1/ESE1/ESS2 in vivo (Fig. 5D, lanes 4 and 6). This observation may be explained by differences in relative concentrations in vivo vs. in vitro of the factors involved, such as SR proteins and heterogeneous nuclear ribonucleoprotein (hnRNP) proteins [Hanamura et al., 1998; Hou et al., 2002; Kamma et al., 1995, 1999; Pollard et al., 2000].

As an alternative approach for identifying potential ESSs in exon 20, we divided this exon into six overlapping fragments of equal size and used them to replace exon 2 sequences in a β-globin minigene in a location that has no known positive or negative role in β-globin splicing [Schaal and Maniatis, 1999] (Supplementary Fig. S1A). Consistent with our predictions, pre-mRNAs containing ESE2 or ESE4 were spliced more efficiently than β-globin, whereas the pre-mRNA containing ESS2 was spliced approximately three-fold less efficiently than β-globin. The pre-mRNA containing ESS1 was spliced with the same efficiency as β-globin, possibly because this ESS alone is not sufficiently strong, as was observed in the 19–21 and 19–21/AdML minigenes (see Fig. 5).

Combinations of Changes Promote Efficient Exon 20 Inclusion

Efficient recognition of exons (i.e., exon definition) requires multiple interactions between factors bound to the exon as well as to the 5′ss and 3′ss flanking the exon [Boukis et al., 2004; Côté et al., 1995; Graveley et al., 2001; Kohtz et al., 1994; Lam and Hertel, 2002; Lavigueur et al., 1993; Wang et al., 1995; Wu and Maniatis, 1993; Zuo and Maniatis, 1996]. Consistent with this model, combining two or more of the changes that strengthened the weak signals in and around exon 20 (Fig. 6A) results in a strong enhancement of splicing efficiency and exon 20 inclusion (Fig. 6B and C). Significantly, these data indicate that multiple weak elements must be simultaneously strengthened in order to result in high levels of exon 20 inclusion when the FD 5′ss is present (Fig. 4B, lane 8; Fig. 6B, lanes 7–10). In contrast, strengthening just one weak element is sufficient to result in high levels of exon 20 inclusion when the WT 5′ss is present (Fig. 4B, lane 2; Fig. 5B, lane 3). Thus, these data underscore the importance of the sequence elements surrounding the FD mutation as underlying contributors to this splicing disease.

Bifunctional Oligonucleotides Mediate a Shift From Exon Skipping to Exon Inclusion

As an approach to control use of the intron 20 5′ss and promote exon 20 inclusion, we tested a method designated targeted oligonucleotide enhancer of splicing (TOES), which employs a bifunctional RNA oligonucleotide to recruit SR proteins to weak exons [Eperon and Muntoni, 2003]. TOES has recently been used to reprogram alternative splicing [Baughan et al., 2006; Skordis et al., 2003]. To test this system for FD, we designed a 38-nt bifunctional RNA oligonucleotide that base-pairs to 15 nt in exon 20. This region of the exon contains the ESS sequence (UUUUAGCUCA). In addition, the RNA oligonucleotide contains a 23-nt nonhybridizing extension carrying two SR protein (SF2/ASF) binding sites (Fig. 7A). Significantly, when tested with 19–21/AdML FD in the coupled in vitro RNAP II transcription/splicing system, the ESS-SR oligonucleotide resulted in a dose-dependent increase of exon 20 inclusion (Fig. 7B and D). The increased inclusion is dependent on base-pairing, as mismatches of the RNA oligonucleotide to the pre-mRNA target sequence (mut4ESS-SR, mut7ESS-SR, scram-SR, respectively; Fig. 7A) do not promote inclusion as efficiently as the bifunctional oligonucleotide that is capable of base-pairing (Fig. 7C, lanes 3–5, 10–12, and 17–19; Fig. 7D). The extension on the oligonucleotide is required for efficient exon 20 inclusion, as an oligonucleotide without this extension (NT, Fig. 7A) did not result in any exon 20 inclusion (Fig. 7C, lanes 22–28). Moreover, the SR binding sites are important as a sequence that corresponds to the hnRNP A1 (ESS-A1; Fig. 7A) does not result in efficient exon 20 inclusion (Fig. 7C, lanes 29–35; Fig. 7D). It is surprising that the oligonucleotides that have impaired base-pairing potential to exon 20 nevertheless resulted in some splicing of one intron or splicing of both introns when present at the highest concentrations (Fig. 7C). One possible explanation for this observation is that the oligonucleotides titrate in trans SR proteins or another factor(s) that are required for exon skipping, thereby promoting exon inclusion. The intermediate, lacking only one intron, may accumulate because the kinetics of splicing two introns are slower when the RNA oligonucleotide is not base-paired to exon 20. In any case, together, our data indicate that the highest level of exon 20 inclusion is observed with the bifunctional RNA oligonucleotide that is both capable of base-pairing to exon 20 and contains SR protein binding sites.

DISCUSSION

In this study, we investigated a single-base change in a 5′ss that has no phenotype in many introns but in IKBKAP intron 20 results in a splicing defect that leads to a devastating neurodevelopmental disease. The FD mutation, which is a T to C change in position 6 of intron 20 of the IKBKAP gene, decreases the potential number of base pairs between the 5′ss and U1 snRNA from nine to seven. To determine why this otherwise silent nucleotide has such a potent effect on splicing when present in intron 20 of IKBKAP, we carried out a computational analysis of the sequences in the vicinity of the mutation. This analysis predicted that the sequences in and around exon 20 contain inherently weak splicing elements. Thus, while these elements are sufficient for splicing when the normal 5′ss is present, a single nucleotide change in a weakly conserved nucleotide in the 5′ss alone may be enough to severely tip the balance in favor of exon skipping. We tested our computational predictions in vitro and in vivo, and significantly, all the data yielded a consistent picture of weak exon 20 definition resulting in exon skipping in FD. In addition, the in vitro studies led to the rapid identification of individual nucleotides that, when tested in vivo, were shown to be key contributors to exon 20 skipping in FD.

It was previously reported that ESSs (PyUAG motifs) are located within exon 20, but their effect on splicing was not tested [Anderson et al., 2003]. We demonstrated that the PyUAG motif located closer to the FD mutation is part of an ESS that significantly affects exon 20 recognition in vitro and in vivo. During the preparation of the manuscript, a new study identified and clustered 133 ESS decamers into seven ESS motifs, designated FAS-ESS groups A–G [Wang et al., 2004]. The sequence UUUUAGCU (the PyUAG motif is underlined), present within the IKBKAP exon 20 splicing silencer, is perfectly embedded within the FAS-ESS group E [Wang et al., 2004], consistent with its general function as a splicing silencer. Using the FAS-ESS web server with the FAS-hex3 set (http://genes.mit.edu/fas-ess), we found that IKBKAP exon 20 contains three ESS hexamer motifs and ranks fifth among the IKBKAP internal exons rich in ESS hexamer motifs, providing additional evidence that this exon is weak.

The observation that single-base changes in the 5′ss, the 3′ss, and exon 20 dramatically enhance exon 20 inclusion for FD in vivo indicates that factors recognizing these nucleotides early in spliceosome assembly are potential therapeutic targets for FD patients. These factors include the multicomponent U1 snRNP, the U2AF heterodimer, and the SR family of splicing proteins, respectively. In addition, the observation that disruption of exonic silencers strongly promotes exon 20 inclusion raises the possibility of using ESS-binding proteins, such as hnRNP proteins, as therapeutic targets.

Several groups have used oligonucleotides or oligonucleotide-like compounds to activate or inhibit specific splicing events in vitro and in vivo [Garcia-Blanco et al., 2004]. The antisense approach has been successfully used for many genes [Kole and Sazani, 2001]. Recently, bifunctional molecules have been employed that consist of a hybrid peptide nucleic acid (PNA)-peptide oligomer with a binding domain that specifically base-pairs to target gene sequences and an effector domain with arginine-serine (RS) repeats that mimics ESE-dependent exon activation of SR proteins [Cartegni and Krainer, 2003]. Another type of bifunctional oligonucleotide comprises a 2′-O-methyl-modified base-pairing domain and an effector domain that contains binding sites for either splicing repressors or activators [Gendron et al., 2006; Skordis et al., 2003; Villemaire et al., 2003]. This latter method, called TOES, has been used successfully to modify the splicing of SMN2 and has been predicted to be widely applicable for mutations that affect ESEs [Baughan et al., 2006; Skordis et al., 2003]. In the present study, our observation that a bifunctional oligonucleotide can correct FD splicing raises the possibility of using RNA-based therapies for FD. In addition, our data provide the new observation that TOES can be used to promote inclusion of exons with mutated 5′ss.

Few studies have investigated exon skipping in vitro because splicing has not been efficient enough to assay, and it has been difficult to analyze pre-mRNAs containing more than one intron [Aebi et al., 1986; Black, 1992; Cartegni and Krainer, 2002; Côté et al., 1999; Gabut et al., 2005; Lang and Spritz, 1987; Liu et al., 2001; Pollard et al., 2002; Roberts et al., 1998; Selvakumar and Helfman, 1999]. However, we and other groups recently developed in vitro systems that couple RNAP II transcription and splicing, and in these systems pre-mRNAs with more than one intron are readily spliced [Das et al., 2006; Ghosh and Garcia-Blanco, 2000; Hicks et al., 2006; Ibrahim et al., 2005; Natalizio and Garcia-Blanco, 2005]. Thus, these systems should provide better models than the currently used T7 systems.

We have previously reported that exon skipping in FD is tissue-specific, with higher levels of exon skipping in neuronal tissue than in other tissues [Cuajungco et al., 2003]. We have also shown that the plant cytokinin kinetin dramatically increases exon 20 inclusion in FD cells [Slaugenhaupt et al., 2004]. Although the mechanism of action of this drug is unknown, the current study identified multiple potential targets through which kinetin might act directly or indirectly.

ACKNOWLEDGMENTS

We thank Jeanne Hsu, Patricia Valencia, and André Verdel for critical reading of the manuscript. We thank Dirk Holste, Michael Stadler, Brad Friedman, Noah Spies, and Xinshu (Grace) Xiao for database assembly and scripts.

REFERENCES

Aebi M, Hornig H, Padgett RA, Reiser J, Weissmann C. 1986. Sequence requirements for splicing of higher eukaryotic nuclear pre-mRNA. Cell 47:555–565.
Anderson SL, Coli R, Daly IW, Kichula EA, Rork MJ, Volpi SA, Ekstein J, Rubin BY. 2001. Familial dysautonomia is caused by mutations of the IKAP gene. Am J Hum Genet 68:753–758.
Anderson SL, Qiu J, Rubin BY. 2003. EGCG corrects aberrant splicing of IKAP mRNA in cells from patients with familial dysautonomia. Biochem Biophys Res Commun 310:627–633.
Baughan T, Shababi M, Coady TH, Dickson AM, Tullis GE, Lorson CL. 2006. Stimulating full-length SMN2 expression by delivering bifunctional RNAs via a viral vector. Mol Ther 14:54–62.
Bennett M, Michaud S, Kingston J, Reed R. 1992. Protein components specifically associated with prespliceosome and spliceosome complexes. Genes Dev 6:1986–2000.
Bilodeau PS, Domsic JK, Mayeda A, Krainer AR, Stoltzfus CM. 2001. RNA splicing at human immunodeficiency virus type 1 3′ splice site A2 is regulated by binding of hnRNP A/B proteins to an exonic splicing silencer element. J Virol 75:8487–8497.
Black DL. 1992. Activation of c-src neuron-specific splicing by an unusual RNA element in vivo and in vitro. Cell 69:795–807.


Boukis LA, Liu N, Furuyama S, Bruzik JP. 2004. Ser/Arg-rich protein-mediated communication between U1 and U2 small nuclear ribonucleoprotein particles. J Biol Chem 279:29647–29653.
Carmel I, Tal S, Vig I, Ast G. 2004. Comparative analysis detects dependencies among the 5′ splice-site positions. RNA 10:828–840.
Cartegni L, Krainer AR. 2002. Disruption of an SF2/ASF-dependent exonic splicing enhancer in SMN2 causes spinal muscular atrophy in the absence of SMN1. Nat Genet 30:377–384.
Cartegni L, Krainer AR. 2003. Correction of disease-associated exon skipping by synthetic exon-specific activators. Nat Struct Biol 10:120–125.
Cartegni L, Wang J, Zhu Z, Zhang MQ, Krainer AR. 2003. ESEfinder: a web resource to identify exonic splicing enhancers. Nucleic Acids Res 31:3568–3571.
Cogan JD, Phillips JA 3rd, Schenkman SS, Milner RD, Sakati N. 1994. Familial growth hormone deficiency: a model of dominant and recessive mutations affecting a monomeric protein. J Clin Endocrinol Metab 79:1261–1265.
Côté J, Beaudoin J, Tacke R, Chabot B. 1995. The U1 small nuclear ribonucleoprotein/5′ splice site interaction affects U2AF65 binding to the downstream 3′ splice site. J Biol Chem 270:4031–4036.
Côté J, Simard MJ, Chabot B. 1999. An element in the 5′ common exon of the NCAM alternative splicing unit interacts with SR proteins and modulates 5′ splice site selection. Nucleic Acids Res 27:2529–2537.
Cuajungco MP, Leyne M, Mull J, Gill SP, Lu W, Zagzag D, Axelrod FB, Maayan C, Gusella JF, Slaugenhaupt SA. 2003. Tissue-specific reduction in splicing efficiency of IKBKAP due to the major mutation associated with familial dysautonomia. Am J Hum Genet 72:749–758.
Das R, Dufu K, Romney B, Feldt M, Elenko M, Reed R. 2006. Functional coupling of RNAP II transcription to spliceosome assembly. Genes Dev 20:1100–1109.
Eperon IC, Muntoni F. 2003. Response to Buratti et al.: Can a "patch" in a skipped exon make the pre-mRNA splicing machine run better? Trends Mol Med 9:233–234.
Gabut M, Mine M, Marsac C, Brivet M, Tazi J, Soret J. 2005. The SR protein SC35 is responsible for aberrant splicing of the E1α pyruvate dehydrogenase mRNA in a case of mental retardation with lactic acidosis. Mol Cell Biol 25:3286–3294.
Garcia-Blanco MA, Baraniak AP, Lasda EL. 2004. Alternative splicing in disease and therapy. Nat Biotechnol 22:535–546.
Gendron D, Carriero S, Garneau D, Villemaire J, Klinck R, Elela SA, Damha MJ, Chabot B. 2006. Modulation of 5′ splice site selection using tailed oligonucleotides carrying splicing signals. BMC Biotechnol 6:5.
Ghosh S, Garcia-Blanco MA. 2000. Coupled in vitro synthesis and splicing of RNA polymerase II transcripts. RNA 6:1325–1334.
Graveley BR, Hertel KJ, Maniatis T. 2001. The role of U2AF35 and U2AF65 in enhancer-dependent splicing. RNA 7:806–818.
Han K, Yeo G, An P, Burge CB, Grabowski PJ. 2005. A combinatorial code for splicing silencing: UAGG and GGGG motifs. PLoS Biol 3:e158.
Hanamura A, Caceres JF, Mayeda A, Franza BR Jr, Krainer AR. 1998. Regulated tissue-specific expression of antagonistic pre-mRNA splicing factors. RNA 4:430–444.
Hawkes NA, Otero G, Winkler GS, Marshall N, Dahmus ME, Krappmann D, Scheidereit C, Thomas CL, Schiavo G, Erdjument-Bromage H, Tempst P, Svejstrup JQ. 2002. Purification and characterization of the human elongator complex. J Biol Chem 277:3047–3052.
Hicks MJ, Yang CR, Kotlajich MV, Hertel KJ. 2006. Linking splicing to Pol II transcription stabilizes pre-mRNAs and influences splicing patterns. PLoS Biol 4:e147.
Holste D, Huo G, Tung V, Burge CB. 2006. HOLLYWOOD: a comparative relational database of alternative splicing. Nucleic Acids Res 34:D56–D62.
Hou VC, Lersch R, Gee SL, Ponthier JL, Lo AJ, Wu M, Turck CW, Koury M, Krainer AR, Mayeda A, Conboy JG. 2002. Decrease in hnRNP A/B expression during erythropoiesis mediates a pre-mRNA splicing switch. EMBO J 21:6195–6204.
Hovhannisyan RH, Carstens RP. 2005. A novel intronic cis element, ISE/ISS-3, regulates rat fibroblast growth factor receptor 2 splicing through activation of an upstream exon and repression of a downstream exon containing a noncanonical branch point sequence. Mol Cell Biol 25:250–263.
Ibrahim EC, Schaal TD, Hertel KJ, Reed R, Maniatis T. 2005. Serine/arginine-rich protein-dependent suppression of exon skipping by exonic splicing enhancers. Proc Natl Acad Sci USA 102:5002–5007.
Jiang Z, Cote J, Kwon JM, Goate AM, Wu JY. 2000. Aberrant splicing of tau pre-mRNA caused by intronic mutations associated with the inherited dementia frontotemporal dementia with parkinsonism linked to chromosome 17. Mol Cell Biol 20:4036–4048.
Kamma H, Portman DS, Dreyfuss G. 1995. Cell type-specific expression of hnRNP proteins. Exp Cell Res 221:187–196.
Kamma H, Horiguchi H, Wan L, Matsui M, Fujiwara M, Fujimoto M, Yazawa T, Dreyfuss G. 1999. Molecular characterization of the hnRNP A2/B1 proteins: tissue-specific expression and novel isoforms. Exp Cell Res 246:399–411.
Kohtz JD, Jamison SF, Will CL, Zuo P, Luhrmann R, Garcia-Blanco MA, Manley JL. 1994. Protein-protein interactions and 5′-splice-site recognition in mammalian mRNA precursors. Nature 368:119–124.
Kole R, Sazani P. 2001. Antisense effects in the cell nucleus: modification of splicing. Curr Opin Mol Ther 3:229–234.
Krogan NJ, Kim M, Ahn SH, Zhong G, Kobor MS, Cagney G, Emili A, Shilatifard A, Buratowski S, Greenblatt JF. 2002. RNA polymerase II elongation factors of Saccharomyces cerevisiae: a targeted proteomics approach. Mol Cell Biol 22:6979–6992.
Lam BJ, Hertel KJ. 2002. A general role for splicing enhancers in exon definition. RNA 8:1233–1241.
Lang KM, Spritz RA. 1987. In vitro splicing pathways of pre-mRNAs containing multiple intervening sequences? Mol Cell Biol 7:3428–3437.
Lavigueur A, La Branche H, Kornblihtt AR, Chabot B. 1993. A splicing enhancer in the human fibronectin alternate ED1 exon interacts with SR proteins and stimulates U2 snRNP binding. Genes Dev 7:2405–2417.
Li Y, Takagi Y, Jiang Y, Tokunaga M, Erdjument-Bromage H, Tempst P, Kornberg RD. 2001. A multiprotein complex that interacts with RNA polymerase II elongator. J Biol Chem 276:29628–29631.
Lim LP, Burge CB. 2001. A computational analysis of sequence features involved in recognition of short introns. Proc Natl Acad Sci USA 98:11193–11198.
Liu HX, Cartegni L, Zhang MQ, Krainer AR. 2001. A mechanism for exon skipping caused by nonsense or missense mutations in BRCA1 and other genes. Nat Genet 27:55–58.
Lloyd J, Narcisi P, Richards A, Pope FM. 1993. A T+6 to C+6 mutation in the donor splice site of COL3A1 IVS7 causes exon skipping and results in Ehlers-Danlos syndrome type IV. J Med Genet 30:376–380.
Natalizio BJ, Garcia-Blanco MA. 2005. In vitro coupled transcription splicing. Methods 37:314–322.
Okubo M, Horinishi A, Kim DH, Yamamoto TT, Murase T. 2002. Seven novel sequence variants in the human low density lipoprotein receptor related protein 5 (LRP5) gene. Hum Mutat 19:186.
Otero G, Fellows J, Li Y, de Bizemont T, Dirac AM, Gustafsson CM, Erdjument-Bromage H, Tempst P, Svejstrup JQ. 1999. Elongator, a multisubunit component of a novel RNA polymerase II holoenzyme for transcriptional elongation. Mol Cell 3:109–118.
Pagani F, Stuani C, Tzetis M, Kanavakis E, Efthymiadou A, Doudounakis S, Casals T, Baralle FE. 2003. New type of disease causing mutations: the example of the composite exonic regulatory elements of splicing in CFTR exon 12. Hum Mol Genet 12:1111–1120.
Pokholok DK, Hannett NM, Young RA. 2002. Exchange of RNA polymerase II initiation and elongation factors during gene expression in vivo. Mol Cell 9:799–809.
Pollard AJ, Sparey C, Robson SC, Krainer AR, Europe-Finner GN. 2000. Spatio-temporal expression of the trans-acting splicing factors SF2/ASF and heterogeneous ribonuclear proteins A1/A1B in the myometrium of the pregnant human uterus: a molecular mechanism for regulating regional protein isoform expression in vivo. J Clin Endocrinol Metab 85:1928–1936.
Pollard AJ, Krainer AR, Robson SC, Europe-Finner GN. 2002. Alternative splicing of the adenylyl cyclase stimulatory G-protein G alpha(s) is regulated by SF2/ASF and heterogeneous nuclear ribonucleoprotein A1 (hnRNPA1) and involves the use of an unusual TG 3′-splice site. J Biol Chem 277:15241–15251.
Rahl PB, Chen CZ, Collins RN. 2005. Elp1p, the yeast homolog of the FD disease syndrome protein, negatively regulates exocytosis independently of transcriptional elongation. Mol Cell 17:841–853.
Roberts GC, Gooding C, Mak HY, Proudfoot NJ, Smith CW. 1998. Co-transcriptional commitment to alternative splice site selection. Nucleic Acids Res 26:5568–5572.
Roscigno RF, Weiner M, Garcia-Blanco MA. 1993. A mutational analysis of the polypyrimidine tract of introns. Effects of sequence differences in pyrimidine tracts on splicing. J Biol Chem 268:11222–11229.
Schaal TD, Maniatis T. 1999. Multiple distinct splicing enhancers in the protein-coding sequences of a constitutively spliced pre-mRNA. Mol Cell Biol 19:261–273.
Selvakumar M, Helfman DM. 1999. Exonic splicing enhancers contribute to the use of both 3′ and 5′ splice site usage of rat β-tropomyosin pre-mRNA. RNA 5:378–394.
Senapathy P, Shapiro MB, Harris NL. 1990. Splice junctions, branch point sites, and exons: sequence statistics, identification, and applications to genome project. Methods Enzymol 183:252–278.
Shapiro MB, Senapathy P. 1987. RNA splice junctions of different classes of eukaryotes: sequence statistics and functional implications in gene expression. Nucleic Acids Res 15:7155–7174.
Si Z, Amendt BA, Stoltzfus CM. 1997. Splicing efficiency of human immunodeficiency virus type 1 tat RNA is determined by both a suboptimal 3′ splice site and a 10 nucleotide exon splicing silencer element located within tat exon 2. Nucleic Acids Res 25:861–867.
Si ZH, Rauch D, Stoltzfus CM. 1998. The exon splicing silencer in human immunodeficiency virus type 1 Tat exon 3 is bipartite and acts early in spliceosome assembly. Mol Cell Biol 18:5404–5413.
Skordis LA, Dunckley MG, Yue B, Eperon IC, Muntoni F. 2003. Bifunctional antisense oligonucleotides provide a trans-acting splicing enhancer that stimulates SMN2 gene expression in patient fibroblasts. Proc Natl Acad Sci USA 100:4114–4119.
Slaugenhaupt SA, Blumenfeld A, Gill SP, Leyne M, Mull J, Cuajungco MP, Liebert CB, Chadwick B, Idelson M, Reznik L, Robbins C, Makalowska I, Brownstein M, Krappmann D, Scheidereit C, Maayan C, Axelrod FB, Gusella JF. 2001. Tissue-specific expression of a splicing mutation in the IKBKAP gene causes familial dysautonomia. Am J Hum Genet 68:598–605.
Slaugenhaupt SA, Mull J, Leyne M, Cuajungco MP, Gill SP, Hims MM, Quintero F, Axelrod FB, Gusella JF. 2004. Rescue of a human mRNA splicing defect by the plant cytokinin kinetin. Hum Mol Genet 13:429–436.
Smith CW, Chu TT, Nadal-Ginard B. 1993. Scanning and competition between AGs are involved in 3′ splice site selection in mammalian introns. Mol Cell Biol 13:4939–4952.
Stenson PD, Ball EV, Mort M, Phillips AD, Shiel JA, Thomas NS, Abeysinghe S, Krawczak M, Cooper DN. 2003. Human Gene Mutation Database (HGMD): 2003 update. Hum Mutat 21:577–581.
Treisman R, Orkin SH, Maniatis T. 1983. Specific transcription and RNA splicing defects in five cloned beta-thalassaemia genes. Nature 302:591–596.
Villemaire J, Dion I, Elela SA, Chabot B. 2003. Reprogramming alternative pre-messenger RNA splicing through the use of protein-binding antisense oligonucleotides. J Biol Chem 278:50031–50039.
Wang Z, Hoffmann HM, Grabowski PJ. 1995. Intrinsic U2AF binding is modulated by exon enhancer signals in parallel with changes in splicing activity. RNA 1:21–35.
Wang Z, Rolish ME, Yeo G, Tung V, Mawson M, Burge CB. 2004. Systematic identification and analysis of exonic splicing silencers. Cell 119:831–845.
Wang J, Smith PJ, Krainer AR, Zhang MQ. 2005. Distribution of SR protein exonic splicing enhancer motifs in human protein-coding genes. Nucleic Acids Res 33:5053–5062.
Wu JY, Maniatis T. 1993. Specific interactions between proteins implicated in splice site selection and regulated alternative splicing. Cell 75:1061–1070.
Yabuta T, Shinmura K, Tani M, Yamaguchi S, Yoshimura K, Katai H, Nakajima T, Mochiki E, Tsujinaka T, Takami M, Hirose K, Yamaguchi A, Takenoshita S, Yokota J. 2002. E-cadherin gene variants in gastric cancer families whose probands are diagnosed with diffuse gastric cancer. Int J Cancer 101:434–441.
Yeo G, Burge CB. 2004. Maximum entropy modeling of short sequence motifs with applications to RNA splicing signals. J Comput Biol 11:377–394.
Zahler AM, Damgaard CK, Kjems J, Caputi M. 2004. SC35 and heterogeneous nuclear ribonucleoprotein A/B proteins bind to a juxtaposed exonic splicing enhancer/exonic splicing silencer element to regulate HIV-1 tat exon 2 splicing. J Biol Chem 279:10077–10084.
Zhu J, Mayeda A, Krainer AR. 2001. Exon identity established through differential antagonism between exonic splicing silencer-bound hnRNP A1 and enhancer-bound SR proteins. Mol Cell 8:1351–1361.
Zuo P, Maniatis T. 1996. The splicing factor U2AF35 mediates critical protein-protein interactions in constitutive and enhancer-dependent splicing. Genes Dev 10:1356–1368.
