The zinc fingers of the SR-like ZRANB2 are single-stranded RNA-binding domains that recognize 5؅ splice site-like sequences

Fionna E. Loughlin, Robyn E. Mansfield, Paula M. Vaz, Aaron P. McGrath, Surya Setiyaputra, Roland Gamsjaeger, Eva S. Chen, Brian J. Morris, J. Mitchell Guss, and Joel P. Mackay1

School of Molecular and Microbial Biosciences and School of Medical Sciences, University of Sydney, Sydney NSW 2006, Australia

Edited by Adrian Krainer, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, and accepted by the Editorial Board February 17, 2009 (received for review March 11, 2008) The alternative splicing of mRNA is a critical process in higher eu- contrast, we show here that the double RanBP2 ZnF domain of karyotes that generates substantial proteomic diversity. Many of the ZRANB2 recognizes ssRNA, binding tandem copies of an AG- that are essential to this process contain arginine/serine-rich GUAA motif with high affinity. We have determined the structural (RS) domains. ZRANB2 is a widely-expressed and highly-conserved basis for RNA recognition by ZRANB2, which reveals a large RS-domain protein that can regulate alternative splicing but lacks number of specific hydrogen bonds and a previously undescribed canonical RNA-binding domains. Instead, it contains 2 RanBP2-type base-stacked ‘‘ladder’’ involving a tryptophan and 2 guanines. Our zinc finger (ZnF) domains. We demonstrate that these ZnFs recognize data define another class of ssRNA-binding protein and shed light ssRNA with high affinity and specificity. Each ZnF binds to a single on the biochemical functions of a number of human proteins that AGGUAA motif and the 2 domains combine to recognize AGGUAA contain these domains. (Nx)AGGUAA double sites, suggesting that ZRANB2 regulates alter- native splicing via a direct interaction with pre-mRNA at sites that Results ؅ resemble the consensus 5 splice site. We show using X-ray crystal- Identification of a High-Affinity RNA Ligand for ZRANB2-F12. To test lography that recognition of an AGGUAA motif by a single ZnF is whether the ZnFs of ZRANB2 can recognize ssRNA, we carried BIOCHEMISTRY dominated by side-chain hydrogen bonds to the bases and formation out an in vitro site selection [systematic evolution of ligands by of a guanine-tryptophan-guanine ‘‘ladder.’’ A number of other hu- exponential enrichment (SELEX)] experiment. From a ssRNA man proteins that function in RNA processing also contain RanBP2 pool containing a randomized 25-nt sequence, high-affinity RNA ZnFs in which the RNA-binding residues of ZRANB2 are conserved. The target sequences were selected by using a GST-fusion protein ZnFs of ZRANB2 therefore define another class of RNA-binding do- containing both ZnFs (GST-F12) immobilized on glutathione main, advancing our understanding of RNA recognition and empha- Sepharose beads. After 7 rounds of selection, we sequenced 26 sizing the versatility of ZnF domains in molecular recognition. unique clones; all contained a GGUA or a AGGU motif, and 15 contained the longer AGGUAA motif. Further selection enriched ͉ ͉ ͉ protein structure RanBP2 zinc fingers RNA-binding proteins splicing the RNA pool in sequences containing multiple AGGUA motifs, such that after 9 rounds of selection 15 of 33 unique clones lmost all human are thought to be alternatively spliced, contained 2 GGUA motifs. In 11 of these 15 clones, the 2 motifs Aand it has been estimated that at least 15% of human diseases occurred either in tandem or separated by 1 nt. are associated with changes in RNA processing (1). The selection Alignment of the sequences (Fig. 1A and Fig. S1A) suggests that of splice sites is influenced heavily by the binding of accessory each of the ZnFs recognizes an AGGUAA site. MFOLD (12) did splicing factors to regulatory sequences in the pre-mRNA. SR not reveal any consistent secondary structure predictions for these proteins are splicing factors that contain a C-terminal Arg/Ser-rich sequences, suggesting that the binding is sequence-driven rather (RS) domain and either 1 or 2 N-terminal RNA recognition motifs than structure-driven. To determine whether the AGGUAA motif (RRMs) (2). They play crucial roles in constitutive and alternative is sufficient for binding, we tested the binding of the double finger splicing, promoting recognition of splice sites by binding to exonic construct F12 to a 17-nt RNA oligonucleotide containing a single splicing enhancers (ESEs). RRM domains bind ssRNA with high AGGUAA motif. dsDNA, ssDNA, and dsRNA were also tested. affinity in a sequence-specific manner, whereas RS domains appear As shown in Fig. 1B, the interaction is strongly selective for ssRNA, to facilitate both protein–protein and protein–RNA interactions consistent with a role for ZRANB2 in mRNA processing. Mutation (3, 4). Other RS domain-containing proteins that lack a canonical of the central GGU, which is the most highly-conserved element in RRM, termed ‘‘SR-like’’ proteins (see, for example, ref. 5) are also the SELEX consensus, abrogated the interaction, showing that the known to play roles in splicing. interaction is sequence specific. ZRANB2 (Zis, ZNF265) is an SR-like nuclear protein that is expressed in most tissues and is conserved between nematodes and Sequence Specificity of the RNA-ZRANB2 Interaction. The repetition mammals. It interacts with the spliceosomal proteins U1–70K and of the AGGUAA motif in the SELEX sequence alignment sug- U2AF35 and can alter the distribution of splice variants of GluR-B, SMN2, and Tra2␤ in minigene reporter assays (6, 7). As such,

ZRANB2 appears to regulate splice site choice. However, in place Author contributions: F.E.L., R.E.M., and J.P.M. designed research; F.E.L., R.E.M., P.M.V., of the canonical RNA-binding RRM domains, ZRANB2 displays A.P.M., S.S., R.G., and E.S.C. performed research; F.E.L., R.E.M., P.M.V., A.P.M., R.G., J.M.G., 2 N-terminal RanBP2-type zinc fingers (ZnFs). and J.P.M. analyzed data; and F.E.L., B.J.M., J.M.G., and J.P.M. wrote the paper. RanBP2-type ZnF domains are defined by the consensus se- The authors declare no conflict of interest. quence W-X-C-X2–4-C-X3-N-X6-C-X2-C. They occur multiple This article is a PNAS Direct Submission. A.K. is a guest editor invited by the Editorial Board. times in at least 21 human proteins, and the fold comprises 2 Data deposition: The atomic coordinates and structure factors have been deposited in the distorted ␤-hairpins sandwiching a central tryptophan residue and , www.pdb.org (PDB ID codes 2k1p and 3g9y). a single zinc ion (8, 9). RanBP2 ZnFs from Npl4 and Nup153 are 1To whom correspondence should be addressed. E-mail: [email protected]. protein recognition motifs, mediating interactions with ubiquitin This article contains supporting information online at www.pnas.org/cgi/content/full/ (10) and the nuclear transport protein (11), respectively. In 0802466106/DCSupplemental.

www.pnas.org͞cgi͞doi͞10.1073͞pnas.0802466106 PNAS Early Edition ͉ 1of6 Downloaded by guest on September 26, 2021 also showed that the central GG sequence is most important for binding by measuring the affinity of each finger for a series of ssRNA oligonucleotides that each contained a single purine 7 pyrimidine mutation in the AGGUAA motif (Fig. 1D and Fig. S1). We next measured the binding of F12 to ssRNA containing 2 AGGUAA sites (Fig. 1E). The affinity of F12 for the sequence AGGUAAAGGUAA was 1.9 ϫ 107 MϪ1 (Fig. 1E, lane 6). Randomizing one of the AGGUAA motifs (to give the sequence AGAAUGAGGUAA; Fig. 1E, lane 4) reduced the binding by a small, but reproducible, amount, although it was notable that binding was still significantly stronger than that of a single finger to a similar sequence (Fig. 1E, lanes 1 and 2), indicating that the additional finger in the F12 construct could still make contacts with the scrambled second site. Scrambling both sites effectively elimi- nated binding (Fig. 1E, lane 3). We assessed the importance of the spacing between AGGUAA motifs by measuring binding of F12 to double sites that were separated by Ϫ1 to 8 nt (where Ϫ1 denotes the sequence AG- GUAAGGUAA; Fig. 1E, lanes 5–10). Surprisingly, the affinity of the interaction increased monotonically as an increasing number of adenines were inserted between the sites. Further, a double site containing a cytosine-rich spacer (Fig. 1E, lane 10) had the same affinity for F12 as a penta-adenine spacer (Fig. 1E, lane 8), indicating that the identity of the intervening bases is unimportant. Overall, these data demonstrate that the double ZnF domain of ZRANB2 recognizes ssRNA carrying the consensus sequence AGGUAA(XϪ1–8)AGGUAA. Strikingly, AGGUAA is almost identical to the conserved consensus sequence for the 5Ј splice site across metazoans (13, 14) and resembles the 3Ј splice site consensus (CAGG).

Solution Analysis of the ZRANB2:RNA Interaction. To provide struc- tural insight into the ZRANB2:RNA interaction we first deter- mined the structure of F2 by using NMR spectroscopy (Fig. 3A, and Table S1). The structure is well defined and comprises 2 short ␤-hairpins sandwiching a zinc ion that is ligated by the 4 conserved cysteines. The fold is consistent with that of F1 (8) and other structures from this class of ZnFs, including domains from HDM2 Fig. 1. ZRANB2 binds to ssRNA containing AGGUAA repeats. (A) Sequence logo (15), Npl4 (9), and Nup153 (11). indicating the degree of sequence conservation within the 2 RNA sites from sequences obtained from SELEX (Fig. S1A). (B) Gel-shift showing the nucleic acid We next carried out chemical-shift perturbation experiments, type specificity of ZRANB2. The same 17-nt sequence, encoded in ssDNA, ssRNA, titrating short (6 or 10 nt) RNA oligonucleotides containing a single 15 dsDNA, and dsRNA, was electrophoresed in the presence of F12. In the sequence AGGUAA motif into N-labeled F2. Significant chemical ex- ssRNAmut, the central GGU is replaced by CUG. (C) Fluorescence anisotropy data change broadening was observed for a subset of signals during the showing the binding of GST-F2 to 17-nt ssRNA sequences containing point titration, but all signals reappeared after the addition of 1 molar mutations. (D) Calculated association constants from D. Affinities are an average equivalent of RNA (Fig. 3C), and no further changes were observed of 3 experiments and error bars indicate 1 SD. (E) Association constants, obtained if more RNA was added, consistent with the formation of a by fluorescence anisotropy, for F12 binding to RNA sequences containing either well-defined 1:1 complex. a single AGGUAA site (with a scrambled second site) or double sites with spacings Mapping the most significant chemical-shift changes onto the of Ϫ1, 0, 2, 5, and 8 adenines or the 5-nt sequence ACCCC (AC4). (F) RNA-binding affinity for single Ala point mutations of GST-F2, from fluorescence anisotropy structure of F2 (Fig. 3B and Fig. S2A) reveals a single contiguous data. Error bars indicate 1 SD from 3 measurements. surface. The binding surface comprises a mixture of amino acid types, including aromatic (W79), aliphatic (V77, A80, M87), polar (N76, N78, N86), and charged (R81, R82) residues. These residues gested that each of the ZnFs can recognize a single site. We are almost completely conserved in F1, suggesting that each ZnF therefore expressed and purified the 2 individual domains (F1 and recognizes RNA in the same manner. F2; Fig. 2) as GST fusions and tested their ability to bind ssRNA by We corroborated our structural data by examining the effect of fluorescence anisotropy, using a ssRNA oligonucleotide bearing a alanine point mutations on the RNA-binding ability of F2. Fig. 1F 5Јfluorescein tag and containing a single AGGUAA site. F1 and F2 shows that alanine substitutions of W79, R81, R82, N86, and M87 bound to this sequence with association constants of 3 ϫ 106 MϪ1 significantly reduced the association constant (the correct folding of and 1 ϫ 106 MϪ1, respectively (Fig. 1 C and D and Fig. S1), each mutant was confirmed by NMR; Fig. S2B). In contrast, much demonstrating that both F1 and F2 can bind 1 AGGUAA site. We smaller changes in affinity were observed for K72A and T73A, which are oriented away from the RNA contact surface. 15N-HSQC titrations were also carried out by using the double finger construct (F12; residues 1–95) and the double site oligonu- cleotide AGGUAAAGGUAA. Chemical shifts for the free F12 protein were essentially unchanged from those in the 2 individual 15 Fig. 2. Sequence data for ZRANB2. Sequences of F1 and F2 are shown. Cysteines finger constructs, and no signals were observed in the N-HSQC are indicated by *, and side chains that directly contact RNA are marked by a gray of F12 for residues in the 25-residue linker. This titration (Fig. S2D) box. gave rise to the same pattern of chemical-shift changes as the 2

2of6 ͉ www.pnas.org͞cgi͞doi͞10.1073͞pnas.0802466106 Loughlin et al. Downloaded by guest on September 26, 2021 BIOCHEMISTRY Fig. 3. NMR analysis of the ZRANB2:ssRNA interaction. (A) The structure of ZRANB2-F2. An overlay of the 20 lowest energy structures (residues 67–95) is shown. The zinc ligands C71, C74, C85, and C88 are shown in gold, and the zinc ion is shown in red. (B) Space-filling representation of F2 with residues showing significant chemical-shift changes (Ͼ1 SD from the mean) colored red (Fig. S2A). The protein is rotated Ϸ90° counterclockwise about the vertical axis, compared with A.(C) Overlay of 15N-HSQC spectra of free F2 (black) and F2 in the presence of 1.2 molar equivalents of CCAGGUAAAG (red). Arrows show the shifting of selected resonances.

single-finger experiments. Only 3–4 new signals appeared, most Fig. 4. Structure of the ZRANB2:RNA complex. (A) Electron density at the likely from linker residues. However, most of the linker residues protein/RNA interface. The binding interface is shown with unbiased Fo Ϫ Fc remained unobservable, indicating that the linker does not become density (green) at 2.5 ␴ calculated by omitting the RNA portion of the model. The ordered upon RNA binding. final model for the RNA is shown to illustrate the fit with the difference density. 2Fo Ϫ Fc electron density (blue) calculated by using the final model is shown at 1.4 ␴ Structural Basis for ssRNA Recognition by ZRANB2. Only a small on the protein portion of the model only. (B) Overview of the ZRANB2:RNA number (Ͻ10) of intermolecular NOEs were observed between F2 structure. F2 is shown as a ribbon diagram. The zinc ligands are shown as sticks and the zinc ion is shown as a sphere. The first 4 bases (AGGU) are shown as sticks. and RNA; intermediate chemical exchange broadened signals from Protein side chains that interact with RNA are shown in yellow. (C)(Left) Hydro- key residues at the protein:RNA interface. We therefore crystal- gen bonds between F2 and the 2 guanines (green dashed lines). Water molecules lized F2 in complex with a 6-nt RNA with the sequence AGGUAA are shown as red crosses. (Right) Summary of hydrogen bonds to each guanine. and determined the structure of the complex to a resolution of 1.4 (D)(Left) The hydrogen bond network involving N76, N78, C85, and N86, which Å. The overall quality of the structure, as judged by Molprobity defines the Ura4 binding site. (Right) Summary of hydrogen bonds to Ura4. (16), is excellent. All F2 residues lie in the favored region of the Ramachandran plot, and Molprobity scores the structure in the top 1% of all structures determined to comparable resolution. All RNA feature is the bidentate interaction of both Gua2 and Gua3 with an arginine side chain. Gua2 forms 2 hydrogen bonds to R81 side chain nucleotides have acceptable sugar puckers and residues Ade1–Ura4 ␧ ␩ have acceptable backbone conformations. protons: the carbonyl oxygen O6 with H and1oftheH protons In the structure (Fig. 4, Table S2, and Fig. S3), F2 adopts the with N7. O6 also forms a water-mediated hydrogen bond with the same backbone fold as it does free in solution. Electron density for backbone amide proton of R81. The imino proton on the Watson– Gua2, Gua3, and Ura4 is unambiguous and clearly reveals the mode Crick face of Gua2 forms a second water-mediated hydrogen bond, of recognition of these bases (Fig. 4). Most prominently, a trypto- whereby the water interacts with both the D68 carboxylate group phan side chain (W79) stacks between Gua2 and Gua3; the plane and the A80 backbone amide. of the indole side chain is parallel with those of the 2 purines and Gua3 forms a bidentate interaction with R82, and again the O6 makes extensive contacts with both bases (Fig. 4 A and B). Although forms a second hydrogen bond to a backbone amide, in this case this purine-Trp-purine ladder appears well-suited to direct recog- W79. The backbone carbonyl of V77 forms hydrogen bonds with nition of single-stranded nucleic acids, a motif of this type has not both the imino proton and the amino group of Gua3, and a previously been observed in any protein–nucleic acid structure to water-mediated hydrogen bond connects the 2Ј hydroxyl group of our knowledge. Gua3 with the side chain of N86. Ura4 forms 3 side-chain-mediated Gua2, Gua3, and Ura4 make a number of hydrogen bonds to side hydrogen bonds to F2. The O4 carbonyl group recognizes side- chains and the backbone of the protein (Fig. 4 C and D). A striking chain amide protons in both N76 and N86, and the side-chain

Loughlin et al. PNAS Early Edition ͉ 3of6 Downloaded by guest on September 26, 2021 Fig. 5. Conformation of Ade5 and Ade6. (A) Structure of 1 conformer, in which Ade5 and Ade6 were modeled with 50% occupancy. Residues 1–4 are shown in light brown and residues 5 and 6 are shown in light pink. (B) Structure of the second conformer, in which Ade5 points away from F2. No density for Ade6 was observed. (C) Model of the F2:RNA complex calculated by using the intermolec- ular NOEs between V77/M87 and Ade5 H2. The X-ray conformer from A is shown in green and the NOE-restrained model is in gray.

carbonyl group of N76 makes a hydrogen bond with the Ura4 imino proton. These interactions explain the strong observed preference for a core GGU sequence. The pattern of hydrogen bond donors and acceptors observed for the interaction with each guanine (Fig. 4C) Fig. 6. The RNA-binding preferences of ZRANB2 correlate with its observed ␤ is not compatible with either adenine or cytosine, and although splicing activity. (A) Schematic of the Tra2- minigene. Normally, the dominant uracil has a similar polarity to guanine (O4 replaces O6 and N3 transcript contains exons 1, 3, and 4. After the addition of ZRANB2, a transcript containing only exons 1 and 4 is observed (6). The sequences of the 5Ј splice sites replaces N1 on the Watson–Crick face) it would be unable to form of each exon are shown (the vertical line indicates the intron–exon boundary). (B) the additional hydrogen bond made to each guanine N7. Recog- Surface plasmon resonance data. Responses are shown after the injection of F12, nition of Ura4 relies on a hydrogen-bond network involving N76, in the presence of a competitor RNA, on to a chip bearing the sequence of the 5Ј N78, C85, and N86 (Fig. 4D), which orients N76 such that the splice site of exon 3. (C) Sequence alignment of human RanBP2 ZnFs showing that polarity of its side chain is complementary to uracil. several contain the residues shown to meditate RNA recognition in ZRANB2. Color is ascribed by residue type and RNA binding residues in F2 are highlighted Conformation of the Three Adenines. Whereas electron density for in gray boxes. Swissprot accession codes are given in SI Text. the GGU was well-defined and unambiguous, refinement of the adenines was more difficult. Ade1 was modeled at 50% occupancy conformer that is consistent with all of the NOEs (Fig. 5C). In this into a conformation in which the base extends the Gua2-W79- structure, Ade5 is in a position similar to that exhibited in the X-ray Gua3-Ura4 stack (Fig. 4 and Fig. S3C), but this nucleotide does not conformer shown in Fig. 5A (but translated Ϸ5 Å toward the directly contact F2. Ade5 was modeled into 2 different conforma- protein), suggesting that Ade5 and Ade6 do spend at least a tions (Fig. 5 A and B), and for one of these, density could proportion of their time in such a conformation. additionally be observed for Ade6 (Fig. 5A). In this latter con- former, the RNA backbone changes direction by Ϸ90° and folds back on itself. Ade5 makes contacts with the backbone and ribose The Binding Preferences of ZRANB2 Corroborate Functional Splicing ring of Ura4 and the Ade6 base is stacked coplanar with Ade5. In Data. It has been demonstrated that ZRANB2 can alter the splicing of Tra2-␤, GLUR-B, and SMN2 reporter genes in splicing assays the alternate conformer, Ade5 is oriented away from the protein ␤ (Fig. 5B). (6, 7). The Tra2- reporter contains 4 exons (Fig. 6A), and addition of ZRANB2 promotes the exclusion of exon 3. Given our Our chemical-shift perturbation analysis revealed that M87 un- Ј derwent one of the largest chemical-shift changes upon RNA SELEX data, it is notable that the sequence of the 5 splice site of binding and the alanine point mutation M87A reduced the RNA exon 3 is AG/GUAA, whereas those of exons 1 and 2 are GG/ binding affinity of F2 (Fig. 1F). However, no contacts are observed GUAC and AA/GUGA, respectively (where the slash represents between this residue and the RNA in the crystal structure. Further, the cleavage site in the splicing reaction). To test whether the the few intermolecular NOEs that we could assign with confidence tandem zinc finger domain of ZRANB2 will bind preferentially to Ј were between the H2 proton of a single adenine and the side chains the 5 splice site of exon 3, we carried out surface plasmon Ј of M87 and V77 (Fig. S2C) . Neither of the conformations in the resonance competition experiments. A 5 -biotinylated oligonucle- crystal is consistent with these NOEs, suggesting that either the otide containing the sequence AGGUAA was immobilized on a Ade5–Ade6 dinucleotide undergoes significant motion in solution streptavidin-coated Biacore chip and a solution of ZRANB2-F12 or the conformation differs in this region in solution. Examination containing unlabeled oligonucleotides corresponding to the 5Ј of packing in the crystal (Fig. S3D) shows that a tyrosine side chain splice sites of exons 1, 2, or 3 was injected. As shown in Fig. 6B, the from a symmetry-related molecule is next to M87, and it is possible oligonucleotide from the 5Ј splice site of exon 3 caused the largest that crystal-packing forces disturb Ade5 and Ade6 from their reduction in binding of ZRANB2 to the chip, indicating that the preferred solution positions. protein has a clear preference for this sequence above the others. To assess the likely conformation of Ade5 and Ade6, we carried out a restrained molecular dynamics calculation by using HAD- Discussion DOCK (17). In this calculation, the positions of the AGGU ZRANB2 ZnFs Are ssRNA-Binding Domains. The RanBP2 ZnFs of sequence and the entire protein other than M87 and V77 were ZRANB2 bind with high specificity to ssRNA containing a core fixed. Ade5 and Ade6 were allowed to reorient freely under the GGU sequence, defining another class of sequence-specific influence of the force field and the NOEs to the protein. When the ssRNA-binding domain. The interaction is mediated predomi- NOEs are directed to Ade5 H2, the calculations revealed a single nantly by hydrogen bonds between protein side chains and the

4of6 ͉ www.pnas.org͞cgi͞doi͞10.1073͞pnas.0802466106 Loughlin et al. Downloaded by guest on September 26, 2021 bases, together with a tryptophan stacking motif. The bidentate mode of recognition as ZRANB2. Both TLS and EWS carry a arginine–guanine hydrogen bonds observed here have been ob- single change: R81 to W. Inspection of the F2:RNA structure served previously, although only in complexes involving double- reveals that the indole HN of this tryptophan could still make a stranded nucleic acids (the interaction still permits normal Watson– hydrogen bond with the O6 carbonyl of Gua2, thereby mimicking Crick base pairing). For example, Ets domain transcription factors the base specificity imparted by the arginine. Tex13a, RBM5, contain a conserved arginine in the recognition helix that recog- and RBM10 all have changes to the 2 residues that specify nizes a guanine (e.g., ref. 18), as do type II restriction enzymes such uridine at position 4 (N76 to A/L/V and N86 to F). Notably, the as BamH1 (19). Complexes formed between the arginine-rich N76L and N86F changes in RBM5 give rise to a surface that is peptides of retroviral proteins such as HIV Rev also exhibit this similar in overall shape but lacks hydrogen-bonding capacity. It type of interaction (e.g., ref. 20); here the target is dsRNA. A is therefore possible that RBM5 will accept either pyrimidine in related interaction, in which the 2 distal N␩ groups form a bidentate this position. Further, many of these proteins have already been hydrogen bond with guanine, is observed in a number of complexes, ascribed a function in splicing. For example, TLS, EWS, RBP56, including classical ZnF–DNA complexes (21). RBM5, and RBM10 all have been shown to be associated with Although several published structures [including the HIV the early spliceosome (34–36). Most recently, a role for RBM5 nucleocapsid ZnF, which contacts a flipped out guanine at the in the splicing of apoptosis-related genes was demonstrated in 2 end of an RNA stem loop (22, 23)] show Trp–purine stacking separate studies (37, 38). interactions, the ZRANB2:RNA structure appears to be an Other proteins, including MDM2 (a regulator of p53), contain example of true intercalation of a tryptophan side chain between RanBP2 ZnFs that display 1 or 2 of the RNA-binding residues 2 bases. Stacking interactions involving tyrosine or phenylala- that we have identified. If such domains can bind RNA, their nine are more common. The Tis11d ZnFs, which recognize sequence specificity will likely be very different. In contrast, the AU-rich elements (24), and the pumilio repeat proteins, which RanBP2 ZnF in Npl4 contains none of the RNA-binding resi- bind elements in the 3Ј untranslated regions of target mRNAs dues and instead harbors a conserved Thr–Phe dipeptide that (25), both display such stacks. mediates an interaction with ubiquitin (10). Similarly, the Several water-mediated hydrogen bonds and hydrogen bonds RanBP2 ZnFs of Nup153 recognize RanGTP/GDP by using a to protein backbone atoms also contribute to binding specificity Leu-Val motif in the same position (11). The RanBP2 ZnF in the in ZRANB2, including a hydrogen bond to the 2ЈOH of Gua3 putative regulator of cytokine signaling TRABID/ZRANB1 that provides support for the selectivity of ZRANB2 for ssRNA (39) contains several of the RNA-binding residues and a Thr-

over DNA. No interactions with the phosphate backbone are Tyr motif, and it will be interesting to ascertain whether both of BIOCHEMISTRY observed, in contrast to several other common classes of ssRNA the biochemical functions indicated are active in this domain. binding domains, such as RRMs (reviewed in ref. 26). Despite these differences however, several similarities exist The Function of ZRANB2. The functional data available for between the activities of the tandem ZRANB2 ZnFs and those ZRANB2 point strongly toward a role in alternative splicing. of the Tis11d/TTP/Mex-5 family. In both cases, each ZnF The observation that ZRANB2 binds both U170K and U2AF35 recognizes ssRNA in a sequence-specific manner with micro- suggests that it acts early in the splicing reaction, consistent with molar affinity and there does not appear to be a requirement for a role in splice site choice. The RNA-binding properties of the RNA to be presented in a specific conformation. In contrast, ZRANB2 draw parallels with canonical SR proteins such as a number of other RNA-binding proteins, including Nova (27), ASF/SF2 and SC35, although unlike these latter proteins RBMY (28), nucleolin (29), TFIIIA (30), and the nucleocapsid ZRANB2 does not localize to nuclear speckles. It is also notable ZnFs (23), recognize RNA in the context of a specific secondary that, although SR proteins bind RNA with high affinity, SELEX structure. The presence of 2 (or more) modular RNA-binding data have rarely revealed clear consensus sequences (3). For domains that recognize a single RNA sequence is also a common example, the structure of SRp20 bound to RNA reveals that only theme in nucleic acid recognition (31), and one that potentially 1 of the 4 nt is recognized in a sequence-specific manner (40). allows double sites with variable spacing to be recognized. This finding contrasts sharply with the well defined consensus sequence obtained here and hints at a role for ZRANB2 in Conservation of the ZRANB2–F12:RNA Interaction. The ZnFs of regulating specific transcripts, rather than a global role in ZRANB2 are highly conserved from Xenopus to humans (Fig. constitutive splicing. S4A), and a number of the residues that are important for RNA The target sequence for a single ZRANB2 ZnF strongly recognition are conserved in insects and nematodes. It is notable resembles the 5Ј splice site, which is conserved across all that the N-terminal finger of the Caenorhabditis elegans protein metazoans. It is therefore possible that ZRANB2 might act by is missing one of the conserved zinc-binding cysteines and is binding directly to a subset of 5Ј splice sites so as to prevent disordered in solution (Fig. S4B) . Yeast and rice also contain recognition of those sites by the spliceosome. Such a mode of orthologues of ZRANB2 in which the RNA recognition surface action is supported by the affinities of ZRANB2 for the different of the ZnFs is partly conserved. Further, examination of the splice sites in the Tra2-␤ minigene. Indeed in each of the exons F2:RNA structure reveals that some of the observed substitu- excluded from the transcripts of the GluR-B, SMN2 and Tra2-␤ tions could most likely retain specificity for GGU. For example, minigenes after the addition of ZRANB2, a single or double the asparagine that replaces R82 in the yeast protein could still (A)GGUA(A) site is present at or around the 5Ј splice site of the hydrogen-bond, via its side-chain NH2 moiety, to the O6 of major excluded exon. Given that 3Ј splice sites also display a GG Gua3. This high degree of conservation suggests that the RNA- dinucleotide, it is also possible that the 2 ZRANB2 ZnFs might binding activity and sequence specificity of ZRANB2 is part of simultaneously contact both splice donor and splice acceptor an ancient and important function. sites within the same transcript to influence splicing. Alternatively, ZRANB2 might recognize cryptic splice sites A Family of RanBP2 ZnF RNA-Binding Domains. Alignment of protein containing 1 or 2 AGGUAA sequences, either activating or sequences obtained from a BLAST search reveals that subsets of suppressing their use in a similar manner to that of the Dro- the residues in ZRANB2-F2 that directly contact the GGU motif sophila protein PSI (41, 42) and the pseudo 5Јsplice site in the are also present in several other human RanBP2 ZnFs (Fig. 6C). P-element transposase pre-mRNA. The fact that ZRANB2 can In fact, TLS has already been shown to bind GGUG RNA motifs accommodate a range of spacings between 2 AGGUAA motifs (32, 33), consistent with our data. RBP56 shares all of these suggests that the protein might recognize clusters of these motifs residues and so should exhibit the same sequence specificity and rather than a strict tandem site. A similar situation has been

Loughlin et al. PNAS Early Edition ͉ 5of6 Downloaded by guest on September 26, 2021 observed for the splicing factor Nova (43), which has 3 KH NMR Spectroscopy. The structure of F2, RNA titrations, and full assignments for domains separated by a long and short linker. F12 and a F2:RNA complex were determined by using standard solution NMR experiments. Details can be found in SI Text.

Methods Surface Plasmon Resonance. Competition binding experiments were carried out Expression and Purification. RanBP2-type ZnF domains from ZRANB2 and other by flowing a solution of F12 over a streptavidin chip coated with a biotinylated human proteins were expressed as GST-fusion proteins and purified by glutathi- ssRNA oligonucleotide (containing the sequence AGGUAA) in the presence of an one affinity chromatography and either gel filtration or cation exchange chro- unlabeled competitor oligonucleotide. Details can be found in SI Text. matography. Additional details are provided in SI Text. ACKNOWLEDGMENTS. We thank Merlin Crossley (University of Sydney, Syd- SELEX. The ZRANB2-F12 SELEX protocol was based on that used by Sakashita and ney) for the K562 cDNA library, Jacqui Matthews (University of Sydney, Sydney) for valuable discussions, Bill Bubb for expert maintenance of the NMR Sakamoto (44). A library of ssRNA sequences was incubated with GST-F12 on facility, and Miriam Rose Ash (University of Sydney, Sydney) for assistance with glutathione Sepharose beads, and after washing protein–RNA complexes were collecting X-ray diffraction data. This work was supported in part by a National eluted with glutathione and the selected RNA was reverse-transcribed and amplified Health and Medical Research Council Project Grant (to J.P.M.). F.E.L., R.E.M., by PCR. Sequencing of selected sequences was carried out after 7, 9, and 13 rounds and P.M.V. are supported by Australian Postgraduate Awards. SAD phasing of selection. A more detailed description of the protocol is provided in SI Text. and initial model building were carried out at the CCP4 workshop ‘‘CCP4 School: From Data Processing to Structure Refinement and Beyond’’ held in the Argonne National Laboratory (Argonne, IL) in May, 2008. The workshop Gel Shifts and Fluorescence Anisotropy Titrations. These experiments were was supported, in part, by National Cancer Institute Grant Y1-CO-1020 and carried out by using standard protocols. Details can be found in SI Text. National Institute of General Medical Science Grant Y1-GM-1104.

1. Krawczak M, Reiss J, Cooper DN (1992) The mutational spectrum of single base-pair 23. De Guzman RN, et al. (1998) Structure of the HIV-1 nucleocapsid protein bound to the substitutions in mRNA splice junctions of human genes: Causes and consequences. Hum SL3 psi-RNA recognition element. Science 279:384–388. Genet 90:41–54. 24. Hudson BP, Martinez-Yamout MA, Dyson HJ, Wright PE (2004) Recognition of the 2. Graveley BR (2000) Sorting out the complexity of SR protein functions. RNA 6:1197– mRNA AU-rich element by the zinc finger domain of TIS11d. Nat Struct Mol Biol 1211. 11:257–264. 3. Bourgeois CF, Lejeune F, Stevenin J (2004) Broad specificity of SR (serine/arginine) 25. Wang X, McLachlan J, Zamore PD, Hall TM (2002) Modular recognition of RNA by a proteins in the regulation of alternative splicing of pre-messenger RNA. Prog Nucleic human pumilio-homology domain. Cell 110:501–512. Acid Res Mol Biol 78:37–88. 26. Messias AC, Sattler M (2004) Structural basis of single-stranded RNA recognition. Acc 4. Shen H, Green MR (2006) RS domains contact splicing signals and promote splicing by Chem Res 37:279–287. a common mechanism in yeast through humans. Genes Dev 20:1755–1765. 27. Lewis HA, et al. (2000) Sequence-specific RNA binding by a Nova KH domain: Implica- 5. Li J, et al. (2003) Regulation of alternative splicing by SRrp86 and its interacting tions for paraneoplastic disease and the fragile X syndrome. Cell 100:323–332. proteins. Mol Cell Biol 23:7437–7447. 28. Skrisovska L, et al. (2007) The testis-specific human protein RBMY recognizes RNA 6. Adams DJ, et al. (2001) ZNF265, a novel spliceosomal protein able to induce alternative through a novel mode of interaction. EMBO Rep 8:372–379. splicing. J Cell Biol 154:25–32. 29. Allain FH, Gilbert DE, Bouvet P, Feigon J (2000) Solution structure of the two N-terminal 7. Li J, et al. (2008) Expression pattern and splicing function of mouse ZNF265. Neurochem RNA-binding domains of nucleolin and NMR study of the interaction with its RNA Res 33:483–489. target. J Mol Biol 303:227–241. 8. Plambeck CA, et al. (2003) The structure of the zinc finger domain from human splicing 30. Lu D, Searles MA, Klug A (2003) Crystal structure of a zinc-finger-RNA complex reveals factor ZNF265 fold. J Biol Chem 278:22805–22811. two modes of molecular recognition. Nature 426:96–100. 9. Wang B, et al. (2003) Structure and ubiquitin interactions of the conserved zinc finger 31. Lunde BM, Moore C, Varani G (2007) RNA-binding proteins: Modular design for domain of Npl4. J Biol Chem 278:20225–20234. efficient function. Nat Rev Mol Cell Biol 8:479–490. 10. Alam SL, et al. (2004) Ubiquitin interactions of NZF zinc fingers. EMBO J 23:1411–1421. 32. Iko Y, et al. (2004) Domain architectures and characterization of an RNA-binding 11. Higa MM, Alam SL, Sundquist WI, Ullman KS (2007) Molecular characterization of the protein, TLS. J Biol Chem 279:44834–44840. ran-binding zinc finger domain of NUP153. J Biol Chem 282:17090–17100. 33. Lerga A, et al. (2001) Identification of an RNA binding specificity for the potential 12. Zuker M (2003) Mfold web server for nucleic acid folding and hybridization prediction. splicing factor TLS. J Biol Chem 276:6807–6816. Nucleic Acids Res 31:3406–3415. 34. Behzadnia N, et al. (2007) Composition and three-dimensional EM structure of double 13. Ast G (2004) How did alternative splicing evolve? Nat Rev Genet 5:773–782. affinity-purified, human prespliceosomal A complexes. EMBO J 26:1737–1748. 14. Zhang MQ (1998) Statistical features of human exons and their flanking regions. Hum 35. Deckert J, et al. (2006) Protein composition and electron microscopy structure of Mol Genet 7:919–932. affinity-purified human spliceosomal B complexes isolated under physiological condi- 15. Yu GW, Allen MD, Andreeva A, Fersht AR, Bycroft M (2006) Solution structure of the tions. Mol Cell Biol 26:5528–5543. C4 zinc finger domain of HDM2. Protein Sci 15:384–389. 36. Hartmuth K, et al. (2002) Protein composition of human prespliceosomes isolated by a 16. Davis IW, et al. (2007) MolProbity: All-atom contacts and structure validation for tobramycin affinity-selection method. Proc Natl Acad Sci USA 99:16719–16724. proteins and nucleic acids. Nucleic Acids Res 35:W375–383. 37. Bonnal S, et al. (2008) RBM5/Luca-15/H37 regulates Fas alternative splice site pairing 17. Dominguez C, Boelens R, Bonvin AM (2003) HADDOCK: A protein–protein docking after exon definition. Mol Cell 32:81–95. approach based on biochemical or biophysical information. J Am Chem Soc 125:1731– 38. Fushimi K, et al. (2008) Up-regulation of the proapoptotic caspase 2 splicing isoform by 1737. a candidate tumor suppressor, RBM5. Proc Natl Acad Sci USA 105:15708–15713. 18. Mo Y, Vaessen B, Johnston K, Marmorstein R (1998) Structures of SAP-1 bound to DNA 39. Evans PC, et al. (2001) Isolation and characterization of two novel A20-like proteins. targets from the E74 and c-fos promoters: Insights into DNA sequence discrimination Biochem J 357:617–623. by Ets proteins. Mol Cell 2:201–212. 40. Hargous Y, et al. (2006) Molecular basis of RNA recognition and TAP binding by the SR 19. Newman M, Strzelecka T, Dorner LF, Schildkraut I, Aggarwal AK (1994) Structure of proteins SRp20 and 9G8. EMBO J 25:5126–5137. restriction endonuclease bamhi phased at 1.95-Å resolution by MAD analysis. Structure 41. Labourier E, Adams MD, Rio DC (2001) Modulation of P-element pre-mRNA splicing by (London) 2:439–452. a direct interaction between PSI and U1 snRNP 70K protein. Mol Cell 8:363–373. 20. Ye X, Gorin A, Ellington AD, Patel DJ (1996) Deep penetration of an ␣-helix into a 42. Adams MD, Tarng RS, Rio DC (1997) The alternative splicing factor PSI regulates widened RNA major groove in the HIV-1 rev peptide-RNA aptamer complex. Nat Struct P-element third intron splicing in vivo. Genes Dev 11:129–138. Biol 3:1026–1033. 43. Ule J, et al. (2006) An RNA map predicting Nova-dependent splicing regulation. Nature 21. Pavletich NP, Pabo CO (1991) Zinc finger-DNA recognition: Crystal structure of a 444:580–586. Zif268-DNA complex at 2.1 Å. Science 252:809–817. 44. Sakashita E, Sakamoto H (1994) Characterization of RNA binding specificity of the 22. D’Souza V, Summers MF (2004) Structural basis for packaging the dimeric genome of Drosophila sex-lethal protein by in vitro ligand selection. Nucleic Acids Res 22:4082– Moloney murine leukemia virus. Nature 431:586–590. 4086.

6of6 ͉ www.pnas.org͞cgi͞doi͞10.1073͞pnas.0802466106 Loughlin et al. Downloaded by guest on September 26, 2021