Structure and Expression of the Drosophila Melanogaster Gene for the Ul Small Nuclear Ribonucleoprotein Particle 70K Protein RICARDO MANCEBO, PATRICK C
Total Page:16
File Type:pdf, Size:1020Kb
MOLECULAR AND CELLULAR BIOLOGY, June 1990, p. 2492-2502 Vol. 10, No. 6 0270-7306/90/062492-11$02.00/0 Copyright C) 1990, American Society for Microbiology Structure and Expression of the Drosophila melanogaster Gene for the Ul Small Nuclear Ribonucleoprotein Particle 70K Protein RICARDO MANCEBO, PATRICK C. H. LO, AND STEPHEN M. MOUNT* Department ofBiological Sciences, Columbia University, New York, New York 10027 Received 11 January 1990/Accepted 14 February 1990 A genomic clone encoding the Drosophila Ul small nuclear ribonucleoprotein particle 70K protein was isolated by hybridization with a human Ul small nuclear ribonucleoprotein particle 70K protein cDNA. Southern blot and in situ hybridizations showed that this Ul 70K gene is unique in the Drosophila genome, residing at cytological position 27D1,2. Polyadenylated transcripts of 1.9 and 3.1 kilobases were observed. While the 1.9-kilobase mRNA is always more abundant, the ratio of these two transcripts is developmentally regulated. Analysis of cDNA and genomic sequences indicated that these two RNAs encode an identical protein with a predicted molecular weight of 52,879. Comparison of the Ul 70K proteins predicted from Drosophila, human, and Xenopus cDNAs revealed 68% amino acid identity in the most amino-terminal 214 amino acids, which include a sequence motif common to many proteins which bind RNA. The carboxy-terminal half is less well conserved but is highly charged and contains distinctive arginine-rich regions in all three species. These arginine-rich regions contain stretches of arginine-serine dipeptides like those found in transformer, transform- er-2, and suppressor-of-white-apricot proteins, all of which have been identified as regulators of mRNA splicing in Drosophila melanogaster. Splicing of pre-mRNA occurs in a two-step reaction protein makes contacts with Ul snRNA at the first stem-loop requiring the assembly of a large complex (the spliceosome) (20, 41) in a sequence-specific manner (46, 56). The carboxy- which includes several small nuclear ribonucleoprotein par- terminal portion of the protein is particularly rich in arginine ticles (snRNPs) (see references 19, 32, and 52 for reviews). residues and contains repetitions of arginine-glutamic acid, Each snRNP consists of at least one snRNA bound to arginine-aspartic acid, and arginine-serine. It appears that several proteins, some of which are common to all the this region is responsible for the altered mobility of the 70K spliceosomal snRNPs. The human Ul snRNP is composed protein on sodium dodecyl sulfate-polyacrylamide gels (45). of a 164-nucleotide (nt) Ul snRNA molecule complexed to at Similar charged regions have been found in the predicted least 10 proteins (30). It binds to 5' splice site sequences by protein products of the Drosophila suppressor-of-white- base pairs involving the 5' end of Ul RNA (12, 24, 35, 62). apricot [SU(Wa)] (13), transformer (tra) (9), and transformer-2 There is also evidence that the Ul snRNP interacts with the (tra-2) (18) loci, as well as in a mouse major histocompati- U2 snRNP during splicing (8, 12), and that this interaction bility complex gene of unknown function (29). All these occurs early and is required for further stages of splicing (25, genes with the exception of the last have recently been 48, 50). Three ofthe human Ul snRNP proteins, 70K, A, and shown to be regulators of RNA splicing (2, 61). C, are unique to the Ul snRNP (21, 43); the other Ul snRNP The Ul 70K protein is ideally suited among spliceosomal proteins are common to the U2, U5, and U4/U6 snRNPs. components for a detailed functional investigation for sev- Patients with autoimmune disorders produce antibodies eral reasons. Among them are the availability of monospe- which specifically immunoprecipitate the Ul snRNP (28), cific antisera, the presence of the 70K protein in a complex and many of these are directed against epitopes on the 70K (the Ul snRNP) having a known and important function in protein (43, 60). Anti-RNP antisera have been shown to splicing, and the existence of a sequence motif found in block in vitro splicing when they are added to splicing proteins which act to regulate splicing. The precise role of reactions (40). Such sera, as well as a mouse monoclonal the Ul 70K protein in splicing is unknown. Specific regions antibody to the mouse Ul snRNP 70K protein (6), have of this protein could participate in (i) the recognition ofthe 5' enabled researchers to clone cDNAs encoding the Ul 70K splice site by the Ul snRNP; (ii) interactions between Ul protein from expression libraries (47, 59). and U2 snRNPs; (iii) interactions between Ul snRNPs and Sequence data from these cDNAs have provided informa- other proteins required for splicing; (iv) interactions between tion regarding the primary structure of the Ul 70K protein. Ul snRNPs and other RNAs. We have chosen Drosophila First, it appears that this protein is actually 52 kilodaltons in melanogaster as a system in which to analyze Ul 70K size (45, 53). We have followed others (15, 46) in using Ul protein function using genetic and biochemical techniques. 70K or 70K as a name for this protein, although this name In this communication, we report on the structure and says nothing about the size of the protein. A conserved expression of the Drosophila Ul 70K protein gene. Of region of amino acid sequence found in a number of RNA- particular interest is the pattern of conservation of sequence binding proteins (1, 3, 33, 57) occurs in the amino-terminal elements within the protein-coding region. half of the human Ul 70K protein at amino acids 104 to 183, and this region of the protein together with a minimum of MATERIALS AND METHODS flanking amino acids (amino acids 92 to 202) is capable of Screening of genomic and cDNA libraries. To isolate binding Ul snRNA specifically (45). The human Ul 70K Drosophila Ul 70K genomic clones, we screened a XEMBL4 genomic library (Oregon R strain; prepared by Mike Gold- * Corresponding author. berg) with the human 70K cDNA clone (provided by C. C. 2492 VOL. 10, 1990 DROSOPHILA GENE ENCODING Ul 70K PROTEIN 2493 EZIZORF B S Bs SE II I I pBS9.1 INVT ZUlZ ur *uNA I INTROIiN 2912PPDNDNS BS I I I11 X R3-2.1 113J5 BP aDNA --_lI *13-73 E E S S es E D33 1.1 I I I I I I x I 356 BP Gooml I 1 kb E B EE SS SBE S S SE 1 1 111111 11 I I1 .t SHOIRT LONG AR M ARM I kb FIG. 1. Structure of the cDNA and genomic clone inserts. The bold and thin wavy lines depict the major and minor Ul 70K gene transcripts, respectively, and their direction of transcription. ORF is the 448-amino-acid ORF determined by conceptual translation of the 2.9-kb insert within pBS9.1, a cDNA clone. A 79-bp intron lines 48 bp upstream of the ORF in pBS9.1. XR3-2.1 has a 1.4-kb cDNA insert extending to a natural polyadenylation site. The position of the subclone used to isolate pBS9.1 is shown beneath the map of XR3-2.1. The jagged end of m13-73 indicates that this end has not been mapped. The bottom line shows a restriction map of the entire 14.5-kb genomic insert of XD331.1. Above it is an enlargement of the XD331.1 region from which the cDNAs are derived. The arrow below this enlargement indicates again the start and direction of transcription of the Ul 70K gene. pRM1 is a 620-bp BamHI-SalI genomic fragment in pUC19. Enzyme sites are as follows: E, EcoRI, B, BamHI; S, Sall. The star denotes one of two possible positions for this Sall site. Query and J. D. Keene) by the method of Maniatis et al. (31). m13-73 (Fig. 1) and incorporating [a-32P]dCTP with the The probe was labeled as described by Feinberg and Vo- Klenow fragment. The probe was heat denatured and sepa- gelstein (17). rated on a low-melting-temperature agarose gel (see refer- For cDNA cloning, a total of approximately 100,000 ence 39). plaques from 12- to 24-h-old embryonic, late-third-instar In situ hybridizations. In situ hybridizations were done by larva, and adult male libraries in XgtlO (44) were screened the method of Laverty and Rubin (personal communication with a subclone of XD331.1 in M13 (the 620-nt Bam-Sal and unpublished data). fragment of pRM1; Fig. 1) by the method of Maniatis et al. Isolation of poly(A)+ RNA. Poly(A)+ RNA was isolated (31). From an adult head cDNA library prepared by selecting from embryonic, larval, pupal, and adult stages of Oregon R fragments larger than 2.3 kilobases (kb) in the Lambda Zap flies as described previously (39), using a Brinkmann Poly- vector (Stratagene, La Jolla, Calif.), 270,000 plaques were tron homogenizer. screened with m13-73 (Fig. 1) as a probe. The probe was DNA sequencing. Random subfragments of the pBS9.1 and labeled as described by Feinberg and Vogelstein (17). XR3-2.1 inserts were generated by sonication, subcloned into Southern and Northern (RNA) blots. Southern blots were M13mpl8, and sequenced by the dideoxynucleotide-chain done by the method of Maniatis et al. (31) by loading 5 ,g of termination method (49) with the modified T7 DNA polymer- Oregon R genomic DNA digested with BamHI, EcoRI, ase Sequenase (U.S. Biochemical Corp., Cleveland, Ohio) HindIll, PstI, or SalI onto a 0.8% agarose gel. The radioac- (58). Underrepresented cDNA regions and genomic se- tive probe (pBS9.1; Fig.