
Proc. Natl. Acad. Sci. USA
Vol. 83, pp. 6030-6034, August 1986
Genetics

Genes for the eight ribosomal proteins are clustered on the chloroplast genome of tobacco (Nicotiana tabacum): Similarity to the S10 and spc operons of Escherichia coli

(molecular cloning/DNA sequence///blot hybridization)

MINORU TANAKA*, TATSUYA WAKASUGI*, MAMORU SUGITA†, KAZUO SHINOZAKI*, AND MASAHIRO SUGIURA*‡

*Center for Gene Research, Nagoya University, Chikusa, Nagoya 464, Japan; and †Department of Botany, Hokkaido University, Sapporo 060, Japan

Communicated by Dan L. Lindsley, April 21, 1986

ABSTRACT The nucleotide sequence of a tobacco (Nicotiana tabacum) chloroplast gene cluster that encodes eight proteins homologous to Escherichia coli ribosomal proteins L23, L2, S19, L22, S3, L16, L14, and S8 has been determined. RNA gel blot hybridization revealed that all eight coding regions are expressed in the chloroplasts. The arrangement of the eight genes resembles that found in the E. coli S10 and spc operons. Among the eight genes, the L2 and L16 genes contain 666- and 1020-base-pair introns, respectively. These intron boundary sequences are consistent with the conserved boundary sequences of the chloroplast group III introns [Shinozaki, K., Deno, H., Sugita, M., Kuramitsu, S. & Sugiura, M. (1986) Mol. Gen. Genet. 202, 1-5].

Chloroplast ribosomes in higher plants are 70S in size and contain 23S, 16S, 5S, and 4.5S rRNAs and ≈60 ribosomal proteins (1). Analyses of the synthesis of ribosomal proteins in isolated chloroplasts have shown that a chloroplast genome encodes about one-third of the ribosomal proteins in higher plant species (2, 3). Since the identification and sequencing of the first tobacco chloroplast small subunit ribosomal protein (CS) gene for CS19 (4), several additional genes for chloroplast ribosomal proteins located in chloroplast genomes have been identified through their homology with Escherichia coli genes (5-9).

Because of the success of this approach, we searched for further ribosomal protein genes in tobacco chloroplast DNA by hybridization with λfus3 DNA, which contains four E. coli ribosomal protein operons (e.g., ref. 25). We sequenced the DNA region that hybridized strongly with the E. coli probe and found open reading frames (ORF) whose amino acid sequences resemble those of E. coli ribosomal proteins. Here we describe a gene cluster encoding eight ribosomal proteins in tobacco chloroplast DNA. The organization of the eight tobacco chloroplast genes is similar to that found in corresponding order in the E. coli S10 and spc operons. Nevertheless, two of the above eight genes contain long introns in spite of their homologies with genes encoding the corresponding E. coli ribosomal proteins.

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

Abbreviations: bp, base pairs; CL, chloroplast large subunit ribosomal protein; CS, chloroplast small subunit ribosomal protein; ORF, open reading frame.
‡To whom reprint requests should be addressed.

MATERIALS AND METHODS

The transducing phage λfus3 was kindly provided by K. Isono, and its 10 and 4.6% EcoRI fragments that contain parts of the S10 and spc operons of E. coli were used as probes. Southern blot hybridization was performed in 28% (vol/vol) formamide, 1 M NaCl, 10 mM Tris-HCl (pH 7.5), and 1x Denhardt's solution at 37°C for 24 hr, and filters were washed in 6x SSC at 37°C for 2 hr. (1x SSC = 0.15 M NaCl, 0.015 M sodium citrate, pH 7.0.)

Recombinant plasmids pTBa1, pTBa7, pTBa8, and pTP10 containing 19.3-, 5.0-, and 4.8-kilobase-pair (kbp) BamHI fragments and a 2.9-kbp Pst I fragment of Nicotiana tabacum var. Bright Yellow 4 chloroplast DNA, respectively, were constructed as described using pBR322 (ref. 10 and Fig. 1). The DNA sequence was determined by a combination of the chemical method (11) and the dideoxy chain-termination method (12) using mp10/11 and E. coli JM109. DNA sequences were analyzed using the GENETYX software system (Software Development, Tokyo). RNA gel blot hybridization was carried out as described (13).

RESULTS

DNA Sequence. BamHI digests of tobacco chloroplast DNA blotted to nylon filter sheets were hybridized with nick-translated λfus3 DNA fragments that carried portions of the E. coli S10 and spc operons. The DNA probes hybridized strongly to the 5.0-kbp BamHI fragment (Ba7), moderately to the 19.3- and 4.8-kbp BamHI fragments (Ba1 and Ba8, respectively), and weakly to several other fragments (data not shown). A part of the Ba7 fragment has previously been sequenced and shown to contain the gene for the ribosomal protein CS19 (rps19 gene product) (4), and the junction (JLB) between the inverted repeat B (IRB) and the large single-copy region (14).

For the present study, we sequenced the entire Ba7 fragment and its adjacent part of the Ba1 fragment by the strategy shown in Fig. 2. We used the 2.9-kbp Pst I fragment (Ps10) to confirm that there were no small BamHI pieces between the Ba1 and Ba7 fragments (see Fig. 1). Fig. 3 shows the DNA sequence of a 6207-bp portion (the left end of Ba7 to the Taq I site in S11, see Fig. 2). The Ba8 fragment lies in a symmetrical position to Ba7 on the circular chloroplast DNA (see Fig. 1) and has been shown to contain the junction (JLA) between the inverted repeat A (IRA) and the large single-copy region, trnH (14), psbA (15), and the 3' exon of trnK (16). We also sequenced the remaining portion (within IRA) of the Ba8 fragment by the same strategy shown in Fig. 2. The 2098-bp sequence (JLA to the first BamHI site) of IRA was found to be completely identical to the corresponding sequence (JLB to the first BamHI site) of IRB. Determination of the 6207-bp sequence revealed that there is a gene for tRNAIle (details will be published elsewhere) and 11 ORFs on the same DNA strand (strand B).
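The comparison of the two inverted-repeat copies described in the DNA Sequence paragraph can be illustrated with a minimal Python sketch. Because IRA and IRB lie in opposite orientations on the circular DNA, two copies are identical when one of them, read on a given strand, equals the reverse complement of the other read on the same strand. The function names and the toy sequence below are hypothetical illustrations and are not taken from the tobacco sequence or from the software used by the authors.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def inverted_copies_identical(copy_a: str, copy_b: str) -> bool:
    """True if copy_a equals the reverse complement of copy_b.

    Both copies are assumed to be read 5'->3' on the same strand,
    as they would appear in a single linearized genome sequence.
    """
    return copy_a.upper() == reverse_complement(copy_b)


if __name__ == "__main__":
    # Toy example (hypothetical sequence, not tobacco chloroplast DNA).
    toy_ir_b = "ATGGCCTTAAGC"
    toy_ir_a = reverse_complement(toy_ir_b)
    print(inverted_copies_identical(toy_ir_a, toy_ir_b))
```

In practice the two 2098-bp stretches would be read from the sequenced clones and passed to the same comparison; a single mismatch would make the function return False.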

FIG. 1. Positions of the cloned fragments and the genes for the ribosomal proteins on the Sal I cleavage map of tobacco chloroplast DNA. The genes for the large subunit of ribulose bisphosphate carboxylase (rbcL), the 32-kDa membrane protein (psbA), and the rRNA operons (rrnA and rrnB) are marked. JLA and JLB are the junctions between the inverted repeats A and B (IRA and IRB) and the large single-copy region (LSC).

FIG. 2. Physical map of the cloned Ba7, Ba8, Ps10 fragments, a part of the cloned Ba1 fragment of tobacco chloroplast DNA, and the strategy for sequencing parts of them. The locations of ORFs and genes are shown in the upper boxes. Arrows show the direction and extent of the DNA regions sequenced by the chemical method and by the dideoxy chain-termination method (each marked with its own arrow style).

FIG. 3. DNA sequence of the 6207-bp region containing the genes for the eight ribosomal proteins of tobacco chloroplasts. The RNA-like strand (strand B) is presented. Coding regions including introns are boxed. The deduced amino acid sequences are shown below the DNA sequences. Triangles indicate possible intron sites. Start and stop codons of the 11 ORFs are underlined twice. Sequences similar to the -35 region, the -10 region, and the Shine-Dalgarno signal are underlined. The sequence between positions 2023 and 2502 has been reported (4).

FIG. 4. Comparisons of the deduced amino acid sequences of tobacco chloroplast ribosomal proteins (C) with those of corresponding E. coli ribosomal proteins (E). Homologous residues are boxed. Triangles indicate intron sites, and numerals indicate amino acid residues.

Gene Cluster Coding for Proteins Homologous to E. coli Ribosomal Proteins. We have reported the sequence of the rps19 gene in tobacco chloroplast DNA (4). ORF5 shown here corresponds to rps19 (Fig. 3). rps19 has been reported in spinach (7), Nicotiana debneyi (7), and duckweed (17). The gene for the chloroplast large subunit ribosomal protein (CL) CL2 (rpl2) has been found upstream from rps19 in spinach and N. debneyi chloroplast DNAs, and the N. debneyi rpl2 has been found to contain a 666-bp intron (7). Based on their high homology with the N. debneyi rpl2, ORF2 and ORF4 represent the tobacco (N. tabacum) rpl2 that also contains a 666-bp intron (positions 943-1608). The 62-codon ORF3 is within the intron and unlikely to be a gene for a ribosomal protein. The tobacco CL2 protein, deduced from the DNA sequence, shows 85, 82, and 48% sequence homologies with the deduced N. debneyi CL2, spinach CL2, and E. coli L2 proteins, respectively (all protein sequences are deduced from the DNA sequences hereafter).

The proteins predicted from ORF1, ORF6, and ORF7 show 23, 26, and 38% sequence homologies with the E. coli L23, L22, and S3 proteins, respectively. In the case of S3 proteins, several gaps were introduced to maximize the homology. The E. coli S10 operon consists of the genes for S10, L3, L4, L23, L2, S19, L22, S3, L16, L29, and S17 in this order (18). The order of ORF1, ORF2-4 (rpl2), ORF5 (rps19), ORF6, and ORF7 is the same as that of the S10 operon. Therefore, we propose that ORF1, ORF6, and ORF7 are the genes for the CL23, CL22, and CS3 proteins (rpl23, rpl22, and rps3), respectively.

The deduced polypeptide for ORF9 showed a sequence homology with the E. coli L16 protein. When ORF9 is combined with the short 3-codon ORF (positions 3690-3698) between ORF7 and ORF8 and a 1020-bp insertion is introduced, it is more similar to the E. coli L16 protein (56% homology). We, therefore, propose that the short ORF plus ORF9 is the gene for CL16 (rpl16). The intron sequence was assigned to be the positions 3699-4718 by comparing the conserved intron-exon boundary sequences of the chloroplast group III introns (9) and the E. coli L16 sequence. The intron is 1020 bp long, which is the longest intron so far analyzed in chloroplast genes for proteins. The 80-codon ORF8 is in this intron and showed no homology with any of the E. coli ribosomal proteins nor with ORF3 in the rpl2 intron.

The proteins derived from ORF10 and ORF11 have 55 and 42% (with gaps) sequence homologies with the E. coli L14 and S8 proteins, respectively, suggesting that ORF10 and ORF11 are the genes for CL14 and CS8 (rpl14 and rps8). The E. coli spc operon contains the genes for L14, L24, L5, S14, S8, and so on in this order (19). The order of ORF10 (rpl14) and ORF11 (rps8) is similar to the spc operon when the genes for L24, L5, and S14 are deleted. The gene for CS14 has been found before the tRNAfMet gene in the middle of the large single-copy region of liverwort chloroplast DNA (8) and tobacco chloroplast DNA (unpublished data). The deduced sequences of the seven genes for the chloroplast ribosomal proteins so far reported showed 36-68% homologies with those of E. coli counterparts (4-9). Therefore, sequence homologies between the proteins deduced from the eight tobacco ORFs and the corresponding E. coli ribosomal proteins (Fig. 4) together with the coincidence of their order (Fig. 5) indicate that these eight ORFs are most likely to be the genes for those ribosomal proteins encoded by the chloroplast DNA.

Expression of the Eight ORFs. Total tobacco chloroplast RNA extracted from young tobacco leaves was electrophoresed, transferred to nylon membrane sheets, and hybridized with nick-translated probes containing each ORF. All the eight probes hybridized to several RNA bands ranging from 0.3 to 4.5 kilobases as listed in Table 1. These results indicate that all eight ORFs are expressed in the chloroplasts and that at least some of the ORFs are cotranscribed. We, therefore, concluded that the eight ORFs represent the genes for the chloroplast ribosomal proteins. This is the first example of a long ribosomal protein gene cluster found in a chloroplast genome.

DISCUSSION

The genes for eight ribosomal proteins, rpl23, rpl2, rps19, rpl22, rps3, rpl16, rpl14, and rps8, are clustered in this order in tobacco chloroplast genome. Surprisingly, this order corresponds to that of the homologous genes in the E. coli S10 and spc operons. This finding raises the interesting possibility that the genes for ribosomal proteins of chloroplast and E. coli may have evolved from a common ancestral gene set.

The rpl23 and rpl2 genes lie in the 26-kbp inverted repeat of the chloroplast DNA, indicating that these genes are present in two copies. We have sequenced both copies of rpl23 and rpl2 and found them to be identical, suggesting that the inverted repeats A and B are highly homologous or identical throughout the entire 26-kbp sequence.
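The gene-order argument and the intron coordinates reported in Results can be cross-checked with a short Python sketch. It uses only the gene orders and sequence positions quoted in the text; the function names are hypothetical, and the spc operon list contains just the first five genes because the text gives no more ("and so on"). The check treats "same order" as "the tobacco list is an ordered subsequence of the concatenated E. coli operons", which is an assumption about how to formalize the comparison rather than the authors' stated procedure.

```python
# Gene orders as quoted in the text (E. coli operons, refs. 18 and 19).
S10_OPERON = ["S10", "L3", "L4", "L23", "L2", "S19", "L22", "S3", "L16", "L29", "S17"]
SPC_OPERON = ["L14", "L24", "L5", "S14", "S8"]  # only the genes named in the text
TOBACCO_CLUSTER = ["L23", "L2", "S19", "L22", "S3", "L16", "L14", "S8"]


def is_ordered_subsequence(short, long):
    """True if every item of `short` occurs in `long` in the same relative order."""
    remaining = iter(long)
    # `item in remaining` advances the iterator, so order is enforced.
    return all(item in remaining for item in short)


def span_length(first, last):
    """Length in bp of a region given inclusive 1-based start and end positions."""
    return last - first + 1


if __name__ == "__main__":
    e_coli_order = S10_OPERON + SPC_OPERON
    print("order conserved:", is_ordered_subsequence(TOBACCO_CLUSTER, e_coli_order))
    print("absent from cluster:", [g for g in e_coli_order if g not in TOBACCO_CLUSTER])
    print("rpl2 intron (943-1608):", span_length(943, 1608), "bp")
    print("rpl16 intron (3699-4718):", span_length(3699, 4718), "bp")
    print("short ORF (3690-3698):", span_length(3690, 3698) // 3, "codons")
```

The computed lengths agree with the 666-bp and 1020-bp introns and the 3-codon exon stated in the text, and the genes reported as absent between rpl14 and rps8 (L24, L5, S14) appear in the list of E. coli genes missing from the tobacco cluster.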
E. coli, S10 operon: L23 L2 S19 L22 S3 L16 L29 S17; spc operon: L14 L24 L5 S14 S8
Tobacco chloroplast: CL23 CL2 CS19 CL22 CS3 CL16 CL14 CS8

FIG. 5. Comparison of the gene arrangement of the tobacco chloroplast ribosomal protein gene cluster with those of the E. coli S10 and spc operons. Boxes indicate introns.

Table 1. Major RNA bands detected by blot hybridization

Gene probe*    Bands, kilobases†
rpl23          1.8, 1.0, 0.3
rpl2           2.8, 1.6, 1.0, 0.3
rps19          3.3, 1.9, 1.5
rpl22          4.5, 3.3, 2.1, 1.3
rps3           3.0, 1.6, 0.9, 0.5
rpl16          3.8, 1.8, 1.3, 0.3
rpl14          4.0, 2.1, 1.5, 0.7
rps8           3.3, 1.0, 0.6, 0.3

*Probes used are 343-bp Ava II (rpl23), 298-bp Acc I-Rsa I (rpl2), 189-bp Sma I-Xba I (rps19), 217-bp Xba I-Pst I (rpl22), 382-bp Sal I-Xba I (rps3), 132-bp Xba I-BamHI (rpl16), 229-bp Aha III-Pst I (rpl14), and 283-bp Sal I-HindIII (rps8) fragments.
†Approximate sizes in kilobases estimated using tobacco mosaic virus RNA and E. coli 23S, 16S, and 4S RNA as size markers.

The rpl2 and rpl16 genes were found to contain 666- and 1020-bp introns, respectively. A 666-bp intron has been reported in the N. debneyi rpl2 but not in the spinach rpl2 (7). We assigned the tobacco rpl2 intron site between the first and second nucleotides of the 131st codon (A|CC) so as to match the conserved intron boundary sequences of chloroplast group III introns (9). This site is shifted one nucleotide from the N. debneyi site previously suggested (7). The rpl2 and rpl16 introns contain the 62-codon ORF3 and the 80-codon ORF8, respectively, both starting with GTG and ending with TAA. There is no significant homology between ORF3 and ORF8. It should be noted that the 2526-bp intron of the tobacco gene for tRNALys (UUU) has a 509-codon ORF (16). At present no function of these ORFs is known.

The relatively low homology (23%) of L23 proteins between tobacco and E. coli may be due partly to their location in ribosomes. The L23 protein has been reported to be located near the base of the crown projection in the 50S subunits, which has been one of the more variable regions during the course of evolution (20). The tobacco CL22 (155 residues) is 45 residues longer than the E. coli L22 (110 residues), which makes their calculated homology lower (26%). Another unique feature is that the rpl22 overlaps the rps3 coding region by 13 bp. High homologies of L2, L16, S8, and S19 proteins between tobacco and E. coli (42-57%) may be the reflection of their important functions in the ribosomes; the L2 and L16 proteins are known to be involved in peptidyltransferase activity (21) and the S8 protein to bind to 16S rRNA in the early stage of the 30S subunit assembly (22).

Long transcripts that encompass several ribosomal protein genes have been detected in the chloroplasts, indicating that at least some of the genes are transcribed polycistronically. Further studies are necessary to determine whether the eight genes discussed in this paper constitute a single operon. Interestingly, small RNA bands of 0.3-0.5 kilobases, which are shorter than most of the genes, were clearly detected. They may be degradation products of the mRNAs. As these small RNAs seem to be stable, another possibility would be that they are some elements in posttranscriptional regulation or processing of the ribosomal protein mRNAs. Short RNA species have also been observed among transcripts of the putative gene cluster for the apocytochrome b6 and the subunit 4 of the cytochrome b/f complex in spinach (23) and the gene cluster for the ATPase subunits I and III in tobacco chloroplasts (24), but no apparent short RNAs have been found in the monocistronic genes of tobacco chloroplasts so far examined.

In the 165-bp region between the gene for tRNAIle and rpl23, sequences similar to the E. coli -35 and -10 regions were found, suggesting that the initiation site of transcription is located before rpl23. No promoter-like sequences were observed in the 127-bp spacer between rpl16 and rpl14, although the E. coli genes for the L16 and L14 proteins belong to separate operons, S10 and spc, respectively. Shine-Dalgarno-like sequences were found in front of rps19, rpl22, rps3, and rpl14 coding regions but not in the other genes. It is important to elucidate what additional regulatory sequences may be involved in the coordinate synthesis of ribosomal proteins in the chloroplasts.

Note Added in Proof. In the region downstream from rps8, we have found additional sequences homologous to the E. coli infA, secX, rpsK, and rpoA in this order on the same strand.

We thank Dr. R. A. Bonchard for editing this manuscript. This work was supported in part by a grant-in-aid from the Ministry of Education, Science and Culture, Japan.

1. Dyer, T. A. (1984) in Chloroplast Biogenesis, eds. Baker, N. R. & Barber, J. (Elsevier, Amsterdam), pp. 23-69.
2. Eneas-Filho, J., Hartley, M. R. & Mache, R. (1981) Mol. Gen. Genet. 184, 484-488.
3. Dorne, A. M., Lescure, A. M. & Mache, R. (1984) Plant Mol. Biol. 3, 83-90.
4. Sugita, M. & Sugiura, M. (1983) Nucleic Acids Res. 11, 1913-1918.
5. Subramanian, A. R., Steinmetz, A. & Bogorad, L. (1983) Nucleic Acids Res. 11, 5277-5286.
6. Montandon, P. E. & Stutz, E. (1984) Nucleic Acids Res. 12, 2851-2859.
7. Zurawski, G., Bottomley, W. & Whitfeld, P. R. (1984) Nucleic Acids Res. 12, 6547-6558.
8. Umesono, K., Inokuchi, H., Ohyama, K. & Ozeki, H. (1984) Nucleic Acids Res. 12, 9551-9565.
9. Shinozaki, K., Deno, H., Sugita, M., Kuramitsu, S. & Sugiura, M. (1986) Mol. Gen. Genet. 202, 1-5.
10. Sugiura, M. & Kusuda, J. (1979) Mol. Gen. Genet. 172, 137-141.
11. Maxam, A. M. & Gilbert, W. (1977) Proc. Natl. Acad. Sci. USA 74, 560-564.
12. Sanger, F., Nicklen, S. & Coulson, A. R. (1977) Proc. Natl. Acad. Sci. USA 74, 5463-5467.
13. Ohme, M., Kamogashira, T., Shinozaki, K. & Sugiura, M. (1985) Nucleic Acids Res. 13, 1045-1056.
14. Sugita, M., Kato, A., Shimada, H. & Sugiura, M. (1984) Mol. Gen. Genet. 194, 200-205.
15. Sugita, M. & Sugiura, M. (1984) Mol. Gen. Genet. 195, 308-313.
16. Sugita, M., Shinozaki, K. & Sugiura, M. (1985) Proc. Natl. Acad. Sci. USA 82, 3557-3561.
17. Posno, M., Torenvliet, D. J., Lustig, H., van Noort, M. & Groot, G. S. P. (1985) Curr. Genet. 9, 211-219.
18. Zurawski, G. & Zurawski, S. M. (1985) Nucleic Acids Res. 13, 4521-4526.
19. Cerretti, D. P., Dean, D., Davis, G. R., Bedwell, D. M. & Nomura, M. (1983) Nucleic Acids Res. 11, 2599-2616.
20. Noller, H. F. (1984) Annu. Rev. Biochem. 53, 119-162.
21. Hampl, H., Schulze, H. & Nierhaus, K. H. (1981) J. Biol. Chem. 256, 2284-2288.
22. Nomura, M. & Held, W. A. (1974) in Ribosomes, eds. Nomura, M., Tissieres, A. & Lengyel, P. (Cold Spring Harbor Laboratory, Cold Spring Harbor, NY), pp. 193-223.
23. Heinemeyer, W., Alt, J. & Herrmann, R. G. (1984) Curr. Genet. 8, 543-549.
24. Deno, H., Shinozaki, K. & Sugiura, M. (1984) Gene 32, 195-201.
25. Watson, J. C. & Surzycki, S. J. (1983) Curr. Genet. 7, 201-210.