Proc. Natl. Acad. Sci. USA
Vol. 77, No. 3, pp. 1369-1373, March 1980
Biochemistry

DNA sequence of the Serratia marcescens lipoprotein gene
(promoter/A-T base pairs/secretion/signal peptide/loop model/gene evolution)

KENZO NAKAMURA AND MASAYORI INOUYE
Department of Biochemistry, State University of New York, Stony Brook, New York 11794
Communicated by Charles Yanofsky, December 12, 1979

ABSTRACT The Serratia marcescens gene for the outer membrane lipoprotein (lpp) was cloned in λ phage vector Charon 14. The recombinant phage was very unstable, and the lpp gene with a 300-base-pair deletion at the transcription termination site was further cloned in pBR322. The DNA sequence of 834 base pairs encompassing the lpp gene was determined and compared with that of the Escherichia coli lpp gene. The sequence comparisons exhibit several unique features. (i) The promoter region is highly conserved (84% homology) and has an extremely high A+T content (78%) as in E. coli (80%). (ii) The 5' nontranslated region of the lipoprotein mRNA is also highly conserved (95% homology). (iii) In the DNA sequence corresponding to the signal peptide of this secretory protein, there are three drastic changes, including addition of one base pair and deletion of four base pairs in S. marcescens as compared to E. coli. The resultant alterations in the amino acid sequence, however, do not change the basic properties of the signal peptide, which are assumed to be essential for its function in the secretory mechanism. (iv) The DNA sequence from the amino terminus to the 51st residue of the mature lipoprotein is highly conserved (95% homology) and there is no amino acid substitution. (v) The DNA sequence corresponding to the seven amino acid residues at the carboxyl terminus has only 42% homology, resulting in four amino acid substitutions. (vi) Within the section of 40 base pairs beginning with the termination codon (UAA) and ending immediately before the oligo(T) transcription termination site in the E. coli lpp gene, there is about 60% homology. However, after this section, there is no obvious homology between the two sequences, probably because of a deletion of 300 base pairs at this region. (vii) Seven stable stem-and-loop structures could be formed in the mRNA region. (viii) Alterations in the third position of codons used in the lpp gene suggest that the gene has evolved somewhat differently from other genes in S. marcescens.

The outer membrane lipoprotein is the most abundant protein in Escherichia coli and is one of the most extensively investigated membrane proteins (for reviews, see refs. 1 and 2). The complete nucleotide sequence of the lipoprotein mRNA (3) and the DNA sequence of the lipoprotein gene of E. coli (4) have been determined. A study of these sequences revealed many unique features concerning the expression of this gene. E. coli lipoprotein mRNA hybridizes with restriction enzyme fragments of chromosomal DNAs from nine different enterobacteria (5). Among these bacteria, Serratia marcescens is one of those most distantly related to E. coli (5). S. marcescens DNA has an A+T content of 42%, in contrast to 49% in E. coli DNA (6). Furthermore, DNA-DNA hybridization experiments show only 24% homology between the DNA of S. marcescens and E. coli (7).

In this report, we describe the cloning and DNA sequence determination of the S. marcescens lipoprotein gene. E. coli cells carrying this clone produced a large amount of S. marcescens lipoprotein. This foreign protein is apparently assembled normally in the E. coli outer membrane (unpublished data). A comparison of the DNA sequence of the lipoprotein gene of S. marcescens with that of E. coli revealed several interesting and unique features concerning the mechanism of expression of this gene, the secretion of the lipoprotein across the cytoplasmic membrane, and the evolution of the gene.

MATERIALS AND METHODS

Materials. [γ-32P]ATP (≈9000 Ci/mmol; 1 Ci = 3.7 × 10^10 becquerels), prepared by the method of Johnson and Walseth (8) from carrier-free [32P]orthophosphate (New England Nuclear) and ADP (Sigma), was used for 5'-end-labeling of DNA fragments. Tac I and Alu I were obtained from Bethesda Research (Rockville, MD); other restriction enzymes were from New England BioLabs. Nitrocellulose filters (BA85) were obtained from Schleicher & Schuell.

Bacteria and Phage. S. marcescens (5), E. coli JE5527 man lpp-2 pps thi his rpsL gyrL recA1 (9), E. coli K802 hsr- hsm+ galK suII lacY met, and λ phage vector Charon 14 (ref. 10; obtained from F. R. Blattner) were used.

Construction of λlppSm-1. An 8.5-kilobase (kb) EcoRI fragment of S. marcescens chromosomal DNA carrying the lpp gene (5) was cloned into Charon 14 vector phage by the method used for the cloning of the E. coli lpp gene (4). High-titer lysates were prepared by the plate lysate technique (ref. 10; E. coli K802 as host) with plaques that consistently gave positive hybridization with 5'-32P-labeled E. coli lipoprotein mRNA after plaque purification. Purification of λ DNA from the plate lysate has been described (4).

Construction of pKEN221. λlppSm-1 DNA (2.5 μg) and pBR322 DNA (1 μg) were double-digested with EcoRI and BamHI and treated with 0.3 unit of T4 DNA ligase (Miles) in 75 μl of 66 mM Tris-HCl, pH 7.5/10 mM MgCl2/10 mM dithiothreitol/0.4 mM ATP at 12.5°C for 8 hr. E. coli JE5527 cells were transformed as described (11). Ampicillin-resistant, tetracycline-sensitive transformants were grown on Whatman 3MM filter papers and screened for lpp clones by colony hybridization (12). The 0.95-kb Msp I fragment of λlppEc-1 carrying the E. coli lpp gene (4) was nick-translated with [α-32P]dATP and [α-32P]dCTP as described (13) and used as a 32P probe. Hybridization conditions were the same as described for plaque hybridization (4). About 60% of the ampicillin-resistant transformants were tetracycline sensitive, and 23% of them (31 out of 135 colonies examined) gave positive hybridization.

DNA Sequence Determination. The sequence was determined by the method of Maxam and Gilbert (14). The cleaved products were separated by the thin sequencing gel system (0.04 × 20 × 40 cm) with 20%, 10%, or 8% polyacrylamide in 7 M urea.

Other Procedures. Methods for the isolation of total DNA or plasmid DNA, agarose gel electrophoresis, and Southern blot hybridization were as described (5). When 32P-labeled DNA fragments were used as hybridization probes, the hybridization medium contained 0.60 M NaCl/0.06 M sodium citrate instead of 0.75 M NaCl/0.075 M sodium citrate and was heated at 90°C for 3 min before hybridization.

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U. S. C. §1734 solely to indicate this fact.

Abbreviations: kb, kilobase; bp, base pairs.

RESULTS

Cloning of the S. marcescens Lipoprotein Gene. We previously showed that 32P-labeled E. coli lipoprotein mRNA hybridizes with an 8.5-kb EcoRI fragment of S. marcescens DNA (5). The 8.5-kb EcoRI fragment carrying the lpp gene was cloned in λ phage vector, Charon 14. Phage DNA was isolated from a plaque that consistently gave positive hybridization with 5'-32P-labeled lipoprotein mRNA after a two-step plaque purification. Upon digestion with EcoRI, this phage DNA gave three major bands of 20, 13.2, and 8.3 kb. The first two fragments were identical with the left and the right arm of Charon 14 vector DNA, respectively (ref. 10; Fig. 1A). The 8.3-kb fragment, although somewhat shorter than the 8.5-kb chromosomal fragment, showed positive hybridization with 5'-32P-labeled E. coli lipoprotein mRNA by the Southern blotting technique (data not shown). This phage is designated λlppSm-1, and the restriction enzyme cleavage map of its DNA is shown in Fig. 1A. Restriction enzyme cleavage sites within the cloned 8.3-kb EcoRI fragment (Fig. 1A) agreed with those determined on S. marcescens chromosomal DNA by Southern blot hybridization with 5'-32P-labeled E. coli lipoprotein mRNA (data not shown).

Instability of the lpp Gene in λlppSm-1. The λlppSm-1 phage was propagated on solid medium. DNA isolated from these phages contained a 7.3-kb EcoRI fragment as a minor component (about one-fourth of the amount of the 8.3-kb fragment), which also hybridized with 5'-32P-labeled E. coli lipoprotein mRNA (data not shown). This fragment appears to be generated by a deletion within the cloned fragment during phage growth. Furthermore, none of the DNA prepared from phages grown in liquid culture hybridized with the E. coli lipoprotein mRNA. These results indicate that the hybrid phage carrying the lpp gene is very unstable, readily losing a part of or the whole lpp gene during phage growth.

Subcloning of the lpp Gene into pBR322. Using λlppSm-1 DNA, we subcloned the 4.2-kb EcoRI/BamHI fragment into plasmid pBR322. One of the transformants isolated produced a large amount of the S. marcescens lipoprotein which was assembled normally in the E. coli outer membrane (unpublished data). This clone was designated pKEN221 (Fig. 1B). Upon restriction enzyme analysis, we found that pKEN221 contained a 1.05-kb Hae III fragment that hybridized with the E. coli lipoprotein mRNA instead of the 1.35-kb Hae III fragment of the S. marcescens chromosomal DNA (data not shown). This indicates that pKEN221 DNA has a deletion of 300 base pairs (bp) around the lpp gene. This deletion most likely occurred in λlppSm-1 DNA before the subcloning of the lpp gene because the 1.05-kb Hae III fragment was detected in the phage DNA as a major band that hybridized with the E. coli lipoprotein mRNA together with the 1.35-kb Hae III fragment and a shorter 0.63-kb fragment. A fine restriction enzyme cleavage map of the 1.05-kb Hae III fragment is shown in Fig. 1C.

The location of the 300-bp deletion was determined as follows. Two different probes were prepared by Alu I digestion of the 1.05-kb Hae III fragment: probe 1, which covered the 5'-end region of the lipoprotein mRNA (from -176 to +160; see Figs. 1C and 2), and probe 2, which covered the 3'-end region of the mRNA (from +276 to +540; see Figs. 1C and 2). The Hae III/Pvu II fragment from pKEN221 DNA that hybridized with 32P-labeled probe 1 was exactly the same size as that from the total chromosomal DNA (data not shown); the Hae III/Pvu II fragment of pKEN221 DNA that hybridized with probe 2 was shorter by about 0.3 kb than that from the total chromosome. In the Hae III and Pvu I double digest, the fragment from pKEN221 DNA that hybridized with probe 1 was shorter by about 0.3 kb than that from the total chromosome (data not shown); in contrast, there was no difference between pKEN221 DNA and the total chromosomal DNA in the size of the smaller fragment that hybridized with probe 2. From these results we concluded that the 300-bp deletion is located between the Pvu II and Pvu I sites in the 1.05-kb Hae III fragment as shown in Fig. 1C. The exact location of the deletion will be discussed below.

[Fig. 1 panels: (A) Charon 14 and λlppSm-1, scale in kilobases; (B) pKEN221 (5.4 megadaltons); (C and D) 1.05-kb Hae III fragment, scale in bp from -400 to +600.]

FIG. 1. Restriction enzyme maps. (A and B) Restriction enzyme maps of λlppSm-1 and pKEN221, respectively. The physical map of Charon 14 is taken from Blattner et al. (10). Thick bar indicates the cloned S. marcescens DNA fragment. The location and direction of transcription of the lpp gene is indicated by the open arrow. ▼, Location of a 300-base-pair deletion in pKEN221. Ampr, ampicillin resistance. (C) Map of the 1.05-kb Hae III fragment isolated from pKEN221. ▼, Location of a 300-base-pair deletion. The region where DNA sequence was determined is indicated by a thick bar. Probes 1 and 2 on Alu I fragments were used to determine the location of the 300-base deletion (see text). (D) Sequencing strategy around the lpp gene. Restriction enzyme cleavage sites are abbreviated as follows: H, Hinf I; Tc, Tac I; Tq, Taq I; M, Msp I; X, Xba I; A, Alu I. Circles mark the position of 32P label at the 5' end. The singularly labeled fragments were obtained either by secondary restriction enzyme cleavage at the sites indicated (-|) or by strand separation (-). Broken regions of the arrows indicate that the sequences of those regions were not determined on those fragments.

DNA Sequence Determination of the lpp Gene. The DNA sequence of 834 bp around the lpp gene was determined by using the 1.05-kb Hae III fragment of pKEN221, as shown in Fig. 2. The restriction enzyme fragments used for DNA sequence determination are shown in Fig. 1D. Fig. 2 also shows the amino acid sequence deduced from the DNA sequence as well as the E. coli DNA sequence of the corresponding region (4). Homologous regions between the two DNAs are indicated by solid lines; mismatched regions are shown by empty bars. The first nucleotide of the mRNA as deduced from the E. coli sequence is numbered as +1.
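Two of the numbers quoted in the Methods and Results follow from simple arithmetic on values given in the text: the 300-bp deletion (1.35-kb chromosomal versus 1.05-kb cloned Hae III fragment) and the 23% of colonies giving positive hybridization (31 of 135). The following Python sketch only reproduces that arithmetic; the function and constant names are illustrative and do not come from the paper.

```python
# Fragment sizes in base pairs, as quoted in the Results text.
CHROMOSOMAL_HAE3_BP = 1350  # 1.35-kb Hae III fragment of chromosomal DNA
CLONED_HAE3_BP = 1050       # 1.05-kb Hae III fragment of pKEN221


def deletion_size_bp(chromosomal_bp: int, cloned_bp: int) -> int:
    """Deletion size inferred from two hybridizing fragment lengths."""
    if cloned_bp > chromosomal_bp:
        raise ValueError("cloned fragment is longer than the chromosomal one")
    return chromosomal_bp - cloned_bp


def percent(part: int, whole: int) -> int:
    """Percentage rounded to the nearest whole number."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100 * part / whole)


deletion = deletion_size_bp(CHROMOSOMAL_HAE3_BP, CLONED_HAE3_BP)
positive = percent(31, 135)  # colonies positive by hybridization
print(f"deletion: {deletion} bp; positive colonies: {positive}%")
```

Sizes are held as integers in base pairs so that the subtraction is exact. Gel-based fragment sizes are themselves approximate, so the result should be read as "about 300 bp".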


FIG. 2. DNA sequences encompassing the S. marcescens and E. coli lpp genes. The E. coli sequence is cited from ref. 4; only a single strand is shown. The first nucleotide of the lipoprotein mRNA deduced from the base sequence of the E. coli lipoprotein mRNA (3, 15) is numbered as +1. E. coli lipoprotein mRNA start and stop sites are indicated by arrowheads (▼). Homology between the two DNAs is indicated by solid lines; nonhomologous regions are indicated by connecting them with empty bars. Amino acid sequences deduced from the DNA sequences are also shown. Cleavage sites of the prolipoproteins are indicated by arrows (↓). The Cys residue of the S. marcescens prolipoprotein at the cleavage site is numbered as +1. Amino acid residues that differ in the E. coli and S. marcescens sequences are circled.
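The paper quotes "percent homology" for many regions without defining it. A minimal Python sketch, assuming it means the percentage of aligned positions carrying the same base, reproduces two figures from the Discussion: the "RNA polymerase recognition site" comparison (5 of 12 positions shared, described there as very poor homology) and the 95% quoted for the first 51 codons of the mature lipoprotein (taken here as 153 bases with eight single-base differences). The function names are hypothetical, and the S. marcescens sequence, which the text writes in mixed case, is given in upper case.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions with the same base (case-insensitive)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a.upper() == b.upper() for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def percent_identity_from_counts(length: int, mismatches: int) -> float:
    """Same measure when only the aligned length and mismatch count are known."""
    if length <= 0 or not 0 <= mismatches <= length:
        raise ValueError("need 0 <= mismatches <= length and length > 0")
    return 100.0 * (length - mismatches) / length


# Generalized recognition sequence (ref. 23) and the S. marcescens
# sequence from -38 to -27, both as quoted in the Discussion.
CONSENSUS = "TGTTGACAATTT"
SM_REGION = "TATTCTCTTTCG"

print(round(percent_identity(CONSENSUS, SM_REGION), 1))  # 5 of 12 positions
print(round(percent_identity_from_counts(51 * 3, 8)))     # 145 of 153 positions
```

This sketch requires pre-aligned, gap-free input. Regions containing insertions or deletions, such as the signal-peptide region, would need an explicit alignment step that is not shown here.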

DISCUSSION

Fig. 2 shows that there is no homology between the lpp genes of S. marcescens and E. coli after residue 312 of the S. marcescens DNA. Thus, it appears that the 300-bp deletion between the Pvu II and Pvu I sites (see Fig. 1C) occurred between residues 311 and 312. This deletion, therefore, caused the elimination of sequences corresponding to the oligo(T) present in this region of the E. coli lpp gene (see Fig. 2). Because oligo(T) is required at the transcription termination site (16), this deletion probably resulted in reading-through to the next gene. In fact, the primary in vitro transcript directed by the 1.05-kb Hae III fragment is much larger than the 322-base in vitro product directed by the E. coli lpp-containing DNA fragment (unpublished data). This alteration appears to lower the production of the lipoprotein in vivo (unpublished data).

The following is a discussion of the unique features of the DNA sequence shown in Fig. 2.

Amino Acid Sequence of S. marcescens Prolipoprotein. The amino acid sequence of the prolipoprotein deduced from the DNA sequence shows extensive alterations, as compared with the E. coli sequence, only at the amino- and carboxyl-terminal sections. All differences in the amino-terminal section are localized within the signal peptide of the prolipoprotein (Met(-19) to Gly(-1)), which is required for secretion of the protein across the cytoplasmic membrane (17, 18). A difference in three bases immediately after the initiation codon (AGC in E. coli and TCG in S. marcescens) causes the alteration in the amino acid sequence of Lys(-19)-Ala(-18) (E. coli) to Asn(-18)-Arg(-17) (S. marcescens). A single base change (G to T) at +76 does not result in any alteration of the amino acid sequence. The third difference is a double frameshift mutation, an addition of a single base, and a deletion of four contiguous bases (from E. coli to S. marcescens), which results in the deletion of an amino acid in the signal peptide of S. marcescens. There are two possible mechanisms that could generate the S. marcescens sequence from the E. coli sequence. One requires one single-base alteration, whereas the other requires three single-base exchanges besides the two frameshift mutations. None of these three alterations causes any severe changes in the basic properties of the signal peptide proposed for the E. coli prolipoprotein and other bacterial secretory proteins (see loop model; ref. 18).

The amino acid sequences of the two mature lipoproteins are identical from the amino-terminal cysteine residue to Asn at the 51st position. In this region, there is 95% homology between the two DNAs with eight single-base alterations, all of which occurred at the third base position in genetic codons. On the other hand, in the sequence of seven amino acid residues at the carboxyl-terminal section, four residues are replaced. These changes cause a drastic loss of homology of the nucleotide sequence in this section between the two DNAs (42% homology). For all four amino acids altered, all three bases of the codons are changed. In both amino acid sequences, however, the carboxyl-terminal residue is lysine, which is linked to the peptidoglycan (2). These results indicate that the carboxyl-terminal section of the lipoprotein, except for the carboxyl-terminal lysine residue, is more variable than the preceding section of the molecule. This agrees with the immunological comparison of the lipoproteins from various Gram-negative bacteria (5).

Nucleotide Sequence of Lipoprotein mRNA. The 5'-nontranslated region of the two lipoprotein mRNAs deduced from
the DNA sequence (Fig. 2) are almost identical (95% homology). Therefore, all the unique features discussed previously for this region of the E. coli mRNA (19) are also maintained for the S. marcescens mRNA. On the other hand, in the 3'-nontranslated region there are extensive mismatches between the two DNAs (63% homology between +269 and +311).

A possible secondary structure of the mRNA is shown in Fig. 3. In this model there are no stable hairpin structures in the first 87 residues from the 5' end, whereas after this position 75% of the total nucleotides are involved in the formation of hairpin structures. However, because of differences in the nucleotide sequence, the S. marcescens mRNA has only seven possible stem-and-loop structures in contrast to nine such structures in the E. coli mRNA (3). Moreover, ΔG values for all these structures calculated according to Tinoco and coworkers (20, 21) are higher than those for the corresponding structures in the E. coli mRNA except for structures II, III, and VI: -2.3 (-21.1), -15.5 (-11.3), -1.0 (-1.0), -6.8 (-9.8), -7.0 (-13.8), -3.8 (-3.8), and -1.5 (-11.6) kcal for structures I-VII, respectively [values in parentheses are those for E. coli (3)]. Structure I may be slightly more stable (at most, -4 kcal) in the intact mRNA because of the possible existence of oligo(U) at the end of the intact mRNA, a region that was deleted in the present mRNA structure. The total ΔG value for the proposed loop structure of S. marcescens mRNA is -37.9 kcal, in contrast to -75.1 kcal for the E. coli mRNA. It will be interesting to determine how the difference in the stability of the secondary structures affects the half-life as well as the rate of translation of the mRNA.

FIG. 3. Possible secondary structure of the S. marcescens lipoprotein mRNA. Although the 3'-terminal sequence of the mRNA could not be determined because of the 300-bp deletion mutation in pKEN221 DNA, the mRNA is most likely to have a short stretch of oligo(U) at the 3' end as shown.

As shown in Table 1, the codon usage in the S. marcescens lipoprotein mRNA was similar to that of the E. coli lipoprotein mRNA. In spite of several possible codons for each amino acid, only few codons are used. There are several repeating sequences in the S. marcescens mRNA similar to that found in the E. coli
mRNA (3):

    104                   124
    AACGCUAAAAUCGAU-CAACUG
    146                   166
    AACGCUAAAGUUGAU-CAGCUG
    206                224
    GCUAAAGACGACGCAGCAC

and

    127             142
    UUCUGACGUUCAGACU
    190             205
    UUCUGACGUUCAAGCU

where underlined nucleotides are different from each other.

Table 1. Codon usage in S. marcescens lipoprotein mRNA*

    Phe  UUU 0(0)    Ser  UCU 3(3)    Tyr  UAU 0(0)    Cys  UGU 0(0)
         UUC 0(0)         UCC 3(2)         UAC 1(1)         UGC 1(1)
    Leu  UUA 0(0)         UCA 0(0)    Term UAA 1(1)    Term UGA 0(0)
         UUG 0(0)         UCG 0(0)         UAG 0(0)    Trp  UGG 0(0)
    Leu  CUU 1(0)    Pro  CCU 0(0)    His  CAU 0(0)    Arg  CGU 3(3)
         CUC 0(0)         CCC 0(0)         CAC 2(0)         CGC 1(1)
         CUA 0(0)         CCA 0(0)    Gln  CAA 3(0)         CGA 0(0)
         CUG 6(9)         CCG 0(0)         CAG 3(5)         CGG 0(0)
    Ile  AUU 0(0)    Thr  ACU 2(4)    Asn  AAU 0(0)    Ser  AGU 0(0)
         AUC 2(2)         ACC 0(0)         AAC 7(6)         AGC 2(2)
         AUA 0(0)         ACA 0(0)    Lys  AAA 5(6)    Arg  AGA 0(0)
    Met  AUG 2(3)         ACG 0(0)         AAG 1(1)         AGG 0(0)
    Val  GUU 3(3)    Ala  GCU 7(8)    Asp  GAU 2(2)    Gly  GGU 1(2)
         GUC 0(0)         GCC 0(0)         GAC 6(6)         GGC 2(1)
         GUA 2(2)         GCA 4(3)    Glu  GAA 0(0)         GGA 0(0)
         GUG 1(1)         GCG 1(1)         GAG 0(0)         GGG 0(0)

* Values in parentheses are from the E. coli lipoprotein mRNA (3).

DNA Sequence of Promoter Region of the lpp Gene. The DNA sequence of the first 26 residues (-1 to -26) immediately before the transcription initiation site is almost identical with that of E. coli except for the residue at -20. This section includes the "Pribnow box," T(-15)-G-T-A-A-T-A(-9). The homology between the two DNAs in the first 45 bp is also very high (84%) and, more importantly, the A+T content in this region is extremely high (78%), as found for the E. coli lipoprotein gene (80%). The A+T content between -46 and -272 is 61%, which, although lower than in the -1 to -45 region, is still considerably higher than the average A+T content of the S. marcescens DNA (42%; ref. 6), whereas the A+T content between -273 and -359 drops to 35%, closer to the A+T content of the genome as a whole. A similar pattern of the distribution of A-T pairs has been observed in the E. coli DNA sequence (4). The A-T richness of the promoter region is probably very important for the efficient transcription of the lpp gene, as recently discussed by Vollenweider et al. (22).

As for the "RNA polymerase recognition site" generalized as T-G-T-T-G-A-C-A-A-T-T-T (23), its possible sequence can be found at T(-38)-a-T-T-c-t-C-t-t-T-c-g(-27) with very poor homology (only capital letters represent the homology).

Other Features. The pattern of mismatches in the section from -273 to -359 between the two DNAs (Fig. 2) is unusual for it involves single bases, all of which are separated by 3n bases (n = integer values) without exception. If the mismatching bases are placed at the third positions of codons, these regions could code for identical amino acid sequences except for Gln (CAG at -351) for S. marcescens in contrast to His (CAC) for E. coli. Both sequences are terminated by UAA. Furthermore, about 50 residues downstream from the termination codons there are oligo(T) sequences in both DNAs. Especially in S. marcescens DNA, a very stable stem-and-loop structure can be formed at this position (a stem formation between A(-242)-A-A-A-C-G-C-C-G-C(-233) and G(-229)-C-G-G-C-G-T-T-T-T(-220)), which exists at the transcription termination site (15). Out of 16 base changes in the coding region, 11 (69%) are changes from A-T to G-C pairs (from E. coli to S. marcescens) whereas only 1 (6%) is a change in the opposite direction. As a result, only 21% of the third bases of the codons (excluding codons for Met and Trp) used in this region of the S. marcescens DNA are A or U in contrast to 48% in E. coli. Thus, the A+T contents of these regions are 35% and 45% for the S. marcescens and E. coli DNA, respectively. This kind of alteration, if widespread throughout the genome, could well account for the differences in the A+T contents between the two bacterial DNAs (42% for S. marcescens; 49% for E. coli). The same mechanism is found in the codons of the tryptophan genes that have been determined so far (refs. 24 and 25; 27% of the third positions are U or A for S. marcescens and 42% for E. coli).

However, in contrast to these genes, the base usage at the third positions of codons in the S. marcescens lpp gene is unique; five out of nine third-base differences in the lpp gene result in an enrichment for A-T pairs with respect to the E. coli sequence, whereas only two base changes result in an enrichment of G-C pairs. As a result, the A+T content of the coding region of the S. marcescens lpp gene is as high as 50% (49% for the E. coli lpp gene), a value that is above the average A+T content of the genome (42%). We do not know why the high A+T content in the S. marcescens lpp gene has been maintained during the course of evolution in spite of the general evolutionary trend towards a higher G+C content. This may be because the high A+T content is essential for maintenance of the lpp gene function. Alternatively, the lpp gene might have undergone completely different evolutionary processes from other genes; for example, the lpp gene might have inserted into the chromosome at a later stage of evolution.

We thank Dr. M. Freundlich and Dr. M. Riley for critical reading of the manuscript. This work was supported by Grant GM19043 from the U.S. Public Health Service and Grant BC-67 from the American Cancer Society.

1. Inouye, M. (1979) in Biomembranes, ed. Manson, L. A. (Plenum, New York), Vol. 10, pp. 141-208.
2. DiRienzo, J., Nakamura, K. & Inouye, M. (1978) Annu. Rev. Biochem. 47, 481-532.
3. Nakamura, K., Pirtle, R. M., Pirtle, I. L., Takeishi, K. & Inouye, M. (1980) J. Biol. Chem. 255, 210-216.
4. Nakamura, K. & Inouye, M. (1979) Cell 18, 1109-1117.
5. Nakamura, K., Pirtle, R. M. & Inouye, M. (1979) J. Bacteriol. 137, 595-604.
6. Normore, W. M. (1976) in Handbook of Biochemistry and Molecular Biology-Nucleic Acids, ed. Fasman, G. D. (CRC, Cleveland, OH), Vol. 2, 3rd Ed., pp. 65-235.
7. Sanderson, K. E. (1976) Annu. Rev. Microbiol. 30, 327-349.
8. Johnson, R. A. & Walseth, T. F. (1979) Adv. Cyclic Nucleotide Res. 10, 135-167.
9. Nakamura, K., Katz-Wurtzel, E. T., Pirtle, R. M. & Inouye, M. (1979) J. Bacteriol. 138, 715-720.
10. Blattner, F. R., Williams, B. G., Blechl, A. E., D.-Thompson, K., Faber, H. E., Furlong, L. A., Grunwald, D. J., Kiefer, D. O., Moore, D. D., Schumm, J. W., Sheldon, E. L. & Smithies, O. (1977) Science 196, 161-169.
11. Cohen, S. N., Chang, A. C. Y. & Hsu, L. (1972) Proc. Natl. Acad. Sci. USA 69, 2110-2114.
12. Gergen, J. P., Stern, R. H. & Wensink, P. C. (1979) Nucleic Acids Res. 7, 2115-2136.
13. Maniatis, T., Jeffrey, A. & Kleid, D. G. (1975) Proc. Natl. Acad. Sci. USA 72, 1184-1188.
14. Maxam, A. M. & Gilbert, W. (1977) Proc. Natl. Acad. Sci. USA 74, 560-564.
15. Pirtle, R. M., Pirtle, I. L. & Inouye, M. (1980) J. Biol. Chem. 255, 199-209.
16. Adhya, S. & Gottesman, M. (1978) Annu. Rev. Biochem. 47, 967-996.
17. Inouye, S., Wang, S. S., Sekizawa, J., Halegoua, S. & Inouye, M. (1977) Proc. Natl. Acad. Sci. USA 74, 1004-1008.
18. Inouye, M. & Halegoua, S. (1979) Crit. Rev. Biochem., in press.
19. Pirtle, R. M., Pirtle, I. L. & Inouye, M. (1978) Proc. Natl. Acad. Sci. USA 75, 2190-2194.
20. Tinoco, I., Borer, P., Dengler, B., Levine, M., Uhlenbeck, O., Crothers, D. & Gralla, J. (1973) Nature (London) New Biol. 246, 40-41.
21. Borer, P. N., Dengler, B., Tinoco, I., Jr. & Uhlenbeck, O. C. (1974) J. Mol. Biol. 86, 843-853.
22. Vollenweider, H. J., Fiandt, M. & Szybalski, W. (1979) Science 205, 508-511.
23. Reznikoff, W. S. & Abelson, J. N. (1978) in The Operon, eds. Miller, J. H. & Reznikoff, W. S. (Cold Spring Harbor Laboratory, Cold Spring Harbor, NY), pp. 221-243.
24. Miozzari, G. F. & Yanofsky, C. (1978) Nature (London) 276, 684-689.
25. Miozzari, G. F. & Yanofsky, C. (1979) Nature (London) 277, 486-489.
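Two statements in the Discussion can be checked directly from the values quoted there: that the two arms of the stem upstream of the lpp gene (AAAACGCCGC and GCGGCGTTTT) can pair, and that the seven ΔG values listed for structures I-VII of the S. marcescens mRNA add up to the stated total of -37.9 kcal. The Python sketch below is only a consistency check on the numbers in the text. It is not the Tinoco-rule free-energy calculation used by the authors, and all names in it are illustrative.

```python
# Watson-Crick partners for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA string written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def can_form_perfect_stem(arm_5p: str, arm_3p: str) -> bool:
    """True if the two arms pair at every position (no mismatches or bulges)."""
    return reverse_complement(arm_5p) == arm_3p.upper()


# Stem arms as quoted in the "Other Features" paragraph.
STEM_5P = "AAAACGCCGC"
STEM_3P = "GCGGCGTTTT"

# Delta-G values (kcal) quoted for S. marcescens structures I to VII.
DELTA_G_SM = [-2.3, -15.5, -1.0, -6.8, -7.0, -3.8, -1.5]

stem_ok = can_form_perfect_stem(STEM_5P, STEM_3P)
total_dg = round(sum(DELTA_G_SM), 1)
print(stem_ok, total_dg)
```

The sum is rounded to one decimal place because the inputs are quoted to that precision and floating-point addition may leave a small residue.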