Proc. Natl. Acad. Sci. USA
Vol. 83, pp. 7405-7409, October 1986

Promoters selected from random DNA sequences
(mutagenesis/evolution)

MARSHALL S. Z. HORWITZ AND LAWRENCE A. LOEB

The Joseph Gottstein Memorial Cancer Research Laboratory, Department of Pathology SM-30, University of Washington, Seattle, WA 98195

Communicated by Earl P. Benditt, July 3, 1986

ABSTRACT We have selected a group of Escherichia coli promoters from random DNA sequences by replacing 19 base pairs at the -35 region of the tetracycline resistance gene (tetr) of the plasmid pBR322. Substitution of 19 base pairs with chemically synthesized random sequences results in a maximum of 4^19 (about 3 x 10^11) possible replacement sequences. From a population of about 1000 bacteria harboring plasmids with these random substitutions, tetracycline selection has revealed several functional -35 promoter sequences. These promoters have retained only partial homology to the -35 promoter consensus sequence. In three of these promoters, the consensus alignment shifts 10 nucleotides downstream, allowing the RNA polymerase to recognize another Pribnow box from within the original pBR322 sequence. Two of the sequences promote transcription more strongly than the native promoter. This technique may have application for the selection of additional DNA sequences with varied biological activity.

Comparison of known RNA polymerase binding sites of different genes of Escherichia coli reveals two highly conserved promoter elements centered at about -10 and -35 base pairs (bp) from the start of transcription (1-5). A consensus sequence of the nontemplate strand includes "TATAAT" from positions -13 to -8, the "Pribnow box," and "TTGACA" from positions -36 to -31, the "recognition" site, with 17 bp between the two (6). The involvement of each in the initiation of transcription has been inferred largely from an analysis of mutations. Rare mutations that increase transcription, "up mutations," usually increase homology with the consensus sequence and spacing, while the more common mutations that decrease transcription, "down mutations," usually decrease homology with the consensus sequence and spacing (7).

Targeted random mutagenesis has been used to define prokaryotic (8) and eukaryotic (9) translation initiation signals. In these experiments base substitutions were limited in number to no more than three and were identified before activity was manually assayed. This strategy limits the search for functional sequences to derivatives of those already known.

We report here a technique that has allowed us to create several unusual promoter recognition sequences. We have substituted for the promoter recognition site of a plasmid-borne selectable marker chemically synthesized random sequences of 19 bp, such that every particular plasmid molecule contains a unique, randomly chosen sequence. When introduced into competent cells, growth selection identifies those sequences that are biologically active. When all four DNA bases are present in the 19-bp random stretch, there are 4^19 (about 3 x 10^11) different possible replacement sequences. We have used antibiotic selection to identify several unusual promoters from a population of about 1000 such bacteria, heterogeneous in DNA sequence at the promoter recognition site of the tetracycline resistance gene of pBR322 (tetr). This technique may be generalized for the selection of other genetic regulatory elements or protein coding sequences.

MATERIALS AND METHODS

Oligonucleotides. Oligonucleotides were synthesized by the phosphoramidite method with an Applied Biosystems 380A DNA synthesizer and purified by thin layer chromatography (10). Random sequences were synthesized with equimolar mixtures of phosphoramidites.

Plasmid Constructions. Restriction endonucleases and enzymes of nucleic acid metabolism were obtained commercially and used according to the suppliers' instructions. Standard molecular cloning methods were employed (11).

A plasmid with a deletion in the promoter recognition site, pBdEC, was constructed by digestion of pBR322 with EcoRI and Cla I, by extension of the 5' overhangs with the large fragment of DNA polymerase I, and by blunt-end recircularization with T4 DNA ligase. The plasmid sequences were confirmed by DNA sequence analysis.

The plasmid populations containing random substitution with all four bases, pRAN4, or just three bases (cytosine, guanine, and thymine), pRAN3, were constructed by hybridizing 4 x 10-' pM of primer 8-mer, 5' GGATCGAT 3', to 2 x 10- pM of template 35-mer of mixed sequence, either 5' CCGAATTC(A,C,G,T)19ATCGATCC 3' or 5' CCGAATTC(C,G,T)19ATCGATCC 3', respectively, in 90 mM NaCl/15 mM Tris HCl, pH 7.9/1 mM MgCl2 at 65°C for 5 min and 57°C for 90 min. The primed template was extended with the large fragment of DNA polymerase I and digested with an excess of EcoRI and Taq I. The resulting product was ligated into EcoRI- and Cla I-digested pBR322 that had been treated with bacterial alkaline phosphatase and purified by agarose gel electrophoresis.

One of the new promoter sequences was duplicated in a second plasmid to rule out the possibility of mutation outside the promoter region. The insert of plasmid pBT9 was reconstructed in the plasmid pBT9R by hybridizing 105 pM 26-mer (5' AATTCTTGGGCGCGCGTCGGCTTGAT 3') to 10-5 pM 24-mer (5' CGATCAAGCCGACGCGCGCCCAAG 3') using the conditions described above. Incubation with T4 polynucleotide kinase in the presence of ATP added 5'-phosphoryl termini. The resulting product contains EcoRI and Cla I sticky ends and was ligated into similarly digested pBR322. The expected plasmid sequences were confirmed by DNA sequence analysis. Competent DH5 and DH5.1 E. coli (endA1, recA1), prepared by the method of Hanahan (12) and purchased from Bethesda Research Laboratories and Vector Cloning Systems (San Diego, CA), respectively, were used for DNA transformation. Transformants were grown in Luria-Bertani medium (LB) (with 0.1% glucose) for 60 min (approximately two doublings) prior to antibiotic selection.

DNA Sequence Analysis.
Rapid plasmid DNA preparation was by the alkaline lysis method (11). DNA sequencing was by dideoxy chain-termination (13) using double-strand pBR322 templates (14) from these rapid preparations. Both DNA strands were sequenced.

Tetracycline Resistance Determination. Tetracycline resistance was determined by the 50% efficiency of plating (EOP50) method (15) on LB agar (with 0.1% glucose).

RNA Gel Blot Analysis. Nucleic acids were purified from E. coli (16) and electrophoresed on 1% agarose/2.2 M formaldehyde (11). Hybridization was on GeneScreenPlus membranes (New England Nuclear) following the manufacturer's instructions. Densitometry of the autoradiogram was performed on a Hoefer GS300 scanning densitometer.

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

Abbreviations: tetr, tetracycline resistance gene; bp, base pair(s).

RESULTS

The Promoter Recognition Sequence Is Necessary for Transcription. The location of the tetr promoter has been deduced from promoter consensus sequence homology in pBR322 (17), deletion mutations (18), and electron microscopic mapping (19). The transcription initiation site has been identified by S1-nuclease mapping (16). The gene encodes a single, noninducible, 43.5-kDa polypeptide (20) that functions at the cell membrane to block accumulation of the antibiotic (21). We first deleted the promoter recognition sequence in tetr to confirm the importance of this sequence in the transcription of that gene. EcoRI and Cla I restriction sites flank the -35 sequence in tetr (Fig. 1). We constructed pBdEC, a plasmid with a 22-bp deletion extending from position -42 in the EcoRI site to position -21 in the Cla I site. E. coli DH5.1 was found to be resistant to a tetracycline concentration of 2 µg/ml in the absence of a plasmid, to 40 µg/ml when harboring pBR322, and to just 4 µg/ml when harboring pBdEC, the plasmid with the promoter recognition site deletion (Fig. 2). RNA gel blot analysis shows an absence of the tetr transcript in cells containing pBdEC (Fig. 3, see below). Therefore, the promoter recognition sequence is necessary for tetr transcription.

Replacement of the Promoter Recognition Sequence with Random Sequences Yields Promoter Substitutions. The rationale of the experiment is explained in Fig. 1. We excised the promoter recognition sequence of pBR322 by digestion with EcoRI and Cla I and purification of the larger fragment by agarose gel electrophoresis. Next, an oligonucleotide template containing defined sequence termini sandwiching a 19-base random stretch was hybridized to a defined sequence primer, copied with the large fragment of DNA polymerase I in the presence of all four dNTPs, and digested with EcoRI and Taq I to produce a heterogeneous restriction fragment population. We then ligated the restriction fragments into the parent plasmid to produce a population of plasmids, pRAN4, containing random 19-bp promoter substitutions. For this population, there is a maximum of 4^19 different possible replacement sequences. The plasmid population was used to transform E. coli. Growth in ampicillin selects for bacteria containing plasmids, while growth in tetracycline selects for plasmids with functional tetr promoters.

Specifically, we transformed DH5.1 with pRAN4 (Table 1). Ampicillin selection yielded 125 colonies. To determine the nature of insertions present, plasmids from 10 of these colonies have been characterized. Plasmid sizes were compared by agarose gel electrophoresis (not shown), and in each plasmid about 200 bp centered at -35 were sequenced. Two plasmids are identical to pBR322, and three plasmids contain deletions bounded by the EcoRI and Cla I sites (sequences not shown); these are assumed to be part of a background of about 50% of the vectors that escaped either digestion with restriction enzymes or ligation of the insert. The other five plasmids derived from pRAN4 (pBB3, pBB5, pBB9, pBB10, and pBB13) contain promoter substitutions of 10-23 bp (Fig. 2). Of the total of 77 bases substituted among these five plasmids, the average insert length is 15, and the composition is 22% adenine, 27% cytosine, 34% guanine, and 17% thymine.

Selection of pRAN4 transformants in both ampicillin and tetracycline yielded 28 colonies. Plasmids from all 28 of these colonies have been characterized to determine if any contain promoter replacement sequences. Twenty-seven plasmids are identical to pBR322. These are from the background of unmodified vectors; it is improbable that a promoter sequence identical to pBR322 would be present within the small subpopulation of all random sequences selected here. The other plasmid, pBG8, contains a 38-bp promoter substitution (Fig. 2). Although there is sequence length heterogeneity, the average length of all pRAN4 substitutions is 19 bases, in agreement with the target length. We conclude that about half of the 125 ampicillin-resistant colonies contain plasmids with promoter substitutions and that about 1 of these 63 colonies is also tetracycline resistant, suggesting that, very roughly, 2% of the 3 x 10^11 possible random sequences present in this construction may duplicate promoter recognition site activity.
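The counting behind these estimates can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the inputs are the figures reported above for pRAN4 (125 ampicillin-resistant colonies, about half carrying substitutions, one of which is also tetracycline resistant).

```python
# Illustrative arithmetic only; the function names are hypothetical.

def sequence_space(alphabet_size: int, length: int) -> int:
    """Number of distinct sequences of a given length over an alphabet."""
    return alphabet_size ** length


def active_fraction(amp_colonies: int, substituted_share: float, tet_resistant: int) -> float:
    """Rough fraction of substituted plasmids that also confer tetracycline resistance."""
    substituted = amp_colonies * substituted_share
    return tet_resistant / substituted


pran4_space = sequence_space(4, 19)            # four bases at 19 positions
pran4_fraction = active_fraction(125, 0.5, 1)  # colony counts reported for pRAN4

print(f"possible 19-mers: {pran4_space:.2e}")
print(f"active fraction:  {pran4_fraction:.1%}")
```

The result, a little under 2%, is consistent with the "very roughly, 2%" quoted in the text; with a single positive colony the estimate carries a large sampling uncertainty.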

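[Fig. 1 schematic. The tetr promoter region of pBR322, with the EcoRI site, the -35 region, the Cla I site, and the -10 region marked:

5' CAAGAATTCTCATGTTTGACAGCTTATCATCGATAAGCTTTAATGCGGTAGTTTATC 3'
3' GTTCTTAAGAGTACAAACTGTCGAATAGTAGCTATTCGAAATTACGCCATCAAATAG 5'

pBR322 (ampr, ori, tetr) is digested with EcoRI/Cla I and gel purified. DNA ligase then joins the 19-bp random insert (NNNNNNNNNNNNNNNNNNN) at the -35 position, upstream of the -10 region, to give the pRAN4 population (ampr, ori, tetr), which is selected on amp or amp/tet.]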

FIG. 1. Schematic of experimental strategy. The tetr promoter sequence is indicated (17). Matches with the -10 and -35 promoter consensus sequence are boxed (6). An additional "anti-tet" promoter (data not shown) initiates transcription from within the tetr Pribnow box, transcribing in the opposite direction, toward the ampicillin resistance gene (19). Transfection was into E. coli DH5.1. amp, ampicillin; tet, tetracycline; N, unspecified bases.
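The consensus matches described for Fig. 1 can be illustrated by scoring hexamers against the consensus elements. The following is a minimal sketch, not the authors' method: the helper names are hypothetical, and the sequence is the nontemplate strand as read from Fig. 1, which is an assumption wherever the printed figure is ambiguous.

```python
# Sketch: score hexamers in the tetr promoter region against the consensus.
# The sequence is the nontemplate strand as read from Fig. 1 (an assumption
# where the figure is ambiguous); the helper names are hypothetical.

TET_PROMOTER = "CAAGAATTCTCATGTTTGACAGCTTATCATCGATAAGCTTTAATGCGGTAGTTTATC"
MINUS_35 = "TTGACA"  # consensus "recognition" site
MINUS_10 = "TATAAT"  # consensus Pribnow box
SPACING = 17         # consensus spacing between the two hexamers


def matches(hexamer: str, consensus: str) -> int:
    """Count positions at which a hexamer agrees with the consensus."""
    return sum(a == b for a, b in zip(hexamer, consensus))


def best_promoter(seq: str, spacing: int = SPACING):
    """Return (start_35, score_35, start_10, score_10) for the best-scoring pair."""
    best = None
    last_start = len(seq) - (6 + spacing + 6)
    for i in range(last_start + 1):
        j = i + 6 + spacing  # start of the -10 hexamer for this -35 start
        s35 = matches(seq[i:i + 6], MINUS_35)
        s10 = matches(seq[j:j + 6], MINUS_10)
        if best is None or s35 + s10 > best[1] + best[3]:
            best = (i, s35, j, s10)
    return best


start_35, score_35, start_10, score_10 = best_promoter(TET_PROMOTER)
print(TET_PROMOTER[start_35:start_35 + 6], score_35)
print(TET_PROMOTER[start_10:start_10 + 6], score_10)
```

Fixing the spacing at 17 keeps the sketch short; a fuller scan would also try neighbouring spacings, since the text notes that spacing itself affects promoter function.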

[Fig. 2 alignment. Top line, consensus: tc TTGACa t t tg TAtAaT, with the -35 and -10 elements marked. The aligned substitution sequences are not reproduced; the tetracycline resistance conferred by each plasmid is listed below.]

Plasmid                          Tetracycline resistance (µg/ml)
native (pBR322)                  40
A,C,G,T substitution (pRAN4)
  pBG8                           40
  pBB3                           2
  pBB5                           2
  pBB9                           6
  pBB10                          2
  pBB13                          4
C,G,T substitution (pRAN3)
  pBT9                           50
  pBT21                          60
  pBTR3                          30
  pBA1                           6
  pBA2                           6
  pBA3                           4
  pBA4                           2
  pBA5                           2
  pBA6                           2
  pBA7                           2
  pBA8                           10
  pBA9                           2
  pBA10                          4
deletion (pBdEC)                 4

FIG. 2. Promoter sequence substitutions. The consensus promoter sequence (6) is on the top line. The most strongly conserved bases are in upper case. The sequence of pBR322 (7) is on the second line. The lines following list promoter substitutions. Spaces have been inserted before and after the pBR322 promoter and the promoter substitutions to maximize alignment with the consensus. Matches with the consensus are boxed. Dashes denote positions identical with pBR322. Base matches among the dashes indicate positions within pBR322, outside the substitution, that together with the substitution allow for -35 promoter consensus match. The last line lists the sequence of the promoter deletion plasmid pBdEC. Two plasmids, pBTR3 and pBA8, possess downstream insertions, a probable artifact of the method of plasmid construction, here indicated as subscripts among the dashes. Three plasmids, pBA7, pBT9, and pBT21, have two sites of potential alignment with the consensus; although not aligned, bases in this alternate site are underlined. Tetracycline resistance was determined by the 50% efficiency of plating (EOP50) method (15).
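To compare the resistance values of Fig. 2 with transcript levels, the resistances can be expressed relative to pBR322 and correlated with the relative tetr transcript values of Table 2. The sketch below is illustrative: the variable and function names are hypothetical, and the Pearson coefficient is a choice made here, not a statistic reported in the paper.

```python
# Sketch: relate Fig. 2 resistance values to the transcript levels of Table 2.
# Names are hypothetical; the numbers are those listed in Fig. 2 (ug/ml) and
# Table 2 (tetr transcript relative to pBR322). Pearson's r is an assumption
# made for illustration, not a statistic reported by the authors.
from math import sqrt

resistance_ug_ml = {
    "pBT21": 60, "pBT9": 50, "pBG8": 40, "pBR322": 40, "pBTR3": 30, "pBA5": 2,
}
relative_transcript = {
    "pBT21": 1.27, "pBT9": 1.17, "pBG8": 1.01, "pBR322": 1.0, "pBTR3": 0.29, "pBA5": 0.0,
}

# Express each resistance relative to the native promoter (pBR322).
reference = resistance_ug_ml["pBR322"]
relative_resistance = {name: value / reference for name, value in resistance_ug_ml.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


names = sorted(resistance_ug_ml)
r = pearson(
    [relative_resistance[n] for n in names],
    [relative_transcript[n] for n in names],
)
print(relative_resistance)
print(f"Pearson r = {r:.2f}")
```

With only six plasmids the coefficient is a rough indicator at best, but it agrees with the direct correlation between phenotype and transcript level described in the text.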

To verify that our synthetic restriction fragments were responsible for the observed sequence heterogeneity at the promoter recognition region and to check for allowable sequence diversity, we prepared a population of plasmids, pRAN3, deficient in adenine throughout the randomly substituted segment. The plasmid population pRAN3 contains promoter recognition site substitutions of random 19-base runs of cytosine, guanine, and thymine. For this population, there is a maximum of 3^19 (about 10^9) different possible replacement sequences. We transformed DH5.1 with pRAN3 (Table 1). Ampicillin selection yielded 887 colonies. Plasmids from 10 of these colonies have been characterized. All 10 of these plasmids, pBA1-pBA10, contain replacement insertions of 15-29 bp (Fig. 2), implying that >90% of the plasmids in pRAN3 contain promoter substitutions. Of the total of 203 bases substituted among these 10 plasmids, the average insert length is 20, and the composition is 0% adenine, 32% cytosine, 42% guanine, and 26% thymine. This composition bias indicates that the sequence heterogeneity results from ligation of the random insert, not from a cellular process.

Selection in both ampicillin and tetracycline yielded 23 colonies. Plasmids from all 23 of these colonies have been characterized. Twenty-one plasmids are identical to pBR322. The presence of adenine in these sequences indicates that these are from the background of unmodified vectors and are not present within the small subpopulation of all random sequences selected here. The other two plasmids, pBT9 and pBT21, contain promoter substitutions of 19 and 17 bases, respectively (Fig. 2). An additional tetracycline-resistant colony containing a plasmid, pBTR3 (Fig. 2), was detected by replica plating the colonies growing on the ampicillin media onto ampicillin/tetracycline media (data not shown). Although there is sequence length heterogeneity, the average length of all pRAN3 substitutions is 20 bases, close to the target length of 19. We conclude that >90% of the 887 ampicillin-resistant colonies contain promoter substitutions and that about two of these colonies are also tetracycline resistant, suggesting that, very roughly, 0.2% of the 10^9 possible random sequences present in this construction may duplicate promoter recognition site activity. The absence of adenine reduces the frequency at which promoter recognition sites may be selected from random sequences.

[Fig. 3 image: RNA gel blot, one lane per plasmid; the 1.4-kb band is marked.]

FIG. 3. RNA gel blot analysis to quantify tetr transcription. Nucleic acid (5 µg per lane) purified from E. coli DH5.1 harboring the indicated plasmid and 100 ng of pBR322 DNA were electrophoresed through 1% agarose/2.2 M formaldehyde, transferred to a hybridization membrane, and probed with the 32P-labeled, nick-translated, 787-bp EcoRV-Nru I restriction fragment (11) from the coding region of tetr. A HindIII digest of bacteriophage λ DNA was used as a size marker (data not shown). The 1.4-kilobase (kb) band is indicated.

Table 1. Transformation with random plasmid populations

         amp selection                              amp and tet selection
         Colonies,  Sequences, no.                  Colonies,  Sequences, no.
DNA      no.        pBR322  Deletions  Inserts      no.        pBR322  Deletions  Inserts
pRAN4    125        2       3          5            28         27      0          1
pRAN3    887        0       0          10           23         21      0          2

E. coli DH5.1 was transformed with 410 ng of DNA from the indicated plasmid population at an efficiency of 10^3-10^4 colonies per µg of DNA with ampicillin (amp) selection and 10-10^2 colonies per µg of DNA with ampicillin and tetracycline (tet) selection. The number of colonies produced is listed. For transformation with each of the two plasmid populations, plasmids from 10 ampicillin-resistant colonies have been sequenced in the promoter region for tetr. The number of plasmids that are pBR322 background, contain promoter deletions, or contain promoter replacement insertions is indicated. The insertion sequences are listed in Fig. 2. All transformed colonies were grown on LB agar. Ampicillin concentration was 50 µg/ml; tetracycline concentration was 12.5 µg/ml. Equal amounts of DNA were used for both antibiotic selections. No transformants were observed in the absence of DNA. pRAN4, random plasmid population containing all four bases; pRAN3, random plasmid population containing just three bases (cytosine, guanine, and thymine).

Tetracycline Resistance Correlates with Transcription. DH5.1 bearing plasmids with promoter recognition site replacements was tested for resistance to tetracycline (Fig. 2). The range of tetracycline resistance conferred by the plasmids selected for ampicillin resistance only (pBB3, pBB5, pBB9, pBB10, pBB13, and pBA1-pBA10) is between 2 and 10 µg/ml. For the plasmids selected for both ampicillin and tetracycline resistance (pBG8, pBT9, pBT21, and pBTR3), the range is between 30 and 60 µg/ml, while pBR322 is resistant to 40 µg/ml. Both pBT9 and pBT21 are resistant to concentrations that inhibit pBR322, 50 and 60 µg/ml, respectively.

An RNA gel blot was used to quantify transcription from tetr (Fig. 3). The 787-bp EcoRV-Nru I restriction fragment from the protein coding region of tetr was hybridized to cellular nucleic acids. The absence of hybridization in DH5.1 with no plasmid reveals the specificity of the probe. Hybridization of nucleic acids from DH5.1 harboring pBR322 detects three bands. The upper two are plasmid DNA, while the bottom band contains a tetr transcript of about 1.4 kilobases maximum length. DH5.1 containing either the tetracycline-sensitive promoter deletion plasmid, pBdEC, or the tetracycline-sensitive promoter substitution plasmid, pBA5, reveals the absence of tetr transcripts. DH5.1 harboring the tetracycline-resistant promoter substitutions, pBG8, pBT9, pBT21, and pBTR3, exhibits various levels of tetr transcripts. The amount of tetr transcript for each plasmid was quantified by densitometry. To exclude possible differences in plasmid copy number, transcript levels have been normalized to plasmid DNA concentrations by taking the quotient of the values for the 1.4-kilobase band and the plasmid DNA bands. There is a direct correlation between levels of tetracycline resistance and tetr transcript (Table 2). Therefore, the phenotype provides a good estimate of promoter strength.

Table 2. Correlation of phenotype and transcription

           Relative value
           Tetracycline   tetr
Plasmid    resistance     transcript
pBT21      1.5            1.27
pBT9       1.25           1.17
pBG8       1.00           1.01
pBR322     1              1
pBTR3      0.75           0.290
pBA5       0.05           0
pBdEC      0.05           0

Plasmids were grown in E. coli DH5.1. Tetracycline resistance was determined by the 50% efficiency of plating (EOP50) method (15), and the tetr transcript levels were quantified by densitometry of the RNA gel blot (Fig. 3). Transcript levels were normalized to plasmid DNA concentration. All values are expressed relative to those for pBR322.

The Promoter Substitutions Are Responsible for Transcription. It is possible that the promoter substitution is not the cause of tetracycline resistance. To rule out the possibility that a chromosomal rather than a plasmid mutation is responsible for this phenotype, we performed secondary transformations using plasmids purified from the primary tetracycline-resistant transformants. The plasmids pBG8, pBT9, pBT21, and pBTR3 each transformed E. coli DH5 under tetracycline selection with an efficiency about equal to pBR322 (data not shown). We, therefore, conclude that the tetracycline resistance is conferred by the plasmid, not by a mutation in the host bacteria.

To unambiguously establish that the promoter replacements are the elements of the plasmid responsible for tetracycline resistance, we reconstructed one of the plasmids. We used oligonucleotides of defined sequence to duplicate the promoter substitution in pBT9. We synthesized two oligonucleotides, each identical to one of the two DNA strands of the promoter recognition region. The two oligonucleotides were annealed and ligated into pBR322. A synthetic 26-mer with the same sequence as pBT9 in the nontemplate strand extending from the EcoRI restriction site to the Cla I restriction site was hybridized to a complementary synthetic 24-mer with an equivalent sequence for the template strand. This insert was ligated into EcoRI- and Cla I-digested pBR322 to produce pBT9R, a plasmid identical in promoter sequence to pBT9. In the absence of a plasmid, DH5 was found to be resistant to a tetracycline concentration of 2 µg/ml. When harboring pBT9R, the reconstruction of pBT9, DH5 is resistant to a tetracycline concentration of 50 µg/ml, the same level of resistance that the original pBT9 confers on DH5.1. Therefore, tetracycline resistance is conferred by the promoter substitutions, not by a mutation elsewhere on the plasmid.

DISCUSSION

We inserted random, chemically synthesized DNA sequences into plasmids and have selected from a heterogeneous population those sequences that are biologically active. From a population of about 1000 plasmids, each different and random in sequence for about 19 bp in the promoter recognition site of tetr, four additional promoters have been identified by antibiotic selection, two of which promote transcription more strongly than the native promoter. We have demonstrated that the promoter substitutions were

present within the initial population of random sequences by synthesizing sequences of biased composition, deficient in adenine; new promoter sequences were recovered at a lower frequency and did not contain adenine. We have established that the promoter substitutions are responsible for transcription initiation by reconstructing one of the new promoters through chemical synthesis of that DNA sequence; reinsertion of this sequence into a plasmid with a deletion in the promoter recognition site restores biological activity.

Ligation of target DNA upstream from a promoterless marker gene is an established strategy for the identification of promoters. The tetracycline resistance gene of pBR322 has been used as a promoter probe (22-24). In these experiments the promoter typically was inactivated by deletion within the recognition site. Restriction digests of genomic DNA from various prokaryotes and eukaryotes were ligated into the inactive promoter as a test of their potential to restore function. Depending upon the organismal source of the DNA, insertions with promoter activity were selected at a frequency of 0.2-33% from among all recombinants. In the absence of DNA sequence data this high frequency of promoter selection has been variously explained as either verification of functional promoters (15) or as fortuitous restoration of the deleted portion of the native promoter (25). The latter explanation implies a lack of stringency of RNA polymerase in choosing promoter DNA sequences.

While all four of the promoter recognition regions reported here retain homology with the consensus sequence, in three of those sequences, pBG8, pBT9, and pBT21, the consensus alignment shifts 10 nucleotides downstream (Fig. 2). In these three promoters transcription initiation also shifts 10 nucleotides downstream (data not shown). The RNA polymerase, therefore, recognizes another Pribnow box from within the original pBR322 sequence. A candidate Pribnow box, 10 nucleotides downstream from the original Pribnow box, is TGCGGTAGTTT, where agreement with the consensus sequence has been underlined. Supportive evidence comes from a mutation in lac in which a base substitution at +1 activates a latent Pribnow box to initiate transcription at +13 (26).

The functional tetr promoter in pBTR3 has conserved the consensus spacing of 17 nucleotides between the -35 and -10 promoter elements. However, the nonfunctional tetr promoter in pBA7 retains substantial homology with the consensus sequence but not the spacing. These promoter mutations show the significance of the spacing between the two promoter elements.

From among more than 150 known promoters and promoter mutations (7), the only up mutation that decreases homology to the consensus sequence is found in the -35 region of ara (27), and there are no down mutations that increase homology with the consensus sequence. Among the sequences reported here, pBTR3 contains decreased homology with the consensus sequence, relative to pBR322, and is a down mutation, in agreement with this observation. However, pBT9 and pBT21 contain tetr promoters with decreased consensus homology at both promoter elements, yet are up mutations. While our results highlight the importance of the promoter consensus sequence, they also indicate that deviations from the consensus need not necessarily decrease promoter activity.

It should be emphasized that the set of all possible DNA sequences may contain a large number of biologically active sequences that have never been tested in nature. Presumably, during prebiotic evolution nucleotides were linked together and replicated chemically with high error rates in the absence of polymerases (28). The total pool of possible sequences may have been progressively narrowed by the more rapid reproduction of particular molecular species. Accurate replication of polymers of increasing length is likely to have been coupled to the evolution of fidelity mechanisms, resulting in a further reduction in the number of random species (29). From this, a very limited repertoire of functional DNA sequences evolved. By presenting a cell with a random population of nucleotide sequences, selection of unusual biological activities might be possible. Since the arithmetic potential of sequence diversity is so great, however, only a small fraction of all sequences can be screened experimentally.

The technique of biological selection from a large population of random DNA sequences may be useful for defining other prokaryotic and eukaryotic functions. Genetic regulatory signals that may be studied include enhancers and replication origins. The technique may also be extended for the selection of protein coding sequences; candidate applications include peptide hormones, leader sequences, catalytic domains, and even entire enzymes. The import of this technique is its usefulness in the absence of physical, chemical, or empirical assumptions about structure and function relationships.

We thank Ray Monnat and Brad Preston for helpful advice, Patrick Chou and Yim Foon Lee for performing oligonucleotide synthesis, and Chris Bjarke and Peter Evers for technical assistance with DNA sequencing. M.S.Z.H. is a student in the Medical Scientist Training Program. This work was funded by the Gottstein Memorial Trust.

1. Pribnow, D. (1975) J. Mol. Biol. 99, 419-443.
2. Schaller, H., Gray, C. & Herrmann, K. (1975) Proc. Natl. Acad. Sci. USA 72, 737-741.
3. Takanami, M., Sugimoto, K., Sugisaki, H. & Okamoto, T. (1976) Nature (London) 260, 297-302.
4. Seeburg, P. H., Nusslein, C. & Schaller, H. (1977) Eur. J. Biochem. 74, 107-113.
5. Rosenberg, M. & Court, D. (1979) Annu. Rev. Genet. 13, 319-353.
6. Hawley, D. K. & McClure, W. R. (1983) Nucleic Acids Res. 11, 2237-2255.
7. McClure, W. R. (1985) Annu. Rev. Biochem. 54, 171-204.
8. Matteuci, M. D. & Heyneker, H. C. (1983) Nucleic Acids Res. 11, 3113-3121.
9. Kozak, M. (1986) Cell 44, 283-292.
10. Alvarado-Urbina, G., Sathe, G. M., Liu, W.-C., Gillen, M. F., Duck, P. D., Bender, R. & Ogilvie, K. K. (1981) Science 214, 270-274.
11. Maniatis, T., Fritsch, E. F. & Sambrook, J. (1982) Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Laboratory, Cold Spring Harbor, NY).
12. Hanahan, D. (1983) J. Mol. Biol. 166, 557-580.
13. Sanger, F., Nicklen, S. & Coulson, A. R. (1977) Proc. Natl. Acad. Sci. USA 74, 5463-5467.
14. Wallace, B. R., Johnson, M. J., Suggs, S. V., Miyoshi, K., Bhatt, R. & Itakura, K. (1981) Gene 16, 21-26.
15. West, R. W., Neve, R. L. & Rodriguez, R. L. (1979) Gene 7, 271-288.
16. Brosius, J., Cate, R. L. & Perlmutter, A. P. (1982) J. Biol. Chem. 257, 9205-9210.
17. Sutcliffe, J. G. (1979) Cold Spring Harbor Symp. Quant. Biol. 43, 77-90.
18. Rodriguez, R. L., West, R. W. & Heyneker, H. L. (1979) Nucleic Acids Res. 6, 3267-3287.
19. Stuber, D. & Bujard, H. (1981) Proc. Natl. Acad. Sci. USA 78, 167-171.
20. Backman, K. & Boyer, H. W. (1983) Gene 26, 197-203.
21. McMurray, L., Petrucci, R. E. & Levy, S. B. (1980) Proc. Natl. Acad. Sci. USA 77, 3974-3977.
22. Widera, G., Gautier, F., Lindenmaier, W. & Collins, J. (1978) Mol. Gen. Genet. 163, 301-305.
23. Neve, R. L., West, R. W. & Rodriguez, R. L. (1979) Nature (London) 277, 324-325.
24. West, R. W. & Rodriguez, R. L. (1982) Gene 20, 291-304.
25. Brosius, J. (1984) Gene 27, 151-160.
26. Maquat, L. E. & Reznikoff, W. S. (1980) J. Mol. Biol. 139, 551-556.
27. Horwitz, A. H., Morandi, C. & Wilcox, G. (1980) J. Bacteriol. 142, 659-667.
28. Fakhrai, H., van Roode, J. H. G. & Orgel, L. E. (1981) J. Mol. Evol. 17, 295-302.
29. Eigen, M. & Schuster, P. (1978) Naturwissenschaften 65, 341-369.