Volume 9 Number 21 1981 Nucleic Acids Research

The sequence of the promoter and the amino-terminal region of alkaline phosphatase structural gene (phoA) of Escherichia coli

Yasuhiro Kikuchi, Koji Yoda, Makari Yamasaki and Gakuzo Tamura

Department of Agricultural Chemistry, The University of Tokyo, Bunkyo-ku, Tokyo 113, Japan

Received 21 September 1981

ABSTRACT
The promoter and the amino-terminal region of phoA, the structural gene for alkaline phosphatase of Escherichia coli K12, were cloned by using the promoter cloning vector pMC1403. The nucleotide sequence of the cloned fragment has been determined. A sequence encoding the amino-terminal portion of mature alkaline phosphatase was found, and it is preceded by a sequence encoding the signal peptide. The signal peptide consists of 21 amino acids: Met-Lys-Gln-Ser-Thr-Ile-Ala-Leu-Ala-Leu-Leu-Pro-Leu-Leu-Phe-Thr-Pro-Val-Thr-Lys-Ala. The translation initiation codon is GUG, which is preceded by the Shine-Dalgarno sequence GGAG. Upstream of these sequences there is a typical procaryotic promoter, with TATAGTC as the Pribnow box. Around the Pribnow box there are several dyad symmetrical sequences, which may be involved in the regulation of this gene.

INTRODUCTION
Alkaline phosphatase (EC 3.1.3.1) of E. coli is synthesized under low phosphate conditions (1) and is secreted across the inner membrane into the periplasmic space (2). Its structural gene, phoA, is located at 8.5 min on the E. coli genetic map (3). The regulation of phoA expression is complex and not well understood. A number of genes that affect the expression of phoA have been identified, including phoB (4), phoM (5), phoR (6), phoS, phoT (7) and perA (8). The products of phoB, phoM and phoR in particular regulate the expression of phoA and may interact directly with specific nucleotide sequence(s) around the phoA promoter.

Like other secreted proteins, alkaline phosphatase is synthesized as a precursor that carries a signal peptide at its amino terminus (9). The signal peptide is believed to play an important role in the initiation step(s) of protein translocation across the membrane and is cleaved off during or soon after secretion (10,11,12,13). The amino acid sequence of the signal peptide of alkaline phosphatase has been only partially known (14).

The promoter region of phoA is interesting both because of the complex regulation of its expression and because of the signal peptide encoded at the amino terminus. Previously, we cloned the entire phoA gene on plasmid vectors (15). We report here the subcloning of the phoA promoter region and the determination of its nucleotide sequence.

© IRL Press Limited, 1 Falconberg Court, London W1V 5FG, U.K.

MATERIALS AND METHODS
Bacterial strains and plasmids. E. coli strain MC1061 (ΔlacZ) and plasmid pMC1403 were kindly provided by Dr. M. J. Casadaban (16). Construction of the phoA plasmid pKI-2 has been described previously (15).

Media and transformation. Phosphate-limited "medium 121" (17) containing 5-bromo-4-chloro-3-indolyl-β-D-galactoside (Sigma) (18) was used for selecting Lac+ transformants under low phosphate conditions. Transformation with plasmid DNA was carried out as described by Cohen et al. (19).

Enzymes. All enzymes were purchased from commercial sources. Reactions were carried out under the conditions described by the suppliers.

Purification of plasmid DNA. For small scale (10 ml) preparations, we adopted the "mini prep" described by Klein et al. (20). For large scale (2000 ml) preparations, covalently closed circular plasmid DNA was purified from a cleared lysate by hydroxyapatite column chromatography (21).

Cloning of the phoA promoter region. pKI-2 and pMC1403 were mixed, digested with EcoRI and ligated with T4 DNA ligase. MC1061 was transformed with this DNA mixture. Transformants that were resistant to ampicillin and showed the Lac+ phenotype under low phosphate conditions were isolated, and their plasmids were analysed. pYK171, one such plasmid, was partially digested with HinfI to the extent that most of the plasmid molecules were cleaved at only one site. Linearized molecules were purified by agarose gel electrophoresis. HinfI staggered ends were repaired with E. coli DNA polymerase I (large subunit) in the presence of the four dNTPs. Repaired molecules were digested with SmaI and circularized with T4 DNA ligase. Transformants of MC1061 that were ampicillin resistant and showed the Lac+ phenotype under low phosphate conditions were selected and further analysed.

Purification of DNA fragments and restriction mapping. DNA fragments were isolated and purified by polyacrylamide gel electrophoresis as described by Maxam and Gilbert (22). A restriction map was established by sizing the DNA fragments produced by various restriction enzymes on polyacrylamide gels containing ethidium bromide. HinfI-digested fragments of pBR322 were used as molecular weight markers (23).

DNA sequencing. DNA fragments were 5'-end labeled with [γ-32P]ATP and sequenced as described by Maxam and Gilbert (22).
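The repair-and-ligation step in the cloning procedure can be illustrated with a small sketch. It assumes the HinfI site in question is GACTC (GANTC with N = C, as implied by the GACT junction reported in the Discussion) and that the SmaI and BamHI sites of pMC1403 overlap as CCCGGGGATCC; the vector sequence and the function names are assumptions made for illustration, not taken from the paper.

```python
# Sketch of how a filled-in HinfI end joins a SmaI blunt end.
# HINFI_SITE and VECTOR_POLYLINKER are illustrative assumptions (see lead-in).
HINFI_SITE = "GACTC"               # GANTC with N = C
VECTOR_POLYLINKER = "CCCGGGGATCC"  # assumed SmaI (CCCGGG) / BamHI (GGATCC) region


def filled_in_upstream_end(site: str) -> str:
    """HinfI cuts G^ANTC, leaving a 5' ANT overhang.

    Filling in the recessed 3' end copies the overhang, so the fragment
    upstream of the cut ends in the blunt sequence GANT.
    """
    if not (len(site) == 5 and site[:2] == "GA" and site[3:] == "TC"):
        raise ValueError("not a HinfI site (GANTC)")
    return site[:4]


def smai_downstream_end(seq: str) -> str:
    """SmaI cuts CCC^GGG (blunt); return the sequence to the right of the cut."""
    cut = seq.index("CCCGGG") + 3
    return seq[cut:]


junction = filled_in_upstream_end(HINFI_SITE) + smai_downstream_end(VECTOR_POLYLINKER)
print(junction)  # GACTGGGGATCC
```

Under these assumptions the junction reads GACT followed by GGGGATC, which matches the origin of nucleotides 510-513 and 514-520 given in the Discussion.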


RESULTS
Cloning of the phoA promoter region. We have previously constructed pKI-2, which has the entire phoA gene cloned on pBR322 (15). Subcloning studies have revealed that the phoA gene is on the 2.7 kb (kilobase) fragment between the HindIII site and the XhoI site, and that the phoA promoter is on the 1.25 kb fragment between the HindIII site and the EcoRI site (Kikuchi et al., Abstr. Jpn. Biochem. Soc. Meet. 1980, 3Q, p. 11, and Kikuchi et al., in preparation).

We subcloned the 1.28 kb EcoRI fragment of pKI-2 into the EcoRI site of pMC1403, as described in Materials and Methods and as shown in Fig. 1. Transformants with pYK171 showed the Lac+ phenotype only under low phosphate conditions.

There are four HinfI sites on the 1.25 kb fragment between the HindIII site and the EcoRI site of pYK171. Fragments of various lengths between the SmaI site and one of the HinfI sites were deleted as described in Materials and Methods and as shown in Fig. 1. Several kinds of plasmids smaller than pYK171 and containing the phoA promoter were obtained. Among those plasmids, pYK190 had a 520 bp (base pair) fragment between the BamHI site and the HindIII site. Transformants with pYK190 showed the Lac+ phenotype only under low phosphate conditions.

[Figure 1: diagram of the construction of pYK171 from pKI-2 and pMC1403 (EcoRI digestion and ligation) and of pYK190 from pYK171 (partial HinfI digestion, repair with four dNTPs, SmaI digestion and T4 ligase).]

Fig. 1 Cloning of the phoA Promoter Region. The symbols in the diagram distinguish pBR322 DNA, pMC1403 DNA, E. coli DNA and the phoA promoter. Cleavage sites of restriction enzymes: (E) EcoRI, (H) HindIII, (X) XhoI, (S) SmaI and (B) BamHI. Lengths of fragments are indicated in kilobases.


Nucleotide sequence of the 520 bp fragment. pYK190 was digested with BamHI and HindIII. The 520 bp fragment was purified, and a restriction map was established with AluI, HaeIII, HhaI, HinfI and HpaII. Fig. 2 shows the map and the strategy adopted for sequencing the fragment. Fig. 3 shows the nucleotide sequence determined.
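Locating recognition sites of the mapping enzymes in a known sequence can be sketched as follows. The recognition sequences are the standard ones for these enzymes; the two test strings are segments quoted in the Discussion (nucleotides 191-212 and 215-228), and the function name `find_sites` is hypothetical.

```python
import re

# Recognition sequences (5'->3') of the enzymes used for mapping; N = any base.
ENZYMES = {
    "AluI": "AGCT",
    "HaeIII": "GGCC",
    "HhaI": "GCGC",
    "HinfI": "GANTC",
    "HpaII": "CCGG",
}


def find_sites(seq: str, first_coordinate: int = 1) -> dict:
    """Return {enzyme: [coordinate of the first base of each recognition site]}."""
    hits = {}
    for name, site in ENZYMES.items():
        # A look-ahead pattern also reports overlapping occurrences.
        pattern = "(?=" + site.replace("N", "[ACGT]") + ")"
        hits[name] = [m.start() + first_coordinate for m in re.finditer(pattern, seq)]
    return hits


# Segments as quoted in the Discussion, with the coordinate of their first base.
print(find_sites("CTTTTCAACAGCTGTCATAAAG", 191))
print(find_sites("GTCACGGCCGAGAC", 215))
```

This only finds sites in a sequence that is already known; the map in the paper was built experimentally by sizing digestion products on gels.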

[Figure 2: restriction map of the 520 bp fragment on a scale of 0 to 500 bp, with the four sequencing runs (1) to (4) drawn beneath it.]

Fig. 2 Strategy for Sequencing the 520 bp HindIII-BamHI Fragment of pYK190. (○) indicates the position of the 32P label at the 5' end. Dotted lines indicate regions not sequenced. (1) 5'-labeled and cleaved with HaeIII. (2) Cleaved with HaeIII, 5'-labeled and separated into single strands. (3) Cleaved with AluI, 5'-labeled and cleaved with HinfI. (4) Cleaved with HinfI, 5'-labeled and cleaved with HhaI.

[Figure 3: nucleotide sequence of both strands of the 520 bp fragment with the deduced amino acid sequence. Residues of the signal peptide are numbered -21 to -1, the arginine of isozyme 1 is numbered 1', and residues of mature alkaline phosphatase are numbered from 1.]

Fig. 3 The Nucleotide Sequence of the 520 bp Fragment. Dyad symmetrical sequences around the Pribnow box and their centers of symmetry are marked. PB: the Pribnow box; SD: the Shine and Dalgarno sequence.

DISCUSSION
We have cloned the promoter region of phoA by using the promoter cloning vector pMC1403 (16). Several recombinant plasmids carrying phoA-lacZ fusion genes, including pYK171 and pYK190, were obtained. The transformants showed the Lac+ phenotype only under low phosphate conditions. Details of the expression of the fused genes and of the translocation of their products will be described elsewhere.

We have reported here the nucleotide sequence of the 520 bp fragment of pYK190, which contains the promoter region of phoA. Within the sequence shown in Fig. 3, the sequence GGGGATC (nucleotides 514-520) originated from the vector pMC1403. The remaining sequence (nucleotides 1-513) originated from the E. coli chromosome. The sequence GACT (nucleotides 510-513) was generated by the repair of the HinfI staggered end, and it is ligated to the blunt end of SmaI-cleaved pMC1403. To the right, this sequence is followed by lac'Z, which encodes a polypeptide that lacks the first eight amino acid residues of β-galactosidase but still retains catalytic activity (16).

Recently, Bradshaw et al. determined the amino acid sequence of mature alkaline phosphatase (isozyme 3) (24). Within the nucleotide sequence shown in Fig. 3, nucleotides 348-513 correspond to the amino-terminal fifty-six residues of mature alkaline phosphatase. Our results are in good agreement with theirs except for residues 15 and 35. Bradshaw et al. reported that both were asparagine. According to our nucleotide sequence, both are aspartic acid, encoded by GAT (nucleotides 390-392 and 450-452). In amino acid sequence analysis these two amino acids are apt to be confused, so the assignment from nucleotide sequencing may be the more reliable. The fifty-sixth residue, reported to be serine, should be encoded by TCN, because the GACT (nucleotides 510-513) was originally part of a HinfI site (GANTC).

In E. coli, monomers of alkaline phosphatase occur in two forms, which differ only in the presence of an amino-terminal arginine residue (25). The amino-terminal arginine residue of isozyme 1 is cleaved off to form isozyme 3 (26). Our nucleotide sequence confirms this: the amino-terminal threonine residue of isozyme 3 (nucleotides 348-350, ACA; residue 1) is preceded by an arginine residue (nucleotides 345-347, CGG; residue 1').

Nucleotides 282-344 encode a peptide consisting of twenty-one amino acids: Met-Lys-Gln-Ser-Thr-Ile-Ala-Leu-Ala-Leu-Leu-Pro-Leu-Leu-Phe-Thr-Pro-Val-Thr-Lys-Ala. The sequence of the first five residues agrees with that of the signal peptide of alkaline phosphatase partially determined by Sarthy et al. (14). However, the amino acid composition of this peptide differs greatly from that predicted by Lazdunski et al. for the signal peptide of alkaline phosphatase (27). They analysed a peptide cleaved from a presumed precursor of alkaline phosphatase that accumulated in the presence of procaine. We suspect that the peptide they analysed was not the signal peptide, or that it was contaminated with other peptide(s).

The amino acid sequence of the signal peptide of alkaline phosphatase deduced from our nucleotide sequence shares common characteristics with those of other bacterial signal peptides (12,13). The amino-terminal region is basic, with a lysine at residue -20. There is a long stretch of hydrophobic amino acids (residues -16 to -4). Two proline residues are present (residues -10 and -5). The residue at the cleavage site (residue -1) is alanine. This sequence is consistent with the "loop model" proposed by Inouye (12).

The codon for the first methionine (residue -21) is not AUG but GUG (GTG in DNA; nucleotides 282-284). There are several instances in which GUG initiates translation and is translated as methionine (28).
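The correspondence between residue numbers and nucleotide coordinates used above can be checked with a few lines of arithmetic. This is a minimal sketch: the two anchor coordinates and the signal peptide sequence are taken from the text, while the names `codon_span` and `SIGNAL_BY_POSITION` are hypothetical.

```python
from typing import Tuple

# Coordinates follow the paper's 1-based numbering of the 520 bp fragment.
SIGNAL_START = 282   # first nucleotide of the GTG initiation codon (residue -21)
MATURE_START = 348   # first nucleotide of the ACA codon for Thr (residue 1)

SIGNAL_PEPTIDE = ("Met Lys Gln Ser Thr Ile Ala Leu Ala Leu Leu "
                  "Pro Leu Leu Phe Thr Pro Val Thr Lys Ala").split()

# Number the signal peptide -21 ... -1, as in the text.
SIGNAL_BY_POSITION = {
    i - len(SIGNAL_PEPTIDE): aa for i, aa in enumerate(SIGNAL_PEPTIDE)
}


def codon_span(residue: int) -> Tuple[int, int]:
    """Nucleotide span of a mature residue (1, 2, ...) or a signal residue (-21 ... -1).

    The arginine of isozyme 1 (residue 1', nucleotides 345-347) lies between
    residues -1 and 1 and is not addressed by this function.
    """
    if residue == 0 or residue < -len(SIGNAL_PEPTIDE):
        raise ValueError("residue number outside the numbering used in the paper")
    if residue > 0:
        first = MATURE_START + 3 * (residue - 1)
    else:
        first = SIGNAL_START + 3 * (residue + len(SIGNAL_PEPTIDE))
    return first, first + 2


for n in (-21, -1, 1, 15, 35, 56):
    print(n, codon_span(n))
print({pos: SIGNAL_BY_POSITION[pos] for pos in (-20, -10, -5, -1)})
```

The spans for residues 15 and 35 come out as 390-392 and 450-452, and residue 56 begins at nucleotide 513, in line with the statement that only the first base (T) of its codon is present before the vector sequence begins.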
Nine nucleotides upstream of this GTG (282-284) is the sequence GGAG (nucleotides 270-273), which can hybridize with the 3' end of 16S rRNA, as predicted by Shine and Dalgarno (29). No codon other than this GTG (282-284) qualifies as the translation initiation codon for alkaline phosphatase.

The sequence TATAGTC (nucleotides 230-236) is homologous to the binding site for RNA polymerase in procaryotic promoters, known as the Pribnow box (30). The distance (45 bp) from the initiation codon to the Pribnow box is similar to that (43 bp) of the lactose operon (31). The sequence between the Pribnow box and the initiation codon is highly AT-rich (80%), which suggests that this is a highly efficient promoter (32). Messenger RNA is expected to start at nucleotide 242 (G). The reading frame of phoA on the cloned fragment differs from that of lac'Z on the vector (16). Probably, a fused messenger RNA is transcribed and translation re-initiates at the ATG at nucleotides 499-501, which is in the same reading frame as lac'Z.

Nucleotides 1-178 may correspond to the carboxyl-terminal region of the preceding gene, which ends at the termination codon TAA (nucleotides 179-181).

Around the Pribnow box TATAGTC (230-236) there are four dyad symmetrical sequences: nucleotides 176-201, CAGTAAAAAGTTAATCTTTTCAACAG; 191-212, CTTTTCAACAGCTGTCATAAAG; 215-228, GTCACGGCCGAGAC; and 231-298, ATAGTCGCTTTGTTTTTATTTTTTAATGTATTTGTACATGGAGAAAATAAA. The sequence 191-212 overlaps the sequence 176-201 and includes the -35 region (around nucleotide 207), a presumed recognition site for RNA polymerase (31). The sequence 231-298 includes the Pribnow box. Some or all of these sequences may be involved in the regulation of this gene.

Cleavage of the 520 bp fragment with HpaII (CCGG, nucleotides 344-347) and PvuII (CAGCTG, nucleotides 199-204) gives the smallest fragment (144 bp) that contains both the phoA promoter and the sequence for the intact signal peptide. This fragment will be useful for studying the regulatory mechanism of phoA and for elucidating the role of the signal peptide in secretion. However, cleavage with PvuII disrupts the dyad symmetrical sequences around the -35 region and may affect the expression of this gene.

It would be of great interest to search for similar sequences in the promoter regions of other genes that are induced under low phosphate conditions. It would also be interesting to determine the nucleotide sequences of promoter mutants of phoA. There may yet exist a type of regulatory mechanism different from those of well characterized genes such as lac or trp.
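Several of the landmarks discussed above can be checked directly against the sequence segments quoted in the text. The sketch below stores each quoted segment under the coordinate of its first nucleotide, scores dyad symmetry as the number of positions at which a segment equals its own reverse complement, and recomputes the AT content between the Pribnow box and the initiation codon. Two points are assumptions of this sketch: the fourth quoted segment is 51 nucleotides long and is aligned here to nucleotides 231-281, and the ungapped symmetry score is a deliberately simple measure that may not match how the symmetries were drawn in Fig. 3. All function names are hypothetical.

```python
# Segments quoted in the Discussion, keyed by the coordinate of their first base.
SEGMENTS = {
    176: "CAGTAAAAAGTTAATCTTTTCAACAG",
    191: "CTTTTCAACAGCTGTCATAAAG",
    215: "GTCACGGCCGAGAC",
    # Assumption: this 51-nt string is taken to occupy nucleotides 231-281.
    231: "ATAGTCGCTTTGTTTTTATTTTTTAATGTATTTGTACATGGAGAAAATAAA",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def dyad_matches(seq: str) -> int:
    """Number of positions at which seq equals its own reverse complement."""
    return sum(a == b for a, b in zip(seq, reverse_complement(seq)))


def base_at(pos: int) -> str:
    """Base at a 1-based fragment coordinate, if a quoted segment covers it."""
    for start, seq in SEGMENTS.items():
        if start <= pos < start + len(seq):
            return seq[pos - start]
    raise KeyError(f"nucleotide {pos} is not covered by the quoted segments")


def region(first: int, last: int) -> str:
    return "".join(base_at(p) for p in range(first, last + 1))


def at_fraction(seq: str) -> float:
    return sum(b in "AT" for b in seq) / len(seq)


for start, seq in SEGMENTS.items():
    print(start, start + len(seq) - 1, f"{dyad_matches(seq)}/{len(seq)} symmetric positions")

spacer = region(237, 281)  # between the Pribnow box and the GTG codon
print(len(spacer), "bp,", round(100 * at_fraction(spacer)), "% AT")
print(region(270, 273), region(199, 204), region(179, 181), base_at(242))
```

With the alignment assumed here, the interval between the Pribnow box and the GTG codon is 45 bp long and 80% AT, and GGAG, CAGCTG, TAA and the G at nucleotide 242 fall at the coordinates given in the text.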

ACKNOWLEDGEMENTS
We are grateful to Dr. Takao Sekiya and his research group at the National Cancer Center Research Institute for teaching us DNA sequencing techniques. We thank Dr. M. J. Casadaban for the gift of pMC1403 and MC1061. This work was supported in part by a research grant from the Institute of Physical and Chemical Research and by a grant from the Ministry of Education, Science and Culture of Japan, No. 56118005.


REFERENCES
1. Torriani, A. (1960) Biochim. Biophys. Acta 38, 460-479
2. Malamy, M., and Horecker, B. (1961) Biochem. Biophys. Res. Commun. 5, 104-108
3. Bachmann, B.J., and Low, K.B. (1980) Microbiol. Rev. 44, 1-56
4. Yagil, E., Bracha, M., and Lifshitz, Y. (1975) Mol. Gen. Genet. 137, 11-16
5. Wanner, B.L., and Latterell, P. (1980) 96, 353-366
6. Echols, H., Garen, A., Garen, S., and Torriani, A. (1961) J. Mol. Biol. 3, 425-438
7. Willsky, G.R., Bennett, R.L., and Malamy, M.H. (1973) J. Bacteriol. 113, 529-539
8. Wanner, B.L., Sarthy, A., and Beckwith, J. (1979) J. Bacteriol. 140, 229-239
9. Inouye, H., and Beckwith, J. (1977) Proc. Natl. Acad. Sci. U.S.A. 74, 1440-1444
10. Blobel, G., and Dobberstein, B. (1975) J. Cell Biol. 67, 835-851
11. von Heijne, G., and Blomberg, C. (1979) Eur. J. Biochem. 97, 175-181
12. Inouye, M., editor (1979) Bacterial Outer Membranes: Biogenesis and Functions, John Wiley and Sons, Inc., New York
13. Emr, S.D., Hall, M.N., and Silhavy, T.J. (1980) J. Cell Biol. 86, 701-711
14. Sarthy, A., Fowler, A., Zabin, I., and Beckwith, J. (1979) J. Bacteriol. 139, 932-939
15. Yoda, K., Kikuchi, Y., Yamasaki, M., and Tamura, G. (1980) Agric. Biol. Chem. 44, 1213-1214
16. Casadaban, M.J., Chou, J., and Cohen, S.N. (1980) J. Bacteriol. 143, 971-980
17. Torriani, A. (1968) J. Bacteriol. 96, 1200-1207
18. Miller, J.H. (1972) Experiments in . Cold Spring Harbor Laboratory, Cold Spring Harbor, New York
19. Cohen, S.N., Chang, A., and Hsu, L. (1972) Proc. Natl. Acad. Sci. U.S.A. 69, 2110-2114
20. Klein, R.D., Selsing, E., and Wells, R.D. (1980) Plasmid 3, 88-91
21. Colman, A., Byers, M.J., Primrose, S.B., and Lyons, A. (1978) Eur. J. Biochem. 91, 303-310
22. Maxam, A.M., and Gilbert, W. (1979) Methods in Enzymology, Grossman, L., and Moldave, K. Eds. LXV, 499-560, Academic Press, New York
23. Sutcliffe, J.G. (1978) Nucl. Acids Res. 5, 2721-2728
24. Bradshaw, R.A., Cancedda, F., Ericsson, L.H., Neumann, P.A., Piccoli, S.P., Schlesinger, M.J., Shriefer, K., and Walsh, K.A. (1981) Proc. Natl. Acad. Sci. U.S.A. 78, 3473-3477
25. Kelley, P.M., Neumann, P.A., Shriefer, K., Cancedda, F., Schlesinger, M.J., and Bradshaw, R.A. (1973) Biochemistry 12, 3499-3503
26. Nakata, A., Yamaguchi, M., Izutani, K., and Amemura, M. (1978) J. Bacteriol. 134, 287-294
27. Lazdunski, C., Baty, D., and Pages, J.M. (1979) Eur. J. Biochem. 96, 49-57
28. Steege, D.A. (1977) Proc. Natl. Acad. Sci. U.S.A. 74, 4163-4167
29. Shine, J., and Dalgarno, L. (1974) Proc. Natl. Acad. Sci. U.S.A. 71, 1342-1346
30. Pribnow, D. (1975) Proc. Natl. Acad. Sci. U.S.A. 72, 784-788
31. Dickson, R.C., Abelson, J., Barnes, W.M., and Reznikoff, W.S. (1975) Science 187, 27-35
32. Chamberlin, M.J. (1976) RNA Polymerase, Losick, R., and Chamberlin, M. Eds. pp. 159-191
