Agric. Biol. Chem., 52 (12), 3067-3071, 1988 3067

Isolation and Nucleotide Sequence Analysis of the URA3 (orotidine 5 '-phosphate decarboxylase) of Kluyveromyces lactis Makoto Mizukami* and Fumio Hishinuma** Mitsubishi Kasei Institute of Life Sciences, ll Minamiooya, Machida-shi, Tokyo 194, Japan Received June 22, 1988

A gene complementing a uraS mutation in was isolated from Kluyveromyces lactis. Its nucleotide sequence showedan open reading frame of 801 bp, correspond- ing to 267 amino acid residues. These are homologous to the coding region of URA3gene of S. cerevisiae, 71% and 82% at the nucleotide and the amino acid levels, respectively. Although transcription started at -30 to -40 from the initiation codon ATG,which is comparable to S. cerevisiae, the regulatory sequences identified in the 5' flanking regions of URA1and URA3genes of S. cerevisiae are not found in the URA3gene of K. lactis.

The yeast Kluyveromyces lactis is able to use E. colipyrFand S. cerevisiae URA3genes. The a wide variety of carbon sources, and lacks nucleotide sequence analysis of this gene re- strong glucose repression of gene expres- veals that the deduced amino acid sequence is sion.1>2) These observations suggest that K. homologous to the URA3gene of S. cerevi- lactis is useful as a host yeast to produce siae^ but the 5' and 3' flanking sequences are foreign proteins, and is of interest in studying quite divergent from that of the URA1 and the regulation of gene expression. URA3genes of S. cerevisiae. The biosynthetic pathway in S. cerevisiae consists of six steps. Two unlinked MATERIALS AND METHODS , URA1and URA3, encode two inducible enzymes, dihydroorotic acid dehydrogenase Strains and plasmids. K. lactis wild type strain (IFO 1267) was the origin of chromosomal DNA for the and orotidine 5'-phosphate decarboxylase, re- spectively. Their expression is co-ordinately preparation of the gene library. S. cerevisiae YNN27(a ura3 trpl gall) and E. coli DB6656 (pyrFw Mutrp lacZ and positively regulated by the PPR1 gene hsdR) were used as hosts for screening of the URA3gene product and the inducer molecule dihydroorotic of K. laetis. E. coli C 600 (F" thrl leuB6 lacYl tonA21 acid at the transcriptional level.3'4) Losson et supE44 X~) was the host for propagating plasmids. Plasmid pREI032 was constructed by R. Hirochika in al.5) identified a short regulatory region in the this laboratory and is identical to YRp7except that one of 5' flanking region of URA1 and URA3 of S. the two EcoRlsites ofYRp7(the one near the ampicillin cerevisiae that responds to PPR1induction. resistant gene) was removed. The URA3gene of K. lactis should be quite useful in constructing host-vector systems as a Enzymesand chemicals. Nuclease S1 was obtained from selectable marker and in analyzing gene ex- NewEngland BioLab. Restriction endonuclease, bacterial alkaline phosphatase, the large fragment of DNApoly- pression in K. lactis. Here we have isolated the merase I (Klenow), and T4 polynucleotide kinase were URA3gene from a K. lactis genomic library by from Toyobo. a-32P-dCTP and y-32P-ATP were from complementationof mutantsin the equivalent Amersham.

Present address: Faculty of Agriculture, Nagoya University, Chigusa-ku, Nagoya 464, Japan. To whomreprint requests should be sent. 3068 M. Mizukami and F. Hishinuma

DNAmanipulation. DNAwas sequenced by the chain termination method.7) Yeast was transformed by the were used to transform S. cerevisiae YNN27. lithium ion method.8) Other procedures used were de- After two steps of selection (Trp+ and Ura+), four Ura+ transformants were obtained. scribed previously.9) Plasmid DNAwas isolated from these trans- Preparation of the gene library of K. lactis DNA.Total formants and then introduced into E. coli DNA was extracted from K. lactis IFO1267 by the C600. The physical maps with restriction en- method of Cryer et al.10) Isolated DNAwas digested partially with Sau3Al and electrophoresed on a 0.7% zymes showed that two kinds ofplasmids were agarose gel. DNAfragments larger than 4kb were extrac- recovered. The plasmids, pMIZ4 and pMIZ5, ted from the gel and ligated into the BamHlsite of have inserts 5.8 and 4.4kb long, respectively. pREI032. The recombinant plasmids were introduced into As can be seen in Fig. 1, they share a common E. coli strain C600. The gene bank obtained contained region. Although both of these plasmid DNAs 15000 clones having inserts averaging 5 kb in length and are able to complementthe ura3 mutation in S. might cover over 95%of the K. lactis genome. cerevisiae, only pMIZ4 complemented the pyrF mutation in E. coli DB6656. Subcloning 1.8 (the late log phase) in YPDmedium (1% yeast extract, of the insert ofpMIZ4 using E. coli DB6656 as 2% Bacto peptone, and 2%glucose), and precipitated by the recipient revealed that a 1.4-kb fragment centrifugation at 5000 rpm for 5min. Isolation of total {Hindlll site to the left end as in Fig. 1) cellular RNAand SI mapping was performed as described contained the orotidine 5'-phosphate decar- previously9); total RNA isolated from yeast was hy- boxylase gene. bridized with the probe DNAfragments in the presence of 80% (v/v) formamide for 4hr at 37°C. Non-hybridized DNAwas digested with SI nuclease at 37°C. The pro- Structure of URA3gene tected hybrid fragments were electrophoresed on an 8% The nucleotides of about 1.4kb of the sequencing gel. URA3gene in K. lactis were sequenced. Only one open reading frame was found in this RESULTS AND DISCUSSION region; it consisted of801 bp (Fig. 2); the same size as that of S. cerevisiae URA3. The homol- Isolation of the URA3gene ofK. lactis ogy of the coding region of URA3 genes Total DNA isolated from K. lactis was between K. lactis and S. cerevisiae was 71% digested partially with Sau3Al. DNA frag- at the nucleotide and 82%at the amino acid ments larger than 4kb were inserted in the level (Fig. 3). Figure 3 also shows the de- BamHlsite of the yeast-£. coli shuttle vector, duced amino acid sequences of orotidine 5'- pREI032. The plasmid DNAs of the gene bank phosphate decarboxylase from Aspergilus

Fig. 1. Restriction Maps of Two K. lactis DNAInserts Which Carry ura3 Complementing Activity. A: pMIZ4. B: pMIZ5. C: The region of the URA3gene sequenced. Black box showed the coding region. A, Avail, B, Bglll; D, Dral; E, EcoRI; Ha, Haelll; H, Hindlll; K, Kpnl; Nc, Ncol; Nd, Ndel; Ns, Nsil; S, Sail; X,Xhol. URA3gene in K. lactis 3069

-498 gatcgttt tatttaggtt ctatcgagga gaaaaagcga caagaagaga tagaccatgg ataaactgat tatgttctaa acactcctca gaagctcatc -400 gaactgtcat cctgcgtgaa gattaaaatc caacttagaa atttcgagct tacggagaca atcatatggg agaagcaatt ggaagataga aaaaaggtac -300 tcggtacaPta aatataItgtg attctgggta gaagatcggt ctgcattgga tggtggtaac gcattttttt acacacatta cttgcctcga gcatcaaatg -200 gtggttattc gtggatcjtat atlcacgtgat ttgcttaaga attgtcgttc atggtgacac ttttagcttt gacatgatta agctcatctc aattgatgtt U PMIZ4 -100 ATCTAAAGTC ATTTCAACTA TCTAAGATGT GGTTGTGATT GGGCCATTTT GTGAAAGCCA GTACGCCAGC GTCAATACAC TCCCGTCAAT TAGTTGCACC

1 ATG TCC ACA AAA TCA TAT ACC AGT AGA GCT GAG ACT CAT GCA AGT CCG GTT GCA TCG AAA CTT TTA CGT TTA ATG GAT GAA Met Ser Thr Lys Ser Tyr Thr Ser Arg Ala Glu Thr His Ala Ser Pro Val Ala Ser Lys Leu Leu Arg Leu Met Asp Glu

82 AAG AAG ACC AAT TTG TGT GCT TCT CTT GAC GTT CGT TCG ACT GAT GAG CTA TTG AAA CTT GTT GAA ACG TTG GGT CCA TAC Lys Lys Thr Asn Leu Cys Ala Ser Leu Asp Val Arg Ser Thr Asp Glu Leu Leu Lys Leu Val Glu Thr Leu Gly Pro Tyr

163 ATT TGC CTT TTG AAA ACA CAC GTT GAT ATC TTG GAT GAT TTC AGT TAT GAG GGT ACT GTC GTT CCA TTG AAA GCA TTG GCA lie Cys Leu Leu Lys Thr His Val Asp lie Leu Asp Asp Phe Ser Tyr GluGlyThr Val Val ProLeu Lys Ala Leu Ala

244 GAG AAA TAC AAG TTC TTG ATA TTT GAG GAC AGA AAA TTC GCC GAT ATC GGT AAC ACA GTC AAA TTA CAA TAT ACA TCG GGC Glu Lys Tyr Lys Phe Leu lie PheGluAsp Arg Lys Phe Ala Asp lieGlyAsn Thr Val Lys LeuGin Tyr Thr SerGly

325 GTT TAC CGT ATC GCA GAA TGG TCT GAT ATC ACC AAC GCC CAC GGG GTT ACT GGT GCT GGT ATT GTT GCT GGC TTG AAA CAA Val Tyr Arg He AlaGluTrp Ser Asp He ThrAsn AlaHisGlyVal ThrGlyAlaGly He Val AlaGlyLeuLysGin

406 GGT GCG CAA GAG GTC ACC AAA GAA CCA AGG GGA TTA TTG ATG CTT GCT GAA TTG TCT TCC AAG GGT TCT CTA GCA CAC GGT GlyAla Gin GluVal Thr LysGluPro Arg GlyLeu Leu Met Leu Ala GluLeu Ser Ser LysGlySer Leu Ala HisGly

487 GAA TAT ACT AAG GGT ACC GTT GAT ATT GCA AAG AGT GAT AAA GAT TTC GTT ATT GGG TTC ATT GCT CAG AAC GAT ATG GGA Glu Tyr Thr Lys Gly Thr Val Asp He Ala Lys Ser Asp Lys Asp Phe Val He Gly Phe He Ala Gin Asn Asp Met Gly

GGA AGA GAA GAA GGG TTT GAT TGG CTA ATC ATG ACC CCA GGT GTA GGT TTA GAC GAC AAA GGC GAT GCA TTG GGT CAG CAG GlyArg GluGluGly Phe Asp TrpLeu He Met Thr ProGlyVal GlyLeu Asp AspLysGlyAsp Ala LeuGlyGinGin

TAC AGA ACC GTC GAC GAA GTT GTA AGT GGT GGA TCA GAT ATC ATC ATT GTT GGC AGA GGA CTT TTC GCC AAG GGT AGA GAT Tyr Arg Thr Val AspGluVal Val Ser GlyGly Ser Asp He He He Val GlyArg GlyLeu Phe Ala LysGlyArgAsp

CCT AAG GTT GAA GGT GAA AGA TAC AGA AAT GCT GGA TGG GAA GCG TAC CAA AAG AGA ATC AGC GCT CCC CAT TAA TTATAC Pro Lys Val GluGlyGluArg Tyr Arg Asn AlaGlyTrp GluAla Tyr Gin Lys Arg He Ser Ala Pro His ***

+ 811 AGGAAACTTA ATAGAACAAA TCACATATTT AATCTAATAG CCACCTGCAT TGGCACGGTG CAACACTCAC TTCAACTTCA TCTTACAAAA GATCACGTGA + 911 TCTGTTGTAT TGAACTGAAA ATTTTTTGTT TGCTTCTCTC TCTCTCTTTC ATTATGTGAG AGTTTAAAAA CCAGAAACTA CATCATCGAA AAAGAGTTTA

+1011 AACCATTACA ACCATTGCGA TAAGCCCTCT CAAACTTCCT CCAGACTGTT TGTCATCCAA TTGGTAAGAT TAATTATCAT ATACCCCGGA TAAGAAAAGA

+1111 AGCAAGCACA CAATGGATCA ACTTAATGGT AAGGAACAAC AAGAGTTCCA AAAAATCGTG GAACAAAAGC AAATGAAGGA TTTCATGCGT CTATACTCCA

+1211 ACTTGGTCGA AAGATGTTTC AGCGACTGTG TCAACGACTT CACATCCGCT AAGCTT Fig. 2. DNAand Deduced Amino Acid Sequence of the K. lactis URA3Gene. TwoTATAbox candidates are boxed. The starting point of the insert of pMIZ4is also indicated.

nidulans,ll) Aspergillus niger,12) Neurospora pMIZ4does not contain a TATAsequence. crassa,13) and Ehrlich ascites cells.14) Since the The TATAT sequence at -138, upstream enzyme of N. crassa is about 130 residues from the unique BamHl site ofpREI032 in a larger than the others, several deletions are tetracycline resistant gene, is the most proba- introduced for maximum matching. One ble candidate. hundred twenty-six out of 267 residues are The expression of URA1 and URA3 genes of homologous in the six enzymes. The codon S. cerevisiae are co-ordinately positively regu- usage of the two genes was generally similar lated by the PPR1 gene product and the except for minor differences in Phe, Lys, Val, inducer molecule dihydroorotic acid at the and Pro (data not shown). transcriptional level.3'4) Losson et al.5) iden- The upstream sequence from the initiation tified a short regulatory region in the 5' codon in pMIZ4 is very short (69bp) and the flanking region of URA1 and URA3 that Sau3Al site at the boundary is missing (Fig. 2). responds to PPR1 induction, and there are Since pMIZ5 can not complementthe pyrF TATA sequences at -126 and -98 where mutation in E. coli, the K. lactis URA3gene in long stretches of purine: pyrimidine alterna- pMIZ4might be expressed in E. coli using the tion are found. The major transcripts start at promotor on the vector, so that the plasmid -43 and -33.upstream from the initiation could complement the pyrF. Though pMIZ4 ATG codon respectively. Furthermore, two also complements the URA3 gene of S. conserved sequences, ATACGCAT and cerevisiae, the short upstream sequence of CCPuGTATTCTT, are found just down- M. Mizukami and F. Hishinuma

++ * +++å + * +*+ * ** + +** * + +** * ++**+ *++ Kl MSTKSY TSRAET-HASPVASKLLRLMDEKKTNLCASLDVRSTDELLKLVETLGPYICLLKTHVDILDDF-- 68 Sc MSKATY KERAAT-HPSPVAAKLFNIMHEKQTNLCASLDVRTTKELLELVEALGPKICLLKTHVDILTDF- 68 Ad MSSKSH LPYAIRATN-HPNPLTSKLFSIAEEKKTNVTVSADVTTSAELLDLADRLGPYIAVLKTHIDILTPS- 72 Ag MSSKSQ LTYTARASK-HPNALAKRLFEIAEAKKTNVTVSADVTTTKELLDLADRLGPYIAVIKTHIDILSDF-- 72 Nc MSTSQETQPHWSLKQSFAERVES-STHPLTSYLFRLMEVKQSNLCLSADVEHARDLLALADKVGPSIVVLKTHYDLITGWDY 82 Eh EKKACKELSFGARAELPGTHPLASKLLRLMQKKETNLCLSADVSEARELLQLADALGPS I CMLKTHVDILNDF- 73

Kl -SYEGTVVPLKALAEKYKFLIFEDRKFADIGNTVKLQYTSGVYRIAEWSDITNAHGVTGAGIVAGLKQGAQ 138 Sc -SMEGTVKPLKALSAKYNFLLFEDRKFADIGNTVKLQYSAGVYRIAEWADITNAHGVVGPGIVSGLKQAAE 138 Ad -T-PSTLSSLQSLATKHNFLIFEDRKFIDIGNTVQKQYHGGALRISEWAHI INCAILPGEGIVEALAQTTKSPD 144 Ag -S-DETIEGLKALAQKHNFLIFEDRKFIDIGNTVQKQYHRGTLRISEWAHI INCSILPGEGIVEALAQTASAPD 144 Nc HPHTGTGAKLAALARKHGFLIFEDRKFVDIGSTVQKQYTAGTARIVEWAHITNADIHAGEAMVSAMAQAAQKWRERIPYEVK 163 Eh -T-LDVMEELTALAKRHEFLIFEDRKFADIGNTVKKQYESGTFKIASWADIVNAHVVPGSGVVKGLQEVGL 142

Kl 140 Sc 140 Ad 146 Ag 146 Nc TSVSVGTPVADQFADEEAEDQVEELRKVVTRETSTTTKDTDGRKSSIVSITTVTQTYEPADSPRLVKTISEDDEMVFPGIEE 245 Eh 142

*+ *++*+++ * * * * * * **+*+++ _|_ +++ *-h* Kl TKEPRGLLMLAELSSKGSLAHGEYTKGTVDIAKSDKDFVIGFIAQNDMGG REEGFEWLIMTPG 201 Scc TKEPRGLLMLAELSCKGSLSTGEYTKGTVDIAKSDKDFVIGFIAQRDMGG RDEGYEWLIMTPG 201 Add DANQRGLLILAEMTSKGSEATGESQARSVEYARKYKGFVMGFVSTRALSEVLPEQ KEESEDFVVFTTG 213 Agg YGPERGLLILAEMTSKGSLATGQYTTSSVDYARKYKNFVMGFVSTRSLGEVQSEV SSPSDEEDFVVFTTG 215 Ncc APLDRGLLILAQMSSKGCLMDGKYTWECVKAARKNKGFVMGYVAQQNLNGITKEALAPSYEDGESTTEEEAQADNFIHMTPG325 Ehh -PLHRACLLIAEMSSAGSLATGNYTKAAVGMAEEHCEFVIGFISGSRVSM KPEFLHLTPG 201

+ ** ***** + + *+* * ***+ + -)- + * +* ** * VGL DDKGDALGQQYRTVDEVVSG-GSDI I IVGRGLFAKGRDPKVEGERYRNAGWEAYQKRI SAPH 267 VGL-- DDKGDALGQQYRTVDDVVST-GSDI I IVGRGLFAKGRDAKVEGERYRKAGWEAYLRRCGQQN 267 VNL SDKGDKLGQQYQTPGSAVGR-GADF I I AGRG IY-KADDPVEAVQRYREEGWKAYEKRVGL 274 VNI SSKGDKLGQQYQTPASAIGR-GADFI IAGRGIYA-APDPVQAAQQYQKEGWEAYLARVGGN 277 CKLPPPGEEAPQGDGLGQQYNTPDNLVNIKGTDIAIVGRGI IT-AADPPAEAERYRRKAWKAYQDRRERLA 397 VQL ETGGDHLGQQYNSPQEVIGKRGSDVI IVGRGILA-AANRLEAAEMYRKAAWEAYLSRLAVQ 264 Fig. 3. Comparison of the Deduced Amino Acid Sequences ofOrotidine 5'-Phosphate Decarboxylases from K. lactis with That of Five Other Organisms. Sequences of S. cerevisiae (Sc), Aspergillus nidulans (Ad), Aspergillus niger (Ag), N. crassa (Nc) and Ehrlich ascites (Eh) were taken from the data of Rose et al.,6) Oakley et al.,n) Wilson et al.,12) Newbury et al.,13) and Ohmsted et al.,14) respectively. * and +indicate the identical amino acid residues and conservative changes, respectively. stream from the TATAbox separated by 6 and which has been found in S. cerevisiae.5) Par- 28 bp, respectively. They are thought to be the tially homologous sequences are found in the recognition/interaction sites with the PPR1 K. lactis URA3upstream region; first one molecules. ACGCATbeginning at position -242 and a To compare the regulatory region, the fur- second CCAGTA at position -43 (Fig. 2). ther upstream sequence of URA3gene from K. But the distance of the two sequencesare very lactis was identified using pMIZ5 (Fig. 2). In long compared to that of S. cerevisiae, and the contrast to the high homology of the nucleo- alternating Pu : Py sequence is not seen in that tide sequence in the coding region, the 5' region. These results suggest that the mecha- upstream sequences were quite different in nism controlling the expression of the pyrimi- S. cerevisiae and K. lactis. Transcription of dine biosynthetic genes in K. lactis might be URA3 ofK. lactis begins around -30 to -40 different from that of S. cerevisiae. It will be from the initiation ATGcodon (Fig. 4), but interesting to see whether the PPR1 protein of apparent TATAboxes are not found between S. cerevisiae can activate the transcription of -150 and -40. Two candidates for a TATA URA3 in K. lactis. box are found at -292 and -183 upstream During preparation of this manuscript, from ATG(Fig. 2), but neither had a neigh- Shuster et al.15) reported the nucleotide se- boring long stretch of Pu:Py alternation quence of the URA3gene of K. lactis. Their URA3 gene in K. lactis 3071

Acknowledgments. We wish to thank Ms. Rei Hirochika for constructing of pREI032. This work was supported by the Research and Development Project of Basic Technology for Future Industries from the MITI of Japan.

REFERENCES

1) R. C. Dickson, R. M. Sheets and L. A. Lacy, J. BacterioL, 142, 777 (1980). 2) M. I. Riley, J. E. Hopper, S. A. Johnston and R. C. Dickson, Mol. Cell BioL, 1, 780 (1987). 3) B. Kammerer, A. Guyonvarch and J. C. Hubert, J. Mol. BioL, 180, 239 (1984). 4) P. Liljelund, R. Losson, B. Kammerer and F. Lacroute, J. Mol. BioL, 180, 251 (1984). 5) R. Losson, R. P. P. Fuch and F. Lacroute, J. Mol. BioL, 185, 65 (1985). 6) M. Rose, P. Grisafi and D. Botstein, Gene, 29, 113 (1984). 7) F. Sanger, S. Nicklen and A. R. Coulson, Proc. Natl. Fig. 4. Analysis of the URA3 mRNAby SI Mapping. Acad. Sci. U.S.A., 74, 5463 (1977). Isolation of total cellular RNAand SI mapping were done 8) H. Ito, Y. Fukuda, K. Kurata and A. Kimura, J. as described previously.9* The 600-bp Ncol-Avall frag- BacterioL, 153, 163 (1983). ment was used as a probe after labelling at Avail site 9) K. Inokuchi, A. Nakayama and F. Hishinuma, Mol. (+ 157) with y-32P-ATP and polynucleotide kinase. Lane Cell. BioL, 1, 3185 (1987). 1, with RNA; lane 2, without RNA; lane 3, molecular 10) D. R. Cryer, R. Eccleshall and J. Marmur, Methods Cell BioL, 12, 39 (1975). marker of pBR322 DNAdigested with Hapll. ll) B. R. Oakley, J. E. Rinehart, B. L. Mitchell, C. E. Oakley, C. Carmona, G. L. Gray and G. S. May, sequence (-177 to +981) is almost identical Gene, 61, 385 (1987). 12) L. J. Wilson, C. L. Carmona and M. Ward, Nucl. to ours except for two places; the C of +878 Acids Res., 16, 2339 (1988). and the G of +972 are deleted in their se- 13) S. F. Newbury, J. A. Glazebrook and A. Radford, quence. This might be due to the different Gene, 43, 51 (1986). strain used for cloning. 14) C.-A. Ohmsted, S. D. Langdom, C.-B. Chae and M. E. Jones, /. BioL Chem., 261, 4276 (1986). 15) J. R. Shuster, D. Moyer and B. Irvine, Nucl. Acids Res., 15, 8573 (1987).