Protein Science (1994), 3:1081-1088. Cambridge University Press. Printed in the USA. Copyright © 1994 The Protein Society

Unexpected sequence similarity between nucleosidases and phosphoribosyltransferases of different specificity

ARCADY R. MUSHEGIAN¹ AND EUGENE V. KOONIN²
¹ Department of Plant Pathology, University of Kentucky, Lexington, Kentucky 40546-0091
² National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894
(RECEIVED February 17, 1994; ACCEPTED April 29, 1994)

Abstract

Amino acid sequences of enzymes that catalyze hydrolysis or phosphorolysis of the N-glycosidic bond in nucleosides and nucleotides (nucleosidases and phosphoribosyltransferases) were explored using computer methods for database similarity search and multiple alignment. Two new families, each including bacterial and eukaryotic enzymes, were identified. Family I consists of Escherichia coli AMP glycosidase (Amn), uridine phosphorylase (Udp), purine phosphorylase (DeoD), uncharacterized proteins from E. coli and Bacteroides uniformis, and, unexpectedly, a group of plant stress-inducible proteins. It is hypothesized that these plant proteins have evolved from nucleosidases and may possess nucleosidase activity. The proteins in this new family contain 3 conserved motifs, one of which was found also in eukaryotic purine nucleosidases, where it corresponds to the nucleoside-binding site. Family II is comprised of bacterial and eukaryotic thymidine phosphorylases and anthranilate phosphoribosyltransferases, the relationship between which has not been suspected previously. Based on the known tertiary structure of E. coli thymidine phosphorylase, structural interpretation was given to the sequence conservation in this family. The highest conservation is observed in the N-terminal α-helical domain, whose exact function is not known. Parts of the conserved active sites of thymidine phosphorylases and anthranilate phosphoribosyltransferases were delineated. A motif in the putative phosphate-binding site is conserved in family II and in other phosphoribosyltransferases. Our analysis suggests that certain enzymes of very similar specificity, e.g., uridine and thymidine phosphorylases, could have evolved independently. In contrast, enzymes catalyzing such different reactions as AMP hydrolysis and uridine phosphorolysis or thymidine phosphorolysis and phosphoribosyl anthranilate synthesis are likely to have evolved from common ancestors.

Keywords: nucleosidases; phosphoribosyltransferases; sequence similarity

Reprint requests to: Eugene V. Koonin, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894; e-mail: [email protected].

Enzymes that catalyze hydrolysis or phosphorolysis of the N-glycosidic bond in nucleotides, nucleosides, and related compounds are central to salvage pathways of nucleotide metabolism and are also important in de novo synthesis of nucleotides and certain amino acids (Table 1; reviewed by Lin, 1987; Neuhard & Nygaard, 1987). Nucleosidases are involved in the regulation of the intracellular concentration of nucleotides and allow utilization of ribose and deoxyribose as a source of carbon and energy. Phosphoribosyltransferases provide, via 5-phosphoribosyl-1-pyrophosphate (PRPP), a crucial link between nucleotide and amino acid metabolism. The substrates of these enzymes are very diverse, but the reacting groups always involve phosphate and ribose or deoxyribose (Table 1). Interest in the nucleosidases has been enhanced by the recent observation that human platelet-derived endothelial cell growth factor is identical to thymidine phosphorylase (Barton et al., 1992; Ishizawa & Yamada, 1992). Phosphoribosyltransferases also have attracted considerable attention because deficiency in these enzymes leads to various metabolic disorders in humans, e.g., Lesch-Nyhan disease (Stout & Caskey, 1985).

A single organism, i.e., Escherichia coli, which has been studied in the most detail, encodes numerous nucleosidases and phosphoribosyltransferases, including enzymes that catalyze essentially identical reactions but differ in their specificity toward the nucleotide base (e.g., thymidine phosphorylase, uridine phosphorylase, and purine phosphorylase; see Table 1). A number of nucleotide sequences of genes encoding these enzymes from all types of organisms have been reported (Table 1). The 3-dimensional (3D) structure has been determined for human purine phosphorylase (PNP; Ealick et al., 1990) and E. coli thymidine phosphorylase (DeoA; Walter et al., 1990), leading to detailed characterization of the respective active centers. Except for the obvious similarity between human TYPH

Table 1. Nucleosidases, phosphoribosyltransferases, and related enzymes^a

Protein/gene^b | Organism(s) | Enzymatic activity | Reaction | Mr/quaternary structure^c | Metabolic pathway | References
Udp/udp | E. coli | Uridine phosphorylase | Uridine + Pi = ribose-1-P + uracil | 8 × 22 | Pyrimidine salvage | Neuhard & Nygaard, 1987
DeoD/deoD | E. coli | Purine nucleoside phosphorylase | Purine (deoxy)ribonucleoside + Pi = (deoxy)ribose-1-P + purine | 6 × 23.7 | Purine salvage | Neuhard & Nygaard, 1987
Amn/amn | E. coli | AMP glycosylase | AMP + H2O = adenine + ribose-5-P | 6 × 52 | Purine salvage | Leung & Schramm, 1980; Leung et al., 1989; Neuhard & Nygaard, 1987
PNP | Mammals, B. subtilis, M. leprae | Purine nucleoside phosphorylase | Purine (deoxy)ribonucleoside + Pi = (deoxy)ribose-1-P + purine | 3(2) × 32 | Purine salvage | Neuhard & Nygaard, 1987; Lin, 1987; Ealick et al., 1990
DeoA/deoA | E. coli | Thymidine phosphorylase | Thymidine + Pi = ribose-1-P + thymine | 2 × 45 | Thymidine salvage | Neuhard & Nygaard, 1987; Lin, 1987; Walter et al., 1990
TYPH | Human | Thymidine phosphorylase (=PD-ECGF) | Thymidine + Pi = ribose-1-P + thymine | 2 × 45 | Thymidine salvage | Yoshimura et al., 1990; Barton et al., 1992
TrpD/trpD (Eubacteria) | Eubacteria, archaea, yeast, plants | Anthranilate phosphoribosyltransferase | Anthranilate + PRPP = N-phosphoribosylanthranilate + PPi | 2(?) × 36 | Second step in tryptophan biosynthesis | Crawford, 1989; Kim et al., 1993
TrpG/trpG | E. coli and several other bacterial species | Anthranilate phosphoribosyltransferase (with N-terminal glutamine aminotransferase domain) | Anthranilate + PRPP = N-phosphoribosylanthranilate + PPi; chorismate + L-glutamine = anthranilate + pyruvate + L-glutamate | 2 × 58.3 | First and second step in tryptophan biosynthesis | Pittard, 1987; Crawford, 1989
Apt/apt (E. coli) | Eubacteria, eukaryotes | Adenine phosphoribosyltransferase | Adenine + PRPP = AMP + PPi | 2 × 20 | Purine salvage | Neuhard & Nygaard, 1987
Gpt/gpt (E. coli) | Eubacteria, eukaryotes | Guanine phosphoribosyltransferase | Guanine + PRPP = GMP + PPi; xanthine + PRPP = XMP + PPi; hypoxanthine + PRPP = HXMP + PPi | 3 × 16.9 | Purine salvage | Neuhard & Nygaard, 1987
Hpt/hpt (L. lactis) | Eubacteria, eukaryotes | Hypoxanthine phosphoribosyltransferase | Hypoxanthine + PRPP = IMP + PPi | ? × 20 | Purine salvage | Neuhard & Nygaard, 1987
Upp/upp (E. coli) | Eubacteria, eukaryotes | Uracil phosphoribosyltransferase | Uracil + PRPP = UMP + PPi | 3 × 23.5 | Pyrimidine salvage | Neuhard & Nygaard, 1987
PyrE/pyrE (E. coli) | Eubacteria, eukaryotes | Orotate phosphoribosyltransferase | Orotate + PRPP = OMP + PPi | 2 × 23.4 | Fifth step in pyrimidine biosynthesis | Neuhard & Nygaard, 1987
PurF/purF (E. coli) | Eubacteria, eukaryotes | Glutamine phosphoribosyltransferase | L-glutamine + PRPP = L-glutamate + PPi + 5-phosphoribosylamine | 4(3) × 53 | First step in de novo purine biosynthesis | Neuhard & Nygaard, 1987
HisG/hisG (E. coli) | Eubacteria, eukaryotes | ATP phosphoribosyltransferase | ATP + PRPP = 1-(5-phospho-D-ribosyl)-ATP + PPi | 6 × 33 | First step in histidine biosynthesis | Winkler, 1987
PncB/pncB (E. coli) | E. coli | Nicotinate phosphoribosyltransferase | Quinolinate + PRPP = NaMN + PPi + CO2 | ? × 44 | NAD biosynthesis | White, 1982; Tritz, 1987
NadC/nadC | S. typhimurium | Quinolinate phosphoribosyltransferase | Nicotinate + PRPP + ATP = NaMN + PPi + ADP + Pi | 2 × 31 | NAD biosynthesis | White, 1982; Tritz, 1987
PrsA/prsA (E. coli) | Eubacteria, eukaryotes | Ribose-phosphate pyrophosphokinase (PRPP synthetase) | ATP + ribose-5-phosphate = AMP + PRPP | 5(?) × 31 | PRPP synthesis for de novo and salvage pathways of nucleotide metabolism | Neuhard & Nygaard, 1987

^a Only enzymes for which sequence information is available were included.
^b Where sequences are available for several organisms, the protein/gene name is for the organism(s) indicated in parentheses.
^c The first number is the number of identical subunits and the second number is the Mr.

(endothelial growth factor) and E. coli thymidine phosphorylase encoded by the deoA gene (Barton et al., 1992), and the somewhat lower similarity between human PNP and a partial nucleosidase sequence from Bacillus subtilis (Wu et al., 1992), no significant relationships have been derived from analysis of amino acid sequences of nucleosidases. On the other hand, comparative analysis of the amino acid sequences of phosphoribosyltransferases has revealed a conserved motif that has been implicated in PRPP binding (Busetta, 1988; de Boer & Glickman, 1991).

In this work, using computer methods for sequence analysis, we delineate 2 new families of nucleosidases and phosphoribosyltransferases, each including unexpected relationships between enzymes that catalyze very different reactions.

Results and discussion

Comparison of the amino acid sequences of nucleosidases with the nonredundant sequence database (National Center for Biotechnology Information, NIH) using the BLASTP (Altschul et al., 1990) program revealed highly significant similarity between E. coli purine phosphorylase (DeoD) and uridine phosphorylase (Udp), with the probability of matching by chance (P) about [...]. More unexpectedly, significant similarity (P = 3.6 × 10^-[...]) was observed between human thymidine phosphorylase (TYPH) and E. coli anthranilate phosphoribosyltransferase (TrpG). Further analysis by iterative database search using BLASTP, TBLASTN (screening of a nucleotide sequence database translated in 6 reading frames for similarity to an amino acid sequence), and multiple alignment using the MACAW program (Schuler et al., 1991) showed that these pairs of relatively strongly similar proteins belonged to 2 distinct families of enzymes that have not been described previously.

Family I: Bacterial nucleosidases and plant vegetative storage proteins

Family I included DeoD, Udp, E. coli AMP glycosidase (Amn), uncharacterized proteins from E. coli, Bacteroides uniformis, and Treponema pallidum, and, unexpectedly, bark storage proteins and wound-induced proteins (BSP-win4 family) from poplars (Fig. 1). Other than the aforementioned relationship between DeoD and Udp, these proteins showed only limited similarity to each other, with P values between 0.1 and 0.01. Nevertheless, analysis of BLAST outputs for mutually consistent pairwise alignments and multiple alignment using MACAW (Schuler et al., 1991) showed that the similarity concentrated in 3 distinct, conserved motifs (Fig. 1). The probability of obtaining each of these blocks by chance alone, as computed using MACAW, was below 10^-[...] (only 1 of the closely related BSP-win4 sequences was used for these calculations; see Methods and Schuler et al. [1991] for details of significance evaluation by MACAW), suggesting that the observed relationship is functionally and evolutionarily relevant. Therefore, it is likely that the uncharacterized proteins belonging to this family, including the plant stress-induced proteins, possess nucleosidase activity.

[Figure 1: multiple alignment of 3 conserved blocks (I; II, nucleoside binding; III) with predicted secondary structure and consensus lines. Blocks I and II: AMN-ECOLI, NP BACUN, DEOD-ECOLI, UDP-ECOLI, PFS-ECOLI, BSP1 POPLAR, BSP2 POPLAR; block II additionally PNPA-HUMAN, PNPA-MOUSE, PNPA YEAST, with strands β5A, β1B, and β6A of human PNP marked; block III additionally NP TREPA. Sequence rows not reproduced.]

Fig. 1. Conserved sequence motifs in the nucleosidase family I. The alignment of family I proteins was constructed using the MACAW program, and the boundaries of the 3 conserved blocks were determined so as to achieve maximum statistical significance. The distances between the blocks as well as the distances from protein ends are shown by numbers. For the putative ribose-binding motif (II), the alignment additionally includes mammalian purine nucleoside phosphorylases (PNPs) and the putative PNP from yeast. The protein sequence from Mycobacterium leprae that is related to eukaryotic PNPs (GenBank U00022; E.V. Koonin, unpubl. obs.) is not shown; the related sequence from B. subtilis (PIR A42708) is incomplete and includes only the C-terminal part of the protein downstream from motif II. The consensus includes amino acid residues or pairs of similar residues that are conserved in all of the aligned sequences. U designates a bulky aliphatic residue (I, L, V, or M); & designates a bulky hydrophobic residue (I, L, V, M, F, Y, or W); h designates any hydrophobic residue (I, L, V, M, F, Y, W, C, or A); s designates a small residue (G, A, or S); and dot designates any residue. The secondary structure for human PNP is from the published 3D structure; A and B designate 2 distinct β-sheets found in this protein (Ealick et al., 1990). The secondary structure for family I proteins is the consensus of predictions for individual proteins (a designates an α-helix, b designates a β-sheet, and l designates a loop). The asterisk designates the alanine residue in human PNP that interacts with the nucleoside ribose. Each sequence is accompanied by its accession number in SWISS-PROT, PIR (p), or GenBank (g). The partial sequences of putative nucleosidases from yeast, Bacteroides uniformis (BACUN), and Treponema pallidum (TREPA) have not been described previously and were identified in this study by database searches using the TBLASTN program.
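The residue classes defined in the legend (U, &, h, s) map naturally onto regular-expression character classes, which is one way a motif written in this notation can be screened against sequences. The sketch below is a hedged illustration in Python: `motif_to_regex` is a hypothetical helper, the PROSITE-like pattern syntax it accepts is an assumption, and the repeat counts in `PATTERN` (x3, x5, x1-2) are assumed for illustration only and are not taken from the paper. It is not the program the authors used.

```python
import re

# Residue classes as defined in the Figure 1 legend.
CLASSES = {
    "U": "ILVM",       # bulky aliphatic
    "&": "ILVMFYW",    # bulky hydrophobic
    "h": "ILVMFYWCA",  # any hydrophobic
    "s": "GAS",        # small
}


def motif_to_regex(pattern: str) -> str:
    """Translate a motif in the legend's notation into a Python regex.

    Assumed syntax: [..] lists alternatives (class symbols or residues),
    x is any residue, xN or xN-M repeats it, anything else is either a
    class symbol or a literal one-letter residue code.
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "[":
            j = pattern.index("]", i)
            members = "".join(CLASSES.get(c, c) for c in pattern[i + 1:j])
            out.append("[" + "".join(sorted(set(members))) + "]")
            i = j + 1
        elif ch == "x":
            m = re.match(r"x(\d+)(?:-(\d+))?", pattern[i:])
            if m:
                lo, hi = m.group(1), m.group(2)
                out.append(".{%s}" % (lo if hi is None else lo + "," + hi))
                i += m.end()
            else:
                out.append(".")
                i += 1
        else:
            out.append("[" + CLASSES[ch] + "]" if ch in CLASSES else ch)
            i += 1
    return "".join(out)


# Hypothetical pattern; the repeat counts are assumptions for illustration.
PATTERN = "[&C][&C]x2[GN]x3[GAS]Ux5[UA]x1-2D&x[UC]"

if __name__ == "__main__":
    regex = motif_to_regex(PATTERN)
    hit = re.search(regex, "VCYIGACGGLRKVRPLADYVLAAA")  # example peptide
    print(regex)
    print(hit.group(0) if hit else None)
```

Sorting the bracket members only makes the generated classes deterministic; it does not change what they match. A screen of a real database would also need to count false positives, since a short degenerate pattern can match unrelated proteins.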

The proteins of the nucleosidase family I also contained a region of low similarity to eukaryotic purine nucleoside phosphorylases (PNPs). This similarity has been detected by BLAST searches but failed to attain statistical significance. It was remarkable, however, that this region coincided with the conserved motif II (Fig. 1). The regular expression [&C][&C]x2[GN]x(...)[GAS]Ux(...)[UA]x(...)D&x[UC] (alternative residues are shown in brackets; U designates a bulky aliphatic residue, namely I, L, V, or M; & designates a bulky hydrophobic residue, namely I, L, V, M, F, Y, or W; and x designates any residue) was specific for family I and eukaryotic PNPs, with only 1 false positive (insect general odorant-binding protein 1 precursor; SWISS-PROT P31418) selected during the database screening. Inspection of the 3D structure of human PNP (Ealick et al., 1990) showed that the conserved motif comprised part of the active center. The central feature of the PNP structure is a distorted β-barrel that consists of 2 β-sheets, one of them 8-stranded (sheet A) and the other one 5-stranded (sheet B). Motif II includes 3 β-strands, two of which belong to sheet A and one to sheet B (Fig. 1). The turn between β5A and β1B (or in other words, at the interface between the 2 β-sheets) and the β1B strand itself are directly involved in nucleoside binding. In particular, the backbone amido group of alanine 116 (Fig. 1) forms a hydrogen bond with the ribose 3' hydroxyl (Ealick et al., 1990). Although this residue is not conserved in all of the proteins of family I, this and the adjacent positions contain mostly small residues, namely glycines and serines (Fig. 1), that are likely to perform the same function. Therefore, we believe that the structure of the ribose-binding part of the nucleoside-binding site is partially conserved between family I and eukaryotic PNPs. Detailed comparisons failed to reveal any other sequence conservation between these groups of proteins.

The functions of the 2 other conserved motifs in the (putative) nucleosidases of family I remain unknown. It is plausible that motif I, which appears to contain a hydrophobic β-strand separated by a flexible loop from a downstream α-helix (Fig. 1), may be the phosphate-binding site (Saraste et al., 1990; Schulz, 1992).

Delineation of family I resulted in prediction of nucleosidase activity for several uncharacterized proteins. In particular, pfs protein appears to be a new, previously unidentified nucleosidase in E. coli; it is distinct from inosine phosphorylase (XapA), whose sequence is not yet available, because the respective genes are located in different regions of the chromosome (Brun et al., 1990). Interestingly, the pfs gene is adjacent to the dgt gene encoding dGTP phosphohydrolase, another enzyme of nucleotide metabolism (Wurgler & Richardson, 1990). Conceivably, dGTP phosphohydrolase and the new nucleosidase may be functionally linked. BSP-win4 family proteins in poplars are encoded by a family of physically linked genes and are induced by various stress signals, namely, elevated nitrogen level, short photoperiod, or mechanical wounding; BSP proteins serve as a transient nitrogen deposit in bark (Parsons et al., 1989; Clausen & Apel, 1991; Coleman et al., 1992; Davis et al., 1993). Nucleosidase activity appears to be compatible with a stress-related function because it would increase the intracellular pools of free bases and pentoses that could be reused for biosynthetic processes. Alternatively, it is possible that the storage function of BSPs is unrelated to their proposed nucleosidase activity; an obvious parallel is the recruitment of various enzymes as animal lens crystallins (Piatigorsky, 1993).

Family II: Thymidine phosphorylases and anthranilate phosphoribosyltransferases

Family II consisted of bacterial and mammalian thymidine phosphorylases and numerous anthranilate phosphoribosyltransferases from bacteria, archaea, and yeast (Fig. 2). Anthranilate phosphoribosyltransferase is a ubiquitous enzyme that catalyzes

the second step in tryptophan biosynthesis (Table 1; reviewed by Crawford, 1989). Apart from the already mentioned highly significant similarity between human TYPH and E. coli TrpG, BLAST searches detected moderate similarity between various anthranilate phosphoribosyltransferases and thymidine phosphorylases, with P values between 0.1 and 0.001. The region of highest conservation in most of these analyses included a block of about 60 amino acid residues that in the majority of the family II proteins is located near the N-terminus (with the exception of 2-domain enzymes of tryptophan biosynthesis, e.g., E. coli TrpG, that contain the N-terminal anthranilate synthetase domain). Subsequent multiple alignment analysis revealed 2 additional conserved blocks (Fig. 2). For the set of 6 sequences including the E. coli and human thymidine phosphorylases, together with 4 diverse sequences of anthranilate phosphoribosyltransferases, the random matching probabilities for each of the 2 N-terminal blocks were below [...], as computed using MACAW.

The alignment of family II is readily interpretable in terms of the known 3D structure of E. coli thymidine phosphorylase, DeoA (Walter et al., 1990). The protein molecule consists of the small α-helical domain and the large α/β domain. The α-helical domain is comprised of amino acid residues 1-65 and 163-193. Strikingly, it is this domain that is most highly conserved between thymidine phosphorylases and TrpD (blocks I and III in Fig. 2 containing α1-4 and α9, respectively). Secondary structure prediction indicated that the respective regions of TrpD have α-helical conformation and, for the N-terminal part of the domain, suggested an almost perfect alignment of the 4 helices in TYPH and TrpD (Fig. 2). The N-terminal portion of the α-helical domain, and in particular α1 and α3, is involved mostly in subunit interaction in the DeoA dimer. On the other hand, sequence conservation in α4, including the nearly invariant glutamic acid (Fig. 2), may suggest a more specific, not yet described function. The distal part of the α-helical domain containing α9 is implicated in the pyrimidine base binding (Walter et al., 1990). Because the geometry of the anthranilate ring is similar to that of a pyrimidine, it seems likely that in TrpD this region contributes to the anthranilate-binding site.

The middle conserved alignment block II in Figure 2 belongs to the large α/β domain of DeoA and contains the phosphate-binding site (Walter et al., 1990). This region includes 2 flexible loops between β1A and α5, and between β2A and α6, both of which contribute to phosphate binding. The conservation in the first of these loops, which bears general resemblance to the phosphate-binding loops in other classes of enzymes, e.g., ATPases and [...] (Saraste et al., 1990; Schulz, 1992; Koonin, 1993), is particularly striking (Fig. 2; this loop is dramatically altered in E. coli YbiB, suggesting that despite the highly significant similarity to TrpD, this protein may lack the phosphoribosyltransferase activity). This segment belongs to one of the regions of highest conservation in the complete alignment of TrpD (Kim et al., 1993). Previous sequence comparisons and structural modeling indicated that this motif appears to be conserved in almost all phosphoribosyltransferases and is located in a β-strand-loop-α-helix unit within the putative structural core (Argos et al., 1983; Busetta, 1988; de Boer & Glickman, 1991; Fig. 3). It has been speculated by these authors that the conserved motif is part of the PRPP-binding site. The conservation in thymidine phosphorylases that do not interact with PRPP rather indicates that it is a specific form of the phosphate-binding loop, the function of which is not limited to PRPP binding. It has to be emphasized that the conservation of the phosphate-binding site between family II and other phosphoribosyltransferases is very limited, barely detectable by automatic methods of sequence analysis, and becomes apparent only upon comparison of multiple alignments for different families (Fig. 3).

[Figure 2: multiple alignment of 3 conserved blocks (I, α-helical domain; II, phosphate binding; III, pyrimidine/anthranilate(?) binding) for DeoA Ec, TYPH Mp, TYPH hum, TrpG Ec, TrpD (Ac, Vp, Bl, Ba, Ll, Bs, Tm, Mt, Hv, At, Sc), and YbiB Ec, with the DeoA secondary structure (α1-α6, α9, α10, β1A, β2A, β4A), predicted secondary structure, consensus lines, and accession numbers. Sequence rows not reproduced.]

Fig. 2. Conserved sequence motifs in family II: thymidine phosphorylases and anthranilate phosphoribosyltransferases. The consensus shows the conserved amino acid residues with 1 possible exception; residues that conform to the consensus are highlighted by bold. Asterisks show identical residues, and colons show similar residues in the sequences of human TYPH and E. coli TrpG. The secondary structure for DeoA is from the published 3D structure (Walter et al., 1990); the secondary structure for anthranilate phosphoribosyltransferases is the consensus of predictions. Daggers denote the amino acid residue that directly contacts the phosphate in DeoA; exclamation marks show residues that contact the pyrimidine base. For the other details and designations see legend to Figure 1. Ec, E. coli; Mp, Mycoplasma pyrum; Ac, Acinetobacter calcoaceticus; Vp, Vibrio parahaemolyticus; Bl, Brevibacterium lactofermentum; Ba, Buchnera aphidicola; Ll, Lactococcus lactis; Bs, Bacillus subtilis; Tm, Thermotoga maritima; Mt, Methanobacterium thermoautotrophicum; Hv, Haloferax volcanii; At, Arabidopsis thaliana; Sc, Saccharomyces cerevisiae.

[Figure 3: alignment of the phosphate-binding motif in thymidine phosphorylases (DeoA Ec, TYPH Mp, TYPH hum) and in the phosphoribosyltransferase groups AnthrPRT, ATP-PRT, APRT, UPRT, HPRT, OPRT, AmPRT, and RPPK, with secondary structure, consensus line, and accession numbers. Sequence rows not reproduced.]

Fig. 3. Conservation of the phosphate-binding motif in thymidine phosphorylases and various groups of phosphoribosyltransferases. Closely related sequences are omitted. The consensus shows amino acid residues conserved in the majority of the enzyme groups. The motif could not be identified in nicotinate phosphoribosyltransferase and quinolinate phosphoribosyltransferase. Secondary-structure prediction is based on the experimentally determined 3D structure of DeoA, our prediction for anthranilate phosphoribosyltransferases, and the published model of the conserved phosphoribosyltransferase core (Busetta, 1988; de Boer & Glickman, 1991). The designations are as in Figures 1 and 2. Phosphoribosyltransferases: AnthrPRT, anthranilate; ATP-PRT, ATP; APRT, adenine; UPRT, uracil; HPRT, hypoxanthine (guanine); OPRT, orotate; AmPRT, glutamine; RPPK, ribose pyrophosphate kinases (PRPP synthetases).

Concluding remarks: Implications for enzyme evolution

The present analysis of the relationships between the amino acid sequences of nucleosidases and phosphoribosyltransferases showed that in these enzymes sequence similarity does not necessarily reflect similarity between the catalyzed reactions. Enzymes that catalyze the same reaction and may even be capable of utilizing identical substrates may show no appreciable sequence similarity to each other, like uridine phosphorylase (Udp) and thymidine phosphorylase (DeoA), or very weak similarity at the level of a degenerate conserved motif, like anthranilate phosphoribosyltransferases and other groups of phosphoribosyltransferases. Functional convergence may account for the analogous specificity in at least some of these cases (Doolittle, 1994).

In contrast, enzymes that catalyze very different reactions may share significant sequence similarity and probably have evolved by divergence from a common ancestor. Examples of such unexpected similarities were found in both enzyme families described here. In family I, there is a clear distinction between the reactions catalyzed by Amn and such proteins as Udp and DeoD, with the former being a hydrolase and the latter being phosphotransferases (Table 1); thus, this family unites enzymes that formally even belong to different classes. Obviously, however, the chemical moieties involved in both reactions are the same, i.e., phosphate, ribose, or deoxyribose, and a purine or pyrimidine base. This identity of the chemical constituents may account for the conservation of the enzyme structure. Essentially the same pertains to thymidine phosphorylases and anthranilate phosphoribosyltransferase in family II that catalyze superficially very different reactions and are involved in different biochemical pathways (salvage pathway of nucleotide metabolism and amino acid biosynthesis, respectively). Again, however, chemically, the ingredients are very similar, namely phosphate, ribose, or deoxyribose, and a nitrogen-containing compound with a single aromatic ring (thymidine or anthranilate, respectively).

Significant sequence conservation between enzymes that catalyze different reactions may be widespread. Relevant examples include haloalkane dehalogenase and epoxide hydrolase (Janssen et al., 1989; Ollis et al., 1992); creatinase, methionyl aminopeptidase, and prolidases (Murzin, 1993); asparagine synthetase

translated versions of GenBank and EMBL sequence databases at the National Center for Biotechnology Information (NIH).

Computer-assisted sequence analysis

Amino acid sequences were compared with the NRDB using programs based on the BLAST algorithm (Altschul et al., 1990). The BLASTP program was used to screen the amino acid sequence database and the TBLASTN program was used to screen the conceptual translation of the nucleotide sequence database in 6 reading frames. The BLAST algorithm provides the significance estimate for each ungapped pairwise alignment produced in the database search using the statistical theory of asymptotic extremal distribution for high-scoring segments (Karlin & Altschul, 1990, 1993; Altschul et al., 1994). For all database searches, the amino acid substitution matrix BLOSUM62 (Henikoff & Henikoff, 1992, 1993) was employed. Compositionally biased segments in the query sequences that may produce spurious "hits" in database searches were detected and masked using the SEG program (Altschul et al., 1994; Wootton & Federhen, 1993).

Multiple amino acid sequence alignments were constructed using the MACAW program (Schuler et al., 1991). MACAW produces alignments consisting of ungapped blocks separated by unaligned spacers of variable length by parsing compatible high-scoring segments detected by pairwise comparison of all sequences in a set. The statistical significance of each block is calculated separately using a generalization of the theory for pairwise alignments (Karlin & Altschul, 1990; Schuler et al., 1991). The boundaries of the blocks are adjusted so as to maximize the significance. The probability of obtaining a block by chance increases with the growth of the "search space." Usually, the search space is set to be equal to the product of the lengths of the compared sequences. Therefore, MACAW actually provides significance estimates given that the set of sequences under comparison has been initially delineated using some independent criteria. In addition, MACAW may overestimate the significance if a subset of closely related sequences is included in the analysis. To alleviate this problem, only 1 representative of each such subset was included in the calculations.

Screening of the NRDB for amino acid patterns (regular expressions) was performed using the program PAST (R.L. Tatusov, unpubl.).

Protein secondary structure was predicted using the recently developed neural network method (Rost & Sander, 1993).

Acknowledgment

We are grateful to Dr. Roman L. Tatusov for providing unpublished computer programs.
and aspartyl tRNA synthetase (Furukawa et al., 1992); and pu- tative active centers of numerous phosphoesterases that hydro- lyze a wide variety of substrates (Koonin, 1994). References Altschul SF, Boguski MS, Gish W, Wootton JC. 1994. Issues in searching molecular sequence databases. Nature Genet 6:119-129. Materials and methods Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J Mol Bio/215:403-410. Argos P, Hanei M, Wilson JM, Kelley WN. 1983. A possible nucleotide- Sequences and databases binding domain in the tertiary fold of phosphoribosyltransferases. JBiol Chem 258:6450-6457. Amino acid and nucleotide sequences were retrieved from the Barton GJ, Ponting CP, Spraggon G, Finnis C, Sleep D. 1992. Human platelet-derived endothelial cell growth factor is homologous to Esche- nonredundant sequence database (NRDB) that is constructed by richia coli thymidine phosphorylase. Protein Sci 1:688-690. merging nonidentical entries from SWISS-PROT, PIR, and Brun YV,Breton R, Lanouette P, Lapointe J. 1990. Precise mapping and 1088 A.R. Mushegian and E, K Koonin

comparison of two evolutionarily related regions of the Escherichia coli Lin ECC. 1987. Dissimilatory pathways for sugars, polyols, and carboxyl- K-12 chromosome. Evolution of valU and IysT from anancestral tRNA ates. In: Ingraham JL, Low KB, Magasanik B, Schaechter M, Umbarger operon. J Mol Biol214:825-843. HE, eds. Escherichia coli and Salmonella typhimurium. Cellular and mo- Busetta B. 1988. The use of folding patterns in the search of protein struc- lecular biology. Washington, D.C.: American Society for Microbiology. ture similarities: A three-dimensional model of phosphoribosyltransfer- pp 244-284. ases. Biochim Biophys Acta 957:21-33. Murzin AG. 1993. Can homologous proteins evolve different enzymatic ac- Clausen S, Ape1 K. 1991. Seasonal changes in the concentrationof the ma- tivities? Trends Biochem Sci 18:403-405. jor storage protein and its mRNA in xylem ray cells ofpoplar trees. Plant Neuhard J, Nygaard P. 1987. and . In: Ingraham JL, Mol Bioi 17:669-678. Low KB, Magasanik B, Schaechter M, Umbarger HE, eds. Escherichia Coleman GD, ChenTHF, Fuchigami L. 1992. Complementary DNA clon- coli and Salmonella typhimurium. Cellular and molecularbiology. Wash- ing of poplar bark storage protein and controlof its expression by pho- ington, D.C.: American Society for Microbiology. pp 445-473. toperiod. Plant Physiol 98587-693. Ollis DL, Cheah E,Cygler M, Dijkstra B, Frolow F, Franken SM, Hare1 M, Crawford IP. 1989. Evolution of a biosynthetic pathway: The tryptophan Remington SJ, Silman I, Schrag J, Sussman JL, Verschueren KHG, Gold- paradigm. Annu Rev Microbiol43:567-600. man A. 1992. The alpha/beta hydrolase fold. Protein Eng 5:197-211. Davis JM, Egelkrout EE, Coleman GD, Chen THH, Haissig BE, Riemen- Parsons TJ, Bradshaw HD, Gordon MP. 1989. Systemic accumulation of schneider DA, Gordon MP. 1993. A family of wound-induced genes in specific mRNAs in response to wounding in poplar trees. 
Proc Natl Acad Populus share common features with genes encoding vegetative storage Sci USA 86:9851-9855. proteins. Plant Mol Biol23:135-143. Piatigorsky J. 1993. Puzzle of crystallin diversity in eye lenses. Dev Dyn de Boer JG, Glickman BW. 1991. Mutational analysis of the structure and 196:267-272. function of the adenine phosphoribosyltransferase enzyme of Chinese Pittard AJ. 1987. Biosynthesis of aromatic amino acids. In: Ingraham JL, hamster. JMol Biol221:163-174. Low KB, Magasanik B, Schaechter M, Umbarger HE, eds. Escherichia Doolittle RF. 1994. Convergent evolution: The need to be explicit. Trends coli and Salmonella typhimurium. Cellular and molecularbiology. Wash- Biochem Sci 19:15-18. ington, D.C.: American Society for Microbiology. pp 368-394. Ealick SE, Rule SA, Carter DC, Greenhough TJ, Babu YS, Cook WJ, Rost B, Sander C. 1993. Prediction of protein secondary structure atbetter Habash J, Heliwell JR, Stoeckler JI, Parks RE, Chen S, Bugg CE. 1990. than 70% accuracy. J Mol Bioi 232:584-599. Three-dimensional strpcture of human erythrocytic purine nucleoside Saraste M, Sibbald PR, Wittinghofer A. 1990. The P-loop-A common mo- phosphorylase at 3.2 A resolution. J Bioi Chem 265:1812-1820. tif in ATP- and GTP-binding proteins. Trends Biochem Sci 15:430-434. Furukawa T, Yoshimura A, Sumizawa T,Haraguchi M, Akiyama SI, Fukui K, Schuler GD, Altschul SF, Lipman DJ. 1991. A workbench for multiple align- Hinchman SK, Henikoff S, Schuster SM. 1992. A relationshipbetween ment construction and analysis. Proteins Struct Funct Genet 9:180-190. asparagine synthetase A and aspartyl tRNA synthetase. J Bioi Chem Schulz GE. 1992. Binding of nucleotides by proteins. Curr Opin Struct Biol 167:144-149. 2:61-67. Henikoff S, Henikoff J. 1992. Amino acid substitution matrices from pro- Stout JT, Caskey CT. 1985. HPRT: Gene structure, expression and muta- tein blocks. Proc NatlAcad Sci USA 89:10915-10919. tion. Annu Rev Genet 19:127-148. Henikoff S, Henikoff J. 1993. 
Performance evaluation of amino acid sub- Tritz GJ. 1987. NAD biosynthesis and recycling. In: Ingraham JL, Low KB, stitution matrices. Proteins Struct Funct Genet 17:49-61. Magasanik B, Schaechter M, Umbarger HE, eds. Escherichia coli and Ishizawa M, Yamada Y. 1992. Angiogenic factor. Nature (Lond) 356:668. Salmonella typhimurium. Cellular and molecular biology. Washington, Janssen DB, Pries F, van der Ploeg J, Kazemier B, Terpstra P, Witholt B. D.C.: American Society for Microbiology. pp 557-566. 1989. Cloning of 1,2-dichloroethane degradationgenes of Xanthobac- Walter MR. Cook WJ, Cole LB, Short SA, Koszalka GW, Krenitsky TA, ter autotrophicus GJlO andexpression and sequencing of the dhlA gene. Ealick SE. 1990. Three-dimensional structure of thymidine phosphory- J Bacteriol I71 :6791-6799. lase from Escherichia coli at 2.8 A resolution. J Bioi Chew 265:14016- Karlin S, Altschul SF. 1990. Methods for assessing the statistical significance 14022. of molecular sequence features by using general scoring schemes. Proc White HB. 1982. Biosynthetic and salvage pathways of nucleotide Nut1 Acad Sci USA 87:2264-2268. coenzymes. In: Everse J, Anderson BM, You KS, eds. Pyridine nucleo- Karlin S, Altschul SF. 1993. Applications and statistics for multiple high- tide coenzymes. New York: Academic Press. pp 1-17. scoring segments in molecular sequences. Proc Nut1 Acad Sci USA 90: Winkler ME. 1987. Biosynthesis of histidine. In: Ingraham JL, Low KB, 5873-5877. Magasanik B, Schaechter M, Umbarger HE, eds. Escherichia coli and Kim CW, Markiewicz P, Lee JJ, Schierle CF, Miller JH. 1993. Studies of Salmonella typhimurium. Cellular and molecular biology. Washington, the hyperthermophile Thermotoga maritima by random sequencing of D.C.: American Society for Microbiology. pp 395-411. cDNA and genomic libraries. J Mol Bioi 231:960-981. Wootton JC, Federhen S. 1993. Statistics of local complexity in amino acid Koonin EV. 1993. 
A superfamily of ATPases with diverse functions contain- sequences and sequence databases. Comput Chem 17:149-163. ing either classical or deviant ATP-binding motif. JMol Biol229:1165- Wu JJ, Schuch R, Piggot PJ. 1992. Characterization of a Bacillussubtilis 1174. sporulation operon that includes genes for an RNA sigma Koonin EV. 1994. Conserved sequence pattern in a wide variety of phospho- factor and for a putative Do-carboxypeptidase. JBacteriol174:4885-4892. esterases. Protein Sci 3:356-368. Wurgler SM, Richardson CC. 1990. Structure and regulation of the gene for Leung HB, Kvalnes-Krick KL, Meyer SL, de Riel JK, Schramm VL. 1989. dGTP triphosphohydrolase fromEscherichia coli. Proc Nut1 Acad Sci Structure and regulation of the AMP nucleosidase gene(amn) from Esch- USA 87:2740-2744. erichia coli. 283726-8733. Yoshimura A, Kuwazuru Y, Furukawa T, Yoshida H, Yamada K, Akiyarna Leung HB, Schramm VL. 1980. Adenylate degradation in Escherichia coli. S. 1990. Purification andtissue distribution of human thymidine phos- The role of AMP nucleosidase and properties of the purified enzyme. phorylase: High expression in lymphocytes, reticulocytes and tumors. J Bioi Chem 255:10867-10874. Biochim Biophys Acta 1034:107-113.
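
Supplementary sketch. The methods mention screening the database for amino acid patterns written as regular expressions, and Fig. 3 gives the consensus hhhhDDhh.TQ.Th.. for the phosphate-binding motif. The following Python fragment illustrates how such a screen might look; it is not the PAST program, whose details are unpublished. Several things here are assumptions made only for illustration: that "h" denotes a hydrophobic residue, the particular residue set chosen for it, the function names, and the toy sequences (the first embeds a 16-residue test segment chosen to fit the consensus).

```python
import re

# Assumption: 'h' in the consensus stands for a hydrophobic residue. The
# residue set below is an illustrative choice, not taken from the paper.
HYDROPHOBIC = "AVLIMFWYC"


def consensus_to_regex(consensus: str) -> re.Pattern:
    """Translate a consensus string into a compiled regular expression.

    'h' -> any residue in HYDROPHOBIC, '.' -> any residue,
    any other character -> that exact residue.
    """
    parts = []
    for symbol in consensus:
        if symbol == "h":
            parts.append(f"[{HYDROPHOBIC}]")
        elif symbol == ".":
            parts.append(".")
        else:
            parts.append(re.escape(symbol))
    return re.compile("".join(parts))


def scan(pattern: re.Pattern, sequences: dict) -> list:
    """Return (name, 1-based start, matched segment) for every hit."""
    hits = []
    for name, seq in sequences.items():
        for match in pattern.finditer(seq):
            hits.append((name, match.start() + 1, match.group()))
    return hits


# Consensus string as printed in Fig. 3.
PATTERN = consensus_to_regex("hhhhDDhh.TQ.Th..")

# Hypothetical toy "database": one sequence embedding a segment that fits
# the consensus, and one unrelated sequence.
toy_db = {
    "motif_like": "MK" + "VILVDDVLYTQRTVRA" + "GS",
    "no_motif": "MKTAYIAKQRQISFVKSHFSRQ",
}

print(scan(PATTERN, toy_db))
```

A strict pattern of this kind only finds segments that fit every consensus position. Because the consensus reflects the majority of enzyme groups rather than all of them, a real screen would probably need to tolerate mismatches at some positions.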