Volume 16 Number 23 1988 Nucleic Acids Research

Evolution of the herpes thymidine : identification and comparison of the equine herpesvirus 1 gene reveals similarity to a cell-encoded

Graham R.Robertson* and J.Millar Whalley

School of Biological Sciences, Macquarie University, Sydney, NSW 2109, Australia

Received October 19, 1988; Accepted November 11, 1988 Accession no. X13209

ABSTRACT We have identified the equine herpesvirus 1 (EHV-1) thymidine kinase gene (TK) by DNA-mediated transformation and by DNA sequencing. Alignment of the amino acid sequence of the EHV-1 TK with the TKs from 3 other herpesviruses revealed regions of homology, some of which correspond to the previously identified substrate binding sites, while others have as yet, no assigned function. In particular, the strict conservation of an aspartate within the proposed nucleoside suggests a role in ATP binding for this residue. Comparison of 5 herpes TKs with the thymidylate kinase of yeast revealed significant similarity which was strongest in those regions important to catalytic activity of the herpes TKs, and, therefore we propose that the herpes TK may be derived from a cellular thymidylate kinase. The implications for the evolution of activites within a pathway of nucleotide metabolism are discussed.

INTRODUCTION Herpesviruses specify many novel enzyme activities which supplement and/or subvert cellular functions required during the various stages of the infectious process. These include at least five replication-related ; a DNA , a thymidylate synthetase, an alkaline exonuclease, a and a thymidine kinase (TK) (1-7). A (8,9) and a helicase (10,11) have also recently been identified. Of these, the TK is one of the best- characterized (5), with the structure and function of its gene and protein studied in detail, as well as many aspects of its role in viral pathogenesis. It is a target for antiviral chemotherapy and attenuated vaccine strains have been developed by deleting the TK function. Although the mechanism(s) by which herpesviruses have acquired the genes encoding these enzymes can only be speculated

11303 Nucleic Acids Research upon, in some instances there is sufficient similarity to the analogous cell gene to suggest a common ancestry (6-8,10). However, the TKs from those herpesviruses for which amino acid sequences are available (12-17) have no recognisable similarity to the TKs from several vertebrates and poxviruses (18 and refs therein), beyond a region common to all enzymes which use ATP (15,19-21). In addition, the herpes TKs have much broader substrate specificities and extra enzymatic activities. Finally they are active as dimers, whereas the vaccinia virus TK, which is similar to the cell enzymes, is a tetramer in its native form (5). Therefore, either the herpes TK has diverged considerably from the cell-encoded TK, or was not originally derived from it. Two additional enzyme activities have subsequently been shown to be associated with the herpes-specified TK ; a (22) and a thymidylate kinase, which phosphorylates thymidine monophosphate to thymidine diphosphate (23,24). Although considered primarily as a thymidine kinase, with these extra activities this enzyme constitutes a multifunctional deoxypyrimidine kinase (dPyK). In the absence of similarity with a cell-encoded TK, it is possible that the herpes dPyK is derived either from a similar multifunctional cell gene (whose existence has not been demonstrated) or from a monofunctional cell deoxycytidine or thymidylate kinase gene. The latter possibilities imply development of a thymidine kinase activity after incorporation into the herpes lineage. Therefore, the origin of the herpes dPyK is relevant not only to the acquisition and evolution of genes in viral genomes, but also to the evolution of enzymes and biochemical pathways generally. The existence of an equine herpesvirus 1 (EHV-1) specified TK was shown by Allen et al., (25,26) and its biochemical properties including use of nucleoside analogues have been characterised (27,28). To expand the range of herpesviruses for which TK sequences are available for comparison, we have identified and sequenced the EHV-1 TK gene. Comparison of the EHV-1 TK protein sequence with that of other herpesviruses has confirmed the assignment of functionally important segments of the herpes TK proteins by their conservation.

11304 Nucleic Acids Research

Alignment of the TK amino acid sequence from EHV-1 and four other herpesviruses facilitated comparison with the yeast thymidylate kinase (29). This revealed similarity which was greatest in those regions proposed to be part of the of the herpes TK. Such similarity suggests the herpes TK could be derived from a cellular thymidylate kinase and may therefore be an example of consecutive steps in a pathway of nucleotide metabolism evolving "backwards".

MATERIALS AND METHODS Cells and Virus LMTK- cells were maintained in MEM supplemented with 10% foetal calf serum. Virion DNA of EHV-1 (isolate HVS 25A) was cloned into the BamHI site of pBR322 as described previously (30). The 5.35 kb BamHI/ClaI and 3.15 kb BamHI/PstI subfragments of the BamHI-B fragment were cloned into pBR322 and pUC18 respectively. DNA-mediated Transformation Mouse LMTK- cells were transformed as described by Graham and Van der Eb (31) with various plasmid DNAs, using LMTK cell DNA as carrier. HAT selection was started 2 days after transformation and LMTK+ cell lines established after 3 weeks by picking individual colonies with sterile cloning rings. DNA Sequencing The 3.15 kb BamHI/PstI subfragment of the EHV-1 BamHI-B fragment was sequenced by the dideoxy chain termination technique (32), (reagents from BRESATEC, Adelaide, South Australia), using an overlapping series of M13 templates generated by the rapid deletion method of Dale et al., (33). DNA and Protein Sequence Analysis DNA sequence was read and compiled using the RODENT system (34), and analysed using the Cornell package (35). Protein sequences were initially aligned by pairwise comparison (36) and then by hand for multiple alignments.

RESULTS AND DISCUSSION Localisation of the EHV-1 TK gene With the knowledge that UV-irradiated EHV-1 virions are capable of rescuing LMTK- cells (25), DNA-mediated

11305 Nucleic Acids Research transformation with cloned subgenomic fragments was used to locate the EHV-1 TK gene. Based on the observed colinearity of the EHV-1, herpes simplex virus (HSV) and varicella zoster virus (VZV) genomes (37) and the known map location of the HSV, VZV and bovine herpesvirus 1 TK genes (12,14,17 and 38), the approximate position of the TK gene on the EHV-1 physical map (39) was predicted to be 0.5 map units (Figure 1A and B). A 35 kb plasmid clone (I-B-G) from the BamHI library of EHV-1 (30) which spans this region was initially used for transformation, and was found to restore the TK function. LMTK+ cell lines established in this way were demonstrated to have integrated EHV-1 DNA sequences (data not shown). To refine this location further, smaller fragments were tested (Figure 1B) with a 3.15 kb, BamHI/PstI subclone of the BamHI-B fragment being the smallest tested which was capable of restoring the TK activity. A functional TK gene is therefore located within the 3.15 kb BamHI/PstI fragment at around 0.47 map units on the EHV-1 genome (Figure 1B), a position consistent with that reported for other alphaherpesviruses. The relative positions of the EHV-1 gB gene homologue at 0.41 map units and the major capsid protein gene homologue at 0.53 map units (unpublished data) confirm the colinearity with HSV and VZV at the level of individual genes. Sequence of the EHV-1 TK gene The DNA sequence of the 3.15 kb BamHI/PstI fragment (Figure 1C) is shown in Figure 2 and analysis of its coding potential reveals two complete ORFs and two others which extend beyond the region sequenced. The 352 amino acid ORF commencing at nucleotide 1097 has significant homology with the TKs of HSV and VZV (see Figure 3.), and given the ability of this region to code for a functional TK, this ORF is identified as the EHV-1 TK gene. The two ORFs upstream and on the opposite strand to the TK are homologous to two genes of HSV and VZV which are in the same position and orientation relative to their TK genes(17,40). These have been named genes 34 and 35 at this stage, following the scheme of VZV (17). The HSV-1 gene 35 homologue (UL24) overlaps the TK gene and has recently been shown to be important for viral growth in cell culture (J.G.Jacobson, S.L.Martin and

11306 Nucleic Acids Research

T F J H M L K A G C O E Q P D N R S A I I B I I II' EHV-1 BamHl map .. -150kb

-'I B G b B b b b %> 0.45 0.50 0.55 0.60 map units b b

b c

b p

b , - 35 TK _PgH C _t l I .lI 0 - 1.0 2.0 -3.0 kb

Fiq. 1. Localisation and sequencing of the EHV-1 TK gene. (A) BamHI map of the EHV-1 genome. (B) Regions of the EHV-1 genome capable of rescuing the TK function. (C) Strategy for sequencing the 3.15 kb BamHI/PstI subfragment of the BamHI-B fragment. b=BamHI, p=PstI, c=ClaI.

D.M.Coen, personal communication). The ORF downstream of the TK gene has a hydrophobic signal peptide and potential N- glycosylation sites, in addition to homology with the characterized gH glycoprotein of HSV-1 (41) and the VZV gene 37 (17), which are both downstream of their respective TK genes. Therefore, the gene organisation around the TK genes of these alphaherpesviruses is highly conserved. As the start codons for the TK gene and gene 35 are within 12 bp of each other (Figure 2), the promoter and regulatory elements of the TK gene are embedded within the coding region of gene 35. Transcript mapping and functional analysis will be required to establish which of the potential TATA elements determines the cap site for the TK transcript. There do not appear to be any 'GC' boxes in the 5' upstream region of the EHV-1 TK gene. These have been shown to be integral elements of the HSV-1 TK promoter and binding sites for the Spl transcription factor (42).

11307 Nucleic Acids Research

BamHl OGATCCGCCCTCGSTGTTGGAGCTGACGOTGAATGTAGGTCCGCTCCAAAAGTTAAGCGGAAATGGCGGGGCTATAAAGTTTCGCACGTC 90 8 G G E T N S S V T F T P G S w F N L P F P P A I F N R V D - Start 'gene 34' .10 TGTCGOTATAGTGGACGTGGCCGTATCGCTTACGTAAAGCSATCCTAACACATAATTCACATACTCCGCCATCTCCACCGCSACTATAAG 180 I1 T P I T S T A T D S V Y L S G L V Y N V Y R A M sTtTTTT 270 Stop 'goine 3c RCCGGCTGATGGACCGCGGCTGCGTTAGTSGI;TGTTTG 360 K PS P L S A A W A L P N V H S V A V P Q H V A A A N T T N CCGGOACCSCAGCTSOAAATAAACTCACGACGGCGGCTOCCCGCTGACOGCTGGO,ICTGGCGTTATAGATGGCACTGGCTCCGCTGO ICCOCCTr 450 A P V A A P F L S V V A A A A S P Q A P T I S P V P e A J CCGCCTCCTTGTCC,3ATTTAGCTGAAACTOGTGGAGCOTATTCCGO ICCAAAC 540 TTOTACTAAAGOCTTTGOCCTTGGTTCCTTTaaCGACGCAC V . inO K T S F A K A K T O K A V C R R R T S K A s V P P A Y A L 0 STGATATSOTOCAGGATAGCACGGCAGCOTTGCTATATACAAACCTGTGGCGATA,RAACGCOTTACCCGCAACACCCGCATTCCTCtbGTTGAG1S 630 R S I T C S L V A A N S Y V V Q P S L R T V R L V R M G R Q CTACAAACACTASTACCGGASCTASTACGATCTCACCGCTI TCsCCGGaGTAGCG'STTCTCSCCASCAACCTSCACGAGTCATGTACTGGTCC 720 A V P V L V P A L V I e sN S P P L T R A LL R C S D H V P GCATOCCCCCCTTCCGCTGTAGATTTTTACTCGCOGTGTTCCATATTTTTGGAAA,AAGCGACACGTTTTTAOCTCGGATGC,'ACACTC 810 R M G G R Q L N K S A T N M N K SIF R C T K L K I L I (C V 0 CCTTGGCGTCAGAACCCTTTCCAAATTSCACOGTACAaACAACAATCCaGGCGCCICOCTOTCCGAGGTTAACCTCAAAGGCCAGAGO IACACGC 900 O K A D SI0a F Q V T C V C D P RIR Q 0 L N V K P A L IS V CCAOTOCCGTTTTAASAGTTTCCSCTSGCACCAGTTCACTAAAAAAsasaAGCAAvRGCCTCTCTCCGTACACOCCGTTTCGCTTGOGICGCTTGa 990 i S L A T K L T E A P V L E S F L P AIL R e G Y V G N R KJA S CCAGOTCTTGAACCATCGCSTTATAGAAGCSGTTGTGGCACCCGTATACCAOCTC'CTSAGTCTGCTTCTAGCTGTCAOACGCTSTCIITACGTTT 1080 A L D Q V M A N Y F R N H C R I G AIR L R S R A T L R Q R R Start 'gene 35' , - M A A R V P S G E A R R S ASSG A P V SRR Q V T I TCATTTTCAGAAATCAATGOCOSCTCGCSTACCTTCCOGGO GAAGCTCGACGGAGCGCCAGCOGGCGcCCGGTCAGGCGGCAAGTA'AACAATr 1170 K N Sta rt-0TK V R I Y L D S V Y GO KK SS T T R V M A S A A S 0 G S P T ASTTAGAATTTACCTCOATOGGGTCTACOSCATCSGCAAGS .GCACOACTGOACAGSTTATGGCATCGGCTGCGAGTGGAGGAAGTCCAAC 1260 L Y F P E P M A Y K R T L NF A D V I S SI Y D T Q N R Q TCTATACTTTCCTSAGCCTATGCGSTACTGGCGOACTCTCTTTTGAAOCOOACSTAATTAGTGGTATTTACGACACCCAGAACCGGAAACA 1350 Q S D L A A D D A A S I T AA H Y Q S R F T T P Y L I L H D H GCAGGOAOATTTSGCOSCTGATGACGCOOCOTCAATAACGS GCOCACTACCAGAOCCGCTTTACCACOCCCTACCTTATCCTACACSATCA 1440 T GS L F G G D S L Q R S TT R P D L T V V F D R H P V A S A CACATTTSOGTTOTTTGGOGGCOACAGCCTACAGCSTOOGAACAAGACCAGACCTAACCGTCSTTTTTGACCGCCACCCAGTCOCCTCTSC 1530

I- V C F P A A R Y L I G D M SS M C A L I A M V A T L P R e P COGTOTOCTTTCCCGCCSCTCGCTACCTCATCSGAGACATGTTCCATGTOCOCOCTGATTOCCATGGTTSCCACCCTACCCAGOGAACCGCA 1620 C U GS N I V V T T L N V D E HH V R R L R T R A R I G E Q I D M AGGCOOAAACATCGTOGTTACCACCCTCAATSTSGACGASC CACGTGCGAAGACTGCGCACCCGCGCCCAGAATCGGGGAACAGATTGACAI T 1710 K L I A T L R N V Y S M L AA N T S N F L R S G R V W R D N GAAGCTAATCGCCACACTGCOAAACGTOTACTCTATGCTCGGCTAATACTAGCAACTTTTTGCGCTCCGOGAGAOTATGGCGCGACGGCTO a 1 800

G E L P L S C e T Y K H RS A T Q M D A F Q E R e S P E L S D GOOGGAOTTGCCCCTTTCGTGCGAGACCTATAAACATCGCGCICAACGCAGATGGACGCCTTCCAGGAGCOCOAATCTCCTSAGCTGAGCCA 1890

T L F A M F K T P E L L D DD R V I L E V H A N A L D A L M CACOTTOTTTOCCATGTTTAAGACTCCCGAGCTGCTAGACC GATCGTOGAOTGATATTOGAAGTTCACOCCTOGOCOCTTOACGCGCTSA7r 1i980

L K L R N L S V F C A D L SS G T P R Q C A A T V E S L I P L GCTAAAGCTGCGCAACCTGASTGTTTTTTTGCCTGATCTOAGCOGGACTCCSCOCCAOTGTGCTGCAACCOTOGASTCTCYYCCCC7r 2070

M S S T L S D S E S A s S LL E R A R T F N A E M v I CATOAGCASCACCCTCTCCGATTCOGAOTCSOCCTCCTCCKCCTSOAGCOGOCCGCGCGCACCTTCAACGCCOAGATOOOCOTGCTOCTAh 2160 Stop TK TATOTAATGTTTGTTGTOCCAGTGTAATAATTATGAAATAAAGATTCCTTTOCCTATATCCCTCATACCGCCTCOTGTGTCCAGIITATOTA& 2250 'poly A' TK AACTTCCAGOTTCTAGTTTTOOGGATATATAOTGSCTSTGACCTGGATTCAILTTTAGTACAGTOCOOCCGAGCCACTCAAGATA,TACCOT 2340

M L Q P Y R K M L I F A V V T V A F GGCTOTACATTAACTTOGGOAATCATTACTTCCGCGATCATOTTACAACCOTAIkTPCSAAAAATOCTOATCTTTGCASTTOTTACTOTTSCCT 2430 Start gH A MA V W S T P V P A T P S G V G N A T A N N S F N I T R TTGCGATGGCTGTCTOGTCAACGCCCOTCCCAOCCACTCCOTCTGGCGTOGG rAACGCTACTTGOGCAAACAATAOCTTCAACATAACCA k 2520

Y D I TM Q V Y S N T S N P I F F V V IS eRR N F R I z GOTATGACAAGATAACCATGSGACAGGTTTATAGTAACACTTCAAACTCTCCCICA'ATCTTCTTCOTTGTTATATCOGAGCSGAATTTTCOCA 2610

V N T P L G A S V F W I P K A M N P P Q H Q P C V A N P TCGTTAACACTCCNCTOGGCGCGTCGGTATTTTGGATACCAAAOGGCGCTATI IAATCCTCCOCAACACCAACCCTGTOTCOCCAACOGGC 2700

0 B P D P R G P C V N S T V S L L F N N V P F L MO K N CGOAACCTGGGGACCCACOCOGGCCGTGTGTCAACTCOACCGTCAGTTTATTY ITTTAATGAAAACOTOGAOCCGTTCTTAATOTCAAAAA 2790

L L 9 F E V L P D T Y I T G N T F R S K T A T T S N P V ATCTTTTAGAGTTTOAAGTOTTGCCCGACACCTACATAACCGOTTOGACOTTIrTZrGAGCOOTCTAAAACAGCGACCACAAAAACAACCCGG i 2880

G V V L S P P R a S P s A N T T I R D D P K K P L S I I TTGOTOTOOTTTTATCGCCACCCAGGGOCAOTCCOTCAGCTAACACAACAATC :AGGGACGATGGCGGACCCAAAAAGCCCCTGAGCATTA 2970

D Y T T L V A D L Q N F T M T L T Y I S P F V P I TAQACGTATAC!CCACGCTCVT MCSGACTTGCDTCACTATGACATTV iACTTACATAAGCCCCTTTGCCGCGGTGTGGCCTATTG 3 3060 A F Q T G I T V M G C D T T Q V V A Y L G R G F M G L Q AAGCCTTTCAACGGGCATCACGGTCATGGGGTGCGACACTACACAGGTTGTIaXukw s x sWNuWuulaIXuu sX^suuwulwu^U^U *Wti%Jff [ -U s--s-w-w-XIv;-s-;wv-t;-rUw-l-lrTCIGCGTATCTCGGCCATOGGTTTATOGGCCTGCAG1 XluZZUX-lwav AU 3146 bp Poti

Fig. 2. DNA sequence of the 3.15 BamHI/PstI fragment shown to contain a functional TK gene. Potential TATA boxes are double- lined and 'poly A' signals underlined. The naming of genes 34 and 35 follows the scheme for VZV (17).

11308 Nucleic Acids Research

Comparison of the amino acid sequences of herpesvirus TKs. Alignment of the EHV-1 TK amino acid sequence with those published for VZV, HSV-1 and Epstein-Barr virus (EBV) highlights the regions of homology between the herpes TK proteins (Figure 3). There is variation in the length of the N-terminal sequences, and in particular EBV has an additional 243 residues preceding those shown for the alignment. As yet no function has been ascribed to this region of the EBV TK protein. Although deletion of 45 residues from the N-terminus of the HSV-1 TK affects stability of the enzyme, this region is dispensable for TK catalytic activity (44,45). In addition, the amino acid sequences of the 4 TK proteins coalign from the position corresponding to residue 45 of HSV-1 (residue 22 on the alignment) to their C-termini with minor deletions and additions. Therefore, this represents the 'core' sequence of the herpes TK enzymes. The degree of similarity of the TK proteins based on this core sequence ranges from 37% amino acid identity between EHV-1 and VZV, to 19% between VZV and EBV (Table 1), reflecting the expected relatedness of the alphaherpesviruses (HSV, VZV, EHV-1) and the gammaherpesvirus (EBV). However it is interesting that the TK from an equine virus is more similar to the human herpes viruses, HSV and VZV, than the latter are to each other. The degree of similarity among the herpes TKs is much lower than that between both cell and poxvirus TKs (e.g. man/chicken-81% man/vaccinia virus-68%)(18), which are clearly homologous to each other, but not to the herpes. The cell and poxvirus TKs are also smaller proteins (180-230 residues) compared to the herpes TKs. Both of these features correlate with the cell and poxvirus enzymes being solely TKs (46), whereas the larger, more diverse herpes enzymes are multifunctional, each with a distinct range of substrate specificities. Regions of the herpes TKs which are important for enzyme function have been identified by protein sequence comparisons and by relating the changes in natural or engineered mutants to enzyme activity. Our alignment of 4 herpes TKs confirms and extends this analysis. A highly conserved nucleotide binding site (residues 27-45) was initially identified by the consensus sequence -GxxGxGKT/S- (15,20) found in most enzymes which bind

11309 Nucleic Acids Research

10 20 _ 30 EHV MAWIVPSS AtPV-RRQ I ----_YLUt VZV DKTD HSV MASYPCHQHASAFDQAARSRGHSNRR T RRQQ EVR:,P, --1I_ EBV MNVLNLDDAQDTRPIVT NH (PACSL (244) ucleotIde 40 50 60 70 80 90 EUV --RI? P p E A Vzv O EE TP-N I V DA I L HSV P -Q G-W VYVPEP tASET EBV -NS_N L I ---Et T !FE HVY AMIIKN R - Binding Site 100 110 -120 130 140 EHV HTF T-OTDL DRH vzv HltS2RLr I IS VLMDTSTSDL K Kil RP! HSV M YA LAPH - DR.PI L EBV S TT ASRKRSL- V- WI R L Nuclotide/ 150 160 170 180 190 EHV lC- I -M - - L I LNVD VZV It -- V D GLLF-- HSV - P AF I PP --TNI G ,+l EBV QL I M----- 3 4J Nucloeosde Binding Site 200 210 220 230 240 EHV Q- I4KL1§----l- tL L ETI VZV PG - IFVMIII -TNN L HSV Nt*16 GAS|Vs @ D-G - EBV S SVN H AWL PEDI VKVI ITTV---- H

250 260 270 280 290 300 EHV 3QMDAFQ S Le3 DALL VZV QKSECIK PAl EI IPL--- MENC HSV rMS--NAGP Th iEL P* --A I EBV IIRSGV YK--NS L VIQPFRAD 1 L R' F

310 320 330 340 350 EHV AV- I ---j-LpSESS V 352 aa's VZV TPF - +K NMQA4---MGAWNIL VQDNTS 341 HSV P- a ---D- (4VcivTTDPGOjTICD lIT AN 377 EBV P---g TEIYI I FFDWAGLJKVAD 607 5

Fig. 3. Amino acid alignment of the TK proteins from EHV-1, HSV-1 strain CL(101) (13), VZV (17) and the BXLF1 ORF of EBV (16) from amino acid 244 to the terminus, identified as the EBV TK (43). The nucleotide and nucleotide/nucleoside binding sites are bracketed (15,20 and 49). The residues in the nucleotide binding site altered by Liu and Summers (48) are indicated by asterisks. The significance of the motif DRH [l, serine/alanine 2, cysteines 3 and 5 and arginine 4 are discussed in the text.

ATP (19,21). X-ray crystallographic analysis of indicated that the 3 glycines form a flexible loop capable of changing conformation upon binding ATP (21,47), which would

11310 Nucleic Acids Research

Table 1. Percent similarity of the amino acid sequences of 4 herpesvirus TKs. Conservative changes are shown in brackets. Inserted spaces were treated as though a 21st amino acid.

Enzyme from: VZV HSV-l EBV

EHV-l 37 (24) 36 (22) 23 (24) VZV - 29 (24) 19 (22) HSV-l - - 23 (20) account for the reduced affinity of the HSV-1 TK enzyme for ATP on substitution of any one of these glycines by the bulkier valine residue (48). The distinction between the herpes and cell TKs is again illustrated by the presence in the cell and poxvirus proteins of an invariant phenylalanine in place of the second glycine. This possibly reflects the ability of the cell TK to use only ATP and dATP as phosphate donor, whereas the multifunctional herpes enzymes can use ATP, dATP, CTP and GTP and also phosphorylate a much broader range of nucleosides. Another important structure identified in this region of adenylate kinase is a hydrophobic pocket which binds the adenine ring of ATP (21). All 4 herpes proteins have hydrophobic residues in an analogous position around residue 45, and may form a similar structure. The strictly conserved acidic and aromatic residues immediately preceding the glycine loop of the nucleotide binding site are also characteristic features. Another region which may contribute to formation of the active site, extends from residue 137 to 168. Based on the substrate binding properties of TK from HSV-l strains with altered residues (labelled 2 and 4 in Figure 3), Darby et al. (49) proposed that this region is involved in nucleoside binding. The sensitivity of HSV-1 strains to the nucleoside analogue BVdU is related to the alanine residue at position 2, in contrast to HSV-2 and EHV-1 which have a serine, and are both resistant to BVdU. This resistance may be due to an inability of the enzymes in HSV-2 and EHV-l infected cells to phosphorylate BVdU monophosphate to the diphosphate (28,50), indicating that this region could be important to both the

11311 Nucleic Acids Research thymidylate and thymidine kinase activities of the enzyme. This possibility is supported by kinetic studies which found that either both activities share the same active site, or if two sites exist, they overlap (51). The sequence -DRH- at (1] (Figure 3) is the most conserved feature of the herpes TKs. Although within the proposed nucleoside binding site, this motif may be involved in nucleotide binding for several reasons. Firstly, in many diverse ATP- and GTP-binding proteins an aspartate residue is implicated in binding the Mg-NTP ligand (19,21 and 52). Furthermore, this aspartate of adenylate kinase, and other ATP-binding enzymes is preceded by a hydrophobic strand of beta- pleated sheet (21,53). As this aspartate residue is the only one conserved in all herpes TKs, is preceded by hydrophobic residues and is in a region essential for catalytic activity, it is likely to have an analogous role. Functional analysis on this aspartate may establish whether this region is important for nucleotide binding, in addition to its role in determining nucleoside specificity. It is intriguing that the protein kinase family has a similar strictly conserved aspartate in the motif - HRD- (54), i.e. the reverse of -DRH- in the herpes TKs. Alteration of the cysteine residue 5 (Figure 3) in an HSV-1 mutant was found to severely disrupt functioning of the TK (49). The possibility that this cysteine forms a disulphide bridge with the cysteine 3, which is in a functionally important region, was discounted as mutating cysteine 3 to a serine had little effect on enzyme activity (55). In addition, the cysteines at these positions are not strictly conserved in all herpes TKs. Therefore, although cysteine 5 is crucial to enzyme function its role is as yet unknown. The alignment of these proteins identified other regions with localised conserved features whose role in enzyme function will require more structural/functional analyses. Similarity of the herpesvirus TKs with the thymidvlate kinase of yeast. Due to the fundamental differences in enzyme properties and lack of protein sequence similarity between the herpes and cellular TKs, the origin of the herpes TK has been obscure. We

11312 Nucleic Acids Research

YEAST TmpK 6 Kl---ILEGLrtGKTTQC-NILyKk-LQPNC kFPEr sTriGGjf --- EHVTK 27 R --- IYLDGVtygiGKSTTG-r V-asaaSgGSpTiLYFPEpmaywrlLTfeadv VZVTK 14 R---- IyDGAygiGKTTaaeeFLhHfaITPN- LII lSywrM1a geda HSVTK 51 R--_-_VYIDGPhg GKTT T- L alg-Srdd YIVPEpmTywQvLgaset EBVTK 282 K acLFLEGApg KTT l-N kav---f Gd IWVEVPEmpmrywThVy-ena MHVTK 11 R-- -M9!DGPhg KSTrae aRcePrrpi -_IJ EpmaywrStfasda Nucleotide Binding Site YEAST TmpK I-NeYTdsFd1 ... .23... EgKNIVMDR VY- s--AAKTnGD 112 EHVTK IsGi YdTqnrkQQGD .. . 43 ... D-1TVVFDRHPv-a p--ARyl GDM 159 VZVTK IcGiYgTqtrr GD . .. 46 .. .y-KiMLRDRHP i-a I lp--1vDM 149 HSVTK IaNi tTrhLd GE... 44 ... a-1TLIFDRHPi- - 1-AyinOsM 182 EBVTK IIkaklhvtrar .. . 41 .. .D-cwILMDRHL ls Qlsy 413 MHVTK IteiMd_qhrL N . . 49... .D-fvLVVDRHAv--gpwcat--rwRvydglr 149

Nucleotide/Nucleoside Binding Site Fig. 4. Comparison of the amino acid sequences of the yeast thymidylate kinase (TmpK) with the TKs of EHV-1, VZV (17), HSV (13), EBV (16) and Marmoset herpesvirus (MHV) (15). The sequence marked * below the MHV sequence is a corrected version (5). Identical matches or those involving conservative changes between the yeast TmpK and the herpes TKs are indicated by capital letters, and are boxed when the match is to a highly conserved amino acid of the herpes TKs. therefore searched for similarity in cell-encoded deoxycytidine and thymidylate kinases, the additional activities associated with the herpes TK. Comparison of the yeast thymidylate kinase, the only such enzyme for which sequence data was available (29), with 5 herpes TKs (Figure 4) revealed similarity in 2 regions of the respective proteins, beyond which any further similarity was undetectable. Significantly, these 2 regions include parts of the herpes TKs important to substrate binding and enzyme function, as discussed above. This is especially apparent where the yeast thymidylate kinase matches the proposed nucleotide/nucleoside binding site. Indeed, each residue identified as functionally important to the herpes TK has a corresponding identical or similar residue in the thymidylate kinase. The thymidylate kinase, however, has a tyrosine residue in place of the histidine in the -DRH- motif. This substitution may be conservative as there are examples of similar substitutions in highly conserved motifs of other protein families e.g. the bovine cAMP and cGMP dependent protein kinases have a tyrosine in the -HRD- motif (56,57).

11313 Nucleic Acids Research

Salvage dT Thymidine N pathwav + ATP kiaeThymidylate kinase dTMP + ATP - 4dTDP -4dTTP de novo dMP Thyidlate nathwav + folate synthetase

Ficg. 5. Salvage and de novo pathways of thymidine triphosphate synthesis. The proposed relationship of the herpesvirus TKs to the thymidylate kinase is shown by the broken line.

There are other similarities which match features characteristic of the herpes TKs, one of the most striking being the glutamate residue preceding the glycine loop of the nucleotide binding site. However, one mismatch which deserves mention is the arginine instead of the second glycine in the glycine loop. This may again reflect the broad nucleotide and nucleoside substrate range of the herpeS TKs compared with, in this case, the thymidylate kinase. As yeast does not possess a TIC activity of its own, this similarity can only reflect the thymidylate kinase activity shared with the herpes TIC. This raises the possibility that the herpes TIC is derived from a cell-encoded thymidylate kinase. If the original herpes enzyme was solely a thymidylate kinase, then the TKC activity must have evolved after its incorporation into the herpes lineage. As thymidylate kinase catalyses the next step after TIC in the pathway of thymidine triphosphate biosynthesis (Figure 5), this may be an example of consecutive steps in a biochemical pathway evolving in the opposite direction to that of the pathway. Although this process, termed retroevolution, was originally proposed by Horowitz (58) to involve gene duplication followed by divergence of function, the development of the multifunctional herpes TK from a monofunctional thymidylate kinase is similar in principle. As thymidylate is the convergent product of the de novo and salvage pathways of thymidine synthesis, development of a TK activity is not just one step back in the pathway, but makes possible an alternative source of thymidine in infected cells.

11314 Nucleic Acids Research

Confirmation of this proposed evolutionary relationship between the herpes TK and the cell thymidylate kinase requires more information. The protein sequence of thymidylate kinase from other sources, especially vertebrates, will determine whether the yeast protein is representative. The biochemical properties and activities of vertebrate thymidylate kinases will need to be characterised and compared with those of the herpesvirus enzyme. In particular, demonstration of the complete absence of TK activity in vertebrate thymidylate kinases would help to establish whether this activity was a novel herpes development. Finally, further resolution of the structure of the active site(s) for the viral TK and thymidylate kinase activites may show the changes involved in the evolution of a multifunctional enzyme such as the herpes TK.

ACKNOWLEDGENENTS This work was supported by grants from the Australian Research Council and the New South Wales State Cancer Council. G.R.R. was a recipient of a Commonwealth Postgraduate Research Award. We thank Nigel Farrar and Dr Norma Scott for criticism of the manuscript.

*To whom correspondence should be addressed

REFERENCES 1. Keir,H.M. and Gold,E. (1963) Biochim. Biophys. Acta 72, 263- 276. 2. Morrison,J.M. and Keir,H.M. (1968) J. Gen. Virol. 3, 337- 347. 3. Cohen,G.H. (1972) J. Virol. 9, 408-418. 4. Henry,B.E., Glaser,R., Hewetson,J. and O'Callaghan,D.J. (1978) Virology 89, 262-271. 5. Kit,S. (1985) Microbiol. Sci. 2, 369-375. 6. Bodemer,W., Niller,H.H., Nitsche,N., Scholz,B. and Fleckenstein,B. (1986) J. Virol. 60, 114-123. 7. Thompson,R., Honess,R.W., Taylor,L., Morran,J. and Davison, A.J. (1987) J. Gen. Virol. 68, 1449-1455. 8. McGeoch,D.J. and Davison,A.J. (1986) Nucleic Acids Res. 14, 1765-1777. 9. Michelson,S., Tardy-Panit,M. and Barzu,O. (1984) Virology 134, 259-268. 10. McGeoch,D.J., Dalrymple,M.A., Dolan,A., McNab,D., Perry, L.J., Taylor,P. and Challberg,M.D. (1988) J. Virol. 62, 444-453. 11. Crute,J.J., Mocarski, E.S. and Lehman, I.R. (1988) Nucleic Acids Res. 14, 6585-6596. 12. McKnight,S.L. (1980) Nucleic Acid Res. 8, 5949-5963.

11315 Nucleic Acids Research

13. Wagner, M.J., Sharp,J.A. and Summers,W.C. (1981) Proc. Natl. Acad. Sci. USA 78, 1441-1445. 14. Swain,M.A. and Galloway,D.A. (1983) J. Virol. 46, 1045-1050. 15. Otsuka,M. and Kit,S. (1984) Virology 135, 316-330. 16. Baer,R., Bankier,A.T., Biggin,M.D., Deininger,P.L., Farrell, P.J., Gibson,T.J., Hatful,G., Hudson,G.S., Satchwell,S.C., Seguin,C., Tuffnell,P.S. and Barrell,B.G. (1984) Nature 310, 207-211. 17. Davison,A.J. and Scott,J.E. (1986) J. Gen. Virol. 67, 1759-1816. 18. Boyle,D.B., Coupar,B.E.H., Gibbs,A.J., Seigman,L.J. and Both,G.W. (1987) Virology, 156, 355-365. 19. Walker,J.E., Saraste,M., Runswick,M.J. and Gray,N.J. (1982) EMBO J., 1, 945-951. 20. Gentry,G.A. (1985) Proc. Nat. Acad. Sci. USA, 82, 6815-6819. 21. Fry,D.C., Kuby,S.A. and Mildvan,A.C. (1986) Proc. Natl. Acad. Sci. USA 83, 907-911. 22. Jamieson,A.T. and Subak-Sharpe,J.H. (1974) J. Gen. Virol. 24, 481-492. 23. Chen,M.S. and Prusoff,W.H. (1978) J. Biol. Chem. 253, 1325-1327. 24. Chen,M.S., Summers,W.P., Walker,J., Summers,W.C. and Prusoff,W.H. (1979) J. Virol. 30, 942-945. 25. Allen,G.P., McGowan,J.J., Gentry,G.A. and Randall,C.C. (1978) J. Virol. 28, 361-367. 26. Allen,G.P., McGowan,J.J., Bryans,J.T., Randall,C.C. and Gentry,G.A. (1978) Virology 90, 351-359. 27. Allen,G.P., McGowan,J.J., Randall,C.C., Mancini,W., Cheng, Y-C. and Gentry,G.A. (1979) Virology 92, 367-374. 28. Kit,S., Ichimura,H. and de Clercq,E. (1987) Antiviral Res. 7, 53-67. 29. Jong,A.Y.S., Kuo,C. and Campbell,J.L. (1984) J. Biol. Chem. 259, 11052-11059. 30. Robertson,G.R. and Whalley,J.M. (1985) In Della Porta,A.J. (ed.), Viruses of Veterinary Importance in South East Asia and the Western Pacific, Academic Press, Sydney, pp 471-472. 31. Graham,F.L. and Van der Eb,A.J. (1973) Virology 52, 456-467. 32. Sanger,F., Nicklen,S. and Coulson,A.R. (1977) Proc. Natl. Acad. Sci. USA 74, 5463-5467. 33. Dale,R.M.K., McClure,B.A. and Houchins,J.P. (1985) Plasmid 13, 31-40. 34. Breen,E.J., Browne,L.H., Glue,L. and Williams,K.L. (1988) CABIOS 4, 217. 35. Fristensky,B., Lis,J. and Wu,R. (1982) Nucleic Acids Res. 10, 6451-6463. 36. Lipman,D.J. and Pearson,W.R. (1985) Science 227, 1435-1441 37. Davison,A.J. and Wilkie,N.M. (1983) J. Gen. Virol. 64, 1927-1942. 38. Bello,L.J., Whitbeck,J.C. and Lawrence,W.C. (1987) J. Virol. 61, 4023-4025. 39. Whalley,J.M., Robertson,G.R. and Davison,A.J. (1981) J. Gen. Virol. 57, 307-323. 40. McGeoch,D.J., Dalrymple,M.A., Davison,A.J., Dolan,A., Frame, M.C., McNab,D., Perry,L.J., Scott,J.E. and Taylor,P. (1988) J. Gen. Virol. 31, 120-128. 41. Gompels,U. and Minson,A. (1986) Virology 153, 230-247. 42. McKnight,S.L. and Tjian,R. (1986) Cell 46, 795-805.

11316 Nucleic Acids Research

43. Littler,E., Zeuthen,J., McBride,A.A., Trost Sorensen,E., Powell,K.L., Walsh-Arrand,J.E. and Arrand,J.R. (1986) EMBO J. 5, 1959-1966. 44. Halpern,M.E. and Smiley,J.R. (1984) J. Virol. 50, 733-738. 45. Haarr,L. and Flatmark,T. (1987) J. Gen. Virol. 68, 2817-2829. 46. Kwoh,T.J. and Engler,J.A. (1984) Nucleic Acids Res. 9, 3959- 3971. 47. Pai,E.F., Sachsenheimer,W., Schirmer,R.H. and Schulz,G.E. (1977) J. Mol. Biol. 114, 37-45. 48. Liu,Q. and Summers,W.C. (1988) Virology 163, 638-642. 49. Darby,G., Larder,B.A. and Inglis,M.M. (1986) J. Gen. Virol. 67, 753-758. 50. Fyfe,J.A., McKee,S.A. and Keller,P.M. (1983) Mol. Pharmacol. 24, 316-333. 51. Chen,M.S., Walker,J. and Prusoff,W.H. (1979) J. Biol. Chem. 254, 10747-10753. 52. Dever,T.E., Glynias,M.J. and Merrick,W.C. (1987) Proc. Natl. Acad. Sci. USA 84, 1814-1818. 53. Kolb,E., Hudson,P.J. and Harris,J.I. (1980) Eur. J. Biochem. 108, 587-597. 54. Leader,P. (1988) J. Gen. Virol. 69, 1441-1464. 55. Inglis,M.M. and Darby,G. (1987) J. Gen. Virol. 68, 39-46. 56. Shoji,S., Parmelee,D.C., Wade,R.D., Kumar,S., Ericsson,L.H., Walsh,K.A., Neurath,H., Long,G.L., Demaille,J.G., Fischer,E. and Titani,K. (1981) Proc. Natl. Acad. Sci. USA 78, 848-851. 57. Takio,K., Wade,R.D., Smith,S.B., Krebs,E.G., Walsk,K.A. and Titani,K. (1984) Biochemistry 23, 4207-4218. 58. Horowitz,N.H. (1945) Proc. Natl. Acad. Sci. USA 31, 153-156

11317