
Proc. Natl. Acad. Sci. USA
Vol. 89, pp. 8847-8851, September 1992
Biochemistry

A distinct picornavirus group identified by sequence analysis

TIMO HYYPIÄ*†, CHRISTINE HORSNELL‡, MARITA MAARONEN*, MAHBOOB KHAN‡, NISSE KALKKINEN§, PETRI AUVINEN*, LEENA KINNUNEN¶, AND GLYN STANWAY‡

*Department of Virology, University of Turku, Kiinamyllynkatu 13, SF-20520 Turku, Finland; ‡Department of Biology, University of Essex, Wivenhoe Park, Colchester CO4 3SQ, United Kingdom; §Institute of Biotechnology, University of Helsinki, Karvaamonkuja 3, SF-00380 Helsinki, Finland; and ¶Molecular Biology Unit, National Public Health Laboratory, Mannerheimintie 166, SF-00300 Helsinki, Finland

Communicated by Robert H. Purcell, June 5, 1992

Abbreviations: HAV, hepatitis A virus; UTR, untranslated region; nt, nucleotide(s); FMDV, foot-and-mouth disease virus.
†To whom reprint requests should be addressed.
‖The sequence reported in this paper has been deposited in the GenBank data base (accession no. L00675).
The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

ABSTRACT Although echovirus 22 is presently classified as a member of the enterovirus group in the family of picornaviruses, it has been reported to have exceptional biological properties when compared with other representatives of the group. We have determined the complete nucleotide sequence of the echovirus 22 (Harris strain) genome, which appears to be significantly different from all the other studied picornaviruses. However, the organization of the genome [7339 nucleotides, excluding the poly(A) tract] is similar to that of previously sequenced picornaviruses. This genome includes a 5' untranslated region, relatively well-conserved when compared with aphtho- and cardioviruses, followed by an open reading frame coding for a 2180-amino acid-long polyprotein. The amino termini of capsid polypeptides VP1 and VP3 were determined by direct sequencing, and the other proteolytic cleavage sites in the polyprotein were predicted by comparison with other picornavirus proteins. The amino acid identities of echovirus 22 polypeptides with the corresponding proteins of other picornaviruses are in the 14-35% range, similar to those percentages seen when representatives of the five picornavirus groups (entero-, rhino-, cardio-, aphtho-, and hepatoviruses) are compared. Our results suggest that echovirus 22 belongs to an independent group of picornaviruses.

Picornaviruses are small, naked, icosahedral particles containing a single-stranded RNA genome with mRNA polarity. This group includes a number of important human and animal pathogens and has been studied extensively. Currently picornaviruses are divided into five main groups, largely on the basis of physicochemical properties and pathogenesis in the host. These groups are rhinoviruses, aphthoviruses [foot-and-mouth disease viruses (FMDVs)], cardioviruses (e.g., encephalomyocarditis virus), hepatoviruses [hepatitis A virus (HAV)], and enteroviruses. The latter are subgrouped into polioviruses, coxsackie A and B viruses, echoviruses, and enteroviruses 68-71. Recently echoviruses have been shown to contain genetically distinct subgroups, the major one closely resembling coxsackie B viruses and the minor one, including echovirus 22, exhibiting specific molecular characteristics (1, 2). Echovirus 22 is also known to have other exceptional biological properties when compared with the members of the enterovirus group (3-7).

Nucleotide-sequence analysis of picornaviruses has cast light on the classification of the family (8). In most cases, the nucleotide and amino acid identities among the group members correspond well with the previous divisions. For instance, polioviruses and coxsackie B viruses have been shown to be quite homogeneous subgroups and also relatively closely related. On the other hand, HAV, previously classified as enterovirus 72, shows only minimal sequence homology with enteroviruses and, indeed, any picornavirus studied so far (9). Classification of picornaviruses on the basis of genetic properties may soon become essential because their detection in the disease context can be expected to be done increasingly by the PCR, which uses conserved and specific regions of the genome (10). Molecular characteristics are also essential when detailed pathogenetic mechanisms of viral diseases are concerned. We show here that echovirus 22 has a distinctive genome when compared with the presently known picornaviruses, suggesting that it should be classified in an independent group among these viruses.‖

MATERIALS AND METHODS

Echovirus 22 (EV22; Harris strain) was obtained from the American Type Culture Collection (ATCC). It was plaque-purified three times and shown to be neutralized by an EV22-specific antiserum (ATCC). The propagated virus was purified in successive sucrose and CsCl gradients, as described (2). Standard methods, including negative staining by 2% potassium phosphotungstate or 1% aqueous uranyl acetate, were used in electron microscopy. Virus RNA was purified and cloned as described (2), and two cDNA clones, together representing almost the entire genome, were sequenced by the dideoxynucleotide chain-termination method. Three strategies were used: double-stranded sequencing after subclone generation by the ExoIII deletion method (11, 12), direct sequencing with oligonucleotide primers (13), and sequencing of templates generated by cloning restriction fragments into M13mp18/19 vectors (Pharmacia). Approximately 98% of the sequence, including the entire capsid region, was determined on both strands, and all the nucleotides were sequenced at least twice. Sequencing of 75 nucleotides (nt) at the 5' terminus was done by primer extension with virus RNA as a template (14). For direct amino-terminal sequence analysis, the capsid polypeptides from purified virus were separated in an SDS/12.5% polyacrylamide gel, electroblotted on a poly(vinylidene difluoride) membrane (15), and subjected to sequence analysis in a gas/pulsed liquid sequencer equipped with an on-line phenylthiohydantoin-amino acid analyzer (16). Sequences were assembled by using Genetics Computer Group (17) and Staden-Plus (Amersham) software; the analysis was carried out by programs GAP, PRETTY, and WORDSEARCH from the Genetics Computer Group package. Picornavirus sequences used in the computer comparisons have been presented in a recent review article (8). Dendrograms were derived by using the KITSCH package of the PHYLIP suite of programs (18).

RESULTS

Structure and Organization of the Genome. Electron microscopy of purified EV22 revealed spherical particles 28 ±


1 nm (n = 58) in diameter with the appearance of a picornavirus (Fig. 1). Complete nucleotide-sequence analysis showed that the length of the EV22 genome (Fig. 2), excluding the poly(A) tract, is 7339 nt, which is in the range found for rhinoviruses and enteroviruses. The base composition (adenine = 32%, cytosine = 19%, guanine = 20%, and uracil = 29%) is relatively similar to rhinoviruses (in HRV14, G+C = 41%) and HAV (G+C = 38%) rather than to enteroviruses (G+C = ~46%). As seen in most other picornaviruses, the occurrence of the dinucleotide CG is very low. The predicted 5' untranslated region (UTR) extends to nt 709 and is followed by an open reading frame coding for a 2180-amino acid-long polyprotein. Codon usage analysis shows a preponderance of XXU and XXA (65%), similar to that of rhinoviruses and HAV. The length of the 3' UTR is 90 nt.

FIG. 1. Electron micrograph of purified echovirus 22. (×300,000.)

The codon that initiates the open reading frame was identified by two main criteria: the occurrence of a 9-nucleotide stretch of pyrimidine residues located 19 nt upstream and the methionine codon being in an optimal Kozak context (AXXAUGG). A polypyrimidine motif is found in all studied picornaviruses, and that seen in EV22 is very similar in sequence, and also in spacing from the AUG, to that of cardioviruses, aphthoviruses, and HAV. Picornavirus VP4 polypeptides studied so far have an amino-terminal glycine, and they are myristoylated (8). The first glycine in the EV22 polyprotein is located 13 residues from the predicted amino terminus and appears within a myristoylation consensus motif. It is, therefore, possible that a short leader is present in EV22. The putative leader sequence is more reminiscent of the predicted one in HAV, where apparently the true amino terminus of VP4 is seven amino acids downstream, than of the long leader polypeptides found in aphthoviruses and cardioviruses (19).

The genome organization of EV22 is analogous to that of other picornaviruses because amino acid identities can be observed throughout the polyprotein, and this makes it possible to suggest most of the cleavage sites in the polyprotein. For direct detection of some of the cleavage sites, the capsid proteins in the purified virus were separated by SDS/PAGE, and the proteins were electroblotted on a poly(vinylidene difluoride) membrane. Protein bands corresponding to apparent molecular masses of 30 kDa, 30.5 kDa, and 38 kDa were subjected to amino-terminal sequence analysis. The 30-kDa and 30.5-kDa proteins gave the results APNGKKKN and NSWGSQMDL, respectively, unambiguously localizing these structures on the deduced sequence (VP3 and VP1). The largest polypeptide repeatedly gave no result in Edman degradation, and no signal representing the predicted amino terminus of VP2 was seen in any bands analyzed. Tryptic peptide analysis of the 38-kDa band generated sequences localized in the predicted VP4 region (VESVGNEIGGNLLTK), in the VP2 region (NVVQATTTVNTTNL), and spanning the predicted VP2/VP4 cleavage site (VADDASNILGPNXFATTA). Together with the inability to find any trace of VP2 sequences in the other capsid polypeptide bands, this result strongly suggests that no VP4/VP2 cleavage occurs, and VP0 is the predominant species in mature virions.

In the polypeptide region homologous to the VP2 in other picornaviruses, there is an apparent deletion in the "puff" region (20) when compared with entero-, rhino-, and cardioviruses, making the EV22 polypeptide structurally more similar to those of aphthoviruses and HAV. VP3 is larger than in other picornaviruses, mainly from a hydrophilic extension of ~20 amino acids at its amino terminus. An interesting finding in the VP1 polypeptide is an RGD sequence close to its carboxyl terminus. Such a motif is known to be involved in cell attachment in various systems (21).

Of the nonstructural proteins, some (e.g., 2C, 3D) can easily be aligned with the corresponding polypeptides of other picornaviruses, whereas others (2A, 3A) exhibit only minimal identity (Table 1). 2A is an important protein to consider because the strategy of its action divides picornaviruses into two categories: in enteroviruses and rhinoviruses it is responsible for an autocatalytic cleavage activity directed to its amino terminus, whereas in aphthoviruses and cardioviruses, 2A cleaves at its carboxyl terminus (8). However, deducing the mode of action of EV22 2A from the predicted sequence is not possible. The function of polypeptides 2B and 3A in virus replication is still poorly understood, and they have a relatively low degree of conservation. Although the function of polypeptide 2C has not yet been completely elucidated, it is one of the most conserved proteins among picornaviruses. As this protein aligned well with other picornaviruses, polypeptide 2C was used in detailed comparative analyses. The localization of polypeptide 3B (VPg) was predicted by the genomic position, presence of the conserved tyrosine (Y) as the third amino acid, and comparisons with other picornavirus VPgs. The 3C protease contains the conserved sequence GXCGG, proposed to form part of the active site. Previously observed, highly conserved motifs (KDELR, YGDD, FLKR) are found in the 3D polymerase.

Relationship of Echovirus 22 to Other Picornaviruses. The nucleotide identities of the 5' and 3' UTRs and the amino acid identity of the predicted polypeptides were compared with representatives of picornavirus groups (Table 1). In parts of the 5' UTR, EV22 RNA exhibits a high degree of sequence conservation relative to aphtho- and cardioviruses; for instance, there are several blocks of at least 10 identical nt. These viruses, together with HAV, share a similar predicted 5' UTR secondary structure, which is distinct from that of entero- and rhinoviruses (8). The identity of the EV22 polypeptides with corresponding proteins of other picornaviruses varied between 14 and 35%; this value is of the same order as that seen when HAV is compared with other picornavirus groups (9).

Because the most pronounced identity was seen in polypeptide 2C, this protein was analyzed in more detail. When its amino acid sequence was compared with the National Biomedical Research Foundation data base, this EV22 polypeptide was found to be more similar to the 2C protein of 22 other sequenced picornaviruses than to any other proteins in the data base (data not shown). In a dendrogram illustrating the relationships between picornaviruses, EV22 and HAV join to the other major groups at an amino acid-difference level exceeding 70% when polypeptide 2C is used for comparison (Fig. 3).
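The composition figures quoted in the Results (base percentages, G+C content, scarcity of the CG dinucleotide, and the share of codons ending in U or A) are simple counts over the sequence. The following is a minimal sketch of how such values could be computed, not the Genetics Computer Group procedure used in the study; the function names and the toy sequence are illustrative assumptions, and the observed/expected CG ratio assumes independent base frequencies.

```python
from collections import Counter


def _as_rna(seq):
    """Upper-case the sequence and read cDNA 'T' as RNA 'U'."""
    return seq.upper().replace("T", "U")


def base_composition(seq):
    """Return the percentage of A, C, G and U in the sequence."""
    counts = Counter(_as_rna(seq))
    total = sum(counts[b] for b in "ACGU")
    return {b: 100.0 * counts[b] / total for b in "ACGU"}


def gc_percent(seq):
    """G+C content in percent."""
    comp = base_composition(seq)
    return comp["G"] + comp["C"]


def cg_obs_exp(seq):
    """Observed/expected ratio for the CG dinucleotide.

    Expected count assumes independent bases: f(C) * f(G) * (N - 1).
    """
    rna = _as_rna(seq)
    n = len(rna)
    observed = sum(1 for i in range(n - 1) if rna[i:i + 2] == "CG")
    expected = (rna.count("C") / n) * (rna.count("G") / n) * (n - 1)
    return observed / expected if expected else 0.0


def third_position_au_percent(orf):
    """Percentage of codons in an open reading frame that end in A or U."""
    rna = _as_rna(orf)
    usable = len(rna) - len(rna) % 3          # ignore a trailing partial codon
    codons = [rna[i:i + 3] for i in range(0, usable, 3)]
    return 100.0 * sum(c[2] in "AU" for c in codons) / len(codons)


# Toy open reading frame, invented for illustration (not EV22 sequence).
toy_orf = "AUGGCAUUAGCUAAAUCGUUUGGA"
print(base_composition(toy_orf))
print(gc_percent(toy_orf), cg_obs_exp(toy_orf))
print(third_position_au_percent(toy_orf))
```

Applied to the deposited sequence (GenBank accession no. L00675), the same counts would be expected to reproduce the percentages reported above, provided the poly(A) tract is excluded as in the text.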

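Several findings in the Results rest on locating short sequence patterns: the AXXAUGG context with an upstream pyrimidine stretch used to pick the initiation codon, and the protein motifs RGD, GXCGG, and YGDD. A minimal sketch of such a scan with regular expressions follows. It is an illustration, not the method of the paper: the function names, the 30-nt upstream window, and the test strings are assumptions, and the RGD pattern follows the consensus RGDM/LXXL quoted in the Discussion.

```python
import re

# Motifs named in the text; "X" (any residue) becomes "." in the pattern.
PROTEIN_MOTIFS = {
    "RGD consensus (RGDM/LXXL)": r"RGD[ML]..L",
    "3C active-site context (GXCGG)": r"G.CGG",
    "3D polymerase YGDD": r"YGDD",
}


def find_motifs(protein, motifs=None):
    """Map each motif name to the 1-based positions where it starts."""
    motifs = PROTEIN_MOTIFS if motifs is None else motifs
    hits = {}
    for name, pattern in motifs.items():
        # A lookahead keeps overlapping occurrences.
        hits[name] = [m.start() + 1
                      for m in re.finditer(f"(?={pattern})", protein)]
    return hits


def start_codon_candidates(rna, tract_len=9, window=30):
    """Return 1-based positions of AUG codons that sit in an AXXAUGG
    context and have a run of at least `tract_len` pyrimidines within
    `window` nt upstream (the window size is an assumed parameter)."""
    rna = rna.upper().replace("T", "U")
    tract = re.compile(f"[CU]{{{tract_len},}}")
    found = []
    for m in re.finditer(r"(?=A..AUGG)", rna):
        aug = m.start() + 3                     # 0-based index of the A in AUG
        upstream = rna[max(0, aug - window):aug]
        if tract.search(upstream):
            found.append(aug + 1)
    return found


# Invented test strings (not EV22 sequence).
print(find_motifs("AAARGDMANLKK"))
print(start_codon_candidates("GGUUUCCUUUCUGAGAGAGACCAUGGCA"))
```

Both criteria are deliberately loose here; a real analysis would also check the spacing between the pyrimidine tract and the AUG, which the text gives as 19 nt for EV22.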
[FIG. 2 sequence listing omitted: nucleotide sequence of the EV22 cDNA with the predicted polyprotein translation; GenBank accession no. L00675.]

FIG. 2. Nucleotide sequence of cDNA representing echovirus 22 genomic RNA. The predicted amino acid sequence of the polyprotein and suggested cleavage sites, determined by alignment with known picornavirus polypeptides, are indicated. The amino-terminal sequences of VP3 and VP1 were verified by directly sequencing the polypeptides.

Table 1. Comparison of nucleotide and amino acid (coding region) identities of echovirus 22 with representatives of other picornavirus groups

Identity with echovirus 22, %
Virus     5'UTR  VP4  VP2  VP3  VP1  2A  2B  2C  3A  3B  3C  3D  3'UTR
Polio 1     38    24   22   19   22  14  21  30  15  35  24  27    36
HRV1B       35    20   22   17   15  15  19  29  19  30  23  26    48
FMDV        46    20   23   20   23   -  15  29  14  25  18  25    29
EMCV        43    20   24   22   20  18  23  29  20  30  18  25    34
HAV         36    19   18   23   21  17  27  25  29  35  19  27    44

These identities were calculated by using the GAP program of the Genetics Computer Group package. Polio 1, poliovirus type 1; HRV1B, human rhinovirus 1B; FMDV, FMDV strain O1K; EMCV, encephalomyocarditis virus. In the 5' UTR optimal alignments were used for comparison.

This is clearly a lower degree of relatedness than, for example, between the major enterovirus/rhinovirus group (51%), and this value also slightly exceeds the value seen between these two groups and the aphtho- and cardioviruses (68%). Very similar results are obtained when the region corresponding to the VP2 polypeptide in other picornaviruses is used for comparison (data not shown).

FIG. 3. Dendrogram of the relationships between picornavirus groups based on comparisons of amino acid differences of 2C polypeptides (horizontal axis: percentage difference). FMDVO, FMDV strain O1K; FMDVA, FMDV strain A-10-61; TMEV, Theiler's murine encephalomyelitis virus; EMCV, encephalomyocarditis virus; HRV, human rhinovirus; Polio, poliovirus; CAV, coxsackie A virus; CBV, coxsackie B virus; SVDV, swine vesicular disease virus; EV, echovirus; eV, enterovirus; BEV, bovine enterovirus.

DISCUSSION

Classification of viruses is mainly based on virion morphology, the nature of the nucleic acid, and replication strategy. Other biochemical and antigenic properties are also used, and they are important in identification and grouping of individual virus types and strains. Optimal classification should combine the essential biological properties of the group and, in addition, give an adequate tool for diagnosis and treatment of clinical diseases. Introduction of diagnostic procedures based on nucleic acid sequences has recently increased the need for precise classification of viruses according to their molecular characteristics.

Echoviruses were discovered soon after the first tissue-culture techniques were introduced into laboratories (22). These viruses were isolated from human intestinal tract, as are polioviruses and coxsackieviruses, but they did not share the major pathogenic properties of these groups. Because the association with human disease was, at the time of their discovery, unknown, they were given the name ECHO (enteric, cytopathogenic, human, orphan) viruses. It was originally postulated that whenever an echovirus was established as the etiologic agent of a clinically distinct disease, it would be removed from the group (23). Later on, it became evident that individual echovirus types cannot be directly associated with individual illnesses but rather with a wide spectrum of clinical manifestations. This is why they still are classified together, forming the largest of the enterovirus subgroups. However, from the original 33 echovirus serotypes, types 1 and 8 were closely enough related to be combined, type 10 was shown to be a reovirus, and type 28 appeared to be a rhinovirus.

Echovirus serotypes 22 (including the prototype Harris virus) and 23 were isolated during a study of summer diarrhea in 1956 (3). Although these viruses shared the general properties found in the group, the cytopathic destruction of cell cultures was incomplete when compared with previously characterized echoviruses. It has been reported that cytologic changes caused by echovirus types 22 and 23 involve distinctive nuclear manifestations (4, 24) and that the replication of these viruses is not inhibited by 2-(α-hydroxybenzyl)benzimidazole, in contrast to other members of the subgroup (25). There is also evidence of exceptional secondary structure of EV22 RNA (6). Recently EV22 infection has been shown not to cause the host-cell protein synthesis shut-off (7, 26) characteristic of enteroviruses.

In addition to the biological characteristics, the epidemiology and pathogenesis of EV22 infections also differ when compared with the other echoviruses. Perhaps the most striking difference is that 60% of the EV22 isolations are from children of <1 yr of age (17% in all the echoviruses), and altogether 92% of the isolations are from children under 5 yr of age (5). From the isolation-positive cases, 29% had gastrointestinal symptoms (11% in all the echoviruses), and only 11% had symptoms indicating involvement of the central nervous system (56% in the entire echovirus subgroup).

Previous work has, thus, given evidence that EV22 differs biologically from other echoviruses. Our data indicate the molecular basis for these characteristics. Although clearly a picornavirus in terms of electron microscopy and genome organization, EV22 has a very divergent nucleotide and amino acid sequence. These characteristics locate it well outside the enterovirus group and, together with the other biological data, suggest that EV22 belongs to another picornavirus genus.

The diverse nature of the capsid proteins, when compared with other picornaviruses, made prediction of some cleavage sites relatively difficult, and, therefore, the VP2/VP3 and VP3/VP1 cleavages were established by directly analyzing the amino-terminal sequence of polypeptides VP3 and VP1. The amino-terminal sequence of VP2 protein could not be determined, and because VP4 can be myristoylated in picornaviruses, one explanation for lack of a signal is that no cleavage occurs between VP4/VP2 during maturation of EV22 virions. This alternative is strongly supported by the results from direct amino acid analysis of the tryptic peptides from the 38-kDa band, and these data suggest that capsid structure of EV22 differs from other picornaviruses in this respect. Sequence comparison gives some evidence of conserved amino acids in the capsid proteins of EV22 (data not shown), indicating that they probably share the typical three-dimensional structure found in picornaviruses, an eight-stranded antiparallel β-barrel. The unique amino-terminal extension of ~20 residues in VP3 contains eight positively charged amino acids and may have important functions in the virus structure. The observation of an RGD motif near the carboxyl terminus of EV22 VP1 may give some indication of the type of the cell receptor used by the virus. In aphthoviruses (FMDV) the carboxyl terminus of VP1 is known to project onto the virus surface and participate in receptor binding in conjunction with an RGD motif located elsewhere in the VP1 primary sequence but juxtaposed structurally (27). An RGD at the carboxyl terminus of VP1 has recently been implicated in receptor binding of coxsackievirus A9 (28, 29), and a comparison with strains of this virus and FMDV shows that the EV22 motif (RGDMANL) conforms well to a consensus that can be derived (RGDM/LXXL; ref. 30).

The 2A protein, known to be associated with a cis-acting proteolytic activity in other picornaviruses, does not conform in sequence to either of the two well-characterized types (31). In entero- and rhinoviruses, the 2A protein acts autocatalytically at its amino terminus and is also involved in cleaving the p220 component of the cap-binding complex, shutting off host-cell protein synthesis. This protein is similar to the picornaviral 3C protease, as it is also a member of the trypsin-like proteases in which the usual active-site serine has been replaced by cysteine. Such proteases have a catalytic triad made up of histidine, aspartate/glutamate, and serine/cysteine residues, the latter being found in the context GXCG, which is well-conserved between these proteins from a wide range of sources. EV22 2A is similar in size to that of entero- and rhinoviruses, and there are a few short regions of amino acid homology. However, the critical GXCG motif is absent, and the protein is unlikely to function in exactly the same manner. This protein is also dissimilar in sequence from the 2A protein of aphthoviruses and cardioviruses. In these viruses, 2A protein cleaves at its carboxyl terminus and does not participate in host-cell shut-off, which, in aphthovirus-infected cells, is accomplished by the proteolytic activity of the leader. The cardiovirus leader protein has no known proteolytic function, and this virus does not shut off host-cell synthesis by inactivating the cap-binding complex, as is also the case in EV22 infection (7, 26). Further insights into the role of 2A protein in EV22 replication must await direct analysis of its proteolytic capability.

In view of the highly divergent nature of the rest of the genome, the similarity of the 5' UTR to cardio- and aphthoviruses is somewhat unexpected. The observation raises the possibility that EV22 is a recombinant virus or, alternatively, that the 5' UTR evolves more slowly than other genome regions. Predictions suggest that similarity of this region is reflected in secondary-structural terms, and this may play a role in the observed viral properties. For instance, because the 5' UTR is vital in translating the virus polyprotein, this similarity may explain the efficient translation of EV22 RNA in rabbit reticulocyte lysate (7) also seen in cardio- and aphthoviruses, and contrasting with the poor translation of ...

In conclusion, our data demonstrate that EV22 has a structure and genome organization corresponding to other picornaviruses. However, its similarity to other sequenced members of this virus family is low. This similarity is approximately at the same level as that observed for HAV (9), originally classified as an enterovirus but now assigned to the hepatovirus genus of picornaviridae. The molecular data presented here suggest that classification of EV22 should also be reconsidered. Although this virus is already known to be found in young children and is often associated with gastroenteritis, the role of EV22 virus in clinical disease needs careful reevaluation against the background of its distinct molecular properties. As soon as these data and the identity and characteristics of possible relatives of EV22 are known, it would be possible to give the putative additional genus a nomenclature that corresponds to that of other picornavirus groups.

We thank Dr. Carl-Henrik von Bonsdorff for his advice in electron microscopy and Drs. Pekka Halonen and Tuija Pöyry for their comments during manuscript preparation. The study was supported by grants from the Sigrid Juselius Foundation, the Turku University Foundation, and the Medical Research Council.

1. Auvinen, P., Stanway, G. & Hyypiä, T. (1989) Arch. Virol. 104, 175-186.
2. Auvinen, P. & Hyypiä, T. (1990) J. Gen. Virol. 71, 2133-2139.
3. Wigand, R. & Sabin, A. B. (1961) Arch. Gesamte Virusforsch. 11, 224-247.
4. Shaver, D. N., Barron, A. L. & Karzon, D. T. (1961) Proc. Soc. Exp. Biol. Med. 106, 648-652.
5. Grist, N. R., Bell, E. J. & Assaad, F. (1978) Prog. Med. Virol. 24, 114-157.
6. Seal, L. A. & Jamison, R. M. (1984) J. Virol. 50, 641-644.
7. Coller, B.-A. G., Chapman, N. M., Beck, M. A., Pallansch, M. A., Gauntt, C. J. & Tracy, S. M. (1990) J. Virol. 64, 2692-2701.
8. Stanway, G. (1990) J. Gen. Virol. 71, 2483-2501.
9. Cohen, J. I., Ticehurst, J. R., Purcell, R. H., Buckler-White, A. & Baroudy, B. M. (1987) J. Virol. 61, 50-59.
10. Hyypiä, T., Auvinen, P. & Maaronen, M. (1989) J. Gen. Virol. 70, 3261-3268.
11. Murphy, G. & Kavanagh, T. A. (1988) Nucleic Acids Res. 16, 5198.
12. Henikoff, S. (1984) Gene 28, 351-359.
13. Zagursky, R. J., Baumeister, K., Lomax, N. & Berman, M. L. (1985) Gene Anal. Tech. 2, 89-94.
14. Huovilainen, A., Kinnunen, L., Ferguson, M. & Hovi, T. (1988) J. Gen. Virol. 69, 1941-1948.
15. Speicher, D. W. (1989) in Techniques in Protein Chemistry, ed. Hugli, T. E. (Academic, San Diego), pp. 24-35.
16. Kalkkinen, N. & Tilgmann, C. (1988) J. Protein Chem. 7, 242-243.
17. Devereux, J., Haeberli, P. & Smithies, O. (1984) Nucleic Acids Res. 12, 387-395.
18. Felsenstein, J. (1985) Evolution 39, 783-791.
19. Palmenberg, A. C. (1989) in Molecular Aspects of Picornavirus Infection and Detection, eds. Semler, B. L. & Ehrenfeld, E. (Am. Soc. Microbiol., Washington), pp. 211-241.
20. Rossmann, M. G., Arnold, E., Erickson, J. W., Frankenberger, E. A., Griffith, J. P., Hecht, H.-J., Johnson, J. E., Kamer, G., Luo, M., Mosser, A. G., Rueckert, R. R., Sherry, B. & Vriend, G. (1985) Nature (London) 317, 145-153.
21. Ruoslahti, E. & Pierschbacher, M. D. (1987) Science 238, 491-493.
22. Melnick, J. L. (1965) in Viral and Rickettsial Infections of Man, eds. Horsfall, F. L., Jr., & Tamm, I. (Lippincott, Philadelphia), 4th Ed., pp. 513-545.
23. Committee on the ECHO viruses (1955) Science 122, 1187-1188.
24. Jamison, R. M. (1974) Arch. Gesamte Virusforsch. 44, 184-194.
25. Eggers, H. J. & Tamm, I. (1961) J. Exp. Med. 113, 657-682.
26. Coller, B.-A. G., Tracy, S. M. & Etchison, D. (1991) J. Virol. 65, 3903-3905.
27. Acharya, R., Fry, E., Stuart, D., Fox, G., Rowlands, D.
& Brown, entero- and rhinovirus RNA in this system. As suggested F. (1989) Nature (London) 337, 709-716. previously by hybridization experiments (7), the EV22 5' 28. Chang, K. H., Auvinen, P., Hyypia, T. & Stanway, G. (1989) J. UTR lacks the poly(C) tract seen in aphthoviruses and Gen. Virol. 70, 3269-3280. encaphalomyocarditis virus but absent from Theiler's murine 29. Roivainen, M., Hyypia, T., Piirainen, L., Kalkkinen, N., Stanway, 5' G. & Hovi, T. (1991) J. Virol. 65, 4735-4740. encephalitis virus, another virus that shares the common 30. Chang, K. H., Day, C., Walker, J., HyypiA, T. & Stanway, G. UTR structure. The 3' UTR has been shown (2) to have an (1992) J. Gen. Virol. 73, 621-626. exceptional predicted secondary structure when compared 31. Harris, K. S., Hellen, C. U. T. & Wimmer, E. (1990) Sem. Virol. 1, with enteroviruses. 323-333. Downloaded by guest on October 6, 2021