Quick viewing(Text Mode)

Comparative Genetic Analysis of Genomic DNA Sequences of Two Human Isolates of Tanapox Virus

Comparative Genetic Analysis of Genomic DNA Sequences of Two Human Isolates of Tanapox Virus

View metadata, citation and similar papers at core.ac.uk brought to you by CORE

provided by UNL | Libraries

University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln

Public Health Resources Public Health Resources

2007

Comparative Genetic Analysis of Genomic DNA Sequences of Two Human Isolates of

Steven H. Nazarian Biotherapeutics Research Group, Robarts Research Institute, and Department of Microbiology and Immunology, University of Western Ontario, London, Ontario N6G 2V4, Canada

John W. Barrett Biotherapeutics Research Group, Robarts Research Institute, and Department of Microbiology and Immunology, University of Western Ontario, London, Ontario N6G 2V4, Canada

A. Michael Frace Biotechnology Core Facility Branch, Division of Scientific Resources, National Center for Preparedness, Detection, and Control of Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, GA 30329, USA

Melissa Olsen-Rasmussen Biotechnology Core Facility Branch, Division of Scientific Resources, National Center for Preparedness, Detection, and Control of Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, GA 30329, USA

Marina Khristova FBiotechnologyollow this and Cor additionale Facility works Branch, at: Divisionhttps://digitalcommons.unl.edu/publichealthr of Scientific Resources, National Centeresour forces Prepar edness, Detection, and Control of Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, GA 30329, Par tUSA of the Public Health Commons

See next page for additional authors Nazarian, Steven H.; Barrett, John W.; Frace, A. Michael; Olsen-Rasmussen, Melissa; Khristova, Marina; Shaban, Mae; Neering, Sarah; Li, Yu; Damon, Inger K.; Esposito, Joseph J.; Essani, Karim; and McFadden, Grant, "Comparative Genetic Analysis of Genomic DNA Sequences of Two Human Isolates of Tanapox virus" (2007). Public Health Resources. 61. https://digitalcommons.unl.edu/publichealthresources/61

This Article is brought to you for free and open access by the Public Health Resources at DigitalCommons@University of Nebraska - Lincoln. It has been accepted for inclusion in Public Health Resources by an authorized administrator of DigitalCommons@University of Nebraska - Lincoln. Authors Steven H. Nazarian, John W. Barrett, A. Michael Frace, Melissa Olsen-Rasmussen, Marina Khristova, Mae Shaban, Sarah Neering, Yu Li, Inger K. Damon, Joseph J. Esposito, Karim Essani, and Grant McFadden

This article is available at DigitalCommons@University of Nebraska - Lincoln: https://digitalcommons.unl.edu/ publichealthresources/61 Virus Research 129 (2007) 11–25

Comparative genetic analysis of genomic DNA sequences of two human isolates of Tanapox virusଝ Steven H. Nazarian a, John W. Barrett a, A. Michael Frace b, Melissa Olsen-Rasmussen b, Marina Khristova b, Mae Shaban a, Sarah Neering d,YuLic, Inger K. Damon c, Joseph J. Esposito b, Karim Essani d, Grant McFadden a,∗ a Biotherapeutics Research Group, Robarts Research Institute, and Department of Microbiology and Immunology, University of Western Ontario, London, Ontario N6G 2V4, Canada b Biotechnology Core Facility Branch, Division of Scientific Resources, National Center for Preparedness, Detection, and Control of Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, GA 30329, USA c Poxvirus and Rabiesvirus Branch, Division of Viral and Rickettsial Diseases, National Center for Zoonotic, Vector-Borne, and Enteric Diseases, Centers for Disease Control and Prevention, Atlanta, GA 30329, USA d Laboratory of Virology, Department of Biological Science, Western Michigan University, Kalamazoo, MI 49008, USA Received 10 March 2007; received in revised form 1 May 2007; accepted 1 May 2007 Available online 14 June 2007

Abstract Members of the genus , which include Tanapox virus (TPV) and Yaba monkey tumor virus, infect primates including humans. Two strains of TPV isolated 50 years apart from patients infected from the equatorial region of Africa have been sequenced. The original isolate from a human case in the Tana River Valley, Kenya, in 1957 (TPV-Kenya) and an isolate from an infected traveler in the Republic of Congo in 2004 (TPV-RoC). Although isolated 50 years apart the genomes were highly conserved. The genomes differed at only 35 of 144,565 nucleotide positions (99.98% identical). We predict that TPV-RoC encodes 155 ORFs, however a single transversion (at nucleotide 10241) in TPV-Kenya resulted in the coding capacity for two predicted ORFs (11.1L and 11.2L) in comparison to a single (11L) in TPV-RoC. The genomes of TPV are A + T rich (73%) and 96% of the sequence encodes predicted ORFs. Comparative genomic analysis identified several features shared with other chordopoxviruses. A conserved sequence within the terminal inverted repeat region that is also present in the other members of the Yatapoxviruses as well as members of the Capripoxviruses, Swinepox virus and an unclassified Deerpox virus suggests the existence of a conserved near-terminal sequence secondary structure. Two previously unidentified gene families were annotated that are represented by ORF TPV28L, which matched homologues in certain other chordopoxviruses, and TPV42.5L, which is highly conserved among currently reported chordopoxvirus sequences. © 2007 Elsevier B.V. All rights reserved.

Keywords: Tanapox; Yatapoxvirus; Poxvirus; Comparative genomics

1. Introduction tebrate and insect hosts, respectively (Buller et al., 2005). Characteristic features of poxviruses include a cytoplasmic Poxviruses constitute two sub-families, Chordopoxvirinae life cycle, a large virion size and large genome compared and Entomopoxvirinae, which infect a wide range of ver- to other (Moss, 2007). Poxviruses contain a linear, double-stranded DNA genome with palindromic, covalently- closed ends. Sequenced poxvirus genomes vary from ∼134 to ଝ ∼ Disclaimer: The findings and conclusions in this report are those of the 360 kbp in length and 130 to 328 open reading frames (ORFs) authors and do not necessarily represent the views of the funding agencies. Use can be predicted from the sequences. At the ends of poxvirus of trade names or commercial sources is for identification only and does not DNA genomes are mirror image terminal inverted repeat (TIR) imply endorsement by the funding agencies. ∗ regions, however, among different strains the lengths of the Corresponding author. Present address: University of Florida, 1600 SW TIR regions vary from a few hundred nucleotides, such as in Archer Road, ARB Room R4-295, P.O. Box 100332, Gainesville, FL 32610, USA. Tel.: +1 352 273 6852; fax: +1 352 273 6849. Variola virus (VARV), to approximately 12 kbp, such as in E-mail address: grantmcf@ufl.edu (G. McFadden). Shope fibroma virus (SHFV). In general, the chordopoxvirus

0168-1702/$ – see front matter © 2007 Elsevier B.V. All rights reserved. doi:10.1016/j.virusres.2007.05.001 12 S.H. Nazarian et al. / Virus Research 129 (2007) 11–25 genome is organized so that the essential housekeeping genes, by the development of few nodular skin lesions (Downie including those required for transcription, replication and mor- and Espana, 1972; Damon, 2007; Knight et al., 1989). In phogenesis, are located within the central region of the genome. contrast, YMTV produces a very distinct disease, primarily Genes nearer to the DNA ends are generally more variable in non-human primates, which is characterized by epidermal and encode for a wide variety of functions, including genes histiocytomas of the head and limbs (Downie and Espana, dedicated to ensure virus replication within the host by mod- 1972; Knight et al., 1989). The observed biological differences ulating the host innate and adaptive immune response (Seet et between YMTV and YLDV are likely explained by the 82% al., 2003). nucleotide identity and an approximately 10 kbp deletion from The genomic DNA sequences of over 100 different poxvirus YMTV compared to YLDV (Brunetti et al., 2003; Downie and strains have been determined. The particular poxvirus sequences Espana, 1972; Espana et al., 1971; Knight et al., 1989; Lee et al., used in the present study are listed in Table 1. All of the 2001). sequences used are available through GenBank and two curated A previous study, in which the genome of YMTV was poxvirus sites—www.poxvirus.org/ and www.biovirus.org/. sequenced, examined the conservation of certain gene fami- There are two species in the genus Yatapoxvirus: Yaba lies that were found to be below the usual 50 codon cutoff monkey tumor virus (YMTV) and Tanapox virus (TPV). Both (Brunetti et al., 2003). To further this research, two isolates of species have caused human infection. A previously sequenced TPV were sequenced; one is the first isolate from a 1957 human poxvirus, Yaba-like disease virus (YLDV) (Lee et al., 2001), outbreak of TPV in the Tana River valley in Kenya (TPV-Kenya) is a TPV from an infected non-human primate (Brunetti et and the other was isolated from an infected college student al., 2003; Espana et al., 1971; Esposito and Fenner, 2001; traveling in the Congo Basin in the Republic of Congo (TPV- McNulty et al., 1968). TPV and YLDV are suspected to be RoC) (Dhar et al., 2004). The current study is a comparative transmitted by arthropod vectors and both produce a similar genomic analysis of these two isolates of TPV that are from dis- rash illness, fever with prodromal symptoms that is followed crete geographic regions of Africa and isolated 50 years apart.

Table 1 Summary information of poxvirus sequences used in this study Genus Virus Strain Virus short form Genome size (bp) Accession number Reference

Tanapox virus RoC TPV-RoC 144553 EF420157 This study Tanapox virus Kenya TPV-Kenya 144565 EF420156 This study Yatapox Yaba-like disease virus YLDV 144575 NC 002642 Lee et al. (2001) Yaba monkey tumor virus Roswell Park-Yohn YMTV 134721 NC 005179 Brunetti et al. (2003)

Goatpox virus Pellor GTPV 149599 NC 004003 Tulman et al. (2002) Capripox Lumpy skin disease virus Neethling 2490 LSDV 150773 NC 003027 Tulman et al. (2001) Sheeppox virus A SHPV 150057 AY077833 Tulman et al. (2002)

Suipox Swinepox virus Nebraska 17077-99 SWPV 146454 NC 003389 Afonso et al. (2002)a

Myxoma virus Lausanne MYXV 161773 NC 001132 Cameron et al. (1999) Leporipox Shope fibroma virus Kasza SHFV 159857 NC 001266 Willer et al. (1999)

Molluscipox virus Subtype 1 MOCV 190289 NC 001731 Senkevich et al. (1996)

Camelpox virus CMS CMPV 202205 AY009089 Gubser and Smith (2002) virus Brighton Red CPXV 224499 C 003663 Ectromelia virus Moscow ECTV 209771 NC 004105 Horsepox virus MNR-76 HSPV 212633 DQ792504 Tulman et al. (2006) Orthopox virus Zaire MPXV 196858 NC 003310 Shchelkunov et al. (2001) Raccoonpox virus RCNV NDa M23018 Taterapox virus Dahomey 1968 TATV 198050 NC 008291 virus Western Reserve VACV 194711 NC 006998 Variola virus India 1964 7125 Vellor VARV 186127 DQ437586 Esposito et al. (2006)

Canarypox virus ATCC VR111 CNPV 359853 NC 005309 Tulman et al. (2004) Avipox Fowlpox virus Iowa FWPV 288539 NC 002188 Afonso et al. (2000)

Bovine papular stomatitis virus BV-AR02 BPSV 134431 NC 005337 Delhon et al. (2004) Parapox Orf virus NZ2 ORFV 137820 DQ184476 Mercer et al. (2006)

Crocodilepox virus CRV 190054 NC 008030 Afonso et al. (2006) Unclassified Deerpox virus W-1170-84 DPV 170560 AY689437 Afonso et al. (2005) a Not determined. S.H. Nazarian et al. / Virus Research 129 (2007) 11–25 13

Comparative genomics reported here reveals the similarities and 3730XL DNA sequencers and the sequencing primers (Inte- differences within and without Yatapoxviruses. In particular, the grated DNA Technologies, Coralville, IA, USA) were designed genetic relationships of TPV with sequenced isolates of the gen- to anneal approximately at every 400 bases across the templates, era Capripoxvirus and Suipoxvirus and an unclassified Deerpox which enabled a nine-fold average sequence redundancy. To virus are explored. verify certain sequences, additional cycle sequencing involved direct sequencing from the full-length extracted genome DNA. Chromatogram data was assembled using Seqmerge (Wisconsin 2. Materials and methods Package Version 10.3, Accelrys Inc., San Diego, CA, USA) and Phred/Phrap base-calling and assembly software and Consed for 2.1. Genomic sequencing sequence editing (Balbas and Gosset, 2001; Domi and Moss, 2002). ORFs were identified and alignments performed using Sequencing was performed essentially as described else- MacVector 6.5.3 (Oxford Molecular Ltd.). where (Esposito et al., 2006). Briefly, TPV genomic DNA was extracted from cells infected with TPV-Kenya (Knight et al., 1989) and TPV-RoC (Dhar et al., 2004). The genomic DNA 2.2. Estimation of nucleotide substitution was used as template for production of a set of 14 overlapping polymerase chain reaction (PCR) amplicons that span virtu- TPV-Kenya and TPV-RoCwere compared 25,000 bp at a time ally the entire viral genome. Amplicons of 10–12 kbp each by using a base-by-base pairwise comparison matrix contain- were produced using the Expand High Fidelity PCR System ing 144,565 nucleotide positions. Nucleotide differences were (Roche Applied Science, Indianapolis, IN, USA). The product analyzed for transversions and transitions. of eight identical PCR mixtures for each amplicon were pooled and treated with ExoSap-IT (USB Corporation, Cleveland, OH, 3. Results USA) to reduce PCR errors in the amplicon templates, which were used for primer-walking cycle-sequencing reactions. Cycle Two TPV isolates from infected humans either living (TPV- sequencing reactions used Applied Biosystems (PE Biosystems, Kenya) or traveling (TPV-RoC) through equatorial Africa were Foster City, CA, USA) Big-Dye 3.1 dye chemistry and ABI sequenced, which provided an opportunity to investigate the evo-

Fig. 1. ORF differences between TPV-RoC and TPV-Kenya. (A) The 11L gene from TPV-RoC (solid black arrow) is a single ORF. The same region from TPV-Kenya encodes two ORFs (solid gray arrows). The sequences between the comparable regions are identical except for a single transversion at position 10239 of TPV-RoC. This change results in a termination codon (* [TAG]) in the predicted transcript of TPV-Kenya instead of the incorporation of a glutamic acid (E [GAG]) as predicted for TPV-RoC. The single nucleotide change is bolded and italized. The sequence presented represents the minus strand. The predicted amino acid sequences for TPV-RoC or TPV-Kenya are indicated above or below the corresponding nucleotide sequence. Numbers indicate position with the respective genomes. The single black lines above or below the solid arrows show the region that is represented by the sequence comparison. (B) ClustalW alignment of the 23.5L ORF from YMTV, TPV-Kenya and TPV-RoC. Similar amino acids are shaded light grey and identical amino acids are shaded dark grey. 14 S.H. Nazarian et al. / Virus Research 129 (2007) 11–25 lutionary diversity of two TPVs that spanned 50 years and were 20 bp of cognate reported YLDV genomic sequences (Lee et al., from two different African countries. 2001). TPV-Kenya and TPV-RoC encode 156 and 155 distinct 3.1. Genome architecture of TPV ORFs, respectively (Table 2). All ORFs that were reported for YLDV are present in both isolates of TPV, with two In GenBank there are sequences of TPV-Kenya that represent exceptions—ORF 11L and ORF 23.5L (Fig. 1). TPV-Kenya 11L approximately 8 kbp of the total genome (GenBank acces- has a premature stop codon at codon 236, which results in a trun- sion numbers AY253325, AF245394 and AF153912); these cated ORF (Fig. 1A). Approximately 80 bp downstream of the sequences are about 98% identical to cognate sequences in a 11L stop codon in TPV-Kenya, a putative ORF corresponding reported YLDV genome sequence (Lee et al., 2001). In order to the second half of the 11L ORF is present and may be tran- to sequence the two TPV isolates described here, PCR ampli- scribed as a distinct gene product. The two ORFs in TPV-Kenya con and cycle sequencing primers were designed by using are denoted 11.1L and 11.2L (Fig. 1A and Table 2). The two pre- the reported YLDV sequence. The determined sequences of dicted ORFs have been identified previously and were annotated TPV-Kenya and TPV-RoC comprised 144,565 and 144,553 bp, in GenBank. TPV-Kenya 11.1L was previously labeled TPV respectively, 96% of which encode for putative ORFs. Both ORFL7R (accession number AAD46181) and TPV ORFL8R viruses are 73% A + T-rich, which is consistent with the other (accession number AAD46182). The ORF 11.2L is identical sequenced yatapoxviruses (YLDV 73% A + T and YMTV 70% to TPV ORFL4R (accession number AAD46179), which indi- A+T)(Brunetti et al., 2003; Lee et al., 2001). By comparison cates that this truncated ORF has been independently identified. with the YLDV sequences, the TPV sequences lack the putative In contrast, the 11L ORF in TPV-RoC is not truncated. The concatemer resolution domain proximal to the hairpin-loop ter- TPV-RoC 11L-encoded protein is an ankyrin repeat protein mini. However, the two TPV isolates were sequenced to within that contains a predicted F-box domain (Fig. 2B) (Mercer et

Fig. 2. Structural analysis of multiple ankyrin repeat-containing proteins. (A) Various homologous ankyrin repeat-containing proteins were identified using the c-terminal end of the TPV 11L protein as a query sequence. Predicted ankyrin repeats are indicated by the boxes (ANK). Numbers indicate the amino acid length of the various proteins from various poxviruses, including: TPV-RoC (TPV11L), TPV-Kenya (TPV11.1L and TPV11.2L), LSDV, MYXV (M148R), DPV and VACV (WR019 and WR186). (B) The C-terminal end of each of the proteins in the top panel is aligned. DPV019 and LSDV145 had an approximately 30 amino acid stretch, N-terminal to the aligned sequence that did not match and these residues are not included. The bold line underneath the alignment indicates the putative F-box domain that is complete in all sequences but WR019. S.H. Nazarian et al. / Virus Research 129 (2007) 11–25 15 similarity e YMTV 2L 73/82 22L 34/55 28.5L 67/84 ORF Identity/ similarity d MYXV ORF Identity/ similarity c VAC V ORF Identity/ similarity WR038 50/70 13L 76/87 b LSDV ORF Identity/ LSDV016 41/55 M010L 36/52 similarity a DPV DPV008 38/53 DPV013 49/70DPV017 LSDV011 39/62 40/66DPV020 51/73 LSDV014DPV021 47/66 46/68 WR031 WR034 34/51 LSDV015 37/57DPV022 40/65 35/59 M154L 26/46 LSDV017 30/54 7LDPV026 77/89 74/87 LSDV020 12L 78/89 75/88 WR043DPV032 77/87 77/90 M011LDPV035 27/47 M015L LSDV025 72/86 78/90 75/86 16L 14L LSDV028 WR049 20L 64/80 55/72 77/87 72/83 91/95 WR052 M020L 58/74 73/87 M022L 25L 72/83 90/96 27L 90/94 -like PKR ␣ or function protein inhibitor factor inhibitor lipase protein factor anti-apoptotic factor reductase protein kinase envelope protein Codon aa ORF Identity/ f CodonStart Stop aa Start Stop Table 2 Identification of the predicted open reading frames (ORFs) ofORF TPV-Kenya and TPV-RoC TPV-Kenya1L2L TPV-RoC 1738 2868 740 1855 333 338 1738 Predicted 2868 structure 740 1855 333 338 TNF binding DPV007 38/60 LSDV007 33/55 WR010 28/49 1L 71/84 3L4L5L 35836L 43297L 2918 4840 3616 5354 222 4373 6473 238 4908 3583 156 5421 4329 149 2918 4840 351 3616 5353 222 4373 6471 238 4907 156 5419 149 LAP/PHD domain 351 Chemokine DPV009 DPV01116L 42/66 35/62 13345 LSDV150 LSDV010 LSDV009 12818 45/59 35/56 176 13343 WR029 WR039 12816 33/62 38/60 176 LSDV00123L Mitochondria 23.5L 37/57 1770324L 1794925L WR196 17485 17800 18721 M153R 41/63 20036 73 18080 50 39/56 18702 17701 214 18005 445 5L 4L 17483 18719 17892 20034 18078 73 64/82 71/83 38 18700 214 445 Serine/threonine 6L 81/93 DPV029 DPV031 59/68 56/82 LSDV023 LSDV024 68/85 50/73 WR047 WR048 43/69 52/75 M018L M019L 58/76 46/73 23.5L 24L 86/95 84/93 8L9L 713110L 784811L 649311.1L 9021 7171 21311.2L 10160 22612L 7873 10946 7129 9036 383 10242 7846 1123213L 375 6491 235 10969 9019 7169 1212114L 213 226 88 7871 11267 Ankyrin repeat 1255415L Virulence 11230 gene 383 285 12147 12815 10967 SERPIN/Spi3ortholog 12119 DPV014 DPV018 10944 136 12585 32/58 45/61 11265 88 12552 903417L 285 77 IF2 LSDV012 12145 63718L Monoglyceride 12813 43/65 1382119L 136 Ankyrin repeat Ankyrin repeat 14246 1258320L Ankyrin 13393 repeat IL-18 binding 15846 13866 143 DPV019 16848 7721L DPV019 14281 127 DPV01922L 51/69 13819 15874 EGF-like 51/69 growth 522 17133 52/73 14244 13391 325 17480 15844 16879 LSDV148 13864 143 WR034 16846 LSDV145 17172 14279 23/45 127 85 29/48 31/52 dUTPase 15872 103 522 Pyrin 17131 M149R domain WR18626L 325 17478 Kelch M008.1 WR019 protein 1687727L 27/46 28/53 Ribonucleotide 17170 29/47 21988 31/52 DPV024 23133 85 DPV02328L 103 20063 M149R 11L DPV025 40/59 M148R28.5L 62/74 22024 WR186 642 23332 29/49 37/62 27/53 23529 23/45 370 27/45 21986 23189 LSDV018 23359 11L LSDV019 23131 64/76 20061 11L 48 34/54 57 22022 79/93 642 82/94 23330 WR041 23527 370 WR042 EEV maturation 58/73 23187 23357 Palmitylated 27/48 DPV027 EEV DPV034 48 30/61 M012L 57 M014L 48/69 63/80 LSDV021 24/57 11L LSDV027 38/60 17L 45/61 19L 79/93 86/95 76/89 WR051 36/56 M013L M021L 30/55 41/61 M016L 26L 42/66 70/86 21L 65/84 16 S.H. Nazarian et al. / Virus Research 129 (2007) 11–25 similarity e ORF Identity/ YMTV similarity d ORF Identity/ MYXV M037L 67/82 similarity c VAC V ORF Identity/ similarity b LSDV ORF Identity/ similarity a DPV DPV039 70/86DPV040 72/86 LSDV031 73/83 LSDV032DPV043 72/85 WR056 68/83 55/75 WR057 LSDV036 67/83 M026R 69/81 70/83 M027L WR060 71/84 31R 71/85 90/95 32L M030L 91/98 67/82 35L 86/93 DPV051 73/87DPV053 LSDV043 64/83 69/86 LSDV045 WR070 61/77 68/83 WR072 M038LDPV060 55/72 69/87 50/72 M040L 43L LSDV051 58/81DPV064 86/96 49/70 85/93 45L WR080 LSDV055 87/94 DPV067 49/71 85/95 88/98 M47R WR083DPV069 LSDV058 79/88 46/66 81/91 87/97 M50R 52R LSDV060 WR086 81/93 85/94 83/94 85/93 WR088 M53R 55R 69/83 96/98 83/94 M55R 58R 74/92 97/99 60R 91/96 or function virion core protein polymerase subunit rpo30 protein phosphoprotein elongation factor subunit rpo7 factor VLTF-1 protein Codon aa ORF Identity/ f ) CodonStart Stop aa Start Stop Continued Table 2 ( ORF TPV-Kenya29L30L 2402731R TPV-RoC 24738 23584 24797 24094 148 25111 215 24025 105 24736 23582 24795 24092 Predicted structure 148 25109 215 105 DNA-binding DPV037 DPV038 66/80 41/61 LSDV029 LSDV030 67/81 38/59 WR054 WR055 56/73 41/59 M024L M025L 52/72 33/55 29L 30L 82/92 77/92 32L 2652333L34L 25114 2859735L 470 29202 26543 2981136R 26521 28660 68537R 29245 25112 181 2992838R 28595 189 3098739L 470 29200 30983 26541 3271040R 29809 32687 Poly(A) 28658 352 3653341L 685 33513 29243 567 3656642L 181 29926 33516 268 3723642.5L 189 30985 36847 1006 dsRNA-binding 30981 39262 32708 39413 36850 RNA 32685 polymerase 36531 352 94 37226 33511 39324 129 DPV042 567 33514 36564 679 268 45/64 37234 1006 30 36845 39260 36848 DNA polymerase 39411 LSDV034 37224 94 DPV041 129 39322 49/69 DPV047 679 50/72 70/84 30 WR059 LSDV033 38/55 DPV044 LSDV039 44/66 DPV045 35/51 70/85 DPV046 M029L 70/85 WR058 80/91 LSDV035 WR065 58/75 40/62 LSDV037 37/56 DPV048 69/82 LSDV038 70/85 34L 70/88 DPV050 M028L 79/91 WR061 M034R 49/69 68/83 WR062 45/66 25/44 LSDV040 70/84 WR064 61/81 72/86 LSDV042 70/81 33L M031R 42/65 39L M032R 72/88 WR066 31/53 M033R 86/93 63/81 66/80 WR068 LSDV041 76/87 36R 39/61 54/74 37R M035R 74/88 38R M036L 87/95 WR067 66/89 93/97 44/71 41/65 40R 88/97 41L 74/87 43L 40365 39433 311 40363 39431 311 DNA binding core 44L45L 40590 4139146L 4036947L 40594 4169448L 74 266 4287249R 41458 40588 4415850L 41389 41715 4416451L 40367 79 42872 40592 386 4796352R 46191 41692 429 48295 74 266 42870 46194 676 4828953L 41456 44156 47963 ssDNA-binding 41713 59054R 44162 48954 42870 111 49301 7955R 386 47961 46189 222 49304 429 48293 48927 Structural 46192 5062656R 676 48287 50620 Topoisomerase 47961 II 12557L 590 50814 Helicase 48952 439 5081758R 111 49299 DPV057 Metalloprotease 52483 222 63 49302 51335 DPV055 48925 DPV052 52513 77/9059R 51362 Transcriptional 50624 50618 173 DPV059 78/9360R 50/63 125 53292 374 53308 LSDV048 50812 64/79 439 DPV058 50815 Glutaredoxin 260 54313 DPV05661R LSDV046 76/88 LSDV044 52481 54309 62/78 51333 6362L 52511 55/78 LSDV050 55053 68/81 49/66 51360 334 5507563R 173 RNA WR076 60/78 polymerase LSDV049 53290 247 DPV062 56276 374 53306 55347 LSDV047 WR074 WR071 69/85 56301 65/80 DPV061 73/88 260 54311 55329 54/72 Virion WR078 54307 46/76 core 45/62 protein 91 55/75 57047 Late 55051 316 transcription M043L 55/75 WR077 334 LSDV053 DPV066 55073 249 WR075 M041L M039L 247 56274 58/76 76/88 LSDV052 73/85 64/79 DPV063 54/74 55345 M45L 56299 46/71 IMV 43/66 45/68 55327 membrane 53/69 M044R WR081 48L 57045 LSDV057 91 58/75 316 M042L 46L WR079 44L 44/65 60/75 DPV065 57/75 92/97 249 LSDV054 50/71 44/65 86/96 55/77 50L 79/89 51/70 Core protein VP8 M48L WR085 49R 82/92 47L M46L 51/68 LSDV056 84/93 DPV072 WR082 DPV068 68/85 83/93 56/73 78/87 44/61 64/80 54/73 M52L 53L WR084 LSDV063 M49R LSDV059 51L 55/69 99/100 DPV070 DPV071 46/70 78/89 60/77 73/86 45/67 47/67 65/83 57L M51R WR091 WR087 LSDV061 82/91 54R LSDV062 60/82 50/70 55/78 45/70 66/84 76/89 M58R M54R 56R WR089 WR090 80/89 30/55 76/87 54/74 56/74 M56R 63R M57L 59R 91/95 80/90 24/50 60/80 61R 62L 71/84 86/95 S.H. Nazarian et al. / Virus Research 129 (2007) 11–25 17 DPV073 51/75 LSDV064 59/77DPV077 73/89 WR092DPV078 46/67 LSDV068 76/87 76/90 M59RDPV080 LSDV069 WR095 87/94 52/70 77/90DPV081 71/87 81/91 LSDV071 64R WR096 M65R 84/93 78/90 74/87 LSDV072DPV083 73/87 75/89 WR098 54/74 M66RDPV084 80/92 68R WR099 78/89 LSDV074 71/85 91/95 63/83 M68R 52/73DPV085 LSDV075 69R 48/68 M69L 85/94 77/88DPV086 WR101 92/95 68/85 35/61 LSDV076 76/88 71R WR102 43/64DPV088 93/97 69/84 LSDV077 M71L 72L 71/87 69/86 WR103 88/97 M72L 50/74 41/57 LSDV079 WR104 75/85 72/86 74LDPV091 63/82 M73R 80/91 76/88 75L WR106 M74R 44/65DPV093 91/97 68/84 LSDV082 88/94 62/82 76R 72/87 M76RDPV094 77/86 LSDV084 77R WR109 77/92 69/84 88/94 84/93 69/88 LSDV085 79RDPV097 WR111 M79R 80/93 88/95 75/89 80/90 72/87 WR112 LSDV088 M81R 71/85 82R 75/88 86/93 82/93 M82R WR116 84R 70/86 77/90 95/99 M86L 85R 94/98 71/86 88L 92/97 protein polymerase small subunit subunit rpo22 subunit rpo147 Ser/Thr and Tyr phosphatase protein p35 associated RAP94 factor VLTF-4 topoisomerase enzyme large subunit glycosylase transctription factor VETFs subunit rpo18 termination factor NPH-1 64R 5706665R 5744966R 5740667R 57882 128 5788268R 58481 58430 57064 59084 159 59014 183 57447 60082 5740469R 178 57880 128 60000 333 57880 5847970L 58428 IMV membrane 60554 59081 15971R 59012 60947 183 61038 185 60079 Virion protein 178 6053772L Thymidine kinase 64892 59997 333 65407 Host-range 137 protein 1285 DPV075 60551 Poly-A DPV074 64895 60944 DPV07673R 61035 67/78 63/81 18574L 43/67 65423 171 60534 64889 66969 RNA polymerase LSDV066 65992 65404 137 LSDV06575L 1285 LSDV067 66001 63/77 67/78 190 69366 64892 RNA 42/59 polymerase 323 65420 66973 17176R WR094 WR093 66966 WR021 69525 65989 798 Dual 70/78 specificity 49/6977R 36/66 65998 70064 190 69363 70080 M61R 323 M60R78R 180 66970 M62R 7102479R 71043 IMV envelope 66/78 DPV079 69522 798 59/71 71526 315 42/64 71486 67/81 70061 RNA polymerase- 74045 70077 66R80L 65R 148 67R 18081R 82/90 840 LSDV070 71021 74471 83/92 71040 80/92 82R 74470 Late transcription 62/80 71524 315 74013 75204 71483 DPV082 7520483R 74043 DNA 153 75857 WR097 14884R 69/86 75924 245 840 74469 61/80 78281 218 78281 74468 LSDV073 mRNA 74011 capping 80185 7520285R 786 M67L 75202 64/83 153 80221 635 75855 75922 24586R 66/81 Virion protein WR100 80700 78279 21887R 80748 78279 Virion protein88L 62/80 81383 160 80183 Uracil DNA 70L 81383 786 84038 DPV090 DPV087 82147 80219 635 82/94 M70R 212 NTPase DPV089 41/64 82146 51/67 255 80698 Early 41/61 80746 63/81 631 81381 160 LSDV080 LSDV078 81381 84036 LSDV081 36/59 82145 RNA 73R polymerase 52/69 212 36/61 DPV092 82144 84/94 255 mutT WR107 80/92 motif WR105 631 mutT WR108 motif 48/68 39/68 Transcription 38/56 LSDV083 M77L 78/91 M75R DPV095 M78R DPV096 70/84 40/63 WR110 51/69 65/81 30/54 70/85 LSDV086 80L 78R LSDV087 68/82 81R 74/91 M80R 80/91 65/80 67/84 WR114 77/90 WR115 61/77 50/70 83R M84R 94/98 M85R 58/77 59/78 86R 87R 87/95 89/96 18 S.H. Nazarian et al. / Virus Research 129 (2007) 11–25 similarity e YMTV ORF Identity/ similarity d MYXV ORF Identity/ similarity c VAC V ORF Identity/ similarity b LSDV ORF Identity/ similarity a DPV DPV098 80/91 LSDV089 77/88 WR117 74/89 M87L 77/88 89L 95/98 DPV099 80/92 LSDV090 80/91 WR118 73/86 M88L 77/90 90L 93/97 DPV100 64/85 LSDV091 66/84 WR119 63/85 M89L 72/88 91L 87/95 DPV101 88/95 LSDV092 84/93 WR120 84/95 M90L 86/94 92L 95/98 DPV102 60/82 LSDV093 64/85 WR121 55/76 M91L 68/82 93L 84/93 DPV105 64/81 LSDV096 68/85 WR124 62/78 M94R 62/81 96R 87/94 DPV107 78/90 LSDV098 77/90 WR126 71/86 M96L 76/89 98L 91/96 DPV108 65/81 LSDV099 67/82 WR127 61/80 M97R 68/83 99R 90/95 DPV109 86/91 LSDV100 75/84 WR128 72/81 M98L 78/90 100L 91/94 DPV113 57/75 LSDV104 56/79 M102L 50/70 104L 83/94 DPV114 86/94 LSDV105 78/90 WR133 61/77 M103L 73/83 105L 98/94 DPV115 84/94 LSDV106 79/88 WR134 66/80 M104L 79/88 106L 100/100 DPV118 75/90 LSDV109 61/78 WR137 41/63 M107L 57/73 109L 92/97 DPV121 50/71 LSDV112 51/66 WR141 46/66 M111R 47/66 113R 76/89 DPV124 60/79 LSDV115 63/78 WR143 60/77 M113R 60/76 115R 84/91 or function enzyme VITF resistance protein factor VLTF-2 factor VLTF-3 protein subunit rpo19 factor, VETF1 transcription factor VITF-3 protein protein phosphoprotein virulence factor phosphoprotein processivity factor transcription factor VITF-3 Codon aa ORF Identity/ f ) CodonStart Stop aa Start Stop Continued Table 2 ( ORF TPV-Kenya89L 84933 TPV-RoC 84073 287 84931 84071 Predicted structure 287 mRNA capping 90L 86620 84965 552 86618 84963 552 Rifampin 91L 87096 86647 150 87094 86645 150 Late transcription 92L 87794 87123 224 87792 87121 224 Late transcription 93L 88018 87794 75 88016 87792 75 Redox virion 94L95L 9000196R 90516 88031 90556 90061 657 91059 152 89999 168 90514 88029 90554 90059 657 91057 152 4b core protein 168 Virion core protein RNA polymerase DPV104 DPV103 44/62 79/90 LSDV095 LSDV094 36/55 73/87 WR122 64/80 M92L 75/87 M93L 94L 93/97 31/55 95L 68/82 97L98L 92174 94333 91062 92201 371 711 92172 94331 91060 92199 371 711 Early transcription DPV106 72/89 LSDV097 75/87 WR125 57/78 M95L 69/86 97L 92/98 99R 94386 95255 290 94384 95253 290 Intermediate 100L 95494 95258 79 95492 95256 79 IMV membrane 101L102R 98203103L 98218 95498104L 99672 99150 902 99920 99166 311 98201 99717 169 98216 95496 68 99670 99148 902 99918 99164 311 Core protein P4a 99715 169 Core protein 68 DPV110 69/85 IMV membrane DPV112 LSDV101 58/72 67/82 DPV111 LSDV103 WR129 78/89 55/70 54/72 LSDV102 WR131 M99L 76/89 46/63 61/79 WR130 M101L 55/75 101L 77/86 92/96 M100R 103L 75/88 77/86 102R 91/96 105L 100248 99970 93 100246 99968 93 IMV 106L 100426 100268 53 100424 100266 53 IMV membrane 107L 100700108L 101829 100419109L 102418 100687 94 101846 381 100698 191 101827 100417 102416 100685 101844 94 381 191 IMV protein IMV membrane DPV117 64/81 LSDV108 DPV116 63/80 50/69 WR136 LSDV107 51/70 52/71 M106L WR135 52/67 55/73 M105L 108L 79/87 51/72 107L 78/88 110R 102433111L 104077 103869112L 104410 103856113R 479 104409 104081 102431 74 105683 110 103867 104075 425 104408 103854 479 104407 104079 DNA 105681 helicase 74 110 425 Fusion protein DNA DPV119 polymerase 62/81 DPV122 57/73 LSDV110 58/76 LSDV113 DPV120 57/72 WR138 76/90 58/76 WR140 LSDV111 57/70 M108R 72/87 61/79 M110L WR139 56/74 110R 62/77 85/94 112L M109L 82/91 81/90 111L 81/94 114R 105695115R 106190 106165 107338 157 383 105693 106188 106163 107336 157 383 DNA processing Intermediate DPV123 72/88 LSDV114 67/83 WR142 67/86 M112R 63/83 114R 77/89 S.H. Nazarian et al. / Virus Research 129 (2007) 11–25 19 120.5L 69/81 126R 35/54 131R 38/65 DPV125 88/96 LSDV116 89/96 WR144 82/92 M114R 85/94 116R 94/98 DPV127 70/85 LSDV118 61/78 WR151 54/71 M116L 61/80 118L 78/92 DPV128 64/79 LSDV119DPV131 66/77 83/94 WR152 LSDV121 57/76 83/92 M117L WR155 61/77 59/76 119L M120L 85/92 81/92 121L 89/96 DPV133 64/82 LSDV123 54/75 WR157 48/68 M122R 57/81 123R 79/92 DPV142 53/69 WR170 43/63 DPV147b 28/51 LSDV135 33/53 WR200 26/44 M135R 23/41 DPV154 58/76 LSDV139 60/80 WR183 47/64 M142R 56/74 142R 84/92 DPV155 38/61 LSDV140 38/60 WR208 25/45 M143R 46/64 143R 80/91 DPV156 42/59 LSDV141 37/56 WR025 33/53 M144R 34/52 144R 64/76 subunit rpo132 A28-like subunit rpo35 packaging domain; EEV glycoprotein hydroxysteroid dehydrogenase receptor protein kinase finger range 116R 107343 110837 1165 107341 110835 1165 RNA polymerase 117L118L 111283 110840 111703 111287 148 111281 139 110838 111701 111285 148 139 Fusion protein Viral replication DPV126 39/61 LSDV117 41/64 WR150 43/61 M115L 43/66 117L 41/64 119L 112618 111719 300121L 112616 113777 111717 113019 300 253 RNA polymerase 113775 113017 253 GTPase; DNA 120L120.5L 112814 112978 112590 112847122R 75 44123R 112812 113889 112976 112588 114446 114472 112845 114981 186 75 44 113887 170 IMV membrane 114444 114470 114979 186 DPV129 170 EEV glycoprotein 69/82 C-type lectin-like DPV132 LSDV120 44/59 58/77 LSDV122 WR153 41/55 54/84 WR156 M118L 33/57 63/75 M121R 120L 39/57 90/94 122R 66/79 124R125R 115022126R 115558 115597127R 116451 116512 179128L 117180 117228 115020 285129R 118037 118847 115556 115595 223130L 118851 118038 116449 116510 270 179131R 119264 119813 117178 117226 270 285132R 119889 119256 118035 138 118839 223133L 120092 120123 118843 118036 186 270 120368 EEV 121408 glycoprotein 119256 119805 268 68 120380 119248 138 DPV136 119881 82 CD47-like 343 120084 186 120115 Myristylprotein 34/49 121400 120360 120372 68 DPV138 DPV134 82 343 DPV139 DPV135 50/67 44/66 3-Beta 32/57 38/59 DPV137 LSDV124 LSDV128 39/61 LSDV125 40/60 31/52 37/62 LSDV127 WR158 WR162 36/58 40/57 23/40 DPV141 WR160 M123R 46/66 M128L 27/51 41/60 WR063 31/54 LSDV130 28/45 M126R 45/68 124R 128L 35/55 M129R 72/84 M124R 66/83 36/55 127R 35/58 70/85 129R 125R 76/88 81/92 132R 63/80 134R135R 121447136R 121914 121993 127701 127701 158 1903 128753 121439 121985 121906 351 127693 127693 156 1903 128745 IL-24-like VARV B22R-like 351 DPV146 Type-I IFN 48/65 LSDV134 48/63 LSDV005 26/47 M134R 43/59 135R 78/89 137R138R 128783139R 129241 129270140R 130283 130362 153141R 130931 130966 128775 338142R 132675 132705 129233 129262 190 133064 133101 130275 130354 570 153 134027 130923 130958 120 338 A52R-family 132667 132697 309 190 133056 133093 570 A52R-family 134019 120 Kelch-like DPV148 309 CD200-like 37/60 Serine/threonine DPV152 50/69 LSDV136 DPV160 DPV153 42/68 DVP149 27/47 44/63 43/61 WR022 LSDV151 22/45 LSDV138 28/47 LSDV137 52/69 42/60 M136R WR180 28/55 31/56 WR177 35/59 137R WR039 M140R 26/44 M137R 72/86 43/62 32/55 M139R 44/66 M141R 138R 38/60 64/80 139R 141R 68/81 72/82 143R 134063 134764 234 134055 134756 234 Kila-N/RING 144R 134798 135601 268 134790 135593 268 vCCP/EEV host 145R146R 135900147R 136868 137021 138427 138465 323 139937 135892 469 136860 137013 491 138431 138456 323 139928 473 vCCR8 491 Ankyrin repeat Ankyrin repeat DPV164 DPV164 DPV162 41/63 29/52 36/60 LSDV147 LSDV148 37/59 LSDV011 35/54 37/62 WR186 WR186 21/41 23/39 M149R M149R 34/56 23/58 146R 147R 75/86 78/89 145R 66/78 20 S.H. Nazarian et al. / Virus Research 129 (2007) 11–25

al., 2005). A BLAST search of the intact 11L protein revealed homologues only in YLDV (11L), YMTV (11L), Deerpox virus

similarity (DPV; DPV019) and Vaccinia virus (VACV; VACV WR186). e However, when 11.2L was used as the query sequence, addi- tional homologues in Myxoma virus (MYXV; M148R), Lumpy YMTV ORF Identity/ skin disease virus (LSDV; LSDV145) and an additional VACV protein encoded by VACV WR019 were detected. The proteins encoded by TPV11L, DPV019, M148R, LSDV145, WR186

similarity and WR019 range from 558 to 675 amino acids and contain

d 7–14 predicted ankyrin repeats (Fig. 2A). While all proteins except for VACV WR019 contain the entire predicted F-box MYXV ORF Identity/ domain (Mercer et al., 2005), there is significant sequence sim- ilarity outside of the domain (Fig. 2B). It may be that the sequence, found between the last predicted ankyrin repeat and the start of the F-box domain, acts as an important functional similarity determinant of the proteins. The fact that 11L is truncated in

c TPV-Kenya suggests that all 14 ankyrin domains are not required to remain functional. Alternatively, the potential gene products VAC V ORF Identity/ from ORFs 11.1L and 11.2L might interact and form a functional complex. A previously unidentified ORF was annotated between ORFs

similarity 23L and 24L of YMTV and denoted 23.5L (Brunetti et al., 2003). A truncated ortholog was found in YLDV. We find that neither

b isolate of TPV contains a full-length copy of this predicted ORF, as compared with YMTV. However each isolate encodes for a LSDV ORF Identity/ truncated version of the 23.5L (Fig. 1B). TPV-Kenya encodes a 50 aa ORF that aligns to the carboxy half of YMTV 23.5L and is 98% identical. In contrast, TPV-RoC encodes a 38 aa ORF

similarity which is 71% identical to the amino terminus of YMTV 23.5L (Fig. 1B). A transversion at position 17890 changes a tyrosine

a (TPV-Kenya) to a stop codon (TPV-RoC) causing premature termination of TPV-RoC 23.5L. As well, an insertion at position DPV DPV167 48/68 LSDV149 44/67 WR195 38/56 M151R 45/62 149R 80/93 17994 changes a string of thymines from T5 (TPV-RoC) to T6 (TPV-Kenya) and disrupts the coding from the downstream start codon on the minus strand.

3.2. Overall nucleotide comparative analysis

or function ortholog Comparison of the two TPV isolates on a nucleotide-by- nucleotide basis indicates 35 changes across a pairwise sequence alignment of 144,565 nucleotide positions. Thirty-one of the changes were within predicted coding regions and could be divided into 13 transitions, 12 transversions and 6 deletions. Six transitions cause only synonymous codon changes. The other seven transitions resulted in non-synonymous substitu- tions within the coding sequence resulting in a single amino acid Codon aa ORF Identity/ difference between the comparable protein sequences between

f the two TPV isolates. Six of these non-synonymous events resulted in relatively non-conserved changes. In contrast, 11 of the 12 transversions were non-synonymous and 10 of the 11 non-synonymous changes were to non-conserved amino ) acids. An A to C transversion at position 10241 changes a stop codon (TAG) on the minus strand template of 11.1L of TPV-Kenya to a glutamic acid in TPV-RoC resulting in a full- CodonStart Stop aa Start Stop Continued length 11L ORF, comparable in length to the other poxvirus

Ortholog in DPV. Ortholog in LSDV. Ortholog in VACV. Ortholog in MYXV. Ortholog in YMTV. Number of amino acids. 11L orthologs. The 6 deletions represent the absence of one f a c e b d Table 2 ( ORF TPV-Kenya148R 139951149R 141392 141378 TPV-RoC 142393 476 334 139942 141383 141369 142384 476 334 Ankyrin repeat Predicted structure SERPIN/crmA DPV166 32/51 LSDV145 27/49 WR186 22/45 M005R 23/50 148R 72/85 150R 142434151R 142825 142733 143823 100 333 142425 142816 142724 143814 100 333of DPV168 four DPV007 28/49 32/54 hexanucleotide LSDV153 LSDV007 31/62 34/56 WR209direct 25/46 repeats M004.1 (CATATA) 33/51 150R 151Rpresent 74/88 71/84 at the S.H. Nazarian et al. / Virus Research 129 (2007) 11–25 21

Fig. 3. TPV-Kenya genomic map. ORFs are displayed as arrows that also indicate the direction of transcription. The arrows are coloured to indicate a specific functional category. At either end of the genome is a bolded section that indicates the terminal inverted repeat.

5 end of ORF 128L in TPV-RoC. The result of this hex- ing YLDV, YMTV, LSDV and DPV, have sequences that show anucleotide deletion results in a shortened 128L amino acid significant homology to the TPV query sequence; however, the sequence in TPV-RoC (MYMYMYNY) compared to TPV- nucleic acid sequences in YMTV, LSDV and DPV lacked a start Kenya (MYMYMYMYNY). The other sequence differences codon. Therefore, the cognate sequences appear to represent that distinguish the TPV isolates include two transitions in either a pseudogene or a terminal DNA sequence conserved non-coding sequences and an insertion in the intergenic region across genera. The region of nucleotide conservation consists between ORFs 23L and 23.5L of a thymidine (T) repeat; T8 in of a 300 nucleotide segment that surrounds the predicted 58- TPV-RoC compared to T7 in TPV-Kenya. codon ORF. The TIR regions of all chordopoxviruses were compared for similarity to these conserved sequences. The 300 3.3. Conserved DNA sequence near the termini

While examining the intergenic regions of TPV, a predicted Table 3 58-codon ORF was found within the TIR, located between the TIR conserved sequence positions in various poxvirus species extreme terminus and 1L/151R. The ORF is transcribed toward Virus Left end Right end the center of the genome and in the opposite direction of all Start End Start End ORFs 20 kbp from either end of the DNA (Fig. 3). The ORF is also present in the YLDV sequence in GenBank but was not TPV-Kenya 400 714 144164 143850 described in the publication, possibly due to the ORF mirror- TPV-RoC 400 714 144154 143840 YLDV 418 732 144158 143844 image orientation, which might contribute to dsRNA production YMTV 437 751 134285 133971 (Lee et al., 2001). The comparable region in YMTV was pre- GTPV 1018 1329 148582 148271 viously described as a pseudogene (Brunetti et al., 2003). To LSDV 1286 1592 149488 149182 determine the likelihood that the ORF encodes a functional pro- SHPV 1225 1530 148833 148528 tein, a translated BLAST search (tBLASTx) was used to find SWPV 1062 1381 145393 145074 DPV 4756 5070 165805 165491 homologous amino acid sequences. Several poxviruses, includ- 22 S.H. Nazarian et al. / Virus Research 129 (2007) 11–25 nucleotide conserved sequence was found in Swinepox virus (SWPV) and overlapped with a hypothetical gene (designated 002) in SWPV, DPV and Capripoxviruses (Table 3). While this sequence exhibits homology across genera, the highest identities were found between members within a particular genus (Fig. 4). A highly conserved sequence, present in the TIR region of , has been described previously (Shchelkunov et al., 1998). However, the sequence appears to be distinct from the Fig. 4. Identity matrix of a conserved sequence found in the TIR of vari- Yatapoxvirus ous poxvirus species. An approximately 300 bp region within the TIR of the 300-bp sequences and their cognates (Fig. 4). Since poxviruses listed was aligned using ClustalW and percent identity was deter- new sequence information has become available from the time mined. Poxviruses are listed in order of relatedness for this particular sequence. that this DNA region was last compared and reported (Baroudy et al., 1982; Shchelkunov et al., 1998), selected sequences from CPXV, VACV, VARV, HSPV, TATV, ETCV, size of these ORFs, BLAST searches were unable to detect any MPXV, CMLV and RCNV (acronyms described in Table 1; homologous sequences and thus the search for homologues was data not shown) were aligned and compared. Interestingly, performed manually. CMLV lacks this entire sequence and RCNV encodes for only The ORF 28L is present in both TPV and YLDV; the predicted 137 nucleotides of the ∼300-bp conserved sequences. As pre- ORF encodes for a potential protein of 48 aa. The region between viously described, these orthopoxvirus nucleotide sequences orthologues of 27L and 28.5L of the genomes of species of chor- are highly homologous, sharing 87–100% sequence identity dopoxviruses currently available were examined. Orthologues ((Shchelkunov et al., 1998) and data not shown). were identified in all orthopoxviruses, , SWPV, and the unclassified poxviruses DPV and Crocodilepox virus 3.4. Identification of two conserved poxvirus gene families (CRV) (Fig. 5a). However, the closely related capripoxviruses lacked a homologous gene in this region. Members of the genera Two TPV intergenic regions contained potential ORFs below Avipoxvirus, Leporipoxvirus and Molluscum contagiosum virus the commonly used codon limit of 50. The ORFs are located also lacked the sequence (Table 4). between 27L and 28.5L, and 42L and 43L; they were desig- The ORF 42.5L is predicted to encode a 30 amino acid pro- nated 28L and 42.5L, respectively (Table 2). To determine if tein. A search of all poxvirus genomes for orthologues showed these ORFs likely encode functional proteins, tBLASTx was that ORF 42.5L is highly conserved among the Chordopoxviri- used to find homologous sequences. However, due to the small nae and orthologues in all vertebrate poxviruses currently

Fig. 5. Alignments of the predicted 28L and 42.5L gene families. (a) Orthologs of TPV 28L are aligned and include YLDV, DPV, SWPV, ECTV (EVM037), VACV (WR053), ORFV, CRV. (b) Orthologs of TPV 42.5L are aligned and include YLDV, YMTV, LSDV, DPV, MYXV (M37L), VARV, VACV (WR220), as annotated at www.poxvirus.org. S.H. Nazarian et al. / Virus Research 129 (2007) 11–25 23

Table 4 Members of two new poxvirus gene families Genus Virus 28L gene family 42.5L gene family

Designation Start Stop Designation Start Stop

TPV-RoC 28L 23330 23187 42.5L 39411 39322 TPV-Kenya 28L 23332 23189 42.5L 39413 39324 Yatapox YLDV 28L 23342 23199 42.5L 39425 39336 YMTV NPa 42.5L 33938 33852 GTPV NP 038.5 37590 37504 Capripox SHPV NP 038.5 37805 37719 LSDV NP 042.5 38189 38103 Suipox SWPV 026 19915 19724 038.5 34712 34620 MYXV NP 37L 39385 39290 Leporipox SHFV NP 37L 39399 39316 Molluscipox MOCV NP 043.5L 64055 63933 ECTV 037 50964 50749 053.5 68659 68555 MPXV C20L 42461 42240 I0.5L 59903 59799 CPXV 062 58869 58648 079.5 76579 76475 VARV 041 33516 33295 058.5 51129 51025 Orthopox HSPV 054 53482 53261 070.5 71114 71010 TATV 054 42022 41807 072.5 59706 59602 CMPV 49L 43284 43063 66.5L 60901 60797 VACV 053 42188 41967 220b 59851 59744 FWPV NP 090.5 90681 90782 Avipox CNPV NP 117.5 120153 120254 ORFV 012.5b 11760 11578 029.5 32396 32244 Parapox BPSV 011.5b 12820 12650 028.5 33354 33214 CRV 044 58267 58025 064.5 86842 86762 Unclassified DPV 036 32435 32226 050.5 49191 49060 a Not present. b Annotated at www.poxvirus.org/. sequenced were found (Fig. 5b). The nucleotide sequence has standard search parameters can be difficult to assign. Many previously only been reported as a putative ORF of 32 codons poxvirus ORFs are quite small and it is unlikely that they will for sequenced leporipoxviruses (Cameron et al., 1999; Willer achieve a significant match using BLAST. Selected available et al., 1999). Additionally, an orthologue, VACV ORF WR220, sequences from chordopoxviruses were used to determine sig- has been annotated in sequences at http://www.poxvirus.org/ nificantly conserved sequences in TPV. One approach was to (Table 4). compare a tentatively designated ORF and examine areas of sev- eral poxviruses that contained highly conserved ORFs flanking 4. Discussion this region. Using this method, two previously unidentified gene families, The virtually complete genome sequences were determined 28L and 42.5L, which are clearly present in members of sev- for two isolates of TPV recovered from clinical cases that eral other poxvirus genera, were identified. The region between occurred about 50 years apart. Annotation of the determined ORFs 27L and 28.5L was previously assigned to a large but sequences revealed a single ORF difference between the two overlapping ORF (28R) that was identified in YLDV (Lee et genomes. The ORF 11L is truncated in TPV-Kenya compared al., 2001); therefore ORF 28L was not originally identified as a to TPV-RoC, which suggests that even the small genetic vari- putative gene. However, the evidence that orthologs of 28L are ability present between the two genomes has possibly resulted encoded by a variety of poxviruses suggests that 28L encodes in changes to the proteome. A TPV nucleotide sequence is con- a protein product (Table 4). Conversely, 42.5L had previously served in the TIR region of several poxviruses closely related only been identified in the leporipoxviruses but this ortholog to the yatapoxviruses. An analogous but distinct DNA sequence is highly conserved among the Chordopoxvirinae. Due to its located in the correlate region of the genome is also present in the extremely small size (30–44 codons), it is unlikely to have been orthopoxviruses. Finally, two novel gene families are proposed considered an ORF previously. On close inspection, however, following identification using comparative genomics. 42.5L has a conserved early and late promoter (data not shown; One of the difficulties that arise when limited sequences are www.poxvirus.org/) and the putative amino acid sequence shares available for comparison is that, ORFs that do not meet the 62–77% identity with orthologs among other chordopoxviruses. 24 S.H. Nazarian et al. / Virus Research 129 (2007) 11–25

These properties should be sufficient to designate 42.5L as a orthopoxviruses, has a sequence that is more amenable to chang- putative ORF. Other related predictive methods, such as analyz- ing with different hosts. ing purine skew of the ORFs (Da Silva and Upton, 2005), were not used since both of these ORFs had clear orthologues in other poxviruses. Acknowledgements In addition to identifying new putative ORFs, a conserved DNA sequence in the TIR of several genera of poxviruses was This work was supported by the Canadian Institutes of Health described. An analogous sequence that exhibited a similar orga- Research (CIHR) and National Cancer Institute of Canada nization pattern, but did not show significant homology, has been (NCIC). SN was supported by an Ontario Graduate Scholarship identified previously in the orthopoxviruses. A possible role in and Western Graduate Research Scholarship. GM held a Canada DNA replication for the orthopoxvirus conserved sequence in Research Chair in Molecular Virology and is an International the TIR region has been proposed (Shchelkunov et al., 1998), Scholar of the Howard Hughes Medical Institute. but since there is considerable divergence of the sequence across genera, a precise mechanism is unclear and may be structural References rather than sequence-specific. Structural elements such as Hol- liday Junctions and cruciform structures have been shown to be Afonso, C.L., Delhon, G., Tulman, E.R., Lu, Z., Zsak, A., Becerra, V.M., Zsak, important for resolution of concatenated DNA into unit length L., Kutish, G.F., Rock, D.L., 2005. Genome of deerpox virus. J. Virol. 79 DNA molecules (Palaniyar et al., 1999). This sequence may (2), 966–977. fulfill its function in a similar way, relying on a structural motif. Afonso, C.L., Tulman, E.R., Delhon, G., Lu, Z., Viljoen, G.J., Wallace, D.B., Kutish, G.F., Rock, D.L., 2006. Genome of crocodilepox virus. J. Virol. 80 If the conserved sequence does, in fact, play a role in DNA (10), 4978–4991. replication, then a sequence that performs a similar function Afonso, C.L., Tulman, E.R., Lu, Z., Zsak, L., Kutish, G.F., Rock, D.L., 2000. must be present in all poxviruses. Therefore, an attempt to define The genome of Fowlpox virus. J. Virol. 74 (8), 3815–3831. this sequence in other poxvirus species was made. It is possible Afonso, C.L., Tulman, E.R., Lu, Z., Zsak, L., Osorio, F.A., Balinsky, C., Kutish, to find sequences in other poxvirus TIRs that share some homol- G.F., Rock, D.L., 2002. The genome of swinepox virus. J. Virol. 76 (2), 783–790. ogy to the conserved sequence; however, without more sequence Balbas, P., Gosset, G., 2001. Chromosomal editing in Escherichia coli. Vectors information from viral members within genera it is difficult to for DNA integration and excision. Mol. Biotechnol. 19 (1), 1–12. clearly define since there is a lack of sequence information for Baroudy, B.M., Venkatesan, S., Moss, B., 1982. Incompletely base-paired flip- viral members within other poxvirus genera, including two gen- flop terminal loops link the two DNA strands of the vaccinia virus genome era composed of only a single member each (Suipoxvirus and into one uninterrupted polynucleotide chain. Cell 28 (2), 315–324. Barrett, J.W., Sun, Y., Nazarian, S.H., Belsito, T.A., Brunetti, C.R., McFad- Molluscipoxvirus). den, G., 2006. Optimization of codon usage of poxvirus genes allows for Through a comparative genomics approach, we have iden- improved transient expression in mammalian cells. Virus Genes 33 (1), tified important additional features of yatapoxviruses noted by 15–26. prior sequencing of YMTV and YLDV. The results presented Brunetti, C.R., Amano, H., Ueda, Y., Qin, J., Miyamura, T., Suzuki, T., Li, indicate a relatively slow evolutionary rate, which suggests a X., Barrett, J.W., McFadden, G., 2003. Complete genomic sequence and comparative analysis of the tumorigenic poxvirus Yaba monkey tumor virus. relatively stable, confined evolutionary niche. From this stand- J. Virol. 77 (24), 13335–13347. point, the primate host-range of TPV and YLDV in the central Buller, R.M., Arif, B.M., Black, D.N., Dumbell, K.R., Esposito, J.J., Lefkowitz, region of the rainforest of Africa appears to have remained E.J., McFadden, G., Moss, B., Mercer, A.A., Moyer, R.W., Skinner, M.A., the same, at least for 50 years, despite extensive ecological Tripathy, D.N., 2005. . In: Fauquet, C., Mayo, M.A., Maniloff, changes, particularly urbanization of forested areas. There have J., Desselberger, U., Ball, L.A. (Eds.), Virus Taxonomy: Classification and Nomenclature of Viruses; Eighth Report of the International Committee on been suggestions that an insect vector might be involved in Yat- Taxonomy of Viruses. Elsevier/Academic Press, Oxford, pp. 117–133. apoxvirus transmission because TPV and YLDV infection are Cameron, C., Hota-Mitchell, S., Chen, L., Barrett, J., Cao, J.-X., Macaulay, C., localized to one or two lesions and not systemic like Willer, D., Evans, D., McFadden, G., 1999. The complete DNA sequence of (Damon, 2007). Maintaining a lifecycle that includes a poten- myxoma virus. Virology 264 (2), 298–318. tial non-human primate reservoir, an insect reservoir, as well Da Silva, M., Upton, C., 2005. Using purine skews to predict genes in AT-rich poxviruses. BMC Genomics 6 (1), 22. as a human reservoir suggests that a constant genetic selective Damon, I.K., 2007. Poxviruses. In: Knipe, D.M., Howley, P.M. (Eds.), Fields pressure might be maintained on the TPV and YLDV genome, Virology, Vol.2, 5th ed. Lippincott, Williams & Wilkins, New York, pp. which would lower the likelihood of sequence divergence. How- 2947–2976. ever, this may not explain the lack of nucleotide changes in the Delhon, G., Tulman, E.R., Afonso, C.L., Lu, Z., de la Concha-Bermejillo, A., third bp position and it is unlikely that the DNA polymerase Lehmkuhl, H.D., Piccone, M.E., Kutish, G.F., Rock, D.L., 2004. Genomes of the parapoxviruses ORF virus and bovine papular stomatitis virus. J. Virol. encoded by Yatapoxviruses has a high enough fidelity to explain 78 (1), 168–177. this phenomenon. The codon bias present in many Yatapoxvirus Dhar, A.D., Werchniak, A.E., Li, Y., Brennick, J.B., Goldsmith, C.S., Kline, R., genes represent the most rarely used codons in mammalian cells Damon, I., Klaus, S.N., 2004. Tanapox infection in a college student. N. (Barrett et al., 2006). An alternative explanation to explain the Engl. J. Med. 350 (4), 361–366. third position conservation is that this codon bias is required for Domi, A., Moss, B., 2002. Cloning the vaccinia virus genome as a bacterial artificial chromosome in Escherichia coli and recovery of infectious virus in efficient gene expression in a variety of distinct host species. mammalian cells. Proc. Natl. Acad. Sci. U.S.A. 99 (19), 12415–12420. In contrast, a poxvirus that is able to infect several different Downie, A.W., Espana, C., 1972. Comparison of Tanapox virus and Yaba-like hosts, e.g. Cowpox virus, which appears to be parental to the viruses causing epidemic disease in monkeys. J. Hyg. (Lond.) 70 (1), 23–32. S.H. Nazarian et al. / Virus Research 129 (2007) 11–25 25

Espana, C., Brayton, M.A., Ruebner, B.H., 1971. Electron microscopy of the Seet, B.T., Johnston, J.B., Brunetti, C.R., Barrett, J.W., Everett, H., Cameron, Tana poxvirus. Exp. Mol. Pathol. 15 (1), 34–42. C., Sypula, J., Nazarian, S.H., Lucas, A., McFadden, G., 2003. Poxviruses Esposito, J.J., Sammons, S.A., Frace, A.M., Osborne, J.D., Olsen-Rasmussen, and immune evasion. Annu. Rev. Immunol. 21, 377–423. M., Zhang, M., Govil, D., Damon, I.K., Kline, R., Laker, M., Li, Y., Smith, Senkevich, T.G., Bugert, J.J., Sisler, J.R., Koonin, E.V., Darai, G., Moss, B., G.L., Meyer, H., Leduc, J.W., Wohlhueter, R.M., 2006. Genome sequence 1996. Genome sequence of a human tumorigenic poxvirus: prediction of diversity and clues to the evolution of variola (smallpox) virus. Science 313 specific host response-evasion genes. Science 273, 813–816. (5788), 807–812. Shchelkunov, S.N., Safronov, P.F., Totmenin, A.V., Petrov, N.A., Ryazankina, Gubser, C., Smith, G.L., 2002. The sequence of camelpox virus shows it is most O.I., Gutorov, V.V., Kotwal, G.J., 1998. The genomic sequence analysis of the closely related to variola virus, the cause of smallpox. J. Gen. Virol. 83 (Pt left and right species-specific terminal region of a cowpox virus strain reveals 4), 855–872. unique sequences and a cluster of intact ORFs for immunomodulatory and Knight, J.C., Novembre, F.J., Brown, D.R., Goldsmith, C.S., Esposito, J.J., 1989. host range proteins. Virology 243 (2), 432–460. Studies on Tanapox virus. Virology 172 (1), 116–124. Shchelkunov, S.N., Totmenin, A.V., Babkin, I.V., Safronov, P.F., Ryazankina, Lee, H.J., Essani, K., Smith, G.L., 2001. The genome sequence of Yaba-like O.I., Petrov, N.A., Gutorov, V.V., Uvarova, E.A., Mikheev, M.V., Sisler, J.R., disease virus, a yatapoxvirus. Virology 281 (2), 170–192. Esposito, J.J., Jahrling, P.B., Moss, B., Sandakhchiev, L.S., 2001. Human McNulty Jr., W.P., Lobitz Jr., W.C., Hu, F., Maruffo, C.A., Hall, A.S., 1968. monkeypox and smallpox viruses: genomic comparison. FEBS Lett. 509 A pox disease in monkeys transmitted to man. Clinical and histological (1), 66–70. features. Arch. Dermatol. 97 (3), 286–293. Tulman, E.R., Afonso, C.L., Lu, Z., Zsak, L., Kutish, G.F., Rock, D.L., 2001. Mercer, A.A., Fleming, S.B., Ueda, N., 2005. F-box-like domains are present in Genome of lumpy skin disease virus. J. Virol. 75, 7122–7130. most poxvirus ankyrin repeat proteins. Virus Genes 31 (2), 127–133. Tulman, E.R., Afonso, C.L., Lu, Z., Zsak, L., Kutish, G.F., Rock, D.L., 2004. Mercer, A.A., Ueda, N., Friederichs, S.M., Hofmann, K., Fraser, K.M., Bateman, The genome of Canarypox virus. J. Virol. 78 (1), 353–366. T., Fleming, S.B., 2006. Comparative analysis of genome sequences of three Tulman, E.R., Afonso, C.L., Lu, Z., Zsak, L., Sur, J.H., Sandybaev, N.T., isolates of Orf virus reveals unexpected sequence variation. Virus Res. 116 Kerembekova, U.Z., Zaitsev, V.L., Kutish, G.F., Rock, D.L., 2002. The (1–2), 146–158. genomes of sheeppox and Goatpox viruses. J. Virol. 76 (12), 6054– Moss, B., 2007. Poxviridae: the viruses and their replication. In: Knipe, D.M., 6061. Howley, P.M. (Eds.), Fields Virology, vol. 2, fifth ed. Lippincott, Williams Tulman, E.R., Delhon, G., Afonso, C.L., Lu, Z., Zsak, L., Sandybaev, N.T., & Wilkins, New York, pp. 2905–2946. Kerembekova, U.Z., Zaitsev, V.L., Kutish, G.F., Rock, D.L., 2006. Genome Palaniyar, N., Gerasimopoulos, E., Evans, D.H., 1999. Shope fibroma virus DNA of Horsepox virus. J. Virol. 80 (18), 9244–9258. topoisomerase catalyses holliday junction resolution and hairpin formation Willer, D., McFadden, G., Evans, D.H., 1999. The complete genome sequence in vitro. J. Mol. Biol. 287 (1), 9–20. of Shope (rabbit) fibroma virus. Virology 264 (2), 319–343.