Analysis of Synonymous Codon Usage in Hepatitis A Virus
Zhang et al. Virology Journal 2011, 8:174
http://www.virologyj.com/content/8/1/174

RESEARCH Open Access

Yiqiang Zhang1,2†, Yongsheng Liu1†, Wenqian Liu1, Jianhua Zhou1, Haotai Chen1, Yin Wang2, Lina Ma1, Yaozhong Ding1 and Jie Zhang1*

* Correspondence: [email protected]
† Contributed equally
1 State Key Laboratory of Veterinary Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou 730046, Gansu, China

Abstract

Background: Hepatitis A virus is the causative agent of type A viral hepatitis, which causes occasional acute hepatitis. Nevertheless, little information is available about the synonymous codon usage pattern of the HAV genome in the process of its evolution. In this study, the key genetic determinants of codon usage in HAV were examined.

Results: The overall extent of codon usage bias in HAV is high among the Picornaviridae. The patterns of synonymous codon usage differ considerably among HAV genomes from different locations. The base composition is closely correlated with codon usage bias. Furthermore, the most important determinant of such a high codon bias in HAV is mutation pressure rather than natural selection.

Conclusions: HAV presents a higher codon usage bias than other members of Picornaviridae. Compositional constraint is a significant element that influences the variation of synonymous codon usage in the HAV genome. In addition, mutation pressure is supposed to be the major factor shaping the hyperendemic codon usage pattern of HAV.

Background

Hepatitis A virus (HAV), the causative agent of type A viral hepatitis, is an ancient human virus that was first identified in the stools of infected people in 1973 [1]. HAV is a non-enveloped, single-stranded, positive-sense RNA virus which belongs to the order Picornavirales, family Picornaviridae, genus Hepatovirus in virus taxonomy [2-4]. The genome of HAV is approximately 7500 nucleotides in length and contains a large open reading frame (ORF) encoding a polyprotein in which the major capsid proteins represent the amino-terminal third, with the remainder of the polyprotein comprising a series of nonstructural proteins required for HAV RNA replication: 2B, 2C, 3A, 3B, 3Cpro and 3Dpol. Based on genetic studies, HAV has been proposed to be divided into six different genotypes [5]. However, there is only one known serological group of human HAV [6,7]. Although HAV causes occasional, dramatic disease outbreaks of acute hepatitis with fatal outcomes in otherwise healthy adults as well as isolated severe cases of hepatitis, it has never been associated with chronic liver disease [8].

The genetic code uses 64 codons to represent the 20 standard amino acids and the stop signals. The alternative codons for the same amino acid are termed synonymous codons. Synonymous mutations tend to occur at the third base position, where codons can be interchanged without altering the primary sequence of the polypeptide product. Some reports indicate that synonymous codons are not chosen equally, both within and between genomes [9-13]. In general, codon usage variation may be the product of natural selection and/or mutation pressure for accurate and efficient translation in various organisms [14-21]. Codon usage variation is widely considered an indicator of the forces shaping genome evolution. In addition, compared with natural selection, mutation pressure plays an important role in the synonymous codon usage pattern of some RNA viruses [18,22,23].

Nevertheless, little information is available about the codon usage pattern of the HAV genome, including the relative synonymous codon usage (RSCU) and codon usage bias (CUB), in the process of its evolution.
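As an illustration of the RSCU measure mentioned above, the following minimal Python sketch computes RSCU values for a single coding sequence. It assumes the commonly used definition, in which the RSCU of a codon is its observed count divided by the mean count of all synonymous codons for the same amino acid; the article does not state its formula or software in this section, so the function names (`synonymous_families`, `rscu`) and the toy input are hypothetical and not the authors' pipeline.

```python
from collections import Counter, defaultdict

# Standard genetic code in the conventional T, C, A, G ordering ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def synonymous_families():
    """Group sense codons by amino acid, keeping only families with 2+ codons.

    Met (ATG) and Trp (TGG) have a single codon each and stop codons are
    skipped, which leaves 59 codons.
    """
    families = defaultdict(list)
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid != "*":
            families[amino_acid].append(codon)
    return {aa: codons for aa, codons in families.items() if len(codons) > 1}


def rscu(cds):
    """Return {codon: RSCU} for an in-frame coding sequence (DNA or RNA).

    Assumed definition: observed count * family size / total family count.
    Amino acids that do not occur in the sequence are omitted.
    """
    seq = cds.upper().replace("U", "T")
    # Read complete codons only; a trailing partial codon is ignored.
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    values = {}
    for codons in synonymous_families().values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue
        for codon in codons:
            values[codon] = counts[codon] * len(codons) / total
    return values


if __name__ == "__main__":
    # Toy input (not HAV data): four alanine codons, three GCU and one GCC.
    for codon, value in rscu("GCUGCUGCUGCC").items():
        print(codon, value)
```

Under this definition a value of 1.0 means a codon is used exactly as often as expected if all synonymous codons were chosen equally; in the toy input, GCT (GCU) scores 3.0 and GCC scores 1.0, while the unused alanine codons score 0.0.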
In this study, the key genetic determinants of codon usage index in HAV were examined.

Full list of author information is available at the end of the article

© 2011 Zhang et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Results

Synonymous codon usage in HAV

The nucleotide contents of the complete coding regions of all 21 HAV genomes were analyzed (Table 1). The (C+G)% content ranged from 36.9 to 37.9, with a mean of 37.15 and an S.D. of 0.28, indicating that A and U are the major nucleotides of the HAV genome. Comparing the values of A3%, U3%, C3% and G3%, U3% was distinctly the highest and C3% was the lowest. The (C3+G3)% in the complete coding region of each HAV genome ranged from 28.8 to 31.5, with a mean of 29.92 and an S.D. of 0.62. The effective number of codons (ENC) values of these HAV genomes ranged from 38.8 to 40.7, with a mean of 39.34 and an S.D. of 0.58. These ENC values are somewhat low, indicating a certain degree of codon preference in the HAV genome. The overall relative synonymous codon usage (RSCU) values of 59 codons in the 21 HAV genomes were also analyzed (Table 2).

Table 1 Nucleotide contents identified in the complete coding regions (length >250 bp) of the hepatitis A virus genome (21 isolates)

| SN | A% | A3% | U% | U3% | C% | C3% | G% | G3% | (C+G)% | (C3+G3)% | ENC |
|----|------|------|------|------|------|------|------|------|------|------|------|
| 1 | 29.8 | 26.9 | 32.6 | 41.9 | 15.3 | 9.5 | 22.3 | 21.7 | 37.6 | 31.2 | 39.6 |
| 2 | 29.9 | 27.4 | 33.0 | 43.2 | 15.2 | 9.0 | 21.9 | 20.4 | 37.1 | 29.4 | 39.2 |
| 3 | 30.2 | 27.7 | 32.9 | 43.4 | 15.3 | 9.2 | 21.6 | 19.6 | 36.9 | 28.8 | 39.2 |
| 4 | 30.0 | 27.2 | 32.9 | 43.1 | 15.3 | 9.3 | 21.8 | 20.4 | 37.1 | 29.7 | 39.0 |
| 5 | 30.2 | 27.9 | 32.7 | 42.1 | 15.5 | 10.0 | 21.6 | 20.0 | 37.1 | 30.0 | 38.9 |
| 6 | 30.1 | 27.7 | 32.9 | 42.9 | 15.2 | 9.3 | 21.8 | 20.2 | 36.9 | 29.5 | 38.8 |
| 7 | 30.2 | 28.0 | 32.7 | 42.1 | 15.4 | 10.0 | 21.6 | 19.9 | 37.0 | 29.9 | 38.9 |
| 8 | 30.3 | 28.1 | 32.7 | 42.1 | 15.5 | 10.0 | 21.5 | 19.8 | 36.9 | 29.8 | 39.0 |
| 9 | 30.2 | 28.0 | 32.7 | 42.1 | 15.4 | 10.0 | 21.6 | 19.9 | 37.0 | 29.9 | 38.9 |
| 10 | 30.1 | 27.4 | 32.9 | 42.8 | 15.2 | 9.2 | 21.8 | 20.6 | 37.0 | 29.8 | 38.9 |
| 11 | 29.8 | 27.0 | 32.4 | 41.9 | 15.8 | 10.3 | 22.0 | 20.7 | 37.7 | 31.0 | 40.7 |
| 12 | 30.3 | 27.9 | 32.8 | 42.6 | 15.3 | 9.4 | 21.6 | 20.1 | 36.9 | 29.5 | 38.8 |
| 13 | 29.6 | 25.8 | 32.5 | 42.7 | 16.1 | 11.0 | 21.8 | 20.5 | 37.9 | 31.5 | 40.7 |
| 14 | 29.8 | 26.7 | 32.7 | 43.3 | 15.9 | 10.3 | 21.5 | 19.7 | 37.4 | 30.0 | 40.0 |
| 15 | 30.1 | 27.5 | 32.9 | 43.3 | 15.3 | 9.1 | 21.7 | 20.2 | 37.0 | 29.3 | 39.2 |
| 16 | 30.0 | 27.4 | 32.7 | 42.5 | 15.6 | 10.1 | 21.7 | 20.1 | 37.3 | 30.2 | 40.0 |
| 17 | 30.0 | 27.3 | 32.6 | 42.6 | 15.5 | 9.6 | 21.8 | 20.4 | 37.3 | 30.0 | 39.6 |
| 18 | 30.1 | 27.5 | 32.7 | 42.7 | 15.4 | 9.7 | 21.7 | 20.1 | 37.1 | 29.8 | 39.5 |
| 19 | 30.3 | 28.0 | 32.6 | 42.0 | 15.5 | 10.0 | 21.6 | 19.9 | 37.0 | 29.9 | 38.9 |
| 20 | 30.1 | 27.5 | 32.9 | 42.9 | 15.2 | 9.1 | 21.8 | 20.6 | 37.0 | 29.7 | 39.1 |
| 21 | 30.0 | 27.4 | 32.9 | 42.8 | 15.3 | 9.2 | 21.8 | 20.6 | 37.0 | 29.8 | 39.2 |

ENC is the effective number of codons.

Correspondence analysis (COA)

To investigate the major trends in codon usage variation among HAV, COA was applied to all 21 HAV complete coding regions selected for this study. COA detected one major trend in the first axis (f′1), which accounted for 26.98% of the total variation, and another major trend in the second axis (f′2), which accounted for 19.50% of the total variation. A plot of the first and second principal axes of the complete coding region of each gene is shown in Figure 1. The coordinates of each gene are relatively isolated, except for the Australia isolates, the Brazil isolate and one Russia isolate. Nevertheless, these relatively isolated points tend to cluster into several groups according to genotype. However, MBB, which was isolated from North Africa, had a distinctive codon usage pattern that contrasted with the other IB strains. Together, these results imply that HAV strains isolated from different places, even those of the same genotype, show different trends in codon usage variation. Interestingly, the pattern of codon usage in vaccine strain H2 changed to an MBB-like pattern after continuous culturing in a human diploid cell line (KMB17), i.e. H2K5 and