DNA Replication and Strand Asymmetry in Prokaryotic and Mitochondrial Genomes
Total Page:16
File Type:pdf, Size:1020Kb
16 Current Genomics, 2012, 13, 16-27 DNA Replication and Strand Asymmetry in Prokaryotic and Mitochondrial Genomes Xuhua Xia*,1,2 1Department of Biology and Center for Advanced Research in Environmental Genomics, University of Ottawa, 30 Marie Curie, P.O. Box 450, Station A, Ottawa, Ontario, Canada 2Ottawa Institute of Systems Biology, Ottawa, Canada Abstract: Different patterns of strand asymmetry have been documented in a variety of prokaryotic genomes as well as mitochondrial genomes. Because different replication mechanisms often lead to different patterns of strand asymmetry, much can be learned of replication mechanisms by examining strand asymmetry. Here I summarize the diverse patterns of strand asymmetry among different taxonomic groups to suggest that (1) the single-origin replication may not be universal among bacterial species as the endosymbionts Wigglesworthia glossinidia, Wolbachia species, cyanobacterium Synechocystis 6803 and Mycoplasma pulmonis genomes all exhibit strand asymmetry patterns consistent with the multiple origins of replication, (2) different replication origins in some archaeal genomes leave quite different patterns of strand asymmetry, suggesting that different replication origins in the same genome may be differentially used, (3) mitochondrial genomes from representative vertebrate species share one strand asymmetry pattern consistent with the strand- displacement replication documented in mammalian mtDNA, suggesting that the mtDNA replication mechanism in mammals may be shared among all vertebrate species, and (4) mitochondrial genomes from primitive forms of metazoans such as the sponge and hydra (representing Porifera and Cnidaria, respectively), as well as those from plants, have strand asymmetry patterns similar to single-origin or multi-origin replications observed in prokaryotes and are drastically different from mitochondrial genomes from other metazoans. This may explain why sponge and hydra mitochondrial genomes, as well as plant mitochondrial genomes, evolves much slower than those from other metazoans. Received on: July 07, 2011 - Revised on: September 26, 2011 - Accepted on: October 02, 2011 Keywords: Archaea, DNA replication, deamination, GC skew, mitochondria, mutation, origin of replication, selection. INTRODUCTION 0.0021, SA = 0.0023). However, B. subtilis genomic DNA exhibits strong local asymmetry (Fig. 1). The asymmetry DNA strand asymmetry refers to the differential differs between the leading and the lagging strands, with the distribution of nucleotides between the two DNA strands, leading strand having more G than C and the lagging strand e.g., one has more A or C than the other. This implies a more C than G [2]. The strand compositional asymmetry is violation of Chargaff’s parity rule 2 [1], i.e., A = T and C = strong enough to allow the identification of the bacterial G within each strand. Consequently, strand asymmetry is origin of replication (Fig. 1) whose flanking sequences typically measured by nucleotide skews such as GC skew change direction in GC skew [2, 10-14] or in the components and AT skew [2-9], referred hereafter as S and S : G A of the Z-curve [11, 15-17]. For this reason, strand asymmetry G C is often computed locally instead of globally, with the S = GC Skew = nucleotide skews computed with a sliding window. The G G + C (1) validity and effectiveness of the in silico methods using A T S = AT Skew = strand asymmetry to identify the origin of replication in A A + T prokaryotic species are well established by many experimental verifications of the predicted replication origins Chargaff’s parity rule 2 may be satisfied at the genomic level in spite of strong local strand asymmetry. For example, [18] and the utility of these methods in practice has been demonstrated by many recent studies on prokaryotic Bacillus subtilis studied by Chargaff and his colleagues [1] genomes [15, 17, 19-24], mitochondrial genomes [25-28] has its genomic nucleotide frequencies being 28.18%, and plasmid genomes [16]. 21.81%, 21.71%, and 28.30% for A, C, G and T, according to the genomic sequence deposited in GenBank The nucleotide skews in Eq. (1) were extended in two (NC_000964). Thus, both SG and SA are close to 0 (SG = - ways. The first leads to the cumulative skew [8] which is based on summation of adjacent skew values and is equivalent to nucleotide skews with a wider sliding window *Address correspondence to this author at the Department of Biology and than what is used to compute individual skew values. For Center for Advanced Research in Environmental Genomics, University of example, the Mycoplasma pneumoniae genome has been Ottawa, 30 Marie Curie, P.O. Box 450, Station A, Ottawa, Ontario, K1N 6N5, Canada; Tel: (613) 562-5800 ext. 6886; Fax: (613) 562-5486; used to illustrate the advantage of the cumulative skew E-mail: [email protected] method which detects the replication origin while the 1389-2029/12 $58.00+.00 ©2012 Bentham Science Publishers DNA Replication and Strand Asymmetry in Prokaryotic Current Genomics, 2012, Vol. 13, No. 1 17 0.1 SG 0.08 0.06 0.04 0.02 SA 0 -0.02 Replication Nucleotide skew Nucleotide termination -0.04 -0.06 -0.08 Replication Origin -0.1 0 500000 1000000 1500000 2000000 25000003000000 3500000 4000000 Site Fig. (1). Nucleotide skew plot for the Bacillus subtilis genome (NC_000964), with window size = 537179 and step size = 21078. Each data point is at the beginning of its sliding window. The replication origin is identified as the genomic site where the GC skew (SG) changes from negative to positive and the replication termination is the site where SG changes from positive to negative. original GC skew method does not [8]. The real difference is the GC skew values of neighboring sliding windows is not really between the two methods, but between two maximized. This method is implemented in DAMBE [29, different widths of the sliding window (Fig. 2). A wide 30]. sliding window detects a clear change in polarity of strand The second extension of the nucleotide skews is to the asymmetry (Fig. 2A), but a narrow window fails to (Fig. word or motif skew [31] which is defined as 2B). In this review, the window size for skew plots is optimized with the criterion that the autocorrelation between (A) 0.03 0.02 0.01 0 G S Replication Replication -0.01 origin termination -0.02 -0.03 0 100000 200000 300000 400000 500000 600000 700000 800000 Site (B) 0.15 0.1 0.05 G S 0 -0.05 -0.1 -0.15 0 100000 200000 300000 400000 500000 600000 700000 800000 Site Fig. (2). SG plot for the Mycoplasma pneumoniae genome (NC_000912), with window size = 136694 and step size = 4081 (A), and with window size = 4000 and step size = 3000 (B). 18 Current Genomics, 2012, Vol. 13, No. 1 Xuhua Xia argue that (1) the common assumption of the single-origin N m N m S = rc (2) DNA replication in bacterial species may not be valid m N + N m mrc because bacterial genomes from the endosymbionts Wigglesworthia glossinidia and Wolbachia (from where m is either a nucleotide (e.g., G or A) or a motif (e.g., Drosophila melanogaster) exhibit patterns of strand ACG), mrc is the reverse complement of m (mrc = C if m = asymmetry strongly indicative of multiple origins of G, or mrc = CGT if m = ACG), and Nx is the number of x in replication, (2) different replication origins in some archaeal the sliding window (where x is either m or mrc). GC skew genomes leave quite different patterns of strand asymmetry, and AT skew are special cases of Sm when m is equal to suggesting that different replication origins in the same either G or A, respectively, i.e., GC Skew is SG and AT skew genome may be differentially used, (3) the pattern of strand is SA. It is for this reason that I have denoted GC Skew and asymmetry from mammalian mitochondrial genomes is AT Skew by SG and SA, respectively, in Eq. (1). consistent with the strand-displacement model of replication While transcription is known to contribute to strand well documented in mammalian mitochondria [35-40], and asymmetry [32, 33], the most important contributor to strand this pattern is shared among mitochondrial genomes from asymmetry is DNA replication associated with differential representative vertebrate species, suggesting a similar DNA strand-specific mutation bias [21, 22, 34], which is replication mechanism among vertebrate mitochondrial confirmed by a study that assesses the contribution of both genomes, and (4) primitive forms of metazoans such as transcription and replication to strand asymmetry [23]. sponge and hydra, as well as plants, have mitochondrial Because different replication mechanisms often lead to strand asymmetry patterns similar to prokaryotes and different patterns of strand asymmetry, much can be learned drastically different from higher metazoans, suggesting that of replication mechanisms by examining strand asymmetry. mitochondrial genomes in plants and in primitive In this review I will summarize the different patterns of invertebrate such as sponge and hydra share the a similar strand asymmetry in different prokaryotic and mitochondrial replication mechanism as their bacterial ancestor with a genomes as a basis to infer the mechanism of DNA much lower replication error rate than that in mammalian replication that gives rise to the diversity of strand mitochondrial genomes whose strand-displacement asymmetry patterns. Based on the empirical evidence, I replication is highly error-prone. This sheds light on why (A)0.038 E. coli (B) C. jejuni A 24.68% 0.13 A 34.83% S SG 0.028 G C 25.35% C 15.34% G 25.25% 0.08 G 15.20% 0.018 T 24.72% T 34.62% 0.008 0.03 Sm Sm -0.002 -0.02 -0.012 S SA A -0.07 -0.022 -0.12 -0.032 -0.042 -0.17 0 1000000 2000000 3000000 4000000 5000000 0 500000 1000000 1500000 Site Site (C) (D) H.