Slow Evolution of Rag1 and Pomc Genes in Vertebrates with Large Genomes
Bianca Sclavi 1* and John Herrick 2*
*corresponding authors
1. LBPA, UMR 8113 du CNRS, ENS Cachan, Cachan, France 94235 [email protected]
2. [email protected]

Abstract

Growing evidence suggests that many vertebrate lineages are evolving at significantly different rates. As a first approximation of evolutionary rates, we assessed the amount of neutral (dS) and non-neutral (dN) substitutions that have accumulated within and across sister clades since the time of their divergence. We found that in fish, tetraodontiformes (pufferfish) are evolving at faster rates than cypriniformes (freshwater teleosts), while cypriniformes are evolving faster than elasmobranchs (sharks, skates and rays). A similar rate variation was observed in salamanders: plethodontidae were found to evolve at a rate nearly two-fold faster than the hydromantes lineage. We discuss possible explanations for this striking variation in substitution rates among different vertebrate lineages that occupy widely diverse habitats and niches.

Introduction

Rates of molecular evolution are known to vary significantly across lineages belonging to the same evolutionary group (Lanfear et al. 2010). Nucleotide substitution rates in birds, for example, are higher in the songbird lineage than in the chicken lineage (Nam et al. 2010), while in mammals, rates in the murid lineage are higher than in humans. The molecular basis for the observed variation in mutation and substitution rates is complex and poorly understood. DNA replication errors, however, are a major source of endogenous mutations, and mutation rates across the genome have recently been found to correlate with DNA replication timing in fungi, invertebrates and mammals (Wolfe et al. 1989; Chen et al. 2010; Weber et al. 2012; Stamatoyannopoulos et al. 2009; Lang and Murray 2011; Agier and Fischer 2012).
In addition, it has been proposed that substitution rates vary as a result of differing DNA repair efficiencies in a lineage-specific manner (Britten 1986). The intricate interplay between DNA replication and DNA repair systems as the cell cycle progresses suggests that growing reliance on error-prone DNA repair systems such as Translesion DNA Synthesis (TLS) and Non-homologous End-joining (NHEJ) of DNA double strand breaks might explain the increase in mutation rate as the DNA synthetic phase, or S phase, of the cell cycle advances (Herrick 2011). Other potential and related explanations concern the compartmentalization of the genome into different forms of chromatin (e.g. early replicating euchromatin: EC, and late replicating heterochromatin: HC) (Lande-Diner et al. 2009), which vary in DNA content between lineages and differentially rely on DNA repair systems. It remains unknown, however, if these same repair systems can account for differences in mutation/substitution rates between lineages.

In vertebrates, lineage-specific mutation rate variation has been associated with several different but interacting life history traits including body size, generation time and metabolic rate (Martin and Palumbi 1993; Bromham 2011). A generation time effect (GT), for example, has been proposed to account for the decrease in mutation rate resulting from DNA replication errors as the primate lineage evolved (Hwang and Green 2004). Low rates of molecular evolution in some acipenseriform lineages have similarly been attributed to a generation time effect on mutation and substitution rates (Krieger and Fuerst 2002). How GT might impact rates of molecular evolution remains unclear, but GT is known to correlate significantly with genome size (C-value) in both plants and animals (Gregory 2001; Hardie and Hebert 2003; Francis et al. 2008). Low mutation rates are generally acknowledged to be required for the evolution of large genomes.
Hinegardner and Rosen first suggested in 1972 that fish with large genomes are evolving more slowly than fish with smaller genomes (Hinegardner and Rosen 1972). An investigation of evolutionary rates in lungfish (C-value 70 pg) likewise revealed that lungfish are evolving up to two-fold more slowly than either frogs or mammals (C-value 3 pg) (Lee et al. 2006). Similar observations have been made on salamanders (Kozak et al. 2005). Consistent with observations of low rates of molecular evolution in taxa with large genomes, other studies in plants, fish and animals revealed a genome size effect on extinction rates and species richness (Vinogradov 2004; Knight et al. 2005; Olmo 2006; Kraaijeveld 2010). Together, these observations suggest that variations in mutation/substitution rates influence the mode and tempo of genome size evolution and rates of diversification in different plant and animal lineages.

To further investigate the association between diversification rate and genome size, we measured substitutions at synonymous (dS) and non-synonymous (dN) coding sites in two nuclear genes, rag1 and pomc, from three different vertebrate groups: fish, frogs and salamanders. Within each group, we selected closely related lineages in order to compare the number of substitutions that have occurred since the lineages diverged. Two sister lineages were selected from cypriniformes, the largest freshwater fish clade. Substitution rates were then compared to substitution rates in closely related lineages from tetraodontiformes (pufferfish) and chondrichthyes (skates, rays and sharks). Similar analyses were performed on anurans (hyla and toads) and urodelae (salamanders). These studies revealed that rates of molecular evolution appear to be strongly conserved between the sister lineages examined here, but vary significantly between distantly related lineages in the same group.
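The distinction between synonymous and non-synonymous substitutions can be illustrated with a minimal sketch. It assumes two aligned, gap-free coding sequences in the same reading frame and the standard genetic code, and it only tallies raw codon differences. It is not the estimation procedure used in this study: real dS and dN estimates also normalize by the number of synonymous and non-synonymous sites and correct for multiple substitutions at the same site. The function name `count_substitutions` and the example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_substitutions(seq_a, seq_b):
    """Count synonymous and non-synonymous codon differences.

    Simplification (an assumption of this sketch): only codon pairs that
    differ at exactly one nucleotide are classified. Codons with several
    differences, or with characters outside T/C/A/G, are skipped.
    Returns a tuple (synonymous, non_synonymous).
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned and a multiple of 3 long")

    synonymous = 0
    non_synonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue  # ambiguous base or gap
        n_diff = sum(x != y for x, y in zip(codon_a, codon_b))
        if n_diff != 1:
            continue  # identical codon, or multiple hits (not resolved here)
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1  # same amino acid
        else:
            non_synonymous += 1  # amino acid replaced
    return synonymous, non_synonymous


# Made-up sequences: TTT -> TTC keeps Phe, GCT -> GAT changes Ala to Asp.
syn, nonsyn = count_substitutions("ATGTTTGCT", "ATGTTCGAT")
print(syn, nonsyn)
```

In this toy pair, one difference leaves the encoded amino acid unchanged and one replaces it, which is the contrast that dS and dN are meant to capture.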
In salamanders, however, two closely related lineages, the plethodontidae and the hydromantes, exhibit a more than two-fold variation in evolutionary rates. As expected, these studies also revealed that large genomes tend to be associated with low rates of molecular evolution. The trend is remarkably reproducible among the lineages examined with the exception of cartilaginous fish. In skates, rays and sharks, genome size varies up to ten-fold (1.2 pg to 12 pg), but, as previously reported, substitution rates remain uniform and extremely low across the respective lineages (Martin et al. 1992). These findings contribute to the growing body of evidence that rates of molecular evolution are highly heterogeneous among vertebrates, and support the notion that organisms with large genomes tend to have lower substitution rates and rates of evolution.

Results

Genome size variation in fish, frogs and salamanders

Earlier studies in plants, fish and animals revealed an association of genome size with extinction rates and species richness (Vinogradov 2004; Knight et al. 2005; Olmo 2006; Kraaijeveld 2010). The association between genome size and species richness becomes especially apparent in groups with genome sizes larger than 5 pg in amniotes and 14 pg in plants (Knight et al. 2005; Olmo 2006). We therefore examined the number of species as a function of genome size in three related groups: fish, frogs and salamanders. The genome size of each species was obtained from the Animal Genome Size Database (Gregory et al. 2007). Figure 1 shows that ray-finned fish have an optimal genome size that tends toward smaller genomes between 1 and 2 pg. In contrast, cartilaginous fish and frogs have an optimal genome size between 3 and 5 pg, and salamanders, which are the least speciose of the three groups, tend to have an optimal genome size of 25 to 30 pg.
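One way to read an "optimal" genome size from a plot of species number against genome size is as the most populated bin of a histogram of C-values. The sketch below assumes that interpretation, fixed-width bins of 1 pg, and a short list of made-up C-values used purely for illustration (not data from the Animal Genome Size Database). The function name `modal_bin` is hypothetical.

```python
import math
from collections import Counter


def modal_bin(c_values, bin_width=1.0):
    """Return (lower, upper) bounds in pg of the most populated bin.

    Each C-value is assigned to the half-open bin [k * w, (k + 1) * w).
    Ties are resolved in favour of the bin encountered first.
    """
    if not c_values:
        raise ValueError("need at least one C-value")
    counts = Counter(math.floor(value / bin_width) for value in c_values)
    index, _ = counts.most_common(1)[0]
    return index * bin_width, (index + 1) * bin_width


# Illustrative values only: most fall between 1 and 2 pg.
example_c_values = [0.8, 1.1, 1.3, 1.4, 1.7, 1.9, 2.2, 2.6, 3.4]
print(modal_bin(example_c_values))
```

With real data the result depends on the chosen bin width, so the peak is best read as a range rather than a single value.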
Given that fish are the most species-rich group (ray-finned fish: ~24000 species, cartilaginous fish: ~810) compared to anurans (~4000) and urodelae (521), these results support the earlier findings that large genome size negatively impacts species richness in different taxonomic groups.

Previous studies have shown that the variation in genome size in teleost fish approximates a lognormal distribution (Hardie and Hebert 2004). The dataset used here is limited to ray-finned and cartilaginous fish. In agreement with the earlier studies, both data sets fit a lognormal distribution (Figure 1); combined data sets for fish, however, approximate a power-law distribution (Supplementary Figure 1). In contrast, the distribution in frogs is approximately Gaussian, while the urodelae distribution shows two peaks, one between 25 and 30 pg and the second between 40 and 45 pg, both Gaussian. In the first peak there is a slightly higher proportion of Ambystomatidae (13% vs 9% in the total population) and Salamandridae (45% vs 33%) and a decreased proportion of Plethodontidae (38% vs 47%), which constitutes the majority of the second peak.

A Gaussian distribution indicates that the main mechanisms responsible for genome size variation are additive (randomly occurring deletions and amplifications), whereas lognormal distributions indicate multiplicative effects of varying degrees (genome duplication and polyploidization) (Hardie and Hebert 2004). The ancestral vertebrate lineage is believed to have experienced one or two whole genome duplication events. In contrast, teleost fish have undergone an additional duplication event (the 3R hypothesis), which might have contributed to their faster evolutionary rates compared to all other vertebrates (Robinson-Rechavi 1998). Hence, genome size variation in the three different groups examined here appears to follow markedly different modes of genome evolution.

Evolutionary rates of rag1 and POMC in fish, frogs and salamanders