Insights into Cyclostome Phylogenomics: Pre-2R or Post-2R?
Shigehiro Kuraku*

Lehrstuhl für Zoologie und Evolutionsbiologie, Department of Biology, University of Konstanz, Universitätsstrasse 10, 78457 Konstanz, Germany

Interest in understanding the transition from prevertebrates to vertebrates at the molecular level has resulted in accumulating genomic and transcriptomic sequence data for the earliest groups of extant vertebrates, namely, hagfishes (Myxiniformes) and lampreys (Petromyzontiformes). Molecular phylogenetic studies on species phylogeny have revealed the monophyly of cyclostomes and the deep divergence between hagfishes and lampreys (more than 400 million years). In parallel, recent molecular phylogenetic studies have shed light on the complex evolution of the cyclostome genome. This consists of whole genome duplications, shared at least partly with gnathostomes (jawed vertebrates), and cyclostome lineage-specific secondary modifications of the genome, such as gene gains and losses. Therefore, the analysis of cyclostome genomes requires caution in distinguishing between orthology and paralogy in gene molecular phylogeny at the gene family scale, as well as between apomorphic and plesiomorphic genomic traits in larger-scale analyses. In this review, we propose possible ways of improving the resolvability of these evolutionary events, and discuss probable scenarios for cyclostome genome evolution, with special emphasis on the hypothesis that two-round (2R) genome duplication events occurred before the divergence between cyclostomes and gnathostomes, and therefore that a post-2R state is a genomic synapomorphy for all extant vertebrates.
Key words: hagfish, lamprey, orthology, hidden paralogy, long branch attraction, whole genome duplication

* Corresponding author. Phone: +49-7531-88-2763; Fax: +49-7531-88-3018; E-mail: [email protected]

INTRODUCTION

Hagfishes (Myxiniformes) and lampreys (Petromyzontiformes) hold basal phylogenetic positions as the earliest groups of extant vertebrates, and they have been analyzed from various viewpoints to understand the transition from prevertebrates to vertebrates at the molecular level (e.g., Kuratani et al., 2002; Lamb et al., 2007; Osorio and Retaux, 2008). Currently, the National Center for Biotechnology Information (NCBI) sequence database has 124,029 and 24,521 entries of nucleotide sequences, including expressed sequence tags (ESTs), for species in the orders Petromyzontiformes and Myxiniformes, respectively (and 3006 and 652 entries for annotated protein sequences, respectively, as of March 26, 2008). Although many of these database entries represent limited types of gene families (e.g., genes encoding homeodomain-containing transcription factors, antigen-recognition proteins involved in the adaptive immune system, and so on), the current collection of annotated cyclostome genes is providing a rough but insightful overview into understanding the evolutionary properties of cyclostome genomes.

In light of the theories and knowledge of molecular evolution, abundant sequence resources have enabled various kinds of evolutionary information to be extracted, such as phylogenetic relationships, evolutionary time scales, and gains/losses of genes. In particular, the evolution of gene repertoires has had an impact on comparative analyses of gene function, including the regulatory gene network that governs development, physiology, and other biological processes. Decades of molecular studies have shown that many well-studied genes have similar copies between species as well as within species. Indispensable terms to characterize these gene copies evolutionarily, namely, ‘orthology’ and ‘paralogy,’ were originally introduced in the early 1970s (Fitch, 1970) and later described as follows (Fitch, 2000):

“Orthology is that relationship where sequence divergence follows speciation, that is, where the common ancestor of the two genes lies in the cenancestor of the taxa from which the two sequences were obtained. This gives rise to a set of sequences whose true phylogeny is exactly the same as the true phylogeny of the organisms from which the sequences were obtained. Only orthologous sequences have this property. Paralogy is defined as that condition where sequence divergence follows gene duplication. Such genes might descend and diverge while existing side by side in the same lineage.”

In general, recognition of orthology and paralogy is not straightforward when the genomic evolution of a species in question has experienced a series of complicated events (Fitch, 2000). In discussing early vertebrate evolution, the closest attention should be paid to this, because two rounds of genome duplications occurred, which resulted, for example, in four Hox gene clusters observed in non-teleost gnathostomes, such as mammals, chicken, Xenopus, and chondrichthyans (reviewed in Kuraku and Meyer, 2008). It has also been proposed that a large-scale duplication event occurred in the cyclostome lineage (see below).

Importantly, the above definitions of orthology and paralogy do not include any properties of gene function. For some cyclostome genes, changes in expression patterns are described as possible factors explaining the morphological differences between lamprey and gnathostomes (e.g., Shigetani et al., 2002; Uchida et al., 2003; Hammond and Whitfield, 2006). Apart from cyclostomes, many more studies highlight dynamic changes in gene expression patterns among orthologs during vertebrate evolution (e.g., Locascio et al., 2002; Kuraku et al., 2005). To conduct reasonable evolutionary studies, any comparative analysis regarding gene expression patterns and functions should follow the solid characterization of the phylogenetic nature of genes: orthology/paralogy should be clarified independently of any functional property of genes.

In this review, from the viewpoint of molecular evolution/phylogeny and genome informatics, current knowledge and perspectives are summarized to provide a better link between genomic properties and phenotypic evolution.

PHYLOGENY

Monophyly of cyclostomes

Although the taxonomic term Cyclostomata was first introduced in the early 19th century (Duméril, 1806), many subsequent studies on morphology regarded only hagfishes as the earliest branching group, taking lampreys as the true sister taxon of the gnathostomes (Janvier, 1996; see also Ota et al., 2007). However, the monophyly of cyclostomes was first supported by molecular phylogenetics in the early 1990’s (Stock and Whitt, 1992). Many molecular phylogenetic studies have since supported the monophyly of cyclostomes using ribosomal DNA (Mallatt and Sullivan, 1998; Mallatt and Winchell, 2007), mitochondrial DNA (Delarbre et al., 2002), and protein-coding genes in the nuclear genome (Kuraku et al., 1999; Takezaki et al., 2003; Blair and Hedges, 2005; Delsuc et al., 2006) (Fig. 1A). The monophyly of cyclostomes can be regarded as one of the most clear-cut examples in which molecular phylogenetics has succeeded in updating phylogenetic relationships based on nonmolecular traits (Meyer and Zardoya, 2003).

... cyclostomes and gnathostomes. However, it has consistently been reported that the ancestor of the Myxiniformes and Petromyzontiformes diverged shortly (up to 100 million years) after the cyclostome and gnathostome lineages split (summarized in Kuraku et al., 2008a). In terms of the evolutionary time that has elapsed, it would not be surprising even if we were to identify differences between the genomes of hagfishes and lampreys that were as large as those observed in a comparison between the genomes of Mammalia and Chondrichthyes.

BASIC GENOMIC PROPERTIES

Karyotypes

Many cytogenetic observations of cyclostome genomes were described in the 1970s (Potter and Rothwell, 1970; Potter and Robinson, 1971; Robinson et al., 1975). In contrast to the relatively small number of chromosomes of hagfishes (2n=14–48), most lamprey species have more than 150 chromosomes (Fig. 1B) (original data were retrieved from the Animal Genome Size Database, http://www.genomesize.com). C-values (genome sizes) of hagfishes range from 2.29 to 4.59 pg, whereas those of lampreys range from 1.29 to 2.44 pg (Fig. 1B). Judging from the very small size of chromosomes and normal C-values in lampreys, the uniqueness of lamprey karyotypes is thought to have been caused mainly by successive Robertsonian chromosome fissions in the lamprey lineage (Robertson, 1916; Sumner, 2003).

Noncoding and repetitive landscape

Although the paucity of genomic sequences, especially in hagfishes, prevents the analysis of general genomic properties, there are some implications based on transcriptomic data. By analyzing the GC-content of four-fold degenerate sites (GC4) in protein-coding regions, it has been shown that lampreys (both northern- and southern-hemisphere species) have high levels of GC4 (70–90%), whereas hagfishes have moderate levels of GC4 (40–60%) (Kuraku and Kuratani, 2006; Kuraku et al., unpublished observations). Currently available genomic sequences of lampreys have revealed that the extraordinarily high GC-content in protein-coding regions, which is represented by GC4, is not a reflection of global genomic base composition (Table 1); rather, it is probably because of highly biased codon usage in lampreys.

Other evidence obtained in transcriptome analysis shows the existence of a short genomic element that might have spread throughout the lamprey genome (designated ‘lamprino’; GenBank accession number