The Role of Transposable Elements in Speciation

Total Page:16

File Type:pdf, Size:1020Kb

The Role of Transposable Elements in Speciation G C A T T A C G G C A T genes Review The Role of Transposable Elements in Speciation Antonio Serrato-Capuchina and Daniel R. Matute * Biology Department, Genome Sciences Building, University of North Carolina, Chapel Hill, NC 27514, USA; [email protected] * Correspondence: [email protected] Received: 30 January 2018; Accepted: 26 April 2018; Published: 15 May 2018 Abstract: Understanding the phenotypic and molecular mechanisms that contribute to genetic diversity between and within species is fundamental in studying the evolution of species. In particular, identifying the interspecific differences that lead to the reduction or even cessation of gene flow between nascent species is one of the main goals of speciation genetic research. Transposable elements (TEs) are DNA sequences with the ability to move within genomes. TEs are ubiquitous throughout eukaryotic genomes and have been shown to alter regulatory networks, gene expression, and to rearrange genomes as a result of their transposition. However, no systematic effort has evaluated the role of TEs in speciation. We compiled the evidence for TEs as potential causes of reproductive isolation across a diversity of taxa. We find that TEs are often associated with hybrid defects that might preclude the fusion between species, but that the involvement of TEs in other barriers to gene flow different from postzygotic isolation is still relatively unknown. Finally, we list a series of guides and research avenues to disentangle the effects of TEs on the origin of new species. Keywords: speciation; transposable elements; reproductive isolation 1. Introduction Speciation is the evolutionary process by which one lineage splits into two reproductively isolated groups of organisms [1]. One of the central goals of speciation research is to understand the processes that drive the evolution of reproductive isolation (RI) between species [2–6]. Significant strides have been made towards identifying barriers that generate RI between species [1,7], the processes underlying their evolution [2,3,8–11], and the rate at which they evolve during speciation [4,5,12–15]. Even though some progress has been made in identifying genes and loci associated with RI, few studies have explored the evolutionary processes that produced these barriers. Because of this, there is not yet a consensus as to what types of mutations or which mechanisms are typically involved in speciation or RI. There are two broad approaches to identify the genetic underpinnings of RI. First, if crosses can be made, one can genetically map the loci underlying RI between organisms. Such studies can establish the genetic changes that maintain species identity and, if divergence is recent, potentially reveal the molecular changes that were initially involved in speciation. This approach is particularly informative when coupled with closely related organisms at different stages of reduced gene exchange [1,16,17]. An alternative approach is to assess whether a particular type of molecular change is commonly associated with isolation between genotypes. If a barrier to gene flow is commonly caused by a certain type of molecular change, then one can argue that that molecular change is important in either the origin of new species or the persistence of them when they face the possibility of collapse through gene flow. This approach has, for example, revealed that chromosomal inversions are commonly associated with the suppression of recombination and frequently harbor gene combinations involved in isolation between species [18,19] (reviewed in [20]). However, this approach has rarely been used to understand Genes 2018, 9, 254; doi:10.3390/genes9050254 www.mdpi.com/journal/genes Genes 2018, 9, x FOR PEER REVIEW 2 of 28 Genes 2018, 9, 254 2 of 29 approach has rarely been used to understand the impact of other molecular changes on RI. Here we thehighlight impact transposable of other molecular elements changes as recurring on RI. Hereagents we that highlight underlie transposable a variety of elements manifestations as recurring of RI, agentswhich thatsuggests underlie they a varietyshould ofbe manifestationsexplored across of RI,various which taxa suggests in order they to should better be understand explored across their variousmechanistic taxa and in order evolutionary to better understandcontributions. their mechanistic and evolutionary contributions. Transposable elementselements (TEs)(TEs) areare DNADNA sequencessequences ableable to copy and insert themselves throughout the genome.genome. TEs TEs represent represent up up to to 80% 80% of nuclear DNA in plants, 3–20% in fungi,fungi, and 3–52%3–52% inin metazoans [[21–23].21–23]. TEs TEs are classified classified according to the mechanismmechanism theythey useuse toto transpose.transpose. Class I elements require anan RNARNA intermediateintermediate inin orderorder toto integrate/duplicateintegrate/duplicate themselves within a genome, whilewhile ClassClass II II elements elements act act without without an intermediate an intermed throughiate through a cut-and-paste a cut-and-paste mechanism mechanism that replicates that itsreplicates DNA directly its DNA to directly DNA as to it DNA mobilizes as it mobilizes (Figure1). (Figure A full classification1). A full classification of TEs is of shown TEs is in shown Table1 in. Interestingly,Table 1. Interestingly, the predominant the predominant class ofclass TEs of canTEs can vary vary greatly greatly between between taxa taxa [24 [24–28]–28] andand species, and theirtheir genomicgenomic frequency, frequency, location, location, and and activity activity levels levels can can vary vary greatly greatly even even at the at population the population level. TEslevel. were TEs describedwere described for the for first the time first intime maize in maiz by Barbarae by Barbara McClintock McClintock in 1950 in [195029] where[29] where they leadthey tolead somatic to somatic mutations mutations affecting affecting various various phenotypes/genes phenotypes/genes depending depending on their on chromosomal their chromosomal location andlocation transposition and transposition time. The time. insertion The insertion of a TE can of a disrupt TE can the disrupt coding the or coding regulatory or regulatory sequences sequences of genes, whichof genes, can which cause can deleterious cause deleterious effects by theeffects modifying by the modifying or eliminating or eliminating a gene’s expression a gene’s [expression30–34]. TEs [30– are ubiquitous34]. TEs are throughoutubiquitous naturethroughout [35–37 nature] and [35–37] their effect and ontheir their effect hosts’ on fitnesstheir hosts’ is generally fitness is considered generally toconsidered be deleterious; to be TEs deleterious; are commonly TEs consideredare commonl selfishy considered elements. However,selfish elements. gene disruptions However, are gene not thedisruptions only consequence are not the of only TEs asconsequence they transpose of TEs throughout as they transpose the genome. throughout TEs can alsothe genome. cause regulatory TEs can changes,also cause genomic regulatory expansions, changes, and genomic generate expans new chromosomalions, and generate variants new through chromosomal the generation variants of inversions.through the Moreover, generation TEs of inversions. can produce Moreover, all of these TE changess can produce rapidly all [38 of– these40] and changes in response rapidly to [38–40] abiotic stressors—aand in response hypothesis to abiotic first stressors—a advanced hypothesis by McClintock first [advanced29]. These by changes McClintock can provide [29]. These genetic changes and phenotypiccan provide novelties genetic and upon phenotypic which selection novelties can actupon [41 ,42which]. Due selection to their can potential act [41,42]. to generate Due noveltyto their whenpotential it is to needed, generate some novelty have when hypothesized it is needed, that so TEsme are have maintained hypothesized in genomes that TEs through are maintained multilevel in selectiongenomes [through43–45]. multilevel selection [43–45]. CLASS 1 – RETROTRANSPOSONS CLASS 2 – DNA TRANSPOSONS Types Types 5’ 3’ 5’ 3’ LTR LTR GAG POL LTR TIR TIR Transposase TIR Helitron LINE GAG POL PolyA Replicase Helicase SINE Left monomer A-rich Right monomer PolyA Maverick Integrase Polymerase Copy and Paste mechanism Cut and Paste mechanism 5’ 3’ 5’ 1 HOST DNA 3’ Transcription Excision RNA INTERMEDIARY DNA INTERMEDIARY Integration Reverse transcription 5’ 3’ Former 2 site DNA INTERMEDIARY Host DNA Integration Transposable element 1 2 5’ 3’ GAG gag gene, encodes structural proteins POL pol gene, encodes for a cleaving protease, reverse transcriptase and integrase TIR Terminal inverted repeats Figure 1.1.A A graphical graphical classification classification of transposable of transposable elements el (TEs).ements The (TEs). left panel The shows left Classpanel 1 retrotransposons,shows Class 1 andretrotransposons, the right panel and shows the right Class panel 2 DNA shows transposons. Class 2 DNA The transposons. upper panels The show upper three panels examples show of three the geneticexamples structure of the ofgenetic each ofstructure these two of classeseach of of these elements. two classes The lower of elements. panels show The thelower mode panels of movement show the (transpositionmode of movement mechanism) (transposition of each class. mechanism)LTR: Long Terminal of each class. Repeats; LTRLINE: Long: Long Terminal interspersed Repeats; nuclear LINE elements;: Long SINEinterspersed:
Recommended publications
  • Identification of a Non-LTR Retrotransposon from the Gypsy Moth
    Insect Molecular Biology (1999) 8(2), 231-242 Identification of a non-L TR retrotransposon from the gypsy moth K. J. Garner and J. M. Siavicek sposons (Boeke & Corces, 1989), or retroposons USDA Forest Service, Northeastern Research Station, (McClure, 1991). Many non-L TR retrotransposons Delaware, Ohio, U.S.A. have been described in insects, including the Doc (O'Hare et al., 1991), F (Di Nocera & Casari, 1987), I (Fawcett et al., 1986) and jockey (Priimiigi et al., 1988) Abstract elements of Drosophila melanogaster, the T1Ag A family of highly repetitive elements, named LDT1, (Besansky, 1990) and Q (Besansky et al., 1994) ele- has been identified in the gypsy moth, Lymantria ments of Anopheles gambiae, and the R1Bm (Xiong & dispar. The complete element is 5.4 kb in length and Eickbush, 1988a) and R2Bm (Burke et al., 1987) lacks long-terminal repeats, The element contains two families of ribosomal DNA insertions in Bombyx mori. open reading frames with a significant amino acid Gypsy moths (Lymantria dispar) are currently wide- sequence similarity to several non-L TR retrotrans- spread forest pests in the north-eastern United States posons. The first open reading frame contains a and the adjacent regions of Canada. Population region that potentially encodes a polypeptide similar markers have been sought to distinguish the North to DNA-binding GAG-like proteins. The second American gypsy moths introduced from Europe in 1869 encodes a polypeptide resembling both endonuclease from those recently introduced from Asia (Bogdano- and reverse transcriptase sequences. A" members of wicz et al., 1993; Pfeifer et al., 1995; Garner & Siavicek, the LDT1 element family sequenced thus far have poly- 1996; Schreiber et al., 1997).
    [Show full text]
  • The Evolution, Diversity, and Host Associations of Rhabdoviruses Ben Longdon,1,* Gemma G
    Virus Evolution, 2015, 1(1): vev014 doi: 10.1093/ve/vev014 Research article The evolution, diversity, and host associations of rhabdoviruses Ben Longdon,1,* Gemma G. R. Murray,1 William J. Palmer,1 Jonathan P. Day,1 Darren J Parker,2,3 John J. Welch,1 Darren J. Obbard4 and Francis M. Jiggins1 1 2 Department of Genetics, University of Cambridge, Cambridge, CB2 3EH, School of Biology, University of Downloaded from St Andrews, St Andrews, KY19 9ST, UK, 3Department of Biological and Environmental Science, University of Jyva¨skyla¨, Jyva¨skyla¨, Finland and 4Institute of Evolutionary Biology, and Centre for Immunity Infection and Evolution, University of Edinburgh, Edinburgh, EH9 3JT, UK *Corresponding author: E-mail: [email protected] http://ve.oxfordjournals.org/ Abstract Metagenomic studies are leading to the discovery of a hidden diversity of RNA viruses. These new viruses are poorly characterized and new approaches are needed predict the host species these viruses pose a risk to. The rhabdoviruses are a diverse family of RNA viruses that includes important pathogens of humans, animals, and plants. We have discovered thirty-two new rhabdoviruses through a combination of our own RNA sequencing of insects and searching public sequence databases. Combining these with previously known sequences we reconstructed the phylogeny of 195 rhabdovirus by guest on December 14, 2015 sequences, and produced the most in depth analysis of the family to date. In most cases we know nothing about the biology of the viruses beyond the host they were identified from, but our dataset provides a powerful phylogenetic approach to predict which are vector-borne viruses and which are specific to vertebrates or arthropods.
    [Show full text]
  • Genetic Effects on Microsatellite Diversity in Wild Emmer Wheat (Triticum Dicoccoides) at the Yehudiyya Microsite, Israel
    Heredity (2003) 90, 150–156 & 2003 Nature Publishing Group All rights reserved 0018-067X/03 $25.00 www.nature.com/hdy Genetic effects on microsatellite diversity in wild emmer wheat (Triticum dicoccoides) at the Yehudiyya microsite, Israel Y-C Li1,3, T Fahima1,MSRo¨der2, VM Kirzhner1, A Beiles1, AB Korol1 and E Nevo1 1Institute of Evolution, University of Haifa, Mount Carmel, Haifa 31905, Israel; 2Institute for Plant Genetics and Crop Plant Research, Corrensstrasse 3, 06466 Gatersleben, Germany This study investigated allele size constraints and clustering, diversity. Genome B appeared to have a larger average and genetic effects on microsatellite (simple sequence repeat number (ARN), but lower variance in repeat number 2 repeat, SSR) diversity at 28 loci comprising seven types of (sARN), and smaller number of alleles per locus than genome tandem repeated dinucleotide motifs in a natural population A. SSRs with compound motifs showed larger ARN than of wild emmer wheat, Triticum dicoccoides, from a shade vs those with perfect motifs. The effects of replication slippage sun microsite in Yehudiyya, northeast of the Sea of Galilee, and recombinational effects (eg, unequal crossing over) on Israel. It was found that allele distribution at SSR loci is SSR diversity varied with SSR motifs. Ecological stresses clustered and constrained with lower or higher boundary. (sun vs shade) may affect mutational mechanisms, influen- This may imply that SSR have functional significance and cing the level of SSR diversity by both processes. natural constraints.
    [Show full text]
  • "Evolutionary Emergence of Genes Through Retrotransposition"
    Evolutionary Emergence of Advanced article Genes Through Article Contents . Introduction Retrotransposition . Gene Alteration Following Retrotransposon Insertion . Retrotransposon Recruitment by Host Genome . Retrotransposon-mediated Gene Duplication Richard Cordaux, University of Poitiers, Poitiers, France . Conclusion Mark A Batzer, Department of Biological Sciences, Louisiana State University, Baton Rouge, doi: 10.1002/9780470015902.a0020783 Louisiana, USA Variation in the number of genes among species indicates that new genes are continuously generated over evolutionary times. Evidence is accumulating that transposable elements, including retrotransposons (which account for about 90% of all transposable elements inserted in primate genomes), are potent mediators of new gene origination. Retrotransposons have fostered genetic innovation during human and primate evolution through: (i) alteration of structure and/or expression of pre-existing genes following their insertion, (ii) recruitment (or domestication) of their coding sequence by the host genome and (iii) their ability to mediate gene duplication via ectopic recombination, sequence transduction and gene retrotransposition. Introduction genes, respectively, and de novo origination from previ- ously noncoding genomic sequence. Genome sequencing Variation in the number of genes among species indicates projects have also highlighted that new gene structures can that new genes are continuously generated over evolution- arise as a result of the activity of transposable elements ary times. Although the emergence of new genes and (TEs), which are mobile genetic units or ‘jumping genes’ functions is of central importance to the evolution of that have been bombarding the genomes of most species species, studies on the formation of genetic innovations during evolution. For example, there are over three million have only recently become possible.
    [Show full text]
  • Evolution of Pogo, a Separate Superfamily of IS630-Tc1-Mariner
    Gao et al. Mobile DNA (2020) 11:25 https://doi.org/10.1186/s13100-020-00220-0 RESEARCH Open Access Evolution of pogo, a separate superfamily of IS630-Tc1-mariner transposons, revealing recurrent domestication events in vertebrates Bo Gao, Yali Wang, Mohamed Diaby, Wencheng Zong, Dan Shen, Saisai Wang, Cai Chen, Xiaoyan Wang and Chengyi Song* Abstracts Background: Tc1/mariner and Zator, as two superfamilies of IS630-Tc1-mariner (ITm) group, have been well-defined. However, the molecular evolution and domestication of pogo transposons, once designated as an important family of the Tc1/mariner superfamily, are still poorly understood. Results: Here, phylogenetic analysis show that pogo transposases, together with Tc1/mariner,DD34E/Gambol,and Zator transposases form four distinct monophyletic clades with high bootstrap supports (> = 74%), suggesting that they are separate superfamilies of ITm group. The pogo superfamily represents high diversity with six distinct families (Passer, Tigger, pogoR, Lemi, Mover,andFot/Fot-like) and wide distribution with an expansion spanning across all the kingdoms of eukaryotes. It shows widespread occurrences in animals and fungi, but restricted taxonomic distribution in land plants. It has invaded almost all lineages of animals—even mammals—and has been domesticated repeatedly in vertebrates, with 12 genes, including centromere-associated protein B (CENPB), CENPB DNA-binding domain containing 1 (CENPBD1), Jrk helix–turn–helix protein (JRK), JRK like (JRKL), pogo transposable element derived with KRAB domain (POGK), and with ZNF domain (POGZ), and Tigger transposable element-derived 2 to 7 (TIGD2–7), deduced as originating from this superfamily. Two of them (JRKL and TIGD2) seem to have been co-domesticated, and the others represent independent domestication events.
    [Show full text]
  • Retrotransposon Long Interspersed Nucleotide Element1 (LINE1) Is
    The Japanese Society of Developmental Biologists Develop. Growth Differ. (2012) 54, 673–685 doi: 10.1111/j.1440-169X.2012.01368.x Original Article Retrotransposon long interspersed nucleotide element-1 (LINE-1) is activated during salamander limb regeneration Wei Zhu,1 Dwight Kuo,2 Jason Nathanson,3 Akira Satoh,4,5 Gerald M. Pao,1 Gene W. Yeo,3 Susan V. Bryant,5 S. Randal Voss,6 David M. Gardiner5*and Tony Hunter1* 1Molecular and Cell Biology Laboratory and Laboratory of Genetics, Salk Institute for Biological Studies, La Jolla, California 92037, Departments of 2Bioengineering and 3Cellular and Molecular Medicine, University of California San Diego, 9500 Gilman Drive, La Jolla, California 92093, USA; 4Okayama University, R.C.I.S. Okayama-city, Okayama, 700-8530, Japan; 5Department of Developmental and Cell Biology, University of California at Irvine, Irvine, California 92697, and 6Department of Biology and Spinal Cord and Brain Injury Research Center, University of Kentucky, Lexington, Kentucky 40506, USA Salamanders possess an extraordinary capacity for tissue and organ regeneration when compared to mam- mals. In our effort to characterize the unique transcriptional fingerprint emerging during the early phase of sala- mander limb regeneration, we identified transcriptional activation of some germline-specific genes within the Mexican axolotl (Ambystoma mexicanum) that is indicative of cellular reprogramming of differentiated cells into a germline-like state. In this work, we focus on one of these genes, the long interspersed nucleotide element-1 (LINE-1) retrotransposon, which is usually active in germ cells and silent in most of the somatic tissues in other organisms. LINE-1 was found to be dramatically upregulated during regeneration.
    [Show full text]
  • Capture of a Recombination Activating Sequence from Mammalian Cells
    Gene Therapy (1999) 6, 1819–1825 1999 Stockton Press All rights reserved 0969-7128/99 $15.00 http://www.stockton-press.co.uk/gt Capture of a recombination activating sequence from mammalian cells P Olson and R Dornburg The Dorrance H Hamilton Laboratories, Center for Human Virology, Division of Infectious Diseases, Jefferson Medical College, Thomas Jefferson University, Jefferson Alumni Hall, 1020 Locust Street, Rm 329, Philadelphia, PA 19107, USA We have developed a genetic trap for identifying revealed a putative recombination signal (CCCACCC). sequences that promote homologous DNA recombination. When this heptamer or an abbreviated form (CCCACC) The trap employs a retroviral vector that normally disables were reinserted into the vector, they stimulated vector itself after one round of replication. Insertion of defined repair and other DNA rearrangements. Mutant forms of DNA sequences into the vector induced the repair of a 300 these oligomers (eg CCCAACC or CCWACWS) did not. base pair deletion, which restored its ability to replicate. Our data suggest that the recombination events occurred Tests of random sequence libraries made in the vector within 48 h after transfection. Keywords: recombination; retroviral vector; vector stability; gene conversion; gene therapy Introduction scripts are still made from the intact left LTR, but reverse transcription copies the deletion to both LTRs, disabling Recognition of cis-acting DNA signals occupies a central the daughter provirus.15–17 We found that the SIN vectors role in both site-specific and general recombination path- that could escape this programmed disablement did so 1–6 ways. Signals in site-specific pathways define the by recombinationally repairing the U3-deleted LTR.
    [Show full text]
  • 1 Low Ribosomal RNA Genes Copy Number Provoke Genomic Instability
    bioRxiv preprint doi: https://doi.org/10.1101/2020.01.24.917823; this version posted January 25, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. Low ribosomal RNA genes copy number provoke genomic instability and chromosomal segment duplication events that modify global gene expression and plant-pathogen response Ariadna Picart-Picolo1,2, Stefan Grob3, Nathalie Picault1,2, Michal Franek4, Thierry halter5, Tom R. Maier6, Christel Llauro1,2, Edouard Jobet1,2, Panpan Zhang1,2,7, Paramasivan Vijayapalani6, Thomas J. Baum6, Lionel Navarro5, Martina Dvorackova4, Marie Mirouze1,2,7, Frederic Pontvianne1,2# 1CNRS, LGDP UMR5096, Université de Perpignan, Perpignan, France 2UPVD, LGDP UMR5096, Université de Perpignan, Perpignan, France 3Institute of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland 4 Mendel Centre for Plant Genomics and Proteomics, CEITEC, Masaryk University, Brno, Czech Republic 5ENS, IBENS, CNRS/INSERM, PSL Research University, Paris, France 6Department of Plant Pathology and Microbiology, Iowa State University, Ames, IA, USA 7IRD, UMR232 DIADE, Montpellier, France #corresponding author: [email protected] ABSTRACT Among the hundreds of ribosomal RNA (rRNA) gene copies organized as tandem repeats in the nucleolus organizer regions (NORs), only a portion is usually actively expressed in the nucleolus and participate in the ribosome biogenesis process. The role of these extra-copies remains elusive, but previous studies suggested their importance in genome stability and global 1 bioRxiv preprint doi: https://doi.org/10.1101/2020.01.24.917823; this version posted January 25, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder.
    [Show full text]
  • A Field Guide to Eukaryotic Transposable Elements
    GE54CH23_Feschotte ARjats.cls September 12, 2020 7:34 Annual Review of Genetics A Field Guide to Eukaryotic Transposable Elements Jonathan N. Wells and Cédric Feschotte Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14850; email: [email protected], [email protected] Annu. Rev. Genet. 2020. 54:23.1–23.23 Keywords The Annual Review of Genetics is online at transposons, retrotransposons, transposition mechanisms, transposable genet.annualreviews.org element origins, genome evolution https://doi.org/10.1146/annurev-genet-040620- 022145 Abstract Annu. Rev. Genet. 2020.54. Downloaded from www.annualreviews.org Access provided by Cornell University on 09/26/20. For personal use only. Copyright © 2020 by Annual Reviews. Transposable elements (TEs) are mobile DNA sequences that propagate All rights reserved within genomes. Through diverse invasion strategies, TEs have come to oc- cupy a substantial fraction of nearly all eukaryotic genomes, and they rep- resent a major source of genetic variation and novelty. Here we review the defining features of each major group of eukaryotic TEs and explore their evolutionary origins and relationships. We discuss how the unique biology of different TEs influences their propagation and distribution within and across genomes. Environmental and genetic factors acting at the level of the host species further modulate the activity, diversification, and fate of TEs, producing the dramatic variation in TE content observed across eukaryotes. We argue that cataloging TE diversity and dissecting the idiosyncratic be- havior of individual elements are crucial to expanding our comprehension of their impact on the biology of genomes and the evolution of species. 23.1 Review in Advance first posted on , September 21, 2020.
    [Show full text]
  • Tandemly Arrayed Genes in Vertebrate Genomes
    Tandemly Arrayed Genes in Vertebrate Genomes Deng Pan1 and Liqing Zhang2∗ 1Department of Computer Science, Virginia Tech, 2050 Torgerson Hall, Blacksburg, VA 24061-0106, USA 2Program in Genetics, Bioinformatics, and Computational Biology ∗To whom correspondence should be addressed; E-mail: [email protected] July 9, 2008 Abstract Tandemly arrayed genes (TAGs) are duplicated genes that are linked as neighbors on a chromosome, many of which have important physio- logical and biochemical functions. Here we performed a survey of these genes in 11 available vertebrate genomes. TAGs account for an average of about 14% of all genes in these vertebrate genomes, and about 25% of all duplications. The majority of TAGs (72-94%) have parallel tran- scription orientation (i.e. are encoded on the same strand), in contrast to the genome, which has about 50% of its genes in parallel transcrip- tion orientation. The majority of tandem arrays have only two members. In all species, the proportion of genes that belong to TAGs tends to be higher in large gene families than in small ones; together with our re- cent finding that tandem duplication played a more important role than retroposition in large families, this fact suggests that among all types of duplication mechanisms, tandem duplication is the predominant mecha- nism of duplication, especially in large families. Species-specific tandem arrays (SSTAs) are identified and results of statistical tests suggest that natural selection played a role in maintaining some of the SSTAs. Our 1 work will provide more insight into the evolutionary history of TAGs in vertebrates. 2 1 Introduction Although the importance of duplicated genes in providing raw materials for genetic innovation has been recognized since the 1930s and is highlighted in Ohno’s book Evolution by gene duplication (Ohno, 1970), it is only recently that the availability of numerous genomic sequences has made it possible to quantitatively estimate how many genes in a genome are generated by gene duplication.
    [Show full text]
  • Evolution of Orthologous Tandemly Arrayed Gene Clusters
    Tremblay Savard et al. BMC Bioinformatics 2011, 12(Suppl 9):S2 http://www.biomedcentral.com/1471-2105/12/S9/S2 PROCEEDINGS Open Access Evolution of orthologous tandemly arrayed gene clusters Olivier Tremblay Savard1*, Denis Bertrand2*, Nadia El-Mabrouk1* From Ninth Annual Research in Computational Molecular Biology (RECOMB) Satellite Workshop on Com- parative Genomics Galway, Ireland. 8-10 October 2011 Abstract Background: Tandemly Arrayed Gene (TAG) clusters are groups of paralogous genes that are found adjacent on a chromosome. TAGs represent an important repertoire of genes in eukaryotes. In addition to tandem duplication events, TAG clusters are affected during their evolution by other mechanisms, such as inversion and deletion events, that affect the order and orientation of genes. The DILTAG algorithm developed in [1] makes it possible to infer a set of optimal evolutionary histories explaining the evolution of a single TAG cluster, from an ancestral single gene, through tandem duplications (simple or multiple, direct or inverted), deletions and inversion events. Results: We present a general methodology, which is an extension of DILTAG, for the study of the evolutionary history of a set of orthologous TAG clusters in multiple species. In addition to the speciation events reflected by the phylogenetic tree of the considered species, the evolutionary events that are taken into account are simple or multiple tandem duplications, direct or inverted, simple or multiple deletions, and inversions. We analysed the performance of our algorithm on simulated data sets and we applied it to the protocadherin gene clusters of human, chimpanzee, mouse and rat. Conclusions: Our results obtained on simulated data sets showed a good performance in inferring the total number and size distribution of duplication events.
    [Show full text]
  • Conversion Between 100-Million-Year-Old Duplicated Genes Contributes to Rice Subspecies Divergence
    Wei et al. BMC Genomics (2021) 22:460 https://doi.org/10.1186/s12864-021-07776-y RESEARCH Open Access Conversion between 100-million-year-old duplicated genes contributes to rice subspecies divergence Chendan Wei1†, Zhenyi Wang1†, Jianyu Wang1†, Jia Teng1, Shaoqi Shen1, Qimeng Xiao1, Shoutong Bao1, Yishan Feng1, Yan Zhang1, Yuxian Li1, Sangrong Sun1, Yuanshuai Yue1, Chunyang Wu1, Yanli Wang1, Tianning Zhou1, Wenbo Xu1, Jigao Yu2,3, Li Wang1* and Jinpeng Wang1,2,3* Abstract Background: Duplicated gene pairs produced by ancient polyploidy maintain high sequence similarity over a long period of time and may result from illegitimate recombination between homeologous chromosomes. The genomes of Asian cultivated rice Oryza sativa ssp. indica (XI) and Oryza sativa ssp. japonica (GJ) have recently been updated, providing new opportunities for investigating ongoing gene conversion events and their impact on genome evolution. Results: Using comparative genomics and phylogenetic analyses, we evaluated gene conversion rates between duplicated genes produced by polyploidization 100 million years ago (mya) in GJ and XI. At least 5.19–5.77% of genes duplicated across the three rice genomes were affected by whole-gene conversion after the divergence of GJ and XI at ~ 0.4 mya, with more (7.77–9.53%) showing conversion of only portions of genes. Independently converted duplicates surviving in the genomes of different subspecies often use the same donor genes. The ongoing gene conversion frequency was higher near chromosome termini, with a single pair of homoeologous chromosomes, 11 and 12, in each rice genome being most affected. Notably, ongoing gene conversion has maintained similarity between very ancient duplicates, provided opportunities for further gene conversion, and accelerated rice divergence.
    [Show full text]