The Utility of Graph Clustering of 5S Ribosomal DNA Homoeologs In
Total Page:16
File Type:pdf, Size:1020Kb
The Utility of Graph Clustering of 5S Ribosomal DNA Homoeologs in Plant Allopolyploids, Homoploid Hybrids, and Cryptic Introgressants Sònia Garcia, Jonathan F Wendel, Natalia Borowska-Zuchowska, Malika Ainouche, Alena Kuderova, Ales Kovarik To cite this version: Sònia Garcia, Jonathan F Wendel, Natalia Borowska-Zuchowska, Malika Ainouche, Alena Kuderova, et al.. The Utility of Graph Clustering of 5S Ribosomal DNA Homoeologs in Plant Allopolyploids, Homoploid Hybrids, and Cryptic Introgressants. Frontiers in Plant Science, Frontiers, 2020, 11, pp.41. 10.3389/fpls.2020.00041. hal-02498513 HAL Id: hal-02498513 https://hal-univ-rennes1.archives-ouvertes.fr/hal-02498513 Submitted on 15 Jul 2020 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. ORIGINAL RESEARCH published: 10 February 2020 doi: 10.3389/fpls.2020.00041 The Utility of Graph Clustering of 5S Ribosomal DNA Homoeologs in Plant Allopolyploids, Homoploid Hybrids, and Cryptic Introgressants Sònia Garcia 1,2, Jonathan F. Wendel 3, Natalia Borowska-Zuchowska 4, Malika Aïnouche 5, Alena Kuderova 2 and Ales Kovarik 2* 1 Institut Botànic de Barcelona (IBB, CSIC - Ajuntament de Barcelona), Barcelona, Spain, 2 Department of Molecular Epigenetics, Institute of Biophysics, Academy of Sciences of the Czech Republic, Brno, Czechia, 3 Department of Ecology, Evolution & Organismal Biology, Iowa State University, Ames, IA, United States, 4 Faculty of Natural Sciences, Institute of Biology, Biotechnology and Environmental Protection, University of Silesia in Katowice, Katowice, Poland, 5 UMR CNRS 6553 ECOBIO, Université de Rennes 1, Rennes, France Introduction: Ribosomal DNA (rDNA) loci have been widely used for identification of allopolyploids and hybrids, although few of these studies employed high-throughput Edited by: sequencing data. Here we use graph clustering implemented in the RepeatExplorer (RE) Hanna Weiss-Schneeweiss, pipeline to analyze homoeologous 5S rDNA arrays at the genomic level searching for University of Vienna, Austria hybridogenic origin of species. Data were obtained from more than 80 plant species, Reviewed by: fi Tony Heitkam, including several well-de ned allopolyploids and homoploid hybrids of different Dresden University of evolutionary ages and from widely dispersed taxonomic groups. Technology, Germany Jelena Mlinarec, Results: (i) Diploids show simple circular-shaped graphs of their 5S rDNA clusters. In University of Zagreb, Croatia contrast, most allopolyploids and other interspecific hybrids exhibit more complex graphs *Correspondence: composed of two or more interconnected loops representing intergenic spacers (IGS). (ii) Ales Kovarik [email protected] There was a relationship between graph complexity and locus numbers. (iii) The sequences and lengths of the 5S rDNA units reconstituted in silico from k-mers were Specialty section: congruent with those experimentally determined. (iv) Three-genomic comparative cluster This article was submitted to Plant analysis of reads from allopolyploids and progenitor diploids allowed identification of Systematics and Evolution, a section of the journal homoeologous 5S rRNA gene families even in relatively ancient (c. 1 Myr) Gossypium and Frontiers in Plant Science Brachypodium allopolyploids which already exhibit uniparental partial loss of rDNA Received: 17 October 2019 repeats. (v) Finally, species harboring introgressed genomes exhibit exceptionally Accepted: 13 January 2020 Published: 10 February 2020 complex graph structures. Citation: Conclusion: We found that the cluster graph shapes and graph parameters (k-mer Garcia S, Wendel JF, coverage scores and connected component index) well-reflect the organization and Borowska-Zuchowska N, Aïnouche M, Kuderova A and intragenomic homogeneity of 5S rDNA repeats. We propose that the analysis of 5S rDNA Kovarik A (2020) The Utility of Graph cluster graphs computed by the RE pipeline together with the cytogenetic analysis might Clustering of 5S Ribosomal DNA Homoeologs in Plant Allopolyploids, be a reliable approach for the determination of the hybrid or allopolyploid plant species Homoploid Hybrids, and Cryptic parentage and may also be useful for detecting historical introgression events. Introgressants. Front. Plant Sci. 11:41. Keywords: 5S rRNA genes, allopolyploidy, hybridization, evolution, graph structure clustering, high-throughput doi: 10.3389/fpls.2020.00041 sequencing, repeatome Frontiers in Plant Science | www.frontiersin.org 1 February 2020 | Volume 11 | Article 41 Garcia et al. 5S rDNA Evolution in Plant Hybrids INTRODUCTION Volkov et al., 1999; Muir et al., 2001; Matyasek et al., 2003). In It is well-established that all modern plant species have contrast to 35S rDNA, the 5S rDNA loci appear to be less experienced at least one whole genome duplication and that sensitive to homogenization in some allopolyploids (Fulnecek many also have interspecific hybridization and recurrent et al., 2002; Pedrosa-Harand et al., 2006; Weiss-Schneeweiss introgression in their recent history (Wendel, 2015; Alix et al., et al., 2008; Garcia et al., 2017), retaining diagnostic capacity 2017; Nieto Feliner et al., 2017; Van De Peer et al., 2017). with respect to their parental origin. Documenting recent allopolyploidy is relatively straightforward The clustering algorithm employed by RepeatExplorer (RE) using cytogenetic analysis and genome size measurements, since (Novak et al., 2010; Novak et al., 2013) has become a tool of allopolyploids have twice as many chromosomes (or more) as the choice for the analysis of chromosome composition and genome parental species. Identification of homoploid hybrids is more evolution (Renny-Byfield et al., 2012; Weiss-Schneeweiss et al., difficult since the chromosome number and genome size are 2015; Ribeiro et al., 2017; Mlinarec et al., 2019; Peska et al., 2019). often similar to that of the parental species (Nieto Feliner et al., The phylogenetic signal of the repeatome has also been exploited 2017). Evolutionary young allopolyploids and other hybrids tend in phylogenetic studies (Dodsworth et al., 2015; Dodsworth et al., to retain fixed polymorphisms at protein-coding and non-coding 2016; Grover et al., 2019; Vitales et al., 2019). The analysis of loci. These duplicated loci are called homoeologs (Glover et al., genomes by RE is based on an all-to-all comparison of sequence 2016) and are useful for documenting parentage as well as reads revealing their similarities. Subsequently, the data are used understanding the dynamics of polyploid genomes (Yoo et al., to build clusters of overlapping reads representing different 2014; Wendel, 2015; Bourke et al., 2018). Older allopolyploids repetitive elements. The TAREAN tool, recently introduced can have experienced episodes of intergenomic translocation, into the RepeatExplorer2 pipeline, allows repeat identification dysploidy, gene conversion, localized deletions, and other genetic and reconstruction of tandem repeats solely from sequence reads events, leading eventually to diploidization of the genome (Novak et al., 2017). Graph theory and connected component (Wendel, 2015; Wendel et al., 2018). methods lying in the heart of the computation algorithm produce Ribosomal RNA genes encoding 5S, 5.8S, 18S, and 26S graph structures reflecting genomic organization of repeats. ribosomal RNA are ubiquitous in plants and are organized Typically, tandem repeats exhibit circular (ring) shape into arrays containing hundreds to thousands of tandem topologies are characterized by high values of circularity repeats at one or more genomic loci (Hemleben and Zentgraf, parameters. Although the RepeatExplorer2/TAREAN tool was 1994; Nieto Feliner and Rosselló, 2007; Roa and Guerra, 2012; initially developed for identification of non-coding satellites, 5S Garcia et al., 2017). Due to their rapidly diverging intergenic rDNA can also be analyzed with the program. This is because 5S (IGS) and internally transcribed spacers (ITS), rDNA loci have rDNA shows many features of satellite repeats: (I) its highly become popular taxonomic markers revealing allopolyploidy and homogeneous units are tandemly arranged in a head to tail other interspecific hybridization in many plant and animal orientation, (II) it appears in high copy number, allowing systems (Alvarez and Wendel, 2003; Poczai and Hyvonen, analyses even at low coverage, and (III) the size of 5S rDNA 2010; Nieto Feliner and Rossello, 2012). The internet searches monomers (c. 200–1,000 bp) (Sastri et al., 1992; Fulnecek et al., using *ITS* and *allopolyploidy* resulted in more than 650 hits 2006) falls within the range defined for satellite DNA, allowing in Web of Science for just 2019. Most studies have used classical circularization of chains of overlapping reads. single clone sequencing approaches whereas high-throughput In this study we investigated the 5S rDNA genomic data have only rarely been employed and are limited to the 35S organization and homogeneity in more than 80 plant diploids (45S) rDNA (Matyasek et al., 2012; West et al., 2014; Boutte et al., and polyploids, exploiting high-throughput