Russian Doll Genes and Complex Chromosome Rearrangements in Oxytricha Trifallax
Total Page:16
File Type:pdf, Size:1020Kb
University of South Florida Scholar Commons Mathematics and Statistics Faculty Publications Mathematics and Statistics 5-2018 Russian Doll Genes and Complex Chromosome Rearrangements in Oxytricha trifallax Jasper Braun University of South Florida, [email protected] Lukas Nabergall University of South Florida, [email protected] Rafik Neme Columbia University Laura F. Landweber Columbia University Masahico Saito University of South Florida, [email protected] See next page for additional authors Follow this and additional works at: https://scholarcommons.usf.edu/mth_facpub Scholar Commons Citation Braun, Jasper; Nabergall, Lukas; Neme, Rafik; Landweber, Laura F.; Saito, Masahico; and Jonoska, Nataša, "Russian Doll Genes and Complex Chromosome Rearrangements in Oxytricha trifallax" (2018). Mathematics and Statistics Faculty Publications. 16. https://scholarcommons.usf.edu/mth_facpub/16 This Article is brought to you for free and open access by the Mathematics and Statistics at Scholar Commons. It has been accepted for inclusion in Mathematics and Statistics Faculty Publications by an authorized administrator of Scholar Commons. For more information, please contact [email protected]. Authors Jasper Braun, Lukas Nabergall, Rafik Neme, Laura F. Landweber, Masahico Saito, and Nataša Jonoska This article is available at Scholar Commons: https://scholarcommons.usf.edu/mth_facpub/16 INVESTIGATION Russian Doll Genes and Complex Chromosome Rearrangements in Oxytricha trifallax Jasper Braun,* Lukas Nabergall,* Rafik Neme,† Laura F. Landweber,†,1 Masahico Saito,* and Natasa Jonoska*,1 *University of South Florida; Department of Mathematics and Statistics and †Columbia University; Departments of Biochemistry and Molecular Biophysics, and Biological Sciences ORCID ID: 0000-0001-8462-5291 (R.N.) ABSTRACT Ciliates have two different types of nuclei per cell, with one acting as a somatic, KEYWORDS transcriptionally active nucleus (macronucleus; abbr. MAC) and another serving as a germline nucleus Ciliates (micronucleus; abbr. MIC). Furthermore, Oxytricha trifallax undergoes extensive genome rearrangements Genome during sexual conjugation and post-zygotic development of daughter cells. These rearrangements are rearrangement necessary because the precursor MIC loci are often both fragmented and scrambled, with respect to the Nested genes corresponding MAC loci. Such genome architectures are remarkably tolerant of encrypted MIC loci, Scrambled genes because RNA-guided processes during MAC development reorganize the gene fragments in the correct Nanochromosomes order to resemble the parental MAC sequence. Here, we describe the germline organization of several nested and highly scrambled genes in Oxytricha trifallax. These include cases with multiple layers of nest- ing, plus highly interleaved or tangled precursor loci that appear to deviate from previously described patterns. We present mathematical methods to measure the degree of nesting between precursor MIC loci, and revisit a method for a mathematical description of scrambling. After applying these methods to the chromosome rearrangement maps of O. trifallax we describe cases of nested arrangements with up to five layers of embedded genes, as well as the most scrambled loci in O. trifallax. Ciliates have two types of nuclei within the same cell, where one acts as a Conjugation in O. trifallax leads to destruction of the parental, veg- germline nucleus (micronucleus; abbr. MIC) and the other acts as a etative MAC and regeneration of a new MAC from a copy of the zygotic somatic nucleus (macronucleus; abbr. MAC). The spirotrich ciliate MIC. This process involves the selective removal of more than 90% of the Oxytricha trifallax undergoes massive genome reorganization during germline genomic information, and the precise rearrangement of the the post-zygotic development of its daughter cells (Prescott 1994). remaining pieces in a particular order and orientation (Yerlici and Land- weber 2014). The mature MAC contains millions of gene-sized chromo- somes (also known as nanochromosomes) that average just 3 kb (Swart Copyright © 2018 Braun et al. et al. 2013), each containing its own set of telomeres, regulatory elements, doi: https://doi.org/10.1534/g3.118.200176 and between 1-8 genes. Manuscript received October 25, 2017; accepted for publication March 8, 2018; Any segment of micronuclear information that is retained in the published Early Online March 15, 2018. macronuclear destined sequence This is an open-access article distributed under the terms of the Creative Commons MAC is called a (abbr. MDS), while Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), any deleted segment that interrupts two MDSs in the same locus is which permits unrestricted use, distribution, and reproduction in any medium, called an internally eliminated sequence (abbr. IES). MDSs fuse to- provided the original work is properly cited. gether by recombination at short (2-20) repeats, called pointers,atthe Supplemental Material is available online at www.g3journal.org/lookup/suppl/ doi:10.1534/g3.118.200176/-/DC1. ends of MDSs, retaining one copy of the pointer in the MAC. The This work has been supported in part by the NSF grant CCF-1526485 and NIH accuracy of this process is guided by a suite of long and small RNAs grant R01 GM109459. ((Nowacki et al. 2008; Fang et al. 2012), reviewed in Bracht et al. (2013) 1 Columbia University, Departments of Biochemistry & Molecular Biophysics, and Yerlici and Landweber (2014)). Biological Sciences, and Systems Biology Hammer 714, 701 W 168th St, NY fi 10032. E-mail: [email protected] and Department of Mathematics Rearrangements can be classi ed as simple or complex, depending and Statistics, University of South Florida 4202 E. Fowler Av. CMC342 Tampa FL, on whether the order and orientation in the macronuclear product is 33620-5700. E-mail: [email protected], corresponding authors. maintained, relative to the germline precursor. Simple rearrangements Volume 8 | May 2018 | 1669 Figure 2 A specific example of three layers of nesting. The two MIC loci for MAC contigs Contig9583.0 (blue) and Contig6683.0 (orange) are nested between Contig6331.0 (red) on the micronuclear contig ctg7180000067077. Not drawn to scale. The red locus has an IDI of 2, the blue locus 1, and the orange locus 0. Most arise from gene duplication and insertion of young genes or transposons into long introns (Sheppard et al. 2016; Wei et al. 2013; Gao et al. 2012). In this study, we follow up on previous analyses in Burns et al. (2016a), that described the most commonly recurring scrambled pat- terns across the genome. Here, we present the most elaborate cases of genome rearrangement in O. trifallax, which highlight the extraordi- nary degree of topological complexity that can arise from such a highly plastic genomic architecture. Figure 1 Graphical representation of DNA rearrangements and their METHODS topologies. (a) A schematic idealized MIC contig containing MDSs for Genome sequences From Oxytricha trifallax three MAC contigs, each indicated by a different color. The numbers fi MIC and MAC sequences from Oxytricha trifallax were obtained from indicate the nal MDS order in the corresponding MAC contig. The , . numbers with bars above them indicate MDSs in a reverse orientation the mds_ies_db website (Burns et al. 2016b), in the form of annotated in the MIC contig relative to the MAC contig. The black numbers in tables, and processed as described in Burns et al. (2016a), available from smaller print over the blue MDSs label the pointers for the blue contig. http://knot.math.usf.edu/data/scrambled_patterns/processed_annotation_ (b) A condensed, linear description of the precursor MIC contig in (a). of_oxy_tri.gff. These were parsed to produce the various types of graphical The notation “1 2 7 odd” represents MDS 1; 3; 5; 7; and the black tri- representations described below using a combination of in-house angle indicates the presence of three nonscrambled MDSs for the Python and SQL scripts. orange MAC contig shown in (a). (c) A graph representing the pre- cursor MIC contig in (a) where the vertices (intersection points) indicate Graphical representations of scrambled loci the recombination junctions (pointers). The arrowhead indicates the orientation of the precursor MIC contig, reading left to right as shown These representations highlight different aspects of chromosomal re- in (a). The segments highlighted in 3 colors indicate the MAC contigs arrangement topologies. For the analysis of nested chromosomes we fi obtained after joining MDS segments. The uncolored segments cor- ltered the data to consider only cases with scrambled or nonscrambled respond to IESs. The vertices of the blue portion correspond to the MDSsformultipleMACcontigs(nanochromosomes)thatderivefrom a pointers 1; 2; 3; 4 that join respectively flanking MDSs. The orange single MIC region. Figure 1A shows a schematic view of the MIC portion with a loop corresponds to the middle orange MAC locus with (precursor) and MAC (product) versions of such a locus, and the three nonscrambled MDSs. The number 2 indicates removal of two correspondence between MDSs and pointers. In addition, we include conventional IESs. (d) A chord diagram representing a cyclic arrange- a condensed linear sequence representation of the MDSs (Figure 1B), a ment of pointers of MDSs from the blue MAC contig in (a) within its self-intersecting line corresponding to the MIC region that indicates the precursor MIC contig.