Analysis of Drosophila Buzzatii Transposable Elements Doctoral
Total Page:16
File Type:pdf, Size:1020Kb
Analysis of Drosophila buzzatii transposable elements Doctoral Thesis Nuria Rius Camps Departament de Genetica` i de Microbiolog`ıa, Universitat Autonoma de Barcelona, Bellaterra (Barcelona), Spain Memoria` presentada per la Llicenciada en Biologia Nuria Rius Camps per a optar al grau de Doctora en Genetica.` Nuria Rius Camps Bellaterra, a 23 de novembre de 2015 El Doctor Alfredo Ruiz Panadero, Catedratic` del Departament de Genetica` i Microbiologia de la Fac- ultat de Biociencies` de la Universitat Autonoma` de Barcelona, CERTIFICA que Nuria Rius Camps ha dut a terme sota la seva direccio´ el treball de recerca realitzat al Departament de Genetica` i Microbiologia de la Facultat de Biociencies` de la Universitat Autonoma` de Barcelona que ha portat a l’elaboracio´ d’aquesta Tesi Doctoral titulada “Analysis of Drosophila buz- zatii transposable elements”. I perque` consti als efectes oportuns, signa el present certificat a Bellaterra, a 23 de novembre de 2015 Alfredo Ruiz Panadero I tell you all this because it’s worth recognizing that there is no such thing as an overnight success. You will do well to cultivate the resources in yourself that bring you happiness outside of success or failure. The truth is, most of us discover where we are headed when we arrive. At that time, we turn around and say, yes, this is obviously where I was going all along. It’s a good idea to try to enjoy the scenery on the detours, because you’ll probably take a few. (Bill Watterson) CONTENTS Abstract iii Resumen v 1. Introduction 1 1.1. Transposable elements .......................... 1 1.1.1. Transposable element classification .............. 2 1.1.2. TEs in their host genomes .................... 5 1.1.3. The P element ........................... 7 1.2. Drosophila as a model organism .................... 8 1.2.1. D. buzzatii and the D. repleta species group .......... 9 1.3. Genomics .................................. 10 1.3.1. Genomics in Drosophila ..................... 10 1.4. TE annotation and classification in sequenced genomes ....... 12 2. Objectives 15 3. Results 17 3.1. A divergent P element and its associated MITE, BuT5 ........ 18 3.1.1. A divergent P element and its associated MITE, BuT5, gen- erate chromosomal inversions and are widespread within the Drosophila repleta species group .............. 18 3.1.2. Supplementary material ..................... 34 3.2. Exploration of the D. buzzatii transposable element content ..... 35 3.2.1. Abstract .............................. 36 3.2.2. Background ............................ 37 3.2.3. Methods .............................. 38 3.2.4. Results ............................... 41 3.2.5. Discussion ............................. 50 3.2.6. Supplementary material ..................... 57 4. Discussion 59 4.1. MITEs in Drosophila genus genomes .................. 59 i Contents 4.2. The lifespan of a MITE .......................... 61 4.3. The importance of thorough and detailed analysis in the genomic era ....................... 62 5. Conclusions 65 Appendices 83 A. Supplementary material of BuT5 and the P element in D. repleta group 85 A.1. P element transposase alingment .................... 85 A.2. Supplementary tables ..........................100 B. Supplementary material of TE analyses in Drosophila buzzatii genomes 109 B.1. TE density in D. buzzatii and D. mojavensis chromosomes ......109 B.2. Supplementary tables ..........................111 C. Research article 121 C.1. Genomics of ecological adaptation in cactophilic Drosophila ....121 6. Acknowledgments 141 ii LIST OF FIGURES 1. Proposed TE classification ........................ 3 2. TE life cycle ................................ 6 3. Repeat and TE content of the 12 Drosophila genomes ........ 13 4. Repeated elements in the Drosophila sequenced genomes ..... 14 5. TE Order abundance ........................... 41 6. Chromosomal TE density ........................ 45 7. Orders correction ............................. 49 8. Superfamilies correction ......................... 50 9. Supplementary Figure a .........................109 10. Supplementary Figure b .........................110 11. Supplementary Figure c .........................110 12. Supplementary Figure d .........................111 13. Supplementary Figure e .........................111 14. Supplementary Figure f .........................112 15. Supplementary Figure g .........................112 16. Supplementary Figure h .........................113 iii LIST OF TABLES 1. Contributions to D. buzzatii and D. mojavensis genomes ....... 42 2. TE fraction in D. buzzatii and D. mojavensis .............. 44 3. Percentage of TEs annotated ....................... 47 4. Supplementay Table: D statistics D. buzzatii Proximal ........113 5. Supplementay Table: D statistics D. buzzatii Distal + Central ....114 6. Supplementay Table: D statistics D. buzzatii total ...........114 7. Supplementay Table: D statistics D. mojavensis Proximal ......115 8. Supplementay Table: D statistics D. mojavensis Central + Distal ..115 9. Supplementay Table: D statistics D. mojavensis total .........116 10. Supplementay Table: p-values D. buzzatii Proximal .........116 11. Supplementay Table: p-values D. buzzatii Distal + Central .....117 12. Supplementay Table: p-value statistics D. buzzatii total .......117 13. Supplementay Table: p-values D. mojavensis Proximal ........118 14. Supplementay Table: p-values D. mojavensis Distal + Central ....118 15. Supplementay Table: p-values D. mojavensis total ..........119 v ACRONYMS BAC Bacterial Artificial Chromosome BDGP Berkeley Drosophila Genome Project BLAST Basic Local Alignment Search Tool BSC Barcelona Supercomputing Center CNAG Spanish Centro Nacional de Analisis´ Genomico´ DINE-1 Drosophila Interspersed Element-1 DIRS Dictyostelium Intermediate Repeat Sequence DNA Deoxyribonucleic Acid ERVK Endogenous Retrovirus-K HT Horizontal transfer LINE Long Interspersed Repetitive Elements LTR Long Terminal Repeats MITE Miniature Inverted-Repeat TE NCBI National Center for Biotechnology Information NGS Next-Generation Sequencing ORF Open Reading Frame PE Paired End piRNA piwi-interacting RNA PLE Penelope-like Element RNA Ribonucleic Acid SDS Sodium Dodecyl Culfate SINE Short Interspersed Repetitive Elements TE Transposable Element THAP Thanatos-associated Protein TIR Terminal Inverted Repeat TSD Target Site Duplication UAB Universitat Autonoma` de Barcelona UTA University of Texas at Arlington i ABSTRACT Transposable genetic elements are genetic units able to insert themselves in other regions of the genomes they inhabit, and are present in almost all eukaryotes an- alyzed. The interest of transposable element analysis, it is not only because its consideration as intragenomic parasites. Transposable elements are an enor- mous source of variability for the genomes of their hosts, and are therefore key to understanding its evolution. In this work we addressed the analysis of Droso- phila buzzatii transposable elements from two different approaches, the detailed study of one family of transposable elements and global analysis of all elements present in the genome. The study of chromosomal inversions in D. buzzatii led to the description of the non-autonomous transposable element, BuT5, which was later found to cause polymorphic chromosomal inversions in D. mojavensis and D. uniseta. In this work we have characterized the transposable element BuT5 and we have described its master element. BuT5 is found in 38 species of the group of species D. repleta. The autonomous element that mobilizes BuT5 is a P element, we described three partial copies in the sequenced genome of D. mojavensis and a complete copy in D. buzzatii. The full-length and putatively active copy has 3386 base pairs and encodes a transposase of 822 residues in seven exons. Moreover we have annotated, classified and compared the trans- posable elements present in the genomes of two strains of D. buzzatii, st-1 and j-19, recently sequenced with next-generation sequencing technology, and in the D. mojavensis, the phylogenetically closest species sequenced, in this case with Sanger technology. Transposable elements make up for 8.43%, the 4.15% and 15.35% of the assemblies of the genomes of D. buzzatii st-1,j-19 and it D . mo- javensis respectively. Additionally, we have detected a bias in the transposable elements content of genomes sequenced using next-generation sequencing tech- nology, compared with the content in genomes sequenced with Sanger technol- ogy. We have developed a method based on the coverage that allowed us to cor- rect this bias in the genome of D. buzzatii st-1 and have more realistic estimates of the content in transposable elements. Using this method we have determined that the transposable element content in D. buzzatii st-1 is between 10.85% and 11.16%. Additionally, the estimates allowed us to infer that the Helitrons order has undergone multiple cycles of activity and that the superfamily Gypsy and BelPao have recently been active in D. buzzatii. iii RESUMEN Los elementos transponibles son unidades geneticas´ capaces de insertarse en otras regiones de los genomas en los que habitan y estan´ presentes en casi todas las especies eucariotas estudiadas. El interes´ del analisis´ de los elemen- tos transponibles no se debe unicamente´ a su consideracion´ de parasitos´ intra- genomicos.´ Los elementos transponibles suponen una enorme fuente de vari- abilidad para los genomas de sus hospedadores, y son por lo tanto claves para comprender su evolucion.´ En este trabajo hemos abordado el