Gene Dosage and the Evolution of Gene Expression
Gene dosage and the evolution of gene expression

Fabian Zimmer

Department of Genetics, Evolution and Environment
University College London (UCL)

A thesis submitted for the degree of Doctor of Philosophy

I, Fabian Zimmer, confirm that the work presented in this thesis is my own. Where information has been derived from other sources, I confirm that this has been indicated in the thesis.

Fabian Zimmer

Abstract

The duplication and loss of genes, chromosomes and whole genomes has had a major impact on the evolution of most organisms. Changes in gene copy number, called gene dosage, may influence the resulting level of gene product through changes in gene expression. These gene expression changes can be detrimental, resulting in compensation and buffering mechanisms, or beneficial, when selection favours increased gene dosage. Understanding how changes in gene dose can influence the evolution of gene expression within and between species is an important task in evolutionary biology. This thesis combines studies of gene, protein domain, and genome duplications with gene expression data from a range of bird species to understand the evolutionary consequences of gene dosage changes.

In addition to gene duplication and loss events, the genomic location of genes can subject loci to different evolutionary pressures. Genes present on sex chromosomes or the mitochondria are inherited unequally between males and females, potentially causing sexual conflict over expression. This thesis investigates if inter-genomic conflict could drive gene movement on and off the sex chromosomes using a comparative genomics approach.

Acknowledgments

First and foremost, I would like to thank my supervisor Professor Judith Mank for her great support over the last few years. Her enthusiasm for biology has motivated me countless times. Professor Kevin Fowler has always offered invaluable advice and I could not imagine a better secondary supervisor.
I would also like to commend Dr Peter Harrison for his help with computational problems and his feedback on all my manuscripts. My collaboration with Dr Rebecca Dean was a great working experience and I would like to thank her for supporting me with all my projects. I am indebted to Dr Stephen Montgomery who has gone above and beyond in helping me to resolve many scientific issues. Thanks to him, I know more about brains (including my own), phylogenies and cephalopods than I ever expected.

I would like to thank all other members of the Mank lab (Alison, Vicencio, Natasha, Jen, Marie, Iulia, Clara) and Professor Christophe Dessimoz for the many productive discussions – I enjoyed working with all of you.

I would not have finished this thesis without the support of my friends and family. In many regards writing a thesis is like cycling in the mountains and I would especially like to thank my friend Salvador Estrugo, who has taught me how to climb (slowly). Last, but by no means least, I am infinitely grateful for all the support from my wife Tammela Platt – you helped me in more ways than I can list here.

To my parents

Table of Contents

Chapter 1 - Introduction
  Gene dosage
  Genome evolution
    Whole genome duplications
    Evolution after WGD events
    Detection of WGD events
  Gene dosage and sex chromosome evolution
  Paralog divergence
  Protein domain evolution
  Gene movement and rearrangements
  Summary of aims
  Summary of thesis chapters
  Glossary
  Abbreviations
  Additional published work

Chapter 2 - Compensation of dosage-sensitive genes on the chicken Z chromosome
  Summary
  Introduction
  Material and Methods
    RNA-Seq analysis and gene expression estimates
    Identification of ohnologs and other paralogs
    Functional annotation of ohnologs
    SNP calling and estimation of allele specific expression
  Results
    Incomplete dosage compensation in females and reduced Z expression in males
    Z:A ratio comparison
    Expression level and dosage compensation
    Allele-specific expression and potential for Z chromosome inactivation
    Ohnologs are preferentially dosage-compensated
    Older Z chromosome parts contain fewer ohnologs
    Dosage compensation of ohnologs across tissues
    Functional annotation of Z-linked ohnologs
  Discussion

Chapter 3 - Inferring paralogs and expression divergence across multiple bird species using RNA-Seq data
  Summary
  Introduction
  Material and Methods
    RNA-Seq assembly and gene expression calculation
    RNA-Seq isoform filtering and protein prediction
    DNA-Seq data processing
    Reconstruction of gene family evolution
    Lineage-specific paralog sequence divergence
  Results
    Are inferred evolutionary events from RNA-Seq and DNA-Seq comparable?
    Can RNA- and DNA-Seq data be combined to improve paralog detection?
    Can lineage-specific paralogs be filtered using ancestral family structure?
    Is the duplication rate of inferred lineage-specific paralogs comparable to genomic estimates?
    Can strictly filtered paralogs be confirmed using Ensembl data?
  Discussion
    Family structure filtering and duplication rate comparisons
    Paralog pairs show low sequence identity and unexpected gene tree topology
    An explanation: variation in gene model prediction

Chapter 4 - The potential role of sexual selection and sexual conflict on the genomic distribution of mito-nuclear genes
  Summary
  Introduction
  Material and Methods
    Detection and localization of genes interacting with mitochondria
    Statistical analysis
  Results and Discussion