The Evolution of Gene Function in Caenorhabditis spp.

by Adrian Verster

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
Graduate Department of Molecular Genetics
University of Toronto

© Copyright 2014 by Adrian Verster

Abstract

The sequence of the genome gradually evolves, and such changes can affect the function of the genes encoded within. Here I try to understand the causes and consequences of changes in gene function between related species, particularly in Caenorhabditis nematodes. The goal of the first section of my thesis is to compare the biological function of one-to-one orthologs using loss-of-function phenotypes in C. elegans and C. briggsae. I did this by constructing and screening an RNA interference (RNAi) library in C. briggsae and comparing these RNAi phenotypes to those in C. elegans. This approach found 91 examples of orthologs with different in vivo functions, around 7% of the genes screened. For one of these examples, sac-1, I was able to explain the different biological function by a difference in molecular function, in this case a difference in gene expression. Given the extremely high phenotypic similarity of these two species, I hypothesize that many of these different RNAi phenotypes represent changes in gene function that have preserved the developmental phenotypes of the animal due to high levels of stabilizing selection, a model known as Developmental System Drift. In support of this, I found several cases where a phenotype is not present for the C. briggsae copy of an ortholog but is present for another related family member, suggesting that this related gene has taken over the function of the original gene in C. briggsae.
I also found that recently evolved genes are enriched for having different RNAi phenotypes, which led me to consider what could explain the rapid evolution of their in vivo function. By measuring the co-expression of novel genes with genes in different molecular pathways, I was able to construct a series of features that accurately predict which novel genes become essential. An analysis of this model showed that co-expression with ancient, essential pathways is highly predictive of a novel gene acquiring an essential function. These results support a picture of how novel genes become integrated into cellular networks and are subsequently preserved by evolutionary forces.

Acknowledgements

First and foremost I would like to thank my thesis advisor, Andy Fraser, who guided me through these intellectually challenging years. You were always available to set me on track when I was lost. I am also grateful to my thesis committee (Gary Bader, Asher Cutter and Sabine Cordes), who were always willing to give me advice on a multitude of issues. You each provided an important perspective on the work, and I could not have completed this without your help.

Members of the Fraser lab have provided tremendous help over the years: Arun Ramani, Nadege Pelte, Mark Spensley, Nattha Wannissorn, Victoria Vu, June Tan, Steve Van Doormaal, Tunga Chuluunbaatar and Mike Schertzberg. Many people helped me with difficult scientific issues, but I would particularly like to thank those who taught me computational biology, namely Leopold Parts, Azin Sayad, Traver Hart, Lee Zamparo and Carl de Boer.

I could not have made it through without my friends and others who are close to me. Kathleen Cook, my lovely fiancé, was always there to help me through difficult times. I will not exhaustively name you, but anyone else who was close to me, you know who you are.

I would also like to acknowledge Marie-Anne Félix for the gift of the SID-2 expressing JU1018 transgenic C.
briggsae line, and the pJH1774 plasmid containing mWormCherry was a gift from Arshad Desai’s lab. Finally, I would like to thank Google and Wikipedia for providing me with a deep well of knowledge that I repeatedly drew from when I was missing pieces of the puzzle.

Contents

1 Introduction
  1.1 Mechanisms for genome change
    1.1.1 Mutation
    1.1.2 Selection
  1.2 Experimental Evolution
    1.2.1 How reproducible is evolution?
  1.3 Change in gene function of orthologous genes
    1.3.1 Case studies from quantitative genetics
    1.3.2 Case studies from the evolution of development
  1.4 Evolution of novel genes
    1.4.1 Evolution of gene function for taxonomically restricted genes
    1.4.2 The evolution of novel genes by gene duplication
    1.4.3 The evolution of novel genes from non-coding DNA
  1.5 Pathway evolution
    1.5.1 Convergent evolution in the C. elegans and C. briggsae sex development pathways
    1.5.2 RNA interference and Argonaute proteins
  1.6 High-throughput approaches to genome evolution
    1.6.1 How much of evolution can single genes explain?
    1.6.2 Gene expression divergence
    1.6.3 Transcription factor binding divergence
  1.7 Developmental System Drift
  1.8 How much of molecular function is noise?
  1.9 Open questions and thesis goals
2 Evolution of ortholog gene function in Caenorhabditis spp.
  2.1 Abstract
  2.2 Introduction
  2.3 Results
    2.3.1 Construction and screening of the C. briggsae RNAi library
    2.3.2 Genes with different phenotypes are enriched for transcription factors and recently evolved novel genes
    2.3.3 Changes in gene function during DSD are often the result of promoter evolution
    2.3.4 Ortholog pairs encoding more divergent protein sequences are more likely to have different RNAi phenotypes
    2.3.5 Orthologs may have different organismal roles due to changes in other genes
    2.3.6 Conservation of function can be maintained at the level of gene family and not gene family members
  2.4 Discussion
  2.5 Methods
    2.5.1 Construction of the C. briggsae RNAi Library
    2.5.2 Manual screening of the C. briggsae RNAi Library
    2.5.3 Fitness Assay
    2.5.4 qPCR
    2.5.5 Examination of C. elegans and C. briggsae gene phylogenetic age
    2.5.6 GFP Stitching and Microscopy
    2.5.7 Transgenic rescue experiments
    2.5.8 Examination of C. elegans and C. briggsae protein similarity
    2.5.9 Predictability of phenotype differences
    2.5.10 Identifying functionally related genes
3 Evolution of essential functions in novel genes
  3.1 Abstract
  3.2 Introduction
  3.3 Results
    3.3.1 The number of TRGs in the worm genome
    3.3.2 Novel genes preferentially form functional links with other novel genes
    3.3.3 Essential TRGs are enriched for functional links
    3.3.4 Prediction of novel gene function based on co-expression profiles
    3.3.5 Novel genes contribute to drug resistance predictions
  3.4 Discussion
  3.5 Methods
    3.5.1 Finding Taxonomically restricted genes
    3.5.2 TRG functional data
    3.5.3 Essential TRG classification
    3.5.4 Feature importance analysis
    3.5.5 Drug resistance predictions
4 Discussion and concluding remarks
  4.1 Summary
  4.2 Turnover of genes within pathways
  4.3 Transversal of adaptive peaks
  4.4 Novel gene function at high resolution
  4.5 How do novel genes change functional networks?
  4.6 Overall Significance
Bibliography

Chapter 1

Introduction

Evolution has generated the incredibly diverse array of life forms which inhabit the planet. Small changes in the genetic material, DNA, are passed down to offspring, and these changes can eventually become fixed in the population. Improvements in genomic technologies have completely revolutionized our understanding of evolution. Changes in the heritable material are now measurable, and thus we can measure the changes that occur during evolution. For example, the old view of the tree of life was that of 5 kingdoms, Monera (bacteria), Protists, Plants, Animals and Fungi (Whittaker, 1969), which was based on our (naive) understanding of the biology of different organisms. However, the first attempts to measure the heritable material (rDNA sequences) seriously challenged this view: they found that different types of bacteria were as divergent from each other as bacteria are from eukaryotes, and this led to a classification system of eukaryotes, bacteria, and a third group, archaebacteria (Woese and Fox, 1977). Since then, similar molecular taxonomy has changed our view of the eukaryotic tree of life as well. The emerging view is of 5 major groupings: Unikonts, which include fungi and animals; Plantae, which include land plants and algae; Excavates; Cercozoa; and Chromalveolates (Keeling et al., 2005). The exact relationships between these groups remain unclear.

In parallel to our ability to measure heritable changes in DNA, we have also acquired the technology to characterize the function of the genes
