Characterization of Unique Features of the Denisovan Exome
CHARACTERIZATION OF UNIQUE FEATURES OF THE DENISOVAN EXOME

A University Thesis Presented to the Faculty of California State University, East Bay

In Partial Fulfillment of the Requirements for the Degree Master of Science in Biological Science

By Alexandra Vivelo

September, 2013

ABSTRACT

The publicly available Denisovan genome sequence increases opportunities to learn what makes modern humans unique and to discover the distinguishing genetic features of an extinct sister lineage. This thesis explores the latter, with emphasis on male reproductive genes, neuronal genes, and a subset of metabolic genes, specifically those that code for enzymes involved in glycolysis and those that code for proteins that vary in modern human populations in connection with long-term dietary trends in those populations. Results include the identification of 34 neuronal genes with single-nucleotide changes that are derived in the Denisovan protein-coding sequence at loci that are non-polymorphic in modern humans, the computation of the dN/dS ratio for a semen coagulation factor for which the degree of positive selection is known to be correlated with the females’ mean number of male mating partners per periovulatory period, and the determination of the Denisovan variants at a subset of known modern dietary and metabolism-related single-nucleotide polymorphic loci. Possible behavioral and functional correlates of those unique features are suggested, providing the foundation for further study on Denisovan male reproductive selective pressure, unique neuronal gene features, and metabolic genes.

ACKNOWLEDGMENTS

I would like to thank Dr. Chris Baysdorfer, who supported this project from the start. He encouraged me to explore new territory, was kind when I proposed implausible plans, and was always enthusiastic. Thanks to Dr. Baysdorfer, my small seed of an idea turned into a full-fledged project, and it wouldn’t have happened without his vision and support.
My immense gratitude also goes to Dr. Claudia Uhde-Stone, whose support has been indispensable throughout my time at CSUEB and without whom this thesis would not exist; Dr. Henry Gilbert, who has been tremendously generous with his time and expertise; and Dr. Kenneth Curr for his mentorship. I would also like to thank Dr. Kelly Decker, Dr. Maria Nieto, and Dr. Maria Gallegos for their encouragement. These CSUEB faculty members make up an exceptionally talented and caring group, and I am privileged to be acquainted with each of them.

I sincerely appreciate the time and expertise offered by Dr. Ed Green of UCSC in discussing this research. Thanks also to Dr. Bill Lu of SBI for a helpful discussion.

I would also like to express my deep gratitude to my husband, Terry Van Belle. Without the significant investment of time and programming skill he put into creating searchable alignment files, and without his programming instruction, I would not have been able to access or analyze the data used in these pages.

Finally, thanks to my dad, who has always believed in me; Val, for the Excel help; my aunt, for telling me it’s not too late; and my daughter, for cheerfully sacrificing some of our precious time together and encouraging me every step of the way.

TABLE OF CONTENTS

ABSTRACT
ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
INTRODUCTION
    Research Question
    Background
        What We Know of the Denisovans
        The Source of Ancient DNA Sequences
        Neanderthal Analyses Hint at What May Be Found in Denisovan DNA
        Locus-Specific Denisovan Genome Research Published to Date
        Challenges and Opportunities in the Characterization of Ancient DNA
        Summary of Aims
METHODS
RESULTS
    Male Reproductive Genes
    Premature Stop Codons and Loss-of-Stop Mutations
    Genes Related to Neuron Formation and Function
        L1CAM
        PCSK9
        HMCN1
        SETD2
        REST
        GDNF
        NYAP1
        CHAT
        NAV2
        CLN6
        HEXA
        CC2D1A
        NEFH
    Metabolic Genes
        Metabolic Genes with Known SNPs in Modern Humans
DISCUSSION
    Selective Pressures in Male Reproductive Genes
    Neuronal Genes
    Premature Stops and Loss-of-Stop Mutations
    Metabolic Genes
REFERENCES
APPENDIX A: MALE REPRODUCTIVE GENES
APPENDIX B: NEURONAL GENES

LIST OF TABLES

Table 1. SEMG2 dN/dS Ratios.
Table 2. Premature Stop Codons Found in the Denisovan Exome.
Table 3. Loss-of-Stop Mutations Found in the Denisovan Exome.
Table 4. SNCs in Genes Encoding Glycolytic Enzymes.

LIST OF FIGURES

Figure 1. SEMG2 dN/dS Ratios.

INTRODUCTION

This project aims to characterize distinct features of the genome of the Denisovans, the species of extinct hominid discovered in Siberia in 2008. It consists of a bioinformatics analysis of the Denisovan genome, with special focus on specific subsets of protein-coding genes that are functional in modern humans and that include at least one amino acid change between Denisovans and modern humans. The present study focuses primarily on single nucleotide changes (SNCs): loci at which the nucleotide differs between the modern human and Denisovan genomes. The SNCs studied here are primarily those likely to be functionally significant because they are located in translated regions of exons. To locate the base pairs that show a functionally