Human Gene Evolution the HUMAN MOLECULAR GENETICS Series

Human Gene Evolution The HUMAN MOLECULAR GENETICS series Series Advisors D.N. Cooper, Institute of Medical Genetics, University of Wales College of Medicine, Cardiff, UK S.E. Humphries, Division of Cardiovascular Genetics, University College London Medical School, London, UK T. Strachan, Department of Human Genetics, University of Newcastle upon Tyne, Newcastle upon Tyne, UK Human Gene Mutation From Genotype to Phenotype Functional Analysis of the Human Genome Molecular Genetics of Cancer Environmental Mutagenesis HLA and MHC: Genes, Molecules and Function Human Genome Evolution Gene Therapy Molecular Endocrinology Venous Thrombosis: from Genes to Clinical Medicine Protein Dysfunction in Human Genetic Disease Molecular Genetics of Early Human Development Neurofibromatosis Type 1: from Genotype to Phenotype Analysis of Triplet Repeat Disorders Molecular Genetics of Hypertension Human Gene Evolution Forthcoming title B Cells Human Gene Evolution David N. Cooper Institute of Medical Genetics, University of Wales College of Medicine, Cardiff, UK. © BIOS Scientific Publishers Limited, 1999 First published in 1999 All rights reserved. No part of this book may be reproduced or transmitted, in any form or by any means, without permission. A CIP catalogue record for this book is available from the British Library. ISBN 1 859961 51 7 BIOS Scientific Publishers Ltd 9 Newtec Place, Magdalen Road, Oxford OX4 1RE, UK Tel. +44 (0)1865 726286. Fax +44 (0)1865 246823 World Wide Web home page: http://www.bios.co.uk/ Published in the United States, its dependent territories and Canada by Academic Press, Inc., A Harcourt Science and Technology Company, 525 B Street, San Diego, CA 92101–4495. www.academicpress.com TO PAUL, CATRIN AND DUNCAN O sweet spontaneous earth how often has the naughty thumb of science prodded thy beauty thou answereth them only with spring. e e cummings Tulips and Chimneys (1924) Production Editor: Andrea Bosher. Typeset by Saxon Graphics Ltd, Derby, UK. Printed by Biddles Ltd, Guildford, UK. Contents Preface xi Abbreviations xv PART 1 INTRODUCTION AND OVERVIEW 1 Structure and function in the human genome 3 Introduction 3 Chromosome structure and function 3 Chromatin structure 3 Centromeres 4 Telomeres 4 Sites of recombination 5 Gene distribution and density 6 Isochores 6 Matrix attachment regions 6 Origins of DNA replication 7 Gene organization and transcriptional regulation 7 Gene structure and regulation 7 Polymorphisms 11 Functional organization of human genes 14 Pseudogenes 17 Promoter elements 17 Enhancers 19 Negative regulatory elements 19 Locus control regions 20 Trans-acting protein factors 20 5′ and 3′ untranslated regions 21 Boundary elements 22 Sequence motifs involved in mRNA splicing and processing 23 Polycistronic and polyprotein genes 25 DNA methylation 26 CpG islands 27 CpG suppression in the vertebrate genome and its origin 28 The CpG dinucleotide as a mutation hotspot 29 Imprinting and imprinted genes 29 Repetitive sequence elements 30 Tandem repeats 30 Alu sequences and other SINEs 31 LINE elements 33 Endogenous retroviral sequences and transposons 34 Genes, mutations and disease 34 Single base-pair substitutions within the coding region 36 Single base-pair substitutions within splice sites 36 Single base-pair substitutions within promoter regions 37 Gross gene deletions 37 Microdeletions 37 Insertions 37 Inversions 37 Expansion of unstable repeat sequences 38 VI HUMAN GENE EVOLUTION 2 Evolution of the human genome 55 Ancient genome duplications at the dawn of vertebrate evolution 55 Evidence for an ancient genome duplication 55 Consequences of genome duplications for gene evolution 57 Mammalian genome evolution 63 Primate evolution 67 Adaptation and adaptive radiation 67 Primate phylogeny 68 Chromosome evolution in primates 72 Old World primates 72 New World primates 77 Evolution of the human sex chromosomes and the pseudoautosomal regions 77 Evolution of the mitochondrial genome 81 The evolution of human populations 84 The action of natural selection in human populations 86 Sequencing the genomes of model organisms and humans 88 PART 2 EVOLUTION OF GENE STRUCTURE 3 Introns, exons, and evolution 107 Intron structure, function and evolution 107 The evolution of alternative processing 111 Alternative splicing 111 Aberrant transcripts 115 mRNA surveilance 115 Ectopic transcripts 115 Exon structure and evolution 116 Introns early or introns late? 116 Mechanisms of intron insertion and deletion 120 Exon shuffling 122 Exon shuffling in the evolution of human genes 122 The phase compatibility of introns 123 The serine proteases of coagulation 125 Protein folds, primordial exons and the emergence of exon shuffling 128 Pseudoexons 130 4 Genes and gene families 139 The origins of human genes 139 Genes with a specifically human origin 139 Human genes which originated after the divergence of Old World monkeys and New World monkeys 140 Human genes which originated during primate evolution 141 Human genes whose origin preceded the divergence of mammals 142 Human genes whose origin preceded the divergence of the vertebrates 143 Human genes whose origin preceded the divergence of the metazoa 143 Human genes whose origin preceded the divergence of animals and fungi 145 Human genes whose origin preceded the divergence of plants and animals 146 Human genes whose origin preceded the divergence of prokaryotes and eukaryotes 146 The emergence of genes and gene families has paralleled organismal evolution 149 Multigene families 150 Gene families 150 Actin genes 150 Albumin genes 152 CONTENTS VII Apolipoprotein genes 152 Complement genes 153 Crystallin genes 155 Collagen genes 156 Genes for the fibroblast growth factors and their receptors 158 GABA receptor genes 159 G protein α subunit genes 160 Globin genes 160 Growth hormone and somatomammotropin genes 162 Glycophorin genes 164 Homeobox genes 165 Integrin genes 168 Keratin genes 168 Genes of the major histocompatibility complex 170 Mucin genes 175 Genes encoding RNA-binding proteins 175 Sulfatase genes 176 Genes encoding tRNAs and aminoacyl-tRNA synthetases 177 Ubiquitin genes 178 An overview of the evolution of multigene families in eukaryotes 179 Highly repetitive multigene families 180 Histone genes 180 Ribosomal RNA genes 181 5S ribosomal RNA genes 182 Small nuclear RNA genes 182 Gene superfamilies 183 Cadherin genes 183 Cytochrome P450 genes 183 Cystatin genes 184 G-protein-coupled receptor genes 184 Heat shock genes 185 Insulin and insulin-like growth factor genes 186 Interferon genes 186 Nuclear receptor genes 189 Protein kinase C genes 190 Serine protease genes 190 Serpin genes 191 Zinc finger genes 192 Olfactory receptor genes 193 Genes which undergo programmed rearrangement 194 Immunoglobulin genes 194 T-cell receptor genes 196 Convergent evolution 196 Coevolution 198 5 Promoters and transcription factors 221 Promoters and enhancers 221 Evolutionary conservation of cis-acting elements and ‘phylogenetic footprinting’ 223 Nonhomologous genes containing similar regulatory elements 225 Paralogous genes containing dissimilar regulatory elements 225 Orthologous genes containing dissimilar regulatory elements 228 Bidirectional promoters 231 5Ј and 3Ј untranslated regions of genes 233 Inter-specific differences in promoter selection 234 Developmental changes in gene expression 235 Promoter polymorphisms 236 VIII HUMAN GENE EVOLUTION Promoter duplication 238 Functional redundancy of promoter elements 238 Recruitment of repetitive sequences as promoter and silencer elements 238 Alu sequences 238 Endogeneous retroviral elements 241 LINE elements 244 Minisatellites and microsatellites 245 mRNA editing 246 Coordinate regulation 247 Changes in expression of developmentally significant gene products 247 Transcription factors 249 Transcription factor families 250 Functional conservation of orthalogous transcription factors 251 Functional redundancy of paralogous transcription factors 251 Paralogus transcription factors 252 Orthologous transcription factors 253 Alternative splicing of transcription factor genes 253 Promoter shuffling in transcription factor genes 253 Transcription factor-binding site interactions 253 Exon shuffling in the evolution of transcription factors 254 6 Pseudogenes and their formation 265 Pseudogene formation 265 The generation of pseudogenes by duplication 265 The generation of pseudogenes by retrotransposition 269 The generation of pseudogenes by other means 275 Origin and age of pseudogenes 275 Patterns of mutation in pseudogenes 276 Pseudogenes and gene conversion 276 Pseudogene reactivation 278 Gene loss/inactivation in primates 279 Urate oxidase gene 280 α-1,3-Galactosyltransferase gene 280 Elastase 1 gene 281 L-gulono-γ-lactone oxidase gene 281 ADP-ribosyltransferase 1 gene 282 Haptoglobin gene 282 Fertilin-α gene 282 T-cell receptor γ V10 variable region gene 283 Cytidine monophospho-N-acetylneuramic acid hydroxylase gene 283 Flavin-containing monooxygenase 2 gene 283 Overview 284 PART 3 MUTATIONAL MECHANISMS IN EVOLUTION 7 Single base-pair substitutions 297 Single base-pair substitutions in evolution 297 The selectionist and neutralist perspectives 297 Synonymous and nonsynonymous substitutions 299 Positive and negative selection in protein evolution 300 Mutation rates and their evolution 303 The deleterious mutation rate in humans 304 Mutations in pathology and evolution; two sides of the same coin 305 The importance of evolutionary conservation in the study of pathological mutations at the protein level using human factor IX as a model

Human Gene Evolution the HUMAN MOLECULAR GENETICS Series

Mass Spectrometry-Based Proteomics Techniques and Their Application in Ovarian Cancer Research Agata Swiatly, Szymon Plewa, Jan Matysiak and Zenon J

Studies on the Proteome of Human Hair - Identifcation of Histones and Deamidated Keratins Received: 15 August 2017 Sunil S

Table SD1. Patient Characteristicsa

Supplementary Table 5. Functional Annotation of the Largest Gene Cluster(221 Element)

AMD) Transmitochondrial Cybrids Protected from Cellular Damage and Death by Human Retinal Progenitor Cells (Hrpcs

Coding RNA Genes

Analysis of Genetic Determinants Associated with Persistent Synthesis of Fetal Hemoglobin

Clinical Chemistry

Hibernation in Pygmy Lorises (Nycticebus Pygmaeus) – What Does It Mean?

Discovery of Genes Affecting Resistance of Barley to Adapted And

Gene Network and Pathway Analysis of Transcriptional Signatures Characterizing Sole Ulcer and Digital Dermatitis in Dairy Cows

Ribosomal RNA