Gene Duplication: the Genomic Trade in Spare Parts Matthew Hurles
Total Page:16
File Type:pdf, Size:1020Kb
Primer Gene Duplication: The Genomic Trade in Spare Parts Matthew Hurles f necessity is the mother of change. Gene duplication provides is the widespread existence of gene invention, then its father is an opportunities to explore this forbidden families. Members of a gene family that Iinveterate tinkerer, with a large evolutionary space more widely by share a common ancestor as a result garage full of spare parts. Innovation generating duplicates of a gene that of a duplication event are denoted as (like homicide) requires motive and can ‘wander’ more freely, on condition being paralogous, distinguishing them opportunity. Clearly, the predominant that between them they continue to from orthologous genes in different ‘motive’ during the evolution of a novel supply the original function. genomes, which share a common gene function is to gain a selective Susumu Ohno was the fi rst to ancestor as a result of a speciation advantage. To understand why gene comprehensively elucidate the event. Paralogous genes can often duplications represent the major potential of gene duplication, in his be found clustered within a genome, ‘opportunities’ from which new genes book Evolution by Gene Duplication, although dispersed paralogues, often evolve, we must fi rst consider what published more than 30 years ago with more diverse functions, are also constrains genic evolution. (Ohno 1970). The prescience of common. The vast majority of genes in every Ohno’s book is highlighted by the genome are selectively constrained, fact that his book has almost certainly in that most nucleotide changes that been cited more times in the past fi ve Copyright: © 2004 Matthew Hurles. This is an alter the fi tness of the organism are years than in the fi rst fi ve years after its open-access article distributed under the terms of the Creative Commons Attribution License, which deleterious. How do we know this? publication. permits unrestricted use, distribution, and reproduc- Comparisons between genomes clearly tion in any medium, provided the original work is demonstrate that coding sequences What Is the Evidence for the properly cited. diverge at slower rates than non-coding Importance of Gene Duplication? Matthew Hurles is at the Wellcome Trust Sanger Insti- regions, largely due to a defi cit of The primary evidence that tute near Cambridge in the United Kingdom. E-mail: [email protected] mutations at positions where a base duplication has played a vital role in change would cause an amino-acid the evolution of new gene functions DOI: 10.1371/journal.pbio.0020206 PLoS Biology | http://biology.plosjournals.org July 2004 | Volume 2 | Issue 7 | Page 0900 Whole genome sequences of closely of new biological functions over all classifi ed on the basis of the size of related organisms have allowed evolutionary timescales, it is of great duplication generated, and whether us to identify changes in the gene interest to be able to comprehensively they involve an RNA intermediate complements of species over relatively document the duplicative differences (Figure 1). short evolutionary distances. These that exist between our own species ‘Retrotransposition’ describes the comparisons typically reveal dramatic and our closest relatives, the great integration of reverse transcribed expansions and contractions of apes. The study by Fortna et al. (2004) mature RNAs at random sites in a gene families that can be related to in this issue of PLoS Biology identifi es genome. The resultant duplicated underlying biological differences. For over 3% of around 30,000 genes as genes (retrogenes) lack introns and example, humans and mice differ having undergone lineage-specifi c have poly-A tails. Separated from their in their sensory reliance on sight copy number changes among fi ve regulatory elements, these integrated and smell respectively; colour vision hominoid (humans plus the great apes) sequences rarely give rise to expressed in humans has been signifi cantly species. This is the fi rst time that copy full-length coding sequences, although enhanced by the duplication of an number changes among apes have functional retrogenes have been Opsin gene that allows us to distinguish been assayed for the vast majority of identifi ed in most genomes. light at three different wavelengths, human genes, and we can expect that Tandem duplication of a genomic while mice can distinguish only segment (segmental duplication) the biological consequences of the 140 two. By contrast, a much higher is one of the possible outcomes of human-specifi c copy number changes proportion of the large gene family of ‘unequal crossing over’, which results identifi ed in this study will be heavily olfactory receptors have retained their from homologous recombination investigated over the coming years. functionality in mice, as compared to between paralogous sequences. humans. How Do Duplications Arise? These recombination events can also Given the apparent importance of The various mechanisms by which give rise to the deletion or inversion gene duplication for the evolution genes become duplicated are often of intervening sequences. Recent DOI: 10.1371/journal.pbio.0020206.g001 Figure 1. Mechanism of Gene Duplication A two-exon gene is fl anked by two Alu elements and a neighbouring replication termination site. Recombination between the two Alu elements leads to a tandem duplication event, as does a replication error instigated by the replication termination site. Retrotransposition of the mRNA of the gene leads to the random integration of an intron-less paralogue at a distinct genomic location. PLoS Biology | http://biology.plosjournals.org July 2004 | Volume 2 | Issue 7 | Page 0901 evidence suggests that the explosion duplications, because duplication So what are the relative contributions of segmental duplications in recent breakpoints are enriched at replication of these different mechanisms? Not primate evolution has been caused in termination sites (Koszul et al. 2004). all interspersed duplicate genes are part by the rapid proliferation of Alu Genome duplication events generated by retrotransposition. elements about 40 MYA. Alu elements generate a duplicate for every gene The initially tandem arrangement of are derived from the 7SL RNA gene in the genome, representing a huge segmental duplications can be broken and represent the most frequent opportunity for a step-change in up by subsequent rearrangements. dispersed repeat in the human organismal complexity. However, In keeping with this hypothesis, genome, with the approximately genome duplication presents duplicated genes in a tandem 1 million copies of the 300-bp Alu signifi cant problems for the faithful arrangement typically represent more element representing around 10% transmission of a genome from recent duplication events (Friedman of the entire genome. The striking one generation to the next, and is and Hughes 2003). Recent analyses enrichment of Alu elements at the consequently a rare event, at least suggest that 70% of non-functional junctions between duplicated and in Metazoa. In principle, genome duplicated genes (pseudogenes) single copy sequences implicates duplications should be easily identifi ed in the human genome result from unequal crossing over between these through the coincident emergence retrotransposition rather than any repeats in the generation of segmental within a phylogeny of many gene DNA-based process (Torrents et al. duplications (Bailey et al. 2003). families. Unfortunately, this signal is 2003). The observation of segmental complicated by subsequent piecemeal duplication events with no evidence for loss and gain of gene family members. What Fates Befall a Recently homology-driven unequal crossing over Consequently, there is heated Duplicated Gene? suggests that segmental duplications debate over possible ancient genome A duplicated gene newly arisen can also arise through non-homologous duplication events in early vertebrate in a single genome must overcome mechanisms. A recent screen for evolution and more recently in teleost substantial hurdles before it can be spontaneous duplications in yeast fi sh, both of which must have occurred observed in evolutionary comparisons. suggests that replication-dependent hundreds of millions of years ago First, it must become fi xed in the chromosome breakages also play a (McLysaght et al. 2002; Van de Peer et population, and second, it must be signifi cant role in generating tandem al. 2003). preserved over time. Population genetics tells us that for new alleles, fi xation is a rare event, even for new mutations that confer an immediate selective advantage. Nevertheless, it has been estimated that one in a hundred genes is duplicated and fi xed every million years (Lynch and Conery 2000), although it should be clear from the duplication mechanisms described above that it is highly unlikely that duplication rates are constant over time. However, once fi xed, three possible fates are typically envisaged for our gene duplication. Despite the slackened selective constraints, mutations can still destroy the incipient functionality of a duplicated gene: for example, by introducing a premature stop codon or a mutation that destroys the structure of a major protein domain. These degenerative mutations result in the creation of a pseudogene (nonfunctionalization). Over time, the likelihood of such a mutation being introduced increases. Recent studies suggest that there is a relatively DOI: 10.1371/journal.pbio.0020206.g002 narrow time window for evolutionary Figure 2. Fates of Duplicate Genes exploration