Resurrecting Ancient Genes: Experimental Analysis of Extinct Molecules
Joseph W. Thornton

Center for Ecology and Evolutionary Biology, University of Oregon, Eugene, Oregon 97403-5289, USA. e-mail: [email protected]
doi:10.1038/nrg1324
Nature Reviews Genetics | Volume 5 | May 2004 | 366 | www.nature.com/reviews/genetics

There are few molecular fossils: with the rare exception of DNA fragments preserved in amber, ice or peat, no physical remnants preserve the intermediate forms that existed during the evolution of today’s genes. But ancient genes can now be reconstructed, expressed and functionally characterized, thanks to improved techniques for inferring and synthesizing ancestral sequences. This approach, known as ‘ancestral gene resurrection’, offers a powerful new way to empirically test hypotheses about the function of genes from the deep evolutionary past.

The evolution of gene function is a central issue in molecular evolution: by what mechanisms and dynamics have the diverse functions of modern-day genes emerged? This question is usually addressed by inferring past processes from extant patterns, using statistical methods to detect the traces of ancient evolutionary events in the sequences of modern-day genes (see, for example, REFS 1,2). Thanks to the recent surge in knowledge of structure–function relationships, evolutionists can better interpret indicative patterns (such as biases or changes in evolutionary rates) by focusing on specific parts of gene sequences that are most likely to change gene function (for some examples, see REFS 3,4). Despite the important insights that the statistical approach has made possible, however, this method remains inferential, without empirical tests to refute or corroborate the evolutionary hypotheses that sequence patterns suggest.

Recent advances in phylogenetics and DNA-synthesis techniques have made it possible to experimentally test molecular evolutionary hypotheses by resurrecting ancient genes in the laboratory. To resurrect a gene, the sequence of an ancient protein is inferred using phylogenetic methods, a DNA molecule coding for that protein is synthesized, the extinct protein is expressed in vitro or in cultured cells and its functions are assayed using molecular techniques. Half a dozen recent publications have reported the resurrection of ancestral genes from the distant evolutionary past, including genes from the last common ancestors of bacteria, of BILATERIAN animals and of vertebrates. These studies have shed light on fascinating questions about primordial environmental adaptations and the evolution of crucial gene functions.

Here, I review ancestral gene resurrection, the technical advances that have made it feasible and the studies that have applied it. I discuss the historical development of the technique and its methodological basis, with an emphasis on the previously unattainable insights it has allowed. I also highlight the limitations and pitfalls of this strategy, which should be kept in mind when designing studies and interpreting results, and I conclude with suggestions for future extensions and applications of the technique.

BILATERIAN: An animal that shows bilateral symmetry across a body axis. Bilaterians include chordates, arthropods, nematodes, annelids and molluscs, among other groups.

How to raise a gene from the dead

A gene resurrection study, as with all good science, begins with a question: in this case, one that could be answered if we knew the functions of the ancestral or intermediate forms that existed during the evolution of modern-day genes. For example, a question about the physiological or biochemical traits of an extinct species could be answered by resurrecting and studying the ORTHOLOGUE, in that species, of a gene that determines the trait in modern-day species. If the question concerns how a gene family and its functions diversified, then the ancestral gene that gave rise to the family through gene duplications can be resurrected and characterized, as can ancestors from intermediate points during the proliferation of that family.

ORTHOLOGUES: The ‘same’ gene in more than one species. Orthologues descend from a speciation event.

Resurrecting an ancestral gene involves five steps (FIG. 1). In the first step, sequences that are descended from the ancestral gene are obtained and aligned, along with OUTGROUP SEQUENCES, and the tree of their relationships is inferred (or imposed if it is known a priori). Amino-acid sequences are typically used because they contain less ‘noise’ than DNA sequences, which are more subject to convergence and reversal. Phylogenetic methods, such as maximum parsimony or maximum likelihood (ML; see later sections), are then used to infer the best estimate of the ancestral state for each sequence site given the present-day sequence data. In the second step, a DNA sequence that codes for the ancestral protein is inferred on the basis of the genetic code; CODON BIAS for the system in which the gene will be expressed can be introduced to improve the translation rate. This coding sequence is produced de novo (the third step) by synthesizing overlapping oligonucleotides and assembling them by PCR or by restriction digest/ligation; site-directed mutagenesis can be used if the ancestral sequence can be made by introducing only a few changes into an extant gene. In the fourth step, the ancestral gene is cloned into a plasmid that allows high-level expression, and the plasmid is then transfected into bacterial or mammalian cells in culture. Finally, the ancestral protein can be purified if necessary and its functions characterized using experimental tests such as reporter-gene expression assays, ligand- or substrate-BINDING ASSAYS, or assays that measure enzyme specificity and turnover rate.

OUTGROUP SEQUENCES: In phylogenetics, sequences that are known a priori to be more distantly related to the other sequences in the analysis (the ingroup sequences) than the ingroup sequences are to each other.

CODON BIAS: Preferential use of certain DNA codons over others that code for the same amino acid.

BINDING ASSAYS: A family of biochemical procedures used to determine the affinity and specificity with which a protein binds a specific ligand or substrate.

Figure 1 | The ancestral gene resurrection strategy. Flow chart of the stages required to resurrect and characterize an ancestral gene. A hypothetical protein from the ancestral vertebrate is shown as an example. See main text for details of each step. mya, million years ago.

a Infer phylogenetic tree from aligned sequences and determine best-fitting evolutionary model. The ancestral node is labelled ‘Vertebrate ancestor (>400 mya)’.

    Human      CDGCKSFFKRAVEGACMLLILKCKCLIICHHSRRWKTKHCIQGDACEHSKFFKENRAVEMMEVAGEKL…
    Mouse      CDGCKTFFKRAVEGACMLLILKCKCLIICHHSRRYKTCHCIQGRACEHSKFFKENRAVEMMEVAGEKL…
    Chick      CDGCKAFFKKAVEGACMLLILKCKCLIICHHARRYKTCHCVQGRACEKSKFFKENRAVEMMEVAGEKL…
    Frog       CEGCKAFFKKSVEGACMLLILKCKCLIICHHARRYKTCHCIQGRACEKTDFFKENRAVEMMEVAGEKL…
    Zebrafish  CEGCKSFFKKSVDGACMLLILKCKCLIICHHARRYKTCHCIQGRACEKTKFFRENRAVEMMEVAGEKL…
    Shark      CEGCKSFFKKSVDPACMLLILKCKCLIICHHARRYKTCHCIQGRACEKTKFFKDNRSVEMMEVAGEKL…
    Lamprey    CEGCKSFFKKSMDPACMLLILKCKCLIICHHARRYKTCHCIQGRACEKTKFFKDNRALEMMEVAGEKL…
    Lancelet   CEGCKSFLKKSMDPACMLLILKCKCLIICHHTRRYKTCHCIQGRACEKTKFFKDNRAVEMMDVAGEKL…
    Urchin     CDGCKSFLKHTMDPACMFLILKCRCLIVCHHTRRYKTCHCIQGRSCEKTKFFKDNRAVEMMEVAGEKL…

b Reconstruct protein sequence at ancestral node by maximum likelihood.

    Ancestral sequence  CEGCKSFFKKSMDPACMLLILKCKCLIICHHARRYKTCHCIQGRACEKTKFFKDNRAVEMMEVAGEKL

c Synthesize oligonucleotides and assemble gene for ancestral protein by stepwise PCR.

d Subclone assembled gene into vector, transform cultured cells and express ancestral protein.

e Purify ancestral protein (if necessary) and characterize function using trans-activation, binding or other assay.

Modernizing gene resurrection techniques

In 1963, L. Pauling and E. Zuckerkandl prophesied that it would be possible one day to infer the gene sequences of ancestral species, to “synthesize these presumed components of extinct organisms … and study the physico-chemical properties of these molecules” (REF. 5). Not until 1990, however, were methods for ancestral sequence inference and DNA synthesis sufficiently mature to make the first such work possible (BOX 1). Five gene resurrection studies were published during the following seven years; all involved genes from the relatively recent past, such as transposons in the genomes of mice and salmonids, and digestive RNases in ancestral artiodactyls (TABLE 1).

A hiatus of more than half a decade followed, in which no new gene resurrection studies

maximum parsimony method, which was developed in the early 1970s but not applied to or adapted for gene sequences until the first phylogenetic computer programs emerged in the mid 1980s (BOX 1). Maximum parsimony was an important advance over consensus methods, in which the most common state among extant sequences is assumed to be the ancestral state, irrespective of phylogenetic relations. The consensus approach is extremely sensitive to the sample of sequences chosen, however, and it will always err if a change from the ancestral state occurred on a deep branch that subtended a highly speciose group (BOX 2). To take a morphological example, the consensus method would lead to the inference that the vertebrate ancestor had jaws, because there are far more species of TELEOSTS and tetrapods than there are of