Perspectives in Science (2016) 9, 17—23
View metadata, citation and similar papers at core.ac.uk brought to you by CORE
Available online at www.sciencedirect.com provided by Elsevier - Publisher Connector
ScienceDirect
jo urnal homepage: www.elsevier.com/pisc
ଝ
Reconstruction of ancestral enzymes
∗
Rainer Merkl, Reinhard Sterner
University of Regensburg, Institute of Biophysics and Physical Biochemistry, 93040 Regensburg, Germany
Received 1 December 2015; accepted 2 February 2016
Available online 26 August 2016
KEYWORDS Summary The amino acid sequences of primordial enzymes from extinct organisms can be
determined by an in silico approach termed ancestral sequence reconstruction (ASR). In the first
Ancestral sequence
reconstruction; step of an ASR, a multiple sequence alignment (MSA) comprising extant homologous enzymes is
being composed. On the basis of this MSA and a stochastic model of sequence evolution, a phy-
Enzyme evolution;
logenetic tree is calculated by means of a maximum likelihood approach. Finally, the sequences
Phylogenetic tree;
Promiscuous of the ancestral proteins at all internal nodes including the root of the tree are deduced. We
enzymes; present several examples of ASR and the subsequent experimental characterization of enzymes
Thermostability as old as four billion years. The results show that most ancestral enzymes were highly ther-
mostable and catalytically active. Moreover, they adopted three-dimensional structures similar
to those of extant enzymes. These findings suggest that sophisticated enzymes were invented
at a very early stage of biological evolution.
© 2016 Beilstein-lnstitut. Published by Elsevier GmbH. This is an open access article under the
CC BY license. (http://creativecommons.org/licenses/by/4.0/).
Introduction is only marginal (Jaenicke, 1991). We are interested in how
enzyme activity and stability developed and changed in the
course of evolution. For this purpose, it would be desirable
Modern enzymes from contemporary organisms are sophis-
to characterize ancient enzymes from primordial organisms.
ticated biocatalysts transforming their substrates into
The lack of macromolecular fossils, however, seems to block
products with high efficiency and specificity. However, as
the access to this interesting information. Luckily, there is a
catalytic activity usually requires a certain degree of con-
circumstantial way out of this dilemma, which is the charac-
formational flexibility, the stability of most modern enzymes
terization of ‘‘extinct’’ proteins after their ‘‘resurrection’’
via ancestral sequence reconstruction (ASR).
ASR is an in silico approach allowing to deduce the
ଝ
This is an open-access article distributed under the terms of the
sequences of ancient proteins from the sequences of homol-
Creative Commons Attribution License, which permits unrestricted
ogous extant proteins (Liberles, 2007). The idea of ASR is
use, distribution, and reproduction in any medium, provided the
more than 50 years old when Pauling and Zuckerkandl pos-
original author and source are credited. This article is part of a
tulated that modern proteins contain enough information
special issue entitled Proceedings of the Beilstein ESCEC Symposium
to derive the sequences of common ancestors (Pauling and
2015 with copyright © 2016 Beilstein-lnstitut. Published by Elsevier
Zuckerkandl, 1963). However, the concept of ASR could only
GmbH. All rights reserved.
∗
Corresponding author. be realized after Fitch had implemented the first phyloge-
E-mail address: [email protected] (R. Sterner). netic algorithm termed PAUP (phylogenetic analysis using
http://dx.doi.org/10.1016/j.pisc.2016.08.002
2213-0209/© 2016 Beilstein-lnstitut. Published by Elsevier GmbH. This is an open access article under the CC BY license. (http://creativecommons.org/licenses/by/4.0/).
18 R. Merkl, R. Sterner
Figure 1 Protocol for ASR. Each ASR requires four steps to deduce ancestral sequences from a set of extant homologs. (A) A set of
sequences is retrieved from a database. (B) These sequences are compiled to a multiple sequence alignment, which allows for the
identification of mutations observed in the sequences. (C) A phylogeny is determined; the extant sequences constitute the leaves.
(D) Based on this phylogenetic tree and the MSA, ancestral sequences are deduced for all internal nodes of the tree.
parsimony) (Fitch, 1971). Although phylogenetic models Computing a phylogenetic tree by means of maximum
and algorithms have considerably improved since then, the likelihood
workflow of ASR has remained essentially the same (Fig. 1). The prerequisite for the computation of a phylogenetic tree
A set of homologous sequences is put together in a multiple is a stochastic model that describes the probability for DNA
sequence alignment (MSA), which is the basis for the subse- or protein sequences to acquire mutations within a certain
quent calculations: first, a phylogenetic tree is constructed time interval ti. For this purpose, a probability to acquire
whose outermost nodes (the leaves) are represented by the any mutation within ti is combined with a substitution
extant sequences. After the calculation of this tree, the pre- model. The latter explains in detail with which probability a
cursor sequences corresponding to the internal nodes are nucleotide or amino acid residue is replaced by another one.
being calculated. These calculations are commonly based on Instead of using fixed mutation rates, it is meanwhile state
a maximum likelihood approach and a phylogenetic model of the art to sample these probabilities from a continuous
which allows for the sampling of mutational frequencies distribution, which provides every site with a specific rate
in a position-specific residue manner (Merkl and Sterner, (Susko et al., 2003).
2016). Based on such a model, the likelihood of a tree can be
In this short review, we will first describe state-of- computed. Likelihood is the probability for observing the
the art in silico methods that have been developed for data (i.e., sequences) given (i) the parameters of the chosen
ASR. We will then provide examples on how ASR has been evolutionary model and (ii) the topology of the tree under
used to resurrect ancient enzymes from the Precambrian study. Commonly, mutations at different sites are considered
era, among them translation elongation factors (Gaucher as independent events. Thus, this likelihood of a complete
et al., 2008), thioredoxins (Ingles-Prieto et al., 2013; Perez- sequence is the product of all site-specific values. To explain
Jimenez et al., 2011), 3-ispropylmalate dehydrogenases the principle, it is therefore sufficient to consider one site
(Hobbs et al., 2015; Hobbs et al., 2012), nucleotide kinases S(j) of a sequence S and to compute the likelihood for the
(Akanuma et al., 2013), -lactamases (Risso et al., 2013; nucleotides at S(j) at each node of the tree. If all time
Zou et al., 2015), imidazole glycerol phosphate synthase intervals ti and all nucleotides ei are known for all nodes
. . .
(Reisinger et al., 2014), and ribonuclease H1 (Hart et al., i = 1, , 8, the likelihood of the tree shown in Fig. 2 is:
2014). We will conclude by summarizing the most impor-
L tree =
p t p t p t p t p t
tant insights that have been gained from these studies with ( ) e1e8 ( 1) e2e8 ( 2) e3e7 ( 3) e4e6 ( 4) e5e6 ( 5)
respect to our ability to ‘‘replay the molecular tape of life’’
p t p t . e6e7 ( 6) e7e8 ( 7) (1)
(Gaucher, 2007).
In silico methods for ASR
However, the states (nucleotides) of the internal nodes
are not known and therefore it is necessary to sum over all
A detailed introduction into stochastic concepts and phylo-
possible states (nucleotides at internal nodes) which results
genetic models that is needed to understand modern ASR in
methods is beyond the scope of this review and can be
found elsewhere; see (Merkl and Sterner, 2016) and refer- L tree = p t p t p t p t
( ) e1e8 ( 1) e2e8 ( 2) e3e7 ( 3) e4e6 ( 4)
ences therein. Here, we present a short summary of the
e8 e7 e6
algorithms required to deduce phylogenetic trees and ances-
p t p t p t . tral sequences. e5e6 ( 5) e6e7 ( 6) e7e8 ( 7)) (2)
Reconstruction of ancestral enzymes 19
states (the content of the internal nodes labeled 6, 7, and 8).
3
For each site of these three internal nodes there are 4 com-
3
binations of nucleotides or 20 combinations of amino acids
ei. It is the aim of an ML approach to identify for these three
nodes a triplet with the largest value p( |data), which is
in a Bayesian approach the triplet that maximizes
p(data| ) · p( )
. (3)
p(data)
Since p(data) is identical for all candidate triplets, it is
sufficient to maximize p(data| ) · p( ). More specifically, for
this tree and the given three nodes the triplet is found by
solving
Figure 2 An example of a phylogeny. Leaves representing
extant sequences are labeled 1—5; internal nodes representing max [pe e (t1)pe e (t2)
e ,e ,e 1 8 2 8
reconstructed ancestral sequences are labeled 6—8. The values 8 7 6
t1 to t7 represent the length of the vertices; i.e., time intervals; p t p t p t p t p t .
e3e7 ( 3) e4e6 ( 4) e5e6 ( 5) e6e7 ( 6) e7e8 ( 7)] (4)
example according to (Pupko et al., 2000).
If we know all transition probabilities pij, the likelihood
The solution computed for Eq. (4) is the maximum over all
of a tree can be computed quite efficiently by means of
possible triplets. For larger trees, it is necessary to maximize
an iterative approach. The missing length of the edges can
over all internal nodes; see (Pupko et al., 2000) for details.
be determined by means of expectation maximization; for
details see (Felsenstein, 1981). However, to find the tree
with the maximal likelihood, the topology — which was taken
as given so far — has to be optimized as well, which requires Typical software protocols for ASR
the creation and the assessment of alternative topologies in The basic principles introduced in the previous paragraph
order to find the optimal one. have been implemented in several software suites. These
For a comparison of alternative trees, maximum likeli- are the basis for software protocols used in ASR experi-
hood (ML) approaches optimize the value given in Eq. (2). ments. These protocols differ and we will briefly illustrate
Unfortunately, the number of tree topologies grows expo- some typical combinations. Generally, each protocol for ASR
nentially with the number of sequences, which requires the requires four steps (A—D) that are depicted in Fig. 1:
use of heuristic approaches to sample tree space. Com-
monly, these approaches progressively optimize the tree
by examining the score of similar trees, choose the high- (A) Select extant sequences. Commonly, homologous
est scoring one as the next estimate, and finally stop, if no sequences are retrieved from databases like GenBank
further improvement can be found. or UniProtKB, usually with the help of BLAST (Altschul
Popular traversal schemes of tree space create alterna- et al., 1990). If the number of hits is very large, highly
tive trees by making small rearrangements on the current similar sequences have to be eliminated, e.g., by using
tree and examine each internal branch of the tree in turn; CD-HIT (Li and Godzik, 2006).
for details see (Whelan, 2008) and (Merkl and Sterner, 2016). (B) Create a multiple sequence alignment. Different
For heuristic algorithms there is no guarantee to find the methods are in use to create a MSA. Among them are
optimal tree that has the maximum likelihood. However, the Clustal W (Larkin et al., 2007), MUSCLE (Edgar, 2004)
rearrangements of the tree topology under study expand the and PRANK (Löytynoja and Goldman, 2008). Regions of
area of searched tree space, which increases the probabil- ambiguous alignment can be removed from the MSA by
ity of finding a nearly optimal solution, which is sufficient applying GBLOCKS (Castresana, 2000).
for ASR. Searching a wider range is additionally supported (C) Compute a phylogenetic tree. Current protocols rely
by computing several trees in parallel started from different on ML or Bayesian approaches. Frequently used ML
points in tree space. approaches are PAUP (Swofford, 1984) and PAML (Yang
et al., 1995). Alternatively, Bayesian approaches like
MrBayes (Ronquist and Huelsenbeck, 2003), PhyML
Deducing ancestral sequences
(Guindon and Gascuel, 2003), and PhyloBayes (Lartillot
After a tree has been generated, the most plausible
et al., 2009) are in use.
ancestral sequences can be deduced. Applying a Bayesian
(D) Reconstruct ancestral sequences. The extant
approach, a reconstruction maximizes the probability for
sequences chosen in step (A) and the phyloge-
the set of ancient sequences given the extant ones (Pupko
netic tree computed in step (C) in combination with a
et al., 2000). The basic idea of this reconstruction can
substitutions model are the basis for the computation
be illustrated by concentrating on the internal nodes of a
of ancestral sequences. Frequently used algorithms
tree whose topology and branch lengths are assumed to
are MrBayes (Ronquist and Huelsenbeck, 2003), PAML
be known. The tree given in Fig. 2 has five known states
(Yang, 1997) or FastML (Pupko et al., 2000).
(the content of the leaves, labeled 1—5) and three unknown
20 R. Merkl, R. Sterner
Application of ASR for enzyme reconstruction
Stabilities of ancient enzymes
During the last decade, ASR has been applied to resur-
rect a number of different ancient enzymes. Among the
first studied examples were translation elongation factors
from organisms that thrived ∼3.5—0.5 billion years (Gyr)
ago (Gaucher et al., 2008). The thermal stabilities of the
proteins declined from the ‘‘older’’ to the ‘‘younger’’
proteins, indicating that the environmental temperature
◦
decreased by 30 C within this period of time. This con-
clusion is supported by a nearly identical cooling trend
for the ancient ocean as inferred from the deposition of
oxygen isotopes (Robert and Chaussidon, 2006). Analogous
experiments with seven pre-cambrian thioredoxins dating
back between ∼4 Gyr and 1.4 Gyr, among them the last
bacterial common ancestor, the last archaeal common and
the archaeal-eukaryotic common ancestor, provided simi-
lar results (Perez-Jimenez et al., 2011). Thermal unfolding
measurements showed that the melting temperatures (Tm)
◦
of the ancient proteins were up to 32 C higher than those
Figure 4 Thermophilic and mesophilic RNH lineages exhibit
of extant thioredoxins. Moreover, a plot of the Tm versus
opposite stability trends. The denaturation temperatures (Tm)
geological time yielded a linear decline, corresponding to
◦ of the maximum likelihood ancestors and ten alternate recons-
a decrease of thermostability corresponding to 6 C/Gyr
tructions of Anc1 as a function of the evolutionary distance from
(Fig. 3). Activity measurements showed that ancient thiore-
the last common ancestor, Anc1. Distances are calculated as the
doxins used the same reaction mechanism as modern ones
sum of the branch lengths connecting Anc1 to the protein of
but were much more active at low pH values (Perez-
interest. The gray region defines the range of Tm values mea-
Jimenez et al., 2011). Overall, the high thermal stability
sured for the Anc1 alternates, which appear individually as gray
and the efficient catalysis under acidic conditions seem to
data points.
be adaptations of the ancient thioredoxins to the condi-
Figure taken from (Hart et al., 2014).
tions prevailing in the primordial oceans (Walker, 1983). In
accordance with the results obtained for ancient elongation
factors and thioredoxins, reconstructed nucleoside diphos- For more recent time spans, the analysis of four differ-
phate kinases from the last common ancestors of archaea ent reconstructed 3-isopropylmalate dehydrogenases (LeuB)
◦
and bacteria have Tm values of more than 100 C (Akanuma from Bacilli (ANC1—ANC4) support a different scenario
et al., 2013). Taken together, the results for these three pro- (Hobbs et al., 2012). ANC4, ANC3, ANC2, and ANC1 are
tein families support a hot environment for early life and a approximately 0.95, 0.85, 0.82, and 0.68 Gyr old. Ther-
subsequent continuous cooling. mal unfolding curves yielded Tm values in the following
∼
order: ANC4 ANC1 > ANC3 > ANC2, which means that the
‘‘oldest’’ and the ‘‘youngest’’ reconstructed proteins are
the most stable ones. This result is consistent with a fluctu-
ating trend in thermal evolution, with a temporal adaptation
toward mesophily followed by a more recent return to ther-
mophily. Due to the generally good correlation between the
growth temperature of an organism and the thermostability
of its proteins, the observed variations in thermostability
of the reconstructed LeuB variants might reflect changes in
the microenvironment encountered by the evolving Bacillus
species. Along similar lines, the Tm value of the ∼3 Gyr old
common ancestor of the ribonuclease H1 from the mesophile
Escherichia coli (ecRNH) and the extreme thermophile Ther-
mus thermophilus (ttRNH) lies exactly between the Tm
value of ecRNH and ttRNH (Hart et al., 2014). The Tm
values of intermediate ancestors along the ttRNH lineage
increased gradually over time, while the ecRNH lineage
exhibited an abrupt drop in Tm followed by relatively lit-
Figure 3 Denaturation temperatures (Tm) versus geological tle change (Fig. 4). The observed thermostability patterns
time for ancestral thioredoxin (Trx) enzymes as well as for are incompatible with any gradual, long-term trend in global
Escherichia coli and human Trx. The dashed line represents a environmental temperatures. For example, an intermediate
◦
linear fit to the data. Inset: experimental DSC thermograms for ancestor that existed about 2 Gyr ago, is only 2 C more
Trx from E. coli and the last bacterial common ancestor (LBCA). thermostable than modern ecRNH. The authors conclude
Figure taken from (Perez-Jimenez et al., 2011).
that their results are consistent with the wide variety of
Reconstruction of ancestral enzymes 21
Figure 5 Crystal structure and thermal stability of LUCA-HisF (Reisinger et al., 2014). (A, B) Ribbon diagram with a view onto the
catalytic face (A) and the stability face (B) of the (␣)8-barrel (Sterner and Höcker, 2005). The eight central -strands are depicted
in light blue, and the eight surrounding ␣-helices are depicted in light orange. Tw o essential aspartate residues and two phosphate
molecules are shown in (A) and a conserved salt bride cluster is shown in (B). (C, D) Thermal denaturation as monitored by far-UV
circular dichroism (C) and by differential scanning calorimetry (D). The apparent Tm-values are indicated.
temperature niches populated by both ancient and modern
microorganisms and claim that the tracking of the Tm of any
individual protein is an unreliable way to estimate long-term
trends in global environmental temperatures (Hart et al.,
2014).
Structures and activities of ancient enzymes
The crystal structure analysis of seven resurrected thiore-
doxins dating back to 4 Gyr showed that the ancestral
proteins display the canonical thioredoxin fold, whereas
only small structural changes have occurred over this
extremely long period of time. These findings indicate that
the thioredoxin fold is a molecular fossil and confirm that
protein structures evolve very slowly (Ingles-Prieto et al.,
2013); a conclusion that is further supported by the crystal
structures of -lactamases (Risso et al., 2013) and nucleo-
side kinases (Akanuma et al., 2013). Moreover, the last
universal common ancestor of the cyclase subunit of imid-
azole glycerol phosphate synthase (LUCA-HisF) does not
Figure 6 Generalist-to-specialist conversion of -lactamases
only display a similar (␣)8-barrel structure as modern HisF
as illustrated by the catalytic efficiencies (kcat/Km) for ben-
enzymes and a high thermal stability (Fig. 5) but also a highly
zylpenicillin and cefotaxime.
conserved folding mechanism (Reisinger et al., 2014).
Reprinted (adapted) with permission from (Risso et al., 2013).
A popular model postulates that ancient enzymes
Copyright 2013 American Chemical Society.
were capable of catalyzing similar reactions in different
metabolic pathways (Jensen, 1976), which implies that

they processed several substrates in a promiscuous man- the extant -lactamase TEM-1 only hydrolyzes penicillin
ner (Khersonsky and Tawfik, 2010). The analysis of four (Fig. 6). Since the active sites are highly conserved between
 
ancient -lactamases seems to support this idea (Risso et al., modern and ancient -lactamases, the different activities
2013): activity assays showed that these enzymes hydrolyzed might be due to altered structural dynamics. In accordance
various -lactam antibiotics with catalytic efficiencies sim- with this idea, molecular dynamics simulations yielded high

ilar to those of an average modern enzyme. In contrast, active site rigidity for modern -lactamases whereas the
22 R. Merkl, R. Sterner
ancestral enzymes appeared to be much more flexible, Mol. Biol. Evol. 17, 540—552, http://dx.doi.org/10.1093/
oxfordjournals.molbev.a026334.
which might allow them to bind antibiotics with different
Chothia, C., 1992. Proteins. One thousand families for the
sizes and geometries (Zou et al., 2015).
molecular biologist. Nature 357, 543—544, http://dx.doi.org/
However, promiscuity seems not to be a general feature
10.1038/357543a0.
of ancient enzymes. For example, LUCA-HisF is a specific
Edgar, R.C., 2004. MUSCLE: multiple sequence alignment with
enzyme, which transforms its native substrates into prod-
high accuracy and high throughput. Nucleic Acids Res. 32,
ucts with high efficiency. In contrast, it does not accept
1792—1797, http://dx.doi.org/10.1093/nar/gkh340.
similar substrates from the two related isomerases HisA or
Felsenstein, J., 1981. Evolutionary trees from DNA sequences:
TrpF, neither in vitro nor in vivo (Reisinger et al., 2014). a maximum likelihood approach. J. Mol. Evol. 17, 368—376,
Moreover, LUCA-HisF forms a functional complex with the http://dx.doi.org/10.1007/bf01734359.
glutaminase subunit HisF of an extant imidazole glycerol Fitch, W.M., 1971. Toward defining the course of evolution: mini-
mum change for a specific tree topology. Syst. Biol. 20, 406—416,
phosphate synthase and binds to reconstructed LUCA-HisH
http://dx.doi.org/10.1093/sysbio/20.4.406.
with high affinity. Along the same lines, the reconstructed
Gaucher, E.A., 2007. Ancestral sequence reconstruction as a tool
nucleoside diphosphate kinases from the last common ances-
to understand natural history and guide synthetic biology: real-
tors of bacteria and archaea, respectively, are catalytically
izing and extending the vision of Zuckerkandl and Pauling.
as efficient as their modern counterparts (Akanuma et al.,
In: Liberles, D.A. (Ed.), Ancestral Sequence Reconstruction.
2013). Similar steady-state kinetic parameters were also
Oxford University Press, Oxford, pp. 20—33, http://dx.doi.org/
determined for the ancient LeuB variants ANC1—ANC4 and 10.1093/acprof:oso/9780199299188.003.0002.
extant LeuB enzymes (Hobbs et al., 2012). Gaucher, E.A., Govindarajan, S., Ganesh, O.K., 2008. Palaeotem-
perature trend for Precambrian life inferred from resur-
rected proteins. Nature 451, 704—707, http://dx.doi.org/
Conclusion 10.1038/nature06510.
Guindon, S., Gascuel, O., 2003. A simple, fast, and accu-
rate algorithm to estimate large phylogenies by maxi-
The high thermostability of ancestral enzymes suggests that
mum likelihood. Syst. Biol. 52, 696—704, http://dx.doi.org/
Precambrian life was thermophilic, which is in accordance
10.1080/10635150390235520.
with several scenarios, including that ancestral oceans were
Hart, K.M., Harms, M.J., Schmidt, B.H., Elya, C., Thornton,
hot, that ancient life prospered in hot spots like hydro-
J.W., Marqusee, S., 2014. Thermodynamic system drift in pro-
thermal systems, or that only robust thermophilic organism
tein evolution. PLoS Biol. 12, e1001994, http://dx.doi.org/
survived bombardment of the young earth (Risso et al., 10.1371/journal.pbio.1001994.
2014). Hobbs, J.K., Prentice, E.J., Groussin, M., Arcus, V.L., 2015.
The high similarity between the crystal structures of Pre- Reconstructed ancestral enzymes impose a fitness cost upon
cambrian enzymes and their corresponding extant proteins modern bacteria despite exhibiting favourable biochemical
properties. J. Mol. Evol. 81, 110—120, http://dx.doi.org/
indicates a relatively slow evolution of protein structure in
10.1007/s00239-015-9697-5.
contrast to their amino acid sequences. These congruencies
Hobbs, J.K., Shepherd, C., Saul, D.J., Demetras, N.J., Haan-
were observed for several ancient enzymes (Akanuma et al.,
ing, S., Monk, C.R., Daniel, R.M., Arcus, V.L., 2012.
2013; Hart et al., 2014; Hobbs et al., 2012; Ingles-Prieto
On the origin and evolution of thermophily: reconstruc-
et al., 2013; Reisinger et al., 2014; Risso et al., 2013) and
tion of functional Precambrian enzymes from ancestors of
this slow innovation rate might explain the limited number
Bacillus. Mol. Biol. Evol. 29, 825—835, http://dx.doi.org/
of protein folds observed in nature (Chothia, 1992). 10.1093/molbev/msr253.
The observed activities and non-promiscuities of recon- Ingles-Prieto, A., Ibarra-Molero, B., Delgado-Delgado, A., Perez-
structed enzymes are compatible with the idea that the Jimenez, R., Fernandez, J.M., Gaucher, E.A., Sanchez-Ruiz,
evolution of many highly efficient enzymes and enzyme com- J.M., Gavira, J.A., 2013. Conservation of protein structure over
four billion years. Structure 21, 1690—1697, http://dx.doi.org/
plexes has already been completed in the LUCA era. In
10.1016/j.str.2013.06.020.
summary, enzymes that were encoded in the genome of the
Jaenicke, R., 1991. Protein stability and molecular adaptation to
LUCA (Mirkin et al., 2003) possessed most likely many of
extreme conditions. Eur. J. Biochem. 202, 715—728, http://dx.
the fundamental features present in modern organisms and
doi.org/10.1111/j.1432-1033.1991.tb16426.x.
exhibited a level of sophistication comparable to that found
Jensen, R.A., 1976. Enzyme recruitment in evolution of new func-
in modern bacteria or archaea.
tion. Annu. Rev. Microbiol. 30, 409—425, http://dx.doi.org/
10.1146/annurev.mi.30.100176.002205.
Khersonsky, O., Tawfik, D.S., 2010. Enzyme promiscuity: a
References
mechanistic and evolutionary perspective. Annu. Rev.
Biochem. 79, 471—505, http://dx.doi.org/10.1146/annurev-
Akanuma, S., Nakajima, Y. , Yokobori, S.-i., Kimura, M., Nemoto, biochem-030409-143718.
N., Mase, T. , Miyazono, K.-i., Tanokura, M., Yamagishi, A., 2013. Larkin, M.A., Blackshields, G., Brown, N.P., Chenna, R., McGettigan,
Experimental evidence for the thermophilicity of ancestral life. P.A., McWilliam, H., Valentin, F. , Wallace, I.M., Wilm, A., Lopez,
Proc. Natl. Acad. Sci. U.S.A. 110, 11067—11072, http://dx.doi. R., Thompson, J.D., Gibson, T.J., Higgins, D.G., 2007. Clustal
org/10.1073/pnas.1308215110. W and Clustal X version 2.0. Bioinformatics 23, 2947—2948,
Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J., 1990. http://dx.doi.org/10.1093/bioinformatics/btm404.
Basic local alignment search tool. J. Mol. Biol. 215, 403—410, Lartillot, N., Lepage, T. , Blanquart, S., 2009. PhyloBayes 3:
http://dx.doi.org/10.1016/s0022-2836(05)80360-2. a Bayesian software package for phylogenetic reconstruc-
Castresana, J., 2000. Selection of conserved blocks from mul- tion and molecular dating. Bioinformatics 25, 2286—2288,
tiple alignments for their use in phylogenetic analysis. http://dx.doi.org/10.1093/bioinformatics/btp368.
Reconstruction of ancestral enzymes 23
Li, W., Godzik, A., 2006. Cd-hit: a fast program for clustering promiscuity in laboratory resurrections of Precambrian beta-
and comparing large sets of protein or nucleotide sequences. lactamases. J. Am. Chem. Soc. 135, 2899—2902, http://dx.
Bioinformatics 22, 1658—1659, http://dx.doi.org/10.1093/ doi.org/10.1021/ja311630a.
bioinformatics/btl158. Risso, V.A., Gavira, J.A., Sanchez-Ruiz, J.M., 2014. Thermostable
Liberles, D.A., 2007. Ancestral Sequence Reconstruction. and promiscuous Precambrian proteins. Environ. Microbiol. 16,
Oxford University Press, Oxford, http://dx.doi.org/10.1093/ 1485—1489, http://dx.doi.org/10.1111/1462-2920.12319.
acprof:oso/9780199299188.001.0001. Robert, F. , Chaussidon, M., 2006. A palaeotemperature curve for the
Löytynoja, A., Goldman, N., 2008. Phylogeny-aware gap place- Precambrian oceans based on silicon isotopes in cherts. Nature
ment prevents errors in sequence alignment and evolutionary 443, 969—972, http://dx.doi.org/10.1038/nature05239.
analysis. Science 320, 1632—1635, http://dx.doi.org/10.1126/ Ronquist, F. , Huelsenbeck, J.P., 2003. MrBayes 3: Bayesian
science.1158395. phylogenetic inference under mixed models. Bioinformat-
Merkl, R., Sterner, R., 2016. Ancestral protein reconstruc- ics 19, 1572—1574, http://dx.doi.org/10.1093/bioinformatics/
tion: techniques and applications. Biol. Chem. 397, 1—21, btg180.
http://dx.doi.org/10.1515/hsz-2015-0158. Sterner, R., Höcker, B., 2005. Catalytic versatility, stability, and
Mirkin, B.G., Fenner, T.I., Galperin, M.Y., Koonin, E.V., 2003. evolution of the (␣)8-barrel enzyme fold. Chem. Rev. 105,
Algorithms for computing parsimonious evolutionary sce- 4038—4055, http://dx.doi.org/10.1021/cr030191z.
narios for genome evolution, the last universal common Susko, E., Field, C., Blouin, C., Roger, A.J., 2003. Estimation
ancestor and dominance of horizontal gene transfer of rates-across-sites distributions in phylogenetic substitu-
in the evolution of prokaryotes. BMC Evol. Biol. 3, 2, tion models. Syst. Biol. 52, 594—603, http://dx.doi.org/
http://dx.doi.org/10.1186/1471-2148-3-2. 10.1080/10635150390235395.
Pauling, L., Zuckerkandl, E., 1963. Chemical paleogenetics: Swofford, D.L., 1984. PAUP: Phylogenetic Analysis using Parsimony.
molecular ‘‘restoration studies’’ of extinct forms of life. Illinois Natural Historical Survey, Champaign.
Acta Chem. Scand. 17, 9—16, http://dx.doi.org/10.3891/ Walker, J.C.G., 1983. Possible limits on the composi-
acta.chem.scand.17s-0009. tion of the Archaean ocean. Nature 302, 518—520,
Perez-Jimenez, R., Inglés-Prieto, A., Zhao, Z.M., Sanchez-Romero, http://dx.doi.org/10.1038/302518a0.
I., Alegre-Cebollada, J., Kosuri, P. , Garcia-Manyes, S., Kappock, Whelan, S., 2008. Inferring trees. In: Keith, J.M. (Ed.), Bioinfor-
T.J., Tanokura, M., Holmgren, A., Sanchez-Ruiz, J.M., Gaucher, matics. Springer, Totowa, NJ, pp. 287—309, http://dx.doi.org/
E.A., Fernandez, J.M., 2011. Single-molecule paleoenzymology 10.1007/978-1-60327-159-2 14.
probes the chemistry of resurrected enzymes. Nat. Struct. Mol. Yang, Z., 1997. PAML: a program package for phylogenetic analy-
Biol. 18, 592—596, http://dx.doi.org/10.1038/nsmb.2020. sis by maximum likelihood. Comput. Appl. Biosci. 13, 555—556,
Pupko, T. , Pe, I., Shamir, R., Graur, D., 2000. A fast algorithm http://dx.doi.org/10.1093/bioinformatics/13.5.555.
for joint reconstruction of ancestral amino acid sequences. Yang, Z., Kumar, S., Nei, M., 1995. A new method of inference of
Mol. Biol. Evol. 17, 890—896, http://dx.doi.org/10.1093/ ancestral nucleotide and amino acid sequences. Genetics 141,
oxfordjournals.molbev.a026369. 1641—1650.
Reisinger, B., Sperl, J., Holinski, A., Schmid, V. , Rajendran, C., Zou, T. , Risso, V.A., Gavira, J.A., Sanchez-Ruiz, J.M., Ozkan,
Carstensen, L., Schlee, S., Blanquart, S., Merkl, R., Sterner, S.B., 2015. Evolution of conformational dynamics determines
R., 2014. Evidence for the existence of elaborate enzyme com- the conversion of a promiscuous generalist into a special-
plexes in the Paleoarchean era. J. Am. Chem. Soc. 136, 122—129, ist enzyme. Mol. Biol. Evol. 32, 132—143, http://dx.doi.
http://dx.doi.org/10.1021/ja4115677. org/10.1093/molbev/msu281.
Risso, V.A., Gavira, J.A., Mejia-Carmona, D.F., Gaucher, E.A.,
Sanchez-Ruiz, J.M., 2013. Hyperstability and substrate