<<

Perspectives in Science (2016) 9, 17—23

View metadata, citation and similar papers at core.ac.uk brought to you by CORE

Available online at www.sciencedirect.com provided by Elsevier - Publisher Connector

ScienceDirect

jo urnal homepage: www.elsevier.com/pisc

Reconstruction of ancestral enzymes

Rainer Merkl, Reinhard Sterner

University of Regensburg, Institute of Biophysics and Physical Biochemistry, 93040 Regensburg, Germany

Received 1 December 2015; accepted 2 February 2016

Available online 26 August 2016

KEYWORDS Summary The amino acid sequences of primordial enzymes from extinct organisms can be

determined by an in silico approach termed ancestral sequence reconstruction (ASR). In the first

Ancestral sequence

reconstruction; step of an ASR, a multiple sequence alignment (MSA) comprising extant homologous enzymes is

being composed. On the basis of this MSA and a stochastic model of sequence evolution, a phy-

Enzyme evolution;

logenetic tree is calculated by means of a maximum likelihood approach. Finally, the sequences

Phylogenetic tree;

Promiscuous of the ancestral at all internal nodes including the root of the tree are deduced. We

enzymes; present several examples of ASR and the subsequent experimental characterization of enzymes

Thermostability as old as four billion years. The results show that most ancestral enzymes were highly ther-

mostable and catalytically active. Moreover, they adopted three-dimensional structures similar

to those of extant enzymes. These findings suggest that sophisticated enzymes were invented

at a very early stage of biological evolution.

© 2016 Beilstein-lnstitut. Published by Elsevier GmbH. This is an open access article under the

CC BY license. (http://creativecommons.org/licenses/by/4.0/).

Introduction is only marginal (Jaenicke, 1991). We are interested in how

enzyme activity and stability developed and changed in the

course of evolution. For this purpose, it would be desirable

Modern enzymes from contemporary organisms are sophis-

to characterize ancient enzymes from primordial organisms.

ticated biocatalysts transforming their substrates into

The lack of macromolecular , however, seems to block

products with high efficiency and specificity. However, as

the access to this interesting information. Luckily, there is a

catalytic activity usually requires a certain degree of con-

circumstantial way out of this dilemma, which is the charac-

formational flexibility, the stability of most modern enzymes

terization of ‘‘extinct’’ proteins after their ‘‘resurrection’’

via ancestral sequence reconstruction (ASR).

ASR is an in silico approach allowing to deduce the

This is an open-access article distributed under the terms of the

sequences of ancient proteins from the sequences of homol-

Creative Commons Attribution License, which permits unrestricted

ogous extant proteins (Liberles, 2007). The idea of ASR is

use, distribution, and reproduction in any medium, provided the

more than 50 years old when Pauling and Zuckerkandl pos-

original author and source are credited. This article is part of a

tulated that modern proteins contain enough information

special issue entitled Proceedings of the Beilstein ESCEC Symposium

to derive the sequences of common ancestors (Pauling and

2015 with copyright © 2016 Beilstein-lnstitut. Published by Elsevier

Zuckerkandl, 1963). However, the concept of ASR could only

GmbH. All rights reserved.

Corresponding author. be realized after Fitch had implemented the first phyloge-

E-mail address: [email protected] (R. Sterner). netic algorithm termed PAUP (phylogenetic analysis using

http://dx.doi.org/10.1016/j.pisc.2016.08.002

2213-0209/© 2016 Beilstein-lnstitut. Published by Elsevier GmbH. This is an open access article under the CC BY license. (http://creativecommons.org/licenses/by/4.0/).

18 R. Merkl, R. Sterner

Figure 1 Protocol for ASR. Each ASR requires four steps to deduce ancestral sequences from a set of extant homologs. (A) A set of

sequences is retrieved from a database. (B) These sequences are compiled to a multiple sequence alignment, which allows for the

identification of mutations observed in the sequences. (C) A phylogeny is determined; the extant sequences constitute the leaves.

(D) Based on this phylogenetic tree and the MSA, ancestral sequences are deduced for all internal nodes of the tree.

parsimony) (Fitch, 1971). Although phylogenetic models Computing a phylogenetic tree by means of maximum

and algorithms have considerably improved since then, the likelihood

workflow of ASR has remained essentially the same (Fig. 1). The prerequisite for the computation of a phylogenetic tree

A set of homologous sequences is put together in a multiple is a stochastic model that describes the probability for DNA

sequence alignment (MSA), which is the basis for the subse- or sequences to acquire mutations within a certain

quent calculations: first, a phylogenetic tree is constructed time interval ti. For this purpose, a probability to acquire

whose outermost nodes (the leaves) are represented by the any mutation within ti is combined with a substitution

extant sequences. After the calculation of this tree, the pre- model. The latter explains in detail with which probability a

cursor sequences corresponding to the internal nodes are nucleotide or amino acid residue is replaced by another one.

being calculated. These calculations are commonly based on Instead of using fixed mutation rates, it is meanwhile state

a maximum likelihood approach and a phylogenetic model of the art to sample these probabilities from a continuous

which allows for the sampling of mutational frequencies distribution, which provides every site with a specific rate

in a position-specific residue manner (Merkl and Sterner, (Susko et al., 2003).

2016). Based on such a model, the likelihood of a tree can be

In this short review, we will first describe state-of- computed. Likelihood is the probability for observing the

the art in silico methods that have been developed for data (i.e., sequences) given (i) the parameters of the chosen

ASR. We will then provide examples on how ASR has been evolutionary model and (ii) the topology of the tree under

used to resurrect ancient enzymes from the Precambrian study. Commonly, mutations at different sites are considered

era, among them translation elongation factors (Gaucher as independent events. Thus, this likelihood of a complete

et al., 2008), thioredoxins (Ingles-Prieto et al., 2013; Perez- sequence is the product of all site-specific values. To explain

Jimenez et al., 2011), 3-ispropylmalate dehydrogenases the principle, it is therefore sufficient to consider one site

(Hobbs et al., 2015; Hobbs et al., 2012), nucleotide kinases S(j) of a sequence S and to compute the likelihood for the

(Akanuma et al., 2013), ␤-lactamases (Risso et al., 2013; nucleotides at S(j) at each node of the tree. If all time

Zou et al., 2015), imidazole glycerol phosphate synthase intervals ti and all nucleotides ei are known for all nodes

. . .

(Reisinger et al., 2014), and ribonuclease H1 (Hart et al., i = 1, , 8, the likelihood of the tree shown in Fig. 2 is:

2014). We will conclude by summarizing the most impor-

L tree =

p t p t p t p t p t

tant insights that have been gained from these studies with ( ) e1e8 ( 1) e2e8 ( 2) e3e7 ( 3) e4e6 ( 4) e5e6 ( 5)

respect to our ability to ‘‘replay the molecular tape of life’’

p t p t . e6e7 ( 6) e7e8 ( 7) (1)

(Gaucher, 2007).

In silico methods for ASR

However, the states (nucleotides) of the internal nodes

are not known and therefore it is necessary to sum over all

A detailed introduction into stochastic concepts and phylo-

possible states (nucleotides at internal nodes) which results

genetic models that is needed to understand modern ASR in

methods is beyond the scope of this review and can be 

found elsewhere; see (Merkl and Sterner, 2016) and refer- L tree = p t p t p t p t

( ) e1e8 ( 1) e2e8 ( 2) e3e7 ( 3) e4e6 ( 4)

ences therein. Here, we present a short summary of the

e8 e7 e6

algorithms required to deduce phylogenetic trees and ances-

p t p t p t . tral sequences. e5e6 ( 5) e6e7 ( 6) e7e8 ( 7)) (2)

Reconstruction of ancestral enzymes 19

states (the content of the internal nodes labeled 6, 7, and 8).

3

For each site of these three internal nodes there are 4 com-

3

binations of nucleotides or 20 combinations of amino acids

ei. It is the aim of an ML approach to identify for these three

nodes a triplet with the largest value p(|data), which is

in a Bayesian approach the triplet that maximizes

p(data|) · p()

. (3)

p(data)

Since p(data) is identical for all candidate triplets, it is

sufficient to maximize p(data|) · p(). More specifically, for

this tree and the given three nodes the triplet is found by

solving

Figure 2 An example of a phylogeny. Leaves representing

extant sequences are labeled 1—5; internal nodes representing max [pe e (t1)pe e (t2)

e ,e ,e 1 8 2 8

reconstructed ancestral sequences are labeled 6—8. The values 8 7 6

t1 to t7 represent the length of the vertices; i.e., time intervals; p t p t p t p t p t .

e3e7 ( 3) e4e6 ( 4) e5e6 ( 5) e6e7 ( 6) e7e8 ( 7)] (4)

example according to (Pupko et al., 2000).

If we know all transition probabilities pij, the likelihood

The solution computed for Eq. (4) is the maximum over all

of a tree can be computed quite efficiently by means of

possible triplets. For larger trees, it is necessary to maximize

an iterative approach. The missing length of the edges can

over all internal nodes; see (Pupko et al., 2000) for details.

be determined by means of expectation maximization; for

details see (Felsenstein, 1981). However, to find the tree

with the maximal likelihood, the topology — which was taken

as given so far — has to be optimized as well, which requires Typical software protocols for ASR

the creation and the assessment of alternative topologies in The basic principles introduced in the previous paragraph

order to find the optimal one. have been implemented in several software suites. These

For a comparison of alternative trees, maximum likeli- are the basis for software protocols used in ASR experi-

hood (ML) approaches optimize the value given in Eq. (2). ments. These protocols differ and we will briefly illustrate

Unfortunately, the number of tree topologies grows expo- some typical combinations. Generally, each protocol for ASR

nentially with the number of sequences, which requires the requires four steps (A—D) that are depicted in Fig. 1:

use of heuristic approaches to sample tree space. Com-

monly, these approaches progressively optimize the tree

by examining the score of similar trees, choose the high- (A) Select extant sequences. Commonly, homologous

est scoring one as the next estimate, and finally stop, if no sequences are retrieved from databases like GenBank

further improvement can be found. or UniProtKB, usually with the help of BLAST (Altschul

Popular traversal schemes of tree space create alterna- et al., 1990). If the number of hits is very large, highly

tive trees by making small rearrangements on the current similar sequences have to be eliminated, e.g., by using

tree and examine each internal branch of the tree in turn; CD-HIT (Li and Godzik, 2006).

for details see (Whelan, 2008) and (Merkl and Sterner, 2016). (B) Create a multiple sequence alignment. Different

For heuristic algorithms there is no guarantee to find the methods are in use to create a MSA. Among them are

optimal tree that has the maximum likelihood. However, the Clustal W (Larkin et al., 2007), MUSCLE (Edgar, 2004)

rearrangements of the tree topology under study expand the and PRANK (Löytynoja and Goldman, 2008). Regions of

area of searched tree space, which increases the probabil- ambiguous alignment can be removed from the MSA by

ity of finding a nearly optimal solution, which is sufficient applying GBLOCKS (Castresana, 2000).

for ASR. Searching a wider range is additionally supported (C) Compute a phylogenetic tree. Current protocols rely

by computing several trees in parallel started from different on ML or Bayesian approaches. Frequently used ML

points in tree space. approaches are PAUP (Swofford, 1984) and PAML (Yang

et al., 1995). Alternatively, Bayesian approaches like

MrBayes (Ronquist and Huelsenbeck, 2003), PhyML

Deducing ancestral sequences

(Guindon and Gascuel, 2003), and PhyloBayes (Lartillot

After a tree has been generated, the most plausible

et al., 2009) are in use.

ancestral sequences can be deduced. Applying a Bayesian

(D) Reconstruct ancestral sequences. The extant

approach, a reconstruction maximizes the probability for

sequences chosen in step (A) and the phyloge-

the set of ancient sequences given the extant ones (Pupko

netic tree computed in step (C) in combination with a

et al., 2000). The basic idea of this reconstruction can

substitutions model are the basis for the computation

be illustrated by concentrating on the internal nodes of a

of ancestral sequences. Frequently used algorithms

tree whose topology and branch lengths are assumed to

are MrBayes (Ronquist and Huelsenbeck, 2003), PAML

be known. The tree given in Fig. 2 has five known states

(Yang, 1997) or FastML (Pupko et al., 2000).

(the content of the leaves, labeled 1—5) and three unknown

20 R. Merkl, R. Sterner

Application of ASR for enzyme reconstruction

Stabilities of ancient enzymes

During the last decade, ASR has been applied to resur-

rect a number of different ancient enzymes. Among the

first studied examples were translation elongation factors

from organisms that thrived ∼3.5—0.5 billion years (Gyr)

ago (Gaucher et al., 2008). The thermal stabilities of the

proteins declined from the ‘‘older’’ to the ‘‘younger’’

proteins, indicating that the environmental temperature

decreased by 30 C within this period of time. This con-

clusion is supported by a nearly identical cooling trend

for the ancient ocean as inferred from the deposition of

oxygen isotopes (Robert and Chaussidon, 2006). Analogous

experiments with seven pre-cambrian thioredoxins dating

back between ∼4 Gyr and 1.4 Gyr, among them the last

bacterial common ancestor, the last archaeal common and

the archaeal-eukaryotic common ancestor, provided simi-

lar results (Perez-Jimenez et al., 2011). Thermal unfolding

measurements showed that the melting temperatures (Tm)

of the ancient proteins were up to 32 C higher than those

Figure 4 Thermophilic and mesophilic RNH lineages exhibit

of extant thioredoxins. Moreover, a plot of the Tm versus

opposite stability trends. The denaturation temperatures (Tm)

geological time yielded a linear decline, corresponding to

◦ of the maximum likelihood ancestors and ten alternate recons-

a decrease of thermostability corresponding to 6 C/Gyr

tructions of Anc1 as a function of the evolutionary distance from

(Fig. 3). Activity measurements showed that ancient thiore-

the last common ancestor, Anc1. Distances are calculated as the

doxins used the same reaction mechanism as modern ones

sum of the branch lengths connecting Anc1 to the protein of

but were much more active at low pH values (Perez-

interest. The gray region defines the range of Tm values mea-

Jimenez et al., 2011). Overall, the high thermal stability

sured for the Anc1 alternates, which appear individually as gray

and the efficient catalysis under acidic conditions seem to

data points.

be adaptations of the ancient thioredoxins to the condi-

Figure taken from (Hart et al., 2014).

tions prevailing in the primordial oceans (Walker, 1983). In

accordance with the results obtained for ancient elongation

factors and thioredoxins, reconstructed nucleoside diphos- For more recent time spans, the analysis of four differ-

phate kinases from the last common ancestors of archaea ent reconstructed 3-isopropylmalate dehydrogenases (LeuB)

and bacteria have Tm values of more than 100 C (Akanuma from Bacilli (ANC1—ANC4) support a different scenario

et al., 2013). Taken together, the results for these three pro- (Hobbs et al., 2012). ANC4, ANC3, ANC2, and ANC1 are

tein families support a hot environment for early life and a approximately 0.95, 0.85, 0.82, and 0.68 Gyr old. Ther-

subsequent continuous cooling. mal unfolding curves yielded Tm values in the following

order: ANC4 ANC1 > ANC3 > ANC2, which means that the

‘‘oldest’’ and the ‘‘youngest’’ reconstructed proteins are

the most stable ones. This result is consistent with a fluctu-

ating trend in thermal evolution, with a temporal adaptation

toward mesophily followed by a more recent return to ther-

mophily. Due to the generally good correlation between the

growth temperature of an organism and the thermostability

of its proteins, the observed variations in thermostability

of the reconstructed LeuB variants might reflect changes in

the microenvironment encountered by the evolving Bacillus

species. Along similar lines, the Tm value of the ∼3 Gyr old

common ancestor of the ribonuclease H1 from the mesophile

Escherichia coli (ecRNH) and the extreme thermophile Ther-

mus thermophilus (ttRNH) lies exactly between the Tm

value of ecRNH and ttRNH (Hart et al., 2014). The Tm

values of intermediate ancestors along the ttRNH lineage

increased gradually over time, while the ecRNH lineage

exhibited an abrupt drop in Tm followed by relatively lit-

Figure 3 Denaturation temperatures (Tm) versus geological tle change (Fig. 4). The observed thermostability patterns

time for ancestral thioredoxin (Trx) enzymes as well as for are incompatible with any gradual, long-term trend in global

Escherichia coli and Trx. The dashed line represents a environmental temperatures. For example, an intermediate

linear fit to the data. Inset: experimental DSC thermograms for ancestor that existed about 2 Gyr ago, is only 2 C more

Trx from E. coli and the last bacterial common ancestor (LBCA). thermostable than modern ecRNH. The authors conclude

Figure taken from (Perez-Jimenez et al., 2011).

that their results are consistent with the wide variety of

Reconstruction of ancestral enzymes 21

Figure 5 Crystal structure and thermal stability of LUCA-HisF (Reisinger et al., 2014). (A, B) Ribbon diagram with a view onto the

catalytic face (A) and the stability face (B) of the (␤␣)8-barrel (Sterner and Höcker, 2005). The eight central ␤-strands are depicted

in light blue, and the eight surrounding ␣-helices are depicted in light orange. Tw o essential aspartate residues and two phosphate

molecules are shown in (A) and a conserved salt bride cluster is shown in (B). (C, D) Thermal denaturation as monitored by far-UV

circular dichroism (C) and by differential scanning calorimetry (D). The apparent Tm-values are indicated.

temperature niches populated by both ancient and modern

microorganisms and claim that the tracking of the Tm of any

individual protein is an unreliable way to estimate long-term

trends in global environmental temperatures (Hart et al.,

2014).

Structures and activities of ancient enzymes

The crystal structure analysis of seven resurrected thiore-

doxins dating back to 4 Gyr showed that the ancestral

proteins display the canonical thioredoxin fold, whereas

only small structural changes have occurred over this

extremely long period of time. These findings indicate that

the thioredoxin fold is a molecular and confirm that

protein structures evolve very slowly (Ingles-Prieto et al.,

2013); a conclusion that is further supported by the crystal

structures of ␤-lactamases (Risso et al., 2013) and nucleo-

side kinases (Akanuma et al., 2013). Moreover, the last

universal common ancestor of the cyclase subunit of imid-

azole glycerol phosphate synthase (LUCA-HisF) does not

Figure 6 Generalist-to-specialist conversion of ␤-lactamases

only display a similar (␤␣)8-barrel structure as modern HisF

as illustrated by the catalytic efficiencies (kcat/Km) for ben-

enzymes and a high thermal stability (Fig. 5) but also a highly

zylpenicillin and cefotaxime.

conserved folding mechanism (Reisinger et al., 2014).

Reprinted (adapted) with permission from (Risso et al., 2013).

A popular model postulates that ancient enzymes

Copyright 2013 American Chemical Society.

were capable of catalyzing similar reactions in different

metabolic pathways (Jensen, 1976), which implies that

they processed several substrates in a promiscuous man- the extant -lactamase TEM-1 only hydrolyzes penicillin

ner (Khersonsky and Tawfik, 2010). The analysis of four (Fig. 6). Since the active sites are highly conserved between

␤ ␤

ancient -lactamases seems to support this idea (Risso et al., modern and ancient -lactamases, the different activities

2013): activity assays showed that these enzymes hydrolyzed might be due to altered structural dynamics. In accordance

various ␤-lactam antibiotics with catalytic efficiencies sim- with this idea, molecular dynamics simulations yielded high

ilar to those of an average modern enzyme. In contrast, active site rigidity for modern -lactamases whereas the

22 R. Merkl, R. Sterner

ancestral enzymes appeared to be much more flexible, Mol. Biol. Evol. 17, 540—552, http://dx.doi.org/10.1093/

oxfordjournals.molbev.a026334.

which might allow them to bind antibiotics with different

Chothia, C., 1992. Proteins. One thousand families for the

sizes and geometries (Zou et al., 2015).

molecular biologist. 357, 543—544, http://dx.doi.org/

However, promiscuity seems not to be a general feature

10.1038/357543a0.

of ancient enzymes. For example, LUCA-HisF is a specific

Edgar, R.C., 2004. MUSCLE: multiple sequence alignment with

enzyme, which transforms its native substrates into prod-

high accuracy and high throughput. Nucleic Acids Res. 32,

ucts with high efficiency. In contrast, it does not accept

1792—1797, http://dx.doi.org/10.1093/nar/gkh340.

similar substrates from the two related isomerases HisA or

Felsenstein, J., 1981. Evolutionary trees from DNA sequences:

TrpF, neither in vitro nor in vivo (Reisinger et al., 2014). a maximum likelihood approach. J. Mol. Evol. 17, 368—376,

Moreover, LUCA-HisF forms a functional complex with the http://dx.doi.org/10.1007/bf01734359.

glutaminase subunit HisF of an extant imidazole glycerol Fitch, W.M., 1971. Toward defining the course of evolution: mini-

mum change for a specific tree topology. Syst. Biol. 20, 406—416,

phosphate synthase and binds to reconstructed LUCA-HisH

http://dx.doi.org/10.1093/sysbio/20.4.406.

with high affinity. Along the same lines, the reconstructed

Gaucher, E.A., 2007. Ancestral sequence reconstruction as a tool

nucleoside diphosphate kinases from the last common ances-

to understand natural history and guide synthetic biology: real-

tors of bacteria and archaea, respectively, are catalytically

izing and extending the vision of Zuckerkandl and Pauling.

as efficient as their modern counterparts (Akanuma et al.,

In: Liberles, D.A. (Ed.), Ancestral Sequence Reconstruction.

2013). Similar steady-state kinetic parameters were also

Oxford University Press, Oxford, pp. 20—33, http://dx.doi.org/

determined for the ancient LeuB variants ANC1—ANC4 and 10.1093/acprof:oso/9780199299188.003.0002.

extant LeuB enzymes (Hobbs et al., 2012). Gaucher, E.A., Govindarajan, S., Ganesh, O.K., 2008. Palaeotem-

perature trend for Precambrian life inferred from resur-

rected proteins. Nature 451, 704—707, http://dx.doi.org/

Conclusion 10.1038/nature06510.

Guindon, S., Gascuel, O., 2003. A simple, fast, and accu-

rate algorithm to estimate large phylogenies by maxi-

The high thermostability of ancestral enzymes suggests that

mum likelihood. Syst. Biol. 52, 696—704, http://dx.doi.org/

Precambrian life was thermophilic, which is in accordance

10.1080/10635150390235520.

with several scenarios, including that ancestral oceans were

Hart, K.M., Harms, M.J., Schmidt, B.H., Elya, C., Thornton,

hot, that ancient life prospered in hot spots like hydro-

J.W., Marqusee, S., 2014. Thermodynamic system drift in pro-

thermal systems, or that only robust thermophilic organism

tein evolution. PLoS Biol. 12, e1001994, http://dx.doi.org/

survived bombardment of the young earth (Risso et al., 10.1371/journal.pbio.1001994.

2014). Hobbs, J.K., Prentice, E.J., Groussin, M., Arcus, V.L., 2015.

The high similarity between the crystal structures of Pre- Reconstructed ancestral enzymes impose a fitness cost upon

cambrian enzymes and their corresponding extant proteins modern bacteria despite exhibiting favourable biochemical

properties. J. Mol. Evol. 81, 110—120, http://dx.doi.org/

indicates a relatively slow evolution of protein structure in

10.1007/s00239-015-9697-5.

contrast to their amino acid sequences. These congruencies

Hobbs, J.K., Shepherd, C., Saul, D.J., Demetras, N.J., Haan-

were observed for several ancient enzymes (Akanuma et al.,

ing, S., Monk, C.R., Daniel, R.M., Arcus, V.L., 2012.

2013; Hart et al., 2014; Hobbs et al., 2012; Ingles-Prieto

On the origin and evolution of thermophily: reconstruc-

et al., 2013; Reisinger et al., 2014; Risso et al., 2013) and

tion of functional Precambrian enzymes from ancestors of

this slow innovation rate might explain the limited number

Bacillus. Mol. Biol. Evol. 29, 825—835, http://dx.doi.org/

of protein folds observed in nature (Chothia, 1992). 10.1093/molbev/msr253.

The observed activities and non-promiscuities of recon- Ingles-Prieto, A., Ibarra-Molero, B., Delgado-Delgado, A., Perez-

structed enzymes are compatible with the idea that the Jimenez, R., Fernandez, J.M., Gaucher, E.A., Sanchez-Ruiz,

evolution of many highly efficient enzymes and enzyme com- J.M., Gavira, J.A., 2013. Conservation of protein structure over

four billion years. Structure 21, 1690—1697, http://dx.doi.org/

plexes has already been completed in the LUCA era. In

10.1016/j.str.2013.06.020.

summary, enzymes that were encoded in the of the

Jaenicke, R., 1991. Protein stability and molecular adaptation to

LUCA (Mirkin et al., 2003) possessed most likely many of

extreme conditions. Eur. J. Biochem. 202, 715—728, http://dx.

the fundamental features present in modern organisms and

doi.org/10.1111/j.1432-1033.1991.tb16426.x.

exhibited a level of sophistication comparable to that found

Jensen, R.A., 1976. Enzyme recruitment in evolution of new func-

in modern bacteria or archaea.

tion. Annu. Rev. Microbiol. 30, 409—425, http://dx.doi.org/

10.1146/annurev.mi.30.100176.002205.

Khersonsky, O., Tawfik, D.S., 2010. Enzyme promiscuity: a

References

mechanistic and evolutionary perspective. Annu. Rev.

Biochem. 79, 471—505, http://dx.doi.org/10.1146/annurev-

Akanuma, S., Nakajima, Y. , Yokobori, S.-i., Kimura, M., Nemoto, biochem-030409-143718.

N., Mase, T. , Miyazono, K.-i., Tanokura, M., Yamagishi, A., 2013. Larkin, M.A., Blackshields, G., Brown, N.P., Chenna, R., McGettigan,

Experimental evidence for the thermophilicity of ancestral life. P.A., McWilliam, H., Valentin, F. , Wallace, I.M., Wilm, A., Lopez,

Proc. Natl. Acad. Sci. U.S.A. 110, 11067—11072, http://dx.doi. R., Thompson, J.D., Gibson, T.J., Higgins, D.G., 2007. Clustal

org/10.1073/pnas.1308215110. W and Clustal X version 2.0. Bioinformatics 23, 2947—2948,

Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J., 1990. http://dx.doi.org/10.1093/bioinformatics/btm404.

Basic local alignment search tool. J. Mol. Biol. 215, 403—410, Lartillot, N., Lepage, T. , Blanquart, S., 2009. PhyloBayes 3:

http://dx.doi.org/10.1016/s0022-2836(05)80360-2. a Bayesian software package for phylogenetic reconstruc-

Castresana, J., 2000. Selection of conserved blocks from mul- tion and molecular dating. Bioinformatics 25, 2286—2288,

tiple alignments for their use in phylogenetic analysis. http://dx.doi.org/10.1093/bioinformatics/btp368.

Reconstruction of ancestral enzymes 23

Li, W., Godzik, A., 2006. Cd-hit: a fast program for clustering promiscuity in laboratory resurrections of Precambrian beta-

and comparing large sets of protein or nucleotide sequences. lactamases. J. Am. Chem. Soc. 135, 2899—2902, http://dx.

Bioinformatics 22, 1658—1659, http://dx.doi.org/10.1093/ doi.org/10.1021/ja311630a.

bioinformatics/btl158. Risso, V.A., Gavira, J.A., Sanchez-Ruiz, J.M., 2014. Thermostable

Liberles, D.A., 2007. Ancestral Sequence Reconstruction. and promiscuous Precambrian proteins. Environ. Microbiol. 16,

Oxford University Press, Oxford, http://dx.doi.org/10.1093/ 1485—1489, http://dx.doi.org/10.1111/1462-2920.12319.

acprof:oso/9780199299188.001.0001. Robert, F. , Chaussidon, M., 2006. A palaeotemperature curve for the

Löytynoja, A., Goldman, N., 2008. Phylogeny-aware gap place- Precambrian oceans based on silicon isotopes in cherts. Nature

ment prevents errors in sequence alignment and evolutionary 443, 969—972, http://dx.doi.org/10.1038/nature05239.

analysis. Science 320, 1632—1635, http://dx.doi.org/10.1126/ Ronquist, F. , Huelsenbeck, J.P., 2003. MrBayes 3: Bayesian

science.1158395. phylogenetic inference under mixed models. Bioinformat-

Merkl, R., Sterner, R., 2016. Ancestral protein reconstruc- ics 19, 1572—1574, http://dx.doi.org/10.1093/bioinformatics/

tion: techniques and applications. Biol. Chem. 397, 1—21, btg180.

http://dx.doi.org/10.1515/hsz-2015-0158. Sterner, R., Höcker, B., 2005. Catalytic versatility, stability, and

Mirkin, B.G., Fenner, T.I., Galperin, M.Y., Koonin, E.V., 2003. evolution of the (␤␣)8-barrel enzyme fold. Chem. Rev. 105,

Algorithms for computing parsimonious evolutionary sce- 4038—4055, http://dx.doi.org/10.1021/cr030191z.

narios for genome evolution, the last universal common Susko, E., Field, C., Blouin, C., Roger, A.J., 2003. Estimation

ancestor and dominance of horizontal gene transfer of rates-across-sites distributions in phylogenetic substitu-

in the evolution of prokaryotes. BMC Evol. Biol. 3, 2, tion models. Syst. Biol. 52, 594—603, http://dx.doi.org/

http://dx.doi.org/10.1186/1471-2148-3-2. 10.1080/10635150390235395.

Pauling, L., Zuckerkandl, E., 1963. Chemical paleogenetics: Swofford, D.L., 1984. PAUP: Phylogenetic Analysis using Parsimony.

molecular ‘‘restoration studies’’ of extinct forms of life. Illinois Natural Historical Survey, Champaign.

Acta Chem. Scand. 17, 9—16, http://dx.doi.org/10.3891/ Walker, J.C.G., 1983. Possible limits on the composi-

acta.chem.scand.17s-0009. tion of the Archaean ocean. Nature 302, 518—520,

Perez-Jimenez, R., Inglés-Prieto, A., Zhao, Z.M., Sanchez-Romero, http://dx.doi.org/10.1038/302518a0.

I., Alegre-Cebollada, J., Kosuri, P. , Garcia-Manyes, S., Kappock, Whelan, S., 2008. Inferring trees. In: Keith, J.M. (Ed.), Bioinfor-

T.J., Tanokura, M., Holmgren, A., Sanchez-Ruiz, J.M., Gaucher, matics. Springer, Totowa, NJ, pp. 287—309, http://dx.doi.org/

E.A., Fernandez, J.M., 2011. Single-molecule paleoenzymology 10.1007/978-1-60327-159-2 14.

probes the chemistry of resurrected enzymes. Nat. Struct. Mol. Yang, Z., 1997. PAML: a program package for phylogenetic analy-

Biol. 18, 592—596, http://dx.doi.org/10.1038/nsmb.2020. sis by maximum likelihood. Comput. Appl. Biosci. 13, 555—556,

Pupko, T. , Pe, I., Shamir, R., Graur, D., 2000. A fast algorithm http://dx.doi.org/10.1093/bioinformatics/13.5.555.

for joint reconstruction of ancestral amino acid sequences. Yang, Z., Kumar, S., Nei, M., 1995. A new method of inference of

Mol. Biol. Evol. 17, 890—896, http://dx.doi.org/10.1093/ ancestral nucleotide and amino acid sequences. Genetics 141,

oxfordjournals.molbev.a026369. 1641—1650.

Reisinger, B., Sperl, J., Holinski, A., Schmid, V. , Rajendran, C., Zou, T. , Risso, V.A., Gavira, J.A., Sanchez-Ruiz, J.M., Ozkan,

Carstensen, L., Schlee, S., Blanquart, S., Merkl, R., Sterner, S.B., 2015. Evolution of conformational dynamics determines

R., 2014. Evidence for the existence of elaborate enzyme com- the conversion of a promiscuous generalist into a special-

plexes in the Paleoarchean era. J. Am. Chem. Soc. 136, 122—129, ist enzyme. Mol. Biol. Evol. 32, 132—143, http://dx.doi.

http://dx.doi.org/10.1021/ja4115677. org/10.1093/molbev/msu281.

Risso, V.A., Gavira, J.A., Mejia-Carmona, D.F., Gaucher, E.A.,

Sanchez-Ruiz, J.M., 2013. Hyperstability and substrate