Evolution of lanthipeptide synthetases

Qi Zhang^a, Yi Yu^b, Juan E. Velásquez^a, and Wilfred A. van der Donk^a,b,c,1

^aDepartment of Chemistry, ^bDepartment of Biochemistry, and ^cHoward Hughes Medical Institute, University of Illinois at Urbana–Champaign, Urbana, IL 61801

Edited by Jerrold Meinwald, Cornell University, Ithaca, NY, and approved September 24, 2012 (received for review June 19, 2012)

Lanthionine-containing peptides (lanthipeptides) are a family of ribosomally synthesized and posttranslationally modified peptides containing (methyl)lanthionine residues. Here we present a phylogenomic study of the four currently known classes of lanthipeptide synthetases (LanB and LanC for class I, LanM for class II, LanKC for class III, and LanL for class IV). Although they possess very similar cyclase domains, class II–IV synthetases have evolved independently, and LanB and LanC enzymes appear to not always have coevolved. LanM enzymes from various phyla that have three cysteines ligated to a zinc ion (as opposed to the more common Cys-Cys-His ligand set) cluster together. Most importantly, the phylogenomic data suggest that for some scaffolds, the ring topology of the final lanthipeptides may be determined in part by the sequence of the precursor peptides and not just by the biosynthetic enzymes. This notion was supported by studies with two chimeric peptides, suggesting that the nisin and prochlorosin biosynthetic enzymes can produce the correct ring topologies of epilancin 15X and lacticin 481, respectively. These results highlight the potential of lanthipeptide synthetases for bioengineering and combinatorial biosynthesis. Our study also demonstrates unexplored areas of sequence space that may be fruitful for genome mining.

molecular evolution | natural products | phylogeny | posttranslational modification | lantibiotics

Peptide antibiotics represent a large and diverse group of bioactive natural products with a wide range of applications. Most of these compounds are produced via two distinct biosynthetic paradigms. The nonribosomal peptide synthetases are responsible for the biosynthesis of many clinically important antibiotics (1, 2). A different strategy involves posttranslational modifications of linear ribosomally synthesized peptides (3, 4). This biosynthetic strategy is also widely distributed and found in all three domains of life. Although the building blocks used by ribosomes are generally confined to the 20 proteinogenic amino acids, the structural diversity generated by posttranslational modifications is vast (5). Among the best-studied ribosomally synthesized and posttranslationally modified peptides are the lanthipeptides, a class of compounds distinguished by the presence of sulfur-to-β–carbon thioether cross-links named lanthionines and methyllanthionines (Fig. 1) (6–9). Many lanthipeptides, such as the commercially used food preservative nisin, have potent antimicrobial activity and are termed lantibiotics. Maturation of lanthipeptides involves posttranslational modifications of a C-terminal core region of a precursor peptide and subsequent proteolytic removal of an N-terminal leader sequence that is not modified (3). Their thioether bridges are installed by the initial dehydration of Ser and Thr residues, followed by stereoselective intramolecular Michael-type addition of Cys thiols to the newly formed dehydroamino acids (Fig. 1A). In some cases, this reaction is coupled with a second Michael-type addition of the resulting enolate to a second dehydroalanine to produce a labionin structure (10).

Genetic and biochemical studies have revealed four distinct classes of lanthipeptides according to their biosynthetic machinery (7, 11, 12) (Fig. 1B). Class I lanthipeptides are synthesized by two different enzymes, a dehydratase LanB, and a cyclase LanC (Lan is a generic designation for lanthipeptide biosynthetic proteins). For class II lanthipeptides, the reactions are carried out by a single lanthipeptide synthetase, LanM, containing an N-terminal dehydratase domain that bears no homology to LanB, and a C-terminal LanC-like cyclase domain. Class III and class IV lanthipeptides are synthesized by trifunctional LanKC (10, 13) and LanL (12), respectively. These enzymes contain an N-terminal domain and a central kinase domain but differ in their C termini. LanC and the C-terminal domains of LanM and LanL contain a conserved zinc-binding motif (Cys-Cys-His/Cys), whereas the C-terminal cyclase domain of LanKC lacks these conserved residues (13) (Fig. 1B).

Recent genome mining studies have revealed that lanthipeptide biosynthetic genes are present in a wide range of bacteria (14–16). The widespread occurrence may not be surprising, given the potential benefits of gene-encoded natural products with respect to facile evolution of new structures and biological function. Here we present a systematic analysis of the phylogenetic distribution of lanthipeptide synthetases. We correlate the taxonomy of the bacterial host and the structure of the final products with the phylogenies. Implications for biosynthetic engineering of the lanthipeptide family and for genome mining are discussed.

Results and Discussion

Evolution of LanC enzymes. LanCs have about 400 amino acid residues, and possess a double α–barrel-fold topology (17) and a strictly conserved Cys-Cys-His triad near their C termini for binding of a zinc ion. In vitro reconstitution of the nisin cyclase activity of NisC and solution of its crystal structure have supported a zinc-dependent mechanism (17, 18). The zinc ion is believed to activate the Cys thiols of the precursor peptide for nucleophilic attack on the dehydroamino acids. LanC or LanC-like enzymes are not only found as stand-alone cyclases and as cyclase domains in LanM and LanL enzymes; their encoding genes are also present in eukaryotes, although the detailed functions of these LanC-like (LanCL) proteins remain to be determined (19).

The phylogeny of LanC from different bacterial lineages, including the C-terminal domains of LanMs and LanLs, and the LanCL proteins from human that can serve as the outgroup, was constructed using both Bayesian Markov chain Monte Carlo (MCMC) (20) and maximum-likelihood inferences (21). To obviate codon bias, the trees were constructed based on amino acid sequences instead of nucleic acid sequences. The overall Bayesian MCMC tree shown in Fig. 2 A and B is almost exactly the same as that prepared by the maximum-likelihood method (SI Appendix, Fig. S1), strongly supporting the reliability of both trees.

The C-terminal domains of LanM and LanL fall into two distinct clades. These two clades group into a larger clade and are separated from a sister LanC clade and the eukaryotic LanCL clade (Fig. 2A and SI Appendix, Fig. S1), suggesting that LanM and LanL evolved independently from LanC. If LanM and LanL originate from hybridization of an ancestral LanC with a [...] or kinase, this event likely occurred only once. Gene recombination

Author contributions: Q.Z., Y.Y., J.E.V., and W.A.v.d.D. designed research; Q.Z., Y.Y., and J.E.V. performed research; Q.Z., Y.Y., J.E.V., and W.A.v.d.D. analyzed data; and Q.Z. and W.A.v.d.D. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

^1To whom correspondence should be addressed. E-mail: [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1210393109/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1210393109 PNAS | November 6, 2012 | vol. 109 | no. 45 | 18361–18366

between the C-terminal domains of LanM and LanL is not supported by our analysis of the currently available sequences.

LanCs from different bacterial phyla group into three groups with strong statistical support (Fig. 2B and SI Appendix, Fig. S1). Group 1 consists of enzymes from bacteroidetes and proteobacteria. Although proteobacteria are a prolific source of LanM enzymes (vide infra), only very few LanCs are found in this phylum. Group 2 and group 3 consist of proteins from actinobacteria and firmicutes, respectively, which group into a larger clade that is sister to group 1, indicating that LanC enzymes from these two phyla are more closely related to each other than to their counterparts from bacteroidetes and proteobacteria. Group 3 consists of the enzymes from firmicutes, with only one exception of an enzyme from Bifidobacterium longum, which belongs to the actinobacteria. Taken together, these results indicate that LanCs from different phyla have evolved independently, and that interphylum horizontal gene transfer generally did not occur during LanC evolution.

Class I lanthipeptides have been grouped into nisin-like, epidermin-like, and Pep5-like groups (8) (SI Appendix, Figs. S2 and S3). Although LanC enzymes for producing structurally similar lanthipeptides usually cluster together (Fig. 2B), this is not always the case. Nisin and subtilin have similar precursor peptide sequences and exactly the same ring topologies (6) (Fig. 3), but their associated LanC enzymes have relatively distant relationships (Fig. 2B). Moreover, microbisporicin has a similar N-terminal structure to that of nisin, but is produced by Actinobacteria (22, 23), and its LanC does not even belong to group 3, which harbors most enzymes for nisin-like lanthipeptides. These results demonstrate convergent evolution to arrive at nisin-like architectures. When also taking into consideration the distinct C-terminal ring topologies of subtilin (6), ericin A (24), and geobacillin (25) (Fig. 3), and their closely related LanCs (Fig. 2B), the ring topology of some lanthipeptides may be determined as much by the sequence of their precursor peptides as by their cyclase sequences. This hypothesis could have important implications for bioengineering (vide infra). Conversely, it is puzzling that thus far the potent lipid II binding topology of the A and B rings of the nisin-like peptides has only been observed in class I compounds and not in the other classes of lanthipeptides.

Evolution of LanB enzymes. LanB proteins usually have about 1,000 residues, consisting of a lanthipeptide dehydratase domain and a so-called SpaB_C (SpaB C-terminal) domain (26) (Fig. 1B). A few lanthipeptide biosynthetic gene clusters contain genes encoding a C-terminally truncated LanB and a stand-alone SpaB_C-like protein, suggesting that the two domains of LanB might be able to act both in cis and in trans. Similarly, most of the thiopeptide biosynthetic gene clusters encode two proteins for dehydration: a putative dehydratase that shares low sequence similarity with LanBs and a SpaB_C-like protein (27). In this study, only full-length LanB sequences were used.

The phylogenetic trees of LanB enzymes from the same gene cluster as the LanC proteins in Fig. 2B were constructed by both Bayesian MCMC and maximum-likelihood methods, and the resulting two trees are almost identical (SI Appendix, Figs. S4 and S5). Interestingly, the topology of the LanB tree is distinct from that of LanC. The enzymes from bacteroidetes and proteobacteria are possibly derived from firmicutes because they are deeply nested within a group that mainly consists of LanBs from firmicutes (SI Appendix, Fig. S4). This finding is in distinct contrast to the LanC tree in which the enzymes of group 1 are distantly related to those of group 3 (Fig. 2B). It appears therefore that for these clusters, LanBs have evolved independently from LanCs, or that the LanBs or LanCs may have been recruited from other organisms and have formed a functional pair. In general, however, the trees show that LanB and LanC enzymes from the same organism fall into similar clades (Fig. 2B and SI Appendix, Fig. S4), suggesting that they have coevolved.

Genome mining for new class I lanthipeptides might be facilitated by phylogenetic categorization of LanC and LanB enzymes. For example, we note that Streptococcus pasteurianus contains genes for LanB and LanC that are phylogenetically closely related to NisB and NisC (Fig. 2). Examining their associated lanA gene suggests that this strain may be able to produce a nisin analog similar to nisin U (designated nisin P in Fig. 3). Similarly, Actinomyces sp. oral may produce a microbisporicin analog (designated microbisporicin B in Fig. 3), and Streptococcus sanguinis may produce a streptin analog (designated streptin B in SI Appendix, Fig. S6). Given the vast sequence space of uncharacterized enzymes from strains other than firmicutes (e.g., group 1 and group 2 in Fig. 2B), these data may also serve as a guideline for future genome mining efforts.

Evolution of LanM Enzymes. LanMs are bifunctional enzymes of 900–1,200 residues containing an N-terminal dehydratase and a C-terminal LanC-type cyclization domain (7, 28). LanM proteins use ATP to phosphorylate Ser/Thr residues in their substrates and they subsequently eliminate the resulting phosphate ester to yield the dehydroamino acids (29). The C-terminal domain then catalyzes the cyclization reaction in a similar manner to LanC enzymes. Unlike LanBs and LanCs, LanMs are prevalent in proteobacteria and have also been characterized from cyanobacteria (30, 31). However, in our analysis they are not found in bacteroidetes, again indicating that different classes of lanthipeptides likely evolved independently.

Bayesian MCMC and maximum-likelihood trees of 91 LanM sequences from different bacterial phyla were constructed (Fig. 2C and SI Appendix, Fig. S7). The Bayesian MCMC inference of LanM phylogeny shown in Fig. 2C is similar to that of LanC and LanB, in that enzymes are usually grouped according to their producing organisms and form three major subdivisions. Group 1 is a polyphyletic clade mainly consisting of enzymes from proteobacteria and cyanobacteria. Because the base of group 1 consists exclusively of proteobacterial enzymes (Fig. 2C), members of group 1 from other phyla possibly evolved from proteobacterial ancestors. Groups 2 and 3 are made up of enzymes from actinobacteria and firmicutes, respectively, with the only exceptions being enzymes from Bifidobacteria (Fig. 2C), which fall into the firmicutes clade similar to what was observed for LanC (Fig. 2B). Notably, the enzymes that synthesize type IIA compounds (SI Appendix, Fig. S8) (8, 32) are grouped into a subclade with strong support (Fig. 2C). Type IIA compounds are structurally similar

Fig. 1. Biosynthesis of lanthipeptides, showing the mechanism of (methyl)lanthionine formation (A), and the four classes of synthetases (B). Xn represents a peptide linker. The conserved zinc-binding motifs are highlighted by the purple lines in the cyclase domains. SpaB_C is an in silico defined domain currently found as a stand-alone protein for thiopeptide biosynthesis and as the C-terminal domain of LanB enzymes. The N-terminal domain of LanB enzymes is made up of two subdomains according to the Conserved Domain Database (26).

Fig. 2. Bayesian MCMC phylogeny of LanC and LanM enzymes. (A) Tree of LanC and LanC-like enzymes. LanM_C and LanL_C represent the C-terminal cyclase domains of LanM and LanL, respectively. (B) The LanC clade of A. (C) Phylogenetic tree of LanM enzymes. As no suitable outgroup protein can be found (LanLs cannot serve as outgroup, because their N-terminal domains are not homologous to those of LanMs), the trees were rooted by using all members of a sister clade as the outgroup, an approach previously suggested as optimal in such instances (44). Bayesian inferences of posterior probabilities are indicated by line width. Lanthipeptides in each tree are shown by different colored boxes according to structural types. Two-component lantibiotics are in red font, and lanthipeptides proposed in this study are in yellow font.

to the prototypical peptide lacticin 481, which possesses a linear N terminus and a globular C-terminal scaffold (SI Appendix, Fig. S8) (6, 32). The close correlation of the enzyme phylogeny and the structure of the final products demonstrates that these products evolved from the same ancestors, in contrast to the apparent convergent evolution of several members of the nisin-like peptides discussed above.

Enzymes for synthesizing the two-component lantibiotics are distributed in different subclades of group 3 (Fig. 2C). These systems are potent antimicrobial agents that display strong synergism between the two posttranslationally modified peptides (7–9). Although the two peptides of lacticin 3147 (Ltnα and Ltnβ) have similar ring topologies as haloduracin-α

and -β, particularly in their C-terminal regions (SI Appendix, Fig. S9), their corresponding LanMs are phylogenetically distantly related (Fig. 2C). These results reinforce the idea that the ring topologies of lanthipeptides might be determined to a larger extent than previously anticipated by the sequence of the precursor peptide, as discussed above.

The biosynthesis of the prochlorosins from cyanobacteria represents a remarkable example of natural combinatorial biosynthesis (31): up to 29 different ProcA peptides with highly variable sequences in the core region but highly conserved leader sequences serve as substrates of a single enzyme ProcM, resulting in a library of structurally diverse lanthipeptides (30, 31). This system is a prime example of the highly evolvable nature of ribosomal biosynthesis to access high structural diversity of natural products at low genetic cost. Analysis of the ProcM sequence revealed that this enzyme contains a “CCG” motif (30) rather than a “CHG” motif (17, 18) found in all LanCs and most of the LanMs known to date, indicating that ProcM likely uses three Cys residues rather than a Cys-Cys-His triad for binding of the zinc ion. Model studies of activation of thiolate nucleophiles by Zn2+ have demonstrated increased reactivity with an increased number of thiolate ligands (33), suggesting that ProcM may derive its promiscuity in part from a highly active zinc ion (30). Intriguingly, all LanM proteins containing the “CCG” motif cluster together to form a distinct subgroup within group 1 (Fig. 2C). This phylogenetic distribution is not merely a consequence of the “CCG” or “CHG” motifs, because artificially changing Cys to His or His to Cys in the motif for five representative group 1 enzymes did not alter their position in the tree or greatly affect the statistical support of the phylogenetic trees (SI Appendix, Fig. S10). Rather, these results suggest that LanMs containing the conserved “CCG” motif have evolved independently from the other LanM proteins. Some but not all of the members of the CCG clade have multiple precursor genes nearby and at other loci of their genomes similar to ProcM.

Evolution of LanKC and LanL enzymes. LanKC and LanL both contain an N-terminal lyase domain and a central kinase domain, with both enzymes generating dehydroamino acids via independent phosphorylation and elimination steps (7, 12). The C-terminal cyclase domain of LanKC enzymes lacks the conserved residues for binding of a zinc ion. Consistent with this observation, AciKC involved in catenulipeptin biosynthesis binds neither zinc nor other metals, indicating a different cyclization mechanism of LanKC enzymes, but domain deletions support the hypothesis that the C-terminal domain is responsible for cyclization (34). Many, but not all, LanKCs synthesize labionins (SI Appendix, Fig. S11), which are thus far found exclusively for class III lanthipeptides (34–36).

Bayesian MCMC and maximum-likelihood inferences of 39 LanKC sequences and 15 LanL sequences were constructed (SI Appendix, Figs. S12 and S13). Clearly, LanKC and LanL enzymes group into different clades, indicating that, despite the significant sequence similarities and the similar domain structure, these two enzyme classes have evolved independently. The possibility that LanKCs are derived from LanLs by loss of the zinc-[...] is therefore unlikely. Interestingly, some LanLs contain the “CCG” instead of the “CHG” motif and may have three Cys residues that coordinate to the zinc ion. As was observed for LanMs, these enzymes fall into a distinct subgroup (SI Appendix, Figs. S12 and S13).

Phylogenetically, there are no obvious distinctions between LanKC enzymes that generate lanthionine and labionin structures. For example, catenulipeptin and labyrinthopeptin contain only labionins (34, 35) and SapB contains only lanthionines (13) (SI Appendix, Fig. S11), but catenulipeptin synthetase AciKC is phylogenetically closer to the SapB synthetase RamC than to the labyrinthopeptin synthetase LabKC (SI Appendix, Fig. S12). In addition, recently characterized class III peptides contain both lanthionine and labionin (36), and the biosynthetic enzymes of these peptides are distributed in different subclades of the LanKC clade (SI Appendix, Fig. S12). Thus, it is possible that the cyclization mode for class III may again be dependent on the precursor peptide sequences.

Determinants of Lanthipeptide Ring Topologies. The ring systems of lanthipeptides are very diverse, ranging from simple nonoverlapping rings to highly complex, intertwined rings (SI Appendix, Figs. S2, S3, and S9). The possibility to form labionin rings further diversifies lanthipeptide ring systems. The manner by which ring topology is determined from precursor peptides containing multiple Cys and Ser/Thr residues is at present entirely unclear. As discussed above, the trees provide several examples of enzymes that produce structurally similar lanthipeptides falling into different phylogenetic clades, as well as phylogenetically closely related enzymes generating products with distinct rings. These results suggest that in some cases, the substrate sequences may be as important to determine the ring topologies of the final product as the cyclization enzymes.

The notion of substrate-directed ring patterns is supported by comparative analysis of the nisin-group (Fig. 3). All members of this group contain a conserved N-terminal ring system, but their biosynthetic enzymes fall into three different clades (Fig. 2B). Interestingly, ericin S and A, despite possessing very different C-terminal ring topologies, are produced by the same biosynthetic enzymes (24). The D-ring of ericin S is linked to the E-ring in similar fashion as in nisin, whereas the D-ring of ericin A is intertwined with the C-ring, similar to that found in microbisporicin (Fig. 3). This analysis supports a more prominent role of the substrate sequences. Remarkably, Kuipers and colleagues have recently shown that the precursors of a class II two-component lantibiotic can be modified by the nisin biosynthetic enzymes to form antimicrobially active products (37). Although the structures are currently unknown, the products likely have the same or very similar ring topologies as the wild-type peptides, supporting the importance of substrate sequence in the ring-pattern formation.

To further address this hypothesis, we generated a construct encoding a chimeric peptide NisA-ElxA, in which the leader sequence of the nisin precursor peptide (NisA) was fused to the core peptide of epilancin 15X separated by an engineered glutamic acid residue for proteolytic removal of the leader peptide. Nisin and epilancin 15X have very different N termini but similar C termini (SI Appendix, Figs. S2 and S3). Coexpression of NisA-

Fig. 3. Comparative analysis of the nisin-like peptides. A–E denote different rings. The conserved N termini are highlighted by the orange box. Ser and Thr that are involved in ring formation are shown by red highlighted font. Ser and Thr that are dehydrated but not involved in ring formation are shown in green and purple highlighted font. New compounds proposed in this study are shown in blue font.

ElxA with NisB and NisC in Escherichia coli resulted in a series of products that were dehydrated up to six times compared with eight dehydrations in epilancin 15X (Fig. 4A). Electrospray ionization (ESI)-MS/MS analysis suggested that the two N-terminal Ser residues escaped dehydration in the sixfold dehydrated peptide and that it very likely has the same ring pattern as that of epilancin 15X (Fig. 4B). Indeed, proteolytic removal of the leader peptide with endoproteinase GluC and subsequent use of the product for well-diffusion assays clearly demonstrated a zone of growth inhibition of the indicator strain Staphylococcus carnosus (Fig. 4B). The product from a parallel experiment in which NisC was not coexpressed lacked antibacterial activity.

To extend these studies to chimeras of lanthipeptides that have no structural similarity, we fused the lacticin 481 core peptide to the ProcA3.2 leader sequence to afford a chimeric peptide ProcA-LctA. This peptide was coexpressed in E. coli with the highly promiscuous enzyme ProcM. MS analysis showed that the peptide was dehydrated up to five times (SI Appendix, Fig. S14A). Iodoacetamide alkylation assays and ESI-MS/MS analysis indicated that the products contained a mixture of peptides that were partially and fully cyclized (SI Appendix, Fig. S14 B and C). Intriguingly, after removal of the ProcA leader peptide, the resultant product was active against the indicator strain Lactococcus lactis HP (SI Appendix, Fig. S14D), collectively suggesting that the correct rings of lacticin 481 were produced. These results support a model in which the Cys residues lodge onto the Zn2+ ion in the active site and that subsequently the precursor peptide sequence determines the site selectivity of cyclization. In other words, the cyclase does not appear to enforce the rings to be generated, nor do initially formed rings govern the site selectivity of subsequently formed rings. The latter point is also supported by previous studies on single-ring disruptions of haloduracin (38). To emphasize that not all synthetases can make every lanthipeptide, ProcM did not process a chimeric peptide consisting of the ProcA leader and NisA core peptides to a bioactive product (SI Appendix, Fig. S15). Possibly, incomplete dehydration precluded formation of the correct rings.

Base Composition and Codon Use of Lanthipeptide Synthetase Genes. Base composition analysis is a common strategy to investigate gene history and potential horizontal gene transfer events (39, 40). If a gene is a vertical descendant, it should have a similar base composition as the host genome, whereas if it does not, the gene is likely acquired relatively recently from another organism. We calculated the GC content of every lanthipeptide synthetase gene for which complete genome sequences are available and compared the results with the GC content of their genomes (Fig. 5). The analysis shows that enzymes from actinobacteria usually have nucleotide composition similar to their genomes, with exceptions again found for Bifidobacteria. The GC content of lanC and lanKC genes from this genus strongly deviates from that of their genomes, indicating these genes were most likely acquired by recent horizontal gene transfer. Genes of lanthipeptide synthetases from firmicutes show substantial variations of nucleotide base composition (Fig. 5). A notable example involves the genes from Geobacillus, with GC contents that are decreased significantly compared with their genomes. Taken together with the phylogenetic results, these genes were likely acquired horizontally from Bacillus strains (Fig. 2 B and C). Application of other methods to investigate nucleotide use in lanthipeptide biosynthetic genes, including GC3s and the effective number of codons (Nc) (41), indicates that our analysis is not biased by the codon preference of different organisms (SI Appendix, Fig. S16).

All lanthipeptide biosynthetic genes from Streptococci have relatively low GC contents (Fig. 5). To evaluate the significance of these high deviations, for every Streptococcus strain of Fig. 5, we selected 25 genes believed to be less prone to horizontal gene transfer (42) and calculated their GC composition and overall SDs (σ) (SI Appendix, Table S1). This analysis indicates that the σ values of the core genes for all strains are lower than 3% (SI Appendix, Table S1), whereas the majority of the streptococcal lanthipeptide synthetase genes have GC contents decreased by more than 4σ from their genome, suggesting that they were likely acquired recently by horizontal gene transfer.

Conclusion

The knowledge of the chemical and biosynthetic diversity of ribosomal natural products has been greatly expanded in recent years. These compounds are distinct from other well-established natural products because they are gene-encoded and their structures can be easily changed by simple permutation of the precursor peptide sequences. Using lanthipeptide synthetases

Fig. 4. Generation of a bioactive epilancin 15X analog with the nisin biosynthetic enzymes. (A) MALDI-MS analysis of NisA-ElxA modified in E. coli by NisB and NisC and treated with GluC protease. (B) ESI-MS/MS analysis of the sixfold dehydrated peptide. The proposed structure, the MS/MS fragmentation pattern, and the in vitro bioassay against S. carnosus are shown. Spots 1 and 2 on the bioassay plate are assay and negative control.

Fig. 5. Base composition analysis of the lanthipeptide synthetase genes and the associated genomes. Genome sequences are from the same species as the synthetase genes but the subspecies are different in some cases.
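The base composition comparison described for Fig. 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the study: the function names, the toy sequences, and the choice of the sample standard deviation are all hypothetical, and the text does not say whether a sample or population SD was used. The sketch expresses the difference between a gene's GC content and its genome's GC content in units of σ, the spread of GC content across a set of core genes.

```python
from statistics import stdev


def gc_content(seq: str) -> float:
    """Return the GC percentage of a nucleotide sequence, counting only A/C/G/T."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def deviation_in_sigma(gene_seq: str, genome_gc: float, core_gene_seqs: list[str]) -> float:
    """Return (gene GC - genome GC) divided by the SD of core-gene GC values.

    Assumption: sigma is the sample standard deviation of the core genes' GC
    percentages; the paper does not specify which estimator was used.
    """
    core_gc = [gc_content(s) for s in core_gene_seqs]
    sigma = stdev(core_gc)
    if sigma == 0:
        raise ValueError("core genes have identical GC content; sigma is zero")
    return (gc_content(gene_seq) - genome_gc) / sigma


# Toy, hypothetical sequences (not real genes): core-gene GC = 50%, 62.5%, 37.5%
core_genes = ["GGCCAATT", "GGCCGATT", "GGACAATT"]
# A gene with 0% GC in a genome with 50% GC lies 4 sigma below the genome value
print(deviation_in_sigma("AAAATTTT", 50.0, core_genes))  # -4.0
```

A strongly negative value, as in the toy call above, corresponds to the situation described for the streptococcal synthetase genes, whose GC contents fall well below those of their genomes.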

as a model system, the phylogenomic studies presented herein indicate a complex, dynamic, and sometimes convergent evolution mechanism of the biosynthetic enzymes. Several interesting observations are made, such as mostly phylum-dependent groupings, the clustering of subgroups of enzymes that differ in the identity of a single metal ligand, and the possibility that precursor peptide sequence may be a larger determinant of final ring topology than previously recognized. This hypothesis is supported by experimental studies with chimeric peptides showing that phylogenetically distantly related enzymes can produce the same ring topologies given the same peptide sequence. The phylogenetic trees described herein can also direct future genome mining studies, as they show that some biosynthetic sequence space has not been tapped at all, whereas other areas have been heavily sampled. These findings may also serve as an entry point for understanding the evolutionary mechanism of other ribosomal natural products.

Materials and Methods

Bayesian MCMC inference analyses were performed using the program MrBayes (version 3.2) (43). Final analyses consisted of two sets of eight chains each (one cold and seven heated), run for about 2–10 million generations with trees saved and parameters sampled every 100 generations. Analyses were run to reach a convergence with SD of split frequencies < 0.01. Posterior probabilities were averaged over the final 75% of trees (25% burn in). For additional details of phylogenetic analysis, procedures for the studies with the chimeric peptides, expression and purification of modified peptide products, MS and bioactivity assays, and nucleotide base composition and codon use analysis, please see the SI Appendix. The SI Appendix also contains the accession number and the source organism of the synthetases used (SI Appendix, Tables S2–S5).

ACKNOWLEDGMENTS. We thank Dr. Taras Pogorelov and Mike Hallock (University of Illinois at Urbana–Champaign) for providing computation and network assistance. This work was supported by the National Institutes of Health Grants GM58822 (to W.A.v.d.D.) and T32 GM070421 (to J.E.V.).

1. Fischbach MA, Walsh CT (2006) Assembly-line enzymology for polyketide and nonribosomal peptide antibiotics: Logic, machinery, and mechanisms. Chem Rev 106(8):3468–3496.
2. Strieker M, Tanovic A, Marahiel MA (2010) Nonribosomal peptide synthetases: Structures and dynamics. Curr Opin Struct Biol 20(2):234–240.
3. Oman TJ, van der Donk WA (2010) Follow the leader: The use of leader peptides to guide natural product biosynthesis. Nat Chem Biol 6(1):9–18.
4. Velásquez JE, van der Donk WA (2011) Genome mining for ribosomally synthesized natural products. Curr Opin Chem Biol 15(1):11–21.
5. McIntosh JA, Donia MS, Schmidt EW (2009) Ribosomal peptide natural products: Bridging the ribosomal and nonribosomal worlds. Nat Prod Rep 26(4):537–559.
6. Willey JM, van der Donk WA (2007) Lantibiotics: Peptides of diverse structure and function. Annu Rev Microbiol 61:477–501.
7. Knerr PJ, van der Donk WA (2012) Discovery, biosynthesis, and engineering of lantipeptides. Annu Rev Biochem 81:479–505.
8. Bierbaum G, Sahl HG (2009) Lantibiotics: Mode of action, biosynthesis and bioengineering. Curr Pharm Biotechnol 10(1):2–18.
9. Piper C, Cotter PD, Ross RP, Hill C (2009) Discovery of medically significant lantibiotics. Curr Drug Discov Technol 6(1):1–18.
10. Müller WM, Schmiederer T, Ensle P, Süssmuth RD (2010) In vitro biosynthesis of the prepeptide of type-III lantibiotic labyrinthopeptin A2 including formation of a C-C bond as a post-translational modification. Angew Chem Int Ed 49(13):2436–2440.
11. Pag U, Sahl HG (2002) Multiple activities in lantibiotics—Models for the design of novel antibiotics? Curr Pharm Des 8(9):815–833.
12. Goto Y, et al. (2010) Discovery of unique lanthionine synthetases reveals new mechanistic and evolutionary insights. PLoS Biol 8(3):e1000339.
13. Kodani S, et al. (2004) The SapB morphogen is a lantibiotic-like peptide derived from the product of the developmental gene ramS in Streptomyces coelicolor. Proc Natl Acad Sci USA 101(31):11448–11453.
14. Marsh AJ, O’Sullivan O, Ross RP, Cotter PD, Hill C (2010) In silico analysis highlights the frequency and diversity of type 1 lantibiotic gene clusters in genome sequenced bacteria. BMC Genomics 11:679.
15. Begley M, Cotter PD, Hill C, Ross RP (2009) Identification of a novel two-peptide lantibiotic, lichenicidin, following rational genome mining for LanM proteins. Appl Environ Microbiol 75(17):5451–5460.
16. Haft DH, Basu MK, Mitchell DA (2010) Expansion of ribosomally produced natural products: A hydratase- and Nif11-related precursor family. BMC Biol 8:70.
17. Li B, et al. (2006) Structure and mechanism of the lantibiotic cyclase involved in nisin biosynthesis. Science 311(5766):1464–1467.
18. Li B, van der Donk WA (2007) Identification of essential catalytic residues of the cyclase NisC involved in the biosynthesis of nisin. J Biol Chem 282(29):21169–21175.
19. Zhong WX, et al. (2012) Lanthionine synthetase C-like protein 1 interacts with and inhibits cystathionine beta-synthase: A target for neuronal antioxidant defense. J Biol Chem, 10.1074/jbc.M1112.383646.
20. Mau B, Newton MA, Larget B (1999) Bayesian phylogenetic inference via Markov chain Monte Carlo methods. Biometrics 55(1):1–12.
21. Guindon S, Gascuel O (2003) A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol 52(5):696–704.
22. Castiglione F, et al. (2008) Determining the structure and mode of action of microbisporicin, a potent lantibiotic active against multiresistant pathogens. Chem Biol 15(1):22–31.
23. Foulston LC, Bibb MJ (2010) Microbisporicin gene cluster reveals unusual features of lantibiotic biosynthesis in actinomycetes. Proc Natl Acad Sci USA 107(30):13461–13466.
24. Stein T, et al. (2002) Two different lantibiotic-like peptides originate from the ericin gene cluster of Bacillus subtilis A1/3. J Bacteriol 184(6):1703–1711.
25. Garg N, Tang W, Goto Y, Nair SK, van der Donk WA (2012) Lantibiotics from Geobacillus thermodenitrificans. Proc Natl Acad Sci USA 109(14):5241–5246.
26. Marchler-Bauer A, et al. (2011) CDD: A Conserved Domain Database for the functional annotation of proteins. Nucleic Acids Res 39(Database issue):D225–D229.
27. Li C, Kelly WL (2010) Recent advances in thiopeptide antibiotic biosynthesis. Nat Prod Rep 27(2):153–164.
28. Siezen RJ, Kuipers OP, de Vos WM (1996) Comparison of lantibiotic gene clusters and encoded proteins. Antonie van Leeuwenhoek 69(2):171–184.
29. Chatterjee C, et al. (2005) Lacticin 481 synthetase phosphorylates its substrate during lantibiotic production. J Am Chem Soc 127(44):15332–15333.
30. Tang W, van der Donk WA (2012) Structural characterization of four prochlorosins: A novel class of lantipeptides produced by planktonic marine cyanobacteria. Biochemistry 51(21):4271–4279.
31. Li B, et al. (2010) Catalytic promiscuity in the biosynthesis of cyclic peptide secondary metabolites in planktonic marine cyanobacteria. Proc Natl Acad Sci USA 107(23):10430–10435.
32. Dufour A, Hindré T, Haras D, Le Pennec JP (2007) The biology of lantibiotics from the lacticin 481 group is coming of age. FEMS Microbiol Rev 31(2):134–167.
33. Penner-Hahn J (2007) Zinc-promoted alkyl transfer: A new role for zinc. Curr Opin Chem Biol 11(2):166–171.
34. Wang H, van der Donk WA (2012) Biosynthesis of the class III lantipeptide catenulipeptin. ACS Chem Biol 7(9):1529–1535.
35. Meindl K, et al. (2010) Labyrinthopeptins: A new class of carbacyclic lantibiotics. Angew Chem Int Ed 49(6):1151–1154.
36. Völler GH, et al. (2012) Characterization of new class III lantibiotics—Erythreapeptin, avermipeptin and griseopeptin from Saccharopolyspora erythraea, Streptomyces avermitilis and Streptomyces griseus demonstrates stepwise N-terminal leader processing. ChemBioChem 13(8):1174–1183.
37. Majchrzykiewicz JA, et al. (2010) Production of a class II two-component lantibiotic of Streptococcus pneumoniae using the class I nisin synthetic machinery and leader sequence. Antimicrob Agents Chemother 54(4):1498–1505.
38. Cooper LE, McClerren AL, Chary A, van der Donk WA (2008) Structure-activity relationship studies of the two-component lantibiotic haloduracin. Chem Biol 15(10):1035–1045.
39. Garcia-Vallvé S, Romeu A, Palau J (2000) Horizontal gene transfer in bacterial and archaeal complete genomes. Genome Res 10(11):1719–1725.
40. Popa O, Hazkani-Covo E, Landan G, Martin W, Dagan T (2011) Directed networks reveal genomic barriers and DNA repair bypasses to lateral gene transfer among prokaryotes. Genome Res 21(4):599–609.
41. Wright F (1990) The ‘effective number of codons’ used in a gene. Gene 87(1):23–29.
42. Daubin V, Gouy M, Perrière G (2002) A phylogenomic approach to bacterial phylogeny: Evidence of a core of genes sharing a common history. Genome Res 12(7):1080–1090.
43. Ronquist F, et al. (2012) MrBayes 3.2: Efficient Bayesian phylogenetic inference and model choice across a large model space. Syst Biol 61(3):539–542.
44. Smith AB (1994) Rooting molecular trees—Problems and strategies. Biol J Linn Soc Lond 51(3):279–292.

18366 | www.pnas.org/cgi/doi/10.1073/pnas.1210393109 Zhang et al.