Computational-Guided Discovery and Characterization of a Sesquiterpene Synthase from Streptomyces Clavuligerus
Total Page:16
File Type:pdf, Size:1020Kb
Computational-guided discovery and characterization of a sesquiterpene synthase from Streptomyces clavuligerus Jeng-Yeong Chowa,1, Bo-Xue Tianb,c,1, Gurusankar Ramamoorthya, Brandan S. Hillerichd, Ronald D. Seideld, Steven C. Almod, Matthew P. Jacobsonb,c,2, and C. Dale Poultera,2 aDepartment of Chemistry, University of Utah, Salt Lake City, UT 84112; bDepartment of Pharmaceutical Chemistry, School of Pharmacy, University of California, San Francisco, CA 94158; cCalifornia Institute for Quantitative Biomedical Research, University of California, San Francisco, CA 94158; and dDepartment of Biochemistry, Albert Einstein College of Medicine, Bronx, NY 10461 Contributed by C. Dale Poulter, March 21, 2015 (sent for review March 6, 2015) Terpenoids are a large structurally diverse group of natural products in the hydrocarbon chain. The initial cyclization is often followed with an array of functions in their hosts. The large amount of by additional cyclizations and rearrangements to ultimately pro- genomic information from recent sequencing efforts provides duce a myriad of different carbon skeletons. Class II terpene syn- opportunities and challenges for the functional assignment of thases generate an electrophilic tertiary carbocation by protonation terpene synthases that construct the carbon skeletons of these of a trisubstituted double bond mediated by an acidic residue (e.g., compounds. Inferring function from the sequence and/or structure Asp) in their active sites. These enzymes then mediate cyclization of these enzymes is not trivial because of the large number of and rearrangement reactions to give a second large group of ter- possible reaction channels and products. We tackle this problem penoid carbon skeletons. by developing an algorithm to enumerate possible carbocations Streptomyces clavuligerus is a gram-positive bacterium that, like derived from the farnesyl cation, the first reactive intermediate of most other Streptomyces spp., produces a wide variety of natural the substrate, and evaluating their steric and electrostatic com- products, including penicillin, clavulanic acid, and holomycin patibility with the active site. The homology model of a putative (8, 9). The genome of S. clavuligerus ATCC 27604 encodes 20 pentalenene synthase (Uniprot: B5GLM7) from Streptomyces putative terpene synthases (10). Two of the enzymes are presumed clavuligerus was used in an automated computational workflow to be geosmin synthases based on their high sequence identity for product prediction. Surprisingly, the workflow predicted a (>60%), with known homologs from other Streptomyces spp. (11, linear triquinane scaffold as the top product skeleton for B5GLM7. 12). In 2011, Hu et al. reported that two other terpene synthases Biochemical characterization of B5GLM7 reveals the major prod- in this organism are involved in biosynthesis of T-muurolol and uct as (5S,7S,10R,11S)-cucumene, a sesquiterpene with a linear δ-cadinene (10). In the same year, Nakano et al. reported the triquinane scaffold. To our knowledge, this is the first documen- characterization of two additional terpene synthases, a mono- tation of a terpene synthase involved in the synthesis of a linear terpene cyclase that produced 1,8-cineole and a promiscuous triquinane. The success of our prediction for B5GLM7 suggests acyclic terpene synthase that produced linalool and nerolidol (13, that this approach can be used to facilitate the functional assign- 14). While this manuscript was in the final stages of preparation, ment of novel terpene synthases. the function of an additional terpene synthase, which is a protein variant of the subject of this investigation, was reported to syn- sesquiterpene | cyclase | structure-function | annotation thesize a linear triquinane of undefined stereochemistry (15). The erpenoid compounds form a large group of natural products Significance Tsynthesized by many species of plants, fungi, and bacteria. To date, more than 70,000 of these compounds have been identified, This paper describes a novel strategy for predicting the func- comprising ∼25% of all of the known natural products (Dictio- tion of terpene synthases. Functional assignment of terpene nary of Natural Products database, dnp.chemnetbase.com/intro/ synthases is a daunting task because product selectivity is not index.jsp). Terpenoids and molecules containing terpenoid high in many terpene synthases, and mutations in and near the fragments are produced from two basic five-carbon building active sites of selective enzyme can result in synthesis of dif- blocks: isopentenyl diphosphate (IPP) and dimethylallyl di- ferent products. Using a homology model of an unknown ter- phosphate (DMAPP) (1, 2). IPP and DMAPP are then con- pene synthase, we developed an algorithm that predicted the verted into a series of terpenoid diphosphate esters of increasing enzyme synthesizes a linear triquinane. We confirmed this chain lengths by chain length selective polyprenyl diphosphate prediction; specifically, the enzyme converts farnesyl diphosphate synthases to give geranyl diphosphate (GPP, C10), farnesyl di- to a linear triquinine sesquiterpene: (5S,7S,10R,11S)-cucumene. phosphate (FPP, C15), geranylgeranyl diphosphate (GGPP, The findings highlight the potential for using computational C20), and higher five-carbon homologs (3). These molecules are approaches to assist in the discovery and characterization of substrates for terpene synthases, which catalyze hydroxylation, unknown terpene synthases. elimination, cyclization, and rearrangement reactions to produce Author contributions: J.-Y.C., B.-X.T., S.C.A., M.P.J., and C.D.P. designed research; J.-Y.C., much of the structural diversity of terpenoid molecules found in BIOCHEMISTRY nature (4, 5). B.-X.T., G.R., B.S.H., and R.D.S. performed research; J.-Y.C., B.-X.T., G.R., B.S.H., R.D.S., S.C.A., M.P.J., and C.D.P. analyzed data; and J.-Y.C., B.-X.T., M.P.J., and C.D.P. wrote The two main classes of terpene synthases (class I and class II) the paper. are distinguished from one another by the mechanisms they use Conflict of interest statement: M.P.J. is a consultant to Schrodinger LLC, which distributes to initiate carbocationic cyclization and rearrangement reactions software used in this study. The authors declare no competing financial interest. and by their respective folds (6, 7). Class I terpene synthases use 1J.-Y.C. and B.-X.T. contributed equally to this work. 2+ active site Mg ions to bind and activate the diphosphate moi- 2To whom correspondence may be addressed. Email: [email protected] or poulter@ eties of their substrates as leaving groups to generate highly re- chemistry.utah.edu. active allylic carbocations. In those terpene synthases that catalyze This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10. cyclization reactions, the carbocations alkylate distal double bonds 1073/pnas.1505127112/-/DCSupplemental. www.pnas.org/cgi/doi/10.1073/pnas.1505127112 PNAS | May 5, 2015 | vol. 112 | no. 18 | 5661–5666 Downloaded by guest on September 24, 2021 functions of the remaining terpene synthases from S. clavuligerus we now report an automated computational workflow for predict- are unknown. ing terpene synthase product skeletons from protein sequences. Many terpene synthases are known to be promiscuous. The Application of this workflow aided the discovery and characteriza- promiscuity of terpene synthases is perhaps most dramatically tion of a sesquiterpene synthase (Uniprot: B5GLM7) from illustrated by γ-humulene synthase from Abies grandis, which S. clavuligerus. produces 52 different sesquiterpenes (16). Promiscuity in these enzymes is not surprising given the multiple cyclization and Results rearrangement steps that the substrates often undergo during Sequence Similarity Network for the Terpene Synthase-Like 2 Subgroup. catalysis, any one of which can be altered by perturbations to the Class I and class II terpene synthases can be distinguished from one microenvironment of the active site. Promiscuity is often seen in another based on sequence homology. The class I terpene syn- mutagenesis experiments where a few key mutations are typically thases, which are related to polyprenyl diphosphate synthases, sufficient to alter the product selectivity of the enzymes (17, 18). have a single all α-helix α-domain (PF03936) and are found mostly As a result, prediction of function for terpene synthases is still a in bacteria and fungi. The Structure-Function Linkage Database very challenging task in the field of computational biology. (SFLD) (26) assigns 931 protein sequences to the class I terpene One of the key objectives of the enzyme function initiative synthases, which are listed in the “terpene synthase-like 2 sub- (EFI) is the development of methods that facilitate functional group” (26). We constructed sequence similarity networks using − assignments for unknown terpene synthases as was previously these sequences (26). At an e-value cutoff of 10 50, these enzymes demonstrated for the polyprenyl transferases (3, 19). Quantum segregate into 14 main clusters containing 10 or more members mechanics-only and quantum mechanics/molecular mechanics (Fig. 1A). Terpene synthases with similar functions sometimes (QM/MM) calculations have provided important insights about group in the same clusters, as illustrated by epi-isozizaene syn- the mechanisms of reactions catalyzed by terpene synthases (20– thases (cluster 4), 2-methylisoborneol synthases (cluster 7), aris- 24). However, gas phase QM calculations