<<

Investigation of the mechanism of the SpnF-catalyzed [4+2]- reaction in the biosynthesis of spinosyn A

Byung-sun Jeona, Mark W. Ruszczyckyb,1, William K. Russellc, Geng-Min Lina, Namho Kimb, Sei-hyun Choia, Shao-An Wanga, Yung-nan Liub, John W. Patrickc, David H. Russellc, and Hung-wen Liua,b,1

aDepartment of , University of Texas at Austin, Austin, TX 78712; bDivision of Chemical Biology and Medicinal Chemistry, College of Pharmacy, University of Texas at Austin, Austin, TX 78712; and cDepartment of Chemistry, Texas A&M University, College Station, TX 77843

Edited by Jerrold Meinwald, Cornell University, Ithaca, NY, and approved August 7, 2017 (received for review June 12, 2017) The Diels–Alder reaction is one of the most common methods of biological cycloaddition reactions (3). For the purposes of the to chemically synthesize a six-membered carbocycle. While it has following discussion, we regard an intermediate as a local mini- long been speculated that the cyclohexene moiety found in many mum in the free energy along the reaction coordinate and a con- secondary metabolites is also introduced via similar chemistry, certed cyclization as one lacking any such intermediate. Conse- the enzyme SpnF involved in the biosynthesis of the insecticide quently, there has yet to be a direct experimental interrogation of spinosyn A in Saccharopolyspora spinosa is the first enzyme for the reaction coordinates that defines whether the catalytic cycles which of an intramolecular [4 + 2]-cycloaddition has been of these [4 + 2]-carbocyclases proceed via concerted or stepwise experimentally verified as its only known function. Since its dis- mechanisms. covery, a number of additional standalone [4 + 2]-cyclases have Development of a convergent synthetic strategy (2) has made been reported as potential Diels–Alderases; however, whether it possible to prepare isotopologs of the SpnF substrate, allow- their catalytic cycles involve a concerted or stepwise cycliza- ing study of the SpnF reaction via the measurement of sec- tion mechanism has not been addressed experimentally. Here, ondary deuterium kinetic isotope effects (KIEs). Previous chem- we report direct experimental interrogation of the reaction coor- ical model studies have indicated that concerted formation of dinate for the [4 + 2]-carbocyclase SpnF via the measurement both C–C bonds in a [4 + 2]-cycloaddition reaction (e.g., the of α-secondary deuterium kinetic isotope effects (KIEs) at all C4–C12 and C7–C11 bonds in 1) should result in inverse deu- 2 sites of sp2 → sp3 rehybridization for both the nonenzymatic and terium KIEs of about 5% at all four carbons undergoing sp to 3 enzyme-catalyzed cyclization of the SpnF substrate. The measured sp rehybridization (9, 10). In contrast, if cyclization is stepwise, KIEs for the nonenzymatic reaction are consistent with previous with the first step most rate-limiting, then unit to normal iso- computational results implicating an intermediary state between tope effects would be expected for carbons involved in the second formation of the first and second carbon–carbon bonds. The KIEs bond formation (9, 11). Since a stepwise intramolecular cycliza- measured for the enzymatic reaction suggest a similar mechanism tion for the SpnF-catalyzed cycloaddition (Fig. 2) is expected to of cyclization within the enzyme active site; however, there is evi- proceed with C7–C11 bond formation first, the C4D and C12D dence that conformational restriction of the substrate may play a secondary deuterium KIEs are of particular interest in determin- role in catalysis. ing the sequence of bond formation. With this goal in mind, we prepared four site-specifically deuterated isotopologs of 4 (C4D, Diels–Alderase | isotope effect | enzyme | spinosyn | Bayesian analysis C7D, C11D, and C12D) and measured the corresponding sec- ondary deuterium KIEs on both the nonenzymatic and enzyme- catalyzed reactions. The C2D and C14D isotopologs were also he enzyme SpnF from Saccharopolyspora spinosa catalyzes an Tintramolecular [4 + 2]-cycloaddition (4 → 5) during biosyn- thesis of the insecticide spinosyn A (1) (Fig. 1) (1, 2). At the Significance time of its discovery, the search for a naturally selected Diels– Alderase had already led to the characterization of several Computational and structural studies of putative Diels– enzymes proposed to catalyze [4 + 2]- (3). In all Alderases and the reactions that they catalyze have provided cases, however, these potential Diels–Alderases were found to testable models of the reaction coordinates and mechanisms be multifunctional, often catalyzing a priming reaction that acti- of catalysis; however, there has yet to be a direct experimental vates the substrate toward cyclization. Thus, the detailed mecha- interrogation of these hypotheses. Herein, the α-secondary nisms of cyclization and the catalytic roles played by the enzymes deuterium kinetic isotope effects at all sites of rehybridization were unclear and difficult to address. In contrast, the priming on the nonenzymatic and SpnF-catalyzed [4+2]-cycloaddition reaction (3 → 4) during spinosyn A biosynthesis is catalyzed by during biosynthesis of spinosyn A are reported. Results impli- the separate enzyme SpnM (1). The resulting 1, 4-dehydration cate an intermediary state between formation of the first and product 4 can then undergo stereospecific cyclization in aqueous second carbon–carbon bonds and provide evidence that con- solution with a half-life of ∼25 min at 300 K; however, in the formational restriction of the substrate may play a role in presence of SpnF, a rate enhancement (i.e., kcat /knon ) of ∼500- catalysis. fold is observed, making SpnF the first reported example of a standalone [4 + 2]-carbocyclase (1). This separation of activities Author contributions: M.W.R. and H.-w.L. designed research; B.-s.J., W.K.R., G.-M.L., N.K., S.-h.C., S.-A.W., Y.-n.L., and J.W.P. performed research; W.K.R., J.W.P., and D.H.R. has made SpnF an ideal system for mechanistic study. contributed new reagents/analytic tools; M.W.R. and H.-w.L. analyzed data; and B.-s.J., Since the discovery of SpnF, a number of additional enzymes, M.W.R., and H.-w.L. wrote the paper. such as VtsJ (4), PyrI4 (5), PyrE3 (5), TclM (6), TbtD (7), and The authors declare no conflict of interest. AbyU (8), have been characterized as uniquely catalyzing [4+2]- This article is a PNAS Direct Submission. cycloadditions; however, investigation of these enzymes has been 1To whom correspondence may be addressed. Email: [email protected] or h.w.liu@mail. limited primarily to crystallographic and computational studies utexas.edu. in large part because of the structural complexity of their sub- This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10. strates and the limited options available to study the mechanisms 1073/pnas.1710496114/-/DCSupplemental.

10408–10413 | PNAS | September 26, 2017 | vol. 114 | no. 39 www.pnas.org/cgi/doi/10.1073/pnas.1710496114 Downloaded by guest on September 29, 2021 Downloaded by guest on September 29, 2021 aigtertoo the by was of MS ratio (4) time-of-flight the substrate taking ionization SpnF electrospray residual by of determined sample then isotopolog isolated each observed of The ratio study cyclization. to is SpnF nonenzyme-catalyzed of flux absence the the reaction in Parallel performed that also reaction. were ensure experiments enzyme-catalyzed to the SpnF through sufficient predominantly of presence the spanning aliquots reaction of enriched consumption isotopically the generated ieseiclydueae stplgi H in isotopolog deuterated site-specifically in detailed as (A–C) Appendix, fragments SI deuterated of appropriately isotopologs the six ing the of approach preparation synthetic modular the facilitated This with (18). reactions SpnJ from enzyme isolated and biosynthetic prepared then was (3) strate of assembly macrolactoniza- the Yamaguchi completed by tion polyketide linear resulting the of Julia– A 3. fragment Fig. in fragments shown ∆ between as olefination reactions coupling Kocienski bond C–C fragments macrolactone three two the the connecting by and of prepared abundance were isotopologs Natural deuterated labeling. bias abundance site-specifically measurement natural any by minimize caused and changes populations observable isotopolog enhance in to As substrates (15–17). enriched substrate isotopically residual of in enrichment discussed isotopolog changes the monitor to in inter- MS of whole- and method (14) the competition using nal measured were KIEs site-specific The Discussion and Results com- recent a by predicted (12). as investigation carbon reaction putational two cyclization these the of in involvement centers possible the assess to prepared system. (PKS) synthase polyketide a constitute together 1. Fig. ene al. et Jeon 12 diino pMt itr fntrlabundance natural of mixture a to SpnM of Addition Methylmalonyl-CoA lfi,adaplaimctlzdSil opigwith coupling Stille palladium-catalyzed a and olefin, -(E) Propionyl-CoA Malonyl-CoA 4 5 isnhtcptwyo pnsnAin A spinosyn of pathway Biosynthetic C SpnF e oteC5C6lnae usqetcyclization Subsequent linkage. C-5/C-6 the to led IText. SI IAppendix, SI 12 4 4 12 residual 3, 11 7 7 11 M SpnA-E +1 (PKS) hsapoc eursusing requires approach this Text, SI ∼10–90% SpnM and –H 4 a sltdb PCfo six from HPLC by isolated was SpnG, H, I,K,L &P h orsodn pMsub- SpnM corresponding The 2. 2 O M ekitniisfrtesodi- the for intensities peak 4 ovrinof conversion nst.Atrcomplete After situ. in .spinosa S. A 3 2 /MOat O/DMSO 2 and pnsnA spinosyn 3 A, yincorporat- by B 1 ) SpnA–E 2). (1, and B, ile the yielded 4 SpnJ [O] 3 to ( 30 1 n a and C ) 5 via ◦ in C 2 egtdaverage weighted to S3. and S2 Tables and S4 and marginal in S3 label- provided posterior Figs. are C12D simulated KIEs the and and of C4D dataset distributions probability the full well The for as patterns. plots KIEs ing site-specific enrichment the of of examples estimates as point posterior in lists described 4 con- are including priors, analysis, this of regarding struction Details distribu- KIEs. probability the posterior for estimating variation tions time, between-experiment same the and at using within- while both by analysis least caused than Eq. KIEs rather regression of 22) (21, Bayesian fitting (MCMC) squares Carlo hierarchical Monte chain a Markov used We function material a D as reaction starting expressed of residual be tion can the light) in over (heavy concentrations isotopolog of measured. of KIEs consisting site-specific all each residual for trials, of samples experimental isolated six three approximately least At in . Methods provided are details analytical and Experimental of ated tpmyb osdrd“aelmtn”(4 5.Apidto Applied 25). roughly (24, the “rate-limiting” reaction, considered nonenzymatic which be the to for- may degree of the sum step reflects reciprocal a and a commitments as reverse written and be ward can index sensitivity Each and state, (i.e., state reference the where i.2. Fig. n aia(rniinsae)aogtegnrlzdfe nrycre of curves energy free coordinate. generalized reaction the bifurcating the along states) (transition maxima and strate k ie ussed tt,teKEo h ylzto of cyclization the on KIE the state, quasisteady a Given ratio the unity, near all are KIEs secondary expected the Since 5 n nta nihetratio enrichment initial and in [6+4] adduct ei tutrsrpeetpttv oa iia(intermediates) minima local putative represent structures Lewis 4. D ehnsi rpslfrthe for proposal Mechanistic n κ 2 i lmnayratos(tp)cnb xrse sthe as expressed be can (steps) reactions elementary R 14 14 S steefcieKEascae ihporsinfrom progression with associated KIE effective the is 14 4 2 2 (f i n orcigfrntrlaudnelbln (19). labeling abundance natural for correcting and 4 4 sRyssniiiyidxfrthe for index sensitivity Ray’s is 4 PNAS ; D 12 12 12 k f 8 4 7 11 = ρ) , ( 7 7 11 11 9 aaeeie ntrso h stp effect isotope the of terms in parameterized | ) etme 6 2017 26, September 1 D ρ omdlucranyi h measured the in uncertainty model to 4 k (1 (SpnF) rei ouin othe to solution) in free Cope = ‡ − X i =1 f n ρ ) [ 4 1/ codn o(20) to according S + D i D k ccodiino h pFsub- SpnF the of 2]-cycloaddition Fig. . Methods and Materials −1 κ | i , , [4+2] adduct o.114 vol. 7 5% 0 2 ≤ 14 14 2 eeperformed were 4, nes ausfor values inverse 2 i 14 f hse (23–25). step th 4 4 12 | < R 4 12 i aeil and Materials 12 o 39 no. htransition th IAppendix, SI 1. 6 ftefrac- the of 7 7 11 ( 11 11 7 5 ) | 10409 [1] [2] ‡ 4

CHEMISTRY BIOCHEMISTRY than 0.90 is no greater than 10% (SI Appendix, Fig. S3). There- C 7D k C 11D k 14 fore, the reduced magnitude of non and non vs. the computed values suggests that the actual transition state (rep- resented by 6) is likely earlier than that predicted by the com- A putational gas-phase model. The KIEs are not consistent, how- ever, with a pericyclic [4 + 2]-transition state for a synchronous 42 11 concerted cyclization, which would predict inverse effects at all 7 2 3 12 four sites of sp → sp rehybridization (9, 10, 29). In so far as the 10π-aromatic state can be considered an intermediate fol- BClowing 6, these observations suggest that nonenzymatic cycliza- tion of 4 may indeed be stepwise, albeit not in the usual sense 11 42 7 of a biradical or dipolar intermediate. It should be emphasized, 14 12 however, that the KIE measurements do not address existence of the [6 + 4]-cycloadduct (9) along the reaction coordinate, Julia-Kocienski Slle because they only provide information regarding the first, more Olefinaon Reacon rate-limiting elementary reaction. The enzymatic KIEs can also be interpreted in terms of Eq. D 2, where kenz corresponds to the isotope effect on V /K for 2 SpnF (14, 25). If substrate binding were the most rate-limiting 14 (i.e., S1 ≈ 1), then roughly unit isotope effects would have been expected at all six positions. If product dissociation were the most 2 12 11 rate-limiting (i.e., Sn ≈ 1), then the observed KIE would reflect 4 7 the equilibrium isotope effect on sp2 → sp3 rehybridization (23, Yamaguchi Macrolactonizaon 25), and four of six KIEs would be at least 5% inverse regardless [4+2] [6+4] Fig. 3. Modular, retrosynthetic design for the preparation of natural abun- of whether the product released is a - or -cycloadduct dance and site-specifically deuterated isotopologs of 2. Carbons are num- (9, 29). That these outcomes were not observed suggests that the bered according to their final position in the macrolactone. Full synthetic intermediary steps of the catalytic cycle dominate the observed schemata and protocols are provided in SI Appendix, SI Text and Figs. S5–S7. enzymatic isotope effects. Unlike the nonenzymatic cyclization, computational study of the SpnF-catalyzed reaction has been hampered, because a struc- C 7D C 11D knon and knon are indicative of partial C7–C11 bond ture of the SpnF–substrate complex is not yet available. Never- formation in the transition state of the most rate-limiting step. theless, the enzymatic results largely mirror those for the nonen- C 2D C 4D C 4D Likewise, the values centered near unity for knon , knon , zymatic reaction with the exception of kenz , which is notably C 14D 2 3 C 12D and knon suggest little to no sp → sp rehybridization at inverse, whereas the value of kenz is comparable with that these positions in this transition state compared with the ref- of the nonenzymatic case as shown in Fig. 4. While the posterior C 12D C 4D erence state (4). The ∼4% normal value for knon may be marginal probability that kenz < 0.98 is greater than 90%, C 12D explained by redistribution of electron density along the C11– the ∼5% normal value for kenz argues against significant C15 π-system, leading to a loosening of the C12–H bonding envi- C4–C12 bond formation in the transition state of the most rate- ronment (9, 11, 26). limiting step. Structural investigations of the [4+2]-carbocyclases These results are consistent with recent computational mod- SpnF, AbyU, and PyrE3 all implicate the formation of a closed els of the reaction (12, 13, 27, 28). In particular, a simplified complex on substrate binding into a largely hydrophobic active C 4D gas-phase model of the nonenzymatic reaction investigated by site (8, 30, 31). Therefore, the inverse value for kenz may Hess and Smentek (13) identified a reaction coordinate with reflect conformational restriction of 4 within a low-volume active a single potential energy saddle point (similar to 6 in Fig. 2) site of the Michaelis complex, resulting in a stiffer bonding envi- followed by two apparent inflection points forming a caldera- ronment for the C4–H center and an inverse binding isotope like region. Subsequent computations by the Houk and Single- effect (32). ton laboratories using variational transition-state theory identi- fied a local free energy minimum described as a 10π-aromatic Conclusions species (depicted as structure 7 in Fig. 2) residing about the As the number of recognized [4 + 2]-carbocyclases continues to caldera between two variational transition states along the grow, experimental interrogation of reaction coordinates pro- reaction coordinate (12). The first of these transition states (6) posed from structural and computational studies will be neces- is similar to that identified by Hess and Smentek (13), and the sary for a complete mechanistic understanding of these enzymes. trajectory of 7 can bifurcate on progression through the second The experimental results and computational predictions are col- transition state (8) to form either a [4 + 2]- or [6 + 4]-cycloadduct lectively most consistent with the hypothesis that the nonenzy- (i.e., 5 vs. 9), where the latter then converts to 5 via a facile Cope matic cyclization of 4 involves an intermediary state before com- rearrangement (12). The second of these transition states (8) is plete C4–C12 bond formation. A possible candidate for such ∼2.3 kcal/mol lower in free energy compared with the first (12). an intermediate (i.e., local free energy minimum) is the 10π- This implies that the sensitivity index at 300 K for the first tran- aromatic state (e.g., 7) identified in recent computational stud- sition state is nearly 50-fold greater than that for the second (24, ies (12). The results also suggest that the SpnF-catalyzed reac- 25). In other words, conversion of 4 to 7 via 6 is predicted to be tion is similar to that of the nonenzymatic cyclization but likely significantly rate-limiting in the nonenzymatic reaction. involves a reduction in the entropy of activation after the sub- The KIEs calculated from Hess and Smentek’s gas-phase mod- strate is bound in the active site, thereby facilitating catalysis els of 6 vs. 4 (13) are also shown in Fig. 4 and are directionally and the observed rate enhancement (kcat /knon ). These obser- consistent with the observed values. Because of uncertainty in the vations combined with prior structural investigations (8, 30, 31) measurements, there are ∼65 and 25% posterior probabilities suggest that [4 + 2]-carbocyclases in general may operate, in C 7D C 11D that the knon and knon isotope effects, respectively, are part, by reducing the conformation space available to the cycliz- between 0.90 and 0.95; however, the probability that they are less ing species in a closed, induced fit complex, a criteria worthy

10410 | www.pnas.org/cgi/doi/10.1073/pnas.1710496114 Jeon et al. Downloaded by guest on September 29, 2021 Downloaded by guest on September 29, 2021 ene al. et Jeon identically. treated were samples enzymatic and in is (rationale n h nta unh pFwsaddi ufiin uniy ota the that so quantity, sufficient tak- in before added immediately was all at and SpnF quench after consumed final quench. added been initial was had the SpnF substrate ing sub- KIEs, SpnM residual enzymatic the of of concentration of reduced measurement the later During for at strate. compensate taken to were aliquots points larger time con- Proportionately from been course). ranging had time points substrate (∼90-min time SpnM different the six of at all sumed after taken were aliquots reaction in vided Appendix Compounds. (SI Deuterated schemata of Synthesis Methods to and Materials design/reengineering protein future. novo the in de Diels–Alderases develop for consideration of in shown are plots 68% enrichment the all with as along well as KIE (C KIE each parentheses. each reaction. for in of nonenzymatic distribution distributions shown the probability probability are for marginal marginal K KIEs posterior initial site-specific 300 simulated the Overall at for the light) (B) distributions to (13) of over probability ). Methods (equivalent percentile) heavy marginal and interval (50th (D/H, posterior (Materials density enrichments median the shown observed posterior of the the trials values as denote median experimental reported the circles the are using Black to Values drawn reactions. corresponding are enzymatic KIE (red/blue) and lines and nonenzymatic Fitted the enrichment substrate. both residual for isolated KIEs the C12D in and C4D the for trials 4. Fig. otossoe ht fe t omto,ti ypouti ohsal and stable both is by-product this from formation, baseline-separated as its mass after same that, the showed with controls reaction within SpnM (4) the the substrate of product SpnF standardize to (3) 16 to substrate to SpnM species of standard The addition the integrations. internal peak HPLC an observed contain- as 8.0) p-methoxyacetophenone (pH 1 buffer 1 to than Tris close generally mM were 50 topologs in together at 10% concentration) mixed in were ing total stable, (14) are mM competition which (3), (∼1.5 internal substrates labeled SpnM site-specifically of trial, unlabeled each method and In trials. the experimental separate using three least measured was est Preparation. Sample Enrichment (D/H) Enrichment (D/H) AB oSn a rsn uigmaueeto oezmtcKE,and KIEs, nonenzymatic of measurement during present was SpnF No 0.94 0.96 0.98 1.00 1.02 1.04 0.62 0.63 0.64 0.65 0.66 0.67 0.68 .002 .006 0.80 0.60 0.40 0.20 0.00 .002 .006 0.80 0.60 0.40 0.20 0.00 : on rae hn4 than greater no to 4 umr ftescnaydueimKE nteezmtcadnnnyai ylzto of cyclization nonenzymatic and enzymatic the on KIEs deuterium secondary the of Summary IAppendix SI enzymac C4D nonenzymac C4D MOa 30 at DMSO IAppendix SI ∼80% Fracon ofReacon rco fReacon of Fracon , The IText SI oa ecinwstknat taken was reaction total 0.90 o l etrtdcmonsaepro- are compounds deuterated all for S5–S7) Figs. , ◦ 4 eoeadn pM nta aiso h iso- the of ratios Initial SpnM. adding before C 1.06 scnaydueimKEa ahst finter- of site each at KIE deuterium α-secondary yHPLC. by , 0.92 ,wihcnet l fteosral SpnM observable the of all converts which µM, IText SI . 1.01 : 1.04 7mM 0.47 included also mixtures Reaction 1. 0.96 1.02 .I l te epcs h nonenzymatic the respects, other all In ). 0.99 : ,atog hyrne rmn less no from ranged they although 1, 0.98 1.00 opeesnhtcpooosand protocols synthetic Complete 1.01 1.00 Daottemda faGusa itiuin.Iooeefcscmue rmtegspaemdlsystem model gas-phase the from computed effects Isotope distribution). Gaussian a of median the about SD ±1 i.Drn hstm,aby- a time, this During min. ∼4 ieseicKE o h pFctlzdcciainwt ausrpre sin as reported values with cyclization SpnF-catalyzed the for KIEs Site-specific )

4 Enrichment (D/H) Enrichment (D/H) 0t 90% to ∼10 0.44 0.39 0.40 0.41 0.42 0.43 i fe digSpnF adding after min ∼6 0.42 0.43 0.44 0.45 0.46 a eeae nst by situ in generated was .002 .006 0.80 0.60 0.40 0.20 0.00 .002 .006 0.80 0.60 0.40 0.20 0.00 4 lofrs however, forms; also C12D nonenzymac C12D enzymac oa reaction total Fracon ofReacon rco fReacon of Fracon 1.12 1.08 1.05 1.10 fet rmntrlaudnelbln r rvddin provided isotope are of labeling because bias abundance Text correcting of natural spectra, analysis from an MS and effects of labeling, processing abundance regarding natural iso- for Details retuning samples instrument. without together All the analyzed mass were energies. of trial the experimental ionization in given a source fragmentation in lated ion higher of at evidence even any spectrometer, show not over did mass experiments each at averaged intensities were Signal acquisition the prevent 3-min discarded. to third were instrument the flushes the from these obtained flush from to sample. Data used each effects. were for memory infusions series 3-min in two run MeOH:H first were The of infusions direct solution continuous 1:1 3-min 0.5 Three a of rate using flow high-sensitivity a diluted ion at injected first positive the University were in A&M Samples System Texas HDMS mode. G2 at SYNAPT Spectrometry Waters Mass a using Biological for Laboratory the Enrichments. Isotopolog in provided are Appendix criteria (SI outlier variables and in error in of described deter- all assessment as were of an integrations reaction free peak of were HPLC Fractions they from by-products. that mined ensure and to products HPLC reaction by other reanalyzed also were samples nepsdbtenec PCsprto opeetmmr fet.Iso- 1 effects. a memory prevent to 25% separation lated to HPLC at return each start linear between and gradient: interposed min, following 2 the for using B (B) MeCN vs. 25% Microsorb- (A) (Varian water HPLC with eluted reversed-phase sub- C18 250 residual C18 using the 100-5 MV; isolated and was centrifugation, (4) by strate removed was Protein cyclization. 5% ing 1.06 1.04 nrmvl lqoswr unhdb iig1 mixing by quenched were aliquots removal, On : 1.02 7 pcr curdt banafia pcrmfraayi.Control analysis. for spectrum final a obtain to acquired spectra ∼175 neapeM pcrmi hw in shown is spectrum MS example An . 1.02 itr fMC n ae o Saayi.Aiut fteisolated the of Aliquots analysis. MS for water and MeCN of mixture 1 ,lna o50% to linear B, 1.01 4 1.01 a ocnrtdi pe aumcnetao n eisle in redissolved and concentrator vacuum speed a in concentrated was 1.00 1.00 MOa 0 at DMSO IAppendix SI C PNAS × enzymac nonenzymac ◦ m ihU eeto t24n.Teclm was column The nm. 254 at detection UV with mm) 4.6 (1.012) xmlso nihetposfo experimental from plots enrichment of Examples (A) 4. (1.004) n3 i,lna o70% to linear min, 35 in B ,wiharssbt nyai n nonenzymatic and enzymatic both arrests which C, is 3adS4. and S3 Figs. , | 0.994 1.000 0.990 1.000 sltdrsda usrt a nlzdb Sat MS by analyzed was substrate residual Isolated etme 6 2017 26, September /i ihsga vrgn vr1sintervals. 1-s over averaging signal with µL/min –0.010 +0.011 –0.036 +0.040 –0.014 +0.013 –0.025 +0.025 (0.988) IAppendix SI 1.004 0.938 n2mn ovn ln a run was blank A min. 2 in B (1.027) 2 2 1.041 14 1.049 14 IAppendix SI –0.012 +0.014 –0.023 +0.020 h ulsmltdposterior simulated full The B. | , , n3mn scai t70% at isocratic min, 3 in B IText SI o.114 vol. 4 4 IText SI –0.013 +0.015 –0.013 +0.013 IAppendix SI 12 12 : ihMC contain- MeCN with 1 . i.S1. Fig. , .Ttlsml sizes sample Total ). 11 11 | (0.892) 7 7 (0.876) 0.946 0.961 IAppendix SI o 39 no. 0.935 0.963 ln with along –0.025 +0.023 –0.045 +0.044 –0.024 +0.024 | 2 highest –0.026 +0.024 and O 10411 , SI

CHEMISTRY BIOCHEMISTRY 2 Statistical Analysis. Each of the 12 measured KIEs was analyzed indepen- with mean zero and variance σw . Therefore, if fi and ri are vectors of the ni dently of the others using a hierarchical Bayesian regression model (33, 34). fractions of reaction and enrichment measurements in the ith experimental If n is the number of independent experimental trials for a given KIE, then trial, respectively, then the likelihood function based on the ith experimen- the model involves 2n+3 parameters. At the highest level is the KIE of inter- tal trial is given by est denoted Dk, for which a log-normal prior was constructed with median ni 0 and variance 0.0325, which reparameterizes to  D D  Y   D  2  p ri| k, σb, σw , ki, ρi, fi = N rij|R fij; ki, ρi , σw . [8]   1   j=1 p Dk = N logD k|0, 0.0325 , Dk > 0. [3] Dk The likelihood function over all independent experimental trials is then just The distribution in Eq. 3 has a median of unity and ensures that the prior the product probability of Dk lying on a positive interval [a, b] is equal to the prior prob- n ability that it is on [1/b, 1/a]. This can be interpreted physically as treating   Y   p r|Dk, σ , σ , Dk, ρ, f = p r |Dk, σ , σ , Dk , ρ , f , [9] positive and negative differences in the free energy of activation between b w i b w i i i i=1 two isotopogs equally. The variance of the log-normal distribution (0.0325) places ∼68% of the prior probability mass between KIEs of 0.83 and 1.20, where f and r are the fractions of reaction and enrichment vectors over all consistent with previous reports of α-secondary deuterium KIEs on rehy- experimental trials. The posterior joint distribution of the parameters con- bridization reactions at 300 K (9, 10, 29, 35–38). The prior also allows for the ditional on the observations is given by Bayes’ Theorem and is proportional possibility of large KIEs, consistent with previous estimates of theoretical to the product of the prior joint distribution in Eq. 6 and likelihood function maxima (11, 29, 39) on the intervals [0.70, 0.83] and [1.20, 1.43] with a com- in Eq. 9: that is, bined probability of 25%. Extreme KIEs that exceed the predicted maxima D D  have a prior probability less than 5% with this prior. A sensitivity analysis p k, σb, σw , k, ρ|f, r ∝ described in SI Appendix, SI Text shows that the posterior inferences for Dk     |D , σ , σ , D , ρ, D , σ , σ , D , ρ , were insensitive to changes in the scale of this prior, unless it was increased p r k b w k f p k b w k [10] so as to include extreme values with significant probability. because the distribution of f is assumed to provide no information regarding Between-experiment variation is modeled by characterizing the ith trial the distribution of r given f (33). with its own observed isotope effect Dk and initial enrichment ρ . The i i MCMC (33, 34, 41, 42) was used to sample from the posterior joint dis- prior probability distribution of ρ is treated as uniform on (0, 10), consis- i tribution in Eq. 10 using the affine-invariant ensemble sampling algorithm tent with the experimental design, and Dk is modeled as dependent on Dk i proposed by Goodman and Weare (21) and implemented in Python as the according to emcee package (22). The emcee sampler uses an ensemble of walkers to gen- D D ki = k + ηi, [4] erate separate MCMC chains. A total of 200 walkers were used for all exper- iments except the C7D enzymatic KIE, where the number was 246. Starting where the error term η is a truncated Gaussian random variable with scale i positions for the walkers were based on a Gaussian ball about the maxi- σ and mode 0. The prior distributions for Dk (conditional on Dk and σ ) b i b mum likelihood estimate of the parameters under a linear approximation. and ρ are then given by i Each walker was iterated 5,000 times after an initial equilibration period D D  D D 2 D (burn-in) of 1,000 iterations, which were discarded. This produced a total p ki| k, σb = N ki| k, σb , ki > 0, [5a] sample size of at least 1 × 106 points, which were used without thinning to construct the simulated posterior joint distribution. Effective sample sizes p(ρ ) = U(0, 10). [5b] i were calculated for each parameter from the autocorrelation times (22, 42) Based on previous recommendations for hierarchical analyses with small and found to be at least 10,000 for each parameter (SI Appendix, Tables S2 numbers of groups (40), a half-Cauchy distribution on (0, ∞) with scale 0.2 and S3), which is consistent with recommendations for interval estimation was used for p(σb), which admits the possibility for large variation between (34). Potential scale reduction factors (43) were calculated for each param- experiments with significant probability. Sensitivity analysis indicated that eter and found to be less than 1.06 in all cases (SI Appendix, Tables S2 and doubling the scale from 0.2 to 0.4 had no meaningful impact on the pos- S3), consistent with equilibration. In all cases, the mean acceptance fraction terior marginal probability intervals for Dk. Finally, within-experiment vari- was between 0.3 and 0.5, indicating that the walkers were neither becom- ation is modeled as the SD about the regression, denoted σw , and the prior ing trapped nor representative of a random walk (22). Visual inspection of distribution for σw was uniform on support (0, 1). The joint prior distribu- randomly selected walker chains (SI Appendix, Fig. S2) indicated the absence tion is then given by of any apparent drift or trapping, which is in line with the quantitative eval- n uations. SI Appendix, Table S2 provides summary statistics for all parameters D D  D  Y D D  p k, σb, σw , k, ρ = p k p (σb) p (σw ) p ki| k, σb p(ρi), [6] based on the simulated posterior joint distributions obtained in the analy- i=1 sis of each nonenzymatic KIE. SI Appendix, Table S3 does the same for each enzymatic KIE. Simulated posterior marginal distributions for each KIE (i.e., where Dk denotes the n vector of Dk parameters, and ρ denotes the n vector i the Dk parameter) as well as sampled curves from the posterior joint distri- of ρ parameters. i bution are shown in SI Appendix, Figs. S3 and S4. The likelihood function is equal to the probability of the observed data conditional on the parameters. The likelihood function was thus constructed Computed KIEs. Computational methods used to calculate KIEs are described by modeling the jth enrichment from the ith experimental trial as in SI Appendix, SI Text.  D  rij = R fij; ki, ρi + εij, [7] ACKNOWLEDGMENTS. We thank Ken Houk and Dan Singleton for helpful where the function R is given by Eq. 1, fij is the jth fraction of reaction in the discussions during preparation of this manuscript. This work was supported ith trial, and εij is the jth ith realization of the random Gaussian variable ε by NIH Grant R01 GM040541 and Welch Foundation Grant F-1511.

1. Kim HJ, Ruszczycky MW, Choi Sh, Liu Yn, Liu Hw (2011) Enzyme-catalysed [4 + 2] 7. Hudson GA, Zhang Z, Tietz JI, Mitchell DA, van der Donk WA (2015) In Vitro biosyn- cycloaddition is a key step in the biosynthesis of spinosyn A. Nature 473:109– thesis of the core scaffold of the thiopeptide thiomuracin. J Am Chem Soc 137:16012– 112. 16015. 2. Kim HJ, et al. (2014) Chemoenzymatic synthesis of spinosyn A. Angew Chem Int Ed 8. Byrne MJ, et al. (2016) The catalytic mechanism of a natural Diels-Alderase revealed 53:13553–13557. in molecular detail. J Am Chem Soc 138:6095–6098. 3. Jeon BS, Wang SA, Ruszczycky MW, Liu Hw (2017) Natural [4 + 2]-cyclases. Chem Rev 9. Storer JW, Raimondi L, Houk KN (1994) Theoretical secondary kinetic isotope effects 117:5367–5388. and the interpretation of transition state geometries. 2. The Diels-Alder reaction tran- 4. Hashimoto T, et al. (2015) Biosynthesis of versipelostatin: Identification of an enzyme- sition state geometry. J Am Chem Soc 116:9675–9683. catalyzed [4 + 2]-cycloaddition required for macrocyclization of spirotetronate- 10. Beno BR, Houk KN, Singleton DA (1996) Synchronous or asynchronous? An “exper- containing polyketides. J Am Chem Soc 137:572–575. imental” transition state from a direct comparison of experimental and theoretical 5. Tian Z, et al. (2015) An enzymatic [4 + 2] cyclization cascade creates the pentacyclic kinetic isotope effects for a Diels-Alder reaction. J Am Chem Soc 118:9984–9985. core of pyrroindomycins. Nat Chem Biol 11:259–265. 11. Henderson RW, Pryor WA (1972) Secondary hydrogen isotope effect for the transition 6. Wever WJ, et al. (2015) Chemoenzymatic synthesis of thiazolyl peptide natural prod- from olefin to free . Int J Chem Kinet 4:325–330. ucts featuring an enzyme-catalyzed formal [4 + 2] cycloaddition. J Am Chem Soc 12. Patel A, et al. (2016) Dynamically complex [6 + 4] and [4 + 2] cycloadditions in the 137:3494–3497. biosynthesis of spinosyn A. J Am Chem Soc 138:3631–3634.

10412 | www.pnas.org/cgi/doi/10.1073/pnas.1710496114 Jeon et al. Downloaded by guest on September 29, 2021 Downloaded by guest on September 29, 2021 5 asn G ta.(07 ncuaisi eetdinmntrn eeiainof detemination monitoring ion selected in Inaccuracies (2007) al. et AG, Cassano 15. 6 otR,Lv A er J(93 ehdfrtecluaino normal-mode of calculation the for method A (1983) WJ Hehre BA, Levi RF, Hout 26. of Interpretation (2006) VE Anderson MW, steady- to Ruszczycky Application definition. 25. quantitative A steps: the Rate-limiting utilizing (1983) reactions Jr complex WJ Ray on effects 24. iosotpe kinetic of Analysis (1981) RL Stein hammer. MCMC 23. The emcee: (2013) J Goodman D, Lang DW, Hogg D, Foreman-Mackey invariance. affine 22. with isotope samplers Ensemble of (2010) aspects J Weare experimental J, Goodman and Theoretical 21. (1958) tan- M by Wolfsberg J, product Bigeleisen peptide 20. of distributions Isotopologue (2007) al. et B, Sac- in Wang spinosyn of 19. biosynthesis The (2007) 2 Hw Liu L, by Hong Q, Wu cleavage R, of Pongdee hydrolysis HJ, the RNA Kim for structure state 18. for Transition (1997) VL effects Schramm SR, isotope Blanke PJ, Berti Kinetic 17. (2010) al. et ME, Harris 16. mechanisms. enzyme elucidate to effects isotope of Use enzyme-catalyzed (1982) WW asynchronous, Cleland highly 14. Concerted, (2012) L Smentek Jr, BA Hess 13. 7 evdvM,e l 21)Qatfigpsil otsfrSn-aaye formal SpnF-catalyzed for routes possible Quantifying (2017) al. et MG, Medvedev 27. ene al. et Jeon ahCmu Sci Comput Math Nucleotide acquisition: profile Biochem Anal by obviated ratios isotope Biochem Chem Biomol eodr etru stp fet ncarbocations. on of effects calculation isotope the deuterium to secondary Application coordinates. symmetry using frequencies vibrational steps. sensitive isotopically multiple 328–342. exhibiting reactions matic reactions. enzymic state state. transition virtual the of concept Pac Soc Astron Publ kinetics. chemical in effects incorporation. deuterium of levels low Biochem of Quantitation spectrometry: mass dem identification SpnJ. and of precursor function cross-bridging the the of of Synthesis spinosa: charopolyspora NAD base. specific by 132:11613–11621. activation Nucleophilic transphosphorylation: [4 il-le cycloaddition. Diels-Alder + + 2] aaye ydptei toxin. diphtheria by catalyzed yladto nteboytei fsioy ;cmuainlevidence. computational A; spinosyn of biosynthesis the in cycloaddition 13:385–429. 367:40–48. 10:7503–7509. 367:28–39. 5:65–80. 125:306–312. mCe Soc Chem Am J Biochemistry mCe Soc Chem Am J d hsChem Phys Adv mCe Soc Chem Am J 22:4625–4637. r Chem Org J 129:14582–14584. 139:3942–3945. 1:15–76. 46:3328–3330. 119:12079–12088. opChem Comp J V /K 18 stp fet o enzy- for effects isotope O/ 16 ho Biol Theor J measurements. O mCe Soc Chem Am J 4:499–505. omnAppl Commun rtRev Crit Anal 243: 0 Org -O- 1 hn ,e l 21)Enzyme-dependent (2016) al. et Q, catalyzes Zheng that enzyme 31. standalone a SpnF, of structure The (2015) al. et CD, Fage 30. vari- structure Transition-state (1989) YCJ Huang JR, Kagel KB, Peterson enzyme- of JJ, system Gajewski model a 29. of study Computational (2015) VP Ananikov EG, Gordeev 28. 3 emnA ui B(92 neec rmieaiesmlto sn multiple using A simulation Carlo: iterative Monte from chain Inference Markov (1992) (1998) DB RM Rubin Neal A, A, Gelman Gelman 43. BP, Carlin RE, Kass 42. (2003) models. DJC hierarchical effects in MacKay parameters isotope variance 41. Kinetic for distributions (1958) Prior (2006) S A Suzuki Gelman RC, 40. Fahey RH, Jagow Jr, Diels- the A on effect Streitwieser isotope deuterium 39. secondary with The 2-methylfuran (1964) JO of Rodin reaction DE, Diels-Alder Sickle Van the of 38. mechanism The (1965) S Seltzer 37. the in variation structure Transition-state (1987) JR Kagel KB, Peterson JJ, Gajewski 36. mul- of determination simultaneous High-precision (1995) AA Thomas DA, Singleton 35. (2015) JK Kruschke (2014) al. 34. bane. et and A, Gelman Boon effects: 33. isotope Binding (2007) VL Schramm 32. h ecino erysmercldee n inpie snal synchronous. nearly is dienophiles effects. and isotope dienes Soc kinetic Chem symmetrical Am deuterium nearly secondary of from reaction reaction The Diels-Alder the in ation mediated neato fthe of interaction cycloaddition. sequences. discussion. roundtable UK). Cambridge, Press, Univ (Cambridge Anal Bayesian tosylates. cyclopentyl deuterated of 2332. acetolyses the in reaction. Alder anhydride. maleic synchronous. nearly is dienophile reaction and The 109:5545–5546. diene effects: symmetrical isotope nearly kinetic a deuterium of secondary from reaction Diels-Alder abundance. natural at effects 9358. isotope kinetic small tiple 11:529–536. 23:352–360. [4 ttSci Stat + 111:9078–9081. a hmBiol Chem Nat 1:515–533. 2] mCe Soc Chem Am J PNAS yladto reaction. cycloaddition N mCe Soc Chem Am J 7:457–472. on aeinDt Analysis Data Bayesian Doing tria eunewt h aayi oei PyrI4. in core catalytic the with sequence -terminal nomto hoy neec,adLann Algorithms Learning and Inference, Theory, Information mStat Am aeinDt Analysis Data Bayesian | etme 6 2017 26, September 11:256–258. 86:3091–3094. 52:93–100. 87:1534–1540. LSOne PLoS [4 + CC oaRtn L,3dEd. 3rd FL), Raton, Boca (CRC, 2] Esve,Lno) n Ed. 2nd London), (Elsevier, | yladto eed nlid-like on depends cycloaddition 10:e0119984. o.114 vol. mCe Soc Chem Am J mCe Soc Chem Am J urOi hmBiol Chem Opin Curr | o 39 no. mCe Soc Chem Am J elCe Biol Chem Cell 117:9357– | 80:2326– 10413 [4 + 2] J

CHEMISTRY BIOCHEMISTRY