Review 721

Structural studies of natural biosynthetic proteins Craig A Townsend

The first high-resolution structures of key proteins involved in the Introduction biosynthesis of several natural product classes are now appear- X-ray qstallography has dramatically advanced our under- ing. In some cases, they have resulted in a significantly improved standing of protein structure and the macromolecular asso- mechanistic understanding of the often complex processes ciation of proteins with IlKA, RNA, drugs, cofactors and catalyzed by these , and they have also opened the way other proteins. Progress has, however, been considerably for more rational efforts to modify the products made. slower in obtaining detailed insights into the enzymes that catalyze natural product biosynthesis. ‘I’he reasons for this Address: Department of Chemistry, The Johns Hopkins University, Baltimore, MD 21218, USA. are several; in particular the enzymes concerned have modest, even poor, kinetic properties, they are available in Correspondence: Craig A Townsend only minute amounts from wild-type organisms and they E-mail: [email protected] are often not very stable. As a consequence, they have proved difficult to detect, isolate and purify. Efforts towards Chemistry & Biology October 1997, 4:721-730 http://biomednet.com/elecref/1074552100400721 this end have been successful in the last decade, ho\vever, and now a number of genes encoding synthetic enzymes for C Current Biology Ltd ISSN 1074-5521 natural products of interest have been cloned and recombi- nant enzymes have been made available for thorough char- acterization. Given the commercial importance of natural products, the associated enzymes hai,e become targets for manipulation by chemical modification, muragenesis and combinatorial methods, not onI\; to improve the production of these high-value materials but also, through deeper mechanistic understanding, to achieve the efficient synthe- sis of new chemical entities. It is axiomatic that both dynamic and structural information are required in order to understand the behavior of a chemical system. Rioorganic chemists (or chemical biologists!) have proved resourceful, even ingenious, in devising experiments to relcal in exquisite mechanistic detail the frequently complex cat- alytic cycles of biosynthetic enzymes. On the other hand, high-resolution structures of these proteins, which would be immensely useful guides to any program of protein modifi- cation for mechanistic or biotechnological ends, ha\,e not been available until very recently. Among the first of a growing wave of 0these structures is that of at 1.3 A resolution [l], which underscores this broader development and prompts this short review.

Penicillin The biosynthesis of penicillin from a tripeptide by a single , isopenicillin N synthase (IPNS), is among the most impressive transformations known in natural prod- ucts chemistry. In the reaction, a monomeric protein con- taining a single ferrous ion, in the presence of one molecule of dioxygen, mediates a stcpwise double oxida- tive cyclization of &(I,-aminoadipyl)-I,-cysteinyl-r>-valinc (ACV, 1) to isopenicillin N (2) and two molecules of water (Figure 1) [2,3]. The intermediate I,.r,,r)-tripeptide is in turn synthesized from its constituent [,-amino acids and ATP by ACV synthetase (AC\%), a member of the wider class of synthetases responsible for the analogous synthesis of modified polypeptides such as 722 Chemistry & Biology 1997, Vol 4 No 10

Figure 1

The biosynthesis of isopenicillin (2) from its L-a-Aminoadipic acid ACVS primary amino acid precursors. The reaction L- proceeds via an intermediate tripeptide, ACV J -Y L-Valine (1). ACVS, ACV synthetase; IPNS, 3 x (AMP + PP,) 3xATP isopenicillin N synthase.

2 Chemlstrv & B~oloclv

gramicidin and cyclosporin [4,.5]. Thus, in only two hydroperoxide 4 and the electronic equi\ alcnt of an iron- cnzymic steps, a compound of enormous value to human ligated thioaldchyde. Generation of the fcrryl species health is produced by chemical reactions that have an effi- appearing in 5 is invoked to release a hydroxide ion in the ciency unmatched by contemporary synthetic organic where deprotonation of the amide sets the stage methods and at a lower cost than that of the primar) for transannular closure to the monocyclic p-lactam 5. For- amino acid building blocks. mation of the thiazolidine ring follous suggestions made previousI>, [9] that the ferry1 species 5 carries out ‘I’he X-ray crystal structure of IPNS has recently been homolytic GH bond cleavage at the valyl P-carbon, prc- obtained under anaerobic conditions, in the presence of sumably no\v rotated to correctly position this hydrogen so Fe(II) and AC\’ [I]. An earlier crystal structure of the cat- as to initiate formation of the second ring - a process alytically inactive hln(II) enzyme showed the metal ion to indirectly supported by studies of other non-heme [iron] be octahedrally coordinated by HisZ14, Asp216, His270. oxygenases such as clavaminate synthase (although thcb- Gln330 and two water molecules [6]. This ligand geometry arc dependent on a-ketoglutarate to activate oxygen) was thought to represent the resting state of the enzyme. [l&-12]. The

Figure 2 -.- The proposed mechanism of lsopenicillin N (2) 1 -7 form&on from the L,L,o-tripeptlde ACV (1; see ( HIS His Figure 1). The reactlon is catalyzed by IPNS. His, \ Ill HIS ’ \ II

ACV

02

H I H 3 4 IPNS-Fe”

i I His

the functions of which have been assigned on the basis of is the case with XC\7S. to release the fully elaborated biochemical studies and high le~rcls of sequence identity. peptidc. The reactions of a single module are depicted in I)etailcd sequence comparisons have Icd to the definition Figure 4. Type I NKPSs comprise about 1000 amino of domains responsible for: (1) amino acid recognition and acids. N-hletkylation domains, which distinguish type II adcnylation by ATP, (2) thioestcrification of the resulting from type I NKFYS enzymes, are appro\;iniately 400 amino mixed anhydrides by phosphopantetheine units bound at acids long and arc located in the carboxy-terminal region highly conserved scrinc residues, (3) occasional epimcriza- of adenylation domains [5]. tion and/or N-methylation, (1) amide bond formation (condensation) and, finally. (5) intramolecular cwlization @a .300 precursor molecules are found in non-ribosomal or hydrolvsis bv carboxy-terminal thioesterase domains, as polypcptides. ‘I’he ability to assemble these units in 2

Figure 3

The modular arrangement of some non- ribosomal oeptide svnthetases (NRPSs) and Molecular mass the overall’reactions that they catalyze b = SH terminus of phophopantetheine bound at a ’ ’ l Amino acids --_ ““0 yJfy& 405-425 kDa conserved serine). ACVS, ACV synthetase; :- ATP GRS, gramicidin synthetase and SIM, ACVS 6OOH cyclosporin synthetase. ACV-tripeptide (tfipeptide)

o-Phe-L-Pro- L-Val-L-Orn-L-Leu 6 +i -Arni;pacids , 590 kDa (total) I GRS L-Leu-L-Orn-L-Val-L-Pro-o-Phe Gramicidin S (pentapeptide x 2)

7 7 ? t t 7 t t ? : i Amzin;ids Cyclosporin A 1700 kDa (undecapeptide) SIM S-Adenosylmethlonlne Chemistry & Biology 724 Chemistry & Biology 1997, Vol 4 No 10

Figure 4

1 The domain organization of a generic type I Type I NRPS NRPS module and an illustration of the 120 kDa/Module thiotemplate mechanism of polypeptide synthesis. Activation of an amino acid occurs by recognition and adenylation followed by -j-:pi&o,y.om transfer to the module-bound phosphopantetheine. Epimerization (‘Epim’ 0 0 f0 domain) and N-methylation can take place at this stage followed, finally, by condensation to afford the polypeptide product. Type II NRPS enzymes are distinguished by an N-methylation domain (orange) in addition to those present in a type I system. The module with the product bound to phosphopantetheine is shown in the box.

0 R 0. NH, + ATP + PP,

HO OH Type 11 a NRPS

0 R

NH,

\ . Chemistry & Eldogy 1

predetermined order and number by combining the appro- invariant or highly conserved residues within the signature priate modules to encode a chimeric synthetase has clear motifs that are thought to be involved in AMP binding or mechanistic and practical attractions. In particular, it could the interaction of the enzyme with the a-amino and a-car- be imagined that combinatorial methods could be applied boxy1 of the substrate phenylalanine. Although proteins to the synthesis of new chemical species. Indeed, experi- with this level of sequence identity are expected to have ments have been carried out to show the feasibility of such highly similar main chain conformations, the’ determinants a goal [18]. The X-ray structure of an entire module would of amino acid binding specificity are governed by the be of great value for determining the three-dimensional nature of the protein residues lining the substrate-binding relationships between its several catalytic domains, and an pocket. As would be anticipated, differences in these regions are evident in individual adenylation domains. important advance0 toward this goal has just been reported in the 1.9 A structure of the phenylalanine-activating subunit of gramicidin synthetase (PheA) [17,19]. Adenyla- The model polypeptide folds into two compact sub- tion domains play a central role in peptide formation by domains; a large amino-terminal portion that contributes selecting and activating specific building blocks for syn- most substrate contacts and a significantly smaller carboxy- thesis. Nine well-conserved signature sequences have terminal part. Although it shares only 16% amino acid iden- been identified within adenylation domains and they are tity, principally in short, highly conserved motifs, the X-ray believed to be involved in ATP and amino acid binding crystal structure of unligated firefly luciferase shows and catalysis [4]. These assignments have been supported remarkable topological similarity to the PheA model struc- by mutagenesis studies. The PheA model structure con- ture [ZO]. One would expect that all adenylation domains, sists of 512 residues which is in good agreement with the which have 30-60% sequence identity, have very similar size of an entire adenylation domain with both phenylala- three-dimensional structures. Interestingly, in the model nine and AMP bound to it. The structure reveals several PheA structure, the small carboxy-terminal domain is Review Structural studies of natural product biosynthetic proteins Townsend 725

Figure 5 - A linear domainal representation of an animal fatty acid synthase. KS, P-ketoacyl synthase; ACP AT, acyl (acetyl and malonyl); DH, 1 KS H -HEi--,AT ER H KR H H TE 1 dehydratase; ER, enoyl reductase; KR, I . P-ketoacyl reductase; ACP, acyl carrier . I protein; TE, thioesterase. The relative sizes of / TE H t+Z--HT} IDHH AT H KS] the domains and the DH-ER linker are ACP approximately to scale. Chemtry & Biology rotated approximately 90” relative to its orientation in firefly physical constraints to catalysis that are imposed by the luciferase. Although the model PheA structure includes length of the phosphopantetheine arm. substrate and AMP, the luciferase structure does not. These conformational differences may reflect distinct points in the In contrast to vertebrate fatty acid biosynthesis in which catalytic cycles of these proteins-an important dynamic acetyl-CoA (Figure 6, R = II) and malonyl-CoA (Figure 6, consideration to any attempt at protein engineering. R’ = H) cycle through FAS in a repeated manner, micro- bial polyketide synthesis extends these basic chemical Modular polyketide synthases reactions dramatically, especially in bacteria. Starter units Another major group of multidomainal biosynthetic pro- arc variable (Figure 6, R = H, CH,, ethyl, butyl, isobutyl, teins that has achieved prominence recently is the modular alkyl. arylalkyl, etc.), and homologations by substituted bacterial polyketide synthases (PKSs), responsible for the malonyl units are common (Figure 6, R’ = H. CH;. ethyl, biosynthesis of, for example, erythromycin [Zl-243 and butyl, etc.). Similarly, reductive transformations that rapamycin [25]. Interestingly, both the primary sequences proceed in lock step at the P-carbon in fatty acid synthesis and the organization of individual catalytic domains in can stop at any step and homologation can continue. PKSs closely mimic vertebrate fatty acid synthases (FASs). hloreover, the P-hydroxyl intermediate can, in principle, Animal FASs are highly evolved polyproteins (Type I) that exist in any of four possible stereochemical configurations contain seven catalytic sites: P-ketoacyl synthase (KS), acyl (Figure 6, “) at the a and p carbons. transferase (AT), de’hydratase (DH), enoyl reductase (ER), p-ketoreductase (KR), acyl carrier protein (ACP) and It can be readily appreciated that compounds of great thioesterase (TE) arranged in that order from the amino SmCtUrdl diversity can be rapidly assembled using the terminus to the carboxy terminus of the protein (Figure 5). simple synthetic reactions available to PKSs. How Nature The active form of the protein exists as a head-to-tail has done this is perhaps best understood through investi- homodimer (a,) in which the penultimate ACP domains, gations of the modular PKS of 6-deoxycrythronolide B (6- each with its bound phosphopantetheine arm (Figure 5; dEB; 6; Figure 7) biosynthesis in S~~~ar~~o~~~.sp~ru l = SH), ‘carry’ the intermediates of synthesis through suc- e~thmetr. In this organism, three large open reading cessive rounds of two-carbon extension (homologation) by frames were discovered that encode three large proteins, decarboxylative Claisen additions of malonyl-ACP, fol- DEBS l-DEBS 3 [Zl-241. Assignments of individual cat- lowed by reduction/dehydration/reduction steps until a alytic domains were readily made on the basis of their chain length of C,, or C,, is reached. At this point the acyl- high sequence homology to equivalent domains in animal pantetheinate of ACP is hydrolyzed at the thioesterase and FAS. In keeping with a ‘processive’ model of polyketide released as the free acid. The whole process is shown biosynthesis, six modules could be assigned that corre- schematically in Figure 6. sponded to the initial loading of a propionate starter unit and five chain extensions of (ZS)-methylmalonyl-CoA The two central features of the process are that the active 1321 and accompanying modifications (Figure 7). In con- sites are used iteratively and that synthesis requires the trast to animal FASs, modular PKSs are elaborately con- cooperation of catalytic centers located on both of the structed such that each active site is used non-itcrativel) antiparallel subunits [26,27]. Clever complementation (if at all) [2.5]. A telling early experiment both supported experiments by Stuart Smith’s group [28,29] have estab- the validity of the multifunctional catalytic roles proposed lished that a single catalytic complex consists of the for the DEBS proteins and raised unmistakably the possi- DH-ER-KR-ACP-TE of one strand and the AT-KS of bility of engineering these proteins in a rational \vay for the other, despite the occurrence of a -600 residue linker the production of new ‘natural’ products. Katz and his separating the DH and ER domains (Figure 5). The sub- coworkers at Abbott Laboratories [Z] generated a mutant units of FAS therefore cannot be positioned in a strictly of DKBS 3 in which the ketoreductasc in module 5 (KR; extended, side-by-side manner as was suggested earlier by Figure 7) was disrupted. The mutant produced an analog the analysis of electron microscope (EM) images [30,31], (7) of 6-dEB bearing a keto group at C-S rather than a but must be coiled in some manner to accommodate the hydroxyl group as had been hoped. 726 Chemistry & Biology 1997, Vol 4 No 10

Figure 6

7 A comparison of FAS homologation of acetyl- FAS PKS RCH, - COSCoA CoA with malonyl-CoA (and their substituted II derivatives with different groups at R and R’) I with a generic PKS that shows a varying extent of reduction/dehydration/reduction at the II thloacyl P-carbon prior to further homologation 0 0 (highlighted in color). Both R- and AT (repeat) S-configurations are possible at the positions SR SR marked (*). Abbreviations as for Figure 5. P---P ACF R R’ R R’

KR 4 Homologation I I OH 0 HOOC-CHR’-COSCoA ~

SR F--P SR R R’ IDH KS 0 (repeat) ‘; SR R R’ IER HH 0 repeat repeat SR -..._____ ---- , SR Rw R’ R R’ L Ghemlstry & Biology

Erythromycin is a front-lint antibiotic and the commercial remarkable strides are currently being made to exploit the importance of other polyketide metabolites is also well biotechnologicai possibilities of the PKSs [36-M]. Similarly, appreciated [33-X5]. Lsing the 6-dEB system, in particular, massive up-regulation of human FAS has been associated

Figure 7

The modular organization of DEBS 1 DEBS 2 DEBS 3 6-deoxyerythronolide B synthase (DEBS) and >-,l the proposed build up of 6-deoxyerythomycin Prime Module 2 Module 4 Module 6 B (6, 6-dEB) in a processive fashion from the - Module 1 - Module 3 Module 5 - initiation of synthesis at module 1 to final AT ACP KS AT KR ACP KS AT KR ACP KS AT ACP KS AT DH ER KR ACP KS AT KR ACP KS AT KR ACP TE macrocyclization occurring at the Q i 8 B thioesterase domain of module 6. Other 0 0 0 0 abbreviations as for Figure 5.

Ho- 0 Ho- =I HO- F rn, HO- HO-

F Hc 0

t Ho- + 0

HO HO-

f HO

1

6 Chemlstty & Biology Review Structural studies of natural product biosynthetic proteins Townsend 727

with cancer inter din of the breast, ovary and prostate [39]. PKSs, Simpson and coworkers [47] have recently Detailed tertiary and quarternary information about these described a high-resolution NhlK structure of the upo- proteins would be very useful to guide the design of drugs ACP of actinorhodin (crct) synthesis from S~eptom~~.s roeli- targeted to human FAS and to allow selective domainal color, nuclear Overhauser enhancement spectroscopy modifications or combinatorial methods of de no~o domain (NOESY) experiments suggest that there is little interac- assembly in PKSs. The methods that have been applied to tion between phosphopantetheine and ACP. The tertiary modular PKS structures have followed those that were used folds of the two proteins show strong structural homology, earlier for FAS. The Leadlay and Staunton groups [40,41] despite the generally observed failure of FAS ACPs to have examined the behavior of the DEBS proteins during substitute for PKS ACPs in genetically enginecrcd gel filtration and analytical ultracentrifugation and found systems. Both ACPs share a hydrophobic core that pre- them, like vertebrate FAS. to be homodimers. Cross- sumably accommodates the growing acyl chain as synthe- linking and partial proteolysis studies clearly indicate a sis proceeds. It is quite flexible, and differing extents of structural motif in which the inter-subunit arrangement is mobility, and different charged and hydrophilic residues overall a par&l rather than an anti-parallel arrangement lie buried within this core. These dynamic and structural [40,41]. But, just as the extended, side-by-side cylinders differences may account for the synthetic discrimination view of animal FAS is now seen as oversimplified [29], so shown by the different ACPs [47]. must the PKS subunits be intertwined to achieve synthesis. Experiments by the Khosla and Cane groups [42] suggest X-ray structures of two further proteins of the E. di FAS .that, like Type I FASs, the ACP on one chain must cooper- apparatus have been determined recently. The acyl trans- ate with the KS on the partner chain. ‘They go on to propose ferase that transfers malonate from CoA0 to ACP has been a dimer model in which the individual modules of one crystallized to afford a structure at 15 A [48]. The catalytic chain zig-zag back and forth in one plane and are arrayed Ser92 is revealed as hydrogen bonded to a histidine, with their partner in such a way that each module pair is ori- His201, which is typical of . but the catalytic ented head-to-tail [42]. In contrast, Leadlav and Staunton triad appears to be completed by Gln2.50 rather than the [41] have advanced a ‘double helical’ model in which the usual carboxylic acid of aspartate or glutamate. The chains are twisted together in a parallel head-to-head, tail- enzyme is known to be highly discriminating for malonyl- to-tail fashion; this model can also account for the new CoA in preference to acetyl-CoA, but the origin of this sub- results of Smith for the Type I FAS [ZC,]. strate specificity is not fully clear from the X-ray structure, although a nearby arginine may be involved. Enoyl rcduc- Electron microscopy, physical measurements, cross- tase catalyzes the last reaction in the catalytic cycle of FAS linking and proteolysis experiments have provided low- and requires NADPH (Figure 6); the E. cvli enzyme has resolution views of FAS [30,31] and, to a lesser extent, of been crystallized0 in the presence of NAL)+ and its &ucturc the PKS enzymes [40-U]. Closely related to the studies solved at 2.1 A [49]. Although the NAD’ presuni- described above have been those carried out with the ably occupies the expected , the location of the fungal Type I PKS h-methylsalicylic acid synthase [43] &P-unsaturated thioester substrate is not known. and the first, seminal experiments in this area by Lynen [44] on yeast FAS two decades ago. [infortunately, high Terpenes resolution information about Type I systems from X-ray The terpenes are the final class of natural products in crystattography or nuclear magnetic resonance (NhIK) which X-ray crystallography has begun to provide funda- spectroscopy is lacking. In bacteria, however, most fatty mental mechanistic insights. hlore than 23,000 terpenes acid biosynthesis is carried out by particulate enzymes arc known of the widest structural variety of any knobvn that have singie active sites (Type II FASs). Some bacter- class of natural product; they range from being essential ial PKSs arc particulate as well, such as those involved in membrane components, steroid hormones and insecticides the synthesis of actinorhodin [33] and tetracenomycin to carotenoids, flavors and fragrances. Central to the for- [35]. These bacterial enzymes are smaller, more tractable mation of all of these products arc the prenyl targets for detailed structural analysis and have started to in which dimethylallyl diphosphate (8; DMAPP; Figure 8) become a focus for experimental effort. They are also sig- and isopentyl diphosphate (9: IPP), the basic building nificantly homologous to individual domains of their Type blocks of isoprenoids, are condensed to give geranyl I FAS and PKS brethren and, hence, their structures are diphosphate (10; GPP) and inorganic pyrophosphate. In likely to be broadly representative of the corresponding an analogous reaction, IPP can react again with this allylic domains of these multifunctional proteins. diphosphate intermediate to afford farnesyl diphosphate (11; FPP) and so forth (Figure 8). The first advance in the structural analysis of the bacterial enzymes was reported by Kim and Prestegard [45,46] with Avian FPP synthase has been crystallized as a dimer and its the solution structure of /zolo-ACP from the Type II FAS structure has been determined at 2.6 w resolution [SO]. of Escherichiu co/i. In an important correlation with Type II The structure is highly a-helical: each subunit is composed 728 Chemistry & Biology 1997, Vol 4 No 10

Figure 8

The assembly of linear isoprenoids by prenyl transferase enzymes. J+-OPP+ Lopp \- -,,, 8 9 PP, 10

IPP

PP, +----... A \ \ \ +A OPP

11 Chemistry & Biology

of ten core helices surrounding a large central cavity con- Mg(I1) ions at one of the Asp-Asp-X-X-Asp sites with taining the putative active site. Highly conserved residues their hydrocarbon tails extending toward the mutated line this site: notably two Asp-Asp-X-X-Asp motifs, residues constituting the altered floor of the active site. which reside on opposing walls of the cavity, as well as spe- Although longer chain allylic diphosphates could not he cific arginine residues. The roles of these amino acids in co-cystallized, molecular modeling exercises disclosed that substrate binding and catalysis have been demonstrated by the rotation of just two sidechains opened a passageway site-specific mutagenesis experiments. Mg(II)-IPP has that was lined mainly with hydrophobic residues. This been co-crystallized with the prenyl transferase but, unfor- molecular picture provides the sharpest insight into how tunately, was insufficiently resolved to allow its precise prenyl transferases catalyze the fundamental elongation of placement within the active site. Although the mechanism C, units and control their length. This view of terpene of fundamental chain elongation reactions has been biosynthesis will be widened by the X-ray structures of studied in detail, a further understanding of the process on three terpene cyclases that are soon to he published (these the basis of the FPP synthase structure has been largely structures have been published whilst this review was in elusive, apart from the organization of conserved amino press [S&54]). The terpene cyclases take the reaction acids in the active site. products of prenyl transferases, like FPP synthase, and catalyze the formation of the familiar cyclic structural The central issue of the control of chain length during the types that are characteristic of this natural product class. prenyl transferase reaction has been addressed in an acutely rational manner [Sl]. An examination of the X-ray Conclusions structure of FPP synthase and comparisons among 35 At the heart of mechanistic enzymology lies a central protein sequences of GPP and FPP synthases and higher interest in catalysis - how do enzymes achieve their chain-length synthases points to the potentially important extraordinary rate accelerations? As a rule, the enzymes of roles of a pair of phenylalanine residues that form a ‘floor’ natural product biosynthesis are not so impressive for their to the presumed allylic substrate-binding pocket. In con- rates of synthesis hut rather for the reactions that they cat- trast, higher chain-length synthases have smaller or more alyze. For natural product enzymes, investigations tend to flexible sidechain residues, such as serine, and alanine or be skewed towards determining the mechanism and its methioninc, at these sites. Mutants of the avian FPP syn- relationship to gross structure, as well as to the pragmatic thase bearing alanine and/or serine substitutions at these considerations of how that knowledge can be applied. The aromatic amino acid sites were constructed and their reac- point I wish to make is that a threshold has been crossed tions were compared with those of the wild-type enzyme in the study of natural product biosynthetic enzymes that, [Sl]. A single amino acid replacement, Phell2+AlallZ, while the inevitable consequence of hard-won advances in extended the chain length cleanly by one isoprene protein purification and molecular biology, has elevated unit to C,,,. Substitution of both aryl groups in the the level of mechanistic discussion to include detailed Phel lZ+Alal lZ/Phel13~Serll3 double mutant, on the three-dimensional information about protein structure. other hand, dramatically affected the products formed. A Caveats to interpreting site-specifically mutated enzyme distribution of C2o-C7o isoprenoids resulted, with the behavior to correspond to that of wild-type enzymes peak at C&Z,,,. The ape-enzyme and structures contain- remain as they were when pointed out a decade ago [SS], ing GPP and FPP could be obtained, plainly showing and limitations can he seen for cases in which substrates binding of the allylic diphosphate substrates and a pair of and/or cofactors cannot he co-crystallized or successfully Review Structural studies of natural product biosynthetic proteins Townsend 729

diffused into presumed active sites. Less obvious is the 12. Salowe, S.P., Krol, W.J., Iwata-Reuyl, D. &Townsend, C.A. (1991). Elucidation of the order of oxidations and identification of an fact that the catalyzed reactions are multistep and intermediate in the multistep clavaminate synthase reaction. inescapably involve multiple transition states. The effects Biochemistry 30, 2281.2292. 13. Townsend, CA. & Basak, A. (1991). Expenments and speculations on substrate of protein structural changes, or indeed struc- the role of oxldative cyclization chemistry in natural products tural changes, on the relative energies of these states (and biosynthesis. Tetrahedron 47, 2591-2602. on overall reaction rates and mechanism) will be difficult 14. Townsend, C.A. (1992). Oxidative amino ald processing in p-lactam antibiotic biosynthesis. Biochem. Sot. Trans. 21, 208-213. to assess or predict. Linked to this conundrum is the prob- 15. Leasing, R.A., Zang, Y. & Que, J., L. (1992). Oxidative ligand transfer to able conformational plasticiq of active sites that direct alkanes: a model for Iron-mediated C-X bond formation In p-lactam reactions through multiple intermediates to one or a antibiotlc biosynthesis. 1. Am. Chem. Sot. 113, 8555-8557. 16. Cooper, R.D.G. (1993). The enzymes Involved In biosynthesis of number of products. IPNS, clavaminate synthase [M] or penicillin and cephalosporin; their structure and function. Bioorg. & the terpcne cyclases may be broadly cited as examples. An Med. Chem. 1, l-17. 17. Marahlel, M.A. (1997). Protein templates for the biosynthesis of X-ray crystal structure affords a snapshot and does not in peptide antiblotlcs. Chem. Biol. 4, 561-567. itself give a picture of whatever dynamic flexibility may 18. Stachelhaus, T., Schneider, A. & Marahiel, M.A. (1995). Rational design of peptide antibiotics by targeted replacement of bacterial and be required to guide all phases of these catalytic events. fungal domains. Science 269, 69-72. Nonetheless, it is clear that altered substrate specificities 19. Conti, E., Stachelhaus, T., Marahiel, M.A. & Brick, P. (1997). Structural will be achieved (greatly aided by the availability of high- basis for the activation of phenylalanine in the non-nbosomal biosynthesis of gramicidin S. EM60 J. 16, 4174-4183. resolution protein structures) and multidomain proteins 20. Conti, E., Franks, N.P. & Brick, P. (1990). Crystal structure of firefly will be created that synthesize new ‘natural’ products. luciferase throws light on a super family of adenylate-forming enzymes. Structure 4, 287-298. These are exciting prospects. It is important to note that 21. Cortes, J., Haydock, SF., Roberts, G.A., Bevitt, D.J. & Leadlay, P.F. gi\-en the synthetic capability of natural product enzymes, (1990). An unusually large multifunctional polypeptide in the erythromycin-producing polyketide synthase of Saccharopo/yspora subset only a small of possible structures exists (or has yet eryfbraea. Nature 348, 176-I 78. been discovered). Although the absence of some theoreti- 22. Donadio, S., Staver, M.J., McAlpine, J.B., Swanson, S.J. & Katz, L. cally possible products ma); be a result of unfavorable (1991). Modular organization of genes required for complex polyketide biosynthesis. Soence 252, 675-679. thermodynamics, other selective pressures must be at 23. Donadio, S. & Katz, L. (1992). Organization of the enzymatic domains work in Kature as well. It will be interesting to see if. for In the multifunctional polyketide synthase involved In erythromycin formation in Saccharopolyspora erythraea. Gene 111, 51-60. example, the antibiotic properties of the macrolide class of 24. Bevitt, D.J., Cartes, J., Haydock, SF. & Leadlay, P.F. (1992). natural products can be improved beyond the outcome of 6-Deoxyerythronolide-B synthase 2 from Saccharopolyspora erythraea. these forces. Cloning of the structural gene, sequence analysis and inferred domain structure of the multifunctional enzyme. Eur. J. Biochem. 204, 39-49. 25. Schwecke, T., et al., & Leadlay, P.F. (1995). The biosynthetic gene Acknowledgements cluster for the polyketide immunosuppressant rapamycin. Proc. Nat/ I am pleased to acknowledqe and thank the National Institutes of Health Acad. SC. USA 92, 7839-7843. for its support of the biosynthetic projects ongoing in my laboratory 26. Wakil, S.J. (1989). Fatty acid synthase, a proficient multifunctlonal (All4937 and ESO1670). enzyme. Biochemfstry 28, 4523-4530. 27. Smith, S. (1994). The animal fatty acid synthase: one gene, one polypeptjde, seven enzymes. FASEB J. 8, 1248-l 259. References 28. Witkowski, A., Joshl, A. & Smith, S. (1996). Fatty acid synthase: ,n vitro 1. Roach, P.L., et al., & Baldwin, J.E. (1997). Structure of isopeniclllin N complementation of inactive mutants. Biochemistry 35 10569-l 0575. synthase complexed with substrate and the mechanism of penicillin 29. Joshi, A.K., Wltkowski, A. &Smlth, S. (1997). Mapping of functional formation. Nature 387, 827-830. Interactions between domains of the anlmal fatty acid synthase by 2. Baldwin, J.E., Adlington, R.M., Moroney, S.E., Field, L.D. & Ting, H.-H. mutant complementatlon in vitro. Biochemfstry 36, 2316-2322. (1984). Stepwise ring closure in penicillin biosynthesis. Initial p-lactam 30. Stoops, J.K., Wakll, S.J., Uberbacher, E.C. & Bunick, G.J. (1987). formation. J. Chem. Sot., Chem. Commun., 984-986. Small-angle neutron-scattering and electron microscope studies of the 3. White, R.L., John, E.-M.M., Baldwin, J.E. &Abraham, E.P. (1982). chicken liver fatty acid synthase. J. Biol. Chem. 262, 10246-l 0251. Stoichiometry of oxygen consumption in the biosynthesis of 31. Kitamoto, T., Nishigal, M., Sasaki, T. & Ikai, A. (1988). Structure of fatty lsopenicillin from a tripeptide. Blochem. J. 203, 791-793. acid synthetase from the harderian gland of guinea pig. Proteolytic 4. Kleinkauf, H. & Von DBhren, H. (1996). A nonribosomal system of dissection and electron microscopic studies. J. Mol. Biol. 203, 183-l 95. peptide biosynthesis. Eur. J. Biochem. 236, 335-351. 32. Marsden, A.F.A., Caffrey, P., Apariclo, J.F., Loughran, M.S., Staunton, 5. Stachelhaus, T. & Marahiel, M.A. (1995). Modular structure of genes J. & Leadlay, P.F. (1994). Stereospecific acyl transfers on the encoding multifunctional peptide synthetases required for non- erythromycin-producing polyketide synthase. Science 263, 378-380. ribosomal peptide synthesis. FEMS Microbial. Left. 125, 3-l 4. 33. Hopwood, D.A. & Sherman, D.H. (I 990). Molecular genetics of 6. Roach, P.L.. et al., & Baldwin, J.E. (1995). Crystal structure of polyketides and Its comparison to fatty acid biosynthesis. Annu. Rev. isopenicillin N synthase is the first from a new structural family of Genet. 24, 37-66. enzymes. Nature 375, 700-704. 34. Katz, L. & Donadio, S. (1993). Polyketide synthesis: prospects for 7. Orville, A.M., et al., & Lipscomb, J.D. (1992). Thiolate ligation of the hybrid antibiotics. Annu. Rev. Microobiol. 47, 875-912. active site Fe*+ of isopenicillin N synthase derives from substrate 35. Hutchinson, CR. & Fujii, I. (1995). Polyketide synthase gene rather than endogenous cysteine: spectroscopic studies of site- manipulation: a structure-function approach in engineering novel specific Cys+Ser mutated enzymes. Biochemistry 31, 4602-4612. antibiotics. Annu. Rev. M~crobiol. 49, 201-238. 8. Clue, J., L. & Ho, R.Y.N. (1996). Dioxygen activation by enzymes with 36. Pieper, R., Kao, C. & Khosla, C. (I 996). Specificity and versatility in mononuclear non-heme iron active sites. Chem. Rev. 96, 2607-2624. erythromycln biosynthesis. Chem. Sot. Rev., 297-302. 9. Baldwin, J.E. & Abraham, E. (1988). The biosynthesis of penicillins and 37. Jacobsen, J.R., Hutchinson, CR., Cane, D.E. & Khosla, C. (1997). cephalosporins. Nat. Prod. Reports 5, 129-l 45. Precursor-directed biosynthesis of erythromycin analogs by an 10 Krol, W.J., Basak, A., Salowe, S.P. &Townsend, C.A. (1989). engineered polyketide synthase. Science 277, 367-369. Oxidative cyclization chemistry catalyzed by clavaminate synthase. J. 38. Staunton, J. &Wilkinson, B. (1997). The biosynthesis of erythomycin Am. Chem. Sot. 111, 7625-7627. and rapamycin. Chem. Rev., in press. 11 Basak, A., Salowe, S.P. &Townsend, CA. (1990). Stereochemical 39. Kuhajda, F.P., et al., & Pastemack, G.R. (1994). Fatty acid synthesis: a course of the key ring-forming reactions in clavulanic acid potential selective target for antineoplastic therapy. Proc. Nat/ Acad. biosynthesis. J. Am. Chem. Sot. 112, 1654-l 656. So. USA 91, 6379-6383. 730 Chemistry & Biology 1997, Vol 4 No 10

40. Aparicio, J.F., Caffrey, P., Marsden, A.F.A., Staunton, J. & Leadlay, P.F. (1994). Limited proteolysis and active-site studies of the first multienzyme component of the erythromycin-producing polyketide synthase. 1. Biol. Chem. 269, 8524-8528. 41. Staunton, J., Caffrey, P., Aparicio, J.F., Roberts, G.A., Bethel, S.S. & Leadlay, P.F. (1996). Evidence for a double-helical structure for modular polyketide synthases. Nat. Struct. Biol. 3, 188-l 92. 42. Kao, C.M., Pieper, R., Cane, DE. & Khosla, C. (1996). Evidence for two catalytically independent clusters of active sites in a functional modular polyketide synthase. Biochemistry 35, 12363-l 2368. 43. Child, C.J., Spencer, J.B., Bhogal, P. & Shoolingin-Jordan, P.M. (1996). Structural similarities between 6-methylsalicylic acid synthase from Pemcillium pat&m and vertebrate type I fatty acid synthase: evidence from thiol modification studies. Biochemistry 35, 12267-l 2274. 44. Lynen, F. (1980). On the structure of fatty acid synthetase of yeast. Eur. J. Biochem. 112, 43 l-442. 45. Kim, Y. & Prestegard, J.H. (1989). A dynamic model for the structure of acyl carrier protein in solution. Biochemistry 28, 8792-8797. 46. Kim, Y. & Prestegard, J.H. (1990). Refinement of the NMR structures for acyl carrier protein with scalar coupling data. Proteins 8, 377-385. 47. Crump, M.P., et a/., & Simpson, T.J. (1997). Solution structure of the actinorhodin polyketide synthase acyl carrier protein from Streptomyces coekolor A3(2). Biochemistry 36, 6000-6008. 48. Serre, L., Verbree, E.C., Dauter, Z., Stuitje, A.R. & Derewenda, Z.S. (1995). The Escher$hia co/i malonyl-CoA:acyl carrier protein transacylase at 1.5 A resolution. J. Biol. Chem. 270, 12961-I 2964. 49. Baldock, C., et al., & Rice, D.W. (1996). A mechanism of drug action revealed by structural studies of enoyl reductase. Science 274, 2107-2110. 50. Tarshis, L.C., Yan, M., Poulter, CD. & Sacchettini, J.C. (1994). Crystal structure of recombinant farnesyl diphosphate synthase at 2.6 A resolution. Biochemistry 33, 10871-I 0877. 51. Tarshis, L.C., Proteau, P.J., Kellog, B.A., Sacchettini, J.C. & Poulter, CD. (1996). Regulation of product chain length by isoprenyl diphosphate synthases. froc. Nat/ Acad. Sci USA 93, 15918-l 5023. 52. Wendt, K.U., Poralla, K. &Schultz, G.E. (1997). Structure and function of a squalene cyclase. Science 227, 181 l-1 815. 53. Starks, CM., Back, K. Chappell, J. & Noel, J.P. (1997). Structural basis for cyclic terpene biosynthesis by tobacco 5-epi-aristolochene synthase. Science 227, 1815-l 820. 54. Lesburg, C.A., Zhai, G., Cane, DE. &Christianson, D.W. (1997). Crystal structure of pentalenene synthase: mechanistic insights on terpenoid cyclization reactions in biology. Science 227, 1820-l 824. 55. Knowles, J. (1987). Tinkering with enzymes: what are we learning? Scrence 236, 1252-l 258. 56. Busby, R.W. &Townsend, CA. (1996). A single monomeric iron center in clavaminate synthase catalyzes three nonsuccessive oxidative transformations. B/oorg. & Med. Chem. 4, 1059-l 064.