US 20120107893A1
(19) United States
(12) Patent Application Publication
(10) Pub. No.: US 2012/0107893 A1
Ajikumar et al.
(43) Pub. Date: May 3, 2012

(54) MICROBIAL ENGINEERING FOR THE PRODUCTION OF CHEMICAL AND PHARMACEUTICAL PRODUCTS FROM THE ISOPRENOID PATHWAY

(75) Inventors: Parayil K. Ajikumar, Cambridge, MA (US); Gregory Stephanopoulos, Winchester, MA (US); Too Heng Phon, Kent Vale (SG)

(73) Assignee: Massachusetts Institute of Technology, Cambridge, MA (US)

(21) Appl. No.: 13/249,388

(22) Filed: Sep. 30, 2011

Related U.S. Application Data

(63) Continuation-in-part of application No. 12/943,477, filed on Nov. 10, 2010.

(60) Provisional application No. 61/280,877, filed on Nov. 10, 2009, provisional application No. 61/388,543, filed on Sep. 30, 2010.

Publication Classification

(51) Int. Cl. C12P 5/00 (2006.01)

(52) U.S. Cl. ...... 435/166

(57) ABSTRACT

The invention relates to the production of one or more terpenoids through microbial engineering, and relates to the manufacture of products comprising terpenoids.

Patent Application Publication May 3, 2012 Sheet 1 of 24 US 2012/0107893 A1

[Drawing sheets 1 to 24 of 24: figure images not preserved. Recoverable labels: Fig. 2b; Fig. 4b; Fig. 9a; Fig. 14b (axis label: Upstream Pathway Strength); Fig. 16A, 16B, 16a, 16b; Fig. 17c, 17d; mass-spectrum axis label "Acquired Range m/z"; trace label "Strain 31".]

MICROBIAL ENGINEERING FOR THE PRODUCTION OF CHEMICAL AND PHARMACEUTICAL PRODUCTS FROM THE ISOPRENOID PATHWAY

RELATED APPLICATIONS

[0001] This application is a continuation-in-part of U.S. application Ser. No. 12/943,477, filed on Nov. 10, 2010, which claims the benefit under 35 U.S.C. §119(e) of U.S. Provisional Application Ser. No. 61/280,877, entitled “Microbial Engineering for the Production of Chemical and Pharmaceutical Products from Isoprenoid Pathway,” filed on Nov. 10, 2009 and U.S. Provisional Application Ser. No. 61/388,543, entitled “Microbial Engineering for the Production of Chemical and Pharmaceutical Products from Isoprenoid Pathway,” filed on Sep. 30, 2010, the entire disclosures of which are incorporated by reference herein in their entireties.

GOVERNMENT INTEREST

[0002] This work was funded in part by the National Institutes of Health under Grant Number 1-R01-GM085323-01A1. The government has certain rights in this invention.

FIELD OF THE INVENTION

[0003] The invention relates to the production of one or more terpenoids through microbial engineering.

BACKGROUND OF THE INVENTION

[0004] Terpenoids are a class of compounds with diverse uses, including in fragrances, cosmetics, food products, pharmaceuticals, and other products. Thus, methods for their efficient and cost effective production are needed.

[0005] For example, Taxol and its structural analogs have been recognized as the most potent and commercially successful anticancer drugs introduced in the last decade. Taxol was first isolated from the bark of the Pacific Yew tree, and early stage production methods required sacrificing two to four fully grown trees to supply sufficient dosage for one patient. Taxol's structural complexity necessitated a complex chemical synthesis route requiring 35-51 steps with highest yield of 0.4%. However, a semi-synthetic route was devised whereby the biosynthetic intermediate baccatin III was first isolated from plant sources and was subsequently converted to Taxol. While this approach and subsequent plant cell culture-based production efforts have decreased the need for harvesting the yew tree, production still depends on plant-based processes with accompanying limitations of productivity and scalability, and constraints on the number of Taxol derivatives that can be synthesized in search for more efficacious drugs.

[0006] Methods for the production of terpenoids and derivatives of these compounds, and methods of making products that comprise such compounds, are needed.

SUMMARY OF THE INVENTION

[0007] Recent developments in metabolic engineering and synthetic biology offer new possibilities for the overproduction of complex natural products through more technically amenable microbial hosts. Although exciting progress has been made in the elucidation of the biosynthetic mechanism of Taxol in Taxus, commercially relevant Taxol producing strains have eluded prior attempts aiming at the transfer of this complex biosynthetic machinery into a microbial host. Yet, as with other natural products, microbial production through metabolically engineered strains offers attractive economics and great potential for synthesizing a diverse array of new compounds with anti-cancer and other pharmaceutical activity.

[0008] The metabolic pathway for Taxol and its analogs consists of an upstream isoprenoid pathway that is native to E. coli, and a heterologous downstream pathway (FIG. 6). The upstream mevalonic acid (MVA) or methylerythritol phosphate (MEP) pathways can produce the two common building blocks, isopentenyl pyrophosphate (IPP) and dimethylallyl pyrophosphate (DMAPP), from which Taxol and other isoprenoid compounds are formed. Recent studies have highlighted the engineering of the above upstream pathways to support biosynthesis of heterologous isoprenoids such as [...] and artemisinic acid. The downstream taxadiene pathway has been reconstructed in E. coli, but, to-date, titers have not exceeded 1.3 mg/L.

[0009] The above rational metabolic engineering approaches focused on either the upstream (MVA or MEP) or the downstream terpenoid pathway, implicitly assuming that modifications are additive, i.e. a linear behavior. While this approach can yield moderate increases in flux, it generally ignores non-specific effects, such as toxicity of intermediate metabolites, cellular effects of the vectors used for expression, and hidden unknown pathways that may compete with the main pathway and divert flux away from the desired target. Combinatorial approaches can avoid such problems as they offer the opportunity to adequately sample the parameter space and elucidate these complex non-linear interactions (refs. 28, 29, 30). However, they require a high throughput screen, which is often not available for many desirable natural products. Yet another class of pathway optimization methods has explored the combinatorial space of different sources of the heterologous genes comprising the pathway of interest. Still dependent on a high throughput screen, these methods generally ignore the need for determining an optimal level of expression for the individual pathway genes and, as such, have proven less effective in structuring an optimal pathway.

[0010] In the present work, as an example of aspects of the invention, we focus on the optimal balancing between the upstream, IPP-forming pathway with the downstream terpenoid pathway of taxadiene synthesis. This is achieved by grouping the nine-enzyme pathway into two modules: a four-gene, upstream, native (MEP) pathway module and a two-gene, downstream, heterologous pathway to taxadiene (FIG. 1). Using this basic configuration, parameters such as the effect of plasmid copy number on cell physiology, gene order and promoter strength in an expression cassette, and chromosomal integration are evaluated with respect to their effect on taxadiene production. This modular and multivariable combinatorial approach allows us to efficiently sample the main parameters affecting pathway flux without the need for a high throughput screen. The multivariate search across multiple promoters and copy numbers for each pathway module reveals a highly non-linear taxadiene flux landscape with a global maximum exhibiting a 15,000 fold increase in taxadiene production over the control, yielding 300 mg/L production of taxadiene in small-scale fermentations. Further, we have engineered the P450 based oxidation chemistry in Taxol biosynthesis in E. coli, with our engineered strains improving the taxadien-5α-ol production 2400-fold over the state of the

synthase enzyme. The cell can further recombinantly express any of the polypeptides associated with the invention.

[0023] Aspects of the invention relate to isolated polypeptides comprising a taxadiene 5α-hydroxylase (T5αOH) enzyme or a catalytically active portion thereof fused to a cytochrome P450 reductase enzyme or a catalytically active portion thereof. In some embodiments, the cytochrome P450 reductase enzyme is a Taxus cytochrome P450 reductase (TCPR). In certain embodiments, the taxadiene 5α-hydroxylase and TCPR are joined by a linker such as GSTGS (SEQ ID NO:50). In some embodiments, the taxadiene 5α-hydroxylase and/or TCPR are truncated to remove all or part of the transmembrane region. In certain embodiments, 8, 24, or 42 N-terminal amino acids of taxadiene 5α-hydroxylase are truncated. In certain embodiments, 74 amino acids of TCPR are truncated. In some embodiments, an additional peptide is fused to taxadiene 5α-hydroxylase. In certain embodiments, the additional peptide is from bovine 17α hydroxylase. In certain embodiments, the peptide is MALLLAVF (SEQ ID NO:51). In certain embodiments, the isolated polypeptide is At24T5αOH-tTCPR. Aspects of the invention also encompass nucleic acid molecules that encode for any of the polypeptides associated with the invention and cells that recombinantly express any of the polypeptides associated with the invention.

[0024] Aspects of the invention relate to methods for increasing terpenoid production in a cell that produces one or more terpenoids. The methods include controlling the accumulation of indole in the cell or in a culture of the cells, thereby increasing terpenoid production in a cell. Any of the cells described herein can be used in the methods, including bacterial cells, such as Escherichia coli cells; Gram-positive cells, such as Bacillus cells; yeast cells, such as Saccharomyces cells or Yarrowia cells; algal cells; plant cells; and any of the engineered cells described herein.

[0025] In some embodiments, the step of controlling the accumulation of indole in the cell or in a culture of the cells includes balancing the upstream non-mevalonate isoprenoid pathway with the downstream product synthesis pathways and/or modifying or regulating the indole pathway. In other embodiments, the step of controlling the accumulation of indole in the cell or in a culture of the cells includes or further includes removing the accumulated indole from the fermentation through chemical methods, such as by using absorbents or scavengers.

[0026] The one or more terpenoids produced by the cell(s) or in the culture can be a monoterpenoid, a sesquiterpenoid, a diterpenoid, a triterpenoid or a tetraterpenoid. In certain embodiments, the terpenoid is taxadiene or any taxol precursor.

[0027] Aspects of the invention relate to methods that include measuring the amount or concentration of indole in a cell that produces one or more terpenoids or in a culture of the cells that produce one or more terpenoids. The methods can include measuring the amount or concentration of indole two or more times. In some embodiments, the measured amount or concentration of indole is used to guide a process of producing one or more terpenoids. In some embodiments, the measured amount or concentration of indole is used to guide strain construction.

[0028] Thus, in some aspects the invention provides a method for making a product containing a terpenoid or terpenoid derivative. The method according to this aspect comprises increasing terpenoid production in a cell that produces one or more terpenoids by controlling the accumulation of indole in the cell or in a culture of the cells. The terpenoid, or a derivative of the terpenoid prepared through one or more chemical or enzymatic steps, is incorporated into a product to thereby make the product containing a terpenoid or terpenoid derivative. According to this aspect, the cell expresses components of the MEP pathway, and may be a bacterial cell such as E. coli or B. subtilis. In various embodiments, the accumulation of indole in the cell or in a culture of the cells is controlled at least in part by balancing an upstream non-mevalonate isoprenoid pathway with a downstream heterologous terpenoid synthesis pathway.

[0029] In some embodiments, the product is a food product, food additive, beverage, chewing gum, candy, or oral care product. In such embodiments, the terpenoid or derivative may be a flavor enhancer or sweetener. In some embodiments, the product is a food preservative.

[0030] In various embodiments, the product is a fragrance product, a cosmetic, a cleaning product, or a soap. In such embodiments, the terpenoid or derivative may be a fragrance.

[0031] In still other embodiments, the product is a vitamin or nutritional supplement.

[0032] In some embodiments, the product is a solvent, cleaning product, lubricant, or surfactant.

[0033] In some embodiments, the product is a pharmaceutical, and the terpenoid or derivative is an active pharmaceutical ingredient.

[0034] In some embodiments, the terpenoid or derivative is polymerized, and the resulting polymer may be elastomeric.

[0035] In some embodiments, the product is an insecticide, pesticide or pest control agent, and the terpenoid or derivative is an active ingredient.

[0036] In some embodiments, the product is a cosmetic or personal care product, and the terpenoid or derivative is not a fragrance.

[0037] These and other aspects of the invention, as well as various embodiments thereof, will become more apparent in reference to the drawings and detailed description of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

[0038] The accompanying drawings are not intended to be drawn to scale. In the drawings, each identical or nearly identical component that is illustrated in various figures is represented by a like numeral. For purposes of clarity, not every component may be labeled in every drawing. In the drawings:

[0039] FIG. 1. Multivariate-modular isoprenoid pathway engineering reveals strong non-linear response in terpenoid accumulation. To increase the flux through the upstream MEP pathway, we targeted reported bottleneck enzymatic steps (dxs, idi, ispD and ispF) for overexpression by an operon (dxs-idi-ispDF). To channel the overflow flux from the universal isoprenoid precursors, IPP and DMAPP, towards Taxol biosynthesis, a synthetic operon of downstream genes GGPP synthase (G) and Taxadiene synthase (T) was constructed. The upstream isoprenoid and downstream synthetic taxadiene pathways were placed under the control of inducible promoters to control their relative gene expression. (a) Schematic of the two modules, the native upstream MEP isoprenoid pathway (left) and synthetic taxadiene pathway (right). In E. coli biosynthetic network, the MEP isoprenoid pathway is initiated by the condensation of the precursors glyceraldehyde-3 phosphate (G3P) and pyruvate (PYR) from glycolysis. The Taxol pathway bifurcation starts from the universal isoprenoid precursors IPP and DMAPP to form first the “linear” precursor Geranylgeranyl diphosphate, and then the “cyclic” taxadiene, a committed and key intermediate to Taxol. The cyclic olefin taxadiene undergoes multiple rounds of stereospecific oxidations, acylations, benzoylation with side chain assembly to, ultimately, form Taxol. (b) Schematic of the multivariate-modular isoprenoid pathway engineering approach for probing the non-linear response in terpenoid accumulation from upstream and downstream pathway engineered cells. Expression of upstream and downstream pathways is modulated by varying the promoter strength (Trc, T5 and T7) or increasing the copy number using different plasmids. Variation of upstream and downstream pathway expression gives different maxima in taxadiene accumulation.

[0040] FIG. 2. Optimization of taxadiene production by regulating the expression of the up- and down-stream modular pathways. (a) Response in taxadiene accumulation to the increase in upstream pathway strengths for constant values of the downstream pathway. (b) The dependence on the downstream pathway for constant increases in the upstream pathway strength. Observed multiple local maxima in taxadiene response depend on the increase in the pathway expression strength upstream or downstream. (c) Taxadiene response from strains engineered (17-24) with high upstream pathway overexpressions (20-100) with two different downstream expressions (~30 and ~60) to identify taxadiene response with balanced expressions. Expression of downstream pathway from the low copy plasmid (p5 and p10) under strong promoter T7TG operon was used to modulate these expressions. Note that both upstream and downstream pathway expressed from different plasmids with different promoters can impose plasmid-borne metabolic burden. (d) Modulating the upstream pathway with increasing promoter strength from chromosome with two different downstream expressions (~30 and ~60) to identify the missing search space with reduced toxic effects (strains 25-32). (e) Genetic details of the taxadiene producing strains. The numbers correspond to the different strains and their corresponding genotypes. E: E. coli K12 MG1655 ΔrecAΔendA; EDE3: E. coli K12 MG1655 ΔrecAΔendA with T7 RNA polymerase DE3 construct in the chromosome; MEP: dxs-idi-ispDF operon; GT: GPPS-TS operon; TG: TS-GPPS operon; Ch1: 1 copy in chromosome; Trc: Trc promoter; T5: T5 promoter; T7: T7 promoter; p5, p10, p20: ~5 (SC101), ~10 (p15), and ~20 (pBR322) copy plasmid.

[0041] FIG. 3. Metabolite inversely correlates with taxadiene production. (a) Mass spectrum of metabolite that was detected to correlate inversely with taxadiene production in the strain constructs of FIG. 2. The observed characteristic peaks of the metabolite are 233, 207, 178, 117, 89 and 62. (b) Correlation between the isoprenoid byproduct of FIG. 3a and taxadiene. Strains 26-29 and 30-32, all with chromosomally integrated upstream pathway expression, were chosen for consistent comparison. In strains 26-29 and 30-32, upstream expression increased by changing the promoters from Trc, to T5 and T7 respectively. The two sets of strains differ only in the expression of the downstream pathway with the second set (30-32) having twice the level of expression of the first. With the first set, optimal balancing is achieved with strain 26, which uses the Trc promoter for upstream pathway expression and also shows the lowest metabolite accumulation. With strains 30-32, strain 31 shows the lowest accumulation of metabolite and highest production of taxadiene. The data demonstrate the inverse correlation observed between the unknown metabolite and taxadiene production.

[0042] FIG. 4. Upstream and downstream pathway transcriptional gene expression levels and changes in cell physiology of engineered strains. Relative expression of the first genes in the operon of upstream (DXS) and downstream (TS) pathway is quantified by qPCR. Similar expression profiles were observed with the genes in the downstream of the operons. The corresponding strain numbers are shown in the graph. (a) Relative transcript level DXS gene expression quantified from different upstream expressions modulated using promoters and plasmids under two different downstream expressions. (b) Relative transcript level TS gene expression quantified from two different downstream expressions modulated using p5T7 and p10T7 plasmids under different upstream expressions. Our gene expression analysis directly supported the hypothesis that, with increase in plasmid copy number (5, 10 and 20) and promoter strength (Trc, T5 and T7), the expression of the upstream and downstream pathways can be modulated. (c) Cell growth of the engineered strains 25-29. The growth phenotype was affected by activation of isoprenoid metabolism (strain 26), recombinant protein expression (strain 25) and plasmid-borne metabolic burden (control vs engineered strains) and (d) growth phenotypes of strains 17, 22, 25-32. The black color lines are the taxadiene producing engineered strains and the gray color lines are control strains without downstream expression carrying an empty plasmid with promoter and multi cloning sites. The growth was correlated to the activation of the terpenoid metabolism, plasmid-borne metabolic burden as well as the recombinant protein expression.

[0043] FIG. 5. Engineering Taxol P450 oxidation chemistry in E. coli. (a) Schematics of the conversion of taxadiene to taxadiene 5α-ol to Taxol. (b) Transmembrane engineering and construction of one-component chimera protein from taxadiene 5α-ol hydroxylase (T5αOH) and Taxus cytochrome P450 reductase (TCPR). 1 and 2 represent the full length proteins of T5αOH and TCPR identified with 42 and 74 amino acid TM regions respectively, 3: chimera enzymes generated from the three different TM engineered T5αOH constructs, (At8T5αOH, At24T5αOH and At42T5αOH constructed by fusing 8 residue synthetic peptide (A) to 8, 24 and 42 AA truncated T5αOH) through a translational fusion with 74 AA truncated TCPR (tTCPR) using 5 residue GSTGS linker peptide. (c) Functional activity of At8T5αOH-tTCPR, At24T5αOH-tTCPR and At42T5αOH-tTCPR constructs transformed into taxadiene producing strain 18. (d) Time course profile of taxadien-5α-ol accumulation and growth profile of the strain 18-At24T5αOH-tTCPR fermented in a 1L bioreactor.

[0044] FIG. 6. Biosynthetic scheme for taxol production in E. coli. Schematics of the two modules, native upstream isoprenoid pathway (left) and synthetic Taxol pathway (right). In E. coli biosynthetic network, divergence of MEP isoprenoid pathway initiates from the precursors glyceraldehyde-3 phosphate (G3P) and Pyruvate (PYR) from glycolysis (I-V). The Taxol pathway bifurcation starts from the E. coli isoprenoid precursor IPP and DMAPP to “linear” precursor Geranylgeranyl diphosphate (VIII), “cyclic” taxadiene (IX), “oxidized” taxadiene 5α-ol (X) to multiple rounds of stereospecific oxidations, acylations, benzoylations and epoxidation for early precursor Baccatin III (XII) and finally with side chain assembly to Taxol (XIII). DXP: 1-deoxy-D-xylulose-5-phosphate; MEP: 2C-methyl-D-erythritol-4-phosphate; CDP-ME: 4-diphosphocytidyl-2C-methyl-D-erythritol; CDP-MEP: 4-diphosphocytidyl-2C-methyl-D-erythritol-2-phosphate; ME-cPP: 2C-methyl-D-erythritol-2,4-cyclodiphosphate; IPP: isopentenyl diphosphate; DMAPP: dimethylallyl diphosphate. The genes involved in the biosynthetic pathways from G3P and PYR to Taxol: DXS: 1-deoxy-D-xylulose-5-phosphate synthase; IspC: 1-deoxy-D-xylulose-5-phosphate reductoisomerase; IspD: 4-diphosphocytidyl-2C-methyl-D-erythritol synthase; IspE: 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase; IspF: 2C-methyl-D-erythritol-2,4-cyclodiphosphate synthase; IspG: 1-hydroxy-2-methyl-2-(E)-butenyl-4-diphosphate synthase; IspH: 4-hydroxy-3-methyl-2-(E)-butenyl-4-diphosphate reductase; IDI: isopentenyl-diphosphate isomerase; GGPPS: geranylgeranyl diphosphate synthase; Taxadiene synthase; Taxoid 5α-hydroxylase; Taxoid-5α-O-acetyltransferase; Taxoid 13α-hydroxylase; Taxoid 10β-hydroxylase; Taxoid 2α-hydroxylase; Taxoid 2-O-benzoyltransferase; Taxoid 7β-hydroxylase; Taxoid 10-O-acetyltransferase; Taxoid 1β-hydroxylase*; Taxoid 9α-hydroxylase; Taxoid 9-keto-oxidase*; Taxoid C4,C20-β-epoxidase*; Phenylalanine aminomutase; Side chain CoA-ligase*; Taxoid O-phenylpropanoyltransferase; Taxoid 2'-hydroxylase*; Taxoid 3'-N-benzoyltransferase. Genes marked with * are yet to be identified or characterized.

[0045] FIG. 7. Fold improvements in taxadiene production from the modular pathway expression search. (a) Taxadiene response in fold improvements from all the observed maxima from FIG. 2a, b, and c compared to strain 1. The 2.5 fold difference between the two highest maxima (strains 17 and 26) and the 23 fold difference (strains 26 and 10) with the lowest one indicate that missing an optimal response results in significantly lower titers.

[0046] FIG. 8. Metabolite. (a) Correlation between taxadiene and metabolite accumulation. The metabolite accumulation from the engineered strain is anti-proportionally related to the taxadiene production in an exponential manner. The correlation coefficient for this relation was determined to be 0.92. (b) Representative GC-profile from the strains 26-28 to demonstrate the change in taxadiene and metabolite accumulation. Numbers 1 and 2 in the chromatogram correspond to the metabolite and taxadiene peaks, respectively. (c) GC-MS profile of metabolite (1) and taxadiene (2) respectively. The observed characteristic peaks of the metabolite are 233, 207, 178, 117, 89 and 62. Taxa-4(20),11,12-diene characteristic ion m/z 272 (P+), 257 (P+-CH3), 229 (P+-C3H7); 121, 122, 123 (C-ring fragment cluster). The peak marked with a star is the internal standard caryophyllene.

[0047] FIG. 9. GC-MS profiles and taxadiene/taxadien-5α-ol production from artificial chimera enzyme engineered in strain 26. (a) GC profile of the hexane:ether (8:2) extract from three constructs (At8T5αOH-tTCPR, At24T5αOH-tTCPR and At42T5αOH-tTCPR) transferred to strain 26 and fermented for 5 days. Labels 1, 2 and 3 in the peaks correspond to taxadiene, taxadien-5α-ol and 5(12)-Oxa-3(11)-cyclotaxane (OCT) respectively. (b) The production of taxa-4(20),11,12-dien-5α-ol and OCT quantified from the three strains. (c) and (d) GC-MS profile of taxa-4(20),11,12-dien-5α-ol and OCT; the peaks corresponding to the fragmentation were compared with the authentic standards and previous reports. GC-MS analysis confirmed the mass spectrum identity to authentic taxa-4(20),11,12-dien-5α-ol with characteristic ion m/z 288 (P+), 273 (P+-H2O), 255 (P+-H2O-CH3).

[0048] FIG. 10 presents a schematic depicting the terpenoid biosynthetic pathway and natural products produced by this pathway.

[0049] FIG. 11 presents a schematic depicting modulation of the upstream pathway for amplifying taxadiene production.

[0050] FIG. 12 presents a schematic depicting modulation of the downstream pathway for amplifying taxadiene production.

[0051] FIG. 13 presents a schematic indicating that the newly identified pathway diversion is not the characteristic of downstream synthetic pathway.

[0052] FIG. 14. Pathway strength correlates to transcriptional gene expression levels. (c) Relative expression of idi, ispD and ispF genes with increasing upstream pathway strength and downstream strength at 31 arbitrary units, and (d) relative expression of idi, ispD and ispF genes with increasing upstream pathway strength and downstream strength at 61 arbitrary units. As expected, the gene expression increased as the upstream pathway strength increased. The corresponding strain numbers are indicated in the bar graph. The relative expression was quantified using the expression of the housekeeping rrsA gene. Data are mean+/-SD for four replicates.

[0053] FIG. 15. Impact of metabolite byproduct Indole accumulation on taxadiene production and growth. (a) Inverse correlation between taxadiene and Indole. Strains 26 to 28 and 30 to 32, all with chromosomally integrated upstream pathway expression, were chosen for consistent comparison. The two sets of strains differ only in the expression of the downstream pathway with the second set (30 to 32) having twice the level of expression of the first. In strains 26 to 28 and 30 to 32, upstream expression increased by changing the promoters from Trc, to T5 and T7, respectively. With the first set, optimal balancing is achieved with strain 26, which uses the Trc promoter for upstream pathway expression and also shows the lowest indole accumulation. With strains 30 to 32, strain 31 shows the lowest accumulation of indole and highest production of taxadiene. The fold improvements are relative to strain 25 and 29, respectively, for the two sets. (b) Effect of externally-introduced indole on taxadiene production for the high-producing strain 26. Different concentrations of indole were introduced into cultures of cells cultured in minimal media with 0.5% yeast extract. Taxadiene production was significantly reduced as indole concentration increased from 50 mg/L to 100 mg/L. (c) Effect of externally-introduced indole on cell growth for engineered strains of E. coli. Data are mean+/-SD for three replicates. Strains devoid of the downstream pathway and with different strengths of the upstream pathway (1, 2, 6, 21, 40 and 100) were selected. Strain 26, the high taxadiene producer, exhibits the strongest inhibition.

[0054] FIG. 16. Unknown metabolite identified as indole. (A) and (a) Gas chromatogram and mass spectrum of the unknown metabolite extracted using hexane from cell culture. (B) and (b) correspond to the gas chromatogram and mass spectrum of pure indole dissolved in hexane. Further to confirm the chemical identity, the metabolite was extracted from the fermentation broth using hexane extraction and purified by silica column chromatography using hexane:ethylacetate (8:2) as eluent. The purity of the compound was confirmed by TLC and GC-MS. 1H NMR and 13C NMR spectra confirmed the chemical identity of the metabolite as indole. (c) 1H NMR spectrum of indole extracted from cell culture (CDCl3, 400 MHz) δ: 6.56 (d, 1H, Ar C-H), 7.16 (m, 3H, Ar C-H), 7.38 (d, 1H, Ar C-H), 7.66 (d, 1H, Ar C-H), 8.05 (b, 1H, Indole NH). (d) 13C NMR δ: 135.7, 127.8, 124.2, 122, 120.7, 119.8, 111, 102.6. (e) is the 1H NMR spectrum of pure indole.

[0055] FIG. 17. Fed batch cultivation of engineered strains in 1L-bioreactor. Time courses of taxadiene accumulation (a), cell growth (b), [...] accumulation (c) and total substrate (glycerol) addition (d) for strains 22, 17 and 26 during 5 days of fed batch bioreactor cultivation in 1L-bioreactor vessels under controlled pH and oxygen conditions with minimal media and 0.5% yeast extract. After glycerol depletes to ~0.5 to 1 g/L in the fermentor, 3 g/L of glycerol was introduced into the bioreactor during the fermentation. Data are mean of two replicate bioreactors.

DETAILED DESCRIPTION OF THE INVENTION

[0056] Taxol is a potent anticancer drug first isolated as a natural product from the Taxus brevifolia Pacific yew tree. However, reliable and cost-efficient production of Taxol or Taxol analogs by traditional production routes from plant extracts is limited. Here, we report a multivariate-modular approach to metabolic pathway engineering to amplify by 15000 fold the production of taxadiene in an engineered Escherichia coli. Taxadiene, the first committed Taxol intermediate, is the biosynthetic product of the non-mevalonate pathway in E. coli comprising two modules: the native upstream pathway forming Isopentenyl Pyrophosphate (IPP) and a heterologous downstream terpenoid-forming pathway. Systematic multivariate search identified conditions that optimally balance the two pathway modules to minimize accumulation of inhibitory intermediates and flux diversion to side products. We also engineered the next step, after taxadiene, in Taxol biosynthesis, a P450-based oxidation step, that yielded >98% substrate conversion and present the first example of in vivo production of any functionalized Taxol intermediates in E. coli.

competitive production of compounds of the isoprenoid pathway using a microbial route is the amplification of this pathway in order to allow the overproduction of these molecules. Described herein are methods that enhance or amplify the flux towards terpenoid production in Escherichia coli (E. coli). Specifically, methods are provided to amplify the metabolic flux to the synthesis of isopentenyl pyrophosphate (IPP) (a key intermediate for the production of isoprenoid compounds), dimethylallyl pyrophosphate (DMAPP), geranyl diphosphate (GPP), farnesyl diphosphate (FPP), geranylgeranyl diphosphate (GGPP), and farnesyl geranyl diphosphate (FGPP), [...] (Taxol), ginkgolides, [...], farnesol, geranylgeraniol, [...], [...], monoterpenoids such as [...], [...] such as lycopene, polyisoprenoids such as polyisoprene or [...], diterpenoids such as eleutherobin, and sesquiterpenoids such as [...].

[0059] Aspects of the invention relate to the production of terpenoids. As used herein, a terpenoid, also referred to as an isoprenoid, is an organic chemical derived from a five-carbon isoprene unit. Several non-limiting examples of terpenoids, classified based on the number of isoprene units that they contain, include: hemiterpenoids (1 isoprene unit), monoterpenoids (2 isoprene units), sesquiterpenoids (3 isoprene units), diterpenoids (4 isoprene units), sesterterpenoids (5 isoprene units), triterpenoids (6 isoprene units), tetraterpenoids (8 isoprene units), and polyterpenoids with a larger number of isoprene units. In some embodiments, the terpenoid that is produced is taxadiene. In some embodiments, the terpenoid that is produced is [...], Cubebol, Nootkatone, Cineol, [...], Eleutherobin, Sarcodictyin, Pseudopterosins, Ginkgolides, Stevioside, Rebaudioside A, sclareol, labdenediol, levopimaradiene, sandaracopimaradiene or isopimaradiene.

[0060] Described herein are methods and compositions for optimizing production of terpenoids in cells by controlling expression of genes or proteins participating in an upstream pathway and a downstream pathway. The upstream pathway involves production of isopentenyl pyrophosphate (IPP) and dimethylallyl pyrophosphate (DMAPP), which can be
The modular pathway engineering approach not only achieved by two different metabolic pathways: the mevalonic highlights the complexity of multi-step pathways, but also acid (MVA) pathway and the MEP (2-C-methyl-D-erythritol allowed accumulation of high taxadiene and taxadien-50-ol 4-phosphate) pathway, also called the MEP/DOXP (2-C-me titers (~300 mg/L and 60 mg/L, respectively) in small-scale thyl-D-erythritol 4-phosphate/1-deoxy-D-xylulose 5-phos fermentations, thus exemplifying the potential of microbial phate) pathway, the non-mevalonate pathway or the meva production of Taxol and its derivatives. lonic acid-independent pathway. [0057] This invention is not limited in its application to the [0061] The downstream pathway is a synthetic pathway details of construction and the arrangement of components that leads to production of a terpenoids and involves recom set forth in the following description or illustrated in the binant gene expression of a terpenoid synthase (also referred drawings. The invention is capable of other embodiments and to as cyclase) enzyme, and a geranylgeranyl diphos of being practiced or of being carried out in various ways. phate synthase (GGPPS) enzyme. In some embodiments, a Also, the phraseology and terminology used herein is for the terpenoid synthase enzyme is a diterpenoid synthase enzyme. purpose of description and should not be regarded as limiting. Several non-limiting examples of diterpenoid synthase The use of “including,” “comprising,” or “having,” “contain enzymes include casbene synthase, taxadiene synthase, ing,” “involving,” and variations thereof herein, is meant to levopimaradiene synthase, abietadiene synthase, isopimara encompass the items listed thereafter and equivalents thereof diene synthase, ent-copalyl diphosphate synthase, syn as well as additional items. 
stemar-13-ene synthase, syn-stemod-13(17)-ene synthase, [0058] Microbial production of terpenoids such as taxadi syn-pimara-7,15-diene synthase, ent-sandaracopimaradiene ene is demonstrated herein. When expressed at satisfactory synthase, ent-cassa-12,15-diene synthase, ent-pimara-8014), levels, microbial routes reduce dramatically the cost of pro 15-diene synthase, ent-kaur-15-ene synthase, ent-kaur-16 duction of such compounds. Additionally, they utilize cheap, ene synthase, aphidicolan-16?-ol synthase, phyllocladan abundant and renewable feedstocks (such as sugars and other 160-ol synthase, fusicocca-2,10(14)-diene synthase, and ter carbohydrates) and can be the source for the synthesis of pentetriene cyclase. numerous to derivatives that may exhibit far superior proper [0062] Surprisingly, as demonstrated in the Examples sec ties than the original compound. A key element in the cost tion, optimization of terpenoid synthesis by manipulation of US 2012/0107893 A1 May 3, 2012

the upstream and downstream pathways described herein was not a simple linear or additive process. Rather, through complex combinatorial analysis, optimization was achieved through balancing components of the upstream and downstream pathways. Unexpectedly, as demonstrated in FIGS. 1 and 2, taxadiene accumulation exhibited a strong non-linear dependence on the relative strengths of the upstream MEP and downstream synthetic taxadiene pathways.

[0063] Aspects of the invention relate to controlling the expression of genes and proteins in the MEP pathway for optimized production of a terpenoid such as taxadiene. Optimized production of a terpenoid refers to producing a higher amount of a terpenoid following pursuit of an optimization strategy than would be achieved in the absence of such a strategy. It should be appreciated that any gene and/or protein within the MEP pathway is encompassed by methods and compositions described herein. In some embodiments, a gene within the MEP pathway is one of the following: dxs, ispC, ispD, ispE, ispF, ispG, ispH, idi, ispA or ispB. Expression of one or more genes and/or proteins within the MEP pathway can be upregulated and/or downregulated. In certain embodiments, upregulation of one or more genes and/or proteins within the MEP pathway can be combined with downregulation of one or more genes and/or proteins within the MEP pathway.

[0064] It should be appreciated that genes and/or proteins can be regulated alone or in combination. For example, the expression of dxs can be upregulated or downregulated alone or in combination with upregulation or downregulation of expression of one or more of ispC, ispD, ispE, ispF, ispG, ispH, idi, ispA and ispB. The expression of ispC can be upregulated or downregulated alone or in combination with upregulation or downregulation of expression of one or more of dxs, ispD, ispE, ispF, ispG, ispH, idi, ispA and ispB. The expression of ispD can be upregulated or downregulated alone or in combination with upregulation or downregulation of expression of one or more of dxs, ispC, ispE, ispF, ispG, ispH, idi, ispA and ispB. The expression of ispE can be upregulated or downregulated alone or in combination with upregulation or downregulation of expression of one or more of dxs, ispC, ispD, ispF, ispG, ispH, idi, ispA and ispB. The expression of ispF can be upregulated or downregulated alone or in combination with upregulation or downregulation of expression of one or more of dxs, ispC, ispD, ispE, ispG, ispH, idi, ispA and ispB. The expression of ispG can be upregulated or downregulated alone or in combination with upregulation or downregulation of expression of one or more of dxs, ispC, ispD, ispE, ispF, ispH, idi, ispA and ispB. The expression of ispH can be upregulated or downregulated alone or in combination with upregulation or downregulation of expression of one or more of dxs, ispC, ispD, ispE, ispF, ispG, idi, ispA and ispB. The expression of idi can be upregulated or downregulated alone or in combination with upregulation or downregulation of expression of one or more of dxs, ispC, ispD, ispE, ispF, ispG, ispH, ispA and ispB. The expression of ispA can be upregulated or downregulated alone or in combination with upregulation or downregulation of expression of one or more of dxs, ispC, ispD, ispE, ispF, ispG, ispH, idi and ispB. The expression of ispB can be upregulated or downregulated alone or in combination with upregulation or downregulation of expression of one or more of dxs, ispC, ispD, ispE, ispF, ispG, ispH, idi and ispA. In some embodiments, expression of the gene and/or protein of one or more of dxs, ispC, ispD, ispE, ispF, ispG, ispH, and idi is upregulated while expression of the gene and/or protein of ispA and/or ispB is downregulated.

[0065] Expression of genes within the MEP pathway can be regulated in a modular method. As used herein, regulation by a modular method refers to regulation of multiple genes together. For example, in some embodiments, multiple genes within the MEP pathway are recombinantly expressed on a contiguous region of DNA, such as an operon. It should be appreciated that a cell that expresses such a module can also express one or more other genes within the MEP pathway either recombinantly or endogenously.

[0066] A non-limiting example of a module of genes within the MEP pathway is a module containing the genes dxs, idi, ispD and ispF, as presented in the Examples section, and referred to herein as dxs-idi-ispDF. It should be appreciated that modules of genes within the MEP pathway, consistent with aspects of the invention, can contain any of the genes within the MEP pathway, in any order.

[0067] Expression of genes and proteins within the downstream synthetic terpenoid synthesis pathway can also be regulated in order to optimize terpenoid production. The synthetic downstream terpenoid synthesis pathway involves recombinant expression of a terpenoid synthase enzyme and a GGPPS enzyme. Any terpenoid synthase enzyme, as discussed above, can be expressed with GGPPS depending on the downstream product to be produced. For example, taxadiene synthase is used for the production of taxadiene. Recombinant expression of the taxadiene synthase enzyme and the GGPPS enzyme can be regulated independently or together. In some embodiments the two enzymes are regulated together in a modular fashion. For example, the two enzymes can be expressed in an operon in either order (GGPPS-TS, referred to as “GT,” or TS-GGPPS, referred to as “TG”).

[0068] Manipulation of the expression of genes and/or proteins, including modules such as the dxs-idi-ispDF operon and the TS-GGPPS operon, can be achieved through methods known to one of ordinary skill in the art. For example, expression of the genes or operons can be regulated through selection of promoters, such as inducible promoters, with different strengths. Several non-limiting examples of promoters include Trc, T5 and T7. Additionally, expression of genes or operons can be regulated through manipulation of the copy number of the gene or operon in the cell. For example, in certain embodiments, a strain containing an additional copy of the dxs-idi-ispDF operon on its chromosome under Trc promoter control produces an increased amount of taxadiene relative to one overexpressing only the synthetic downstream pathway. In some embodiments, expression of genes or operons can be regulated through manipulating the order of the genes within a module. For example, in certain embodiments, changing the order of the genes in a downstream synthetic operon from GT to TG results in a 2-3 fold increase in taxadiene production. In some embodiments, expression of genes or operons is regulated through integration of one or more genes or operons into a chromosome. For example, in certain embodiments, integration of the upstream dxs-idi-ispDF operon into the chromosome of a cell results in increased taxadiene production.

[0069] It should be appreciated that the genes associated with the invention can be obtained from a variety of sources. In some embodiments, the genes within the MEP pathway are bacterial genes such as Escherichia coli genes. In some

spp., Sinorhizobium spp., Saccharopolyspora spp., Agrobacterium spp. and Pantoea spp. The bacterial cell can be a Gram-negative cell such as an Escherichia coli (E. coli) cell, or a Gram-positive cell such as a species of Bacillus. In other embodiments, the cell is a fungal cell such as a yeast cell, e.g., Saccharomyces spp., Schizosaccharomyces spp., Pichia spp., Phaffia spp., Kluyveromyces spp., Candida spp., Talaromyces spp., Brettanomyces spp., Pachysolen spp., Debaryomyces spp., Yarrowia spp., and industrial polyploid yeast strains. Preferably the yeast strain is a S. cerevisiae strain or a Yarrowia spp. strain. Other examples of fungi include Aspergillus spp., Penicillium spp., Fusarium spp., Rhizopus spp., Acremonium spp., Neurospora spp., Sordaria spp., Magnaporthe spp., Allomyces spp., Ustilago spp., Botrytis spp., and Trichoderma spp. In other embodiments, the cell is an algal cell, or a plant cell. It should be appreciated that some cells compatible with the invention may express an endogenous copy of one or more of the genes associated with the invention as well as a recombinant copy. In some embodiments, if a cell has an endogenous copy of one or more of the genes associated with the invention then the methods will not necessarily require adding a recombinant copy of the gene(s) that are endogenously expressed. In some embodiments the cell may endogenously express one or more enzymes from the pathways described herein and may recombinantly express one or more other enzymes from the pathways described herein for efficient production of a terpenoid.

[0080] Further aspects of the invention relate to screening for bacterial cells or strains that exhibit optimized terpenoid production. As described above, methods associated with the invention involve generating cells that overexpress one or more genes in the MEP pathway. Terpenoid production from culturing of such cells can be measured and compared to a control cell, wherein a cell that exhibits a higher amount of terpenoid production relative to the control cell is selected as a first improved cell. The cell can be further modified by recombinant expression of a terpenoid synthase enzyme and a GGPPS enzyme. The level of expression of one or more of the components of the non-mevalonate (MEP) pathway, the terpenoid synthase enzyme and/or the GGPPS enzyme in the cell can then be manipulated and terpenoid production can be measured again, leading to selection of a second improved cell that produces greater amounts of a terpenoid than the first improved cell. In some embodiments, the terpenoid synthase enzyme is a taxadiene synthase enzyme.

[0081] Further aspects of the invention relate to the identification and characterization (via GC-MS) of a previously unknown metabolite in bacterial E. coli cells (FIGS. 3 and 6). The level of accumulation of the newly identified metabolite, indole, can be controlled by genetically manipulating the microbial pathway through the overexpression, downregulation or mutation of the isoprenoid pathway genes. Accumulation of the metabolite indole is inversely correlated with taxadiene production in engineered strains (FIGS. 3, 6 and 15). Further controlling the accumulation of indole for improving the flux towards terpenoid biosynthesis in bacterial systems (specifically in cells, such as E. coli cells) or other cells can be achieved by balancing the upstream non-mevalonate isoprenoid pathway with the downstream product synthesis pathways or by modifications to or regulation of the indole pathway. In so doing, the skilled person can reduce or control the accumulation of indole and thereby reduce the inhibitory effect of indole on the production of taxadiene, and other terpenoids derived from the described pathways, such as: monoterpenoids, sesquiterpenoids (including amorphadiene), diterpenoids (including levopimaradiene), [...], and [...]. Other methods for reducing or controlling the accumulation of indole include removing the accumulated indole from the fermentation through chemical methods such as by using absorbents, scavengers, etc.

[0082] In other embodiments, methods are provided that include measuring the amount or concentration of indole in a cell that produces one or more terpenoids or in a culture of the cells that produce one or more terpenoids. The amount or concentration of indole can be measured once, or two or more times, as suitable, using methods known in the art and as described herein. Such methods can be used to guide processes of producing one or more terpenoids, e.g., in process improvement. Such methods can be used to guide strain construction, e.g., for strain improvement.

[0083] The identification of the means to achieve this balancing yielded a 15000 fold improvement in the overproduction of terpenoids such as taxadiene, compared to wild type bacterial cells expressing a heterologous taxadiene biosynthetic pathway. The production was further increased through modified fermentation methods that yielded concentrations of approximately 2 g/L, which is 1500 fold higher compared to any prior reported taxadiene production. As demonstrated herein, by genetically engineering the non-mevalonate isoprenoid pathway in E. coli the accumulation of this metabolite can now be controlled, which regulates the flux towards isoprenoid biosynthesis in bacterial E. coli cells. Also demonstrated herein is further channeling of the taxadiene production into the next key precursor to Taxol, taxadien-5α-ol, achieved through engineering the oxidation chemistry for Taxol biosynthesis. Example 5 presents the first successful extension of the synthetic pathway from taxadiene to taxadien-5α-ol. Similar to the majority of other terpenoids, Taxol biosynthesis follows the unified fashion of a “two phase” biosynthetic process: (i) the “cyclase phase” of linear coupling of the prenyl precursors (IPP and DMAPP) to GGPP followed by the molecular cyclization and rearrangement for the committed precursor taxadiene (FIG. 6, VIII-IX). After the committed precursor, (ii) the “oxidation phase”, the cyclic olefin taxadiene core structure is then functionalized by seven cytochrome P450 oxygenases together with their redox partners, decorated with two acetate groups and a benzoate group by acyl and aroyl CoA-dependent transferases, a keto group by keto-oxidase, and an epoxide group by epoxidase, leading to the late intermediate baccatin III, to which the C13 side chain is attached for Taxol (FIG. 6, X-XIII). Although a rough sequential order of the early oxidation phase reactions is predicted, the precise timing/order of some of the hydroxylations, acylations and benzoylation reactions is uncertain. However, it is clear that the early bifurcation starts from the cytochrome P450-mediated hydroxylation of the taxadiene core at the C5 position, followed by the downstream hydroxylations using a homologous family of cytochrome P450 enzymes with high deduced similarity to each other (>70%) but with limited resemblance (<30%) to other plant P450s. In addition, the structural and functional diversity with the possible evolutionary analysis imply that the taxadiene 5α-ol gene can be the parental sequence from which the other hydroxylase genes in the Taxol biosynthetic pathway evolved, reflecting the order of hydroxylations.

[0084] Further aspects of the invention relate to chimeric P450 enzymes. Functional expression of plant cytochrome P450 has been considered challenging due to the inherent

limitations of bacterial platforms, such as the absence of electron transfer machinery, cytochrome P450 reductases, and translational incompatibility of the membrane signal modules of P450 enzymes due to the lack of an endoplasmic reticulum.

[0085] In some embodiments, the taxadiene-5α-hydroxylase associated with methods of the invention is optimized through N-terminal transmembrane engineering and/or the generation of chimeric enzymes through translational fusion with a CPR redox partner. In some embodiments, the CPR redox partner is a Taxus cytochrome P450 reductase (TCPR; FIG. 5b). In certain embodiments, cytochrome P450 taxadiene-5α-hydroxylase (T5αOH) is obtained from Taxus cuspidata (GenBank Accession number AY289209, the sequence of which is incorporated by reference herein). In some embodiments, NADPH:cytochrome P450 reductase (TCPR) is obtained from Taxus cuspidata (GenBank Accession number AY371340, the sequence of which is incorporated by reference herein).

[0086] The taxadiene 5α-hydroxylase and TCPR can be joined by a linker such as GSTGS (SEQ ID NO:50). In some embodiments, taxadiene 5α-hydroxylase and/or TCPR are truncated to remove all or part of the transmembrane region of one or both proteins. For example, taxadiene 5α-hydroxylase in some embodiments is truncated to remove 8, 24, or 42 N-terminal amino acids. In some embodiments, the N-terminal 74 amino acids of TCPR are truncated. An additional peptide can also be fused to taxadiene 5α-hydroxylase. For example, one or more amino acids from bovine 17α-hydroxylase can be added to taxadiene 5α-hydroxylase. In certain embodiments, the peptide MALLLAVF (SEQ ID NO:51) is added to taxadiene 5α-hydroxylase. A non-limiting example of a polypeptide comprising taxadiene 5α-hydroxylase fused to TCPR is At24T5αOH-tTCPR.

[0087] In some embodiments, the chimeric enzyme is able to carry out the first oxidation step with more than 10% taxadiene conversion to taxadiene-5α-ol and the byproduct 5(12)-Oxa-3(11)-cyclotaxane. For example, the percent taxadiene conversion to taxadiene-5α-ol and the byproduct 5(12)-Oxa-3(11)-cyclotaxane can be at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, approximately 99% or approximately 100%.

[0088] In certain embodiments, the chimeric enzyme is At24T5αOH-tTCPR, which was found to be capable of carrying out the first oxidation step with more than 98% taxadiene conversion to taxadiene-5α-ol and the byproduct 5(12)-Oxa-3(11)-cyclotaxane (OCT, FIG. 9a). Engineering of the step of taxadiene-5α-ol production is critical in the production of Taxol and was found to be limiting in previous efforts to construct this pathway in yeast. The engineered construct developed herein demonstrated greater than 98% conversion of taxadiene in vivo with a 2400 fold improvement over previous heterologous expression in yeast. Thus, in addition to synthesizing significantly greater amounts of key Taxol intermediates, this study also provides the basis for the synthesis of subsequent metabolites in the pathway by similar P450 chemistry.

[0089] As used herein, the terms “protein” and “polypeptide” are used interchangeably and thus the term polypeptide may be used to refer to a full-length polypeptide and may also be used to refer to a fragment of a full-length polypeptide. As used herein with respect to polypeptides, proteins, or fragments thereof, “isolated” means separated from its native environment and present in sufficient quantity to permit its identification or use. Isolated, when referring to a protein or polypeptide, means, for example: (i) selectively produced by expression cloning or (ii) purified as by chromatography or electrophoresis. Isolated proteins or polypeptides may be, but need not be, substantially pure. The term “substantially pure” means that the proteins or polypeptides are essentially free of other substances with which they may be found in production, nature, or in vivo systems to an extent practical and appropriate for their intended use. Substantially pure polypeptides may be obtained naturally or produced using methods described herein and may be purified with techniques well known in the art. Because an isolated protein may be admixed with other components in a preparation, the protein may comprise only a small percentage by weight of the preparation. The protein is nonetheless isolated in that it has been separated from the substances with which it may be associated in living systems, i.e. isolated from other proteins.

[0090] The invention also encompasses nucleic acids that encode any of the polypeptides described herein, libraries that contain any of the nucleic acids and/or polypeptides described herein, and compositions that contain any of the nucleic acids and/or polypeptides described herein.

[0091] In some embodiments, one or more of the genes associated with the invention is expressed in a recombinant expression vector. As used herein, a “vector” may be any of a number of nucleic acids into which a desired sequence or sequences may be inserted by restriction and ligation for transport between different genetic environments or for expression in a host cell. Vectors are typically composed of DNA, although RNA vectors are also available. Vectors include, but are not limited to: plasmids, fosmids, phagemids, virus genomes and artificial chromosomes.

[0092] A cloning vector is one which is able to replicate autonomously or integrated in the genome in a host cell, and which is further characterized by one or more endonuclease restriction sites at which the vector may be cut in a determinable fashion and into which a desired DNA sequence may be ligated such that the new recombinant vector retains its ability to replicate in the host cell. In the case of plasmids, replication of the desired sequence may occur many times as the plasmid increases in copy number within the host cell such as a host bacterium or just a single time per host before the host reproduces by mitosis. In the case of phage, replication may occur actively during a lytic phase or passively during a lysogenic phase.

[0093] An expression vector is one into which a desired DNA sequence may be inserted by restriction and ligation such that it is operably joined to regulatory sequences and may be expressed as an RNA transcript. Vectors may further contain one or more marker sequences suitable for use in the identification of cells which have or have not been transformed or transfected with the vector. Markers include, for example, genes encoding proteins which increase or decrease either resistance or sensitivity to antibiotics or other compounds, genes which encode enzymes whose activities are detectable by standard assays known in the art (e.g., β-galactosidase, luciferase or alkaline phosphatase), and genes which visibly affect the phenotype of transformed or transfected cells, hosts, colonies or plaques (e.g., green fluorescent protein). Preferred vectors are those capable of autonomous replication and expression of the structural gene products present in the DNA segments to which they are operably joined.
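As an illustrative aside on the chimera design of paragraph [0086], the truncation and fusion steps can be sketched in Python. This is a minimal sketch under stated assumptions, not the applicants' cloning procedure: the function names are hypothetical, the input sequences are toy placeholders rather than the GenBank sequences, and both the placement of the MALLLAVF peptide at the N-terminus of the truncated hydroxylase and the ordering of the hydroxylase upstream of TCPR are inferred from the construct name At24T5αOH-tTCPR.

```python
# Linker and leader peptide are the two sequences quoted in the text
# (SEQ ID NO:50 and SEQ ID NO:51). Everything else here is illustrative.
LINKER = "GSTGS"
LEADER = "MALLLAVF"


def truncate_n_terminus(seq: str, n: int) -> str:
    """Remove the first n residues of a protein sequence."""
    if n < 0 or n > len(seq):
        raise ValueError("truncation length must be between 0 and len(seq)")
    return seq[n:]


def build_chimera(hydroxylase: str, reductase: str,
                  hydroxylase_cut: int = 24, reductase_cut: int = 74,
                  leader: str = LEADER, linker: str = LINKER) -> str:
    """Assemble leader + truncated hydroxylase + linker + truncated reductase.

    The default cuts (24 and 74 residues) follow the At24T5aOH-tTCPR example;
    the text also mentions hydroxylase cuts of 8 and 42 residues.
    The domain order is an assumption, see the lead-in.
    """
    return (leader
            + truncate_n_terminus(hydroxylase, hydroxylase_cut)
            + linker
            + truncate_n_terminus(reductase, reductase_cut))


# Toy placeholder sequences (NOT the real T5aOH or TCPR sequences).
toy_hydroxylase = "M" + "A" * 99    # 100 residues
toy_reductase = "M" + "G" * 199     # 200 residues

chimera = build_chimera(toy_hydroxylase, toy_reductase)
print(len(chimera))
```

Passing `hydroxylase_cut=8` or `hydroxylase_cut=42` to the hypothetical `build_chimera` would give the analogues of the other two truncation variants mentioned in the text.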

[0094] As used herein, a coding sequence and regulatory sequences are said to be “operably” joined when they are covalently linked in such a way as to place the expression or transcription of the coding sequence under the influence or control of the regulatory sequences. If it is desired that the coding sequences be translated into a functional protein, two DNA sequences are said to be operably joined if induction of a promoter in the 5' regulatory sequences results in the transcription of the coding sequence and if the nature of the linkage between the two DNA sequences does not (1) result in the introduction of a frame-shift mutation, (2) interfere with the ability of the promoter region to direct the transcription of the coding sequences, or (3) interfere with the ability of the corresponding RNA transcript to be translated into a protein. Thus, a promoter region would be operably joined to a coding sequence if the promoter region were capable of effecting transcription of that DNA sequence such that the resulting transcript can be translated into the desired protein or polypeptide.

[0095] When the nucleic acid molecule that encodes any of the enzymes of the claimed invention is expressed in a cell, a variety of transcription control sequences (e.g., promoter/enhancer sequences) can be used to direct its expression. The promoter can be a native promoter, i.e., the promoter of the gene in its endogenous context, which provides normal regulation of expression of the gene. In some embodiments the promoter can be constitutive, i.e., the promoter is unregulated, allowing for continual transcription of its associated gene. A variety of conditional promoters also can be used, such as promoters controlled by the presence or absence of a molecule.

[0096] The precise nature of the regulatory sequences needed for gene expression may vary between species or cell types, but shall in general include, as necessary, 5' non-transcribed and 5' non-translated sequences involved with the initiation of transcription and translation respectively, such as a TATA box, capping sequence, CAAT sequence, and the like. In particular, such 5' non-transcribed regulatory sequences will include a promoter region which includes a promoter sequence for transcriptional control of the operably joined gene. Regulatory sequences may also include enhancer sequences or upstream activator sequences as desired. The vectors of the invention may optionally include 5' leader or signal sequences. The choice and design of an appropriate vector is within the ability and discretion of one of ordinary skill in the art.

[0097] Expression vectors containing all the necessary elements for expression are commercially available and known to those skilled in the art. See, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, 1989. Cells are genetically engineered by the introduction into the cells of heterologous DNA (RNA). That heterologous DNA (RNA) is placed under operable control of transcriptional elements to permit the expression of the heterologous DNA in the host cell. Heterologous expression of genes associated with the invention, for production of a terpenoid, such as taxadiene, is demonstrated in the Examples section using E. coli. The novel method for producing terpenoids can also be expressed in other bacterial cells, fungi (including yeast cells), plant cells, etc.

[0098] A nucleic acid molecule that encodes an enzyme associated with the invention can be introduced into a cell or cells using methods and techniques that are standard in the art.

standard protocols such as transformation including chemical transformation and electroporation, transduction, particle bombardment, etc. Expressing the nucleic acid molecule encoding the enzymes of the claimed invention also may be accomplished by integrating the nucleic acid molecule into the genome.

[0099] In some embodiments one or more genes associated with the invention is expressed recombinantly in a bacterial cell. Bacterial cells according to the invention can be cultured in media of any type (rich or minimal) and any composition. As would be understood by one of ordinary skill in the art, routine optimization would allow for use of a variety of types of media. The selected medium can be supplemented with various additional components. Some non-limiting examples of supplemental components include glucose, antibiotics, IPTG for gene induction, ATCC Trace Mineral Supplement, and glycolate. Similarly, other aspects of the medium, and growth conditions of the cells of the invention may be optimized through routine experimentation. For example, pH and temperature are non-limiting examples of factors which can be optimized. In some embodiments, factors such as choice of media, media supplements, and temperature can influence production levels of terpenoids, such as taxadiene. In some embodiments the concentration and amount of a supplemental component may be optimized. In some embodiments, how often the media is supplemented with one or more supplemental components, and the amount of time that the media is cultured before harvesting a terpenoid, such as taxadiene, is optimized.

[0100] According to aspects of the invention, high titers of a terpenoid (such as but not limited to taxadiene) are produced through the recombinant expression of genes as described herein, in a cell expressing components of the MEP pathway, and one or more downstream genes for the production of a terpenoid (or related compounds) from the products of the MEP pathway. As used herein “high titer” refers to a titer in the milligrams per liter (mg L⁻¹) scale. The titer produced for a given product will be influenced by multiple factors including choice of media. In some embodiments, the total titer of a terpenoid or derivative is at least 1 mg L⁻¹. In some embodiments, the total terpenoid or derivative titer is at least 10 mg L⁻¹. In some embodiments, the total terpenoid or derivative titer is at least 250 mg L⁻¹. For example, the total terpenoid or derivative titer can be at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 75, 80, 85, 90, 95, 100, 125, 150, 175, 200, 225, 250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500, 525, 550, 575, 600, 625, 650, 675, 700, 725, 750, 775, 800, 825, 850, 875, 900 or more than 900 mg L⁻¹ including any intermediate values. In some embodiments, the total terpenoid or derivative titer can be at least 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3.0, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, 4.0, 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8, 4.9, 5.0, or more than 5.0 g L⁻¹ including any intermediate values.

[0101] In some embodiments, the total taxadiene 5α-ol titer is at least 1 mg L⁻¹. In some embodiments, the total taxadiene 5α-ol titer is at least 10 mg L⁻¹. In some embodiments, the total taxadiene 5α-ol titer is at least 50 mg L⁻¹. For example, the total taxadiene 5α-ol titer can be at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24,
For example, nucleic acid molecules can be introduced by 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41. US 2012/0107893 A1 May 3, 2012

42, 43,44, 45,46, 47, 48,49, 50, 51, 52, 53, 54, 55, 56, 57, 58. [0110] (6) mutation of amino acids in one or more genes 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, or more than 70 of the downstream and/or upstream pathway, or mg L" including any intermediate values. [0111] (7) modifying the order of upstream and down [0102] The liquid cultures used to grow cells associated stream pathway genes in a heterologous operon. with the invention can be housed in any of the culture vessels [0112] In some embodiments, the cell comprises at least known and used in the art. In some embodiments large scale one additional copy of at least one of dxs, idi, isp?), and ispp. production in an aerated reaction vessel such as a stirred tank For example, the cell may comprise a heterologous dxs-idi reactor can be used to produce large quantities of terpenoids, isp?)F operon. such as taxadiene, that can be recovered from the cell culture. [0113] According to embodiments of this aspect, indole In some embodiments, the terpenoid is recovered from the gas accumulates in the culture at less than 100 mg/L, or in certain phase of the cell culture, for example by adding an organic embodiments, at less than 50 mg/L, at less than 10 mg/L, or at layer such as dodecane to the cell culture and recovering the less than 1 mg/L, or less than 0.5 mg/L, or less than 0.1 mg/L. terpenoid from the organic layer. As an alternative, orin addition to balancing the upstream and [0103] Terpenoids, such as taxadiene, produced through downstream pathways, the accumulation of indole in the cell methods described herein have widespread applications or in a culture of the cells may comprise modifying or regu including pharmaceuticals such as paclitaxel (Taxol), arte lating the cell’s indole pathway. In still other embodiments, misinin, ginkolides, eleutherobin and pseudopterosins, and the step of controlling the accumulation of indole in the cellor many other potential pharmaceutical compounds. 
Further in a culture of the cells may comprise or may further comprise applications include compounds used in flavors and cosmet removing the accumulated indole from the cell culture ics such as geraniol, farmesol, geranlygeraniol, linalool, through chemical methods, which may include using one or limonene, , cineol and isoprene. Further applications more absorbents or scavengers. In some embodiments, the include compounds for use as biofuels such as of 5, level of indole in the culture or in the cells in monitored, or 10, and 15 carbon atom length. It is noted that the above example, by continuously or intermittently measuring the compounds are presently produced as extracts of various amount of indole in the culture. plants. Plant extract-based methods are tedious, yield very [0114] The invention according to this aspect can involve small amounts and are limited as to the actual molecules that the manufacture of a product containing one or more terpe can be so obtained, namely, they do not allow the easy pro noids, such as one or more terpenoids selected from a duction of derivatives that may possess far superior properties hemiterpenoid, a monoterpenoid, a sesquiterpenoid, a diter than the original compounds. penoid, a triterpenoid or a tetraterpenoid. [0104] In one aspect, the invention provides a method for [0115] In some embodiments, the productis a food product, making a product containing a terpenoid or terpenoid deriva food additive, beverage, chewing gum, candy, or oral care tive. The method according to this aspect comprises increas product. In such embodiments, the terpenoid or derivative ing terpenoid production in a cell that produces one or more may be a flavor enhancer or sweetener. For example, the terpenoids by controlling the accumulation of indole in the terpenoid or derivative may include one or more of alpha cell or in a culture of the cells. 
The terpenoid, or a derivative sinensal; beta-Thuj one; ; Carvedl; Carvone; Cin of the terpenoid prepared through one or more chemical or eole: ; ; Cubebol; Limonene; Menthol; Men enzymatic steps, is incorporated into a product to thereby thone; ; Nootkatone; Piperitone; hydrate; make the product containing a terpenoid or terpenoid deriva Steviol; Steviol glycosides; ; Valenceme; or a deriva tive. According to this aspect, the cell expressed components tive of one or more of such compounds. In other embodi of the MEP pathway, and may be a bacterial cell such as E. ments, the terpenoid orderivative is one or more of alpha, beta coli or B. subtilis. In various embodiments, the accumulation and y-humulene; isopinocamphone; (-)-alpha-; of indole in the cell or in a culture of the cells is controlled at (+)-1--4-ol; (+)-; (+)-verbenone; 1,3,8 least in part by balancing an upstream non-mevalonate iso menthatriene; 3-carene; 3-Oxo-alpha-Ionone; 4-Oxo-beta prenoid pathway with a downstream heterologous terpenoid ionone; alpha-sinensal; alpha-terpinolene; alpha-; synthesis pathway. For example, the upstream non-meva Ascaridole; ; ; Cembrene; E)-4-decenal; lonate isoprenoid pathway may be balanced with respect to Farnesol; Fenchone; gamma-Terpinene; Geraniol; hotrienol; the downstream heterologous terpenoid synthesis pathway by Isoborneol; Limonene; myrcene; nerolidol; ; one or more of: p-cymene; perillaldehyde; ; Sabinene; Sabinene hydrate; tagetone; Verbenone; or a derivative of one or more [0105] (1) increasing gene copy number for one or more of such compounds. upstream or downstream pathway enzymes, [0116] Downstream enzymes for the production of such [0106] (2) increasing or decreasing the expression level terpenoids and derivatives are known. 
of the upstream and downstream pathway, as individual [0117] For example, the terpenoid may be alpha-sinensal, genes or as an operon, using promoters with different and which may be synthesized through a pathway comprising strengths, one or more of farnesyl diphosphate synthase (e.g., [0107] (3) increasing or decreasing the expression level AAK63847.1) and valenceme synthase (e.g., AF441124, 1). of the upstream and downstream pathways, as individual [0118] In other embodiments, the terpenoid is beta-Thuj genes or as an operon, using modifications to ribosomal one, and which may be synthesized through a pathway com binding sites, prising one or more of synthase (e.g., [0108] (4) replacing native genes in the MEP pathway AAN01134.1, ACA21458.2) and (+)-sabinene synthase (e.g., with heterologous genes coding for homologous AF051901.1). enzymes, [0119). In other embodiments, the terpenoid is Camphor, [0109] (5) codon-optimization of one or more heterolo which may be synthesized through a pathway comprising one gous enzymes in the upstream or downstream pathway, or more of Geranyl pyrophosphate synthase (e.g., US 2012/0107893 A1 May 3, 2012

AAN01134.1, ACA21458.2), (-)-borneol dehydrogenase syltransferases (UGTs) (e.g., AF515727.1, AY345983.1, (e.g., GU253890.1), and bornyl pyrophosphate synthase AY345982.1, AY345979.1, AAN40684.1, ACE87855.1). (e.g., AF051900). [0130] The one or more terpenoids may include Thymol, [0120] In certain embodiments, the one or more terpenoids which may be synthesized through a pathway comprising one include Carvedl or Carvone, which may be synthesized or more of Geranyl pyrophosphate synthase (e.g., through a pathway comprising one or more of Geranyl pyro AAN01134.1, ACA21458.2), limonene synthase (e.g., phosphate synthase (e.g., AANO1134.1, ACA21458.2), EF426463, JN388566, HQ636425), (-)-limonene-3-hy 4S-limonene synthase (e.g., AAC37.366.1), limonene-6-hy droxylase (e.g., EF426464, AY622319), (–)-isopiperitenol droxylase (e.g., AAQ18706.1, AAD44150.1), and carvedl dehydrogenase (e.g., EF426465), and (–)-isopiperitenone dehydrogenase (e.g., AAU20370.1, ABR15424.1). reductase (e.g., EF426466). [012.1] In some embodiments, the one or more terpenoids [0131] The one or more terpenoids may include Valencene, comprise Cineole, which may be synthesized through a path which may be synthesized through a pathway comprising one way comprising one or more of Geranyl pyrophosphate syn or more of farnesyl diphosphate synthase (e.g., AAK63847. thase (e.g., AANO1134.1, ACA21458.2) and 1,8-cineole syn 1), and Valanceme synthase (e.g., CQ813508, AF441124. 1). thase (e.g., AF051899). [0132] In some embodiments, the one or more terpenoids [0122] In some embodiments, the one or more terpenoids includes one or more of alpha, beta and Y-humulene, which includes Citral, which may be synthesized through a pathway may be synthesized through a pathway comprising one or comprising one or more of Geranyl pyrophosphate synthase more of farnesyl diphosphate synthase (e.g., AAK63847.1), (e.g., AANO1134.1, ACA21458.2), geraniol synthase (e.g., and humulene synthase (e.g., U92267.1). 
HM807399, GU136162, AY362553), and geraniol dehydro [0133] In some embodiments, the one or more terpenoids genase (e.g., AY879284). includes (+)-borneol, which may be synthesized through a [0123] In still other embodiments, the one or more terpe pathway comprising one or more of Geranyl pyrophosphate noids includes Cubebol, which is synthesized through a path synthase (e.g., AANO1134.1, ACA21458.2), and bornyl way comprising one or more of farnesyl diphosphate syn pyrophosphate synthase (e.g., AF051900). thase (e.g., AAK63847.1), and cubebol synthase (e.g., [0134] The one or more terpenoids may comprise 3-carene, CQ813505.1). which may be synthesized through a pathway comprising one [012.4] The one or more terpenoids may include Limonene, or more of Geranyl pyrophosphate synthase (e.g., and which may be synthesized through a pathway comprising AAN01134.1, ACA21458.2), and 3-carene synthase (e.g., one or more of Geranyl pyrophosphate synthase (e.g., HQ336800). AAN01134.1, ACA21458.2), and limonene synthase (e.g., [0135] In some embodiments, the one or more terpenoids EF426463, JN388566, HQ6364.25). include 3-Oxo-alpha-Ionone or 4-oxo-beta-ionone, which [0125] The one or more terpenoids may include Menthone may be synthesized through a pathway comprising caro or Menthol, which may be synthesized through a pathway tenoid cleavage dioxygenase (e.g., ABY60886.1, BAJO5401. comprising one or more of Geranyl pyrophosphate synthase 1). (e.g., AANO1134.1, ACA21458.2), limonene synthase (e.g., [0136] In some embodiments, the one or more terpenoids EF426463, JN1388566, HQ636425), (-)-limonene-3-hy include alpha-terpinolene, which may be synthesized through droxylase (e.g., EF426464, AY622319), (–)-isopiperitenol a pathway comprising one or more of Geranyl pyrophosphate dehydrogenase (e.g., EF426465), (–)-isopiperitenone reduc synthase (e.g., AANO1134.1, ACA21458.2), and alpha-terpi tase (e.g., EF426466), (+)-cis-isopulegone isomerase, (-) neol synthase (e.g., AF543529). 
menthone reductase (e.g., EF426467), and for Menthol (-) [0137] In some embodiments, the one or more terpenoids menthol reductase (e.g., EF426468). include alpha-thujene, which may be synthesized through a [0126] In some embodiments, the one or more terpenoids pathway comprising one or more of Geranyl pyrophosphate comprise myrcene, which may be synthesized through a path synthase (e.g., AAN01134.1, ACA21458.2), and alpha way comprising one or more of Geranyl pyrophosphate syn thujene synthase (e.g., AEJ91555.1). thase (e.g., AANO1134.1, ACA21458.2) and myrcene syn [0138] In some embodiments, the one or more terpenoids thase (e.g., U87908, AY195608, AF271259). include Farnesol, which may be synthesized through a path [0127] The one or more terpenoids may include Nootka way comprising one or more of farnesyl diphosphate syn tone, which may be synthesized through a pathway compris thase (e.g., AAK63847.1), and Farnesol synthase (e.g., ing one or more of farnesyl diphosphate synthase (e.g., AF529266.1, DQ872159.1). AAK63847.1), and Valanceme synthase (e.g., CQ813508, [0139] In some embodiments, the one or more terpenoids AF441124, 1). include Fenchone, which may be synthesized through a path [0128] The one or more terpenoids may include Sabinene way comprising one or more of Geranyl pyrophosphate syn hydrate, which may be synthesized through a pathway com thase (e.g., AANO1134.1, ACA21458.2), and (–)-endo-fen prising one or more of Geranyl pyrophosphate synthase (e.g., chol cyclase (e.g., AY693648). AAN01134.1, ACA21458.2), and sabinene synthase (e.g., [0140] In some embodiments, the one or more terpenoids O81193.1). 
include gamma-Terpinene, which may be synthesized [0129] The one or more terpenoids may include Steviol or through a pathway comprising one or more of Geranyl pyro Steviol glycoside, and which may be synthesized through a phosphate synthase (e.g., AANO1134.1, ACA21458.2), and pathway comprising one or more of geranylgeranylpyrophos terpinene synthase (e.g., AB110639). phate synthase (e.g., AF081514), ent-copalyl diphosphate [0141] In some embodiments, the one or more terpenoids synthase (e.g., AF034545.1), ent-kaurene synthase (e.g., include Geraniol, which may be synthesized through a path AF0973.11.1), ent-kaurene oxidase (e.g., DQ200952.1), and way comprising one or more of Geranyl pyrophosphate syn kaurenoic acid 13-hydroxylase (e.g., EU722415.1). For ste thase (e.g., AANO1134.1, ACA21458.2), and geraniol syn viol glycoside, the pathway may further include UDP-glyco thase (e.g. HM807399, GU136162, AY362553). US 2012/0107893 A1 May 3, 2012
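The pathway listings above pair each terpenoid with candidate enzymes and example accession numbers. As a minimal sketch, and not part of the disclosure, a few of those listings (limonene, geraniol, citral and valencene) could be held in a lookup table so that shared and additional enzymatic steps can be read off directly. The names `PATHWAYS`, `terpenoids_sharing` and `extra_steps` are hypothetical, and only a handful of entries are transcribed.

```python
# Illustrative only: a hypothetical lookup table transcribed from the
# limonene, geraniol, citral and valencene listings in the text.
# Each step is (enzyme name, example accession numbers).
GPP_SYNTHASE = ("geranyl pyrophosphate synthase", ["AAN01134.1", "ACA21458.2"])
FPP_SYNTHASE = ("farnesyl diphosphate synthase", ["AAK63847.1"])
GERANIOL_SYNTHASE = ("geraniol synthase", ["HM807399", "GU136162", "AY362553"])

PATHWAYS = {
    "limonene": [
        GPP_SYNTHASE,
        ("limonene synthase", ["EF426463", "JN388566", "HQ636425"]),
    ],
    "geraniol": [GPP_SYNTHASE, GERANIOL_SYNTHASE],
    "citral": [
        GPP_SYNTHASE,
        GERANIOL_SYNTHASE,
        ("geraniol dehydrogenase", ["AY879284"]),
    ],
    "valencene": [
        FPP_SYNTHASE,
        ("valencene synthase", ["CQ813508", "AF441124.1"]),
    ],
}


def terpenoids_sharing(enzyme):
    """Return, sorted, the terpenoids whose listed pathway includes `enzyme`."""
    return sorted(
        name
        for name, steps in PATHWAYS.items()
        if any(step_name == enzyme for step_name, _ in steps)
    )


def extra_steps(product, precursor):
    """Enzymes listed for `product` but not for `precursor`, in listed order."""
    known = {step_name for step_name, _ in PATHWAYS[precursor]}
    return [step_name for step_name, _ in PATHWAYS[product] if step_name not in known]


if __name__ == "__main__":
    print(terpenoids_sharing("geraniol synthase"))
    print(extra_steps("citral", "geraniol"))
```

Under these assumptions, the table makes it easy to see that citral differs from geraniol only by the geraniol dehydrogenase step, while valencene branches from a different precursor synthase.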

[0142] In still other embodiments, the one or more terpenoids include ocimene, which may be synthesized through a pathway comprising one or more of Geranyl pyrophosphate synthase (e.g., AAN01134.1, ACA21458.2), and beta-ocimene synthase (e.g., EU194553.1).

[0143] In certain embodiments, the one or more terpenoids include Pulegone, which may be synthesized through a pathway comprising one or more of Geranyl pyrophosphate synthase (e.g., AAN01134.1, ACA21458.2), and pinene synthase (e.g., HQ636424, AF543527, U87909).

[0144] In certain embodiments, the one or more terpenoids includes Sabinene, which may be synthesized through a pathway comprising one or more of Geranyl pyrophosphate synthase (e.g., AAN01134.1, ACA21458.2), and sabinene synthase (e.g., HQ336804, AF051901, DQ785794).

[0145] In various embodiments, the product is a fragrance product, a cosmetic, a cleaning product, or a soap. In such embodiments, the terpenoid or derivative may be a fragrance. For example, the one or more terpenoid or derivative may include one or more of Linalool; alpha-Pinene; Carvone; Citronellal; Citronellol; Citral; Sabinene; Limonene; Verbenone; Geraniol; Cineole; myrcene; Germacrene D; farnesene; Valencene; Nootkatone; patchouli alcohol; Farnesol; beta-Ylangene; β-Santalol; β-Santalene; α-Santalene; α-Santalol; β-vetivone; α-vetivone; khusimol; Sclarene; sclareol; beta-Damascone; beta-Damascenone; or a derivative thereof. In these or other embodiments, the one or more terpenoid or derivative compounds includes one or more of Camphene; Pulegone; Fenchone; Fenchol; Sabinene hydrate; Menthone; Piperitone; Carveol; gamma-Terpinene; beta-Thujone; dihydro-myrcene; alpha-thujene; alpha-terpineol; ocimene; nerol; nerolidol; (E)-4-decenal; 3-carene; (-)-alpha-phellandrene; hotrienol; alpha-terpinolene; (+)-1-terpinene-4-ol; perillaldehyde; verbenone; isopinocamphone; tagetone; trans-myrtanal; alpha-sinensal; 1,3,8-menthatriene; (-)-cis-rose oxide; (+)-borneol; (+)-verbenone; Germacrene A; Germacrene B; Germacrene; Germacrene E; (+)-beta-cadinene; epi-cedrol; alpha, beta and γ-humulene; alpha-bisabolene; beta-caryophyllene; […]; alpha-sinensal; alpha-bisabolol; (-)-β-Copaene; (-)-α-Copaene; 4(Z),7(Z)-decadienal; cedrol; […]; muuroladiene; isopatchoul-3-ene; isopatchoula-3,5-diene; cedrol; guaiol; (-)-6,9-guaiadiene; bulnesol; guaiol; ledene; ledol; lindestrene; alpha-bergamotene; maaliol; isovalencenol; muurolol T; beta-Ionone; alpha-Ionone; Oxo-Edulan I; Oxo-Edulan II; Theaspirone; Dihydroactinidiolide; 4-Oxoisophorone; Safranal; beta-Cyclocitral; (-)-cis-gamma-irone; (-)-cis-alpha-irone; or a derivative thereof. Such terpenoids and derivatives may be synthesized according to a pathway described above. In some embodiments, the one or more terpenoids include Linalool, which may be synthesized through a pathway comprising one or more of Geranyl pyrophosphate synthase (e.g., AAN01134.1, ACA21458.2), and linalool synthase (e.g., FJ644544, GQ338154, FJ644548).

[0146] In some embodiments, the one or more terpenoids include alpha-Pinene, which may be synthesized through a pathway comprising one or more of Geranyl pyrophosphate synthase (e.g., AAN01134.1, ACA21458.2), and pinene synthase (e.g., HQ636424, AF543527, U87909).

[0147] In some embodiments, the one or more terpenoids includes Germacrene D, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and germacrene synthase (e.g., AAS66357.1, AAX40665.1, HQ652871).

[0148] In some embodiments, the one or more terpenoids include farnesene, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and farnesene synthase (e.g., AAT70237.1, AAS68019.1, AY182241).

[0149] In some embodiments, the one or more terpenoids comprises patchouli alcohol, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and Patchoulol synthase (e.g., AY508730.1).

[0150] In some embodiments, the one or more terpenoids comprises β-Santalol, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), beta Santalene synthase (e.g., HQ259029.1, HQ343278.1, HQ343277.1).

[0151] In some embodiments, the one or more terpenoids include β-Santalene, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and beta Santalene synthase (e.g., HQ259029.1, HQ343278.1, HQ343277.1).

[0152] In some embodiments, the one or more terpenoids include α-Santalene or α-Santalol, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and Santalene synthase (e.g., HQ343282.1).

[0153] In other embodiments, the one or more terpenoids include sclareol, which may be synthesized through a pathway comprising one or more of geranylgeranylpyrophosphate synthase (e.g., AF081514), and Sclareol synthase (e.g., HB976923.1).

[0154] In certain embodiments, the one or more terpenoids include beta-Damascone or beta-Damascenone, which may be synthesized through a pathway comprising one or more of carotenoid cleavage dioxygenase (e.g., ABY60886.1, BAJ05401.1).

[0155] In some embodiments, the one or more terpenoids include Fenchol, which may be synthesized through a pathway comprising one or more of Geranyl pyrophosphate synthase (e.g., AAN01134.1, ACA21458.2), and (-)-endo-fenchol cyclase (e.g., AY693648).

[0156] In some embodiments, the one or more terpenoids include alpha-terpineol, which may be synthesized through a pathway comprising one or more of Geranyl pyrophosphate synthase (e.g., AAN01134.1, ACA21458.2), alpha-terpineol synthase (e.g., AF543529).

[0157] In some embodiments, the one or more terpenoids include Germacrene A, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and Germacrene synthase (e.g., AEM05858.1, ABE03980.1, AAM11626.1). In other embodiments, the one or more terpenoids include Germacrene B, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and Germacrene synthase (e.g., ACF94469.1). Alternatively or in addition, the one or more terpenoids include Germacrene C, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and Germacrene synthase (e.g., AAC39432.1).

[0158] In still other embodiments, the one or more terpenoids include (+)-beta-cadinene, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), cadinene synthases (e.g., U88318.1).
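Each pathway above starts from one of three prenyl diphosphate synthases, and that choice fixes the carbon skeleton: geranyl pyrophosphate is C10, farnesyl diphosphate is C15, and geranylgeranyl pyrophosphate is C20. The following is a minimal sketch, not part of the disclosure, of how the terpenoid class could be inferred from a listed enzyme set. The function `terpenoid_class` and both tables are hypothetical. As a stated assumption drawn from general biochemistry rather than from the text, a pathway containing phytoene synthase or squalene synthase is treated as tetraterpenoid or triterpenoid, because those enzymes join two C20 or two C15 units.

```python
# Illustrative only: infer the terpenoid class from the enzymes in a pathway.
# Carbon counts are the standard values for the precursors named in the text.
PRECURSOR_CLASS = {
    "geranyl pyrophosphate synthase": "monoterpenoid",          # GPP, C10
    "farnesyl diphosphate synthase": "sesquiterpenoid",         # FPP, C15
    "geranylgeranylpyrophosphate synthase": "diterpenoid",      # GGPP, C20
}

# Assumption: these enzymes condense two precursor units, doubling the skeleton.
CONDENSING_CLASS = {
    "squalene synthase": "triterpenoid",     # 2 x C15 -> C30
    "phytoene synthase": "tetraterpenoid",   # 2 x C20 -> C40
}


def terpenoid_class(pathway_enzymes):
    """Return the inferred class name, or None if no known precursor enzyme is listed."""
    names = [enzyme.lower() for enzyme in pathway_enzymes]
    # A condensing enzyme takes priority over the precursor synthase.
    for name in names:
        if name in CONDENSING_CLASS:
            return CONDENSING_CLASS[name]
    for name in names:
        if name in PRECURSOR_CLASS:
            return PRECURSOR_CLASS[name]
    return None


if __name__ == "__main__":
    print(terpenoid_class(["Geranyl pyrophosphate synthase", "linalool synthase"]))
    print(terpenoid_class(["farnesyl diphosphate synthase", "cadinene synthase"]))
    print(terpenoid_class(["geranylgeranylpyrophosphate synthase", "Sclareol synthase"]))
```

This is only a heuristic: products formed by cleavage, such as the ionones made by carotenoid cleavage dioxygenase, list no precursor synthase in the text and would return `None` here.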

[0159] In some embodiments, the one or more terpenoids include epi-cedrol, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and epicedrol synthase (e.g., AF157059.1).

[0160] In other embodiments, the one or more terpenoids include alpha-bisabolene, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and bisabolene synthase (e.g., HQ343280.1, HQ343279.1).

[0161] In some embodiments, the one or more terpenoids include beta-caryophyllene, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and Caryophyllene synthase (e.g., AF472361).

[0162] The one or more terpenoids may include Longifolene, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and longifolene synthase (e.g., AY473625.1).

[0163] The one or more terpenoids may include alpha-bisabolol, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and bisabolol synthase (e.g., ADO87004.1).

[0164] In some embodiments, the one or more terpenoids include (-)-β-Copaene, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and Copaene synthase (e.g., EU158098). Alternatively, or in addition, the one or more terpenoids include (-)-α-Copaene, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and Copaene synthase (e.g., AJO01539.1).

[0165] The one or more terpenoids may include cedrol or cedrene, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and cedrol synthase (e.g., AJO01539.1).

[0166] The one or more terpenoids may include muuroladiene, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and muuroladiene synthase (e.g., AJ786641.1).

[0167] The one or more terpenoids may include alpha-bergamotene, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and alpha-bergamotene synthase (e.g., HQ259029.1).

[0168] The one or more terpenoids may include alpha and/or beta-Ionone, which may be synthesized through a pathway comprising one or more of carotenoid cleavage dioxygenase (e.g., ABY60886.1, BAJ05401.1).

[0169] In some embodiments of this aspect of the invention, the product is a vitamin or nutritional supplement, and the terpenoid or derivative may be Zeaxanthin; β-carotene; lycopene; Astaxanthin; CoQ10; Vitamin K; Vitamin E; Labdane; Gibberellins; α-carotene; or a derivative thereof. Enzymes and encoding genes for synthesizing such terpenoids or derivatives thereof from products of the MEP pathway are known, and some are described above.

[0170] For example, the one or more terpenoids may include Zeaxanthin, which may be synthesized through a pathway comprising one or more of geranylgeranylpyrophosphate synthase (e.g., AF081514), phytoene synthase (e.g., AAA64982.1), phytoene desaturase (e.g., AAA21263.1, ACI04664.1), lycopene beta cyclase (e.g., AAA64980.1), and beta-carotene hydroxylase (e.g., AAA64983.1).

[0171] In other embodiments, the one or more terpenoids include α- and/or β-carotene, which may be synthesized through a pathway comprising one or more of geranylgeranylpyrophosphate synthase (e.g., AF081514), phytoene synthase (e.g., AAA64982.1), phytoene desaturase (e.g., AAA21263.1, ACI04664.1), lycopene beta cyclase (e.g., AAA64980.1), and lycopene beta cyclase (e.g., NM_001084662.1, AAA64980.1).

[0172] In some embodiments, the one or more terpenoids include lycopene, which may be synthesized through a pathway comprising one or more of geranylgeranylpyrophosphate synthase (e.g., AF081514), phytoene synthase (e.g., AAA64982.1), phytoene desaturase (e.g., AAA21263.1, ACI04664.1), and lycopene beta cyclase (e.g., AAA64980.1).

[0173] In some embodiments, the one or more terpenoids include Astaxanthin, which may be synthesized through a pathway comprising one or more of geranylgeranylpyrophosphate synthase (e.g., AF081514), phytoene synthase (e.g., AAA64982.1), phytoene desaturase (e.g., AAA21263.1, ACI04664.1), lycopene beta cyclase (e.g., AAA64980.1), beta-carotene hydroxylase (e.g., AAA64983.1), and beta-carotene ketolase (e.g., ABL09497.1).

[0174] In still other embodiments, the one or more terpenoids include Vitamin E, which may be synthesized through a pathway comprising one or more of geranylgeranylpyrophosphate synthase (e.g., AF081514), geranylgeranylpyrophosphate reductase (e.g., BAO00022), homogentisate phytyltransferase (e.g., BAO00022), tocopherol-cyclase (e.g., BAO00022), and copalyl pyrophosphate synthase (e.g., AAB87091).

[0175] In some embodiments of this aspect of the invention, the product is a solvent or cleaning product. For example, the terpenoid or derivative may be alpha-Pinene; beta-Pinene; Limonene; myrcene; Linalool; Geraniol; Cineole; or a derivative thereof. Some enzymes and encoding genes for synthesizing such terpenoids or derivatives thereof from products of the MEP pathway are known, and some are described above.

[0176] In some embodiments, the one or more terpenoids include beta-Pinene, which may be synthesized through a pathway comprising one or more of Geranyl pyrophosphate synthase (e.g., AAN01134.1, ACA21458.2), and pinene synthase (e.g., HQ636424, AF543527, U87909).

[0177] In some embodiments of this aspect of the invention, the product is a pharmaceutical, and the terpenoid or derivative is an active pharmaceutical ingredient. For example, the terpenoid or derivative may be Artemisinin; Taxol; Taxadiene; levopimaradiene; Ginkgolides; Abietadiene; Abietic acid; beta-amyrin; […]; or a derivative thereof. In still other embodiments, the terpenoid or derivative is Thymoquinone; Ascaridole; beta-selinene; 5-epi-aristolochene; vetispiradiene; epi-cedrol; alpha, beta and γ-humulene; α-cubebene; beta-elemene; Gossypol; Zingiberene; Periplanone B; Capsidiol; Capnellene; illudin; Isocomene; cyperene; Pseudopterosins; Crotophorbolone; Englerin; Psiguadial; Stemodinone; Maritimol; Cyclopamine; Veratramine; Aplyviolene; macfarlandin E; […]; […]; Ursolic acid; Pimaradiene; neo-abietadiene; […]; Dolichol; Lupeol; Euphol; Kaurene; Gibberellins; Cassaic acid; Erythroxydiol; Trisporic acid; Podocarpic acid; Retene; Dehydroleucodine; Phorbol; Cafestol; kahweol; Tetrahydrocannabinol; androstenol; or a derivative thereof. Enzymes and encoding genes for synthesizing such terpenoids or derivatives thereof from products of the MEP pathway are known, and some are described above.

[0178] For example, in some embodiments, the one or more terpenoids include Artemisinin, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), Amorphadiene Synthase (e.g., AAF98444.1), amorpha-4,11-diene monooxygenase (e.g., DQ315671), and artemisinic aldehyde delta-11(13) reductase (e.g., ACH61780.1).

[0179] In other embodiments, the one or more terpenoids include Taxadiene, which may be synthesized through a pathway comprising one or more of geranylgeranylpyrophosphate synthase (e.g., AF081514), and taxadiene synthase (e.g., U48796.1, AF081514). In some embodiments, the terpenoid or derivative is Taxol, which may be synthesized through a pathway comprising one or more of geranylgeranylpyrophosphate synthase (e.g., AF081514), taxadiene synthase (e.g., U48796.1, AF081514), Taxus cuspidata taxadiene 5-alpha hydroxylase (e.g., AY289209.2), Taxus cuspidata 5-alpha-taxadienol-10-beta-hydroxylase (e.g., AF318211.1), Taxus cuspidata taxoid 10-beta hydroxylase (e.g., AY563635.1), Taxus cuspidata taxane 13-alpha-hydroxylase (e.g., AYO56019.1), Taxus cuspidata taxane 14b-hydroxylase (e.g., AY188177.1), Taxus cuspidata taxoid 7-beta-hydroxylase (e.g., AY307951.1).

[0180] In still other embodiments, the one or more terpenoids or derivatives include levopimaradiene, which may be synthesized through a pathway comprising one or more of geranylgeranylpyrophosphate synthase (e.g., AF081514), and levopimaradiene synthase (e.g., AAS89668.1).

[0181] In some embodiments, the one or more terpenoids include Ginkgolides, which may be synthesized through a pathway comprising one or more of geranylgeranylpyrophosphate synthase (e.g., AF081514), and levopimaradiene synthase (e.g., AAS89668.1).

[0182] In some embodiments, the one or more terpenoids or derivatives may include Abietadiene or Abietic acid, which may be synthesized through a pathway comprising one or more of geranylgeranylpyrophosphate synthase (e.g., AF081514), and abietadiene synthase (e.g., AAK83563.1).

[0183] The one or more terpenoids or derivatives may include beta-amyrin, which may be synthesized through a pathway comprising one or more of geranylgeranylpyrophosphate synthase (e.g., AF081514), and beta-Amyrin Synthase (e.g., ACO24697.1).

[0184] The one or more terpenoids or derivatives may include beta-selinene, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and selinene synthase (e.g., O64404.1).

[0185] In still other embodiments, the one or more terpenoids or derivatives include 5-epi-aristolochene, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and 5-epi-aristolochene synthase (e.g., AF542544.1).

[0186] In some embodiments, the one or more terpenoids or derivatives include vetispiradiene, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and vetispiradiene synthases (e.g., BD227667).

through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and epicedrol synthase (e.g., AF157059.1).

[0188] The one or more terpenoids or derivatives may include beta-elemene, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and elemene synthase (e.g., DQ872158.1).

[0189] In still other embodiments, the terpenoid or derivative is Gossypol, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and (+) delta cadinene synthase (e.g., U88318.1).

[0190] In some embodiments, the terpenoid or derivative is Zingiberene, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and Zingiberene synthase (e.g., Q5SBP4.1).

[0191] In still other embodiments, the one or more terpenoids or derivatives include Capsidiol, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and epi-aristolochene synthase (e.g., Q40577.3).

[0192] The one or more terpenoids or derivatives may include Pimaradiene, which may be synthesized through a pathway comprising one or more of geranylgeranylpyrophosphate synthase (e.g., AF081514), ent-pimara-8(14),15-diene synthase (e.g., Q6Z5J6.1), and syn-pimara-7,15-diene synthase (e.g., AAU05906.1).

[0193] The one or more terpenoids or derivatives may include neo-abietadiene, which may be synthesized through a pathway comprising one or more of geranylgeranylpyrophosphate synthase (e.g., AF081514), and levopimaradiene synthase (e.g., AAS89668.1). The one or more terpenoids or derivatives may include Squalene, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and squalene synthase (e.g., AEE86403.1).

[0194] The one or more terpenoids or derivatives may include Lupeol, which may be synthesized through a pathway comprising one or more of geranylgeranylpyrophosphate synthase (e.g., AF081514), and lupeol synthase (e.g., AAD05032.1).

[0195] The one or more terpenoids or derivatives may include Kaurene, which may be synthesized through a pathway comprising one or more of geranylgeranylpyrophosphate synthase (e.g., AF081514), ent-copalyl diphosphate synthase (e.g., AF034545.1), and ent-kaurene synthase (e.g., AF097311.1).

[0196] In some embodiments of this aspect of the invention, the product is a food preservative. In such embodiments, the terpenoid or derivative may be Carvacrol; Carvone; or a derivative thereof. For example, the Carvone may be synthesized through a pathway comprising one or more of Geranyl pyrophosphate synthase (e.g., AAN01134.1, ACA21458.2), 4S-limonene synthase (e.g., AAC37366.1), limonene-6-hydroxylase (e.g., AAQ18706.1, AAD44150.1), and carveol dehydrogenase (e.g., AAU20370.1, ABR15424.1).

[0197] In some embodiments of this aspect, the product is a lubricant or surfactant. The terpenoid or derivative may be,
for example, one or more of alpha-Pinene; beta-Pinene; [0187] In some embodiments, the one or more terpenoids or Limonene; myrcene; farmesene; Squalene; or a derivative derivatives include epi-cedrol, which may be synthesized thereof. Enzymes and encoding genes for synthesizing such US 2012/0107893 A1 May 3, 2012

terpenoids or derivatives thereof from products of the MEP pathway are known, and some are described above.

[0198] The one or more terpenoids may include farnesene, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and farnesene synthase (e.g., AAT70237.1, AAS68019.1, AY182241).

[0199] The one or more terpenoids may include Squalene, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and squalene synthase (e.g., AEE86403.1).

[0200] In some embodiments, the terpenoid or derivative is polymerized, and the resulting polymer may be elastomeric. In some such embodiments, the terpenoid or derivative may be alpha-Pinene; beta-Pinene; Limonene; myrcene; farnesene; Squalene; isoprene; or a derivative thereof. Enzymes and encoding genes for synthesizing such terpenoids or derivatives thereof from products of the MEP pathway are known, and some are described above. In some embodiments, the one or more terpenoids include isoprene, which may be synthesized through a pathway comprising one or more of Isoprene synthase (e.g., HQ684728.1).

[0201] In some embodiments of this aspect of the invention, the product is an insecticide, pesticide or pest control agent, and the terpenoid or derivative is an active ingredient. For example, the one or more terpenoid or derivative may include one or more of Carvone; Citronellol; Citral; Cineole; Germacrene C; (+)-beta-cadinene; or a derivative thereof. In other embodiments, the one or more terpenoid or derivative is Thymol; Limonene; Geraniol; Isoborneol; beta-Thujone; myrcene; (+)-verbenone; dimethyl-nonatriene; Germacrene A; Germacrene B; Germacrene D; patchouli alcohol; Guaiazulene; muuroladiene; cedrol; alpha-cadinol; d-occidol; Azadirachtin A; Kaurene; or a derivative thereof. Enzymes and encoding genes for synthesizing such terpenoids or derivatives thereof from products of the MEP pathway are known, and some are described above.

[0202] For example, the one or more terpenoids may include Germacrene D, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and Germacrene synthase (e.g., AAS66357.1, AAX40665.1, HQ652871).

[0203] In some embodiments, the product is a cosmetic or personal care product, and the terpenoid or derivative is not a fragrance. For example, one or more terpenoid or derivative is Camphor; Linalool; Carvone; myrcene; farnesene; patchouli alcohol; alpha-bisabolene; alpha-bisabolol; beta-Ylangene; β-Santalol; β-Santalene; α-Santalene; α-Santalol; or a derivative thereof. In some embodiments, the terpenoid or derivative is Camphene; Carvacrol; alpha-terpineol; (Z)-beta-ocimene; nerol; (E)-4-decenal; perillaldehyde; (-)-cis-rose oxide; Copaene; 4(Z),7(Z)-decadienal; isopatchoulenone; (-)-6,9-guaiadiene; Retinol; ; (-)-cis-gamma-irone; (-)-cis-alpha-irone; Phytoene; Phytofluene; or a derivative thereof. In some embodiments, the one or more terpenoids may include alpha-bisabolene, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and bisabolene synthase (e.g., HQ343280.1, HQ343279.1).

[0204] In some embodiments, the one or more terpenoids or derivatives include (Z)-beta-ocimene, which may be synthesized through a pathway comprising one or more of Geranyl pyrophosphate synthase (e.g., AAN01134.1, ACA21458.2), and beta-ocimene synthase (e.g., NM_117775).

[0205] In some embodiments, the one or more terpenoids or derivatives include Copaene, which may be synthesized through a pathway comprising one or more of farnesyl diphosphate synthase (e.g., AAK63847.1), and Copaene synthase (e.g., EU158098).

[0206] In other embodiments, the terpenoid or derivative is (-)-cis-gamma-irone or (-)-cis-alpha-irone, which may be synthesized through a pathway comprising one or more of carotenoid cleavage dioxygenase (e.g., ABY60886.1, BAJ05401.1).

[0207] In still other embodiments, the one or more terpenoids include Phytoene or , which may be synthesized through a pathway comprising one or more of geranylgeranylpyrophosphate synthase (e.g., AF081514), phytoene synthase (e.g., AAA64982.1), and phytoene desaturase (e.g., AAA21263.1, ACI04664.1).

EXAMPLES

Methods

[0208] Strains, Plasmids, Oligonucleotides and Genes. E. coli K12 MG1655 strain was used as the host strain for all the taxadiene strain construction. E. coli K12MG1655 Δ(recA,endA) and E. coli K12MG1655 Δ(recA,endA)ED3 strains were provided by Professor Kristala Prather's lab at MIT (Cambridge, Mass.). Details of all the plasmids constructed for the study are shown in Table 2. All oligonucleotides used in this study are contained in Table 3.

[0209] The sequences of geranylgeranyl pyrophosphate synthase (GGPPS), Taxadiene synthase (TS), Cytochrome P450 Taxadiene 5α-hydroxylase (T5αOH) and Taxus NADPH:cytochrome P450 reductase (TCPR) were obtained from Taxus canadensis, Taxus brevifolia, Taxus cuspidata (Genbank accession codes: AF081514, U48796, AY289209 and AY571340). Genes were custom-synthesized using the plasmids and protocols reported by Kodumal et al. (Supplementary details Appendix 1) to incorporate E. coli translation codons and to remove restriction sites for cloning purposes. Nucleotides corresponding to the 98 and 60 N-terminal amino acids of GGPPS and TS (plastid transit peptide) were removed and the translation insertion sequence Met was inserted.

Construction of MEP Pathway (dxs-idi-ispDF operon).

[0210] The dxs-idi-ispDF operon was initially constructed by cloning each of the genes from the genome of E. coli K12 MG1655 using the primers dxs(s), dxs(a), idi(s), idi(a), ispDF(s) and ispDFI(a) under pET21C+ plasmid with T7 promoter (p20T7MEP). Using the primers dxsidiispDFNcoI(s) and dxsidiispDFKpnI(a), the dxs-idi-ispDF operon was sub-cloned into the pTrcHis2B (Invitrogen) plasmid after digestion with NcoI and KpnI to give the pTrcMEP plasmid (p20TrcMEP). The p20TrcMEP plasmid was digested with MluI and PmeI and cloned into MluI and PmeI digested pACYC184-melA(P2A) plasmid to construct the p10TrcMEP plasmid. The pTrcMEP plasmid was digested with BstZ17I and ScaI and cloned into PvuII digested pCL1920 plasmid to construct the p5TrcMEP plasmid. For constructing the p20T5MEP plasmid, the dxs-idi-ispDF operon was initially cloned into pQE plasmid with T5 promoter (pQE-MEP) using the primers dxsidiispDFNcoI(s) and dxsidiispDFXhoI(a). A fraction of the operon DNA with T5 promoter was amplified using the primers T5AgeI(s) and T5NheI(a) from the pQEMEP plasmid. The DNA fragment was digested with AgeI/NheI and cloned into the p20T7MEP plasmid digested with SgrAI/NheI enzymes.

Construction of Taxadiene Pathway (GT and TG Operons).

[0211] The downstream taxadiene pathways (GT and TG operon) were constructed by cloning PCR fragments of GGPPS and TS into the NcoI-EcoRI and EcoRI-SalI sites of pTrcHIS2B plasmid to create p20TrcGT and p20TrcTG using the primers GGPPSNcoI(s), GGPPSEcoRI(a), TSEcoRI(s), TSsalI(a), TSNcoI(s), TSEcoRI(a), GGPPSEcoRI(s) and GGPPSSalI(a). For constructing p20T5GT, the operon was initially amplified with primers GGPPSNcoI(s) and TSXhoI(a) and cloned into a pQE plasmid under T5 promoter digested with NcoI/XhoI. Further, the sequence was digested with XbaI and XhoI and cloned into the pTrc plasmid backbone amplified using the primers pTrcSal(s) and pTrcXba(a). p10T7TG was constructed by subcloning the NcoI/SalI digested TG operon from p20TrcTG into NcoI/SalI digested pACYC-DUET1 plasmid. p5T7TG was constructed by cloning the BspEI/XbaI digested fragment to the XbaI/BspEI digested DNA amplified from pCL1920 plasmid using pCLBspEI(s) and pCLXbaI(a) primers.

[0212] Construction of chromosomal integration MEP pathway plasmids. For constructing the plasmids with FRP-Km-FRP cassette for amplifying the sequence for integration, p20T7MEP and p20T5MEP were digested with XhoI/ScaI. The FRP-Km-FRP cassette was amplified from the Km cassette with FRP sequence from pKD13 plasmid using the primers KmFRPXhoI(s) and KmFRPScaI(a). The amplified DNA was digested with XhoI/ScaI and cloned into the XhoI/ScaI digested p20T7MEP and p20T5MEP plasmids (p20T7MEPKmFRP and p20T5MEPKmFRP). Similarly, the p20TrcMEP plasmid was digested with SacI/ScaI, and the DNA amplified using the primers KmFRPSacI(s) and KmFRPScaI(a) was digested and cloned into the p20TrcMEP plasmid (p20TrcMEPKm-FRP).

Chromosomal Integration of the MEP Pathway Cassette (LacIq-MEP-FRP-Km-FRP) Cassette

[0213] The MEP pathways constructed under the promoters T7, T5 and Trc were localized to the ara operon region in the chromosome with the Kan marker. The PCR fragments were amplified from p20T7MEPKmFRP, p20T5MEPKmFRP and p20TrcMEPKm-FRP using the primers IntT7T5(s), IntTrc(s) and Int(a) and then electroporated into E. coli MG1655 recA-end- and E. coli MG1655 recA-end-EDE3 cells for chromosomal integration through the λ Red recombination technique. The site specific localization was confirmed and the Km marker was removed through the action of the FLP recombinase after successful gene integration.

Construction of Taxadiene 5α-ol Pathway

[0214] The transmembrane region (TM) of the taxadiene 5α-ol hydroxylase (T5αOH) and Taxus Cytochrome P450 reductase (TCPR) was identified using PredictProtein software (www.predictprotein.org). For transmembrane engineering, selective truncation at 8, 24 and 42 amino acid residues on the N-terminal transmembrane region of taxadiene 5α-ol hydroxylase (T5αOH) and of a 74 amino acid region in the TCPR was performed. The removal of the 8, 24 and 42 residue N-terminal amino acids of taxadiene 5α-ol hydroxylase (T5αOH), incorporation of the one amino acid substituted bovine 17α hydroxylase N-terminal 8 residue peptide MALLLAVF (SEQ ID NO:51) to the N-terminal truncated T5αOH sequences, and addition of the GSTGS peptide linker were carried out using the primers CYP17At8AANdeI(s), CYP17At24AANdeI(s), CYP17At42AANdeI(s) and CYPLinkBamHI(a). Using these primers each modified DNA was amplified, NdeI/BamHI digested and cloned into NdeI/BamHI digested pACYC DUET1 plasmid to construct the p10At8T5αOH, p10At24T5αOH and p10At42T5αOH plasmids. The 74 amino acid truncated TCPR (tTCPR) sequence was amplified using primers CPRBamHI(s) and CPRSalI(a). The amplified tTCPR sequence and the plasmids p10At8T5αOH, p10At24T5αOH and p10At42T5αOH were digested with BamHI/SalI and cloned to construct the plasmids p10At8T5αOH-tTCPR, p10At24T5αOH-tTCPR and p10At42T5αOH-tTCPR.

Culture Growth for Screening the Taxadiene and Taxadiene 5α-ol Analysis

[0215] Single transformants of pre-engineered E. coli strains harboring the appropriate plasmid with upstream (MEP), downstream taxadiene pathway and taxadiene 5α-ol were cultivated for 18 h at 30° C. in Luria-Bertani (LB) medium (supplemented with appropriate antibiotics, 100 mg/mL carbenicillin, 34 mg/mL chloramphenicol, 25 mg/L kanamycin or 50 mg/L spectinomycin). For small scale cultures to screen the engineered strains, these preinocula were used to seed fresh 2-mL rich media (5 g/L yeast extract, 10 g/L tryptone, 15 g/L glucose, 10 g/L NaCl, 100 mM HEPES, 3 mL/L Antifoam B, pH 7.6, 100 µg/mL carbenicillin and 34 µg/mL chloramphenicol), at a starting A600 of 0.1. The culture was maintained with appropriate antibiotics and 100 mM IPTG for gene induction at 22° C. for 5 days.

Bioreactor Experiments for the Taxadiene 5α-ol Producing Strain.

[0216] The 3-L Bioflo bioreactor (New Brunswick) was assembled according to the manufacturer's instructions. One liter of rich media with 1% glycerol (v/v) was inoculated with 50 mL of an 8 h culture (A600 of ~2.2) of the strain 26-At24T5αOH-tTCPR grown in LB medium containing the antibiotics (100 mg/mL carbenicillin, 34 mg/mL chloramphenicol) at the same concentrations. The 1-L bioreactor cultures were run as biphasic liquid-liquid fermentations using 20% (v/v) dodecane. Oxygen was supplied as filtered air at 0.5 v/v/m and agitation was adjusted to maintain dissolved oxygen levels above 50%. The pH of the culture was controlled at 7.0 using 10% NaOH. The temperature of the culture in the fermentor was controlled at 30° C. until the cells were grown to an optical density of approximately 0.8, as measured at a wavelength of 600 nm (OD600). The temperature of the fermentor was reduced to 22° C. and the cells were induced with 0.1 mM IPTG. Dodecane was added aseptically to 20% (v/v) of the media volume. During the course of the fermentation, the glycerol concentration and acetate accumulation were monitored at constant time intervals. During the fermentation, as the glycerol concentration was depleted below 0.5 g/L, glycerol (3 g/L) was introduced into the bioreactor.

[0217] The fermentation was further optimized using a fed batch cultivation with a defined feed medium containing 0.5% yeast extract and 20% (v/v) dodecane (13.3 g/L KH2PO4, 4 g/L (NH4)2HPO4, 1.7 g/L citric acid, 0.0084 g/L EDTA, 0.0025 g/L CoCl2, 0.015 g/L MnCl2, 0.0015 g/L CuCl2, 0.003 g/L H3BO3, 0.0025 g/L Na2MoO4, 0.008 g/L Zn(CH3COO)2, 0.06 g/L Fe(III) citrate, 0.0045 g/L thiamine, 1.3 g/L MgSO4, 10 g/L glycerol, 5 g/L yeast extract, pH 7.0).
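As a rough illustration of how the defined feed medium listed above scales with batch volume, the following Python sketch stores the stated concentrations (g/L) and multiplies them by a working volume. The names `FEED_MEDIUM_G_PER_L`, `scale_recipe` and `dodecane_overlay_ml` are hypothetical helpers introduced only for this example, and the salt formulas are written as (NH4)2HPO4, H3BO3, Na2MoO4 and Zn(CH3COO)2 on the assumption that these are the intended readings of the listed components. It is a bookkeeping aid, not part of the disclosed method.

```python
# Concentrations (g/L) of the defined feed medium as listed in the text.
# Formula spellings are assumed readings of the listed salts.
FEED_MEDIUM_G_PER_L = {
    "KH2PO4": 13.3,
    "(NH4)2HPO4": 4.0,
    "citric acid": 1.7,
    "EDTA": 0.0084,
    "CoCl2": 0.0025,
    "MnCl2": 0.015,
    "CuCl2": 0.0015,
    "H3BO3": 0.003,
    "Na2MoO4": 0.0025,
    "Zn(CH3COO)2": 0.008,
    "Fe(III) citrate": 0.06,
    "thiamine": 0.0045,
    "MgSO4": 1.3,
    "glycerol": 10.0,
    "yeast extract": 5.0,
}


def scale_recipe(volume_l, recipe=FEED_MEDIUM_G_PER_L):
    """Return the grams of each component needed for `volume_l` liters."""
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    return {name: conc * volume_l for name, conc in recipe.items()}


def dodecane_overlay_ml(media_volume_ml, fraction=0.20):
    """Volume of dodecane (mL) for an overlay at `fraction` (v/v) of the media volume."""
    return media_volume_ml * fraction


if __name__ == "__main__":
    # Example: amounts for a 1 L batch plus the 20% (v/v) dodecane overlay.
    for name, grams in scale_recipe(1.0).items():
        print(f"{name}: {grams:g} g")
    print(f"dodecane: {dodecane_overlay_ml(1000):g} mL")
```

Keeping the recipe in a single mapping makes it easy to check that the 5 g/L yeast extract entry matches the 0.5% figure quoted for the same medium.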

The same medium composition was used for the fermentation of strains 17 and 26 with appropriate antibiotics (strain 17: 100 µg/mL carbenicillin and 50 µg/mL spectinomycin; strain 26: 50 µg/mL spectinomycin).

[0218] For the taxadien-5α-ol producing strain, one liter of complex medium with 1% glycerol (v/v) was inoculated with 50 mL of an 8 h culture (OD of ~2.2) of strain 26-At24T5αOH-tTCPR grown in LB medium (containing 50 µg/mL spectinomycin and 34 µg/mL chloramphenicol). Oxygen was supplied as filtered air at 0.5 (vvm) and agitation was adjusted to maintain dissolved oxygen levels above 30%. The pH of the culture was controlled at 7.0 using 10% NaOH. The temperature of the culture in the fermentor was controlled at 30° C. until the cells were grown to an optical density of approximately 0.8, as measured at a wavelength of 600 nm (OD600). The temperature of the fermentor was reduced to 22° C. and the pathway was induced with 0.1 mM IPTG. Dodecane was added aseptically to 20% (v/v) of the media volume. During the course of the fermentation, the glycerol concentration and acetate accumulation were monitored at constant time intervals. During the fermentation, as the glycerol concentration was depleted to 0.5-1 g/L, 3 g/L of glycerol was introduced into the bioreactor.

GC-MS Analysis of Taxadiene and Taxadiene-5α-ol

[0219] For analysis of taxadiene accumulation from small scale culture, 1.5 mL of the culture was vortexed with 1 mL hexane for 30 min. The mixture was centrifuged to separate the organic layer. For bioreactor samples, 1 µL of the dodecane layer was diluted to 200 µL using hexane. 1 µL of the hexane layer was analyzed by GC-MS (Varian Saturn 3800 GC attached to a Varian 2000 MS). The sample was injected into a HP5ms column (30 m × 250 µm × 0.25 µm thickness) (Agilent Technologies USA). Helium (ultra purity) at a flow rate of 1.0 ml/min was used as a carrier gas. The oven temperature was first kept constant at 50° C. for 1 min, then increased to 220° C. at an increment of 10° C./min, and finally held at this temperature for 10 min. The injector and transfer line temperatures were set at 200° C. and 250° C., respectively.

[0220] Standard compounds from biological or synthetic sources for taxadiene and taxadiene 5α-ol were not commercially available. Thus we performed fermentations of taxadiene producing E. coli in a 2L bioreactor to extract pure material. Taxadiene was extracted by solvent extraction using hexane, followed by multiple rounds of silica column chromatography to obtain the pure material for constructing a standard curve for GC-MS analysis. We compared the GC and MS profile of the pure taxadiene with the reported literature to confirm the authenticity of the compound. In order to check the purity we performed 1H NMR of taxadiene. Since the accumulation of taxadiene-5α-ol was at a very low level, we used taxadiene as a measure to quantify the production of this molecule, together with authentic mass spectral fragmentation characteristics from previous reports.

qPCR Measurements for Transcriptional Analysis of Engineered Strains

[0221] Transcriptional gene expression levels of each gene were detected by qPCR on mRNA isolated from the appropriate strains. To prevent degradation, RNA was stabilized before cell lysis using RNAprotect bacterial reagent (Qiagen). Subsequently, total RNA was isolated using RNeasy mini kit (Qiagen) combined with nuclease based removal of genomic DNA contaminants. cDNA was amplified using iScript cDNA synthesis kit (Biorad). qPCR was carried out on a Bio-Rad iCycler using the iQ SYBR Green Supermix (Biorad). The level of expression of the rrsA gene, which is not subject to variable expression, was used for normalization of qPCR values. Table 3 has primers used for qPCR. For each primer pair, a standard curve was constructed with mRNA of E. coli as the template.

Example 1

Taxadiene Accumulation Exhibits Strong Non-Linear Dependence on the Relative Strengths of the Upstream MEP and Downstream Synthetic Taxadiene Pathways

[0222] FIG. 1b depicts the various ways by which promoters and gene copy numbers were combined to modulate the relative flux (or strength) through the upstream and downstream pathways of taxadiene synthesis. A total of 16 strains were constructed in order to de-bottleneck the MEP pathway, as well as optimally balance it with the downstream taxadiene pathway. FIG. 2a,b summarize the results of taxadiene accumulation in each of these strains, with FIG. 2a accentuating the dependence of taxadiene accumulation on the upstream pathway for constant values of the downstream pathway, and FIG. 2b the dependence on the downstream pathway for constant upstream pathway strength (see also Table 1 for the calculation of the upstream and downstream pathway expression from the reported promoter strengths and plasmid copy numbers). Clearly, there are maxima exhibited with respect to both upstream and downstream pathway expression. For constant downstream pathway expression (FIG. 2a), as the upstream pathway expression increases from very low levels, taxadiene production is increased initially due to increased supply of precursors to the overall pathway. However, after an intermediate value, further upstream pathway increases cannot be accommodated by the capacity of the downstream pathway. This pathway imbalance leads to the accumulation of an intermediate (see below) that may be either inhibitory to cells or simply indicate flux diversion to a competing pathway, ultimately resulting in reduced taxadiene accumulation.

[0223] For constant upstream pathway expression (FIG. 2b), a maximum is similarly observed with respect to the level of downstream pathway expression. This is attributed to an initial limitation of taxadiene production by low expression levels of the downstream pathway, which is thus rate limiting with respect to taxadiene production. At high levels of downstream pathway expression we are likely seeing the negative effect of high copy number on cell physiology; hence, a maximum exists with respect to downstream pathway expression. These results demonstrate that dramatic changes in taxadiene accumulation can be obtained from changes within a narrow window of expression levels for the upstream and downstream pathways. For example, a strain containing an additional copy of the upstream pathway on its chromosome under Trc promoter control (Strain 8, FIG. 2a) produced 2000 fold more taxadiene than one overexpressing only the synthetic downstream pathway (Strain 1, FIG. 2a). Furthermore, changing the order of the genes in the downstream synthetic operon from GT (GGPPS-TS) to TG (TS-GGPPS) resulted in a 2-3-fold increase (strains 1-4 compared to 5, 8, 11 and 14). The observed results show that the key to taxadiene overproduction is ample downstream pathway capacity and careful balancing between the upstream precursor pathway and the downstream synthetic taxadiene pathway. Altogether, the engineered strains established that the MEP pathway flux can be substantial, if a wide range of expression levels for the endogenous upstream and synthetic downstream pathway are searched simultaneously.

Example 2

Chromosomal Integration and Fine Tuning of the Upstream and Downstream Pathways Further Enhances Taxadiene Production

[0224] To provide ample downstream pathway strength while minimizing the plasmid-borne metabolic burden, two new sets of 4 strains each were engineered (strains 17-20 and 21-24) in which the downstream pathway was placed under the control of a strong promoter (T7) while keeping a relatively low number of 5 and 10 copies, respectively. It can be seen (FIG. 2c) that, while the taxadiene maximum is maintained at high downstream strength (strains 21-24), a monotonic response is obtained at the low downstream pathway strength (strains 17-20, FIG. 2b). This observation prompted the construction of two additional sets of 4 strains each that maintained the same level of downstream pathway strength as before but expressed very low levels of the upstream pathway (strains 25-28 and 29-32, FIG. 2d). Additionally, the operon of the upstream pathway of the latter strain set was chromosomally integrated. It can be seen that not only is the taxadiene maximum recovered, albeit at very low upstream pathway levels, but a much greater taxadiene maximum is attained (300 mg/L). We believe this significant increase can be attributed to a decrease in the cell's metabolic burden. This was achieved by 1) eliminating plasmid dependence through integration of the pathway into the chromosome and 2) attaining a fine balance between the upstream and downstream pathway expression.

[0225] The 32 recombinant constructs allowed us to adequately probe the modular pathway expression space and achieve an approximately 15,000 fold improvement in taxadiene production. This is by far the highest production of terpenoids from the E. coli MEP isoprenoid pathway reported (FIG. 3a). Additionally, the observed fold improvements in terpenoid production are significantly higher than those of reported combinatorial metabolic engineering approaches that searched an extensive genetic space comprising up to a billion combinatorial variants of the isoprenoid pathway. This suggests that pathway optimization depends far more on fine balancing of the expression of pathway modules than on multi-source combinatorial gene optimization. The multiple maxima exhibited in the phenotypic landscape of FIG. 1 underscore the importance of probing the expression space at sufficient resolution to identify the region of optimum overall pathway performance. FIG. 7 depicts the fold improvements in taxadiene production from the modular pathway expression search.

Example 3

Metabolite Inversely Correlates with Taxadiene Production and Identification of Metabolite

[0226] Metabolomic analysis of the previous engineered strains identified an as yet unknown metabolite byproduct that correlated strongly with pathway expression levels and taxadiene production (FIG. 3 and FIG. 8). Although the chemical identity of the metabolite was unknown, we hypothesized that it is an isoprenoid side-product resulting from pathway diversion, and that it is anti-correlated, as a direct variable, with the taxadiene production (FIG. 3 and FIG. 8) from the engineered strains. A critical attribute of our optimal strains is the fine balancing that alleviates the accumulation of this metabolite, resulting in higher taxadiene production. This balancing can be modulated at different levels, from the chromosome or from different copy number plasmids, using different promoters, with significantly different taxadiene accumulation.

[0227] Subsequently the corresponding peak in the gas chromatography-mass spectrometry (GC-MS) chromatogram was identified as indole by GC-MS, 1H and 13C nuclear magnetic resonance (NMR) spectroscopy studies (FIG. 16). We found that taxadiene synthesis by strain 26 is severely inhibited by exogenous indole at indole levels higher than ~100 mg/L (FIG. 15b). Further increasing the indole concentration also inhibited cell growth, with the level of inhibition being very strain dependent (FIG. 15c). Although the biochemical mechanism of indole interaction with the isoprenoid pathway is presently unclear, the results in FIG. 15 suggest a possible synergistic effect between indole and terpenoid compounds of the isoprenoid pathway in inhibiting cell growth. Without knowing the specific mechanism, it appears that strain 26 has mitigated the effect of indole, and we carried this strain forward for further study.

Example 4

Cultivation of Engineered Strains

[0228] In order to explore the taxadiene producing potential under controlled conditions for the engineered strains, fed batch cultivations of the three highest taxadiene accumulating strains (~60 mg/L from strain 22; ~125 mg/L from strain 17; ~300 mg/L from strain 26) were carried out in 1L-bioreactors (FIG. 17). The fed batch cultivation studies were carried out as liquid-liquid two-phase fermentation using a 20% (v/v) dodecane overlay. The organic solvent was introduced to prevent air stripping of secreted taxadiene from the fermentation medium, as indicated by preliminary findings. In defined media with controlled glycerol feeding, taxadiene productivity increased to 174±5 mg/L (SD), 210±7 mg/L (SD), and 1020±80 mg/L (SD), respectively, for strains 22, 17 and 26 (FIG. 17a). Additionally, taxadiene production significantly affected the growth phenotype, acetate accumulation and glycerol consumption (FIG. 17b-17d).

[0229] FIG. 17c shows that acetate accumulates in all strains initially; however, after ~60 hrs acetate decreases in strains 17 and 26 while it continues to increase in strain 22. This phenomenon highlights the differences in central carbon metabolism between high MEP flux strains (26 and 17) and the low MEP flux strain (22). Additionally, this observation is another illustration of the good physiology that characterizes a well-balanced, well-functioning strain. Acetic acid, as a product of overflow metabolism, is initially produced by all strains due to the high initial glycerol concentrations used in these fermentations and the corresponding high glycerol pathway flux. This flux is sufficient for supplying also the MEP pathway, as well as the other metabolic pathways in the cell.

[0230] At ~48 hrs, the initial glycerol is depleted, and the cultivation switches to a fed-batch mode, during which low but constant glycerol levels are maintained. This results in a low overall glycerol flux, which, for strains with high MEP flux (strains 26 and 17), is mostly diverted to the MEP pathway while minimizing overflow metabolism. As a result acetic acid production is reduced or even totally eliminated.
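The titers quoted in Example 4 can be compared directly. The short Python sketch below takes the approximate small-scale values (~60, ~125 and ~300 mg/L) and the bioreactor values (174±5, 210±7 and 1020±80 mg/L) for strains 22, 17 and 26, and computes the fold increase and the relative standard deviation for each strain. The dictionary layout and the function names `fold_increase` and `relative_sd_percent` are illustrative assumptions, and because the small-scale figures are stated only approximately in the text, the resulting ratios are rough.

```python
# Titers (mg/L) quoted in Example 4, keyed by strain number.
SMALL_SCALE = {22: 60.0, 17: 125.0, 26: 300.0}          # approximate values
BIOREACTOR = {22: (174.0, 5.0), 17: (210.0, 7.0), 26: (1020.0, 80.0)}  # (mean, SD)


def fold_increase(strain):
    """Bioreactor mean titer divided by the approximate small-scale titer."""
    mean, _sd = BIOREACTOR[strain]
    return mean / SMALL_SCALE[strain]


def relative_sd_percent(strain):
    """Standard deviation of the bioreactor titer as a percentage of its mean."""
    mean, sd = BIOREACTOR[strain]
    return 100.0 * sd / mean


if __name__ == "__main__":
    for strain in (22, 17, 26):
        print(
            f"strain {strain}: {fold_increase(strain):.2f}-fold, "
            f"relative SD {relative_sd_percent(strain):.1f}%"
        )
```

On these figures strain 26 shows both the highest absolute titer and the largest gain from moving to controlled fed-batch conditions.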

Regarding the decline in acetic acid concentration, it is possible that acetic acid assimilation may have happened to some extent, although this was not further investigated from a flux analysis standpoint. Some evaporation and dilution due to glycerol feed are further contributing to the observed acetic acid concentration decline. In contrast, for strains with low MEP flux (strain 22), flux diversion to the MEP pathway is not very significant, so that glycerol flux still supplies all the necessary carbon and energy requirements. Overflow metabolism continues to occur leading to acetate secretion.

[0231] Clearly the high productivity and more robust growth of strain 26 allowed very high taxadiene accumulation. Further improvements should be possible through optimizing conditions in the bioreactor, balancing nutrients in the growth medium, and optimizing carbon delivery.

Example 5

Upstream and Downstream Pathway Expression Levels and Cell Growth Reveal Underlying Complexity

[0232] For a more detailed understanding of the engineered balance in pathway expression, we quantified the transcriptional gene expression levels of dxs (upstream pathway) and TS (downstream pathway) for the highest taxadiene producing strains and neighboring strains from FIG. 2c and d (strains 17, 22 and 25-32) (FIG. 4a,b). As we hypothesized, expression of the upstream pathway increased monotonically with promoter strength and copy number for the MEP vector from: native promoter, Trc, T5, T7, and 10 copy and 20 copy plasmids, as seen in the DXS expression (FIG. 4a). Thus we found that dxs expression level correlates well with the upstream pathway strength. Similar correlations were found for the other genes of the upstream pathway, idi, ispD and ispF (FIG. 14a, b). In the downstream gene expression, a fold improvement was quantified after transferring the pathway from 5 to 10 copy plasmid (25-28 series and 29-32 series) (FIG. 4b).

[0233] While promoter and copy number effects influenced the gene expressions, secondary effects on the expression of the other pathway were also prominent. FIG. 4a shows that for the same dxs expression cassettes, by increasing the copy number of the TS plasmid from 5 to 10, dxs expression was increased. Interestingly, the 5 copy TS plasmid (strains 25-28 series) contained substantially higher taxadiene yields (FIG. 2d) and less growth (FIG. 4c,d) than the 10 copy TS plasmid. Control plasmids that did not contain the taxadiene heterologous pathway, grew two fold higher densities, implying growth inhibition in the strains 25-28 series is directly related to the taxadiene metabolic pathway and the accumulation of taxadiene and its direct intermediates (FIG. 4c). However the strain 29-32 series only showed modest increases in growth yield when comparing the empty control plasmids to the taxadiene expressing strains (FIG. 4d). This interplay between growth, taxadiene production, and expression level can also be seen with the plasmid-based upstream expression vectors (strain 17 and 22). Growth inhibition was much larger in the 10 copy, high taxadiene producing strain (strain 17) compared to the 20 copy, lower taxadiene producing strain (strain 22) (FIG. 4d). Therefore product toxicity and carbon diversion to the heterologous pathway are likely to impede growth, rather than plasmid-maintenance.

[0234] Also unexpected was the profound effect of the upstream expression vector on downstream expression. FIG. 4b would have two straight lines, if there was no cross talk between the pathways. However, ~3 fold changes in TS expression are observed for different MEP expression vectors. This is likely due to significant competition for resources (raw material and energy) that are withdrawn from the host metabolism for overexpression of both the four upstream and two downstream genes. Compared to the control strain 25c, a 4 fold growth inhibition was observed with strain 25 indicated that high overexpression of synthetic taxadiene pathway induced toxicity altering the growth phenotype compared to the overexpression of native pathway (FIG. 4c). However, as upstream expression increased, downstream expression was reduced, inadvertently in our case, to desirable levels to balance the upstream and downstream pathways, minimizing growth inhibition (strain 26).

[0235] At the extreme of protein overexpression, T7 promoter-driven MEP pathway resulted in severe growth inhibition, due to the synthesis of four proteins at high level (strains 28 and 32). Expression of the TS genes by T7 does not appear to have as drastic effect by itself. The high rates of protein synthesis from the T7 induced expression (FIG. 4ab) could lead to the down regulation of the protein synthesis machinery including components of housekeeping genes from early growth phase impairs the cell growth and lower the increase in biomass. We hypothesized that our observed complex growth phenotypes are cumulative effects of (1) toxicity induced by activation of isoprenoid/taxadiene metabolism, and (2) and the effects of high recombinant protein expression. Altogether our multivariate-modular pathway engineering approach generated unexpected diversity in terpenoid metabolism and its correlation to the pathway expression and cell physiology. Rational design of microbes for secondary metabolite production will require an understanding of pathway expression that goes beyond a linear/independent understanding of promoter strengths and copy numbers. However, simple, multivariate approaches, as employed here, can introduce the necessary diversity to both (1) find high producers, and (2) provide a landscape for the systematic investigation of higher order effects that are dominant, yet underappreciated, in metabolic pathway engineering.

Example 6

Engineering Taxol P450-based Oxidation Chemistry in E. coli

[0236] A central feature in the biosynthesis of Taxol is oxygenation at multiple positions of the taxane core structure, reactions that are considered to be mediated by cytochrome P450-dependent monooxygenases. After completion of the committed cyclization step of the pathway, the parent olefin, taxa-4(5),11(12)-diene, is next hydroxylated at the C5 position by a cytochrome P450 enzyme, representing the first of eight oxygenation steps (of the taxane core) on route to Taxol (FIG. 6). Thus, a key step towards engineering Taxol-producing microbes is the development of P450-based oxidation chemistry in vivo. The first oxygenation step is catalyzed by a cytochrome P450, taxadiene 5α-hydroxylase, an unusual monooxygenase catalyzing the hydroxylation reaction along with double bond migration in the diterpene precursor taxadiene (FIG. 5a). We report the first successful extension of the synthetic pathway from taxadiene to taxadien-5α-ol and present the first examples of in vivo production of any functionalized Taxol intermediates in E. coli.

[0237] In general, functional expression of plant cytochrome P450 is challenging due to the inherent limitations of bacterial platforms, such as the absence of electron transfer machinery, cytochrome P450 reductases, and translational incompatibility of the membrane signal modules of P450 enzymes due to the lack of an endoplasmic reticulum. Recently, through transmembrane (TM) engineering and the generation of chimera enzymes of P450 and CPR reductases, some plant P450's have been expressed in E. coli for the biosynthesis of functional molecules. Still, every plant cytochrome p450 is unique in its transmembrane signal sequence and electron transfer characteristics from its reductase counterpart. Our initial studies were focused on optimizing the expression of codon-optimized synthetic taxadiene 5α-hydroxylase by N-terminal transmembrane engineering and generating chimera enzymes through translational fusion with the CPR redox partner from the Taxus species, Taxus cytochrome P450 reductase (TCPR) (FIG. 5b). One of the chimera enzymes generated, At24T5αOH-tTCPR, was highly efficient in carrying out the first oxidation step with more than 98% taxadiene conversion to taxadien-5α-ol and the byproduct 5(12)-Oxa-3(11)-cyclotaxane (OCT) (FIG. 9a).

[0238] Compared to the other chimeric P450s, At24T5αOH-tTCPR yielded two-fold higher (21 mg/L) production of taxadien-5α-ol. As well, the weaker activity of At8T5αOH-tTCPR and At24T5αOH-tTCPR resulted in accumulation of a recently characterized byproduct, a complex structural rearrangement of taxadiene into the cyclic ether 5(12)-Oxa-3(11)-cyclotaxane (OCT) (FIG. 9). The byproduct accumulated at approximately equal amounts as the desired product taxadien-5α-ol. The OCT formation was mediated by an unprecedented Taxus cytochrome P450 reaction sequence involving oxidation and subsequent cyclizations. Thus, it seems likely that by protein engineering of the taxadiene 5α-hydroxylases, termination of the reaction before cyclization will prevent the accumulation of such undesirable byproduct and channeling the flux to taxadien-5α-ol could be achieved.

[0239] The productivity of strain 26-At24T5αOH-tTCPR was significantly reduced relatively to that of taxadiene production by the parent strain 26 (~300 mg/L) with a concomitant increase in the accumulation of the previously described uncharacterized metabolite. No taxadiene accumulation was observed. Apparently, the introduction of an additional medium copy plasmid (10 copy, p10T7) bearing the At24T5αOH-tTCPR construct disturbed the carefully engineered balance in the upstream and downstream pathway of strain 26. Small scale fermentations were carried out in bioreactors to quantify the alcohol production by strain 26-At24T5αOH-tTCPR. The time course profile of taxadien-5α-ol accumulation (FIG. 5d) indicates alcohol production of up to 58±3 mg/L with an equal amount of the OCT byproduct produced. The observed alcohol production was 2400 fold higher than previous production in S. cerevisiae. Further increases of taxadien-5α-ol production are likely possible through pathway optimization and protein engineering.

[0240] The multivariate-modular approach of pathway optimization has yielded very high producing strains of a critical Taxol precursor. Furthermore, the recombinant constructs have been equally effective in redirecting flux towards the synthesis of other complex pharmaceutical compounds, such as mono-, sesqui- and di-terpene (geraniol, linalool, amorphadiene and levopimaradiene) products engineered from the same pathway (unpublished results). Thus, our pathway engineering opens new avenues to bio-synthesize natural products, especially in the context of microbially-derived terpenoids for use as chemicals and fuels from renewable resources. By focusing on the universal terpenoid precursors IPP and DMAPP it was possible to, first, define the critical pathway modules and then modulate expression such as to optimally balance the pathway modules for seamless precursor conversion and minimal intermediate accumulation. This approach seems to be more effective than combinatorial searches of large genetic spaces and also does not depend on a high throughput screen.

[0241] The MEP-pathway is energetically balanced and thus overall more efficient in converting either glucose or glycerol to isoprenoids. Yet, during the past 10 years, many attempts at engineering the MEP-pathway in E. coli to increase the supply of the key precursors IPP and DMAPP for carotenoid, sesquiterpenoid and diterpenoid overproduction met with limited success. This inefficiency was attributed to unknown regulatory effects associated specifically with the expression of the MEP-pathway in E. coli. Here we provide evidence that such limitations are correlated with the accumulation of the metabolite indole, owing to the non-optimal expression of the pathway, which inhibits the isoprenoid pathway activity. Taxadiene overproduction (under conditions of indole formation suppression), establishes the MEP-pathway as a very efficient route for biosynthesis of pharmaceutical and chemical products of the isoprenoid family. One simply needs to carefully balance the modular pathways as suggested by our multivariate-modular pathway engineering approach.

[0242] For successful microbial production of Taxol, demonstration of the chemical decoration of the taxadiene core by P450 based oxidation chemistry is essential. Cytochrome P450 monooxygenases constitute about one half of the 19 distinct enzymatic steps in the Taxol biosynthetic pathway. Characteristically, these genes show unusual high sequence similarity with each other (>70%) but low similarity (<30%) with other plant P450s. Due to the apparent similarity among Taxol monooxygenases, expressing the proper activity for carrying out the specific P450 oxidation chemistry was a particular challenge. Through TM engineering and construction of an artificial chimera enzyme with redox partner (TCPR), the Taxol cytochrome P450, taxadiene 5α-hydroxylase, was functionally expressed in E. coli and shown to efficiently convert taxadiene to the corresponding alcohol product in vivo. Previous in vitro studies have described the mechanism of converting taxadiene to taxadien-5α-ol by native taxadiene 5α-hydroxylase enzyme, but have not discussed the same conversion in vivo. This oxygenation and rearrangement reaction involves hydrogen abstraction from C20 position of the taxadiene to form an allylic radical intermediate, followed by regio- and stereo-specific oxygen insertion at C5 position to yield the alcohol derivative (FIG. 5a). The observed modest abundance of the enzyme in Taxus cells, and the low kcat values suggested that the 5α-hydroxylation step of Taxol biosynthesis is slow relative to the downstream oxygenations and acylations in the Taxol pathway. Thus, engineering this step is key to Taxol synthesis, especially in the context of functional engineering of Taxol P450's in prokaryotic host such as E. coli. In addition, this step was limiting in previous efforts of constructing the pathway in yeast. The engineered construct in this study demonstrated >98% conversion of taxadiene in vivo with product accumulation to ~60 mg/L, a 2400 fold improvement over previous heterologous expression in yeast. This study has therefore


TABLE 2
Detail of all the plasmids constructed for the study

No  Plasmid               Origin of replication  Antibiotic marker
1   p20T7MEP              pBR322                 Amp
2   p20TrcMEP             pBR322                 Amp
3   p20T5MEP              pBR322                 Amp
4   p20T7MEPKmFRP         pBR322                 Km
5   p20T5MEPKmFRP         pBR322                 Km
6   p20TrcMEPKm-FRP       pBR322                 Km
7   p10TrcMEP             p15A                   Cm
8   p5TrcMEP              SC101                  Spect
9   p20TrcGT              pBR322                 Amp
10  p20TrcTG              pBR322                 Amp
11  p20T5GT               pBR322                 Amp
12  p10T7TG               p15A                   Cm
13  p5T7TG                SC101                  Spect
14  p10At8T5αOH-tTCPR     p15A                   Cm
15  p10At24T5αOH-tTCPR    p15A                   Cm
16  p10At42T5αOH-tTCPR    p15A                   Cm
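The plasmid names in Table 2 appear to encode copy number, promoter and pathway module (for example, p20TrcMEP reads as a 20 copy plasmid with a Trc promoter driving the MEP module). The following is a minimal sketch, assuming that naming convention, which parses a few rows of the table and checks that each origin of replication maps to a single copy number. The regular expression and the helper names `parse_plasmid` and `copies_by_origin` are illustrative assumptions, not part of the disclosure.

```python
import re

# A few rows transcribed from Table 2: (plasmid name, origin of replication, marker).
PLASMIDS = [
    ("p20TrcMEP", "pBR322", "Amp"),
    ("p20T5MEP", "pBR322", "Amp"),
    ("p10TrcMEP", "p15A", "Cm"),
    ("p5TrcMEP", "SC101", "Spect"),
    ("p20TrcGT", "pBR322", "Amp"),
    ("p5T7TG", "SC101", "Spect"),
]

# Assumed naming convention: p<copy number><promoter><module>[suffix].
NAME_PATTERN = re.compile(
    r"^p(?P<copies>\d+)(?P<promoter>T7|T5|Trc)(?P<module>MEP|GT|TG)"
)


def parse_plasmid(name):
    """Split a plasmid name into (copy number, promoter, module)."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"name does not follow the assumed convention: {name}")
    return int(match.group("copies")), match.group("promoter"), match.group("module")


def copies_by_origin(rows):
    """Map each origin of replication to the set of copy numbers found in the names."""
    mapping = {}
    for name, origin, _marker in rows:
        copies, _promoter, _module = parse_plasmid(name)
        mapping.setdefault(origin, set()).add(copies)
    return mapping


if __name__ == "__main__":
    for origin, copies in sorted(copies_by_origin(PLASMIDS).items()):
        print(origin, sorted(copies))
```

For the rows used here, each origin is associated with exactly one copy number, which is consistent with the 5, 10 and 20 copy designations used in the text.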

TABLE 3
Details of the primers used for the cloning of plasmids, chromosomal delivery of the MEP pathway and qPCR measurements.

SEQ ID NO  Primer Name          Sequence
1          dxsNdeI(s)           CGGCATATGAGTTTTGATATTGCCAAATACCCG
2          dxsNheI(a)           CGGCTAGCTTATGCCAGCCAGGCCTTGATTTTG
3          idiNheI(s)           CGCGGCTAGCGAAGGAGATATACATATGCAAACGGAACACGTCATTTTATTG
4          idiEcoRI(a)          CGGAATTCGCTCACAACCCCGGCAAATGTCGG
5          ispDFEcoRI(s)        GCGAATTCGAAGGAGATATACATATGGCAACCACTCATTTGGATGTTTG
6          ispDFXhoI(a)         GCGCTCGAGTCATTTTGTTGCCTTAATGAGTAGCGCC
7          dxsidiispDFNcoI(s)   TAAACCATGGGTTTTGATATTGCCAAATACCCG
8          dxsidiispDFKpnI(a)   CGGGGTACCTCATTTTGTTGCCTTAATGAGTAGCGC
9          dxsidiispDFXhoI(a)   CGGCTCGAGTCATTTTGTTGCCTTAATGAGTAGCGC
10         T5AgeI(s)            CGTAACCGGTGCCTCTGCTAACCATGTTCATGCCTTC
11         T5NheI(a)            CTCCTTCGCTAGCTTATGCCAGCC
52         GGPPSNcoI(s)         CGTACCATGGTTGATTTCAATGAATATATGAAAAGTAAGGC
12         GGPPSEcoRI(a)        CGTAGAATTCACTCACAACTGACGAAACGCAATGTAATC
13         TXSEcoRI(s)          CGTAGAATTCAGAAGGAGATATACATATGGCTAGCTCTACGGGTACG
14         TXSSalI(a)           GATGGTCGACTTAGACCTGGATTGGATCGATGTAAAC
15         TXSNcoI(s)           CGTACCATGGCTAGCTCTACGGGTACG
16         TXSEcoRI(a)          CGTAGAATTCTTAGACCTGGATTGGATCGATGTAAAC
17         GGPPSEcoRI(s)        CGTAGAATTCAGAAGGAGATATACATATGTTTGATTTCAATGAATATATGAAAAGTAAGGC
18         GGPPSSalI(a)         GATGGTCGACTCACAACTGACGAAACGCAATGTAATC
19         TSXhoI(a)            GATGCTCGAGTTAGACCTGGATTGGATCGATGTAAAC
20         pTrcSal(s)           GCCGTCGACCATCATCATCATCATC
21         pTrcXba(a)           GCAGTCTAGAGCCAGAACCGTTATGATGTCGGCGC
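Each primer name in Table 3 carries the name of a restriction enzyme, and the sequence listing states a length for each oligonucleotide. The following is a minimal sketch, not part of the disclosure, that checks a few primers against both: it looks for the standard recognition sequence of the named enzyme and compares the primer length with the value declared under <211>. The selection of primers and the helper names `check_primer` and `check_all` are illustrative assumptions.

```python
# Standard recognition sequences for the enzymes named in the selected primers.
SITES = {
    "NdeI": "CATATG",
    "NheI": "GCTAGC",
    "EcoRI": "GAATTC",
    "XhoI": "CTCGAG",
    "SalI": "GTCGAC",
    "NcoI": "CCATGG",
}

# (SEQ ID NO, enzyme implied by the primer name, sequence from Table 3,
#  length declared in the sequence listing)
PRIMERS = [
    (1, "NdeI", "CGGCATATGAGTTTTGATATTGCCAAATACCCG", 33),
    (2, "NheI", "CGGCTAGCTTATGCCAGCCAGGCCTTGATTTTG", 33),
    (4, "EcoRI", "CGGAATTCGCTCACAACCCCGGCAAATGTCGG", 32),
    (6, "XhoI", "GCGCTCGAGTCATTTTGTTGCCTTAATGAGTAGCGCC", 37),
    (14, "SalI", "GATGGTCGACTTAGACCTGGATTGGATCGATGTAAAC", 37),
    (15, "NcoI", "CGTACCATGGCTAGCTCTACGGGTACG", 27),
]


def check_primer(enzyme, sequence, declared_length):
    """Return (site_offset, length_ok); site_offset is -1 if the site is absent."""
    offset = sequence.upper().find(SITES[enzyme])
    return offset, len(sequence) == declared_length


def check_all(primers):
    """Run check_primer on every entry, keyed by SEQ ID NO."""
    return {
        seq_id: check_primer(enzyme, sequence, length)
        for seq_id, enzyme, sequence, length in primers
    }


if __name__ == "__main__":
    for seq_id, (offset, length_ok) in check_all(PRIMERS).items():
        print(f"SEQ ID NO {seq_id}: site offset {offset}, length matches: {length_ok}")
```

The same check can be extended to the remaining primers by adding their rows and, where needed, further recognition sequences to `SITES`.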


TABLE 4 - continued
Protein and Codon optimized nucleotide sequences.

ATGCAGGCGAATTCTAATACGGTTGAAGGCGCGAGCCAAGGCAAGTCTC
TTCTGGACATTAGTCGCCTCGACCATATCTTCGCCCTGCTGTTGAACGG
GAAAGGCGGAGACCTTGGTGCGATGACCGGGTCGGCCTTAATTCTGACG
GAAAATAGCCAGAACTTGATGATTCTGACCACTGCGCTGGCCGTTCTGG
TCGCTTGCGTTTTTTTTTTCGTTTGGCGCCGTGGTGGAAGTGATACACA
GAAGCCCGCCGTACGTCCCACACCTCTTGTTAAAGAAGAGGACGAAGAA
GAAGAAGATGATAGCGCCAAGAAAAAGGTCACAATATTTTTTGGCACCC
AGACCGGCACCGCCGAAGGTTTCGCAAAGGCCTTAGCTGAGGAAGCAAA
GGCACGTTATGAAAAGGCGGTATTTAAAGTCGTGGATTTGGATAACTAT
GCAGCCGATGACGAACAGTACGAAGAGAAGTTGAAAAAGGAAAAGCTAG
CGTTCTTCATGCTCGCCACCTACGGTGACGGCGAACCGACTGATAATGC
CGCTCGCTTTTATAAATGGTTTCTCGAGGGTAAAGAGCGCGAGCCATGG
TTGTCAGATCTGACTTATGGCGTGTTTGGCTTAGGTAACCGTCAGTATG
AACACTTTAACAAGGTCGCGAAAGCGGTGGACGAAGTGCTCATTGAACA
AGGCGCCAAACGTCTGGTACCGGTAGGGCTTGGTGATGATGATCAGTGC
ATTGAGGACGACTTCACTGCCTGGAGAGAACAAGTGTGGCCTGAGCTGG
ATCAGCTCTTACGTGATGAAGATGACGAGCCGACGTCTGCGACCCCGTA
CACGGCGGCTATTCCAGAATACCGGGTGGAAATCTACGACTCAGTAGTG
TCGGTCTATGAGGAAACCCATGCGCTGAAACAAAATGGACAAGCCGTAT
ACGATATCCACCACCCGTGTCGCAGCAACGTGGCAGTACGTCCTGAGCT
GCATACCCCGCTGTCGGATCGTAGTTGTATTCATCTGGAATTCGATATT
AGTGATACTGGGTTAATCTATGAGACGGGCGACCACGTTGGAGTTCATA
CCGAGAATTCAATTGAAACCGTGGAAGAAGCAGCTAAACTGTTAGGTTA
CCAACTGGATACAATCTTCAGCGTGCATGGGGACAAGGAAGATGGAACA
CCATTGGGCGGGAGTAGCCTGCCACCGCCGTTTCCGGGGCCCTGCACGC
TGCGGACGGCGCTGGCACGTTACGCGGACCTGCTGAACCCTCCGCGCAA
AGCCGCCTTCCTGGCACTGGCCGCACACGCGTCAGATCCGGCTGAAGCT
GAACGCCTTAAATTTCTCAGTTCTCCAGCCGGAAAAGACGAATACTCAC
AGTGGGTCACTGCGTCCCAACGCAGCCTCCTCGAGATTATGGCCGAATT
CCCCAGCGCGAAACCGCCGCTGGGAGTGTTTTTCGCCGCAATAGCGCCG
CGCTTGCAACCTAGGTATTATAGCATCTCCTCCTCCCCGCGTTTCGCGC
CGTCTCGTATCCATGTAACGTGCGCGCTGGTCTATGGTCCTAGCCCTAC
GGGGCGTATTCATAAAGGTGTGTGCAGCAACTGGATGAAGAATTCTTTG
CCCTCCGAAGAAACCCACGATTGCAGCTGGGCACCGGTCTTTGTGCGCC
AGTCAAACTTTAAACTGCCCGCCGATTCGACGACGCCAATCGTGATGGT
TGGACCTGGAACCGGCTTCGCTCCATTTCGCGGCTTCCTTCAGGAACGC
GCAAAACTGCAGGAAGCGGGCGAAAAATTGGGCCCGGCAGTGCTGTTTT
TTGGGTGCCGCAACCGCCAGATGGATTACATCTATGAAGATGAGCTTAA
GGGTTACGTTGAAAAAGGTATTCTGACGAATCTGATCGTTGCATTTTCA
CGAGAAGGCGCCACCAAAGAGTATGTTCAGCACAAGATGTTAGAGAAAG
CCTCCGACACGTGGTCTTTAATCGCCCAGGGTGGTTATCTGTATGTTTG
CGGTGATGCGAAGGGTATGGCCAGAGACGTACATCGCACCCTGCATACA
ATCGTTCAGGAACAAGAATCCGTAGACTCGTCAAAAGCGGAGTTTTTAG
TCAAAAAGCTGCAAATGGATGGACGCTACTTACGGGATATTTGG
(SEQ ID NO: 49)

REFERENCES

[0245] 1. Kingston, D. G. The shape of things to come: structural and synthetic studies of taxol and related compounds. Phytochemistry 68, 1844-54 (2007).

[0246] 2. Wani, M. C., Taylor, H. L., Wall, M. E., Coggon, P. & McPhail, A. T. Plant antitumor agents. VI. The isolation and structure of taxol, a novel antileukemic and antitumor agent from Taxus brevifolia. J Am Chem Soc 93, 2325-7 (1971).

[0247] 3. Suffness M, W. M. Discovery and development of taxol. (ed. (ed), S. M.) (CRC Press, Boca Raton, 1995).

[0248] 4. Nicolaou, K. C. et al. Total synthesis of taxol. Nature 367, 630-4 (1994).

[0249] 5. Holton, R. A. et al. First total synthesis of taxol. 2. Completion of the C and D rings. Journal of the American Chemical Society 116, 1599-1600 (1994).

[0250] 6. Walji, A. M. & MacMillan, D. W. C. Strategies to Bypass the Taxol Problem.

[0251] Enantioselective Cascade Catalysis, a New Approach for the Efficient Construction of Molecular Complexity. Synlett 18, 1477-1489 (2007).

[0252] 7. Holton RAB. R., Boatman PD. Semisynthesis of taxol and taxotere (ed. M. S.) (CRC Press, Boca Raton, 1995).

[0253] 8. Frense, D. Taxanes: perspectives for biotechnological production. Applied microbiology and biotechnology 73, 1233-1240 (2007).

[0254] 9. Roberts, S. C. Production and engineering of terpenoids in plant cell culture. Nature Chemical Biology 3, 387-395 (2007).

[0255] 10. Goodman, J. & Walsh, V. The story of taxol: nature and politics in the pursuit of an anti-cancer drug (Cambridge University Press, Cambridge; New York, 2001).

[0256] 11. Tyo, K. E., Alper, H. S. & Stephanopoulos, G. N. Expanding the metabolic engineering toolbox: more options to engineer cells. Trends Biotechnol 25, 132-7 (2007).

[0257] 12. Ajikumar, P. K. et al. Terpenoids: opportunities for biosynthesis of natural product drugs using engineered microorganisms. Mol Pharm 5, 167-90 (2008).

[0258] 13. Jennewein, S. & Croteau, R. Taxol: biosynthesis, molecular genetics, and biotechnological applications. Appl Microbiol Biotechnol 57, 13-9 (2001).

[0259] 14. Jennewein, S., Wildung, M. R., Chau, M., Walker, K. & Croteau, R. Random sequencing of an induced Taxus cell cDNA library for identification of clones involved in Taxol biosynthesis. Proc Natl Acad Sci USA 101, 9149-54 (2004).

[0260] 15. Croteau, R., Ketchum, R. E. B., Long, R. M., Kaspera, R. & Wildung, M. R. Taxol biosynthesis and molecular genetics. Phytochemistry Reviews 5, 75-97 (2006).

[0261] 16. Walker, K. & Croteau, R. Taxol biosynthetic genes. Phytochemistry 58, 1-7 (2001).

[0262] 17. Dejong, J. M. et al. Genetic engineering of taxol biosynthetic genes in Saccharomyces cerevisiae. Biotechnol Bioeng 93, 212-24 (2006).

[0263] 18. Engels, B., Dahm, P. & Jennewein, S. Metabolic engineering of taxadiene biosynthesis in yeast as a first step towards Taxol (Paclitaxel) production. Metabolic engineering 10, 201-206 (2008).

[0264] 19. Chang, M. C. Y. & Keasling, J. D. Production of isoprenoid pharmaceuticals by engineered microbes. Nature chemical biology 2, 674-681 (2006).

[0265] 20. Khosla, C. & Keasling, J. D. Metabolic engineering for drug discovery and development. Nat Rev Drug Discov 2, 1019-25 (2003).

[0266] 21. Alper, H., Miyaoku, K. & Stephanopoulos, G. Construction of lycopene-overproducing E. coli strains by combining systematic and combinatorial gene knockout targets. Nat Biotechnol 23, 612-6 (2005).

[0267] 22. Chang, M. C., Eachus, R. A., Trieu, W., Ro, D. K. & Keasling, J. D. Engineering Escherichia coli for production of functionalized terpenoids using plant P450s. Nat Chem Biol 3, 274-7 (2007).

[0268] 23. Martin, V. J., Pitera, D. J., Withers, S. T., Newman, J. D. & Keasling, J. D. Engineering a mevalonate pathway in Escherichia coli for production of terpenoids. Nat Biotechnol 21, 796-802 (2003).

[0269] 24. Huang, Q., Roessner, C. A., Croteau, R. & Scott, A. I. Engineering Escherichia coli for the synthesis of taxadiene, a key intermediate in the biosynthesis of taxol. Bioorg Med Chem 9, 2237-42 (2001).

[0270] 25. Kim, S. W. & Keasling, J. D. Metabolic engineering of the nonmevalonate isopentenyl diphosphate synthesis pathway in Escherichia coli enhances lycopene production. Biotechnol Bioeng 72, 408-15 (2001).

[0271] 26. Farmer, W. R. & Liao, J. C. Improving lycopene production in Escherichia coli by engineering metabolic control. Nature Biotechnology 18, 533-537 (2000).

[0272] 27. Farmer, W. R. & Liao, J. C. Precursor balancing for metabolic engineering of lycopene production in Escherichia coli. Biotechnology progress 17 (2001).

[0273] 28. Yuan, L. Z., Rouviere, P. E., Larossa, R. A. & Suh, W. Chromosomal promoter replacement of the isoprenoid pathway for enhancing carotenoid production in E. coli. Metab Eng 8, 79-90 (2006).

[0274] 29. Jin, Y. S. & Stephanopoulos, G. Multi-dimensional gene target search for improving lycopene biosynthesis in Escherichia coli. Metabolic Engineering 9, 337-347 (2007).

[0275] 30. Wang, H. H. et al. Programming cells by multiplex genome engineering and accelerated evolution. Nature (2009).

[0276] 31. Klein-Marcuschamer, D., Ajikumar, P. K. & Stephanopoulos, G. Engineering microbial cell factories for biosynthesis of isoprenoid molecules: beyond lycopene. Trends in Biotechnology 25, 417-424 (2007).

[0277] 32. Sandmann, G. Combinatorial biosynthesis of carotenoids in a heterologous host: a powerful approach for the biosynthesis of novel structures. ChemBioChem 3 (2002).

[0278] 33. Brosius, J., Erfle, M. & Storella, J. Spacing of the -10 and -35 regions in the tac promoter. Effect on its in vivo activity. J Biol Chem 260, 3539-41 (1985).

[0279] 34. Brunner, M. & Bujard, H. Promoter recognition and promoter strength in the Escherichia coli system. Embo J 6, 3139-44 (1987).

[0280] 35. Nishizaki, T., Tsuge, K., Itaya, M., Doi, N. & Yanagawa, H. Metabolic engineering of carotenoid biosynthesis in Escherichia coli by ordered gene assembly in Bacillus subtilis. Appl Environ Microbiol 73, 1355-61 (2007).

[0281] 36. Sorensen, H. P. & Mortensen, K. K. Advanced genetic strategies for recombinant protein expression in Escherichia coli. J Biotechnol 115, 113-28 (2005).

[0282] 37. Jones, K. L., Kim, S. W. & Keasling, J. D. Low-copy plasmids can perform as well as or better than high-copy plasmids for metabolic engineering of bacteria. Metab Eng 2, 328-38 (2000).

[0283] 38. Hoffmann, F. & Rinas, U. Stress induced by recombinant protein production in Escherichia coli. Advances in Biochemical Engineering Biotechnology 89, 73-92 (2005).

[0284] 39. Hoffmann, F., Weber, J. & Rinas, U. Metabolic adaptation of Escherichia coli during temperature-induced recombinant protein production: 1. Readjustment of metabolic enzyme synthesis. Biotechnology and bioengineering 80 (2002).

[0285] 40. Chang, D. E., Smalley, D. J. & Conway, T. Gene expression profiling of Escherichia coli growth transitions: an expanded stringent response model. Molecular microbiology 45, 289-306 (2002).

[0286] 41. Kaspera, R. & Croteau, R. Cytochrome P450 oxygenases of Taxol biosynthesis. Phytochemistry Reviews 5, 433-444 (2006).

[0287] 42. Jennewein, S., Long, R. M., Williams, R. M. & Croteau, R. Cytochrome p450 taxadiene 5alpha-hydroxylase, a mechanistically unusual monooxygenase catalyzing the first oxygenation step of taxol biosynthesis. Chem Biol 11, 379-87 (2004).

[0288] 43. Schuler, M. A. & Werck-Reichhart, D. Functional genomics of P450s. Annual Review of Plant Biology 54, 629-667 (2003).

[0289] 44. Leonard, E. & Koffas, M. A. G. Engineering of artificial plant cytochrome P450 enzymes for synthesis of isoflavones by Escherichia coli. Applied and Environmental Microbiology 73, 7246 (2007).

[0290] 45. Nelson, D. R. Cytochrome P450 and the individuality of species. Archives of Biochemistry and Biophysics 369, 1-10 (1999).

[0291] 46. Jennewein, S. et al. Coexpression in yeast of Taxus cytochrome P450 reductase with cytochrome P450 oxygenases involved in Taxol biosynthesis. Biotechnol Bioeng 89, 588-98 (2005).

[0292] 47. Rontein, D. et al. CYP725A4 from yew catalyzes complex structural rearrangement of taxa-4(5),11(12)-diene into the cyclic ether 5(12)-oxa-3(11)-cyclotaxane. J Biol Chem 283, 6067-75 (2008).

[0293] 48. Shigemori, H. & Kobayashi, J. Biological activity and chemistry of taxoids from the Japanese yew, Taxus cuspidata. J Nat Prod 67, 245-56 (2004).

[0294] 49. Xu, F., Tao, W., Cheng, L. & Guo, L. Strain improvement and optimization of the media of taxol-producing fungus Fusarium maire. Biochemical Engineering Journal 31, 67-73 (2006).

[0295] 50. Hefner, J., Ketchum, R. E. & Croteau, R. Cloning and functional expression of a cDNA encoding geranylgeranyl diphosphate synthase from Taxus canadensis and assessment of the role of this prenyltransferase in cells induced for taxol production. Arch Biochem Biophys 360, 62-74 (1998).

[0296] 51. Wildung, M. R. & Croteau, R. A cDNA clone for taxadiene synthase, the diterpene cyclase that catalyzes the committed step of taxol biosynthesis. J Biol Chem 271, 9201-4 (1996).

[0297] 52. Kodumal, S. J. et al. Total synthesis of long DNA sequences: synthesis of a contiguous 32-kb polyketide synthase gene cluster. Proceedings of the National Academy of Sciences 101, 15573-15578 (2004).

[0298] 53. Tyo, K. E. J., Ajikumar, P. K. & Stephanopoulos, G. Stabilized gene duplication enables long-term selection-free heterologous pathway expression. Nature Biotechnology (2009).

[0299] 54. Datsenko, K. A. & Wanner, B. L. 6640-6645 (National Acad Sciences, 2000).

[0300] 55. Rost, B., Yachdav, G. & Liu, J. The predictprotein server. Nucleic acids research 32, W321 (2004).

[0301] 56. Shalel-Levanon, S., San, K. Y. & Bennett, G. N. Effect of ArcA and FNR on the expression of genes related to the oxygen regulation and the glycolysis pathway in Escherichia coli under microaerobic growth conditions. Biotechnology and bioengineering 92 (2005).

[0302] 57. Heinig, U. & Jennewein, S. Taxol: A complex diterpenoid natural product with an evolutionarily obscure origin. African Journal of Biotechnology 8, 1370-1385 (2009).

[0303] 58. Walji, A. M. & MacMillan, D. W. C. Strategies to Bypass the Taxol Problem. Enantioselective Cascade Catalysis, a New Approach for the Efficient Construction of Molecular Complexity. Synlett 18, 1477-1489 (2007).

[0304] 59. Chau, M., Jennewein, S., Walker, K. & Croteau, R. Taxol biosynthesis: Molecular cloning and characterization of a cytochrome P450 taxoid 7 beta-hydroxylase. Chem Biol 11, 663-72 (2004).

[0305] 60. Williams, D. C. et al. Heterologous expression and characterization of a "Pseudomature" form of taxadiene synthase involved in paclitaxel (Taxol) biosynthesis and evaluation of a potential intermediate and inhibitors of the multistep diterpene cyclization reaction. Arch Biochem Biophys 379, 137-46 (2000).

[0306] 61. Morrone D. et al. Increasing diterpene yield with a modular metabolic engineering system in E. coli: comparison of MEV and MEP isoprenoid precursor pathway engineering. Appl Microbiol Biotechnol. 85(6):1893-906 (2010)

[0307] Having thus described several aspects of at least one embodiment of this invention, it is to be appreciated various alterations, modifications, and improvements will readily occur to those skilled in the art. Such alterations, modifications, and improvements are intended to be part of this disclosure, and are intended to be within the spirit and scope of the invention. Accordingly, the foregoing description and drawings are by way of example only. Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. Such equivalents are intended to be encompassed by the following claims.

[0308] All references disclosed herein are incorporated by reference in their entirety for the specific purpose mentioned herein.

SEQUENCE LISTING

<160> NUMBER OF SEQ ID NOS: 52

<210> SEQ ID NO 1
<211> LENGTH: 33
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide
<400> SEQUENCE: 1
cggcatatga gttttgatat tgccaaatac ccg  33

<210> SEQ ID NO 2
<211> LENGTH: 33
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide
<400> SEQUENCE: 2
cggctagctt atgccagcca ggccttgatt ttg  33

<210> SEQ ID NO 3
<211> LENGTH: 52
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide
<400> SEQUENCE: 3
cgcggctagc gaaggagata tacatatgca aacggaacac gtcattttat tg  52

<210> SEQ ID NO 4
<211> LENGTH: 32
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide
<400> SEQUENCE: 4
cggaattcgc tcacaacccc ggcaaatgtc gg  32

<210> SEQ ID NO 5
<211> LENGTH: 49
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide
<400> SEQUENCE: 5
gcgaattcga aggagatata catatggcaa ccactcattt ggatgtttg  49

<210> SEQ ID NO 6
<211> LENGTH: 37
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide
<400> SEQUENCE: 6
gcgctcgagt cattttgttg ccttaatgag tagcgcc  37

<210> SEQ ID NO 7
<211> LENGTH: 33
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide
<400> SEQUENCE: 7
taaaccatgg gttttgatat tgccaaatac ccg  33

<210> SEQ ID NO 8
<211> LENGTH: 36
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide
<400> SEQUENCE: 8
cggggtacct cattttgttg ccttaatgag tagcgc  36

<210> SEQ ID NO 9
<211> LENGTH: 36
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide
<400> SEQUENCE: 9
cggctcgagt cattttgttg ccttaatgag tagcgc  36

<210> SEQ ID NO 10
<211> LENGTH: 37
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide
<400> SEQUENCE: 10
cgtaaccggt gcctctgcta accatgttca tgccttc  37

<210> SEQ ID NO 11
<211> LENGTH: 24
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide
<400> SEQUENCE: 11
ctccttcgct agcttatgcc agcc  24

< 210 - SEQ ID NO 12 < 211 > LENGTH : 39 < 212 > TYPE : DNA <213: ORGANISM: Artificial Sequence < 22 O > FEATURE : <223 - OTHER INFORMATION: Synthetic Oligonucleotide

< 400 - SEQUENCE: 12 cgtagaatto act cacaact gacgaaacgc aatgtaatc 39

< 210 - SEQ ID NO 13 < 211 > LENGTH : 47 < 212 > TYPE : DNA <213: ORGANISM: Artificial Sequence < 22 O > FEATURE : <223 - OTHER INFORMATION: Synthetic Oligonucleotide

< 400 - SEQUENCE : 13

C9taga atto agaaggagat atacatatgg Ctagotctac gºgtacg 47

<210> SEQ ID NO 14
<211> LENGTH: 37
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide

<400> SEQUENCE: 14
gatggtogac ttagacCtgg attggat.cga totaaac 37

<210> SEQ ID NO 15
<211> LENGTH: 27
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide

<400> SEQUENCE: 15

C9taccatgg Ctagotctac gºgtacg 27

<210> SEQ ID NO 16
<211> LENGTH: 37
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide

<400> SEQUENCE: 16

C9taga atto ttaga CCtgg attggat.cga totaaac 37

<210> SEQ ID NO 17
<211> LENGTH: 61
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide

<400> SEQUENCE: 17

C9taga atto agaaggagat atacatatgt ttgatttcaa toaatatatg aaaagta agg 60

C 61

<210> SEQ ID NO 18
<211> LENGTH: 37
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide

<400> SEQUENCE: 18
gatggtogac to acaactga C9aaacgcaa totaatc 37

<210> SEQ ID NO 19
<211> LENGTH: 37
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide

<400> SEQUENCE: 19
gatgct Coag ttagacCtgg attggat.cga totaaac 37

<210> SEQ ID NO 20
<211> LENGTH: 25
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide

<400> SEQUENCE: 20
gocgtogacc atcatcatca to atc 25

<210> SEQ ID NO 21
<211> LENGTH: 35
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide

<400> SEQUENCE: 21
gCagtotaga gCC agaac Có ttatgatgtc. g.gc.gc 35

<210> SEQ ID NO 22
<211> LENGTH: 35
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide

<400> SEQUENCE: 22
C9tgtc.cgga gCatctaacg Cttgagttaa gCC.gc 35

<210> SEQ ID NO 23
<211> LENGTH: 32
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide

<400> SEQUENCE: 23
gCagtotaga gqaaac Ctgt C9tgc.cagot go 32

<210> SEQ ID NO 24
<211> LENGTH: 91
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide

<400> SEQUENCE: 24
gacgct Coag gag caataac tag cataa CC C Cttgggg.co totaaacggg tottgagggg 60
ttttttgctt gtgtaggctg gagctgottc g 91

<210> SEQ ID NO 25
<211> LENGTH: 34
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide

<400> SEQUENCE: 25
gacgagtact gaacgtogga attgatc.cgt. Cqac 34

<210> SEQ ID NO 26
<211> LENGTH: 91
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide

<400> SEQUENCE: 26
gacggagotc gag caataac tag cataa CC C Cttgggg.co totaaacggg tottgagggg 60
ttttttgctt gtgtaggctg gagctgottc g 91

<210> SEQ ID NO 27
<211> LENGTH: 63
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide

<400> SEQUENCE: 27
atgacgattt ttgata atta toaagtgtgg tttgtcattg cattaattgc gttgc.gcto a 60

Ctg 63

<210> SEQ ID NO 28
<211> LENGTH: 64
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide

<400> SEQUENCE: 28
atgacgattt ttgata atta toaagtgtgg tttgtcattg goat cogctt acaga caagc 60

64

<210> SEQ ID NO 29
<211> LENGTH: 64
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide

<400> SEQUENCE: 29
ttagdgacga aac Cogtaat acact togtt C.Cagc.gcago Coacgtogga attgatc.cgt. 60

64

<210> SEQ ID NO 30
<211> LENGTH: 59
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide

<400> SEQUENCE: 30
cgtacatatg got ctott at tag cagtttt totgg.cgaaa tttaacgaag taaccoagc 59

<210> SEQ ID NO 31
<211> LENGTH: 56
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide

<400> SEQUENCE: 31
cgtacatatg got ctott at tag cagtttt ttttago atc gotttgagtg caattg 56

<210> SEQ ID NO 32
<211> LENGTH: 58
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide

<400> SEQUENCE: 32
cgtacatatg got ctott at tag cagtttt ttt togctog aaacgtdata gtagoct9 58

<210> SEQ ID NO 33
<211> LENGTH: 49
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide

<400> SEQUENCE: 33

CGC gºgatcc ggtgctgo CC ggacgaggga acagtttgat toaaaac Co. 49

<210> SEQ ID NO 34
<211> LENGTH: 34
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide

<400> SEQUENCE: 34

CGC gºgatcC CSC cqtggtg gaagtgatac acag 34

<210> SEQ ID NO 35
<211> LENGTH: 40
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide

<400> SEQUENCE: 35
cgcggtogac ttaccaaata toccgtaagt agdgtocatc 40

<210> SEQ ID NO 36
<211> LENGTH: 20
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide

<400> SEQUENCE: 36
attoaaaagc titc.cggtoct 20

<210> SEQ ID NO 37
<211> LENGTH: 20
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide

<400> SEQUENCE: 37
atctggcgac attcgttt to 20

<210> SEQ ID NO 38
<211> LENGTH: 20
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide

<400> SEQUENCE: 38
gacgaactgt caccogattt 20

<210> SEQ ID NO 39
<211> LENGTH: 20
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide

<400> SEQUENCE: 39
gCtt CºCggg tagtag acag 20

<210> SEQ ID NO 40
<211> LENGTH: 20
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide

<400> SEQUENCE: 40
aggccttcgg gttgtaaagt 20

<210> SEQ ID NO 41
<211> LENGTH: 20
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide

<400> SEQUENCE: 41
atto.cgatta acgcttgcac 20

<210> SEQ ID NO 42
<211> LENGTH: 296
<212> TYPE: PRT
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Polypeptide

<400> SEQUENCE: 42

Met Phe Asp Phe Asn Glu Tyr Met Lys Ser Lys Ala Val Ala Val Asp
1               5                   10                  15

Ala Ala Leu Asp Lys Ala Ile Pro Leu Glu Tyr Pro Glu Lys Ile His
            20                  25                  30

Glu Ser Met Arg Tyr Ser Leu Leu Ala Gly Gly Lys Arg Val Arg Pro
        35                  40                  45

Ala Leu Cys Ile Ala Ala Cys Glu Leu Val Gly Gly Ser Gln Asp Leu
    50                  55                  60

Ala Met Pro Thr Ala Cys Ala Met Glu Met Ile His Thr Met Ser Leu
65                  70                  75                  80

Ile His Asp Asp Leu Pro Cys Met Asp Asn Asp Asp Phe Arg Arg Gly
                85                  90                  95

Lys Pro Thr Asn His Lys Val Phe Gly Glu Asp Thr Ala Val Leu Ala
            100                 105                 110

Gly Asp Ala Leu Leu Ser Phe Ala Phe Glu His Ile Ala Val Ala Thr
        115                 120                 125

Ser Lys Thr Val Pro Ser Asp Arg Thr Leu Arg Val Ile Ser Glu Leu
    130                 135                 140

Gly Lys Thr Ile Gly Ser Gln Gly Leu Val Gly Gly Gln Val Val Asp
145                 150                 155                 160

Ile Thr Ser Glu Gly Asp Ala Asn Val Asp Leu Lys Thr Leu Glu Trp
                165                 170                 175

Ile His Ile His Lys Thr Ala Val Leu Leu Glu Cys Ser Val Val Ser
            180                 185                 190

Gly Gly Ile Leu Gly Gly Ala Thr Glu Asp Glu Ile Ala Arg Ile Arg
        195                 200                 205

Arg Tyr Ala Arg Cys Val Gly Leu Leu Phe Gln Val Val Asp Asp Ile
    210                 215                 220

Leu Asp Val Thr Lys Ser Ser Glu Glu Leu Gly Lys Thr Ala Gly Lys
225                 230                 235                 240

Asp Leu Leu Thr Asp Lys Ala Thr Tyr Pro Lys Leu Met Gly Leu Glu
                245                 250                 255

Lys Ala Lys Glu Phe Ala Ala Glu Leu Ala Thr Arg Ala Lys Glu Glu
            260                 265                 270

Leu Ser Ser Phe Asp Gln Ile Lys Ala Ala Pro Leu Leu Gly Leu Ala
        275                 280                 285

Asp Tyr Ile Ala Phe Arg Gln Asn
    290                 295

<210> SEQ ID NO 43
<211> LENGTH: 888
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Polynucleotide

<400> SEQUENCE: 43

Pro Tyr Asp Leu Pro Phe Ile Lys Tyr Leu Ser Thr Thr Arg Glu Ala
        195                 200                 205

Arg Leu Thr Asp Val Ser Ala Ala Ala Asp Asn Ile Pro Ala Asn Met
    210                 215                 220

Leu Asn Ala Leu Glu Gly Leu Glu Glu Val Ile Asp Trp Asn Lys Ile
225                 230                 235                 240

Met Arg Phe Gln Ser Lys Asp Gly Ser Phe Leu Ser Ser Pro Ala Ser
                245                 250                 255

Thr Ala Cys Val Leu Met Asn Thr Gly Asp Glu Lys Cys Phe Thr Phe
            260                 265                 270

Leu Asn Asn Leu Leu Asp Lys Phe Gly Gly Cys Val Pro Cys Met Tyr
        275                 280                 285

Ser Ile Asp Leu Leu Glu Arg Leu Ser Leu Val Asp Asn Ile Glu His
    290                 295                 300

Leu Gly Ile Gly Arg His Phe Lys Gln Glu Ile Lys Gly Ala Leu Asp
305                 310                 315                 320

Tyr Val Tyr Arg His Trp Ser Glu Arg Gly Ile Gly Trp Gly Arg Asp
                325                 330                 335

Ser Leu Val Pro Asp Leu Asn Thr Thr Ala Leu Gly Leu Arg Thr Leu
            340                 345                 350

Arg Met His Gly Tyr Asn Val Ser Ser Asp Val Leu Asn Asn Phe Lys
        355                 360                 365

Asp Glu Asn Gly Arg Phe Phe Ser Ser Ala Gly Gln Thr His Val Glu
    370                 375                 380

Leu Arg Ser Val Val Asn Leu Phe Arg Ala Ser Asp Leu Ala Phe Pro
385                 390                 395                 400

Asp Glu Arg Ala Met Asp Asp Ala Arg Lys Phe Ala Glu Pro Tyr Leu
                405                 410                 415

Arg Glu Ala Leu Ala Thr Lys Ile Ser Thr Asn Thr Lys Leu Phe Lys
            420                 425                 430

Glu Ile Glu Tyr Val Val Glu Tyr Pro Trp His Met Ser Ile Pro Arg
        435                 440                 445

Leu Glu Ala Arg Ser Tyr Ile Asp Ser Tyr Asp Asp Asn Tyr Val Trp
    450                 455                 460

Gln Arg Lys Thr Leu Tyr Arg Met Pro Ser Leu Ser Asn Ser Lys Cys
465                 470                 475                 480

Leu Glu Leu Ala Lys Leu Asp Phe Asn Ile Val Gln Ser Leu His Gln
                485                 490                 495

Glu Glu Leu Lys Leu Leu Thr Arg Trp Trp Lys Glu Ser Gly Met Ala
            500                 505                 510

Asp Ile Asn Phe Thr Arg His Arg Val Ala Glu Val Tyr Phe Ser Ser
        515                 520                 525

Ala Thr Phe Glu Pro Glu Tyr Ser Ala Thr Arg Ile Ala Phe Thr Lys
    530                 535                 540

Ile Gly Cys Leu Gln Val Leu Phe Asp Asp Met Ala Asp Ile Phe Ala
545                 550                 555                 560

Thr Leu Asp Glu Leu Lys Ser Phe Thr Glu Gly Val Lys Arg Trp Asp
                565                 570                 575

Thr Ser Leu Leu His Glu Ile Pro Glu Cys Met Gln Thr Cys Phe Lys
            580                 585                 590

Val Trp Phe Lys Leu Met Glu Glu Val Asn Asn Asp Val Val Lys Val

Ser Phe Ile Phe Leu Arg Ala Leu Arg Ser Asn Ser Leu Glu Gln Phe
65                  70                  75                  80

Phe Asp Glu Arg Val Lys Lys Phe Gly Leu Val Phe Lys Thr Ser Leu
                85                  90                  95

Ile Gly His Pro Thr Val Val Leu Cys Gly Pro Ala Gly Asn Arg Leu
            100                 105                 110

Ile Leu Ser Asn Glu Glu Lys Leu Val Gln Met Ser Trp Pro Ala Gln
        115                 120                 125

Phe Met Lys Leu Met Gly Glu Asn Ser Val Ala Thr Arg Arg Gly Glu
    130                 135                 140

Asp His Ile Val Met Arg Ser Ala Leu Ala Gly Phe Phe Gly Pro Gly
145                 150                 155                 160

Ala Leu Gln Ser Tyr Ile Gly Lys Met Asn Thr Glu Ile Gln Ser His
                165                 170                 175

Ile Asn Glu Lys Trp Lys Gly Lys Asp Glu Val Asn Val Leu Pro Leu
            180                 185                 190

Val Arg Glu Leu Val Phe Asn Ile Ser Ala Ile Leu Phe Phe Asn Ile
        195                 200                 205

Tyr Asp Lys Gln Glu Gln Asp Arg Leu His Lys Leu Leu Glu Thr Ile
    210                 215                 220

Leu Val Gly Ser Phe Ala Leu Pro Ile Asp Leu Pro Gly Phe Gly Phe
225                 230                 235                 240

His Arg Ala Leu Gln Gly Arg Ala Lys Leu Asn Lys Ile Met Leu Ser
                245                 250                 255

Leu Ile Lys Lys Arg Lys Glu Asp Leu Gln Ser Gly Ser Ala Thr Ala
            260                 265                 270

Thr Gln Asp Leu Leu Ser Val Leu Leu Thr Phe Arg Asp Asp Lys Gly
        275                 280                 285

Thr Pro Leu Thr Asn Asp Glu Ile Leu Asp Asn Phe Ser Ser Leu Leu
    290                 295                 300

His Ala Ser Tyr Asp Thr Thr Thr Ser Pro Met Ala Leu Ile Phe Lys
305                 310                 315                 320

Leu Leu Ser Ser Asn Pro Glu Cys Tyr Gln Lys Val Val Gln Glu Gln
                325                 330                 335

Leu Glu Ile Leu Ser Asn Lys Glu Glu Gly Glu Glu Ile Thr Trp Lys
            340                 345                 350

Asp Leu Lys Ala Met Lys Tyr Thr Trp Gln Val Ala Gln Glu Thr Leu
        355                 360                 365

Arg Met Phe Pro Pro Val Phe Gly Thr Phe Arg Lys Ala Ile Thr Asp
    370                 375                 380

Ile Gln Tyr Asp Gly Tyr Thr Ile Pro Lys Gly Trp Lys Leu Leu Trp
385                 390                 395                 400

Thr Thr Tyr Ser Thr His Pro Lys Asp Leu Tyr Phe Asn Glu Pro Glu
                405                 410                 415

Lys Phe Met Pro Ser Arg Phe Asp Gln Glu Gly Lys His Val Ala Pro
            420                 425                 430

Tyr Thr Phe Leu Pro Phe Gly Gly Gly Gln Arg Ser Cys Val Gly Trp
        435                 440                 445

Glu Phe Ser Lys Met Glu Ile Leu Leu Phe Val His His Phe Val Lys
    450                 455                 460

<400> SEQUENCE: 48

Met Gln Ala Asn Ser Asn Thr Val Glu Gly Ala Ser Gln Gly Lys Ser
1               5                   10                  15

Leu Leu Asp Ile Ser Arg Leu Asp His Ile Phe Ala Leu Leu Leu Asn
            20                  25                  30

Gly Lys Gly Gly Asp Leu Gly Ala Met Thr Gly Ser Ala Leu Ile Leu
        35                  40                  45

Thr Glu Asn Ser Gln Asn Leu Met Ile Leu Thr Thr Ala Leu Ala Val
    50                  55                  60

Leu Val Ala Cys Val Phe Phe Phe Val Trp Arg Arg Gly Gly Ser Asp
65                  70                  75                  80

Thr Gln Lys Pro Ala Val Arg Pro Thr Pro Leu Val Lys Glu Glu Asp
                85                  90                  95

Glu Glu Glu Glu Asp Asp Ser Ala Lys Lys Lys Val Thr Ile Phe Phe
            100                 105                 110

Gly Thr Gln Thr Gly Thr Ala Glu Gly Phe Ala Lys Ala Leu Ala Glu
        115                 120                 125

Glu Ala Lys Ala Arg Tyr Glu Lys Ala Val Phe Lys Val Val Asp Leu
    130                 135                 140

Asp Asn Tyr Ala Ala Asp Asp Glu Gln Tyr Glu Glu Lys Leu Lys Lys
145                 150                 155                 160

Glu Lys Leu Ala Phe Phe Met Leu Ala Thr Tyr Gly Asp Gly Glu Pro
                165                 170                 175

Thr Asp Asn Ala Ala Arg Phe Tyr Lys Trp Phe Leu Glu Gly Lys Glu
            180                 185                 190

Arg Glu Pro Trp Leu Ser Asp Leu Thr Tyr Gly Val Phe Gly Leu Gly
        195                 200                 205

Asn Arg Gln Tyr Glu His Phe Asn Lys Val Ala Lys Ala Val Asp Glu
    210                 215                 220

Val Leu Ile Glu Gln Gly Ala Lys Arg Leu Val Pro Val Gly Leu Gly
225                 230                 235                 240

Asp Asp Asp Gln Cys Ile Glu Asp Asp Phe Thr Ala Trp Arg Glu Gln
                245                 250                 255

Val Trp Pro Glu Leu Asp Gln Leu Leu Arg Asp Glu Asp Asp Glu Pro
            260                 265                 270

Thr Ser Ala Thr Pro Tyr Thr Ala Ala Ile Pro Glu Tyr Arg Val Glu
        275                 280                 285

Ile Tyr Asp Ser Val Val Ser Val Tyr Glu Glu Thr His Ala Leu Lys
    290                 295                 300

Gln Asn Gly Gln Ala Val Tyr Asp Ile His His Pro Cys Arg Ser Asn
305                 310                 315                 320

Val Ala Val Arg Arg Glu Leu His Thr Pro Leu Ser Asp Arg Ser Cys
                325                 330                 335

Ile His Leu Glu Phe Asp Ile Ser Asp Thr Gly Leu Ile Tyr Glu Thr
            340                 345                 350

Gly Asp His Val Gly Val His Thr Glu Asn Ser Ile Glu Thr Val Glu
        355                 360                 365

Glu Ala Ala Lys Leu Leu Gly Tyr Gln Leu Asp Thr Ile Phe Ser Val
    370                 375                 380

His Gly Asp Lys Glu Asp Gly Thr Pro Leu Gly Gly Ser Ser Leu Pro
385                 390                 395                 400

Pro Pro Phe Pro Gly Pro Cys Thr Leu Arg Thr Ala Leu Ala Arg Tyr
                405                 410                 415

Ala Asp Leu Leu Asn Pro Pro Arg Lys Ala Ala Phe Leu Ala Leu Ala
            420                 425                 430

Ala His Ala Ser Asp Pro Ala Glu Ala Glu Arg Leu Lys Phe Leu Ser
        435                 440                 445

Ser Pro Ala Gly Lys Asp Glu Tyr Ser Gln Trp Val Thr Ala Ser Gln
    450                 455                 460

Arg Ser Leu Leu Glu Ile Met Ala Glu Phe Pro Ser Ala Lys Pro Pro
465                 470                 475                 480

Leu Gly Val Phe Phe Ala Ala Ile Ala Pro Arg Leu Gln Pro Arg Tyr
                485                 490                 495

Tyr Ser Ile Ser Ser Ser Pro Arg Phe Ala Pro Ser Arg Ile His Val
            500                 505                 510

Thr Cys Ala Leu Val Tyr Gly Pro Ser Pro Thr Gly Arg Ile His Lys
        515                 520                 525

Gly Val Cys Ser Asn Trp Met Lys Asn Ser Leu Pro Ser Glu Glu Thr
    530                 535                 540

His Asp Cys Ser Trp Ala Pro Val Phe Val Arg Gln Ser Asn Phe Lys
545                 550                 555                 560

Leu Pro Ala Asp Ser Thr Thr Pro Ile Val Met Val Gly Pro Gly Thr
                565                 570                 575

Gly Phe Ala Pro Phe Arg Gly Phe Leu Gln Glu Arg Ala Lys Leu Gln
            580                 585                 590

Glu Ala Gly Glu Lys Leu Gly Pro Ala Val Leu Phe Phe Gly Cys Arg
        595                 600                 605

Asn Arg Gln Met Asp Tyr Ile Tyr Glu Asp Glu Leu Lys Gly Tyr Val
    610                 615                 620

Glu Lys Gly Ile Leu Thr Asn Leu Ile Val Ala Phe Ser Arg Glu Gly
625                 630                 635                 640

Ala Thr Lys Glu Tyr Val Gln His Lys Met Leu Glu Lys Ala Ser Asp
                645                 650                 655

Thr Trp Ser Leu Ile Ala Gln Gly Gly Tyr Leu Tyr Val Cys Gly Asp
            660                 665                 670

Ala Lys Gly Met Ala Arg Asp Val His Arg Thr Leu His Thr Ile Val
        675                 680                 685

Gln Glu Gln Glu Ser Val Asp Ser Ser Lys Ala Glu Phe Leu Val Lys
    690                 695                 700

Lys Leu Gln Met Asp Gly Arg Tyr Leu Arg Asp Ile Trp
705                 710                 715

<210> SEQ ID NO 49
<211> LENGTH: 2151
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Polynucleotide

<400> SEQUENCE: 49
atgcaggcga attctaatac gottgaaggc gogagcCaag gCaagtotct totggacatt 60
agtogc Ctcq accatatott CºCCCtgctg ttgaacggga aaggcggaga CCttggtgcg 120

<400> SEQUENCE: 50

Gly Ser Thr Gly Ser
1               5

<210> SEQ ID NO 51
<211> LENGTH: 8
<212> TYPE: PRT
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Polypeptide

<400> SEQUENCE: 51

Met Ala Leu Leu Leu Ala Val Phe
1               5

<210> SEQ ID NO 52
<211> LENGTH: 41
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Oligonucleotide

<400> SEQUENCE: 52

C9taccatgg ttgatttgaa to?aatatatg aaaagta agg C 41

What is claimed is:

1. A method for making a product containing a terpenoid or terpenoid derivative, the method comprising:
increasing terpenoid production in a cell that produces one or more terpenoids by controlling the accumulation of indole in the cell or in a culture of the cells,
incorporating the terpenoid, or a derivative of the terpenoid prepared through one or more chemical or enzymatic steps, into a product to thereby make the product containing a terpenoid or terpenoid derivative.

2. The method of claim 1, wherein the cell is a bacterial cell.

3. The method of claim 2, wherein the cell is E. coli or B. subtilis.

4. The method of claim 1, wherein the step of controlling the accumulation of indole in the cell or in a culture of the cells comprises balancing an upstream non-mevalonate isoprenoid pathway with a downstream heterologous terpenoid synthesis pathway.

5. The method of claim 4, wherein the upstream non-mevalonate isoprenoid pathway is balanced with respect to the downstream heterologous terpenoid synthesis pathway by one or more of:
increasing gene copy number for one or more upstream or downstream pathway enzymes,
increasing or decreasing the expression level of the upstream and downstream pathway genes, as individual genes or as operons, using promoters with different strengths,
increasing or decreasing the expression level of the upstream and downstream pathways, as individual genes or as operons, using modifications to ribosomal binding sites,
replacing native genes in the pathway with heterologous genes coding for homologous enzymes,
codon-optimization of one or more heterologous enzymes in the upstream or downstream pathway,
amino acid mutations in one or more genes of the downstream and/or upstream pathway, or
modifying the order of upstream and downstream pathway genes in a heterologous operon.

6. The method of claim 5, wherein the cell comprises at least one additional copy of at least one of dxs, idi, ispD, and ispF.

7. The method of claim 6, wherein the cell comprises a heterologous dxs-idi-ispDF operon.

8. The method of claim 4, wherein indole accumulates in the culture at less than 100 mg/L.

9-11. (canceled)

12. The method of claim 1, wherein the step of controlling the accumulation of indole in the cell or in a culture of the cells comprises modifying or regulating the indole pathway.

13. The method of claim 1, wherein the step of controlling the accumulation of indole in the cell or in a culture of the cells comprises or further comprises removing the accumulated indole from the cell culture through chemical methods.

14. The method of claim 13, wherein indole is removed from the culture at least in part using one or more absorbents or scavengers.

15. The method of claim 1, comprising measuring the amount of indole in the culture, or continuously or intermittently monitoring the amount of indole in the culture.

16. The method of claim 1, wherein the one or more terpenoids is a hemiterpenoid, monoterpenoid, a sesquiterpenoid, a diterpenoid, a triterpenoid or a tetraterpenoid.

17. The method of claim 1, wherein the product is a food product, food additive, beverage, chewing gum, candy, or oral care product, and wherein the terpenoid or derivative is a flavor enhancer or sweetener.

18-51. (canceled)

52. The method of claim 1, wherein the product is a fragrance product, a cosmetic, a cleaning product, or a soap, and wherein the terpenoid or derivative is a fragrance.

53-109. (canceled)

110. The method of claim 1, wherein the product is a vitamin or nutritional supplement, and wherein the terpenoid or derivative is Zeaxanthin; β-carotene; lycopene; Astaxanthin; CoQ10; Vitamin K; Vitamin E; Labdane; Gibberellins; α-carotene; or a derivative thereof.

111-116. (canceled)

117. The method of claim 1, wherein the product is solvent or cleaning product, and wherein the terpenoid or derivative is alpha-Pinene; beta-Pinene; Limonene; myrcene; Linalool; Geraniol; Cineole; or a derivative thereof.

118-124. (canceled)

125. The method of claim 1, wherein the product is a pharmaceutical, and wherein the terpenoid or derivative is an active pharmaceutical ingredient.

126-149. (canceled)

150. The method of claim 1, wherein the product is a food preservative, and wherein the terpenoid or derivative is Carvacrol; Carvone; or a derivative thereof.

151. (canceled)

152. The method of claim 1, wherein the product is a lubricant or surfactant, and the terpenoid or derivative is one or more of alpha-Pinene; beta-Pinene; Limonene; myrcene; farnesene; Squalene; or a derivative thereof.

153-158. (canceled)

159. The method of claim 1, wherein the terpenoid or derivative is polymerized, and the terpenoid or derivative is alpha-Pinene; beta-Pinene; Limonene; myrcene; farnesene; Squalene; isoprene; or a derivative thereof.

160-167. (canceled)

168. The method of claim 1, wherein the product is an insecticide, pesticide or pest control agent, and wherein the terpenoid or derivative is an active ingredient.

169-185. (canceled)

186. The method of claim 1, wherein the product is a cosmetic or personal care product, and wherein the terpenoid or derivative is not a fragrance.

187-207. (canceled)

* * * * *