US 2015 0094.483A1 (19) United States (12) Patent Application Publication (10) Pub. No.: US 2015/0094483 A1 Ness et al. (43) Pub. Date: Apr. 2, 2015

(54) BIOTRANSFORMATION USING Publication Classification GENETICALLY MODIFIED CANDDA (51) Int. Cl. (71) Applicants: Jon E. Ness, Redwood City, CA (US); CI2P 7/64 (2006.01) Jeremy Minshull, Palo Alto, CA (US) CIIC3/00 (2006.01) CI2N 5/8 (2006.01) (72) Inventors: Jon E. Ness, Redwood City, CA (US); (52) U.S. Cl. Jeremy Minshull, Palo Alto, CA (US) CPC ...... CI2P 7/6409 (2013.01); CI2N 15/815 (2013.01): CI IC3/006 (2013.01) (21) Appl. No.: 14/084.230 USPC ...... 554/219; 435/254.22:554/213 (22) Filed: Nov. 19, 2013 Related U.S. Application Data (57) ABSTRACT (60) Division of application No. 12,775,306, filed O May A substantially pure host cell is provided for the 6, 2010, now Pat. No. 8,597,923, which is a continu biotransformation of a substrate to a product wherein the host ation-in-part of application No. 12/436,729, filed on cell is characterized by a first genetic modification class that May 6, 2009, now Pat. No. 8,158,391. comprises one or more genetic modifications that collectively (60) Provisional application No. 61/176,064, filed on May or individually disrupt at least one alcohol dehydrogenase 6, 2009. gene in the Substantially pure Candida host cell. Patent Application Publication Apr. 2, 2015 Sheet 1 of 28 US 201S/0094.483 A1

Fatty Acid rt-NS Fatty Acid Fatty Acid CYP52A Type P450 s COAOxidase (HO-Fatty Acid CB-unsaturated fatty acyl CoA Fatty Alcohol Oxidase CYP52A Type P450 Hydratase OCH-Fatty Acid 8-hydroxy fatty-acyl CoA Dehydrogenase CYP32A Type P450 Dehydrogenase HOOC-Fatty Acid BOXO fatty-acyl CoA

Thiolase Fatty-acyl CoA+ Acetyl CoA Patent Application Publication Apr. 2, 2015 Sheet 2 of 28 US 201S/0094.483 A1

Fatty Acid

rt- * B-Oxidationis h Fatty Acid Fatty Acid CYP52AType P450 X ApOX4 ApOX3 (strain DP1) y (HO-Fatty Acid (B-unsaturated fatty acyl CoA Fatty Alcohol Oxidase CYP52AType P450 Hydratase y OCH-Fatty Acid B-hydroxy fatty-acyl CoA Dehydrogenase CYP52A Type P450 Dehydrogenase y HOOC-Fatty Acid 8-OXO fatty-acyl CoA

Thiolase w Fatty-acyl COA+ Acetyl COA

Figure2 Patent Application Publication Apr. 2, 2015 Sheet 3 of 28 US 201S/0094.483 A1

albicans DH 1A MQSLFRIFRGSLTTTTAASFTATAT, GATTAKTLSGSTWLRKSYKRTYSSSWLSSPE allian8 ADH E MERFFRIFEGGSLTTTTSFTTTTTTLSSTWLRESERTESSSWLSSPE tropicalis (E A4 albicans DH 24. albicans DH 2E tropicalis ADE B4 trialis DE 4.1 tropicalis ADE B11

albicans DH 1A LFFFHFINNERYCHTTTTTNTRTIMSEQIPRTRANWFDTNGGOLWYKDYFWPTFEERIE albicans DH 1B LFFFHOFMNNERYCHTTTTTTETIMSEQIPETEAWWFDTNGGOLWYEDYFWFTFEERIE tropicalis ADE 4 ELEEDIPWPTPERIE albicans ADH 2. MSWFTTE WIFETIGGELEYEDIFWFERENE albicans ADH 2B MSWFTTOKAWIFETIGGELEYEDIFWFEFENE tropicalis ADE B4 ELEEDIPWPEPEPE

... tropicalis ADH 10 ...... ELEEDWPWPWPEPE tropicalis ADE B11 FLYTDIPWPWPKPNE

albicans DH 1A LLIHESGCHTILHWEGDWPLTELPLGGHEGGGMGENEGWEIGDFGIEW albicans DE 1E LLINESGHTILHWEGITTPLTELPLWGGHEGGGGENEGWEIGIFIEW tropicalis ADF ki LLINKSGHTILHWKGIWFLTELPLWGGHEGGGGEWKGWEIGIFIEW albicans ADH 24. LLINESGHTILHWEGIWFLTELFLWGGHEG,GWWLGEEGWEGIWW albicans DH 2B LLIESGCHTILHEGIFLTELFLGGHEGGFLGEEGEGDGE ... tropicalis ADH B4 LLINESGHTILHTTEGITTPLITELPLGGHEGGIGINEGINEGILGET tropicalis. DE 10 LLWREYSGCHSDLHTTEGDTTPIPELPLGGHEGGGMGDWKGWEGILGIE tropicalis ADE B11 LLWHEYSGWCHS IIHF WKGIWFP:SELFTWGGHEGGSWWIGEMWGWENGILGIEM * * : * * * * * * * * : * : * * * * * *

albicans DH 1A LNGSCMSCEFCOGAEFNCGE, DLSGTHDGSFETDAW, EIFGTDLAITWPIL albicans DH 1E LNGSCMSCEFCOGAEFMCGEADLSGTHDGSFESTADAWKIPAGTDLNWAFIL tropicalis ADE 4 LNGSCMSCEFCOGEFNCGEDLSGTHIGSFETDRIPTILEWPIL albicans ADH 24. LNGSCLNCEC3GAEFMCAEADLSGTHDGSFOQYATADAWRIFAGTDLANWAFIL albicans DH 2B LNGSCLNCEC3GEFNCEDLSGTHIGSFOOTDAWRIFTDLNWFIL ... tropicalis ADE B4 LNGSCMRCECGEPNCPDLSGTHDGSFTDAWRIPAGTDLNAPIL tropicalis ADE 10 LNGSCMRICEFCOGAEFNCSRADMSGTHDGTFY TADAWKIPEGDMASIAFIL tropicalis ADE Bll LNGSCMNCEYCOGAEPNCPHDWSGSHDGTFYIT.I.W.EFPGSDLSIPIS * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * : * : * : *, *, * * *

alias DH 1. CGTELET DL, GTISGGGGLGSLFORMGLRIDGGIEEGEFES .albicans DH1B C.G.T.YKLET DLRGOT...ISGGGGLGSL SERMGLRWIDGGDEKGEF.ES tropicalis ADE 4 CGTELETILGWISGGGGLGSLWMGLRFIDGGDEEGFES albicans ADH2A C.GWTWYELETELEGOTWISGAGGLGSLWYK.MGYRLAIDGGEDKGEFE albicans ADH 2B CGTELETELEGISGGGLGSLOYEMGYRFIDGGEDEGEFE Figure 3A Patent Application Publication Apr. 2, 2015 Sheet 4 of 28 US 201S/0094.483 A1

C. tropicalis ADH E4 CGSTELET DLRGINAISGGGLSLINEAMGRWAIDGGDEGEFE C. tropicalis DH 10 CAGNTELEMADLLGAISG. GGGLSLGWEAMGRSLIDGGDERGEFE C. tropicalis DH B11 CGSTELET GLOFGTWISGGGLSLWEMGLEWIDGGDERGWFE if f * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * C. albicans ADH 1A LG.E.YWDFTEDEDIYEAWEKATIGGRHGAINSYSERIDSWEEWRFLGEWWLWGLFA C. allicans ADS 1B LG.E.YWDFTEDEDIYEAWEETIGGPEGAINSYSEEIDSWEEWRFLGEWWLWGLF). C. tropicalis ADH 4 LG.E.YIDFLEEEDISEE TIGGPHIRESSEKIISERFLELGLF C. albicans ADH2A LGETFIDFTREKDSJEANEKATNGGPHGWINSWSERIGSTEYWRTLGEWWLWGLFA C. alaicans ADH2E LGETFIDFTEEEDWWEAWEETINGGPHGNINSWSERIGSTEWRTLGEWWLWGLF). C. tropicalis DH B4 LGEWFWDFLEERDIYGAWEKATIGGPHGAWNWSISERINSYTYWRTLGRWWLWGLFA C. tropicalis DH 10 LGESYIDFLEEDIYSAIREATGGGPHGWITFSYSEEINSSEYWRTLGEWWLWSLF). C. tropicalis ADHB11 LGEWFWDFTEENYSEIIETIGGHGSINESISEEINSWEWRTLGTWWLWGLF kit. ::fff; ; ; ; kit. . . . .; if f : * : * * * * : h it...... if C. albicans ADH 1A HAEWTFWFTWWESIEIEGSYWGNREDTAETIFFSRGLIECPIKIWGLSDLPEWFELM C. allicans ADH 1E HETFFIESIEIEGSY GNREDTEIDFFSRGLIECFIEIWGLSDLFENFELM C. tropicalis ADH i4 GSEWTGWFEWWESIEIEGSYWGNREDTAEWDFFSRGLIECPIEIWGLSELPOSFELM C. albicans ADH2A GAEISTFWFDAWIRTIQIEGSYWGNREDTAEWIFFTRGLIRCPIRIWGLSELPENYELM C. allicans ADH2E GAEISTFWFDANIETIIEGSYWGNREDTAEWDFFTRGLIECFIEIWGLSELFEWYELM C. tropicalis ADH E4 GSENSAFWFISWESIQIEGSYWGNREDTAEWIFFSRGLIECFIEWGLSELFENYELM C. tropicalis DH 10 GGELTPLFESNARSIOIRTTCWGNREDTTEIDFFWRGLIDCPIEWAGLSEWPEIFDLM C. tropicalis DH B11 GELEFIFNNESIOIEGSYWGNRRDTESDFFRGLECFIEWGLSELFEIFELL f; ; ; ; ; ; ; ; ; ; ; ; * * * * * * * * * * * * * * * * * * * *; ; ; ; ; ; ;

Calbicans ADH.4. EEGKILSRYWLDTS- SEID MO;172 C. albicans ADHB EEGEILGRYWLDTSE SEID MO; 173 C. tropicalis ADH4 ------SEC ID NO;155 C. albicans ADH2A EEGKILGRYWLDMDE SEC ID MO:174 C. albicans ADH 2B EEGEILGRYWLDRIDE SEID NO:15 C. tropicalis ADH B4 ------SEC ID MO: 154 C. tropicalis ADH.10 ------SEC ID MO:152 C. tropicalis ADE B11 ------SE II) NO; 151 Figure3B Patent Application Publication Apr. 2, 2015 Sheet 5 of 28 US 201S/0094.483 A1

ReCOmbinase ReCOmbinase Site Site Targeting Recombinase Targeting Sequence enCOcing gene Selective marker Sequence

W '''''''''^|| || || || W

Restriction site to release targeting Restriction site to release targeting Construct from sequences required for Construct from sequences required for bacterial propagation bacterial propagation

Figure 4 Patent Application Publication Apr. 2, 2015 Sheet 6 of 28 US 201S/0094.483 A1

RecombinaseSite Targeting Construct RecombinaseSite Targeting Recombinase Targeting Sequence 1 enCOding gene Selective marker SeQuence 2

e A Z. B Target Target Sequence Sequence 2 GenOmic DNA DNA region to be replaced Homologous recombination

Integrant

: W:8 ReCOmbinase %: NE GenOmic DNA enCOdinCggene Cene Selective marker ReCOmbinase ReCOmbinase site site EXCised DNA W KN D Figure 5 Patent Application Publication Apr. 2, 2015 Sheet 7 of 28 US 201S/0094.483 A1

ReCOmbinase ReCOmbinase Site site ReCOmbinase encoding gene Selective marker : A GenOmic DNA Induction of reCOmbinase

ReCOmbinase

WXNE B : GenOmic DNA ReCOmbinase Recombinase site encoding gene Selective marker Illllllllllllllllllllll ...... C

Figure 6 Patent Application Publication Apr. 2, 2015 Sheet 8 of 28 US 201S/0094.483 A1

Insertion Sequences ReCOmbinase ReCOmbinase Site Site Targeting Recombinase Targeting Sequence enCOding gene Selective marker Sequence WI W

Restriction site to release targeting Restriction site to release targeting Construct from sequences required for Construct from sequences required for bacterial propagation bacterial propagation

Figure 7 Patent Application Publication Apr. 2, 2015 Sheet 9 of 28 US 201S/0094.483 A1

Insertion ReCOmbinase ReCOmbinase Sequences Targeting Construct Site Site Targeting Recombinase Targeting Sequence? encoding gene Selective marker W|8E: A

Z NE B Target Target Sequence Sequence 2 Genomic DNA DNA region to be replaced Homologous recombination

Insertion Sequence Integrant

EWI:E2%8. ReCOmbinase enCOcinq Cene Selective marker GenOmic DNA 99 ReCOmbinase ReCOmbinase Site Site Excised DNA Z: N D Patent Application Publication Apr. 2, 2015 Sheet 10 of 28 US 201S/0094.483 A1

ReCOmbinase ReCOmbinase Site site ReCOmbinase encoding gene Selective marker EW : ^ N

Insertion GenOmic DNA Sequences Induction OfreCOmbinase

Insertion Sequences ReCOmbinase Site

WAIIII: NE B

GenOmic DNA Recombinase Recombinase site encoding gene Selective marker :3''''''''y

Figure9 Patent Application Publication Apr. 2, 2015 Sheet 11 of 28 US 201S/0094.483 A1

Original genomic sequence Target Sequence 2

W. NEA A

ReCOMOinase ReCOmbinase Selective marker ReCombinase Site encoding gene site

Sequence 1

A A A A A A B A A A C A

S Patent Application Publication Apr. 2, 2015 Sheet 12 of 28 US 201S/0094.483 A1

Targeting Targeting Sequence Selective marker Sequence

Figure 11 Patent Application Publication Apr. 2, 2015 Sheet 13 of 28 US 201S/0094.483 A1

Fatty Acid

rt- B-Oxidation3. h Fatty Acid Fatty Acid CYP52AType P450 X X ApOx4 ApOX5 (strain DP1) y (HO-Fatty Acid (B-unsaturated fatty acyl CoA Fatty Alcohol Oxidase CYP52A Type P450 Hydratase y y OCH-Fatty Acid B-hydroxy fatty-acyl CoA Dehydrogenase CYP52A Type P450 Dehydrogenase V y HOOC-Fatty Acid BOXO fatty-acyl CoA

Thiolase y Fatty-acyl CoA + Acetyl CoA

Figure 12 Patent Application Publication Apr. 2, 2015 Sheet 14 of 28 US 201S/0094.483 A1

o S350 PartB &16- E. 9- E 30 ; 12 ... / So5200 it. 5 8 5

O -2. --- s O 12 24 36 48 O 12 24 36 48 Transformation time (h) Transformation time (h) o81. Part C a350C Part D

O E. % S. i

5& 14 EMyristic acid W 250C EMyristic acid E. E. E. E.g. 2 : 12 pH controlled pH controlled

9 () 10 9 20 : S 8 O 150C X, :::: 1. ( 6 100C :

i i.O L = DP DP174 P1 DP174 Strains Strains Patent Application Publication Apr. 2, 2015 Sheet 15 of 28 US 201S/0094.483 A1

Fatty Acid ()-Oxidation s B-Oxidation

s s i. Fatty Acid Fatty Acid CYP52A Type P450 i X ApOx4 ApOx5 strain DP1) g co-HO-Fatty Acid (B-unsaturated fatty acyl COA Fatty Alcohol Oxidase X CYP52A Type P450 Hydratase V y OCH-Fatty Acid 3-hydroxy fatty-acyl COA Dehydrogenase X CYP52A Type P450 Dehydrogenase V w HOOC-Fatty Acid BOXO fatty-acyl CoA

Thiolase s Fatty-acyl COA+ Acetyl CoA

Figure 14 Patent Application Publication Apr. 2, 2015 Sheet 16 of 28 US 201S/0094.483 A1

A ()-hydroxy laurate Substrate B: ()-hydroxy palmitate Substrate

% W %W

DP186 P258 DP259 DP186 DP258 DP259 trains Strains

Figure 15 Patent Application Publication Apr. 2, 2015 Sheet 17 of 28 US 201S/0094.483 A1

A ()-hydroxy laurate Substrate B: ()-hydroxy palmitate Substrate

5 2

4. 1.O s W. S m 2. "

E O2 E I O. E. % E OO-- E DP186 DP283 DP284 Control DP186 DP283 DP284 Control Strains Strains

Figure 16 Patent Application Publication Apr. 2, 2015 Sheet 18 of 28 US 201S/0094.483 A1

B2 A4

CaADH1A B4 CaADH2A

Figure 17 Patent Application Publication Apr. 2, 2015 Sheet 19 of 28 US 201S/0094.483 A1

5. ADH 5'. ADHOut k 3. ADHin 3. ADHOut

Figure 18 Patent Application Publication Apr. 2, 2015 Sheet 20 of 28 US 201S/0094.483 A1

A. Cell growth B: Diacid production 35 6 --DP1 --DP 30J --DP283 - DP28

"&" DP415 - s: DP415 & 2 5 ar

Time (h)

Figure 19 Patent Application Publication Apr. 2, 2015 Sheet 21 of 28 US 201S/0094.483 A1

--DP --DP390 A-DP415 -v-DP417

-b-DP434

o 4 3 2 16 20 24 Patent Application Publication Apr. 2, 2015 Sheet 22 of 28 US 201S/0094.483 A1

Bacterial Origin of replication

linearization position Transcription terminator 2

Promoter

Selectable marker

Bacterial promoter Promoter 2 Gene for Expression Transcription terminator 1

Figure 21 Patent Application Publication Apr. 2, 2015 Sheet 23 of 28 US 201S/0094.483 A1

Promoter 2 Promoter 1 Gene for Expression Selectable marker Bacterialalong Orion (5 part) Bacterial i ofreplication Promoter Transcription promoter Transcription (3part) imp ling 2 / Z Evili:XXIIIZ. Part A. -1 linearized COnStruct B C A B

GenOmic Gene Promoter? Expressed from Promoter Part B. genomic Sequence A B C

Gene for EXOression Selectable marker Bacterial Origin GenOmic Gene Promoter 1 Bacterial of replication Expressed from Transcription promoter Transcription Promoter 1 Promoter trip ? in 2 Z. EHX...... I.Z. Part C: Construct integrated into genome Figure 22 Patent Application Publication Apr. 2, 2015 Sheet 24 of 28 US 201S/0094.483 A1

pUC bacterial origin of replication BSWI site for linearization CYC transcription terminator

isOcitrate lyase / promoter

Ze0cin resistance gene

EM7 bacterial promoter -- w TEF promoter--- 3 w Gene for Expression lsOcitrate lyase transcription terminator Figure23 Patent Application Publication Apr. 2, 2015 Sheet 25 of 28 US 201S/0094.483 A1

10 --DP428 8 -y-DP201 -1 - 6

O se--ancies-owski. v rv -Tseaxw m O 12 24 36 48 6O 72 84 12 24 36 48 6O 72 84 Time (h) Time (h) ()-hydroxymyristic acid production Tetradecanedioicacic (diacid) production

Figure24 Patent Application Publication Apr. 2, 2015 Sheet 26 of 28 US 201S/0094.483 A1

Myristic acid (C14:0) Oleic acid (C18:1) 6 O 2 O J-A-O-OHC14 –A diacid by P450A13 -A-O-OHOA-A-diacid by P450A3 5 O -- (-OHOA mordiacid by 858&t -OHC140 dacid by PSAs, 6 4. O 12 3 O

2 O 10- // / // --

O s:-" O 20 40 60 80 O 20 40 6O 80 Bioconversion time (h) BioConversion time (h)

O Linoleic acid (C182) -A-O-OHLA-A-clacid by P450A13 -- (-OHLA-O-diacid by Pás At Sigis, DP428. P450A17

DP522. P450A13

Figure25 O 2O 40 60 8O BioConversion time (h) Patent Application Publication Apr. 2, 2015 Sheet 27 of 28 US 201S/0094.483 A1

-A- (0-hydroxymyristate -A-O-hydroxymyristate m in DiaCic --Dacid

8 M 8 O

6C 6 O

4 O 2 / -A 2 O O 20 40 6O 8O O BioConversion time (h) BioConversion time (h)

Figure26 Patent Application Publication Apr. 2, 2015 Sheet 28 of 28 US 201S/0094.483 A1

.

& yS.

Figure27 US 2015/0094.483 A1 Apr. 2, 2015

BOTRANSFORMATION USING estas potential sources of chemical intermediates, and thus as GENETICALLY MODIFIED CANDDA possible replacements for chemicals derived from crude oil and its distillates. RELATED APPLICATIONS 0006 from the genus Candida are industrially 0001. This application is a divisional of U.S. patent appli important, they tolerate high concentrations offatty acids and cation Ser. No. 12/775,306, entitled “OXIDATION OF hydrocarbons in their growth media and have been used to COMPOUNDSUSING GENETICALLY MODIFIED CAN produce long chain fatty diacids (Picataggio et al. (1992), DIDA, filed on May 6, 2010, which claims priority to U.S. Biotechnology (NY): 10, 894-8.) However they frequently Provisional Patent Application No. 61/176,064, entitled lack enzymes that would facilitate conversion of plant cell BIOSYNTHETIC ROUTES TO ENERGY RICHMOL wall material (cellulose, hemicellulose, pectins and lignins) ECULES USING GENETICALLY MODIFIED CAN into Sugar monomers for use in biofuel production. Methods DIDA, filed May 6, 2009, and which is a continuation-in-part for addition of genes encoding proteins capable of catalyzing of U.S. patent application Ser. No. 12/436,729, entitled “BIO Such conversion into the Candida genome are thus of com SYNTHETIC ROUTES TO LONG-CHAIN ALPHA, mercial interest. Further, because yeasts do not always con OMEGA-HYDROXYACIDS, DIACIDS AND THEIR tain enzymatic systems for uptake and metabolism of all of CONVERSION TO OLIGOMERS AND POLYMERS the Sugar monomers derived from plant cell wall material, filed May 6, 2009. The disclosures of the above-referenced applications are incorporated by reference herein in their genes encoding enzymes that enable Candida to utilize Sug entirety. ars that it does not normally use, and methods for adding these genes to the Candida genome, are thus of commercial inter eSt. STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH ORDEVELOPMENT 0007 Currently, C.co-dicarboxylic acids are almost exclu sively produced by chemical conversion processes. However, 0002 This invention was made with government support the chemical processes for production of C.O)-dicarboxylic under grant number DAAD19-03-1-0091, W911QY-04-C- acids from non-renewable petrochemical feedstocks usually 0082 and NBCH1070004 awarded by the Defense Advanced produces numerous unwanted byproducts, requires extensive Research Projects Agency (DARPA) to Richard A. Gross. purification and gives low yields (Picataggio et al., 1992, The United States Government has certain rights in this Bio/Technology 10, 894-898). Moreover, C.(I)-dicarboxylic invention. acids with carbon chain lengths greater than 13 are not readily available by chemical synthesis. While several chemical SEQUENCE LISTING routes to synthesize long-chain C.O)-dicarboxylic acids are available, their synthesis is difficult, costly and requires toxic 0003. This application includes a Sequence Listing sub reagents. Furthermore, most methods result in mixtures con mitted as filename 49840.054DV1 SL.txt, of size 337,998 taining shorter chain lengths. Furthermore, other than four bytes, created Jun. 11, 2014. The Sequence Listing is incor carbon C,c)-unsaturated diacids (e.g. maleic acid and fumaric porated by reference herein in its entirety. acid), longer chain unsaturated C.()-dicarboxylic acids or those with other functional groups are currently unavailable 1. FIELD since chemical oxidation cleaves unsaturated bonds or modi fies them resulting in cis-trans isomerization and other by 0004 Methods for biological production of C.co-hydroxy products. acids using genetically modified strains of the Candida are provided. Also provided are methods for the genetic 0008. Many microorganisms have the ability to produce modification of the yeast Candida. Also provided are DNA C.()-dicarboxylic acids when cultured in n-alkanes and fatty constructs for removal of genes that can interfere with the acids, including Candida tropicalis, Candida cloacae, Cryp production of energy rich molecules by Candida. Also pro tococcus neoforman and Corynebacterium sp. (Shiio et al., vided are DNA constructs for insertion of genes for expres 1971, Agr. Biol. Chem. 35,2033-2042: Hillet al., 1986, Appl. sion into the Candida genome. Microbiol. Biotech. 24: 168-174; and Broadway et al., 1993, J. Gen. Microbiol. 139, 1337-1344). Candida tropicalis and 2. BACKGROUND similar yeasts are known to produce C.()-dicarboxylic acids with carbon lengths from C12 to C22 via an co-oxidation 0005 Genes that encode proteins that catalyze chemical pathway. The terminal methyl group of n-alkanes or fatty transformations of alkanes, alkenes, fatty acids, fatty alco acids is first hydroxylated by a membrane-bound enzyme hols, fatty aldehydes, aldehydes and alcohols may aid in the complex consisting of cytochrome P450 monooxygenase and biosynthesis of energy rich molecules, or in the conversion of associated NADPH cytochrome reductase that is the rate Such compounds to compounds better Suited to specific appli limiting step in the co-oxidation pathway. Two additional cations. Such molecules include hydrocarbons (alkane, alk enzymes, the fatty alcohol oxidase and fatty aldehyde dehy ene and isoprenoid), fatty acids, fatty alcohols, fatty alde drogenase, further oxidize the alcohol to create ()-aldehyde hydes, esters, ethers, lipids, triglycerides, and waxes, and can acid and then the corresponding C.O)-dicarboxylic acid (ES be produced from plant derived substrates, such as plant cell chenfeldt et al., 2003, Appl. Environ. Microbiol. 69, 5992 walls (lignocellulose, cellulose, hemicellulose, and pectin) 5999). However, there is also a f-oxidation pathway for fatty starch, and Sugar. These molecules are of particular interestas acid oxidation that exists within Candida tropicalis. Both potential Sources of energy from biological Sources, and thus fatty acids and C.co-dicarboxylic acids in wild type Candida as possible replacements from energy sources derived from tropicalis are efficiently degraded after activation to the cor crude oil and its distillates. These molecules are also of inter responding acyl-CoA ester through the B-Oxidation pathway, US 2015/0094.483 A1 Apr. 2, 2015

leading to carbon-chain length shortening, which results in cation class and a second genetic modification class. The first the low yields of C,c)-dicarboxylic acids and numerous by genetic modification class comprises one or more genetic products. modifications that disrupt the B-Oxidation pathway in the 0009 Mutants of C. tropicalis in which the B-oxidation of Substantially pure Candida host cell. The second genetic fatty acids is impaired may be used to improve the production modification class comprises one or more genetic modifica of O.co-dicarboxylic acids (Uemura et al., 1988, J. Am. Oil. tions that collectively or individually disrupt at least one gene Chem. Soc. 64, 1254-1257; and Yiet al., 1989, Appl. Micro in the substantially pure Candida host cell selected from the biol. Biotech. 30, 327-331). Recently, genetically modified group consisting of a CYP52A type cytochrome P450, a fatty strains of the yeast Candida tropicalis have been developed to alcohol oxidase, and an alcohol dehydrogenase. increase the production of C,c)-dicarboxylic acids. An engi 0012 Another embodiment provides a method for produc neered Candida tropicalis (Strain H5343, ATCC No. 20962) ing an O-carboxyl-(1)-hydroxy fatty acid having a carbon with the POX4 and POX5 genes that code for enzymes in the chain length in the range from C6 to C22, an O.()-dicarboxy first step of fatty acid B-Oxidation disrupted was generated so lic fatty acid having a carbon chain length in the range from that it can prevent the strain from metabolizing fatty acids, C6 to C22, or mixtures thereof in a Candida host cell. The which directs the metabolic flux toward ()-oxidation and method comprises (A) making one or more first genetic modi results in the accumulation of C,c)-dicarboxylic acids (FIGS. fications in a first genetic modification class to the Candida 3A and 3B). See U.S. Pat. No. 5.254.466 and Picataggio et al., host cell. The method further comprises (B) making one or 1992, Bio/Technology 10: 894-898, each of which is hereby more second genetic modifications in a second genetic modi incorporated by reference herein. Furthermore, by introduc fication class to the Candida host cell, where steps (A) and tion of multiple copies of cytochrome P450 and reductase (B) collectively form a genetically modified Candida host genes into C. tropicalis in which the 3-oxidation pathway is cell. The method further comprises (C) producing an O-car blocked, the C. tropicalis strain AR40 was generated with boxyl-(1)-hydroxy fatty acid having a carbon chain length in increased co-hydroxylase activity and higher specific produc the range from C6 to C22, an O.co-dicarboxylic fatty acid tivity of diacids from long-chain fatty acids. See, Picataggio having a carbon chain length in the range from C6 to C22, or et al., 1992, Bio/Technology 10: 894-898 (1992); and U.S. mixtures thereof, by fermenting the genetically modified Pat. No. 5,620,878, each of which is hereby incorporated by Candida host cell in a culture medium comprising a nitrogen reference herein. Genes encoding proteins that catalyze Source, an organic Substrate having a carbon chain length in chemical transformations of alkanes, alkenes, fatty acids, the range from C6 to C22, and a cosubstrate. Here, the first fatty alcohols, fatty aldehydes, aldehydes and alcohols may genetic modification class comprises one or more genetic also reduce the usefulness of these compounds as energy modifications that disrupt the B-oxidation pathway of the Sources, for example by oxidizing them or further metaboliz Candida host cell. Also, the second genetic modification class ing them. Methods for identifying and eliminating from the comprises one or more genetic modifications that collectively Candida genome genes encoding enzymes that oxidize or or individually disrupt at least one gene selected from the metabolize alkanes, alkenes, fatty acids, fatty alcohols, fatty group consisting of a CYP52A type cytochrome P450, a fatty aldehydes, aldehydes and alcohols are thus of commercial alcohol oxidase, and an alcohol dehydrogenase in the Can interest. For example fatty alcohols cannot be prepared using dida host cell. any described strain of Candida because the hydroxy fatty 0013. One embodiment provides a substantially pure Can acid is oxidized to form a dicarboxylic acid, which has dida host cell for the production of energy rich molecules. reduced energy content relative to the hydroxy fatty acid. The Candida host cell is characterized by a first genetic Furthermore, neither the general classes nor the specific modification class and a second genetic modification class. sequences of the Candida enzymes responsible for the oxi The first genetic modification class comprises one or more dation from hydroxy fatty acids to dicarboxylic acids have genetic modifications that collectively or individually disrupt been identified. There is therefore a need in the art for meth at least one gene in the Substantially pure Candida host cell ods to prevent the oxidation of hydroxy fatty acids to diacids selected from the group consisting of a fatty alcohol oxidase, during fermentative production. and an alcohol dehydrogenase. The second genetic modifica tion class comprises one or more genetic modifications that 3. SUMMARY collectively or individually add to the host cell genome at 0010 Methods for the genetic modification of Candida least one gene selected from the group consisting of a lipase, species to produce strains improved for the production of a cellulase, a ligninase or a cytochrome P450 that is not biofuels are disclosed. Methods by which yeast strains may identical to a naturally occurring counterpart gene in the be engineered by the addition or removal of genes to modify Candida host cell; or a lipase, a cellulase, a ligninase or a the oxidation of compounds of interest as biofuels are dis cytochrome P450 that is expressed under control of a pro closed. Enzymes to facilitate conversion of plant cell wall moter other than the promoter that controls expression of the material (cellulose, hemicellulose, pectins and lignins) into naturally occurring counterpart gene in the Candida host cell. Sugar monomers and enzymes to enable Candida to utilize 0014. One embodiment provides a substantially pure Can Such Sugars for use in biofuel production and methods for dida host cell for the biotransformation of organic molecules. addition of genes encoding Such enzymes into the Candida The Candida host cell is characterized by a first genetic genome are disclosed. modification class and a second genetic modification class. 0011. One embodiment provides a substantially pure Can The first genetic modification class comprises one or more dida host cell for the production of an O-carboxyl-(o-hydroxy genetic modifications that collectively or individually disrupt fatty acid having a carbon chain length in the range from C6 at least one alcohol dehydrogenase gene in the Substantially to C22, an O.()-dicarboxylic fatty acid having a carbon chain pure Candida host cell. The second genetic modification class length in the range from C6 to C22, or mixtures thereof. The comprises one or more genetic modifications that collectively Candida host cell is characterized by a first genetic modifi or individually add to the host cell genome at least one gene US 2015/0094.483 A1 Apr. 2, 2015

that is not identical to a naturally occurring counterpart gene 0027 at least 30% identical to a stretch of at least 50, at in the Candida host cell; or at least one gene that is identical least 60, at least 70, at least 80, at least 90, at least 100, at least to a naturally occurring counterpart gene in the Candida host 110 at least 120 contiguous nucleotides of SEQ ID NO:39, cell, but that is expressed under control of a promoter other SEQID NO:40, SEQID NO:42, SEQID NO:43, or SEQID than the promoter that controls expression of the naturally NO:56, or occurring counterpart gene in the Candida host cell. 0028 at least 20% identical to a stretch of at least 50, at 0015. In some embodiments the first genetic modification least 60, at least 70, at least 80, at least 90, at least 100, at least class comprises disruption of at least one alcohol dehydroge 110 at least 120 contiguous nucleotides of SEQ ID NO:39, nase gene selected from the group consisting of ADH-A4. SEQID NO:40, SEQID NO:42, SEQID NO:43, or SEQID ADH-A4B, ADH-B4, ADH-B4B, ADH-A10, ADH-A1OB, NO:56, or ADH-B11 and ADH-B11B. 0029 at least 10% identical to a stretch of at least 50, at 0016. In some embodiments the first genetic modification least 60, at least 70, at least 80, at least 90, at least 100, at least class comprises disruption of at least one alcohol dehydroge 110 at least 120 contiguous nucleotides of any one of SEQID nase gene whose nucleotide sequence is NO:39, SEQID NO:40, SEQID NO:42, SEQID NO:43, or 0017 at least 95% identical to a stretch of at least 50, at SEQID NO:56. least 60, at least 70, at least 80, at least 90, at least 100, at least 0030. In some embodiments the first genetic modification 110 at least 120 contiguous nucleotides of SEQID NO:39, class comprises disruption of at least one alcohol dehydroge SEQID NO:40, SEQID NO:42, SEQID NO:43, or SEQID nase gene whose amino acid sequence, predicted from trans NO:56, or lation of the gene that encodes it, comprises a first peptide. In 0018 at least 90% identical to a stretch of at least 50, at some embodiments the first peptide has the sequence VKYS least 60, at least 70, at least 80, at least 90, at least 100, at least GVCH (SEQ ID NO: 156). In some embodiments, the first 110 at least 120 contiguous nucleotides of SEQID NO:39, peptide has the sequence VKYSGVCHXXXXXWKGDW SEQID NO:40, SEQID NO:42, SEQID NO:43, or SEQID (SEQID NO: 162). In some embodiments the first peptide has NO:56, or the sequence VKYSGVCHXXXXXWKGDWXXXXKLPxVG 0019 at least 85% identical to a stretch of at least 50, at GHEGAGVVV (SEQID NO: 163). It will be understood that least 60, at least 70, at least 80, at least 90, at least 100, at least in amino acid sequences presented herein, each 'X' respre 110 at least 120 contiguous nucleotides of SEQID NO:39, sents a placeholder for a residue of any of the naturally occur SEQID NO:40, SEQID NO:42, SEQID NO:43, or SEQID ring aminoa acids. NO:56, 0031. In some embodiments the first genetic modification 0020 or at least 80% identical to a stretch of at least 50, at class comprises disruption of at least one alcohol dehydroge least 60, at least 70, at least 80, at least 90, at least 100, at least nase gene whose amino acid sequence, predicted from trans 110 at least 120 contiguous nucleotides of SEQID NO:39, lation of the gene that encodes it, comprises a second peptide. SEQID NO:40, SEQID NO:42, SEQID NO:43, or SEQID In some embodiments the second peptide has the sequence NO:56, or QYATADAVQAA (SEQID NO: 158). In some embodiments 0021 at least 75% identical to a stretch of at least 50, at the second peptide has the sequence SGYxHDGXFXOYATA least 60, at least 70, at least 80, at least 90, at least 100, at least DAVQAA (SEQ ID NO: 164). In some embodiments the 110 at least 120 contiguous nucleotides of SEQID NO:39, second peptide has the sequence GAEPNCXXADxSGYX SEQID NO:40, SEQID NO:42, SEQID NO:43, or SEQID HDGxFXOYATADAVQAA (SEQID NO: 165). NO:56, or 0032. In some embodiments the first genetic modification 0022 at least 70% identical to a stretch of at least 50, at class comprises disruption of at least one alcohol dehydroge least 60, at least 70, at least 80, at least 90, at least 100, at least nase gene whose amino acid sequence, predicted from trans 110 at least 120 contiguous nucleotides of SEQID NO:39, lation of the gene that encodes it, comprises a third peptide. In SEQID NO:40, SEQID NO:42, SEQID NO:43, or SEQID Some embodiments the third peptide has the sequence NO:56, or CAGVTVYKALK (SEQ ID NO: 159). In some embodi 0023 at least 65% identical to a stretch of at least 50, at ments the third peptide has the sequence APIX least 60, at least 70, at least 80, at least 90, at least 100, at least CAGVTVYKALK (SEQID NO: 166). 110 at least 120 contiguous nucleotides of SEQID NO:39, 0033. In some embodiments the first genetic modification SEQID NO:40, SEQID NO:42, SEQID NO:43, or SEQID class comprises disruption of at least one alcohol dehydroge NO:56, or nase gene whose amino acid sequence, predicted from trans 0024 at least 60% identical to a stretch of at least 50, at lation of the gene that encodes it, comprises a fourth peptide. least 60, at least 70, at least 80, at least 90, at least 100, at least In some embodiments the fourth peptide has the sequence 110 at least 120 contiguous nucleotides of SEQID NO:39, GQWVAISGA(SEQID NO: 160). In some embodiments the SEQID NO:40, SEQID NO:42, SEQID NO:43, or SEQID fourth peptide has the sequence GQWVAISGAXGGLGSL NO:56, or (SEQID NO: 167). In some embodiments the fourth peptide 0025 at least 50% identical to a stretch of at least 50, at has the sequence GQWVAISGAXGGLGSLXVQYA (SEQID least 60, at least 70, at least 80, at least 90, at least 100, at least NO: 168). In some embodiments, the fourth peptide has the 110 at least 120 contiguous nucleotides of SEQID NO:39, sequence GQWVAISGAXGGLGSLXVQYAXAMG (SEQID SEQID NO:40, SEQID NO:42, SEQID NO:43, or SEQID NO: 169). In some embodiments the fourth peptide has the NO:56, or Sequence GQWVAISGAxGGLGSLXVQYAXAM 0026 at least 40% identical to a stretch of at least 50, at GxRVXAIDGG (SEQID NO: 170). least 60, at least 70, at least 80, at least 90, at least 100, at least 0034. In some embodiments the first genetic modification 110 at least 120 contiguous nucleotides of SEQID NO:39, class comprises disruption of at least one alcohol dehydroge SEQID NO:40, SEQID NO:42, SEQID NO:43, or SEQID nase gene whose amino acid sequence, predicted from trans NO:56, or lation of the gene that encodes it, comprises a fifth peptide. In US 2015/0094.483 A1 Apr. 2, 2015

Some embodiments the fifth peptide has the sequence embodiments the alcoholdehydrogenase is deemed disrupted VGGHEGAGVVV (SEQID NO: 157). if the rate of conversion is at least 5% lower, at least 10% 0035. In some embodiments the first genetic modification lower, at least 15% lower, at least 20% lower, at least 25% class comprises disruption of at least one alcohol dehydroge lower, or at least 30% lower in the Candida host cell than the nase gene whose amino acid sequence, predicted from trans second host cell. lation of the gene that encodes it, comprises at least one, two, 0040. In some embodiments, disruption of an alcohol three, four or five peptides selected from the group consisting dehydrogenase in a Candida host cell is measured by incu of a first peptide having the sequence VKYSGVCH (SEQID bating the Candida host cell in a mixture comprising a Sub NO: 156), a second peptide having the sequence QYATA strate possessing a hydroxyl group and measuring the rate of DAVQAA (SEQ ID NO: 158), a third peptide having the conversion of the Substrate to a more oxidized product such as sequence CAGVTVYKALK (SEQ ID NO: 159), a fourth an aldehyde or a carboxyl group. The amount of the Substrate peptide having the sequence GQWVAISGA (SEQ ID NO: converted to product by the Candida host cell in a specified 160) and a fifth peptide having the sequence VGGHE time is compared with the amount of substrate converted to GAGVVV (SEQID NO: 157). product by a second host cell that does not contain the dis 0036. In some embodiments the first genetic modification rupted gene but contains a wild type counterpart of the gene, class comprises disruption of at least one alcohol dehydroge when the Candida host cell and the second host cell are under nase gene whose amino acid sequence, predicted from trans the same environmental conditions (e.g., same temperature, lation of the gene that encodes it has at least 65 percent same media, etc.). The amount of product can be measured sequence identity, at least 70 percent sequence identity, at using colorimetric assays, or chromatographic assays, or least 75 percent sequence identity, at least 80 percent mass spectroscopy assays. In some embodiments the alcohol sequence identity, at least 85 percent sequence identity, at dehydrogenase is deemed disrupted if the amount of product least 90 percent sequence identity, or at least 95 percent is at least 5%, at least 10%, at least 15%, at least 20%, at least sequence identity to a stretch of at least 20, at least 30, at least 25%, or at least 25% lower in the Candida host cell than the 40, at least 50, at least 60, at least 70, at least 80, at least 90, second host cell. or at least 100 contiguous residues of any one of SEQ ID 0041. In some embodiments, the first genetic modification NO:151, SEQ ID NO:152, SEQ ID NO:153, SEQ ID class causes an alcohol dehydrogenases to have decreased NO:154, or SEQID NO:155. function relative to the function of the wild-type counterpart 0037. In some embodiments the first genetic modification in the Candida host cell. class comprises disruption of at least one alcohol dehydroge 0042. In some embodiments, decreased function of an nase gene whose amino acid sequence, predicted from trans alcohol dehydrogenase in a Candida host cell is measured by lation of the gene that encodes it, has at least 65 percent incubating the Candida host cell in a mixture comprising a sequence identity, at least 70 percent sequence identity, at Substrate possessing a hydroxyl group and measuring the rate least 75 percent sequence identity, at least 80 percent of conversion of the substrate to a more oxidized product such sequence identity, at least 85 percent sequence identity, at as an aldehyde or a carboxyl group. The rate of conversion of least 90 percent sequence identity, or at least 95 percent the substrate by the Candida host cell is compared with the sequence identity to a stretch of between 5 and 120 contigu rate of conversion produced by a second host cell that does not ous residues, between 40 and 100 contiguous residues, contain the disrupted gene but contains a wild type counter between 50 and 90 contiguous residues, between 60 and 80 part of the gene, when the Candida host cell and the second contiguous residues of any one of SEQID NO:151, SEQID host cell are under the same environmental conditions (e.g., NO:152, SEQ ID NO:153, SEQ ID NO:154, or SEQ ID same temperature, same media, etc.). The rate of formation of NO:155. the product can be measured using colorimetric assays, or 0038. In some embodiments the first genetic modification chromatographic assays, or mass spectroscopy assays. In class comprises disruption of at least one alcohol dehydroge Some embodiments the alcohol dehydrogenase is deemed to nase gene whose amino acid sequence, predicted from trans have decreased function if the rate of conversion is at least 5% lation of the gene that encodes it, has at least 90 percent lower, at least 10% lower, at least 15% lower, at least 20% sequence identity to a stretch of between 10 and 100 contigu lower, at least 25% lower, or at least 30% lower in the Can ous residues of any one of SEQID NO:151, SEQID NO:152, dida host cell than the second host cell SEQID NO:153, SEQID NO:154, or SEQID NO:155. 0043. In some embodiments, decreased function of an 0039. In some embodiments, the first genetic modification alcohol dehydrogenase in a Candida host cell is measured by class causes disruption of an alcohol dehydrogenase in a incubating the Candida host cell in a mixture comprising a Candida host cell. In some embodiments disruption of an Substrate possessing a hydroxyl group and measuring the rate alcohol dehydrogenase is measured by incubating the Can of conversion of the substrate to a more oxidized product such dida host cell in a mixture comprising a Substrate possessing as an aldehyde or a carboxyl group. The amount of the Sub a hydroxyl group and measuring the rate of conversion of the strate converted to product by the Candida host cell in a Substrate to a more oxidized product such as an aldehyde or a specified time is compared with the amount of Substrate con carboxyl group. The rate of conversion of the substrate by the verted to product by a second host cell that does not contain Candida host cell is compared with the rate of conversion the disrupted gene but contains a wild type counterpart of the produced by a second host cell that does not contain the gene, when the Candida host cell and the second host cell are disrupted gene but contains a wild type counterpart of the under the same environmental conditions (e.g., same tem gene, when the Candida host cell and the second host cell are perature, same media, etc.). The amount of product can be under the same environmental conditions (e.g., same tem measured using colorimetric assays, or chromatographic perature, same media, etc.). The rate of formation of the assays, or mass spectroscopy assays. In some embodiments product can be measured using colorimetric assays, or chro the alcoholdehydrogenase is deemed to have decreased func matographic assays, or mass spectroscopy assays. In some tion if the amount of product is at least 5% lower, at least 10% US 2015/0094.483 A1 Apr. 2, 2015

lower, at least 15% lower, at least 20% lower, at least 25% and C.co-dicarboxylic acids can be produced from inexpen lower, or at least 30% lower in the Candida host cell than the sive long-chain fatty acids, which are readily available from second host cell. renewable agricultural and forest products such as Soybean 0044. In some embodiments, the first genetic modification oil, corn oil and tallow. Moreover, a wide range of C-car class causes an alcohol dehydrogenases to have a modified boxyl-(o-hydroxylfatty acids with different carbon length can activity spectrum relative to an activity spectrum of the wild be prepared because the biocatalyst accepts a wide range of type counterpart. fatty acid substrates. Products described herein produced by 0045. In some embodiments, activity of an alcohol dehy the biocatalytic methods described herein are new and not drogenase in a Candida host cell is measured by incubating commercially available since chemical methods are imprac the Candida host cell in a mixture comprising a substrate tical to prepare the compounds and biocatalytic methods to possessing a hydroxyl group and measuring the rate of con these products were previously unknown. version of the Substrate to a more oxidized product such as an aldehyde or a carboxyl group. The rate of conversion of the 4. BRIEF DESCRIPTION OF THE FIGURES substrate by the Candida host cell is compared with the rate of 0049 FIG. 1 shows two pathways for metabolism of fatty conversion produced by a second host cell that does not acids, co-oxidation and B-Oxidation, both of which exist in contain the disrupted gene but contains a wild type counter yeasts of the genus Candida including Candida tropicalis. part of the gene, when the Candida host cell and the second The names of classes of compounds are shown, arrows indi host cell are under the same environmental conditions (e.g., cate transformations from one compound to another, and the same temperature, same media, etc.). The rate of formation of names of classes of enzymes that perform these conversions the product can be measured using colorimetric assays, or are indicated by underlined names adjacent to the arrows. chromatographic assays, or mass spectroscopy assays. In 0050 FIG. 2 shows two pathways for metabolism of fatty Some embodiments the alcohol dehydrogenase is deemed to acids, co-oxidation and B-Oxidation, both of which exist in have a modified activity spectrum if the rate of conversion is yeasts of the genus Candida including Candida tropicalis. at least 5% lower, at least 10% lower, at least 15% lower, at The names of classes of compounds are shown, arrows indi least 20% lower, or at least 25% lower in the Candida host cell cate transformations from one compound to another, and the than the second host cell. names of classes of enzymes that perform these conversions 0046. In some embodiments, activity of an alcohol dehy are indicated by underlined names adjacent to the arrows. By drogenase in a Candida host cell is measured by incubating inactivating the genes encoding acyl coA oxidase (poX4 and the Candida host cell in a mixture comprising a substrate pox5), the B-oxidation pathway is blocked (indicated by bro possessing a hydroxyl group and measuring the rate of con ken arrows), so that fatty acids are not used as Substrates for version of the Substrate to a more oxidized product such as an growth. This genetic modification allows Candida species of aldehyde or a carboxyl group. The amount of the substrate yeast including Candida tropicalis to be used as a biocatalyst converted to product by the Candida host cell in a specified for the production of C,c)-diacids. See, for example, Picatag time is compared with the amount of substrate converted to gio et al., 1991, Mol Cell Biol 11,4333-4339; and Picataggio product by a second host cell that does not contain the dis et al., 1992, Biotechnology 10, 894-898. The B-oxidation rupted gene but contains a wild type counterpart of the gene, pathway may be disrupted by any genetic modification or when the Candida host cell and the second host cell are under treatment of the host cells with a chemical for example an the same environmental conditions (e.g., same temperature, inhibitor that substantially reduces or eliminates the activity same media, etc.). The amount of product can be measured of one or more enzymes in the B-Oxidation pathway, includ using colorimetric assays, or chromatographic assays, or ing the hydratase, dehydrogenase or thiolase enzymes, and mass spectroscopy assays. In some embodiments the alcohol thereby reduces the flux through that pathway and thus the dehydrogenase is deemed to have a modified activity spec utilization of fatty acids as growth substrates. trum if the amount of product is at least 5% lower, at least 10% 0051 FIGS. 3A and 3B show an alignment, using Clust lower, at least 15% lower, at least 20% lower, at least 25% alW. of the amino acid sequences of alcohol dehydrogenase lower in the Candida host cell than the second host cell. proteins predicted from the sequences of genes from Candida 0047. In some embodiments the second genetic modifica albicans and Candida tropicalis. The genes from Candida tion class comprises addition of at least one modified tropicalis were isolated as partial genes by PCR with degen CYP52A type cytochrome P450 selected from the group erate primers, so the nucleic acid sequences of the genes and consisting of CYP52A13, CYP52A14, CYP52A17, the predicted amino acid sequences of the encoded proteins CYP52A18, CYP52A12, and CYP52A12B. are incomplete. Amino acid sequences of the partial genes are 0048 Disclosed are biosynthetic routes that convert (oxi predicted and provided: SEQID NO:155 (ADH-A4), SEQID dize) fatty acids to their corresponding C-carboxyl-(o-hy NO:154 (ADH-B4), SEQID NO:152 (ADH-A10), SEQ ID droxyl fatty acids. This is accomplished by culturing fatty NO:153 (ADH-A10B) and SEQID NO:151 (ADH-B11). acid Substrates with a yeast, preferably a strain of Candida 0.052 FIG. 4 shows a schematic representation of a DNA and more preferably a strain of Candida tropicalis. The yeast 'genomic targeting construct for deleting sequences from converts fatty acids to long-chain ()-hydroxy fatty acids and the genome of yeasts. The general structure is that the con C.co-dicarboxylic acids, and mixtures thereof. Methods by struct has two targeting sequences that are homologous to the which yeast strains may be engineered by the addition or sequences of two regions of the target yeast chromosome. removal of genes to modify the oxidation products formed are Between these targeting sequences are two sites recognized disclosed. Fermentations are conducted in liquid media con by a site-specific recombinase (indicated as “recombinase taining fatty acids as Substrates. Biological conversion meth site'). Between the two site specific recombinase sites are ods for these compounds use readily renewable resources sequence elements, one of which encodes a selective marker Such as fatty acids as starting materials rather than non-re and the other of which (optionally) encodes the site-specific newable petrochemicals For example, ()-hydroxy fatty acids recombinase that recognizes the recombinase sites. In one US 2015/0094.483 A1 Apr. 2, 2015

embodiment the sequences of the DNA construct between the from this plasmid in a linear form by digestion with one or targeting sequences is the “SAT1 flipper', a DNA construct more restriction enzymes with recognition sites that flank the for inserting and deleting sequences into the chromosome of targeting sequences. Candida (Reuss et al., (2004), Gene: 341, 119-27). In the 0056 FIG. 8 shows a schematic representation of the “SAT1 flipper, the recombinase is the flp recombinase from homologous recombination between a "genomic targeting Saccharomyces cerevisiae (Vetter et al., 1983, Proc Natl Acad construct of the form shown in FIG. 7, with the DNA con Sci USA: 80, 7284-8) (FLP) and the flanking sequences rec tained in a yeast genome (either in the chromosome or in the ognized by the recombinase are recognition sites for the flp mitochondrial DNA). The targeting construct (A) contains recombinase (FRT). The selective marker is the gene encod two regions of sequence homology to the genomic sequence ing resistance to the Nourseothricin resistance marker from (B); the corresponding sequences in the genomic sequence transposon Tn 1825 (Tietzeet al., 1988, J Basic Microbiol: 28, flank the DNA region to be replaced. Introduction of the 129-36). The DNA sequence of the SAT1-flipper is given as targeting construct into the host cell is followed by homolo SEQ ID NO: 1. The genomic targeting sequence can be gous recombination catalyzed by host cell enzymes. The propagated in bacteria, for example Ecoli, in which case the result is an integrant of the targeting construct into the complete plasmid will also contain sequences required for genomic DNA (C) and the excised DNA (D) which will propagation in bacteria, comprising a bacterial origin of rep generally be lost from the cell. lication and a bacterial selective marker Such as a gene con 0057 FIG.9 shows a schematic representation of excision ferring antibiotic resistance. The targeting construct can be of the targeting construct from the yeast genome that occurs released from this plasmid in a linear form by digestion with when expression of the recombinase in the targeting construct one or more restriction enzymes with recognition sites that is induced in the integrant (A) shown in FIG. 8. Induction of flank the targeting sequences. the site-specific recombinase causes recombination between 0053 FIG. 5 shows a schematic representation of the the two recombinase recognition sites. The result is the exci homologous recombination between a 'genomic targeting sion of the sequences between the two recombinase sites (C) construct of the form shown in FIG. 4, with the DNA con leaving a single recombinase site together with the additional tained in a yeast genome (either in the chromosome or in the sequences that were included between the targeting mitochondrial DNA). The targeting construct (A) contains sequences and the recombinase site (see FIG. 7) in the two regions of sequence homology to the genomic sequence genomic DNA (B). (B); the corresponding sequences in the genomic sequence 0.058 FIG. 10 shows a schematic representation of three flank the DNA region to be replaced. Introduction of the Stages in generation of a targeted deletion in a yeast genome targeting construct into the host cell is followed by homolo (either in the chromosome or in the mitochondrial DNA), and gous recombination catalyzed by host cell enzymes. The the results of a PCR test to distinguish between the three result is an integrant of the targeting construct into the stages. (A) PCR primers (thick arrows) are designed to flank genomic DNA (C) and the excised DNA (D) which will the targeted region. (B) Insertion of a genomic targeting con generally be lost from the cell. struct into the genome inserts two recombinase sites, a recom binase gene and a selection marker between the two target 0054 FIG. 6 shows a schematic representation of excision sequences. This changes the size of the DNA segment of the targeting construct from the yeast genome that occurs between the two PCR primers; in the case shown the size is when expression of the recombinase in the targeting construct increased. (C) Induction of the recombinase results in exci is induced in the integrant (A) shown in FIG. 5. Induction of sion of the recombinase encoding gene, the selective marker the site-specific recombinase causes recombination between and one of the recombinase sites. This again changes the size the two recombinase recognition sites. The result is the exci of the DNA segment between the two PCR primers. (D) PCR sion of the sequences between the two recombinase sites (C) amplification from yeast genomic DNA unmodified (gel leaving a single recombinase site in the genomic DNA (B). lanes marked A), with integrated genomic targeting vector 0055 FIG. 7 shows a schematic representation of a DNA (gel lanes marked B) or after excision of the genomic target 'genomic targeting construct for inserting sequences into ing vector (gel lanes marked C). the genome of yeasts. The general structure is that the con 0059 FIG. 11 shows a schematic representation of a DNA struct has two targeting sequences that are homologous to the 'genomic targeting construct for inserting or deleting sequences of two regions of the target yeast chromosome. sequences in the genome of yeasts. The general structure is Between these targeting sequences are two sites recognized that the construct has two targeting sequences that are by a site-specific recombinase (indicated as “recombinase homologous to the sequences of two regions of the target site'). Between the two site specific recombinase sites are yeast chromosome. Between these targeting sequences is a sequence elements, one of which encodes a selective marker sequence that encodes a selective marker. and the other of which (optionally) encodes the site-specific 0060 FIG. 12 shows two pathways for metabolism offatty recombinase that recognizes the recombinase sites. Insertion acids, co-oxidation and B-Oxidation, both of which exist in of additional sequences between one of the targeting Candida species of yeast including Candida tropicalis. The sequences and its closest recombinase recognition site will names of classes of compounds are shown, arrows indicate result in those sequences being inserted into the chromosome transformations from one compound to another, and the after excision of the targeting construct (“Insertion names of classes of enzymes that perform these conversions sequences'). The genomic targeting sequence can be propa are indicated by underlined names adjacent to the arrows. By gated in bacteria, for example Ecoli, in which case the com inactivating the Candida tropicalis genes poX4 and poX5 (or plete plasmid will also contain sequences required for propa their functional homologs in other Candida species), the gation in bacteria, comprising a bacterial origin of replication B-oxidation pathway is blocked (indicated by broken arrows), and a bacterial selective marker Such as a gene conferring so that fatty acids are not used as Substrates for growth. antibiotic resistance. The targeting construct can be released Furthermore, inactivation of CYP52A type cytochrome P450 US 2015/0094.483 A1 Apr. 2, 2015 enzymes, as illustrated in the Figure, prevents the co-oxida ()-hydroxy fatty acids to C,c)-dicarboxylic acids. If other tion of these fatty acids. These enzymes may also be respon enzymes involved in oxidation of ()-hydroxy fatty acids are sible for some or all of the transformations involved in oxi present in the strain, then the strain will convert co-hydroxy dizing ()-hydroxy fatty acids to C.O)-dicarboxylic acids. See fatty acids fed in the media to C, co-dicarboxylic acids. If other Eschenfeldt et al., 2003, “Transformation of fatty acids cata enzymes involved in oxidation of co-hydroxy fatty acids have lyzed by cytochrome P450 monooxygenase enzymes of Can been eliminated from the strain, then the strain will convert dida tropicalis.” Appli. Environ. Microbiol. 69: 5992-5999, ()-hydroxy fatty acids fed in the media to C.()-dicarboxylic which is hereby incorporated by reference herein. acids. 0061 FIG. 13 shows the levels of co-hydroxy myristate 0063 FIG. 15 shows the levels of C.O)-dicarboxylic acids and the over-oxidized C14 diacid produced by Candida tropi produced by Candida tropicalis strains DP186, DP258 and calis strains DP1 (ura3A/ura3B pox5A::ura3A/pox5B:: DP259 (see Table 3 for genotypes). Cultures of the yeast ura3A pox4A::ura3A/pox4B::URA3A) and DP174 (ura3A/ strains were grown at 30°C. and 250 rpm for 16 hours in a 500 ura3B pox5A::ura3A/pox5B::ura3A pox4A::ura3A/pox4B:: ml flask containing 30 ml of media F (media F is peptone3 g/l. URA3A ACYP52A17/ACYP52A18 ACYP52A13/ yeast extract 6 g/l. yeast nitrogen base 6.7 g/l. Sodium acetate ACYP52A14). Cultures of the yeaststrains were grown at 30° 3 g/l. KHPO, 7.2 g/l. KHPO 9.3 g/l) plus 20 g/l glycerol. C. and 250 rpm for 16 hours in a 500 ml flask containing 30 ml After 16 hours 0.5 ml of culture was added to 4.5 ml fresh of media F (media F is peptone 3 g/l. yeast extract 6 g/l. yeast media F plus 20 g/l glycerol in a 125 ml flask, and grown at nitrogen base 6.7 g/l. Sodium acetate 3 g/l. KHPO, 7.2 g/l. 30° C. and 250 rpm for 12 hours before addition of substrate. KHPO, 9.3 g/l) plus 30 g/l glucose. After 16 hours 0.5 ml of After addition of substrates growth was continued at 30° C. culture was added to 4.5 ml fresh media Fplus 60 g/l glucose and 250 rpm. Part A: the substrate ()-hydroxy laurate was then in a 125 ml flask, and grown at 30° C. and 250 rpm for 12 added to a final concentration of 5 g/l and the pH was adjusted hours before addition of substrate. After addition of substrates to between 7.5 and 8. Samples were taken after 24 hours, cell growth was continued at 30° C. and 250 rpm. Parts A and B: culture was acidified to pH -1.0 by addition of 6 NHCl, the substrate methyl myristate was then added to a final con products were extracted from the cell culture by diethyl ether centration of 10 g/l and the pH was adjusted to between 7.5 and the concentrations of C.O)-dicarboxy laurate were mea and 8. The culture was pH controlled by adding 2 mol/l NaOH Sured by LC-MS (liquid chromatography mass spectros every 12 hours and glucose was fed as a coSubstrate by adding copy). Part B: the substrate ()-hydroxy palmitate was then 400 g/l glucose every 8 hours. Samples were taken at the added to a final concentration of 5 g/l and the pH was adjusted times indicated, cell culture was acidified to pH ~1.0 by to between 7.5 and 8. Samples were taken after 24 hours, cell addition of 6 NHCl, products were extracted from the cell culture was acidified to pH -1.0 by addition of 6 NHCl, culture by diethyl ether and the concentrations of co-hydroxy products were extracted from the cell culture by diethyl ether myristate and of the C14 diacid produced by oxidation of the and the concentrations of C.O)-dicarboxy laurate were mea ()-hydroxy myristate were measured by LC-MS (liquid chro Sured by LC-MS (liquid chromatography mass spectros matography mass spectroscopy). The diacid was quantified copy). relative to a known standard. No such standard was available 0064 FIG. 16 shows the levels of C.O)-dicarboxylic acids for the ()-hydroxy myristate. So it was quantified by measur produced by Candida tropicalis strains DP186, DP283 and ing the area under the peak in the MS chromatogram. Parts C DP284 (see Table 3 for genotypes). Cultures of the yeast and D: the Substrates methyl myristate, Sodium myristate or strains were grown at 30°C. and 250 rpm for 16 hours in a 500 myristic acid were added to a final concentration of 10 g/land ml flask containing 30 ml of media F (media F is peptone3 g/l. the pH was adjusted to between 7.5 and 8. The culture was pH yeast extract 6 g/l. yeast nitrogen base 6.7 g/l. Sodium acetate controlled by adding 2 mol/l NaOH every 12 hours and glu 3 g/l. KHPO, 7.2 g/l. KHPO, 9.3 g/l) plus 20 g/l glycerol. cose was fed as a coSubstrate by adding 400 g/l glucose every After 16 hours 0.5 ml of culture was added to 4.5 ml fresh 8 hours. Samples were taken after 48 hours, cell culture was media F plus 20 g/l glycerol in a 125 ml flask, and grown at acidified to pH -1.0 by addition of 6 NHCl, products were 30° C. and 250 rpm for 12 hours before addition of substrate. extracted from the cell culture by diethyl ether and the con After addition of substrates growth was continued at 30° C. centrations of co-hydroxy myristate and of the C14 diacid and 250 rpm. Part A: the substrate ()-hydroxy laurate was then produced by oxidation of the co-hydroxy myristate were mea added to a final concentration of 5 g/l and the pH was adjusted Sured by LC-MS (liquid chromatography mass spectros to between 7.5 and 8. Samples were taken after 24 hours, cell copy). culture was acidified to pH -1.0 by addition of 6 NHCl, 0062 FIG. 14 shows two pathways for metabolism offatty products were extracted from the cell culture by diethyl ether acids, ()-oxidation and B-Oxidation, both of which exist in and the concentrations of C.O)-dicarboxy laurate were mea Candida species of yeast including Candida tropicalis. The Sured by LC-MS (liquid chromatography mass spectros names of classes of compounds are shown, arrows indicate copy). Part B: the substrate ()-hydroxy palmitate was then transformations from one compound to another, and the added to a final concentration of 5 g/l and the pH was adjusted names of classes of enzymes that perform these conversions to between 7.5 and 8. Samples were taken after 24 hours, cell are indicated by underlined names adjacent to the arrows. By culture was acidified to pH -1.0 by addition of 6 NHCl, inactivating the Candida tropicalis genes poX4 and poX5 (or products were extracted from the cell culture by diethyl ether their functional homologs in other Candida species), the and the concentrations of C.O)-dicarboxy laurate were mea B-oxidation pathway is blocked (indicated by broken arrows), Sured by LC-MS (liquid chromatography mass spectros so that fatty acids are not used as Substrates for growth. copy). Furthermore, inactivation of CYP52A type cytochrome P450 0065 FIG. 17 shows a phylogenetic tree with five Can enzymes prevents the w-oxidation of fatty acids. Several dida tropicalis alcohol dehydrogenase sequences (A10, B1 1. enzymes including, but not limited to CYP52A type P450s, B2, A4 and B4) and two alcohol dehydrogenases from Can are responsible for transformations involved in oxidizing dida albicans (Ca ADH1A and Ca ADH2A). US 2015/0094.483 A1 Apr. 2, 2015

0066 FIG. 18 shows a schematic design for selecting two marker. The selectable marker is preferably one that is active sets of nested targeting sequences for the deletion of two in both bacterial and yeast hosts. To achieve this, the select alleles of a gene whose sequences are very similar, for able marker may be preceded by a yeast promoter (promoter example the alcohol dehydrogenase genes. The construct for 2) and a bacterial promoter, and optionally it may be followed the first allele uses ~200 base pair at the 5' end and ~200 base by a transcription terminator (transcription terminator 2). The pair at the 3' end as targeting sequences (5'-ADH Out and genomic insertion construct also comprises a bacterial origin 3'-ADH Out). The construct for the second allele uses two of replication. sections of ~200 base pair between the first two targeting 0070 FIG. 22 shows a schematic representation of the sequences (5'-ADH In and 3'-ADH In). These sequences are integration of a DNA “genomic insertion' construct into the eliminated by the first targeting construct from the first allele DNA of a yeast genome. Part A shows an integration con of the gene and will thus serve as a targeting sequence for the struct of the structure shown in FIG. 22, with parts marked. second allele of the gene. The construct is linearized, for example by digesting with an 0067 FIG. 19 shows the levels of O.co-dicarboxylic acids enzyme that recognizes a unique restriction site within pro produced by Candida tropicalis strains DP1, DP283 and moter 1, or by PCR amplification, or by any other method, so DP415 (see Table 3 for genotypes). Cultures of the yeast that a portion of promoter 1 is at one end of the linearized strains were grown at 30°C. and 250 rpm for 18 hours in a 500 construct (5' part), and the remainder at the other end (3' end). ml flask containing 30 ml of media F (media F is peptone3 g/l. Three positions (A, B and C) are marked in Promoter 1, these yeast extract 6 g/l. yeast nitrogen base 6.7 g/l. Sodium acetate refer to the positions in FIG. 21. Part B shows the intact 3 g/l. KHPO, 7.2 g/l. KHPO, 9.3 g/l) plus 20 g/l glycerol. Promoter 1 in the yeast genome, followed by the gene that is After 18 hours the preculture was diluted in fresh media to normally transcribed from Promoter 1 (genomic gene Asoo 1.0. This culture was shaken until the Asoo reached expressed from promoter 1). Three positions (A, B and C) are between 5.0 and 6.0. Biocatalytic conversion was initiated by also marked in the genomic copy of Promoter 1. Part C shows adding 5 ml culture to a 125 ml flask together with 50 mg of the genome after integration of the construct. The construct ()-hydroxy lauric acid, and pH adjusted to ~7.5 with 2M integrates at position B in Promoter 1, the site at which the NaOH. Part A: cell growth was followed by measuring the construct was linearized. This results in a duplication of pro A every 2 hours. Part B: formation of diacid; every 2 hours moter 1 in the genome, with one copy of the promoter driving a sample of the cell culture was taken, acidified to pH~1.0 by transcription of the introduced gene for expression and the addition of 6 NHCl, products were extracted from the cell other copy driving the transcription of the genomic gene culture by diethyl ether and the concentrations of C,c)-dicar expressed from promoter 1. boxy laurate were measured by LC-MS (liquid chromatogra (0071 FIG. 23 shows a specific embodiment of the DNA phy mass spectroscopy). “genomic insertion' construct shown in FIG. 21. The general 0068 FIG. 20 shows the levels of C.O)-dicarboxylic acids structure is that the construct has a gene for expression which produced by Candida tropicalis strains DP1, DP390, DP415, is preceded by a promoter that is active in the yeast (the DP417, DP421, DP423, DP434 and DP436 (see Table 3 for Candida tropicalis isocitrate lyase promoter). The isocitrate genotypes). Cultures of the yeast strains were grown at 30°C. lyase promoter comprises a unique BsiWI site whereby the and 250 rpm for 18 hours in a 500 ml flask containing 30 ml construct may be cleaved by endocunclease BsiWI once to of media F (media F is peptone 3 g/l. yeast extract 6 g/l. yeast linearize it. The gene for expression is followed by a tran nitrogen base 6.7 g/l. Sodium acetate 3 g/l. KHPO, 7.2 g/l. Scription terminator (isocitrate lyase transcription termina KHPO 9.3 g/l) plus 20 g/l glycerol. After 18 hours the tor). The genomic insertion construct also comprises a select preculture was diluted in fresh media to A-1.0. This cul able marker conferring resistance to the antibiotic Zeocin. ture was shaken until the Ao reached between 5.0 and 6.0. This selectable marker is active in both bacterial and yeast Biocatalytic conversion was initiated by adding 5 ml culture hosts and preceded by a yeast promoter (the TEF1 promoter) to a 125 ml flask together with 50 mg of co-hydroxy lauric and a Bacterial promoter (the EM7 promoter), and followed acid, and pH adjusted to ~7.5 with 2M NaOH. Formation of by a transcription terminator (the CYC1 transcription termi diacid was measured at the indicated intervals by taking a nator 2). The genomic insertion construct also comprises a sample of the cell culture and acidifying to pH~1.0 by addi bacterial origin of replication (the puC origin of replication). tion of 6 NHCl, products were extracted from the cell culture 0072 FIG. 24 shows the levels of C.O)-dicarboxylic acids by diethyl ether and the concentrations of C,c)-dicarboxy and ()-hydroxy fatty acids produced by Candida tropicalis laurate were measured by LC-MS (liquid chromatography strains dpl. dp201 and dp428 (see table 3 for genotypes). mass spectroscopy). Cultures of the yeast strains were grown at 30° c. and 250 rpm 0069 FIG. 21 shows a schematic representation of a DNA for 18 hours in a 500 ml flask containing 30 ml of media f 'genomic insertion' construct for inserting sequences to be (media f is peptone 3 g/l. yeast extract 6 g/l. yeast nitrogen expressed into the genome of yeasts. The general structure is base 6.7 g/l. Sodium acetate 3 g/l. khpo 7.2 g/l. khpo 9.3 that the construct has a gene for expression which is preceded g/l) plus 20 g/l glucose plus 5 g/l ethanol. After 18 hours 3 ml by a promoter that is active in the yeast (Promoter 1). Pro of preculture was added to 27 ml fresh media f plus 20 g/1 moter 1 comprises a linearization position which may be a site glucose plus 5 g/l ethanol in a 500 ml flask, and grown at 30° recognized by a restriction enzyme which cleaves the c. and 250 rpm for 20 hours before addition of substrate. genomic insertion construct once to linearize it, or an anneal Biocatalytic conversion was initiated by adding 40 g/l of ing site for PCR primers to amplify a linear molecule from the methyl myristate, the ph was adjusted to ~7.8 with 2 m naoh. construct. Three positions (A, B and C) are marked in Pro The culture was ph controlled by adding 2 mol/l naoh every moter 1 for reference in FIG.22 when the construct is linear 12 hours, glycerol was fed as cosubstrate by adding 500 g/1 ized. The gene for expression is optionally followed by a glycerol and ethanol was fed as a inducer by adding 50% transcription terminator (Transcription terminator 1). The ethanol every 12 hours. Samples were taken at the times genomic insertion construct also comprises a selectable indicated, cell culture was acidified to ph-1.0 by addition of US 2015/0094.483 A1 Apr. 2, 2015

6 nhcl. products were extracted from the cell culture by was added into cell culture to 2 g/l, and methyl myristate was diethyl ether and the concentrations of co-hydroxy myristate added to 40 g/1 until the total methyl myristate added was 140 and C.O)-dicarboxymyristate were measured by lc-ms (liquid g/l (e.g. the initial 20 g/l plus 3 Subsequent 40 g/l additions). chromatography mass spectroscopy). Formation of products was measured at the indicated inter 0073 FIG. 25 shows the levels of O.co-dicarboxylic acids vals by taking samples and acidifying to ph ~1.0 by addition and ()-hydroxy fatty acids produced by Candida tropicalis of 6 nhcl: products were extracted from the cell culture by strains dp428 and dp522 (see table 3 for genotypes). Cultures diethyl ether and the concentrations of co-hydroxy myristate of the yeast Strains were grown at 30 c. in a dasgip parallel and am-dicarboxymyristate were measured by lc-ms (liquid fermentor containing 200 ml of media f(media fis peptone 3 chromatography mass spectroscopy). g/l, yeast extract 6 g/l. yeast nitrogen base 6.7 g/l. Sodium (0075 FIG. 27 shows the red fluorescent protein mCherry acetate 3 g/l. khpo 7.2 g/l. khpo 9.3 g/l) plus 30 g/l glu produced by Candida tropicalis strain DP197 (see Table 3 for cose. The ph was maintained at 6.0 by automatic addition of genotypes). Cultures of the yeast strains were grown at 30°C. 6 m naoh or 2 m hSo Solution. Dissolved oxygen was kept at on plates containing Buffered Minimal Medium+0.5% Glu 70% by agitation and o-cascade control mode. After 6 hour cose, 0.5% Glycerol, and 0.5% EtOH. growth, ethanol was fed into the cell culture to 5 g/l. After 12 h growth, biocatalytic conversion was initiated by adding (a) 5. DETAILED DESCRIPTION 20 g/l of methyl myristate, (b) 20 g/l oleic acid or (c) 10 g/1 0076. It is to be understood that what is disclosed herein is linoleic acid. During the conversion phase, 80% glycerol was not limited to the particular methodology, devices, solutions fed as co-substrate for conversion of methyl myristate and or apparatuses described, as such methods, devices, Solutions 500 g/l glucose was fed as co-substrate for conversion of oleic or apparatuses can, of course, vary. acid and linoleic acid by dissolved oxygen-stat control mode (the high limit of dissolved oxygen was 75% and low limit of dissolved oxygen was 70%, which means glycerol feeding 5.1. Definitions was initiated when dissolved oxygen is higher than 75% and (0077 Use of the singular forms “a,” “an and “the stopped when dissolved oxygen was lower than 70%). Every include plural references unless the context clearly dictates 12 hour, ethanol was added into cell culture to 2 g/l, and fatty otherwise. Thus, for example, reference to “a polynucleotide' acid substrate was added to 20 g/1 until the total substrate includes a plurality of polynucleotides, reference to “a sub concentration added was (a) 60 g/l of methyl myristate, (b) 60 strate' includes a plurality of such substrates, reference to “a g/l oleic acid or (c)30 g/l linoleic acid. Formation of products variant” includes a plurality of variants, and the like. was measured at the indicated intervals by taking samples and 0078 Terms such as “connected, “attached,” “linked, acidifying to ph ~1.0 by addition of 6 nhcl: products were and "conjugated are used interchangeably herein and extracted from the cell culture by diethyl ether and the con encompass direct as well as indirect connection, attachment, centrations of ()-hydroxy fatty acids and C.co-dicarboxylic linkage or conjugation unless the context clearly dictates acids were measured by lc-ms (liquid chromatography mass otherwise. Where a range of values is recited, it is to be spectroscopy). understood that each intervening integer value, and each frac 0074 FIG. 26 shows the levels of C.O)-dicarboxylic acids tion thereof, between the recited upper and lower limits of that and ()-hydroxy fatty acids produced by Candida tropicalis range is also specifically disclosed, along with each Subrange strain dp428 (see table 3 for genotype) in two separate fer between such values. The upper and lower limits of any range mentor runs. C. Tropicalis dp428 was taken from a glycerol can independently be included in or excluded from the range, stock or fresh agar plate and inoculated into 500 ml shake and each range where either, neither or both limits are flask containing 30 ml of ypd medium (20 g/l glucose, 20 g/1 included is also encompassed in the disclosed embodiments. peptone and 10 g/l yeast extract) and shaken at 30 c. 250 rpm Where a value being discussed has inherent limits, for for 20 hours. Cells were collected by centrifugation and re example where a component can be presentata concentration suspended infm3 medium for inoculation. (fm3 medium is 30 offrom 0 to 100%, or where the pH of an aqueous solution can g/l glucose, 7 g/l ammonium sulfate, 5.1 g/l potassium phos range from 1 to 14, those inherent limits are specifically phate, monobasic, 0.5 g/l magnesium Sulfate, 0.1 g/l calcium disclosed. Where a value is explicitly recited, it is to be chloride, 0.06 g/l citric acid, 0.023 g/l ferric chloride, 0.0002 understood that values which are about the same quantity or g/l biotin and 1 ml/l of a trace elements solution. The trace amount as the recited value are also encompassed. Where a elements solution contains 0.9 g/l boric acid, 0.07 g/l cupric combination is disclosed, each Subcombination of the ele sulfate, 0.18 g/l potassium iodide, 0.36 g/l ferric chloride, ments of that combination is also specifically disclosed and is 0.72 g/l manganese sulfate, 0.36 g/l sodium molybdate, 0.72 within the scope of the disclosed embodiments. Conversely, g/l Zinc sulfate.) Conversion was performed by inoculating 15 where different elements or groups of elements are individu ml of preculture into 135 ml fm3 medium, methyl myristate ally disclosed, combinations thereof are also disclosed. was added to 20 g/l and the temperature was kept at 30c. The Where any embodiment is disclosed as having a plurality of ph was maintained at 6.0 by automatic addition of 6 m naoh or alternatives, examples of that embodiment in which each 2 m hSo Solution. Dissolved oxygen was kept at 70% by alternative is excluded singly or in any combination with the agitation and o-cascade control mode. After six hour growth, other alternatives are also hereby disclosed; more than one ethanol was fed into the cell culture to 5 g/l. During the element of a disclosed embodiment can have such exclusions, conversion phase, 80% glycerol was fed as co-substrate by and all combinations of elements having Such exclusions are dissolved oxygen-stat control mode (the high limit of dis hereby disclosed. solved oxygen was 75% and low limit of dissolved oxygen 0079 Unless defined otherwise herein, all technical and was 70%, which means glycerol feeding was initiated when Scientific terms used herein have the same meaning as com dissolved oxygen is higher than 75% and stopped when dis monly understood by one of ordinary skill in the art. Single solved oxygen was lower than 70%). Every 12 hour, ethanol ton et al., Dictionary of Microbiology and Molecular Biology, US 2015/0094.483 A1 Apr. 2, 2015

2nd Ed., John Wiley and Sons, New York, 1994, and Hale & charged linkages (e.g., phosphorothioates, phosphorodithio Marham, The Harper Collins Dictionary of Biology, Harper ates, etc.), and with positively charged linkages (e.g., ami Perennial, NY. 1991, provide one of ordinary skill in the art noalkylphosphoramidates, aminoalkylphosphotriesters), with a general dictionary of many of the terms used herein. those containing pendant moieties, such as, for example, pro Although any methods and materials similar or equivalent to teins (including enzymes (e.g. nucleases), toxins, antibodies, those described herein can be used in the practice or testing of signal peptides, poly-L-lysine, etc.), those with intercalators the disclosed embodiments, the preferred methods and mate (e.g., acridine, psoralen, etc.), those containing chelates (of rials are described. Unless otherwise indicated, nucleic acids e.g., metals, radioactive metals, boron, oxidative metals, etc.), are written left to right in 5' to 3' orientation; amino acid those containing alkylators, those with modified linkages sequences are written left to right in amino to carboxy orien (e.g., alpha anomeric nucleic acids, etc.), as well as unmodi tation, respectively. The terms defined immediately below are fied forms of the polynucleotide or oligonucleotide. more fully defined by reference to the specification as a I0082. Where the polynucleotides are to be used to express whole. encoded proteins, nucleotides that can perform that function 0080. As used, herein, computation of percent identity or which can be modified (e.g., reverse transcribed) to per takes full weight of any insertions in two sequences for which form that function are used. Where the polynucleotides are to percent identity is computed. To compute percent identity be used in a scheme that requires that a complementary strand between two sequences, they are aligned and any necessary be formed to a given polynucleotide, nucleotides are used insertions in either sequence being compared are then made which permit Such formation. in accordance with sequence alignment algorithms known in I0083. It will be appreciated that, as used herein, the terms the art. Then, the percent identity is computed, where each “nucleoside' and “nucleotide' will include those moieties insertion in either sequence necessary to make the optimal which contain not only the known purine and pyrimidine alignment between the two sequences is counted as a mis bases, but also other heterocyclic bases which have been match. modified. Such modifications include methylated purines or 0081. The terms “polynucleotide. "oligonucleotide.” pyrimidines, acylated purines or pyrimidines, or otherhetero “nucleic acid and “nucleic acid molecule' and “gene' are cycles. Modified nucleosides or nucleotides can also include used interchangeably herein to refer to a polymeric form of modifications on the Sugar moiety, e.g., where one or more of nucleotides of any length, and may comprise ribonucleotides, the hydroxyl groups are replaced with halogen, aliphatic deoxyribonucleotides, analogs thereof, or mixtures thereof. groups, or is functionalized as ethers, amines, or the like. This term refers only to the primary structure of the molecule. I0084 Standard A-T and G-C base pairs form under con Thus, the term includes triple-, double- and single-stranded ditions which allow the formation of hydrogen bonds deoxyribonucleic acid (“DNA), as well as triple-, double between the N3-H and C4-oxy of thymidine and the NI and and single-stranded ribonucleic acid (“RNA). It also C6-NH2, respectively, of adenosine and between the C2-oxy, includes modified, for example by alkylation, and/or by cap N3 and C4-NH2, of cytidine and the C2-NH N' H and ping, and unmodified forms of the polynucleotide. More par C6-oxy, respectively, of guanosine. Thus, for example, gua ticularly, the terms “polynucleotide. "oligonucleotide.” nosine (2-amino-6-oxy-9-beta-D-ribofuranosyl-purine) may “nucleic acid' and “nucleic acid molecule' include be modified to form isoguanosine (2-oxy-6-amino-9-beta-D- poly deoxyribonucleotides (containing 2-deoxy-D-ribose), ribofuranosyl-purine). Such modification results in a nucleo polyribonucleotides (containing D-ribose), including tRNA, side base which will no longer effectively form a standard rRNA, hRNA, siRNA and mRNA, whether spliced or base pair with cytosine. However, modification of cytosine unspliced, any other type of polynucleotide which is an N- or (1-beta-D-ribofuranosyl-2-oxy-4-amino-pyrimidine) to form C-glycoside of a purine or pyrimidine base, and other poly isocytosine (1-beta-D-ribofuranosyl-2-amino-4-oxy-pyrimi mers containing nonnucleotidic backbones, for example, dine-) results in a modified nucleotide which will not effec polyamide (e.g., peptide nucleic acids (“PNAs)) and poly tively base pair with guanosine but will form a base pair with morpholino (commercially available from the Anti-Virals, isoguanosine (U.S. Pat. No. 5,681.702 to Collins et al., hereby Inc., Corvallis, Oreg., as Neugene) polymers, and other syn incorporated by reference in its entirety). Isocytosine is avail thetic sequence-specific nucleic acid polymers providing that able from Sigma Chemical Co. (St. Louis, Mo.); isocytidine the polymers contain nucleobases in a configuration which may be prepared by the method described by Switzer et al. allows for base pairing and base stacking, Such as is found in (1993) Biochemistry 32: 10489-10496 and references cited DNA and RNA. There is no intended distinction in length therein; 2'-deoxy-5-methyl-isocytidine may be prepared by between the terms “polynucleotide. "oligonucleotide.” the method of Tor et al., 1993, J. Am. Chem. Soc. 115:4461 “nucleic acid' and “nucleic acid molecule” and these terms 4467 and references cited therein; and isoguanine nucleotides are used interchangeably herein. These terms refer only to the may be prepared using the method described by Switzer et al., primary structure of the molecule. Thus, these terms include, 1993, supra, and Mantsch et al., 1993, Biochem. 14:5593 for example, 3'-deoxy-2',5'-DNA, oligodeoxyribonucleotide 5601, or by the method described in U.S. Pat. No. 5,780,610 N3' P5' phosphoramidates. 2'-O-alkyl-substituted RNA, to Collins et al., each of which is hereby incorporated by double- and single-stranded DNA, as well as double- and reference in its entirety. Other nonnatural base pairs may be single-stranded RNA, and hybrids thereof including for synthesized by the method described in Piccirilliet al., 1990, example hybrids between DNA and RNA or between PNAs Nature 343:33-37, hereby incorporated by reference in it and DNA or RNA, and also include known types of modifi entirety, for the synthesis of 2,6-diaminopyrimidine and its cations, for example, labels, alkylation, "caps. Substitution complement (1-methylpyrazolo-4.3 pyrimidine-5,7-(4H, of one or more of the nucleotides with an analog, internucle 6H)-dione. Other such modified nucleotidic units which form otide modifications such as, for example, those with unique base pairs are known, such as those described in Leach uncharged linkages (e.g., methyl phosphonates, phosphotri et al., 1992, J. Am. Chem. Soc.114:3675-3683 and Switzer et esters, phosphoramidates, carbamates, etc.), with negatively al., Supra. US 2015/0094.483 A1 Apr. 2, 2015

0085. The phrase “DNA sequence” refers to a contiguous IUPAC-IUB Biochemical Nomenclature Commission. nucleic acid sequence. The sequence can be either single Nucleotides, likewise, may be referred to by their commonly stranded or double stranded, DNA or RNA, but double accepted single-letter codes. stranded DNA sequences are preferable. The sequence can be I0089. The term “expression system” refers to any in vivo an oligonucleotide of 6 to 20 nucleotides in length to a full or in vitro biological system that is used to produce one or length genomic sequence of thousands or hundreds of thou more protein encoded by a polynucleotide. sands of base pairs. DNA sequences are written from 5' to 3' (0090. The term “translation” refers to the process by unless otherwise indicated. which a polypeptide is synthesized by a ribosome reading I0086. The term “protein refers to contiguous “amino the sequence of a polynucleotide. acids” or amino acid “residues. Typically, proteins have a 0091. In some embodiments, the term “disrupt” means to function. However, for purposes of this disclosure, proteins reduce or diminish the expression of a gene in a host cell also encompass polypeptides and Smaller contiguous amino organism. acid sequences that do not have a functional activity. The 0092. In some embodiments, the term “disrupt” means to functional proteins of this disclosure include, but are not reduce or diminish a function of a protein encoded by a gene limited to, esterases, dehydrogenases, hydrolases, oxi in a host cell organism. This function may be, for example, an doreductases, transferases, lyases, ligases, receptors, receptor enzymatic activity of the protein, a specific enzymatic activity ligands, cytokines, antibodies, immunomodulatory mol of the protein, a protein-protein interaction that the protein ecules, signaling molecules, fluorescent proteins and proteins undergoes in a host cell organism, or a protein-nucleic acid with insecticidal or biocidal activities. Useful general classes interaction that the protein undergoes in a host cell organism. of enzymes include, but are not limited to, proteases, cellu 0093. In some embodiments, the term “disrupt” means to lases, lipases, hemicellulases, laccases, amylases, glucoamy eliminate the expression of a gene in a host cell organism. lases, esterases, lactases, polygalacturonases, galactosidases, 0094. In some embodiments, the term “disrupt” means to ligninases, oxidases, peroxidases, glucose isomerases, nit eliminate the function of a protein encoded by a gene in a host rilases, hydroxylases, polymerases and depolymerases. In cell organism. This function may be, for example, an enzy addition to enzymes, the encoded proteins which can be used matic activity of the protein, a specific enzymatic activity of in this disclosure include, but are not limited to, transcription the protein, a protein-protein interaction that the protein factors, antibodies, receptors, growth factors (any of the undergoes in a host cell organism, or a protein-nucleic acid PDGFs, EGFs, FGFs, SCF, HGF, TGFs, TNFs, insulin, IGFs, interaction that the protein undergoes in a host cell organism. LIFs, oncostatins, and CSFs), immunomodulators, peptide I0095. In some embodiments, the term "disrupt” means to hormones, cytokines, integrins, interleukins, adhesion mol cause a protein encoded by a gene in a host cell organism to ecules, thrombomodulatory molecules, protease inhibitors, have a modified activity spectrum (e.g., reduced enzymatic angiostatins, defensins, cluster of differentiation antigens, activity) relative to wild-type activity spectrum of the protein. interferons, chemokines, antigens including those from infec 0096. In some embodiments, disruption is caused by tious viruses and organisms, oncogene products, thrombopoi mutating a gene in a host cell organism that encodes a protein. etin, erythropoietin, tissue plasminogen activator, and any For example, a point mutation, an insertion mutation, a dele other biologically active protein which is desired for use in a tion mutation, or any combination of Such mutations, can be clinical, diagnostic or veterinary setting. All of these proteins used to disrupt the gene. In some embodiments, this mutation are well defined in the literature and are so defined herein. causes the protein encoded by the gene to express poorly or Also included are deletion mutants of Such proteins, indi not at all in the host cell organism. In some embodiments, this vidual domains of Such proteins, fusion proteins made from mutation causes the gene to no longer be present in the host Such proteins, and mixtures of Such proteins; particularly cell organism. In some embodiments, this mutation causes useful are those which have increased half-lives and/or the gene to no longer encode a functional protein in the host increased activity. cell organism. The mutation to the gene may be in the portion of the gene that encodes a protein product (exon), it may be in 0087 "Polypeptide' and “protein’ are used interchange any of the regulatory sequences (e.g., promoter, enhancer, ably herein and include a molecular chain of amino acids etc.) that regulate the expression of the gene, or it may arise in linked through peptide bonds. The terms do not refer to a an intron. specific length of the product. Thus, "peptides. "oligopep 0097. In some embodiments, the disruption (e.g., muta tides, and “proteins are included within the definition of tion) of a gene causes the protein encoded by the gene to have polypeptide. The terms include polypeptides containing in a mutation that diminishes a function of the protein relative to co-and/or post-translational modifications of the polypeptide the function of the wild type counterpart of the mutated pro made in vivo or in vitro, for example, glycosylations, acety tein. lations, phosphorylations, PEGylations and Sulphations. In 0098. As used, herein, the wild type counterpart of a addition, protein fragments, analogs (including amino acids mutated protein is the unmutated protein, occurring in wild not encoded by the genetic code, e.g. homocysteine, orni type host cell organism, which corresponds to the mutated thine, p-acetylphenylalanine, D-amino acids, and creatine), protein. For example, if the mutated protein is a protein natural or artificial mutants or variants or combinations encoded by mutated Candida tropicalis PDX 5, the wild type thereof, fusion proteins, derivatized residues (e.g. alkylation counterpart of the mutated protein is the gene product from of amine groups, acetylations or esterifications of carboxyl naturally occurring Candida tropicalis PDX 5 that is not groups) and the like are included within the meaning of polypeptide. mutated. 0099. As used herein, the wild type counterpart of a 0088 Amino acids” or "amino acid residues' may be mutated gene is the unmutated gene occurring in wild type referred to herein by either their commonly known three letter host cell organism, which corresponds to the mutated gene. symbols or by the one-letter symbols recommended by the For example, if the mutated gene is Candida tropicalis PDX US 2015/0094.483 A1 Apr. 2, 2015

5 containing a point mutation, the wild type counterpart is organism that does not contain the disrupted gene when the Candida tropicalis PDX 5 without the point mutation. first host cell organism and the second host cell organism are 0100. In some embodiments, a gene is deemed to be dis under the same environmental conditions (e.g., temperature, rupted when the gene is not capable of expressing protein in media, etc.). the host cell organism. 0108. In some embodiments, a gene is deemed to be dis 0101. In some embodiments, a gene is deemed to be dis rupted when the abundance of mRNA transcripts that encode rupted when the disrupted gene expresses protein in a first the disrupted gene in a first host cell organism that has the host cell organism that contains the disrupted gene in amounts disrupted gene are 30% or less than the abundance of mRNA that are 20% or less than the amounts of protein expressed by transcripts that encode the gene in second wild type host cell the wild type counterpart of the gene in a second host cell organism that does not contain the disrupted gene when the organism that does not contain the disrupted gene, when the first host cell organism and the second host cell organism are first host cell organism and the second host cell organism are under the same environmental conditions (e.g., temperature, under the same environmental conditions (e.g., same tem media, etc.). perature, same media, etc.). 0109. In some embodiments, a gene is deemed to be dis 0102. In some embodiments, a gene is deemed to be dis rupted when the abundance of mRNA transcripts that encode rupted when the disrupted gene expresses protein in a first the disrupted gene in a first host cell organism that has the host cell organism that contains the disrupted gene in amounts disrupted gene are 40% or less than the abundance of mRNA that are 30% or less than the amounts of protein expressed by transcripts that encode the gene in second wild type host cell the wild type counterpart of the gene in a second host cell organism that does not contain the disrupted gene when the organism that does not contain the disrupted gene, when the first host cell organism and the second host cell organism are first host cell organism and the second host cell organism are under the same environmental conditions (e.g., temperature, under the same environmental conditions (e.g., same tem media, etc.). perature, same media, etc.). 0110. In some embodiments, a gene is deemed to be dis 0103) In some embodiments, a gene is deemed to be dis rupted when the abundance of mRNA transcripts that encode rupted when the disrupted gene expresses protein in a first the disrupted gene in a first host cell organism that has the host cell organism that contains the disrupted gene in amounts disrupted gene are 50% or less than the abundance of mRNA that are 40% or less than the amounts of protein expressed by transcripts that encode the gene in second wild type host cell the wild type counterpart of the gene in a second host cell organism that does not contain the disrupted gene when the organism that does not contain the disrupted gene, when the first host cell organism and the second host cell organism are first host cell organism and the second host cell organism are under the same environmental conditions (e.g., temperature, under the same environmental conditions (e.g., same tem perature, same media, etc.). media, etc.). 0104. In some embodiments, a gene is deemed to be dis 0111. In some embodiments, a gene is deemed to be dis rupted when the disrupted gene expresses protein in a first rupted when the abundance of mRNA transcripts that encode host cell organism that contains the disrupted gene in amounts the disrupted gene in a first host cell organism that has the that are 50% or less than the amounts of protein expressed by disrupted gene are 60% or less than the abundance of mRNA the wild type counterpart of the gene in a second host cell transcripts that encode the gene in second wild type host cell organism that does not contain the disrupted gene, when the organism that does not contain the disrupted gene when the first host cell organism and the second host cell organism are first host cell organism and the second host cell organism are under the same environmental conditions (e.g., same tem under the same environmental conditions (e.g., temperature, perature, same media, etc.). media, etc.). 0105. In some embodiments, a gene is deemed to be dis 0112. In some embodiments, a gene is deemed to be dis rupted when the disrupted gene expresses protein in a first rupted when the abundance of mRNA transcripts that encode host cell organism that contains the disrupted gene in amounts the disrupted gene in a first host cell organism that has the that are 60% or less than the amounts of protein expressed by disrupted gene are 70% or less than the abundance of mRNA the wild type counterpart of the gene in a second host cell transcripts that encode the gene in second wild type host cell organism that does not contain the disrupted gene, when the organism that does not contain the disrupted gene when the first host cell organism and the second host cell organism are first host cell organism and the second host cell organism are under the same environmental conditions (e.g., same tem under the same environmental conditions (e.g., temperature, perature, same media, etc.). media, etc.). 0106. In some embodiments, a gene is deemed to be dis 0113. In some embodiments, a protein is deemed to be rupted when the disrupted gene expresses protein in a first disrupted when the protein has an enzymatic activity that is host cell organism that contains the disrupted gene in amounts 20% or less than the activity of the wild type counterpart of that are 70% or less than the amounts of protein expressed by the protein when the disrupted protein and the wild type the wild type counterpart of the gene in a second host cell counterpart of the protein are under the same conditions (e.g., organism that does not contain the disrupted gene, when the temperature, concentration, pH, concentration of Substrate, first host cell organism and the second host cell organism are salt concentration, etc.). under the same environmental conditions (e.g., same tem 0114. In some embodiments, a protein is deemed to be perature, same media, etc.). disrupted when the protein has an enzymatic activity that is 0107. In some embodiments, a gene is deemed to be dis 30% or less than the activity of the wild type counterpart of rupted when the abundance of mRNA transcripts that encode the protein when the disrupted protein and the wild type the disrupted gene in a first host cell organism that has the counterpart of the protein are under the same conditions (e.g., disrupted gene are 20% or less than the abundance of mRNA temperature, concentration, pH, concentration of Substrate, transcripts that encode the gene in second wild type host cell salt concentration, etc.). US 2015/0094.483 A1 Apr. 2, 2015

0115. In some embodiments, a protein is deemed to be or w/v) or greater, a purity of 65% (w/w or w/v) or greater, a disrupted when the protein has an enzymatic activity that is purity of 70% (w/w or w/v) or greater, a purity of 75% (w/w 40% or less than the activity of the wild type counterpart of or w/v) or greater, a purity of 80% (w/w or w/v) or greater, a the protein when the disrupted protein and the wild type purity of 85% (w/w or w/v) or greater, a purity of 90% (w/w counterpart of the protein are under the same conditions (e.g., or w/v) or greater, a purity of 95% (w/w or w/v) or greater, a temperature, concentration, pH, concentration of Substrate, purity of 99% (w/w or w/v) or greater in the disrupted sample salt concentration, etc.). has a specific enzymatic activity that is 30% or less than the 0116. In some embodiments, a protein is deemed to be specific enzymatic activity of a sample of the wild type coun disrupted when the protein has an enzymatic activity that is terpart of the protein “wild type sample' in which the purity 50% or less than the activity of the wild type counterpart of of the wild type counterpart of the protein in the wild type the protein when the disrupted protein and the wild type sample is the same as or greater than the purity of the dis counterpart of the protein are under the same conditions (e.g., rupted protein in the disrupted protein sample, wherein dis temperature, concentration, pH, concentration of Substrate, rupted protein sample and the sample wild type sample are salt concentration, etc.). under the same conditions (e.g., temperature, concentration, 0117. In some embodiments, a protein is deemed to be pH, concentration of Substrate, salt concentration, etc.). disrupted when the protein has an enzymatic activity that is I0123. In some embodiments, a protein is deemed to be 60% or less than the activity of the wild type counterpart of disrupted when a sample of the disrupted protein “disrupted the protein when the disrupted protein and the wild type sample' having a purity of 50% (w/w or w/v) or greater, a counterpart of the protein are under the same conditions (e.g., purity of 55% (w/w or w/v) or greater, a purity of 60% (w/w temperature, concentration, pH, concentration of Substrate, or w/v) or greater, a purity of 65% (w/w or w/v) or greater, a salt concentration, etc.). purity of 70% (w/w or w/v) or greater, a purity of 75% (w/w 0118. In some embodiments, a protein is deemed to be or w/v) or greater, a purity of 80% (w/w or w/v) or greater, a disrupted when the protein has an enzymatic activity that is purity of 85% (w/w or w/v) or greater, a purity of 90% (w/w 70% or less than the activity of the wild type counterpart of or w/v) or greater, a purity of 95% (w/w or w/v) or greater, a the protein when the disrupted protein and the wild type purity of 99% (w/w or w/v) or greater in the disrupted sample counterpart of the protein are under the same conditions (e.g., has a specific enzymatic activity that is 40% or less than the temperature, concentration, pH, concentration of Substrate, specific enzymatic activity of a sample of the wild type coun salt concentration, etc.). terpart of the protein “wild type sample' in which the purity 0119. In some embodiments enzymatic activity is defined of the wild type counterpart of the protein in the wild type as moles of Substrate converted per unit time ratexreaction sample is the same as or greater than the purity of the dis Volume. Enzymatic activity is a measure of the quantity of rupted protein in the disrupted protein sample, wherein dis active enzyme present and is thus dependent on conditions, rupted protein sample and the sample wild type sample are which are to be specified. The SI unit for enzyme activity is under the same conditions (e.g., temperature, concentration, the katal, 1 katal=1 mol s-1. pH, concentration of Substrate, salt concentration, etc.). 0120 In some embodiments enzymatic activity is 0.124. In some embodiments, a protein is deemed to be expressed as an enzyme unit (EU)=130 umol/min, where 1 U disrupted when a sample of the disrupted protein “disrupted corresponds to 16.67 nanokatals. See Nomenclature Com sample' having a purity of 50% (w/w or w/v) or greater, a mittee of the International Union of Biochemistry (NC-IUB) purity of 55% (w/w or w/v) or greater, a purity of 60% (w/w (1979), “Units of Enzyme Activity.” Eur. J. Biochem. 97: or w/v) or greater, a purity of 65% (w/w or w/v) or greater, a 319-320, which is hereby incorporated by reference herein. purity of 70% (w/w or w/v) or greater, a purity of 75% (w/w 0121. In some embodiments, a protein is deemed to be or w/v) or greater, a purity of 80% (w/w or w/v) or greater, a disrupted when a sample of the disrupted protein “disrupted purity of 85% (w/w or w/v) or greater, a purity of 90% (w/w sample' having a purity of 50% weight per weight (w/w) or or w/v) or greater, a purity of 95% (w/w or w/v) or greater, a weight per volume (w/v) or greater, a purity of 55% (w/w or purity of 99% (w/w or w/v) or greater in the disrupted sample w/v) or greater, a purity of 60% (w/w or w/v) or greater, a has a specific enzymatic activity that is 50% or less than the purity of 65% (w/w or w/v) or greater, a purity of 70% (w/w specific enzymatic activity of a sample of the wild type coun or w/v) or greater, a purity of 75% (w/w or w/v) or greater, a terpart of the protein “wild type sample' in which the purity purity of 80% (w/w or w/v) or greater, a purity of 85% (w/w of the wild type counterpart of the protein in the wild type or w/v) or greater, a purity of 90% (w/w or w/v) or greater, a sample is the same as or greater than the purity of the dis purity of 95% (w/w or w/v) or greater, a purity of 99% (w/w rupted protein in the disrupted protein sample, wherein dis or w/v) or greater in the disrupted Sample has a specific rupted protein sample and the sample wild type sample are enzymatic activity that is 20% or less than the specific enzy under the same conditions (e.g., temperature, concentration, matic activity of a sample of the wild type counterpart of the pH, concentration of Substrate, salt concentration, etc.). protein “wild type sample' in which the purity of the wild 0.125. In some embodiments, a protein is deemed to be type counterpart of the protein in the wild type sample is the disrupted when a sample of the disrupted protein “disrupted same as or greater than the purity of the disrupted protein in sample' having a purity of 50% (w/w or w/v) or greater, a the disrupted protein sample, wherein disrupted protein purity of 55% (w/w or w/v) or greater, a purity of 60% (w/w sample and the sample wild type sample are under the same or w/v) or greater, a purity of 65% (w/w or w/v) or greater, a conditions (e.g., temperature, concentration, pH, concentra purity of 70% (w/w or w/v) or greater, a purity of 75% (w/w tion of Substrate, Salt concentration, etc.). or w/v) or greater, a purity of 80% (w/w or w/v) or greater, a 0122. In some embodiments, a protein is deemed to be purity of 85% (w/w or w/v) or greater, a purity of 90% (w/w disrupted when a sample of the disrupted protein “disrupted or w/v) or greater, a purity of 95% (w/w or w/v) or greater, a sample' having a purity of 50% (w/w or w/v) or greater, a purity of 99% (w/w or w/v) or greater in the disrupted sample purity of 55% (w/w or w/v) or greater, a purity of 60% (w/w has a specific enzymatic activity that is 60% or less than the US 2015/0094.483 A1 Apr. 2, 2015 specific enzymatic activity of a sample of the wildtype coun ments. In these experiments, an equilibrium mixture of terpart of the protein “wild type sample' in which the purity enzyme, Substrate and product is perturbed, for instance by a of the wild type counterpart of the protein in the wild type temperature, pressure or pH jump, and the return to equilib sample is the same as or greater than the purity of the dis rium is monitored. The analysis of these experiments requires rupted protein in the disrupted protein sample, wherein dis consideration of the fully reversible reaction. rupted protein sample and the sample wild type sample are 0.132. In some embodiments, the enzymatic activity or under the same conditions (e.g., temperature, concentration, enzymatic specific activity is measured by continuous assays, pH, concentration of Substrate, salt concentration, etc.). where the assay gives a continuous reading of activity, or 0126. In some embodiments, a protein is deemed to be discontinuous assays, where samples are taken, the reaction disrupted when a sample of the disrupted protein “disrupted stopped and then the concentration of Substrates/products sample' having a purity of 50% (w/w or w/v) or greater, a determined. purity of 55% (w/w or w/v) or greater, a purity of 60% (w/w I0133. In some embodiments, the enzymatic activity or or w/v) or greater, a purity of 65% (w/w or w/v) or greater, a enzymatic specific activity is measured by a fluorometric purity of 70% (w/w or w/v) or greater, a purity of 75% (w/w assay (e.g., Bergmeyer, 1974, "Methods of Enzymatic Analy or w/v) or greater, a purity of 80% (w/w or w/v) or greater, a sis”, Vol. 4, Academic Press, New York, N.Y., 2066-2072), a purity of 85% (w/w or w/v) or greater, a purity of 90% (w/w calorimetric assay (e.g., Todd and Gomez, 2001, Anal Bio or w/v) or greater, a purity of 95% (w/w or w/v) or greater, a chem. 296, 179-187), a chemiluminescent assay, a light scat purity of 99% (w/w or w/v) or greater in the disrupted sample tering assay, a radiometric assay, or a chromatrographic assay has a specific enzymatic activity that is 70% or less than the (e.g., Churchwella et al., 2005, Journal of Chromatography specific enzymatic activity of a sample of the wildtype coun B825, 134-143). terpart of the protein “wild type sample' in which the purity I0134. In some embodiments, a protein is deemed to be of the wild type counterpart of the protein in the wild type disrupted when the protein has a function whose performance sample is the same as or greater than the purity of the dis is 20% or less than the function of the wildtype counterpart of rupted protein in the disrupted protein sample, wherein dis the protein when the disrupted protein and the wild type rupted protein sample and the sample wild type sample are counterpart of the protein are under the same conditions (e.g., under the same conditions (e.g., temperature, concentration, temperature, concentration, pH, concentration of Substrate, pH, concentration of Substrate, salt concentration, etc.). salt concentration, etc.). 0127. In some embodiments, the enzymatic activity or I0135) In some embodiments, a protein is deemed to be enzymatic specific activity is measured by an assay that mea disrupted when the protein has a function whose performance Sures the consumption of Substrate or the production of prod is 30% or less than the function of the wildtype counterpart of uct over time such as those disclosed in Schnell et al., 2006, the protein when the disrupted protein and the wild type Comptes Rendus Biologies 329, 51-61, which is hereby counterpart of the protein are under the same conditions (e.g., incorporated by reference herein. temperature, concentration, pH, concentration of Substrate, 0128. In some embodiments, the enzymatic activity or salt concentration, etc.). enzymatic specific activity is measured by an initial rate 0.136. In some embodiments, a protein is deemed to be experiment. In Such an assay, the protein (enzyme) is mixed disrupted when the protein has a function whose performance with a large excess of the Substrate, the enzyme-substrate is 40% or less than the function of the wildtype counterpart of intermediate builds up in a fast initial transient. Then the the protein when the disrupted protein and the wild type reaction achieves a steady-state kinetics in which enzyme counterpart of the protein are under the same conditions (e.g., Substrate intermediates remains approximately constant over temperature, concentration, pH, concentration of Substrate, time and the reaction rate changes relatively slowly. Rates are salt concentration, etc.). measured for a short period after the attainment of the quasi 0.137 In some embodiments, a protein is deemed to be steady state, typically by monitoring the accumulation of disrupted when the protein has a function whose performance product with time. Because the measurements are carried out is 50% or less than the function of the wildtype counterpart of for a very short period and because of the large excess of the protein when the disrupted protein and the wild type Substrate, the approximation free Substrate is approximately counterpart of the protein are under the same conditions (e.g., equal to the initial substrate can be made. The initial rate temperature, concentration, pH, concentration of Substrate, experiment is relatively free from complications such as salt concentration, etc.). back-reaction and enzyme degradation. 0.138. In some embodiments, a protein is deemed to be 0129. In some embodiments, the enzymatic activity or disrupted when the protein has a function whose performance enzymatic specific activity is measured by progress curve is 60% or less than the function of the wildtype counterpart of experiments. In such experiments, the kinetic parameters are the protein when the disrupted protein and the wild type determined from expressions for the species concentrations counterpart of the protein are under the same conditions (e.g., as a function of time. The concentration of the substrate or temperature, concentration, pH, concentration of Substrate, product is recorded in time after the initial fast transient and salt concentration, etc.). for a sufficiently long period to allow the reaction to approach 0.139. In some embodiments, a protein is deemed to be equilibrium. disrupted when the protein has a function whose performance 0130. In some embodiments, the enzymatic activity or is 70% or less than the function of the wildtype counterpart of enzymatic specific activity is measured by transient kinetics the protein when the disrupted protein and the wild type experiments. In Such experiments, reaction behaviour is counterpart of the protein are under the same conditions (e.g., tracked during the initial fast transient as the intermediate temperature, concentration, pH, concentration of Substrate, reaches the steady-state kinetics period. salt concentration, etc.). 0131. In some embodiments, the enzymatic activity or 0140. In some embodiments, a protein is disrupted by a enzymatic specific activity is measured by relaxation experi genetic modification. In some embodiments, a protein is dis US 2015/0094.483 A1 Apr. 2, 2015 rupted by exposure of a host cell to a chemical (e.g., an procedures using Such conditions of high Stringency are as inhibitor that substantially reduces or eliminates the activity follows. Prehybridization offilters containing DNA is carried of the enzyme). In some embodiments, this compound satis out for 8 hours to overnight at 65 C in buffer composed of fies the Lipinski’s Rule of Five: 30 (i) not more than five 6xSSC, 50 mM Tris-HCl (pH 7.5), 1 mM EDTA, 0.02% PVP. hydrogen bond donors (e.g., OH and NH groups), (ii) not 0.02% Ficoll, 0.02% BSA, and 500 mg/ml denatured salmon more than ten hydrogen bond acceptors (e.g. NandO), (iii) a sperm DNA. Filters are hybridized for 48 hours at 65 C in molecular weight under 500 Daltons, and (iv) a Log P under prehybridization mixture containing 100 mg/ml denatured 5. The “Rule of Five' is so called because three of the four salmon sperm DNA and 5-20x106 cpm of 32P-labeled probe. criteria involve the number five. See, Lipinski, 1997, Adv. Washing of filters is done at 37 C for one hour in a solution Drug Del. Rev. 23, 3, which is hereby incorporated herein by containing 2xSSC, 0.01% PVP 0.01% Ficoll, and 0.01% reference in its entirety. BSA. This is followed by a wash in 0.1xSSC at 50 C for 45 0141. In some embodiments, the invention relates to minutes before autoradiography. Other conditions of high nucleic acids hybridized using conditions of low stringency stringency that may be used are well known in the art. (low stringency conditions). By way of example and not 0144. As used herein, computation of percent identity limitation, hybridization using Such conditions of low strin takes full weight of any insertions in two sequences for which gency are as follows (see also Shilo and Weinberg, 1981, percent identity is computed. To compute percent identity Proc. Natl. Acad. Sci. U.S.A. 78:6789-6792): filters contain between two sequences, they are aligned and any necessary ing DNA are pretreated for 6 hours at 40°C. in a solution insertions in either sequence being compared are then made containing 35% formamide, 5xSSC, 50 mM Tris-HCl (pH in accordance with sequence alignment algorithms known in 7.5), 5 mM EDTA, 0.1% PVP, 0.1% Ficoll, 1% BSA, and 500 the art. Then, the percent identity is computed, where each mg/ml denatured salmon sperm DNA. Hybridizations are insertion in either sequence necessary to make the optimal carried out in the same solution with the following modifica alignment between the two sequences is counted as a mis tions: 0.02% PVP 0.02% Ficoll, 0.2% BSA, 100 mg g/ml match. Unless explicitly indicated otherwise, the percent salmon sperm DNA, 10% (wt/vol) dextran sulfate, and 5-20x identity of two sequences is the percent identity across the 106 cpm 32P-labeled probe. Filters are incubated in hybrid entire length of each of the sequences being compared, with ization mixture for 18-20hours at 40°C., and thenwashed for gaps insertions processed as specified in this paragraph. 1.5 hour at 55° C. in a solution containing 2xSSC, 25 mM Tris-HCl (pH 7.4), 5 mM EDTA, and 0.1% SDS. The wash solution is replaced with fresh solution and incubated an 5.2. Enzymes to Derive and Utilize Sugar from Plant additional 1.5 hour at 60° C. Filters are blotted dry and Cell Walls and Plant Starches exposed for autoradiography. If necessary, filters are washed for a third time at 65-68° C. and reexposed to film. Other 0145 Many biofuel production pathways start from Sug conditions of low stringency that may be used are well known ars which are expensive and compete, directly or indirectly, in the art (e.g., as employed for cross-species hybridizations). with food crops. Commercially advantageous production pathways are those that begin with cheaper raw materials 0142. In some embodiments, the invention relates to Such as agricultural by-products, or agricultural products that nucleic acids under conditions of moderate stringency (mod require minimal processing for example cell wall material. erately stringent conditions). As used herein, conditions of moderate stringency (moderately stringent conditions), areas 0146 In addition to naturally occurring enzymes, modi known to those having ordinary skill in the art. Such condi fied enzymes may be added into the host genome. For tions are also defined by Sambrook et al., Molecular Cloning: example enzymes may be altered by incorporating system A Laboratory Manual, 2nd Ed. Vol. 1, pp. 1.101-104, Cold atically varied sets of amino acid changes, with the resulting Spring Harbor Laboratory Press, 1989, which is hereby incor changes in phenotypes measured and used to identify porated by reference herein in its entirety. They include, for sequence changes conferring improved function (see for example, use of a prewashing Solution for the nitrocellulose example Liao et al., 2007, BMC Biotechnol 7: 16; Ehren et filters 5xSSC, 0.5% SDS, 1.0 mM EDTA (pH 8.0), hybrid al., 2008, Protein Eng Des Sel21:699-707 and Heinzelmanet ization conditions of 50 percent formamide, 6xSSC at 42°C. al., 2009, Proc Natl AcadSci USA 106:5610-5615). (or other similar hybridization solution, or Stark's solution, in 50% formamide at 42°C.), and washing conditions of about 5.2.1. Enzymes for Cellulose, Hemicelluose, and 60° C., 0.5xSSC, 0.1% SDS. See also, Ausubel et al., eds., in Lignocellulose Degradation the Current Protocols in Molecular Biology series of labora tory technique manuals, (C) 1987-1997. Current Protocols, 0147 Organisms capable of generating enzymes for the 1994-1997, John Wiley and Sons, Inc., hereby incorporated breakdown of cellulose, hemicellulose, and pectin include, by reference herein in its entirety. The skilled artisan will Trichoderma viride, Fusarium oxysporium, Piptoporus betu recognize that the temperature, salt concentration, and chao linus, Penicillium echinulatum, Penicillium purpurogenium, trope composition of hybridization and wash Solutions can be Penicillium rubrum, Aspergillus niger, Aspergillus fumigatus, adjusted as necessary according to factors such as the length Aspergillus phoenicus, Sporotrichum thermophile, Scyta and nucleotide base composition of the probe. Other condi lidium thermophillum, Clostridium straminisolvens, Ther tions of moderate Stringency that may be used are well known monospora curvata, Rhodospirillum rubrum, Cellulomonas in the art. fini, Clostridium Stercorarium, Bacillus polymyxa, Bacillus 0143. In some embodiments, the invention relates to coagulans, Pyrococcu firiosus, Acidothermus cellulolyticus, nucleic acids under conditions of high Stringency (high Strin Saccharophagus degradans, Neurospora crass, Humicola gent conditions). As used herein conditions of high Stringency fiascoatra, Chaectonium globosum, Thielavia terrestris-255, (high Stringent conditions) are as known to those having Mycelieopthra fergussi-246C, Aspergillus wentii, Aspergil ordinary skill in the art. By way of example and not limitation, lus Ornatus, Pleurotus florida, Pleurotus cornucopiae, Tra US 2015/0094.483 A1 Apr. 2, 2015

mates versicolor, Bacteroides thetaiotaOmicron, and Nectria Zation. Xylanases hydrolyze the B-1,4-Xylan linkage of hemi catalinensis; see Kumaret al., 2008, JInd Microbiol Biotech cellulose to produce the pentose Xylose. There are a large nol: 35, 377-91. number of distinct Xylanase protein families. Some fungi secrete Xylanase isozymes: Trichoderme viride makes 13 and 5.2.1.1. Cellulose Asperigillus niger produces 15. Xylanases will be an increas ing important component of hemicellulose utilization as an 0148 Cellulose is a homopolymeric compound composed added enzyme or part of an integrated bioprocessing system off-D-glucopyranose units, linked by a B-(1->4)-glycosidic produced in situ by a suitable organism. Xylanases would be bond and represents the most abundant polysaccharide in of utility in a Candida Strain configured for cellulose degra plant cell walls. dation. 0149 Trichoderma reesei is one of the prototypical cellu lose metabolizing fungi. It encodes genes for 3 enzyme classes required for the degradation of cellulose to glucose. 5.2.1.3. Pectin: These are Exoglucanases or cellobiohydrolases (genes CBH1 0155 Pectins are the third main structural polysaccharide and CBH2), Endoglucanases (genes EG1, EG2, EG3, EG5) of plant cell walls. Pectins are abundant in Sugar beat pulp and and B-glucosidase (gene BGL1). Genes for these 3 classes of fruits, e.g., citrus and apples, where it can form up to /2 the enzymes could be expressed and secreted from a modified C. polymeric content of cell walls. The pectin backbone consists tropicalis strainto allow it to generate glucose from cellulose. of homo-galacturonic acid regions and neutral Sugar side 0150. Clostridium thermocellum is a prototypical cellu chains from L-rhamnose, arabinose, galactose, and Xylose. lose degrading bacterium. It encodes numerous genes that L-rhamnose residues in the backbone carry sidechains con form the cellulosome, a complex of enzymes used in the taining arabinose and galactose. Pectin degrading enzymes degradation of cellulose. Enzymes participate in the forma include pectin lyase, endo-polygalacturonase, C.-arabinofura tion of the cellulosome include scaffoldin (cipA), cellulase nosidase, C.-galactosidase, polymethylgalacturonase, pectin (cell), cellobiohydrolase (cbha, celK, cello), Xylanase depolymerase, pectinase, exopolygalacturanosidase hydro (xynY. Xynz. Xyn A, xynO, XynC, xynD, XynB, XynV), lase, C-L-Rhamnosidase, C-L-Arabinofuranosidase, polym endoglucanase (celH, celE, celS, celF, celN, celQ, celD, celB. ethylgalacturonate lyase (pectin lyase), polygalacturonate celT, celG, celA), mannanase (manA), chitinase (chiA). lyase (pectate lyase), exopolygalacturonate lyase (pectate lichenase (licE) and a protein with unknown function Csep disaccharide-lyase). Pectinases would be of utility in a Can (cseP). dida Strain configured for cellulose degradation 0151 Encoding all or a subset of the genes required to replicate the C. thermocellum cellulosome, component 5.2.2. Biological Delignification enzymes or engineered derivatives would be of utility in a Candida Strain configured for cellulose degradation. There is 0156 The white rot fungi are a diverse group of Basidi emerging evidence that effective hydrolysis of cellulose omycetes that are capable of completely degrading all the requires a multicomponent system like the cellulosome that major components of plant cell walls, including cellulose, interacts with the substrate and the surface of the cell. Cellu hemicellulose and lignin. Phanerochaete chrysosporium is a losomes are nanomachines consisting cellulase catalytic prototypical example that has recently been the focus of a modules, carbohydrate binding domains that lock into the genome sequencing and anotization project. See review of Substrate, and dockerins plus cohesions that serve to connect genome project and genes used in delignification (Kersten et the catalytic and carbohydrate binding domains to the Surface al., 2007, Fungal Genet Biol: 44, 77-87). of the bacterial cell that is expressing the cellulosome. 0157 Lignocellulosic biomass refers to plant biomass that is composed of cellulose, hemicellulose, and lignin. The car 5.2.1.2. Hemicellulose: bohydrate polymers (cellulose and hemicelluloses) are tightly bound to the lignin, by hydrogen and covalent bonds. 0152 Hemicellulose is the second most abundant compo Biomass comes in many different types, which may be nent of plant cell walls. Hemicelluloses are heterogeneous grouped into four main categories: (1) wood residues (includ polymers built up by many different Sugar monomers. In ing sawmill and paper mill discards), (2) municipal paper contrast, cellulose contains only anhydrous glucose. For waste, (3) agricultural residues (including corn Stover and instance, besides glucose, Sugar monomers in hemicellulose Sugarcane bagasse), and (4) dedicated energy crops (which can include Xylose, mannose, galactose, rhamnose, and ara are mostly composed of fast growing tall, woody grasses). binose. Hemicelluloses contain most of the D-pentose Sugars, Fermentation of lignocellulosic biomass to ethanol is an and occasionally small amounts of L-Sugars as well. Xylose is attractive route to energy feedstocks that Supplements the always the Sugar monomer present in the largest amount, but depleting stores of fossil fuels. Biomass is a carbon-neutral mannuronic acid and galacturonic acid also tend to be Source of energy, since it comes from dead plants, which present. means that the combustion of ethanol produced from ligno 0153. Hemicellulose degrading enzymes include the celluloses will produce no net carbon dioxide in the earth's Xylan degrading enzymes (endo-B-Xylanase, C-glucu atmosphere. Also, biomass is readily available, and the fer ronidase, C.-arabinofuranosidase, and B-Xylosidase) and glu mentation of lignocelluloses provides an attractive way to comannan degrading enzymes (B-mannanase and B-man dispose of many industrial and agricultural waste products. nosidase). Xylan is the predominant component of Finally, lignocellulosic biomass is a very renewable resource. hemicellulose from hardwood and agricultural plants, like Many of the dedicated energy crops can provide high-energy grasses and stray. Glucomannan is the dominant component biomass, which may be harvested multiple times each year. of hemicellulose from hardwood. 0158. One barrier to the production of biofuels from bio 0154 Cellulose does not typically exist in nature by itself mass is that the Sugars necessary for fermentation are trapped and so other enzymes are needed for effective biomass utili inside the lignocellulose. Lignocellulose has evolved to resist US 2015/0094.483 A1 Apr. 2, 2015 degradation and to confer hydrolytic Stability and structural 5.3. Potential Feedstocks Used Directly or Following robustness to the cell walls of the plants. This robustness or Enzymatic, Physical, Chemical, and or Mechanical “recalcitrance' is attributable to the crosslinking between the Pretreatment polysaccharides (cellulose and hemicellulose) and the lignin 0.165 Almost anything derived from the Kingdom Plan via ester and ether linkages. Ester linkages arise between tae, and more specifically anything containing, lingnocellu oxidized Sugars, the uronic acids, and the phenols and phe lose, cellulose, hemicellulose, pectin, and/or starch can be nylpropanols functionalities of the lignin. To extract the fer used as a feedstock for the production of biofuels. mentable Sugars, one must first disconnect the celluloses from 0166 The heterogeneous structure of the lignin polymer the lignin, and then acid-hydrolyze the newly freed celluloses renders it highly difficult to degrade. Lignin degradation to break them down into simple monosaccharides. Another occurs quite slowly in nature via the action of wood rot fungi challenge to biomass fermentation is the high percentage of that produce ligninases. These fungi and some bacteria pentoses in the hemicellulose. Such as Xylose, or wood Sugar. recycle the carbon locked in woody plants taking years to Unlike hexoses, like glucose, pentoses are difficult to fer digest a large tree. A major strategy for increasing availability ment. The problems presented by the lignin and hemicellu of sugar polymers is to genetically decrease the lignin content lose fractions are the foci of much contemporary research. of plants. Alfalfa lines downregulated in several steps of 0159. Hundreds of sequences from P. chrysosporium are lignin biosynthesis were tested for Sugar release during predicted to encode extracellular enzymes including many chemical saccharification with promising results. Plant with oxidative enzymes potentially involved in lignocellulose deg the lowest lignincompensated by making more carbohydrate. radation, including peroxidases, copper radical oxidases, Moreover, the carbohydrate was more readily released with FAD-dependent oxidases, and multicopper oxidases. The decreasing lignin. Sugars present were Xylose, arabinose, oxidases and peroxidases are responsible for generating reac glucose, and galactose that were representative of hemicellu tive and nonspecific free radicals that affect lignin degrada losic and pectic cell wall polymers (Chen et al., 2007, Nat tion. Enzymes that accelerate the rate of lignocellulose deg Biotechnol: 25,759-61). radation would be of utility in a Candida strain configured for 5.3.1. Physical, Chemical, and/or Mechanical Lignocellulose cellulose degradation. Pre-treatments 0160 Large and complex families of cytochrome P450s, 0.167 Lignocellulosic substrates used by an engineered C. peroxidases, glycoside hydrolases, proteases, copper radical tropicalis Strain may include one or more of the following oxidases and multicopper oxidases are observed in the P pretreatments: mechanical pretreatment (milling), thermal chrysosporium genome. Structurally related genes may pretreatment (steam pretreatment, steam explosion, and/or encode proteins with subtle differences in functions, and such liquid hot waterpretreatment), alkaline pretreatment, oxida diversity may provide flexibility needed to change environ tive pretreatment, thermal pretreatment in combination with mental conditions (pH, temperature, ionic strength), Substrate acid pretreatment, thermal pretreatment in combination with composition and accessibility, and wood species. Alterna alkaline pretreatment, thermal pretreatment in combination tively, some of the genetic multiplicity may merely reflect with oxidative pretreatment, thermal pretreatment in combi redundancy. nation with alkaline oxidative pretreatment, ammonia and 0161 Lignin peroxidases (LiP) and manganese peroxi carbon dioxide pretreatment, enzymatic pretreatment, and/or dases (MnP) have been the most intensively studied extracel pretreatment with an engineered organism (Hendriks et al., luar enzymes of P. chrysosporium. Also, implicated in ligno 2009, Bioresour Technol: 100, 10-8). cellulose degradation are, copper radical oxidases (e.g., glyoxal oxidase, GLX), flavin and cytochrome enzymes Such 5.4. Sugars Derived from Plant Cell Walls that May as, cellobiose dehydrogenase (CDH), glucose oxidases (glu Require Engineering of C. Tropicalis cose 1-oxidase and glucose 2-oxidase), aryl alcohol oxidases, 0168 Plant biomass hydrolysates contain carbon sources Veratryl alcohol oxidase, multicopper oxidases (mcol). that may not be readily utilized by yeast unless appropriate 0162 Proteases produced by P. chrysosporium may be enzymes are added via metabolic engineering (van Mans et involved in activation of cellulase activity. P. chrysosporium al., 2006, Antonie Van Leeuwenhoek: 90, 391-418). For apparently does not code for laccases, which are used by other example, S. cerivisiae readily ferments glucose, mannose, organisms for lignocellulose degradation. and fructose via the Embden-Meyerhof pathway of glycoly 0163. Other lignocellulose degrading organisms include sis, while galactose is fermented via the Leloir pathway. Pleurotus erygii (has a versatile peroxidase that exhibits both Construction of yeast strains that efficiently convert other LiP and MnP activities), Cyathus sp., Streptomyces viri potentially fermentable substrates in plant biomass will dosporus T7A (the lignin peroxidase, LiP has been studied in require metabolic engineering. The most abundant of these some detail), Phelebia tremellosus, Pleurotus florida, Peuro compounds is xylose. Other fermentable substrates include tuS COrnucopiae, Pleurotus Ostreatus, Trametes versicolor; L-arabinose, galacturonic acid, and rhamnose. Irpex lacteus, Ganoderma lucidum, Ganoderma applanatum, Coriolus versicolor; Aspergillus 2BNL1, Aspergillus 5.4.1. Xylose Fermentation 1AAL1, Lentinus edodes UEC 2019, Ceriporiopsis subver 0169 Xylose-fermenting yeasts link xylose metabolism mispora, Panus conchatus. to the pentose-phosphate pathway. These yeasts use two oxi doreductases, Xylose reductase (XR) and Xylitol dehydroge 5.2.3. Enzymes Needed for Utilization of Starch: nase (XDH), to convert xylose to xylulose 5-phosphate, which enters the pentose phosphate pathway. 0164. Enzymes for saccharification include C.-amylases, 0170 Although strains of S. cerivisiae that express both B-amylases, y-amylases, glucoamylase, maltogenase and xylose reductase (XR) and xylitol dehydrogenase (XDH) pullanase. have been constructed, anaerobic fermentation was accom US 2015/0094.483 A1 Apr. 2, 2015

panied by considerable xylitol production. For every one described in (Beckeret al., 2003, Appl Environ Microbiol: 69. NADPH used by XR, one NADH needs to be reoxidized,and 4144-50). In this work the bacterial L-arabinose operon con the only way to do it be the engineered yeasts is to produce sisted of E. coli araB and araD and Bacillus subtilis araA, Xylitol, although ethanol vs. xylitol production can be along with overexpression of the yeast galactose permease impacted both positively and negatively by starting strain, gene (GAL2). Gallip is known to transport L-arabinose. Source of heterologous enzymes, and culture conditions. Ide Although overexpression of these enzymes did not result in ally, the XR and XDH can be engineered to be linked to the immediate growth on L-arabinose as the Sole carbon Source, same coenzyme system eliminating the production of excess the growth rate of the transformants increased progressively NADH in the process of ethanol production. after 4-5 days incubation. Eventually an L-arabinose-utiliz 0171 One of the most successful examples of engineering ing Strain was selected after several sequential transfers in S. cerivisiae for ethanol production from Xylose uses the L-arabinose medium. In addition to being able to grow aero fungal Xylose isomerase (XylA) from obligately anaerobic bically on L-arabinose, the evolved strain produced ethanol fungi Piromyces sp. E2. The introduction of the Xy 1A gene from L-arabinose at 60% the theoretical yield under oxygen was sufficient to enable the resulting strain to grow slowly limited conditions. An enhanced transaldolase (TALI) activ with Xylose as sole carbon source under aerobic conditions. ity was reported to enhance L-arabinose fermentation and Via an extensive selection procedure a new strain was derived overexpression of GAL2 was found not to be essential for (Kuyperet al., 2005, FEMSYeast Res: 5,399-409) which was growth on L-arabinose, Suggesting that other yeast Sugar capable of anaerobic growth on Xylose producing mainly transporters can also transport L-arabinose. A similar ethanol, CO2, glycerol, biomass, and notably little xylitol. approach would be feasible in Candida, re-coding the genes The ethanol production rate was considered still too low for to be better expressed in Candida, and to remove those industrial applications. To obtain a higher specific rate of codons that are non-canonical in Candida. ethanol production, a strain was constructed that in addition to the Xy 1A gene, overexpressed all genes involved in the 5.4.3. Galacturonic Acid Fermentation: conversion of Xylose into the intermediates of glycolysis, 0177 Reduction of galacturonic acid to the same level of including Xylulokinase, ribulose 5-phosphate isomerase, a hexose requires the input of two electron pairs, for instance ribulose 5-phosphate epimerase, transketolase, and transal via two NADH-dependent reduction steps. Galacturonic acid dolase. In addition the gene GRE3, encoding aldose red is a major component of pectin and therefore occurs in all cutase, was deleted to further minimize xylitol production. plant biomass hydrolysates. Pectin-rich residues from citrus The resulting strain could be cultivated under anaerobic con fruit, apples, Sugar cane and Sugar beets contain especially ditions without further selection or mutagenesis and at the large amounts of D-galacturonic acid. If D-galacturonic acid time had the highest reported specific ethanol production rate. can be converted to ethanol, this would increase the relevance 0172 Candida tropicalis has been shown to be able to of these abundantly available feedstocks. ferment xylose to ethanol (Zhang et al., 2008, Sheng Wu 0.178 Several yeasts, e.g., Candida and Pichia, can grow Gong Cheng Xue Bao: 24, 950-6.) Pichia stipitis is another on D-galacturonic acid, and therefore potential sources for yeast that is able to ferment Xylose to alcohol and being transport enzymes and a heterologous pathway if needed. studied (Agbogbo et al., 2008, Appl Biochem Biotechnol: 0179 The ability to utilize D-galacturonic acid is wide 145, 53-8). spread among bacteria, which all seem to use the same meta bolic pathway. In the bacterial pathway, D-galacturonic acid 5.4.2. L-Arabinose Fermentation is converted to pyruvate and glyceraldehydes-3-phosphate 0173 Although D-xylose is the most abundant pentose in via a five-step pathway. Overall this results in the conversion hemicellulosic Substrates, L-arabinose is present in signifi of D-galacturonic acid, NADH, and ATP into pyruvate, glyc cant amounts, thus the importance of converting arabinose to eraldehydes-3-phosphate and water. Glyceraldehyde-3-phos ethanol. phate can be converted to equimolar amounts of ethanol and 0.174 Saccharomyces cannot ferment or assimilate L-ara CO2 via standard glycolytic reactions yielding 2 ATP. How binose. Although many types of yeast are capable of assimi ever, conversion of pyruvate to ethanol requires oxidation of lating L-arabinose aerobically, most are unable to ferment it a second NADH. to ethanol. Some Candida species are able to make arabinose 0180. During anaerobic growth and fermentation on Sug fermentation to ethanol, but production rates are low. ars (hexoses, but also Xylose by engineered Xylose-ferment 0175 L-arabinose fermentation may be-rare among ing strains) of S. Cerivisiae, a significant fraction of the carbon yeasts due to a redox imbalance in the fungal L-arabinose is channeled into glycerol to compensate for oxidative, pathway, therefore an alternative approach to using the fungal NADH-generating reactions in biosynthesis. enzymes is to construct L-arabinose fermenting yeast by 0181. In theory, introduction of the prokaryotic glactur overexpression of the bacterial L-arabinose pathway. In the onic acid fermentation route into yeast can create an alterna bacterial pathway no redox reactions are involved in the ini tive redox sink for the excess NADH formed in biosynthesis. tial steps of L-arabinose metabolism. Instead the enzymes, This would have two advantages. Firstly, the NADH derived L-arabinose isomerase, L-ribulokinase, and L-ribulose-5- from biosynthetic processes can be used to increase ethanol phosphate 4-epimerase are involved in converting L-arabi yield on glacturonic acid to 2 mol ethanol per mol of glactu nose to L-ribulose-5-phosphate and D-xylulose-5-phosphate, ronic acid, as the pyruvate formed can now be converted to respectively. These enzymes are encoded by the araA, araB, ethanol. Secondly, since the Sugar requirements production and ara) genes respectively. for glycerol are reduced, the ethanol yield on Sugars will 0176 A first attempt to express the E. coli genes in S. increase. cerivisiae was only partly successful, with the Strain gener 0182 Bacterial D-galacturonate catabolism uses the fol ating only L-arabinitol. One of the most promising examples lowing enzymes: D-galacturonate isomerase, altronate oxi of S. Cerivisiae engineering for L-arabinose fermentation is doreductase, altronate dehydratase, 2-dehydro-3-deoxyglu US 2015/0094.483 A1 Apr. 2, 2015 conokinase, 2-keto-3-deoxy-6-phosphogluconate aldolase, tion of L-rhamnose by NAD+-dependent L-rhamnose dehy glyceraldehydes-3-phosphate. Although a large number of drogenase, yielding either L-rhamnono-1,4-lactone or the yeasts and molds use galacturonic acid as carbon and energy unstable rhamnono-1,5-lactone. The 1.4 lactone is hydro for growth, knowledge of the underlying metabolic process is lyzed to L-rhamnonate by L-rhamnono-1,4-lactonase. The limited. At present, the prokaryotic pathway offers the most unstable 1.5-lactone has been reported to spontaneously promising approach for engineering Candida for galactur hydrolyze to L-rhamnonate. L-Rhamnonate is Subsequently onic acid metabolism. dehydrated to 2-keto-3-deoxy-L-rhamnonate by L-rhamnon ate dehydratase. The product of this reaction is then cleaved 5.4.4. L-Rhamnose Fermentation: into pyruvate and L-lactaldehyde by an aldolase. In P stipitis 0183 The deoxyhexose L-rhamnose is named after the the thus formed L-lactaldehyde is converted to lactate and plant it was first isolated from: the buckthorn (Rhamnus). In NADH by lactaldehyde dehydrogenase. Introduction of this contrast with most natural Sugars, L-rhamnose is much more fungal pathway into S. cerivisiae should enable the conver common than D-rhamnose. It occurs as part of the rhamnoga sion of L-rhamnose to equimolar amounts of ethanol, lactal lacturonan of pectin and hemicellulose. Being a 6-deoxy dehyde and CO2 without a net generation of ATP. This con Sugar, L-rhamnose is more reduced than the rapidly ferment version would require the introduction of a transporter and able Sugars glucose and fructose. four heterologous enzymes (including 1.4-lactonase). 0184 S. cerivisiae cannot grow on L-rhamonose. The metabolic engineering of S. Cerivisiae for the production of 5.4.5. Inhibitor Tolerance: ethanol will have to address two key aspects: the enhance 0191 The harsh conditions that prevail during the chemi ment of rhamnose transport across the plasma membrane and cal and physical pretreatment of ligncullulse result in the the introduction of a rhamnose-metabolizing pathway. release of many Substances that inhibit growth and produc 0185. Two possible strategies to engineer uptake follow. tivity of microorganisms such as S. cerivisiae. The number Firstly, after introduction of an ATP-yielding pathway for and identity of the toxic compounds varies with the nature of L-rhamnose catabolism (see below), selection for growth on the raw material and pretreatment conditions. L-rhamnose can be used to investigate whether or not muta tions in hexose transporters enable uptake of L-rhamnose. 0.192 There are two approaches to limit the impact of the 0186 Although the rhamnose transporters from bacteria inhibitors on the fermentation process: (i) introduction of (e.g., E. coli) are well characterized, functional expression of additional chemical, physical, or biological process steps for bacterial transporters in the yeast plasma membrane may be removal or inactivation of inhibitors (ii) improvement of S. challenging. Pichia stipidis is able to use L-rhamnose. Using cerivisiae to the inhibitors. information generated by the P stipidis genome project, it might be possible to identify a rhamnose transporter if such a 5.5. Fermentation Products from Biomass gene can be shown to be induced by rhamnose (as proposed for galacturonic acid above). 5.5.1. Butanol 0187. After uptake the next requirement for successful rhamnose fermentation is conversion into intermediates of 0193 Metabolic engineering of Escherichia coli for central metabolism. butanol production by inserting genes from the butanol pro 0188 Two pathways for rhamnose utilization have been duction bacteria Clostridium acetobutylicum into E. coli has reported in microorganisms. been described (Inui et al., 2008, Appl Microbiol Biotechnol: 0189 The first catabolic pathway involves phosphorylated 77, 1305-16). intermediates and is used, for example, by E. coli. In this 0.194. A similar strategy can be envisioned for an engi pathway, L-rhamnose is converted to L-rhamnulose by neered C. tropicalis strain configured to derive Sugars from L-rhamnose isomerase. After the Subsequent phosphoryla biomass. Enzymes (and genes) from Clostridium acetobutyli tion to L-rhamnulose by rhamnulokinase, L-rhamnulose-1- cum required for butanol production from Acetyl-CoA phosphate is split into dihdroxy-acetone-phosphate (DHAP) include: Acetyl-CoA acetyltransferase (thiL), B-hydroxybu and L-lactaldehyde by rhamnulose-1-phosphate aldolase. tyryl-CoA dehydrogenase (hbd), 3-hydroxybutyryl-CoA DHAP can be normally processed by glycolysis, yielding 1 dehydratase (crt), butyryl-CoA dehydrogenase (bcd, etfA, mol ethanol per mol L-rhamnose. In E. coli, further metabo etfB), butyrlaldehyde dehydrogenase (adhel, adhe), butanol lism of L-lactaldehyde depends on the redox state of the cells. dehydrogenase (adhel, adhe), butyrlaldehyde dehydrogenase L-lactaldehyde can be oxidized to lactate by lactaldehyde (bdha), butanol dehydrogenase (bdha), butyrlaldehyde dehydrogenase, reduced to 1,2-propanediol by lactaldehyde dehydrogenase (bdhB), butanol dehydrogenase (bdhB). reductase, or processed via a redox-neutral mix of these two 0.195 n-Butanol is a commercially important alcohol that reactions. Introduction of this pathway into S. Cerivisiae, is considered by some to be a strong Candidate for wide L-rhamnulose is expected to be converted to equimolar spread use as a motor fuel. n-Butanol is currently produced amounts of ethanol, lactaldehyde and CO2 with generation of via chemical synthesis almost exclusively. The dominant syn 1 ATP. In Summary, this strategy would require the introduc thetic process in industry, the acetaldehyde method, relies on tion of a transporter and three heterologous enzymes into S. propylene derived from petroleum 1. The U.S. market for cerivisiae. butanol is 2.9 billion pounds per year 2. Currently, the 0190. A second route for rhamnose degradation, which primary use of n-butanol is as a solvent, however, several does not involve phosphorylated intermediates was first companies including British Petroleum and DuPont are described for the Aureobasidium pullulans and is developing methods to utilize bacteria to produce n-butanol referred to as direct oxidative catabolism of rhamnose. A on a large scale for fuel 3. Microorganisms capable of similar pathway occurs in the yeasts P stipitis and Debaryo producing n-butanol by fermentation are Clostridia acetobu myces polymorphus. This pathway is initiated by the oxida tylicum, C. beijerinckii, and C. tetanomorphum. US 2015/0094.483 A1 Apr. 2, 2015 20

0196) n-Butanol has several characteristics that make it a 0203 Synthesis would require valine biosynthesis viable alternative fuel option. It has an energy density that is enzymes, 2-keto-acid decarboxylase, alcohol dehydroge similar to gasoline. Additionally, it could power a combustion aSC. engine with minimal or no modifications. In either a blended 5.5.4. Isopropanol Pathway from Pyruvate or neat form, n-butanol could be easily integrated into our 0204 Isopropanol is commonly employed as an industrial current infrastructure. cleaner and solvent. Additionally, it is sold as “rubbing alco 0.197 Enzymes for butanol production include Pyruvate hol” for use as a disinfectant. As a significant component in dehydrogenase complex, acetyl-CoA acetyltransferase, dry gas, a fuel additive, it solubilizes water in gasoline, 3-hydroxybutyryl-CoA dehydrogenase, crotonase, butyryl thereby removing the threat of frozen supply lines. Proposed CoA dehydrogenase, aldhyde and/or alcohol dehydrogenase. biofuel applications include partial replacement of gasoline (Steen et al., 2008, Microb Cell Fact: 7, 36: Atsumi et al., and in production offatty acid esters. A benefit of substituting 2008, Metab Eng: 10, 305-11.) isopropanol for methanol in fatty acid esters is a higher tol erance for cold temperatures. The fatty acid isopropyl ester 5.5.2. Branched Chain Alcohols would remain liquid in cooler climates. The biosynthesis genes for isopropanol originally found in Clostridia acetobu 0198 In the quest to find a substitute for petroleum based tylicum were engineered into an E. coli Strain for optimal fuels, several low energy molecules have been suggested due industrial usage (Hanai et al., 2007, Appl Environ Microbiol: to the ease of production. However, a molecule with similar 73, 7814-8). energy density to current fuels would be preferred as a bio 0205 Synthesis would require pyruvate dehydrogenase fuel. Branched higher alcohols have a higher energy density complex, acetyl-CoA acetyltransferase, acetoacetyltrans than some of the alcohols proposed as alternative fuels. Other ferase, secondary alcohol dehydrogenase. various properties of these alcohols also display more desir 5.5.5. Methanol Pathway from Methane able features. For example, a lower miscibility with water and 0206 Methanol can be synthesized chemically or bio lower vapor pressure are benefits of higher alcohols. A unique chemically from methane gas. Over 30 million tons per year approach to alcohol synthesis taken by Atsumi etal. Atsumi et of methanol are produced worldwide 1. Currently, chemical al., 2008, Nature: 451, 86-9, employs synthetic biology to synthesis is the method of choice. Methanol is widely used as engineer non-fermentative pathways based on amino acid a solvent, in antifreeze, and as an intermediate in synthesis of biosynthesis. These pathways produce alcohols that are not more complex chemicals. Methanol is used as a fuel in Indy natural fermentation products. Some of the features of these race cars and it has been blended into gasoline for civilian molecules include branching and addition of aromatic cyclic automobiles. Microorganisms capable of methanol produc hydrocarbon structures. tion include Methylobacterium sp., Methylococcus capsula 0199 An engineered C. tropicalis capable of generating tus, and Methylosinus trichosporium. 2-methyl-1-butanol from L-threonine would use either the 0207 Enzymes required: methane monooxygenase. endogenous or exogenously added threonine biosynthetic enzymes, L-threonine ammonia lyase, endogenous or exog 5.5.6. Other Possible End Products Requiring Metabolic enously added isoluecine biosynthetic enzymes, 2-keto-acid Engineering decarboxylase, and an alcohol dehydrogenase. 0200 3-methyl-1-butanol pathway from pyruvate would 0208 Esters: Fatty acid ethyl ester, Fatty acid methyl ester require valine biosynthesis enzymes, leucine biosynthesis Ethers: Dimethyl ether, Dimethylfuran, Methyl-t-butyl ether enzymes, 2-keto-acid decarboxulase, alcohol dehydroge aSC. Hydrocarbons: Alkanes, Alkenes, Isoprenoids 0201 2-phenylethanol pathway from pyruvate would require Phenylalanine biosynthesis enzymes, 2-keto acid 5.5.7. Over-Production of Fatty Acids decarboxylase, alcohol dehydrogenase. 0209 Because many of the strains described here are no 5.5.3. Isobutanol Pathway from Pyruvate longer able to utilize many fatty acids as carbon and energy 0202) Isobutanol has a higher carbon content than ethanol, Sources due to the knockouts in both f3-oxidation (poX4a/ therefore making its energy properties closer to gasoline. pox4b pox5a/pox5b) and co-oxidation pathways (P450 (cyto Currently, isobutanol is used as a precursor for commodity chrome P450), fao (fatty alcohol oxidase), and adh (alcohol chemicals including isobutyl acetate. Atsumi et al., 2008, dehydrogenase) gene), the strain is an ideal Candidate for Nature: 451, 86-9, synthesized isobutanol via synthetic biol metabolic engineering for manipulation of the fatty acid bio ogy. The origin of the enzymes required to synthesize isobu synthetic pathways for overproduction of fatty acids. tanol were from a variety of microorganisms including Lac 0210 Fatty acids (and/or lipids) so produced could either tococcus lactis and Saccharomyces cerevisiae. In addition to be used for production of biofuels such as biodiesel or by expressing foreign enzymes, the host, E. coli, was modified to restoring a P450 or P450s for endogenous production of direct metabolism toward isobutanol production. The inter ()-hydroxy fatty acids. Methods for over-production of esting feature of this pathway to synthesize isobutanol is that endogenous fatty acids may be similar to those used by Lu X it employs amino acid biosynthesis to generate the essential et al., 2008, Metab Eng: 10, 333-9. precursor. This allows the microbe to produce the alcohol in 0211 Steps include: the presence of oxygen. In fact, semi-aerobic conditions 0212 1. Knocking out the E. coli fadD gene, which increased yields. This approach has been applied to generate encodes an acyl-CoA synthetase, to block fatty acid degrada several other alcohols, such as 2-phenylethanol and 2-me tion. This may be accomplished by knocking out acyl-coA thyl-1-butanol. The pathways for these interesting alcohols synthetases and acyl-coA oxidases of the Candida tropicalis have not yet been optimized. (e.g., POX4 and POX4 genes are already absent). US 2015/0094.483 A1 Apr. 2, 2015

0213 2. Heterologous expression of acyl-ACP tions are conducted under aerobic conditions with agitation in thioesterases to increase the abundance of shorter chain fatty a shaker incubator, fermentor or other suitable bioreactor. acids, e.g., U31813 from Cinnamomum camphorum (im 0222 For example, the fermentation process may be proved fuel quality). divided into two phases: a growth phase and a transformation 0214) 3. Increasing the supply of malonyl-CoA by over phase in which co-oxidation of the substrate is performed. The expressing acetyl-coA carboxylase. seeds inoculated from fresh agar plate or glycerol stock are 0215 4. Releasing feedback inhibition caused by long firstly cultivated in a preculture medium for 16-20 hours, at chain fatty acids by overexpression of an endogenous or 30° C. and pH 6.5 in a shaker. Subsequently, this culture is exogenous acyl-ACP thioesterase. Acyl-ACP thioesterases used to inoculate the conversion medium with co-substrates. release free fatty acids from acyl-ACPs. The growth phase of the culture is performed for 10-12 hours 0216 Mechanisms for membrane proliferation (more to generate high cell density cultures at pH 6.5 and 30°C. The membrane-more lipid?): transformation phase is begun with addition of the fatty acid 0217 Expression/overexpression of P450s including fatty or other substrate for the bio-oxidation. The medium pH is acid, alkane, and alkene metabolizing P450s lead to mem adjusted to 7.5-8.0 by addition of a base solution. Co-sub brane proliferation in Yeasts. May be possible to express an strates are fed during the transformation phase to provide enzymatically inactive P450 that elicits proliferation via energy for cell growth. By use of this method, the terminal membrane anchor. Expression of Secreted enzymes, such as methyl group of fatty acids, synthetically derived substrates, invertase (SUC2) can lead to membrane genesis in yeasts. n-alkanes, n-alkenes, n-alkynes and/or fatty alcohols that 0218 Growth with compounds that lead to membrane pro have a carbon chain length from 12 to 22 are converted to a liferation. hydroxyl or carboxyl group. 0219. Altering genetics of peroxisome proliferation. 0220 Enzymes that are Candidates for manipulating 5.7. Genetic Modifications of Candida tropicalis either by modulating or eliminating expression, or Substitut 0223 Yeasts of the genus Candida including Candida ing homologues or engineered enzymes, e.g. that eliminate tropicalis contains two pathways for the metabolism of fatty feedback or end product inhibition. acids: (1)-oxidation and B-oxidation. These pathways are shown schematically in FIG. 2, together with some classes of 5.6. Production of Long-Chain C2-Hydroxy Fatty enzymes capable of catalyzing the chemical conversions in Acids each pathway. In order for Candida to be used to transform 0221) Whole-cell biocatalysts currently used to oxidize fatty acids into useful compounds-such as diacids and long chain fatty acids include Candida tropicalis, Candida hydroxyl fatty acids, or high energy compounds, or other cloacae, Cryptococcus neoforman and Corynebacterium sp. chemicals it is advantageous to eliminate metabolic pathways One preferred microorganisms is Candida tropicalis that can divert either the substrates or products of the desired ATCC20962 in which the B-oxidation pathway is blocked by pathway. For example it may be desirable to prevent Candida disrupting PDX 4 and PDX 5 genes which respectively from metabolizing fatty acids through the B-Oxidation path encode the acyl-coenzyme A oxidases PXP-4 (SEQ ID NO: way, so that more fatty acids are available for conversion to 134) and PXP-5 (SEQ ID NO: 135). This prevents metabo C.()-diacids and ()-hydroxy fatty acids by the co-oxidation lism of the fatty acid by the yeast (compare FIGS. 2 and 3). pathway. This can be accomplished by deleting the acyl coen The fatty acids or alkynes used have 14 to 22 carbon atoms, Zyme A oxidase genes, as shown in FIG. 2 (Picataggio et al., can be natural materials obtained from plants or synthesized 1992, Biotechnology (NY): 10,894-8; Picataggio et al., 1991, from natural fatty acids, such as lauric acid (C12:0), myristic Mol Cell Biol: 11,4333-9). acid (C14:0), palmitic acid (C16:0), stearic acid (C18:0). 0224 Candida tropicalis strains lacking both alleles of oleic acid (C18:1), linoleic acid (C18:2), O-linolenic acid each of two acyl coenzyme A oxidase isozymes, encoded by (c)3. C18:3) ricinoleic acid (12-hydroxy-9-cis-octadecenoic the poX4 and poX5 genes, are efficient biocatalysts for the acid, 12-OH-C 18:1), erucic acid (C22:1). epoxy stearic acid. production of C,c)-diacids (Picataggio et al., 1992, Biotech Examples of other substrates that can be used in biotransfor nology (NY): 10,894-8; Picataggio et al., 1991, Mol Cell mations to produce C.()-dicarboxylic acid and ()-hydroxy Biol: 11, 4333-4339). However for the production of co-hy acid compounds are 7-tetradecyne and 8-hexadecyne. Dis droxy fatty acids, additional enzymes must be eliminated to closed herein, naturally derived fatty acids, chemically or prevent the oxidation of the ()-hydroxyl group to a carboxyl enzymatically modified fatty acids, n-alkane, n-alkene, group. n-alkyne and/or fatty alcohols that have a carbon chain length 0225. To prevent the oxidation of hydroxyl groups to car from 12 to 22 are used as carbon sources for the yeast boxyl groups, in Some embodiments it is particularly advan catalyzed biotransformation. For example, Candida tropica tageous to eliminate or inactivate one or more genes encoding lis ATCC20962 can be used as a catalyst under aerobic con a cytochrome P450. ditions in liquid medium to produce ()-hydroxy fatty acids 0226 To prevent the oxidation of hydroxyl groups to car and C.co-dicarboxylic acids. Candida tropicalis ATCC20962 boxyl groups, in Some embodiments it is particularly advan is initially cultivated in liquid medium containing inorganic tageous to eliminate or inactivate one or more genes encoding salts, nitrogen Source and carbon source. The carbon Source a fatty alcohol dehydrogenase. for initial cultivations can be saccharide such as Sucrose, 0227. To prevent the oxidation of hydroxyl groups to -car glucose, Sorbitol, etc., and other carbohydrates Such as glyc boxyl groups, in Some embodiments it is particularly advan erol, acetate and ethanol. Then, the Substrate such as naturally tageous to eliminate or inactivate one or more genes encoding derived fatty acids, chemically or enzymatically modified an alcohol dehydrogenase. fatty acids, n-alkane, n-alkene, n-alkyne and fatty alcohol for 0228. In one embodiment yeast genes can be inactivated oxidation of terminal methyl or hydroxyl moieties is added by deleting regions from the yeast genome that encode a part into the culture. The pH is adjusted to 7.5-8.0 and fermenta of the yeast gene that encodes the protein product (the open US 2015/0094.483 A1 Apr. 2, 2015 22 reading frame) so that the full-length protein can no longer be P450s and their reductases, glycosyltransferases and desatu made by the cell. In another embodiment yeast genes can be rases. A preferred method for testing the effect of sequence inactivated by inserting additional DNA sequences into the changes in a polypeptide within yeast is to introduce a plu part of the yeast gene that encodes the protein product so that rality of genes of known sequence, each encoding a unique the protein that is made by the cell contains changes that modified polypeptide, into the same genomic location in a prevent it from functioning correctly. In another embodiment plurality of strains. yeast genes are inactivated by inserting or deleting sequences 0233. Some embodiments described herein make use of a from control regions of the gene, so that the expression of the selective marker. A selective marker can be a gene that pro gene is no longer correctly controlled; for example additions duces a selective advantage for the cells under certain condi or deletions to the promoter can be used to prevent transcrip tions such as a gene encoding a product that confers resistance tion of the gene, additions or deletions to the polyadenylation to an antibiotic or other compound that normally inhibits the signal can be used to affect the stability of the mRNA, addi tions or deletions to introns or intron splicing signals can be growth of the host cell. used to prevent correct splicing or nuclear export of the pro 0234. A selective marker can be a reporter, such as, for cessed mRNA. example, any nucleic acid sequence encoding a detectable 0229. For the production of oxidized compounds in yeast gene product. The gene product may be an untranslated RNA including ()-hydroxy fatty acids and high energy compounds, product such as mRNA or antisense RNA. Such untranslated it may also be advantageous to add certain new genes into the RNA may be detected by techniques known in the art, such as yeast cell. For example to facilitate the production of co-hy PCR, Northern or Southern blots. The selective marker may droxy fatty acids from fatty acids with different chain lengths encode a polypeptide. Such as a protein or peptide. A polypep or degrees or positions of unsaturation, the enzymes that are tide may be detected immunologically or by means of its naturally present in the yeast are often inadequate; they may biological activity. The selective marker may be any known in oxidise the fatty acid to the co-hydroxy fatty acid too slowly, the art. The selective marker need not be a natural gene. they may only oxidise a Subset of the fatty acids in a mixture Useful selective markers may be the same as certain natural to their corresponding ()-hydroxy fatty acids, they may oxi genes, but may differ from them eitherinterms of non-coding dise the fatty acid in the wrong position or they may oxidise sequences (for example one or more naturally occurring the ()-hydroxy fatty acid itself to a diacid. Advantageous introns may be absent) or in terms of coding sequences. One enzymes could thus be those that oxidise a compound to the example of Such a detectable gene product is one that causes corresponding hydroxylated compound more rapidly, those the yeast to adopt a unique characteristic color associated that oxidise a fatty acid to its corresponding ()-hydroxy fatty with the detectable gene product. For example, if the targeting acid more rapidly, those that accept as Substrates a wider construct contains a selective marker that is a gene that directs range of Substrates and those that do not over-oxidise target the cell to synthesize a fluorescent protein, then all of the compounds including ()-hydroxy fatty acids to diacids. colonies that contain the fluorescent protein are carrying the 0230. To achieve novel phenotypes in Candida species, targeting construct and are therefore likely to be integrants. including the ability to perform biotransformations such as Thus the cells that will be selected for further analysis are novel chemical conversions, or increased rates of conversion those that contain the fluorescent protein. of one or more Substrates to one or more products, or 0235. The selective marker may encode a protein that increased specificity of conversion of one or more Substrates allows the yeast cell to be selected by, for example, a nutri to one or more products, or increased tolerance of a com tional requirement. For example, the selective marker may be pound by the yeast, or increased uptake of a compound by the the ura4 gene that encodes orotidine-5'-phosphate decar yeast, it may be advantageous to incorporate a gene encoding boxylase. The ura4 gene encodes an enzyme involved in the a polypeptide into the genome of the yeast. biosynthesis of uracil and offers both positive and negative 0231 Preferred sites of integration include positions selection. Only cells expressing ura4 are able to grow in the within the genome where the gene would be under control of absence of uracil, where the appropriate yeast strain is used. a promoter that transcribes high levels of an endogenous Cells expressing ura4 die in the presence of 5-fluoro-orotic protein, or under control of a promoter that leads to regulated acid (FOA) as the ura4 gene product converts FOA into a transcription for example in response to changes in the con toxic product. Cells not expressing ura4 can be maintained by centrations of one or more compound in the cellular or extra adding uracil to the medium. The sensitivity of the selection cellular environment. Examples of preferred sites of integra process can be adjusted by using medium containing 6-azau tion include sites in the genome that are under control of the racil, a competitive inhibitor of the ura4 gene product. The promoter for an isocitrate lyase gene, sites in the genome that his3 gene, which encodes imidazoleglycerol-phosphate are under control of the promoter for a cytochrome P450 dehydratase, is also suitable for use as a selective marker that gene, sites in the genome that are under control of the pro allows nutritional selection. Only cells expressing his3 are moter for a fatty alcohol oxidase gene and sites in the genome able to grow in the absence of histidine, where the appropriate that are under control of the promoter for an alcohol dehy yeast Strain is used. drogenase gene to obtain high levels of expression of a 0236. The selective marker may encode for a protein that polypeptide or expression of a polypeptide under specific allows the yeast to be used in a chromogenic assay. For circumstances. example, the selective marker may be the lacZ gene from 0232 To achieve such novel phenotypes in Candida spe Escherichia coli. This encodes the 3-galactosidase enzyme cies, it may be advantageous to modify the activity of a which catalyses the hydrolysis of 3-galactoside Sugars Such polypeptide by altering its sequence, and to test the effect of as lactose. The enzymatic activity of the enzyme may be the polypeptide with altered sequence within the yeast. assayed with various specialized Substrates, for example Polypeptides of particular interest for conferring the ability to X-gal (5-bromo-4-chloro-3-indoyl-3-D-galactoside) or o-ni synthesize novel hydroxy fatty acids include cytochrome trophenyl-B-D-galactopyranoside, which allow selective US 2015/0094.483 A1 Apr. 2, 2015

marker enzyme activity to be assayed using a spectrophotom 0242. In one embodiment, a DNA construct comprises eter, fluorometer or a luminometer. two sequences with homology to two sequences in the target 0237. In some embodiments, the selective marker com yeast genome (“targeting sequences'), separated by a selec prises a gene that encodes green fluorescent protein (GFP), tive marker, as shown in FIG. 11. The two target sequences which is known in the art. within the yeast genome are preferably located on the same 0238. In some embodiments, the selective marker encodes molecule of DNA (e.g. the same nuclear or mitochondrial a protein that is capable of inducing the cell, or an extract of chromosome), and are preferably less than 1,000,000 base a cell, to produce light. For example, the selective marker pairs apart, more preferably they are less than 100,000 base encodes luciferase in some embodiments. The use of pairs apart, and more preferably they are less than 10,000 base luciferase is known in the art. They are usually derived from pairs apart. Cells containing a genomic integration of the firefly (Photinous pyralis) or sea pansy (Renilla reniformis). targeting construct can be identified using the selective The luciferase enzyme catalyses a reaction using D-luciferin marker. and ATP in the presence of oxygen and Mg" resulting in light 0243 A schematic representation of one form of a DNA emission. The luciferase reaction is quantitated using a lumi molecule for yeast genomic integration (a "genomic targeting nometer that measures light output. The assay may also construct”) is shown in FIG. 4. In this embodiment the include coenzyme A in the reaction that provides a longer, genomic targeting construct has two targeting sequences that Sustained light reaction with greater sensitivity. An alternative are homologous to the sequences of two regions of the target form of enzyme that allows the production of light and which yeast genome. In some embodiments these sequences are can serve as a selective marker is aequorin, which is known in each at least 100 base pairs in length, or between 100 and 300 the art. base pairs in length. The targeting sequences are preferably 0239. In some embodiments the selective marker encodes 100% identical to sequences in the host genome or between B-lactamase. This selective marker has certain advantages 95% and 100% identical to sequences in the host genome. over, for example, lacZ. There is no background activity in Between these targeting sequences are two sites recognized mammalian cells or yeast cells, it is compact (29 kDa), it by a site-specific recombinase Such as the natural or modified functions as a monomer (in comparison with lacZ which is a versions of cre or flp or PhiC31 recombinases or serine tetramer), and has good enzyme activity. This may use CCF2/ recombinases such as those from bacteriophage R4 or bacte AM, a FRET-based membrane permeable, intracellularly riophage TP901-1. Between the two site specific recombinase trapped fluorescent substrate. CCF2/AM has a 7-hydroxy recognition sites are functional sequence elements which coumarin linked to a fluorescein by a cephalosporin core. In may include sequences that encode a site-specific recombi the intact molecules, excitation of the coumarin results in nase that recognizes the recombinase sites and which may efficient FRET to the fluorescein, resulting in green fluores also encode a selective marker as illustrated in FIG. 4. In one cent cleavage of the CCF2 by B-lactamase results in spatial embodiment this DNA construct incorporates the “SAT1 flip separation of the two dyes, disrupting FRET and causing cells per', a DNA construct for inserting and deleting sequences to change from green to blue when viewed using a fluorescent into the chromosome of Candida (Reuss et al., 2004, Gene: microscope. The retention of the cleaved product allows the 341, 119-277). In the “SAT1 flipper” the recombinase is the blue colour to develop over time, giving a low detection limit flp recombinase from Saccharomyces cerevisiae (Vetteret al., of for example, 50 enzyme molecules per cell. This results in 1983, Proc Natl Acad Sci USA: 80, 7284-8) (FLP) and the the selective maker being able to be assayed with high sensi flanking sequences recognized by the recombinase are rec tivity. It also allows the ability to confirm results by visual ognition sites for the flp recombinase (FRT). The selective inspection of the cells or the samples. marker is the gene encoding resistance to the Nourseothricin 0240. In some embodiments, the selective marker com resistance marker from transposon Tn 1825 (Tietze et al., prises any of the aforementioned genes under the control of a 1988, J Basic Microbiol: 28, 129-36). The entire construct promoter. In some embodiments, the selective marker com can then be targeted to the Candida chromosome by adding prises any of the aforementioned genes under the control of a flanking sequences with homology to a gene in the Candida promoter as well as one or more additional regulatory ele chromosome. The DNA sequence of the SAT1-flipper is SEQ ments, such as upstream activating sequences (UAS), termi ID NO: 1. nation sequences and/or secretory sequences known in the 0244 Yeast preferentially recombines linear DNA. It is art. The Secretory sequences may be used to ensure that the therefore advantageous to prepare the targeting construct as a product of the reporter gene is secreted out of the yeast cell. linear molecule prior to transforming it into the yeast target. 5.7.1. Methods for Deletion of Sequences from the Candida In some embodiments it is desirable to prepare and propagate Genome the targeting construct as plasmid DNA in a bacterial host 0241 Many yeasts recombine DNA in regions of Such as E. coli. For propagation in a bacterial host it is gen sequence homology. A linear DNA molecule that is intro erally preferred that plasmid DNA be circular. It is thus some duced into a yeast cell can recombine homologously with the times necessary to convert the targeting construct from a chromosomal DNA if its ends share sufficient sequence iden circular molecule to a linear molecule. Furthermore for tity with chromosomal sequences. Since the sequences of the propagation of the targeting construct in a bacterial host, ends of the DNA molecule are the primary determinant of additional sequence elements may be necessary, so a target where in the yeast chromosome the homologous recombina ing construct may, in addition to the elements shown in FIGS. tion event occurs, it is possible to construct a DNA molecule 4 and 7, comprise an origin of replication and a bacterial that encodes one or more functional genes, and to target that selectable marker. It may therefore be advantageous to place molecule to integrate at a specific location in the yeast chro restriction sites in the targeting construct to cleave between mosome. In this way, yeast genes in the chromosome or the elements of the targeting construct shown in FIGS. 4 and mitochondria may be disrupted, by interrupting the gene 7 and the elements not shown but required for propagation in sequence with other sequences. a bacterial host. Cleavage with restriction enzymes that rec US 2015/0094.483 A1 Apr. 2, 2015 24 ognize these sites will linearize the DNA and leave the tar In the targeting construct shown in FIG. 6 this is done by geting sequences at the ends of the molecule, favoring induction of the gene encoding the recombinase present in the homologous recombination with the target host genome. One targeting construct. Expression of the recombinase causes a of ordinary skill in the art will recognize that there are alter recombination event between the two recombinase recogni native ways to obtain linear DNA, for example by amplifying tion sites of the targeting construct, as shown schematically in the desired segment of DNA by PCR. It is also possible to FIG. 6. The result is that the sequences between the two prepare the DNA directly and transform it into the target yeast recombinase sites are excised from the genome. In other strain without propagating as a plasmid in a bacterial host. embodiments it is possible to integrate a recombinase into a 0245 Introduction of the linearized targeting construct second site in the host genome instead of having it present in into a yeast host cell such as a Candida host cell is followed the targeting construct. by homologous recombination catalyzed by host cell 0248 Cells from which a genomic integration of the tar enzymes. This event is represented schematically in FIG. 5. geting construct has been excised can optionally be tested to Homologous recombination occurs between each of the two ensure that the excision has occurred by testing cells from targeting sequences in the genomic targeting construct and individual colonies to determine whether they still carry the the homologous sites in the yeast genome. The result is an selective marker. In some embodiments. Such testing is per integration of the targeting construct into the genomic DNA. formed by amplification of a section of the genomic DNA by Cells containing a genomic integration of the targeting con the polymerase chain reaction. Excision of part of the target struct can be identified using the selective marker. ing construct from the yeast genome may be detected by a 0246 Cells containing a genomic integration of the target difference in size of amplicon using oligonucleotide primers ing construct can optionally be tested to ensure that the inte that anneal to sequences outside the targeted sequence. This is gration has occurred at the desired site within the genome. In illustrated in FIG. 10. One of ordinary skill in the art will one embodiment, such testing is performed by amplification readily appreciate that there are many alternative ways to of a section of the genomic DNA by the polymerase chain design oligonucleotides to produce diagnostic amplicons reaction. Integration of the targeting construct into the yeast using the polymerase chain reaction. For example one oligo genome will replace genomic sequences with targeting con nucleotide that anneals inside the targeting construct (ex struct sequences. This replacement may be detected by a ample e.g. within the selective marker) and one oligonucle difference in size of amplicon using oligonucleotide primers otide that anneals outside but close to the targeted region can that anneal to sequences outside the targeted sequence. This is be used to produce an amplicon from the integrated targeting illustrated in FIG. 10. One of ordinary skill in the art will construct but will not produce an amplicon if the targeting readily appreciate that there are many alternative ways to construct has been excised. In general oligonucleotide pairs design oligonucleotides to produce diagnostic amplicons for producing diagnostic amplicons should be oriented with using the polymerase chain reaction. For example one oligo their 3' ends towards each other and the sites in the genome nucleotide that anneals inside the targeted region and one where the two oligonucleotides anneal should be separated by oligonucleotide that anneals outside but close to the targeted between 100 and 10,000 bases, more preferably by between region can be used to produce an amplicon from the natural 150 and 5,000 bases and more preferably by between 200 and genomic sequence but will not produce an amplicon if the 2,000 bases. In some instances it may not be possible to targeting construct has eliminated the targeted genomic distinguish between two possible genotypes based on the size sequence. Conversely one oligonucleotide that anneals inside of the amplicons produced by PCR from genomic DNA. In the targeting construct and one oligonucleotide that anneals these cases an additional test is possible, for example diges outside but close to the targeted region outside will not pro tion of the amplicon with one or more restriction enzymes and duce an amplicon from the natural genomic sequence but will analysis of the sizes may enable the two possible genotypes to produce an amplicon if the targeting construct has integrated be distinguished, or sequencing of the amplicon may enable at the targeted genomic location. In general oligonucleotide the two possible genotypes to be distinguished. pairs for producing diagnostic amplicons should be oriented 0249. In some embodiments it may be advantageous to with their 3' ends towards each other and the sites in the delete sequences whose deletion will result in the inactivation genome where the two oligonucleotides anneal should be of a cytochrome P450; in some embodiments it may be separated by between 100 and 10,000 bases, more preferably advantageous to delete sequences whose deletion will result by between 150 and 5,000 bases and more preferably by in the inactivation of a fatty alcohol oxidase; in Some embodi between 200 and 2,000 bases. In some instances it may not be ments it may be advantageous to delete sequences whose possible to distinguish between two possible genotypes based deletion will result in the inactivation of an alcohol dehydro on the size of the amplicons produced by PCR from genomic genase. DNA. In these cases an additional test is possible, for example digestion of the amplicon with one or more restriction 5.7.2. Methods for Addition of Sequences to the Candida enzymes and analysis of the sizes may enable the two possible Genome genotypes to be distinguished, or sequencing of the amplicon 0250 In some embodiments, new DNA sequences can be may enable the two possible genotypes to be distinguished. inserted into the yeast genome at a specific location using 0247 The same selectable marker may be used for the variations of the targeting construct. Because many yeasts disruption of more than one genomic target. This can be recombine DNA in regions of sequence homology, a linear achieved by removing the selectable marker from the yeast DNA molecule that is introduced into a yeast cell can recom genome after each disruption. In one embodiment, this is bine homologously with the chromosomal DNA if its ends achieved when the selectable marker separates two sites that share Sufficient sequence identity with chromosomal are recognized by a recombinase. When the recombinase is sequences. It is thus possible to insert a DNA sequence into present and active, it effects a recombination reaction the yeast genome at a specific location by flanking that between the two sites, excising the sequences between them. sequence with sequences homologous to sequences within US 2015/0094.483 A1 Apr. 2, 2015

the yeast genome that Surround the desired genomic insertion 0256 In some embodiments, the insertion of a sequence site. Such replacements are quite rare, generally occurring encoding a polypeptide into a genomic site where transcrip less than 1 time in 1,000 yeast cells, so it is often advanta tion is regulated by a promoter that expresses high levels of geous to use a selective marker to indicate when new DNA mRNA is accomplished by adding a polypeptide encoding sequences have been incorporated into the yeast genome. A sequence into the genome at a position where a part of the selective marker can be used in conjunction with a sequence genomic sequence is duplicated so that the gene that was to be integrated into the yeast genome by modifying the originally present in the genome remains. In some embodi strategy described for deleting sequences form the yeast ments this can be effected using a DNA construct comprising genome. a promoter sequence found in the yeast genome positioned 0251. If a targeting construct comprises additional Such that transcription initiated by the promoter produces sequences between one of the targeting sequences and the RNA that can subsequently encode the polypeptide. Such a proximal recombinase site, those sequences will be retained construct also comprises a selectable marker that will func in the genome following integration and excision of the tar tion in the yeast and optionally a selectable marker that will geting construct. An example of such a construct is shown in function in a bacterial host. These may optionally be the same FIG. 7, with the additional sequences indicated as “insertion selectable marker. An example of such a construct is shown in sequences.” Integration of the targeting construct for inser FIG. 21. Integration of this construct into the yeast genome is tion into the yeast genome is shown Schematically in FIG. 8. shown schematically in FIG. 22. Homologous recombination occurs between each of the two 0257. In some embodiments, a sequence encoding a targeting sequences in the genomic targeting construct and polypeptide is inserted under control of the promoter for an the homologous sites in the yeast genome. The result is an isocitrate lyase gene or the promoter for a cytochrome P450 integration of the targeting construct into the genomic DNA. gene including the promoter of CYP52A12 or the promoter of Cells containing a genomic integration of the targeting con CYP52A13 or the promoter of CYP52A14 or the promoter of struct can be identified using the selective marker. CYP52A17 or the promoter of CYP52A18 or the promoter 0252 Cells containing a genomic integration of the target for a fatty alcohol oxidase gene including the promoter of ing construct can optionally be tested to ensure that the inte FAO1 or the promoter of FAO1B or the promoter of FAO2A gration has occurred at the desired site within the genome. In or the promoter of FAO2B, or the promoter for an alcohol one embodiment, Such testing may be performed by amplifi dehydrogenase gene including the promoter of ADH-A4 or cation of a section of the genomic DNA by the polymerase the promoter of ADH-A4B or the promoter of ADH-B4 or the chain reaction, for example as illustrated in FIG. 10. One of promoter of ADH-B4B or the promoter of ADH-A10 or the ordinary skill in the art will readily appreciate that there are promoter of ADH-B11 or the promoter of ADH-A10B or the many alternative ways to design oligonucleotides to produce promoter of ADH-B11B to obtain high levels of expression of diagnostic amplicons using the polymerase chain reaction. a polypeptide. 0253) The selectable marker and other sequences from the 0258. In addition to naturally occurring enzymes, modi targeting construct can be removed from the yeast genome fied enzymes may be added into the host genome. For using a recombinase-based strategy: the recombinase effects example enzymes may be altered by incorporating system a recombination reaction between the two recombinase sites, atically varied sets of amino acid changes, with the resulting excising the sequences between them. In the targeting con changes in phenotypes measured and used to identify struct shown in FIG. 7 this is done by induction of the gene sequence changes conferring improved function. See, for encoding the recombinase present in the targeting construct. example, United States Patent Publications Nos. Expression of the recombinase causes a recombination event 2006O136184 and 2008.0050357; Liao et al., 2007, BMC between the two recombinase recognition sites of the target Biotechnol 7, 16; Ehren et al., 2008, Protein Eng Des Sel 21, ing construct, as shown schematically in FIG.9. The result is 699-707 and Heinzelman et al., 2009, Proc Natl Acad Sci that the sequences between the two recombinase sites are USA 106, 5610-5615. Using these methods, modified ver excised from the genome, leaving the insertion sequences sions of enzymes may be obtained that confer on the host cell integrated into the yeast genome. an improved ability to utilize one or more substrate or an improved ability to perform one or more chemical conver 0254 Cells to which a genomic integration has been intro Sion. A gene that has been modified by these methods may be duced can optionally be tested to ensure that the addition has made more useful in the genome of the host by amplification, occurred correctly by polymerase chain reaction amplifica that is by genetic manipulations causing the presence of more tion of DNA from the yeast genome. These amplicons may than one copy of the gene within the host cell genome and then be tested to measure their size (for example by agarose frequently resulting in higher activity of the gene. gel electrophoresis), or their sequence may be determined to ensure that precisely the desired changes have been effected. 5.7.3. Other Microorganisms of Interest for the Production of 0255. In some embodiments, it may be advantageous to Oxidized Fatty Acids insert sequences into a site in the genome that is known to be transcriptionally active. For example inserting a sequence 0259 Homology-based recombination occurs in the Sac encoding a polypeptide into a genomic site where transcrip charomycetacaeae Family (which is in the tion is regulated by a promoter that expresses high levels of Subphylum); Saccharomycetacaeae include the Genera mRNA can produce high levels of mRNA encoding the Ascobotryozyma, Candida, Citeromyces, Debaryomyces, polypeptide. In some embodiments this can be done by Dekkera (Brettanomyces), Eremothecium, Issatchenkia, Kaz replacing a polypeptide encoding sequence in the genome achstania, Kluyveromyces, Kodamaea, Kregervanrija, with a sequence encoding a different polypeptide, for Kuraishia, Lachancea, Lodderomyces, Nakaseomyces, example using the genomic targeting constructs of the form Pachysolen, Pichia (Hansenula), Saccharomyces, Sat shown in FIG. 7. urnispora, Tetrapisispora, Torulaspora, Vanderwaltozyma, US 2015/0094.483 A1 Apr. 2, 2015 26

Williopsis, Zygosaccharomyces. The deletion and insertion bosa, Candida glucosophila, Candida glycerinogenes, Can methods described here are therefore likely to work in these dida gorgasii, Candidagotoi, Candida gropengiesseri, Can Genera. dida guaymorum, Candida haemulonii, Candida 0260. Within the Subphylum Saccharomycotina is a halonitratophila, Candida halophila, Candida hasegawae, monophyletic Glade containing organisms that translate CTG Candida hawaiiana, Candida heliconiae, Candida hispan as serine instead of leucine (Fitzpatricket al. A fungal phy iensis, Candida homilentoma, Candida humicola, Candida logeny based on 42 complete genomes derived from Supertree humilis, Candida hungarica, Candida hyderabadensis, Can and combined gene analysis BMC Evolutionary Biology dida incommunis, Candida inconspicua, Candida insectal 2006, 6:99) including the species Candida lusitaniae, Can ens, Candida insectamans, Candida insectorum, Candida dida guilliermondii and Debaryomyces hansenii, and the sec intermedia, Candida ipomoeae, Candida ishiwadae, Can ond group containing Candida albicans, Candida dublinien dida jaroonii, Candida jeffriesii, Candida kanchanaburien sis, Candida tropicalis, Candida parapsilosis and sis, Candida karawaiiewi, Candida kashinagacola, Candida Lodderomyces elongisporus. Of particular interest are modi kazuoi, Candida khmerensis, Candida kipukae, Candida fications of the activities of cytochrome P450s, fatty alcohol kofuensis, Candida krabiensis, Candida kruisii, Candida oxidases and alcohol dehydrogenases to modulate the hosts kunorum, Candida labiduridarum, Candida lactis-condensi, production of oxidized molecules by yeasts in this clade. Candida lassenensis, Candida laureliae, Candida leandrae, Yeast species of particular interest and industrial relevance Candida lessepsii, Candida lignicola, Candida litsaeae, within this clade include Candida aaseri, Candida abieso Candida lit.seae, Candida lanquihuensis, Candida lycoper phila, Candida africana, Candida aglyptinia, Candida agres dinae, Candida lyxosophila, Candida magnifica, Candida tis, Candida akabanensis, Candida alai, Candida albicans, magnoliae, Candida maltosa, Candida mannitofaciens, Can Candida alimentaria, Candida amapae, Candida ambrosiae, dida maris, Candida maritima, Candida maxii, Candida Candida amphixiae, Candida anatomiae, Candida ancuden melibiosica, Candida membranifaciens, Candida mesen sis, Candida anglica, Candida anneliseae, Candida antarc terica, Candida metapsilosis, Candida methanolophaga, tica, Candida antillancae, Candida anutae, Candida apicola, Candida methanolovescens, Candida methanosorbosa, Can Candida apis, Candida arabinofermentans, Candida arcana, dida methylica, Candida michaelii, Candida mogii, Candida Candida ascalaphidarum, Candida asparagi, Candida montana, Candida multigennis, Candida mycetangii, Can atakaporum, Candida atbi, Candida athensensis, Candida dida naeodendra, Candida nakhonratchasimensis, Candida atlantica, Candida atmosphaerica, Candida auringiensis, nanaspora, Candida natalensis, Candida neerlandica, Can Candida auris, Candida aurita, Candida austromarina, Can dida memodendra, Candida nitrativorans, Candida nitrato dida azyma, Candida azymoides, Candida barrocoloraden phila, Candida nivariensis, Candida nodaensis, Candida sis, Candida batistae, Candida beechii, Candida bentonen norvegica, Candida novaki, Candida Odintsovae, Candida sis, Candida bertae, Candida berthetii, Candida Oleophila, Candida Ontarioensis, Candida Ooitensis, Can bituminphila, Candida blanki, Candida blattae, Candida dida Orba, Candida Oregonensis, Candida Orthopsilosis, blattariae, Candida bohiensis, Candida boidinii, Candida Candida Ortonii, Candida Ovalis, Candidapallodes, Candida bokatorum, Candida boleticola, Candida bolitotheri, Can palmioleophila, Candida paludigena, Candida panamensis, dida bombi, Candida bombiphila, Candida bondarzewiae, Candida panamericana, Candida parapsilosis, Candida Candida bracarensis, Candida bribrorum, Candida brome pararugosa, Candida pattaniensis, Candida peltata, Can liacearum, Candida buenavistaensis, Candida buinensis, dida peoriaensis, Candida petrohuensis, Candida phangn Candida butyri, Candida Californica, Candida Canberraen gensis, Candidapicachoensis, Candidapiceae, Candida pic sis, Candida cariosilignicola, Candida carpophila, Candida inguabensis, Candida pignaliae, Candida pimensis, Candida carvicola, Candida caseinolytica, Candida Castrensis, Can pini, Candida plutei, Candida pomicola, Candida pondero dida catenulata, Candida cellae, Candida cellulolytica, Can sae, Candida populi, Candida powellii, Candida prunicola, dida cerambycidarum, Candida chauliodes, Candida chick Candida pseudogliaebosa, Candida pseudohaemulonii, Can asaworum, Candida chilensis, Candida choctaworum, dida pseudointermedia, Candida pseudolambica, Candida Candida chodatii, Candida chrysomelidarum, Candidacidri, pseudorhagii, Candida pseudovanderkliftii, Candida psy Candida cloacae, Candida coipomoensis, Candida conglo chrophila, Candida pyralidae, Candida qinlingensis, Can bata, Candida corydali, Candida cylindracea, Candida day dida quercitrusa, Candida quercuum, Candida railenensis, enportii, Candida davisiana, Candida deformans, Candida Candida ralunensis, Candida rancensis, Candida restingae, dendrica, Candida dendronema, Candida derodonti, Can Candida rhagii, Candida riodocensis, Candida rugopellicu dida diddensiae, Candida digboiensis, Candida diospyri, losa, Candida rugosa, Candida Sagamina, Candida sai Candida diversa, Candida dosseyi, Candida drimydis, Can toana, Candida sake, Candida Salmanticensis, Candida San dida drosophilae, Candida dubliniensis, Candida easanen tamariae, Candida Santjacobensis, Candida Saopaulonensis, sis, Candida edaphicus, Candida edax, Candida elateri Candida Savonica, Candida Schatavi, Candida Sequanensis, darum, Candida emberorum, Candida endomychidarum, Candida Sergipensis, Candida Shehatae, Candida Silvae, Candida entomophila, Candida ergastensis, Candida ermo Candida Silvanorum, Candida Silvatica, Candida Silvicola, bii, Candida etchellsii, Candida ethanolica, Candida famata, Candida Silvicultrix, Candida Sinolaborantium, Candida Candida fennica, Candida fermenticarens, Candida floccu sithepensis, Candida Smithsonii, Candida sojae, Candida losa, Candida fioricola, Candida fioris, Candida fioscu Solani, Candida songkhlaensis, Candida Sonorensis, Can lorum, Candida fluviatilis, Candida fragi, Candida frey dida Sophiae-reginae, Candida Sorbophila, Candida Sorbo Schussii, Candida friedrichii, Candida frijolesensis, Candida Sivorans, Candida Sorboxylosa, Candida Spandovensis, Can fructus, Candida fukazawae, Candida fingicola, Candida dida Steatolytica, Candida Stellata, Candida Stellimalicola, galacta, Candida galis, Candida galli, Candida gatunensis, Candida stri, Candida Subhashii, Candida succiphila, Can Candida gelsemii, Candida geochares, Candida germanica, dida Suecica, Candida Suzuki, Candida takamatsuzukensis, Candida ghanaensis, Candida gigantensis, Candida glae Candida taliae, Candida tammamiensis, Candida tanzawaen US 2015/0094.483 A1 Apr. 2, 2015 27 sis, Candida tartarivorans, Candida temnochilae, Candida expressed in engineered Candida Strains including Candida tenuis, Candidatepae, Candida terraborum, Candida tetrigi tropicalis strains to introduce unsaturation at specific sites of darum, Candida thaimueangensis, Candida thermophila, fatty acid substrates prior to co-hydroxylation or to catalyze Candida tilneyi, Candida tolerans, Candida torresi, Can carbon-carbon double bond formation after ()-hydroxylation dida tritonae, Candida tropicalis, Candida trypodendroni, offatty acids. Candida tsuchiyae, Candida tumulicola, Candida ubatuben 0264. Expression in engineered Candida strains of P450 sis, Candida ulmi, Candida vaccinii, Candida valdiviana, enzymes that are known in the literature to introduce addi Candida vanderkliftii, Candida vanderwaltii, Candida tional internal hydroxylation at specific sites of fatty acids or vartiovaarae, Candida versatilis, Candida vini, Candida ()-hydroxyfatty acids can be used to produce internally oxi viswanathi, Candida wickerhamii, Candida wounanorum, dized fatty acids or co-hydroxyfatty acids or aldehydes or Candida wyomingensis, Candida xylopsoci, Candida yucho dicarboxylic acids. Examples of P450 enzymes with known rum, Candida Zemplimina, Candida zeylanoides in-chain hydroxylation activity on different fatty acids that 5.7.4. Engineering of Additional Enzymes into Candida to may be cloned into Candida are the following: CYP81B1 Further Diversify Structures of Products Formed. from Helianthus tuberosus with ()-1 to ()-5 hydroxylation 0261) Different fatty acids are hydroxylated at different (Cabello-Hurtado et al., 1998, J Biol Chem 273, 7260-7267); rates by different cytochrome P450s. To achieve efficient CYP790C1 from Helianthus tuberosus with (0-1 and (0-2 hydroxylation of a desired fatty acid feedstock, one strategy is hydroxylation (Kandelet al., 2005, J Biol Chem 280, 35881 to express P450 enzymes within Candida that are active for 35889); CYP726A1 from Euphorbia lagscae with epoxida ()-hydroxylation of a wide range of highly abundant fatty acid tion on fatty acid unsaturation (Cahoon et al., 2002, Plant feedstocks. Of particular interest are P450 enzymes that cata Physiol 128, 615-624); CYP152B1 from Sphingomonas lyze ()-hydroxylation of lauric acid (C12:0), myristic acid paucimobilis with O-hydroxylation (Matsunaga et al., 2000, (C14:0), palmitic acid (C16:0), stearic acid (C18:0), oleic Biomed Life Sci 35, 365-371); CYP2E1 and 4A1 from acid (C18:1), linoleic acid (C18:2), and O-linolenic acid (c)3. human liver with ()-1 hydroxylation (Adas et al., 1999, J. Lip C18:3). Examples of P450 enzymes with known ()-hydroxy Res 40, 1990-1997); P450s from Bacillus subtilis with C lation activity on different fatty acids that may be cloned into and 3-hydroxylation (Lee et al., 2003, J Biol Chem 278, Candida are the following: CYP94A1 from Vicia sativa (Tijet 9761–9767); and CYP102A1 (BM-3) from Bacillus megate et al., 1988, Biochemistry Journal 332, 583-589); CYP94A5 rium with (D-1, co-2 and c)-3 hydroxylation (Shirane et al., from Nicotiana tabacum (Le Bouquin et al., 2001, Eur J 1993, Biochemistry 32, 13732-13741). Biochem 268, 3083-3090); CYP78A1 from Zea mays (Lar 0265. In addition to naturally occurring enzymes, modi kin, 1994, Plant Mol Biol 25, 343-353); CYP 86A1 (Ben fied enzymes may be added into the host genome. For veniste et al., 1998, Biochem Biophy's Res Commun 243, example enzymes may be altered by incorporating system 688-693) and CYP86A8 (Wellesen et al., 2001, Proc Natl atically varied sets of amino acid changes, with the resulting Acad Sci USA 98, 9694-9699) from Arabidopsis thaliana: changes in phenotypes measured and used to identify CYP92B1 from Petunia hybrida (Petkova-Andonova et al., sequence changes conferring improved function. See, for 2002, Biosci Biotechnol Biochem 66, 1819-1828); example, United States Patent Publications Nos. CYP102A1 (BM-3) mutant F87 from Bacillus megaterium 2006O136184 and 2008.0050357; Liao et al., 2007, BMC (Oliver et al., 1997, Biochemistry 36, 1567-1572); and CYP Biotechnol 7, 16; Ehren et al., 2008, Protein Eng Des Sel 21, 4 family from mammal and insect (Hardwick, 2008, Biochem 699-707 and Heinzelman et al., 2009, Proc Natl Acad Sci Pharmacol 75,2263-2275). USA 106, 5610-5615. Using these methods, modified ver 0262. A second strategy to obtain efficient hydroxylation sions of cytochrome P450s may be obtained with improved (or further oxidation of the hydroxy group to an aldehyde or ability to oxidise fatty acids of different lengths (for example dicarboxylic acid) of a modified fatty acid is to perform the C6, C7, C8, C9, C10, C11, C12, C13, C14, C15, C16, C17, hydroxylation first and then to expose the hydroxylated fatty C18, C19, C20, C21, C22, C23, C24) or different degrees of acid or aldehyde or dicarboxylic acid to an additional saturation (for example fatty acids with one carbon-carbon enzyme. double bond, fatty acids with two carbon-carbon double 0263 For example incorporating one or more desaturase bonds andfatty acids with three carbon-carbon double bonds) enzymes into engineered Candida would allow the introduc or with unsaturated fatty acids where the unsaturated bond is tion of double bonds into co-hydroxyl fatty acids or aldehydes at different positions relative to the carboxyl group and the or dicarboxylic acids at desired positions. Examples of ()-position, to hydroxy fatty acids or to dicarboxylic fatty desaturases with known specificity that may be cloned into acids. Further, using these methods modified versions offatty Candida are the following: A desaturase from rat liver alcohol oxidases or alcoholdehydrogenases may be obtained microsomes (Savile et al., 2001, JAm ChemSoc. 123,4382 with improved ability to oxidise hydroxy-fatty acids of dif 4385). A desaturase from Bacillus subtilis (Fauconnot and ferent lengths (for example C6, C7, C8, C9, C10, C11, C12, Buist, 2001, Bioorg Med Chem Lett 11, 2879-2881), A C13, C14, C15, C16, C17, C18, C19, C20, C21, C22, C23, desaturase from Tetrahymena thermophila (Fauconnot and C24) or different degrees of saturation (for example fatty Buist, 2001, JOrg Chem 66, 1210-1215), A desaturase from acids with one carbon-carbon double bond, fatty acids with Saccharomyces cerevisiae (Buist and Behrouzian, 1996, J two carbon-carbon double bonds and fatty acids with three Am Chem Soc 118, 6295-6296): A' desaturase from carbon-carbon double bonds) or with unsaturated fatty acids Spodoptera littoralis (Pinilla et al., 1999, Biochemistry 38, where the unsaturated bond is at different positions relative to 15272-15277), A' desaturase from Arabidopsis thaliana the carboxyl group and the co-position. A gene that has been (Buist and Behrouzian, 1998, JAm ChemSoc. 120,871-876); modified by these methods may be made more useful in the A' desaturase from Caenorhabditis elegans (Meesapyodsuk genome of the host by amplification, that is by genetic et al., 2000, Biochemistry 39, 11948-11954). Many other manipulations causing the presence of more than one copy of desaturases are known in the literature that can also be the gene within the host cell genome and frequently resulting US 2015/0094.483 A1 Apr. 2, 2015 28 in higher activity of the gene. Expression of one or more at pH 6.3 and grown at 30°, 900 rpm with aeration rate of 1.5 additional enzymes may also be used to functionalize the VVm. After 12 hour fermentations (growth phase), biotrans oxidized fatty acid, either the hydroxyl group or more highly formation phase was started with feeding of substrate (2 ml oxidized groups such as aldehydes or carboxylic acids l'). Concentrated glucose (500 gl') as co-substrate was fed continuously at the rate of 1.2g 1-1 h-1. During the biotrans 6. BIOTRANSFORMATION EXAMPLES formation phase, pH was maintained at 7.6 automatically by 0266 The following examples are set forth so as to pro addition of 4 mol l NaOH solution. Antifoam (Antifoam vide those of ordinary skill in the art with a complete descrip 204) was also added to the fermentor as necessary. Samples tion of how to practice, make and use exemplary embodi were taken on a daily basis to determine levels of product by ments of the disclosed methods, and are not intended to limit LC-MS. the scope of what is regarded as the invention. 6.3. General Extraction and Purification Procedure of 6.1. General Biotransformation Procedure in Biotransformation Products Shake-Flask 0269. The fermentation broth was acidified to pH 1.0 with 0267 C tropicalis ATCC20962 from fresh agar plate or HCl and extracted twice with diethyl ether. To avoid the glycerol stock was precultured in 30 ml YPD medium con epoxy ring-opening during acidification, the fermentation sisting of (g l'): yeast extract, 10; peptone, 10; glucose, 20 broth with products containing epoxy groups was slowly and shaken at 250 rpm, 30° for 20 hours in 500 ml flask. After acidified to pH 3.0 with 5 NHC1. Solvent was evaporated 16 hours of cultivation at 250 rpm, 30° C., preculture was under vacuum with a rotary evaporator. The residual obtained inoculated at 10% (v/v) to 30 ml conversion medium consist was separated by silica gel column chromatography using ing of (gl'): peptone, 3:yeast extract, 6; yeast nitrogen base, silica gel 60. The fractions containing impurities, un-reacted 6.7; acetic acid, 3: KHPO, 7.2: KHPO 9.3; glucose/glyc mono fatty acids and products were gradually eluted with a erol. 20 in 500 ml flask and shaked at 250 rpm. The initial mixture of n-hexane/diethyl ether that their ratio ranges from concentration of substrate was about 10-20 g l'. pH was 90:30 to 10:90. The fractions containing same compound adjusted to 7.5 by addition of 2 moll-1 NaOH solution after were collected together and the solvents were evaporated 12 hour culture. During biotransformation, concentrated co under vacuum with a rotary evaporator. Substrate (glucose/glycerol/sodium acetate/ethanol) was fed (1-2.5% per day) and pH was maintained at 7.5-8.0 by addi 7. GENETIC MODIFICATION EXAMPLES tion of NaOH solution. Samples were taken on a daily basis to 0270. The following examples are set forth so as to pro determine levels of product by LC-MS. vide those of ordinary skill in the art with a description of how to practice, make and use various disclosed exemplary 6.2. General Biotransformation Procedure in embodiments, and are not intended to limit the scope of what Fermentor is regarded as the invention. 0268. Fermentation was carried out in 3-1 Bioflo3000 fer 0271 The strains shown in Table 2 and further described in mentor (New Brunswick Scientific Co., USA) in fed-batch this section were constructed by the synthesis and cloning of culture. The conversion medium mentioned above was used DNA and its Subsequent transformation into the appropriate except for addition of 0.05% antifoam 204 (Sigma) and 0.5% C. tropicalis strain. Table 2 summarizes the DNA sequences Substrate. The seed culture from fresh agar plate or glycerol synthesized and used in these examples. Table 3 Summarizes stock was prepared in 50 ml of conversion medium for 20 the C. tropicalis strains constructed in these examples. Sec hours at 30°C., 250 rpm prior to inoculation into the fermen tion 7.1 describes the methods used for transformation of tor vessel. Following inoculation, the culture was maintained Candida tropicalis. TABLE 2 SEQ NAME ID NO: GINo. SOURCE CONSTRUCTION APPLICATION SAT1 Flipper 50059745 Joachim Morschhauser Source of the SAT1 Flipper CYP52A17 29469874 Used to design CYP52A17 A CYP52A17 A No Gene synthesis Used to construct applicable CYP52A17::SAT1 CYP52A17:SAT1 No Subcloning of SAT1 Used to delete CYP52A17 applicable flipper into CYP52A17 A CYP52A13 294698.64 Used to design CYP52A13 A CYP52A13 A No Gene synthesis Used to construct applicable CYP52A13::SAT1 CYP52A13:SAT1 No Subcloning of SAT 1 Used to delete CYP52A13 applicable flipper into CYP52A13 A CYP52A18 29469876 Used to design CYP52A18. A CYP52A18. A No Gene synthesis Used to construct applicable CYPS2A18::SAT1 CYPS2A18::SAT1 11 No Subcloning of SAT1 Used to delete CYP52A18 applicable flipper into CYP52A18. A US 2015/0094.483 A1 Apr. 2, 2015 29

TABLE 2-continued SEQ NAME ID NO: GINo. SOURCECONSTRUCTION APPLICATION CYP52A14 3 29469866 Used to design CYP52A14 A Gene#1179 179 CYP52A14 A Gene synthesis Used to construct CYPS2A14::SAT1 CYPS2A14::SAT1 Subcloning of SAT1 Used to delete CYP52A14 e flipper into CYP52A14 A FAO1 Used to design FAO1 A FAO1 A Gene synthesis Used to construct FAO1::SAT1 FAO1::SAT1 Subcloning of SAT1 Used to delete FAO1 flipper into FAO1 A Used to design FAO1B A FAO1B A Assembly PCR. Product Used to construct not cloned. FAO1B::SAT1 21 Ligation of SAT1 flipper Used to delete FAO1B to assembly PCR product of FAO1B A 22 Used to design FAO2A A FAO2A A 23 Gene synthesis Used to construct FAO2A::SAT1 24 Subcloning of SAT1 Used to delete FAO2A flipper into FAO2A A FAO2B 25 Used to design FAO2B A FAO2B A 26 Gene synthesis Used to construct FAO2B::SAT1 27 Subcloning of SAT1 Used to delete FAO2B flipper into FAO2B A CYP52A12 28 Used to design CYP52A12 A CYP52A12 A 29 Gene synthesis Used to construct CYPS2A12::SAT1 CYPS2A12::SAT1 30 Subcloning of SAT1 Used to delete CYP52A12 flipper into CYP52A12 A Used to design CYP52A12B A CYP52A12B A 31 Gene synthesis Used to construct CYPS2A12B::SAT1 32 Subcloning of SAT1 Used to delete CYP52A12B flipper into CYP52A12B A 39 Used to design ADH-A4 A DH-A4 A Gene synthesis Used to construct ADH 45 Subcloning of SAT1 Osed to delete ADH-A4 flipper into ADH-A4 A Used to design ADH DH-A4B A 46 Gene synthesis Used to construct ADH A4B::SAT1 47 Subcloning of SAT1 Osed to delete ADH-A4B flipper into ADH-A4B A 42 Used to design ADH-B4 A DH-B4 A 48 Gene synthesis Used to construct ADH 49 Subcloning of SAT1 Used to delete ADH-B4 flipper into ADH-B4 A Used to design ADH 50 Gene synthesis Used to construct ADH B4B::SAT1 51 Subcloning of SAT1 Used to delete ADH-B4B flipper into ADH-B4B A DH-A10 40 Used to design ADH DH-A10 A 52 Gene synthesis Used to construct ADH A10::SAT1 DH-A10::SAT1 53 Subcloning of SAT1 Osed to delete ADH-A10 e flipper into ADH-A10 A A. 43 Used to design ADH US 2015/0094.483 A1 Apr. 2, 2015 30

TABLE 2-continued SEQ NAME ID NO: SOURCECONSTRUCTION APPLICATION ADH-B11 A S4 Gene synthesis Used to construct ADH B11::SAT1 ADH-B11::SAT1 55 Subcloning of SAT1 Used to delete ADH-B11 flipper into ADH-B11 A ADH-A1OB 56 Used to design ADH A1 OB A ADH-A1 OB A 57 Gene synthesis Used to construct ADH A1 OB::SAT1 ADH-A1OB::SAT1 58 Subcloning of SAT1 Used to delete ADH-A10B flipper into ADH A1 OB A ADH-B11B 59 Used to design ADH B11B A ADH-B11B A 60 Gene synthesis Used to construct ADH B11B::SAT1 ADH-B11B::SAT1 61 Subcloning of SAT1 Used to delete ADH-B11B flipper into ADH B11B A ICL promoter 62 Gene synthesis Used as a component o genomic integration an expression constructs (e.g. SEQ ID NO: 70, SEQID NO: 71, SEQ ID NO: 74, etc.) ICL terminator 63 Not Gene synthesis Used as a component o applicable genomic integration an expression constructs (e.g. SEQ ID NO: 70, SEQID NO: 71, SEQ ID NO: 74, etc.) TEF1 promoter 64 Not Gene synthesis Used as a component o applicable genomic integration an expression constructs (e.g. SEQ ID NO: 70, SEQID NO: 71, SEQ ID NO: 74, etc.) EM7 promoter 65 Not Gene synthesis Used as a component o applicable genomic integration an expression constructs (e.g. SEQ ID NO: 70, SEQID NO: 71, SEQ ID NO: 74, etc.) ZeoR 66 Not Gene synthesis of gene Used as a component o applicable optimized for Candida genomic integration an expression constructs (e.g. SEQ ID NO: 70, SEQID NO: 71, SEQ ID NO: 74, etc.) CYC1 transcription 67 Not Gene synthesis Used as a component o terminator applicable genomic integration an expression constructs (e.g. SEQ ID NO: 70, SEQID NO: 71, SEQ ID NO: 74, etc.) pUC origin of 68 Not Gene synthesis Used as a component o replication applicable genomic integration an expression constructs (e.g. SEQ ID NO: 70, SEQID NO: 71, SEQ ID NO: 74, etc.) CYP52A17 69 Not Gene synthesis Cloned into genomic applicable integration and expression constructs to express (e.g. SEQ ID No: 70) 70 Not CYP52A17 cloned into Used to express CYP52A17 applicable genomic integration in Candida tropicalis under Wector control of the isocitrate lyase promoter CYP52A13 71 Not Gene synthesis Cloned into genomic applicable integration and expression constructs to express(e.g. SEQ ID NO: 71) US 2015/0094.483 A1 Apr. 2, 2015 31

TABLE 2-continued SEQ NAME ID NO: GINo. SOURCECONSTRUCTION APPLICATION pXICL:CYP52A13 72 Not CYPS2A13 cloned into Used to express CYP52A13 applicable genomic integration in Candida tropicalis under Wector control of the isocitrate lyase promoter CYP52A12 73 Not Gene synthesis Cloned into genomic applicable integration and expression constructs to express(e.g. SEQ ID NO: 74) pXICL:CYP52A12 74 Not CYPS2A12 cloned into Used to express CYP52A12 applicable genomic integration in Candida tropicalis under Wector control of the isocitrate lyase promoter mCherry 75 Not Gene synthesis Cloned into genomic applicable integration and expression constructs to express mCherry (e.g. SEQID NO: 76) pXICL::mCherry 76 Not mCherry cloned into Used to express mCherry in applicable genomic integration Candida tropicalis under Wector control of the isocitrate lyase promoter

TABLE 3 Strain Name Genotype Description DP1 ura AfuraB American Type Culture Collection (ATCC pox5::ura3A pox5::ura3A 20962) pox4A::ura3A pox4B::URA3A DP65 DP1 CYP52A17:SAT1 Electroporation of DP1 with CYP52A17:SAT1 (SEQ ID NO: 4) and selection for nourseothricin resistance followed by PCR screens for targeting construct insertion into CYP52A17 DP78 DP1 ACYP52A17 Growth of DP65 with maltose followed by agar plate screen for loss of nourseothricin resistance and PCR screen for excision of argeting construct from CYP52A17 DP107 DP1 ACYP52A17 Electroporation of DP78 with CYP52A13:SAT1 CYP52A13:SAT1 (SEQ ID NO: 7) and selection for nourseothricin resistance ollowed by PCR screens for targeting construct insertion into CYP52A13 DP113 DP1 ACYP52A17 ACYP52A13 Growth of DP107 with maltose followed by agar plate screen for loss of nourseothricin resistance and PCR screen or excision of targeting construct from CYP52A13 DP140 DP1 Electroporation of DP113 with ACYP52A17/CYP52A18::SAT1 CYP52A18:SAT1 (SEQ ID NO: 11) and ACYPS2A13 selection for nourseothricin resistance ollowed by PCR screens for targeting construct insertion into CYP52A18 DP142 DP1 ACYP52A17 ACYP52A18 Growth of DP140 with maltose followed ACYPS2A13 by agar plate screen for loss of nourseothricin resistance and PCR screen or excision of targeting construct from CYP52A18 DP170 DP1 ACYP52A17 ACYP52A18 Electroporation of DP142 with ACYPS2A13, CYPS2A14::SAT1 CYP52A14::SAT1 (SEQID NO: 15) and selection for nourseothricin resistance ollowed by PCR screens for targeting construct insertion into CYP52A14 DP174 DP1 ACYP52A17 ACYP52A18 Growth of DP170 with maltose followed ACYPS2A13:ACYPS2A14 by agar plate screen for loss of nourseothricin resistance and PCR screen or excision of targeting construct from CYP52A14 DP182 DP1 ACYP52A17 ACYP52A18 Electroporation of DP174 with ACYPS2A13:ACYPS2A14 FAO1::SAT1 (SEQ ID NO: 18) and FAO1::SAT1 selection for nourseothricin resistance US 2015/0094.483 A1 Apr. 2, 2015 32

TABLE 3-continued

Strain Name Genotype Description ollowed by PCR screens for targeting construct insertion into FAO1 DP186 Growth of DP182 with maltose followed by agar plate screen for loss of nourseothricin resistance and PCR screen or excision of targeting construct from FAO1 DP197 DP1 ACYP52A17 ACYP52A18 Electroporation of DP186 with ACYPS2A13:ACYPS2A14 pXICL::mCherry (SEQ ID NO: 76) and AFAO1 pXICL::mCherry selection for Zeocin resistance followed by PCR screens for targeting construct insertion into the isocitrate lyase gene Electroporation of DP186 with pXICL:CYP52A17 (SEQID NO:70) and selection for Zeocin resistance followed by PCR screens for targeting construct insertion into the isocitrate lyase gene DP238 Electroporation of DP186 with FAO1B::SAT1 (SEQID NO: 21) and selection for nourseothricin resistance ollowed by PCR screens for targeting construct insertion into FAO1B DP240 Growth of DP238 with maltose followed by agar plate screen for loss of nourseothricin resistance and PCR screen or excision of targeting construct from FAO1B DP255 Electroporation of DP240 with FAO2A::SAT1 (SEQ ID NO: 21) and selection for nourseothricin resistance ollowed by PCR screens for targeting construct insertion into FAO2A DP256 Growth of DP255 with maltose followed by agar plate screen for losso nourseothricin resistance and PCR screen or excision of targeting construct from FAO2A DP258 Electroporation of DP256 with DP259 FAO2B::SAT1 (SEQID NO: 27) and selection for nourseothricin resistance ollowed by PCR screens for targeting construct insertion into FAO2B DP261 Growth of DP259 with maltose followed by agar plate screen for losso nourseothricin resistance and PCR screen or excision of targeting construct from FAO2B DP268 Electroporation of DP261 with CYP52A12::SAT1 (SEQ ID NO:30) and selection for nourseothricin resistance ollowed by PCR screens for targeting construct insertion into CYP52A12 DP272 Growth of DP268 with maltose followed by agar plate screen for loss of nourseothricin resistance and PCR screen or excision of targeting construct from CYP52A12 DP282 Electroporation of DP272 with CYP52A12B::SAT1 (SEQID NO:32) and selection for nourseothricin resistance ollowed by PCR screens for targeting construct insertion into CYP52A12B DP283 Growth of DP282 with maltose followed DP284 by agar plate screen for loss of nourseothricin resistance and PCR screen or excision of targeting construct from CYP52A12B DP387 Electroporation of DP283 with ADH A4::SAT1 (SEQID NO: 45) and selection or nourseothricin resistance followed by PCR screens for targeting construct insertion into ADH-A4

DP388 Growth of DP387 with maltose followed by agar plate screen for loss of US 2015/0094.483 A1 Apr. 2, 2015 33

TABLE 3-continued

Strain Name Genotype Description nourseothricin resistance and PCR screen or excision of targeting construct from Y ADH-A4

DP389 Electroporation of DP388 with ADH A4B::SAT1 (SEQID NO: 47) and selection for nourseothricin resistance ollowed by PCR screens for targeting construct insertion into ADH-A4B

DP390 C Growth of DP389 with maltose followed by agar plate screen for loss of nourseothricin resistance and PCR screen or excision of targeting construct from A. ADH-A4B DP397 Electroporation of DP390 with ADH B4::SAT1 (SEQID NO:49) and selection N. or nourseothricin resistance followed by PCR screens for targeting construct insertion into ADH-B4

DP398 Growth of DP397 with maltose followed by agar plate screen for loss of nourseothricin resistance and PCR screen or excision of targeting construct from ADH-B4 A. D 4 DP409 Electroporation of DP398 with ADH AR B4B::SAT1 (SEQID NO:49) and selection or nourseothricin resistance followed by PCR screens for targeting construct insertion into ADH-B4B A. D

DP411 Growth of DP409 with maltose followed by agar plate screen for loss of nourseothricin resistance and PCR screen for excision of targeting construct from ADH-B4B

DP415 Electroporation of DP411 with ADH A10::SAT1 (SEQID NO: 53) and selection for nourseothricin resistance followed by PCR screens for targeting construct insertion into ADH-A10

DP416 Growth of DP415 with maltose followed by agar plate screen for loss of FA nourseothricin resistance and PCR screen for excision of targeting construct from ADH-A10 D

DP417 Electroporation of DP416 with ADH s B11::SAT1 (SEQID NO: 55) and selection for nourseothricin resistance followed by PCR screens for targeting construct CY C 5 2A. insertion into ADH-B11 A.

DP421 Growth of DP417 with maltose followed by agar plate screen for loss of nourseothricin resistance and PCR screen for excision of targeting construct from ADH-B11 US 2015/0094.483 A1 Apr. 2, 2015 34

TABLE 3-continued Strain Name Genotype Description DP423 DP1 ACYP52A17 ACYP52A18 Electroporation of DP421 with ADH DP424 ACYPS2A13:ACYPS2A14 A1OB::SAT1 (SEQID No. 58) and AFAO1 AFAO1B selection for nourseothricin resistance AFAO2AAFAO2B followed by PCR screens for targeting ACYPS2A12ACYPS2A12B construct insertion into ADH-A1OB AADH-A4, AADH-A4B AADH B4/AADH-B4B AADH A10ADH-A1OB::SAT1 AADH B11 DP427 DP1 ACYP52A17 ACYP52A18 Electroporation of DP421 with DP428 ACYPS2A13:ACYPS2A14 pXICL:CYP52A17 (SEQID NO:70) and AFAO1 AFAO1B selection for Zeocin resistance followed by AFAO2AAFAO2B PCR screens for targeting construct ACYPS2A12ACYPS2A12B insertion into the isocitrate lyase gene AADH-A4, AADH-A4B AADH B4/AADH-B4B AADH-A10 AADH-B11 pXICL:CYP52A17 DP431 DP1 ACYP52A17 ACYP52A18 Growth of DP424 with maltose followed ACYPS2A13:ACYPS2A14 by agar plate screen for loss of AFAO1 AFAO1B nourseothricin resistance and PCR screen AFAO2AAFAO2B for excision of targeting construct from ACYPS2A12ACYPS2A12B ADH-A1 OB AADH-A4, AADH-A4B AADH B4/AADH-B4B AADH A10/AADH-A1 OBAADH-B11 DP433 DP1 ACYP52A17 ACYP52A18 Electroporation of DP431 with ADH DP434 ACYPS2A13:ACYPS2A14 B11 B::SAT1 (SEQID NO: 61) and AFAO1 AFAO1B selection for nourseothricin resistance AFAO2AAFAO2B followed by PCR screens for targeting ACYPS2A12ACYPS2A12B construct insertion into ADH-B11B AADH-A4, AADH-A4B AADH B4/AADH-B4B AADH A10/AADH-A1 OBAADH B11 ADHB11B::SAT1 DP436 DP1 ACYP52A17 ACYP52A18 Growth of DP433 with maltose followed DP437 ACYPS2A13:ACYPS2A14 by agar plate screen for loss of AFAO1 AFAO1B nourseothricin resistance and PCR screen AFAO2AAFAO2B for excision of targeting construct from ACYPS2A12ACYPS2A12B ADH-B11B AADH-A4, AADH-A4B AADH B4/AADH-B4B AADH A10/AADH-A1 OBAADH B11 AADHB11B DP522 DP1 ACYP52A17 ACYP52A18 Electroporation of DP421 with DP523 ACYPS2A13:ACYPS2A14 pXICL:CYP52A13 (SEQID NO: 72) and AFAO1 AFAO1B selection for Zeocin resistance followed by AFAO2AAFAO2B PCR screens for targeting construct ACYPS2A12ACYPS2A12B insertion into the isocitrate lyase gene AADH-A4, AADH-A4B AADH B4/AADH-B4B AADH-A10 AADH-B11 pXICL:CYP52A13 DP526 DP1 ACYP52A17 ACYP52A18 Electroporation of DP421 with DP527 ACYPS2A13:ACYPS2A14 pXICL:CYP52A12 (SEQID NO: 74) and AFAO1 AFAO1B selection for Zeocin resistance followed by AFAO2AAFAO2B PCR screens for targeting construct ACYPS2A12ACYPS2A12B insertion into the isocitrate lyase gene AADH-A4, AADH-A4B AADH B4/AADH-B4B AADH-A10 AADH-B11 pXICL:CYP52A12

7.1. General Protocols for Transformation of 7.1.1. Preparation of DNA Targeting Constructs Prior to Inte Candida gration into Candida tropicalis 0273 A linear segment of DNA of the form shown sche 0272. The protocols described in this section have been matically in either FIG. 4 or FIG. 7 was prepared by digesting performed using Candida tropicalis. However it is expected between 2.5 and 5ug of the plasmid containing the targeting that they will work in the Saccharomycetacaeae Family in construct with flanking restriction enzymes, in the examples general and the Candida genus in particular without undue below the restriction enzyme BsmBI from New England experimentation since the methods rely upon homologous Biolabs was used according to the manufacturers instruc recombination which is found throughout this Family. tions. The digest was purified using Qiagen’s PCR purifica US 2015/0094.483 A1 Apr. 2, 2015

tion kit, eluted in 75 of Qiagen's EB buffer (elution buffer) each analysis, genomic DNA was prepared from the parental and transformed into C. tropicalis by electroporation. strain that was transformed with the targeting construct. Oli 7.1.2. Preparation of Electrocompetent Candida tropicalis gonucleotide primers for PCR analysis were chosen to lie 0274 The desired C. tropicalis strain was densely within the targeting construct and/or in the genomic sequence streaked from a culture stored at -80° C. in growth media Surrounding the desired integration location, as shown for (YPD) containing 10% glycerol, onto 2-3 100 mm YPD Agar example in FIG. 10. The size of amplicons was used to deter plates and incubated overnight at 30°C. The next morning 10 mine which strain(s) possessed the desired genomic struc ml YPD broth was spread onto the surface of the YPD agar ture. PCR primer sequences and diagnostic amplicon sizes plates and the yeast cells were scraped from the plates with are described for many of the targeting constructs in Section the aid of a sterile glass spreader. Cells (of the same Strain) 7. PCR reaction mixes were prepared containing 5 ul of from the 2-3 plates were combined in a 50 ml conical tube, 10xNEB Standard Taq. Buffer, 2.5ul of dNTP mix (6 mM of and the A of a 1:20 dilution determined. Sufficient cells to each of dATP, dCTP, dGTP, dTTP), 2.5ul of oligonucleotide prepare 50 ml of YPD containing yeast cells at an Asoo of 0.2 primer 1 (10 mM), 2.5 ul of oligonucleotide primer 2 (10 were placed in each of two 50 ml conical tubes and pelleted in mM), 1 ul of NEB Taq DNA polymerase (5 U of enzyme), 2.5 a centrifuge for 5 min at 400xg. The cells in each tube were ul of Candida gCNA and water to 50 PCR reactions were suspended in 10 ml of TE/Li mix (100 mM LiC1, 10 mM subjected to the following temperatures for the times indi Tris-C1, 1 mM EDTA, pH 7.4). Both tubes were incubated in cated to amplify the target DNA: a shaking incubator for 1 hour at 30° C. and 125 rpm, then 250 0278 Step 1: 1.5 min (a 95°C. ul of 1M DTT was added to each 10 ml cell suspension and 0279 Step 2: 30 sec (a 95°C. incubation continued for a further 30 min at 30° C. and 125 0280 Step 3:30 sec (a 48°C. (or -5°C. lower than the rpm. calculated Tm for the primers as appropriate) 0275. The cells were then washed twice in water and once (0281 Step 4: 1 min (a) 72° C. (or 1 minute per 1kb for in sorbitol. Sterile, ice-cold purified water (40 ml) was added predicted amplicon size) to each of the cell suspensions which were then centrifuged 0282 Step 5: Go to step 2 a further 29 times for 5 min at 400xg at 4°C. and the supernatant decanted off. (0283 Step 6: 2 min (a 72° C. The cells in each tube were resuspended in 50 ml of sterile, 0284 Step 7: Hold (a 4°C. ice-cold purified water, centrifuged for 5 min at 400xg at 4 0285) Step 8: End C., the Supernatant decanted off supernatant. The cells in each 0286 The amplicon sizes were determined by running tube were then resuspended in 25 ml of ice cold 1 M Sorbitol 5-10 ul of the completed PCR reaction on a 1% Agarose-TBE (prepared with purified water) and centrifuged for 5 min at gel. 400xg. The supernatant was decanted from each tube and 7.1.4. Selection and Screen for Isolates Having Excised Tar cells resuspended in the small residual volume of Sorbitol geting Constructs from the Genome of Candida tropicalis Solution (the Volume of each Suspension was approximately 0287 Strains carrying a genomic targeting construct to be 200 ul). The cell suspensions from both tubes were then excised were inoculated from aYPDagar stock plate into 2 ml pooled, this provided enough cells for 4-8 electroporations. In YP (YPD without dextrose) broth-i-2% maltose in a 14 ml a 1.5 mileppendorf tube on ice, 60 ul of cells were mixed with culture tube. The culture tubes were rolled for ~48 hours at 60 ul (~2.5ug) of BsmBI digested vector DNA containing the 30° C. on a rollerdrum. Growth with maltose induced pro genomic targeting construct. A No DNA Control was pre duction of Flp recombinase in the host strain from the inte pared for every transformation by mixing cells with Qiagen grated targeting construct. The Flp recombinase then acted at EB (elution buffer) instead of DNA. The cell-DNA mixtures Frt sites located near the ends of the targeting construct (be were mixed with a vortexer and transferred to an ice-cold tween the targeting sequences) to excise the sequences Bio-Rad 0.2 cm electrode gap Gene Pulser cuvette. The cells between the Frt sites, including the genes encoding Flp were then electroporated at 1.8 kV using a Bio-Rad E. coli recombinase and conferring nourseothricin resistance. The Pulser, 1 ml of 1M D-Sorbitol was added and the electropo culture was then diluted in serial 10-fold dilutions from rated cells were transferred to a 14 ml culture tube and 1 ml of 10-fold to 10,000-fold. Aliquots (100 ul) of 100, 1,000 and 2xYPD broth was added. Cells were then rolled on a Roller 10,000-fold dilutions were spread onto YPD agar plates. drum for 1 hour at 37°C. before spreading 100 ul on 100 mm 0288 Putative excisants were identified by replica-plating diameter plates containing YPD Agar--200 ug/ml nourseo colonies on the YPD agar plates from the dilution series (the thricin. Plates were incubated for 2-4 days at 30° C. Large most useful plates for this purpose were those with 50-500 colonies (8-16) were individually streaked onto aYPD Agar colonies) to aYPD agar--200 ug/ml nourseothricin plates and plate to purify. A single colony from each streak was patched then to aYPDagar plate. Putative excisants were identified as to aYPD agar stock plate and incubated overnight at 30° C. colonies that grow onYPD agar, but notYPDagar+200ug/ml 7.1.3. Genomic DNA Preparation and PCR Test for Integra nourseothricin following overnight incubation at room tem tion of Genomic Targeting Constructs at the Desired Location perature. Putative excisants were streaked for single colonies in Candida tropicalis to aYPD agar plate and incubated overnight at 30 C. A single 0276 Between 5 and 30 nourseothricin-resistant isolates isolate of each of the putative excisants is patched to a YPD were each inoculated into 2 ml of YP Broth and rolled over agar stock plate and incubated overnight at 30° C. night at 30°C. on a Rollerdrum. Genomic DNA from a 0.5 ml 0289 Putative excisants were inoculated from the stock sample of each culture was isolated using Zymo Research's plate to 2 ml of YPD broth in a 14 ml culture tube and rolled YeaStar genomic DNA isolation kit according to the manu overnight at 30° C. on a Rollerdrum. Genomic DNA was facturers instructions, eluting the DNA in 120 ul of TE, pH prepared from 0.5 ml of the overnight culture using the 8.O. YeaStar Genomic DNA Isolation Kit from Zymo Research (0277 For PCR tests, 2.5 ul of the resulting g|DNA was and eluted in 120 ul of TE, pH 8.0. Excision of the targeting used in a 50 ul PCR amplification reaction. As a control for construct was tested by PCR as described in 7.1.3. US 2015/0094.483 A1 Apr. 2, 2015 36

7.2. Deletion of Cytochrome P450 Genes from enzymes NotI and XhoI, and ligating it into the CYP52A17 Candida pre-targeting construct (SEQID NO:3) from which the 20 bp stuffer had been removed by digestion with restriction 0290. The CYP52A type P450s are responsible for oxida enzymes NotI and XhoI. The sequence of the resulting tar tion of a variety of compounds in several Candida species, geting construct for deletion of CYP52A17 is given as SEQ including ()-hydroxylation of fatty acids (Craft et al., 2003, ID NO: 4. This sequence is a specific example of the construct Appl Environ Microbiol: 69, 5983-91; Eschenfeldt et al., shown generically in FIG. 4: it has nearly 300 base pairs of the 2003, Appl Environ Microbiol: 69, 5992-9. Ohkuma et al., genomic sequence of CYP52A17 at each end to serve as a 1991, DNA Cell Biol: 10, 271-82; Zimmer et al., 1995, DNA targeting sequence; between the targeting sequences are two Cell Biol: 14, 619-28; and Zimmer et al., 1996, Biochem frt sites that are recognized by the flp recombinase; between Biophy's Res Commun: 224, 784-9.) They have also been the two frt sites are sequences encoding the flp recombinase implicated in the further oxidation of thesecompounds. See and a protein conferring resistance to the antibiotic nourseo Eschenfeldt et al., 2003, “Transformation of fatty acids cata thricin. Not shown in SEQID NO. 4 but also present in the lyzed by cytochrome P450 monooxygenase enzymes of Can targeting construct were a selective marker conferring resis dida tropicalis.” Appl. Environ. Microbiol. 69: 5992-5999, tance to kanamycin and a bacterial origin of replication, so which is hereby incorporated by reference herein. In some that the targeting construct can be grown and propagated in E embodiments it is desirable to engineer one or more CYP52A coli. The targeting sequences shown in SEQID NO: 4 also type P450s in a strain of Candida in order to modify the include a BSmBI restriction site at each end of the construct, activity or specificity of the P450 enzyme. In some such so that the final targeting construct can be linearized and embodiments it is advantageous to eliminate the activities of optionally separated from the bacterial antibiotic resistance one or more CYP52A type P450 enzymes endogenous to the marker and origin of replication prior to transformation into strain. Reasons to delete endogenous P450 enzymes include Candida tropicalis. more accurate determination of the activity and specificity of a P450 enzyme that is being engineered and elimination of 0293 Candida tropicalis strain DP65 was prepared by P450 enzymes whose activities may interfere with synthesis integration of the construct shown as SEQID NO. 4 into the of the desired product. Strains lacking one or more of their genome of strain DP1 (Table 3) at the site of the genomic natural CYP52A P450s are within the scope of the disclosed sequence of the gene for CYP52A17. Candida tropicalis technology. For example in order to obtaina Strain of Candida strain DP78 was prepared by excision of the targeting con species of yeast including Candida tropicalis for the produc struct from the genome of strain DP65, thereby deleting the tion of oxidized compounds including co-hydroxy fatty acids, gene encoding CYP52A17. Integration and deletion of tar one method is to reduce or eliminate CYP52A type P450s and geting sequence SEQID NO. 4, and analysis of integrants and other enzyme activities within the cell-that oxidise ()-hy excisants were performed as described in Section 7.1. droxy fatty acids to C.O)-diacids. It is then possible to re Sequences of oligonucleotide primers for analysis of strains introduce one CYP52A type P450 or other enzyme that per Were forms the desired reaction, and to engineer it so that its activity is increased towards desired Substrates and reduced (SEO ID NO : 77) towards undesired substrates. In one embodiment its activity 17-IN-L3: TGGCGGAAGTGCATGTGACACAACG for co-hydroxylation of fatty acids is increased relative to its oxidation of ()-hydroxy fatty acids to C. ()-diacids, thereby (SEO ID NO : 78) favoring the production of ()-hydroxy fatty acids over C.()- 17-IN-R2: GTGGTTGGTTTGTCTGAGTGGAGAG diacids. (SEO ID NO : 79) SAT1-R: TGGTACTGGTTCTCGGGAGCACAGG 7.2.1. Deletion of CYP52A17 (SEQ ID NO: 80) 0291. The sequence of a gene encoding a cytochrome SAT1 - F : CGCTAGACAAATTCTTCCAAAAATTTTAGA P450 in Candida tropicalis, CYP52A17 is given as SEQID 0294 For strain DP65 (integration of SEQ ID NO: 4), NO: 2. This sequence was used to design a "pre-targeting PCR with primers 17-IN-L3 and SAT1-R produces a 959 base construct comprising two targeting sequences from the 5' and pair amplicon: PCR with primers SAT1-F and 17-IN-R2 pro 3' end of the structural gene. The targeting sequences were duces a 922 base pair amplicon. PCR with primers 17-IN-L3 separated by a sequence, given as SEQID NO: 12, compris and 17-IN-R2 from a strain carrying a wild type copy of ing a NotI restriction site, a 20 base pair stuffer fragment and CYP52A17 produces a 2.372 base pair amplicon. For strain an XhoI restriction site. The targeting sequences were flanked DP78, with a deleted copy of CYP52A17, PCR with primers by two BsmBI restriction sites, so that the final targeting 17-IN-L3 and 17-IN-R2 produces a 1,478 base pair amplicon. construct can be linearized prior to transformation into Can dida tropicalis. The sequence of the CYP52A17 pre-targeting 0295 Deletion of a portion of the coding sequence of the construct is given as SEQID NO: 3. Not shown in SEQ ID gene for CYP52A17 will disrupt the function of the protein NO: 3 but also present in the pre-targeting construct were a encoded by this gene in the Candida host cell. selective marker conferring resistance to kanamycin and a bacterial origin of replication, so that the pre-targeting con 7.2.2. Deletion of CYP52A13 struct can be grown and propagated in Ecoli. The sequence 0296. The sequence of a gene encoding a cytochrome was synthesized using standard DNA synthesis techniques P450 in Candida tropicalis, CYP52A13 is given as SEQID well known in the art. NO: 5. This sequence was used to design a "pre-targeting 0292. A targeting construct for deletion of CYP52A17 construct comprising two targeting sequences from the 5' and from the Candida tropicalis genome was prepared by digest 3' end of the structural gene. The targeting sequences were ing the SAT-1 flipper (SEQ ID NO: 1) with restriction separated by a sequence, given as SEQID NO: 12, compris US 2015/0094.483 A1 Apr. 2, 2015 37 ing a NotI restriction site, a 20 base pair stuffer fragment and CYP52A13 produces a 2.259 base pair amplicon. For strain an XhoI restriction site. The targeting sequences were flanked DP113 with a deleted version of CYP52A13 PCR with prim by two BsmBI restriction sites, so that the final targeting ers 13-IN-L2 and 13-IN-R2 produces a 1,350 base pair ampli construct can be linearized prior to transformation into Can CO. dida tropicalis. The sequence of the CYP52A13 pre-targeting 0301 Deletion of a portion of the coding sequence of the construct is given as SEQID NO: 6. Not shown in SEQ ID gene for CYP52A13 will disrupt the function of the protein NO: 6 but also present in the pre-targeting construct were a encoded by this gene in the Candida host cell. selective marker conferring resistance to kanamycin and a bacterial origin of replication, so that the pre-targeting con 7.2.3. Deletion of CYP52A18 struct can be grown and propagated in Ecoli. The sequence was synthesized using standard DNA synthesis techniques 0302 The sequence of a gene encoding a cytochrome P450 in Candida tropicalis, CYP52A18 is given as SEQID well known in the art. NO: 8. This sequence was used to design a "pre-targeting 0297. A targeting construct for deletion of CYP52A13 construct comprising two targeting sequences from the 5' and from the Candida tropicalis genome was prepared by digest 3' end of the structural gene. The targeting sequences were ing the SAT-1 flipper (SEQ ID NO: 1) with restriction separated by a sequence, given as SEQID NO: 12, compris enzymes NotI and XhoI, and ligating it into the CYP52A13 ing a NotI restriction site, a 20 base pair stuffer fragment and pre-targeting construct (SEQID NO: 6) from which the 20 bp an XhoI restriction site. The targeting sequences were flanked stuffer had been removed by digestion with restriction by two BsmBI restriction sites, so that the final targeting enzymes NotI and XhoI. The sequence of the resulting tar construct can be linearized prior to transformation into Can geting construct for deletion of CYP52A13 is given as SEQ dida tropicalis. The sequence of the CYP52A18 pre-targeting ID NO: 7. This sequence is a specific example of the construct construct is given as SEQ ID NO: 9. The CYP52A18 pre shown generically in FIG. 4: it has nearly 300 base pair of the targeting construct also contains a polylinker sequence (SEQ genomic sequence of CYP52A13 at each end to serve as a ID NO: 10) between the 5' targeting sequence and the NotI targeting sequence; between the targeting sequences are two site. This polylinker sequence was placed to allow the inser frt sites that are recognized by the flp recombinase; between tion of sequences into the targeting construct to allow it to the two frt sites are sequences encoding the flp recombinase function as an insertion targeting construct of the form shown and a protein conferring resistance to the antibiotic nourseo schematically in FIG. 7. Not shown in SEQID NO:9 but also thricin. Not shown in SEQID NO: 7 but also present in the present in the pre-targeting construct were a selective marker targeting construct were a selective marker conferring resis conferring resistance to kanamycin and a bacterial origin of tance to kanamycin and a bacterial origin of replication, so replication, so that the pre-targeting construct can be grown that the targeting construct can be grown and propagated in E and propagated in Ecoli. The sequence was synthesized using coli. The targeting sequences shown in SEQID NO: 7 also standard DNA synthesis techniques well known in the art. A include a BSmBI restriction site at each end of the construct, targeting construct for deletion of CYP52A18 from the Can so that the final targeting construct can be linearized and dida tropicalis genome was prepared by digesting the SAT-1 optionally separated from the bacterial antibiotic resistance flipper (SEQ ID NO: 1) with restriction enzymes NotI and marker and origin of replication prior to transformation into XhoI, and ligating it into the CYP52A18 pre-targeting con Candida tropicalis. struct (SEQID NO:9) from which the 20 base pair stuffer had 0298 Candida tropicalis strain DP107 was prepared by been removed by digestion with restriction enzymes Notland integration of the construct shown as SEQID NO: 7 into the XhoI. The sequence of the resulting targeting construct for genome of strain DP65 (Table 3) at the site of the genomic deletion of CYP52A18 is given as SEQ ID NO: 11. This sequence of the gene for CYP52A13. Candida tropicalis sequence is a specific example of the construct shown generi strain DP113 was prepared by excision of the targeting con cally in FIG. 4: it has nearly 300 base pairs of the genomic struct from the genome of strain DP107, thereby deleting the sequence of CYP52A18 at each end to serve as a targeting gene encoding CYP52A13. Integration and deletion of tar sequence; between the targeting sequences are two frt sites geting sequence SEQID NO:7, and analysis of integrants and that are recognized by the flp recombinase; between the two excisants were performed as described in Section 7.1. frt sites are sequences encoding the flp recombinase and a 0299 Sequences of oligonucleotide primers for analysis protein conferring resistance to the antibiotic nourseothricin. of strains were: Not shown in SEQID NO: 11 but also present in the targeting construct were a selective marker conferring resistance to (SEQ ID NO: 81) kanamycin and a bacterial origin of replication, so that the 13 - IN-L2: CATGTGGCCGCTGAATGTGGGGGCA targeting construct can be grown and propagated in Ecoli. The targeting sequences shown in SEQ ID NO: 11 also (SEQ ID NO: 82) include a BSmBI restriction site at each end of the construct, 13 - IN-R2: GCCATTTTGTTTTTTTTTACCCCTCTAACA so that the final targeting construct can be linearized and (SEO ID NO: 79) optionally separated from the bacterial antibiotic resistance SAT1-R: marker and origin of replication prior to transformation into Candida tropicalis. (SEQ ID NO: 80) SAT1 - F : 0303 Candida tropicalis strain DP140 was prepared by integration of the construct shown as SEQID NO: 11 into the 0300 For strain DP107 (integration of SEQ ID NO: 7), genome of strain DP113 (Table 3) at the site of the genomic PCR with primers 13-IN-L2 and SAT1-R produces an 874 sequence of the gene for CYP52A18. Candida tropicalis base pair amplicon: PCR with primers SAT1-F and 13-IN-R2 strain DP142 was prepared by excision of the targeting con produces an 879 base pair amplicon. PCR with primers struct from the genome of strain DP140, thereby deleting the 13-IN-L2 and 13-IN-R2 from a strain with wild type gene encoding CYP52A18. Integration and deletion of tar US 2015/0094.483 A1 Apr. 2, 2015

geting sequence SEQID NO: 11, and analysis of integrants of the genomic sequence of CYP52A14 at each end to serve and excisants were performed as described in Section 7.1. as a targeting sequence; between the targeting sequences are 0304 Oligonucleotide primers for analysis of strains two frt sites that are recognized by the flp recombinase; Were between the two frt sites are sequences encoding the flp recombinase and a protein conferring resistance to the anti biotic nourseothricin. Not shown in SEQID NO: 15 but also (SEQ ID NO: 83) present in the targeting construct were a selective marker 18-IN-L2: GGAAGTGCATGTGACACAATACCCT conferring resistance to kanamycin and a bacterial origin of (SEQ ID NO: 84) replication, so that the targeting construct can be grown and 18-IN-R2: GGTGGTTTGTCTGAGTGAGAACGTTTAATT propagated in Ecoli. The targeting sequences shown in SEQ ID NO: 15 also include a BSmBI restriction site at each end of (SEO ID NO: 79) the construct, so that the final targeting construct can be SAT1-R: TGGTACTGGTTCTCGGGAGCACAGG linearized and optionally separated from the bacterial antibi (SEQ ID NO: 80) otic resistance marker and origin of replication prior to trans SAT1 - F : CGCTAGACAAATTCTTCCAAAAATTTTAGA formation into Candida tropicalis. 0305 For strain DP140 (integration of SEQID NO: 11), (0309 Candida tropicalis strain DP170 was prepared by PCR with primers 18-IN-L2 and SAT1-R produces a 676 base integration of the construct shown as SEQID NO: 15 into the pair amplicon: PCR with primers SAT1-F and 18-IN-R2 pro genome of strain DP 142 (Table 3) at the site of the genomic duces a 605 base pair amplicon. PCR from a strain with a wild sequence of the gene for CYP52A14. Candida tropicalis type version of CYP52A18 with primers 18-IN-L2 and strain DP174 was prepared by excision of the targeting con 18-IN-R2 produces a 2.328 base pair amplicon. For strain struct from the genome of strain DP170, thereby deleting the DP142 with a deleted version of CYP52A18, PCR with prim gene encoding CYP52A14. Integration and deletion of tar ers 18-IN-L2 and 18-IN-R2 produces an 878 base pair ampli geting sequence SEQID NO: 15, and analysis of integrants CO and excisants were performed as described in Section 7.1. 0306 Deletion of a portion of the coding sequence of the 0310 Oligonucleotide primers for analysis of strains gene for CYP52A18 will disrupt the function of the protein Were encoded by this gene in the Candida host cell. (SEO ID NO: 85) 7.2.4. Deletion of CYP52A14 14 - IN-T 2: GACGTAGCCGATGAATGTGGGGTGC 0307 The sequence of a gene encoding a cytochrome (SEQ ID NO: 86) P450 in Candida tropicalis, CYP52A14 is given as SEQID 14 - IN-R2: TGCCATTTATTTTTTATTACCCCTCTAAAT NO: 13. This sequence was used to design a "pre-targeting (SEO ID NO : 79) construct comprising two targeting sequences from the 5' and SAT1-R: 3' end of the structural gene. The targeting sequences were separated by a sequence, given as SEQID NO: 12, compris (SEQ ID NO: 80) ing a NotI restriction site, a 20 base pair stuffer fragment and SAT1 - F : an XhoI restriction site. The targeting sequences were flanked 0311. For strain DP170 (integration of SEQID NO: 15), by two BsmBI restriction sites, so that the final targeting PCR with primers 14-IN-L2 and SAT1-R produces a 664 base construct can be linearized prior to transformation into Can pair amplicon: PCR with primers SAT1-F and 14-IN-R2 pro dida tropicalis. The sequence of the CYP52A14 pre-targeting duces a 609 base pair amplicon. For a strain with a wild type construct is given as SEQID NO: 14. The CYP52A14 pre version of CYP52A14, PCR with primers 14-IN-L2 and targeting construct also contains a polylinker sequence (SEQ 14-IN-R2 produces a 2.234 base pair amplicon. For strain ID NO: 10) between the 5' targeting sequence and the NotI DP174 with a deleted version of CYP52A14, PCR with prim site. This polylinker sequence was placed to allow the inser ers 14-IN-L2 and 14-IN-R2 produces an 870 base pair ampli tion of sequences into the targeting construct to allow it to CO function as an insertion targeting construct of the form shown schematically in FIG. 7. Not shown in SEQ ID NO: 14 but 0312 Deletion of a portion of the coding sequence of the also present in the pre-targeting construct were a selective gene for CYP52A14 will disrupt the function of the protein marker conferring resistance to kanamycin and a bacterial encoded by this gene in the Candida host cell. origin of replication, so that the pre-targeting construct can be grown and propagated in Ecoli. The sequence was synthe 7.3. Deletion of Fatty Alcohol Oxidase Genes from sized using standard DNA synthesis techniques well known Candida in the art. 0313 At least one enzyme capable of oxidizing 03-hy 0308. A targeting construct for deletion of CYP52A14 droxy fatty acids is present in Candida tropicalis in addition from the Candida tropicalis genome was prepared by digest to the cytochrome P450 genes encoding CYP52A13, ing the SAT-1 flipper (SEQ ID NO: 1) with restriction CYP52A14, CYP52A17 and CYP52A18. Oxidation of enzymes NotI and XhoI, and ligating it into the CYP52A14 energy rich molecules reduces their energy content. For the pre-targeting construct (SEQID NO: 14) from which the 20 production of incompletely oxidized compounds-including bp stuffer had been removed by digestion with restriction ()-hydroxy fatty acids, it is advantageous to reduce or elimi enzymes NotI and XhoI. The sequence of the resulting tar nate the further oxidation of incompletely oxidized com geting construct for deletion of CYP52A14 is given as SEQ pounds-Such as ()-hydroxy fatty acids. Under one aspect, this ID NO: 15. This sequence is a specific example of the con can be achieved by deleting the genes encoding the oxidizing struct shown generically in FIG. 4: it has nearly 300 base pairs enzymes from the Candida genome Candidate genes for this US 2015/0094.483 A1 Apr. 2, 2015 39 activity include fatty alcohol oxidase and dehydrogenases as 0317 Sequences of oligonucleotide primers for analysis shown in FIG. 14. One class of enzymes known to oxidize of strains were: incompletely oxidised compounds including hydroxy fatty acids are the fatty alcohol oxidases. (SEO ID NO : 87) 7.3.1. Deletion of FAO1 FAO1-IN-L: ATTGGCGTCGTGGCATTGGCGGCTC 0314. The sequence of a gene encoding a fatty alcohol (SEQ ID NO: 88) oxidase in Candida tropicalis, FAO1 is given as SEQID NO: FAO1-IN-R: TGGGCGGAATCAAGTGGCTT 16. This sequence was used to design a “pre-targeting con (SEO ID NO : 79) struct comprising two targeting sequences from the 5' and 3' SAT1-R: TGGTACTGGTTCTCGGGAGCACAGG end of the structural gene. The targeting sequences were (SEQ ID NO: 80) separated by a sequence, given as SEQID NO: 12, compris SAT1 - F : CGCTAGACAAATTCTTCCAAAAATTTTAGA ing a NotI restriction site, a 20 base pair stuffer fragment and an XhoI restriction site. The targeting sequences were flanked 0318 For strain DP182 (integration of SEQID NO: 18), by two BsmBI restriction sites, so that the final targeting PCR with primers FAO1-IN-L and SAT1-R produces a 624 construct can be linearized prior to transformation into Can base pair amplicon: PCR with primers SAT1-F and FAO1 dida tropicalis. The sequence of the FAO1 pre-targeting con IN-R produces a 478 base pair amplicon. For a strain with a struct is given as SEQ ID NO: 17. The FAO1 pre-targeting wild type copy of FAO1, PCR with primers FAO1-IN-L and construct also contains a polylinker sequence (SEQ ID NO: FAO1-IN-R produces a 2,709 base pair amplicon. For strain 10) between the 5' targeting sequence and the NotI site. This DP186 with a deleted copy of FAO1, PCR with primers polylinker sequence was placed to allow the insertion of FAO1-IN-L and FAO1-IN-R produces a 699 base pair ampli sequences into the targeting construct to allow it to function CO. as an insertion targeting construct of the form shown sche 0319 Deletion of a portion of the coding sequence of the matically in FIG. 7. Not shown in SEQID NO: 17 but also gene for FAO 1A will disrupt the function of the protein present in the pre-targeting construct were a selective marker encoded by this gene in the Candida host cell. conferring resistance to kanamycin and a bacterial origin of replication, so that the pre-targeting construct can be grown 7.3.2. Deletion of FAO1B and propagated in Ecoli. The sequence was synthesized using 0320 No sequence had been reported for a second allele standard DNA synthesis techniques well known in the art. for FAO1 (FAO1B) at the time of this work. To identify the 0315 A targeting construct for deletion of FAO1 from the allele (BAO1B) we used PCR amplification primers and Candida tropicalis genome was prepared by digesting the sequencing primers designed to anneal to the known SAT-1 flipper (SEQID NO: 1) with restriction enzymes NotI sequenced allele of FAO1. The primers used were: and XhoI, and ligating it into the FAO1 pre-targeting con struct (SEQ ID NO: 17) from which the 20 base pair stuffer had been removed by digestion with restriction enzymes NotI (SEO ID NO: 89) and XhoI. The sequence of the resulting targeting construct FAO1 F1; CGTCGACACCCTTATGTTAT for deletion of FAO1 is given as SEQ ID NO: 18. This (SEO ID NO: 90) sequence is a specific example of the construct shown generi FAO1 F2; CGTTGACTCCTATCAAGGACA cally in FIG. 4: it has nearly 300 base pairs of the genomic sequence of FAO1 at the 5' end and 220 base pairs of the (SEQ ID NO: 91) genomic sequence of FAO1 at the 3' end to serve as a targeting FAO1 R1; GGTCTTCTCTTCCTGGATAATG sequence; between the targeting sequences are two frt sites (SEQ ID NO: 92) that are recognized by the flp recombinase; between the two FAO1 F3; CCAGCAGTTGTTTGTTCTTG frt sites are sequences encoding the flp recombinase and a protein conferring resistance to the antibiotic nourseothricin. (SEO ID NO: 93) Not shown in SEQID NO: 18 but also present in the targeting FAO1 R2; AATCCTGTGCTTTGTCGTAGGC construct were a selective marker conferring resistance to (SEQ ID NO: 94) kanamycin and a bacterial origin of replication, so that the FAO1 F4; TCCTTAACAAGAAGGGCATCG targeting construct can be grown and propagated in Ecoli. (SEO ID NO: 95) The targeting sequences shown in SEQ ID NO: 18 also FAO1 R3; TTCTTGAATCCGGAGTTGAC include a BSmBI restriction site at each end of the construct, so that the final targeting construct can be linearized and (SEQ ID NO: 96) optionally separated from the bacterial antibiotic resistance FAO1 F5; TCTTAGTCGTGATACCACCA marker and origin of replication prior to transformation into (SEO ID NO: 97) Candida tropicalis. FAO1 R4; CTAAGGATTCTCTTGGCACC 0316 Candida tropicalis strain DP182 was prepared by integration of the construct shown as SEQID NO: 18 into the (SEO ID NO: 98) genome of strain DP174 (Table 3) at the site of the genomic FAO1 R5; GTGACCATAGGATTAGCACC sequence of the gene for FAO1. Candida tropicalis strain 0321 Genomic DNA was prepared from strains DP1 DP186 was prepared by excision of the targeting construct (which has FAO1) and DP186 (which is deleted for FAO1) as from the genome of strain DP182, thereby deleting the gene described in section 7.1.3. The FAO genes were amplified encoding FAO1. Integration and deletion of targeting from genomic DNA by PCR using oligonucleotide primers sequence SEQ ID NO: 18, and analysis of integrants and FAO1 F1 and FAO1 R5. Genomic DNA from both strains excisants were performed as described in Section 7.1. yielded an amplicon of approximately 2 kilobases. Both US 2015/0094.483 A1 Apr. 2, 2015 40 amplicons were directly sequenced using the ten oligonucle 7.3.3. Deletion of FAO2A otide primers listed above. The amplicon from DP1 gave sequence where there were occasionally two bases that 0327. The sequence of a gene encoding a fatty alcohol appeared to be equally represented. The amplicon from oxidase in Candida tropicalis, FAO2A is given as SEQ ID DP186 had no such ambiguous bases but its sequence was NO: 22. This sequence was used to design a "pre-targeting slightly different 95% identical) from the reported sequence construct comprising two targeting sequences from the 5' and of FAO1. We concluded that the sequence corresponded to a 3' end of the structural gene. The targeting sequences were second allele of FAO1, which we refer to as FAO1B. The separated by a sequence, given as SEQID NO: 12, compris sequence of FAO1B is given as SEQID NO: 19. ing a NotI restriction site, a 20 bp stuffer fragment and an XhoI restriction site. The targeting sequences were flanked by 0322 This sequence was used to design a “pre-targeting two BSmBI restriction sites, so that the final targeting con construct comprising two targeting sequences from the 5' and struct can be linearized prior to transformation into Candida 3' end of the structural gene. The targeting sequences were tropicalis. The sequence of the FAO2A pre-targeting con separated by a sequence, given as SEQID NO: 12, compris struct is given as SEQID NO. 23. Not shown in SEQID NO: ing a Not restriction site, a 20 bp stuffer fragment and an 23 but also present in the pre-targeting construct were a selec XhoI restriction site. The targeting sequences were flanked by tive marker conferring resistance to kanamycin and a bacte two BSmBI restriction sites, so that the final targeting con rial origin of replication, so that the pre-targeting construct struct can be linearized prior to transformation into Candida can be grown and propagated in Ecoli. The sequence was tropicalis. The sequence of the FAO1B pre-targeting con synthesized using standard DNA synthesis techniques well struct is given as SEQID NO. 20. known in the art. 0323 A targeting construct for deletion of FAO1 from the 0328. A targeting construct for deletion of FAO2A from Candida tropicalis genome was prepared by digesting the the Candida tropicalis genome was prepared by digesting the SAT-1 flipper (SEQID NO: 1) with restriction enzymes NotI SAT-1 flipper (SEQID NO: 1) with restriction enzymes NotI and XhoI, and ligating it into the FAO1B pre-targeting con and XhoI, and ligating it into the FAO2A pre-targeting con struct (SEQ ID NO: 20) that had also been digested with struct (SEQID NO. 23) from which the 20 bp stuffer had been restriction enzymes NotI and XhoI. the FAO1B pre-targeting removed by digestion with restriction enzymes NotI and construct (SEQID NO: 20) was not cloned or propagated in XhoI. The sequence of the resulting targeting construct for a bacterial host, so digestion with restriction enzymes NotI deletion of FAO2A is given as SEQID NO: 24. This sequence and XhoI produced two fragments which were then ligated is a specific example of the construct shown generically in with the digested SAT-1 flipper to produce a targeting con FIG. 4: it has nearly 300 base pair of the genomic sequence of struct for deletion of FAO1B, given as SEQID NO: 21. This FAO2A at the 5' and 3' ends of the structural gene to serve as sequence is a specific example of the construct shown generi a targeting sequence; between the targeting sequences are two cally in FIG. 4: it has nearly 300 base pairs of the genomic frt sites that are recognized by the flp recombinase; between sequence of FAO1B at the 5' end and 220 base pairs of the the two frt sites are sequences encoding the flp recombinase genomic sequence of FAO1B at the 3' end to serve as a and a protein conferring resistance to the antibiotic nourseo targeting sequence; between the targeting sequences are two thricin. Not shown in SEQID NO: 24 but also present in the frt sites that are recognized by the flp recombinase; between targeting construct were a selective marker conferring resis the two frt sites are sequences encoding the flp recombinase tance to kanamycin and a bacterial origin of replication, so and a protein conferring resistance to the antibiotic nourseo that the targeting construct can be grown and propagated in E thricin. coli. The targeting sequences shown in SEQID NO: 24 also 0324 Candida tropicalis strain DP238 was prepared by include a BSmBI restriction site at each end of the construct, integration of the construct shown as SEQID NO: 21 into the so that the final targeting construct can be linearized and genome of strain DP 186 (Table 3) at the site of the genomic optionally separated from the bacterial antibiotic resistance sequence of the gene for FAO1B. Candida tropicalis strain marker and origin of replication prior to transformation into DP240 was prepared by excision of the targeting construct Candida tropicalis. from the genome of strain DP238, thereby deleting the gene 0329 Candida tropicalis strain DP255 was prepared by encoding FAO1B. Integration and deletion of targeting integration of the construct shown as SEQID NO: 24 into the sequence SEQ ID NO: 21, and analysis of integrants and genome of strain DP240 (Table 3) at the site of the genomic excisants were performed as described in Section 7.1. sequence of the gene for FAO2A. Candida tropicalis strain Sequences of oligonucleotide primers for analysis of strains DP256 was prepared by excision of the targeting construct were, FAO1 F1 (SEQID NO: 89), FAO1 R5 (SEQ ID NO: from the genome of strain DP255, thereby deleting most of 98), SAT1-R (SEQID NO: 79), SAT1-F (SEQID NO: 80). the coding portion of the gene encoding FAO2A. Integration 0325 For strain DP182 (integration of SEQID NO: 18), and deletion of targeting sequence SEQ ID NO: 24, and PCR with primers FAO1 F1 and SAT1-R produces a 558 analysis of integrants and excisants were performed as base pair amplicon: PCR with primers SAT1-F and FAO1 R5 described in Section 7.1. Sequences of oligonucleotide prim produces a 557 base pair amplicon. For a strain with a wild ers for analysis of Strains were: type copy of FAO1B, PCR with primers FAO1 F1 and FAO1 R5 produces a 2.007 base pair amplicon. For strain (SEO ID NO: 99) DP186, with a deleted copy of FAO1B, PCR with primers FAO2A-IN-L: CTTTTCTGATTCTTGATTTTCCCTTTTCAT FAO1 F1 and FAO1 R5 produces a 711 base pair amplicon. (SEQ ID NO: 1.OO) 0326 Deletion of a portion of the coding sequence of the FAO2A-IN-R: ATACATCTAGTATATAAGTGTCGTATTTCC gene for FAO1B will disrupt the function of the protein encoded by this gene in the Candida host cell. US 2015/0094.483 A1 Apr. 2, 2015 41

- Continued 0334 Candida tropicalis strain DP259 was prepared by (SEO ID NO: 79) integration of the construct shown as SEQID NO: 27 into the SAT1-R: genome of strain DP256 (Table 3) at the site of the genomic (SEQ ID NO: 80) sequence of the gene for FAO2BA. Candida tropicalis strain SAT1 - F : DP261 was prepared by excision of the targeting construct from the genome of strain DP259, thereby deleting most of 0330 For strain DP255 (integration of SEQID NO: 24), the coding region of the gene encoding FAO2B. Integration PCR with primers FAO2A-IN-L and SAT1-R produces a 581 and deletion of targeting sequence SEQ ID NO: 27, and base pair amplicon: PCR with primers SAT1-F and FAO2A analysis of integrants and excisants were performed as IN-R produces a 569 base pair amplicon. For a strain with a described in Section 7.1. wild type copy of FAO2A, PCR with primers FAO2A-IN-L 0335 Sequences of oligonucleotide primers for analysis and FAO2A-IN-R produces a 2,199 base pair amplicon. For of strains were: strain DP186 with a deleted copy of FAO2A, PCR with prim ers FAO2A-IN-L and FAO2A-IN-R produces a 747 base pair amplicon. (SEQ ID NO: 101) FAO2B-IN-L: TGCTTTTCTGATTCTTGATCATCCCCTTAG 0331 Deletion of a portion of the coding sequence of the gene for FAO2A will disrupt the function of the protein (SEQ ID NO: 102) encoded by this gene in the Candida host cell. FAO2B-IN-R: ATACATCTAGTATATAAGTGTCGTATTTCT (SEO ID NO : 79) 7.3.4. Deletion of FAO2B SAT1-R: 0332 The sequence of a gene encoding a fatty alcohol (SEQ ID NO: 80) oxidase in Candida tropicalis, FAO2B is given as SEQ ID SAT1 - F : NO: 25. This sequence was used to design a "pre-targeting 0336. For strain DP259 (integration of SEQID NO: 27), construct comprising two targeting sequences from the 5' and PCR with primers FAO2B-IN-L and SAT1-R produces a 551 3' end of the structural gene. The targeting sequences were base pair amplicon: PCR with primers SAT1-F and FAO2B separated by a sequence, given as SEQID NO: 12, compris IN-R produces a 571 base pair amplicon. For a strain with a ing a NotI restriction site, a 20 base pair stuffer fragment and wild type copy of FAO2B, PCR with primers FAO2B-IN-L an XhoI restriction site. The targeting sequences were flanked and FAO2B-IN-R produces a 2,198 base pair amplicon. For by two BsmBI restriction sites, so that the final targeting strain DP186 with a deleted copy of FAO2B, PCR with prim construct can be linearized prior to transformation into Can ers FAO2B-IN-L and FAO2B-IN-R produces a 719 base pair dida tropicalis. The sequence of the FAO2B pre-targeting amplicon. construct is given as SEQID NO: 26. Not shown in SEQ ID 0337 Deletion of a portion of the coding sequence of the NO: 26 but also present in the pre-targeting construct were a gene for FAO2B will disrupt the function of the protein selective marker conferring resistance to kanamycin and a encoded by this gene in the Candida host cell. bacterial origin of replication, so that the pre-targeting con struct can be grown and propagated in Ecoli. The sequence 7.4. Deletion of More Cytochrome P450 Genes from was synthesized using standard DNA synthesis techniques Candida well known in the art. 0338. At least one enzyme capable of oxidizing co-hy 0333 A targeting construct for deletion of FAO2B from droxy fatty acids is present in Candida tropicalis in addition the Candida tropicalis genome was prepared by digesting the to the cytochrome P450 genes encoding CYP52A13, SAT-1 flipper (SEQID NO: 1) with restriction enzymes NotI CYP52A14, CYP52A17 and CYP52A18 and fatty alcohol and XhoI, and ligating it into the FAO2B pre-targeting con oxidase genes FAO1, FAO1B, FAO2A and FAO2B. Oxida struct (SEQ ID NO: 26) from which the 20 base pair stuffer tion of energy rich molecules reduces their energy content. had been removed by digestion with restriction enzymes NotI For the production of incompletely oxidized compounds and XhoI. The sequence of the resulting targeting construct including ()-hydroxy fatty acids, it is advantageous to reduce for deletion of FAO2B is given as SEQ ID NO: 27. This or eliminate the further oxidation of incompletely oxidized sequence is a specific example of the construct shown generi compounds ()-hydroxy fatty acids. Under one aspect, this can cally in FIG. 4: it has nearly 300 base pairs of the genomic be achieved by deleting the genes encoding the oxidizing sequence of FAO2B at the 5' and 3' ends of the structural gene enzymes from the Candida genome. One class of enzymes to serve as a targeting sequence; between the targeting known to oxidize incompletely oxidised compounds are the sequences are two frt sites that are recognized by the flp cytochrome P450s. recombinase; between the two frt sites are sequences encod 0339. The CYP52A type P450s are responsible for co-hy ing the flp recombinase and a protein conferring resistance to droxylation offatty acids in several Candida species (Craft et the antibiotic nourseothricin. Not shown in SEQ ID NO: 27 al., 2003, Appl Environ Microbiol: 69,5983-91; Eschenfeldt but also present in the targeting construct were a selective et al., 2003, Appl Environ Microbiol: 69,5992-9; Ohkuma et marker conferring resistance to kanamycin and a bacterial al., 1991, DNA Cell Biol: 10, 271-82; Zimmer et al., 1995, origin of replication, so that the targeting construct can be DNA Cell Biol: 14, 619-28; Zimmer et al., 1996, Biochem grown and propagated in E coli. The targeting sequences Biophy's Res Commun: 224, 784-9.) They have also been shown in SEQID NO: 27 also includes a BSmBI restriction implicated in the further oxidation of these ()-hydroxy fatty site at each end of the construct, so that the final targeting acids to C,c)-diacids. See Eschenfeldt, et al., 2003, Appli. construct can be linearized and optionally separated from the Environ. Microbiol. 69: 5992-5999, which is hereby incor bacterial antibiotic resistance marker and origin of replication porated by reference herein. Another CYP52A type P450 prior to transformation into Candida tropicalis. whose expression is induced by fatty acids is CYP52A12. US 2015/0094.483 A1 Apr. 2, 2015 42

74.1. Deletion of CYP52A12 - Continued (SEO ID NO : 79) 0340. The sequence of a gene encoding a cytochrome SAT1-R: P450 in Candida tropicalis, CYP52A12 is given as SEQID NO: 28. This sequence was used to design a "pre-targeting (SEQ ID NO: 80) construct comprising two targeting sequences from the 5' and SAT1 - F : 3' end of the structural gene. The targeting sequences were (0344) For strain DP268 (integration of SEQID NO: 30), separated by a sequence, given as SEQID NO: 12, compris PCR with primers 12-IN-L and SAT1-R produces a 596 base ing a NotI restriction site, a 20 base pair stuffer fragment and pair amplicon: PCR with primers SAT1-F and 12-IN-R2 pro a XhoI restriction site. The targeting sequences were flanked duces a 650 base pair amplicon. For a strain with a wild type by two BsmBI restriction sites, so that the final targeting copy of CYP52A12, PCR with primers 12-IN-L and 12-IN construct can be linearized prior to transformation into Can R2 produces a 2.348 base pair amplicon. For strain DP272 dida tropicalis. The sequence of the CYP52A12 pre-targeting with a deleted copy of CYP52A12, PCR with primers construct is given as SEQID NO. 29. Not shown in SEQ ID 12-IN-L and 12-IN-R2 produces a 843 base pair amplicon. NO: 29 but also present in the pre-targeting construct were a 0345 Deletion of a portion of the coding sequence of the selective marker conferring resistance to kanamycin and a gene for CYP52A12 will disrupt the function of the protein bacterial origin of replication, so that the pre-targeting con encoded by this gene in the Candida host cell. struct can be grown and propagated in Ecoli. The sequence was synthesized using standard DNA synthesis techniques 7.4.2. Deletion of CYP52A12B well known in the art. 0346 No sequence had been reported for a second allele for CYP52A12 at the time of this work. We reasoned that in a 0341. A targeting construct for deletion of CYP52A12 diploid organisms a second allele existed (CYP52A17 and from the Candida tropicalis genome was prepared by digest CYP52A18 are an allelic pair and CYP52A13 and ing the SAT-1 flipper (SEQ ID NO: 1) with restriction CYP52A14 are an allelic pair). To delete the second allele we enzymes NotI and XhoI, and ligating it into the CYP52A12 synthesized a deletion construct based on the CYP52A12 pre-targeting construct (SEQID NO: 29) from which the 20 sequence (SEQID NO: 28), but designed it so that the target base pair stuffer had been removed by digestion with restric ing sequences were homologous to regions of the CYP52A12 tion enzymes Not and XhoI. The sequence of the resulting gene that are missing because they have been deleted in Strain targeting construct for deletion of CYP52A12 is given as DP272. First we constructed a “pre-targeting construct com SEQID NO:30. This sequence is a specific example of the prising two targeting sequences from near the 5' and 3' ends of construct shown generically in FIG. 4: it has nearly 300 base the structural gene, but internal to the two sequences used in pairs of the genomic sequence of CYP52A12 at each end to the design of the targeting construct for the deletion of serve as a targeting sequence; between the targeting CYP52A12. The targeting sequences were separated by a sequences are two frt sites that are recognized by the flp sequence, given as SEQID NO: 12, comprising a Not restric recombinase; between the two frt sites are sequences encod tion site, a 20 base pair stuffer fragment and a XhoI restriction ing the flp recombinase and a protein conferring resistance to site. The targeting sequences were flanked by two BsmBI the antibiotic nourseothricin. Not shown in SEQ ID NO:30 restriction sites, so that the final targeting construct can be but also present in the targeting construct were a selective linearized prior to transformation into Candida tropicalis. marker conferring resistance to kanamycin and a bacterial The sequence of the CYP52A12B pre-targeting construct is origin of replication, so that the targeting construct can be given as SEQID NO:31. Not shown in SEQID NO:31 but grown and propagated in E coli. The targeting sequences also present in the pre-targeting construct were a selective shown in SEQID NO:30 also include a BsmBI restriction site marker conferring resistance to kanamycin and a bacterial at each end of the construct, so that the final targeting con origin of replication, so that the pre-targeting construct can be struct can be linearized and optionally separated from the grown and propagated in Ecoli. The sequence was synthe bacterial antibiotic resistance marker and origin of replication sized using standard DNA synthesis techniques well known prior to transformation into Candida tropicalis. in the art. 0342 Candida tropicalis strain DP268 was prepared by (0347 A targeting construct for deletion of CYP52A12B integration of the construct shown as SEQID NO:30 into the from the Candida tropicalis genome was prepared by digest genome of strain DP261 (Table 3) at the site of the genomic ing the SAT-1 flipper (SEQ ID NO: 1) with restriction sequence of the gene for CYP52A12. Candida tropicalis enzymes Not and XhoI, and ligating it into the CYP52A12B strain DP272 was prepared by excision of the targeting con pre-targeting construct (SEQID NO: 31) from which the 20 struct from the genome of strain DP268, thereby deleting the base pair stuffer had been removed by digestion with restric gene encoding CYP52A12. Integration and deletion of tar tion enzymes Not and XhoI. The sequence of the resulting geting sequence SEQID NO:30, and analysis of integrants targeting construct for deletion of CYP52A12B is given as and excisants were performed as described in Section 7.1. SEQID NO:32. This sequence is a specific example of the 0343 Sequences of oligonucleotide primers for analysis construct shown generically in FIG. 4: it has nearly 300 base of strains were: pairs of the genomic sequence of CYP52A12 at each end to serve as a targeting sequence; between the targeting sequences are two frt sites that are recognized by the flp (SEQ ID NO: 103) recombinase; between the two frt sites are sequences encod 12-IN-L: CGCCAGTCTTTCCTGATTGGGCAAG ing the flp recombinase and a protein conferring resistance to (SEQ ID NO: 104) the antibiotic nourseothricin. Not shown in SEQID NO:32 12-IN-R2: GGACGTTGTCGAGTAGAGGGATGTG but also present in the targeting construct were a selective marker conferring resistance to kanamycin and a bacterial origin of replication, so that the targeting construct can be US 2015/0094.483 A1 Apr. 2, 2015

grown and propagated in E coli. The targeting sequences 7.5.1. Identification of Candida tropicalis Alcohol Dehydro shown in SEQID NO:32 also include a BsmBI restriction site genases at each end of the construct, so that the final targeting con 0353. The sequences of four alcoholdehydrogenase genes struct can be linearized and optionally separated from the were obtained from the Candida Geneome Database in the bacterial antibiotic resistance marker and origin of replication Department of Genetics at the School of Medicine, Stanford prior to transformation into Candida tropicalis. University, Palo Alto, Calif. The sequences of these genes are given as SEQ ID NO:33, SEQID NO:34, SEQ ID NO:35 0348 Candida tropicalis strain DP282 was prepared by and SEQID NO:36. These sequences were aligned and two integration of the construct shown as SEQID NO:32 into the degenerate oligonucleotide primers were designed, whose genome of strain DP272 (Table 3) at the site of the genomic sequences are given as SEQID NO:37 and SEQID NO:38. sequence of the gene for CYP52A12B. Candida tropicalis These two primers were used to PCR amplify from genomic strain DP284 was prepared by excision of the targeting con DNA from Candida tropicalis strain DP1. The resulting struct from the genome of strain DP282, thereby deleting a amplicon of ~1,000 base pairs was cloned and 96 independent portion of the coding region of the gene encoding transformants were picked, plasmid prepared and sequenced CYP52A12B. Integration and deletion of targeting sequence using two primers with annealing sites located in the vector SEQID NO:32, and analysis of integrants and excisants were reading into the cloning site and two primers designed to performed as described in Section 7.1. anneal to highly conserved sequences within the Candida albicans alcohol dehydrogenase sequences: 0349 Sequences of oligonucleotide primers for analysis of strains were: (SEO ID NO : 107) ADH-F: GTTTACAAAGCCTTAAAGACT (SEQ ID NO: 105) 12-F1: CTGTACTTCCGTACTTGACC (SEQ ID NO: 108) ADH-R: TTGAACGGCCAAAGAACCTAA. (SEQ ID NO: 106) 12-R1: GAGACCTGGATCAGATGAG 0354 Five different sequences were obtained by sequenc ing the 96 independent clones, called Ct ADH-A4, Ct ADH (SEO ID NO: 79) A10, Ct ADH-B2, Ct ADH-B4 and Ct ADH-B1 1. These SAT1-R: sequences are provided as SEQID NO:39, SEQID NO: 40, (SEQ ID NO: 80) SEQ ID NO: 41, SEQ ID NO: 42 and SEQ ID NO: 43 SAT1 - F : respectively. In silico translation of Ct ADH-B2 (SEQ ID NO: 41) yielded an amino acid sequence with multiple in 0350 Oligonucleotides 12-F1 and 12-R1 are designed to frame stop codons, so it is almost certainly a pseudogene and does not encode a functional protein. The other four anneal to a part of the genome that is missing in Strains with sequences all encode protein sequences without stop codons. deletions in CYP52A12. In such strains they will thus only be 0355 Amino acid sequences of the partial genes are pre able to anneal to and amplify from the second allele dicted and provided: SEQ ID NO:155 (ADH-A4), SEQ ID CYP52A12B. For strain DP282 (integration of SEQID NO: NO:154 (ADH-B4), SEQID NO:152 (ADH-A10), SEQ ID 32), PCR with primers 12-F1 and SAT1-R produces a 978 NO:153 (ADH-A10B) and SEQID NO:151 (ADH-B11). base pair amplicon: PCR with primers SAT1-F and 12-R1 0356. In some embodiments an alcohol dehydrogenase produces a 947 base pair amplicon. PCR from a strain with a gene is identified in the genome of a yeast of the genus wild type copy of CYP52A12B with primers 12-F1 and Candida by comparison with the nucleotide sequence of an 12-R1 produces a 1,478 base pair amplicon. For strain DP272 alcohol dehydrogenase from Candida tropicalis and is iden with a deleted copy of CYP52A12B, PCR with primers 12-F1 tified as an alcohol dehydrogenase if (i) it comprises an open and 12-R1 produces a 505 base pair amplicon. reading frame encoding a polypeptide at least 275 amino acids long or at least 300 amino acids long and (ii) the gene is 0351. Deletion of a portion of the coding sequence of the at least 65% identical, at least 70% identical, at least 75% gene for CYP52A12B will disrupt the function of the protein identical, at least 80% identical, at least 85% identical, at least encoded by this gene in the Candida host cell. 90% identical, at least 95% identical, or at least 98% identical for a stretch of at least 80, at least 85, at least 90, at least 95, 7.5. Deletion of Alcohol Dehydrogenase Genes from at least 100, at least 105, at least 110, at least 115, or at least Candida 120 contiguous nucleotides of the coding sequence of a Can dida tropicalis gene selected from the group consisting of 0352. At least one enzyme capable of oxidizing co-hy ADH-A4 (SEQ ID NO:39), ADH-B4 (SEQ ID NO: 42), droxy fatty acids is present in Candida tropicalis in addition ADH-A10 (SEQID NO: 40), ADH-AIOB (SEQID NO:56), to the cytochrome P450 genes encoding CYP52A13, and ADH-B11 (SEQID NO: 43). CYP52A14, CYP52A17, CYP52A18, CYP52A12, 0357 The sequence relationships of these protein CYP52A12B and the fatty alcohol oxidase genes FAO1, sequences are shown in a phylogenetic tree in FIG. 17. FAO1B, FAO2A and FAO2B. Oxidation of energy rich mol Ct ADH-A4 (encoded by SEQID NO:39) is most homolo ecules reduces their energy content. For the production of gous to Candida albicans ADH1A and Ct ADH-B4 (en incompletely oxidized compounds including ()-hydroxy fatty coded by SEQID NO: 42) is most homologous to Candida acids, it is advantageous to reduce or eliminate the further albicans ADH2A. oxidation of incompletely oxidized compounds, including for 0358 An alignment, using ClustalW, of the amino acid example ()-hydroxy fatty acids. Under one aspect, this can be sequences of alcohol dehydrogenase proteins predicted from achieved by deleting the genes encoding the oxidizing the sequences of genes from Candida albicans and Candida enzymes from the Candida genome. One class of enzymes tropicalis is shown in FIGS. 3A and 3B. The genes from known to oxidize alcohols is alcohol dehydrogenases. Candida tropicalis were isolated as partial genes by PCR with US 2015/0094.483 A1 Apr. 2, 2015 44 degenerate primers, so the nucleic acid sequences obtained These sequences will be eliminated by the first targeting for the genes represent only a partial sequence of the gene, construct from the first allele of the gene and will thus serve and the predicted amino acid sequences of the encoded pro as a targeting sequence for the second allele of the gene. As teins represent only a partial sequence of the protein. A con described below, this strategy succeeded with two ADH sensus is indicated underneath the aligned amino acid allelic pairs: those for ADH-A4 and ADH-B4. However at the sequences of FIGS. 3A and 3B, with a * indicating that all 4 first attempt it was not successful for deletion of the second Candida albicans alcoholdehydrogenase sequences and all 4 allele of ADH-A10 or ADH-B11, so the secondallele of these Candida tropicalis alcohol dehydrogenase sequences are genes were isolated, sequenced and those sequences were completely identical at those residues. BLAST searching of used to delete the second alleles of ADH-A10 or ADH-B1 1. protein sequences in Genbank with highly conserved peptide 0363 Deletion of a portion of the sequence of an alcohol regions within the alcohol dehydrogenases yields results that dehydrogenase gene will disrupt the function of that alcohol identify uniquely yeast alcohol dehydrogenases. dehydrogenase enzyme in the Candida host cell. 0359. In some embodiments an alcohol dehydrogenase 0364. In some embodiments, disruption of an alcohol gene is identified in the genome of a yeast of the genus dehydrogenase in a first host cell organism is measured by Candida by comparison of the amino acid sequence of its incubating the first host cell organism in a mixture comprising predicted translation product with the predicted polypeptide a Substrate possessing a hydroxyl group and measuring the sequence of an alcohol dehydrogenase from Candida tropi rate of conversion of the substrate to a more oxidized product calis and is identified as an alcohol dehydrogenase if it com Such as an aldehyde or a carboxyl group. The rate of conver prises a first peptide sequence VKYSGVCH (SEQ ID NO: sion of the Substrate by the first host cell organism is com 156) or VKYSGVCHxxxxxWKGDW (SEQID NO: 162) or pared with the rate of conversion produced by a second host VKYSGVCHXXXXXWKGDWXXXXKLPXVG cell organism that does not contain the disrupted gene but GHEGAGVVV (SEQ ID NO: 163) or VGGHEGAGVVV contains a wild type counterpart of the gene, when the first (SEQ ID NO: 157). host cell organism and the second host cell organism are 0360. In some embodiments an alcohol dehydrogenase under the same environmental conditions (e.g., same tem gene is identified in the genome of a yeast of the genus perature, same media, etc.). The rate of formation of the Candida by comparison of the amino acid sequence of its product can be measured using colorimetric assays, or chro predicted translation product with the predicted polypeptide matographic assays, or mass spectroscopy assays. In some sequence of an alcohol dehydrogenase from Candida tropi embodiments the alcohol dehydrogenase is disrupted if the calis and is identified as an alcohol dehydrogenase if it com rate of conversion is at least 5% lower, at least 10% lower, at prises a second peptide sequence QYATADAVOAA (SEQID least 15% lower, at least 20% lower, at least 25% lower in the NO: 158) or SGYxHDGxFxOYATADAVQAA (SEQID NO: first host cell organism than the second host cell organism. 164) or GAEPNCXXADxSGYxHDGxFxOYATADAVQAA 0365. In some embodiments, disruption of an alcohol (SEQ ID NO: 165). In some embodiments an alcohol dehy dehydrogenase in a first host cell organism is measured by drogenase gene is identified in the genome of a yeast of the incubating said first host cell organism in a mixture compris genus Candida by comparison of the amino acid sequence of ing a Substrate possessing a hydroxyl group and measuring its predicted translation product with the predicted polypep the rate of conversion of the substrate to a more oxidized tide sequence of an alcohol dehydrogenase from Candida product such as an aldehyde or a carboxyl group. The amount tropicalis and is identified as an alcohol dehydrogenase if it of the substrate converted to product by the first host cell comprises a third peptide sequence CAGVTVYKALK (SEQ organism in a specified time is compared with the amount of ID NO: 159) or APIxCAGVTVYKALK (SEQID NO: 166). Substrate converted to product by a second host cell organism 0361. In some embodiments an alcohol dehydrogenase that does not contain the disrupted gene but contains a wild gene is identified in the genome of a yeast of the genus type counterpart of the gene, when the first host cell organism Candida by comparison of the amino acid sequence of its and the second host cell organism are under the same envi predicted translation product with the predicted polypeptide ronmental conditions (e.g., same temperature, same media, sequence of an alcohol dehydrogenase from Candida tropi etc.). The amount of product can be measured using colori calis and is identified as an alcohol dehydrogenase if it com metric assays, or chromatographic assays, or mass spectros prises a fourth peptide sequence GQWVAISGA (SEQ ID copy assays. In some embodiments the alcohol dehydroge NO: 160) or GQWVAISGAxGGLGSL (SEQID NO: 167) or nase is disrupted if the amount of product is at least 5% lower, GQWVAISGAxGGLGSLXVQYA (SEQ ID NO: 168) or at least 10% lower, at least 15% lower, at least 20% lower, at GQWVAISGAxGGLGSLXVQYAXAMG (SEQ ID NO: least 25% lower, or at least 30% lower in the first host cell 169) O GQWVAISGAxGGLGSLXVQYAXAM organism than the second host cell organism. GxRVxAIDGG (SEQID NO: 170). 0362. The four coding sequences were sufficiently dis 7.5.2. Deletion of ADH-A4 similar to reach the conclusion that they were not allelic pairs, 0366 Sequence SEQ ID NO: 39 was used to design a but rather represented four different genes, each of which “pre-targeting construct comprising two targeting probably had its own allelic partner in the genome. Each of sequences from the 5' and 3' end of the ADH-A4 structural the coding sequences was thus used to design two targeting gene. The targeting sequences were separated by a sequence, constructs, similarly to the strategy described for given as SEQID NO: 12, comprising a Not restriction site, a CYP52A12B in Section 7.4.2. The construct for the first 20 base pair stuffer fragment and an XhoI restriction site. The allele of each ADH gene used 200 base pairs at the 5' end and targeting sequences were flanked by two BSmBI restriction ~200 base pairs at the 3' end as targeting sequences (5'-ADH sites, so that the final targeting construct can be linearized Out and 3'-ADHOut in FIG. 18). The construct for the second prior to transformation into Candida tropicalis. The sequence allele used two sections of ~200 base pairs between the first of the ADH-A4 pre-targeting construct is given as SEQ ID two targeting sequences (5'-ADH In and 3'-ADH in FIG. 18). NO: 44. Not shown in SEQID NO: 44 but also present in the US 2015/0094.483 A1 Apr. 2, 2015

pre-targeting construct were a selective marker conferring 7.5.3. Deletion of ADH-A4B resistance to kanamycin and a bacterial origin of replication, so that the pre-targeting construct can be grown and propa 0371. No sequence was identified for a second allele for gated in Ecoli. The sequence was synthesized using standard ADH-A4 in the initial set of 96 sequences but we reasoned DNA synthesis techniques well known in the art. that in a diploid organism, a second allele existed. To delete 0367. A targeting construct for deletion of ADH-A4 from the second allele (ADH-A4 B) we synthesized a deletion con the Candida tropicalis genome was prepared by digesting the struct based on the ADH-A4 sequence (SEQID NO:39), but SAT-1 flipper (SEQID NO: 1) with restriction enzymes NotI designed it so that the targeting sequences were homologous and XhoI, and ligating it into the ADH-A4 pre-targeting con to regions of the ADH-A4 gene that are missing because they struct (SEQID NO:44) from which the 20 bp stuffer had been have been deleted in strain DP388. First we constructed a removed by digestion with restriction enzymes NotI and “pre-targeting construct comprising two targeting XhoI. The sequence of the resulting targeting construct for sequences internal to the two sequences used in the design of deletion of ADH-A4 is given as SEQ ID NO: 45. This the targeting construct for the deletion of ADH-A4. The tar sequence is a specific example of the construct shown generi geting sequences were separated by a sequence, given as SEQ cally in FIG. 4: it has nearly 200 base pairs of the genomic ID NO: 12, comprising a NotI restriction site, a 20 base pair sequence of ADH-A4 at each end to serve as a targeting stuffer fragment and an XhoI restriction site. The targeting sequence; between the targeting sequences are two frt sites sequences were flanked by two BsmBI restriction sites, so that are recognized by the flp recombinase; between the two frt sites are sequences encoding the flp recombinase and a that the final targeting construct can be linearized prior to protein conferring resistance to the antibiotic nourseothricin. transformation into Candida tropicalis. The sequence of the Not shown in SEQID NO: 44 but also present in the targeting ADH-A4B pre-targeting construct is given as SEQ ID NO: construct were a selective marker conferring resistance to 46. Not shown in SEQ ID NO: 46 but also present in the kanamycin and a bacterial origin of replication, so that the pre-targeting construct were a selective marker conferring targeting construct can be grown and propagated in Ecoli. resistance to kanamycin and a bacterial origin of replication, The targeting sequences shown in SEQ ID NO: 44 also so that the pre-targeting construct can be grown and propa include a BSmBI restriction site at each end of the construct, gated in Ecoli. The sequence was synthesized using standard so that the final targeting construct can be linearized and DNA synthesis techniques well known in the art. optionally separated from the bacterial antibiotic resistance 0372. A targeting construct for deletion of ADH-A4B marker and origin of replication prior to transformation into from the Candida tropicalis genome was prepared by digest Candida tropicalis. ing the SAT-1 flipper (SEQ ID NO: 1) with restriction 0368 Candida tropicalis strain DP387 was prepared by enzymes NotI and XhoI, and ligating it into the ADH-A4B integration of the construct shown as SEQID NO: 45 into the pre-targeting construct (SEQID NO: 46) from which the 20 genome of strain DP283 (Table 3) at the site of the genomic base pair stuffer had been removed by digestion with restric sequence of the gene for ADH-A4. Candida tropicalis strain tion enzymes Not and XhoI. The sequence of the resulting DP388 was prepared by excision of the targeting construct targeting construct for deletion of ADH-A4B is given as SEQ from the genome of strain DP387, thereby deleting the gene ID NO: 47. This sequence is a specific example of the con encoding ADH-A4. Integration and deletion of targeting struct shown generically in FIG. 4: it has nearly 200 base pairs sequence SEQ ID NO: 45, and analysis of integrants and of the genomic sequence of ADH-A4B at each end to serve as excisants were performed as described in Section 7.1. a targeting sequence; between the targeting sequences are two 0369 Sequences of oligonucleotide primers for analysis frt sites that are recognized by the flp recombinase; between of strains were: the two frt sites are sequences encoding the flp recombinase and a protein conferring resistance to the antibiotic nourseo

A4 - OUT-F: thricin. Not shown in SEQID NO: 47 but also present in the (SEQ ID NO: 109) targeting construct were a selective marker conferring resis GAATTAGAATACAAAGATATCCCAGTG tance to kanamycin and a bacterial origin of replication, so that the targeting construct can be grown and propagated in E A4 - OUT-R: (SEQ ID NO: 110) coli. The targeting sequences shown in SEQID NO: 47 also CATCAACTTGAAGACCTGTGGCAAT include a BSmBI restriction site at each end of the construct, so that the final targeting construct can be linearized and (SEO ID NO: 79) SAT1-R: optionally separated from the bacterial antibiotic resistance marker and origin of replication prior to transformation into (SEQ ID NO: 80) Candida tropicalis. SAT1 - F : 0373 Candida tropicalis strain DP389 was prepared by 0370. For strain DP387 (integration of SEQID NO: 45), integration of the construct shown as SEQID NO: 47 into the PCR with primers A4-OUT-F and SAT1-R produces a 464 genome of strain DP388 (Table 3) at the site of the genomic base pair amplicon: PCR with primers SAT1-F and sequence of the gene for ADH-A4B. Candida tropicalis A4-OUT-R produces a 464 base pair amplicon. PCR from a strain DP390 was prepared by excision of the targeting con strain with a wild type copy of ADH-A4 with primers struct from the genome of strain DP389, thereby deleting a A4-OUT-F and A4-OUT-R produces a 948 base pair ampli portion of the coding region of the gene encoding ADH-A4B. con. For strain DP388 with a deleted copy of ADH-A4, PCR Integration and deletion of targeting sequence SEQ ID NO: with primers A4-OUT-F and A4-OUT-R produces a 525 base 47, and analysis of integrants and excisants were performed pair amplicon. as described in Section 7.1. US 2015/0094.483 A1 Apr. 2, 2015 46

0374 Sequences of oligonucleotide primers for analysis are sequences encoding the flp recombinase and a protein of strains were: conferring resistance to the antibiotic nourseothricin. Not shown in SEQ ID NO: 49 but also present in the targeting construct were a selective marker conferring resistance to A4 -IN-F: kanamycin and a bacterial origin of replication, so that the (SEQ ID NO: 111) targeting construct can be grown and propagated in Ecoli. GAACGGTTCCTGTATGTCCTGTGAGTT The targeting sequences shown in SEQ ID NO: 49 also A4 -IN-R: include a BSmBI restriction site at each end of the construct, (SEQ ID NO: 112) so that the final targeting construct can be linearized and CGGATTGGTCAATGGCTTTTTCGGAA optionally separated from the bacterial antibiotic resistance marker and origin of replication prior to transformation into (SEO ID NO: 79) Candida tropicalis. SAT1-R: 0378 Candida tropicalis strain DP397 was prepared by (SEQ ID NO: 80) integration of the construct shown as SEQID NO: 49 into the SAT1 - F : genome of strain DP390 (Table 3) at the site of the genomic sequence of the gene for ADH-B4. Candida tropicalis strain 0375 Oligonucleotides A4-IN-F and A4-IN-R are DP398 was prepared by excision of the targeting construct designed to anneal to a part of the genome that is missing in from the genome of strain DP397, thereby deleting the gene strains with deletions in ADH-A4. In such strains they will encoding ADH-B4. Integration and deletion of targeting thus only be able to anneal to and amplify from the second sequence SEQ ID NO: 49, and analysis of integrants and allele ADH-A4B. For strain DP389 (integration of SEQ ID excisants were performed as described in Section 7.1. NO: 47), PCR with primers A4-IN-F and SAT1-R produces a 0379 Sequences of oligonucleotide primers for analysis 462 base pair amplicon: PCR with primers SAT1-F and of strains were: A4-IN-R produces a 462 base pair amplicon. PCR from a strain with a wild-type copy of ADH-A4B with primers B4 - OUT-F: A4-IN-F and A4-IN-R produces a 488 base pair amplicon. (SEQ ID NO: 113) For strain DP390 with a deleted copy of ADH-A4B, PCR with AAATTAGAATACAAGGACATCCCAGTT primers A4-IN-F and A4-IN-R produces a 521 base pair amplicon. The amplicons with primers A4-IN-F and A4-IN-R B4 - OUT-R: could not distinguish between a strain carrying a wild-type or (SEQ ID NO: 114) a deleted copy of ADH-A4B, but digestion of the amplicon CATCAACTTGTAGACTTCTGGCAAT with either NotI or XhoI will cleave the amplicon derived (SEO ID NO : 79) from the deleted copy of the gene but not from the wild type, SAT1-R: thereby distinguishing between them. (SEQ ID NO: 80) SAT1-F: 7.5.4. Deletion of ADH-B4 (0380. For strain DP397 (integration of SEQID NO: 49), 0376 Sequence SEQ ID NO: 42 was used to design a PCR with primers B4-OUT-F and SAT1-R produces a 464 bp “pre-targeting construct comprising two targeting amplicon: PCR with primers SAT1-F and B4-OUT-R pro sequences from the 5' and 3' end of the ADH-B4 structural duces a 464 base pair amplicon. PCR from a strain with a wild gene. The targeting sequences were separated by a sequence, type copy of ADH-B4 with primers B4-OUT-F and given as SEQID NO: 12, comprising a Not restriction site, a B4-OUT-R produces a 948 base pair amplicon. For strain 20 bp stuffer fragment and an XhoI restriction site. The tar DP398 with a deleted copy of ADH-B4, PCR with primers geting sequences were flanked by two BSmBI restriction B4-OUT-F and B4-OUT-R produces a 525 base pair ampli sites, so that the final targeting construct can be linearized CO. prior to transformation into Candida tropicalis. The sequence 7.5.5. Deletion of ADH-B4B of the ADH-B4 pre-targeting construct is given as SEQ ID 0381. No sequence was identified for a second allele for NO: 48. Not shown in SEQID NO: 48 but also present in the ADH-B4 in the initial set of 96 sequences but we reasoned pre-targeting construct were a selective marker conferring that in a diploid organism a second allele existed. To delete the resistance to kanamycin and a bacterial origin of replication, second allele (ADH-B4B) we synthesized a deletion con so that the pre-targeting construct can be grown and propa struct based on the ADH-B4 sequence (SEQID NO: 42), but gated in Ecoli. The sequence was synthesized using standard designed it so that the targeting sequences were homologous DNA synthesis techniques well known in the art. to regions of the ADH-B4 gene that are missing because they 0377. A targeting construct for deletion of ADH-B4 from have been deleted in strain DP398. First we constructed a the Candida tropicalis genome was prepared by digesting the “pre-targeting construct comprising two targeting SAT-1 flipper (SEQID NO: 1) with restriction enzymes NotI sequences internal to the two sequences used in the design of and XhoI, and ligating it into the ADH-B4 pre-targeting con the targeting construct for the deletion of ADH-B4. The tar struct (SEQID NO: 48) from which the 20 bp stuffer had been geting sequences were separated by a sequence, given as SEQ removed by digestion with restriction enzymes NotI and ID NO: 12, comprising a NotI restriction site, a 20 base pair XhoI. The sequence of the resulting targeting construct for stuffer fragment and an XhoI restriction site. The targeting deletion of ADH-B4 is given as SEQ ID NO: 49. This sequences were flanked by two BsmBI restriction sites, so sequence is a specific example of the construct shown generi that the final targeting construct can be linearized prior to cally in FIG. 4: it has nearly 200 bp of the genomic sequence transformation into Candida tropicalis. The sequence of the of ADH-B4 at each end to serve as a targeting sequence; ADH-B4B pre-targeting construct is given as SEQ ID NO: between the targeting sequences are two frt sites that are 50. Not shown in SEQ ID NO: 50 but also present in the recognized by the flp recombinase; between the two frt sites pre-targeting construct were a selective marker conferring US 2015/0094.483 A1 Apr. 2, 2015 47 resistance to kanamycin and a bacterial origin of replication, For strain DP411 with a deleted copy of ADH-B4B, PCR with so that the pre-targeting construct can be grown and propa primers B4-IN-F and B4-IN-R produces a 521 base pair gated in Ecoli. The sequence was synthesized using standard amplicon. The amplicons with primers B4-IN-F and B4-IN-R DNA synthesis techniques well known in the art. could not distinguish between a strain carrying a wild-type or 0382. A targeting construct for deletion of ADH-B4B a deleted copy of ADH-B4B, but digestion of the amplicon from the Candida tropicalis genome was prepared by digest with either NotI or XhoI will cleave the amplicon derived ing the SAT-1 flipper (SEQ ID NO: 1) with restriction from the deleted copy of the gene but not from the wild type, enzymes NotI and XhoI, and ligating it into the ADH-B4B thereby distinguishing between them. pre-targeting construct (SEQID NO: 50) from which the 20 base pair stuffer had been removed by digestion with restric tion enzymes Not and XhoI. The sequence of the resulting 7.5.6. Deletion of ADH-A10 targeting construct for deletion of ADH-B4B is given as SEQ ID NO: 51. This sequence is a specific example of the con 0386 Sequence SEQ ID NO: 40 was used to design a struct shown generically in FIG. 4: it has nearly 200 bp of the “pre-targeting construct comprising two targeting genomic sequence of ADH-B4B at each end to serve as a sequences from the 5' and 3' end of the ADH-A10 structural targeting sequence; between the targeting sequences are two gene. The targeting sequences were separated by a sequence, frt sites that are recognized by the flp recombinase; between given as SEQID NO: 12, comprising a Not restriction site, a the two frt sites are sequences encoding the flp recombinase 20 bp stuffer fragment and an XhoI restriction site. The tar and a protein conferring resistance to the antibiotic nourseo geting sequences were flanked by two BSmBI restriction thricin. Not shown in SEQID NO: 51 but also present in the sites, so that the final targeting construct can be linearized targeting construct were a selective marker conferring resis prior to transformation into Candida tropicalis. The sequence tance to kanamycin and a bacterial origin of replication, so of the ADH-A10 pre-targeting construct is given as SEQID that the targeting construct can be grown and propagated in E. NO: 52. Not shown in SEQID NO: 52 but also present in the coli. The targeting sequences shown in SEQID NO: 51 also pre-targeting construct were a selective marker conferring include a BSmBI restriction site at each end of the construct, resistance to kanamycin and a bacterial origin of replication, so that the final targeting construct can be linearized and so that the pre-targeting construct can be grown and propa optionally separated from the bacterial antibiotic resistance gated in Ecoli. The sequence was synthesized using standard marker and origin of replication prior to transformation into Candida tropicalis. DNA synthesis techniques well known in the art. 0383 Candida tropicalis strain DP409 was prepared by 0387. A targeting construct for deletion of ADH-A10 from integration of the construct shown as SEQID NO: 51 into the the Candida tropicalis genome was prepared by digesting the genome of strain DP398 (Table 3) at the site of the genomic SAT-1 flipper (SEQID NO: 1) with restriction enzymes NotI sequence of the gene for ADH-B4B. Candida tropicalis strain and XhoI, and ligating it into the ADH-A10 pre-targeting DP411 was prepared by excision of the targeting construct construct (SEQ ID NO: 52) from which the 20 base pair from the genome of strain DP409, thereby deleting a portion stuffer had been removed by digestion with restriction of the coding region of the gene encoding ADH-B4B. Inte enzymes NotI and XhoI. The sequence of the resulting tar gration and deletion of targeting sequence SEQID NO: 51, geting construct for deletion of ADH-A10 is given as SEQID and analysis of integrants and excisants were performed as NO: 53. This sequence is a specific example of the construct described in Section 7.1. shown generically in FIG. 4: it has nearly 200 bp of the 0384 Sequences of oligonucleotide primers for analysis genomic sequence of ADH-A10 at each end to serve as a of strains were: targeting sequence; between the targeting sequences are two frt sites that are recognized by the flp recombinase; between B4 - IN-F: the two frt sites are sequences encoding the flp recombinase (SEQ ID NO: 115) and a protein conferring resistance to the antibiotic nourseo GAACGGTTCCTGTATGAACTGTGAGTA thricin. Not shown in SEQID NO: 53 but also present in the B4 - IN-R: targeting construct were a selective marker conferring resis (SEQ ID NO: 116) tance to kanamycin and a bacterial origin of replication, so CAGATTGGTTGATGGCCTTTTCGGAG that the targeting construct can be grown and propagated in E coli. The targeting sequences shown in SEQID NO: 53 also (SEO ID NO: 79) SAT1-R: include a BSmBI restriction site at each end of the construct, so that the final targeting construct can be linearized and (SEQ ID NO: 80) optionally separated from the bacterial antibiotic resistance SAT1 - F : marker and origin of replication prior to transformation into 0385 Oligonucleotides B4-IN-F and B4-IN-R are Candida tropicalis. designed to anneal to a part of the genome that is missing in 0388 Candida tropicalis strain DP415 was prepared by strains with deletions in ADH-B4. In such strains they will integration of the construct shown as SEQID NO: 53 into the thus only be able to anneal to and amplify from the second genome of strain DP411 (Table 3) at the site of the genomic allele ADH-B4B. For strain DP409 (integration of SEQ ID sequence of the gene for ADH-A10. Candida tropicalis strain NO:51), PCR with primers B4-IN-F and SAT1-R produces a DP416 was prepared by excision of the targeting construct 462 base pair amplicon: PCR with primers SAT1-F and from the genome of strain DP415, thereby deleting the gene B4-IN-R produces a 462 base pair amplicon. PCR from a encoding ADH-A10. Integration and deletion of targeting strain with a wild-type copy of ADH-B4B with primers sequence SEQ ID NO: 53, and analysis of integrants and B4-IN-F and B4-IN-R produces a 488 base pair amplicon. excisants were performed as described in Section 7.1. US 2015/0094.483 A1 Apr. 2, 2015 48

0389 Sequences of oligonucleotide primers for analysis optionally separated from the bacterial antibiotic resistance of strains were: marker and origin of replication prior to transformation into Candida tropicalis. 0393 Candida tropicalis strain DP417 was prepared by A1 O-OUT-F: integration of the construct shown as SEQID NO: 55 into the (SEO ID NO : 117) genome of strain DP416 (Table 3) at the site of the genomic AAGTTAGAATACAAAGACGTGCCGGTC sequence of the gene for ADH-B1 1. Candida tropicalis strain A1O-OUT-R: DP421 was prepared by excision of the targeting construct (SEQ ID NO: 118) from the genome of strain DP417, thereby deleting the gene CATCAAGTCAAAAATCTCTGGCACT encoding ADH-B1 1. Integration and deletion of targeting (SEO ID NO : 147) sequence SEQ ID NO: 55, and analysis of integrants and SAT1-R: excisants were performed as described in Section 7.1. 0394 Sequences of oligonucleotide primers for analysis (SEQ ID NO: 80) of strains were: SAT1 - F : 0390 For strain DP415 (integration of SEQID NO: 49), PCR with primers A10-OUT-F and SAT1-R produces a 464 B11 - OUT-F: base pair amplicon: PCR with primers SAT1-F and A10 (SEQ ID NO: 119) OUT-R produces a 464 base pair amplicon. PCR from a strain CCATTGCAATACACCGATATCCCAGTT with a wildtype copy of ADH-A10 with primers A10-OUT-F B11 - OUT-R: and A10-OUT-R produces a 948 base pair amplicon. For (SEQ ID NO: 12O) strain DP416 with a deleted copy of ADH-A10, PCR with CAACAATTTGAAAATCTCTGGCAAT primers A10-OUT-F and A10-OUT-R produces a 525 base (SEO ID NO : 79) pair amplicon. SAT1-R: 7.5.7. Deletion of ADH-B11 (SEQ ID NO: 80) SAT1-F: 0391 Sequence SEQ ID NO: 43 was used to design a 0395. For strain DP417 (integration of SEQID NO: 49), “pre-targeting construct comprising two targeting PCR with primers B11-OUT-F and SAT1-R produces a sequences from the 5' and 3' end of the ADH-B11 structural 464 base pair amplicon: PCR with primers SAT1-F and B11 gene. The targeting sequences were separated by a sequence, OUT-R produces a 464 base pair amplicon. PCR from a strain given as SEQID NO: 12, comprising a Not restriction site, a with a wildtype copy of ADH-B11 with primers B11-OUT-F 20 base pair stuffer fragment and an XhoI restriction site. The and B11-OUT-R produces a 948base pair amplicon. For targeting sequences were flanked by two BSmBI restriction strain DP421 with a deleted copy of ADH-B11, PCR with sites, so that the final targeting construct can be linearized primers B11-OUT-F and B11-OUT-R produces a 525 base prior to transformation into Candida tropicalis. The sequence pair amplicon. of the ADH-B11 pre-targeting construct is given as SEQID NO: 54. Not shown in SEQID NO: 54 but also present in the 7.5.8. Deletion of ADH-A10B pre-targeting construct were a selective marker conferring resistance to kanamycin and a bacterial origin of replication, 0396. No sequence was identified for a second allele for so that the pre-targeting construct can be grown and propa ADH-A10 in the initial set of 96 sequences but we reasoned gated in Ecoli. The sequence was synthesized using standard that in a diploid organism a second allele existed. At our first DNA synthesis techniques well known in the art. attempt we were unable to delete the second allele (ADH 0392 A targeting construct for deletion of ADH-B11 from A1OB) using the strategy described for ADH-A4B and ADH the Candida tropicalis genome was prepared by digesting the B4B. We used the primers A10-IN-F and A10-IN-R to SAT-1 flipper (SEQID NO: 1) with restriction enzymes NotI amplify an ~500 base pair amplicon from genomic DNA from and XhoI, and ligating it into the ADH-B11 pre-targeting strain DP415 which has the SAT1-flipper inserted into the construct (SEQ ID NO. 54) from which the 20 base pair first ADH-A10 allele, preventing it from amplifying with stuffer had been removed by digestion with restriction these primers. The amplicon was cloned and sequenced, the enzymes NotI and XhoI. The sequence of the resulting tar sequence is given as SEQID NO: 56. geting construct for deletion of ADH-B11 is given as SEQID NO: 55. This sequence is a specific example of the construct A1 O-IN-F: shown generically in FIG. 4: it has nearly 200base pair of the (SEQ ID NO: 121) genomic sequence of ADH-B11 at each end to serve as a GAATGGTTCGTGTATGAACTGTGAGTT targeting sequence; between the targeting sequences are two frt sites that are recognized by the flp recombinase; between A1 O-IN-R: the two frt sites are sequences encoding the flp recombinase (SEQ ID NO: 122) and a protein conferring resistance to the antibiotic nourseo CCGACTGGTTGATTGCCTTTTCGGAC thricin. Not shown in SEQID NO: 55 but also present in the 0397 We constructed a “pre-targeting construct com targeting construct were a selective marker conferring resis prising two targeting sequences based on SEQID NO: 56. A tance to kanamycin and a bacterial origin of replication, so single mutation was introduced into the sequence obtained as that the targeting construct can be grown and propagated in E SEQ ID NO:56: a G at position 433 was mutated to a C to coli. The targeting sequences shown in SEQID NO: 53 also destroy an unwanted BSmBI site. The targeting sequences include a BSmBI restriction site at each end of the construct, were separated by a sequence, given as SEQ ID NO: 12, so that the final targeting construct can be linearized and comprising a Not restriction site, a 20 base pair stuffer frag US 2015/0094.483 A1 Apr. 2, 2015 49 ment and an XhoI restriction site. The targeting sequences the amplicon with either NotI or XhoI will cleave the ampli were flanked by two BsmBI restriction sites, so that the final conderived from the deleted copy of the gene but not from the targeting construct can be linearized prior to transformation wild type, thereby distinguishing between them. into Candida tropicalis. The sequence of the ADH-A10B pre-targeting construct is given as SEQID NO:57. Not shown 7.5.9. Deletion of ADH-B11B in SEQID NO: 57 but also present in the pre-targeting con 04.01. No sequence was identified for a second allele for struct were a selective marker conferring resistance to kana ADH-B11 in the initial set of 96 sequences but we reasoned mycin and a bacterial origin of replication, so that the pre that in a diploid organism a second allele existed. At our first targeting construct can be grown and propagated in Ecoli. attempt we were unable to delete the second allele (ADH The sequence was synthesized using standard DNA synthesis B11B) using the strategy described for ADH-A4B and ADH techniques well known in the art. B4B. We used the primers B11-OUT-F and B11-OUT-R to 0398. A targeting construct for deletion of ADH-A10B amplify an ~950 base pair amplicon from genomic DNA from from the Candida tropicalis genome was prepared by digest strain DP417 which has the SAT1-flipper inserted into the ing the SAT-1 flipper (SEQ ID NO: 1) with restriction first ADH-B11 allele, preventing it from amplifying with enzymes NotI and XhoI, and ligating it into the ADH-A10B these primers. The amplicon was cloned and sequenced, the pre-targeting construct (SEQID NO: 57) from which the 20 sequence is given as SEQID NO. 59. bp stuffer had been removed by digestion with restriction enzymes NotI and XhoI. The sequence of the resulting tar geting construct for deletion of ADH-A10B is given as SEQ B11 - OUT-F ID NO: 58. This sequence is a specific example of the con (SEQ ID NO: 121) struct shown generically in FIG. 4: it has nearly 200 base pairs GAATGGTTCGTGTATGAACTGTGAGTT of the genomic sequence of ADH-A10B at each end to serve B11 - OUT-R as a targeting sequence; between the targeting sequences are (SEQ ID NO: 122) two frt sites that are recognized by the flp recombinase; and CCGACTGGTTGATTGCCTTTTCGGAC between the two frt sites are sequences encoding the flp 0402 We constructed a “pre-targeting construct com recombinase and a protein conferring resistance to the anti prising two targeting sequences based on SEQ ID NO. 59. biotic nourseothricin. Not shown in SEQID NO: 58 but also The targeting sequences were separated by a sequence, given present in the targeting construct were a selective marker as SEQ ID NO: 12, comprising a NotI restriction site, a 20 conferring resistance to kanamycin and a bacterial origin of base pair stuffer fragment and an XhoI restriction site. The replication, so that the targeting construct can be grown and targeting sequences were flanked by two BSmBI restriction propagated in Ecoli. The targeting sequences shown in SEQ sites, so that the final targeting construct can be linearized ID NO: 58 also include a BSmBI restriction site at each end of prior to transformation into Candida tropicalis. The sequence the construct, so that the final targeting construct can be of the ADH-B11B pre-targeting construct is given as SEQID linearized and optionally separated from the bacterial antibi NO: 60. Not shown in SEQID NO: 60 but also present in the otic resistance marker and origin of replication prior to trans pre-targeting construct were a selective marker conferring formation into Candida tropicalis. resistance to kanamycin and a bacterial origin of replication, 0399 Candida tropicalis strain DP424 was prepared by so that the pre-targeting construct can be grown and propa integration of the construct shown as SEQID NO: 58 into the gated in Ecoli. The sequence was synthesized using standard genome of strain DP421 (Table 3) at the site of the genomic DNA synthesis techniques well known in the art. sequence of the gene for ADH-A10B. Candida tropicalis 0403. A targeting construct for deletion of ADH-B11B strain DP431 was prepared by excision of the targeting con from the Candida tropicalis genome was prepared by digest struct from the genome of strain DP424, thereby deleting a ing the SAT-1 flipper (SEQ ID NO: 1) with restriction portion of the coding region of the gene encoding ADH enzymes NotI and XhoI, and ligating it into the ADH-B11B A1 OB. Integration and deletion of targeting sequence SEQID pre-targeting construct (SEQID NO: 60) from which the 20 NO: 58, and analysis of integrants and excisants were per base pair stuffer had been removed by digestion with restric formed as described in Section 7.1. Sequences of oligonucle tion enzymes Not and XhoI. The sequence of the resulting otide primers for analysis of strains were A10-IN-F (SEQID targeting construct for deletion of ADH-B11B is given as NO: 121), A10-IN-R (SEQID NO: 122), SAT1-R (SEQ ID SEQID NO: 61. This sequence is a specific example of the NO: 79), and SAT1-F (SEQID NO: 80). construct shown generically in FIG. 4: it has nearly 200 base (0400 Oligonucleotides A10-IN-F and A10-IN-R are pair of the genomic sequence of ADH-B11B at each end to designed to anneal to a part of the genome that is missing in serve as a targeting sequence; between the targeting strains with deletions in ADH-A10. In such strains they will sequences are two frt sites that are recognized by the flp thus only be able to anneal to and amplify from the second recombinase; between the two frt sites are sequences encod allele ADH-A10B. For strain DP424 (integration of SEQ ID ing the flp recombinase and a protein conferring resistance to NO: 58), PCR with primers A10-IN-F and SAT1-R produces the antibiotic nourseothricin. Not shown in SEQID NO: 61 a 462 base pair amplicon: PCR with primers SAT1-F and but also present in the targeting construct were a selective A 10-IN-R produces a 462 base pair amplicon. PCR from a marker conferring resistance to kanamycin and a bacterial strain with a wild-type copy of ADH-A1 OB with primers origin of replication, so that the targeting construct can be A 10-IN-F and A10-IN-R produces a 488 base pair amplicon. grown and propagated in E coli. The targeting sequences For strain DP431 with a deleted copy of ADH-A10B, PCR shown in SEQID NO: 61 also include a BSmBI restriction site with primers A10-IN-F and A10-IN-R produces a 521 base at each end of the construct, so that the final targeting con pair amplicon. The amplicons with primers A10-IN-F and struct can be linearized and optionally separated from the A 10-IN-R could not distinguish between a strain carrying a bacterial antibiotic resistance marker and origin of replication wild-type or a deleted copy of ADH-A10B, but digestion of prior to transformation into Candida tropicalis. US 2015/0094.483 A1 Apr. 2, 2015 50

0404 Candida tropicalis strain DP433 was prepared by stretch of at least 100 contiguous nucleotides within the cod integration of the construct shown as SEQID NO: 61 into the ing region, or at least 80% identical for a stretch of at least 100 genome of strain DP431 (Table 3) at the site of the genomic contiguous nucleotides of the coding sequence or at least 75% sequence of the gene for ADH-B11B. Candida tropicalis identical for a stretch of at least 100 contiguous nucleotides of strain DP437 was prepared by excision of the targeting con the coding sequence, or at least 70% identical for a stretch of struct from the genome of strain DP433, thereby deleting a at least 100 contiguous nucleotides of the coding sequence, or portion of the coding region of the gene encoding ADH at least 65% identical for a stretch of at least 100 contiguous B11B. Integration and deletion of targeting sequence SEQID nucleotides of the coding sequence, or at least 60% identical for a stretch of at least 100 contiguous nucleotides of the NO: 61, and analysis of integrants and excisants were per coding sequence with one of the Candida tropicalis genes formed as described in Section 7.1. ADH-A4 (SEQ ID NO:39), ADH-B4 (SEQ ID NO: 42), 04.05 Sequences of oligonucleotide primers for analysis ADH-A10 (SEQID NO: 40), ADH-A10B (SEQID NO:56), of strains were: ADH-B11 (SEQID NO: 43). 04.09. In some embodiments it may be advantageous to integrate a gene encoding a polypeptide into a yeast strain of (SEQ ID NO: 119) the genus Candida in which (i) one or more alcoholdehydro B11 - OUT-F: genase genes have been disrupted and (ii) the disrupted alco B11 - IN-R: holdehydrogenase comprises a first peptide. In some embodi (SEQ ID NO: 123) ments said first peptide has the sequence VKYSGVCH (SEQ CAGACTGGTTGATGGCTTTTTCAGAA ID NO: 156). In some embodiments said first peptide has the (SEO ID NO: 79) sequence VKYSGVCHxxxxxWKGDW (SEQID NO: 162). SAT1-R: In some embodiments the first peptide has the sequence VKYSGVCHXXXXXWKGDWXXXXKLPXVG (SEQ ID NO: 80) GHEGAGVVV (SEQID NO: 163). SAT1 - F : 0410. In some embodiments the disrupted alcohol dehy drogenase sequence, predicted from translation of the gene 04.06 For strain DP433 (integration of SEQID NO: 61), that encodes it, comprises a second peptide. In some embodi PCR with primers B11-OUT-F and SAT1-R produces a 692 ments said second peptide has the sequence QYATA base pair amplicon. PCR from a strain with a wild-type copy DAVQAA (SEQ ID NO: 158). In some embodiments said of ADH-B11B with primers B11-OUT-F and B11-IN-R pro second peptide has the sequence SGYxHDGxFXOYATA duces a 718 base pair amplicon. For strain DP437 with a DAVQAA (SEQ ID NO: 164). In some embodiments said deleted copy of ADH-B11B, PCR with primers B11-OUT-F second peptide has the sequence GAEPNCXXADxSGYX and B11-IN-R produces a 751 base pair amplicon. The ampli HDGxFXOYATADAVQAA (SEQ ID NO: 165). In some cons with primers B11-OUT-F and B11-IN-R could not dis embodiments the disrupted alcoholdehydrogenase sequence, tinguish between a strain carrying a wild-type or a deleted predicted from translation of the gene that encodes it, com copy of ADH-B11B, but digestion of the amplicon with either prises a third peptide. In some embodiments said third peptide NotI or XhoI will cleave the amplicon derived from the has the sequence CAGVTVYKALK (SEQ ID NO: 159). In deleted copy of the gene but not from the wild type, thereby Some embodiments said third peptide has the sequence APIX distinguishing between them. CAGVTVYKALK (SEQID NO: 166). 0411. In some embodiments the first genetic modification 7.6. Insertion of P450 Genes into the Genome of class comprises disruption of at least one alcohol dehydroge Candida nase whose amino acid sequence, predicted from translation of the gene that encodes it, comprises a fourth peptide. In 04.07 To achieve novel phenotypes in yeasts of the genus Some embodiments said fourth peptide has the sequence Candida (e.g., Candida tropicalis), including biotransforma GQWVAISGA (SEQ ID NO: 160). In some embodiments tions of compounds by Candida tropicalis, including chemi said fourth peptide has the sequence GQWVAISGAXG cal conversions not previously obtained, or increased rates of GLGSL (SEQ ID NO: 167). In some embodiments said conversion of one or more substrates to one or more products, fourth peptide has the sequence GQWVAISGAXGGLGSLX or increased specificity of conversion of one or more Sub VQYA (SEQID NO: 168). In some embodiments said fourth strates to one or more products, or increased tolerance of a peptide has the sequence GQWVAISGAXGGLGSLX compound by the yeast, or increased uptake of a compound VQYAXAMG (SEQID NO: 169). In some embodiments said by the yeast, it may be advantageous to incorporate a gene fourth peptide has the sequence GQWVAISGAXGGLGSLX encoding a polypeptide into the genome of the yeast. Expres VQYAXAMGxRVxAIDGG. (SEQ ID NO: 170). In some sion of the polypeptide in the yeast then allows the phenotype embodiments the disrupted alcoholdehydrogenase sequence, predicted from translation of the gene that encodes it, com of the yeast to be modified. prises a fifth peptide. In some embodiments said fifth peptide 0408. In some embodiments of the invention it may be has the sequence VGGHEGAGVVV (SEQ ID NO: 157). advantageous to integrate a gene encoding a polypeptide into 0412 Cytochrome P450s are of particular utility in the a strain of Candida tropicalis in which one or more of the hydroxylation of a variety of substrates including fatty acids. alcohol dehydrogenase genes ADH-A4, ADH-A4B, ADH Different cytochrome P450s are known to have different sub B4, ADH-B4B, ADH-A10, ADH-A1 OB, ADH-B1B and strate and regiospecificities and different specific activities. It ADH-B11 have been disrupted. In some embodiments of the is therefore useful in some embodiments of the invention to invention it may be advantageous to integrate a gene encoding incorporate a gene encoding a cytochrome P450 into the a polypeptide into a yeast strain of the genus Candida in genome of the yeast. The exact P450 to be used will depend which one or more alcohol dehydrogenase genes have been upon the substrate and the position on the substrate to be disrupted, and wherein the disrupted alcohol dehydrogenase hydroxylated. A list of P450 enzymes that may be of utility in gene shares at least 95% nucleotide identity, or at least 90% the hydroxylation of substrates when expressed within a yeast nucleotide identity, or at least 85% nucleotide identity for a cell are given in Table 4. US 2015/0094.483 A1 Apr. 2, 2015 51

TABLE 4

First Database Second Database Accession Number Accession Number Name Species gi29469875 gb AAO73958.1 CYP52A17 Candida tropicalis gi29469877 gb AAO73959.1 CYP52A18 Candida tropicalis gi 231889 sp P30610.1 CP52H CANTR (Cytochrome P450 52A8) gi3913326 sp Q12586.1 CP52I CANMA (Cytochrome P450 52A9) gi29469881 gb AAO73961.1 CYP52A2O Candida tropicalis gi29469879 gb AAO73960.1 CYP52A19 Candida tropicalis gi3913329 sp Q12589.1 CP52K CANMA (Cytochrome P450 52A11) gi3913328 sp Q12588.1 CP52J CANMA (Cytochrome P450 52A10) gi 68492087 ref XP 710174.1 P450 drug resistance protein Candida albicans gi 33.95458 emb CAA75058.1 alks Candida albicans gi 68474594 ref XP 718670.1 CaO 19.7513 Candida albicans gi294.69865 gb AAO73953.1 CYP52A13 Candida tropicalis gi 149239010 ref XP OO1525381.1 cytochrome P450 52A11 Lodderomyces elongisportis gi294.69867 gb AAO73954.1 CYP52A14 Candida tropicalis gi7548332 gb AAA34353.2 cytochrome P-450-alk2 Candida tropicalis gi 732622 emb CAA39366.1 n-alkane inducible Candida maltosa cytochrome P-450 gi 231886 sp P30607.1 CP52B CANTR (Cytochrome P450 52A2) gi 68474592 ref XP 718669.1 CaO 19.7512 Candida albicans gi 150864612 ref XP OO1383,506.2 n-alkane inducible Pichia stipitis cytochrome P-450 gi 231888 sp P30609.1 CP52G CANTR (Cytochrome P450 52A7) gi298.217 gb AAB24479.1 cytochrome P450 Candida tropicalis monoxygenase alka, P450 alkA = CYP52A7 gene product alkane-inducible gi 149246109 refXP OO1527524.1 cytochrome P450 52A2 Lodderomyces elongisportis gi294.69869 gb AAO73955.1 CYP52A15 Candida tropicalis gi 190319368 gb AAD22536.2 AF103948. 1 cytochrome Debaryomyces hansenii P450 alkane hydroxylase gi 1464-19207 refXP OO1485567.1 cytochrome P450 52A12 Pichia guilliermondii gi294.69863 gb AAO73952.1 CYP52A12 Candida tropicalis gi50423067 refxP 46O112.1 DEHAOE19635g Debaryomyces hansenii gi29469871 gb AAO73956.1 bodiment Candida tropicalis gi 199432969 emb CAG88381.2 DEHA2E18612p Debaryomyces hansenii gi 170892 gb AAA34354.1 cytochrome P-450-alk1 Candida tropicalis gi50423065 refxP 4601 11.1 DEHAOE19613g Debaryomyces hansenii gi 1169075 sp P10615.3 CP52A CANTR (Cytochrome P450 52A1) gi 226487 prf 1515252A cytochrome P450alk1 gi 732623 embCAA393.67.1 n-alkane inducible Candida maltosa cytochrome P-450 gi 146413358 ref XP 001482650.1 PGUG O5670 Pichia guilliermondii gi 117182 sp P16141.3 CP52D CANMA (Cytochrome P450 52A4) gi 2608 embCAA361971 unnamed protein product Candida maltosa gi 231887 sp P30608.1 CP52F CANTR (Cytochrome P450 52A6) gi 199432970 emb CAG88382.2 DEHA2E18634p Debaryomyces hansenii gi 190349008 gb EDK41572.2 PGUG O5670 Pichia guilliermondii gi 150864699 ref XP OO1383636.2 Cytochrome P450 52A12 Pichia stipitis (Alkane hydroxylase 1) (Alkane-inducible p450alk 1) (DH-ALK2) gi 117181 sp P16496.3 CP52C CANMA (Cytochrome P450 52A3) gi 199432968 emb CAG88380.2 DEHA2E18590p Debaryomyces hansenii gi50423063 refxP 4601.10.1 DEHAOE19591g Debaryomyces hansenii gi553118 gb AAA34320.1 alkane hydroxylating cytochrome P-450 gi 117183 sp P24458.1 CP52E CANMA (Cytochrome P450 52A5) gi 68475852 ref XP 717999.1 potential alkane Candida albicans hydroxylating monooxygenase P450 gi 18203639 CP52M DEBHA (Cytochrome P450 52A13) US 2015/0094.483 A1 Apr. 2, 2015

TABLE 4-continued

First Database Second Database Accession Number Accession Number Name Species gi 146412241 ref XP 001482092.1 cytochrome P450 52A13 Pichia guilliermondii gi 126134585 ref XP 001383817.1 Cytochrome P450 52A13 Pichia stipitis (Alkane hydroxylase 2) (Alkane-inducible p450alk2) (DH-ALK2) gi 50418551 refXP 457792.1 DEHAOCO2981g Debaryomyces hansenii gi 149236533 ref XP 001524144.1 cytochrome P450 52A5 Lodderomyces elongisportis gi 150864746 ref XP 001383710.2 Cytochrome P450 52A6 Pichia stipitis (CYPLILA6) (Alkane inducible P450-ALK3) gi 1492394.04 ref XP OO1525578.1 cytochrome P450 52A3 Lodderomyces elongisportis gi 50417817 ref XP 457727.1 DEHAOCO1177g Debaryomyces hansenii gi 1994.30432 enb CAG85755.2 DEHA2CO1100p Debaryomyces hansenii gi 1492394,02 refXP OO1525577.1 cytochrome P450 52A8 Lodderomyces elongisportis gi29469873 gb AAO73957.1 CYP52D2 Candida tropicalis gi 150866745 refxP 001386440.2 Cytochrome P450 52A3 Pichia stipitis (CYPLILA3) (Alkane inducible P450-ALK1-A) (P450-CM1) (CYP52A3-A) (Cytochrome P-450ALK) gi 190347603 gb EDK39907.2 PGUG 04005 Pichia guilliermondii gi 146414612 ref XP OO1483.276.1 PGUG 04005 Pichia guilliermondii gi 13913325 sp Q12585.1 CP52T CANMA (Cytochrome P450 52D1) gi50553995 refxP 5044.06.1 YALIOE25982p Yarrowia lipolytica gi 3298289 dbi BAA31433.1 ALK1 Yarrowia lipolytica gi50554897 ref XP 504.857.1 YALIOFO1320p Yarrowia lipolytica gi50545727 re P 5004O2.1 YALIOBO1848p Yarrowia lipolytica gi50546066 re P 500560.1 YALIOBO6248p Yarrowia lipolytica gi50547357 re P 5O1148.1 YALIOB20702p Yarrowia lipolytica gi50546771 re P 500855.1 YALIOB13816p Yarrowia lipolytica gi50546773 re P 500856.1 YALIOB13838p Yarrowia lipolytica gi 70982077 re P 746567.1 cytochrome P450 alkane Aspergiiitisfit migatus hydroxylase gi 119487140 refxP 001262425.1 cytochrome P450 alkane Neosartorya fischeri hydroxylase gi50545119 re P 500097.1 YALIOA15488p Yarrowia lipolytica gi 115387741 re P 001211376.1 cytochrome P450 52A12 Aspergilius terretts gi 145248800 re P OO1400739.1 An14g01110 Aspergilius niger gi 121714465 re P OO1274,843.1 cytochrome P450 alkane Aspergilius clavatus hydroxylase gi50545471 re P 500273.1 YALIOA20130p Yarrowia lipolytica gi21254.1280 re P 002150795.1 cytochrome P450 alkane Penicillium marneffei hydroxylase gi 16978.3066 refXP OO1825995.1 Aspergilius oryzae gi 67541935 ref XP 664735.1 AN7131.2 Aspergilius nidulans gi 218716670 gb EED16091.1 cytochrome P450 alkane Taiaromyces stipitatus hydroxylase gi 211584648 emb CAP74173.1 Pc14g00320 Penicillium chrysogenium gi 68475719 ref XP 718066.1 potential alkane Candida albicans hydroxylating monooxygenase P450 fragmen gi 231890 sp P30611.1 CP52N CANTR (Cytochrome P450 52B1) gi50553800 refxP 504311.1 YALIOE23474p Yarrowia lipolytica gi 115391153 ref XP OO1213081.1 ATEG O3903 Aspergilius terretts gi 1169076 sp P43083.1 CP52V CANAP (Cytochrome P450 52E1) gi212537573 ref XP 002148942.1 cytochrome P450 family Penicillium marneffei protein gi 11948.0837 refxP 001260447.1 cytochrome P450 family Neosartorya fischeri protein gi 159129370 gb EDP54484.1 cytochrome P450 family Aspergiiitisfit migatus protein gi 7100.1214 refXP 75.5288.1 cytochrome P450 family Aspergiiitisfit migatus protein gi50548557 ref XP 5O1748.1 YALIOC12122p Yarrowia lipolytica gi 211592844 emb CAP992.12.1 Pc22g 19240 Penicillium chrysogenium gi 231891 sp P30612.1 CP52P CANTR (Cytochrome P450 52C1) US 2015/0094.483 A1 Apr. 2, 2015 53

TABLE 4-continued

First Database Second Database Accession Number Accession Number Name Species gi3913327 sp Q12587.1 CP52Q CANMA (Cytochrome P450 52C2) gi50548395 ref XP 5O1667.1 YALIOC10054p Yarrowia lipolytica gi 145248373 ref XP 001396435.1 An13g03000 Aspergilius niger gi 169783674 ref XP 001826299.1 Aspergilius oryzae gi 169774249 refXP OO1821592.1 Aspergilius oryzae gi212536398 ref XP 0021483.55.1 cytochrome P450 alkane Penicillium marneffei hydroxylase gi 211590140 emb CAP96310.1 Pc21g 14130 Penicillium chrysogenium gi 189200681 ref XP 001936677.1 cytochrome P450 52A12 Pyrenophora tritici-repentis gi 121698.992 ref XP 001267871.1 cytochrome P450 family Aspergilius clavatus protein gi 154310961 ref XP OO1554811.1 BC1G 06459 Botryotinia fuckeiana gi 119497443 ref XP 001265480.1 cytochrome P450 alkane Neosartorya fischeri hydroxylase gi67539774 refxP 663661.1 AN6057.2 Aspergilius nidulans gi3913324 sp Q12573.1 CP52W CANAP (Cytochrome P450 52E2) gi 159130401 gb EDP55514.1 cytochrome P450 alkane Aspergiiitisfit migatus hydroxylase gi70990140 ref XP 7499.19.1 cytochrome P450 alkane Aspergiiitisfit migatus hydroxylase gi212543867 ref XP 002152088. N-alkane-inducible Penicillium marnefei ATCC 18224 cytochrome P450 gi 189204508 ref XP OO1938589. cytochrome P450 52A12 Pyrenophora tritici-repentis gi 67904794 ref XP 682653.1 AN93842 Aspergilius nidulans gi 115401146 refixP OO1216161. ATEG 07540 Aspergilius terretts gi 169765686 refixP OO1817314. Aspergilius oryzae gi 156034334 refixP OO1585586. SS1G 13470 Scierotinia Scierotiorum gi 115389132 refixP OO1212O71. ATEG 02893 Aspergilius terretts gi 149249004 refixP OO1528842. LELG O5768 Lodderomyces elongisporus gi 119490743 refxP 001263094. n-alkane-inducible Neosartorya fischeri cytochrome P450 gi 1695,98696 ref XP OO1792.771. SNOG 02153 Phaeosphaeria nodorum gi 145233653 refixP OO1400199. Ano2g10700 Aspergilius niger gi 121703415 ref XP OO1269972. cytochrome P450 alkane Aspergilius clavatus hydroxylase gi 145244813 refixP OO1394.678. An 11g07010 Aspergilius niger gi 115400535 refixP OO1215856. ATEG 06678 Aspergilius terretts gi 156054264 refixP OO1593 058. SS1G 0598O Scierotinia Scierotiorum gi 145235009 refXP 001390153. Ano3g02570 Aspergilius niger gi 121714697 ref XP OO1274959. n-alkane-inducible Aspergilius clavatus cytochrome P450 gi 115383936 refixP OO1208515. ATEG 01150 Aspergilius terretts gi 119188703 refixP OO1244958. CIMG 04399 Coccidioides inmitis gi 1543.03347 refixP OO15S2081. BC1G 09422 Botryotinia fuckeiana gi 68469246 refixP 721410.1 potential n-alkane inducible Candida albicans cytochrome P-450 gi 211588353 emb CAP864.58.1 Pc20g11290 Penicillium chrysogenium gi 218.719422 gb EED18842.1 cytochrome P450 Taiaromyces stipitatus gi 189196472 ref XP 001934,574.1 cytochrome P450 52A11 Pyrenophora tritici-repentis gi 145228377 ref XP 001388497.1 Ano1g00510 Aspergilius niger gi 145243810 ref XP 001394.417.1 An 11g04220 Aspergilius niger gi 119467390 refxP 0012575.01.1 n-alkane-inducible Neosartorya fischeri cytochrome P450 gi 218713692 gb EED13116.1 cytochrome P450 alkane Taiaromyces stipitatus hydroxylase gi 156040904 ref XP OO1587438.1 SS1G 11430 Scierotinia Scierotiorum gi 21158.8608 emb CAP86724.1 Pc20g 13950 Penicillium chrysogenium gi 189210960 ref XP 001941811.1 cytochrome P450 52A11 Pyrenophora tritici-repentis gi 1543.00280 ref XP OO1550556.1 BC1G 11329 Botryotinia fuckeiana gi39965179 refXP 365075.1 MGG O992O Magnaporthe grisea gi 70984521 ref XP 747767.1 cytochrome P450 alkane Aspergiiitisfit migatus hydroxylase gi 164424932 refXP 958030.2 NCUO911S Neurospora crassa gi 169785321 ref XP OO1827121.1 Aspergilius oryzae gi 171687345 ref XP OO1908613.1 Podospora anserina gi495225 dbi BAAO5145.1 n-alkane-inducible Candida maltosa cytochrome P-450 gi 169778468 ref XP OO1823699.1 Aspergilius oryzae gi 685237 emb CAA35593.1 cytochrome P-450-alk2 Candida tropicalis gi 115398792 ref XP OO1214985.1 ATEG 05807 Aspergilius terretts gi 156045685 ref XP OO1589398.1 SS1G 10037 Scierotinia Scierotiorum US 2015/0094.483 A1 Apr. 2, 2015 54

TABLE 4-continued

First Database Second Database Accession Number Accession Number Name Species gi 116181964 ref XP OO1220831.1 CHGG O1610 Chaetonium globosum gi212539338 refxP 002149824.1 N-alkane-inducible Penicillium marneffei cytochrome P450 gi55823915 gb AAV66104.1 cytochrome P450 Fusarium heterosporum gi 16978.6131 ref XP OO1827,526.1 Aspergilius oryzae gi 67526919 ref XP 661521.1 AN3917.2 Aspergilius nidulans gi57157397 dbi BAD83681.1 cytochrome P-450 Alternaria Soiani gi 39954838 refxP 364111.1 MGG O8956 Magnaporthe grisea gi46108804 refxP 381460.1 FGO1284.1 Gibbereia zeae gi 167962420 dbi BAGO9241.1 n-alkane inducible Candida maltosa cytochrome P-450 gi 11946.9615 ref XP 001257962.1 cytochrome P450 alkane Neosartorya fischeri hydroxylase gi70991773 refXP 750735.1 cytochrome P450 alkane Aspergiiitisfit migatus hydroxylase gi 17167918.5 ref XP 001904540.1 unnamed protein product Podospora anserina gi 119488606 refxP 001262753.1 n-alkane-inducible Neosartorya fischeri cytochrome P450 gi 218722969 gb EED22387.1 cytochrome P450 Taiaromyces stipitatus gi 145243244 ref XP 001394159.1 An 11g01550 Aspergilius niger gi212533853 refxP 002147083.1 N-alkane-inducible Penicillium marneffei cytochrome P450 gi 218720976 gb EED20395.1 cytochrome P450 alkane Taiaromyces stipitatus hydroxylase gi 145604320 refxP 362943.2 MGG 08494 Magnaporthe grisea gi 154319876 ref XP OO1559255.1 BC1G 02419 Botryotinia fuckeiana gi 154272319 ref XP OO1537012.1 HCAG 08121 Aiellomyces capsulatus gi 39976331 ref XP 369556.1 MGG 05908 Magnaporthe grisea gi 11620.0125 refXP OO1225874.1 CHGG 08218 Chaetonium globosum gi 218722681 gb EED22099.1 cytochrome P450 alkane Taiaromyces stipitatus hydroxylase gi 145606889 ref XP 361347.2 MGG 03821 Magnaporthe grisea gi211592275 emb CAP98.620.1 Pc22g13320 Penicillium chrysogenium gi 171688.034 ref XP 001908957.1 unnamed protein product Podospora anserina gi 211587061 emb CAP94723.1 Pc18g.04990 Penicillium chrysogenium gi 169612986 ref XP OO1799910.1 SNOG 0.9621 Phaeosphaeria nodorum gi212539354 ref XP 0021.49832.1 N-alkane-inducible Penicillium marneffei cytochrome P450 gi212533239 ref XP 002146776.1 cytochrome P450 alkane Penicillium marneffei hydroxylase gi41079162 gb AAR99474.1 alkane monooxygenase Graphium sp. P-450 gi 159122944 gb EDP48064.1 cytochrome P450 alkane Aspergiiitisfit migatus hydroxylase gi6753.7376 ref XP 662462.1 AN4858.2 Aspergilius nidulans gi 39954738 refxP 364102.1 MGG O8947 Magnaporthe grisea gi39968,921 ref XP 365851.1 MGG 10071 Magnaporthe grisea gi 70983886 ref XP 747469.1 cytochrome P450 alkane Aspergiiitisfit migatus hydroxylase gi 171691438 refxP 001910644. unnamed protein product Podospora anserina gi 1191934.52 ref XP OO12473.32. CIMG 01103 Coccidioides inmitis gi 10303293 emb CAC10088.1 related to n-alkane-inducible Neurospora crassa cytochrome P450 gi 169626152 ref XP OO1806478. SNOG 16355 Phaeosphaeria nodorum gi 119191908 refxP 001246560. CIMG 00331 Coccidioides inmitis gi 154296077 refXP OO1548471. BC1G 12768 Botryotinia fuckeiana gi 164429645 ref XP 964653.2 NCUO2O31 Neurospora crassa gi 12311700 emb CAC24473.1 Candida albicans gi 154305169 ref XP 001552987. BC1G O8879 Botryotinia fuckeiana gi39978.177 refXP 370476.1 MGG O6973 Magnaporthe grisea gi 70982576 refxP 746816.1 cytochrome P450 alkane Aspergiiitisfit migatus hydroxylase gi 154319145 ref XP OO1558890. BC1G O2524 Botryotinia fuckeiana gi46127885 ref XP 388496.1 FGO8320.1 Gibbereia zeae gi32330665 gb AAP79879.1 cytochrome P450 Phanerochaete chrysosporium monooxygenase pc-3 gi 116193605 ref XP OO1222615. CHGG O6520 Chaetonium globosum gi 145241598 refXP 001393445. Ano9gO1270 Aspergilius niger gi 149210127 ref XP OO1522438. MGCH7 ch7g,545 Magnaporthe grisea gi 121699244 ref XP OO1267956. cytochrome P450 alkane Aspergilius clavatus hydroxylase gi 156032429 refXP OO1585052. SS1G 13912 Scierotinia Scierotiorum gi 159122551 gb EDP47672.1 cytochrome P450 alkane Aspergiiitisfit migatus hydroxylase US 2015/0094.483 A1 Apr. 2, 2015 55

TABLE 4-continued

First Database Second Database Accession Number Accession Number Name Species gi 145613078 re P OO1412594.1 MGG 12496 Magnaporthe grisea gi 212531571 re P 002145942.1 N-alkane-inducible Penicillium marneffei cytochrome P450 gi 1452S2862 re P 001397944.1 An16g06420 Aspergilius niger gi 1698,55683 re P OO1834508.1 CC1G 02244 Coprinopsis cinerea okayama gi 21253O338 re P 002145326.1 N-alkane-inducible Penicillium marneffei cytochrome P450 gi 61657996 gb AAX4940.0.1 cytochrome P450 Phanerochaete chrysosporium monooxygenase pc-2 gi 701.10164 ref XP OO1886288.1 CYP63 cytochrome P450 Laccaria bicolor monooxygenase-like protein gi 46.323950 ref XP 748328.2 cytochrome P450 Aspergiiitisfit migatus oxidoreductase, alkane hydroxylase gi 56042346 ref XP OO1587730. SS1G 10970 Scierotinia Scierotiorum gi 89196282 ref XP OO1934,479. cytochrome P45071A23 Pyrenophora tritici-repentis gi 8369901 gb AAL67906.1 cytochrome P450 Phanerochaete chrysosporium monooxygenase pc-2 gli218714942 gb EED14365.1 cytochrome P450 Taiaromyces stipitatus gi 7O106497 refxP 001884,460. cytochrome P450 Laccaria bicolor gi 6986SS34 ref XP OO1839366. CC1G 08233 Coprinopsis cinerea okayama gi 698,55669 refXP OO1834501. CC1G O2237 Coprinopsis cinerea okayama gi 89.197495 ref XP OO1935.085. cytochrome P450 52A1 Pyrenophora tritici-repentis gl 21.8713646 gb EED13070.1 cytochrome P450 Taiaromyces stipitatus gi 70106217 refxP 001884320. cytochrome P450 Laccaria bicolor gi 16197088 ref XP OO1224356. CHGG 05142 Chaetonium globosum gi 8369899 gb AAL67905.1 cytochrome P450 Phanerochaete chrysosporium monooxygenase pc-1 gi 54312290 re OO 555473. BC1G 061.78 Botryotinia fuckeiana gi S6064223 re OO1598O33. SS1G OO119 Scierotinia Scierotiorum gi S60392.63 re OO1586739. SS1G 11768 Scierotinia Scierotiorum gi 701OS2O6 re 883816. Laccaria bicolor gi 696.13228 8OOO31. SNOG 09744 Phaeosphaeria nodorum gi 6986.3123 838184. CC1G 12233 Coprinopsis cinerea okayama gl 67902848 681680.1 AN84.11.2 Aspergilius nidulans gi 58.3924.52 Cl b AO91865.1 monooxygenase Penicillium expansium gi 698,57173 OO1835239. CC1G O7782 Coprinopsis cinerea okayama gi 6978.1220 OO1825073. Aspergilius oryzae gl 663925.1 AN6321.2 Aspergilius nidulans gi 45234SS3 re OO1389.925. Ano3g00180 Aspergilius niger gi 70106275 OO1884349. Laccaria bicolor gi 4561 OO12 366716.2 MGG O2792 Magnaporthe grisea gi 19473653 OO12587.02. cytochrome P450 Neosartorya fischeri monooxygenase gi 180263SS e mb AL.69594.1 Cordyceps bassiana gi S430994.5 . OO15543.05. BC1G 06893 Botryotinia fuckeiana gl 211593324 AP997O6.1 Pc22g24180 Penicillium chrysogenium gi 701 11410 .. 001886909. cytochrome P450 Laccaria bicolor monooxygenase CYP63 gi 698.64610 re 001838912. CC1G O5465 Coprinopsis cinerea okayama gi 4524OOO7 re 001392650. Ano8g05330 Aspergilius niger gi 154333O2 re 001216788. Aspergilius terretts gi 21701751 re OO126914.O. Cytochrome P450 Aspergilius clavatus oxidoreductase gi S4289956 re P OO1545581. BC1G 15919 Botryotinia fuckeiana gl 212S27006 re P 002143660. cytochrome P450 alkane Penicillium marneffei hydroxylase gi S6054506 ref XP OO1593.179. SS1G 06101 Scierotinia Scierotiorum gi 679,6212S dbi BAGO9240.1 n-alkane inducible Candida maltosa cytochrome P-450 gi 696.10561 ref XP OO1798.699. SNOG O8385 Phaeosphaeria nodorum gi S4322320 refXP OO1560475. BC1G O1307 Botryotinia fuckeiana gi 71986596 gb ACB59278.1 cytochrome P450 Pseudozymafiocculosa monooxygenase gi 6985.0022 refXP OO1831709. CC1G 12229 Coprinopsis cinerea okayama gl 84514171 gb ABC59094.1 cytochrome P450 Medicago truncatula monooxygenase CYP704G9 gi 573492.59 emb CAO244.05.1 Vitis vinifera gi S4322983 ref XP 001560806. BC1G 00834 Botryotinia fuckeiana gi 71726950 gb AAZ39646.1 cytochrome P450 Petuniax hybrida monooxygenase gi 216.0323 dbi BAAO5146.1 n-alkane-inducible Candida maltosa cytochrome P-450 US 2015/0094.483 A1 Apr. 2, 2015 56

TABLE 4-continued

First Database Second Database Accession Number Accession Number Name Species gi 218717320 gb EED16741.1 cytochrome P450 Taiaromyces stipitatus gi 118485860 gb ABK94777.1 Populus trichocarpa gi 71024781 ref XP 762620. UMO6473.1 Ustilago maydis gi58265 104 ref XP 569708. Cryptococci is neoformans var. neoformans gi 169596949 ref XP OO1791898.1 SNOG 01251 Phaeosphaeria nodorum gi 1573.55912 embCAO49769. Vitis vinifera gi 134109309 ref XP 776769. CNBC2600 Cryptococci is neoformans var. neoformans gi 157349262 embCAO24408. Vitis vinifera gi 147765747 emb CAN6O189. Vitis vinifera gi 1698.64676 refXP OO1838945.1 CC1G O5498 Coprinopsis cinerea okayama gi 157352095 embCAO431O2. Vitis vinifera gi 147791153 emb CAN63571. Vitis vinifera gi 84514173 gb ABC59095.1 cytochrome P450 Medicago truncatula monooxygenase CYP704G7 gi 71024761 ref XP 762610. UMO64631 Ustilago maydis gi 1573.55911 embCAO49768. Vitis vinifera gi 115451645 refNP 00104.9423.1 OsO3g0223100 Oryza sativa gi 22748335 gbAANO5337.1 cytochrome P450 Oryza sativa gi 168059245 refxP 001781.614.1 Physcomitrella patens Subsp. patens gi 15225499 refNP 182075. CYP704A2 (cytochrome Arabidopsis thaliana P450, family 704, Subfamily A, polypeptide 2) oxygen binding gi75319885 CO)4C1, PINTA (Cytochrome P450 704C1) gi 167521978 ref XP OO1745327.1 Monosiga brevicollis gi 21536522 gb AAM60854.1 cytochrome P450-like Arabidopsis thaliana protein gi 15242759 refNP 201150.1 CYP94B1 (cytochrome Arabidopsis thaliana P450, family 94, Subfamily B, polypeptide 1) oxygen binding gi 168031659 refXP OO1768338.1 Physcomitrella patens Subsp. patens gi 15733.9131 emb CAO42482.1 Vitis vinifera gi 30682301 refNP 196442.2 cytochrome P450 family Arabidopsis thaliana protein gi 8346562 emb CAB93726.1 cytochrome P450-like Arabidopsis thaliana protein gi2344895 gb AAC31835.1 cytochrome P450 Arabidopsis thaliana gi 30689861 refNP 850427.1 CYP704A1 (cytochrome Arabidopsis thaliana P450, family 704, Subfamily A, polypeptide 1) oxygen binding gi 15221776 refNP 173862.1 CYP86C1 (cytochrome Arabidopsis thaliana P450, family 86, Subfamily C, polypeptide 1) oxygen binding gi 147793,015 emb CAN77648.1 Vitis vinifera gi 157356646 emb CAO62841.1 Vitis vinifera gi 147844260 emb CAN80040.1 Vitis vinifera gi215466577 gb EEB965.17.1 MPER 04337 Moniliophthora perniciosa gi 15222515 refNP 176558.1 CYP86A7 (cytochrome Arabidopsis thaliana P450, family 86, subfamily A, polypeptide 7) oxygen binding gi 1946977 24 gb ACF82946.1 Zea mays gi 168021353 ref XP OO17632O6.1 Physcomitrella patens Subsp. patens gi 115483.036 refNP OO1065111.1 Os10g.0525000 Oryza sativa (japonica cultivar-group) gi 157338660 emb CAO42011.1 Vitis vinifera gi 147836212 emb CAN75428.1 Vitis vinifera gi5042165 emb CAB44684.1 cytochrome P450-like Arabidopsis thaliana protein gi79326551 refNP 001.031814.1 CYP96A10 (cytochrome Arabidopsis thaliana P450, family 96, subfamily A, polypeptide 10) heme binding iron ion binding? monooxygenase gi 26452145 dbi BAC43161.1 cytochrome P450 Arabidopsis thaliana gi 110289450 gb AAP54707.2 Cytochrome P450 family Oryza sativa protein, expressed gi 21593258 gb AAM65207.1 cytochrome P450 Arabidopsis thaliana gi 11548.3034 ref NP OO1065110.1 Os10g.0524700 Oryza sativa gi 118486379 gb ABK95.030.1 Populus trichocarpa gi 10442763 gb AAG17470.1 AF123610 9 cytochrome Titiciinaestivitin P450 US 2015/0094.483 A1 Apr. 2, 2015 57

TABLE 4-continued

First Database Second Database Accession Number Accession Number Name Species gi 125532704 gb EAY79269.1 OSI 34384 Oryza sativa gi 15237250 refNP 197710.1 CYP86B1 (cytochrome Arabidopsis thaliana P450, family 86, subfamily B, polypeptide 1) oxygen binding gi 12554.9414 gb EAY95236.1 OsI 17053 Oryza sativa gi 110289453 gb AAP547 10.2 Cytochrome P450 family Oryza sativa protein gi 20146744 gb AAM12480.1 ACO74232 7 cytochrome Oryza sativa P450-like protein gi 218184911 gb EEC67338.1 OSI 34388 Oryza sativa Indica Group gi 125549325 gb EAY95147.1 OSI 16965 Oryza sativa Indica Group gi 1984.72816 ref XP 002133118.1 Drosophia pseudoobscura pseudoobscura gi 195574346 refXP 002105150.1 GD21336 Drosophila simulans gi 168024173 refxP 001764611.1 Physcomitrella patens Subsp. patens gi 11544.0549 refNP OO1044554.1 OsO1g0804400 Oryza sativa (japonica cultivar-group) gi 15223657 refNP 176086.1 CYP96A15/MAH1 (MID Arabidopsis thaliana CHAINALKANE HYDROXYLASE 1) oxygen binding gi 12554O131 gb EAY86526.1 OSI O7906 Oryza sativa gi 11546.0030 refNP 001053615.1 OsO4g.0573900 Oryza sativa (japonica cultivar-group) gi 157349258 emb CAO24404.1 Vitis vinifera gi 157346575 emb CAO16644.1 Vitis vinifera gi 147835182 enb CAN76753.1 Vitis vinifera gi 1956.13956 gb ACG28808.1 Zea mays gi 194753285 ref XP OO1958947.1 GF1263S Drosophila ananassae gi 156546811 refxP 001606040.1 Nasonia vitripennis gi 125583181 gb EAZ24112.1 Os) 007595 Oryza sativa (japonica cultivar-group) gi 15229477 refNP 189243.1 CYP86C2 (cytochrome Arabidopsis thaliana P450, family 86, subfamily C, polypeptide 2) oxygen binding gi940446 emb CAA62082.1 cytochrome p450 Arabidopsis thaliana gi 115447789 refNP OO1047674.1 OsO2g0666500 Oryza sativa (japonica cultivar-group) gi 15227788 refNP 1798.99.1 CYP96A1 (cytochrome Arabidopsis thaliana P450, family 96, subfamily A, polypeptide 1) oxygen binding gi 195503768 ref XP 002098791.1 GE23738 Drosophila yakuba gi 147804860 emb CAN66874.1 Vitis vinifera gi 84514169 gb ABC59093.1 cytochrome P450 Medicago truncatula monooxygenase CYP94C9 gi 19698839 gb AAL91155.1 cytochrome P450 Arabidopsis thaliana gi 15237768 refNP 200694.1 CYP86A1 (cytochrome Arabidopsis thaliana P450, family 86, subfamily A, polypeptide 1) oxygen binding gi 157353969 emb CAO4651.0.1 Vitis vinifera gi 169865676 ref XP OO1839436. CC1G 06649 Coprinopsis cinerea okayama gi85001697 gb ABC68403. cytochrome P450 Glycine max monooxygenase CYP86A24 gi 115466172 refNP 001056685. Os06g0129900 Oryza sativa gi 195637782 gb ACG38359. cytochrome P45086A2 Zea mays gi 194704220 gb ACF86194. Zea mays gi 710064.08 refXP 757870.1 UMO1723.1 Ustilago maydis 521 gi 195161677 ref XP 002021689. GL26642 Drosophila persimilis gi 115459886 refNP 001053543. OsO4g.056O100 Oryza sativa gi 194704096 gb ACF86132. Zea mays gi 147773635 emb CAN67559.1 Vitis vinifera gi 125575195 gb EAZ16479. OSJ 030688 Oryza sativa gi 115482616 refNP 001064901. Os10g.0486100 Oryza sativa gi 71726942 gb AAZ39.642. cytochrome P450 fatty acid Petuniax hybrida omega-hydroxylase gi 195626182 gb ACG34921. cytochrome P45086A1 Zea mays gi 194907382 ref XP OO1981543. GG11553 Drosophila erecta gi 71006688 ref XP 758010.1 UMO1863.1 Ustilago maydis gi 15734.6247 emb CAO15944.1 Vitis vinifera gi 116830948 gb ABK28430. Arabidopsis thaliana gi 13641298 gb AAK31592. cytochrome P450 Brassica rapa Subsp. pekinensis gi 2258321 gb AAB63277. cytochrome P450 Phanerochaete chrysosporium gi 15218671 refNP 174713.1 CYP94D1 (cytochrome Arabidopsis thaliana P450, family 94, subfamily D, polypeptide 1) oxygen binding US 2015/0094.483 A1 Apr. 2, 2015

TABLE 4-continued

First Database Second Database Accession Number Accession Number Name Species gi 195623910 gb ACG33785.1 cytochrome P45086A1 Zea mays gi 157337152 emb CAO21498.1 Vitis vinifera

0413. In some embodiments of the invention it is advan (SEQID NO: 162). In some embodiments the first peptide has tageous to integrate one or more genes encoding a P450 the sequence VKYSGVCHXXXXXWKGDWXXXXKLPxVG enzyme-into a yeast Strain, a species of the genus Candida or GHEGAGVVV (SEQ ID NO: 163). In some embodiments a strain of Candida tropicalis in which genes or pathways that the disrupted alcohol dehydrogenase sequence, predicted cause further oxidation of the substrate have been disrupted. from translation of the gene that encodes it, comprises a In some embodiments, a strain of yeast in which one or more second peptide. In some embodiments the second peptide has cytochrome P450s or one or more alcohol oxidase or one or the sequence QYATADAVQAA (SEQID NO: 158). In some more alcoholdehydrogenase have been disrupted will oxidize embodiments the second peptide has the sequence SGYX hydroxyl groups to aldehydes or acids more slowly than HDGxFXOYATADAVQAA (SEQ ID NO: 164). In some strains of yeast in which these genes have not been disrupted. embodiments the second peptide has the sequence GAEP 0414. In some embodiments of the invention it may be NCXXADxSGYxHDGxFxOYATADAVQAA (SEQ ID NO: advantageous to integrate a cytochrome P450 into a strain of 165). Candida tropicalis in which fatty alcohol oxidase genes 0416) In some embodiments the disrupted alcohol dehy FAO1, FAO1B, FAO2 and FAO2B have been disrupted. In drogenase sequence, predicted from translation of the gene Some embodiments of the invention it may be advantageous that encodes it, comprises a third peptide. In some embodi to integrate a cytochrome P450 into a strain of Candida ments the third peptide has the sequence CAGVTVYKALK tropicalis in which at least one of the fatty alcohol oxidase (SEQ ID NO: 159). In some embodiments the third peptide genes FAO1, FAO1B, FAO2 and FAO2Bhave been disrupted. has the sequence APIXCAGVTVYKALK (SEQ ID NO: In some embodiments of the invention it may be advanta 166). geous to integrate a cytochrome P450 into a strain of Candida tropicalis in which alcohol dehydrogenase genes ADH-A4. 0417. In some embodiments the first genetic modification ADH-A4B, ADH-B4, ADH-B4B, ADH-A10 and ADH-B11 class comprises disruption of at least one alcohol dehydroge have been disrupted. In some embodiments of the invention it nase whose amino acid sequence, predicted from translation may be advantageous to integrate a cytochrome P450 into a of the gene that encodes it, comprises a fourth peptide. In strain of Candida tropicalis in which one or more of the Some embodiments the fourth peptide has the sequence alcohol dehydrogenase genes ADH-A4, ADH-A4B, ADH GQWVAISGA(SEQID NO: 160). In some embodiments the B4, ADH-B4B, ADH-A10, ADH-A1 OB, ADH-B1B and fourth peptide has the sequence GQWVAISGAXGGLGSL ADH-B11 have been disrupted. In some embodiments of the (SEQID NO: 167). In some embodiments the fourth peptide invention it may be advantageous to integrate a gene encoding has the sequence GQWVAISGAXGGLGSLXVQYA (SEQID a cytochrome P450 into a yeast species of the genus Candida NO: 168). In some embodiments the fourth peptide has the in which one or more alcoholdehydrogenase genes have been sequence GQWVAISGAXGGLGSLXVQYAXAMG (SEQID disrupted, and wherein the disrupted alcohol dehydrogenase NO: 169). In some embodiments the fourth peptide has the gene shares at least 95% nucleotide identity, or at least 90% Sequence GQWVAISGAxGGLGSLXVQYAXAM nucleotide identity, or at least 85% nucleotide identity for a GxRVXAIDGG (SEQID NO: 170). stretch of at least 100 contiguous nucleotides, or at least 80% 0418. In some embodiments the disrupted alcohol dehy identical for a stretch of at least 100 contiguous nucleotides of drogenase sequence, predicted from translation of the gene the coding sequence, or at least 75% identical for a stretch of that encodes it, comprises a fifth peptide. In some embodi at least 100 contiguous nucleotides of the coding sequence, or ments said fifth peptide has the sequence VGGHEGAGVVV at least 70% identical for a stretch of at least 100 contiguous (SEQ ID NO: 157). nucleotides of the coding sequence, or at least 65% identical 0419. In some embodiments of the invention it may be for a stretch of at least 100 contiguous nucleotides of the advantageous to integrate a cytochrome P450 into a strain of coding sequence, or at least 60% identical for a stretch of at Candida tropicalis in which cytochrome P450 genes least 100 contiguous nucleotides of the coding sequence CYP52A17, CYP52A18, CYP52A13, CYP52A14, within the coding region with one of the Candida tropicalis CYP52A12 and CYP52A12B have been disrupted. In some genes ADH-A4 (SEQ ID NO:39), ADH-B4 (SEQ ID NO: embodiments it may be advantageous to integrate a cyto 42), ADH-A10 (SEQID NO: 40), ADH-A10B (SEQID NO: chrome P450 into a strain of Candida tropicalis in which one 56), ADH-B11 (SEQID NO: 43). or more of the cytochrome P450 genes CYP52A17, 0415. In some embodiments it may be advantageous to CYP52A18, CYP52A13, CYP52A14, CYP52A12 and integrate a gene encoding a cytochrome P450 into a yeast CYP52A12B have been disrupted. In some embodiments it strain of the genus Candida in which (i) one or more alcohol may be advantageous to integrate a cytochrome P450 into a dehydrogenase genes have been disrupted and (ii) the dis strain of Candida tropicalis in which fatty alcohol oxidase rupted alcohol dehydrogenase comprises a first peptide. In genes FAO1, FAO1B, FAO2 and FAO2B, alcohol dehydro some embodiments the first peptide has the sequence VKYS genase genes ADH-A4, ADH-A4B, ADH-B4, ADH-B4B, GVCH (SEQ ID NO: 156). In some embodiments the first ADH-A10 and ADH-B11 and cytochrome P450 genes peptide has the sequence VKYSGVCHXXXXXWKGDW CYP52A17, CYP52A18, CYP52A13, CYP52A14, US 2015/0094.483 A1 Apr. 2, 2015 59

CYP52A12 and CYP52A12B have been disrupted, for 0429. In some embodiments, one or more genes, two or example strain DP421, in which the B-oxidation pathway has more genes, or three or more genes listed in Table 4 are also been disrupted. integrated into a strain of Candida in which endogenous 0420. In some embodiments, a cytochrome P450 is inte cytochrome P450s have been disrupted. grated into a strain of Candida tropicalis in which endog 0430. In some embodiments, a gene having at least 40 enous cytochrome P450s have been disrupted. percent sequence identity, at least 45 percent sequence iden 0421. In some embodiments, a cytochrome P450 is inte tity, at least 50 percent sequence identity, at least 55 percent grated into a strain of Candida in which endogenous cyto sequence identity, at least 60 percent sequence identity, at chrome P450s have been disrupted. least 65 percent sequence identity, at least 70 percent 0422. In some embodiments, a cytochrome P450 is inte sequence identity, at least 75 percent sequence identity, at grated into a strain of yeast of a species of the genus Candida least 80 percent sequence identity, at least 85 percent in which endogenous cytochrome P450s have been disrupted. sequence identity, at least 90 percent sequence identity, or at 0423 In some embodiments, one or more genes, two or least 95 percent sequence identity to a gene listed in Table 4 is more genes, or three or more genes listed in Table 4 are integrated into a yeast strain, a species of Candida, or a strain integrated into a yeast Strain, a species of Candida, or a strain of Candida tropicalis in which genes or pathways that cause of Candida tropicalis in which genes or pathways that cause further oxidation of a fatty acid substrate (e.g., a C-carboxyl further oxidation of a fatty acid Substrate (e.g., a C-carboxyl ()-hydroxy fatty acid having a carbon chain length in the ()-hydroxy fatty acid having a carbon chain length in the range from C6 to C22, an O.co-dicarboxylic fatty acid having range from C6 to C22, an O.co-dicarboxylic fatty acid having a carbon chain length in the range from C6 to C22, or mixtures a carbon chain length in the range from C6 to C22, or mixtures thereof) have been disrupted. In some embodiments, this thereof) have been disrupted. In some embodiments, this strain of yeast is one in which one or more disrupted cyto strain of yeast is one in which one or more disrupted cyto chrome P450s, or one or more disrupted alcohol oxidases, or chrome P450s, or one or more disrupted alcohol oxidases, or one or more disrupted alcohol dehydrogenases present in the one or more disrupted alcohol dehydrogenases present in the strain of yeast will oxidize hydroxyl groups to aldehydes or strain of yeast will oxidize hydroxyl groups to aldehydes or acids more slowly than strains of yeast in which these genes acids more slowly than strains of yeast in which these genes have not been disrupted. In some embodiments, this strain of have not been disrupted. In some embodiments, this strain of yeast is one in which one or more disrupted cytochrome yeast is one in which one or more disrupted cytochrome P450s, one or more disrupted alcohol oxidases, and one or P450s, one or more disrupted alcohol oxidases, and one or more disrupted alcohol dehydrogenases will oxidize more disrupted alcohol dehydrogenases will oxidize hydroxyl groups to aldehydes or acids more slowly than hydroxyl groups to aldehydes or acids more slowly than strains of yeast in which these genes have not been disrupted. strains of yeast in which these genes have not been disrupted. 0431. In some embodiments, a gene having at least 40 percent sequence identity, at least 45 percent sequence iden 0424. In some embodiments, one or more genes, two or tity, at least 50 percent sequence identity, at least 55 percent more genes, or three or more genes listed in Table 4 are sequence identity, at least 60 percent sequence identity, at integrated into a strain of Candida tropicalis in which fatty least 65 percent sequence identity, at least 70 percent alcohol oxidase genes FAO1, FAO1B, FAO2 and FAO2B sequence identity, at least 75 percent sequence identity, at have been disrupted. least 80 percent sequence identity, at least 85 percent 0425. In some embodiments, one or more genes, two or sequence identity, at least 90 percent sequence identity, or at more genes, or three or more genes listed in Table 4 are least 95 percent sequence identity to a gene listed in Table 4 is integrated into a strain of Candida tropicalis in which endog integrated into a strain of Candida tropicalis in which fatty enous alcohol dehydrogenase genes ADH-A4, ADH-A4B, alcohol oxidase genes FAO1, FAO1B, FAO2 and FAO2B ADH-B4, ADH-B4B, ADH-A10 and ADH-B11 have been have been disrupted. disrupted. 0432. In some embodiments, a gene having at least 40 0426 In some embodiments, one or more genes, two or percent sequence identity, at least 45 percent sequence iden more genes, or three or more genes listed in Table 4 are tity, at least 50 percent sequence identity, at least 55 percent integrated into a strain of Candida tropicalis in which endog sequence identity, at least 60 percent sequence identity, at enous cytochrome P450 genes CYP52A17, CYP52A18, least 65 percent sequence identity, at least 70 percent CYP52A13, CYP52A14, CYP52A12 and CYP52A12B have sequence identity, at least 75 percent sequence identity, at been disrupted. least 80 percent sequence identity, at least 85 percent 0427. In some embodiments, one or more genes, two or sequence identity, at least 90 percent sequence identity, or at more genes, or three or more genes listed in Table 4 are least 95 percent sequence identity to a gene listed in Table 4 is integrated into a strain of Candida tropicalis in which fatty integrated into a strain of Candida tropicalis in which endog alcohol oxidase genes FAO1, FAO1B, FAO2 and FAO2B, enous alcohol dehydrogenase genes ADH-A4, ADH-A4B, alcohol dehydrogenase genes ADH-A4, ADH-A4B, ADH ADH-B4, ADH-B4B, ADH-A10 and ADH-B11 have been B4, ADH-B4B, ADH-A10 and ADH-B11 and cytochrome disrupted. P450 genes CYP52A17, CYP52A18, CYP52A13, 0433. In some embodiments, a gene having at least 40 CYP52A14, CYP52A12 and CYP52A12B have been dis percent sequence identity, at least 45 percent sequence iden rupted, for example strain DP421, in which the B-oxidation tity, at least 50 percent sequence identity, at least 55 percent pathway has also been disrupted. sequence identity, at least 60 percent sequence identity, at 0428. In some embodiments, one or more genes, two or least 65 percent sequence identity, at least 70 percent more genes, or three or more genes listed in Table 4 are sequence identity, at least 75 percent sequence identity, at integrated into a strain of Candida tropicalis in which endog least 80 percent sequence identity, at least 85 percent enous cytochrome P450s have been disrupted. sequence identity, at least 90 percent sequence identity, or at US 2015/0094.483 A1 Apr. 2, 2015 60 least 95 percent sequence identity to a gene listed in Table 4 is sequence identity, at least 60 percent sequence identity, at integrated into a strain of yeast species of the genus Candida least 65 percent sequence identity, at least 70 percent in which one or more alcoholdehydrogenase genes have been sequence identity, at least 75 percent sequence identity, at disrupted, and wherein at least one disrupted alcohol dehy least 80 percent sequence identity, at least 85 percent drogenase gene shares at least 95% nucleotide identity, or at sequence identity, at least 90 percent sequence identity, or at least 90% nucleotide identity, or at least 85% nucleotide least 95 percent sequence identity to a gene listed in Table 4 is identity for a stretch of at least 100 contiguous nucleotides integrated into a strain of yeast species of the genus Candida within the coding region, or at least 80% identical for a stretch in which one or more alcoholdehydrogenase genes have been of at least 100 contiguous nucleotides of the coding sequence, disrupted and wherein at least one disrupted alcohol dehy or at least 75% identical for a stretch of at least 100 contigu drogenase gene comprises a first peptide. In some embodi ous nucleotides of the coding sequence, or at least 70% iden ments the first peptide has the sequence VKYSGVCH (SEQ tical for a stretch of at least 100 contiguous nucleotides of the ID NO: 156). In some embodiments the first peptide has the coding sequence, or at least 65% identical for a stretch of at sequence VKYSGVCHxxxxxWKGDW (SEQID NO: 162). least 100 contiguous nucleotides of the coding sequence, or at In some embodiments the first peptide has the sequence least 60% identical for a stretch of at least 100 contiguous VKYSGVCHXXXXXWKGDWXXXXKLPXVG nucleotides of the coding sequence with one of the Candida GHEGAGVVV (SEQ ID NO: 163). In some embodiments tropicalis genes ADH-A4 (SEQID NO:39), ADH-B4 (SEQ the disrupted alcohol dehydrogenase sequence, predicted ID NO: 42), ADH-A10 (SEQID NO: 40), ADH-A10B (SEQ from translation of the gene that encodes it, comprises a ID NO. 56), ADH-B11 (SEQID NO: 43). second peptide. In some embodiments the second peptide has 0434 In some embodiments, a gene listed in Table 4 is the sequence QYATADAVQAA (SEQID NO: 158). In some integrated into a strain of yeast of the genus Candida in which embodiments the second peptide has the sequence SGYX (i) one or more alcohol dehydrogenase genes has been dis HDGxFXOYATADAVQAA (SEQ ID NO: 164). In some rupted and (ii) at least one disrupted alcohol dehydrogenase embodiments the second peptide has the sequence GAEP gene comprises a first peptide. In some embodiments the first NCXXADxSGYxHDGxFxOYATADAVQAA (SEQ ID NO: peptide has the sequence VKYSGVCH(SEQID NO: 156). In 165). some embodiments the first peptide has the sequence VKYS 0440. In some embodiments, the disrupted alcohol dehy GVCHXXXXXWKGDW (SEQID NO: 162). In some embodi drogenase sequence, predicted from translation of the gene ments the first peptide has the sequence VKYSGVCHXXXXX that encodes it, comprises a third peptide. In some embodi WKGDWXXXXKLPxVGGHEGAGVVV (SEQID NO: 163). ments the third peptide has the sequence CAGVTVYKALK In some embodiments the disrupted alcohol dehydrogenase (SEQ ID NO: 159). In some embodiments the third peptide sequence, predicted from translation of the gene that encodes has the sequence APIXCAGVTVYKALK (SEQ ID NO: it, comprises a second peptide. In some embodiments the 166). second peptide has the sequence QYATADAVOAA (SEQID 0441. In some embodiments the first genetic modification NO: 158). In some embodiments the second peptide has the class comprises disruption of at least one alcohol dehydroge sequence SGYxHDGxFXOYATADAVQAA (SEQ ID NO: nase whose amino acid sequence, predicted from translation 164). In some embodiments the second peptide has the of the gene that encodes it, comprises a fourth peptide. In Sequence GAEPNCXXADxSGYxHDGxFxOYATA Some embodiments the fourth peptide has the sequence DAVQAA (SEQ ID NO: 165). GQWVAISGA(SEQID NO: 160). In some embodiments the 0435. In some embodiments the disrupted alcohol dehy fourth peptide has the sequence GQWVAISGAXGGLGSL drogenase sequence, predicted from translation of the gene (SEQID NO: 167). In some embodiments the fourth peptide that encodes it, comprises a third peptide. In some embodi has the sequence GQWVAISGAXGGLGSLXVQYA (SEQID ments the third peptide has the sequence CAGVTVYKALK NO: 168). In some embodiments, the fourth peptide has the (SEQ ID NO: 159). sequence GQWVAISGAXGGLGSLXVQYAXAMG (SEQID 0436. In some embodiments the third peptide has the NO: 169). In some embodiments the fourth peptide has the sequence APIXCAGVTVYKALK (SEQID NO: 166). Sequence GQWVAISGAxGGLGSLXVQYAXAM 0437. In some embodiments the fourth peptide has the GxRVXAIDGG (SEQID NO: 170). sequence GQWVAISGA (SEQ ID NO: 160). In some 0442. In some embodiments the disrupted alcohol dehy embodiments the fourth peptide has the sequence drogenase sequence, predicted from translation of the gene GQWVAISGAxGGLGSL (SEQ ID NO: 167). In some that encodes it, comprises a fifth peptide. In some embodi embodiments the fourth peptide has the sequence ments said fifth peptide has the sequence VGGHEGAGVVV GQWVAISGAxGGLGSLXVQYA (SEQ ID NO: 168). In (SEQ ID NO: 157). Some embodiments, the fourth peptide has the sequence 0443. In some embodiments, a gene having at least 40 GQWVAISGAxGGLGSLXVQYAXAMG (SEQ ID NO: percent sequence identity, at least 45 percent sequence iden 169). In some embodiments the fourth peptide has the tity, at least 50 percent sequence identity, at least 55 percent Sequence GQWVAISGAxGGLGSLXVQYAXAM sequence identity, at least 60 percent sequence identity, at GxRVxAIDGG (SEQID NO: 170). least 65 percent sequence identity, at least 70 percent 0438. In some embodiments the disrupted alcohol dehy sequence identity, at least 75 percent sequence identity, at drogenase sequence, predicted from translation of the gene least 80 percent sequence identity, at least 85 percent that encodes it, comprises a fifth peptide. In some embodi sequence identity, at least 90 percent sequence identity, or at ments the fifth peptide has the sequence VGGHEGAGVVV least 95 percent sequence identity to a gene listed in Table 4 is (SEQ ID NO: 157). integrated into a strain of Candida tropicalis in which endog 0439. In some embodiments, a gene having at least 40 enous cytochrome P450 genes CYP52A17, CYP52A18, percent sequence identity, at least 45 percent sequence iden CYP52A13, CYP52A14, CYP52A12 and CYP52A12B have tity, at least 50 percent sequence identity, at least 55 percent been disrupted. US 2015/0094.483 A1 Apr. 2, 2015

0444. In some embodiments, a gene having at least 40 hydroxyl groups to aldehydes or acids more slowly than percent sequence identity, at least 45 percent sequence iden strains of yeast in which these genes have not been disrupted. tity, at least 50 percent sequence identity, at least 55 percent 0448. In some embodiments, a gene having at least 40 sequence identity, at least 60 percent sequence identity, at percent sequence identity, at least 45 percent sequence iden least 65 percent sequence identity, at least 70 percent tity, at least 50 percent sequence identity, at least 55 percent sequence identity, at least 75 percent sequence identity, at sequence identity, at least 60 percent sequence identity, at least 80 percent sequence identity, at least 85 percent least 65 percent sequence identity, at least 70 percent sequence identity, at least 90 percent sequence identity, or at sequence identity, at least 75 percent sequence identity, at least 95 percent sequence identity to a gene listed in Table 4 is least 80 percent sequence identity, at least 85 percent integrated into a strain of Candida tropicalis in which fatty sequence identity, at least 90 percent sequence identity, or at alcohol oxidase genes FAO1, FAO1B, FAO2 and FAO2B, least 95 percent sequence identity to a gene listed in Table 4 is alcohol dehydrogenase genes ADH-A4, ADH-A4B, ADH integrated into a strain of Candida tropicalis in which fatty B4, ADH-B4B, ADH-A10 and ADH-B11 and cytochrome alcohol oxidase genes FAO1, FAO1B, FAO2 and FAO2B P450 genes CYP52A17, CYP52A18, CYP52A13, have been disrupted. CYP52A14, CYP52A12 and CYP52A12B have been dis 0449 In some embodiments, a gene having at least 40 rupted, for example strain DP421, in which the B-oxidation percent sequence identity, at least 45 percent sequence iden pathway has also been disrupted. tity, at least 50 percent sequence identity, at least 55 percent 0445. In some embodiments, a gene having at least 40 sequence identity, at least 60 percent sequence identity, at percent sequence identity, at least 45 percent sequence iden least 65 percent sequence identity, at least 70 percent tity, at least 50 percent sequence identity, at least 55 percent sequence identity, at least 75 percent sequence identity, at sequence identity, at least 60 percent sequence identity, at least 80 percent sequence identity, at least 85 percent least 65 percent sequence identity, at least 70 percent sequence identity, at least 90 percent sequence identity, or at sequence identity, at least 75 percent sequence identity, at least 95 percent sequence identity to a gene listed in Table 4 is least 80 percent sequence identity, at least 85 percent integrated into a strain of Candida tropicalis in which endog sequence identity, at least 90 percent sequence identity, or at enous alcohol dehydrogenase genes ADH-A4, ADH-A4B, least 95 percent sequence identity to a gene listed in Table 4 is ADH-B4, ADH-B4B, ADH-A10 and ADH-B11 have been integrated into a strain of Candida tropicalis in which endog disrupted. enous cytochrome P450s have been disrupted. 0450. In some embodiments, a gene having at least 40 0446. In some embodiments, a gene having at least 40 percent sequence identity, at least 45 percent sequence iden percent sequence identity, at least 45 percent sequence iden tity, at least 50 percent sequence identity, at least 55 percent tity, at least 50 percent sequence identity, at least 55 percent sequence identity, at least 60 percent sequence identity, at sequence identity, at least 60 percent sequence identity, at least 65 percent sequence identity, at least 70 percent least 65 percent sequence identity, at least 70 percent sequence identity, at least 75 percent sequence identity, at sequence identity, at least 75 percent sequence identity, at least 80 percent sequence identity, at least 85 percent least 80 percent sequence identity, at least 85 percent sequence identity, at least 90 percent sequence identity, or at sequence identity, at least 90 percent sequence identity, or at least 95 percent sequence identity to a gene listed in Table 4 is least 95 percent sequence identity to a gene listed in Table 4 is integrated into a strain of Candida tropicalis in which endog integrated into a strain of Candida in which endogenous enous cytochrome P450 genes CYP52A17, CYP52A18, cytochrome P450s have been disrupted. CYP52A13, CYP52A14, CYP52A12 and CYP52A12B have 0447. In some embodiments, a gene having at least 40 been disrupted. percent sequence identity, at least 45 percent sequence iden 0451. In some embodiments, a gene having at least 40 tity, at least 50 percent sequence identity, at least 55 percent percent sequence identity, at least 45 percent sequence iden sequence identity, at least 60 percent sequence identity, at tity, at least 50 percent sequence identity, at least 55 percent least 65 percent sequence identity, at least 70 percent sequence identity, at least 60 percent sequence identity, at sequence identity, at least 75 percent sequence identity, at least 65 percent sequence identity, at least 70 percent least 80 percent sequence identity, at least 85 percent sequence identity, at least 75 percent sequence identity, at sequence identity, at least 90 percent sequence identity, or at least 80 percent sequence identity, at least 85 percent least 95 percent sequence identity to a gene listed in Table 4 is sequence identity, at least 90 percent sequence identity, or at integrated into a yeast Strain, a species of Candida, or a strain least 95 percent sequence identity to a gene listed in Table 4 is of Candida tropicalis in which genes or pathways that cause integrated into a strain of Candida tropicalis in which fatty further oxidation of a fatty acid Substrate (e.g., a C-carboxyl alcohol oxidase genes FAO1, FAO1B, FAO2 and FAO2B, ()-hydroxy fatty acid having a carbon chain length in the alcohol dehydrogenase genes ADH-A4, ADH-A4B, ADH range from C6 to C22, an O.co-dicarboxylic fatty acid having B4, ADH-B4B, ADH-A10 and ADH-B11 and cytochrome a carbon chain length in the range from C6 to C22, or mixtures P450 genes CYP52A17, CYP52A18, CYP52A13, thereof) have been disrupted. In some embodiments, this CYP52A14, CYP52A12 and CYP52A12B have been dis strain of yeast is one in which one or more disrupted cyto rupted, for example strain DP421, in which the B-oxidation chrome P450s, or one or more disrupted alcohol oxidases, or pathway has also been disrupted. one or more disrupted alcohol dehydrogenases present in the 0452. In some embodiments, a gene having at least 40 strain of yeast will oxidize hydroxyl groups to aldehydes or percent sequence identity, at least 45 percent sequence iden acids more slowly than strains of yeast in which these genes tity, at least 50 percent sequence identity, at least 55 percent have not been disrupted. In some embodiments, this strain of sequence identity, at least 60 percent sequence identity, at yeast is one in which one or more disrupted cytochrome least 65 percent sequence identity, at least 70 percent P450s, one or more disrupted alcohol oxidases, and one or sequence identity, at least 75 percent sequence identity, at more disrupted alcohol dehydrogenases will oxidize least 80 percent sequence identity, at least 85 percent US 2015/0094.483 A1 Apr. 2, 2015 62 sequence identity, at least 90 percent sequence identity, or at (given as SEQID NO: 69) into a vector of the form shown in least 95 percent sequence identity to a gene listed in Table 4 is FIG. 23. The sequence of the complete vector is given as SEQ integrated into a strain of Candida tropicalis in which endog ID NO: 70. enous cytochrome P450s have been disrupted. 0458. The vector was prepared as described in Section 0453. In some embodiments, a gene having at least 40 7.1.1, except that the construct was linearized with BsiWI percent sequence identity, at least 45 percent sequence iden instead of BsmBI. Candida tropicalis strains were trans tity, at least 50 percent sequence identity, at least 55 percent formed with the construct as described in Section 7.1.2, sequence identity, at least 60 percent sequence identity, at except that 100 ug/ml of Zeocin was used instead of 200 g/ml least 65 percent sequence identity, at least 70 percent nourseothricinas the selective antibiotic. Genomic DNA was sequence identity, at least 75 percent sequence identity, at prepared and tested for the presence of the integrated DNA as least 80 percent sequence identity, at least 85 percent described in Section 7.1.3. sequence identity, at least 90 percent sequence identity, or at 0459 Candida tropicalis strain DP201 was prepared by least 95 percent sequence identity to a gene listed in Table 4 is integration of the construct shown as SEQID NO: 70 into the integrated into a strain of Candida in which endogenous genome of strain DP186 (Table 3) at the site of the genomic cytochrome P450s have been disrupted. sequence of the gene for isocitrate lyase. DP428 was prepared 0454. To achieve novel phenotypes of Candida, it may be by integration of the construct shown as SEQID NO: 70 into advantageous to modify the activity of a polypeptide by alter the genome of strain DP421 (Table 3) at the site of the ing its sequence and to test the effect of the polypeptide with genomic sequence of the gene for isocitrate lyase. Sequences altered sequence within the yeast. A preferred method for of oligonucleotide primers for analysis of strains were: testing the effect of sequence changes in a polypeptide within yeast is to introduce a plurality of genes of known sequence, each encoding a unique modified polypeptide, into the same ICL-IN-F1: genomic location in a plurality of strains. (SEQ ID NO: 124) 0455 The isocitrate lyase promoter from Candida tropi GGATCCGTCTGAAGAAATCAAGAACC calis has been shown to be an inducible promoter in both 1758R2 : Saccharomyces cerevisiae and E. coli (Atomi et al., 1995, (SEQ ID NO: 125) Arch Microbiol: 163, 322-8: Umemura et al., 1995, Appl TGGTGTAGGCCAATAATTGCTTAATGATATACAAAACTGGCACC Microbiol Biotechnol: 43, 489-92.) When expressed in S. ACAA cerivisiae, the isocitrate lyase gene was found to be inducible 1758F2: by acetate, glycerol, lactate, ethanol, or oleate. Ethanol is (SEQ ID NO: 126) interesting from the perspective that is a relatively cheap GAGCAATTGTTGGAATATTGGTACGTTGTGGTGCCAGTTTTGTA inducer and oleate for the fact that it is a potential substrate for TATCA the system for converting fatty acids to omega hydroxy fatty 1758R34 : acids. Inducible expression of the Candida tropicalis ICL (SEO ID NO: 127) gene was found to be high in S. cerivisiae (as much as 30% of cGAACTTAACAATAGCACCGTCTTGCAAACACATGGTCAAGTTA soluble protein), indicating that it may serve as a strong GTTAA. inducible promoter in C. tropicalis. 0460 For strains DP201 and DP428 (integrants of SEQID 0456 To insert genes under control of the isocitrate lyase NO: 70), PCR with primers ICL-IN-F1 and 1758R2 produces promoter a genomic insertion construct of the form shown in a 1609 base pair amplicon indicating that the construct has FIG. 21 was synthesized. The sequence used for the sequence been integrated in the ICL promoter region: PCR with prim of promoter 1 was that of the Candida tropicalis isocitrate ers 1758F2 and 1758R34 produces a 1543 base pair amplicon lyase promoter, given as SEQID NO: 62. This promoter has indicating that CYP52A17 has been integrated. Neither a BsiWI site that can be used to linearize the construct for primer pair produces an amplicon from the parental strains Subsequent insertion into the Candida tropicalis genome. The DP186 or DP421. sequence used for transcription terminator 1 was that of the 7.6.2. Insertion of CYP52A13 under control of the isocitrate Candida tropicalis isocitrate lyase terminator, given as SEQ lyase promoter ID NO: 63. The sequence used for Promoter 2 was the TEF1 0461. A construct for expressing Candida tropicalis cyto promoter, given as SEQID NO: 64. The sequence used for the chrome P450 CYP52A13 under the control of the isocitrate bacterial promoter was the EM7 promoter, given as SEQID lyase promoter was made by cloning the sequence of a gene NO: 65. The sequence used for the selectable marker was the encoding Candida tropicalis cytochrome P450 CYP52A13 Zeocin resistance gene, a version optimized for expression in (given as SEQID NO: 71) into a vector of the form shown in Candida tropicalis is given as SEQID NO: 66. The sequence FIG. 23. The sequence of the complete vector is given as SEQ use for Transcription terminator 2 was the CYC1 transcrip ID NO: 72. tion terminator, given as SEQID NO: 67. The sequence used 0462. The vector was prepared as described in Section as the bacterial origin of replication was the pUC origin, given 7.1.1, except that the construct was linearized with BsiWI as SEQID NO: 68. A genomic integration vector with these instead of BsmBI. Candida tropicalis strains were trans components is represented graphically as FIG. 23. formed with the construct as described in Section 7.1.2, except that 100 ug/ml of Zeocin was used instead of 200 g/ml 7.6.1. Insertion of CYP52A17 Under Control of the Isocitrate nourseothricinas the selective antibiotic. Genomic DNA was Lyase Promoter prepared and tested for the presence of the integrated DNA as 0457. A construct for expressing Candida tropicalis cyto described in Section 7.1.3. chrome P450 CYP52A17 under the control of the isocitrate 0463 Candida tropicalis strain DP522 was prepared by lyase promoter was made by cloning the sequence of a gene integration of the construct shown as SEQID NO: 72 into the encoding Candida tropicalis cytochrome P450 CYP52A17 genome of strain DP421 (Table 3) at the site of the genomic US 2015/0094.483 A1 Apr. 2, 2015 63 sequence of the gene for isocitrate lyase. Sequences of oligo 0468 For strain DP526 (integration of SEQID NO: 74), nucleotide primers for analysis of strains were: PCR with primers ICL-IN-F1 and 4082R2 produces a 1554 base pair amplicon indicating that the construct has been integrated in the ICL promoter region: PCR with primers (SEQ ID NO: 124) 4082F2 and 4082R34 produces a 1572 base pair amplicon ICL-IN-F1: indicating that CYP52A12 has been integrated. Neither 4 O82R2: primer pair produces an amplicon from the parental strain (SEQ ID NO: 128) DP421. CGATTAAGGCCAATGGAACAATGACGTACCACTTAGTAAAGTAG GTA 7.7. Deletion of Pdx Genes from Candida tropicalis 4 O82F2: (SEQ ID NO: 129) 0469 Picataggio et al., 1991, Mol Cell Biol: 11, 4333-9, CATGACTGTTCACGACATTATTGCTACC TACTTTACTAAGTGGT describe a system for the sequential disruption of the Candida ACGTC tropicalis chromosomal POX4 and POX5 genes, encoding distinct isozymes of the acyl coenzyme A (acyl-CoA) oxi 4 O82R34: (SEQ ID NO: 130) dase, which catalyze the first reaction in the B-Oxidation path ACATTTCAATATTAGCACCGTCAAATAATGACATGGTCAAATGG way of fatty acids. An alternative method is to use the SAT-1 GACA flipper.

0464 For strain DP522 (integration of SEQID NO: 72), 7.7.1. Deletion of POX4 Alleles PCR with primers ICL-IN-F1 and 4082R2 produces a 1600 base pair amplicon indicating that the construct has been 0470 The sequence of a gene encoding an acyl-coenzyme integrated in the ICL promoter region: PCR with primers A oxidase II (PXP-4) of Candida tropicalis, POX4, is given as 4082F2 and 4082R34 produces a 1565 base pair amplicon SEQ ID NO: 136. This sequence was used to design two indicating that CYP52A13 has been integrated. Neither “pre-targeting constructs. The first pre-targeting construct is primer pair produces an amplicon from the parental strain comprised of two targeting sequences from the 5' and 3' end of DP421. the structural gene. The targeting sequences are separated by a sequence, given as SEQ ID NO: 12, comprising a NotI 7.6.3. Insertion of CYP52A12 Under Control of the Isocitrate restriction site, a 20 bp stuffer fragment and an XhoI restric Lyase Promoter tion site. The targeting sequences are flanked by BSmBI restriction sites, so that the final targeting construct can be 0465. A construct for expressing Candida tropicalis cyto linearized prior to transformation into Candida tropicalis. chrome P450 CYP52A12 under the control of the isocitrate The sequence of the first POX4 pre-targeting construct is lyase promoter was made by cloning the sequence of a gene given as SEQID NO: 137. Not shown in SEQID NO: 137 but encoding Candida tropicalis cytochrome P450 CYP52A12 also present in the pre-targeting construct were a selective (given as SEQID NO: 73) into a vector of the form shown in marker conferring resistance to kanamycin and a bacterial FIG. 23. The sequence of the complete vector is given as SEQ origin of replication, so that the pre-targeting construct can be ID NO: 74. grown and propagated in E. coli. The first pre-targeting 0466. The vector was prepared as described in Section sequence can be synthesized using standard DNA synthesis 7.1.1, except that the construct was linearized with BsiWI techniques well known in the art. instead of BsmBI. Candida tropicalis strains were trans 0471. The second pre-targeting construct is comprised of formed with the construct as described in Section 7.1.2, two targeting sequences from the 5' and 3' end of the structural except that 100 ug/ml of Zeocin was used instead of 200ug/ml gene that lie internal to the 5' and 3' targeting sequences of the nourseothricinas the selective antibiotic. Genomic DNA was first pre-targeting construct. The targeting sequences are prepared and tested for the presence of the integrated DNA as separated by a sequence, given as SEQID NO: 12, compris described in Section 7.1.3. ing a NotI restriction site, a 20 bp stuffer fragment and an 0467 Candida tropicalis strain DP526 was prepared by XhoI restriction site. The targeting sequences are flanked by integration of the construct shown as SEQID NO: 74 into the BSmBI restriction sites, so that the final targeting construct genome of strain DP421 (Table 3) at the site of the genomic can be linearized prior to transformation into Candida tropi sequence of the gene for isocitrate lyase. Sequences of oligo calis. The sequence of the second POX4 pre-targeting con nucleotide primers for analysis of strains were: struct is given as SEQID NO: 138. Not shown in SEQID NO: 138 but also present in the pre-targeting construct are a selec tive marker conferring resistance to kanamycin and a bacte (SEQ ID NO: 124) rial origin of replication, so that the pre-targeting construct ICL-IN-F1: can be grown and propagated in E. coli. The second pre CYP52A12-R2 : targeting sequence can synthesized using standard DNA syn (SEQ ID NO: 131) thesis techniques well known in the art. ATCAATAATTTCCTGGGTTGCCAT 0472 Targeting sequences for deletion of the two POX4 CYP52A12-F1: alleles from the Candida tropicalis geneome can be prepared (SEQ ID NO: 132) by digesting the SAT-1 flipper (SEQID NO: 1) with restric ATGGCAACCCAGGAAATTATTGAT tion enzymes Not and XhoI, and ligating into the POX4 CYP52A12-R1: pre-targeting constructs (SEQ ID NO: 137 or SEQ ID NO: (SEQ ID NO: 133) 138) from which the 20 bp stuffer has been removed by CTACATCTTGACAAAAACACCATCATT digestion with restriction enzymes NotI and XhoI. The sequence of the resulting first targeting construct for the dele US 2015/0094.483 A1 Apr. 2, 2015 64 tion of the first allele of POX4 is given as SEQID NO: 139. selective marker conferring resistance to kanamycin and a The sequence of the resulting second targeting construct for bacterial origin of replication, so that the pre-targeting con the deletion of the second allele of POX4 is given is SEQID struct can be grown and propagated in E. coli. The second NO: 140. Because the POX4 targeting sequences of the sec pre-targeting sequence can be synthesized using standard ond targeting construct lie internal to the targeting sequences DNA synthesis techniques well known in the art. of the first targeting construct, use of the first targeting con 0476 Targeting sequences for deletion of the two POX5 struct to delete the first POX4 allele assures that use of the alleles from the Candida tropicalis geneome were prepared second targeting construct is specific for the second POX4 by digesting the SAT-1 flipper (SEQID NO: 1) with restric allele since the targeting sequences of the second targeting tion enzymes NotI and XhoI, and ligating into both of the construct no longer exist in the first deleted allele. POX5 pre-targeting constructs (SEQID NO 144 or 145) from 0473 Analysis of integrants and excisants can be per which the 20 bp stuffer had been removed by digestion with formed as described in Section 7.1. Sequences of oligonucle restriction enzymes NotI and XhoI. The sequence of the otide primers for the analysis of strains are: resulting first targeting construct for the deletion of the first allele of POX5 is given as SEQID NO: 146. The sequence of the resulting second targeting construct for the deletion of the POX4 - IN-L: (SEQ ID NO: 141) second allele of POX5 is given is SEQID NO: 147. Because ATGACTTTTACAAAGAAAAACGTTAGTGTATCACAAG the POX5 targeting sequences of the second targeting con struct lie internal to the targeting sequences of the first target POX4 - IN-R: ing construct, use of the first targeting construct to delete the (SEQ ID NO: 142) first POX5 allele assures that use of the second targeting TTACTTGGACAAGATAGCAGCGGTTTC construct is specific for the second POX5 allele since the SAT1-R: targeting sequences of the second targeting construct no (SEO ID NO: 79) longer exist in the first deleted allele. TGGTACTGGTTCTCGGGAGCACAGG 0477 Analysis of integrants and excisants can be per SAT1 - F : formed as described in section 7.1. Sequences of oligonucle (SEQ ID NO: 80) CGCTAGACAAATTCTTCCAAAAATTTTAGA otide primers for the analysis of strains are:

7.7.2. Deletion of POX5 Alleles POX5-IN-L: (SEQ ID NO: 148) 0474 The sequence of a gene encoding an acyl-coenzyme ATGCCTACCGAACTTCAAAAAGAAAGAGAA A oxidase I (PXP-5) of Candida tropicalis, POX5, is given as POX5-IN-R: SEQ ID NO: 143. This sequence was used to design two (SEQ ID NO: 149) “pre-targeting constructs. The first pre-targeting construct is TTAACTGGACAAGATTTCAGCAGCTTCTTC comprised of two targeting sequences from the 5' and 3' end of SAT1-R: the structural gene. The targeting sequences were separated (SEO ID NO : 79) by a sequence, given as SEQID NO: 12, comprising a NotI TGGTACTGGTTCTCGGGAGCACAGG restriction site, a 20 bp stuffer fragment and an XhoI restric SAT1 - F : tion site. The targeting sequences are flanked by BSmBI (SEQ ID NO: 80) restriction sites, so that the final targeting construct can be CGCTAGACAAATTCTTCCAAAAATTTTAGA linearized prior to transformation into Candida tropicalis. The sequence of the first POX5 pre-targeting construct is given as SEQID NO: 144. Not shown in SEQID NO: 144 but 7.8. Insertion of Genes into the Genome of Candida also present in the pre-targeting construct were a selective marker conferring resistance to kanamycin and a bacterial 0478. To achieve novel phenotypes in yeasts of the genus origin of replication, so that the pre-targeting construct can be Candida (e.g., Candida tropicalis), including biotransforma grown and propagated in E. coli. The first pre-targeting tions of compounds by Candida tropicalis, including chemi sequence can be synthesized using standard DNA synthesis cal conversions not previously obtained, or increased rates of techniques well known in the art. conversion of one or more substrates to one or more products, 0475. The second pre-targeting construct is comprised of or increased specificity of conversion of one or more Sub two targeting sequences from the 5' and 3' end of the structural strates to one or more products, or increased tolerance of a gene that lie internal to the 5' and 3' targeting sequences of the compound by the yeast, or increased uptake of a compound first pre-targeting construct. The 5' targeting sequence of the by the yeast, it may be advantageous to incorporate a gene second pre-targeting construct is modified at position 248 encoding a polypeptide into the genome of the yeast. Expres (C248T) and 294 (G294A) to remove unwanted XhoI and sion of the polypeptide in the yeast then allows the phenotype BSmBI sites, respectively. The targeting sequences were of the yeast to be modified. separated by a sequence, given as SEQID NO: 12, compris 0479. To achieve novel phenotypes of Candida, it may be ing a Not restriction site, a 20 bp stuffer fragment and an advantageous to modify the activity of a polypeptide by alter XhoI restriction site. The targeting sequences are flanked by ing its sequence and to test the effect of the polypeptide with BSmBI restriction sites, so that the final targeting construct altered sequence within the yeast. A preferred method for can be linearized prior to transformation into Candida tropi testing the effect of sequence changes in a polypeptide within calis. The sequence of the second POX5 pre-targeting con yeast is to introduce a plurality of genes of known sequence, struct is given as SEQID NO: 145. Not shown in SEQID NO: each encoding a unique modified polypeptide, into the same 145 but also present in the pre-targeting construct were a genomic location in a plurality of Strains. US 2015/0094.483 A1 Apr. 2, 2015

0480. The isocitrate lyase promoter from Candida tropi all or a part of the coding sequence of the gene to be disrupted calis has been shown to be an inducible promoter in both with the coding sequence of the gene to be inserted would be Saccharomyces cerevisiae and E. coli as described in Atomi sufficient. H. et al., 1995 Arch Microbiol. 163:322-8: Umemura K. etal, 0483. In some embodiments a construct for integration of 1995 Appl Microbiol Biotechnol. 43:489-92: Kanai T. etal, a gene into the Candida genome with the aim of expressing a 1996 Appl Microbiol Biotechnol. 44:759-65. The paper by protein from that gene comprises a promoter from an alcohol Atomi H. et al., 1995 Arch Microbiol. 163:322-8, identified dehydrogenase gene or a promoter from a cytochrome P450 the sequence between bases -394 and -379 of the promoter as gene, or a promoter for a fatty alcohol oxidase gene. a promoter that regulates the isocitrate lyase promoter in the 0484. In some embodiments of the invention a gene yeast Saccharomyces cerevisiae. The DNA sequence of an encoding a polypeptide is integrated under control of an isoci isocitrate lyase promoter from Candida tropicalis from base trate lyase promoter, an alcohol dehydrogenase promoter, a -394 to -1 is given as SEQID NO 161. Inducible expression fatty alcohol oxidase promoter or a cytochrome P450 pro of the Candida tropicalis ICL gene was found to be high in S. moter into a strain of Candida tropicalis in which one or more cerivisiae (as much as 30% of soluble protein), indicating that of the alcohol dehydrogenase genes ADH-A4, ADH-A4B, it may serve as a strong inducible promoter in C. tropicalis. ADH-B4, ADH-B4B, ADH-A10, ADH-A1OB, ADH-B1B The sequence of an isocitrate lyase promoter that has been and ADH-B11 have been disrupted. In some embodiments of used to drive expression of a protein in the yeast Saccharo the invention a gene encoding a polypeptide is integrated myces cerevisiae is given as SEQID NO: 171. To insert genes under control of an isocitrate lyase promoter, an alcoholdehy under control of the isocitrate lyase promoter a genomic drogenase promoter, a fatty alcohol oxidase promoter or a insertion construct of the form shown in FIG. 21 was synthe cytochrome P450 promoter into a yeast strain of the genus sized. A genomic integration vector with these components is Candida in which one or more alcohol dehydrogenase genes represented graphically as FIG. 23. have been disrupted, and wherein the disrupted alcoholdehy 0481. In some embodiments, a construct for integration of drogenase gene shares at least 95% nucleotide identity, or at a gene to be expressed into the genome of a yeast of the genus least 90% nucleotide identity, or at least 85% nucleotide Candida comprises an isocitrate lyase promoter, in some identity for a stretch of at least 100 contiguous nucleotides embodiments a construct for integration of a gene to be within the coding region, or at least 80% identical for a stretch expressed into the genome of a yeast of the genus Candida of at least 100 contiguous nucleotides of the coding sequence comprises the sequence shown as SEQID NO: 62, in some or at least 75% identical for a stretch of at least 100 contigu embodiments a construct for integration of a gene to be ous nucleotides of the coding sequence, or at least 70% iden expressed into the genome of a yeast of the genus Candida tical for a stretch of at least 100 contiguous nucleotides of the comprises the sequence shown as SEQ ID 161, in some coding sequence, or at least 65% identical for a stretch of at embodiments a construct for integration of a gene to be least 100 contiguous nucleotides of the coding sequence, or at expressed into the genome of a yeast of the genus Candida least 60% identical for a stretch of at least 100 contiguous comprises a sequence that is 70%, 75%, 80%, 85%, 90%, or nucleotides of the coding sequence with one of the Candida 95% identical to the sequence shown as SEQID 161. In some tropicalis genes ADH-A4 (SEQID NO:39), ADH-B4 (SEQ embodiments a construct for integration of a gene to be ID NO: 42), ADH-A10 (SEQID NO: 40), ADH-A10B (SEQ expressed into the genome of a yeast of the genus Candida ID NO. 56), ADH-B11 (SEQID NO: 43). In some embodi comprises a sequence of Sufficient length and identity to the ments of the invention a gene encoding a polypeptide is isocitrate lyase promoter to ensure integration at that locus; in integrated under control of an isocitrate lyase promoter, an Some embodiments said construct comprises at least 100 alcoholdehydrogenase promoter, a fatty alcohol oxidase pro contiguous base pairs or at least 200 contiguous base pairs or moter or a cytochrome P450 promoter into a yeast strain of at least 300 contiguous base pairs or at least 400 contiguous the genus Candida in which one or more alcohol dehydroge base pairs or at least 500 contiguous base pairs of the nase genes have been disrupted, and wherein the disrupted sequence shown as SEQID NO: 62 or to the sequence shown alcohol dehydrogenase comprises a first peptide. In some as SEQ ID NO: 171; in some embodiments the construct embodiments the first peptide has the sequence VKYSGVCH comprises at least 100 contiguous base pairs or at least 200 (SEQ ID NO: 156). In some embodiments, the first peptide contiguous base pairs or at least 300 contiguous base pairs or has the sequence VKYSGVCHXXXXXWKGDW (SEQ ID at least 400 contiguous base pairs or at least 500 contiguous NO: 162). In some embodiments the first peptide has the base pairs that are at least 65%, at least 70%, at least 75%, at Sequence VKYSGVCHXXXXXWKGDWXXXXKLPXVG least 80%, at least 85%, at least 90%, at least 95%, or at least GHEGAGVVV (SEQID NO: 163). 98% identical to the sequence shown as SEQID NO: 62 or to 0485. In some embodiments the disrupted alcohol dehy the sequence shown as SEQID NO: 171. drogenase sequence, predicted from translation of the gene 0482 Genes may also be inserted into the genome of that encodes it, comprises a second peptide. In some embodi yeasts of the genus Candida under control of other promoters ments the second peptide has the sequence QYATADAVOAA by constructing analogous constructs to the one shown sche (SEQID NO: 158). In some embodiments the second peptide matically in FIG. 21. Of particular utility may be the promot has the sequence SGYxHDGxFXOYATADAVQAA (SEQID ers for alcohol dehydrogenase genes, which are known to be NO: 164). In some embodiments the second peptide has the highly expressed in other yeasts such as Saccharomyces cer Sequence GAEPNCXXADxSGYxHDGxFxOYATA evisiae. A construct for integrating into an alcohol dehydro DAVQAA (SEQ ID NO: 165). genase gene locus could also have an advantage in embodi 0486 In some embodiments the disrupted alcohol dehy ments in which it is desirable to disrupt the alcohol drogenase sequence, predicted from translation of the gene dehydrogenase gene itself. In these cases it would be unnec that encodes it, comprises a third peptide. In some embodi essary to know the full sequence of the promoter: replacing ments the third peptide has the sequence CAGVTVYKALK US 2015/0094.483 A1 Apr. 2, 2015 66

(SEQ ID NO: 159). In some embodiments the third peptide has the sequence APIXCAGVTVYKALK (SEQ ID NO: ICL-IN-F1: (SEQ ID NO: 124) 166). GGATCCGTCTGAAGAAATCAAGAACC 0487. In some embodiments the first genetic modification class comprises disruption of at least one alcohol dehydroge 1759R33 : (SEQ ID NO: 150) nase whose amino acid sequence, predicted from translation ACCTTAAAACGCATAAATTCCTTGATGATTGCCATGTTGTCTT of the gene that encodes it, comprises a fourth peptide. In CTTCA Some embodiments the fourth peptide has the sequence GQWVAISGA(SEQID NO: 160). In some embodiments the 0493 For strain DP197 (integrant of SEQ ID NO: 75), fourth peptide has the sequence GQWVAISGAXGGLGSL PCR with primers ICL-IN-F1 and 1759R33 produces a 1592 (SEQID NO: 167). In some embodiments the fourth peptide base pair amplicon indicating that the construct has been has the sequence GQWVAISGAXGGLGSLXVQYA (SEQID integrated in the ICL promoter region. The primer pair does NO: 168). In some embodiments, the fourth peptide has the not produces an amplicon from the parental strain DP 186. sequence GQWVAISGAXGGLGSLXVQYAXAMG (SEQID NO: 169). In some embodiments the fourth peptide has the 8. CONVERSION OF FATTY ACIDSUSING Sequence GQWVAISGAxGGLGSLXVQYAXAM MODIFIED STRAINS OF CANDIDA TROPICALIS GxRVxAIDGG (SEQID NO: 170). 8.1. Analytical Methods 0488. In some embodiments the first genetic modification class comprises disruption of at least one alcohol dehydroge 8.1.1. GC-MS for Identification of Fatty Acids, nase whose amino acid sequence, predicted from translation Omega-Hydroxy Fatty Acids and Diacids of the gene that encodes it, comprises a fifth peptide. In some embodiments the fifth peptide has the sequence VGGHE 0494 Gas chromatography/mass spectrometry (GC/MS) GAGVVV (SEQID NO: 157). analysis was performed at 70 eV with ThermoEinnigan TraceGC Ultra gas chromatograph coupled with Trace DSQ 0489. Insertion of the Gene Encoding mocherry Under mass spectrometer. Products were esterified with BF in Control of the Isocitrate Lyase Promoter methanol (10%, w/w) at 70° C. for 20 min, and further sily 0490. A construct for expressing mocherry (Shaner NC, lation of the methyl esters with HMDS/TMCS/Pyridine at Campbell RE, Steinbach PA, Giepmans B N, Palmer A E, 70° C. for 10 min when needed. The experiments were carried Tsien R Y. (2004) Nat Biotechnol. 22:1567-72) under the out with injector, ion source and interface temperature of 200° control of the C. tropicalis isocitrate lyase promoter (given as C., 250° C. and 280°C., respectively. Samples in hexane (1 SEQID NO: 62) was made by cloning the sequence of a gene ul) were injected in PTV split mode and run on a capillary encoding moherry (given as SEQID NO: 75) into a vector of column (Varian CP8944 VF-SMS, 0.25mmx0.25umx30 m). the form shown in FIG. 23 with the mGherry open reading The oven temperature was programmed at 120° C. for one frame in the position indicated by the element labeled “Gene minute increasing to 260° C. at the rate of 20°C/minute, and for expression'. The sequence of the complete vector is given then to 280° C. at the rate of 4.0° C.Aminute. as SEQID NO: 76. 0491. The vector was prepared as described in Section 8.1.2. LC-MS for Measurement of Fatty Acids, 7.1.1, except that the construct was linearized with BsiWI Omega-Hydroxy Fatty Acids and Diacids instead of BsmBI. Candida tropicalis strain DP186 (Table 3) 0495. The concentration of omega-hydroxy fatty acids was transformed with the construct or a no DNA control as and diacids during biotransformation was measured by liquid described in Section 7.1.2, except that 200, 400 or 600 ug/ml chromatography/mass spectrometry (LC/MS) with purified of Zeocin were used instead of 200 ug/ml nourseothricin as products as standards. The solvent delivery system was a the selective antibiotic. Following ~48 hours at 30° C. and an Waters Alliance 2795 Separation Module (Milford, Mass., additional 24 hours at room temperature, 10 large red colo USA) coupled with a Waters 2996 photodiode array detector nies were observed amongsta virtually confluent background and Waters ZQ detector with an electron spray ionization of small white colonies on YPD agar plates with 200 ug/ml mode. The separation was carried on a reversed-phase col Zeocin. Likewise, following 48 hours at 30° C. and an addi umn with a dimension of 150x4.6 mm and particle size of 5 tional 48 hours at room temperature, large red colonies were um. The mobile phase used for separation contained 10% observed on the 400 and 600 ug/ml Zeocin YPD agar plates HO, 5% acetonitrile, 5% Formic acid solution (1% in water) amongst a background of Smaller white colonies. No red and 80% methanol. colonies were observed on plates transformed with the no DNA control. A total of 8 large, redcolonies were isolated and 8.1.3. NMR for Characterization of Omega-Hydroxyfatty selected for further characterization (see FIG. 27). Genomic Acids and Diacids DNA was prepared from the isolates and tested for the pres ence of the integrated mOherry DNA at the isocitriate lyase 0496 Proton (H) and 'C-NMR spectra were recorded on promoter as described in Section 7.1.3. All 8 tested positive a Bruker DPX300 NMR spectrometer at 300 MHz. The for mCherry integration at the isocitrate lyase promoter dem chemical shifts (ppm) for H-NMR were referenced relative onstrating that expression of genes other than isocitrate lyase to tetramethylsilane (TMS, 0.00 ppm) as the internal refer can be driven in C. tropicalis using this promoter CCC. 0492. One of the eight isolates, Candida tropicalis strain DP197 (Table 3), was prepared by integration of the construct 8.2. Oxidation of Fatty Acids by Candida tropicalis shown as SEQ ID NO: 75 into the genome of strain DP186 Strains Lacking Four CYP52A P450S (Table 3) at the site of the genomic sequence of the gene for 0497 We compared the Candida tropicalis strain lacking isocitrate lyase. CYP52A13, CYP52A14, CYP52A17 and CYP52A18 US 2015/0094.483 A1 Apr. 2, 2015 67

(DP174) constructed in Section 7.2 with the starting strain droxy fatty acid Substrates by adding these Substrates to inde (DP1) for their abilities to oxidize fatty acids. To engineer pendent flasks at final concentrations of 5 g/l and the pH was P450s for optimal oxidation of fatty acids or other substrates adjusted to between 7.5 and 8 and shaking was continued at it is advantageous to eliminate the endogenous P450s whose 30° C. and 250 rpm. Samples were taken at the times indi activities may mask the activities of the enzymes being engi cated, cell culture was acidified to pH~1.0 bp addition of 6 N neered. We tested Candida tropicalis strains DP1 and DP174 HCl, products were extracted from the cell culture by diethyl (genotypes given in Table 3) to determine whether the dele ether and the concentrations of ()-hydroxy fatty acids and tion of the four CYP52 P450S had affected the ability of the C.co-diacids in the media were measured by LC-MS (liquid yeast to oxidize fatty acids. chromatography mass spectroscopy). The results are shown 0498 Cultures of the yeast strains were grown at 30° C. in Table 5. and 250 rpm for 16 hours in a 500 ml flask containing 30 ml of media F (media F is peptone 3 g/l. yeast extract 6 g/l. yeast TABLE 5 nitrogen base 6.7 g/l. Sodium acetate 3 g/l. KHPO, 7.2 g/l. KHPO 9.3 g/l) plus 30 g/l glucose. After 16 hours 0.5 ml of Oxidation of O-hydroxy fatty acids by Candida tropicalis culture was added to 4.5 ml fresh media Fplus 60 g/l glucose -HYDROXY DIACID DIACID in a 125 ml flask, and grown at 30° C. and 250 rpm for 12 FATTY ACID PRODUCED PRODUCED hours. Substrates were added and shaking was continued at SUBSTRATE REACTION BY DP1 BY DP174 30° C. and 250 rpm. We then tested the conversion of C14 CHAINLENGTH TIME (G/L) (G/L) fatty acid substrates as shown in FIG. 13. FIG. 13 parts A and C12 60 hours S.6 5.2 B show that the starting strain DP1 converts methyl myristate C16 60 hours 1.4 O.8 to co-hydroxy myristate and to the C14 diacid produced by C12 24 hours 5.4 5 C12 48 hours 6 6.7 oxidation of the co-hydroxy myristate over a 48 hour time C12 72 hours 6.2 6.5 course, while the quadruple P450 deletion strain DP174 can C16 24 hours 2.3 O.9 effect almost no detectable conversion. FIG. 13 parts C and D C16 48 hours 2.4 1.7 show that the starting strain DP1 converts methyl myristate C16 72 hours 2.8 1.8 and sodium myristate to co-hydroxy myristate and to the C14 diacid produced by oxidation of the co-hydroxy myristate 0502. These results show that at least one enzyme capable after 48 hours, while the quadruple P450 deletion strain of oxidizing ()-hydroxy fatty acids is present in Candida DP174 effects almost no detectable conversion of these sub tropicalis in addition to the cytochrome P450 genes encoding Strates. CYP52A13, CYP52A14, CYP52A17 and CYP52A18. 0499. These results confirm that at least one of the four Candida tropicalis cytochrome P450 genes encoding 8.4. Oxidation of C2-Hydroxy Fatty Acids by CYP52A13, CYP52A14, CYP52A17 and CYP52A18 is Candida tropicalis Strains Lacking Four CYP52A required for hydroxylation of fatty acids, consistent with the P450S and Four Fatty Alcohol Oxidases schematic representation of Candida tropicalis fatty acid metabolism pathways shown in FIG. 12. Further it shows that 0503 We compared the Candida tropicalis strain lacking strain DP174 is an appropriate strain to use for testing of CYP52A13, CYP52A14, CYP52A17, CYP52A18 and FAO1 engineered cytochrome P450s, since it has essentially no (DP186) constructed in Section 7.3 with the Candida tropi calis strain lacking CYP52A13, CYP52A14, CYP52A17, ability to oxidize fatty acids without an added P450. CYP52A18, FAO1, FAO1B, FAO2A and FAO2B (DP258 8.3. Oxidation of C2-Hydroxy Fatty Acids by and DP259) for their abilities to oxidize ()-hydroxy fatty Candida tropicalis Strains Lacking Four CYP52A acids. To engineer a strain for the production of ()-hydroxy fatty acids it is desirable to eliminate enzymes from the cell P450S that can oxidize ()-hydroxy fatty acids. It is possible to deter 0500 We compared the Candida tropicalis strain lacking mine whether other enzymes involved in oxidation of co-hy CYP52A13, CYP52A14, CYP52A17 and CYP52A18 droxy fatty acids are present in the strain by feeding it ()-hy (DP174) constructed in Section 7.2 with the starting strain droxy fatty acids in the media. If there are enzymes present (DP1) for their abilities to oxidize ()-hydroxy fatty acids. To that can oxidize co-hydroxy fatty acids, then the strain will engineer a strain for the production of ()-hydroxy fatty acids convert ()-hydroxy fatty acids fed in the media to C,c)-dicar it is desirable to eliminate enzymes from the cell that can boxylic acids. oxidize co-hydroxy fatty acids. It is possible to determine 0504 Cultures of the yeast strains were grown at 30° C. whether other enzymes involved in oxidation of co-hydroxy and 250 rpm for 16 hours in a 500 ml flask containing 30 ml fatty acids are present in the Strain by feeding it ()-hydroxy of media F (media F is peptone 3 g/l. yeast extract 6 g/l. yeast fatty acids in the media. If there are enzymes present that can nitrogen base 6.7 g/l. Sodium acetate 3 g/l. KHPO, 7.2 g/l. oxidize ()-hydroxy fatty acids, then the strain will convert KHPO 9.3 g/l) plus 20 g/l glycerol. After 16 hours 0.5 ml of ()-hydroxy fatty acids fed in the media to C.()-dicarboxylic culture was added to 4.5 ml fresh media Fplus 20 g/l glycerol acids. in a 125 ml flask, and grown at 30° C. and 250 rpm for 12 0501 Cultures of the yeast strains were grown at 30° C. hours. We then tested the conversion of C12 and C16 ()-hy and 250 rpm for 16 hours in a 500 ml flask containing 30 ml droxy fatty acid Substrates by adding these Substrates to inde of media F (media F is peptone 3 g/l. yeast extract 6 g/l. yeast pendent flasks at final concentrations of 5 g/l and the pH was nitrogen base 6.7 g/l. Sodium acetate 3 g/l. KHPO, 7.2 g/l. adjusted to between 7.5 and 8 and shaking was continued at KHPO 9.3 g/l) plus 20 g/l glycerol. After 16 hours 0.5 ml of 30° C. and 250 rpm. Samples were taken after 24 hours, cell culture was added to 4.5 ml fresh media Fplus 20 g/l glycerol culture was acidified to pH -1.0 by addition of 6 NHCl, in a 125 ml flask, and grown at 30° C. and 280 rpm for 12 products were extracted from the cell culture by diethyl ether hours. We then tested the conversion of C12 and C16 ()-hy and the concentrations of co-hydroxy fatty acids and C.()- US 2015/0094.483 A1 Apr. 2, 2015

diacids in the media were measured by LC-MS (liquid chro FAO2A, FAO2B, CYP52A12, CYP52A12B, ADH-A4, matography mass spectroscopy). As shown in FIG.15 most of ADH-A4B, ADH-B4, ADH-B4B and ADH-A10 (DP415) for the hydroxy fatty acids are converted to diacid after 24 hours. their abilities to oxidize ()-hydroxy fatty acids. To engineer a These results show that at least one enzyme capable of oxi strain for the production of ()-hydroxy fatty acids it is desir dizing ()-hydroxy fatty acids is present in Candida tropicalis able to eliminate enzymes from the cell that can oxidize in addition to the cytochrome P450 genes encoding ()-hydroxy fatty acids. It is possible to determine whether CYP52A13, CYP52A14, CYP52A17, CYP52A18, FAO1, other enzymes involved in oxidation of co-hydroxy fatty acids FAO1B, FAO2A and FAO2B. are present in the strain by feeding it w-hydroxy fatty acids in the media. If there are enzymes present that can oxidize 8.5. Oxidation of C2-Hydroxy Fatty Acids by ()-hydroxy fatty acids, then the strain will convert ()-hydroxy Candida tropicalis Strains Lacking Six CYP52A fatty acids fed in the media to am-dicarboxylic acids. P450S and Four Fatty Alcohol Oxidases 0508 Cultures of the yeast strains were grown at 30° C. and 250 rpm for 18 hours in a 500 ml flask containing 30 ml 0505 We compared the Candida tropicalis strain lacking of media F (media F is peptone 3 g/l. yeast extract 6 g/l. yeast CYP52A13, CYP52A14, CYP52A17, CYP52A18 and FAO1 nitrogen base 6.7 g/l. Sodium acetate 3 g/l. KHPO, 7.2 g/l. (DP186) constructed in Section 7.2 with the Candida tropi KHPO, 9.3 g/l) plus 20 g/l glycerol. After 18 hours the calis strain lacking CYP52A13, CYP52A14, CYP52A17, preculture was diluted in fresh media to A-1.0. This cul CYP52A18, FAO1, FAO1B, FAO2A, FAO2B, CYP52A12 ture was shaken until the Ao reached between 5.0 and 6.0. and CYP52A12B (DP283 and DP284) for their abilities to Biocatalytic conversion was initiated by adding 5 ml culture oxidize ()-hydroxy fatty acids. To engineer a strain for the to a 125 ml flask together with 50 mg of co-hydroxy lauric production of co-hydroxy fatty acids it is desirable to elimi acid, and pH adjusted to ~7.5 with 2MNaOH. Samples were nate enzymes from the cell that can oxidize ()-hydroxy fatty taken at the times indicated, cell culture was acidified to pH acids. It is possible to determine whether other enzymes ~1.0 by addition of 6 NHCl, products were extracted from the involved in oxidation of ()-hydroxy fatty acids are present in cell culture by diethyl ether and the concentrations of C.co the strain by feeding it co-hydroxy fatty acids in the media. If diacids in the media were measured by LC-MS (liquid chro there are enzymes present that can oxidize ()-hydroxy fatty matography mass spectroscopy). As shown in FIG. 19 Part A. acids, then the strain will convert ()-hydroxy fatty acids fed in the cell growth was almost identical for the 3 strains. Strain the media to C.O)-dicarboxylic acids. DP415 produced much less C,c)-dicarboxy laurate than the 0506 Cultures of the yeast strains were grown at 30° C. other two strains, however, as shown in FIG. 19 part B. and 250 rpm for 16 hours in a 500 ml flask containing 30 ml 0509. These results show that a significant reduction in the of media F (media F is peptone 3 g/l. yeast extract 6 g/l. yeast ability of Candida tropicalis to oxidize ()-hydroxy fatty acids nitrogen base 6.7 g/l. Sodium acetate 3 g/l. KHPO, 7.2 g/l. can be reduced by deleting genes encoding CYP52A13, KHPO 9.3 g/l) plus 20 g/l glycerol. After 16 hours 0.5 ml of CYP52A14, CYP52A17, CYP52A18, FAO1, FAO1B, culture was added to 4.5 ml fresh media Fplus 20 g/l glycerol FAO2A, FAO2B, CYP52A12, CYP52A12B, ADH-A4, in a 125 ml flask, and grown at 30° C. and 250 rpm for 12 ADH-A4B, ADH-B4, ADH-B4B and ADH-A10. hours. We then tested the conversion of C12 and C16 ()-hy droxy fatty acid substrates by adding these Substrates to inde 8.7. Oxidation of C2-Hydroxy Fatty Acids by pendent flasks at final concentrations of 5 g/l and the pH was Candida tropicalis Strains Lacking Six CYP52A adjusted to between 7.5 and 8 and shaking was continued at P450S, Four Fatty Alcohol Oxidases and Eight 30° C. and 250 rpm. Samples were taken after 24 hours, cell Alcohol Dehydrogenases culture was acidified to pH -1.0 by addition of 6 NHCl, products were extracted from the cell culture by diethyl ether 0510 We compared the Candida tropicalis strain DP1 and the concentrations of co-hydroxy fatty acids and C.()- with the Candida tropicalis strain lacking CYP52A13, diacids in the media were measured by LC-MS (liquid chro CYP52A14, CYP52A17, CYP52A18, FAO1, FAO1B, matography mass spectroscopy). As shown in FIG.16 most of FAO2A, FAO2B, CYP52A12, CYP52A12B, ADH-A4 and the C12 hydroxy fatty acids and a substantial fraction of the ADH-A4B (DP390), the Candida tropicalis strain lacking C16 hydroxy fatty acids are converted to diacid after 24 CYP52A13, CYP52A14, CYP52A17, CYP52A18, FAO1, hours. These results show that at least one enzyme capable of FAO1B, FAO2A, FAO2B, CYP52A12, CYP52A12B, ADH oxidizing ()-hydroxy fatty acids is present in Candida tropi A4, ADH-A4B, ADH-B4, ADH-B4B and ADH-A10 calis in addition to the cytochrome P450 genes encoding (DP415), the Candida tropicalis strain lacking CYP52A13, CYP52A13, CYP52A14, CYP52A17, CYP52A18, CYP52A14, CYP52A17, CYP52A18, FAO1, FAO1B, CYP52A12, CYP52A12B, FAO1, FAO1B, FAO2A and FAO2A, FAO2B, CYP52A12, CYP52A12B, ADH-A4, FAO2B. ADH-A4B, ADH-B4, ADH-B4B, ADH-A10 and ADH-B11 (DP417 and DP421), the Candida tropicalis strain lacking 8.6. Oxidation of C2-Hydroxy Fatty Acids by CYP52A13, CYP52A14, CYP52A17, CYP52A18, FAO1, Candida tropicalis Strains Lacking Six CYP52A FAO1B, FAO2A, FAO2B, CYP52A12, CYP52A12B, ADH P450S, Four Fatty Alcohol Oxidases and Five A4, ADH-A4B, ADH-B4, ADH-B4B, ADH-A10, ADH Alcohol Dehydrogenases A1 OB and ADH-B11 (DP423), the Candida tropicalis strain lacking CYP52A13, CYP52A14, CYP52A17, CYP52A18, 0507 We compared the Candida tropicalis strain DP1 FAO1, FAO1B, FAO2A, FAO2B, CYP52A12, CYP52A12B, with the Candida tropicalis strain lacking CYP52A13, ADH-A4, ADH-A4B, ADH-B4, ADH-B4B, ADH-A10, CYP52A14, CYP52A17, CYP52A18, FAO1, FAO1B, ADH-A10B, ADH-B11 and ADH-B11B (DP434 and FAO2A, FAO2B, CYP52A12 and CYP52A12B (DP283) and DP436) for their abilities to oxidize co-hydroxy fatty acids. To the Candida tropicalis strain lacking CYP52A13, engineer a strain for the production of ()-hydroxy fatty acids CYP52A14, CYP52A17, CYP52A18, FAO1, FAO1B, it is desirable to eliminate enzymes from the cell that can US 2015/0094.483 A1 Apr. 2, 2015 69 oxidize co-hydroxy fatty acids. It is possible to determine diacid) and negligible levels of ()-hydroxy myristic acid. In whether other enzymes involved in oxidation of co-hydroxy contrast, under these conditions strain DP428 produces fatty acids are present in the Strain by feeding it ()-hydroxy approximately five-fold less tetradecanedioic acid, while fatty acids in the media. If there are enzymes present that can converting nearly 70% of the methyl myristate to co-hydroxy oxidize ()-hydroxy fatty acids, then the strain will convert myristic acid after 60 hours. This shows that elimination of ()-hydroxy fatty acids fed in the media to C.()-dicarboxylic one or more of the genes FAO1B, FAO2A, FAO2B, acids. CYP52A12, CYP52A12B, ADH-A4, ADH-A4B, ADH-B4, 0511 Cultures of the yeast strains were grown at 30° C. ADH-B4B, ADH-A10 and ADH-B11 prevents the over-oxi and 250 rpm for 18 hours in a 500 ml flask containing 30 ml dation of the fatty acid myristic acid by Candida tropicalis, of media F (media F is peptone 3 g/l. yeast extract 6 g/l. yeast and that the presence of CYP52A17 under control of the nitrogen base 6.7 g/l. Sodium acetate 3 g/l. KHPO, 7.2 g/l. isocitrate lyase promoter in this strain background produces a KHPO 9.3 g/l) plus 20 g/l glycerol. After 18 hours the strain that can convert methyl myristate to ()-hydroxy myris preculture was diluted in fresh media to A-1.0. This cul tic acid, but that does not over-oxidize the product to tetrade ture was shaken until the Asoo reached between 5.0 and 6.0. canedioic acid. Biocatalytic conversion was initiated by adding 5 ml culture to a 125 ml flask together with 50 mg of co-hydroxy lauric 8.9. Oxidation of Methyl Myristate by an Engineered acid, and pH adjusted to ~7.5 with 2M NaOH. Samples were Candida tropicalis Strain in a Fermentor taken at the times indicated, cell culture was acidified to pH ~1.0 by addition of 6 NHCl, products were extracted from the 0515 We compared the production of co-hydroxy myristic cell culture by diethyl ether and the concentrations of C.co acid and C. ()-tetradecanoic acid by a Candida tropicalis diacids in the media were measured by LC-MS (liquid chro strain lacking CYP52A13, CYP52A14, CYP52A17, matography mass spectroscopy). As shown in FIG. 20, a CYP52A18, FAO1, FAO1B, FAO2A, FAO2B, CYP52A12, significant reduction in the ability of Candida tropicalis to CYP52A12B, ADH-A4, ADH-A4B, ADH-B4, ADH-B4B, oxidize ()-hydroxy fatty acids can be obtained by deleting ADH-A10 and ADH-B11 and with CYP52A17 added back genes encoding alcohol dehydrogenases in strains lacking under control of the isocitrate lyase promoter (DP428). some cytochrome P450s and fatty alcohol oxidases. 0516 C. tropicalis DP428 was taken from a glycerol stock or fresh agar plate and inoculated into 500 ml shake flask 8.8. Oxidation of Methyl Myristate by Candida containing 30 mL of YPD medium (20 g/l glucose, 20 g/1 tropicalis Strains Lacking Six CYP52A P450S, Four peptone and 10 g/l yeast extract) and shaken at 30° C., 250 Fatty Alcohol Oxidases and Six Alcohol rpm for 20 h. Cells were collected by centrifugation and Dehydrogenases with a Single CYP52A P450 Added re-suspended in FM3 medium for inoculation. (FM3 medium Back Under Control of the ICL Promoter is 30 g/l glucose, 7 g/l ammonium sulfate, 5.1 g/l potassium phosphate, monobasic, 0.5 g/l magnesium sulfate, 0.1 g/1 0512 We compared the Candida tropicalis strain DP1 calcium chloride, 0.06 g/l citric acid, 0.023 g/1 ferric chlo with the Candida tropicalis strain lacking CYP52A13, ride, 0.0002 g/l biotin and 1 ml/l of a trace elements solution. CYP52A14, CYP52A17, CYP52A18 and FAO1 and with The trace elements solution contains 0.9 g/l boric acid, 0.07 CYP52A17 added back under control of the isocitrate lyase g/l cupric sulfate, 0.18 g/l potassium iodide, 0.36 g/l ferric promoter (DP201) and with the Candida tropicalis strain chloride, 0.72 g/l manganese sulfate, 0.36 g/l sodium molyb lacking CYP52A13, CYP52A14, CYP52A17, CYP52A18, date, 0.72 g/l zinc sulfate.) Conversion was performed by FAO1, FAO1B, FAO2A, FAO2B, CYP52A12, CYP52A12B, inoculating 15 ml of preculture into 135 ml FM3 medium, ADH-A4, ADH-A4B, ADH-B4, ADH-B4B, ADH-A10 and methyl myristate was added to 20 g/l and the temperature was ADH-B11 and with CYP52A17 added back under control of kept at 30° C. The pH was maintained at 6.0 by automatic the isocitrate lyase promoter (DP428) for their abilities to addition of 6 M NaOH or 2 M HSO solution. Dissolved oxidize methyl myristate. oxygen was kept at 70% by agitation and O-cascade control 0513 Cultures of the yeast strains were grown at 30° C. mode. After 6 hours growth, ethanol was fed into the cell and 250 rpm for 18 hours in a 500 ml flask containing 30 ml culture to 5 g/l. During the conversion phase, 80% glycerol of media F (media F is peptone 3 g/l. yeast extract 6 g/l. yeast was fed as co-substrate by dissolved oxygen-stat control nitrogen base 6.7 g/l. Sodium acetate 3 g/l. KHPO, 7.2 g/l. mode (the high limit of dissolved oxygen was 75% and low KHPO, 9.3 g/l) plus 20 g/l glucose plus 5 g/l ethanol. After limit of dissolved oxygen was 70%, which means glycerol 18 hours 3 ml of preculture was added to 27 ml fresh media F feeding was initiated when dissolved oxygen is higher than plus 20 g/l glucose plus 5 g/l ethanol in a 500 ml flask, and 75% and stopped when dissolved oxygen was lower than grown at 30° C. and 250 rpm for 20 hours before addition of 70%). Every 12 hours, ethanol was added into cell culture to substrate. Biocatalytic conversion was initiated by adding 40 2 g/l, and methyl myristate was added to 40 g/l until the total g/l of methyl myristate, the pH was adjusted to ~7.8 with 2M methyl myristate added was 140 g/l (i.e. the initial 20 g/l plus NaOH. The culture was pH controlled by adding 2 mol/l 3 Subsequent 40 g/l additions). Formation of products was NaOH every 12 hours, glycerol was fed as cosubstrate by measured at the indicated intervals by taking samples and adding 500 g/l glycerol and ethanol was fed as a inducer by acidifying to pH -1.0 by addition of 6 NHCl; products were adding 50% ethanol every 12 hours. Samples were taken at extracted from the cell culture by diethyl ether and the con the times indicated, cell culture was acidified to pH ~1.0 by centrations of co-hydroxy myristate and C.()-dicarboxy addition of 6 NHCl, products were extracted from the cell myristate were measured by LC-MS (liquid chromatography culture by diethyl ether and the concentrations of co-hydroxy mass spectroscopy), as shown in FIG. 26. Under these con myristate and C.(I)-dicarboxymyristate were measured by ditions the strain produced a final concentration of 91.5 g/1 LC-MS (liquid chromatography mass spectroscopy). ()-hydroxy myristic acid, with a productivity of 1.63 g/1/hr 0514. As shown in FIG. 24, strains DP1 and DP201 both and a w/w ratio of ()-hydroxy myristic acid:tetradecanedioic produce significant levels of tetradecanedioic acid (the O.co acid of 20.3:1. This shows that elimination of one or more of US 2015/0094.483 A1 Apr. 2, 2015 70 the genes FAO1B, FAO2A, FAO2B, CYP52A12, equivalents to the specific embodiments of the invention CYP52A12B, ADH-A4, ADH-A4B, ADH-B4, ADH-B4B, described herein. Such equivalents are intended to be encom ADH-A10 and ADH-B11 prevents the over-oxidation of the passed by the following claims. fatty acid myristic acid by Candida tropicalis, and that the 0521. All publications, patents, patent applications, and presence of CYP52A17 under control of the isocitrate lyase databases mentioned in this specification are herein incorpo promoter in this strain background produces a strain that can rated by reference into the specification to the same extent as convert methyl myristate to ()-hydroxy myristic acid, but that if each individual publication, patent, patent application or does not over-oxidize the product to tetradecanedioic acid. database was specifically and individually indicated to be incorporated herein by reference. 8.10. Oxidation of Methyl Myristate, Oleic Acid and Linoleic Acid by Engineered Candida tropicalis 11. EXEMPLARY EMBODIMENTS Strains 0522 The following are nonlimiting exemplary embodi 0517 We compared the fatty acid oxidizing activities of ments in accordance with the disclosed application: two Candida tropicalis strains which lack CYP52A13, Embodiment 1. A substantially pure Candida host cell for the CYP52A14, CYP52A17, CYP52A18, FAO1, FAO1B, biotransformation of a substrate to a product, wherein the FAO2A, FAO2B, CYP52A12, CYP52A12B, ADH-A4, Candida host cell is characterized by a first genetic modifi ADH-A4B, ADH-B4, ADH-B4B, ADH-A10 and ADH-B11, cation class that comprises one or more genetic modifications one of which has CYP52A17 added back under control of the that collectively or individually disrupt an alcohol dehydro isocitrate lyase promoter (DP428) and one of which has genase gene in the Substantially pure Candida host cell. CYP52A13 added back under control of the isocitrate lyase Embodiment 2. The substantially pure Candida host cell of promoter (DP522). embodiment 1, wherein the substantially pure Candida host 0518 Cultures of the yeast strains were grown at 30°C. in cell is genetically modified Candida glabrata, Candida zey a DASGIP parallel fermentor containing 200 ml of media F lenoides, Candida lipolytica, Candida guillermondii, Can (media F is peptone 3 g/l. yeast extract 6 g/l. yeast nitrogen dida aaseri, Candida abiesophila, Candida africana, Can base 6.7 g/l, sodium acetate 3 g/l. KHPO, 7.2 g/l. KHPO, dida aglyptinia, Candida agrestis, Candida akabanensis, 9.3 g/1) plus 30 g/l glucose. The pH was maintained at 6.0 by Candida alai, Candida albicans, Candida alimentaria, Can automatic addition of 6 M NaOH or 2 MHSO solution. dida amapae, Candida ambrosiae, Candida amphixiae, Can Dissolved oxygen was kept at 70% by agitation and O-cas dida anatomiae, Candida ancudensis, Candida anglica, Can cade control mode. After 6 hour growth, ethanol was fed into dida anneliseae, Candida antarctica, Candida antillancae, the cell culture to 5 g/l. After 12 h growth, biocatalytic con Candida anutae, Candida apicola, Candida apis, Candida version was initiated by adding methyl myristate acid to 60 g/1 arabinofermentans, Candida arcana, Candida ascalaphi or oleic acid to 60 g/l or linoleic acid to 30 g/l. During the darum, Candida asparagi, Candida atakaporum, Candida conversion phase, 80% glycerol was fed as co-substrate for atbi, Candida athensensis, Candida atlantica, Candida conversion of methyl myristate and 500 g/l glucose was fed as atmosphaerica, Candida auringiensis, Candida auris, Can co-substrate for conversion of oleic acid and linoleic acid by dida aurita, Candida austromarina, Candida azyma, Can dissolved oxygen-stat control mode (the high limit of dis dida azymoides, Candida barrocoloradensis, Candida batis solved oxygen was 75% and low limit of dissolved oxygen tae, Candida beechii, Candida bentonensis, Candida bertae, was 70%, which means glycerol feeding was initiated when Candida berthetii, Candida bituminphila, Candida blankii, dissolved oxygen is higher than 75% and stopped when dis Candida blattae, Candida blattariae, Candida bOhiensis, solved oxygen was lower than 70%). Every 12 hour, ethanol Candida boidinii, Candida bokatorum, Candida boleticola, was added into cell culture to 2 g/l. Samples were taken at Candida bolitotheri, Candida bombi, Candida bombiphila, various times, cell culture was acidified to pH~1.0 by addi Candida bondarzewiae, Candida bracarensis, Candida bri tion of 6 NHCl, products were extracted from the cell culture brorum, Candida bromeliacearum, Candida buenavistaen by diethyl ether and the concentrations of co-hydroxy fatty sis, Candida buinensis, Candida butyri, Candida Californica, acids and C.O)-diacids in the media were measured by LC-MS Candida Canberraensis, Candida cariosilignicola, Candida (liquid chromatography mass spectroscopy). As shown in carpophila, Candida carvicola, Candida caseinolytica, Can FIG. 25, strains DP428 and DP522 were both able to produce dida castrensis, Candida catenulata, Candida cellae, Can ()-hydroxy fatty acids from these Substrates, as well as some dida cellulolytica, Candida cerambycidarum, Candida C.()-diacids. FIG. 25 also shows that the different P450s had chauliodes, Candida chickasaworum, Candida chilensis, different preferences for the fatty acid substrates, and differ Candida choctaworum, Candida Chodatii, Candida chry ent propensities to oxidize the ()-hydroxy group. somelidarum, Candida cidri, Candida cloacae, Candida coi pomoensis, Candida conglobata, Candida Corydali, Candida 9. DEPOSIT OF MICROORGANISMS cylindracea, Candida davenportii, Candida davisiana, Can 0519 Aliving cultures of strain DP421 has been deposited dida deformans, Candida dendrica, Candida dendronema, with American Type Culture Collection, 12301 Parklawn Candida derodonti, Candida diddensiae, Candida digboien Drive, Rockville, Md. 20852, on May 4, 2009, under the sis, Candida diospyri, Candida diversa, Candida dosseyi, Budapest Treaty on the International Recognition of the Candida drimydis, Candida drosophilae, Candida dublinien Deposit of Microorganisms for the purposes of patent proce sis, Candida easanensis, Candida edaphicus, Candidaedax, Candida elateridarum, Candida emberorum, Candida endo dure. mychidarum, Candida entomophila, Candida ergastensis, Candida ernobii, Candida etchellsii, Candida ethanolica, 10. EQUIVALENTS Candida famata, Candida fennica, Candida fermenticarens, 0520. Those skilled in the art will recognize, or be able to Candida flocculosa, Candida floricola, Candida floris, Can ascertain using no more than routine experimentation, many dida fiosculorum, Candida fluviatilis, Candida fragi, Can US 2015/0094.483 A1 Apr. 2, 2015 dida freyschussii, Candida friedrichii, Candida frijolesensis, Sivorans, Candida Sorboxylosa, Candida Spandovensis, Can Candida fructus, Candida fukazawae, Candida fungicola, dida Steatolytica, Candida Stellata, Candida Stellimalicola, Candida galacta, Candida galis, Candida galli, Candida Candida stri, Candida Subhashii, Candida succiphila, Can gatunensis, Candida gelsemii, Candida geochares, Candida dida Suecica, Candida Suzuki, Candida takamatsuzukensis, germanica, Candidaghanaensis, Candida gigantensis, Can Candida taliae, Candida tammamiensis, Candida tanzawaen dida glaebosa, Candida glucosophila, Candida glycerino sis, Candida tartarivorans, Candida temnochilae, Candida genes, Candida gorgasii, Candida gotoi, Candida gropeng tenuis, Candidatepae, Candida terraborum, Candida tetrigi iesseri, Candidaguaymorum, Candida haemulonii, Candida darum, Candida thaimueangensis, Candida thermophila, halonitratophila, Candida halophila, Candida hasegawae, Candida tilneyi, Candida tolerans, Candida torresi, Can dida tritonae, Candida tropicalis, Candida trypodendroni, Candida hawaiiana, Candida heliconiae, Candida hispan Candida tsuchiyae, Candida tumulicola, Candida ubatuben iensis, Candida homilentoma, Candida humicola, Candida sis, Candida ulmi, Candida vaccinii, Candida valdiviana, humilis, Candida hungarica, Candida hyderabadensis, Can Candida vanderkliftii, Candida vanderwaltii, Candida dida incommunis, Candida inconspicua, Candida insectal vartiovaarae, Candida versatilis, Candida vini, Candida ens, Candida insectamans, Candida insectorum, Candida viswanathi, Candida wickerhamii, Candida wounanorum, intermedia, Candida ipomoeae, Candida ishiwadae, Can Candida wyomingensis, Candida xylopsoci, Candida yucho dida jaroonii, Candida jeffriesii, Candida kanchanaburien rum, Candida Zemplinina, or Candida zeylanoides. sis, Candida karawaiiewi, Candida kashinagacola, Candida kazuoi, Candida khmerensis, Candida kipukae, Candida Embodiment 3. The substantially pure Candida host cell of kofuensis, Candida krabiensis, Candida kruisii, Candida embodiment 2, wherein the substantially pure Candida host kunorum, Candida labiduridarum, Candida lactis-condensi, cell is genetically modified Candida tropicalis. Candida lassenensis, Candida laureliae, Candida leandrae, Embodiment 4. The substantially pure Candida host cell of Candida lessepsii, Candida lignicola, Candida litsaeae, embodiment 3, wherein the substantially pure Candida host Candida lit.seae, Candida lanquihuensis, Candida lycoper cell is selected from the group consisting of DP428, DP522 dinae, Candida lyxosophila, Candida magnifica, Candida and DP 527. magnoliae, Candida maltosa, Candida mannitofaciens, Can Embodiment 5. The substantially pure Candida host cell of dida marls, Candida maritima, Candida maxii, Candida embodiment 1, wherein the substantially pure Candida host melibiosica, Candida membranifaciens, Candida mesen cell is genetically modified Candida tropicalis and wherein terica, Candida metapsilosis, Candida methanolophaga, the alcohol dehydrogenase gene is selected from the group Candida methanolovescens, Candida methanosorbosa, Can consisting of ADH-A4, ADH-A4B, ADH-B4, ADH-B4B, dida methylica, Candida michaelii, Candida mogii, Candida ADH-A10, ADH-A1OB, ADH-B11, and ADH-B11B. montana, Candida multigennis, Candida mycetangii, Can Embodiment 6. The substantially pure Candida host cell of dida naeodendra, Candida nakhonratchasimensis, Candida embodiment 1, wherein the alcohol dehydrogenase gene nanaspora, Candida natalensis, Candida neerlandica, Can comprises a nucleic acid sequence that binds under condi dida memodendra, Candida nitrativorans, Candida nitrato tions of high stringency to SEQID NO:39, SEQID NO:40, phila, Candida nivariensis, Candida nodaensis, Candida SEQID NO:42, SEQID NO:43, or SEQID NO: 56. norvegica, Candida novaki, Candida Odintsovae, Candida Embodiment 7. The substantially pure Candida host cell of Oleophila, Candida Ontarioensis, Candida Ooitensis, Can embodiment 1, wherein the alcohol dehydrogenase gene dida Orba, Candida Oregonensis, Candida Orthopsilosis, comprises a nucleic acid sequence that binds under condi Candida Ortonii, Candida Ovalis, Candidapallodes, Candida tions of moderate stringency to SEQID NO:39, SEQID NO: palmioleophila, Candida paludigena, Candida panamensis, 40, SEQID NO: 42, SEQID NO:43, or SEQID NO: 56. Candida panamericana, Candida parapsilosis, Candida pararugosa, Candida pattaniensis, Candida peltata, Can Embodiment 8. The substantially pure Candida host cell of dida peoriaensis, Candida petrohuensis, Candida phangn embodiment 1, wherein the alcohol dehydrogenase gene gensis, Candidapicachoensis, Candidapiceae, Candida pic comprises a nucleic acid sequence that binds under condi inguabensis, Candida pignaliae, Candida pimensis, Candida tions of low stringency to SEQID NO:39, SEQID NO: 40, pini, Candida plutei, Candida pomicola, Candida pondero SEQID NO:42, SEQID NO:43, or SEQID NO: 56. sae, Candida populi, Candida powellii, Candida prunicola, Embodiment 9. The substantially pure Candida host cell of Candida pseudogliaebosa, Candida pseudohaemulonii, Can embodiment 1, wherein the alcohol dehydrogenase gene dida pseudointermedia, Candida pseudolambica, Candida encodes an amino acid sequence that has at least 90 percent pseudorhagii, Candida pseudovanderkliftii, Candida psy sequence identity to a stretch of at least 100 contiguous resi chrophila, Candida pyralidae, Candida qinlingensis, Can dues of any one of SEQID NO: 151, SEQID NO: 152, SEQ dida quercitrusa, Candida quercuum, Candida railenensis, ID NO: 153, SEQID NO: 154, or SEQID NO:155. Candida ralunensis, Candida rancensis, Candida restingae, Embodiment 10. The substantially pure Candida host cell of Candida rhagii, Candida riodocensis, Candida rugopellicu embodiment 9, wherein the alcohol dehydrogenase gene losa, Candida rugosa, Candida Sagamina, Candida sai comprises a nucleic acid sequence that binds under condi toana, Candida sake, Candida Salmanticensis, Candida San tions of high Stringency to a first sequence selected from the tamariae, Candida Santjacobensis, Candida Saopaulonensis, group consisting of SEQID NO:39, SEQID NO:40, SEQID Candida Savonica, Candida Schatavi, Candida Sequanensis, NO: 42, SEQID NO:43, and SEQID NO: 56. Candida Sergipensis, Candida Shehatae, Candida Silvae, Embodiment 11. The substantially pure Candida host cell of Candida Silvanorum, Candida Silvatica, Candida Silvicola, embodiment 9, wherein the alcohol dehydrogenase gene Candida Silvicultrix, Candida Sinolaborantium, Candida comprises a nucleic acid sequence that binds under condi sithepensis, Candida Smithsonii, Candida sojae, Candida tions of moderate stringency to a first sequence selected from Solani, Candida songkhlaensis, Candida Sonorensis, Can the group consisting of SEQIDNO:39, SEQID NO:40, SEQ dida Sophiae-reginae, Candida Sorbophila, Candida Sorbo ID NO: 42, SEQID NO:43, and SEQID NO: 56.