
(12) United States Patent
Punt et al.

(10) Patent No.: US 9,290,772 B2
(45) Date of Patent: Mar. 22, 2016

(54) PRODUCTION OF ITACONIC ACID

(71) Applicant: NEDERLANDSE ORGANISATIE VOOR TOEGEPAST NATUURWETENSCHAPPELIJK ONDERZOEK TNO, Delft (NL)

(72) Inventors: Peter Jan Punt, Houten (NL); Maria Johanna Van Der Werf, Tuil (NL)

(73) Assignee: DUTCH DNA BIOTECH B.V., Zeist (NL)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by [illegible] days.

(21) Appl. No.: 14/137,785

(22) Filed: Dec. 20, 2013

(65) Prior Publication Data
US 2014/0193885 A1, Jul. 10, 2014

Related U.S. Application Data

(62) Division of application No. 12/918,314, filed as application No. PCT/NL2009/050069 on Feb. 16, 2009, now Pat. No. 8,679,801.

(30) Foreign Application Priority Data
Feb. 18, 2008 (EP) ...... 08151584

(51) Int. Cl.
C12N 1/20 (2006.01)
C12P 7/44 (2006.01)
C12P 7/46 (2006.01)
C07H 21/04 (2006.01)
C07K 14/00 (2006.01)
C12N 15/80 (2006.01)
C07K 14/38 (2006.01)

(52) U.S. Cl.
CPC ...... C12N 15/80 (2013.01); C07K 14/38 (2013.01); C12P 7/44 (2013.01)

(58) Field of Classification Search
None
See application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS
6,943,017 B2 9/2005 Hutchinson et al.
2004/0033570 A1 2/2004 Hutchinson et al.

OTHER PUBLICATIONS
Bonnarme et al., J. Bacteriol. (1995) 177(12):3573-3578.
Database UniProt [Online], Accession No. Q0C8L4, Oct. 17, 2006.
Dwiarti et al., J. Biosci. Bioeng. (2002) 94(1):29-33.
International Search Report for PCT/NL2009/050069, mailed on Jul. 6, 2009, 3 pages.
Jaklitsch et al., Journal of General Microbiology (1991) 137(3):533-540.
Kaplan et al., Journal of Biological Chemistry (1995) 270(8):4108-4114.
Pel et al., "Genome sequencing and analysis of the versatile cell factory Aspergillus niger CBS 513.88," Nature Biotechnology (2007) 25:221-231, Epub Jan. 28, 2007.
Picault et al., Journal of Biological Chemistry (2002) 277(27):24204-24211.

* cited by examiner

Primary Examiner — Iqbal H Chowdhury
(74) Attorney, Agent, or Firm — Morrison & Foerster LLP

(57) ABSTRACT

The invention relates to a nucleic acid sequence encoding an Aspergillus mitochondrial tricarboxylic acid transporter that can be used in the production of itaconic acid in micro-organisms. Preferably said transporter protein is the protein encoded by the nucleic acid which is located on a chromosome segment of A. terreus that also harbors other genes involved in the itaconic acid biosynthesis and the lovastatin biosynthesis. Also, vectors, hosts and transformed micro-organisms are part of the invention.

13 Claims, 5 Drawing Sheets

U.S. Patent Mar. 22, 2016 Sheet 1 of 5 US 9,290,772 B2


PRODUCTION OF ITACONIC ACID

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a divisional of copending U.S. Ser. No. 12/918,314 having an international filing date of 16 Feb. 2009, which is the national phase of PCT application PCT/NL2009/050069 having an international filing date of 16 Feb. 2009, which claims benefit of European patent application No. 08151584.3 filed 18 Feb. 2008. The contents of the above patent applications are incorporated by reference herein in their entirety.

SUBMISSION OF SEQUENCE LISTING ON ASCII TEXT FILE

The content of the following submission on ASCII text file is incorporated herein by reference in its entirety: a computer readable form (CRF) of the Sequence Listing (file name: 313632009910SeqList.txt, date recorded: Dec. 20, 2013, size: 34,658 bytes).

BACKGROUND OF THE INVENTION

The invention relates to the field of microbial production, more specifically production of itaconic acid (itaconate), more specifically production of itaconate in micro-organisms.

Production and metabolism of itaconic acid in microbial cells has been studied extensively for several decades (Calam, C. T. et al., 1939, Thom. J. Biochem., 33:1488-1495; Bentley, R. and Thiessen, C. P., 1956, J. Biol. Chem. 226:673-720; Cooper, R. A. and Kornberg, H. L., 1964, Biochem. J., 91:82-91; Bonnarme, P. et al., 1995, J. Bacteriol. 177:3573-3578; Dwiarti, L. et al., 2002, J. Biosci. Bioeng. 94(1):29-33), but the metabolic pathway for itaconic acid has not been unequivocally established (Wilke, Th. and Vorlop, K.-D., 2001, Appl. Microbiol. Biotechnol. 56:289-295; Bonnarme, P. et al., 1995, J. Bacteriol. 177:3573-3578). A complicating factor in this respect is that aconitase, the enzyme that interconverts citric acid into cis-aconitate, and vice versa, and other enzymes in the metabolic pathway have been found to be present in many isoforms in microbial cells.

Production of itaconic acid is now commercially achieved in Aspergillus terreus, which has physiological similarity to A. niger and A. oryzae. However, these latter two accumulate citric acid, due to the absence of cis-aconitic acid decarboxylase (CAD) activity. Substrates used by these fungi include mono- and disaccharides, such as glucose, sucrose and fructose, and starches, as they exist in forms that are degradable by the micro-organism, and molasses. Recently, it has been discovered that also glycerol is a useful substrate in itaconic acid production by A. terreus (U.S. Pat. No. 5,637,485).

The general scheme currently envisioned for itaconic acid biosynthesis is given in FIG. 1, wherein clearly the existence of the biosynthetic route both in the cytosol and the mitochondria is depicted and the connection between these two compartments. At several points of this scheme possibilities exist to try to improve the existing commercial production of itaconic acid in micro-organisms.

SUMMARY OF THE INVENTION

The invention comprises a nucleic acid sequence encoding an Aspergillus mitochondrial tricarboxylic acid transporter, preferably wherein said nucleic acid sequence comprises the Aspergillus terreus nucleic acid sequence ATEG_09970.1, or functional homologues thereof having a sequence identity of at least 55%, preferably 60%, more preferably 70%. A further embodiment of the invention is a mitochondrial tricarboxylic acid transporter encoded by such a nucleic acid sequence.

Also comprised in the invention is a method for the improved production of itaconic acid, through an increased activity of a protein capable of transporting di/tricarboxylate from the mitochondrion to the cytosol, in a suitable host cell. Preferably said gene encodes a protein that transports tricarboxylate. More preferably the gene encodes a protein that transports cis-aconitate, citrate, and/or isocitrate. Preferably the said gene is derived from Aspergillus sp. such as Aspergillus terreus, Aspergillus niger, Aspergillus nidulans, Aspergillus oryzae or Aspergillus fumigatus. Preferably, said gene is Aspergillus terreus ATEG_09970.1.

According to a further preferred embodiment, the said genes are expressed in a suitable vector, under control of their own or other promoters.

Also comprised in the invention is a method as described above, wherein the transported citrate or isocitrate are further catabolised to cis-aconitate by overexpression of the gene encoding the enzyme(s) catalysing this reaction. Moreover, the invention also comprises a method as described above, wherein the transported or produced cis-aconitate is catabolised to itaconic acid by overexpression of the gene coding for the enzyme CAD (see EP07112895).

Another embodiment of the present invention is formed by a host cell wherein a gene coding for a protein capable of transporting di/tricarboxylate from the mitochondrion to the cytosol is introduced. Preferably the said gene encodes the above mentioned proteins, and more preferably said gene is Aspergillus terreus ATEG_09970.1. A suitable host cell preferably is a host cell selected from filamentous fungi, yeasts and bacteria, more preferably from Escherichia coli, Aspergillus sp. (such as Aspergillus niger or Aspergillus terreus), citrate-producing hosts or lovastatin producing hosts. The invention further comprises a host cell as described above, wherein the transported or produced cis-aconitate is catabolised to itaconic acid by overexpression of the gene encoding the enzyme CAD.

Further, the invention pertains to the use of the protein(s) transporting di/tricarboxylate for the production of itaconic acid in a suitable host cell. Also comprised in the invention is the use of the protein(s) transporting di/tricarboxylate combined with the CAD enzyme, for the production of itaconic acid in a suitable host cell.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1: Postulated biosynthesis route(s) for itaconic acid in A. terreus. 1, citrate synthase; 2, aconitase; 3, cis-aconitic acid decarboxylase (itaconate-forming); 4, cis-aconitic acid decarboxylase (citraconate-forming); 5, citraconate isomerase; 6, mitochondrial dicarboxylate-tricarboxylate antiporter; 7, mitochondrial tricarboxylate transporter; 8, dicarboxylate transporter; 9, 2-methylcitrate dehydratase.

FIG. 2: Overview of the Aspergillus terreus genome segment with the cluster of genes involved in production of itaconic acid and lovastatin ranging from ATEG_09961.1 to ATEG_09975.1. The cluster contains the cis-aconitate decarboxylase (ATEG_09971.1) and the mitochondrial tricarboxylate transporter (ATEG_09970.1).

FIG. 3A-C: Sequence of the Aspergillus terreus mitochondrial tricarboxylic acid transporter: FIG. 3A, genomic sequence (SEQ ID NO:1); FIG. 3B, cDNA (SEQ ID NO:2); FIG. 3C, protein sequence (SEQ ID NO:3).

DETAILED DESCRIPTION OF THE INVENTION

"Fungi" are herein defined as eukaryotic micro-organisms and include all species of the subdivision Eumycotina (Alexopoulos, C. J., 1962, In: Introductory Mycology, John Wiley & Sons, Inc., New York). The term fungus thus includes both filamentous fungi and yeast. "Filamentous fungi" are herein defined as eukaryotic micro-organisms that include all filamentous forms of the subdivision Eumycotina. These fungi are characterized by a vegetative mycelium composed of chitin, cellulose, and other complex polysaccharides. The filamentous fungi used in the present invention are morphologically, physiologically, and genetically distinct from yeasts. Vegetative growth by filamentous fungi is by hyphal elongation and carbon catabolism of most filamentous fungi is obligately aerobic. "Yeasts" are herein defined as eukaryotic micro-organisms and include all species of the subdivision Eumycotina that predominantly grow in unicellular form. Yeasts may either grow by budding of a unicellular thallus or may grow by fission of the organism.

The term "fungal", when referring to a protein or nucleic acid molecule, thus means a protein or nucleic acid whose amino acid or nucleotide sequence, respectively, naturally occurs in a fungus.

The term "gene", as used herein, refers to a nucleic acid sequence containing a template for a nucleic acid polymerase, in eukaryotes, RNA polymerase II. Genes are transcribed into mRNAs that are then translated into protein.

"Expression" refers to the transcription of a gene into structural RNA (rRNA, tRNA) or messenger RNA (mRNA) with subsequent translation into a protein.

The term "vector" as used herein, includes reference to an autosomal expression vector and to an integration vector used for integration into the chromosome.

The term "expression vector" refers to a DNA molecule, linear or circular, that comprises a segment encoding a polypeptide of interest under the control of (i.e., operably linked to) additional nucleic acid segments that provide for its transcription. Such additional segments may include promoter and terminator sequences, and may optionally include one or more origins of replication, one or more selectable markers, an enhancer, a polyadenylation signal, and the like. Expression vectors are generally derived from plasmid or viral DNA, or may contain elements of both. In particular an expression vector comprises a nucleotide sequence that comprises in the 5' to 3' direction and operably linked: (a) a fungal-recognized transcription and translation initiation region, (b) a coding sequence for a polypeptide of interest, and (c) a fungal-recognized transcription and translation termination region. "Plasmid" refers to autonomously replicating extrachromosomal DNA which is not integrated into a microorganism's genome and is usually circular in nature.

An "integration vector" refers to a DNA molecule, linear or circular, that can be incorporated in a microorganism's genome and provides for stable inheritance of a gene encoding a polypeptide of interest. The integration vector generally comprises one or more segments comprising a gene sequence encoding a polypeptide of interest under the control of (i.e., operably linked to) additional nucleic acid segments that provide for its transcription. Such additional segments may include promoter and terminator sequences, and one or more segments that drive the incorporation of the gene of interest into the genome of the target cell, usually by the process of homologous recombination. Typically, the integration vector will be one which can be transferred into the host cell, but which has a replicon which is nonfunctional in that organism. Integration of the segment comprising the gene of interest may be selected if an appropriate marker is included within that segment.

"Transformation" and "transforming", as used herein, refer to the insertion of an exogenous polynucleotide into a host cell, irrespective of the method used for the insertion, for example, direct uptake, transduction, f-mating or electroporation. The exogenous polynucleotide may be maintained as a non-integrated vector, for example, a plasmid, or alternatively, may be integrated into the host cell genome.

By "host cell" is meant a cell which contains a vector or recombinant nucleic acid molecule and supports the replication and/or expression of the vector or recombinant nucleic acid molecule. Host cells may be prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, fungus, plant, insect, amphibian, or mammalian cells. Preferably, host cells are fungal cells.

Key in the biosynthetic pathway for itaconic acid is the localisation of the various substrates. It is thought that production of itaconic acid mainly occurs in the cytosol (see FIG. 1 and Jaklitsch, W. M. et al., 1991, J. Gen. Microbiol. 137:533-539). Thus optimal availability of the substrates for the conversion to itaconic acid in the cytosol is required. The present inventors have found the gene that is coding for the transporter that is responsible for transporting the tricarboxylic acids that are substrate for the production of itaconic acid from the mitochondria to the cytosol. Said gene is found to be present on a genomic locus of Aspergillus terreus that further comprises putative genes involved in the further enzymatic steps in the pathway for itaconic acid production and lovastatin production. The invention now relates to a method for increasing the production of itaconic acid, by overexpression of genes encoding proteins capable of transporting di/tricarboxylic acids from the mitochondrion to the cytosol, leading to increased production of itaconic acid, in a suitable micro-organism. The proteins are further defined as proteins capable of transporting tricarboxylic acids, more preferably cis-aconitate, or its precursors citrate or isocitrate.

Examples of such transporters are plant mitochondrial dicarboxylate-tricarboxylate carriers (DTC) capable of transporting dicarboxylic acids and tricarboxylic acids (such as citrate, isocitrate, cis-aconitate and trans-aconitate) (Picault et al. 2002, J. Biol. Chem. 277:24204-24211), and the mitochondrial citrate transport protein (CTP) in Saccharomyces cerevisiae capable of transporting tricarboxylates like citrate and isocitrate (Kaplan et al. 1995, J. Biol. Chem. 270:4108-4114). The inventors now found a transporter that is specifically involved in the transport of tricarboxylates for the production of itaconic acid. Said gene is identified as ATEG_09970 and the nucleic acid and amino acid sequences are provided in FIG. 3. The nucleic acid sequence has already been disclosed in Birren, B. W. et al. (Database UniProt: Q0C8L4), in which the gene was annotated as belonging to the mitochondrial carrier family. However, it has not been specified that the protein encoded by said sequence would function as a tricarboxylate transporter for the production of itaconic acid. Further, a highly homologous nucleotide sequence from Aspergillus terreus was disclosed in U.S. Pat. No. 6,943,017 as an Acetyl CoA transport gene in the synthesis of lovastatin.

Also provided are functional homologues of the ATEG_09970 sequences, that are 50% or more identical to the sequence of FIG. 3b, preferably 60% or more, more preferably 70% or more, more preferably 80% or more, more preferably 90% or more and most preferably 95% or more identical. Functional in the term functional homologues means that the homologous protein has a tricarboxylic transporter function, i.e. is able to transport tricarboxylates over the mitochondrial membrane.

The term "sequence identity", as used herein, is generally expressed as a percentage and refers to the percent of amino acid residues or nucleotides, as appropriate, that are identical as between two sequences when optimally aligned. For the purposes of this invention, sequence identity means the sequence identity determined using the well-known Basic Local Alignment Search Tool (BLAST), which is publicly available through the National Cancer Institute/National Institutes of Health (Bethesda, Md.) and has been described in printed publications (see, e.g., Altschul et al., J. Mol. Biol. 215(3), 403-10 (1990)). Preferred parameters for amino acid sequences comparison using BLASTP are gap open 11.0, gap extend 1, Blosum 62 matrix.

Every nucleic acid sequence herein that encodes a polypeptide also, by reference to the genetic code, describes every possible silent variation of the nucleic acid. The term "conservatively modified variants" applies to both amino acid and nucleic acid sequences. With respect to particular nucleic acid sequences, conservatively modified variants refers to those nucleic acids which encode identical or conservatively modified variants of the amino acid sequences due to the degeneracy of the genetic code.

The term "degeneracy of the genetic code" refers to the fact that a large number of functionally identical nucleic acids encode any given protein. For instance, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are "silent variations" and represent one species of conservatively modified variation.

As described above the tricarboxylic acid transporters transport, among others, cis-aconitate, citrate or isocitrate, leading to an increase in cis-aconitate in the cytosol, which leads to a subsequent increase in itaconic acid production (see FIG. 1). An increased activity of said transporters can be achieved in many ways. One way is overexpression of a gene coding for said activity, preferably said gene is ATEG_09970. Overexpression can be effected in several ways. It can be caused by transforming the micro-organism with a gene coding for the transporter. Alternatively, another method for effecting overexpression is to provide a stronger promoter in front of and regulating the expression of said gene. This can be achieved by use of a strong heterologous promoter or by providing mutations in the endogenous promoter. An increased activity of the transporter can also be caused by removing possible inhibiting regulatory proteins, e.g. that inhibit the expression of such proteins. The person skilled in the art will know other ways of increasing the activity of the above mentioned transporter enzyme.

This process can be even further optimised using a method wherein the transported and produced cis-aconitate is converted to itaconic acid, using overexpression of the gene encoding the enzyme CAD (EC 4.1.1.6). "CAD" is defined as a protein, or a nucleotide sequence encoding for the protein, cis-aconitate decarboxylase (CAD); this further comprises enzymes with similar activities (see EP07112895). The CAD gene is preferably derived from Aspergillus sp. like Aspergillus terreus, Aspergillus niger, Aspergillus nidulans, Aspergillus oryzae or Aspergillus fumigatus. Most preferably the CAD gene is ATEG_09971.1, derived from the gene cluster that also comprises ATEG_09970 (see FIG. 2).

Again a further improvement can be achieved by providing a micro-organism with a gene encoding a protein capable of transporting dicarboxylic acids from the cytosol to the extracellular medium, more preferably the major facilitator superfamily transporter that can be found on the gene cluster that also comprises ATEG_09970 (see FIG. 2).

Even further optimisation of the present invention can be achieved by modulating the activity of the regulator protein that comprises a zinc finger and a fungal specific transcription factor domain as can be found on the gene cluster that also comprises ATEG_09970, wherein this regulator protein is indicated as ATEG_09969.1 (see FIG. 2).

In another aspect of the invention, micro-organisms overexpressing at least one but alternatively a combination of the above mentioned nucleotide sequences, encoding at least proteins transporting di/tricarboxylic acids from the mitochondrion to the cytosol, are produced and used, for increased production of itaconic acid. More preferably micro-organisms overexpressing proteins that transport di/tricarboxylates from the mitochondrion to the cytosol combined with overexpressing the CAD enzyme, the major facilitator superfamily transporter and/or the regulator protein as described above are used to further improve the production of itaconic acid.

Micro-organisms used in the invention are preferably micro-organisms that produce itaconic acid. Preferably overexpression of the genes encoding the above described protein(s) and enzyme(s) is accomplished in filamentous fungi, yeasts and/or bacteria, such as, but not limited to, Aspergillus sp., such as the fungi A. terreus, A. itaconicus and A. niger, Aspergillus nidulans, Aspergillus oryzae or Aspergillus fumigatus, Ustilago zeae, Ustilago maydis, Ustilago sp., Candida sp., Yarrowia lipolytica, Rhodotorula sp. and Pseudozyma antarctica, the bacterium E. coli and the yeast Saccharomyces cerevisiae. Especially preferred are homologous or heterologous citric acid producing organisms in which the substrates are available in the host organism.

Recently (see US 2004/0033570) it has also been established that the so-called D4B segment of Aspergillus terreus, which comprises the CAD gene, is responsible for the synthesis of lovastatin (see FIG. 2). Thus, it is submitted that also these micro-organisms which are known to produce lovastatin would be suitable candidates for the production of itaconic acid. Such micro-organisms include Monascus spp. (such as M. ruber, M. purpureus, M. pilosus, M. vitreus and M. pubigerus), Penicillium spp. (such as P. citrinum, P. chrysogenum), Hypomyces spp., Doratomyces spp. (such as D. stemonitis), Phoma spp., Eupenicillium spp., Gymnoascus spp., Pichia labacensis, Candida cariosilignicola, Paecilomyces variotii, Scopulariopsis brevicaulis and Trichoderma spp. (such as T. viride). Consequently also the CAD encoding part of the D4B segment and the enzyme with CAD activity for which it codes from these above-mentioned lovastatin producing micro-organisms are deemed to be suitable for use in the present invention. It further is contemplated that a heterologous organism, which in nature does not or hardly produce itaconic acid like Aspergillus niger, can be used when providing such an organism with a functional pathway for expression of itaconic acid, by overexpression of the above mentioned genes.

Recombinant host cells described above can be obtained using methods known in the art for providing cells with recombinant nucleic acids. These include transformation, transconjugation, transfection or electroporation of a host cell with a suitable plasmid (also referred to as vector) comprising the nucleic acid construct of interest operationally coupled to a promoter sequence to drive expression. Host cells of the invention are preferably transformed with a nucleic acid construct as further defined below and may comprise a single copy but preferably comprise multiple copies of the nucleic acid construct. The nucleic acid construct may be maintained episomally and thus comprise a sequence for autonomous replication, such as an ARS sequence. Suitable episomal nucleic acid constructs may e.g. be based on the yeast 2μ or pKD1 (Fleer et al., 1991, Biotechnology 9:968-975) plasmids. Preferably, however, the nucleic acid construct is integrated in one or more copies into the genome of the host cell. Integration into the host cell's genome may occur at random by illegitimate recombination but preferably the nucleic acid construct is integrated into the host cell's genome by homologous recombination as is well known in the art of fungal molecular genetics (see e.g. WO 90/14423, EP-A-0 481 008, EP-A-0 635 574 and U.S. Pat. No. 6,265,186). Most preferably for homologous recombination the ku70Δ/ku80Δ technique is used as described for instance in WO 02/052026.

Transformation of host cells with the nucleic acid constructs of the invention and additional genetic modification of the fungal host cells of the invention as described above may be carried out by methods well known in the art. Such methods are e.g. known from standard handbooks, such as Sambrook and Russell (2001) "Molecular Cloning: A Laboratory Manual" (3rd edition), Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, or F. Ausubel et al., eds., "Current protocols in molecular biology", Green Publishing and Wiley Interscience, New York (1987). Methods for transformation and genetic modification of fungal host cells are known from e.g. EP-A-0 635 574, WO 98/46772, WO 99/60102 and WO 00/37671.

In another aspect the invention relates to a nucleic acid construct comprising a nucleotide sequence encoding at least the di/tricarboxylate transporters as defined above and usable for transformation of a host cell as defined above. In the nucleic acid construct, the coding nucleotide sequences preferably is/are operably linked to a promoter for control and initiation of transcription of the nucleotide sequence in a host cell as defined below. The promoter preferably is capable of causing sufficient expression of the di/tricarboxylate transporters and/or the enzyme(s) described above, in the host cell. Promoters useful in the nucleic acid constructs of the invention include the promoter that in nature provides for expression of the coding genes. Further, both constitutive and inducible natural promoters as well as engineered promoters can be used. Promoters suitable to drive expression of the genes in the hosts of the invention include e.g. promoters from glycolytic genes (e.g. from a glyceraldehyde-3-phosphate dehydrogenase gene), ribosomal protein encoding gene promoters, alcohol dehydrogenase promoters (ADH1, ADH4, and the like), promoters from genes encoding amylo- or cellulolytic enzymes (glucoamylase, TAKA-amylase and cellobiohydrolase). Other promoters, both constitutive and inducible, and enhancers or upstream activating sequences will be known to those of skill in the art. The promoters used in the nucleic acid constructs of the present invention may be modified, if desired, to affect their control characteristics. Preferably, the promoter used in the nucleic acid construct for expression of the genes is homologous to the host cell in which genes are expressed.

In the nucleic acid construct, the 3'-end of the coding nucleotide acid sequence(s) preferably is/are operably linked to a transcription terminator sequence. Preferably the terminator sequence is operable in a host cell of choice. In any case the choice of the terminator is not critical; it may e.g. be from any fungal gene, although terminators may sometimes work if from a non-fungal, eukaryotic, gene. The transcription termination sequence further preferably comprises a polyadenylation signal.

Optionally, a selectable marker may be present in the nucleic acid construct. As used herein, the term "marker" refers to a gene encoding a trait or a phenotype which permits the selection of, or the screening for, a host cell containing the marker. A variety of selectable marker genes are available for use in the transformation of fungi. Suitable markers include auxotrophic marker genes involved in amino acid or nucleotide metabolism, such as e.g. genes encoding ornithine transcarbamylases (argB), orotidine-5'-decarboxylases (pyrG, URA3) or glutamine-amido-transferase indoleglycerol-phosphate-synthase phosphoribosyl-anthranilate isomerases (trpC), or involved in carbon or nitrogen metabolism, such as e.g. niaD or facA, and antibiotic resistance markers such as genes providing resistance against phleomycin, bleomycin or neomycin (G418). Preferably, bidirectional selection markers are used for which both a positive and a negative genetic selection is possible. Examples of such bidirectional markers are the pyrG (URA3), facA and amdS genes. Due to their bidirectionality these markers can be deleted from transformed filamentous fungus while leaving the introduced recombinant DNA molecule in place, in order to obtain fungi that do not contain selectable markers. The essence of this MARKER GENE FREE™ transformation technology is disclosed in EP-A-0 635 574, which is herein incorporated by reference. Of these selectable markers the use of dominant and bidirectional selectable markers such as acetamidase genes like the amdS genes of A. nidulans, A. niger and P. chrysogenum is most preferred. In addition to their bidirectionality these markers provide the advantage that they are dominant selectable markers, the use of which does not require mutant (auxotrophic) strains, but which can be used directly in wild type strains.

Optional further elements that may be present in the nucleic acid constructs of the invention include, but are not limited to, one or more leader sequences, enhancers, integration factors, and/or reporter genes, intron sequences, centromers, telomers and/or matrix attachment (MAR) sequences. The nucleic acid constructs of the invention may further comprise a sequence for autonomous replication, such as an ARS sequence. Suitable episomal nucleic acid constructs may e.g. be based on the yeast 2μ or pKD1 (Fleer et al., 1991, Biotechnology 9:968-975) plasmids. Alternatively the nucleic acid construct may comprise sequences for integration, preferably by homologous recombination (see e.g. WO 98/46772). Such sequences may thus be sequences homologous to the target site for integration in the host cell's genome. The nucleic acid constructs of the invention can be provided in a manner known per se, which generally involves techniques such as restricting and linking nucleic acids/nucleic acid sequences, for which reference is made to the standard handbooks, such as Sambrook and Russell (2001) "Molecular Cloning: A Laboratory Manual" (3rd edition), Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, or F. Ausubel et al., eds., "Current protocols in molecular biology", Green Publishing and Wiley Interscience, New York (1987).

In a further aspect the invention relates to fermentation processes in which the transformed host cells of the invention are used for the conversion of a substrate into itaconic acid. A preferred fermentation process is an aerobic fermentation process. The fermentation process may either be a submerged or a solid state fermentation process.

In a solid state fermentation process (sometimes referred to as semi-solid state fermentation) the transformed host cells are fermenting on a solid medium that provides anchorage points for the fungus in the absence of any freely flowing substance. The amount of water in the solid medium can be any amount of water. For example, the solid medium could be almost dry, or it could be slushy. A person skilled in the art knows that the terms "solid state fermentation" and "semi-solid state fermentation" are interchangeable. A wide variety of solid state fermentation devices have previously been described (for review see, Larroche et al., "Special Transformation Processes Using Fungal Spores and Immobilized Cells", Adv. Biochem. Eng. Biotech., (1997), Vol. 55, pp. 179; Roussos et al., "Zymotis: A large Scale Solid State Fermenter", Applied Biochemistry and Biotechnology, (1993), Vol. 42, pp. 37-52; Smits et al., "Solid-State Fermentation—A Mini Review", (1998), Agro-Food-Industry Hi-Tech, March/April, pp. 29-36). These devices fall within two categories, those categories being static systems and agitated systems. In static systems, the solid media is stationary throughout the fermentation process. Examples of static systems used for solid state fermentation include flasks, petri dishes, trays, fixed bed columns, and ovens. Agitated systems provide a means for mixing the solid media during the fermentation process. One example of an agitated system is a rotating drum (Larroche et al., supra). In a submerged fermentation process on the other hand, the transformed fungal host cells are fermenting while being submerged in a liquid medium, usually in a stirred tank fermenter as are well known in the art, although other types of fermenters such as e.g. airlift-type fermenters may also be applied (see e.g. U.S. Pat. No. 6,746,862).

Preferred in the invention is a submerged fermentation process, which is performed fed-batch. This means that there is a continuous input of feed containing a carbon source and/or other relevant nutrients in order to improve itaconic acid yields. The input of the feed can, for example, be at a constant rate or when the concentration of a specific substrate or fermentation parameter falls below some set point.

It is preferred to use a host cell that naturally would contain the enzymes/transporters of the itaconic acid pathway as depicted in FIG. 1, and the enzymes/transporters of the citric acid pathways in the cytosol and mitochondrion. However, if the host would lack one or more of these genes, they can be co-introduced with the above described enzymes. Such a co-introduction can be performed by placing the nucleotide sequence of such a gene on the same plasmid vector as the above described genes, or on a separate plasmid vector.

Further, since the itaconic acid pathway is located partly in the cytosol and partly in the mitochondrion, it is contemplated that overexpression of the genes/enzymes in either or both of those compartments would be desirable.

Ordered chromosomal/cDNA library → DNA arrays → Expression analysis → (Multivariate) data analysis; identification of relevant spots → Sequence only relevant clones

An A. terreus micro-array was made composed of a clone-based and an EST-based array.

Materials and Methods Construction Micro-Array

Isolation of Chromosomal DNA from A. terreus

A. terreus was cultivated overnight in a shake flask in enriched minimal medium at 33° C. and 250 rpm. Enriched minimal medium (pH 5.5) is mineral medium (MM) supplemented with 0.5% yeast extract and 0.2% casamino acids. The composition of MM was: 0.07 M NaNO3, 7 mM KCl, 0.11 M KH2PO4, 2 mM MgSO4, and 1 ml/l of trace elements (1000× stock solution: 67 mM ZnSO4, 178 mM H3BO3, 25 mM MnCl2, 18 mM FeSO4, 7.1 mM CoCl2, 6.4 mM CuSO4, 6.2 mM Na2MoO4, 174 mM EDTA).

Mycelium was harvested after 22 hours and frozen in liquid nitrogen. Chromosomal DNA was isolated from 4.5 g mycelium following the protocol described below.

Grind 0.5-1.0 g mycelium under liquid nitrogen using the membrane disrupter.

Place polypropylene tubes (Greiner) with 1.5 ml water-saturated phenol, 1 ml TNS, 1 ml PAS and 0.5 ml 5×RNB in a water bath at 55° C., add the still frozen mycelium to the tubes and vortex every 20 seconds for 2-4 minutes in total.

TNS: triisopropylnaphthalene sulphonic acid, 20 mg/ml in water, freshly prepared

PAS: 4-aminosalicylic acid, 120 mg/ml in water, freshly prepared

5×RNB: 60.55 g Tris, 36.52 g NaCl, 47.55 g EGTA in
The person skilled in 500 ml water (pH-8.5) the art will know how to achieve overexpression in the cytosol Add 1 ml sevag and vortex with intervals for another 1-2 or mitochondria by using the appropriate signal sequences. minutes. 55 Spin for 10 min. in the tabletop centrifuge at 4° C. at maximum velocity. Extract the water-phase once again with phenol-sevagand twice with sevag. GENTLY, AVOIDSHEARING! EXAMPLES Precipitate the DNA with 2 volumes ethanol. Spin directly 60 for 10 min. in the tabletop centrifuge. Example 1 Drain the tube, dry it with Kleenex and resuspend the pellet in 500 ul Tris/EDTA. Transfer to a microvial. Extract with phenol-sevaguntil interface stays clean. Then Construction of Micro-Array extract once with Sevag. 65 Precipitate with 2 volumes ice-cold ethanol, spin down and An anonymous clone/EST-based array approach was taken resuspend the pellet in 100-200 ul TE with 50 g/ml according to the following scheme: RNase. US 9,290,772 B2 11 12 Construction of Clone-Based gldNA Library 5,000 kanamycin-resistant colonies were picked into The gldNA library was prepared as follows: microtiter plates Chromosomal A. terreus DNA was sheared into fragments The 5000 cDNA clones were replicated into 96-well of size 1.5-2.5 kb microtiter plates. The ordered libraries were stored as glyc The sheared DNA was subsequently size fractionated, end erol stocks at -80° C. repaired (Lucigen), and ligated into blunt-end Construction of the A. Terreus Clone-Based Array pSMART-HC-Amp vectors (Lucigen). PCR fragments were generated from the different clones The ligated constructs were transformed into E. coli DH from the g|DNA (20,000 clones) and cDNA (5,000 clones) 1Ob library by mass 96 well PCR (50 ul/well, Lucigen SMART Colony PCR was performed on 96 transformants to check 10 SR1/SL1 primers with 5'-C6-aminolinkers, SuperTaq and that >90% of the inserts had the correct size buffer from HT Biotech. 
Ltd, dNTPs (Roche 11969 064 Sequence analysis (short run) was performed on 20 clones 001), pintool dipped template from grown colony plates). to confirm their diversity and fungal origin All above PCR products were purified by 96 well precipi Colony picking of 20,000 amp-resistant colonies was car 15 tation (isopropanol and 96% ethanol wash), speedvac dried, ried out into 96-well microtiter plates containing TY dissolved in 15 Jul 3xSSC/well and spotted with quill pins medium--100 g/ml ampicillin (Telechem SMP3) on CSS100 silylated aldehyde glass slides The 20.000 clones were replicated into 96-well microtiter (Telechem, USA) using a SDDC2 Eurogridder (ESI, plates. The ordered libraries are stored as glycerol stocks at Canada). During spotting, aminolinkers of PCR products will -80° C. covalently link with aldehyde groups of the coated slides. Generation of mRNA for cDNA Library Construction gDNA and cDNA PCR products were spotted on two sepa Precultures: A. terreus spores (10°-10"/ml) were inocu rate slides (slidea: 1st 10,000 g|DNA’s+5000 cDNA's; slide lated into 100 ml B medium (2 g/l NH4NO3; 1 g/1 b: 2nd 10,000 g|DNA's+same 5000 cDNAs). MgSo4*7H2O: 0.008 g/1 ZnSO4*7H2O: 0.015 g/1 For the clone-based array a genomic library was con CuSO4*5H2O: 1.5 ppm FeSO4*5H2O: 0.08 g/1 KH2PO4; 25 structed. A total of 20,000 clones containing chromosomal 10 g/l CaCl2*2H2O, set to pH3.1 with HCl) containing 20 g/1 fragments was generated, 90% of which had an average insert glucose, and incubated for 24-48 hours at 37° C. at 250 rpm. size of 1.5-2.5 kb. This resulted in a full genome coverage of Production cultures (B medium containing 100 g/l glucose) 64% (Akopyants et al., 2001). were inoculated 1/10 (v/v) for 2-days cultivations and 1/25 For the EST-based array a cDNA library of in total 5000 (v/v) for 3-day cultivations. 
After 2-3 days cultivation myce 30 cDNA clones was constructed, 70% of which had an average lium was harvested, filtered over miracloth, washed with 0.2 insert size of 1.0-1.5 kb. This so-called EST-based approach M sodium phosphate buffer (pH 6.5), frozen in liquid nitro has the advantage that it will be enriched for the genes gen and stored at -80° C. expressed under the selected (itaconic acid producing) con Isolation of mRNA from A. Terreus ditions. Moreover, in the EST-based approach per clone (and grind mycelium with mortar and pestle under liquid nitro 35 thus spot) only a single gene is represented in eukaryotes. gen; add 100 ul B-mercaptoethanol before grinding to The complete micro-array, thus consisting of 20,000 inactivate RNAse genomic DNA clones and 5,000 cDNA clones was composed transfer powder to cooled plastic tube (1.0 g per tube); keep of an A and a B glass slide. Both slides contained the same mycelium frozen 40 5,000 cDNA spots. The A and B slide each contained 10,000 add 4 ml Trizol and vortex till homogenous of the g|NA spots. add 0.4 ml chloroform and vortex centrifuge for 20-30 min. at 3700 rpm, 4° C. Example 2 transfer supernatant to Eppendorf tubes (1.2 ml per tube) add 0.7 ml per 1.2 ml supernatant 45 Generation of the Different RNA Samples by centrifuge in eppendorf centrifuge for 15 min. at 14.000 Fermentation rpm, 4°C. wash pellet with 1 ml 70% ethanol Materials and Methods Fermentation and mRNA. Isolation centrifuge 5 min., 14.000 rpm, 4°C. Fermentation Conditions of A. Terreus air-dry pellet and resuspend in 0.2 ml water 50 5-Liter controlled batch fermentations were performed in a store RNA samples at -80° C. New Brunswick Scientific Bioflow 3000 fermentors. The fol Construction of cDNA Library lowing conditions were used unless stated otherwise: The cDNA library was prepared as follows: 370 C. 
The RNA was run on gel to determine the quality of the pH start 3.5 set point 2.3 sample 55 DO set points polyT-primed cDNA was prepared from the total RNA Day 1: 75% provided (RT-PCR reaction using superscript and dT Day 2, 3, 4: 50% primers Subsequent days: 25% The cDNA was size fractionated to give fragments of size Preculture: 100 ml of the same medium as used in the 1.0-1.5 kb 60 fermentation medium (107 spores/ml) in 500 ml Erlen The fragments were end-repaired (Lucigen), and ligated meyer flask with baffles, overnight, 37°C., 150 rpm into blunt-end pSMART-HC-kan vectors (Lucigen). pH control:4M KOH (Base), 1.5 M H PO (Acid) Restriction analysis of 96 clones was performed to check Antifoam: Struktol (Schill & Seilacher) the insert size and the 96 of transformants which had the Fermentation Medium Compositions: correct insert size 65 Per liter: 2.36 g of NHSO, 0.11g of KHPO, 2.08 g of Sequence analysis (short run) of 20 clones was performed MgSO.7H2O, 0.13 g of CaCl*2HO, 0.074 g of NaCl, 0.2 to confirm diversity and fungal origin mg of CuSO4.5H2O, 5.5 mg of Fe(III)SO.7H2O, 0.7 mg of US 9,290,772 B2 13 14 MnCl*4H2O and 1.3 mg of ZnSO*7HO and 100 g of TABLE 2-continued glucose as a carbon source. Overview of the fermentations performed in order to generate All media were prepared in demineralised water. RNA samples for transcriptome analysis. The reference fermentation Isolation of mRNA from A. Terreus is on 100 g/l glucose, dO2, day 1, 75%; day 2-4, 50%, day 5 See mRNA isolation protocol described in Example 1 5 and further 25%, pH start 3.5. Set point at 2.3. Determination of the Itaconate Concentration by HPLC Fermen- Max. Max. 5ul of a 10-times diluted supernatant sample (split ratio tation Fermen- Environmental Itaconic Biomass 1:3) was separated using a Waters 2695 Separations module ill tation condition acid (g/I) (gDWT/kg) on a reversed-phase Develosil 3 um RP-Aqueous C30 140A 10 5 pH set 3.5 8.7 16.5 column (150x3 mm) (Phenomenex p?in CH0-6001) at 25°C. 
6 pH start 3.5 no set 30.6 8.7 using the solvent gradient profile (flow rate was 0.4 ml/min) point shown in Table 1. Third run 7 Low glucose (30 g/l) 11.1 6.7 8 O set point 25% 47.2 12.O 9 5* higher Mn 20.3 13.8 TABLE 1. 15 Fourth run 10 Glucose (100 g/l) 26.9 17.9 (control) Solvent gradient of the RP-UV method. 11 pH set 4.5 O.1 20.4 12 O set point 10% S2.9 10.6 A. B Time (20 mM NaH2PO pH 2.25) (Acetonitril) (min) (%) (%) As shown in Table 2, a considerable variation in the amount O 1OO O of itaconate is produced in this set of fermentations, ranging 10 1OO O from almost noitaconate (fermentation #11; pH 4.5) to about 15 95 5 2O 95 5 50 g/l itaconate (#8 and #12; O set point 25% and 10% 21 1OO O respectively). 30 1OO O 25 Of each fermentation 2 to 5 samples were harvested for isolation of mRNA. Compounds were detected by UV at 210 nm using a Waters 2487 Dual wavelength Absor From in total 23 fermentation samples mRNA could be bance detector (Milford, MA, USA). isolated. Of 7 samples, mRNA was isolated twice indepen Itaconate Productivity dently. It proved to be especially difficult (impossible) to Itaconate productivity at a certain time point was calcu extract RNA from the samples taken in the stationary phase. lated as the slope of the regression line between that particular 30 A number of samples showed partial degradation of the RNA. time point and the time points right before and after that time Although no mRNA could be isolated from the samples from point. To this end of 6-10 supernatant samples of the different fermentations #6 and #12, the remaining samples still cov fermentations, the itaconate concentrations were determined ered the complete range of itaconate production (Table 3). by HPLC. 35 For the transcriptomics approach it is essential to have TABLE 3 RNA samples from fermentations that result in the production of different amounts of itaconate. 
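The productivity definition given under "Itaconate Productivity" (the slope of the regression line through a time point and the time points directly before and after it) can be illustrated with a short sketch. It assumes an ordinary least-squares fit over exactly three (time, concentration) pairs; the function names and the sample series are hypothetical and are not data from the fermentations described here.

```python
from typing import Sequence


def regression_slope(times: Sequence[float], values: Sequence[float]) -> float:
    """Ordinary least-squares slope of values against times."""
    n = len(times)
    if n != len(values) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return sxy / sxx


def productivity_at(times: Sequence[float], titers: Sequence[float], i: int) -> float:
    """Slope over sample i and its direct neighbours (titer units per time unit)."""
    if not 0 < i < len(times) - 1:
        raise IndexError("sample needs a neighbour on both sides")
    return regression_slope(times[i - 1:i + 2], titers[i - 1:i + 2])


# Hypothetical supernatant series: elapsed time (h) and itaconate titer (g/l).
eft_hours = [24.0, 30.0, 48.0, 54.0, 72.0]
itaconate_g_per_l = [1.0, 2.5, 9.0, 11.0, 14.0]

for index in range(1, len(eft_hours) - 1):
    slope = productivity_at(eft_hours, itaconate_g_per_l, index)
    print(f"t = {eft_hours[index]:5.1f} h  productivity = {slope:.3f}")
```

The first and last samples of a series have no neighbour on one side, so under this reading no productivity value can be assigned to them; a negative slope simply indicates a falling titer around that time point.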
Therefore a literature survey was performed in order to identify medium components and/or physicochemical conditions that affect the amount of itaconate produced by A. terreus. Although many conflicting reports were found regarding the effect that a specific parameter has on itaconic acid production, 4 key overall parameters were identified from this literature survey, i.e. (i) carbon source, (ii) pH, (iii) trace element (i.e. Mn) concentration and (iv) oxygen tension. Fermentations with A. terreus varying principally in these four parameters were performed on a mineral salts medium to ensure that the elemental limitations required for itaconate production would be achieved. Table 2 presents an overview of the fermentations performed in this study.

TABLE 2
Overview of the fermentations performed in order to generate RNA samples for transcriptome analysis. The reference fermentation is on 100 g/l glucose, dO2, day 1, 75%; day 2-4, 50%, day 5 and further 25%, pH start 3.5, set point at 2.3.

Fermentation run   Fermentation   Environmental condition                                       Max. itaconic acid (g/l)   Max. biomass (gDWT/kg)
First run          1              Glucose (100 g/l) (control)                                   16.1                       12.7
                   2              Fructose as C-source                                          8.84                       13.7
                   3              Maltose as C-source                                           13.9                       12.1
Second run         4              Glucose (100 g/l), pH start 3.5, set point 2.3 (control)      25.8                       11.6

TABLE 3
List of 30 mRNA samples from various fermentation conditions that were used for transcriptome analysis.

Sample no.   Fermentation condition   RNA id    EFT (hours)   Itaconic acid (g/l)   Productivity   RNA quality
R3           gluc100                  1.3.a     50.3          14.6                  0.117          ok
R4           gluc100                  1.4.a     74.8          16.1                  0.060          ok
R5           fruc100                  2.3.a     50.3          8.2                   0.082          ok
R6           fruc100                  2.3.b     50.3          8.2                   0.082          ok
R7           fruc100                  2.4.a     75.05         8.6                   -0.013         ok
R8           malt100                  3.3.a     50.3          7                     0.355          ok
R9           malt100                  3.4.a     75            12.1                  0.220          ok
R10          pH-i3.5                  4.3.a     53.25         25.8                  0.146          part degr
R11          pH-i3.5                  4.3.b     53.25         25.8                  0.146          part degr
R12          pH-i3.5                  4.4.a     73            24                    -0.153*        ok
R13          pH-c3.5                  5.3.a     53.5          7.5                   -0.042         ok
R14          pH-c3.5                  5.3.b     53.5          7.5                   -0.042         ok
R15          pH-c3.5                  5.4.a     73.25         7.9                   0.035          ok
R16          gluc30                   7.2.a     30.25         9                     0.317          ok
R17          gluc30                   7.3.a     43.5          10                    0.030          ok
R18          O2s25%                   8.2.a     30.5          36*                   0.824*         ok
R19          O2s25%                   8.4.a     78.25         46                    0.029          part degr
R20          5xMn                     9.2.a     30.75         1                     0.194          ok
R21          5xMn                     9.2.b     30.75         1                     0.194          ok
R22          5xMn                     9.3.a     53.5          10                    0.496          part degr
R23          5xMn                     9.3.b     53.5          10                    0.496          part degr
R24          5xMn                     9.4.a     78.5          19                    0.189          part degr
R25          5xMn                     9.4.b     78.5          19                    0.189          part degr
R26          5xMn                     9.5.a     93.25         20                    0.106          ok
R27          Gluc100                  10.3.a    51.5          14.7                  0.256          ok
R28          Gluc100                  10.4.a    74            19.5                  0.085          ok
R29          Gluc100                  10.5.a    100.4         22                    0.177          part degr
R30          Gluc100                  10.5.b    100.4         22                    0.177          part degr
R31          pH 4.5                   11.3.a    51.5          0.04*                 -0.001         ok
R32          pH 4.5                   11.4.a    74            0.05*                 0.003          ok

The samples marked with an asterisk were the samples used for the differential expression data analysis.

Example 3

Transcriptome Analysis, Data Analysis of the Array Data

Materials and Methods Transcriptome Analysis, Data Normalization and Data Analysis

Labeling of RNA and gDNA

Total RNAs (5 µg/30 µl reaction), isolated from various A. terreus cultures (strain NRRL 1960, BASF) with differential itaconate production, were labelled with amino-allyl-dUTP (0.75 µM aa-dUTP final conc., Sigma A0410), using 3 µl 50 µM oligo p(dT)15 primer (La Roche, 814270), unlabelled dNTPs (added to 1.25 µM final conc. for each dNTP), 2 µl Superscript II Reverse Transcriptase and buffer (Life Technologies, 10297-018: primer annealing 10 min 70°C, transcriptase 180 min 42°C). After RNA hydrolysis (3 µl 2.5 M NaOH, 30 min 37°C, 3 µl 2.5 M HAc) the aa-dUTP labelled cDNA was directly purified (below).

As a reference for correcting slide differences (spotting-, labeling-, hybridization- and scan efficiency), gDNA (0.5

dried with N2-gas. All pre-hybridisation buffers were 0.45 µm filtered to reduce dust noise. Slide images of Cy5- and Cy3-fluorescence intensity (ScanArray Express Scanner & Software, Packard Biosc.) were analysed (Imagene 5.6 Software, Biodiscovery) to obtain for each spot signal- and local background value (medians) for the hybridized Cy5-RNA and Cy3-reference gDNA. These values were used for further data analysis.

Array Data Normalization

Before normalization, all low abundant spots having a Signal/Background below 1.5 were removed. Data were normalized using a total cDNA signal correction. For each slide and each spot, the difference between signal and background was calculated for Cy5 and Cy3. Per slide, the sum of the differences was taken for Cy5 and Cy3, and the ratio between these two was used as normalisation factor for that particular slide. All spots (chromosomal and genomic) were normalised using this total cDNA signal.

Data Analysis of the Transcriptomics Data by Differential Expression Analysis

The differential expression value was calculated by dividing the Cy5 (RNA)/Cy3 (gDNA) ratio of a spot in the slide with the highest titer or productivity by the Cy5 (RNA)/Cy3 (gDNA) ratio of that same spot in the slide with the lowest titer or productivity. The samples used for the differential expression analysis are marked in Table 2. The spots were subsequently ranked based on this ratio or, when the ratio was <1, i.e. in the case of down-regulated genes, on 1/ratio.

Sequence Analysis of Spots Selected after Transcriptomics Approach

The relevant clones were selected from the glycerol stocks of the ordered libraries (gDNA and cDNA library respectively) and cultivated in 96-well microtiter plates.
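The filtering, normalisation and ranking steps described under "Array Data Normalization" and "Data Analysis of the Transcriptomics Data by Differential Expression Analysis" can be sketched as follows. This is an illustration under stated assumptions, not the software actually used: the Signal/Background cut-off is assumed to apply to both channels, the per-slide factor is assumed to be (sum of Cy5 differences)/(sum of Cy3 differences) over the cDNA spots, and each spot's Cy5/Cy3 ratio is assumed to be divided by that factor. The `Spot` class and all function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

MIN_SIGNAL_TO_BACKGROUND = 1.5  # threshold quoted in the text


@dataclass(frozen=True)
class Spot:
    spot_id: str
    is_cdna: bool          # True for cDNA spots, False for gDNA spots
    cy5_signal: float      # labelled RNA sample
    cy5_background: float
    cy3_signal: float      # labelled gDNA reference
    cy3_background: float


def passes_filter(spot: Spot) -> bool:
    """Assumption: the Signal/Background cut-off applies to both channels."""
    if spot.cy5_background <= 0 or spot.cy3_background <= 0:
        return False
    return (spot.cy5_signal / spot.cy5_background >= MIN_SIGNAL_TO_BACKGROUND
            and spot.cy3_signal / spot.cy3_background >= MIN_SIGNAL_TO_BACKGROUND)


def normalised_ratios(spots: List[Spot]) -> Dict[str, float]:
    """Background-corrected Cy5/Cy3 ratio per spot, scaled by the slide factor."""
    kept = [s for s in spots if passes_filter(s)]
    cdna = [s for s in kept if s.is_cdna]
    if not cdna:
        raise ValueError("no cDNA spots left to compute the slide factor")
    total_cy5 = sum(s.cy5_signal - s.cy5_background for s in cdna)
    total_cy3 = sum(s.cy3_signal - s.cy3_background for s in cdna)
    slide_factor = total_cy5 / total_cy3  # total cDNA signal correction
    return {
        s.spot_id: ((s.cy5_signal - s.cy5_background)
                    / (s.cy3_signal - s.cy3_background)) / slide_factor
        for s in kept
    }


def rank_differential(high: Dict[str, float],
                      low: Dict[str, float]) -> List[Tuple[str, float, float]]:
    """Rank spots present on both slides by ratio, or by 1/ratio when ratio < 1."""
    ranked = []
    for spot_id in high.keys() & low.keys():
        ratio = high[spot_id] / low[spot_id]
        score = ratio if ratio >= 1 else 1 / ratio
        ranked.append((spot_id, ratio, score))
    return sorted(ranked, key=lambda row: row[2], reverse=True)


# Hypothetical two-spot slides for a high-titer and a low-titer sample.
high_slide = [Spot("a", True, 1100, 100, 600, 100), Spot("b", True, 300, 100, 300, 100)]
low_slide = [Spot("a", True, 200, 100, 600, 100), Spot("b", True, 500, 100, 300, 100)]
for spot_id, ratio, score in rank_differential(normalised_ratios(high_slide),
                                               normalised_ratios(low_slide)):
    direction = "up" if ratio >= 1 else "down"
    print(f"{spot_id}: ratio {ratio:.3f} ({direction}), rank score {score:.3f}")
```

Ranking on the larger of ratio and 1/ratio places strongly down-regulated spots alongside strongly up-regulated ones, which matches the ranking rule stated in the text.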
µg/reaction) of Aspergillus terreus (strain NRRL 1960, BASF) was labelled with aa-dUTP, using dNTPs (conc. as above), Klenow-DNA Polymerase and buffer (Bioprime kit, Invitrogen 18094-011: primer annealing 5 min 96°C, polymerase 90 min 37°C).

The aa-dUTP-labelled cDNA or gDNA was purified (QIAquick column, Qiagen 28106), speedvac dried, dissolved (4.5 µl 0.1 M Na2CO3), coupled with 4.5 µl Cy5-NHS-ester for cDNA, or 4.5 µl Cy3-NHS-ester for gDNA (Amersham/GE-Healthcare PA25001 or PA23001 respectively, each in 73 µl DMSO) for 60 min at 20°C, diluted with 10 µl of water, and again purified on AutoSeq G50 columns (GE Healthcare 27-5340).

Array Blocking, (Pre)Hybridization and Image Analysis

Before hybridization with the array produced as described above, slides were blocked (removal surplus of spotted PCR products and blocking of free aldehyde groups) by 3x quickly washing (20°C) with Prehyb buffer and 45 min incubation (42°C) in Prehyb buffer (5xSSC, 1% BSA, 0.1% SDS). After 4 washes in water, spotted PCR products were denatured by dipping the slides 5 sec in boiling water and drying them with a N2-gas-pistol.

The Cy5- and Cy3-labelled sample were combined, 8 µl 25 µg/µl yeast tRNA (Invitrogen, 15401-029) and 4 µl 5 µg/µl poly-dA/dT (Amersham 27-7860) were added, the mixture was speed vac dried, dissolved in 160 µl Easyhyb buffer (Roche, 1 796 895), denatured (2 min, 96°C), cooled to 50°C, applied on a pair of prehybridised slides (a+b, 80 µl/slide) prewarmed at 50°C, covered with a cover slide (Hybrislips, Mol. Probes. H-18201) and incubated overnight at 42°C in a humidified hybridization chamber (Corning 2551). Slides were washed (pair a+b in one 50 ml tube, 1x in 1xSSC/0.1% SDS 37°C, 1x in 0.5xSSC 37°C, 2x in 0.2xSSC 20°C) and

TABLE 5-continued
Overall Top 20 Differential expression - itaconic acid productivity.

Rank   Clone ID       Gene locus   Gene name according to (http://www.broad.mit.edu/)
3      AsTeR026D10

The sequences of the inserts from both the 3' and the 5' end were determined by High Throughput (HT) sequencing service.

All RNA samples were labelled with Cy5. Hybridisations were performed with all 30 RNA samples, using Cy3-labeled chromosomal DNA of A. terreus as the reference.

The raw transcriptomics data were shown to be of high quality, based on visual inspection of the arrays after fluorescence scanning. Notably, also the hybridization with the partially degraded RNA samples gave good results.

The normalized data were subsequently combined. As the A. terreus array consisted out of two slides, different strategies of combining the data from the two slides were pursued, making use of the fact that the cDNA clones are present on both the A and B slide:

SET 1 = mean expression signal of the cDNA clones on slide A and B, take only those spots that give a signal on both the A and B slide
SET 2 = use only the signal of the cDNA spots on the A slide. Spots with a Signal/Background below 1.5 were removed.
SET 3 = use only the signal of the cDNA spots on the B slide. Spots with a Signal/Background below 1.5 were removed.
SET 4 = Combimean cDNA data of both the A and B slide;
   i. If both measurement values were zero the combined value was zero;
   ii. If both measurement values were non-zero, the combined value was equal to the average of the two measurement values;
   iii. If one of the two measurement values was zero and the other measurement value was non-zero, the combined value was equal to the non-zero measurement value.
SET 5 = SET 1 + normalized gDNA spots using the normalization factor calculated based on the cDNA clones.

The most relevant spots were subsequently identified by differential expression analysis: the expression ratios between the sample with the lowest itaconate titer and the sample with the highest itaconate titer were calculated (see Table 2).
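The SET 1 and SET 4 ("Combimean") rules for merging the duplicate cDNA spots of the A and B slide can be written out as a short sketch. It assumes that each slide is represented as a mapping from clone ID to normalized signal and that a clone missing from one slide counts as a zero measurement; both the representation and the function names are hypothetical.

```python
from typing import Dict


def combimean(value_a: float, value_b: float) -> float:
    """Combine two measurements of one clone following rules i to iii of SET 4."""
    if value_a == 0 and value_b == 0:
        return 0.0                        # rule i: both zero
    if value_a != 0 and value_b != 0:
        return (value_a + value_b) / 2    # rule ii: average of the two values
    return value_a if value_a != 0 else value_b  # rule iii: the non-zero value


def combine_slides(slide_a: Dict[str, float], slide_b: Dict[str, float]) -> Dict[str, float]:
    """SET 4 style merge; a clone absent from one slide is treated as zero (assumption)."""
    clone_ids = slide_a.keys() | slide_b.keys()
    return {cid: combimean(slide_a.get(cid, 0.0), slide_b.get(cid, 0.0))
            for cid in clone_ids}


def both_slides_mean(slide_a: Dict[str, float], slide_b: Dict[str, float]) -> Dict[str, float]:
    """SET 1 style merge: mean signal, keeping only clones with signal on both slides."""
    shared = slide_a.keys() & slide_b.keys()
    return {cid: (slide_a[cid] + slide_b[cid]) / 2
            for cid in shared
            if slide_a[cid] != 0 and slide_b[cid] != 0}


# Hypothetical normalized signals for three cDNA clones.
slide_a = {"clone1": 2.0, "clone2": 0.0, "clone3": 0.0}
slide_b = {"clone1": 4.0, "clone2": 6.0, "clone3": 0.0}
print(sorted(combine_slides(slide_a, slide_b).items()))
print(sorted(both_slides_mean(slide_a, slide_b).items()))
```

Compared with SET 1, the Combimean rule keeps clones that gave a signal on only one of the two slides, at the cost of relying on a single measurement for those clones.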
As two samples have a low itaconate titer, the differential expression analysis was performed separately with both these reference samples (i.e. sample 3.a and 4.a). Similarly, also the expression ratios between the samples with the lowest and the samples with the highest itaconate productivity were calculated.

Top 20-ies of the individual data set using the different data analysis approaches were generated. These top-20-ies were combined, and unique spots were identified (Table 4 and 5). In total 88 spots obtained after the differential analyses (based on 15 models; 5 data sets, each with 2 titer and 1 productivity model) were selected for sequencing.

Of the selected spots, >92% were spots belonging to cDNA clones. Of the differential spots, some 50-75% of the spots were present in the top 20 of both the itaconate titer and itaconate productivity differentials lists and were mostly upregulated spots, indicating that they might be really relevant for itaconate production.

Following sequence analysis of the 190 selected spots, the genes present on these inserts were identified by performing a homology search using BLAST based on the draft version of the A. terreus genome sequence as available from the BROAD institute (http://www.broad.mit.edu/annotation/fgi/).

Tables 4 and 5 show the results of the genes identified on the 20 highest overall ranking spots identified by differential expression analysis based on titer and productivity, respectively.

TABLE 4
Overall Top 20 Differential expression - itaconic acid titer.

Rank   Clone ID       Gene locus      Gene name according to (http://www.broad.mit.edu/)
1      AsTeR037B09    ATEG_09971.1    cis-aconitate decarboxylase
2      AsTeR017E03    ATEG_09970.1    predicted protein
3      AsTeR008F12    ATEG_09971.1    cis-aconitate decarboxylase
4      AsTeR017E02    ATEG_09970.1    predicted protein
5      AsTeR026D10
6      AsTeR020B12    ATEG_09970.1    predicted protein
7      AsTeR027F02    ATEG_09970.1    predicted protein
8      AsTeR031E12
9      AsTeR041A01
10     AsTeR036C11
11     AsTeR025E11
12     AsTeR008H08    ATEG_09970.1    predicted protein
13     AsTeR028C10    ATEG_09970.1    predicted protein
14     AsTeR026G08    ATEG_09970.1    predicted protein
15     AsTeR009E09    ATEG_09971.1    cis-aconitate decarboxylase
16     AsTeR005D11    ATEG_09971.1    cis-aconitate decarboxylase
17     AsTeR056A03
18     AsTeR010E04    ATEG_09970.1    predicted protein
19     AsTeR045C03    ATEG_09970.1    predicted protein
20     AsTeR054H08    ATEG_09970.1    predicted protein

TABLE 5
Overall Top 20 Differential expression - itaconic acid productivity.

Rank   Clone ID       Gene locus      Gene name according to (http://www.broad.mit.edu/)
1      AsTeR020B12    ATEG_09970.1    predicted protein
2      AsTeR031E12
3      AsTeR026D10
4      AsTeR005D11    ATEG_09971.1    cis-aconitate decarboxylase
5      AsTeR017E03    ATEG_09970.1    predicted protein
6      AsTeR008F12    ATEG_09971.1    cis-aconitate decarboxylase
7      AsTeR017E02    ATEG_09970.1    predicted protein
8      AsTeR037B09    ATEG_09971.1    cis-aconitate decarboxylase
9      AsTeR027F02    ATEG_09970.1    predicted protein
10     AsTeR038F06
11     AsTeR008H08    ATEG_09970.1    predicted protein
12     AsTeR022C05    ATEG_09971.1    cis-aconitate decarboxylase
13     AsTeR037B09    ATEG_09971.1    cis-aconitate decarboxylase
14     AsTeR004A12
15     AsTeR018E11    ATEG_09971.1    cis-aconitate decarboxylase
16     AsTeR045C03    ATEG_09970.1    predicted protein
17     AsTeR045F08    ATEG_09970.1    predicted protein
18     AsTeR011A05
19     AsTeR044F02    ATEG_09970.1    predicted protein
20     AsTeR041B02

Standing out when comparing the highest ranking genes found by differential expression analysis based on productivity versus titer are the cis-aconitate decarboxylase (ATEG_09971.1), and the immediately flanking gene encoding a predicted protein (ATEG_09970.1), both of which are present on multiple clones in the top 20 rankings, underlining their relevance to the itaconate production phenotype.

Example 4

Homology Analysis of the ATEG_09970.1 Gene

A BLAST search was performed in order to identify homologues to the predicted protein ATEG_09970.1 (Table 6). High homologies were only found with genes from two other A. terreus strains. With other micro-organisms and more specifically fungi, homologues were found although with low homology. These low identities suggest that this gene is part of a unique pathway. Based on the annotation of these homologous genes ATEG_09970.1 was identified as a putative mitochondrial tricarboxylate transporter.

TABLE 6
BLAST search results with ATEG_09970.1

Rank   Protein                                     Best Hit                            E value   Identity/Similarity
1      Predicted protein                           XP_001209272.1 A. terreus           1e-173    100%/100%
2      unknown                                     AAD34562.1 A. terreus               1e-171    98%/99%
3      Conserved hypothetical protein              XP_001219399.1 C. globosum          6e-59     43%/60%
4      Conserved hypothetical protein              XP_360936.2 M. grisea               3e-58     44%/62%
5      Conserved hypothetical protein              XP_001586805.1 S. sclerotiorum      1e-56     44%/64%
6      Mitochondrial tricarboxylate transporter    XP_001270567.1 A. clavatus          7e-56     43%/62%
7      Tricarboxylate transport protein            XP_956064.2 N. crassa               1e-55     44%/61%
8      Mitochondrial tricarboxylate transporter    XP_001263903.1 N. fischeri          1e-55     42%/66%
9      Mitochondrial tricarboxylate transporter    XP_755059.2 A. fumigatus            2e-55     42%/62%
10     Hypothetical protein                        XP_001395080.1 A. niger             6e-55     41%/61%

It appears that at least the gene coding for the cis-aconitate decarboxylase (ATEG_09971.1) and the gene encoding the putative mitochondrial tricarboxylate transporter (ATEG_09970.1) lie in the same cluster in the A. terreus genome (FIG. 2).

Flanking the CAD and the putative mitochondrial tricarboxylate transporter genes is the Major Facilitator Superfamily (MFS) transporter (ATEG_09972.1) (SEQ ID NOS: 12-13) that was identified by Partial Least Squares (PLS) biostatistical analysis. MFS transporters are a diverse family of transport proteins, transporting compounds ranging from sugars to organic acids, including dicarboxylic acids. In A. niger some 450 different MFS genes are present. The localization of MFS ATEG_09972.1 and its identification by PLS suggest that this is the itaconate exporter.

A gene neighbouring CAD, the putative mitochondrial tricarboxylate transporter and the putative itaconate exporter is a putative regulator containing a zinc-finger domain (ATEG_09969.1). This gene was not identified using our transcriptomics approach, but considering its localization it is supposed that it is relevant for itaconic acid synthesis. FIG. 2 shows that also the lovastatin pathway genes are located on this cluster, suggesting a link between both pathways which are (mainly) specific for A. terreus.

Example 5

(Co-)Expression of the ATEG_09970.1 Gene in Aspergillus niger

In order to unambiguously establish that the ATEG_09970 protein aids in the increased production of itaconic acid, a naturally non-itaconic acid producing fungal host was (co-)transformed with the CAD gene and the ATEG_09970.1 (MTT) gene.

Expression of the CAD (ATEG_09971.1) Gene in Aspergillus niger

A PCR generated copy of the gene encoding the CAD protein (see EP07112895) was generated. For this purpose two sets of primers were generated as shown below. PCR amplification based on A. terreus NRRL1960 genomic DNA resulted in the isolation of PCR fragments from which the complete coding region of the gene encoding the CAD protein could be isolated as BspHI-BamHI fragments.
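The design of the two forward primers listed below (SEQ ID NOS: 6 and 7) can be checked with a short sketch: each primer should contain a BspHI recognition site (TCATGA) whose embedded ATG coincides with the start codon of the CAD coding sequence. The primer sequences and the first 30 nucleotides of the CAD sequence are copied from the listing; the helper name and the check itself are illustrative assumptions, not part of the described cloning procedure.

```python
BSPHI_SITE = "TCATGA"  # BspHI recognition sequence; the ATG lies inside the site

FORWARD_PRIMERS = {
    "SEQ ID NO: 6": "ATCGTCATGACCAAGCAATCTG",
    "SEQ ID NO: 7": "ATCGTCATGACCAAGCAATCTGCGGACA",
}

# First 30 nucleotides of the CAD sequence as given in the listing.
CAD_START = "ATGACCAAGCAATCTGCGGACAGCAACGCA"


def annealing_part(primer: str, site: str = BSPHI_SITE) -> str:
    """Return the primer from the ATG embedded in the restriction site onward."""
    position = primer.find(site)
    if position < 0:
        raise ValueError("primer does not contain the restriction site")
    return primer[position + site.index("ATG"):]


for name, primer in FORWARD_PRIMERS.items():
    tail = annealing_part(primer)
    matches = CAD_START.startswith(tail)
    print(f"{name}: {len(tail)} nt anneal to the CAD start, match = {matches}")
```

Placing the ATG inside the restriction site means that cutting the PCR product with BspHI leaves the start codon at the very end of the fragment, so the coding region can be joined directly to the expression signals of the vector.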

(SEQ ID NO: 6)
BspHI cadfor 40°C 5'-ATCGTCATGACCAAGCAATCTG-3'
(SEQ ID NO: 7)
BspHI cadfor 53°C 5'-ATCGTCATGACCAAGCAATCTGCGGACA-3'

CAD full sequence 1529 bp
(SEQ ID NO: 4, without intron, 1473 bp)
ORIGIN

  1 ATGACCAAGC AATCTGCGGA CAGCAACGCA AAGTCAGGAG TTACGTCCGA AATATGTCAT
 61 TGGGCATCCA ACCTGGCCAC TGACGACATC CCTTCGGACG TATTAGAAAG AGCAAAATAC
121 CTTATTCTCG ACGGTATTGC ATGTGCCTGG GTTGGTGCAA GAGTGCCTTG GTCAGAGAAG
181 TATGTTCAGG CAACGATGAG CTTTGAGCCG CCGGGGGCCT GCAGGGTGAT TGGATATGGA

241 CAGotaaatt ttatt cactic tagacggtoc acaaagtata ctacgatcc ttctatagA. (intron)

301 AACTGGGGCC TGTTGCAGCA GCCATGACCA ATTCCGCTTT CATACAGGCT ACGGAGCTTG
361 ACGACTACCA CAGCGAAGCC CCCCTACACT CTGCAAGCAT TGTCCTTCCT GCGGTCTTTG
421 CAGCAAGTGA GGTCTTAGCC GAGCAGGGCA AAACAATTTC CGGTATAGAT GTTATTCTAG
481 CCGCCATTGT GGGGTTTGAA TCTGGCCCAC GGATCGGCAA AGCAATCTAC GGATCGGACC
541 TCTTGAACAA CGGCTGGCAT TGTGGAGCTG TGTATGGCGC TCCAGCCGGT GCGCTGGCCA
601 CAGGAAAGCT CTTCGGTCTA ACTCCAGACT CCATGGAAGA TGCTCTCGGA ATTGCGTGCA
661 CGCAAGCCTG TGGTTTAATG TCGGCGCAAT ACGGAGGCAT GGTAAAGCGT GTGCAACACG
721 GATTCGCAGC GCGTAATGGT CTTCTTGGGG GACTGTTGGC CCATGGTGGG TACGAGGCAA
781 TGAAAGGTGT CCTGGAGAGA TCTTACGGCG GTTTCCTCAA GATGTTCACC AAGGGCAACG
841 GCAGAGAGCC TCCCTACAAA GAGGAGGAAG TGGTGGCTGG TCTCGGTTCA TTCTGGCATA
901 CCTTTACTAT TCGCATCAAG CTCTATGCCT GCTGCGGACT TCTCCATGGT CCAGTCGAGG

The resulting BspHI-BamHI fragment was cloned into the Aspergillus expression vector pAN52-4-amdS, based on Aspergillus expression vector pAN52-4. The Aspergillus expression vector pAN52-4-amdS was derived by cloning the Aspergillus selection marker amdS into the Aspergillus expression vector pAN52-4 (EMBL accession #Z32699).

Subsequently, an Aspergillus niger strain AB 1.13 (Mattern, I. E. et al., 1992, Mol. Gen. Genet. 234:332-336) was transformed with the CAD expression vector. AmdS transformants resulting from this experiment were purified by single colony purification and retested for their AmdS+ phenotype.

Co-Expression of the CAD Gene and the ATEG 09970.1 Gene in Aspergillus niger

The ATEG 09970.1 gene (MTT) was synthesized (GeneArt®) and cloned into the Aspergillus niger expression vector pAN52-5doubleNotI by restriction enzyme cutting sites of double NotI. The expression vector pAN52-5doubleNotI was derived by adding an extra NotI site in the Aspergillus expression vector pAN52-4 (EMBL accession #Z32699). Moreover, the codons of the clone were optimized for expression in the Aspergillus niger strain.

Subsequently, an Aspergillus niger strain AB 1.13 (Mattern, I. E. et al., 1992, Mol. Gen. Genet. 234:332-336) was co-transformed with the CAD expression vector and the MTT expression vector. AmdS transformants resulting from this experiment were purified by single colony purification and retested for their AmdS+ phenotype.

Analysis of A. niger Transformants for Itaconic Acid Production

Several positive transformants and the parental host strain were subsequently cultured in shake flasks in MM medium supplied with uridine, containing glucose as C-source and nitrate as N-source. Medium samples from the various cultures were analyzed by HPLC for the presence of itaconic acid (Table 7).

Shake Flask Medium Compositions:

Per liter: 0.52 g of KCl, 2.4 g of NaNO3, 1.56 g of KH2PO4, 0.24 g of MgSO4*7H2O, 5 mg of Fe(III)SO4*7H2O, 5 mg of MnCl2*4H2O, 0.022 g of ZnSO4*7H2O, 0.011 g of H3BO3, 1.7 mg of CoCl2*6H2O and 2.44 g of uridine, 100 g of glucose as a carbon source. All media were prepared in demineralised water.
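The per-liter recipe above can be scaled to an arbitrary flask volume. The following is a minimal sketch, not part of the disclosed method: the dictionary keys and the function name `scale_medium` are illustrative, and the chemical formulas are written in their conventional subscripted form on the assumption that the printed subscripts were lost.

```python
# Per-liter amounts in grams, transcribed from the shake flask medium recipe.
# Assumption: formulas are given in their conventional form (e.g. NaNO3, KH2PO4).
MEDIUM_G_PER_L = {
    "KCl": 0.52,
    "NaNO3": 2.4,
    "KH2PO4": 1.56,
    "MgSO4*7H2O": 0.24,
    "Fe(III)SO4*7H2O": 0.005,   # 5 mg
    "MnCl2*4H2O": 0.005,        # 5 mg
    "ZnSO4*7H2O": 0.022,
    "H3BO3": 0.011,
    "CoCl2*6H2O": 0.0017,       # 1.7 mg
    "uridine": 2.44,
    "glucose": 100.0,
}


def scale_medium(volume_l):
    """Return the grams of each component needed for volume_l liters."""
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    return {name: grams * volume_l for name, grams in MEDIUM_G_PER_L.items()}


if __name__ == "__main__":
    # Example: amounts for a hypothetical 250 ml shake flask culture.
    for name, grams in scale_medium(0.25).items():
        print(f"{name:>16}: {grams:.5f} g")
```

The amounts are kept in grams throughout so that the milligram-scale trace elements and the gram-scale components share one unit.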

(SEQ ID NOS: 10-11)
Translation of MTT cds (1-861)
Universal code
Total amino acid number: 286, MW = 31503
Max ORF starts at AA pos 1 (may be DNA pos 1) for 286 AA (858 bases), MW = 31503

   1 ATGAGTATTCAACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCT
   1 M  S  I  Q  H  F  R  V  A  L  I  P  F  F  A  A  F  C  L  P

  61 GTTTTTGCTCACCCAGAAACGCTGGTGAAAGTAAAAGATGCTGAAGATCAGTTGGGTGCA
  21 V  F  A  H  P  E  T  L  V  K  V  K  D  A  E  D  Q  L  G  A

 121 CGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGATCCTTGAGAGTTTTCGCCCC
  41 R  V  G  Y  I  E  L  D  L  N  S  G  K  I  L  E  S  F  R  P

 181 GAAGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATCC
  61 E  E  R  F  P  M  M  S  T  F  K  V  L  L  C  G  A  V  L  S

 241 CGTATTGACGCCGGGCAAGAGCAACTCGGTCGCCGCATACACTATTCTCAGAATGACTTG
  81 R  I  D  A  G  Q  E  Q  L  G  R  R  I  H  Y  S  Q  N  D  L

 301 GTTGAGTACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAATTA
 101 V  E  Y  S  P  V  T  E  K  H  L  T  D  G  M  T  V  R  E  L

 361 TGCAGTGCTGCCATAACCATGAGTGATAACACTGCGGCCAACTTACTTCTGACAACGATC
 121 C  S  A  A  I  T  M  S  D  N  T  A  A  N  L  L  L  T  T  I

 421 GGAGGACCGAAGGAGCTAACCGCTTTTTTGCACAACATGGGGGATCATGTAACTCGCCTT
 141 G  G  P  K  E  L  T  A  F  L  H  N  M  G  D  H  V  T  R  L

 481 GATCGTTGGGAACCGGAGCTGAATGAAGCCATACCAAACGACGAGCGTGACACCACGATG
 161 D  R  W  E  P  E  L  N  E  A  I  P  N  D  E  R  D  T  T  M

 541 CCTGTAGCAATGGCAACAACGTTGCGCAAACTATTAACTGGCGAACTACTTACTCTAGCT
 181 P  V  A  M  A  T  T  L  R  K  L  L  T  G  E  L  L  T  L  A

 601 TCCCGGCAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAGGACCACTTCTGCGC
 201 S  R  Q  Q  L  I  D  W  M  E  A  D  K  V  A  G  P  L  L  R

 661 TCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGCGTGGGTCT
 221 S  A  L  P  A  G  W  F  I  A  D  K  S  G  A  G  E  R  G  S

 721 CGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTATCTAC
 241 R  G  I  I  A  A  L  G  P  D  G  K  P  S  R  I  V  V  I  Y

 781 ACGACGGGGAGTCAGGCAACTATGGATGAACGAAATAGACAGATCGCTGAGATAGGTGCC
 261 T  T  G  S  Q  A  T  M  D  E  R  N  R  Q  I  A  E  I  G  A

 841 TCACTGATTAAGCATTGGTAA
 281 S  L  I  K  H  W  *

HPLC analysis was performed with a reversed phase column, using a Develosil™ 3 µm RP-Aqueous C30 140A column at a constant temperature of 25° C., with elution with 20 mM NaH2PO4, pH 2.25 and acetonitrile. Compounds were detected by UV at 210 nm using a Waters 2487 Dual wavelength Absorbance detector (Milford, Mass., USA). Retention time of itaconic acid was 18.82 min.

TABLE 7

Itaconic acid concentration in the culture fluid of the A. niger AB1.13 transformants cultivated in shake flasks.

Aspergillus niger AB 1.13 transformants (AB 1.13 CAD

strain               code     time (hrs)   itaconic acid (mg/g wet weight)
AB 1.13 WT                    54           0
AB 1.13 CAD          5.1      54           1.0
AB 1.13 CAD          7.2      54           0.7
AB 1.13 CAD          10.1     54           1.4
AB 1.13 CAD          14.2     54           1.2
AB 1.13 CAD          16.1     54           1.2
AB 1.13 CAD + MTT    4.1      54           1.3
AB 1.13 CAD + MTT    6.2      54           1.5
AB 1.13 CAD + MTT    2.2.1    54           2.2

No itaconic acid was detected in the supernatant of the parental strain while in the culture fluid of the strains containing the CAD gene (strains marked CAD), itaconic acid was detected (Table 7).

In both the culture fluid of the strains containing the CAD gene and the strains containing both the CAD gene and MTT gene (strains marked CAD+MTT), itaconic acid was detected. In at least 2 of the MTT expressing strains more itaconic acid was produced in the culture fluid than in the strains expressing only the CAD gene. Moreover, the average itaconic acid concentration was higher in the culture fluid of the strains expressing both the CAD and the MTT gene than in the strains expressing the CAD gene only (1.7 versus 1.1 mg itaconic acid/g mycelial wet weight).
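The averages quoted above (1.7 versus 1.1 mg/g) can be recomputed from the individual Table 7 values. The sketch below is illustrative only: the group labels, the tuple layout and the function names are assumptions made for this example, and the values are those printed in Table 7.

```python
from statistics import mean

# (group, transformant code, itaconic acid in mg/g wet weight) from Table 7.
TABLE_7 = [
    ("WT", "", 0.0),
    ("CAD", "5.1", 1.0),
    ("CAD", "7.2", 0.7),
    ("CAD", "10.1", 1.4),
    ("CAD", "14.2", 1.2),
    ("CAD", "16.1", 1.2),
    ("CAD+MTT", "4.1", 1.3),
    ("CAD+MTT", "6.2", 1.5),
    ("CAD+MTT", "2.2.1", 2.2),
]


def group_means(rows):
    """Average itaconic acid level per strain group."""
    groups = {}
    for group, _code, value in rows:
        groups.setdefault(group, []).append(value)
    return {group: mean(values) for group, values in groups.items()}


def count_above_best_cad(rows):
    """Number of CAD+MTT transformants exceeding the best CAD-only strain."""
    best_cad = max(v for g, _c, v in rows if g == "CAD")
    return sum(1 for g, _c, v in rows if g == "CAD+MTT" and v > best_cad)


if __name__ == "__main__":
    for group, value in group_means(TABLE_7).items():
        print(f"{group:>8}: {value:.1f} mg/g")
    print("CAD+MTT strains above best CAD strain:", count_above_best_cad(TABLE_7))
```

With the tabulated values, the group means round to 1.1 (CAD) and 1.7 (CAD+MTT), and two CAD+MTT transformants exceed the best CAD-only transformant, in line with the statements in the text.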

SEQUENCE LISTING

<160> NUMBER OF SEQ ID NOS: 13

<210> SEQ ID NO 1
<211> LENGTH: 1299
<212> TYPE: DNA
<213> ORGANISM: Aspergillus terreus
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (1)...(1299)
<223> OTHER INFORMATION: ATEG 09970.1 genomic sequence

<4 OOs SEQUENCE: 1 atg gac tot aaa at C cag a caaatgttcc attaccaaag go acccc.tta 49 to caaaaagc cc.gtgggaag cqtgtatgtg titcc ttcttg gtcgc.gcgtg ggc.catgtta 109 ctgacaatct cittitt cittaa tatatgtaca gacgaaaggc attcctgcat tdgttgcggg 169

tgcttgtgct gggg.cagttgaaatcto cat caccitaccct titcqaatgttg agctitt cotg 229 tgtttalagag ttctgctitta cc.gtggcc.gc caactgacag to tattgctt cqgctgg tag 289

cggctaaaac togcgcc cag cittaa.gcggc gaalaccatga tigtggcagot at aaaacctg 349 gaatc.cgagg ctgg tatgct ggg tatggag ccaccttggit aggalaccaca gtgaaagc ct 4 O9

cc.gttcgitat gtagcgatcc ccttctaagc cagcgtggag caaaaggala taccgtttg 469

caataacaaa cagaatttgc ct cattcaat atttatcgct cqgc cctitt c ggg.cccaaat 529 ggaga.gctict Caactggagc titc.cgt.cct g gctgggtttg gggctggcgt gaccgaggct 589

gtcttagc.cg talaccc.ca.gc ggaggcgat C aaga caaaaa totaagttgc aa catctoac 649 ccgittatccg accagttctt aattic gttct cittagcattg atgcaaggaa ggttggaaat 7 O9

gcagagittaa gtacgactitt toggcgcgata gctgggatcc titcgagat.cg gggaccgctt 769

ggatt Cttct ctg.cggttgg to ctacaatt ttgcggcagt cct c caatgc ggcagtgaag 829 tt cactgttt at aacgaact tattgggctg gcc.cgaaaat actic galagala tigcgalagac 889 gtgcaccCt c togcaa.gcac cttggtcggit totgttact g gagtttgctg cgc.ctggit cq 949

acacagccac tigacgtgat calagacacgg talagtag tic to agat.cgac agta acacgc 1009


21 O 215 22O tgc tigc gcc tig ticg aca Cag cca citg gac gitg atc aag aca ca atg 72 O Cys Cys Ala Trp Ser Thr Gln Pro Leu Asp Val Ile Llys Thr Arg Met 225 23 O 235 24 O caa tot citt cag goa aga caa citg tac gga aat acc titc aac togc gtg 768 Glin Ser Lieu. Glin Ala Arg Glin Lieu. Tyr Gly Asn. Thir Phe Asn. Cys Val 245 250 255 aaa aca. Ct c ctg. c9c agt gaa ggc att ggc gtt ttctgg to C ggit gt C 816 Llys Thr Lieu. Lieu. Arg Ser Glu Gly Ile Gly Val Phe Trp Ser Gly Val 26 O 265 27 O tgg titt cqg aca ggg aga citt to c citt acc tog gcc atc atg titt coc 864 Trp Phe Arg Thr Gly Arg Lieu Ser Lieu. Thir Ser Ala Ile Met Phe Pro 27s 28O 285 gtc tac gag aaa gtc tac aag titc titg acg caa cca aac tda 906 Val Tyr Glu Lys Val Tyr Lys Phe Lieu. Thr Gln Pro Asn 29 O 295 3 OO

<210> SEQ ID NO 3
<211> LENGTH: 301
<212> TYPE: PRT
<213> ORGANISM: Aspergillus terreus
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (1)...(301)
<223> OTHER INFORMATION: translation of ATEG 09970.1

<400> SEQUENCE: 3

Met Asp Ser Lys Ile Gln Thr Asn Val Pro Leu Pro Lys Ala Pro Leu
1               5                   10                  15

Ile Gln Lys Ala Arg Gly Lys Arg Thr Lys Gly Ile Pro Ala Leu Val
            20                  25                  30

Ala Gly Ala Cys Ala Gly Ala Val Glu Ile Ser Ile Thr Tyr Pro Phe
        35                  40                  45

Glu Ser Ala Lys Thr Arg Ala Gln Leu Lys Arg Arg Asn His Asp Val
    50                  55                  60

Ala Ala Ile Lys Pro Gly Ile Arg Gly Trp Tyr Ala Gly Tyr Gly Ala
65                  70                  75                  80

Thr Leu Val Gly Thr Thr Val Lys Ala Ser Val Gln Phe Ala Ser Phe
                85                  90                  95

Asn Ile Tyr Arg Ser Ala Leu Ser Gly Pro Asn Gly Glu Leu Ser Thr
            100                 105                 110

Gly Ala Ser Val Leu Ala Gly Phe Gly Ala Gly Val Thr Glu Ala Val
        115                 120                 125

Leu Ala Val Thr Pro Ala Glu Ala Ile Lys Thr Lys Ile Ile Asp Ala
    130                 135                 140

Arg Lys Val Gly Asn Ala Glu Leu Ser Thr Thr Phe Gly Ala Ile Ala
145                 150                 155                 160

Gly Ile Leu Arg Asp Arg Gly Pro Leu Gly Phe Phe Ser Ala Val Gly
                165                 170                 175

Pro Thr Ile Leu Arg Gln Ser Ser Asn Ala Ala Val Lys Phe Thr Val
            180                 185                 190

Tyr Asn Glu Leu Ile Gly Leu Ala Arg Lys Tyr Ser Lys Asn Gly Glu
        195                 200                 205

Asp Val His Pro Leu Ala Ser Thr Leu Val Gly Ser Val Thr Gly Val
    210                 215                 220

Cys Cys Ala Trp Ser Thr Gln Pro Leu Asp Val Ile Lys Thr Arg Met
225                 230                 235                 240


<222> LOCATION: (1)...(490)
<223> OTHER INFORMATION: translation of ATEG 09971.1

<400> SEQUENCE: 5

Met Thr Lys Gln Ser Ala Asp Ser Asn Ala Lys Ser Gly Val Thr Ala
1               5                   10                  15

Glu Ile Cys His Trp Ala Ser Asn Leu Ala Thr Asp Asp Ile Pro Ser
            20                  25                  30

Asp Val Leu Glu Arg Ala Lys Tyr Leu Ile Leu Asp Gly Ile Ala Cys
        35                  40                  45

Ala Trp Val Gly Ala Arg Val Pro Trp Ser Glu Lys Tyr Val Gln Ala
    50                  55                  60

Thr Met Ser Phe Glu Pro Pro Gly Ala Cys Arg Val Ile Gly Tyr Gly
65                  70                  75                  80

Gln Lys Leu Gly Pro Val Ala Ala Ala Met Thr Asn Ser Ala Phe Ile
                85                  90                  95

Gln Ala Thr Glu Leu Asp Asp Tyr His Ser Glu Ala Pro Leu His Ser
            100                 105                 110

Ala Ser Ile Val Leu Pro Ala Val Phe Ala Ala Ser Glu Val Leu Ala
        115                 120                 125

Glu Gln Gly Lys Thr Ile Ser Gly Ile Asp Val Ile Leu Ala Ala Ile
    130                 135                 140

Val Gly Phe Glu Ser Gly Pro Arg Ile Gly Lys Ala Ile Tyr Gly Ser
145                 150                 155                 160

Asp Leu Leu Asn Asn Gly Trp His Cys Gly Ala Val Tyr Gly Ala Pro
                165                 170                 175

Ala Gly Ala Leu Ala Thr Gly Lys Leu Leu Gly Leu Thr Pro Asp Ser
            180                 185                 190

Met Glu Asp Ala Leu Gly Ile Ala Cys Thr Gln Ala Cys Gly Leu Met
        195                 200                 205

Ser Ala Gln Tyr Gly Gly Met Val Lys Arg Val Gln His Gly Phe Ala
    210                 215                 220

Ala Arg Asn Gly Leu Leu Gly Gly Leu Leu Ala Tyr Gly Gly Tyr Glu
225                 230                 235                 240

Ala Met Lys Gly Val Leu Glu Arg Ser Tyr Gly Gly Phe Leu Lys Met
                245                 250                 255

Phe Thr Lys Gly Asn Gly Arg Glu Pro Pro Tyr Lys Glu Glu Glu Val
            260                 265                 270

Val Ala Gly Leu Gly Ser Phe Trp His Thr Phe Thr Ile Arg Ile Lys
        275                 280                 285

Leu Tyr Ala Cys Cys Gly Leu Val His Gly Pro Val Glu Ala Ile Glu
    290                 295                 300

Lys Leu Gln Arg Arg Tyr Pro Glu Leu Leu Asn Arg Ala Asn Leu Ser
305                 310                 315                 320

Asn Ile Arg His Val Tyr Val Gln Leu Ser Thr Ala Ser Asn Ser His
                325                 330                 335

Cys Gly Trp Ile Pro Glu Glu Arg Pro Ile Ser Ser Ile Ala Gly Gln
            340                 345                 350

Met Ser Val Ala Tyr Ile Leu Ala Val Gln Leu Val Asp Gln Gln Cys
        355                 360                 365

Leu Leu Ala Gln Phe Ser Glu Phe Asp Asp Asn Leu Glu Arg Pro Glu
    370                 375                 380

Val Trp Asp Leu Ala Arg Lys Val Thr Pro Ser His Ser Glu Glu Phe
385                 390                 395                 400

Asp Gln Asp Gly Asn Cys Leu Ser Ala Gly Arg Val Arg Ile Glu Phe
                405                 410                 415

Asn Asp Gly Ser Ser Val Thr Glu Thr Val Glu Lys Pro Leu Gly Val
            420                 425                 430

Lys Glu Pro Met Pro Asn Glu Arg Ile Leu His Lys Tyr Arg Thr Leu
        435                 440                 445

Ala Gly Ser Val Thr Asp Glu Ser Arg Val Lys Glu Ile Glu Asp Leu
    450                 455                 460

Val Leu Ser Leu Asp Arg Leu Thr Asp Ile Thr Pro Leu Leu Glu Leu
465                 470                 475                 480

Leu Asn Cys Pro Val Lys Ser Pro Leu Val
                485                 490

<210> SEQ ID NO 6
<211> LENGTH: 22
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: primer

<400> SEQUENCE: 6
atcgtcatga ccaagcaatc tg 22

<210> SEQ ID NO 7
<211> LENGTH: 28
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: primer

<400> SEQUENCE: 7
atcgtcatga ccaagcaatc tcggaca 28

<210> SEQ ID NO 8
<211> LENGTH: 28
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: reverse primer

<400> SEQUENCE: 8
tttagcggtg accatattcc taggccct 28

<210> SEQ ID NO 9
<211> LENGTH: 33
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: reverse primer

<400> SEQUENCE: 9
ggcattttag cggtgaccat attcctaggc ccc 33

<210> SEQ ID NO 10
<211> LENGTH: 861
<212> TYPE: DNA
<213> ORGANISM: Aspergillus niger
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (1)...(861)
<223> OTHER INFORMATION: ATEG 09970.1
<220> FEATURE:
<221> NAME/KEY: CDS
<222> LOCATION: (1)...(861)
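An 861-nucleotide CDS corresponds to 286 codons plus one stop codon, which agrees with the 286 residues stated for SEQ ID NO 11. The sketch below checks this arithmetic and translates the first 60 and last 21 nucleotides printed in the translation of SEQ ID NOS: 10-11, using the standard ("universal") genetic code. The function and variable names are illustrative assumptions, not part of the disclosure.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Nucleotides 1-60 and 841-861 as printed in the translation block.
FIRST_60 = "ATGAGTATTCAACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCT"
LAST_21 = "TCACTGATTAAGCATTGGTAA"

if __name__ == "__main__":
    print(translate(FIRST_60))   # N-terminal 20 residues
    print(translate(LAST_21))    # C-terminal residues followed by the stop (*)
    print((861 - 3) // 3)        # coding codons in an 861-nt CDS with one stop
```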

<213> ORGANISM: Aspergillus niger
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (1)...(286)
<223> OTHER INFORMATION: translation of ATEG 09970.1

<400> SEQUENCE: 11

Met Ser Ile Gln His Phe Arg Val Ala Leu Ile Pro Phe Phe Ala Ala
1               5                   10                  15

Phe Cys Leu Pro Val Phe Ala His Pro Glu Thr Leu Val Lys Val Lys
            20                  25                  30

Asp Ala Glu Asp Gln Leu Gly Ala Arg Val Gly Tyr Ile Glu Leu Asp
        35                  40                  45

Leu Asn Ser Gly Lys Ile Leu Glu Ser Phe Arg Pro Glu Glu Arg Phe
    50                  55                  60

Pro Met Met Ser Thr Phe Lys Val Leu Leu Cys Gly Ala Val Leu Ser
65                  70                  75                  80

Arg Ile Asp Ala Gly Gln Glu Gln Leu Gly Arg Arg Ile His Tyr Ser
                85                  90                  95

Gln Asn Asp Leu Val Glu Tyr Ser Pro Val Thr Glu Lys His Leu Thr
            100                 105                 110

Asp Gly Met Thr Val Arg Glu Leu Cys Ser Ala Ala Ile Thr Met Ser
        115                 120                 125

Asp Asn Thr Ala Ala Asn Leu Leu Leu Thr Thr Ile Gly Gly Pro Lys
    130                 135                 140

Glu Leu Thr Ala Phe Leu His Asn Met Gly Asp His Val Thr Arg Leu
145                 150                 155                 160

Asp Arg Trp Glu Pro Glu Leu Asn Glu Ala Ile Pro Asn Asp Glu Arg
                165                 170                 175

Asp Thr Thr Met Pro Val Ala Met Ala Thr Thr Leu Arg Lys Leu Leu
            180                 185                 190

Thr Gly Glu Leu Leu Thr Leu Ala Ser Arg Gln Gln Leu Ile Asp Trp
        195                 200                 205

Met Glu Ala Asp Lys Val Ala Gly Pro Leu Leu Arg Ser Ala Leu Pro
    210                 215                 220

Ala Gly Trp Phe Ile Ala Asp Lys Ser Gly Ala Gly Glu Arg Gly Ser
225                 230                 235                 240

Arg Gly Ile Ile Ala Ala Leu Gly Pro Asp Gly Lys Pro Ser Arg Ile
                245                 250                 255

Val Val Ile Tyr Thr Thr Gly Ser Gln Ala Thr Met Asp Glu Arg Asn
            260                 265                 270

Arg Gln Ile Ala Glu Ile Gly Ala Ser Leu Ile Lys His Trp
        275                 280                 285
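The sequence listing uses three-letter residue codes, whereas the translation of SEQ ID NOS: 10-11 in the description uses one-letter codes. The sketch below converts between the two so the listings can be compared; the helper name `three_to_one` is an illustrative assumption, and the example input is the first 20 residues of SEQ ID NO 11.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def three_to_one(residues):
    """Convert whitespace-separated three-letter codes to a one-letter string."""
    try:
        return "".join(THREE_TO_ONE[token] for token in residues.split())
    except KeyError as err:
        raise ValueError(f"unknown residue code: {err.args[0]}") from None


# First 20 residues of SEQ ID NO 11 as given in the sequence listing.
SEQ11_START = (
    "Met Ser Ile Gln His Phe Arg Val Ala Leu "
    "Ile Pro Phe Phe Ala Ala Phe Cys Leu Pro"
)

if __name__ == "__main__":
    print(three_to_one(SEQ11_START))
```

Unknown tokens raise an error rather than being skipped, so a misspelled residue code in a transcribed listing is reported instead of silently shortening the sequence.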

<210> SEQ ID NO 12
<211> LENGTH: 1212
<212> TYPE: DNA
<213> ORGANISM: Aspergillus terreus
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (1)...(1212)
<223> OTHER INFORMATION: ATEG 09972.1
<220> FEATURE:
<221> NAME/KEY: CDS
<222> LOCATION: (1)...(1212)

<4 OOs, SEQUENCE: 12 atg ggc cac ggit gac act gag to C ccg aac cca acg acg acc acg gala Met Gly His Gly Asp Thr Glu Ser Pro Asn Pro Thr Thr Thr Thr Glu

US 9,290,772 B2 45 46 - Continued Lys Arg Gly Phe Arg Lieu Pro Glin Asp Arg Lieu. His Ser Gly Lieu. Ile 3.25 330 335 aca ttgttc gcc gtg ctg. CCC gca gga acg ctic att tac ggg tog aca O56 Thr Lieu Phe Ala Val Lieu Pro Ala Gly Thr Lieu. Ile Tyr Gly Trp Thr 34 O 345 35. O

Ct c caa gag gat aag ggt gat atg gta gtg ccc ata atc gcg gCd tt C 104 Lieu. Glin Glu Asp Llys Gly Asp Met Val Val Pro Ile Ile Ala Ala Phe 355 360 365 ttic gcg ggc tigg ggg Ctc atg ggc agt titt aac to Ctgaac act tac 152 Phe Ala Gly Trp Gly Lieu Met Gly Ser Phe Asn Cys Lieu. Asn Thr Tyr 37 O 375 38O gtg gct ggit ttgttc. cac acc ct c att tat cita t t c cct ttg tdt aca 2OO Val Ala Gly Lieu Phe His Thr Lieu. Ile Tyr Lieu Phe Pro Leu. Cys Thr 385 390 395 4 OO tgc cca caa taa 212 Cys Pro Glin

<210> SEQ ID NO 13
<211> LENGTH: 403
<212> TYPE: PRT
<213> ORGANISM: Aspergillus terreus
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (1)...(403)
<223> OTHER INFORMATION: translation of ATEG 09972.1

<400> SEQUENCE: 13

Met Gly His Gly Asp Thr Glu Ser Pro Asn Pro Thr Thr Thr Thr Glu
1               5                   10                  15

Gly Ser Gly Gln Asn Glu Pro Glu Lys Lys Gly Arg Asp Ile Pro Leu
            20                  25                  30

Trp Arg Lys Cys Val Ile Thr Phe Val Val Ser Trp Met Thr Leu Val
        35                  40                  45

Val Thr Phe Ser Ser Thr Cys Leu Leu Pro Ala Ala Pro Glu Ile Ala
    50                  55                  60

Asn Glu Phe Asp Met Thr Val Glu Thr Ile Asn Ile Ser Asn Ala Gly
65                  70                  75                  80

Val Leu Val Ala Met Gly Tyr Ser Ser Leu Ile Trp Gly Pro Met Asn
                85                  90                  95

Lys Leu Val Gly Arg Arg Thr Ser Tyr Asn Leu Ala Ile Ser Met Leu
            100                 105                 110

Cys Ala Cys Ser Ala Gly Thr Ala Ala Ala Ile Asn Glu Glu Met Phe
        115                 120                 125

Ile Ala Phe Arg Val Leu Ser Gly Leu Thr Gly Thr Ser Phe Met Val
    130                 135                 140

Ser Gly Gln Thr Val Leu Ala Asp Ile Phe Glu Pro Val Tyr Arg Gly
145                 150                 155                 160

Thr Ala Val Gly Phe Phe Met Ala Gly Thr Leu Ser Gly Pro Ala Ile
                165                 170                 175

Gly Pro Cys Val Gly Gly Val Ile Val Thr Phe Thr Ser Trp Arg Val
            180                 185                 190

Ile Phe Trp Leu Gln Leu Gly Met Ser Gly Leu Gly Leu Val Leu Ser
        195                 200                 205

Leu Leu Phe Phe Pro Lys Ile Glu Gly Asn Ser Glu Lys Val Ser Thr
    210                 215                 220

Ala Phe Lys Pro Thr Thr Leu Val Thr Ile Ile Ser Lys Phe Ser Pro
225                 230                 235                 240

Thir Asp Wall Lieu Lys Glin Trp Wall Pro Asn Wall Phe Lieu. Ala Asp 245 250 255

Lell Gly Lieu. Lieu. Ala Ile Thr Glin Tyr Ser Ile Luell Thir Ser 26 O 265 27 O

Ala Arg Ala Ile Phe Asn Ser Arg Phe His Lieu. Th Thir Ala Lieu Wall 28O 285

Ser Gly Lieu. Phe Lell Ala Pro Gly Ala Gly Phe Lell Ile Gly Ser 29 O 295 3 OO

Lell Wall Gly Gly Luell Ser Asp Arg Thir Wall Arg Arg Ile Wall 3. OS 310 315

Lys Arg Gly Phe Arg Leu Pro Gln Asp Arg Leu His Ser Gly Leu Ile
                325                 330                 335

Thr Leu Phe Ala Val Leu Pro Ala Gly Thr Leu Ile Tyr Gly Trp Thr
            340                 345                 350

Leu Gln Glu Asp Lys Gly Asp Met Val Val Pro Ile Ile Ala Ala Phe
        355                 360                 365

Phe Ala Gly Trp Gly Leu Met Gly Ser Phe Asn Cys Leu Asn Thr Tyr
    370                 375                 380

Val Ala Gly Leu Phe His Thr Leu Ile Tyr Leu Phe Pro Leu Cys Thr
385                 390                 395                 400

Cys Pro Gln

The invention claimed is:

1. A host cell which has been modified to contain a heterologous gene encoding a protein that transports di/tricarboxylate from the mitochondrion to the cytosol, wherein said protein has an amino acid sequence at least 95% identical to ATEG 09970.1 of SEQ ID NO:3.

2. The host cell of claim 1, wherein said protein is a tricarboxylate transporter.

3. The host cell of claim 1, wherein said protein transports cis-aconitate, citrate or isocitrate.

4. The host cell of claim 1, wherein said host cell is from of a citrate producing microorganism.

5. The host cell of claim 1, wherein said gene comprises
1) a nucleic acid sequence encoding a mitochondrial tricarboxylic acid transporter from A. terreus, A. niger, A. itaconicus, A. nidulans, A. oryzae, or A. fumigates, or
2) a nucleic acid sequence which encodes the amino acid sequence of ATEG 09970.1 of SEQ ID NO:3.

6. The host cell of claim 1, wherein a nucleic acid encoding the enzyme cis-aconitic acid decarboxylase (CAD) and/or a nucleic acid encoding a Major Facilitator Superfamily (MFS) transporter is co-introduced.

7. The host cell of claim 4, wherein the citrate producing micro-organism is A. terreus, A. niger, A. itaconicus, A. nidulans, A. oryzae, A. fumigates, Yarrowia lipolytica, Ustilago zeae, Candida sp., Rhodotorula sp., Pseudozyma antarctica, E. coli, or Saccharomyces cerevisiae.

8. The host cell of claim 1, which is of a lovastatin producing microorganism.

9. The host cell of claim 6, wherein the CAD is encoded by the nucleotide sequence comprised in ATEG 09971.1 (SEQ ID NO:5).

10. The host cell of claim 6, wherein the MFS transporter is encoded by the nucleotide sequence comprised in ATEG 09972.1 (SEQ ID NO:13).

11. The host cell of claim 8, wherein the lovastatin producing micro-organism is from Monascus spp., Penicillium spp., Hypomyces spp., Doratomyces spp., Phoma spp., Eupenicillium spp., Gymnoascus spp., Pichia labacensis, Candida cariosilognicola, Paecilomyces varioti, Scopulariopsis brevicaulis or Trichoderma spp.

12. The host cell of claim 7, which is of A. terreus or A. niger.

13. The host cell of claim 1, wherein the encoding nucleotide sequence is expressed from a vector comprising a promoter capable of driving expression of said sequence.

* * * * *