
(12) United States Patent (10) Patent No.: US 8,628,943 B2 Reeves et al. (45) Date of Patent: *Jan. 14, 2014

(54) GENES ENCODING KEY CATALYZING MECHANISMS FOR ETHANOL PRODUCTION FROM SYNGAS FERMENTATION

(75) Inventors: Andrew Reeves, Chicago, IL (US); Rathin Datta, Chicago, IL (US)

(73) Assignee: Coskata, Inc., Warrenville, IL (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 393 days. This patent is subject to a terminal disclaimer.

(21) Appl. No.: 12/802,560

(22) Filed: Jun. 9, 2010

(65) Prior Publication Data: US 2011/0008860 A1, Jan. 13, 2011

Related U.S. Application Data

(63) Continuation-in-part of application No. 12/336,278, filed on Dec. 16, 2008, now Pat. No. 8,039,239.

(51) Int. Cl.: C12P 7/06 (2006.01); C12N 9/00 (2006.01); C12N 1/00 (2006.01); C12N 1/20 (2006.01); C12N 15/00 (2006.01); C07H 21/04 (2006.01)

(52) U.S. Cl.: USPC 435/161; 435/183; 435/243; 435/252.3; 435/320.1; 536/23.2

(58) Field of Classification Search: None. See application file for complete search history.

Primary Examiner: Christian Fronda

(74) Attorney, Agent, or Firm: McDonnell Boehnen Hulbert & Berghoff LLP

(57) ABSTRACT

Gene sequences of key acetogenic clostridial species were sequenced and isolated. Genes of interest were identified, and functionality was established. Key genes of interest for metabolic catalyzing activity in clostridial species include a three-gene operon coding for CODH activity, a two-gene operon coding for PTA-ACK, and a novel acetyl coenzyme A reductase. The promoter regions of the two operons and the acetyl CoA reductase are manipulated to increase ethanol production.

6 Claims, 8 Drawing Sheets


[Figure 1 (Sheet 1 of 8): electron flow during syngas fermentation. Labels: Wood-Ljungdahl pathway 100; acetyl-CoA 102; CODH 104; H2ase 106; ferredoxin (oxidized/reduced) 108; NAD+/NADH 110; NADP+/NADPH 112; FOR 114. Methyl branch (6 e) and carbonyl branch (2 e) combine with CoA to give acetyl-CoA (8 e). Products: ethanol (12 e), acetate (8 e), butanol (24 e), butyrate (20 e).]

[Figure 2 (Sheet 2 of 8): Wood-Ljungdahl pathway 100 with methyl and carbonyl branches. Labels: CO2 202; FDH 204; methyl tetrahydrofolate 206; CO 208; 210; CH3-corrinoid protein; acetyl-CoA 102; acetyl-P 218; acetate 214; acetaldehyde; ethanol 216. Reaction steps carry gene identification numbers.]

[Figure 3 (Sheet 3 of 8): genetic map of the cooS, cooF, FOR operon in C. ragsdalei, C. ljungdahlii, and C. carboxidivorans.]

[Figures 4 to 8 (Sheets 4 to 8 of 8): drawing content did not survive text extraction.]
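The electron counts annotated on Figure 1 (ethanol 12 e, acetate 8 e, butanol 24 e, butyrate 20 e) can be turned into a rough substrate balance. The sketch below is illustrative only. It assumes that each CO or H2 molecule oxidized donates two electrons, and that CO is the sole carbon and electron source; the carbon counts per product and all function names are added here for illustration and do not come from the patent.

```python
# Electrons required per product, as annotated on Figure 1.
ELECTRONS_PER_PRODUCT = {"ethanol": 12, "acetate": 8, "butanol": 24, "butyrate": 20}

# Carbon atoms per product molecule (standard chemistry, not stated in the figure).
CARBONS_PER_PRODUCT = {"ethanol": 2, "acetate": 2, "butanol": 4, "butyrate": 4}

# Assumption: oxidizing one CO (to CO2) or one H2 (to 2 H+) yields two electrons.
ELECTRONS_PER_DONOR = 2


def donors_required(product: str) -> int:
    """Number of CO or H2 molecules needed to supply the electrons for one product molecule."""
    return ELECTRONS_PER_PRODUCT[product] // ELECTRONS_PER_DONOR


def co2_released_on_co(product: str) -> int:
    """Net CO2 released per product molecule when CO is the only carbon and electron source.

    Every CO consumed is oxidized to CO2, and the product's carbons are
    then fixed back from that CO2 pool.
    """
    return donors_required(product) - CARBONS_PER_PRODUCT[product]


if __name__ == "__main__":
    for name in ELECTRONS_PER_PRODUCT:
        print(f"{name}: {donors_required(name)} CO consumed, "
              f"{co2_released_on_co(name)} CO2 released")
```

For ethanol this reproduces the familiar overall balance of six CO consumed and four CO2 released per ethanol formed.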

GENES ENCODING KEY CATALYZING MECHANISMS FOR ETHANOL PRODUCTION FROM SYNGAS FERMENTATION

RELATED U.S. APPLICATION DATA

This application claims the benefit of and priority to U.S. patent application Ser. No. 12/336,278 filed Dec. 16, 2008 as a continuation-in-part application. The entirety of that application is incorporated by reference herein. The content of the sequence listing information recorded in computer readable form is identical to the compact disc sequence listing and, where applicable, includes no new matter, as required by 37 CFR 1.821(e), 1.821(f), 1.821(g), 1.825(b), or 1.825(d).

FIELD OF THE INVENTION

This invention relates to the cloning and expression of novel genetic sequences of microorganisms used in the biological conversion of CO, H2, and mixtures comprising CO and/or H2 to biofuel products.

BACKGROUND

Synthetic gas (syngas) is a mixture of carbon monoxide (CO) gas, carbon dioxide (CO2) gas, and hydrogen (H2) gas, and other volatile gases such as CH4, N2, NH3, H2S and other trace gases. Syngas is produced by gasification of various organic materials including biomass, organic waste, coal, petroleum, plastics, or other carbon containing materials, or reformed natural gas.

Acetogenic Clostridia microorganisms grown in an atmosphere containing syngas are capable of absorbing the syngas components CO, CO2, and H2 and producing aliphatic C2-C6 alcohols and aliphatic C2-C6 organic acids. These syngas components activate Wood-Ljungdahl metabolic pathway 100, shown in FIG. 1, which leads to the formation of acetyl coenzyme A 102, a key intermediate in the pathway. The enzymes activating Wood-Ljungdahl pathway 100 are carbon monoxide dehydrogenase (CODH) 104 and hydrogenase (H2ase) 106. These enzymes capture the electrons from the CO and H2 in the syngas and transfer them to ferredoxin 108, an iron-sulfur (FeS) electron carrier protein. Ferredoxin 108 is the main electron carrier in Wood-Ljungdahl pathway 100 in acetogenic Clostridia, primarily because the redox potential during syngas fermentation is very low (usually between -400 and -500 mV). Upon electron transfer, ferredoxin 108 changes its electronic state from Fe3+ to Fe2+. Ferredoxin-bound electrons are then transferred to cofactors NAD+ 110 and NADP+ 112 through the activity of ferredoxin oxidoreductases 114 (FORs). The reduced nucleotide cofactors (NADH and NADPH) are used for the generation of intermediate compounds in Wood-Ljungdahl pathway 100 leading to acetyl-CoA 102 formation.

Acetyl-CoA 102 formation through Wood-Ljungdahl pathway 100 is shown in greater detail in FIG. 2. Either CO2 202 or CO 208 provides substrate for the pathway. The carbon from CO2 202 is reduced to a methyl group through successive reductions, first to formate by formate dehydrogenase (FDH) 204, and then further to methyl tetrahydrofolate intermediate 206. The carbon from CO 208 is reduced to carbonyl group 210 by carbon monoxide dehydrogenase (CODH) 104 through a second branch of the pathway. The two carbon moieties are then condensed to acetyl-CoA 102 through the action of acetyl-CoA synthase (ACS) 212, which is part of a carbon monoxide dehydrogenase (CODH/ACS) complex. Acetyl-CoA 102 is the central metabolite in the production of C2-C6 alcohols and acids in acetogenic Clostridia.

Ethanol production from acetyl-CoA 102 is achieved via one of two possible paths. Aldehyde dehydrogenase facilitates the production of acetaldehyde, which is then reduced to ethanol by the action of primary alcohol dehydrogenases. In the alternative, in homoacetogenic microorganisms, an NADPH-dependent acetyl CoA reductase ("AR") facilitates the production of ethanol directly from acetyl CoA.

Wood-Ljungdahl pathway 100 is neutral with respect to ATP production when acetate 214 is produced (FIG. 2). When ethanol 216 is produced, one ATP is consumed in a step involving the reduction of methylene tetrahydrofolate to methyl tetrahydrofolate 206 by a reductase, and the process is therefore net negative by one ATP. The pathway is balanced when acetyl-PO4 218 is converted to acetate 214.

Acetogenic Clostridia organisms generate cellular energy by ion gradient-driven phosphorylation. When grown in a CO atmosphere, a transmembrane electrical potential is generated and used to synthesize ATP from ADP. Enzymes mediating the process include hydrogenase, NADH dehydrogenases, carbon monoxide dehydrogenase, and methylene tetrahydrofolate reductase. Membrane carriers that have been shown to be likely involved in the ATP generation steps include quinone, menaquinone, and cytochromes.

The acetogenic Clostridia produce a mixture of C2-C6 alcohols and acids, such as ethanol, n-butanol, hexanol, acetic acid, and butyric acid, that are of commercial interest through Wood-Ljungdahl pathway 100. For example, acetate and ethanol are produced by C. ragsdalei in variable proportions depending in part on fermentation conditions. However, the cost of producing the desired product, an alcohol such as ethanol, for example, can be lowered significantly if the production is maximized by reducing or eliminating production of the corresponding acid, in this example acetate. It is therefore desirable to metabolically engineer acetogenic Clostridia for improved production of selected C2-C6 alcohols or acids through Wood-Ljungdahl pathway 100 by modulating enzymatic activities of key enzymes in the pathway.

SUMMARY OF THE INVENTION

One aspect of the present invention provides novel sequences for three key operons which code for enzymes that catalyze the syngas to ethanol metabolic process: one coding for a carbon monoxide dehydrogenase, a membrane-associated electron transfer protein, a ferredoxin oxidoreductase, and a promoter; a second operon coding for an acetate kinase, phosphotransacetylase, and a promoter; and a third operon coding for an acetyl CoA reductase and a promoter.

Another aspect of the invention provides an isolated vector or transformant containing the polynucleotide sequence coding for the operons described above.

Another aspect of the invention provides a method of producing ethanol comprising: isolating and purifying anaerobic, ethanologenic microorganisms carrying the polynucleotides coding for an operon comprising carbon monoxide dehydrogenase, a membrane-associated electron transfer protein, a ferredoxin oxidoreductase, and a promoter; an operon coding for an acetate kinase, phosphotransacetylase, and a promoter; or an operon coding for an acetyl CoA reductase and a promoter; fermenting syngas with said microorganisms in a fermentation bioreactor; providing sufficient growth conditions for cellular production of NADPH, including but not limited to sufficient zinc, to facilitate ethanol production from acetyl CoA.

Another aspect of the invention provides a method of producing ethanol by isolating and purifying anaerobic, ethanologenic microorganisms carrying the polynucleotide coding for acetyl coenzyme A reductase; fermenting syngas with said microorganisms in a fermentation bioreactor; and providing sufficient growth conditions for cellular production of NADPH, including but not limited to sufficient zinc, to facilitate ethanol production from acetyl CoA.

Yet another aspect of the present invention provides a method of increasing ethanologenesis or the ethanol to acetate production ratio in a microorganism containing the nucleotide sequence(s) coding for one or more of the operons described above, said method comprising: modifying, duplicating, or downregulating a promoter region of said nucleotide sequence to increase the activity of the acetyl coenzyme A reductase, said sequence being at least 98% identical to SEQ ID NO. 3, or to cause overexpression or underexpression of the nucleotide sequence.

The present invention is illustrated by the accompanying figures portraying various embodiments and the detailed description given below. The figures should not be taken to limit the invention to the specific embodiments, but are for explanation and understanding. The detailed description and figures are merely illustrative of the invention rather than limiting, the scope of the invention being defined by the appended claims and equivalents thereof. The drawings are not to scale. The foregoing aspects and other attendant advantages of the present invention will become more readily appreciated by the detailed description taken in conjunction with the accompanying figures.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram illustrating the electron flow pathway during syngas fermentation in acetogenic Clostridia including some of the key enzymes involved in the process;

FIG. 2 is a diagram illustrating the Wood-Ljungdahl (C1) pathway for acetyl CoA production and the enzymatic conversion of acetyl-CoA to acetate and ethanol;

FIG. 3 is a diagram illustrating a genetic map containing the location of one of the carbon monoxide dehydrogenase (CODH) operons which includes cooS, cooF and a ferredoxin oxidoreductase (FOR), in accordance with the invention;

FIG. 4 is a diagram showing the amino acid alignment of the gene for NADPH dependent secondary alcohol dehydrogenase in C. ragsdalei SEQ ID No. 47, C. ljungdahlii SEQ ID No. 5 and Thermoanaerobacter ethanolicus SEQ ID No. 6, in accordance with the invention;

FIG. 5 is a diagram illustrating the Wood-Ljungdahl pathway for ethanol synthesis and showing a strategy for specifically attenuating or eliminating acetate production in acetogenic Clostridia by knocking out the genes encoding acetate kinase (ack) and phosphotransacetylase (pta) or by modulating acetate production by mutating or replacing the promoter driving phosphotransacetylase and acetate kinase gene expression, in accordance with the invention;

FIG. 6 is a diagram of the Wood-Ljungdahl pathway for ethanol synthesis, and shows a strategy for specifically increasing ethanol production in C. ragsdalei by overexpression of an acetyl CoA reductase in a host knocked out for acetate kinase or phosphotransacetylase activity, in accordance with the invention;

FIG. 7 is a diagram of the Wood-Ljungdahl pathway for ethanol synthesis, and shows a strategy for increasing ethanol production in acetogenic Clostridia by aldehyde ferredoxin oxidoreductase (AOR) in a host strain that is attenuated in its ability to produce acetate and has increased NADPH dependent alcohol dehydrogenase activity, in accordance with the invention;

FIG. 8 is a diagram of the butanol and butyrate biosynthesis pathway in C. carboxidivorans and the corresponding genes catalyzing the conversion of acetyl-CoA to butanol and butyrate, showing a strategy for increasing butanol production, in accordance with the invention.

DETAILED DESCRIPTION

The present invention is directed to novel genetic sequences coding for acetogenic Clostridia micro-organisms that produce ethanol and acids from syngas comprising CO, CO2, H2, or mixtures thereof.

Several species of acetogenic Clostridia that produce C2-C6 alcohols and acids via the Wood-Ljungdahl pathway have been characterized: C. ragsdalei, C. ljungdahlii, C. carboxidivorans, and C. autoethanogenum. The genomes of three of these microorganisms were sequenced in order to locate and modify the portions of the genome that code for the enzymes of interest.

The genes that code for enzymes in the Wood-Ljungdahl metabolic pathway and ethanol synthesis identified in the C. ragsdalei genome are presented in Table 1. The first column identifies the pathway associated with each gene. The gene identification numbers indicated in the second column correspond to the numbers representing the enzymes involved in the metabolic reactions in the Wood-Ljungdahl pathway shown in FIG. 1 and FIG. 2.

TABLE 1. Clostridium ragsdalei genes used in metabolic engineering experiments.

Gene ID | Gene Name | EC number | ORF ID | Copy ID | Description

Pathway: Wood-Ljungdahl
1 | Carbon Monoxide Dehydrogenase | 1.2.2.4 | RCCC00183 | CODH 1 | CO oxidation
2 | Carbon Monoxide Dehydrogenase | 1.2.2.4 | RCCC01175 | CODH 2 | CO oxidation
3 | Carbon Monoxide Dehydrogenase | 1.2.2.4 | RCCC01176 | CODH 3 | CO oxidation
4 | Carbon Monoxide Dehydrogenase | 1.2.2.4 | RCCC02026 | CODH 4 | CO oxidation
5 | Carbon Monoxide Dehydrogenase | 1.2.2.4 | RCCC03874 | CODH 5 | CO oxidation
6 | Carbon Monoxide Dehydrogenase/Acetyl-CoA Synthase | 1.2.99.2 | RCCC03862 | cooS/acsA | bifunctional CODH/ACS enzyme, carbon fixation
7 | Formate Dehydrogenase | 1.2.1.2 | RCCC00874 | FDH 1 | Methyl branch carbon fixation
8 | Formate Dehydrogenase | 1.2.1.2 | RCCC03324 | FDH 2 | Methyl branch carbon fixation
9 | Formyltetrahydrofolate Synthase | 6.3.4.3 | RCCC03872 | FTHFS | Methyl branch carbon fixation
10 | Methenyltetrahydrofolate cyclohydrolase | 3.5.4.9 | RCCC03870 | MEC | Methyl branch carbon fixation
11 | Methylenetetrahydrofolate dehydrogenase | 1.5.1.5 | RCCC03870 | MED | Methyl branch carbon fixation
12 | Methylenetetrahydrofolate reductase | 1.5.1.20 | RCCC03868 | MER | Methyl branch carbon fixation
13 | Methyltransferase | 2.1.1.13 | RCCC03863 | acsE | Methyl branch carbon fixation
14 | Corrinoid Iron-sulfur protein | 1.2.99.2 | RCCC03864 | acsC | Part of CODH/ACS complex, Large Subunit
15 | Corrinoid Iron-sulfur protein | 1.2.99.2 | RCCC03865 | acsD | Part of CODH/ACS complex, Small Subunit

Pathway: Ethanol and acetate production
16 | Acetate Kinase | 2.7.2.1 | RCCC01717 | ACK | Acetate production
17 | Phospho-transacetylase | 2.3.1.8 | RCCC01718 | PTA | Acetate production
18 | Tungsten-containing aldehyde ferredoxin oxidoreductase | 1.2.7.5 | RCCC00020 | AOR 1 | Reduction of acetate to acetaldehyde
19 | Tungsten-containing aldehyde ferredoxin oxidoreductase | 1.2.7.5 | RCCC00030 | AOR 2 | Reduction of acetate to acetaldehyde
20 | Tungsten-containing aldehyde ferredoxin oxidoreductase | 1.2.7.5 | RCCC01183 | AOR 3 | Reduction of acetate to acetaldehyde
21 | Acetyl-CoA Reductase | 1.1.1.2 | RCCC02715 | ADH 1 | zinc-containing, NADPH-dependent Acetyl-CoA reductase
22 | Alcohol Dehydrogenase | 1.1.1.1 | RCCC01356 | ADH 2 | two pfam domains: FeADH and ALDH, AdhE
23 | Alcohol Dehydrogenase | 1.1.1.1 | RCCC01357 | ADH 3 | two pfam domains: FeADH and ALDH, AdhE
24 | Alcohol Dehydrogenase | 1.1.1.1 | RCCC01358 | ADH 4 | two pfam domains: FeADH and ALDH, AdhE, fragment (76 aa)
25 | Alcohol Dehydrogenase | 1.1.1.1 | RCCC03300 | ADH 5 | one pfam domain: FeADH
26 | Alcohol Dehydrogenase | 1.1.1.1 | RCCC03712 | ADH 6 | one pfam domain: FeADH
27 | Alcohol Dehydrogenase | 1.1.1.1 | RCCC04095 | ADH 7 | one pfam domain: FeADH
28 | Alcohol Dehydrogenase | --- | RCCC00004 | ADH 8 | short chain ADH, multiple copy
29 | Alcohol Dehydrogenase | --- | RCCC01567 | ADH 9 | short chain ADH, multiple copy
30 | Alcohol Dehydrogenase | --- | RCCC02765 | ADH 10 | short chain ADH, multiple copy
31 | Alcohol Dehydrogenase | --- | RCCC02240 | ADH 11 | short chain ADH, multiple copy
32 | Aldehyde Dehydrogenase | 1.2.1.10 | RCCC03290 | 1 | Acetylating
33 | Aldehyde Dehydrogenase | 1.2.1.10 | RCCC04101 | 2 | Acetylating
34 | Aldehyde Dehydrogenase | 1.2.1.10 | RCCC04114 | 3 | Acetylating

Pathway: Hydrogenase
35 | Hydrogenase | 1.12.7.2 | RCCC00038 | HYD 1 | Fe only, H2 production
36 | Hydrogenase | 1.12.7.2 | RCCC00882 | HYD 2 | Fe only, large subunit, H2 production
37 | Hydrogenase | 1.12.7.2 | RCCC01252 | HYD 3 | Fe only, H2 production
38 | Hydrogenase | 1.12.7.2 | RCCC01504 | HYD 4 | Fe only, H2 production
39 | Hydrogenase | 1.12.7.2 | RCCC02997 | HYD 5 | Ni-Fe large subunit, H2 oxidation

Pathway: Electron carrier
40 | Ferredoxin | | RCCC00086 | |
41 | Ferredoxin | | RCCC00301 | |
42 | Ferredoxin | | RCCC00336 | |
43 | Ferredoxin | | RCCC01168 | |
44 | Ferredoxin | | RCCC01415 | |
45 | Ferredoxin | | RCCC01825 | |
46 | Ferredoxin | | RCCC02435 | |
47 | Ferredoxin | | RCCC02890 | |
48 | Ferredoxin | | RCCC03063 | |
49 | Ferredoxin | | RCCC03726 | |
50 | Ferredoxin | | RCCC04003 | |
51 | Ferredoxin | | RCCC04147 | |

Pathway: Electron transfer
52 | Pyridine nucleotide-disulphide oxidoreductases | | RCCC02615 | | glutamate synthase small chain, but no large chain next to it
53 | Pyridine nucleotide-disulphide oxidoreductases | | RCCC02028 | | next to cooF and cooS, probably important for reduced pyridine cofactor generation
54 | Pyridine nucleotide-disulphide oxidoreductases | | RCCC03071 | | NADH dehydrogenase, not part of an operon
55 | Membrane-associated electron transfer FeS protein, cooF | | RCCC02027 | cooF | Between gene number 4 and gene number 53
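Table 1 describes Gene ID 55 (cooF) as lying between Gene ID 4 and Gene ID 53. A minimal sketch of how that relationship could be checked from the ORF identifiers follows. It assumes that the numeric part of an ORF ID reflects position along the genome, which the table does not state; the `GeneRecord` layout and function names are hypothetical, and the ORF digits are read from the scanned table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneRecord:
    gene_id: int   # "Gene ID" column of Table 1
    name: str      # "Gene Name" column (shortened here)
    orf_id: str    # "ORF ID" column


# Three rows taken from Table 1 (C. ragsdalei).
TABLE_1_ROWS = [
    GeneRecord(4, "Carbon monoxide dehydrogenase", "RCCC02026"),
    GeneRecord(53, "Pyridine nucleotide-disulphide oxidoreductase", "RCCC02028"),
    GeneRecord(55, "Membrane-associated electron transfer FeS protein, cooF", "RCCC02027"),
]


def orf_number(record: GeneRecord) -> int:
    """Numeric suffix of the ORF ID, e.g. 'RCCC02026' -> 2026."""
    return int(record.orf_id[4:])


def genome_order(records: list[GeneRecord]) -> list[GeneRecord]:
    """Sort records by ORF number (assumed to track genome position)."""
    return sorted(records, key=orf_number)


def are_adjacent(records: list[GeneRecord]) -> bool:
    """True when the ORF numbers form an unbroken run, as expected for an operon."""
    numbers = sorted(orf_number(r) for r in records)
    return all(b - a == 1 for a, b in zip(numbers, numbers[1:]))


if __name__ == "__main__":
    for record in genome_order(TABLE_1_ROWS):
        print(record.gene_id, record.orf_id, record.name)
    print("adjacent:", are_adjacent(TABLE_1_ROWS))
```

Under that assumption the three ORFs are consecutive, with cooF between the other two, which matches the table's own description of Gene ID 55.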

Sequence analysis of the C. ljungdahlii genome was conducted. Genes coding for enzymes in the Wood-Ljungdahl pathway, ethanol and acetate production, and electron transfer have been identified and located within the genome. The results are presented in Table 2.

TABLE 2. Clostridium ljungdahlii genes used in metabolic engineering experiments.

Gene ID | Gene Name | EC number | ORF ID | Copy ID | Description

Pathway: Wood-Ljungdahl
1 | Carbon Monoxide Dehydrogenase | 1.2.2.4 | RCCD00983 | CODH 1 | CO oxidation
2 | Carbon Monoxide Dehydrogenase | 1.2.2.4 | RCCD00984 | CODH 2 | CO oxidation
3 | Carbon Monoxide Dehydrogenase | 1.2.2.4 | RCCD01489 | CODH 3 | CO oxidation
4 | Carbon Monoxide Dehydrogenase | 1.2.2.4 | RCCD04299 | CODH 4 | CO oxidation
5 | Carbon Monoxide Dehydrogenase/Acetyl-CoA Synthase | 1.2.99.2 | RCCD00972 | CODH ACS | bifunctional CODH/ACS enzyme, carbon fixation
6 | Formate Dehydrogenase | 1.2.1.2 | RCCD01275 | FDH 1 | Methyl branch carbon fixation
7 | Formate Dehydrogenase | 1.2.1.2 | RCCD01472 | FDH 2 | Methyl branch carbon fixation
8 | Formyltetrahydrofolate Synthase | 6.3.4.3 | RCCD00982 | FTHFS | Methyl branch carbon fixation
9 | Methenyltetrahydrofolate cyclohydrolase | 3.5.4.9 | RCCD00980 | MEC | Methyl branch carbon fixation
10 | Methylenetetrahydrofolate dehydrogenase | 1.5.1.5 | RCCD00980 | MED | Methyl branch carbon fixation
11 | Methylenetetrahydrofolate reductase | 1.5.1.20 | RCCD00978 | MER | Methyl branch carbon fixation
12 | Methyltransferase | 2.1.1.13 | RCCD00973 | MET | Methyl branch carbon fixation
13 | Corrinoid Iron-sulfur protein | 1.2.99.2 | RCCD00974 | COPL | Part of CODH/ACS complex, Large Subunit
14 | Corrinoid Iron-sulfur protein | 1.2.99.2 | RCCD00975 | COPS | Part of CODH/ACS complex, Small Subunit

Pathway: Ethanol and acetate production
15 | Acetate Kinase | 2.7.2.1 | RCCD02720 | ACK | Acetate production
16 | Phospho-transacetylase | 2.3.1.8 | RCCD02719 | PTA | Acetate production
17 | Tungsten-containing aldehyde ferredoxin oxidoreductase | 1.2.7.5 | RCCD01679 | AOR 1 | Reduction of acetate to acetaldehyde
18 | Tungsten-containing aldehyde ferredoxin oxidoreductase | 1.2.7.5 | RCCD01692 | AOR 2 | Reduction of acetate to acetaldehyde
19 | Acetyl-CoA Reductase | 1.1.1.2 | RCCD00257 | ADH 1 | zinc-containing, NADPH-dependent Acetyl-CoA reductase
20 | Alcohol Dehydrogenase | 1.1.1.1 | RCCD00167 | | two pfam domains: FeADH and ALDH, AdhE
21 | Alcohol Dehydrogenase | 1.1.1.1 | RCCD00168 | | pfam domain: FeADH and ALDH, AdhE
22 | Alcohol Dehydrogenase | 1.1.1.1 | RCCD02628 | | one pfam domain: FeADH
23 | Alcohol Dehydrogenase | 1.1.1.1 | RCCD03350 | | one pfam domain: FeADH
24 | Alcohol Dehydrogenase | | RCCD00470 | | short chain ADH, multiple copy
25 | Alcohol Dehydrogenase | | RCCD01665 | | short chain ADH, multiple copy
26 | Alcohol Dehydrogenase | | RCCD01767 | ADH 10 | short chain ADH, multiple copy
27 | Alcohol Dehydrogenase | | RCCD02864 | | short chain ADH, multiple copy
28 | Aldehyde Dehydrogenase | 1.2.1.10 | RCCD02636 | | Acetylating
29 | Aldehyde Dehydrogenase | 1.2.1.10 | RCCD03356 | | Acetylating
30 | Aldehyde Dehydrogenase | 1.2.1.10 | RCCD03368 | | Acetylating

Pathway: Hydrogenase
31 | Hydrogenase | 1.12.7.2 | RCCD00346 | HYD 1 | Ni-Fe large subunit, H2 oxidation
32 | Hydrogenase | 1.12.7.2 | RCCD00938 | HYD 2 | Ni-Fe small subunit, H2 oxidation
33 | Hydrogenase | 1.12.7.2 | RCCD01283 | HYD 3 | Fe only, large subunit, H2 production
34 | Hydrogenase | 1.12.7.2 | RCCD01700 | HYD 4 | Fe only, H2 production
35 | Hydrogenase | 1.12.7.2 | RCCD02918 | HYD 5 | Fe only, H2 production
36 | Hydrogenase | 1.12.7.2 | RCCD04233 | HYD 6 | Fe only, H2 production

Pathway: Electron carrier
37 | Ferredoxin | | RCCD00424 | |
38 | Ferredoxin | | RCCD01226 | |
39 | Ferredoxin | | RCCD01932 | |
40 | Ferredoxin | | RCCD02185 | |
41 | Ferredoxin | | RCCD02239 | |
42 | Ferredoxin | | RCCD02268 | |
43 | Ferredoxin | | RCCD02580 | |
44 | Ferredoxin | | RCCD03406 | |
45 | Ferredoxin | | RCCD03640 | |
46 | Ferredoxin | | RCCD03676 | |
47 | Ferredoxin | | RCCD04306 | |

Pathway: Electron transfer
48 | Pyridine nucleotide-disulphide oxidoreductases | | RCCD00185 | | glutamate synthase small chain, but no large chain next to it
49 | Pyridine nucleotide-disulphide oxidoreductases | | RCCD01487 | | next to cooF and cooS, probably important for reduced pyridine cofactor generation
50 | Pyridine nucleotide-disulphide oxidoreductases | | RCCD00433 | | NADH dehydrogenase, not part of an operon
51 | Membrane-associated electron transfer FeS protein, cooF | | RCCD01488 | cooF | Between gene number 3 and gene number 49

Similarly, the genome of C. carboxydivorans was Ljungdahl pathway and ethanol and acetate synthesis were sequenced, and genes coding for the enzymes in the Wood identified and located. The results are presented in Table 3. TABLE 3 Clostridium Carboxidivorans genes used in metabolic engineering. Gene EC Pathway ID Gene Name number ORF ID Copy ID Description Wood Carbon Monoxide 1.2.2.4 RCC BO4039 CODH 1 CO oxidation Ljungdahl Dehydrogenase RCC BO1154 CODH 2 CO oxidation RCC BO2478 CODH 3 CO oxidation RCC BO3963 CODH 4 CO oxidation RCC BO4038 CODH 5 CO oxidation Carbon Monoxide 12.99.2 RCC BO4293 CODH ACS bifunctional Dehydrogenase Acetyl CODEHFACS CoA Synthase enzyme, carbon fixation Formate Dehydrogenase 1.2.1.2 RCC BOS4O6 FDH 1 Methyl branch 8 RCC BO1346 FDH 2 carbon fixation Formyltetrahydrofolate 6.3.4.3 RCC BO4040 FTHFS Methyl branch Synthase carbon fixation Methenyltetrahydrofolate 3.54.9 RCC BO4042 MEC Methyl branch cyclohydrolase carbon fixation Methylenetetrahydrofolate 1.5.1.5 RCC BO4042 MED Methyl branch dehydrogenase carbon fixation Methylenetetrahydrofolate 1.5.1.20 RCC BO4044 MER Methyl branch reductase carbon fixation Methyltransferase 2.1.1.13 RCC B04294 MET Methyl branch carbon fixation Corrinoid Iron-sulfur 12.99.2 RCC BO4049 COPL Part of protein CODEHFACS complex, Large Subunit Corrinoid Iron-sulfur 12.99.2 RCC BO4047 COPS Part of protein CODEHFACS complex, Small Subunit Ethanol and 6 Acetate Kinase 27.2.1 RCC BOS249 ACK Acetate acetate production production Phospho-transacetylase 2.3.1.8 RCC BO2481 PTA Acetate production Tungsten-containing 2.7.5 RCC BOOO63 AOR 1 Reduction of aldehyde ferredoxin acetate to oxidoreductase acetaldehyde Alcohol Dehydrogenase .1.1.2 RCC zinc-ADH .1.1.1 RCC A two pfam domain: FeADH and ALDH, AdhE 21 .1.1.1 RCC BO5675 truncated, AdhE 22 .1.1.1 RCC BOO958 A one pfam domain: FeADH 23 .1.1.1 RCC B04489 one pfam domain: FeADH 24 .1.1.1 RCC one pfam domain: FeADH 25 ——— RCC BO2465 short chain ADH, multiple copy 26 ——— RCC 
BO5551 A. D H 1 O short chain ADH, multiple copy 27 Aldehyde Dehydrogenase .2.1.10 RCC BO2403 Acetylating 28 .2.1.10 RCC D Acetylating 29 .2.1.10 RCC BO4.031 t Acetylating Hydrogenase 30 Hydrogenase 12.72 RCC BO2249 sYD 1 Ni—Fe large subunit, H2 oxi US 8,628,943 B2 13 14 TABLE 3-continued Clostridium carboxidivoransgenes used in metabolic engineering.

Gene EC Pathway ID Gene Name number ORFID Copy ID Description

| | 31 | | 1.12.7.2 | RCCB01319 | HYD 2 | Fe only, H2 production |
| | 32 | | 1.12.7.2 | RCCB01405 | HYD 3 | Fe only, H2 production |
| | 33 | | 1.12.7.2 | RCCB01516 | HYD 4 | Fe only, large subunit, H2 production |
| | 34 | | 1.12.7.2 | RCCB03483 | HYD 5 | Fe only, H2 production |
| | 35 | | 1.12.7.2 | RCCB05411 | HYD 6 | Fe only, large subunit, H2 production |
| Electron carrier | 36 | Ferredoxin | | RCCB00234 | | |
| | 37 | | | RCCB00345 | | |
| | 38 | | | RCCB01260 | | |
| | 39 | | | RCCB01334 | | |
| | 40 | | | RCCB01775 | | |
| | 41 | | | RCCB01960 | | |
| | 42 | | | RCCB01972 | | |
| | 43 | | | RCCB02618 | | |
| | 44 | | | RCCB02638 | | |
| | 45 | | | RCCB02836 | | |
| | 46 | | | RCCB02853 | | |
| | 47 | | | RCCB03023 | | |
| | 48 | | | RCCB03191 | | |
| | 49 | | | RCCB03278 | | |
| | 50 | | | RCCB03452 | | |
| | 51 | | | RCCB03596 | | |
| | 52 | | | RCCB03762 | | |
| | 53 | | | RCCB03972 | | |
| | 54 | | | RCCB04165 | | |
| | 55 | | | RCCB04383 | | |
| | 56 | | | RCCB04571 | | |
| | 57 | | | RCCB04585 | | |
| | 58 | | | RCCB05780 | | |
| | 59 | | | RCCB05975 | | |
| | 60 | | | RCCB06304 | | |
| | 61 | | | RCCB06305 | | |
| Electron transfer | 62 | Pyridine nucleotide-disulphide oxidoreductases | | RCCB00442 | | NADH dehydrogenase, not part of an operon |
| | 63 | | | RCCB01674 | | NADH dehydrogenase, not part of an operon |
| | 64 | | | RCCB03510 | | next to cooF and cooS, probably important for reduced pyridine cofactor generation |
| | 65 | | | RCCB00586 | | NADH dehydrogenase, not part of an operon |
| | 66 | | | RCCB04795 | | NADH:ferredoxin oxidoreductase, not part of an operon |
| | 67 | Membrane-associated electron transfer FeS protein, cooF | | RCCB03509 | cooF | Between gene number 2 and gene number 64 |

Genes that code for enzymes in the electron transfer pathway include carbon monoxide dehydrogenase, Enzyme Commission number (EC 1.2.2.4). Five separate open reading frame (ORF) sequences were identified in C. ragsdalei and C. ljungdahlii, and six were identified in the C. carboxidivorans genome for the carbon monoxide dehydrogenase enzyme.

FIG. 3 is a diagram of carbon monoxide dehydrogenase operon 300. The gene order within operon 300 is highly conserved in all three species of acetogenic Clostridia, and comprises the genes coding for the carbon monoxide dehydrogenase (cooS) (Gene ID 4, Tables 1, 2, and 3), followed by the membrane-associated electron transfer FeS protein (cooF) (Gene ID 55, Table 1; Gene ID 51, Table 2; Gene ID 67, Table 3), in turn followed by ferredoxin oxidoreductase (FOR).

A comparison was conducted of the genetic sequence found in the operon of FIG. 3 across the three species of acetogenic Clostridia. The cooS gene had 98% identity between C. ragsdalei and C. ljungdahlii, 84% identity between C. carboxidivorans and C. ragsdalei, and 85% identity between C. carboxidivorans and C. ljungdahlii. The cooF gene had 98% identity between C. ragsdalei and C. ljungdahlii, 80% identity between C. carboxidivorans and C. ragsdalei, and 81% identity between C. carboxidivorans and C. ljungdahlii. The FOR gene had 97% identity between C. ragsdalei and C. ljungdahlii, 77% identity between C. carboxidivorans and C. ragsdalei, and 77% identity between C. carboxidivorans and C. ljungdahlii.

Six hydrogenase (EC 1.12.7.2) ORF sequences were identified in the genome of each of the acetogenic Clostridium species.

Twelve ferredoxin biosynthesis genes (Gene ID 40-51) were identified in the C. ragsdalei genome. Eleven ferredoxin biosynthesis genes (Gene ID 37-47, Table 2) were found in C. ljungdahlii, and twenty-six (Gene ID 36-61, Table 3) were found in C. carboxidivorans.

Three genes coding for ferredoxin oxidoreductase enzymes were found in the C. ragsdalei genome that contain both a ferredoxin and nicotinamide cofactor binding domain. The ORF Sequence ID numbers (Table 1) for these genes are: RCCC02615; RCCC02028; and RCCC03071. The key gene for metabolic engineering, RCCC02028, is part of the cooS/cooF operon, also shown in FIG. 3. Similarly, three genes coding for ferredoxin oxidoreductase (FOR) enzymes were found in the C. ljungdahlii genome. Each of these genes codes for both the ferredoxin and cofactor binding domains. The ORF Sequence ID numbers for these genes are: RCCD00185; RCCD01847; and RCCD00433 (Table 2). The key gene, RCCD01847, is part of the cooF/cooS operon shown in FIG. 3.

Five genes were found in the C. carboxidivorans genome that contain both the ferredoxin and cofactor binding domains. The ORF Sequence ID numbers (Table 3) for these genes are: RCCB00442; RCCB01674; RCCB03510; RCCB00586; and RCCB04795. The potentially key gene for modulating electron flow is RCCB03510, which is part of the cooF/cooS operon (FIG. 3).

The genes encoding AR (Gene ID 21, Table 1; Gene ID 19, Table 2) were sequenced in C. ragsdalei and C. ljungdahlii. A high degree of gene conservation is observed for the acetyl CoA reductase gene in C. ragsdalei and C. ljungdahlii. Furthermore, in both micro-organisms, the enzyme exhibits a high degree of homology. The sequence of the acetyl CoA gene in C. ragsdalei and C. ljungdahlii was compared and found to have a 97.82% identity.

Further, the functionality of the gene (including the promoter) encoding for acetyl CoA reductase was tested. The gene was amplified by PCR, transferred into shuttle vector pCOS52 and ligated into the EcoRI site to form pCOS54. The vector contained the entire acetyl-CoA reductase gene and its promoter on a high-copy plasmid. pCOS52 contained the same backbone vector as pCOS54 but lacked the AR gene. pCOS52 was used as the control plasmid in functional assays to determine expression of the AR gene in E. coli to confirm the Clostridial gene function. The results confirmed the function of the acetyl CoA reductase gene.

The functional assay consisted of adding cells harvested at the given time points to a reaction buffer containing NADPH and acetone as the substrate. Spectrophotometric activity (conversion of NADPH to NADP+) was measured at 378 nm and compared to a standard curve to determine total activity level. Specific activity was determined using 317 mg/gram of dry cell weight at an OD measurement of 1.

The genes encoding the PTA-ACK operon (Gene IDs 16-17, Tables 1 and 3; Gene IDs 15-16, Table 2) and its promoter were sequenced in C. ragsdalei, C. ljungdahlii, and C. carboxidivorans. The functionality of the operon was confirmed, and it was demonstrated that downregulation of the operon increases the ethanol to acetate production ratio. Downregulation involves decreasing the transcription of the 2-gene operon via promoter modification through site-directed mutagenesis. Such downregulation leads to a decrease in mRNA, leading to a decrease in protein production and a corresponding decrease in the ability of the strain to produce acetate. Such downregulation can be achieved via the method described in Example 2.

Additionally, a comparison was conducted of the genetic sequence found in the PTA-ACK operon across three species of acetogenic Clostridia. The PTA gene had 97% identity between C. ragsdalei and C. ljungdahlii, 78% identity between C. carboxidivorans and C. ragsdalei, and 79% identity between C. ljungdahlii and C. carboxidivorans. The ACK gene had 96% identity between C. ragsdalei and C. ljungdahlii, 78% between C. carboxidivorans and C. ragsdalei, and 77% between C. carboxidivorans and C. ljungdahlii.

Key genes to promote production of ethanol in C. ragsdalei include:

SEQ ID NO 1 (Gene ID Nos. 4, 55, 53, Table 1), the gene sequence, including the experimentally determined promoter region, for carbon monoxide dehydrogenase, cooS, electron transfer protein cooF, and the NADH dependent ferredoxin oxidoreductase (FOR);

SEQ ID NO 2 (Gene ID Nos. 17, 16, Table 1), the gene sequence, including the experimentally determined promoter region, for ACK and PTA;

SEQ ID NO 3 (Gene ID No. 6, Table 1), the gene sequence, including the experimentally determined promoter region, for the acetyl CoA reductase.

Sequence Listing

C. ragsdalei gene sequences (Table 1)

>SEQ ID NO. 1: (cooS, cooF, NADH:Ferredoxin Oxidoreductase operon (includes STOP), Gene ID Nos. 4, 55, 53)

TATTATATCAATATAGAATAATTTTCAATCAAATAAGAATTATTTTATAT

TTTATATTGACAAGGAAACCGAAAAGGTTTATATTATTGTTATTGGATAA

CAATTATTTTTTAGTTAGTTGTACTTGTAAATAAATAGTATTAATTAATA

CTATTAAACTATTACAGTTTTTGATTCTTAGTATAAGTATTCTTAGTATC

TTTAGCACTTAGAATACGTTATCCTTTAGGAGAATAATCCTAATCAGTAA TAAGGATTTATGTACCGGATGCTTAAATTGTACTTTAGCTTGTATGGCAG

TTTTAATAATTTAATAGTATACTTAAATAGTATAGTTTGGAGGTTTTATT AACACAATGAAAATGGGAAATCTTTTTATGATCTGGATCTCAGCAATAAA

ATGTCAAATAACAAAATTTGTAAGTCAGCAGATAAGGTACTTGAAAAGTT TTTCTTGAAAGTAGAAATCATATATCTAAAGATGATAATGGAAACAAGCT

TATAGGTTCTCTAGATGGTGTAGAAACTTCTCATCATAGGGTAGAAAGCC TCCTATATTTTGCCGTCACTGTGACGAACCTGAGTGCGTAATGACATGTA

AAAGTGTTAAATGTGGTTTTGGTCAGCTAGGAGTCTGCTGTAGACTCTGT TGAGCGGTGCCATGACTAAAGATCCTGAAACTGGTATAGTATCCTATGAT

GCAAACGGTCCCTGCAGAATAACACCTAAAGCTCCAAGAGGAGTATGTGG GAGCATAAATGTGCCAGCTGCTTTATGTGCGTCATGTCCTGTCCTTATGG

TGCTAGTGCTGATACCATGGTTGCAAGAAACTTTCTTAGAGCTGTAGCTG AGTATTGAAACCAGATACTCAGACCAAAAGTAAAGTAGTTAAATGTGACC

CCGGCAGTGGATGTTATATCCATATAGTCGAAAATACAGCTAGAAACGTA TGTGTGGTGACAGAGATACACCTAGATGCGTTGAAAATTGTCCAACAGAA

AAATCAGTAGGTGAAACCGGCGGAGAGATAAAAGGAATGAATGCTCTCAA GCAATTTATATTGAAAAGGAGGCAGATCTCCTATGAATGAGTGGTTTAAC

CACCCTAGCAGAAAAACTTGGTATAACAGAATCTGACCCACATAAAAAAG AATAAAAATATTTTTTCACACAAAATATGTAATAATAGGAGCCAGTGCTG

CTGTACTAGTAGCTGTGCCGTATTAAAGGACTTATACAAACCAAAATTCG CTGGAATAAATGCTGCTAAAACTTTAAGAAAGTTAGATAAATCCTCCAAA

AAAAAATGGAAGTTATAAATAAATTAGCTTATGCACCTAGACTAGAAAAT ATAACTATTATTTCAAAGGATGATGCAGTTTATTCAAGATGTATACTCCA

TGGAACAAATTAAATATAATGCCTGGCGGTGCAAAATCAGAAGTTTTTGA CAAAGTACTTGAGGGAAGTAGAAATTTAGATACCATAAATTTTGTAGATT

TGGTGTAGTAAAAACTTCTACAAATCTAAACAGCGACCCTGTAGATATGC CTGATTTCTTTGAAAAAAATAATATAGAATGGATAAAAGATGCAGATGTA

TTCTAAATTGTTTAAAACTTGGAATATCCACTGGGATTTACGGACTTACC AGCAATATTGATATTGACAAGAAAAAAGTCTTACTTCAAGACAACAGCAG

CTTACAAATTTATTAAATGACATAATTTTAGGTGAACCTGCTATAAGACC CTTCAAATTTGACAAGCTCCTTATAGCTTCTGGTGCTTCCTCCTTTATTC

TGCAAAAGTTGGTTTTAAAGTTGTAGATACGGATTATATAAATTTGATGA CCCCAGTTAAAAAATTAAGAGAAGCTAAAGGAGTGTACTCCCTTAGAAAT

TAACAGGCCACCAGCACTCCATGATTGCCCACCTTCAAGAAGAACTTGTA TTTGAAGATGTAACTGCTATACAAGACAAACTTAAAAACGCAAAACAAGT

AAACCTGAAGCTGTAAAAAAAGCCCAAGCAGTTGGTGCTAAAGGATTCAA GGTAATACTTGGTGCAGGTCTTGTAGGAATTGATGCACTTTTAGGTCTTA

ACTAGTTGGATGTACCTGTGTCGGACAGGATTTACAGTTAAGAGGTAAAT TGGTGAAAAATATAAAGATTTCAGTTGTAGAAATGGGAGATAGGATTCTC

ACTATACTGATGTTTTCTCCGGTCATGCAGGAAATAACTTTACAAGTGAA CCCCTTCAACTGGACAAAACTGCATCCACTATATATGAAAAGTTGTTAAA

GCCTTAATAGCAACTGGAGGTATAGATGCAATAGTATCTGAATTTAACTG AGAAAAAGGTATAGATGTCTTTACTTCAGTTAAATTGGAAGAGGTAGTTT

TACTCTTCCTGGCATCGAGCCAATAGCTGATAAGTTCATGGTTAAAATGA TAAATAAAGACGGAACTGTAAGTAAAGCAGTACTATCAAATTCAACTTCT

TATGCCTAGATGACGTTTCTAAAAAATCAAATGCAGAATATGTAGAATAC ATAGATTGCGATATGATAATAGTTGCTGCTGGTGTTAGACCAAATGTAAG

TCTTTTAAAGATAGAGAAAAAATAAGCAACCATGTTATAGATACGGCTAT CTTTATAAAAGACAGCAGGATAAAAGTTGAAAAAGGCATTGTCATAGACA

TGAAAGTTATAAGGAAAGAAGATCTAAAGTTACAATGAATATTCCTAAAA AACATTGTAAAACCACTGTAGATAATATATATGCTGCAGGAGATGTTACT

ACCATGGCTTTGATGACGTCATAACAGGTGTAAGTGAAGGTTCCTTAAAA TTTACTGCTCCTATATGGCCTATAGCTGTAAAGCAGGGAATAACTGCTGC

TCCTTCTTAGGCGGAAGTTGGAAACCTCTTGTAGACTTAATTGCTGCTGG TTACAACATGGTAGGTATAAATAGAGAATTACATGACACTTTTGGCATGA

AAAAATTAAAGGTGTTGCTGGAATAGTAGGTTGTTCAAACTTAACTGCCA AGAACTCAATGAATTTATTTAACCTTCCATGCGTATCCCTTGGTAATGTA

AAGGTCACGATGTATTTACAGTAGAACTTACAAAAGAACTCATAAAGAGA AATATAGCAGATGAAAGTTATGCTGTTGATACATTAGAAGGAGATGGAGT

AATATAATTGTACTTTCTGCAGGTTGTTCAAGTGGTGGACTTGAAAATGT TTATCAAAAAATAGTTCACAAAGATGGAGTAATCTACGGTGCACTTCTAG

AGGACTTATGTCTCCAGGAGCTGCTGAACTTGCAGGAGATAGCTTAAAAG TTGGAGATATATCTTACTGCGGCGTACTAGGATATCTCATAAAAAATAAA

AAGTATGTAAGAGCCTAGGTATACCACCTGTACTAAATTTTGGTCCATGT GTAAATATAAGCAATATCCATAAAAATATTTTTGACATAGATTATTCTGA

CTTGCTATTGGAAGATTGGAAATTGTAGCAAAAGAACTAGCAGAATACCT TTTTTACAATGTTGAAGAAGATGGACAATATAGTTATCAATTGAGGTAA

AAAAATAGATATTCCACAGCTTCCACTTGTGCTTTCTGCACCTCAATGGC >SEQ ID NO. 2: (PTA-ACK operon (includes STOP), Gene ID Nos. 17, 16)

TTGAAGAACAAGCATTGGCAGATGGAAGTTTTGGTCTTGCCCTTGGATTA GCATACTGATTGATTATTTATTTGAAAATGCCTAAGTAAAATATATACAT

CCACTTCACCTTGCTATATCTCCTTTCATTGGTGGAAGCAAAGTGGTAAC ATTATAACAATAAAATAAGTATTAGTGTAGGATTTTTAAATAGAGTATCT

AAAAGTTTTATGTGAAGATATGGAAAATCTAACAGGCGGCAAGCTTATAA ATTTTCAGATTAAATTTTTACTTATTTGATTTACATTGTATAATATTGAG

TAGAAGACGATGTAATAAAAGCTGCAGATAAATTAGAAGAAACCATACTT TAAAGTATTGACTAGTAAAATTTTGTGATACTTTAATCTGTGAAATTTCT

GCAAGAAGGAAAAGCTTAGGTCTTAATTAAATGAAAAGAATAATGATAAA TAGCAAAAGTTATATTTTTGAATAATTTTTATTGAAAAATACAACTAAAA

AGGATTATAGTATAAGTGTGTGTAATTTTGTGTTAAATTTAAAGGGAGGA TTATGTTGCAGTTTTAAATGGAGCAGATGCTATAATATTTACAGCAGGAC

AATAAACATGAAATTGATGGAAAAAATTTGGAATAAGGCAAAGGAAGACA TTGGAGAAAATTCAGCTACTAGCAGATCTGCTATATGTAAGGGATTAAGC

AAAAAAAGATTGTCTTAGCTGAAGGAGAAGAAGAAAGAACTCTTCAAGCT TATTTTGGAATTAAAATAGATGAAGAAAAGAATAAGAAAAGGGGAGAAGC

TGTGAAAAAATAATTAAAGAAGGTATTGCAAATTTAATCCTTGTAGGGAA ACTAGAAATAAGCACACCTGATTCAAAGATAAAAGTATTAGTAATTCCTA

TGAAAAGGTAATAGAGGAGAAGGCATCAAAATTAGGCGTAAGTTTAAATG CAAATGAAGAACTTATGATAGCTAGGGATACAAAAGAAATAGTTGAAAAT

GAGCAGAAATAGTAGATCCAGAAACCTCGGATAAACTAAAAAAATATGCA AAATAA

GATGCTTTTTATGAATTGAGAAAGAAGAAGGGAATAACACCAGAAAAAGC >SEQ ID NO. 3: (ORF RCCC02715, P11, NADPH-SADH (includes STOP), Gene ID No. 6)

GGATAAAATAGTAAGAGATCCAATATATTTTGCTACGATGATGGTTAAGC ATGAAAGGTTTTGCAATGTTAGGTATTAACAAGTTAGGATGGATTGAAAA

TTGGAGATGCAGATGGATTGGTTTCAGGTGCAGTGCATACTACAGGTGAT GAAAAACCCAGTACCAGGTCCTTATGATGCGATTGTACATCCTCTAGCTG

CTTTTGAGACCAGGACTTCAAATAGTAAAGACAGCTCCAGGTACATCAGT TATCCCCATGTACATCAGATATACATACGGTTTTTGAAGGAGCACTTGGT

AGTTTCCAGCACATTTATAATGGAAGTACCAAATTGTGAATATGGTGACA AATAGGGAAAATATGATTTTAGGTCACGAAGCTGTAGGTGAAATAGCTGA

ATGGTGTACTTCTATTTGCTGATTGTGCTGTAAATCCATGCCCAGATAGT AGTTGGCAGTGAAGTTAAAGATTTTAAAGTTGGCGATAGAGTTATCGTAC

GATCAATTGGCTTCAATTGCAATAAGTACAGCAGAAACTGCAAAGAACTT CATGCACAACACCTGACTGGAGATCCTTAGAAGTCCAAGCTGGTTTTCAA

ATGTGGAATGGATCCAAAAGTAGCAATGCTTTCATTTTCTACTAAGGGAA CAGCATTCAAACGGTATGCTTGCAGGATGGAAGTTTTCCAATTTTAAAGA

GTGCAAAACACGAATTAGTAGATAAAGTTAGAAATGCTGTAGAAATTGCC CGGTGTATTTGCAGATTACTTTCATGTAAACGATGCAGATATGAATCTTG

AAAAAAGCTAAACCAGATTTAAGTTTGGACGGAGAATTACAATTAGATGC CAATACTTCCAGATGAAATACCTTTAGAAAGTGCAGTTATGATGACAGAC

CTCTATCGTAGAAAAGGTTGCAAGTTTAAAGGCTCCTGAAAGTGAAGTAG ATGATGACTACTGGTTTTCATGGGGCAGAACTTGCTGACATAAAAATGGG

CAGGAAAAGCAAATGTACTTGTATTTCCAGATCTCCAAGCAGGAAATATA TTCCAGTGTTGTCGTAATTGGTATAGGAGCTGTTGGATTAATGGGAATAG

GGTTATAAACTTGTTCAAAGATTTGCAAAAGCTGATGCTATAGGACCTGT CCGGTTCCAAACTTCGAGGAGCAGGTAGAATTATCGGTGTTGGAAGCAGA

ATGCCAGGGATTTGCAAAACCTATAAATGATTTGTCAAGAGGATGTAACT CCCGTTTGTGTTGAAACAGCTAAATTTTATGGAGCAACTGATATTGTAAA

CCGATGATATAGTAAATGTAGTAGCTGTAACAGCAGTTCAGGCACAAGCT TTATAAAAATGGTGATATAGTTGAACAAATAATGGACTTAACTCATGGTA

CAAAAGTAAATGAAAATATTAGTAGTAAACTGTGGAAGTTCATCTTTAAA AAGGTGTAGACCGTGTAATCATGGCAGGCGGTGGTGCTGAAACACTAGCA

ATATCAACTTATTGATATGAAAGATGAAAGCGTTGTGGCAAAAGGACTTG CAAGCAGTAACTATGGTTAAACCTGGCGGCGTAATTTCTAACATCAACTA

TAGAAAGAATAGGAGCAGAAGGTTCAGTTTTAACACATAAAGTTAACGGA CCATGGAAGCGGTGATACTTTGCCAATACCTCGTGTTCAATGGGGCTGCG

GAAAAGTTTGTTACAGAGCAGCCAATGGAAGATCATAAAGTTGCTATACA GCATGGCTCACAAAACTATAAGAGGAGGGTTATGTCCCGGCGGACGTCTT

ATTAGTATTAAATGCTCTTGTAGATAAAAAACATGGTGTAATAAAAGATA AGAATGGAAATGCTAAGAGACCTTGTTCTATATAAACGTGTTGATTTGAG

TGTCAGAAATATCTGCTGTAGGGCATAGAGTTTTGCATGGTGGAAAAAAA CAAACTTGTTACTCATGTATTTGATGGTGCAGAAAATATTGAAAAGGCCC

TATGCGGCATCCATTCTTATTGATGACAATGTAATGAAAGCAATAGAAGA TTTTGCTTATGAAAAATAAGCCAAAAGATTTAATTAAATCAGTAGTTACA

ATGTATTCCATTAGGACCATTACATAATCCAGCTAATATAATGGGAATAG TTCTAA

ATGCTTGTAAAAAACTAATGCCAAATACTCCAATGGTAGCAGTATTTGAT

ACAGCATTTCATCAGACAATGCCAGATTATGCTTATACTTATGCAATACC

TTATGATATATCTGAAAAGTATGATATCAGAAAATATGGTTTTCATGGAA

CTTCTCATAGATTCGTTTCAATTGAAGCAGCCAAGTTGTTAAAGAAAGAT

CCAAAAGATCTTAAGCTAATAACTTGTCATTTAGGAAATGGAGCTAGTAT

ATGTGCAGTAAACCAGGGAAAAGCAGTAGATACAACTATGGGACTTACTC

CCCTTGCAGGACTTGTAATGGGAACTAGATGTGGTGATATAGATCCAGCT

ATAATACCATTTGTAATGAAAAGAACAGGTATGTCTGTAGATGAAATGGA

TACTTTAATGAACAAAAAGTCAGGAATACTTGGAGTATCAGGAGTAAGCA

GCGATTTTAGAGATGTAGAAGAAGCTGCAAATTCAGGAAATGATAGAGCA

AAACTTGCATTAAATATGTATTATCACAAAGTTAAATCTTTCATAGGAGC

Using detailed genomic information, the acetogenic Clostridia micro-organisms have been metabolically engineered to increase the carbon and electron flux through the biosynthetic pathways for ethanol and butanol, while simultaneously reducing or eliminating carbon and electron flux through the corresponding acetate and butyrate formation pathways, in accordance with the present invention. For this purpose, the activities of key genes encoding for enzymes in the pathway have been modulated. In one embodiment, gene expression of key alcohol producing enzymes is increased by increasing the copy number of the gene. For example, a key carbon monoxide dehydrogenase operon (FIG. 3) and the associated electron transfer proteins, including acetyl CoA reductase and aldehyde ferredoxin oxidoreductase, are duplicated within the genome of the modified organism. In one embodiment, these duplications are introduced into strains having knocked out or attenuated acetate production to further channel electrons into the ethanol or butanol production pathway. In another embodiment a knockout strategy is applied to strains of acetogenic Clostridia that, when grown on syngas, produce more complex mixtures of alcohols and acids, such as ethanol, butanol and hexanol and their corresponding carboxylic acids.

In one embodiment, vectors to be used for the transfer of acetogenic Clostridia cloned genes from cloning vehicles to parent acetogenic Clostridia strains are constructed using standard methods (Sambrook et al., 1989). All gene targets used in molecular genetics experiments are amplified using high-fidelity polymerase chain reaction (PCR) techniques using sequence-specific primers. The amplified genes are next subcloned into intermediate cloning vehicles, and later recombined in multi-component ligation reactions to yield the desired recombinant vector to be used in the gene transfer experiments. The vectors contain the appropriate functional features required to carry out the gene transfer experiments successfully and vary depending on the method used.

To transfer the recombinant vectors into recipient acetogenic Clostridia, a variety of methods are used. These include electroporation, bi-parental or tri-parental conjugation, liposome-mediated transformation and polyethylene glycol-mediated transformation. Recombinant acetogenic Clostridia are isolated and confirmed through molecular biology techniques based on the acquisition of specific traits gained upon DNA integration.

Example 1

Acetogenic Clostridia contain operon 300, shown in FIG. 3, that consists of carbon monoxide dehydrogenase 104 (cooS, Gene ID 4, Table 1, Table 2, Table 3), a membrane-associated electron transfer protein (cooF), and a ferredoxin oxidoreductase (FOR). Overexpression of carbon monoxide dehydrogenase 104 within the acetogenic Clostridia is known to increase electron flow from syngas components to the oxidized nucleotide cofactors NAD+ and NADP+. The increased levels of reduced nucleotide cofactors then stimulate generation of intermediate compounds in Wood-Ljungdahl pathway 100.

In one embodiment, operon 300 is amplified using long PCR techniques with primers that are designed to anneal to a region 200 nucleotides (nt) upstream of the carbon monoxide dehydrogenase gene and 200 nt downstream of the ferredoxin oxidoreductase gene. The total region is about 3.8 kilobase pairs. The amplified DNA is cloned directly into suitable plasmid vectors specifically designed to ligate PCR products such as pGEM-T Easy (Promega, Madison, Wis.) or pTOPO (Invitrogen, Carlsbad, Calif.). The ends of the PCR product contain engineered restriction sites to facilitate later cloning steps. The operon 300 is subcloned into a vector that already contains cloned chromosomal C. ragsdalei or other acetogenic Clostridial DNA to allow chromosomal integration at a neutral site.

Example 2

Because carboxylic acids compete with alcohols for electrons, decreasing acid production allows more electrons to flow down the alcohol-production pathway from the CoA intermediate directly to the alcohol. Acetogenic Clostridia contain genes for phosphotransacetylase enzyme (Gene ID 17, Tables 1 and 3; Gene ID 16, Table 2) that converts acetyl CoA to acetyl-phosphate and acetate kinase (Gene ID 16, Table 1) that converts acetyl-phosphate 218 to acetate 214. In one embodiment, genetic modifications to delete all or part of the genes for both enzymes and knock out or attenuate production of acetate are made as shown in FIG. 5.

Using PCR and other standard methods, a recombinant vector containing two large non-contiguous segments of DNA is generated. Upon replacement of the native gene by the recombinant vector gene, the Clostridial strain contains no phosphotransacetylase or acetate kinase activities as shown in FIG. 5 by X504 and X502, respectively.

Modulation of the common promoter region, P* 506, to attenuate gene expression of phosphotransacetylase 508 and acetate kinase 510 and subsequent acetate production is carried out by generating a series of recombinant vectors with altered promoter regions. The vector series is constructed by site-directed mutagenesis.

Additionally, down-regulation of the 2-gene operon containing pta/ack genes is performed by site-directed mutagenesis of the promoter region. A decrease in RNA polymerase binding leads to a decrease in transcriptional activity from the pta/ack promoter and in turn leads to a decrease in protein activity. The end result is a decrease in acetate production since the intermediates are produced at a lower rate and more carbon from acetyl-CoA goes towards ethanol production. A promoter probe assay using a reporter group that is easily quantitated has been developed to measure relative promoter strength of the pta/ack promoter in vivo. After site-directed mutagenesis is performed, which imparts single and multiple lesions over a 200 base pair region, strains that have decreased promoter activity are isolated such that a series of strains with 90%, 80%, 70%, 60%, 50%, 40%, 30%, 20%, 10% and 0% activity of the native promoter in the assay are isolated and tested in recombinant Clostridia strains.

Example 3

In vivo, the acetyl CoA enzyme designated in 102 and FIG. 5 converts the Coenzyme A (CoA) form of a carbon moiety, such as acetyl-CoA 102 or butyryl-CoA, directly to its corresponding alcohol. Thermodynamically, direct conversion from the CoA form to the alcohol requires transfer of four electrons, and is a more efficient way to generate the alcohol, compared to the two-step conversion of the carboxylic acid to the corresponding alcohol. For example, as shown in FIG. 6, the two-step conversion requires that acetate 214 first be converted to its aldehyde form (acetaldehyde, 604), and then to the corresponding alcohol, ethanol 216. Thus, increasing AR activity, portrayed by the vertical arrow 602, is desirable for increasing alcohol production, and increasing the selectivity of the process by increasing the ratio of alcohol to acid.

In one embodiment, AR activity in acetogenic Clostridia is increased by amplifying the gene in vitro using high-fidelity PCR and inserting the duplicated copy of the gene into a neutral site in the chromosome using standard molecular genetic techniques. After gene replacement of the vector, the chromosome contains two copies of the AR. Confirmation of gene replacement followed by gene expression studies of the recombinant strain are performed and compared to the parent strain.

In other embodiments a similar strategy is used to increase the enzymatic activity of adhE-type alcohol dehydrogenases, short-chain alcohol dehydrogenases and primary Fe-containing alcohol dehydrogenases.

Example 4

Under some conditions, Clostridia need to obtain additional energy in the form of adenosine triphosphate (ATP) production, causing the cells to temporarily increase the production of acetate 214 from acetyl-CoA 102. The net reaction is 1 ATP from ADP + Pi through acetyl-phosphate. Acetate production is advantageous to the syngas fermentation process at low to moderate acetic acid concentrations, because it allows the cells to produce more energy and remain robust. However, too much free acetic acid causes dissipation of the transmembrane ion gradient used as the primary ATP generation source and therefore becomes detrimental to the cells. For industrial production purposes, it is advantageous to convert the acetate to ethanol to increase ethanol production and reduce the probability of accumulating too much free acetic acid.

In one embodiment, ethanol production in the double mutant C. ragsdalei strain is increased by between 10 and 40% as a result of the increased aldehyde ferredoxin oxidoreductase and AR activities. In another embodiment, the ratio of ethanol to acetate produced is increased between 5 and 10 fold, but allows sufficient acetate formation to support ATP production needed to meet the energy needs of the microorganism.

While the invention has been described with reference to particular embodiments, it will be understood by one skilled in the art that variations and modifications may be made in form and detail without departing from the spirit and scope of the invention.
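
The percent-identity figures reported above for the cooS, cooF, FOR, PTA and ACK genes compare pairs of gene sequences between species. The text does not state which alignment tool or identity definition was used, so the following Python sketch is only an illustration under stated assumptions: the two sequences are already aligned to equal length, gaps are written as "-", and identity is counted over all aligned columns. The function name `percent_identity` and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption (not from the patent): identity = matching columns divided
    by all aligned columns, skipping columns where both sequences are gaps.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep every column except those where both sequences carry a gap.
    columns = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column matches only when both residues are identical and not a gap.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Toy example (hypothetical sequences): one mismatch in ten columns.
toy_a = "ATGAAAGGTT"
toy_b = "ATGAAGGGTT"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Other identity definitions (for example, dividing by the length of the shorter sequence) give slightly different values, which is one reason a figure such as 97.82% should be read together with the method that produced it.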

SEQUENCE LISTING

<160> NUMBER OF SEQ ID NOS: 6

<210> SEQ ID NO 1
<211> LENGTH: 3899
<212> TYPE: DNA
<213> ORGANISM: Clostridium ragsdalei
<400> SEQUENCE: 1

tattatatca atatagaata attttcaatc aaataagaat tattttatat tttatattga 60

caaggaaacc gaaaaggttt atattattgt tattggataa caattatttt ttagttagtt 120

gtacttgtaa at aaatagta ttaattaata ct attaaact attacagttt ttgatt citta 18O gtataagtat tottagtatic tittagcactt agaatacgtt atcc tittagg agaataatcc 24 O

taatcagtaa ttittaataat ttaatagitat acttaaatag tatagtttgg aggttittatt 3 OO atgtcaaata acaaaatttg talagt cago a gataagg tac ttgaaaagtt tataggttct 360

Ctagatggtg tagaaacttic tdatcat agg gtagaaagcc aaagtgttaa atgtggittitt 42O ggit cagctag gagtctgctg. tagacitctgt gcaaacggtc. cctgcagaat aacacctaaa 48O

gct coaa.gag gagtatgtgg togctagtgct gataccatgg ttgcaagaala Cttt Cttaga 54 O

gctgtagctg cc.ggcagtgg atgttatatic catatagt cq aaaatacagc tagaaacgta 6 OO

aaatcagtag gtgaaaccgg cqgagagata aaaggaatga atgct ct cala Caccct agca 660 gaaaaacttg gtata acaga atctgaccca cataaaaaag Ctgtactagt agctgtgc.cg 72O

tattaaagga cittatacaaa ccaaaatticg aaaaaatgga agittataaat aaattagctt 78O

atgcacctag act agaaaat tdgaacaaat taaatataat gcctggcggit gcaaaatcag 84 O

aagtttittga tiggtgtagta aaaactticta caaatctaaa cagogaccct gtagatatgc 9 OO

ttctaaattig tittaaaactt ggaatat coa citgggattta cqgact tacc cttacaaatt 96.O tattaaatga cataattitta ggtgaacct g ctataag acc togcaaaagtt ggittittaaag O2O ttgtagatac ggattatata aatttgatga taac aggcca ccagcactcc atgattgc cc O8O

accttcaaga agaacttgta aaacctgaag ctgtaaaaaa agcc caagca gttggtgcta 14 O

aaggattcaa act agttgga tigtacctgtg tcggacagga tttacagtta agaggtaaat 2 OO

actatactga tigttittct c c gg.tcatgcag gaaataactt tacaagtgaa gocttaatag 26 O

caactggagg tatagatgca at agtatctg aatttaactg tact citt cott ggcatcgagc 32O

caatagotga taagttcat g gttaaaatga tatgcctaga tigacgtttct aaaaaatcaa 38O

atgcagaata totagaatac tottttaaag at agagaaaa aataagcaac catgttatag 4 4 O

atacggctat togaaagttat aaggaaagaa gatctaaagt tacaatgaat attcctaaaa 5 OO

accatggctt tdatgacgt cataac aggtg taagtgaagg titcc ttaaaa to cittct tag 560 US 8,628,943 B2 25 26 - Continued gcggaagttg gaalacct Ctt gtagacittaa ttgctgctgg aaaaattaaa ggtgttgctg 162O gaatag tagg ttgttcaaac ttaactgcca aaggt cacga tigt atttaca gtagaactta 168O caaaagaact cataaagaga aatataattig tactittctgc aggttgttca agtggtggac 1740 ttgaaaatgt aggact tatgtct coaggag Ctgctgaact tcaggagat agcttaaaag 18OO aagtatgtaa gagcct aggt ataccacctg tactaaattt togg to catgt cittgctattg 1860 gaagattgga aattgtagca aaagaactag cagaatacct aaaaatagat attccacagc 1920 titcc acttgt gctittctgca cct caatggc titgaagaaca agcattggca gatggaagtt 198O ttggtottgc ccttggatta ccact tca cc ttgctatat c ticctitt catt ggtggaagca 2O4. O aagtggtaac aaaagttitta ttgaagata tigaaaatct aac aggcggc aagcttataa 21OO tagaagacga tigtaataaaa gCtgcagata aattagaaga aaccatactit gcaagaagga 216 O aaagct tagg tottaattaa atgaaaagaa taatgataaa taaggattta totaccggat 222 O gcttaaattig tactittagct totatggcag alacacaatga aaatgggaaa totttittatg 228O atctggat ct cagcaataaa titt Cttgaaa gtagaaatca tatat ctaaa gatgataatg 234 O gaaacaagct tcc tatattt togcc.gtcact gtgacga acc tdagtgcgta atgacatgta 24 OO tgagcggtgc catgactaaa gatcctgaaa citggtatagt atcct atgat gag catalaat 246 O gtgc.ca.gctg ctittatgtgc gtcatgtc.ct gtcct tatgg agtattgaaa ccagatactic 252O agaccaaaag taaagtagtt aaatgtgacc ttgttggtga cagagataca Cctagatgcg 2580 ttgaaaattig to caacagaa goaatttata ttgaaaagga gccagat ct c ctatogaatga 264 O gtggtttaac aataaaaata tttitt toaca caaaatatgt aataatagga gccagtgctg 27 OO ctggaataaa togctgctaaa actittaagaa agittagataa atcct coaaa ataact atta 276 O tittcaaagga tigatgcagtt tatt caagat gtatact coa caaagtactt gagggaagta 282O gaaatttaga taccataaat tttgtagatt ctdattt citt tdaaaaaaat aatatagaat 288O ggataaaaga tigcagatgta agcaat attg at attgacaa gaaaaaagtic titact tcaag 294 O acaa.ca.gcag cittcaaattt gacaa.gct cottatagottctggtgct tcc ticcitt tatt c 3 OOO c cccagttaa aaaattaaga gaagctaaag gag tigtactic cct tagaaat tittgaagatg 3 O 6 O taactgctat 
acaaga caaa cittaaaaacg caaaacaagt gigtaatactt ggtgcaggit c 312 O ttgtaggaat tdatgcactt ttaggtotta tdgtgaaaaa tataaagatt toagttgtag 318O aaatgggaga taggattct c ccc cittcaac tdgacaaaac togcatccact atatatgaaa 324 O agttgttaaa agaaaaaggt atagatgtct ttact tcagt taaattggala gaggtag titt 33 OO taaataaaga cqgaactgta agtaaag.cag tact atcaaa ttcaactitct atagattgcg 3360 atatgata at agttgctgct ggtgttagac caaatgtaag Ctttataaaa gacagcagga 342O taaaagttga aaaaggcatt gttcatagaca aac attgtaa aaccactgta gataatatat 3480 atgctgcagg agatgttact titt actgctic ctatatggcc tatagctgta aag cagggaa 354 O taactgctgc titacaa catg gtagg tataa atagagaatt acatgacact tttgg catga 36OO agaact caat gaatttattt aacct tcc at gcgitatic cct togg taatgta aatatagoag 366 O atgaaagtta totgttgat acattagaag gagatggagt ttatcaaaaa at agttcaca 372 O aagatggagt aatctacggit gcactitctag ttggagatat atctt actgc ggcgtactag 378 O gatat ct cat aaaaaataaa gtaaatataa goaatat coa taaaaatatt tttgacatag 384 O attatt ctoga tittttacaat gttgaagaag atgga caata tag titat caa ttgaggtaa 3899

<210> SEQ ID NO 2

&211s LENGTH: 2506 &212s. TYPE: DNA <213> ORGANISM: Clostridium ragsdalei <4 OOs, SEQUENCE: 2 gcatactgat tdattattta tittgaaaatg cctaagtaaa atatata cat attataacaa 6 O taaaataagt attagtgtag gatttittaaa tagagtat ct attitt cagat taaatttitta 12 O cittatttgat ttacattgta taatattgag taaagtattg act agtaaaa ttttgttgata 18O ctittaatctg tdaaatttct tagcaaaagt tatatttittgaataatttitt attgaaaaat 24 O acaactaaaa aggattatag tataagtgttgttaattittg tittaaattit aaagggagga 3OO aataaacatgaaattgatgg aaaaaatttg gaataaggca aaggalagaca aaaaaaagat 360 tgtcttagct gaaggagaag aagaaagaac tottcaa.gct tctgaaaaaa taattaaaga 42O agg tattgca aatttaatcc ttgtagggaa taaaaggta atagaggaga aggcatcaaa 48O attaggcgta agtttaaatg gag cagaaat agtagat coa gaalacct cqg ataalactaaa 54 O aaaatatgca gatgctttitt atgaattgag aaagaagaag ggaataacac Cagaaaaagc 6OO ggataaaata gtaagagatc caatatattt totacgatg atggittaagc titggagatgc 660 agatggattg gtttcaggtg cagtgcatac tacaggtgat Cttittgaga C Caggactitca 72 O aatagtaaag acagotccag gta catcagt agttt coagc acatttataa toggaagtacc 78O aaattgttgaa tatggtgaca atggtgtact tctatttgct gattgttgctg. 
taaatccatg 84 O cc.ca.gatagt gat caattgg cittcaattgc aataagtaca gcagaaact g caaagaactt 9 OO atgtggaatg gatccaaaag tag caatgct t t catttitct act aagggaa gtgcaaaa.ca 96.O cgaattagta gataaagtta gaaatgctgt agaaattgcc aaaaaagcta aaccagattit O2O aagtttggac ggagaattac aattagatgc ctictatcgta gaaaaggttg Caagtttaaa O8O ggct Cotgaa agtgaagtag caggaaaagc aaatgtactt gtattt C cag atct c caagc 14 O aggaaatata ggittataaac ttgttcaaag atttgcaaaa gctgatgct a taggacctgt 2OO atgcCaggga tittgcaaaac ctataaatga tttgtcaaga ggatgta act C catgatat 26 O agtaaatgta gtagctgtaa cagcagttca ggcacaa.gct caaaagtaala taaaat att 32O agtag taaac tdtggaagtt catctittaaa at atcaactt attgatatga aagatgaaag 38O cgttgtggca aaaggactitg tagaaagaat aggagcagaa ggttcagttt taacacatala 44 O agittaacgga gaaaagtttgttacagagca gccaatggala gat cataaag ttgctataca SOO attagt atta aatgct cittg tagataaaaa acatggtgta ataaaagata tdt cagaaat 560 atctgctgta gggcatagag titttgcatgg taaaaaaa tatgcgg cat C catt Cttat 62O tgatgacaat gtaatgaaag caatagaaga atgitatt coa ttaggaccat tacataatcc 68O agctaatata atgggaatag atgcttgtaa aaaactaatg ccaaatactic caatggtagc 74 O agtatttgat acago atttic atcagacaat gccagattat gcttatactt atgcaat acc 8OO titatgatata t ctdaaaagt atgatat cag aaaatatggit titt catggaa cittct catag 86 O att cottt ca attgaagcag ccaagttgtt aaagaaagat ccaaaagat c ttaagctaat 92 O aacttgtcat ttaggaaatg gagctagt at atgtgcagta aaccagggala aag cagtaga 98 O tacaactatg ggactt actic cccttgcagg acttgtaatg ggalactagat gtggtgatat 2O4. O agat.ccagct ataataccat ttgtaatgaa aagaacaggt atgtctgtag atgaaatgga 21OO tactittaatgaacaaaaagt caggaatact tagtatica ggagtaa.gca gcgattittag 216 O agatgtagaa gaagctgcaa attcaggaaa tatagagca aaacttgcat taalatatgta 222 O US 8,628,943 B2 29 30 - Continued titat Cacaaa. 
gttaaatctt toataggagc titatgttgca gttittaaatg gag cagatgc 228O tata at attt acagcaggac ttggagaaaa ttcagctact agcagatctg ctatatgtaa 234 O gggattaa.gc tattittggaa ttaaaataga tgaagaaaag aataagaaaa ggggaga agc 24 OO actagaaata agcacacctg attcaaagat aaaagtatta gtaattic cta caaatgaaga 246 O actitatgata gct agggata caaaagaaat agttgaaaat aaataa. 2506

<210> SEQ ID NO 3
<211> LENGTH: 1056
<212> TYPE: DNA
<213> ORGANISM: Clostridium ragsdalei
<400> SEQUENCE: 3

atgaaaggtt ttgcaatgtt aggtattaac aagttaggat ggattgaaaa gaaaaaccca 60

gtaccaggtc cttatgatgc gattgtacat cctctagctg tatccccatg tacatcagat 120

atacatacgg tttttgaagg agcacttggt aatagggaaa atatgatttt aggtcacgaa 180

gctgtaggtg aaatagctga agttggcagt gaagttaaag attttaaagt tggcgataga 240

gttatcgtac catgcacaac acctgactgg agatccttag aagtccaagc tggttttcaa 300

cagcattcaa acggtatgct tgcaggatgg aagttttcca attttaaaga cggtgtattt 360

gcagattact ttcatgtaaa cgatgcagat atgaatcttg caatacttcc agatgaaata 420

cctttagaaa gtgcagttat gatgacagac atgatgacta ctggttttca tggggcagaa 480

cttgctgaca taaaaatggg ttccagtgtt gtcgtaattg gtataggagc tgttggatta 540

atgggaatag ccggttccaa acttcgagga gcaggtagaa ttatcggtgt tggaagcaga 600

cccgtttgtg ttgaaacagc taaattttat ggagcaactg atattgtaaa ttataaaaat 660

ggtgatatag ttgaacaaat aatggactta actcatggta aaggtgtaga ccgtgtaatc 720

atggcaggcg gtggtgctga aacactagca caagcagtaa ctatggttaa acctggcggc 780

gtaatttcta acatcaacta ccatggaagc ggtgatactt tgccaatacc tcgtgttcaa 840

tggggctgcg gcatggctca caaaactata agaggagggt tatgtcccgg cggacgtctt 900

agaatggaaa tgctaagaga ccttgttcta tataaacgtg ttgatttgag caaacttgtt 960

actcatgtat ttgatggtgc agaaaatatt gaaaaggccc ttttgcttat gaaaaataag 1020

ccaaaagatt taattaaatc agtagttaca ttctaa 1056

<210> SEQ ID NO 4 <211> LENGTH: 351 <212> TYPE: PRT <213> ORGANISM: Clostridium ragsdalei <400> SEQUENCE: 4

Met Lys Gly Phe Ala Met Leu Gly Ile Asn Lys Leu Gly Trp Ile Glu 1 5 10 15

Lys Lys Asn Pro Val Pro Gly Pro Tyr Asp Ala Ile Val His Pro Leu 20 25

Ala Val Ser Pro Cys Thr Ser Asp Ile His Thr Val Phe Glu Gly Ala 35 40 45

Leu Gly Asn Arg Glu Asn Met Ile Leu Gly His Glu Ala Val Gly Glu 50 55 60

Ile Ala Glu Val Gly Ser Glu Val Lys Asp Phe Val Gly Asp Arg 65 70 75

Val Ile Val Pro Cys Thr Thr Pro Asp Trp Arg Ser Leu Glu Val Gln 85 90 95

Ala Gly Phe Gln Gln His Ser Asn Gly Met Leu Ala Gly Trp Lys Phe 105 110

Ser Asn Phe Asp Gly Val Phe Ala Asp Phe His Val Asn Asp 115 120 125

Ala Asp Met Asn Leu Ala Ile Leu Pro Asp Glu Ile Pro Leu Ser 130 135 140

Ala Val Met Met Thr Asp Met Met Thr Thr Gly Phe His Gly Glu 145 150 155 160

Leu Ala Asp Ile Lys Met Gly Ser Ser Val Val Val Ile Gly Gly 165 170

Ala Val Leu Met Gly Ile Ala Gly Ser Leu Arg Gly Gly 180 185 190

Arg Ile Gly Val Gly Ser Arg Pro Val Val Glu Thr

Phe Tyr Ala Thr Asp Ile Val Asn Tyr Asn Gly Asp Val 210 215 220

Glu Gln Met Asp Leu Thr His Gly Gly Val Asp Arg Val Ile 225 230 235 240

Met Ala Gly Gly Ala Glu Thr Leu Ala Gln Ala Val Thr Met Val 245 250 255

Pro Gly Val Ile Ser Asn Ile Asn Tyr His Gly Ser Gly Asp 260 265 270

Thr Leu Pro Ile Pro Arg Val Gln Trp Gly Cys Gly Met Ala His 285

Thr Ile Arg Gly Gly Leu Cys Pro Gly Gly Arg Leu Arg Met Glu Met 290 295 300

Leu Arg Asp Leu Val Leu Tyr Arg Val Asp Leu Ser Leu Val 305 310 315 320

Thr His Val Phe Asp Gly Ala Glu Asn Ile Glu Ala Leu Leu Leu 325 330 335

Met Asn Lys Pro Asp Leu Ile Ser Val Val Thr Phe 340 345 350
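
The lengths declared for SEQ ID NO 3 (1056 nucleotides) and SEQ ID NO 4 (351 residues) fit an open reading frame of 352 codons ending in a stop codon. The sketch below illustrates that check with the standard genetic code. `SEQ3_PREFIX` is an assumed cleaned reading of the first 48 bases of SEQ ID NO 3, with extraction artifacts removed, and `translate` and `expected_protein_length` are hypothetical helper names, not part of the listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order ('*' marks a stop codon).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1 into one-letter amino acid codes."""
    dna = dna.lower()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def expected_protein_length(nucleotides: int) -> int:
    """Residues encoded by a reading frame that ends in a single stop codon."""
    return nucleotides // 3 - 1


# Assumption: first 48 bases of SEQ ID NO 3 after removing extraction artifacts.
SEQ3_PREFIX = "atgaaaggttttgcaatgttaggtattaacaagttaggatggattgaa"
# First 16 residues of SEQ ID NO 4 (Met Lys Gly Phe Ala Met Leu Gly ...), one-letter form.
SEQ4_PREFIX = "MKGFAMLGINKLGWIE"

if __name__ == "__main__":
    print(translate(SEQ3_PREFIX) == SEQ4_PREFIX)
    print(expected_protein_length(1056))
```

Under these assumptions the translated prefix agrees with the first row of SEQ ID NO 4, and the length arithmetic gives 351 residues.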

<210> SEQ ID NO 5 <211> LENGTH: 351 <212> TYPE: PRT <213> ORGANISM: Clostridium ljungdahlii

<400> SEQUENCE: 5

Met Lys Gly Phe Ala Met Leu Gly Ile Asn Lys Leu Gly Trp Ile Glu 1 15

Lys Lys Asn Pro Val Pro Gly Pro Tyr Asp Ala Ile Val His Pro Leu 25

Ala Val Ser Pro Thr Ser Asp Ile His Thr Val Phe Glu Gly Ala 35 40 45

Leu Gly Asn Arg Glu Asn Met Ile Leu Gly His Glu Ala Val Gly Glu 50 55 60

Ile Ala Glu Val Gly Ser Glu Val Asp Phe Val Gly Asp Arg 65 70

Val Ile Val Pro Cys Thr Thr Pro Asp Trp Arg Ser Leu Glu Val Gln 85 90 95

Ala Gly Phe Gln Gln His Ser Asn Gly Met Leu Ala Gly Trp Phe 100 105 110

Ser Asn Phe Asp Gly Val Phe Ala Asp Phe His Val Asn Asp 115 120 125

Ala Asp Met Asn Leu Ala Ile Leu Pro Asp Glu Ile Pro Leu Glu Ser 130 135 140

Ala Val Met Met Thr Asp Met Met Thr Thr Gly Phe His Gly Glu 145 150 155 160

Leu Ala Asp Ile Lys Met Gly Ser Ser Val Val Val Ile Gly Gly 165 170

Ala Val Gly Leu Met Gly Ile Ala Gly Ser Leu Arg Gly Gly 180 185 190

Arg Ile Ile Gly Val Gly Ser Arg Pro Val Val Glu Thr

Phe Tyr Gly Ala Thr Asp Ile Val Asn Tyr Asn Gly Asp Val 210 215 220

Glu Gln Ile Met Asp Leu Thr His Gly Gly Val Asp Arg Val Ile 225 230 235 240

Met Ala Gly Ala Glu Thr Leu Ala Gln Ala Val Thr Met Val 245 250 255

Pro Val Ile Ser Asn Ile Asn Tyr His Gly Ser Gly Asp 260 265 270

Thr Leu Pro Ile Pro Arg Val Gln Trp Gly Cys Gly Met Ala His 275 285

Thr Ile Arg Gly Gly Leu Cys Pro Gly Gly Arg Leu Arg Met Glu Met 290 295 300

Leu Arg Asp Leu Val Leu Tyr Arg Val Asp Leu Ser Leu Val 305 310 315 320

Thr His Val Phe Asp Gly Ala Glu Asn Ile Glu Ala Leu Leu Leu 325 330 335

Met Asn Lys Pro Asp Leu Ile Ser Val Val Thr Phe 340 345 350

<210> SEQ ID NO 6 <211> LENGTH: 352 <212> TYPE: PRT <213> ORGANISM: Thermoanaerobacter ethanolicus

<400> SEQUENCE: 6

Met Lys Gly Phe Ala Met Leu Ser Ile Gly Lys Val Gly Trp Ile Glu 1 15

Lys Glu Lys Pro Ala Pro Gly Pro Phe Asp Ala Ile Val Arg Pro Leu 20 25

Ala Val Ala Pro Thr Ser Asp Ile His Thr Val Phe Glu Gly Ala 35 40 45

Ile Gly Glu Arg His Asn Met Ile Leu Gly His Glu Ala Val Gly Glu 50 55 60

Val Val Glu Val Gly Ser Glu Val Asp Phe Pro Gly Asp Arg 65 70

Val Val Val Pro Ala Ile Thr Pro Asp Trp Trp Thr Ser Glu Val Gln 85 90 95

Arg Gly Tyr His Gln His Ser Gly Gly Met Leu Ala Gly Trp Phe 105 110

Ser Asn Val Lys Asp Gly Val Phe Gly Glu Phe Phe His Val Asn Asp 115 120 125

Ala Asp Met Asn Leu Ala His Leu Pro Glu Ile Pro Leu Glu Ala 130 135 140

Ala Val Met Ile Pro Asp Met Met Thr Thr Gly Phe His Gly Ala Glu 145 150 155 160

Leu Ala Asp Ile Glu Leu Gly Ala Thr Val Ala Val Leu Gly Ile Gly 165 170 175

Pro Val Leu Met Ala Val Ala Gly Ala Leu Arg Gly A. Gly 180 185 190

Arg Ile Ala Val Gly Ser Arg Pro Val Val Asp Ala 205

Tyr Ala Thr Asp Ile Val Asn Asp Gly Pro Glu 210 215

Ser Gln Met Asn Leu Thr Glu Gly Gly Val Asp Ala Ile 225 230 235 240

Ile Ala Gly Asn Ala Asp Ile Met Ala Thr Ala Val Val 245 250

Pro Gly Thr Ile Ala Asn Val Asn Phe Gly Glu Glu 260 265 270

Val Leu Pro Val Pro Arg Leu Glu Trp Gly Gly Met Ala His 280 285

Thr Ile Lys Gly Gly Leu Cys Pro Gly Gly Arg Leu Arg Met Glu Arg 290 295 300

Leu Ile Asp Leu Val Phe Tyr Lys Pro Val Asp Pro Ser Leu Val 305 310 315 320

Thr His Val Phe Gln Gly Phe Asp Asn Ile Glu Ala Phe Met Leu 325 330 335

Met Asp Lys Pro Lys Asp Leu Ile Pro Val Val Ile Leu Ala 340 345 350

What is claimed is:

1. An isolated polynucleotide comprising a nucleotide sequence encoding an operon that codes for carbon monoxide dehydrogenase, a membrane-associated electron transfer protein, a ferredoxin oxidoreductase, and a promoter, said sequence being at least 97% identical to SEQ ID NO. 1.

2. A vector comprising the polynucleotide of claim 1.

3. An isolated transformant containing the polynucleotide of claim 1.

4. An isolated transformant carrying the vector of claim 2.

5. A method of producing ethanol comprising: isolating and purifying anaerobic, ethanologenic microorganisms carrying the polynucleotide of claim 1, fermenting syngas with said microorganisms in a fermentation bioreactor.

6. A method of increasing ethanologenesis in a microorganism containing the nucleotide sequence of claim 1, said method comprising: modifying or duplicating a promoter region of said nucleotide sequence to increase the activity of the operon of claim 1 or to cause overexpression of the operon.
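
Claim 1 recites a sequence "at least 97% identical" to SEQ ID NO. 1. As an illustration only, the sketch below computes percent identity for two sequences that have already been aligned to equal length, counting identical columns over all aligned columns. This is one common convention and is an assumption here; this section does not state which alignment method or identity definition governs the claim. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned columns in which both sequences carry the same residue.

    Assumes the inputs are already aligned (equal length, gaps written as '-').
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.lower(), aligned_b.lower())
        if not (x == gap and y == gap)
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 97.0) -> bool:
    """True when the pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy 10-base sequences differing at one position (not taken from the listing).
    reference = "atgaaaggtt"
    candidate = "atgaaaggta"
    print(percent_identity(reference, candidate))
    print(meets_threshold(reference, candidate))
```

For a real comparison the two sequences would first be aligned with an alignment tool, and the choice of denominator (alignment length, shorter sequence, or reference length) changes the resulting figure.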