BIOCHEMICAL AND STRUCTURAL STUDIES OF MICROBIAL

ENZYMES FOR THE BIOSYNTHESIS OF 1,3-BUTANEDIOL

by

Taeho Kim

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy

Graduate Department of Chemical Engineering and Applied Chemistry

University of Toronto

© Copyright by Taeho Kim

Biochemical and Structural Studies of Microbial Enzymes for the Biosynthesis of 1,3-Butanediol

Taeho Kim

Doctor of Philosophy

Department of Chemical Engineering and Applied Chemistry

University of Toronto

2019

Abstract

The protein design approach to systems metabolic engineering entails the discovery of novel enzymes and protein engineering to improve biocatalyst activity. It allows us to design and optimize cells for the generation of novel metabolic pathways, aiming at the production of chemicals from renewable biomass. This thesis addresses two topics: the discovery and engineering of enzymes to optimize the biosynthetic pathway for 1,3-butanediol (1,3BDO), and the in vivo demonstration of the protein design in E. coli and cyanobacteria. The 1,3BDO pathway consists of three steps: pyruvate, a natural metabolite, is converted into acetaldehyde by pyruvate decarboxylase (PDC); acetaldehyde then undergoes aldol condensation catalyzed by 2-deoxyribose-5-phosphate aldolase (DERA), yielding 3-hydroxybutanal (3-HB); finally, an aldo-keto reductase (AKR) catalyzes the reduction of 3-HB to produce 1,3BDO. We therefore specifically investigated novel DERAs and AKRs in this thesis. First, PA1127 from Pseudomonas aeruginosa was identified based on its activity on 3-HB. This AKR was studied biochemically and structurally, revealing a remarkable promiscuity and providing insights for further applications. Next, BH1352 from Bacillus halodurans was identified as having high aldol condensation activity. As the DERA-catalyzed condensation was the rate-limiting step, we rationally designed site-directed mutations to enhance 1,3BDO synthesis in the DERA-AKR coupled reaction; two of the mutations showed 2.6 times higher in vitro activity than wild-type BH1352. The two mutations were then introduced into E. coli fermentation for 1,3BDO biosynthesis from glucose. The Phe160Tyr single mutation exhibited a 5-fold increase in 1,3BDO titer, and the Phe160Tyr/Met173Ile double mutation displayed a 6-fold increase in 1,3BDO production with a 7-fold improvement in yield from glucose compared to the wild type. Finally, the biosynthetic pathway was introduced into Synechococcus elongatus PCC 7942 for the photosynthetic conversion of CO2 into 1,3BDO. Although we have not yet obtained a cyanobacterial strain that produces 1,3BDO, the cyanobacterial engineering system was established, and the heterologous expression of the pathway genes was demonstrated via genetic engineering of Synechococcus elongatus PCC 7942.

Acknowledgement

First, I would like to dedicate this thesis to my parents. They could not have been more supportive during my entire life. I am also very grateful to my sister for all the advice and support that helped me through the Ph.D.

I sincerely appreciate the guidance and teaching from my supervisors, Dr. Yakunin and Dr. Mahadevan. Dr. Yakunin has always made himself available to discuss research subjects with me, and has always provided me with directions and answers. He not only made me a researcher with a doctoral degree but also transformed me into an analytical thinker. Dr. Mahadevan has always been supportive; he inspired me with many creative ideas and helped me take on a variety of projects.

I am also very grateful to my friends and colleagues in the Enzyme and Genomic Lab and the Systems Metabolic Engineering Lab. Anna could not have been more kind and helpful to me over the last 5 years, and Sofia had to take on too many responsibilities to manage the lab and help everyone do their jobs. I cannot thank you two enough for that. Kayla has been a good friend as well as an inspiring colleague, helping me with so much of my Ph.D. thesis work. I also feel grateful to Mahbod, Kevin, Tommy, Vik, Naveen, Mabel, Andy, and Susie for welcoming me when I joined Biozone and being supportive friends throughout my Ph.D.

Last, but not least, I would like to express my genuine gratitude to Jeong for giving me the opportunity to join Biozone at the University of Toronto and for always having been a mentor to me.

Table of Contents

ABSTRACT
ACKNOWLEDGEMENT
LIST OF FIGURES
LIST OF TABLES
LIST OF APPENDICES
LIST OF ABBREVIATIONS

CHAPTER 1. THESIS BACKGROUND, MOTIVATION, AND OUTLINE
  1.1. Background
    1.1.1. Biocatalysis and biocatalysts
    1.1.2. Systems metabolic engineering
  1.2. Motivation and objective of this Ph.D. study
    1.2.1. Motivation
    1.2.2. Main objective
  1.3. Thesis outline

CHAPTER 2. LITERATURE REVIEW AND GENERAL INTRODUCTION
  2.1. Aldo-keto Reductases
    2.1.1. Introduction
    2.1.2. Microbial AKRs
  2.2. Aldol Reaction and Aldolases
    2.2.1. Aldol reaction
    2.2.2. Aldolases
    2.2.3. 2-deoxyribose-5-phosphate aldolase (DERA)
  2.3. 1,3-Butanediol production technologies and applications
    2.3.1. Introduction
    2.3.2. Pathway for bioconversion of glucose into 1,3BDO
    2.3.3. Aldolase-based biosynthetic pathway for 1,3BDO
  2.4. Cyanobacteria engineering
    2.4.1. Motivation of cyanobacteria engineering study
    2.4.2. Cyanobacteria
    2.4.3. Notes in cyanobacterial engineering
  2.5. Hypotheses and research objectives
  2.6. Publication status and contribution

CHAPTER 3. NOVEL ALDO-KETO REDUCTASES FOR THE BIOCATALYTIC CONVERSION OF 3-HYDROXYBUTANAL TO 1,3-BUTANEDIOL: STRUCTURAL AND BIOCHEMICAL STUDIES
  3.1. Abstract
  3.2. Introduction
  3.3. Materials and Methods
  3.4. Results and Discussion
    3.4.1. Screening of purified AKRs for aldo-keto reductase activity against 3-HB
    3.4.2. Biochemical characterization of STM2406 and PA1127
    3.4.3. Crystal structure of STM2406
    3.4.4. Active sites and catalytic mechanisms of STM2406 and PA1127
    3.4.5. Mutational analysis of substrate selectivity of STM2406 and PA1127
    3.4.6. Mutational analysis of the STM2406 substrate-binding pocket
    3.4.7. C-terminal loop of AKR structure
  3.5. Conclusion

CHAPTER 4. STRUCTURAL STUDIES AND ENGINEERING OF 2-DEOXYRIBOSE-5-PHOSPHATE ALDOLASE BH1352 FOR THE BIOSYNTHESIS OF 1,3-BUTANEDIOL
  4.1. Abstract
  4.2. Introduction
  4.3. Materials and methods
  4.4. Results and discussion
    4.4.1. Phylogenetic analysis of DERA sequences
    4.4.2. Screening of purified DERAs for biosynthesis of 1,3BDO from acetaldehyde
    4.4.3. Crystal structure of BH1352: overall fold, C-terminal, and
    4.4.4. Probing the active site of BH1352 using site-directed mutagenesis
    4.4.5. Structure-based engineering of BH1352 for enhanced production of 1,3BDO
    4.4.6. Synthesis of (3R,5S)-6-chloro-2,4,6-trideoxyhexapyranoside via sequential condensation of acetaldehyde and chloroacetaldehyde
    4.4.7. Structural insight on inhibition of DERA at high concentrations of acetaldehyde
  4.5. Conclusion

CHAPTER 5. IN VIVO STUDY OF ENGINEERED BH1352 FOR 1,3-BUTANEDIOL BIOSYNTHESIS FROM E. COLI AND SYNECHOCOCCUS ELONGATUS PCC 7942
  5.1. Introduction
    5.1.1. Biosynthesis of 1,3BDO from DERA-based pathway in E. coli
    5.1.2. Cyanobacterial engineering for photosynthetic conversion of CO2 into 1,3BDO
  5.2. Materials and methods
  5.3. Results and discussion
    5.3.1. Whole-cell biotransformation using BH1352 and its variants expression in BL21
    5.3.2. Application of engineered BH1352 variants for in vivo production of 1,3BDO from glucose
    5.3.3. Extracellular 1,3BDO toxicity of PCC 7942
    5.3.4. Transformation of Synechococcus elongatus PCC 7942
    5.3.5. Cyanobacterial expression of 1,3BDO pathway genes
  5.4. Conclusion
    5.4.1. In vivo study of engineered BH1352 in E. coli
    5.4.2. Cyanobacterial production of 1,3BDO using PDC-DERA-AKR pathway

CHAPTER 6. SUMMARY, RECOMMENDATIONS, AND OUTLOOK
  6.1. Summary
  6.2. Recommendations
    6.2.1. Address the reducing balance
    6.2.2. Combinatorial optimization of the translation rate of pathway genes
    6.2.3. Troubleshoot cyanobacterial transformation
  6.3. Outlook

BIBLIOGRAPHY

APPENDIX A

APPENDIX B

APPENDIX C

List of Figures

Figure 1-1. Graphical presentation of the main objective of this thesis.

Figure 2-1. "Push-pull" mechanism for acid-base catalysis of AKRs (labeling for STM2406 from Salmonella typhimurium).

Figure 2-2. Reversible retro-aldol cleavage of 2-deoxyribose-5-phosphate (DRP) via DERA.

Figure 2-3. Overall structure of monomeric BH1352 from Bacillus halodurans. The catalytic residues are displayed with stick model and labels.

Figure 2-4. The predicted catalytic mechanism of BH1352 DERA from Bacillus halodurans.

Figure 2-5. Proposed DERA enzyme-substrate interaction for BH1352 and DRP.

Figure 2-6. Biosynthetic pathway for 1,3-butanediol production. A: Inverted fatty acid β-oxidation pathway (Kataoka et al., 2013); B: DERA-based pathway (Nemr et al., 2018).

Figure 2-7. Schematic representation of the photosynthesis of cyanobacteria. Yellow arrows represent light-dependent reactions, while dark gray arrows indicate light-independent reactions. The red dashed arrow displays the electron flow pathway and the blue dotted arrows show the journey of protons within the photosystem. Abbreviations: b6f: b6f complex; Fd: ferredoxin; FNR: ferredoxin-NADP reductase; PC: plastocyanin; PS I and II: photosystem I and II; PQ: plastoquinone; 3PG: 3-phosphoglycerate.

Figure 2-8. Schematic representation of the Calvin cycle. C indicates carbon and P represents a phosphate group. In terms of stoichiometry, three ribulose 1,5-bisphosphate (RuBP) molecules, each with 5 carbons, combine with three carbon dioxide (1 carbon) molecules to yield six 3-phosphoglycerate (3PG, 3 carbons) molecules. While the carbon number is conserved during the conversion from 3PG into glyceraldehyde 3-phosphate (G3P), 5 out of 6 G3P are used in the regeneration of 3 RuBP molecules.

Figure 3-1. The proposed DERA-AKR pathway of 1,3BDO biosynthesis from acetaldehyde.

Figure 3-2. Screening of 21 purified AKRs for reductase activity against 3-HB. The reaction mixtures contained 10 mM 3-HB, 0.5 mM NADPH, and 10 µg of the indicated purified protein. For each enzyme, the UniProt gene code and source organism are indicated, and the two AKRs characterized in this work are boxed.

Figure 3-3. Phylogenetic analysis of AKRs used in this work. The enzymes characterized in this work (STM2406 and PA1127) are marked with red boxes. The tree encompasses the AKRs with high activities against 3-HB, as well as STM2406 and the proteins whose subfamilies have already been assigned (shown next to the protein). The phylogenetic tree was generated using the neighbor-joining method (Tamura et al., 2013). The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances are in the units of the number of amino acid differences per site. The numbers next to each node represent the probability that supports the evidence for excluding clusters below the node point. These statistical values were calculated by the bootstrap method with 500 replications and used to infer AKR subfamilies of enzymes used in this work. All of the AKRs are represented with UniProt gene names and host organisms.

Figure 3-4. Substrate profiles of STM2406 (A) and PA1127 (B) AKR activities against different substrates. The substrates were used at a concentration of 1 mM. Both AKRs showed no measurable activity toward 1 mM acetaldehyde. Results are the means and standard deviations (error bars) of data from at least two independent determinations.

Figure 3-5. Overall folds of STM2406 and PA1127. (A) Cartoon representation of the crystal structure of the STM2406 protomer. (B) Cartoon representation of the PA1127 structural model generated using the Phyre2 web portal. The α helices and β strand structures that compose the TIM barrel are indicated and labeled.

Figure 3-6. Active site of STM2406. (A) Closeup view of the STM2406 active site with bound cacodylate (CA). The 2mFo-DFc map contoured at 1.0 σ is displayed (shown as a blue mesh) around the cacodylate molecule. The catalytic tetrad (Asp61, Tyr66, Lys97, and His138) and residues involved in substrate binding are shown as sticks with green carbons. (B) Structural model of the PA1127 active site showing the catalytic tetrad (Asp54, Tyr59, Lys84, and His125) and substrate-binding residues. (C) The STM2406 active site with the bound NADPH molecule (shown as sticks with carbon atoms colored in magenta). The catalytic residues and residues involved in cofactor binding are shown as sticks with green carbons and are labeled.

Figure 3-7. AKR activities after alanine replacement mutagenesis of STM2406 (A) and PA1127 (B). The PA1127 H125A and Y203A were expressed in insoluble forms. Catalytic activity was assayed using saturating concentrations of 4-nitrobenzaldehyde (2 mM; white bars) and 3-HB (64 mM for STM2406 and 16 mM for PA1127; gray bars) as the substrates.

Figure 3-8. AKR activity of purified N65 mutant proteins after site-directed mutagenesis of the STM2406 active site. Specific activities of the wild-type (WT) and N65 mutant proteins against 3-HB (A), methylglyoxal (B), 3-pyridinecarboxaldehyde (C), and 4-nitrobenzaldehyde (D) as the substrates. The concentrations of each of the substrates used are the saturating concentrations for the WT enzyme. Saturation curves of the purified WT and N65 mutant proteins with 3-HB (E), methylglyoxal (F), 3-pyridinecarboxaldehyde (G), and 4-nitrobenzaldehyde (H) as the substrates.

Figure 4-1. Phylogenetic analysis of DERAs and screening of purified proteins for 1,3BDO formation. A: Phylogenetic analysis of the DERA family: unrooted phylogenetic tree of 2,553 DERA sequences showing the presence of five main clusters (1-5) and non-clustered sequences. Black circles indicate the 20 DERA proteins from different clusters selected for activity screening (with organism names and UniProt codes). BH1352 from B. halodurans characterized in this work is indicated by the red rectangle. B: Screening of 20 purified DERAs for the production of 1,3BDO from acetaldehyde in the presence of PA1127. The graph bars represent the final concentration of 1,3BDO produced after 2 h of incubation with 10 mM NADPH and 50 mM acetaldehyde (see Materials and Methods for experimental details).

Figure 4-2. Crystal structure of BH1352. A: Dimeric structure of BH1352; B: Overall view of BH1352 protomer. The α helices and β strand structures that compose the TIM barrel are labeled. The is displayed with sticks.

Figure 4-3. Active site of BH1352. A: Close-up view of the BH1352 active site. The catalytic residues and the key residues involved in substrate binding are shown as sticks with green carbons and labeled; B: Diagram showing the predicted binding mode of 2-deoxyribose-5-phosphate (DRP) in the active site of BH1352, modeled using the EcDERA-DRP complex (PDB code 1JCL).

Figure 4-4. Two orientations of the C-terminal Tyr224 of BH1352 (red boxes of A and B), each being displayed in detail on the right.

Figure 4-5. Substrate entrance regions of BH1352 (A) and EcDERA (B, PDB 1JCL, DRP shown as a green stick). The residues constituting the substrate entrance are displayed with stick model and labeled; C and D: The clusters of hydrophobic amino acids (green stick model, the predicted hydrophobic contact between protein and ligand shown with a dashed arch), one near the catalytic Lys155 (C) and the other in between the β-strand barrel and the α-helix barrel (D).

Figure 4-6. Structure-based sequence alignment of DERAs active in 1,3BDO production: six DERAs from Bacilli including BH1352 (light red background), six proteobacterial DERAs (light blue background), and TM1559 (the center row). The secondary structure elements derived from the structures of BH1352 and E. coli DeoC are shown above and below the alignment, respectively. Residues conserved in all proteins are shown in white font on a red background. The columns with red residues indicate the presence of more than 70% of biochemically similar residues. The catalytic residues are indicated by cyan boxes with red residue numbers, whereas the columns with black boxes and residue numbers indicate the substrate entrance residues (from Fig. 4-5A). The residues of the hydrophobic amino acid clusters (from Fig. 4-5C and D) are labeled with black circles.

Figure 4-7. Site-directed mutagenesis of BH1352: catalytic activity of purified wild-type and mutant proteins in the retro-aldol (DRP cleavage) and acetaldehyde condensation reactions. A: retro-aldolase activity of purified proteins with 1 mM DRP as substrate; B: acetaldehyde condensation reaction of purified proteins measured as the formation of 1,3BDO in the presence of PA1127 (DERA-AKR ratio 1:1). Experimental details are described in Materials and Methods.

Figure 4-8. BH1352-catalyzed tandem aldol condensation of acetaldehyde and chloroacetaldehyde to produce (3R,5S)-6-chloro-2,4,6-trideoxyhexapyranoside (dashed box), which is a key chiral precursor of atorvastatin.

Figure 4-9. A: Structural alignment of BH1352 (light gray) and EcDERA (green). The two target cysteine residues of BH1352, the corresponding residues of EcDERA, and the catalytic triad of the two DERAs are represented with stick model. DRP substrate bound with EcDERA Lys167 is displayed with a cyan stick. The labels of the key residues are colored differently for BH1352 (black) and EcDERA (green); B: Time-dependent residual activity of wild-type BH1352 and its variants during incubation with 100 mM acetaldehyde.

Figure 5-1. Aldolase-based biosynthetic pathway for (R)-1,3BDO in the context of E. coli central carbon metabolism at the pyruvate node. The figure was reproduced from the paper by Nemr et al.

Figure 5-2. The biosynthetic pathway for cyanobacterial 1,3BDO production.

Figure 5-3. A: Graphical scheme of the photobioreactor design; B: Photo of the in-house photobioreactor.

Figure 5-4. A: The in vivo activity assay of BH1352 and its variants was performed by expression in recombinant E. coli BL21 with the T7 promoter; B: The in vivo assay results comparing wild-type BH1352 and its variants. The culture samples were collected at 2.5 h, 5 h, and 16 h after the initial addition of acetaldehyde and injected into HPLC for the measurement of 1,3BDO.

Figure 5-5. Production of 1,3BDO from glucose by E. coli cells expressing the aldolase-based 1,3BDO pathway with the wild-type and mutant BH1352. The E. coli strains used were BDO-0 (wild-type BH1352), BDO-1 (BH1352 F160Y), and BDO-2 (F160Y/M173I). The white bars show the production of 1,3BDO (g/L) by fermentation of corresponding strains, whereas the gray bars represent the corresponding yield of 1,3BDO (mg/g glucose). The results are shown as means (± S.D.) from duplicate experiments. Experimental details are described in Materials and Methods.

Figure 5-6. 1,3BDO toxicity effect on the growth of WT PCC 7942. Data points are from triplicate measurements; error bars represent standard deviations.

Figure 5-7. Schematic representation of homologous recombination for PCC 7942 genome integration of the 1,3BDO pathway genes.

Figure 5-8. Culturing of PCC 7942 and its variants (TK01 and TK02) in a shaking incubator (left) and growth curve of cyanobacteria culture (right).

Figure 5-9. A: Cell density of WT PCC 7942 and TK12; B: SDS-PAGE of the purified 1,3BDO pathway enzymes expressed in TK12. Each gene name and the protein size (kDa) are indicated.

Figure 5-10. A: The 1,3BDO pathway operon designed to be integrated into the neutral site I of the genome of PCC 7942 with the fragment size indicated. The primers used for colony PCR are represented with black arrows at each end; B: Colony PCR of TK09 (left) and TK11 (right) transformation. PCR products obtained with the same primers from the PCC 7942 genome (1), the vector plasmid without the pathway genes (2), and the plasmid with the pathway genes (3) are shown as a control set. The red dashed boxes indicate where the band is expected to appear if the operon is correctly integrated into the genome.

Figure 6-1. Protein design for optimization of the 1,3BDO pathway and in vivo application in E. coli and Synechococcus elongatus. Enzyme screening, biochemical characterization, and structural analysis were conducted to identify the target enzymes (DERA and AKR), and the key step enzyme was rationally engineered to improve the catalytic activity of the target reaction. The protein-engineered pathway was then applied in the biological systems of E. coli and a cyanobacterial strain for biosynthesis of 1,3BDO from glucose and photosynthesis.

Figure 6-2. The co-substrate binding of NADPH-dependent AKRs (A) and XYL1 (AKR2B5) with a dual co-substrate specificity (B). The structural segment including Glu227 of XYL1 is not determined, but the binding mode was proposed by Luccio et al. (2006).

List of Tables

Table 2-1. Summary of bioproduction of 1,3BDO. LB stands for lysogeny broth.

Table 2-2. The model strain of cyanobacteria engineering in this study.

Table 3-1. Kinetic parameters of STM2406 and PA1127 with various substrates. Values are the means and standard errors.

Table 4-1. Effect of rationally designed BH1352 mutations on the retro-aldol cleavage of DRP.

Table 4-2. Crystallographic data collection and model refinement statistics for the crystal structure of BH1352. The PDBs will not be released until the paper is accepted.

Table 4-3. Performance of different DERAs in the synthesis of (3R,5S)-6-chloro-2,4,6-trideoxyhexapyranoside.

Table 5-1. Strains and plasmids used in this study.

List of Appendices

Appendix A

Figure A1. LC-MS analysis of the reaction products of 3-HB reduction by PA1127 and STM2406.

Figure A2. Screening of purified STM2406 (A) and PA1127 (B) for AKR activity against various substrates.

Figure A3. A: Asymmetric unit of STM2406 dimer; B: PISA predicted octameric structure of STM2406; C: Structural superposition of STM2406 and model PA1127.

Figure A4. Structure-based sequence alignment of STM2406 and AKRs with higher activity toward 3-HB, which belong to the AKR5 and AKR11 subfamilies.

Figure A5. Overall structure of AKRs and their C-terminal loops.

Figure A6. Effect of the C-terminal loop deletion on AKR activity of YhdN, YvgN, and PA1127.

Table A1. Microbial AKRs used in this work.

Table A2. Crystallographic data collection and model refinement statistics.

Appendix B

Figure B1. SDS-PAGE of Coomassie blue-stained purified DERAs used in this study.

Figure B2. Optimization of DERA-AKR coupled assay.

Figure B3. Optimal pH for DRP cleavage reaction of BH1352.

Figure B4. Arrangement of the subunits of BH1352 (A), EcDERA (B), and TM1559 (C).

Figure B5. Full-length crystal structure of TM1559, showing the C-terminal end Tyr246 (PDB 3R12).

Figure B6. Proposed catalytic mechanism of aldol condensation of acetaldehyde by BH1352.

Figure B7. Close-up view of the active site of TM1559.

Figure B8. A and B: Structural alignment of BH1352 (light gray, PDB 6D33), LbDERA (cyan, PDB code 4XBK), and LbDERA T29L/F163Y mutant (magenta, PDB code 5H91), showing a closeup view of the phosphate binding pocket; C and D: Prediction of BH1352 variant structures with mutations with a predicted H-bond shown by a red dash (C) and an introduced cavity highlighted with a red dashed circle (D). The structural models were built using SCWRL4.

Figure B9. Design of LMSE51C strain to maximize the carbon flux towards 1,3BDO by deleting several genes involved in the formation of formate, lactate, acetate, acetolactate, and ethanol. This strain was designed and engineered in a previous study from our group.

Table B1. Microbial DERAs studied in this paper.

Table B2. Hydrophobic residues involved in interactions between A and B subunits of BH1352, EcDERA, and TM1559.

Table B3. Amino acids found in the two hydrophobic clusters of the DERA family.

Table B4. Strains and plasmids used for demonstration of the in vivo effect of BH1352 mutations.

Appendix C

Figure C1. Detection limit of LC-MS for 1,3BDO (Na+ and H+ adduct), 2,3BDO (Na+ adduct), and acetoin (H+ adduct).

Figure C2. Design of LMSE51C strain to maximize the carbon flux towards 1,3BDO by deleting several genes involved in the formation of formate, lactate, acetate, acetolactate, and ethanol.

Figure C3. HPLC chromatograms of TK01 media (A), TK02 media (B), and 1 mg/ml standard of acetaldehyde and 1,3BDO mixture (C).

List of Abbreviations

1,3BDO    1,3-butanediol
3-HB      3-hydroxybutanal
AKR       Aldo-keto reductase
CTHP      (3R,5S)-6-chloro-2,4,6-trideoxyhexapyranoside
DERA      2-deoxyribose-5-phosphate aldolase
DHA       Dihydroxyacetone
DHAP      Dihydroxyacetone phosphate
DO        Dissolved oxygen
DRP       2-deoxyribose-5-phosphate
EcDERA    DERA from E. coli
FBP       Fructose-1,6-bisphosphate
FSA       Fructose-6-phosphate aldolase
G3P       Glyceraldehyde-3-phosphate
KDPG      2-keto-3-deoxy-6-phosphogluconate
KDPGA     2-keto-3-deoxy-6-phosphogluconate aldolase
LB        Lysogeny broth
NAD(P)H   Nicotinamide adenine dinucleotide (phosphate) cofactor, reduced (H)
OD        Optical density (measured at a specific wavelength)
PBR       Photobioreactor
PCC       Pasteur culture collection of cyanobacteria
PEP       Phosphoenolpyruvate
T.m       Thermotoga maritima
Z.m       Zymomonas mobilis

Chapter 1. Thesis Background, Motivation, and Outline

1.1. Background

1.1.1. Biocatalysis and biocatalysts

What defines life? There are two major conditions: self-replication and catalysis of a variety of biochemical reactions. Enzymes in living organisms carry out these biochemical reactions, regardless of whether the reactions are simple or complex, and of how advanced the host organisms are. What, then, is an enzyme? Enzymes are biomolecules that catalyze thousands of chemical reactions with specificity, keeping living organisms alive.

The term “enzyme” was coined by the German physiologist Wilhelm Kühne (1837-1900). It originates from the ancient Greek words ‘en’, meaning ‘in’ or ‘within’, and ‘zume’, meaning ‘leaven’. The etymology reveals how enzymes were first recognized in human history: as something inside the leaven that is added to bread dough so that it ferments before baking.

In human history, fermentation has been used since the Neolithic age, mainly for beverages. Ancient people realized that functional microorganisms transform metabolites, a process that can be exploited to enhance bio-availability, enrich the quality of food, preserve it, ensure food safety, degrade toxic components and anti-nutritive factors, serve as antioxidant and antimicrobial compounds, perform probiotic functions, and provide bioactive compounds (Tamang et al., 2009, 2016; Thapa and Tamang, 2015; Tamang, Watanabe and Holzapfel, 2016). For example, microbial starters have been commonly used for the fermentation of meat, vegetables, fruits, beverages, and dairy products. Yeast is yet another example of the human use of microbial metabolism in foods and alcoholic beverages.

The concept of modern biocatalysis emerged as enzymes and microbes were applied to synthetic chemistry. The first milestone of biocatalysis was the kinetic resolution of racemic tartaric acid by treatment with a culture of Penicillium glaucum, a common mold, carried out by Louis Pasteur in 1858 (Gal, 2008). Another remarkable discovery was achieved by Eduard Buchner, who, in 1897, successfully conducted cell-free fermentation of sugar with yeast extract. Buchner showed that fermentation is carried out by soluble enzymes in yeast “juice” without whole cells, which earned him the Nobel Prize in Chemistry in 1907 (Kohler, 1971). His discovery was the first proof that biotransformation does not necessarily require living cells, and that the application of fermentation could potentially be expanded from food processing to the production of chemicals.

As enzymes and biocatalysis began to be understood, it was found that enzymes catalyze many different reactions with selectivity, and the range of biocatalytic reactions expanded as more enzymes were discovered and characterized. Enzymes are produced from renewable resources in a non-toxic way and are biodegradable. In addition, they usually function under environmentally friendly conditions with mild pH, moderate temperature, and ambient pressure. Moreover, enzymes enable the synthesis of new chemicals that are difficult or practically impossible to manufacture in the traditional chemical industry. Using enzymes, biocatalysis can utilize biomass such as agricultural crops and residues, animal residues, and sewage as feedstock for chemical and fuel production, which alleviates the dependency on fossil carbon (Hatti-Kaul et al., 2007; Adsul et al., 2011). Therefore, biocatalysis holds great potential as a sustainable and green manufacturing process.

Rapid advances in molecular biology, biotechnology, and structural biology are enabling biocatalysis to establish environmentally friendly alternatives to traditional metallo- and organo-catalysis (Schoemaker, 2003). Over the past century, biocatalysis has advanced in three waves. The first (early 1900s) focused on the manufacture of antibiotics and detergents, and the second (1980s-1990s) on pharmaceutical intermediates, fine chemicals, cosmetics, and polymers. The third wave (late 1990s to present) realized the industrial-scale production of biofuels from biomass through advances in metabolic engineering and protein engineering (Bornscheuer et al., 2012).

There are still, however, limits to the application of enzymes in biocatalysis and the manufacturing of chemicals. The size of the natural enzyme pool is limited, and both native enzymes and many commercial enzymes are not suitable for industrial synthesis, as they have evolved to be active with their natural substrates. The stability and activity of naturally occurring enzymes often do not meet industrial demands, which require tolerance of organic solvents and elevated temperatures (Bornscheuer et al., 2012). In addition, some enzymes suffer from allosteric inhibition, which prevents them from being used under processing conditions with a high substrate load. In most cases, it has taken 10 to 20 years to overcome these hurdles and develop the flagship industrial-scale biocatalysis processes.

To overcome the limitations concerning the properties of enzymes, protein design via engineering and the discovery of novel enzymes are needed. However, the development cycle in biocatalysis and protein engineering is time-consuming because knowledge and data are as yet incomplete. To solve this issue and transform the accumulation of experimental data into engineering principles, a more extensive and profound understanding of protein sequence-structure-function relationships is required.

1.1.2. Systems metabolic engineering

The human use of microbial metabolism has existed since the Neolithic age and has been documented from around 7000 BCE in China, 3150 BCE in ancient Egypt, and several thousand years ago elsewhere around the world (Cavalieri et al., 2003; McGovern et al., 2004). In more recent history, the use of microbial metabolism has been extended to applications other than food production. During World War I, in 1916, Chaim Weizmann devised the acetone-butanol-ethanol fermentation process, which was adopted by the war industry to produce the acetone needed for cordite manufacture. The 1920s saw the production of citric acid by fermentation with the filamentous fungus Aspergillus niger. The industrial fermentation process for citric acid was first proposed by James Currie in 1917, upon discovering that Aspergillus niger has the capacity to accumulate significant amounts of citric acid in sugar-based media (Currie, 1917). Later, after penicillin was serendipitously discovered by Sir Alexander Fleming in 1928, it began to be produced by fermentation of Penicillium chrysogenum.

During the following decades, demand for and interest in the biosynthesis of pharmaceuticals, including antibiotics, cholesterol-lowering agents, immunosuppressants, and anti-cancer drugs, grew. Selective screening and genetic mutagenesis were implemented to further improve fermentation and production performance. In particular, penicillin production from Penicillium chrysogenum was enhanced more than 10,000-fold via genetic engineering (Thykaer and Nielsen, 2003). Though genetic modification allowed a more directed way of improving the metabolism of microorganisms, the more significant interest was in the development of micro-cell factories for the production of recombinant proteins as biopharmaceuticals (Nielsen and Keasling, 2016). Today, there are more than 300 pharmaceutical proteins and antibodies on the global market, with sales of over $100 billion (Langer, 2012).

After heterologous gene expression in E. coli was successfully accomplished by Cohen and Boyer, it was believed that bacterial or other microbial cells could be turned into micro-biofactories to overproduce diverse chemicals and pharmaceuticals (Cohen et al., 1973). Academic researchers, however, soon realized that it is much harder to overproduce simple metabolites such as ethanol than complex molecules such as insulin, which can be produced by overexpressing a single gene. Thus, they began to investigate fundamental questions about the design and function of networks of metabolic reactions. These efforts led to the first generation of publications on specifying all possible routes connecting substrates and products, taking into account the thermodynamics of such metabolic pathways, the distribution of kinetic control, and the design of genetic circuits to achieve the desired level of product formation and gene expression (Woolston, Edgar and Stephanopoulos, 2013).

The term metabolic engineering was coined in the late 1980s to early 1990s (Bailey, 1991; Stephanopoulos and Vallino, 1991; Nielsen, 2001; Keasling, 2010). It was originally described as the “directed improvement of cellular activities by manipulation of enzymatic, transport, and regulatory functions of the cell with the use of recombinant DNA technology” (Bailey, 1991). In the early days of metabolic engineering, the subjects of molecular biological studies were mostly individual genes and enzymes, rather than integrated metabolic pathways and genetic regulatory networks. However, it was soon realized that metabolic engineering entails considerably more than stitching multiple genes together to build a basic functioning pathway, so a holistic approach is required to decipher the complexity of a metabolic network.

Despite the 30-year history of metabolic engineering, we still face many challenges in meeting industrial demands in terms of product titer, yield, and productivity. This is mainly because the complexity of interactions between metabolic pathways and regulatory networks renders it extremely difficult to predict the consequences of genetic perturbations of organisms. This pushed us to shift our focus from genetic modification to a systemic view of metabolic pathway networks. Thus, conventional metabolic engineering was replaced by systems metabolic engineering, which integrates system-level metabolic engineering with omics and the computational techniques of systems biology, the fine design capability of synthetic biology, and the rational and random mutagenesis of evolutionary engineering (Lee et al., 2012). Systems metabolic engineering extends the scope of classical metabolic engineering through the optimization of metabolic pathways, the systemic analysis of cellular behavior in response to genetic manipulation, and the design of strategies to control metabolic pathways and cell products (Lee et al., 2012; Chen and Zeng, 2013).

1.2. Motivation and objective of this Ph.D. study

1.2.1. Motivation

In most cases, natural organisms are not as efficient as desired, and thus systems metabolic engineering requires various approaches to improve cellular performance. Most current approaches used in systems metabolic engineering are based on the modification of genetic components and enzyme expression levels: gene knockouts to remove bottlenecks; alteration of promoter sequences and ribosome binding sites; regulation of reporters, repressors, and terminators; optimization of codon usage; and/or manipulation of mRNA secondary structures (Picataggio, 2009; Steen et al., 2010; Pleiss, 2011; Chen and Zeng, 2013). However, these strategies are often not efficient enough to overcome hurdles that arise from inherent protein-level limitations such as inhibition, catalytic promiscuity, and low activity. Moreover, current tools based on genetic modifications are not suitable for dynamic control of cellular metabolism, resulting in irreversible metabolic burdens that may impact cell growth and thus productivity (Chen and Zeng, 2013).

As key advances in DNA sequencing, gene synthesis, structural genomics, and biochemical characterization of novel enzymes become readily available, a growing understanding of protein sequence-structure-function relationships offers new insights into how proteins may be engineered for specific applications. As scientists and engineers unravel these complex relationships, the process of protein engineering is becoming more informed, rational, predictable, and efficient than traditional approaches. The accumulation of biochemical characterization studies and crystal structures of proteins offers a great possibility of generating de novo biosynthetic pathways to produce non-natural chemicals. Given a substrate and a non-natural product, one may design pathways based on enzymatic catalytic functions and substrate spectra and connect the two entities through reactions not yet known in nature. For instance, BNICE (Biochemical Network Integrated Computational Explorer) allows us to predict novel pathways based on broad reaction rules of the Enzyme Commission classification system and thermodynamics (Hatzimanikatis et al., 2005).

Once a de novo biosynthetic pathway has been predicted and selected, screening of novel enzymes and protein engineering can be used to optimize the target reactions. The starting enzymes can be those active with a substrate structurally similar to the compound of interest. One example of protein design is the hydrogenation of a C=C bond by a novel enoate reductase. Recently, Joo et al. proposed a bioprocess for adipic acid synthesis using an enoate reductase that shows hydrogenation activity towards 2-hexenedioic acid and/or muconic acids (cis,cis- and trans,trans-) (Joo et al., 2017). Adipic acid is a dicarboxylic acid primarily used in the manufacture of nylon 6,6, serving as one of the building blocks of the polymer. Although a number of biosynthetic pathways for adipic acid had been designed and proposed, the absence of enzymes for the hydrogenation of those two intermediates meant that adipic acid production inevitably depended on chemocatalysis (Joo et al., 2017). However, the discovery of the novel enoate reductase with hydrogenation activity enabled the biocatalytic production of adipic acid from glucose using an engineered Saccharomyces cerevisiae (Raj et al., 2018). As evidenced by the adipic acid bioprocess, the study of novel enzymes encourages the development of solutions and opens the door to novel bioprocesses for valuable chemicals.

1.2.2. Main objective

This thesis is aimed at protein design for the metabolic pathway of 1,3-butanediol (1,3BDO) biosynthesis: in vitro characterization of 2-deoxyribose-5-phosphate aldolase (DERA) and aldo-keto reductase (AKR), and in vivo demonstration of the protein design approach in E. coli and Synechococcus elongatus PCC 7942 (Fig. 1-1).

Figure 1-1. Graphical presentation of the main objective of this thesis.

First, we identified an AKR highly active on the non-natural target reaction, the NAD(P)H-dependent reduction of 3-hydroxybutanal (3-HB), based on an in vitro screening assay. For further application in the 1,3BDO biosynthetic pathway and other potential metabolic engineering efforts, the novel AKR was characterized using phylogenetic analysis, biochemical study, and structural analysis. Since the crystal structure of the target AKR was not available, another AKR with a known crystal structure was used for comparative analysis with a model structure of the target AKR. A structure-guided site-directed mutagenesis study was conducted to elucidate the structural features and catalytic mechanism of the two AKRs. The identified AKR was further used to assess the aldol condensation activity of DERA in a DERA-AKR coupled reaction.

Secondly, as the aldol condensation by DERA was assumed to be the rate-limiting step, we identified a microbial DERA via a DERA-AKR coupled reaction that measures the production of 1,3BDO from acetaldehyde. The target DERA was crystallized for further characterization in collaboration with the Structural Genomics group at the University of Toronto (Prof. Savchenko). Based on analysis of the crystal structure, the DERA was rationally engineered to enhance the target activity. Moreover, in addition to the aldol condensation of two acetaldehyde molecules, the retro-aldol cleavage of 2-deoxyribose-5-phosphate (DRP) and the sequential condensation of chloroacetaldehyde with two acetaldehyde molecules were characterized to explore potential applications of DERA.

Finally, using the identified AKR and the engineered DERA, the optimized 1,3BDO pathway was introduced into two biological systems: E. coli and Synechococcus elongatus PCC 7942. This study aimed to demonstrate in vitro study-based protein design under in vivo conditions by producing 1,3BDO from glucose in E. coli, showing that protein design can be a powerful strategy for optimization in metabolic engineering. Furthermore, by introducing the 1,3BDO pathway into PCC 7942, this study proposed the unprecedented photosynthetic bioconversion of CO2 into 1,3BDO.
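The carbon balance implied by this pathway can be sketched as a short calculation. The snippet below is an illustrative back-of-the-envelope estimate, not a result of this thesis. It assumes standard glycolysis (one glucose to two pyruvate) upstream of the three pathway steps described in the abstract (PDC, DERA, AKR), and it ignores biomass formation, by-products, and cofactor balance. All function and constant names are hypothetical.

```python
# Idealized upper bound on 1,3BDO yield from glucose.
# Assumptions (not from the thesis): glycolysis supplies pyruvate, and
# biomass, by-products, and cofactor demand are ignored.

GLUCOSE_MW = 180.16  # g/mol, C6H12O6
BDO_MW = 90.12       # g/mol, C4H10O2 (1,3-butanediol)


def max_bdo_per_glucose() -> float:
    """Return mol 1,3BDO per mol glucose from pathway stoichiometry alone."""
    pyruvate_per_glucose = 2.0       # glycolysis (assumed upstream route)
    acetaldehyde_per_pyruvate = 1.0  # PDC: pyruvate -> acetaldehyde + CO2
    hb_per_acetaldehyde = 0.5        # DERA: 2 acetaldehyde -> 3-HB
    bdo_per_hb = 1.0                 # AKR: 3-HB + NAD(P)H -> 1,3BDO
    return (pyruvate_per_glucose * acetaldehyde_per_pyruvate
            * hb_per_acetaldehyde * bdo_per_hb)


def max_mass_yield() -> float:
    """Return g 1,3BDO per g glucose for the same idealized balance."""
    return max_bdo_per_glucose() * BDO_MW / GLUCOSE_MW


if __name__ == "__main__":
    print(f"{max_bdo_per_glucose():.2f} mol/mol, {max_mass_yield():.3f} g/g")
```

Under these assumptions the ceiling is one mole of 1,3BDO per mole of glucose, or about 0.50 g/g, because two of the six glucose carbons are lost as CO2 at the PDC step.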

1.3. Thesis outline

This section provides the outline of the main body of this thesis and highlights the significance of each original research paper.

- CHAPTER 2. Literature Review and General Introduction

This chapter includes the literature review addressing aldo-keto reductases (section 2.1), aldolases (section 2.2), 1,3-butanediol production technologies (section 2.3), and cyanobacterial engineering (section 2.4). Based on the literature review and my research interests, three major hypotheses and research objectives are proposed (section 2.5), which are covered in detail in the main body of this thesis (Chapters 3, 4, and 5). Finally, this chapter includes the statement of authorship and publication status of the results reported in this thesis (section 2.6).

- CHAPTER 3. Enzyme Discovery and Biochemical Characterization of Novel Aldo-keto Reductases for the Biosynthesis of 1,3-butanediol

In this study, we identified several aldo-keto reductases (AKRs) with significant activity in reducing 3-hydroxybutanal to 1,3-butanediol (1,3BDO), an important commodity chemical. Biochemical and structural studies of these enzymes revealed the key catalytic and substrate-binding residues, including the two structural determinants necessary for high activity in the biosynthesis of 1,3BDO. This work expands our understanding of the molecular mechanisms of the substrate selectivity of AKRs and demonstrates the potential for protein engineering of these enzymes for applications in the biocatalytic production of 1,3BDO and other valuable chemicals.

- CHAPTER 4. Structural and Biochemical Studies of 2-deoxyribose-5-phosphate Aldolase BH1352 from Bacillus halodurans for the Biosynthesis of 1,3-Butanediol

The formation of carbon-carbon bonds via aldolase condensation is essential in organic synthesis. Despite its importance, aldolase-based biocatalysis has remained largely untapped due to the paucity of characterized enzymes and reactions. In this research, a novel 2-deoxyribose-5-phosphate aldolase (DERA) with significant activity in acetaldehyde condensation was identified for the biosynthesis of 1,3BDO. Structural analysis combined with protein sequence studies of DERAs revealed the key residue for DERA condensation at the substrate entrance region. On the basis of this finding, site-directed mutagenesis was rationally designed to improve 1,3BDO production from the DERA-AKR coupled reaction. We also proposed the potential of the novel microbial DERA in the synthesis of a statin drug precursor. This study contributes to a better understanding of DERA condensation and its further applications.

- CHAPTER 5. In vivo Application of Engineered Aldolase in the Biosynthesis of 1,3-Butanediol Using an Engineered E. coli and Synechococcus elongatus PCC 7942

This thesis aims to complete the protein design approach by encompassing two major components: in vitro study, including identification of the target enzyme, biochemical and structural characterization, and protein engineering; and in vivo application, i.e. 1,3BDO biosynthesis from glucose using an engineered E. coli and from photosynthesis via cyanobacterial engineering. In this chapter, the proposed 1,3BDO pathway with the novel AKR and the engineered DERA was introduced into E. coli and Synechococcus elongatus PCC 7942. Though the demonstration of cyanobacterial 1,3BDO conversion was not fully achieved, this study contributed to the first establishment of a cyanobacterial engineering platform in the department. Moreover, E. coli fermentation results validated that the DERA engineering is effective in enhancing the productivity of the biosynthetic pathway.

- CHAPTER 6. Summary, Recommendations, and Outlook

This chapter integrates and summarizes the work of this thesis and describes further research directions.

Chapter 2. Literature Review and General Introduction

2.1. Aldo-keto Reductases

2.1.1. Introduction

Aldo-keto reductases (AKRs) are a superfamily of NAD(P)H-dependent oxidoreductases found in nearly all phyla. They are mainly monomeric proteins with a molecular weight of 34-37 kDa (Penning, 2015). The AKR superfamily contains approximately 200 annotated sequences, which fall into 16 families (Hyndman et al., 2003; Penning, 2015). The enzymes display broad substrate specificity and transform sugar (Bohren et al., 1989) and lipid aldehydes (Burczynski et al., 2001), keto-steroids (Penning et al., 2000), keto-prostaglandins (Penning, 2015), chemical carcinogens, e.g., nicotine-derived nitrosamines (Atalla, Breyer-Pfaff and Maser, 2000), and carcinogen metabolites, e.g., polycyclic aromatic hydrocarbon trans-dihydrodiols (Burczynski et al., 2001) and aflatoxin dialdehyde (Kozma et al., 2002). All characterized AKRs have the same protein fold, a triose-phosphate isomerase (TIM) barrel or (α/β)8-barrel with additional helices (Jez and Penning, 2001). At the C-terminus of the structures there is a loop defining substrate specificity, which is conserved within the superfamily (Kim et al., 2017). They also share a catalytic tetrad composed of Tyr, Lys, His, and Asp (Jez et al., 1997).

Sequence alignment allows us to identify the AKR families and subfamilies based on sequence similarity, so that specific protein functions can be predicted from the grouping of related enzymes. In this nomenclature system, an AKR with <40% sequence identity to any other AKR is assigned to a new family, while AKR members sharing >60% sequence identity are grouped into the same subfamily (Jez, Flynn and Penning, 1997). To date, a total of 16 AKR families have been identified, including AKR1 (aldehyde reductases, aldose reductases, hydroxysteroid dehydrogenases, and steroid 5β-reductases); AKR2 (mannose and xylose reductases); AKR3 (yeast AKRs); AKR4 (chalcone and codeinone reductases); AKR5 (gluconic acid reductases); AKR6 (β-subunits of the voltage-gated potassium channels); AKR7 (aflatoxin dialdehyde and succinic semialdehyde reductases); AKR8 (pyridoxal reductases); AKR9 (aryl alcohol dehydrogenases); AKR10 (Streptomyces AKRs); AKR11 (Bacillus AKRs); AKR12 (Streptomyces sugar aldehyde reductases); AKR13 (hyperthermophilic bacteria reductases); AKR14 (E. coli reductases); AKR15 (Mycobacterium reductases); and AKR16 (V. cholerae reductases) (www.med.upenn.edu/akr) (Jez and Penning, 2001; Penning, 2015).
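The identity thresholds of this nomenclature system can be expressed as a simple decision rule. The following is a minimal sketch, assuming the percent identity to the closest known AKR has already been computed from an alignment. The function name, the returned labels, and the treatment of values falling exactly on 40% or 60% are illustrative assumptions, not part of the published nomenclature.

```python
def classify_akr(identity_percent: float) -> str:
    """Assign a hypothetical label from percent identity to the closest known AKR.

    Thresholds follow the rule quoted in the text: <40% identity indicates a
    new family and >60% identity indicates the same subfamily. Handling of the
    exact boundary values is an assumption made for this sketch.
    """
    if not 0.0 <= identity_percent <= 100.0:
        raise ValueError("identity_percent must be between 0 and 100")
    if identity_percent < 40.0:
        return "new family"
    if identity_percent > 60.0:
        return "same subfamily"
    # Between the two thresholds: same family, but a distinct subfamily.
    return "same family, different subfamily"


if __name__ == "__main__":
    for identity in (35.0, 52.0, 78.0):
        print(identity, "->", classify_akr(identity))
```

In practice the identity value would come from a pairwise or multiple sequence alignment against the annotated AKR sequences; that step is outside the scope of this sketch.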

The kinetic mechanism of all AKRs is a sequential ordered bi-bi reaction in which the cofactor binds first and leaves last, and the superfamily shares the same catalytic mechanism, known as the “push-pull” mechanism (Fig. 2-1) (Cooper, Jin and Penning, 2007; Penning, 2015). In the reduction direction at acidic or neutral pH, the catalytic tyrosine (Tyr66 in Fig. 2-1) becomes TyrOH2+, working as a general acid with facilitation by the adjacent histidine (His138 in Fig. 2-1). The transfer of a hydride from the reducing cofactor, NADPH, to the carbonyl acceptor is followed by protonation of the carbonyl by the catalytic TyrOH2+. In the oxidation direction, on the other hand, the catalytic Tyr66 is deprotonated by the adjacent lysine (Lys97), forming a phenolate anion, which removes a proton from the alcohol while a hydride is transferred to the oxidized cofactor, NADP+ (Schlegel, Jez and Penning, 1998; Penning, 2015). Although AKRs can catalyze both oxidation and reduction in vitro, they are much more likely to act as reductases in vivo. Previous studies suggest that, since AKRs display much higher affinity for NADPH than for NADP+, they catalyze only reduction reactions when forced to use cofactors at the prevailing in vivo concentrations (Cooper, Jin and Penning, 2007; Penning, 2015).

Figure 2-1. "Push-pull" mechanism for acid-base catalysis of AKRs (amino acid labeling for STM2406 from Salmonella typhimurium).

2.1.2. Microbial AKRs

Because of the overlapping substrate specificities and catalytic promiscuities of AKRs, it has been difficult to assign functions to individual enzymes. In addition, there is more than one enzyme in a single cell that is capable of reducing a specific carbonyl compound. For instance, from microbial genome data, it is predicted that a prokaryote such as E. coli has six AKRs, while a simple eukaryote such as yeast possesses more than 14 AKRs (Ellis, 2002). This is in addition to other enzymes that may perform similar functions in aldehyde/ketone metabolism. These facts raise a fundamental question: why do microbial organisms possess multiple enzymes that are able to perform similar functional roles?

To address the activity/function problem, several studies have used different methods. Some conducted gene knockouts to verify the in vivo function of the target AKRs (Delneri, Gardner and Oliver, 1999; Traff, Cordero Otero and van Zyl, 2001; Ford and Ellis, 2002). However, most of these gene disruption studies found no more than partial functional redundancy. The opposite approach was also taken by overexpressing Gre3p in yeast (Traff, Cordero Otero and van Zyl, 2001) and AKR14A1 in E. coli (Grant et al., 2003). These studies found that overexpression of the target AKR increases tolerance to a toxic chemical, but this suggested a physiological possibility rather than proving a physiological function. Although extensive data from gene regulation studies are readily available, the functions of microbial AKRs are still not fully understood.

Although the physiological functions of AKRs remain unknown, several microbial AKRs have been purified and studied because of their ability to carry out unique biochemical transformations that are potentially applicable for commercial use. Xylose reductase from yeast is one example of the biotechnological application of AKRs. Multiple studies overexpressed xylose reductase from Pichia stipitis and Saccharomyces cerevisiae to increase the yield of xylitol, a natural sweetener (Hallborn et al., 1991; Hahn-Hägerdal et al., 1994; Kavanagh et al., 2002). Erythrose reductase from Candida magnoliae was also used in the biosynthesis of erythritol, which has extremely low digestibility and is regarded as safe for people with diabetes.

AKRs are also advantageous in whole-cell biotransformation because their stereoselective activity eliminates or reduces racemic mixtures. For example, chiral 4-chloro-3-hydroxybutanoate ethyl esters were produced via microbial reduction by overexpressing β-keto ester reductase (AKR3B) from Sporobolomyces salmonicolor in E. coli (Shimizu, Kataoka and Kita, 1998). Another study attempted to improve the performance of baker’s yeast for β-keto ester reductions by altering the expression levels of three enzymes, including an AKR (Ypr1p), and deleting genes (Rodriguez, Kayser and Stewart, 2001). Engineering of cofactor specificity was also attempted for 2,5-diketo-D-gluconic acid reductase (AKR5C) from Corynebacterium sp. for use in vitamin C biosynthesis, since it is more economically viable to utilize NADH instead of NADPH (Banta et al., 2002).

As summarized above, despite the incomplete understanding of the physiological functions of microbial AKRs, they hold great potential in biocatalysis owing to their broad substrate specificity and stereoselectivity. In this thesis, novel microbial AKRs have been purified, crystallized, and biochemically studied for application in the biocatalytic conversion of 3-HB to 1,3BDO.

2.2. Aldol Reaction and Aldolases

2.2.1. Aldol reaction

The aldol reaction refers to a whole class of reactions between two carbonyl compounds, one of which, after a proton is removed from its alpha-carbon, reacts as a nucleophile and attacks the electrophilic carbonyl carbon of the other (Bruice, 1998). It is important in organic synthesis because it forms a new C-C bond. In synthetic organic chemistry it is the most commonly applied reaction for the construction of polyhydroxylated compounds with new chiral centers from smaller and simpler precursors (Mahrwald, 2004). From a biological perspective, compounds with multiple hydroxyl groups are of pharmaceutical interest because the polar hydroxyl groups aid both water solubility and molecular recognition through directed hydrogen-bond interactions that convey biological specificity (Fessner and Helaine, 2001). Among the different kinds of aldol additions, those between aldehydes are of particular interest because the products are themselves aldehydes and can undergo further aldolization, leading to more complex building blocks (Mukherjee et al., 2007).

Aldol reactions have been extensively studied from different perspectives, including the development of new catalysts for stereo-controlled synthesis. Catalysts that have been developed for the asymmetric aldol reaction include small organic molecules, main group and transition metal coordination complexes with chiral ligands, catalytic antibodies, and aldolases (Houk and List, 2004; Mahrwald, 2004; Mukherjee et al., 2007; Trost and Brindle, 2010). In biological systems, C-C bonds are usually formed by highly specific enzymes with stereo-control (Breuer and Hauer, 2003). As enzymes are generally highly chemo-, regio-, stereo-, and enantio-selective, aldolases have emerged as an environmentally friendly alternative with great application potential in the synthesis of rare sugars or sugar derivatives including statins, iminocyclitols, and sialic acids (Clapés et al., 2010; Machajewski and Wong, 2000).

2.2.2. Aldolases

Aldolases (EC 4.1.2.X) are enzymes that catalyze the stereoselective addition of a ketone or aldehyde donor to an aldehyde acceptor. In a living cell, the formation and cleavage of C-C bonds are essential in the metabolism and catabolism of carbohydrates and ketoacids. Thus, aldolases are key enzymes in metabolic processes including glycolysis, gluconeogenesis, fructose metabolism, and the pentose phosphate pathway.

Aldolases can be divided into two classes depending on the reaction mechanism. Class I aldolases are characterized by a catalytic Lys that forms a covalent Schiff base intermediate with the donor compound to generate an enamine nucleophile (Machajewski and Wong, 2000). Class II aldolases, on the other hand, are distinguished by the absolute requirement for a divalent metal ion (Zn, Fe, Co) to promote the enolization of the donor substrate via Lewis acid complexation. The nucleophilic enamine or enolate then attacks the carbonyl carbon of the acceptor substrate, forming the new C-C bond (Clapés et al., 2010). Class I and II aldolases also differ from each other in other criteria, including subunit structure, pH profile, and substrate affinity. They share hardly any sequence homology and apparently have different evolutionary origins (Schürmann and Sprenger, 2001). Also, Class II aldolases prevail in prokaryotes and lower eukaryotes, including some bacteria, fungi, and algae, while Class I aldolases are present in all groups of living organisms (Marsh and Lebherz, 1992; Gefflaut et al., 1995).

Class I aldolases:

Class I aldolases have attracted significant attention for their potential in biocatalytic organic synthesis since they do not require any cofactor for catalysis. The configuration of the newly formed stereocenter is controlled by the enzyme, so the chemical structure of the products is easier to predict. While Class I aldolases are relatively tolerant of different acceptor molecules, they are typically restricted in the type of donor molecule. Therefore, Class I aldolases are classified into the following subgroups depending on the type of donor molecule.

Dihydroxyacetone phosphate (DHAP) aldolase: This group of aldolases includes fructose-1,6-bisphosphate aldolase (FBP aldolase, EC 4.1.2.13), which catalyzes the cleavage of FBP into DHAP and glyceraldehyde-3-phosphate (G3P) in glucose metabolism, or the reverse reaction in gluconeogenesis. These enzymes accept a broad range of acceptor substrates, including aliphatic aldehydes, alpha-heteroatom-substituted aldehydes, and monosaccharides and their derivatives (Machajewski and Wong, 2000). DHAP aldolase is, however, strictly dependent on its donor. This has proven problematic for application in organic synthesis because DHAP is expensive and relatively unstable, and the phosphate group is often not desired in the final product. Thus, there have been many attempts to redesign these enzymes via directed evolution to accept the non-phosphorylated donor dihydroxyacetone (DHA) (Windle et al., 2014).

Dihydroxyacetone (DHA) aldolase: One of the drawbacks of DHAP aldolases is their strict dependence on the donor substrate. Moreover, the phosphate group of the product often must be removed, at the expense of the atom economy of the process. In this connection, the discovery of the isoenzyme fructose-6-phosphate aldolase (FSA) is highly promising. FSA cleaves fructose-6-phosphate, generating DHA and G3P. In synthetic applications, FSA has been reported to catalyze the stereoselective aldol addition of DHA, hydroxyacetone, and hydroxybutanone to a variety of aldehydes (Clapés et al., 2010).

Pyruvate or phosphoenolpyruvate (PEP) aldolase: Another Class I aldolase is pyruvate-dependent 2-keto-3-deoxy-6-phosphogluconate aldolase (KDPGA), which catalyzes the cleavage of 2-keto-3-deoxy-6-phosphogluconate (KDPG) into pyruvate and G3P (Windle et al., 2014). The conventional application of KDPGA from Thermotoga maritima and E. coli is the formation of an unbranched Nikkomycin side chain precursor (Clapés et al., 2010). These enzymes originate almost exclusively from microbes and are attracting increasing attention for the synthesis of sialic acid derivatives, which are used in cancer therapy and as anti-infectives (Samland and Sprenger, 2006).

Glycine-dependent aldolase: This type of enzyme encompasses L-threonine aldolase (EC 4.1.2.5), D-threonine aldolase (EC 4.1.2.42), L-allo-threonine aldolase (EC 4.1.2.49), and low-specificity L-threonine aldolase (EC 4.1.2.48). They commonly catalyze the reversible condensation of glycine and an aldehyde acceptor to generate a beta-hydroxy-alpha-amino acid with pyridoxal phosphate as cofactor (Machajewski and Wong, 2000; Samland and Sprenger, 2006). Beta-hydroxy-alpha-amino acids can be used for the synthesis of building blocks of antibiotics (vancomycin), immunosuppressants (cyclosporin), and drugs for Parkinson’s disease therapy (Samland and Sprenger, 2006). Glycine-dependent aldolases have also been extensively used in the resolution of racemic beta-hydroxy-alpha-amino acids; however, few cases of carboligation reactions have been studied (Machajewski and Wong, 2000).

Acetaldehyde-dependent aldolase: The only known member of this class is 2-deoxyribose-5-phosphate aldolase. As this enzyme is the main target aldolase of this study, it is discussed further in the following section.

2.2.3. 2-deoxyribose-5-phosphate aldolase (DERA)

Currently, there is only one known acetaldehyde-dependent aldolase: 2-deoxyribose-5-phosphate aldolase (DERA, EC 4.1.2.4). In biological systems, DERA catalyzes the reversible cleavage of 2-deoxyribose-5-phosphate (DRP), generating acetaldehyde and glyceraldehyde-3-phosphate (G3P), as part of the pentose phosphate pathway (Fig. 2-2). The unique property of DERA is that, unlike other types of aldolases, both substrates and the product are aldehydes. As a result, DERA can perform sequential aldol reactions. This property, first reported by Gijsen and Wong, renders DERA a valuable biocatalyst that can form two new stereogenic centers in the synthesis of building blocks for complex molecules (Gijsen and Wong, 1994; Windle et al., 2014).

Figure 2-2. Reversible retro-aldol cleavage of 2-deoxyribose-5-phosphate (DRP) via DERA.

Although it varies between enzymes, DERA shows broad substrate specificity and can tolerate a variety of aldehydes as the acceptor substrate. For example, DERA from E. coli (EcDERA hereafter), the most studied DERA, displays aldol condensation activity with many different aldehydes as acceptors, including chloroacetaldehyde, propanal, butanal, isobutanal, glyceraldehyde, lactaldehyde, and 3-fluoro-2-hydroxypropanal (Chen, Dumas and Wong, 1992). It also catalyzes the condensation of ketone compounds such as acetone or fluoroacetone as donors (Chen, Dumas and Wong, 1992).

Figure 2-3. Overall structure of monomeric BH1352 from Bacillus halodurans. The catalytic residues are displayed with stick model and labels.

As of October 2018, 28 crystal structures of 14 DERA enzymes were available. The structure of DERA displays a typical triose-phosphate isomerase (TIM) barrel or (α/β)8-barrel fold, which is conserved among the enzymes of the Class I aldolase family (Fig. 2-3). Another conserved structural feature of DERA is the catalytic triad, which consists of two lysine residues and an aspartate residue (see Fig. 2-3). The catalytic lysine residue (Lys155 in the case of BH1352) is commonly found on the β6 strand of the barrel core of the DERA structure. The pKa of the lysine side chain in aqueous solution is typically 10.5, which is high and makes lysine a basic residue. In the DERA structure, the adjacent lysine residue (Lys184 in the case of BH1352) on the β7 strand, in close proximity to the catalytic residue, is proposed to be involved in the pKa perturbation of the catalytic Lys155 via electrostatic interaction with the other triad residue, aspartate (Asp92 in the case of BH1352). This enables the catalytic lysine to remain uncharged and function as the nucleophile to carry out catalysis at a pH lower than its typical pKa value (Heine et al., 2001).
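The effect of pKa perturbation on the availability of an uncharged lysine can be illustrated with the Henderson-Hasselbalch relation. The sketch below is only a numerical illustration: the pH of 7.5 and the perturbed pKa of 7.5 are hypothetical values chosen for the example, not measurements for BH1352, and the function name is likewise an assumption.

```python
def fraction_deprotonated(ph: float, pka: float) -> float:
    """Fraction of an amine present as the neutral base (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    assay_ph = 7.5        # hypothetical near-neutral reaction pH
    typical_pka = 10.5    # typical lysine side-chain pKa quoted in the text
    perturbed_pka = 7.5   # hypothetical perturbed pKa, for illustration only

    print(f"unperturbed: {fraction_deprotonated(assay_ph, typical_pka):.4f}")
    print(f"perturbed:   {fraction_deprotonated(assay_ph, perturbed_pka):.4f}")
```

With the typical pKa of 10.5, only about one lysine in a thousand is uncharged three pH units below the pKa, whereas a residue whose pKa is lowered to the working pH is half deprotonated. This is why a perturbed pKa matters for a lysine acting as the nucleophile.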

The catalytic mechanism of DERA in the cleavage of DRP is shown in Fig. 2-4. In the aldol condensation direction, a general base is essential in addition to the nucleophile. This general base is responsible for the deprotonation of the imine to form the enamine. Heine et al. proposed that a proton relay system involving the aspartate (Asp92 in the case of BH1352) and a hydrogen-bonded water molecule acts as the general base in the DERA aldol reaction (Heine et al., 2001). Also, the phosphate-binding site of DERA is mostly conserved within the family and has been characterized in EcDERA (DeSantis et al., 2003). In general, a positively charged amino acid, typically a lysine, is found in the phosphate-binding site of the DERA structure, interacting with the negatively charged phosphate group of DRP and forming a counter-charged environment along with an additional basic residue (Fig. 2-5, Lys15 in the case of BH1352). In addition, serine and threonine are commonly observed near the phosphate-binding site in most DERAs, interacting with the phosphate group and the γ-hydroxyl group of DRP via hydrogen bonds (Heine et al., 2001; DeSantis et al., 2003).

Figure 2-4. The predicted catalytic mechanism of BH1352 DERA from Bacillus halodurans.

Figure 2-5. Proposed DERA enzyme-substrate interaction for BH1352 and DRP.

In the early 1990s, EcDERA was found to accept acetaldehyde, acetone, fluoroacetone, and propionaldehyde as donors for addition to the G3P acceptor (Chen, Dumas and Wong, 1992). However, the activity was so low that these reactions were hardly studied outside the work of Wong and colleagues on EcDERA. DERA activity on non-phosphorylated substrates was also studied, but the catalytic efficiency was too low for application in organic synthesis. Follow-up studies revealed that EcDERA can catalyze a tandem aldol reaction in which the aldehyde product of the first addition serves as the acceptor substrate in the second reaction (Gijsen and Wong, 1994; Wong et al., 1995).

The discovery of sequential condensation by DERA motivated further studies on aldol condensation with various aldehydes. Chloroacetaldehyde is of particular industrial interest because the sequential condensation of two acetaldehyde molecules and one chloroacetaldehyde molecule produces (3R,5S)-6-chloro-2,4,6-trideoxyhexapyranoside (CTHP). CTHP is a valuable chiral precursor for statins such as atorvastatin (Lipitor) and rosuvastatin (Crestor), which are cholesterol-lowering agents (Greenberg et al., 2004; Müller, 2005; Jennewein et al., 2006; Clapés et al., 2010; Windle et al., 2014). Statins are extensively used to lower the fatality rate of cardiovascular disease by decreasing the level of low-density lipoprotein cholesterol through inhibition of 3-hydroxy-3-methylglutaryl-CoA reductase activity (Fei et al., 2017). Despite a global market of nearly $20 billion, however, only a handful of DERA enzymes have been shown to synthesize CTHP and/or have been engineered for improved activity (Greenberg et al., 2004; Jennewein et al., 2006; Global Business Intelligence, 2013; Jiao et al., 2015, 2017).

Beyond protein characterization and engineering, a process engineering approach has also been adopted for DERA applications. Subrizi et al. reported that immobilizing EcDERA on multiwalled carbon nanotubes (MWCNTs) enhances its tolerance to acetaldehyde while maintaining activity for several days and allowing reuse for as many as five runs (Subrizi et al., 2014). Immobilization on MWCNTs also improved the sequential condensation of chloroacetaldehyde and acetaldehyde over the wildtype EcDERA.

In this thesis, we discovered a novel DERA with high activity for the bioconversion of value-added chemicals. A bioinformatic study of over 2,500 DERA sequences and an in vitro screening assay with purified enzymes identified the target DERA, BH1352 from Bacillus halodurans. A high-resolution crystal structure was solved and used for further structural analysis. By combining structural analysis and sequence study, site-directed mutagenesis was rationally designed to improve the target activity, which holds great potential for the application of DERA in biocatalysis.

2.3. 1,3-Butanediol production technologies and applications

2.3.1. Introduction

Diols are compounds with two hydroxyl groups and have a variety of applications as chemicals, solvents, pharmaceutical precursors, cosmetics, building blocks of polymers, and fuels (Jiang et al., 2014; Sabra, Groeger and Zeng, 2016). In particular, relatively low-molecular-weight four-carbon diols, including 1,3-butanediol, 2,3-butanediol, and 1,4-butanediol, are of interest for microbial conversion of renewable carbon sources (Sabra, Groeger and Zeng, 2016). The chemical production of short-chain diols from fossil carbon sources has been developed and optimized for decades. However, given the depletion of fossil carbon resources and growing environmental concerns, bio-based processes for the production of short-chain diols are attracting interest.

1,4-butanediol

1,4-Butanediol is an important commodity chemical used to manufacture over 2.5 million tons of valuable chemicals annually (Sabra, Groeger and Zeng, 2016). Its major use is the production of tetrahydrofuran (THF) and polybutylene terephthalate (PBT) (Yim et al., 2011). Generally, THF is used to produce spandex fibres and other performance polymers, resins, solvents, and printing inks for plastics. PBT is an engineering-grade thermoplastic that is used as an insulator in the electrical and electronic industries.

Recently, a bioprocess for the manufacture of 1,4-butanediol using a non-natural synthetic pathway in a correspondingly engineered E. coli was developed and commercialized by Genomatica Inc. (Yim et al., 2011; Burgard et al., 2016). In this pathway, succinyl-CoA is converted into 1,4-butanediol via 4-hydroxybutyrate and other intermediates, producing 18 g/L (Yim et al., 2011). More recently, strains able to produce 30-40 g/L of 1,4-BDO in a continuous bioreactor were developed and patented by Genomatica (Pharkya et al., 2015).

2,3-butanediol

2,3-Butanediol (2,3BDO) is currently used as a starting material for bulk chemicals such as methyl ethyl ketone, gamma-butyrolactone, and 1,3-butadiene (Nielsen et al., 2010). It can also be used as a precursor of antiulcer drugs and cosmetics. Owing to its two chiral centers, 2,3BDO has three isomers: (2R,3R)-, (2S,3S)-, and the meso form, which has no optical activity (Sabra, Groeger and Zeng, 2016). Among 1,4-, 2,3-, and 1,3-butanediol, only 2,3BDO is a natural metabolite, produced by numerous bacterial and yeast strains; the other two butanediols are not naturally produced by any known organism (Li et al., 2012; Yang et al., 2017). The species noted for their native ability to produce 2,3BDO belong to the genera Klebsiella, Enterobacter, Bacillus and Serratia (Jiang et al., 2014). The 2,3BDO biosynthetic pathways of these bacteria have been extensively studied.

However, despite the extensive research on bacterial production of 2,3BDO, concerns about the use of pathogenic bacteria and the inefficient utilization of cellulosic sugars have led scientists to engineer safer strains. As a result, E. coli has been used extensively as a host in metabolic engineering studies of 2,3BDO biosynthesis. Highly pure (2S,3S)-2,3BDO was produced by engineered E. coli at a titer of 31.7 g/L (Wang et al., 2013). Moreover, many industrial biotechnological processes are moving towards yeast strains as a 2,3BDO production platform. Engineered yeast strains were reported to produce enantiomerically pure (2R,3R)-2,3BDO at a titer of 100 g/L (Lian, Chao and Zhao, 2014). In summary, high titers and productivities of optically active 2,3BDO are now being achieved by engineering promising hosts.

1,3-butanediol

1,3-Butanediol (1,3BDO) is a non-natural four-carbon glycol that is not metabolically produced by any known organism. It is primarily used as a chemical intermediate in the manufacture of polyester plasticizers, as a solvent for flavouring, and as a humectant in cosmetics, pet foods, and tobacco (Sabra, Groeger and Zeng, 2016). The optically active (R)-1,3BDO is a valuable building block for the synthesis of various optically active compounds such as pheromones, fragrances, and insecticides, either by direct incorporation into the target intermediates or as a chiral template in the Lewis acid-mediated reaction of acetals with nucleophiles (Matsuyama et al., 2001; Sabra, Groeger and Zeng, 2016). In particular, (R)-1,3BDO is an attractive chiral synthon for the synthesis of azetidinone derivatives, which lead to penem and carbapenem antibiotics (Yamamoto, Matsuyama and Kobayashi, 2002). Owing to these applications, the demand for (R)-1,3BDO has been increasing rapidly and, as a consequence, production methods have been studied intensively (Zheng et al., 2012; Kataoka et al., 2013).

Conventional (R)-1,3BDO synthesis is a chemical process in which the hydration of acetylene to acetaldehyde is followed by conversion into 3-hydroxybutanal (3-HB) and then into 1,3BDO. Biological processes have also been developed and are summarized below (Table 2-1). Two approaches have been studied for whole-cell bioconversion to 1,3BDO. The first is the enzyme-catalyzed asymmetric reduction of 4-hydroxy-2-butanone, which is derived from fossil carbon sources, into (R)-1,3BDO (Yamamoto, Matsuyama and Kobayashi, 2002; Zheng et al., 2012; Yang et al., 2014). The second is based on the enantioselective oxidation of the undesired (S)-1,3BDO to 4-hydroxy-2-butanone, followed by reduction to (R)-1,3BDO (Yamamoto, Matsuyama and Kobayashi, 2002). These bioprocesses, however, still depend on fossil carbon sources. Owing to the limited supply of fossil carbon and the associated environmental concerns, a route from glucose to 1,3BDO has been developed.

Table 2-1. Summary of bioproduction of 1,3BDO. LB stands for lysogeny broth.

Host strain   Carbon source               Conditions    Titer [g/L]   Yield [g/g] or %   References
E. coli       1,3BDO racemate             shake-flask   72.6          48.4%              Yamamoto et al. (2002)
E. coli       4-hydroxy-2-butanone        shake-flask   19.8          96.6%              Zheng et al. (2012)
E. coli       LB -> Glucose               shake-flask   9.05          0.11               Kataoka et al. (2013)
E. coli       Glucose and Yeast extract   fed batch     9.05          0.11               Kataoka et al. (2013)
E. coli       Glucose and Yeast extract   fed batch     15.7          0.18               Kataoka et al. (2014)
P. jadinii    4-hydroxy-2-butanone        shake-flask   38.3          85.1%              Yang et al. (2014)

2.3.2. Pathway for bioconversion of glucose into 1,3BDO

Kataoka et al. constructed a biosynthetic pathway for the production of 1,3BDO from glucose in E. coli (Fig. 2-6A). The pathway starts from acetyl-CoA, a natural metabolite derived from glycolysis in E. coli, two molecules of which are condensed to acetoacetyl-CoA. This pathway produced 9.05 g/L (100.4 mM) of 1,3BDO in 110 h of fed-batch fermentation (Kataoka et al., 2013). Because the inverted fatty acid β-oxidation pathway has a high demand for reducing equivalents and cofactors, Kataoka et al. recognized the importance of aerobic glucose catabolism for regenerating reducing equivalents and optimized the fermentation conditions to improve oxygen availability. This approach yielded up to 8.88 g/L (98.5 mM) of 98.5% enantiomerically pure (R)-1,3BDO in 36 h of fermentation, with a yield of 0.444 mol per mol of glucose (Kataoka et al., 2014). Despite this significant improvement, the productivity of 1,3BDO biosynthesis remains far below the level required for commercialization.
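The titers above are quoted both in g/L and in mM, and the yield in mol per mol of glucose, whereas Table 2-1 reports yields in g/g. The following minimal sketch shows the unit conversions involved. It assumes molar masses of 90.12 g/mol for 1,3BDO (C4H10O2) and 180.16 g/mol for glucose (C6H12O6); the function names are illustrative only.

```python
MW_13BDO = 90.12     # g/mol, C4H10O2 (assumed molar mass)
MW_GLUCOSE = 180.16  # g/mol, C6H12O6 (assumed molar mass)


def titer_g_per_l_to_mm(titer_g_per_l: float, mw: float = MW_13BDO) -> float:
    """Convert a titer from g/L to mmol/L (mM)."""
    return titer_g_per_l / mw * 1000.0


def molar_to_mass_yield(mol_per_mol: float,
                        mw_product: float = MW_13BDO,
                        mw_substrate: float = MW_GLUCOSE) -> float:
    """Convert a yield from mol product per mol substrate to g per g."""
    return mol_per_mol * mw_product / mw_substrate


print(f"9.05 g/L = {titer_g_per_l_to_mm(9.05):.1f} mM")
print(f"8.88 g/L = {titer_g_per_l_to_mm(8.88):.1f} mM")
print(f"0.444 mol/mol = {molar_to_mass_yield(0.444):.3f} g/g")
```

With these molar masses, the two titers reproduce the millimolar values given in the text, and the molar yield of 0.444 mol/mol corresponds to roughly 0.22 g of 1,3BDO per g of glucose.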

2.3.3. Aldolase-based biosynthetic pathway for 1,3BDO

Recently, Nemr et al. proposed a novel biosynthetic pathway for 1,3BDO using pyruvate decarboxylase, 2-deoxyribose-5-phosphate aldolase (DERA), and aldo-keto reductase (AKR) (Fig. 2-6B). This pathway has the same theoretical carbon yield as the inverted fatty acid β-oxidation pathway (one 1,3BDO molecule from two pyruvate molecules), while requiring fewer cofactors and reducing equivalents. The DERA-based pathway requires only three enzymes and one NADPH as the reducing cofactor, which alleviates the metabolic burden on the host organism.
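The stoichiometry behind the stated carbon yield can be written out step by step. The balanced equations below are a sketch based on the three reactions named in the text (PDC, DERA, AKR); the conversion to a glucose basis additionally assumes that glycolysis supplies two pyruvate per glucose and uses molar masses of 90.12 g/mol (1,3BDO) and 180.16 g/mol (glucose).

```latex
\begin{align*}
\text{PDC:}\quad  & 2\,\text{pyruvate}^{-} + 2\,\mathrm{H^{+}} \rightarrow 2\,\text{acetaldehyde} + 2\,\mathrm{CO_2} \\
\text{DERA:}\quad & 2\,\text{acetaldehyde} \rightarrow \text{3-HB} \\
\text{AKR:}\quad  & \text{3-HB} + \mathrm{NADPH} + \mathrm{H^{+}} \rightarrow \text{1,3BDO} + \mathrm{NADP^{+}} \\
\text{Net:}\quad  & 2\,\text{pyruvate}^{-} + \mathrm{NADPH} + 3\,\mathrm{H^{+}} \rightarrow \text{1,3BDO} + 2\,\mathrm{CO_2} + \mathrm{NADP^{+}}
\end{align*}

\[
\text{carbon yield} = \frac{4\ \text{C in 1,3BDO}}{2 \times 3\ \text{C in pyruvate}} = \frac{2}{3},
\qquad
Y_{\max} = 1\ \frac{\text{mol}}{\text{mol glucose}} \approx \frac{90.12}{180.16} \approx 0.50\ \frac{\text{g}}{\text{g}}
\]
```

The two carbons lost per 1,3BDO leave as CO2 in the decarboxylation step, and the single NADPH is consumed in the final reduction.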

However, unlike the decarboxylation by pyruvate decarboxylase, which is a fast step, the DERA aldol condensation is relatively inefficient. Moreover, the reductase required to reduce 3-HB to 1,3BDO might be promiscuously active on another intermediate, acetaldehyde, leading to the formation of ethanol as a byproduct. Therefore, this pathway has to be optimized at the enzyme level by screening for active enzymes with high substrate specificity and by protein engineering to improve catalytic activity.

Figure 2-6. Biosynthetic pathway for 1,3-butanediol production. A: Inverted fatty acid β-oxidation pathway (Kataoka et al., 2013); B: DERA-based pathway (Nemr et al., 2018).

2.4. Cyanobacteria engineering

2.4.1. Motivation of cyanobacteria engineering study

Most metabolic engineering hosts utilize sugar, mainly glucose, as a carbon source. These sugars are derived from the hydrolysis of solid biomass such as lignocellulose or from food crops such as corn. Because biomass is continually regenerated by the ecosystem, biofuels and chemicals produced by metabolic engineering have been considered an eco-friendly alternative to fossil fuels and fossil fuel-derived products. However, despite the low cost of biomass waste, the hydrolysis process is neither cost-efficient nor run under mild conditions. Enzymatic hydrolysis has been investigated, yet it remains far from economically feasible. Thus, the production of some biofuels from biomass waste requires more fossil energy and cost than the energy content and product value of the biofuel (Subhadra, 2010).

Besides, the full environmental effects of biofuel crops have to be reconsidered. Regardless of how effective sugarcane is for producing biofuels and chemicals via metabolic engineering, its benefits quickly diminish if carbon-rich tropical forests are razed to make way for sugarcane fields, thereby increasing greenhouse gas (GHG) emissions (Righelato and Spracklen, 2007). Furthermore, when biodiversity conservation, hydrological functioning, and soil protection are considered, the benefits of metabolic engineering become even more marginal (Bala et al., 2007).

Additional environmental costs include trace-gas emissions and rises in food prices. Crops require nitrogen fertilizers, which can be a significant source of nitrous oxide, an important GHG that also destroys stratospheric ozone (Scharlemann and Laurance, 2008). U.S. government subsidies for corn-based biofuels encourage farmers to shift from growing soy to growing corn, which drives up the price of soy (Laurence, 2007). There is a clear need to consider more than just energy and GHG emissions when evaluating sugar-based production of biofuels and chemicals via conventional metabolic engineering.

2.4.2. Cyanobacteria

Cyanobacteria are photosynthetic prokaryotes that can use photon energy to generate reduced molecules and sugars themselves. They are classified as a phylum of bacteria and are known to be among the most ancient organisms on earth. In fact, it is believed that cyanobacterial photosynthesis transformed the atmosphere of this planet from a reducing to an oxidizing one more than 2 billion years ago, which, together with the formation of a protective ozone layer, dramatically stimulated the biodiversity of life on earth (Pisciotta, Zou and Baskakov, 2010). They are often referred to as blue-green algae because of their pigment. Although some definitions of algae encompass prokaryotic photosynthetic organisms, algae are generally eukaryotes while cyanobacteria are prokaryotes; thus, ‘blue-green algae’ is considered a misnomer for cyanobacteria.

Photosynthesis of Cyanobacteria

Numerous inherent properties of cyanobacteria make them attractive candidate organisms for the biotechnology industry, but the most important is their photosynthetic capacity. They can generate their own carbon source from practically cost-free inputs (light, water, and atmospheric CO2) to produce value-added compounds metabolically, requiring only a few inexpensive nutrients including phosphate, nitrate, and trace amounts of metal ions. In addition, cyanobacteria exceed other photosynthetic organisms such as eukaryotic microalgae and plants in solar energy conversion efficiency: 3-9% is reported for cyanobacteria, whereas terrestrial plants convert only ≤0.25-3% (Ducat, Way and Silver, 2011).

Generally, photosynthesis consists of two main parts, the light-dependent and light-independent reactions. In the light-dependent reactions, photosystem II is excited by photons (usually at 680 nm, two photons per pair of electrons) and splits water molecules into electrons, protons, and oxygen (Fig. 2-7). While oxygen is evolved, the electrons are transferred through multiple protein complexes and are excited again at another photosystem, photosystem I, by photons (usually at 700 nm, two photons per pair of electrons). The excited electrons are transferred to ferredoxin and used to regenerate NADPH by ferredoxin-NADP reductase (a cost of 4 photons per NADPH molecule, since each NADPH carries two electrons). Owing to these sequential photosystem reactions, cyanobacteria have a higher NADPH/NADH ratio than other bacteria. The activity of the electron transport chain, especially of cytochrome b6f, leads to the accumulation of protons in the lumen, resulting in a transmembrane proton gradient that is used to produce ATP via ATP synthase.

Figure 2-7. Schematic representation of photosynthesis in cyanobacteria. Yellow arrows represent light-dependent reactions, while dark gray arrows indicate light-independent reactions. The red dashed arrow shows the electron flow pathway and the blue dotted arrows show the path of protons within the photosystem. Abbreviations: b6f: cytochrome b6f complex; Fd: ferredoxin; FNR: ferredoxin-NADP reductase; PC: plastocyanin; PS I and II: photosystem I and II; PQ: plastoquinone; 3PG: 3-phosphoglycerate.

The light-independent reactions are also known as the Calvin cycle or Calvin-Benson-Bassham (CBB) cycle, which fixes inorganic carbon from CO2. The Calvin cycle comprises three phases: carboxylation (carbon fixation), reduction, and regeneration of ribulose 1,5-bisphosphate (RuBP) (Fig. 2-8). In the carbon fixation phase, carbon dioxide is added to RuBP, a five-carbon sugar phosphate, by the enzyme rubisco, forming a six-carbon product. This six-carbon intermediate splits into two 3-phosphoglycerate (3PG) molecules. A part of the 3PG is directed to the central metabolic pathway. During the subsequent reduction step, 3PG is converted into glyceraldehyde 3-phosphate (G3P) at the expense of one ATP and one NADPH. Of the six G3P molecules produced from 3PG, five are used to regenerate RuBP while one is used to make sugar for storage. In the regeneration reactions, 3 molecules of ATP and 5 molecules of G3P are consumed to regenerate 3 RuBP molecules, which are then carboxylated again by rubisco.

Figure 2-8. Schematic representation of the Calvin cycle. C indicates carbon and P represents a phosphate group. In terms of stoichiometry, three ribulose 1,5-bisphosphate (RuBP) molecules, each with 5 carbons, combine with 3 carbon dioxide molecules (1 carbon each) to give six 3-phosphoglycerate (3PG, 3 carbons) molecules. The carbon number is conserved during the conversion of 3PG into glyceraldehyde 3-phosphate (G3P), and 5 out of 6 G3P are used in the regeneration of 3 RuBP molecules.
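The carbon bookkeeping described for the Calvin cycle can be checked with a short sketch. It uses only the molecule counts and per-molecule ATP and NADPH costs stated in the text, on a basis of 3 CO2 fixed; the photon estimate assumes the 4-photon cost per NADPH quoted for linear electron flow and ignores any additional light needed for ATP supply. Variable and function names are illustrative.

```python
# Carbon atoms per molecule, as given in the Figure 2-8 caption.
CARBONS = {"CO2": 1, "RuBP": 5, "3PG": 3, "G3P": 3}


def total_carbon(pool: dict) -> int:
    """Sum carbon atoms over a {molecule: count} mapping."""
    return sum(CARBONS[name] * count for name, count in pool.items())


# Each phase as (substrates, products), on a basis of 3 CO2 fixed.
phases = {
    "carboxylation": ({"RuBP": 3, "CO2": 3}, {"3PG": 6}),
    "reduction": ({"3PG": 6}, {"G3P": 6}),
    "regeneration": ({"G3P": 5}, {"RuBP": 3}),
}

for name, (substrates, products) in phases.items():
    c_in, c_out = total_carbon(substrates), total_carbon(products)
    assert c_in == c_out, f"carbon not conserved in {name}"
    print(f"{name}: {c_in} C in, {c_out} C out")

atp_used = 6 * 1 + 3    # 1 ATP per 3PG reduced, plus 3 ATP for regeneration
nadph_used = 6 * 1      # 1 NADPH per 3PG reduced
net_g3p = 6 - 5         # G3P left over for biosynthesis and storage
photons_for_nadph = nadph_used * 4  # assumes 4 photons per NADPH

print(f"ATP: {atp_used}, NADPH: {nadph_used}, net G3P: {net_g3p}, "
      f"photons for NADPH: {photons_for_nadph}")
```

On this basis, fixing three CO2 molecules nets one three-carbon G3P, so every carbon withdrawn for a heterologous product has to be paid for in ATP and NADPH by the light reactions.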

Other features of cyanobacteria

Apart from their photosynthetic capability, cyanobacteria have several characteristics that are advantageous for turning them into cell factories for the renewable and sustainable production of biofuels and commodity chemicals. They have the fastest growth rate among all photosynthetic organisms as well as less complex intracellular structures and cell walls (Rosgaard et al., 2012). Also, their genetic simplicity and ease of transformation make genetic manipulation much easier than in other photosynthetic organisms such as plants and algae. As synthetic biology tools and genetic information for cyanobacteria become available, it is no longer a remote idea that advances in cyanobacteria engineering will turn them into a “green E. coli” (Berla et al., 2013). In fact, a number of chemical products ranging from sugars to branched alcohols and hydrocarbon jet fuels have already been produced photosynthetically via cyanobacteria engineering.

In metabolic engineering of cyanobacteria, most studies have been carried out with two model organisms, Synechococcus elongatus PCC 7942 (PCC 7942) and Synechocystis sp. PCC 6803 (PCC 6803). These two strains are the most extensively studied of all cyanobacterial strains, and full genome sequences and genome-scale models are available for both (Knoop et al., 2013; Broddrick et al., 2016). Specifically, PCC 6803 was the first photosynthetic microorganism whose full genome was sequenced. Moreover, PCC 6803 can metabolize organic carbon sources: it can grow on glucose via the glycolysis pathway and the oxidative pentose phosphate pathway (Yu et al., 2013). PCC 7942, on the other hand, is solely photoautotrophic and is the only strain for which a commercial engineering kit is available (Life Technologies Corp., Carlsbad, CA). Owing to the accessibility of genetic tools, PCC 7942 was chosen as the model strain in this study (Table 2-2).

Table 2-2. The model strain of cyanobacteria engineering in this study.

Strain                           Synechococcus elongatus PCC 7942
Metabolism                       Photoautotrophic
Genetic methods (a)              Conjugation, natural transformation
Ploidy (b)                       ~4 (Griese, Lange and Soppa, 2011)
Stability against chemicals (c)  Stable in long-chain fatty acids, relatively not in short-chain alcohols
Growth temperature (°C)          33-35
Doubling time (h)                12-24
Notes                            Commercial engineering kit available, TCA cycle not fully clarified yet

a: Voigt et al., 2011; Berla et al., 2013
b: Number of genome copies
c: Ruffing and Trahan, 2014

The elegance of cyanobacterial engineering is that it enables the direct conversion of the initial form of carbon in nature, CO2, into valuable products such as biofuels and commodity chemicals. Although cyanobacteria and algae share the ability of photosynthetic carbon fixation, the latter are more difficult to manipulate genetically because their genetic tools are less developed and their genetic systems are more complex than those of cyanobacteria (Wang, Wang and Meldrum, 2012). Cyanobacterial photosynthesis also allows the production of beneficial chemical compounds without the challenges to environmental sustainability and the impact on the global food supply (Rosgaard et al., 2012). Along with the development of synthetic biology tools, improved understanding of biochemical pathways in cyanobacteria, and the availability of genome sequences, cyanobacteria engineering will unlock the potential for sustainable production of value chemicals through the biotechnological use of light, water, and CO2.

2.4.3. Notes in cyanobacterial engineering

Besides the aforementioned advantages of cyanobacteria, there are a few points that should be considered when engineering cyanobacteria to produce value chemicals metabolically.

Driving force in heterologous metabolic pathway

Microorganisms have evolved to optimize their use of energy and metabolite flux to maximize survival and growth. Thus, an engineered microbe must have two systems functioning properly at the same time: growth and replication, and carbon conversion through the desired metabolic pathway. To draw carbon flux away from the natural processes during growth, heterologous pathways should either outcompete the enzyme activity of native pathways or be coupled to growth. In cyanobacteria engineering, this has been achieved by a few approaches; one is to use enzymatic conversions with low reversibility, such as decarboxylation (Gao et al., 2012; Oliver et al., 2013; Oliver and Atsumi, 2014). Pyruvate decarboxylase (PDC) from Zymomonas mobilis is known to be highly active in cyanobacteria and, combined with alcohol dehydrogenases, has been used for ethanol synthesis in engineered cyanobacteria (Deng and Coleman, 1999; Dexter and Fu, 2009; Gao et al., 2012). Also, because it is generated during photosynthesis, NADPH is more abundant than NADH in cyanobacteria (see Fig. 2-7). Coupling reduction steps to NADPH is therefore advantageous in cyanobacteria engineering, as it minimizes the impact on natural processes (Oliver et al., 2013, 2014).

Abundance in carbon pool

When heterologous pathways are introduced into a microorganism, they compete with natural pathways for carbon flux from the starting point to the end of metabolism. Therefore, when designing heterologous pathways in cyanobacteria, it is essential to withdraw carbon flux as close to the fixation cycle as possible, to maximize the available carbon pool and minimize losses due to partitioning by natural pathways (Oliver and Atsumi, 2014). One option that meets this criterion is to use pyruvate, which lies in the central metabolic pathway only 3 steps away from the carbon fixation cycle, as the starting point (Rosgaard et al., 2012; Oliver and Atsumi, 2014). In fact, among all chemical compounds produced via cyanobacteria engineering, two of the top 3 high-titer products, ethanol (~5.5 g/L) and 2,3-butanediol (~2.4 g/L), were produced from pyruvate (Gao et al., 2012; Oliver et al., 2013). It is assumed that diverting carbon flux to heterologous pathways increases the flux from carbon fixation to pyruvate to balance the depletion, providing more material for producing target metabolites (Oliver and Atsumi, 2014).

Considering the abovementioned features of cyanobacteria, we propose the photosynthetic production of 1,3BDO via heterologous expression of PDC, DERA and AKR in PCC 7942. First, the pathway starts from pyruvate, the most abundant metabolite, which meets the criterion suggested above; the relatively short DERA-based pathway also gives the target product an advantage in terms of access to the carbon pool. Second, the heterologous expression of PDC from Z. mobilis in PCC 7942 has already been shown to be effective for photosynthetic ethanol production (Deng and Coleman, 1999). Third, the proposed pathway requires only one cofactor, NADPH, for the AKR reduction, which is even less of a metabolic burden for PCC 7942 because of the abundance of NADPH in cyanobacteria. Based on these rationales, the main hypotheses and research objectives of this thesis are proposed in the following section.

2.5. Hypotheses and research objectives

This thesis will focus on the following hypotheses, each with a set of objectives that will be covered in the referenced chapters:

Hypothesis 1: Identification and biochemical characterization of AKRs enables a more efficient demonstration of 1,3BDO biosynthesis and suggests potential applications of AKRs in the bioprocessing of chemicals. (Chapter 3)

Objectives:

I. Identification of an AKR with a high activity towards 3-hydroxybutanal (3-HB)

The proposed DERA-based 1,3BDO synthetic pathway requires a reductase that meets two demands: high activity on 3-HB and little activity on acetaldehyde. Moreover, as the intermediate 3-HB is not a natural metabolite, the reaction conditions for this non-natural, uncharacterized reduction had to be carefully designed. Based on these requirements, an in vitro screening assay was designed to identify the target AKR.

II. Biochemical characterization of the novel AKRs

Generally, AKRs exhibit catalytic promiscuity with a broad substrate range, which makes it difficult to assign specific physiological functions or to suggest biocatalytic applications. Thus, biochemical characterization of the novel AKRs promotes understanding of the structure-sequence-function relationship as well as of potential applications. This study encompasses exploring the substrate profile and measuring kinetic parameters.
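For reference, the kinetic parameters mentioned here are conventionally defined through the Michaelis-Menten rate law. The expressions below are a general sketch and assume that the AKR reactions follow simple Michaelis-Menten kinetics, which is not stated in this section; the substrate labels in the selectivity ratio correspond to the two demands on the reductase described under Objective I.

```latex
\[
v_0 = \frac{k_{\mathrm{cat}}\,[E]_0\,[S]}{K_M + [S]},
\qquad
v_0 \approx \frac{k_{\mathrm{cat}}}{K_M}\,[E]_0\,[S] \quad \text{for } [S] \ll K_M
\]

\[
\text{selectivity for 3-HB over acetaldehyde}
= \frac{\left(k_{\mathrm{cat}}/K_M\right)_{\text{3-HB}}}{\left(k_{\mathrm{cat}}/K_M\right)_{\text{acetaldehyde}}}
\]
```

Here $v_0$ is the initial rate, $[E]_0$ the total enzyme concentration, and $k_{\mathrm{cat}}/K_M$ the catalytic efficiency, so a reductase suited to the pathway would show a large value of this ratio.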

III. Structure analysis and site-directed mutagenesis study

Since we were not able to crystallize our target AKR, we used a homology modeling approach and combined it with a crystal structure study of another novel AKR. Comparison of the two enzyme structures led us to a site-directed mutagenesis study to identify the key residues in AKR catalysis. Furthermore, the key residues were substituted with different amino acids, which enhanced the catalytic efficiency of the target enzyme and deepened our understanding of the AKR reduction mechanism.

Hypothesis 2: As DERA condensation is the rate-limiting step in the 1,3BDO pathway, identification of a highly active DERA combined with protein engineering would improve the productivity of the pathway. (Chapters 4 and 5)

Objectives:

I. Identification of highly active DERAs in terms of acetaldehyde aldol condensation

Phylogenetic analysis of over 2,500 annotated DERA sequences revealed at least 5 subgroups within the DERA family. From our enzyme database, we chose 20 DERAs covering each subgroup as well as non-classified sequences and proceeded to an in vitro screening assay using purified enzymes. In this screening assay, PA1127 from Pseudomonas aeruginosa, the novel AKR studied in Chapter 3, was used as the reporter enzyme to indicate DERA activity by measuring the 1,3BDO produced in the DERA-AKR coupled reaction.

II. Crystal structure study and biochemical characterization of the target DERA

From the screening above, BH1352 from Bacillus halodurans was chosen as the target enzyme. It was then subjected to crystallization for structural study in collaboration with Prof. Savchenko’s group. Based on the structural analysis and a DERA sequence alignment study, we identified the key catalytic residues and structural features of BH1352. Biochemical studies were also conducted, including determination of the optimum pH and the kinetic parameters of the 2-deoxyribose-5-phosphate retro-aldol cleavage reaction, measurement of statin precursor synthesis, and assessment of stability against acetaldehyde.

III. Structure-guided enzyme engineering to improve 1,3BDO synthesis from DERA-AKR coupled reaction

We identified the structural features of BH1352 by structural analysis and designed site-directed mutagenesis to improve its aldol condensation activity. On the hypothesis that the hydrophobic cluster in the BH1352 core region and the substrate entrance are involved in the condensation of acetaldehyde molecules, amino acid substitutions of the key residues were rationally designed. Some of the mutations were effective in enhancing 1,3BDO formation in the DERA-AKR coupled reaction and were carried forward to the in vivo study via heterologous expression of the pathway enzymes in a correspondingly engineered E. coli.

IV. In vivo application of the engineered DERA in E. coli (Chapter 5)

The mutations were introduced into the 1,3BDO-producing strain designed by Kayla Nemr to verify whether they are also effective in vivo in glucose fermentation using E. coli. The two mutations that gave the greatest improvement in the in vitro DERA-AKR coupled reaction were introduced into the strain harboring the 1,3BDO synthetic pathway, which led to an even greater improvement in product titer and yield over the wildtype BH1352.

Hypothesis 3: Introduction of 1,3BDO pathway enables photosynthetic 1,3BDO biosynthesis from carbon dioxide. (Chapter 5)

Objectives:

I. Cyanobacteria engineering platform design

In this study, which marked the first introduction of a cyanobacterial strain to the department, the whole system for cyanobacteria work had to be designed. Synechococcus elongatus PCC 7942 was used as the host strain because it is one of the most studied cyanobacterial strains and an engineering kit (a plasmid designed for integrating heterologous genes into the genome) was commercially available. Furthermore, an in-house bioreactor for PCC 7942 was set up and optimized for growing and characterizing engineered strains.

II. Heterologous expression of 1,3BDO pathway in PCC 7942

The biosynthetic pathway for 1,3BDO was introduced into PCC 7942 via genome integration with various optimization approaches. The first approach was to modify gene expression by using different promoters, such as a constitutive promoter (psbA1) and an IPTG-inducible promoter (Ptrc), or by switching the order of PDC, DERA, and AKR within the operon. After optimization of gene expression using the Ptrc promoter, a polyhistidine tag was attached to the N-terminus of each gene for physical verification of protein expression. Once protein expression in PCC 7942 was established, multiple candidate DERAs, including BH1352 and its mutants, were used in the pathway for the photosynthetic production of 1,3BDO.

2.6. Publication status and contribution

The findings of this thesis are included in the following published research articles and in a manuscript in preparation for submission to a peer-reviewed journal, as listed below.

1. Structural and biochemical studies of novel aldo-keto reductases for the biocatalytic conversion of 3-hydroxybutanal to 1,3-butanediol. Applied and Environmental Microbiology. 2017. 83, 7, e03172-16

Authors: Taeho Kim,a Robert Flick,a Joseph Brunzelle,b Alex Singer,a Elena Evdokimova,a Greg Brown,a Jeong Chan Joo,c George A. Minasov,b Wayne F. Anderson,b Radhakrishnan Mahadevan,a Alexei Savchenko,a Alexander F. Yakunin,a*

Affiliations: a Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, Ontario, Canada; b Center for Structural Genomics of Infectious Diseases (CSGID), Department of Molecular Pharmacology and Biological Chemistry, Northwestern University, Chicago, Illinois, USA; c Center for Biobased Chemistry, Division of Convergence Chemistry, Korea Research Institute of Chemical Technology, Daejeon, Republic of Korea

Contributions: TK designed site-directed mutagenesis based on analysis of the crystal structure (STM2406) and the model structure (PA1127). TK mainly conducted cloning, protein purification, and biochemical experiments including enzymatic assay. RF, GB, and JCJ partly contributed to the experimental work including cloning and protein purification. TK conducted bioinformatic study including protein sequence alignment and phylogenetic analysis. TK and RF conducted and analyzed LC-MS part of the study. JB, AS, EE, GAM, WFA, and AS conducted crystallization and data collection. TK, JCJ, RM, and AFY analyzed the experimental data of enzymatic assay.

RM contributed reagents and materials. AS, RM, and AFY conceived and supervised the study. TK, RF, RM, and AFY wrote the manuscript.

2. Alkene hydrogenation activity of enoate reductases for an environmentally benign biosynthesis of adipic acid. Chemical Science. 2017. 8, 2, 1406-1413.

Authors: Jeong Chan Joo,a* Anna N. Khusnutdinov,a,b Robert Flick,b Taeho Kim,b Uwe T. Bornscheuer,c Alexander F. Yakunin,b* and Radhakrishnan Mahadevan,b*

Affiliations: a Center for Bio-based Chemistry, Division of Convergence Chemistry, Korea Research Institute of Chemical Technology, Republic of Korea; b Department of Chemical Engineering and Applied Chemistry, University of Toronto, Canada; c Institute of Biochemistry, Department of Biotechnology & , Greifswald University, Germany

Contributions: JCJ and ANK equally contributed to the study. JCJ, ANK, RM, and AFY designed experiments. JCJ and ANK mainly conducted biochemical assay. JCJ, ANK, and TK conducted cloning and protein purification. RF contributed to the analytical work of the study. JCJ, ANK, RM, and AFY analyzed the data and wrote the manuscript. JCJ, UTB, AFY, and RM conceived and supervised the study. AFY and RM contributed reagents and materials.

3. Structural and Biochemical Studies of 2-Deoxyribose-5-phosphate Aldolase BH1352 from Bacillus halodurans for the Biosynthesis of 1,3-Butanediol.

In preparation for ACS Catalysis

Authors: Taeho Kim,a Kayla Nemr,a Peter J. Stogios,a Tatiana Skarina,a Robert Flick,a Jeong C. Joo,b Radhakrishnan Mahadevan,a Alexei Savchenko,a,c and Alexander F. Yakunin,a*

Affiliations: a Department of Chemical Engineering and Applied Chemistry, University of Toronto, Canada; b Center for Bio-Based Chemistry, Division of Convergence Chemistry, Korea Research Institute of Chemical Technology, Republic of Korea; c Department of Microbiology, Immunology, and Infectious Diseases, University of Calgary, Canada

Contributions: TK, JCJ, and KN designed and conducted screening assay to identify the target enzyme of the study. TK conducted bioinformatic study of the family of the enzyme of interest, including phylogenetic analysis and protein sequence alignment. TK designed site-directed mutagenesis and conducted cloning, protein purification, and enzymatic assay. TK purified BH1352 for crystallization and transferred it to TS, who conducted crystallization of BH1352. PJS mainly analyzed the crystallization data. TK, PJS, and AFY conducted the structural analysis of BH1352. TK, KN, and RF conducted analytical work including HPLC, LC-MS, and NMR analysis. TK and KN designed and conducted the bioreactor fermentation study (included in Chapter 5). TK, KN, RM, and AFY analyzed the experimental data. TK and AFY mainly wrote the manuscript, and PJS, KN, and RM contributed to the revision of the manuscript. RM, AS, and AFY conceived and supervised the study, and contributed reagents and materials.

Chapter 3. Novel Aldo-Keto Reductases for the Biocatalytic Conversion of 3-Hydroxybutanal to 1,3-Butanediol: Structural and Biochemical Studies

This chapter is reproduced from the following research article with permission from authors.

Applied and Environmental Microbiology. 2017. 83, 7, e03172-16.

3.1. Abstract

The non-natural alcohol 1,3-butanediol (1,3BDO) is a valuable building block for the synthesis of various polymers. One of the potential pathways for the biosynthesis of 1,3BDO includes the biotransformation of acetaldehyde to 1,3BDO via 3-hydroxybutanal (3-HB) using aldolases and aldo-keto reductases (AKRs). This pathway requires an AKR selective for 3-HB, but inactive toward acetaldehyde, so it can be used for one-pot synthesis. In this work, we screened more than 20 purified uncharacterized AKRs for 3-HB reduction and identified 10 enzymes with significant activity and nine proteins with detectable activity. PA1127 from Pseudomonas aeruginosa showed the highest activity and was selected for comparative studies with STM2406 from Salmonella enterica serovar Typhimurium, for which we have determined the crystal structure. Both AKRs used NADPH as a cofactor, reduced a broad range of aldehydes, and showed low activities toward acetaldehyde. The crystal structures of STM2406 in complex with cacodylate or NADPH revealed the active site with bound molecules of a substrate mimic or cofactor. Site-directed mutagenesis of STM2406 and PA1127 identified the key residues important for the activity against 3-HB and aromatic aldehydes, which include the residues of the substrate-binding pocket and C-terminal loop. Our results revealed that the replacement of the STM2406 Asn65 by Met enhanced the activity and the affinity of this protein toward 3-HB, resulting in a 7-fold increase in kcat/Km. Our work provides further insights into the molecular mechanisms of the substrate selectivity of AKRs and for the rational design of these enzymes toward new substrates.

3.2. Introduction

Presently, most of the large-volume chemicals and polymer building blocks used by the chemical industry are manufactured from non-renewable feedstocks, such as oil and natural gas. Each year, over 80 million tons of industrial chemicals are produced globally from fossil-based feedstocks providing the materials and products for every aspect of our lives (Burk, 2010). However, the petroleum-based processes and petrochemicals imperil the environment, the economy, and overall global security. Therefore, the sustainable, biocatalytic production of chemicals and materials from renewable feedstocks has gained increasing interest due to global concerns over environmental problems caused by the large scale use of petroleum and limited fossil resources (Lee et al., 2011; Jang et al., 2012). Rapid advances in molecular biology, protein engineering, and synthetic biology have created tremendous opportunities for the biocatalytic conversion of renewable feedstocks to many natural and non-natural chemicals including diols (Sabra, Groeger and Zeng, 2016).

Low molecular weight diol compounds such as 1,3-butanediol (1,3BDO) have a wide range of applications as chemicals and fuels. 1,3BDO is a non-natural diol, which is used as a building block for the production of various pheromones, fragrances, insecticides, and antibiotics (Matsuyama et al., 2001). However, the more significant potential of 1,3BDO lies in its conversion into 1,3-butadiene (Ichikawa et al., 2005, 2006), which is used as a base material in the manufacture of synthetic rubbers, elastomers, and polymer resins with a global demand of up to 10 million metric tons annually (Makshina et al., 2014). The present methods of chemical synthesis of diols from petroleum usually require high pressure, high temperature, and expensive catalysts, and are associated with the formation of toxic intermediates (Jiang et al., 2014). Moreover, unlike 2,3-butanediol and 1,4-butanediol, 1,3BDO has no precursors or structural analogs among native metabolic intermediates (Jiang et al., 2014). Thus, the design of a novel, non-natural metabolic pathway is required for the biosynthesis of 1,3BDO.

Presently, two groups have proposed an artificial biosynthetic route for 1,3BDO production from glucose, which is based on a reversed fatty acid β-oxidation pathway (Kataoka et al., 2013, 2014; Gulevich et al., 2016). This pathway includes four different enzymes: 3-ketothiolase, acetoacetyl-CoA reductase, and two dehydrogenases. The expression of heterologous genes from several organisms in Escherichia coli has resulted in the production of up to 15 g/L of 1,3BDO (174.8 mM) with the yield of 0.37 mol/mol of glucose (Kataoka et al., 2014). However, this pathway requires the use of one CoA and three NAD(P)H reductant molecules per molecule of 1,3BDO produced. Recently, we have proposed a novel biosynthetic pathway for 1,3BDO production, which is based on the condensation of two acetaldehyde molecules by a 2-deoxyribose-5-phosphate aldolase (DERA) with the subsequent reduction of the produced 3-hydroxybutanal (3-HB) to 1,3BDO using an aldo-keto reductase (AKR) (Fig. 3-1). This pathway is advantageous because it is shorter than the acetoacetyl-CoA pathway and requires only one reducing cofactor molecule. However, the DERA aldolase has been shown to be able to sequentially add additional acetaldehyde molecules to the first reaction product (e.g. 3-HB) (Gijsen and Wong, 1994; Sakuraba et al., 2007). Therefore, to minimize the formation of byproducts, the DERA-AKR pathway should use the DERA aldolases specific to acetaldehyde (and inactive against 3-HB) and AKRs with a preference to 3-HB (with low or no activity against acetaldehyde).

Figure 3-1. The proposed DERA-AKR pathway of 1,3BDO biosynthesis from acetaldehyde.

The AKRs are a superfamily of ubiquitous enzymes with diverse functions in carbon metabolism and drug detoxification. Their role has been related to innumerable biological functions including drug metabolism, vitamin C biosynthesis, steroid metabolism, diabetic complications, and xylose metabolism (Anderson et al., 1985; Chung and Chung, 2003; Ehrensberger and Wilson, 2004; Barski, Tipparaju and Bhatnagar, 2008; Penning, 2015). AKRs are also involved in the detoxification of xenobiotics, methylglyoxal, polycyclic aromatic hydrocarbons, naloxone, naltrexone, and dihydromorphinone (Jez et al., 1997; Jez and Penning, 2001; Ellis, 2002; Lapthorn, Zhu and Ellis, 2013). They are implicated in oxidative defense, xenobiotic metabolism, transcriptional regulation, and in many metabolic pathways including sugar and amino acid metabolism, as well as in the biosynthesis of steroids and secondary metabolites (Wilson et al., 1992; Hoog et al., 1994; Hyndman et al., 2003; Penning, 2015). However, there is a paucity of data in the functional assignment and characterization of microbial AKRs compared to mammalian enzymes (Richter et al., 2010). Presently, there are >190 experimentally annotated proteins in the AKR homepage (http://www.med.upenn.edu/akr/) which fall into 16 families with different substrate preferences (Schlegel, Jez and Penning, 1998). AKRs catalyze the NAD(P)H-dependent reduction of aldehydes and ketones into primary and secondary alcohols, respectively. Although these reactions can also be catalyzed by short- and medium-chain dehydrogenases, the AKR enzymes favor aldehyde reduction producing alcohols. Interestingly, AKRs share the catalytic mechanism with short-chain dehydrogenases even though they have different structural folds, suggesting convergent evolution of these two groups of enzymes (Penning, 2015).

All characterized AKR enzymes have the same protein fold, a triose-phosphate isomerase (TIM) barrel or (α/β)8-barrel with several additional α-helices (Huisman, Liang and Krebber, 2010; Calam et al., 2013). The active site of AKRs is located at the C-terminal face of the barrel surrounded by variable loops contributing to substrate specificity (Penning, 2015). The substrate specificity is modulated by the residues located at the edge of the active site and on the variable C-terminal loops (Feske, Kaluzna and Stewart, 2005; Barski, Tipparaju and Bhatnagar, 2008). The active sites of AKRs include a cofactor (NADPH) binding site and a conserved catalytic tetrad of four amino acid residues (Asp, Tyr, Lys, and His). AKRs use an ordered bi-bi kinetic mechanism wherein NAD(P)H binds first and leaves last (Penning, 2015). Most AKRs prefer NADPH over NADH, with the cofactor bound in the central cavity of the barrel in close proximity to the catalytic tetrad, inducing a conformational change (Ehrensberger and Wilson, 2004; Feske, Kaluzna and Stewart, 2005). This is followed by the binding of the carbonyl-containing substrate to the active site pocket in an appropriate orientation (Kaluzna et al., 2004). It has been proposed that the nicotinamide ring of the cofactor together with the Tyr and His of the catalytic tetrad create an anion hole for the binding of the substrate carbonyl oxygen, whereas the Asp and Lys facilitate proton donation (Wilson et al., 1992; Ehrensberger and Wilson, 2004; Penning, 2015).

Although much of the interest in AKRs is associated with their role in the development of human diseases, there is also a significant interest in their application in biocatalysis (Marquardt et al., 2005; Hoyos et al., 2010). To date, several biotechnological applications of AKRs have been reported including the synthesis of drugs, fuels, and intermediates for the production of fine chemicals (Brändén, 1991; Bohren, Grimshaw and Gabbay, 1992; Kuznetsova et al., 2015). However, there is a paucity of data in biochemical and structural characterization of microbial AKRs compared to the mammalian enzymes (Richter et al., 2010). In this study, we performed the biochemical and structural characterization of microbial AKRs with the aim of identifying efficient reductases for the last step of the DERA-AKR pathway (reduction of 3-HB to 1,3BDO, see Fig. 3-1). We screened 21 purified AKRs from different organisms for reductase activity against 3-HB to choose the target AKRs for the application in 1,3BDO synthesis. The crystal structures of STM2406 from Salmonella typhimurium in complex with cacodylate or NADPH were determined, and this enzyme was biochemically characterized along with PA1127 from Pseudomonas aeruginosa, which showed the highest activity against 3-HB. Using structure-based site-directed mutagenesis, we characterized the role of the active site residues and the C-terminal loop in the catalytic activity and substrate selectivity of STM2406 and PA1127. This work contributes to the understanding of the molecular mechanisms of the activity and substrate selectivity of AKRs.

3.3. Materials and Methods

Gene cloning, expression, purification, and mutagenesis

The 21 AKR genes studied in this work (see Table A1 in Appendix A) were amplified by PCR from the corresponding genomes and cloned into the NdeI and BamHI sites of the p15TV-L vector (modified from pET15b), in which the TEV protease cleavage site replaced the thrombin cleavage site and a double stop codon was introduced downstream from the BamHI site. This construct provides for an N-terminal hexa-His tag separated from the gene by a TEV protease recognition site (ENLYFQ/G). The fusion proteins were overexpressed in E. coli BL21-Gold (DE3) (Stratagene) harboring an extra plasmid encoding three rare tRNAs (AGG and AGA for Arg, ATA for Ile) (Zhang et al., 2001). The cells were grown in terrific broth media at 37°C to an OD600 of approximately 1.0 and protein expression was induced with 1.0 mM IPTG. After induction, the cells were incubated overnight with shaking at 16°C. The harvested cells were resuspended in binding buffer (500 mM NaCl, 5% glycerol, 50 mM HEPES at pH 7.5, 5 mM imidazole), flash-frozen in liquid N2, and stored at -70°C. The thawed cells were lysed by sonication and the lysate was clarified by centrifugation (30 min at 27,000 × g). The clear supernatant was applied to a metal chelate affinity column charged with Ni2+. The hexa-His tagged proteins were eluted from the column using an elution buffer (500 mM NaCl, 5% glycerol, 50 mM HEPES at pH 7.5, 500 mM imidazole), and the eluates were flash-frozen in liquid N2 and stored at -70°C for further use (see Appendix A, Fig. A1). Site-directed mutagenesis was performed using the QuikChange site-directed mutagenesis kit (Stratagene) according to the manufacturer’s protocol. The enzyme variants were prepared using the same method as described above.

Enzymatic assays

Purified AKRs were screened spectrophotometrically for reductase activity against 10 mM 3-HB for 10 min at 37°C using the following reaction mixture (0.2 ml): 50 mM potassium phosphate buffer (pH 7.0), 0.5 mM NADPH, and 10 µg of AKR. The substrate profiles of purified AKRs were generated via spectrophotometric screens using 96-well microplates at 37°C for 10 min in reaction mixtures containing K2HPO4 (50 mM, pH 7.0), KCl (10 mM), EDTA (0.5 mM), NADPH (0.5 mM), various substrates (1 mM), and protein (5 µg of STM2406 and 2 µg of PA1127) in a final volume of 200 µl, as used in a previous study (Ehrensberger and Wilson, 2003; Ehrensberger and Wilson, 2004). The reactions were monitored by following the decrease in absorbance at 340 nm as a measure of the conversion of the cofactor NADPH (ε340 nm = 6,220 M⁻¹·cm⁻¹) to NADP+. The kinetics of STM2406 and PA1127 were determined from specific activities over a range of substrate concentrations (15.625 µM to 64 mM, depending on the enzymes and substrates) and a specified amount of protein (5 µg of STM2406 and 2 µg of PA1127 in 200 µl reaction volumes) over a period of 3 min at 37°C. The kinetic parameters were calculated by a nonlinear regression analysis of raw data fit to the sigmoidal function using GraphPad Prism software (version 5.04 for Windows). The effects of site-directed mutagenesis on the activities of STM2406 and PA1127 were determined as previously described for measuring the kinetic parameters of wildtype proteins.
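As an illustration of how the absorbance readout maps to a specific activity, the following minimal sketch applies the Beer-Lambert law with the NADPH extinction coefficient given above. The function name, the optical path length, and the example slope are assumptions introduced for illustration; they are not values reported in this study (the path length in a microplate well depends on the plate and fill volume and would need to be measured).

```python
# Extinction coefficient of NADPH at 340 nm, as stated in the text.
EPSILON_NADPH_340 = 6220.0  # M^-1 cm^-1


def specific_activity(delta_a340_per_min, path_cm, volume_ml, protein_mg):
    """Return specific activity in µmol NADPH oxidized per min per mg protein.

    delta_a340_per_min: magnitude of the A340 decrease per minute
    path_cm:            optical path length (assumed; plate dependent)
    volume_ml:          reaction volume in ml
    protein_mg:         amount of enzyme in the reaction in mg
    """
    # Beer-Lambert law: concentration change (mol/L per min) = dA / (epsilon * l)
    rate_molar_per_min = delta_a340_per_min / (EPSILON_NADPH_340 * path_cm)
    # mol/L/min * L -> mol/min, then convert to µmol/min
    umol_per_min = rate_molar_per_min * (volume_ml / 1000.0) * 1e6
    return umol_per_min / protein_mg


if __name__ == "__main__":
    # Hypothetical reading: 0.10 A340/min, 0.56 cm path, 0.2 ml reaction, 2 µg enzyme
    print(specific_activity(0.10, 0.56, 0.2, 0.002))
```

Keeping the path length as an explicit argument avoids silently assuming a 1 cm cuvette when the measurement was made in a 96-well plate.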

LC-MS analysis of reaction products

The LC-MS platform consisted of a Dionex UltiMate 3000 UHPLC system and a Q Exactive mass spectrometer equipped with a HESI II source (all from Thermo Scientific). The control of the system and data handling were performed using Thermo Xcalibur 2.2 software and Chromeleon 7.2 software. Separation by liquid chromatography was conducted on a Hypersil Gold C18 column (50 mm by 2.1 mm, 1.9 µm particle size; Thermo Scientific) equipped with a guard column. The pump was run at a flow rate of 200 µl/min. Solvent A was water containing 0.1% formic acid, and solvent B was methanol (MeOH). The system was run in isocratic mode for 5 min with 10% solvent B, washed with 90% solvent B for 5 min, and equilibrated for 5 min with 10% solvent B. The autosampler temperature was maintained at 8°C and injection volumes were 10 µl. Data were collected in positive ionization mode with a scan range m/z of 80 to 200, resolution of 70,000 at 1 Hz, automatic gain control (AGC) target of 3e6, and a maximum injection time of 200 ms. A standard solution of 1,3-butanediol (detected as a sodium adduct, M + Na, m/z 113.0577) was used for validating the retention time and m/z values.
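The m/z quoted for the sodium adduct can be cross-checked from monoisotopic masses. The sketch below is illustrative only: the function name is hypothetical, the atomic masses are standard reference values rounded to eight decimals, and the formula C4H10O2 for 1,3-butanediol is supplied here rather than taken from the text.

```python
# Standard monoisotopic masses (Da), rounded; supplied here as assumptions.
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "O": 15.99491462,
    "Na": 22.98976928,
}
ELECTRON_MASS = 0.00054858  # Da


def sodium_adduct_mz(formula):
    """Return m/z of the singly charged [M+Na]+ ion for an elemental formula.

    formula: dict mapping element symbol to atom count, e.g. {"C": 4, "H": 10, "O": 2}
    """
    neutral = sum(MONOISOTOPIC_MASS[element] * count
                  for element, count in formula.items())
    # Add Na and subtract one electron for the +1 charge state.
    return neutral + MONOISOTOPIC_MASS["Na"] - ELECTRON_MASS


BUTANEDIOL = {"C": 4, "H": 10, "O": 2}  # 1,3-butanediol

if __name__ == "__main__":
    print(round(sodium_adduct_mz(BUTANEDIOL), 4))
```

The computed value agrees with the m/z of 113.0577 given above to within 0.001, which is consistent with the assignment of the detected ion as the sodium adduct.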

Protein crystallization

Purified STM2406 was dialyzed at 4°C in HEPES (10 mM, pH 7.5), NaCl (300 mM), and tris(2-carboxyethyl)phosphine hydrochloride (TCEP; 0.5 mM), concentrated to 27.5 mg/ml, and stored at -70°C. The crystallization trials were performed at room temperature using hanging-drop vapor diffusion with an optimized sparse matrix crystallization screen. The STM2406 crystal used for data collection (see Appendix A, Table A2) was grown from a crystallization liquor containing calcium acetate (0.2 M), polyethylene glycol (PEG)-8K (9%), and sodium cacodylate (0.1 M, pH 6.5) (Hampton Research, Aliso Viejo, CA, USA). The STM2406 crystals were cryoprotected in the same buffer supplemented with ethylene glycol (25%) and flash frozen in liquid nitrogen.

Data collection, structure determination, and refinement

The diffraction data for a native crystal of STM2406 were collected at 100 K at Advanced Photon Source (APS) beamline 21-IDG. The diffraction data were integrated and scaled using HKL2000, and the structure was solved by molecular replacement using the program PHASER and PDB code 3ERP (Otwinowski and Minor, 1997; McCoy et al., 2007). Initially, molecular replacement was performed on a crystal diffracting to 2.15 Å on our home-source Rigaku MicroMax-007 generator equipped with osmic mirrors and a Raxis4++ detector, followed by rigid-body refinement using the data set collected on the second native crystal at APS beamline 21-IDG to 1.55 Å and the lower-resolution STM2406 structure, all using programs within the CCP4 program suite (Winn et al., 2011). The model was improved by alternate cycles of manual building and water-picking using COOT and restrained refinement against a maximum-likelihood target, with 5% of the reflections randomly excluded as an Rfree test set (Emsley and Cowtan, 2004; Emsley et al., 2010). These refinement steps were performed using REFMAC in the CCP4 program suite (Murshudov et al., 2011). The final model contained 2 nearly complete chains, with residues 2 to 235 and 252 to 332 in chain A and residues 2 to 233 and 258 to 331 in chain B, and was refined to an Rwork and Rfree of 15.5 and 17.8%, respectively, including translation-libration-screw rotation (TLS) parameterization (Painter and Merritt, 2006). The Ramachandran plot generated by PROCHECK showed very good stereochemistry overall, with 99.4% of the residues in the most favored and additionally allowed regions (Laskowski et al., 1993). The data collection, phasing, and structure refinement statistics for this structure are summarized in Table A2 (see Appendix A).
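For readers less familiar with the refinement statistics quoted above, the crystallographic R factor is conventionally defined as shown below. This is the standard textbook definition, given here for orientation and not restated from the refinement software output; the symbols F_obs and F_calc (observed and model-calculated structure factor amplitudes) are the usual notation and are assumptions with respect to this document.

```latex
R = \frac{\sum_{hkl} \bigl|\, |F_{\mathrm{obs}}(hkl)| - |F_{\mathrm{calc}}(hkl)| \,\bigr|}
         {\sum_{hkl} |F_{\mathrm{obs}}(hkl)|}
```

Rwork is evaluated over the reflections used in refinement, whereas Rfree is evaluated over the test set excluded from refinement (here, 5% of the reflections), so a small gap between the two suggests that the model has not been overfitted.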

Bioinformatic analysis

The primary structure alignments of AKRs were performed using the Clustal Omega web server (http://www.ebi.ac.uk/Tools/msa/clustalo/) (Sievers et al., 2011). The alignment figure was generated by ESPript 3.0 (Robert and Gouet, 2014). The evolutionary history was inferred using the neighbor-joining method, and the evolutionary distances were computed using the p-distance method (Saitou and Nei, 1987). The analysis involved 21 amino acid sequences, including the 11 AKRs used in the initial screening assay and 13 pre-annotated AKRs (https://www.med.upenn.edu/akr/members.html) from AKR subfamilies 5, 11, 13, and 14. Evolutionary analyses were conducted in MEGA6 (Tamura et al., 2013). The PA1127 structural model was generated using the Phyre2 web portal for protein modeling, prediction, and analysis (Kelley et al., 2015).
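The p-distance used above is the proportion of aligned sites at which two sequences differ. The following minimal sketch shows the calculation for one pair of aligned sequences; the function name and the example peptides are hypothetical, and the handling of gaps (gapped columns are skipped for that pair) is an assumption, since the gap treatment selected in MEGA6 is not stated here.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Return the proportion of differing sites between two aligned sequences.

    Columns in which either sequence carries a gap are skipped
    (pairwise deletion; an assumption made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differences = 0
    for residue_a, residue_b in zip(seq_a, seq_b):
        if residue_a == gap or residue_b == gap:
            continue
        compared += 1
        if residue_a != residue_b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differences / compared


if __name__ == "__main__":
    # Hypothetical aligned fragments, not taken from the AKR alignment
    print(p_distance("MKTAY", "MKSAY"))
```

A matrix of such pairwise values is the input from which the neighbor-joining method builds the tree.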

Accession number(s). The atomic coordinates and structure factors have been deposited in the Protein Data Bank with accession codes 3ERP (STM2406 apo-structure) and 5T79 (STM2406 complex with NADPH).

3.4. Results and Discussion

3.4.1. Screening of purified AKRs for aldo-keto reductase activity against 3-HB

In our previous work, we identified 21 uncharacterized soluble AKRs from different organisms with detectable reductase activity toward various aldehydes and ketones (see Table A1 in Appendix A). In this work, using an NADPH oxidation-based assay, we screened the purified AKR proteins for reductase activity against 3-HB and identified 15 proteins with measurable activity toward this substrate (Fig. 3-2). Among these AKRs, PA1127 from Pseudomonas aeruginosa showed the highest activity in reducing 3-HB. Crystallization trials with the AKRs active toward 3-HB (including PA1127) produced diffracting crystals of STM2406 from Salmonella typhimurium, and its crystal structure was determined in complex with cacodylate (a substrate mimic) or NADPH (Protein Data Bank [PDB] codes 3ERP and 5T79). Since PA1127 produced no diffracting crystals, a structural model of this protein was generated (see 3.3. Materials and Methods) and used for comparative, structure-based mutational analysis of 3-HB reduction by STM2406 and PA1127. In this analysis, we also used the crystal structures of the Bacillus subtilis YvgN, YhdN, and YtbE (see Fig. 3-2), which are available in the PDB database (3F7J, 1PZ1, and 3B3D, respectively). A liquid chromatography-mass spectrometry (LC-MS)-based analysis of the reaction mixtures after incubating 3-HB with PA1127 and STM2406 confirmed the formation of 1,3BDO as the reaction product (see Appendix A Fig. A1).

Figure 3-2. Screening of 21 purified AKRs for reductase activity against 3-HB. The reaction mixtures contained 10 mM 3-HB, 0.5 mM NADPH, and 10 µg of the indicated purified protein. For each enzyme, the UniProt gene code and source organism are indicated, and the two AKRs characterized in this work are boxed.

A phylogenetic analysis of the 10 AKRs with the highest activities against 3-HB, STM2406, and an additional 11 pre-annotated AKRs revealed that STM2406 belongs to the AKR subfamily 14 (61% sequence identity with the E. coli AKR14A1), while PA1127 is a member of the AKR subfamily 11 (62% sequence identity with the B. subtilis AKR11B1) (Fig. 3-3). Most of the AKRs with significant activity toward 3-HB are associated with the AKR5 subfamily (BCE_5206, YtbE, YvgN, BH2158, BCE_0216, and BH3849) and AKR11 group (PA1127 and YhdN), whereas STM2406 belongs to the AKR14 subfamily.

Figure 3-3. Phylogenetic analysis of AKRs used in this work. The enzymes characterized in this work (STM2406 and PA1127) are marked with red boxes. The tree encompasses the AKRs with high activities against 3-HB, as well as STM2406 and the proteins whose subfamilies have already been assigned (shown next to the protein). The phylogenetic tree was generated using the neighbor-joining method (Tamura et al., 2013). The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances are in the units of the number of amino acid differences per site. The numbers next to each node represent the probability that supports the evidence for excluding clusters below the node point. These statistical values were calculated by the bootstrap method with 500 replications and used to infer AKR subfamilies of enzymes used in this work. All of the AKRs are represented with UniProt gene names and host organisms.

3.4.2. Biochemical characterization of STM2406 and PA1127

The substrate scopes of STM2406 and PA1127 were first assessed using a set of 54 substrates, which included common AKR substrates, such as 4-nitrobenzaldehyde and methylglyoxal, as well as other aromatic and aliphatic aldehydes, ketones, and carbohydrates (Ehrensberger and Wilson, 2004; Marquardt et al., 2005). Initial screens revealed that STM2406 reduced mostly aldehyde substrates and isatin, a ketone compound, but showed no activity toward sugars, whereas PA1127 was active against the substrates from all three groups (see Appendix A Fig. A2). On the basis of these results, 18 aldehyde compounds, including acetaldehyde, were selected for the quantitative assessment of protein substrate profiles (Fig. 3-4). Both proteins showed higher activity against aldehydes, with 4-pyridinecarboxaldehyde being the best substrate. STM2406 also showed high activity toward 4-nitrobenzaldehyde but was less active against other aromatic and aliphatic aldehydes. In contrast, PA1127 showed comparable activities against at least five other substrates, including both aromatic (4-nitrobenzaldehyde and 3-pyridinecarboxaldehyde) and aliphatic (pentanal, hexanal, and methylglyoxal) aldehydes (see Fig. 3-4). This is in line with the high similarity of this protein to the B. subtilis AKR11B1 (YhdN) (Ehrensberger and Wilson, 2004). STM2406 and PA1127 were strictly specific to NADPH as a cofactor and showed no activity in the presence of NADH. In addition, both proteins showed negligible activity toward acetaldehyde, making them suitable for applications in aldol condensation of acetaldehyde.

STM2406 and PA1127 exhibited saturation toward most of the positive substrates, except for butanal, pentanal, and hexanal for STM2406 and glyoxal for PA1127. STM2406 exhibited higher affinities and turnover numbers with aromatic aldehydes and displayed measurable activities toward linear C4 to C6 aliphatic aldehydes but showed substrate saturation only with 3-HB (Table 3-1). In comparison to STM2406, PA1127 displayed higher catalytic turnover and higher affinity toward all the substrates tested, with the highest catalytic efficiency against 4-nitrobenzaldehyde.

Figure 3-4. Substrate profiles of STM2406 (A) and PA1127 (B) AKR activities against different aldehydes. The substrates were used at a concentration of 1 mM. Both AKRs showed no measurable activity toward 1 mM acetaldehyde. Results are the means and standard deviations (error bars) of data from at least two independent determinations.

Table 3-1. Kinetic parameters of STM2406 and PA1127 with various substrates. Values are the means and standard errors.

Protein  Variant  Substrate                     KM [mM]        kcat [s-1]     kcat/KM [mM-1 s-1]
STM2406  WT       2-pyridinecarboxaldehyde      1.19 ± 0.08    5.94 ± 0.16    4.99 ± 0.14
                  3-hydroxybutanal              22.5 ± 5.00    0.77 ± 0.08    0.03 ± 0.004
                  3-pyridinecarboxaldehyde      1.63 ± 0.09    6.18 ± 0.15    3.79 ± 0.09
                  4-nitrobenzaldehyde           0.41 ± 0.05    3.77 ± 0.23    9.20 ± 0.56
                  4-pyridinecarboxaldehyde      0.37 ± 0.01    6.31 ± 0.09    17.04 ± 0.26
                  Glyoxal                       4.59 ± 0.16    0.57 ± 0.02    0.12 ± 0.00
                  Methylglyoxal                 3.62 ± 0.44    5.81 ± 0.33    1.60 ± 0.09
                  Phenylglyoxal                 1.13 ± 0.06    4.75 ± 0.10    4.20 ± 0.09
                  NADPH (with 4PC[b])           0.17 ± 0.01    3.58 ± 0.07    21.06 ± 0.83
STM2406  N65I     3-hydroxybutanal              20.20 ± 3.43   2.19 ± 0.16    0.11 ± 0.01
                  3-pyridinecarboxaldehyde      3.82 ± 0.16    7.16 ± 0.14    1.87 ± 0.04
                  4-nitrobenzaldehyde           0.36 ± 0.004   9.45 ± 0.06    26.25 ± 0.13
                  Methylglyoxal                 6.39 ± 0.66    6.20 ± 0.30    0.97 ± 0.05
STM2406  N65M     3-hydroxybutanal              13.90 ± 1.89   2.71 ± 0.17    0.19 ± 0.01
                  3-pyridinecarboxaldehyde      3.29 ± 0.33    21.12 ± 0.88   6.42 ± 0.38
                  4-nitrobenzaldehyde           0.51 ± 0.04    16.96 ± 0.70   33.25 ± 1.24
                  Methylglyoxal                 6.95 ± 0.88    11.26 ± 0.67   1.62 ± 0.11
STM2406  N65F     3-hydroxybutanal              21.96 ± 3.75   1.10 ± 0.09    0.05 ± 0.004
                  4-nitrobenzaldehyde           0.37 ± 0.01    17.38 ± 0.24   46.97 ± 0.62
PA1127   WT       2-pyridinecarboxaldehyde      0.69 ± 0.03    5.89 ± 0.09    8.54 ± 0.24
                  3-hydroxybutanal              3.22 ± 0.22    7.70 ± 0.22    2.39 ± 0.10
                  3-pyridinecarboxaldehyde[a]   2.16 ± 0.22    17.32 ± 1.00   8.02 ± 0.35
                  4-nitrobenzaldehyde           0.21 ± 0.01    9.14 ± 0.12    43.52 ± 0.67
                  4-pyridinecarboxaldehyde[a]   0.96 ± 0.08    15.20 ± 0.54   15.83 ± 0.76
                  Butyraldehyde                 1.90 ± 0.14    12.07 ± 0.37   6.35 ± 0.27
                  Hexanal                       2.72 ± 0.12    11.60 ± 0.27   4.26 ± 0.09
                  Methylglyoxal                 0.76 ± 0.04    13.58 ± 0.31   17.87 ± 0.53
                  Pentanal[a]                   3.77 ± 1.02    29.82 ± 5.81   7.91 ± 0.60
                  Phenylglyoxal[a]              2.35 ± 0.66    11.83 ± 2.61   5.03 ± 0.30
                  NADPH (with 4PC[b])           0.15 ± 0.002   8.32 ± 0.07    55.47 ± 0.27
PA1127   V58A     3-hydroxybutanal              5.27 ± 0.50    4.36 ± 0.18    0.83 ± 0.04
                  Hexanal                       3.93 ± 0.16    12.59 ± 0.29   3.20 ± 0.06
                  Methylglyoxal                 1.10 ± 0.09    13.68 ± 0.41   12.44 ± 0.64
PA1127   L87A     3-hydroxybutanal              16.57 ± 1.31   12.26 ± 0.59   0.74 ± 0.02
                  Hexanal                       5.36 ± 0.18    17.83 ± 0.45   3.33 ± 0.03
                  Methylglyoxal                 4.08 ± 0.28    20.46 ± 0.63   5.01 ± 0.14

[a] Substrate inhibition exhibited. Ki = 24.72 mM for 3-pyridinecarboxaldehyde, Ki = 28.09 mM for 4-pyridinecarboxaldehyde, Ki = 5.04 mM for pentanal, and Ki = 2.28 mM for phenylglyoxal.
[b] 4PC: 4-pyridinecarboxaldehyde
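To make the quantities in Table 3-1 concrete, the sketch below computes a catalytic efficiency from kcat and KM and evaluates a rate law with substrate inhibition. The function names are hypothetical, and the inhibition equation is the commonly used form v = kcat·[S] / (KM + [S]·(1 + [S]/Ki)); it is assumed here for illustration, because the exact model fitted for the substrates marked [a] is not specified in the text.

```python
import math


def catalytic_efficiency(kcat, km):
    """Return kcat/KM (mM^-1 s^-1) from kcat (s^-1) and KM (mM)."""
    return kcat / km


def rate_with_substrate_inhibition(s, kcat, km, ki):
    """Return the turnover rate (s^-1) at substrate concentration s (mM).

    Assumed model: v = kcat * s / (km + s * (1 + s / ki)).
    With ki = math.inf this reduces to the Michaelis-Menten equation.
    """
    return kcat * s / (km + s * (1.0 + s / ki))


def optimal_substrate(km, ki):
    """Return the substrate concentration (mM) giving the maximal rate
    under the assumed inhibition model, i.e. sqrt(KM * Ki)."""
    return math.sqrt(km * ki)


if __name__ == "__main__":
    # PA1127 wildtype with 3-hydroxybutanal (mean values from Table 3-1)
    print(catalytic_efficiency(7.70, 3.22))
    # PA1127 wildtype with pentanal, using the Ki from footnote [a]
    s_opt = optimal_substrate(3.77, 5.04)
    print(s_opt, rate_with_substrate_inhibition(s_opt, 29.82, 3.77, 5.04))
```

Under this model the rate passes through a maximum and then declines at higher substrate concentrations, which is why a Ki is reported alongside KM for these substrates.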

3.4.3. Crystal structure of STM2406

Of the purified AKRs active against 3-HB (Fig. 3-2), only STM2406 produced diffracting crystals in crystallization trials, and its crystal structure was determined to 1.55-Å resolution using the single-wavelength anomalous diffraction (SAD) method (see Appendix A Table A2). For PA1127, a high-quality structural model was generated using the Phyre2 web portal (100% confidence, 98% sequence coverage), which is based on the structure of B. subtilis YhdN (PDB code 1PZ1, 60% sequence identity to PA1127). The crystal structure of STM2406 and the structural model of PA1127 revealed a classical (α/β)8-fold (TIM barrel) (Fig. 3-5), which is typical for AKRs and is one of the most common protein folds (Brändén, 1991). For both proteins, the core domain contains eight α-helices and eight β-strands with two additional helices (H1 and H2) lying outside the barrel structure (Fig. 3-5). Other structural characteristics of STM2406 and PA1127 include the long N-terminal loop (13 amino acids) with two β-strands covering the N-terminal end of the barrel and three large loops at the C-terminal end of the barrel.

The crystal structure of STM2406 revealed the presence of two protomers per asymmetric unit (see Appendix A Fig. A3A). On the basis of the STM2406 structure, the interface between the two monomers involves 13 hydrogen bonds and three salt bridges and buries ~1,750 Å² of the solvent-accessible surface area of each monomer. However, our gel filtration analysis of purified STM2406 indicated that this protein exists as a hexamer in solution (238.4 ± 9.5 kDa; predicted mass of monomer, 39.8 kDa). Analysis of the crystal contacts of STM2406 using the quaternary structure prediction server PISA predicted an octameric structure of this protein (see Appendix A Fig. A3B). The difference between the STM2406 gel filtration results and the PISA prediction may be due to the tight monomer packing in the oligomeric protein. Experimentally characterized AKRs are mainly monomeric proteins, but there are also examples of dimeric, trimeric, and octameric enzymes (Lapthorn, Zhu and Ellis, 2013; Penning, 2015). A Dali search for structural homologues of STM2406 identified several AKR structures as the best matches, including the E. coli methylglyoxal reductase YghZ (AKR14A1; PDB codes 3N6Q, 4AST, and 4AUB; Z score, 48.8 to 49.4; root mean square deviation [RMSD], 0.8 to 0.9 Å; 61% sequence identity), the E. coli YgdS (PDB code 1LQA; Z score, 36.5; RMSD, 2.1 Å; 31% sequence identity), and the Polaromonas sp. JS666 Bpro_4249 (PDB code 4XK2; Z score, 35.7; RMSD, 1.9 Å; 34% sequence identity). High Z scores with moderate to low sequence similarities (36% or lower sequence identity) between STM2406 and other AKR proteins, including PA1127 (see Appendix A Fig. A3C), suggest a strong evolutionary conservation of this structural fold.
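
The oligomeric state quoted above for STM2406 in solution follows from dividing the native mass measured by gel filtration by the predicted monomer mass. A minimal sketch of that arithmetic is given below, using only the values reported in the text. The function name is illustrative, and propagating the reported ± value by simple division is an assumption made for this example.

```python
def subunit_estimate(native_kda, native_sd_kda, monomer_kda):
    """Return (estimated subunit count, spread) from gel filtration masses."""
    return native_kda / monomer_kda, native_sd_kda / monomer_kda


# Values reported for STM2406: 238.4 ± 9.5 kDa native, 39.8 kDa monomer
n_subunits, n_spread = subunit_estimate(238.4, 9.5, 39.8)
nearest_oligomer = round(n_subunits)
print(f"{n_subunits:.2f} ± {n_spread:.2f} subunits, nearest integer {nearest_oligomer}")
```

Gel filtration reports a hydrodynamic size rather than a true mass, so an estimate of this kind is only indicative for compact, roughly globular assemblies.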

Figure 3-5. Overall folds of STM2406 and PA1127. (A) Cartoon representation of the crystal structure of the STM2406 protomer. (B) Cartoon representation of the PA1127 structural model generated using the Phyre2 web portal. The α helices and β strand structures that compose the TIM barrel are indicated and labeled.

3.4.4. Active sites and catalytic mechanisms of STM2406 and PA1127

The crystal structure of STM2406 also revealed the potential active site located in the central cavity of the TIM barrel near its C-terminal side (Fig. 3-5A). The STM2406 active site showed the presence of an additional area of electron density, which was interpreted as a cacodylate ion (dimethylarsinate [C2H6AsO2]) (Fig. 3-6A). Cacodylate (0.1 M) was present in the crystallization buffer, and it likely mimics an aldehyde substrate bound in the STM2406 active site. One of the cacodylate oxygen atoms is positioned close to the side chain of the catalytic Tyr66 (2.7 Å), whereas the other oxygen atom is near the Asn169 side chain (3.1 Å). The other residues of the STM2406 catalytic tetrad are also in close proximity to the catalytic Tyr66: Asp61 (3.9 Å), Lys97 (3.0 Å), and His138 (4.1 Å). The catalytic tetrad of PA1127 is composed of Asp54, Tyr59, Lys84, and His125 (Fig. 3-6B).

We also determined the crystal structure of STM2406 in complex with NADPH at 1.85-Å resolution (Fig. 3-6C). In this structure, the cofactor nicotinamide ring forms stacking interactions with the side chain of the conserved Phe221 (3.4 Å) and is positioned close to the Trp33 side chain (4.2 Å). The carbonyl oxygen of the nicotinamide ring also interacts with the side chain of the conserved Asn169 (2.9 Å) (Fig. 3-6C). On the other end of NADPH, the ADP moiety is positioned close to the side chains of Ser222 (2.9 Å), Arg232 (2.8 Å), Gln303 (2.8 Å), and Asp306 (3.3 Å). Similar modes of NADPH binding were also observed in the structures of the E. coli YghZ (PDB code 4AUB) and Bacillus subtilis YhdN (1PZ1) and YvgN (3D3F).

AKRs have been reported to catalyze a sequential ordered bi-bi reaction, in which the NADPH cofactor binds first and leaves last (Penning, 2015). This suggests that in the presence of the bound cofactor in the STM2406 active site, the substrate molecule will be coordinated by the side chains of Phe221 (3.5 Å to cacodylate), Trp33 (3.3 Å), Asn65 (5.8 Å), Tyr100 (4.8 Å), and Asn169 (3.1 Å) (Fig. 3-6A). The PA1127 structural model suggests that the residues Tyr203, Trp23, Val58, Leu87, and Asn156 are likely to be involved in substrate binding (Fig. 3-6B).

The proposed catalytic mechanism of STM2406 and PA1127 involves a hydride transfer from NADPH to the acceptor carbonyl followed by carbonyl protonation by the catalytic Tyr66 (Tyr59 for PA1127) using a proton relay from His138 (His125), whereas Asp61 (Asp54) and Lys97 (Lys84) facilitate proton donation by decreasing the pKa of the catalytic Tyr66 (Tyr59) (Jez et al., 1997; Penning, 2015). The proposed catalytic model is supported by the results of alanine replacement mutagenesis of the predicted catalytic residues of STM2406 and PA1127. As shown in Fig. 3-7, alanine replacement of these residues produced catalytically inactive mutant proteins when tested with 4-nitrobenzaldehyde and 3-HB as the substrates.

Figure 3-6. Active site of STM2406. (A) Closeup view of the STM2406 active site with bound cacodylate (CA). The 2mFo-DFc map contoured at 1.0 σ is displayed (shown as a blue mesh) around the cacodylate molecule. The catalytic tetrad (Asp61, Tyr66, Lys97, and His138) and residues involved in substrate binding are shown as sticks with green carbons. (B) Structural model of the PA1127 active site showing the catalytic tetrad (Asp54, Tyr59, Lys84, and His125) and substrate-binding residues. (C) The STM2406 active site with the bound NADPH molecule (shown as sticks with carbon atoms colored in magenta). The catalytic residues and residues involved in cofactor binding are shown as sticks with green carbons and are labeled.

3.4.5. Mutational analysis of substrate selectivity of STM2406 and PA1127

It has been reported that the substrate specificity of AKRs is modulated by the residues located at the edges of active sites on the loops connecting the β-barrel strands with the surrounding α-helices (Jez et al., 1997; Penning, 2015). To provide insight into the role of the active-site residues of STM2406 and PA1127 in the activity against different aldehyde substrates, we performed alanine replacement mutagenesis for both proteins based on the STM2406 crystal structure (Fig. 3-6A), the PA1127 structural model (Fig. 3-6B), and their sequence alignments with Bacillus AKRs from subfamilies 5 and 11 (see Appendix A Fig. A6). In STM2406, the mutated residues include Trp33, Asn65, Tyr100, Asn169, Gln193, and Phe221, whereas Trp23, Val58, Leu87, Asn156, Gln175, and Tyr203 were mutated in PA1127. The mutant proteins were purified (except for the PA1127 H125A and Y203A, which were found to be insoluble), and their AKR activities were analyzed using 4-nitrobenzaldehyde and 3-HB as substrates (Fig. 3-7A and B). Alanine replacement of the STM2406 Gln193 (Gln175 in PA1127) dramatically decreased enzymatic activity against both substrates, suggesting a role for these residues in NADPH binding, along with Phe221 in STM2406 and Tyr203 in PA1127 (the latter is assumed to be critical for the folding of PA1127).

In the STM2406 structure, the side chains of Trp33 and Tyr100 form the sidewalls of the substrate-binding site, with the catalytic Tyr66 located at the bottom of this cavity (Fig. 3-6A). Accordingly, alanine replacement mutagenesis of both residues, as well as Trp23 in PA1127, greatly reduced the enzymatic activities of both proteins against 4-nitrobenzaldehyde and 3-HB. Based on the sequence alignment, the STM2406 Tyr100 corresponds to the PA1127 Leu87, whose replacement by Ala produced a mutant protein (L87A) with significant reductase activity against both substrates (Fig. 3-7B). Therefore, Leu87 appears to play no significant role in substrate coordination in PA1127. The STM2406 N65A and N169A mutant proteins retained significant activities against 4-nitrobenzaldehyde but showed greatly reduced activities toward 3-HB. These results suggest that these residues are not essential for activity against aromatic aldehydes (because they are coordinated mainly through stacking interactions with the aromatic side chains of Trp33 and Tyr100), but they are important for the binding of 3-HB.

Figure 3-7. AKR activities after alanine replacement mutagenesis of STM2406 (A) and PA1127 (B). The PA1127 H125A and Y203A were expressed in insoluble forms. Catalytic activity was assayed using saturating concentrations of 4-nitrobenzaldehyde (2 mM; white bars) and 3-HB (64 mM for STM2406 and 16 mM for PA1127; gray bars) as the substrates.

3.4.6. Mutational analysis of the STM2406 substrate-binding pocket

As shown in Fig. 3-6A, the sidewalls of the STM2406 substrate-binding pocket are formed by the side chains of five residues: Trp33, Asn65, Tyr100, Asn169, and Phe221. Based on the AKR sequence alignment (see Appendix A Fig. A4), the STM2406 residues Asn65 and Tyr100 are replaced by other amino acid residues in the sequences of PA1127 and other AKRs with higher activities against 3-HB. This suggests that the replacement of Asn65 and Tyr100 by the residues present in more active AKRs can potentially increase the activity of STM2406 against 3-HB.

Using site-directed mutagenesis, we mutated the STM2406 Tyr100 (to Asn, Asp, Gln, Glu, His, or Leu) and Asn65 (to Val, Leu, Ile, Met, or Phe). Four mutant proteins (Y100D, Y100L, N65V, and N65L) were found to be insoluble, whereas other mutant proteins were purified and compared with the wildtype STM2406 using two nonaromatic (3-HB and methylglyoxal) and two aromatic (3-pyridinecarboxaldehyde and 4-nitrobenzaldehyde) substrates (Fig. 3-8). Enzyme assays revealed lower activities in the Y100D and Y100L proteins against all of the substrates tested (data not shown). In contrast, the N65M, N65F, and N65I mutant proteins showed 3 to 5 times higher activities toward aromatic aldehyde substrates (Fig. 3-8). With methylglyoxal as the substrate, N65M was also more active than the wildtype STM2406, but N65I and N65F exhibited wildtype activities. With 3-HB, the activities of N65I and N65M proteins were 2.5 to 3.5 times higher than the wildtype STM2406, while the N65F protein showed wildtype activity.

Purified N65 mutant proteins exhibited saturation kinetics with the four substrates tested except for N65F, which showed no saturation with methylglyoxal and 3-pyridinecarboxaldehyde as the substrates (Fig. 3-8, Table 3-1). An analysis of kinetic parameters of the STM2406 mutant proteins revealed that N65M had a higher kcat and lower Km toward 3-HB than the wildtype STM2406, whereas N65I and N65F had the wildtype Km. With the other three substrates, the mutant proteins typically showed higher Km and kcat values than the wildtype protein. Thus, replacing the STM2406 Asn65 with a hydrophobic amino acid residue increased the activities of this enzyme toward aromatic and nonaromatic aldehydes, with the N65M protein exhibiting increased activities toward all four of the substrates tested. With 3-HB, the wildtype STM2406 had a kcat/Km almost 80 times lower than PA1127, whereas the STM2406 N65M protein showed a kcat/Km almost 7 times higher than the wildtype STM2406.
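
The fold differences in catalytic efficiency toward 3-HB quoted above can be recomputed from the kcat/Km values tabulated in Table 3-1. The sketch below does so. The dictionary keys and the function name are illustrative, and the tabulated efficiencies are rounded to two decimals, so ratios involving the small wildtype STM2406 value are only approximate.

```python
# kcat/Km toward 3-hydroxybutanal (mM^-1 s^-1), as tabulated in Table 3-1
KCAT_OVER_KM_3HB = {
    "STM2406 WT": 0.03,
    "STM2406 N65M": 0.19,
    "PA1127 WT": 2.39,
}


def fold_difference(enzyme, reference):
    """Ratio of catalytic efficiencies, enzyme over reference."""
    return KCAT_OVER_KM_3HB[enzyme] / KCAT_OVER_KM_3HB[reference]


pa1127_vs_wt = fold_difference("PA1127 WT", "STM2406 WT")
n65m_vs_wt = fold_difference("STM2406 N65M", "STM2406 WT")
print(f"PA1127 / STM2406 WT: {pa1127_vs_wt:.1f}-fold")
print(f"N65M / STM2406 WT: {n65m_vs_wt:.1f}-fold")
```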

Figure 3-8. AKR activity of purified N65 mutant proteins after site-directed mutagenesis of the STM2406 active site. Specific activities of the wild-type (WT) and N65 mutant proteins against 3-HB (A), methylglyoxal (B), 3-pyridinecarboxaldehyde (C), and 4-nitrobenzaldehyde (D) as the substrates. The concentrations of each of the substrates used are the saturating concentrations for the WT enzyme. Saturation curves of the purified WT and N65 mutant proteins with 3-HB (E), methylglyoxal (F), 3-pyridinecarboxaldehyde (G), and 4-nitrobenzaldehyde (H) as the substrates.

3.4.7. C-terminal loop of AKR structure

Sequence alignments of PA1127 and STM2406 revealed the presence of the C-terminal loop in PA1127 and other Bacillus AKRs with higher activities against 3-HB, while it was absent in STM2406 (see Appendix A Fig. A5). A comparison of the STM2406 structure with the published structures of the B. subtilis YvgN, YhdN, and YtbE, which we found to be more active than STM2406, showed that in the B. subtilis AKRs, the C-terminal loops extend over the active site, whereas the STM2406 active site is wide open (Fig. A7). Previous studies with the human AKR1B1 demonstrated that the deletion of its C-terminal loop resulted in an ~300-fold decrease in catalytic efficiency but had no effect on NADPH binding (Bohren, Grimshaw and Gabbay, 1992). In addition, the B. subtilis AKR11A (IolS) lacks a C-terminal loop and exhibits low catalytic activity against common AKR substrates, such as benzaldehyde and butyraldehyde (Ehrensberger and Wilson, 2004). These results indicate that the C-terminal loop, if present, might contribute to the substrate binding and activities of AKRs. Therefore, we hypothesized that the absence of the C-terminal loop in STM2406 may account for the low AKR activity of this enzyme compared with those of PA1127 and other AKRs. To test this hypothesis, we prepared the mutant proteins of PA1127, YhdN, and YvgN with the deleted C-terminal loops and analyzed the AKR activities of the purified mutant proteins using 3-HB and 4-nitrobenzaldehyde as the substrates (see Appendix A Fig. A6). The deletion of the C-terminal loop in PA1127, YhdN, and YvgN produced mutant proteins with negligible AKR activities toward 3-HB and 4-nitrobenzaldehyde, confirming the functional importance of the C-terminal loop for high AKR activities toward 3-HB and 4-nitrobenzaldehyde (Fig. A8).

3.5. Conclusion

The remarkable substrate promiscuity of AKRs makes them attractive biocatalysts for the enzyme-based biotransformation of non-natural substrates, including industrially important aldehydes and ketones (Huisman, Liang and Krebber, 2010; Calam et al., 2013; Penning, 2015). Despite the high sequence and structural conservation of AKRs, these enzymes display different substrate preferences, which had been proposed to depend on the residues located at the edges of the active sites and on the variable C-terminal loops (Barski, Tipparaju and Bhatnagar, 2008; Richter et al., 2010). Although structural information has revealed significant differences among the substrate-binding pockets of AKRs, the relationships between structure and substrate specificity have not been fully determined. Since 3-HB represents a non-natural substrate for AKRs, the identification of promiscuous activity toward this chemical represents a starting point for developing new biocatalysts. In our work, by screening purified recombinant AKRs for reductase activity against 3-HB, we identified two groups of these enzymes, with high (PA1127) and low (STM2406) activities toward this substrate. Comparative studies of STM2406 and PA1127 using crystal structures and structure-based mutagenesis revealed that hydrophobic residues (Met) near the catalytic Tyr in the substrate-binding pocket and the C-terminal loop represent the two structural determinants necessary for the high reductase activity of AKRs against 3-HB. For engineering novel AKRs with higher activities against 3-HB, further biochemical and structural studies are required to determine the optimal amino acid compositions of the substrate-binding site and the C-terminal loop essential for high activity toward this substrate.

Chapter 4. Structural Studies and Engineering of 2-Deoxyribose-5-phosphate Aldolase BH1352 for the Biosynthesis of 1,3-Butanediol

This chapter is reproduced from the research article in preparation for Chemical Science. In the publication, 4.4.6. and 4.4.7. will not be included.

4.1. Abstract

Carbon-carbon bond formation is one of the most important reactions in biocatalysis and organic chemistry. In nature, aldolase enzymes catalyze the reversible stereo-selective aldol addition between two carbonyl compounds, making them attractive catalysts for the synthesis of various chemicals. In this work, we identified novel 2-deoxyribose-5-phosphate (DRP) aldolases (DERA) with aldol condensation activity against acetaldehyde, which can be used for the biosynthesis of 1,3-butanediol (1,3BDO) in combination with an aldo-keto reductase (AKR). Enzymatic screening of 20 purified DERAs revealed the presence of significant acetaldehyde condensation activity in 12 enzymes with the highest activity in BH1352 from Bacillus halodurans, TM1559 from Thermotoga maritima, and DeoC from Escherichia coli. The crystal structures of BH1352 and TM1559 are the first full-length DERA structures showing the presence of the C-terminal Tyr (Tyr224 in BH1352). Structure-based site-directed mutagenesis of BH1352 demonstrated a key role of the catalytic Lys155 and other active site residues in the DRP cleavage and acetaldehyde condensation reactions. The C-terminal Tyr224 of BH1352 was found to be essential for DRP cleavage, but not critical for acetaldehyde condensation. These experiments also revealed a 2.5-fold increase in acetaldehyde transformation to 1,3BDO (in combination with AKR) in the BH1352 F160Y and F160Y/M173I mutant proteins, whereas their DRP cleavage activity remained at the wild type level. The replacement of the wild type BH1352 by the F160Y and F160Y/M173I variants in E. coli cells expressing the DERA+AKR pathway increased the production of 1,3BDO from glucose five and six times, respectively. Thus, our work provided novel insights into the molecular mechanisms of substrate selectivity and activity of DERA aldolases and identified two DERA variants with enhanced activity for the in vitro and in vivo biosynthesis of 1,3BDO.

4.2. Introduction

The formation of carbon-carbon bonds via aldol condensation of two carbonyl compounds is indispensable in biological systems and organic chemistry (Mukaiyama, 1982; Mahrwald, 1999, 2004). Aldol condensation reactions generate a new β-hydroxy carbonyl compound, which is a valuable precursor in the construction of complex organic molecules due to the formation of up to two new stereogenic centers (Windle et al., 2014). Using aldehydes as donor substrates in aldol reactions is particularly of interest since this provides the opportunity for sequential aldol condensation reactions to synthesize more complex molecules (Orsini, Pelizzoni and Forte, 1989; Mukherjee et al., 2007). In biological systems, aldolase enzymes catalyze the reversible and stereoselective aldol addition of a nucleophilic donor onto an electrophilic aldehyde acceptor (Ma et al., 2016). The formation of a new C-C bond is accompanied by the generation of a new stereocenter making aldolases attractive tools in the synthesis of chiral compounds and bioactive molecules. Therefore, aldolases have emerged as a promising alternative in the biocatalytic synthesis of rare sugars and sugar derivatives, such as statins, iminocyclitols, epothilones, and sialic acids (Machajewski and Wong, 2000; Clapés et al., 2010; Haridas, Abdelraheem and Hanefeld, 2018).

2-Deoxyribose-5-phosphate aldolases (DERA, E.C. 4.1.2.4) are found in all kingdoms of life and represent the major aldolase group. One of the best characterized DERA aldolases is the Escherichia coli DeoC, which belongs to the class I (metal-independent) aldolases (Machajewski and Wong, 2000; Haridas, Abdelraheem and Hanefeld, 2018). The E. coli deoC is part of the deo operon (deoABCD) involved in the utilization of extracellular deoxyribonucleotides as energy sources (Lomax and Greenberg, 1968). It transforms the D-2-deoxyribose-5-phosphate (DRP) intermediate into D-glyceraldehyde-3-phosphate and acetaldehyde, which enter glycolysis and the Krebs cycle, respectively (Racker, 1952). The DERA reaction is reversible, as it also catalyzes the aldol condensation between acetaldehyde (the donor molecule) and D-glyceraldehyde-3-phosphate (the acceptor molecule) producing D-2-deoxyribose-5-phosphate (DRP) (Scheme 4-1) (Valentin-Hansen et al., 1982). This class of aldolases is unique in that it can catalyze the aldol condensation of two aldehydes and does not require a ketone substrate, whereas other aldolases use ketones as aldol donors and aldehydes as acceptors (Barbas, Wang and Wong, 1990; Haridas, Abdelraheem and Hanefeld, 2018). It activates the donor molecule (acetaldehyde) via the catalytic Lys residue forming a covalent Schiff base intermediate (enamine) followed by the carboligation between the acceptor (D-glyceraldehyde-3-phosphate or second acetaldehyde) and the Schiff base (Pricer and Horecker, 1960; Barbas, Wang and Wong, 1990). The crystal structure of the E. coli DERA (DeoC) adopts the ubiquitous TIM (Triosephosphate Isomerase) barrel (α/β)8 fold with the catalytic Lys167 (the Schiff base-forming residue) located on strand β6 (Heine et al., 2001). A proton relay system composed of Asp102, Lys201, and a water molecule is involved in shuffling a proton between C2 of the acetaldehyde imine and enamine and subsequent C3 hydroxyl protonation. In addition, several biochemical works suggested that the C-terminal Tyr259 of the E. coli DeoC is crucial for enzyme activity (Hoffee et al., 1974; Heine et al., 2001; Schulte et al., 2018). However, all published crystal structures of DERA show the absence of electron density for the last eight C-terminal residues including Tyr (Heine et al., 2001, 2004; Dick, Hartmann, et al., 2016; Dick, Weiergräber, et al., 2016; T.-P. Cao et al., 2016). Recently, using a combination of NMR spectroscopy and molecular dynamics simulations it has been shown that the C-terminal Tyr259 of the E. coli DeoC enters the active site in catalytically relevant closed states and is required for efficiency of the proton abstraction step of the DERA catalytic reaction (Schulte et al., 2018).

Scheme 4-1.

The acetaldehyde-active DERA aldolases are also distinctive by their ability to ligate three aldehyde molecules in a sequential and stereo-selective manner, making them attractive biocatalysts for synthetic organic chemistry (Gijsen and Wong, 1994; Sakuraba et al., 2007). It has been shown that DERA catalyzes a sequential tandem aldol reaction of chloroacetaldehyde and two acetaldehyde molecules, forming (3R,5S)-6-chloro-2,4,6-trideoxyhexapyranoside (CTHP). This product can be further used as a lactone moiety for the synthesis of 3-hydroxy-3-methylglutaryl (HMG)-CoA reductase inhibitors, the cholesterol-lowering statin drugs (e.g., atorvastatin and rosuvastatin) (Liu and Wong, 2002; Greenberg et al., 2004; Müller, 2005; Jennewein et al., 2006; Global Business Intelligence, 2013; Jiao et al., 2015, 2016, 2017). Thus, DERA enzymes offer great potential by providing an effective and simplified route for their production.

Recently, we established a novel pathway to produce 1,3-butanediol (1,3BDO) from acetaldehyde using DERA as the key enzyme (Kim et al., 2017; Nemr et al., 2018). The non-natural diol 1,3BDO is used as a building block for the production of synthetic polymers, pheromones, fragrances, insecticides, and antibiotics (Matsuyama et al., 2001; Yamamoto, Matsuyama and Kobayashi, 2002; Ichikawa et al., 2005, 2006; Makshina et al., 2014; Sabra, Groeger and Zeng, 2016). At present, 1,3BDO is produced mainly from petroleum-based feedstocks using chemical processes, which require harsh reaction conditions and release toxic intermediates and by-products (Makshina et al., 2014). Therefore, the development of biocatalytic processes for the production of 1,3BDO from renewable feedstocks is of increasing importance (Jiang et al., 2014; Sabra, Groeger and Zeng, 2016). The recently proposed artificial biosynthetic approach for 1,3BDO production from glucose is based on a reversed fatty acid β-oxidation pathway, which includes four heterologous enzymes and requires three NADPH molecules and one CoA molecule per molecule of 1,3BDO produced (Kataoka et al., 2014; Gulevich et al., 2016). In contrast, the proposed DERA-based pathway for 1,3BDO production involves three heterologous enzymes: pyruvate decarboxylase (PDC, producing acetaldehyde from pyruvate), DERA (catalyzing aldol condensation of two acetaldehyde molecules to 3-hydroxybutanal), and aldo-keto reductase (AKR), which reduces 3-hydroxybutanal (3HB) to 1,3BDO (Scheme 4-2). The heterologous expression of this pathway in E. coli resulted in the production of 0.3 g of 1,3BDO/liter from glucose (11.2 mg/g of glucose) (Nemr et al., 2018). Using a systems metabolic engineering approach, the 1,3BDO titer was increased to 2.4 g/liter and the yield to 56 mg/g of glucose, further highlighting the potential of aldolases for synthesis of valuable products. This study also suggested that the rate-limiting step of the proposed 1,3BDO pathway is the DERA-catalyzed aldol condensation of acetaldehyde to 3HB (Nemr et al., 2018).
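
To put the reported yields in context, the carbon stoichiometry of the DERA-based pathway described above can be worked through. One glucose gives two pyruvate through glycolysis, hence two acetaldehyde, one 3HB, and at most one 1,3BDO. The sketch below computes the corresponding upper bound and expresses the reported yields as a fraction of it. The molar masses are standard values, the function name is illustrative, and the bound is an idealization that ignores biomass formation, cofactor supply, and by-products.

```python
GLUCOSE_G_PER_MOL = 180.16      # C6H12O6
BDO_G_PER_MOL = 90.12           # 1,3-butanediol, C4H10O2
BDO_PER_GLUCOSE_MOL = 1.0       # glucose -> 2 pyruvate -> 2 acetaldehyde -> 1 3HB -> 1 1,3BDO

# Idealized maximum mass yield in mg of 1,3BDO per g of glucose
MAX_YIELD_MG_PER_G = 1000.0 * BDO_PER_GLUCOSE_MOL * BDO_G_PER_MOL / GLUCOSE_G_PER_MOL


def percent_of_maximum(yield_mg_per_g):
    """Reported yield as a percentage of the idealized stoichiometric maximum."""
    return 100.0 * yield_mg_per_g / MAX_YIELD_MG_PER_G


initial_pct = percent_of_maximum(11.2)    # initial strain (Nemr et al., 2018)
improved_pct = percent_of_maximum(56.0)   # after systems metabolic engineering
print(f"maximum {MAX_YIELD_MG_PER_G:.0f} mg/g; reported {initial_pct:.1f}% and {improved_pct:.1f}%")
```

Under these assumptions the engineered strain reaches roughly a ninth of the stoichiometric ceiling, which is consistent with the emphasis on the DERA step as a remaining bottleneck.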

Scheme 4-2.

Although our recent studies revealed great potential of DERAs for biocatalytic conversion of acetaldehyde to 1,3BDO (Nemr et al., 2018), this activity (acetaldehyde condensation) has not been examined in depth. The scarcity of data on DERAs limits our efforts on increasing the acetaldehyde condensation activity of these enzymes, which represents the rate-limiting step in the biocatalytic synthesis of 1,3BDO and potentially statin drugs. In this work, after screening 20 purified microbial DERAs we identified BH1352 from the alkaliphilic bacterium Bacillus halodurans, as well as TM1559 from Thermotoga maritima and E. coli DeoC as the most active aldolases in the DERA-AKR coupled production of 1,3BDO from acetaldehyde. The crystal structures of these enzymes were determined including the first full-length DERA structure (BH1352) and revealed the catalytic residues and substrate-binding sites. Using structure-based site-directed mutagenesis, we identified the BH1352 residues critical for acetaldehyde condensation and designed several DERA variants with higher activity in the production of 1,3BDO both in vitro (from acetaldehyde) and in vivo (from glucose).

4.3. Materials and methods

Chemicals

All substrates and chemicals used in this study were purchased from Sigma Aldrich except for NADPH, which was obtained from Codexis.

Phylogenetic and sequence analyses

The phylogenetic tree was generated by retrieving 2,553 sequences from UniProt using KEGG Orthology (KO) K01619, which represents deoxyribose-phosphate aldolases (EC 4.1.2.4) involved in the pentose phosphate pathway. The original dataset was reduced to 1,974 sequences by removing redundant sequences with CD-HIT and increasing the number of gap-free sites with MaxAlign, based on an alignment produced with the MAFFT online server (https://mafft.cbrc.jp/alignment/server/) (Gouveia-Oliveira, Sackett and Pedersen, 2007; Fu et al., 2012; Yamada, Tomii and Katoh, 2016). The tree was built using FastTree 2.1.5 and visualized by iTOL (Interactive tree of life, http://itol.embl.de/) (Price, Dehal and Arkin, 2010; Letunic and Bork, 2016). The DERA sequence alignment and phylogenetic analysis were conducted as described in our previous study (Kim et al., 2017). Structural images of BH1352 were prepared using PyMOL Molecular Graphics System, version 1.8 (Schrödinger, LLC).

Gene cloning, protein purification, and mutagenesis

The 20 DERA genes studied in this work (see Table B1 in Appendix B) were amplified by PCR from corresponding genomes and cloned into the NdeI and BamHI sites of p15TV-L (modified from pET15b) vector in which the TEV protease cleavage site replaced the Thrombin cleavage site, and a double stop codon was introduced downstream from the BamHI site. This construct provides for an N-terminal hexa-His tag separated from the gene by a TEV protease recognition site (ENLYFQ/G). The fusion proteins were overexpressed in E. coli BL21-Gold (DE3) (Stratagene) harboring an extra plasmid encoding three rare tRNAs (AGG and AGA for Arg, ATA for Ile) (Zhang et al., 2001). The cells were grown in terrific broth medium at 37 °C to an OD600 of approximately 1.0 and protein expression was induced with 1.0 mM IPTG. After induction, the cells were incubated overnight with shaking at 16 °C. The harvested cells were resuspended in binding buffer (500 mM NaCl, 5% glycerol, 50 mM HEPES at pH 7.5, 5 mM imidazole), flash-frozen in liquid N2 and stored at -70 °C. The thawed cells were lysed by sonication and the lysate was clarified by centrifugation (30 min at 27,000 × g). The clear supernatant was applied to a metal chelate affinity column charged with Ni2+. The hexa-His tagged proteins were eluted from the column using an elution buffer (500 mM NaCl, 5% glycerol, 50 mM HEPES at pH 7.5, 500 mM imidazole), and the eluates were flash-frozen in liquid N2 and stored at -70 °C for further use (see Appendix B, Fig. B1). Site-directed mutagenesis of BH1352 was performed using the Phusion® High-Fidelity DNA Polymerase (New England BioLabs) according to the manufacturer’s protocol. The enzyme variants were prepared using the same method as described above.

Protein crystallization

Purified BH1352 was crystallized at room temperature by the sitting-drop vapor diffusion method at a protein concentration of 10 mg/mL with a reservoir solution of 0.1 M Tris-HCl (pH 8.5), 0.2 M magnesium chloride, 25% (w/v) PEG3350 and 10 mM acetaldehyde. The crystal was cryoprotected in the same buffer supplemented with 2% PEG 200 and flash-frozen in liquid nitrogen.

Data collection, structure determination, and refinement (conducted by Prof. Savchenko group)

Diffraction data for the BH1352 apoenzyme crystal were collected at 100 K on a Rigaku Micromax-007 home source with an R-AXIS IV++ detector. Diffraction data were processed using HKL3000 (Minor et al., 2006). The structure was solved by molecular replacement using Phenix phaser and the structure of a putative aldolase (PDB 3NGJ) (Adams et al., 2010). Model building and refinement were performed using Phenix.refine and Coot (Emsley and Cowtan, 2004). TLS parameterization was utilized and B-factors were refined as isotropic. Structure geometry and validation were performed using the Phenix Molprobity tools. Data collection and refinement statistics for this structure are summarized in Table 4-1.

Enzyme assays

Purified DERAs were initially screened using a DERA-AKR coupled assay with 50 mM acetaldehyde as substrate in the following reaction mixture (0.2 ml): 100 mM triethanolamine (TEA) buffer (pH 7.5), 10 mM NADPH, DERA (250 μg/ml), and AKR (PA1127, 250 μg/ml). The production of 1,3BDO was measured using HPLC following 2 h of incubation at room temperature. The reaction samples were filtered through a centrifugal filter device (10K cut-off, VWR) to remove enzymes and dried in a vacuum concentrator to remove residual acetaldehyde. The dry samples were dissolved in the same volume of ddH2O and analyzed using HPLC (Dionex Ultimate 3000, Thermo Scientific) equipped with an Aminex HPX-87H column, with 5 mM H2SO4 as the eluent at a flow rate of 0.6 ml/min and 50 °C. 1,3BDO was detected using a refractive index detector (Shodex RI-101). Assay conditions were optimized by varying the concentrations of DERA, AKR, and NADPH; the optimal conditions were 100 μg/ml each of DERA and AKR and 10 mM NADPH (see Appendix B, Fig. B2).
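As a quick sanity check on these assay conditions, the stoichiometry of the coupled reaction sets an upper bound on the 1,3BDO that can form. The sketch below assumes the idealized stoichiometry implied by the pathway (two acetaldehyde molecules condense to one 3-hydroxybutanal, which is then reduced at the expense of one NADPH); the function name is illustrative, and side reactions, evaporation, and enzyme inactivation are ignored.

```python
def max_bdo_mM(acetaldehyde_mM: float, nadph_mM: float) -> float:
    """Upper bound on 1,3BDO (mM) in the DERA-AKR coupled assay.

    Assumption: 2 acetaldehyde -> 1 3-hydroxybutanal, then
    3-hydroxybutanal + 1 NADPH -> 1 1,3BDO (ideal stoichiometry).
    """
    from_substrate = acetaldehyde_mM / 2.0  # two C2 units per C4 product
    from_cofactor = nadph_mM                # one NADPH per reduction
    return min(from_substrate, from_cofactor)


# Screening conditions from the text: 50 mM acetaldehyde, 10 mM NADPH
ceiling = max_bdo_mM(50.0, 10.0)
print(f"Theoretical ceiling: {ceiling:.1f} mM 1,3BDO")
```

Under these assumptions the cofactor, not the substrate, caps the amount of 1,3BDO that can be detected in the screen.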

The kinetic parameters of purified DERA were determined using the 2-deoxyribose-5-phosphate (DRP) cleavage reaction via a glycerol-3-phosphate dehydrogenase/triosephosphate isomerase (GDH/TPI)-coupled assay. The DERA-catalyzed retro-aldol reaction produces acetaldehyde and D-glyceraldehyde-3-phosphate, which is converted into dihydroxyacetone phosphate by TPI and further reduced by GDH with consumption of NADH. The detailed assay conditions were as follows: 100 mM TEA buffer (pH 8.5), 0.5 mM NADH, wild-type or mutant DERA (1 µg/ml), TPI (11 U/ml), GDH (1 U/ml), and DRP (from 4 µM to 4 mM) in a 200 µl reaction mixture at 30 °C. The kinetic parameters were calculated by nonlinear regression analysis of raw data fit to the sigmoidal function using GraphPad Prism software (version 5.04 for Windows). The effects of site-directed mutagenesis on the activities of BH1352 were determined in the same way as the kinetic parameters of the wild-type protein.
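To illustrate how the DRP range of 4 µM to 4 mM brackets the kinetic constants, the sketch below evaluates the Michaelis-Menten rate law at several concentrations in that range. It uses the wild-type values reported in Table 4-1 (KM = 0.22 mM, Vmax = 34.12 U/mg); the function name is illustrative, and this is not the regression procedure that was run in GraphPad Prism.

```python
def mm_rate(s_mM: float, vmax: float, km_mM: float) -> float:
    """Michaelis-Menten rate v = Vmax * [S] / (KM + [S])."""
    return vmax * s_mM / (km_mM + s_mM)


VMAX = 34.12  # U/mg, wild-type BH1352 (Table 4-1)
KM = 0.22     # mM, wild-type BH1352 (Table 4-1)

for s in (0.004, 0.22, 1.0, 4.0):  # mM DRP
    v = mm_rate(s, VMAX, KM)
    print(f"[DRP] = {s:5.3f} mM -> v = {v:5.2f} U/mg "
          f"({100 * v / VMAX:4.1f}% of Vmax)")
```

The lowest concentration falls in the near-linear regime and the highest approaches saturation, so the range samples both sides of KM.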

For the analysis of DERA resistance against acetaldehyde, a freshly prepared acetaldehyde solution (final concentration 100 mM) was added to the incubation mixture containing 2 mg/ml of purified BH1352 (wild-type or mutant proteins). Aliquots of the incubation solution were taken and diluted for use in a DRP cleavage assay (1 mM DRP). The activity of DERA samples was analyzed immediately after acetaldehyde addition and then at regular time intervals. The residual DERA activity was calculated by comparison with control samples without acetaldehyde (containing only enzymes and buffer).
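The residual-activity calculation described above can be sketched as follows. The activity values are hypothetical placeholders, not measured data, and the function name is illustrative.

```python
def residual_activity_percent(treated: float, control: float) -> float:
    """Residual activity (%) relative to the acetaldehyde-free control."""
    if control <= 0:
        raise ValueError("control activity must be positive")
    return 100.0 * treated / control


# Hypothetical DRP-cleavage rates (U/mg) at three time points; not measured data
time_min = [0, 30, 60]
with_acetaldehyde = [20.0, 12.0, 6.0]
control = [20.0, 19.5, 19.0]

for t, a, c in zip(time_min, with_acetaldehyde, control):
    print(f"{t:3d} min: {residual_activity_percent(a, c):5.1f}% residual activity")
```

Normalizing each time point to its own control separates inactivation by acetaldehyde from any activity loss that occurs in buffer alone.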

The sequential double condensation of chloroacetaldehyde and acetaldehyde was analyzed in a 500 μl reaction mixture containing 200 mM acetaldehyde, 100 mM chloroacetaldehyde, and purified DERA (2 mg/ml) using the same buffer as used for 1,3BDO synthesis. The reaction mixture was incubated at room temperature for 120 min, filtered to remove the enzymes using the 10K cut-off centrifuge filters and analyzed using HPLC to detect 1,3BDO and other intermediate chemicals. The double condensation product was extracted with ethyl acetate, dried over anhydrous Na2SO4, and analyzed using NMR and LC-MS.

NMR analysis: (3R,5S)-6-chloro-2,4,6-trideoxyhexapyranoside

The reaction was carried out using purified BH1352 (wildtype or mutant proteins) in 100 mM TEA buffer (pH 8.5) with 100 mM acetaldehyde and 50 mM chloroacetaldehyde as substrates, in a total volume of 6 mL at 25 ℃. The product was isolated by sampling from HPLC, followed by neutralization of the sample with 10 mM NaOH. The neutralized samples were freeze-dried, and reaction products were analyzed using NMR. The NMR analysis conditions were as follows: 1H NMR (500 MHz, Deuterium Oxide) δ 8.33 (s, 0H), 7.95 (s, 0H), 7.01 (dt, J = 14.7, 7.2 Hz, 0H), 5.52 (t, J = 1.7 Hz, 0H), 5.34 – 5.31 (m, 0H), 5.17 (t, J = 3.1 Hz, 0H), 5.03 (dd, J =10.1, 2.3 Hz, 1H), 4.36 (ddt, J = 10.1, 6.8, 3.5 Hz, 0H), 4.24 (p, J = 3.1 Hz, 1H), 4.13 (p, J = 3.9 Hz, 0H), 4.04 (ddd, J = 10.0, 4.9, 2.5 Hz, 1H), 3.84 – 3.81 (m, 0H), 3.61 (dt, J = 11.3, 3.3 Hz, 2H), 3.57 – 3.52 (m, 0H), 3.50 (dd, J = 11.9, 6.3 Hz, 1H), 3.44 (d, J = 6.5 Hz, 0H), 3.42 (d, J = 6.5 Hz, 0H), 3.39 (d, J = 5.3 Hz, 0H), 2.66 (t, J = 6.1 Hz, 0H), 2.07 (ddt, J = 13.3, 6.2, 2.1 Hz, 0H), 1.99 (ddd, J = 13.2, 6.3, 2.2 Hz, 0H), 1.86 – 1.79 (m, 1H), 1.73 – 1.70 (m, 0H), 1.69 – 1.65 (m, 0H), 1.65 – 1.62 (m, 0H), 1.60 – 1.57 (m, 0H), 1.57 – 1.52 (m, 2H), 1.46 (ddd, J = 13.4, 10.1, 3.0 Hz, 1H), 1.32 (q, J = 11.8 Hz, 0H), 1.23 – 1.13 (m, 0H), 1.11 (d, J = 6.2 Hz, 0H), 1.07 (dd, J = 6.3, 3.6 Hz, 0H).

Strains and plasmids (addressed in Chapter 5)

The strains and plasmids used in this study were taken from a previous study by our colleagues (Nemr et al., 2018) and are listed in Table B4. Expression of pBD1 (pTrc99A harboring BH1352, PA1127, and PDC from Z. mobilis) in LMSE51C was used as the wild-type control (BDO-0) to demonstrate the in vivo effect of the mutations in BH1352.

Cultivation in mini-bioreactors (addressed in Chapter 5)

Strains were characterized in 500 mL bioreactors (Applikon Biotechnology Inc.) with three Rushton impellers (two impellers with OD28 and one with OD22), equipped with electrodes to measure pH and dissolved oxygen. The first seed culture was prepared by inoculating 10 mL of LB supplemented with 100 μg/mL of ampicillin from a single colony and was grown at 37 °C. For the second seed culture, 50 mL of modified M9 medium (supplemented with 100 μg/mL of ampicillin and 0.5 mg/mL of thiamine and containing 0.1 M MOPS at pH 7.3) in a 250 mL baffled flask was inoculated with the first seed culture and grown at 37 °C and 200 r.p.m. for 16 h. The second seed culture was then used to inoculate 300 mL of modified M9 medium (without MOPS) supplemented with 100 μg/mL of carbenicillin in the bioreactors. The pH was controlled at 7.0 by the addition of 10% NH4OH, with the stirrer speed at 1500 r.p.m., the temperature at 37 °C and the air flow rate at 1.5 vvm. When the OD600 nm reached between 7 and 8, protein expression was induced by the addition of 1 mM IPTG. After 30 min, the air flow rate was reduced to 0.37 vvm (25% of the initial value) to reduce dissolved oxygen (dO), and an additional 3% glucose was supplemented. At various time points, 1 mL samples of each culture were taken for further analysis.
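For readers setting up a gas flow controller, the aeration rates above can be converted from vvm to absolute flow. The sketch assumes a constant working volume of 300 mL (volume changes due to sampling, base addition, and glucose feeding are ignored); the function name is illustrative.

```python
def vvm_to_l_per_min(vvm: float, working_volume_l: float) -> float:
    """Convert aeration in vvm (gas volume per liquid volume per minute) to L/min."""
    return vvm * working_volume_l


WORKING_VOLUME_L = 0.300  # 300 mL of modified M9 medium per bioreactor (assumed constant)

initial = vvm_to_l_per_min(1.5, WORKING_VOLUME_L)
reduced = vvm_to_l_per_min(0.37, WORKING_VOLUME_L)
print(f"Initial air flow: {initial:.3f} L/min")
print(f"Reduced air flow: {reduced:.3f} L/min "
      f"({100 * 0.37 / 1.5:.0f}% of initial)")
```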

4.4. Results and discussion

4.4.1. Phylogenetic analysis of DERA sequences

To provide insight into the phylogenetic diversity of DERAs, 2,553 sequences of putative DERAs were extracted from the KEGG Orthology (KO) database using the KO identifier K01619 for the E. coli DERA (DeoC), which is the best characterized DERA enzyme (Gijsen and Wong, 1994). Initially, this pool of putative DERA proteins comprised 2,281 sequences from bacteria, 120 from archaea, and 152 from eukaryotes, but it was reduced to 1,974 proteins after removing redundant sequences. The phylogenetic analysis revealed the presence of five major clusters of DERA proteins including one Bacterial domain, one Firmicutes (Bacilli and Clostridia), one mostly Proteobacteria, and two mixed clusters (Fig. 4-1A). To screen DERAs for the bioconversion of acetaldehyde to 1,3BDO, we selected 20 DERA proteins from different phylogenetic groups, which were found to be soluble when expressed in E. coli (see Appendix B, Fig. B1 and Table B1). Based on the phylogenetic analysis, 17 of the selected DERAs belong to the five large clusters (1-5), whereas the remaining three proteins were from non-classified sequences.
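The bookkeeping behind the sequence pool can be sketched as below. The domain counts are taken from the text; the redundancy criterion used in the thesis is not stated, so exact sequence identity is assumed here purely for illustration, and the toy sequences and names are made up.

```python
def remove_redundant(records: dict[str, str]) -> dict[str, str]:
    """Keep the first entry for each distinct sequence.

    Assumption: "redundant" means an exact sequence match (case-insensitive).
    """
    seen: set[str] = set()
    unique: dict[str, str] = {}
    for name, seq in records.items():
        key = seq.upper()
        if key not in seen:
            seen.add(key)
            unique[name] = seq
    return unique


# Domain counts reported in the text
counts = {"bacteria": 2281, "archaea": 120, "eukaryotes": 152}
total = sum(counts.values())
print(f"Total putative DERAs: {total}")

# Toy records with made-up sequences, for illustration only
toy = {"seqA": "MKLTA", "seqB": "MKLTA", "seqC": "MRLSA"}
print(sorted(remove_redundant(toy)))
```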

4.4.2. Screening of purified DERAs for biosynthesis of 1,3BDO from acetaldehyde

In our previous work, we identified several aldo-keto reductases (AKRs) with significant activity in reducing 3-hydroxybutanal to 1,3BDO (Kim et al., 2017). Among these proteins, PA1127 from Pseudomonas aeruginosa was found to exhibit negligible activity against acetaldehyde, making it suitable for coupling the DERA-catalyzed condensation of acetaldehyde (to 3-hydroxybutanal) with the AKR-catalyzed reduction of 3-hydroxybutanal to 1,3BDO (Scheme 4-2). Using PA1127, we established a coupled enzyme system (DERA+PA1127) and screened 20 purified DERAs for transformation of acetaldehyde to 1,3BDO. These screens revealed significant production of 1,3BDO in the presence of 12 DERAs, with the highest activity observed for TM1559 from Thermotoga maritima (DERA group-1), E. coli DeoC (DERA group-4), and BH1352 from Bacillus halodurans (DERA group-2) (see Fig. 4-1A and B).

In this work, we have determined the crystal structures of the wild-type and mutant (K184L) BH1352 (PDB codes 6D33 and 6MSW, Table 4-2). Previously, we also determined the crystal structure of the E. coli DeoC to 1.4 Å resolution (PDB code 1KTN), and several structures of this enzyme were published by two other labs (Heine et al., 2001, 2004; Dick, Hartmann, et al., 2016). In addition, the unpublished crystal structure of TM1559 (determined by the Joint Center for Structural Genomics) is available from the PDB, both in an apo form and in complex with citrate (PDB codes 3R12 and 3R13) (Heine et al., 2001, 2004). Since BH1352 was found to support the highest production of 1,3BDO by E. coli cells expressing different DERAs (including TM1559 and DeoC) (Nemr et al., 2018), this protein was selected for detailed structural and biochemical studies of the transformation of acetaldehyde to 1,3BDO.


Since B. halodurans is an alkaliphilic bacterium (grows well at pH >9.0), we determined the optimal pH range for BH1352 using the retro-aldol DRP cleavage reaction coupled with glycerol-3-phosphate dehydrogenase and triosephosphate isomerase (Heine et al., 2001; DeSantis et al., 2003; Sakuraba et al., 2007; You et al., 2013). These assays revealed the maximal activity of BH1352 at pH 8.5 (see Appendix B, Fig. B3), whereas the previously reported DERA enzymes from other bacteria showed the highest activity in this reaction at pH 6.0-7.5 (Sakuraba et al., 2007; Jiao et al., 2015). At the optimal pH, the Vmax of BH1352 with DRP as substrate was calculated to be 34.12 ± 1.03 μmol/min/mg protein, which is lower than that of the DERAs from E. coli and Lactobacillus brevis (58 and 102 μmol/min/mg protein, respectively), but higher than that of other DERAs (0.25-1.00 μmol/min/mg protein) (Sakuraba et al., 2007; Jiao et al., 2015). The L. brevis DERA has also been reported to exhibit high resistance to aldehydes, but this enzyme was found to be hardly active in our AKR-based screen for 1,3BDO synthesis (see Fig. 4-1B) (Jiao et al., 2015). Steady-state kinetic parameters of BH1352 and its variants were also determined using the DRP cleavage reaction (Table 4-1). These experiments revealed that BH1352 exhibits typical Michaelis-Menten kinetics with an apparent Km of 0.22 mM, which is close to that of E. coli DeoC and more than 10 times lower than that of the L. brevis DERA (3.3 mM) (Sakuraba et al., 2007).

Table 4-1. Effect of rationally designed BH1352 mutations on the retro-aldol cleavage of DRP.

Protein        KM [mM]        Vmax [U/mg]     kcat [s⁻¹]      kcat/KM [mM⁻¹ s⁻¹]
BH1352 WT      0.22 ± 0.02    34.12 ± 1.03    13.28 ± 0.40    60.36
C39M           0.28 ± 0.02    37.97 ± 0.88    14.78 ± 0.34    52.79
C61A           0.18 ± 0.01    35.44 ± 0.55    13.80 ± 0.21    76.67
F160Y          0.50 ± 0.05    47.82 ± 1.62    18.62 ± 0.63    37.24
F160H          0.45 ± 0.04    32.96 ± 1.07    12.83 ± 0.42    28.51
F160K          0.06 ± 0.01     9.06 ± 0.42     3.53 ± 0.16    58.83
F160M          0.17 ± 0.01    22.58 ± 0.48     8.79 ± 0.19    51.71
F160W          0.22 ± 0.01    19.14 ± 0.41     7.45 ± 0.16    33.86
I170V          0.18 ± 0.01    38.15 ± 0.46    14.85 ± 0.18    82.50
M173I          0.17 ± 0.02    30.08 ± 0.78    11.71 ± 0.19    68.88
M173L          0.21 ± 0.02    32.80 ± 1.14    12.77 ± 0.44    60.81
M173V          0.18 ± 0.02    25.40 ± 0.87     9.89 ± 0.34    54.94
F160Y/M173I    0.25 ± 0.02    27.94 ± 0.70    10.88 ± 0.27    43.52
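The catalytic efficiencies in the last column of Table 4-1 follow directly from the kcat and KM columns. The short sketch below recomputes kcat/KM for a subset of entries as a consistency check; the values are copied from the table, and the function and variable names are illustrative.

```python
# (KM in mM, kcat in s^-1) for a subset of Table 4-1 entries
table = {
    "BH1352 WT": (0.22, 13.28),
    "F160Y": (0.50, 18.62),
    "F160K": (0.06, 3.53),
    "I170V": (0.18, 14.85),
    "M173I": (0.17, 11.71),
}


def catalytic_efficiency(km_mM: float, kcat_per_s: float) -> float:
    """Catalytic efficiency kcat/KM in mM^-1 s^-1."""
    return kcat_per_s / km_mM


for protein, (km, kcat) in table.items():
    eff = catalytic_efficiency(km, kcat)
    print(f"{protein:10s} kcat/KM = {eff:6.2f} mM^-1 s^-1")
```

The F160K entry shows how a lower turnover number can be offset by a lower KM, which gives an efficiency close to that of the wild type.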


Figure 4-1. Phylogenetic analysis of DERAs and screening of purified proteins for 1,3BDO formation. A: Phylogenetic analysis of the DERA family: unrooted phylogenetic tree of 2,553 DERA sequences showing the presence of five main clusters (1-5) and non-clustered sequences. Black circles indicate the 20 DERA proteins from different clusters selected for activity screening (with organism names and UniProt codes). BH1352 from B. halodurans characterized in this work is indicated by the red rectangle. B: Screening of 20 purified DERAs for the production of 1,3BDO from acetaldehyde in the presence of PA1127. The graph bars represent the final concentration of 1,3BDO produced after 2 h of incubation with 10 mM NADPH and 50 mM acetaldehyde (see Materials and Methods for experimental details).

4.4.3. Crystal structure of BH1352: overall fold, C-terminal, and active site

The crystal structures of BH1352 (wild type, PDB code 6D33; K184L mutant, PDB code 6MSW) and E. coli DeoC (PDB code 1KTN) were determined to 2.50 Å, 2.17 Å, and 1.40 Å resolution, respectively, using crystals grown by the sitting-drop vapor diffusion method (Table 4-2), whereas the unpublished crystal structure of TM1559 is available from the Protein Data Bank (PDB codes 3R12 and 3R13, determined by the Joint Center for Structural Genomics). Analysis of the crystal contacts of BH1352 using the PDBePISA quaternary structure prediction server (http://www.ebi.ac.uk/pdbe/pisa/) predicted a dimeric state (Fig. 4-2A). This was supported by size-exclusion chromatography, which suggested that the protein exists as a dimer in solution (observed molecular mass 51.7 kDa; predicted monomer mass 24.2 kDa). This is similar to the dimeric state of hyperthermophilic and L. brevis DERAs, but different from the E. coli DeoC, which was found to exist in a monomer-dimer equilibrium (Heine et al., 2004; Sakuraba et al., 2007). Based on the BH1352 structure, the interfaces between monomers in each adjacent dimer involve 13 hydrogen bonds and bury 1,288 Å² of surface area. The dimerization interface of BH1352 is composed mainly of hydrophobic interactions without any intermolecular bonds or salt bridges, while the weak dimerization interface of DeoC (573 Å²) comprises a single hydrogen bond and two salt bridges between the α3 and α4 helices of each protomer (see Appendix B, Fig. B4) (Sakuraba et al., 2003, 2007; Heine et al., 2004). The dimerizing hydrophobic interactions are mainly focused on the BH1352 loops containing Pro16, Phe66, Pro67, Leu68, Ile97, and Phe160 (see Appendix B, Table B2). On the other hand, the TM1559 structure exhibits a much stronger dimerization: the interface between TM1559 protomers buries 1,464 Å² with 14 hydrogen bonds and two salt bridges, along with the most hydrophobic contacts among the three DERAs, which explains its highest structural stability (see Table B2) (Sakuraba et al., 2007).
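The oligomeric state inferred from size-exclusion chromatography can be checked with a one-line ratio. The sketch below uses the observed and predicted masses quoted above for BH1352; rounding to the nearest integer is a simplifying assumption, since molecular shape also affects apparent mass in gel filtration, and the function name is illustrative.

```python
def estimate_subunits(observed_kda: float, monomer_kda: float) -> tuple[float, int]:
    """Return the observed/monomer mass ratio and the nearest whole number of subunits."""
    ratio = observed_kda / monomer_kda
    return ratio, round(ratio)


# Values for BH1352 quoted in the text
ratio, n = estimate_subunits(observed_kda=51.7, monomer_kda=24.2)
print(f"Mass ratio = {ratio:.2f}, nearest oligomeric state = {n}-mer")
```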

The monomeric structures of BH1352, TM1559, and DeoC displayed a classical (α/β)8 fold (TIM barrel), which is one of the most common protein folds catalyzing diverse enzymatic reactions (Brändén, 1991). Interestingly, the AKR enzyme PA1127 used in combination with DERAs (EC 4.1.2.4) for 1,3BDO synthesis also has a TIM barrel fold, but catalyzes the NADPH-dependent reduction of 3-HB and other aldehydes (EC 1.1.1.X) (Kim et al., 2017). A Dali search for structural homologues of BH1352 identified several DERA structures as the best matches, including the Entamoeba histolytica DeoC (PDB code 3NGJ; Z score, 39.0; RMSD, 0.9 Å; 63% sequence identity), Streptococcus suis DERA (5DBU; Z score, 38.5; RMSD, 0.6 Å; 66%), and L. brevis DERA E78K mutant (4XBS; Z score, 38.1; RMSD, 0.9 Å; 53%).

Based on the BH1352 structure, its active site is located inside the β-barrel, near its C-terminal side (Fig. 4-2B). The active site entrance is formed by several loops connecting β-strands (β1, β6, and β7) with α-helices (α1, α6, and α7), which contain highly or semi-conserved residues including Thr12, Leu14, Lys15, Phe66, Ile128, Phe160, Ser186, and Ser209. The side chains of these residues create a narrow channel that gives substrates access to the catalytic Lys155, located on the β6 strand (Fig. 4-3A). In the best characterized DERA from E. coli, the catalytic Lys167 is in close proximity to the side chains of conserved Lys137 and Lys201, and the three Lys residues form salt bridges with the side chain oxygens of conserved Asp102 (Heine et al., 2001, 2004). During the DERA-catalyzed synthesis of 2-deoxyribose-5-phosphate (DRP) (Scheme 4-1), the uncharged nucleophilic Lys167 of E. coli DeoC attacks the acetaldehyde carbonyl, forming a carbinolamine and then a Schiff base, which subsequently tautomerizes to an enamine that attacks glyceraldehyde 3-phosphate (Heine et al., 2001; DeSantis et al., 2003). Finally, hydrolysis of the aldol condensation intermediate releases the free enzyme and DRP.

Previous biochemical work with the E. coli DeoC also suggested an important role for the highly conserved C-terminal Tyr259, because its replacement by Phe (Y259F) resulted in a ~100-fold reduction of the DRP cleavage activity (Heine et al., 2001; Schulte et al., 2018). Interestingly, the deletion of Tyr259 (ΔY259) significantly increased the DeoC activity in the condensation reaction between acetaldehyde and chloroacetaldehyde (Jennewein et al., 2006). Using a combination of NMR and molecular dynamics simulations, it has been shown that the DeoC C-terminal tail is intrinsically disordered, existing in an equilibrium between open states and a catalytically relevant closed state in which Tyr259 is inserted into the active site close to the catalytic Lys167 (~6 Å) (Schulte et al., 2018). Remarkably, the structures of both BH1352 and TM1559 revealed the presence of electron density for the C-terminal Tyr224 and Tyr246, respectively. In TM1559, Tyr246 was positioned on the C-terminal α-helix with the side chain exposed to solvent (see Appendix B, Fig. B5). On the other hand, BH1352 Tyr224 was located on the flexible C-terminal tail, and its side chain was stabilized through interactions with the active site of the other BH1352 dimer. In the BH1352 active site, the Tyr224 side chain showed two orientations, with the hydroxyl group pointing toward the catalytic Lys155 (2.7 Å) or toward the β1-α2 loop backbone (near conserved Leu14 and Lys15, 2.9-3.0 Å) (Fig. 4-4A). In the second orientation, the aromatic ring of Tyr224 is part of the hydrophobic wall of the active site (with conserved Leu14, Val63, Phe66, Ile128, and Phe160) (Fig. 4-4B). Thus, in line with a recent molecular dynamics study of DeoC (Schulte et al., 2018), the BH1352 crystal structure provides a structural indication that the C-terminal Tyr residue of DERA aldolases might be involved directly in substrate binding and/or the catalytic mechanism of this enzyme.


Table 4-2. Crystallographic data collection and model refinement statistics for the crystal structure of BH1352. The PDBs will not be released until the paper is accepted.

Structure                                   apo BH1352               apo BH1352 K184L         EcDERA
PDB code                                    6D33                     6MSW                     1KTN
Data collection
  Space group                               C2                       C2                       C2
  Cell dimensions
    a, b, c (Å)                             240.91, 55.52, 177.71    240.78, 55.13, 176.30    62.57, 53.56, 81.36
    α, β, γ (°)                             90, 128.02, 90           90, 127.71, 90           90, 109.97, 90
  Resolution, Å                             30.0 – 2.50              30.0 – 2.17              50.0 – 1.40
  Rmerge [a]                                0.049 (0.654)            0.077 (0.734)            0.078 (0.309)
  Rpim                                      0.027 (0.392)            0.042 (0.405)            -- [e]
  CC1/2 [c]                                 0.750                    0.815                    -- [e]
  I/σ(I)                                    34.2 (2.56)              17.8 (2.11)              20.0 (2.44)
  Completeness, %                           98.9 (97.5)              100 (100)                99.5 (98.2)
  Redundancy                                3.9 (3.5)                4.2 (4.2)                6.0 (4.5)
Refinement
  Resolution, Å                             30.02 – 2.50             29.98 – 2.17             38.65 – 1.40
  No. of unique reflections: working, test  63680, 1995              97396, 1997              88731, 4458
  R-factor/free R-factor [d]                20.4/25.0 (29.7/32.9)    19.3/21.1 (28.8/30.0)    18.6/20.5 (18.7/19.9)
  No. of refined atoms, molecules
    Protein                                 9557, 6                  9514, 6                  3763, 2
    Solvent                                 116, 18                  102, 17                  N/A
    Water                                   461                      843                      631
  B-factors
    Protein                                 74.70                    56.58                    8.54
    Solvent                                 97.53                    85.32                    N/A
    Water                                   64.25                    54.92                    20.50
  r.m.s.d.
    Bond lengths, Å                         0.004                    0.004                    0.004
    Bond angles, °                          0.632                    0.618                    1.300

[a] $R_{\mathrm{sym}} = \sum_h \sum_i |I_i(h) - \langle I(h) \rangle| / \sum_h \sum_i I_i(h)$, where $I_i(h)$ and $\langle I(h) \rangle$ are the ith and mean measurement of the intensity of reflection h.
[b] Figures in parentheses indicate the values for the outer shells of the data.
[c] Value refers to the outer shells of the data.
[d] $R = \sum |F_p^{\mathrm{obs}} - F_p^{\mathrm{calc}}| / \sum F_p^{\mathrm{obs}}$, where $F_p^{\mathrm{obs}}$ and $F_p^{\mathrm{calc}}$ are the observed and calculated structure factor amplitudes, respectively.


Figure 4-2. Crystal structure of BH1352. A: Dimeric structure of BH1352; B: Overall view of BH1352 protomer. The α helices and β strand structures that compose the TIM barrel are labeled. The catalytic triad is displayed with sticks.


Figure 4-3. Active site of BH1352. A: Close-up view of the BH1352 active site. The catalytic residues and the key residues involved in substrate binding are shown as sticks with green carbons and labeled; B: Diagram showing the predicted binding mode of 2-deoxyribose-5-phosphate (DRP) in the active site of BH1352, modeled using the EcDERA-DRP complex (PDB code 1JCL).


Figure 4-4. Two orientations of the C-terminal Tyr224 of BH1352 (red boxes of A and B), each being displayed in detail on the right.


4.4.4. Probing the active site of BH1352 using site-directed mutagenesis

Since the catalytic residues of E. coli DeoC are conserved in BH1352 and TM1559, the same catalytic mechanism can be applied to the aldol condensation of two acetaldehyde molecules catalyzed by these enzymes. In the BH1352 active site, the side chain of conserved Asp92 (Asp117 in TM1559) forms salt bridges with the conserved Lys126 (2.7 Å from Asp92), Lys155 (3.1 Å), and Lys184 (3.0 Å). We propose that the catalytic Lys155 forms a Schiff base with the acetaldehyde carbonyl, whereas Asp92 and Lys184 are part of the BH1352 proton relay system involved in imine deprotonation to form an enamine (see Appendix B, Fig. B6). This is consistent with the results of alanine replacement mutagenesis of BH1352, in which the respective mutant proteins (D92A, K126A, K155A, and K184A) showed very low or no catalytic activity in both the DRP cleavage and acetaldehyde condensation reactions (Fig. 4-7A and B). It is also supported by the crystal structure of TM1559 in complex with citrate and glycerol (PDB code 3R12), which indicates that its active site includes Asp117, Lys150, Lys179 (catalytic), and Lys208 (see Appendix B, Fig. B7). Another crystal structure of TM1559 (PDB code 3R13) also revealed the presence of additional electron density in the active site representing an unknown ligand covalently bound to the catalytic Lys179 (likely one of the reaction intermediates).

To identify other BH1352 residues involved in substrate binding, we modeled DRP into the BH1352 active site using the structure of the E. coli DeoC-DRP complex (Fig. 4-3B) (Heine et al., 2001; DeSantis et al., 2003). The resulting model of DRP binding in the BH1352 active site predicts that the side chain of Thr12 is involved in substrate coordination via hydrogen bonding with the β-hydroxyl group of DRP, as well as with Lys184 and a water molecule (W29, Fig. 4-3B). This is consistent with the results of site-directed mutagenesis, which revealed that Ala replacement of Thr12 impaired catalysis of the DRP cleavage reaction (Fig. 4-7A). However, the Thr12Ala mutant protein exhibited acetaldehyde condensation activity comparable to that of the wild-type BH1352, suggesting that this residue is not critical for acetaldehyde condensation (Fig. 4-7B).

The DRP binding model also suggested that the DeoC Lys172 (interacting with the DRP phosphate and γ-hydroxyl groups) is replaced by Phe160 in BH1352 (Fig. 4-5A and B). Moreover, the model suggests that the highly conserved BH1352 Arg190 and DeoC Arg207 (located close to the conserved phosphate-binding Gly-rich loop) might also be involved in the coordination of the DRP phosphate through the bound water molecule (like Lys172 in DeoC). In addition, the BH1352 Lys15 appears to interact with the phosphate and γ-hydroxyl groups of DRP. Based on the DERA sequence alignment, the BH1352 Lys15 is conserved in DERAs from Bacilli (group 2), whereas the proteobacterial DERAs (group 4) contain an Asn residue at this position (Asn21 in DeoC) (see Fig. 4-1 and 4-6). Ala replacement mutagenesis of Lys15 rendered BH1352 completely inactive in both the retro-aldol and acetaldehyde condensation reactions (see Fig. 4-7A and B), indicating that this residue plays an important role in the catalytic activity of this enzyme.

Another notable feature of the BH1352 and TM1559 structures is the presence of a cluster of hydrophobic residues near the catalytic lysine (Lys155 in BH1352), including four residues conserved in all DERAs (Leu14, Val63, Phe66, and Ile128 in BH1352; Leu40, Val88, Phe91, Ile152, and Phe184 in TM1559) (Fig. 4-5C, also see Appendix B, Fig. B7 and Table B3). In the BH1352 structure, the side chains of Leu14, Phe66, and Ile128 are oriented towards the α-carbon of aldol products (see Fig. 4-5C), suggesting that these residues provide hydrophobic contacts for ligand binding and that they might be essential for enzyme activity. This was supported by the results of alanine replacement mutagenesis of these residues, which produced mutant proteins with greatly reduced activity in both reactions (the L14A protein was found to be insoluble) (see Fig. 4-7A and B). Another hydrophobic cluster, comprising three valine residues (Val154, Val177, and Val183), Ile170, and Met173, is located between the two β-strands (β6 and β7) and the α6 helix (in BH1352) (Fig. 4-5D). It was previously reported that this cluster may contribute to the sequential aldol condensation, as revealed by the DeoC mutations F200I and M185V (equivalent to Val183 and Met173 in BH1352), which resulted in enhanced condensation of acetaldehyde and chloroacetaldehyde (Jennewein et al., 2006). Also, the DeoC Phe200 is replaced by Val in DERAs from T. maritima and Pyrobaculum aerophilum, both of which show higher sequential aldol condensation of acetaldehyde (Sakuraba et al., 2007). These results suggest that reducing the size of hydrophobic side chains in this cluster might contribute to higher aldol condensation activity.

Recently, it has been shown that the C-terminal Tyr259 of the E. coli DeoC is required for the efficient proton abstraction step in the DRP cleavage reaction (Schulte et al., 2018), whereas previous work with the truncated DeoC ΔY259 protein (Tyr259 deleted) demonstrated an enhanced activity in acetaldehyde condensation with chloroacetaldehyde (Jennewein et al., 2006). Since the BH1352 structure suggested that the C-terminal Tyr224 might directly contribute to substrate binding or activity of this enzyme (see Fig. 4-4A and B), site-directed mutagenesis was also used to ascertain the role of this residue. We designed and purified four Tyr224 mutant proteins, Y224A, Y224F, ΔY224 (Tyr224 deleted), and ΔS223/Y224 (Ser223 and Tyr224 deleted), and tested their catalytic activities in the DRP cleavage and acetaldehyde condensation (1,3BDO production) reactions. Interestingly, while the acetaldehyde condensation activity of these mutant proteins was not affected, their retro-aldol activity was greatly reduced (especially in Y224F), indicating that Tyr224 is essential for DRP cleavage, but not for acetaldehyde condensation (Fig. 4-7A and B). Thus, the crystal structures of BH1352 and other DERAs from different phylogenetic groups revealed significant differences in substrate coordination and catalysis of DRP cleavage and acetaldehyde condensation.


Figure 4-5. Substrate entrance regions of BH1352 (A) and EcDERA (B, PDB 1JCL, DRP shown as a green stick). The residues constituting the substrate entrance are displayed with stick model and labeled; C and D: The clusters of hydrophobic amino acids (green stick model, the predicted hydrophobic contact between protein and ligand shown with a dashed arch), one near the catalytic Lys155 (C) and the other in between the β-strand barrel and the α-helix barrel (D).


Figure 4-6. Structure-based sequence alignment of DERAs active in 1,3BDO production: six DERAs from Bacilli including BH1352 (light red background), six proteobacterial DERAs (light blue background), and TM1559 (the center row). The secondary structure elements derived from the structures of BH1352 and E. coli DeoC are shown above and below the alignment, respectively. Residues conserved in all proteins are shown in white font on a red background. The columns with red residues indicate the presence of more than 70% of biochemically similar residues. The catalytic residues are indicated by cyan boxes with red residue numbers, whereas the columns with black boxes and residue numbers indicate the substrate entrance residues (from Fig. 4-5A). The residues of the hydrophobic amino acid clusters (from Fig. 4-5C and D) are labeled with black circles.


Figure 4-7. Site-directed mutagenesis of BH1352: catalytic activity of purified wild-type and mutant proteins in the retro-aldol (DRP cleavage) and acetaldehyde condensation reactions. A: retro-aldolase activity of purified proteins with 1 mM DRP as substrate; B: acetaldehyde condensation reaction of purified proteins measured as the formation of 1,3BDO in the presence of PA1127 (DERA-AKR ratio 1:1). Experimental details are described in Materials and Methods.


4.4.5. Structure-based engineering of BH1352 for enhanced production of 1,3BDO

The crystal structures of BH1352 and TM1559 revealed that their substrate-binding pockets also include the side chain of a semi-conserved Phe (Phe160 in BH1352 and Phe184 in TM1559) (see Fig. 4-5A). This residue is conserved in most DERAs from clusters 1 (Mixed group) and 2 (Firmicutes), but it is replaced by a Lys residue in Proteobacterial DERAs (cluster 4) including E. coli DeoC (see Fig. 4-6). In the Lactobacillus brevis DERA (LbDERA), the replacement of the homologous Phe163 by Tyr has been shown to result in enhanced sequential condensation of acetaldehyde and chloroacetaldehyde, probably by promoting substrate access (Jiao et al., 2017). We found that the BH1352 Phe160 was not essential for either the retro-aldol (DRP cleavage) or the acetaldehyde condensation reaction, because the F160A mutation had no significant effect on either reaction (Fig. 4-7A and B). However, the DRP cleavage activity of BH1352 was negatively affected when Phe160 was mutated to Glu, Gln, Lys, Met, Trp, or His, and slightly stimulated by mutation to Tyr (~23%) (Fig. 4-7A). Interestingly, acetaldehyde condensation by BH1352 increased almost threefold in the F160Y protein and was also increased in the F160E (72%) and F160H (44%) proteins (Fig. 4-7B). In contrast, the replacement of Phe160 by Lys, Gln, Met, or Trp had no significant effect on this activity. These results suggest that, as in LbDERA (Jiao et al., 2017), the substitution of Phe160 by Tyr in BH1352 enhances acetaldehyde binding and/or condensation in the enzyme active site (see Appendix B, Fig. B8A and B). Based on the BH1352 crystal structure, the hydroxyl group of Tyr160 (in F160Y) might interact with the main chain amide of conserved Lys15 (3.3 Å) located on the β1-α1 loop (Leu13-Thr19) near the absolutely conserved Leu14 (see Appendix B, Fig. B8C). Our mutagenesis studies demonstrated that Lys15 is critical for the catalytic activity of BH1352, whereas Leu14 is part of the hydrophobic cluster near the catalytic Lys155 (the L14A mutant protein was found to be insoluble) (Fig. 4-6, Fig. BH1352 active site). We propose that the hydroxyl group of Tyr160 stabilizes the conformation of both Leu14 and Lys15 in the BH1352 active site, resulting in increased acetaldehyde condensation activity of this enzyme.

We also mutated the semi-conserved residues Ile170 and Met173 of BH1352, located near the catalytic Lys155 (see Fig. 4-5D), to examine whether reducing or increasing the size of their hydrophobic side chains would affect the catalytic activity of BH1352 and improve acetaldehyde condensation. Our coupled DERA-AKR assays (1,3BDO production) revealed a 40-50% increase in the production of 1,3BDO by the purified mutant proteins I170V, M173I, M173L, and M173V compared to the wild-type BH1352, whereas I170A showed reduced activity (Fig. 4-7A and B). In contrast, the retro-aldol (DRP cleavage) activity of BH1352 was not significantly affected by these mutations. Interestingly, the replacement of Met173 by a polar residue (Thr) had a strong negative effect on both BH1352 activities, indicating that retaining hydrophobicity at this position is critical for the catalytic activity of this enzyme. We also designed the BH1352 double mutant protein F160Y/M173I, which showed 1,3BDO formation activity comparable to that of F160Y, suggesting that these two mutations are not synergistic in acetaldehyde condensation. Thus, both the BH1352 F160Y and F160Y/M173I proteins exhibit enhanced activity in the acetaldehyde condensation reaction and can be used for in vitro and in vivo production of 1,3BDO.


4.4.6. Synthesis of (3R,5S)-6-chloro-2,4,6-trideoxyhexapyranoside via sequential condensation of acetaldehyde and chloroacetaldehyde

Since (3R,5S)-6-chloro-2,4,6-trideoxyhexapyranoside (CTHP) is of pharmaceutical interest and has a large global market, its in vitro enzymatic synthesis has been one of the most important subjects of DERA studies. Among previous academic studies of the DERA-catalyzed synthesis of CTHP, the highest yield to date, 98%, was achieved in 3 h of reaction using the DERA from Haemophilus influenzae Rd KW20 (Woo et al., 2014). A conversion of 94% was attained using the DERA from Bacillus caldolyticus after 4 h of reaction, and a 93% yield was obtained with the Lactobacillus brevis DERA Thr29Leu variant after two hours (Table 4-3) (You et al., 2013; Jiao et al., 2017).

In this work, the enzymatic synthesis of CTHP was conducted using BH1352 wildtype and its variants with an enzyme loading of 2.0 mg/ml, 200 mM acetaldehyde, and 100 mM chloroacetaldehyde (Fig. 4-8). While none of the previously studied DERAs exhibited a product yield over 60% within 1 h of reaction, BH1352 converted more than 70% of the substrates into CTHP within 30 min, reaching a product yield of 74.7% (see Table 4-3). Furthermore, its variants, including Phe160Tyr, Ile170Val, and Met173Ile, catalyzed the synthesis of CTHP with a marginally but measurably improved initial activity (see Appendix B, Fig. B8). By comparison with the other DERAs, BH1352 is effective in the synthesis of the statin drug precursor owing to its high conversion rate. However, the conversion by BH1352 reaches a plateau at around 75% after 30 min of reaction, potentially because of its poor tolerance to high concentrations of acetaldehyde. High substrate concentrations are generally employed in in vitro synthesis processes; thus, weak stability against substrates is a critical limitation in practical applications. Therefore, in the following section we explored the structural basis of BH1352 inactivation by high concentrations of acetaldehyde, examining the substrate tolerance and the key residues involved in structural stability.
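The reaction figures quoted above can be converted into a product concentration and a time-averaged specific rate. The following is a minimal sketch, not part of the original analysis. It assumes that the yield is expressed relative to chloroacetaldehyde (one mole of CTHP per mole of chloroacetaldehyde, with acetaldehyde supplied at twice that amount) and that the rate is simply averaged over the first 30 min. The function names are hypothetical.

```python
def product_concentration_mm(limiting_substrate_mm: float, yield_percent: float) -> float:
    """Product concentration (mM), assuming 1 mol product per mol limiting substrate."""
    return limiting_substrate_mm * yield_percent / 100.0


def average_specific_rate(product_mm: float, enzyme_mg_per_ml: float, minutes: float) -> float:
    """Time-averaged rate in umol product per min per mg enzyme.

    1 mM equals 1 umol/ml, so mM / (mg/ml * min) gives umol/(min*mg).
    """
    return product_mm / (enzyme_mg_per_ml * minutes)


# Values reported in the text: 100 mM chloroacetaldehyde, 74.7 % yield,
# 2.0 mg/ml BH1352, 30 min of reaction.
cthp_mm = product_concentration_mm(100.0, 74.7)
rate = average_specific_rate(cthp_mm, 2.0, 30.0)
print(f"CTHP formed: {cthp_mm:.1f} mM; average rate: {rate:.3f} umol/min/mg")
```

Because the conversion levels off after 30 min, this average understates the true initial rate and should be read only as a lower bound.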


Table 4-3. Performance of different DERAs in the synthesis of (3R,5S)-6-chloro-2,4,6-trideoxyhexapyranoside.

| Enzyme | Substrate load [mM] a | Enzyme dose [mg/ml] | Reaction time [min] | Yield [%] | Ref. |
|---|---|---|---|---|---|
| EcDERA | 500 | 2.5 | 480 | ≤6.5 | Jennewein et al. |
| EcDERA Var14 | 500 | 2.5 | 480 | 70 | Jennewein et al. |
| DERA from Bacillus caldolyticus | 80 | 2.5 | 480 | 94 | Greenberg et al. |
| DERA from Haemophilus influenzae | 80 | 5 µg/ml | 180 | 98 | Woo et al. |
| DERA from L. brevis | 500 | 2.5 | 120 | 91 | Jiao et al. 2015 |
| DERA from L. brevis | 700 | 2.5 | 180 | 44 | Jiao et al. 2015 |
| DERA from L. brevis E78K | 700 | 2.5 | 180 | 85 | Jiao et al. 2015 |
| DERA from L. brevis T29L b | 500 | 1.5 | 240 | 93 | Jiao et al. 2017 |
| BH1352 | 100 | 2.0 | 30 | 75 | This study |

a Substrate load refers to chloroacetaldehyde, while the acetaldehyde amount was double.
b Approximately 60% of product yield was achieved after 1 h of reaction.

Figure 4-8. BH1352-catalyzed tandem aldol condensation of acetaldehyde and chloroacetaldehyde to produce (3R,5S)-6-chloro-2,4,6-trideoxyhexapyranoside (dashed box), which is a key chiral precursor of atorvastatin.


4.4.7. Structural insight on inhibition of DERA at high concentrations of acetaldehyde

An additional common characteristic of DERAs is the presence of a cysteine residue close to the two lysine residues of the catalytic triad (see Cys39 in Fig. 4-3A). Dick et al. reported that this cysteine is involved in the deactivation of DERA at high acetaldehyde concentrations through the nucleophilic addition of its thiol group onto crotonaldehyde, which forms an imine with the catalytic lysine (Dick, Hartmann, et al., 2016). According to their study, substitution of Cys47 of EcDERA with methionine significantly increased stability against high concentrations of acetaldehyde, as the variant protein retained native catalytic activity even after 16 h of incubation with 300 mM acetaldehyde. This cysteine is so highly conserved in the DERA family that more than 90% of 3,700 orthologues, including BH1352, carry a cysteine at the corresponding position (Dick, Hartmann, et al., 2016). However, unlike EcDERA, BH1352 contains another cysteine in the active site cavity, Cys61, which lies near Cys39 but not close enough to form a disulfide bond in the apo structure (3.2 Å, see Fig. 4-9A). These two cysteine residues are conserved and close enough to form a disulfide bond in the hyperthermophilic DERA from Pyrobaculum aerophilum, which partly explains its high stability against acetaldehyde (Sakuraba et al., 2007). The additional cysteine is another common characteristic of BH1352 and Bacilli DERAs, whereas the Proteobacterial DERAs, including EcDERA, share an alanine at the corresponding position (see Fig. 4-9A).

To confirm the functional role of these two cysteines in the stability of BH1352 against acetaldehyde, we introduced the Cys39Met, Cys61Ala, Cys61Val, and Cys39Met/Cys61Val substitutions and measured residual activity during incubation with 100 mM acetaldehyde (Fig. 4-9B). The Cys39Met single mutation appeared to slightly destabilize rather than stabilize the enzyme. Furthermore, Cys61 was found to be critical for stability: the Cys61Ala variant lost 15% of its native activity immediately after acetaldehyde was added and was completely inactivated within 4 h of incubation. Interestingly, the catalytic efficiency and binding affinity of Cys61Ala for retro-aldol DRP cleavage were improved compared with the wildtype, apparently at the expense of enzyme stability. Cys61Val was introduced to emulate another hyperthermophilic DERA, that from T. maritima, but this mutation barely affected stability while impairing the specific activity on 1 mM DRP (data not shown). Although the double mutation (C39M/C61V) also impaired the specific activity on DRP, the variant retained over 50% of its initial activity after 10 h, which is higher than the residual activity of the wildtype (34%). A possible explanation for the destabilization is the strong hydrophobicity of free Cys61. Since the side chain of free cysteine is highly hydrophobic, any amino acid substitution may adversely affect the integrity of the hydrophobic core of the native structure (Nagano, Ota and Nishikawa, 1999). Thus, in the case of BH1352, the cost of losing hydrophobic integrity is apparently higher than the benefit of preventing the nucleophilic attack of Cys39 onto crotonaldehyde bound to the catalytic lysine.
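For readers who wish to compare the stability data on a common scale, the residual activities quoted above can be converted into apparent inactivation constants. The sketch below is illustrative only and rests on an assumption that the thesis does not make: that activity decays as a single exponential, A(t) = A0·exp(-kt). The function names are hypothetical, and the value for C39M/C61V is only a bound because the text reports "over 50%".

```python
import math


def apparent_k_inact(residual_fraction: float, hours: float) -> float:
    """Apparent first-order inactivation constant (1/h), assuming A(t) = A0 * exp(-k * t)."""
    if not 0.0 < residual_fraction <= 1.0:
        raise ValueError("residual_fraction must be in (0, 1]")
    return -math.log(residual_fraction) / hours


def half_life_h(k: float) -> float:
    """Half-life (h) corresponding to a first-order rate constant k (1/h)."""
    return math.log(2.0) / k


# Residual activities after 10 h with 100 mM acetaldehyde, as quoted in the text.
k_wildtype = apparent_k_inact(0.34, 10.0)
# C39M/C61V retained "over 50 %", so 0.50 gives an upper bound on k
# and a lower bound on the half-life.
k_double_max = apparent_k_inact(0.50, 10.0)

print(f"wildtype: k = {k_wildtype:.3f} 1/h, t1/2 = {half_life_h(k_wildtype):.1f} h")
print(f"C39M/C61V: k <= {k_double_max:.3f} 1/h, t1/2 >= {half_life_h(k_double_max):.1f} h")
```

The Cys61Ala variant, which lost 15% of its activity immediately, would not be described well by this single-exponential model.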


Figure 4-9. A: Structural alignment of BH1352 (light gray) and EcDERA (green). The two target cysteine residues of BH1352, the corresponding residues of EcDERA, and the catalytic triad of the two DERAs are represented as stick models. The DRP substrate bound to EcDERA Lys167 is displayed as a cyan stick. The labels of the key residues are colored differently for BH1352 (black) and EcDERA (green); B: Time-dependent residual activity of the wildtype BH1352 and its variants during incubation with 100 mM acetaldehyde.


4.5. Conclusion

Using a combination of purified DERA aldolases and an aldo-keto reductase (PA1127), we have identified three microbial DERAs with high activity in the transformation of acetaldehyde to 1,3BDO. The crystal structure and site-directed mutagenesis of BH1352 provided insights into the molecular mechanisms of substrate selectivity and acetaldehyde condensation activity of DERA aldolases. By targeting hydrophobic residues near the catalytic Lys155 of BH1352, we generated two variants of this enzyme (F160Y and F160Y/M173I) with enhanced activity in acetaldehyde condensation and 1,3BDO production. E. coli cells expressing these BH1352 variants as part of the DERA+AKR pathway produced 5-6 times more 1,3BDO from glucose compared to cells with the wild-type BH1352. The designed BH1352 variants can be used as starting material for future protein engineering efforts aimed at improving the activity of DERA aldolases and their performance in the biotechnological production of 1,3BDO and other chemicals. Moreover, a novel application of BH1352 was identified: the tandem aldol condensation of acetaldehyde and chloroacetaldehyde to produce a key chiral precursor of statin drugs. Despite its relatively high initial conversion rate, BH1352 is limited by its weak tolerance to high concentrations of acetaldehyde. To extend the application of BH1352 to the industrial production of statin drugs, a fed-batch system could be employed to maintain high productivity and obtain a higher conversion yield. Overall, the crystal structure analysis and biochemical characterization of BH1352 suggested potential applications of a novel DERA in the biosynthesis of high-value chemicals and pharmaceuticals.


Chapter 5. In vivo Study of Engineered BH1352 for 1,3-Butanediol Biosynthesis from E. coli and Synechococcus elongatus PCC 7942

5.1. Introduction

Exploiting the potential of microorganisms to synthesize value-added chemicals from cheap, renewable sources such as sugars and biomass offers a promising alternative to current industrial processes. The development of sustainable bioprocesses is attracting increasing attention because of the negative environmental impacts of fossil carbon use. Systems metabolic engineering has successfully enabled the biosynthesis of a variety of commodity chemicals from renewable carbon sources, including 1,4-butanediol (Yim et al., 2011), 1,3-propanediol (González-Pajuelo et al., 2005), aldehydes and alkenes (Rodriguez and Atsumi, 2014; Sheppard, Kunjapur and Prather, 2016; Y. X. Cao et al., 2016). Despite these efforts to develop efficient bioprocesses, the range of non-natural chemicals explored is still narrow because of the significant cost of process development and insufficient economic feasibility.

The synthesis of complex compounds often requires the formation of new C-C bonds, which provides the foundation of organic synthesis. Three major biological routes have been explored to condense small molecules: Claisen condensation via thiolase catalysis, carboligation by pyruvate decarboxylase activity, and aldol condensation via aldolase reactions (Brovetto et al., 2011; Schmidt, Eger and Kroutil, 2016). The first of these, which uses acyl-CoA building blocks, has been demonstrated in many studies for a variety of products (Machado et al., 2012; Sheppard, Kunjapur and Prather, 2016), while carboligation has been demonstrated for the production of several chemicals including acetoin and phenylacetylcarbinol (Gunawan et al., 2007; Kunjapur, Tarasova and Prather, 2014). However, aldol condensation-based synthesis of value-added chemicals in biological systems has not been studied in depth.

Recently, Nemr et al. from our lab proposed a 2-deoxyribose-5-phosphate aldolase (DERA)-based pathway for the biosynthesis of 1,3-butanediol (1,3BDO) using E. coli as a host organism (Nemr et al., 2018). This pathway, as described throughout this thesis, consists of three enzymes: pyruvate decarboxylase (PDC), DERA, and aldo-keto reductase (AKR). The novel AKR (PA1127, Chapter 3) and the DERA together with its enhanced variants (BH1352, Chapter 4) were identified and characterized earlier in this thesis for optimization of the biosynthetic pathway. In this chapter, we explored the in vivo application of the novel enzymes and the engineered mutant DERAs in E. coli and Synechococcus elongatus PCC 7942.

5.1.1. Biosynthesis of 1,3BDO from DERA-based pathway in E. coli

The ultimate goal of this study was to demonstrate the DERA-based 1,3BDO pathway in E. coli, identify major competing pathways and rate-limiting steps through iterative optimization, and develop platform strains for other aldolase-based production modules to synthesize diverse chemicals (Nemr et al., 2018). The main approach was a modular pathway design consisting of an acetaldehyde-producing module and an aldolase-based 1,3BDO-producing module (Fig. 5-1). For the production of acetaldehyde, PDC from Zymomonas mobilis was chosen and expressed, as its expression in E. coli is well studied and gives high enzyme activity (Neale et al., 1987). For the 1,3BDO-producing module, BH1352 from B. halodurans (Chapter 4) and PA1127 from P. aeruginosa (Chapter 3) were chosen. BH1352 not only showed relatively high in vitro activity (see Fig. 4-1B), but also displayed the highest in vivo conversion among the DERAs tested (Nemr et al., 2018). Furthermore, PA1127 was fully characterized and found to be suitable for NADPH-dependent 3-HB reduction (Kim et al., 2017).

Once the biosynthetic pathway was designed in E. coli, several optimization strategies were explored. The first was to alleviate the imbalance of the pathway, namely NADPH depletion, through a fermentation strategy that provides sufficient flux to pyruvate while also providing NADPH as a driving force for 1,3BDO production. To achieve this, the growth stage was carried out aerobically to accumulate enough biomass, and the culture was then shifted to an oxygen-limited stage.


Figure 5-1. Aldolase-based biosynthetic pathway for (R)-1,3BDO in the context of E. coli central carbon metabolism at the pyruvate node. The figure was reproduced from Nemr et al. (2018).

Next, competing pathways were identified and removed to minimize byproduct formation. This involved the deletion of pyruvate-consuming pathways, including ldhA (lactate dehydrogenase), pflB (pyruvate-formate lyase), and ilvB (acetohydroxy acid synthase isozyme), to concentrate the carbon flux into the acetaldehyde-producing module (see Fig. 5-1). In addition, several aldehyde/alcohol dehydrogenase genes such as adhE, yqhD, eutG, adhP, and yjgB were deleted to prevent ethanol production as much as possible. Then, moving on to the 1,3BDO-producing module, an extra copy of DERA and AKR was added and the ribosome-binding sites (RBS) of the two genes were optimized for a higher translation rate, to reinforce the carbon flux to 1,3BDO. Deletion of the acetate-producing pathways via poxB and pta was also implemented but did not significantly contribute to 1,3BDO production. Finally, bioreactor fermentation allowing pH control and the monitoring and control of dissolved oxygen (DO) was used for cofactor balancing and for improving 1,3BDO production. As a result, the best strain produced 2.4 g/L of 1,3BDO with a yield of 0.058 g/g of glucose (Nemr et al., 2018).

Despite these iterative, comprehensive optimizations, the productivity of the 1,3BDO pathway is still far below that of the fatty acid β-oxidation pathway, which achieved 15.8 g/L of 1,3BDO with a yield of 0.18 g/g of glucose (Kataoka et al., 2014). There may still be room for metabolic engineering strategies to optimize the pathway, yet overcoming the inherent limitation of enzyme activity could also be an effective approach. The rate-limiting DERA condensation in particular is an obvious target. In the previous chapter, the DERA BH1352 was studied in detail, and some of the mutations were shown to be effective in 1,3BDO conversion in the DERA-AKR coupled assay. In this study, the mutations were introduced into the metabolic pathway in E. coli to study their in vivo effect and to optimize the pathway using a protein design approach.

5.1.2. Cyanobacterial engineering for photosynthetic conversion of CO2 into 1,3BDO

Cyanobacteria are photosynthetic prokaryotes that use photon energy to split water into electrons and protons and to generate reduced molecules and sugars themselves. They are classified as a phylum of bacteria and are known to be among the most ancient organisms on Earth. In fact, cyanobacterial photosynthesis is believed to have transformed the atmosphere of the planet from a reducing one into an oxidizing one more than 2 billion years ago, which dramatically stimulated the biodiversity of life forms on Earth under a protective ozone layer (Pisciotta, Zou and Baskakov, 2010).

There are numerous inherent properties of cyanobacteria that render them an attractive candidate organism in biotechnology, but the most important is their photosynthetic capacity. They can generate their own carbon source from practically cost-free inputs (light, water, and atmospheric CO2) to metabolically produce value-added compounds, requiring only a few inexpensive nutrients including phosphate, nitrate, and trace amounts of metal ions. In addition, cyanobacteria exceed other photosynthetic organisms such as eukaryotic microalgae and plants in solar energy conversion efficiency: a conversion of 3-9% is reported for cyanobacteria, while only ≤0.25-3% is converted by terrestrial plants (Ducat, Way and Silver, 2011). Cyanobacteria convert solar flux into biomass-stored chemical energy at a rate of nearly 450 TW, which is around 30 times the demand of human society (~15 TW) (Pisciotta, Zou and Baskakov, 2010). Cyanobacterial photosynthesis thus provides an enormous energy pool that can be used in the production of valuable chemicals.

Apart from their photosynthetic capability, cyanobacteria have several characteristics that are advantageous for turning them into cell factories for the renewable and sustainable production of biofuels and commodity chemicals. They have the fastest growth rate among all photosynthetic organisms as well as less complex intracellular structures and cell walls (Rosgaard et al., 2012). Also, their genetic simplicity and ease of transformation make genetic manipulation much easier than in other photosynthetic organisms such as plants and algae. As synthetic biology tools and genetic information for cyanobacteria accumulate and become available, it is no longer a remote idea that developments in cyanobacterial engineering will turn them into a “green E. coli” (Berla et al., 2013).

In the metabolic engineering of cyanobacteria, most studies have been carried out using two major strains, Synechococcus elongatus PCC 7942 (hereafter PCC 7942) and Synechocystis sp. PCC 6803 (hereafter PCC 6803). They are the most extensively studied of all cyanobacterial strains, and the full genome sequence and a genome-scale model are available for both (Nakamura, Kaneko and Tabata, 2000; Fu, 2009; Knoop et al., 2013; Broddrick et al., 2016). Both are unicellular, freshwater strains and are naturally transformable. In fact, PCC 6803 was the first photosynthetic microorganism whose full genome was sequenced. It is also able to metabolize organic carbon sources and can grow on glucose via the glycolysis pathway and the oxidative pentose phosphate pathway (Yu et al., 2013). PCC 7942, on the other hand, is photoautotrophic and is the only strain for which a commercial engineering kit is available (Life Technologies Corp., Carlsbad, CA). Owing to the accessibility of the strain and its genetic tools, PCC 7942 was chosen as the target strain in this study.


Here, we propose the cyanobacterial expression of the DERA-catalyzed 1,3BDO biosynthetic pathway (Fig. 5-2). Building on the BH1352 mutations that were rationally designed to improve 1,3BDO formation (Chapter 4), different candidate DERAs and other metabolic engineering strategies were tested with the aim of achieving the as yet unreported cyanobacterial production of 1,3BDO.

Figure 5-2. The biosynthetic pathway for cyanobacterial 1,3BDO production.

5.2. Materials and methods

Strain and plasmid construction

The strains and plasmids used in this study are listed in Table 5-1. Molecular biology techniques were performed according to standard practices. The flanking unique nucleotide sequences (UNS or U) are the same as those described by Nemr et al. (Torella et al., 2014; Nemr et al., 2018). The same primers used in the DERA study (Chapter 4) were used for site-directed mutagenesis.

The following methods of whole-cell biotransformation and bioreactor fermentation were proposed by Kayla Nemr, a Ph.D. graduate from Prof. Mahadevan's group.


Whole-cell biotransformation of acetaldehyde to 1,3BDO by BH1352 and its variants

The transformants of BH1352 and its mutants, including Phe160Tyr (F160Y hereafter), Ile170Val (I170V), Met173Ile (M173I), and Phe160Tyr/Met173Ile (F160Y/M173I), in BL21 gold DE3 competent cells were used for the whole-cell biotransformation of acetaldehyde to 1,3BDO. Each strain was cultivated in triplicate in 5 ml of LB liquid medium and grown overnight at 37 °C and 200 rpm. The pre-cultures were then used to inoculate 25 ml of terrific broth (TB) medium to an OD600nm of 0.2 and grown at 30 °C and 150 rpm until the OD600nm reached 0.6-0.8, at which point 1 mM IPTG was added to induce protein expression. The cultures were grown at 30 °C and 150 rpm for a further six hours. The cells were then harvested and resuspended in 10 mL of M9 medium containing 20 g/L of glucose and 0.5 mg/mL of thiamine. At 0 and 2.5 h after the inoculation, 0.8 g/L of acetaldehyde was added. Samples were collected after 2.5, 5, and 16 h and used to measure the cell density and to quantify 1,3BDO by HPLC. The concentration of 1,3BDO was normalized to the OD600nm of each culture sample. It was assumed that the amount of 1,3BDO produced is directly related to the amount of 3-HB produced from acetaldehyde via aldol condensation by BH1352 and its variants.
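The two calculations implied by this protocol, diluting a pre-culture to a starting OD600nm of 0.2 and normalizing the 1,3BDO concentration to cell density, can be sketched as follows. This is a minimal illustration rather than the procedure actually scripted in this work. The function names are hypothetical, and the numerical inputs are invented for the example and are not data from this study.

```python
def inoculum_volume_ml(preculture_od: float, target_od: float, final_volume_ml: float) -> float:
    """Pre-culture volume giving target_od in final_volume_ml (C1 * V1 = C2 * V2)."""
    if preculture_od <= target_od:
        raise ValueError("pre-culture must be denser than the target OD")
    return target_od * final_volume_ml / preculture_od


def od_normalized_titer(titer_g_per_l: float, od600: float) -> float:
    """1,3BDO concentration per unit OD600 (g/L per OD unit)."""
    return titer_g_per_l / od600


def percent_increase(variant: float, wildtype: float) -> float:
    """Relative increase of a variant over the wildtype, in percent."""
    return 100.0 * (variant - wildtype) / wildtype


# Hypothetical inputs, for illustration only.
volume = inoculum_volume_ml(preculture_od=4.0, target_od=0.2, final_volume_ml=25.0)
wildtype = od_normalized_titer(0.50, 10.0)
variant = od_normalized_titer(0.63, 9.0)
print(f"inoculum: {volume:.2f} ml; variant vs wildtype: {percent_increase(variant, wildtype):+.0f} %")
```

Normalizing before comparing matters here because the variant culture in the example has a lower cell density than the wildtype culture, which a comparison of raw titers would hide.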

Cultivation conditions in bioreactors

Strains were characterized in 500 mL bioreactors (Applikon Biotechnology Inc.) equipped with three Rushton impellers (two impellers with OD28 and one with OD22) and electrodes to measure pH and dissolved oxygen (DO). The first seed culture was prepared by inoculating 10 mL of LB supplemented with 100 μg/mL of ampicillin from a single colony or a deep-frozen glycerol cell stock and was grown at 37 °C until the OD600 reached at least 1. This seed culture was then used to inoculate 50 mL of modified M9 medium (supplemented with 100 μg/mL of carbenicillin and 0.5 mg/mL of thiamine and containing 0.1 M MOPS at pH 7.3) to an OD600 of 0.2 in a 250 mL baffled flask, which was grown at 37 °C and 250 r.p.m. for 16 h. This second seed culture was then used to inoculate 300 mL of modified M9 medium (without MOPS) supplemented with 100 μg/mL of carbenicillin in the bioreactors. The pH was controlled at 7.0 by the addition of 10% NH4OH, the stirrer speed was set at 1500 r.p.m., the temperature at 37 °C, and the air flow rate at 1.5 vvm. When the OD600 reached between 7 and 8, protein expression was induced by the addition of 1 mM IPTG. After 30 min, the air flow rate was reduced to 0.37 vvm (25% of the initial rate) to reduce DO, and an additional 3% glucose was supplemented. After the glucose was consumed (verified by a rise in pH), the culture samples were filtered and injected into the HPLC for product measurement.
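The aeration settings can be converted from vvm (gas volume per liquid volume per minute) into absolute gas flow rates for the 300 mL working volume. The short sketch below shows the arithmetic; the constant and function names are hypothetical, and the reading of "3% glucose" as 3% (w/v) is an assumption rather than something stated in the text.

```python
def gas_flow_l_per_min(vvm: float, working_volume_l: float) -> float:
    """Convert vvm (gas volume per liquid volume per minute) to L/min."""
    return vvm * working_volume_l


WORKING_VOLUME_L = 0.300             # 300 mL of modified M9 medium
GROWTH_VVM = 1.5                     # aerobic growth stage
PRODUCTION_VVM = 0.25 * GROWTH_VVM   # 25 % of the initial rate (reported as 0.37 vvm)

growth_flow = gas_flow_l_per_min(GROWTH_VVM, WORKING_VOLUME_L)
production_flow = gas_flow_l_per_min(PRODUCTION_VVM, WORKING_VOLUME_L)

# Assumption: "3 % glucose" means 3 % (w/v), i.e. 30 g per litre of culture.
glucose_added_g = 30.0 * WORKING_VOLUME_L

print(f"growth stage: {growth_flow:.3f} L/min; production stage: {production_flow:.4f} L/min")
print(f"glucose supplemented: {glucose_added_g:.1f} g")
```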

Table 5-1. Strains and plasmids used in this study.

| Strain / plasmid name | Properties / genotype |
|---|---|
| E. coli | |
| E. coli K12 MG1655 | F- lambda- ilvG- rfb-50 rph-1 |
| LMSE51C | MG1655 ΔadhE ΔldhA ΔpflB ΔyqhD ΔeutG ΔadhP ΔyjgB ΔilvB ΔpoxB Δpta::CmR |
| BDO-0 | LMSE51C, pBD1 |
| BDO-1 | LMSE51C, pBD1; F160Y in BH1352 |
| BDO-2 | LMSE51C, pBD1; F160Y/M173I in BH1352 |
| Synechococcus elongatus PCC 7942 | |
| TK01 | PCC 7942, pTK01 |
| TK02 | PCC 7942, pTK02 |
| TK09 (in progress) | PCC 7942, pTK09 |
| TK11 (in progress) | PCC 7942, pTK11 |
| TK12 | PCC 7942, pTK12 |
| Plasmids | |
| pTrc99A | pBR322 (AmpR) |
| pBD1 | pTrc99A:: BH1352 - PA1127 - PDC (Z. mobilis) |
| pSyn_6 | PCC 7942 NSI targeting vector, PpsbA (SpecR) |
| pTK01 | pSyn_6:: PDC (Z.m) - BH1352 - PA1127 |
| pTK02 | pSyn_6:: BH1352 - PA1127 - PDC (Z.m) |
| pTK09 | pSyn_6, lacIq; Ptrc:: *BH1352 - *PA1127 - *PDC (Z.m) |
| pTK11 | pSyn_6, lacIq; Ptrc:: *DERA (E. coli) - *PA1127 - *PDC (Z.m) |
| pTK12 | pSyn_6, lacIq; Ptrc:: *DERA (T. maritima) - *PA1127 - *PDC (Z.m) |

* N-terminal His-tag attached

Genetic modification of PCC 7942

Standard methods were used for PCR. Taq polymerase (New England Biolabs) was used for analytical colony-PCR experiments. High-fidelity Phusion polymerase (New England Biolabs) was used for Gibson Assembly and cloning. The 1,3BDO pathway genes were inserted into the pSyn_6 plasmid, a chromosomally integrable plasmid for PCC 7942 (GeneArt® Synechococcus Engineering kits, Life Technologies), using the Gibson Assembly protocol (Gibson et al., 2009).

Construction of engineered cyanobacterial strains

The cyanobacterial strains and plasmids used in this study are summarized in Table 5-1. The E. coli cultures were prepared in Luria Broth for plasmid construction. The plasmid containing the gene of interest was then transformed into PCC 7942. Transformation was carried out by dark incubation of 100-200 ng of plasmid DNA with 2 mL of PCC 7942 culture taken at the exponential phase (OD730 of 1-2). A dark incubation of more than 4 h is recommended, and the incubation was conducted for 8 h at 34 °C. After the incubation, the mixture of cells and DNA was spread on BG-11 agar medium containing 10 µg/ml spectinomycin and continuously illuminated until colonies appeared on the plate. Single colonies were picked and re-streaked on another BG-11/spectinomycin plate, and the cells were then re-streaked repeatedly while the concentration of spectinomycin was gradually increased from 10 µg/ml to 25 µg/ml (in 3-4 steps). After segregation of the mutant strains, cells on the plate were collected and subjected to colony PCR to confirm the insertion of the 1,3BDO pathway genes. Once full segregation of the mutant strains had been confirmed, the cells on the plate were transferred to modified BG-11 medium. For the liquid medium, 50× concentrated cyanobacteria BG-11 freshwater solution (Sigma-Aldrich) was purchased and diluted, then supplemented with 20 mM HEPES (pH 7.5) and 50 mM NaHCO3.

Design of an in-house photobioreactor and cyanobacteria cultivation conditions

The design of the photobioreactor (PBR) had to provide the three elements essential for photosynthesis and cyanobacterial culture: light, temperature control, and a CO2 feed. Twelve 24-inch, 15 W fluorescent light bulbs (Philips) were used to provide a light intensity of 140 µE/m2·s, and the temperature was controlled at 34 °C using a water bath and a Digital Immersion Circulator (Cole-Parmer Instrument Company, LLC.). The water bath was constructed from 5/16-inch-thick acrylic plates with dimensions of 65 cm × 25 cm × 25 cm. A polyethylene spill-control tray was placed below the water bath to contain any leakage or overflow from the water bath and the cyanobacterial cultures. Sterilized square-bottom 125 ml bottles were used to hold the cyanobacterial cultures so that each culture batch received uniform irradiance. To provide the carbon source, 3% CO2 balanced with air was fed through sterile 1 ml pipets connected to a 10-hole manifold with clear Tygon PVC tubing (summarized in Fig. 5-3).

Unless otherwise specified, all PCC 7942 strains were cultured in modified BG-11 medium at 34 °C using the photobioreactor described above. All cyanobacterial cultures were started at an OD730 of ~0.1. Cell growth was monitored indirectly by measuring OD730 in a Cary 50 Bio UV-Vis Spectrophotometer (Agilent-Varian), under the assumption that OD730 correlates with cell biomass in the culture. Samples were diluted to an OD730 between 0.2 and 0.9 for accurate cell density measurements.
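The dilution step can be expressed as a small helper that back-calculates the culture density from a diluted reading and flags readings that fall outside the 0.2-0.9 window. This is a sketch under the assumption that OD730 scales linearly with dilution inside that window; the function and constant names are hypothetical.

```python
from typing import Tuple

OD_WINDOW = (0.2, 0.9)  # OD730 range to which samples were diluted before reading


def culture_od730(reading: float, dilution_factor: float = 1.0) -> Tuple[float, bool]:
    """Return (estimated culture OD730, True if the reading lies inside OD_WINDOW)."""
    low, high = OD_WINDOW
    return reading * dilution_factor, low <= reading <= high


# Hypothetical readings, for illustration only.
dense_culture = culture_od730(0.45, dilution_factor=10.0)
undiluted_out_of_range = culture_od730(1.30)
print(dense_culture, undiluted_out_of_range)
```

A reading flagged as outside the window would be repeated at a different dilution rather than used directly.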

Figure 5-3. A: Graphical scheme of the photobioreactor design; B: Photo of the in-house photobioreactor.

Liquid chromatography mass spectrometry (LC-MS) analysis

Extracellular compounds in the cyanobacterial cultures were filtered through centrifugal filter devices (10k cutoff, VWR) to remove residual cells and enzymes and injected into a Thermo Scientific Ultimate 3000 UHPLC equipped with an Aminex HPX-87H column for LC-MS analysis. The system was equilibrated with 0.1% formic acid as the eluent at a flow rate of 0.6 ml/min and 50 °C. The flow-through was subjected to MS analysis using a Thermo Scientific Q-Exactive Mass Spectrometer. The detection mass range was set to m/z 80-250. The spray voltage was 4 kV and the capillary temperature was 320 °C. The mass spectrometry data were analyzed using Thermo Xcalibur (Qual Browser). Here, 1,3BDO was identified mainly as the sodium adduct [M+Na]+ with an m/z value of 113.05, owing to the salt in BG-11 medium, although the proton adduct [M+H]+ was also observed with lower intensity (see Appendix C Fig. C1). Using LC-MS, the detection limit of 1,3BDO in BG-11 medium was as low as 10 mg/L (~110 μM).
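The adduct masses and the molar detection limit quoted above can be checked from the molecular formula of 1,3BDO (C4H10O2). The sketch below is illustrative; the monoisotopic and average atomic masses are standard rounded values entered by hand (an assumption of this example, not taken from the thesis), and the function names are hypothetical.

```python
# Rounded standard atomic masses (assumed values for this illustration).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462, "Na": 22.98976928}
AVERAGE = {"C": 12.011, "H": 1.008, "O": 15.999}
ELECTRON_MASS = 0.00054858

BDO_FORMULA = {"C": 4, "H": 10, "O": 2}  # 1,3-butanediol


def formula_mass(formula: dict, table: dict) -> float:
    """Sum of atomic masses for a formula given as {element: count}."""
    return sum(table[element] * count for element, count in formula.items())


def adduct_mz(neutral_mass: float, adduct: str) -> float:
    """m/z of a singly charged [M+adduct]+ ion (adduct atom added, one electron removed)."""
    return neutral_mass + MONOISOTOPIC[adduct] - ELECTRON_MASS


def mg_per_l_to_micromolar(mg_per_l: float, molar_mass: float) -> float:
    """Convert a mass concentration (mg/L) to a molar concentration (uM)."""
    return mg_per_l / molar_mass * 1000.0


mono_mass = formula_mass(BDO_FORMULA, MONOISOTOPIC)
mz_sodium = adduct_mz(mono_mass, "Na")
mz_proton = adduct_mz(mono_mass, "H")
limit_um = mg_per_l_to_micromolar(10.0, formula_mass(BDO_FORMULA, AVERAGE))

print(f"[M+Na]+ = {mz_sodium:.4f}, [M+H]+ = {mz_proton:.4f}, limit = {limit_um:.0f} uM")
```

Both calculated ions fall inside the m/z 80-250 acquisition window, and the calculated sodium adduct lies close to the reported value of 113.05.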

5.3. Results and discussion

5.3.1. Whole-cell biotransformation using BH1352 and its variants expressed in BL21

Whole-cell biotransformation with E. coli BL21 expressing BH1352 and its variants was conducted to assess the impact of the rationally designed mutations on in vivo aldol condensation activity. Since it is difficult to measure the formation of 3-HB directly, BH1352 activity was assessed on the basis of 1,3BDO production from the added acetaldehyde (Fig. 5-4A). Moreover, the consumption of acetaldehyde is not a good indicator of DERA activity, because acetaldehyde is volatile and can be reduced to ethanol (Nemr et al., 2018). Instead, 1,3BDO formation through the reduction of 3-HB by promiscuous, endogenous aldehyde/alcohol dehydrogenases was used for the indirect assessment of the in vivo activity of BH1352 and the mutants (Rodriguez and Atsumi, 2012, 2014; Liang and Shen, 2017).

Here, Phe160Tyr (F160Y hereafter), Ile170Val (I170V), Met173Ile (M173I), and F160Y/M173I, which showed the greatest improvement in 1,3BDO formation in the in vitro DERA-AKR coupled reaction, were subjected to the whole-cell biotransformation test. During the first 2.5 h of reaction after the initial addition of acetaldehyde, F160Y and F160Y/M173I exhibited a 40% and a 34% increase in 1,3BDO production, respectively, while the two hydrophobic cluster mutations did not display any significant increase (Fig. 5-4B). After another 2.5 h, however, the difference between the wildtype and the variants diminished, such that only the double mutant (F160Y/M173I) showed a discernible increase in 1,3BDO formation. The overnight reaction (16 h) indicated that all of the mutants showed a similar improvement over the wildtype, with the difference ranging from 35 to 38% of the wildtype activity.

Although F160Y and F160Y/M173I exhibited a more than 2.5-fold increase in in vitro 1,3BDO formation, whole-cell biotransformation via overexpression of the BH1352 variants was not as effective, for two major reasons. First, the concentration of acetaldehyde was not controlled, owing to the existence of at least 44 endogenous, promiscuous aldehyde reductases in E. coli (Rodriguez and Atsumi, 2014). Since each culture batch had a slightly different amount of biomass, the consumption of acetaldehyde might have varied substantially even among batches expressing the same protein. Second, the absence of a 3-HB-specific reductase contributed to the lack of a driving force for carbon flux into 1,3BDO. Although the PA1127-catalyzed reduction of 3-HB is faster than the BH1352-catalyzed aldol condensation, an equivalent amount of PA1127 is required to facilitate the BH1352 reaction and optimize the assessment (see Appendix B Fig. B2). Therefore, the overexpression of BH1352 and its variants alone does not fully reflect the impact of the mutations in a biological system.


Figure 5-4. A: The in vivo activity assay of BH1352 and its variants was performed by expression in recombinant E. coli BL21 under the T7 promoter; B: The in vivo assay results comparing the wildtype BH1352 and its variants. The culture samples were collected at 2.5 h, 5 h, and 16 h after the initial addition of acetaldehyde and injected into the HPLC for the measurement of 1,3BDO.


5.3.2. Application of engineered BH1352 variants for in vivo production of 1,3BDO from glucose

For further in vivo study of the BH1352 mutations, a fed-batch bioreactor fermentation was conducted with the 1,3BDO pathway expressed in an engineered E. coli strain. According to Nemr et al., controlling pH and dissolved oxygen (DO) levels in reactors had a significant effect on strain performance and product profiles compared with flask experiments. Maintaining the pH at 7 could induce the expression of reductases specifically active on acetoin and lead to more (2R,3R)- or (2S,3S)-2,3-butanediol (2,3BDO) (Nemr et al., 2018). Controlling the air flow rate, in turn, seems to contribute to a higher flux towards carbon dioxide and to the repression of oxygen-sensitive enzymes, which may explain the decrease in meso-2,3BDO, which can be produced by a glycerol dehydrogenase that is anaerobically expressed in E. coli (Nielsen et al., 2010; Lv et al., 2016). Moreover, 2,3BDO and acetoin production may be favored at a lower pH as a mechanism for preventing excessive medium acidification (Vivijs et al., 2014). Importantly, the previous study suggests that significantly more 1,3BDO is produced in reactors than in flasks owing to the increased glucose consumption (Nemr et al., 2018).

To assess the impact of the mutations on the pathway, the LMSE51C strain and the pBD1 plasmid were used for heterologous expression of the 1,3BDO pathway genes (see Table 5.1). LMSE51C was designed by Nemr et al. to reduce the carbon flux to undesired byproducts in the MG1655 strain by deleting several genes involved in competing pathways, including ethanol-producing pathways (adhE, yqhD, eutG, adhP, and yjgB) and pyruvate-competing pathways (ldhA, pflB, ilvB, poxB, and pta) (see Appendix C Fig. C2). It was chosen because it was engineered to concentrate the carbon flux into the target synthetic pathway. The plasmid pBD1, harboring BH1352, PA1127, and PDC from Z. mobilis (ZmPDC) with no other optimization, was used as the control plasmid.

The engineered strains were characterized in bioreactors with pH controlled at 7.0, and a two-stage fermentation was implemented by supplementing extra glucose (3%) after expression of the 1,3BDO pathway proteins was induced. Under these conditions, LMSE51C expressing the wildtype BH1352, PA1127, and ZmPDC was inefficient in 1,3BDO formation, achieving a titer of 0.18 g/L with a yield of 4 mg/g of glucose (BDO-0, Fig. 5-5). This was far below the result reported by Nemr et al. because of the lower stirring speed (1,500 r.p.m. was used here owing to a mechanical issue with the bioreactors, whereas 1,750 and 1,850 r.p.m. were used by Nemr et al.). After IPTG induction, the DO level was observed to be too low to provide enough reducing cofactor for the pathway, which made the pathway inefficient. However, the fermentation conditions were controlled for each bioreactor and strain; therefore, the comparison of the mutations' impact on in vivo 1,3BDO synthesis was assumed to be valid.

In contrast, the engineered strain with the F160Y single mutation (BDO-1) produced 0.89 g/L of 1,3BDO with a yield of 18 mg/g, a significant improvement over BDO-0 (see Fig. 5-5). The double mutation (F160Y/M173I) in BH1352 (BDO-2) was even more efficacious, producing 1.09 g/L of 1,3BDO with a yield of 28 mg/g, a yield 7-fold higher than that of the strain with wildtype BH1352. As DERA aldol condensation was the bottleneck step in the 1,3BDO pathway, the increased activity of the BH1352 variants channeled more carbon flux into the final product, leading to a substantial increase in 1,3BDO production and glucose yield. Moreover, this approach can be combined with other strategies, including modification of the RBS for a higher translation rate and/or optimization of fermentation conditions, to further increase the productivity without causing additional metabolic burden.
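The fold improvements quoted above follow directly from the reported titers and yields. The short Python sketch below only recomputes those ratios from the numbers given in the text; the dictionary layout and the function name are illustrative choices, not part of the original analysis.

```python
# Titers (g/L) and yields (mg 1,3BDO per g glucose) as reported in the text.
results = {
    "BDO-0": {"titer": 0.18, "yield": 4.0},   # wild-type BH1352
    "BDO-1": {"titer": 0.89, "yield": 18.0},  # BH1352 F160Y
    "BDO-2": {"titer": 1.09, "yield": 28.0},  # BH1352 F160Y/M173I
}


def fold_change(strain, metric, reference="BDO-0"):
    """Ratio of a strain's metric to the same metric of the reference strain."""
    return results[strain][metric] / results[reference][metric]


for strain in ("BDO-1", "BDO-2"):
    print(
        strain,
        f"titer x{fold_change(strain, 'titer'):.1f}",
        f"yield x{fold_change(strain, 'yield'):.1f}",
    )
```

The titer ratios come out at roughly 5-fold (F160Y) and 6-fold (F160Y/M173I), and the yield ratio of the double mutant at 7-fold, in line with the values stated in this chapter.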

Figure 5-5. Production of 1,3BDO from glucose by E. coli cells expressing the aldolase-based 1,3BDO pathway with the wild-type and mutant BH1352. The E. coli strains used were BDO-0 (wild-type BH1352), BDO-1 (BH1352 F160Y), and BDO-2 (F160Y/M173I). The white bars show the production of 1,3BDO (g/L) by fermentation of corresponding strains, whereas the gray bars represent the corresponding yield of 1,3BDO (mg/g glucose). The results are shown as means (± S.D.) from duplicate experiments. Experimental details are described in Materials and Methods.

5.3.3. Extracellular 1,3BDO toxicity of PCC 7942

Although cyanobacteria have been engineered to produce a variety of biofuels and commodity chemicals, the productivity is still too low to be scaled up to an industrially applicable level. One of the major limiting factors is the low tolerance of cyanobacteria to chemical toxicity. For the model strain in this study, PCC 7942, long-chain alcohols, free fatty acids, and hydrocarbons were found to hardly affect growth and physiology, while short-chain alcohols such as ethanol had a relatively higher impact on cell growth (Ruffing and Trahan, 2014). A chemical toxicity tolerance study therefore needs to precede the examination of cyanobacterial conversion of the compound of interest, though such a study is inherently limited to extracellular chemical toxicity.

To evaluate the toxicity of 1,3BDO, the growth of WT PCC 7942 in cultures supplemented with 1,3BDO was measured (Fig. 5-6). The concentration range was chosen based on the titers of other products from engineered cyanobacteria. Since the highest titer of a cyanobacteria-derived product is ~4.5 g/L of ethanol, a range of 1 to 10 g/L of 1,3BDO was adopted in this study (Gao et al., 2012). According to the toxicity measurement, concentrations up to 5 g/L of 1,3BDO do not affect bacterial growth, and even 10 g/L of 1,3BDO does not cause a significant change in the growth rate of PCC 7942. Considering the practical target product concentration and the reported cyanobacterial production of 2,3BDO, an isomer of 1,3BDO (3.0 g/L from engineered PCC 7942), the toxicity of 1,3BDO is unlikely to be a major hurdle in this study (McEwen, Kanno and Atsumi, 2016).
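One common way to compare growth rates between such cultures is to fit the slope of ln(OD730) against time during exponential growth. The Python sketch below illustrates that calculation only; it is not the analysis used in this thesis, and the time points and OD values are synthetic numbers generated from a known rate purely to exercise the function.

```python
import math


def specific_growth_rate(times_h, od_values):
    """Least-squares slope of ln(OD) versus time, in units of 1/h."""
    logs = [math.log(od) for od in od_values]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    return sxy / sxx


# Synthetic, illustrative data (NOT measurements from this study):
# an ideal exponential culture with a rate of 0.03 per hour.
times = [0, 24, 48, 72]
synthetic_od = [0.05 * math.exp(0.03 * t) for t in times]

mu = specific_growth_rate(times, synthetic_od)
doubling_time_h = math.log(2) / mu
print(f"mu = {mu:.3f} 1/h, doubling time = {doubling_time_h:.1f} h")
```

Applied to each 1,3BDO concentration separately, the fitted rates could then be compared against the untreated control.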

Figure 5-6. Effect of 1,3BDO toxicity on the growth of WT PCC 7942. Error bars represent the standard deviations of triplicate measurements at each data point.

5.3.4. Transformation of Synechococcus elongatus PCC 7942

The transformation of PCC 7942 relies on homologous recombination between the cell’s chromosome and exogenous plasmid DNA that cannot replicate in the host and contains sequences homologous to the chromosome. The chromosomal region of integration (neutral site, NS1) was developed as a cloning locus (Clerico, Ditty and Golden, 2007), since it can be disrupted without any abnormal phenotype, thus allowing recombination of ectopic DNA sequences. When cells are transformed with vectors containing an antibiotic resistance cassette and sequences homologous to the neutral site, a double homologous recombination occurs between the vector and the PCC 7942 chromosome. The selective marker (spectinomycin) and the gene of interest driven by the psbA1 promoter are inserted into the neutral site, and the vector backbone including the origin of replication (pUC for pSyn_6) is lost, allowing the expression of the heterologous genes in PCC 7942 (Fig. 5-7).

Figure 5-7. Schematic representation of homologous recombination for PCC 7942 genome integration of the 1,3BDO pathway genes.

5.3.5. Cyanobacterial expression of 1,3BDO pathway genes

Constitutive psbA promoter-driven operon design

In this study, a constitutive promoter (PpsbA) was first used for the heterologous expression of the 1,3BDO pathway genes in PCC 7942. It is natively present in PCC 7942 and regulates the transcription of one of the psbA genes (psbAI), which encode an essential photosystem II reaction center protein, D1. Together, D1 and D2 are critical to the photosystem II complex, as they coordinate the cofactors of light-driven charge separation (Nair, Thomas and Golden, 2001).

According to extensive assays of constitutive promoters fused to luxAB in PCC 7942, PpsbA is among the strongest in this organism (Liu et al., 1995). Despite this detailed characterization, PpsbA has rarely been used for the metabolic engineering of PCC 7942.

Using the pSyn_6 vector harboring PpsbA (see Fig. 5-7), the three pathway genes were introduced into PCC 7942, yielding TK01 (with pTK01, arranged as ZmPDC-BH1352-PA1127) and TK02 (with pTK02, BH1352-PA1127-ZmPDC). These two engineered strains were grown along with the wildtype in a shaking incubator at 34ºC and 150 r.p.m. with 60 µE/m2·s of light energy (Fig. 5-8). We observed cell growth up to an OD730 of 4-5 in 7 days, and no growth impediment was observed in the variants (see Fig. 5-8). However, no measurable amount of 1,3BDO from TK01 or TK02 was detected by HPLC analysis, as predicted from the growth curve (see Appendix C Fig. C3). Since it was not clear whether insufficient gene expression or the lack of in vivo activity of the pathway enzymes was the reason for the absence of 1,3BDO formation, we adopted several synthetic biology approaches for more efficient and controllable gene expression as well as for physical observation of the expressed proteins.

Figure 5-8. Culturing of PCC 7942 and its variants (TK01 and TK02) in a shaking incubator (left) and growth curve of cyanobacteria culture (right).

Inducible promoter-driven operon design

The creation of synthetic biology systems that behave predictably for a specific goal often depends upon inducible promoters for the control of gene transcription. An ideal inducible promoter should meet at least a few of the following criteria: it is not activated in the absence of inducer; it produces a predictable response to a given concentration of inducer or repressor; the inducer at saturating concentrations has no harmful effect on the host organism; the inducer is affordable and stable under the growth conditions; and the inducible system acts orthogonally to the host cell’s transcriptional system (Berla et al., 2013). Among the several inducible promoters that meet some of the above criteria and have been demonstrated in PCC 7942, we adopted the IPTG-inducible trc promoter and the lacIq repressor for more controlled and stronger gene expression. This system has often been used in metabolic engineering of PCC 7942 for the production of 2,3BDO (McEwen, Kanno and Atsumi, 2016), 1-butanol (Lan and Liao, 2012), 1,3-propanediol (Li and Liao, 2013), and 2-methyl-1-butanol (Shen and Liao, 2012). Additionally, a 6× His-tag was attached to the N-terminus of each gene product to physically observe protein expression via nickel-nitrilotriacetic acid (Ni-NTA) purification. As DERA turned out to be rate-limiting, it was placed first within the operon with an RBS optimized for maximum translation, while AKR and PDC were given RBSs designed for a moderate level of translation (Salis, Mirsky and Voigt, 2009). Three DERA genes, BH1352 (TK09), EC1535 from E. coli (TK11), and TM1559 from T. maritima (TK12), were chosen as candidates based on in vitro screening (see Fig. 4-1B).

Using the trc promoter-driven expression system, TK12 with the TM1559 DERA was obtained and characterized first. TK12 and WT PCC 7942 were cultured in the designed photobioreactor (PBR), and the pathway was induced by adding 1 mM IPTG after 30 h of growth, at which point TK12 had reached exponential phase with an OD730 of about 0.6-0.8 (Fig. 5-9A). As shown in Figure 5-9B, the 1,3BDO pathway enzymes were expressed at varied levels. However, no measurable 1,3BDO was produced by either the TK12 culture or its cell lysate (see Appendix C, Fig. C4 A and B). This may not be surprising: given the extremely high growth temperature of the hyperthermophile T. maritima, TM1559 may not be active in mesophilic organisms such as PCC 7942. This is consistent with the previous study, in which TM1559 showed little in vivo aldol condensation in E. coli despite having the highest in vitro activity among the screened DERAs (Nemr et al., 2018).

Figure 5-9. A: Cell density of WT PCC 7942 and TK12; B: SDS-PAGE of the purified 1,3BDO pathway enzymes expressed in TK12. Each gene name and the protein size (kDa) are indicated.

Following the observation of heterologous enzyme expression in cyanobacteria, additional plasmids were designed for the expression of EcDERA (TK11), BH1352 (TK09), and BH1352 variants (F160Y and F160Y/M173I). Although the plasmids were correctly sequenced, strains TK11 and TK09 could not be constructed because of unidentified problems in transformation. The colony PCR test for segregation of the TK09 and TK11 strains indicated that none of the selected colonies had the pathway genes integrated into the genome (Fig. 5-10 A and B). Troubleshooting of the transformation, for example by codon optimization, is recommended for future work.

Figure 5-10. A: The 1,3BDO pathway operon designed to be integrated into the neutral site I of the genome of PCC 7942 with the fragment size indicated. The primers used for colony PCR are represented with black arrows at each end; B: Colony PCR of TK09 (left) and TK11 (right) transformation. PCR product using the same primers of PCC 7942 genome (1), the vector plasmid without the pathway genes (2) and the plasmid with the pathway genes (3) are shown as a control set. The red dashed boxes indicate where the band is supposed to be shown if the operon is correctly integrated into the genome.

5.4. Conclusion

5.4.1. In vivo study of engineered BH1352 in E. coli

In this study, we explored the in vivo effect of protein engineering of the rate-limiting enzyme on the 1,3BDO biosynthetic pathway. The two best mutations, F160Y and F160Y/M173I, were introduced into an E. coli strain engineered with the biosynthetic pathway, and both mutant strains exhibited a significant improvement over the wildtype BH1352. In fact, the mutations were even more effective in vivo than in vitro: the F160Y single mutation and the double mutation produced 5- and 6-fold more 1,3BDO than the wildtype, respectively. However, the obtained productivity, titer, and yield are still far from a practically scalable level, and much more optimization is required. Nevertheless, especially considering the insufficient oxygen supply due to the mechanical issue with the bioreactors, engineering of BH1352 was thus far the most effective optimization approach.

5.4.2. Cyanobacterial production of 1,3BDO using PDC-DERA-AKR pathway

One of the ultimate goals of this thesis was the demonstration of cyanobacterial conversion of CO2 into 1,3BDO via cyanobacterial engineering. Unfortunately, despite the characterization of the biosynthetic pathway and in vivo demonstration in E. coli, we were not able to obtain the cyanobacterial strain producing 1,3BDO yet. However, the cyanobacterial engineering system including experimental protocol and an in-house photobioreactor was established and optimized. The heterologous expression of the pathway genes was successfully achieved as well using DERA from T. maritima. Though no cyanobacterial production of 1,3BDO was demonstrated, this work could be used as a valuable cornerstone for follow-up study.

Chapter 6. Summary, Recommendations, and Outlook

6.1. Summary

Metabolic engineering efforts have led to some successful bioprocesses implemented at an industrial level to produce a variety of chemicals. Many challenges still have to be addressed to make such processes economically viable and extend lab-scale technologies to commercial scale ones. One area of improvement is to design the enzymes in engineered pathways. The difficulty of metabolic engineering often lies in the unavailability of known enzymes to catalyze steps required in novel biosynthetic pathways. Or, when enzymes with target activity are available, they may have insufficient catalytic efficiency to allow high enough productivity, yield, and titer of the target product. While microbial biosynthesis research efforts have been focusing on replacing existing chemicals in well-established markets, applying new chemistries can prove valuable for synthesizing novel chemicals for value-added applications, such as more robust polymers, biofuels, and pharmaceutical complexes. Therefore, exploring new enzyme activities can be beneficial for development of bioprocesses of chemicals.

This thesis investigated the protein design and application of a novel metabolic pathway for the production of 1,3BDO from renewable carbon sources (Fig. 6-1). The pathway consists of decarboxylation of pyruvate by pyruvate decarboxylase (PDC), aldol condensation of acetaldehyde by 2-deoxyribose-5-phosphate aldolase (DERA), and reduction by aldo-keto reductase (AKR). This series of reactions leads to a shorter pathway design requiring fewer enzymatic steps than the previously proposed pathway. While the enzymatic reactions of the pathway had been defined and proposed, the specific enzymes for maximal productivity had not been identified. Moreover, the rate-limiting step, aldol condensation by DERA, had to be improved for practical application of the proposed pathway. Therefore, we first identified novel AKR and DERA enzymes for optimization of the 1,3BDO biosynthetic pathway. The target enzymes were biochemically and structurally characterized for further applications, and the rate-limiting DERA enzyme was rationally engineered for higher production of 1,3BDO. The rationally designed mutations of DERA were applied in an engineered E. coli strain to demonstrate the in vivo effect of protein design. Cyanobacterial engineering with the target enzymes was also attempted for photosynthetic conversion of CO2 into 1,3BDO.

Figure 6-1. Protein design for optimization of 1,3BDO pathway and in vivo application in E. coli and Synechococcus elongatus. Enzyme screening, biochemical characterization, and structural analysis were conducted to identify the target enzymes (DERA and AKR), and the key step enzyme was rationally engineered to improve the catalytic activity of the target reaction. The protein-engineered pathway was then applied in biological system of E. coli and a cyanobacterial strain for biosynthesis of 1,3BDO from glucose and photosynthesis.

We chose the identification of novel AKRs as the first project for two reasons: the reducing activity on 3-hydroxybutanal (3-HB) is a non-natural reaction that had never been demonstrated, and the AKR had to be used as a reporter enzyme to assess the activity of DERA candidates in a coupled reaction (Chapter 3). In this study, we purified and screened 21 uncharacterized enzymes for NADPH-dependent reduction of 3-HB. Among the tested AKRs, PA1127 from Pseudomonas aeruginosa displayed the highest production of 1,3BDO from 3-HB, while several of the others also produced measurable amounts of the product. Unfortunately, we were not able to crystallize PA1127. Thus, we crystallized another AKR, STM2406 from Salmonella typhimurium, and used it for comparative structural analysis with the model PA1127 structure.

Next, the biochemical and structural characterization of the target enzymes was conducted. We established the substrate profile of the two AKRs by running reactions with 54 substrates, including aldehyde, ketone, and sugar compounds. Based on the substrate profile, key substrates with high activity were chosen, and the kinetic parameters of PA1127 and STM2406 on these substrates were measured. While both AKRs exhibited a wide substrate range, PA1127 was generally more active than STM2406 with most of the substrates. Then, we conducted a comparative structural analysis of the two AKRs to understand the structural features affecting AKR catalysis. Using the apo-structure and the NADP-bound structure of STM2406, the key residues for substrate binding were identified. The corresponding residues of PA1127 were also determined via structural alignment of the two enzymes. A site-directed mutagenesis study of the key residues revealed that Asn65 of STM2406 is critical in catalysis: substitution of Asn65 with bulkier hydrophobic amino acids such as Met, Ile, and Phe improved the activity of STM2406 on 3-HB, methylglyoxal, and aromatic aldehyde compounds. In fact, the Asn65Met mutation of STM2406 increased the catalytic efficiency (kcat/Km) 6-fold on 3-HB, and Asn65Phe exhibited a 5-fold increase in catalytic efficiency with nitrobenzaldehyde. In addition, we determined the functional role of the C-terminal loop of AKRs by designing truncated mutants of PA1127 and Bacillus AKRs. The presence of the C-terminal loop was found to be critical in AKR catalysis, partially explaining why STM2406, which does not possess the C-terminal loop, shows lower activity than PA1127.

Our work demonstrated a reduction of 3-HB using AKRs for the application in 1,3BDO biosynthesis. Since 3-HB represents an uncharacterized, non-natural substrate for AKRs, the identification of promiscuous activity toward this chemical represents a starting point for developing new biocatalysts and proves them attractive as biocatalysts for the enzyme-based biotransformation of non-natural substrates, including industrially important aldehydes and ketones. This study also expands the understanding of the molecular mechanisms of the substrate selectivity of AKRs and demonstrates the potential for protein engineering of these enzymes for further applications in the production of various chemicals.

In Chapter 4, we identified a novel DERA with high aldol condensation activity on acetaldehyde. Over 2,500 DERA orthologue sequences were subjected to bioinformatic analysis, revealing that there are at least five major clusters of DERA enzymes. We chose 20 of them, representing all the major groups and non-classified sequences, and purified them for a screening assay. Using PA1127, a DERA-AKR coupled reaction system was designed to measure how much acetaldehyde is converted into 1,3BDO. Here, BH1352 from Bacillus halodurans was found to be one of the most active DERAs, along with TM1559 from Thermotoga maritima and DeoC from E. coli. However, since BH1352 displayed the highest in vivo activity, it was chosen as the target enzyme.

BH1352 was crystallized, and we obtained a crystal structure at a resolution of 2.50 Å. On the basis of the structural analysis, we identified the key residues involved in protein-substrate binding, and Ala-scanning was conducted on those residues to understand their functional roles. Furthermore, unlike the previously reported DERA crystal structures, the BH1352 structure reveals the C-terminal tail, including the Tyr224 residue. In the previously determined DERA structures, the C-terminal loop is generally intrinsically disordered, so the electron density was too weak to be captured. We conducted a site-directed mutagenesis study on the C-terminal Tyr224 and discovered that Tyr224 seems to be involved in binding the phosphate group of substrates, while it barely affects the aldol condensation.

The structural analysis was then combined with a protein sequence alignment study to pinpoint the key residues of BH1352 for aldol condensation, and several residues were rationally chosen for site-directed mutagenesis. We found two major sites in the BH1352 structure that might be related to catalysis: a hydrophobic cluster near the catalytic residue and the substrate entrance region. Based on the mutation study, Phe160 at the substrate entrance region was the key residue: the Phe160Tyr single mutation produced over 2.5-fold more 1,3BDO from the DERA-AKR coupled reaction than the wildtype. The combination of Phe160Tyr with Met173Ile (from the hydrophobic cluster) improved the activity even further, with 2.6 times more 1,3BDO produced from the coupled reaction.

We also investigated another DERA-catalyzed reaction for the synthesis of a statin drug precursor. Previously, some DERAs were known to be capable of condensing two acetaldehyde molecules and a chloroacetaldehyde in a sequential manner, yielding (3R,5S)-6-chloro-2,4,6-trideoxyhexapyranoside (CTHP). Statin drugs are extensively used for the treatment of cardiovascular diseases. Despite the nearly $20 billion global market for statin drugs, only a handful of DERAs have been reported for this reaction. In this study, we demonstrated that nearly 75% conversion of the substrates into CTHP is achieved within 30 min using BH1352.

In this project, a novel DERA with a high aldol condensation activity was identified using a DERA-AKR coupled assay. The crystal structure of BH1352 revealed two protomers per asymmetric unit with multiple conformations of the C-terminal tail, which has not previously been determined. We explored all three major reactions of DERA, including 2-deoxyribose-5-phosphate cleavage, acetaldehyde aldol condensation (1,3BDO formation), and a sequential condensation of two acetaldehyde molecules and one chloroacetaldehyde, with a detailed structural analysis and biochemical characterization. Rational protein engineering was conducted to enhance the aldol condensation activity, and the improved mutants were subjected to in vivo study in E. coli and a cyanobacterial strain.

The last chapter investigated the effect of engineered BH1352 in biological systems. First, the mutations with improved aldol condensation activity, including Phe160Tyr, Ile170Val, Met173Ile, and Phe160Tyr/Met173Ile, were overexpressed in BL21, and whole-cell biotransformation of acetaldehyde into 1,3BDO was conducted. However, unlike the in vitro results, the whole-cell biotransformation did not display a significant change due to the mutations. There are two main reasons: several endogenous reductases in BL21 consumed acetaldehyde, which might have made the substrate concentration much lower than intended, and the lack of a 3-HB-specific reductase led to the loss of the driving force for carbon flux towards 1,3BDO. Therefore, the two best mutations from the in vitro study, Phe160Tyr and Phe160Tyr/Met173Ile, were subjected to a fermentation test.

For glucose fermentation using the 1,3BDO pathway, the LMSE51C strain was used as the host cell. It was designed to concentrate the carbon flux towards 1,3BDO by deleting genes involved in competing pathways. The pathway genes, BH1352, PA1127, and ZmPDC, were expressed, and the fed-batch fermentation method was implemented to provide sufficient glucose for the production of 1,3BDO. The engineered BH1352 turned out to be more effective in fed-batch fermentation than in the in vitro reaction: the Phe160Tyr single mutation produced nearly 5 times more 1,3BDO (0.89 g/L) than the strain with wildtype BH1352 (0.18 g/L), and the Phe160Tyr/Met173Ile double mutation improved the 1,3BDO production 6 times (1.09 g/L) with a 7-fold higher glucose yield (0.028 g/g glucose). This study validated the significance of protein engineering in metabolic engineering, although the increase was achieved under a non-optimized fermentation condition. Moreover, this approach can be combined with other optimization strategies developed in the previous study by Dr. Nemr, such as optimization of the translation rate of each pathway gene and the introduction of extra copies of pathway genes.

Lastly, we investigated the photosynthetic conversion of CO2 into 1,3BDO via cyanobacterial engineering. In this work, Synechococcus elongatus PCC 7942 was used as the host strain for heterologous expression of ZmPDC, DERA, and AKR (PA1127). The heterologous expression of the target pathway genes was conducted via homologous recombination for genome integration. The genome integration of ZmPDC, BH1352, and PA1127 under a constitutive promoter (PpsbA) was obtained, but no measurable amount of 1,3BDO was produced from the engineered strain.

Assuming that the promoter was not strong enough for sufficient expression of the pathway genes, a trc promoter-driven expression system was adopted. Using the trc promoter, we were able to observe the cyanobacterial protein expression of ZmPDC, PA1127, and TM1559, but no 1,3BDO was detected from the strain culture. This strain (TK12) with TM1559 was unlikely to produce 1,3BDO because the DERA is from a hyperthermophilic organism and exhibited only marginal activity in BL21. Unfortunately, the strains with EcDERA, BH1352, and its variants were not obtained due to an unresolved transformation issue. Although cyanobacterial production of 1,3BDO was not achieved, a platform for cyanobacterial engineering was established in this department, including the development of an in-house photobioreactor, cellular engineering protocols, and analytical methods.

In sum, we have demonstrated the protein design strategy for optimization of 1,3BDO biosynthetic pathway. In vitro screening of enzymes discovered novel enzymes, and structure-guided protein engineering was conducted to improve the target activity. In vivo application of the engineered protein proved the protein design beneficial in metabolic engineering.

6.2. Recommendations

The following recommendations are proposed based on the engineered strains and pathways explored in this thesis, in order to extend their utility for practical applications.

6.2.1. Address the reducing cofactor balance

It is known that NADH is nearly 4 times more abundant than NADPH in E. coli when cultured with glucose as a carbon source (Andersen and Meyenburg, 1977). However, the reduction of 3-HB by the PA1127 AKR is strictly dependent on NADPH. In the in vitro DERA-AKR coupled assay, the ratio of AKR to DERA had a greater effect on 1,3BDO production than expected, considering the much higher catalytic activity of PA1127 than BH1352 (see Appendix B Fig. B2). Thus, the insufficient availability of NADPH might be rate-limiting for the pathway.

One strategy to resolve the cofactor supply is to engineer the cofactor preference of an NADPH-dependent AKR. While only NADPH-dependent AKRs were purified and tested against 3-HB in this study, there are several AKRs active with NADH, including XYL1 from Candida tenuis (AKR2B5). It displays dual cofactor specificity for both NADH and NADPH, and the structural analysis of such enzymes could be used as a model for protein engineering of NADPH-dependent AKRs.

The essential structural difference between dual co-substrate specificity and NADPH dependence lies in the residues involved in binding the adenosine 2’-phosphate of NADPH. As the phosphate is negatively charged, positively charged amino acids such as Arg and Lys are often found in the 2’-phosphate binding pockets of strictly NADPH-dependent AKRs (Fig. 6-2A). Fig. 6-2A shows the cofactor binding of two NADPH-dependent AKRs: STM2406, which is one of the two AKRs studied in this thesis, and BSU0953, which shares 61% sequence identity with PA1127. While STM2406 has Lys300 and Arg232 interacting with the adenosine 2’-phosphate, BSU0953 possesses four positively charged residues (Lys214, Arg227, Arg282, and Lys283) for binding the phosphate, all of which are shared by PA1127. The binding mode of these two AKRs reveals that the positively charged residues are essential in cofactor binding of NADPH and may affect the catalytic activity, considering the higher activity of PA1127 and BSU0953 compared with STM2406.

Figure 6-2. The co-substrate binding of NADPH-dependent AKRs (A) and XYL1 (AKR2B5) with a dual co-substrate specificity (B). The structural segment including Glu227 of XYL1 is not determined, but the binding mode was proposed by Luccio et al. (2006).

On the other hand, the positively charged residues of AKRs with dual cofactor specificity are not as essential, while negatively charged residues such as Asp and Glu and/or polar amino acids such as Gln and Asn often serve to repel the 2’ phosphate and accept hydrogen bonds from the 2’ and 3’ ribose hydroxyls (Fig. 6-2B) (Luccio, Elling and Wilson, 2006; Cahn et al., 2016). Structure-guided enzyme engineering for switching cofactor preference therefore relies on the specificity-determining motif. Based on this idea, Dr. Arnold and her colleagues developed a web-based enzyme engineering platform to modify cofactor preference (Cofactor Specificity Reversal: Structural Analysis and Library Design, http://www.che.caltech.edu/groups/fha/CSRSALAD/index.html) (Cahn et al., 2016). Previous studies suggest that targeting a limited set of residues would be sufficient for switching cofactor preference, and this web interface could be useful to extend the application of the 1,3BDO biosynthetic pathway.

Another strategy is to improve the availability of NADPH. For microbial production of chemicals, increasing cofactor availability is a chief hurdle in the generation of efficient production platforms. One E. coli engineering study showed that improving overall NADPH production increased the productivity of an introduced NADPH-dependent pathway (Chemler et al., 2010). In that study, the deletion of pgi (glucose-6-phosphate isomerase), pldA (phospholipase A), and ppc (phosphoenolpyruvate carboxylase) led to 4-fold higher production of leucocyanidin, whose synthesis is NADPH-dependent. We could also couple this strategy with the engineered BH1352 and PA1127 to maximize 1,3BDO production in E. coli.

6.2.2. Combinatorial optimization of the translation rate of pathway genes

To take further advantage of the BH1352 variants in 1,3BDO biosynthesis, combinatorial design of RBS sequences can be a useful option for optimizing protein expression and balancing enzyme activities. The 1,3BDO pathway is organized as an operon in which a single promoter controls the transcription of all three pathway genes. The use of a single operon is advantageous because it prevents variability in transcriptional control while allowing the total protein load to be adjusted by varying the strength of the single promoter (Glick, 1995; Pfleger et al., 2006). The translation of each gene in the operon from the single mRNA can be modulated by the intergenic sequence containing the RBS (Pfleger et al., 2006). In a previous study, optimizing the pathway protein balance by combinatorial design of the 5’-untranslated region nearly doubled the productivity of cyanobacterial 2,3-butanediol synthesis (Oliver et al., 2014). More tailored modulation of the 1,3BDO pathway proteins could therefore be an effective approach to enhance 1,3BDO production in E. coli.
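The size of such a combinatorial library can be illustrated with a minimal sketch. It assumes three pathway genes in operon order (PDC, DERA, AKR) and a hypothetical set of three RBS strength classes per gene; the labels "weak", "medium" and "strong" are placeholders for illustration only and do not correspond to sequences used in this work.

```python
from itertools import product

# Pathway genes in operon order (PDC, DERA, AKR, as described in this thesis).
GENES = ("pdc", "dera", "akr")

# Hypothetical RBS strength classes; a real library would use actual sequences.
RBS_VARIANTS = ("weak", "medium", "strong")


def enumerate_rbs_designs(genes=GENES, variants=RBS_VARIANTS):
    """Return every assignment of one RBS variant to each pathway gene."""
    return [
        dict(zip(genes, combo))
        for combo in product(variants, repeat=len(genes))
    ]


designs = enumerate_rbs_designs()
# Library size grows as (number of variants) ** (number of genes).
print(len(designs))
print(designs[0])
```

With three variants per gene the library already contains 27 operon designs, which indicates how quickly the screening effort grows as more RBS variants are included.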


6.2.3. Troubleshooting cyanobacterial transformation

To resolve the transformation inefficiency, the target operon can be delivered as linear DNA fragments. A recent study on the transformation of PCC 7942 showed that using linear fragments together with EDTA-mediated inhibition of DNases improved the transformation efficiency (Almeida et al., 2017). EDTA acts as an inhibitor of endogenous exonucleases, which facilitates the survival of transformants and improves the efficiency up to 8-fold compared with plasmid DNA transformation (Almeida et al., 2017). This could be an efficient strategy for integrating the 1,3BDO pathway into PCC 7942 and will be pursued in future work.

6.3. Outlook

Here we have shown a protein design strategy for optimizing a novel metabolic pathway to produce 1,3BDO from a renewable carbon source. In vivo demonstration of the engineered DERA suggested that in vitro-based design can be even more effective in a biological system. In addition, since protein engineering improves productivity at the molecular level, it is unlikely to create a metabolic burden or perturbation and can be coupled with other strategies such as translation rate optimization or cellular metabolic engineering. However, for this technology to be applied at an industrial level, significant optimization efforts are needed to overcome the limited productivity of the engineered pathway. Moreover, cellular engineering approaches should be pursued in parallel to address the toxicity of aldehyde intermediates and the requirement for reducing cofactors.

In addition, aldolase-catalyzed aldol reactions have potential in the biosynthesis of organic molecules. In particular, DERA condenses various aldehydes with acetaldehyde in a sequential manner, so a strain overexpressing DERA could be used as a platform for the biosynthesis of building blocks for complex molecules. For practical application of the variety of DERA-catalyzed reactions, an affordable and sustainable technology for product identification and separation should be developed. Furthermore, commercially viable uses of the products and suitable downstream processes should be identified.


Bibliography

Adams, P. D. et al. (2010) ‘PHENIX: A comprehensive Python-based system for macromolecular structure solution’, Acta Crystallographica Section D: Biological Crystallography, 66(2), pp. 213–221.

Adsul, M. G. et al. (2011) ‘Development of biocatalysts for production of commodity chemicals from lignocellulosic biomass’, Bioresource Technology. Elsevier Ltd, 102(6), pp. 4304–4312.

Almeida, D. V. et al. (2017) ‘Improved genetic transformation of Synechococcus elongatus PCC 7942 using linear DNA fragments in association with a DNase inhibitor’, Biotechnology Research and Innovation. Sociedade Brasileira de Biotecnologia, 1(1), pp. 123–128.

Andersen, K. B. and Meyenburg, K. Von (1977) ‘Charges of Nicotinamide Adenine Nucleotides and Adenylate Energy Charge as Regulatory Parameters of the Metabolism in Escherichia coli’, The Journal of biological chemistry, 252(12), pp. 4151–4156.

Anderson, S. et al. (1985) ‘Production of 2-Keto-L-Gulonate, an Intermediate in L-Ascorbate Synthesis, by a Genetically Modified Erwinia herbicola’, Science, 230(4722), pp. 144–149. Available at: http://science.sciencemag.org/content/230/4722/144.

Atalla, A., Breyer-Pfaff, U. and Maser, E. (2000) ‘Purification and characterization of oxidoreductases-catalyzing carbonyl reduction of the tobacco-specific nitrosamine 4-methylnitrosamino-1-(3-pyridyl)-1-butanone (NNK) in human liver cytosol’, Xenobiotica, 30(8), pp. 755–769.

Bailey, J. E. (1991) ‘Toward a Science of Metabolic Engineering’, Science, 252(5013), pp. 1668–1675.

Bala, G. et al. (2007) ‘Combined climate and carbon-cycle effects of large-scale deforestation’, Proceedings of the National Academy of Sciences, 104(16), pp. 6550–6555.

Banta, S. et al. (2002) ‘Optimizing an artificial metabolic pathway: Engineering the cofactor specificity of Corynebacterium 2,5-diketo-D-gluconic acid reductase for use in vitamin C biosynthesis’, Biochemistry, 41(20), pp. 6226–6236. doi: 10.1021/bi015987b.

Barbas, C. F., Wang, Y. F. and Wong, C. H. (1990) ‘Deoxyribose-5-phosphate aldolase as a synthetic catalyst’, Journal of the American Chemical Society, 112(5), pp. 2013–2014.


Barski, O. A., Tipparaju, S. M. and Bhatnagar, A. (2008) ‘The aldo-keto reductase superfamily and its role in drug metabolism and detoxification.’, Drug metabolism reviews, 40(4), pp. 553–624.

Berla, B. M. et al. (2013) ‘Synthetic biology of cyanobacteria: Unique challenges and opportunities’, Frontiers in Microbiology, 4, pp. 1–14.

Bohren, K. M. et al. (1989) ‘The aldo-keto reductase superfamily. cDNAs and deduced amino acid sequences of human aldehyde and aldose reductases’, Journal of Biological Chemistry, 264(16), pp. 9547–9561.

Bohren, K. M., Grimshaw, C. E. and Gabbay, K. H. (1992) ‘Catalytic effectiveness of human aldose reductase. Critical role of C-terminal domain’, Journal of Biological Chemistry, 267(29), pp. 20965–20970.

Bornscheuer, U. T. et al. (2012) ‘Engineering the third wave of biocatalysis’, Nature. Nature Publishing Group, 485(7397), pp. 185–194.

Brändén, C.-I. (1991) ‘The TIM barrel—the most frequently occurring folding motif in proteins’, Current Opinion in Structural Biology, 1(6), pp. 978–983.

Breuer, M. and Hauer, B. (2003) ‘Carbon-carbon coupling in biotransformation’, Current Opinion in Biotechnology, 14(6), pp. 570–576.

Broddrick, J. T. et al. (2016) ‘Unique attributes of cyanobacterial metabolism revealed by improved genome-scale metabolic modeling and essential gene analysis’, Proceedings of the National Academy of Sciences, 113(51), pp. E8344–E8353.

Brovetto, M. et al. (2011) ‘C-C bond-forming lyases in organic synthesis’, Chemical Reviews, 111(7), pp. 4346–4403.

Bruice, P. Y. (1998) Organic Chemistry. 2nd edn. Upper Saddle River, New Jersey: Prentice-Hall, Inc.

Burczynski, M. E. et al. (2001) ‘The Reactive Oxygen Species- and Michael Acceptor-inducible Human Aldo-Keto Reductase AKR1C1 Reduces the α,β-Unsaturated Aldehyde 4-Hydroxy-2-nonenal to 1,4-Dihydroxy-2-nonene’, Journal of Biological Chemistry, 276(4), pp. 2890–2897.


Burgard, A. et al. (2016) ‘Development of a commercial scale process for production of 1,4-butanediol from sugar’, Current Opinion in Biotechnology. Elsevier Ltd, 42, pp. 118–125.

Burk, M. J. (2010) ‘Sustainable production of industrial chemicals from sugars’, International Sugar Journal, 112, pp. 30–35.

Cahn, J. K. B. et al. (2016) ‘A General Tool for Engineering the NAD/NADP Cofactor Preference of Oxidoreductases’, ACS Synthetic Biology, 6, pp. 326–333.

Calam, E. et al. (2013) ‘Biocatalytic production of alpha-hydroxy ketones and vicinal diols by yeast and human aldo-keto reductases’, Chemico-Biological Interactions, 202(1–3), pp. 195–203.

Cao, T.-P. et al. (2016) ‘Structural insight for substrate tolerance to 2-deoxyribose-5-phosphate aldolase from the pathogen Streptococcus suis’, Journal of Microbiology, 54(4), pp. 311–321.

Cao, Y. X. et al. (2016) ‘Heterologous biosynthesis and manipulation of alkanes in Escherichia coli’, Metabolic Engineering. Elsevier, 38, pp. 19–28.

Cavalieri, D. et al. (2003) ‘Evidence for S. cerevisiae Fermentation in Ancient Wine’, Journal of Molecular Evolution, 57, pp. 226–232.

Chemler, J. A. et al. (2010) ‘Improving NADPH availability for natural product biosynthesis in Escherichia coli by metabolic engineering’, Metabolic Engineering. Elsevier, 12(2), pp. 96–104.

Chen, L., Dumas, D. P. and Wong, C. (1992) ‘Deoxyribose-5-phosphate Aldolase as a Catalyst in Asymmetric Aldol Condensation’, Journal of the American Chemical Society, 114, pp. 741–748.

Chen, Z. and Zeng, A. P. (2013) ‘Protein design in systems metabolic engineering for industrial strain development’, Biotechnology Journal, 8(5), pp. 523–533.

Chung, S. S. M. and Chung, S. K. (2003) ‘Genetic Analysis of Aldose Reductase in Diabetic Complications’, Current Medicinal Chemistry, 10(15), pp. 1375–1387.

Clapés, P. et al. (2010) ‘Recent progress in stereoselective synthesis with aldolases’, Current Opinion in Chemical Biology, 14(2), pp. 154–167.


Clerico, E. M., Ditty, J. L. and Golden, S. S. (2007) ‘Specialized Techniques for Site-Directed Mutagenesis in Cyanobacteria’, in Rosato, E. (ed.) Methods in Molecular Biology. Springer Protocols, Humana Press, pp. 155–171.

Cohen, S. N. et al. (1973) ‘Construction of Biologically Functional Bacterial Plasmids In Vitro’, Proceedings of the National Academy of Sciences, 70(11), pp. 3240–3244.

Cooper, W. C., Jin, Y. and Penning, T. M. (2007) ‘Elucidation of a complete kinetic mechanism for a mammalian hydroxysteroid dehydrogenase (HSD) and identification of all enzyme forms on the reaction coordinate: The example of rat liver 3α-HSD (AKR1C9)’, Journal of Biological Chemistry, 282(46), pp. 33484–33493.

Currie, J. N. (1917) ‘The citric acid fermentation of Aspergillus niger’, Journal of Biological Chemistry, 31(1), pp. 15–37.

Delneri, D., Gardner, D. C. J. and Oliver, S. G. (1999) ‘Analysis of the seven-member AAD gene set demonstrates that genetic redundancy in yeast may be more apparent than real’, Genetics, 153(4), pp. 1591–1600.

Deng, M. De and Coleman, J. R. (1999) ‘Ethanol synthesis by genetic engineering in cyanobacteria’, Applied and Environmental Microbiology, 65(2), pp. 523–528.

DeSantis, G. et al. (2003) ‘Structure-based mutagenesis approaches toward expanding the substrate specificity of D-2-deoxyribose-5-phosphate aldolase’, Bioorganic and Medicinal Chemistry, 11(1), pp. 43–52.

Dexter, J. and Fu, P. (2009) ‘Metabolic engineering of cyanobacteria for ethanol production’, Energy & Environmental Science, 2(8), p. 857.

Dick, M., Hartmann, R., et al. (2016) ‘Mechanism-based inhibition of an aldolase at high concentrations of its natural substrate acetaldehyde: Structural insights and protective strategies’, Chem. Sci., (7), pp. 4492–4502.

Dick, M., Weiergräber, O. H., et al. (2016) ‘Trading off stability against activity in extremophilic aldolases’, Scientific Reports, 6, p. 17908.


Ducat, D. C., Way, J. C. and Silver, P. A. (2011) ‘Engineering cyanobacteria to generate high-value products’, Trends in Biotechnology. Elsevier Ltd, 29(2), pp. 95–103.

Ehrensberger, A. H. and Wilson, D. K. (2004) ‘Structural and Catalytic Diversity in the Two Family 11 Aldo-keto Reductases’, Journal of Molecular Biology, 337(3), pp. 661–673.

Ehrensberger, A. and Wilson, D. K. (2003) ‘Expression, crystallization and activities of the two family 11 aldo-keto reductases from Bacillus subtilis’, Acta Crystallographica - Section D Biological Crystallography, 59(2), pp. 375–377.

Ellis, E. M. (2002) ‘Microbial aldo-keto reductases’, FEMS Microbiology Letters, 216, pp. 123–131.

Emsley, P. et al. (2010) ‘Features and development of Coot’, Acta Crystallographica Section D Biological Crystallography, 66(4), pp. 486–501.

Emsley, P. and Cowtan, K. (2004) ‘Coot: Model-building tools for molecular graphics’, Acta Crystallographica Section D: Biological Crystallography. International Union of Crystallography, 60(12 I), pp. 2126–2132.

Fei, H. et al. (2017) ‘An industrially applied biocatalyst: 2-Deoxy-D-ribose-5-phosphate aldolase’, Process Biochemistry. Elsevier, 63(August), pp. 55–59.

Feske, B. D., Kaluzna, I. A. and Stewart, J. D. (2005) ‘Enantiodivergent, Biocatalytic Routes to Both Taxol Side Chain Antipodes’, The Journal of Organic Chemistry, 70(23), pp. 9654–9657.

Fessner, W. D. and Helaine, V. (2001) ‘Biocatalytic synthesis of hydroxylated natural products using aldolases and related enzymes’, Current Opinion in Biotechnology, 12(6), pp. 574–586.

Ford, G. and Ellis, E. M. (2002) ‘Characterization of Ypr1p from Saccharomyces cerevisiae as a 2-methylbutyraldehyde reductase’, Yeast, 19(12), pp. 1087–1096.

Fu, L. et al. (2012) ‘CD-HIT: Accelerated for clustering the next-generation sequencing data’, Bioinformatics, 28(23), pp. 3150–3152.


Fu, P. (2009) ‘Genome-scale modeling of Synechocystis sp. PCC 6803 and prediction of pathway insertion’, Journal of Chemical Technology and Biotechnology, 84(4), pp. 473–483.

Gal, J. (2008) ‘The discovery of biological enantioselectivity: Louis Pasteur and the fermentation of tartaric acid, 1857—A review and analysis 150 yr later’, Chirality, 20(1), pp. 5–19.

Gao, Z. et al. (2012) ‘Photosynthetic production of ethanol from carbon dioxide in genetically engineered cyanobacteria’, Energy & Environmental Science, pp. 9857–9865.

Gefflaut, T. et al. (1995) ‘Review Class I Aldolases’, Prog.Biophys.Molec.Biol., 63, pp. 301–340.

Gibson, D. G. et al. (2009) ‘Enzymatic assembly of DNA molecules up to several hundred kilobases’, Nature Methods, 6(5), pp. 12–16.

Gijsen, H. J. M. and Wong, C.-H. (1994) ‘Unprecedented Asymmetric Aldol Reactions with Three Aldehyde Substrates Catalyzed by 2-Deoxyribose-5-phosphate Aldolase’, Journal of the American Chemical Society, 116(18), pp. 8422–8423.

Glick, B. R. (1995) ‘Metabolic load and heterogenous gene expression’, Biotechnology Advances, 13(2), pp. 247–261.

Global Business Intelligence (2013) Statins Market to 2018 - Weak Product Pipeline and Shift of Focus towards Combination Therapies will Lead to Erosion of Brand Share, GBI Research. Available at: http://www.gbiresearch.com/report-store/market-reports/therapy-analysis/statins-market-to-2018-weak-product-pipeline-and-shift-of-focus-towards-combination-therapies-will-lead-to-erosion-of-brand-s.

González-Pajuelo, M. et al. (2005) ‘Metabolic engineering of Clostridium acetobutylicum for the industrial production of 1,3-propanediol from glycerol’, Metabolic Engineering, 7(5–6), pp. 329–336.

Gouveia-Oliveira, R., Sackett, P. W. and Pedersen, A. G. (2007) ‘MaxAlign: Maximizing usable data in an alignment’, BMC Bioinformatics, 8(312), pp. 1–8.

Grant, A. W. et al. (2003) ‘A novel aldo-keto reductase from Escherichia coli can increase resistance to methylglyoxal toxicity’, FEMS Microbiology Letters, 218(1), pp. 93–99.


Greenberg, W. et al. (2004) ‘Development of an efficient, scalable, aldolase-catalyzed process for enantioselective synthesis of statin intermediates.’, Proceedings of the National Academy of Sciences of the United States of America, 101(16), pp. 5788–5793.

Griese, M., Lange, C. and Soppa, J. (2011) ‘Ploidy in cyanobacteria’, FEMS Microbiology Letters, 323(2), pp. 124–131.

Gulevich, A. Y. et al. (2016) ‘Metabolic engineering of Escherichia coli for 1,3-butanediol biosynthesis through the inverted fatty acid β-oxidation cycle’, Applied Biochemistry and Microbiology, 52(1), pp. 21–29.

Gunawan, C. et al. (2007) ‘Yeast pyruvate decarboxylases: Variation in biocatalytic characteristics for (R)-phenylacetylcarbinol production’, FEMS Yeast Research, 7(1), pp. 33–39.

Hahn-Hägerdal, B. et al. (1994) ‘Biochemistry and physiology of xylose fermentation by yeasts’, Enzyme and Microbial Technology, 16(11), pp. 933–943.

Hallborn, J. et al. (1991) ‘Xylitol production by recombinant Saccharomyces cerevisiae’, Bio/technology (Nature Publishing Company), 9(11), pp. 1090–1095.

Haridas, M., Abdelraheem, E. M. M. and Hanefeld, U. (2018) ‘2-Deoxy-D-ribose-5-phosphate aldolase (DERA): applications and modifications’, Applied microbiology and biotechnology. Applied Microbiology and Biotechnology, 102, pp. 9959–9971.

Hatti-Kaul, R. et al. (2007) ‘Industrial biotechnology for the production of bio-based chemicals - a cradle-to-grave perspective’, Trends in Biotechnology, 25(3), pp. 119–124.

Hatzimanikatis, V. et al. (2005) ‘Exploring the diversity of complex metabolic networks’, Bioinformatics, 21(8), pp. 1603–1609.

Heine, A. et al. (2001) ‘Observation of covalent intermediates in an enzyme mechanism at atomic resolution.’, Science (New York, N.Y.), 294(5541), pp. 369–74.

Heine, A. et al. (2004) ‘Analysis of the class I aldolase binding site architecture based on the crystal structure of 2-deoxyribose-5-phosphate aldolase at 0.99A resolution.’, Journal of molecular biology, 343(4), pp. 1019–34.


Hoffee, P. et al. (1974) ‘Deoxyribose-5-P Aldolase: Subunit Structure and Composition of Active Site Lysine Region’, Archives of biochemistry and biophysics, 164, pp. 736–742.

Hoog, S. S. et al. (1994) ‘Three-dimensional structure of rat liver 3 alpha-hydroxysteroid/dihydrodiol dehydrogenase: a member of the aldo-keto reductase superfamily’, Proceedings of the National Academy of Sciences of the United States of America, 91(7), pp. 2517–2521.

Houk, K. N. and List, B. (2004) ‘Asymmetric organocatalysis’, Accounts of Chemical Research, 37(8), p. 487.

Hoyos, P. et al. (2010) ‘Biocatalytic strategies for the asymmetric synthesis of alpha-hydroxy ketones.’, Accounts of chemical research, 43(2), pp. 288–299.

Huisman, G. W., Liang, J. and Krebber, A. (2010) ‘Practical chiral alcohol manufacture using ketoreductases’, Current Opinion in Chemical Biology. Elsevier Ltd, 14(2), pp. 122–129.

Hyndman, D. et al. (2003) ‘The aldo-keto reductase superfamily homepage’, Chemico-Biological Interactions, 143–144, pp. 621–631.

Ichikawa, N. et al. (2005) ‘PIO study on 1,3-butanediol dehydration over CeO2 (1 1 1) surface’, Journal of Molecular Catalysis A: Chemical, 231(1–2), pp. 181–189.

Ichikawa, N. et al. (2006) ‘Catalytic reaction of 1,3-butanediol over solid acids’, Journal of Molecular Catalysis A: Chemical, 256(1–2), pp. 106–112.

Jang, Y. S. et al. (2012) ‘Bio-based production of C2-C6 platform chemicals’, Biotechnology and Bioengineering, 109(10), pp. 2437–2459.

Jennewein, S. et al. (2006) ‘Directed evolution of an industrial biocatalyst: 2-deoxy-D-ribose 5-phosphate aldolase.’, Biotechnology journal, 1(5), pp. 537–48.

Jez, J. M. et al. (1997) ‘Comparative anatomy of the aldo-keto reductase superfamily.’, The Biochemical journal, 326, pp. 625–636.

Jez, J. M., Flynn, T. G. and Penning, T. M. (1997) ‘A new nomenclature for the aldo-keto reductase superfamily’, Biochemical Pharmacology, 54(6), pp. 639–647.


Jez, J. M. and Penning, T. M. (2001) ‘The aldo-keto reductase (AKR) superfamily: An update’, Chemico-Biological Interactions, 130–132, pp. 499–525.

Jiang, Y. et al. (2014) ‘Microbial production of short chain diols.’, Microbial cell factories, 13(1), p. 165.

Jiao, X.-C. et al. (2015) ‘Efficient synthesis of a statin precursor in high space-time yield by a new aldehyde-tolerant aldolase identified from Lactobacillus brevis’, Catal. Sci. Technol. Royal Society of Chemistry, 5(8), pp. 4048–4054.

Jiao, X.-C. et al. (2017) ‘Protein engineering of aldolase LbDERA for enhanced activity toward real substrates with a high-throughput screening method coupled with an aldehyde dehydrogenase’, Biochemical and Biophysical Research Communications. Elsevier Ltd, 482(1), pp. 159–163.

Jiao, X. et al. (2016) ‘A green-by-design system for efficient bio-oxidation of an unnatural hexapyranose into chiral lactone for building statin side-chains’, Catalysis Science & Technology. Royal Society of Chemistry, 6(19), pp. 7094–7100.

Joo, J. C. et al. (2017) ‘Alkene hydrogenation activity of enoate reductases for an environmentally benign biosynthesis of adipic acid’, Chemical Science. Royal Society of Chemistry, 8(2), pp. 1406–1413.

Kaluzna, I. A. et al. (2004) ‘Systematic investigation of Saccharomyces cerevisiae enzymes catalyzing carbonyl reductions.’, Journal of the American Chemical Society, 126(40), pp. 12827–12832.

Kataoka, N. et al. (2013) ‘Improvement of (R)-1,3-butanediol production by engineered Escherichia coli’, Journal of Bioscience and Bioengineering. Elsevier Ltd, 115(5), pp. 475–480.

Kataoka, N. et al. (2014) ‘Enhancement of (R)-1,3-butanediol production by engineered Escherichia coli using a bioreactor system with strict regulation of overall oxygen transfer coefficient and pH.’, Bioscience, biotechnology, and biochemistry, 78(4), pp. 695–700.

Kavanagh, K. L. et al. (2002) ‘The structure of apo and holo forms of xylose reductase, a dimeric aldo-keto reductase from Candida tenuis’, Biochemistry, 41(28), pp. 8785–8795.

Keasling, J. D. (2010) ‘Manufacturing Molecules Through Metabolic Engineering’, Science, 330(6009), pp. 1355–1358.


Kelley, L. A. et al. (2015) ‘The Phyre2 web portal for protein modeling, prediction and analysis’, Nature Protocols. Nature Publishing Group, 10(6), pp. 845–858.

Kim, T. et al. (2017) ‘Structural and biochemical studies of novel aldo-keto reductases for the biocatalytic conversion of 3-hydroxybutanal to 1,3-butanediol’, Applied and Environmental Microbiology, 83(7), pp. e03172-16.

Knoop, H. et al. (2013) ‘Flux Balance Analysis of Cyanobacterial Metabolism: The Metabolic Network of Synechocystis sp. PCC 6803’, PLoS Computational Biology, 9(6).

Kohler, R. (1971) ‘The Background to Eduard Buchner’s Discovery of Cell-Free Fermentation’, Journal of the History of Biology, 4(1), pp. 35–61.

Kozma, E. et al. (2002) ‘The crystal structure of rat liver AKR7A1. A dimeric member of the aldo-keto reductase superfamily’, Journal of Biological Chemistry, 277(18), pp. 16285–16293.

Krivov, G. G., Shapovalov, M. V. and Dunbrack, R. L. (2009) ‘Improved prediction of protein side-chain conformations with SCWRL4’, Proteins: Structure, Function and Bioinformatics, 77(4), pp. 778–795.

Kunjapur, A. M., Tarasova, Y. and Prather, K. L. J. (2014) ‘Synthesis and accumulation of aromatic aldehydes in an engineered strain of Escherichia coli’, Journal of the American Chemical Society, 136(33), pp. 11644–11654.

Kuznetsova, E. et al. (2015) ‘Functional diversity of haloacid dehalogenase superfamily phosphatases from Saccharomyces cerevisiae: Biochemical, structural, and evolutionary insights’, Journal of Biological Chemistry, 290(30), pp. 18678–18698.

Lan, E. I. and Liao, J. C. (2012) ‘ATP drives direct photosynthetic production of 1-butanol in cyanobacteria.’, Proceedings of the National Academy of Sciences of the United States of America, 109(16), pp. 6018–23.

Langer, E. (2012) ‘Biomanufacturing Outsourcing Outlook’, BioPharm International, 25(2).

Lapthorn, A. J., Zhu, X. and Ellis, E. M. (2013) ‘The diversity of microbial aldo/keto reductases from Escherichia coli K12’, Chemico-Biological Interactions, 202(1–3), pp. 168–177.


Laskowski, R. A. et al. (1993) ‘PROCHECK: a program to check the stereochemical quality of protein structures’, Journal of Applied Crystallography. International Union of Crystallography, 26, pp. 283–291.

Laurence, W. F. (2007) ‘Switch to Corn Promotes Amazon Deforestation’, Science, 318(5857), pp. 1721–1724.

Lee, J. W. et al. (2011) ‘Systems metabolic engineering for chemicals and materials’, Trends in Biotechnology. Elsevier Ltd, 29(8), pp. 370–378.

Lee, J. W. et al. (2012) ‘Systems metabolic engineering of microorganisms for natural and non-natural chemicals’, Nature Chemical Biology. Nature Publishing Group, 8(6), pp. 536–546.

Letunic, I. and Bork, P. (2016) ‘Interactive tree of life (iTOL) v3: an online tool for the display and annotation of phylogenetic and other trees’, Nucleic Acids Research, 44(W1), pp. W242–W245.

Li, H. and Liao, J. C. (2013) ‘Engineering a cyanobacterium as the catalyst for the photosynthetic conversion of CO2 to 1,2-propanediol.’, Microbial cell factories. Microbial Cell Factories, 12(1), p. 4.

Li, L. et al. (2012) ‘Biocatalytic production of (2S,3S)-2,3-butanediol from diacetyl using whole cells of engineered Escherichia coli’, Bioresource Technology. Elsevier Ltd, 115, pp. 111–116.

Lian, J., Chao, R. and Zhao, H. (2014) ‘Metabolic engineering of a Saccharomyces cerevisiae strain capable of simultaneously utilizing glucose and galactose to produce enantiopure (2R,3R)-butanediol’, Metabolic Engineering. Elsevier, 23, pp. 92–99.

Liang, K. and Shen, C. R. (2017) ‘Selection of an endogenous 2,3-butanediol pathway in Escherichia coli by fermentative redox balance’, Metabolic Engineering. Elsevier, 39(December 2016), pp. 181–191.

Liu, J. and Wong, C. H. (2002) ‘Aldolase-catalyzed asymmetric synthesis of novel pyranose synthons as a new entry to heterocycles and epothilones’, Angewandte Chemie - International Edition, 41(8), pp. 1404–1407.

Liu, Y. et al. (1995) ‘Circadian orchestration of gene expression in cyanobacteria’, Genes and Development, 9, pp. 1469–1478.

Lomax, M. S. and Greenberg, G. R. (1968) ‘Characteristics of the deo operon: role in thymine utilization and sensitivity to deoxyribonucleosides.’, Journal of Bacteriology, 96(2), pp. 501–514.


Luccio, E. D. I., Elling, R. A. and Wilson, D. K. (2006) ‘Identification of a novel NADH-specific aldo-keto reductase using sequence and structural homologies’, Biochemical Journal, 114, pp. 105–114.

Lv, X. et al. (2016) ‘Metabolic engineering of Serratia marcescens MG1 for enhanced production of (3R)-acetoin’, Bioresources and Bioprocessing. Springer Berlin Heidelberg, 3(1), p. 52.

Ma, H. et al. (2016) ‘Linking coupled motions and entropic effects to the catalytic activity of 2-deoxyribose-5-phosphate aldolase (DERA)’, Chem. Sci., 7(2), pp. 1415–1421.

Machado, H. B. et al. (2012) ‘A selection platform for carbon chain elongation using the CoA-dependent pathway to produce linear higher alcohols’, Metabolic Engineering. Elsevier, 14(5), pp. 504–511.

Machajewski, T. and Wong, C. (2000) ‘The Catalytic Asymmetric Aldol Reaction’, Angewandte Chemie (International ed. in English), 39(8), pp. 1352–1375.

Mahrwald, R. (1999) ‘Diastereoselection in Lewis-Acid-Mediated Aldol Additions’, Chemical Reviews, 99(5), pp. 1095–1120.

Mahrwald, R. (2004) Modern Aldol Reactions, WILEY-VCH. Weinheim: Wiley-VCH.

Makshina, E. V et al. (2014) ‘Review of old chemistry and new catalytic advances in the on-purpose synthesis of butadiene.’, Chemical Society reviews, 43, pp. 7917–7953.

Marquardt, T. et al. (2005) ‘High-resolution crystal structure of AKR11C1 from Bacillus halodurans: An NADPH-dependent 4-hydroxy-2,3-trans-nonenal reductase’, Journal of Molecular Biology, 354(2), pp. 304–316.

Marsh, J. J. and Lebherz, H. G. (1992) ‘Fructose-bisphosphate aldolases: an evolutionary history’, Trends in Biochemical Sciences, 17(3), pp. 110–113.

Matsuyama, A. et al. (2001) ‘Industrial production of (R)-1,3-butanediol by new biocatalysts’, Journal of Molecular Catalysis B: Enzymatic, 11, pp. 513–521.

McCoy, A. J. et al. (2007) ‘Phaser crystallographic software’, Journal of Applied Crystallography. International Union of Crystallography, 40(4), pp. 658–674.


McEwen, J. T., Kanno, M. and Atsumi, S. (2016) ‘2,3 Butanediol production in an obligate photoautotrophic cyanobacterium in dark conditions via diverse sugar consumption’, Metabolic Engineering, 36, pp. 28–36.

McGovern, P. E. et al. (2004) ‘Fermented beverages of pre- and proto-historic China’, Proceedings of the National Academy of Sciences, 101(51), pp. 17593–17598.

Minor, W. et al. (2006) ‘HKL-3000: The integration of data reduction and structure solution - From diffraction images to an initial model in minutes’, Acta Crystallographica Section D: Biological Crystallography. International Union of Crystallography, 62(8), pp. 859–866.

Mukaiyama, T. (1982) ‘Organic Reactions’, Organic Reactions, pp. 203–331.

Mukherjee, S. et al. (2007) ‘Asymmetric enamine catalysis’, Chemical Reviews, 107(12), pp. 5471–5569.

Müller, M. (2005) ‘Chemoenzymatic synthesis of building blocks for statin side chains’, Angewandte Chemie - International Edition, 44(3), pp. 362–365.

Murshudov, G. N. et al. (2011) ‘REFMAC5 for the refinement of macromolecular crystal structures’, Acta Crystallographica Section D: Biological Crystallography, 67(4), pp. 355–367.

Nagano, N., Ota, M. and Nishikawa, K. (1999) ‘Strong hydrophobic nature of cysteine residues in proteins’, FEBS Letters, 458(1), pp. 69–71.

Nair, U., Thomas, C. and Golden, S. S. (2001) ‘Functional Elements of the Strong psbAI Promoter of Synechococcus elongatus PCC 7942’, Journal of Bacteriology, 183(5), pp. 1740–1747.

Nakamura, Y., Kaneko, T. and Tabata, S. (2000) ‘CyanoBase, the genome database for Synechocystis sp. strain PCC6803: status for the year 2000’, Nucleic Acids Research, 28(1), p. 72.

Neale, A. D. et al. (1987) ‘Pyruvate decarboxylase of Zymomonas mobilis: isolation, properties, and genetic expression in Escherichia coli’, Journal of Bacteriology, 169(3), pp. 1024–1028.

Nemr, K. et al. (2018) ‘Engineering a short, aldolase-based pathway for (R)-1,3-butanediol production in Escherichia coli’, Metabolic Engineering, 48, pp. 13–24.

Nielsen, D. R. et al. (2010) ‘Metabolic engineering of acetoin and meso-2,3-butanediol biosynthesis in E. coli’, Biotechnology Journal, 5(3), pp. 274–284.

Nielsen, J. (2001) ‘Metabolic engineering’, Applied Microbiology and Biotechnology, 55(3), pp. 263–283.

Nielsen, J. and Keasling, J. D. (2016) ‘Engineering Cellular Metabolism’, Cell. Elsevier Ltd, 164(6), pp. 1185–1197.

Oliver, J. W. K. et al. (2013) ‘Cyanobacterial conversion of carbon dioxide to 2,3-butanediol.’, Proceedings of the National Academy of Sciences of the United States of America, 110(4), pp. 1249–54.

Oliver, J. W. K. et al. (2014) ‘Combinatorial optimization of cyanobacterial 2,3-butanediol production’, Metabolic Engineering. Elsevier, 22, pp. 76–82.

Oliver, J. W. K. and Atsumi, S. (2014) ‘Metabolic design for cyanobacterial chemical synthesis’, Photosynthesis Research, 120(3), pp. 249–261.

Orsini, F., Pelizzoni, F. and Forte, M. (1989) ‘Behaviour of Aminoacids and Aliphatic Aldehydes in Dipolar Aprotic Solvents: Formation of Oxazolidinones’, Journal of Heterocyclic Chemistry, 26(3), pp. 837–841.

Otwinowski, Z. and Minor, W. (1997) ‘Macromolecular Crystallography Part A’, Methods in Enzymology, 276(January 1993), pp. 307–326.

Painter, J. and Merritt, E. A. (2006) ‘Optimal description of a protein structure in terms of multiple groups undergoing TLS motion’, Acta Crystallographica Section D: Biological Crystallography. International Union of Crystallography, 62(4), pp. 439–450.

Penning, T. M. et al. (2000) ‘Human 3alpha-hydroxysteroid dehydrogenase isoforms (AKR1C1-AKR1C4) of the aldo-keto reductase superfamily: functional plasticity and tissue distribution reveals roles in the inactivation and formation of male and female sex hormones.’, The Biochemical journal, 351(Pt 1), pp. 67–77.

Penning, T. M. (2015) ‘The aldo-keto reductases (AKRs): Overview’, Chemico-Biological Interactions. Elsevier Ireland Ltd, 234(August 2014), pp. 236–246.

Pfleger, B. F. et al. (2006) ‘Combinatorial engineering of intergenic regions in operons tunes expression of multiple genes’, Nature Biotechnology, 24(8), pp. 1027–1032.

Pharkya, P. et al. (2015) ‘Microorganisms and methods for production of 4-hydroxybutyrate, 1,4-butanediol and related compounds’. US.

Picataggio, S. (2009) ‘Potential impact of synthetic biology on the development of microbial systems for the production of renewable fuels and chemicals’, Current Opinion in Biotechnology, 20(3), pp. 325–329.

Pisciotta, J. M., Zou, Y. and Baskakov, I. V. (2010) ‘Light-dependent electrogenic activity of cyanobacteria’, PLoS ONE, 5(5).

Pleiss, J. (2011) ‘Protein design in metabolic engineering and synthetic biology’, Current Opinion in Biotechnology, 22(5), pp. 611–617.

Price, M. N., Dehal, P. S. and Arkin, A. P. (2010) ‘FastTree 2 - Approximately maximum-likelihood trees for large alignments’, PLoS ONE, 5(3), p. e9490.

Pricer, W. E. and Horecker, B. L. (1960) ‘Deoxyribose Aldolase’, Journal of Biological Chemistry, 235(5), pp. 1292–1298.

Racker, E. (1952) ‘Enzymatic Synthesis and Breakdown of Deoxyribose Phosphate’, Journal of Biological Chemistry, 196, pp. 347–365.

Raj, K. et al. (2018) ‘Biocatalytic production of adipic acid from glucose using engineered Saccharomyces cerevisiae’, Metabolic Engineering Communications. Elsevier B.V., 6, pp. 28–32.

Richter, N. et al. (2010) ‘The three-dimensional structure of AKR11B4, a glycerol dehydrogenase from Gluconobacter oxydans, reveals a tryptophan residue as an accelerator of reaction turnover’, Journal of Molecular Biology. Elsevier Ltd, 404(3), pp. 353–362.

Righelato, R. and Spracklen, D. V. (2007) ‘Carbon mitigation by biofuels or by saving and restoring forests?’, Science, 317(5840), p. 902.

Robert, X. and Gouet, P. (2014) ‘Deciphering key features in protein structures with the new ENDscript server’, Nucleic Acids Research, 42(W1), pp. 320–324.

Rodriguez, G. M. and Atsumi, S. (2012) ‘Isobutyraldehyde production from Escherichia coli by removing aldehyde reductase activity’, Microbial cell factories, 11(1), p. 90.

Rodriguez, G. M. and Atsumi, S. (2014) ‘Toward aldehyde and alkane production by removing aldehyde reductase activity in Escherichia coli’, Metabolic Engineering, 25, pp. 227–237.

Rodriguez, S., Kayser, M. M. and Stewart, J. D. (2001) ‘Highly stereoselective reagents for β-keto ester reductions by genetic engineering of Baker’s yeast’, Journal of the American Chemical Society, 123(8), pp. 1547–1555.

Rosgaard, L. et al. (2012) ‘Bioengineering of carbon fixation, biofuels, and biochemicals in cyanobacteria and plants’, Journal of Biotechnology, 162(1), pp. 134–147.

Ruffing, A. M. and Trahan, C. A. (2014) ‘Biofuel toxicity and mechanisms of biofuel tolerance in three model cyanobacteria’, Algal Research, 5(1), pp. 121–132.

Sabra, W., Groeger, C. and Zeng, A. P. (2016) ‘Microbial Cell Factories for Diol Production’, in Bioreactor Engineering Research and Industrial Applications I. Advances in Biochemical Engineering/Biotechnology, pp. 165–197.

Saitou, N. and Nei, M. (1987) ‘The neighbour-joining method: a new method for reconstructing phylogenetic trees’, Molecular Biology and Evolution, 4(4), pp. 406–425.

Sakuraba, H. et al. (2003) ‘The first crystal structure of Archaeal aldolase. Unique tetrameric structure of 2-deoxy-D-ribose-5-phosphate aldolase from the hyperthermophilic Archaea Aeropyrum pernix’, Journal of Biological Chemistry, 278(12), pp. 10799–10806.

Sakuraba, H. et al. (2007) ‘Sequential aldol condensation catalyzed by hyperthermophilic 2-deoxy-D-ribose-5-phosphate aldolase’, Applied and Environmental Microbiology, 73(22), pp. 7427–7434.

Salis, H. M., Mirsky, E. A. and Voigt, C. A. (2009) ‘Automated design of synthetic ribosome binding sites to control protein expression’, Nature Biotechnology, 27(10), pp. 946–950.

Samland, A. K. and Sprenger, G. A. (2006) ‘Microbial aldolases as C-C bonding enzymes--unknown treasures and new developments.’, Applied Microbiology and Biotechnology, 71(3), pp. 253–64.

Scharlemann, J. P. W. and Laurance, W. F. (2008) ‘How Green Are Biofuels?’, Science, 319(319), pp. 43–44. doi: 10.1126/science.1153103.

Schlegel, B. P., Jez, J. M. and Penning, T. M. (1998) ‘Mutagenesis of 3 alpha-hydroxysteroid dehydrogenase reveals a “push-pull” mechanism for proton transfer in aldo-keto reductases’, Biochemistry, 37(10), pp. 3538–3548.

Schmidt, N. G., Eger, E. and Kroutil, W. (2016) ‘Building Bridges: Biocatalytic C-C-Bond Formation toward Multifunctional Products’, ACS Catalysis, 6(7), pp. 4286–4311.

Schoemaker, H. E. (2003) ‘Dispelling the Myths--Biocatalysis in Industrial Synthesis’, Science, 299(5613), pp. 1694–1697.

Schulte, M. et al. (2018) ‘Conformational Sampling of the Intrinsically Disordered C-Terminal Tail of DERA Is Important for Enzyme Catalysis’, ACS Catalysis, 8(5), pp. 3971–3984.

Schürmann, M. and Sprenger, G. A. (2001) ‘Fructose-6-phosphate Aldolase is a Novel Class I Aldolase from Escherichia coli and is Related to a Novel Group of Bacterial Transaldolases’, Journal of Biological Chemistry, 276(14), pp. 11055–11061.

Shen, C. R. and Liao, J. C. (2012) ‘Photosynthetic production of 2-methyl-1-butanol from CO2 in cyanobacterium Synechococcus elongatus PCC7942 and characterization of the native acetohydroxyacid synthase’, Energy & Environmental Science, 5(11), pp. 9574–9583.

Sheppard, M. J., Kunjapur, A. M. and Prather, K. L. J. (2016) ‘Modular and selective biosynthesis of gasoline-range alkanes’, Metabolic Engineering. Elsevier, 33, pp. 28–40.

Shimizu, S., Kataoka, M. and Kita, K. (1998) ‘Chiral Alcohol Synthesis with Microbial Carbonyl Reductases in a Water–Organic Solvent Two-Phase System’, Annals of the New York Academy of Sciences, 864, pp. 87–95.

Sievers, F. et al. (2011) ‘Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega.’, Molecular systems biology, 7(1), p. 539.

Steen, E. J. et al. (2010) ‘Microbial production of fatty-acid-derived fuels and chemicals from plant biomass’, Nature. Nature Publishing Group, 463(7280), pp. 559–562.

Stephanopoulos, G. and Vallino, J. J. (1991) ‘Network rigidity and metabolite engineering in metabolic overproduction’, Science, 252(5013), pp. 1675–1681.

Subhadra, B. G. (2010) ‘Sustainability of algal biofuel production using integrated renewable energy park (IREP) and algal biorefinery approach’, Energy Policy. Elsevier, 38(10), pp. 5892–5901.

Subrizi, F. et al. (2014) ‘Versatile and efficient immobilization of 2-deoxyribose-5-phosphate aldolase (DERA) on multiwalled carbon nanotubes’, ACS Catalysis, 4(9), pp. 3059–3068.

Tamang, J. P. et al. (2009) ‘Functional properties of lactic acid bacteria isolated from ethnic fermented vegetables of the Himalayas’, International Journal of Food Microbiology, 135(1), pp. 28–33.

Tamang, J. P. et al. (2016) ‘Functional properties of microorganisms in fermented foods’, Frontiers in Microbiology, 7(APR), pp. 1–13.

Tamang, J. P., Watanabe, K. and Holzapfel, W. H. (2016) ‘Review: Diversity of microorganisms in global fermented foods and beverages’, Frontiers in Microbiology, 7(377).

Tamura, K. et al. (2013) ‘MEGA6: Molecular evolutionary genetics analysis version 6.0’, Molecular Biology and Evolution, 30(12), pp. 2725–2729.

Thapa, N. and Tamang, J. P. (2015) ‘Functionality and Therapeutic Values of Fermented Foods’, in Health Benefits of Fermented Foods and Beverages. Boca Raton: CRC Press, pp. 111–168.

Thykaer, J. and Nielsen, J. (2003) ‘Metabolic engineering of β-lactam production’, Metabolic Engineering, 5(1), pp. 56–69.

Tina, K. G., Bhadra, R. and Srinivasan, N. (2007) ‘PIC: Protein Interactions Calculator’, Nucleic Acids Research, 35, pp. W473–W476.

Torella, J. P. et al. (2014) ‘Unique nucleotide sequence-guided assembly of repetitive DNA parts for synthetic biology applications.’, Nature protocols, 9(9), pp. 2075–2089.

Traff, K., Cordero Otero, R. R. and van Zyl, W. H. (2001) ‘Deletion of the GRE3 aldose reductase gene and its influence on xylose metabolism in recombinant strains of Saccharomyces cerevisiae expressing the xylA and XKS1 genes’, Applied and Environmental Microbiology, 67(12), pp. 5668–5674.

Trost, B. M. and Brindle, C. S. (2010) ‘The direct catalytic asymmetric aldol reaction’, Chemical Society Reviews, 39(5), pp. 1600–1632.

Valentin‐Hansen, P. et al. (1982) ‘The Primary Structure of Escherichia coli K12 2‐Deoxyribose 5‐Phosphate Aldolase: Nucleotide Sequence of the deoC Gene and the Amino Acid Sequence of the Enzyme’, European Journal of Biochemistry, 125(3), pp. 561–566.

Vivijs, B. et al. (2014) ‘Acetoin synthesis acquisition favors Escherichia coli growth at low pH’, Applied and Environmental Microbiology, 80(19), pp. 6054–6061.

Voigt, C. et al. (2011) ‘Synthetic Biology in Cyanobacteria: Engineering and Analyzing Novel Functions’, in Methods in Enzymology. Elsevier Inc., pp. 539–579.

Wang, B., Wang, J. and Meldrum, D. R. (2012) ‘Application of synthetic biology in cyanobacteria and algae’, Frontiers in Microbiology, 3(SEP), pp. 1–15. doi: 10.3389/fmicb.2012.00344.

Wang, Y. et al. (2013) ‘Engineering of cofactor regeneration enhances (2S,3S)-2,3-butanediol production from diacetyl’, Scientific Reports, 3(2643).

Wilson, D. K. et al. (1992) ‘An unlikely sugar substrate site in the 1.65 Å structure of the human aldose reductase holoenzyme implicated in diabetic complications’, Science, 257(5066), pp. 81–84.

Windle, C. L. et al. (2014) ‘Engineering aldolases as biocatalysts’, Current Opinion in Chemical Biology, 19(1), pp. 25–33.

Winn, M. D. et al. (2011) ‘Overview of the CCP4 suite and current developments’, Acta Crystallographica Section D: Biological Crystallography. International Union of Crystallography, 67(4), pp. 235–242.

Wong, C.-H. et al. (1995) ‘Recombinant 2-Deoxyribose-5-phosphate Aldolase in Organic Synthesis: Use of Sequential Two-Substrate and Three-Substrate Aldol Reactions’, Journal of the American Chemical Society, 117(12), pp. 3333–3339.

Woo, M. H. et al. (2014) ‘Expression and characterization of a novel 2-deoxyribose-5-phosphate aldolase from Haemophilus influenzae Rd KW20’, Journal of the Korean Society for Applied Biological Chemistry, 57(5), pp. 655–660.

Woolston, B. M., Edgar, S. and Stephanopoulos, G. (2013) ‘Metabolic Engineering: Past and Future’, Annual Review of Chemical and Biomolecular Engineering, 4(1), pp. 259–288.

Yamada, K. D., Tomii, K. and Katoh, K. (2016) ‘Application of the MAFFT sequence alignment program to large data - Reexamination of the usefulness of chained guide trees’, Bioinformatics, 32(21), pp. 3246–3251.

Yamamoto, H., Matsuyama, A. and Kobayashi, Y. (2002) ‘Synthesis of (R)-1,3-butanediol by enantioselective oxidation using whole recombinant Escherichia coli cells expressing (S)-specific secondary alcohol dehydrogenase.’, Bioscience, biotechnology, and biochemistry, 66(4), pp. 925–7.

Yang, D. et al. (2017) ‘Systems metabolic engineering as an enabling technology in accomplishing sustainable development goals’, Microbial Biotechnology, 10(5), pp. 1254–1258.

Yang, T. et al. (2014) ‘Asymmetric reduction of 4-hydroxy-2-butanone to (R)-1,3-butanediol with absolute stereochemical selectivity by a newly isolated strain of Pichia jadinii’, Journal of Industrial Microbiology and Biotechnology, 41(12), pp. 1743–1752.

Yim, H. et al. (2011) ‘Metabolic engineering of Escherichia coli for direct production of 1,4-butanediol’, Nature Chemical Biology, 7(7), pp. 445–452.

You, Z. Y. et al. (2013) ‘Characterization and application of a newly synthesized 2-deoxyribose-5-phosphate aldolase’, Journal of Industrial Microbiology and Biotechnology, 40(1), pp. 29–39.

Yu, Y. et al. (2013) ‘Development of Synechocystis sp. PCC 6803 as a phototrophic cell factory’, Marine Drugs, 11(8), pp. 2894–2916.

Zhang, R. G. et al. (2001) ‘Structure of Thermotoga maritima stationary phase survival protein SurE: A novel acid phosphatase’, Structure, 9(11), pp. 1095–1106. doi: 10.1016/S0969-2126(01)00675-X.

Zheng, R. C. et al. (2012) ‘Asymmetric synthesis of (R)-1,3-butanediol from 4-hydroxy-2-butanone by a newly isolated strain Candida krusei ZJB-09162’, Applied Microbiology and Biotechnology, 94(4), pp. 969–976.

Appendix A

Supplementary Information for Chapter 3

Figure A1. LC-MS analysis of the reaction products of 3-HB reduction by PA1127 and STM2406. (A, B) Extracted ion chromatograms of the 1,3-BDO sodium adduct for PA1127 (A) and STM2406 (B), separated on a C18 column under positive ionization; (C, D) MS spectra of the C18 reaction product peak corresponding to 1,3-BDO for PA1127 (C) and STM2406 (D). In both spectra, the m/z values for the sodium adduct of 1,3-BDO under positive ionization are shown.

Figure A2. Screening of purified STM2406 (A) and PA1127 (B) for AKR activity against various substrates. The 54 substrates used here included 24 aldehydes, 14 ketones, and 16 sugars. Reaction mixtures contained 10 μg of STM2406 (A) or 5 μg of PA1127 (B) with 1 mM of each substrate and 0.5 mM NADPH. The activity was calculated based on the oxidation of NADPH during 10 min reaction. The dotted line indicates the possible maximal level of activity based on the amount of NADPH added, and the activity bars near this line imply that all the NADPH added was consumed (continued from the previous page).
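The dotted line in Figure A2 can be reproduced with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, the reaction volume (0.2 mL) is an assumption because the caption does not state it, and 6220 M^-1 cm^-1 is the commonly used extinction coefficient of NADPH at 340 nm.

```python
# Illustrative sketch; not the thesis protocol.
# ASSUMPTION: reaction volume of 0.2 mL (not given in the caption).
NADPH_EXT_COEFF = 6220.0  # M^-1 cm^-1 at 340 nm, standard literature value


def specific_activity(delta_a340, minutes, protein_mg, volume_ml, path_cm=1.0):
    """Return umol NADPH oxidized per min per mg protein from an A340 decrease."""
    conc_molar = delta_a340 / (NADPH_EXT_COEFF * path_cm)  # Beer-Lambert, mol/L
    umol = conc_molar * 1e6 * (volume_ml / 1000.0)         # mol/L -> umol in the well
    return umol / minutes / protein_mg


def max_activity(nadph_mM, minutes, protein_mg, volume_ml):
    """Upper bound on measurable activity if all added NADPH is consumed."""
    umol = nadph_mM * volume_ml                            # mM x mL = umol
    return umol / minutes / protein_mg


if __name__ == "__main__":
    volume_ml = 0.2  # assumed
    for name, protein_mg in (("STM2406", 0.010), ("PA1127", 0.005)):
        ceiling = max_activity(0.5, 10, protein_mg, volume_ml)
        print(f"{name}: ceiling = {ceiling:.2f} umol/min/mg")
```

Under the assumed 0.2 mL volume, the ceiling would be 1 µmol/min/mg for 10 µg of STM2406 and 2 µmol/min/mg for 5 µg of PA1127; a different reaction volume scales both values proportionally.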

Figure A3. A: Asymmetric unit of STM2406 dimer (two orientations related by a 90° rotation). STM2406 protomers shown in different colors (gray and blue). Protein subunits shown as space filling models; B: PISA predicted octameric structure of STM2406. STM2406 protomers shown in different colors; C: Structural superposition of STM2406 (gray) and model PA1127 (cyan). C-terminal loop is present for PA1127 while absent for STM2406.

Figure A4. Structure-based sequence alignment of STM2406 and AKRs with higher activity toward 3-HB, which belong to the AKR5 and AKR11 subfamilies (from Fig. 3-3). Residues conserved in all proteins are shown in white on a red background. The columns with red residues indicate the presence of more than 70% of biochemically similar residues. The residues mutated in this work are marked with numbers above (STM2406) and below (PA1127) the alignment, whereas the catalytic tetrad residues are indicated by red numbers. The C-terminal loops of PA1127, YvgN, and YhdN deleted in this work are marked by magenta boxes. The secondary structure elements derived from the structure of STM2406 (PDB code 3ERP) are shown above the alignment.

Figure A5. Overall structure of AKRs and their C-terminal loops. Ribbon diagrams showing the presence of the C-terminal loop in the B. subtilis YvgN (A), YhdN (B), and YtbE (C) and its absence in STM2406 (D). The core protein domains are shown in gray with catalytic Tyr residues colored in green and the C-terminal loops colored in magenta. Labels C indicate C-termini of the proteins.

Figure A6. Effect of the C-terminal loop deletion on the AKR activity of YhdN, YvgN, and PA1127. Comparison of the AKR activity of the wild-type and mutant proteins with 3-HB and 4-nitrobenzaldehyde as substrates.

Table A1. Microbial AKRs used in this work.

Source organism               Gene code     Uniprot/SwissProt/Genbank ID   PDB code
Bacillus cereus               BCE_0216      Q73EZ1                         N/A
Bacillus cereus               BCE_5206      Q72Y15                         N/A
Bacillus halodurans           BH2158        Q9KAX8                         N/A
Bacillus halodurans           BH3849        Q9K683                         N/A
Bacillus subtilis             yccK          P46905                         N/A
Bacillus subtilis             yhdN          P80874                         1PZ1
Bacillus subtilis             ytbE          O34678                         3B3D
Bacillus subtilis             yvgN          O32210                         3D3F/3F7J
Lactobacillus brevis          LVIS_0099     Q03U52                         N/A
Lactobacillus brevis          LVIS_0272     Q03TN4                         N/A
Lactobacillus brevis          LVIS_0866     Q03S14                         N/A
Lactobacillus brevis          LVIS_1188     Q03R78                         N/A
Lactobacillus brevis          LVIS_1423     Q03QK1                         N/A
Lactobacillus brevis          LVIS_1844     Q03PG5                         N/A
Pseudomonas aeruginosa        *PA1127       Q9I4K8                         N/A
Pseudomonas aeruginosa        PA2535        Q9I0U9                         N/A
Pseudomonas aeruginosa        PA3795        Q9HXK2                         N/A
Pseudomonas savastanoi        PSPPH_1726    Q48KW2                         N/A
Pseudomonas syringae          PSPTO_0351    Q88AN7                         N/A
Rhodopseudomonas palustris    RPA2150       Q6N7V4                         N/A
Salmonella typhimurium        *STM2406      Q8ZNA1                         3ERP/5T79

* Main target enzymes in this study

Table A2. Crystallographic data collection and model refinement statistics

Parameters                     STM2406 (apo-structure)        STM2406 (NADPH complex)

Data collection
Space group                    I4                             P422
Cell dimensions a, b, c (Å)    a=127.1, b=127.1, c=91.7       a=94.4, b=94.4, c=120.5
Wavelength                     0.97856                        0.97917
Resolution (Å)                 30.0-1.55 (1.61-1.55)a         30.0-1.85 (1.88-1.85)
Rmerge                         0.06 (0.477)                   0.073 (0.563)
I / σ(I)                       23.1 (2.6)                     25.4 (3.2)
Completeness (%)               97.2 (80.3)                    99.5 (100.0)
Redundancy                     5.2 (4.6)                      7.7 (7.4)

Refinement
Resolution (Å)                 30.0-1.55                      29.85-1.86
Number of reflections          125986/6743b                   32853/1999
Rwork / Rfree                  15.3/17.8                      16.5/19.2
No. of atoms
  Protein                      5561                           2499
  Other ligands/ion            72                             59
  Solvent                      905                            448
R.m.s. deviations
  Bond lengths (Å)             0.014                          0.010
  Bond angles (°)              1.52                           0.95
Ramachandran plot
  Most favored (%)             93.2                           92.3
  Additionally allowed (%)     6.2                            7.4
  Generously allowed (%)       0.6                            0.4
  Disallowed (%)               0                              0
PDB code                       3ERP                           5T79

Appendix B

Supplementary Information for Chapter 4

Figure B1. Coomassie blue-stained SDS-PAGE of the purified DERAs used in this study. A and B: DERAs used in the screening assay; C, D and E: BH1352 variants. Depending on the protein concentration, 10-20 μg of purified protein was loaded in each well.

Figure B2. Optimization of the DERA-AKR coupled assay. A: Scheme of the sequential reactions of aldol condensation via DERA catalysis and reduction via AKR, including byproduct formation via double condensation. B: Normalized 1,3BDO production from the DERA-AKR coupled assay at various ratios and concentrations of the two enzymes. The ratio represents the concentration of DERA over that of AKR, while only DERA concentrations are displayed on the x-axis.

Figure B3. Optimal pH for DRP cleavage reaction of BH1352. 100 mM each of sodium acetate (pH 5.0 – 6.0), imidazole-HCl (pH 6.5 – 7.0), triethanolamine (pH 7.5 – 8.5), and glycine-NaOH (pH 9.0 – 10.0) were used in this study.

Figure B4. Arrangement of the subunits of BH1352 (A), EcDERA (B), and TM1559 (C). Different subunits of each DERA are colored differently. The two inter-molecular salt bridges of EcDERA and TM1559 are indicated with amino acid symbols and residue numbers.

Figure B5. Full-length crystal structure of TM1559, showing the C-terminal end Tyr246 (PDB 3R12). Tyr246 and two ligands are displayed as green stick models (CIT: citric acid; PGO: 1,2-propanediol).

Figure B6. Proposed catalytic mechanism of aldol condensation of acetaldehyde by BH1352.

Figure B7. Close-up view of the active site of TM1559. The catalytic lysine (K179) and other residues involved in substrate binding are displayed as green stick models with residue numbers. The ligands citrate and 1,2-propanediol (PGO) are shown as magenta and orange sticks, respectively.

Figure B8. A and B: Structural alignment of BH1352 (light gray, PDB code 6D33), LbDERA (cyan, PDB code 4XBK), and the LbDERA T29L/F163Y mutant (magenta, PDB code 5H91), showing a close-up view of the phosphate binding pocket; C and D: Predicted structures of BH1352 variants, with a predicted H-bond shown by a red dash (C) and an introduced cavity highlighted with a red dashed circle (D). The structural models were built using SCWRL4 (Krivov, Shapovalov and Dunbrack, 2009).

[Figure B9 plot: bar chart of CTHP synthesis activity (μmol product/mg protein/min; y-axis from 0 to 3) for WT, F160Y, F160H, I170V, and M173I.]

Figure B9. Initial activity of BH1352 and its variants in terms of the production of (3R,5S)-6-chloro-2,4,6-trideoxyhexapyranoside (CTHP).

Table B1. Microbial DERAs studied in this paper.

Gene name      Organism                                     Uniprot ID   PDB
APE_2437.1     Aeropyrum pernix                             Q9Y948       1N7K
BCE_1975       Bacillus cereus                              P39121       N/A
BH1352         Bacillus halodurans                          Q9KD67       6D33 (This study)
BSU39420       Bacillus subtilis                            Q73A11       N/A
CV_3701        Chromobacterium violaceum                    Q7NRS9       N/A
deoC           Escherichia coli                             P0A6L0       1JCJ, 1JCL, 1KTN, 1P1X, 5EKY, 5EL1, 5EMU
LVIS_1595      Lactobacillus brevis                         Q03Q50       4XBK, 4XBS, 5H91
LPG1433        Legionella pneumophila subsp. pneumophila    Q5ZVK7       N/A
LMO1995        Listeria monocytogenes                       Q8Y5R1       N/A
MTH_818        Methanothermobacter thermautotrophicus       O26909       N/A
PSPPH0865      Pseudomonas savastanoi pv. phaseolicola      Q48N75       N/A
PAE1231        Pyrobaculum aerophilum                       Q8ZXK7       1VCV
STM4567        Salmonella typhimurium                       Q8ZJV8       N/A
SO_1217        Shewanella oneidensis                        Q8EHK4       N/A
SF4413         Shigella flexneri serotype 5b                Q0SX30       N/A
SM11_pD0790    Sinorhizobium meliloti                       F7XGE6       N/A
SAV2137        Staphylococcus aureus                        P61084       N/A
SP_0843        Streptococcus pneumoniae serotype 4          Q97RH2       N/A
TM_1559        Thermotoga maritima                          Q9X1P5       3R12, 3R13
TTHA1186       Thermus thermophilus                         Q5SJ28       1J2W, 1UB3

Table B2. Hydrophobic residues involved in interactions between A and B subunits of BH1352, EcDERA, and TM1559.

BH1352
Position  Residue  Chain    Position  Residue  Chain
16        PRO      A        133       LEU      B
16        PRO      A        68        LEU      B
16        PRO      A        97        ILE      B
66        PHE      A        68        LEU      B
67        PRO      A        67        PRO      B
67        PRO      A        68        LEU      B
68        LEU      A        16        PRO      B
68        LEU      A        160       PHE      B
68        LEU      A        66        PHE      B
68        LEU      A        67        PRO      B
75        VAL      A        78        PHE      B
78        PHE      A        75        VAL      B
78        PHE      A        78        PHE      B
97        ILE      A        16        PRO      B
133       LEU      A        16        PRO      B
160       PHE      A        68        LEU      B

EcDERA (DeoC)
Position  Residue  Chain    Position  Residue  Chain
54        PRO      A        85        ILE      B
54        PRO      A        88        ALA      B
55        ILE      A        85        ILE      B
85        ILE      A        54        PRO      B
85        ILE      A        55        ILE      B
88        ALA      A        54        PRO      B
92        ALA      A        92        ALA      B
92        ALA      A        95        ALA      B
92        ALA      A        96        TYR      B
95        ALA      A        92        ALA      B
95        ALA      A        95        ALA      B
96        TYR      A        92        ALA      B
96        TYR      A        96        TYR      B

TM1559
Position  Residue  Chain    Position  Residue  Chain
42        PRO      A        122       VAL      B
42        PRO      A        157       TYR      B
42        PRO      A        93        LEU      B
43        PHE      A        122       VAL      B
43        PHE      A        157       TYR      B
46        PRO      A        127       ALA      B
91        PHE      A        93        LEU      B
92        PRO      A        92        PRO      B
92        PRO      A        93        LEU      B
93        LEU      A        184       PHE      B
93        LEU      A        42        PRO      B
93        LEU      A        91        PHE      B
93        LEU      A        92        PRO      B
122       VAL      A        42        PRO      B
122       VAL      A        43        PHE      B
127       ALA      A        46        PRO      B
157       TYR      A        42        PRO      B
157       TYR      A        43        PHE      B
184       PHE      A        93        LEU      B

* The cutoff distance for hydrophobic interactions between residues was set to 5 Å. Protein interactions were calculated using the PIC webserver (http://pic.mbu.iisc.ernet.in/job.html) (Tina, Bhadra and Srinivasan, 2007).
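The 5 Å criterion behind Table B2 can be illustrated with a minimal distance-cutoff sketch. This is not the PIC implementation: the function name, the hydrophobic residue set, the use of minimum atom-to-atom distance, and the toy coordinates are all assumptions made for illustration.

```python
import math

# ASSUMPTION: residue types treated as hydrophobic in this sketch.
HYDROPHOBIC = {"ALA", "VAL", "LEU", "ILE", "MET", "PHE", "TRP", "PRO", "TYR"}


def hydrophobic_contacts(chain_a, chain_b, cutoff=5.0):
    """List inter-chain hydrophobic residue pairs within `cutoff` angstroms.

    Each chain is a list of (position, residue_name, atom_coords), where
    atom_coords is a list of (x, y, z) tuples in angstroms.
    """
    contacts = []
    for pos_a, res_a, atoms_a in chain_a:
        if res_a not in HYDROPHOBIC:
            continue
        for pos_b, res_b, atoms_b in chain_b:
            if res_b not in HYDROPHOBIC:
                continue
            # Closest approach between any atom of one residue and any atom of the other.
            d_min = min(math.dist(p, q) for p in atoms_a for q in atoms_b)
            if d_min <= cutoff:
                contacts.append((pos_a, res_a, pos_b, res_b))
    return contacts


if __name__ == "__main__":
    # Toy coordinates invented for illustration; not taken from PDB 6D33.
    chain_a = [(16, "PRO", [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]),
               (20, "ASP", [(2.0, 0.0, 0.0)])]
    chain_b = [(133, "LEU", [(4.0, 0.0, 0.0)]),
               (68, "LEU", [(10.0, 0.0, 0.0)])]
    print(hydrophobic_contacts(chain_a, chain_b))
```

In practice the coordinates would come from a parsed structure file, and the exact atom selection used by the PIC server may differ from the all-atom minimum distance used here.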

Table B3. Amino acids found in the two hydrophobic clusters of the DERA family.

Residue in BH1352    *Ratio of hydrophobic amino acids found in the corresponding residue [%]    Most frequently found amino acid and its ratio
Leu14                99.6                                                                        Leu, 96.6 %
Val63                98.1                                                                        Val, 92.6 %
Phe66                99.8                                                                        Phe, 98.2 %
Ile128               99.5                                                                        Ile, 97.6 %
Val154               99.6                                                                        Val, 53.9 %
Ile170               86.6                                                                        Val, 48.7 %
Met173               99.5                                                                        Met, 79.5 %
Val177               94.8                                                                        Val, 52.0 %
Val183               99.5                                                                        Val, 53.5 %

*The ratio of hydrophobic amino acids found in 2,553 DERA sequences included in the phylogenetic analysis.
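The two ratios reported in Table B3 can be computed per alignment column as sketched below. The function name, the one-letter hydrophobic set, and the exclusion of gap characters are assumptions; the thesis does not state how gaps were handled, and the example column is invented.

```python
from collections import Counter

# ASSUMPTION: one-letter codes counted as hydrophobic in this sketch.
HYDROPHOBIC = set("AVLIMFWPY")


def column_stats(column):
    """Return (hydrophobic %, most frequent residue, its %) for one alignment column.

    `column` is a string with one character per sequence; '-' marks a gap
    and is ignored (an assumption of this sketch).
    """
    residues = [aa for aa in column if aa != "-"]
    counts = Counter(residues)
    n_hydrophobic = sum(n for aa, n in counts.items() if aa in HYDROPHOBIC)
    top_residue, top_count = counts.most_common(1)[0]
    total = len(residues)
    return 100.0 * n_hydrophobic / total, top_residue, 100.0 * top_count / total


if __name__ == "__main__":
    # Invented ten-sequence column, not data from the 2,553-sequence alignment.
    print(column_stats("VVVVVIILLT"))
```

For the full analysis, each column would be taken from the multiple sequence alignment at the position corresponding to the listed BH1352 residue.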

Appendix C

Figure C1. Detection limit of LC-MS for 1,3BDO (Na+ and H+ adduct), 2,3BDO (Na+ adduct), and acetoin (H+ adduct). All of these were observed with a measurable signal intensity at a concentration of 0.01 mg/ml using the previously mentioned protocol.

Figure C2. Design of the LMSE51C strain to maximize the carbon flux towards 1,3BDO by deleting several genes involved in the formation of formate, lactate, acetate, acetolactate, and ethanol. This strain was designed and engineered in a previous study from our group (Nemr et al., 2018).

Figure C3. HPLC chromatograms of TK01 media (A), TK02 media (B), and a 1 mg/ml standard mixture of acetaldehyde and 1,3BDO (C). No peaks corresponding to 1,3BDO or acetaldehyde were observed from the variant strains. At this point, the HPLC protocol had not been optimized, and the detection sensitivity was so low (0.05-0.1 mg/ml) that HPLC was not suitable for detecting the products of the engineered cyanobacteria.

Figure C4. A: LC-MS analysis of the supernatant of TK12 culture media. The top diagram shows the peak of compounds with m/z values of ~91. The H+ adduct of 1,3BDO (m/z = ~91.07) was chosen for analysis because another butanediol isomer, 2,3BDO, does not produce an H+ adduct, whereas the Na+ adduct forms for both 1,3BDO and 2,3BDO. The bottom diagram shows, however, that no 1,3BDO is apparently present.
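The m/z value of ~91.07 quoted for the H+ adduct can be checked from monoisotopic masses. The sketch below is an illustrative calculation with hypothetical helper names; it uses standard monoisotopic atomic masses, the molecular formulas C4H10O2 (butanediol) and C4H8O2 (acetoin), and assumes singly charged adducts.

```python
# Standard monoisotopic atomic masses (Da).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
ELECTRON = 0.00054858

# Mass added by a singly charged cation adduct (atom mass minus one electron).
ADDUCT = {
    "H": 1.00782503 - ELECTRON,
    "Na": 22.98976928 - ELECTRON,
}

BUTANEDIOL = {"C": 4, "H": 10, "O": 2}  # 1,3BDO and 2,3BDO are isomers
ACETOIN = {"C": 4, "H": 8, "O": 2}


def neutral_mass(formula):
    """Monoisotopic mass of a neutral molecule given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def adduct_mz(formula, adduct):
    """m/z of the singly charged [M+adduct]+ ion."""
    return neutral_mass(formula) + ADDUCT[adduct]


if __name__ == "__main__":
    print(f"butanediol [M+H]+  : {adduct_mz(BUTANEDIOL, 'H'):.4f}")   # 91.0754
    print(f"butanediol [M+Na]+ : {adduct_mz(BUTANEDIOL, 'Na'):.4f}")  # 113.0573
    print(f"acetoin    [M+H]+  : {adduct_mz(ACETOIN, 'H'):.4f}")      # 89.0597
```

Because 1,3BDO and 2,3BDO share a formula, their Na+ adducts have the same m/z, which is consistent with the caption's choice of the H+ adduct to distinguish 1,3BDO.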

Figure C4 (continued). B: LC-MS analysis of the cell lysate of TK12 culture. Likewise, no measurable 1,3BDO was observed.
