Development of New Enzyme Activities for Applications in Synthetic Biology:

Carboxylic Acid Amidases (CAA)

A thesis submitted to The University of Manchester for the degree of Doctor of Philosophy (PhD) in the Faculty of Science and Engineering

2018

Alexander J Wood

School of Chemistry

Contents

Figures
Tables
List of Abbreviations
Abstract
Declaration
Copyright Statement
Acknowledgments
Chapter 1. Introduction
1.1. General introduction to amides and their production
1.2. Chemical methods of amide synthesis and their limitations
1.3. Enzymatic synthesis of amides
1.4. Carboxylic acid reductase
1.5. Objective of the project: Engineer CAR into a broad specificity amide synthetase
Chapter 2. Developing a CAR-VibB, VibH fusion enzyme system
2.1. Introduction
2.2. Gene analysis and CARmm-VibB fusion design
2.3. Vib system gene cloning and CARmm A domain-VibB PCP domain fusion
2.4. Expression trials of Vib and CARmm-VibB fusion genes
2.5. Discussion and future work
Chapter 3. Direct amide bond formation using CARs
3.1. Introduction
3.2. CAR gene expression and enzyme purification
3.3. CAR dependent amide formation - method development and substrate screen
3.4. Conclusions
Chapter 4. Investigation of the mechanism of CAR-dependent amide synthesis
4.1. Introduction
4.2. CAR production in the absence of co-produced Sfp
4.3. CAR mutagenesis and the removal of the phosphopantetheine binding site
4.4. Use of truncated CAR for amide formation
4.5. Influence of a PPant mimetic on amide formation
4.6. Investigation of CAR enantioselectivity using chiral amines
4.7. Analysis of adenylation activity using the EnzChek phosphate detection kit
4.8. Structural modelling of amines into CAR active site
4.9. Whole-cell CAR-dependent amide formation
4.10. Conclusion and future work
Chapter 5. Use of radical substrates for studies of CAR dynamics
5.1. Introduction
5.2. Kinetic studies with radical substrates
5.3. Discussion and future work
Chapter 6. Discussion of results and perspectives
Chapter 7. Experimental procedures
7.1. General methods and materials
7.2. Genes and molecular cloning
7.3. Protein production and purification by nickel affinity chromatography
7.4. Biotransformations and analysis
7.5. Investigation of coupling between CAR-dependent ATP consumption and amide formation using an EnzChek kit
7.6. Enzyme kinetics analysis of native CAR activity with radical-TEMPO carboxylic acid
7.7. Structural modelling of piperidine 52 into the active sites of CAR A domain structures
References
Appendices
Appendix 1: Genes used in this work
Appendix 2: PCR primers and PCR conditions
Appendix 3: Example enzyme nickel affinity purification, AKTA UV chromatograms
Appendix 4: Example HPLC traces showing CAR-dependent amide formation

Final Word Count: 44044

Figures

Figure 1.1: Examples of amide-containing drugs, including the anti-cancer drug imatinib 1, the antibiotic cefpiramide 2 and the sleeping disorder drug modafinil 3.
Figure 1.2: Use of coupling or activating reagents to form amides from carboxylic acids and amines.
Figure 1.3: Exploitation of the serine protease catalytic mechanism for amide formation.
Figure 1.4: Overcoming the natural substrate specificity of proteases through the use of substrate mimetics for amide formation.
Figure 1.5: Selected examples of amide products produced by the lipases CALB and PPL respectively.
Figure 1.6: Synthesis of aminoacyl-tRNAs by aminoacyl tRNA synthetase.
Figure 1.7: Module and domain composition of the NRPS tyrocidine synthetase.
Figure 1.8: Conformational changes of the C-terminal subdomain of A domains permit adenylation or thiolation.
Figure 1.9: PCP domain and C domain functions within NRPS enzyme complexes.
Figure 1.10: A proposed mechanism of C domain-catalysed amide formation.
Figure 1.11: Subdomain swapping in NRPS systems allows the production of novel peptides.
Figure 1.12: Module shuffling within the tyrocidine synthetase system to produce novel peptides.
Figure 1.13: Domain shuffling between the bacitracin synthetase and tyrocidine synthetase systems for novel dipeptide formation.
Figure 1.14: Model proposed for ATP-grasp enzyme-catalysed amide formation.
Figure 1.15: Exploiting the wide substrate breadth of ATP-Grasp enzymes YwfE and PGM1.
Figure 1.16: Amide synthesis by the amide synthetase McbA.
Figure 1.17: Amide synthesis by the adenylate forming amide ligases NovL and CouL.
Figure 1.18: An overview of enzymatic methods of amide formation.
Figure 1.19: Carboxylic acid reductase reaction and domain composition and functions.
Figure 1.20: Selected examples of carboxylic acids reduced by CARni.
Figure 1.21: Carboxylic acid activation and amide formation by the vibriobactin synthetase system.
Figure 2.1: Proposed CAR-Vib fusion to combine CAR and Vib enzymatic activities.
Figure 2.2: Pfam analysis of CAR and VibB domain boundaries to guide fusion design.
Figure 2.3: PCR amplification of pET21a CARmm A domain vector and VibB PCP domain gene fragment for subsequent fusion.
Figure 2.4: Agarose gel visualisation of PCR and restriction digest products for subsequent cloning of VibE and VibH genes into expression vectors.
Figure 2.5: SDS-PAGE protein analysis of CAVibB-transformed BL21 (DE3) cell lysate nickel affinity purification by AKTA.
Figure 2.6: SDS-PAGE of CAVibB-transformed BL21 (DE3) cell lysate nickel affinity purification by AKTA, following Halt inhibitor addition to lysate.
Figure 2.7: SDS-PAGE of negative control, pET28b-transformed BL21 (DE3) cell lysate nickel affinity purification by AKTA.
Figure 2.8: SDS-PAGE of VibE-transformed BL21 (DE3) cell lysate nickel affinity purification by AKTA.
Figure 2.9: SDS-PAGE of VibH-transformed BL21 (DE3) cell lysate nickel affinity purification by AKTA.
Figure 2.10: SDS-PAGE of VibE-transformed BL21 (DE3) cell lysate nickel affinity purification by AKTA, following published grow-up conditions.
Figure 2.11: SDS-PAGE of VibH-transformed BL21 (DE3) cell lysate nickel affinity purification by AKTA, following published grow-up conditions.
Figure 2.12: SDS-PAGE of VibE, VibH or CAVibB-transformed BL21 (DE3) cell lysates following 20°C expression trial.
Figure 2.13: SDS-PAGE of VibE, VibH, CAVibB or pET28b-transformed BL21 (DE3) cell lysates following 30°C expression trial.
Figure 2.14: SDS-PAGE of VibE, VibH or CAVibB-transformed BL21 (DE3) cell lysates following 37°C expression trial.
Figure 2.15: Screenshot of Clustal Omega amino acid alignment of the CARmm and VibB PCP domains.
Figure 3.1: Proposed interception of thioester intermediates bound to the PCP domain of CAR by amines.
Figure 3.2: SDS-PAGE of CARmm-expressing BL21 (DE3) cell lysate nickel affinity purification by AKTA.
Figure 3.3: SDS-PAGE of CARni-expressing BL21 (DE3) cell lysate nickel affinity purification by AKTA.
Figure 3.4: Typical NADPH absorbance assay at 340 nm before and after the addition of active purified CARmm measuring the reduction of carboxylic acids.
Figure 3.5: HPLC traces showing the CAR-dependent production of benzamide 38 from benzoic acid 25 and ammonia.
Figure 3.6: HPLC response factor calibration curve between equimolar concentrations of benzoic acid 25 and benzamide 38 at 230 nm.
Figure 3.7: CAR dependent formation of the pharmaceutically-relevant amide ilepcimide 58.
Figure 3.8: Conversion to 58 by reaction of CARmm with 57 and 52 at different pH values.
Figure 3.9: Conversion to 58 by CARmm achieved at varying excesses of 52.
Figure 3.10: Conversion to 58 by CARmm achieved at varying concentrations of ATP.
Figure 3.11: Time-course analysis of conversion to 58 by an optimised 30°C, pH 9.0 CARmm reaction.
Figure 3.12: Time-course analysis of conversion to 38 by the optimised 30°C or 37°C, pH 9.0 CARmm reactions.
Figure 4.1: SDS-PAGE of Sfp-absent, CARmm-expressing BL21 (DE3) cell lysate nickel affinity purification by AKTA.
Figure 4.2: SDS-PAGE of Sfp-absent, CARni-expressing BL21 (DE3) cell lysate nickel affinity purification by AKTA.
Figure 4.3: Mutation of CARni. Visualisation of the linear mutated CARni PCR product on an agarose gel.
Figure 4.4: SDS-PAGE of CARni S689A-expressing BL21 (DE3) cell lysate nickel affinity purification by AKTA.
Figure 4.5: Truncated CARmm structural composition and purification.
Figure 4.6: Influence of 4’-phosphopantetheine mimetics on CAR-dependent amide bond formation.
Figure 4.7: Potential methods of CAR-catalysed amide formation via acyl adenylate interception.
Figure 4.8: Chiral amines 60 and 61 utilised for enantioselectivity studies of CAR-dependent amidation and the expected product of the 61 reaction, 62.
Figure 4.9: Use of the EnzChek phosphate assay kit and inorganic phosphatase to analyse ATP consumption during CAR-dependent adenylation.
Figure 4.10: EnzChek kit Pi concentration calibration curve generation and analysis of ATP coupling to amide formation.
Figure 4.11: Screenshots of Autodock 4.0 modelling of piperidine 52 into the active site of CARni structure PDBID: 5MSD.
Figure 4.12: Screenshots of Autodock 4.0 modelling of piperidine 52 into the active site of CARsr structure PDBID: 5MST.
Figure 4.13: SDS-PAGE of pET28bCARmm or CARmm729-1175-transformed BL21 (DE3) cell lysates from whole-cell studies.
Figure 5.1: The potential activation and reduction of radical carboxylic acids by CAR for analysis of enzymatic dynamics.
Figure 5.2: Kinetics analysis of reduction activity of TEMPO carboxylic acid 66 by CARmm.
Figure 5.3: Kinetics analysis of reduction activity of benzoic acid 25 by CARmm.
Figure 7.1: MS analysis of the amide product 62.
Figure 7.2: NMR analysis of the purified product 58.

Tables

Table 3.1: Panel of carboxylic acids screened for CAR primary amidation activity with their corresponding primary amides.
Table 3.2: HPLC response factors obtained between various carboxylic acids and their corresponding primary amides at equimolar concentrations.
Table 3.3: Conversions achieved for the initial primary amide production screen from various carboxylic acids following reaction with CAR.
Table 3.4: Conversions to amide 44 achieved in a preliminary pH profile of CAR-dependent amide formation.
Table 3.5: Conversions achieved for a repeat of the primary amide screen at pH 9.0.
Table 3.6: Conversions to secondary and tertiary amides obtained by CAR-dependent amide formation with methylamine 51, piperidine 52 and propargylamine 53.
Table 3.7: Improved conversions achieved for ilepcimide 58 production when supplementary batches of CARni are added to the amide forming reaction.
Table 3.8: Conversions to 58 achieved at different time points during the CARni reaction up to 24 h.
Table 3.9: Conversions achieved for ilepcimide 58 production at different temperatures using CARmm and CARni.
Table 3.10: Conversions to amides 38, 44-48 and 54-56 achieved at different temperatures using CARmm and CARni.
Table 7.1: Isocratic HPLC methods used to separate acid substrates and amide products with percentages of buffers A and B and absorbance wavelength.
Table 7.2: Isocratic and gradient method employed for the separation of (2E)-3-(1,3-Benzodioxol-5-yl)acrylic acid 57 and ilepcimide 58 by HPLC.

List of Abbreviations

	empty or removed (plasmids or amino acids removed from truncation respectively)
µ	micro
Å	Angstrom
A	Adenylation (domain)
AA	Amino acid
aaRS	Aminoacyl tRNA synthetase
ACP	Acyl carrier protein
Adm	Andrimid synthetase
AngB	Anguibactin synthetase B
ANL	Adenylate forming
AMP	Adenosine monophosphate
ATP	Adenosine triphosphate
Bac	Bacitracin synthetase
Boc	Tert-butyloxycarbonyl
BOP	(Benzotriazol-1-yloxy)tris(dimethylamino)phosphonium hexafluorophosphate
bp	base pairs
C	Condensation (domain)
CALB	Candida antarctica Lipase B
CAR	Carboxylic acid reductase
CAVibB	CAR A domain-VibB carrier protein fusion
Cy	Cyclisation (domain)
Da	Dalton
DCM	Dichloromethane
Dcp	D-alanyl carrier protein
DCU	Dicyclohexylurea
DHB	Dihydroxybenzoic acid
DMF	Dimethylformamide
DMSO	Dimethyl sulfoxide
DNA	Deoxyribonucleic acid
Dpt	Daptomycin synthetase
E	Epimerisation (domain)
EDC	3-(ethyliminomethyleneamino)-N,N-dimethylpropan-1-amine
EntE	Enterobactin synthetase E
EPR	Electron paramagnetic resonance
ESI	Electrospray ionisation
F domain	Formylation (domain)
FAS	Fatty acid synthetase
Fmoc	Fluorenylmethyloxycarbonyl chloride
FT	Column flow through
HMPA	Hexamethylphosphoric triamide
HMWP2	Yersiniabactin synthetase carrier protein
HOAt	1-Hydroxy-7-azabenzotriazole
HOBt	Hydroxybenzotriazole
HPLC	High-performance liquid chromatography
I	Insoluble (fraction)
IPP	Inorganic pyrophosphatase
IPTG	Isopropyl β-D-1-thiogalactopyranoside
kb	kilobases
kcat	Turnover number
kDa	kilodalton
KM	Michaelis constant
L	Lysate
LCMS	Liquid chromatography–mass spectrometry
M	Module
MESG	2-Amino-6-mercapto-7-methyl-purine riboside
MIBK	Methyl isobutyl ketone
mRNA	Messenger RNA
MS	Mass spectrometry
MT	Methyl transferase (domain)
MW	Molecular weight
MWCO	Molecular weight cut-off
NADP	Nicotinamide adenine dinucleotide phosphate
NMR	Nuclear magnetic resonance
NRP	Nonribosomal peptide
NRPS	Nonribosomal peptide synthetase
NSPD	Norspermidine
P	PCP (domain)
PchE	Pyochelin synthetase E
PCP	Peptidyl carrier protein
PEGA	Poly(ethylene glycol)-acrylamide
Pi	Inorganic phosphate
PKS	Polyketide synthetase
PPant	4′-Phosphopantetheine
PPi	Pyrophosphate
PPL	Porcine pancreas lipase
PPTase	4′-Phosphopantetheinyl transferase
PyBOP	Benzotriazol-1-yl-oxytripyrrolidinophosphonium hexafluorophosphate
R	Reduction (domain)
RNA	Ribonucleic acid
rpm	Revolutions per minute
SD	Shine-Dalgarno
SPPS	Solid-phase peptide synthesis
Srf	Surfactin synthetase
TE	Thioesterase (domain)
TEMPO	(2,2,6,6-Tetramethylpiperidin-1-yl)oxyl
THF	Tetrahydrofuran
tRNA	Transfer RNA
Tyc	Tyrocidine synthetase
U	Units
Vib	Vibriobactin synthetase
Vmax	Maximum velocity

Abstract

This thesis reports the exploitation of carboxylic acid reductase adenylation activity to produce a range of primary, secondary and tertiary amides under mild conditions. The amide bond is one of the most vital functionalities in Nature and industry, found in the peptide bonds of proteins and the amide linkages of many drug products. However, traditional methods of amide formation suffer from poor atom economy because they rely on coupling agents that must be used in stoichiometric amounts, and the use of organic solvents further increases the waste and environmental impact of these processes. Biocatalytic methods of amide formation, meanwhile, are frequently limited in substrate scope and often also require the use of organic media. A key goal of this PhD project, therefore, was the development of new, broad specificity, aqueous enzymatic activities that would produce amides. We initially aimed to produce chimeric enzymes between broad specificity carboxylic acid reductase (CAR) adenylation domains and amide forming nonribosomal peptide synthetase (NRPS) enzyme systems. It was hoped that such a fusion would combine broad specificity carboxylic acid activation with amide forming activity. Ultimately this proved unsuccessful, yet success was found by exploiting the broad specificity carboxylic acid activation activity of native CARs directly. This work demonstrates that by altering the reaction conditions of CARs, replacing the natural reducing cofactor NADPH with amine nucleophiles in alkaline conditions, it is possible to intercept activated intermediates to produce amides. Mutational work demonstrated that adenylation activity was sufficient for subsequent amide formation and that the acyl adenylate intermediate could be intercepted to yield amides. Through the introduction of different carboxylic acid substrates and various amines, it was possible to produce a range of primary, secondary and tertiary amides, albeit with low conversions. By optimising the reaction conditions, it was possible to produce a target drug molecule, the anticonvulsant ilepcimide, with up to 96% conversion with purified CAR enzyme. Additionally, a scale-up reaction using this method with CAR-containing cell lysate allowed the milligram-scale production of ilepcimide with 19% yield. Moreover, ATP consumption assays showed that amide formation and ATP consumption are coupled, suggesting that attack on the acyl adenylate by the amine occurs while the former is still bound to the enzyme. This project, therefore, lays the groundwork for future studies into the range of amides which can be produced by CARs, and potentially also by the many other adenylating enzymes with differing and complementary substrate specificities.


Declaration

Title: Development of New Enzyme Activities for Applications in Synthetic Biology: Carboxylic Acid Amidases (CAA)

Author: Alexander J L Wood

I declare that no portion of the work referred to in the thesis has been submitted in support of an application for another degree or qualification of this or any other university or other institute of learning.

Date: 04/01/2018


Copyright Statement

i. The author of this thesis (including any appendices and/or schedules to this thesis) owns certain copyright or related rights in it (the “Copyright”) and s/he has given The University of Manchester certain rights to use such Copyright, including for administrative purposes.

ii. Copies of this thesis, either in full or in extracts and whether in hard or electronic copy, may be made only in accordance with the Copyright, Designs and Patents Act 1988 (as amended) and regulations issued under it or, where appropriate, in accordance with licensing agreements which the University has from time to time. This page must form part of any such copies made.

iii. The ownership of certain Copyright, patents, designs, trademarks and other intellectual property (the “Intellectual Property”) and any reproductions of copyright works in the thesis, for example graphs and tables (“Reproductions”), which may be described in this thesis, may not be owned by the author and may be owned by third parties. Such Intellectual Property and Reproductions cannot and must not be made available for use without the prior written permission of the owner(s) of the relevant Intellectual Property and/or Reproductions.

iv. Further information on the conditions under which disclosure, publication and commercialisation of this thesis, the Copyright and any Intellectual Property and/or Reproductions described in it may take place is available in the University IP Policy (see http://documents.manchester.ac.uk/DocuInfo.aspx?DocID=24420), in any relevant Thesis restriction declarations deposited in the University Library, The University Library’s regulations (see http://www.library.manchester.ac.uk/about/regulations/) and in The University’s policy on Presentation of Theses.


Acknowledgments

Of the many people to whom I give thanks for their contributions to my project, my greatest thanks must go to my supervisor Professor Sabine Flitsch, whose guidance, knowledge and novel ideas have been instrumental in the development and success of this project. Additionally, I thank my co-supervisor Professor Nicholas Turner, who has also played a major role throughout this project. My thanks also go to CoEBio3 and its affiliates, without whose funding and interest this work would simply not have been possible. In particular, Dr. Richard Lloyd, formerly of Dr. Reddy’s and in his capacity as industrial mentor, has given much advice and an industrial perspective to my work, for which I thank him. On a personal level, none of my work would have been possible without the unwavering support of my loving family, who have helped me through my years of studies and have always been by my side. I also give thanks to my close colleagues and friends, in particular Dr. Nicholas Weise, who has advised me through much of my project and has provided much enthusiasm, advice, and ideas which have been pivotal in achieving positive results. I give thanks to my friend, Dr. Michael Hollas, who has been by my side throughout my entire postgraduate studies and has also contributed greatly to my project. Finally, I thank my other close friends and colleagues in the Manchester Institute of Biotechnology and the Turner-Flitsch group who have aided me both personally and in my project, including Laura Jeffreys, Dr. Mark Dunstan, Professor David Leys, Dr. Sasha Richardson, Dr. Deepankar Gahloth, Dr. Fabio Parmeggiani, Dr. Joanne Porter, Dr. Daniela Quaglia, Dr. Lorna Hepworth, Joseph Frampton and Paula Tipton.


Chapter 1. Introduction

1.1. General introduction to amides and their production

Amides are one of the most important functionalities in chemistry.1–3 They are found in one quarter of all pharmaceutical compounds and amide bond formation represents one of the most common reactions performed in the pharmaceutical industry.4,5

Some key examples of amide-containing drug molecules include the cancer treatment drug imatinib 1,6 the antibiotic cefpiramide 2,7 and the sleeping disorder treatment drug modafinil 3,8 (Figure 1.1). Furthermore, the amide bond is fundamental to the building blocks of life: proteins are built from proteinogenic L-amino acids whose backbones are linked through peptide bonds formed by the ribosome.9

Figure 1.1: Examples of amide-containing drugs, including the anti-cancer drug imatinib 1, the antibiotic cefpiramide 2 and the sleeping disorder drug modafinil 3.

In industry, amides are typically produced through synthetic chemistry methods, with a survey of the largest pharmaceutical companies showing that amide formation represented 12% of all their chemical processes.10 At ambient temperatures, amines and carboxylic acids form an ammonium-carboxylate salt, but direct condensation can occur at temperatures above 100°C.2 However, such high temperatures are inefficient for large-scale processes and are not always compatible with the reactants being used. As a result, carboxylic acids are generally activated by coupling reagents to permit attack by the amine nucleophile at suitable temperatures.2 Although these processes can produce amides with high yields, they are not without limitations, including low atom economy, high waste, and the need to separate coupling agents and associated additives during product isolation.2,11

Moreover, the use of coupling agents can lead to racemisation of products, requiring additives such as hydroxybenzotriazole (HOBt) or 1-hydroxy-7-azabenzotriazole (HOAt) to suppress this side reaction, yet these too reduce atom economy and can introduce an explosion hazard.2,12,13 In addition to the requirement for coupling agents, amide formation is also mainly conducted in organic solvents,14 with most reactions being conducted in dichloromethane (DCM) or N,N-dimethylformamide (DMF),15,16 further increasing waste and the environmental impact of the reaction. Primarily with the aim of improving atom economy, the development of new methods of producing amides without the use of coupling agents was declared a key research priority by the world’s leading pharmaceutical companies, known as the pharmaceutical round table.17

As a means of avoiding the use of coupling agents, enzymes have been employed for the production of amides. This has principally focused on the use of hydrolase enzymes such as proteases and lipases for the synthesis of amides from carboxylic acids and amines through a transacylation reaction under mild conditions.3,18 While the former are generally limited in substrate breadth,18 the latter have been shown to be applicable to the synthesis of many different amides, removing the need for coupling agents.3,19

Indeed, amide formation using lipases is well established: these highly stable enzymes are readily available commercially and have a proven ability to produce amides from many carboxylic acid esters and amines.10,20 Their industrial use has been further enhanced by their ability to be immobilised, increasing stability and recoverability.21

Therefore, the synthetic chemist has a clear option to use hydrolase enzymes to form amides in the absence of coupling reagents. However, these enzymes generally require non-aqueous media to permit the reverse, transacylation reaction and to avoid hydrolysis of the amide product.3

Biocatalytic options for amide synthesis in aqueous media do exist, however, with nonribosomal peptide synthetases (NRPS)22 and ATP-grasp enzymes23 being capable of synthesising amides in water at the expense of adenosine triphosphate (ATP). Yet these enzymes are typically very narrow in substrate scope, limiting their industrial applicability.18

In planning the focus of this work, it was clear that more biocatalytic options for amide formation, removing the need for coupling reagents, were needed. It was felt that much effort had already been placed on hydrolase enzymes as amide synthetases which operate in organic solvents. Conversely, we believed that it was in aqueous amide biocatalysis that more research was required, in particular with the aim of developing hitherto non-existent aqueous biocatalysts with broad specificity for both carboxylic acids and amines. Engineering enzymes to produce many different amides without the need for organic solvents or coupling reagents would therefore provide a valuable tool to synthetic and biological chemists in conducting one of the most important and common reactions in chemistry with a reduced impact on the environment. To this end, the broad substrate breadth carboxylic acid activating activity of carboxylic acid reductases (CARs), which naturally reduce carboxylic acids to aldehydes, was targeted for engineering to introduce a novel amide synthetase activity.24

1.2. Chemical methods of amide synthesis and their limitations

Typically, the formation of amides is conducted via the activation of carboxylic acids with coupling reagents or activating agents. The activated carboxylic acid is then attacked by an amine nucleophile to yield the amide (Figure 1.2 a). The most common methods use carbodiimides such as dicyclohexylcarbodiimide (DCC) 4, or proceed via the generation of acyl chlorides from carboxylic acids with chlorinating agents such as thionyl chloride 5 (Figure 1.2 b).25 As well as forming amide bonds between small molecules, coupling agents are also used to form amides between amino acids during peptide coupling.13 This is generally conducted through the sequential coupling of N-α-protected amino acids to a de-protected peptide bound to a solid resin, followed by washing, de-protection and the addition of the next N-α-protected amino acid, in what is known as solid-phase peptide synthesis (SPPS).26

Amide coupling using carbodiimides can, however, yield an undesired and unreactive by-product, N-acyl urea.27 Further, undesired racemisation can also occur when longer peptides are used as the carboxylate donor,28 for example during SPPS, via intramolecular cyclisation and the production of an oxazolone intermediate.12 In order to suppress these side reactions, additives are often included to increase the rate of amide formation.2,11,29 Such additives include HOBt and HOAt,30 with the former being known to be explosive and all additives worsening the atom economy of the reaction, without being able to entirely eliminate racemisation.2 Moreover, the use of HOBt with DCC 4 can also result in the production of an undesired diazetidine by-product 6 (Figure 1.2 b).31 It is also important to note that while soluble carbodiimides, such as 3-(ethyliminomethyleneamino)-N,N-dimethylpropan-1-amine (EDC), can be used in aqueous solution, they are vulnerable to hydrolysis and must ideally be used at a higher pH to prevent protonation of the amine nucleophile.32
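The pH dependence of amine nucleophilicity noted above can be illustrated with a rearranged Henderson-Hasselbalch relation, which estimates the fraction of an amine present as the neutral free base at a given pH. The following Python sketch is illustrative only: the function name is hypothetical and the pKa values are approximate textbook figures assumed for the example, not data from this work. It also ignores ionic strength and temperature effects.

```python
def free_base_fraction(ph, pka):
    """Fraction of an amine present as the neutral, nucleophilic free base.

    Rearranged Henderson-Hasselbalch relation:
    [B] / ([B] + [BH+]) = 1 / (1 + 10**(pKa - pH))
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Approximate pKa values of the conjugate acids (assumed for illustration,
# not taken from this thesis).
PKA = {"ammonia": 9.25, "piperidine": 11.1}

for amine, pka in PKA.items():
    for ph in (7.0, 9.0, 11.0):
        fraction = free_base_fraction(ph, pka)
        print(f"{amine:10s} pH {ph:4.1f}: {fraction:.4f} free base")
```

Under these assumptions, only a very small fraction of either amine is unprotonated at neutral pH, which is consistent with the preference for higher pH stated above.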


Figure 1.2: Use of coupling or activating reagents to form amides from carboxylic acids and amines. a. Carboxylic acids are often activated by coupling or activating reagents which facilitate nucleophilic attack by an amine, yielding an amide. b. Examples of commonly used coupling and activating reagents, the carbodiimide DCC 4, and the chlorinating agent thionyl chloride 5, as well as the undesired diazetidine by-product 6 formed following reaction of DCC with HOBt.
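The atom economy penalty of a stoichiometric coupling agent can be made concrete with a short calculation. The Python sketch below compares a hypothetical direct condensation of benzoic acid with ammonia to give benzamide against the same transformation with one equivalent of DCC counted as a reactant. The choice of substrates, the function names and the rounded atomic masses are assumptions made for illustration, and any additive or base is deliberately left out.

```python
# Standard atomic masses (g/mol), rounded to three decimals.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def molar_mass(formula):
    """Molar mass from an element-count mapping, e.g. {"C": 7, "H": 6, "O": 2}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def atom_economy(product, reactants):
    """Percentage of the total reactant mass that ends up in the desired product."""
    total_reactant_mass = sum(molar_mass(r) for r in reactants)
    return 100.0 * molar_mass(product) / total_reactant_mass


# Hypothetical example reaction: benzoic acid + ammonia -> benzamide (+ water).
BENZOIC_ACID = {"C": 7, "H": 6, "O": 2}
AMMONIA = {"N": 1, "H": 3}
BENZAMIDE = {"C": 7, "H": 7, "N": 1, "O": 1}
DCC = {"C": 13, "H": 22, "N": 2}  # consumed stoichiometrically, ends up as DCU

direct = atom_economy(BENZAMIDE, [BENZOIC_ACID, AMMONIA])
with_dcc = atom_economy(BENZAMIDE, [BENZOIC_ACID, AMMONIA, DCC])

print(f"direct condensation:     {direct:.1f}%")
print(f"with stoichiometric DCC: {with_dcc:.1f}%")
```

In this idealised comparison the atom economy falls from roughly 87% to roughly 35% once the coupling agent is included; adding HOBt or a base to the reactant list would lower it further.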

Along with the use of carbodiimides and their associated additives, the other most common method of amide formation relies on first generating an acyl chloride intermediate. Acyl chlorides are generated by reaction with a chlorinating agent such as thionyl chloride 5,33 followed by a separate reaction with the amine to produce the amide,11 although it has been demonstrated that this can be done in a one-pot process.34

While acyl chlorides are highly reactive, the amide formation step yields HCl, which can be problematic for acid-sensitive processes such as those which use Boc-protected amines,35 and which may also form an HCl salt with the amine. Bases may, therefore, be added to trap the HCl.11 There are other disadvantages: acyl chlorides are vulnerable to hydrolysis in the presence of water36 and are prone to racemisation when used in peptide synthesis.11


In addition to these more common methods, other methods of amide formation are frequently used. An important group is the phosphonium salts, some of which are HOBt-based, but which are stand-alone coupling reagents that do not require carbodiimides.11 These include (benzotriazol-1-yloxy)tris(dimethylamino)phosphonium hexafluorophosphate (BOP)37 and benzotriazol-1-yl-oxytripyrrolidinophosphonium hexafluorophosphate (PyBOP).38 They proceed by the acid reacting to generate a reactive acyl-phosphonium species, which is subsequently attacked by -OBt to generate a Bt ester.11,39 A major disadvantage of these coupling agents is that they retain the potentially explosive HOBt component and their side products can be highly toxic; for example, the side product from BOP amide formation is hexamethylphosphoric triamide (HMPA), a known carcinogen.40

Overall, the long-established and most commonly used synthetic chemistry methods possess drawbacks. The use of coupling agents and additives increases waste and decreases atom economy,1,17 while some additives and side products are toxic or explosive.11 Furthermore, for the most part these reactions are conducted in organic solvents15,16,38 and often require the extra step of scavenging water molecules with molecular sieves.41 This combination of coupling agent, additive and organic solvent use in one of its single most common reactions contributes to a very poor E factor (kg waste/kg product) in the pharmaceutical industry as a whole, which compares badly with other industries such as oil refining and fine chemical production.42 This has spurred an industrial desire to use more environmentally friendly conditions for amide formation, removing the need for these coupling agents17 and, if possible, replacing toxic organic solvents with safer alternatives16 or aqueous media.43 Biocatalytic methods of amide formation have therefore received a great deal of academic and industrial interest as an environmentally friendly alternative.3
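The E factor referred to above is simply the mass of waste generated per unit mass of product. The Python sketch below shows how the contributions of the coupling agent, additive and solvent add up for a single batch. All masses are hypothetical figures chosen for illustration, the function name is an assumption, and waste is taken here as every input mass that is not isolated as product. Conventions differ on whether solvents and water are counted, so both cases are shown.

```python
def e_factor(input_masses_kg, product_kg):
    """E factor = kg waste / kg product.

    Waste is taken as all input mass that is not isolated as product
    (a simplifying assumption for this sketch).
    """
    if product_kg <= 0:
        raise ValueError("product mass must be positive")
    waste_kg = sum(input_masses_kg.values()) - product_kg
    return waste_kg / product_kg


# Hypothetical batch for a coupling-agent-mediated amide formation (kg).
batch = {
    "carboxylic acid": 1.0,
    "amine": 0.5,
    "coupling agent": 1.8,
    "additive": 0.3,
    "organic solvent": 20.0,
}
product_kg = 1.2  # hypothetical isolated amide

# Same batch with the solvent excluded from the waste calculation.
without_solvent = {k: v for k, v in batch.items() if k != "organic solvent"}

print(f"E factor including solvent: {e_factor(batch, product_kg):.1f}")
print(f"E factor excluding solvent: {e_factor(without_solvent, product_kg):.1f}")
```

Even in this simplified example, the solvent dominates the waste total, which illustrates why replacing organic solvents is considered alongside removing coupling agents.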


1.3. Enzymatic synthesis of amides

1.3.1. Biocatalytic formation of amides by hydrolases

In order to circumvent the use of coupling agents, much effort has been focused on developing hydrolase enzymes, particularly serine proteases for amide formation.3,44

Proteases have been a major target of medical research, focusing on their roles in pathologies and physiological processes.45–47 As a result, these enzymes and their mechanism of action are well characterised, and they are employed for a range of industrial uses in the chemical and food industries.48 Serine proteases typically cleave a peptide bond at a specific amino acid residue. For example, the serine protease trypsin cleaves the scissile bond on the C-terminal side of lysine or arginine residues, in a mechanism involving a catalytic triad composed of His, Ser and Asp.49–51 This catalytic triad is typical of serine proteases.52 In this mechanism the serine oxygen, having been deprotonated by the adjacent histidine residue, first attacks the carbonyl group of the peptide bond, forming a tetrahedral intermediate which is stabilised by hydrogen bonding with backbone and side-chain hydrogens in an oxyanion hole.3,52 Subsequently the C-N bond is cleaved, leaving a covalently bound acyl-enzyme intermediate.53 A water molecule is then deprotonated by the histidine and the resulting hydroxide attacks the carbonyl of the enzyme-bound intermediate, allowing hydrolysis to occur (Figure 1.3).54

Interestingly, it was found that in anhydrous conditions the reverse, amide-forming reaction was preferred, producing peptide bonds from esters and amines (Figure 1.3).44 The native catalytic machinery is exploited in this process by generating the same acyl-enzyme intermediate, with methanol acting as the leaving group; the amine then replaces water as the nucleophile, allowing amide synthesis.3,44 While proteases generally possess low selectivity towards the amine donor, they are highly selective towards the acyl donor, with a preference for the cognate amino acid of the hydrolysis reaction, which has severely limited their use in industry.18,55 A notable exception to this is the protease papain, which possesses broad specificity towards the amine acceptor while retaining a preference for lysine or arginine.56 Indeed, this was exploited by the Numata group to produce leucine-nylon peptides.57 Nonetheless, the generally narrow substrate scope of proteases has inspired imaginative techniques to overcome this limitation. One such method of broadening the substrate scope of proteases was the attachment of amino acid mimetics, which resemble the structure of the cognate amino acid, to various acyl donors, which are then able to act as substrates for the amide-forming reaction. In the work of Bordusa et al., the arginine mimetic 2-(4-hydroxyphenyl)guanidine 7 was esterified to various non-substrate acyl donors, allowing them to be amidated with the arginine-specific protease clostripain (Figure 1.4).18,55 Interestingly, this reaction was also conducted in water, with only low levels of hydrolysis products being observed. This is likely due to the removal of the arginine mimetic upon amide formation, with the resultant amide no longer possessing an arginine-like structure to permit binding to the protease active site and hydrolysis. Although this approach overcomes the innate selectivity of certain proteases, the extra step of chemically adding the 2-(4-hydroxyphenyl)guanidine ester results in higher costs of production, which may dissuade large-scale application of this method by industry.58 This has therefore encouraged the use of cheaper esters, such as simple benzyl esters, to improve the aminolysis of the unfavourable substrate glycine by papain, as demonstrated in the work by the Rutjes and Numata groups.58,59 Interestingly, as stated by the Rutjes group, it is believed that in this case the ester does not mimic the natural substrate or bind in an identical fashion to the substrate binding site, but that the ester nonetheless permits sufficient interaction with the enzyme for enzyme-specific aminolysis.


Figure 1.3: Exploitation of the serine protease catalytic mechanism for amide formation.44 i. In the first step, the Ser of the catalytic triad attacks the carbonyl carbon of the amide or ester substrate, generating a tetrahedral intermediate which is stabilised by an oxyanion hole. ii. The triad His residue then donates a proton to the leaving group oxygen and iii. an enzyme-acyl intermediate is formed. iv. Finally, this intermediate can then be hydrolysed by water in the natural mechanism or undergo aminolysis to yield an amide. X: N or C

The natural selectivity of proteases towards the amino acid acyl donor, in the absence of bound mimetics or affinity-improving groups, has not prevented proteases being exploited for their specificity in the production of peptide bonds, and they are the most commonly used enzymes for in vitro peptide bond formation.60 For example, α-chymotrypsin’s natural selectivity for large hydrophobic residues such as tyrosine was exploited to produce the dipeptide kyotorphin (Tyr-Arg).61 Combinations of proteases with complementary specificities have also been employed for the synthesis of commercially relevant peptides by ligation of shorter peptides. One example is the production of dynorphin (Tyr-Gly-Gly-Phe-Leu-Arg-Arg-Ile) from the condensation of Boc-Tyr(Bzl)-Gly-Gly-Phe-O-Et and Leu-Arg-Arg-Ile-N2H2Ph by α-chymotrypsin (Tyr, Trp, Phe and Leu selectivity),62 with the precursor Leu-Arg-Arg-Ile being produced by trypsin (Arg and Lys selectivity)49 from Boc-Leu-Arg-OMe and Arg-Ile-N2H2Ph in aqueous solution at pH 10.0, which was successful in suppressing hydrolytic activity.63 Generally, however, proteases are used in organic solvents such as isooctane, hexane and THF3,64 or in organic-aqueous mixes due to the ever-present risk of hydrolysis of the amide product,65 although organic solvents can have a detrimental effect on protease activity.3,18
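The residue selectivities quoted above for trypsin and α-chymotrypsin determine which peptide bonds each enzyme can form as well as cleave. The short Python sketch below maps those selectivities onto the dynorphin sequence given in the text. It is a simplified illustration only: the function names are hypothetical, the rule table contains just the residues named above, and real protease specificity depends on neighbouring residues that are ignored here.

```python
# Residue (one-letter code) on the N-terminal side of the bond that each
# protease acts on, taken from the selectivities quoted in the text.
PROTEASE_SPECIFICITY = {
    "trypsin": set("KR"),         # Lys, Arg
    "chymotrypsin": set("YWFL"),  # Tyr, Trp, Phe, Leu
}


def cleavage_sites(sequence, protease):
    """Return 1-based positions of residues whose C-terminal bond is a target."""
    residues = PROTEASE_SPECIFICITY[protease]
    # The last residue has no C-terminal peptide bond, so it is skipped.
    return [i + 1 for i, aa in enumerate(sequence[:-1]) if aa in residues]


def digest(sequence, protease):
    """Split a sequence at every predicted site (complete digestion)."""
    fragments, start = [], 0
    for site in cleavage_sites(sequence, protease):
        fragments.append(sequence[start:site])
        start = site
    fragments.append(sequence[start:])
    return fragments


dynorphin = "YGGFLRRI"  # Tyr-Gly-Gly-Phe-Leu-Arg-Arg-Ile

print(cleavage_sites(dynorphin, "chymotrypsin"))
print(cleavage_sites(dynorphin, "trypsin"))
print(digest(dynorphin, "trypsin"))
```

Position 4 (Phe-Leu) appears among the chymotrypsin sites and position 6 (Arg-Arg) among the trypsin sites, matching the two ligation points used in the dynorphin synthesis described above.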

An alternative method to promote amidation over hydrolysis with proteases in aqueous solution, has been shown through the immobilisation of amine substrates on poly(ethylene glycol)-acrylamide (PEGA) resins. When used with Fmoc66 N-α-protected amino acids, resin-bound peptides were produced which could subsequently be cleaved from the resins and extracted.67 It is believed that the equilibrium is shifted towards amide formation and away from hydrolysis through a combination of higher amine substrate loading being possible on the resin support, the suppression of ionisation of the amine nucleophile due to mutual electrostatic repulsion and improved solvation of the Fmoc-bound acyl donor.67,68


[Figure 1.4 panel labels: a. natural arginine substrate, peptide bond; b. non-native substrate, arginine mimetic ester]

Figure 1.4: Overcoming the natural substrate specificity of proteases through the use of substrate mimetics for amide formation.18 a. Clostripain selectively hydrolyses the C-terminal peptide bond of arginine residues. b. By forming an ester between an acid and 2-(4-hydroxyphenyl)guanidine 7, the ester could bind to the enzyme binding pocket and could undergo aminolysis.

The limited substrate scope of proteases for amide formation has, however, resulted in the use of other, broader specificity hydrolases, namely lipases. These enzymes natively hydrolyse triglycerides rather than peptide bonds and therefore, unlike proteases, they have no innate preference for specific amino acids and are less likely to hydrolyse the amide products.3,69 Yet they share the catalytic triad machinery of the serine proteases, allowing them to catalyse the same acyltransferase activity between esters and amines to produce amides, usually in organic solvents.3,19,70,71 There are, however, a small number of examples where lipases have been employed in aqueous conditions for amide formation, although the range of products produced in these examples is narrow.69,72 Lipases are now widely commercially available; for example, Candida antarctica lipase B (CALB) has been commercialised by Novozymes as Novozym 435, and is typically immobilised on acrylic resin, allowing separation from reaction mixtures and reuse.21

The broad specificity of lipases for both acyl and amine donors, as well as other attributes such as stability in organic solvents and at high temperatures, has made them very useful for industrial applications.3,56,73 The substrate breadth of individual lipases such as CALB includes short acyl donors such as butyl acetate, which could be amidated with benzylamine74 to produce 8, but also longer fatty acid esters such as lauric acid ethyl ester, which could be transacylated with ethanolamine to yield 9 (Figure 1.5).75 Interestingly, the free fatty acids could also react directly with ethanolamine, but with much poorer activity due to the formation of insoluble ion pairs between the fatty acids and ethanolamine in acetonitrile.75 A disadvantage of using both proteases and lipases is the usual requirement to activate the carboxylic acid as an ester prior to transacylation. However, it has been shown that ammonia from ammonium bicarbonate or ammonium carbamate could be used as the amine donor to directly amidate butyric acid to butyramide 10 using CALB as the biocatalyst in methyl isobutyl ketone (MIBK) solvent (Figure 1.5).20 This reaction required gradual addition of the ammonia substrate so as not to precipitate the acid substrate.

As well as amide production of small molecules, lipases, like proteases, have been used for peptide elongation by adding amines to peptidyl esters, for example in the production of the N-α-benzoate protected tetrapeptide Bz-Arg-Gly-Asp-Ser from Bz-Arg-O-Et as the acyl donor and Gly-Asp-Ser as the acyl acceptor, conducted with porcine pancreas lipase (PPL) in an aqueous-organic mix with 60% DMF.56,76 Another interesting use for lipases is the catalysis of intramolecular amide bond formation, such as the use of PPL to produce 5- to 7-membered lactam rings such as 11 from amino esters, using tert-amyl alcohol as the solvent, with 4-membered rings not being formed.77 Additionally, it was possible to produce bis-lactams such as 12 through the condensation of diamines and diesters with PPL in dichloromethane or chloroform, therefore offering the potential to produce more complex macrocyclic molecules (Figure 1.5).

Figure 1.5: Selected examples of amide products produced by the lipases CALB and PPL respectively.20,56,74–77

In summary, proteases and lipases have proven to be invaluable in the biocatalytic production of many different amide bonds. Yet proteases normally suffer from narrow substrate specificity, which has limited their usefulness in industrial processes; lipases, which have a much wider substrate scope, therefore show more promise for general amide formation.

While both proteases and lipases are normally used in organic solvents to promote amide formation and reduce hydrolysis, proteases have been shown to be capable of forming some amide bonds in aqueous solution. Lipases, however, with a few isolated exceptions,69,72 are much more dependent on the use of organic solvents for amide formation to improve activity.78 Indeed, this preference for organic solvent worsens the environmental impact of using these enzymes.79 This is compounded by the poor reactivity of proteases and lipases towards free carboxylic acids for aminolysis and a preference for ester substrates, resulting in leaving groups, such as methanol, being a by-product of transacylation reactions. This has therefore spurred research into enzymes which are able to directly activate free carboxylic acids in aqueous media to permit aminolysis and amide formation, without a prior requirement for chemical esterification.

1.3.2. ATP dependent amide formation

In nature, the principal biological machine for amide formation is the ribonucleoprotein complex of the ribosome, which is responsible for the formation of the vast majority of proteins and enzymes, with amino acid sequences being predetermined by the genetic code of transcribed mRNA sequences.9,19 As well as these ribosomal proteins, secondary metabolite peptides such as antibiotics and siderophores are produced by multi-module nonribosomal peptide synthetases (NRPS), which are found in bacteria and fungi.80 These produce peptides in a sequence determined not by an mRNA template but by the collinear condensation of amino acids, which are specifically and sequentially activated by the successive modules of the NRPS. Unlike the ribosome, NRPSs can incorporate modified, β- and D-amino acids into their peptide products.81–83

ATP-grasp enzymes also complement ribosomal and NRPS peptide synthesis, as some members of this family form amide bonds to produce dipeptides, again in a non-mRNA dependent manner.23 A common feature of these key natural methods of amide formation is that they all begin with the activation of the carboxylic acid substrate at the expense of the cofactor ATP.3

The ribosome produces peptides through transacylation between aminoacyl-tRNAs and the growing peptidyl-tRNA, in an order which is dictated by the mRNA coding sequence.84 Prior to peptide bond formation by the ribosome, these amino acids must be ligated to the appropriate tRNA molecule by a complementary aminoacyl tRNA synthetase (aaRS), which in the first reaction activates a specific amino acid carboxylate group at the expense of ATP by generating an aminoacyl adenylate intermediate. In the second reaction, the enzyme catalyses the transfer of the aminoacyl group from the adenylate to the correct tRNA molecule, producing the aminoacyl tRNA which is subsequently employed as a substrate by the ribosome (Figure 1.6).85 Ribosomes are regularly employed indirectly for the production of recombinant peptides, following cloning of protein-encoding genes into expression vectors and their introduction into competent bacteria,86 with proteins regularly being changed in structure by direct mutation of their DNA sequence19 or by directed evolution.87 Nonetheless, ribosomes are limited in their industrial application due to their requirement for the 20 naturally occurring proteinogenic L-amino acids (22 including the rare amino acids selenocysteine and pyrrolysine)88,89 and the requirement for an mRNA template.90 Moreover, their large size and complex composition of various different proteins and ribonucleotides91 make their isolation, engineering and modification to enhance substrate scope beyond these substrates impractical.

Figure 1.6: Synthesis of aminoacyl-tRNAs by aminoacyl tRNA synthetase. i. The amino acid is first adenylated at the expense of ATP. ii. The aaRS then catalyses the formation of AA-tRNA, releasing AMP.
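The two steps summarised in Figure 1.6 can also be written as a pair of reaction equations. The notation is an assumption made for this sketch: AA denotes the amino acid, AA-AMP the aminoacyl adenylate and PP_i pyrophosphate, and the equations require the amsmath package.

```latex
\begin{align}
\mathrm{AA} + \mathrm{ATP}
  &\xrightarrow{\text{aaRS}} \mathrm{AA\text{-}AMP} + \mathrm{PP_i} \\
\mathrm{AA\text{-}AMP} + \mathrm{tRNA}
  &\xrightarrow{\text{aaRS}} \mathrm{AA\text{-}tRNA} + \mathrm{AMP}
\end{align}
```

Summing the two steps shows that one ATP is consumed, as AMP and PP_i, for each amino acid loaded onto its tRNA.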


1.3.3. Nonribosomal peptide synthetases

NRPSs are a useful expansion to the synthetic toolbox for amide synthesis and are responsible for the synthesis of biologically important secondary metabolites such as the siderophore vibriobactin,92 and commercially vital compounds including antibiotics such as vancomycin93,94 or anticancer agents such as bleomycin.95 Unlike ribosomes, each NRPS subunit is composed of a single polypeptide, with interlinked modules that interact within the same protein sequentially adding a specific amino acid to the growing polypeptide,81 although some NRPS modules are separated and interact in an inter-protein fashion.96,97 The N- to C-terminal order of these modules pre-determines the N- to C-terminal order of the amino acids of the nonribosomal peptide (NRP) product in a collinear manner, with the number of NRPS modules reflecting the number of amino acids found in the peptide.98 This has been well characterised in the tyrocidine synthetase system of Bacillus brevis, which produces the cyclic peptide tyrocidine 13 (Figure 1.7).99 These modules are subdivided into catalytic domains which catalyse different reactions. These typically include an adenylation (A) domain, responsible for adenylating and activating a specific residue; a peptidyl carrier protein (PCP) with a bound 4’-phosphopantetheinyl (PPant) “swinging” arm, responsible for binding and transferring the activated residue between domains and modules; and a condensation (C) domain, which catalyses the condensation of two PPant-bound residues (the order of these domains being C-A-PCP, with the C domain typically absent in initiating modules).90,100 These domains form the core domain structure of NRPSs.101 A thioesterase (TE) domain is also usually found on the terminal module of the NRPS and is responsible for hydrolysing or cyclising the peptide product and releasing it from the multi-enzyme complex.102,103

As well as these common domains, which are pivotal for the elongation of the peptide chain, other domains are found within some NRPSs and are responsible for the modification of the peptide chain. These include epimerase (E) domains, which permit the incorporation of D-amino acids into the peptide.104 They racemise the L-amino acid residues, with the associated downstream C domain being enantioselective for the D-enantiomer, ensuring selective incorporation of the D-amino acid only.105,106 Incorporation of D-amino acids can reduce proteolytic degradation of the peptides, as proteases have a selective preference for L-amino acids.106,107 One such example is the E domain found in gramicidin S synthetase, which synthesises the antibiotic gramicidin S,108 and which is responsible for the racemisation of L-Phe, leading to the subsequent incorporation of D-Phe into the peptide chain.106 N-methyltransferase (MT) domains are another auxiliary domain found in some NRPSs, and catalyse the methylation of backbone amines of the peptide chain, making them less susceptible to hydrolysis.109 An example is the MT domain found on the NRPS cyclosporine synthetase, in which 7 out of the 11 amino acid backbone amines are N-methylated, which is essential for the chain elongation and intramolecular cyclisation activity of this specific NRPS.110 Other auxiliary domains can also be found in NRPSs, such as formylation (F) domains and reduction (R) domains,81 as well as cyclisation (Cy)111 domains, which catalyse the formation of heterocyclic rings on peptides such as the siderophore vibriobactin by vibriobactin synthetase F (VibF).112 Indeed, the ability of NRPSs not only to synthesise peptide chains, but also to introduce altered and non-proteinogenic amino acids into these chains, has provided a plethora of peptides with different and useful properties, beyond those provided by ribosomal peptides.81

A domains are considered to be the gatekeepers of selectivity, as they are needed for the specific activation of one specific residue for its subsequent incorporation into the peptide, with each A domain-containing module adding its own residue.113 They conduct two separate reactions. First, they catalyse the Mg2+-dependent nucleophilic attack by the carboxylic acid on the α-phosphate group of the ATP cofactor to generate an acyl-adenylate intermediate, which remains within the enzyme active site, and release pyrophosphate (PPi). Subsequently, the adenylation domain catalyses the thioesterification of the substrate carbonyl to the PPant group of the downstream PCP domain, releasing AMP.114

Figure 1.7: Module and domain composition of the NRPS tyrocidine synthetase.99 Tyc is composed of three subunits, TycA, TycB and TycC. All modules within these subunits contain at least an A domain (A) and a PCP (P). They can also contain C domains (C), E domains (E) and TE domains (TE). Each module activates a specific amino acid, and the order of the modules determines a collinear order of the amino acids within the NRP. The resulting product is the cyclic peptide tyrocidine 13.
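The collinearity rule described for tyrocidine synthetase, in which module order fixes residue order and an E domain leads to a D-residue, can be expressed as a small data model. The Python sketch below is illustrative only. The class and function names are hypothetical, and the per-module residue and domain assignments are taken from general knowledge of tyrocidine A rather than from this text, so they should be checked against Figure 1.7 and reference 99 before being relied upon.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    subunit: str     # polypeptide that carries the module
    domains: tuple   # domain order, N to C terminus
    activates: str   # amino acid selected by the A domain


# Assumed layout of tyrocidine synthetase (illustrative, see lead-in).
TYROCIDINE_SYNTHETASE = [
    Module("TycA", ("A", "PCP", "E"), "L-Phe"),
    Module("TycB", ("C", "A", "PCP"), "L-Pro"),
    Module("TycB", ("C", "A", "PCP"), "L-Phe"),
    Module("TycB", ("C", "A", "PCP", "E"), "L-Phe"),
    Module("TycC", ("C", "A", "PCP"), "L-Asn"),
    Module("TycC", ("C", "A", "PCP"), "L-Gln"),
    Module("TycC", ("C", "A", "PCP"), "L-Tyr"),
    Module("TycC", ("C", "A", "PCP"), "L-Val"),
    Module("TycC", ("C", "A", "PCP"), "L-Orn"),
    Module("TycC", ("C", "A", "PCP", "TE"), "L-Leu"),
]


def incorporated_residue(module):
    """Residue passed downstream; an E domain is modelled as L to D conversion."""
    residue = module.activates
    if "E" in module.domains and residue.startswith("L-"):
        return "D-" + residue[2:]
    return residue


def predicted_peptide(modules):
    """Collinearity rule: the N-to-C module order gives the residue order."""
    return [incorporated_residue(m) for m in modules]


def has_core_domains(modules):
    """Check every module has A and PCP, and every non-initiating module has C."""
    for index, module in enumerate(modules):
        if "A" not in module.domains or "PCP" not in module.domains:
            return False
        if index > 0 and "C" not in module.domains:
            return False
    return True


print("-".join(predicted_peptide(TYROCIDINE_SYNTHETASE)))
print(has_core_domains(TYROCIDINE_SYNTHETASE))
```

The model treats epimerisation as a complete L to D switch, whereas the text describes racemisation followed by selection of the D-enantiomer by the downstream C domain; the simplification gives the same predicted product.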


Structural studies using X-ray crystallography on the first A domain of gramicidin S synthetase I (GrsaI), which activates L-Phe, revealed that the A domain itself is split into a larger N-terminal subdomain and a smaller C-terminal subdomain.115 Substrate binding and recognition occur in a substrate binding pocket while ATP binding occurs in an adjacent cleft, both of which are found within the larger N-terminal subdomain. However, a disordered loop originating from the smaller C-terminal subdomain projects into the active site and a lysine of this loop binds both the adenosine of ATP and the phenylalanine, resulting in the C-terminal subdomain being clamped over the substrate active site.115

Further structural studies with X-ray crystallography on other A domains from NRPS and other related adenylating enzymes reveal that this N/C-terminal subdomain division of adenylation domains is typical, with a mobile C-terminal subdomain playing an important role in catalysing the two separate adenylation and thiolation reactions by rotating and presenting a different face to the active site depending on the reaction state.116–118 These crystallographic studies, revealing separate A domain subdomains and their conformational changes, have been supported by proteolytic studies on the NRPS protein Tyc1, in which a cleavage site was located in the linker region between the two subdomains. The rate of cleavage was found to be substantially reduced upon the addition of substrates, when the A domain undergoes conformational change.119 The C-terminal subdomain rotates to present the conserved loop lysine, which plays a role in catalysis of substrate adenylation by transition state and leaving group stabilisation,117 and subsequently rotates to present an alternative face to the active site, which can interact with the PPant group of the PCP, facilitating thiolation (Figure 1.8).118,120–122


Figure 1.8: Conformational changes of the C-terminal subdomain of A domains permit adenylation or thiolation.116–118 i. To permit adenylation the adenylation face of the subdomain (green) is in contact with the active site in the N-terminal core domain. ii. The subdomain undergoes a conformational rotation to reveal an alternative face to the active site. iii. With the thiolation face of the subdomain in contact with the active site (orange), the PPant arm on the PCP can enter the active site and thiolation is catalysed.

NRPS A domains generally have a very narrow substrate specificity and are able to select their cognate amino acids based on a combination of the substrate’s size, structure and charge.81 Villiers et al. investigated the substrate specificity of the initiator A domain of tyrocidine synthetase 1 (TycA), the cognate residue of which is L-Phe, and tested a total of 30 D and L natural and unnatural amino acids.123 They demonstrated that L-Phe was significantly preferred over all other amino acids, with 3 orders of magnitude higher catalytic efficiency than towards L-Tyr, the second best performing substrate. Interestingly, however, most amino acids could act as substrates, with the exception of amino acids with charged R groups, which showed no activity. The enzyme showed a strong preference for amino acids with hydrophobic, benzyl and larger R groups, reflecting the large and hydrophobic nature of the substrate binding pocket, which contains residues such as Trp, Ile, Thr and Ala responsible for R group binding. However, amino acids which are hydrophobic but larger than L-Phe, such as L-Trp, were poorer substrates due to steric exclusion from the active site. Overall, this work demonstrated that even between substrates that were similar in size and structure, the A domain of NRPS can preferentially select and activate its cognate amino acid with great efficiency.

With the identification of the residues responsible for R-group binding, the obvious next step was to engineer the substrate binding site residues via mutagenesis to alter the substrate specificity of the A domain. For example, Eppelmann et al. succeeded in altering the specificity of the L-Glu initiating module A domain of surfactin synthetase A (SrfA-A1) to L-Gln by mutating a single residue, Lys239 to Gln239.124 This success of rational design showed that, through understanding the nature of substrate binding, it was possible to predict the changes required to achieve the desired selectivity, a major advance in building designer-made amide synthetases. They further expanded on this work by mutating the 5th module A domain in surfactin synthetase B (SrfA-B2) to alter the specificity from L-Asp to L-Asn, resulting in the incorporation of L-Asn in the 5th position of the peptide, thus generating the novel [Asn5] surfactin variant.

Although the substrate specificity of the initiating A domain of SrfA-A1 was successfully altered, this resulted in a complete switch of selectivity, with only trace activity towards the formerly cognate L-Glu, and so does not represent a broadening of specificity. Consequently, it is clear that such a process of rational redesign may only be applicable where a single target molecule must be incorporated into an NRP in place of another, rather than for developing a broad specificity amide synthetase. Additionally, as highlighted in the review by Linne and Marahiel, L-Gln and L-Glu are similar in size and structure, if not charge, and it is clear that large-scale change of the substrate binding pocket was not required. Indeed, it is likely that more than one mutation would be required to significantly change the properties of the amino acid being activated, for example changing the selectivity to L-Trp, which is bulkier and more hydrophobic than either L-Gln or L-Glu.112

An alternative to the rational design of the binding pockets is the use of directed evolution to generate alternative specificity of NRPS A domains. In the work by Evans et al., saturation mutagenesis was employed to randomly mutate the active site residues of the naturally Val-specific A domain of andrimid synthetase K (AdmK) to generate a mutant library, which was then screened to analyse the andrimid variants produced. This showed that the substrate specificity could be altered from valine to larger residues such as phenylalanine and isoleucine, depending on the mutant produced, which were incorporated into the larger peptide structure produced in conjunction with the other Adm modules.125

This demonstrates that NRPS can indeed be evolved towards altered specificity, with residues which differ in size and structure from the cognate amino acid being introduced into NRPs, without the need for rational design. However, this was again shown only to switch specificity rather than greatly broaden it, with the mutated A domains remaining selective for only one or two new residues, and therefore does not overcome the limitation that the A domain cannot be used for the wide introduction of many carboxylic acid components. The generation of a truly broad specificity NRPS A domain via mutagenesis has therefore yet to be demonstrated.126

The PCP is usually the next domain found downstream of the A domain, and is responsible for the covalent binding of the activated substrate via its bound PPant group, which must be transferred to the PCP apo-enzyme posttranslationally through the action of a phosphopantetheinyl transferase (PPTase).81 PPTases catalyse the Mg2+-dependent transfer of the 4’-phosphopantetheine group from coenzyme A (CoA) to the hydroxyl of a conserved serine residue found in PCP domains,127,128 identifiable within a consensus sequence which is typically composed of V, G/L, G/A/F/Y, D/H/K/E, S, L/Q, D/A/G.129
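The seven-position consensus sequence quoted above translates directly into a regular expression, which gives a quick way of locating candidate PPant attachment serines in a protein sequence. The Python sketch below is a hedged illustration: the function name and the test sequence are hypothetical, and because the text describes the consensus only as typical, a strict match of this kind would be expected to miss genuine PCP domains that deviate from it.

```python
import re

# One character class per consensus position:
# V, G/L, G/A/F/Y, D/H/K/E, S, L/Q, D/A/G
PCP_CONSENSUS = r"V[GL][GAFY][DHKE]S[LQ][DAG]"

# A lookahead is used so that overlapping matches are also reported.
_PATTERN = re.compile(f"(?={PCP_CONSENSUS})")


def find_ppant_serines(sequence):
    """Return 1-based positions of the conserved Ser in each consensus match."""
    # The serine is the fifth residue of the motif, so its 1-based
    # position is the 0-based match start plus 5.
    return [m.start() + 5 for m in _PATTERN.finditer(sequence.upper())]


# Hypothetical fragment for illustration only, not a real PCP sequence.
fragment = "MKTAFVGGHSLDAIR"
for position in find_ppant_serines(fragment):
    print(f"candidate PPant attachment site: Ser{position}")
```

A position-specific scoring matrix or profile hidden Markov model would tolerate deviations from the consensus better than this exact-match sketch.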

Following activation of the substrate in the A domain and the resulting C-terminal subdomain rotation, the PPant group of the PCP enters the A domain active site and thiolates the activated substrate, generating a PCP-bound thioester. It was initially believed that conformational changes within the helices which make up the PCP permit interaction with the A domain and subsequent downstream domain interactions.81 However, this theory is disputed, with new structural data suggesting that the PCP is a dynamically rigid platform for a mobile “swinging arm” PPant group, with surface residues of the PCP being largely responsible for inter-domain interaction.130 Early work also suggested that PPant binding to the PCP had an impact on the conformation of the PCP domain, with different conformations being observed in the apo and holo states of the enzyme.81,131 However, work investigating the related D-alanyl carrier protein (Dcp) on the protein DltC refutes this theory, showing that PPant addition had no impact on the overall structure of the carrier protein, with the apo and holo forms showing no major differences in structure.132 Regardless of the conformational dynamics of the PCP domain, it is clear that a “swinging arm” model, whereby a highly mobile PPant arm is able to bind substrates and “shuffle” them between different active sites, would leave the thioester intermediate vulnerable to hydrolysis. The related acyl carrier proteins (ACP) of polyketide synthetases (PKS) and fatty acid synthetases (FAS) have been shown to sequester their substrates within their hydrophobic core, protecting them from premature hydrolysis.133,134

It is not known whether this occurs widely in the case of NRPS PCP domains, although work on the carrier protein of yersiniabactin synthetase showed transient interaction between the PPant arm, the bound substrate and the hydrophobic core of the carrier protein, but that the PPant was still largely disordered and exposed to solvent outside of the PCP core.135 Additionally, as the yersiniabactin synthetase system is composed of both NRPS and PKS components,136 this system cannot be considered typical of NRPS carrier proteins. Indeed, competing evidence suggests that there is little significant interaction between NRPS PCPs and their PPant arms other than the former acting as a platform for the latter.130 Moreover, the lack of success in attaining a crystal structure of a substrate-bound PPant group on an NRPS carrier protein compared to PKS, due to hydrolysis, suggests that NRPS PCP-bound substrates are much more labile and vulnerable to hydrolysis.137 However, a model proposed by Marahiel suggests that the domains of NRPS form a helical structure which encloses the PCP and PPant within the centre of the helix, protecting the PPant-bound substrate from hydrolysis.138 Unfortunately, with no structures currently available of multi-module NRPS with all their modules present, it is still largely unknown how the PCPs precisely interact with other domains.

Unlike A domains, PCPs do not show substrate selectivity, with the PPant generating a thioester with any substrate that is activated by the preceding A domain, even if it is not the cognate substrate of the module.139 Following binding of the substrate to the PPant of the PCP, the thioester intermediate is transferred to the downstream C domain active site at the beginning of the next module (Figure 1.9 a), for subsequent condensation with a residue bound to the next downstream PCP domain (Figure 1.9 b).80

C domains, which are responsible for the condensation of two amino acid residues in NRPS systems, are composed of an N-terminal and a C-terminal subdomain, with the active site found at the interface between them. A solvent channel enters from the opposite N-terminal and C-terminal faces to the active site, allowing both upstream and downstream substrate-bound PPant groups to enter the active site from their respective faces.81,140 The mechanism of amide bond formation depends upon a conserved motif within condensation domains (HHXXXDG).141 Work by the Marahiel group demonstrated with mutational studies that the second histidine of this motif, in particular, was required for catalysis, with its mutation to valine eliminating condensation activity.141,142 They proposed a model whereby the second histidine acts as a general base to deprotonate the PPant-bound amine donor residue, which could then attack the PPant-bound amine acceptor carbonyl, with the hydroxyl of a nearby amino acid such as tyrosine stabilising the tetrahedral transition state (Figure 1.10). However, while this may be the catalytic mechanism in some C domains, work by the Keating group suggests that it is not universal.

Using structural and mutational studies on the standalone condensation domain vibriobactin synthetase H (VibH), they found that mutating the second histidine of the motif to glutamine or alanine had little impact on catalysis.140 They therefore proposed an alternative model for amide formation in certain C domains, whereby the active site catalyses amide formation not through histidine-dependent base catalysis, but through selective binding of the uncharged form of the amine donor and by holding the substrates in the correct orientation and in close proximity to promote amide formation.
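The conserved HHXXXDG motif and the second-histidine mutations discussed above can be illustrated with a short sequence scan. The following is a minimal sketch only: the toy sequence, the function names find_c_domain_motif and mutate_residue, and the residue numbering are hypothetical and are not taken from any real C domain or from the cited work.

```python
import re

# Conserved C domain motif as written in the text: H-H-X-X-X-D-G,
# where X is any residue. A lookahead is used so overlapping hits are found.
C_DOMAIN_MOTIF = re.compile(r"(?=(HH[A-Z]{3}DG))")


def find_c_domain_motif(sequence):
    """Return a list of (motif_start, motif, second_his_position) tuples.

    Positions are 1-based, following the usual protein numbering convention.
    """
    hits = []
    for match in C_DOMAIN_MOTIF.finditer(sequence.upper()):
        start = match.start(1)  # 0-based index of the first His
        hits.append((start + 1, match.group(1), start + 2))
    return hits


def mutate_residue(sequence, position, new_residue):
    """Return a copy of the sequence with one residue replaced (1-based)."""
    index = position - 1
    return sequence[:index] + new_residue + sequence[index + 1:]


# Hypothetical toy sequence containing one copy of the motif.
toy_sequence = "MSTAHHILVDGWSAK"

hits = find_c_domain_motif(toy_sequence)
print(hits)

# Mimic the reported His-to-Val substitution at the second histidine.
second_his = hits[0][2]
mutant = mutate_residue(toy_sequence, second_his, "V")
print(mutant, find_c_domain_motif(mutant))
```

A scan of this kind only locates the motif; whether the second histidine is catalytically essential in a given C domain is, as the VibH results show, an experimental question.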

Figure 1.9: PCP domain and C domain functions within NRPS enzyme complexes. a. The PCP transports the thioester intermediate from the A domain to the C domain. b. The C domain catalyses the condensation of the acid and amine substrates which are bound to the upstream and downstream PCP domains respectively.80

Figure 1.10: A proposed mechanism of C domain-catalysed amide formation. This model suggests that a histidine in the active site acts as a general base and deprotonates the amine donor substrate, promoting nucleophilic attack on the PPant bound intermediate.142 The tetrahedral intermediate is stabilised by donation of a proton by a neighbouring amino acid such as tyrosine. An amide is then formed which is bound to the downstream PPant group.

In terms of substrate selectivity, C domains have been shown to exhibit little selectivity to the upstream amine acceptor, with non-cognate amino acids being readily accepted, while high selectivity to the downstream amine donor was shown with little tolerance for non-cognate amino acids.105 This selectivity has hindered approaches of engineering NRPSs for the production of novel peptides. Namely, despite the aforementioned success of Eppelmann et al. in incorporating downstream non-cognate amino acids into a growing NRP of surfactin, it has been shown in other work by the Micklefield group that when an A domain downstream of a C domain was mutated to possess altered specificity, also from Asp to Asn, the downstream non-cognate substrate was poorly recognised by the upstream C domain. While the novel peptide was produced, a high level of premature hydrolysis of the upstream thioester and release of an incomplete NRP was observed.143

An exception to the low substrate selectivity towards the upstream amine acceptor residue is the previously mentioned case where C domains following an E domain exhibit strict stereoselectivity in order to incorporate the correct enantiomer into the peptide structure.106 Nonetheless, the ability of C domains to accept non-native upstream substrates has proven valuable when engineering NRPS modules, specifically when the preceding A domain is mutated, either rationally or via directed evolution, resulting in altered substrate specificity, or when the preceding A domain is removed and replaced by a different NRPS A domain possessing altered substrate specificity, in both cases producing novel NRPs.112

Module and domain shuffling of NRPS has been seen as an attractive route for producing novel peptides, especially when the limitations of rational design and directed mutagenesis are considered, such as the absence, as yet, of drastic changes in the specificity of A domains to structurally different residues through rational design, or the time needed to generate mutant libraries and screen them for the desired specificity. A major advantage of a successful domain swap is the predictability of the amino acid composition of the resulting novel peptide products when one A domain or module is swapped for another with a different known specificity. For example, if the Phe specific A domain of the first module in an NRPS which is known to produce the peptide Phe-Pro-Ala-Ser is successfully switched to an Ala specific A domain, and the recombinant NRPS is functional, then it is a reasonable assumption that the resulting NRP would be composed of Ala-Pro-Ala-Ser. Such ability to “plug in and play” would be of value to the scientist who wishes to produce a specific peptide and has a range of residue-specific modules available, in that they could, theoretically, build their NRPS in a module order that will produce their desired NRP. Unfortunately, as will be expanded upon, such an approach is not always so simple and such a capability is still many years away.
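The idealised “plug in and play” reasoning above, in which the peptide sequence simply follows the order of the modules and the specificity of each A domain, can be sketched as follows. This is an illustrative model only: the Module class and the functions predicted_peptide and swap_a_domain are hypothetical names, and the model deliberately ignores C domain selectivity, linker compatibility and domain-domain interactions, which are the reasons such swaps often fail in practice.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class Module:
    """A hypothetical NRPS module, reduced to the residue its A domain selects."""
    name: str
    residue: str


def predicted_peptide(modules):
    """Predict the product assuming one residue per module, in module order."""
    return "-".join(module.residue for module in modules)


def swap_a_domain(modules, index, new_residue):
    """Return a new module list with the A domain specificity at `index` changed."""
    swapped = list(modules)
    swapped[index] = dataclasses.replace(swapped[index], residue=new_residue)
    return swapped


# The four-module example from the text: Phe-Pro-Ala-Ser.
native = [
    Module("module 1", "Phe"),
    Module("module 2", "Pro"),
    Module("module 3", "Ala"),
    Module("module 4", "Ser"),
]

# Swap the Phe specific A domain of the first module for an Ala specific one.
engineered = swap_a_domain(native, 0, "Ala")

print(predicted_peptide(native))
print(predicted_peptide(engineered))
```

The prediction is only as good as the assumption that the recombinant NRPS remains functional after the swap.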

There have indeed been successes in swapping subunits, modules and domains from differing NRPS to introduce non-cognate residues into NRPs.144 One approach to subunit exchange used knockouts of individual trans-acting subunits (each composed of their own modules, making up the overall NRPS system). For example, in daptomycin synthetase, which produces the cyclic peptide daptomycin 14 and is composed of three subunits (DptA, DptBC and DptD), the final subunit DptD was knocked out, and the final subunits from the heterologous NRPSs of A54145 and CDA, LptD and CDAPS3 respectively, were then introduced by transformation of the bacteria. This approach was successful and these final non-native subunits were able to introduce the final two amino acids into the NRP delivered from the preceding daptomycin synthetase subunits, resulting in new daptomycin variants 15 (LptD incorporation of final two amino acids) and 16 (CDAPS3 incorporation of final two amino acids) being produced through fermentation (Figure 1.11).145

While replacing the genes for entire heterologous subunits to generate altered NRPs is interesting, it still limits the user to the peptide building blocks naturally produced by each subunit, even if these can be ultimately re-arranged through the introduction of heterologous building blocks. Conversely, fusing modules and domains in unnatural orders within NRPSs has been successfully demonstrated to generate novel peptides produced from the same subunit. For example, the NRPS tyrocidine synthetase (Tyc), composed of 10 modules and responsible for the production of the decapeptide tyrocidine, has been used as a model system for the swapping of both modules and domains. The Marahiel group succeeded in fusing modules 10 (Leu specific) or 9 (Orn specific) to the C-terminus of module 2 found in TycB (Pro specific). When used in conjunction with the separate subunit TycA, containing module 1 (D-Phe specific), the module fusions could produce the novel tripeptides D-Phe-Pro-Leu (Figure 1.12) and D-Phe-Pro-Orn respectively in vitro.146

Figure 1.11: Subunit swapping in NRPS systems allows the production of novel peptides.145 R1: Decanoic acid. The daptomycin synthetase system is composed of three subunits: DptA (blue), DptBC (red) and DptD (green). When DptD was replaced by the subunits LptD (orange) or CDAPS3 (purple), modified daptomycins could be produced in vivo. R2: Composition of the final two amino acids within the cyclic peptide. M: module

As well as fusing modules from within the same NRPS system, the Marahiel group have also had success in exchanging and fusing domains and modules from separate species, namely by fusing the initiating A domain from the first module of bacitracin synthetase (BacA1) to elongation and termination modules from tyrocidine synthetase (Figure 1.13).147 In this particular work they fused the incoming A domain upstream of the existing PCP and C domains, in the knowledge that C domains exhibit less substrate selectivity towards the upstream amine acceptor than to the downstream amine donor. Thus they successfully incorporated a new initiating residue to generate novel dipeptides.

Such selectivity for the downstream amine donor has indeed proved to be inhibitory to domain fusions where a non-native A domain is introduced downstream of a C domain. For example, in the work by Calcott et al., where various A domains replaced the native threonine selective A domains of pyoverdine synthetase, it was found that only the unnatural A domains which also selected for threonine could incorporate their residue into the structure of the NRP.148 This demonstrated that the introduced A domains were indeed able to communicate with the upstream C domain, but that the C domain’s amine donor site was selective towards its canonical amine donor residue threonine. However, when the corresponding C domain associated with the incoming A domain was also introduced in a C-A domain pair, it was found that residues other than threonine could now be incorporated into the NRP in some cases, due to the acceptance by the fused C domain of the non-threonine residue introduced by its natural partner A domain. Unfortunately, many of these fusions were non-functional, yielding truncated NRPs, with Calcott et al. proposing that the C domains downstream of the fusion may indeed exhibit a level of selectivity even at the amine acceptor site, especially when the peptidyl substrate is large, or that PCP domains may be able to selectively interact with their native C domain partner.148

Figure 1.12: Module shuffling within the tyrocidine synthetase system to produce novel peptides. By fusing the terminal Leu-selective module 10 of TycC to the C-terminus of the Pro-selective elongation module 2 of TycB, the novel tripeptide D-Phe-Pro-Leu could be produced when the fused NRPS was used in combination with the initiating D-Phe-selective module 1.146

In summary, there are various limitations to NRPS domain swapping which make the theoretical “plug in and play” potential of engineered NRPS systems more difficult to realise than originally hoped. With C domains showing selectivity towards the downstream amine donors, A domain fusions are therefore usually conducted upstream, usually at the beginning of the structure, limiting the options for modifying the structure of an NRP through simple A domain replacement. While transferring the partner C domains along with the A domains can alleviate this issue, amine acceptor selectivity may also be a previously unexpected barrier to domain or module fusions. Moreover, as these NRPS have evolved to have specific domain interfaces, it is believed that introducing non-native domains and modules could potentially perturb these interactions, with even successful fusions producing much lower yields than wild-type NRPS.126,148 Finally, it has also been shown that the linkers between NRPS domains play a pivotal role in determining whether a fusion will be functional, meaning that when introducing new modules or domains, trial and error is often required to find the correct composition of the domain linkers.149

Figure 1.13: Domain shuffling between the bacitracin synthetase and tyrocidine synthetase systems for novel dipeptide formation. By fusing the A domain of bacA (Ile selective) to the N-terminus of the TycC module 9 PCP domain, a novel Ile-Leu dipeptide could be produced.147

Overall, while future research may allow for much easier engineering of NRPS to produce designer peptides, the innate selectivity of the A domains of NRPS towards carboxylic acid substrates, the selectivity of C domains for amine donors, and the difficulty in altering substrate specificity of A domains or shuffling domains through genetic engineering, limit the current industrial applications of NRPS for forming amide products from a relevant breadth of substrates.

1.3.4. ATP-grasp enzymes

Another superfamily of ATP-dependent enzymes which are relevant to amide formation is the ATP-grasp enzymes. Like NRPSs, these enzymes activate carboxylic acid substrates in an ATP and Mg2+ dependent manner.3 However, unlike NRPS, the activated intermediate is not an acyl adenylate but an acyl phosphate, with ADP and Pi being released as the side products following nucleophilic attack on the carbonyl carbon of the intermediate.150 Not all of these enzymes utilise an amine as the co-substrate, however, with some ATP-grasp enzymes utilising thiols as the nucleophile,23 for example in the case of succinyl-CoA ligase which catalyses the ligation of succinate and CoA.151 The amide forming ATP-grasp enzymes are categorised as ATP-dependent carboxylate-amine ligases and are able to catalyse various amide formations including the formation of dipeptides from two amino acids, for example by D-Ala:D-Ala ligase,152 or the ligation of a glutamate residue to the C-terminus of the S6 ribosomal protein by RimK.153 Indeed the recent review by Ogasawara and Dairi nicely explains the diverse functions of the ATP-grasp enzymes, from small dipeptide to larger oligopeptide production in both primary and secondary metabolism.154

Structurally, all ATP-grasp enzymes possess a characteristic ATP-grasp fold which is formed by two α+β subdomains which grasp the ATP molecule between them.155 These enzymes are typically made up of three domains, namely the N-terminal, central and C-terminal domains, also referred to as domains A, B and C respectively, with the ATP-grasp fold found between domains B and C, while domains A and C form a central core which encapsulates the co-substrates for ligation.23,154 Upon ATP binding, the mobile B domain acts as a lid and shuts down on the nucleotide and the active site.156 Amide formation proceeds by two separate half reactions, with the first being acyl-phosphate formation and the second being amine attack of the acyl phosphate, which yields the amide product (Figure 1.14).3,157

Similar to the catalytic mechanism of aminolysis by some condensation domains in NRPS, it is suspected that a residue which acts as a general base permits this second half reaction in ATP-grasp enzymes by deprotonating the nucleophile,157 allowing attack of the acyl phosphate intermediate, although this has not yet been confirmed.23 An interesting difference to NRPSs is that these two ATP-grasp enzyme half-reactions occur within the same active site, compared with the separate active sites of the A domain and C domain of NRPS for carboxylic acid activation and aminolysis respectively.

Figure 1.14: Model proposed for ATP-grasp enzyme-catalysed amide formation via the formation of an acyl-phosphate intermediate and amino acid base-catalysed proton abstraction from the amine substrate, allowing aminolysis.157

In terms of substrate selectivity, the ATP-grasp enzymes as a whole exhibit a substantially more relaxed selectivity towards both the carboxylic acid and amine components when compared to NRPS. In some cases the substrate breadth is only minor; for example, cyanophycin synthetase, which normally catalyses sequential Asp-Arg peptide bond formation and elongation onto a cyanophycin primer, could replace Arg with Lys, although Asp could not be replaced by another amino acid.158 A more impressive demonstration of wide substrate breadth was shown by Tabata et al. with the L-amino acid ligase YwfE, which natively ligates L-Ala and anticapsin to produce bacilysin 17.159 As well as this natural product, the enzyme was stated to be able to produce 111 different dipeptides when 231 combinations of L-amino acids were used (Figure 1.15 a).160 There were limitations to the substrate breadth of the enzyme, however. Namely, D-amino acids, certain charged residues such as Asp, Lys, Glu and Arg, and the secondary amine residue proline, could not be incorporated into any dipeptides. Also, for the N-terminally incorporated residue, the amine acceptor, selection was much more restrictive than that for the C-terminal residue. While this work is interesting and demonstrates the potential for future work with this and related enzymes, the lack of any yield or conversion data for the majority of products limits the value of the results with respect to their industrial relevance. Further, as acknowledged by the authors, some products are reported as such without a positive standard for comparison, with a new HPLC peak being taken to indicate successful product formation. Nonetheless, this shows that similar ATP-grasp ligases could eventually provide a means of producing a wide range of amides by expanding on this already broad substrate breadth with engineering.

A further example of broad specificity was demonstrated by the Dairi group, with the ATP-grasp enzyme PGM1, which is able to form amide bonds between (S)-2-(3,5-dihydroxy-4-hydroxymethyl)phenyl-2-guanidinoacetic acid 18 and the N-terminus of ribosome-produced peptides 2-18 amino acids in length to yield N-modified peptide products (Figure 1.15 b).154 Interestingly, this enzyme also showed activity for non-peptidyl nucleophiles including spermine and spermidine for amidation of variants of 18, albeit with much lower yield.161 Although this nucleophile substrate breadth is impressive, the carboxylic acid components are limited exclusively to those possessing an Nα-amidino group.162 As with the previous example, a criticism of this work is the lack of conversion or yield data for the majority of the novel products, which therefore does not give an accurate picture of how relevant such a system is for industrial application.

In general, the ATP-grasp enzymes offer an appealing potential route for the production of a wide variety of amides. However, this broad specificity for carboxylic acids and amines is usually limited to proteinogenic and non-proteinogenic amino acids, with notable exceptions including biotin carboxylase which uses bicarbonate and biotin as carboxylic acid and amine respectively.163 While these enzymes may prove useful to those wishing to produce peptides, for more general industrial production of amides from a wide range of carboxylic acids and amines, including secondary amines, these enzymes are not yet of great relevance. Indeed, while NRPSs appear to have been adopted in industrial processes,164 ATP-grasp enzymes do not appear to have gained much traction in industry. However, the use of these enzymes is in its infancy, and as more ATP-grasp enzymes are discovered and become available, the substrate breadth of known enzymes may widen, and these enzymes may therefore become increasingly relevant to industry.

Figure 1.15: Exploiting the wide substrate breadth of the ATP-Grasp enzymes YwfE and PGM1. a. YwfE naturally produces bacilysin 17 but is also able to produce a wide range of dipeptides (selected examples in blue).159 b. PGM1 is able to amidate 18 with a range of ribosomally produced peptides (X) 2 to 18 amino acids in length.154

1.3.5. Other ATP-dependent biocatalytic amide forming enzymes and methods

While the natural mechanisms of NRPS and ATP-Grasp enzymes have been the focus of this introduction’s analysis of aqueous biocatalytic amide formation, it is important to highlight other amide forming enzymes and methods which can be employed, but which are less common or less well known. One recently discovered amide synthetase is McbA, which has been shown to catalyse the formation of a range of amides from the amidation of 1-acetyl-3-carboxy-β-carboline 19 (Figure 1.16).165 This enzyme generates an acyl adenylate intermediate which can then be attacked by a nucleophilic amine. While the nucleophile specificity was shown to be broad, only the native substrate carboxylic acid 19 appears to have been tested.

An interesting class of amide forming enzymes are the adenylate forming amide ligases.154 These enzymes are homologous to NRPS A domains or acyl-CoA synthetases,166 which are both part of the adenylate forming enzyme superfamily (ANL).118,167 Like NRPS, these also catalyse the formation of an acyl adenylate intermediate. Unlike conventional NRPSs, however, they do not transport the activated substrate to a separate C domain via a PCP domain PPant arm for amide formation, with the acyl adenylate instead being directly attacked by the amine nucleophile. Two examples of enzymes with adenylate forming amide ligase activity are novobiocic acid synthetase (NovL)166 and the amide synthetase of the coumermycin A1 biosynthetic gene cluster (CouL).168 NovL and CouL are related168 and produce intermediates in the biosynthetic pathway of the antibiotics novobiocin and coumermycin A1 respectively. Both enzymes acylate the amine substrate 3-amino-4,7-dihydroxy-8-methyl coumarin 20, however their carboxylic acid substrates differ, with NovL first adenylating 3-dimethylallyl-4-hydroxybenzoic acid 21 for subsequent amidation to yield novobiocic acid 22 (Figure 1.17 a),166 while CouL adenylates both carboxylic acid groups of 3-methylpyrrole-2,4-dicarboxylic acid 23 for amidation with the amine substrate 20 to yield a diamide 24 (Figure 1.17 b).168 The substrate scope of these two enzymes was investigated, showing that NovL in particular was highly restricted to its native amine and carboxylic acid substrates, with only 2 carboxylic acid analogues showing more than 4% relative activity.166 CouL showed more capacity for accepting non-native carboxylic acid substrates but these were all similar in structure to the native cyclic dicarboxylic acid substrate, limiting the general versatility of this enzyme for broad specificity amide formation. Nonetheless, as stated by the authors of this investigative work, future efforts to identify and modify the substrate selecting residues of CouL may permit the production of various new coumermycin A1 variants.168

Another example of an adenylate forming amide ligase is the NRPS subunit ORF 19 in streptothricin synthetase of Streptomyces, shown in 2012 by Maruyama et al. to catalyse PCP and C domain independent L-β-lysine adenylation and amide formation with the amine of lysine-oligopeptides of varying lengths produced by the preceding NRPS subunits.169 Such standalone domains could be useful targets for engineering for wider substrate specificity, particularly due to the lack of adjacent, catalytically necessary C domains which, as mentioned previously, can prevent the incorporation of non-cognate residues into peptide bonds, even when the adjacent A domain is engineered to do so. In this particular case, ORF 19 was shown not to adenylate β-amino acids other than L-β-homolysine; therefore, like conventional NRPS A domains, such standalone amide forming A domains would likely require engineering to widen their substrate scope.

Figure 1.16: Amide synthesis by the amide synthetase McbA. X: Selected examples of structures amidated to 19 by McbA.165

Figure 1.17: Amide synthesis by the adenylate forming amide ligases NovL and CouL. a. NovL amidates the carboxylic acid 21 with amine 20 to yield amide 22.166 b. CouL conducts amidation on each carboxylate group of 23 with amine 20 to yield the diamide 24.168

A novel method for ATP-dependent amide formation uses natural NRPS enzymes and naturally non-amide forming adenylating enzymes, but relies on the interception of their acyl adenylates with amine nucleophiles. Like the above mentioned natural mechanism of adenylate forming amide ligases, this exploitation of the natural NRPS A domain mechanism does away with the need for adjacent PCP domains and C domains for amide formation. Such processes have only been demonstrated relatively recently. In 2008 the Kobayashi group demonstrated that an acyl-CoA synthetase, which normally produces bonds between free carboxylic acids and CoA, could catalyse amide formation between isobutyrate or acetate and the non-native substrate cysteine.170

This is an interesting example of potential enzymatic promiscuity, whereby the enzyme is catalysing a reaction which is different from its native function.171 However, as the amine substrate used is cysteine, which has a thiol group, it is possible that thioester formation by this thiol may precede intramolecular amide formation, similar to that observed in native chemical ligation. The reaction would therefore not be due to promiscuity, but to a chemoenzymatic process which first utilises the native function of the enzyme, in this case the generation of a thioester.172 Indeed, the Kobayashi group followed up on this work by employing the enzyme DltA, which is homologous to NRPS A domains, to form a range of dipeptides and oligopeptides when various amino acids were used as substrates for attack by cysteine. Interestingly, when other amino acids were trialled as nucleophiles, or cysteine with the thiol group protected, no amide bond was formed, leading the authors to theorise that the amide formation did indeed occur through initial thioester formation and intramolecular amide formation.173 Finally, in 2016, the group were able to demonstrate this same activity on a true NRPS A domain, namely DhBE, a standalone A domain which usually activates dihydroxybenzoic acid (DHB) for subsequent transfer to a separate PCP domain and C domain for amide formation.174 By introducing various acid substrates, including cyclic substrates such as DHB, benzoic acid and its hydroxyl derivatives, as well as linear substrates such as octanoic acid, they were able to demonstrate a wide acid substrate breadth for amidation by cysteine. While this is interesting, it is important to recognise that this mechanism is limited to using cysteine as a nucleophile. Although this method could be used for similar purposes to native chemical ligation,175 i.e. cysteine dependent peptide ligation, or producing valuable products which contain cysteine, this would not be useful as a general amide forming mechanism due to the limitation in nucleophile choice. Moreover, DhBE-catalysed amide formation using cysteine and DHB proved to be inefficient, with a Vmax of 0.0156±0.0008 units mg−1 and a Km of 150±18.3 mM for cysteine. Additionally, in standard assays the Kobayashi group employed cysteine in a 40-fold excess to DHB. Overall, the narrow amine substrate breadth, the inefficiency of the DhBE-catalysed amide forming reaction, and the large excess of amine utilised in standard assays, may combine to limit the utility of this biocatalytic method for industrial amide formation.174
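To give a sense of what a Km of 150 mM for cysteine implies, the reported parameters can be placed into the Michaelis-Menten equation, v = Vmax[S]/(Km + [S]). The sketch below is illustrative only: it assumes simple single-substrate Michaelis-Menten behaviour with respect to cysteine (with DHB and ATP saturating), and the cysteine concentrations in the loop are hypothetical values chosen for illustration, not conditions taken from the cited work.

```python
# Reported DhBE parameters for cysteine (uncertainties omitted for simplicity).
V_MAX = 0.0156  # units mg^-1
K_M = 150.0     # mM


def michaelis_menten_rate(substrate_mM, v_max=V_MAX, k_m=K_M):
    """Return the rate predicted by v = Vmax * [S] / (Km + [S])."""
    if substrate_mM < 0:
        raise ValueError("substrate concentration must be non-negative")
    return v_max * substrate_mM / (k_m + substrate_mM)


# Hypothetical cysteine concentrations (mM), chosen only to illustrate the curve.
for cysteine_mM in (10.0, 50.0, 150.0):
    rate = michaelis_menten_rate(cysteine_mM)
    fraction = rate / V_MAX
    print(f"[Cys] = {cysteine_mM:6.1f} mM  v = {rate:.5f} units/mg  "
          f"({fraction:.1%} of Vmax)")
```

Under these assumptions, half of the already low Vmax is reached only at 150 mM cysteine, which is consistent with the need for a large excess of the amine in the assays.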

Earlier work by Dieckmann et al. in 2001 had shown that it was indeed possible to utilise an NRPS A domain for the formation of dipeptides through the use of the truncated tyrocidine synthase 1 (TY1), which was composed of only an A domain.176 Here they were able to activate Phe, which could be amidated by Phe, Ala or Leu as well as the amino acid amides of Leu and Phe. In this work, therefore, the authors demonstrated that nucleophiles other than cysteine could, in fact, be used to attack the high energy acyl adenylate intermediate produced by A domains, offering an appealing route for direct amide formation using A domains, without the need to shuffle domains between NRPS systems. On the other hand, it is apparent that the inherent selectivity of the A domain for the acid substrate is still restrictive, with only Phe being activated and composing the N-terminal residue of the peptide products. Generally, it is clear that these methods of direct amide formation via intermediate interception by amines offer a compelling new route for expanding the utility of NRPSs. However, the inherently narrow selectivity of NRPS A domains for acids, and the limited range of nucleophiles employed, mean that much more investigation must be done to find a truly broad specificity and industrially relevant amide synthetase, either using NRPS A domains, or other carboxylic acid activating enzymes. Furthermore, as the Kobayashi group did not succeed with DhBE in catalysing direct amidation of acyl adenylates with non-thiol containing amino acids in the same manner as Dieckmann et al., it is likely that not all A domains may be exploitable in this manner of direct aminolysis of acyl adenylates.

In summary, as well as hydrolase catalysed amide formation, there are various ATP dependent enzymatic methods of amide formation, from NRPS to ATP-grasp enzymes (Figure 1.18), which collectively can produce a wide range of different amides. However, many of these are severely limited in the substrates they can ligate into amides. In the case of NRPSs, their A domains are generally highly specific for one cognate amino acid. While engineering these domains or swapping their order in recombinant systems can alter the residue which is activated, the substrate specificity of the adjacent C domain and the complex nature of domain-domain interactions may often inhibit success. In comparison, ATP-grasp enzymes show much more relaxed substrate specificity, although this too is usually limited to amino acids, and generally either the acid or nucleophile substrate selectivity will be narrow. Therefore, in selecting a class of enzymes on which to concentrate our efforts to engineer a broad specificity amide synthetase, we decided to select a group of enzymes which were already proven to activate a wide range of carboxylic acids. Namely, we selected carboxylic acid reductases (CARs) to be engineered into amide synthetases, due to their exceptionally broad carboxylic acid substrate acceptance, both as individual enzymes, and as a collective group.177,178

Figure 1.18: An overview of enzymatic methods of amide formation (non-native mechanisms are shown in blue). a. Hydrolase catalysed amide formation: hydrolases can form amides through a non-native interception of the acyl-enzyme intermediate by amines.3 b. NRPS catalysed amide formation: NRPSs catalyse amide formation via acyl-AMP and thioester intermediates.98 The former can be intercepted with thiols for subsequent intramolecular amide formation174 or directly by amines.176 c. Ribosomal amide formation and preceding amino acid activation: amino acids are activated by aaRS,85 with the resulting AAtRNAs ligated by ribosomes.84 d. ATP-Grasp enzyme catalysed amide formation: ATP-grasp enzymes form amides via acyl-phosphate intermediates.150

1.4. Carboxylic acid reductase

Microbes have long been known to reduce carboxylic acids to their corresponding aldehydes and alcohols179 and it was in the late 1960s that Gross et al. first purified and characterised the CAR enzyme from Neurospora crassa, which was shown to catalyse the ATP, Mg2+ and NADPH dependent reduction of carboxylic acids to aldehydes (Figure 1.19 a).180,181 It was not until the work by the Rosazza group in 2004, however, that the gene for a CAR, that of Nocardia iowensis, was used to transform E. coli and the gene and amino acid sequences fully revealed.182 In this work, the authors were able to identify two putative domains from BLAST183 analysis, namely an N-terminal AMP-binding domain and a C-terminal NADPH binding domain, as well as a potential PPant binding site between the two.

In later work, they confirmed the presence of a PPant binding site, identified by its consensus sequence of LGGDSLSA, and the importance of the prosthetic group in the CAR’s reduction mechanism. When E. coli was transformed with the gene for Nocardia iowensis CAR (CARni), it was found that the recombinant enzyme had an activity 50-fold lower than that of the wild-type enzyme purified directly from the organism. However, activity could be increased 5-fold by addition of the Nocardia PPtase Npt or the broad specificity PPtase from Bacillus subtilis (Sfp). Co-expression with the genes for these PPtases was also very effective at improving activity of the purified recombinant CAR. Mutation of the proposed PPant binding serine, Ser689, to an alanine completely eliminated activity, and demonstrated that post-translational addition of a PPant group was essential for an active CAR enzyme. These results prompted the authors to propose a mechanism similar to that in NRPS mentioned above, whereby an N-terminal A domain activates the acid substrate at the expense of ATP, followed by transfer to the PCP bound PPant group, which then transfers the thioester intermediate to the C-terminal R domain, which reduces the thioester to the product aldehyde at the expense of NADPH (Figure 1.19 b).184


The Rosazza group also explored the substrate breadth of CARni and, as well as the model substrate benzoic acid 25, were able to demonstrate that other aromatic acids could be reduced, such as vanillic acid 26 and ferulic acid 27.182 Finnigan et al. expanded the known substrate scope of CARni and of CARs from other species to include cinnamic acid 28 and its derivatives, as well as meta- and para-substituted benzoic acids including 3-nitrobenzoic acid 29 and 4-methylbenzoic acid 30. Additionally, they demonstrated that CARni could reduce aliphatic carboxylic acids such as butanoic acid 31 and octanoic acid 32, and other aromatic substrates such as phenylpropynoic acid 33 and phenylpropanoic acid 34 (Figure 1.20).185 The Turner group, meanwhile, explored the utility and substrate breadth of the CAR from Mycobacterium marinum (CARmm) which, as well as benzoic acid, could reduce a wide range of saturated and unsaturated fatty acids from C-4 to C-18 in length.186 This proved especially useful in their in vivo production of many different and valuable fatty alcohols and alkanes when CAR was co-produced in cells along with aldehyde reductases (AHR)187,188 or aldehyde decarbonylases (ADC) respectively.186,189 These products are vital in the fuel and cosmetics industries, but are normally derived from fossil fuels. Moreover, as shown in the review by Napora-Wijata et al., not only are individual CARs very unrestricted in their substrate specificity but, taken as a family, CARs with different and complementary substrate specificities can activate a huge variety of carboxylic acids.177

The utility of CARs in biocatalytic systems was further showcased in work by the Turner and Flitsch groups with their use of CARs in in vivo enzymatic cascades, in combination with co-produced transaminases and imine reductases, for the whole-cell production of chiral amines.190,191

The CAR enzymes are related to NRPSs through their interlinked domain architecture of A domain, PCP and terminal reduction (R) domain. Indeed, Gahloth et al. demonstrated in their recent structural work that the A domain is very closely related to acyl-CoA synthetases which, like NRPS A domains, form part of the ANL superfamily of adenylating enzymes,118 and that the terminal R domain is highly similar in structure to the R domains of NRPSs.192 They were able to obtain crystal structures of the isolated A domains of CARni and Segniliparus rugosus CAR (CARsr), both bound to AMP and an acid, the isolated R domains of CARmm and CARsr bound to NADPH, and the A-PCP and PCP-R didomains of CARsr bound to AMP and NADPH respectively. This allowed the authors to reveal the dynamic nature of CAR domains and their interactions. Firstly, as in the previously mentioned work on NRPS A domains, the isolated A domain adopted an adenylation state in which the catalytic lysine of the A subdomain was orientated to be in contact with the bound AMP nucleotide. In the A-PCP didomain of CARsr, by comparison, two alternative conformations were observed. The first was the aforementioned adenylation state, with the PPant-binding serine of the PCP located far from the AMP phosphate (52 Å) and the PCP as a whole distant from the A domain. In the second state the A subdomain is rotated and the distance between the AMP phosphate and the PPant-binding serine is reduced to just 19 Å, bringing the PCP closer to the A core domain, with which it now makes direct surface contacts. This latter state is described as the thiolation state. The CARsr didomain structure lacked a PPant group, unlike similar structural work on the ANL enzyme 4-chlorobenzoyl:CoA ligase, where phosphopantetheine could be co-crystallised and its position in the enzyme deciphered.121 The authors therefore overlaid the thiolation-state structure with that of a related PPant-bound NRPS A domain, LgrA,193 which allowed them to simulate the binding of a PPant group within a narrow channel leading to the active site.

Next they investigated the dynamics of the isolated R domains of CARmm and CARsr, which revealed two principal alternative conformations of the R domain. In one state an Asp residue is directed away from the NADPH binding pocket, resulting in an ordered nicotinamide moiety which can conduct hydride transfer and reduction of a bound thioester intermediate. This is considered the active state. In the other conformation this same Asp residue points into the NADPH binding pocket, resulting in a disordered nicotinamide moiety which is no longer primed for hydride transfer. The authors believe this Asp acts as an on-off switch controlled by acyl-PPant binding to the active site, which results in backbone reorientation and movement of the Asp out of the NADPH binding pocket, allowing reduction of the thioester to the aldehyde. Conversely, they hypothesised that the resulting aldehyde could not induce the active conformation of the backbone and Asp residue, and would therefore not be further reduced to the alcohol. Indeed, this PPant-dependent activation is supported by structural analysis of the CARsr PCP-R didomain, which was modified to either possess or lack a PPant group. The PPant-bound structure was found to be in the active form, with the key on/off-determining Asp directed away from the binding pocket, whereas the unmodified didomain was found to be in the inactive state. To investigate whether complex interaction between the domains was required for activity, they combined separate A and PCP-R domains to see whether protein-protein interaction between these non-connected domains would permit reduction of acid substrates, which ultimately was unsuccessful. By comparison, when benzoyl-CoA was added to the R domain or PCP-R didomain, activity was observed, albeit at a much lower rate. Similarly, low activity was observed when benzoic acid and (R)-pantetheine were introduced to a mixture of A domain and PCP-R didomain, showing that covalent linkage of the domains and complex surface interaction are not required for individual domain activity, and that the R domain is able to bind and reduce PPant-bound substrates which are in solution. However, as stated by the authors, the lower rate suggests that covalent linkage of the domains is important in increasing the effective concentration of the thioester intermediate, with the PCP interacting with the flanking domains through dynamic sampling. A limitation of this work, mainly due to current technological challenges, is the lack of a structure of the entire CAR enzyme, which limits the information that can be obtained on the dynamics of the entire CAR complex. Nevertheless, the importance of the PPant moiety in the activity of the R domain was shown when thiobenzoic acid could not be reduced, suggesting that PPant binding to the R domain is the main determinant of R domain activity, with structural studies also showing that no significant interactions are made between the acyl group of the intermediate and the substrate binding site. Because the domains are able to catalyse their individual reactions even without retaining their covalent linkers to the other domains, the authors were able to swap domains between CARmm and CARni, with the resulting chimeras being active. This suggests that future domain swaps with domains from different enzymes and species, as conducted previously with NRPS, may also be possible with CARs.

Figure 1.19: Carboxylic acid reductase reaction and domain composition and functions. a. The reaction catalysed by CARs. b. CARs are composed of an A domain, PCP domain and an R domain. Following adenylation in the A domain, the substrate is transferred as a PCP-bound thioester to the R domain where it is reduced to an aldehyde by hydride transfer from NADPH.184


Figure 1.20: Selected examples of carboxylic acids reduced by CARni.182,185

1.5. Objective of the project: Engineer CAR into a broad specificity amide synthetase

Due to their broad substrate scope and demonstrated utility, CARs were selected for engineering into amide synthetases. The selected route was to fuse the broad specificity carboxylic acid activation domain of a CAR with the amide-forming activity of an NRPS system. This was chosen because the NRPS-like domain structure of CARs192 could potentially allow certain domains, such as the CAR A domain, to be swapped with those of an NRPS, and therefore allow the incorporation of non-cognate carboxylic acids into the NRPS peptide.144 As mentioned in chapter 1.3.3, NRPS A domain swaps have indeed been successful in allowing the introduction of unnatural amino acids into growing NRPs.147 However, as the aim of this project was to produce an amide synthetase with broad selectivity towards both the acid and amine components, a swap between a CAR A domain and that of a typical NRPS would not be appropriate, as the highly substrate-specific natural downstream A domains would dictate the amine substrate.81

Therefore, this work required an atypical NRPS C domain: one which is not covalently tethered to an adjacent module or domain, which does not require a PPant-bound nucleophile substrate from a downstream PCP domain, and which as a result could be more easily engineered to alter its substrate specificity. In addition to being independent of a PCP-bound nucleophile, the C domain also had to be one that did not bind amino acid substrates, which are already commonly accommodated in natural NRPS systems.80 Such a C domain is found in the vibriobactin synthetase (Vib) NRPS, which is composed of the standalone A, PCP and C domains VibE, VibB and VibH respectively, and which plays a role in the synthesis of the siderophore vibriobactin.92,194 These three enzymes catalyse the sequential activation, PCP transfer and condensation of dihydroxybenzoic acid 35 (DHB) with the small molecule norspermidine 36 (NSPD) to produce DHB-NSPD 37 (Figure 1.21). Further, VibH has been shown to accept the non-cognate amines hexylamine, octylamine and 1,7-diaminoheptane.194 The initial plan was to fuse the PCP of VibB to the CAR A domain, with the intention of promoting substrate loading onto the VibB PCP, which would then interact with VibH to catalyse amide formation. It was hoped that engineering of VibH could then expand the nucleophile specificity, generating a chimeric system with industrially relevant substrate breadth.

Figure 1.21: Carboxylic acid activation and amide formation by the vibriobactin synthetase system. The activation, thiolation and amidation of DHB by VibE, VibB and VibH respectively reflect A domain, PCP domain and C domain activities.92,194


Chapter 2. Developing a CAR-VibB, VibH fusion enzyme system

2.1. Introduction

As a means of exploiting the broad carboxylic acid activating activity of CAR for amide formation, various approaches were considered. As well as potential engineering of the native CAR enzyme mechanism to replace reduction activity with aminolysis activity, an appealing route was via domain fusion with amide-forming enzymes. Enzymes of the NRPS family appeared to be a good target for this process. CARs share certain characteristics with NRPS complexes, such as interlinked A and PCP domains (Figures 1.9 and 1.19), and are indeed related to them.192 Moreover, a wide variety of amide-forming NRPS condensation domains with differing substrate selectivity have already been identified and exploited.22,195,196 Furthermore, domain and module swapping of NRPS, although difficult and often unsuccessful, has been achieved in the past.144,197

Therefore, a successful domain fusion between the adenylation (A) domain of CAR and the peptidyl carrier protein (PCP) component of an NRPS enzyme could potentially allow the initial activation of the carboxylic acid substrate with subsequent transfer as a thioester onto the PCP. If this initial step were successful, it is feasible that the PCP-bound thioester could then interact with the subsequent condensation domain and undergo aminolysis by amine substrates, forming an amide. As A domains have been shown to be the “gate keepers” of selectivity in NRPS systems,105,198 with PCPs being shown to accept non-cognate substrates and C domains showing limited selectivity for the amine acceptor,147 it was likely that a successful and interactive fusion of a CAR A domain and an NRPS PCP domain would be able to provide an activated substrate for subsequent aminolysis.

Fusing domains is often much more difficult than a simple “plug in and play”, and the selection of an appropriate linker between the domains is pivotal if the fused domains are to be functional, with the A domain transferring its activated substrate to the PCP domain, which in turn transfers the substrate to the C domain.149,199 Indeed, in previous work by Beer et al. the most success was observed when the A-PCP linker of the incoming PCP domain was retained and replaced the A-PCP linker of the existing NRPS protein.149 In the work conducted by Doekel and Marahiel, the fusion site was located within the A-PCP linkers of both the incoming initiating A domain and the existing PCP domain, with more of the incoming A domain’s linker being preserved than that of the existing PCP domain. This was successful in the fusion of various domains, including the initiating A domain of bacitracin synthetase 1, BacA1, which activates isoleucine, and the terminating tyrocidine synthetase module TycC6, which activates leucine, resulting in the formation of an IleLeu dipeptide.147

In investigating NRPS proteins which could potentially be fused with the A domain of CAR, the tyrocidine synthetase NRPS complexes, as used by Doekel and Marahiel, initially appeared to be a good target.147 As shown in their work, following fusion of the initiating non-cognate A domain, the subsequent PCP (T) and C domains were able to accept the non-cognate amino acids activated by the fused A domain. This supports the theory that the A domain is the main determinant of substrate specificity, with the PCP domains not showing any selectivity, as was also shown in work by Beer et al.149

While the replacement of a native Tyc synthetase99 A domain with that of CAR was proposed, there would be a major limitation in the fact that downstream of the module incorporating the CAR A domain, there would be at least one subsequent A domain of the native Tyc system with a highly specific cognate amino acid selection. This would limit the final industrial applicability of any fusion. Although the selectivity towards the carboxylic acid by the CAR A domain would be broad, the subsequent amino acids added would be limited by the selective nature of the Tyc synthetase modules.

Therefore, as mentioned in chapter 1.5, the Vib system, with its standalone C domain that does not depend on an amine substrate linked to a downstream A domain, was selected, with the fusion to be made between its standalone VibB carrier protein and the CAR A domain.92 It was hoped that if a successful and functional fusion could be produced between the A domain of CAR and the PCP domain of VibB (Figure 2.1 a), the VibB-bound substrate could then interact with VibH to produce a range of amides by exploiting the aforementioned acceptance of non-cognate substrates by PCP domains and C domains (Figure 2.1 b). Accordingly, creating a fusion between the CAR A domain and the VibB PCP domain was selected as the first objective of this project.

Figure 2.1: Proposed CAR-Vib fusion to combine CAR and Vib enzymatic activities. a. The proposed fusion of the VibB PCP domain with the CAR A domain. The fused enzyme would then be used in combination with VibH. b. The proposed production of amides by combining CAR and Vib activities.


2.2. Gene analysis and CARmm-VibB fusion design

The first task in designing a chimera between CARmm (UniProt accession number B2HN69) and VibB (UniProt accession number P0C6D3) was to identify the domain boundaries within these proteins. The Pfam program aligns input amino acid sequences against a large database of known protein families with identified globular domain boundaries and identifies regions of homology.200 Using it, it was possible to identify the predicted boundaries of the CARmm A domain and of the carrier protein domain of VibB, which also contains an isochorismatase (ISC) domain.194 The Pfam program had been used previously by Beer et al. to predict PCP domain boundaries in their work swapping NRPS PCP domains.149 The Pfam boundary prediction gave the AMP-binding domain of CARmm as AA 49-514 and its PCP as AA 655-722, while the PCP domain of VibB was predicted to be AA 219-283 (Figure 2.2). The VibB domain boundaries had already been confirmed by X-ray crystallography,201 with the N-terminal ISC domain structure corresponding to the stretch of amino acids identified by Pfam. At this stage of the project, however, there had been no published structural data on CARs to support the Pfam prediction. The subsequent work conducted by Gahloth et al.192 with CARsr shows the PPant binding domain beginning at AA 655, compared to a Pfam prediction of AA 672, suggesting that there is a disparity between the Pfam prediction and the true PCP domain boundary in this case.


Figure 2.2: Pfam analysis of CAR and VibB domain boundaries to guide fusion design. Screenshot taken from Pfam analyses of CARmm and VibB showing predicted domain boundaries.

Having now identified the putative boundaries of the respective domains of CARmm and VibB, the chimera between the two was designed. As VibB does not have a linked A domain, it was not possible to follow the recommended approach of retaining the A-PCP linker of the fused PCP domain;149 the A-PCP linker of CAR was therefore retained instead. This linker was assumed to end at the beginning of the CAR PCP domain boundary, so AA 1-654 of CAR would form the CAR component of the fusion, while the entire predicted PPant binding domain sequence of VibB, AA 219-283, would form the VibB component. The fusion would be made by first using PCR to amplify the VibB PCP domain DNA sequence and to amplify pET21a CARmm with primers flanking, and so excluding, the unneeded PCP-R domain components of CARmm. The amplicons would then be fused using an In-Fusion kit from Clontech.202 Other enzymes required for the production of amides were also to be cloned and produced. These included VibH (UniProt accession number Q9FDB1), which would be required for the condensation reaction with the chimera-bound substrate, as well as VibE (UniProt accession number O07899) and full-length VibB, which would be required alongside VibH to produce the native Vib system product 37 as a control demonstrating activity of the VibH enzyme.
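The residue arithmetic behind this design can be illustrated with a minimal sketch. It assumes 1-indexed, inclusive amino acid ranges (as quoted above) and uses synthetic placeholder sequences rather than the real CARmm and VibB sequences. The function names and the rule-of-thumb average residue mass of 110 Da are illustrative assumptions and were not part of the design work described here.

```python
# Sketch only: sequences are synthetic placeholders, not the real proteins.
AVG_RESIDUE_MASS_DA = 110.0  # assumed rule-of-thumb average mass per residue


def slice_residues(seq: str, start: int, end: int) -> str:
    """Return residues start..end, 1-indexed and inclusive (AA numbering)."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError(f"range {start}-{end} outside sequence of length {len(seq)}")
    return seq[start - 1:end]


def build_chimera(car_seq: str, vibb_seq: str) -> str:
    """Join CAR AA 1-654 (A domain plus A-PCP linker) to VibB AA 219-283 (PCP)."""
    return slice_residues(car_seq, 1, 654) + slice_residues(vibb_seq, 219, 283)


def approx_mass_kda(seq: str) -> float:
    """Very rough mass estimate from length alone; ignores composition and tags."""
    return len(seq) * AVG_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    car_placeholder = "C" * 700    # hypothetical stand-in for CARmm
    vibb_placeholder = "V" * 300   # hypothetical stand-in for VibB
    chimera = build_chimera(car_placeholder, vibb_placeholder)
    print(len(chimera), "residues;", round(approx_mass_kda(chimera), 1), "kDa (approx.)")
```

With these ranges the fusion comprises 654 + 65 = 719 residues before any tag residues are added. The length-based estimate is only a coarse check on the expected size of the construct and is no substitute for a mass calculated from the actual sequence.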

2.3. Vib system gene cloning and CARmm A domain-VibB PCP domain fusion

The first cloning to be conducted was the insertion of the VibB PCP domain gene fragment into the pET21a vector containing the CARmm A domain. The VibB PCP domain gene fragment was amplified from the full-length VibB gene in the pEX-A2 plasmid by PCR. The primers used had sequences which overlapped the edges of both the VibB sequence and the CARmm A domain pET21a vector to allow annealing with the In-Fusion system. The pET21a CARmm A domain vector was then also amplified by PCR. Following amplification, the PCR products were run on agarose DNA gels for visualisation. In the pET21a CARmm A domain PCR product there was an undesired band in addition to the desired product (Figure 2.3 a), so gel extraction was conducted on the desired product band. With the VibB PCP domain PCR, in contrast, only one band was visible, which corresponded to the desired PCR product (Figure 2.3 b). Following gel extraction, the insert and vector were incubated with the In-Fusion enzyme and buffer at 50°C for 15 min. The annealed product was then used to transform Stellar competent cells, which were later used for amplification and extraction of the fused DNA. The fused DNA was then sent for sequencing, which confirmed the successful fusion of the CARmm A domain and VibB PCP domain sequences (Figure 2.3 c). Subsequently the VibE (Figure 2.4 a) and VibH (Figure 2.4 b) genes were also amplified out of their respective pEX-K4 plasmids by PCR and visualised on an agarose gel. The PCR primers used possessed 16 bp overhangs which overlapped either the NcoI or XhoI endonuclease restriction sites of pET28b, which was to be used as the expression vector for both VibE and VibH. pET28b was cut with NcoI and XhoI and the restriction digest product visualised on an agarose gel (Figure 2.4 c). Again, following gel extraction, the linear gene inserts and their separate digested pET28b vectors were annealed using the In-Fusion kit. Sequencing confirmed that the inserts had been successfully annealed to the vectors.
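The general principle of primers carrying vector-homologous overhangs, as used for overlap-based cloning of this kind, can be sketched as follows. This is an illustrative outline only: the function names, the 20 nt gene-specific annealing length and the example sequences are assumptions made for the sketch, and the actual primers used in this work are those listed in Appendix 2.

```python
from typing import Tuple

_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def overlap_primers(vector_upstream: str, insert: str, vector_downstream: str,
                    overlap: int = 16, anneal: int = 20) -> Tuple[str, str]:
    """Build forward/reverse primers with 5' overhangs matching the vector ends.

    overlap: length of the vector-homologous overhang (16 bp, as in the text).
    anneal:  length of the insert-specific region (assumed value for the sketch).
    """
    vector_upstream = vector_upstream.upper()
    insert = insert.upper()
    vector_downstream = vector_downstream.upper()
    if min(len(vector_upstream), len(vector_downstream)) < overlap:
        raise ValueError("vector flanks shorter than the requested overlap")
    if len(insert) < anneal:
        raise ValueError("insert shorter than the requested annealing region")
    # Forward primer: end of the upstream vector arm, then the start of the insert.
    forward = vector_upstream[-overlap:] + insert[:anneal]
    # Reverse primer: reverse complement of insert end plus start of downstream arm.
    reverse = reverse_complement(insert[-anneal:] + vector_downstream[:overlap])
    return forward, reverse


if __name__ == "__main__":
    # Hypothetical placeholder sequences, not the real vector or Vib genes.
    fwd, rev = overlap_primers("GGGGAAAACCCCTTTTACGTACGT",
                               "ATGAAACCCGGGTTTAAACCCGGGTTTTAA",
                               "CTCGAGCACCACCACCACCACCAC")
    print("forward:", fwd)
    print("reverse:", rev)
```

In practice the annealing region would be chosen on melting temperature rather than a fixed length, which this sketch does not attempt.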

Figure 2.3: PCR amplification of pET21a CARmm A domain vector and VibB PCP domain gene fragment for subsequent fusion. a. Agarose gel visualisation (1 kb ladder) of the vector pET21a CARmm A domain PCR product of 7.3 kb. b. Agarose gel visualisation (100 bp ladder) of the insert VibB PCP domain PCR product of 255 bp. c. Screenshot of SnapGene alignment of sequencing results for the gene fusion region of CAVibB with the designed fusion gene.

Figure 2.4: Agarose gel visualisation of PCR and restriction digest products for subsequent cloning of VibE and VibH genes into expression vectors. a. Visualisation of the insert VibE PCR product of 1.7 kb. b. Visualisation of the insert VibH PCR product of 1.3 kb. c. Visualisation of the vector pET28b of 5.2 kb, cut with the endonucleases NcoI and XhoI.

2.4. Expression trials of Vib and CARmm-VibB fusion genes

The expression vectors containing the N-terminal 6 x histidine-tagged CARmm-VibB fusion (CAVibB) gene and the VibE and VibH genes were used separately to transform BL21 (DE3) competent cells for gene overexpression and protein overproduction, with CAVibB being co-transformed with Sfp to permit PPant addition to the fused PCP domain. All cells were initially grown in auto-induction (AI) media for 48 h at 20°C in the presence of the protease inhibitor phenylmethyl sulfonyl fluoride (PMSF), followed by lysis by lysozyme and sonication and purification by nickel affinity chromatography on an AKTA system. Lysate, column flow-through, wash fractions and elution fractions were analysed by SDS-PAGE. In the case of the CAVibB purification, there appeared to be a band at the expected size of 81 kDa, but there also appeared to be an additional band between 55 kDa and 70 kDa (Figure 2.5). In case this was a degradation product of the chimera due to insufficient protection from proteases by PMSF, growth and lysis were repeated in the presence of the Halt protease inhibitor cocktail, which has a broader range of protease inhibition than PMSF. However, this too resulted in the same bands being visible on SDS-PAGE (Figure 2.6). As there was a possibility that these bands were not in fact the chimera or its degradation products at all, BL21 (DE3) cells transformed with empty pET28b were grown in the same conditions as the cells carrying the expression constructs and lysed as a control. When this lysate was purified in the same way, the same double band was observed, suggesting that the chimera had not been expressed by the cells and that the bands were due to proteins natively present in the bacteria (Figure 2.7).

Figure 2.5: SDS-PAGE protein analysis of CAVibB-transformed BL21 (DE3) cell lysate nickel affinity purification by AKTA. A band of approximately 80 kDa can be seen in lanes for elution fractions 5-9, as would be expected for the 81 kDa CAVibB fusion enzyme. However, an unexpected band of approximately 60 kDa can also be observed in these lanes. MW: molecular weight ladder, L: lysate, FT: column flow through, W1: wash fraction 1, E: elution fraction.

Figure 2.6: SDS-PAGE of CAVibB-transformed BL21 (DE3) cell lysate nickel affinity purification by AKTA, following Halt inhibitor addition to lysate. While the band at around 80 kDa is still present in elution fractions, so too is the undesired band of around 60 kDa, with no visible decrease compared to lysate without Halt inhibitor. MW: molecular weight ladder, L: lysate, FT: column flow through, W1: wash fraction 1, E: elution fraction.

Figure 2.7: SDS-PAGE of negative control, pET28b-transformed BL21 (DE3) cell lysate nickel affinity purification by AKTA. The bands of around 80 kDa and 60 kDa observed previously in CAVibB-transformed cell lysates are also present here in elution fraction 4, showing that they are not the CAVibB fusion protein and its degradation product. MW: molecular weight ladder, L: lysate, FT: column flow through, W1: wash fraction 1, E: elution fraction.

The VibE (Figure 2.8) and VibH (Figure 2.9) transformed cells were then lysed to analyse the expression of VibE and VibH respectively by SDS-PAGE. In both cases, no overproduction of either protein was observed in lysate or eluted fractions by SDS-PAGE compared with the non-transformed BL21 (DE3) cells, with the expected large bands of 61 kDa for His-tagged VibE and 51 kDa for His-tagged VibH not being visible. There was a faint band at approximately 50 kDa but, as this was present in both VibE- and VibH-transformed cells, it was unlikely to be the VibH enzyme. As the use of AI media may have been the cause of the lack of overproduction of the VibE and VibH proteins, the methods employed with success in previous work by Keating et al. were adopted for these proteins. In that work, IPTG induction followed by only 4 h incubation at 30°C was shown to successfully produce soluble VibE.194 Overproduction of VibH with IPTG induction, meanwhile, had proved unsuccessful, yielding only insoluble product, but soluble VibH was produced when leaky expression alone at 25°C for 18 h was used. These published conditions were therefore trialled. However, this resulted in no improvement, with neither VibE (Figure 2.10) nor VibH (Figure 2.11) being visible on SDS-PAGE in lysate or eluate, and with various contaminant bands again being visible on both gels.

Figure 2.8: SDS-PAGE of VibE-transformed BL21 (DE3) cell lysate nickel affinity purification by AKTA. The undesired bands seen previously at around 80 kDa and around 60 kDa are again present, in addition to another undesired band at around 50 kDa. The band at around 60 kDa is particularly faint, making it unlikely to be overproduced His-tagged VibE enzyme of 61 kDa. MW: molecular weight ladder, L: lysate, FT: column flow through, W1: wash fraction 1, E: elution fraction.

Figure 2.9: SDS-PAGE of VibH-transformed BL21 (DE3) cell lysate nickel affinity purification by AKTA. While a band of approximately the correct size for the 51 kDa His-tagged VibH is observed, alongside a band at around 80 kDa, this same band is also visible in VibE-transformed cell lysate. MW: molecular weight ladder, L: lysate, FT: column flow through, W1: wash fraction 1, E: elution fraction.

Figure 2.10: SDS-PAGE of VibE-transformed BL21 (DE3) cell lysate nickel affinity purification by AKTA, following published grow-up conditions. No desired band at 61 kDa for His-tagged VibE was visible, with an undesired band at around 70 kDa being observed. MW: molecular weight ladder, L: lysate, FT: column flow through, W1: wash fraction 1, E: elution fraction.

Figure 2.11: SDS-PAGE of VibH-transformed BL21 (DE3) cell lysate nickel affinity purification by AKTA, following published grow-up conditions. No desired band at 51 kDa for VibH was visible in any fraction, while many contaminant bands were present throughout the gel. MW: molecular weight ladder, L: lysate, FT: column flow through, W1: wash fraction 1, E: elution fraction.

Expression trials were then conducted on BL21 (DE3) cells transformed with either VibE, VibH or CAVibB (co-transformed with Sfp) to attempt to find a condition in which they might successfully express. The cells were grown at three separate temperatures, 20°C (Figure 2.12), 30°C (Figure 2.13) and 37°C (Figure 2.14), each with induction by 0.5 mM IPTG and growth for 5 h post-induction. pET28b-transformed cells were also grown at 30°C under the same conditions as a control. Cells were then lysed with lysis reagent to allow analysis of both soluble and insoluble protein fractions by SDS-PAGE. There was no evidence of soluble protein overproduction at any temperature for any of the proteins. The notable exception was in the insoluble fractions at 37°C, where a band at around 50 kDa was observed, the correct size for the 51 kDa His-tagged VibH, and a faint band was visible at around 80 kDa, the expected size for the 81 kDa CAVibB chimera. There was no such band visible for VibE. This could be an indication that the genes for VibH and potentially the chimera are being transcribed at 37°C, but that the translated proteins are insoluble.
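The layout of these expression trials can be summarised as a small condition grid. The sketch below simply enumerates the combinations described above (three constructs at three temperatures, plus the empty-vector control at 30°C). The dictionary keys and function name are illustrative assumptions, not part of any laboratory software used in this work.

```python
from itertools import product

CONSTRUCTS = ["VibE", "VibH", "CAVibB + Sfp"]
TEMPERATURES_C = [20, 30, 37]
IPTG_MM = 0.5              # induction concentration stated in the text
HOURS_POST_INDUCTION = 5   # growth time after induction stated in the text


def expression_trial_grid():
    """Enumerate every construct/temperature combination plus the control."""
    conditions = [
        {"construct": construct, "temp_c": temp,
         "iptg_mM": IPTG_MM, "hours": HOURS_POST_INDUCTION}
        for construct, temp in product(CONSTRUCTS, TEMPERATURES_C)
    ]
    # Empty-vector negative control, grown at 30 degrees C only.
    conditions.append({"construct": "pET28b (empty vector)", "temp_c": 30,
                       "iptg_mM": IPTG_MM, "hours": HOURS_POST_INDUCTION})
    return conditions


if __name__ == "__main__":
    for condition in expression_trial_grid():
        print(condition)
```

Each condition yields one soluble and one insoluble fraction for SDS-PAGE, so a grid of this kind also gives the number of gel lanes required.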


As the various attempts at producing soluble protein had failed for all of the proteins to be studied, and given the time-consuming nature of growing cells and conducting expression trials, this work was discontinued, as other experiments, to be discussed in later chapters, were proving more successful.

Figure 2.12: SDS-PAGE of VibE, VibH or CAVibB-transformed BL21 (DE3) cell lysates following 20°C expression trial. No clear bands of overproduced protein are visible for any of the enzymes in soluble or insoluble fractions. MW: molecular weight ladder, S: soluble fraction, I: insoluble fraction.

Figure 2.13: SDS-PAGE of VibE, VibH, CAVibB or pET28b-transformed BL21 (DE3) cell lysates following 30°C expression trial. No clear bands of overproduced protein are visible for any of the enzymes in soluble or insoluble fractions. MW: molecular weight ladder, S: soluble fraction, I: insoluble fraction.

Figure 2.14: SDS-PAGE of VibE, VibH or CAVibB-transformed BL21 (DE3) cell lysates following 37°C expression trial. Bands of approximately 50 kDa and 80 kDa are visible in the insoluble fractions of VibH and CAVibB-transformed cell lysates respectively, suggesting both genes are expressed but that the protein products are insoluble. MW: molecular weight ladder, S: soluble fraction, I: insoluble fraction.

2.5. Discussion and future work

The failure to produce any of the proteins, both the native Vib proteins and the CAR A domain-VibB PCP domain chimera, was a severe setback in the project to create a broad specificity aqueous amide synthetase. It is important, however, to analyse why the expression failed. Many potential causes could be postulated for the failure of these proteins to be overproduced in a soluble form. Importantly, the manner in which the genes were cloned into expression vectors could affect the transcription and ultimate translation of the target genes. On re-analysing the sequences of pET28b VibE and pET28b VibH, the distance between the ribosomal binding site and the start codon of the genes was identified as a possible cause of the poor or absent expression of either gene.

As the NcoI endonuclease was used to cut the pET28b vector for both VibE and VibH, and the primers for In-Fusion cloning designed to retain this restriction site to allow for potential future cutting out of the Vib genes, the start codon used for each gene occurs downstream of the innate ATG codon found within the NcoI restriction site. In both cases the separation between the ribosomal binding site and the start codon was 10 nucleotides.


This was done because using the innate pET28b start codon within the NcoI restriction site limits the next codon to one beginning with guanosine (G), which would not correspond with the native amino acid sequence of either VibE or VibH. However, this may have been an error: while much of the literature, and even the Novagen pET vector manual, places the optimal spacer distance between the end of the Shine-Dalgarno (SD) sequence and the start codon at between 5 nt and 13 nt or even longer,203 other publications have shown that spacers longer than 8 nt are severely detrimental to protein overproduction.204 While this could explain the lack of soluble overproduction of either of the Vib enzymes, the observable band in the insoluble fraction for VibH at 37°C could suggest that the protein can be translated under certain conditions but is insoluble. However, in the case of the CAVibB chimera, the lack of overproduction cannot be attributed to a sub-optimal distance from the SD sequence. The VibB PCP sequence was inserted into pET28b CARmm downstream of, and in frame with, the CARmm A domain, which itself is in frame with the innate pET28b start codon ATG found in the NcoI restriction site, 8 nt from the SD sequence. Moreover, the pET21a CARmm used has been used successfully to overproduce His-tagged CAR in work for subsequent chapters. Therefore translation initiation of the chimera should not have been inhibited by the spacer length, as the beginning of the fused gene is unchanged from the expressible CARmm gene. Additionally, as all genes had been codon optimised for E. coli expression, rare codons should not have been a contributing factor to poor expression levels.
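The spacer lengths discussed above can be checked directly on a construct sequence. The following is a minimal sketch only: the two sequences are made up for illustration and are not the pET28b sequences, and the SD motif `AGGAGG` and the helper name `spacer_length` are assumptions. The function reports the first ATG downstream of the SD motif, so for constructs in which the intended start codon lies downstream of another ATG (as with the NcoI site described above) the search would need to be adjusted.

```python
def spacer_length(seq, sd_motif, start_codon="ATG"):
    """Count the nucleotides between the end of the SD motif and the first
    downstream start codon. Returns None if either element is not found."""
    seq = seq.upper()
    sd_pos = seq.find(sd_motif.upper())
    if sd_pos == -1:
        return None
    sd_end = sd_pos + len(sd_motif)
    start_pos = seq.find(start_codon.upper(), sd_end)
    if start_pos == -1:
        return None
    return start_pos - sd_end


# Hypothetical sequences for illustration only (not taken from any real vector).
SD_MOTIF = "AGGAGG"
short_spacer_construct = "TTT" + SD_MOTIF + "ACATATAC" + "ATGGCT"
long_spacer_construct = "TTT" + SD_MOTIF + "ACATATACCC" + "ATGGCT"

for name, construct in [("short", short_spacer_construct),
                        ("long", long_spacer_construct)]:
    print(name, spacer_length(construct, SD_MOTIF), "nt")
```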

Overall, had time not been a limiting factor in this specific work, alternative spacer lengths would have been trialled on all genes to be expressed, to determine whether this was a factor in their low or absent expression. In addition, a wider range of expression strains and conditions would have been trialled to improve expression and overproduction.

Previous efforts towards carrier protein chimera production have also resulted in low yields of protein products.205 In their work, Worthington et al. postulated that low-yield chimera production could have been caused by a combination of factors including instability leading to degradation and insoluble aggregate production, toxicity resulting in cell death, and potentially disruption of secondary structure, which would facilitate quicker proteolytic degradation of the chimera.205 Additionally, in our work we were limited by a lack of information on the true domain boundaries within the CAR enzyme, which have now been revealed for CARsr and CARni through structural work by Gahloth et al.192 Protein misfolding could have been caused by an incorrect fusion site being chosen for the chimera. Therefore future work on developing CAR chimeras should be conducted using CARs for which the domain boundaries have been revealed through structural studies.

Aside from creating a chimera, there are other options for combining the activities of CAR and VibH which could be investigated if the expression issues with VibH are resolved. Firstly, it may be possible for a truncated version of CAR, lacking the terminal reduction domain but retaining the A and PCP domains, to interact with VibH directly without the need for an intermediary VibB PCP domain. Alternatively, a further truncated CAR consisting of only the adenylation domain could potentially interact with VibB, allowing downstream aminolysis of the CAR-activated substrate by VibH. Indeed, non-cognate carrier proteins have previously been used successfully with the Vib system enzymes by Marshall et al.206 In that work it was demonstrated that VibE and its E. coli homologue enterobactin synthetase E (EntE) were able to recognise non-cognate and standalone carrier proteins from different species, including the PCPs of pyochelin synthetase E (PchE), anguibactin synthetase B (AngB) and yersiniabactin synthetase (HMWP2). These PCPs were generally acylated with much lower efficiency than the cognate carrier proteins (VibB for VibE and EntB for EntE), with the notable exception of the PchE PCP domain, which was acylated with greater efficiency than the native VibB. Therefore it is possible that a standalone CAR A domain truncation could catalyse acylation of VibB.


It was found that two residues on VibB, E239 and E256, were key to facilitating binding between VibB and both VibE and VibH. When the equivalent aligned residues on the HMWP2 PCP, which was shown to be the worst non-cognate PCP, were mutated to glutamates, activity with both VibE and VibH was greatly improved.

This strategy could therefore be applied to permit or improve the interaction of a CAR truncation with VibH directly, using an A-PCP domain CAR truncation, or via acylation of VibB, using an A domain-only CAR truncation. In pursuing the former approach, sequence alignments would be conducted with the intention of mutating the CAR A-PCP truncation to introduce the key glutamate residues used by VibB to facilitate VibH interaction. Alternatively, to permit or improve VibB interaction with a CAR A domain-only truncation, the reverse process would be used, with VibB being mutated at the key residues to the CAR PCP equivalents.

After conducting a Clustal Omega207 alignment of the amino acid sequences of the VibB and CARmm PCP domains, it was possible to identify the residues in the CAR sequence homologous to E239 and E256 (Figure 2.15). These correspond to A676 and N693, which flank the PPant-binding serine of the PCP. These residues should therefore be targeted in future work, either by mutation of the CARmm gene to give A676E and N693E or by the reciprocal mutation of the VibB gene to give E239A and E256N, depending on which CAR truncation/Vib domain combination is employed.

[Alignment image. VibB positions 239 and 256 and CARmm positions 676 and 693 are marked, together with the PPant binding site.]

Figure 2.15: Screenshot of Clustal Omega amino acid alignment of the CARmm and VibB PCP domains. VibB residues E239 and E256 which were shown to be vital for VibB interaction with VibH align respectively with A676 and N693 of CARmm.
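The mapping of a residue in one protein to its aligned partner in another, as done here for E239/E256 of VibB, can be sketched as follows. This is an illustrative helper only: the gapped fragments are invented toy strings, not the VibB or CARmm sequences, and the function name and the offset parameters are assumptions.

```python
def map_aligned_position(aln_a, aln_b, pos_a, offset_a=1, offset_b=1):
    """Return (residue_a, residue_b, number_b) for the column of a pairwise
    alignment that holds residue number pos_a of sequence A.

    aln_a and aln_b are gapped strings of equal length ('-' marks a gap);
    offset_a and offset_b give the residue number of the first residue in
    each fragment. Returns None if pos_a aligns to a gap or is not present."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    num_a, num_b = offset_a - 1, offset_b - 1
    for res_a, res_b in zip(aln_a, aln_b):
        if res_a != "-":
            num_a += 1
        if res_b != "-":
            num_b += 1
        if res_a != "-" and num_a == pos_a:
            return None if res_b == "-" else (res_a, res_b, num_b)
    return None


# Toy alignment for illustration only.
toy_a = "ME-LSK"   # fragment of protein A starting at residue 10
toy_b = "MAGL-K"   # fragment of protein B starting at residue 100
print(map_aligned_position(toy_a, toy_b, 11, offset_a=10, offset_b=100))
```

A returned tuple such as `("E", "A", 101)` reads as "residue E11 of A aligns with A101 of B", which is the form of statement made in the text for the pairs E239/A676 and E256/N693.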


In conclusion, despite successful cloning of the genes required to investigate a potential combination of CAR and Vib system activities for the production of a wide range of amides, it was not possible to overproduce any of the desired proteins in soluble form, including, most importantly, the CAR A domain-VibB PCP chimera. A potential issue was the length of the spacer between the ribosomal binding site of the expression vectors and the ATG start codons of the VibE and VibH gene inserts, which was longer than previously reported optimal spacers. The cause of the absence of chimera production is not known, however, as the optimal-length pET vector spacer between the ribosome binding site and the start codon was used. Therefore future work to produce this chimera in competent bacteria may require investigation of a wider range of growth conditions such as temperature, incubation time and growth media. Alternative linker lengths between the two fused domains may also need to be trialled in case this was responsible for poor chimera production. Incorrect prediction of the domain boundaries of the CARmm PCP domain could also have contributed to the design of a sub-optimal fusion site.

However, on analysing the literature, it is clear that a chimera between the CAR A domain and the VibB carrier protein may not be necessary in order to combine the activities of CAR adenylation and VibH amide formation. Because adenylation domains and condensation domains, such as VibE and VibH respectively, have been shown to recognise non-cognate carrier proteins,206 it is possible that a CAR A domain may be able to interact directly with a non-fused VibB, or that the CAR PCP domain may be able to interact directly with a non-fused VibH. Such an approach should therefore be trialled with CAR as an alternative to chimera production.


Chapter 3. Direct amide bond formation using CARs

3.1. Introduction

The initial aim of this project was to combine the broad specificity carboxylic acid adenylation activity of CAR with the amide synthetase activity of vibriobactin synthetase (Vib) enzymes. However, as an alternative to the strategies described in Chapter 2, direct amide formation with CAR was investigated. It was proposed that it may be possible to intercept thioester intermediates bound to the CAR PCP with amines such as ammonia. Indeed, as mentioned in Chapter 1.3.3, the PPant-bound thioesters of PCPs in related NRPSs are generally exposed to solvent and are thought to be highly labile,137 which could make them vulnerable to nucleophilic attack by an amine. Firstly, it would be necessary to express and purify CARs for subsequent aminolysis trials using ammonia, testing various conditions to permit primary amide formation. By adding ATP to the CAR reaction to permit carboxylic acid adenylation, but omitting NADPH, which is required for the final thioester reduction step, it was postulated that the substrate would be activated but trapped as a thioester to allow aminolysis (Figure 3.1). It was soon realised that, if successful, this method could potentially be used to exploit CAR activity for the production of a wide range of amides.

[Scheme: CAR domain diagram with domains labelled A, P and R.]

Figure 3.1: Proposed interception of thioester intermediates bound to the PCP domain of CAR by ammonia.


3.2. CAR gene expression and enzyme purification

The genes for Mycobacterium marinum CAR (CARmm)186 or Nocardia iowensis CAR (CARni)182 (UniProt accession number Q6RKB1) were co-transformed with the gene for the 4’-phosphopantetheine transferase (PPTase) Sfp from Bacillus subtilis into BL21 (DE3) competent cells. PPTases are necessary to post-translationally add a 4’-phosphopantetheine (PPant) group to the PCP of CARs. In early work cells were grown in Luria Broth (LB) for 24 h at 20°C post induction with IPTG, but were later grown in AI media for 48 h at 20°C as this was more convenient. Following expression, the cells were lysed and CARmm (Figure 3.2) and CARni (Figure 3.3) were purified by nickel affinity chromatography. This was conducted using an AKTA purification system, with high concentration elution fractions being identified by SDS-PAGE and then pooled and desalted.

To test that the CARs were active, a standard NADPH absorbance assay at 340 nm was performed using benzoic acid as a model substrate, demonstrating native reduction activity by the depletion of NADPH following enzyme addition (Figure 3.4). As a control for this reaction, enzyme storage buffer without enzyme was added, and no resulting increase in the rate of NADPH depletion was observed. It was found that CARs produced from cells grown for 24 h at 37°C were inactive. By comparison, when cells were left at 37°C only until an OD600 of 0.6-0.8 was reached, followed by induction with IPTG and lowering of the temperature to 20°C for 24 h, reduction activity was retained.


[Gel image. Lanes: MW, L, FT, W1 and E1 to E11, with MW and PF on a second panel. The CARmm band is marked at 129 kDa.]

Figure 3.2: SDS-PAGE of CARmm-expressing BL21 (DE3) cell lysate nickel affinity purification by AKTA. Bands corresponding to CARmm at 129 kDa can be seen in all fractions. MW: molecular weight ladder, L: lysate, FT: column flow through, W1: wash fraction 1, E: elution fraction. PF: Pooled fraction.

[Gel image. Lanes: MW, L, FT, W1, E1 to E7, PF and MW. The CARni band is marked at 130 kDa.]

Figure 3.3: SDS-PAGE of CARni-expressing BL21 (DE3) cell lysate nickel affinity purification by AKTA. Bands of 130 kDa corresponding to CARni can be observed in all fractions. MW: molecular weight ladder, L: lysate, FT: column flow through, W1: wash fraction 1, E: elution fraction. PF: Pooled fraction.


[Plot: absorbance at 340 nm against time (min), with the point of CAR enzyme addition marked.]

Figure 3.4: Typical NADPH absorbance assay at 340 nm before and after the addition of active purified CARmm measuring the reduction of carboxylic acids. 1 mL reaction mixtures consisted of 100 mM Tris-HCl (pH 7.5), 10 mM MgCl2, 10 mM benzoic acid, 5 mM ATP, 0.25 mM NADPH and 100 µg CAR enzyme which was added last following a period of time with no enzyme added to determine background NADPH depletion.
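An absorbance trace of this kind can be converted into a rate of NADPH consumption with the Beer-Lambert law. The sketch below assumes the widely used molar extinction coefficient for NADPH at 340 nm (6220 M–1 cm–1) and a 1 cm path length; the slope values are hypothetical and the function names are illustrative, so this is not a record of how the rates in this work were processed.

```python
NADPH_EXTINCTION = 6220.0  # M^-1 cm^-1 at 340 nm (commonly quoted literature value)


def nadph_rate_uM_per_min(slope_with_enzyme, slope_background, path_cm=1.0):
    """Enzyme-dependent NADPH consumption in uM per minute.

    Slopes are in absorbance units per minute and are negative when NADPH
    is being depleted; the background slope (before enzyme addition) is
    subtracted from the slope measured after enzyme addition."""
    net_slope = slope_with_enzyme - slope_background
    return -net_slope / (NADPH_EXTINCTION * path_cm) * 1e6


def specific_activity(rate_uM_per_min, volume_mL, enzyme_mg):
    """Specific activity in umol of NADPH consumed per minute per mg enzyme."""
    umol_per_min = rate_uM_per_min * volume_mL / 1000.0
    return umol_per_min / enzyme_mg


# Hypothetical slopes; 1 mL reaction and 100 ug (0.1 mg) CAR as in Figure 3.4.
rate = nadph_rate_uM_per_min(slope_with_enzyme=-0.0632, slope_background=-0.0010)
activity = specific_activity(rate, volume_mL=1.0, enzyme_mg=0.1)
print(round(rate, 2), "uM/min;", round(activity, 3), "umol/min/mg")
```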

3.3. CAR dependent amide formation - method development and substrate screen

To maximise the possibility of the CAR PCP-bound thioester being intercepted by ammonia, it was thought that the reaction should be attempted at or close to the pKa of the amine to increase deprotonation. Therefore the aforementioned NADPH absorbance assay was conducted with CARmm at pH 10.0, which is above the pKa of ammonia, to test for activity at high pH. This demonstrated that CARmm, which is normally used at neutral pH, retained reduction activity at this high pH.
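The reasoning for working near the pKa can be made concrete with the Henderson-Hasselbalch relationship, which gives the fraction of amine present as the neutral, nucleophilic free base. The sketch below uses an approximate literature pKa of 9.25 for the ammonium ion at 25°C; this value is an assumption here, and it shifts with temperature and ionic strength, so the numbers are indicative only.

```python
def fraction_free_amine(pH, pKa):
    """Fraction of an amine present as the neutral free base at a given pH,
    from the Henderson-Hasselbalch relationship."""
    return 1.0 / (1.0 + 10 ** (pKa - pH))


AMMONIUM_PKA = 9.25  # approximate value at 25 degrees C (assumed, not from this work)

for pH in (7.5, 9.0, 10.0):
    percent = 100.0 * fraction_free_amine(pH, AMMONIUM_PKA)
    print(f"pH {pH}: {percent:.1f}% free ammonia")
```

At a pH equal to the pKa exactly half of the amine is deprotonated, and each pH unit below the pKa reduces the free base roughly tenfold, which is why only a small fraction of ammonia is available as a nucleophile at neutral pH.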

In preliminary work, CARmm was used in a pH 10.0 Tris-HCl buffer with the model substrate benzoic acid 25, ATP, MgCl2 and increasing concentrations of ammonia (pH 10.0), in the presence or absence of NADPH. After 22 h, the reaction was quenched with 10 volumes of methanol and the samples analysed by reverse phase HPLC, using commercially available benzamide 38 as a standard. Interestingly, peaks corresponding to benzamide 38 were observed, increasing in area with increasing concentration of ammonia and with the removal of NADPH (Figure 3.5), suggesting that when NADPH is present the aminolysis reaction is in competition with the natural reduction reaction.

However, in addition to the peak observed for the amide product, an unidentified peak was also visible when NADPH was omitted, which decreased in intensity when the ammonia concentration was increased. It was postulated that this peak could be caused by the undesired formation of an amide between benzoic acid and tris(hydroxymethyl)aminomethane from the pH 10.0 reaction buffer.

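[HPLC traces: six stacked chromatograms labelled by ammonia concentration (10 mM, 50 mM and 100 mM, in two sets), with an unidentified peak marked "?".]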

Figure 3.5: HPLC traces showing the CAR-dependent production of benzamide 38 from benzoic acid 25 and ammonia. CARmm. Blue: 10 mM NADPH, 10 mM ammonia, Red: 10 mM NADPH, 50 mM ammonia, Green: 10 mM NADPH, 100 mM ammonia, Pink: 0 mM NADPH, 10 mM ammonia, Gold: 0 mM NADPH, 50 mM ammonia, Purple: 0 mM NADPH, 100 mM ammonia. Reaction conditions: 100 mM Tris-HCl (pH 10.0), 2 mM MgCl2, 5 mM ATP, 10 mM benzoic acid, 37°C, 22 h.

To investigate whether acids other than benzoic acid could be used for amide formation using CARmm and ammonia, the substrate scope of carboxylic acids was expanded to include benzoic acid and cinnamic acid derivatives (25, 28, 30, 39-43) with the aim of producing their respective primary amides (38, 44-50) (Table 3.1). To permit the calculation of conversions by HPLC analysis, response factors were determined by generating UV absorbance calibration curves for the respective carboxylic acid substrates and primary amide products at equimolar concentrations between 0.1 mM and 1 mM (Figure 3.6). The response factors for the primary amide screen are shown in Table 3.2.
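A response factor of this kind can be used to turn a pair of HPLC peak areas into a percentage conversion. The formula below is one common way of doing so and is an assumption here, since the exact expression is not given at this point in the text: the amide peak area is multiplied by the response factor (acid absorbance/amide absorbance) to put it on the same scale as the acid peak, and conversion is taken as the corrected amide area over the total. The peak areas in the example are hypothetical.

```python
def conversion_percent(acid_area, amide_area, response_factor):
    """Percentage conversion of acid to amide from HPLC peak areas.

    response_factor is acid absorbance / amide absorbance at equimolar
    concentration, so amide_area * response_factor is the amide signal
    expressed in acid-equivalent units."""
    amide_equivalent = amide_area * response_factor
    total = acid_area + amide_equivalent
    if total == 0:
        raise ValueError("no acid or amide peak area supplied")
    return 100.0 * amide_equivalent / total


# Hypothetical peak areas, with the 25/38 response factor of 1.42 from Table 3.2.
example = conversion_percent(acid_area=858.0, amide_area=100.0, response_factor=1.42)
print(f"{example:.1f}% conversion")
```

The calculation assumes that acid and amide are the only species derived from the substrate, so any side product (such as the unidentified peak seen in Tris buffer) would lead to an overestimate.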

In these primary amide forming reaction trials, a 100 x excess of amine was used, as this had given the highest conversion in the initial benzoic acid 25 trial (Figure 3.5) compared to 10 x and 50 x excess. Amide product peaks, identified by comparison with commercially available standards, were observed following reaction with all substrates with the exception of 2-methylbenzoic acid 42 and 2-hydroxybenzoic acid 43. This reflects the substrate specificity of CARmm, which excludes carboxylic acids with substituent groups at the ortho position.192 Additionally, to investigate whether other CAR enzymes could be used in this manner, CARni donated by Dr. Sasha Derrington was also trialled in this screen. CARni was subsequently produced and purified independently. In all cases where primary amides were produced, however, the conversions were low (Table 3.3).

As a control to ensure that the reactions were enzyme dependent and not spontaneous, each separate carboxylic acid reaction was also conducted in the same conditions but in the absence of enzyme, with enzyme storage buffer being used in its place. In all cases no peaks corresponding to the amide products could be observed.


Table 3.1: Panel of carboxylic acids screened for CAR primary amidation activity with their corresponding primary amides.

[Table of chemical structures: three column pairs of acid and corresponding primary amide. Structures not reproduced.]

[Plot: benzoic acid 25 peak area (y-axis, 0 to 1400) against benzamide 38 peak area (x-axis, 0 to 1000), with linear fit y = 1.42x.]

Figure 3.6: HPLC response factor calibration curve between equimolar concentrations of benzoic acid 25 and benzamide 38 at 230 nm.
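A calibration line of the form y = kx, as used for the response factor, can be obtained by least squares with the intercept fixed at zero, for which the slope is the sum of x·y divided by the sum of x². The sketch below illustrates that calculation; the peak areas are invented to give a slope of 1.42 and are not the measured calibration data.

```python
def slope_through_origin(x_values, y_values):
    """Least-squares slope k for the model y = k * x (no intercept)."""
    if len(x_values) != len(y_values) or len(x_values) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    sum_xx = sum(x * x for x in x_values)
    if sum_xx == 0:
        raise ValueError("at least one x value must be non-zero")
    sum_xy = sum(x * y for x, y in zip(x_values, y_values))
    return sum_xy / sum_xx


# Hypothetical equimolar calibration points (amide peak area, acid peak area).
amide_areas = [100.0, 200.0, 400.0, 800.0]
acid_areas = [142.0, 284.0, 568.0, 1136.0]
response_factor = slope_through_origin(amide_areas, acid_areas)
print(round(response_factor, 2))
```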


Table 3.2: HPLC response factors obtained between various carboxylic acids and their corresponding primary amides at equimolar concentrations.

Acid   Amide   Acid peak retention   Amide peak retention   Response factor (acid            Wavelength
               time (min)            time (min)             absorbance/amide absorbance)
25     38      9.2                   3.1                    1.42                             230 nm
28     44      7.3                   3.4                    0.92                             270 nm
39     45      9.2                   3.9                    0.95                             270 nm
40     46      7.0                   2.9                    1.07                             230 nm
30     47      6.8                   2.9                    0.90                             230 nm
41     48      6.0                   2.7                    1.03                             230 nm

Table 3.3: Conversions achieved for the initial primary amide production screen from various carboxylic acids following reaction with CAR.a

Carboxylic acid substrate   Conversion to primary amide (b)   Conversion to primary amide (b)
                            (CARmm)                           (CARni)
25                          6%                                2%
28                          12%                               12%
39                          0%                                9%
40                          3%                                NA
30                          7%                                6%
41                          4%                                1%
42                          0%                                0%
43                          0%                                0%

a. 1 mM carboxylic acid, 100 mM ammonia, 100 µg mL–1 purified CARmm and CARni, 5 mM ATP, 2 mM MgCl2, 100 mM Tris-HCl, pH 10.0, 37°C, 250 rpm, 22 h. b. Conversion determined by HPLC at 22 h. (NA: not applicable at this stage)


Following this initial substrate screen of carboxylic acids using a Tris-HCl buffered solution, it was decided that an alternative buffer system should be used. Firstly, pH 10 is outside the normal buffering range of Tris-HCl. Secondly, it appeared that Tris itself was participating in aminolysis, giving rise to an unidentified peak on all traces; the area of this peak was lower in the presence of NADPH or of higher concentrations of ammonia, suggesting a competing reaction. For high pH reactions the buffer selected was sodium carbonate-sodium bicarbonate as this, unlike the Tris-HCl buffer, does not contain an amine and was therefore less likely to undergo a side reaction.

A simple pH profile was conducted with CARmm and cinnamic acid 28 between pH 7.5 and pH 10.5, with a potassium phosphate buffer being used between pH 7.5 and pH 8.0 and a sodium carbonate buffer being used between pH 9.0 and pH 10.5. This showed that pH 9.0 rather than pH 10.0 was optimal for this particular reaction (Table 3.4). No unidentified product peaks were observed in any traces, suggesting that those seen in previous reactions were due to a side reaction of the Tris amine with the carboxylic acid substrate. The carboxylic acids which had previously been shown to be substrates were again used at pH 9.0 with this sodium carbonate buffered system, now for 24 h, with either CARmm or CARni (Table 3.5).


Table 3.4: Conversions to amide 44 achieved in a preliminary pH profile of CAR-dependent amide formation.a

Buffer                                 pH     Conversion to cinnamamide 44 (b)
Potassium phosphate                    7.5    0.5%
Potassium phosphate                    8.0    0.4%
Sodium carbonate-sodium bicarbonate    9.0    5.5%
Sodium carbonate-sodium bicarbonate    9.5    3.5%
Sodium carbonate-sodium bicarbonate    10.0   0.4%
Sodium carbonate-sodium bicarbonate    10.5   0.0%

a. 1 mM 28, 100 mM ammonia, 200 µg mL–1 purified CARmm, 2 mM MgCl2, 5 mM ATP, 100 mM buffer, pH 7.5 – pH 10.5, 37°C, 250 rpm, 22 h. b. Conversion determined by HPLC at 22 h.

Table 3.5: Conversions achieved for a repeat of the primary amide screen at pH 9.0.a

Carboxylic acid substrate   Conversion to primary amide (b)   Conversion to primary amide (b)
                            (CARmm)                           (CARni)
25                          4%                                3%
28                          15%                               12%
39                          12%                               13%
40                          2%                                2%
30                          4%                                3%
41                          3%                                3%

a. 1 mM carboxylic acid, 100 mM ammonia, 100 µg mL–1 purified CAR, 10 mM MgCl2, 5 mM ATP, 100 mM sodium carbonate buffer, pH 9.0, 37°C, 24 h. b. Conversion determined by HPLC at 24 h.

Having successfully demonstrated that ammonia could be used to intercept various carboxylic acids following their activation by CAR to produce primary amides, other amines, both primary and secondary, were introduced in place of ammonia, again in excess. Benzoic acid 25 was reacted with methylamine 51, piperidine 52 or propargylamine 53 in the presence of CARmm or CARni to produce amides 54-56 (Table 3.6). Amide formation was detected by HPLC and conversions were calculated as for the primary amides, using commercial standards to obtain response factors; standards were synthesised chemically by Dr. Michael Hollas when not commercially available. While amide production was observed for both CARmm and CARni with 51 and 52, only very low conversion could be achieved with CARmm when 53 was used, with precipitate of both CARmm and CARni being visible soon after addition of this amine. Overall this assay showed that, as well as ammonia, other amines could be used to intercept activated carboxylic acids in CAR-dependent amidation.

Table 3.6: Conversions to secondary and tertiary amides obtained by CAR-dependent amide formation with methylamine 51, piperidine 52 and propargylamine 53a

Amine   Amide   Conversion (b)   Conversion (b)
                CARmm            CARni
51      54      15%              12%
52      55      4%               5%
53      56      2%               ND

a. 1 mM 25, 100 mM amine, 100 µg mL–1 purified CAR, 10 mM MgCl2, 5 mM ATP, 100 mM sodium carbonate buffer, pH 9.0, 37°C, 24 h. b. Conversion determined by HPLC at 24 h. ND: not detected.


To demonstrate the potential commercial utility of this method, a product with pharmaceutical relevance was selected as a target for production. Reaction of the carboxylic acid (2E)-3-(1,3-benzodioxol-5-yl)acrylic acid 57 with piperidine 52 produces the amide ilepcimide 58 (Figure 3.7), which has been shown to possess anticonvulsant properties.208 This reaction was conducted as with the other acid substrates and amines, at pH 9.0 and 37°C, and at this initial stage for 22 h, giving conversions of 5% with CARmm and 26% with CARni.

Figure 3.7: CAR-dependent formation of the pharmaceutically relevant amide ilepcimide 58 from carboxylic acid 57 and amine 52. 1 mM 57, 100 mM 52, 100 µg mL–1 purified CAR, 10 mM MgCl2, 5 mM ATP, 100 mM sodium carbonate buffer, pH 9.0, 37°C, 22 h. Conversion determined by HPLC at 22 h.

A clear limitation of this reaction with all substrates was the modest conversions achieved. It was initially postulated that the low conversions were due to poor stability of the CAR enzymes at high pH, as white enzyme precipitate was visible within 30 min in all reactions. Therefore ilepcimide 58 was used as a target product to optimise conversions. If the low conversion was indeed due to enzyme instability, rather than other possible limiting factors such as ATP hydrolysis, then replacement of the inactive enzyme with another batch should restore activity. Hourly batch addition of enzyme was attempted with CARni, with 100 µg of purified enzyme per batch being added to replace the inactivated enzyme from the initial addition. Five separate reactions were conducted with 0, 1, 2, 3 or 4 additional batches of enzyme. All reactions had a total reaction time of 22 h. This trial demonstrated that the conversion increased with each subsequent batch, reaching 45% after a total of 5 batches of CARni (Table 3.7). This suggested that enzyme stability was indeed the limiting factor for conversion to adenylate and subsequent aminolysis. Additionally, an initial time-course reaction conducted with CARni showed that maximal conversion was reached after 2 h, suggesting that the reaction ceased after this time (Table 3.8). The maximum duration of this latter reaction had been increased to 24 h in place of 22 h.

Table 3.7: Improved conversions achieved for ilepcimide 58 production when supplementary batches of CARni are added to the amide forming reaction.a

Batches of CARni   Conversion to 58 (b)
1                  19%
2                  30%
3                  34%
4                  41%
5                  45%

a. 1 mM 57, 100 mM 52, 100 µg mL–1 purified CARni per batch, up to 5 hourly batches, 10 mM MgCl2, 5 mM ATP, 100 mM sodium carbonate buffer, pH 9.0, 37°C, 22 h. b. Conversion determined by HPLC at 22 h.

Table 3.8: Conversions to 58 achieved at different time points during the CARni reaction up to 24 h.a

Time     Conversion (b)
5 min    4%
10 min   8%
20 min   13%
30 min   16%
40 min   18%
50 min   20%
1 h      21%
2 h      23%
24 h     21%

a. 1 mM 57, 100 mM 52, 100 µg mL–1 purified CARni, 10 mM MgCl2, 5 mM ATP, 100 mM sodium carbonate buffer, pH 9.0, 37°C. b. Conversion determined by HPLC at each time point.
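To put these conversions in context, the enzyme loading can be expressed as a molar concentration and the conversion as an approximate number of turnovers per enzyme molecule. The sketch below uses the 130 kDa mass of CARni (Figure 3.3) and the conditions of Table 3.8; it assumes that all of the added protein is active CARni, which the precipitation described above suggests is optimistic, so the turnover figure is a lower-bound style estimate rather than a measured value.

```python
def enzyme_concentration_uM(protein_ug_per_mL, mass_kDa):
    """Molar enzyme concentration in uM; ug/mL divided by kDa gives umol/L."""
    return protein_ug_per_mL / mass_kDa


def turnovers(substrate_mM, conversion_fraction, enzyme_uM):
    """Moles of product formed per mole of enzyme."""
    product_uM = substrate_mM * 1000.0 * conversion_fraction
    return product_uM / enzyme_uM


# Values from Table 3.8: 1 mM acid 57, 100 ug/mL CARni, 23% conversion at 2 h.
carni_uM = enzyme_concentration_uM(protein_ug_per_mL=100.0, mass_kDa=130.0)
n_turnovers = turnovers(substrate_mM=1.0, conversion_fraction=0.23, enzyme_uM=carni_uM)
print(f"{carni_uM:.2f} uM enzyme, about {n_turnovers:.0f} turnovers")
```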


Next, 10 hourly 100 µg batches of either CARni or CARmm, for a final total of 1 mg of enzyme, were used for the production of 58 in a total reaction time of 24 h, affording much improved conversions of 93% and 65% by CARni and CARmm respectively.

3.3.1. Use of CARni-containing cell lysate in preparative-scale production of ilepcimide 58 for isolation and characterisation

Following success in improving conversions to 58 using batch addition of purified CAR enzyme, it was decided to investigate the use of cell lysate containing CAR enzyme for preparative-scale production of 58. If successful, this would be a quicker and more convenient method of producing 58 than the large-scale purification of CAR enzyme that would otherwise be necessary for preparative-scale production of amides. Moreover, preparative-scale production of 58, followed by isolation and purification, would allow for the characterisation of this product.

BL21 (DE3) cells co-transformed with CARni and Sfp were lysed and centrifuged, with the supernatant being pooled and stored on ice for subsequent use in the reaction. As in amide forming reactions conducted with purified CAR enzyme, the reaction mixture was buffered to pH 9.0 at 37°C using a 100 mM sodium carbonate-sodium bicarbonate buffer, with a piperidine 52 concentration of 100 mM. The ATP concentration was increased to 15 mM to compensate for potential consumption of the cofactor by side processes involving cellular components. 100 mg of starting material 57 was added to a starting reaction mixture volume of 100 mL.

As batch addition of enzyme had proven beneficial to conversion to amide product using purified CAR enzyme, batch addition of lysate was also employed. 1 mL of lysate was added every 30 min for the first 5 h of the 24 h reaction. After 24 h the reaction was quenched in methanol. Dichloromethane (DCM) was then added to the mixture to extract the amide product. Following purification, an isolated product yield of 19% was achieved, and NMR and mass spectrometry were used to characterise the product and confirm its identity as 58.
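The scale of this reaction can be put into molar terms with a short calculation. The sketch below derives molar masses from molecular formulae that are assumed here from the compound names (C10H8O4 for acid 57 and C15H17NO3 for ilepcimide 58) and from standard atomic masses; the product mass it returns is what a 19% isolated yield would correspond to under those assumptions, not a mass reported in this work.

```python
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

# Molecular formulae assumed from the compound names (not stated in the text).
ACID_57 = {"C": 10, "H": 8, "O": 4}
AMIDE_58 = {"C": 15, "H": 17, "N": 1, "O": 3}


def molar_mass(formula):
    """Molar mass in g/mol from a dict of element counts."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def substrate_concentration_mM(mass_mg, formula, volume_mL):
    """Substrate concentration in mM for a given mass and reaction volume."""
    return mass_mg / molar_mass(formula) / volume_mL * 1000.0


def product_mass_mg(substrate_mg, yield_fraction):
    """Mass of amide 58 corresponding to a fractional yield from acid 57."""
    substrate_mmol = substrate_mg / molar_mass(ACID_57)
    return substrate_mmol * yield_fraction * molar_mass(AMIDE_58)


start_mM = substrate_concentration_mM(mass_mg=100.0, formula=ACID_57, volume_mL=100.0)
isolated_mg = product_mass_mg(substrate_mg=100.0, yield_fraction=0.19)
print(f"{start_mM:.1f} mM starting acid; 19% yield is about {isolated_mg:.1f} mg of 58")
```

On these assumptions the preparative reaction starts at roughly five times the 1 mM acid concentration used in the analytical-scale reactions, while the piperidine concentration is unchanged at 100 mM.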

3.3.2. Control reactions to establish catalytic activity of CAR in amide synthesis

Control reactions were conducted to ensure that the production of amides was due to CAR activity and not a background reaction. First, each primary, secondary and tertiary amide forming reaction, including the reaction to produce 58, was also conducted in the absence of purified enzyme, with the enzyme storage buffer being used in its place; this afforded no amide in any reaction. This showed that, under the CAR-dependent amide forming reaction conditions, amides could not form spontaneously in the absence of CAR. Additionally, to demonstrate that the reaction was ATP dependent, the benzoic acid 25 and (2E)-3-(1,3-benzodioxol-5-yl)acrylic acid 57 reactions were conducted in the presence of enzyme but with ATP omitted; again this resulted in no amide being produced.

3.3.3. Method optimisation and reaction profiling using ilepcimide 58 as a target

As using batch addition of enzyme or cell lysate to improve conversion to amides would be expensive and inconvenient for industrial use, method optimisation using single batches of purified enzyme was investigated. This focused primarily on examining the optimal temperature and pH of the amidation reaction. Optimisation studies were conducted with ilepcimide 58 as the model product. Three temperatures were analysed, 37°C, 30°C and 22°C, using CARmm and CARni as biocatalysts (Table 3.9). These studies resulted in much improved conversions using single batch addition, with optimal temperatures of 30°C for CARmm, giving a conversion to 58 of 71%, and of 22°C for CARni, giving a conversion of 68%.


Table 3.9: Conversions achieved for ilepcimide 58 production at different temperatures using CARmm and CARni.a

Temp (°C)   CARmm conversion (b)   CARni conversion (b)
22          52%                    68%
30          71%                    50%
37          17%                    48%

a. 1 mM 57, 100 mM 52, 100 µg mL–1 purified CAR, 10 mM MgCl2, 5 mM ATP, 100 mM sodium carbonate buffer, pH 9.0, 22°C, 30°C or 37°C. b. Conversion determined by HPLC at 24 h.

While these results showed that incubation temperature could have different effects on amide conversion depending on the CAR used, they also showed increased conversions for both CAR enzymes at 37°C compared with earlier reactions. This was attributed to the angle at which the Eppendorf tubes were now being held: whereas in previous work they had been held at a 45° angle to increase agitation, they were now being held vertically. This resulted in no white enzyme precipitate being visible after reactions at any temperature, suggesting that the enzyme was no longer precipitating after a short amount of time. Therefore tubes were held vertically in all subsequent reactions.

The CARmm reaction at 30°C was then analysed for optimal pH by conducting a pH profile between pH 7.5 and pH 10.0 using overlapping buffers of HEPES (pH 7.5-pH 8.0), Tris-HCl (pH 7.5-pH 9.0) and sodium carbonate-sodium bicarbonate (pH 9.0-pH 10.0) (Figure 3.8). This demonstrated that pH 9.0 was indeed the optimal pH for amide formation, with 71% conversion, but unlike in preliminary work with cinnamic acid 28, where only trace amounts of amide were observed at neutral pH, up to 21% conversion to ilepcimide 58 could be achieved at pH 7.5. Beyond pH 9.0, at pH 9.5 and pH 10.0, however, conversion decreased, suggesting that enzyme instability at this high pH is a limiting factor. It would appear that a balance is required, with the pH high enough to promote amine deprotonation but not so high as to destabilise the enzyme.


Figure 3.8: Conversion to 58 by reaction of CARmm with 57 and 52 at different pH values with overlapping buffers: 100 mM HEPES (green), 100 mM Tris-HCl (red), 100 mM sodium carbonate-sodium bicarbonate. Reaction conditions: 1 mM 57, 100 mM 52, 100 µg mL–1 purified CARmm, 10 mM MgCl2, 5 mM ATP, 30°C, 24 h, conversion determined by HPLC at 24 h.

Next the amine excess (Figure 3.9) and ATP concentration (Figure 3.10) were investigated for optimal conversion to ilepcimide 58 with CARmm at 30°C and pH 9.0. Piperidine 52 was used from 1 mM (1 x excess) to 200 mM (200 x excess). The highest conversion was obtained with 100 mM (100 x excess), although 21% conversion could still be achieved with 5 mM 52 (5 x excess), and concentrations of 52 higher than 100 mM reduced conversion. ATP concentrations between 1 mM and 5 mM were tested, with 5 mM giving the highest conversion.


Figure 3.9: Conversion to 58 by CARmm achieved at varying excesses of amine 52. Amine excess ranges from 1 x to 200 x. Reaction conditions: 100 mM sodium carbonate buffer, pH 9.0, 1 mM 57, 1 mM-200 mM 52, 100 µg mL–1 purified CARmm, 10 mM MgCl2, 5 mM ATP, 30°C, 24 h, conversion determined by HPLC at 24 h.

Figure 3.10: Conversion to 58 by CARmm achieved at varying concentrations of ATP. Reaction conditions: 100 mM sodium carbonate buffer, pH 9.0, 1 mM 57, 100 mM 52, 100 µg mL–1 purified CARmm, 10 mM MgCl2, 0-5 mM ATP, 30°C, 24 h, conversion determined by HPLC at 24 h.

As improved temperature conditions had now been found for ilepcimide 58 production by both CARmm and CARni, the temperature studies were repeated with the original carboxylic acid and amine screens for both enzymes (Table 3.10). Eppendorf tubes containing the reaction mixes were again held vertically, as this had been shown to improve conversion to 58. Unlike with the production of 58, 37°C was shown to be optimal for the production of primary amides with both CARs. This was also the case for the production of the secondary amides 54 and 55 with CARmm; however, 22°C was still optimal for the production of 55 by CARni. Conversion to 56 was only observed with CARmm at 37°C. While conversions in all CARmm reactions at 37°C were improved compared to the initial reaction screens held at a 45° angle, with up to 25% conversion to 45 and 54, there was no such improvement for CARni, with the initial screen results at 37°C still providing the higher conversions.

Table 3.10: Conversions to amides 38, 44-48 and 54-56 achieved at different temperatures using CARmm and CARni.[a]

           Conversion[b]
           CARmm                     CARni
Product    22°C    30°C    37°C      22°C    30°C    37°C
38         1%      1%      11%       2%      2%      3%
44         5%      6%      22%       9%      7%      12%
45         5%      7%      25%       9%      6%      13%
46         1%      1%      8%        1%      1%      2%
47         1%      2%      10%       2%      2%      3%
48         1%      1%      13%       3%      2%      3%
54         6%      5%      25%       12%     6%      12%
55         3%      2%      8%        7%      2%      5%
56         0%      0%      4%        0%      0%      0%

[a] 1 mM acid, 100 mM amine, 100 µg mL–1 purified CAR, 10 mM MgCl2, 5 mM ATP, 100 mM sodium carbonate buffer, pH 9.0, 22°C, 30°C or 37°C. [b] Conversion determined by HPLC at 24 h.
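The temperature optima discussed for Table 3.10 can be read off programmatically. The following is a minimal sketch rather than part of the original analysis: the conversions are transcribed from Table 3.10, while the names `TABLE_3_10` and `best_temperatures` are illustrative assumptions. Ties are reported explicitly, since single 24 h readings cannot distinguish temperatures that give the same conversion.

```python
# Conversions (%) transcribed from Table 3.10; tuple order is 22, 30, 37 °C.
TEMPS_C = (22, 30, 37)
TABLE_3_10 = {
    "38": {"CARmm": (1, 1, 11), "CARni": (2, 2, 3)},
    "44": {"CARmm": (5, 6, 22), "CARni": (9, 7, 12)},
    "45": {"CARmm": (5, 7, 25), "CARni": (9, 6, 13)},
    "46": {"CARmm": (1, 1, 8), "CARni": (1, 1, 2)},
    "47": {"CARmm": (1, 2, 10), "CARni": (2, 2, 3)},
    "48": {"CARmm": (1, 1, 13), "CARni": (3, 2, 3)},
    "54": {"CARmm": (6, 5, 25), "CARni": (12, 6, 12)},
    "55": {"CARmm": (3, 2, 8), "CARni": (7, 2, 5)},
    "56": {"CARmm": (0, 0, 4), "CARni": (0, 0, 0)},
}


def best_temperatures(conversions):
    """Return every temperature (°C) that gives the highest conversion.

    An empty list means no product was detected at any temperature.
    """
    top = max(conversions)
    if top == 0:
        return []
    return [t for t, c in zip(TEMPS_C, conversions) if c == top]


for product, by_enzyme in TABLE_3_10.items():
    for enzyme, conversions in by_enzyme.items():
        print(product, enzyme, best_temperatures(conversions))
```

Run on the tabulated values, this returns 37°C alone for every CARmm reaction and 22°C alone for 55 with CARni, while 48 and 54 with CARni are tied between 22°C and 37°C.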

Time-course analyses were then conducted in triplicate for the CARmm ilepcimide 58 production reaction at 30°C (Figure 3.11) and the CARmm benzamide 38 reaction at 30°C and 37°C (Figure 3.12). These demonstrated that, unlike in the preliminary time-course reaction, activity continued beyond 2 h and could continue for up to 72 h, at which point 96% conversion to 58 could be achieved by CARmm at 30°C, or 16% conversion to 38 at 37°C. The linear conversion increase for the first 2 h of the 58 production reaction was used to calculate a specific activity of 12.22 ± 0.15 mU mg–1. This was not possible with the 38 production reaction, as the conversion increase did not remain linear within the first hours of the reaction.
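To make the units of the specific activity concrete, the sketch below relates it to the reaction conditions. It assumes the conventional definition of 1 U as 1 µmol of product formed per minute and that the rate was taken from the linear first 2 h; the exact calculation used is not stated here, and the function names are illustrative.

```python
def specific_activity_mU_per_mg(product_uM, minutes, enzyme_mg_per_L):
    """Specific activity in mU mg-1, taking 1 U = 1 µmol product min-1.

    µM is µmol L-1 and the enzyme is given in mg L-1, so the reaction
    volume cancels.
    """
    rate_uM_per_min = product_uM / minutes
    return 1000.0 * rate_uM_per_min / enzyme_mg_per_L


def product_for_activity_uM(activity_mU_per_mg, minutes, enzyme_mg_per_L):
    """Inverse relation: product (µM) formed in `minutes` at a given activity."""
    return activity_mU_per_mg / 1000.0 * enzyme_mg_per_L * minutes


ENZYME_MG_PER_L = 100.0  # 100 µg mL-1 purified CARmm
ACID_UM = 1000.0         # 1 mM 57

formed = product_for_activity_uM(12.22, 120.0, ENZYME_MG_PER_L)
print(f"{formed:.1f} µM in 2 h ({100 * formed / ACID_UM:.1f}% conversion)")
```

Under these assumptions, 12.22 mU mg–1 at 100 µg mL–1 enzyme corresponds to roughly 1.2 µM of amide per minute, or about 15% conversion of 1 mM 57 over the first 2 h.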

Figure 3.11: Time-course analysis of conversion to 58 by an optimised 30°C, pH 9.0 CARmm reaction. 1 mM 57, 100 mM 52, 100 µg mL–1 purified CARmm, 10 mM MgCl2, 5 mM ATP, 100 mM sodium carbonate buffer, pH 9.0, 30°C. Conversions determined by HPLC. Error bars show standard deviation of three separate sample readings.


Figure 3.12: Time-course analysis of conversion to 38 by the optimised 30°C or 37°C, pH 9.0 CARmm reactions. 1 mM 25, 100 mM ammonia, 100 µg mL–1 purified CARmm, 10 mM MgCl2, 5 mM ATP, 100 mM sodium carbonate buffer, pH 9.0, 30°C or 37°C. Conversions determined by HPLC. Error bars show standard deviation of three separate sample readings.

3.4. Conclusions

We have demonstrated the ability of CARs to be used as biocatalysts for amide formation with broad specificity. This is particularly interesting since aqueous amide synthetases, such as NRPS or ATP-grasp enzymes, are typically highly substrate specific for both carboxylic acids and amines. We have shown that by engineering the reaction conditions of CAR, principally by increasing the pH to promote deprotonation of amines, omitting the NADPH required for the native reduction step, and introducing a range of amines in excess, various primary, secondary and tertiary amides could be produced by CARs.

There were clear limitations, however, namely the particularly low conversions afforded in early amide-forming trials due to enzyme precipitation shortly after the beginning of the reactions. This could initially be overcome by replacing precipitated and inactive enzyme through sequential batch addition of active enzyme. By applying this process to cell lysate, it was possible to produce the commercially relevant target amide, ilepcimide 58, in mg quantities without the time-consuming and costly process of purifying proteins, which would be of greater use for industrial purposes. This preparative scale production permitted NMR analysis of the product, confirming its identity. By optimising incubation temperatures and the agitation of samples, it was possible to remove the requirement for batch addition of purified enzyme, with one batch being sufficient for conversions of up to 96% for 58 when the CARmm reaction was left for 72 h. This demonstrated that the CAR enzymes were able to facilitate amide formation over several hours despite the high pH environment used for amide formation. Additionally, optimisation experiments showed that different temperatures were optimal for different carboxylic acids and amines. Consequently, for any new carboxylic acid-amine combination, investigations should be carried out to find the optimal temperature.

The next question concerned the mechanism of the reaction, in particular which CAR reaction intermediate was intercepted by the amine nucleophile.


Chapter 4. Investigation of the mechanism of CAR-dependent amide synthesis

4.1. Introduction

In the process of activating and reducing carboxylic acid substrates, CARs generate two different intermediates: first the acyl adenylate and then the PCP-bound thioester.192 While we had initially aimed to intercept the PCP-bound thioester with amine nucleophiles, it was possible that it was the acyl adenylate which was being attacked. As explained in Chapter 1.3.5, there is in fact a precedent in the literature for amide formation arising from aminolysis of acyl adenylates.174,176 To ascertain whether the thioester intermediate was required for amide formation, the CARs were produced in the absence of co-produced Sfp, which is used to add the PPant group to the PCP. Additionally, a mutant variant of the enzyme was generated in which the serine binding site for PPant addition was mutated to an alanine, eliminating the possibility of the prosthetic group being added.

If removal of the prosthetic group resulted in no amide being produced, it would be a strong indication that the thioester is necessary for amide formation. Conversely, if amides were still produced despite the absence of a PPant group, this would show that the thioester is not required, and that the acyl adenylate can be attacked by amines to produce amides.

4.2. CAR production in the absence of co-produced Sfp

BL21 (DE3) competent cells were transformed with the CARmm and CARni genes as in Chapter 2, but in the absence of the co-transformed Sfp. Cells were then grown, lysed and CARmm (-Sfp) (Figure 4.1) and CARni (-Sfp) (Figure 4.2) purified as before. When used for the production of 58, conversions of 72% were achieved by CARmm (-Sfp) and 76% by CARni (-Sfp), as determined by HPLC. While this could be seen as an indication that PPant addition is not required for amide formation, when an NADPH absorbance assay was conducted, both CARs retained reduction activity. Indeed, it has been observed previously that CARs can be post-translationally modified with PPant by native E. coli PPTases, although with lower activity than with Sfp co-production, and this was likely the cause of the residual activity.24 Therefore this test cannot determine whether or not the PPant group is necessary for amidation. However, it did confirm that Sfp itself plays no role in amide formation.

Figure 4.1: SDS-PAGE of Sfp-absent, CARmm-expressing BL21 (DE3) cell lysate nickel affinity purification by AKTA. A protein band of His-tagged CARmm of approximately 129 kDa can be seen in all fraction lanes. MW: molecular weight ladder, L: lysate, FT: column flow through, W1: wash fraction 1, E: elution fraction.

Figure 4.2: SDS-PAGE of Sfp-absent, CARni-expressing BL21 (DE3) cell lysate nickel affinity purification by AKTA. A protein band of His-tagged CARni of approximately 130 kDa can be seen in all fraction lanes. MW: molecular weight ladder, L: lysate, FT: column flow through, W1: wash fraction 1, E: elution fraction.

4.3. CAR mutagenesis and the removal of the phosphopantetheine binding site

Site-directed mutagenesis of the CARni PPant binding site serine to an alanine was conducted with inverse PCR to remove the ability of CARni to receive the PPant prosthetic group, as has been done in previous work.24 The mutant DNA PCR product was run on an agarose gel and underwent gel extraction (Figure 4.3), ligation and transformation into 5-alpha competent cells, and was subsequently sequenced to confirm successful mutation. This mutant gene was expressed in the presence of co-transformed Sfp and CARni S689A was purified (Figure 4.4). The reaction to produce 58 was conducted and a conversion of 79% was achieved, despite removal of the PPant group. This demonstrated that adenylation activity alone was sufficient for amide formation, implying that the acyl adenylate could be directly attacked.

Figure 4.3: Mutation of CARni. Visualisation of the linear mutated CARni PCR product on an agarose gel. The linear mutant CARni S689A PCR product (linear pET21a CARni S689A) was visible at the correct size of 8.9 kb. The band was subsequently excised and the DNA purified.


Figure 4.4: SDS-PAGE of CARni S689A-expressing BL21 (DE3) cell lysate nickel affinity purification by AKTA. A band corresponding to the mutant CARni S689A enzyme at approximately 130 kDa can be seen in all fraction lanes. MW: molecular weight ladder, L: lysate, FT: column flow through, W1: wash fraction 1, E: elution fraction.

4.4. Use of truncated CAR for amide formation

It was next decided to utilise a truncation of CARmm possessing only the A and PCP domains (Figure 4.5 a) to demonstrate that the now redundant reduction domain was not required for amide formation through stabilising effects on the preceding A domain. The gene for the truncated CARmm (pET28b CARmm 729-1175) was available from previous structural work192 and was transformed into BL21 (DE3) cells as with full length CAR, and the protein was purified (Figure 4.5 b). If this construct was also active and able to make amides, it could potentially be useful in whole-cell production of amides without the need to add ATP cofactor, since, unlike with full length CARs, there would be less risk of the substrates being activated and leached off for reduction to aldehydes in the native CAR reaction, which has already been shown to occur in whole cells.190 While CARni S689A could potentially also be used in this role, recent structural work with CARs has shown that PPant groups in solution are able to act as a shuttle between A and R domains even in the absence of a PCP domain,192 and therefore there would remain a risk of substrate reduction. When the truncation was used for the production of 58, it was found to be active, with a conversion of 69%. This confirmed that the CAR R domain plays no role in amide formation and that the loss of any A domain-R domain cross-talk does not eliminate adenylation and aminolysis activity.

Figure 4.5: Truncated CARmm structural composition and purification. a. CARmm 729-1175 possesses an A domain and PCP domain, but lacks the C-terminal R domain. b. SDS-PAGE of CARmm 729-1175-expressing BL21 (DE3) cell lysate nickel affinity purification by AKTA. A strong band of approximately the size of CARmm 729-1175 (81 kDa) can be seen in elution fractions. MW: molecular weight ladder, L: lysate, FT: column flow through, W1: wash fraction 1, E: elution fraction.

4.5. Influence of a PPant mimetic on amide formation


While it was most likely that the acyl adenylate could be attacked by amines to produce amides, it was still possible that the thioester could also be attacked. If CARni S689A, lacking its own PPant group, were used with increasing concentrations of an added PPant mimetic, an increase in conversion to amide product would suggest that the mimetic is able to interact with the enzyme and generate a thioester which, like the acyl adenylate, is capable of being attacked to form the amide. The mimetic chosen was N-acetylcysteamine 59, which has been used in other work to simulate PPant groups in reactions with NRPS.209,210 CARni S689A was used to produce 58 while N-acetylcysteamine 59 was added from 0 to 20 mM, and the conversions were analysed. While the increase was modest, there did indeed appear to be a positive correlation between the concentration of 59 and the conversion to 58 (Figure 4.6 a). This could be an indication that 59 is able to interact with the CAR A domain, attacking the acyl adenylate and thereby producing thioesters which in turn could be attacked by amines. However, when analysing the areas of the substrate and product peaks by HPLC, it is clear that while the peak area of the substrate 57 decreases with increasing concentrations of 59 (Figure 4.6 b), there is no corresponding increase in the peak area of 58 (Figure 4.6 c), as would typically be observed with increasing conversion. While it is possible that the thioester is formed, accounting for the decrease in substrate peak area, it is also possible that it is not subsequently amidated, which in turn could account for the lack of an increase in product peak area. Future work to chemically synthesise thioester intermediates, and investigations into the interaction of CARs with PPant mimetics, would be required to determine the importance of the thioester in amide formation.
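The mismatch between the apparent conversion and the raw peak areas can be made explicit by evaluating the linear trend lines printed on Figure 4.6 at the two ends of the tested range. This is an illustrative sketch only: the slopes and intercepts are those shown on the figure, the names `TRENDS`, `predict` and `relative_change` are assumptions, and the low R² values mean the fitted lines describe the data only loosely.

```python
# Linear trend lines as printed on Figure 4.6 (x = concentration of 59 in mM).
TRENDS = {
    "conversion to 58 (fraction)": (0.0049, 0.6701),   # panel a, R² = 0.55
    "peak area of acid 57": (-16.225, 749.64),         # panel b, R² = 0.57
    "peak area of amide 58": (-4.9549, 1332.6),        # panel c, R² = 0.20
}


def predict(slope, intercept, x_mM):
    """Value of a fitted line at a given concentration of 59."""
    return slope * x_mM + intercept


def relative_change(slope, intercept, x_lo=0.0, x_hi=20.0):
    """Fractional change of the fitted line between x_lo and x_hi."""
    lo = predict(slope, intercept, x_lo)
    hi = predict(slope, intercept, x_hi)
    return (hi - lo) / lo


for label, (slope, intercept) in TRENDS.items():
    change = 100 * relative_change(slope, intercept)
    print(f"{label}: {change:+.1f}% from 0 to 20 mM 59")
```

On these fits the substrate peak area falls by roughly 43% between 0 and 20 mM 59, while the product peak area does not rise (its fitted line falls by about 7%), so the apparent gain in conversion of about 15% reflects loss of substrate signal rather than additional amide.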


Figure 4.6: Influence of 4’-phosphopantetheine mimetics on CAR-dependent amide bond formation. Conversion to 58 in a CARni(S689A) reaction was analysed with increasing concentrations of the phosphopantetheine mimetic N-acetylcysteamine 59 to determine if its addition could increase conversion. a. Conversion to 58 by the optimised 22°C CARni(S689A) reaction with increasing concentrations of 59 (linear trend line: y = 0.0049x + 0.6701, R² = 0.55). b. Substrate carboxylic acid 57 HPLC peak area vs. concentration of 59 in the reaction (y = -16.225x + 749.64, R² = 0.57). c. Product ilepcimide 58 HPLC peak area vs. concentration of N-acetylcysteamine 59 in the reaction (y = -4.9549x + 1332.6, R² = 0.20).

4.6. Investigation of CAR enantioselectivity using chiral amines

While the acyl adenylate seemed to be the principal high energy intermediate being attacked by amines to produce amides, and while it was demonstrated that the reaction was CAR-dependent, the question remained as to what extent the enzymes were involved. We had two alternative hypotheses: i) the CAR A domain activated the carboxylic acid substrate, which was then released from the enzyme as an acyl adenylate and attacked in solution with no further role for the enzyme, in which case ATP consumption would not be coupled to amide formation and ATP would likely continue to be expended regardless of the presence of amine nucleophile; or ii) following activation, the acyl adenylate remained bound to the A domain active site until the nucleophile entered the enzyme and attacked it, releasing the amide and AMP (Figure 4.7). With the latter alternative, ATP consumption would be coupled to amide formation, as ATP could not enter the active site until it had been cleared by the previous amide formation turnover.

In order to probe the mechanism further, chiral amines were used as nucleophiles. If CAR-dependent amide synthesis demonstrated enantioselectivity, this would suggest that the acyl adenylate remained bound to the enzyme, and that the amine had to enter the active site to perform aminolysis. Cinnamic acid 28 was used as the acid substrate for CAR-mediated amide synthesis (Figure 4.8). The first amine tested was alpha-methylbenzylamine 60; however, this failed to produce any new product peaks by HPLC. Therefore another amine, sec-butylamine 61, was employed, but with this amine too, only trace amounts of the amide product 62 could be observed with either enantiomer. In the absence of a standard for 62, this product was identified by LCMS analysis. When comparing HPLC traces of reactions conducted in triplicate with either the (R)- or (S)-enantiomer of 61, the product peak for 62 was not detected when using the (R)-enantiomer of 61. A peak for 62 was observed when using the (S)-enantiomer of 61, but this was measured as less than 1% by integration of peak area. While this may indicate a selective preference for the (S)-enantiomer, it could also be due to the apparently higher concentration of substrate 28 in reactions with the (S)-enantiomer of 61 than in reactions with the (R)-enantiomer, as judged by integration of the substrate peak areas by HPLC. The presence of a peak in reactions with the (S)-enantiomer of 61 but not the (R)-enantiomer may therefore be due to experimental error.

Future work investigating amine enantiomer selectivity in CAR-dependent amide formation should focus initially on finding a chiral amine which provides suitable conversion to allow appropriate analysis. However, the fact that these two amines acted as poor nucleophiles suggests that size limitations could be a factor in CAR-dependent amide formation, as would be the case if the acyl adenylate remained enzyme-bound.

Figure 4.7: Potential methods of CAR-catalysed amide formation via acyl adenylate interception. i. The acyl adenylate is released from the A domain and is intercepted by an amine in solution. ii. The acyl adenylate remains bound within the A domain following activation and aminolysis occurs at the enzyme active site.


Figure 4.8: Chiral amines 60 and 61 utilised for enantioselectivity studies of CAR-dependent amidation and the expected product of the 61 reaction, 62.

4.7. Analysis of adenylation activity using the EnzChek phosphate detection kit

Following a lack of success with the use of chiral amines to provide mechanistic insight into the aminolysis of the acyl adenylate, we turned our attention to the investigation of coupling between ATP consumption and amide product formation, for which a method which analyses ATP consumption was required.

The method chosen was the EnzChek inorganic phosphate (Pi) quantification kit by Thermo Fisher. This method is sensitive to very low concentrations of Pi and is typically used to quantify ATP consumption in reactions which yield ADP and Pi. When Pi is released it reacts with the substrate 2-amino-6-mercapto-7-methyl-purine riboside (MESG) 63 in a reaction catalysed by purine nucleoside phosphorylase (PNP)211 to produce 2-amino-6-mercapto-7-methyl-purine 64 and ribose 1-phosphate 65, with the former having an absorbance maximum at 360 nm (Figure 4.9). By generating a calibration curve between the concentration of Pi and absorbance at 360 nm, it is possible to calculate the concentration of Pi in a sample between 2 µM and 150 µM. However, as the ATP-dependent reaction catalysed by CAR releases AMP and pyrophosphate (PPi), an additional enzymatic step was required to break PPi down into Pi. Therefore, in addition to the kit enzyme and substrate, inorganic pyrophosphatase was added to samples.212


Figure 4.9: Use of the EnzChek phosphate assay kit and inorganic pyrophosphatase to analyse ATP consumption during CAR-dependent adenylation by absorbance change at 360 nm. MESG: 2-amino-6-mercapto-7-methyl-purine riboside 63, PNP: purine nucleoside phosphorylase.

A calibration curve was generated with Pi at concentrations between 2 µM and 100 µM and the absorbance at 360 nm monitored, allowing the calculation of Pi concentration in samples (Figure 4.10 a). It was important to note that for every two Pi molecules generated, only one molecule of ATP was being consumed. As well as the standard optimised CARmm ilepcimide 58 production reaction with 100 mM piperidine 52, control reactions lacking either the amine 52, acid 57, or enzyme components were also conducted. After 24 h, samples of the reaction mixtures were taken and their Pi concentrations were analysed using the EnzChek assay kit. There was a significant increase in the production of Pi when amine and enzyme were present compared to when the amine was absent (Figure 4.10 b). There was, however, a low level of Pi production in the absence of amine, suggesting that there is background enzyme-dependent hydrolysis of ATP with resulting PPi release. Nevertheless, the large increase in ATP consumption when both enzyme and amine nucleophile were present suggests that ATP consumption is indeed coupled to amide formation, with a new molecule of ATP not being hydrolysed until the preceding enzyme-bound acyl adenylate is amidated by an amine nucleophile. Moreover, when the concentration of ATP consumed was compared to the concentration of amide produced, the amide concentration observed by HPLC conversion analysis (715 µM) fell within the boundaries of the observed ATP consumption (725 ± 44 µM), suggesting that the reaction proceeds with little background hydrolysis occurring in the presence of amine.
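The arithmetic linking absorbance, Pi, ATP and amide can be sketched as follows. The calibration slope is the fitted line shown on Figure 4.10 a (y = 0.0061x) and the ATP and amide concentrations are those reported above; the function names and the `dilution_factor` argument are assumptions, since the dilution applied to the 24 h samples before measurement is not given in this section.

```python
CAL_SLOPE_PER_UM = 0.0061  # A360 per µM Pi, from the fit on Figure 4.10 a


def pi_from_absorbance(a360, dilution_factor=1.0):
    """Pi (µM) in the original reaction from a blank-corrected A360 reading.

    `dilution_factor` is hypothetical: it stands for whatever dilution
    brings the sample onto the 2-100 µM calibration range.
    """
    return a360 / CAL_SLOPE_PER_UM * dilution_factor


def atp_from_pi(pi_uM):
    """ATP consumed (µM): each adenylation releases one PPi, which
    inorganic pyrophosphatase splits into two Pi."""
    return pi_uM / 2.0


def within_error(value, mean, sd):
    """True if `value` lies within one standard deviation of `mean`."""
    return abs(value - mean) <= sd


atp_mean_uM, atp_sd_uM = 725.0, 44.0  # ATP consumed, as reported
amide_uM = 715.0                      # amide formed by HPLC, as reported

print("amide within ATP error:", within_error(amide_uM, atp_mean_uM, atp_sd_uM))
print(f"implied Pi in the reaction: {2 * atp_mean_uM:.0f} µM")
```

The implied Pi concentration lies well above the top of the calibration curve, which is why a dilution step has to be assumed in this sketch.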

As there was now a means of directly analysing the consumption of ATP in conjunction with CAR-dependent amide formation, this method was to be used for kinetic investigations of amide formation. While the coupling of ATP consumption with amide formation suggests that the aminolysis reaction likely occurs while the acyl adenylate remains bound to the enzyme, it was unknown whether this reaction was itself enzyme-catalysed. If the aminolysis reaction followed Michaelis-Menten kinetics, demonstrating that the amine was being bound as a substrate,213 this would be an example of enzyme promiscuity.214,215 Conversely, if the aminolysis reaction did not exhibit Michaelis-Menten kinetics, this would suggest that only the adenylation reaction was enzyme-catalysed and that the amine nucleophile was not being bound by the enzyme in a favourable orientation and distance to facilitate catalysis.216 Indeed, the high concentrations of amine required suggest that the latter may be the case.

Nonetheless, a method for analysing the kinetics of the reaction was devised in which the aforementioned Pi concentration assay would be employed to follow the production of Pi in real time from an amide synthesis assay. There were immediate challenges to this approach, however. Firstly, the maximum pH at which the Pi concentration assay can operate is pH 8.5, compared to the optimal pH of 9.0 for the amide production assay. Additionally, it was unknown how the Pi assay would perform in the presence of the components of the amide-forming assay, including the amine 52 and the sodium carbonate buffer. Unfortunately, when the reaction was performed and the absorbance at 360 nm analysed, the absorbance increased equally in both control and reaction samples. It is likely that for kinetics studies to be performed, a wide range of conditions would have to be investigated to find those which are compatible with both the amide-forming reaction and the Pi concentration assay.
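For reference, the two kinetic behaviours that such an experiment would aim to distinguish can be written as rate laws. These are textbook forms given as a sketch, not fits to data from this work; the symbols V_max, K_M and k_2 are introduced here for illustration only.

```latex
% Saturable aminolysis: the amine is bound as a substrate (Michaelis-Menten).
v = \frac{V_{\max}\,[\mathrm{amine}]}{K_{\mathrm{M}} + [\mathrm{amine}]}

% Non-saturable interception of the enzyme-bound acyl adenylate
% (written E.acyl-AMP) by free amine: the rate stays first order in amine.
v = k_{2}\,[\mathrm{E{\cdot}acyl\text{-}AMP}]\,[\mathrm{amine}]
```

In the first case the initial rate approaches V_max as the amine concentration rises well above K_M, whereas in the second it continues to increase in proportion to the amine concentration.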

Figure 4.10: EnzChek kit Pi concentration calibration curve generation and analysis of ATP coupling to amide formation. a. The EnzChek kit was used with known concentrations of Pi to generate a calibration curve of absorbance at 360 nm against Pi concentration in µM (fitted line: y = 0.0061x). b. Pi produced (µM) by the CARmm 58 production reaction in the presence and absence of amine demonstrates coupling between amide formation and ATP consumption. Amide forming reaction and control reaction: 1 mM 57, 100 mM or 0 mM 52, 100 µg mL–1 purified CARmm, 10 mM MgCl2, 5 mM ATP, 100 mM sodium carbonate buffer, pH 9.0, 30°C, 24 h. Following 24 h, three separate samples of the amide forming reaction mixtures were taken for the EnzChek reaction mix. Error bars show standard deviation of three separate EnzChek reactions.


4.8. Structural modelling of amines into CAR active site

Having shown that Pi production, and therefore likely ATP consumption, was coupled to amide formation, and that aminolysis was therefore likely to be occurring with the acyl adenylate still bound to the enzyme, modelling studies were conducted with published structures of the CARni (PDBID: 5MSD) and CARsr (PDBID: 5MST) adenylation domains. The purpose of this was to visualise the potential interactions of amine nucleophiles within the enzyme active site, both with the enzyme and the acyl adenylate.

The structures published by Gahloth et al. of the CARni and CARsr A domains,192 which contain co-crystallised benzoic acid and AMP, and a modelled fumaric acid and AMP respectively, were studied for potential docking sites for amine nucleophiles in the active site. Using the AutoDock 4.0 software, piperidine 52, which was seen to be the most effective overall amine nucleophile in this work, was employed as a putative ligand, and the software predicted the docked conformations offering the lowest free energy217,218 within a 20 Å x 20 Å x 20 Å grid centred on the acyl adenylate carbonyl carbon, which would be attacked by the amine nucleophile. Interestingly, when the structure for CARni was modelled with 52, two separate predicted docked conformations were presented. The first places piperidine 52 in close proximity to Ser 280 within the substrate binding site, but places the piperidine nitrogen 10 Å from the carbonyl carbon of the acid substrate and in the wrong orientation for nucleophilic attack (Figure 4.11 a). The second, higher energy conformation places the amine in close proximity to the backbone carbonyl of Leu 617 of the active site (Figure 4.11 b). While this is oriented in the direction of the acid carbonyl, the amine is also 10 Å from the acid carbonyl and is obstructed by the ribosyl group of AMP. The structure of the CARsr A domain also offered two predicted docked conformations of 52 within the active site. The first, lower energy conformation places the amine 13 Å from the substrate carbonyl carbon and in close proximity to the Pro 410 backbone carbonyl, but obstructed by Ala 409 (Figure 4.12 a). The second conformation places the amine at its closest predicted distance from the substrate carbonyl, at 7 Å, and, interestingly, in close proximity to both the phosphate of AMP and Lys 629, which participates in stabilisation of the tetrahedral intermediate in acyl adenylate formation (Figure 4.12 b). A common feature amongst these models is that there does not appear to be a binding site which would facilitate enzyme-catalysed aminolysis of the acyl adenylate intermediate of the CAR reaction by binding the amine nucleophile in close proximity and in the correct orientation. However, the models do suggest that there are areas within the active site, distant from the carbonyl electrophile, which would allow the docking of amine nucleophiles which could then participate in the interception of the enzyme-bound acyl adenylate in a non-enzyme-catalysed reaction.

There are limitations with this model system, however. Firstly, we have only analysed the adenylation conformation of the A domains. Alternative and more favourable binding sites for the amine nucleophile may be available in the thiolation state observed in the A-PCP didomain structure of CARsr. However, with no structural information for a co-crystallised acid in that state, amine binding modelling was not conducted. Another possibility raised by this modelling is that, with a potential amine docking site being adjacent to the nucleotide phosphate and to Lys 629, which stabilises the tetrahedral intermediate in CARsr, amine binding could interfere with the native adenylation reaction. This, along with a lack of suitable amine binding sites for intermediate interception, could account for the low conversions observed. Finally, it should be noted that the optimal conditions for amide formation are alkaline, while the structures were obtained in neutral conditions. The structures presented in that work may therefore not be a true representation of what would exist in the relevant high pH environment.
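The geometric quantities quoted for the docked poses (distance from the amine nitrogen to the carbonyl carbon, and whether a pose lies within the cubic search grid) can be checked with a few lines of code. The sketch below is illustrative only: the coordinates are hypothetical placeholders and not taken from 5MSD or 5MST, and the helper names are assumptions rather than part of AutoDock.

```python
import math

BOX_EDGE_A = 20.0  # edge of the cubic docking grid, in Å


def distance(a, b):
    """Euclidean distance between two (x, y, z) points, in Å."""
    return math.dist(a, b)


def inside_box(point, centre, edge=BOX_EDGE_A):
    """True if `point` lies within a cube of side `edge` centred on `centre`."""
    half = edge / 2.0
    return all(abs(p - c) <= half for p, c in zip(point, centre))


def centre_to_corner(edge=BOX_EDGE_A):
    """Centre-to-corner distance: the largest separation from the centre
    that the cubic box allows."""
    return math.sqrt(3.0) * edge / 2.0


carbonyl_c = (0.0, 0.0, 0.0)  # hypothetical coordinates, used as the box centre
amine_n = (5.5, -6.0, 5.0)    # hypothetical docked piperidine nitrogen

print(f"N to carbonyl C: {distance(amine_n, carbonyl_c):.1f} Å")
print("inside grid:", inside_box(amine_n, carbonyl_c))
print(f"centre-to-corner limit: {centre_to_corner():.1f} Å")
```

A 20 Å cube centred on the carbonyl carbon admits separations of up to about 17 Å towards its corners but only 10 Å along any single axis, so the predicted distances of 7 Å to 13 Å are geometrically possible within the grid, although distance alone does not establish that a given pose lies inside it.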


Figure 4.11: Screenshots of AutoDock 4.0 modelling of piperidine 52 into the active site of CARni structure PDBID: 5MSD, showing the co-crystallised benzoic acid substrate and AMP, the modelled piperidine, Ser 280 and Leu 617. a. One configuration places the amine of 52 9.5 Å from the carbonyl carbon of the acid substrate and the amine hydrogen 2 Å from Ser 280. b. The second configuration places the amine 10 Å from the carbonyl carbon of the acid substrate, obstructed by the ribosyl group of AMP, with the amine hydrogen 2 Å from the backbone carbonyl of Leu 617.


[Figure 4.12 image, panels a and b. Labels: Ala 409; Pro 410; Lys 629; fumaric acid (modelled into the structure of the CARsr A domain active site); piperidine (amine modelled in this work into the CARsr A domain active site); AMP (co-crystallised into structure). Distance annotations in Å not recovered.]

Figure 4.12: Screenshots of Autodock 4.0 modelling of piperidine 52 into the active site of CARsr structure PDBID: 5MST. a. One configuration places the amine of 52 12.8 Å from the carbonyl carbon of the acid substrate, but obstructed by Ala 409 and the amine hydrogen 2 Å from the backbone carbonyl of Pro 410. b. The second configuration places the amine 6.9 Å from the amine and obstructed by the phosphate of AMP and 2.9 Å from the Lys 629 amine, and the amine hydrogen 1.9 Å from the phosphate oxygen of AMP.


4.9. Whole-cell CAR-dependent amide formation

With the ability of CAR to produce amides clearly demonstrated, even when mutated to remove the PPant-binding serine and with its R domain removed, it was decided to test this system in a whole-cell assay. If successful, this would be highly relevant for the industrial production of amides: as well as giving access to a wide range of amides through CAR-catalysed amide formation, it would avoid the costly and time-consuming process of purifying enzymes. Moreover, there would be no need to add the ATP cofactor to activate the carboxylic acid substrate, providing a further saving for industrial use. As there is precedent for using CARs in whole-cell assays for the natural reaction,191 it was possible that the amidation reaction could also be conducted in this way. However, as full-length CAR would likely produce aldehyde rather than the desired amide, even in the absence of Sfp, it was decided to use cells producing the A domain-PCP domain truncated CARmm 729-1175 as well as cells producing full-length CARmm. BL21 (DE3) cells were transformed with the full-length CARmm or truncated CARmm 729-1175 genes, or with empty pET28b vector as a control. Following cell growth in AI media, cells were centrifuged and the pellets used for the attempted production of 58. The 1 mL reactions contained 100 mM sodium carbonate buffer at pH 9.0, 50 mM glucose, 10 mM MgCl2, 5 mM of substrate 57 and 100 mM of amine 52. After 24 h the samples were analysed by HPLC. Despite the respective forms of CAR being present in the cells (Figure 4.13), no product 58 was detected in either sample.

The failure of the whole-cell assay could be due to a number of reasons. Firstly, it is possible that the substrate and/or the amine could not enter the cells at a sufficient concentration to allow amide synthesis. Alternatively, the combination of high pH and a high concentration of amine 52 could have killed or stressed the cells, preventing them from supplying the ATP cofactor for the reaction.


Due to the importance of whole-cell assays as a more economical and time-efficient means of producing valuable products than purified enzyme catalysts,190,219 their use with CAR-dependent amide formation should be a main focus for future work. Various conditions should be trialled to improve both the entry of reactants into cells and the survival of the cells. For example, while E. coli is traditionally used for whole-cell reactions, it is typically used at neutral pH and undergoes severe stress in highly alkaline conditions.220 By comparison, should an alkaliphilic organism such as Bacillus firmus be obtained and transformed with CAR genes, its ability to survive and grow in high pH environments221 could provide a viable means of conducting whole-cell CAR-dependent amidation reactions under the more favourable alkaline conditions.

[Figure 4.13 gel image. Lanes: MW, 1, 2, 3. Ladder: 180, 130, 100, 70, 55, 40, 35, 25, 15 and 10 kDa. Band labels: CARmm (129 kDa); CARmm 729-1175 (81 kDa).]

Figure 4.13: SDS-PAGE of pET28bCARmm- or CARmm 729-1175-transformed BL21 (DE3) cell lysates from whole-cell studies. Bands of approximately the size of full-length CARmm (129 kDa) and CARmm 729-1175 (81 kDa) can be observed in the CARmm- and CARmm 729-1175-transformed cell lysate fractions respectively. MW: molecular weight ladder, L: lysate, FT: column flow through, W1: wash fraction 1, E: elution fraction.

4.10. Conclusion and future work

Here, with the use of site-directed mutagenesis, we have shown that CAR-dependent amide synthesis can occur via the direct attack of amines upon acyl adenylates. While this work cannot rule out the possibility of thioester involvement, our studies suggest that the acyl adenylate is the primary target of aminolysis. This is supported by the literature, where NRPS adenylation domains have also been exploited for aminolysis of their acyl adenylate intermediates.173,174,176 However, these previous examples have been limited in the breadth of either their carboxylic acid substrates or their amine constituents. We have shown that CAR’s wide natural substrate breadth towards carboxylic acids can be exploited for carboxylic acid activation and that a variety of amines can be introduced to produce primary, secondary and tertiary amides. Moreover, through inorganic phosphate quantification studies, we have shown that amide formation and ATP consumption are coupled, suggesting that the acyl adenylate intermediate is bound to the enzyme at the moment of aminolysis and therefore blocks the entry and hydrolysis of the next ATP molecule until aminolysis is complete. This could allow for future engineering of CARs, altering the active site to allow more efficient aminolysis and reducing the large excess of amine required to perform the reaction. Due to the difficulties encountered in this work in developing a method to analyse the kinetics of CAR-dependent amide formation, future work should establish such analyses, to determine to what extent the aminolysis reaction is enzyme-catalysed, if at all. It is possible that kinetics data could be gathered by analysing the rate of reaction at a pH which is less favourable for CAR-dependent amide formation but better suited to the EnzChek Pi quantification kit. Additionally, due to time limitations, a wider range of reaction conditions for whole-cell application of CAR-dependent amide formation was not investigated in this work. However, future work to develop methods for successful whole-cell CAR-dependent amide synthesis could open up the possibility of an environmentally friendly and economical biocatalytic means of broad specificity amide formation, which cannot yet be achieved by conventional chemical or biocatalytic methods.


Chapter 5. Use of radical substrates for studies of CAR dynamics

5.1. Introduction

Parallel work in the group conducted by Dr. Michael Hollas222 prompted the investigation of spin-labelled substrates for CARs. Introduction of spin-labelled compounds would allow dynamics studies of the natural and amidation mechanisms of CAR through the use of EPR. Site-directed spin-labelled enzymes have been used before to analyse enzyme movement and dynamics.223–225 Radical-containing cofactors involved in the native reactions of enzymes have also been used to analyse enzyme dynamics.226 As CAR is a multidomain enzyme, it would be interesting to see how these domains interact to conduct the separate adenylation and reduction reactions. While insights into the dynamics of CAR have recently been gained from structural data obtained by X-ray crystallography,192 questions still remain. Firstly, as this structural data was obtained via crystallography, it would be valuable to confirm the findings with EPR dynamics studies.

Additionally, specific questions such as how labile the carrier protein bound substrate is and whether multiple domains are simultaneously occupied by substrates and intermediates, remain to be answered or confirmed. It was postulated that if a radical carboxylic acid such as TEMPO carboxylic acid 66 could act as a substrate for CAR, it would have to pass through all three constituent domains, allowing a unique opportunity to follow the passage of a substrate from A domain binding, adenylation, thioester formation and PCP binding, transfer to R domain and finally reduction to the aldehyde 67 (Figure 5.1 a). While EPR studies could not be performed as part of this work, radical substrates were trialled and the theory developed to allow for future work by experts in biochemical EPR.

There are a number of ways a radical substrate could be used in conjunction with CAR activity to analyse domain dynamics, in particular the following two. The first would investigate the lability of the substrate-bound carrier protein. By omitting NADPH, as in the method for amide production, the substrate bound to the carrier protein PPant group would become trapped. By analysing the mobility of the carrier protein-bound substrate, it could be possible to determine whether it remains mainly bound within the A or R domain (with limited mobility) or is predominantly free to move between the two domains (with high mobility) (Figure 5.1 b).223 The second method would combine a radical substrate with radical labelling of specific residues in the CAR structure, i.e. one in the A or R domain adjacent to key active site residues. This would also potentially allow analysis of the movement of the carrier protein by calculation of the distance between the carrier protein-bound substrate and the two radical-tagged domains (Figure 5.1 c). Moreover, it could provide information on the positioning of the substrate within the active site relative to radically labelled residues, and potentially confirm the location(s) of aminolysis, i.e. in the adenylation active site or even while bound to the PPant group.


[Figure 5.1 schematic, panels a, b and c. Domain labels: A, P, R.]

Figure 5.1: The potential activation and reduction of radical carboxylic acids by CAR for analysis of enzymatic dynamics. a. Analysis of radicals as they go through the three domains of CAR during reduction. b. Trapping the radical as a thioester by omitting NADPH to analyse carrier protein dynamics. c. In combination with the trapped carrier protein radical, spin labelled residues on the enzyme could be used to analyse distance measurements between the substrate and domain residues.

5.2. Kinetic studies with radical substrates


The commercially available radical carboxylic acid, TEMPO carboxylic acid 66, was acquired and tested in the native CAR assay at neutral pH using CARmm and a TECAN plate reader, and then compared to the model substrate benzoic acid 25. This demonstrated that 66 could indeed be reduced by CARmm (Figure 5.2), with NADPH being expended, providing a KM of 3.1 ± 0.5 mM and a Vmax of 0.31 ± 0.02 µmol min⁻¹ mg⁻¹, giving a kcat of 39.99 ± 2.58 min⁻¹ and a kcat/KM of 13.03 ± 3.22 min⁻¹ mM⁻¹. While 66 was a substrate for CAR reduction, it was reduced much less efficiently than the model substrate 25 (Figure 5.3), which provided a KM of 0.15 ± 0.01 mM and a Vmax of 1.17 ± 0.02 µmol min⁻¹ mg⁻¹, giving a kcat of 150.9 ± 2.6 min⁻¹ and a kcat/KM of 1.01 ± 0.09 min⁻¹ mM⁻¹.
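The kcat values quoted above follow from Vmax and the molecular weight of the enzyme. The following is a minimal sketch of that arithmetic, assuming the 129 kDa mass given for full-length CARmm in Figure 4.13 and using the rounded parameters from the text. The function names are illustrative and are not part of the original analysis.

```python
# Sketch: derive kcat and kcat/KM from reported Michaelis-Menten parameters.
# Assumption: full-length CARmm has a molecular weight of 129 kDa (Figure 4.13).

CARMM_MW_KDA = 129.0  # kDa, numerically equal to mg of enzyme per µmol of enzyme


def kcat_per_min(vmax_umol_min_mg, mw_kda=CARMM_MW_KDA):
    """Turnover number (min^-1) = Vmax (µmol/min/mg) x MW (mg/µmol)."""
    return vmax_umol_min_mg * mw_kda


def catalytic_efficiency(kcat, km_mm):
    """kcat/KM in min^-1 mM^-1."""
    return kcat / km_mm


def michaelis_menten(s_mm, vmax, km_mm):
    """Initial rate at substrate concentration s_mm (same units as vmax)."""
    return vmax * s_mm / (km_mm + s_mm)


# Rounded (Vmax, KM) values as reported in the text
reported = {
    "TEMPO carboxylic acid 66": (0.31, 3.1),
    "benzoic acid 25": (1.17, 0.15),
}

for name, (vmax, km) in reported.items():
    kcat = kcat_per_min(vmax)
    eff = catalytic_efficiency(kcat, km)
    print(f"{name}: kcat = {kcat:.2f} min^-1, kcat/KM = {eff:.1f} min^-1 mM^-1")
```

With these rounded inputs the benzoic acid efficiency comes out at roughly 1006 min⁻¹ mM⁻¹, which is about 1.01 min⁻¹ µM⁻¹. The reported value of 1.01 may therefore be expressed per µM rather than per mM, although the text does not state this.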

[Figure 5.2 plot. Title: TEMPO carboxylic acid 66. y-axis: µmol/min/mg (0.0 to 0.3); x-axis: Substrate (mM) (0 to 15).]

Figure 5.2: Kinetics analysis of reduction activity of TEMPO carboxylic acid 66 by CARmm. Error bars show the standard error of the mean of reactions conducted in triplicate.


[Figure 5.3 plot. Title: Benzoic acid 25. y-axis: µmol/min/mg (0.0 to 1.5); x-axis: Substrate (mM) (0 to 15).]

Figure 5.3: Kinetics analysis of reduction activity of benzoic acid 25 by CARmm. Error bars show the standard error of the mean of reactions conducted in triplicate.

5.3. Discussion and future work

By demonstrating that TEMPO carboxylic acid can act as a substrate for native CAR activity, we have made the first steps towards allowing analysis of CAR dynamics by EPR. We have also outlined herein how such a method could potentially be applied, by either using the radical substrate alone to analyse carrier protein dynamics, or in conjunction with chemically labelled residues within the protein, to allow substrate-domain distance calculations by those with extensive biochemical EPR knowledge. It is hoped that such methods will aid future work in investigating the mechanism of CAR multidomain dynamics in more detail, as well as the natural and amidating mechanisms of CAR.


Chapter 6. Discussion of results and perspectives

The overall goal of this work was to develop a broad specificity amide synthetase which could catalyse the synthesis of a wide range of amides in aqueous conditions. At the commencement of the project there were no such enzymes available: aqueous amide synthetases, such as nonribosomal peptide synthetases, are typically highly substrate specific with regard to the acid component, the amine component or both, while hydrolase enzymes, which are typically exploited for broad specificity amide formation, must generally be used in organic solvents. The initial, and probably most ambitious, plan was the fusion of a carboxylic acid reductase adenylation domain with the carrier protein of VibB, to permit amide formation through subsequent interaction with the amide-forming condensation domain, VibH. Unfortunately, expression and overproduction of the soluble chimera and Vib proteins were unsuccessful.

An alternative strategy proved to be more successful: it was found that many different carboxylic acids could be activated by CAR and subsequently directly intercepted by ammonia, primary and secondary amines. Although the majority of conversions were low, even after optimisation, this is the first time that such a broad range of primary, secondary and tertiary amides has been shown to be produced by individual aqueous enzymes using non-activated carboxylic acids. Further investigations into the mechanism of this amide formation, using mutagenesis and ATP consumption assays, demonstrated that acyl adenylate formation was sufficient for amide bond formation and that thioester formation was not required. Coupling of ATP hydrolysis and amide formation was found to be very efficient. This would suggest that the acyl adenylate is retained within the enzyme until nucleophilic attack by the amine produces the amide, which would then allow the next substrate and ATP molecules to enter the active site for adenylate formation and substrate activation.

There is indeed precedent for adenylating and phosphorylating enzymes, in particular NRPS A domains, acting as activators of carboxylic acids and facilitating amidation of the activated intermediate.165,169,174,176,227 However, these previous examples were generally limited to the cognate carboxylic acid substrate or similar molecules, and/or the breadth of amines which could be used was narrow.174,176 For example, in work where cysteine was used as the nucleophile, amidation proceeded first through thioester formation by the cysteine thiol, followed by intramolecular amide formation, with direct attack of the activated intermediate by amines not being possible.173 By comparison, in this work the amine is responsible for the direct nucleophilic attack on the activated intermediate.

The specific activity observed in the production of the pharmaceutically relevant amide ilepcimide 58 was low, at 12.22 ± 0.2 mU mg⁻¹, but was comparable to the Vmax of 15.6 ± 0.8 mU mg⁻¹ observed in previous work on adenylation domain-dependent amide formation between dihydroxybenzoic acid and cysteine by DhbE.174

By demonstrating that carboxylic acid reductases can be used for this amidation reaction, it is hoped that this will provide the foundation for future work into other CARs and adenylating enzymes, with wide and complementary substrate specificities to expand the range of amides which can be produced through aqueous biocatalytic methods.

Nevertheless, it is important to acknowledge the limitations of this work and the opportunities for future research that they provide. The clear roadblock in using this system as an industrial method for amide formation is the generally low conversions achieved. While conversions of up to 96% could be achieved with the target pharmaceutical molecule ilepcimide 58 following optimisation, the conversions to the majority of amides remained below 20% and the starting concentrations of acid substrates were kept at 1 mM. Therefore, future methods to improve the conversions and yields of CAR-mediated amidation, such as immobilisation and directed evolution,87 should be a research priority.

Another limitation in using this method for industrial purposes is its reliance on the commercially expensive cofactor ATP. A clear means of circumventing this need is the use of whole cells or ATP recycling methods.190,228 By using variants of CAR which lack reduction activity, such as truncations lacking the R domain, it could be possible to use these constructs in whole cells, benefiting from biologically produced ATP without the risk of the substrates being reduced. Although whole-cell production was tried in this work, it proved unsuccessful and may therefore require intensive research to find conditions that provide an economical method for the production of a wide range of amides. ATP recycling has yet to be used in conjunction with CARs, as most ATP recycling systems convert ADP and Pi to ATP, whereas the CAR reaction produces AMP and PPi. However, polyphosphate kinases could in future be used for this purpose in the CAR amide-forming system, greatly reducing the cost of the method.228

In conclusion, we have demonstrated a novel method for the production of amides, using CARs as biocatalysts.229 It is hoped that this will aid future efforts to produce a wide range of amides in an economical and environmentally friendly manner.


Chapter 7. Experimental procedures

7.1. General methods and materials

Where possible, all chemicals were purchased from commercial suppliers, namely Fisher Scientific, Alfa Aesar, Sigma Aldrich and Fluorochem, unless stated otherwise. Endonucleases (including DpnI), Phusion HF polymerase, T4 DNA ligase, T4 polynucleotide kinase, DNaseI and DNA ladders were purchased from New England Biolabs, while QIAprep miniprep kits and QIAquick DNA gel extraction kits were purchased from Qiagen. In-Fusion DNA cloning kits were purchased from Takara. Competent NEB 5-alpha cells for plasmid DNA storage and amplification, and BL21 (DE3) competent cells for gene overexpression and protein production, were purchased from New England Biolabs, while Stellar competent cells, used to amplify In-Fusion cloned DNA, were purchased from Takara. All DNA sequences were analysed and primers designed using the SnapGene program by GSL Biotech. For agarose gel DNA visualisation, SYBR Safe DNA Gel Stain by ThermoFisher was used. When positive controls for HPLC analysis and conversion calculation were not available commercially, they were produced by chemical synthesis by Dr. Fabio Parmeggiani (57) and Dr. Michael Hollas (56, 58).229

7.2. Genes and molecular cloning

7.2.1. Genes used in this work

Gene sequences used in this work are given in appendix 1. The genes for wild type, N-terminally 6 x histidine tagged CARmm and CARni were given by Dr. Mark Dunstan. Both sequences were codon optimised for expression in E. coli and each was in a separate pET21a plasmid. The gene for truncated, N-terminally 6 x histidine tagged CARmm was also received from Dr. Mark Dunstan, and was in the pET28b plasmid. The B. subtilis Sfp PPtase gene was received from Dr. Mark Dunstan; it was within the pCDF1b plasmid and contained no 6 x histidine tag. Vibriobactin synthetase genes VibE, VibB and VibH were purchased from Eurofins genomics services, were codon optimised for expression in E. coli and were received in the transport vectors pEX-K4 for VibE and VibH, and pEX-A2 for VibB. Empty pET28b was acquired from Dr. Mark Dunstan. All plasmids were stored in Milli-Q water in a freezer at -20°C, and all concentrations of DNA were calculated using a NanoDrop 1000 following the manufacturer’s instructions.

7.2.2. Cloning genes into expression vectors and transformation into 5-alpha and BL21 (DE3) competent cells

For storage and amplification, pET21a CARmm, pET21a CARni, pET28b CARmm 729-1175 and pCDF1b Sfp were used to transform 5-alpha cells, with 1 µL of plasmid (1-100 ng/µL) being added to 50 µL of thawed 5-alpha cells on ice. The manufacturer’s heat shock protocol was followed to induce transformation. The cells were then streaked onto LB agar plates containing the relevant antibiotic, ampicillin (50 µg/mL), kanamycin (100 µg/mL) or spectinomycin (50 µg/mL), for selection overnight at 37°C. Colonies were picked and used to inoculate 5 mL of antibiotic-containing LB broth and grown overnight at 37°C. For long-term storage of transformed 5-alpha cells, 500 µL of cell culture was mixed with 500 µL of autoclaved 50% glycerol to give a 25% glycerol stock, which was then flash frozen in liquid nitrogen and stored at -80°C until required.
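The glycerol stock concentration above is a simple mixing calculation. As a minimal sketch (the helper name is illustrative), the final concentration after combining several solutions can be computed as follows, treating the cell culture as containing no glycerol:

```python
def mixed_concentration(components):
    """Final concentration after mixing (volume, concentration) pairs.

    Volumes must share one unit and concentrations another; volumes are
    assumed to be additive.
    """
    total_volume = sum(volume for volume, _ in components)
    total_solute = sum(volume * conc for volume, conc in components)
    return total_solute / total_volume


# 500 µL of culture (0% glycerol) mixed with 500 µL of 50% glycerol
glycerol_percent = mixed_concentration([(500, 0.0), (500, 50.0)])
print(glycerol_percent)  # 25.0
```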

For gene overexpression and subsequent protein production, the car genes were transformed into BL21 (DE3) cells as with 5-alpha cells, again following the manufacturer’s protocols. To permit post-translational addition of a PPant group to the CAR PCP, pCDF1b Sfp was co-transformed into the same cells as the car genes and both plasmid-selecting antibiotics were added to the LB agar. Glycerol stocks were made as above.


VibE and VibH were cloned out of their respective transport vectors and into the expression plasmid, pET28b. pET28b was cut with the endonucleases NcoI and XhoI (New England Biolabs) and was subsequently run on a 1% agarose gel. The linear DNA was extracted and purified using a DNA gel extraction kit. The genes for insertion were amplified from their transport vectors by PCR using primers designed in SnapGene with the In-Fusion cloning function. The primers contained a 16 bp overhang which overlapped with the respective NcoI and XhoI cut ends of linear pET28b and reincorporated these restriction sites into the vector. The insert genes were also run on a 1% agarose gel and extracted using a DNA gel extraction kit. The primers used for the respective inserts and the PCR conditions are shown in the appendices. The inserts and vectors were joined using the In-Fusion cloning kit following the manufacturer’s instructions. Subsequently, the fusion mixture was used to transform Stellar cells following the manufacturer’s protocol and the cells were spread onto kanamycin selection plates and incubated overnight at 37°C. Following retrospective analysis of pET28b VibH, it was realised that the NcoI cleavage site contained a start codon which preceded the start codon of VibH and was in a separate open reading frame, and which may have been responsible for poor expression in preliminary work. A point mutation was therefore made to remove this start codon by converting ATG to ATC; the primers and PCR conditions used are shown in the appendices. The transformed Stellar cells were later grown up and the amplified DNA extracted by miniprep. The purified DNA was sequenced by Eurofins MWG genomic services and the plasmids containing the correctly cloned genes were used to transform 5-alpha and BL21 (DE3) cells, and glycerol stocks were made as above.

7.2.3. Fusion of CARmm A domain and VibB PCP domain to generate chimera CAVibB

The fusion of the A domain of CARmm to the PCP of VibB first required the identification of the domain boundaries within the respective proteins. This was done by inputting the sequences into the Pfam program, which identifies domain boundaries based on homology with domains in other genes.200 Primers were designed to amplify the sequence of CARmm up to the starting edge of the PCP boundary, removing the PCP and R domains but retaining the A-PCP linker of CARmm. The CARmm gene was already within the intended expression plasmid, pET21a, so this was used as the vector, with the PCR primers designed to flank the N-terminal PCP boundary and the C-terminal stop codon of CAR and therefore amplify both the CARmm A domain and the linear pET21a plasmid. Primers were also designed to amplify the VibB gene only within the boundaries of its PCP domain. The primers for the VibB insert were designed with 15 bp overhangs which overlapped with the edges of the linear pET21a CARmm A domain vector, with the 3’ end of the CARmm A-PCP linker being in frame with the fused 5’ end of the VibB PCP insert. The primers and the PCR conditions used are shown in appendix 2. Following In-Fusion cloning and gel extraction, Stellar competent cells were transformed and DNA extracted as above. Sequencing confirmed the successful fusion of the genes within the plasmid, which was then transformed into 5-alpha cells and co-transformed into BL21 (DE3) cells with pCDF1b Sfp, with glycerol stocks being made.
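The overhang design described above can be sketched in a few lines. This is an illustration only: the actual primers were designed in SnapGene and are listed in appendix 2, and the sequences, the 18 nt annealing length and the function names below are hypothetical.

```python
# Sketch of In-Fusion style primer design with vector-overlapping 5' overhangs.
# All sequences here are hypothetical placeholders, not those used in this work.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def infusion_primers(vector_upstream, vector_downstream, insert,
                     overlap=15, anneal=18):
    """Return (forward, reverse) primers, both written 5' to 3'.

    vector_upstream:   top-strand vector sequence ending at the insertion point
    vector_downstream: top-strand vector sequence starting at the insertion point
    overlap:           length of the vector-matching overhang (15 bp in 7.2.3)
    anneal:            length of the insert-specific region (assumed value)
    """
    if min(len(vector_upstream), len(vector_downstream)) < overlap:
        raise ValueError("vector flank shorter than requested overlap")
    if len(insert) < anneal:
        raise ValueError("insert shorter than requested annealing region")
    forward = vector_upstream[-overlap:] + insert[:anneal]
    reverse = reverse_complement(insert[-anneal:] + vector_downstream[:overlap])
    return forward, reverse


def in_frame(insert):
    """True if the insert length preserves the reading frame across the fusion."""
    return len(insert) % 3 == 0


# Hypothetical example sequences
upstream = "AAAAACCCCCGGGGGTTTTT"
downstream = "TTTTTGGGGGCCCCCAAAAA"
insert = "ATG" + "GCT" * 10 + "TAA"

fwd, rev = infusion_primers(upstream, downstream, insert)
print(fwd)
print(rev)
print(in_frame(insert))
```

The reverse primer is the reverse complement of the insert end joined to the vector start, so its 5' end carries the vector overlap and its 3' end anneals to the insert.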

7.2.4. Mutagenesis of CARni PPant binding site to generate CARni S689A mutant

To remove the PPant-binding serine of CARni by mutating the serine AGC codon within CARni to an alanine TGC codon, two back-to-back primers were designed, only one of which contained the mutation. The primers and PCR conditions used are shown in appendix 2. DpnI was added to the PCR product to degrade the parental, non-mutated plasmid following the manufacturer’s protocol. The PCR product was then run on a 1% agarose gel, followed by extraction and purification with a DNA gel extraction kit. T4 polynucleotide kinase was used to phosphorylate the 5’ end of the linear PCR product prior to the addition of T4 DNA ligase to ligate the ends of the linear DNA, following the manufacturers’ protocols. The ligated circular PCR product was used to transform 5-alpha cells and, once the mutation had been confirmed by sequencing, also BL21 (DE3) cells.
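A codon substitution of this kind can be checked against the standard genetic code before ordering primers. The sketch below builds the standard codon table and labels a mutation; the function names are illustrative. Under the standard code AGC encodes serine, while TGC encodes cysteine and alanine is encoded by GCT, GCC, GCA or GCG, so the mutant codon as printed above may be a transcription error. The primer sequences actually used are those in appendix 2 and are not reproduced here.

```python
# Sketch: check a codon substitution against the standard genetic code.

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def codons_for(amino_acid):
    """All codons encoding a one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def mutation_label(wild_type_codon, mutant_codon, position):
    """Label such as 'S689A' for a codon change at a residue position."""
    return f"{CODON_TABLE[wild_type_codon]}{position}{CODON_TABLE[mutant_codon]}"


print(CODON_TABLE["AGC"])                 # S
print(CODON_TABLE["TGC"])                 # C
print(codons_for("A"))                    # ['GCA', 'GCC', 'GCG', 'GCT']
print(mutation_label("AGC", "GCC", 689))  # S689A
```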

7.3. Protein production and purification by nickel affinity chromatography

7.3.1. Transformed BL21 (DE3) grow up and induction for protein overproduction

To express CARmm or CARni alone (for example, for control assays to show that Sfp is not required for amide production) or to co-express CARmm/Sfp and CARni/Sfp, transformed cells from glycerol stocks were grown in antibiotic-containing 5 mL starter cultures overnight at 37°C, 250 rpm. In early work these were then used to inoculate 400 mL of LB broth containing the working concentrations of antibiotics mentioned above in 2 L flasks. The cultures were grown at 37°C, 250 rpm until an OD600 of 0.6-0.8 was reached, at which time 0.4 mM of syringe filter (Millipore 0.45 µm) sterilised isopropyl β-D-1-thiogalactopyranoside (IPTG) was added to induce expression of the recombinant genes. The cells were then left at 20°C for 24 h, spun down at 4000 rpm for 25 min and the cell pellets stored at -20°C. For the majority of the project, however, auto induction (AI) media (Formedium) was used for its convenience: unlike traditional IPTG induction, there was no need to repeatedly check the OD600, with the attendant risk of exceeding an absorbance of 0.8, after which IPTG induction is less effective. 5 mL starter cultures were used to inoculate 600 mL of antibiotic-containing media in a 2 L bevelled flask, which was left in a 37°C incubator at 250 rpm for 3 h before transfer to a 20°C incubator for 48 h, followed by centrifugation and pellet storage. Co-expressed CARni S689A/Sfp and CARmm 729-1175/Sfp were only grown using the AI media protocol. Initial attempts to express the VibE, VibH and fused CAVibB (co-expressed with Sfp) genes in BL21 (DE3) were conducted using the AI method. However, later expression trials were conducted at different temperatures, in LB media with differing concentrations of IPTG, as described below.

7.3.2. Cell lysis and protein purification

Prior to cell lysis for protein extraction, cells were thawed on ice and then typically resuspended in 5 mL/g cell paste of wash buffer A, consisting of 10 mM imidazole and 100 mM Tris, adjusted to pH 7.5 with conc. HCl at room temperature. 2 mM MgCl2, 0.2 mg/mL lysozyme from hen egg white (Sigma), 2.5 U/mL DNaseI (New England Biolabs) and 1 mM phenylmethylsulfonyl fluoride (PMSF) (Sigma) were added immediately prior to lysis. The lysis mixture was first incubated at 37°C, 250 rpm for 1 h and subsequently sonicated for 20 cycles of 20 seconds on/20 seconds off. The lysed mixture was then centrifuged at 18000 rpm for 25 min. The supernatant was filtered with a syringe filter tip and applied via syringe to a 5 mL HisTrap FF Crude nickel affinity column (GE Healthcare) which had been pre-equilibrated with 10 column volumes (CV) of 99% wash buffer A and 1% elution buffer B, the latter consisting of 100 mM Tris-HCl, pH 7.5, and 1 M imidazole. The column flow-through was collected for later analysis by SDS-PAGE. The sample-loaded nickel affinity columns were attached to an AKTA purification system (GE Healthcare) and washed with 20 CVs of 99% buffer A, 1% buffer B, collecting 10 mL wash fractions in Falcon tubes. This was followed by 10 CVs of elution with a gradient of buffer B increasing from 1% to 100%, with 2 mL elution fractions collected in Falcon tubes. Typical AKTA UV traces at 280 nm from protein purification are shown in appendix 3.

Filtered lysate, flow-through, the first wash fraction and the elution fractions were analysed by SDS-PAGE. Pre-cast gels from either NuSep or Bio-Rad were used, depending on availability, with a premixed Bio-Rad 10 x buffer (25 mM Tris, 192 mM glycine, 0.1% SDS, pH 8.3) diluted 10-fold, and electrophoresis was carried out using the manufacturer’s protocols. Gels were visualised by the addition of InstantBlue protein stain (Expedeon) with rocking for 1 h, followed by rinsing with distilled water for 1 h and imaging on a Bio-Rad Gel Doc EZ system on a white light setting. Elution fractions showing the highest intensity bands of the correct size for the enzyme, when compared to either a PageRuler or PageRuler Plus prestained protein ladder (ThermoFisher), were pooled and desalted with a 30000 Da molecular weight cut off (MWCO) Vivaspin column, which was spun for multiple 20 minute rounds in a centrifuge at 4000 rpm, with the salt-containing solution regularly replaced by enzyme storage buffer of 100 mM Tris-HCl pH 7.5.

The protein concentration of pooled fractions was principally measured on a NanoDrop 1000 spectrophotometer at 280 nm against a storage buffer blank, due to improved reproducibility compared to the Bradford assay using BSA as a standard, which had been used briefly to quantify protein concentration in preliminary pH trials for direct amide formation from cinnamic acid using CARmm. For most of the work, including substrate screening for CAR-dependent amide formation and its optimisation, the default protein quantification setting on the NanoDrop 1000 was used, with 1 Abs = 1 mg/mL. However, in later assays, including those investigating amine stereoselectivity, ATP coupling to amide formation and the kinetics of radical substrates in the native CAR assay, concentration was calculated on the NanoDrop 1000 using the other protein (E & MW) function, whereby the molecular weight of the protein (kDa) and the CAR enzyme’s predicted extinction coefficient (M⁻¹ cm⁻¹), derived from the ProtParam software,230 were used to give a more accurate estimate of enzyme concentration.
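The difference between the two quantification settings can be illustrated with the Beer-Lambert law, A = ε·c·l. The following is a minimal sketch, assuming an absorbance normalised to a 1 cm path length. The extinction coefficient used in the example is a hypothetical placeholder, not the ProtParam value for any CAR, and the function names are illustrative.

```python
# Sketch: protein concentration from A280 using the Beer-Lambert law.

def conc_from_extinction(a280, ext_coeff_per_m_cm, mw_da, path_cm=1.0):
    """Concentration in mg/mL from A280, molar extinction coefficient and MW.

    A = epsilon * c * l, so c (mol/L) = A / (epsilon * l).
    Multiplying by MW (g/mol) gives g/L, which equals mg/mL.
    """
    molar = a280 / (ext_coeff_per_m_cm * path_cm)
    return molar * mw_da


def conc_default(a280):
    """Default setting described in the text: 1 Abs = 1 mg/mL."""
    return a280 * 1.0


# Hypothetical example: 129 kDa protein with an assumed extinction
# coefficient of 150000 M^-1 cm^-1 (placeholder value only)
a280 = 1.0
print(conc_default(a280))
print(round(conc_from_extinction(a280, 150000, 129000), 2))
```

For a protein whose mass-normalised extinction differs from 1 Abs per mg/mL, the two settings give different concentrations, which in turn changes any specific activity calculated per mg of enzyme.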

7.3.3. Expression trials of VibE, VibH and CAVibB in BL21 (DE3) cells

Following the failure to express the VibE, VibH or CAVibB genes using the previously mentioned AI media method for CAR production (20°C for 48 h), expression trials were conducted to investigate potentially improved conditions for expression. Firstly, two different concentrations of IPTG and two incubation times were trialled for the production of the CAR A domain-VibB PCP domain fusion protein. 4 x 5 mL starter cultures were added to 4 separate 250 mL non-bevelled flasks containing 100 mL of antibiotic-containing LB broth. The cultures were induced with either 0.25 mM IPTG or 0.5 mM IPTG and, following grow up at 37°C until induction, were left at 25°C for either 5 h or 24 h, after which cells were pelleted as above. Pellets were lysed with 5 mL/g cell pellet of BugBuster (Millipore), with initial centrifugation to pellet the insoluble fraction and the supernatant being taken as the soluble fraction. Guided by the manufacturer’s instructions, the insoluble pellet was resuspended in lysozyme and BugBuster, followed by centrifugation to isolate a pellet containing inclusion bodies, which were resuspended in BugBuster. Both the soluble and insoluble protein fractions were analysed by SDS-PAGE as above. Further expression trials were conducted with VibE-transformed, VibH-transformed or CAVibB/Sfp co-transformed cells with 0.5 mM IPTG induction at 20°C, 30°C or 37°C, to investigate whether different temperatures would be optimal for expression. Again, initial grow up was at 37°C, followed by IPTG induction of 100 mL of inoculated LB broth containing appropriate antibiotics in a 250 mL flask at 250 rpm and the appropriate temperature. Additionally, as a negative control, BL21 (DE3) cells transformed with empty pET28b were grown at 30°C. The cells were pelleted and lysed for SDS-PAGE analysis as above.

7.3.4. Demonstrating CAR enzyme native activity through NADPH UV absorbance change

Following the purification of CAR enzymes, an NADPH depletion assay was conducted to confirm native reduction activity prior to their use in biotransformations. The 1 mL reaction mix consisted of 100 µL of 10 x reaction buffer (1 M Tris-HCl, pH 7.5 at room temperature), 100 µL of 100 mM MgCl2, 10 µL of 100 mM benzoic acid in methanol, 10 µL of 500 mM ATP, 25 µL of 10 mM NADPH and 100 µg of CAR enzyme, made up to 1 mL with distilled water. All components except the enzyme were first mixed in a 2.5 mL quartz cuvette and the absorbance at 340 nm was recorded on a CARY 50 Bio UV-Visible spectrophotometer to observe background NADPH depletion. The enzyme solution was added last, the solution was mixed, and any change in absorbance was monitored to reveal whether the enzyme was active. Active enzymes were used for biotransformations. This native reduction activity trial was also conducted with Tris-HCl adjusted to pH 8.5, pH 9.0 and pH 10.0 at room temperature to observe whether CARmm could still reduce benzoic acid 25 at higher pH.
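The final concentrations in the 1 mL NADPH depletion assay follow from the stock concentrations and volumes listed above. A minimal sketch of that dilution arithmetic is given below; the function and dictionary names are illustrative only, and the stock values are those stated in the text.

```python
def final_conc_mM(stock_mM, volume_uL, total_uL=1000.0):
    """Final concentration after dilution: C_final = C_stock * V_added / V_total."""
    return stock_mM * volume_uL / total_uL


# (stock concentration in mM, volume added in µL), as listed for the 1 mL assay
components = {
    "Tris-HCl": (1000.0, 100.0),
    "MgCl2": (100.0, 100.0),
    "benzoic acid": (100.0, 10.0),
    "ATP": (500.0, 10.0),
    "NADPH": (10.0, 25.0),
}

for name, (stock, vol) in components.items():
    print(f"{name}: {final_conc_mM(stock, vol):g} mM")
```

This corresponds to 100 mM Tris-HCl, 10 mM MgCl2, 1 mM benzoic acid, 5 mM ATP and 0.25 mM NADPH in the cuvette.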

7.4. Biotransformations and analysis

7.4.1. HPLC methods for substrate and product isolation and analysis

To analyse the production of amides from CAR-dependent amide formation, HPLC methods were developed to separate positive controls of carboxylic acid substrates and amide products for absorbance analysis at an appropriate wavelength. All acids and amides other than 57 and 58 were separated using an Agilent HPLC system with an Agilent non-chiral, reverse-phase Pursuit 5 C-18 column (150 mm x 3 mm) and were analysed at 230 nm, with the exception of runs with cinnamic acids 28 and 39, which were analysed at 270 nm. In contrast, 57 and 58 were separated with an Agilent non-chiral, reverse-phase Zorbax C-18 Extend column (50 mm x 4.6 mm, 3.5 µm particle size) and analysed at 230 nm. The mobile phase used for the analysis of acids and amides other than 57 and 58 was composed of buffer A (water, 0.1% formic acid) and buffer B (acetonitrile, 0.1% formic acid). Injections of 1 µL were used (1.5 µL for analysis of amide products 54-56) with a flow rate of 0.5 mL/min for 15 min. The isocratic method mixes are shown below (Table 7.1). As 57 and 58 could not be separated with any of the methods used for the other acids and amides, a combined isocratic and gradient method developed previously by Dr. Nicholas Weise was used, which successfully separated the acid and amide with a mobile phase of buffer A (0.1 M NH4OH, pH 10.0) and buffer B (methanol), an injection of 5 µL and a flow rate of 1 mL/min for 25 min. The mobile phase gradient method is shown below (Table 7.2). By using equimolar concentrations of acid and amide it was possible to determine the response factors between acids and amides at their respective wavelengths. Example HPLC traces used to calculate conversions are shown in Appendix 4.
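The way a response factor from equimolar standards can be used to correct peak areas is sketched below. This is one common formulation and is an assumption rather than the exact calculation used in this work: the response factor is taken as the ratio of amide to acid peak area at equimolar concentration, and conversion is taken as the corrected amide area over the sum of acid and corrected amide areas, assuming no side products or material loss. All names and peak areas are hypothetical.

```python
def response_factor(area_amide_std, area_acid_std):
    """Relative detector response of amide vs acid from equimolar standards."""
    return area_amide_std / area_acid_std


def percent_conversion(area_acid, area_amide, rf):
    """Conversion (%) after scaling the amide peak to acid-equivalent area."""
    amide_equiv = area_amide / rf
    total = area_acid + amide_equiv
    if total == 0:
        raise ValueError("no acid or amide peak area supplied")
    return 100.0 * amide_equiv / total


# Hypothetical peak areas, for illustration only
rf = response_factor(area_amide_std=200.0, area_acid_std=100.0)
print(rf)
print(percent_conversion(area_acid=50.0, area_amide=100.0, rf=rf))
```

Without the correction, the same hypothetical trace would suggest a conversion of about 67% rather than 50%, which illustrates why equimolar standards were run for each acid and amide pair.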

Table 7.1: Isocratic HPLC methods used to separate acid substrates and amide products with percentages of buffers A and B and absorbance wavelength.

Acid Substrate   Amide product   % Buffer A   % Buffer B   Wavelength
25               38              80           20           230 nm
28               44              70           30           270 nm
39               45              70           30           270 nm
40               46              70           30           230 nm
38               47              70           30           230 nm
41               48              86           14           230 nm
25               54              80           20           230 nm
25               55              62           38           230 nm
25               56              80           20           230 nm

Buffer A: Water, 0.1% formic acid. Buffer B: Acetonitrile, 0.1% formic acid. 0.5 mL/min flow rate, 15 min runtime.


Table 7.2: Isocratic and gradient method employed for the separation of (2E)-3-(1,3-benzodioxol-5-yl)acrylic acid 57 and ilepcimide 58 by HPLC.

Time period   % Buffer A   % Buffer B   Mode
0-5 min       50           50           Isocratic
5-20 min      50 to 10     50 to 90     Gradient
20-25 min     10           90           Isocratic

Buffer A: 0.1 M NH4OH, pH 10, Buffer B: Methanol, 1 mL/min flow rate, 25 min runtime, 230 nm wavelength.
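The mobile phase composition at any point in the Table 7.2 method can be expressed as a piecewise function of time. The sketch below assumes the 5-20 min gradient is linear, which the table does not state explicitly; the names GRADIENT, percent_b and percent_a are illustrative.

```python
# (time in min, % buffer B) breakpoints taken from Table 7.2
GRADIENT = [(0.0, 50.0), (5.0, 50.0), (20.0, 90.0), (25.0, 90.0)]


def percent_b(t_min):
    """% buffer B at time t_min, interpolating linearly between breakpoints."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the 25 min method")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


def percent_a(t_min):
    """% buffer A is the remainder of the binary mobile phase."""
    return 100.0 - percent_b(t_min)


for t in (0, 5, 12.5, 20, 25):
    print(t, percent_a(t), percent_b(t))
```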

7.4.2. Initial CAR-dependent primary amide production trials

It was theorised that a high-pH environment would likely be required to promote deprotonation of the amine nucleophile ammonia if it were to intercept the thioester intermediate on CAR. Therefore a 10 x reaction buffer of 1 M Tris-HCl at pH 10 was made. The initial reaction mixture to investigate whether ammonia could intercept an activated CAR intermediate consisted of 100 mM Tris-HCl, 2 mM MgCl2, 100 mM, 50 mM or 10 mM ammonium hydroxide-ammonium chloride adjusted to pH 10, 5 mM ATP, 10 mM benzoic acid 25 from a 100 mM stock in methanol, either 0 mM or 10 mM NADPH, and 100 µg of CARmm; the solution was then made up to 1 mL with distilled water. A control containing all the above components, including 10 mM NADPH and 100 mM ammonia, but with the enzyme solution replaced with enzyme storage buffer, was also prepared. The reaction mixes, contained within 1.5 mL Eppendorf tubes held within a cushioned box, were incubated at 37°C for 22 h at 250 rpm. Following incubation, they were removed from their tubes and added to 10 mL of methanol in a 10 mL Falcon tube to quench the reaction. This was then centrifuged for 20 min at 4°C and 4000 rpm and the supernatant taken. The supernatant was then dried overnight on a Genevac system (SP Scientific) on an HPLC setting at 50°C. The dried pellet was suspended in 1 mL of 50% acetonitrile, 50% water, 0.1% formic acid. This solution was filtered with a syringe filter tip and then stored in an HPLC vial for subsequent HPLC analysis. Following success with 25, acids 28, 30 and 39-43 were trialled and prepared under these conditions with CARmm and CARni. The latter was initially donated by Dr. Sasha Derington to determine whether CARs other than CARmm could work as biocatalysts of amide formation. Subsequently pET21a CARni was transformed into BL21 (DE3) cells and produced independently for all subsequent work.

The preliminary pH screen was conducted using CARmm and cinnamic acid 28 as the substrate. pH 7.5 and pH 8.0 reactions were in a potassium phosphate buffer, while pH 9.0, pH 9.5, pH 10.0 and pH 10.5 reactions were in a sodium carbonate-bicarbonate buffer. For this assay the CARmm concentration was determined using a Bradford assay and 200 µg/mL of enzyme was used. The reaction was performed as in the initial amide-forming trial above, with 100 mM ammonia adjusted to the correct pH but with no additional NADPH.

7.4.3. Carboxylic acid and amine screen for CAR-dependent amide formation of primary, secondary and tertiary amides

The carboxylic acid substrate screen was later repeated in a sodium carbonate-sodium bicarbonate buffer adjusted to pH 9.0 at 37°C. Acids 25, 28, 30 and 39-41 were prepared as 100 mM stocks in methanol. CARni was used in addition to CARmm. Reaction solutions were prepared with final concentrations of 100 mM sodium carbonate-sodium bicarbonate buffer (pH 9.0 at 37°C), 100 mM ammonium hydroxide-ammonium chloride (pH 9.0 at 37°C), 10 mM MgCl2, 5 mM ATP, 10 mM carboxylic acid and 100 µg/mL of CARmm or CARni (concentration determined by Nanodrop at 280 nm using 1 Abs = 1 mg/mL); reactions were made up to 1 mL with distilled water. Controls in which enzyme was replaced by storage buffer were performed for each acid. Due to changes in the rotation methods of the incubators, the samples were now held uniformly at a 45° angle. Reactions were incubated for 24 h at 37°C and 250 rpm and were quenched in methanol and prepared for HPLC analysis as above. White enzyme precipitate was observed within 30 min of the reaction start.

Following the carboxylic acid substrate screen, amines other than ammonia, namely 51, 52 and 53, were tested with benzoic acid 25 for CAR-dependent amide production. As with ammonia, these amines were made up as 1 M stocks, but were pH adjusted with HCl. Reactions were conducted as in the carboxylic acid screen but with only 25 as the acid substrate and 100 mM of 51, 52 or 53. To produce 58, the same conditions as above were employed for both CARmm- and CARni-catalysed reactions with 1 mM of 57 and a 100-fold excess of 52. In addition, 57 was prepared as a 100 mM stock in DMSO in place of methanol.

7.4.4. Enzyme batch addition for improved conversion to ilepcimide 58

In order to improve the conversions to 58 by both CARs at 37°C, batch addition was initially conducted to replace the enzyme, which precipitated after a short period of time, usually within 30 min of reaction. In the CARni reaction, following the initial addition of enzyme, an additional batch of 100 µg of enzyme was added every hour for up to 5 h, with separate reactions conducted with 1-5 batches. The reaction was quenched after a total of 22 h at 250 rpm and 37°C. Following success in improving the conversion to 58 using this method, it was repeated with both CARmm and CARni with 10 batches in total, one batch being added per half hour, and with reactions now left for a total of 24 h prior to quenching.

The initial time course reaction, which showed that activity with a single batch of CARni had ceased after 2 h, was conducted with 9 separate 1 mL reactions performed as above but quenched after 5, 10, 20, 30, 40, 50, 60 and 120 min and 24 h respectively prior to analysis by HPLC.


7.4.5. CAR-dependent amide forming reaction profiling and optimisation

To avoid the use of batch addition of enzyme for improved production of 58, separate temperatures of 22°C, 30°C and 37°C were trialled, with the sodium carbonate buffer adjusted to pH 9.0 at each temperature. Due to poor reproducibility of results, it was investigated whether the angle of agitation of the tubes could be contributing to inconsistent conversions. Reproducible results and a much improved conversion were consistently achieved when the tubes were held vertically (0°) rather than at a 45° angle, so this was adopted for all subsequent reactions. Moreover, white fibrous enzyme precipitate was no longer observed in the reaction mixtures. The 22°C, 30°C and 37°C reactions for 58 production were therefore repeated, as were all previous substrate and amine screens, at these temperatures with a 0° angle of agitation for 24 h.

58 production by CARmm was designated as the model reaction, having achieved the highest conversion with a single batch of this enzyme at 30°C. Consequently an improved pH profile was conducted with this reaction, using overlapping HEPES (pH 7.5 and 8.0), Tris-HCl (pH 7.5, 8.0, 8.5 and 9.0) and sodium carbonate-sodium bicarbonate (pH 9.0, 9.5 and 10.0) buffers, pH adjusted at 30°C. As mentioned in the results section, Tris-HCl had been largely avoided after its use was found to lead to a secondary product visible by HPLC. However, the other available buffers with a similar buffering range that could bridge the lower-pH HEPES buffer and the alkaline sodium carbonate-sodium bicarbonate buffer, i.e. borate-saline and glycine-NaOH buffers, respectively could not be made up to the desired concentration of 100 mM before saturation or also produced a secondary product. With the correct buffers at the respective pH, the 1 mL CARmm reactions to produce 58 were conducted with 100 mM buffer, 100 mM 52 (pH 9.0), 10 mM MgCl2, 5 mM ATP, 1 mM 57 and 100 µg enzyme at 30°C for 24 h at 250 rpm, held at a 0° angle. To confirm that the addition of the amine, which was adjusted to pH 9.0, did not alter the pH of reaction mixtures, litmus paper was used before and after its addition to the buffered mixture. In all cases the buffers kept the mixtures at the same pH following amine addition. Reactions were quenched and analysed by HPLC as above.

To test the ATP excess and amine excess requirements of the model CARmm 58 production reaction, the above conditions were used for a reaction at 30°C, pH 9.0 in sodium carbonate-sodium bicarbonate buffer, but with differing concentrations of ATP or 52. A control with no ATP was also conducted to demonstrate that the amide-forming reaction could not proceed without the cofactor, which was indeed the case. This control was also conducted with the benzamide 38 reaction, again with no conversion observed.

Finally, an improved time course reaction for CARmm production of 58 was conducted under these optimal temperature and pH conditions with a 5-fold excess of ATP and a 100-fold excess of amine. The 1 mL reactions were conducted in triplicate, and 100 µL samples were removed at each of the analysed time intervals and quenched in 900 µL of methanol. These samples were loaded directly for analysis by HPLC. To calculate the specific activity of this reaction, the initial linear change in concentration over time was calculated from the change in conversion over time, per mg of CARmm. The same methods were employed for a time course assay using CARmm with 38 as the target product, but with the reaction conducted at both 30°C and 37°C.
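The specific activity calculation described above can be sketched as a least-squares slope through the initial, linear part of the time course, converted from % conversion per minute into µmol of product per minute per mg of enzyme. The function names and the example time points are hypothetical, and only points judged to lie in the linear phase should be passed in.

```python
def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def specific_activity(times_min, conversions_pct, substrate_mM, volume_mL, enzyme_mg):
    """Specific activity in µmol product per min per mg enzyme."""
    pct_per_min = slope(times_min, conversions_pct)
    # mM is µmol/mL, so mM * mL gives µmol of substrate initially present
    umol_per_min = pct_per_min / 100.0 * substrate_mM * volume_mL
    return umol_per_min / enzyme_mg


# Hypothetical initial-rate data for a 1 mL reaction with 1 mM acid and 100 µg enzyme
times = [0, 10, 20, 30]
conversions = [0.0, 2.0, 4.0, 6.0]
print(specific_activity(times, conversions, substrate_mM=1.0, volume_mL=1.0, enzyme_mg=0.1))
```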

7.4.6. Use of PPant mimetics to analyse role of PPant in CAR-dependent amide formation

To investigate the effect of PPant mimetics on CAR-dependent amide synthesis of 58, the mutant CARni S689A was used in the standard reaction but with N-acetylcysteamine 59 added at 0 to 20 mM. Enzyme concentration for this assay was calculated by Nanodrop at 280 nm using the molecular weight and extinction coefficient function. Quenching with methanol was followed by HPLC analysis as described previously.

7.4.7. Investigating enantioselectivity of CAR-dependent amide formation

For the chiral amine assay to trial enantioselectivity towards the R and S enantiomers of 60 in CAR-dependent amidation, a 2 M working stock of each enantiomer in methanol was made, as 60 is immiscible in water. 1 mM of cinnamic acid 28 was incubated with 100 mM of either the R or S enantiomer of 60 and the 30°C, pH 9 CARmm reaction was carried out as normal. However, a master mix composed of all reaction components excluding the amines was made to ensure uniform concentrations between the blank and the enzyme reactions. Additionally, the reaction was heat-killed on a heat block at 80°C for 20 min, followed by centrifugation to remove precipitate. This was done because no positive control of the theorised amide product was available: without response factor data to compensate for any loss of substrate or product during methanol quenching and supernatant extraction, heat-killing would permit a more accurate comparison of the substrate absorbance peak between the controls and the reactions. No new peaks could be observed in the HPLC traces, and no difference was observed between an enzyme-free blank containing an equal mix of both enantiomers and the enzyme assays with the separate enantiomers.

Next, the same assay as for 60 was conducted, but initially with a racemic mixture of sec-butylamine 61 since, due to the high value of the individual enantiomers, it was preferable first to be certain that a product would be obtained from reactions with this amine. 1 mM of 28 was combined with the pH 9 buffer and the other standard reaction components, and the amine 61 was added directly to a final concentration of 100 mM. The reaction was conducted as with 60. The HPLC comparison of the enzyme-free trace and the enzymatic reaction revealed a novel, albeit small, peak in the enzyme sample. The sample was then sent for LCMS analysis (Synbiochem analytical services) and the novel molecule was shown by MS (ESI+) to have an m/z of 204, which corresponds to the protonated form of the theorised amide product 62 (Figure 7.1). Subsequently a master mix was made, the R and S forms of 61 were added separately and the reaction was conducted in triplicate, in addition to an enzyme-free control. The reactions were stopped by heat-killing and subsequently analysed by HPLC.
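The expected m/z of a protonated amide can be checked from its molecular formula. The sketch below uses standard monoisotopic atomic masses; the formula C13H17NO for 62 is not stated in the text and is inferred here from condensation of cinnamic acid 28 (C9H8O2) with sec-butylamine 61 (C4H11N) with loss of water, and C15H17NO3 is the formula of ilepcimide 58. The function name is illustrative.

```python
# Monoisotopic masses (Da) of the most abundant isotopes, and the proton mass
MONOISOTOPIC = {"C": 12.000000, "H": 1.007825, "N": 14.003074, "O": 15.994915}
PROTON = 1.007276


def mz_protonated(formula):
    """m/z of the singly charged [M+H]+ ion for an element-count dictionary."""
    neutral = sum(MONOISOTOPIC[el] * count for el, count in formula.items())
    return neutral + PROTON


# Formula of 62 inferred as described in the lead-in (an assumption)
amide_62 = {"C": 13, "H": 17, "N": 1, "O": 1}
ilepcimide_58 = {"C": 15, "H": 17, "N": 1, "O": 3}

print(round(mz_protonated(amide_62), 1))
print(round(mz_protonated(ilepcimide_58), 1))
```

Under these assumptions the calculated values agree with the m/z of 204 observed for 62 and the 260.1 reported for 58 in this chapter.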

Figure 7.1: MS analysis of the amide product 62.


7.4.8. Preparative-scale production, extraction and isolation of ilepcimide 58

Prior to optimisation of the one-batch reaction conditions, the scale-up assay using CARni/Sfp-containing BL21 (DE3) cell lysate was conducted. Co-transformed cells were grown in AI media as previously described. They were then pelleted and frozen until required, when they were thawed on ice. The lysate was prepared by suspending the pelleted cells in 2.5 mL of lysis buffer (with imidazole omitted) per gram of cells, with incubation at 37°C for 1 h followed by sonication. The lysate was then split into 1.5 mL Eppendorf tubes, followed by centrifugation for 10 min at 13,200 rpm in a table-top centrifuge at 4°C. The supernatants were pooled and stored on ice. Reactions with an initial volume of 100 mL were conducted in a 250 mL conical flask. The mixture consisted of 10 mL of 1 M sodium carbonate-sodium bicarbonate buffer (pH 9.0 at 37°C), 20 mL of 0.1 M MgCl2, 15 mM ATP, 10 mL of 1 M piperidine 52 adjusted to pH 9.0, and 100 mg of starting material 57 initially dissolved in 1 mL of DMSO. 1 mL of the lysate was added every 30 min over 5 h, with a total reaction time of 24 h at 37°C and 250 rpm. This gave a total added lysate volume of 10 mL. Following incubation, 1 mL of the lysate reaction mixture was removed and added to 10 mL of methanol to allow preparation for HPLC analysis as above. The 110 mL reaction mix was split into 4 separate 50 mL Falcon tubes. Dichloromethane (DCM) was added to each to bring the total volume to 50 mL and mixed to extract the product 58. The tubes were then centrifuged briefly, aiding the separation of the DCM and aqueous layers. The DCM layer was removed and additional DCM added to the aqueous layers to extract remaining product and bring the volume in the Falcon tubes to 50 mL; this was conducted twice. The DCM extracts were combined and given to Dr. Michael Hollas for purification and NMR analysis (Figure 7.2). A final yield of 25.1 mg of 58 was isolated (19% yield), while the HPLC analysis of the lysate gave a conversion value of 28%.
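The isolated yield quoted above can be reproduced from the masses of starting material and product. In the sketch below the molecular weights of 57 (C10H8O4, 192.17 g/mol) and 58 (C15H17NO3, 259.30 g/mol) are calculated from their molecular formulae and are not quoted in the text; the function name is illustrative.

```python
def percent_yield(product_mg, substrate_mg, mw_substrate, mw_product):
    """Isolated yield (%) assuming 1:1 stoichiometry of acid to amide."""
    substrate_mmol = substrate_mg / mw_substrate
    theoretical_mg = substrate_mmol * mw_product
    return 100.0 * product_mg / theoretical_mg


# 100 mg of acid 57 gave 25.1 mg of ilepcimide 58
y = percent_yield(product_mg=25.1, substrate_mg=100.0, mw_substrate=192.17, mw_product=259.30)
print(f"{y:.1f}%")
```

This gives approximately 18.6%, consistent with the 19% isolated yield reported.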


δH (400 MHz, CDCl3) 1.55-1.72 (m, 6H, H16-18), 3.50-3.70 (m, 4H, H15,19), 5.99 (s, 2H, H2), 6.74 (d, 1H, J = 15.4 Hz, H11), 6.80 (d, 1H, J = 7.8 Hz, H9), 7.00 (dd, 1H, J = 8.1, 1.5 Hz, H8), 7.04 (d, 1H, J = 1.8 Hz, H6), 7.57 (d, 1H, J = 15.2 Hz, H10). m/z (ESI+) 260.1 (M+H+, 100%)

Figure 7.2: NMR analysis of the purified product 58. (Conducted by Dr. Michael Hollas)

7.4.9. Whole-cell production of amides by CAR trial

The trial for whole-cell production of amides using CAR-producing cells was performed by first transforming BL21 (DE3) cells with pET21a CARmm, pET28b CARmm 729-1175 or empty pET28b. They were grown using the standard AI media method, pelleted by centrifugation, washed with distilled water and re-spun. They were then immediately used for the whole-cell assays. The reaction mixtures were composed of 100 mM sodium carbonate-sodium bicarbonate buffer, 100 mM 52 (pH 9.0), 5 mM of 57 (with 50 µL of DMSO), 10 mM MgCl2, 75 mg of wet transformed cells and 50 mM glucose. They were incubated at 30°C and 250 rpm for 24 h, after which they were added to 1 mL of methanol and the mixture centrifuged in a table-top centrifuge at 13,000 rpm for 10 min. The supernatant was added to an HPLC filter vial with a 0.45 µm filter, followed by the standard analysis of 58 by HPLC.

7.5. Investigation of coupling between CAR-dependent ATP consumption and amide formation using an EnzChek kit

The EnzChek phosphate assay kit (Thermo Fisher) was employed to analyse the release of phosphate linked to ATP hydrolysis. As this kit measures the concentration of Pi rather than the PPi released in the CAR reaction, an inorganic pyrophosphatase (IPP) (New England Biolabs) was employed following the protocol for the Thermo Fisher pyrophosphate assay kit, which is identical to the phosphate assay kit but with the additional use of its own IPP. The manufacturer's instructions were followed for the generation of a calibration curve between Pi concentration and absorbance at 360 nm. Instead of the advised 1 mL assay, the assay volume was reduced to 200 µL, retaining the same concentrations of components, to allow more samples to be analysed efficiently in a TECAN 96-well plate reader.

Following the generation of a calibration curve, a standard 1 mL CARmm reaction, an enzyme-free control (enzyme replaced by storage buffer) and an amine 52-free control (amine replaced by dH2O) for the production of 58 were conducted and put on ice after 24 h. Subsequently, samples from each of the three reactions and controls were taken in triplicate and analysed on the TECAN plate reader following the manufacturer's protocol adjusted for 200 µL, and the absorbance at 360 nm was used to calculate the Pi concentration of samples. A background absorbance assay advised by the manufacturer's protocol was run in triplicate and the average absorbance subtracted from all sample readings. The remaining samples were immediately quenched with methanol as usual for analysis by HPLC. This allowed conversion and ATP consumption to be compared at the same end time point.
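The conversion of absorbance at 360 nm into Pi concentration, and from there into an estimate of ATP consumed, can be sketched as a linear calibration followed by a stoichiometric correction. The sketch assumes a linear calibration over the range used, that the standards were background-corrected in the same way as the samples, and that all measured Pi arises from PPi (each PPi hydrolysed by pyrophosphatase gives two Pi). The function names and all numerical values are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    return m, mean_y - m * mean_x


def pi_conc_uM(a360, background, m, c):
    """Pi concentration from a background-subtracted absorbance reading."""
    return (a360 - background - c) / m


def atp_consumed_uM(pi_uM):
    """One ATP releases one PPi, which yields two Pi after pyrophosphatase."""
    return pi_uM / 2.0


# Hypothetical calibration standards (µM Pi, absorbance at 360 nm)
standards_uM = [0.0, 25.0, 50.0, 100.0]
absorbances = [0.0, 0.25, 0.50, 1.00]
m, c = fit_line(standards_uM, absorbances)

pi = pi_conc_uM(a360=0.60, background=0.10, m=m, c=c)
print(pi, atp_consumed_uM(pi))
```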

7.6. Enzyme kinetics analysis of native CAR activity with radical-TEMPO carboxylic acid

For kinetic analysis of native CAR reduction of both benzoic acid 25 and the radical substrate TEMPO carboxylic acid 66, an NADPH absorbance depletion assay was conducted in a 96-well plate in a TECAN plate reader. 200 µL reaction mixes contained 100 mM Tris-HCl (pH 7.5 at 30°C), 10 mM MgCl2, 1 mM ATP, 0.15 mM NADPH, 10 µg of CARmm (concentration determined by Nanodrop at 280 nm with molecular weight and extinction coefficient input) and 8 µL of either 25 or 66 in methanol to a final substrate concentration between 0 and 10 mM. The reactions were conducted in triplicate. Following the addition of the substrate as the final component, samples were rapidly placed into the pre-heated TECAN plate reader, shaken for 10 seconds and then read for absorbance. Specific activity per mg of enzyme was calculated and GraphPad Prism software was used to generate kinetics calculations.
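As an illustration of how Michaelis-Menten parameters can be extracted from specific activity data of this kind, the sketch below uses a Hanes-Woolf linearisation ([S]/v plotted against [S], with slope 1/Vmax and intercept Km/Vmax). This is a swapped-in, dependency-free technique for illustration and is not the fitting routine of GraphPad Prism used in this work; the function name and the synthetic data are hypothetical, and the zero-substrate point must be excluded because [S]/v is undefined there.

```python
def hanes_woolf_fit(substrate_mM, rates):
    """Estimate (Km, Vmax) by linear regression of [S]/v against [S]."""
    ys = [s / v for s, v in zip(substrate_mM, rates)]
    n = len(substrate_mM)
    mean_x, mean_y = sum(substrate_mM) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in substrate_mM)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(substrate_mM, ys))
    slope = sxy / sxx                      # equals 1 / Vmax
    intercept = mean_y - slope * mean_x    # equals Km / Vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return km, vmax


# Synthetic, noise-free data generated from hypothetical Km = 0.5 mM, Vmax = 2.0
true_km, true_vmax = 0.5, 2.0
substrate = [0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0]
rates = [true_vmax * s / (true_km + s) for s in substrate]

km, vmax = hanes_woolf_fit(substrate, rates)
print(round(km, 3), round(vmax, 3))
```

With real, noisy data a direct nonlinear fit of the Michaelis-Menten equation is generally preferred, since linearisations weight experimental error unevenly.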

7.7. Structural modelling of piperidine 52 into the active sites of CAR A domain structures

Published crystal structures from previous work192 were used for active site modelling with the AutoDock 4.0 program through the AutoDock GUI, while PDBQT files were generated with the Open Babel GUI program.231 Structures 5MSD (CARni) and 5MST (CARsr) from the Protein Data Bank were used for A domain structural analysis.192 The structure of piperidine 52 was modelled within the active site of these structures and the AutoDock 4.0 energy minimisation function was run in a grid box of 20 Å x 20 Å x 20 Å centred on the carboxylic acid carbonyl carbon.


References

1 V. R. Pattabiraman and J. W. Bode, Nature, 2011, 480, 471–479. 2 E. Valeur and M. Bradley, Chem Soc Rev, 2009, 38, 606–631. 3 A. Goswami and S. G. Van Lanen, Mol. Biosyst., 2015, 11, 338–353. 4 R. M. Lanigan, P. Starkov and T. D. Sheppard, J. Org. Chem., 2013, 78, 4512–4523. 5 H. Lundberg, F. Tinnis, N. Selander and H. Adolfsson, Chem. Soc. Rev., 2014, 43, 2714–2742. 6 F. Musumeci, S. Schenone, G. Grossi, C. Brullo and M. Sanna, Expert Opin. Ther. Pat., 2015, 25, 1411–1421. 7 F. Tang, S. Wu and S. Zhao, J. Solution Chem., 2017, 46, 1556–1574. 8 S. Maurya, D. Yadav, K. Pratap and A. Kumar, Green Chem., 2017, 19, 629–633. 9 K. Tamura, J. Biosci., 2011, 36, 921–928. 10 S. van Pelt, R. L. M. Teeuwen, M. H. A. Janssen, R. A. Sheldon, P. J. Dunn, R. M. Howard, R. Kumar, I. Martinez and J. W. Wong, Green Chem., 2011, 13, 1791– 1798. 11 C. A. G. N. Montalbetti and V. Falque, Tetrahedron, 2005, 61, 10827–10852. 12 T. I. Al-Warhi, H. M. A. Al-Hazimi and A. El-Faham, J. Saudi Chem. Soc., 2012, 16, 97–116. 13 M. Stawikowski and G. B. Fields, Curr. Protoc. Protein Sci., 2002, Unit-18.1. 14 C. M. Gabriel, M. Keener, F. Gallou and B. H. Lipshutz, Org. Lett., 2015, 17, 3968– 3971. 15 Q. Wang, Y. Wang and M. Kurosu, Org. Lett., 2012, 14, 3372–3375. 16 D. S. MacMillan, J. Murray, H. F. Sneddon, C. Jamieson and A. J. B. Watson, Green Chem., 2013, 15, 596–600. 17 D. J. C. Constable, P. J. Dunn, J. D. Hayler, G. R. Humphrey, J. J. L. Leazer, R. J. Linderman, K. Lorenz, J. Manley, B. A. Pearlman, A. Wells, A. Zaks and T. Y. Zhang, Green Chem., 2007, 9, 411–420. 18 F. Bordusa, Brazilian J. Med. Biol. Res., 2000, 33, 469–485. 19 J. Pitzer and K. Steiner, J Biotechnol, 2016, 235, 32–46. 20 M. J. J. Litjens, A. J. J. Straathof, J. A. Jongejan and J. J. Heijnen, Chem. Commun., 1999, 1255–1256. 21 P. Adlercreutz, Chem. Soc. Rev., 2013, 42, 6406–6436. 155

22 M. A. Marahiel, T. Stachelhaus and H. D. Mootz, Chem Rev, 1997, 97, 2651–2674. 23 M. V Fawaz, M. E. Topper and S. M. Firestine, Bioorg. Chem., 2011, 39, 185–191. 24 P. Venkitasubramanian, L. Daniels and J. P. N. Rosazza, J. Biol. Chem., 2007, 282, 478–485. 25 J. R. Dunetz, J. Magano and G. A. Weisenburger, Org. Process Res. Dev., 2016, 20, 140–177. 26 J. M. Palomo, RSC Adv., 2014, 4, 32658–32672. 27 A. El-Faham and F. Albericio, Chem. Rev., 2011, 111, 6557–6602. 28 G. W. Anderson and F. M. Callahan, J. Am. Chem. Soc., 1958, 80, 2902–2903. 29 S. Nozaki, J. Pept. Res., 1999, 54, 162–167. 30 L. A. Carpino, J. Am. Chem. Soc., 1993, 115, 4397–4398. 31 A. Williams and I. T. Ibrahim, Chem. Rev., 1981, 81, 589–636. 32 N. Nakajima and Y. Ikada, Bioconjug. Chem., 1995, 6, 123–130. 33 S. Zuffanti, J. Chem. Educ., 1948, 25, 481. 34 A. Leggio, E. L. Belsito, G. De Luca, M. L. Di Gioia, V. Leotta, E. Romio, C. Siciliano and A. Liguori, RSC Adv., 2016, 6, 34468–34475. 35 A. Isidro-Llobet, M. Álvarez and F. Albericio, Chem. Rev., 2009, 109, 2455–2504. 36 L. Zhang, X. Wang, J. Wang, N. Grinberg, D. Krishnamurthy and C. H. Senanayake, Tetrahedron Lett., 2009, 50, 2964–2966. 37 B. Castro, J. R. Dormoy, G. Evin and C. Selve, Tetrahedron Lett., 1975, 16, 1219– 1222. 38 J. Coste, D. Le-Nguyen and B. Castro, Tetrahedron Lett., 1990, 31, 205–208. 39 M. H. Kim and D. V Patel, Tetrahedron Lett., 1994, 35, 5603–5606. 40 K. P. Lee and H. J. Trochimowicz, Am. J. Pathol., 1982, 106, 8–19. 41 T. Krause, S. Baader, B. Erb and L. J. Gooßen, 2016, 7, 11732. 42 R. A. Sheldon, Green Chem., 2007, 9, 1273–1283. 43 R. Garcia-Alvarez, P. Crochet and V. Cadierno, Green Chem., 2013, 15, 46–66. 44 G. A. Homandberg, J. A. Mattis and M. Laskowski Jr., Biochemistry, 1978, 17, 5220–5227. 45 B. Turk, Nat. Rev. Drug Discov., 2006, 5, 785–799. 46 R. Saravanan, S. S. Adav, Y. K. Choong, M. J. A. van der Plas, J. Petrlova, S. Kjellström, S. K. Sze and A. Schmidtchen, Sci. Rep., 2017, 7, 13136. 47 H. 
Neurath and K. A. Walsh, Proc. Natl. Acad. Sci. U. S. A., 1976, 73, 3825–3832. 48 M. B. Rao, A. M. Tanksale, M. S. Ghatge and V. V Deshpande, Microbiol. Mol.


Biol. Rev. , 1998, 62, 597–635. 49 J. V Olsen, S.-E. Ong and M. Mann, Mol. Cell. Proteomics , 2004, 3, 608–614. 50 J. N. Higaki, L. B. Evnin and C. S. Craik, Biochemistry, 1989, 28, 9256–9263. 51 E. S. Radisky, J. M. Lee, C.-J. K. Lu and D. E. Koshland, Proc. Natl. Acad. Sci. U. S. A., 2006, 103, 6835–6840. 52 L. Hedstrom, Chem. Rev., 2002, 102, 4501–4524. 53 K. Tamura and R. W. Alexander, Cell. Mol. Life Sci. C., 2004, 61, 1317–1330. 54 E. Di Cera, IUBMB Life, 2009, 61, 510–515. 55 R. Günther and F. Bordusa, Chem. – A Eur. J., 2000, 6, 463–467. 56 K. Yazawa and K. Numata, Molecules, 2014, 19, 13755–13774. 57 K. Yazawa, J. Gimenez-Dejoz, H. Masunaga, T. Hikima and K. Numata, Polym. Chem., 2017, 8, 4172–4176. 58 R. J. A. C. de Beer, B. Zarzycka, M. Mariman, H. I. V Amatdjais-Groenen, M. J. Mulders, P. J. L. M. Quaedflieg, F. L. van Delft, S. B. Nabuurs and F. P. J. T. Rutjes, ChemBioChem, 2012, 13, 1319–1326. 59 J. M. Ageitos, K. Yazawa, A. Tateishi, K. Tsuchiya and K. Numata, Biomacromolecules, 2016, 17, 314–323. 60 S. Nitta, A. Komatsu, T. Ishii, H. Iwamoto and K. Numata, Polym. J., 2016, 48, 955. 61 P. Clapes, G. Valencia, J. L. Torres, F. Reig, J. M. Garcia-Anton and J. Mata- Alvarez, Biochim. Biophys. Acta, 1988, 953, 157–163. 62 L. Gráf, L. Szilágyi and I. Venekei, in Handbook of Proteolytic Enzymes, ed. G. B. T.-H. of P. E. Salvesen, Academic Press, 2013, pp. 2626–2633. 63 W. Kullmann, J. Org. Chem., 1982, 47, 5300–5303. 64 M. V Sergeeva, V. M. Paradkar and J. S. Dordick, Enzyme Microb. Technol., 1997, 20, 623–628. 65 F. Bordusa, Chem. Rev., 2002, 102, 4817–4868. 66 L. A. Carpino and G. Y. Han, J. Org. Chem., 1972, 37, 3404–3409. 67 R. V Ulijn, B. Baragana, P. J. Halling and S. L. Flitsch, J Am Chem Soc, 2002, 124, 10988–10989. 68 R. V Ulijn, N. Bisek, P. J. Halling and S. L. Flitsch, Org. Biomol. Chem., 2003, 1, 1277–1281. 69 C.-H. Kuo, J.-A. Lin, C.-M. Chien, C.-H. Tsai, Y.-C. Liu and C.-J. Shieh, J. Mol. Catal. B Enzym., 2016, 129, 15–20. 70 A. 
Liljeblad, P. Kallio, M. Vainio, J. Niemi and L. T. Kanerva, Org Biomol Chem, 2010, 8, 886–895. 71 V. Gotor, Bioorg. Med. Chem., 1999, 7, 2189–2197.


72 L. Chronopoulou, S. Lorenzoni, G. Masci, M. Dentini, A. R. Togna, G. Togna, F. Bordi and C. Palocci, Soft Matter, 2010, 6, 2525–2532. 73 E. M. Anderson, K. M. Larsson and O. Kirk, Biocatal. Biotransformation, 1998, 16, 181–204. 74 K. P. Dhake, Z. S. Qureshi, R. S. Singhal and B. M. Bhanage, Tetrahedron Lett., 2009, 50, 2811–2814. 75 M. Fernández-Pérez and C. Otero, Enzyme Microb. Technol., 2001, 28, 527–536. 76 Y.-B. Huang, Y. Cai, S. Yang, H. Wang, R.-Z. Hou, L. Xu, W. Xiao-Xia and X.-Z. Zhang, J. Biotechnol., 2006, 125, 311–318. 77 A. L. Gutman, E. Meyer, X. Yue and C. Abell, Tetrahedron Lett., 1992, 33, 3943– 3946. 78 K. Khumtaveeporn, A. Ullmann, K. Matsumoto, B. G. Davis and J. B. Jones, Tetrahedron: Asymmetry, 2001, 12, 249–261. 79 B. H. Lipshutz and S. Ghorai, Green Chem., 2014, 16, 3660–3679. 80 M. Strieker, A. Tanović and M. A. Marahiel, Curr. Opin. Struct. Biol., 2010, 20, 234–240. 81 G. H. Hur, C. R. Vickery and M. D. Burkart, Nat. Prod. Rep., 2012, 29, 1074–1098. 82 D. Konz and M. A. Marahiel, Chem. Biol., 1999, 6, R39–R48. 83 F. Kudo, A. Miyanaga and T. Eguchi, Nat. Prod. Rep., 2014, 31, 1056–1073. 84 E. K. Y. Leung, N. Suslov, N. Tuttle, R. Sengupta and J. A. Piccirilli, Annu. Rev. Biochem., 2011, 80, 527–555. 85 Y. L. J. Pang, K. Poruri and S. A. Martinis, Wiley Interdiscip. Rev. RNA, 2014, 5, 461–480. 86 E. J. Steinmetz and M. E. Auldridge, in Current Protocols in Protein Science, John Wiley & Sons, Inc., 2017, p. 5.27.1-5.27.20. 87 N. J. Turner, Nat. Chem. Biol., 2009, 5, 567–573. 88 Y. Zhang and V. N. Gladyshev, Nucleic Acids Res., 2007, 35, 4952–4963. 89 G. Srinivasan, C. M. James and J. A. Krzycki, Science, 2002, 296, 1459–1462. 90 M. A. Marahiel, T. Stachelhaus and H. D. Mootz, Chem Rev, 1997, 97, 2651–2674. 91 T. A. Steitz, Nat Rev Mol Cell Biol, 2008, 9, 242–253. 92 T. A. Keating, C. G. Marshall and C. T. Walsh, Biochemistry, 2000, 39, 15522– 15530. 93 N. M. Gaudelli, D. H. Long and C. A. Townsend, Nature, 2015, 520, 383. 94 J. 
Recktenwald, R. Shawky, O. Puk, F. Pfennig, U. Keller, W. Wohlleben and S. Pelzer, Microbiology, 2002, 148, 1105–1118. 95 B. Shen, L. Du, C. Sanchez, D. J. Edwards, M. Chen and J. M. Murrell, J. Ind.

Microbiol. Biotechnol., 2001, 27, 378–385. 96 M. Hahn and T. Stachelhaus, Proc. Natl. Acad. Sci. U. S. A., 2004, 101, 15585– 15590. 97 S. Lautru and G. L. Challis, Microbiology, 2004, 150, 1629–1636. 98 G. L. Challis and J. H. Naismith, Curr. Opin. Struct. Biol., 2004, 14, 748–756. 99 H. D. Mootz and M. A. Marahiel, J. Bacteriol., 1997, 179, 6843–6850. 100 K. Bloudoff and T. M. Schmeing, Biochim. Biophys. Acta - Proteins Proteomics, 2017, 1865, 1587–1604. 101 E. A. Felnagle, E. E. Jackson, Y. A. Chan, A. M. Podevels, A. D. Berti, M. D. McMahon and M. G. Thomas, Mol. Pharm., 2008, 5, 191–211. 102 M. A. Marahiel and L. O. Essen, in Complex Enzymes in Microbial Natural Product Biosynthesis, Part A: Overview Articles and Peptides, ed. A. H. David, Academic Press, 2009, vol. 458, pp. 337–351. 103 M. Peschke, C. Brieke, M. Heimes and M. J. Cryle, ACS Chem. Biol., 2017, acschembio.7b00943. 104 W.-H. Chen, K. Li, N. S. Guntaka and S. D. Bruner, ACS Chem. Biol., 2016, 11, 2293–2303. 105 P. J. Belshaw, C. T. Walsh and T. Stachelhaus, Science (80-. )., 1999, 284, 486–489. 106 T. Stachelhaus and C. T. Walsh, Biochemistry, 2000, 39, 5775–5787. 107 J. Grünewald and M. A. Marahiel, Microbiol. Mol. Biol. Rev. , 2006, 70, 121–146. 108 J. Swierstra, V. Kapoerchan, A. Knijnenburg, A. van Belkum and M. Overhand, Eur. J. Clin. Microbiol. Infect. Dis., 2016, 35, 763–769. 109 C. T. Walsh, Acc. Chem. Res., 2008, 41, 4–10. 110 T. Velkov, J. Horne, M. J. Scanlon, B. Capuano, E. Yuriev and A. Lawen, Chem. Biol., 2011, 18, 464–475. 111 D. P. Dowling, Y. Kung, A. K. Croft, K. Taghizadeh, W. L. Kelly, C. T. Walsh and C. L. Drennan, Proc. Natl. Acad. Sci. U. S. A., 2016, 113, 12432–12437. 112 U. Linne and M. A. Marahiel, in Methods in Enzymology, Academic Press, 2004, vol. 388, pp. 293–315. 113 T. Stachelhaus, H. D. Mootz and M. A. Marahiel, Chem. Biol., 1999, 6, 493–505. 114 S. A. Sieber and M. A. Marahiel, Chem. Rev., 2005, 105, 715–738. 115 E. Conti, T. Stachelhaus, M. A. Marahiel and P. 
Brick, EMBO J., 1997, 16, 4174– 4183. 116 L. Du, Y. He and Y. Luo, Biochemistry, 2008, 47, 11473–11480. 117 K. T. Osman, L. Du, Y. He and Y. Luo, J. Mol. Biol., 2009, 388, 345–355. 118 A. M. Gulick, ACS Chem. Biol., 2009, 4, 811–827.

119 R. Dieckmann, M. Pavela-Vrancic, H. von Dohren and H. Kleinkauf, J. Mol. Biol., 1999, 288, 129–140. 120 C. A. Mitchell, C. Shi, C. C. Aldrich and A. M. Gulick, Biochemistry, 2012, 51, 3252–3263. 121 A. S. Reger, R. Wu, D. Dunaway-Mariano and A. M. Gulick, Biochemistry, 2008, 47, 8016–8025. 122 H. Yonus, P. Neumann, S. Zimmermann, J. J. May, M. A. Marahiel and M. T. Stubbs, J Biol Chem, 2008, 283, 32484–32491. 123 B. R. M. Villiers and F. Hollfelder, Chembiochem, 2009, 10, 671–682. 124 K. Eppelmann, T. Stachelhaus and M. A. Marahiel, Biochemistry, 2002, 41, 9718– 9726. 125 B. S. Evans, Y. Chen, W. W. Metcalf, H. Zhao and N. L. Kelleher, Chem Biol, 2011, 18, 601–607. 126 M. J. Calcott and D. F. Ackerley, Biotechnol Lett, 2014, 36, 2407–2416. 127 C. T. Walsh, A. M. Gehring, P. H. Weinreb, L. E. N. Quadri and R. S. Flugel, Curr. Opin. Chem. Biol., 1997, 1, 309–315. 128 C. Neville, A. Murphy, K. Kavanagh and S. Doyle, ChemBioChem, 2005, 6, 679– 685. 129 R. H. Lambalot, A. M. Gehring, R. S. Flugel, P. Zuber, M. LaCelle, M. A. Marahiel, R. Reid, C. Khosla and C. T. Walsh, Chem. Biol., 1996, 3, 923–936. 130 T. Kittilä, A. Mollo, L. K. Charkoudian and M. J. Cryle, Angew. Chemie Int. Ed., 2016, 55, 9834–9840. 131 A. Koglin, M. R. Mofid, F. Lohr, B. Schafer, V. V Rogov, M.-M. Blum, T. Mittag, M. A. Marahiel, F. Bernhard and V. Dotsch, Science, 2006, 312, 273–276. 132 S. Zimmermann, S. Pfennig, P. Neumann, H. Yonus, U. Weininger, M. Kovermann, J. Balbach and M. T. Stubbs, FEBS Lett., 2015, 589, 2283–2289. 133 J. Crosby and M. P. Crump, Nat. Prod. Rep., 2012, 29, 1111–1137. 134 S. Vance, O. Tkachenko, B. Thomas, M. Bassuni, H. Hong, D. Nietlispach and W. Broadhurst, Biochem. J., 2016, 473, 1097–1110. 135 A. C. Goodrich, B. J. Harden and D. P. Frueh, J. Am. Chem. Soc., 2015, 137, 12100–12109. 136 D. A. Miller, L. Luo, N. Hillson, T. A. Keating and C. T. Walsh, Chem. Biol., 2002, 9, 333–344. 137 A. C. Goodrich and D. P. Frueh, Biochemistry, 2015, 54, 1154–1156. 138 M. A. 
Marahiel, Nat. Prod. Rep., 2016, 33, 136–140. 139 T. Weber and M. A. Marahiel, Structure, 2001, 9, R3–R9. 140 T. A. Keating, C. G. Marshall, C. T. Walsh and A. E. Keating, Nat Struct Biol, 2002, 9, 522–526.

141 T. Stachelhaus, H. D. Mootz, V. Bergendahl and M. A. Marahiel, J Biol Chem, 1998, 273, 22773–22781. 142 V. Bergendahl, U. Linne and M. A. Marahiel, Eur. J. Biochem., 2002, 269, 620– 629. 143 G. C. Uguru, C. Milne, M. Borg, F. Flett, C. P. Smith and J. Micklefield, J. Am. Chem. Soc., 2004, 126, 5032–5033. 144 M. Winn, J. K. Fyans, Y. Zhuo and J. Micklefield, Nat. Prod. Rep., 2016, 33, 317– 347. 145 V. Miao, M.-F. Coeffet-Le Gal, K. Nguyen, P. Brian, J. Penn, A. Whiting, J. Steele, D. Kau, S. Martin, R. Ford, T. Gibson, M. Bouchard, S. K. Wrigley and R. H. Baltz, Chem. Biol., 2006, 13, 269–276. 146 H. D. Mootz, D. Schwarzer and M. A. Marahiel, Proc Natl Acad Sci U S A, 2000, 97, 5848–5853. 147 S. Doekel and M. A. Marahiel, Chem Biol, 2000, 7, 373–384. 148 M. J. Calcott, J. G. Owen, I. L. Lamont and D. F. Ackerley, Appl. Environ. Microbiol. , 2014, 80, 5723–5731. 149 R. Beer, K. Herbst, N. Ignatiadis, I. Kats, L. Adlung, H. Meyer, D. Niopek, T. Christiansen, F. Georgi, N. Kurzawa, J. Meichsner, S. Rabe, A. Riedel, J. Sachs, J. Schessner, F. Schmidt, P. Walch, K. Niopek, T. Heinemann, R. Eils and B. Di Ventura, Mol Biosyst, 2014, 10, 1709–1718. 150 M. Y. Galperin and E. V Koonin, Protein Sci, 1997, 6, 2639–2643. 151 W. T. Wolodko, M. E. Fraser, M. N. James and W. A. Bridger, J. Biol. Chem. , 1994, 269, 10883–10890. 152 C. Fan, P. C. Moews, Y. Shi, C. T. Walsh and J. R. Knox, Proc. Natl. Acad. Sci. U. S. A., 1995, 92, 1172–1176. 153 W. K. Kang, T. Icho, S. Isono, M. Kitakawa and K. Isono, Mol. Gen. Genet., 1989, 217, 281–288. 154 Y. Ogasawara and T. Dairi, Chem. – A Eur. J., 2017, 23, 10714–10724. 155 A. G. Murzin, Curr. Opin. Struct. Biol., 1996, 6, 386–394. 156 J. B. Thoden, C. Z. Blanchard, H. M. Holden and G. L. Waldrop, J. Biol. Chem. , 2000, 275, 16183–16190. 157 J. B. Thoden, H. M. Holden and S. M. Firestine, Biochemistry, 2008, 47, 13346– 13353. 158 H. Berg, K. Ziegler, K. Piotukh, K. Baier, W. Lockau and R. Volkmer-Engert, Eur. J. 
Biochem., 2000, 267, 5561–5570. 159 J. B. Parker and C. T. Walsh, Biochemistry, 2013, 52, 889–901. 160 K. Tabata, H. Ikeda and S. Hashimoto, J. Bacteriol., 2005, 187, 5195–5202. 161 Y. Ogasawara, K. Ooya, M. Fujimori, M. Noike and T. Dairi, J. Antibiot. (Tokyo)., 2016, 69, 119–120.

162 Y. Ogasawara, M. Fujimori, J. Kawata and T. Dairi, Bioorg. Med. Chem. Lett., 2016, 26, 3662–3664. 163 G. L. Waldrop, I. Rayment and H. M. Holden, Biochemistry, 1994, 33, 10249– 10256. 164 M. A. Martínez-Núñez and V. E. L. y. López, Sustain. Chem. Process., 2016, 4, 13. 165 C. Ji, Q. Chen, Q. Li, H. Huang, Y. Song, J. Ma and J. Ju, Tetrahedron Lett., 2014, 55, 4901–4904. 166 M. Steffensky, S.-M. Li and L. Heide, J. Biol. Chem. , 2000, 275, 21754–21760. 167 S. Schmelz and J. H. Naismith, Curr. Opin. Struct. Biol., 2009, 19, 666–671.

168 S. Elisabeth, S. Marion, S. Jürgen, P. Andrea, L. Shu‐Ming and H. Lutz, Eur. J. Biochem., 2003, 270, 4413–4419. 169 C. Maruyama, J. Toyoda, Y. Kato, M. Izumikawa, M. Takagi, K. Shin-Ya, H. Katano, T. Utagawa and Y. Hamano, Nat. Chem. Biol., 2012, 8, 791–797. 170 T. Abe, Y. Hashimoto, H. Hosaka, K. Tomita-Yokotani and M. Kobayashi, J. Biol. Chem., 2008, 283, 11312–11321. 171 P. J. O’Brien and D. Herschlag, Chem. Biol., 1999, 6, R91–R105. 172 T. M. Hackeng, J. H. Griffin and P. E. Dawson, Proc. Natl. Acad. Sci. U. S. A., 1999, 96, 10068–10073. 173 T. Abe, Y. Hashimoto, Y. Zhuang, Y. Ge, T. Kumano and M. Kobayashi, J. Biol. Chem. , 2016, 291, 1735–1750. 174 T. Abe, Y. Hashimoto, S. Sugimoto, K. Kobayashi, T. Kumano and M. Kobayashi, J. Antibiot. (Tokyo)., 2017, 70, 435–442. 175 P. E. Dawson, T. W. Muir, I. Clark-Lewis and S. B. Kent, Science (80-. )., 1994, 266, 776–779. 176 R. Dieckmann, T. Neuhof, M. Pavela-Vrancic and H. von Döhren, FEBS Lett., 2001, 498, 42–45. 177 K. Napora-Wijata, G. A. Strohmeier and M. Winkler, Biotechnol. J., 2014, 9, 822– 843. 178 M. Winkler, Curr. Opin. Chem. Biol., 2017, 43, 23–29. 179 D. M. Bachman, B. Dragoon and S. John, Arch. Biochem. Biophys., 1960, 91, 326. 180 G. G. Gross, K. H. Bolkart and M. H. Zenk, Biochem. Biophys. Res. Commun., 1968, 32, 173–178. 181 G. G. Gross and M. H. Zenk, Eur. J. Biochem., 1969, 8, 413–419. 182 A. He, T. Li, L. Daniels, I. Fotheringham and J. P. N. Rosazza, Appl. Environ. Microbiol., 2004, 70, 1874–1881. 183 S. McGinnis and T. L. Madden, Nucleic Acids Res., 2004, 32, W20–W25. 184 K. Reuter, M. R. Mofid, M. A. Marahiel and R. Ficner, EMBO J, 1999, 18, 6823– 162

6831. 185 W. Finnigan, A. Thomas, H. Cromar, B. Gough, R. Snajdrova, J. P. Adams, J. A. Littlechild and N. J. Harmer, ChemCatChem, 2017, 9, 1005–1017. 186 M. K. Akhtar, N. J. Turner and P. R. Jones, Proc Natl Acad Sci U S A, 2013, 110, 87–92. 187 G. M. Rodriguez and S. Atsumi, Microb. Cell Fact., 2012, 11, 90. 188 R. Vidal, L. López-Maury, M. G. Guerrero and F. J. Florencio, J. Bacteriol., 2009, 191, 4383–4391. 189 B. E. Eser, D. Das, J. Han, P. R. Jones and E. N. G. Marsh, Biochemistry, 2011, 50, 10743–10750. 190 L. J. Hepworth, S. P. France, S. Hussain, P. Both, N. J. Turner and S. L. Flitsch, ACS Catal., 2017, 7, 2920–2925. 191 S. P. France, S. Hussain, A. M. Hill, L. J. Hepworth, R. M. Howard, K. R. Mulholland, S. L. Flitsch and N. J. Turner, ACS Catal., 2016, 6, 3753–3759. 192 D. Gahloth, M. S. Dunstan, D. Quaglia, E. Klumbys, M. P. Lockhart-Cairns, A. M. Hill, S. R. Derrington, N. S. Scrutton, N. J. Turner and D. Leys, Nat. Chem. Biol., 2017, 13, 975–981. 193 J. M. Reimer, M. N. Aloise, P. M. Harrison and T. M. Schmeing, Nature, , DOI:10.1038/nature16503. 194 T. A. Keating, C. G. Marshall and C. T. Walsh, Biochemistry, 2000, 39, 15513– 15521. 195 R. Finking and M. A. Marahiel, Annu. Rev. Microbiol., 2004, 58, 453–488. 196 T. Weber and M. A. Marahiel, Structure, 2001, 9, R3–R9. 197 J. W. A. van Dijk, C.-J. Guo and C. C. C. Wang, Org. Lett., 2016, 18, 6236–6239. 198 S. Lautru and G. L. Challis, Microbiology, 2004, 150, 1629–1636. 199 B. R. Miller, J. A. Sundlov, E. J. Drake, T. A. Makin and A. M. Gulick, Proteins, 2014, 82, 2691–2702. 200 R. D. Finn, A. Bateman, J. Clements, P. Coggill, R. Y. Eberhardt, S. R. Eddy, A. Heger, K. Hetherington, L. Holm, J. Mistry, E. L. L. Sonnhammer, J. Tate and M. Punta, Nucleic Acids Res., 2014, 42, D222–D230. 201 S. Liu, C. Zhang, N. Li, B. Niu, M. Liu, X. Liu, T. Wei, D. Zhu, Y. Huang, S. Xu and L. Gu, Acta Crystallogr. D. Biol. Crystallogr., 2012, 68, 1329–1338. 202 L. E. Bird, H. Rada, J. Flanagan, J. M. Diprose, R. J. C. 
Gilbert and R. J. Owens, Methods Mol. Biol., 2014, 1116, 209–234. 203 S. K. Berwal, R. K. Sreejith and J. K. Pal, Anal. Biochem., 2010, 405, 275–277. 204 J. He, K. Sakaguchi and T. Suzuki, J. Biosci. Bioeng., 2012, 113, 442–444. 205 A. S. Worthington, G. H. Hur and M. D. Burkart, Mol. Biosyst., 2011, 7, 365–370.

206 C. G. Marshall, M. D. Burkart, R. K. Meray and C. T. Walsh, Biochemistry, 2002, 41, 8429–8437. 207 F. Sievers and D. G. Higgins, Curr. Protoc. Bioinforma., 2014, 48, 3.13.1-16. 208 Q. S. Yan, P. K. Mishra, R. L. Burger, A. F. Bettendorf, P. C. Jobe and J. W. Dailey, J Pharmacol Exp Ther, 1992, 261, 652–659. 209 J. Franke and C. Hertweck, Cell Chem. Biol., 2017, 23, 1179–1192. 210 R. Finking and M. A. Marahiel, Annu. Rev. Microbiol., 2004, 58, 453–488. 211 M. R. Webb, Proc. Natl. Acad. Sci. U. S. A., 1992, 89, 4884–4887. 212 R. Lahti, Microbiol. Rev., 1983, 47, 169–178. 213 A. Cornish-Bowden, Perspect. Sci., 2015, 4, 3–9. 214 O. K. and D. S. Tawfik, Annu. Rev. Biochem., 2010, 79, 471–505. 215 A. Babtie, N. Tokuriki and F. Hollfelder, Curr. Opin. Chem. Biol., 2010, 14, 200– 207. 216 D. Ringe and G. A. Petsko, Science (80-. )., 2008, 320, 1428–1429. 217 G. M. Morris, R. Huey, W. Lindstrom, M. F. Sanner, R. K. Belew, D. S. Goodsell and A. J. Olson, J. Comput. Chem., 2009, 30, 2785–2791. 218 D. S. Goodsell, G. M. Morris and A. J. Olson, J. Mol. Recognit., 1996, 9, 1–5. 219 S. P. France, L. J. Hepworth, N. J. Turner and S. L. Flitsch, ACS Catal., 2017, 7, 710–724. 220 E. Padan, E. Bibi, M. Ito and T. A. Krulwich, Biochim. Biophys. Acta, 2005, 1717, 67–88. 221 K. Horikoshi, Microbiol. Mol. Biol. Rev., 1999, 63, 735–750. 222 M. A. Hollas, S. J. Webb, S. L. Flitsch and A. J. Fielding, Angew. Chemie Int. Ed., 2017, 56, 9449–9453. 223 W. L. Hubbell, H. S. Mchaourab, C. Altenbach and M. A. Lietzow, Structure, 1996, 4, 779–783. 224 B. R. K. Menon, K. Fisher, S. E. J. Rigby, N. S. Scrutton and D. Leys, J. Biol. Chem., 2014, 289, 34161–34174. 225 D. P. Claxton, K. Kazmier, S. Mishra and H. S. Mchaourab, Methods Enzymol., 2015, 564, 349–387. 226 I. D. Sahu, R. M. McCarrick and G. A. Lorigan, Biochemistry, 2013, 52, 5967– 5984. 227 S. H. Liaw and D. Eisenberg, Biochemistry, 1994, 33, 675–681. 228 J. N. Andexer and M. Richter, Chembiochem, 2015, 16, 380–386. 229 A. J. L. 
Wood, N. J. Weise, J. D. Frampton, M. S. Dunstan, M. A. Hollas, S. R. Derrington, R. C. Lloyd, D. Quaglia, F. Parmeggiani, D. Leys, N. J. Turner and S. L. Flitsch, Angew. Chemie - Int. Ed., 2017, 56, 14498–14501. 230 M. R. Wilkins, E. Gasteiger, A. Bairoch, J. C. Sanchez, K. L. Williams, R. D. Appel and D. F. Hochstrasser, Methods Mol. Biol., 1999, 112, 531–552. 231 N. M. O’Boyle, M. Banck, C. A. James, C. Morley, T. Vandermeersch and G. R. Hutchison, J. Cheminform., 2011, 3, 33.

Appendices

Appendix 1: Genes used in this work

CARmm genetic sequence cloned within pET21a, and donated by Dr. Mark Dunstan.

Red sequence = truncated CARmm gene, cloned within pET28b and donated by Dr. Mark Dunstan.

ATGAGCCCGATTACCCGTGAAGAACGTCTGGAACGTCGTATTCAGGATCTGTATGCGA ACGATCCGCAGTTCGCAGCAGCCAAACCGGCGACCGCGATTACCGCGGCGATTGAAC GTCCGGGTCTGCCGCTGCCGCAGATCATCGAAACGGTGATGACCGGCTATGCGGATCG TCCGGCACTGGCACAACGTAGCGTGGAATTTGTGACCGATGCGGGCACCGGTCATACC ACCCTGCGTCTGCTGCCGCATTTTGAAACCATTAGCTATGGCGAACTGTGGGATCGTAT TAGCGCGCTGGCCGATGTTCTGAGCACCGAACAGACCGTGAAACCGGGCGATCGTGTG TGCCTGCTGGGCTTTAACAGCGTGGATTATGCGACCATTGATATGACCCTGGCACGTCT GGGTGCTGTCGCTGTCCCGCTGCAGACCTCTGCTGCGATTACCCAGCTGCAGCCGATT GTGGCGGAAACCCAGCCGACCATGATTGCGGCGAGCGTGGATGCCCTGGCCGATGCG ACCGAACTGGCACTGAGTGGTCAAACGGCTACGCGTGTGCTGGTGTTTGATCATCATC GTCAGGTGGATGCGCATCGTGCGGCGGTTGAAAGCGCGCGTGAACGTCTGGCCGGTA GCGCGGTGGTTGAAACCCTGGCCGAAGCGATTGCGCGTGGTGATGTGCCGCGTGGTGC GAGCGCGGGTAGCGCACCGGGCACCGATGTGAGCGATGATAGCCTGGCCCTGCTGATT TATACCTCTGGTAGTACGGGTGCGCCGAAAGGCGCCATGTATCCGCGTCGTAACGTGG CGACCTTTTGGCGTAAACGTACCTGGTTTGAAGGCGGCTATGAACCGAGCATTACCCT GAACTTTATGCCGATGAGCCATGTGATGGGCCGTCAGATTCTGTATGGCACCCTGTGC AACGGCGGCACCGCGTATTTTGTGGCGAAAAGCGATCTGAGCACCCTGTTTGAAGATC TGGCCCTGGTGCGTCCGACCGAACTGACCTTCGTCCCGCGTGTTTGGGATATGGTGTTC GATGAATTTCAGAGCGAAGTGGATCGTCGTCTGGTGGATGGCGCGGATCGTGTTGCGC TGGAAGCGCAGGTGAAAGCGGAAATTCGTAACGATGTGCTGGGCGGTCGTTATACCTC TGCTCTGACGGGTTCTGCTCCGATTAGCGATGAAATGAAAGCGTGGGTGGAAGAACTG CTGGATATGCATCTGGTGGAAGGCTATGGCAGCACCGAAGCGGGCATGATTCTGATTG ATGGCGCGATTCGTCGTCCGGCGGTGCTGGATTATAAACTGGTGGATGTTCCGGATCT GGGCTATTTTCTGACCGATCGTCCGCATCCGCGTGGCGAACTGCTGGTGAAAACCGAT AGCCTGTTTCCGGGCTATTATCAGCGTGCGGAAGTGACCGCGGATGTGTTTGATGCGG ATGGCTTTTATCGCACCGGCGATATTATGGCGGAAGTGGGCCCGGAACAGTTTGTGTA TCTGGATCGTCGTAACAACGTGCTGAAACTGAGCCAGGGCGAATTTGTTACCGTGAGC AAACTGGAAGCGGTGTTTGGCGATAGCCCGCTGGTGCGTCAGATTTATATTTATGGCA ACAGCGCGCGTGCGTATCTGCTGGCCGTGATTGTGCCGACCCAGGAAGCGCTGGACGC GGTCCCGGTTGAAGAACTGAAAGCGCGTCTGGGTGACTCTCTGCAGGAAGTGGCGAA AGCGGCGGGTCTGCAGAGCTATGAAATTCCGCGCGATTTTATTATCGAAACCACCCCG TGGACCCTGGAAAACGGCCTGCTGACGGGTATTCGTAAACTGGCCCGTCCGCAGCTGA AAAAACATTATGGTGAACTGCTGGAACAAATTTATACCGATCTGGCCCACGGCCAGGC 
GGATGAACTGCGTAGCCTGCGTCAGAGCGGTGCGGATGCGCCGGTGCTGGTGACCGTT TGTCGTGCGGCTGCAGCTCTGCTGGGTGGTAGCGCGAGCGATGTGCAGCCGGATGCGC ATTTCACCGATCTGGGTGGTGATAGCCTGAGCGCCCTGAGCTTTACCAACCTGCTGCAT GAAATCTTTGATATTGAAGTGCCGGTGGGCGTGATTGTGAGCCCGGCGAACGATCTGC AGGCGCTGGCCGATTATGTGGAAGCGGCGCGTAAACCGGGTAGCAGCCGTCCGACCTT TGCGAGCGTGCATGGCGCGAGCAACGGCCAGGTGACCGAAGTGCATGCGGGCGATCT GAGCCTGGATAAATTTATTGATGCGGCGACCCTGGCCGAAGCCCCGCGTCTGCCGGCT GCAAATACCCAGGTGCGTACCGTGCTGCTGACCGGTGCGACCGGCTTTCTGGGCCGTT ACCTGGCCCTGGAATGGCTGGAACGTATGGATCTGGTTGATGGCAAACTGATTTGCCT GGTGCGTGCCAAAAGCGATACCGAAGCGCGTGCGCGTCTGGATAAAACCTTTGATAGC GGCGATCCGGAACTGCTGGCCCATTATCGTGCGCTGGCCGGCGATCATCTGGAAGTGC TGGCCGGTGATAAAGGCGAAGCGGATCTGGGCCTGGATCGTCAGACCTGGCAACGCC 166

TGGCAGATACCGTGGATCTGATTGTTGACCCGGCTGCCCTGGTGAATCATGTGCTGCC GTATAGCCAGCTGTTTGGCCCGAATGCGCTGGGCACCGCTGAACTGCTGCGCCTGGCT CTGACCAGCAAAATTAAACCGTATAGCTACACCAGCACCATTGGCGTGGCGGATCAGA TTCCGCCGAGCGCGTTTACCGAAGATGCGGATATTCGTGTGATTAGCGCGACCCGTGC GGTGGATGATAGCTATGCGAACGGCTATAGCAACAGCAAATGGGCGGGTGAAGTGCT GCTGCGTGAAGCGCATGATCTGTGCGGTCTGCCGGTGGCGGTGTTTCGTTGCGATATG ATCCTGGCAGACACGACCTGGGCGGGTCAGCTGAACGTGCCGGATATGTTTACCCGTA TGATTCTGTCTCTGGCAGCTACGGGTATCGCACCGGGTAGCTTTTATGAACTGGCCGCG GATGGTGCGCGTCAGCGTGCGCATTATGATGGCCTGCCGGTGGAATTTATTGCGGAAG CGATTAGCACCCTGGGCGCGCAGAGCCAGGATGGCTTTCATACCTATCATGTGATGAA TCCGTATGATGATGGCATTGGCCTGGATGAATTTGTGGATTGGCTGAACGAAAGCGGC TGCCCGATTCAGCGTATTGCGGATTATGGCGATTGGCTGCAGCGTTTTGAAACCGCGC TGCGCGCTCTGCCGGATCGTCAGCGTCATAGCAGCCTGCTGCCGCTGCTGCATAACTA TCGTCAGCCGGAACGTCCGGTGCGTGGTAGCATTGCGCCGACCGATCGCTTTCGTGCG GCCGTGCAGGAAGCGAAAATTGGCCCGGATAAAGATATTCCGCATGTGGGTGCGCCG ATTATTGTGAAATATGTGAGCGATCTGCGCCTGCTGGGCCTGCTGTAA
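The sequences in this appendix are broken into blocks by spaces and, where they run across a page, by page numbers and blank lines. A minimal Python sketch, assuming the sequence text alone has been copied (without the descriptive headings), shows one way to strip that residue and run basic sanity checks before reuse. The function names `clean_sequence` and `check_orf` and the pasted fragment are hypothetical illustrations, not part of the original work.

```python
import re


def clean_sequence(raw: str) -> str:
    """Remove whitespace and digits (e.g. page numbers) from pasted sequence text."""
    seq = re.sub(r"[\s\d]", "", raw).upper()
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        # Anything other than A, C, G or T suggests that non-sequence text was pasted.
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return seq


def check_orf(seq: str, require_stop: bool = True) -> bool:
    """Basic checks: ATG start, whole number of codons, optional stop codon."""
    if not seq.startswith("ATG") or len(seq) % 3 != 0:
        return False
    if require_stop and seq[-3:] not in {"TAA", "TAG", "TGA"}:
        return False
    return True


# Hypothetical pasted fragment: the first and last bases of the CARmm gene with a
# page number and a blank line embedded, as happens across a page break.
pasted = "ATGAGCCCG ATTACCCGT 166\n\nCTGGGCCTG CTGTAA"
cleaned = clean_sequence(pasted)
print(cleaned, check_orf(cleaned))
```

Sequences listed here without a stop codon, such as those cloned in frame with a vector-encoded tag, can be checked with `require_stop=False`.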

CARni genetic sequence cloned within pET21a, and donated by Dr. Mark Dunstan.

Green sequence = codon mutated from serine (AGC) to alanine (GCC) in this work.

ATGGCGGTGGATAGCCCGGATGAACGTCTGCAGCGTCGTATTGCGCAGCTGTTTGCGG AAGATGAACAGGTGAAAGCAGCACGCCCGCTGGAAGCGGTTAGCGCAGCGGTGAGCG CACCGGGTATGCGTCTGGCCCAGATTGCGGCGACCGTGATGGCGGGCTATGCGGATCG TCCGGCAGCGGGTCAGCGTGCGTTTGAACTGAACACCGATGATGCGACCGGCCGTACC AGCCTGCGTCTGCTGCCGCGTTTTGAAACCATTACCTATCGTGAACTGTGGCAGCGTGT GGGTGAAGTTGCGGCAGCGTGGCATCACGATCCGGAAAATCCGCTGCGTGCGGGCGA TTTTGTGGCGCTGCTGGGCTTTACCAGCATTGATTATGCGACCCTGGATCTGGCCGATA TTCATCTGGGCGCGGTGACCGTTCCGCTGCAGGCGAGCGCAGCAGTCAGCCAACTGAT TGCGATTCTGACCGAAACGAGTCCGCGCCTGCTGGCATCTACCCCGGAACATCTGGAT GCGGCGGTGGAATGTCTGCTGGCAGGTACGACGCCGGAACGCCTGGTGGTGTTTGATT ATCATCCGGAAGATGATGATCAGCGTGCGGCGTTTGAAAGCGCGCGTCGTCGTCTGGC CGATGCGGGCAGCCTGGTGATTGTGGAAACCCTGGATGCGGTGCGTGCGCGTGGTCGT GATCTGCCGGCTGCTCCGCTGTTTGTGCCGGATACCGATGATGATCCGCTGGCCCTGCT GATTTATACCTCTGGTAGCACGGGTACGCCGAAAGGCGCCATGTATACCAACCGCCTG GCAGCAACGATGTGGCAAGGTAACAGCATGCTGCAGGGCAATAGCCAGCGTGTGGGC ATTAACCTGAACTATATGCCGATGAGCCATATTGCGGGCCGTATTAGCCTGTTTGGCGT GCTGGCCCGTGGTGGCACCGCGTATTTTGCGGCGAAAAGCGATATGAGCACCCTGTTT GAAGATATTGGCCTGGTGCGTCCGACCGAAATTTTTTTTGTGCCGCGTGTGTGCGATAT GGTGTTTCAGCGTTATCAGAGCGAACTGGATCGTCGTAGCGTGGCGGGTGCGGATCTG GATACCCTGGATCGTGAAGTGAAAGCGGATCTGCGTCAGAACTATCTGGGCGGTCGTT TTCTGGTGGCGGTGGTGGGTAGCGCACCGCTGGCCGCGGAAATGAAAACCTTTATGGA AAGCGTGCTGGATCTGCCGCTGCATGATGGCTATGGCAGCACCGAAGCGGGTGCGAG CGTGCTGCTGGATAACCAGATTCAGCGTCCGCCGGTGCTGGATTATAAACTGGTGGAC GTCCCGGAACTGGGCTATTTTCGTACCGATCGTCCGCATCCGCGTGGCGAACTGCTGCT GAAAGCGGAAACCACCATTCCGGGCTATTATAAACGTCCGGAAGTGACCGCGGAAAT TTTTGATGAAGATGGCTTCTATAAAACCGGCGATATTGTGGCGGAACTGGAACATGAT CGTCTGGTGTATGTGGATCGTCGCAACAACGTGCTGAAACTGAGCCAGGGCGAATTTG TGACCGTGGCGCATCTGGAAGCGGTGTTTGCGAGCAGCCCGCTGATTCGTCAGATTTT 167

TATCTACGGCTCTAGTGAACGCTCTTATCTGCTGGCAGTGATTGTGCCGACCGATGATG CCCTGCGTGGCCGTGATACCGCGACCCTGAAAAGCGCGCTGGCCGAAAGCATTCAGCG TATTGCGAAAGATGCGAACCTGCAGCCGTATGAAATTCCGCGTGATTTTCTGATTGAA ACCGAACCGTTCACCATTGCGAACGGCCTGCTGTCTGGCATTGCGAAACTGCTGCGTC CGAACCTGAAAGAACGTTATGGCGCGCAGCTGGAACAAATGTATACCGATCTGGCCA CCGGCCAGGCGGATGAACTGCTGGCCCTGCGTCGTGAAGCGGCGGATCTGCCGGTTCT GGAAACCGTTAGCCGTGCGGCGAAAGCCATGCTGGGTGTGGCGAGCGCGGATATGCG TCCGGATGCGCATTTTACCGATCTGGGCGGCGATAGCCTGAGCGCCCTGAGCTTTAGC AACCTGCTGCATGAAATTTTTGGCGTGGAAGTGCCGGTGGGTGTGGTTGTGAGCCCGG CAAACGAACTGCGTGACCTGGCCAACTATATTGAAGCGGAACGTAACAGCGGCGCGA AACGTCCGACCTTTACCAGCGTGCATGGCGGCGGTAGCGAAATTCGTGCGGCCGATCT GACCCTGGATAAATTTATTGATGCGCGTACCCTGGCCGCAGCGGATAGCATTCCGCAT GCACCGGTTCCGGCACAGACCGTCCTGCTGACGGGCGCAAATGGCTATCTGGGCCGTT TTCTGTGCCTGGAATGGCTGGAACGTCTGGATAAAACCGGTGGCACCCTGATTTGCGT GGTGCGTGGCAGCGATGCGGCGGCAGCCCGTAAACGCCTGGATAGCGCGTTTGATAG CGGCGATCCGGGCCTGCTGGAACATTATCAGCAGCTGGCCGCACGCACCCTGGAAGTT CTGGCCGGTGATATTGGCGATCCGAACCTGGGCCTGGATGATGCCACCTGGCAGCGTC TGGCCGAAACCGTGGATCTGATTGTGCACCCGGCTGCTCTGGTGAATCATGTGCTGCC GTATACCCAGCTGTTTGGCCCGAACGTTGTGGGCACCGCGGAAATCGTTCGTCTGGCT ATTACCGCGCGTCGTAAACCGGTGACCTATCTGAGCACCGTGGGCGTGGCGGATCAGG TTGATCCGGCGGAATATCAGGAAGATAGCGACGTCCGCGAAATGAGCGCAGTCCGCG TCGTTCGCGAAAGTTATGCAAACGGTTATGGTAACAGCAAATGGGCGGGTGAAGTGCT GCTGCGTGAAGCGCATGATCTGTGCGGTCTGCCGGTGGCGGTGTTTCGTAGCGATATG ATTCTGGCCCATAGCCGTTATGCGGGCCAGCTGAACGTGCAGGATGTGTTTACCCGTC TGATTCTGAGCCTGGTGGCGACCGGCATTGCGCCGTATAGCTTTTATCGCACCGATGC GGATGGCAACCGTCAGCGTGCGCATTATGATGGCCTGCCGGCGGATTTTACCGCAGCA GCTATCACCGCACTGGGCATTCAGGCGACCGAAGGCTTTCGTACCTATGATGTGCTGA ATCCGTATGATGATGGCATTAGCCTGGATGAATTTGTCGATTGGCTGGTCGAATCTGGT CACCCGATTCAGCGCATTACCGATTATAGCGATTGGTTTCACCGCTTTGAAACCGCGAT TCGTGCGCTGCCGGAAAAACAGCGTCAGGCGAGCGTTCTGCCGCTGCTGGATGCGTAT CGTAATCCGTGTCCGGCGGTCCGTGGTGCAATTCTGCCGGCGAAAGAATTTCAGGCGG CGGTGCAGACCGCGAAAATTGGCCCGGAACAGGATATTCCGCATCTGAGCGCACCGCT GATTGATAAATATGTGAGCGATCTGGAACTGCTGCAGCTGCTGTAA

VibE genetic sequence cloned within pET28b in this work.

ATGACTACGGACTTTACCCCATGGCCGGAAGCGTTGGCAGCACAGTATCGCCA GTTAGGCTACTGGCAAGACAAAACGCTCCTGGACTATCTGCAACAAAGCGCTG AACGCACACCAAATGCTCTCGCACTGGTAGGCGACAATCAGCAATGGCGTTAT CAGGCAATGCTGGAACGTATCGAACAGCTGGCAGCCGGGTTTACTGAACTGGG CTTAGGCTGCGGCGATAACGTTGTGCTGCAACTGGGGAATGTAGCCGAGTTTT ACCTGTGCTTCTTCGCGTTACTGCGCCAAGGGATTCGCCCGATTTTGGCACTGC CTGCGCATCGGCTGGCTGAAATTCGCTACTTTTGCCAGCATTCACAGGCCAAA GCCTACCTGATCGATGGAGCCCAGCGGCCGTTTGACTATCAAGCACTGGCCCA GGAGTTACTGGCGTGCTGTCCAACCCTTCAGACGGTGATTGTGCGTGGGCAGA CACGTGTTACAGATCCGAAATTCATCGAACTGGCTAGTTGCTACTCAGCGTCGT CTTGCCAGGCGAATGCAGATCCGAATCAGATTGCCTTCTTTCAGCTGTCTGGCG GCACCACTGGTACCCCGAAACTTATTCCTCGCACGCACAACGACTATGCGTAT AGCGTCACTGCCAGCGTGGAAATCTGCCGCTTTGACCAACACACGCGCTATCT GTGTGTTCTGCCGGCTGCGCATAACTTCCCGCTTAGTAGCCCTGGTGCCTTAGG 168

TGTGTTTTGGGCAGGAGGTTGTGTTGTGCTGAGCCAGGATGCCTCGCCACAGC ATGCCTTTAAACTGATCGAACAGCACAAGATTACCGTAACCGCGCTGGTCCCT CCATTGGCTCTGTTATGGATGGACCATGCAGAGAAATCTACCTACGATCTGTCC TCGCTCCACTTTGTCCAGGTCGGCGGAGCGAAATTTAGCGAAGCTGCGGCACG CCGGCTTCCCAAAGCGCTCGGCTGTCAGCTGCAACAAGTTTTCGGCATGGCGG AAGGTCTGGTCAACTATACCCGTTTGGACGATAGTGCCGAGCTGATTGCCACT ACGCAGGGTCGCCCCATTTCCGCGCATGATCAGCTGCTGGTTGTGGATGAGCA GGGTCAACCGGTAGCCTCAGGGGAAGAGGGCTATCTGCTGACCCAAGGCCCGT ATACCATTCGTGGGTATTACCGTGCGGATCAACATAATCAGCGTGCGTTCAAC GCGCAGGGCTTTTACATCACCGGTGATAAGGTAAAATTGTCGTCTGAAGGCTA TGTCATTGTCACAGGTCGTGCAAAAGATCAGATCAACCGTGGTGGCGAGAAAA TTGCTGCTGAAGAAGTTGAAAACCAGCTCTTACACCATCCCGCGGTTCATGAT GCAGCGCTCATTGCGATCTCCGATGAGTATCTGGGAGAACGCAGTTGTGCGGT GATTGTGCTTAAACCGGAACAAAGCGTGAACACCATCCAGTTGAAACGCTTCC TCCACCAAGCTGGTCTGGCCGATTACAAAATTCCGGATCAAATCCAGTTCATC GATCAGTTGCCGAAAACGTCCGTGGGTAAGATTGACAAGAATGCGCTTCGCCG TCGCTTTGATACGCTGGGTTTAGCCTTGATGAGC

VibH genetic sequence cloned within pET21b in this work.

ATGTCCATGCTCCTTGCTCAGAAACCCTTTTGGCAACGCCACTTAGCATATCCG CATATCAACCTCGACACTGTTGCTCACTCGCTGCGTCTGACGGGTCCTCTGGAT ACGACCCTTCTGTTACGTGCCCTGCATCTGACCGTCAGCGAAATTGACTTGTTT CGTGCGCGCTTTTCCGCGCAAGGTGAACTCTATTGGCATCCCTTCAGTCCACCG ATTGACTATCAGGACCTGTCGATTCACTTAGAAGCGGAACCGCTTGCCTGGCG CCAAATTGAGCAGGATCTGCAACGCTCATCGACCCTCATTGATGCCCCGATTA CATCGCACCAGGTTTATCGCCTTTCCCATAGCGAACACCTGATCTATACGCGGG CGCATCACATCGTGCTGGATGGCTATGGCATGATGCTGTTTGAACAGCGTCTGT CACAGCACTACCAGAGCCTGCTGTCCGGTCAAACGCCTACTGCGGCGTTTAAA CCGTACCAGAGCTACTTGGAGGAAGAAGCGGCCTACTTAACCTCGCATCGCTA TTGGCAGGATAAACAGTTTTGGCAGGGTTACTTACGTGAGGCACCTGACCTTA CCCTGACATCTGCAACTTACGATCCGCAACTGAGCCACGCTGTGAGCTTATCCT ATACCCTGAATTCCCAGCTGAATCATCTGCTGCTGAAGTTAGCAAACGCGAAC CAGATTGGGTGGCCTGATGCCTTGGTTGCCCTCTGTGCTCTGTACCTGGAAAGT GCTGAACCAGATGCTCCGTGGCTTTGGCTGCCGTTTATGAACCGTTGGGGTTCG GTAGCAGCGAATGTGCCCGGCTTGATGGTCAACTCACTGCCGCTGCTGCGCTT ATCTGCCCAGCAAACGAGTCTGGGCAATTACTTGAAACAGAGTGGCCAAGCCA TTCGCAGTCTTTATCTGCATGGCCGCTATCGCATCGAGCAGATTGAGCAGGATC AGGGACTGAATGCGGAACAAAGCTACTTCATGTCTCCGTTCATCAACATCCTG CCATTCGAAAGTCCGCATTTTGCCGACTGCCAAACGGAACTGAAAGTGCTGGC GTCAGGGTCAGCCGAAGGCATCAACTTCACCTTCCGTGGAAGCCCGCAGCACG AACTCTGTCTGGACATCACAGCCGATTTGGCGTCTTATCCACAAAGCCATTGGC AGAGCCATTGCGAGCGTTTCCCACGGTTCTTTGAGCAGTTGCTCGCACGCTTTC AGCAAGTCGAACAGGATGTAGCGCGTTTACTGGCAGAACCGGCCGCTTTGGCG GCAACCACCAGCACTCGCGCAATTGCGTCT

CAVibB chimera genetic sequence cloned within pET21a in this work.

ATGAGCCCGATTACCCGTGAAGAACGTCTGGAACGTCGTATTCAGGATCTGTA TGCGAACGATCCGCAGTTCGCAGCAGCCAAACCGGCGACCGCGATTACCGCGG CGATTGAACGTCCGGGTCTGCCGCTGCCGCAGATCATCGAAACGGTGATGACC GGCTATGCGGATCGTCCGGCACTGGCACAACGTAGCGTGGAATTTGTGACCGA TGCGGGCACCGGTCATACCACCCTGCGTCTGCTGCCGCATTTTGAAACCATTAG CTATGGCGAACTGTGGGATCGTATTAGCGCGCTGGCCGATGTTCTGAGCACCG AACAGACCGTGAAACCGGGCGATCGTGTGTGCCTGCTGGGCTTTAACAGCGTG GATTATGCGACCATTGATATGACCCTGGCACGTCTGGGTGCTGTCGCTGTCCCG CTGCAGACCTCTGCTGCGATTACCCAGCTGCAGCCGATTGTGGCGGAAACCCA GCCGACCATGATTGCGGCGAGCGTGGATGCCCTGGCCGATGCGACCGAACTGG CACTGAGTGGTCAAACGGCTACGCGTGTGCTGGTGTTTGATCATCATCGTCAG GTGGATGCGCATCGTGCGGCGGTTGAAAGCGCGCGTGAACGTCTGGCCGGTAG CGCGGTGGTTGAAACCCTGGCCGAAGCGATTGCGCGTGGTGATGTGCCGCGTG GTGCGAGCGCGGGTAGCGCACCGGGCACCGATGTGAGCGATGATAGCCTGGC CCTGCTGATTTATACCTCTGGTAGTACGGGTGCGCCGAAAGGCGCCATGTATCC GCGTCGTAACGTGGCGACCTTTTGGCGTAAACGTACCTGGTTTGAAGGCGGCT ATGAACCGAGCATTACCCTGAACTTTATGCCGATGAGCCATGTGATGGGCCGT CAGATTCTGTATGGCACCCTGTGCAACGGCGGCACCGCGTATTTTGTGGCGAA AAGCGATCTGAGCACCCTGTTTGAAGATCTGGCCCTGGTGCGTCCGACCGAAC TGACCTTCGTCCCGCGTGTTTGGGATATGGTGTTCGATGAATTTCAGAGCGAAG TGGATCGTCGTCTGGTGGATGGCGCGGATCGTGTTGCGCTGGAAGCGCAGGTG AAAGCGGAAATTCGTAACGATGTGCTGGGCGGTCGTTATACCTCTGCTCTGAC GGGTTCTGCTCCGATTAGCGATGAAATGAAAGCGTGGGTGGAAGAACTGCTGG ATATGCATCTGGTGGAAGGCTATGGCAGCACCGAAGCGGGCATGATTCTGATT GATGGCGCGATTCGTCGTCCGGCGGTGCTGGATTATAAACTGGTGGATGTTCC GGATCTGGGCTATTTTCTGACCGATCGTCCGCATCCGCGTGGCGAACTGCTGGT GAAAACCGATAGCCTGTTTCCGGGCTATTATCAGCGTGCGGAAGTGACCGCGG ATGTGTTTGATGCGGATGGCTTTTATCGCACCGGCGATATTATGGCGGAAGTG GGCCCGGAACAGTTTGTGTATCTGGATCGTCGTAACAACGTGCTGAAACTGAG CCAGGGCGAATTTGTTACCGTGAGCAAACTGGAAGCGGTGTTTGGCGATAGCC CGCTGGTGCGTCAGATTTATATTTATGGCAACAGCGCGCGTGCGTATCTGCTGG CCGTGATTGTGCCGACCCAGGAAGCGCTGGACGCGGTCCCGGTTGAAGAACTG AAAGCGCGTCTGGGTGACTCTCTGCAGGAAGTGGCGAAAGCGGCGGGTCTGCA GAGCTATGAAATTCCGCGCGATTTTATTATCGAAACCACCCCGTGGACCCTGG AAAACGGCCTGCTGACGGGTATTCGTAAACTGGCCCGTCCGCAGCTGAAAAAA CATTATGGTGAACTGCTGGAACAAATTTATACCGATCTGGCCCACGGCCAGGC 
GGATGAACTGCGTAGCCTGCGTCAGAGCGGTGCGGATGCGCCGGTGCTGACGA TGCAGCACGATGTAGCAGCGGCGCTGAATCTCTCGGTGGATGAGGTGGACGTA CAGGAGAACCTGTTGTTCCTCGGCCTTGATTCGATTCGCGCGATTCAGCTGCTG GAAAAGTGGAAAGCACAAGGTGCTGATATCTCGTTTGCCCAGTTGATGGAGCA TGTGACCTTACAGCAGTGGTGGCAGACCATTCAGGCCAACTTGCATCAACCGT GTTCGGCC

Appendix 2: PCR primers and PCR conditions

Fusion cloning:

Vector pET21a CARmm A domain PCR primers and PCR conditions:

Forward primer: CAGCACCGGCGCATCCGC

Reverse primer: TGAGATCCGGCTGCTAACAAAGC

Step                  Temperature  Time
Initial denaturation  98°C         30 s
34 cycles             98°C         10 s
                      63°C         30 s
                      72°C         8 min
Final extension       72°C         10 min
Hold                  4°C

PCR conducted using Phusion HF polymerase by NEB following the manufacturer’s 50 µL protocol using HF buffer.

Insert VibB PCR primers and PCR conditions:

Forward primer: AGCAGCCGGATCTCAGGCCGAACACGG

Reverse primer: GATGCGCCGGTGCTGACGATGCAGCACGATGTAGCAG

Step                  Temperature  Time
Initial denaturation  98°C         30 s
34 cycles             98°C         10 s
                      55°C         30 s
                      72°C         30 s
Final extension       72°C         10 min
Hold                  4°C

PCR conducted using Phusion HF polymerase by NEB following the manufacturer’s 50 µL protocol using HF buffer.
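The vector and insert primers listed above appear to carry complementary 5' ends, as is usual for overlap-based fusion cloning. The following Python sketch checks this for both junctions by comparing the reverse complement of the 5' end of each vector primer with the 5' end of the matching insert primer. The primer sequences are copied from this appendix; the helper names `reverse_complement` and `five_prime_overlap` are assumptions made for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def five_prime_overlap(primer_a: str, primer_b: str) -> str:
    """Longest 5' stretch of primer_b that is the reverse complement of the
    equally long 5' stretch of primer_a (empty string if there is none)."""
    for k in range(min(len(primer_a), len(primer_b)), 0, -1):
        if reverse_complement(primer_a[:k]) == primer_b[:k]:
            return primer_b[:k]
    return ""


# Primers as listed in this appendix (5' to 3').
vector_fwd = "CAGCACCGGCGCATCCGC"
vector_rev = "TGAGATCCGGCTGCTAACAAAGC"
insert_fwd = "AGCAGCCGGATCTCAGGCCGAACACGG"
insert_rev = "GATGCGCCGGTGCTGACGATGCAGCACGATGTAGCAG"

junctions = [(vector_fwd, insert_rev), (vector_rev, insert_fwd)]
overlaps = [five_prime_overlap(vec, ins) for vec, ins in junctions]
for overlap in overlaps:
    print(len(overlap), overlap)
```

With these primers, each junction shares a 15-nucleotide overlap (`GATGCGCCGGTGCTG` and `AGCAGCCGGATCTCA`).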

Cloning of Vib genes into vectors:

VibE PCR primers and PCR conditions:

Forward primer: AGGAGATATACCATGGATGACTACGGACTTTACCCCATGG

Reverse primer: GGTGGTGGTGCTCGAGGCTCATCAAGGCTAAACCCAGC

Step                  Temperature  Time
Initial denaturation  98°C         30 s
34 cycles             98°C         10 s
                      66°C         30 s
                      72°C         1 min
Final extension       72°C         10 min
Hold                  4°C

PCR conducted using Phusion HF polymerase by NEB following the manufacturer’s 50 µL protocol using HF buffer.

VibH PCR primers and PCR conditions:

Forward primer: AGGAGATATACCATGGATGACTACGGACTTTACCCCATGG

Reverse primer: GGTGGTGGTGCTCGAGGCTCATCAAGGCTAAACCCAGC

Step                  Temperature  Time
Initial denaturation  98°C         30 s
34 cycles             98°C         10 s
                      52°C         30 s
                      72°C         30 s
Final extension       72°C         10 min
Hold                  4°C

PCR conducted using Phusion HF polymerase by NEB following the manufacturer’s 50 µL protocol using HF buffer.

Mutagenesis of CARni Ser 689 to Ala, PCR primers and PCR conditions:

Forward mutagenic primer: ATCTGGGCGGCGATGCCCTGAGCGCCCTGAG

Reverse non-mutagenic primer: CGGTAAAATGCGCATCCGGACGCATATCC

Step                  Temperature  Time
Initial denaturation  98°C         30 s
34 cycles             98°C         10 s
                      58°C         30 s
                      72°C         9 min
Final extension       72°C         10 min
Hold                  4°C

PCR conducted using Phusion HF polymerase by NEB following the manufacturer’s 50 µL protocol using GC buffer and 3% DMSO.
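The forward mutagenic primer can be checked against the CARni sequence in Appendix 1 to confirm that it changes only the intended codon. The Python sketch below compares the primer with the matching stretch of the wild-type gene and reports the mismatched positions. The template fragment is copied from the CARni sequence in Appendix 1; the helper `mismatches` and the two-entry codon table are hypothetical and cover only this case.

```python
# Only the two codons relevant to this mutation are listed.
CODON_TO_AA = {"AGC": "Ser", "GCC": "Ala"}

# Wild-type CARni stretch (Appendix 1) and the forward mutagenic primer.
template = "ATCTGGGCGGCGATAGCCTGAGCGCCCTGAG"
primer = "ATCTGGGCGGCGATGCCCTGAGCGCCCTGAG"


def mismatches(a: str, b: str) -> list:
    """List (index, base_in_a, base_in_b) for every position where a and b differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


diffs = mismatches(template, primer)

# Assumption: the altered codon starts at the first mismatched base, which is
# consistent with the AGC to GCC change stated in Appendix 1.
start = diffs[0][0]
wild_codon = template[start:start + 3]
mutant_codon = primer[start:start + 3]

print(diffs)
print(f"{wild_codon} ({CODON_TO_AA[wild_codon]}) -> "
      f"{mutant_codon} ({CODON_TO_AA[mutant_codon]})")
```

Two adjacent bases differ between primer and template, giving the serine (AGC) to alanine (GCC) substitution.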

Appendix 3: Example enzyme nickel affinity purification, AKTA UV chromatograms

CARmm-expressing BL21 (DE3) cell lysate, AKTA purification, 280 nm UV trace.

CARni-expressing BL21 (DE3) cell lysate, AKTA purification, 280 nm UV trace.

CARmm-expressing BL21 (DE3) cell lysate, AKTA purification, 280 nm UV trace.

Appendix 4: Example HPLC traces showing CAR-dependent amide formation.

HPLC traces of CARmm-dependent amidation reactions to produce amides 38 and 44–48. 1 mM acid, 100 mM ammonia, 100 µg mL⁻¹ purified CARmm, 10 mM MgCl₂, 5 mM ATP, 100 mM sodium carbonate buffer, pH 9.0, 37°C, 24 h.

HPLC trace of CARmm production of 54. 1 mM 25, 100 mM 51, 100 µg mL⁻¹ purified CARmm, 10 mM MgCl₂, 5 mM ATP, 100 mM sodium carbonate buffer, pH 9.0, 37°C, 24 h.

Example HPLC trace of CARmm production of 55. 1 mM 25, 100 mM 52, 100 µg mL⁻¹ purified CARmm, 10 mM MgCl₂, 5 mM ATP, 100 mM sodium carbonate buffer, pH 9.0, 37°C, 24 h.

Example HPLC trace of CARmm production of 56. 1 mM 25, 100 mM 53, 100 µg mL⁻¹ purified CARmm, 10 mM MgCl₂, 5 mM ATP, 100 mM sodium carbonate buffer, pH 9.0, 37°C, 24 h.

HPLC trace of CARmm production of 58. 1 mM 57, 100 mM 52, 100 µg mL⁻¹ purified CARmm, 10 mM MgCl₂, 5 mM ATP, 100 mM sodium carbonate buffer, pH 9.0, 30°C, 24 h.

HPLC traces of CARmm reactions, conducted in triplicate, of cinnamic acid 28 with the R (top three traces) and S (bottom three traces) enantiomers of 61 to produce 62. 1 mM 28, 100 mM 61, 100 µg mL⁻¹ purified CARmm, 10 mM MgCl₂, 5 mM ATP, 100 mM sodium carbonate buffer, pH 9.0, 30°C, 24 h.
