UNIVERSITY OF CALIFORNIA

Los Angeles

Characterization of the Biosynthesis of Fungal Aromatic Polyketides

A dissertation submitted in partial satisfaction of the requirements for the degree Doctor of Philosophy in Chemical Engineering

By

Yanran Li

2012

ABSTRACT OF THE DISSERTATION

Characterization of the Biosynthetic Pathway of Fungal Aromatic Polyketides

By

Yanran Li

Doctor of Philosophy in Chemical Engineering

University of California, Los Angeles, 2012

Professor Yi Tang, Chair

Nature has endowed microbes and plants with enormous power to synthesize complex natural products from simple building blocks. Compounds derived from natural sources that show biological activities are defined as natural products. Over the past century, natural products have played an indispensable role in pharmaceutical discovery and have become an important element of modern therapy. Aromatic polyketides are an important group of natural products that exhibit extraordinary medicinal activities. Actinomycetes and filamentous fungi are the two major producers of aromatic polyketides, especially polycyclic aromatic polyketides, and they employ distinct enzymatic systems to synthesize these complex structures from malonyl-CoA. Fungi utilize iterative multidomain megasynthases, the type I nonreducing polyketide synthases (NRPKSs), while actinomycetes employ bacterial type II PKSs that comprise a group of dissociated enzymes. Compared to the well-studied type II PKSs, NRPKSs have been much less understood.

In this study, we first identified the unique product template (PT) domain of NRPKS, which is not shared among the other PKS families, as the first ring aldol cyclase. Based on the understanding of PT domain function, we performed phylogenetic analysis of PT domains, which classified the NRPKSs into five major groups. Subsequently, we proposed a sequence-function correlation of PT domains that allowed us to predict the activities of unknown PTs uncovered from genome sequencing. These target PT domains were assayed with the PKS4 (a previously identified NRPKS) minimal PKS domains in trans or in cis, and the proposed PT functions were confirmed through such product-based assays. According to the established sequence-function relationship, we conducted genome mining of A. niger for the biosynthetic locus of TAN-1612, which contains an interesting tetracyclic naphthacenedione core structure.

Subsequent investigations of the TAN-1612 biosynthetic pathway uncovered a novel α-hydroxylation-dependent Claisen cyclization cascade that involves a bifunctional dimanganese metallo-β-lactamase-type thioesterase (MβL-TE). Also, the identification of the TAN-1612 biosynthetic pathway revealed a decaketide synthase AdaA, which enabled us to realize the in vitro enzymatic total synthesis of several important aromatic decaketide intermediates including rabelomycin and tetracenomycin D1.

The dissertation of Yanran Li is approved.

Daniel T. Kamei

James C. Liao

Harold Monbouquette

Yi Tang, Committee Chair

University of California, Los Angeles

2012

TABLE OF CONTENTS

1 Background and Motivation

1.1 Cyclization of Bacterial Aromatic Polyketides

1.1.1 The First Ring Cyclization

1.1.2 Additional Cyclization Steps

1.2 Cyclization of Fungal Aromatic Polyketides

1.2.1 Product Template Domain Mediated Cyclization

1.2.2 Thioesterase Domain Mediated Cyclization

2 Results

2.1 Identification of PT Domain Function and Engineered Biosynthesis of Bacterial Aromatic Polyketides in Escherichia coli

2.1.1 Introduction

2.1.2 Results and Discussion

2.1.3 Conclusion

2.1.4 Materials and Methods

2.2 Classification, Prediction, and Verification of the Regioselectivity of Fungal Polyketide Synthase Product Template Domains

2.2.1 Introduction

2.2.2 Results

2.2.3 Discussion

2.2.4 Materials and Methods

2.3 Comparative Characterization of Fungal Anthracenone and Naphthacenedione Biosynthetic Pathways Reveals an α-Hydroxylation-Dependent Claisen-Like Cyclization Catalyzed by a Dimanganese Thioesterase

2.3.1 Introduction

2.3.2 Results

2.3.3 Discussion

2.3.4 Conclusion

2.3.5 Materials and Methods

2.4 Enzymatic Total Synthesis of Aromatic Decaketides in vitro

2.4.1 Introduction

2.4.2 Results and Discussion

2.4.3 Conclusions

2.4.4 Materials and Methods

3 Conclusions

4 Appendix

4.1 Compound List

4.2 Supporting information for chapter 2.1

4.3 Supporting information for chapter 2.2

4.4 Supporting information for chapter 2.3

4.5 Supporting information for chapter 2.4

5 References

LIST OF FIGURES

Figure 1: Comparison between bacterial and fungal aromatic PKSs

Figure 2: Biosynthesis of tetracyclines and aureolic acids

Figure 3: Biosynthesis of tetracenomycins

Figure 4: Biosynthesis of angucyclines

Figure 5: Biosynthesis of 50 by PKS4

Figure 6: Biosynthesis of 53 by PksA

Figure 7: Comparison between fungal and bacterial iterative polyketide synthases

Figure 8: Construction of PKS4 minimal PKS domains without the PT domain

Figure 9: (A) HPLC analysis (280 nm) of minimal PKS activities in vitro. (B) LC-MS analysis and selected ion monitoring of nonaketides produced by KS-MAT+ACP and PKS_WJ in vitro

Figure 10: UV and mass spectrum of 15 and 51; and two possible cyclization patterns to afford 51

Figure 11: (A) PKS_WJ exhibited comparable kinetic properties as the parent PKS4. (B) Michaelis-Menten properties of the dissociated KS-MAT didomain (3 µM) toward the ACP domain at 2 mM malonyl-CoA

Figure 12: Starter unit selection by PKS_WJ

Figure 13: (A) HPLC analysis (280 nm) of reaction mixtures. (B) (i) C7-C12 cyclized polyketides produced by bacterial minimal PKSs alone without aid of cyclases. (ii) Natural aromatic polyketides with C7-C12-cyclized first ring. (C) Cyclization of the nonaketide backbone to 7

Figure 14: (A) HPLC analysis (430 nm) of SEK26 63 biosynthesis by using E. coli BAP1. (B) High-cell density, fed-batch fermentation production of 63

Figure 15: (A) Biosynthesis of fungal aromatic polyketides. (B) C6-C11/C4-C13 cyclization regioselectivity has been observed in various fungal aromatic polyketides

Figure 16: Phylogenetic analysis of PT domains from 30 NRPKSs that have been related to known aromatic polyketides

Figure 17: (A) Construction and SDS-PAGE analysis of PKSH1. (B) LC-MS (325 nm) analysis of the polyketides synthesized by PKSH1 in vitro

Figure 18: The biosynthesis of 68 and the 1H-13C HMBC correlations of 68

Figure 19: (A) Construction and SDS-PAGE analysis of PKSH2. (B) LC-MS (325 nm) analysis of the polyketides synthesized by PKSH2 in vitro

Figure 20: (A) VrtA has been related to the biosynthesis of viridicatumtoxin in P. aethiopicum. (B) LC-MS (325 nm) analysis of the polyketides synthesized by PKS4 minimal PKS domains with AptA-PT (trace i) and VrtA-PT (trace ii) in vitro

Figure 21: Biosynthetic pathway of 65 and related shunt products

Figure 22: Induction of the ada gene cluster

Figure 23: LC analysis (400 nm) of polyketides synthesized by coexpression of different combinations of ada and apt genes in S. cerevisiae BJ5464-NpgA

Figure 24: Biosynthetic pathway of 70 and related shunt products

Figure 25: In vitro reconstitution of AptB activities

Figure 26: Structural modeling and mutational analysis of AptB

Figure 27: Proposed enzymatic mechanism of MβL-TE

Figure 28: Characterization of the AdaA minimal PKS domains and the in vitro biosynthesis of 34

Figure 29: (A) SDS-PAGE analysis of PKSH_tcm1 and PKSH_tcm2. (B) LC-MS analysis of in vitro assays containing malonyl-CoA and i) relinked AdaA minimal PKS domains; ii) PKSH_tcm1; iii) PKSH_tcm2

LIST OF TABLES

Table 1: NMR Spectral Data for 68 in DMSO-d6

Table 2: Genes Within the ada Cluster from Aspergillus niger ATCC 1015

Table 3: Heterologous expression of different combinations of apt and ada genes and the resulting products

Table 4: ICP-AES analysis of AptB

ACKNOWLEDGMENTS

Section 1 contains reproduced material by permission of The Royal Society of Chemistry from Zhou, H., Li, Y., Tang, Y. (2010). Cyclization of Aromatic Polyketides from Bacteria and Fungi. Nat. Prod. Rep., 27 (6), 839-868. (H.Z. and Y.L. contributed equally.) © by The Royal Society of Chemistry. (http://dx.doi.org/10.1039/B911518H)

Section 2.1 contains reprinted material with permission from Zhang, W., Li, Y., Tang, Y. (2008). Engineered Biosynthesis of Bacterial Aromatic Polyketides in Escherichia coli. Proc. Natl. Acad. Sci. U. S. A., 105 (52), 20683-20685. (W.Z., Y.L., and Y.T. designed research; W.Z. and Y.L. performed research; W.Z., Y.L., and Y.T. analyzed data; and W.Z. and Y.T. wrote the paper.) © by the National Academy of Sciences.

Section 2.2 contains reprinted material with permission from Li, Y., Xu, W., Tang, Y. (2010). Classification, Prediction and Verification of the Regioselectivity of Fungal Polyketide Synthase Product Template Domains. J. Biol. Chem., 285 (30), 22762-22771. (Y.L. and Y.T. designed research; Y.L. and X.W. performed research; Y.L. and Y.T. analyzed data and wrote the paper.) © by the American Society for Biochemistry and Molecular Biology.

Section 2.3 contains reprinted material with permission from Li, Y., Chooi, Y.-H., Sheng, Y., Valentine, J. S., Tang, Y. (2011). Comparative Characterization of Fungal Anthracenone and Naphthacenedione Biosynthetic Pathways Reveals an alpha-Hydroxylation-Dependent Claisen-Like Cyclization Catalyzed by a Dimanganese Thioesterase. J. Am. Chem. Soc., 133 (39), 15773-15785. (Y.L., Y.C., Y.S. and Y.T. designed research; Y.L. and Y.S. performed research; Y.L., Y.S., J.V. and Y.T. analyzed data; Y.L., Y.C., Y.S., J.V. and Y.T. wrote the paper.) © 2011 American Chemical Society.

My thesis advisor, Prof. Yi Tang, is the principal investigator for the above publications. The work in this dissertation was funded by NIH Grants 1R01GM085128 and 1R01GM092217 and the David and Lucile Packard Fellowship in Science and Engineering to Y.T.

In addition, I would like to thank the many individuals who have helped me accomplish this project and supported my personal and professional development along the way.

First, I am deeply grateful to my thesis advisor, Professor Yi Tang, for his support throughout my pursuit of the PhD degree over the past four-plus years. The dedicated and careful training I received from him in fundamental knowledge and laboratory skills, research and teaching, and responsibility and attitude has provided me with continual motivation and guidance toward academic and professional success. I am indebted to him not only for the incredible research opportunity, but also for the chance to work with an advisor from whom I could learn so much for my personal and professional development.

I am also very thankful to my committee members, Professor Daniel Kamei, Professor James Liao and Professor Harold Monbouquette, for their support of my doctoral candidacy. Their insightful questions and discussions along the way have helped me a lot over the years. Additional thanks to Professor James Liao for letting me use equipment from his laboratory and for insightful professional discussions. Moreover, I would like to thank Professor Joan S. Valentine for the very helpful discussion on the dimanganese metallo-β-lactamase thioesterase, her effort to modify the of AptB, and the use of equipment from her laboratory for the AptB characterization.

In addition, I would like to thank the following people for their contributions to the research described here: Dr. Wenjun Zhang for her patient and responsible training and guidance during my first one and a half years in the lab; Yuewei Sheng for her valuable contributions to the AptB study, as well as many helpful discussions; Wei Xu for his efforts to help me in the PT phylogenetic study, especially with compound purification and structural elucidation; Dr. Xinkai Xie for his help in conducting NMR in the PT phylogenetic study; Dr. Yitheng Chooi for teaching me how to conduct phylogenetic analysis and fungal transformation, and also for his contributions to the AptB story; Sherry (Xue) Gao for her help in crystallization and for teaching me how to do fungal transformation, as well as many helpful discussions; Ryan (Peng) Wang for always providing me with protoplasts and helpful discussions; Dr. Jacklyn Winter for her effort in revising the manuscript of the AptB story, as well as many helpful discussions; Dr. Duilio Cascio for his help with my attempts at protein crystallization of AptA-PT and AptB; Professor Jurgon Rohr for providing me with the GilF, JadD and RavG expression plasmids; and all Tang lab members for their helpful discussions.

On a personal note, I would like to thank all Tang lab members, past and present, for comprising such an incredible group of generous, kind-spirited and hard-working people. I want to say thanks to everyone: to Dr. Wenjun Zhang for always being like an elder sister and encouraging me when I got depressed; to Dr. Zhen Gu not only for telling funny stories all the time, but also for advising me when I felt lost about life and research; to Sherry (Xue) Gao for always being so supportive and encouraging; to Angelica Zabara for being so kind and always bringing happiness to us; to Dr. Jacklyn Winter for being so nice and always providing me with professional suggestions; and to everyone for their encouragement, kindness, spirit of community and constant willingness to help each other out. I have always believed that we gathered here due to destiny. I value the time I have spent with you all so much, and I wish you all the best for the future.

Also, I would like to thank all the professors and staff in the department, who have been so nice, helpful and supportive all the time. Additional thanks to Professor Yunfeng Lu for his help with my transfer from Rice University to UCLA, as well as for being so nice all the time and providing me with insightful advice on life; to Professor Tatianna Segura, Professor Yoram Cohen and Professor Robert Hicks for their wonderful and very helpful lectures; to John Burger, Victoria L. Ybiernas and Kimberly Claffey for always being so kind and patient in helping me, especially when I was in a dilemma and did not know what to do; and to Alain C. De Vera and Miguel Perez for being very patient in processing the orders and my timesheets, especially when there were problems processing the documents.

Furthermore, I am deeply grateful to all the professors in the Department of Chemical Engineering at Rice University, where I studied before I transferred to UCLA. Special thanks to Professor Ramon Gonzalez and Professor Sibani Lisa Biswal for guiding me into the world of interdisciplinary research involving biology and chemistry.

I want to thank all my best friends for always being so patient with my blathering, so encouraging when I was depressed, so supportive when I was in trouble, and so kind to me all the time. Without the encouragement from all of you, I would not be as happy every day. All of you have been important sources of spiritual support throughout my personal and professional life.

Finally, I am indebted to my parents and grandparents. In the past five years, I went back to China to reunite with my family only once because I was too busy working on my projects. I miss them so much and I feel so sorry that I let them miss me so much; however, they have always been so supportive, and that means a lot to me. I am deeply grateful to my parents and grandparents not only for loving me so much and providing me with such a perfect family, but also for insistently teaching me to be a pure and honest person. I firmly believe that the things they taught me are the most important education of my life and are critical to my professional and personal development. And specially, I would like to thank my husband, Zheng Chen, for always understanding and taking care of me. Of my 25 years of life so far, I think the past four years have been the most challenging period, and without his spiritual, emotional and intellectual support I would not have been able to make it all the way through. Your love and support mean the world to me.

VITA

May 1st, 1986: Born, Dongying, Shandong Province, PRC

2007: B.S., Chemical Engineering, Tianjin University, Tianjin, PRC

2007: B.S., Chemistry, Nankai University, Tianjin, PRC

2009-2012: Teaching Assistant, Department of Chemical and Biomolecular Engineering, University of California, Los Angeles, USA

2008-2012: Research Assistant, Department of Chemical and Biomolecular Engineering, University of California, Los Angeles, USA

PUBLICATIONS AND PRESENTATIONS

Zhang, W., Li, Y., Tang, Y. (2008). Engineered Biosynthesis of Bacterial Aromatic Polyketides in Escherichia coli. Proc. Natl. Acad. Sci. U. S. A., 105 (52), 20683-20685.

Zhou, H., Li, Y., Tang, Y. (2010). Cyclization of Aromatic Polyketides from Bacteria and Fungi. Nat. Prod. Rep., 27 (6), 839-868. (Equal contribution)

Li, Y., Xu, W., Tang, Y. (2010). Classification, Prediction and Verification of the Regioselectivity of Fungal Polyketide Synthase Product Template Domains. J. Biol. Chem., 285 (30), 22762-22771. (Highlighted by Nat. Chem. Biol.)

Li, Y., Chooi, Y.-H., Sheng, Y., Valentine, J. S., Tang, Y. (2011). Comparative Characterization of Fungal Anthracenone and Naphthacenedione Biosynthetic Pathways Reveals an alpha-Hydroxylation-Dependent Claisen-Like Cyclization Catalyzed by a Dimanganese Thioesterase. J. Am. Chem. Soc., 133 (39), 15773-15785.

Liu, Q., Yao, F., Chooi, Y.-H., Kang, Q., Xu, W., Li, Y., Shi, Y., Deng, Z., Tang, Y., You, D. (2012). Elucidation of Piericidin A1 Biosynthetic Locus Revealed a Thioesterase-dependent Mechanism of α-Pyridone Ring Formation. Chem. Biol., 19 (2), 243-253.

Li, Y., Tang, Y. (2010). Classification, Prediction and Verification of the Regioselectivity of Fungal Polyketide Synthase Product Template. Presented at the Society for Industrial Microbiology Annual Meeting, San Francisco, CA.

Li, Y., Tang, Y. (2011). Targeted Fungal Genomic Mining of Novel Tetracyclic Polyketide Synthesis. Presented at the American Chemical Society Annual Meeting, Anaheim, CA.

1 Background and Motivation

A full version of this section was originally published in Nat. Prod. Rep.: Zhou, H., Li, Y., Tang, Y. (2010). Cyclization of Aromatic Polyketides from Bacteria and Fungi. Nat. Prod. Rep., 27 (6), 839-868. © by The Royal Society of Chemistry.

Aromatic polyketides comprise an important and structurally diverse group of natural products synthesized by microbes and plants.(1) This family includes a number of highly successful pharmaceutical compounds, including tetracycline, daunorubicin and mithramycin. On the other hand, some human and plant diseases have been linked to fungal mycotoxins,(2) many of which are aromatic polyketides. The rich biological activities, both beneficial and harmful, and the structural complexity of aromatic polyketides have attracted research interest in the discovery and biosynthesis of these compounds.

Most aromatic polyketides are highly substituted, fused-ring polyphenols. Each compound in this family contains an aromatic aglycon derived from a poly-β-ketone backbone. Each aglycon is characterized by three structural features: i) the number of rings, as determined by the overall length of the carbon backbone; ii) the cyclization pattern, as dictated by the regioselectivities of the individual ring-fusing carbon-carbon bonds; and iii) the ring topology, as a result of different oxidative rearrangement reactions. The aglycon can be further decorated by tailoring reactions such as methylation, amination, oxidation and glycosylation to afford the bioactive natural products.(3)

Despite the structural similarity shared between bacterial and fungal aromatic polyketides, numerous feeding experiments using isotopically labeled precursors revealed fundamental differences in cyclization strategies (Figure 1). In a comprehensive review by Thomas, the cyclization modes of fused-ring aromatic polyketides are classified into F-mode for those of fungal origin and S-mode for those from filamentous bacteria, with some exceptions.(4) In F-mode, the first cyclized phenol ring contains two intact acetate building blocks, with the other two carbons coming from two different acetate units. In S-mode, three intact acetate units are incorporated to assemble the first ring. Using the polyketide carbon numbering system in which the thioester α-carbon is designated C1 and the remaining carbons are numbered consecutively, these F- and S-modes can be designated numerically. In F-mode, C2-C7, C4-C9 and C6-C11 first ring cyclizations are commonly observed, while in S-mode, the most commonly found patterns are C7-C12 and C9-C14.
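The F-/S-mode distinction can be checked by simple counting. The short Python sketch below is an illustration only and is not part of the original work: it assumes that acetate units pair consecutive backbone carbons as (C1, C2), (C3, C4), and so on, and that the first ring is six-membered, so a Cm-Cn closure spans carbons m through n. The function names are hypothetical. Under these assumptions the sketch reproduces the assignments listed above.

```python
def intact_acetate_units(first, last):
    """Return the (odd, even) carbon pairs lying wholly inside the ring.

    Assumption: acetate units are the pairs (C1, C2), (C3, C4), ...
    """
    ring = set(range(first, last + 1))
    return [(c, c + 1) for c in range(1, last, 2) if c in ring and c + 1 in ring]


def cyclization_mode(first, last):
    """Classify a six-membered first-ring closure C<first>-C<last> as 'F' or 'S'."""
    if last - first != 5:
        raise ValueError("expected a six-membered ring, e.g. C7-C12")
    # Three intact acetate units -> S-mode; two intact units plus two
    # carbons from different units -> F-mode.
    return "S" if len(intact_acetate_units(first, last)) == 3 else "F"


if __name__ == "__main__":
    for first, last in [(2, 7), (4, 9), (6, 11), (7, 12), (9, 14)]:
        print(f"C{first}-C{last}: {cyclization_mode(first, last)}-mode")
```

With this pairing, closures that start at an odd-numbered carbon enclose three intact units, while those that start at an even-numbered carbon enclose two.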

[Figure 1 schematic: a bacterial aromatic PKS (dissociated KS, MAT, ACP and CYC components; S-mode cyclization) and a fungal aromatic PKS (SAT, KS, MAT, PT, ACP and TE/CLC domains; F-mode cyclization).]

Figure 1. Comparison between bacterial and fungal aromatic PKSs and the respective cyclization modes.

The apparently orthogonal strategies employed by bacteria and fungi to cyclize reactive poly-β-ketone backbones are attributed to the different regioselectivities of the cyclization enzymes associated with the respective aromatic polyketide synthases (PKSs). An aromatic PKS represents the collection of enzymes required to synthesize the carbon backbone from acetate building blocks and modify the backbone into the aromatic polyketide. The poly-β-ketone backbones of bacterial and fungal polyketides are produced by the minimal PKS, which consists of the ketosynthase (KS), acyl-carrier protein (ACP) and the malonyl-CoA:ACP transacylase (MAT).(5) Polymerization proceeds through the iterative reactions of i) selection of malonyl-CoA by MAT as a building block; ii) malonylation of ACP by MAT; iii) decarboxylative condensation catalyzed by the KS; and iv) transfer of the extended polyketide product to the KS and departure of the unacylated holo-ACP.
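The bookkeeping of this iterative cycle can be sketched in a few lines of Python. This is a simplified illustration, not a kinetic or mechanistic model: it assumes a two-carbon acetate starter unit, and the function and dictionary names are hypothetical. Each cycle loads a three-carbon malonyl unit and loses one carbon as CO2 in the decarboxylative condensation, for a net gain of two carbons.

```python
# Chain-length names used in this chapter, keyed by number of ketide units.
KETIDE_NAMES = {8: "octaketide", 9: "nonaketide", 10: "decaketide", 12: "dodecaketide"}


def assemble_backbone(n_ketide_units):
    """Return (extension_cycles, backbone_carbons) for an acetate-primed chain.

    Assumption: the chain starts from a two-carbon acetate starter unit.
    """
    if n_ketide_units < 2:
        raise ValueError("need a starter unit plus at least one extension")
    carbons = 2  # acetate starter unit
    cycles = 0
    while carbons < 2 * n_ketide_units:
        # Steps i-ii: MAT selects malonyl-CoA and malonylates the ACP (+3 C).
        # Step iii: KS-catalyzed decarboxylative condensation releases CO2 (-1 C).
        # Step iv: the extended chain returns to the KS for the next round.
        carbons += 3 - 1
        cycles += 1
    return cycles, carbons


if __name__ == "__main__":
    for n, name in KETIDE_NAMES.items():
        cycles, carbons = assemble_backbone(n)
        print(f"{name}: {cycles} extension cycles, {carbons} backbone carbons")
```

For example, an octaketide corresponds to seven extension cycles and a 16-carbon backbone, and a decaketide to nine cycles and 20 carbons.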

Following the landmark discovery of the actinorhodin (act) gene cluster from Streptomyces coelicolor A3(2),(6, 7) a large number of aromatic polyketide biosynthetic pathways have been elucidated from bacterial hosts. The bacterial aromatic PKS, also known as type II PKS, is comprised of a collection of dissociated enzymes (Figure 1). The minimal PKS includes the KS (KSα)-Chain Length Factor (CLF or KSβ) heterodimer and a dissociated ACP.(1) The minimal PKS shares bacterial fatty acid synthase FabD as the MAT.(8) The chain length, or the number of Claisen-like chain extension steps, is largely determined by the CLF.(9) In bacteria, the observed S-mode cyclization regioselectivity of the backbone is predominantly controlled by dissociated tailoring enzymes such as cyclases. Each biosynthetic gene cluster contains 2-4 cyclases, depending on the number of rings in the final product. In the absence of cyclases, aberrant intramolecular cyclization occurs readily and leads to formation of shunt products. The programming rules of different families of cyclases have been determined through various mix-and-match experiments in both native(10) and heterologous hosts.(11)

In contrast to the dissociated nature of the bacterial enzymes, fungal polyketides are mostly synthesized by megasynthases in which catalytic components are present on a single polypeptide.(12) The individual domains are linearly juxtaposed to resemble an enzymatic assembly line (Figure 1). Unlike bacterial type I modular PKSs,(13) the single set of domains found in a fungal megasynthase is employed iteratively towards the synthesis of polyketides. Cox recently classified fungal megasynthases into three categories based on the degrees of β-reduction of the final product.(14) Fungal aromatic polyketides are mainly synthesized by nonreducing polyketide synthases (NRPKS) in which no reductive domains are employed during the elongation steps of the polyketide chain. In addition to the minimal PKS (KS, MAT and ACP), a typical NRPKS consists of a starter-unit:ACP acyltransferase (SAT) domain that selects starter units;(15) and a PT domain that controls the immediate cyclization steps of the nascent polyketide.(16, 17) Frequently a thioesterase/Claisen-like cyclase (TE/CLC) is found at the C-terminus of the megasynthase to catalyze additional cyclization steps and product release.(18) The cyclization regioselectivities of the PT and TE domains give rise to the F-mode cyclization observed for nearly all the fungal aromatic polyketides.

Unlike the bacterial type II PKS, which has been thoroughly characterized, the mechanism by which NRPKSs synthesize fungal aromatic polyketides was not well understood until recently. My graduate research is aimed at understanding the enzymatic basis for the biosynthesis of aromatic polyketides in filamentous fungi. As mentioned above, fungal and bacterial aromatic PKS systems are completely distinct in that NRPKSs are multidomain megasynthases, whereas type II PKSs comprise a group of dissociated enzymes. The specific goals of my PhD studies are 1) to understand the sequence-activity relationship of NRPKSs, 2) to investigate novel aromatic PKSs, and 3) to bridge the structural and functional distance between bacterial and fungal aromatic PKSs.

1.1 Cyclization of Bacterial Aromatic Polyketides

1.1.1 The First Ring Cyclization

The nascent poly-β-ketone synthesized by the minimal PKS is extremely reactive and prone to spontaneous cyclization. As a result, aromatic PKSs must fix the regioselectivity of the first intramolecular aldol condensation immediately following completion of the polyketide chain. Regioselectivity of the first cyclization event also dictates the structure of the final aglycon in that it determines where the carbon backbone “turns the corner”, as described by Rawlings.(19) The first cyclization step and the corresponding cyclases have been well-studied during the past 20 years, among which C7-C12 and C9-C14 cyclizations are shown to be the most widely adopted cyclization patterns. These two cyclization regioselectivities lead to the formation of the most important bacterial aromatic polyketide scaffolds, including tetracyclines, anthracyclines, tetracenomycins, angucyclines and benzoisochromanequinones. Other first ring cyclization regioselectivities are also observed in the presence of nonacetate starter units or longer polyketide backbones.(20)

C7-C12 first ring cyclization

Early studies using the heterologous host/vector pairs(11) suggest that the minimal PKSs synthesize C7-C12 cyclized polyketides as the major products, such as SEK4 1 synthesized by the octaketide act minimal PKS,(21) SEK15 2 synthesized by the decaketide tetracenomycin (tcm) minimal PKS,(21) and a set of compounds such as TW93c 3 synthesized by the dodecaketide whiE minimal PKS.(22) These compounds are all derived from regioselective C7-C12 aldol condensation, followed by spontaneous cyclization of the remaining portions of the polyketide chain. Although minor products with other first-ring cyclization regioselectivities were also observed from these PKSs, such as C10-C15 SEK4b 4,(23) C9-C14 SEK15b 5,(24) and C8-C13 TW93f 6,(20) the formation of 1-3 in the absence of additional cyclases suggests that the various KS-CLFs have significant influence on the folding of the polyketide chain and promote the observed C7-C12 cyclizations. In 2004, Keatinge-Clay et al. solved the crystal structure of the act KS-CLF heterodimer and suggested the possible chain length control mechanism and conformation of the growing polyketide chain.(25) As predicted via previous mutagenesis studies,(9) the interface between KS and CLF forms the back wall of the substrate binding pocket and defines the length of the active site tunnel. When the growing poly-β-ketone chain reaches the interface, additional elongation of the chain must be accompanied by buckling of the backbone to fit into the cavity. Therefore, the size of the tunnel not only determines the chain length of the poly-β-ketone intermediate, but also directs where the linear intermediate folds. In the act KS-CLF structure, a positioned in the middle of the pocket has been suggested to act as a proton donor and to facilitate the C7-C12 aldol condensation. The structure illustrates the possibility of a similar distance between the active site cysteine and the back wall of the substrate binding pocket among all KS-CLFs and rationalizes the preferential C7-C12 cyclization regardless of final chain length.

In addition to the possible KS-CLF facilitated C7-C12 cyclization, putative cyclases that can direct first ring C7-C12 cyclization of unreduced polyketide backbones are found in several gene clusters. These include ZhuI from the R1128 pathway,(26) StfQ from the steffimycin pathway,(27) and MtmQ from the mithramycin pathway.(28) The precise roles of these enzymes have been difficult to confirm as their activities cannot be assayed independently from the C7-C12 cyclization directed by the KS-CLF. The cyclase activities of these enzymes were recently confirmed using a polyketide backbone generated by an engineered fungal minimal PKS derived from the Gibberella fujikuroi PKS4,(17) which does not promote C7-C12 cyclization of the nascent polyketide backbone. Addition of these cyclases to the PKS4 minimal PKS resulted in the exclusive synthesis of C7-C12 cyclized nonaketide NonaSEK4 7. Therefore, these enzymes are indeed C7-C12 cyclases, although they may not be essential for the biosynthesis of the associated natural products. Instead, these enzymes may serve ancillary roles to ensure that all polyketides are correctly cyclized. The crystal structure of ZhuI was recently reported by Tsai and coworkers and the structure is similar to that of the C9-C14 first ring cyclase TcmN (N-terminal).(29) Compared to TcmN, ZhuI shows a smaller substrate binding pocket that favors bending of the poly-β-keto backbone between C7 and C12 instead of C9 and C14.(30)

Another enzyme that can influence the C7-C12 cyclization is the C9-ketoreductase (C9-KR) associated with aromatic polyketides that are reduced at the C9 position, such as anthracyclines and angucyclines. ActIII (act KR) has been the most widely studied C9 KR and reduces polyketide backbones of all lengths. Heterologous host/vector studies have found that polyketides reduced at C9 are cyclized exclusively in the C7-C12 regioselectivity, thereby raising the possibility that the C9 KR may also provide an environment to promote the regioselective aldol condensation.(31) For instance, the C9-reduced octaketide, decaketide and dodecaketide all undergo C7-C12 cyclization and spontaneously rearrange to become mutactin 8,(11) RM20b/c 9,(32) and TW94d 10(22) or YT231a 11,(33) respectively. Addition of a first ring aromatase/dehydratase leads to aromatization of the first ring, affording compounds such as SEK34 12(34) and SEK43 13.(35) Dissecting the role of act KR in promoting C7-C12 cyclization was again complicated by the intrinsic activities of the KS-CLFs. In 2008, Tsai and coworkers co-crystallized act KR with the anthraquinone and suggested that act KR may prefer partially cyclized substrates, such as the C7-C12 cyclized polyketide intermediate.(36) Recently, Ma et al. demonstrated that the act KR can reduce the polyketide backbone synthesized by the G. fujikuroi PKS4 minimal PKS and afford compounds that are not only reduced at C9, but are also regioselectively cyclized between C7 and C12.(37) Therefore, it is possible that buckling of the polyketide backbone takes place in the active site of the C9 KR, which simultaneously positions the correct ketone group for reduction and facilitates the spontaneous C7-C12 first ring cyclization.

[Structures of compounds 8-13, with C7 and C12 labeled.]

C9-C14 first ring cyclization

For unreduced aromatic polyketide backbones that are decaketides or larger, C9-C14 first ring cyclization is a common cyclization pattern, giving rise to compounds in the tetracenomycin and pentangular families. Addition of a C9-C14 specific cyclase, such as the widely studied TcmN from the tetracenomycin (tcm) gene cluster, to different minimal PKSs can override the C7-C12 cyclization and afford a different set of polyketides. When TcmN is introduced, the act, frenolicin nonaketide synthase (fren), tcm, and whiE PKSs synthesize the C9-C14 cyclized products RM77 14,(38) PK8 15,(39) RM80 16,(38) and TW95a 17,(22) respectively. The fused second rings of 14-17 are all derived from subsequent C7-C16 cyclization, which is likely also catalyzed by TcmN. Therefore, TcmN can be considered a first- and second-ring cyclase for longer polyketide backbones.

The N-terminal cyclase portion of TcmN (N-TcmN) was successfully crystallized by Ames et al. in 2008.(29) They reported that N-TcmN adopts a helix-grip fold and contains an interior pocket for binding of the polyketide intermediate. The pocket geometry orients the polyketide to buckle between C9 and C14. Docking simulations indicated that the residues within the cavity interact with much of the buckled polyketide, including C9, C13, C14 and C20. In addition, the docked polyketide adopts a conformation in a narrower portion of the cyclization chamber that positions C7 in close proximity to C16, thereby possibly mediating cyclization of the second ring. In contrast, when the poly-β-ketone buckled between C7 and C12 was docked during simulation, a much weaker binding interaction between the polyketide and the cavity was predicted. Site-directed mutagenesis studies showed that two highly conserved residues among cyclases of similar functions, R69 and Y35, are essential for the C9-C14 regioselectivity. Mutation of these residues resulted in the loss of the C9-C14 regioselectivity and led to the synthesis of C7-C12 cyclized products. In comparison to the N-TcmN structure, the crystal structure of WhiE-OrfVI, which directs C9-C14 cyclization of dodecaketides, shows a larger cyclization chamber that is needed to accommodate a longer polyketide backbone.(30)

Other first ring cyclization modes

In the presence of the minimal PKS alone, most acetate-primed polyketides are subjected to C7-C12 cyclization as discussed previously. Interestingly, when the starter unit is replaced with an amide starter unit, such as the malonamate starter unit from the tetracycline pathway, the cyclization regioselectivity is altered. Coexpression of the oxytetracycline (oxy) minimal PKS with the amidotransferase OxyD in a heterologous host yielded a C-19 amidated polyketide that adopted C11-C16 first ring cyclization, followed by C9-C18 and N-C7 cyclization, to yield WJ85 18.(40) Upon addition of act KR to reduce the C9 ketide, the amidated polyketide cyclized via C13-C18 connectivity, followed by N-C11 cyclization to form the isoquinolone WJ35 19.(41) The synthesis of 19 further supports the possibility that C9-KR accepts an uncyclized nascent polyketide intermediate. Formation of 18 and 19 suggests that the amide starter unit has a significant influence on the conformation of the polyketide chain in either the KS-CLF or the act KR. The regioselectivity of first ring cyclases, however, is not affected by the amide starter unit. Addition of a C7-C12 or C9-C14 cyclase to either the C9-reduced or unreduced amidated backbones led to the synthesis of WJ78 20 or WJ150 21, respectively.(42) The structures of both compounds reflect the expected regioselectivities of the two cyclases. Interestingly, both products are benzopyrones that are formed through the spontaneous displacement of the amide by a favorably positioned phenol.

1.1.2 Additional Cyclization Steps

Following initial cyclization of the first ring through either C7-C12 or C9-C14, additional cyclases are recruited to perform subsequent regioselective cyclizations. The remainder of the polyketide chain can either be folded in a linearly fused fashion to yield compounds in the tetracycline, anthracycline or tetracenomycin families, or be folded in an angularly fused fashion to yield angucycline, pentangular benzo[a]naphthacene, or discoid-folded polyketides. Cyclases, both individually and sometimes synergistically, play crucial roles in controlling the regioselectivities of these successive steps, as well as in preventing spontaneous cyclizations.

Among the structurally diverse bacterial aromatic polyketides, three groups that are frequently involved in my PhD studies will be discussed in more detail here: (1) tetracyclines and aureolic acids, (2) tetracenomycins, and (3) angucyclines.

Tetracyclines and aureolic acids (C7-C12, C5-C14, C3-C16, C1-C18)

Compounds in the tetracycline and aureolic acid families are derived from decaketide backbones. These two families, although structurally distinct, are both tailored from naphthacene intermediates (Figure 2). Four successive cyclization steps are required, starting from first ring C7-C12 closure, followed by formation of C5-C14, C3-C16 and C1-C18 connectivities. Differences in the decorations of the naphthacene core can arise from starter unit choice (amide or acetate), and/or C9 reduction. For example, the tetracyclines such as oxytetracycline 22 are primed with an amide starter unit and are generally ketoreduced at C9; whereas the aureolic acids such as mithramycin 23 are primed with acetate starter units and are not reduced at C9.


Figure 2. Biosynthesis of tetracyclines and aureolic acids.

The tetracyclines are among the most important antibiotics discovered in the last century. They were used widely to treat both gram-positive and gram-negative bacterial infections due to their broad-range activities. This family is relatively small, with chlorotetracycline and 22 as the best known examples. The 2-naphthacenecarboxamide scaffold of tetracyclines is derived from the naphthacene intermediates via tailoring steps such as methylation, oxidation and reductive amination. The biosynthetic steps have recently been reconstituted systematically in S. coelicolor strain CH999.(40-44) Cyclization steps leading to the formation of the naphthacene core were investigated and only two cyclases were shown to be necessary to cyclize all four rings in the presence of the amide starter unit. OxyK was identified as the first ring cyclase/dehydratase that mediates C7-C12 cyclization of the backbone (note that without OxyK, the polyketide spontaneously cyclizes into 19).(43, 45) OxyN was identified as the second ring cyclase that mediates C5-C14 cyclization, while cyclization of the third ring is believed to be spontaneous. No enzyme has been associated with the cyclization of the fourth ring so far. The gene cluster of a third member of the tetracycline family, SF2575 24,(46) was recently sequenced from Streptomyces sp. SF2575. Biochemical analysis showed that the aglycon of 24 is also derived from the naphthacene intermediate, which in turn is also cyclized from an amidated backbone. Reconstitution work using enzymes from this gene cluster, however, revealed that cyclization of the last ring requires the action of an acyl-CoA ligase, SsfL2. Therefore, the exact mechanism by which the C1-C18 cyclization occurs in this family of compounds is not completely established.

Aureolic acids are a group of linearly-fused, tricyclic aromatic polyketides. The biosynthesis of compounds in this family has been demonstrated to go through a tetracyclic naphthacene intermediate. Among the aureolic acids, 23 is the best studied example.(47, 48) 23 was first proposed to be derived from a tetracenomycin-like scaffold based on heterologous expression of the putative last-ring cyclase MtmX from the biosynthetic pathway of 23 in Streptomyces glaucescens Tu49.(49) Subsequently, the key intermediate premithramycin B 25 was isolated from a genetically modified strain of Streptomyces argillaceus.(47) The structural similarity between the aglycon of 25 and that of 22 led to the conclusion that 23 is more likely to be derived from a tetracycline-like cyclization pathway. Indeed, the biosynthetic pathways of 23 and 22 share a number of highly homologous enzymes, including cyclases and tailoring enzymes.(28, 41) The oxygenase MtmOIV was identified to transform 25 into the aureolic acid structure via a Baeyer-Villiger oxidative cleavage of the fourth ring.(50, 51) An MtmOIV homologue also resides in the biosynthetic gene cluster of 26,(52) which is an aureolic acid derived from the tetracyclic intermediate prechromomycin B 27.(53)


The biosynthetic gene cluster of a related compound, polyketomycin 28, was recently identified and sequenced.(54) Based on comparative gene cluster analysis and gene knockout, biosynthesis of the aglycon of 28 is expected to closely parallel that of 22 and 23. 28 contains novel structural features not observed among the tetracyclines, including O-glycosylation and oxidation of the first ring.

Tetracenomycins (C9-C14, C7-C16, C5-C18, C2-C19)

Enzymes involved in the biosynthesis of tetracenomycin C 29, such as the minimal tcm PKS and the cyclase TcmN, have been widely studied and utilized in the reconstitution of aromatic PKS enzymes.(10) In contrast to the anthracyclines and tetracyclines, biosynthesis of the tetracenomycin scaffold is initiated with a regiospecific C9-C14 first ring cyclization (Figure 3). This is followed by C7-C16 and C5-C18 aldol condensations to yield the anthracene Tcm F2 30.(55) Although TcmN is known to catalyze formation of the first two rings, it remains unclear whether the C5-C18 third ring cyclization occurs spontaneously or is enzyme-mediated. The combination of tcm minimal PKS and TcmN produces 16 in CH999,(38) yet the same enzyme combination produces 30 as the major product in S. lividans.(56) A dedicated cyclase, TcmI, catalyzes the C2-C19 cyclization to yield the naphthacene carboxylic acid Tcm F1 31.(57) Interestingly, the cyclization regioselectivity of the last ring in 31 is identical to that found in anthracyclines. It is, however, a true aldol condensation, as elimination of water takes place to yield the completely aromatic scaffold. Downstream tailoring steps such as triple hydroxylation furnish the highly oxidized aglycon.(58) The antitumor drug elloramycin 32 shares the same aglycon with 29, while the C3 hydroxyl group of 32 is glycosylated with a permethylated L-rhamnose.(59, 60)


Figure 3. Biosynthesis of tetracenomycins.

Structural analysis of TcmI showed that it is a dimer consisting of two βαβ ferredoxin-like folds.(57) Most of the residues forming the catalytic cavity are hydrophobic, thereby providing a favorable environment for binding of the anthracene carboxylic acid 30. Four residues, H26, D27, R40 and H51, have been proposed through mutational analysis to play important roles in mediating the aldol condensation reaction, yet none of the single mutants at these positions causes complete loss of cyclization activity.(57) Therefore, these residues may function to steer the C2-C19 regioselectivity and may not play any catalytic roles. The different three-dimensional structures and catalytic mechanisms indicate that fourth-ring cyclases from anthracycline and tetracenomycin pathways may have evolved independently.

Angucyclines (C7-C12, C5-C14, C4-C17, C2-C19)

Angucyclines are benz[a]anthracenes that have an angularly cyclized third ring, which results in the bending of the fourth ring with respect to the rest of the molecule (Figure 4). Starting from a C9 reduced decaketide backbone, regioselective cyclizations of C7-C12 and C5-C14 fix the first two rings, in an identical fashion to the steps in anthracycline biosynthesis. The unique C4-C17 cyclization angularly directs the orientation of the uncyclized portion of the backbone, and leads to the closure of the last ring via C2-C19 cyclization and spontaneous decarboxylation to afford the common angucycline intermediate UWM6 33.(61) 33 can also readily dehydrate and oxidize to afford the well known intermediate rabelomycin 34.(61)

Therefore, fixing the C4-C17 regioselectivity is a critical step in differentiating the tailoring steps of anthracyclines and angucyclines. The most well-studied angucycline is jadomycin B 35 produced from Streptomyces venezuelae,(62) and the cyclase responsible for the angular cyclization steps has been identified as JadI. JadI has been associated with mediating C4-C17 and C2-C19 cyclizations, as well as the dehydration of 33 en route to 34.(61)
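The cyclization patterns given in the three subsection headings above can be checked with a small graph calculation. The following is only an illustrative sketch, not part of the original work: it assumes a 20-carbon decaketide backbone numbered C1 to C20, treats the backbone as a linear chain, and ignores heteroatoms and tailoring steps. The function name `ring_sizes` and the `PATTERNS` table are hypothetical helpers introduced for this illustration.

```python
from collections import deque


def ring_sizes(chain_length, closures):
    """Return the size of the ring formed by each successive C-C closure.

    The backbone is modeled as a linear chain C1..Cn. Each closure (i, j)
    adds one bond; the new ring consists of the shortest existing path
    from Ci to Cj, closed by that new bond.
    """
    bonds = {c: set() for c in range(1, chain_length + 1)}
    for c in range(1, chain_length):
        bonds[c].add(c + 1)
        bonds[c + 1].add(c)

    sizes = []
    for i, j in closures:
        # Breadth-first search for the shortest path Ci -> Cj (in bonds).
        dist = {i: 0}
        queue = deque([i])
        while queue:
            cur = queue.popleft()
            if cur == j:
                break
            for nxt in bonds[cur]:
                if nxt not in dist:
                    dist[nxt] = dist[cur] + 1
                    queue.append(nxt)
        sizes.append(dist[j] + 1)  # atoms in the ring = bonds on path + 1
        bonds[i].add(j)
        bonds[j].add(i)
    return sizes


# Closure orders taken from the subsection headings in the text.
PATTERNS = {
    "tetracyclines/aureolic acids": [(7, 12), (5, 14), (3, 16), (1, 18)],
    "tetracenomycins": [(9, 14), (7, 16), (5, 18), (2, 19)],
    "angucyclines": [(7, 12), (5, 14), (4, 17), (2, 19)],
}

if __name__ == "__main__":
    for family, closures in PATTERNS.items():
        print(family, ring_sizes(20, closures))
```

Under these assumptions every closure in all three families yields a six-membered carbocycle, so the families differ only in where on the backbone the rings are placed, which is the point made in the text about the C4-C17 step.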

Figure 4. Biosynthesis of angucyclines.

Since both anthracyclines and angucyclines are decaketides and their cyclization regioselectivities diverge at the third ring, they present opportunities to perform combinatorial biosynthesis experiments to probe functional differences between linear and angular cyclases. Mixed results have been obtained using different angucycline cyclases. Hutchinson and coworkers(63) demonstrated that replacing the second/third ring cyclase DpsY with JadI in the set of genes that afforded aklanonic acid was insufficient to generate the angucyclic backbone. Similarly, replacing JadI with DpsY in the set of genes that synthesized 34 was inefficient at producing the linear scaffold. Therefore, the presence of a heterologous cyclase alone cannot alter the cyclization outcome, and additional biosynthetic enzymes are involved in dictating the final configuration of the cyclized polyketide. In contrast, Mäntsälä and coworkers showed that angucycline and anthracycline cyclases can be combinatorially mixed.(64) Using the putative angucycline cyclase PgaF, which has sequence homology to JadI, they showed that PgaF can function in the presence of the anthracycline PKS from the biosynthetic pathway of nogalamycin and synthesize the expected angucycline aglycons 33 and 34. Therefore, a general model for how angucycline cyclases function in the context of different decaketide PKSs has not been established.

Through tailoring steps such as oxidation, dehydration, methylation, and glycosylation, 33 can be transformed into various bioactive angucyclines and angucyclinones. Conversion of 33 to the urdamycin aglycon 12b-hydroxyl-UWM6 is catalyzed by UrdM, which catalyzes the Baeyer-Villiger oxygenation to form an ε-lactone in the fourth ring. Isolation of the shunt product urdamycin L 36 supports this mechanism.(65) The aglycon of landomycin A 37, 11-deoxylandomycinone 38, is derived from 34. Rohr and coworkers discovered that synthesis of 33 from tetrangomycin and the conversion of 34 to 38 require a bifunctional oxygenase/reductase enzyme, LndM2.(66) LndM2 is proposed to first catalyze the attachment of the C15-oxygen to afford 33, then function as a reductase to convert 34 into 38.


The antitumor oviedomycin 39 produced by Streptomyces antibioticus is a heavily oxidized angucycline derived from 34 and the biosynthetic gene cluster was uncovered in 2004.(67) The oxygenases required to synthesize 39 have been identified, which also facilitated the combinatorial biosynthesis of a number of angucyclines with enhanced antitumor activities.(68)

Other compounds derived from 33 are gaudimycin A 40 and gaudimycin B 41, which were identified through heterologous expression of the genes putatively responsible for this aglycon.(69)

The angucycline scaffold of 33 can also be dramatically rearranged to form other non-linear, fused-ring polyketides, giving rise to greater structural diversity. 35 is synthesized from 33 through oxidative cleavage of the C15-C16 bond in the C ring, followed by the addition of one unit of isoleucine.(70, 71) Gilvocarcin V 42, with a naphthopyran structure, is synthesized starting from cleavage of the C15-C16 bond of a related intermediate, followed by rotation of the C4-C5 bond to facilitate lactonization.(71, 72) Recently, the gene clusters responsible for the biosynthesis of two gilvocarcin-type anticancer agents, ravidomycin 43 and chrysomycin 44, were sequenced.(73) 42, 43 and 44 share an identical polyketide aglycon and differ in the structures of the C-glycosides. Interestingly, a second ACP gene (ravC1/chryC1) is present in the gene clusters of 43 and 44, but not in that of 42. The second ACP has been proposed to be involved in the incorporation of the propionate primer unit. Benzopyrenomycin 45, the first natural product discovered to possess the carbocyclic benzo[a]pyrene structure, is proposed to be synthesized from the angucycline backbone, as is evident from the recovery of several analogues of 33 in the producing culture. Based on the structures of the putative intermediates, it was suggested that the carbon scaffold of 45 may be formed through fusing one molecule of malonic acid to an angucycline intermediate.(74) The natural host of saquayamycin Z 46 is reported to produce the linear polyketide galtamycin B 47.(75) However, inactivation of glycosyltransferase saq GT5 or saq GT6 in the gene cluster of 46 eliminated the synthesis of 47, which suggested that the biosyntheses of 46 and 47 are related and that both may be derived from 33. The possible rearrangement of 33 to produce 47 may therefore be the first example of an angular polyketide serving as the precursor to a linearly-fused aromatic polyketide.

1.2 Cyclization of Fungal Aromatic Polyketides

Fungal megasynthases are intrinsically more difficult to study than the dissociated bacterial PKSs due to their large size (in excess of 200 kDa) and the difficulties associated with genetically manipulating the fungal hosts. In recent years, biochemical studies of fungal aromatic polyketide biosynthesis have been facilitated by i) rapid accumulation of complete fungal genomes through various genome sequencing projects;(76, 77) ii) the ability to express intact megasynthases in heterologous hosts such as Escherichia coli(78) and Aspergillus oryzae;(79) and iii) the ability to excise individual domains as standalone enzymes.(16, 17) Genome sequencing has revealed that most fungal species encode multiple PKSs, many of which are not associated with known fungal metabolites. Understanding the biochemical basis of PKS enzymatic functions can therefore be important in product prediction and reprogramming of these machineries. Recent studies have shown that the product template (PT) domain(80) found in fungal PKSs controls the immediate cyclization regioselectivity of the reactive backbone.(16, 17) Consistent with the F-mode rule proposed by Thomas, the PT domain mediates the first ring closure by bridging an even-numbered carbon and an odd-numbered carbonyl carbon, which leads to two intact acetate units in the first ring (Figure 1).(4) Most of the recent studies on PT domains have focused on that of PksA,(16) involved in the biosynthesis of norsolorinic acid, and that of PKS4,(17) involved in the biosynthesis of bikaverin. Additional cyclization of the polyketide backbone can be further directed by a C-terminal thioesterase/Claisen-like cyclase (TE/CLC), which can also offload the polyketide chain from the megasynthase. In the following sections, we organize the discussion on the cyclization of fungal aromatic polyketides based on the PT regioselectivity and C-terminal TE/CLC function.
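The statement that the fungal first ring contains two intact acetate units can be illustrated by simple counting. The sketch below is an illustration under stated assumptions rather than a method from the cited studies: carbons are numbered from the thioester carbonyl (C1), and each acetate-derived unit is taken to be the pair C(2k-1)/C(2k), with the odd carbon being the carbonyl carbon. The function name `intact_acetate_units` is a hypothetical helper.

```python
def intact_acetate_units(i, j):
    """Count acetate (C2) units lying wholly inside the ring closed by Ci-Cj.

    Assumption: carbons are numbered from the thioester carbonyl (C1), and
    each acetate unit is the pair (C(2k-1), C(2k)), i.e. odd = carbonyl
    carbon, even = methylene (or terminal methyl) carbon.
    """
    lo, hi = sorted((i, j))
    # A unit (odd, odd + 1) is intact in the ring if both carbons lie in lo..hi.
    return sum(1 for odd in range(1, hi, 2) if lo <= odd and odd + 1 <= hi)


if __name__ == "__main__":
    # Fungal PT-mediated closures named in the text (PKS4 and PksA).
    for closure in [(2, 7), (4, 9)]:
        print("fungal", closure, intact_acetate_units(*closure))
    # Bacterial first-ring closures discussed in Section 1.1, for comparison.
    for closure in [(7, 12), (9, 14)]:
        print("bacterial", closure, intact_acetate_units(*closure))
```

With this numbering, the C2-C7 and C4-C9 closures each enclose two intact acetate units, whereas the bacterial C7-C12 and C9-C14 closures enclose three, because in the fungal cases the lower-numbered ring carbon is even and therefore splits its acetate unit across the ring boundary.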

1.2.1 Product Template Domain Mediated Cyclization

Characterization of PKS4 PT Domain

The naphtho-γ-pyrone compound 48, the precursor of a yellow pigment embedded in the conidial cell walls, is produced by the WA PKS from A. nidulans. [1,2-13C]Acetate feeding studies suggested that the folding pattern of the heptaketide backbone follows the F-mode rule (Figure 5).(81) Furthermore, formation of the naphthopyrone scaffold requires second ring closure by Claisen-like condensation following first ring cyclization.(18) The TE/CLC domain in WA PKS has been proven to catalyze the Claisen-type cyclization, which will be discussed later.

SMA76a 49 is synthesized by PKS4 from Gibberella fujikuroi and is the precursor to the red pigment bikaverin 50,(78, 82, 83) which has been shown to possess anticancer properties (Figure 5).(84) Compared to 48, 49 contains an additional aromatic ring linearly fused to the naphtho-γ-pyrone to give the naphthoxanthone structure. Feeding studies suggested that 49 is derived from nine acetate units and that the folding pattern of the naphthopyrone portion is identical to that of 48.(85) The activities of PKS4 were recently reconstituted in vitro using intact holo PKS4 purified from Escherichia coli.(78) The protein expression level was sufficiently high to allow in vitro synthesis and full characterization of the polyketide product. It was shown that PKS4 synthesized 49 as the predominant product in the presence of malonyl-CoA, hence establishing that the NRPKS possesses all the required activities to synthesize and regioselectively cyclize the nonaketide backbone. Based on a series of in vitro studies, it was confirmed that construction of the nonaketide backbone requires the minimal PKS alone; the PT domain is responsible for catalyzing the first ring cyclization via aldol condensation between C2 and C7; and the TE/CLC domain facilitates Claisen-like condensation between C10 and C1 to generate the second ring.(17, 37, 78) Subsequently, attack of the C13 carbonyl by the nucleophilic C9 phenol results in the tricyclic naphthopyrone, and finally C12-C17 aldol condensation and dehydration afford the last aromatic ring in 49. Timing of the third ring closure is critical for forming the tetracyclic structure, as premature cyclization of the last three ketides will prevent the formation of the naphthoxanthone core. Therefore, it is likely that the TE/CLC domain facilitates cyclization of both the second and third rings. However, this is yet to be verified.

PKS4 was highly amenable to domain engineering and dissection, which allowed detailed characterization of individual domain functions. When the cyclization domains such as PT and TE were excised from PKS4, the minimal PKS continued to produce ample amounts of nonaketide backbones.(17) Compound 49 was no longer observed as a product since the committed C2-C7 cyclization did not take place. Instead, the reactive poly-β-ketone was cyclized into a spectrum of compounds not observed in previously known fungal metabolites. These compounds contain different first ring cyclization regioselectivities, including the major products 15, cyclized via C9-C14 and C7-C16, and naphthopyrone 51, previously synthesized by a mutant type III PKS (Figure 10).(86)

Figure 5. Biosynthesis of 50 by PKS4.

Characterization of PksA PT Domain

When the nascent polyketide chain is longer than a pentaketide, fungal PT domains can cyclize the first aromatic ring with alternative regioselectivity. A group of fungal aromatic compounds features C4-C9 cyclization of the first ring. Among them is norsolorinic acid 52, which is the anthraquinone precursor of the well-studied mycotoxin aflatoxin B1 53 (Figure 6). The biosynthesis of 52 is accomplished by a 1.4 × 10⁶ Da enzyme complex (NorS) consisting of an NRPKS (PksA) and a pair of fatty acid synthase-like enzymes.(87) Initiated by a hexanoyl starter, PksA performs seven rounds of polyketide extension to afford an octaketide backbone. Townsend and coworkers deconstructed the intact PksA by domain dissection and reassembled the domains in trans to assign the precise roles of the different domains.(16) Their work identified that the PT domain mediates the successive C4-C9 and C2-C11 aldol cyclizations and that the TE/CLC domain performs the C14-C1 Claisen-like condensation and product release. Removal of the TE/CLC domain led to the synthesis of a large amount of naphthopyrone 54, confirming the roles of the PT and TE domains in the cyclization reaction. The minimal PKS domains of PksA were inefficient when removed from the PT domain and synthesized only trace amounts of 52 and 54. The loss of product turnover implied that the PksA PT domain also plays an important role in the overall efficiency of the chain elongation steps of the PksA minimal PKS. From the same study, it was also proposed that the PT domain facilitates release of intermediates attached to the ACP domain.(16)
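The chain-length bookkeeping behind the hexanoyl-primed octaketide can be written out explicitly. This is a minimal arithmetic sketch, assuming only that each malonyl-CoA extension adds two backbone carbons after decarboxylation; the function name `backbone_carbons` is a hypothetical helper and not taken from the cited work.

```python
def backbone_carbons(starter_carbons, extension_rounds):
    """Carbons in a polyketide backbone before any tailoring.

    Assumption: every malonyl-CoA extension contributes two carbons,
    since one carbon of malonate is lost as CO2 during condensation.
    """
    return starter_carbons + 2 * extension_rounds


if __name__ == "__main__":
    # PksA: hexanoyl (C6) starter plus seven extension rounds.
    print(backbone_carbons(6, 7))
    # An acetate-primed nonaketide: acetyl (C2) starter plus eight extensions.
    print(backbone_carbons(2, 8))
```

Under this assumption the hexanoyl starter and seven extensions give a 20-carbon backbone, and nine acetate units give 18 carbons, which is why carbon positions up to C14 can take part in cyclization for PksA while the hexyl portion of the starter remains unfunctionalized.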

Figure 6. Biosynthesis of 53 by PksA.

The structural basis of the PksA-PT regioselectivity was recently elucidated by Townsend and coworkers.(88) The PksA-PT adopts a “double hot dog” (DHD) fold similar to the helix-grip structure of the bacterial cyclase TcmN.(29) Evident in the crystal structure is a deep internal pocket that can accommodate the nascent polyketide chain while attached to the phosphopantetheinyl arm. Extending from the protein surface into the internal pocket is a phosphopantetheine-binding site, followed by a cyclization chamber that can accommodate up to two aromatic rings. The innermost part of the pocket is proposed to be a holding channel for the hexyl moiety, which is the starter unit for PksA. Mutagenesis analysis confirmed the function of the proposed catalytic dyad, Asp1543 and His1345, suggesting that the PT domain has a catalytic role in addition to a template role. Furthermore, the PT domain was found to be a dimer, as facilitated by a PT-specific insertion sequence. The dimeric structure of PT implies its role as a structural joint for the dimerization of the entire NRPKS structure.

1.2.2 Thioesterase Domain Mediated Cyclization

In addition to cyclizations mediated by the PT domain, fungal NRPKSs also use the terminal TE domain to catalyze regioselective cyclization of nascent polyketides. A majority of the fungal TE domains found in NRPKSs catalyze Claisen-like condensations to form C-C bonds, which is accompanied by release of the polyketide product from the megasynthase. Hence, these TE domains are also referred to as TE/CLC domains. The Claisen-like condensation activity of TE/CLC is distinct from that of bacterial TE domains that catalyze lactonization(89-91) or hydrolysis,(92) and therefore represents an important strategy employed by fungal NRPKSs to produce fused ring structures.

The CLC function of the fungal NRPKS TE was first elucidated in the study of 48. Whereas the full length WA synthase produced the naphthopyrone 48 when heterologously expressed in Aspergillus oryzae,(81) Ebizuka and coworkers observed the biosynthesis of citreoisocoumarin 55 as the major product by a version of WA PKS truncated at the C-terminus.(93) The differences in the cyclization patterns of the second ring between 48 and 55 suggest the importance of the TE domain in catalyzing the required Claisen-like C1-C10 condensation. Site-directed mutagenesis of the WA PKS TE domain revealed two essential catalytic residues, Ser1967 and His2129, and led to the proposed two-step, general -catalyzed mechanism.(18) The similar catalytic role of the PKS1 TE/CLC domain in the synthesis of 1,3,6,8-tetrahydroxynaphthalene (THN) 56 was confirmed by deletion and site-directed mutagenesis inactivation experiments. These PKS1 mutants produced the pentaketide isocoumarin 57 via spontaneous lactonization of the last ring, as well as a hexaketide isocoumarin 58. The presence of the latter compound indicated the possible involvement of the functional TE/CLC domain in maintaining the chain length of the minimal PKS. The PKS1 TE/CLC domain also displays hydrolytic activity in addition to the CLC activity, as intact PKS1 produced a significant amount of α-acetylorsellinic acid 59, a pentaketide carboxylic acid, as a major byproduct to 56.(79) Inactivation of PKS1 TE/CLC similarly led to the loss of 59.

The CLC activity of the PKS4 TE/CLC domain in the biosynthesis of 49 was investigated by domain deletion, site-directed mutagenesis and in trans complementation assays in vitro.(37) Truncation of PKS4 at the C-terminus of the ACP domain gave rise to the nonaketide product SMA93 60. The same isocoumarin compound was also produced by the PKS4 Ser1830Ala mutant. Therefore, in the absence of the C1-C10 cyclization activity, the C2-C7 cyclized backbone spontaneously undergoes pyrone formation. Interestingly, when the standalone PKS4-TE was incubated with TE-less PKS4, the C1-C10 cyclization was partially restored and both 49 and 60 were recovered in similar yields. Additional PKS4 in vitro assays revealed the substrate specificity of the TE/CLC domain. When supplied with octanoyl-CoA, PKS4 preferentially incorporated the C8 starter unit and synthesized two new isocoumarins, SMA76b 61 and SMA76c 62.(78) The structures of 61 and 62 illustrated that in the presence of a partial β-keto backbone, the PT can maintain the C2-C7 cyclization regioselectivity, whereas the TE/CLC domain can no longer catalyze the C1-C10 CLC reaction. The incompatibility of the TE/CLC with the fatty acyl-primed polyketide showed that a full poly-β-ketone backbone is required for binding to the enzyme, which suggests that the PKS4 TE/CLC may also dictate the timing of third ring cyclization en route to 49.

Other confirmed TE/CLC domains include that of PksA, involved in the biosynthesis of 52.(16) When the dissected PksA domains were reassembled in trans in the absence of TE/CLC, significant amounts of the naphthopyrone 54 were found in the in vitro assay. Reintroduction of TE/CLC to the full ensemble of PksA catalytic domains restored the production of 52. As expected, the C1-C14 regioselectivity of the PksA TE/CLC differs from the C1-C10 pattern associated with NRPKSs that have C2-C7 first-ring cyclization regioselectivity.

2 Results

2.1 Identification of PT Domain Function and Engineered Biosynthesis of Bacterial Aromatic Polyketides in Escherichia coli

This research was originally published in Proc. Natl. Acad. Sci. U. S. A.: Zhang, W., Li, Y., Tang, Y. (2008). Engineered Biosynthesis of Bacterial Aromatic Polyketides in Escherichia coli. Proc. Natl. Acad. Sci. U. S. A., 105 (52), 20683-20685. © 2008 by the National Academy of Sciences.

2.1.1 Introduction

Escherichia coli is a powerful microorganism for understanding and engineering the biosynthesis of natural products. This is attributed to its faster growth, more abundant genetic tools and better-understood primary metabolism compared to native hosts. Nearly all major classes of natural products have been synthesized and engineered in E. coli, including macrolides,(94) cyclic peptides,(95) terpenes(96) and alkaloids.(97) A noticeable exception has been the pharmaceutically important bacterial aromatic polyketides, which include important human therapeutics such as tetracyclines and anthracyclines. Bacterial aromatic polyketides are synthesized by type II aromatic polyketide synthases (PKSs) from soil-borne actinomycetes.(1) The barrier to reconstituting the biosynthesis of aromatic polyketides in E. coli has been the inability to generate the elongated poly-β-ketone backbone from malonyl-CoA, which requires a minimal PKS(98) that consists of a ketosynthase (KS)-chain length factor (CLF) heterodimer and an acyl-carrier protein (ACP). Despite repeated attempts by numerous groups, expression of the core KS-CLF heterodimer in E. coli has always resulted in 100% of the proteins as inclusion bodies. Therefore, an alternative minimal PKS machinery capable of generating an elongated polyketide backbone, as well as interacting with the immediate tailoring enzymes, is needed to synthesize bacterial aromatic polyketides in E. coli.

Figure 7. Comparison between fungal and bacterial iterative polyketide synthases. (A) Fungal iterative PKS is a megasynthase that contains all of the catalytic domains on a single polypeptide. The G. fujikuroi PKS4 synthesizes the nonaketide backbone and cyclizes through C2-C7 regioselectivity. The TE/CLC domain catalyzes the C1-C10 cyclization and leads to formation of 49. Without the TE/CLC domain, 60 is isolated. (B) Bacterial type II PKSs. The minimal PKS consists of the KS-CLF heterodimer and a dissociated ACP domain. The fren minimal PKS is also a nonaketide synthase. However, cyclization of the backbone is determined by dissociated cyclases. The TcmN cyclase can fix the C9-C14 connectivity to form 15. When a defined set of tailoring enzymes is included with the fren minimal PKS, the anthraquinone compound 63 is formed via reduction of C9 ketide and sequential cyclization of C7-C12, C5-C14, and C3-C16.

The fungal NRPKSs are involved in the biosynthesis of fungal aromatic metabolites, including the well-known mycotoxin aflatoxin.(16) Unlike bacterial aromatic PKSs in which the enzymatic components are dissociated, a fungal PKS is a megasynthase that contains the required catalytic domains in a single polypeptide (Figure 7).(14) These domains include the starter-unit:ACP acyltransferase (SAT) involved in starter unit selection,(15) KS, malonyl-CoA:ACP transferase (MAT) for extender unit loading,(99) product-template (PT),(80) ACP and thioesterase/Claisen-cyclase (TE/CLC).(18) The juxtaposition of these domains enables the megasynthase to initiate, extend and cyclize the polyketide backbone into multicyclic, fungal-specific products.(14) We have recently demonstrated that full-length fungal megasynthases can be expressed and reconstituted in E. coli, making this family an attractive machinery for E. coli-based biosynthesis of bacterial aromatic polyketides.(78) However, despite the analogous iterative Claisen-like condensations catalyzed by the two families of PKSs during chain elongation,(99) the cyclization modes of the poly-β-ketone backbones are drastically different (Figure 7).(4, 100) For example, both the fungal PKS4(82) from Gibberella fujikuroi and the bacterial frenolicin (fren) PKS(101) from Streptomyces roseofulvus can synthesize the nonaketide (C18) backbone from nine malonyl-CoA units. The differences in cyclization modes between the two PKSs, however, give rise to orthogonal sets of aromatic polyketide products.

PKS4 regioselectively directs the four consecutive cyclization reactions, starting with the C2-C7 aldol condensation, to form the tetracyclic SMA76a 49.(78) In the absence of the TE/CLC domain, which is responsible for the second ring cyclization in 49, the C2-C7 cyclized polyketide intermediate spontaneously rearranges to yield the benzopyrone SMA93 60.(37) In contrast, the fren PKS can be combined with an accessory cyclase such as TcmN from the tcm pathway(102) to cyclize via C9-C14 regioselectivity and afford PK8 15.(39) Similarly, the fren PKS can also interact with a series of tailoring enzymes to yield the anthraquinone SEK26 63.(35)

In this report, we demonstrate the dissection and reassembly of G. fujikuroi PKS4 into a synthetic PKS that can efficiently synthesize bacterial aromatic polyketides. The engineered PKS afforded a spectrum of polyketides with cyclization regioselectivities that are not observed among fungal polyketides. When the new PKS machinery was expressed in E. coli with bacterial tailoring enzymes, complex bacterial aromatic polyketides were produced by this host for the first time. Our approach overcomes the barrier of reconstituting a bacterial minimal PKS in E. coli, and paves the way for the characterization and engineering of the biosynthesis of type II aromatic polyketides in this powerful organism.

2.1.2 Results and Discussion

Reconstitution and Characterization of fungal minimal PKS in Escherichia coli

A) Design and reconstitution of fungal minimal PKS

Fungal NRPKSs and type II minimal PKSs both produce unreduced poly-β-ketone backbones through iterative Claisen-like condensations, yet they utilize drastically different cyclization modes, which in most cases yield structurally distinct compounds.(4, 78, 100) Although it has been demonstrated that the accessory enzymes of type II PKSs can alter the cyclization of the polyketides synthesized by PKS4, 49 remains the major product because of the robust, built-in C2-C7 cyclization catalyzed by the intact PKS4. Furthermore, coexpression of PKS4 and the bacterial tailoring enzymes in E. coli results in the synthesis of 49 only, which may be caused by the inability or excessive delay of the dissociated cyclases in accessing the poly-β-ketone backbone in vivo. We propose that the C2-C7 cyclization is mediated by the PT domain. This proposal is supported by the deconstruction study of PksA (an NRPKS involved in the synthesis of the environmental carcinogen aflatoxin B1 53 that has the same domain architecture as PKS4), which suggested that the PT domain of PksA may direct the first two ring cyclizations, between C4-C9 and C2-C11.(16)

Figure 8. Construction of PKS4 minimal PKS domains without the PT domain (protein gel; M, molecular weight marker in kDa): L1, PKS_WJ (KS-MAT-ACP); L2, PKS4 KS-MAT didomain; L3, PKS4-ACP.

Two strategies were developed to remove the PKS4 PT domain and test whether the built-in PKS4 cyclization mode would be inactivated. In the first approach, to mimic the dissociated-component machinery of the bacterial type II minimal PKS, we dissected the PKS4 KS-MAT didomain and the ACP domain and reconstituted them as two stand-alone proteins based on conserved domain boundaries.(103) The KS-MAT didomain (108 kDa) was expressed in BL21(DE3) and the discrete ACP (13 kDa) was expressed in BAP1 (Figure 8). In the second approach, the C-terminus of the PKS4 KS-MAT didomain and the N-terminus of PKS4-ACP were fused with a KR-ACP interdomain linker peptide (~90 aa) from the 6-DEBS module 3 (Figure A1). The linker was chosen to help the ACP domain access the active sites of both the KS and MAT domains in cis, as linker peptides have been reported to play an indispensable role in domain-domain interactions.(103) The design yielded a compact, synthetic megasynthase PKS_WJ (129 kDa) that retained all of the minimal PKS components on a single polypeptide. PKS_WJ was solubly expressed in BAP1 with an N-terminal hexahistidine tag and purified at a final yield of 1.6 mg/L (Figure 8).

B) Catalytic properties of the fungal minimal PKSs

[Figure 9B panels, selected ion chromatograms: (a) KS_MAT + ACP + MatB + malonic acid (12C), m/z 343 and m/z 299; (b) KS_MAT + ACP + MatB + malonic acid (2-13C), m/z 352 and m/z 308; (c) PKS_WJ + MatB + malonic acid (12C), m/z 343 and m/z 299; (d) PKS_WJ + MatB + malonic acid (2-13C), m/z 352 and m/z 308.]

Figure 9. (A) HPLC analysis (280 nm) of minimal PKS activities in vitro. (i) TE0-PKS4 produced 49 as the dominant product. (ii) The dissociated PKS4 minimal PKS synthesized new nonaketides marked with *, while much less 49 was formed. The major products were 15 and 51. (iii) PKS_WJ produced a similar set of compounds to the dissociated minimal PKS. (iv) No polyketides were produced by the KS-MAT didomain alone (negative control). (B) LC-MS analysis and selected ion monitoring of nonaketides produced by KS-MAT+ACP and PKS_WJ in vitro. The molecular weight of 342 corresponds to nonaketides that have been subjected to 2 dehydrations, such as 15. The molecular weight of 298 corresponds to nonaketides subjected to 3 dehydrations followed by one decarboxylation, such as 60. When 12C-malonate was replaced with [2-13C] malonate, [2-13C] malonyl-CoA was synthesized by MatB and incorporated into the polyketide products, leading to a mass increase of +1 for each ketide.

The catalytic properties of the two fungal minimal PKS machineries were examined in the presence of malonate, CoA, and the Rhizobium trifolii malonyl-CoA synthase MatB.(104) Compared with the negative control reaction from which ACP was omitted, both the dissociated minimal PKS and PKS_WJ produced a similar set of new compounds not observed with TE0-PKS4 (Figure 9A). Most of the new compounds were confirmed to be nonaketides (labeled with * in Figure 9Aii) through selected ion monitoring LC-MS of the reaction mixture with either 12C-malonate or [2-13C] malonate as the malonyl-CoA precursor (Figure 9B). Several unidentified octaketides were also synthesized at trace levels, as revealed by MS analysis. Intriguingly, the C2-C7 cyclized product 60 was synthesized at much lower levels by either minimal PKS, indicating that the PT domain is essential for directing the committed cyclization step in the parent megasynthase.

The two major products 15 and 51 were purified from the extracts of the scaled-up in vitro reaction and characterized through 1- and 2-dimensional NMR. By comparison with the authentic standard, the first major product (RT = 17.4 min) was identified as the C9-C14 cyclized nonaketide 15, which was previously obtained from an engineered Streptomyces coelicolor strain CH999 (in which the gene clusters encoding the main polyketide synthases have been knocked out(11)) expressing the fren minimal PKS and the C9-C14 cyclase TcmN.(39) The second major product (RT = 18.9 min) has a parent ion peak [M+H]+ at m/z 299, which is shifted to m/z 308 on the use of [2-13C] malonate as the malonyl-CoA precursor. The mass is in agreement with that of a nonaketide subjected to 3 dehydrations and 1 decarboxylation. By comparing its NMR signals with those of an authentic standard, the compound was determined to be the naphthopyrone 51, which was previously isolated as a spontaneously cyclized nonaketide produced by a mutant type III PKS.(86) Two possible cyclization modes can lead to the formation of 51 (Figure 10): either C10-C15 or C5-C10 first ring cyclization, followed by additional folding and decarboxylation, will yield 51. The minor nonaketide products were not characterized, and were presumably derived from other cyclization regioselectivities. Together, the synthesis of various nonaketides indicated that both minimal PKS designs afford functional enzymes capable of generating a full-length nonaketide backbone with numerous cyclization modes, except the C2-C7 folding catalyzed by the parent PKS4 megasynthase.
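The labeling logic used in this analysis can be sketched numerically. The following Python snippet is an illustrative sketch and not part of the original analysis: the function name and the use of the 13C/12C mass difference (about 1.00335 Da) are assumptions introduced here. It estimates how many malonate-derived units a singly charged product ion contains from the m/z shift observed when [2-13C] malonate replaces unlabeled malonate.

```python
# Illustrative sketch (not from the original work): infer the number of
# malonate-derived ketide units from the m/z shift seen with [2-13C] malonate.

C13_MINUS_C12 = 1.00335  # Da, mass difference between 13C and 12C


def malonate_units(mz_unlabeled: float, mz_labeled: float) -> int:
    """Each [2-13C] malonate-derived unit contributes one 13C atom, so the
    m/z shift of a singly charged ion divided by the 13C-12C mass
    difference approximates the number of incorporated units."""
    return round((mz_labeled - mz_unlabeled) / C13_MINUS_C12)


# m/z pairs reported in the text for the two major products
for label, unlabeled, labeled in (("compound 15", 343, 352),
                                  ("compound 51", 299, 308)):
    print(label, "malonate-derived units:", malonate_units(unlabeled, labeled))
```

Under these assumptions, both reported pairs correspond to nine incorporated units, which matches the nonaketide assignment made in the text.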


Figure 10. UV and mass spectra of 15 and 51, and two possible cyclization patterns to afford 51.

C) Kinetic properties of the fungal minimal PKSs:

To investigate whether the reconstituted minimal PKSs retained sufficient catalytic activity to turn over polyketides, we quantified the kinetic efficiency of PKS_WJ and KS-MAT by titrating malonyl-CoA and ACP, respectively, using [1,2-14C] malonyl-CoA and radioactive TLC. At 3 µM enzyme concentration, PKS_WJ synthesized polyketides with Vmax = 316 nM/min and Km = 290 µM towards malonyl-CoA (Figure 11A). In addition, the maximum velocity of the reaction varied linearly with PKS_WJ concentration to give kcat = 0.11 min-1 (Figure 11A). The catalytic efficiency of PKS_WJ is therefore nearly identical to that of the parent PKS4 (kcat = 0.1 min-1).(78) Two important insights into the properties of the synthetic PKS_WJ are implied: 1) removing the SAT and PT domains in the parent PKS4 does not affect the turnover rate of the minimal PKS; 2) the engineered linker between KS-MAT and ACP in PKS_WJ does not disrupt either the KS:ACP or the MAT:ACP interaction. The synthetic PKS machinery is therefore kinetically competent to function in E. coli.
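As a quick consistency check on these numbers, the turnover number can be estimated from kcat = Vmax/[E]. The short Python sketch below is illustrative only: the function name is hypothetical, and the inputs are the fitted Vmax of 316 nM/min reported in Figure 11A and the 3 µM enzyme concentration stated in the text.

```python
def kcat_per_min(vmax_nM_per_min: float, enzyme_uM: float) -> float:
    """Turnover number kcat = Vmax / [E], with [E] converted from uM to nM."""
    enzyme_nM = enzyme_uM * 1000.0
    return vmax_nM_per_min / enzyme_nM


# Values taken from the text and Figure 11A
kcat = kcat_per_min(316.0, 3.0)
print(f"single-point kcat estimate: {kcat:.3f} per min")
```

This single-point estimate is close to the 0.11 min-1 that the text derives from the slope of Vmax against PKS_WJ concentration, and to the 0.1 min-1 cited for the parent PKS4.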


Nevertheless, titration of ACP at 3 µM KS_MAT and 2 mM [1,2-14C] malonyl-CoA revealed Michaelis-Menten kinetics with Vmax = 581 nM/min and Km(ACP) = 41 µM (Figure 11B). The high ACP concentration required by the KS-MAT didomain to reach a velocity comparable to that of PKS4 or PKS_WJ indicates the importance of ACP diffusion and of in trans protein-protein interactions during polyketide synthesis,(98) and hints at potential limitations of the dissociated minimal PKS in functioning efficiently in E. coli. To determine the rate-limiting step, a KS0_MAT mutant (C514A, 10 nM) was used, and the rate of [1,2-14C] malonyl transfer to PKS4_ACP (100 µM) to form malonyl-ACP was measured by autoradiography. The apparent rate of malonyl transfer was ~130 min-1, which is comparable to that of the same reaction catalyzed by freestanding acyltransferases such as S. coelicolor FabD, a commonly used MAT in type II PKS in vitro assays. Therefore, the rate-limiting step of the minimal PKS is likely to be one of the catalytic reactions involving the KS domain, including initiation and the Claisen-like condensation extension.(98)
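To make the ACP dependence concrete, the Michaelis-Menten relation v = Vmax[S]/(Km + [S]) can be evaluated with the fitted parameters. The Python sketch below is an illustration under the assumption that simple Michaelis-Menten behavior holds over the whole ACP range; the function names are hypothetical, and the parameters are those reported for the dissociated KS-MAT didomain (Vmax = 581 nM/min, Km(ACP) = 41 µM) and for PKS_WJ (Vmax = 316 nM/min).

```python
def mm_rate(s: float, vmax: float, km: float) -> float:
    """Michaelis-Menten rate v = Vmax * [S] / (Km + [S])."""
    return vmax * s / (km + s)


def substrate_for_rate(v: float, vmax: float, km: float) -> float:
    """Invert the Michaelis-Menten equation: [S] = Km * v / (Vmax - v)."""
    if not 0 <= v < vmax:
        raise ValueError("target rate must satisfy 0 <= v < Vmax")
    return km * v / (vmax - v)


VMAX_DISSOCIATED = 581.0  # nM/min, KS-MAT + free ACP (Figure 11B)
KM_ACP = 41.0             # uM
VMAX_PKS_WJ = 316.0       # nM/min, fused PKS_WJ (Figure 11A)

# Velocity of the dissociated system at a few ACP concentrations
for acp_uM in (10.0, 41.0, 100.0):
    v = mm_rate(acp_uM, VMAX_DISSOCIATED, KM_ACP)
    print(f"[ACP] = {acp_uM:5.1f} uM -> v = {v:6.1f} nM/min")

# Free ACP needed for the dissociated system to match PKS_WJ at saturation
needed = substrate_for_rate(VMAX_PKS_WJ, VMAX_DISSOCIATED, KM_ACP)
print(f"ACP needed to reach {VMAX_PKS_WJ:.0f} nM/min: {needed:.1f} uM")
```

With these parameters the dissociated system needs roughly 49 µM free ACP to match the fused enzyme, which illustrates why a high intracellular ACP concentration would be required in vivo.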

[Figure 11 plots: (A) reaction velocity of PKS_WJ versus malonyl-CoA concentration, with an inset of Vmax (nM/min) versus PKS_WJ concentration (µM); fitted Vmax = 316 ± 17 nM/min, Km(malonyl-CoA) = 290 ± 6 µM. (B) Reaction velocity of the KS-MAT didomain versus ACP concentration (µM); fitted Vmax = 580.6 ± 42 nM/min, Km(ACP) = 41 ± 6.4 µM.]

Figure 11. (A) PKS_WJ exhibited kinetic properties comparable to those of the parent PKS4, with Vmax = 316 nM/min and Km = 290 µM toward malonyl-CoA. (B) Michaelis-Menten properties of the dissociated KS-MAT didomain (3 µM) toward the ACP domain at 2 mM malonyl-CoA.


Meanwhile, we investigated the ACP specificity of the KS-MAT didomain by exploiting the in trans protein-protein interaction of the dissociated minimal PKS. When PKS4_ACP was substituted with bacterial type I PKS ACPs (6DEBS-M3_ACP and 6DEBS-M6_ACP), bacterial type II PKS ACPs (FrenN and ZhuN), or heterologous fungal PKS ACPs (LNKS ACP(105) and gzPKS13_ACP(106)), no polyketide products could be detected using radio-TLC or LC-MS. This indicates that the interactions between the fungal KS-MAT and its ACP are highly specific. However, using the PKS4 KS0_MAT mutant didomain, we observed that the MAT domain was able to transfer the 2-14C-malonyl label to the heterologous ACP partners with nearly the same efficiency as to PKS4_ACP. This suggests that the incompatibilities between the gfPKS4_KS-MAT and heterologous ACPs are attributable to the highly specific KS:ACP interactions. This is in sharp contrast to the tolerance exhibited by the bacterial minimal PKS towards noncognate ACPs,(107) including those extracted from fungal PKSs.(99)

D) Starter unit selectivity of the fungal minimal PKSs

We then assessed the starter unit specificity of the reconstituted fungal minimal PKS by adding acyl-SNACs of various lengths to the in vitro assays of PKS_WJ, in which malonyl-CoA is synthesized by MatB from malonate and CoA. To our surprise, unlike the parent PKS4,(78) PKS_WJ showed a remarkably broad starter unit tolerance, accepting acyl-SNACs from C3 to C14 (except C5) as starter units (Figure 12). Although the assay with pentanoyl-SNAC did not show any new compounds in the HPLC analysis, none of the products shown in Figure 12C were detected either. Assays with acyl-CoAs showed similar HPLC traces, yet higher production of the new compounds.

The HPLC analysis indicates that PKS_WJ shows a higher affinity for acyl-SNACs (C4-C14) than for malonyl-CoA, as the synthesis of acetyl-primed nonaketides was almost completely abolished in the presence of acyl-SNACs. As MAT is a malonyl-CoA:ACP transferase, we propose that the acyl-SNACs were spontaneously loaded onto the active site cysteine of the KS domain. This is quite different from the intact PKS4; hence we suggest that structural changes introduced while engineering the megasynthase have disrupted the entrance to the KS domain. Specifically, this disruption may be caused by the lack of the SAT domain, which may structurally wrap around the KS domain.

[Figure 12 chromatograms, x-axes in minutes: (A) propionyl-SNAC (C3); (B) butyryl-SNAC, traces i (C4) and ii (C4 + ZhuI); (C) acyl-SNACs from C5 to C14.]

Figure 12. Starter unit selection by PKS_WJ. (A) HPLC analysis of polyketides produced by PKS_WJ using propionyl-SNAC. All original malonate-primed polyketides were retained, and a propionyl-primed SEK4 analog was detected at 18.7 min and characterized by its exact mass and UV absorption. (B) HPLC analysis of polyketides produced by PKS_WJ using butyryl-SNAC without (i) and with (ii) ZhuI. Synthesis of the original malonate-primed polyketides was abolished, and the major products are truncated triketide (marked as Δ) and tetraketide (marked as ♦) compounds. A full-length butyryl-primed octaketide, an analog of 1, was synthesized in trace amounts at around 20 min. The addition of ZhuI led to the synthesis of more full-length polyketides, although the truncated polyketides are still the major products. (C) HPLC analysis of polyketides synthesized by PKS_WJ using acyl-SNACs from C5 to C14. Truncated triketide (marked as Δ) and tetraketide (marked as ♦) compounds are the major products, and their retention times are longer when primed with longer acyl groups.


In the assays of PKS_WJ with propionyl-SNAC, in addition to the original acetyl-primed nonaketides, a new major compound was synthesized (RT = 18.7 min, Figure 12A) that shares a similar UV absorption pattern with SEK4 1 and has a parent ion peak [M-H]- at m/z 331, which is shifted to m/z 339 on the use of [2-13C] malonate as the malonyl-CoA precursor. The mass is in agreement with that of a propionyl-primed octaketide that has been subjected to 2 dehydrations.

Meanwhile, in the presence of butyryl-SNAC, PKS_WJ synthesized trace amounts of a new compound (RT = 20 min, Figure 12B) that also shares a similar UV absorption pattern with 1 and has a parent ion peak [M-H]- at m/z 345, which is shifted to m/z 353 on the use of [2-13C] malonate as precursor. The mass is in agreement with that of a butyryl-primed octaketide that has been subjected to 2 dehydrations. Moreover, in the reactions of PKS_WJ with acyl-SNACs (C4 to C14), the major products are all truncated triketide or tetraketide compounds (Figure 12B, C). This implies that the elongating chain is easily released from the phosphopantetheinyl arm of the ACP domain after no more than three iterative condensations when primed with an acyl group longer than C3. Hence, we propose that some residues in the polyketide binding pocket interact with the keto or hydroxyl groups to stabilize the elongating chain and prevent it from self-release with the formation of a pyrone. This may also explain why gzPKS13 (an NRPKS responsible for the synthesis of zearalenone in Gibberella zeae(106)) always yields truncated triketide or tetraketide pyrones, even when fed with malonyl-CoA. Natively, gzPKS13 is primed with a highly reduced hexaketide and catalyzes three iterative Claisen condensations. Thus, we propose that very few accessory residues stabilize the keto or hydroxyl groups of the elongating chain in the gzPKS13 substrate binding pocket.

Engineering Biosynthesis of Bacterial Aromatic Polyketides in Escherichia coli

A) Interaction Between Fungal Minimal PKS and Bacterial Cyclases


With the minimal PKS fully functional and the built-in C2-C7 cyclization mechanism disabled, we assayed whether the cyclization fate of the backbone can now be fully controlled by exogenous cyclases. Remarkably, when the C9-C14 cyclase TcmN or WhiE-ORFVI(22) was added to either minimal PKS, the exclusive synthesis of 15 was observed (Figure 13A, trace i and ii) while the synthesis of almost all the other polyketides was abolished (Figure 13B), illustrating that both TcmN and WhiE-ORFVI can completely fix the C9-C14 regioselectivity of the first ring formation.

[Figure 13 structures: SEK4, 1; SEK15, 2; TW93c, 3; R1128a, 63; mithramycin, 23; steffimycin, 64.]

Figure 13. (A) HPLC analysis (280 nm) of reaction mixtures. Addition of the C9-C14 cyclase TcmN (i) or WhiE-ORFVI (ii) to PKS_WJ resulted in the exclusive synthesis of 15, whereas addition of the putative C7-C12 cyclases ZhuI (iii) or MtmQ (iv) to PKS_WJ resulted in the synthesis of nonaketide 7 and the minor octaketide 1. (B) (i) C7-C12 cyclized polyketides produced by bacterial minimal PKSs alone without the aid of cyclases. 1, 2 and 3 are produced by the act, tcm and whiE minimal PKSs, respectively. (ii) Natural aromatic polyketides with a C7-C12 cyclized first ring. (C) Cyclization of the nonaketide backbone to yield 7, which confirms the activities of ZhuI, MtmQ and StfQ as C7-C12 cyclases specific for unreduced polyketides.

To date, numerous bacterial minimal PKSs have been suggested to predominantly produce polyketides with a C7-C12 first ring,(24) including octaketide 1,(21) decaketide SEK15 2,(21) and dodecaketide TW93c 3.(22) Thus the C7-C12 cyclization appears to be controlled by the KS-CLFs alone, without additional cyclases. This was rationalized from an X-ray crystal structure of act KS-CLF, in which the polyketide backbone is proposed to be buckled at C9 and C10 in the active site of the KS, hence positioning C7 and C12 in close proximity and initiating the aldol condensation.(25) However, putative first ring cyclases can be found in all gene clusters of aromatic polyketides that cyclize via C7-C12 (Figure 13B), including those of R1128 63 (ZhuI),(26) mithramycin 23 (MtmQ)(28) and steffimycin 64 (StfQ).(27) The catalytic mechanism and activity of these cyclases have not been clarified due to the lack of a minimal PKS that is independent of C7-C12 folding catalysis.

The PKS4 minimal PKS does not produce any appreciable amounts of C7-C12 cyclized nonaketides and is therefore an ideal enzyme for examining the roles of the putative cyclases. ZhuI, MtmQ and StfQ were expressed in BL21(DE3) (Figure A2), purified to near homogeneity and each added to either PKS4 minimal PKS machinery in the presence of CoA, malonate and MatB. For each assay, according to the HPLC analysis, a previously unknown compound 7 (Figure A3) was predominantly synthesized with the concomitant disappearance of other nonaketides (Figure 13A, trace iii and iv). MS analysis of this compound showed a parent ion peak [M-H]- at m/z 359, which is shifted to m/z 368 on the use of [2-13C] malonate as the malonyl-CoA precursor. This is in agreement with the mass of a nonaketide subjected to 2 dehydrations. Further NMR characterization revealed 7 to be a nonaketide analog of 1 (Figure 13, Table A3), which forms by spontaneous cyclization after the regioselective C7-C12 first ring cyclization. This verifies the catalytic activity of the three cyclases, which ensure complete C7-C12 cyclization and increase the rate of the committed cyclization step in their respective pathways. In addition to compound 7, the truncated octaketide 1 with the C7-C12 first-ring cyclization was detected at low levels in the HPLC analysis (Figure 13A).

Together, the synthesis of 7 suggests that while the fungal minimal PKS may have some template influence over the orientation of the nascent polyketide chain to afford 15 and 51 as the major products, addition of polyketide cyclases can completely reorient the polyketide backbone and lead to the formation of highly specific products.

B) Combinatorial Biosynthesis of Bacterial Aromatic Polyketides in E. coli

The in vitro data presented so far provide convincing evidence that a fungal minimal PKS can be a functional alternative to a type II minimal PKS. To demonstrate the application of this new enzymatic machinery in E. coli, we targeted the biosynthesis of anthraquinone 63. Compound 63 is derived from a nonaketide backbone after ketoreduction at C9 by a KR, C7-C12 cyclization and dehydration by an aromatase/cyclase (CYC1), C5-C14 cyclization by a second ring cyclase (CYC2), followed by spontaneous third ring cyclization and second ring oxidation (Figure 7).(35) Two plasmids encoding PKS_WJ, act KR, gris CYC1 and oxy CYC2 (OxyN), with each gene flanked by a T7 promoter and a T7 terminator, were transformed into BAP1. The resulting strain was grown in shake flask culture; protein expression was induced with 0.5 mM IPTG and the culture was incubated at 16°C for two days. The expression of all four PKS enzymes was tracked by SDS-PAGE, and the maximum soluble protein expression levels were reached 24 hours after induction. Gratifyingly, analysis of the culture extract by LC-MS revealed that 63 was indeed synthesized by our engineered E. coli strain at approximately 0.3 mg/L (Figure 14A). Approximately 90% of 63 was found in the culture medium, while the remaining 10% was extracted from the cell pellets. Compound 64, which is the decarboxylated form of 63, was also detected. Selected ion monitoring of the extract showed no other nonaketide species present, confirming the precise timing and specificities of the three tailoring enzymes in E. coli.


Furthermore, to explore the scale-up potential of this process, we performed a fed-batch fermentation using F1 minimal medium and monitored the level of 63 for ~60 hours after induction (Figure 14B). The E. coli culture started producing detectable amounts of 63 ~10 hours after induction, and the titer increased steadily until a maximum of ~3 mg/L was achieved at a plateau OD600 of ~70. Prolonged fermentation beyond the plateau did not result in an increase in polyketide levels, which may be attributed to loss of enzyme activity or depletion of intracellular malonyl-CoA.

[Figure 14 plots: (A) HPLC traces for the associated PKS (PKS_WJ) and the dissociated PKS, with peaks 63 and 64 labeled; (B) OD600 and titer of 63 (mg/L) versus time (hour).]

Figure 14. (A) HPLC analysis (430 nm) of SEK26 63 biosynthesis using E. coli BAP1 expressing PKS_WJ or the dissociated minimal PKS, together with the three tailoring enzymes act KR, gris CYC1 and oxy CYC2. (B) High-cell-density, fed-batch fermentation production of 63.

In contrast to PKS_WJ, no polyketides could be detected when the dissociated PKS4 minimal PKS was introduced in vivo, despite repeated optimization attempts (Figure 14A). SDS-PAGE analysis confirmed that both proteins were expressed in BAP1. The lack of product turnover is possibly due to the high Km of the standalone KS_MAT enzyme towards ACP, as shown in the kinetic analysis (Figure 11B). The less efficient protein-protein interactions between dissociated components may not be able to sustain the synthesis of the elongated poly-β-ketone backbone under in vivo conditions. Therefore, while both designs of minimal PKS were catalytically competent in vitro, the synthetic linker we introduced to rejoin the KS-MAT and ACP domains proved to be essential in facilitating polyketide synthesis in E. coli.

2.1.3 Conclusions

In summary, we have identified the function of PKS4-PT and demonstrated the total biosynthesis of bacterial aromatic polyketides in the workhorse organism E. coli. The enabling protein-engineering steps were the extraction and reassembly of the minimal PKS from a nonreducing fungal megasynthase. The extraction step was important to inactivate the built-in cyclization modes of the fungal megasynthase, and the reassembly step was indispensable for establishing a kinetically competent enzyme in E. coli. This approach can be extended to other fungal megasynthases of different chain-length specificities and applied toward the synthesis of other type II bacterial aromatic polyketides. For example, pharmaceutically relevant decaketides such as anthracyclines can be synthesized in E. coli by reengineering a fungal decaketide synthase from the hypomycetin biosynthetic pathway.(108) The genetically superior E. coli system can also provide immense opportunities toward biochemical understanding and protein engineering of aromatic PKS enzymes.

2.1.4 Materials and Methods

Cloning of a PKS4 KS_MAT didomain

The KS_MAT didomain was cloned as a standalone protein with primer pair pks4-KS-F-NdeI and pks4-AT-R-NotI-S-EcoRI, which places an NdeI restriction site upstream of the didomain and appends a NotI restriction site, the stop codon and an EcoRI restriction site downstream. The plasmid pSMa76 was used as the template. The PCR product was digested with NdeI and EcoRI and inserted into pET28a to yield pWJ249.


Cloning of a standalone PKS4 ACP domain

The ACP domain was cloned as a standalone protein with primer pair PKS4-ACP-F-NdeI and PKS4-ACP-R-NotI-S-EcoRI, which places an NdeI restriction site upstream of the domain and appends an EcoRI restriction site downstream. The plasmid pSMa76 was used as the template. The PCR product was digested with NdeI and EcoRI and inserted into pET28a to yield pWJ248.

Cloning of PKS_WJ

The ACP domain was recloned with primer pair PKS4ACP-F and PKS4ACP-R, and the linker was cloned with primer pair DEBS2M3-ACPL-F and DEBS2M3-R from eryII (encoding DEBS2). The two PCR products were assembled by splicing by overlap extension PCR (SOE-PCR) and amplified with primer pair DEBS2M3-ACPL-F and PKS4ACP-R. This SOE-PCR product, flanked with NotI sites, was then digested with NotI, inserted into NotI-digested pWJ249, and screened for the correct orientation to yield pWJ282.

Site-directed mutagenesis

Site-directed mutagenesis was used to generate the mutants of the KS_MAT didomain. The mutagenic primers are PKS4KSS2A and PKS4KSS2A–r (shown in Table A1). The expression plasmid pWJ249 was used as the template for KS mutagenesis. Restriction analysis of the plasmid product was first performed to identify mutants, followed by DNA sequencing of the entire gene to verify the intended mutation.

Protein expression and purification

All proteins purified in this work contain N-terminal hexahistidine tags. Expression of the genes was regulated by the T7 promoter and controlled by the LacO operator. The expression plasmid was transformed into E. coli BL21(DE3) strain or BAP1 strain (which contains a chromosomal copy of the sfp gene for phosphopantetheine transfer to apoACP) for protein expression.(94) For 500 mL of liquid culture, the cells were grown at 37°C in LB medium with the appropriate concentration of antibiotics to an OD600 of 0.4-0.6, at which time the cells were incubated on ice for 10 minutes and then induced with 0.1 mM isopropyl thio-β-D-galactoside (IPTG) for 16 hours at 16°C. The cells were harvested by centrifugation (3500 rpm, 10 minutes, 4°C), resuspended in 30 mL lysis buffer (20 mM Tris-HCl, pH = 7.9, 0.5 M NaCl, 10 mM imidazole) and lysed by sonication on ice. Cellular debris was removed by centrifugation (15000 g, 1 hour, 4°C). Ni-NTA agarose resin was added to the supernatant (4 mL/L of culture) and the solution was stirred at 4°C for at least 2 hours. The protein-resin mixture was loaded into a gravity flow column and proteins were purified with increasing concentrations of imidazole in Buffer A (20 mM Tris-HCl, pH = 7.9, 0.5 M NaCl, 5 mM imidazole, 2 mM DTT). Purified proteins were concentrated and buffer-exchanged into Tris buffer (50 mM Tris-HCl, pH = 7.9, 2 mM EDTA, 2 mM DTT) + 10% glycerol with Centriprep filter devices (Amicon Inc.). The final protein was concentrated, aliquoted and flash frozen. Protein concentrations were determined with the Bradford assay using BSA as a standard. Approximate yields: KS_MAT (3 mg/L), ACP (2 mg/L), PKS_WJ (1.6 mg/L).
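The buffer recipes in this protocol reduce to C1V1 = C2V2 arithmetic. The following is a minimal sketch for the 30 mL of lysis buffer. The stock concentrations (1 M Tris-HCl, 5 M NaCl, 1 M imidazole) and the helper name `stock_volumes_ml` are illustrative assumptions and are not specified in the protocol.

```python
# Volumes of stock solutions needed for a target buffer (C1*V1 = C2*V2).
# Stock concentrations below are hypothetical; the protocol does not state them.

def stock_volumes_ml(final_volume_ml, targets_mM, stocks_mM):
    """Return the mL of each stock, plus water, for the requested final volume."""
    volumes = {}
    for name, target in targets_mM.items():
        volumes[name] = final_volume_ml * target / stocks_mM[name]
    volumes["water"] = final_volume_ml - sum(volumes.values())
    return volumes

# Final concentrations of the lysis buffer, taken from the text (in mM).
lysis_targets = {"Tris-HCl": 20.0, "NaCl": 500.0, "imidazole": 10.0}
# Assumed stock concentrations (in mM).
assumed_stocks = {"Tris-HCl": 1000.0, "NaCl": 5000.0, "imidazole": 1000.0}

lysis_recipe = stock_volumes_ml(30.0, lysis_targets, assumed_stocks)

# Ni-NTA resin is dosed at 4 mL per litre of culture; for a 500 mL culture:
resin_ml = 4.0 * 0.5

if __name__ == "__main__":
    for component, volume in lysis_recipe.items():
        print(f"{component}: {volume:.2f} mL")
    print(f"Ni-NTA resin: {resin_ml:.1f} mL")
```

The same function applies to Buffer A and the Tris storage buffer once the corresponding targets and stocks are supplied.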

Large scale polyketide synthesis and NMR

During the scaled-up reaction, malonyl-CoA was made in situ with MatB using either sodium malonate or 13C-3 labeled malonic acid. In a typical reaction, the following reagents were added to phosphate buffer (pH = 7.4): 100 mM sodium malonate, 2 mM DTT, 7 mM

MgCl2, 20 mM ATP, 5 mM coenzyme A, 20 μM MatB, 10 μM KS_MAT and 50 μM ACP or 10

μM PKS_WJ. 40 μM ZhuI was added for compound 7 production. The reaction mixture was shaken gently at room temperature and the reaction progress was followed by HPLC. After the product level had maximized, the reaction mixture was extracted three times with 99% EA/1% AcOH. The resultant organic extracts were combined, evaporated to dryness and redissolved in methanol. Compounds 15 and 51 were purified by reverse-phase HPLC (Alltima C18 5u, 250 mm X 10.0 mm) on a linear gradient of 30 to 50% CH3CN (v/v) over 30 min and 95% CH3CN

(v/v) further for 15 min in H2O supplemented with 0.1% (v/v) trifluoroacetic acid at a flow rate of 2 mL/min. Compound 6 was purified by reverse-phase HPLC (Alltech Apollo 5u, 250 mm X

4.6 mm) on a linear gradient of 10 to 50% CH3CN (v/v) over 45 min and 95% CH3CN (v/v) further for 15 min in H2O supplemented with 0.1% (v/v) trifluoroacetic acid at a flow rate of 1 mL/min. NMR spectra were recorded on a Bruker DRX-500 spectrometer using CD3OD as the solvent.

Transacylation activity assay

The transfer of the malonyl group from malonyl-CoA to acceptor ACPs was performed in phosphate buffer (pH = 7.4). Each holo-ACP domain was added to a final concentration of 100

μM. KS0_MAT (C514A) was added to a final concentration between 10 and 50 nM. [1,2-14C] malonyl-CoA (55 nCi/nmol) was added to initiate the reaction at a final concentration of 180

μM. The reaction was performed at room temperature for 30, 60 and 90 seconds. At each time point, a 10 μL aliquot was quenched with 10 μL of 2x SDS loading dye and separated by SDS-PAGE, and the radiolabeled ACP bands were visualized by autoradiography.

Polyketide product detection

In a typical reaction, 10 μM KS_MAT (with 50 μM ACP) or PKS_WJ was incubated with 20 μM MatB, 100 mM sodium malonate (12C or 2-13C labeled), 7 mM MgCl2, 5 mM coenzyme A and 20 mM ATP. The bacterial tailoring enzymes were each added to a final concentration of 40 μM. All polyketide product assays were incubated at room temperature. The assays were stopped after 4 hours and extracted three times with 99% ethyl acetate (EA)/1% acetic acid (AcOH). The organic phase was separated, evaporated to dryness and redissolved in 20 μL of methanol, and the resulting sample was subjected to HPLC analysis.

HPLC analysis (Alltech Apollo 5u, 250 mm X 4.6 mm) was performed on a linear gradient of 5 to 95% CH3CN (v/v) over 30 min and 95% CH3CN (v/v) for a further 10 min in H2O supplemented with 0.1% (v/v) trifluoroacetic acid at a flow rate of 1.0 mL/min. HPLC retention times were as follows: SMA76a (1): 28.3 min, SMA93 (2): 20.4 min, PK8 (3): 17.4 min, SEK26 (4): 24.0 min, (5): 18.9 min, (6): 15.0 min, (7): 14.7 min, (13): 30.1 min.
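The analytical gradient described above can be written as a small piecewise function, which is convenient for estimating the nominal solvent composition at which each compound elutes. This is a sketch only: the function name is hypothetical, and it ignores the instrument dwell volume, so the values are programmed compositions rather than the composition actually reaching the column.

```python
# Programmed %CH3CN for the analytical method: 5 -> 95% over 30 min, then 95% for 10 min.
# Dwell volume and column dead time are ignored (simplifying assumption).

def percent_acetonitrile(t_min, start=5.0, end=95.0, ramp_min=30.0, hold_min=10.0):
    """Return the programmed %CH3CN (v/v) at time t_min."""
    if t_min < 0 or t_min > ramp_min + hold_min:
        raise ValueError("time outside the programmed method")
    if t_min <= ramp_min:
        return start + (end - start) * t_min / ramp_min
    return end

# Retention times quoted in the text (minutes), keyed by compound number.
retention_min = {1: 28.3, 2: 20.4, 3: 17.4, 4: 24.0, 5: 18.9, 6: 15.0, 7: 14.7, 13: 30.1}

nominal_composition = {
    compound: percent_acetonitrile(rt) for compound, rt in retention_min.items()
}

if __name__ == "__main__":
    for compound, pct in sorted(nominal_composition.items()):
        print(f"compound {compound}: {pct:.1f}% CH3CN")
```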

LC-MS was conducted with a Shimadzu 2010 EV Liquid Chromatography Mass Spectrometer using both positive and negative electrospray ionization and a Phenomenex Luna 5u 2.0 x 100 mm C18 reverse-phase column. Samples were separated on a linear gradient of 5 to 95% CH3CN (v/v) over 30 min and 95% CH3CN (v/v) for a further 30 min in H2O supplemented with 0.05% (v/v) formic acid at a flow rate of 0.1 mL/min at room temperature.

Polyketide turnover assay

A time course analysis was performed on each of the PKSs described in this work. The KS_MAT, PKS_WJ and PKS4 concentrations were fixed at 3 μM in phosphate buffer (pH = 7.4). ACP concentrations were varied between 5 and 90 μM. The reaction was initiated by the addition of 2 mM [1,2-14C] malonyl-CoA (1.22 nCi/nmol) and allowed to proceed at room temperature. At each time point (5, 10 and 20 min), a 10 μL aliquot was quenched and extracted with 300 μL of 99% EA/1% AcOH. The organic phase was evaporated to dryness, redissolved in 15 μL EA, separated by thin-layer chromatography (EA/AcOH, 99:1) and quantified with a phosphorimager.
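The quantities in this assay can be related by simple unit arithmetic. The sketch below shows how a radioactivity reading for a product spot could be converted into moles of product and an apparent turnover rate. It assumes that all nine C2 units of a nonaketide derive from the labeled malonyl-CoA and that the phosphorimager signal has already been calibrated to nCi. Both are assumptions for illustration, and the 10.98 nCi reading in the example is hypothetical, not a measured value.

```python
# Unit bookkeeping for the 14C turnover assay.
MALONYL_COA_MM = 2.0       # mM, from the text
SPECIFIC_ACTIVITY = 1.22   # nCi per nmol malonyl-CoA, from the text
ALIQUOT_UL = 10.0          # uL, from the text
ENZYME_UM = 3.0            # uM megasynthase or KS_MAT, from the text

def nmol_in_aliquot(conc_mM, volume_uL):
    """mM x uL gives nmol directly."""
    return conc_mM * volume_uL

def product_nmol(measured_nCi, malonyl_units=9):
    """Convert calibrated product radioactivity into nmol of polyketide.

    Assumes every C2 unit (including the starter) comes from labeled malonyl-CoA.
    """
    return measured_nCi / (SPECIFIC_ACTIVITY * malonyl_units)

def turnover_per_min(measured_nCi, minutes, malonyl_units=9):
    """Apparent turnover (product per enzyme per minute) for one aliquot."""
    enzyme_nmol = ENZYME_UM * ALIQUOT_UL / 1000.0  # uM x uL = pmol; /1000 gives nmol
    return product_nmol(measured_nCi, malonyl_units) / enzyme_nmol / minutes

substrate_nmol = nmol_in_aliquot(MALONYL_COA_MM, ALIQUOT_UL)
total_activity_nCi = substrate_nmol * SPECIFIC_ACTIVITY

if __name__ == "__main__":
    print(f"malonyl-CoA per aliquot: {substrate_nmol:.1f} nmol")
    print(f"radioactivity per aliquot: {total_activity_nCi:.1f} nCi")
    # Hypothetical reading: 10.98 nCi in the product spot after 10 min.
    print(f"apparent turnover: {turnover_per_min(10.98, 10):.2f} per min")
```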

Small-scale shake flask low-density fermentation


E. coli BAP1 strain transformed with expression plasmid(s) was grown in F1 medium at 37°C to an OD600 of 0.4-0.6,(109) at which time the cultures were cooled to 16°C, induced with 0.2 mM IPTG, and grown at 16°C and 250 rpm for 2 days. For product detection, 1 mL of cell culture was collected. The cell pellet was extracted with , evaporated to dryness and redissolved in methanol. The medium was extracted with 99% EA/1% AcOH, and the organic phase was separated, evaporated to dryness and redissolved in methanol. Both fractions were then subjected to HPLC analysis.

High-density F1 fed-batch fermentation

Methods for F1 fed-batch fermentation and medium composition were adapted from those described previously by Pfeifer et al.(94, 110) A starter culture was grown overnight in 5 mL of LB medium at 37°C and 250 rpm, and 1 mL was used to inoculate a 100 mL shake flask seed culture with F1 medium. 10 mL of the seed F1 culture was used to inoculate a 2 L Applikon Biobundle vessel containing 1 L of F1 medium. Fermentation was started at 37°C, and the pH was maintained at 7.0 throughout the experiment with 1 M HCl and half-concentrated NH4OH. Aeration was controlled at 0.2 to 0.5 L/min, and agitation was maintained at 1400 rpm. When the OD600 reached ~15-20, the temperature of the fermentation was reduced to 16°C, followed by the addition of 0.5 mM IPTG to induce protein expression. At the same time, a peristaltic pump delivered 0.1 mL/min of the feed solution to the fermentor. At each time point, a 0.5 mL aliquot was removed from the culture and the products were analyzed by HPLC.

2.2 Classification, Prediction, and Verification of the Regioselectivity of Fungal Polyketide Synthase Product Template Domains


This research was originally published in J. Biol. Chem.: Li, Y., Xu, W., Tang, Y. (2010). Classification, Prediction and Verification of the Regioselectivity of Fungal Polyketide Synthase Product Template Domains. J. Biol. Chem. 285 (30), 22762-22771. © the American Society for Biochemistry and Molecular Biology.

2.2.1 Introduction

Natural products such as polyketides and nonribosomal peptides isolated from filamentous fungi have played indispensable roles in human health care.(12) During the past decade, advances in rapid DNA sequencing techniques have enabled the complete genome sequencing of many fungal species.(76, 77, 111-113) One important outcome of the sequencing efforts is the revelation of large numbers of natural product biosynthetic pathways encoded in these organisms. Among them, putative polyketide synthases (PKSs) and nonribosomal peptide synthetases (NRPSs) are particularly abundant. Many of the encoded pathways are silent under laboratory culturing conditions and as a result, the metabolites associated with these pathways have not been observed or isolated.(114) Mining these uncharacterized pathways is therefore an important objective towards the discovery of new bioactive compounds and novel enzymatic machineries.(115)

One method of predicting the structures of polyketides encoded in cryptic pathways is by analyzing the sequence of the PKSs. This method has been successfully used in the prediction of bacterial polyketides by examining the linear arrangement of catalytic sites along the modular type I PKSs.(13, 116) However, this method is not readily applicable to fungal polyketides owing to the differences in enzyme architectures and, more importantly, the lack of the colinearity rule. Fungal PKSs are large megasynthases that assemble single copies of catalytic domains in one polypeptide. The domains function in an iterative fashion and use built-in programming rules to dictate chain length, cyclization, reductive tailoring and other structural features.(14) Therefore, understanding the relationships between protein sequence and catalytic activity, as well as substrate specificity, of each domain is essential to unlocking the biosynthetic potential of fungal PKSs.

Among the different families of fungal iterative PKSs, the nonreducing PKSs (NRPKSs) are responsible for the synthesis of a number of important metabolites, including the well-known mycotoxin aflatoxin.(16) Other representative products of NRPKSs include pigment precursors

(1,3,6,8-tetrahydroxynaphthalene, THN), compounds with (viridicatumtoxin) and anticancer (bikaverin) properties. Polyketides produced from NRPKSs are aromatic compounds with sizes ranging from monocyclic aromatics to those containing multiple fused rings. Starting from the N-terminus of the megasynthase, the following domains are typically found in an

NRPKS: a starter unit:ACP transacylase (SAT) that selects the starter unit;(15) a ketosynthase

(KS) that catalyzes repeated decarboxylative condensation to elongate the polyketide backbone; a malonyl-CoA:ACP transacylase (MAT) that selects and transfers the extender unit malonyl-

CoA; a product template (PT) domain that controls the immediate cyclization regioselectivity of the reactive polyketide backbone;(88) an acyl-carrier protein (ACP) that serves as the tether of the growing and completed polyketide via its phosphopantetheinyl arm;(117) and a thioesterase/Claisen-like cyclase (TE/CLC) domain that cyclizes and releases the product.(18)

Other domains such as methyltransferase (MT) and reductase (R) can be found in subclades of the NRPKSs.(76) Establishment of domain boundaries using computational tools such as the Udwary-Merski algorithm (UMA),(80) and dissection of domains,(16, 17, 37, 99) have been used to verify the function of individual domains. Expression of intact megasynthases in heterologous hosts such as Aspergillus oryzae(118, 119) and E. coli(78) has been used to reconstitute the activities of the entire megasynthase through structural determination of the polyketide products.

Because the poly-β-ketone backbone of a newly synthesized polyketide is extremely reactive, the NRPKSs must suppress spontaneous cyclization and control the regioselectivity of the first cyclization step to afford the F-mode folding of the fungal aromatic polyketides.(4) In this mode, the first aromatic ring contains two intact acetate units while the two bridging carbons are derived from two different acetate units. As shown in Figure 15, the three most commonly observed F-mode folding patterns are C2-C7, C4-C9 and C6-C11. Recently, the PT domain has been shown to be responsible for this important first step in tailoring the completed polyketide backbone. Using dissected domains, Townsend and co-workers demonstrated that the PT from

PksA (from Aspergillus parasiticus) can regioselectively cyclize a hexanoyl-primed octaketide via C4-C9 and C2-C11 aldol condensations en route to norsolorinic acid 52, the key intermediate of aflatoxin B1 53 biosynthesis (Figure 15).(16) The structural basis of PksA-PT has also been elucidated to confirm its catalytic role as an aldol cyclase.(88) Ma et al. have reconstituted the intact bikaverin synthase PKS4 (from Gibberella fujikuroi) in E. coli to synthesize the C2-C7 cyclized nonaketide SMA76a 49.(78) Systematic removal of individual domains from the intact

PKS4 was then conducted to verify the functional roles of the TE and PT domains.(17, 37)

Removal of the TE/CLC domain (named PKS4-99) produced the C2-C7 cyclized SMA93 60 and confirmed the role of the TE/CLC in catalyzing the C1-C10 Claisen reaction (Figure 15A).(37)

Further excision of the PT domain from PKS4-99 abolished the production of 60 and led to the biosynthesis of aberrantly cyclized products 15 and 51, confirming the role of the PT domain in catalyzing the C2-C7 aldol condensation.(17)
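The F-mode rule stated earlier in this section (two intact acetate units in the first ring, with the two bridging carbons coming from different acetate units) can be checked mechanically for any proposed first-ring closure. The sketch below assumes the usual backbone numbering in which C1 is the thioester carbonyl, so that acetate units are the pairs C1-C2, C3-C4 and so on. The function names are hypothetical. The complementary "S" label, for rings containing three intact acetate units, comes from the general polyketide literature and is not discussed in this section.

```python
# Classify a first-ring aldol closure by how many intact acetate units the ring contains.
# Assumes C1 is the thioester carbonyl, so acetate units are (C1,C2), (C3,C4), ...

def acetate_unit(carbon):
    """Index of the acetate unit that contributes a given backbone carbon."""
    return (carbon + 1) // 2

def fold_mode(c_low, c_high):
    """Return (mode, intact_units) for a six-membered ring closed between c_low and c_high."""
    if c_high - c_low != 5:
        raise ValueError("a six-membered carbocycle spans six consecutive backbone carbons")
    units = [acetate_unit(c) for c in range(c_low, c_high + 1)]
    intact = sum(1 for u in set(units) if units.count(u) == 2)
    # Two intact units: F mode (bridging carbons from two different acetate units).
    # Three intact units: S mode (not covered in this section).
    return {2: "F", 3: "S"}[intact], intact

if __name__ == "__main__":
    for closure in [(2, 7), (4, 9), (6, 11), (7, 12)]:
        mode, intact = fold_mode(*closure)
        print(f"C{closure[0]}-C{closure[1]}: {mode} mode, {intact} intact acetate units")
```

All three closures named in the text (C2-C7, C4-C9 and C6-C11) start at an even-numbered carbon, which is why each ring splits the two terminal acetate units and keeps exactly two intact.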

[Figure 15 image: (A) domain organizations of PKS4, PksA and ACAS (SAT, KS, MAT, PT, ACP, TE/CLC or discrete lactamase) shown across the stages polyketide chain elongation, PT-mediated cyclization and TE-mediated cyclization/chain offloading, leading to SMA76a (C2-C7), norsolorinic acid 52 (C4-C9) and atrochrysone 67 (C6-C11), together with the spontaneously cyclized products 15 and 51 and the inactive-TE product 60; (B) structures not recoverable from the extracted text]

Figure 15. (A) Biosynthesis of fungal aromatic polyketides. C2-C7, C4-C9/C2-C11 and C6-C11/C4-C13 are commonly observed F-mode folding patterns catalyzed by fungal NRPKSs. Product template domains that mediate C2-C7 and C4-C9/C2-C11 folding modes have been characterized in the study of PKS4 and PksA, respectively. TE/CLC domains catalyze the cyclization of the last ring and off-loading. In the absence of PT-mediated cyclization, the PKS4 minimal PKS produces spontaneously cyclized products 15 and 51. (B) C6-C11/C4-C13 cyclization regioselectivity has been observed in various fungal aromatic polyketides such as asperthecin, emodin, and hypomycetin.

The commonly observed C6-C11 F-mode cyclization is exemplified by the model product emodin and related compounds (Figure 15B).(100, 108, 120, 121) NRPKSs that synthesize C6-C11 cyclized aromatic polyketides include AptA and AN0150, involved in the synthesis of asperthecin 65 and emodin 66 in A. nidulans, respectively,(120, 121) as well as ACAS, which synthesizes atrochrysone 67 in Aspergillus terreus (Figure 15A).(122) A common feature observed among these PKSs is the lack of a fused TE/CLC domain. For ACAS, a discrete β-lactamase-type TE (ACTE) is employed to release the product from ACAS.(122) Unlike the PT domains from PksA and PKS4, the activities of the PT domains from C6-C11 regiospecific NRPKSs have not been verified.

Because of the important roles played by PT domains in controlling polyketide structures, being able to precisely predict the cyclization regioselectivity of PT domains is an important step towards suggesting the potential products of NRPKSs. Towards this end, we sought to 1) establish a sequence-activity relationship through bioinformatics analysis; 2) establish a simple experimental platform to assay the activities of PT domains; and 3) use the experimental platform to confirm the predicted regioselectivity of PT domains from cryptic NRPKSs.

2.2.2 Results

Phylogenetic analysis of PTs

To establish a sequence-activity relationship, we performed a phylogenetic analysis of PT domains from NRPKSs that have been associated with known fungal aromatic polyketides (Table A5). A previous phylogenetic analysis of the KS sequences classified the NRPKSs into subclades I and II, which include PKS4 and PksA, and a subclade III in which the PKSs all contain an MT domain embedded between the ACP and TE/CLC domains.(76) Due to this architectural difference and the phylogenetic distance to subclades I and II, subclade III NRPKSs were not included in this study. The subclade I and II PT sequences were aligned and phylogenetically analyzed by means of the bootstrap minimum evolution method.(123) The resulting phylogenetic tree shown in Figure 16 clearly divides these NRPKSs into five major groups.
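The two ingredients of a bootstrapped distance-based analysis, a pairwise distance computed from the alignment and resampling of alignment columns with replacement, can be illustrated in a few lines. This is a sketch only. It uses the simple p-distance (fraction of differing sites) rather than a substitution model, it does not implement minimum evolution tree inference, and the three short sequences are invented placeholders, not PT domain sequences.

```python
import random

def p_distance(seq_a, seq_b):
    """Fraction of aligned, gap-free positions at which two sequences differ."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable positions")
    return sum(a != b for a, b in pairs) / len(pairs)

def bootstrap_alignment(alignment, rng):
    """One bootstrap replicate: resample alignment columns with replacement."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}

def distance_matrix(alignment):
    """Pairwise p-distances for every unordered pair of sequences."""
    names = sorted(alignment)
    return {
        (a, b): p_distance(alignment[a], alignment[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }

# Toy aligned sequences (hypothetical, for illustration only).
toy_alignment = {
    "PT_a": "MKV-LAGHT",
    "PT_b": "MKVALAGHS",
    "PT_c": "MRIALSGHS",
}

if __name__ == "__main__":
    print(distance_matrix(toy_alignment))
    replicate = bootstrap_alignment(toy_alignment, random.Random(0))
    print(distance_matrix(replicate))
```

In a full analysis, a tree would be inferred from each replicate's distance matrix, and the fraction of replicates recovering a given grouping would be reported as its bootstrap support.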

[Figure 16 image: phylogenetic tree of PT domains with groups I-V labeled and a scale bar of 0.2. Leaf labels, in the order extracted: C. resinifera PKS1, C. lagenarium PKS1, O. piceae PKSA, S. macrospora PKS, G. lozoyensis PKS1, B. oryzae PKS1, Nodulisporium sp. PKS1, E. fawcettii PKS1, W. dermatitidis PKS1, E. lecanii-corni PKS1, A. nidulans AN0150, A. terreus ACAS, G. fujikuroi PKS4, A. nidulans AptA, A. fumigatus PKSP, A. nidulans WA, A. nidulans AN7909, G. zeae PKS12, P. chlamydosporia RDC1, C. chiversii RADS2, C. nicotianae CTB1, G. zeae PKS13, H. subicuosus Hpm3, L. maculans PKS1, A. parasiticus PksA, A. oryzae PksA, A. ochraceoroseus AflC, A. flavus PksA, A. nidulans PKSST, D. septosporum PksA]

Figure 16. Phylogenetic analysis of PT domains from 30 NRPKSs that have been related to known aromatic polyketides. The NRPKSs have been classified into five major groups: group I, C2-C7 monocyclic PKSs; group II, THN synthases (C2-C7 bicyclic PKSs); group III, C2-C7 multicyclic PKSs; group IV, C4-C9 PKSs; and group V, C6-C11 PKSs. The evolutionary distances are in the units of the number of amino acid substitutions per site. The sum of branch length is 10.38. The tree is drawn to scale (see scale bar), with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree.

Group I includes the NRPKSs that synthesize compounds containing a single aromatic ring, exemplified by the recently discovered orsellinic acid synthase AN7909 from A. nidulans.(120, 124) Group I also includes the NRPKSs that are involved in the synthesis of the aromatic portions of the resorcylic acid lactones, including the zearalenone PKS13,(106, 125) the hypothemycin Hpm3(126, 127) and the radicicol RDC1/RADS2.(126, 128) Primed by different starter units, Group I NRPKSs synthesize a tetraketide backbone and the PT domains perform the regioselective C2-C7 cyclization to yield the aromatic ring. Group II contains most of the known THN 56 synthases (THNSs). The PT domain of this group catalyzes the cyclization of pentaketide backbones via C2-C7 aldol condensation, followed by TE/CLC catalyzed cyclization of the second ring and product release.(79)

The remaining three groups contain NRPKSs that synthesize longer polyketide chains and multiple fused-ring structures. Group III contains NRPKSs that cyclize the first ring with C2-C7 regioselectivity, such as the nonaketide synthase PKS4(78, 82) and the aurofusarin heptaketide synthase PKS12 from G. zeae.(125) An interesting member of the phylogenetic tree is the WA synthase from A. nidulans, which synthesizes THN. This NRPKS is a member of Group III instead of Group II. This classification is consistent with previous biochemical analysis, which showed that synthesis of 56 by WA is a result of chain shortening of the heptaketide product YWA1 48.(129) The well-studied PksA and other NRPKSs that are involved in the synthesis of C4-C9/C2-C11 cyclized aromatic polyketides are clearly separated into Group IV. This group contains other known aflatoxin synthases that are analogues of PksA,(130, 131) as well as the heptaketide cercosporin synthase CTB1.(132) Within this group, NRPKSs primed with different starter units are clearly divergent in sequence, as illustrated by the phylogenetic distance between the acetate-primed CTB1 and the hexanoyl-primed aflatoxin synthases. Lastly, Group V consists of AptA, AN0150 and ACAS, which putatively cyclize the nascent polyketide with C6-C11/C4-C13 regioselectivity. Because all members of this group lack the C-terminal TE/CLC domain, the Group V PT domains appear to be responsible for all the enzyme-catalyzed polyketide cyclization reactions.

Validating the Group V PT domain activities

Our analysis with a limited number of PT domains indicated that cyclization specificities can be readily correlated and classified based on protein sequence. To validate this phylogenetic classification of PT functions, we attempted to confirm the predicted but unverified C6-C11/C4-C13 cyclization regioselectivity of Group V PT domains. To do so, we hypothesized that PT domains can be extracted from the native megasynthase and functionally combined with a well-characterized heterologous minimal PKS. By analyzing the structure of the product synthesized by the hybrid components, the regiospecificity of the unknown PT domain can be deciphered. This approach may be especially useful for NRPKSs of which the intact megasynthase or the endogenous minimal PKS domains cannot be solubly expressed or functionally reconstituted.

The G. fujikuroi PKS4 is well-suited to serve as the template for PT domain swapping because i) the megasynthase can be expressed at high levels and the minimal PKS (KS-MAT, ACP) can produce ample amounts of the nonaketide products in vitro, which facilitates product structure elucidation; ii) the minimal PKS domains do not significantly influence the cyclization regioselectivity of the nascent polyketide,(17) so we anticipated that a heterologous PT could dictate the cyclization outcome; and iii) other built-in cyclization domains, such as the TE/CLC domain, can be removed from the assembly line without affecting product elongation and turnover. Hence, we can attribute all cyclization steps to the transplanted PT domain.

We chose to study the PT domain from the asperthecin synthase AptA (AptA-PT)(121) of Group V. The gene encoding AptA-PT was amplified from aptA in accordance with the boundaries of the crystallized PksA PT domain (Figure A4).(16) AptA-PT was then expressed in E. coli strain BL21(DE3) and purified to a final yield of 13 mg/L. To assay the regioselectivity, AptA-PT was combined with the PKS4 minimal PKS. We used the malonyl-CoA synthase MatB(104) from Rhizobium trifolii to generate the malonyl-CoA extender units from malonate, ATP and CoA. The reaction was allowed to proceed for 12 hours, after which the organic products were extracted, dried and analyzed by LC-MS (Figure 17B). As expected, the minimal PKS alone produced polyketides that are not cyclized in accordance with the F-modes, including 15 and 51 (Figure 17B). In contrast, in the presence of AptA-PT, the spontaneous cyclization modes were suppressed and a predominant product that corresponds to a new compound 68 (retention time, RT = 25.5 min) was produced (Figure 17B). Compound 68 has a parent ion peak [M-H]- at m/z 337, which is consistent with a nonaketide backbone that has undergone three dehydrations and one oxidation. The nonaketide backbone of 68 was further confirmed when an increase of 9 mu was observed for 68 upon using [2-13C] malonate as the substrate of MatB (Figure A6). The UV absorbance of 68 displays λmax at 247, 315 and 474 nm, which suggests that this compound contains a highly conjugated chromophore (Figure A6).
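The mass assignment above can be reproduced with elemental bookkeeping. The sketch below starts from the hydrolysed linear nonaketide acid, CH3-CO-(CH2-CO)8-OH, removes three waters for the aldol dehydrations, and applies one oxidation, modelled here as CH2 to C=O. It also removes one further water equivalent for the pyrone ring closure. That last step is a bookkeeping assumption of this sketch: on the enzyme the ring closure releases the product from the ACP thioester, so it appears as a water loss only when counted relative to the free acid. All function and constant names are hypothetical.

```python
# Elemental bookkeeping for the [M-H]- ion of compound 68.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}  # atomic masses in u
PROTON = 1.00727647                                             # proton mass in u

def linear_polyketide_acid(n_units):
    """Formula of CH3-CO-(CH2-CO)(n-1)-OH built from n C2 units."""
    return {"C": 2 * n_units, "H": 2 * n_units + 2, "O": n_units + 1}

def adjust(formula, delta, times=1):
    """Apply an elemental change (for example, loss of water) a number of times."""
    out = dict(formula)
    for element, change in delta.items():
        out[element] = out.get(element, 0) + change * times
    return out

def monoisotopic_mass(formula):
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())

LOSE_WATER = {"H": -2, "O": -1}
OXIDIZE_CH2_TO_CO = {"H": -2, "O": 1}   # assumed form of the single oxidation

backbone = linear_polyketide_acid(9)                    # C18H20O10
compound_68 = adjust(backbone, LOSE_WATER, times=3)     # three aldol dehydrations
compound_68 = adjust(compound_68, LOSE_WATER)           # pyrone closure, relative to the free acid
compound_68 = adjust(compound_68, OXIDIZE_CH2_TO_CO)    # one oxidation
mz_negative = monoisotopic_mass(compound_68) - PROTON   # [M-H]-

# [2-13C] malonate labels one carbon per C2 unit, so nine units shift the mass by nine.
label_shift = 9 * 1

if __name__ == "__main__":
    print(compound_68, round(mz_negative, 3), label_shift)
```

Under these assumptions the formula comes out as C18H10O7, whose [M-H]- ion has a nominal m/z of 337, and the labeling shift matches the 9 mu increase reported in the text.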

We next tested whether the AptA-PT domain can functionally replace the PKS4 PT domain in PKS4 to create a hybrid megasynthase. Using the TE-less PKS4 (PKS4-99) as the template and SOE-PCR, we constructed PKSH1 in which the original PT domain was replaced with the AptA-PT (Figure 17A). PKSH1 was expressed in E. coli strain BAP1, which contains a chromosomally encoded copy of the broadly specific phosphopantetheinyl transferase, Sfp.(94)

This hybrid megasynthase was solubly expressed in BAP1, and was purified with nickel affinity chromatography to near homogeneity (~2 mg/L) (Figure 17A). Interestingly, when PKSH1 was assayed for product formation in vitro, a product profile similar to that of the reaction mixture containing the dissociated enzymatic components was observed. Whereas PKS4-99 produced the expected C2-C7 cyclized 60 as the main product, PKSH1 synthesized 68 as the predominant product (Figure 17B). This result indicates that while the hybrid megasynthase remains functional to produce polyketides, the regioselectivity of the cyclization steps is completely altered upon replacement of the PT domain. The robust turnover of products by PKSH1 further confirms the dexterity of PKS4 towards domain engineering experiments.

[Figure 17 image: (A) domain organizations (SAT, KS, AT, PT, ACP) of PKSH1, PKS4-99 and AptA, with SDS-PAGE lanes M and E (markers at 220, 160, 120 and 100 kDa); (B) chromatogram traces i-iv with peaks 15, 51, 60 and 68; x-axis in minutes (10-40)]

Figure 17. (A) Construction and SDS-PAGE analysis of PKSH1. PKSH1 (196 kDa) is constructed through domain shuffling of PT domain between PKS4 and AptA using PKS4-99 as a template. For SDS-PAGE: lane M, bench mark ladder; lane E, Ni-NTA purified PKSH1. (B) LC-MS (325 nm) analysis of the polyketides synthesized by PKSH1 in vitro. Trace i: PKS4 minimal PKS alone synthesizes nonaketides 15 and 51, which are derived from spontaneous cyclization of the backbone; trace ii: PKS4 minimal PKS domains and AptA-PT interact efficiently to afford a new nonaketide 68 as the predominant product; trace iii: the parent megasynthase PKS4-99 synthesizes 60 as a major product; and trace iv: PKSH1 synthesizes 68 as the predominant product.

Elucidation of the structure of 68

To determine the structure and cyclization pattern of 68, we scaled up the in vitro assay with PKSH1 using 12C-malonate enriched with 10% [1,2,3-13C] malonate. The 13C-enriched 68 was purified from the crude extract through reverse-phase HPLC and thoroughly characterized by NMR spectroscopy. Seven signals were observed in the 1H NMR spectrum, including one methyl (δH = 2.29 ppm), four aromatic (δH = 6.65, 6.81, 7.16, 7.69 ppm) and two phenolic protons (δH = 12.11, 13.74 ppm). A spin system of CH (δH = 6.65 ppm) - CH (δH = 7.16 ppm) was observed with a coupling constant of JHH = 2.4 Hz. The 13C NMR spectrum (Table 1, Figure A7) showed 18 signals, each appearing as a doublet due to isotopic enrichment.

The signals include those that are associated with a (δC = 181.4 and δC = 188.4 ppm).

An additional carbonyl peak (δC = 158.1 ppm) was observed and is indicative of a pyrone moiety. 1H-13C HMQC (Figure A7) correlations enabled the assignment of the four hydrogen atoms to the directly bonded carbon atoms (8-C/H, 10-C/H, 14-C/H and 16-C/H). The coupling constants calculated from the 13C spectrum allowed establishment of nine sets of neighboring carbons. Together with the remaining 1D and 2D NMR data, including 1H-13C HMBC (Figure A7) correlations, the structure of 68 was elucidated as an α-pyranoanthraquinone as shown in Figure 18.

Table 1. NMR Spectral Data for 68 in DMSO-d6. (1)

No.   13C δ (ppm)   JCC (Hz)   1H δ (ppm) (m, area, JHH (Hz))   1H-13C HMBC
1     158.1         75.2       -                                -
2     110.3[b]      75.7       -                                -
3     164.8         64.8       13.74 (s, 1H)                    2, 4
4     113.7         63.9       -                                -
5     188.4         59.6       -                                -
6     109.3         59.6       -                                -
7     164.5         69.8       12.11 (s, 1H)                    6, 7, 8
8     108.5         69.3       6.65 (d, 1H, 2.4)                6, 10
9     165.5         61.7       -                                -
10    108.8         62.0       7.16 (d, 1H, 2.4)                6, 8, 12
11    134.9         54.4       -                                -
12    181.12        54.6       -                                -
13    137.1         62.6       -                                -
14    115.0         63.4       7.69 (s, 1H)                     2, 4, 12, 16
15    145.7         53.4       -                                -
16    104.2         53.3       6.81 (s, 1H)                     2, 14, 17
17    160.0         51.8       -                                -
18    19.4          51.1       2.29 (s, 3H)                     16, 17

(1) The 1H spectrum was recorded at 500 MHz, the 13C NMR spectrum was recorded at 125 MHz, and the 1H-13C HMBC and HMQC spectra were collected at 600 MHz.
(2) Weak signals and were observable in the 1H-13C HMBC.
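The "nine sets of neighboring carbons" can be recovered from the JCC column of Table 1, because two carbons from the same intact, doubly labeled acetate unit split each other with the same coupling constant. The sketch below pairs carbons that are adjacent in the backbone numbering and whose JCC values agree within a tolerance. The 1.0 Hz tolerance and the function name are assumptions chosen for illustration, and the approach presumes that the table numbering follows the polyketide chain.

```python
# JCC values (Hz) from Table 1, keyed by carbon number.
J_CC = {
    1: 75.2, 2: 75.7, 3: 64.8, 4: 63.9, 5: 59.6, 6: 59.6,
    7: 69.8, 8: 69.3, 9: 61.7, 10: 62.0, 11: 54.4, 12: 54.6,
    13: 62.6, 14: 63.4, 15: 53.4, 16: 53.3, 17: 51.8, 18: 51.1,
}

def pair_coupled_neighbors(j_cc, tolerance_hz=1.0):
    """Pair chain-adjacent carbons whose 13C-13C coupling constants match.

    Scans the carbons in numerical order; a matched pair is interpreted as one
    intact acetate unit. The tolerance is an illustrative assumption.
    """
    carbons = sorted(j_cc)
    pairs = []
    i = 0
    while i < len(carbons) - 1:
        a, b = carbons[i], carbons[i + 1]
        if abs(j_cc[a] - j_cc[b]) <= tolerance_hz:
            pairs.append((a, b))
            i += 2          # both carbons are consumed by this acetate unit
        else:
            i += 1          # no match; try the next carbon as a pair start
    return pairs

acetate_pairs = pair_coupled_neighbors(J_CC)

if __name__ == "__main__":
    print(len(acetate_pairs), acetate_pairs)
```

With the tabulated values this yields nine pairs, C1-C2 through C17-C18, which is the pattern expected for a nonaketide assembled from nine intact C2 units.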

Based on the structure and the [1,2-13C] acetate incorporation pattern of 68, the cyclization steps catalyzed by PKSH1 are shown in Figure 18. Importantly, this verified that AptA-PT indeed catalyzes the cyclization of the polyketide backbone via C6-C11 aldol condensation. Cyclization of the second ring via C4-C13 is most likely also facilitated by this PT domain. Closure of the third ring via C2-C15 is likely spontaneous, as was proposed for ACAS(122) and is commonly observed for most bacterial anthraquinone polyketides.(35) Finally, a spontaneous C1-O17 esterification reaction takes place to form the α-pyrone fourth ring with concomitant product release, followed by second ring oxidation to yield 68 (Figure 18). To generalize this finding, we further dissected the PT domain of A. nidulans AN0150(120) as a second example of Group V PTs (Figure A4). When combined, PKS4 minimal PKS and AN0150-PT also synthesized 68 as the predominant product (Figure A8).

[Figure 18 image: biosynthetic scheme with steps labeled PKSH1]

Figure 18. The biosynthesis of 68 and the 1H-13C HMBC correlations of 68. 68 is synthesized from a nonaketide backbone synthesized by the PKS4 minimal PKS domains, followed by the AptA-PT mediated C6-C11 and C4-C13 cyclizations as shown in this study. Cyclization of the third ring via C2-C15 is proposed to occur spontaneously. The proposed spontaneous C1-O17 esterification forms the fourth ring and releases the product, followed by second ring oxidation to afford 68.


Therefore, we have functionally verified the regioselectivity of Group V PT domains through heterologous recombination with the PKS4 minimal PKS, both as dissociated enzymes and as a hybrid megasynthase. Together, the C2-C7, C4-C9 and C6-C11 PT domains represent the three known fungal “cyclases” reconstituted so far. It is interesting to note that the AptA PT domain functions efficiently on a nonaketide backbone, while asperthecin was predicted to derive from an octaketide backbone.(121) Thus the PT domains may also be functional when paired with minimal PKSs that have different chain length specificities.

Prediction of PT Regioselectivity from NRPKS of Unknown Function

Given that all five groups of PT functions have been functionally verified, the phylogenetic tree may serve as a predictive tool for the product structures synthesized by NRPKSs. To this end, we included 99 additional PT sequences extracted from NRPKSs of unknown function (Table A6) in the alignment. The inclusion of these sequences did not significantly influence the phylogenetic classification, and most of the sequences fall into the five groups discussed above (Figure A9).

To test whether the PT regioselectivity agrees with the phylogenetic classification, we chose to functionally test a PT domain extracted from an unknown NRPKS. The PT domain chosen was that of An03g05440 from Aspergillus niger (assigned as e_gw1_14.257 in A. niger ATCC 1015). The An03g05440-PT is located within Group III, yet belongs to a branch that does not contain any functionally verified NRPKS. The gene fragment encoding the An03g05440-PT was amplified from A. niger genomic DNA and used to replace the pks4 PT in the PKS4-99 template to afford the hybrid megasynthase PKSH2. PKSH2 was similarly expressed in BAP1 and purified to a final yield of ~1.5 mg/L (Figure 19A). PKSH2 was assayed in vitro and the products were analyzed with LC-MS (Figure 19B). Similar to PKS4-99, PKSH2 synthesized 60 as the major product with similar yields. This confirms that An03g05440-PT indeed directs the cyclization of the nonaketide backbone with C2-C7 regioselectivity, and suggests that the parent NRPKS is responsible for the synthesis of a product that contains this structural feature. In accordance with the recent review on the secondary metabolites synthesized by A. niger,(133) the C2-C7 An03g05440 may be involved in the synthesis of YWA1 analogues such as fonsecin, rubrofusarin, or the dimeric aurasperone B.

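[Figure 19 image: (A) SDS-PAGE with lanes M and E (markers at 220, 160, 120 and 100 kDa); (B) chromatogram traces i and ii with peak 60; x-axis in minutes (10-30)]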

Figure 19. (A) Construction and SDS-PAGE analysis of PKSH2. PKSH2 (193 kDa) is constructed through domain shuffling of PT domain between PKS4 and An03g05440 using PKS4-99 as a template. For SDS-PAGE: lane M, bench mark ladder; lane E, Ni-NTA purified PKSH2. (B) LC-MS (325 nm) analysis of the polyketides synthesized by PKSH2 in vitro. Trace i: the parent megasynthase PKS4-99 synthesizes 60 as a major product; and trace ii: PKSH2 also synthesizes 60 as the major product, confirming the phylogenetic prediction of An03g05440-PT as a C2-C7 aldol cyclase.

Viridicatumtoxin is a tetracycline-like fungal aromatic polyketide (Figure 20). Recently, the gene cluster has been located in P. aethiopicum and the NRPKS involved was shown to be

VrtA.(134) Previous 13C enrichment study of viridicatumtoxin led to two possible cyclization pathways,(135) which can originate from either C6-C11 or C8-C13 first ring cyclization (Figure

20). Our phylogenetic analysis finds VrtA PT domain is located in Group V (Figure A9). To clarify the regioselectivity of VrtA, we dissected VrtA-PT and assayed it with PKS4 minimal

PKS domains in trans. As expected from phylogenetic prediction, 68 was synthesized as a major

product (Figure 20). This result indicates that the formation of the viridicatumtoxin aglycon initiates with the C6-C11 regioselective cyclization.

Figure 20. (A) VrtA has been related to the biosynthesis of viridicatumtoxin in P. aethiopicum. Previous 13C-enrichment study implies two possible cyclization patterns: C6-C11 and C8-C13 folding modes, for the formation of the four ring scaffold. (B) LC-MS (325 nm) analysis of the polyketides synthesized by PKS4 minimal PKS domains with AptA-PT (trace i) and VrtA-PT (trace ii) in vitro. Same as AptA-PT, VrtA-PT and PKS4 minimal PKS domains interact efficiently to afford 68 as the predominant product.

2.2.3 Discussion

With an increasing number of sequenced genomes from filamentous fungi, it has become apparent that these organisms are “underachievers” as natural product producers. A majority of sequenced biosynthetic gene clusters are silent under laboratory conditions and as a result, the metabolites associated with these pathways have remained cryptic. Therefore, being able to predict structures or partial structures of compounds based on sequence analysis of pathways is an important goal towards mining the fungal biosynthetic potential.(136) In this work, we used phylogenetic analysis of the PT domains to classify fungal NRPKSs into five groups and showed a correlation between protein sequence and cyclization regioselectivity. The five groups cover the known cyclization modes of C2-C7, C4-C9 and C6-C11. The PT domains that control the C2-C7 aldol cyclizations were further grouped separately based on the ring sizes of the final products. Through verification of the regioselectivities of the Group V PT domains, we established a straightforward experimental approach to assay the function of PT domains. The assay is based on heterologous combinations of target PT domains with an NRPKS minimal

PKS. Combining the bioinformatics and the simple biochemical approaches, we were able to predict and confirm the activities of two NRPKS PT domains uncovered from genome sequencing. Similarly, fundamental understanding of sequence-activity relationships of domains that control chain length, such as the KS, can provide additional insights into the structures of polyketides produced by cryptic NRPKSs.

The assay developed here is highly robust and the dominant polyketide produced reflects the regioselectivity of the target PT domains. The heterologous PT domains were able to interact compatibly with the PKS4 ACP domain, illustrating that this key protein-protein interaction is not specific between endogenous partners. The mix and match of minimal PKS components with different PT domains is reminiscent of the combinatorial biosynthesis experiments performed with the bacterial type II PKSs and cyclases.(19, 35) In these studies, the cyclases were found to be broadly specific and could be combined with different type II minimal PKSs to generate a number of regioselectively controlled unnatural aromatic polyketides. Our work here indicates that this approach can also be extended to fungal iterative type I PKSs to synthesize different compounds cyclized via the F-modes.

The construction of functional, hybrid NRPKSs demonstrates modularity of the domains in fungal PKSs, in that key domains can be readily swapped without major compromises to megasynthase structure, activity and processivity. Such modularity provides further evidence that NRPKS domains may be evolutionarily derived from the juxtaposition of dissociated

catalytic domains. The dimeric PT domains likely play a key role in maintaining the overall structural integrity of the megasynthases. Our results therefore suggest the overall folds of the different PT domains from different groups may be similar, thereby maintaining the endogenous communication and protein-protein interactions between the upstream KS-AT didomain and the downstream ACP domain. We compared the sequences of Group V PT domains to those from the other groups and found that the sequence similarities between different groups are moderate.

Group V PT domains show no more than 30% identity and 40% similarity to members of the other groups. For example, AptA-PT and PksA-PT share only 19% identity and 33% similarity, even though the final products are both anthraquinone polyketides. Hence, the sequence divergence must contribute to major differences in the sizes and geometries of the cyclization chamber,(88) leading to different orientations of the polyketide backbone in the PT domains.

Structural analysis of C2-C7 and C6-C11 PT domains, and comparison to that of the C4-C9

PksA-PT will provide insights into these differences.

Whereas using dissociated catalytic domains can provide a rapid method of assaying PT activities, using hybrid megasynthases such as PKSH1 and PKSH2 has several advantages for combinatorial biosynthesis applications. First, the simplicity of cloning and expressing a single gene is an advantage over having to manipulate multiple genes. This was demonstrated in this work, where we used a single PKSH1 instead of three separate proteins (KS-MAT, PT and ACP) to produce sufficient amounts of 68 for structure elucidation. Second, the in cis interactions between the different domains will lead to increased catalytic efficiency and product turnover, especially under in vivo conditions. Indeed, while compound 68 can be recovered from the cell pellets of E. coli expressing PKSH1, no product can be detected from E. coli expressing different domains as stand-alone proteins. This was similarly demonstrated using PKS4 minimal PKS in


E. coli, where artificially linking the KS-MAT with ACP led to significant improvement in product turnover in vivo.(17)

2.2.4 Materials and Methods

Materials

1,2,3-13C-malonate was purchased from Cambridge Isotope Laboratories. CoA was purchased from Sigma Aldrich. Aspergillus nidulans FGSC A4 and Aspergillus niger ATCC

1015 were obtained from NRRL. The Escherichia coli strain XL-1 Blue utilized for DNA manipulation and BL21(DE3) used for protein expression were purchased from Stratagene. The protein expression strain BAP1 described by Pfeifer et al (94) was from Prof. Chaitan Khosla.

General technique for DNA manipulation

The A. nidulans and A. niger genomic DNA was prepared using the ZYMO (Orange,

CA) ZR fungal/bacterial DNA kit according to supplied protocols. PCR reactions were performed with the Phusion high-fidelity DNA polymerase (NEB) and Platinum Pfx DNA polymerase (Invitrogen). The PCR products were cloned into the pCR-blunt vector obtained from

Invitrogen. Restriction enzymes (NEB) and T4 ligase (Invitrogen) were used to digest and ligate the DNA fragments. The primers used to amplify the genes were synthesized by IDT and

Operon.

Phylogenetic analysis of PT domains

The 128 PT sequences were obtained from the National Center for Biotechnology Information (NCBI) through BLAST searches with the PT domains and KS domains of PKS4, ACAS and PksA. The sequence for VrtA from P. aethiopicum was provided by Y. H. Chooi. All of these

129 NRPKSs belong to subclades I and II as defined in ref (76). PT domains of each NRPKS were carefully extracted according to the boundary of PksA-PT.(16) The PT sequences were

aligned with CLUSTALX 2.0(137) and phylogenetic analyses were conducted using MEGA version 4.(123) The evolutionary history was inferred using the Minimum Evolution method(138) with bootstrap testing. The evolutionary distances were computed using the Poisson correction method(139) and are in the units of the number of amino acid substitutions per site.
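The Poisson correction named above converts the observed proportion of differing residues, p, into an estimated number of substitutions per site, d = -ln(1 - p). The Python sketch below illustrates that formula only. The toy sequences, the function names, and the choice to skip gap columns pairwise are assumptions made for illustration; they do not reproduce the MEGA implementation or the settings used in this work.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing residues over gap-free aligned columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: columns with a gap ('-') in either sequence are skipped.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no gap-free columns to compare")
    return sum(a != b for a, b in pairs) / len(pairs)


def poisson_distance(seq_a: str, seq_b: str) -> float:
    """Poisson-corrected distance d = -ln(1 - p), in substitutions per site."""
    p = p_distance(seq_a, seq_b)
    if p >= 1.0:
        raise ValueError("distance is undefined when every site differs")
    return -math.log(1.0 - p)


# Hypothetical 10-residue alignment with 2 differing sites (p = 0.2).
toy_a = "MKTAYIAKQR"
toy_b = "MKSAYLAKQR"
print(p_distance(toy_a, toy_b), poisson_distance(toy_a, toy_b))
```

The correction matters most for divergent pairs: as p grows, d grows faster than p because multiple substitutions at the same site become more likely.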

Cloning of the stand-alone PT domains

Primers used to amplify AptA-PT, which contains one intron, are listed in supplemental

Table A4. The first exon was amplified by PCR from A. nidulans genomic DNA with the forward primer AptA-PT-f and reverse primer AptA-PT-i-r; and the second exon was amplified using the primer AptA-PT-i-f and the reverse primer AptA-PT-r. Splicing by overlap extension PCR (SOE-PCR) was used to fuse the two pieces to synthesize a DNA fragment encoding AptA-PT flanked with NdeI and NotI. The DNA fragment of AptA-PT was inserted into pET24 vector digested with NdeI and NotI to yield pYR94. The PT domains of AN0150 and

VrtA were cloned using similar methods and the primers are listed in supplemental Table A4.

Following amplification, the gene encoding AN0150-PT was inserted into pET24 vector digested with NdeI and NotI to yield pYR160 and the gene encoding VrtA-PT was inserted into pET28 vector digested with NdeI and EcoRI to yield pYR149.

Construction of PKSH1

Primers used to amplify AptA-PT and PKS4-ACP are listed in supplemental Table A4.

AptA-PT and PKS4-ACP were amplified and fused together by SOE-PCR to synthesize a DNA fragment encoding AptA-PT-PKS4-ACP flanked with NotI. pWJ279 is derived from pET28 and encodes the N-terminal hexahistidine tagged PKS4-SAT-KS-MAT between NdeI and NotI. The


DNA fragment of AptA-PT-PKS4-ACP flanked by NotI was inserted into the NotI-digested pWJ279 to yield pYR123.

Expression and purification of proteins

Expression of stand-alone PT domains and full megasynthases followed identical procedures. The example of AptA-PT is detailed here. pYR94 was used to transform the

BL21(DE3) for protein expression. The cells were cultured at 37°C and 250 rpm in 500 ml LB medium supplemented with 35 μg/ml kanamycin to a final OD600 between 0.4-0.6. The culture was then incubated on ice for 10 mins before addition of 0.1 M isopropylthio-β-D-galactoside

(IPTG) for protein expression. The cells were further cultured at 18°C for 12-16 hrs. The cells were then harvested by centrifugation (3,500 rpm, 15 min, 4°C), resuspended in ~ 25 mL lysis buffer (20 mM Tris-HCl, pH 7.9, 0.5 M NaCl, 10 mM imidazole) and lysed by sonication on ice.

Cellular debris was removed by centrifugation (15,000 rpm, 30 min, 4°C) and Ni-NTA agarose resin was added to the supernatant (1-2 ml/L of culture). The suspension was swirled at 4°C for

2 hours and loaded into a gravity column. The protein-bound resin was washed with one column volume of 10 mM imidazole in buffer A (50 mM Tris-HCl, pH = 7.9, 2 mM EDTA, 2 mM

DTT), followed by 0.5 column volume of 20 mM imidazole in buffer A. The protein was then eluted with 250 mM imidazole in buffer A. Purified AptA-PT was concentrated and exchanged into buffer A+10% glycerol with the centriprep filter devices (Amicon), aliquoted and flash frozen. Protein concentration was determined with the Bradford assay using BSA as a standard.

In vitro reconstitution of PKSH1

The reactions were performed at 100 μl scale in 100 mM phosphate buffer (pH = 7.4) in the presence of 100 mM sodium malonate, 2 mM DTT, 7 mM MgCl2, 20 mM ATP, 5 mM coenzyme A, 20 μM MatB and 10 μM PKSH1. The reactions were terminated with 1 mL of


99% ethyl acetate (EA)/1% acetic acid (AcOH). The organic phase was separated, evaporated to dryness, redissolved in 20 µL of DMSO and analyzed with a Shimadzu 2010 EV Liquid

Chromatography Mass Spectrometer by using both positive and negative electrospray ionization and a Phenomenex Luna 5 μm, 2.0 × 100 mm C18 reverse-phase column. Compounds were separated on a linear gradient of 5% CH3CN (v/v) in water (0.1% formic acid) to

95% CH3CN in water (0.1% formic acid) over 30 min with a flow rate of 0.1 mL/min.
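The linear gradient described above can be written as a simple function of run time. The following Python sketch computes the acetonitrile percentage at a given time and the total mobile-phase volume of a run. The function names are hypothetical, and holding the composition at 95% after 30 min is an assumption, since the text does not state what follows the gradient.

```python
def percent_acetonitrile(t_min: float,
                         start_pct: float = 5.0,
                         end_pct: float = 95.0,
                         duration_min: float = 30.0) -> float:
    """CH3CN percentage (v/v) at time t_min on a linear gradient."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    # Assumption: the composition is held at end_pct once the gradient ends.
    fraction = min(t_min / duration_min, 1.0)
    return start_pct + (end_pct - start_pct) * fraction


def mobile_phase_volume_ml(duration_min: float = 30.0,
                           flow_ml_per_min: float = 0.1) -> float:
    """Total solvent volume delivered over the run."""
    return duration_min * flow_ml_per_min


# Composition at the midpoint of the 30 min gradient.
print(percent_acetonitrile(15.0))
# Solvent consumed by one 30 min run at 0.1 mL/min.
print(mobile_phase_volume_ml())
```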

In vitro characterization of the stand-alone PT domains

The reactions were performed at 100 μl scale in 100 mM phosphate buffer (pH = 7.4) in the presence of 100 mM sodium malonate, 2 mM DTT, 7 mM MgCl2, 20 mM ATP, 5 mM coenzyme A, 20 μM MatB, 10 μM PKS4 KS-MAT didomain, 50 μM PKS4-ACP and 50 μM PT domain. The reactions were terminated, extracted and analyzed with the same procedure as in the in vitro reconstitution of PKSH1.
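When preparing or scaling such a reaction, it can help to convert the final concentrations listed above into absolute amounts per reaction. The Python sketch below does this for the stand-alone PT assay. The concentrations are taken from the text; the dictionary layout and function name are illustrative assumptions, and no stock concentrations are implied.

```python
# Final concentrations from the text (small molecules in mM, proteins in uM).
FINAL_MM = {
    "sodium malonate": 100.0,
    "DTT": 2.0,
    "MgCl2": 7.0,
    "ATP": 20.0,
    "coenzyme A": 5.0,
}
FINAL_UM = {
    "MatB": 20.0,
    "PKS4 KS-MAT": 10.0,
    "PKS4-ACP": 50.0,
    "PT domain": 50.0,
}


def amounts_nmol(volume_ul: float) -> dict:
    """Amount of each component (nmol) in a reaction of volume_ul microlitres."""
    if volume_ul <= 0:
        raise ValueError("volume must be positive")
    # mM x uL = nmol; uM x uL = pmol, so divide by 1000 to get nmol.
    amounts = {name: conc * volume_ul for name, conc in FINAL_MM.items()}
    amounts.update({name: conc * volume_ul / 1000.0
                    for name, conc in FINAL_UM.items()})
    return amounts


small_scale = amounts_nmol(100.0)  # the 100 ul assay
for name, nmol in small_scale.items():
    print(f"{name}: {nmol:g} nmol")
```

Because every amount is proportional to volume, a different reaction scale only requires a different `volume_ul` argument.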

Large scale synthesis of 68 and NMR characterization

To obtain a sufficient quantity of 68, the 100 μl in vitro assay was scaled up to 100 ml.

The reaction mixture was shaken gently at room temperature and the reaction progress was monitored by HPLC. After approximately 48 hours when the product level had reached a plateau, the reaction mixture was extracted three times with equal volume of EtOH (1% AcOH).

The resultant organic extracts were combined and evaporated to dryness, washed with methanol, redissolved in DMSO and purified by reverse-phase HPLC (Alltech Apollo 5 μm, 250 mm × 4.6 mm) on a linear gradient of 45 to 95% CH3CN (v/v) over 15 min and 95% CH3CN (v/v) for 15 min in water (0.1% TFA) at a flow rate of 1 mL/min. The heteronuclear multiple quantum coherence (HMQC) and heteronuclear multiple bond correlation (HMBC) spectra of 68 were acquired on a Bruker DRX-600 spectrometer using DMSO-d6 as the solvent, the 1H spectrum was acquired on a Bruker DRX-500 spectrometer, and the 13C spectrum was acquired on a Bruker ARX-500 spectrometer.

Construction, purification and reconstitution of the PKSH2

Primers used to amplify An03g05440-PT and PKS4-ACP are listed in supplemental

Table A4. An03g05440-PT was first fused to PKS4-ACP through SOE-PCR to yield the DNA fragments encoding PT-ACP flanked with NotI. With the same procedure as in constructing

PKSH1, the DNA fragment of An03g05440-PT-PKS4-ACP flanked by NotI was inserted into the NotI-digested pWJ279 to yield pYR243. pYR243 was expressed and purified with the same procedure as PKSH1. Small scale in vitro assays of 100 μl were performed at room temperature and analyzed by LC-MS.

2.3 Comparative Characterization of Fungal Anthracenone and Naphthacenedione

Biosynthetic Pathways Reveals an α-Hydroxylation-Dependent Claisen-Like Cyclization

Catalyzed by a Dimanganese Thioesterase

This research was originally published in J. Am. Chem. Soc.: Li, Y., Chooi, Y.-H., Sheng, Y.,

Valentine, J. S., Tang, Y. (2011). Comparative Characterization of Fungal Anthracenone and

Naphthacenedione Biosynthetic Pathways Reveals an alpha-Hydroxylation-Dependent Claisen-

Like Cyclization Catalyzed by a Dimanganese Thioesterase. J. Am. Chem. Soc., 133 (39),

15773-15785. © by the American Chemical Society.

2.3.1 Introduction

Polycyclic aromatic polyketides constitute an important family of natural products and exhibit a broad range of beneficial and deleterious bioactivities, including anticancer (e.g. doxorubicin), antibacterial (e.g. tetracycline), antifungal (e.g. griseofulvin), antiparasitic (e.g.

frenolicin) and carcinogenic (e.g. aflatoxin)(140) activities. They are synthesized by polyketide synthases (PKSs) from acetate building blocks followed by regioselective cyclizations of the highly reactive poly-β-ketone backbones into different polycyclic ring systems.(1, 5, 141)

Filamentous fungi, like the actinomycetes bacteria, are prolific producers of aromatic polyketides. Fungi use non-reducing polyketide synthases (NRPKSs) to synthesize a diverse array of polycyclic compounds,(14) such as the anthraquinone norsolorinic acid 52 that is the precursor of aflatoxin,(17) the spirobenzofuranone-containing griseofulvin,(142) the naphtho-β-pyrone YWA1 48,(81) and the anhydrotetracycline-like naphthacenedione viridicatumtoxin (vrt)

69.(135, 143, 144) The mechanisms of polyketide chain length control and regioselective cyclization in fungal PKSs are significantly different from the well-studied bacterial type II

PKSs, are less understood, and are therefore of significant biochemical interest.(145)

Each NRPKS is an iterative type I megasynthase that contains catalytic domains arranged in a linear fashion.(14) The core domains, juxtaposed from the N- to C-terminus, consist of a starter unit:ACP transacylase (SAT),(15) a ketosynthase (KS), a malonyl-CoA:ACP transacylase

(MAT), a product template (PT) domain(16, 17, 88) and an acyl-carrier protein (ACP).(117)

Other domains, such as methyltransferase (MeT)(76) and chain-releasing thioesterase/Claisen cyclase (TE/CLC)(18, 146) or reductase (R)(119) can also be found in some NRPKSs. NRPKSs utilize PT domains to control the cyclization regioselectivity of poly-β-ketone backbones synthesized by the minimal PKS domains (KS, MAT and ACP).(16, 17) The PT domains of most NRPKSs can be classified phylogenetically into five major groups with each group

corresponding to a unique first-ring cyclization regioselectivity and/or product size: Group I, C2-C7 and monocyclic; Group II, C2-C7 and bicyclic; Group III, C2-C7 and multicyclic; Group IV, C4-C9 and multicyclic; and Group V, C6-C11 and multicyclic.(147) In each group, the even-numbered α-carbon serves as the nucleophile and attacks the odd-numbered carbonyl via an aldol condensation to form a new 6-membered ring. Group V NRPKSs are unique in that they lack the

C-terminal fused TE/CLC domain that catalyzes Claisen-like C-C bond formation and product release. Instead, a standalone metallo-β-lactamase-type thioesterase (MβL-TE) is typically encoded in the vicinity of the NRPKS within the gene clusters.(122) Recently, two Group V

NRPKSs, AptA(121) from A. nidulans and ACAS(122) from A. terreus, were linked to the biosynthesis of the anthraquinone asperthecin (apt) 65 and anthracenone atrochrysone 67, respectively. Each cluster contains a standalone MβL-TE (AptB and ACTE) gene. In the biosynthesis of 67, atrochrysone carboxylic acid was shown to be hydrolyzed off the ACP domain of ACAS by ACTE and then spontaneously decarboxylated to yield 67.(122)
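The five-group PT classification described above amounts to a lookup from phylogenetic group to first-ring regioselectivity and product size. A minimal Python sketch of that mapping is given below. The data structure and function names are hypothetical, and assigning a new PT domain to a group still requires the phylogenetic analysis itself, which this sketch does not perform.

```python
from typing import Dict, List, NamedTuple


class PTGroup(NamedTuple):
    first_ring: str     # first-ring aldol cyclization regioselectivity
    product_size: str   # ring count class of the product


# Group assignments as stated in the text.
PT_GROUPS: Dict[str, PTGroup] = {
    "I": PTGroup("C2-C7", "monocyclic"),
    "II": PTGroup("C2-C7", "bicyclic"),
    "III": PTGroup("C2-C7", "multicyclic"),
    "IV": PTGroup("C4-C9", "multicyclic"),
    "V": PTGroup("C6-C11", "multicyclic"),
}


def predict_first_ring(group: str) -> str:
    """Predicted first-ring regioselectivity for a PT domain in `group`."""
    return PT_GROUPS[group.upper()].first_ring


def groups_with(regioselectivity: str) -> List[str]:
    """All groups sharing a given first-ring regioselectivity."""
    return [g for g, info in PT_GROUPS.items()
            if info.first_ring == regioselectivity]


print(predict_first_ring("V"))
print(groups_with("C2-C7"))
```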

The Group V PKS VrtA from P. aethiopicum was recently linked to the biosynthesis of the tetracyclic naphthacenedione core of 69.(134) Comparing the vrt and apt gene clusters revealed that although the structures of 65 and 69 differ significantly, the protein sequences of the core PKS genes are remarkably similar. VrtA and the associated MβL-TE VrtG share high sequence similarities to AptA and AptB (60% and 62% identity, respectively). The activities of the VrtA and AptA PT domains were shown to be functionally equivalent when spliced into different NRPKSs.(147) For example, when inserted into a nonaketide (C18) NRPKS, the nascent polyketide was cyclized by either Group V PT domain through C6-C11 and C4-C13 aldol condensations to afford a bicyclic intermediate (Figure 21).(147) The third ring is likely spontaneously cyclized via C2-C15 aldol addition. Subsequently, in the absence of additional

enzymatic modifications, spontaneous O17-C1 α-pyrone formation results in the release of the pyranoanthracenone 72, which is dehydrated and oxidized into 68 (Figure 21). In the biosynthesis of 69 from a decaketide (C20) backbone, the VrtA-PT is likely to function in a similar way to generate the bicyclic intermediate through the programmed cyclization steps.

However, in order to form the tetracyclic naphthacenedione, an additional Claisen-like condensation between C18 and C1 must take place regioselectively following formation of the tricyclic intermediate. The enzymatic basis of this reaction has not yet been elucidated.

Figure 21. Biosynthetic pathway of 65 and related shunt products.

Because of the similarities between the core PKS genes in the anthracenone and naphthacenedione biosynthetic pathways, a comparative investigation involving the mix and match of individual enzymes may reveal insights into the mechanism of the fourth ring cyclization. Although the biosynthetic locus of 69 has been identified,(134) the requirement of the unusual malonamyl starter unit has complicated the investigation of VrtA functions. As a result, the acetate-primed TAN-1612 70 (neuropeptide Y antagonist) from Penicillium claviforme FL-27337(148) was identified as our naphthacenedione target in this study. A related compound BMS-192548 71 (neurokinin-1 antagonist) was isolated from A. niger

WB2346 and is likely tautomerized from 70 during purification.(149, 150) For parallel biosynthetic investigation of a fungal anthracenone, the apt pathway that produces 65 was selected.

The anthraquinone 65 is likely derived from an anthracenone precursor such as 67. Here, we first mined the biosynthetic gene cluster of 70 (gene cluster designated as ada, 2-acetyl-2-decarboxamidoanthrotainin) from A. niger ATCC 1015 using Group V PKS PT sequence as a lead. Characterizations of the biosynthetic pathways of 70 and 65 through heterologous reconstitution in Saccharomyces cerevisiae and in vitro biochemical investigations uncovered that the C18-C1 fourth-ring Claisen-like condensation is catalyzed by a dimanganese MβL-TE and is dependent on C2-hydroxylation by a flavin-dependent hydroxylase. The MβL-TE is the first example of TE in polyketide biosynthesis that catalyzes the Claisen-like condensation without an α/β hydrolase fold and forms no covalent bond with the substrate.

2.3.2 Results

Mining of the ada Gene Cluster in A. niger

Although A. niger WB2346 is the only A. niger strain reported to produce 70,(149, 150) we postulated that the genome-sequenced A. niger strains may also contain gene clusters that can biosynthesize 70. To test this idea, we scanned the sequenced genomes of A. niger ATCC 1015 and CBS 513.88(151) for genes encoding Group V NRPKSs. A BLAST search of both genomes with AptA-PT(147) uncovered a unique Group V NRPKS gene (An11g07310) in the CBS 513.88 strain. The corresponding ortholog (e_gw1_4.29) is also found in the ATCC 1015 strain, but the coding region is incomplete. The gaps were filled by PCR and sequencing,

which revealed that the Group V NRPKSs encoded in both strains share >98% protein sequence identity (Table 2). The gene products downstream of the A. niger ATCC 1015 NRPKS (named

AdaA) include a putative MβL-TE AdaB, a FAD-dependent monooxygenase (FMO) AdaC, an

O-methyltransferase (O-MeT) AdaD and a putative Zn2Cys6 family transcription regulator AdaR

(Figure 22, Table 2). The core biosynthetic enzymes AdaA, AdaB and AdaC all show very high sequence similarities to the corresponding apt and vrt enzymes (Table 2). We reason that the decaketide backbone of 70 may be synthesized and cyclized by AdaA. While the C15-OH of 70 is expected to be retained from an acetate unit during the C2-C15 aldol addition as in the biosynthesis of 67,(122) the C2-OH may be added by the FMO AdaC, which is a homolog of

AptC.(121) AdaD then performs the C9 O-methylation to complete the biosynthesis of 70.

Since the anthracenone hydrolysis activity observed for the close homolog ACTE is not applicable here, the role of the MβL-TE AdaB in the formation of 70 is not yet clear. Given that no additional enzyme from this putative gene cluster appears likely as a fourth ring Claisen cyclase, we propose that AdaB may be involved in facilitating the C18-C1 C-C bond formation and concomitant release of 70.

Culturing of A. niger ATCC 1015 in different media and under various growth conditions did not lead to the production of 70 (Figure 22B, trace i), suggesting that the biosynthesis of 70 may be silent. Overexpression of pathway-specific transcriptional regulators has been a successful strategy for activating cryptic metabolic pathways in fungi.(152, 153) AdaR is likely involved in the regulation of the ada gene cluster and may be similarly overexpressed to activate the biosynthetic pathway. To do so, adaR was placed under the regulation of the A. nidulans gpdA promoter (PgpdA) in pBARGPE1,(154) which carries the bar gene.(155) The resulting plasmid, pYR311, was transformed into A. niger ATCC 1015. Several glufosinate-resistant

colonies exhibited yellow pigmentation on agar plates not observed in the wild type (WT) strain when grown on glucose minimal medium (Figure A10). Insertion of PgpdA::adaR and the bar resistance gene into the pigmented transformants T1-T5 was confirmed by PCR (Figure A10).

Pigmented agar of one transformant (T4) was then extracted and analyzed by LC-MS (Figure

22B, trace ii, Figure A11). A new compound at RT (retention time) = 26.2 min, with a UV spectrum and m/z consistent with those reported for 70, was overproduced in T4 (Figure 22B).(148)

RT-PCR analysis of adaA and adaB confirmed activated transcriptions of both genes in the T4 strain but not in WT, which supports the involvement of the ada gene cluster in the biosynthesis of the yellow pigment in the strain (Figure 22C).


Figure 22. Induction of the ada gene cluster. (A) Organization of the ada gene cluster. (B) LC analysis (400 nm) of A. niger extracts: i. WT and ii. T4 transformant; (C) RT-PCR analysis of the ada genes with lane M as marker. β-Tubulin was used as a positive control and RNA-only samples of both WT and T4 were used as negative controls. Regions of adaA and adaB were probed for induced transcription.


The yellow pigment was subsequently purified from T4, extensively characterized by

NMR, and confirmed to be 70 (Figure A12, Table A9). Taken together, these results verified the proposed correlation between the ada gene cluster and the biosynthesis of 70 in A. niger ATCC

1015. Overexpression of adaR led to an increased titer of 70 in A. niger ATCC 1015 from undetectable levels to ~ 400 mg/L, which represents a more than 100-fold increase compared to the titer reported from A. niger WB2346 (2.9 mg/L).(149)

Table 2. Genes Within the ada Cluster from Aspergillus niger ATCC 1015.¹

Gene   A. niger CBS 513.88    Size        BLASTP    Identity/        E-value   Conserved Domain
       homolog (% identity)   (bp/aa)     homolog   Similarity (%)
adaA   An11g07310 (99)        5382/1793   aptA      69/80            0.0       NRPKS, SAT-KS-MAT-PT-ACP
                                          vrtA      59/74            0.0
adaB   An11g07320 (100)       1005/317    aptB      69/79            1e-177    metallo-β-lactamase superfamily
                                          vrtG      62/76            1e-107
adaC   An11g07330 (99)        1245/414    aptC      63/79            8e-153    UbiH and related FAD-dependent oxidoreductases
                                          vrtH      63/78            2e-153
adaD   An11g07340 (99)        750/249     vrtF      60/71            2e-68     methyltransferase domain type 12
adaR   An11g07350 (100)       2504/777    -         42/55            1e-128    GAL4-like Zn(II)2Cys6 binuclear cluster DNA-binding domain

¹ The sequence of the ada gene cluster in A. niger ATCC 1015 was deposited in GenBank with accession number JN257714.

Heterologous reconstitution of ada and apt pathways in BJ5464-NpgA

To verify if the four-enzyme ensemble (AdaA-D) is sufficient for the biosynthesis of 70, we reconstituted the putative four-gene pathway in S. cerevisiae BJ5464-NpgA, which is a vacuolar protease-deficient strain carrying the A. nidulans phosphopantetheinyl transferase npgA.(156, 157) The intron-free coding sequences of the four genes were amplified from A.

niger ATCC 1015 genomic DNA using SOE-PCR and placed under the control of the ADH2 promoter. Two resulting yeast 2μ expression plasmids pYR291 (carrying adaA) and pYR342

(carrying adaB-D) were transformed into BJ5464-NpgA (Table A8). The two-day culture of the yeast transformant synthesized 70 as essentially the only polyketide product with a titer of ~ 10 mg/L (Figure 23A, trace ii, Table 3). Subsequently, AdaD was excluded from the heterologous host (Table A8) and the resulting strain produced a new compound (~8 mg/L) with UV spectrum

(Figure 23A, trace iii, Table 3) and mass (m/z [M+H]+ = 401) consistent with that of 73 (Figure

A13, Figure 24). The structure of 73 was verified to be the C9-hydroxyl version of 70 by NMR analysis (Figure A14, Table A10), which confirmed that AdaD is indeed the C9-O-methyltransferase. The efficient production of 73 in BJ5464-NpgA shows that AdaA, AdaB and

AdaC are sufficient for the biosynthesis of a naphthacenedione, and sets the stage for more systematic investigation into the roles of each enzyme in the yeast host. In subsequent studies, adaD was excluded from the heterologous host/plasmid combinations.

Table 3. Heterologous expression of different combinations of apt and ada genes and the resulting products.¹

NRPKS   MβL-TE   FMO    Product
adaA    adaB     adaC   73
aptA    aptB     aptC   74
adaA    aptB     aptC   73
aptA    adaB     adaC   74
aptA    aptB     -      75
adaA    adaB     -      76
adaA    aptB     -      nd²
aptA    adaB     -      75
adaA    -        adaC   73
aptA    -        aptC   nd

¹ S. cerevisiae BJ5464-NpgA is the host. ² nd: no polyketide metabolite was detected from this combination of genes.
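The gene combinations in Table 3 can be encoded as a small lookup so that the outcome of any tested combination, or all combinations giving a particular product, can be queried directly. The Python sketch below is one possible encoding. The names are hypothetical, `None` in a key marks an omitted gene, and `None` as a value marks "nd".

```python
from typing import Dict, List, Optional, Tuple

# Key: (NRPKS, MbL-TE, FMO); value: product number, or None for "nd".
Combo = Tuple[str, Optional[str], Optional[str]]

TABLE_3: Dict[Combo, Optional[str]] = {
    ("adaA", "adaB", "adaC"): "73",
    ("aptA", "aptB", "aptC"): "74",
    ("adaA", "aptB", "aptC"): "73",
    ("aptA", "adaB", "adaC"): "74",
    ("aptA", "aptB", None): "75",
    ("adaA", "adaB", None): "76",
    ("adaA", "aptB", None): None,
    ("aptA", "adaB", None): "75",
    ("adaA", None, "adaC"): "73",
    ("aptA", None, "aptC"): None,
}


def product_of(nrpks: str,
               te: Optional[str] = None,
               fmo: Optional[str] = None) -> Optional[str]:
    """Product reported for a tested combination (KeyError if untested)."""
    return TABLE_3[(nrpks, te, fmo)]


def combos_yielding(product: str) -> List[Combo]:
    """All tested gene combinations that gave `product`."""
    return [combo for combo, p in TABLE_3.items() if p == product]


print(product_of("adaA", "adaB"))
print(combos_yielding("73"))
```

Raising `KeyError` for untested combinations keeps "not tested" distinct from "tested, nothing detected".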


Using the same approach as above, we reconstituted the activities of AptA, AptB and

AptC from the biosynthetic pathway of 65. The two expression plasmids encoding the three apt genes (pYR203 and pYR260, Table A8) were introduced into BJ5464-NpgA, and the resulting strain synthesized a new compound 74 (m/z [M+H]+ = 333) (Figure 23B, trace ii, Figure A15).

The molecular weight of 74 agrees with that of a nonaketide that has undergone two dehydrations, one decarboxylation and one hydroxylation. This was surprising since we originally anticipated the NRPKS AptA to be an octaketide (C16) synthase based on the size of the final product 65.(121) 74 was purified from the yeast culture and the structure was established to be a multiply hydroxylated anthracenone (Figure 21, Figure A16, Table A11).

The coordinated activities of AptA, AptB and AptC in the biosynthesis of 74 are shown in Figure

21. The conversion of 74 to 65 may require additional A. nidulans enzymes encoded outside of the compact apt cluster.
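The mass assignment above can be checked with simple arithmetic. The Python sketch below starts from a linear poly-β-keto acid built from n acetate-derived units, H-(CH2CO)n-OH, and applies the stated modifications to predict the [M+H]+ value. Taking the free acid as the starting formula, the use of monoisotopic masses, and the function names are assumptions made for this illustration.

```python
# Monoisotopic masses (Da) of the most abundant isotopes.
MONO = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
PROTON = 1.00727646  # mass added on protonation, [M+H]+


def linear_polyketide_acid(n_ketide: int) -> dict:
    """Elemental formula of H-(CH2CO)n-OH, i.e. C(2n) H(2n+2) O(n+1)."""
    if n_ketide < 1:
        raise ValueError("need at least one ketide unit")
    return {"C": 2 * n_ketide, "H": 2 * n_ketide + 2, "O": n_ketide + 1}


def mz_m_plus_h(n_ketide: int, dehydrations: int = 0,
                decarboxylations: int = 0, hydroxylations: int = 0) -> float:
    """Predicted monoisotopic m/z of [M+H]+ after the given modifications."""
    formula = linear_polyketide_acid(n_ketide)
    formula["H"] -= 2 * dehydrations      # each dehydration removes H2O
    formula["O"] -= dehydrations
    formula["C"] -= decarboxylations      # each decarboxylation removes CO2
    formula["O"] -= 2 * decarboxylations
    formula["O"] += hydroxylations        # each hydroxylation adds one O
    neutral = sum(MONO[el] * count for el, count in formula.items())
    return neutral + PROTON


# Nonaketide, two dehydrations, one decarboxylation, one hydroxylation.
mz_74 = mz_m_plus_h(9, dehydrations=2, decarboxylations=1, hydroxylations=1)
print(round(mz_74))  # nominal value 333, as reported for 74
```

The same function can be applied to other chain lengths and modification counts by changing its arguments.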

Figure 23. LC analysis (400 nm) of polyketides synthesized by coexpression of different combinations of ada and apt genes in S. cerevisiae BJ5464-NpgA. (A) AdaA coexpressed with different combinations of MβL-TEs and/or FMOs; (B) AptA coexpressed with different combinations of MβL-TEs and/or FMOs. Control trace is extracted from untransformed BJ5464-NpgA culture.

Comparative analysis of the ada and apt pathways through gene shuffling

The heterologous reconstitution of AdaA-C and AptA-C clearly demonstrates that these two sets of highly similar enzymes synthesize the structurally distinct 73 and 74, respectively. This raises the obvious question of how hydrolysis of 74 and fourth-ring cyclization of 73 may be accomplished by the corresponding MβL-TEs, which are highly similar in sequence (Table 2).

To dissect the roles of each enzyme and the biochemical differences between the homologous enzymes, we constructed various shuffled pathways in BJ5464-NpgA (Table 3). When AptA was combined with the ada tailoring enzymes AdaB and AdaC, biosynthesis of anthracenone 74 was observed as the major product with similar yield as the intact apt pathway (Figure 23B, trace iii, Table 3). This indicates that AdaB and AdaC can efficiently function on a smaller, nonaketide substrate. Conversely, coexpression of AptB and AptC with AdaA produced 73 as the major product, albeit at a yield that is ~10-fold lower than the intact ada pathway (Figure

23A, trace iv, Table 3). The lower yield may be attributed to the less efficient function of AptC or AptB on the larger decaketide substrate (more evidence on this later). The biosynthesis of 73 and 74 by these shuffled pathways implies that i) the polyketide chain length control lies in the NRPKSs AdaA and AptA; ii) both the MβL-TE and the FMO in the two pathways can process substrates of unnatural chain lengths and are therefore functionally homologous (with some differences in efficiency); and iii) most importantly, both MβL-TEs appear to be bifunctional and can catalyze either hydrolysis (towards the formation of 74), or the Claisen-like fourth-ring cyclization (towards the formation of 73) when the substrate is of sufficient chain length (C20).

We next excluded either AptC or AdaC from the reconstituted pathways to verify the role of the FMO and the necessity of the modification to subsequent steps. When only AptA and


AptB were expressed in BJ5464-NpgA, a major product 75 with a mass of 316 was recovered

(Figure 23B, trace iv, Figure A17, Table 3). Following large scale culture and NMR characterizations, the structure of 75 was shown to be identical to 74 except for the absence of the C2-OH (Figure A18, Table A12). This single structural difference between 74 and 75 confirms that

AptC is the C2-hydroxylase (Figure 21). Following thioester hydrolysis by AptB to release 75a, spontaneous decarboxylation occurs to afford the anthracenone 75. The C2-hydroxylation does not appear to affect the hydrolytic activities of AptB since both 74 and 75 were produced at comparable levels (~10 mg/L) from the heterologous yeast host. As expected, substitution of

AptB with AdaB also resulted in hydrolysis and production of 75 (Figure 23B, trace v, Table 3).

On the other hand, coexpression of AdaA and AdaB in BJ5464-NpgA did not yield 77, which is an analog of 73 that is unhydroxylated at C2 (Scheme 3). Instead, the resulting strain synthesized 76 (m/z [M+H]+ = 359) (Figure 23A, trace v, Table 3, Figure A19), the molecular weight of which is consistent with a decaketide that has undergone two dehydrations and one decarboxylation. The structure of 76 was analyzed by NMR (Figure A20, Table A13) and was confirmed to be a tricyclic, 1,3-diketone-containing anthracenone (Figure 24). Compared to 75, 76 contains an additional ketone at C18 that is reflective of the longer polyketide backbone synthesized by AdaA. While the absence of the C2-hydroxyl also verifies the role of AdaC, the failure of AdaA and AdaB to synthesize 77 in the absence of AdaC hints that C2-hydroxylation may be essential for the Claisen cyclization. In the absence of the C2-OH modification, AdaB acts as a hydrolytic TE to release the diketo-anthracenone carboxylic acid 76a from AdaA, which then spontaneously decarboxylates to form 76. This result indicates that C2-hydroxylation precedes cyclization, as shown in Figure 24. Therefore, the α-hydroxylation step appears to be crucial for switching the nucleophile in the chain-releasing step catalyzed by AdaB from a hydroxide ion to the C18 anion. Lastly, BJ5464-NpgA transformants expressing AdaA and AptB did not produce any obvious products (Figure 23A, trace vi, Table 3). This indicates a subtle difference between the two MβL-TEs in the hydrolysis reaction: while AdaB can hydrolyze both C18 and C20 anthracenone-S-ACP substrates to yield 75 and 76, respectively, AptB can only hydrolyze the C18 substrate to yield 75.
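The mass assignments for 75 and 76 can be cross-checked with a short calculation. The sketch below is illustrative only and rests on assumptions that are not stated in the text: each unreduced chain is treated as a linear poly-β-keto acid, H-(CH2-CO)n-OH, built from n acetate-derived units, and the nonaketide leading to 75 is assumed to undergo the same two dehydrations and one decarboxylation described for 76. The function names and the use of monoisotopic masses are choices made here.

```python
# Monoisotopic atomic masses (Da) and the proton mass used for [M+H]+.
C, H, O = 12.0, 1.007825, 15.994915
PROTON = 1.007276
H2O = 2 * H + O
CO2 = C + 2 * O


def linear_polyketide_mass(n_units: int) -> float:
    """Mass of H-(CH2-CO)n-OH, i.e. C(2n)H(2n+2)O(n+1) (assumed starting point)."""
    return 2 * n_units * C + (2 * n_units + 2) * H + (n_units + 1) * O


def shunt_product_mass(n_units: int, dehydrations: int, decarboxylations: int) -> float:
    """Subtract the water and CO2 lost during cyclization and release."""
    return (linear_polyketide_mass(n_units)
            - dehydrations * H2O
            - decarboxylations * CO2)


# Assumption: both shunt products lose two waters and one CO2.
mass_75 = shunt_product_mass(9, dehydrations=2, decarboxylations=1)   # nonaketide
mass_76 = shunt_product_mass(10, dehydrations=2, decarboxylations=1)  # decaketide
mz_76 = mass_76 + PROTON

print(f"75: M = {mass_75:.2f}")
print(f"76: M = {mass_76:.2f}, [M+H]+ = {mz_76:.2f}")
```

Under these assumptions the nominal values are 316 for 75 and 359 for the [M+H]+ ion of 76, in line with the masses reported above.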

Figure 24. Biosynthetic pathway of 70 and related shunt products.

To probe whether offloading of 73 and 74 indeed requires the MβL-TE, we coexpressed the individual NRPKSs with their corresponding FMO only. BJ5464-NpgA expressing aptA and aptC did not produce any polyketide (Figure 23B, trace vi), confirming that nonenzymatic hydrolysis to yield 74a does not occur. In contrast, BJ5464-NpgA expressing AdaA and AdaC produced trace amounts of 73 (Figure 23A, trace vii). Hence the C18-C1 Claisen cyclization can take place uncatalyzed with very low efficiency, and is significantly enhanced by AdaB.

In vitro Biosynthesis of 73 and 74

To complement the heterologous biosynthesis results, we reconstituted the synthesis of 73 and 74 using purified enzymes. His6-tagged AptA and AdaA were expressed and purified from E. coli BAP1,(94) while the other enzymes (AptB, AptC, AdaC) were expressed and purified from BL21(DE3). AptB purified from cells grown in medium without metal ion supplementation is designated AptB*. Soluble AdaB could not be obtained from E. coli and was therefore not included in the assays. Purified AptC and AdaC were thermally denatured, and the prosthetic groups were confirmed to be FAD by LC-MS analysis as previously described (Figure A21).(158) To generate malonyl-CoA in situ for use in polyketide turnover assays, we used the malonyl-CoA synthetase MatB from Rhizobium trifolii to produce the extender units from malonate, ATP and CoA.(104)

AptA, AptB* and AptC were mixed in the presence of the required reagents and cofactors for 12 hours, and the organic products were analyzed by LC-MS. In agreement with the heterologous reconstitution results, the in vitro reaction produced 74 as the dominant product, while excluding AptC from the reaction yielded 75 (Figure 25A, traces i and ii). When AptB and AptC were both absent from the reaction mixture, AptA synthesized 68 from the spontaneous lactonization of the fourth ring (Figure A22), which was previously observed with other hybrid Group V nonaketide NRPKS.(147) Therefore, the activities observed in the yeast host can be completely recapitulated by using purified enzyme components. To test if the hydroxylation of C2 occurs on the free-standing substrate 75 to yield 74, AptC was assayed with 75 in the presence of reduced nicotinamide adenine dinucleotide phosphate (NADPH). However, no conversion was observed (Figure A23), which is in agreement with the proposed mechanism that AptC hydroxylates the anthracenone carboxyl thioester attached to the AptA-ACP (Figure 21).

In vitro reactions using purified components also enabled us to show that the trace amounts of 73 synthesized by AdaA and AdaC are not an artifact of the actions of endogenous yeast enzymes. Indeed, release of small amounts of 73 can be detected after overnight incubation of AdaC and AdaA (Figure 25A, trace iii). Inclusion of AptB* in the assay, however, led to a higher level of synthesis of 73 (Figure 25A, trace iv). To quantify the role of AptB* in accelerating the cyclization, we monitored the time-course production of 73 using an HPLC assay (Figure 25B). Following the reaction progress from 5 to 120 minutes after initiation showed that AptB* can increase the turnover of 73 up to 3-fold in two hours.

[Figure 25 graphic. Panel A: LC traces i-iv (i and ii, AptA; iii and iv, AdaA) with peaks labeled 74, 75 and 73; x-axis in minutes. Panel B: Claisen cyclization activity, production of 73 (μg) versus time (minutes) for no AptB, AptB*, Fe2+-AptB and Mn2+-AptB. Panels C and D: hydrolytic activity, production of 74 (ng) and of 75 (ng) versus time (minutes) for AptB*, Fe2+-AptB and Mn2+-AptB.]

Figure 25. In vitro reconstitution of AptB activities. (A) LC analysis (400 nm) of polyketide synthesized from different combinations of purified enzymes; (B-D) Time-course analysis of production of 73, 74, or 75 over 120 minutes using AptB containing different levels of metal cofactors (AptB*, Fe2+-AptB and Mn2+-AptB) coincubated with AdaA and AdaC (B), AptA and AptC (C), and AptA only (D). No hydrolysis can be observed in the absence of AptB.

Combining the in vivo and in vitro results, the dual catalytic functions of fungal PKS MβL-TEs such as AdaB and AptB can be established as shown in Figures 21 and 24. In both cases, the MβL-TE is responsible for product turnover. First, the MβL-TE catalyzes the hydrolytic release of anthracenone carboxylic acids such as 75a and 76a from the NRPKS with certain chain-length specificities; second, when the decaketide anthracenone carboxyl-S-ACP is hydroxylated at C2 by the FMO, the MβL-TE accelerates the Claisen-like condensation between C18 and C1 to release the naphthacenedione product from the NRPKS.

Biochemical Characterization of AptB as a Bifunctional TE

As proposed in the study of ACTE, MβL-TEs belong to the metallo-β-lactamase (MβL) superfamily and contain the conserved divalent metal binding motif HXHXDH.(122) In addition, alignment of the sequences of AptB, AdaB, VrtG and ACTE shows that these MβL-TEs are similar to the subclass B3 MβLs (Figure A24),(159, 160) which employ an HXHXDH(X)iH(X)jH motif to bind two divalent transition metal ions such as Zn2+ and Fe2+. The metal content of AptB* was determined by ICP-AES, which showed that AptB* contains Fe2+, Mn2+ and Ni2+ ions and that more than 55% of AptB* has no metal ions in either of the metal binding sites (Table 4). We believe that the Ni2+ ions originated from the affinity purification step. As Zn2+ ions are naturally abundant, the presence of Zn2+ in AptB* at such a low concentration suggests that Zn2+ is unlikely to be the physiological cofactor. Dialysis of AptB* with 10 mM EDTA led to significant precipitation, likely due to instability of apo-AptB as a result of metal loss. ICP-AES data on the AptB* that remained in solution showed a significant decrease in the content of divalent metal ions other than Mn2+ (Table 4), which suggests that the other metal ions (Fe2+, Ni2+, Zn2+) are loosely bound in AptB*. To obtain a higher percentage of AptB in the holo form, 200 ppm of MnSO4 or FeSO4 was supplemented to the AptB expression medium.

The metal content of AptB purified from the Mn2+-enriched medium (Mn2+-AptB) was determined to be 1.5 Mn2+ cations per monomer (Table 4), while Fe2+-AptB (purified from Fe2+-enriched medium) contained 0.9 Fe2+ and 0.4 Mn2+ cations per monomer (Table 4). This evidence strongly suggests dinuclear Mn2+ as the physiologically preferred metal cofactor of AptB. Dialysis of Mn2+-AptB with 10 mM EDTA did not result in notable precipitates, indicating that the enzyme-Mn2+ interactions are strong.

Table 4. ICP-AES analysis of AptB (number of metal ions per monomer).

Sample                     Fe2+    Mn2+    Ni2+    Zn2+
AptB*                      0.23    0.13    0.13    0.05
EDTA-treated AptB*         0.13    0.1     0.08    0.04
Mn2+-AptB                  0.06    1.33    0.06    0.04
EDTA-treated Mn2+-AptB     0.06    0.66    0.04    0.07
Fe2+-AptB                  0.93    0.44    0.11    0.08
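The entries of Table 4 can be summarized with a few lines of arithmetic. The sketch below transcribes the table values and computes the total metal content per monomer, the fraction of metal sites filled, and the fraction of Mn2+ retained after EDTA treatment. The assumption of two metal sites per monomer follows the binuclear motif discussed above; the dictionary layout and function names are illustrative and are not part of the original analysis.

```python
# Values transcribed from Table 4 (metal ions per AptB monomer).
TABLE4 = {
    "AptB*": {"Fe": 0.23, "Mn": 0.13, "Ni": 0.13, "Zn": 0.05},
    "EDTA-treated AptB*": {"Fe": 0.13, "Mn": 0.1, "Ni": 0.08, "Zn": 0.04},
    "Mn2+-AptB": {"Fe": 0.06, "Mn": 1.33, "Ni": 0.06, "Zn": 0.04},
    "EDTA-treated Mn2+-AptB": {"Fe": 0.06, "Mn": 0.66, "Ni": 0.04, "Zn": 0.07},
    "Fe2+-AptB": {"Fe": 0.93, "Mn": 0.44, "Ni": 0.11, "Zn": 0.08},
}

# Assumption: two metal-binding sites per monomer (binuclear center).
SITES_PER_MONOMER = 2


def total_metal(sample: str) -> float:
    """Sum of all measured metal ions per monomer."""
    return sum(TABLE4[sample].values())


def site_occupancy(sample: str) -> float:
    """Fraction of the assumed metal sites that are filled, on average."""
    return total_metal(sample) / SITES_PER_MONOMER


def retained_after_edta(before: str, after: str, metal: str) -> float:
    """Fraction of a given metal remaining after EDTA dialysis."""
    return TABLE4[after][metal] / TABLE4[before][metal]


for name in TABLE4:
    print(f"{name}: total = {total_metal(name):.2f}, "
          f"occupancy = {site_occupancy(name):.0%}")

mn_kept = retained_after_edta("Mn2+-AptB", "EDTA-treated Mn2+-AptB", "Mn")
print(f"Mn retained after EDTA: {mn_kept:.0%}")
```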

The catalytic functions of AptB*, Fe2+-AptB and Mn2+-AptB were then compared by measuring the time courses of in vitro production of 73, 74 or 75 (Figure 25B-D). As shown in Figure 25B-D, Mn2+-AptB is much more efficient (up to three- to four-fold higher in product turnover) in catalyzing both the hydrolysis and the Claisen-like condensation than Fe2+-AptB and AptB*. In the presence of Mn2+-AptB, the apparent production rate of 73 is ~17 ng/min, while that of 74 or 75 is ~25-fold lower at 0.7 ng/min. Therefore, even though the natural role of AptB in the apt pathway is hydrolysis to yield 75a, it can promote the C18-C1 Claisen-like condensation much more efficiently than the hydrolysis reaction under the assay conditions.
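An apparent production rate of this kind can be estimated as the least-squares slope of product amount against time. The sketch below shows one way to do this; it is not the procedure used to obtain the values quoted above. The time points and amounts are hypothetical, noise-free series constructed to have slopes of 17 and 0.7 ng/min, and they are not the measured data behind Figure 25.

```python
def apparent_rate(times_min, amounts_ng):
    """Least-squares slope (ng/min) of product amount versus time."""
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_a = sum(amounts_ng) / n
    numerator = sum((t - mean_t) * (a - mean_a)
                    for t, a in zip(times_min, amounts_ng))
    denominator = sum((t - mean_t) ** 2 for t in times_min)
    return numerator / denominator


# Hypothetical time courses (NOT experimental data), built from the quoted rates.
times = [5, 20, 40, 60, 80, 100, 120]
cyclization_ng = [17.0 * t for t in times]  # stands in for production of 73
hydrolysis_ng = [0.7 * t for t in times]    # stands in for production of 74 or 75

rate_cyclization = apparent_rate(times, cyclization_ng)
rate_hydrolysis = apparent_rate(times, hydrolysis_ng)
fold_difference = rate_cyclization / rate_hydrolysis

print(f"cyclization: {rate_cyclization:.1f} ng/min")
print(f"hydrolysis:  {rate_hydrolysis:.1f} ng/min")
print(f"fold difference: {fold_difference:.1f}")
```

With the two quoted rates, the ratio is 17/0.7, or roughly 24, which is in line with the ~25-fold difference stated in the text.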

To visualize the metal binding sites, we generated a homology structural model of AptB using the I-TASSER server (Figure 26A, left panel).(161, 162) Comparative analysis with the previously reported structures of MβL superfamily enzymes, such as ST1585(163) from Sulfolobus tokodaii (Figure 26A, right panel) and YcbL(164) from Salmonella enterica serovar Typhimurium, suggests that H97, H99, D101, H102, H153 and H207 of AptB are the six residues that chelate the two Mn2+ cations. Among them, H97, H99 and H153 are predicted to bind one Mn2+ ion, while D101, H102 and H207 are predicted to bind the other.(165, 166) In addition, the binuclear Mn2+ ions are normally bridged by a hydroxide ion and an aspartate side chain, which here is that of D172, conserved among AptB, AdaB and VrtG (Figure 26A, Figure A24).(165, 166)
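The spacing of H97, H99, D101 and H102 matches the HXHXDH motif, and a motif of this form can be located in a protein sequence with a simple pattern search. The sketch below is a minimal illustration using Python's standard re module; the function name and the toy sequence are hypothetical, and the toy sequence is not the AptB sequence.

```python
import re

# HXHXDH, where X is any residue. The lookahead lets overlapping hits be reported.
MOTIF = re.compile(r"(?=(H.H.DH))")


def find_hxhxdh(sequence: str):
    """Return (1-based start position, matched residues) for each HXHXDH hit."""
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(sequence.upper())]


# Hypothetical toy sequence for illustration only.
toy_sequence = "MKTAAGHAHGDHLLV"
hits = find_hxhxdh(toy_sequence)
print(hits)
```

For a real analysis the same function would be applied to each aligned MβL-TE sequence, and the reported start position would correspond to the first histidine of the motif.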

Site-directed mutations to alanine at these seven residues in Mn2+-AptB were then performed to confirm their roles in metal binding and catalytic activity. All of these mutants were expressed and purified from E. coli BL21(DE3) in the Mn2+-enriched medium with yields similar to that of wild-type Mn2+-AptB. The hydrolytic activities of these mutants were assayed along with AptA and estimated by the production of 75, whereas the Claisen-like cyclization activities were assayed in the presence of AdaA/AdaC and estimated by the relative levels of 73 (wild-type Mn2+-AptB is defined to be 1). As expected, the hydrolytic activities of these seven mutants (H97A, H99A, D101A, H102A, H153A, D172A and H207A) were completely abolished, while the levels of 73 were attenuated to the uncatalyzed levels observed with AdaA/AdaC alone (Figure 26B). ICP-AES analysis and far-UV circular dichroism spectra of these AptB mutants suggest that mutating these seven residues to alanine leads to the loss of at least one Mn2+ cation in AptB, yet causes no significant structural or folding changes (Table 4 and Figure A25). Therefore, chelation of two Mn2+ ions is essential for both activities of AptB, which is in contrast to the well-studied dimanganese arginase, in which loss of one metal binding site leads to partial loss of enzyme activity.(167-169)

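[Figure 26 graphic. Panel A: metal-binding residues of the AptB model (H97, H99, D101, H102, H153, D172, H207) shown beside those of ST1585 (H58, H60, D62, H63, H148, D165, H207), with Zn1, Zn2 and a water molecule (W) labeled. Panel B: bar chart of the relative production of 73 (Claisen cyclization activity, scale 0 to 1.2), with hydrolytic activity scored as + or - for each sample.]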

Figure 26. Structural modeling and mutational analysis of AptB. (A) Comparison between the modeled metal binding sites of AptB (left), predicted by the I-TASSER server, and the zinc binding site of an MβL superfamily protein, ST1585 (Chain B) (adapted from ref (163)) (right). (B) Site-directed mutagenesis of AptB to probe the hydrolysis (formation of 75) and Claisen cyclization (formation of 73) activities. The level of production of 73 in the wild-type Mn2+-AptB assay is defined as 1 for comparison of the catalytic activity of the Mn2+-AptB mutants.

2.3.3 Discussion

In this work, we have successfully utilized the PT phylogenetic classification to locate the gene cluster for the biosynthesis of the fungal naphthacenedione 70. The connection between the ada gene cluster and biosynthesis of 70 was confirmed by overexpressing the pathway-specific transcriptional regulator AdaR in A. niger. The gene cluster of 70, including the NRPKS, is highly similar to those that synthesize anthracenones. Subsequent parallel investigations with ada and apt core PKS enzymes in S. cerevisiae led to the identification of a group of bifunctional MβL-TEs that can catalyze both hydrolysis and Claisen cyclization.

AptB is a Dimanganese Enzyme that Catalyzes C-C Bond Formation.

The MβL enzymes belong to the broader family of hydrolytic metalloenzymes.(170) The hydrolytic activity of MβLs depends on both the electrostatic activation of the proton on the metal-coordinated water molecule and the consequent formation of the hydroxide ion (OH-).(165, 166) Zn2+, as the hardest Lewis acid among the first-row transition metals, is the most utilized among the MβLs.(171) Our investigation suggests that AptB is different from most MβLs and does not bind Zn2+. Instead, AptB exhibits preferential catalytic and physiological requirements for two Mn2+ cations. A number of hydrolytic dimanganese enzymes have been reported,(172-174) including the best characterized arginase,(175) aminopeptidase,(176) and inorganic pyrophosphatase.(177) AptB is the first example among the binuclear manganese enzymes that can catalyze C-C bond formation through Claisen-like condensation. Based on the sequence alignment and homology modeling to other MβL-TEs, each Mn2+ cation in AptB is likely penta-coordinated in a trigonal bipyramidal geometry (Figure 26A). Mutating any of the metal-coordinating residues to alanine (H97A, H99A, D101A, H102A, H153A, D172A and H207A) led to the loss of both the Claisen cyclization and hydrolysis activities (Figure 26B).

Proposed Mechanism of the Hydrolysis Reaction.

AptB facilitates the turnover of naphthacenedione 73 more than 20 times faster than that of anthracenone 74 or 75 (Figure 25B-D). This was surprising considering that AptB is naturally recruited by the apt pathway to perform the hydrolytic release of 74a. This suggests that the active site environment of AptB, including the positioning of the two Mn2+ cations, is optimally set up to perform the Claisen-like condensation, while the hydrolysis reaction occurs when the thioester is placed in close proximity to a strong hydroxide ion. Hence we could expect the hydrolytic reaction to be nonspecific towards substrates that have similar structure and hydrophobicity. Indeed, when AptB was added to the in vitro reaction containing the lovastatin nonaketide synthase (LNKS) LovB and the enoylreductase LovC, AptB was also able to catalyze the hydrolytic release and turnover of the otherwise stalled dihydromonacolin (data not shown).(178) Other types of cyclization TE domains in PKSs, including those that have α/β hydrolase folds, have also been noted to display substantial substrate promiscuity in the loading and hydrolysis of nonnative acyl-thioesters.(146, 179)

[Figure 27 graphic. Upper scheme, "Hydrolysis by AdaB": the ACP-bound thioester at the dimanganese center (two Mn2+ ions, Asp172, general base), leading to release of 76a. Lower scheme, "Claisen Cyclization by AptB/AdaB": the C2-hydroxylated ACP-bound substrate at the dimanganese center (two Mn2+ ions, Asp172, general base), leading to release of 73.]

Figure 27. Proposed enzymatic mechanism of MβL-TE.

The hydrolytic mechanism of AptB and AdaB likely resembles that of other dimanganese metalloenzymes such as arginase (Figure 27).(172, 175) For example, the dimanganese center stabilizes the highly reactive hydroxide ion under physiological pH, which serves as the nucleophile in attacking the C1 thioester of the tricyclic polyketide intermediates to yield 75a and 76a. The dependence of hydrolytic activity on both Mn2+ ions suggests that the Mn2+ ions may play additional roles in catalysis not seen with those in arginase, in which the catalytic center is the metal ion loaded in the M1 site and the enzyme is active with a mononuclear center.(167-169) Here, one of the Mn2+ ions in AptB may play a crucial role in stabilizing the oxyanion generated in the tetrahedral intermediate. Alternatively, in a model similar to that proposed by Dismukes for arginase,(180) one of the Mn2+ ions may be involved in substrate binding through coordination to the C1 (or even C3) carbonyl oxygen (as well as stabilizing the oxyanion hole), whereas the other Mn2+ chelates a water molecule. During catalysis, a general base such as a nearby histidine can deprotonate the water to initiate the hydrolysis reaction.

Proposed Mechanism of the α-Hydroxylation Dependent Claisen Cyclization.

The comparatively weak hydrolytic activity of AptB suggests that the Claisen cyclization is its true function. Our studies here reveal that the cyclization to form 73 requires C2-hydroxylation by the accompanying FMO (AdaC or AptC). If the decaketide substrate is not C2-hydroxylated, the corresponding cyclization product 77 is not observed. Conversely, upon C2-hydroxylation, no hydrolysis product such as 78a (or the decarboxylated product 78) can be detected (Figure 24). Therefore, the α-hydroxylation step essentially functions as a nucleophile switch that modulates the dual functions of AptB. Insertion of the tertiary hydroxyl group α to the C1 thioester carbonyl can have profound effects on substrate reactivity and conformation. This is evidenced by the slow but detectable spontaneous cyclization to 73 in the absence of AptB when C2 is hydroxylated. In contrast, the hydrolysis to yield 74a cannot be observed without AptB. The C2-OH can disrupt the keto-enol tautomerization across C1 and C3, thereby increasing the electrophilicity of C1 for nucleophilic attack. Additionally, the C2-hydroxyl group may hydrogen bond to the C1 carbonyl and may provide stabilizing effects to the tetrahedral intermediate. However, these electronic effects should be observed in both hydrolysis and Claisen cyclization reactions, and therefore cannot be used solely to explain the “nucleophile switching” role of this modification.

Although the absolute stereochemistries of 73 and 70 are unknown, the C2-OH is most likely in the (R) configuration and is syn to the C15-OH, as seen in the crystal structure of 69.(181) The syn configuration may lead to rearrangement of the flexible C17-C19 diketone moiety, such as through hydrogen bonding, and position the C18 nucleophile more favorably for catalysis. Furthermore, this stereochemical configuration can position a cluster of oxygen atoms, including those on C1, C2, C15, C17 and C19, on the same face of the substrate, all within suitable distances for chelating both Mn2+ ions when bound to AptB. Such chelation may significantly increase the binding affinity of the substrate in the active site. For example, one of the Mn2+ ions can chelate both the C2-OH and the C1 carbonyl to yield a five-membered ring (Figure 27). Such a five-membered ring structure is not unprecedented, and has been seen in dihydroxyacetone phosphate chelating a Zn2+ cation in E. coli class II fructose-1,6-bisphosphate aldolase.(182, 183) In this mode, the water molecule chelates only the other Mn2+ ion, and this may prevent the deprotonation required for hydrolysis. The general base is instead positioned more favorably to deprotonate the C18 α-carbon.

Alternatively, in addition to the five-membered ring formed through coordination to the first Mn2+, there is a possibility that the C17-C19 diketone moiety can be coordinated to the second Mn2+ ion following substrate conformation changes induced by C2-hydroxylation (Figure 27). This indeed represents the most favorable model for the Claisen cyclization, and is similar to that seen with aldolases.(183, 184) The coordination of the C17 and C19 carbonyls strongly favors the formation of the enol and deprotonation of C18 without the requirement of a general base. The displacement of the hydroxide or water on the second Mn2+ should also completely suppress the hydrolysis reaction. Additional biophysical and structural studies on this novel Claisen cyclase are needed to understand the exact modes of substrate binding and metal coordination.

The Bifunctional MβL-TEs are different from the TE/CLC domains.

The α-hydroxylation-dependent Claisen cyclization strategy employed by AptB and AdaB is different from the mechanism used by the TE/CLC domains found in many NRPKSs, such as those involved in the biosynthesis of bikaverin,(37) naphthopyrone(129) and aflatoxin.(146) During the Claisen-like cyclization catalyzed by TE/CLC, the active site serine forms an oxyester bond with C1 of the polyketide substrate, releasing it from the phosphopantetheinyl arm of the ACP. Formation of the covalent intermediate was suggested to reorient the nucleophilic α-carbon with respect to the Ser-His-Asp catalytic triad and enable formation of the enol.(146) The reaction then proceeds via deprotonation of the α-carbon and nucleophilic attack on C1. Given the vastly different reaction mechanisms, it is not surprising that fungal TE/CLC and MβL-TE share no sequence similarity. Sequence alignment of AptB, AdaB and VrtG does indicate a conserved serine residue (S191 for AptB, Figure A24); however, mutation to alanine in AptB did not affect the efficiency of the Claisen cyclization (Figure 26B). Therefore, MβL-TEs employ a completely different mechanism to enolize the α-carbon, and no covalent bond is formed between the polyketide intermediate and the MβL-TEs during the cyclization. Furthermore, in the biosynthesis of fungal aromatic polyketides requiring TE/CLC domains, the C1 carbonyl is usually directly connected to an aromatic ring due to the C2-C7 or C4-C9/C2-C11 cyclization steps catalyzed by PT, and the hyperconjugation between the aromatic ring and C1 carbonyl reduces the reactivity of C1. Therefore, the covalently bound enzyme-substrate intermediate is required to promote the cyclization reaction in TE/CLC.

In contrast, bacterial type II PKSs infrequently perform the final ring closure through Claisen cyclization, except in the biosynthesis of tetracyclines and aureolic acids. Anhydrotetracycline(185) and premithramycinone,(186) the key intermediates in tetracycline and aureolic acid biosynthesis, respectively, share similar naphthacenedione scaffolds. Correspondingly, the biosynthetic steps also involve a C18-C1 Claisen-like ring closure. The electronic environment of the α-carbon is more similar to that of the substrates of the TE/CLC-catalyzed reactions, in that the C1 carbonyl is directly connected to and conjugated with the aromatic portions of the molecules. Therefore, although the exact catalytic mechanism of the C18-C1 fourth-ring cyclization in bacteria has not been completely established, it is most likely different from the fungal strategy described here.

Phylogenetic Analysis of MβL-TEs.

During phylogenetic analysis of PT domains, the Group V NRPKSs were branched into two major subgroups. Interestingly, their corresponding MβL-TEs can also be classified into two subgroups (Figure A26) with a phylogenetic distribution similar to that of their paired NRPKSs, indicating that the NRPKS/MβL-TE pairs may have co-evolved. The first subgroup (Group V1) contains ACTE and AN0149, while the second subgroup (Group V2) contains AdaB, AptB and VrtG. AN0149 has been linked to the biosynthesis of the anthraquinone emodin, which is likely synthesized via an anthracenone intermediate as in the biosynthesis of 67 involving ACTE.(122) Hence, the classification of MβL-TEs is largely based on the sizes of the polyketide substrates. The fourth-ring cyclization mechanism of Group V2 MβL-TEs should be common to the biosynthesis of fungal tetracyclic naphthacenedione compounds, such as 69, hypomycetin and anthrotainin. To investigate whether the Claisen cyclization activity is universal among both groups of MβL-TEs, we also purified and assayed ACTE with AdaA and AdaC. However, formation of 73 was not accelerated, which indicates that the Claisen cyclization activity is unique to the Group V2 MβL-TEs.

The presence of several uncharacterized PKS clusters with the Group V2 NRPKS/MβL-TE/FMO three-gene ensemble in other sequenced fungal genomes (Figure A26) indicates that there are potentially other undiscovered fungal tetracyclic compounds. Based on the presence of FMOs in all analyzed Group V2 PKS clusters and their absence in all Group V1 PKS clusters, it is likely that the Group V2 MβL-TEs diverged from the hydrolytic Group V1 MβL-TEs and acquired the Claisen cyclase function during the recruitment of the FMO into these pathways. The fact that AptB exhibits higher efficiency in catalyzing the Claisen cyclization than hydrolysis, and the presence of the unique FMO in the apt cluster, suggest that AptB may have been part of a tetracyclic pathway but reverted to its hydrolytic function more recently. This is perhaps due to the loss of the ability of the NRPKS AptA to produce the longer decaketide chain. Most identified enzymes in the MβL superfamily are known to catalyze hydrolytic reactions, such as the well-known β-lactamase that confers penicillin resistance. Therefore, the hydrolytic activity is likely the ancestral role of these newly identified MβL-TEs with fourth-ring cyclization activity.

2.3.4 Conclusion


In conclusion, this study sheds light on the mechanism of biosynthesis of anthracenones and naphthacenediones in fungi. Compared to the biosynthesis of naphthacenediones in bacteria, such as premithramycinone and anhydrotetracycline, which requires over ten enzymes, the fungal strategy is concise and involves only three enzymes. The linear polyhydroxylated tetracyclic compounds have been a promising source of bioactive molecules, which include clinically important antibiotics and antitumor drugs. The establishment of the minimal three-gene ensemble (Group V NRPKS/MβL-TE/FMO) required for the biosynthesis of a fungal naphthacenedione provides a useful signature for the discovery of other related compounds from fungi by genome mining. Furthermore, the demonstrated ability to produce tetracyclic naphthacenediones in yeast has opened up new possibilities to engineer these pharmacologically attractive tetracyclic scaffolds.

2.3.5 Materials and Methods

Strains and Culturing Conditions.

A. niger ATCC 1015 was obtained from NRRL and cultured at 28°C. The E. coli strains XL-1 Blue (Stratagene) and TOP10 (Invitrogen) were used for DNA manipulation, and BL21(DE3) (Stratagene) was used for protein expression. BAP1, described by Pfeifer et al., was obtained from Prof. Khosla.(94) S. cerevisiae BJ5464-NpgA (MATα ura3-52 trp1 his3Δ200 leu2Δ1 pep4::HIS3 prb1Δ1.6R can1 GAL) was used as the yeast expression host.(156, 157) S. cerevisiae transformants were selected on the appropriate complete minimal dropout agar plates and cultured in YPD medium with 1% dextrose at 28°C.

General Techniques for DNA Manipulation.

A. niger genomic DNA was prepared using the ZYMO ZR fungal/bacterial DNA kit according to supplied protocols. PCR reactions were performed with Phusion high-fidelity DNA polymerase (New England Biolabs, NEB) and Platinum Pfx DNA polymerase (Invitrogen). PCR products were subcloned into pCR-Blunt vector (Invitrogen) and confirmed by DNA sequencing. Restriction enzymes (NEB) and T4 ligase (Invitrogen) were used to digest and ligate the DNA fragments, respectively. Primers used to amplify the genes were synthesized by Integrated DNA Technologies and are listed in Table A7.

Construction of pYR311.

pBARGPE1 (obtained from FGSC) is a fungal expression vector that contains the A. nidulans gpdA promoter and trpC terminator.(154) adaR was amplified from A. niger genomic DNA with primers adaR-f and adaR-r. The resulting DNA fragment was digested with EcoRI and XhoI and inserted into pBARGPE1 (digested with EcoRI and XhoI) to yield pYR311.

Transformation of A. niger with pYR311.

Polyethylene glycol-mediated transformation of A. niger was conducted essentially as described previously for A. nidulans, except that the protoplasts were prepared with 3 mg/mL lysing enzymes (Sigma-Aldrich) and 2 mg/mL Yatalase (Takara Bio.).(134, 187) Glufosinate used for the selection of bar transformants was prepared as previously described.(134) Miniprep genomic DNA from A. niger transformants was prepared according to previous procedures for the screening.(188) Transformants were cultured in the presence of 4-8 mg/mL glufosinate for screening.(134) Integration of the AdaR-overexpression cassette was confirmed by PCR using GoTaq Green Master Mix (Promega) and two pairs of primers: bar-f and bar-r, GPE-gpdA-f and adaR-r.

Expression analysis by Reverse Transcription Polymerase Chain Reaction (RT-PCR).

The total RNA of A. niger mutants and wild-type strains was extracted using the Qiagen RNeasy plant mini kit following the supplied protocols. The first strand cDNA was synthesized using the gene-specific reverse primers (adaA-r, adaB-r, β-tub-r) and the Improm-II reverse transcription system (Promega) according to the manufacturer's instructions. cDNA was then amplified with GoTaq Green Master Mix (Promega) using three pairs of primers: adaA-r and adaA-PT-f, adaB-r and adaB-f, β-tub-r and β-tub-f.

Purification and Characterization of Compounds.

The A. niger transformant T4 producing 70 was cultured on 1 L GMM agar for 3 days at 28°C. The agar was extracted three times with an equal volume of ethyl acetate (EtAc). The resultant organic extracts were combined, dried over anhydrous Na2SO4, and concentrated in vacuo. The crude extract was then washed with hexane and redissolved in 1:1 methanol (MeOH)-chloroform (CHCl3). A Sephadex LH-20 column was used to fractionate the crude extract with 1:1 MeOH-CHCl3 as the mobile phase. Fractions that contained 70 were combined and dried. The residue was redissolved in HPLC grade MeOH and purified by reverse-phase HPLC (Alltech Alltima 5 μm, 250 mm × 10 mm) on a linear gradient of 35 to 95% acetonitrile (CH3CN, v/v in water) over 60 minutes and 95% CH3CN (v/v) over 15 minutes at a flow rate of 2 mL/min. 1D and 2D NMR of 70 were performed on a Bruker DRX-500 spectrometer using CDCl3 as the solvent. 73, 74, 75 and 76 were extracted from a 2 L YPD culture of the corresponding S. cerevisiae transformants, purified and analyzed in deuterated NMR solvents (MeOD for 70, acetone-d6 for 74 and 75, DMSO-d6 for 76) using similar procedures. The extraction and purification of 73 requires acidic conditions.
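The preparative HPLC program described above (a linear ramp from 35 to 95% acetonitrile over 60 minutes, then a 15-minute hold at 95%, at 2 mL/min) can be written as a simple piecewise function. The sketch below is only a restatement of those numbers for convenience; the function names are illustrative, and any instrument-specific details such as re-equilibration are not covered.

```python
RAMP_MIN = 60.0          # duration of the linear gradient (min)
HOLD_MIN = 15.0          # duration of the 95% hold (min)
START_PCT, END_PCT = 35.0, 95.0
FLOW_ML_PER_MIN = 2.0


def acetonitrile_percent(t_min: float) -> float:
    """Percent CH3CN (v/v in water) at time t_min after the start of the run."""
    if t_min < 0 or t_min > RAMP_MIN + HOLD_MIN:
        raise ValueError("time is outside the 75-minute program")
    if t_min <= RAMP_MIN:
        return START_PCT + (END_PCT - START_PCT) * t_min / RAMP_MIN
    return END_PCT


def total_solvent_ml() -> float:
    """Total mobile phase delivered over the whole program."""
    return FLOW_ML_PER_MIN * (RAMP_MIN + HOLD_MIN)


for t in (0, 30, 60, 75):
    print(f"{t:>3} min: {acetonitrile_percent(t):.0f}% CH3CN")
print(f"total mobile phase: {total_solvent_ml():.0f} mL")
```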

Construction of S. cerevisiae Expression Plasmids.

pKOS518-120A and pKOS48-172b are 2μ-based yeast-E. coli shuttle plasmids containing different auxotrophic markers (TRP1 and URA3, respectively).(189) Primers used to amplify adaA, adaB, adaC and adaD are listed in supplemental Table A7. adaA was amplified by PCR from A. niger genomic DNA with the primers adaA-f and adaA-r and inserted into pCR-Blunt. The resulting vector was digested with SpeI and PmeI, and the adaA gene was ligated into pKOS48-172b digested with SpeI and PmlI to yield pYR291. Similar methods were used to insert adaB, adaC and adaD into pKOS518-120A digested with NdeI and PmeI to yield pYR301, pYR313 and pYR315, respectively. The ADH2p-adaC-ADH2t cassette was obtained by digesting pYR313 with SalI and XhoI, and the fragment was inserted into pYR301 digested with XhoI to yield pYR330. The ADH2p-adaD-ADH2t cassette was obtained by digesting pYR315 with HindIII and XhoI, and the fragment was inserted into pYR330 digested with HindIII and SalI to yield pYR342.

The expression plasmids for the apt genes were constructed using similar methods, except that aptA was inserted into pKOS48-172b digested with NdeI and PmlI to yield pYR203; aptB and aptC were inserted into pKOS518-120A digested with NdeI and PmeI to yield pYR204 and pYR260, respectively. The ADH2p-aptC-ADH2t cassette was obtained by digesting pYR255 with SalI and XhoI, and was inserted into pYR204 digested with SalI to yield pYR260. The introns of adaB, aptA, aptB and aptC were removed by splicing by overlap extension (SOE)-PCR.

Reconstitution and Investigation of the Biosynthetic Pathway of 70 in S. cerevisiae

Various combinations of the expression plasmids were introduced into S. cerevisiae strain BJ5464-NpgA(156, 157) by using the S. c. EasyComp™ Transformation Kit (Invitrogen). The cells were grown at 28°C in 10 mL YPD media with 1% dextrose. For product detection, 1 mL of cell culture was collected and analyzed every 12 hours after induction for 72 hours. The medium was extracted with an equal volume of 99% EtAc/1% acetic acid (AcOH), and the cell pellets were extracted with 100 μL acetone. The organic phase was separated, evaporated to dryness, redissolved in 20 μL MeOH and analyzed with a Shimadzu 2010 EV Liquid Chromatography Mass Spectrometer (LC-MS, Phenomenex Luna 5 μm, 2.0 × 100 mm) using both positive and negative electrospray ionization. Compounds were separated on a linear gradient of 5% CH3CN (v/v in water, 0.1% formic acid) to 95% CH3CN (v/v in water, 0.1% formic acid) over 30 min with a flow rate of 0.1 mL/min.

Construction of E. coli Expression Plasmids.

The DNA fragments encoding AdaA, AdaC, AptB and AptC were each inserted into pET24 digested with NdeI and NotI to yield pYR303, pYR316, pYR146 and pYR201, respectively. The site-directed mutants of AptB were constructed through SOE-PCR and inserted into pET24. The DNA fragment encoding AptA was inserted into pET28 digested with NdeI and EcoRI to yield pYR62.

Expression and Purification of Enzymes.

Expression and purification of AdaA, AdaC, AptB, AptC, AptA and MatB followed identical procedures. The expression and purification of AdaA is detailed here as an example. pYR303 was used to transform the strain BAP1 for protein expression.(94) The cells were cultured at 37°C and 250 rpm in 500 mL LB medium supplemented with 35 μg/mL kanamycin to a final OD600 between 0.4 and 0.6. The culture was then incubated on ice for 10 min before addition of 0.1 M isopropylthio-β-D-galactoside (IPTG) to induce protein expression. The cells were further cultured at 16°C for 12-16 h. The cells were then harvested by centrifugation (3,500 rpm, 15 min, 4°C), resuspended in ~25 mL lysis buffer (20 mM Tris-HCl, pH 7.9, 0.5 M NaCl, 10 mM imidazole) and lysed by sonication on ice. Cellular debris was removed by centrifugation (15,000 rpm, 30 min, 4°C), and Ni-NTA agarose resin was then added to the supernatant (1-2 mL/liter of culture). The suspension was swirled at 4°C for 2 h and loaded into a gravity column. The protein-bound resin was washed with one column volume of 10 mM imidazole in buffer A (50 mM Tris-HCl, pH 7.9, 2 mM EDTA, 2 mM DTT), followed by 0.5 column volume of 20 mM imidazole in buffer A. The protein was then eluted with 250 mM imidazole in buffer A. Purified AdaA was concentrated and exchanged into buffer A + 10% glycerol with Centriprep filters (Amicon), aliquoted and flash frozen using acetone/dry ice. Protein concentration was determined with the Bradford assay using bovine serum albumin as a standard.(190) For the expression of Mn2+-AptB or Fe2+-AptB, 200 ppm of MnSO4 or FeSO4 was added to the expression media. Metal contents of AptB and its mutants were determined using an inductively coupled plasma atomic emission spectrometer (ICP-AES, TJA Radial IRIS 1000).

In vitro Reconstitution of Enzymes

The reactions were performed at 100 μL scale containing 100 mM phosphate buffer (pH 7.4) in the presence of 100 mM sodium malonate, 2 mM DTT, 7 mM MgCl2, 20 mM ATP, 5 mM coenzyme A and 20 μM MatB, with various combinations of NRPKSs and the tailoring enzymes.(104) Typically, the concentration of the NRPKS (i.e. AptA or AdaA) was 10 μM and the tailoring enzymes (i.e. AdaC, AptB, AptC) were at 50 μM. The reactions were quenched with 1 mL of 99% EtAc/1% AcOH. The organic phase was separated, evaporated to dryness, redissolved in 20 μL of DMSO and analyzed with LC-MS as previously described.
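The final concentrations listed above translate directly into molar amounts per 100 μL reaction. The following is a minimal sketch using only the concentrations stated in the text; the dictionary keys and function names are illustrative, and no stock concentrations are assumed because the text does not give them.

```python
# Final concentrations (mM) in the 100 uL in vitro reaction, as stated in the text.
REACTION_VOLUME_UL = 100.0

FINAL_MM = {
    "sodium malonate": 100.0,
    "DTT": 2.0,
    "MgCl2": 7.0,
    "ATP": 20.0,
    "coenzyme A": 5.0,
    "MatB": 0.020,              # 20 uM
    "NRPKS": 0.010,             # 10 uM
    "tailoring enzyme": 0.050,  # 50 uM
}


def amount_nmol(component, volume_ul=REACTION_VOLUME_UL):
    """Amount of a component in the reaction; mM x uL gives nmol."""
    return FINAL_MM[component] * volume_ul


def molar_excess(component, reference="NRPKS"):
    """Molar ratio of a component to a reference component."""
    return FINAL_MM[component] / FINAL_MM[reference]
```

For example, each reaction holds about 1 nmol of NRPKS and 10 μmol of malonate, a roughly 10,000-fold molar excess of the extender-unit precursor over the synthase.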

2.4 Enzymatic Total Synthesis of Aromatic Decaketides in vitro

2.4.1 Introduction

Type II aromatic polyketides are a structurally distinct group of natural products and include important therapeutics such as doxorubicin,(55) mithramycin(186) and oxytetracycline.(185) Many of these compounds are derived from a decaketide backbone (C20), and tetracenomycins and angucyclines are two special subgroups. Tetracenomycins are among the few subgroups that are synthesized through C9-C14 first-ring cyclization, while angucyclines adopt an angularly cyclized scaffold, which is distinct from the other aromatic decaketides that are linearly cyclized. Most angucyclines are derived from rabelomycin 34, and the enzymatic biosynthesis of 34 has recently been accomplished using seven type II PKS enzymes expressed from Streptomyces lividans and E. coli.(191) The engineered biosynthesis of bacterial aromatic nonaketides in E. coli utilizing a fungal NRPKS, reported in 2009, suggests an alternative and efficient way to synthesize bacterial aromatic polyketides.(17) Following the characterization of AdaA, encoded in the TAN-1612 70 biosynthetic locus, as a decaketide synthase,(192) we report the in vitro enzymatic synthesis of 34 and Tcm D1 81 utilizing either a dissociated enzyme complex or compact hybrid PKSs constructed from fungal and bacterial PKSs.

2.4.2 Results and Discussion

Characterization of AdaA as a Decaketide Synthase

Previous investigations of the ada gene cluster suggest that AdaA can be solubly expressed in E. coli and synthesizes a C20 pyranoanthraquinone 80 in the presence of malonyl-CoA.(192) To verify that the synthesis of the decaketide backbone is not affected by the removal of the PT domain, we extracted the ACP domain and the SAT-KS-MAT tridomain. These standalone domains were solubly expressed in E. coli (Figure A27), and their catalytic properties were examined by assaying with the MatB:malonyl-CoA regenerating system (MatB, malonate, CoA, ATP and MgCl2).(104) Selected-ion LC-MS monitoring of the reaction mixture, with either 12C-malonate or [2-13C]malonate as the malonyl-CoA precursor, showed that most of the polyketides synthesized by the AdaA minimal PKS domains were decaketides, including the C7-C12 cyclized SEK15 2 and the C9-C14 cyclized RM80 16 (Figure A28, Figure 28). In addition, decent amounts of nonaketides and triketide pyrones were detected in the reaction mixtures (Figure 28). The synthesis of nonaketides such as 15 is consistent with the trace amount of the C18 anthracenone 75 detected in S. cerevisiae culture expressing AdaA and AdaB.(192) The replacement of AdaA-ACP with PKS4-ACP did not result in any change to the product profile (Figure 28). Together, this implies that the domain-domain interactions among the AdaA minimal PKS domains are not as efficient, which is likely attributable to a deficiency in KS or MAT domain function. Relinking the SAT-KS-MAT and ACP domains with the KR-ACP interdomain linker peptide (~90 aa) from the 6-DEBS module 3 results in the hybrid PKS called PKSH_ada. In the presence of malonyl-CoA, PKSH_ada synthesizes a similar array of polyketide products, except for a larger proportion of truncated products (Figure 28), which implies that the engineering of AdaA may affect the domain-domain interactions.
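The use of [2-13C]malonate to distinguish decaketides from nonaketides rests on simple mass arithmetic, sketched below. This is an illustration only, assuming that every ketide unit (including the starter unit) is derived from malonate and carries exactly one 13C label; the function names are hypothetical.

```python
# Mass difference between 13C and 12C in Da.
C13_MINUS_C12 = 1.003355


def ketide_carbons(n_ketides):
    """Backbone carbon count of a polyketide built from n two-carbon units."""
    return 2 * n_ketides


def labeled_mass_shift(n_ketides):
    """Expected mass increase (Da) when each ketide unit carries one 13C.

    Assumption: all units, including the starter, come from [2-13C]malonate.
    """
    return n_ketides * C13_MINUS_C12


def chain_length_from_shift(observed_shift_da):
    """Infer the number of ketide units from an observed labeled-minus-unlabeled shift."""
    return round(observed_shift_da / C13_MINUS_C12)
```

Under these assumptions a decaketide (C20) shifts by about 10 Da and a nonaketide (C18) by about 9 Da, so the two chain lengths can be told apart by comparing the labeled and unlabeled ion masses.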

We then tested the effect of bacterial first-ring cyclases on the synthesis of decaketides by the AdaA minimal PKS domains. The addition of ZhuI and TcmN led to the major synthesis of 2 and 16, respectively (Figure 28), which agrees with the previously reported behavior of the PKS4 minimal PKS domains.(17) Therefore, although the AdaA minimal PKS domains synthesize truncated polyketide products in addition to the full-length decaketides, they are amenable to domain manipulation and can interact with the bacterial cyclases. Hence, this PKS system is applicable to the enzymatic total synthesis of bacterial aromatic decaketides.

Biosynthesis of Rabelomycin

Despite their structural diversity, most angucyclines are synthesized from rabelomycin 34 through a series of oxidative ring cleavage, glycosylation, small-molecule fusion and other reactions.(141) The C4-C17 cyclization is the critical step leading to the angular folding and can be fixed by a group of cyclases including JadI from the jadomycin B 35 biosynthetic pathway.(61) Previous investigations suggest that the KS-CLF in the angucycline biosynthetic pathways may influence the regioselectivity and be required for the proper function of the angucycline cyclases.(63, 64) To establish a general mode of angucycline cyclases functioning with different decaketide PKSs, we utilized the AdaA minimal PKSs as the control PKS system. A previous investigation demonstrated that the angucycline minimal PKSs (KS-CLF, MAT and ACP), C9-KR (GilF), C7-C12 cyclase/aromatase (JadD) and C4-C17 cyclase (RavG) are sufficient to synthesize 34 in vitro in the presence of malonyl-CoA and NADPH.(192) When the minimal PKSs are substituted with the AdaA SAT-KS-MAT and ACP domains, LC-MS analysis of the reaction mixture indicates that 34 is still synthesized as the major product (Figure 28).

The efficient in vitro synthesis of 34 with the AdaA minimal PKS domains implies that the angucycline cyclases alone are sufficient to direct the C4-C17 cyclization. Moreover, compared to the in vitro assay of the AdaA minimal PKSs without the bacterial tailoring enzymes, the synthesis of the truncated nonaketides is almost abolished in the 34-synthesizing assay. This suggests that the presence of the tailoring enzymes exerts a stronger effect on the domain-domain interactions during chain elongation in AdaA than in PKS4.

Figure 28. Characterization of the AdaA minimal PKS domains and the in vitro biosynthesis of 34. LC-MS analysis (350 nm) of the SAT-KS-MAT tridomain assayed with i) AdaA-ACP; ii) PKS4-ACP; iii) AdaA-ACP and ZhuI; iv) AdaA-ACP and TcmN; v) relinked AdaA-ACP; vi) GilF, JadD and RavG. [Chromatogram traces i-vi, time axis 10-30 minutes, with peaks labeled 2, 16, 15 and 34; image not reproduced.]

Biosynthesis of Tetracenomycin

The synthesis of 34 demonstrates that, in the presence of tailoring enzymes, the AdaA minimal PKS domains can efficiently synthesize aromatic decaketides. In addition to the dissociated enzyme complex resembling a type II PKS system, we investigated whether the synthesis of bacterial aromatic decaketides can be accomplished using a single multidomain megasynthase like the fungal NRPKSs.

In 2008, the structure of the cyclase portion of TcmN was reported to adopt a helix-grip fold;(29) and the structure of PksA-PT, solved in 2010, employs a “double hot dog (DHD)” fold, which is similar to the helix-grip fold of TcmN. Because of the structural and functional similarity between TcmN and the PT domain, we swapped AdaA-PT with TcmN, which resulted in a hybrid megasynthase called PKSH_tcm1. PKSH_tcm1 was solubly expressed in E. coli (Figure 29A), which suggests that the PT domain and TcmN are structurally interchangeable. When PKSH_tcm1 was assayed with the MatB:malonyl-CoA regenerating system, the C9-C14 cyclized 15 and 16 became the major products, while the production of other cyclized products was mostly abolished (Figure 29). Compared to the product profile of the dissociated enzyme complex (Figure 28, trace iv), the ratio of C18 to C20 products synthesized by PKSH_tcm1 is slightly increased (Figure 29). This may be because the insertion of TcmN influenced the AdaA minimal PKS domain-domain interactions. Although the insertion of TcmN leads to more truncated products, the embedded TcmN remains functional in dictating the regioselectivity. The same domain-swapping experiment was conducted on PKS4 and resulted in a hybrid PKS that synthesizes 15 as the only major product in the presence of malonyl-CoA (Figure A32). The difference in efficiency may be due to the aforementioned deficiency in the AdaA KS-MAT domains. In summary, the successful domain engineering of AdaA and PKS4 indicates that PT domains are structurally and functionally replaceable with the bacterial first-ring cyclases.(147)

Figure 29. (A) SDS-PAGE analysis of PKSH_tcm1 and PKSH_tcm2. (B) LC-MS analysis (350 nm) of in vitro assays containing malonyl-CoA and i) relinked AdaA minimal PKS domains; ii) PKSH_tcm1; iii) PKSH_tcm2. [Panel A shows marker bands at 220, 160 and 120 kDa; panel B shows traces i-iii, time axis 10-35 minutes, with peaks labeled 2, 16, 15 and 81; image not reproduced.]

Although the enzymes involved in the synthesis of tetracenomycin C 29 have been extensively studied and utilized over the past 20 years for the engineered biosynthesis of novel polyketides, whether the C5-C18 aldol condensation takes place spontaneously remains unclear, as the expression of the tcm minimal PKSs and TcmN in different Streptomyces hosts results in the production of distinct metabolites. The major synthesis of 16 among the decaketides synthesized by PKSH_tcm1 suggests that the C5-C18 aldol condensation is not a spontaneous reaction subsequent to the C9-C14/C7-C16 cyclizations.

TcmI functions as the last-ring cyclase to catalyze the C2-C19 aldol condensation. By fusing TcmI at the C-terminus of PKSH_tcm1, a hexa-domain megasynthase was constructed, which is designated PKSH_tcm2. The hybrid megasynthase PKSH_tcm2 was assayed with the MatB:malonyl-CoA regenerating system; compared to the PKSH_tcm1 reaction mixture, a new peak with mass and UV identical to those of Tcm D1 81 was detected (Figure 29, Figure A33). The synthesis of this peak suggests C2-C19 ring closure in the presence of TcmI, and therefore indicates that the C5-C18 ring formation takes place following the first two ring cyclizations when both TcmN and TcmI are present. However, compared to the synthesis of 34, the efficiency of this reaction is much lower, which suggests that although TcmI and TcmN are sufficient for the four-ring formation, more accessory enzymes may be required for a more efficient synthesis of the tetracyclic scaffold. Moreover, the successful construction and reconstitution of PKSH_tcm1 and PKSH_tcm2 suggest a new approach for the synthesis of bacterial aromatic polyketides with a single enzymatic component.

On the other hand, the functional TcmI embedded in PKSH_tcm2 is surprising, as the C-terminal domain of fungal NRPKS is normally a monomer, and TcmI has been suggested to be a trimer or dimer.(57) However, as there is little truncation of PKSH_tcm2 (Figure 29), the possibility that dissected domains containing TcmI act on the polyketide intermediate is low. Further investigations are therefore required to understand the domain-domain interactions in this hybrid megasynthase.

2.4.3 Conclusion

In conclusion, we have reconstituted the biosynthesis of two important bacterial aromatic decaketide intermediates, rabelomycin 34 and Tcm D1 81, with AdaA minimal PKS domains arranged either as a dissociated enzyme complex resembling a type II PKS, or as a single multidomain megasynthase resembling a fungal type I PKS. The synthesis of 34 establishes that the C4-C17 cyclase does not require the angucycline KS-CLF to fix the angular bending of the polyketide intermediate, while the synthesis of 81 implies that TcmN and TcmI are sufficient, yet not highly efficient, to convert the linear C20 decaketide backbone into the tetracyclic scaffold. Also, the successful construction of PKSH_tcm2 to synthesize 81 further indicates the highly amenable nature of fungal NRPKS domains and provides a new approach for the synthesis of bacterial aromatic polyketides with a single enzymatic component.

2.4.4 Material and Methods

Materials

[2-13C]-malonate was purchased from Cambridge Isotope Laboratories. CoA was purchased from Sigma-Aldrich. The Escherichia coli strain XL-1 Blue utilized for DNA manipulation and BL21(DE3) used for protein expression were purchased from Stratagene. The protein expression strain BAP1 described by Pfeifer et al.(94) was from Prof. Chaitan Khosla.

General technique for DNA manipulation

PCR reactions were performed with the Phusion high-fidelity DNA polymerase (NEB) and Platinum Pfx DNA polymerase (Invitrogen). The PCR products were cloned into the pCR-Blunt vector obtained from Invitrogen. Restriction enzymes (NEB) and T4 ligase (Invitrogen) were used to digest and ligate the DNA fragments. The primers used to amplify the genes were synthesized by IDT.

Cloning of the stand-alone SAT-KS-MAT and ACP domains

Primers used to amplify the SAT-KS-MAT and ACP domains are listed in supplemental Table A14. The DNA fragments encoding SAT-KS-MAT and ACP were digested with NdeI and NotI and inserted into pET24 (digested with NdeI and NotI) and pET28-Amp (digested with NdeI and NotI) to yield pYR302a and pYR308, respectively.

Construction of PKSH_ada

Primers used to amplify the linker region and AdaA-ACP are listed in supplemental Table A14. The DNA fragments encoding the linker peptide and AdaA-ACP were each amplified and fused together by SOE-PCR to synthesize a DNA fragment encoding linker-AdaA-ACP flanked by NotI sites. pYR305 is derived from pET28-Amp and encodes the N-terminal hexahistidine-tagged AdaA-SAT-KS-MAT between NdeI and NotI. The DNA fragment of linker-AdaA-ACP flanked by NotI was inserted into the NotI-digested pYR305 to yield pYR323.

Construction of PKSH_tcm1

Primers used to amplify TcmN and PKS4-ACP are listed in supplemental Table A14. The DNA fragments encoding TcmN and PKS4-ACP were amplified and fused together by SOE-PCR to synthesize a DNA fragment encoding TcmN-ACP flanked by NotI sites. The DNA fragment of TcmN-ACP flanked by NotI was inserted into the NotI-digested pYR305 to yield pYR415. Similarly, for constructing the PKS4 version of PKSH_tcm1, the DNA fragment of TcmN-ACP flanked by NotI was inserted into the NotI-digested pWJ266 (derived from pET28-Amp and encodes the N-terminal hexahistidine-tagged AdaA-SAT-KS-MAT between NdeI and EcoRI) to yield pYR413.

Construction of PKSH_tcm2

Primers used to amplify PKSH_tcm1 and TcmI are listed in supplemental Table A14. The DNA fragment encoding PKSH_tcm1 was digested with NdeI and AflII, and the DNA fragment encoding TcmI was digested with AflII and EcoRI. The DNA fragments of PKSH_tcm1 flanked by NdeI and AflII and TcmI flanked by AflII and EcoRI were inserted into the NdeI/EcoRI-digested pET28-Amp through a three-piece ligation to yield pYR431.
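The logic of the three-piece ligation above can be checked with a small bookkeeping sketch. It is a simplification that compares restriction enzyme names only, not the actual overhang sequences (so it ignores the fact that different enzymes can sometimes produce compatible ends); the function name is hypothetical.

```python
def assembles(vector_ends, inserts):
    """Check that ordered inserts bridge the two cut sites of the vector.

    vector_ends: (first_site, second_site) used to open the vector.
    inserts: list of (five_prime_site, three_prime_site) in intended order.
    Sites are matched by enzyme name only, which is a simplification.
    """
    first_site, second_site = vector_ends
    current = first_site
    for five_prime, three_prime in inserts:
        if five_prime != current:
            return False  # this fragment cannot join the preceding end
        current = three_prime
    return current == second_site


# Design described in the text for pYR431.
vector = ("NdeI", "EcoRI")
fragments = [("NdeI", "AflII"), ("AflII", "EcoRI")]  # PKSH_tcm1, then TcmI
```

With the sites given in the text, `assembles(vector, fragments)` evaluates to True, reflecting that the shared AflII end joins the two inserts while NdeI and EcoRI anchor them in the vector.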

Expression and purification of proteins

Expression of stand-alone domains and full megasynthases followed identical procedures. The example of AdaA-ACP is detailed here. pYR308 was used to transform BL21(DE3) for protein expression. The cells were cultured at 37°C and 250 rpm in 500 mL LB medium supplemented with 100 μg/mL ampicillin to a final OD600 between 0.4 and 0.6. The culture was then incubated on ice for 10 min before addition of 0.1 M isopropylthio-β-D-galactoside (IPTG) to induce protein expression. The cells were further cultured at 18°C for 12-16 h. The cells were then harvested by centrifugation (3,500 rpm, 15 min, 4°C), resuspended in ~25 mL lysis buffer (20 mM Tris-HCl, pH 7.9, 0.5 M NaCl, 10 mM imidazole) and lysed by sonication on ice. Cellular debris was removed by centrifugation (15,000 rpm, 30 min, 4°C) and Ni-NTA agarose resin was then added to the supernatant (1-2 mL/L of culture). The suspension was swirled at 4°C for 2 h and loaded into a gravity column. The protein-bound resin was washed with one column volume of 10 mM imidazole in buffer A (50 mM Tris-HCl, pH 7.9, 2 mM EDTA, 2 mM DTT), followed by 0.5 column volume of 20 mM imidazole in buffer A. The protein was then eluted with 250 mM imidazole in buffer A. Purified AdaA-ACP was concentrated and exchanged into buffer A + 10% glycerol with Centriprep filter devices (Amicon), aliquoted and flash frozen. Protein concentration was determined with the Bradford assay using BSA as a standard.(190)

In vitro Reconstitution of Enzymes

The reactions were performed at 100 μL scale containing 100 mM phosphate buffer (pH 7.4) in the presence of 100 mM sodium malonate, 2 mM DTT, 7 mM MgCl2, 20 mM ATP, 5 mM coenzyme A and 20 μM MatB, with various combinations of PKSs and the tailoring enzymes.(104) Typically, the concentration of the hybrid megasynthase or AdaA SAT-KS-MAT was 10 μM and the tailoring enzymes or ACP were at 50 μM. The reactions were quenched with 1 mL of 99% EtAc/1% AcOH. The organic phase was separated, evaporated to dryness, redissolved in 20 μL of DMSO and analyzed with a Shimadzu 2010 EV Liquid Chromatography Mass Spectrometer (LC-MS, Phenomenex Luna 5 μm, 2.0 × 100 mm) using both positive and negative electrospray ionization. Compounds were separated on a linear gradient of 5% CH3CN (v/v in water, 0.1% formic acid) to 95% CH3CN (v/v in water, 0.1% formic acid) over 30 min with a flow rate of 0.1 mL/min.


3. Conclusions

Aromatic polyketides are a group of structurally distinct natural products, usually highly substituted fused-ring polyphenols, that exhibit a broad range of biological activities. Both bacterial type II polyketide synthase (PKS) and fungal type I NRPKS systems synthesize this family of compounds, which are derived from highly reactive poly-β-ketone backbones through regioselective cyclizations of various folding modes. This family includes a number of highly successful pharmaceutical compounds, including tetracycline, daunorubicin and mithramycin. On the other hand, some human or plant diseases have been linked to fungal mycotoxins, many of which are aromatic polyketides, such as the highly toxic and carcinogenic aflatoxin. The rich biological activities, both beneficial and harmful, and the structural complexity of aromatic polyketides have attracted research interest in the discovery and biosynthesis of these compounds. Although the bacterial type II PKSs have been well studied during the past 30 years, some questions remain unsolved due to the lack of a control enzymatic system. Compared to the bacterial type II PKSs, investigations of fungal NRPKSs have been limited to date.

On the other hand, advances in rapid DNA sequencing techniques have enabled the complete genome sequencing of many fungal species, which has revealed large numbers of natural product biosynthetic pathways, including an abundance of putative NRPKSs. Many of the encoded pathways are silent under laboratory culturing conditions; as a result, the metabolites associated with these pathways have not been observed or isolated. Understanding the function and establishing sequence-activity relationships of NRPKSs is therefore an important step toward the prediction of polyketide structures from NRPKS sequences, and can enable the genome mining of hundreds of cryptic NRPKSs uncovered via genome sequencing.


The product template (PT) domains embedded in the NRPKSs are unique and not shared by other types of PKSs. To investigate the PT domain function, we conducted domain engineering of PKS4, the bikaverin 50 NRPKS in G. fujikuroi. We replaced the PT domain with a linker peptide to construct a megasynthase, PKS_WJ, that retains the KS, MAT and ACP domains on a single polypeptide. The engineered megasynthase continued to produce ample amounts of full-length nonaketide backbones, yet the committed C2-C7 cyclization was no longer observed. In addition, the ACP-bound nonaketide intermediate synthesized by PKS_WJ can be efficiently tailored by bacterial tailoring enzymes including cyclase and ketoreductase. As the core enzymes of type II PKSs required for chain elongation have never been solubly expressed in E. coli, the biosynthesis of bacterial aromatic polyketides had never been accomplished in E. coli until our study utilizing PKS_WJ. More importantly, the domain engineering of PKS4 allowed us to identify the function of the PT domain as the first-ring aldol cyclase, which is consistent with simultaneous work studying PksA in A. parasiticus.

As PT domains mediate the regioselective cyclization, understanding the sequence-activity relationships of PT domains becomes an essential step towards correlating the polyketide structures with the NRPKS sequences. In the following work, the PT domains from NRPKSs of known functions were classified phylogenetically into five major groups, with each group corresponding to a unique first-ring cyclization regioselectivity and/or product size. Subsequently, 99 additional PT sequences extracted from NRPKSs of unknown functions were included in the alignment, with no significant influence on the classification. To verify the predicted regioselectivity of PT domains, we constructed a couple of hybrid NRPKSs by using the TE-less PKS4 (PKS4-99) as the template and replacing the endogenous PKS4-PT with the unidentified PT domains. The hybrid megasynthases remain functional in producing polyketides, and the regioselectivity of the cyclization step is completely altered upon replacement of the PT domains. By analyzing the structures of the products synthesized by the hybrid NRPKSs, the regiospecificity of the unknown PT domains can be deciphered. Therefore, a sequence-activity relationship of PT domains was established, verified and utilized to accurately predict the regiospecific activity of unknown PT domains for the first time.

Utilizing the PT phylogenetic classification, we were able to locate the gene cluster (ada) for the biosynthesis of TAN-1612 70 in A. niger. The ada gene cluster was verified through overexpression of the transcriptional regulator in A. niger and subsequent gene expression analysis. The enzymes encoded in the ada gene cluster share high sequence similarity with those of the apt pathway for the anthraquinone asperthecin 65 from A. nidulans. To uncover the enzymatic basis for the fourth-ring Claisen cyclization, these two highly homologous gene clusters were comparatively investigated through heterologous pathway reconstitution in S. cerevisiae and in vitro biochemical characterization. These studies revealed a novel α-hydroxylation-dependent Claisen cyclization that is catalyzed by a bifunctional dimanganese metallo-β-lactamase-type thioesterase (MβL-TE). The MβL-TE-catalyzed cyclization was found to take place only when the ACP-bound polyketide intermediate is α-hydroxylated by the FAD-dependent monooxygenase (FMO), while in the absence of FMO, the MβL-TE performs hydrolytic release of an anthracenone product. The bifunctional MβL-TE is the first example among the binuclear manganese enzymes that can catalyze C–C bond formation through Claisen-like condensation; and it is the first example of a TE in polyketide biosynthesis that catalyzes the Claisen cyclization without an α/β hydrolase fold and forms no covalent bond with the substrate.

The characterization of the ada pathway provided us with a decaketide synthase which, according to our in vitro reconstitution experiments, is amenable to domain engineering and applicable to the biosynthesis of bacterial aromatic decaketides. Through combinatorial biosynthesis with bacterial tailoring enzymes both in cis and in trans, we accomplished the in vitro enzymatic synthesis of Tcm D1 81 and rabelomycin 34. The successful synthesis of 34 in vitro with the AdaA minimal PKS excludes the influence of the angucycline minimal PKSs on the angular cyclizations. The biosynthesis of 81 by the hybrid megasynthase (PKSH_tcm2) revealed that only two cyclases, TcmN and TcmI, are sufficient for the formation of the tetracyclic scaffold from the linear polyketide intermediate, although other accessory enzymes may be required for an efficient biosynthetic procedure. In addition to providing a PKS system to answer previously unsolved questions about type II PKSs, this study also demonstrates the first example of constructing a functional multidomain megasynthase utilizing building blocks from both fungal and bacterial synthases.


4 Appendix

4.1 Compound List

[Chemical structure drawings of compounds 1-81, including 74a, 75a, 76a and 78a; structures not recoverable from the extracted text.]


4.2 Supporting information for chapter 2.1

Table A1. Primers used in this study. Restriction sites are underlined.

Primer                    Sequence
pks4-KS-F-NdeI            5’-AAACATATGACCTTGGAGAGCCTGACCAAA-3’
pks4-AT-R-NotI-S-EcoRI    5’-AAAGAATTCCTATGCGGCCGCAGTCAATCCGTTGACGCGATG-3’
pks4-ACP-F-NdeI           5’-AAACATATGAGCGTCGGCACAACATCACCT-3’
pks4-ACP-R-NotI-S-EcoRI   5’-AAAGAATTCCTATGCGGCCGCACTAGGGGATGTTGAAGCAGC-3’
DEBS2M3-ACPL-F            5’-AAGCGGCCGCACTGCCGGGCGGGGCGGTGG-3’
DEBS2M3-R                 5’-GGTGATGTTGTGCCGACGCTCCGTTGTGCCGGCTCGCCGG-3’
PKS4ACP-F                 5’-CCGGCGAGCCGGCACAACGGAGCGTCGGCACAACATCACC-3’
PKS4ACP-R                 5’-AAGCGGCCGCTCAACTAGGGGATGTTGAAGCAG-3’
GrisARO-NdeI-His          5’-AAAACATATGAGCAGCCATCATCATCATCATCACAGCAGCATGTCGCAGCCCGGCCTGCG-3’
GrisARO-XhoI              5’-AAAAACTCGAGTCAGCCCCGCCGGCCCGCGG-3’
OxyN_5_NcoI               5’-AAAAAACCATGGCAATGCGCATCATCGATCTGTC-3’
OxyN_3_HindIII            5’-AAAAAAAAGCTTCTACTCCTCCACCACCGCC-3’
MtmQ_5_NdeI               5’-AAGTTTAAACGGAGGAGCCCATATGCCGAACAGCCAGCAGTACT-3’
MtmQ_3_SpeI               5’-AAACTAGTCAGACACCGGCGCGCTCCTC-3’
StfQ_5_NdeI               5’-AGTTTAAACGGAGGAGCCCATATGCCGACTCTGGGCAAAGA-3’
StfQ_3_NheI               5’-AGCTAGCTCAGACACCGGCGCGCTCCT-3’
PKS4KSS2A                 5’-AGTATCGCTGTTGACACTGCAGCCTCTTCGAGTCTTGCTGCGA-3’
PKS4KSS2A-r               5’-TCGCAGCAAGACTCGAAGAGGCTGCAGTGTCAACAGCGATACT-3’

117
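Because the underlining that marks the restriction sites does not survive in plain text, the sites can be located again by searching each primer for the recognition sequence of the enzyme named in its label. The sketch below is only an illustration and not part of the original methods: the names `SITES` and `find_sites` are hypothetical, and the recognition sequences are the standard ones for these enzymes.

```python
# Standard recognition sequences for the enzymes named in the primer labels.
SITES = {
    "NdeI": "CATATG",
    "NotI": "GCGGCCGC",
    "EcoRI": "GAATTC",
    "XhoI": "CTCGAG",
    "NcoI": "CCATGG",
    "HindIII": "AAGCTT",
    "SpeI": "ACTAGT",
    "NheI": "GCTAGC",
}


def find_sites(primer):
    """Return {enzyme: [0-based start positions]} for every site found in the primer."""
    seq = primer.upper()
    hits = {}
    for enzyme, site in SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            # Step by one so that overlapping occurrences are not missed.
            start = seq.find(site, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


# Two primers copied from Table A1.
primers = {
    "pks4-KS-F-NdeI": "AAACATATGACCTTGGAGAGCCTGACCAAA",
    "pks4-AT-R-NotI-S-EcoRI": "AAAGAATTCCTATGCGGCCGCAGTCAATCCGTTGACGCGATG",
}

for name, seq in primers.items():
    print(name, find_sites(seq))
```

The same function can be applied to the primers in the later tables; a primer whose label names an enzyme but yields no hit would be worth checking against the original document.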

Table A2. Expression plasmids used in this study.

Plasmid    Derived from    Function
pSMa99    pET28a    Expression plasmid for PKS4-99, the “TE-less” PKS4
pWJ248    pET28a    Expression plasmid for PKS4 ACP domain
pWJ249    pET28a    Expression plasmid for PKS4 KS_MAT didomain
pWJ282    pET28a    Expression plasmid for PKS_WJ (KS_MAT_Linker_ACP)
pWJ221    pET28a    Expression plasmid for TcmN cyclase domain
pWJ291    pET28a    Expression plasmid for ZhuI
pYR30    pET28a    Expression plasmid for MtmQ
pYR31    pET28a    Expression plasmid for StfQ
pETwhiE    pET28a    Expression plasmid for WhiE
pWJ292    pET28a    Expression plasmid for MatB
pWJ269    pET28a    Co-expression plasmid for KS_MAT and ACP
pWJ309    pET28a    Co-expression plasmid for PKS_WJ and act KR
pWJ272    pCDFDuet-1    Co-expression plasmid for oxy CYC2 and gris CYC1
pWJ268    pACYCDuet-1    Co-expression plasmid for ACP and act KR
pWJ298    pET28a    Co-expression plasmid for PKS_WJ and ZhuI

Table A3: NMR characterization of 7.

Spectra were obtained at 500 MHz for proton and 125 MHz for carbon and were recorded in CD3OD.

No.    13C δ (ppm)    1H δ (ppm) (m, area, JHH (Hz))
1    163.6    -
2    89.3    5.28 (d, 1H, 2.0)
3    173.4    -
4    101.5    5.71 (d, 1H, 2.0)
5    168.9    -
6    39.4    4.16 (d, 1H, 16.0); 4.22 (d, 1H, 16.0)
7    140.1    -
8    115.0    6.40 (d, 1H, 2.2)
9    165.2    -
10    104.3    6.32 (d, 1H, 2.3)
11    167.4    -
12    113.1    -
13    196.7    -
14    50.1    2.66 (d, 1H, 16.0); 2.89 (d, 1H, 16.0)
15    103.5    -
16    52.9    3.06 (s, 2H)
17    207.8    -
18    30.4    2.26 (s, 3H)

Figure A1: Construction of PKS_WJ expression plasmid (pWJ282).

The PKS4 KS_MAT didomain was first inserted into the pET-28a vector, followed by insertion of the Linker_ACP didomain in the same open reading frame through the NotI site.

[Plasmid construction scheme, with domain maps of S. erythraea DEBS 2 module 3 and G. fujikuroi PKS4: KS_MAT (2964 bp) amplified by PCR and ligated into pET-28a through NdeI/EcoRI to give pWJ249 (8282 bp); Linker_ACP (629 bp) assembled by splicing by overlap extension PCR and ligated through the NotI sites to give pWJ282 (8866 bp).]

Figure A2: SDS-PAGE analysis of the purified proteins.

(a) TcmN cyclase domain (21 kDa), (b) ZhuI (19 kDa), (c) MatB (56 kDa), (d) PKS4 KS_MAT (108 kDa) and PKS4 ACP (13 kDa), (e) MtmQ (36 kDa) and StfQ (36 kDa), (f) WhiE (19 kDa). All proteins have N-terminal hexahistidine tags. KS_MAT and ACP were co-expressed from the same plasmid under separate T7 promoters.

[Gel images, panels a-f, with molecular weight markers.]

Figure A3: UV Spectra and MS of 7.

The red line in the MS indicates saturation of the signal.

[UV spectrum (200-500 nm, mAU) and mass spectra under positive and negative ionization (m/z 100-600).]

4.3 Supporting information for chapter 2.2

Table A4. Primers used in this study.

Restriction sites are underlined.

Primer    Sequence
AptA-PT-f    5’-GCGGCCGCACATATGCGCACATCTCTGATCCACGA-3’
AptA-PT-i-r    5’-ACATGTCGGTCCAGATTGATGATGTTGCTACGCCACAAT-3’
AptA-PT-i-f    5’-ATTGTGGCGTAGCAACATCATCAATCTGGACCGACATGT-3’
AptA-PT-r    5’-GCGGCCGCGGGTTTCGACGGCGTTACTT-3’
AN0150-PT-f    5’-CATATGACGTCCACTGTTCAGCAGAT-3’
AN0150-PT-i-r    5’-CCAATGTCGCCGTGTATAGACGAAGTTACCACACCGCATC-3’
AN0150-PT-i-f    5’-GATGCGGTGTGGTAACTTCGTCTATACACGGCGACATTGG-3’
AN0150-PT-r    5’-GAATTCTCATGCGGCCGCGGCAACTGGTGCAGCTGC-3’
VrtA-PT-f    5’-CATATGCGAACATCTACTATCCACCAG-3’
VrtA-PT-r    5’-GAATTCTCATGCGGCCGCGTCATGTCTTACTGGCACAG-3’
AptA-PT-PKS4-ACP-r    5’-GGTGATGTTGTGCCGACGCTGGGTTTCGACGGCGTTACTT-3’
AptA-PT-PKS4-ACP-f    5’-AAGTAACGCCGTCGAAACCC AGCGTCGGCACAACATCACC-3’
PKS4-ACP-r    5’-GCGGCCGCTTAAGTCAACTAGGGGATGTTGAAGCAG-3’
An03g05440-PT-f    5’-GCGGCCGCATCTCCGTCAGTCCAGAGAATT-3’
An03g05440-PT-1i-r    5’-CCAATATCGGCGTAGAGAGACGATGGACATAACATTGCGC-3’
An03g05440-PT-1i-f    5’-GCGCAATGTTATGTCCATCGTCTCTCTACGCCGATATTGG-3’
An03g05440-PT-2i-r    5’-GCGAGGGACCCCTTGGAATCGGATCCCTTGAAACACTG-3’
An03g05440-PT-2i-f    5’-CAGTGTTTCAAGGGATCCGATTCCAAGGGGTCCCTCGC-3’
An03g05440-PT-PKS4-ACP-r    5’-GGTGATGTTGTGCCGACGCTAGCTTTCCGGCACACCGGTT-3’
An03g05440-PT-PKS4-ACP-f    5’-AACCGGTGTGCCGGAAAGCTAGCGTCGGCACAACATCACC-3’
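The internal primer pairs in Table A4 (labeled "-i-f" and "-i-r") read as exact reverse complements of each other, which is what one would expect for primers that join two fragments by overlap extension. That this is their purpose is an assumption drawn from the naming, not something stated here. The following minimal sketch checks complementarity for two pairs copied from the table; `reverse_complement` is a hypothetical helper written for this illustration.

```python
# Watson-Crick complement for unambiguous DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


# (forward internal primer, reverse internal primer), copied from Table A4.
pairs = {
    "AptA-PT-i": (
        "ATTGTGGCGTAGCAACATCATCAATCTGGACCGACATGT",
        "ACATGTCGGTCCAGATTGATGATGTTGCTACGCCACAAT",
    ),
    "AN0150-PT-i": (
        "GATGCGGTGTGGTAACTTCGTCTATACACGGCGACATTGG",
        "CCAATGTCGCCGTGTATAGACGAAGTTACCACACCGCATC",
    ),
}

# True means the two primers anneal to each other over their full length.
complementary = {
    name: reverse_complement(fwd) == rev for name, (fwd, rev) in pairs.items()
}
print(complementary)
```

A pair that failed this check would point to a transcription error in one of the two sequences.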

Table A5. List of NRPKSs related to known polyketide products.

Protein Name    Strain Name    Accession No.    Products
A. nidulans PKSST    A. nidulans FGSC26    AAA81586    Sterigmatocystin
W. dermatitidis PKS1    Wangiella dermatitidis 8656    AAD31436    THN
Nodulisporium sp. pks1    Nodulisporium sp. MF5954    AAD38786    THN
E. lecanii-corni PKS1    Exophiala lecanii-corni CBS102400    AAN75188    THN
C. resinifera PKS1    Ceratocystis resinifera    AAO60166    THN
D. septosporum PKSA    Dothistroma septosporum NZE7    AAZ95017    Aflatoxin
O. piceae PKSA    Ophiostoma piceae AU55-3    ABD47522    THN
E. fawcettii PKS1    Elsinoe fawcettii    ABU63483    Elsinochrome
A. terreus ACAS    Aspergillus terreus NIH2624    XP_001217072    Emodin
A. ochraceoroseus AflC    Aspergillus ochraceoroseus SRRC1432    ACH72912    Aflatoxin
A. nidulans AN0150    Aspergillus nidulans FGSC A4    XP_657754    Emodin
A. nidulans AN7909    Aspergillus nidulans FGSC A4    XP_681178    Orsellinic acid
A. nidulans AptA    Aspergillus nidulans FGSC A4    XP_663604    Asperthecin
C. lagenarium PKS1    Colletotrichum lagenarium 104-T    BAA18956    THN
B. oryzae PKS1    Bipolaris oryzae D9/F6-69    BAD22832    THN
A. oryzae PKSA    Aspergillus oryzae RIB40    BAE71314    Aflatoxin
S. macrospora PKS    Sordaria macrospora k-hell 48977    CAM35471    THN
C. nicotianae CTB1    Cercospora nicotianae ATCC18366    AAT69682    THN
A. fumigatus PKSP    Aspergillus fumigatus ATCC46645    EDP55264    THN
H. subiculosus Hpm3    Hypomyces subiculosus DSM11932    ACD39762    Hypothemycin
G. zeae PKS12    Gibberella zeae    AAU10633    Aurofusarin
G. zeae PKS13    Gibberella zeae PH-1    ABB90282    Zearalenone
G. lozoyensis PKS1    Glarea lozoyensis ATCC20868    AAN59953    THN
L. maculans PKS1    Leptosphaeria maculans    AAS92537    Sirodesmin PL
G. fujikuroi PKS4    Gibberella fujikuroi    CAB92399    Bikaverin
A. flavus PKSA    Aspergillus flavus ATCC96044    AAS90093    Aflatoxin
A. parasiticus PKSA    Aspergillus parasiticus    AAS66004    Aflatoxin
C. chiversii RADS2    Chaetomium chiversii CS-36-62    ACM42403    Radicicol
P. chlamydosporia RDC1    Pochonia chlamydosporia ATCC16683    ACD39770    Radicicol
A. nidulans WA    Aspergillus nidulans FGSC A4    CBF74114    Naphthopyrone

Table A6. List of the putative NRPKSs included in the phylogenetic analysis.

Protein Name    Strain Name    Accession No.
A. clavatus ACLA_043310    Aspergillus clavatus NRRL 1    XP_001275038
A. clavatus PKSP    Aspergillus clavatus NRRL 1    XP_001276035
A. flavus AFLA_060020    Aspergillus flavus NRRL3357    XP_002378746
A. flavus AFLA_082150    Aspergillus flavus NRRL3357    XP_002373130
A. flavus AFLA_108550    Aspergillus flavus NRRL3357    XP_002376725
A. flavus AFLA_116220    Aspergillus flavus NRRL3357    P_002384329
A. fumigatus AFUA_4G00210    Aspergillus fumigatus Af293    XP_746435
A. fumigatus AFUA_4G14560    Aspergillus fumigatus Af293    XP_751377
A. fumigatus AFUA_7G00160    Aspergillus fumigatus Af293    XP_746913
A. fumigatus AFUB_071800    Aspergillus fumigatus A1163    EDP50840
A. fumigatus AFUB_086700    Aspergillus fumigatus A1163    EDP47964
A. fumigatus AFUB_10073    Aspergillus fumigatus A1163    EDP47078
A. fumigatus PKSP    Aspergillus fumigatus Af293    XP_756095
A. nidulans AN7071    Aspergillus nidulans FGSC A4    XP_664675
A. niger An03g05440    Aspergillus niger CBS 513.88    XP_001390425
A. niger An04g09530    Aspergillus niger CBS 513.88    XP_001402309
A. niger An09g05730    Aspergillus niger CBS 513.88    XP_001393884
A. niger An11g07310    Aspergillus niger CBS 513.88    XP_001394705
A. nomius PKSA    Aspergillus nomius NRRL13137    AAS90047
A. oryzae AO090005000961    Aspergillus oryzae RIB40    XP_001817959
A. oryzae AO090010000048    Aspergillus oryzae RIB40    XP_001827098
A. oryzae AO090023000444    Aspergillus oryzae RIB40    XP_001820992
A. oryzae AO090026000009    Aspergillus oryzae RIB40    P_001821511
A. oryzae AO090102000545    Aspergillus oryzae RIB40    XP_001822700
A. oryzae AO090701000530    Aspergillus oryzae RIB40    XP_001823362
A. sojae PKSA    Aspergillus sojae ATCC 9362    AAU08792
A. terreus ATEG_00145    Aspergillus terreus NIH2624    XP_001210231
A. terreus ATEG_02434    Aspergillus terreus NIH2624    XP_001211612
A. terreus ATEG_07500    Aspergillus terreus NIH2624    XP_001216121
B. fuckeliana PKS12    Botryotinia fuckeliana B05.10    AAR90248
B. fuckeliana PKS13    Botryotinia fuckeliana B05.10    AAR90249
B. fuckeliana PKS14    Botryotinia fuckeliana B05.10    AAR90250
B. fuckeliana PKS15    Botryotinia fuckeliana B05.10    AAR90251
C. globosum CHGG_00542    Chaetomium globosum CBS 148.51    XP_001219763
C. globosum CHGG_08141    Chaetomium globosum CBS 148.51    XP_001225797
C. globosum CHGG_09364    Chaetomium globosum CBS 148.51    XP_001227291
C. heterostrophus PKS18    Cochliobolus heterostrophus C4    AAR90272
C. heterostrophus PKS19    Cochliobolus heterostrophus C4    AAR90273
C. heterostrophus PKS20    Cochliobolus heterostrophus C4    AAR90274
C. immitis CIMG_08569    Coccidioides immitis RS    P_001241406
C. posadasii CPC735_062240    Coccidioides posadasii C735 delta SOWgp    EER28351
D. aegiceri PKS    Dothiorella aegiceri    ABS85549
D. applanata PKS    Dirinaria applanata    ACH72076
E. lecanii-corni PKS2    Exophiala lecanii-corni CBS 102400    AAN74983
G. fujikuroi PKS3    Gibberella fujikuroi m567    CAC88775
G. graminicola PKS1    Glomerella graminicola    ACN32207
G. moniliformis PKS3    Gibberella moniliformis ATCC 38932    AAR92210
G. moniliformis PKS4    Gibberella moniliformis ATCC 38932    AAR92211
M. canis MCYG_03598    Microsporum canis CBS 113480    EEQ30779
M. canis MCYG_04442    Microsporum canis CBS 113480    EEQ31623
M. canis MCYG_05054    Microsporum canis CBS 113480    EEQ32235
M. canis MCYG_08535    Microsporum canis CBS 113481    EEQ35716
M. grisea MGG_00241    Magnaporthe grisea 70-15    XP_369003
M. grisea MGG_07219    Magnaporthe grisea 70-15    XP_367294
M. grisea MGG_08281    Magnaporthe grisea 70-15    XP_362767
M. purpureus PKS1    Monascus purpureus    CAC94008
N. crassa NCU03584    Neurospora crassa OR74A    XP_960586
N. fischeri NFIA_045430    Neosartorya fischeri NRRL 181    XP_001267621
N. fischeri NFIA_101660    Neosartorya fischeri NRRL 181    XP_001266579
N. fischeri NFIA_101810    Neosartorya fischeri NRRL 181    XP_001266594
N. fischeri NFIA_112240    Neosartorya fischeri NRRL 181    XP_001262597
N. fischeri PKSP    Neosartorya fischeri NRRL 181    P_001261235
N. haematococca NECHADRAFT_101778    Nectria haematococca mpVI 77-13-4    EEU34216
N. haematococca NECHADRAFT_33672    Nectria haematococca mpVI 77-13-4    EEU40203
N. haematococca PKSN    Nectria haematococca CBS225.58    AAS48892
P. anserina PODANSg7836    Podospora anserina DSM 980    XP_001910795
P. anserina PODANSg8570    Podospora anserina DSM 980    XP_001911528
P. anserina PODANSg8809    Podospora anserina DSM 980    XP_001911764
P. chrysogenum Pc21g16000    Penicillium chrysogenum Wisconsin 54-1255    CAP96497
P. marneffei PKSP    Penicillium marneffei ATCC 18224    XP_002147717
P. marneffei PMAA_001920    Penicillium marneffei ATCC 18224    XP_002152334
P. marneffei PMAA_031640    Penicillium marneffei ATCC 18224    XP_002144865
P. marneffei PMAA_061720    Penicillium marneffei ATCC 18224    XP_002151003
P. marneffei PMAA_063620    Penicillium marneffei ATCC 18224    XP_002145792
P. marneffei PMAA_066670    Penicillium marneffei ATCC 18224    XP_002146110
P. marneffei PMAA_09556    Penicillium marneffei ATCC 18224    XP_002149119
P. marneffei PMAA_100360    Penicillium marneffei ATCC 18224    XP_002149615
P. nodorum SNOG_08274    Phaeosphaeria nodorum SN15    XP_001798591
P. nodorum SNOG_08614    Phaeosphaeria nodorum SN15    XP_001798923
P. nodorum SNOG_11981    Phaeosphaeria nodorum SN15    XP_001802212
P. nodorum SNOG_15829    Phaeosphaeria nodorum SN15    XP_001805964
P. tritici-repentis PTRG_02708    Pyrenophora tritici-repentis Pt-1C-BFP    XP_001933041
P. tritici-repentis PTRG_03323    Pyrenophora tritici-repentis Pt-1C-BFP    XP_001933656
P. tritici-repentis PTRG_04385    Pyrenophora tritici-repentis Pt-1C-BFP    XP_001934718
S. sclerotiorum SS1G_05681    Sclerotinia sclerotiorum 1980 UF-70    XP_001592760
S. sclerotiorum SS1G_11789    Sclerotinia sclerotiorum 1980 UF-70    XP_001586760
S. sclerotiorum SS1G_13322    Sclerotinia sclerotiorum 1980 UF-70    XP_001585805
T. stipitatus PKSP    Talaromyces stipitatus ATCC 10500    XP_002481882
T. stipitatus TSTA_008140    Talaromyces stipitatus ATCC 10500    XP_002488697
T. stipitatus TSTA_008950    Talaromyces stipitatus ATCC 10500    XP_002483004
T. stipitatus TSTA_014510    Talaromyces stipitatus ATCC 10500    XP_002483594
T. stipitatus TSTA_083310    Talaromyces stipitatus ATCC 10500    XP_002478062
T. stipitatus TSTA_126190    Talaromyces stipitatus ATCC 10500    XP_002482902
T. stipitatus TSTA_126840    Talaromyces stipitatus ATCC 10500    XP_002482968
V. albo-atrum VDBG_00580    Verticillium albo-atrum VaMs.102    EEY14472
X. elegans PKS1    Xanthoria elegans gbM10    ABG91136
Xylaria sp. PKS12    Xylaria sp. BCC 1067    AAM93545
VrtA    Penicillium aethiopicum IBT5753    --
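A few accession numbers in Table A6 lack the leading letter found on the other entries of the same kind (for example "P_002384329"). A quick format check can flag such entries for verification before they are used in a database query. The sketch below is a rough illustration only: it assumes every valid entry is either a GenBank protein accession (three letters followed by five digits) or a RefSeq "XP_" accession with six or nine digits, which covers the intact entries in this table but is not a complete description of accession formats. The names `ACCESSION` and `entries` are hypothetical.

```python
import re

# Assumed formats: GenBank protein (e.g. EDP50840) or RefSeq XP_ (e.g. XP_746435).
ACCESSION = re.compile(r"(?:[A-Z]{3}\d{5}|XP_\d{6}|XP_\d{9})")

# A small sample of (protein name, accession) pairs copied from Table A6.
entries = [
    ("A. flavus AFLA_108550", "XP_002376725"),
    ("A. flavus AFLA_116220", "P_002384329"),
    ("A. fumigatus AFUA_4G00210", "XP_746435"),
    ("A. fumigatus AFUB_071800", "EDP50840"),
    ("A. oryzae AO090026000009", "P_001821511"),
    ("C. immitis CIMG_08569", "P_001241406"),
    ("N. fischeri PKSP", "P_001261235"),
    ("VrtA", "--"),
]

# Entries whose accession does not match either assumed format.
flagged = [name for name, acc in entries if not ACCESSION.fullmatch(acc)]
print(flagged)
```

The check only flags entries; it does not attempt to repair them, since the intended accession cannot be confirmed from the table alone.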

Figure A4: SDS-PAGE analysis of the purified AptA PT, AN0150 PT and VrtA PT domains.

[Gel images: marker lanes (M; 50, 40 and 30 kDa bands labeled) and lanes for AptA PT, AN0150 PT and VrtA PT.]

Figure A5: UV spectra and MS of 15, 51 and 60.

[For each of 60, 15 and 51: UV spectrum (200-500 nm, mAU) and mass spectra under positive and negative ionization (m/z 100-600). Most intense labeled ions: 60, m/z 343 (positive) and 341 (negative); 15, m/z 343 (positive) and 341 (negative); 51, m/z 299 (positive) and 297 (negative).]
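The most intense ions labeled in the spectra of Figure A5 are m/z 343 (positive) and 341 (negative) for 60 and 15, and m/z 299 (positive) and 297 (negative) for 51. If these are taken to be the [M+H]+ and [M-H]- ions, which is an assumption made here for illustration, the two ionization modes should point to the same nominal mass M. The sketch below performs that consistency check; `nominal_mass` is a hypothetical helper.

```python
def nominal_mass(pos_mz, neg_mz):
    """Return the nominal mass M if pos_mz = M + 1 and neg_mz = M - 1, else None."""
    mass_from_positive = pos_mz - 1  # [M+H]+ carries one extra proton
    mass_from_negative = neg_mz + 1  # [M-H]- has lost one proton
    if mass_from_positive == mass_from_negative:
        return mass_from_positive
    return None


# (positive ion, negative ion) read from the peak labels of Figure A5.
ions = {"60": (343, 341), "15": (343, 341), "51": (299, 297)}

masses = {compound: nominal_mass(pos, neg) for compound, (pos, neg) in ions.items()}
print(masses)
```

A result of `None` would indicate that at least one of the two ions is an adduct or fragment rather than the simple protonated or deprotonated molecule.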

Figure A6: UV Spectrum and MS of 68.

[UV spectrum (250-550 nm, mAU) and mass spectra of 68 under positive and negative ionization (m/z 200-500), without and with [2-13C]malonate enrichment. Most intense labeled ions without enrichment: m/z 339 (positive) and 337 (negative); with enrichment under negative ionization: m/z 346.]
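The negative-ion spectra of 68 in Figure A6 show the most intense labeled ion at m/z 337 without enrichment and at m/z 346 with [2-13C]malonate enrichment. Since [2-13C]malonate carries a single 13C, the mass shift gives the number of malonate-derived units incorporated, under the assumptions that each unit contributes exactly one 13C and that both ions are [M-H]- of otherwise identical molecules. The helper `label_count` below is hypothetical and only spells out this arithmetic.

```python
def label_count(unlabeled_mz, labeled_mz, labels_per_unit=1):
    """Return the number of labeled building blocks implied by the m/z shift."""
    shift = labeled_mz - unlabeled_mz
    if shift < 0 or shift % labels_per_unit != 0:
        raise ValueError("shift is not a whole number of labeled units")
    return shift // labels_per_unit


# Negative-ion values read from the peak labels of Figure A6.
units = label_count(unlabeled_mz=337, labeled_mz=346)
print(units)
```

Under these assumptions the shift corresponds to nine labeled carbons in 68.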

Figure A7: NMR Spectra of 68.

1H-NMR spectrum (DMSO-d6, 500 MHz). Labeled peaks (δ, ppm): 13.742, 12.109, 7.688, 7.157, 7.153, 6.814, 6.654, 6.649, 2.287.

13C-NMR spectrum (DMSO-d6, 500 MHz). Labeled peaks (δ, ppm): 188.663, 188.187, 181.355, 180.919, 165.730, 165.236, 165.080, 164.785, 164.562, 164.224, 160.200, 159.786, 158.426, 157.824, 145.903, 145.476, 137.340, 136.849, 135.085, 134.650, 115.305, 114.804, 113.930, 113.418, 110.625, 110.020, 109.563, 109.087, 109.043, 108.755, 108.546, 108.201, 104.388, 103.962, 19.610, 19.201.

1H-13C HMQC (DMSO-d6, 600 MHz).

1H-13C HMBC (DMSO-d6, 600 MHz).

Figure A8: LC-MS analysis of polyketides synthesized by AN0150-PT and PKS4 minimal PKS domains.

[LC traces (10-40 min) for PKS4 minimal PKS + AN0150 PT and PKS4 minimal PKS + AptA PT, with the peak of 68 labeled.]

Figure A9: Phylogenetic analysis of PTs from both known and uncharacterized NRPKSs.

An03g05440 and VrtA are marked with arrows.

[Circular phylogenetic tree of the PT domains listed in Tables A5 and A6, with groups I-V labeled.]

4.4 Supporting information for chapter 2.3

Table A7. Primers used in this study.

Primer    Sequence
adaR-f    5’-GAATTCATGGAACAACGCAGTTCCCC-3’
adaR-r    5’-CTCGAGTCAACTCCCCCAGTTCATATC-3’
adaA-f    5’-CATATGACTAGTTCTGGCCCTACGAAGCTGG-3’
adaA-r    5’-GTTTAAACTCATGCGGCCGCGCAATACTGCTCCAACCACC-3’
adaB-f    5’-CATATGGCCTTCCGTATCCCCTT-3’
adaB-i-r    5’-TCGATCCAGCTTGCTTCACC CTGGCCGGTGTCAATGAGAA-3’
adaB-i-f    5’-TTCTCATTGACACCGGCCAG GGTGAAGCAAGCTGGATCGA-3’
adaB-r    5’-GTTTAAACTCAAGCGGCCGCCAGCCTCACTGCAGTTGCAG-3’
adaC-f    5’-CATATGACCCCTCCCATCCTCATCATCGGC-3’
adaC-r    5’-GTTTAAACTCAAGCGGCCGCATGAGCCTCCACCCCCTTCA-3’
adaD-f    5’-CATATGTCCTCCGTCACCTTAACC-3’
adaD-r    5’-GTTTAAACTCAAGCGGCCGCAGCATCAAGAATCGGCCTCTT-3’
aptA-f    5’-CATATGAAAGACAATACGCATAG-3’
aptA-i-r    5’-CCTTCAGCTTCATAATGCCCTATGAACATGCCAAGTTCC-3’
aptA-i-f    5’-GGAACTTGGCATGTTCATAGGGCATTATGAAGCTGAAGG-3’
aptA-ii-r    5’-ATAGGGTCTGTTTGTTCAGCCTCTTTCGGCGACATGTTG-3’
aptA-ii-f    5’-CAACATGTCGCCGAAAGAGGCTGAACAAACAGACCCTAT-3’
aptA-iii-r    5’-ACATGTCGGTCCAGATTGATGATGTTGCTACGCCACAAT-3’
aptA-iii-f    5’-ATTGTGGCGTAGCAACATCATCAATCTGGACCGACATGT-3’
aptA-r    5’-GCGGCCGCCTAACAGTACTGCTCTAACC-3’
aptB-f    5’-CATATGGCCTTCAGAATACCATTTG-3’
aptB-i-r    5’-GGATCCACGAGGGCATTCC CTGACCAGTATCAACTAGGAT-3’
aptB-i-f    5’-ATCCTAGTTGATACTGGTCAG GGAATGCCCTCGTGGATCC-3’
aptB-r    5’-GTTTAAACTTATGCGGCCGCATTCGGCCGGCGCGACCT-3’
aptC-f    5’-CATATGACACTCCCAGTCCTCAT-3’
aptC-r    5’-GTTTAAACTTAAGCGGCCGCCGCACGCATCTTCTGCTTCT-3’
GPE-gpdA-f    5’-CATGCGGAGAGACGGACGGA-3’
bar-f    5’-AGTAACCATGAGCCCAGAACGAC-3’
bar-r    5’-AGAAACCCACGTCATGCCAGTTC-3’

Table A8. Plasmids used in this study.

Plasmid    Description
pYR311    AdaR downstream of the gpdA promoter in pBARGPE1
pYR291    Expression plasmid of AdaA in S. cerevisiae, marker
pYR342    Shuttle vector for the co-expression of AdaB-D in S. cerevisiae, tryptophan marker
pYR301    Expression plasmid of AdaB in S. cerevisiae, tryptophan marker
pYR315    Expression plasmid of AdaC in S. cerevisiae, tryptophan marker
pYR330    Expression plasmid of AdaB & AdaC in S. cerevisiae, tryptophan marker
pYR203    Expression plasmid of AptA in S. cerevisiae, uracil marker
pYR260    Expression plasmid of AptB & AptC in S. cerevisiae, tryptophan marker
pYR204    Expression plasmid of AptB in S. cerevisiae, tryptophan marker
pYR255    Expression plasmid of AptC in S. cerevisiae, tryptophan marker
pYR62    Expression plasmid of AptA in E. coli, kanamycin marker
pYR146    Expression plasmid of AptB in E. coli, kanamycin marker
pYR201    Expression plasmid of AptC in E. coli, kanamycin marker
pYR303    Expression plasmid of AdaA in E. coli, kanamycin marker
pYR316    Expression plasmid of AdaC in E. coli, kanamycin marker

Table A9. NMR spectra data of 70.

NMR spectra of 70 were obtained at 500 MHz for proton and 125 MHz for carbon, and recorded in CDCl3.

No.    70: 1H δ [ppm] (m, area, JHH (Hz))    70: 13C δ [ppm]    70: 1H-13C HMBC    69[1]: 13C δ [ppm] in CDCl3    79[1]: 13C δ [ppm] in CDCl3    70[1]: 13C δ [ppm] in DMSO-d6
1    -    190.6    -    190.5    192.1    194.1
2    -    72    -    80.2    82.1    80.7
3    -    196.4    -    195.2    196.6    176.4
4    -    106.3    -    105.1    106.8    108.4
5    -    167.5    -    166    166.7    194.4
6    -    108.1    -    105.5    108.2    111.4
7    9.47 (s, 1OH)    160.1    C6, C7, C8, C9, C10    158    159.3    163.4
8    6.49 (d, 1H, 1.95)    101.9    C6, C7, C9, C10    99.9    101.1    97.3
9    -    167.5    -    160.8    163.2    162.3
10    6.55 (d, 1H, 1.95)    100.7    C6, C8, C9, C12    122.7    99.5    97.6
11    -    141.4    -    147.2    141.4    140.6
12    6.89 (s, 1H)    118.6    C4, C6, C10, C11, C14, C18    137.1    117.5    111.7
13    -    133.3    -    123.9    135.4    137.4
14    2.97 (d, 1H, 15.80); 3.39 (d, 1H, 16.15)    38.8    C2, C13    71.7    37.9    38.9
15    -    79.7    -    71.5    72.3    71.1
16    2.87 (d, 1H, 19.10); 2.94 (d, 1H, 19.15)    40.7    C17    40.4    42.1    46.3
17    -    195    -    192.9    194.3    192.6
18    -    110.8    -    99.6    98    112.4
19    -    202.8    -    172.77    173.5    196.7
20    2.66 (s, 3H)    28.1    C18, C19    -    -    31.2
9-OCH3    3.88 (s, 3H)    55.8    C9    55.53    55.6    55.1
-OH    5.23 (s, 1H); 15.04 (s br, 1H)

[1] NMR data of 69 taken from ref (135), of 79 from ref (148), and of 70 in DMSO-d6 from ref (149).

Table A10. NMR spectra data of 73.

NMR spectra of 73 and 70 were obtained at 500 MHz for proton and 125 MHz for carbon, and recorded in CD3OD.

No.    73: 1H δ [ppm] (m, area, JHH (Hz)) in CD3OD    70: 1H δ [ppm] (m, area, JHH (Hz)) in CD3OD    70: 1H δ [ppm] (m, area, JHH (Hz)) in CDCl3
1    -    -    -
2    -    -    -
3    -    -    -
4    -    -    -
5    -    -    -
6    -    -    -
7    -    -    9.47 (s, 1OH)
8    6.37 (d, 1H, 2.15)    6.44 (d, 1H, 2.4)    6.49 (d, 1H, 1.95)
9    -    -    -
10    6.53 (d, 1H, 2.15)    6.71 (d, 1H, 2.4)    6.55 (d, 1H, 1.95)
11    -    -    -
12    6.90 (s, 1H)    7.03 (s, 1H)    6.89 (s, 1H)
13    -    -    -
14    3.37 - 3.60 (m); 2.78 - 3.18 (m)    3.37 - 3.60 (m); 2.78 - 3.18 (m)    2.97, 3.39 (dd, 2H, 15.75)
15    -    -    -
16    3.37 - 3.60 (m); 2.78 - 3.18 (m)    3.37 - 3.60 (m); 2.78 - 3.18 (m)    2.87, 2.94 (dd, 2H, 19.23)
17    -    -    -
18    -    -    -
19    -    -    -
20    2.58 (s, 3H)    2.58 (s, 3H)    2.66 (s, 3H)
9-OCH3    -    3.89 (s, 3H)    3.88 (s, 3H)

Table A11. NMR spectra data of 74.

NMR spectra of 74 were obtained at 500 MHz for proton and 125 MHz for carbon, and recorded in acetone-d6.

No.    1H δ [ppm] (m, area, JHH (Hz))    13C δ [ppm]    1H-13C HMBC
2    4.31 (d, 1H, 4.2); 5.31 (d, 1OH, 4.25)    78.84    C3, C15, C16; C15
3    -    202.66    -
4    -    107.72    -
5    15.66 (br s, 1OH)    166.71    -
6    -    107.76    -
7    9.63 (s, 1OH)    160.87    C6, C7, C8, C9
8    6.41 (d, 1H, 2.25)    102.36    C6, C7, C9, C10
9    9.31 (s, 1OH)    162.97    C8, C9
10    6.63 (d, 1H, 2.2)    103.27    C6, C8, C9, C12
11    -    142.6    -
12    6.88 (s, 1H)    117.91    C6, C10, C11, C14
13    -    136.15    -
14    3.00 (d, 1H, 15.75); 3.28 (dd, 1H, 1.05, 16.15)    39.16    C2, C4, C12, C13, C15, C16
15    4.75 (1OH, s)    75.7    C2, C14
16    2.53 (d, 1H, 15.55); 3.00 (d, 1H, 15.70)    46.52    C2, C14, C15, C17
17    -    209.9    -
18    2.19 (s, 3H)    32.53    C16, C17

Table A12. NMR spectra data of 75.

NMR spectra of 75 were obtained at 500 MHz for proton and 125 MHz for carbon, and recorded in acetone-d6.

No.    1H δ [ppm] (m, area, JHH (Hz))    13C δ [ppm]    1H-13C HMBC
2    2.83 (dd, 1H, 17.35, 1.85); 3.06 (d, 1H, 17.45)    49.44    C3, C4, C14, C15, C16
3    -    203.19    -
4    -    109.06    -
5    -    167.08    -
6    -    107.79    -
7    9.81 (s, 1OH)    161.11    C6, C7, C8, C9
8    6.39 (d, 1H, 2)    102.05    C6, C7, C9, C10
9    -    162.74    -
10    6.59 (d, 1H, 2)    103.10    C6, C8, C9, C12
11    -    142.46    -
12    6.83 (s, 1H)    117.73    C4, C6, C10, C11, C14
13    -    137.16    -
14    3.04 (d, 1H, 15.85); 3.19 (d, 1H, 15.85)    41.76    C2, C4, C12, C13, C15
15    4.37 (s, 1OH)    72.30    -
16    2.90 (d, 2H, 2.05)    53.02    C2, C14, C15, C17
17    -    209.18    -
18    2.21 (s, 3H)    32.18    C16, C17
-OH    9.20 (br s, 1OH)

Table A13. NMR spectra data of 76.

NMR spectra of 76 were obtained at 500 MHz for proton and 125 MHz for carbon, and recorded in DMSO-d6.

761 762 1 1 No. H δ [ppm] 13C δ H δ [ppm] 13C δ 1H-13C HMBC 1H-13C HMBC (m, area, JHH (Hz)) [ppm] (m, area, JHH (Hz)) [ppm] 2.72 (dd, 1H, 1.60, 2.70 (dd, 1H, 1.70, 2 17.20), 3.03 (d, 48.87 C3, C14, C15 17.10), 3.03 (d, 48.60 C3, C14, C15 1H,17.05) 1H,17.45) 3 - 203.30 - - 202.07 - 4 - 107.85 - - 107.81 - 5 - 165.19 - - 165.19 - 6 - 106.34 - - 106.34 - 7 9.80 (s, 1OH) 159.22 C7, C8 9.80 (s, 1OH) 159.22 C7, C8 8 6.32 (d, 1H, 1.75) 101.22 C6, C7, C9, C10 6.32 (d, 1H, 1.75) 101.22 C6, C7, C9, C10 9 - 161.46 - - 161.46 - 10 6.50 (d, 1H, 1.65) 101.90 C6, C8, C9, C12 6.50 (d, 1H, 1.65) 101.90 C6, C8, C9, C12 11 - 141.03 - - 141.03 - C4, C10, C11, C4, C10, C11, 12 6.82 (s, 1H) 116.34 6.82 (s, 1H) 116.34 C14 C14 13 - 136.25 - - 136.25 - 2.95 (d, 1H, 16.25) C4, C12, C13, 2.95 (d, 1H, 16.25) C4, C12, C13, 14 40.79 40.60 3.09 (d, 1H, 15.90) C15, C16 3.10 (d, 1H, 15.85) C15, C16 15 5.18 (br s, 1OH) 71.29 - - 70.88 - C14, C15, C16, 16 2.55 (s, 2H) 49.30 3.77 (s, 2H) 59.12 C17 C17, C18 17 - 189.69 - 204.47 - 18 5.75 (s, 1H) 102.63 C1, C17, C19 2.75 (d, 2H, 2.9) 53.16 C15, C17, C19 19 - 192.41 - - 203.30 - 20 2.06 (s, 3H) 24.95 C18, C19 2.14 (s, 3H) 30.76 C19 -OH 10.27 (s, 1OH), 15.86 (s, 1OH) 10.27 (s, 1OH), 15.68 (br s, 1OH)

145

Figure A10: PCR targeting of adaR and the resistance gene in the transformants.

[Gel images for transformants T1-T5: PCR product of Pgpda-adaR amplified with GPE-gpdA-f and adaR-r, and PCR product of the resistance gene amplified with bar-f and bar-r.]

Figure A11. UV spectrum and MS of 70.

[UV spectrum (200-550 nm, mAU) and mass spectra under positive and negative ionization (m/z 100-600). Most intense labeled ions: m/z 461 (positive) and 413 (negative).]

Figure A12. NMR spectra of 70 in CDCl3.

NMR spectra of 70 were obtained at 500 MHz for proton and 125 MHz for carbon, and recorded in CDCl3.

[Spectra shown: 13C, 1H, 1H-13C HMQC and 1H-13C HMBC.]

Figure A13. UV spectrum and MS of 73.

[UV spectrum (200-550 nm, mAU) and mass spectra under positive and negative ionization (m/z 100-600). Most intense labeled ions: m/z 401 (positive) and 399 (negative).]

Figure A14. NMR spectra of 73 and 70 in CD3OD.

NMR spectra of 73 and 70 were obtained at 500 MHz for proton and 125 MHz for carbon, and recorded in CD3OD.

[Spectra shown: 1H spectrum of 70, 1H spectrum of 73, 1H-13C HMQC of 70 and 1H-13C HMQC of 73, all in CD3OD.]

Figure A15. UV spectrum and MS of 74.

[UV spectrum (200-550 nm, mAU) and mass spectra under positive and negative ionization (m/z 100-600). Most intense labeled ions: m/z 297 (positive) and 331 (negative).]

Figure A16. NMR spectra of 74 in acetone-d6.

NMR spectra of 74 were obtained at 500 MHz for proton and 125 MHz for carbon, and recorded in acetone-d6.

[Spectra shown: 13C, 1H, 1H-13C HMQC and 1H-13C HMBC.]

Figure A17. UV spectrum and MS of 75.

[UV spectrum (200-550 nm, mAU) and mass spectra under positive and negative ionization (m/z 100-600). Most intense labeled ions: m/z 317 (positive) and 315 (negative).]

Figure A18. NMR spectra of 75 in acetone-d6.

NMR spectra of 75 were obtained at 500 MHz for proton and 125 MHz for carbon, and recorded in acetone-d6.

[Spectra shown: 13C, 1H, 1H-13C HMQC and 1H-13C HMBC.]

Figure A19. UV spectrum and MS of 76.

[UV spectrum (200-550 nm, mAU) and mass spectra under positive and negative ionization (m/z 100-900). Most intense labeled ions: m/z 359 (positive) and 339 (negative).]

Figure A20. NMR spectra of 76 in DMSO-d6.

NMR spectra of 76 were obtained at 500 MHz for proton and 125 MHz for carbon, and recorded in DMSO-d6.

13C spectrum

1H spectrum

1H-13C HMQC

1H-13C HMBC

Figure A21. LC analysis of AptC and AdaC as FAD-binding enzymes at λ = 280 nm.

[Figure: LC traces (mAU vs. retention time, 5-25 min) for denatured AptC, denatured AdaC and an FAD standard (FAD STD), with the FAD peak marked; the accompanying UV spectrum spans 200-500 nm.]

Figure A22. LC analysis of polyketide products synthesized by AdaA and AptA at λ = 475 nm.

In the absence of AptC and AptB, AptA synthesized pyranoanthraquinone 68 in vitro. In the absence of AdaC and AdaB, AdaA synthesized pyranoanthraquinones 68 (trace amount) and 80 in vitro. The structure of 80 is proposed by comparing its mass, retention time and UV absorption with those of 68.

[Figure: chemical structures of 80 and 68; LC traces for AdaA and AptA (x-axis: retention time, 10-40 min) with peaks labeled 68 and 80; negative-ionization mass spectrum (intensity vs. m/z, 100-550) and UV spectrum (mAU vs. wavelength, 200-500 nm) of 68; negative-ionization mass spectrum and UV spectrum of 80.]

Figure A23. LC analysis of the AptC assay with 75 at λ = 400 nm.

[Figure: three LC traces (x-axis: retention time, 10-40 min). Trace labels: "75"; "75 + [label missing] + NADPH"; "75 + [label missing]".]

Figure A24. Sequence alignment among ACTE, AptB, AdaB, VrtG and subclass B3 MβLs.

* Residues that are essential for binuclear metal binding.

D172 and S191 are marked by arrows.

FEZ-1 (193) and THIN-B (110) are subclass B3 MβLs (159, 160).

Figure A25. Far-UV circular dichroism (CD) spectra of AptB and mutants.

[Figure: three far-UV CD panels, θ (10^3 deg cm^2/dmol) vs. wavelength (190-260 nm). First panel: Mn2+-AptB and apo-AptB. Remaining two panels: apo-AptB and Mn2+-AptB together with the mutants H97A, H99A, D101A, H102A, H153A, D172A and H207A.]

All the mutants were expressed and purified from Mn2+-enriched medium.

The far-UV CD spectra of Mn2+-AptB and apo-AptB (EDTA-treated AptB*, as in Table 4) indicate that the loss of Mn2+ ions causes some changes in the secondary structure of AptB. The far-UV CD spectra of the mutants are consistent with the ICP-AES analysis: the H97A, H99A, D101A, H153A and D172A mutants show far-UV CD spectra similar to that of apo-AptB, whereas the H102A and H207A mutants show spectra more similar to that of Mn2+-AptB than to that of apo-AptB.

Figure A26. Phylogenetic analysis of MβL-TEs.

[Figure: phylogenetic tree with bootstrap values at the nodes and a scale bar of 0.2. Two clades are labeled Group V1 and Group V2. Taxa shown: AFUB_071810, AFUA_4G14570, NFIA_101800, ACTE, AN0149, NFIA_101650, PMAA_031630, PTRG_02709, TSTA_126180, NFIA_045420, AFUA_4G00220, VrtG, AptB, AdaB, MCYG_03601, NFIA_112270, AFUA_7G00120, AFUB_086670, AN7070, AFLA_060000, MCYG_04444 and FEZ-1.]

The MβL-TE sequences were aligned with CLUSTALX 2.0, and phylogenetic analyses were conducted using MEGA version 4.0.2 by means of the bootstrap minimum evolution method.

FEZ-1, a subclass B3 MβL from Fluoribacter gormanii that confers penicillin resistance, was placed at the root of the tree. AN7070, AFLA_060000 and MCYG_0444 are uncharacterized MβL-TEs that are basal to groups V1 and V2. The gene prefixes correspond to the names of the genome-sequenced fungi: AFUA/AFUB, Aspergillus fumigatus; AN, Aspergillus nidulans; NFIA, Neosartorya fischeri; PMAA, Penicillium marneffei; PTRG, Pyrenophora tritici-repentis; TSTA, Talaromyces stipitatus; MCYG, Arthroderma otae.

4.5 Supporting information for chapter 2.4

Table A14. Primers used in this study.

Primer          Sequence
AdaA-f          5’- CATATGTCTGGCCCTACGAAGCTGG -3’
AdaA-MAT-r      5’- GCGGCCGCGACGCCGCAGTTGTTCATGG -3’
AdaA-ACP-f      5’- CATATGTCCTCTCCCTCCTCCAG -3’
AdaA-r          5’- GCGGCCGCTCAGCAATACTGCTCCAACCACC -3’
linker-f        5’- TTCTCATTGACACCGGCCAG GGTGAAGCAAGCTGGATCGA -3’
linker-ACP-r    5’- CCGCTGGAGGAGGGAGAGGACCGTTGTGCCGGCTCGCCG -3’
linker-ACP-f    5’- CGGCGAGCCGGCACAACGGTCCTCTCCCTCCTCCAGCGG -3’
TcmN-f          5’- GCGGCCGCAGCAGCGCGCACGGACAAC -3’
TcmN-ACP-r      5’- GTGATGTTGTGCCGACGCTCAACAGCAGTTGCTGGGCG -3’
TcmN-ACP-f      5’- CGCCCAGCAACTGCTGTTGAGCGTCGGCACAACATCAC -3’
PKSH_tcm1-r     5’- CAATTGGGATCCGCGACCCATTCACTTAAGGCAATACTGCTCCAACCACC -3’
TcmI-f          5’- CTTAAGGCCTACCGGGCACTGATGG -3’
TcmI-r          5’- GAATTCAAGCTTTCAGGAGTCGGGGGCGGT -3’
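
The linker-ACP and TcmN-ACP primers in Table A14 come in -f/-r pairs. If they serve as junction primers for fusing two fragments, each -f primer should be the exact reverse complement of its -r partner. That intended use is an assumption here, since the table does not state it. The following Python sketch checks the property on the sequences copied from the table; the function names are illustrative only.

```python
# Junction primers copied from Table A14 (written 5'->3').
# Pairing them by name stem is an assumption made for this check.
PRIMERS = {
    "linker-ACP-r": "CCGCTGGAGGAGGGAGAGGACCGTTGTGCCGGCTCGCCG",
    "linker-ACP-f": "CGGCGAGCCGGCACAACGGTCCTCTCCCTCCTCCAGCGG",
    "TcmN-ACP-r": "GTGATGTTGTGCCGACGCTCAACAGCAGTTGCTGGGCG",
    "TcmN-ACP-f": "CGCCCAGCAACTGCTGTTGAGCGTCGGCACAACATCAC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_complementary_pair(forward: str, reverse: str) -> bool:
    """True if the two primers anneal to each other over their full length."""
    return reverse_complement(reverse) == forward


for stem in ("linker-ACP", "TcmN-ACP"):
    ok = is_complementary_pair(PRIMERS[f"{stem}-f"], PRIMERS[f"{stem}-r"])
    print(f"{stem}: -f and -r are reverse complements -> {ok}")
```

Both pairs pass this check, which makes it a quick way to catch transcription errors when retyping the table.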

Figure A27. SDS-PAGE analysis of AdaA minimal PKS domains.

[Figure: SDS-PAGE gel images. Lane labels: SAT-KS-MAT, ACP, PKSH_ada and M (presumably the molecular-weight marker). Marker bands are labeled at 220, 160 and 120 kDa and at 20, 15 and 10 kDa.]

Figure A28. LC-MS ion monitoring of the AdaA minimal PKS assay.

[Figure: two ion-monitoring panels (intensity ×100,000 vs. retention time, 5.0-35.0 min). First panel: AdaA SAT-KS-MAT + ACP + MatB + malonic acid (12C), monitored at m/z 385, 367 and 343. Second panel: AdaA SAT-KS-MAT + ACP + MatB + malonic acid (2-13C), monitored at m/z 395, 377 and 352.]
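
The ion channels in Figure A28 can be read as a label count. A minimal Python sketch follows, assuming that each malonate-derived C2 unit retains exactly one 13C atom from [2-13C]malonic acid, that the ions are singly charged, and that the 12C and 13C channels pair up in the order listed (385/395, 367/377, 343/352). These pairings and the function name are assumptions made for illustration.

```python
# Ion channels listed in Figure A28; pairing them in listed order is an assumption.
UNLABELED_MZ = (385, 367, 343)  # assay with malonic acid (12C)
LABELED_MZ = (395, 377, 352)    # assay with malonic acid (2-13C)


def labeled_carbons(mz_unlabeled: int, mz_labeled: int) -> int:
    """Nominal mass shift, read as the number of 13C atoms (+1 Da each, z = 1)."""
    return mz_labeled - mz_unlabeled


# Map each unlabeled channel to its nominal mass shift.
shifts = {u: labeled_carbons(u, l) for u, l in zip(UNLABELED_MZ, LABELED_MZ)}

for mz, n in shifts.items():
    print(f"m/z {mz}: +{n} Da, consistent with {n} malonate-derived C2 units")
```

Under these assumptions the shifts are 10, 10 and 9 Da, so the first two ions would carry ten malonate-derived units and the third nine.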

Figure A29. UV spectrum and MS of 2.

[Figure panels: UV spectrum (mAU vs. wavelength, 200-500 nm), with labels including 217 and 288 nm; positive-ionization mass spectrum (intensity vs. m/z, 100-550), with labels including m/z 385; negative-ionization mass spectrum (intensity vs. m/z, 100-550), with labels including m/z 383.]

Figure A30. UV spectrum and MS of 16.

[Figure panels: UV spectrum (mAU vs. wavelength, 200-500 nm), with labels including 266 nm; positive-ionization mass spectrum (intensity vs. m/z, 100-550), with labels including m/z 367 and 385; negative-ionization mass spectrum (intensity vs. m/z, 100-550), with labels including m/z 383.]

Figure A31. UV spectrum and MS of 34.

[Figure panels: UV spectrum (mAU vs. wavelength, 200-500 nm), with labels including 227 and 267 nm; positive-ionization mass spectrum (intensity vs. m/z, 100-550), with labels including m/z 339; negative-ionization mass spectrum (intensity vs. m/z, 100-550), with labels including m/z 337.]

Figure A32. LC-MS analysis of the PKS4_TcmN hybrid PKS.

[Figure: LC-MS trace (x-axis: retention time, 10-30 min); the label "15" appears on the trace.]

Figure A33. UV spectrum and MS of 81.

[Figure panels: UV spectrum (mAU vs. wavelength, 200-500 nm), with labels including 218 nm; negative-ionization mass spectrum (intensity vs. m/z, 100-700), with labels including m/z 335.]

5 References

1. Hertweck C, Luzhetskyy A, Rebets Y, & Bechthold A (2007) Type II polyketide synthases: gaining a deeper insight into enzymatic teamwork. Nat. Prod. Rep. 24(1):162-190.
2. Brase S, Encinas A, Keck J, & Nising CF (2009) Chemistry and biology of mycotoxins and related fungal metabolites. Chem. Rev. 109(9):3903-3990.
3. Rix U, Fischer C, Remsing LL, & Rohr J (2002) Modification of post-PKS tailoring steps through combinatorial biosynthesis. Nat. Prod. Rep. 19(5):542-580.
4. Thomas R (2001) A biosynthetic classification of fungal and streptomycete fused-ring aromatic polyketides. ChemBioChem 2(9):612-627.
5. Das A & Khosla C (2009) Biosynthesis of aromatic polyketides in bacteria. Acc. Chem. Res. 42(5):631-639.
6. Malpartida F & Hopwood DA (1984) Molecular cloning of the whole biosynthetic pathway of a Streptomyces antibiotic and its expression in a heterologous host. Nature 309(5967):462-464.
7. Hopwood DA (1997) Genetic contributions to understanding polyketide synthases. Chem. Rev. 97(7):2465-2498.
8. Summers RG, Ali A, Shen B, Wessel WA, & Hutchinson CR (1995) Malonyl-coenzyme A:acyl carrier protein acyltransferase of Streptomyces glaucescens: a possible link between fatty acid and polyketide biosynthesis. Biochemistry 34(29):9389-9402.
9. Tang Y, Tsai SC, & Khosla C (2003) Polyketide chain length control by chain length factor. J. Am. Chem. Soc. 125(42):12708-12709.
10. Shen B & Hutchinson CR (1993) Enzymatic synthesis of a bacterial polyketide from acetyl and malonyl coenzyme A. Science 262(5139):1535-1540.
11. McDaniel R, Ebert-Khosla S, Hopwood DA, & Khosla C (1993) Engineered biosynthesis of novel polyketides. Science 262(5139):1546-1550.
12. Hoffmeister D & Keller NP (2007) Natural products of filamentous fungi: enzymes, genes, and their regulation. Nat. Prod. Rep. 24(2):393-416.
13. Khosla C, Tang Y, Chen AY, Schnarr NA, & Cane DE (2007) Structure and mechanism of the 6-deoxyerythronolide B synthase. Annu. Rev. Biochem. 76:195-221.
14. Cox RJ (2007) Polyketides, proteins and genes in fungi: programmed nano-machines begin to reveal their secrets. Org. Biomol. Chem. 5(13):2010-2026.
15. Crawford JM, Dancy BC, Hill EA, Udwary DW, & Townsend CA (2006) Identification of a starter unit acyl-carrier protein transacylase domain in an iterative type I polyketide synthase. Proc. Natl. Acad. Sci. U. S. A. 103(45):16728-16733.
16. Crawford JM, et al. (2008) Deconstruction of iterative multidomain polyketide synthase function. Science 320(5873):243-246.
17. Zhang W, Li Y, & Tang Y (2008) Engineered biosynthesis of bacterial aromatic polyketides in Escherichia coli. Proc. Natl. Acad. Sci. U. S. A. 105(52):20683-20688.
18. Fujii I, Watanabe A, Sankawa U, & Ebizuka Y (2001) Identification of Claisen cyclase domain in fungal polyketide synthase WA, a naphthopyrone synthase of Aspergillus nidulans. Chem. Biol. 8(2):189-197.
19. Rawlings BJ (1999) Biosynthesis of polyketides (other than actinomycete macrolides). Nat. Prod. Rep. 16(4):425-484.
20. Shen YM, et al. (1999) Ectopic expression of the minimal whiE polyketide synthase generates a library of aromatic polyketides of diverse sizes and shapes. Proc. Natl. Acad. Sci. U. S. A. 96(7):3622-3627.
21. Fu H, Ebert-Khosla S, Hopwood DA, & Khosla C (1994) Engineered biosynthesis of novel polyketides: Dissection of the catalytic specificity of the act ketoreductase. J. Am. Chem. Soc. 116(10):4166-4170.
22. Yu TW, et al. (1998) Engineered biosynthesis of novel polyketides from Streptomyces spore pigment polyketide synthases. J. Am. Chem. Soc. 120(31):7749-7759.
23. Fu H, Hopwood DA, & Khosla C (1994) Engineered biosynthesis of novel polyketides: evidence for temporal, but not regiospecific, control of cyclization of an aromatic polyketide precursor. Chem. Biol. 1(4):205-210.
24. McDaniel R, Ebert-Khosla S, Fu H, Hopwood DA, & Khosla C (1994) Engineered biosynthesis of novel polyketides: influence of a downstream enzyme on the catalytic specificity of a minimal aromatic polyketide synthase. Proc. Natl. Acad. Sci. U. S. A. 91(24):11542-11546.
25. Keatinge-Clay AT, Maltby DA, Medzihradszky KF, Khosla C, & Stroud RM (2004) An antibiotic factory caught in action. Nat. Struct. Mol. Biol. 11(9):888-893.
26. Marti T, Hu Z, Pohl NL, Shah AN, & Khosla C (2000) Cloning, sequence, and heterologous expression of the biosynthetic gene cluster for R1128, a non-steroidal estrogen receptor antagonist. Insights into an unusual priming mechanism. J. Biol. Chem. 275(43):33443-33448.
27. Gullón S, et al. (2006) Isolation, characterization, and heterologous expression of the biosynthesis gene cluster for the antitumor anthracycline steffimycin. Appl. Environ. Microbiol. 72(6):4172-4183.
28. Lombó F, Blanco G, Fernández E, Méndez C, & Salas JA (1996) Characterization of Streptomyces argillaceus genes encoding a polyketide synthase involved in the biosynthesis of the antitumor mithramycin. Gene 172(1):87-91.
29. Ames BD, et al. (2008) Crystal structure and functional analysis of tetracenomycin ARO/CYC: Implications for cyclization specificity of aromatic polyketides. Proc. Natl. Acad. Sci. U. S. A. 105(14):5349-5354.
30. Tsai SC, Ames BD, & David AH (2009) Chapter 2 Structural Enzymology of Polyketide Synthases. Methods Enzymol., (Academic Press), Vol 459, pp 17-47.
31. Korman TP, Hill JA, Vu TN, & Tsai SC (2004) Structural analysis of actinorhodin polyketide ketoreductase: Cofactor binding and substrate specificity. Biochemistry 43(46):14529-14538.
32. Fu H, McDaniel R, Hopwood DA, & Khosla C (1994) Engineered biosynthesis of novel polyketides: Stereochemical course of two reactions catalyzed by a polyketide synthase. Biochemistry 33(31):9321-9326.
33. Lee TS, Khosla C, & Tang Y (2005) Engineered biosynthesis of aklanonic acid analogues. J. Am. Chem. Soc. 127(35):12254-12262.
34. McDaniel R, Ebert-Khosla S, Hopwood DA, & Khosla C (1994) Engineered biosynthesis of novel polyketides: actVII and actIV genes encode aromatase and cyclase enzymes, respectively. J. Am. Chem. Soc. 116(24):10855-10859.
35. McDaniel R, Ebert-Khosla S, Hopwood DA, & Khosla C (1995) Rational design of aromatic polyketide natural products by recombinant assembly of enzymatic subunits. Nature 375(6532):549-554.
36. Korman TP, Tan YH, Wong J, Luo R, & Tsai SC (2008) Inhibition kinetics and emodin cocrystal structure of a type II polyketide ketoreductase. Biochemistry 47(7):1837-1847.
37. Ma SM, et al. (2008) Redirecting the cyclization steps of fungal polyketide synthase. J. Am. Chem. Soc. 130(1):38-39.
38. McDaniel R, Hutchinson CR, & Khosla C (1995) Engineered biosynthesis of novel polyketides: Analysis of tcmN function in tetracenomycin biosynthesis. J. Am. Chem. Soc. 117(26):6805-6810.
39. Kramer PJ, et al. (1997) Rational design and engineered biosynthesis of a novel 18-carbon aromatic polyketide. J. Am. Chem. Soc. 119(4):635-639.
40. Zhang W, Watanabe K, Wang CCC, & Tang Y (2006) Heterologous biosynthesis of amidated polyketides with novel cyclization regioselectivity from oxytetracycline polyketide synthase. J. Nat. Prod. 69(11):1633-1636.
41. Zhang W, Ames BD, Tsai SC, & Tang Y (2006) Engineered biosynthesis of a novel amidated polyketide, using the malonamyl-specific initiation module from the oxytetracycline polyketide synthase. Appl. Environ. Microbiol. 72(4):2573-2580.
42. Zhang W, et al. (2007) A new mechanism for benzopyrone formation in aromatic polyketide biosynthesis. J. Am. Chem. Soc. 129(30):9304-9305.
43. Zhang W, Watanabe K, Wang CCC, & Tang Y (2007) Investigation of early tailoring reactions in the oxytetracycline biosynthetic pathway. J. Biol. Chem. 282(35):25717-25725.
44. Zhang W, et al. (2008) Identifying the minimal enzymes required for anhydrotetracycline biosynthesis. J. Am. Chem. Soc. 130(19):6068-6069.
45. Petkovic H, et al. (1999) Disruption of an aromatase/cyclase from the oxytetracycline gene cluster of Streptomyces rimosus results in production of novel polyketides with shorter chain lengths. J. Biol. Chem. 274(46):32829-32834.
46. Pickens LB, et al. (2009) Biochemical Analysis of the Biosynthetic Pathway of an Anticancer Tetracycline SF2575. J. Am. Chem. Soc.
47. Prado L, et al. (1999) Oxidative cleavage of premithramycin B is one of the last steps in the biosynthesis of the antitumor drug mithramycin. Chem. Biol. 6(1):19-30.
48. Trefzer A, et al. (2002) Rationally designed glycosylated premithramycins: hybrid aromatic polyketides using genes from three different biosynthetic pathways. J. Am. Chem. Soc. 124(21):6056-6062.
49. Künzel E, et al. (1997) Tetracenomycin M, a novel genetically engineered tetracenomycin resulting from a combination of mithramycin and tetracenomycin biosynthetic genes. Chem. Eur. J. 3(10):1675-1678.
50. Gibson M, Nur-e-alam M, Lipata F, Oliveira MA, & Rohr J (2005) Characterization of kinetics and products of the Baeyer-Villiger oxygenase MtmOIV, the key enzyme of the biosynthetic pathway toward the natural product anticancer drug mithramycin from Streptomyces argillaceus. J. Am. Chem. Soc. 127(50):17594-17595.
51. Beam MP, Bosserman MA, Noinaj N, Wehenkel M, & Rohr J (2009) Crystal structure of Baeyer-Villiger monooxygenase MtmOIV, the key enzyme of the mithramycin biosynthetic pathway. Biochemistry 48(21):4476-4487.
52. Menéndez N, et al. (2004) Biosynthesis of the antitumor chromomycin A3 in Streptomyces griseus: Analysis of the gene cluster and rational design of novel chromomycin analogs. Chem. Biol. 11(1):21-32.
53. Menéndez N, et al. (2006) Deoxysugar transfer during chromomycin A3 biosynthesis in Streptomyces griseus subsp. griseus: New derivatives with antitumor activity. Appl. Environ. Microbiol. 72(1):167-177.
54. Daum M, et al. (2009) Organisation of the biosynthetic gene cluster and tailoring enzymes in the biosynthesis of the tetracyclic quinone glycoside antibiotic polyketomycin. ChemBioChem 10(6):1073-1083.
55. Hutchinson CR (1997) Biosynthetic studies of daunorubicin and tetracenomycin C. Chem. Rev. 97(7):2525-2536.
56. Shen B, Summers RG, Wendt-Pienkowski E, & Hutchinson CR (1995) The Streptomyces glaucescens tcmKL polyketide synthase and tcmN polyketide cyclase genes govern the size and shape of aromatic polyketides. J. Am. Chem. Soc. 117(26):6811-6821.
57. Thompson TB, Katayama K, Watanabe K, Hutchinson CR, & Rayment I (2004) Structural and functional analysis of tetracenomycin F2 cyclase from Streptomyces glaucescens: A type II polyketide cyclase. J. Biol. Chem. 279(36):37956-37963.
58. Rafanan ER, Hutchinson CR, & Shen B (2000) Triple hydroxylation of tetracenomycin A2 to tetracenomycin C involving two molecules of O2 and one molecule of H2O. Org. Lett. 2(20):3225-3227.
59. Rafanan ER, Jr., Le L, Zhao L, Decker H, & Shen B (2001) Cloning, sequencing, and heterologous expression of the elmGHIJ genes involved in the biosynthesis of the polyketide antibiotic elloramycin from Streptomyces olivaceus Tu2353. J. Nat. Prod. 64(4):444-449.
60. Ramos A, et al. (2008) Biosynthesis of elloramycin in Streptomyces olivaceus requires glycosylation by enzymes encoded outside the aglycon cluster. Microbiology 154(3):781-788.
61. Kulowski K, et al. (1999) Functional characterization of the jadI gene as a cyclase forming angucyclinones. J. Am. Chem. Soc. 121(9):1786-1794.
62. Han L, Yang K, Ramalingam E, Mosher RH, & Vining LC (1994) Cloning and characterization of polyketide synthase genes for jadomycin B biosynthesis in Streptomyces venezuelae ISP5230. Microbiology 140(Pt 12):3379-3389.
63. Wohlert SE, Wendt-Pienkowski E, Bao W, & Hutchinson CR (2001) Production of aromatic minimal polyketides by the daunorubicin polyketide synthase genes reveals the incompatibility of the heterologous DpsY and JadI cyclases. J. Nat. Prod. 64(8):1077-1080.
64. Metsä-Ketelä M, Palmu K, Kunnari T, Ylihonko K, & Mäntsälä P (2003) Engineering Anthracycline Biosynthesis toward Angucyclines. Antimicrob. Agents Chemother. 47(4):1291-1296.
65. Rix U, Remsing LL, Hoffmeister D, Bechthold A, & Rohr J (2003) Urdamycin L: A novel metabolic shunt product that provides evidence for the role of the urdM gene in the urdamycin A biosynthetic pathway of Streptomyces fradiae TÜ 2717. ChemBioChem 4(1):109-111.
66. Zhu L, et al. (2004) Identification of the function of gene lndM2 encoding a bifunctional oxygenase-reductase involved in the biosynthesis of the antitumor antibiotic landomycin E by Streptomyces globisporus 1912 supports the originally assigned structure for landomycinone. J. Org. Chem. 70(2):631-638.
67. Lombó F, Braña AF, Salas JA, & Méndez C (2004) Genetic organization of the biosynthetic gene cluster for the antitumor angucycline oviedomycin in Streptomyces antibioticus ATCC 11891. ChemBioChem 5(9):1181-1187.
68. Lombó F, et al. (2009) Elucidation of oxygenation steps during oviedomycin biosynthesis and generation of derivatives with increased antitumor activity. ChemBioChem 10(2):296-303.
69. Palmu K, Ishida K, Mäntsälä P, Hertweck C, & Metsä-Ketelä M (2007) Artificial reconstruction of two cryptic angucycline antibiotic biosynthetic pathways. ChemBioChem 8(13):1577-1584.
70. Chen Y, et al. (2005) Functional analyses of oxygenases in jadomycin biosynthesis and identification of JadH as a bifunctional oxygenase/dehydrase. J. Biol. Chem. 280(23):22508-22514.
71. Kharel MK, Zhu L, Liu T, & Rohr J (2007) Multi-oxygenase complexes of the gilvocarcin and jadomycin biosyntheses. J. Am. Chem. Soc. 129(13):3780-3781.
72. Fischer C, Lipata F, & Rohr J (2003) The complete gene cluster of the antitumor agent gilvocarcin V and its implication for the biosynthesis of the gilvocarcins. J. Am. Chem. Soc. 125(26):7818-7819.
73. Kharel MK, Nybo SE, Shepherd MD, & Rohr J (2010) Cloning and characterization of the ravidomycin and chrysomycin biosynthetic gene clusters. ChemBioChem 11(4):523-532.
74. Huang X, et al. (2008) Benzopyrenomycin, a cytotoxic bacterial polyketide metabolite with a benzo[a]pyrene-type carbocyclic ring system. Angew. Chem. Int. Ed. 47(21):3995-3998.
75. Erb A, Luzhetskyy A, Hardter U, & Bechthold A (2009) Cloning and sequencing of the biosynthetic gene cluster for saquayamycin Z and galtamycin B and the elucidation of the assembly of their saccharide chains. ChemBioChem 10(8):1392-1401.
76. Kroken S, Glass NL, Taylor JW, Yoder OC, & Turgeon BG (2003) Phylogenomic analysis of type I polyketide synthase genes in pathogenic and saprobic ascomycetes. Proc. Natl. Acad. Sci. U. S. A. 100(26):15670-15675.
77. Galagan JE, et al. (2005) Sequencing of Aspergillus nidulans and comparative analysis with A. fumigatus and A. oryzae. Nature 438(7071):1105-1115.
78. Ma SM, et al. (2007) Enzymatic synthesis of aromatic polyketides using PKS4 from Gibberella fujikuroi. J. Am. Chem. Soc. 129(35):10642-10643.
79. Fujii I, et al. (2000) Enzymatic synthesis of 1,3,6,8-tetrahydroxynaphthalene solely from malonyl coenzyme A by a fungal iterative type I polyketide synthase PKS1. Biochemistry 39(30):8853-8858.
80. Udwary DW, Merski M, & Townsend CA (2002) A method for prediction of the locations of linker regions within large multifunctional proteins, and application to a type I polyketide synthase. J. Mol. Biol. 323(3):585-598.
81. Watanabe A, et al. (1999) Re-identification of Aspergillus nidulans wA gene to code for a polyketide synthase of naphthopyrone. Tetrahedron Lett. 40(1):91-94.
82. Linnemannstöns P, et al. (2002) The polyketide synthase gene pks4 from Gibberella fujikuroi encodes a key enzyme in the biosynthesis of the red pigment bikaverin. Fungal Genet. Biol. 37(2):134-148.
83. Wiemann P, et al. (2009) Biosynthesis of the red pigment bikaverin in Fusarium fujikuroi: Genes, their function and regulation. Mol. Microbiol. 72(4):931-946.
84. Zhan J, Burns AM, Liu MX, Faeth SH, & Gunatilaka AA (2007) Search for cell motility and angiogenesis inhibitors with potential anticancer activity: beauvericin and other constituents of two endophytic strains of Fusarium oxysporum. J. Nat. Prod. 70(2):227-232.
85. McInnes AG, Walter JA, Smith DG, Wright JL, & Vining LC (1976) Biosynthesis of bikaverin in Fusarium oxysporum. Use of 13C nuclear magnetic resonance with homonuclear 13C decoupling to locate adjacent 13C labels. J. Antibiot. 29(10):1050-1057.
86. Abe I, et al. (2007) Structure-based engineering of a plant type III polyketide synthase: Formation of an unnatural nonaketide naphthopyrone. J. Am. Chem. Soc. 129(18):5976-5980.
87. Watanabe CM & Townsend CA (2002) Initial characterization of a type I fatty acid synthase and polyketide synthase multienzyme complex NorS in the biosynthesis of aflatoxin B. Chem. Biol. 9(9):981-988.
88. Crawford JM, et al. (2009) Structural basis for biosynthetic programming of fungal aromatic polyketide cyclization. Nature 461(7267):1139-1143.
89. Tsai SC, et al. (2001) Crystal structure of the macrocycle-forming thioesterase domain of the erythromycin polyketide synthase: versatility from a unique substrate channel. Proc. Natl. Acad. Sci. U. S. A. 98(26):14808-14813.
90. Akey DL, et al. (2006) Structural basis for macrolactonization by the pikromycin thioesterase. Nat. Chem. Biol. 2(10):537-542.
91. Giraldes JW, et al. (2006) Structural and mechanistic insights into polyketide macrolactonization from polyketide-based affinity labels. Nat. Chem. Biol. 2(10):531-536.
92. Strieter ER, Koglin A, Aron ZD, & Walsh CT (2009) Cascade reactions during coronafacic acid biosynthesis: elongation, cyclization, and functionalization during Cfa7-catalyzed condensation. J. Am. Chem. Soc. 131(6):2113-2115.
93. Watanabe A, et al. (1998) Product identification of polyketide synthase coded by Aspergillus nidulans wA gene. Tetrahedron Lett. 39(42):7733-7736.
94. Pfeifer BA, Admiraal SJ, Gramajo H, Cane DE, & Khosla C (2001) Biosynthesis of complex polyketides in a metabolically engineered strain of E. coli. Science 291(5509):1790-1792.
95. Watanabe K, et al. (2006) Total biosynthesis of antitumor nonribosomal peptides in Escherichia coli. Nat. Chem. Biol. 2(8):423-428.
96. Chang MCY, Eachus RA, Trieu W, Ro D-K, & Keasling JD (2007) Engineering Escherichia coli for production of functionalized terpenoids using plant P450s. Nat. Chem. Biol. 3(5):274-277.
97. Minami H, et al. (2008) Microbial production of plant benzylisoquinoline alkaloids. Proc. Natl. Acad. Sci. U. S. A. 105(21):7393-7398.
98. Beltran-Alvarez P, Cox RJ, Crosby J, & Simpson TJ (2007) Dissecting the component reactions catalyzed by the actinorhodin minimal polyketide synthase. Biochemistry 46(50):14672-14681.
99. Ma Y, et al. (2006) Catalytic relationships between Type I and Type II iterative polyketide synthases: The Aspergillus parasiticus norsolorinic acid synthase. ChemBioChem 7(12):1951-1958.
100. Bringmann G, et al. (2006) Different polyketide folding modes converge to an identical molecular architecture. Nat. Chem. Biol. 2(8):429-433.
101. Bibb MJ, Sherman DH, Omura S, & Hopwood DA (1994) Cloning, sequencing and deduced functions of a cluster of Streptomyces genes probably encoding biosynthesis of the polyketide antibiotic frenolicin. Gene 142(1):31-39.
102. Shen B & Hutchinson CR (1996) Deciphering the mechanism for the assembly of aromatic polyketides by a bacterial polyketide synthase. Proc. Natl. Acad. Sci. U. S. A. 93(13):6600-6604.
103. Tang Y, Kim CY, Mathews II, Cane DE, & Khosla C (2006) The 2.7-Angstrom crystal structure of a 194-kDa homodimeric fragment of the 6-deoxyerythronolide B synthase. Proc. Natl. Acad. Sci. U. S. A. 103(30):11124-11129.
104. An JH & Kim YS (1998) A gene cluster encoding malonyl-CoA decarboxylase (MatA), malonyl-CoA synthetase (MatB) and a putative dicarboxylate carrier protein (MatC) in Rhizobium trifolii. Eur. J. Biochem. 257(2):395-402.
105. Ma SM & Tang Y (2007) Biochemical characterization of the minimal polyketide synthase domains in the lovastatin nonaketide synthase LovB. FEBS J. 274(11):2854-2864.
106. Zhou H, Zhan JX, Watanabe K, Xie XK, & Tang Y (2008) A polyketide macrolactone synthase from the filamentous fungus Gibberella zeae. Proc. Natl. Acad. Sci. U. S. A. 105(17):6249-6254.
107. Tang Y, Lee TS, Kobayashi S, & Khosla C (2003) Ketosynthases in the initiation and elongation modules of aromatic polyketide synthases have orthogonal acyl carrier protein specificity. Biochemistry 42(21):6588-6595.
108. Breinholt J, Jensen GW, Kjaer A, Olsen CE, & Rosendahl CN (1997) Hypomycetin - an antifungal, tetracyclic metabolite from Hypomyces aurantius: Production, structure and biosynthesis. Acta Chem. Scand. 51(8):855-860.
109. Pfeifer B, Hu Z, Licari P, & Khosla C (2002) Process and Metabolic Strategies for Improved Production of Escherichia coli-Derived 6-Deoxyerythronolide B. Appl. Environ. Microbiol. 68(7):3287-3292.
110. Rossolini GM, et al. (2001) Metallo-beta-lactamase producers in environmental microbiota: new molecular class B enzyme in Janthinobacterium lividum. Antimicrob. Agents Chemother. 45(3):837-844.
111. Yoder OC & Turgeon BG (2001) Fungal genomics and pathogenicity. Curr. Opin. Plant Biol. 4(4):315-321.
112. Machida M, et al. (2005) Genome sequencing and analysis of Aspergillus oryzae. Nature 438(7071):1157-1161.
113. Nierman WC, et al. (2005) Genomic sequence of the pathogenic and allergenic filamentous fungus Aspergillus fumigatus. Nature 438(7071):1151-1156.
114. Brakhage AA, et al. (2008) Activation of fungal silent gene clusters: a new avenue to drug discovery. Prog. Drug. Res. 66:1, 3-12. 115. Corre C & Challis GL (2009) New natural product biosynthetic chemistry discovered by genome mining. Nat. Prod. Rep. 26(8):977-986. 116. Udwary DW, et al. (2007) Genome sequencing reveals complex secondary metabolome in the marine actinomycete Salinispora tropica. Proc. Natl. Acad. Sci. U. S. A. 104(25):10376-10381. 117. Wattana-amorn P, et al. (2010) Solution structure of an acyl carrier protein domain from a fungal type I polyketide synthase. Biochemistry 49(10):2186-2193. 118. Tsai HF, et al. (2001) Pentaketide melanin biosynthesis in Aspergillus fumigatus requires chain- length shortening of a heptaketide precursor. J. Biol. Chem. 276(31):29292-29298. 119. Bailey AM, et al. (2007) Characterisation of 3-methylorcinaldehyde synthase (MOS) in Acremonium strictum: first observation of a reductive release mechanism during polyketide biosynthesis. Chem. Commun. (39):4053-4055. 120. Bok JW, et al. (2009) Chromatin-level regulation of biosynthetic gene clusters. Nat. Chem. Biol. 5(7):462-464. 121. Szewczyk E, et al. (2008) Identification and characterization of the asperthecin gene cluster of Aspergillus nidulans. Appl. Environ. Microbiol. 74(24):7607-7612. 122. Awakawa T, et al. (2009) Physically discrete beta-lactamase-type thioesterase catalyzes product release in atrochrysone synthesis by iterative type I polyketide synthase. Chem. Biol. 16(6):613- 623. 123. Tamura K, Dudley J, Nei M, & Kumar S (2007) MEGA4: Molecular evolutionary genetics analysis (MEGA) software version 4.0. Mol. Biol. Evol. 24(8):1596-1599. 124. Schroeckh V, et al. (2009) Intimate bacterial-fungal interaction triggers biosynthesis of archetypal polyketides in Aspergillus nidulans. Proc. Natl. Acad. Sci. U. S. A. 106(34):14558- 14563. 125. Kim JE, et al. (2005) Putative polyketide synthase and laccase genes for biosynthesis of aurofusarin in Gibberella zeae. 
Appl. Environ. Microbiol. 71(4):1701-1708. 126. Reeves CD, Hu Z, Reid R, & Kealey JT (2008) Genes for the biosynthesis of the fungal polyketides hypothemycin from Hypomyces subiculosus and radicicol from Pochonia chlamydosporia. Appl. Environ. Microbiol. 74(16):5121-5129. 127. Zhou H, et al. (2010) Enzymatic synthesis of resorcylic acid lactones by cooperation of fungal iterative polyketide synthases involved in hypothemycin biosynthesis. J. Am. Chem. Soc.:4530– 4531. 128. Wang S, et al. (2008) Functional Characterization of the Biosynthesis of Radicicol, an Hsp90 Inhibitor Resorcylic Acid Lactone from Chaetomium chiversii. Chem. Biol. 15(12):1328-1338. 129. Fujii I, et al. (2004) Hydrolytic polyketide shortening by Ayg1p, a novel enzyme involved in fungal melanin biosynthesis. J. Biol. Chem. 279(43):44613-44620. 130. Brown DW, et al. (1996) Twenty-five coregulated transcripts define a sterigmatocystin gene cluster in Aspergillus nidulans. Proc. Natl. Acad. Sci. U. S. A. 93(4):1418-1422. 131. Tominaga M, et al. (2006) Molecular analysis of an inactive aflatoxin biosynthesis gene cluster in Aspergillus oryzae RIB strains. Appl. Environ. Microbiol. 72(1):484-490. 132. Chen H, Lee M-H, Daub ME, & Chung K-R (2007) Molecular analysis of the cercosporin biosynthetic gene cluster in Cercospora nicotianae. Mol. Microbiol. 64(3):755-770. 133. Nielsen KF, Mogensen JM, Johansen M, Larsen TO, & Frisvad JC (2009) Review of secondary metabolites and mycotoxins from the Aspergillus niger group. Anal. Bioanal.Chem.:1225-1242. 134. Chooi Y-H, Cacho R, & Tang Y (2010) Identification of Viridicatumtoxin and Griseofulvin Gene Clusters in Penicillium aethiopicum. Chem. Biol. 17(5):483-494.

135. Jesus AE, Hull WE, Steyn PS, Heerden FR, & Vleggaar R (1982) Biosynthesis of viridicatumtoxin, a mycotoxin from Penicillium expansum. J. Chem. Soc., Chem. Commun. 1982(16):902-904.
136. Schumann J & Hertweck C (2006) Advances in cloning, functional analysis and heterologous expression of fungal polyketide synthase genes. J. Biotechnol. 124(4):690-703.
137. Larkin MA, et al. (2007) Clustal W and Clustal X version 2.0. Bioinformatics 23(21):2947-2948.
138. Rzhetsky A & Nei M (1992) A simple method for estimating and testing minimum-evolution trees. Mol. Biol. Evol. 9(5):945-967.
139. Zuckerkandl E & Pauling L (1965) Evolutionary divergence and convergence in proteins. Evolving genes and proteins: A symposium held at the Institute of Microbiology of Rutgers, (The State University, NY), pp 97-166.
140. O'Hagan D (1991) The polyketide metabolites (Ellis Horwood, Chichester, UK).
141. Zhou H, Li Y, & Tang Y (2010) Cyclization of aromatic polyketides from bacteria and fungi. Nat. Prod. Rep. 27(6):839-868.
142. Harris CM, Roberson JS, & Harris TM (1976) Biosynthesis of griseofulvin. J. Am. Chem. Soc. 98(17):5380-5386.
143. Zheng C-J, Yu H-E, Kim E-H, & Kim W-G (2008) Viridicatumtoxin B, a new anti-MRSA agent from Penicillium sp. FR11. J. Antibiot. 61(10):633-637.
144. Hutchison RD, Steyn PS, & Van Rensburg SJ (1973) Viridicatumtoxin, a new mycotoxin from Penicillium viridicatum Westling. Toxicol. Appl. Pharmacol. 24(3):507-509.
145. Crawford JM & Townsend CA (2010) New insights into the formation of fungal aromatic polyketides. Nat. Rev. Microbiol. 8(12):879-889.
146. Korman TP, et al. (2010) Structure and function of an iterative polyketide synthase thioesterase domain catalyzing Claisen cyclization in aflatoxin biosynthesis. Proc. Natl. Acad. Sci. U. S. A. 107(14):6246-6251.
147. Li Y, Xu W, & Tang Y (2010) Classification, prediction, and verification of the regioselectivity of fungal polyketide synthase product template domains. J. Biol. Chem. 285(30):22764-22773.
148. Ishimaru T, Tsuboya S, & Saijo T (1994) JP0556861A.
149. Shu YZ, Cutrone JFQ, Klohr SE, & Huang S (1995) BMS-192548, a tetracyclic binding inhibitor of neuropeptide Y receptors, from Aspergillus niger WB2346. II. Physico-chemical properties and structural characterization. J. Antibiot. 48(10):1060-1065.
150. Kodukula K, et al. (1995) BMS-192548, a tetracyclic binding inhibitor of neuropeptide Y receptors, from Aspergillus niger WB2346. I. Taxonomy, fermentation, isolation and biological activity. J. Antibiot. 48(10):1055-1059.
151. Pel HJ, et al. (2007) Genome sequencing and analysis of the versatile cell factory Aspergillus niger CBS 513.88. Nat. Biotechnol. 25(2):221-231.
152. Chiang Y-M, et al. (2009) A gene cluster containing two fungal polyketide synthases encodes the biosynthetic pathway for a polyketide, asperfuranone, in Aspergillus nidulans. J. Am. Chem. Soc. 131(8):2965-2970.
153. Bergmann S, et al. (2007) Genomics-driven discovery of PKS-NRPS hybrid metabolites from Aspergillus nidulans. Nat. Chem. Biol. 3(4):213-217.
154. Pall ML & Brunelli JP (1993) A series of six compact fungal transformation vectors containing polylinkers with multiple unique restriction sites. Fungal Genet. Newsl. 40:59-62.
155. Thompson CJ, et al. (1987) Characterization of the herbicide-resistance gene bar from Streptomyces hygroscopicus. EMBO J. 6(9):2519-2523.
156. Lee KK, Da Silva NA, & Kealey JT (2009) Determination of the extent of phosphopantetheinylation of polyketide synthases expressed in Escherichia coli and Saccharomyces cerevisiae. Anal. Biochem. 394(1):75-80.

157. Jones EW (1991) Tackling the protease problem in Saccharomyces cerevisiae. Methods Enzymol. 194:428-453.
158. Chen Y, et al. (2010) Characterization of JadH as an FAD- and NAD(P)H-dependent bifunctional hydroxylase/dehydrase in jadomycin biosynthesis. ChemBioChem 11(8):1055-1060.
159. Bush K & Jacoby GA (2010) Updated functional classification of beta-lactamases. Antimicrob. Agents Chemother. 54(3):969-976.
160. Garau G, et al. (2004) Update of the standard numbering scheme for class B beta-lactamases. Antimicrob. Agents Chemother. 48(7):2347-2349.
161. Roy A, Kucukural A, & Zhang Y (2010) I-TASSER: A unified platform for automated protein structure and function prediction. Nat. Protoc. 5(4):725-738.
162. Zhang Y (2008) I-TASSER server for protein 3D structure prediction. BMC Bioinf. 9:40.
163. Shimada A, Ishikawa H, Nakagawa N, Kuramitsu S, & Masui R (2010) The first crystal structure of an archaeal metallo-beta-lactamase superfamily protein; ST1585 from Sulfolobus tokodaii. Proteins 78(10):2399-2402.
164. Stamp AL, et al. (2010) Structural and functional characterization of Salmonella enterica serovar Typhimurium YcbL: an unusual Type II glyoxalase. Protein Sci. 19(10):1897-1905.
165. Xu D, Xie D, & Guo H (2006) Catalytic mechanism of class B2 metallo-beta-lactamase. J. Biol. Chem. 281(13):8740-8747.
166. Wang Z, Fast W, Valentine AM, & Benkovic SJ (1999) Metallo-beta-lactamase: Structure and mechanism. Curr. Opin. Chem. Biol. 3(5):614-622.
167. López V, et al. (2005) Insights into the interaction of human arginase II with substrate and manganese ions by site-directed mutagenesis and kinetic studies. FEBS J. 272(17):4540-4548.
168. Orellana MS, et al. (2002) Insights into the interaction of human liver arginase with tightly and weakly bound manganese ions by chemical modification and site-directed mutagenesis studies. Arch. Biochem. Biophys. 403(2):155-159.
169. Scolnick LR, Kanyo ZF, Cavalli RC, Ash DE, & Christianson DW (1997) Altering the binuclear manganese cluster of arginase diminishes thermostability and catalytic function. Biochemistry 36(34):10558-10565.
170. Daiyasu H, Osaka K, Ishino Y, & Toh H (2001) Expansion of the zinc metallo-hydrolase family of the beta-lactamase fold. FEBS Lett. 503(1):1-6.
171. Bebrone C (2007) Metallo-beta-lactamases (classification, activity, genetic organization, structure, zinc coordination) and their superfamily. Biochem. Pharmacol. 74(12):1686-1701.
172. Christianson DW & Cox JD (1999) Catalysis by metal-activated hydroxide in zinc and manganese metalloenzymes. Annu. Rev. Biochem. 68:33-57.
173. Christianson DW (1997) Structural chemistry and biology of manganese metalloenzymes. Prog. Biophys. Mol. Biol. 67(2-3):217-252.
174. Dismukes GC (1996) Manganese enzymes with binuclear active sites. Chem. Rev. 96(7):2909-2926.
175. Kanyo ZF, Scolnick LR, Ash DE, & Christianson DW (1996) Structure of a unique binuclear manganese cluster in arginase. Nature 383(6600):554-557.
176. Wilce MC, et al. (1998) Structure and mechanism of a proline-specific aminopeptidase from Escherichia coli. Proc. Natl. Acad. Sci. U. S. A. 95(7):3472-3477.
177. Heikinheimo P, et al. (1996) The structural basis for pyrophosphatase catalysis. Structure 4(12):1491-1508.
178. Ma SM, et al. (2009) Complete reconstitution of a highly reducing iterative polyketide synthase. Science 326(5952):589-592.

179. Sharma KK & Boddy CN (2007) The thioesterase domain from the pimaricin and erythromycin biosynthetic pathways can catalyze hydrolysis of simple thioester substrates. Bioorg. Med. Chem. Lett. 17:3034-3037.
180. Khangulov SV, Sossong TM, Jr., Ash DE, & Dismukes GC (1998) L-arginine binding to liver arginase requires proton transfer to gateway residue His141 and coordination of the guanidinium group to the dimanganese(II,II) center. Biochemistry 37(23):8539-8550.
181. Kabuto C, et al. (1976) X-Ray structure of viridicatumtoxin: a new class of mycotoxin from Penicillium viridicatum Westling. J. Chem. Soc., Chem. Commun. (18):728-729.
182. Hall DR, et al. (1999) The crystal structure of Escherichia coli class II fructose-1,6-bisphosphate aldolase in complex with phosphoglycolohydroxamate reveals details of mechanism and specificity. J. Mol. Biol. 287(2):383-394.
183. Galkin A, et al. (2009) Structural insights into the substrate binding and stereoselectivity of Giardia fructose-1,6-bisphosphate aldolase. Biochemistry 48(14):3186-3196.
184. Lipscomb WN & Strater N (1996) Recent advances in zinc enzymology. Chem. Rev. 96(7):2375-2434.
185. Pickens LB & Tang Y (2010) Oxytetracycline biosynthesis. J. Biol. Chem. 285(36):27509-27515.
186. Lombo F, Menendez N, Salas JA, & Mendez C (2006) The aureolic acid family of antitumor compounds: structure, mode of action, biosynthesis, and novel derivatives. Appl. Microbiol. Biotechnol. 73(1):1-14.
187. Andrianopoulos A & Hynes MJ (1988) Cloning and analysis of the positively acting regulatory gene amdR from Aspergillus nidulans. Mol. Cell. Biol. 8(8):3532-3541.
188. Chooi Y-H, et al. (2008) Cloning and sequence characterization of a non-reducing polyketide synthase gene from the lichen Xanthoparmelia semiviridis. Mycol. Res. 112(2):147-161.
189. Mutka SC, Bondi SM, Carney JR, Da Silva NA, & Kealey JT (2006) Metabolic pathway engineering for complex polyketide biosynthesis in Saccharomyces cerevisiae. FEMS Yeast Res. 6(1):40-47.
190. Bradford MM (1976) A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding. Anal. Biochem. 72:248-254.
191. Kharel MK, Pahari P, Lian H, & Rohr J (2010) Enzymatic total synthesis of rabelomycin, an angucycline group antibiotic. Org. Lett. 12(12):2814-2817.
192. Li Y, Chooi YH, Sheng Y, Valentine JS, & Tang Y (2011) Comparative characterization of fungal anthracenone and naphthacenedione biosynthetic pathways reveals an alpha-hydroxylation-dependent Claisen-like cyclization catalyzed by a dimanganese thioesterase. J. Am. Chem. Soc. 133(39):15773-15785.
193. Garcia-Saez I, et al. (2003) Three-dimensional structure of FEZ-1, a monomeric subclass B3 metallo-beta-lactamase from Fluoribacter gormanii, in native form and in complex with D-captopril. J. Mol. Biol. 325(4):651-660.
