Identification, Analysis and Manipulation of the Torrubiellone A Gene Cluster

IDENTIFICATION, ANALYSIS AND MANIPULATION OF THE TORRUBIELLONE A GENE CLUSTER

By

GUILLERMO CARLOS FERNANDEZ BUNSTER

School of Biological Sciences University of Bristol United Kingdom

A dissertation submitted to the UNIVERSITY OF BRISTOL in accordance with the requirements of the DOCTOR OF PHILOSOPHY in the FACULTY OF SCIENCE

June / December 2016

Word count = 43,800

Abstract

Torrubiellones A-D, extracted from Torrubiella sp. BCC2165, are structurally similar to 2-pyridone compounds. Torrubiellone A is particularly interesting because it has antimalarial activity. Combining knowledge of the gene clusters responsible for the biosynthesis of structurally similar compounds with in-silico analysis of the Torrubiella genome sequence led to the identification of the torrubiellone A biosynthetic gene cluster.

Torrubiella sp. BCC2165 DNA was extracted, sequenced and analysed to reveal a putative torrubiellone A gene cluster, comprising torS, encoding a hybrid polyketide synthase-nonribosomal peptide synthetase; torA and torB, encoding two cytochrome P450s; and torC, encoding an enoyl reductase. Comparison with the tenellin and desmethylbassianin gene clusters identified two additional genes, torD and torE, which could be responsible for the structural differences between torrubiellone A and desmethylbassianin. torS was assembled without introns by homologous recombination in yeast and combined with other biosynthetic genes from the putative torrubiellone cluster on a multigene expression vector. Assembled plasmids were used to transform the filamentous fungus Aspergillus oryzae NSAR1, yielding strongly yellow-pigmented transformants. Analysis of organic extracts from transformants by liquid chromatography-mass spectrometry indicated that production of torrubiellone-related compounds had been achieved. The functions of torD and torE were investigated by co-expressing these genes in a tenellin-producing A. oryzae strain, which resulted in the addition of a hydroxyl group to tenellin.

A Torrubiella transformation system was developed, initially for promoter analysis of the torD and torE genes: the start codon of torE lies only 17 bp downstream of the torD stop codon in the Torrubiella genome, yet the two genes are transcribed independently, and the torE promoter is encoded within the torD coding region. The potential regulatory function of two transcription factors, ZnTF and C6TF, whose genes flank the torrubiellone biosynthetic genes, was investigated by overexpression in the native host. Overexpression of ZnTF led to the production of novel compounds, but no role, either positive or negative, was found for C6TF. The potential for genetic analysis by gene knockout in Torrubiella was tested with the torS gene. Transformants generated by homologous recombination were detected, as well as ectopic integration events, but the system requires improvement, particularly in the isolation of homokaryons post-transformation.

Dedication and acknowledgements

I would like to thank my supervisor for giving me the opportunity to study for a PhD in England and for accepting me as his student. I have learnt a great deal by listening to him, talking with him and watching him work, with the same joy and excitement at every new finding as on the first day. He will always be my example of how a researcher should be: always curious, always with a theory to test, tweaking protocols to improve them, and teaching me how to express myself and write properly. Those are lessons that I will never forget.

Thanks also to Elisabeth, who was always there for us, and especially for my daughter, who adores her.

Thanks also to Jeroen Maertens, Luis Pablo and Sandra Alvarez, the best friends I could have had in the UK, who offered me words of support, discussion and even constructive criticism, and who were always available for a cup of coffee at any time of the day, and to Magdalena Koziol for teaching me (almost) all the techniques required to perform my research.

To my parents, for their continuous support while I was away.

Amelia and Angela, I really appreciate what you did for me: you decided to come with me on this four-year adventure in a foreign country, barely speaking the language. You are the bravest people I know, and I am very lucky to have you as my beloved wife and my adorable daughter.

I am sure I wouldn’t have been able to do this without you. You are my pillars and my motivation, the reason why I am here, and I hope to be the best for you.

Author’s declaration

I declare that the work in this dissertation was carried out in accordance with the requirements of the University’s Regulations and Code of Practice for Research Degree Programmes and that it has not been submitted for any other academic award. Except where indicated by specific reference in the text, the work is the candidate’s own work. Work done in collaboration with, or with the assistance of, others, is indicated as such. Any views expressed in the dissertation are those of the author.

SIGNED: ...... DATE: ......

ABBREVIATIONS:

PK: Polyketide

PKS: Polyketide synthase

NRP: Non-Ribosomal peptide

NRPS: Non-Ribosomal Peptide Synthetase

PCR: Polymerase Chain Reaction

LC-MS: Liquid Chromatography – Mass Spectrometry

TLC: Thin Layer Chromatography

ZnTF: Zn Finger Domain Transcription Factor

C6TF: C6-Transcription Factor

TF: Transcription Factor

FAS: Fatty Acid Synthase

FA: Fatty Acid

KS: Keto synthase domain of PKS

ACP: Acyl-carrier protein domain of PKS

AT: Acyl transferase domain of PKS

KR: Keto-reductase domain of PKS

DH: Dehydratase domain of PKS

ER: Active enoyl reductase domain of PKS.

ER°: Inactive enoyl reductase domain of PKS.

TE: Thioesterase domain of NRPS

SAT: Starter-Unit ACP transacylase domain of PKS

PT: product template domain of PKS

C-Met: Methylation domain of PKS

A: Adenylation domain of NRPS

PCP: Thiolation and Peptide Carrier Protein domain of NRPS

C: Condensation domain of NRPS

NR-PKS: Non-reducing polyketide synthase

PR-PKS: Partially-reducing polyketide synthase

HR-PKS: Highly-reducing polyketide synthase

MO: 3-methylorcinaldehyde

OSA: Orsellinic acid

NSAS: Norsolorinic acid synthase

MSAS: 6-methylsalicylate synthase

ten, dmb, mil pathway: tenellin, desmethylbassianin, militarinone

ACT: Artemisinin-based combination therapies

WHO: World Health Organization

GMAP: Global Malaria Action Plan

SAM: S-adenosyl methionine

SM: Secondary metabolism

NMR: Nuclear Magnetic Resonance

antiSMASH: antibiotics & Secondary Metabolite Analysis Shell

ORF: Open Reading Frame

MEA: Malt Extract Agar. A. oryzae production media.

CDA or Cdox: Czapek-Dox agar. A. oryzae selection media.

ES+: Positive electrospray ionisation mass spectrometry

ES-: Negative electrospray ionisation mass spectrometry

Standard genomic nomenclature:

torS: gene

TorS: protein sequence

TORS: Enzyme

CONTENTS

Abbreviations
Contents
List of tables
List of figures
1. Introduction
1.1 Specialized metabolism
1.2 Fungal secondary metabolites
1.3 Biosynthesis
1.3.1 Fatty acid biosynthesis
1.3.2 Polyketide biosynthesis
1.3.3 PKS-NRPS hybrid systems
1.4 Torrubiella spp: Characteristics and compounds
1.5 Malaria and antimalarial activities
1.6 Gene mining approach for discovering new natural products
1.7 Heterologous expression of fungal secondary metabolites
1.8 Molecular approaches to the study of torrubiellone A biosynthesis
1.8.1 Yeast Recombination
1.8.2 Gateway Recombination System
1.9 Aims
2. Materials and Methods
2.1 Microbial strains and growth media
2.1.1 Routine Chemicals
2.1.2 Escherichia coli
2.1.3 Saccharomyces cerevisiae
2.1.4 Aspergillus oryzae NSAR1
2.1.5 Torrubiella sp. BCC2165
2.2 Transformation procedures
2.2.1 Escherichia coli
2.2.2 Saccharomyces cerevisiae transformation
2.2.3 Aspergillus oryzae transformation
2.2.4 Torrubiella sp. BCC2165 transformation
2.3 DNA manipulations
2.3.1 Restriction digests
2.3.2 Gateway transfer
2.3.3 Nucleic Acid Extractions
2.3.4 Polymerase chain reactions
2.3.5 Preparative/Analytical PCR
2.3.6 Colony PCR
2.3.7 Electrophoresis
2.3.8 Sequencing
2.4 Chemical Extractions and Analysis
2.4.1 Metabolite extraction from liquid media
2.4.2 Metabolite extraction from plates
2.4.3 Liquid Chromatography – Mass Spectrometry (LC-MS)
2.4.4 Thin Layer Chromatography
2.5 Microscopy
2.5.1 Light Microscopy
2.5.2 Fluorescence Microscopy
2.6 Software and Online Tools
3. The torrubiellone biosynthetic gene cluster
3.1 Introduction
3.2 Genomic Sequence data from Torrubiella sp. BCC2165
3.3 in-silico analysis
3.3.1 Gene Cluster prediction
3.4 Putative torrubiellone A gene cluster analysis
3.4.1 Torrubiellone synthase analysis
3.4.2 Genomic analysis of the ORFs located on the torrubiellone A gene cluster
3.4.3 PKS-NRPS Gene cluster comparison
3.4.4 Torrubiellone A gene cluster intron analysis
3.5 Gene cluster amplification from Torrubiella sp. BCC2165
3.6 Biosynthetic pathway proposal for Torrubiellone A
4. Development of a Torrubiella transformation system for promoter analysis, directed gene targeting and transcription factor overexpression
4.1 Introduction
4.2 Results
4.2.1 Development of transformation system for Torrubiella sp. BCC2165
4.2.2 Promoter analysis
4.2.3 KO of torS
4.2.4 Overexpression of Transcription Factors
4.2.5 Metabolite production in ZnTF transformants
4.3 Discussion
4.3.1 Transformation system
4.3.2 Promoter analysis
4.3.3 ZnTF and C6TF Transcription Factors
5. Torrubiellone A gene cluster expression in A. oryzae
5.1 Introduction
5.2 Results
5.2.1 torS (PKS-NRPS) assembly
5.2.2 Expression of torS+torC in A. oryzae
5.2.3 Intron removal strategy
5.2.4 Expression of torA and torB genes in torSC A. oryzae transformants
5.2.5 Function of TORD and TORE enzymes
5.3 Discussion
6. General Discussion
7. References
8. Appendices
9.1 Domain identification for torS synthase gene
9.2 Protein sequence alignment between TorS and TenS
9.3 Plasmid Maps

LIST OF TABLES

Table 1.1: Examples of PKS-NRPS systems
Table 1.2: List of compounds described in the genus Torrubiella
Table 1.3: Current compounds used in anti-malarial treatment
Table 1.4: Anti-malarial drugs under study
Table 1.5: Recently validated antimalarial targets
Table 2.1: List of primers used in the research
Table 2.2: Programme for preparative PCR
Table 2.3: Analytical PCR programmes
Table 3.1: Data obtained from Torrubiella sp. BCC2165 sequencing
Table 3.2: in-silico prediction of secondary metabolite gene clusters in Torrubiella sp. BCC2165
Table 3.3: Secondary metabolite gene clusters in-silico prediction for Torrubiella and related fungi
Table 3.4: % aa identity between TorS, DmbS, MilS and TenS protein sequences
Table 3.5: Proposed Torrubiellone A gene cluster and homologous genes in other organisms
Table 3.6: % aa identity between domains of TorS, DmbS and TenS protein sequences
Table 3.7: Domains predicted for TorS protein sequence and their probable position within the synthase
Table 3.8: Comparison of conserved amino acids of different A domains
Table 3.9: Putative introns located in genes belonging to the biosynthetic torrubiellone A gene cluster
Table 4.1: BASTA growth data for Torrubiella sp. BCC2165
Table 4.2: eGFP expression of promoters
Table 5.1: Proposed torrubiellone-related structures for compounds extracted from A. oryzae transformed with pTYargtorSintlessCdmbABC

LIST OF FIGURES

Figure 1.1: Examples of NRPs and PKs
Figure 1.2: Examples of structurally-related molecules produced by PKS-NRPS fusions
Figure 1.3: Examples of polyketide structures
Figure 1.4: Claisen ester condensation
Figure 1.5: The fatty acid biosynthetic cycle
Figure 1.6: Reductive steps catalysed by PKS domains
Figure 1.7: MOS domain structure
Figure 1.8: MOS to xenovulene A conversion
Figure 1.9: Domain architectures for fungal iterative type I PKSs deduced from gene sequences
Figure 1.10: PKS and NRPS strdomain
Figure 1.11: Torrubiella morphological characteristics
Figure 1.12: Different bioactive substances obtained from insect pathogenic fungi
Figure 1.13: Comparison of torrubiellone structures with militarinone A
Figure 1.14: Historical timeline of the discovery of antimalarial therapeutics
Figure 1.15: Tenellin pathway
Figure 1.16: Basic gene assembly in a desired vector
Figure 1.17: Scheme of the Gateway Recombinational System
Figure 3.1: Chalcone core and compounds derived from the basic core
Figure 3.2: Comparison between similar PKS-NRPS compounds
Figure 3.3: ORF order of the putative torrubiellone A gene cluster
Figure 3.4: Examples of OYE activities
Figure 3.5: CYP52X1 enzymatic function
Figure 3.6: ORFs detected by BLAST search for contig 5044 in Torrubiella sp. BCC2165
Figure 3.7: Predicted domains in the coding sequence of torS
Figure 3.8: Domain organization and genomic context of several PKS-NRPS
Figure 3.9: Phylogenetic tree comparing protein sequences of known PKS and PKS-NRPS with TorS
Figure 3.10: torD and torE distance
Figure 3.11: PCR product from genomic (g) and complementary (c) DNA
Figure 3.12: Comparison between torrubiellone, militarinone and tenellin gene clusters
Figure 3.13: Torrubiellone A structure
Figure 3.14: Aspyridone and torrubiellone A structures
Figure 3.15: Contig 5044 analysis
Figure 3.16: Activation by the A domain of the tyrosine amino acid
Figure 3.17: Biosynthesis of the PK moiety by the synthase
Figure 3.18: PK-NRP fusion, attached to the PCP domain of the cognate NRPS
Figure 3.19: Pre-torrubiellone D
Figure 3.20: Biosynthetic pathways from pre-tenellin B to tenellin and/or shunt metabolites
Figure 3.21: Modifications between pre-fusarin C and fusarin C
Figure 3.22: Products from the enzyme ACE1 in A. oryzae
Figure 3.23: Examples of shunt metabolites
Figure 3.24: Structural difference between two torrubiellone compounds
Figure 3.25: Militarinone pathway
Figure 3.26: Effect of the cytochrome P450
Figure 3.27: TENA/APDE effect on PK-NRP precursors pretenellin A and preaspyridone
Figure 3.28: Shunt metabolite transformation from pretenellin to prototenellin-D
Figure 3.29: Transformation from torrubiellone C to torrubiellone B
Figure 3.30: (A) Fischerin; (B) Aspyridone B
Figure 3.31: Torrubiellone E structure
Figure 3.32: Compounds isolated from Isaria sp. NRBC 104353
Figure 3.33: (A) Militarinone D; (B) Militarinone A
Figure 3.34: Transformation from pretenellin-B to tenellin (N-hydroxylation) by TenB cytochrome P450
Figure 3.35: Proposed metabolic pathway for synthesis of torrubiellone A
Figure 4.1: Strategies for new secondary metabolites discovery
Figure 4.2: pTYargeGFP modification to obtain plasmid pTYpromlesseGFP
Figure 4.3: pTYpromlesseGFPbar assembly
Figure 4.4: Morphological differences in Torrubiella sp. BCC2165 grown on PDA
Figure 4.5: PCR confirmation test for eGFP gene
Figure 4.6: eGFP expression in a Torrubiella transformant
Figure 4.7: PromE500 genomic sequence
Figure 4.8: PromD500 genomic sequence
Figure 4.9: Origins of the fragments used in promoter analysis
Figure 4.10: pTYpromtorD500eGFPbar assembly
Figure 4.11: Assembly strategy to generate pEYAtorSeGFP-KO-XmaI
Figure 4.12: PCR insertion analysis of torS-KO in Torrubiella
Figure 4.13: Comparison tree of C6TF and ZnTFs
Figure 4.14: Alignment of the first 50 aa in C6TF and ZnTF
Figure 4.15: Assembly strategy for construction of pTYpromD500eGFPC6TFbarZnTF
Figure 4.16: Fluorescence in Torrubiella sp. BCC2165 using pTpromD500eGFPZnTFBar
Figure 4.17: Pigmentation in TF-transformants Torrubiella colonies
Figure 4.18: Diode array scan of representative TF Torrubiella transformants
Figure 4.19: LC-MS analysis of a representative ZnTF transformant
Figure 4.20: UV absorption peaks of a representative ZnTF transformant
Figure 4.21: ES+ analysis of peaks in a representative ZnTF transformant
Figure 4.22: Primers proposal for the elucidation of a correct insertion
Figure 4.23: Alignment of the amino acid Torrubiella TF sequences to other known TFs
Figure 5.1: Map showing essential features of the multigene expression vectors used in this research
Figure 5.2: Illustration of multigene pathway reconstruction using yeast homologous recombination property and Gateway transfer
Figure 5.3: Plasmid pEYA1eGFP
Figure 5.4: PCR scheme used for torS reconstruction
Figure 5.5: torS assembly strategy in pEYAeGFP plasmid
Figure 5.6: pTYGSargtorC and pTYargtorSeGFPtorC assembly
Figure 5.7: eGFP expression of a representative A. oryzae transformant
Figure 5.8: LC-MS analysis of organic extracts from a representative pTYargtorSeGFPtorC transformant
Figure 5.9: Introns position in torS gene
Figure 5.10: Intron 2 removal strategy in pEYAtorSeGFP
Figure 5.11: Intron 2 removal plasmid strategy
Figure 5.12: Pigmentation of A. oryzae transformed with pTYtorSi2eGFPtorC
Figure 5.13: LC-MS analysis of a representative pTYargtorSi2eGFPtorC transformant
Figure 5.14: Intron 1 elimination strategy
Figure 5.15: MfeI digest and prediction to confirm intron 1 removal in pEYAtorSi2eGFP
Figure 5.16: torSintless expression plasmid and metabolite analysis
Figure 5.17: LC-MS analysis of a representative pTYargtorSintlesseGFPtorC transformant
Figure 5.18: Comparison of torrubiellone D and the proposed precursor structures
Figure 5.19: Construction of pTYargtorSintlesseGFPtorABC
Figure 5.20: LC-MS analysis of a representative pTYargtorSintlesseGFPtorABC transformant
Figure 5.21: Proposal of a torrubiellone C modification
Figure 5.22: pTYargtorSintlesseGFPdmbABC plasmid map
Figure 5.23: LC-MS analysis of organic extracts from a representative pTYargtorSintlesseGFPdmbABC transformant
Figure 5.24: pTYGSadetorDE assembly
Figure 5.25: LC-MS analysis from two representative tenSABC + torD and torE transformants
Figure 5.26: Proposal of the effect of TORD and TORE enzymes in tenellin structure
Figure 5.27: LC-MS analysis of a representative torSC + torE transformant
Figure 5.28: LC-MS analysis of a representative tenSABC + torD transformant
Figure 5.29: Effect of pigmentation in LC-MS analysis of organic extracts of three different transformant colonies
Figure 5.30: UV absorbance spectra for peak eluted at RT = 15.942 min, with peaks of 215.94 and 383.94 from a representative mild yellow torSABCDE transformant
Figure 5.31: Structure comparison between predesmethylbassianin A, torrubiellone D and a pre-torrubiellone D
Figure 6.1: Comparison between different torrubiellone structures

CHAPTER 1

1. INTRODUCTION

1.1 SPECIALIZED METABOLISM

Secondary metabolites are defined as low molecular mass molecules whose production is not essential for the life of the producer organism (Berdy, 2005). This is in contrast to primary metabolites, which are essential for growth and development and are defined as compounds produced during active growth, such as citric acid produced by Aspergillus niger and gluconic acid produced by A. niger and Aspergillus terreus (Calvo et al., 2002).

Many organisms, particularly bacteria, plants and fungi, produce an extremely diverse range of secondary metabolites, including penicillin, cephalosporins and statins, to name but a few (Brakhage et al., 2009). The existence of non-vital metabolites may be explained by their use as communication signals, as in quorum-sensing-like systems (de Salas et al., 2015); as a defence, through compounds that inhibit the growth of competing organisms (Brakhage and Schroeckh, 2011); or in sporulation in bacteria (Hopwood, 1988) and fungi (Bu'Lock, 1961), among other functions.

1.2 FUNGAL SECONDARY METABOLITES

Fungi can be found in almost every known environment on Earth, including woodlands and meadows, thermal springs, sub-Arctic and dry cold desert regions, and even animal dung (Kis-Papo et al., 2001). In 1991, 1.5 million species of fungi were estimated to exist on Earth (Hawksworth, 1991), although only 70,000 of them were known at the time. More recent estimates, some arising from the use of high-throughput sequencing methods, suggest that approximately 5.1 million fungal species exist (Taylor et al., 2010).

Several species of filamentous fungi are known to produce compounds of varying chemical complexity that can enhance the survival of the organism but are not needed for development and basal growth. Secondary metabolites are mainly produced at a late stage in fungal development. Alternatively, they may enhance the survival rate of spores, as in the case of sterigmatocystin synthesis during the production of asexual spores of Aspergillus nidulans, and melanin production for UV protection of spores in Alternaria alternata (Calvo et al., 2002).

Polyketides (PKs) and non-ribosomal peptides (NRPs) are considered among the most significant classes of secondary metabolites produced by plants, fungi, bacteria and marine organisms. Their synthesis is initiated by polyketide synthases (PKSs) or non-ribosomal peptide synthetases (NRPSs), respectively (Cox, 2007). Some NRP examples include cyclosporin, an immunosuppressant produced by Beauveria nivea (Svarstad et al., 2000), and antibiotics such as cephalosporins, synthesized by Cephalosporium acremonium (Foye et al., 2008), and penicillins, produced by the genera Penicillium, Cephalosporium and Aspergillus (Felnagle et al., 2008). Well-known PKs include the cholesterol-lowering compound lovastatin, produced by A. terreus (Hoffmeister and Keller, 2007), and aflatoxins, obtained from Aspergillus flavus and Aspergillus parasiticus (Wannop, 1961). Structures of these compounds are shown in Figure 1.1.

Figure 1.1: Examples of NRPs and PKs. (A) Cyclosporin (B) Cephalosporin (C) Penicillin (D) Lovastatin (E) Aflatoxin

Moreover, some compounds are produced by a PKS-NRPS fusion enzyme, such as fusarin C (Song et al., 2004), aspyridones (Bergmann et al., 2010), tenellin (Eley et al., 2007) and desmethylbassianin (Heneghan et al., 2011), among others. Figure 1.2 shows a selection of structurally-related PK-peptide molecules.

Figure 1.2: Examples of structurally-related molecules produced by PKS-NRPS fusions. (A) Tenellin (B) Bassianin (C) Farinosone A/B (D) Militarinone D (E) Pyridovericin (F) Torrubiellone C

Fungi produce a wide range of PKs, from simple structures, such as orsellinic acid (from Aspergillus species, Figure 1.3A), to some of the most complex known to date, such as fumonisin B1 from Gibberella fujikuroi. Other examples include 6-methylsalicylic acid, isolated from Penicillium patulum (Figure 1.3B; Minto and Townsend, 1997), T-toxin produced by Cochliobolus heterostrophus (Kroken et al., 2003) and lovastatin (Figure 1.3C) from A. terreus.

Figure 1.3: Examples of polyketide structures. (A) Orsellinic acid (B) 6-methylsalicylic acid (6-MSA) (C) Lovastatin

Despite the diversity of structure, the polyketide basic scaffold is formed from small carboxylic acids such as acetate, propionate and, rarely, butyrate, in a very similar way to fatty acid biosynthesis, as described by Birch in 1953 (Birch and Donovan, 1953).

1.3 BIOSYNTHESIS

Polyketide synthesis closely resembles fatty acid synthesis (Birch and Donovan, 1953), using the same type of chemical reactions, but with one main difference: while both PKSs and fatty acid synthases (FASs) must be able to control the chain length (i.e. the number of synthetic iterations), only PKSs can control the selection of the starter/extender unit and the degree of reduction in each cycle. In addition, fungal PKSs can determine the degree of methylation of the chain, depending on the presence of the corresponding domain. This programming feature is the key to understanding and exploiting PK synthesis. In bacterial modular PKSs, each condensation cycle is governed by a single module that contains all the required catalytic domains (Staunton and Weissman, 2001), with the program (usually) defined by the order and composition of the modules. In fungal type I PKSs, in contrast, the program is encoded within the PKS itself (Cox, 2007), and it is currently impossible to predict the final polyketide from knowledge of the (predicted) structure of the synthase alone.

1.3.1 FATTY ACID BIOSYNTHESIS

The similarity between polyketide and fatty acid synthesis was discovered by using isotopically labelled precursors, which showed that a decarboxylative Claisen condensation (Figure 1.4) performs the carbon-carbon bond-forming step in both PK and FA biosynthesis. This reaction occurs between an acyl thiolester and a malonyl thiolester, and is catalysed by a β-ketoacyl synthase (KS) enzyme. This enzyme activity must be present in all FASs and PKSs (Cox et al., 2002).

Figure 1.4: Claisen ester condensation. Figure obtained from ChemTube3D, University of Liverpool. EtO-: Alkoxide base.

Fatty acid synthases also use an acyl carrier protein (ACP), which carries the malonyl extender units and, at the same time, temporarily holds the growing acyl chain. The apo-ACP (the inactive form of the ACP) requires post-translational modification by the addition of phosphopantetheine (PP), carried out by a PP transferase (or holo-ACP synthase, ACPS) (Cox et al., 2002). Most FAS and PKS proteins also require an acyl transferase (AT) activity, which transfers the acyl groups from CoA to the KS and ACP components.

The fatty acid synthesis cycle begins with the transfer of the acetyl moiety to the ketoacyl synthase via an ACP-bound intermediate, a stage called initiation. A malonyl thiolester is transferred in a similar way to an ACP (substrate loading) and joined to the ketosynthase-bound acyl chain, a process described as the chain extension stage. The product then passes through several tailoring steps, performed by other domains contained in the module (β-carbon processing). The first tailoring step is the reduction of the β-keto group to a secondary alcohol by a β-ketoacyl reductase (KR); a water molecule is then removed by a dehydratase (DH), forming an unsaturated thiolester, and finally an enoyl reductase (ER) produces a fully saturated thiolester. The resultant saturated acyl chain is returned to the KS to initiate the next cycle, and the process continues until the fully elaborated acyl chain is released (chain termination) by a thiolesterase (TE) (Brignole et al., 2009; Figure 1.5).

Figure 1.5: The fatty acid biosynthetic cycle. MAT: Malonyl-acetyl transferase; ACP: Acyl carrier protein; KS: Ketosynthase; KR: Ketoreductase; DH: Dehydratase; ER: Enoyl reductase; TE: Thiolesterase. From Brignole et al., 2009.
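The cycle described above can be traced with a short Python sketch. It is a minimal illustration, not a biochemical model: the function name `fatty_acid_cycle` and the carbon bookkeeping are assumptions introduced here, and the only chemistry encoded is that each decarboxylative condensation adds two carbons (a three-carbon malonyl unit minus one CO2) before the KR, DH and ER steps fully reduce the β-carbon.

```python
# Illustrative sketch only: names and bookkeeping are assumptions, not from the thesis.
TAILORING_STEPS = ("KR", "DH", "ER")  # order of beta-carbon processing in FA synthesis


def fatty_acid_cycle(n_extensions, starter_carbons=2):
    """Trace chain length through iterative fatty acid synthesis.

    starter_carbons=2 corresponds to an acetyl starter unit.
    Returns the final chain length and a per-cycle log.
    """
    chain_length = starter_carbons  # initiation: starter unit loaded onto the KS
    log = []
    for cycle in range(1, n_extensions + 1):
        # Chain extension: malonyl (3 C) condenses with loss of CO2 (1 C).
        chain_length += 3 - 1
        # Beta-carbon processing: every tailoring step runs in each FAS cycle.
        log.append((cycle, chain_length, TAILORING_STEPS))
    # Chain termination: the TE releases the fully elaborated acyl chain.
    return chain_length, log


if __name__ == "__main__":
    length, steps = fatty_acid_cycle(7)
    for cycle, carbons, tailoring in steps:
        print(f"cycle {cycle}: C{carbons} after {' -> '.join(tailoring)}")
    print(f"released chain: C{length}")
```

With an acetyl starter and seven extension cycles the sketch ends at a C16 chain, the length of palmitic acid.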

There are two types of FAS enzymes. Type I FASs, found in animals and fungi, are multifunctional proteins in which one (or sometimes two) polypeptides include the sequences for all the KS, ACP, AT, KR, DH, ER and TE functional domains (Figure 1.6). They are encoded by large genes with single open reading frames.

In contrast, type II FASs are composed of a group of proteins, each encoded by its own gene, and are found in plants and bacteria. Both types of FAS are iterative, requiring repeated cycles of each catalytic activity to obtain the mature fatty acid (Cox, 2007).

PK biosynthesis differs from FA synthesis in that full β-carbon processing is not obligatory and may be partially or fully absent. In addition, further modifications may occur to the growing polyketide chain.

Three basic domains are required for the PKS elongation module, much as in fatty acid biosynthesis: an AT domain for transfer of the starter unit and selection of the extender unit, an ACP for extender unit loading and a KS domain for decarboxylative condensation of the extender unit (usually malonyl-CoA or methylmalonyl-CoA) with an acyl thiolester. As in FA synthesis, the resulting β-ketothiolester can be reduced by the NADPH-dependent KR, the alcohol dehydrated by the DH and the resulting olefin hydrogenated by the ER. Also, in fungal polyketide synthesis the polyketide chain can be methylated by a methyltransferase using a methyl group from S-adenosylmethionine (SAM); this probably occurs after KS activity, resulting in a methyl-β-ketothiolester (Brakhage and Schroeckh, 2011).

Figure 1.6: Reductive steps catalysed by PKS domains. KR reduces the ketone group to a hydroxyl group. DH, acting on the KR product, removes the OH by dehydration to produce a double bond. ER saturates the double bond. From Nguyen et al., 2008.
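Because β-carbon processing is optional in PKSs, the functional group left at each β-carbon depends on which reductive activities act in that cycle. The Python sketch below makes that dependency explicit. The function `beta_carbon_state` and its string labels are hypothetical, introduced here for illustration; the sketch assumes that DH can only act on the KR product and ER only on the DH product.

```python
# Illustrative sketch only: function name and labels are assumptions.
def beta_carbon_state(kr=False, dh=False, er=False):
    """Return the functional group left at the beta-carbon after one cycle.

    kr, dh, er: whether each reductive activity acts in this cycle.
    Each step needs the product of the previous one, so a DH or ER
    without an upstream KR (or DH) has nothing to act on.
    """
    if not kr:
        return "ketone"      # no processing: beta-keto group retained
    if not dh:
        return "hydroxyl"    # KR only: secondary alcohol
    if not er:
        return "alkene"      # KR + DH: double bond
    return "methylene"       # KR + DH + ER: fully saturated, as in FA synthesis


if __name__ == "__main__":
    for flags in [(False, False, False), (True, False, False),
                  (True, True, False), (True, True, True)]:
        print(flags, "->", beta_carbon_state(*flags))
```

A fatty acid synthase corresponds to the last case in every cycle, whereas a PKS may stop at any of the four states.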

1.3.2 POLYKETIDE BIOSYNTHESIS

Orsellinic acid (OSA; Figure 1.3), which is considered the simplest tetraketide because no reduction steps are required in its synthesis, was one of the first fungal polyketides discovered, initially isolated from Penicillium madriti (Gaucher and Shepherd, 1968). The genes responsible for OSA synthesis were subsequently isolated from A. nidulans (Sanchez et al., 2010). Orsellinic acid synthase (OSAS) is an iterative type I PKS, encoded by orsA; upregulation of the genes orsA to orsE was detected when A. nidulans was co-cultivated with a soil-dwelling actinomycete (Schroeckh et al., 2009). Other PKS biosynthetic genes have been isolated from different fungal species, and a common pattern of domain organization has been described, very similar to that of fatty acid biosynthesis (Cox, 2007).

Type I PKSs are large multifunctional proteins with individual functional domains, which are found in bacteria and fungi, while type II PKSs are composed of individual proteins and are only found in bacteria. There are also Type III PKSs (stilbene and chalcone synthases), described as KS proteins, without any AT, ACP, KR, DH, ER or TE domains. These PKSs can be found in plants, bacteria and fungi (Cox, 2007).

Non-reducing (NR) PKSs produce the basic polyketide scaffold without the participation of any reductive domains (KR, DH or ER). Only a few NR-PKSs possess a C-methylation (C-Met) domain; one example is found in the fungal plant pathogen Acremonium strictum, which produces the polyketide 3-methylorcinaldehyde (MO). MO is synthesized by the PKS MOS (Figure 1.7), which has a C-Met domain but no reductive activities from either KR or ER domains. MO is suggested to be an intermediate in the production of xenovulene A (Bailey et al., 2007; Figure 1.8).

Another example of an NR-PKS is norsolorinic acid synthase (NSAS), which produces an intermediate in the production of the carcinogenic compound aflatoxin B1. NSA is an octaketide initiated by a hexanoate starter, isolated from Aspergillus parasiticus (McKeown et al., 1996). The synthase gene, named pksA, is homologous to the wA gene from A. nidulans. The PKS contains the domains KS, AT and ACP, but no KR or ER. By bioinformatics, another two domains were found: a product template (PT), which could possibly be involved in the stabilization, folding and control of the chain length, and a Starter-Unit ACP transacylase (SAT), involved in the selection of the starter unit (Cox and Simpson, 2009).

Figure 1.7: MOS domain structure (Bailey et al., 2007)

Figure 1.8: MOS to xenovulene A conversion. R-catalysed reductive release of 3-methylorcinaldehyde in the biosynthesis of the polyketide xenovulene A (Du and Lou, 2010)

Another example of an NR-PKS is the tetrahydroxynaphthalene synthase THNS. Its product, 1,3,6,8-THN, is a pentaketide formed by condensation between acetate and four malonate molecules; it is a key intermediate in melanin production, which fungal pathogens of different crops use for plant penetration and pathogenicity (Wheeler and Bell, 1988). THNS is found in several fungi, such as Colletotrichum lagenarium, from which the corresponding gene, pksA, was studied by heterologous expression in A. oryzae (Fujii et al., 2000).

Partially reducing PKSs (PR-PKSs) strongly resemble mammalian FASs and are composed of an N-terminal KS followed by AT, DH, KR and ACP domains. The main difference from NR-PKSs is that PR-PKSs lack SAT and PT domains, and they also appear not to need a thioesterase–Claisen cyclase (CLC-TE) domain (Cox and Simpson, 2009). A prime example of a PR-PKS is 6-methylsalicylate synthase (MSAS) from Penicillium patulum (Dimroth et al., 1970). Its domain structure, as described above, was deduced from the gene sequence; msas was the first fungal PKS gene to be isolated (Beck et al., 1990; Figure 1.9, A).

Figure 1.9: Domain architectures for fungal iterative type I PKSs deduced from gene sequences. (A) General structure of a NR-PKS (B) Lovastatin nonaketide synthase (LNKS) and Lovastatin diketide synthase (LDKS), both HR-PKS; note that the inactivity of the LNKS ER is complemented by a trans-acting ER encoded by lovC (C) General structure of a highly reduced PKS-NRPS (Cox and Simpson, 2009).

Finally, there are highly reducing PKSs (HR-PKSs; Figure 1.9, D), which have an N-terminal KS domain followed by AT and DH domains; C-Met and ER domains are sometimes also present. After these domains comes a KR domain, and the PKS often terminates with an ACP. There appear to be no domains similar to the PT or SAT domains of the NR-PKSs (Cox, 2007). An example of an HR-PKS product is lovastatin, produced by Aspergillus terreus; its importance lies in its activity as a potent inhibitor of cholesterol biosynthesis, specifically targeting 3-hydroxy-3-methylglutaryl (HMG) CoA reductase (Tobert, 2003).

Lovastatin synthesis requires a decalin produced by cyclization of a highly reduced nonaketide precursor, linked to a diketide-derived 2-methylbutyryl moiety (Hendrickson et al., 1999; Figure 1.9, B).

In general terms, the conserved characteristics of polyketide synthases and PKS-NRPS hybrid systems have been used as the basis for genomics-based discovery of the biosynthetic pathways of this class of natural products.
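The domain-based distinctions drawn in this section can be condensed into a simple decision rule. The following Python sketch is illustrative only: the function name `classify_fungal_pks` and the simplified rules are assumptions introduced here, derived solely from the domain descriptions above, and do not reproduce any published classifier. In particular, an HR-PKS that lacks both ER and C-Met domains cannot be told apart from a PR-PKS by this rule.

```python
# Simplified rule set, assumed from the text above:
#   NR-PKS: SAT and/or PT present, or no reductive domain (KR, DH, ER) at all
#   HR-PKS: reductive domains plus ER and/or C-Met
#   PR-PKS: KR (with DH) but neither ER, C-Met, SAT nor PT
REDUCTIVE = {"KR", "DH", "ER"}


def classify_fungal_pks(domains):
    """Return 'NR', 'PR', 'HR' or 'unknown' for a list of domain names."""
    present = {d.upper().replace("-", "") for d in domains}
    # Iterative type I PKSs are expected to carry at least KS and ACP.
    if not {"KS", "ACP"} <= present:
        return "unknown"
    if (present & {"SAT", "PT"}) or not (present & REDUCTIVE):
        return "NR"
    if present & {"ER", "CMET"}:
        return "HR"
    if "KR" in present:
        return "PR"
    return "unknown"


if __name__ == "__main__":
    examples = {
        "SAT/PT-containing synthase": ["SAT", "KS", "AT", "PT", "ACP"],
        "MSAS-like": ["KS", "AT", "DH", "KR", "ACP"],
        "HR-like": ["KS", "AT", "DH", "C-Met", "ER", "KR", "ACP"],
    }
    for label, doms in examples.items():
        print(label, "->", classify_fungal_pks(doms))
```

A rule of this kind is only a first-pass filter; the programming of an iterative PKS (chain length, methylation pattern, degree of reduction) cannot be read from the domain list alone.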

1.3.3 PKS-NRPS HYBRID SYSTEMS

Polyketides can be fused to amino acids to form an extensive range of biologically active compounds, such as fusarin C, tenellin and the aspyridones. Fusarin C, a tetramethylated heptaketide fused to homoserine, is produced by certain strains of Fusarium moniliforme and Fusarium venenatum (Song et al., 2004). Tenellin, produced by Beauveria bassiana, is a dimethylated pentaketide fused to tyrosine, and the aspyridones, produced by Aspergillus nidulans, are dimethylated tetraketides fused to a tyrosine moiety (Table 1.1).

Fusarin C, a toxic metabolite isolated from Fusarium moniliforme and F. venenatum, is synthesized by fusarin synthase (FUSS), which is composed of an HR-PKS fused to an NRPS module (Song et al., 2004). FUSS possesses an inactive ER domain, but ER activity is not needed for fusarin synthesis. In contrast, for tenellin, produced by Beauveria bassiana, the ER activity is provided by an ORF (tenC) included in the tenellin gene cluster, which is homologous to the lovC gene of the lovastatin gene cluster (Halo et al., 2008).

A typical NRPS module contains at least three domains: an adenylation domain (A), which activates the amino acid; a peptidyl carrier protein (PCP, also known as a thiolation domain), which carries the cofactor 4′-phosphopantetheine (4′PP) to which the activated amino acid is attached (Figure 1.10, B); and finally a condensation domain (C), which catalyses the formation of the peptide bond. As in PKS biosynthesis, some optional domains have also been described, such as epimerization and methyltransferase domains (Brakhage, 2013; Figure 1.10, A).

Figure 1.10: PKS and NRPS domain structure. (A) Comparison of polyketide synthase and non-ribosomal peptide synthetase modules (Brakhage, 2013). (B) Domain organisation of fungal PKS-NRPS hybrids, and biosynthesis of PK-amino acid hybrids (Boettger and Hertweck, 2013).

Table 1.1: Examples of PKS-NRPS systems. Adapted from Boettger and Hertweck, 2013.

COMPOUND | PRODUCER(S) | BIOLOGICAL ACTIVITY | SYNTHASE GENE | REF
Aspyridone | A. nidulans | cytotoxic | apdA | (Wasil et al., 2013)
Chaetoglobosin A | Penicillium expansum | actin filament capping | cheA | (Schumann and Hertweck, 2007)
Cyclopiazonic acid | A. oryzae, A. flavus | mycotoxin, inhibition of Ca2+-ATPase | cpaS | (Chang et al., 2009)
Cytochalasin E | A. clavatus | anti-angiogenic | ccsA | (Qiao et al., 2011)
Equisetin | F. heterosporum | inhibition of HIV-1 integrase | eqiS | (Sims et al., 2005)
Fusarin C | G. moniliformis, G. zeae | mycotoxin, mycoestrogenic | fusA | (Song et al., 2004)
Isoflavipucine | A. terreus | mycotoxin | ATEG_00325 | (Gressler et al., 2011)
Pseurotin A | A. fumigatus | neuritogenic, immunosuppressive | psoA | (Maiya et al., 2007)
Tenellin | B. bassiana | entomopathogenic, membrane ATPase inhibition | tenS | (Eley et al., 2007)
Desmethylbassianin | B. bassiana | entomopathogenic, membrane ATPase inhibition | dmbS | (Fisch et al., 2011)
12,13-Dihydroxymagnaporthepyrone | M. grisea | avirulence factor | ace1 | (Song et al., 2015)

1.4 TORRUBIELLA SPP: CHARACTERISTICS AND COMPOUNDS

The genus Torrubiella is a member of the Clavicipitaceae s. lat. group (Division: Ascomycota; Class: Sordariomycetes; Order: Hypocreales), initially described in 1885 (Boudier, 1885) and known for its obligate interactions with plants, animals and other fungi. The species of this group are described as possessing cylindrical asci, thickened ascus apices and filiform ascospores, which in some species disarticulate into part-spores. Torrubiella (Figure 1.11) is recognized as a fungal pathogen of arthropods, mainly spiders (Arachnida) and scale insects, mostly soft scales (Coccidae) and armoured scales (Diaspididae). Fungal infection of insects starts with attachment to the cuticle and subsequent degradation of the hydrocarbons composing it, followed by penetration and proliferation inside the insect host (Wanchoo et al., 2009).

Figure 1.11: Torrubiella morphological characteristics. (A) Torrubiella sp. (Johnson et al., 2009) (B) Torrubiella sp. BCC2165, grown on potato dextrose agar (PDA).

It is believed that the primary targets are female adults, because of their lack of movement, but sometimes, as in the case of infection by Torrubiella tenuis, the scale insect is destroyed beyond identification (Hywel-Jones, 1993). Because of its morphological characteristics and entomopathogenic lifestyle, Torrubiella has been related to the genus Cordyceps. Other genera have also been associated with Torrubiella, such as Gibberella (producer of pyrrolidines and steroids), Paecilomyces (which synthesizes paeciloside A) and Verticillium (producer of lowdenic acid), among others (Johnson et al., 2009). Presently, the genus Torrubiella comprises 83 species (Index Fungorum http://www.speciesfungorum.org).

A current area of interest in these types of fungi is their production of novel bioactive compounds. For example, Cordyceps unilateralis BCC1869 produces anti-malarial naphthoquinone compounds with IC50 values of 2.5-10.1 µg/ml (Figure 1.12, A-F), and Beauveria bassiana FO-6979 produces the antiatherogenic beauveriolides I (Figure 1.12, G) and III (Figure 1.12, H). Another interesting compound produced by B. bassiana is beauvericin (Figure 1.12, I), an ionophoric cyclodepsipeptide, which exhibits insecticidal and antibiotic activities (Hamill et al., 1969).

Figure 1.12: Different bioactive substances obtained from insect pathogenic fungi. Obtained from Isaka et al., 2005. (A)-(F) Naphthoquinone compounds (G) Beauveriolide I (H) Beauveriolide III (I) Beauvericin.

A few compounds isolated from Torrubiella species have been described (Table 1.2), such as paecilodepsipeptide A and a naphthopyrone glucoside from Torrubiella luteorostrata BCC9617 (Isaka et al., 2007), torrubiellutins A-C from Torrubiella luteorostrata BCC12904 (Pittayakhajonwut et al., 2009) and isocoumarin glucosides from T. tenuis BCC12732 (Kornsakulkarn et al., 2009).

Table 1.2: List of compounds described in the genus Torrubiella.

COMPOUND | TORRUBIELLA SPECIES | STRUCTURE
Paecilodepsipeptide A | Torrubiella luteorostrata BCC9617 |
Naphthopyrone glucoside | Torrubiella luteorostrata BCC9617 |
Torrubiellutins A-C | Torrubiella luteorostrata BCC12904 | A (1): R1 = R2 = OH, R3 = H; B (2): R1 = R2 = OAc, R3 = H; C (3): R1 = OAc, R2 = OH, R3 = H
Isocoumarin glucosides | T. tenuis BCC12732 |

In 2010, the profiles of 16 different Torrubiella strains were analysed, and the compounds extracted from Torrubiella sp. BCC2165 showed a unique 1H-NMR spectroscopic profile. The strain was fermented and its extract analysed, and the alkaloids obtained were named torrubiellones A-D (Isaka et al., 2010). Torrubiellone A (Figure 1.13, A) structurally resembles militarinone A (from Cordyceps militaris), differing only in a hydroxymethyl group and in the length of the polyketide chain.

Figure 1.13: Comparison of torrubiellone structures with militarinone A. (A) Torrubiellone A (X = OH) (B) Torrubiellone B (X = H) (C) Torrubiellone C (D) Torrubiellone D (Isaka et al., 2010).

Torrubiellones were assessed against Plasmodium falciparum K1 (malaria pathogen), Mycobacterium tuberculosis H37Ra (tuberculosis pathogen), non-malignant Vero cells and three cancer cell lines. Torrubiellone A exhibited antimalarial activity, with an IC50 value of 8.1 µM, accompanied by very weak cytotoxic activity (Isaka et al., 2010).

The full chemical synthesis of torrubiellones B (Ding et al., 2014) and C (Jessen et al., 2011) has been reported, but to date, no total synthesis or other chemical approach has been reported to obtain torrubiellones A and D.

Given the structural similarities between the alkaloids tenellin, militarinone and torrubiellone, it is likely that the biosynthetic pathways of these compounds are very similar (Halo et al., 2008; Schmidt et al., 2003). Also, since the sequences of the genes responsible for tenellin (Eley et al., 2007; Halo et al., 2008) and desmethylbassianin (Heneghan et al., 2011) biosynthesis are already known, it can be predicted that sequencing the Torrubiella genome and analysing it in silico will lead to the discovery of the torrubiellone A biosynthetic gene cluster. This in turn could lead to an alternative, biosynthetic route to torrubiellone A production via heterologous gene expression.

1.5 MALARIA AND ANTIMALARIAL ACTIVITIES

Malaria is an infectious disease caused by parasitic protozoans of the genus Plasmodium, namely P. falciparum, P. vivax and P. ovale (World Health Organization). The most common symptoms are fatigue, vomiting, fever and headaches, but in the most serious cases it can cause yellow skin, coma, seizures and even death. The disease is transmitted by mosquito bites, with symptoms usually beginning 10 to 15 days after the bite (Caraballo and King, 2014). In 2000, there were between 350 and 500 million malarial infections, resulting in at least one million deaths, mostly of children under 5 years old and pregnant women (Pearce, 2007). Since 2000, the number of deaths has been falling by 4% per year, mainly because of the use of insecticide-treated nets and artemisinin-based combination therapies (ACTs). However, this rate of decline is considered too slow by the World Health Organization (WHO): meeting the Global Malaria Action Plan (GMAP) milestone of a 10-fold reduction in malaria incidence by 2030, that is, a 90% reduction in malarial infections, requires cases to fall by 10 to 16% each year (Atta and Zamani, 2008).
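The relation between a 10-fold overall reduction and a constant yearly percentage decline can be made explicit with a short calculation. The Python sketch below assumes a constant compound rate of decline, which is a simplification introduced here; the function names are hypothetical, and the time horizon is left as a parameter because the text does not state the starting year used for the 10 to 16% figure.

```python
import math


def annual_reduction_needed(fold_reduction, years):
    """Constant yearly fractional decline giving `fold_reduction` after `years`."""
    remaining = 1.0 / fold_reduction          # e.g. 0.1 for a 10-fold reduction
    return 1.0 - remaining ** (1.0 / years)


def years_needed(fold_reduction, annual_rate):
    """Years required to reach `fold_reduction` at a constant yearly decline."""
    return math.log(fold_reduction) / -math.log(1.0 - annual_rate)


if __name__ == "__main__":
    for rate in (0.04, 0.10, 0.16):
        print(f"{rate:.0%} per year -> {years_needed(10, rate):.1f} years for a 10-fold drop")
    # Assumed 15-year horizon, for illustration only
    print(f"15 years -> {annual_reduction_needed(10, 15):.1%} per year")
```

Under this assumption, a 4% yearly decline would need roughly 56 years to produce a 10-fold reduction, whereas declines of 10% and 16% per year would need about 22 and 13 years respectively, which is consistent with the quoted range being tied to a 2030 target.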

Based on historical knowledge of medicinal plants, two types of compounds have been developed as antimalarials, namely quinines and artemisinins, but the inappropriate use of some compounds has resulted in drug resistance. This is now the main hurdle to overcome in order to eradicate the disease definitively (World Health Organization). Currently, the gold standard for malaria treatment is a 3-day course of an ACT, but ideally only one exposure should be needed. This should be administered in the presence of a health worker, a critical point in the campaign for malaria eradication (Wells et al., 2015). The ideal requirements for malaria treatment have been termed SERCaP (Single Exposure Radical Cure and Prophylaxis) (Alonso et al., 2011), but no single chemical has shown these properties. Instead, a combination of two or more molecules attacking different targets has been needed. Optimally, novel molecules should shorten the duration of treatment and prevent transmission of the parasite back to the insect vector; at the same time they should be safe for children and expectant mothers (Wells et al., 2015).

Up to 2015, no fully successful vaccine had been developed for protozoan diseases (Figure 1.14). Currently, the most advanced malaria vaccine, Mosquirix, has been effective in only 55% of children treated and, worse, in only 33% of infants exposed to the disease. It is also species-specific for Plasmodium falciparum (Birkett, 2015).

Figure 1.14: Historical timeline of the discovery of antimalarial therapeutics (Wells et al., 2015).

Artemisinin derivatives are capable of killing the parasite quickly (Table 1.3), reducing the parasite concentration by as much as 10,000-fold in 48 hours (Ashley et al., 2014), but to ensure a complete cure, the drug must be active for at least three or four parasite life cycles. This means that to be effective, the drug needs to maintain a plasma concentration above the minimal inhibitory concentration (MIC) for more than a week (Bethell et al., 2011).
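The link between "three or four parasite life cycles" and "more than a week" follows from the length of the asexual blood-stage cycle. The Python sketch below assumes a 48-hour cycle for P. falciparum (a value not stated in the text above) and, purely for illustration, a log-linear kill at the maximal quoted rate of 10,000-fold per 48 hours; the function names are hypothetical.

```python
CYCLE_HOURS = 48  # assumption: one asexual blood-stage cycle of P. falciparum


def coverage_days(cycles, cycle_hours=CYCLE_HOURS):
    """Days of drug exposure needed to span the given number of cycles."""
    return cycles * cycle_hours / 24


def remaining_fraction(hours, fold_per_cycle=10_000, cycle_hours=CYCLE_HOURS):
    """Fraction of parasites left after `hours`, assuming log-linear killing."""
    return fold_per_cycle ** (-(hours / cycle_hours))


if __name__ == "__main__":
    for cycles in (3, 4):
        print(f"{cycles} cycles -> {coverage_days(cycles):.0f} days of coverage")
    print(f"fraction remaining after 48 h: {remaining_fraction(48):.0e}")
```

With these assumptions, three cycles correspond to 6 days and four cycles to 8 days, so covering four cycles requires exposure for more than a week. Real parasite clearance is not strictly log-linear, so the second function should be read as an upper bound on the speed of killing.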

To eradicate malarial disease, one key objective is to block the transmission cycle, i.e. to stop parasite-carrying humans infecting the mosquitoes. Only the sexual stage of the parasite is capable of surviving in the blood stream of the mosquito, so discovery of an antimalarial compound that kills or selectively disrupts the sexual stages of the parasite would stop its spread (Trenholme et al., 2014).

Table 1.3: Current compounds used in anti-malarial treatment (Wells et al., 2015).

DRUG | APPROVAL DATE
ARTEMETHER-LUMEFANTRINE | 2001 (Adult) and 2009 (Paediatric)
ARTESUNATE-AMODIAQUINE | Approved in Morocco in 2007
DIHYDROARTEMISININ-PIPERAQUINE | 2011, prequalified
PYRONARIDINE-ARTESUNATE | 2012. Approved by Korea in 2011, for use in Vietnam and Cambodia in 2014
ARTESUNATE-MEFLOQUINE | 2012, prequalified

The ACT treatment is currently composed of an artemisinin derivative (artesunate, dihydroartemisinin or artemether, Table 1.3), together with a 4-aminoquinoline or aminoalcohol. One of the biggest concerns about artesunate is that it is derived from plants, with an 18-month lead time between demand for the material and its supply. The other major concern is the high price of manufacturing artesunate; this has led to the design of the compound OZ277 (Table 1.4). This has a production cost of $800 per kilo, and is currently in clinical studies (Phase II) (Wells et al., 2015).

Table 1.4: Anti-malarial drugs under study (Wells et al., 2015).

DRUG IN STUDY | COMMON NAME
OZ277 | Arterolane
KAE609 | Cipargamin
DSM265 | Dihydroorotate dehydrogenase inhibitor

For any new compound to be worth investigating it should make a significant difference, in comparison to the drugs already described, in one of the following: safety, activity against resistant strains, compliance or reduced cost (Coteron et al., 2011).

A current way to identify novel molecules is by phenotypic screening. Of a total of nearly 6 million compounds screened to date, more than 25,000 showed promising results, with an IC50 of 1 µM or below against P. falciparum. The advantage of the technique is the relatively low cost of screening compared with first triaging compounds and, in some cases, attempting to elucidate a structure-activity relationship (Wells et al., 2015). A group of 400 compounds selected from the original 25,000 comprises the “Malaria Box” (Spangenberg et al., 2013), which is freely available to 200 research groups. Of these compounds, 7% have been shown to target the Na+-ATPase 4 ion channel PfATP4, and one of them, KAE609, is already in a Phase I trial (Leong et al., 2014), killing at a faster rate than artesunate and maintaining an adequate plasma concentration for more than a week. KAE609 is seven times more potent than artesunate and 40 times more potent than 4-aminoquinolines (Table 1.4). In total, the screening approach has identified seven novel targets as well as others already known (Table 1.5).
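The figures quoted above describe a screening funnel whose proportions are easy to lose in prose. The short Python sketch below simply restates them as ratios; it treats "nearly 6 million" and "more than 25,000" as exact values, which is an approximation made here, and the function name is hypothetical.

```python
def screening_funnel(screened=6_000_000, hits=25_000,
                     malaria_box=400, pfatp4_fraction=0.07):
    """Summarise the phenotypic screening funnel using the rounded figures in the text."""
    return {
        "hit_rate": hits / screened,                      # fraction with IC50 <= 1 uM
        "box_fraction_of_hits": malaria_box / hits,       # share of hits in the Malaria Box
        "pfatp4_compounds": round(malaria_box * pfatp4_fraction),
    }


if __name__ == "__main__":
    funnel = screening_funnel()
    print(f"hit rate: {funnel['hit_rate']:.2%}")
    print(f"Malaria Box as share of hits: {funnel['box_fraction_of_hits']:.1%}")
    print(f"Malaria Box compounds targeting PfATP4: {funnel['pfatp4_compounds']}")
```

On these rounded numbers the hit rate is about 0.4%, the Malaria Box represents 1.6% of the hits, and 7% of the box corresponds to 28 compounds.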

Table 1.5: Recently validated antimalarial targets. Six of them have been clinically validated (Wells et al., 2015).

TARGET | TARGET NAME | KEY INVESTIGATIVE MOLECULES | CLINICALLY VALIDATED?
PfATP4 | Na+-ATPase 4 | KAE609 | Yes
PfPI4K | Phosphatidylinositol-4-kinase | MMV390048 | Yes
PfDHFR | Dihydrofolate reductase | P218 | Yes
PfDXR | DXP reductoisomerase | Fosmidomycin | Yes
PfDHODH | Dihydroorotate dehydrogenase | DSM265 | Yes
PfFP2-3 | Falcipain cysteine proteases 2-3 | Falcitidin | No
PfHDAC1 | Histone deacetylase | SB939, trichostatin A | No
PfCYTbc1 | Cytochrome bc1 | Decoquinate, GSK932121 | Yes
PfCHT1,2,4 | Aspartic protease plasmepsins I, II, IV | CWHM-117, TCMDC-134674 | No
PfCARL | Cyclic amine resistance locus | KAF156 | No

DXP: 1-deoxy-D-xylulose 5-phosphate

Besides complete chemical synthesis, the search for new drugs also proceeds by screening natural products, considering that malaria was previously treated only with herbal medicinal products.

Malaria is considered to be a disease associated with market failure. This is because the cost of developing new drugs is very high, while the price at which they can be sold has to be low to be affordable to patients in the developing world. These facts do not offer a commercial incentive to pharmaceutical companies, and alternative approaches may be necessary. One such approach may include the discovery and cost-effective production of novel anti-malarial natural products.

1.6 GENE MINING APPROACH FOR DISCOVERING NEW NATURAL PRODUCTS

Fungi are known to be a rich source of bioactive compounds, but partial and full genome sequencing has revealed the presence of more secondary metabolite gene clusters than compounds recorded. This indicates the potential to source vastly more bioactive substances than previously imagined possible. Genome mining has already proved to be an excellent tool for cloning known fungal PKS genes together with the neighbouring genes composing the gene cluster (Kroken et al., 2003). Current discovery techniques have now moved on to whole-genome sequencing projects, generating information that can be used to discover novel compounds and/or fill some missing key steps in biosynthetic pathways of compounds already known (Crawford and Clardy, 2012). In some cases, the gene clusters identified can be correlated with compounds known to be produced by the organism, but most clusters discovered this way are cryptic.

Given the small number of fungal genome sequences determined to date and their large content of cryptic secondary metabolism gene clusters, there must be a wide range of novel secondary metabolites to discover, many with potential application in the pharmaceutical field (Bergmann et al., 2010). The basis of this approach is the initial identification of a main synthase gene (it could be a PKS, NRPS or terpene synthase), and prediction of the function of neighbouring genes. This type of analysis can be done by software packages such as FungiFun (Priebe et al., 2011) and antiSMASH (Medema et al., 2011), which predict secondary metabolite gene clusters within genomic sequences. In Aspergillus nidulans, at least 54 gene clusters related to secondary metabolite production have been identified, but only some of them have been linked to a known compound (von Dohren, 2009). Similarly, for Aspergillus niger, 34 PKS genes have been identified (Pel et al., 2007), but only a few of them have been described as components of the biosynthesis of specific compounds.
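The core idea of this approach, finding a main synthase gene and then inspecting its neighbours, can be sketched in a few lines. The Python example below is a toy illustration under stated assumptions: it is not how antiSMASH or FungiFun work internally, the 20 kb flanking window is an arbitrary choice made here, and the gene names, coordinates and annotations are invented.

```python
from dataclasses import dataclass

# Annotations treated as "core synthase" types in this sketch (assumption)
CORE_TYPES = {"PKS", "NRPS", "PKS-NRPS", "terpene synthase"}


@dataclass
class Gene:
    name: str
    start: int       # position on the contig (bp)
    end: int
    annotation: str


def candidate_clusters(genes, flank=20_000):
    """Map each core synthase gene to the genes lying within `flank` bp of it."""
    ordered = sorted(genes, key=lambda g: g.start)
    clusters = {}
    for core in ordered:
        if core.annotation not in CORE_TYPES:
            continue
        lo, hi = core.start - flank, core.end + flank
        clusters[core.name] = [
            g.name for g in ordered
            if g is not core and g.end >= lo and g.start <= hi
        ]
    return clusters


if __name__ == "__main__":
    # Invented annotations on a hypothetical contig
    contig = [
        Gene("gene1", 1_000, 2_500, "transporter"),
        Gene("gene2", 30_000, 31_500, "cytochrome P450"),
        Gene("gene3", 33_000, 45_000, "PKS-NRPS"),
        Gene("gene4", 46_000, 47_200, "enoyl reductase"),
        Gene("gene5", 90_000, 91_000, "kinase"),
    ]
    print(candidate_clusters(contig))
```

In this invented example the PKS-NRPS gene is grouped with the adjacent cytochrome P450 and enoyl reductase genes, while the distant transporter and kinase genes are excluded. Dedicated tools refine such boundaries with profile searches and cluster-type-specific rules rather than a fixed window.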

The discovery of new natural products has mainly involved the screening of crude extracts of microorganisms grown on specific growth media. Phase separation is used to facilitate the analysis of the crude extract by different analytical techniques, such as Gas Chromatography/Mass Spectrometry (GC/MS) or Liquid Chromatography/Mass Spectrometry (LC/MS) for detection, and Nuclear Magnetic Resonance (NMR) and/or Infra Red (IR) techniques to elucidate the compound’s structure. The approach of testing different growth media and extracting the secondary metabolites produced under different growth conditions led to the successful isolation of aspoquinolones A-D from Aspergillus nidulans (Scherlach and Hertweck, 2006). Non-specific screening of this nature has certainly yielded novel compounds, but has often also resulted in the rediscovery of previously described compounds (Winter et al., 2011).

In microorganisms, biosynthetic pathway genes are often clustered together with regulatory and resistance genes on a continuous stretch of the genome. The type of secondary metabolite produced from the cluster, but not its structure, can be predicted by using genetic, microbiological and chemical tools. The major predictor is the presence of a core synthase gene, encoding, for example, a PKS, NRPS or terpene synthase. Further information can be gleaned from database searches with surrounding gene sequences, which may indicate specific activities associated with the formation or modification of particular chemical structures.

One good example is the case of stipitatic acid (SA), a tropolone isolated from Talaromyces stipitatus (Birkinshaw et al., 1942); at the time of its structural elucidation the mechanism of formation of the seven-membered ring was unknown, and it remained a mystery for the next seven decades. SA biosynthesis was only elucidated in 2012, when Davison and co-workers screened the genome sequence of T. stipitatus with the aspks1 gene involved in the production of 3-methylorcinaldehyde in Acremonium strictum. A BLAST search identified four gene clusters, one of which contained 11 open reading frames, including a non-reducing PKS (NR-PKS) gene, tspks1, with an overall 38% identity to the MOS of A. strictum (Davison et al., 2012). Analysis of tspks1 revealed a domain pattern similar to that of methylorcinaldehyde synthase, i.e. SAT-KS-AT-PT-ACP-CMet-TE domains. To confirm that the right cluster had been chosen, a knock-out experiment targeting the NR-PKS was performed using the duplex-KO method (Nielsen et al., 2006), yielding nine transformants. LC-MS analysis showed that four of them were deficient in tropolone biosynthesis.

Once a gene cluster has been selected, its function can be studied by using different approaches, including targeted gene knock-out; this leads to total loss of production of the expected compound if a core synthase gene is inactivated, or the accumulation of compound precursors and/or intermediates if tailoring enzymes are removed. Another approach consists in analysing the function of each gene (separately and/or in combination) by expression in heterologous hosts, such as Aspergillus oryzae, Saccharomyces cerevisiae and Escherichia coli; both gene knockout and heterologous expression were used to solve the SA biosynthesis question (Davison et al., 2012).

Another way to study secondary metabolite production is to analyse putative gene clusters bioinformatically and then feed specific labelled amino acid precursors in the hope of finding novel compounds, an approach used to describe the production of orfamides by Pseudomonas fluorescens (Gross et al., 2007). In a further use of this technique, Robbel et al. (2010) predicted two NRPS gene clusters and added 14C-labelled precursors to Saccharopolyspora erythraea to isolate erythrochelin. Despite the success of this technique in predicting the substrates of modular NRPS gene clusters (based on amino acids), it is less applicable to PKS modules, which are based on acetate and malonate moieties. Nevertheless, it was possible to describe the polyene macrolactam salinilactam A, produced by a modular PKS encoded within an 80 kb gene cluster in the marine actinomycete Salinispora tropica (Udwary et al., 2007). This methodology is limited by the lack of precision of precursor prediction, although advances are being made in this area, accompanied by the availability of these precursors. In addition, other relevant parameters, such as precursor concentration and time of feeding, are essential to achieve adequate incorporation and, consequently, compound isolation.

Considering that it has not been possible to predict structures (or even precursors) for iterative PKSs by analysing the genomic sequence alone, molecular biology techniques such as gene knockout or heterologous gene expression are better suited, particularly for gene clusters with few or no predictions about their substrate specificity (Corre et al., 2007). Different techniques have been used to study fungal gene clusters, such as overexpression of pathway regulatory genes (Bergmann et al., 2007), variation of growth conditions to induce compound production (Bok et al., 2006) and the use of chemicals inducing epigenetic changes (Williams et al., 2008), among others.

Genome mining can help in finding cryptic gene clusters, i.e. ones that have low or no known expression, and global or specific gene regulators (Corre and Challis, 2007). One approach is to manipulate global secondary metabolite regulators that could affect the target gene cluster, which can potentially activate orphan gene clusters (Bok et al., 2006). Gene cluster regulators can also be pathway-specific, as exemplified by the expression of a cryptic PKS-NRPS gene cluster in Aspergillus nidulans by constitutive over-expression of the apdR gene, which encodes a putative regulator, resulting in the production of aspyridones A and B (Bergmann et al., 2007). However, these techniques may not work adequately, because some gene clusters are regulated only by global regulators, such as LaeA, described for the genus Aspergillus, and PacC, a pH signalling transcription factor. Overexpression of global regulators could also affect other biological processes, aside from causing overproduction of the desired compound.

Transcription factors can also affect gene cluster expression by up- or downregulation. An example of the latter is provided by the bacterium Burkholderia thailandensis, in which disruption of the gene thaA led to enhanced production of the secondary metabolite thailandamide A and the related thailandamide lactone (Ishida et al., 2010). The thaA gene is a member of the LuxR group of regulators, part of the quorum sensing system related to bacterial interspecies communication mediated by chemical signals (Winter et al., 2011).

1.7 HETEROLOGOUS EXPRESSION OF FUNGAL SECONDARY METABOLITES

Fungi are not always easy to grow on a large scale and/or may take a long time to achieve detectable production of the desired compound(s). In addition, several fungi are not amenable to molecular techniques (Heneghan et al., 2010). In gene knockout, for example, directed mutagenesis is used to target and inactivate gene clusters so that secondary metabolite production in the mutant can be compared with that of the wild-type organism; the technique requires both an available transformation system and a significant rate of homologous recombination in the organism.

For less studied organisms, or when genetic manipulation has not been possible, a different approach to studying secondary metabolites is heterologous expression of partial or complete biosynthetic gene clusters. For example, the expression in E. coli of a terpene synthase gene from Streptomyces avermitilis (Chou et al., 2010) resulted in production of the sesquiterpene avermitilol when the cells were grown in the presence of farnesyl diphosphate. By heterologously expressing genes fused to their native promoters together with the synthase in E. coli, the same group obtained avermitilol, avermitilone, as well as germacrenes A and B (Komatsu et al., 2010). In this particular case, the expression of the gene cluster was successful only because of the compatibility of prokaryotic promoters. Expression of fungal genes in E. coli requires, among other things, a promoter change for compatibility with the host and removal of introns prior to gene insertion, because the prokaryotic system cannot recognize, let alone remove, intronic sequences within the gene.

Combining gene knockout with heterologous expression of gene clusters has led to the successful production of secondary metabolites, as in the case of tenellin, for which knockout of the main PKS-NRPS gene, tenS, was used to confirm its role in tenellin production (Eley et al., 2007). Afterwards, Halo et al. (2008) showed that the ER tailoring enzyme TENC must be co-expressed with TENS to achieve the correct programming needed to obtain the first stable intermediate, pretenellin A. Heneghan et al. (2010) indicated that only two extra genes encoding cytochrome P450s needed to be co-expressed with the PKS-NRPS and ER genes to obtain the final product, tenellin. One of the cytochrome P450s (TENA / P450-1) catalyses the ring expansion that converts pretenellin A (pretenA) to pretenellin B (pretenB), while the other (TENB / P450-2) converts pretenB to tenellin by N-hydroxylation (Figure 1.15).
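The stepwise logic of the tenellin pathway described above can be written as a small lookup that predicts the expected end product for a given set of co-expressed genes. This Python sketch is a deliberately simplified model introduced here for illustration: the function name is hypothetical, shunt products and host enzymes are ignored, and the outcome for tenS without tenC is summarised only as incorrectly programmed products.

```python
def predicted_tenellin_product(expressed_genes):
    """Predict the pathway end point from the set of co-expressed ten genes."""
    genes = {g.lower() for g in expressed_genes}
    if "tens" not in genes:
        return "no PKS-NRPS product"
    if "tenc" not in genes:
        # Without the trans-acting ER the synthase is not correctly programmed
        return "incorrectly programmed products"
    product = "pretenellin A"          # tenS + tenC
    if "tena" in genes:
        product = "pretenellin B"      # ring expansion by TENA (P450-1)
        if "tenb" in genes:
            product = "tenellin"       # N-hydroxylation by TENB (P450-2)
    return product


if __name__ == "__main__":
    for combo in (["tenS"], ["tenS", "tenC"], ["tenS", "tenC", "tenA"],
                  ["tenS", "tenC", "tenA", "tenB"]):
        print(" + ".join(combo), "->", predicted_tenellin_product(combo))
```

The same reasoning underlies knockout experiments: removing the core synthase abolishes production, whereas removing a tailoring gene is expected to cause the preceding intermediate to accumulate.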

Figure 1.15: Tenellin pathway. As described by Heneghan et al., 2010

1.8 MOLECULAR APPROACHES TO THE STUDY OF TORRUBIELLONE A BIOSYNTHESIS

Considering the similarities of torrubiellones A-D to the known polyketides tenellin and desmethylbassianin, together with the knowledge of the genes (and cognate enzymes) of their gene clusters, it is reasonable to expect that whole-genome sequencing of Torrubiella and secondary metabolite analysis with software such as antiSMASH will reveal a putative gene cluster for torrubiellone production.

To study a putative torrubiellone A gene cluster from Torrubiella sp. BCC2165, two main approaches will be taken. Firstly, to achieve heterologous expression of the cluster of interest in A. oryzae, several plasmids containing the ORFs identified by antiSMASH as corresponding to the PKS-NRPS, trans-acting ER and tailoring enzymes must be assembled, following protocols previously described (Pahirulzaman et al., 2012). The genes will be inserted into plasmids by homologous recombination in the yeast Saccharomyces cerevisiae (yeast recombination). The PKS-NRPS gene can be assembled in a Gateway entry plasmid and then transferred into a multigene destination vector by LR recombination. After assembly, the plasmid will be transferred to Aspergillus oryzae NSAR1 by protoplast-mediated transformation. Secondly, overexpression of transcription factors located close to the gene cluster will be attempted in order to obtain a greater quantity of the compound than in the wild type.

1.8.1 YEAST RECOMBINATION

The property of homologous recombination performed by the yeast Saccharomyces cerevisiae has been developed as a tool of molecular technology (Barr, 2003) and exploited for the heterologous expression of metabolic pathways in Aspergillus oryzae (Pahirulzaman et al., 2012). This methodology can be used to join DNA fragments to a linearized plasmid vector if there is overlap of homologous sequence between them. The process generates a circular recombinant plasmid which will transform yeast cells at high efficiency (Figure 1.16). The overlap can be produced by the addition of 25-30 bases to the 5’ ends of PCR primers used to amplify the DNA fragments corresponding to the predicted ORFs. This method can be used to reconstruct large genes (possibly more than 12 kb in the case of some PKS-NRPS genes) from smaller PCR-generated fragments and to produce gene fusions, for example with markers such as enhanced green fluorescent protein (eGFP) (Chalfie et al., 1994; Cubitt et al., 1995).
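The primer design described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name tailed_primers, its parameters and the toy vector and ORF sequences are hypothetical and are not taken from the plasmids used in this work. It assumes a 30-base homology tail and a 20-base annealing region.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def tailed_primers(upstream: str, orf: str, downstream: str,
                   tail_len: int = 30, anneal_len: int = 20):
    """Design primers whose 5' tails overlap a linearised vector.

    upstream / downstream: vector sequence flanking the insertion site
    (hypothetical inputs); orf: the fragment to be amplified.
    """
    # Forward primer: vector homology tail followed by the start of the ORF.
    forward = upstream[-tail_len:] + orf[:anneal_len]
    # Reverse primer: reverse complement of (ORF end + vector flank), so that
    # its 5' tail matches the downstream side of the vector.
    reverse = reverse_complement(orf[-anneal_len:] + downstream[:tail_len])
    return forward, reverse


# Toy sequences for illustration only (not real vector or gene sequences).
vector_up = "GATTACA" * 6      # 42 nt upstream of the insertion site
vector_down = "CCGTTAG" * 6    # 42 nt downstream of the insertion site
toy_orf = "ATG" + "GCTAGCTTAGGCATCGATCC" * 3 + "TAA"

fwd, rev = tailed_primers(vector_up, toy_orf, vector_down)
print(len(fwd), len(rev))
```

Each primer produced this way is 50 bases long, which matches the length of the tailed primers listed later in Table 2.1.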

Figure 1.16: Basic gene assembly in a desired vector. Using the property of homologous recombination in S. cerevisiae (Pahirulzaman et al., 2012).

1.8.2 GATEWAY RECOMBINATION SYSTEM

The Gateway Recombination System (Hartley et al., 2000) exploits the site-specific recombination of the integration-excision process by which bacteriophage λ forms and breaks down lysogens in E. coli. During lysogen formation, recombination occurs between attachment sequences named attP (in the phage DNA) and attB (in the bacterial genome). attP and attB share a common nucleotide sequence of 7 bp (known as the core sequence, O), with two flanking arms (P and P' in the phage, or B and B' in the bacterium). Recombination starts when the bacterial integration host factor (IHF) and a number of phage-encoded integrase (Int) molecules bind strongly to attP. The complex then binds to attB in the bacterial chromosome. After binding, sequential DNA nicks are made at the ends of the O sites of attP and attB, followed by strand exchange between them to form a small heteroduplex. The Holliday junction is then resolved, leaving the prophage flanked by two hybrid att sites, named attL and attR (Left and Right, respectively). This integration, for which only Int and IHF are required, is known as the BP reaction (Katzen, 2007). The reverse process, excision, recombines attL and attR, additionally requires the phage excisionase (Xis), and is known as the LR reaction. The Gateway system for gene transfer was developed by introducing matching alterations to the att sites to increase their number and specificity (Figure 1.17).

In the multigene expression system developed for A. oryzae, a megasynthase gene can be assembled in an entry plasmid (namely pEYA) by yeast recombination, as shown in Figure 1.16. In the resulting recombinant plasmid the DNA fragment of interest is flanked by attL sequences, allowing its transfer between the attR sites of a destination vector (pTYGS series). To perform this reaction, entry and destination plasmids are incubated in vitro with the LR Clonase enzyme mixture, and a small volume of the reaction mixture is then used to transform competent E. coli cells. This is considered a highly efficient cloning technique because the parental entry vector and the recombinant donor vector carry the wrong antibiotic selection marker for E. coli transformation, while the parental destination vector and the recombinant donor vector carry the toxic ccdB counter-selection marker, which kills E. coli cells by inducing gyrase-mediated DNA breakage (Bernard and Couturier, 1992; Miki et al., 1992). Only the desired expression vector combines the correct selection marker with absence of the counter-selection marker (Figure 1.17).
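The selection logic of the LR reaction can be illustrated with a small Python sketch. It is a simplified model, not a description of the actual vectors: the marker labels marker_E and marker_D are hypothetical placeholders for the entry and destination backbone resistances, and the host is assumed to be ccdB-sensitive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plasmid:
    name: str
    marker: str     # antibiotic resistance carried by the backbone
    has_ccdB: bool  # True if the toxic ccdB cassette is present


def survives(plasmid: Plasmid, plate_antibiotic: str) -> bool:
    """A ccdB-sensitive cell survives only with the right marker and no ccdB."""
    return plasmid.marker == plate_antibiotic and not plasmid.has_ccdB


# Hypothetical marker labels (placeholders, not the real resistance genes).
ENTRY_MARKER = "marker_E"
DEST_MARKER = "marker_D"

# The four plasmid species present after an LR reaction.
lr_mixture = [
    Plasmid("parental entry vector", ENTRY_MARKER, False),
    Plasmid("parental destination vector", DEST_MARKER, True),
    Plasmid("recombinant donor vector", ENTRY_MARKER, True),
    Plasmid("expression vector", DEST_MARKER, False),
]

# Plate on the antibiotic matching the destination backbone.
survivors = [p.name for p in lr_mixture if survives(p, DEST_MARKER)]
print(survivors)
```

Under these assumptions only the expression vector passes both the positive and the negative selection.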

Figure 1.17: Scheme of the Gateway Recombinational System. (From Pahirulzaman et al., 2012)

1.9 AIMS

The structure of the antimalarial compound torrubiellone A is comparable to other known PKS-NRPS compounds like tenellin and desmethylbassianin, giving rise to the proposal that its biosynthesis will be directed by a PKS-NRPS-based gene cluster that should be easy to identify.

Identification of the gene cluster could reveal additional functions accounting for the structural differences between torrubiellone, tenellin and desmethylbassianin. The research therefore set out to:

1. Sequence the genome of Torrubiella sp. BCC2165 and identify the torrubiellone A gene cluster by bioinformatic analysis (Chapter 3).

2. Develop a transformation system for Torrubiella and use it to confirm that the correct cluster had been identified by knocking out the PKS-NRPS gene (Chapter 4).

3. Investigate the function of any associated transcription factors by overexpressing them in the native host (Chapter 4).

4. Reconstruct the torrubiellone A biosynthetic pathway in A. oryzae to determine whether heterologous expression could provide a more efficient means of producing torrubiellone A (Chapter 5).

CHAPTER 2

2. MATERIALS AND METHODS

2.1 MICROBIAL STRAINS AND GROWTH MEDIA

2.1.1 ROUTINE CHEMICALS

Unless otherwise stated, chemicals were obtained from Sigma, Fisher Scientific or Appleton Woods. Growth media components were purchased from Formedium (except where indicated).

2.1.2 ESCHERICHIA COLI

E. coli strains TOP10 and ccdB-S (Invitrogen) were cultured at 37° C in LB liquid medium (1% peptone, 0.5% yeast extract, 0.5% NaCl) and on LB agar plates (LB medium with 2% agar added). When required for selection, the antibiotics ampicillin and chloramphenicol were added to culture media at final concentrations of 100 and 30 µg/ml, respectively. For preparation of electrocompetent cells the bacteria were cultured in YENB (0.75% (w/v) yeast extract, 0.8% (w/v) nutrient broth no. 2).

TOP10: F- mcrA Δ(mrr-hsdRMS-mcrBC) Φ80lacZΔM15 ΔlacX74 recA1 araD139 Δ(ara-leu)7697 galU galK rpsL (StrR) endA1 nupG. (Invitrogen)

ccdB-S: F- mcrA Δ(mrr-hsdRMS-mcrBC) Φ80lacZΔM15 ΔlacX74 recA1 araΔ139 Δ(ara-leu)7697 galU galK rpsL (StrR) endA1 nupG fhuA::IS2. (Invitrogen)

2.1.3 SACCHAROMYCES CEREVISIAE

Yeast strain YPH499 (MATa ura3-52 lys2-801amber ade2-101ochre trp1-Δ63 his3-Δ200 leu2-Δ1, Stratagene) was grown at 28° C in liquid YPAD medium (1% (w/v) yeast extract, 2% (w/v) peptone, 2% (w/v) glucose, 0.04% (w/v) adenine sulphate) and on YPAD agar plates (YPAD medium solidified with 2% (w/v) agar). S. cerevisiae transformants were selected on SM-URA plates (0.17% (w/v) yeast nitrogen base, 0.5% (w/v) ammonium sulphate, 2% (w/v) glucose, 0.077% (w/v) complete supplement mixture minus uracil (Formedium), 2% (w/v) agar) at 28° C.

2.1.4 ASPERGILLUS ORYZAE NSAR1

A. oryzae NSAR1 (niaD- sC- ΔargB adeA-, Jin et al., 2004) was grown on MEA medium (2% (w/v) Malt Extract Broth (MEB; VWR International), 2% (w/v) agar) at 28° C. For preparation of protoplasts, conidia were germinated in GN medium (1% (w/v) glucose, 2% (w/v) Nutrient Broth No. 2) overnight at 28° C. Following transformation, A. oryzae transformants were cultured on Czapek Dox/S plates (3.5% (w/v) Czapek Dox, 1 M sorbitol, 2% agar, supplemented where appropriate with 0.1% arginine, 0.15% methionine, 0.05% adenine and 0.1% ammonium sulphate) at 28° C for 4 days. Spores were stored at -80° C in 15% (v/v) glycerol.

For metabolite production the transformants were grown in Czapek Dox containing 2% maltose and the appropriate nutrient supplements, or in MEA.

2.1.5 TORRUBIELLA SP. BCC2165

Torrubiella sp. strain BCC2165, obtained from BIOTEC, Thailand, was grown at 28° C at 200 rpm in Potato-Dextrose broth (PDB - Sigma) and on PD agar plates (PDA – PDB solidified with 2% (w/v) agar). For transformant selection, 100 μl glufosinate (25 mg/ml BASTA stock) were added to PDA plates (volume = 25 ml), for a final concentration of 100 µg/ml.
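The final glufosinate concentration quoted above follows from a simple dilution calculation, sketched here in Python. The function name is hypothetical, and the small volume contributed by the stock itself (0.1 ml in 25 ml) is neglected, as in the text.

```python
def final_concentration_ug_per_ml(stock_mg_per_ml: float,
                                  added_ul: float,
                                  plate_volume_ml: float) -> float:
    """Final concentration after adding a stock solution to a plate.

    1 mg/ml equals 1 µg/µl, so stock (mg/ml) x volume added (µl) gives µg.
    The volume of the added stock is neglected.
    """
    total_ug = stock_mg_per_ml * added_ul
    return total_ug / plate_volume_ml


# 100 µl of 25 mg/ml BASTA stock in a 25 ml PDA plate
print(final_concentration_ug_per_ml(25, 100, 25))  # 100.0 µg/ml
```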

2.2 TRANSFORMATION PROCEDURES

2.2.1 ESCHERICHIA COLI

2.2.1.1 PREPARATION OF ELECTROCOMPETENT E. COLI CELLS

A starter culture was prepared by inoculating a single E. coli colony into 5 ml YENB liquid medium and incubating at 37° C overnight (200 rpm). Aliquots of 0.5 ml of the starter culture were used to inoculate 10 flasks, each containing 50 ml YENB medium. All flasks were incubated for 3-4 hours at 37° C with shaking at 200 rpm. The cells were transferred into 50 ml sterile tubes, pelleted at 3000 x g for 10 min and each pellet resuspended in 20 ml pre-cooled sterile distilled water (SDW). Subsequent procedures were done on ice. Suspensions were pooled, 2 per tube, and centrifuged at 4000 x g for 10 min. The pellets were each resuspended in 20 ml of pre-cooled SDW, and the suspensions pooled again into two tubes before centrifuging at 4000 x g, 4° C for 10 min. Supernatants were discarded and the two pellets were each resuspended in 25 ml of pre-cooled sterile 10% (v/v) glycerol, combined in a single tube and centrifuged at 4000 x g, 4° C for 10 min. The pellet was finally resuspended in 1 ml of pre-cooled sterile 10% (v/v) glycerol, and 50 µl aliquots were transferred to sterile 1.5 ml microcentrifuge tubes before rapid freezing in liquid nitrogen and storage at -80° C.

2.2.1.2 TRANSFORMATION OF ELECTROCOMPETENT E. COLI CELLS

Electrocompetent E. coli cells were thawed on ice and 1 μl of DNA was added to a 50 μl aliquot. The mixture was transferred to a cold electroporation cuvette (0.2 cm electrode gap) and pulsed using a Bio-Rad E. coli Gene-Pulser to get a time constant (msec) between 4 and 5. The cuvette was immediately returned to ice and 200 μl of the recovery SOC medium (2% (w/v) tryptone, 0.5% (w/v) yeast extract, 10 mM sodium chloride, 10 mM potassium chloride, 10 mM magnesium chloride, 10 mM magnesium sulphate, 20 mM glucose) was added. The mixture was incubated at 37° C for up to 1 h, and then aliquots were spread on LB plates supplemented with antibiotic and incubated at 37° C overnight.

2.2.2 SACCHAROMYCES CEREVISIAE TRANSFORMATION

Yeast transformation was based on the LiOAc/SS carrier DNA/PEG method developed by Gietz and Woods (2002). A starter culture was prepared by inoculating 10 ml of YPAD medium with a single colony of S. cerevisiae and incubating overnight at 28° C (200 rpm). The starter culture was added to 40 ml YPAD in a 250 ml conical flask and incubated as above for an additional 4.5 hours. The culture was split between two sterile 50 ml centrifuge tubes and centrifuged at 4° C, 3000 x g for 5 min. The cells were washed by resuspension in 25 ml SDW and re-centrifuged as above. Each pellet was resuspended in 1 ml 0.1 M LiOAc, transferred to a 1.5 ml microcentrifuge tube and centrifuged at full speed (10000 x g) for 15 sec. Supernatants were discarded and cells were resuspended in 400 µl 0.1 M LiOAc. For each transformation, a 50 µl aliquot was transferred to a new 1.5 ml microcentrifuge tube and centrifuged as above. To each cell pellet the following were added in order: 240 µl 50% PEG solution (polyethylene glycol 3350), 36 µl 1 M LiOAc, 50 µl SS-DNA (2 mg/ml denatured salmon testis DNA in TE buffer) and 34 µl DNA/water mixture. Cells were resuspended in the mixture by vortexing, incubated at 30° C for 30 min and then at 42° C for 30 min. Subsequently, the cells were pelleted at 5000 x g for 15 sec and resuspended in 1 ml of SDW by gentle pipetting. Aliquots of 200 µl and 20 µl were spread on SM-URA plates, which were incubated at 28° C for 3-4 days.
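The composition of the transformation mixture can be checked, and scaled to several transformations, with a short Python sketch. The volumes are those given above; the helper name master_mix and the choice to leave the DNA/water component out of the master mix are illustrative assumptions.

```python
# Volumes (µl) added to each cell pellet, as listed in the protocol above.
TRANSFORMATION_MIX_UL = {
    "50% PEG 3350": 240,
    "1 M LiOAc": 36,
    "SS-DNA (2 mg/ml)": 50,
    "DNA/water mixture": 34,
}


def master_mix(mix: dict, n_transformations: int,
               exclude: tuple = ("DNA/water mixture",)) -> dict:
    """Scale the shared components for n transformations.

    Components named in `exclude` differ between tubes and are added
    individually (an assumption made for this sketch).
    """
    return {name: volume * n_transformations
            for name, volume in mix.items() if name not in exclude}


total_ul = sum(TRANSFORMATION_MIX_UL.values())
# Final concentrations in the complete mixture.
final_peg_percent = 50 * TRANSFORMATION_MIX_UL["50% PEG 3350"] / total_ul
final_lioac_molar = 1.0 * TRANSFORMATION_MIX_UL["1 M LiOAc"] / total_ul

print(total_ul)                      # 360
print(round(final_peg_percent, 1))   # 33.3
print(round(final_lioac_molar, 2))   # 0.1
print(master_mix(TRANSFORMATION_MIX_UL, 4))
```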

2.2.3 ASPERGILLUS ORYZAE TRANSFORMATION

A. oryzae transformation was performed as described previously by Halo et al. (2008). Conidia harvested from a 7-day grown single plate in MEA agar were used to inoculate 50 ml of GN medium and incubated at 28° C with shaking at 200 rpm overnight. The germinated A. oryzae conidia were centrifuged at 10000 x g for 10 min and the supernatant discarded. The germlings were then resuspended in 10 ml of filter-sterilized protoplasting solution (10 mg/ml Trichoderma lysing enzyme (Sigma) in 0.8 M NaCl), and incubated at room temperature with gentle shaking for 1–1.5 hours. Protoplasts were released from hyphae by pipetting with a wide-bore 5 ml pipette, and then filtered through sterile Miracloth. The protoplasts were then centrifuged at 3000 x g for 5 min and the supernatant discarded. The pellet was resuspended in solution 1 (0.8 M NaCl, 10 mM CaCl2, 50 mM Tris-HCl pH 7.5), and, for each transformation, 100 μl transferred to a 15 ml centrifuge tube on ice. 5-10 μg (10 μl max) of plasmid DNA (prepared using the Midiprep Plasmid Kit (Thermo Scientific)) was added to the protoplasts and gently mixed. The tube was incubated on ice for 2 min, after which 1 ml of solution 2 (60% (w/v) PEG 3350, 10 mM CaCl2, 50 mM Tris-HCl pH 7.5) was added and the tube was incubated at room temperature for 20 min. 5 ml of molten (50° C) CZD/S top medium (3.5% (w/v) Czapek Dox broth, 1 M sorbitol, 0.8% (w/v) agar, with the appropriate supplements) was added, gently mixed and then overlaid onto prepared plates (approx. 15 ml of CZD/S with 2% (w/v) agar, plus supplements). Plates were then incubated at 28° C for 3-5 days until colonies appeared, and these were serially subcultured on selection plates until genetically pure transformants were obtained.

2.2.4 TORRUBIELLA SP. BCC 2165 TRANSFORMATION

Torrubiella transformation was performed using the A. oryzae protocol described above, but with different growth and plating conditions. Torrubiella sp. BCC2165 was first grown on PDA plates for 14 days, then the aerial hyphae harvested from a single plate were used to inoculate 50 ml of GN medium, which was incubated at 28° C with shaking at 200 rpm for 48 hours. The germlings were then resuspended in 10 ml of filter-sterilized protoplasting solution (20 mg/ml Trichoderma lysing enzyme (Sigma) in 0.8 M NaCl), and incubated at room temperature with gentle shaking for 3-4 hours. Protoplast transformation mixtures were plated by adding 5 ml of PDA/S/top medium (2.4% (w/v) Potato Dextrose Broth (PDB), 1 M sorbitol (S), 0.8% (w/v) agar), mixing gently and then pouring onto prepared plates (approximately 15 ml of 2.4% (w/v) PDB, 1 M sorbitol, 2% (w/v) agar). Following incubation at 28° C for 24 hours, another 5 ml of PDA/S/top medium containing 140 µl BASTA was added (final concentration = 100 µg/ml), and the plates incubated at 28° C for 7-10 days until colonies appeared.

2.3 DNA MANIPULATIONS

2.3.1 RESTRICTION DIGESTS

All restriction enzymes were purchased from Thermo Scientific, and all digestions were performed using FastDigest enzymes and 10X FastDigest Buffer or 10X FastDigest Green Buffer. Preparative digests were performed in a 30 µl volume using 3 µl of buffer, 2 µl of enzyme and 25 µl of plasmid DNA. Analytical digests were performed in a 10 µl volume, using 0.2 µl of enzyme, 1 µl of buffer, 2 µl of plasmid DNA and 6.8 µl of SDW. All digests were analysed by 2% agarose gel electrophoresis.

2.3.2 GATEWAY TRANSFER

Gene transfer from entry vectors (flanked by attL sites) to destination vectors (containing attR sites) was performed by mixing 1 µl destination vector, 1 µl entry vector, 2.5 µl TE buffer (10 mM Tris-HCl pH 7.5, 1 mM EDTA) and 0.5 µl LR Clonase II (Thermo Scientific) and incubating at 25° C overnight. The reaction was stopped by addition of 0.5 µl Proteinase K solution (2 µg/µl) and incubation at 37° C for 10 min. 1 μl of the reaction mix was used to transform E. coli by electroporation as described above.

2.3.3 NUCLEIC ACID EXTRACTIONS

2.3.3.1 PLASMID DNA EXTRACTION FROM ESCHERICHIA COLI

Plasmid DNA was extracted from E. coli using the GeneJet Plasmid Miniprep Kit (Thermo Scientific) according to the manufacturer’s protocol, except that plasmids were extracted from patch cultures on solid LB medium. This kit yields up to 20 µg of plasmid DNA per isolation procedure. For large scale plasmid purification the GeneJet Plasmid Midiprep kit (Thermo Scientific) was used, according to the manufacturer’s protocol. The plasmid DNA was extracted from 50 ml LB (plus antibiotic) cultures. This kit yields up to 200 µg of plasmid DNA from 50 ml of bacterial culture.

2.3.3.2 PLASMID DNA EXTRACTION FROM SACCHAROMYCES CEREVISIAE

The Zymoprep Yeast Plasmid Miniprep I Kit (Zymo Research) was used to extract plasmid DNA from S. cerevisiae. Cells from multiple colonies were scraped up from primary transformation plates using a toothpick, resuspended in 150 μl digestion buffer containing zymolyase, and incubated for 1 h at 37° C. The manufacturer’s protocol was then followed for production and clarification of a cell lysate and precipitation of DNA. The plasmid pellet was rinsed with ethanol and dissolved in 20 μl H2O, then 1 μl was used to transform E. coli by electroporation as described above.

2.3.3.3 SMALL SCALE GENOMIC DNA EXTRACTION FROM FUNGI

Mycelium was harvested from a 50 ml culture by centrifugation, then freeze-dried, weighed and ground under liquid nitrogen to disrupt the fungal cells. The GenElute Plant Genomic DNA Miniprep Kit (Sigma) was then used to extract genomic DNA, according to the manufacturer’s protocol.

2.3.3.4 DNA EXTRACTION FROM PLATES

A small lump of mycelium was added to 500 µl of lysis buffer (400 mM Tris-HCl (pH 8.0), 60 mM EDTA, 150 mM NaCl, 1% (w/v) sodium dodecyl sulfate) in a 1.5 ml microcentrifuge tube and disrupted using a sterile toothpick. The tube was then left at room temperature for 10 min. After adding 150 µl of 5 M potassium acetate (pH 4.8) the tube was vortexed briefly and spun at 10000 x g for 1 min. The supernatant was transferred to another 1.5 ml microcentrifuge tube and centrifuged as described previously. After transferring the supernatant to a new 1.5 ml microcentrifuge tube, an equal volume of isopropyl alcohol was added and mixed in by inversion. After centrifugation at 10000 x g for 2 min the supernatant was discarded and the resultant DNA pellet washed in 300 µl of 70% ethanol. After spinning at 10000 x g for 1 min and removing the supernatant, the DNA pellet was air dried and dissolved in 50 µl TE buffer. 1 µl of the purified DNA was used as template in a 25 µl PCR.

2.3.3.5 EXTRACTION OF RNA FROM FUNGI

Mycelium was harvested from 50 ml cultures by centrifugation, then freeze-dried, weighed and ground under liquid nitrogen to disrupt the fungal cells. The RNeasy Plant Mini Kit (Qiagen) was then used to extract RNA according to the manufacturer’s protocol. RNase-free DNase I (Thermo Scientific) was used to remove genomic DNA, by adding to an RNase-free tube 1 µl of RNA, 1 µl of 10X reaction buffer, 1 µl of DNase I and 7 µl of DEPC-treated water and incubating at 37° C for 30 min. The DNase was inactivated by the addition of 1 µl 50 mM EDTA and incubation at 65° C for 10 min, prior to using the RNA as a template for reverse transcriptase.

2.3.4 POLYMERASE CHAIN REACTIONS

Primers were purchased from Sigma Aldrich. Stock primers were diluted to a concentration of 100 µM in 0.1X TE buffer. For PCR, primers were diluted to 5 µM using SDW. Both primer dilutions were stored at -20° C until use. The primers used in this research are listed in Table 2.1.

Table 2.1: List of primers used in the research

NAME SEQUENCE

PKS-TOR-F1 CTTTGTACAAAAAAGCAGGCTCCGCGGCCGCAATGTCTCATCCTAAGAAG

PKS-TOR-R1 GTAGACCGATCATCTGACAG

PKS-TOR-F2 GAACGAGATCAGTGCCAGAG

PKS-TOR-R2 CTTACGGAGCTCTCTTTGCG

PKS-TOR-F3 GTCTCAACATGATGCGGCTG

PKS-TOR-R3* CCTCCATGCAGTATTTTGGG

PKS-TOR-F4 GCAGATGCGAGACGTGATCC

PKS-TOR-R3 GAACAGCTCCTCGCCCTTGCTCACCATAACTGGCTGCTCTCCCATATAAG

adh-torA-F TTTCTTTCAACACAAGATCCCAAAGTCAAAATGTTTTTCCCTTTGGAGAG

adh-torA-R TTCATTCTATGCGTTATGAACATGTTCCCTTTATGTGTAAAAAGCTATCT

gdpA-torB-F AACAGCTACCCCGCTTGAGCAGACATCACCATGGCTCTCTCTGTGGCAGC

gdpA-torB-R ACGACAATGTCCATATCATCAATCATGACCTCAGACCTTGCGTCTCTTGA

eno-torC-F GTCGACTGACCAATTCCGCAGCTCGTCAAAATGACGGCTGTCACGGCTCT

eno-torC-R GGTTGGCTGGTAGACGTCATATAATCATACTCAGAGACCTGATGTCTCCT

TF-tor-F TTTCTTTCAACACAAGATCCCAAAGTCAAAATGGAAACACCATTAAAGGC

TF-tor-R TATGCGTTATGAACATGTTCCCTGCGCGCCTCAGAGCATCAGCATTAAAA

gdpA-BAR-F AACAGCTACCCCGCTTGAGCAGACATCACCATGAGCCCAGAACGACGCCC

gdpA-BAR-R ACGACAATGTCCATATCATCAATCATGACCTTAGCTTACCTAAATCTCGG

adh-torD-F TTTCTTTCAACACAAGATCCCAAAGTCAAAATGGAAGCAAGCGTTTCTCA

adh-torD-R TTCATTCTATGCGTTATGAACATGTTCCCTTTAGGCTACCTCTCGTGATG

gpdA-torE-F AACAGCTACCCCGCTTGAGCAGACATCACCATGTGGTATACCAAAGTGAT

gdpA-torE-R ACGACAATGTCCATATCATCAATCATGACCCTATTGCCAATCTGCCAGCA

adh-torDE-R TTCATTCTATGCGTTATGAACATGTTCCCTCTATTGCCAATCTGCCAGCA

eno-TFZn-F GTCGACTGACCAATTCCGCAGCTCGTCAAAATGGAACCCGGCCCATCCCA

eno-TFZn-R GGTTGGCTGGTAGACGTCATATAATCATACTCAAAGCGAAATTCCCACAG

prom_adh-F TATACTAAACTCACAAATTAGAGCTTCAATGCACCTACATCATTCAATAG

prom_adh-R GGTGAACAGCTCCTCGCCCTTGCTCACCATGCTTGCCCAATTTGAGCAGC

*PKS2tor-1-F GTACAAAAAAGCAGGCTCCGCGGCCGCCATGTCCTCGCACAAGAACGAG

PKS2tor-1-R CCTCAACAGCAACCTCAGAC

PKS2tor-2-F CTGACCAGCTCAAGGGCTGG

PKS2tor-2-R GAAGGAGAAACCGTCCATGG

PKS2tor-3-F GTGGCGAGGTCATCCGCATG

*PKS2tor-3-R CGGTGAACAGCTCCTCGCCCTTGCTCACCATCCGCTCGAGGAAGCCCATG

PKS2torarg-1-F TCTGAACAATAAACCCCACAGCAAGCTCCGATGTCCTCGCACAAGAACGA

PKS2torarg-1-R CCTCAACAGCAACCTCAGAC

PKS2torarg-2-F CTGACCAGCTCAAGGGCTGG

PKS2torarg-2-R GAAGGAGAAACCGTCCATGG

PKS2torarg-3-F GTGGCGAGGTCATCCGCATG

PKS2torarg-3-R GGTGAACAGCTCCTCGCCCTTGCTCACCATCCGCTCGAGGAAGCCCATGT

sd-eno-F TCGACTGACCAATTCCGCAGCTCGTCAAAGATGGATCTCCTCCTCTCCAT

sd-eno-R AGGTTGGCTGGTAGACGTCATATAATCATATCACATTTCCTTTGCAGTCT

C6TF-adhprom-F TTCCTCAGTTGCTGCTCAAATTGGGCAAGCATGGAAACACCATTAAAGGC

C6TF-eGFP-R GGTGAACAGCTCCTCGCCCTTGCTCACCATGAGCATCAGCATTAAAATCA

ZnTF-adhprom-F TTCCTCAGTTGCTGCTCAAATTGGGCAAGCATGGAACCCGGCCCATCCCA

ZnTF-eGFP-R GGTGAACAGCTCCTCGCCCTTGCTCACCATAAGCGAAATTCCCACAGAAC

ptorD500-F TATACTAAACTCACAAATTAGAGCTTCAATCGTAGTAGCCCAGATGGGCC

ptorD500-R CGGTGAACAGCTCCTCGCCCTTGCTCACCATTGCAAATTGGTATGCTTGC

ptorDmid-F TATACTAAACTCACAAATTAGAGCTTCAATCAACAGGAACAGGCGCAGCT

ptorDmid-R GGTGAACAGCTCCTCGCCCTTGCTCACCATTCAACCTCTACACTGTCATT

ptorE500-F TATACTAAACTCACAAATTAGAGCTTCAATCAAGGACGACTACCTCATAC

ptorE500-R GGTGAACAGCTCCTCGCCCTTGCTCACCATCTGTTGAGCGCACGGTGTTA

TorIntRem-F1 CGACGTTGATGCTTGGGTTC

TorIntRem-R1 GCCATCAAGTGCACTCAACACGGCCAGCGACGTTGCACCAGTACCTGCTC

TorIntRem-F2 CTGGAAATCGGAGCAGGTACTGGTGCAACGTCGCTGGCCGTGTTGAGTGC

TorIntRem-R2 CGCGATGATAATATCGTACG

IR1-Mfe1-R-50 GGAACGGAGCTCCAGCAGCCAATTTCGGCCATGGTGTCGCTGTCGTGGGA

IR1-Mfe2-F-50 CGTTGCAAATTCCCACGACAGCGACACCATGGCCGAAATTGGCTGCTGGA

torpatIR1-F CGAGTTCCGAGCGCACATGG

torpatIR1-R CTGACGCTCAGTGGAACGAC

torS-PgdpA-F GGATGTTGCCTTTCAGACCATTTTCATTGCTCTAGTGGATCTTTCGACAC

Bar-PgdpA-R GGGGCGGAACCGGCAGGCTG

torS-KO-F-1 GCGAAACTCTCCCAGGGTGG

torS-KO-R-1 CGTATTTCAGTGTCGAAAGATCCACTAGAGCAATGAAAATGGTCTGAAAG

Bar-TgdpA-F GCCACCGAGGCGGACATGCC

Bar-TgdpA-R GCCGCCGCTCAAGAAGAAATCCGACCCCGGAATTGACCTCCTAAAACCCC

TgdpA-torS-F ATGACCCACTGGGGTTTTAGGAGGTCAATTCCGGGGTCGGATTTCTTCTT

TgdpA-torS-R GGCTGCTCTCCCATATAAGG

q-Bar-F GGCACAGGGCTTCAAGAGCG

q-Bar-R CCTAAATCTCGGTGACGGGC

q-torSKO-F GCACCATGCCTTTGCTGAAG

q-torSKO-R CGCAGCGTTCTGGTTCGAAG

q-torS-F CCTTCGCCTTTGTCTCGTCG

q-torS-R CACACGGTCTATGGATCGAG

torSKOcheck-F CCTACATGATACATCCCGCC

torSKOcheck-R CAGCCTGTAGCTTCATCAAG
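Several of the long primers in Table 2.1 begin with the same bases, which is consistent with a shared 5' homology tail followed by a gene-specific annealing region. The Python sketch below checks this for three primers copied from the table. Interpreting the shared bases as the vector-homologous tail is an assumption made for illustration, and the helper names are hypothetical.

```python
# Three forward primers copied from Table 2.1.
PRIMERS = {
    "adh-torA-F": "TTTCTTTCAACACAAGATCCCAAAGTCAAAATGTTTTTCCCTTTGGAGAG",
    "adh-torD-F": "TTTCTTTCAACACAAGATCCCAAAGTCAAAATGGAAGCAAGCGTTTCTCA",
    "TF-tor-F": "TTTCTTTCAACACAAGATCCCAAAGTCAAAATGGAAACACCATTAAAGGC",
}


def common_prefix(seqs) -> str:
    """Longest sequence shared by all primers, read from the 5' end."""
    prefix = []
    for bases in zip(*seqs):
        if len(set(bases)) != 1:
            break
        prefix.append(bases[0])
    return "".join(prefix)


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a sequence."""
    return 100 * sum(base in "GC" for base in seq) / len(seq)


shared = common_prefix(PRIMERS.values())
print(len(shared))                     # 33
print(shared[-3:])                     # ATG
for name, seq in PRIMERS.items():
    print(name, len(seq), round(gc_percent(seq), 1))
```

The shared region is 33 bases long: 30 bases that are identical in all three primers, followed by the ATG start codon common to the three ORFs.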

2.3.5 PREPARATIVE/ANALYTICAL PCR

All PCRs were carried out using a Multi Gene II thermal cycler (Labnet International Inc) using the programmes shown in Tables 2.2 and 2.3.

Preparative PCR was performed in a 50 µl volume using 0.5 µl of either Phusion DNA polymerase in HF buffer (Thermo Scientific), Q5 High Fidelity DNA polymerase in Q5 Reaction Buffer (New England Biolabs) or KAPA HiFi HotStart DNA polymerase in Fidelity Buffer (KAPA Biosystems). The final concentrations of dNTPs and primers were 0.2 mM and 0.5 µM respectively. The DNA template was added according to the manufacturers’ guidelines for low complexity DNA, namely 1 pg-10 ng.

Table 2.2: Programme for preparative PCR

PROGRAM

Initial denaturation 98° C for 30 sec, 1 cycle

Denaturation 98° C for 10 sec (35 cycles)

Annealing 60° C for 20 sec (35 cycles)

Extension 72° C for 15 sec - 2 min* (35 cycles)

Final extension 72° C for 5 min, 1 cycle

* Relative to product length (15 sec/kb for plasmid template; 30 sec/kb for gDNA template)
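The extension time rule in the footnote to Table 2.2 can be expressed as a small Python helper. The function name is hypothetical, and treating 15 sec as a lower limit (taken from the bottom of the range in the table) is an assumption of this sketch.

```python
# Extension rate in seconds per kb of product, by template type (Table 2.2).
SECONDS_PER_KB = {"plasmid": 15, "gDNA": 30}


def extension_time_sec(product_kb: float, template: str) -> float:
    """Extension time for preparative PCR, never shorter than 15 sec."""
    if template not in SECONDS_PER_KB:
        raise ValueError(f"unknown template type: {template}")
    return max(15, SECONDS_PER_KB[template] * product_kb)


print(extension_time_sec(2, "plasmid"))    # 30
print(extension_time_sec(4, "gDNA"))       # 120
print(extension_time_sec(0.5, "plasmid"))  # 15
```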

Analytical PCR was performed in a 25 µl volume containing 17.5 µl of 1.4X PCR Mix (0.025 ml DreamTaq DNA polymerase (Thermo Scientific), 0.5 ml 10X DreamTaq Green Buffer, 0.1 ml dNTPs (10 mM each) and 2.875 ml of SDW; total volume = 3.5 ml), 2.5 µl each of 5 µM forward and reverse primers, and 2.5 µl of template DNA.
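The arithmetic behind the analytical PCR mix can be verified with a short Python calculation. It uses only the volumes and stock concentrations given above; the variable names are illustrative.

```python
# Components of the PCR mix stock (ml), as listed above.
MIX_ML = {
    "DreamTaq DNA polymerase": 0.025,
    "10X DreamTaq Green Buffer": 0.5,
    "dNTPs (10 mM each)": 0.1,
    "SDW": 2.875,
}
mix_total_ml = sum(MIX_ML.values())

# Concentrations in the mix stock.
buffer_in_mix_x = 10 * MIX_ML["10X DreamTaq Green Buffer"] / mix_total_ml
dntp_in_mix_mm = 10 * MIX_ML["dNTPs (10 mM each)"] / mix_total_ml

# One 25 µl reaction: mix + forward primer + reverse primer + template.
mix_ul, primer_ul, template_ul = 17.5, 2.5, 2.5
reaction_ul = mix_ul + 2 * primer_ul + template_ul

# Final concentrations in the reaction.
buffer_final_x = buffer_in_mix_x * mix_ul / reaction_ul
dntp_final_mm = dntp_in_mix_mm * mix_ul / reaction_ul
primer_final_um = 5 * primer_ul / reaction_ul

print(round(mix_total_ml, 3))     # 3.5
print(round(buffer_in_mix_x, 2))  # 1.43
print(reaction_ul)                # 25.0
print(round(buffer_final_x, 2))   # 1.0
print(round(dntp_final_mm, 2))    # 0.2
print(round(primer_final_um, 2))  # 0.5
```

The mix is therefore about 1.4X with respect to buffer, giving 1X buffer, 0.2 mM of each dNTP and 0.5 µM of each primer in the final reaction.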

Table 2.3: Analytical PCR programmes

Programme 55S (for products of less than 800 bp)

Initial denaturation 94° C for 3 min, 1 cycle

Denaturation 94° C for 15 sec (30 cycles)

Annealing 55° C for 30 sec (30 cycles)

Extension 72° C for 45 sec (30 cycles)

Final extension 72° C for 5 min, 1 cycle

Programme 55L (for products between 800 bp and 2 kb)

Initial denaturation 94° C for 3 min, 1 cycle

Denaturation 94° C for 15 sec (30 cycles)

Annealing 55° C for 30 sec (30 cycles)

Extension 72° C for 90 sec (30 cycles)

Final extension 72° C for 5 min, 1 cycle

2.3.6 COLONY PCR

Cells from an individual E. coli colony were collected on a sterile toothpick and resuspended in TE buffer supplemented with 50 µg/ml proteinase K. Cell suspensions were incubated at 55° C for 15 min and then at 80° C for 15 min, and centrifuged at 10000 x g for 15 sec. 2.5 µl of supernatant was used as a template for analytical PCR.

2.3.7 ELECTROPHORESIS

Agarose gels (1%) were cast and run in 1X TAE buffer (40 mM Tris-acetate pH 8.3, 1 mM EDTA) supplemented with 0.002% (v/v) Midori Green (Geneflow). DNA was loaded using loading dye present in FastDigest Green Buffer (Thermo Scientific). Analytical gels were run at 80 mA for 0.5 h and preparative gels at 40 mA for 1 h. Gels were visualized by UV light and images recorded using a Bio Doc-It Imaging System (UVP).

2.3.8 SEQUENCING

DNA was sequenced at the University of Bristol Genomics Facility.

2.4 CHEMICAL EXTRACTIONS AND ANALYSIS

2.4.1 METABOLITE EXTRACTION FROM LIQUID MEDIA

A. oryzae NSAR1 transformants were grown in 50 ml CMP medium (3.5% (w/v) Czapek Dox, 2% (w/v) maltose and 1% (w/v) peptone) or in Czapek Dox + 2% maltose at 28° C/200 rpm for 5-7 days. The culture was homogenised and acidified to pH 3 using 1M HCl. An equal volume of ethyl acetate was then added, and the sample left stirring for 10 min. The sample was vacuum filtered through Whatman No 1 filter paper and transferred to a separation funnel to separate the organic solvent phase from the aqueous phase. The organic phase was recovered and dried with anhydrous magnesium sulphate, which was removed by filtering through Whatman No 1 filter paper. The ethyl acetate phase was evaporated from the sample at 37° C using a rotary evaporator. The dry residue was dissolved in a small volume of ethyl acetate and put in a pre-weighed 1.5 ml glass vial, dried and weighed again and dissolved in methanol or acetonitrile. For LC-MS analysis, the samples were diluted to 5 mg/ml for crude extracts and 1 mg/ml for pure compounds.

2.4.2 METABOLITE EXTRACTION FROM PLATES

Direct extractions from mycelia grown on solid media were done by grinding a small plug of agar containing A. oryzae or Torrubiella sp. transformants in a 1.5 ml microcentrifuge tube using acetone as solvent. The sample was centrifuged for 10 min at 10000 x g and the supernatant transferred to a glass vial. The sample was dried under a stream of air to drive off the acetone. To remove water-soluble compounds, 300 μl of HPLC-grade water and 700 μl of ethyl acetate were added and vortexed. The organic phase was transferred to a pre-weighed glass vial and dried. Methanol or acetonitrile was added to give a sample concentration of 5 mg/ml for further analysis.

2.4.3 LIQUID CHROMATOGRAPHY – MASS SPECTROSCOPY (LC-MS)

LC-MS was performed in the School of Chemistry, where the samples were analysed using a Waters 2795HT HPLC system. Compounds were detected by UV absorption between 200 and 500 nm using a Waters 2998 diode array detector. Simultaneous electrospray (ES) mass spectrometry was performed using a Waters ZQ spectrometer detecting 150-800 m/z units. Chromatography was performed at a flow rate of 1 ml/min using a Phenomenex Kinetex column equipped with a Phenomenex Security Guard pre-column. Solvents: A: 0.05% formic acid in HPLC grade H2O; B: 0.045% formic acid in HPLC grade MeOH. Gradient used: 0 min, 10% B; 10 min, 90% B; 12 min, 90% B; 13 min, 10% B; 15 min, 10% B. Running time varied between 15 min and 20 min, depending on the LC-MS machine used.
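The solvent gradient can be written as a list of time points and interpolated to give the proportion of solvent B at any time during the run. This Python sketch assumes linear ramps between the listed time points, which is not stated explicitly above; the function name is hypothetical.

```python
# (time in min, % solvent B) pairs from the gradient described above.
GRADIENT = [(0, 10), (10, 90), (12, 90), (13, 10), (15, 10)]


def percent_b(t_min: float) -> float:
    """Percentage of solvent B at time t, assuming linear ramps."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the programmed gradient")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise ValueError("gradient table is malformed")


print(percent_b(5))     # 50.0
print(percent_b(11))    # 90.0
print(percent_b(12.5))  # 50.0
```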

2.4.4 THIN-LAYER CHROMATOGRAPHY

Thin-layer chromatography (TLC) was used for compound screening from fungal crude extracts after ethyl acetate extractions. For analytical TLC, approximately 10 µl of a 5 µg/ml fungal crude extract was transferred to a 2x10 cm TLC plate (TLC silica gel 60 F254; Merck) and developed in petroleum spirit-ethyl acetate (9:1). The plate was dried and visualised after submerging in potassium permanganate solution (1.5 g of KMnO4, 10 g K2CO3 and 1.25 ml 10% NaOH in 200 ml of water), revealing unsaturated compounds as yellow spots. Preparative TLC was performed on a 20x20 cm TLC plate, and developed as before. After drying the plate a 1 cm-strip was cut from one side, and stained in potassium permanganate solution; a spatula was used to scrape the silica off the unstained plate from positions corresponding to the visualised spots. Purified compounds were extracted from the silica with methanol, collected in pre-weighed glass vials and dried under a flow of dry N2 gas.

2.5 MICROSCOPY

2.5.1 LIGHT MICROSCOPY

An OLYMPUS BH-2 microscope was used to observe sporulating fungi and protoplast production.

2.5.2 FLUORESCENCE MICROSCOPY

eGFP (enhanced Green Fluorescent Protein) expression was observed using a LEICA DMLB microscope fitted with a UV lamp (Leistungelektronik, Jena).

2.6 SOFTWARE AND ONLINE TOOLS

DNAman (Lynnon) was used to view, align and manipulate DNA sequences, e.g. for intron removal. GENtle (Magnus Manske) and SnapGene Viewer 3.1.4 (GSL BIOTECH LLC) were used to create vector maps and for restriction analysis. MassLynx (Waters) was used for LC-MS analysis. The NCBI database was used for DNA (BlastN and TblastX) and amino acid (Blastp and TblastN) searches. antiSMASH 2.0 and 3.0 (antibiotics & Secondary Metabolite Analysis Shell - Medema et al., 2011), NaPDoS (Natural Product Domain Seeker) and PRISM (PRediction Informatics for Secondary Metabolomes - Nathan Magarvey Lab at McMaster University) were used for PKS-NRPS gene cluster prediction. ChemDraw (PerkinElmer Informatics) was used for drawing chemical structures. Phylogenetic trees were made in MEGA5.

CHAPTER 3

3. THE TORRUBIELLONE BIOSYNTHETIC GENE CLUSTER

3.1 INTRODUCTION

Genes related to the biosynthesis of secondary metabolites in fungi are generally clustered, i.e. two or more genes that work together to synthesize a product are located at a single genetic locus and can be regulated by the same or different transcription factors. For example, the GATA transcription factor AREA in Fusarium graminearum, named after its recognition of an AGATAA consensus motif, affects fungal growth and development, but also participates in controlling the production of a wide range of secondary metabolites (Ko and Engel, 1993). AREB, another GATA transcription factor, is able to activate both nitrogen-repressed and nitrogen-induced secondary metabolite gene clusters (Tudzynski, 2014).

Fungal PK and NRP gene clusters are composed of a main gene encoding a megasynthase (a very large protein comprising several functional domains), responsible for the production of a precursor of the desired product (Brakhage et al., 2009), together with genes for additional proteins, such as tailoring enzymes, which can modify the PK or NRP structural base from the precursor to the final metabolite (Hertweck, 2009). To identify the genes related to the production of torrubiellone A in Torrubiella sp. BCC2165, in silico analysis of the whole genome sequence of the fungal strain was performed to predict the putative gene cluster and distinguish it from other secondary metabolite gene clusters.

3.2 GENOMIC SEQUENCE DATA FROM TORRUBIELLA SP. BCC2165

Genomic DNA was extracted from Torrubiella sp. BCC2165 and sent for sequencing to the University of Bristol Genomics Facility. Statistics regarding the assembled sequence are shown in Table 3.1.

Table 3.1: Data obtained from Torrubiella sp. BCC2165 sequencing

TORRUBIELLA SP. BCC2165 GENOME SEQUENCE

Number of contigs 17,574

Total assembled length (bp) 45,823,919

Median contig length (bp) 3656

Largest contig (bp) 509,930

Smallest contig (bp) 200

N50 (half of the data is in contigs of ... bp and more) 7398
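
The N50 statistic reported in Table 3.1 can be illustrated with a minimal Python sketch. It assumes the common definition (the contig length at which contigs of that size or larger hold at least half of the assembled bases); the function name and the toy contig lengths are hypothetical and are not the Torrubiella data.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    Contigs are sorted from longest to shortest and accumulated until
    at least half of the total assembled length has been reached; the
    length of the contig that crosses that threshold is the N50.
    """
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Hypothetical toy assembly (illustrative values only)
toy_contigs = [1000, 800, 600, 400, 200]
print(n50(toy_contigs))
```

Because the median contig length counts contigs rather than bases, it can be much smaller than the N50 when an assembly contains many short contigs.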

The total assembled length of the Torrubiella sp. BCC2165 genome sequence (available at http://www.cerealsdb.uk.net/cerealgenomics/fungal_blast.htm) was determined to be 45.8 Mb, which is larger than the genomes of other ascomycete entomopathogens: B. bassiana ARSEF 2860 (33.7 Mb), C. militaris (32.2 Mb) and Metarhizium acridum (38.1 Mb) (Xiao et al., 2012). In fungi, genome sizes range from 8.97 Mb to 177.57 Mb, with averages of 36.91 Mb for Ascomycota and 46.48 Mb for Basidiomycota, placing Torrubiella above the Ascomycota average.

According to Galagan et al. (2005), fungal genomes have a coding density ranging from 31% to 61%, which is inversely correlated with genome size. In oomycete species, an increase in genome size is directly correlated with the amount of coding sequence, and a correlation also exists between average intron size and genome size (Machado et al., 2015). Comparison of multiple fungal genome sequences has shown how different fungi can be at the genome level, despite similarities in appearance, morphology and lifestyle. For example, M. grisea (or M. oryzae, depending on the author) and the model fungus Neurospora crassa share only 47% amino acid identity, despite having had a common ancestor only 200 million years ago (Galagan et al., 2005). Intron numbers also differ between basidiomycetes (five to six introns per gene) and ascomycetes (one to two introns per gene), with an extreme case in S. cerevisiae, which has up to 300 introns in total. In ascomycetes, such as Torrubiella, the average intron size lies between 80 and 150 bp (Goffeau et al., 2000).

3.3 IN-SILICO ANALYSIS

3.3.1 GENE CLUSTER PREDICTION

In-silico analysis was performed on the whole genome sequence of Torrubiella sp. BCC2165 using antiSMASH (Medema et al., 2011) as the reference tool. To complement the results, other genome mining tools were used, namely PRISM (PRediction Informatics for Secondary Metabolomes) and NaPDoS (Natural Product Domain Seeker), although the number of hits returned by any bioinformatics tool should not be regarded as a definitive analysis of the genome sequence. The combined results gave 56 different putative gene clusters related to secondary metabolism, of which 36 clusters (Table 3.2) are described as PKS-NRPS, PKS (or PKS-like) or NRPS: 28 were predicted by antiSMASH and a further 8 by PRISM.

NaPDoS detected 11 KS domains and 33 C domains in the whole genome sequence. KS domains can belong to a PKS, PKS-like or PKS-NRPS enzyme, while C domains can be part of an NRPS, NRPS-like or PKS-NRPS enzyme. Taking the NaPDoS outcome into account, more NRPS genes than PKS genes were expected to be detected.
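
The relationship between marker domains and enzyme class described above can be sketched in Python. This is an illustrative rule of thumb only, not the logic used by antiSMASH, PRISM or NaPDoS: it assumes that a KS domain marks a PKS-type enzyme, that a C or A domain marks an NRPS-type enzyme, and that both together mark a hybrid. The function name is hypothetical.

```python
def classify_core(domains):
    """Assign a coarse enzyme class from a list of domain names.

    Assumed rule (illustrative): KS indicates a PKS-type enzyme,
    C or A indicates an NRPS-type enzyme, both indicate a hybrid.
    Names are compared exactly, so "C-Met" is not mistaken for "C".
    """
    names = set(domains)
    has_pks = "KS" in names
    has_nrps = bool(names & {"C", "A"})
    if has_pks and has_nrps:
        return "PKS-NRPS"
    if has_pks:
        return "PKS"
    if has_nrps:
        return "NRPS"
    return "unclassified"


# Domain lists written out by hand for three simple architectures
print(classify_core(["KS", "AT", "DH", "ER", "KR"]))
print(classify_core(["A", "C"]))
print(classify_core(["KS", "AT", "DH", "KR", "C", "A", "PCP", "TD"]))
```

A rule this simple cannot distinguish PKS-like or NRPS-like fragments from complete synthases, which is one reason the counts from different tools are not directly comparable.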


Table 3.2: In-silico prediction of secondary metabolite gene clusters in Torrubiella sp. BCC2165

NO | CONTIG; TYPE | DOMAIN ARCHITECTURE | GENES DESCRIBED IN THE CONTIG | LOCATION (NT)

1 | 19930; 1 - NRPS | C – A – A | Condensation domain-containing protein; AMP-dependent synthetase and ligase | 3941

2 | 1850; 1 - NRPS | A – C – A – ACP – C – A – ACP – C – C – C | ABC transporter related protein, AMP-dependent synthetase and ligase | 25034

3 | 5044; 1 - PKS-NRPS | KS – AT – DH – C-Met – KR – C – A – PCP – TD – ER | Major facilitator superfamily MFS 1, Beta-ketoacyl synthase, crotonyl-CoA reductase / alcohol dehydrogenase, NADH:flavin oxidoreductase/NADH oxidase, cytochrome P450 | 50529

4 | 19633; 1 - NRPS-transAT-PKS | PCP – A – PCP – C – AT | Phosphopantetheine-binding domain-containing protein, condensation domain-containing protein, malonyl CoA-acyl carrier protein transacylase, cation ABC transporter, periplasmic, iron compound ABC transporter, periplasmic | 35784

5 | 4911; 1 - NRPS | A – PCP – C – PCP – C – A – PCP – E – C | serine/threonine protein kinase, AMP-dependent synthetase and ligase, AMP-dependent synthetase and ligase | 29122

6 | 7367; 1 - NRPS | C – PCP – A | AMP-dependent synthetase and ligase, condensation domain-containing protein, condensation domain-containing protein | 6015

7 | 14215; 1 - NRPS | A – PCP | AMP-dependent synthetase and ligase, 2',3'-cyclic-nucleotide 2'-phosphodiesterase | 12083

8 | 5016; T3-PKS | None described | chalcone and stilbene synthase domain protein, RNA polymerase, sigma-24 subunit, ECF subfamily, sugar-binding lipoprotein, oxidoreductase | 232071 - 273233

9 | 13571; 1 - NRPS-T1PKS | KS – AT – DH – C-Met – KR – C – A – PCP – TD | Beta-ketoacyl synthase | 18417

10 | 13546; 1 - T1PKS | KS – AT – DH – C-Met | malonyl CoA-acyl carrier protein transacylase | 8838

11 | 4927; 1 - NRPS-T1PKS | A – PCP – TD // KS – AT – ACP – A | AMP-dependent synthetase and ligase, malonyl CoA-acyl carrier protein transacylase, cytochrome P450, AMP-dependent synthetase and ligase | 24003

12 | 4880; 1 - NRPS | A – PCP – C – A – NAD binding | AMP-dependent synthetase and ligase, major facilitator transporter | 33986

13 | 6146; 1 - NRPS-T1PKS | KS – AT – DH – KR – C – A – PCP – TD | Beta-ketoacyl synthase, cytochrome P450 | 29463

14 | 5188; 1 - T1PKS | KS – AT – DH – ER – KR | Beta-ketoacyl synthase | 17303

15 | 13724; 1 - NRPS | C – A – ACP – C – C – A – C – C – C // A – ACP – C – C – A – C – C | phosphopantetheine-binding domain-containing protein, AMP-dependent synthetase and ligase | 16620

16 | 14478; 1 - NRPS | A – C | AMP-dependent synthetase and ligase | 10623

17 | 1860; NRPS | CAL – PCP – C – A – PCP | ABC transporter ATP-binding protein, extracellular solute-binding protein family 5, AMP-dependent synthetase and ligase | 23469 - 49128

18 | 1896; 1 - NRPS | A – PCP – C – A – PCP – C – A – PCP – C – A – PCP – C | acetylornithine deacetylase, AMP-dependent synthetase and ligase | 15876

19 | 1952; 1 - T1PKS | A – PCP – Aminotran | AMP-dependent synthetase and ligase, aminotransferase class-III | 15266

20 | 13439; 1 - T1PKS | KS – AT – DH – ACP – ACP – TE | cytochrome P450, Beta-ketoacyl synthase | 18823

21 | 13683; NRPS | Aminotran_1_2 – Met – PCP | 8-amino-7-oxononanoate synthase, FAD dependent oxidoreductase, phosphopantetheine-binding domain-containing protein | 315 - 38729

22 | 5208; 1 - NRPS | A – PCP – C – A – PCP – A | condensation domain-containing protein, Drug resistance transporter, EmrB/QacA, putative carboxymuconolactone decarboxylase | 16173

23 | 5025; NRPS | KR – A – C – A – PCP – TE | 2-isopropylmalate synthase, short-chain dehydrogenase/reductase SDR, AMP-dependent synthetase and ligase, condensation domain-containing protein, phosphopantetheine-binding domain-containing protein, luciferase family protein | 108457 - 158355

24 | 5027; TransAT-PKS | AT // KS – TransAT-Docking – DH – KR – ACP – KS – TransAT-Docking – KR – ACP – KS | malonyl CoA-acyl carrier protein transacylase, Beta-ketoacyl synthase, phosphopantetheine-binding domain-containing protein, sugar-binding lipoprotein | 46060 - 125088

25 | 1923; 1 - T1PKS | KS – AT – ACP – TE | Beta-ketoacyl synthase | 11630

26 | 5038; 1 - T1PKS | KS – AT – DH – C-Met – ER – KR – ACP | serine/threonine protein kinase, Beta-ketoacyl synthase, putative carboxymuconolactone decarboxylase | 18534

27 | 3541; 1 - NRPS | A | AMP-dependent synthetase and ligase, cytochrome P450 | 5990

28 | 6118; 1 - NRPS | C | AMP-dependent synthetase and ligase | 13952

29* | 4512; PKS | KS – T – Mal | -- |

30* | 5727; PKS-NRPS | AT – DH – C-Met – KR – TE – C – Trp – TE – Re – KS – KS | -- |

31* | 19239; PKS | KS – TE – KS – AT – DH – C-Met – KR | -- |

32* | 7313; PKS-NRPS | C – A – T – KS – A – TC | -- |

33* | 14785; NRPS | A – TE – E – A – TE – E – A – TE – TD | -- |

34* | 2479; PKS-NRPS | KS – AT – DH – KR – TE – C – A – | -- |

35* | 9680; PKS | KR – T – AT – DH – ER – KS – KS – KS – KS – KS – KS | -- |

36* | 15973; PKS | KS – AT – DH – ER – KR – TE | -- |

*: Detected by PRISM software


The number of PKS, NRPS and PKS-NRPS clusters detected for Torrubiella (Table 3.3) is very similar to the total number found for B. bassiana ARSEF 2860, but the number of NRPS clusters is about three times that detected for C. militaris CM01.

Table 3.3: Secondary metabolite gene clusters in-silico prediction for Torrubiella and related fungi. Additional data obtained from (Xiao et al., 2012)

CORE GENE | B. BASSIANA ARSEF 2860 | C. MILITARIS CM01 | M. ACRIDUM CQMA 102 | TORRUBIELLA SP. BCC 2165

PKS-NRPS | 12 | 9 | 13 | 8

NRPS | 13 | 5 | 13 | 16

PKS | 8 | 10 | 12 | 12

Terpene | 7 | 2 | 6 | 10

Total | 40 | 26 | 44 | 46

In bacteria, it was assumed that a larger genome allows more space for genes related to secondary metabolism, until Udwary et al. (2007) described the actinomycete Salinispora sp. (genome size 5 Mb), which devotes 10% of its genome to secondary metabolism, while Streptomyces sp. (genome size 8 Mb) dedicates only 8% of its genome to secondary processes. In addition, strains with similar genome sizes can show marked differences in their gene cluster predictions, so no correlation between genome size and the number of secondary metabolite gene clusters can be established. For example, a strain with a larger genome, such as Halomonas sp. S2151, has fewer predicted clusters than a strain with a smaller genome, such as Pseudomonas piscicida (Machado et al., 2015).

In fungi, a correlation can be found between genome size and adaptation to a parasitic lifestyle. Members of the Saccharomycetaceae family, both free-living and pathogenic, have lost most of their mobile elements and introns, while the largest ascomycete genomes belong to obligate parasitic fungi and contain a high proportion of transposable elements (Lynch et al., 2003). In Ascomycota there is a correlation between gene density and genome size for genomes smaller than 100 Mb, suggesting that the increase in gene number in medium-sized genomes is related to adaptive evolution, while the largest genomes are produced by genetic drift (Kelkar et al., 2012).


Several distinctive features were found in the gene cluster analysis. Cluster 5 contains an epimerization (E) domain, which converts the amino acid incorporated by the NRPS to its opposite configuration and participates in the elongation stages. Cluster 12 presents an NAD-binding site, which is also described as a TD domain. Cluster 17 presents a CAL domain, which works as an A domain. Cluster 24 appears to be a modular PKS, a type commonly found in bacteria. As mentioned in Chapter 1, the products of modular PKSs are easier to predict because every module corresponds to one cycle of ketide unit insertion and modification. In this particular case, it would be an HR-PKS because of the domains contained in the synthase.

Cluster 4, which is described as an NRPS-transAT-PKS, contains a trans-AT domain but no other PKS domain. The terms "trans-AT PKS" and "cis-AT PKS" were established for synthases with a free-standing AT domain or an AT domain integrated in the PKS, respectively. Trans-AT enzymes are also known as "AT-less PKSs", but there is no adequate corresponding term for cis-AT PKSs in this context. In contrast to cis-AT PKSs, the main trans-AT PKS protein lacks an AT domain and receives the acyl building block via a separate AT (Helfrich et al., 2016). For each gene cluster, there can be up to three AT activities, encoded as an individual gene, fused as tandem ATs to the main PKS or fused to an oxidoreductase domain as a trans-acting ER (Piel, 2010).

Cluster 8 encodes a type III PKS. These enzymes are described as homodimers of ketosynthases that catalyse the condensation of one or several extender units onto a starter substrate through iterative decarboxylative Claisen condensation reactions (Katsuyama and Ohnishi, 2012). Type III PKSs are much less abundant in fungi than type I PKSs; for example, Botrytis cinerea has sixteen type I PKS genes, nine NRPS genes and five PKS-NRPS genes, but only one type III PKS gene (BC1G_06032) (Hashimoto et al., 2014). In fungi, type III PKSs were first identified in A. oryzae RIB40, when Seshime et al. (2005) reported four type III PKS genes, called chalcone synthase genes csyA, csyB, csyC and csyD. Chalcones are composed of an enone and an aromatic ketone and form the central core of several important biological compounds (Figure 3.1) with anti-cancer and anti-diabetes properties (Mahapatra et al., 2015), together with anti-bacterial activities (Venkatesan and Maruthavanan, 2012).


Figure 3.1: Chalcone core and compounds derived from the basic core (A) Chalcone core (B) (C) CHO27 (D) A (E) (F) Flavokavain C

Similar chalcone synthase genes have been found in Magnaporthe grisea (two genes), Neurospora crassa (two genes), Fusarium graminearum (one gene) and Phanerochaete chrysosporium (three genes) (Seshime et al., 2005), so at least one type III PKS was expected in Torrubiella; in this case it was detected by antiSMASH.

3.4 PUTATIVE TORRUBIELLONE A GENE CLUSTER ANALYSIS

Contig 5044 contains Cluster 3, described as encoding an NRPS-T1PKS, with 80% of the genes in the cluster showing similarity to the genes involved in the synthesis of desmethylbassianin. Cluster 3 (contig 5044) thus appears to be the best candidate for involvement in the synthesis of torrubiellone A (Figure 3.2, D), considering the structural resemblance of this compound to tenellin (Figure 3.2, A), desmethylbassianin (Figure 3.2, B) and militarinone (Figure 3.2, C). Cluster 3 is therefore proposed as the torrubiellone A biosynthetic gene cluster, and the PKS-NRPS gene found in the cluster is designated torS (encoding the enzyme TORS).

Figure 3.2: Comparison between similar PKS-NRPS compounds (A) Tenellin (B) Desmethylbassianin (C) Militarinone (D) Torrubiellone A


To confirm the similarities between the ORFs predicted by antiSMASH for the putative torrubiellone A gene cluster in contig 5044 and other known PKS-NRPS gene clusters, nucleotide and protein BLAST analysis was performed (Table 3.4) on the Torrubiella whole genome database using the coding sequences of the genes required for tenellin biosynthesis, namely tenS (tenellin synthase), tenA (cytochrome P450), tenB (cytochrome P450) and tenC (enoyl reductase), together with their amino acid sequences (Eley et al., 2007; Halo et al., 2008). All four genes (Figure 3.3) were found to be located in the same genomic area (contig 5044) predicted by antiSMASH, each with an identity value of at least 70%.
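
The kind of filtering described above (keeping hits above an identity threshold and checking that they fall on the same contig) can be sketched in Python for the standard 12-column BLAST tabular output (`-outfmt 6`). The function name is hypothetical, and the query names, contig names and scores in the example rows are invented for illustration; they are not results from this work.

```python
import csv
import io
from collections import defaultdict

# Column names of the standard 12-column BLAST tabular output (outfmt 6)
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def hits_by_contig(tabular_text, min_identity=70.0):
    """Group query names by subject contig, keeping hits with
    percent identity greater than or equal to min_identity."""
    grouped = defaultdict(set)
    reader = csv.DictReader(io.StringIO(tabular_text),
                            fieldnames=COLUMNS, delimiter="\t")
    for row in reader:
        if float(row["pident"]) >= min_identity:
            grouped[row["sseqid"]].add(row["qseqid"])
    return dict(grouped)


# Hypothetical rows (all values invented for illustration)
rows = [
    ("query1", "contig_X", "83.0", "500", "85", "0", "1", "500", "1000", "2500", "0.0", "900"),
    ("query2", "contig_X", "75.4", "530", "130", "2", "1", "530", "3000", "4600", "0.0", "800"),
    ("query3", "contig_Y", "41.0", "120", "70", "3", "10", "130", "50", "410", "1e-10", "60"),
]
example = "\n".join("\t".join(r) for r in rows)
print(hits_by_contig(example))
```

Queries that end up grouped under a single contig are candidates for belonging to one cluster; hits below the threshold, like the third row here, are dropped.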

Table 3.4: % aa identity between TorS, DmbS, MilS and TenS protein sequences.

TorS | TorS | DmbS (Bbas) | MilS (Cm) | TenS (Bbas)
TorS | X | 72.33 | 70.62 | 70.18
DmbS | 72.33 | X | 72.4 | 87.26
MilS | 70.62 | 72.4 | X | 70.6
TenS | 70.18 | 70.62 | 70.6 | X

TorA | TorA | DmbA | MilA | TenA
TorA | X | 83.04 | 87.77 | 83.04
DmbA | 83.04 | X | 84.84 | 99.02
MilA | 87.77 | 84.84 | X | 84.45
TenA | 83.04 | 99.02 | 84.45 | X

TorB | TorB | DmbB | MilB | TenB
TorB | X | 76.07 | 69.38 | 75.36
DmbB | 76.07 | X | 68.82 | 95.46
MilB | 69.38 | 68.82 | X | 76.63
TenB | 75.36 | 95.46 | 76.63 | X

TorC | TorC | DmbC | MilC | TenC
TorC | X | 74.09 | 61.24 | 72.99
DmbC | 74.09 | X | 65.88 | 90.46
MilC | 61.24 | 65.88 | X | 65.89
TenC | 72.99 | 90.46 | 65.89 | X

TorD | TorD | NADH:oxidase (Bbas ARSEF) | FMN oxidase (Cm) | NADH flavin oxidoreductase (Gg)
TorD | X | 49.11 | 49.22 | 52.88
NADH:oxidase (Bbas ARSEF) | 49.11 | X | 79.04 | 56.14
FMN oxidase (Cm) | 49.22 | 79.04 | X | 56.6
NADH flavin oxidoreductase (Gg) | 52.88 | 56.14 | 56.6 | X
Gg: Glomerella graminicola

C6-Transcription Factor (C6-TF) | C6-TF (Tor) | C6-TF (Bbas ARSEF) | C6-TF (Cm) | C6-TF (Aspergillus clavatus)
C6-TF (Tor) | X | 75.31 | 77.17 | 43.63
C6-TF (Bbas ARSEF) | 75.31 | X | 76.73 | 42.38
C6-TF (Cm) | 77.17 | 76.73 | X | 43.97
C6-TF (Aspergillus clavatus) | 43.63 | 42.38 | 43.97 | X

Zn Finger Domain-Transcription Factor (Zn-TF) | C6-Zn (Tor) | C6-Zn (Cm) | C6-Zn (Bbas ARSEF) | Putative Zn-TF (Bbas ARSEF)
Zn-TF (Tor) | X | 56.24 | 60.16 | 21.25
C6-Zn (Cm) | 56.24 | X | 56.13 | 22.1
C6-Zn (Bbas ARSEF) | 60.16 | 56.13 | X | 23.1
Putative Zn-TF (Bbas ARSEF) | 21.65 | 22.1 | 23.1 | X
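
As an illustration of how a percentage identity of the kind listed in Table 3.4 can be derived from a pairwise alignment, the following Python sketch counts matching residues over ungapped alignment columns. This is one common convention and an assumption here; alignment programs differ in how they treat gaps and which length they use as the denominator, so the values need not match those reported by a given tool. The toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither
    sequence has a gap (assumed convention, see lead-in)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned peptides (not taken from the tor cluster)
print(percent_identity("MKT-LLA", "MKSALL-"))
```

Under this definition the measure is symmetric, so the two halves of each matrix are expected to mirror each other.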

Considering the results above, a relation between ORFs and function was proposed (Table 3.5):

Table 3.5: Proposed torrubiellone A gene cluster and homologous genes in other organisms.

GENE | EQUIVALENT | FUNCTION

torS | dmbS, tenS, milS | Synthase

torA | dmbA, tenA, milA | Cytochrome P450. Ring expansion

torB | dmbB, tenB, milB | Cytochrome P450. Hydroxylation

torC | dmbC, tenC, milC | Enoyl reductase

torD | NADH:Oxidase (Bbas ARSEF), FMN oxidase (C. militaris) | Putatively, addition of an OH in the benzene ring moiety of torrubiellone

torE | No B. bassiana nor C. militaris equivalent | Putatively, hydroxymethylation


Figure 3.3: ORF order of the putative torrubiellone A gene cluster.

Protein BLAST results identified TORD as an Old Yellow Enzyme (OYE). Various functions have been described for this type of enzyme, but its true physiological role is not well defined. NADPH is assumed to be the main OYE reductant, and substrates are limited to unsaturated aldehydes, ketones and cyclic enones (Williams et al., 2002). Activities of homologous OYE proteins (Figure 3.4) have been described in yeast, bacteria, plants and even nematodes; it is assumed that an unsaturated carbonyl functional group is needed for these enzymes to work properly. Saccharomyces carlsbergensis (Saito et al., 1991) and Saccharomyces cerevisiae (Niino et al., 1995) contain pairs of closely related OYE genes, and two homologous OYE genes have also been identified in Schizosaccharomyces pombe by genome sequencing (Wood et al., 2002).

Figure 3.4: Examples of OYE activities. (A) Modification of trinitrotoluene by PETN reductase, described in Enterobacter cloacae PB2 (B) Reduction of N-ethylmaleimide to N-ethylsuccinimide by N-ethyl-maleimide reductase, identified in E. coli (C) Reduction of 9R,13R-12-oxophytodienoate by 12-oxophytodienoate reductase in tomato plant. From Williams et al., 2002

According to a BLAST search, TORE is very similar to CYP52X1, one of the CYP52 cytochrome P450s described in B. bassiana by Pedrini et al. (2010). Zhang et al. (2012) subsequently linked CYP52X1 to fatty acid assimilation in B. bassiana; using heterologous expression in yeast, they showed that the cytochrome had high activity on medium-sized fatty acids, adding a terminal hydroxyl group to the substrate (Figure 3.5).


Figure 3.5: CYP52X1 enzymatic function. Gas chromatography of metabolites generated in yeast expressing CYP52X1, incubated with oleic acid. From Zhang et al., 2012

Fungi may use a range of P450s to participate in fatty-acid and alkane metabolism, and in this case no significant difference was observed between a Δcyp52x1 mutant and the wild type in conidiation and growth under standard conditions (Zhang et al., 2012). Given the similarity between CYP52X1, whose hydroxylase activity adds a terminal hydroxyl group to various fatty acids and epoxides, and the TORE enzyme, it is a reasonable assumption that TORE would add a hydroxyl group to the torrubiellone compounds.

According to BLAST searches, the other ORFs detected in the cluster (Figure 3.6) correspond to an integral membrane protein (MVP17/PMP22), a major facilitator family protein (MFS), a muramidase (MUR), a phosphoglycerate kinase (PGK), a sterol desaturase and a tetratricopeptide repeat (TPR) protein.

Figure 3.6: ORFs detected by BLAST search for contig 5044 in Torrubiella sp. BCC2165.

TPR motifs are known to facilitate interactions between proteins and are important for the function of chaperones and the transcription process in S. cerevisiae, and for protein transport complexes in S. cerevisiae and Neurospora crassa, among others (Blatch and Lassle, 1999). Given this function, the TPR protein should not participate in the synthesis of any compound. Muramidase is considered an extracellular bacteriolytic enzyme that catalyses the dissolution of Bacillus subtilis cell walls, releasing muramyl reducing groups, and has been described for Schizophyllum commune and Gliomastix murorum (Grant et al., 1990). This bacteriolytic function can be considered a defence mechanism of the fungus, but it is unlikely to participate in torrubiellone production. The phosphoglycerate kinase (PGK) participates in glycolysis, transferring a phosphate group from 1,3-bisphosphoglycerate to ADP to yield 3-phosphoglycerate and ATP (Harrier et al., 1998). PGK amino acid sequences are highly conserved among fungi, and the enzyme should not participate in the production of torrubiellone A. The MFS is one of the two largest families of membrane transporters, present in bacteria, archaea and eukaryotes. Compounds transported by MFS permeases range from simple sugars to drugs, amino acids, nucleosides and a large variety of inorganic and organic compounds (Pao et al., 1998). In Penicillium funiculosum, an MFS transporter is involved in acid resistance and intracellular pH homeostasis (Xu et al., 2014). Such a protein could participate in the transport of the molecule from one location to another, but would not itself modify the core structure of the compound. In summary, none of these genes, other than those flanked by the transcription factor genes, is expected to participate in torrubiellone A biosynthesis, with the exception of the sterol desaturase, which cannot be ruled out.

3.4.1 TORRUBIELLONE SYNTHASE ANALYSIS

TORS is described as a fused PKS-NRPS. The PKS moiety contains the core elongation domains KS, AT and ACP, a C-Met domain, and the reductive domains keto-acyl reductase (KR), DH and an ER0; it is fused to the NRPS moiety, which consists of C, A, PCP and TD domains. The ER activity is expected to be provided in trans, as occurs in the tenellin and desmethylbassianin gene clusters, in which the ER domain within the PKS-NRPS is inactive (ER0) (Eley et al., 2007). The trans-acting enzyme that replaces the ER domain is usually encoded next to the synthase; known examples are TENC (tenellin) and DMBC (desmethylbassianin). For torrubiellone, the enzyme is named TORC, its genomic coding sequence torC and its protein sequence TorC; the same nomenclature is used throughout this work. Domain protein sequences from TorS, TenS and DmbS were compared (Table 3.6).

The domains AT, DH, C-Met, KR, C, A, PCP and TD were located in the same reading frame, while a functional ER domain was not detected within the synthase but was detected elsewhere in the gene cluster. The ACP domain was not detected by antiSMASH, but comparison of the TorS protein sequence with the ACP domain of TenS locates it between the KR of the PKS and the C domain of the NRPS (Figure 3.7, Table 3.7). antiSMASH predicts the specificity of the AT domain as methylmalonyl-CoA, while the different prediction methods included in antiSMASH predicted that the NRPS moiety incorporates leucine, tryptophan (Trp) or tyrosine (Tyr). Judging from the structure of torrubiellone A, the correct amino acid should be tyrosine (or tryptophan transformed into tyrosine), as in tenellin and desmethylbassianin.


Table 3.6: % aa identity between domains of TorS, DmbS and TenS protein sequences

%aa KS | KS-DmbS | KS-TorS | KS-TenS
KS-DmbS | 100 | 86.90 | 87.69
KS-TorS | 86.90 | 100 | 85.00
KS-TenS | 87.69 | 85.00 | 100

%aa AT | AT-DmbS | AT-TorS | AT-TenS
AT-DmbS | 100 | 83.18 | 80.37
AT-TorS | 83.18 | 100 | 87.67
AT-TenS | 80.37 | 87.67 | 100

%aa KR | KR-DmbS | KR-TorS | KR-TenS
KR-DmbS | 100 | 84.66 | 87.67
KR-TorS | 84.66 | 100 | 79.55
KR-TenS | 87.67 | 79.55 | 100

%aa ERo | ERo-DmbS | ERo-TorS | ERo-TenS
ERo-DmbS | 100 | 57.97 | 87.67
ERo-TorS | 57.97 | 100 | 57.83
ERo-TenS | 87.67 | 57.83 | 100

%aa Cmet | Cmet-DmbS | Cmet-TorS | Cmet-TenS
Cmet-DmbS | 100 | 78.18 | 75.76
Cmet-TorS | 78.18 | 100 | 87.67
Cmet-TenS | 75.76 | 87.67 | 100

%aa DH | DH-TorS | DH-DmbS | DH-TenS
DH-TorS | 100 | 69.11 | 70.68
DH-DmbS | 69.11 | 100 | 87.67
DH-TenS | 70.68 | 87.67 | 100

%aa C | C-TorS | C-DmbS | C-TenS
C-TorS | 100 | 77.63 | 72.88
C-DmbS | 77.63 | 100 | 87.67
C-TenS | 72.88 | 87.67 | 100

%aa A | A-DmbS | A-TorS | A-TenS
A-DmbS | 100 | 73.33 | 92.3
A-TorS | 73.33 | 100 | 73.33
A-TenS | 92.38 | 73.33 | 100

%aa PCP | PCP-DmbS | PCP-TorS | PCP-TenS
PCP-DmbS | 100 | 83.56 | 87.67
PCP-TorS | 83.56 | 100 | 82.19
PCP-TenS | 87.67 | 82.19 | 100

%aa TD | TD-DmbS | TD-TorS | TD-TenS
TD-DmbS | 100 | 75.95 | 87.64
TD-TorS | 75.95 | 100 | 71.61
TD-TenS | 87.64 | 71.61 | 100

Table 3.7: Domains predicted for TorS protein sequence and their probable position within the synthase.

DOMAIN AA

KS 17-454

AT 587-908

DH 987-1178

C-Met 1474-1657

ER0 1788-2251

KR 2196-2370

ACP 2489-2655

C 2710-3003

A 3195-3698

PCP 3715-3784

TD 3835-4072

Figure 3.7: Predicted domains in the coding sequence of torS. (A) antiSMASH domain prediction; (B) PKS domains relative position in the synthase, including ACP, not detected by antiSMASH.
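
The domain coordinates in Table 3.7 can be checked for consistency with a short Python sketch. The boundaries are copied from the table; the function names are hypothetical. The check simply reports pairs of predicted domains whose amino acid ranges overlap, which may reflect approximate prediction boundaries rather than a real structural overlap.

```python
# Domain boundaries (amino acid positions) copied from Table 3.7
TORS_DOMAINS = {
    "KS": (17, 454), "AT": (587, 908), "DH": (987, 1178),
    "C-Met": (1474, 1657), "ER0": (1788, 2251), "KR": (2196, 2370),
    "ACP": (2489, 2655), "C": (2710, 3003), "A": (3195, 3698),
    "PCP": (3715, 3784), "TD": (3835, 4072),
}


def domain_length(bounds):
    """Length of an inclusive (start, end) range in residues."""
    start, end = bounds
    return end - start + 1


def overlapping_pairs(domains):
    """Return pairs of domain names whose inclusive ranges overlap."""
    ordered = sorted(domains.items(), key=lambda item: item[1][0])
    pairs = []
    for i, (name_a, (_, end_a)) in enumerate(ordered):
        for name_b, (start_b, _) in ordered[i + 1:]:
            # Domains are sorted by start, so an overlap exists
            # whenever the later domain starts before the earlier ends.
            if start_b <= end_a:
                pairs.append((name_a, name_b))
    return pairs


print(overlapping_pairs(TORS_DOMAINS))
print(domain_length(TORS_DOMAINS["ER0"]))
```

With the values in Table 3.7, the only overlapping pair is ER0 (1788-2251) and KR (2196-2370).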


Comparison with the tenellin synthase gene indicates that the torS genomic region contains two introns. With introns, torS comprises 12644 bp, compared with 12492 bp for the intron-less version.

Intron 1 is 80 bp long, with the sequence GTTTCTTCCCATGCTCTGATTTCCCTTTACAATTCTGTGAAGCGTTTCTTTCATGAC(TAA)CTGGAAGTCTGAATAG. It is located in the KS domain, has GT/AG borders and lies in the same reading frame, but includes one stop codon (shown in parentheses).

Intron 2 is 72 bp long, with the sequence GTAAGCTATTTTCCAGTTTTTCTCTCCGACTTGTCGAAACCAGAGAGCGCGAATACTAATCGGAAACTGCAG. It is located in the C-Met domain, has GT/AG borders and also lies in the same reading frame, with no stop codon included. With the introns removed, a phylogenetic tree (Figure 3.9) based on the amino acid sequences of known PKSs (and their cognate products) was computed.
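
The intron checks described above (GT/AG borders, stop codons in a given frame, and removal of the introns) can be sketched in Python. The function names and the toy gene are hypothetical; the coordinates are 0-based and half-open, and the frame passed to `stops_in_frame` is relative to the start of the intron, which is an assumption of this sketch rather than the procedure used in DNAman.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_gt_ag_borders(intron):
    """True if the intron starts with GT and ends with AG."""
    seq = intron.upper()
    return seq.startswith("GT") and seq.endswith("AG")


def stops_in_frame(intron, frame=0):
    """0-based positions of stop codons read in the given frame."""
    seq = intron.upper()
    return [i for i in range(frame, len(seq) - 2, 3)
            if seq[i:i + 3] in STOP_CODONS]


def splice(gene, introns):
    """Remove introns given as (start, end) half-open coordinates."""
    pieces = []
    cursor = 0
    for start, end in sorted(introns):
        pieces.append(gene[cursor:start])
        cursor = end
    pieces.append(gene[cursor:])
    return "".join(pieces)


# Hypothetical toy gene: exon + intron + exon (not the torS sequence)
toy_intron = "GTCCTAACAG"
toy_gene = "ATGGCC" + toy_intron + "GGTTGA"

print(has_gt_ag_borders(toy_intron))
print(stops_in_frame(toy_intron, frame=1))
print(splice(toy_gene, [(6, 16)]))

# Length bookkeeping with the figures quoted in the text
print(12644 - (80 + 72))
```

The last line reproduces the arithmetic in the text: removing introns of 80 bp and 72 bp from the 12644 bp genomic sequence leaves 12492 bp.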

The phylogenetic tree based on PKS-NRPS protein sequences shows a relationship between TenS (pentaketide), DmbS (hexaketide), MilS (heptaketide) and TorS (hexaketide). Apart from sharing a very similar structure, they are related in their amino acid sequence and domain organization (Figure 3.8).

Figure 3.8: Domain organization and genomic context of several PKS-NRPS

PKS, PKS-NRPS1, LNKS and LDKS from A. terreus, MokA from Monascus pilosus, CNKS from Penicillium citrinum, EqiS from F. heterosporum, DmbS and TenS from B. bassiana, FusS from F. moniliforme and TorS from Torrubiella sp. BCC2165. Adapted from Wang et al., 2012.


Figure 3.9: Phylogenetic tree comparing protein sequences of known PKS and PKS-NRPS with TorS. The PKS-NRPS for torrubiellone A is closely related to a putative PKS from C. militaris and to the tenellin and desmethylbassianin synthases. APS, bassianolide and beauvericin are not PKSs and were used only as controls.


The fourth and fifth most closely related protein sequences to TorS were those responsible for fusarin C and pseurotin synthesis, respectively. Fusarin C, isolated from Fusarium moniliforme, is considered an estrogenic agonist and stimulates breast cancer cells in vitro (Sondergaard et al., 2011), and heterologous expression of its biosynthetic genes has already been achieved in A. oryzae (Song et al., 2004). Pseurotin A, isolated from Aspergillus fumigatus (Maiya et al., 2007), was described as an inhibitor of immunoglobulin E production (Ishikawa et al., 2009). Nonetheless, structurally they are not related to any torrubiellone compound.

Regarding the NRPS moiety of TorS, the A domain (Table 3.8) was compared with the aspyridone synthetase (ApdA), TenS and DmbS protein sequences, and it presents the conserved residues commonly found in adenylation domains (Boettger et al., 2012).

Table 3.8: Comparison of conserved amino acids of different A domains. Tyr (Y): Tyrosine, D: Aspartic acid, M: Methionine, V: Valine, I: Isoleucine, T: Threonine, W: Tryptophan, C: Cysteine, A: Alanine, K: Lysine.

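SEQUENCE | aa | 173 | 174 | 177 | 216 | 239 | 241 | 272 | 280 | 281 | 395

APDA | Tyr | D | M | V | I | Y | W | C | A | A | K

TENS | Tyr | D | M | V | I | T | W | C | A | A | K

DMBS | Tyr | D | M | V | I | T | W | C | A | A | K

TORS | Tyr | D (175) | M (176) | V (179) | I (218) | T | W | C | A | A | K (394)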
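
The comparison in Table 3.8 can be expressed as a short Python sketch that lists the positions at which two A domains differ in their conserved residues. The residues and column positions are copied from the table; the dictionary layout and function name are hypothetical.

```python
# Column positions and conserved residues copied from Table 3.8
POSITIONS = [173, 174, 177, 216, 239, 241, 272, 280, 281, 395]
RESIDUES = {
    "APDA": "DMVIYWCAAK",
    "TENS": "DMVITWCAAK",
    "DMBS": "DMVITWCAAK",
    "TORS": "DMVITWCAAK",
}


def differing_positions(reference, other):
    """Return (position, reference residue, other residue) for every
    table column at which the two A domains differ."""
    return [(pos, a, b)
            for pos, a, b in zip(POSITIONS, RESIDUES[reference], RESIDUES[other])
            if a != b]


print(differing_positions("TENS", "TORS"))
print(differing_positions("APDA", "TORS"))
```

With the table values, TorS matches TenS and DmbS at every listed position and differs from ApdA only in the column headed 239 (Y in ApdA, T in the others).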

3.4.2 GENOMIC ANALYSIS OF THE ORFS LOCATED ON THE TORRUBIELLONE A GENE CLUSTER

First, it was necessary to establish whether the torD and torE coding sequences are fused in a single gene or represent two different genes. Only 17 bp separate the stop codon of torD from the start codon of torE (Figure 3.10), a distance short enough for the two ORFs to be joined by intron splicing. It is not known whether either or both have a role in torrubiellone A synthesis. Both ORFs present an ATG codon, but it is not known whether this codon encodes an N-terminal or an internal methionine, or indeed whether it lies in an intron and is absent from the mature mRNA.


Figure 3.10: Distance between torD and torE. The 17 bp between the stop codon of torD and the start codon of torE are shaded in grey.

PCR was used to amplify the coding regions of torD (primers torD-F and torD-R) and torE (primers torE-F and torE-R) from both genomic DNA and cDNA isolated from Torrubiella. Combining the primers torD-F and torE-R yielded a PCR product from genomic DNA but not from cDNA (Figure 3.11), indicating that the two genes are transcribed as separate mRNAs. This would imply that the torE promoter lies within the torD coding region, a possibility that will be investigated using a reporter gene approach.

Figure 3.11: PCR products from genomic (g) and complementary (c) DNA. The torD and torE genes were amplified separately from genomic and complementary DNA. When torD-F and torE-R were combined, a PCR product was obtained from genomic DNA but not from cDNA, indicating that the two genes are transcribed as separate mRNAs. D: torD; E: torE; D+E: torD-F and torE-R.

The results confirm the presence of two extra genes in the torrubiellone A gene cluster, but give no indication of their function beyond the BLAST results. Torrubiellone A presents two distinctive features: a hydroxymethylation, and a hydroxylation of the benzene ring together with saturation of the ring. These changes are possibly produced by the TORE and TORD enzymes.

Nucleotide BLAST analysis using torD as query sequence identified a homologue defined as NADH:oxidase in B. bassiana, although not associated with the tenellin/desmethylbassianin gene cluster, while torE did not identify any equivalent gene in C. militaris or B. bassiana. Currently the functions of torD and torE in Torrubiella are not known.


3.4.3 PKS-NRPS GENE CLUSTER COMPARISON

A visual comparison between the torrubiellone, militarinone and tenellin gene clusters, as aligned in Figure 3.12, shows a common structure, gene order and orientation with respect to the megasynthase gene (dark blue) and the tailoring enzyme genes designated A (yellow), B (pink) and C (light blue).

Insertion of the additional ORFs in the (putative) torrubiellone cluster does not change this overall gene order, which is flanked in all cases by two ORFs encoding transcription factors (brown and orange), supporting the hypothesis that torrubiellone A, structurally related to militarinone, tenellin and desmethylbassianin, shares a common gene cluster pattern for PKS-NRPS synthesis.

Figure 3.12: Comparison between torrubiellone, militarinone and tenellin gene clusters Homologous genes can be found among the three gene clusters (highlighted).

Structurally, torrubiellone A has two distinctive features (Figure 3.13). First, compared to tenellin, a phenyl ring is present, labelled from C-1’ to C-6’, instead of the benzoyl group present in tenellin and desmethylbassianin, together with an OH group at C-1’. Second, a hydroxyl group is added to the terminal methyl group (hydroxymethylation) at C-17, at the end of the PK moiety.

Figure 3.13: Torrubiellone A structure


Unlike the torrubiellone gene cluster, the tenellin cluster does not contain any equivalent of the torD and torE coding sequences, so these genes could be responsible for the structural differences. The sterol desaturase, found in Torrubiella and Cordyceps, could transform the benzyl moiety into a cyclohexane, but its gene does not lie between the transcription factor genes, so initially it will not be considered part of the gene cluster.

The militarinone A gene cluster contains the sterol desaturase and a torD homologue, but a nucleotide BLAST search found no torE equivalent. The only structural difference between torrubiellone A and militarinone is the hydroxymethylation, which sheds light on the putative functions of the two extra genes: TORE could participate in the hydroxymethylation of the polyketide structure, while TORD could participate in the modification of the amino acid part.

A good example in which not all the genes contained in a cluster are necessary to obtain the final metabolite is the aspyridone gene cluster from A. nidulans, composed of an hrPKS (APDA), two cytochromes P450 (APDB and APDE), an enoyl reductase (APDC), an FAD-dependent monooxygenase (APDD), a dehydrogenase (APDG), an exporter (APDF) and a transcription factor (APDR). Heterologous expression of the aspyridone gene cluster (Wasil et al., 2013) demonstrated that only the APDA, APDE and APDC enzymes were necessary for the production of aspyridone A.

There is a clear structural resemblance (Figure 3.14) between torrubiellone A and aspyridone, which differ only in the three hydroxyl groups present in torrubiellone and in chain length (tetraketide for aspyridone, hexaketide for torrubiellone). However, a BLAST search in the Torrubiella web server using apdA, apdE and apdC as query nucleotide sequences did not return any hits.

Figure 3.14: Aspyridone and torrubiellone A structures. (a) Aspyridone is a saturated PKS-NRPS tetraketide. (b) Torrubiellone A is an unsaturated PKS-NRPS hexaketide.

3.4.4 TORRUBIELLONE A GENE CLUSTER INTRON ANALYSIS

From the BLAST search analysis and alignment with the homologous tenellin and desmethylbassianin biosynthetic genes, introns were identified (Table 3.9).

Table 3.9: Putative introns located in genes belonging to the biosynthetic torrubiellone A gene cluster

Gene    gDNA (bp)    Introns    cDNA (bp)
torA    1661         2          1527
torB    1673         1          1611
torC    1175         0          1175
torD    1305         1          1233
torE    1237         ?          ?
torS    12492        2          12340

Except for torC, all the genes of the torrubiellone A gene cluster contain at least one intron, at positions corresponding to those of the introns in their homologous genes, as shown by comparison between gDNA and cDNA sequences. All the introns located in Torrubiella were flanked by the consensus boundaries GT/AG (Burset et al., 2000), which facilitated their identification.
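The combined intron length of each gene follows from Table 3.9 as the gDNA length minus the cDNA length. The following is a minimal Python sketch of that subtraction, assuming the table values are sequence lengths in base pairs. The dictionary and function names are illustrative only, and torE is left out because its intron count and cDNA length are not given.

```python
# Values copied from Table 3.9 (assumed to be lengths in bp):
# gene -> (gDNA length, number of introns, cDNA length)
TABLE_3_9 = {
    "torA": (1661, 2, 1527),
    "torB": (1673, 1, 1611),
    "torC": (1175, 0, 1175),
    "torD": (1305, 1, 1233),
    "torS": (12492, 2, 12340),
}


def total_intron_length(gdna_len, cdna_len):
    """Combined length of all introns: gDNA length minus cDNA length."""
    if cdna_len > gdna_len:
        raise ValueError("cDNA cannot be longer than gDNA")
    return gdna_len - cdna_len


# Combined intron length per gene, in bp
intron_bp = {
    gene: total_intron_length(gdna, cdna)
    for gene, (gdna, _introns, cdna) in TABLE_3_9.items()
}

for gene, bp in intron_bp.items():
    print(gene, bp)
```

A gene without introns, such as torC, gives a difference of zero, which is a quick consistency check on the table.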

When introns were removed from the sequences with DNAman software, BLAST searches using the corresponding protein sequences also matched every Torrubiella protein to its homologous enzyme. torB has an intron in the same place as tenB, but TENB did not work properly when heterologous expression in A. oryzae was attempted (Eley et al., 2007). The conserved base pairs flanking the intron in tenB were unusual, with GC/AG flanks instead of the more common GT/AG. In the case of torrubiellone, the torB flanks are GT/AG.
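The GT/AG rule used above can be checked mechanically once intron coordinates are known. The following is a minimal Python sketch, assuming 0-based, end-exclusive intron coordinates on the coding strand. The function names and the toy sequence are hypothetical and are not Torrubiella data.

```python
def intron_flanks(gdna, start, end):
    """Return the first two and the last two bases of the intron gdna[start:end]."""
    intron = gdna[start:end].upper()
    if len(intron) < 4:
        raise ValueError("intron too short to have separate donor and acceptor sites")
    return intron[:2], intron[-2:]


def is_canonical(gdna, start, end):
    """True when the intron starts with GT and ends with AG."""
    return intron_flanks(gdna, start, end) == ("GT", "AG")


def splice(gdna, introns):
    """Remove introns (sorted, non-overlapping) and return the joined exons."""
    parts, pos = [], 0
    for start, end in introns:
        parts.append(gdna[pos:start])
        pos = end
    parts.append(gdna[pos:])
    return "".join(parts)


# Made-up example: exon ATGGCC, intron GTAAGTTTTCAG, exon GATTAA
toy = "ATGGCC" + "GTAAGTTTTCAG" + "GATTAA"
print(is_canonical(toy, 6, 18))
print(splice(toy, [(6, 18)]))
```

An intron with GC/AG flanks, as reported for tenB, would return False from `is_canonical`, so such cases can be flagged for manual inspection.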

3.5 GENE CLUSTER AMPLIFICATION FROM TORRUBIELLA SP. BCC2165

All the proposed genes of the gene cluster could be amplified from gDNA, confirming their presence in Torrubiella sp. BCC2165, using 20 bp primers (torA-F/R; torB-F/R; torC-F/R; torD-F/R, torE-F/R) starting from the predicted ATG start codon site and ending with the corresponding stop codon, according to sequencing data.

Apart from the torD and torE ORFs, what distinguishes the torrubiellone gene cluster from the tenellin and desmethylbassianin gene clusters is the order of the other ORFs in contig 5044 outside the cluster, i.e. TGK, Mur and SD, among others. As a starting point, it was assumed that only the genes flanked by the TFs should be sufficient for torrubiellone A biosynthesis (Figure 3.15). In the militarinone and desmethylbassianin gene clusters, the unrelated genes are located upstream of the synthase, next to the ZnTF, while in Torrubiella they appear to be next to the C6TF, downstream of the synthase.

To rule out a mistake in the sequence assembly, primers were designed from the mfs gene to the trichoglycopeptide gene (A) and from the muramidase to the ZnTF (B). PCR results confirmed that the genome sequence was correct and that the genes are downstream of the synthase, in contrast to the other clusters.

Figure 3.15: Contig 5044 analysis. Mur: Muramidase, Mpv17/PMP22: Integral Membrane Protein, pg-kinase: Phosphoglyceratekinase, MFS: Major Facilitator Superfamily. (A) Gene order given by sequencing (B) Expected gene order by comparison with other gene clusters.

3.6 BIOSYNTHETIC PATHWAY PROPOSAL FOR TORRUBIELLONE A

The full metabolic pathway proposal will be based on other PKS-NRPS assembly processes whose biosynthetic pathways have already been described, such as tenellin (Halo et al., 2008) and aspyridone, together with information about other structurally related compounds whose biosynthesis has not yet been studied.

Tenellin, desmethylbassianin and bassianin use tyrosine in the NRPS part. Supported by the antiSMASH prediction made for the synthase located in contig 5044, it is most likely that torrubiellone biosynthesis uses tyrosine, rather than phenylalanine, as the amino acid, which is then modified. In the NRPS part of the main synthase, the amino acid tyrosine is activated by the A domain (Figure 3.16) to be fused to the PK moiety after its synthesis.

Figure 3.16: Activation of the amino acid tyrosine by the A domain. The amino acid is then held by the PCP domain of the NRPS part.

The biosynthesis of desmethylbassianin is expected to proceed in the same way as that of tenellin, and torrubiellone A synthesis is also expected to follow the same order and route as these compounds (Figure 3.17).

Figure 3.17: Biosynthesis of the PK moiety by the synthase (TENS / TORS / DMBS / APDA) and the trans-enoyl reductase (TENC / TORC / DMBC / APDC). SAM: S-adenosylmethionine.

The pentaketide tenellin requires an acetyl-CoA as starter molecule, four malonyl-CoA molecules as extender units and two SAM molecules for the two required methylations. The structure also shows that the synthase is an HR-PKS, with reductive steps giving the double bonds (KR-DH) and full saturation (KR-DH-ER). Following the same idea, the minimum aspyridone PK requires acetyl-CoA, only three malonyl-CoA molecules (because it is a tetraketide) and two SAM molecules. Torrubiellone would require acetyl-CoA as starter, five malonyl-CoA molecules (to produce a hexaketide) and a single SAM molecule for the methylation.

As mentioned in a previous chapter, these PKS-NRPS are iterative type I enzymes, so the presence of a domain does not indicate how many modifications it performs on the compound (e.g. a C-Met domain does not indicate the number of methylations). It is known that the C-Met domain is in charge of the methylation pattern, while the KR is in charge of chain length programming. A BLAST search did not describe the release domain of the torrubiellone synthase as a Dieckmann cyclisation domain, but the fusion between the NRP and the PK parts (Figure 3.18), together with the release of the molecule, should follow the same pathway as in tenellin biosynthesis.
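The substrate counts given above for the polyketide chain follow a simple rule: one acetyl-CoA starter, one malonyl-CoA extender for each remaining ketide unit, and one SAM per C-methylation. The following is a minimal Python sketch of that bookkeeping. The function name and the dictionary layout are illustrative assumptions, and the chain lengths and methylation counts are those stated in the text.

```python
def pk_building_blocks(n_ketide, n_methylations):
    """Substrates for an n-ketide chain built from an acetyl-CoA starter.

    One starter unit, n - 1 malonyl-CoA extender units and one SAM
    per C-methylation.
    """
    if n_ketide < 1:
        raise ValueError("a polyketide needs at least the starter unit")
    if n_methylations < 0:
        raise ValueError("number of methylations cannot be negative")
    return {
        "acetyl-CoA": 1,
        "malonyl-CoA": n_ketide - 1,
        "SAM": n_methylations,
    }


# (chain length, number of methylations) as stated in the text
EXAMPLES = {
    "tenellin": (5, 2),       # pentaketide, two methylations
    "aspyridone": (4, 2),     # tetraketide, two methylations
    "torrubiellone": (6, 1),  # hexaketide, one methylation
}

for name, (n_ketide, n_methyl) in EXAMPLES.items():
    print(name, pk_building_blocks(n_ketide, n_methyl))
```

The rule only counts substrates; it says nothing about which iteration is methylated or reduced, which is determined by the programming of the iterative synthase.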

Figure 3.18: PK-NRP fusion, attached to the PCP domain of the cognate NRPS. Tenellin scheme from Fisch, 2013.

In the three examples mentioned in Figure 3.16, the ER domain within the synthase is inactive (ER°) and a trans-acting enoyl reductase, such as TENC or DMBC, is required instead. Torrubiella does not appear to be an exception to the rule, as the torC ORF has a high percentage homology with tenC and dmbC. TENC and DMBC have been demonstrated to be interchangeable (Fisch et al., 2011) without influencing the final product.

In heterologous expression, when APDA was expressed alone (without APDC), pyridines with different chain lengths were synthesized, and when TENS was not coexpressed with TENC, compounds with different chain lengths and methylation patterns were obtained. There is no reason to believe that torrubiellone would behave differently, so the minimum molecule should be produced by the TORS synthase and the TORC trans-enoyl reductase. The molecule most similar to torrubiellone, desmethylbassianin, has the same gene order and domain architecture as tenellin.

After the PK-NRP fusion, the molecule is released as the first precursor of the final compound. In the case of tenellin and aspyridone, these would be pre-tenellin A and aspyridone A. For torrubiellone, another precursor, prior to torrubiellone D, seems to be needed, and it should not carry the hydroxylation at C-17 (Figure 3.19), because no hydroxylation at that position has been described as being made solely by the synthase and enoyl reductase. The molecule may be unstable, hindering its isolation from wild-type Torrubiella, so its formation cannot be ruled out. The molecule will be named pre-torrubiellone D.

Figure 3.19: Pre-torrubiellone D

According to Isaka et al. (2010), torrubiellones A to D all possess the hydroxylation at C-17, so another enzyme, such as an extra cytochrome P450, could be producing the modification. The work of Isaka et al. supports the idea that the OH group was added before the ring expansion step, which is the difference between torrubiellone D and torrubiellone C. The only similar modifications found in related structures are those occurring after the ring expansion in tenellin, but they were considered to be shunt metabolites (Figure 3.20).

Figure 3.20: Biosynthetic pathways from pre-tenellin B to tenellin and/or shunt metabolites. From Halo et al., 2008.

Figure 3.20 shows interesting features, in which the action of an unknown enzyme (probably another cytochrome P450) is able to add a hydroxyl group to C-13 (13-hydroxypretenellin) and C-15 (pyridovericin), working simultaneously with the N-hydroxylation enzyme TENB, so pretenellin B could be modified to three different forms. Even if tenellin is fully produced, another enzyme (probably the same one as before) is able to add the OH group at C-15 (15-hydroxytenellin), so it seems possible that the hydroxylation step could occur at any time in the biosynthetic pathway.

In another example, the modification of prefusarin to fusarin C (Cox, 2007) also has a particular modification, added after the action of the enoyl reductase, in which a COOCH3 (Figure 3.21) is added to the structure, after several modifications, such as a methyl group oxidation and an O-methyl transfer, epoxidation and hydroxylation, affecting the whole structure. Despite the fact that another group was added, it supports the idea of a group addition after the synthesis of the basic PK scaffold.

Figure 3.21: Modifications between pre-fusarin C and fusarin C.

Another interesting OH addition is found in the products obtained when the ACE1 synthase gene from Magnaporthe oryzae was heterologously expressed in A. oryzae M-2-3 (Song et al., 2015) without any trans-acting enoyl reductase enzyme. In 12,13-dihydroxymagnaporthepyrone (Figure 3.22), the double bond becomes saturated by an epoxidation step, followed by the addition of two hydroxyl groups at positions C-12 and C-13 (hence the name) at the end of the PK chain. In ACE1, the enoyl reductase within the synthase is also inactive, as in the tenellin and desmethylbassianin synthases. With the trans-acting ER, the biosynthetic pathway changes and the OH modifications are eliminated.

Figure 3.22: Products from the enzyme ACE1 in A. oryzae. (A) Without the trans-enoyl reductase; (B) with the ER.

A similar hydroxylation position (C-11 – C-12) was also described in tenellin by Halo et al. (2008), in which the shunt metabolite prototenellin C (Figure 3.23) was obtained by expression of the tenS gene alone. The first authentic precursor of tenellin is pretenellin A, obtained only when tenS is expressed with the trans-enoyl reductase gene tenC. Also, prototenellin C lacks a methyl group in the structure. A similar modification (C-13 – C-14) is found in the shunt metabolite proto-DMB C (Heneghan et al., 2011).

Figure 3.23: Examples of shunt metabolites. (A) Prototenellin C; (B) Proto-DMB C.

Comparison with the other metabolic pathways supports the idea of a conversion from pre-torrubiellone D to torrubiellone D. Also, considering that TORS has a C-Met domain, the methylation should occur first, during the synthesis of pre-torrubiellone D, and the hydroxylation should then take place to give torrubiellone D (Figure 3.24), rather than a hydroxymethyl (-CH2OH) group being added in a single step.

Figure 3.24: Structural difference between two torrubiellone compounds. (A) Pre-torrubiellone D; (B) Torrubiellone D

To transform torrubiellone D into torrubiellone C, the molecule should undergo a ring expansion, but the proposed explanation differs between militarinone and tenellin/aspyridone. In the case of militarinone, and based on Schmidt et al. (2003), militarinone C first undergoes a hydroxylation step to allow the conversion to militarinone B (Figure 3.25), but the conversion is attributed to another enzyme that has not been described previously. For their metabolic pathway proposal, Schmidt et al. considered that all of the isolated compounds were part of the pathway, without evaluating whether some of them might be shunt metabolites. Halo et al. (2008) later proposed that militarinone B was a shunt metabolite and not part of the authentic biosynthetic pathway, considering the results obtained when the tenellin gene cluster was heterologously expressed in A. oryzae.

Figure 3.25: Militarinone pathway. From Schmidt et al., 2003.

Heneghan et al. (2008) and Wasil et al. (2013) offered a more plausible explanation, in which the molecule undergoes an oxidative rearrangement by the action of APDE/TENA. According to Fisch (2013), desmethylbassianin should follow the same rearrangement (Figure 3.26), and considering the similarities among the structures, torrubiellone probably follows the same process.

Figure 3.26: Effect of the cytochrome P450. Cytochrome effect in aspyridone (APDE), tenellin (TENA) and, probably, in torrubiellone (TORA).

The selectivity of TORA has not yet been determined, but TENA is described as a highly selective ring-expandase, while APDE (Figure 3.26) appears to be less selective: it is able to add an OH group to the phenyl moiety, followed either by release of the amino acid moiety (dephenylation, APDE) or by a rebound mechanism in which the OH is added to the carbon between the pyrone ring and the phenyl group (Figure 3.27).

Figure 3.27: TENA/APDE effect on PK-NRP precursors pretenellin A and preaspyridone.

The same rebound modification was also described for tenellin (Figure 3.28), in the compound prototenellin-D, but it was observed only in the B. bassiana wild type and not in the tenSAC (synthase + ring expandase + trans-acting enoyl reductase) transformants in A. oryzae, suggesting that the B. bassiana strain also contains a similar enzyme performing the hydroxylation, which is not present in A. oryzae.

Figure 3.28: Shunt metabolite transformation from pretenellin to prototenellin-D

After its synthesis, torrubiellone C should be transformed into torrubiellone B (Figure 3.29), for which several routes can be proposed:

Figure 3.29: Transformation from torrubiellone C to torrubiellone B

One of the options is shown by fischerin (Figure 3.30, A), isolated by Fujimoto et al. (1993). The compound possesses a hydroxyl group in the same position (C-1’) as torrubiellone B, but also an epoxide (C-2’ – C-3’), probably derived from one of the original double bonds of the phenyl group, while the other positions (C-5’ – C-6’) became saturated by the addition of hydrogens.

Another molecule showing the addition of a further OH group is aspyridone B (Figure 3.30, B), in which the phenyl group does not become saturated, but a hydroxyl group is added at C-5’. Aspyridone B has an extra OH group within the ring and was obtained when Bergmann et al. (2007) ectopically expressed the transcription factor APDR in A. nidulans strain SB 4.1, obtaining aspyridone A and B. A cytochrome P450 may produce the difference between them, but no further information was found.

Figure 3.30: (A) Fischerin; (B) Aspyridone B

Although there is no information about the fischerin biosynthetic pathway, its structure supports the possibility of an epoxidation step within the phenyl group, which could lead to the addition of the OH when the epoxide ring opens.

Related to torrubiellone, Isaka et al. (2014) possibly isolated a precursor of torrubiellone A, namely torrubiellone E (Figure 3.31). It appears to be an intermediate in which two of the double bonds, at C-3’ – C-4’ and C-5’ – C-6’, have become saturated. The other double bond, located at C-1’ – C-2’, remains untouched, which could facilitate the addition of an OH at the double bond (helped by its sp3 carbon configuration), completed by a hydrogen (Isaka et al., 2014). Considering that torrubiellone E was coproduced with torrubiellones A and B in T. longissima BCC2022 and shares the C-7 – C-17 side chain structure, the compounds should have the same absolute configuration. A diol could possibly form at the double bond in the ring, arising from a stereoselective C-1’ – C-2’ epoxidation followed by acid-catalysed hydrolysis. This theory was proposed for prototenellin C and proto-dmbB.

Figure 3.31: Torrubiellone E structure. From Isaka et al., 2014.

Hosoya et al. (2013) described the compounds JBIR-130/131/132 (Figure 3.32), which have an OH at C-1’ and at C-4’ (the latter provided by the original tyrosine), but there is no information about the genes involved in their synthesis.

Figure 3.32: Compounds isolated from Isaria sp. NRBC 104353. From Hosoya et al., 2013. (A) JBIR-130; (B) JBIR-131; (C) JBIR-132

The heptaketide compound militarinone A (Figure 3.33, B) has the same OH moiety at C-1’, but Schmidt et al. (2003) do not describe the steps that occur between militarinone D (Figure 3.33, A) and militarinone A or the order of the reactions, only that both hydroxylations, at N-1 and at C-1, happen, accompanied by saturation of the double bonds of the phenyl ring. Considering the structural difference between torrubiellone B and torrubiellone A, torrubiellone C is probably a precursor of both, transformed by another cytochrome P450.

Figure 3.33: (A) Militarinone D; (B) Militarinone A.

The final step proposed for the synthesis of torrubiellone A is an N-hydroxylation of the 2-pyridone nitrogen by a cytochrome P450, as performed in B. bassiana by TENB (Figure 3.34) and in desmethylbassianin biosynthesis by DMBB. Based on BLAST search and gene similarity, this step is assigned to the ORF designated torB, which encodes a cytochrome P450 named TORB.

Figure 3.34: Transformation from pretenellin-B to tenellin (N-hydroxylation) by TenB cytochrome p450.

Curiously, the problem seen in the militarinone pathway also occurs in the JBIR compounds, which also differ in the N-hydroxylation. In JBIR-132 none of the hydroxyl groups is found, while in JBIR-130/131 both hydroxylations occur, but no intermediate between them is known, and no genetic studies have been done for that Isaria strain.

In summary, the proposed biosynthetic pathway for torrubiellone A (Figure 3.35) is the following:

Figure 3.35: Proposed metabolic pathway for synthesis of torrubiellone A

Considering all the data previously mentioned, probably only six genes are necessary to fully synthesize torrubiellone A. Heterologous expression of the polyketide synthase TORS and the tailoring enzymes TORA, TORB, TORC, TORD and TORE will be performed in A. oryzae NSAR1.

CHAPTER 4

4. DEVELOPMENT OF A TORRUBIELLA TRANSFORMATION SYSTEM FOR PROMOTER ANALYSIS, DIRECTED GENE TARGETING AND TRANSCRIPTION FACTOR OVEREXPRESSION

4.1 INTRODUCTION

Fungal secondary metabolism (SM) gene clusters are controlled by different factors, such as nitrogen and carbon sources, pH, reactive oxygen species (ROS), aerobic/anaerobic conditions among others. These factors may activate or repress individual gene clusters, but not all of them are linked to an external factor response (Bruns et al., 2010).

Gene cluster regulation is usually driven by global regulators, defined as transcription factors (TF) encoded by genes that do not belong directly to the cognate gene cluster, but are able to regulate primary and secondary metabolism genes at numerous genomic locations (Espeso et al., 1996). In contrast to global regulators, there are also pathway-specific transcription factors, usually found within or flanking the gene cluster which they specifically regulate.

About 60% of all fungal SM gene clusters have at least one regulatory gene. Of these, approximately 90% of fungal PKS or PKS-NRPS gene clusters are controlled by members of the ZnTF cluster family (the Zn2Cys6 binuclear cluster domain family), whereas the transcription factors which regulate NRPS clusters appear to be more diverse (Shelest et al., 2008). For example, the 25-member aflatoxin and sterigmatocystin biosynthesis gene clusters from A. flavus and A. nidulans contain two regulatory genes adjacent to each other, named aflR and aflS (Brown et al., 1996). aflR encodes a Zn family TF regulator that participates in the transcriptional activation of the majority (or all) of the genes encoded in the cluster. The role of the aflS regulator gene is not properly defined, but it was proposed to work as a transporter of the ZnTF regulator (Georgianna et al., 2009).

Strategies used for fungal SM discovery include transcription factor overexpression, promoter exchange, the use of selected culture conditions and co-cultivation with other microorganisms (change in physiological conditions), chromatin modifications and overexpression/deletion of global regulators (Figure 4.1).

Figure 4.1: Strategies for new secondary metabolites discovery. (A) Transcription factor overexpression (B) Promoter exchange (C) Physiological conditions (D) Chromatin modifications (E) Global regulator overexpression or deletion. From Brakhage et al., 2013

Silent gene clusters (not expressed under standard growth conditions) require specific conditions or the overexpression of a regulator to be activated, but crosstalk between gene clusters can occur and activate more than one. The silent inp cluster from A. nidulans possesses two NRPS genes and a pathway regulator, scpR. Induction of the scpR regulator triggers expression of the inp pathway, but also activates another pathway (the apo gene cluster), encoding asperfuranone biosynthesis. The clusters are located on different chromosomes, and both are silent under non-scp-inducing conditions (Bergmann et al., 2010).

Another example of TF overexpression, linked to promoter exchange, is the activation of the silent cluster producing the compound aspyridone: transformation of A. nidulans with the apdR gene under the control of a strong promoter led to the production of compounds that had not been described before (Bergmann et al., 2007). Promoter exchange was tested by replacing the native promoter of the afoA gene (related to the silent asperfuranone gene cluster) with the promoter of the scfR regulator, resulting in production of the alkaloid asperfuranone (Maiya et al., 2007).

Silent gene clusters can also be activated by microbial co-cultivation; for example, production of emericellamides was obtained from the marine fungus Emericella sp. when it was co-cultivated with the actinomycete Salinispora arenicola (Udwary et al., 2007).

Chromatin changes are made mainly by histone modifications, which affect the activation of several gene clusters. The method consists of treating the fungal strain with inhibitors of histone deacetylases or DNA methyltransferases to activate silent gene clusters. For example, the addition of suberoylanilide hydroxamic acid to a culture of Cladosporium cladosporioides stimulated the production of calphostin B and novel cladochromes (Brakhage, 2013).

Manipulation of global regulators can also be used in screening for novel metabolites. For example, the deletion of a gene that is required for global protein N-acetylation in A. nidulans resulted in the production of the red pigment pheofungin, a novel metabolite not seen in the native strain (Scherlach et al., 2011), while overexpression of the LaeA global regulator increased the production of several SMs in different fungi, such as penicillin in A. nidulans (Bok et al., 2004) and aflatoxin in A. flavus (Kale et al., 2008).

Regulatory proteins interact with DNA in a number of ways, and TFs are classified in terms of the protein motif responsible for the interaction. Such protein motifs are highly conserved within each class, and include homeodomains, helix-loop-helix, basic region leucine zipper (bZip), zinc-coordinated and Zn(II)2Cys6 (or C6 zinc) binuclear clusters, among other classes of TFs (Todd et al., 2014). The corresponding protein-binding motifs in DNA are similarly conserved so that members of a particular class of TF tend to bind to similar target sequences.

The torrubiellone A gene cluster is flanked by two transcription factor genes, designated ZnTF and C6TF according to their detected protein motif (Zn(II)2Cys6 binuclear cluster). One or both of them could be regulating the activation/repression of torrubiellone A production. In Gibberella fujikuroi, the genes responsible for gibberellic acid biosynthesis are clustered and regulated by an AreA-type TF (Tudzynski, 1999).

Investigation of their roles within the native species will require the development of a genetic transformation system for gene overexpression/inactivation. In general, development of a transformation strategy will depend on the choice of selectable marker, growth characteristics of the chosen strain and the physiology/biochemistry of the organism, among other factors (Ruiz-Diez, 2002).

The first strategy used for fungal transformation was based on the ability to transform S. cerevisiae by complementation of auxotrophic markers and the development of E. coli-yeast shuttle vectors (Beggs, 1978). Transformation of filamentous fungi started with Neurospora crassa (Case et al., 1979), and nowadays all main groups of fungi, including zygomycetes, basidiomycetes and ascomycetes, can be transformed. The most common fungal transformation strategy is based on protoplast preparation using cell-wall degrading enzymes, in which the optimum duration of enzyme incubation depends on the microorganism. The starter culture can be obtained from asexual spores and mycelial fragments, while the uptake of DNA (usually double-stranded) is carried out in the presence of calcium ions and high concentrations of polyethylene glycol (PEG) (Ruiz-Diez, 2002).

Fungal transformants are most commonly selected by the complementation of auxotrophic markers, but this requires the existence of a suitable genotype in the recipient strain. In a relatively recent example, uracil dependency was produced in A. oryzae strain S1 by generating a pyrG mutant, and transforming plasmids contained the wild-type gene as a complementation selection marker (Ling et al., 2013). However, obtaining such a genotype by directed mutation can be a challenge in species without a well-characterized genetic system, so a widely used alternative in strains without auxotrophic mutants is to use drug resistance markers. This does not require knowledge of the genotype of the strain, only determination of its susceptibility to the drug. A major drawback is that the resistance allele may not produce a distinctive difference compared to the wild type, leading to selection difficulties. Transformation in filamentous fungi often results from ectopic (non-homologous) recombination of DNA into the genome, but where homologous recombination is frequent it can be used for targeted gene knockout (Lazarus et al., 2014).

This chapter reports the development and application of a Torrubiella transformation system to overexpress both C6TF and ZnTF. A knock-out of the torS gene is also attempted, to establish whether the technique allows for targeted gene disruption flanking the torrubiellone A gene cluster.

4.2 RESULTS

4.2.1 DEVELOPMENT OF TRANSFORMATION SYSTEM FOR TORRUBIELLA SP. BCC2165

4.2.1.1 VECTOR DEVELOPMENT FOR TORRUBIELLA TRANSFORMATION SYSTEM

For promoter analysis and overexpression of the transcription factors in Torrubiella, the multigene expression vector pTYargeGFP was used as an initial plasmid to be modified (Figure 4.2). Since no Torrubiella auxotrophs have been reported, the argB marker used for selection of A. oryzae transformants was also redundant.

First, the eGFP gene was added to pTYGSarg by Gateway transfer to obtain the plasmid pTYargeGFP. The section of pTYargeGFP containing the argB selectable marker, PamyB and the attB sequence was then replaced by a PCR fragment with the eGFP marker using yeast recombination. The PCR fragment also introduced a unique PacI site to allow for promoter/gene insertion. The resultant plasmid was called pTYpromlesseGFP (Figure 4.2).

Figure 4.2: pTYargeGFP modification to obtain plasmid pTYpromlesseGFP. The area covering the argB gene, attB and PamyB sites was replaced by a small DNA fragment including a PacI site, which allows promoters to be fused to eGFP for expression analysis. See Appendix for full plasmid map.

Since Torrubiella sp. BCC2165 is fully prototrophic, the expression plasmid must contain a drug selection marker instead of a complementation marker. Susceptibility to ammonium glufosinate (BASTA, Table 4.1) was tested to determine whether the related resistance gene (bar) could be used for selection purposes.

Table 4.1: BASTA growth data for Torrubiella sp. BCC2165. Growth was defined as appearance of colonies measured after 7 days of incubation at 28° C, using PDA plates. (+): Colony growth, (-): No growth detected.

BASTA concentration (µg/ml)       25    50    75    100    200    500
Torrubiella sp. BCC2165 growth    +     +     -     -      -      -
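The lowest inhibitory concentration can be read from Table 4.1 as the lowest tested BASTA concentration at which, and above which, no colonies appeared. The following is a minimal Python sketch of that reading. The dictionary encodes the table as concentration (µg/ml) mapped to growth (True for "+", False for "-"), and the function name is illustrative only.

```python
# Table 4.1 encoded as BASTA concentration (µg/ml) -> colony growth observed
BASTA_GROWTH = {25: True, 50: True, 75: False, 100: False, 200: False, 500: False}


def lowest_inhibitory_concentration(growth_by_conc):
    """Lowest tested concentration at which, and above which, no growth was seen.

    Returns None when growth was seen at the highest tested concentration.
    """
    lowest = None
    # Walk down from the highest concentration until growth is first seen
    for conc in sorted(growth_by_conc, reverse=True):
        if growth_by_conc[conc]:
            break
        lowest = conc
    return lowest


print(lowest_inhibitory_concentration(BASTA_GROWTH))
```

Walking down from the highest concentration means that an isolated "no growth" result below a concentration that still allowed growth is not mistaken for the inhibitory threshold.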

The bar gene was amplified as a 612 bp fragment comprising the 552 bp coding region, a 30 bp overlap with the end of PgpdA and a 30 bp overlap with the beginning of TgpdA; the overlaps were provided by 5' extensions on the primers gdpA-Bar-F and gdpA-Bar-R. The plasmid pTYpromlesseGFP was digested with AscI and used for yeast recombination, together with the bar gene PCR product and patches to rejoin the eno and adh cassettes. The resultant plasmid (Figure 4.3) was called pTYpromlesseGFPbar.
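The fragment length quoted above is the coding region plus the two recombination overlaps (552 + 30 + 30 = 612 bp). The following is a minimal Python sketch of how such a fragment is laid out. The function name is hypothetical and the sequences are placeholders of the stated lengths, not the real bar, promoter or terminator sequences.

```python
def with_recombination_overlaps(insert, upstream, downstream, overlap=30):
    """Flank an insert with the last `overlap` bases of the upstream element
    and the first `overlap` bases of the downstream element."""
    if len(upstream) < overlap or len(downstream) < overlap:
        raise ValueError("flanking sequences are shorter than the overlap")
    return upstream[-overlap:] + insert + downstream[:overlap]


# Placeholder sequences (assumptions for illustration only)
bar_coding = "N" * 552   # stands in for the 552 bp bar coding region
promoter = "A" * 100     # stands in for the end of the promoter
terminator = "T" * 100   # stands in for the start of the terminator

fragment = with_recombination_overlaps(bar_coding, promoter, terminator)
print(len(fragment))
```

In practice the overlaps are carried on the 5' ends of the PCR primers, so the amplified product already has this layout before yeast recombination.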

Figure 4.3: pTYpromlesseGFPbar assembly

4.2.1.2 DEVELOPMENT OF TRANSFORMATION PROTOCOL

An initial attempt at Torrubiella transformation made use of a slight modification of the A. oryzae protoplast transformation protocol used by Heneghan et al. (2011). As a control plasmid, the primary metabolism promoter adh was fused to the eGFP moiety to create pTYpromadheGFPbar. After 8 hours of incubation at 28° C, an additional 5 ml of PDA/Top containing 140 µl BASTA was overlaid on the transformation plate (final concentration = 100 µg/ml). After incubation at 28° C for 24 hours, another 5 ml of PDA/S/top medium containing BASTA was added (final concentration = 100 µg/ml), and the plates were incubated at 28° C for 7-10 days. No transformants appeared, so the process required further modification.

The first change in the development of the Torrubiella transformation system was to increase the incubation time in GN medium for spore germination and production of young mycelium for protoplasting. Torrubiella grows at a slower rate than A. oryzae, and visual inspection after overnight incubation (A. oryzae requires about 18 h) revealed no apparent difference from the initial inoculum. Increasing the incubation to 36-48 hours, depending on the amount of mycelium observed, did not result in any transformants, but it was clearly necessary to maintain this initial incubation time to produce sufficient material for protoplasting. Also, no transformants were obtained despite an attempt to improve protoplast yield by doubling the Trichoderma lysing enzyme concentration (to 20 mg/ml) and tripling the incubation time (to 3 h).

It was possible that the physiological state of Torrubiella was not suitable for transformation, so mycelial growth and colony pigmentation were investigated by using different growth media. As a comparison, physiological and pigmentation properties were studied in the militarinone-producing fungus C. militaris, which had been previously characterised (Shrestha et al., 2006). According to Shrestha and co-workers (2006), solid media provide the best means of studying the growth characteristics of a fungal strain, if it is non-aquatic in its native state, so the same idea was used for Torrubiella analysis. MEA and CDA are described as general purpose media, while PDA is considered to be a rich medium.

Torrubiella was grown on different media (MEA, PDA and CDA plates at 28 °C for one week), and a cottony (or fluffy) phenotype (Figure 4.4) was sometimes found on PDA plates. Using this alternative morphology, Torrubiella transformation was re-attempted.

Figure 4.4: Morphological differences in Torrubiella sp. BCC2165 grown on PDA. (A) Plain phenotype found commonly in the Torrubiella strain. (B) Cottony phenotype found sometimes in the strain.

Transformation was performed with inocula obtained from mycelia displaying the “cottony” and “plain” phenotypes, using the same parameters described for the previous transformation attempts. With the “cottony” phenotype, the transformation yielded a small number of putative transformants (seven colonies). After one week of incubation, transformants were transferred onto PDA+BASTA plates to maintain selection. As an additional test, selected colonies were screened by PCR for the introduced fragments, using DNA extracted from plates (Figure 4.5) and the untransformed organism as a control; this screen was applied in every transformation round and used as a confirmatory method throughout this research. After three rounds of purification on PDA with gradually increasing BASTA concentrations (final plate concentrations: 100, 125 and 150 µg/ml, respectively), the colonies obtained were analyzed by fluorescence microscopy (Figure 4.6). Fluorescence was observed in all transformants obtained with the pTYpromadheGFPBar plasmid, although a scale bar could not be established for the images of the eGFP transformants.

Figure 4.5: PCR confirmation test for the eGFP gene in four transformants obtained after transformation of Torrubiella with pTYpromadheGFPBar. 1, 2, 3, 4: Transformant colonies obtained after the third round of selection. C: Wild-type Torrubiella control.

Figure 4.6: eGFP expression in a Torrubiella transformant. A representative colony from pTYpromadheGFPBar transformation under white light and under fluorescence conditions, looking for eGFP expression.

4.2.2 PROMOTER ANALYSIS

Analysis by RT-PCR (Chapter 3) indicated that the torD and torE genes are transcribed separately, despite there being only 17 bp between the stop codon of torD and the start codon of torE. This suggested that the promoter of torE must lie within, and form part of, the torD coding region. To examine this further, both promoter regions were analysed in silico by BLAST comparison, which revealed certain similarities with sequences from other filamentous fungi.

Closer analysis of the 500 bp promE500 region (Figure 4.7) shows the presence of the sequence ATTGG, corresponding to a CCAAT sequence on the complementary strand, located 76 bp upstream of the ATG start codon of torE. Zn-based TFs are known to bind to CCAAT sequences, which are present in several eukaryotic genes. The CCAAT-binding complex is needed for activation of gene expression by pathway-specific regulators (Brakhage et al., 1999), and CCAAT sequences are located on either strand 50 to 200 bp upstream of the transcriptional start point. These TFs have been named differently depending on the organism: HAP in S. cerevisiae and A. thaliana, CBF in rats and NF-Y in mouse, Xenopus and humans (Li et al., 1998). For example, the ZnTF HAP complex was found to bind to CCAAT sequences in S. cerevisiae and S. pombe (Brakhage et al., 1999).
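
A search for CCAAT boxes on either strand of an upstream region can be sketched as follows. This is a minimal illustration, not the analysis performed with DNAman: the function name and the toy sequence are hypothetical, and the distance is assumed here to be counted from the first base of the motif to the first base of the ATG.

```python
import re


def find_ccaat_boxes(upstream):
    """Locate CCAAT boxes on both strands of a region that ends just before the ATG.

    Returns (strand, distance) tuples, where distance is the number of bases
    from the first base of the motif (top-strand coordinates) to the ATG.
    """
    upstream = upstream.upper()
    hits = []
    # ATTGG on the top strand is CCAAT on the complementary strand.
    for motif, strand in (("CCAAT", "+"), ("ATTGG", "-")):
        # The lookahead allows overlapping occurrences to be reported.
        for match in re.finditer("(?=%s)" % motif, upstream):
            hits.append((strand, len(upstream) - match.start()))
    return sorted(hits, key=lambda hit: hit[1])


# Toy upstream region (hypothetical): one ATTGG placed 76 bp before the ATG.
toy_upstream = "G" * 20 + "ATTGG" + "G" * 71

print(find_ccaat_boxes(toy_upstream))
```

For the toy sequence, the single hit is reported on the complementary strand at 76 bp, mirroring the situation described for promE500.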

Figure 4.7: PromE500 genomic sequence. The CCAAT-binding region for TF regulation is shown in red, and the stop codon of torD is also highlighted. Format given by DNAman software.

In the case of promE500, sequence alignment revealed similarities with genes in Metarhizium robertsii ARSEF 23, Fusarium fujikuroi, Fusarium verticillioides and Apteryx australis, but no alignment was related to B. bassiana or C. militaris.

A different scenario applies to the promD500 sequence (Figure 4.8), part of which is also found in B. bassiana strain 992.05. A 38 bp region has 95% identity with the genomic area positioned between the dmbB and dmbC coding regions within the desmethylbassianin gene cluster, as well as with the tenellin biosynthetic gene cluster (95% identity, located between tenB and tenC) and the militarinone gene cluster (between milB and milC) from C. militaris. However, torD and torE are located between torB and torC, so the conserved sequence may have played a role in the creation of the indel that distinguishes the Torrubiella gene cluster from the homologous B. bassiana (and C. militaris) clusters.

Figure 4.8: PromD500 genomic sequence. CCAAT sequences that could be binding regions for TF regulation are shown in red. The sequence matching the dmb and ten clusters (bases 229 to 266) is highlighted in green. Format given by DNAman software.

With the Torrubiella transformation system developed, plasmids containing the promD500, promE500 and promDmid promoter fragments (Figure 4.9) were assembled. For this purpose, a 500 bp PCR product covering the genomic area immediately upstream of the torE start codon (designated promE500) was amplified using the primers promE500-eGFP-F and promE500-eGFP-R. This amplification places the PCR fragment next to the eGFP coding region in the promoter-probe plasmid pTYpromlesseGFPbar. A construct was also made with the equivalent 500 bp region upstream of torD (promD500), using the primers promD500-eGFP-F and promD500-eGFP-R, and a putative negative control was assembled from the 500 bp immediately upstream of promE500, designated promDmid, using the primers promDmid-eGFP-F and promD-mid-eGFP-R (Figure 4.10).

Figure 4.9: Origins of the fragments used in promoter analysis.

There are no exact rules for defining the extent of a promoter region, so an arbitrary value of 500 bp upstream of the start codon was used as a rule of thumb, following the successful methodology described by Pahirulzaman et al. (2012) for using the adh and eno promoters to drive heterologous gene expression in A. oryzae. The promD500 sequence should contain the torD promoter and so should act as a positive control. The promDmid sequence comes from a central region of the torD coding region, far upstream of the start of translation of the torE coding region, and so should act as a negative control. The promE500 sequence is the test, expected to contain the torE promoter.

Figure 4.10: pTYpromtorD500eGFPbar assembly. PCR fragments corresponding to torD500/E500/Dmid500 were added to the PacI-digested pTYpromlesseGFPbar plasmid by yeast recombination. The promoters’ addition will result in pTYpromtorD500/E500/DmideGFPBar plasmids.

Using the same protocol established for Torrubiella transformation, the strain was transformed with pTYpromD500eGFPbar, pTYpromE500eGFPBar and pTYpromDmideGFPBar. Fluorescence microscopy, with adh-eGFP as a positive control (Table 4.2, E and F), showed that promD500 (Table 4.2, G) and promE500 (Table 4.2, H) functioned as promoters and drove the expression of eGFP. For promDmid (Table 4.2, C and D), there was no apparent difference in eGFP expression compared to the promlesseGFP plasmid (Table 4.2, A and B). The weak signal seen with both plasmids could be caused by autofluorescence of the Torrubiella strain (not currently reported, but possible in A. oryzae) or by a small extent of expression occurring once eGFP is inserted into the genome. Fluorescence from promD500 and promE500 was visually comparable to that of promadheGFP. The experimental results support the hypothesis that some elements within the 3' portion of the torD coding region could also work as the promoter of torE. The best and clearest photos were used for Table 4.2.

Table 4.2: eGFP expression, using different promoters. A and B: promlesseGFP; C and D: promDmideGFP; E and F: promadheGFP; G: promD500eGFP, H: promE500eGFP

4.2.3 KO OF TORS

The torS gene was chosen as the target in an attempt to determine whether gene KO is feasible in Torrubiella. The intrinsic capacity for homologous recombination commonly found in filamentous fungi is a necessary prerequisite for directed gene targeting, but this is only effective if ectopic integration events, which insert the DNA fragment or plasmid randomly into the genome, do not swamp transformation by homologous recombination (Lazarus et al., 2014).

The KO-construct design was based on plasmid pEYAtorSintlesseGFP, which is a shuttle vector carrying the complete, intron-edited torS gene (fused to eGFP). Construction of the KO-construct torS-KO required the replacement of a portion of torS with the bar gene, as a dominant selectable marker for Torrubiella transformation, by homologous recombination. Ectopic integration of the plasmid would produce BASTA-resistant transformants without inactivating torS.

The torS-KO construction strategy is shown in Figure 4.11. Cutting the pEYAtorSintlesseGFP plasmid with XmaI releases a fragment of 7543 bp from the centre of torS while leaving the rest of the plasmid intact. Homologous recombination in yeast was used to introduce the PgdpA-Bar-TgdpA cassette, amplified as two overlapping fragments from pTYpromlesseGFPbar together with 5' and 3' flanking sequences of torS, into the torS coding region of the plasmid. The assembly design included the restoration of the XmaI sites to aid screening, since cutting the new plasmid with XmaI would excise a fragment of 3037 bp rather than the 7543 bp released from the original plasmid containing the torS coding region. The plasmid was named pEYAtorSGFP-KO-XmaI.

In more detail, four PCR fragments were assembled. The first, a 1557 bp fragment (orange) amplified using torS-KO-F1 and torS-KO-R1, carried a 30 bp overlap homologous to torS in the template plasmid and an overlap with the second fragment, which contains the PgdpA promoter; the XmaI site was retained for diagnostic digestion. The second, a 2693 bp fragment (light blue), was amplified using torS-PgdpA-F and Bar-PgdpA-R; it overlapped the end of the first fragment on its left side and a section of the bar gene at its other end. The third fragment (red), of 826 bp, was amplified using Bar-TgdpA-F and Bar-TgdpA-R and contained an overlap with TgdpA and the region next to the TgdpA cassette. Finally, the fourth fragment (green), of 1277 bp, was amplified using TgdpA-torS-F and TgdpA-torS-R and contained overlaps with TgdpA and the beginning of eGFP. When assembled, the 3153 bp PgdpA-bar-TgdpA cassette replaces the 7543 bp fragment of torS (Figure 4.11), a net loss of 4390 bp relative to the intact torS.

Figure 4.11: Assembly strategy to generate pEYAtorSeGFP-KO-XmaI

Transformation of Torrubiella with pEYAtorSGFP-KO-XmaI yielded thirty transformants, which were serially subcultured to obtain homokaryotic lines from which genomic DNA was isolated. Transformants were screened by PCR (Figure 4.12) using primers torSKOcheck-F and torSKOcheck-R, which had been designed to bind to the sequences immediately flanking the XmaI sites of torS. The expected PCR product sizes were 7543 bp from the non-interrupted torS gene or 3153 bp from KO lines generated by homologous recombination. Figure 4.12 shows the PCR results for four transformants.

Figure 4.12: PCR insertion analysis of torS-KO in Torrubiella. The band in lane (A) exceeds the expected 3153 bp and cannot be explained. The smaller band in lanes (B), (C) and (D) is much more consistent with that value. (B) Bands obtained from a heterokaryon with both KO and intact genes. (C) Probably corresponds to an ectopic integrant. (D) PCR fragment from a near homokaryon for the knockout.

There was a problem with the design of the primers used to check the replacement: the PCR band does not show whether the cassette was introduced by homologous recombination within the torS coding region or by ectopic insertion elsewhere in the genome. On re-analysis of the results, the band in (A) should correspond to an ectopic recombination event, although its size cannot be explained. Lane (B) shows both bands, which can be interpreted in two ways: either the two bands were amplified because of the heterokaryotic state of the transformant, or the insertion did take place but was an ectopic integration that left the torS gene of the Torrubiella strain unaffected. In the latter case, the 7 kb band corresponds to torS, and the 3 kb band corresponds to the inserted gene fragment, located anywhere in the genome.
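
The ambiguity of this screen can be made explicit with a short Python sketch that maps observed band sizes to the interpretations compatible with them. It is an illustration only: the function names, the interpretation labels and the 10% size tolerance are assumptions introduced here, while the expected sizes (7543 bp and 3153 bp) are taken from the text.

```python
INTACT_TORS_BP = 7543   # expected product from the non-interrupted torS gene
KO_CASSETTE_BP = 3153   # expected product from the PgdpA-bar-TgdpA replacement


def matches(observed_bp, expected_bp, tolerance=0.10):
    """True if an observed band is within a fractional tolerance of the expected size.

    The 10% default is an assumed allowance for reading sizes off a gel.
    """
    return abs(observed_bp - expected_bp) <= tolerance * expected_bp


def interpret_lane(bands_bp, tolerance=0.10):
    """List the interpretations compatible with the bands seen in one lane."""
    has_intact = any(matches(b, INTACT_TORS_BP, tolerance) for b in bands_bp)
    has_ko = any(matches(b, KO_CASSETTE_BP, tolerance) for b in bands_bp)
    if has_intact and has_ko:
        # The flanking primers cannot separate these two situations.
        return [
            "heterokaryon carrying knockout and intact nuclei",
            "ectopic integration with torS left intact",
        ]
    if has_ko:
        return ["consistent with a homokaryotic knockout"]
    if has_intact:
        return ["intact torS only"]
    return ["unexplained band pattern"]


for lane, bands in {"two bands": [7500, 3100], "small band": [3200]}.items():
    print(lane, "->", interpret_lane(bands))
```

A lane showing both bands returns two interpretations, which is the situation described for lane (B): the band pattern alone does not distinguish a heterokaryon from an ectopic integrant.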

Different primers are needed to check whether the gene interruption was properly performed, but the results indicate that gene knockout, or at least gene insertion, may be achievable in Torrubiella sp. BCC2165, albeit at a low frequency. Many rounds of subculturing under selection appear to be necessary to achieve the homokaryotic state required for analyzing the effect of any gene disruption.

4.2.4 OVEREXPRESSION OF TRANSCRIPTION FACTORS

The work described above established that Torrubiella transformation could be achieved and that upstream segments of genes (torD and torE) could drive the expression of eGFP in transformants. The next question to be addressed was whether the transcription factors encoded next to the torrubiellone biosynthetic gene cluster play a role in regulating the cluster. Both of these TFs are highly conserved; the one described as a C6TF shares 75-77% identity with C6TFs from B. bassiana ARSEF and C. militaris, while the Zn-finger domain TF shares 56% identity with a Zn-TF from B. bassiana ARSEF and 60% with a Zn-TF described for C. militaris, according to BLAST comparison.

The designations of the Torrubiella transcription factors arose from the BLAST search results in the initial screening of contig 5044. C6TF is described as a fungal transcription factor type MHR (middle homology region) and as a fungal-specific transcription factor. It shares 76% identity with a putative C6 transcription factor from C. militaris (hence the name) and from a TF in B. bassiana ARSEF 25880 (Figure 4.13).

Figure 4.13: Comparison tree of C6TF and ZnTF. Analysis was done with the BLASTP online server. (A) C6TF is most closely related to TFs found in B. bassiana and Cordyceps strains. (B) ZnTF is similar to TFs described for Cordyceps confragosa and C. brongniartii RCEF3172.

The C6TF protein sequence also has a GAL-4 region, described as belonging to a protein involved in galactose metabolism. This TF is classified as a Zn(II)2Cys6 binuclear cluster protein, and analysis of the protein sequence confirmed the presence of six cysteine residues in the pattern CX2CX6CX5CX2CX11C (Figure 4.14). ZnTF also contains a GAL-4 moiety. Its identity with homologous proteins is lower than that of C6TF, reaching only 60% with a Zn-finger domain TF in B. bassiana ARSEF 2880. ZnTF is also described as a Zn(II)2Cys6 binuclear cluster protein. Analysis of the sequence also revealed the six-cysteine binuclear cluster close to the N-terminus, in the pattern CX2CX6CX5CX2CX8C. Protein sequence alignment relates ZnTF to a zinc-finger domain protein from Cordyceps brongniartii RCEF3172 and Cordyceps confragosa, which is not associated with any known function or gene cluster regulation apart from the GAL-4 moiety, which participates in galactose metabolism.
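
The two cysteine spacing patterns can be expressed as regular expressions for scanning a protein sequence. The sketch below is illustrative only: the helper names and the toy peptide are hypothetical, the real C6TF and ZnTF sequences are not reproduced, and X is assumed to stand for any single residue.

```python
import re


def shorthand_to_regex(pattern):
    """Convert shorthand such as 'CX2CX6C' into a compiled regular expression.

    Each 'Xn' is read as exactly n residues of any type (an assumption).
    """
    body = re.sub(r"X(\d+)", lambda m: "[A-Z]{%s}" % m.group(1), pattern)
    return re.compile(body)


C6TF_PATTERN = shorthand_to_regex("CX2CX6CX5CX2CX11C")
ZNTF_PATTERN = shorthand_to_regex("CX2CX6CX5CX2CX8C")


def find_cys6(protein, compiled):
    """Return the 1-based position of the first motif match, or None."""
    match = compiled.search(protein.upper())
    return None if match is None else match.start() + 1


# Toy peptide (hypothetical) built to follow the CX2CX6CX5CX2CX11C spacing.
toy_peptide = (
    "MS" + "C" + "AA" + "C" + "A" * 6 + "C" + "A" * 5
    + "C" + "AA" + "C" + "A" * 11 + "C" + "KR"
)

print(find_cys6(toy_peptide, C6TF_PATTERN))
print(find_cys6(toy_peptide, ZNTF_PATTERN))
```

The toy peptide matches the 11-residue spacing from position 3 and does not match the 8-residue spacing, showing that the last spacer alone is enough to tell the two patterns apart.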

Figure 4.14: Alignment of the first 50 aa in C6TF and ZnTF. (A) C6TF; (B) ZnTF. The six-cysteine pattern used for DNA binding is highlighted in yellow.

The experimental design involved inserting the TF genes either singly or together into the vacant expression cassettes of plasmid pTpromD500eGFPBar. The 2300 bp C6TF coding region (including one intron) was amplified from gDNA with 5'-extended primers targeting the adh cassette, while the intronless 2636 bp ZnTF coding region was targeted to the eno cassette (Figure 4.15). A positive TF activity in Torrubiella transformants should be evident if eGFP expression from torD promoter occurs, together with activation of the torrubiellone gene cluster and metabolite production.

Figure 4.15: Assembly strategy for construction of pTYpromD500eGFPC6TFbarZnTF. Plasmids containing single TFs (pTYpromD500eGFPC6TFbar and pTYpromD500eGFPZnTFbar) were also constructed by replacing the other TF fragment with a "patch" to close the empty expression cassette.

Transformation of Torrubiella sp. BCC2165 was carried out with the plasmids pTpromD500eGFPZnTFBar (testing ZnTF, with promD500 driving eGFP), pTpromD500eGFPC6TFbarZnTF (testing both TFs, with promD500 driving eGFP) and pTpromD500eGFPC6TFbar (testing C6TF alone, with promD500 driving eGFP), together with a range of control plasmids (pTpromlesseGFP and pTpromadheGFPbar). Transformant colonies were subcultured on PDA+BASTA plates and analyzed by PCR, following the DNA plate extraction protocol and testing for eGFP and bar amplification, and by visual analysis of green fluorescence. Figure 4.16 shows an example of a fluorescent transformant containing pTpromD500eGFPZnTFBar.

Transformation with plasmids pTpromD500eGFPZnTFBar and pTpromD500eGFPC6TFbarZnTF (Figure 4.16) produced strongly green fluorescent colonies and transformation with pTpromD500eGFPbar produced medium-strength fluorescence, while weakly fluorescent colonies were obtained with pTpromlesseGFPC6TFbarZnTF and pTpromD500eGFPC6TFbar.

This result probably indicates that torrubiellone production is controlled by expression of the ZnTF, as judged by comparing fluorescence intensity between the plasmids containing the ZnTF and the plasmid in which promD500 alone drives eGFP expression. C6TF appears to play a negative role in eGFP expression when expressed alone, but does not affect it when coexpressed with ZnTF, as observed in the pTpromlesseGFPC6TFBarZnTF transformants.

Figure 4.16: Fluorescence in Torrubiella sp. BCC2165 using pTpromD500eGFPZnTFBar. The best picture was chosen for demonstrating purposes. (A) White light view of a ZnTF transformant (B) Epifluorescence of a ZnTF transformant

4.2.5 METABOLITE PRODUCTION IN ZNTF TRANSFORMANTS

ZnTF transformants were characterized by production of an intense yellow pigment, while C6TF and control transformants remained white (Figure 4.17). However, analysis of organic extracts taken from plate cultures did not reveal any difference between transformants expressing the ZnTF and controls, although the yellow pigment was retained in the organic phase (Figure 4.17). Extractions were also performed on liquid cultures: eight transformants per plasmid type were grown in 50 ml PDB+BASTA liquid cultures (final concentration: 100 µg/ml) and incubated at 28 °C and 200 rpm for 7 days. All transformants were extracted with ethyl acetate and analysed by HPLC.

Figure 4.17: Pigmentation in Torrubiella TF-transformant colonies. Control, Torrubiella wild-type; ZnTF, pTpromD500eGFPZnTFBar; C6TF, pTpromD500eGFPC6TFbar; TF2, pTpromD500eGFPC6TFBarZnTF. Bottom: Ethyl acetate extraction of metabolites from transformants grown in liquid media.

Diode array analysis showed two peaks eluted at RT = 5.67 min and RT = 5.92 min in the ZnTF transformant, in comparison to the untransformed Torrubiella control (Figures 4.18 and 4.19). The absence of a corresponding peak in C6TF transformants is consistent with the eGFP expression and solid-medium pigmentation data.

Figure 4.18: Diode array scan of representative TF Torrubiella transformants. (A) pTpromD500ZnTFbar; (B) untransformed Torrubiella; (C) pTpromD500C6TFbar; (D) pTpromD500C6TFbarZnTF. The peak eluted at RT = 5.67 min is labeled in all samples.

Figure 4.19: LC-MS analysis of a representative ZnTF transformant. (A) ES+ Scan. (B) Diode array scan

The UV absorption spectrum of the peak found at RT = 5.67 min in the diode array scan has maxima at 207.23, 246.23 and 313.23 nm, while a small peak at RT = 5.92 min has maxima at 214.23 and 312.23 nm (Figure 4.20).

Figure 4.20: UV absorption peaks of a representative ZnTF transformant (A) RT = 5.922 min and (B) RT = 5.655 min

These spectra are not far from the data described by Isaka et al. (2010) for torrubiellone B, with maxima at 202, 223 and 329 nm. This supports the idea that the peaks could correspond to a torrubiellone or torrubiellone derivative, or even to the final biosynthetic product of the gene cluster. Since the biosynthetic pathway had not been established prior to this research, it cannot be stated with certainty that torrubiellone A is the final product of the pathway.

For molecular mass analysis, the ES- trace was unfortunately uninformative, so analysis relied on ES+. None of the peaks found in ES+ at retention times of 3.37, 4.01, 4.44, 7.83 and 11.37 min had a corresponding peak in the diode array trace, i.e. no UV absorption was detected for them. At RT = 5.934 min (Figure 4.21, A), the [M+H]+ ions found in ES+ had m/z 392, 718 and 926. These even values correspond to odd nominal masses for the neutral compounds (an odd nominal mass indicates an odd number of nitrogen atoms), which is one of the characteristics expected for any PKS-NRPS product.
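
The parity argument can be checked with a few lines of Python. This is a simplified sketch under stated assumptions: each ion is treated as a singly protonated [M+H]+ species of a compound containing only common elements such as C, H, N and O, nominal (integer) masses are used, and adducts or dimers are not considered. The function names are hypothetical.

```python
PROTON_NOMINAL_MASS = 1


def neutral_nominal_mass(mh_plus):
    """Nominal mass of the neutral compound, assuming a singly protonated ion."""
    return mh_plus - PROTON_NOMINAL_MASS


def odd_nitrogen_count_expected(mh_plus):
    """Nitrogen rule: an odd nominal mass implies an odd number of N atoms."""
    return neutral_nominal_mass(mh_plus) % 2 == 1


ions_at_5934 = [392, 718, 926]  # [M+H]+ values reported at RT = 5.934 min
for mz in ions_at_5934:
    print(mz, neutral_nominal_mass(mz), odd_nitrogen_count_expected(mz))
```

All three reported ions give odd neutral masses (391, 717 and 925), which under these assumptions is compatible with compounds containing an odd number of nitrogen atoms.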

Figure 4.21: ES+ analysis of peaks in a representative ZnTF transformant. (A) ES+ scan of the peak eluted at RT = 5.934 min (B) ES+ scan of the peak eluted at RT = 5.722 min

At RT = 5.722 min (Figure 4.21, B), the ions found in ES+ were [M+H]+ = 392 (as in the previous peak), together with new ions at [M+H]+ = 736 and [M+H]+ = 943. These differ by approximately +18 from the corresponding ions of the previous peak (718 and 926), which suggests that the compounds detected at 5.934 min might be dehydrated forms of the metabolites eluting at RT = 5.722 min.
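
The comparison between the two peaks is simple arithmetic on the reported ions and can be sketched as follows. The variable names and the tolerance of 1 mass unit are assumptions made for this illustration, and nominal masses are used throughout.

```python
WATER_NOMINAL_MASS = 18

ions_at_5934 = [392, 718, 926]  # [M+H]+ values reported at RT = 5.934 min
ions_at_5722 = [392, 736, 943]  # [M+H]+ values reported at RT = 5.722 min

# Pairwise difference between corresponding ions of the two peaks.
differences = [b - a for a, b in zip(ions_at_5934, ions_at_5722)]

# Flag differences within 1 mass unit of water (assumed tolerance).
water_like = [abs(d - WATER_NOMINAL_MASS) <= 1 for d in differences]

print(differences)
print(water_like)
```

The differences are 0, 18 and 17, so the second pair matches the mass of water exactly and the third pair is one unit short of it, which is why the dehydration interpretation is best treated as tentative.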

The ES+ ion of [M+H]+ = 392 lies within the range of the molecular weights of the known torrubiellones (torrubiellone A, B, C, D, E: 419, 403, 381, 383 and 401, respectively), so it could correspond to a torrubiellone-related compound, while the other compounds could be the products of overexpression of other secondary metabolite gene clusters as a result of cross-talk by the ZnTF. The compound previously found in a Torrubiella strain whose mass is closest to those obtained is paecilodepeptide C, with a mass ion m/z of 774. No compound could be related to the ES+ mass ion of m/z 926.

4.3 DISCUSSION

4.3.1 TRANSFORMATION SYSTEM

A new transformation system must have several particular features: ease of DNA manipulation and transformation, ability to generate stable gene expression from exogenous sources and to supply the necessary precursors for metabolite assembly (Zhang et al., 2008). In a novel transformation system, reduced gene expression can be caused by genetic codon bias, as well as a mismatch between the natural expression signals and the host transcriptional machinery. Heterologous expression is also affected by the chromosomal location where the plasmid/DNA segment inserts when ectopic recombination occurs.

PEG-mediated protoplast transformation is based on three basic steps: protoplast preparation, DNA uptake and regeneration of the newly transformed protoplasts (Liu and Friesen, 2012). The starting cells can be young mycelial fragments, germinated asexual spores or basidiospores, depending on their physical characteristics. In Neurospora species, germinating macroconidia are most commonly used, but uninucleate microconidia will also produce transformable protoplasts (Rossier et al., 1985). If young mycelium is used, protoplasts must be released from the hyphal debris after the enzymatic treatment (Buxton and Radford, 1983). For species that do not produce conidia, mycelia can be used for protoplast preparation, and in A. oryzae transformation, germinated spores are mainly used.

Usually, the enzymes used for this purpose are driselase (2%), Trichoderma lysing enzyme or combinations of both, but different enzyme batches may vary in the effectiveness of cell wall degradation (Ruiz-Diez, 2002); testing different degrading enzymes for protoplasting efficiency is therefore highly recommended. In the fungal wheat pathogen Stagonospora nodorum, a 3 h enzymatic digestion with driselase is apparently enough to produce >1 × 10^8 protoplasts/ml (Liu and Friesen, 2012). Pyrenophora teres, a barley net blotch pathogen, requires at least 6 h of digestion with driselase, or even overnight digestion at 4 °C, as well as a different osmotic buffer (Leng et al., 2011). Other previously used enzymes are helicase, glusulase and zymolyase, all of which contain a complex combination of hydrolytic enzymes, such as 1,3-glucanases and chitinase (Binninger et al., 1987). For A. oryzae NSAR1 transformation, protoplast formation requires only Trichoderma lysing enzyme, and this was also found to be the case for Torrubiella.

The Torrubiella transformation system, demonstrated by growth of the strain under BASTA conditions and PCR screening, appears to require specific traits: a cottony phenotype, a higher enzyme concentration and longer digestion time, together with a longer initial incubation period of 36 to 48 hours. A longer initial incubation time producing young mycelium, together with a higher concentration of Trichoderma enzyme, seems to be effective for protoplast formation in Torrubiella. Seemingly, only the cottony phenotype is suitable for transformation, and this apparently is the “sporulation” state of the fungal strain required for protoplast formation. When the plain phenotype (appearing as only vegetative hyphae) was analysed by microscopy, no spore-like structures were detected, but when the cottony phenotype (vegetative and aerial hyphae) was analysed, some spore-like structures appeared. The cottony phenotype is better induced by PDA agar, but both phenotypes were produced at the same temperature and light conditions in the same growth media, so the cottony phenotype (in a low proportion) appears to occur randomly.

Problems with PEG-mediated transformation may include difficulty in obtaining viable protoplasts in high quantities, low transformation efficiency, high percentages of transient transformants and frequent integration at multiple loci (Liu and Friesen, 2012), problems that were encountered in the Torrubiella transformation attempts. Also, strains carrying mutations transform less efficiently, for reasons that are not yet understood (Ruiz-Diez, 2002).

A main advantage of drug-based selection markers, such as BASTA, is that the genotype of the recipient strain does not need to be known, only its susceptibility to the drug; this is especially useful in strains with few or no auxotrophic variants, as is the case for Torrubiella strains (Ruiz-Diez, 2002). Possible problems of drug-based selection are that the resistance-conferring gene must be isolated in order to be used for transformation, and that the resistance allele may not produce a significant difference over the wild-type allele, complicating the selection process. Selection can be aided by also using a visual marker, such as eGFP, as done in this research.

Ammonium glufosinate, an analogue of L-glutamate, affects fungal growth by inhibiting glutamine synthetase, the enzyme that catalyses glutamine synthesis from glutamate and ammonium. The compound prevents synthesis of glutamine, an essential building block for several nitrogenous compounds needed in primary metabolism, and also leads to the accumulation of unused ammonia to a toxic level, killing plant cells (Kutlesa et al., 2001). The compound has been tested for more than 30 years by farmers as a herbicide, and independent studies have been performed to check whether BASTA is toxic for humans, certifying that the compound is safe to use following standard applications (Tanzer et al., 2003).

A problem for gene analysis and expression in a new fungal system is the multinucleate nature of filamentous fungi, which can complicate methods based on insertional mutagenesis and gene replacement. These methods rely on the isolation of homokaryotic transformants from a single transformation event to be able to study the overexpression or loss-of-function phenotype properly (Vijn et al., 2003). Repeated subculturing of syncytial transformants allows genetic segregation to occur until a homokaryotic colony is obtained. This should be accompanied by PCR screening, which provides an effective confirmation method if the primer design supports the identification of the insert(s) under standard conditions.

For the torS-KO gene targeting experiment, the amplification of two bands instead of one was initially attributed to the samples not yet having reached a homokaryotic state. In filamentous fungi, the segregation of genetically different nuclei within hyphae can produce homokaryotic nuclei from transformants that were originally heterokaryotic, permitting the success of the gene targeting, but this phenomenon does not occur every time (Rolland et al., 2003). After analysing the primer design, it became clear that the primers used for screening could not establish whether the gene replacement had been effective: two bands were indicative of insertion of the gene fragment containing the bar gene, but not necessarily into the torS gene, so that both the intact synthase fragment and the fragment carrying the bar gene could be amplified. Because of this, it cannot be confirmed whether gene targeting can be successfully performed in Torrubiella.

The correct way (Figure 4.22) to establish whether homologous integration occurred as expected would be to use two sets of primers: one pair amplifying from the torS region flanking PgdpA into the bar gene, and another pair amplifying from the bar gene into the torS region flanking the TgdpA insertion.

Figure 4.22: Primer proposal for the elucidation of a correct insertion. (A) Primers used in the research; (B) Correct primers.

The former primers (Figure 4.22, A) were able to amplify the inserted PgdpA-Bar-TgdpA cassette, but could not differentiate between an insertion within the torS gene and one at a random ectopic position. This explains the amplification of two bands using the same set of primers: the lower band corresponds to the inserted fragment, and the higher one would correspond to the intact torS gene.

With the new set of primers (Figure 4.22, B), amplification would only be positive if the gene fragment is inserted in the correct position within the torS gene of Torrubiella, because it depends on both the bar gene and a region belonging to the original torS gene.
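
The logic of the two screening strategies can be illustrated with a toy model in Python, in which each locus is an ordered list of labelled segments and a primer pair amplifies only when both binding sites lie on the same locus in the right order. Everything in this sketch is hypothetical: the segment labels, the genotypes and the assumption that the outer junction primer binds genomic sequence that is not carried on the construct.

```python
def amplifies(locus, forward_site, reverse_site):
    """True if both primer sites are on the locus, forward site first."""
    return (
        forward_site in locus
        and reverse_site in locus
        and locus.index(forward_site) < locus.index(reverse_site)
    )


# Toy loci (hypothetical labels). "arm_5"/"arm_3" are the torS regions flanking
# the XmaI sites; "genomic_5"/"genomic_3" are assumed to be absent from the construct.
LOCI = {
    "wild_type": ["genomic_5", "arm_5", "torS_core", "arm_3", "genomic_3"],
    "targeted_ko": ["genomic_5", "arm_5", "PgdpA", "bar", "TgdpA", "arm_3", "genomic_3"],
    "ectopic": ["other_5", "arm_5", "PgdpA", "bar", "TgdpA", "arm_3", "other_3"],
}

PRIMER_PAIRS = {
    "flanking (A)": ("arm_5", "arm_3"),
    "junction 5' (B)": ("genomic_5", "bar"),
    "junction 3' (B)": ("bar", "genomic_3"),
}


def screen(locus_names):
    """Report which primer pairs give a product for a set of loci in one colony."""
    return {
        name: any(amplifies(LOCI[locus], fwd, rev) for locus in locus_names)
        for name, (fwd, rev) in PRIMER_PAIRS.items()
    }


heterokaryon = screen(["wild_type", "targeted_ko"])
ectopic_integrant = screen(["wild_type", "ectopic"])
print(heterokaryon)
print(ectopic_integrant)
```

In this model the flanking pair gives a product in both colonies, whereas the junction pairs are positive only when the cassette sits inside torS, which is the discrimination the proposed primers are meant to provide.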

4.3.2 PROMOTER ANALYSIS

Promoters are defined as DNA sequences working as regulatory signals of transcription initiation. They are specific sequences present in the non-coding regulatory regions of the genes, determining the position of the transcriptional start point. To study promoter function, a promoter-probe plasmid may be used, which carries one or more promoterless reporter genes (Dehli et al., 2012).

RT-PCR detected transcription of the torrubiellone cluster genes in Torrubiella, indicating that the cluster was transcriptionally active at the time of RNA isolation, after ten days of incubation in PDA. torD and torE genes were found to be expressed separately, but in the same time frame. Since only 17 bp separated the predicted stop codon of torD and the predicted start codon of torE, it seemed logical to suggest that the torE promoter was embedded within the torD coding region.

To test whether the torD500 and torE500 regions worked as promoters, a promoter-containing DNA fragment was inserted upstream of a promoterless fluorescence reporter gene (eGFP) to form a transcriptional fusion. A BASTA-resistance selection marker (bar gene), controlled by the strongly expressed gdpA promoter from A. nidulans, was also used. Autofluorescence of A. oryzae NSAR1 can sometimes be mistaken for eGFP expression, so transformation with an empty plasmid is required as a negative control, to compare the intrinsic fluorescence with the higher fluorescence intensity resulting from reporter gene activity.

Promoters are usually described as “weak” or “strong”, according to the level of reporter activity that they drive (Blount et al., 2012), and are also classified as constitutive or inducible: constitutive promoters drive a constant level of expression during all growth stages of the microorganism, whereas inducible promoters should allow tight control of gene expression by adding or removing signals from the growth medium. It is not clear whether the genes responsible for torrubiellone A biosynthesis have weak or strong promoters, but it is most likely that all promoters are activated simultaneously by one or two transcription factors. For example, according to the results obtained, the ZnTF transcription factor appears to activate expression of the torrubiellone A gene cluster.

Visual analysis of eGFP expression suggested that promE500 and promD500 worked at a similar level to the strong Padh from A. nidulans although, for unknown reasons, no torrubiellones A-D were detected by chemical extraction. Torrubiellone biosynthesis may be triggered by a combination of factors rather than relying solely on expression of the gene cluster: for example, Isaka et al. (2010) obtained torrubiellone in low quantities only after 88 days of incubation under different conditions, whereas the promoters are already active after one week of incubation. As anticipated, the promDmid sequence did not generate any extra fluorescence in transformants, similar to the empty-cassette (no-promoter) negative control plasmid. The promD500 and promE500 regions therefore seem to function as promoters, supporting the proposal that the promoter of the torE gene lies within the torD gene.

For efficient gene targeting, the degree of homology between the insert and the recipient strain must be almost 100%. If the frequency of homologous recombination is low, it is necessary to screen a large number of transformants in order to find the rare ones with the directed integration (Wendland et al., 2003). The construct for targeting the Torrubiella torS gene was also prepared efficiently by yeast recombination. Only a single yeast transformation event was required to join four DNA fragments: a selectable marker (bar) in a complete promoter-terminator cassette (gdpA) flanked by long homologous segments of the synthase gene. The random nature of ectopic recombination means that, in independent transformation events, the insert may integrate at any location on any chromosome, and its expression is therefore influenced by a range of local transcriptional environments. This highlights the importance of analysing several independent transformants in order to obtain an averaged view of the level of expression of the reporter gene.

The method of protoplast-mediated transformation was adapted to A. nidulans by Tilburn et al. (1983), achieving an efficiency of only 25 transformants per µg of plasmid DNA, and was later improved by Dawe et al. (2000) to several hundred transformants per µg of plasmid DNA. A spheroplast-based A. oryzae transformation protocol was afterwards described in detail by Kitamoto (2002). This method yields heterokaryotic transformants, from which homokaryotic transformants can be obtained by reselecting the progeny on selection plates. This essential step was not performed correctly for the Torrubiella-KO transformants, hence the two bands observed when the KO was attempted.

4.3.3 ZNTF AND C6TF TRANSCRIPTION FACTORS

Global regulators are widely found in fungi, while specific ones are less common, usually controlling specific behaviours under certain conditions and presenting an advantage in particular habitats. Gene regulators are essential for physiology and gene activation when required, allowing adaptation to different conditions or stimuli. Up to 37 classes of regulatory proteins have been identified, controlling growth, survival, and/or reproduction (Shelest, 2008).

Figure 4.23: Alignment of the Torrubiella TF amino acid sequences to other known TFs. The ZnTF and C6TF amino acid sequences from the torrubiellone cluster were aligned with TFs of known function from other filamentous fungi. The phylogram was constructed and analysed with MEGA5 (Tamura et al., 2011).

Phylogram analysis (Figure 4.23) positioned the two Torrubiella TFs in very different areas. The C6TF clustered with the aflatoxin regulator AFLR and close to the functionally unrelated STEA, which participates in sexual development, and the SRAA stress response factor; all these proteins are from A. fumigatus. ZnTF, in contrast, is related to HAP2 from Aspergillus udagawae and the ACR2 C6TF from A. fumigatus. Unfortunately, no exact function can be inferred from this type of analysis.

Experimental results support the proposal that one of the transcription factors could work as an activator, up-regulating the expression of gene clusters, while the other would act as a down-regulator of cluster expression: ZnTF appears to activate the torrubiellone A gene cluster (and other clusters), with or without the presence of C6TF, while C6TF by itself seems to repress the action of gene promoters.

One of the compounds obtained from the overexpression of the Zn-transcription factor ([M]H+ = 392) could be related by molecular weight to torrubiellone, but no structural elucidation by a method such as NMR was performed, because not enough material could be obtained. It is worth considering that when Isaka et al. (2010) originally described the compounds in Torrubiella sp. BCC2165, torrubiellone A was isolated from the organic phase obtained from the culture broth, while torrubiellone B was obtained only after the mycelium was macerated and then extracted. Also, all the compounds described were obtained only after careful fractionation, from one of the eight (culture broth) and nine (macerated mycelia) pooled fractions. Other torrubiellone derivatives could therefore be present in other fractions of the extract, which would support the chance of finding further torrubiellone-derived compounds.

The other molecular masses obtained from ZnTF overexpression could not be related to any known secondary metabolite obtained from any Torrubiella. Compounds from related filamentous fungi were therefore examined to identify candidates for the other compounds. The [M]H+ = 718 ion can be compared by molecular weight to verticillin E (molecular mass of 723) from Gliocladium catenulatum, a mycoparasite of Aspergillus flavus sclerotia (Joshi et al., 1999), chaetocin from Chaetomium minutum (molecular weight of 696.84) and triornicin (molecular mass of 698.76), a siderophore isolated from Epicoccum purpurascens.

The mass ion m/z of [M]H+ = 926 could not be related to any compound characterized from Torrubiella, Cordyceps and Isaria derivatives, but it implies an odd nominal mass (925) for the neutral compound. An odd mass is consistent with an odd number of nitrogen atoms and is one of the characteristics expected for a PKS-NRPS product.

CHAPTER 5

5. TORRUBIELLONE A GENE CLUSTER EXPRESSION IN A. ORYZAE

5.1 INTRODUCTION

In previous decades, the discovery of natural products was based on screening crude extracts from microorganisms, such as fungi and bacteria, and separating each compound from the extract by various techniques depending on its nature, to eventually obtain a final structure (Winter et al., 2001). Currently, natural product discovery has been enhanced by whole-genome sequencing projects, which have led to new findings about the biosynthetic capacities of microorganisms, such as silent (or inactive) gene clusters that can lead to natural, but not yet described, products (Winter et al., 2001).

Genome sequence analysis has shown that both fungi and bacteria have more SM gene clusters than reported metabolites (Walsh et al., 2010), and new metabolites produced under non-standard fermentation conditions are also being identified (Winter et al., 2001). To relate secondary metabolite gene clusters to their corresponding products, different genetic, biochemical and microbiological tools can be used to analyse and express whole gene clusters, or to add each gene one by one (Crawford and Clardy, 2012).

Several techniques have been used for the discovery of natural products or their cognate gene clusters. One is the genomisotopic approach, based on bioinformatic analysis and prediction of the amino acid precursors, used by Gross et al. (2006) in the discovery of orfamides from Pseudomonas fluorescens. This technique only works properly for bacterial NRPS. Mutagenesis approaches (by KO-generated mutants) targeting parts of the gene cluster can be used to study metabolic effects when compared to the wild-type; for example, inactivation of a PKS-NRPS gene led to the production of myxochromides (Silakowski et al., 2001) and aurafurone A (Kunze et al., 2005) in Stigmatella aurantiaca. However, this approach requires prior knowledge of which genes are involved in the production of the compound. An alternative approach to study SM is the heterologous expression of single genes or whole biosynthetic gene clusters, especially for microorganisms whose genetic manipulation has not been described or has proven to be difficult.

The combination of KO generation and heterologous expression can also be used for natural product discovery, as in the case of tenellin biosynthesis in B. bassiana. Knock-out generation in the native strain (Eley et al., 2007), plus heterologous expression in the filamentous fungus A. oryzae (Heneghan et al., 2010; Halo et al., 2008), led to the identification of its cognate biosynthetic genes and of the metabolic steps leading to tenellin and its precursors.

Heterologous hosts such as the bacterium E. coli (Zabala et al., 2012) and the yeast S. cerevisiae (Zu et al., 2010) have been tested successfully for heterologous fungal gene expression, for example the expression in E. coli of the azaphilone gene cluster from A. niger (silent under standard fermentation conditions). Also, microviridin L was obtained when a cryptic gene cluster from Microcystis aeruginosa NIES843 was expressed heterologously in E. coli (Ziemert et al., 2010).

Filamentous fungi, such as A. niger and N. crassa (Earl et al., 1990), have also been used as heterologous hosts, but members of the genus Aspergillus have most commonly been used for the reconstruction of natural product pathways from fungi. One example is the work in A. nidulans by Fujii et al., who used the expression vector pTAex3 to express the PKS gene atX from A. terreus under the control of the amylase gene promoter amyB to obtain 6-methylsalicylic acid (Fujii et al., 1995). Also, the silent gene cluster involved in asperfuranone synthesis from A. terreus was expressed in A. nidulans (Chiang et al., 2013).

A. oryzae has also been used as a heterologous host: Fujii et al. (1995) used the A. oryzae M-2-3 strain (arginine auxotroph) to express the ribonuclease T1 gene of A. oryzae under the control of PamyB, using the same pTAex3 vector model used for other Aspergillus transformations. Using the same strain, Cox et al. (2004) were able to synthesize the squalestatin S1 side chain by cloning and expression of the PKS1 gene from Phoma sp. Heneghan et al. (2010) reconstructed the tenellin biosynthetic pathway by coexpression of the four necessary genes in the M-2-3 strain, using arginine auxotrophy as a complementation marker together with two other selectable markers.

Other auxotrophic strains have been developed to avoid the use of the same selection/complementation markers in different expression vectors. The quadruple auxotroph A. oryzae NSAR1 was developed by Jin et al. (2004), based on the strains NS4 (niaD-, sC-) and NSR1 (niaD-, sC-, adeA-). NS4 has a defective nitrate reductase gene (niaD-) and an interrupted ATP sulfurylase gene (sC-), which prevent the use of nitrate and sulphate as sole nitrogen and sulphur sources, respectively. Also, Xu et al. (2010) interrupted the ornithine transcarbamylase gene (arg-), essential in the arginine biosynthesis pathway, and the phosphoribosylaminoimidazolesuccinocarboxamide synthase gene (adeA-) from the purine biosynthetic pathway, to develop an expression host that allows sequential or co-transformation to be performed without the use of antibiotics or selection drugs. The host was tested by Fujii et al., who used four separate plasmids (each with a different complementation marker) to express four genes from Phoma betae, leading to the production of the diterpene aphidicolin (Fujii et al., 2011).

Figure 5.1: Map showing essential features of the multigene expression vectors used in this research. In this variant eGFP has replaced the Gateway cassette to allow for the creation of gene fusions by homologous recombination in yeast. Four versions of the vector exist with complementation markers for use with the A. oryzae strain NSAR1 (Pahirulzaman et al., 2012).

The plasmid pEYA, or yeast-adapted Gateway entry vector, contains two attL sites that allow the transfer of any genomic segment located between them (usually the reconstructed PKS-NRPS gene) to a destination vector containing attR sites via the Gateway LR reaction (Fujii et al., 1996). This feature allows the synthase gene to be moved among different expression vectors containing attR sites without needing to reassemble the synthase each time additional genes are inserted in the pTYGSarg vector. The multigene plasmid vector pTYGSarg (Figure 5.1) used by the Bristol Polyketide group was designed especially for the assembly and transfer of megasynthase genes and for the addition of tailoring enzyme genes into the expression vector (Lazarus et al., 2014). The multigene expression vector derived from A. oryzae contains a Gateway-modified amyB expression cassette, and also the additional expression cassettes adh, gdpA and eno, in which the promoter and terminator are separated by an AscI restriction site. AscI digestion allows the simultaneous addition of up to three PCR fragments in a single yeast transformation event. The ORF of interest is positioned downstream of the desired promoter for transcriptional regulation, and upstream of a terminator for the polyadenylation of the transcript (Figure 5.2).

Other versions of the expression vector containing different complementation markers are also available.

Figure 5.2: Illustration of multigene pathway reconstruction using yeast homologous recombination and Gateway transfer. From Lazarus et al., 2014. (A) The main synthase gene, located between the Gateway transfer sites inside the yeast assembly vector, can be transferred into the multigene expression vector via Gateway recombination. (B) The AscI restriction enzyme cuts the multigene expression vector next to the end of each promoter: P1 (Padh: alcohol dehydrogenase), P2 (PgdpA: glyceraldehyde-3-phosphate dehydrogenase) and P3 (Peno: enolase). (C) Genes encoding putative tailoring enzymes (plus terminators: T1 (Tadh), T2 (TgdpA) and T3 (Teno)) can be amplified using primers containing tails homologous to the cut ends of the vector. (D) After assembly, the final vector will contain one to three genes encoding tailoring enzymes in the P1-2-3/T1-2-3 sites and the synthase gene in an expression cassette tailored for high-level expression in A. oryzae.

The main advantages of using fungi rather than bacteria (such as E. coli) as hosts for heterologous expression of PKSs and NRPSs of ascomycete origin rely on three aspects: the ability to recognize introns and remove them correctly from gDNA ORFs; the natural presence (and expression) of a phosphopantetheine transferase (PPT) gene; and correct folding of the expressed protein (Halo et al., 2008).

5.2 RESULTS

5.2.1 TORS (PKS-NRPS) ASSEMBLY

The torS synthase gene (ca. 12 kb) was reconstructed as a fusion to eGFP in the vector pEYA1eGFP (Figure 5.3), so that gene expression could be evaluated by the intensity of eGFP fluorescence under fluorescence microscopy. The initial strategy was to amplify the torS gene as three overlapping parts of more than 4 kb each from Torrubiella sp. BCC2165 gDNA. PCR was done with 50-nt primers, of which 30 nt provided the overlap for assembly by homologous recombination in S. cerevisiae.

Figure 5.3: Plasmid pEYA1eGFP. The plasmid is a modification of pEYA1 in which the eGFP reporter gene replaces the original ccdB and cam genes between the attL1 and attL2 sites. The plasmid has to be linearized at the NotI site immediately upstream of eGFP to allow assembly of the torS-eGFP fusion by yeast recombination.

The initial attempts to amplify the torS gene were successful only for parts 1 and 2; no amplification of part 3 was achieved. By splitting the third part into 3-kb and 1-kb pieces, all four pieces could be obtained for gene assembly by yeast recombination (Figure 5.4).

Figure 5.4: PCR scheme used for torS reconstruction

The first fragment of 4315 bp, amplified using primers PKS-TORR-F1 and PKS-TORR-R1, contained a NotI restriction site at the 5' end and a region homologous to the second fragment at the 3' end. The second fragment of 4410 bp, amplified using primers PKS-TORR-F2 and PKS-TORR-R2, contained an overlap with the first fragment at the 5' end and a region homologous to the third fragment at the 3' end. The first half of the third fragment (3316 bp), amplified using primers PKS-TORR-F3 and PKS-TORR-R4, had a region homologous to the second fragment at the 5' end and an overlap with the 1-kb fragment at the 3' end. The second half of the third fragment (1086 bp), amplified using primers PKS-TORR-F5 and PKS-TORR-R3, had a region homologous to the first half of the third fragment at the 5' end, while the 3' end overlapped the beginning of the eGFP coding region.

Yeast assembly was performed with the four torS pieces, together with the NotI-digested shuttle plasmid pEYA1eGFP to obtain pEYAtorSeGFP (Figure 5.5). Plasmids were extracted from cells collected en masse from yeast transformation plates and shuttled through E. coli TOP10 cells selected on LB+chloramphenicol plates. Ten colonies were screened by colony PCR using gene-specific primers, followed by confirmatory analytical digests performed on plasmids extracted from selected transformants.

Figure 5.5: torS assembly strategy in pEYAeGFP plasmid. Linearized section of the map from pEYAtorSeGFP, showing primers and cognate PCR fragments for assembly of the tor synthase.

5.2.2 EXPRESSION OF TORS+TORC IN A. ORYZAE

As a reference, to obtain the first tenellin precursor (pretenellin A) by heterologous expression in A. oryzae, it was necessary to co-express the tenS gene encoding the PKS-NRPS together with tenC encoding the enoyl reductase (Halo et al., 2008). With this in mind, the first step towards investigating heterologous expression of the torrubiellone biosynthetic genes in A. oryzae was the construction of a multigene expression plasmid carrying the torS and torC coding regions. For this purpose, the plasmid pTYGSarg was digested with AscI and recombined with the torC PCR product in the eno expression cassette by yeast recombination (Figure 5.6). The resultant plasmid was designated pTYGSargtorC. A Gateway LR reaction with pEYAtorSeGFP was used to place the torS gene into the amyB expression cassette, creating the plasmid pTYargtorSeGFPtorC (see Appendix).

Figure 5.6: pTYargtorSeGFPtorC assembly.

The pTYargtorSeGFPtorC plasmid was used to transform A. oryzae strain NSAR1, using arginine as the complementation marker. Transformants were selected and subcultured on a minimal medium lacking arginine to maintain the selection pressure. In every selection round, transformants were selected by visualization of eGFP by fluorescence microscopy (Figure 5.7). Green fluorescence was taken to indicate expression of the complete PKS-NRPS, since the eGFP moiety was located at the C-terminus of the torS coding region: eGFP can only be expressed if the torS coding region is also expressed.

Figure 5.7: eGFP expression of a representative A. oryzae transformant. (A) A representative pTYargtorSeGFPtorC A. oryzae transformant seen under light microscopy. (B) The same A. oryzae transformant under fluorescence conditions, using a specific filter to visualize eGFP expression.

Once transformants were considered genetically pure, they were transferred to nonselective production media, such as CMP or MEA, for metabolite extraction. Both growth media contain maltose, which induces expression of the genes controlled by the amyB cassette. Although pTYargtorSeGFPtorC transformants showed eGFP expression, no new compounds were detected by LC-MS analysis (Figure 5.8) in comparison to the control.

Figure 5.8: LC-MS analysis of organic extracts from a representative pTYargtorSeGFPtorC transformant. (A) ES- scan, (B) ES+ scan, (C) Diode array scan. The ES+ and ES- scans do not show any novel peak, while the diode array scan showed a peak at RT = 12.55 min without any unusual UV absorbance.

5.2.3 INTRON REMOVAL STRATEGY

In-silico analysis of the torS gene identified two introns of 80 bp (intron 1) and 72 bp (intron 2), as shown in Figure 5.9. Failure of the heterologous host to splice an intron correctly can lead to protein truncation, due to the presence of an in-frame stop codon within the intron, or to a frame-shift. The TORS enzyme appears to be expressed, as judged by eGFP fluorescence, but failure to remove a fully in-frame intron can perturb the resultant protein structure, with effects on folding and/or activity. Mis-splicing was therefore proposed as a possible explanation for the failure to produce any polyketide compound despite expression of the fluorescent marker.

Figure 5.9: Intron positions in the torS gene. Diagram of the putative 12 kb torrubiellone A synthase gene fused to eGFP, with introns (red section) flanked by MfeI (intron 1) and BglII (intron 2) sites.

5.2.3.1 INTRON 2 REMOVAL

The initial idea for intron 2 removal was to amplify the appropriate PCR fragment from cDNA of the native Torrubiella strain, followed by yeast recombination with plasmid pEYAtorSeGFP digested with BglII (which cuts on both sides of the intron), but cDNA amplification could not be achieved. The alternative approach was to design primers (TorIntRem-F1/R1/F2/R2) that would amplify overlapping fragments from gDNA and remove intron 2 by yeast recombination, using pEYAtorSeGFP as the receiving plasmid. The technique is based on amplifying the regions adjacent to the intron without including the intron itself: the primers produced a 30-bp overlap between the 3' terminal part of the 801-bp fragment amplified using TorIntRem1-F and TorIntRem1-R and the 5' part of the 195-bp PCR fragment amplified by TorIntRem2-F and TorIntRem2-R. Because the two PCR fragments overlap across the exon-exon junction, the intron is avoided entirely and is removed by yeast recombination (Figure 5.10).
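The overlap-based intron removal can be illustrated with a minimal Python sketch. The exon and intron sequences are invented toy strings (only the 72-bp intron length and the 30-bp overlap come from the text), the function names are hypothetical, and splitting the overlap evenly across the exon-exon junction is an assumption made for illustration, not necessarily how the TorIntRem primers were designed.

```python
# Toy sequences, invented for illustration only (not the real torS sequence).
EXON1 = "ATGGCTAGCTTGACCGATCGTACGGATCCTTAGCAAGTCC"
INTRON = "GTAAGT" + "ACTG" * 15 + "TTTCAG"   # 72 bp, the length of intron 2
EXON2 = "GCTGAAACCGGTTTACGCATTGACCGGATATCGGCTAACG"
OVERLAP = 30                                 # overlap length quoted in the text


def intronless_fragments(exon1, exon2, overlap=OVERLAP):
    """Return two fragments whose shared `overlap` bases span the exon-exon junction."""
    half = overlap // 2
    # Upstream fragment: exon 1 plus a primer tail copying the start of exon 2.
    upstream = exon1 + exon2[:half]
    # Downstream fragment: a primer tail copying the end of exon 1, plus exon 2.
    downstream = exon1[-(overlap - half):] + exon2
    return upstream, downstream


def recombine(upstream, downstream, overlap=OVERLAP):
    """Join two fragments through their shared terminal overlap."""
    if upstream[-overlap:] != downstream[:overlap]:
        raise ValueError("fragments do not share the expected overlap")
    return upstream + downstream[overlap:]


genomic = EXON1 + INTRON + EXON2
upstream, downstream = intronless_fragments(EXON1, EXON2)
product = recombine(upstream, downstream)

print(len(genomic) - len(product))  # number of bases removed
```

Neither fragment contains intron sequence, so the recombined product is simply the two exons joined in frame; in the real experiment the same logic applies to the 801-bp and 195-bp PCR products.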

Figure 5.10: Intron 2 removal strategy in pEYAtorSeGFP

Restriction analysis of the resulting plasmid with intron 2 removed, named pEYAtorSi2eGFP, with NotI+BglII indicated that the procedure had been successful (Figure 5.11), and this was confirmed by DNA sequencing. The torSi2 (torS with intron 2 removed) gene was inserted into pTYGSargtorC by Gateway site-specific recombination, resulting in pTYargtorSi2eGFPtorC, used for A. oryzae transformation.


Figure 5.11: Intron 2 removal plasmid strategy. (A) Plasmid map of pEYA1torSeGFP, including the intron 2 removal strategy. (B) NotI+BglII restriction digest of pEYA1torSeGFP and pEYA1torSi2eGFP. The expected bands of 7000, 6854 and 3881 bp were obtained for both plasmids, but the lowest band is 895 bp for torSi2 and 967 bp for torS, confirming the intron removal.

Several transformants were obtained, some showing pale yellow growth in solid and liquid media (Figure 5.12), in contrast to controls transformed with an empty plasmid or to the previously obtained colonies transformed with the intron-containing version of torS. However, there does not seem to be a relation between fluorescence and yellow pigmentation; instead, the more highly pigmented colonies usually exhibited low or no eGFP expression.

Figure 5.12: Pigmentation of A. oryzae transformed with pTYtorSi2eGFPtorC Ethyl acetate extractions from pTYtorSi2eGFPtorC transformants grown in liquid CDox-Arg (Left), CMP (Middle), and an empty plasmid control – arg (Right).

Direct metabolite extraction from MEA plates was performed, but LC-MS revealed no new compounds, compared to control, despite the presence of yellow pigment.


pTYargtorSi2eGFPtorC A. oryzae transformants were grown in 10 flasks containing 50 ml CMP liquid medium for 7 days, extracted and analysed by LC-MS, but, disappointingly, no difference was found (Figure 5.13), compared to the control. Removal of the second intron made no difference in comparison to pTYtorSeGFPtorC.

Figure 5.13: LC-MS analysis of a representative pTYargtorSi2eGFPtorC transformant. (A) Diode array scan of the empty plasmid transformant. (B) Diode array scan of a representative pTYargtorSi2eGFPtorC transformant. The peaks at RT = 6.64 min and 7.54 min are more visible in the transformant than in the control, but neither peak presents any unusual UV absorbance.

5.2.3.2 INTRON 1 REMOVAL

The removal of intron 2 was not enough to obtain any torrubiellone-related compound (or indeed any novel compound), although eGFP was still expressed. Intron 1 was therefore removed by yeast recombination in pEYAtorSi2eGFP, using the same strategy as for intron 2: the primers created a 30-bp overlap between the 3' terminal part of the 1411-bp fragment amplified using PKS-TORR-F1 and IR-MFe2-R-50 and the 5' part of the 2982-bp PCR fragment amplified by IR-MFe2-F-50 and PKS-TORR-R1. The fragments were fused by yeast recombination.

Figure 5.14: Intron 1 elimination strategy.

pEYAtorSi2eGFP was digested with MfeI, which cuts on both sides of the intron, and an intronless version of torS was assembled for use in the following experiments (Figure 5.14). MfeI also has another restriction site within the synthase, so a 2134 bp patch was amplified using primers torpatIR1-F and torpatIR1-R to close the plasmid appropriately. The new plasmid was named pEYAtorSintlesseGFP. Gateway transfer was performed to obtain pTYargtorSintlesseGFPtorC (Figure 5.15).

Figure 5.15: MfeI digest and prediction to confirm intron 1 removal in pEYAtorSi2eGFP. The GENtle digest prediction shows a small but noticeable difference between the intronless and i2 versions. The i2 version has two bands of 1209 and 1109 bp, while the intronless version has two bands of 1129 and 1109 bp. All the other bands remain the same.
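The diagnostic digests for both introns rest on simple arithmetic, sketched below in Python as a sanity check. The intron lengths and band sizes are the values quoted in Section 5.2.3 and in Figures 5.11 and 5.15; the function name is hypothetical, and the sketch assumes that each intron lies entirely within a single restriction fragment and that its removal neither creates nor destroys a restriction site.

```python
INTRON_1_BP = 80   # intron 1 length from the in-silico analysis
INTRON_2_BP = 72   # intron 2 length from the in-silico analysis


def band_after_intron_removal(band_bp, intron_bp):
    """Expected size of a restriction fragment once the intron inside it is deleted."""
    if intron_bp >= band_bp:
        raise ValueError("intron cannot be longer than the fragment containing it")
    return band_bp - intron_bp


# NotI+BglII digest: the 967 bp band of torS should shrink after intron 2 removal.
noti_bglii_band = band_after_intron_removal(967, INTRON_2_BP)

# MfeI digest: the 1209 bp band of torSi2 should shrink after intron 1 removal.
mfei_band = band_after_intron_removal(1209, INTRON_1_BP)

print(noti_bglii_band, mfei_band)
```

The two results agree with the 895 bp and 1129 bp bands reported for pEYA1torSi2eGFP and the intronless plasmid, respectively.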

After intron removal, the torSintlesseGFP fragment from pEYAtorSintlesseGFP was transferred by Gateway recombination into the destination vector pTYGSargtorC to obtain pTYargtorSintlesseGFPtorC. Nine transformants were obtained and subcultured for two additional rounds of selection, during which a conspicuous yellow pigment could be observed. Genetically pure colonies were then subcultured in 50 ml of liquid CDox-arg medium with added maltose. As an initial secondary metabolite analysis, the samples extracted from torSintlesseGFPtorC transformants were analysed by thin layer chromatography (TLC) to check for new compounds (Figure 5.16, B, representative sample) and compared to an untransformed A. oryzae. TLC showed the presence of new spots, possibly related to the production of new compounds.

Figure 5.16: torSintless expression plasmid and metabolite analysis (A) Plasmid map of the expression vector pTYargtorSintlesseGFPtorC (B) TLC of organic extracts of untransformed A. oryzae NSAR1 and a pTYargtorSintlessC representative A. oryzae transformant

LC-MS analysis of metabolites extracted from pTYargtorSintlesseGFPtorC transformants grown in MEB revealed the presence of new peaks, in which the peak eluting at RT= 9.37 min had a mass ion m/z of [M]H+ = 368, accompanied by a mass ion m/z of [M]H-= 366 (Figure 5.17).

The compound with a mass ion m/z of 368 [M]H+, obtained by expression of two genes from the torrubiellone biosynthetic pathway (torS and torC), can be explained as follows: torrubiellone D has a mass ion m/z of 384 [M]H+, so the absence of an OH group (replaced by H, a difference of one oxygen atom) would produce a compound with a mass ion m/z of 368 [M]H+, which is also the mass of predesmethylbassianin A (Heneghan et al., 2011). Production of such a compound is consistent with the high degree of relatedness of torS with dmbS and tenS, and of torC with dmbC and tenC, and it will be named pretorrubiellone D.
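The mass reasoning above can be checked with a short Python sketch. The neutral mass of torrubiellone D (383.17) is the value given in Figure 5.18; the constants are standard monoisotopic masses, the function name is hypothetical, and treating the missing OH as the loss of a single oxygen atom (OH replaced by H) is the assumption being tested rather than an established structure.

```python
# Standard monoisotopic masses (Da).
PROTON = 1.007276       # H+ (hydrogen atom minus one electron)
SODIUM_ION = 22.989218  # Na+
WATER = 18.010565       # H2O
OXYGEN = 15.994915      # O


def expected_ions(neutral_mass):
    """Return the m/z of common singly charged ESI ions for a neutral mass."""
    return {
        "[M+H]+": neutral_mass + PROTON,
        "[M-H]-": neutral_mass - PROTON,
        "[M+Na]+": neutral_mass + SODIUM_ION,
        "[M-H2O+H]+": neutral_mass - WATER + PROTON,
    }


torrubiellone_d = 383.17                 # neutral mass quoted in Figure 5.18
deoxy_form = torrubiellone_d - OXYGEN    # OH replaced by H: one oxygen fewer

for name, mass in (("torrubiellone D", torrubiellone_d), ("deoxy form", deoxy_form)):
    ions = expected_ions(mass)
    print(name, round(ions["[M+H]+"]), round(ions["[M-H]-"]))
```

Under these assumptions the deoxy form is expected at nominal m/z 368 in ES+ and 366 in ES-, matching the ions observed for the peak at RT = 9.37 min.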

Figure 5.17: LC-MS analysis of a representative pTYargtorSintlesseGFPtorC transformant. (A) Diode array scan of an empty plasmid A. oryzae transformant. (B) Diode array scan of a representative pTYargtorSintlesseGFPtorC transformant. A prominent peak elutes at RT = 9.39 min. (C) ES- scan of a representative pTYargtorSintlesseGFPtorC transformant. (D) ES+ scan of a representative pTYargtorSintlesseGFPtorC transformant. (E) ES+ scan at RT = 9.618 min.

Isaka et al. (2010) reported UV/Vis absorbance peaks for torrubiellone D at 226, 286 and 333 nm. The new compound shares the 226 nm peak, while the others could be affected by the absence of the OH group (Figure 5.18).

Figure 5.18: Comparison of torrubiellone D and the proposed precursor structures. (A) Torrubiellone D has a molecular weight of 383.17, as described by Isaka et al. (2010). (B) The removal of an OH would lead to a mass of 367. Therefore, a proposed structure of the torrubiellone D precursor is presented, which also corresponds to predesmethylbassianin A.

5.2.4 EXPRESSION OF TORA AND TORB GENES IN TORSC A. ORYZAE TRANSFORMANTS

On the assumption that the enzymes TORA and TORB have the same functions as their homologues in tenellin and desmethylbassianin synthesis, i.e. ring expansion and N-hydroxylation, respectively, a logical step was to co-express them with torS and torC.

The genes torA, torB and torC were amplified from Torrubiella sp. BCC2165 gDNA, using 5'-extended primers designed to target the PCR products into the adh, gdpA and eno expression cassettes, respectively. pTYGSarg was digested with AscI and yeast homologous recombination was performed with the torA (1720 bp), torB (1733 bp) and torC (1236 bp) PCR products (Figure 5.19, C) to assemble pTYGSargtorABC (Figure 5.19, B). No patch was needed for this construction, since genes were directed to all three expression cassettes. PCR was used to confirm the presence of the genes in plasmids extracted from E. coli transformants, and Gateway recombination was used to insert torSintlesseGFP into the destination plasmid to create pTYargtorSintlesseGFPABC (Figure 5.19).

Figure 5.19: Construction of pTYargtorSintlesseGFPtorABC (A) Addition of torA and torB genes into adh and gdpA cassettes by yeast recombination. (B) pTYargintlesseGFPtorABC expression vector, including the torS gene without introns, and three genes encoding for tailoring enzymes (torA, torB, torC) included in the expression cassettes. See Appendix for details. (C) PCR of torA, torB and torC genes from two transformants using pTYGSargtorABC as a template. All values in basepairs.

After transformation of A. oryzae with the plasmid, metabolite extraction from MEA plates was performed, but LC-MS revealed no new compound compared to the control, despite the production of a yellow pigment. Subsequently, pTYargtorSintlesseGFPABC A. oryzae transformants were grown in 50 ml CMP liquid medium for 7 days, followed by ethyl acetate-based metabolite extraction and LC-MS analysis. This led to the identification of a new compound eluting at RT=14.35 min (Figure 5.20) in a 20-minute run, with mass ions m/z of [M]H+ = 382 and [M]H- = 380, together with a presumably dehydrated related mass ion m/z of [M-H2O]H+ = 364.

Figure 5.20: LC-MS analysis of a representative pTYargtorSintlesseGFPtorABC transformant. (A) ES- scan, presenting a interesting peak at RT = 14.48 min (B) ES+, with a peak at RT = 14.35 min matching with ES+ and Diode Array. (C) Diode array scan (D) Detection of a 380.68 m/z molecular masses at RT = 14.480 min in ES- (E) Detection of a 382.44 and 364.21 m/z molecular masses at RT = 14.387 min in ES+.


The mass ion of m/z of [M]H+= 382 is suggestive of a compound related to torrubiellone C, but differing in the position of the hydroxylation, as proposed in Figure 5.21. The torrubiellone C modification is predicted to be identical to desmethylbassianin (Heneghan et al., 2011), and is the expected product of a ring-expanding cytochrome P450 (encoded by torA) and an N-hydroxylating cytochrome P450 (encoded by torB) acting on the torrubiellone D precursor with the structure of predesmethylbassianin A (activities of torS and torC).

Figure 5.21: Proposal of a torrubiellone C modification. (A) Torrubiellone C structure. (B) Torrubiellone C modification proposal. According to prediction, the group should not be present in the torrubiellone C modification, instead, another OH group should be joined to the nitrogen part of the structure (labelled in light-blue), because of the participation of the torB gene working as a cytochrome P450.

To add weight to this interpretation, the torS gene was combined with the dmbA, dmbB and dmbC genes from the desmethylbassianin biosynthetic pathway to assemble the plasmid pTYargtorSintlesseGFPdmbABC (Figure 5.22). If the homologous tailoring enzymes have an equivalent function in dmb and tor synthases, the resultant product should be the same as that obtained for torSABC.


Figure 5.22: pTYargtorSintlesseGFPdmbABC plasmid map

Analysis of A. oryzae pTYargtorSintlesseGFPdmbABC transformants indicated the presence of two new peaks compared to controls, eluting at RT = 7.10 min and RT = 9.55 min. The m/z values detected for the peak at RT = 7.10 min were [M]H- = 380 and [M]H+ = 382 (Figure 5.23), matching those of the compound obtained in torSABC transformants and most likely indicating synthesis of the same compound in both cases. For the peak at RT = 9.55 min, the main mass ions detected were m/z 348 [M-H2O]H+, 366 [M]H+ and 388 [M]Na+, which could correspond to a variant of the compound produced by torSC or to predesmethylbassianin A. The mass difference could be explained by loss of a double bond in this compound, which would be a precursor of the compound eluting at RT = 7.10 min (Table 5.1), but structure elucidation by NMR is required to confirm the proposal.
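The ion assignments above follow simple adduct arithmetic, which can be illustrated with a short sketch. The neutral masses used here (381 and 365) are nominal values inferred from the reported ions and are an assumption for illustration only, not measured or confirmed masses; the function names are hypothetical.

```python
# Standard monoisotopic mass constants (Da).
PROTON = 1.007276
SODIUM_ION = 22.989218  # Na+ (sodium atom minus one electron)
WATER = 18.010565


def expected_ions(neutral_mass: float) -> dict:
    """Return the m/z of common singly charged ions for a neutral mass."""
    return {
        "[M+H]+": neutral_mass + PROTON,
        "[M-H]-": neutral_mass - PROTON,
        "[M+H-H2O]+": neutral_mass + PROTON - WATER,
        "[M+Na]+": neutral_mass + SODIUM_ION,
    }


def nominal(ions: dict) -> dict:
    """Round each m/z to the nearest integer for comparison with nominal data."""
    return {name: round(mz) for name, mz in ions.items()}


if __name__ == "__main__":
    # Assumed nominal neutral masses for the two peaks discussed in the text.
    for label, mass in (("RT = 7.10 min", 381.0), ("RT = 9.55 min", 365.0)):
        print(label, nominal(expected_ions(mass)))
```

Under these assumed neutral masses, the calculated nominal ions agree with the values reported for both peaks (382/380/364 and 366/348/388). This is consistent with the assignments but does not replace structural confirmation.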


Figure 5.23: LC-MS analysis of organic extracts from a representative pTYargtorSintlesseGFPdmbABC transformant. (A) ES+ scan, presenting a marked peak at RT = 9.76 min (B) Diode array scan, with two peaks eluted at RT = 7.10 min and 9.55 min. (C) UV absorbance spectra for peak eluted at RT = 9.548 min, with peaks at 200.23, 252.23 and 383.23 (D) UV absorbance spectra for peak eluted at RT = 7.098 min, with peaks at 222.23, 260 and 386.23 (E) Detection of 362.72 [M- - H2O] and 380.73 [M-] m/z molecular masses at RT = 7.363 min in ES- (F) Detection of 346.59, 364.75 [M+ -H2O], 382.70 [M+] and 404.67 [M + Na] m/z molecular masses at RT = 7.292 min in ES+ (G) Detection of 348.70, 366.72 [M+] and 388.89 [-H2O+Na] m/z molecular masses at RT = 9.739 min in ES+

None of the proposed structures, which are shown in Table 5.1, should have the OH group with the methyl group seen in torrubiellones A and B. UV absorbance peaks of 200, 252 and 383 nm for torSC/dmbSC and 202/255/386 nm for torSABC/torSdmbABC are consistent with identical activities for torS and dmbS for the three sets of tailoring enzymes.


Table 5.1: Proposed torrubiellone-related structures for compounds extracted from A. oryzae transformed with pTYargtorSintlesseGFPdmbABC

RT = 7.10 min: Same structure as predesmethylbassianin A

RT = 9.55 min: Same structure as desmethylbassianin

5.2.5 FUNCTION OF TORD AND TORE ENZYMES

With the functions of the Torrubiella PKS-NRPS, ER and P450s proposed to work identically to their counterparts in desmethylbassianin synthesis, the structural differences between torrubiellone A and desmethylbassianin would have to be a result of the actions of other enzymes.

To investigate the function of the genes torD and torE, in conjunction with the four core biosynthetic genes, both coding regions were inserted into the plasmid pTYGSade by homologous recombination in yeast, creating pTYGSadetorDE (Figure 5.24).


Figure 5.24: pTYGSadetorDE assembly (A) Strategy for adding torD and torE to the adh and gpdA cassettes of pTYGSade, respectively. (B) Plasmid map of pTYGSadetorDE

The torDE plasmid was transformed into a tenellin-producing recombinant strain of A. oryzae (tenSABC), using adenine as the complementation marker. Arginine could not be used because the earlier introduction of the tenellin genes (tenSABC, with arg as selection) had already added the arg gene. As before, transformants were subcultured three times on selective medium and then transferred to MEA plates for production. Yellow pigmentation was occasionally apparent, as previously observed, but an additional orange pigment was also sometimes obtained (not shown). Extracts were analysed by LC-MS. The tenellin producer and the tenSABC + torDE transformants showed almost the same peaks in the diode array scan, differing only in an extra peak at RT = 9.88 min. The peak at RT = 9.14 min in the torDE transformants (Figure 5.25) was less intense than in the tenellin control, which could represent conversion of the compound eluting at RT = 9.14 min (m/z 368) into the one eluting at RT = 9.88 min (m/z 402).


Figure 5.25: LC-MS analysis of two representative tenSABC + torD and torE transformants. (A) Diode array scan of the tenellin producer, in which the peak eluting at RT = 9.14 min is higher than the corresponding peak in the presence of the torD and torE genes. (B) Diode array scan of a representative tenellin producer + torD + torE (transformant 1), which presents an additional peak at RT = 9.88 min in comparison to the tenellin producer. (C) Diode array scan of a representative tenellin producer + torD + torE (transformant 2), which presents an additional peak at RT = 9.88 min in comparison to the tenellin producer. (D) Detection of molecular masses of m/z 368.64 and 402.57 in the ES- scan at RT = 9.909 min

The mass ion m/z of [M]H- = 402 was also observed at the same RT in the tenSABC + torD + torE transformants. Addition of a mass of m/z 20 to the previously described structure could conceivably arise by further hydroxylation and partial saturation of the benzene ring, generating structures more reminiscent of torrubiellone A. Bearing in mind that there are several different possibilities for the site of double bond removal, two related structures are proposed in Figure 5.26. In the absence of structural data this is, of course, purely speculative, and NMR analysis of the compound with [M]H- = 402 would be required to confirm either proposal.


Figure 5.26: Proposal of the effect of TORD and TORE enzymes in tenellin structure (A) Proposed tenellin derivative from tenSABC + torD + torE (B) Proposed tenellin derivative from tenSABC + torD + torE (C) Tenellin

In an attempt to elucidate the function of each gene (torD and torE) individually, the plasmids pTYGSadetorD and pTYGSadetorE were constructed and used to co-transform the NSAR1 strain of A. oryzae together with plasmid pTYargtorSintlesseGFPtorC, or to transform the tenSABC tenellin-producing A. oryzae strain. The plasmid pTYargtorSintlesseGFPtorC was chosen instead of the full torSABC transformant because the order in which the enzymes act is not known, and the activities of TORA and/or TORB could affect the function of TORD and TORE. In addition, according to the proposed biosynthetic pathway, both torD and torE should participate earlier in the pathway than torA and torB.

Extracts from a co-transformant of pTYargtorSintlesseGFPtorC and pTYGSadetorE contained a low-intensity peak eluting at RT = 8.27 min (Figure 5.27). Speculatively, this peak could correspond to the torrubiellone precursor obtained for torSC (m/z 368) with an added OH group, giving a molecular ion mass m/z of [M]H+ = 384. Another mass ion, [M]H+ = 408, could correspond to the addition of sodium to the structure (+22), although this would probably also require the addition of two hydrogens, whereas no structural variant could be envisaged for the mass ion m/z of [M]H+ = 394. Some of the pTYargtorSintlesseGFPtorC + torE transformants developed an intense orange pigment, but this appeared to be only a higher concentration of the yellow pigment obtained previously. When plate extractions were performed, the orange and yellow pigments looked practically the same when diluted.


Figure 5.27: LC-MS analysis of a representative torSC + torE transformant. (A) ES+ scan of the torSC + torE transformant, which presents a novel peak at RT = 8.28 min, labelled in green. (B) Diode array scan of torSC + torE, in which a small peak is present at RT = 8.28 min. The other peaks found do not show any UV absorbance. (C) Detection of molecular masses of m/z 368.08 (preDMB / pre-torrubiellone), 384.05 and 408.51 in the ES+ scan at RT = 8.279 min

In addition, the tenellin-producing A. oryzae strain was transformed with pTYGSadetorD (Figure 5.28). Diode array scan showed peaks at RT = 6.57, 7.45, 10.33 and 11.97 min, but ES+ scan shows only peaks at RT = 10.42 and 11.98 min (related to diode array). Additional peaks were found at 8.40 and 9.84 min.


Figure 5.28: LC-MS analysis of a representative tenSABC + torD transformant. (A) ES+ scan of tenSABC + torD, with prominent peaks at RT = 8.40, 9.84, 10.42 and 11.98 min. (B) Diode array scan of tenSABC + torD, with peaks at RT = 7.45, 10.33 and 11.97 min. (C) ES+ scan of tenSABC + torD at RT = 10.324 min, with an m/z of [M]H+ = 368. (D) ES+ scan of tenSABC + torD at RT = 9.728 min, with an m/z of [M]H+ = 368. (E) ES+ scan of tenSABC + torD at RT = 11.962 min, with m/z values of [M]H+ = 386 and [M]H+ = 414.

ES+ analysis shows a molecular mass m/z of [M]H+ = 368 at RT = 10.324 min. Tenellin molecular mass is [M]H+ = 370, so apparently a double bond became saturated. At RT = 9.728 min the analysis again shows [M]H+ = 368, accompanied by an m/z of 398. The difference of 30 in molecular mass could not be interpreted and requires further structural analysis by NMR. Finally, a molecular mass m/z of [M]H+ = 414/415 appears at RT = 11.962 min, differing by 44 from the tenellin mass, which could be interpreted as the addition of two sodium atoms to the structure.

Afterwards, the expression of all six genes of the torrubiellone A gene cluster was investigated by co-transforming A. oryzae NSAR1 with pTYargtorSintlesseGFPtorABC and pTYGSadetorDE. At this stage, the relevance of the yellow pigment obtained in the transformants was finally investigated. It was possible that the yellow pigment was kojic acid, but LC-MS analysis (Figure 5.29) showed that this was not the case. Kojic acid would elute very early, and, curiously, a significant early-eluting peak was observed in all samples.

Figure 5.29: Effect of pigmentation in LC-MS analysis of organic extracts of three different transformant colonies. Diode Array scan of independent torSABCDE transformants with and without pigmentation, together with A. oryzae untransformed control. (A) Untransformed A. oryzae, presenting a peak at RT = 2.97 min, which could correspond to Kojic acid. (B) torSABCDE transformant, which presented a strong yellow pigmentation (C) torSABCDE transformant, which did not present any pigmentation, (D) torSABCDE transformant, which presented a mild yellow pigmentation. Different colours are used to show the presence/absence of certain peaks.

Compared to the control, a peak eluted at RT = 15.96 min in both the mild yellow and yellow extracts, while this peak was not present in the white transformant. UV absorbance was detected for the peak (Figure 5.30), but neither ES+ nor ES- gave clear detection of a mass ion.

Figure 5.30: UV absorbance spectra for peak eluted at RT = 15.942 min, with peaks of 215.94 and 383.94 from a representative mild yellow torSABCDE transformant.


5.3 DISCUSSION

Previous research provided only structural information on torrubiellones, but nothing related to genomic analysis. To be able to perform this research, Torrubiella sp. BCC2165 was obtained from M. Isaka (BIOTEC, Thailand) and its genome sequence was obtained. Application of the antiSMASH algorithm and BLAST searching with gene sequences emanating from related polyketide biosynthetic pathways rapidly led to the identification of a putative torrubiellone A gene cluster, composed of at least six genes. This was the starting point for attempts to elucidate the torrubiellone A biosynthetic pathway by heterologous gene expression in A. oryzae.

Assembly of the 12 kb torS megasynthase gene presented some difficulties, mainly with respect to the 3'-end. Only on the third attempt at reconstruction was the gene fully assembled and fused to the reporter gene eGFP. Using eGFP as a reporter is not an essential requirement for the expression of the gene or production of the compound, but the selection of recombinants should be facilitated by visualization of eGFP expression. Unfortunately, eGFP expression gave misleading results, as green fluorescence appeared but the enzyme was inactive. Assembly of the torSeGFP gene between attL sites of the entry vector pEYA meant that it was easily transferred by the Gateway LR site-specific recombination process into the Gateway-modified amyB expression cassette found on the multigene expression vectors developed for biosynthetic pathway reconstruction (Lazarus et al., 2014). Assembly of the gene in a yeast-E. coli shuttle vector also makes subsequent sequence modification, such as the removal of introns, relatively easy to achieve.

When the tenellin PKS-NRPS was expressed alone in A. oryzae, aberrant products were obtained, apparently due to a mis-programming of the enzyme, and the solution was found to co-express it with the cognate trans-acting enoyl reductase (Halo et al., 2008). A similar problem, in the heterologous expression of the lovastatin hexaketide synthase, had previously been resolved in the same manner (Kennedy et al., 1999). Fisch noted that the ER-domains of all fungal iterative PKS-NRPSs are inactive, due to the absence of the GGVG motif used for NADPH binding (Fisch et al., 2013). Song and coworkers noted that the addition of a functional trans-acting ER can divert biosynthetic pathways; they found that the heterologous expression of the ACE1 gene (with an inactive ER domain) with and without an active ER resulted in activation of different pathways, and subsequently to the production of different compounds (Song et al., 2015).

It was therefore considered reasonable to start the reconstruction of the torrubiellone biosynthetic pathway with the primary synthase gene (torS) and the trans-acting ER (torC).


However, when the pTYargtorSeGFPtorC plasmid was finally assembled and expressed in A. oryzae, the presence of introns in torS apparently prevented the production of any torrubiellone-related compound. Only after removal of both introns could new compounds be detected by LC-MS.

Jaillon et al. classified introns according to whether their length is a multiple of 3 bases (Jaillon et al., 2008). By this definition intron 2 is a 3n intron (multiple of 3), whereas intron 1 belongs to the 3n+1 or 3n+2 classes. Failure to recognise (and therefore remove) a 3n+1 or 3n+2 intron will always result in a frameshift, which is likely to introduce an in-frame stop codon and lead to production of a truncated protein. The observation of green fluorescence in A. oryzae pTYargtorSeGFPtorC transformants therefore suggested that intron 1 was spliced correctly. Non-splicing of a 3n intron will not lead to protein truncation if the intron does not contain an in-frame stop codon, as is the case for torS intron 2, but the translation of additional codons could perturb the protein structure, resulting in mis-folding or blocked access of precursors or cofactors to an active site. This reasoning led to the initial removal of intron 2 only, which proved ineffective in restoring activity to torSeGFP and led to the subsequent additional removal of intron 1. Introns are usually recognized by short conserved sequences (Jaillon et al., 2008), but genes may also have potential splice sites which are detected correctly by the native host but incorrectly by a heterologous host such as A. oryzae.
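The reasoning about intron length classes can be sketched in a few lines. The sequences below are invented toy fragments, not torS sequence, and the function names are hypothetical. The sketch assumes the intron sits at a codon boundary and that translation starts at the first base of the upstream exon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def intron_class(intron: str) -> str:
    """Return '3n', '3n+1' or '3n+2' according to intron length modulo 3."""
    remainder = len(intron) % 3
    return "3n" if remainder == 0 else f"3n+{remainder}"


def first_in_frame_stop(seq: str):
    """Index of the first stop codon read in frame 0, or None if absent."""
    seq = seq.upper()
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def retained_intron_effect(upstream_exon: str, intron: str, downstream_exon: str) -> dict:
    """Predict the consequence of leaving the intron in the transcript."""
    unspliced = upstream_exon + intron + downstream_exon
    frameshift = len(intron) % 3 != 0
    stop = first_in_frame_stop(unspliced)
    # Without a frameshift, the authentic stop is the final codon; anything
    # earlier is premature. After a frameshift, any stop found is aberrant.
    premature = stop is not None and (frameshift or stop < len(unspliced) - 3)
    return {
        "class": intron_class(intron),
        "frameshift": frameshift,
        "premature_stop": premature,
    }


if __name__ == "__main__":
    # Toy example: a retained 3n intron with no in-frame stop codon.
    print(retained_intron_effect("ATGGCT", "GTAAGT", "GCTTAA"))
```

In this toy case the retained 3n intron only adds codons, whereas a 3n+1 or 3n+2 intron would shift the reading frame of everything downstream.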

Intron-induced disruption of the heterologous expression of a megasynthase was reported by Song et al. (2015): the M. oryzae ACE1 gene has three introns located between the exons of the PKS module (none in the NRPS module). The first and third introns were spliced correctly by A. oryzae, but the second intron was incorrectly spliced, leading to a frameshift and creation of a premature stop codon. The B. bassiana tenB gene provides an example of inefficient splicing when expressed in A. oryzae. Reconstruction of the tenellin biosynthetic pathway and expression from a single plasmid resulted in the accumulation of more pre-tenellin B than tenellin in the heterologous host, but this was reversed by removal of the single intron of tenB (Heneghan et al., 2010). After intron removal, A. oryzae transformants displayed yellow pigmentation, but this only appeared following introduction of a PKS-NRPS-related gene such as tenS or torS. Growth of untransformed A. oryzae on a variety of media never resulted in the appearance of the yellow pigment.

Co-expression of the intronless torS gene with torC produced a new compound eluting at RT = 9.37 min with a molecular mass of m/z 367. This is the mass of predesmethylbassianin A, a monomethylated hexaketide very similar to torrubiellone A. This result supports the expectation that TORS and DMBS are functionally equivalent, as are TORC and DMBC; in both cases their combined activities would probably generate the first stable precursor of torrubiellones and desmethylbassianin, respectively. Pre-torrubiellone D, based on the molecular masses obtained, should resemble a non-hydroxymethylated version of torrubiellone D (Figure 5.31), which is the only torrubiellone in which ring expansion has not occurred. It is probable that torrubiellone D is a stable precursor of the other torrubiellones, but the possibility that it is a shunt product cannot be discarded. Predesmethylbassianin A and pre-torrubiellone D apparently differ in the tetramic acid moiety of their structures (green), but that part can interconvert, meaning that both structures should effectively be the same. Unfortunately, attempts to purify and characterize any compounds were unsuccessful.

Figure 5.31: Structure comparison between predesmethylbassianin A, torrubiellone D and a pre- torrubiellone D. (A) Predesmethylbassianin A and pre-torrubiellone D. (B) Torrubiellone D. The main difference between the precursor and torrubiellone D is the presence of a hydroxyl group (marked with red).

Successful expression of the PKS-NRPS plus ER allowed the other genes from the cluster to be added, such as those encoding the cytochrome P450s, whose functions were already known (by analogy, with a high degree of confidence). The prediction that torA encoded a ring expandase and that torB encoded an N-hydroxylase was tested by comparing expression of torS plus torABC with torS plus dmbABC. In both cases, a product with a molecular mass ion m/z of 381 was observed, probably indicating the same compound. However, the expected compound, based on the earlier results, was desmethylbassianin, which has an m/z of 379. While it can be speculated that the mass difference is explained by the loss of a double bond (for unknown reasons), this emphasises the importance of structural information for pathway proposals.

The addition of the torD and torE genes to tenellin-producing A. oryzae transformants shed some light on the function of both genes. When both genes were co-expressed in the tenellin producer, a new peak was obtained at RT = 9.88 min in comparison to the control. Structural analysis is required to confirm the proposed modifications, but it appears that a hydroxyl group was added to the structure.


When torD was expressed in the tenellin-producing A. oryzae transformant (tenSABC), the peaks at RT = 9.728 min and RT = 11.962 min showed m/z of [M]H+ = 368, and [M]H+ = 386 and 414, respectively. If these molecular masses correspond to the same compound with modifications, m/z [M]H+ = 368 could represent a dehydrated form of m/z 386. The tenellin [M]H+ is m/z 370, so the difference of +16 between the structures could correspond to the addition of an OH group. [M]H+ = 414 could be related to the addition of two sodium atoms to the tenellin structure (+44). While addition of an OH group has not been reported for Old Yellow Enzymes, the enzyme class assigned to torD by BLAST search, they have been found to participate in the saturation of double bonds.
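The kind of mass-difference reasoning used here can be made explicit with a small lookup of nominal mass shifts. This is an illustrative sketch only: the table is a short, non-exhaustive list of common single-step changes, the function name is hypothetical, and a matching shift suggests a possibility rather than proving a structure.

```python
# Nominal mass shifts (Da) for a few common single-step modifications.
NOMINAL_SHIFTS = {
    +16: "addition of one oxygen (e.g. hydroxylation)",
    +18: "addition of water (hydration)",
    -18: "loss of water (dehydration)",
    +2: "addition of two hydrogens (saturation of a double bond)",
    -2: "loss of two hydrogens (formation of a double bond)",
    +14: "addition of CH2 (e.g. methylation)",
    +22: "sodium adduct in place of a proton ([M+Na]+ vs [M+H]+)",
}


def interpret_shift(reference_mz: int, observed_mz: int) -> str:
    """Suggest a single-step explanation for a nominal m/z difference."""
    delta = observed_mz - reference_mz
    return NOMINAL_SHIFTS.get(delta, "no single-step explanation in this table")


if __name__ == "__main__":
    # m/z values taken from the text: tenellin [M]H+ = 370, observed 386 and 368.
    print(interpret_shift(370, 386))
    print(interpret_shift(386, 368))
```

Shifts that are not in the table, or that could arise from several combined changes, still need structural data such as NMR to resolve.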

In torSC+torE transformants, a small peak eluted at RT=8.28 min with a molecular ion mass m/z of [M]H+=384; this could correspond to the proposed torrubiellone precursor accompanied by OH addition, considering the original torSC transformant with m/z of [M]H+ = 368. The other mass ion detected of [M]H+ of 408 could correspond to the addition of sodium to the structure (+22), with the saturation of a double bond. There is no TORE homologue in the tenellin, desmethylbassianin and militarinone gene clusters, but OH addition to the unsaturated chain has also been found in shunt tenellin metabolites, such as pyridovericin and hydroxytenellin, attributed to an unrelated B. bassiana cytochrome P450 (Halo et al., 2008).


CHAPTER 6

6. GENERAL DISCUSSION

Before Next Generation Sequencing (NGS) tools became commonly used, research on PKS-NRPS biosynthetic pathways was based on synthesizing degenerate PCR primers targeting fragments of PKS genes in order to screen cDNA libraries (Bingle et al., 1999). For example, to study the genes involved in fusarin production in F. venenatum, a cDNA λ library was constructed, blotted and probed with radiolabeled PCR products obtained using degenerate primers targeting the C-Met domain of the PKS moiety (Song et al., 2004). The presence of the C-Met domain could be predicted if the structure implied a methylation step. Once the domain was located, a 4 kb cDNA clone of the HR-PKS from F. venenatum could be obtained from the libraries (Song et al., 2004). Logically, this strategy only works if a C-Met domain is expected within the main synthase gene, but other domains can be targeted by primer design.

Recent advances in NGS have greatly reduced sequencing costs (Metzker, 2010), enabling whole-genome sequencing of anything from small microbial genomes (Wu et al., 2009) to complex plant or mammalian genomes (Schmutz et al., 2010). The amount of sequence data obtained by NGS tools can lead to difficulties in data processing and analysis. For example, the Illumina HiSeq 2500 platform can yield over 1,000,000,000,000 bp (1 Tbp) of raw sequence data, which may increase several-fold during downstream processing and analysis (Clooney et al., 2016).

In an average sequencing project, a first genome draft is produced containing raw data, followed by annotations and analyses. Subsequent releases then correct or alleviate assembly problems and update the auxiliary information linked to the project (Salzberg and Yorke, 2005). Raw sequencing data may have low error rates (1/50 000 or 1/100 000 bases), but when draft genomes are processed and assembled, hundreds of mis-assemblies can occur, such as incorrectly rearranged regions or large pieces of DNA sequence deleted from the genome (Salzberg and Yorke, 2005).

For Torrubiella sp. BCC2165, the sequencing data were not assembled into a full genome sequence in a single file. Instead, an online platform containing all the raw sequences was provided, allowing initial BLAST searches with any DNA sequence entered as a query. The results consisted of a list of the contigs with the best scores against the query sequence. At first, only nucleotide BLAST searching worked properly, but this was enough to identify the correct location of a gene, providing the core information for downstream studies of gene function (Florea et al., 2011).
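Ranking contigs by hit quality can be illustrated with a short sketch. It assumes the hits are available in the standard 12-column tabular BLAST output (the BLAST+ `-outfmt 6` layout); the online platform used for this work may have presented results differently. The hit values and the second contig name are invented for illustration, and `best_hit` is a hypothetical helper.

```python
import csv
import io

# Column order of the standard 12-column tabular BLAST output.
FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hit(tabular_text: str):
    """Return the hit with the lowest E-value (ties broken by bit score)."""
    reader = csv.DictReader(io.StringIO(tabular_text), fieldnames=FIELDS, delimiter="\t")
    hits = [row for row in reader if row["qseqid"] and not row["qseqid"].startswith("#")]
    if not hits:
        return None
    return min(hits, key=lambda row: (float(row["evalue"]), -float(row["bitscore"])))


# Invented example hits; only the contig 5044 name comes from the text.
EXAMPLE = "\n".join("\t".join(fields) for fields in [
    ["tenS_fragment", "contig_5044", "78.5", "1500", "300", "10",
     "1", "1500", "12000", "13500", "0.0", "1200"],
    ["tenS_fragment", "contig_0312", "65.0", "400", "130", "8",
     "200", "600", "50", "450", "1e-40", "160"],
])

if __name__ == "__main__":
    print(best_hit(EXAMPLE)["sseqid"])
```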

Frequently, sequencing and assembly quality is roughly determined only by contig size, with larger contigs being preferred to enhance the possibility to find complete coding sequences, but bigger contigs can be the result of disorganized assembly and are not necessarily a good measure of quality.

The most widely used statistic for assessing the quality of a genome assembly is the N50, defined as the length of the shortest contig in the set of longest contigs that together cover at least 50% of the total assembly length (Yandell & Ence, 2012). Although there are no precise rules or standard guidelines for quality assessment, an assembly with a gene-sized N50 can be considered adequate for annotation: if the N50 is around the median gene length (the value that separates the higher half of a data sample from the lower half), then approximately 50% of the genes should be contained on a single contig, and when combined with fragments from the rest of the genome, this allows further processing and analysis (Cantarell et al., 2008). If the N50 length is too short, additional sequencing is recommended (Tsai et al., 2010).
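The N50 calculation can be sketched as follows. The contig lengths in the example are arbitrary toy values, not the Torrubiella assembly, and the function name is hypothetical.

```python
def n50(contig_lengths):
    """Length of the shortest contig among the longest contigs that
    together cover at least half of the total assembly length."""
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


if __name__ == "__main__":
    # Toy assembly: total length 54, so half is 27; 10 + 9 + 8 reaches 27.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Because a single very long contig can dominate the sum, the N50 is best read alongside the full contig length distribution.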

Given the N50 of the Torrubiella sequencing data (7398 bp), the probability of finding a complete putative torrubiellone A gene cluster in a single contig was low. The initial search was done using part of the coding region of the tenS gene, and the best result, according to E-value, was in contig 5044. The contig length of 50529 bp was enough to contain a complete PKS-NRPS gene (usually 12 kb), together with genes encoding cytochromes, following the same genomic pattern as other PKS-NRPS clusters. If the main synthase had been split across different contigs, for instance located in one of the smallest contigs (the smallest is 200 bp), the contig would not even have contained a complete synthase domain. This would have greatly hindered the identification of the genes involved in torrubiellone synthesis.

According to Phillippy et al. (2008), sequencing quality should be assessed depending on the intended use of the data, because no automated validation tools exist.


For this research, the intended use was to find genes in a genome assembly, making both parameters (contig size and N50) very relevant for obtaining contigs long enough to allow identification of complete genes. Horn et al. (2015) published a draft genome of Verticillium hemipterigenum (anamorph Torrubiella hemipterigena), in which the whole genome assembly consisted of 26 scaffolds with a total size of 28.5 Mbp and an N50 of 6,006 kbp, which can be compared with the N50 of the Torrubiella sequencing data (7398 bp).

Once sequencing data have been obtained, in silico tools (web-based and standalone) for analysing partial or whole genome sequences have contributed greatly to natural products research (Jenke-Kodama and Dittmann, 2009). For example, simple BLAST search comparison and analysis can give a rough assessment of PKS, NRPS and hybrid PKS-NRPS genes (Sanchez et al., 2012). With more advanced bioinformatic tools, PKS and NRPS domains can be detected, in some cases even with a structural prediction of the molecule based on the precursor units (Yadav et al., 2003) and the amino acid specificity of the NRPS moiety (Rausch et al., 2005).

Dedicated software for finding genes involved in secondary metabolite synthesis can be used to analyse partial or whole genome sequencing assemblies, for example antiSMASH (Medema et al., 2011), PRISM (Skinnider et al., 2015) and NaPDoS (Ziemert et al., 2012), all three of which were used in this research. For Verticillium hemipterigenum (Horn et al., 2015), SMURF (Khaldi et al., 2010) was used, and gene names and functional descriptions were identified by BLAST search against the fungal UniProt Knowledgebase (Apweiler et al., 2004).

By combining different prediction tools, 36 clusters were described in Torrubiella sp. BCC2165 as PKS-NRPS, PKS (or PKS-like) and NRPS. The number of clusters could change if the analysis were repeated with InterProScan (Jones et al., 2014), but this was not possible because the software requires amino acid sequences and gene coordinates to enhance the quality of the prediction by comparing protein sequences. In addition, the Torrubiella sequencing data are not yet part of any known online database. Despite these difficulties, combined analysis with multiple software tools can produce reliable results if the outcome of the in silico examination is analysed thoroughly.

The metabolic profile prediction for Verticillium hemipterigenum, based on amino acid/protein sequences, revealed 27 SM biosynthesis gene clusters, including 13 PKSs, 16 NRPSs and 4 hybrid PKS-NRPS gene clusters (Horn et al., 2015). In addition, SM prediction for Fusarium graminearum showed 51 predicted SM genes: 15 PKS, 19 NRPS and 17 terpenoid synthetases (TPS), of which only 13 have known functions (Brown et al., 2012). Fungal SM gene clusters are frequently species-specific and can have different origins (Khaldi et al., 2008), such as horizontal gene transfer of clusters or conditionally dispensable chromosomes (Ma et al., 2010; Sieber et al., 2014). These numbers are very similar to the total number found for the tenellin producer B. bassiana ARSEF2860, but are three times the number of NRPSs detected for the desmethylbassianin producer C. militaris CM01.

The research by Sieber et al. (2014) demonstrates how fragile SM gene cluster prediction can be. Based on genomic and functional gene comparison with other known fungal gene clusters, 67 gene clusters were found, while expression data analysis identified only 43 clusters that are differentially expressed under standard laboratory conditions. By searching for conserved promoter motifs, only 19 clusters were identified. In all cases, the 51 genes related to SM synthesis exceed the number of currently known secondary metabolites in this organism. Also, despite the high predicted number of SM gene clusters, it is difficult to correlate an isolated fungal chemical product with a fungal gene cluster, because a vast number of clustered genes are not expressed under standard laboratory conditions (Sieber et al., 2014).

From the in silico analysis, contig 5044 (cluster 3) was chosen. This cluster contains the putative torrubiellone A PKS-NRPS synthase gene, including the expected reductive domains, such as KR, DH and ER, together with the C-Met domain for an additional methylation (-CH3) step. The detection of this domain supports the idea that the methyl group is already present in the structure, requiring an additional hydroxylation (-OH) step at C-17 of the structure, instead of full hydroxymethylation (-CH2OH) at the C-17 position. Cytochromes P450 can also be used as a parameter when searching for SM gene clusters, because they commonly participate in known fungal biosynthetic pathways related to PKS, as was done to study trichothecene mycotoxins (Meek et al., 2003) and gibberellin synthesis (Tudzynski, 2005).

The whole contig was analysed by searching for ORFs and using BLAST tools to obtain a putative function for each ORF. As a result, two transcription factors were also detected, flanking the proposed torrubiellone A gene cluster. This arrangement has also been found in the militarinone gene cluster (not published) and the tenellin gene cluster. To date, overexpression of the ten TFs has not been performed to establish whether one or both regulators participate in the control of ten cluster expression.

After ORF analysis of contig 5044, it was noticed that the torrubiellone A gene cluster shares the same gene order and composition as the tenellin and desmethylbassianin gene clusters. However, sequencing also showed a different genomic distance between one of the cytochromes (tenB, dmbB, torB) and the trans-enoyl reductase (tenC, dmbC, torC): 600 bp in the ten gene cluster against 3300 bp in Torrubiella, a region that includes two ORFs predicted by GENtle. Because of the short genomic distance between them (17 bp), it was unclear whether both ORFs were separated only by an intron, and would correspond to one single gene in the mature transcript, or whether they were effectively two separate genes. cDNA analysis demonstrated that both genes were expressed separately, and they were named torD and torE. In the tenellin gene cluster, the space between the genes covers just a few hundred bases, but it shares some sequence similarity with the promoter of torD (torD500). This suggests either that Torrubiella obtained that stretch of DNA by horizontal gene transfer or that the same fragment was lost in the B. bassiana strain, but no further analysis was done in that area. torD is annotated as NADH:oxidase in B. bassiana (but outside the ten/dmb cluster) and is also described as an Old Yellow Enzyme (OYE), while no torE equivalent was found in C. militaris or B. bassiana. Proposing a gene function from a BLAST search alone can be risky: if a BLAST search is performed using the trans-enoyl reductase nucleotide sequence (tenC) as a query, two putative functions are returned. At the time of this research, Eley et al. had already described the TENC function as an enoyl reductase (Eley et al., 2007), which made it easy to discard the other function described for tenC, an alcohol dehydrogenase, a prediction based solely on in-silico sequence comparison (Xiao et al., 2012). This situation shows how important it is to work directly with the organism to elucidate the effect of a gene on the metabolic profile.

Outside the area flanked by the TFs in contig 5044, there is one additional gene, described as a sterol desaturase, that could participate in the biosynthetic pathway. Saito et al. (1991) described sequence similarities between a gene encoding OYE and a gene from an operon induced by bile acid in Eubacterium sp., suggesting that OYE could be involved in sterol metabolism, in which the sterol desaturase enzyme also participates. Currently there is no in vivo evidence that supports that theory (Williams and Bruce, 2002). There may be a functional relation between TORD and the sterol desaturase, but it was not investigated further.

As described before, the situation of torD and torE is very interesting. Initially it was thought that both ORFs (detected by GENtle) formed one large coding sequence, and that their separation (17 bp) and the ATG start codon of torE would be removed by intron splicing during RNA maturation. cDNA amplification, however, showed that both genes were expressed separately and in the same time frame. This situation raises more questions than answers: the promoter region and 5' UTR regulatory sites of torE should be contained in the coding region of torD, while the 3' UTR and polyadenylation signals of torD should be located in the coding region of torE. How they are regulated, which elements and signals participate, and where the limits of these regions lie require further study.

The 5' UTR is located upstream of the coding sequence and allows the ribosome to bind and initiate translation, while the 3' UTR is located immediately after the translation stop codon and participates in translation termination as well as in post-transcriptional gene expression (Barrett et al., 2013). In eukaryotic organisms, in contrast to prokaryotes, the 5' UTR can be hundreds to thousands of nucleotides long. For example, the lac operon in E. coli has only 7 nucleotides in its 5' UTR (Lin et al., 2013), but the ste11 transcript in S. pombe has a 2273-nucleotide 5' UTR (Michelson et al., 1980). The 3' UTR also varies in length but, in general terms, 5' UTR length is more conserved in evolution than 3' UTR length (Lin et al., 2013).

One theory to explain why both genes can be expressed at the same time despite the short distance between them is the concept of overlapping genes, defined as adjacent genes with a certain extent of overlap in their genomic locations. The definition probably refers to coding sequences, but the concept can be extrapolated to regulatory sequences through the idea known as Promoter Overlap (PO) (Johnson and Chisholm, 2004). PO describes coupled genes whose promoters overlap within 1 kb and that show a high level of co-expression, which probably indicates that the co-regulation of neighboring genes results from sharing local chromatin status (Ho et al., 2012).

How the overlap initially arises is not clear, and none of the current hypotheses fully explains the existence of these genomic architectures (Ho et al., 2012). One theory is “overprinting”, in which a new gene is produced by accumulated mutations within a pre-existing gene, resulting in a new gene and an ancient gene overlapping at the same genomic locus (Keese and Gibbs, 2002). Another theory, proposed by the Shintani research group, is that the overlap originates when a 3'-injured gene uses its neighbor's polyadenylation signals on the opposite strand after a genome rearrangement (Shintani et al., 1999), which could be the key step in the transition from non-overlapping to overlapping genes (Sanna et al., 2008).

The research from Ho also proposes that the co-regulation and expression of paired overlapping genes result from two major cis-regulatory mechanisms. The first is the effect of sharing a common local chromatin status, which would depend on the physical distance between both genes and on the extent of the overlap (Ho et al., 2012). Secondly, the genes may be activated by the same promoter: it is possible that promE500 works as a promoter (as shown by eGFP expression), but torE and torD could also use the same promoter (promtorD500) and regulatory regions, which would explain why both genes are expressed in the same time frame and not differentially (Ho et al., 2012). It would be very interesting to continue this research by determining precisely where the 5' and 3' UTRs are located and what the effect is of having them within the coding regions of surrounding genes.

Regarding torS intron removal, the ability of A. oryzae to remove introns will depend greatly on the recognition of the GT/AG intron flanking sequences. An error in the removal (partial or total) of either intron can lead to frame shifts or deletions that could affect protein folding or cause the loss of key protein identification motifs, leading to no expression of the protein (Lazarus et al., 2014). Usually, mRNAs with premature stop codons (as with intron 1) are degraded by the nonsense-mediated mRNA decay (NMD) pathway, but this defence mechanism is not able to remove mRNAs with stopless 3n introns (as with intron 2). Despite the absence of an early stop codon, retained stopless 3n introns can still affect protein function. 3n introns are characterized by stronger splicing signals and are very efficiently spliced in their native hosts, and so do not require any extra translational control (Jaillon et al., 2008).
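The distinction drawn above between introns carrying a premature stop codon and stopless 3n introns can be illustrated with a minimal Python sketch. It is not the analysis performed in this research: the function name, the `phase` parameter and the short example sequences are hypothetical, and the sketch only reads whole codons inside the retained intron in the frame of the upstream exon, ignoring any codon that spans the exon-intron boundary.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def classify_intron(intron: str, phase: int = 0) -> dict:
    """Describe the likely consequence of retaining an intron in the mRNA.

    intron: intron DNA sequence, 5' to 3' on the coding strand.
    phase:  number of bases (0, 1 or 2) of the current codon already read
            in the upstream exon (hypothetical parameter for this sketch).
    """
    seq = intron.upper()
    # Canonical splice signals: GT at the 5' end and AG at the 3' end.
    canonical = seq.startswith("GT") and seq.endswith("AG")
    # A "3n" intron has a length that is a multiple of three.
    is_3n = len(seq) % 3 == 0
    # Read complete codons inside the intron, in the upstream reading frame.
    start = (3 - phase) % 3
    codons = [seq[i:i + 3] for i in range(start, len(seq) - 2, 3)]
    has_stop = any(codon in STOP_CODONS for codon in codons)

    if has_stop:
        outcome = "premature stop"          # expected NMD target
    elif is_3n:
        outcome = "stopless 3n insertion"   # in-frame, escapes NMD
    else:
        outcome = "frameshift"
    return {
        "canonical_gt_ag": canonical,
        "is_3n": is_3n,
        "outcome_if_retained": outcome,
    }


# Invented toy sequences, not the torS introns.
for toy in ("GTAAGTTAG", "GTACGTCAG", "GTACGTCCAG"):
    print(toy, classify_intron(toy))
```

Applied to the real torS intron sequences, a check of this kind would only flag which retention outcome is possible; it says nothing about whether A. oryzae actually splices the intron.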

To avoid intron problems, it is recommended to work with cDNA from the native strain instead of gDNA. Torrubiella RNA was difficult to obtain, so gDNA was used for gene and plasmid assembly and the introns were removed “manually” by yeast recombination, as done by Song et al. (2014). There was a chance that A. oryzae would remove the introns correctly by itself, but the first results using torS with introns did not show any change in the metabolic profile of A. oryzae, suggesting that the strain either did not recognize the introns adequately or removed them incorrectly. One problem in elucidating whether the protein was working properly was related to eGFP expression when torS was inserted with and without introns. In both cases eGFP expression was obtained, but a change in the metabolic profile occurred only in the latter. Why fluorescence was observed in both cases (indicating that the protein was expressed) while no new compounds were produced with the intron-containing version (suggesting that the enzyme was not working properly) could not be adequately explained and requires further investigation.

Regarding the detection of torrubiellone in the native strain, the initial idea was to determine when torrubiellone A production was at its highest and to perform DNA and RNA extraction at that time, in order to avoid amplification of non-torrubiellone biosynthetic genes. Yields calculated from published data (Isaka et al., 2010) for torrubiellones A, B, C and D were 4.35 mg/l, 200 µg/l, 800 µg/l and 900 µg/l, respectively, over a total of 88 days of incubation, which are very low yields in comparison to other related compounds. Song et al. (2004) described a production of 300 µg/l of fusarin C from Fusarium venenatum on the fifth day of incubation under standard conditions, while F. moniliforme produced 500 µg/l on the tenth day. Also, ten days of growth of B. bassiana were enough to detect tenellin with a yield of 20.4 mg/l, together with 3.8 mg/l of 15-hydroxytenellin (Halo et al., 2008).
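Because the titres quoted above were measured after very different incubation times, one way to compare them is to normalise each titre by the number of days of incubation. The following Python sketch does this with the figures given in the text. It assumes, as a simplification not made in the cited papers, that each titre is the final concentration at the stated day, so the result is only an average volumetric productivity; the function and variable names are illustrative.

```python
# (compound, titre in mg/l, incubation time in days), as quoted in the text
reports = [
    ("torrubiellone A", 4.35, 88),
    ("torrubiellone B", 0.2, 88),
    ("torrubiellone C", 0.8, 88),
    ("torrubiellone D", 0.9, 88),
    ("fusarin C (F. venenatum)", 0.3, 5),
    ("fusarin C (F. moniliforme)", 0.5, 10),
    ("tenellin (B. bassiana)", 20.4, 10),
    ("15-hydroxytenellin (B. bassiana)", 3.8, 10),
]


def productivity(titre_mg_per_l: float, days: float) -> float:
    """Average volumetric productivity in mg per litre per day."""
    if days <= 0:
        raise ValueError("incubation time must be positive")
    return titre_mg_per_l / days


for name, titre, days in reports:
    print(f"{name:35s} {productivity(titre, days):.4f} mg/l/day")
```

Normalising in this way separates the effect of the long incubation from the absolute titre, which is useful when deciding at which time point to attempt DNA or RNA extraction.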

At the beginning of this research, torrubiellone A could not be detected in initial screenings, despite using the same growth media as Isaka (Isaka et al., 2010), which made it necessary to find a way to enhance or stimulate torrubiellone production in the native strain. It is possible that torrubiellone A synthesis is under very tight temporal control, as proposed by Song and co-workers (Song et al., 2015), which would explain the difficulty of purifying or even detecting any torrubiellone under standard conditions. This is not an isolated problem: Bergmann et al. (2007) described aspyridone A as a major component of the metabolic profile of the A. nidulans SB4.1 strain, but when Wasil and co-workers (Wasil et al., 2013) attempted to reproduce the amounts of aspyridone described previously, low titres were obtained.

Ideally, if the compound had been detected in the first screenings, gene function would have been studied by generating knock-outs of every gene in the cluster and evaluating the metabolic profile and the structural differences between the compounds obtained from the control and from the transformants. New tools for gene editing are currently being developed, such as the CRISPR/Cas9 system for filamentous fungi (Nodvig et al., 2015), and if it is confirmed (by NMR) that ZnTF overexpression produces torrubiellone A or a derivative, it would be a very interesting project to use gene editing tools directly in the native strain to evaluate gene function. However, this could not be done because torrubiellone compounds could not be detected.

Regarding the pathway proposal in Chapter 3, there were several options at every step of the proposal, and the most likely option was chosen in each case, considering other known PKS-NRPS pathways. There was, however, a problem in defining the order and timing of the addition of the hydroxyl groups.

According to previous PKS-NRPS studies, the synthase should not be able to add the OH group at C-17 by itself; instead, it should require an additional enzyme, probably an extra cytochrome P450. If this is the case, the expression of the synthase plus the trans-enoyl reductase (torS+torC) should lead to a precursor of torrubiellone D, here named pretorrubiellone D. It is possible that another precursor from the pathway could be obtained by expressing torS alone, without the trans-ER enzyme; for example, the shunt metabolite prototenellin C was obtained by the expression of the tenS gene alone (Heneghan et al., 2011). This theory seems unlikely, because prototenellin C was not considered part of the metabolic pathway, but it cannot be completely discarded, because torS alone was not evaluated in the heterologous expression experiments, only coexpressed with the ER torC.

There is no information about a precursor in any of the studies by Isaka (Isaka et al., 2010, Isaka et al., 2014), but I consider that many answers can be obtained by analyzing shunt metabolites from other metabolic pathways. As an example, Heneghan et al. (2011) proposed that the synthesis of prototenellin A (the benzylic hydroxylated version of pretenellin-A) is catalyzed by a B. bassiana oxidative enzyme encoded elsewhere in the genome, rather than inside the tenellin cluster. Another shunt metabolite found when the ten biosynthetic pathway was analyzed was 13-hydroxypretenellin, which suggests that an additional cytochrome P450 selective for hydroxylation of the PKS side chain must be present, probably a third cytochrome P450 (Heneghan et al., 2011). The addition of the OH group may not be straightforward and may require several modifications; for example, the OH could be introduced starting from a double bond, followed by epoxidation and ring opening to leave an OH in that position, among other explanations. Such a multiple-step addition occurs in the conversion of prefusarin to fusarin, in which the COOCH3 functional group is only obtained after several modifications (Song et al., 2004).

The biosynthetic pathway proposal for torrubiellone A is based on other known biosynthetic pathways, but it cannot be established whether torrubiellone A is the final compound of the pathway, considering how it was obtained. The process used by Isaka (2010) required high fermentation volumes and repeated fractionation to furnish the four torrubiellones. As a personal point of view, I would like to see further analysis of the remaining fractions to search for any other related compounds. In 2014 the same research group described torrubiellone E from another Torrubiella strain, together with the production of torrubiellones A and B and JBIR-130, all of them very similar in structure; distinguishing which one is produced first (or even whether one or more are shunt metabolites of the pathway) is very hard without genomic studies. Another research group (Hosoya et al., 2013) also described a compound very similar to torrubiellone A from Isaria sp. NBRC 104353, namely JBIR-132, but using a different production medium of oatmeal and vegetable juice. It makes one wonder what the strain would produce if grown on PDA instead of that particular growth medium or, vice versa, which compounds could be produced by Torrubiella if grown in the Isaria sp. production medium.

The strategy based on transcription factor overexpression represented a challenge, because a transformation system for Torrubiella had to be developed to allow overexpression of the transcription factors. Development of a new transformation system is not usually as straightforward as the literature or experience suggests and can vary from species to species; the Torrubiella transformation was based on the B. bassiana protoplast-mediated transformation method. The protoplast-making procedure requires monitoring of cell wall digestion, and the optimum timing has to be determined for each enzyme and strain (Ruiz-Diez, 2002).

In this research, promoter analysis for torD500 and torE500 was performed by fusing the 500 bp upstream of the start codon of each coding region to the eGFP reporter gene and evaluating its expression by fluorescence microscopy, i.e., similar to a gain-of-function experiment. Other reporter genes, namely rfp (red fluorescent protein) and dsRed (also coding for a red fluorescent protein, obtained from Discosoma sp.), can be used instead of eGFP. Non-luminescent (visual) reporters could also be used, including lacZ, coding for beta-galactosidase (Eikmanns et al., 1991).

Measuring eGFP expression solely by visual identification is not recommended, because deciding which promoters are more active than others then depends on the researcher's point of view. The most common way to evaluate the effect of a promoter on gene expression is to fuse it to a reporter and then perform quantitative reverse transcription polymerase chain reaction (qRT-PCR), which gives precise data for comparison between the controls (untransformed host and host transformed with promoterless eGFP) and the transformants. Because of complications in obtaining RNA from the transformants, these experiments could not be performed, but they are needed if publication is intended.

Only BASTA susceptibility was tested in Torrubiella, but other selection markers should also have worked, such as hygromycin (hph gene; used by Song et al., 2015) and bleomycin (ble) (Austin et al., 1990). For example, Liu and Friesen (2012) successfully transformed Stagonospora nodorum with the hygromycin resistance gene, obtaining more than one hundred fungal colonies on regeneration media plates with 100 mg/ml hygromycin B. A relatively recent example of using BASTA as a selection marker can be found in the work of Zhang (Zhang et al., 2012), who used the bar resistance gene to investigate the role of a cytochrome P450 (BbasCYP52x1) found in B. bassiana. They successfully disrupted the BBcyp52x1 gene with the bar gene by homologous recombination. Of the 82 transformants obtained, only four were correct (checked by RT-PCR), and all the others were considered to result from ectopic recombination (Zhang et al., 2012).

Strong promoters are often used to control expression of the selection marker in the construction of genetically modified strains; for example, the constitutive promoter of the gpdA gene from A. nidulans has been widely used, giving good expression levels in a variety of transformed species (Fan et al., 2007), and it was used to drive expression of the bar selectable marker in Torrubiella. However, a native promoter from the target strain will usually display stronger activity than the exogenous A. nidulans promoter (Cho et al., 2007). For example, in B. bassiana, the gene hyd1, which encodes a class I hydrophobin, is expressed constitutively, and its promoter works better than PgpdA when fused to eGFP (Zhao et al., 2016).

But even if a strong promoter such as PgpdA and an adequate selection marker such as bar are selected for transformation, transformation of filamentous fungi is mostly integrative. Non-targeted integration generates a population of transformants in which the plasmid is inserted at a different chromosomal location in each of them (Lazarus et al., 2014). Genomic position is very significant in terms of expression, because some transformants will have the cassette located at a locus prone to high-level expression, while others will have it at a locus where the chromatin structure does not allow adequate expression (Palmer and Keller, 2010). In addition, some transformants can have multiple copies of the inserted gene, while others may contain only a single copy (Watson et al., 2008). Because of this, it is essential to screen several independent transformants, since titres of the desired metabolite may vary from transformant to transformant (Lazarus et al., 2014).

The first Torrubiella transformations were not successful, even after changing the enzymatic incubation times and growth conditions. On a few occasions the Torrubiella strain showed a fluffy (or cottony) morphological state, but the original research describing Torrubiella sp. BCC2165 did not give any details about the physical traits of the fungal strain and did not mention the possibility of two different phenotypes. Several experiments were done changing different parameters, such as temperature, light and growth media (not shown), to elucidate under which conditions the cottony state is triggered, but no firm conclusions were obtained. The transformation protocol was carried out with both phenotypes, and only the cottony state allowed transformation, with transformants surviving on the BASTA-containing selection media, while no recombinants were obtained from plain colonies.

Looking at other similar strains, Wang and coworkers (Wang et al., 2012) described and published some images of the strain C. militaris DSM1153, which also has a cottony state, but they did not focus on why this state occurs, and no explanation has been found yet. Their paper only explains that the conidia from that strain are morphologically indistinguishable from the conidia of Cordyceps ninchukispora (Su and Wang, 1986). The only difference between the growth conditions of C. militaris DSM1153 and Torrubiella was the incubation temperature, 20°C and 28°C respectively (Wang et al., 2012). It would be very interesting to learn about the physical properties of both phenotypes, and why just one of them appears to be suitable for transformation. Understanding this could provide new information to create or optimize other transformation processes. The Torrubiella transformation system could almost certainly be further improved, but this research has provided the first steps towards an efficient transformation protocol for this strain.

Knock-out experiments in Torrubiella were based on the inherent capacity of filamentous fungi for homologous recombination. One of the problems of the insertional KO methodology, besides the risk of ectopic integration, is that four outcomes or modes of insertion of the disrupting sequence (here, a PgpdA-bar-TgpdA cassette) into the gene of interest (torS) can be obtained: simple replacement by double crossover, which is the outcome expected when a KO recombination is performed; insertion solely at the 5' end or solely at the 3' end by single crossover; and insertion at both ends. Song et al. (2014) performed a KO interruption of the main PKS-NRPS gene of fusarin C biosynthesis and obtained all four possibilities. Furthermore, there is a chance of ectopic integration, i.e. insertion of the cassette at a random position in the genome. Halo and co-workers (Halo et al., 2007) reported that, using homologous recombination alone, ca. 100 individual clones should be analyzed to find a genuine KO, because the rate of ectopic integration of the selection marker (BASTA resistance) in B. bassiana was high (>90%). In total, they obtained one single KO clone from 35 transformants, corresponding to a success rate of less than 3%. To enhance the KO efficiency, they also used an RNA silencing approach based on antisense RNA (aRNA), obtaining a 25% yield of correct KO transformants (Halo et al., 2007). aRNA was not considered for this research, and the torS-KO experiments were based solely on homologous recombination instead of a combination of both techniques, in the hope of a better transformation rate than in B. bassiana. Considering the results, that was a mistake and both strategies are needed. The Torrubiella KO experiments did not show a genuine KO in the strain; in addition, gDNA was used for PCR confirmation, although it is acknowledged that cDNA should be used to study the correct insertion of the selection marker.
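The relation between a low KO success rate and the number of clones that must be analyzed can be sketched with a simple probability calculation. The Python example below assumes that each transformant is an independent event with the same probability of being a genuine KO, which is a simplification introduced here and not a claim made by Halo et al. (2007); the function name and the 95% confidence level are illustrative choices.

```python
import math


def clones_to_screen(p_success: float, confidence: float = 0.95) -> int:
    """Smallest number of independent clones n such that the probability of
    finding at least one genuine KO, 1 - (1 - p)^n, reaches `confidence`."""
    if not 0 < p_success < 1:
        raise ValueError("p_success must be between 0 and 1 (exclusive)")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1 (exclusive)")
    return math.ceil(math.log(1 - confidence) / math.log(1 - p_success))


# Success rates quoted in the text: 1 KO in 35 transformants by homologous
# recombination alone, and 25% with the antisense RNA approach.
print(clones_to_screen(1 / 35))
print(clones_to_screen(0.25))
```

Under these assumptions, a success rate of about 3% calls for screening on the order of one hundred clones, whereas a 25% rate reduces the effort to roughly ten, which supports combining both strategies in future experiments.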

In general terms, the transformation system based on protoplasts appears to work correctly in Torrubiella, judging by the increased resistance to ammonium glufosinate, but it must be refined to obtain KO transformants. In addition, the mistake made in the primer design makes it necessary to repeat the experiments, select more transformants and perform two different PCR amplifications to confirm correct gene interruption by homologous recombination.

After the development of the transformation system, the transcription factors were also overexpressed, and ZnTF transformants produced an intense yellow pigment in comparison to the colorless controls.

The ZnTF LC-MS analysis showed a novel peak at RT = 5.934 min that was absent from the wild type control. The m/z values found in ES+ were 392, 718 and 926, but only m/z 392 could possibly be related to a torrubiellone derivative. The three masses fit the “odd” number rule for PKS-NRPS products: the secondary metabolite produced by the PKS-NRPS enzyme contains a nitrogen, provided by the amino acid that the NRPS fuses to the PK chain. Because there should be only one nitrogen atom in the structure, the nominal molecular mass of the compound is expected to be an “odd” value. When LC-MS analyses are performed, electrospray ionization gives the masses as protonated (+1) or deprotonated (-1) ions, resulting in an “even” number as the m/z of the compound.
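The parity argument above can be checked with a few lines of Python. The sketch assumes that each ion is singly charged and corresponds to the protonated or deprotonated molecule, [M+H]+ in ES+ or [M-H]- in ES-; other adducts are not considered. The function names are illustrative.

```python
def neutral_nominal_mass(mz: int, mode: str = "ES+") -> int:
    """Nominal mass of the neutral molecule, assuming [M+H]+ or [M-H]-."""
    if mode == "ES+":
        return mz - 1
    if mode == "ES-":
        return mz + 1
    raise ValueError("mode must be 'ES+' or 'ES-'")


def consistent_with_odd_nitrogen_count(mz: int, mode: str = "ES+") -> bool:
    """An odd nominal mass is expected for a molecule that contains an odd
    number of nitrogen atoms (for example, a single nitrogen)."""
    return neutral_nominal_mass(mz, mode) % 2 == 1


# m/z values reported for the novel ZnTF peak in ES+
for mz in (392, 718, 926):
    print(mz, neutral_nominal_mass(mz), consistent_with_odd_nitrogen_count(mz))
```

A positive result only shows that the mass is compatible with an odd number of nitrogen atoms; it does not identify the compound, which still requires structural elucidation.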

Looking at other Torrubiella strains and related products, none of them were close to the m/z 718 and 926 masses, making it imperative to elucidate the structures of the compounds by NMR. It is, however, very important not to search “blindly” without knowing what to expect, because LC-MS interpretations can sometimes be very subjective and could lead to misleading compound proposals.

The second approach to study the torrubiellone A gene cluster was heterologous expression in A. oryzae. The protocols are well established and did not require any modifications. Once a putative gene cluster has been identified, heterologous expression can be achieved by direct transfer of the complete cluster, but successful expression will depend on the promoters working properly in the heterologous host (i.e. Torrubiella promoters being recognized by A. oryzae), as well as on correct mRNA maturation in terms of intron splicing (Mattern et al., 2015). Smith's research group (Smith et al., 1990) was able to clone the whole penicillin biosynthetic gene cluster from Penicillium chrysogenum on a cosmid vector and transfer it to A. niger and N. crassa to achieve penicillin production. Also, Sakai and coworkers (Sakai et al., 2008) were able to produce the mycotoxin citrinin from Monascus purpureus by directly transferring the whole cluster into A. oryzae.

Instead of transferring the whole torrubiellone A gene cluster, it was preferred to follow the approach of previous research from the Lazarus Lab (Eley et al., 2007, Halo et al., 2008, Heneghan et al., 2011): start from expression of the PKS-NRPS alone and, from there, add genes until torrubiellone A is obtained, in order to better understand every step of the biosynthetic pathway.

First, the 12 kb PKS-NRPS torS coding region had to be assembled in an entry vector for further use. To amplify it by PCR, the torS coding region was split into three 4-kb parts. Amplification of the first and second parts was successful, but the third part could not be amplified. An advantage of the yeast homologous recombination technique is that it allows different fragments to be used by changing only the primers and the overlap area, without changing anything in the protocol; it only requires a 30 bp overlap between sequences to fuse them. A second attempt involved splitting the third part into two 2-kb fragments, but amplification was achieved only for the first of them. The third attempt consisted of dividing the third part into 3-kb and 1-kb pieces, which finally allowed amplification of all the pieces needed to reconstruct the 12 kb synthase. The overall G/C to A/T proportions of the fragments are quite similar, but according to the web-server GC Content Calculator (Biologics Corp), the 4-kb and 2-kb fragments had a concentrated GC content distribution, while the 1-kb fragment had its GC nucleotides well spread. Perhaps a more concentrated GC content in one region hinders the denaturation step of the PCR by making the template difficult to open, but this is just a theory and no information has been found about it. Alternatively, the DNA may simply fold into a stable secondary structure that hinders PCR amplification. In the end, reconstruction of the synthase was possible.
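The difference between overall GC content and the distribution of GC along a fragment can be illustrated with a sliding-window calculation. The Python sketch below is not the GC Content Calculator used in this research; the window and step sizes, the function names and the two synthetic sequences are assumptions chosen only to show how two fragments with the same overall GC content can differ in how concentrated that content is.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq: str, window: int = 100, step: int = 50) -> list:
    """GC fraction of each full window along the sequence, as (start, gc)."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def gc_spread(seq: str, window: int = 100, step: int = 50) -> float:
    """Difference between the most and least GC-rich windows."""
    values = [gc for _, gc in windowed_gc(seq, window, step)]
    return max(values) - min(values) if values else 0.0


# Synthetic examples (not torS sequence): same overall GC, different layout.
clustered = "AT" * 50 + "GC" * 50   # all GC concentrated in the second half
even = "ATGC" * 50                  # GC spread evenly along the fragment

print(gc_fraction(clustered), gc_spread(clustered, 100, 100))
print(gc_fraction(even), gc_spread(even, 100, 100))
```

Applied to the real torS fragments, a large spread between windows would flag the locally GC-rich regions that might be worth avoiding when choosing new fragment boundaries.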

After transformation with the intronless version of torS, some of the transformants produced a strong yellow pigmentation after rounds of subculturing. When Song et al. transformed A. oryzae with the ACE1 gene, the transformants produced a bright yellow color, while the untransformed control was white (Song et al., 2014). According to eGFP visualization in the transformants, there was no positive correlation between yellow pigmentation and eGFP expression; instead, the colonies that showed a high amount of pigmentation tended to exhibit low eGFP expression.

One of the flaws of the photos showing eGFP expression in this research was the lack of a scale bar for identification and comparison with other fungal strains. The way the photos were taken (with a personal camera) did not allow any measurements to be established or a scale bar to be produced automatically, so these images are not adequate for publication, and additional photos with a scale bar would be required to follow the guidelines of any selected journal. For example, the journal ChemBioChem, in which the paper describing tenellin biosynthesis in B. bassiana (Eley et al., 2007) was published, would not accept these microscopy images, because according to its guidelines microscopy images (optical, electron or scanning probe) should always contain a scale bar (Notice to Authors- ChemBioChem).

Yellow pigmentation was not that surprising, because Koziol (2014) had described before that the presence of a pigment is not necessarily indicative of the production of a novel metabolite, at least by LC-MS detection. One possibility was that the yellow colour could be related to the production of kojic acid (KA), a metabolite that inhibits the enzyme tyrosinase by chelating the copper ions required at its active site. KA is produced under aerobic conditions in several Aspergillus species, including A. oryzae and A. flavus. Very recently, the KA gene cluster was described as comprising three genes encoding a transcription factor (kojR), an enzyme (kojA) and a transporter (kojT). Production of the enzyme depends on the amount of nitrogen available in the growth medium, i.e. KA biosynthesis is activated when nitrogen is depleted (Sano et al., 2016). Kojic acid should be detected by LC-MS in the first minutes of the run and can also be ruled out by running a control alongside the transformants' organic extracts, but unfortunately kojic acid tended to cause problems when samples were analyzed by LC-MS. For any compound measurements, it would have been very helpful to have a positive control (i.e. purified torrubiellone) for peak comparison, but despite efforts to obtain one by contacting other research groups that had chemically synthesized torrubiellones B and C, it was not possible to obtain it on this occasion.

The compounds obtained with combinations of torS with torA to torE are promising, and different structures can be proposed from the molecular weight data, but the structures must be elucidated to confirm or discard any step of the torrubiellone A pathway proposal. To be objective, because of a lack of consistency and order on my part, some plasmid combinations were not tested in this research that would have given more answers about the proposed functions of the extra genes, i.e., torSC + torD (only torE was tested) and tenSABC + torE (only torD was coexpressed); in addition, the same synthase was not used for both. It is encouraged to continue with other gene combinations together with other known synthases to elucidate the TORD and TORE functions, as done with the tenellin-producing A. oryzae transformant, for example by testing both genes in a desmethylbassianin producer or testing their effect on JBIR compounds. The whole metabolic pathway proposal is theoretical, so it is imperative to find a way to elucidate the structures of the compounds obtained from the different transformations performed in this research. NMR was planned for structure elucidation, but it was not possible to obtain a sufficient amount of compound from any transformant to run the samples.

As a future project, and as another approach to studying the torS synthase gene, it would be very interesting to attempt the expression of hybrid chimeras by module/domain swapping of the torS PKS or NRPS parts with those of other known PKS-NRPS genes. An example of module swapping was reported by Fisch et al. (2011), who obtained bassianin by swapping domains between TenS and DmbS, expressed together with tenABC. Heneghan et al. (2011) were able to express a hybrid synthase formed by the PKS moiety of the tenellin synthase gene tenS fused to the NRPS part of the desmethylbassianin synthase gene dmbS, obtaining prototenellins A to C, compounds previously described when the full tenS synthase gene was expressed in A. oryzae. Nielsen and coworkers also reported module swapping between ccsA from A. clavatus, involved in cytochalasin E biosynthesis (Qiao et al., 2011), and the uncharacterized syn2 gene from M. oryzae, heterologously expressed in A. nidulans, obtaining a product with the polyketide moiety of niduclavin and the Trp residue found in niduporthin (Nielsen et al., 2016). In addition, the PKS module of ApdA, from the aspyridone pathway, and the NRPS module of CpaS, from the cyclopiazonic acid pathway, were expressed as individual proteins in yeast, yielding the tetraketide structure of aspyridone but fused to tryptophan, the amino acid incorporated by CpaS.

Domain or module swapping does not always work, as shown by Boettger et al. (2012). They combined the PKS part of the lovastatin gene lovB with the NRPS part of the chaetoglobosin A gene cheA, and this did not result in any hybrid compound, suggesting that programming rules and domain specificities can make some PKS and NRPS modules incompatible with each other (Boettger et al., 2012). Nielsen et al. also suggested that successful swapping depends on the similarity of the polyketide precursors of both PKS-NRPS genes, rather than on the compatibility of their protein structures (Nielsen et al., 2016). Considering precursors and similarities, hybrid synthases combining the PKS and NRPS modules of ten, dmb, mil and apdA with tor should result in novel metabolites for further study.

A further proposed line of work is to test the effect of the cytochrome P450s using the methodology described by Zhang et al. (2012). They cloned a cytochrome P450 (Bbcyp52x1) from B. bassiana under the GAL1 promoter to create pYe-BBcyp52x1, which was then transformed into an engineered strain, S. cerevisiae WAT11, optimized for cytochrome P450 expression. The strain has an overexpressed cytochrome P450 reductase that channels electrons from NADPH to the P450 enzyme (Pompon et al., 1996). After plasmid insertion, the enzymatic activities of the cytochrome P450 were evaluated by following the transformation rate of the metabolites produced when incubated with different substrates (Zhang et al., 2012).

The structural differences among the torrubiellones lie in the presence of two hydroxyl groups and the saturation of the phenyl group into a cyclohexyl moiety. Going further, the difference between torrubiellone B and torrubiellone A is an N-hydroxylation, a substitution which appears to confer antimalarial properties on the latter. Such changes in functional groups (or combinations of them) can have a great effect on protein structure and folding, among other properties, which is why it is so important to know which genes participate, for example, in the addition of the hydroxyl groups in the different torrubiellone compounds.

The addition of one or more hydroxyl groups to a compound should enhance its ability to form hydrogen bonds, thus increasing its water solubility. However, since the hydroxyl group is larger than the hydrogen atom it replaces, its addition also changes the overall steric dimensions of the compound and its electronegativity. In addition, functional groups that contain an available pair of electrons, such as a hydroxyl group, can donate electrons into a phenyl or other aromatic ring system, a situation that should also occur in torrubiellone (Sanchez-Cruz et al., 2012). An example of the effect of adding an OH group is found in the comparison of doxycycline with tetracycline. A hydroxyl group located within doxycycline forms an internal hydrogen bond with the adjacent tertiary amine, decreasing its capacity to form H bonds with water; in tetracycline, the OH group is no longer adjacent to the tertiary amine, making it more available to form H bonds with water and giving it an important role in the chemical properties of the compound. The decreased water solubility of doxycycline improves oral absorption and enhances penetration into bacteria (Stone et al., 2016).

The torrubiellone compounds were tested in different biological assays: against Plasmodium falciparum and Mycobacterium tuberculosis, and for cytotoxic activity against human cancer cell lines such as KB cells (oral human epidermoid carcinoma) (Isaka et al., 2010) and MCF-7 cells (breast cancer) (Isaka et al., 2014). However, there is no standard set of biological assays, in the sense that very similar compounds are not subjected to the same tests. There is therefore no information on whether the compounds of the tenellin, desmethylbassianin or militarinone biosynthetic pathways also have antimalarial activity, because they have not yet been tested. The compounds JBIR-130/131/132 were tested for anticancer properties, specifically against the human lung cancer cell line A49. It would be advisable to perform the same tests on known similar compounds to reach a degree of standardization, which would also help in proposing new theories about why some of them are antimalarial and others are not.

In summary, the alkaloid torrubiellone A was described as an antimalarial with very low human cytotoxicity. Currently, there are more efficient antimalarials, such as dihydroartemisinin, which has an IC50 value of 0.00046 µg/ml, while torrubiellone A and the more recently discovered torrubiellone E (Isaka et al., 2014; described as a dehydrated variant of torrubiellone A) need a higher concentration to be effective (IC50 values of 3.2 µg/ml for both). Comparison of the torrubiellone structures in figure 6.1 suggests that the difference in antimalarial activity between them resides in the presence or absence of N-hydroxylation. Torrubiellone E is also antimalarial but does not have the benzene ring-associated hydroxyl group. In theory, converting desmethylbassianin into an antimalarial compound would require a hydroxymethylation and a partial or full reduction of the benzene ring (but no further hydroxylation).
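
To make the potency gap explicit, the short sketch below computes the fold difference between the IC50 values quoted above. The dictionary and function names are illustrative; the comparison is on a mass-concentration basis (µg/ml) as reported, not on a molar basis, which would require the molecular weights of the compounds.

```python
# IC50 values in µg/ml, taken from the paragraph above.
IC50_UG_PER_ML = {
    "dihydroartemisinin": 0.00046,
    "torrubiellone A": 3.2,
    "torrubiellone E": 3.2,
}


def fold_difference(compound: str, reference: str) -> float:
    """Return how many times higher the IC50 of `compound` is than that of `reference`."""
    return IC50_UG_PER_ML[compound] / IC50_UG_PER_ML[reference]


if __name__ == "__main__":
    for name in ("torrubiellone A", "torrubiellone E"):
        fold = fold_difference(name, "dihydroartemisinin")
        print(f"{name}: about {fold:,.0f}-fold higher IC50 than dihydroartemisinin")
```

On these figures, both torrubiellones need a concentration roughly 7,000 times higher than dihydroartemisinin to reach the same level of inhibition.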

Figure 6.1: Comparison between different torrubiellone structures (A) Torrubiellone A (B) Torrubiellone E (C) Torrubiellone B

If the number of OH groups is indeed what confers the antimalarial properties on the PK-NRP compounds, it is plausible that adding torD and/or torE to other PKS-NRPS systems could confer properties not previously described for the resulting compound. This could work for tenellin, although it has a shorter chain length and an extra methyl group. It is an experiment worth doing (assuming that torE will hydroxymethylate and torD will reduce the ring). Tenellin has the critical N-hydroxylation but not the hydroxymethylation or the reduced ring structure, so it would be interesting to see whether adding the torD and torE coding genes to the genome of tenellin-producing A. oryzae, or to B. bassiana, results in the generation of antimalarial activity. In the latter case it would be particularly interesting to transform the two different isolates of B. bassiana that produce tenellin and desmethylbassianin with these genes, since chain length and degree of methylation may also play a role.

CHAPTER 7

7. REFERENCES

Ahmad, I., Malloch, D. (1995) Interaction of soil microflora with the bioherbicide phosphinothricin. Agriculture, Ecosystems and Environment., 54: 165– 174. doi: 10.1016/0167-8809(95)00603-P

Ahn, I.P. (2008) Glufosinate ammonium-induced pathogen inhibition and defence responses culminate in disease protection in bar-transgenic rice. Plant Physiology, 146: 213–227. doi: 10.1104/pp.107.105890

Alonso, P. L., Brown, G., Arevalo-Herrera, M., Binka, F., Chitnis, C., Collins, F., Tanner, M. (2011) A research agenda to underpin malaria eradication. PLoS Med, 8(1), e1000406. doi:10.1371/journal.pmed.1000406

Alper, H., Fischer, C., Nevoigt, E., Stephanopoulos, G. (2005) Tuning genetic control through promoter engineering. Proceedings of the National Academy of Sciences U S A. 6;102(36):12678-83. doi: 10.1073/pnas.0504604102

Andersen, M. et al. (2013) Accurate prediction of secondary metabolite gene clusters in filamentous fungi. Proceedings of the National Academy of Sciences U S A. Jan 2; 110(1): E99–E107. doi: 10.1073/pnas.1205532110.

Apweiler R et al. (2004) UniProt: the universal protein knowledgebase. Nucleic Acids Research 32:D115–D119. doi: 10.1093/nar/gkh131

Ashley, E. A., Dhorda, M., Fairhurst, R. M., Amaratunga, C., Lim, P., Suon, S., Tracking Resistance to Artemisinin Collaboration (2014) Spread of artemisinin resistance in Plasmodium falciparum malaria. New England Journal of Medicine, 371(5), 411-423. doi:10.1056/NEJMoa1314981

Atta, H., & Zamani, G. (2008) The progress of Roll Back Malaria in the Eastern Mediterranean Region over the past decade. Eastern Mediterranean Health Journal, 14 Suppl, S82-89.

Austin, B., Hall, R.M., Tyler, B.M. (1990) Optimized vectors and selection for transformation of Neurospora crassa and Aspergillus nidulans to bleomycin and phleomycin resistance. Gene. 1990 Sep 1;93(1):157-62. doi:10.1016/0378-1119(90)90152-H

Awakawa, T., Sugai, Y., Otsutomo, K., Ren, S., Masuda, S., Katsuyama, Y., Ohnishi, Y. (2013) 4-Hydroxy-3-methyl-6-(1-methyl-2-oxoalkyl)pyran-2-one synthesis by a type III polyketide synthase from Rhodospirillum centenum. Chembiochem, 14(8), 1006-1013. doi:10.1002/cbic.201300066

Bailey, A. M., Cox, R. J., Harley, K., Lazarus, C. M., Simpson, T. J., Skellam, E. (2007) Characterisation of 3-methylorcinaldehyde synthase (MOS) in Acremonium strictum: first observation of a reductive release mechanism during polyketide biosynthesis. Chemical Communications (Cambridge),(39), 4053-4055. doi:10.1039/b708614h

Bailey, A.M., Alberti, F., Kilaru, S., Collins, C.M., de Mattos-Shipley, K., Hartley, A.J., Hayes, P., Griffin, A., Lazarus, C.M., Cox, R.J., Willis, C.L., O'Dwyer, K., Spence, D.W., Foster, G.D. (2016) Identification and manipulation of the pleuromutilin gene cluster from Clitopilus passeckerianus for increased rapid antibiotic production. Scientific Reports. 4;6:25202. doi: 10.1038/srep25202

Barr, M. M. (2003) Super models. Physiological Genomics, 13(1), 15-24. doi:10.1152/physiolgenomics.00075.2002

Barrett, L., Fletcher, S., Wilton, S.(2012). Regulation of eukaryotic gene expression by the untranslated gene regions and other non-coding elements. Cellular and Molecular Life Sciences. 69 (21): 3613–3634. doi:10.1007/s00018-012-0990-9.

Basse, C. W., & Farfsing, J. W. (2006) Promoters and their regulation in Ustilago maydis and other phytopathogenic fungi. FEMS Microbiology Letters, 254(2), 208-216. doi:10.1111/j.1574-6968.2005.00046.x

Battaglia, E., Visser, L., Nijssen, A., van Veluw, G. J., Wosten, H. A., de Vries, R. P. (2011) Analysis of regulation of pentose utilisation in Aspergillus niger reveals evolutionary adaptations in . Studies in Mycology, 69(1), 31-38. doi:10.3114/sim.2011.69.03

Beck, J., Ripka, S., Siegner, A., Schiltz, E., Schweizer, E. (1990) The multifunctional 6-methylsalicylic acid synthase gene of Penicillium patulum. Its gene structure relative to that of other polyketide synthases. European Journal of Biochemistry, 192(2), 487-498. doi: 10.1111/j.1432-1033.1990.tb19252.x

Beggs, J. D. (1978) Transformation of yeast by a replicating hybrid plasmid. Nature, 275(5676), 104-109. doi:10.1038/275104a0

Bentley, R. (2006) From miso, saké and shoyu to cosmetics: a century of science for kojic acid. Natural Product Reports, 23, 1046-1062, doi: 10.1039/B603758P

Berdy, J. (2005) Bioactive microbial metabolites. Journal of Antibiotics (Tokyo), 58(1), 1-26. doi:10.1038/ja.2005.1

Bergh, K., Brakhage, A. (1998) Regulation of the Aspergillus nidulans Penicillin Biosynthesis Gene acvA (pcbAB) by Amino Acids: Implication for Involvement of Transcription Factor PACC. Applied and Environmental Microbiology, 64(3): 843–849.

Bergmann, S., Funk, A. N., Scherlach, K., Schroeckh, V., Shelest, E., Horn, U., Brakhage, A. A. (2010) Activation of a silent fungal polyketide biosynthesis pathway through regulatory cross talk with a cryptic nonribosomal peptide synthetase gene cluster. Applied and Environmental Microbiology, 76(24), 8143-8149. doi:10.1128/AEM.00683-10

Bergmann, S., Schumann, J., Scherlach, K., Lange, C., Brakhage, A. A., Hertweck, C. (2007) Genomics-driven discovery of PKS-NRPS hybrid metabolites from Aspergillus nidulans. Nature Chemical Biology, 3(4), 213-217. doi:10.1038/nchembio869

Bernard, P., & Couturier, M. (1992) Cell killing by the F plasmid CcdB protein involves poisoning of DNA-topoisomerase II complexes. Journal of Molecular Biology, 226(3), 735-745. doi:10.1016/0022-2836(92)90629-X

Bethell, D., Se, Y., Lon, C., Tyner, S., Saunders, D., Sriwichai, S., Fukuda, M. M. (2011) Artesunate dose escalation for the treatment of uncomplicated malaria in a region of reported artemisinin resistance: a randomized clinical trial. PLoS One, 6(5), e19283. doi:10.1371/journal.pone.0019283

Binninger, D.M., Skrzynia, C., Pukkila, P.J., Casselton, L.A. (1987) DNA-mediated transformation of the basidiomycete Coprinus cinereus. EMBO Journal, 6, 835-840.

Birch, A. J., & Donovan, F. W. (1953) Studies in Relation to Biosynthesis .1. Some Possible Routes to Derivatives of Orcinol and Phloroglucinol. Australian Journal of Chemistry, 6(4), 360-368.

Birkett, A. J. (2015) Building an effective malaria vaccine pipeline to address global needs. Vaccine, 33(52), 7538-7543. doi:10.1016/j.vaccine.2015.09.111

Birkinshaw, J. H., Chambers, A. R., Raistrick, H. (1942) Studies in the biochemistry of micro-organisms: Stipitatic acid, C(8)H(6)O(5), a metabolic product of Penicillium stipitatum Thom. Biochemical Journal, 36(1-2), 242-251.

Blatch, G. L., & Lassle, M. (1999) The tetratricopeptide repeat: a structural motif mediating protein-protein interactions. Bioessays, 21(11), 932-939. doi:10.1002/(SICI)1521-1878(199911)21:11<932::AID-BIES5>3.0.CO;2-N

Blount, B. A., Weenink, T., Ellis, T. (2012) Construction of synthetic regulatory networks in yeast. FEBS Letters, 586(15), 2112-2121. doi:10.1016/j.febslet.2012.01.053

Boettger, D., Bergmann, H., Kuehn, B., Shelest, E., & Hertweck, C. (2012) Evolutionary imprint of catalytic domains in fungal PKS-NRPS hybrids. Chembiochem, 13(16), 2363-2373. doi:10.1002/cbic.201200449

Boettger, D., & Hertweck, C. (2013) Molecular diversity sculpted by fungal PKS-NRPS hybrids. Chembiochem, 14(1), 28-42. doi:10.1002/cbic.201200624

Bok, J. W., Chung, D., Balajee, S. A., Marr, K. A., Andes, D., Nielsen, K. F., Keller, N. P. (2006) GliZ, a transcriptional regulator of gliotoxin biosynthesis, contributes to Aspergillus fumigatus virulence. Infection and Immunity Journal, 74(12), 6761-6768. doi:10.1128/IAI.00780-06

Botterman, J., Gossele, V., Thoen, C., Lauwereys, M. (1991) Characterization of phosphinotricin acetyltransferase and C-terminal enzymatically active fusion proteins. Gene 102: 33–37. doi: 10.1016/0378-1119(91)90534-I

Brakhage, A. A., Andrianopoulos, A., Kato, M., Steidl, S., Davis, M. A., Tsukagoshi, N., Hynes, M. J. (1999) HAP-Like CCAAT-binding complexes in filamentous fungi: implications for biotechnology. Fungal Genetics and Biology, 27(2-3), 243-252. doi:10.1006/fgbi.1999.1136

Brakhage, A. A. (2013) Regulation of fungal secondary metabolism. Nature Reviews Microbiology, 11(1), 21-32. doi:10.1038/nrmicro2916

Brakhage, A. A., & Schroeckh, V. (2011) Fungal secondary metabolites - strategies to activate silent gene clusters. Fungal Genetics and Biology, 48(1), 15-22. doi:10.1016/j.fgb.2010.04.004

Brakhage, A. A., Thon, M., Sprote, P., Scharf, D. H., Al-Abdallah, Q., Wolke, S. M., & Hortschansky, P. (2009) Aspects on evolution of fungal beta-lactam biosynthesis gene clusters and recruitment of trans-acting factors. Phytochemistry, 70(15-16), 1801-1811. doi:10.1016/j.phytochem.2009.09.011

Brignole, E. J., Smith, S., Asturias, F. J. (2009) Conformational flexibility of metazoan fatty acid synthase enables catalysis. Nature Structural & Molecular Biology, 16(2), 190-197. doi:10.1038/nsmb.1532

Bromann, K., et al. (2012) Identification and characterization of a novel diterpene gene cluster in Aspergillus nidulans. PLoS ONE 7(4):e35450. doi: 10.1371/journal.pone.0035450

Brown, D.W., Butchko, R.A.E., Baker, S.E., Proctor, R.H. (2012) Phylogenomic and functional domain analysis of polyketide synthases in Fusarium. Fungal Biology 116: 318–331. doi:10.1016/j.funbio.2011.12.005

Brown, D. W., Yu, J. H., Kelkar, H. S., Fernandes, M., Nesbitt, T. C., Keller, N. P., Leonard, T. J. (1996) Twenty-five coregulated transcripts define a sterigmatocystin gene cluster in Aspergillus nidulans. Proceedings of the National Academy of Sciences U S A, 93(4), 1418-1422.

Bruns, S., Seidler, M., Albrecht, D., Salvenmoser, S., Remme, N., Hertweck, C., Muller, F. M. (2010) Functional genomic profiling of Aspergillus fumigatus biofilm reveals enhanced production of the mycotoxin gliotoxin. Proteomics, 10(17), 3097-3107. doi:10.1002/pmic.201000129

Bu'Lock, J. D. (1961) Intermediary metabolism and antibiotic synthesis. Advances in Applied Microbiology, 3, 293-342.

Burgers, P.M., Percival, K.J. (1987) Transformation of yeast spheroplasts without cell fusion. Analytical Biochemistry, 163, 391–397. doi: 10.1016/0003-2697(87)90240-5

Burset, M., Seledtsov, I. A., Solovyev, V. V. (2000) Analysis of canonical and non-canonical splice sites in mammalian genomes. Nucleic Acids Research 28(21), 4364-4375.

Burrows, J. N., van Huijsduijnen, R. H., Mohrle, J. J., Oeuvray, C., Wells, T. N. (2013) Designing the next generation of medicines for malaria control and eradication. Malaria Journal, 12, 187. doi:10.1186/1475-2875-12-187

Buxton, F.P., Gwynne, D.I., Davies, R.W. (1989) Cloning of a new bidirectionally selectable marker for Aspergillus strains. Gene, 84, 329-334. doi: 10.1016/0378-1119(89)90507-6

Calvo, A. M., Wilson, R. A., Bok, J. W., Keller, N. P. (2002) Relationship between secondary metabolism and fungal development. Microbiology and Molecular Biology Reviews, 66(3), 447-459. doi:10.1128/Mmbr.66.3.447-459.2002

Cantarel, B.L., Korf, I., Robb, S.M., Parra, G., Ross, E., Moore, B., Holt, C., Sánchez Alvarado, A., Yandell, M. (2008) MAKER: an easy-to-use annotation pipeline designed for emerging model organism genomes. Genome Research 2008 Jan;18(1):188-96. Epub 2007 Nov 19.

Caraballo, H., & King, K. (2014) Emergency department management of mosquito-borne illness: malaria, dengue, and West Nile virus. Emergency Medicine Practice, 16(5), 1-23; quiz 23-24.

Case, M. E., Schweizer, M., Kushner, S. R., Giles, N. H. (1979) Efficient transformation of Neurospora crassa by utilizing hybrid plasmid DNA. Proceedings of the National Academy of Sciences U S A, 76(10), 5259-5263.

Chang, M. C. Y., Eachus, R. A., Trieu, W., Ro, D. K., Keasling, J. D. (2007) Engineering Escherichia coli for production of functionalized terpenoids using plant P450s. Nature Chemical Biology, 3(5), 274-277. doi:10.1038/nchembio875

Chalfie, M., Tu, Y., Euskirchen, G., Ward, W. W., Prasher, D. C. (1994) Green Fluorescent Protein as a Marker for Gene-Expression. Science, 263(5148), 802-805. doi: 10.1126/science.8303295

Chang, P. K., Ehrlich, K. C., Fujii, I. (2009) Cyclopiazonic acid biosynthesis of Aspergillus flavus and Aspergillus oryzae. Toxins (Basel), 1(2), 74-99. doi:10.3390/toxins1020074

Chatterjee, S., Pal, J.K. (2009) Role of 5'- and 3'-untranslated regions of mRNAs in human diseases. Biology of the Cell 101(5):251-62. doi: 10.1042/BC20080104.

Cheng, Y., Schneider, B., Riese, U., Schubert, B., Li, Z., Hamburger, M. (2004) Farinosones A-C, neurotrophic alkaloidal metabolites from the entomogenous deuteromycete Paecilomyces farinosus. Journal of Natural Products, 67(11), 1854-1858. doi:10.1021/np049761w

Chooi, Y. H., Fang, J., Liu, H., Filler, S. G., Wang, P., Tang, Y. (2013) Genome mining of a prenylated and immunosuppressive polyketide from pathogenic fungi. Organic Letters, 15(4), 780-783. doi:10.1021/ol303435y

Chou, W. K., Fanizza, I., Uchiyama, T., Komatsu, M., Ikeda, H., Cane, D. E. (2010) Genome mining in Streptomyces avermitilis: cloning and characterization of SAV_76, the synthase for a new sesquiterpene, avermitilol. Journal of the American Chemical Society, 132(26), 8850-8851. doi:10.1021/ja103087w

Chung, K. R., Ehrenshaft, M., Wetzel, D. K., Daub, M. E. (2003) Cercosporin-deficient mutants by plasmid tagging in the asexual fungus Cercospora nicotianae. Molecular Genetics and Genomics, 270(2), 103-113. doi:10.1007/s00438-003-0902-7

Cho, E. M., Kirkland, B. H., Holder, D. J., Keyhani, N. O. (2007). Phage display cDNA cloning and expression analysis of hydrophobins from the entomopathogenic fungus Beauveria (Cordyceps) bassiana. Microbiology, 153(Pt 10), 3438-3447. doi:10.1099/mic.0.2007/008532-0

Clooney, A. G., Fouhy, F., Sleator, R. D., O'Driscoll, A., Stanton, C., Cotter, P. D., Claesson, M. J. (2016) Comparing Apples and Oranges?: Next Generation Sequencing and Its Impact on Microbiome Analysis. PLoS ONE, 11(2), e0148028. doi:10.1371/journal.pone.0148028

Corre, C., & Challis, G. L. (2007) Heavy tools for genome mining. Chemistry & Biology, 14(1), 7-9. doi:10.1016/j.chembiol.2007.01.001

Coteron, J. M., Marco, M., Esquivias, J., Deng, X. Y., White, K. L., White, J., Phillips, M. A. (2011) Structure-Guided Lead Optimization of Triazolopyrimidine-Ring Substituents Identifies Potent Plasmodium falciparum Dihydroorotate Dehydrogenase Inhibitors with Clinical Candidate Potential. Journal of Medicinal Chemistry, 54(15), 5540-5561. doi:10.1021/jm200592f

Cox, R. J. (2007) Polyketides, proteins and genes in fungi: programmed nano-machines begin to reveal their secrets. Organic & Biomolecular Chemistry, 5(13), 2010-2026. doi:10.1039/b704420h

Cox, R. J., Crosby, J., Daltrop, O., Glod, F., Jarzabek, M. E., Nicholson, T. P., Westcott, J. (2002) Streptomyces coelicolor phosphopantetheinyl transferase: a promiscuous activator of polyketide and fatty acid synthase acyl carrier proteins. Journal of the Chemical Society-Perkin Transactions 1, (14), 1644-1649. doi:10.1039/B204633b

Cox, R. J., & Simpson, T. J. (2009) Fungal type I polyketide synthases. Methods in Enzymology, 459, 49-78. doi:10.1016/S0076-6879(09)04603-5

Crawford, J. M., & Clardy, J. (2012) Microbial genome mining answers longstanding biosynthetic questions. Proceedings of the National Academy of Sciences U S A, 109(20), 7589-7590. doi:10.1073/pnas.1205361109

Cubitt, A. B., Heim, R., Adams, S. R., Boyd, A. E., Gross, L. A., Tsien, R. Y. (1995) Understanding, Improving and Using Green Fluorescent Proteins. Trends in Biochemical Sciences, 20(11), 448-455. doi:10.1016/S0968-0004(00)89099-4

Cvitanich, C., & Judelson, H. S. (2003) Stable transformation of the oomycete, Phytophthora infestans, using microprojectile bombardment. Current Genetics, 42(4), 228-235. doi:10.1007/s00294-002-0354-3

Davison, J., al Fahad, A., Cai, M., Song, Z., Yehia, S. Y., Lazarus, C. M., Cox, R. J. (2012) Genetic, molecular, and biochemical basis of fungal tropolone biosynthesis. Proceedings of the National Academy of Sciences U S A, 109(20), 7642-7647. doi:10.1073/pnas.1201469109

Dawe, A.L., Willins, D.A., Morris, N.R. (2000) Increased transformation efficiency of Aspergillus nidulans protoplasts in the presence of dithiothreitol. Analytical Biochemistry, 283, 111–112. doi: 10.1006/abio.2000.4658

de Salas, F., Martinez, M. J., & Barriuso, J. (2015) Quorum-Sensing Mechanisms Mediated by Farnesol in Ophiostoma piceae: Effect on Secretion of Sterol Esterase. Applied and Environmental Microbiology, 81(13), 4351-4357. doi:10.1128/AEM.00079-15

Dehli, T., Solem, C., Jensen, P. R. (2012) Tunable promoters in synthetic and systems biology. Subcellular Biochemistry, 64, 181-201. doi:10.1007/978-94-007-5055-5_9

Dimroth, P., Walter, H., Lynen, F. (1970) [Biosynthesis of 6-methylsalicylic acid]. European Journal of Biochemistry, 13(1), 98-110. doi: 10.1111/j.1432-1033.1970.tb00904.x

Ding, F., William, R., Leow, M. L., Chai, H., Fong, J. Z., Liu, X. W. (2014) Directed orthometalation and the asymmetric total synthesis of N-deoxymilitarinone A and torrubiellone B. Organic Letters, 16(1), 26-29. doi:10.1021/ol402820d

Du, L., & Lou, L. (2010). PKS and NRPS release mechanisms. Natural Product Reports, 27(2), 255-278. doi:10.1039/b912037h

Dujon, B., Sherman, D., Fischer, G., Durrens, P., Casaregola, S., Lafontaine, I., De Montigny, J., Marck, C., Neuveglise, C., Talla, E. (2004) Genome evolution in yeasts. Nature 430: 35– 44. doi: 10.1038/nature02579

Earl, D., et al. (2011) Assemblathon 1: a competitive assessment of de novo short read assembly methods. Genome Research. 21(12):2224-41. doi: 10.1101/gr.126599.111

Eley, K. L., Halo, L. M., Song, Z., Powles, H., Cox, R. J., Bailey, A. M., Simpson, T. J. (2007) Biosynthesis of the 2-pyridone tenellin in the insect pathogenic fungus Beauveria bassiana. Chembiochem, 8(3), 289-297. doi:10.1002/cbic.200600398

Espeso, E. A., & Penalva, M. A. (1996) Three binding sites for the Aspergillus nidulans PacC zinc-finger transcription factor are necessary and sufficient for regulation by ambient pH of the isopenicillin N synthase gene promoter. The Journal of Biological Chemistry, 271(46), 28825-28830.

Fan, Y., Fang, W., Guo, S., Pei, X., Zhang, Y., Xiao, Y., Pei, Y. (2007) Increased insect virulence in Beauveria bassiana strains overexpressing an engineered chitinase. Applied and Environmental Microbiology, 73(1), 295-302. doi:10.1128/AEM.01974-06

Felnagle, E.A., Jackson, E.E., Chan, Y.A., Podevels, A.M., Berti, A.D., McMahon, M.D., Thomas, M.G. (2008) Nonribosomal peptide synthetases involved in the production of medically relevant natural products. Molecular Pharmaceutics 5(2):191-211. doi: 10.1021/mp700137g.

Fincham, J.R.S. (1989) Transformation in fungi. Microbiology Reviews, 53, 148-170.

Fisch, K. M., Bakeer, W., Yakasai, A. A., Song, Z., Pedrick, J., Wasil, Z., Cox, R. J. (2011) Rational domain swaps decipher programming in fungal highly reducing polyketide synthases and resurrect an extinct metabolite. Journal of the American Chemical Society, 133(41), 16635-16641. doi:10.1021/ja206914q

Florea, L., Souvorov, A., Kalbfleisch, T.S., Salzberg, S.L. (2011) Genome Assembly Has a Major Impact on Gene Content: A Comparison of Annotation in Two Bos Taurus Assemblies. PLoS ONE 6(6): e21400. doi:10.1371/journal.pone.0021400

Forsburg, S. L. (2001) The art and design of genetic screens: yeast. Nature Reviews Genetics, 2(9), 659-668. doi:10.1038/35088500

Foye, W. O., Lemke, T. L., Williams, D. A. (2008). Foye's principles of medicinal chemistry (6th ed.). Philadelphia: Lippincott Williams & Wilkins.

Fujii, I., Mori, Y., Watanabe, A., Kubo, Y., Tsuji, G., Ebizuka, Y. (2000) Enzymatic synthesis of 1,3,6,8-tetrahydroxynaphthalene solely from malonyl coenzyme A by a fungal iterative type I polyketide synthase PKS1. Biochemistry, 39(30), 8853-8858. doi: 10.1021/bi000644j

Galagan, J. E., Henn, M. R., Ma, L. J., Cuomo, C. A., Birren, B. (2005) Genomics of the fungal kingdom: insights into eukaryotic biology. Genome research, 15(12), 1620-1631. doi:10.1101/gr.3767105

Gaucher, G. M., & Shepherd, M. G. (1968) Isolation of orsellinic acid synthase. Biochemical and Biophysical Research Communications, 32(4), 664-671. doi:10.1016/0006-291X(68)90290-8

Georgianna, D. R., & Payne, G. A. (2009) Genetic regulation of aflatoxin biosynthesis: from gene to genome. Fungal Genetics and Biology, 46(2), 113-125. doi:10.1016/j.fgb.2008.10.011

Gerstmeir, R., Cramer, A., Dangel, P., Schaffer, S., Eikmanns, B.J. (2004) RamB, a novel transcriptional regulator of genes involved in acetate metabolism of Corynebacterium glutamicum. Journal of Bacteriology. 2004 May;186(9):2798-809. doi: 10.1128/JB.186.9.2798-2809.2004

Gietz, R.D., Woods, R.A. (2002) Transformation of yeast by lithium acetate/single-stranded carrier DNA/polyethylene glycol method. Methods in Enzymology, 350:87-96. doi: 10.1016/S0076-6879(02)50957-5

Glenn, T.C. (2011) Field guide to next-generation DNA sequencers. Molecular Ecology Resources. Sep;11(5):759-69. doi: 10.1111/j.1755-0998.2011.03024.x

Goffeau, A. (2000) Four years of post-genomic life with 6,000 yeast genes. FEBS Letters, 480(1), 37-41. doi: 10.1016/S0014-5793(00)01775-0

Grant, W. D., Prosser, B. A., Asher, R. A. (1990). A bacteriolytic muramidase from the basidiomycete Schizophyllum commune. Journal of General Microbiology, 136(11), 2267-2273. doi:10.1099/00221287-136-11-2267

Gressler, M., Zaehle, C., Scherlach, K., Hertweck, C., Brock, M. (2011) Multifactorial induction of an orphan PKS-NRPS gene cluster in Aspergillus terreus. Chemistry & Biology, 18(2), 198-209. doi:10.1016/j.chembiol.2010.12.011

Gross, H., Stockwell, V. O., Henkels, M. D., Nowak-Thompson, B., Loper, J. E., Gerwick, W. H. (2007) The genomisotopic approach: a systematic method to isolate products of orphan biosynthetic gene clusters. Chemistry & Biology, 14(1), 53-63. doi:10.1016/j.chembiol.2006.11.007

Guohua, X. et al. (2012) Genomic perspectives on the evolution of fungal entomopathogenicity in Beauveria bassiana. Scientific Reports 2, Article number: 483 doi:10.1038/srep00483

Halo, L. M., Heneghan, M. N., Yakasai, A. A., Song, Z., Williams, K., Bailey, A. M., Simpson, T. J. (2008). Late stage oxidations during the biosynthesis of the 2-pyridone tenellin in the entomopathogenic fungus Beauveria bassiana. Journal of the American Chemical Society, 130(52), 17988-17996. doi:10.1021/ja807052c

Halo, L. M., Marshall, J. W., Yakasai, A. A., Song, Z., Butts, C. P., Crump, M. P., Cox, R. J. (2008) Authentic heterologous expression of the tenellin iterative polyketide synthase nonribosomal peptide synthetase requires coexpression with an enoyl reductase. Chembiochem, 9(4), 585-594. doi:10.1002/cbic.200700390

Hamer, J. E., Timberlake, W. (1987) Functional organization of the Aspergillus nidulans trpC promoter. Molecular and Cellular Biology. 7(7): 2352–2359.

Hamill, R. L., Higgens, C. E., Boaz, H. E., & Gorman, M. (1969). Structure of Beauvericin, a New Depsipeptide Antibiotic Toxic to Artemia salina. Tetrahedron Letters(49), 4255. doi:10.1016/S0040-4039(01)88668-8

Harrier, L. A., Wright, F., Hooker, J. E. (1998) Isolation of the 3-phosphoglycerate kinase gene of the arbuscular mycorrhizal fungus Glomus mosseae (Nicol. & Gerd.) Gerdemann & Trappe. Current Genetics, 34(5), 386-392.

Hartley, J. L., Temple, G. F., Brasch, M. A. (2000) DNA cloning using in vitro site-specific recombination. Genome Research, 10(11), 1788-1795. doi:10.1101/Gr.143000

Hashimoto, M., Nonaka, T., Fujii, I. (2014) Fungal type III polyketide synthases. Natural Product Reports, 31(10), 1306-1317. doi:10.1039/c4np00096j

Hawksworth, D. L. (1991) The Fungal Dimension of Biodiversity - Magnitude, Significance, and Conservation. Mycological Research, 95, 641-655. doi:10.1016/S0953-7562(09)80810-1

Hendrickson, L., Davis, C. R., Roach, C., Nguyen, D. K., Aldrich, T., McAda, P. C., Reeves, C. D. (1999) Lovastatin biosynthesis in Aspergillus terreus: characterization of blocked mutants, enzyme activities and a multifunctional polyketide synthase gene. Chemistry & Biology, 6(7), 429-439. doi:10.1016/S1074-5521(99)80061-1

Heneghan, M. N., Yakasai, A. A., Halo, L. M., Song, Z., Bailey, A. M., Simpson, T. J., Lazarus, C. M. (2010) First heterologous reconstruction of a complete functional fungal biosynthetic multigene cluster. Chembiochem, 11(11), 1508-1512. doi:10.1002/cbic.201000259

Heneghan, M., Yakasai, A., Williams, K., Kadir, K.A., Wasil, Z., Bakeer, W., Fisch, K., Bailey, A., Simpson, T., Lazarus, C.M., Cox, R.J. (2011) The programming role of trans-acting enoyl reductases during the biosynthesis of highly reduced fungal polyketides. Chemical Science, 2, 972-979. doi:10.1039/C1SC00023C

Helfrich, E. J., & Piel, J. (2016) Biosynthesis of polyketides by trans-AT polyketide synthases. Natural Product Reports, 33(2), 231-316. doi:10.1039/c5np00125k

Hertweck, C. (2009) The biosynthetic logic of polyketide diversity. Angewandte Chemie International Edition, 48(26), 4688-4716. doi:10.1002/anie.200806121

Ho, M.R., Tsai, K.W., Lin, W.C. (2012) A unified framework of overlapping genes: towards the origination and endogenic regulation. Genomics. 100(4):231-9. doi: 10.1016/j.ygeno.2012.06.011.

Hoffmeister, D., & Keller, N. P. (2007) Natural products of filamentous fungi: enzymes, genes, and their regulation. Natural Product Reports, 24(2), 393-416. doi:10.1039/b603084j

Hogan, L. H., Klein, B. S., Levitz, S. M. (1996) Virulence factors of medically important fungi. Clinical Microbiology Reviews, 9(4), 469-488.

Hopwood, D. A. (1988) The Leeuwenhoek lecture, 1987. Towards an understanding of gene switching in Streptomyces, the basis of sporulation and antibiotic production. Proceedings of the Royal Society of London. Series B, Biological sciences, 235(1279), 121-138.

Horn, F., Habel, A., Scharf, D.H., Dworschak, J., Brakhage, A.A., Guthke, R., Hertweck, C., Linde, J. (2015) Draft genome sequence and gene annotation of the entomopathogenic fungus Verticillium hemipterigenum. Genome Announcements. 3(1):e01439-14. doi:10.1128/genomeA.01439-14.

Horng, J.S., Linz, J.E., Pestka, J.J. (1989) Cloning and characterization of the trpC gene from an aflatoxigenic strain of Aspergillus parasiticus. Applied and Environmental Microbiology. 55(10):2561-8.

Hosoya, T., Takagi, M., Shin-ya, K. (2013) New pyridone alkaloids JBIR-130, JBIR-131 and JBIR-132 from Isaria sp. NBRC 104353. Journal of Antibiotics (Tokyo), 66(4), 235-238. doi:10.1038/ja.2012.106

Hywel-Jones, N. L. (1993) Torrubiella luteorostrata: a pathogen of scale insects and its association with Paecilomyces cinnamomeus with a note on Torrubiella tenuis. Mycological Research, 97, 1126-1130.

Isaka, M., Chinthanom, P., Supothina, S., Tobwor, P., Hywel-Jones, N. L. (2010) Pyridone and tetramic acid alkaloids from the spider pathogenic fungus Torrubiella sp. BCC 2165. Journal of Natural Products, 73(12), 2057-2060. doi:10.1021/np100492j

Isaka, M., Kittakoop, P., Kirtikara, K., Hywel-Jones, N. L., Thebtaranonth, Y. (2005) Bioactive substances from insect pathogenic fungi. Accounts of Chemical Research, 38(10), 813-823. doi:10.1021/ar040247r

Isaka, M., Palasarn, S., Kocharin, K., Hywel-Jones, N. L. (2007) Comparison of the bioactive secondary metabolites from the scale insect pathogens, Anamorph Paecilomyces cinnamomeus, and Teleomorph Torrubiella luteorostrata. Journal of Antibiotics (Tokyo), 60(9), 577-581. doi:10.1038/ja.2007.73

Ishida, K., Lincke, T., Behnken, S., Hertweck, C. (2010) Induced biosynthesis of cryptic polyketide metabolites in a Burkholderia thailandensis quorum sensing mutant. Journal of the American Chemical Society, 132(40), 13966-13968. doi:10.1021/ja105003g

Ishikawa, M., Ninomiya, T., Akabane, H., Kushida, N., Tsujiuchi, G., Ohyama, M., Murata, T. (2009) Pseurotin A and its analogues as inhibitors of immunoglobulin E production. Bioorganic & Medicinal Chemistry Letters, 19(5), 1457-1460. doi:10.1016/j.bmcl.2009.01.029

Jaillon, O., Bouhouche, K., Gout, J. F., Aury, J. M., Noel, B., Saudemont, B., Meyer, E. (2008) Translational control of intron splicing in eukaryotes. Nature, 451(7176), 359-362. doi:10.1038/nature06495

Jenke-Kodama, H. & Dittmann, E. (2009) Bioinformatic perspectives on NRPS/PKS megasynthases: Advances and challenges. Natural Product Reports, 26(7):874–883. doi: 10.1039/b810283j

Jessen, H. J., Schumacher, A., Schmid, F., Pfaltz, A., Gademann, K. (2011) Catalytic enantioselective total synthesis of (+)-torrubiellone C. Organic Letters, 13(16), 4368-4370. doi:10.1021/ol201692h

Jin, F.J., Maruyama, J., Juvvadi, P.R., Arioka, M., Kitamoto, K. (2004) Adenine auxotrophic mutants of Aspergillus oryzae: development of a novel transformation system with triple auxotrophic hosts. Bioscience, Biotechnology, and Biochemistry. 68(3):656-62. doi: 10.1271/bbb.68.656

Johnson, D., Sung, G. H., Hywel-Jones, N. L., Luangsa-Ard, J. J., Bischoff, J. F., Kepler, R. M., Spatafora, J. W. (2009). Systematics and evolution of the genus Torrubiella (Hypocreales, Ascomycota). Mycological Research, 113(Pt 3), 279-289. doi:10.1016/j.mycres.2008.09.008

Johnson, Z.I., Chisholm, S.W. (2004) Properties of overlapping genes are conserved across microbial genomes. Genome Research. 14(11):2268-72. doi: 10.1101/gr.2433104

Jones, P., Binns, D., Chang, H.Y., Fraser, M., Li, W., McAnulla, C., McWilliam, H., Maslen, J., Mitchell, A., Nuka, G., Pesseat, S., Quinn, A.F., Sangrador-Vegas, A., Scheremetjew, M., Yong, S.Y., Lopez, R., Hunter, S. (2014) InterProScan 5: genome-scale protein function classification. Bioinformatics. 30(9):1236-40. doi: 10.1093/bioinformatics/btu031.

Kale, S. P., Milde, L., Trapp, M. K., Frisvad, J. C., Keller, N. P., Bok, J. W. (2008) Requirement of LaeA for secondary metabolism and sclerotial production in Aspergillus flavus. Fungal Genetics and Biology, 45(10), 1422-1429. doi:10.1016/j.fgb.2008.06.009

Kakule, T.B., Lin, Z., Schmidt, E.W. (2014) Combinatorialization of fungal polyketide synthase − peptide synthetase hybrid proteins. Journal of the American Chemical Society. 136: 17882–17890. doi: 10.1021/ja511087p.

Katsuyama, Y., & Ohnishi, Y. (2012) Type III polyketide synthases in microorganisms. Methods in Enzymology, 515, 359-377. doi:10.1016/B978-0-12-394290-6.00017-3

Katzen, F. (2007) Gateway® recombinational cloning: a biological operating system. Expert Opinion on Drug Discovery, 2(4), 571-589. doi:10.1517/17460441.2.4.571

Keese, P.K., Gibbs, A. (1992) Origins of genes: “big bang” or continuous creation? Proceedings of the National Academy of Sciences U. S. A. 89, 9489–9493.

Kehr, J. C., Gatte Picchi, D., Dittmann, E. (2011) Natural product biosyntheses in cyanobacteria: A treasure trove of unique enzymes. Beilstein Journal of Organic Chemistry, 7, 1622-1635. doi:10.3762/bjoc.7.191

Kelkar, Y. D., & Ochman, H. (2012) Causes and consequences of genome expansion in fungi. Genome Biology and Evolution, 4(1), 13-23. doi:10.1093/gbe/evr124

Kennedy, J., Auclair, K., Kendrew, S. G., Park, C., Vederas, J. C., Hutchinson, C. R. (1999) Modulation of polyketide synthase activity by accessory proteins during lovastatin biosynthesis. Science, 284(5418), 1368-1372. doi: 10.1126/science.284.5418.1368

Kitamoto K. (2002) Molecular biology of the Koji molds. Advances in Applied Microbiology, 51, 129–153.

Khaldi, N., Collemare, J., Lebrun, M-H, Wolfe, K. (2008) Evidence for horizontal transfer of a secondary metabolite gene cluster between fungi. Genome Biology 9: R18. doi:10.1186/gb-2008-9-1-r18.

Khaldi, N., Seifuddin, F.T., Turner, G., Haft, D., Nierman, W.C., Wolfe, K.H., Fedorova, N.D. (2010) SMURF: genomic mapping of fungal secondary metabolite clusters. Fungal Genetics and Biology 47:736–741. doi: 10.1016/j.fgb.2010.06.003.

Kis-Papo, T., Grishkan, I., Oren, A., Wasser, S. P., Nevo, E. (2001) Spatiotemporal diversity of filamentous fungi in the hypersaline Dead Sea. Mycological Research, 105, 749-756. doi:10.1017/S0953756201004129

Ko, L. J., & Engel, J. D. (1993) DNA-binding specificities of the GATA transcription factor family. Molecular and Cellular Biology, 13(7), 4011-4022.

Kobayashi, T., Abe, K., Asai, K., Gomi, K., Juvvadi, P.R., Kato, M., Kitamoto, K., Takeuchi, M., Machida, M. (2007) Genomics of Aspergillus oryzae. Bioscience, Biotechnology, and Biochemistry. 71:646–670. doi: 10.1271/bbb.60550.

Komatsu, M., Uchiyama, T., Omura, S., Cane, D. E., Ikeda, H. (2010) Genome-minimized Streptomyces host for the heterologous expression of secondary metabolism. Proceedings of the National Academy of Sciences U S A, 107(6), 2646-2651. doi:10.1073/pnas.0914833107

Kornsakulkarn, J., Thongpanchang, C., Lapanun, S., Srichomthong, K. (2009). Isocoumarin glucosides from the scale insect fungus Torrubiella tenuis BCC 12732. Journal of Natural Products, 72(7), 1341-1343. doi:10.1021/np900082h

Koziol, M. (2014). Investigating Programming and Production of Fungal Polyketide Synthases. PhD thesis. University of Bristol.

Kroken, S., Glass, N. L., Taylor, J. W., Yoder, O. C., Turgeon, B. G. (2003) Phylogenomic analysis of type I polyketide synthase genes in pathogenic and saprobic ascomycetes. Proceedings of the National Academy of Sciences U S A, 100(26), 15670-15675. doi:10.1073/pnas.2532165100

Kunze, B., Reichenbach, H., Müller, R., Höfle, G. (2005) Aurafuron A and B, new bioactive polyketides from Stigmatella aurantiaca and Archangium gephyra (Myxobacteria). Fermentation, isolation, physico-chemical properties, structure and biological activity. Journal of Antibiotics (Tokyo). 58(4):244-51. doi: 10.1038/ja.2005.28

Kutlesa, N. J., & Caveney, S. (2001) Insecticidal activity of glufosinate through glutamine depletion in a caterpillar. Pest Management Science, 57(1), 25-32. doi:10.1002/1526-4998(200101)57:1<25::AID-PS272>3.0.CO;2-I

Lazarus, C. M., Williams, K., Bailey, A. M. (2014). Reconstructing fungal natural product biosynthetic pathways. Natural Product Reports, 31(10), 1339-1347. doi:10.1039/c4np00084f

Letek, M., Valbuena, N., Ramos, A., Ordóñez, E., Gil, J.A., Mateos, L.M. (2006) Characterization and use of catabolite-repressed promoters from gluconate genes in Corynebacterium glutamicum. Journal of Bacteriology, 188(2):409-23. doi: 10.1128/JB.188.2.409-423.2006

Leong, F. J., Li, R., Jain, J. P., Lefevre, G., Magnusson, B., Diagana, T. T., Pertel, P. (2014) A first-in-human randomized, double-blind, placebo-controlled, single- and multiple-ascending oral dose study of novel antimalarial Spiroindolone KAE609 (Cipargamin) to assess its safety, tolerability, and pharmacokinetics in healthy adult volunteers. Antimicrobial Agents and Chemotherapy, 58(10), 6209-6214. doi:10.1128/AAC.03393-14

Li, Q., Herrler, M., Landsberger, N., Kaludov, N., Ogryzko, V. V., Nakatani, Y., Wolffe, A. P. (1998) Xenopus NF-Y pre-sets chromatin to potentiate p300 and acetylation-responsive transcription from the Xenopus hsp70 promoter in vivo. EMBO Journal, 17(21), 6300-6315. doi:10.1093/emboj/17.21.6300

Lin, Z., Li, W.H. (2012) Evolution of 5' untranslated region length and gene expression reprogramming in yeasts. Molecular Biology and Evolution. 29(1):81-9. doi: 10.1093/molbev/msr143.

Ling, S. O. S., Storms, R., Zheng, Y., Rodzi, M. R. M., Mahadi, N. M., Illias, R. M., Abu Bakar, F. D. (2013) Development of a pyrG Mutant of Aspergillus oryzae Strain S1 as a Host for the Production of Heterologous Proteins. Scientific World Journal. doi: 10.1155/2013/634317

Liu, Z., Friesen, T.L. (2012) Polyethylene glycol (PEG)-mediated transformation in filamentous fungal pathogens. Methods in Molecular Biology. 835:365-75. doi: 10.1007/978-1-61779-501-5_21.

Lo, H. C., Entwistle, R., Guo, C. J., Ahuja, M., Szewczyk, E., Hung, J. H., Wang, C. C. (2012) Two separate gene clusters encode the biosynthetic pathway for the meroterpenoids austinol and dehydroaustinol in Aspergillus nidulans. Journal of the American Chemical Society, 134(10), 4709-4720. doi:10.1021/ja209809t

Long, D.M., Smidansky, E.D., Archer, A.J., Strobel, G.A. (1998) In vivo addition of telomeric repeats to foreign DNA generates extrachromosomal DNAs in the taxol-producing fungus Pestalotiopsis microspora. Fungal Genetics Biology 24, 335-344. doi: 10.1006/fgbi.1998.1065

Lubertozzi, D., & Keasling, J. D. (2009) Developing Aspergillus as a host for heterologous expression. Biotechnology Advances, 27(1), 53-75. doi:10.1016/j.biotechadv.2008.09.001

Lynch, M., & Conery, J. S. (2003) The origins of genome complexity. Science, 302(5649), 1401-1404. doi:10.1126/science.1089370

Ma, S. M., Zhan, J., Watanabe, K., Xie, X., Zhang, W., Wang, C. C., Tang, Y. (2007) Enzymatic synthesis of aromatic polyketides using PKS4 from Gibberella fujikuroi. Journal of the American Chemical Society, 129(35), 10642-10643. doi:10.1021/ja074865p

Ma, L-J., van der Does, H.C., Borkovich, K.A., Coleman, J.J., Daboussi, M-J. (2010) Comparative genomics reveals mobile pathogenicity chromosomes in Fusarium. Nature 464: 367–373. doi:10.1038/nature08850

Machado, H., Sonnenschein, E. C., Melchiorsen, J., Gram, L. (2015) Genome mining reveals unlocked bioactive potential of marine Gram-negative bacteria. BMC Genomics, 16, 158. doi:10.1186/s12864-015-1365-z

Mahapatra, D. K., Asati, V., Bharti, S. K. (2015) Chalcones and their therapeutic targets for the management of diabetes: structural and pharmacological perspectives. European Journal of Medicinal Chemistry, 92, 839-865. doi:10.1016/j.ejmech.2015.01.051

Maiya, S., Grundmann, A., Li, X., Li, S. M., Turner, G. (2007) Identification of a hybrid PKS/NRPS required for pseurotin A biosynthesis in the human pathogen Aspergillus fumigatus. Chembiochem, 8(14), 1736-1743. doi:10.1002/cbic.200700202

Marzluf, G. A. (1997) Genetic regulation of nitrogen metabolism in the fungi. Microbiology and Molecular Biology Reviews, 61(1), 17-32.

Mattern, D.J., Valiante, V., Unkles, S.E., Brakhage, A.A. (2015) Synthetic biology of fungal natural products. Frontiers in Microbiology. 6:775. doi: 10.3389/fmicb.2015.00775.

May, G.S., Gambino, J., Weatherbee, J.A., Morris, N.R., Ward, M., Wilkinson, B., Turner, G. (1986) Identification and functional analysis of beta-tubulin genes by site specific integrative transformation in Aspergillus nidulans. Molecular Genetics and Genomics 202: 265. doi:10.1007/BF00331648

McInnes, G., Smith, D., Wat, C-K., Vining, L., Wright, J. (1974) Tenellin and bassianin, metabolites of Beauveria species. Structure elucidation with 15N- and doubly 13C-enriched compounds using 13C nuclear magnetic resonance spectroscopy. Journal of the Chemical Society, Chemical Communications, 281-282. doi: 10.1039/C39740000281

McKeown, D. S. J., McNicholas, C., Simpson, T. J., Willett, N. J. (1996) Biosynthesis of norsolorinic acid and averufin: Substrate specificity of norsolorinic acid synthase. Chemical Communications, (3), 301-302. doi:10.1039/cc9960000301

Medema, M. H., Blin, K., Cimermancic, P., de Jager, V., Zakrzewski, P., Fischbach, M. A., Breitling, R. (2011) antiSMASH: rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genome sequences. Nucleic Acids Research, 39(Web Server issue), W339-346. doi:10.1093/nar/gkr466

Meek, I.B., Peplow, A.W., Ake, C., Phillips, T.D., Beremand, M.N. (2003) Tri1 encodes the cytochrome P450 monooxygenase for C-8 hydroxylation during trichothecene biosynthesis in Fusarium sporotrichioides and resides upstream of another new Tri gene. Applied and Environmental Microbiology 69: 1607–1613. doi: 10.1128/AEM.69.3.1607-1613.2003

Metzker, M.L. (2010) Sequencing technologies - the next generation. Nature Reviews Genetics, 11:31-46. doi:10.1038/nrg2626

Michelson, A.M., Orkin, S.H. (1980) The 3' untranslated regions of the duplicated human alpha-globin genes are unexpectedly divergent. Cell. 22(2 Pt 2):371-7. doi: 10.1016/0092-8674(80)90347-5

Miki, T., Park, J. A., Nagao, K., Murayama, N., Horiuchi, T. (1992) Control of segregation of chromosomal DNA by sex factor F in Escherichia coli. Mutants of DNA gyrase subunit A suppress letD (ccdB) product growth inhibition. Journal of Molecular Biology, 225(1), 39-52. doi:10.1016/0022-2836(92)91024-J

Minto, R. E., Townsend, C. A. (1997) Enzymology and Molecular Biology of Aflatoxin Biosynthesis. Chemical Reviews, 97(7), 2537-2556. doi: 10.1021/cr960032y

Miyanaga, A., Funa, N., Awakawa, T., Horinouchi, S. (2008) Direct transfer of starter substrates from type I fatty acid synthase to type III polyketide synthases in phenolic lipid synthesis. Proceedings of the National Academy of Sciences U S A, 105(3), 871-876. doi:10.1073/pnas.0709819105

Mukherjee, P. K., Buensanteai, N., Moran-Diez, M. E., Druzhinina, I. S., Kenerley, C. M. (2012) Functional analysis of non-ribosomal peptide synthetases (NRPSs) in Trichoderma virens reveals a polyketide synthase (PKS)/NRPS hybrid enzyme involved in the induced systemic resistance response in maize. Microbiology, 158(Pt 1), 155-165. doi:10.1099/mic.0.052159-0

Munawar, A., Marshall, J.W., Cox, R.J., Bailey, A.M., Lazarus, C.M. (2013) Isolation and characterisation of a ferrirhodin synthetase gene from the sugarcane pathogen Fusarium sacchari. Chembiochem. 14(3):388-94. doi: 10.1002/cbic.201200587

Nielsen, M. L., Albertsen, L., Lettier, G., Nielsen, J. B., Mortensen, U. H. (2006) Efficient PCR-based gene targeting with a recyclable marker for Aspergillus nidulans. Fungal Genetics and Biology, 43(1), 54-64. doi:10.1016/j.fgb.2005.09.005

Nielsen, M.L., Isbrandt, T., Petersen, L.M., Mortensen, U.H., Andersen, M.R., Hoof, J.B. (2016) Linker Flexibility Facilitates Module Exchange in Fungal Hybrid PKS-NRPS Engineering. PLoS ONE 11(8): e0161199. doi:10.1371/journal.pone.0161199

Niino, Y.S., Chakraborty, S., Brown, B.J., Massey, V. (1995) A new old yellow enzyme of Saccharomyces cerevisiae. Journal of Biological Chemistry. 270(5):1983-91. doi: 10.1074/jbc.270.5.1983

Nødvig, C.S., Nielsen, J.B., Kogle, M.E., Mortensen, U.H. (2015) A CRISPR-Cas9 System for Genetic Engineering of Filamentous Fungi. PLoS One. 10(7):e0133085. doi: 10.1371/journal.pone.0133085.

Orbach, M., Porro, E., Yanofsky, C. (1986) Cloning and Characterization of the Gene for β-Tubulin from a Benomyl-Resistant Mutant of Neurospora crassa and Its Use as a Dominant Selectable Marker. Molecular and Cellular Biology, 6(7), 2452-2461.

Pahirulzaman, K. A. K., Williams, K., Lazarus, C. M. (2012) A Toolkit for Heterologous Expression of Metabolic Pathways in Aspergillus oryzae. Natural Product Biosynthesis by Microorganisms and Plants, Pt C, 517, 241-260. doi:10.1016/B978-0-12-404634-4.00012-7

Palmer, J.M., Keller, N.P. (2010) Secondary metabolism in fungi: does chromosomal location matter? Current Opinion in Microbiology. 13(4):431-6. doi: 10.1016/j.mib.2010.04.008.

Pan, T., & Coleman, J. E. (1990) GAL4 transcription factor is not a "zinc C6" but forms a Zn(II)2Cys6 binuclear cluster. Proceedings of the National Academy of Sciences U S A, 87(6), 2077-2081

Pao, S. S., Paulsen, I. T., Saier, M. H., Jr. (1998) Major facilitator superfamily. Microbiology and Molecular Biology Reviews, 62(1), 1-34.

Pearce, L. (2007) Roll back malaria. Nursing standard, 21(31), 18-19.

Pedrini, N., Zhang, S., Juarez, M. P., Keyhani, N. O. (2010) Molecular characterization and expression analysis of a suite of cytochrome P450 enzymes implicated in insect hydrocarbon degradation in the entomopathogenic fungus Beauveria bassiana. Microbiology, 156(Pt 8), 2549-2557. doi:10.1099/mic.0.039735-0

Pel, H. J., de Winde, J. H., Archer, D. B., Dyer, P. S., Hofmann, G., Schaap, P. J., Stam, H. (2007) Genome sequencing and analysis of the versatile cell factory Aspergillus niger CBS 513.88. Nature Biotechnology, 25(2), 221-231. doi:10.1038/nbt1282

Phillippy, A., Schatz, M., Pop, M. (2008) Genome assembly forensics: finding the elusive mis-assembly. Genome Biology 9:R55. doi: 10.1186/gb-2008-9-3-r55

Phonghanpot, S., Punya, J., Tachaleat, A., Laoteng, K., Bhavakul, V., Tanticharoen, M., Cheevadhanarak, S. (2012) Biosynthesis of xyrrolin, a new cytotoxic hybrid polyketide/non-ribosomal peptide pyrroline with anticancer potential, in Xylaria sp. BCC 1067. Chembiochem, 13(6), 895-903. doi:10.1002/cbic.201100746

Piel, J. (2010) Biosynthesis of polyketides by trans-AT polyketide synthases. Natural Product Reports, 27(7), 996-1047. doi:10.1039/b816430b

Pittayakhajonwut, P., Usuwan, A., Intaraudom, C., Khoyaiklang, P., & Supothina, S. (2009) Torrubiellutins A-C, from insect pathogenic fungus Torrubiella luteorostrata BCC 12904. Tetrahedron, 65(31), 6069-6073. doi:10.1016/j.tet.2009.05.070

Pompon, D., Louerat, B., Bronine, A., Urban, P. (1996) Yeast expression of animal and plant P450s in optimized redox environments. Methods in Enzymology, 272, 51–64. doi: 10.1016/S0076-6879(96)72008-6

Priebe, S., Linde, J., Albrecht, D., Guthke, R., Brakhage, A. A. (2011) FungiFun: a web-based application for functional categorization of fungal genes and proteins. Fungal Genetics and Biology, 48(4), 353-358. doi:10.1016/j.fgb.2010.11.001

Qiao, K., Chooi, Y. H., Tang, Y. (2011). Identification and engineering of the cytochalasin gene cluster from Aspergillus clavatus NRRL 1. Metabolic Engineering, 13(6), 723-732. doi:10.1016/j.ymben.2011.09.008

Rausch, C., Weber, T., Kohlbacher, O., Wohlleben, W., Huson, D. H. (2005) Specificity prediction of adenylation domains in nonribosomal peptide synthetases (NRPS) using transductive support vector machines (TSVMs). Nucleic Acids Research, 33(18):5799–5808. doi: 10.1093/nar/gki885

Riach, M.B.R. & Kinghorn, J.R. (1996) Genetic transformation and vector developments in filamentous fungi. In Fungal Genetics: Principles and Practice ed. Bos, C.J. pp. 209-233. New York: Marcel Dekker Inc. ISBN 9780824795443

Robbel, L., Knappe, T. A., Linne, U., Xie, X., Marahiel, M. A. (2010) Erythrochelin--a hydroxamate-type siderophore predicted from the genome of Saccharopolyspora erythraea. Federation of European Biochemical Societies Journal, 277(3), 663-676. doi:10.1111/j.1742-4658.2009.07512.x

Rolland, S., Jobic, C., Fevre, M., Bruel, C. (2003) Agrobacterium-mediated transformation of Botrytis cinerea, simple purification of monokaryotic transformants and rapid conidia-based identification of the transfer-DNA host genomic DNA flanking sequences. Current Genetics, 44(3), 164-171. doi:10.1007/s00294-003-0438-8

Romero, M.C., M.I. Urrutia, E.H. Reinoso, Kiernan, A.M. (2009) Wild soil fungi able to degrade the herbicide isoproturon. Rev. Mexicana De Micologia, 29: 1–7

Rossier, C., Pugin, A., Turian, G. (1985) Genetic analysis of transformation in a microconidiating strain of Neurospora crassa. Current Genetics 10, 313-320.

Ruiz-Diez, B. (2002) Strategies for the transformation of filamentous fungi. Journal of Applied Microbiology, 92(2), 189-195. doi: 10.1046/j.1365-2672.2002.01516.x

Saito, K., Thiele, D. J., Davio, M., Lockridge, O., Massey, V. (1991) The cloning and expression of a gene encoding old yellow enzyme from Saccharomyces carlsbergensis. Journal of Biological Chemistry 266, 20720–20724.

Sakai, K., Kinoshita, H., Shimizu, T., Nihira, T. (2008) Construction of a citrinin gene cluster expression system in heterologous Aspergillus oryzae. Journal of Bioscience and Bioengineering. 106(5):466-72. doi: 10.1263/jbb.106.466.

Salzberg, S. L., Yorke, J. A. (2005) Beware of mis-assembled genomes. Bioinformatics 21 (24): 4320-4321. doi:10.1093/bioinformatics/bti769

Sanchez, J. F., Chiang, Y. M., Szewczyk, E., Davidson, A. D., Ahuja, M., Elizabeth Oakley, C., Wang, C. C. (2010) Molecular genetic analysis of the orsellinic acid/F9775 gene cluster of Aspergillus nidulans. Molecular BioSystems, 6(3), 587-593. doi:10.1039/b904541d

Sanchez, J. F., Somoza, A. D., Keller, N. P., Wang, C. C. C. (2012) Advances in Aspergillus secondary metabolite research in the post-genomic era. Natural Product Reports, 29(3):351–371. doi: 10.1039/c2np00084a

Sanchez-Cruz, P., Dejesus-Andino, F., Alegria, A.E. (2012) Roles of hydrophilicities and hydrophobicities of dye and sacrificial electron donor on the photochemical pathway. Journal of Photochemistry and Photobiology A: Chemistry. 236:54-60. doi: 10.1016/j.jphotochem.2012.03.012

Sanna, C.R., Li, W.H., Zhang, L. (2008) Overlapping genes in the human and mouse genomes. BMC Genomics 9, 169. doi: 10.1186/1471-2164-9-169

Sano, M. (2016) Aspergillus oryzae nrtA affects kojic acid production. Bioscience, Biotechnology, and Biochemistry. 80(9):1776-80. doi: 10.1080/09168451.2016.1176517.

Schardl, C. L. (1996) Epichloë species: fungal symbionts of grasses. Annual Review of Phytopathology, 34, 109-130. doi:10.1146/annurev.phyto.34.1.109

Scherlach, K., & Hertweck, C. (2006) Discovery of aspoquinolones A-D, prenylated quinoline-2-one alkaloids from Aspergillus nidulans, motivated by genome mining. Organic & Biomolecular Chemistry, 4(18), 3517-3520. doi:10.1039/b607011f

Scherlach, K., Nutzmann, H. W., Schroeckh, V., Dahse, H. M., Brakhage, A. A., Hertweck, C. (2011) Cytotoxic pheofungins from an engineered fungus impaired in posttranslational protein modification. Angewandte Chemie International Edition, 50(42), 9843-9847. doi:10.1002/anie.201104488

Schmidt, K., Riese, U., Li, Z., Hamburger, M. (2003) Novel tetramic acids and pyridone alkaloids, militarinones B, C, and D, from the insect pathogenic fungus Paecilomyces militaris. Journal of Natural Products, 66(3), 378-383. doi:10.1021/np020430y

Schmidt, K., Günther, W., Stoyanova, S., Schubert, B., Li, Z., Hamburger, M. (2002) Militarinone A, a Neurotrophic Pyridone Alkaloid from Paecilomyces militaris. Organic Letters, 4(2), 197–199. doi: 10.1021/ol016920j

Schmutz, J., et al. (2010) Genome sequence of the palaeopolyploid soybean. Nature 463:178-183. doi:10.1038/nature08670

Schroeckh, V., Scherlach, K., Nutzmann, H. W., Shelest, E., Schmidt-Heck, W., Schuemann, J., Brakhage, A. A. (2009) Intimate bacterial-fungal interaction triggers biosynthesis of archetypal polyketides in Aspergillus nidulans. Proceedings of the National Academy of Sciences U S A, 106(34), 14558-14563. doi:10.1073/pnas.0901870106

Schumann, J., & Hertweck, C. (2007) Molecular basis of cytochalasan biosynthesis in fungi: gene cluster analysis and evidence for the involvement of a PKS-NRPS hybrid synthase by RNA silencing. Journal of the American Chemical Society, 129(31), 9564-9565. doi:10.1021/ja072884t

Sekiguchi, J., & Gaucher, G. M. (1977) Conidiogenesis and secondary metabolism in Penicillium urticae. Applied and Environmental Microbiology, 33(1), 147-158.

Seshime, Y., Juvvadi, P. R., Fujii, I., Kitamoto, K. (2005) Discovery of a novel superfamily of type III polyketide synthases in Aspergillus oryzae. Biochemical and Biophysical Research Communications, 331(1), 253-260. doi:10.1016/j.bbrc.2005.03.160

Shelest, E. (2008) Transcription factors in fungi. FEMS Microbiology Letters, 286(2), 145-151. doi:10.1111/j.1574-6968.2008.01293.x

Shim, J., Coop, A., MacKerell, A.D. (2013) Molecular details of the activation of the μ opioid receptor. The Journal of Physical Chemistry B, 117(26):7907-17. doi: 10.1021/jp404238n.

Shintani, S., O'HUigin, C., Toyosawa, S., Michalova, V., Klein, J. (1999) Origin of gene overlap: the case of TCP1 and ACAT2. Genetics 152, 743–754.

Shrestha, B., Lee, W. H., Han, S. K., Sung, J. M. (2006) Observations on Some of the Mycelial Growth and Pigmentation Characteristics of Cordyceps militaris Isolates. Mycobiology, 34(2), 83-91. doi:10.4489/MYCO.2006.34.2.083

Sieber, C.M.K., Lee, W., Wong, P., Mewes H-W (2014) The Fusarium graminearum Genome Reveals More Secondary Metabolite Gene Clusters and Hints of Horizontal Gene Transfer. PLoS ONE 9(10): e110311. doi:10.1371/journal.pone.0110311

Silakowski, B., Kunze, B., Müller, R. (2001) Multiple hybrid polyketide synthase/non-ribosomal peptide synthetase gene clusters in the myxobacterium Stigmatella aurantiaca. Gene. 275(2):233-40. doi: 10.1016/S0378-1119(01)00680-1

Sims, J. W., Fillmore, J. P., Warner, D. D., Schmidt, E. W. (2005) Equisetin biosynthesis in Fusarium heterosporum. Chemical Communications (Camb)(2), 186-188. doi:10.1039/b413523g

Skinnider, M.A., Dejong, C.A., Rees, P.N., Johnston, C.W., Li, H., Webster, A.L., Wyatt, M.A., Magarvey, N.A. (2015) Genomes to natural products PRediction Informatics for Secondary Metabolomes (PRISM). Nucleic Acids Research. 43(20):9645-62. doi: 10.1093/nar/gkv1012.

Smith, D., Burnham, M., Edwards, J., Earl, A., Turner, G. (1990). Cloning and heterologous expression of the penicillin biosynthetic gene cluster from Penicillium chrysogenum. Nature Biotechnology 8, 39–41. doi: 10.1038/nbt0190-39

Sondergaard, T. E., Hansen, F. T., Purup, S., Nielsen, A. K., Bonefeld-Jorgensen, E. C., Giese, H., Sorensen, J. L. (2011) Fusarin C acts like an estrogenic agonist and stimulates breast cancer cells in vitro. Toxicology Letters, 205(2), 116-121. doi:10.1016/j.toxlet.2011.05.1029

Song, Z., Cox, R. J., Lazarus, C. M., Simpson, T. J. (2004) Fusarin C biosynthesis in Fusarium moniliforme and Fusarium venenatum. Chembiochem, 5(9), 1196-1203. doi:10.1002/cbic.200400138

Spangenberg, T., Burrows, J. N., Kowalczyk, P., McDonald, S., Wells, T. N. C., Willis, P. (2013) The Open Access Malaria Box: A Drug Discovery Catalyst for Neglected Diseases. Plos One, 8(6). doi:10.1371/journal.pone.0062906

Staunton, J., & Weissman, K. J. (2001) Polyketide biosynthesis: a millennium review. Natural Product Reports, 18(4), 380-416. doi: 10.1039/A909079G

Stone, L.K., Baym, M., Lieberman, T.D., Chait, R., Clardy, J., Kishony, R. (2016) Compounds that select against the tetracycline-resistance efflux pump. Nature Chemical Biology. 12(11):902-904. doi: 10.1038/nchembio.2176.

Su, C.H. & Wang, H.H. (1986) Phytocordyceps, a new genus of the Clavicipitaceae. Mycotaxon 26: 337–344

Suh, S. O., McHugh, J. V., Pollock, D. D., Blackwell, M. (2005). The beetle gut: a hyperdiverse source of novel yeasts. Mycological Research, 109(Pt 3), 261-265.

Svarstad, H., Bugge, H. C., Dhillion, S. S. (2000) From Norway to Novartis: cyclosporin from Tolypocladium inflatum in an open access bioprospecting regime. Biodiversity and Conservation, 9(11), 1521-1541. doi:10.1023/A:1008990919682

Takahashi, T., Hatamoto, O., Koyama, Y., Abe, K. (2004). Efficient gene disruption in the koji-mold Aspergillus sojae using a novel variation of the positive-negative method. Molecular Genetics and Genomics, 272(3), 344-352. doi:10.1007/s00438-004-1062-0

Tanzer, M. M., Arst, H. N., Skalchunes, A. R., Coffin, M., Darveaux, B. A., Heiniger, R. W., Shuster, J. R. (2003) Global nutritional profiling for mutant and chemical mode-of-action analysis in filamentous fungi. Functional & Integrative Genomics, 3(4), 160-170. doi:10.1007/s10142-003-0089-3

Taylor, D. L., Herriott, I. C., Stone, K. E., McFarland, J. W., Booth, M. G., Leigh, M. B. (2010) Structure and resilience of fungal communities in Alaskan boreal forest soils. Canadian Journal of Forest Research-Revue Canadienne De Recherche Forestiere, 40(7), 1288-1301. doi: 10.1139/X10-081

Tilburn, J., Scazzocchio, C., Taylor, G.G., Zabicky-Zissman, J.H., Lockington, R.A., Davies, R.W. (1983) Transformation by integration in Aspergillus nidulans. Gene, 26, 205–221. doi: 10.1016/0378-1119(83)90191-9

Tobert, J. A. (2003) Lovastatin and beyond: the history of the HMG-CoA reductase inhibitors. Nature Reviews Drug Discovery, 2(7), 517-526. doi:10.1038/nrd1112

Todd, R. B., Andrianopoulos, A. (1997) Evolution of a fungal regulatory gene family: the Zn(II)2Cys6 binuclear cluster DNA binding motif. Fungal Genetics and Biology, 21(3), 388-405. doi:10.1006/fgbi.1997.0993

Tominaga, M., Lee, Y-H, Hayashi, R., Suzuki, Y., Yamada, O. (2006) Molecular analysis of an inactive aflatoxin biosynthesis gene cluster in Aspergillus oryzae RIB strains. Applied and Environmental Microbiology 72: 484–490. doi:10.1128/AEM.72.1.484-490.2006.

Trenholme, K., Marek, L., Duffy, S., Pradel, G., Fisher, G., Hansen, F. K., Andrews, K. T. (2014) Lysine acetylation in sexual stage malaria parasites is a target for antimalarial small molecules. Antimicrobial Agents and Chemotherapy, 58(7), 3666-3678. doi:10.1128/AAC.02721-13

Tsai, I. J., Otto, T. D., Berriman, M. (2010) Improving draft assemblies by iterative mapping and assembly of short reads to eliminate gaps. Genome Biology 11:R41. doi: 10.1186/gb-2010-11-4-r41

Tubajika, K.M., Damann, K. (2002) Glufosinate-ammonium reduces growth and aflatoxin B1 production by Aspergillus flavus. Journal of Food Protection, 65: 1483–1487

Tudzynski, B. (2005) Gibberellin biosynthesis in fungi: genes, enzymes, evolution, and impact on biotechnology. Applied Microbiology and Biotechnology 66: 597–611. doi:10.1007/s00253-004-1805-1

Tudzynski, B. (2014) Nitrogen regulation of fungal secondary metabolism in fungi. Frontiers in Microbiology, 5, 656. doi:10.3389/fmicb.2014.00656

Udwary, D. W., Zeigler, L., Asolkar, R. N., Singan, V., Lapidus, A., Fenical, W., Moore, B. S. (2007) Genome sequencing reveals complex secondary metabolome in the marine actinomycete Salinispora tropica. Proceedings of the National Academy of Sciences U S A, 104(25), 10376-10381. doi:10.1073/pnas.0700962104

Venkatesan, P., & Maruthavanan, T. (2012) Piperidine-mediated synthesis of thiazolyl chalcones and their derivatives as potent antimicrobial agents. Natural Product Research, 26(3), 223-234. doi:10.1080/14786419.2010.536161

Vesth, T., Brandl, J., Andersen, M. (2016) FunGeneClusterS: Predicting fungal gene clusters from genome and transcriptome data. Synthetic and Systems Biotechnology, 1(2), 122–129. doi: 10.1016/j.synbio.2016.01.002

Vijn, I., Govers, F. (2003) Agrobacterium tumefaciens mediated transformation of the oomycete plant pathogen Phytophthora infestans. Molecular Plant Pathology, 4(6), 459-467. doi:10.1046/j.1364-3703.2003.00191.x

von Dohren, H. (2009) A survey of nonribosomal peptide synthetase (NRPS) genes in Aspergillus nidulans. Fungal Genetics and Biology, 46 Suppl 1, S45-52. doi:10.1016/j.fgb.2008.08.008

Walsh, C.T., Fischbach, M.A. (2010) Natural products version 2.0: connecting genes to molecules. Journal of the American Chemical Society. 132:2469–2493. doi: 10.1021/ja909118a

Wanchoo, A., Lewis, M. W., Keyhani, N. O. (2009) Lectin mapping reveals stage-specific display of surface carbohydrates in in vitro and haemolymph-derived cells of the entomopathogenic fungus Beauveria bassiana. Microbiology, 155(Pt 9), 3121-3133. doi:10.1099/mic.0.029157-0

Wang, W. J., Vogel, H., Yao, Y. J., Ping, L. (2012) The nonribosomal peptide and polyketide synthetic gene clusters in two strains of entomopathogenic fungi in Cordyceps. FEMS Microbiology Letters, 336(2), 89-97. doi:10.1111/j.1574-6968.2012.02658.x


Wannop, C. C. (1961) Histopathology of Turkey X Disease in Great Britain. Avian Diseases, 5(4), 371-&. doi:10.2307/1587768

Wasil, Z., Pahirulzaman, K. A. K., Butts, C., Simpson, T. J., Lazarus, C. M., Cox, R. J. (2013) One pathway, many compounds: heterologous expression of a fungal biosynthetic pathway reveals its intrinsic potential for diversity. Chemical Science, 4(10), 3845-3856. doi:10.1039/c3sc51785c

Watson, R.J., Burchat, S., Bosley, J. (2008) A model for integration of DNA into the genome during transformation of Fusarium graminearum. Fungal Genetics and Biology. 45(10):1348-63. doi: 10.1016/j.fgb.2008.07.015.

Wells, T. N., Hooft van Huijsduijnen, R., Van Voorhis, W. C. (2015) Malaria medicines: a glass half full? Nature Reviews Drug Discovery, 14(6), 424-442. doi:10.1038/nrd4573

Wendland, J. (2003) PCR-based methods facilitate targeted gene manipulations and cloning procedures. Current Genetics 44(3), 115-123. doi:10.1007/s00294-003-0436-x

Wheeler, M. H., & Bell, A. A. (1988) Melanins and their importance in pathogenic fungi. Current Topics in Medical Mycology, 2, 338-387.

Williams, R.E., Bruce, N.C. (2002) 'New uses for an Old Enzyme'--the Old Yellow Enzyme family of flavoenzymes. Microbiology. 148(Pt 6):1607-14. doi: 10.1099/00221287-148-6-1607

Williams, R. B., Henrikson, J. C., Hoover, A. R., Lee, A. E., Cichewicz, R. H. (2008) Epigenetic remodeling of the fungal secondary metabolome. Organic & Biomolecular Chemistry, 6(11), 1895-1897. doi:10.1039/b804701d

Winter, J. M., Behnken, S., Hertweck, C. (2011) Genomics-inspired discovery of natural products. Current Opinion in Chemical Biology, 15(1), 22-31. doi:10.1016/j.cbpa.2010.10.020

Wu, D., et al. (2009) A phylogeny-driven genomic encyclopaedia of Bacteria and Archaea. Nature; 462:1056-1060. doi:10.1038/nature08656

Xiao, G., Ying, S. H., Zheng, P., Wang, Z. L., Zhang, S., Xie, X. Q., Feng, M. G. (2012) Genomic perspectives on the evolution of fungal entomopathogenicity in Beauveria bassiana. Scientific Reports, 2, 483. doi:10.1038/srep00483

Xu, W., Cai, X., Jung, M., Tang, Y. (2010) Analysis of Intact and Dissected Fungal Polyketide Synthase-Nonribosomal Peptide Synthetase in Vitro and in Saccharomyces cerevisiae. Journal of the American Chemical Society, 132 (39), 13604–13607. doi: 10.1021/ja107084d

Xu, X., Chen, J., Xu, H., Li, D. (2014) Role of a major facilitator superfamily transporter in adaptation capacity of Penicillium funiculosum under extreme acidic stress. Fungal Genetics and Biology, 69, 75-83. doi:10.1016/j.fgb.2014.06.002

Yadav, G., Gokhale, R. S., Mohanty, D. (2003) SEARCHPKS: a program for detection and analysis of polyketide synthase domains. Nucleic Acids Research, 31(13):3654–3658. doi: 10.1093/nar/gkg607


Yamakawa, M., Hishinuma, F., Gunge, N. (1985) Intact cell transformation of Saccharomyces cerevisiae by polyethylene glycol. Agricultural and Biological Chemistry, 49, 869–871. doi: 10.1080/00021369.1985.10866817

Yandell, M., Ence, D. (2012) A beginner's guide to eukaryotic genome annotation. Nature Reviews Genetics, 13(5), 329-42. doi: 10.1038/nrg3174.

Zhang, H., Wang, Y., Pfeifer, B. A. (2008) Bacterial hosts for natural product production. Molecular Pharmaceutics, 5(2), 212-225. doi:10.1021/mp7001329

Zhang, S., Widemann, E., Bernard, G., Lesot, A., Pinot, F., Pedrini, N., Keyhani, N. O. (2012) CYP52X1, representing new cytochrome P450 subfamily, displays fatty acid hydroxylase activity and contributes to virulence and growth on insect cuticular substrates in entomopathogenic fungus Beauveria bassiana. Journal of Biological Chemistry, 287(16), 13477-13486. doi:10.1074/jbc.M111.338947

Zhao, H., Lovett, B., Fang, W. (2016) Genetically Engineering Entomopathogenic Fungi. Advances in Genetics, 94, 137-163. doi:10.1016/bs.adgen.2015.11.001

Zheng, P., Xia, Y., Xiao, G., Xiong, C., Hu, X., Zhang, S., Zheng, H., Huang, Y., Zhou, Y., Wang, S., Zhao, G.P., Liu, X., St Leger, R.J., Wang, C. (2011) Genome sequence of the insect pathogenic fungus Cordyceps militaris, a valued traditional Chinese medicine. Genome Biology 12(11):R116. doi: 10.1186/gb-2011-12-11-r116.

Ziemert, N., Lechner, A., Wietz, M., Millán-Aguiñaga, N., Chavarria, K., Jensen, P. (2014) Diversity and evolution of secondary metabolism in the marine actinomycete genus Salinispora. Proceedings of the National Academy of Sciences, 111(12), E1130-E1139. doi: 10.1073/pnas.1324161111

Ziemert, N., Podell, S., Penn, K., Badger, J.H., Allen, E., Jensen, P.R. (2012) The Natural Product Domain Seeker NaPDoS: A Phylogeny Based Bioinformatic Tool to Classify Secondary Metabolite Gene Diversity. PLoS ONE 7(3): e34064. doi:10.1371/journal.pone.0034064


CHAPTER 8

8. APPENDICES

9.1 DOMAIN IDENTIFICATION FOR TORS SYNTHASE GENE

1 MSHPKKQDAG LHCIPEPIAI VGSACRFPGG CNAPSKLWDL LQQPRDILKD IHPDRLNLRR 61 YHHPDGETHG ATDVANRAYT LEEDIGLFDA SFFGISPLEA AGMDPQQRML LEVVYESTET 121 AGITLDQLRG SLTSVHVGVM TNDWAHVQRR DPETMPQYTG TGISSSIISN RISYIFDLKG 181 ASETIDTACS SSLVALHNAA RALQSGDSEK AIVAGVNLII DPDPFIFESK LHMLSPDSRS 241 RMWDKSANGY ARGEGAASVI LKTLSQALRD GDQIEGVIRS TLVNSDGLSS GLTMPSAAAQ 301 TALIRQTYRK AGLDPVRDRP QFFECHGTGT KAGDPVEARA ISDAFIPKRR VKGSATVDTH 361 PLYVGSVKTL IGHLEGCAGL AGVIKVLLSL KHGVIPPNLW FDKLNPDIAR YYGPLQIPTT 421 ATPWPKLAAG APFRASVNSF GFGGTNAHAI IERYDASQDY AAGRHAGVST RETAKEEQAG 481 SNDYVPVPIL LTAKTGGALW RTVGAYTQYL RAHPDLDLAD LGQFLHSRRS THRVRAQFSG 541 ASRDELLENM ETFVQTHATD AKSAASENRI GHSPLLIDPK ETPGILGVFT GQGAQWPAMG 601 RRMMEKSPVF RNIIADCESV LQSLPAKDAP QWSLSQELIK DASTSRLSEA ELSQPLCTAV 661 QLALVNVLWK SGVHFDAVVG HSSGEIAASY ASGIINLQGA MQIAYYRGFH AKLSQGGSGQ 721 SGGMMAAGLS MAEAVQFCHR PEFEGRIQVA ASNAPKSVTL SGDREAIQSA KAVLDSDGVF 781 ARELKVDTAY HSHHMLPCAK PYLESLLACE IQVRPPTPGK CIWSSSVRGD AELLRSDESL


841 EGLRGPYWVA NMVQTVLFSR AVESTISHGG PFDLAIEIGP HPALKGPTEQ TLKSAYGSTP 901 FYTGVLKRGS DDAVAFSTAI GNIWAHLGPA FVDLSGYQSA FSDAPRSPQA TAPSFISGLP 961 SYSWDHEKPY WRESRISRRY RLGQDGSHEL LGRRTPDDNE REVRWRNLLK VSELPWTQGH 1021 KVLGEVLLPG AAYISMALEA GKRLALDRGR KVQLLEVSDV DILRPVVVSD GKEGTETLFT 1081 VRMLQDNLST NQQFGGFIRA SFSYYLFNNA TSTSVAHTCE GQITIHLGAR LEYDSEADRI 1141 QQLPPRGPLA SNLQEMDCEN IYSLFGGIGL EYSGAFRRIT ASSRQLGYAT ASASWSSRDL 1201 NDSYMIHPAI LDVAFQTIFI ARAHPDSGQI NAALLPSRIE RVRVVPSLVM ESRLQDNLDI 1261 NADVDAWVLK QDTTSLTGDL NVYDADSGTP LLQVEGFEVR TVGEPDASHD RPIFSETVWG 1321 PDISMNGLSD PVRDKATDTT VEAVSEACER VSLFYVRRLM NEISARDKKQ ASWYHARMLH 1381 AFEHHLEQVR DGRHLHVRRE WLSDDRSTMD AIDTAFPDMI ELQMLHAVGK DIASIVRGEK 1441 HMLEVMRVDN MLDRFYAENK GMQQINIALA KALKEITFKF PRCKILEIGA GTGATSLAVL 1501 SALDGAFDTY TYTDLSVGFF ETAMERFSEF GHKMIFKALD VEKDLAAQGY DLHSYDIIIA 1561 ANVLHATRNL EVTLDNVRSL LKPGGYLLLN EKTGPESLRA AFNFGPLEGW WLAEEEDRQL 1621 CPLMSPLGWE AQLQKAQFSG VDYLVHDIPE EPKQHTSLIV SQAVDDMLYS RLCPLADMAS 1681 LAPTKEPIVI IGGQTTNTCK IVKEIQKLLP RQWKQMVHLV DTIENLEAAN LAVRSDVICL 1741 QELDKALFAG PIAVKRLSAI QTLLMNTKNL LWVTNAQNSS SAVPRSSMFR GITRVLAGEV 1801 PQIHTQVFGI ETMGLASATA RNILEAFLRL RSGYSQTEAD TEDQDTGRQI LWSHEPEVDL 1861 LSNGVMMVPR VKLNKPLNEA FMASTRAVSR AVDASRVAVQ VIAGPAKMML QPCQSAVGSA 1921 APKGLADSTM RIQVRYTLHV PQGRDGTRLY LVCGWAHTAG PSGAVSVPVM ALSHTNASII 1981 DVSSTAVVTV DDGSLSSDVL VRTFKHLSLQ ALESTTETQQ RTLVYGADEA LAELISAKHA 2041 LRGSKVYFAS SRSSTPPNWL KVHSLSSRFT LGQMIPYGLG TFIDCLNCAE SDSALRTLAS 2101 CLPTDCIAYQ LDASLLSDMS RTSATALAEA YSCAKMQDKP NSVQVADVKT IQVAELVGQA 2161 SHSLRQSIYL TDWQKNDSVV VTVPPLDTQG MFKRDRTYLM VGAAGGIGTS ICRWMVRNGA 2221 RHVVVTSRNP QGDPNMISEA ERCGATVRVV PMDVCNRDSV QSVIDMIRAT MPPIAGVCNA


2281 AMVLCDKLFL DMDVDQLNNT LGPKVDGTEI LDSVFAREPL DFFVLLGSSA SITNNIGQSN 2341 YHCANLYMDS LVAQRRSRGL AASIIHIGYI CDTGYITRLT DDAKKVQSNR DIMRAMTLSE 2401 TDVHHAFAEA VRGGQPGGAN GSHNIIMGIE PPTKPLDPNK RKGLWLSDAR LGHMVPTSAS 2461 SNQNAASEQA AVSSNSIGQQ ISEANTNEEA TTAILHAFGA KLESILLLPP GSIGQDRVGR 2521 PVTDLGIDSL VAVEIRTWFL KRLRVDVPVM KILGGSTIGQ LSALAAKLAR MDTSKESQSQ 2581 GIAAGKNHDS AKAPRNSSSE AADKAVTKPP DQVTEPGTLG KTDEALLPGA PAKDDFPTNP 2641 TISSSASELD GSLQASVQQS CETDSSSTPS KSSDYKSDSE TESKLSKGGS SNACSELQTT 2701 KAARPNILRE AQMSPAQSRI WFLSKHIAEP YAYNMVWHYR VHGRLNMMRL RHAMQTVTNH 2761 HECLRMCFYA DAHNGQPMQG LLASSAFHMS HMSDCGEEDT QRELRKLGTR AWGIENGQTL 2821 ELVVLSRPGA QEEHSLLFGY HHIVMDAISW HIFLADLDRA YRMLPLDKTA AGSHLDLAEM 2881 QLQQERAGAW DDSLGYWQTE FATIPDMLPT LPMASPSSQR DTLGTHYVLR ELPNEQGDAI 2941 KNACKQLRVS SFNLHVAVLQ VLLARLANIE DVCIGIVDAN RGETGASQMV GCFVNMLPIR 3001 SQVLGSATLA DVSKAASSKA LAAFAHGCVP LDKILDRFKA PRLASGTPLF QVALNYRPAA 3061 SISWDQPLGS ECQMELAPYD IKDAENPFEM SVLVSEMPGG SVALELYCQK AKYTLEGSHA 3121 LMDAYLNVLG SFLSDANQHV SDCAVYEQAK IERAIDIGKG SQTDFGWPAT LSERVMIMCQ 3181 QHCAKPAIKD GQSEMSYAQL ASRVSDTASA IISAGCGVGS RIAVLCDPSI DTIVAMLAIL 3241 HMGGVYVPLD TSLPEARHVA LVSNSTPSLL LFHTATKERV HSLRTSLPAL GHQVPRELLI 3301 GSVSASALDV AASLQANADA PAILLYTSGS TGTPKGVLLT QANFCNHIAL KTDILDLGRG 3361 ECVLQQSSLG FDMSLVQIFC ALANGGCVVV VPADARRDPV ELTSLMAHHR VSLTIATPSE 3421 YLAWLQYGSS SLAQNTAWRH LCMGGEPIPQ LLRDELRHLG RRERILTNCY GPTEATAAAS 3481 FQPISLESQG GDLQVEDELV RYAVGKALPN YSIRIMDAAG GWLPANHTGE IAVGGAGVAL 3541 GYLGLPKETQ AKFIRPYGES GRFYRTGDKG RLLPDGTLLC LGRIEGDSQV KLRGLRIELQ 3601 EVEAALLKAS EGLIQAAVVS RREDVLVAHC TRSHDKTAAA AHEEQHVASI LSRLAKLLPQ 3661 YSVPAAIIFL PSLPTNANGK LDRKAIAALP LAQQDQDSVR DKSSSGGEKM TIHQGELRLL


3721 WERVLPQNAT GTRIVPGSDF FLSGGNSLLL MKLQAAIRDA MGVRVSTKSL YQSSTLSGMS 3781 RCIVEQREQQ QDECQAEIDW AAEAAMPPSL LQQMDRLQSS STATSWRPPK TAGLEILVTG 3841 ATGFLGGQLL QRLLQAPEVS KVHCVAVPAD ERHLLEPLQQ MNEKLMPYTG NLAAPDLGLG 3901 AAARAHLQES VDVIVHAGAM GHCLNTYATL SGPNLASTRS LCSLALARAP PIPFAFVSSS 3961 RVVLLTGSTS PAPGSVAAFP PPTDGAEGYT ASKWASEVFL ENAAARAAAR ERPWSVSIHR 4021 PCVLVSEQAP NSDALNSILR FSVAMRCVPS LPEERAHGYL DFGQVDKVVE EMATDVFKLV 4081 GESQQGKPAV AYRHHSGGAK VPIHEFRAHM ESVYGGRFDS LDLAEWIARA VDAGMDPLIS 4141 AYLETFLEGD APMVFPYMGE QPV

Domains. Light green: KS; Yellow: AT; Light blue: DH; Greenish blue: CMet; Green: KR; Fuchsia: ACP; Violet: C; Dark grey: A; Grey: PCP; Red: TD


9.2 PROTEIN SEQUENCE ALIGNMENT OF TORS WITH DMBS AND TENS

CLUSTAL W (1.83) multiple sequence alignment

DmbS MSPMKQNESESHCVSEPIAIVGSAYRFPGGCNTPSKLWDLLRQPRDILKE TorS MSHPKKQDAGLHCIPEPIAIVGSACRFPGGCNAPSKLWDLLQQPRDILKD tenellin MSPMKQNESESHSVSEPIAIVGSAYRFPGGCNTPSKLWDLLQQPRDILKE ** *:::: *.:.********* *******:********:*******:

DmbS IDPERLNLRRYYHPDGETHGSTDVSNRAYTLEEDISRFDASFFGISPLEA TorS IHPDRLNLRRYHHPDGETHGATDVANRAYTLEEDIGLFDASFFGISPLEA tenellin LDPERLNLRRYYHPDGETHGSTDVSNKAYTLEEDISRFDASFFGISPLEA :.*:*******:********:***:*:********. *************

DmbS AGMDPQQRTLLEVVYESTETAGIPLDKLRGSLTSVHVGVMTTDWAQMQRR TorS AGMDPQQRMLLEVVYESTETAGITLDQLRGSLTSVHVGVMTNDWAHVQRR tenellin ASMDPQQRTLLEVVYESTETAGIPLDKLRGSLTSVHVGVMTTDWAQVQRR *.****** **************.**:**************.***::***

DmbS DPETMPQYTATGIASSIISNRISYIFDLKGASETIDTACSSSLVALHNAA TorS DPETMPQYTGTGISSSIISNRISYIFDLKGASETIDTACSSSLVALHNAA tenellin DPETMPQYTATGIASSIISNRISYIFDLKGASETIDTACSSSLVALHNAA *********.***:************************************

DmbS RALQSGDSEKAIVAGVNLILDPDPFIFESKLHMLSPDSRSRMWDAAANGY TorS RALQSGDSEKAIVAGVNLIIDPDPFIFESKLHMLSPDSRSRMWDKSANGY tenellin RALQSGDCEKAIVAGVNLILDPDPFIYESKLHMLSPDARSRMWDAAANGY *******.***********:******:**********:****** :****

DmbS ARGEGAAAVVLKTLGHALRDGDQIEGVIRSTYVNSDGLSSGLTMPSSAAQ TorS ARGEGAASVILKTLSQALRDGDQIEGVIRSTLVNSDGLSSGLTMPSAAAQ tenellin ARGEGAAAVVLKTLGHALRDGDRIEGVIRSTFVNSDGLSSGLTMPSSAAQ *******:*:****.:******:******** **************:***

DmbS TALIRQTYRKAGLDPVKDRPQFFECHGTGTKAGDPVEARAISDAFLPNHK TorS TALIRQTYRKAGLDPVRDRPQFFECHGTGTKAGDPVEARAISDAFIPKRR tenellin TALIRQTYRKAGLDPVRDRPQFFECHGTGTRAGDPVEARAISDAFLPSHR ****************:*************:**************:*.::


DmbS TKG--AA-TVD-APLYVGSIKTVVGHLEGCAGLAGVIKVLLSLKHGIIPP TorS VKG--SA-TVDTHPLYVGSVKTLIGHLEGCAGLAGVIKVLLSLKHGVIPP tenellin TNGGGAATTVD-DPLYVGSIKTVVGHLEGCAGLAGLVKVLLSLKHGIIPP .:* :* *** ******:**::***********::*********:***

DmbS NLWFNKLNPEIARYYGPLQIPTTAIPWPELAPGTPFRASVNSFGFGGTNA TorS NLWFDKLNPDIARYYGPLQIPTTATPWPKLAAGAPFRASVNSFGFGGTNA tenellin NLWFDKLNPEIARYYGPLQIPTKAIPWPELAPGTPLRASVNSFGFGGTNA ****:****:************.* ***:**.*:*:**************

DmbS HAIIERYDANQSYCSQWRRDMTEQKTIVRPQDEGNTNIPVPLLLTAKTGG TorS HAIIERYDASQDYAAGRHAGVSTRETAKEEQAGSNDYVPVPILLTAKTGG tenellin HAIIERYDASQSYCSQWRRDMTEEKTIARTQNNDDVEIPVPLVLTAKTGG *********.*.*.: : .:: .:* . * .: :***::*******

DmbS ALWRTVDAYAQHLRQNPELGLANLSKFMHSRRATHRVRASFSGASREELL TorS ALWRTVGAYTQYLRAHPDLDLADLGQFLHSRRSTHRVRAQFSGASRDELL tenellin ALWRTVDAYAQHLRQHPKLRVANLSQFMHSRRSTHRVRASFSGASREELV ******.**:*:** :*.* :*:*.:*:****:******.******:**:

DmbS ENMAKFVQAHAADAKSPASQNRIGYSPLLIDPKEVPGILGVFTGQGAQWP TorS ENMETFVQTHATDAKSAASENRIGHSPLLIDPKETPGILGVFTGQGAQWP tenellin ENMANFVQAHAADAKSPASQNRIGYSPLLIDPKEVSGILGIFTGQGAQWP *** .***:**:****.**:****:*********..****:*********

DmbS AMGRDMMHQSPLFRKTIADCESVLQALPSKDVPSWSLSEELKKDASTSRL TorS AMGRRMMEKSPVFRNIIADCESVLQSLPAKDAPQWSLSQELIKDASTSRL tenellin AMGRDMMHQSPLFRKTIADCESVLQALPLKDAPAWSLSEELKKDASTSRL **** **.:**:**: *********:** **.* ****:** ********

DmbS GEAEISQPLCTAVQLALVNVLTASGVHFDAVVGHSSGEIAATYASGIISL TorS SEAELSQPLCTAVQLALVNVLWKSGVHFDAVVGHSSGEIAASYASGIINL tenellin GEAEISQPLCTAVQLALVNVLTASGVYFDAVVGHSSGEIAATYASGIINL .***:**************** ***:**************:******.*

DmbS KGAMQIAYYRGLYAKLARGKSDESGGMMAAGLSMNEAVKLCRLPEFEGRI TorS QGAMQIAYYRGFHAKLSQGGSGQSGGMMAAGLSMAEAVQFCHRPEFEGRI tenellin KAAMQIAYYRGLYAKLARGQSDEAGGMMAAGLSMDDAVKLCRLPEFEGRI :.*********::***::* *.::********** :**::*: *******

DmbS QVAASNAPQSVTLSGDKEAIKAAKAMLDSDGVFARELKVDTAYHSHHMLP


TorS QVAASNAPKSVTLSGDREAIQSAKAVLDSDGVFARELKVDTAYHSHHMLP tenellin QVAASNAPQSVTLSGDKEAIKAAKAKLDADGVFARELKVDTAYHSHHMLP ********:*******:***::*** **:*********************

DmbS CAEPYLESLLACDIQVSAPT--PG-KCMWSSSVRGDAELLRGDRNLDSLK TorS CAKPYLESLLACEIQVRPPT--PG-KCIWSSSVRGDAELLRSDESLEGLR tenellin CAEPYLKALLACDIQVSAPTKTPGRKCMWSSSVRGDAELLRRDRNLDSLK **:***::****:*** .** ** **:************* *..*:.*:

DmbS GPYWVANMVQTVLFSRAVQSTIWHGGPFDLAIEVGPHPALKGPTEQTLKA TorS GPYWVANMVQTVLFSRAVESTISHGGPFDLAIEIGPHPALKGPTEQTLKS tenellin GPYWVANMVQTVQFSRAIQSTIWHGGPFDLAVEVGPHPALKGPTEQTLKA ************ ****::*** ********:*:***************:

DmbS VYGSTPLYTGVLRRGANDAVAFSTAIGNIWSHLGPAFVDMTGCQSIFSGA TorS AYGSTPFYTGVLKRGSDDAVAFSTAIGNIWAHLGPAFVDLSGYQSAFSDA tenellin VYGSAPLYTGVLSRGANDAVAFSTAIGNIWSHLGPAFVDITGYQSIFSGT .***:*:***** **::*************:********::* ** **.:

DmbS SEGHGGSAAPFISDLPLYPWDHDEEYWRESRISRRYRTGKDESHELLGRR TorS PRSPQATAPSFISGLPSYSWDHEKPYWRESRISRRYRLGQDGSHELLGRR tenellin CEGHGGSEAPFISDLPLYPWDHDEEYWRESRISRRYRTGKDESHELLGRR .. .: ..***.** *.***:: ************ *:* ********

DmbS TPDDNEREIRWRNLLKVSELPWTQGHRVLGEVLLPGAAYISMAIEAGRRL TorS TPDDNEREVRWRNLLKVSELPWTQGHKVLGEVLLPGAAYISMALEAGKRL tenellin MPDDNEREIRWRNLLKVSELPWTQGHRVLGEVLLPGAAYISMAIEAGRRL *******:*****************:****************:***:**

DmbS ALDQGRQVCLLEVFDVDILRPVVVADNKEGTETLFTVRLLDEHTVSAKKL TorS ALDRGRKVQLLEVSDVDILRPVVVSDGKEGTETLFTVRMLQDNLSTNQQF tenellin ALDQGREVSLLEVSDVDILRPVVVADNKEGTETLFTVRLLDEYASTGKKS ***:**:* **** **********:*.***********:*:: : ::

DmbS DEIITASFSFYIHNSSASTSVVHTCEGRMAVHLGAKLGSGVGANSMPQLP TorS GGFIRASFSYYLFNNATSTSVAHTCEGQITIHLGARLEYDSEADRIQQLP tenellin DELMTASFSFYIYNSPASTSIVHTCEGRIAVHLGAKLGSEAAANSTPQLP . :: ****:*:.*..:***:.*****::::****:* *: ***

DmbS QRELSVSNLQPIDCEKLYSLFETIGLEYSGAFRAINSSSRRLGHATASAS


TorS PRGPLASNLQEMDCENIYSLFGGIGLEYSGAFRRITASSRQLGYATASAS tenellin PREPSVSNLQQLDCEKLYSVFETIGLEYSGAFRRIVSSSRCLGHATATAS * .**** :***::**:* ********** * :*** **:***:**

DmbS WASLDLNNCYLIHPAILDVAFQTMFVARAHPDSGQLNSALLPSRIERVRV TorS WSSRDLNDSYMIHPAILDVAFQTIFIARAHPDSGQINAALLPSRIERVRV Tenellin WPTADLNDCYLVHPAILDVAFQTIFVARAHPDSGQLSSALLPSRIERVRV *.: ***:.*::***********:*:*********:.:************

DmbS IPSSAMESKLQSNENINAEIDSWVLNQTVSSLTGDLNVYDTDTGIPLLQV TorS VPSLVMESRLQDNLDINADVDAWVLKQDTTSLTGDLNVYDADSGTPLLQV tenellin VPSLAMGSKLQNNENFNAAIDSWALNQTASSLTGNINVYDADSERALIQV :** .* *:**.* ::** :*:*.*:* .:****::****:*: .*:**

DmbS EGFEVRAVGEPDASKDRLLFSETVWGRDISIMGLSDPIRNETTDAAVQSL TorS EGFEVRTVGEPDASHDRPIFSETVWGPDISMNGLSDPVRDKATDTTVEAV tenellin EGFEVRAVGEPDASKDRLLFYETVWGRDISIMGLSDPIRDETSDAMVQNL ******:*******:** :* ***** ***: *****:*::::*: *: :

DmbS AEAIERVSLFYVRQLMSELSTKDRREANWYHSRMLTAFEHHLARIHEDTH TorS SEACERVSLFYVRRLMNEISARDKKQASWYHARMLHAFEHHLEQVRDGRH tenellin SEAIERVSLFYVRQLMGELSTADRRQANWYHTRMLAAFDHHLAKVHEETH :** *********:**.*:*: *:::*.***:*** **:*** :::: *

DmbS LHVRQEWLSDDWSVIQIIDEAYPDTVELQMLHAIGQNMANVIRGEKHMLE TorS LHVRREWLSDDRSTMDAIDTAFPDMIELQMLHAVGKDIASIVRGEKHMLE tenellin LHLRPEWLADDWTVIQTIDEAYPDAVELQMLHAVGQNVADVIRGKKHLLE **:* ***:** :.:: ** *:** :*******:*:::*.::**:**:**

DmbS VMRVNNLLDRLYTEDKGMQQGNHFLANALKEITFKFPRCKILEIGAGTGA TorS VMRVDNMLDRFYAENKGMQQINIALAKALKEITFKFPRCKILEIGAGTGA tenellin VLRVDNLLDRLYTEDKGMHMANLFLANALKEITFKFPRCKILEIGAGTGA *:**:*:***:*:*:***: * **:***********************

DmbS TTWAVLSAIDETFDTYTYTDLSVGFFETAVERFSAFRHKMIFKALDIEKS TorS TSLAVLSALDGAFDTYTYTDLSVGFFETAMERFSEFGHKMIFKALDVEKD tenellin TTWAALSAIGEAFDTYTYTDLSVGFFENAVERFSAFRHRMVFRALDIEKD *: *.***:. :***************.*:**** * *:*:*:***:**.

DmbS PAAQSFDLGSYDIIIATNVLHATRNLDITLGNVRSLLKPGGYLLLNEKTG TorS LAAQGYDLHSYDIIIAANVLHATRNLEVTLDNVRSLLKPGGYLLLNEKTG Tenellin PASQSFDLNSYDIIIATNVLHATRNLGVTLGNVRSLLKPGGYLLLNEKTG *:*.:** *******:********* :**.*******************


DmbS PESLRATFNFGGLEGWWLAEEEERQLSPLLSPDGWDSQLQKTQFSGVDHV TorS PESLRAAFNFGPLEGWWLAEEEDRQLCPLMSPLGWEAQLQKAQFSGVDYL tenellin PDSLRATFNFGGLEGWWLAEEKERQLSPLMSPDGWDAQLQKAQFSGVDHI *:****:**** *********::***.**:** **::****:******::

DmbS VHDVQEE--GKQQNSMIMSQAVDDAFYARLSPLSEMASLLPTQEPLLLIG TorS VHDIPEE--PKQHTSLIVSQAVDDMLYSRLCPLADMASLAPTKEPIVIIG tenellin VHDVQEDQQDKQQNSMIMSQAVDDTFYARLSPLSEMANLLPMNEPLLIIG ***: *: **:.*:*:****** :*:**.**::**.* * :**:::**

DmbS GQTNTTLRIIKEIQKQLPRKWRHKIRLIASVDQLEDEDLPAHSDVICVQE TorS GQTTNTCKIVKEIQKLLPRQWKQMVHLVDTIENLEAANLAVRSDVICLQE tenellin GQTTATLKMIKEIQKLLPRQWRHKVRLIASVNHLEAEGVPAHSNVICLQE ***. * :::***** ***:*:: ::*: ::::** .:..:*:***:**

DmbS LDRGLFTTAMTSKRLNALKSLFMNTKNLLWVTNAQNSSSMTPRASMFRGI TorS LDKALFAGPIAVKRLSAIQTLLMNTKNLLWVTNAQNSSSAVPRSSMFRGI tenellin LDRGLFTTAMTSKCLDALKTLFINTRNLLWVTNAQHSSSMTPRASMFRGI **:.**: .:: * *.*:::*::**:*********:*** .**:******

DmbS TRVMDGEVPHIRTQILGIEPIGAPSTIARNLLEAFLRLRFDDTYQAATID TorS TRVLAGEVPQIHTQVFGIETMGLASATARNILEAFLRLRSGYS--QTEAD tenellin TRVLDGEIPHIRTQVLGIEPRATSSATARNLLEAFLRLRSDDGRHAANVD ***: **:*:*:**::***. . .*: ***:******** . : *

DmbS GDGADGGSQQVLWSHEPEVDLLSSGTMMIPRVKLRKSLNDTYLASTRAIS TorS TEDQDT-GRQILWSHEPEVDLLSNGVMMVPRVKLNKPLNEAFMASTRAVS tenellin EDGADGSSQQVLWLHEPEAELLSNGTMMIPRVKARKSLNDTYLASTRAIS :. * .:*:** ****.:***.*.**:**** .*.**::::*****:*

DmbS TTVDARCVPVQAVAGPAKIMLRPVEDIAVDHEISSQTSDPKVHIQVEVTL TorS RAVDASRVAVQVIAGPAKMMLQPCQS-AVGSAAPKGLADSTMRIQVRYTL tenellin TTVDARCVSVQAVAGPAKMLLRPVEDFAVEHAISSQSTDSKVHIQVESTL :*** *.**.:*****::*:* :. ** .. :*..::***. **

DmbS HIPEALDGTCLYLVCGWTRPAEASDTSSVPVMALSTSNASIIAVEPKAVA TorS HVPQGRDGTRLYLVCGWAHTAGPSGAVSVPVMALSHTNASIIDVSSTAVV tenellin HIPEALDGTCLYLVCGWTRTAE----TSVPVIALSTSNASIVAVESKAVA *:*:. *** *******::.* ****:*** :****: *...**.

DmbS MIDEVDLKPEALLRVFQHMAMQAVDSAVRRHGQRQRTALIYGADEELAEL


TorS TVDDGSLSSDVLVRTFKHLSLQALESTT----ETQQRTLVYGADEALAEL tenellin MIDEADVKPETLFRVFQHMAMQALDSAVGRHGQGQSTALIYGADEELAKL :*: .:..:.*.*.*:*:::**::*:. : * :*:***** **:*

DmbS TSKRCAVRESKIYFASSHSAAPGDWLKVHRLSSKFAMSQMVPSGVQVFID TorS ISAKHALRGSKVYFASSRSSTPPNWLKVHSLSSRFTLGQMIPYGLGTFID tenellin TSERFAVRESKVYFASTRTSAPGDWLKVQPLLSKFALSQMMPADVEVFID * : *:* **:****:::::* :****: * *:*::.**:* .: .***

DmbS CLGGTESFDACRTLQSCLPTTCTVHRLDACLLSEMSQCSPDFLLDAYSYA TorS CLNCAESDSALRTLASCLPTDCIAYQLDASLLSDMSRTSATALAEAYSCA tenellin CLGDTESFDACRTLESCLSTTSTVHRLDACLLSRMSQCSPDTLADAYSHA **. :** .* *** ***.* . .::***.*** **: *. * :*** *

DmbS QTQSNAGFSRSDNIKTFTAAELAGKLSHSLINSMYITDWQKQDAILVTVP TorS KMQDKPNSVQVADVKTIQVAELVGQASHSLRQSIYLTDWQKNDSVVVTVP tenellin KTQSNAEFSWNGNVQTFTAAELAGKLSHSLMHSVYMTDWQEKDSILVTVP : *.:. :::*: .***.*: **** :*:*:****::*:::****

DmbS PLQTRGLFKSDRTYLMVGAAGGLGTSLCRWMVRNGARHVVVTSRNPKADP TorS PLDTQGMFKRDRTYLMVGAAGGIGTSICRWMVRNGARHVVVTSRNPQGDP tenellin PLQTRGLFKSDRTYLMVGAAGGLGTSICRWMVRNGARHVVVTSRNPKADP **:*:*:** ************:***:*******************:.**

DmbS EMLNEAERYGAIVRVVPMDACNKDSVQTVVDTIRATMPPIAGVCNAAMVL TorS NMISEAERCGATVRVVPMDVCNRDSVQSVIDMIRATMPPIAGVCNAAMVL tenellin EMLNEARRYGAAVKVVPMDACSKDCVQTVVDMIRDTMPPIAGVCNAAMVL :*:.**.* ** *:*****.*.:*.**:*:* ** ***************

DmbS CDKLFLDMDVDQMNNTLGPKVDGTEYLDSIFAHEPLDFFILLGSAAAILN TorS CDKLFLDMDVDQLNNTLGPKVDGTEILDSVFAREPLDFFVLLGSSASITN tenellin RDKLFLDMNVDHMNNVLGPKMQGTEHLDSIFAQEPLDFFVLLSSSAAILN *******:**::**.****::*** ***:**:******:**.*:*:* *

DmbS NMGQSNYHCANLYMDSLVKHRRSRGLAASIIHIGHVCDTGYVARMVDDN- TorS NIGQSNYHCANLYMDSLVAQRRSRGLAASIIHIGYICDTGYITRLTDDAK tenellin NTGQSNYHCANLYMDSLVTNRRSRGLAASIIHVGHVCDTGYVARLVDDS- * **************** :************:*::*****::*:.**

DmbS RIQSNIATMRAMRLSETDVHHAFAQAVRGGQLDSRSGSYNIIMGIEPPTK TorS KVQSNRDIMRAMTLSETDVHHAFAEAVRGGQPGGANGSHNIIMGIEPPTK tenellin KVQMSLGTTRVMSVSETDVHHAFAEAVRGGQPDSRSGSHNIIMGIEPPTK ::* . *.* :**********:****** .. .**:***********


DmbS PLDLTRRQAVWLSDPRLGHMLPYSTLENQMIASGQAAAS-ADSLAQQVSE TorS PLDPNKRKGLWLSDARLGHMVPTSASSNQNAASEQAAVS-SNSIGQQISE tenellin PLDVAKRKPVWISDPRLGHMLPFSTLENQMVASEQAAASAADSLAQQVSE *** :*: :*:**.*****:* *: .** ** ***.* ::*:.**:**

DmbS ATTDEEATAAVLKGFATKLEGILLLPPGSIGEDSAGRPVTDLGIDSLVAV TorS ANTNEEATTAILHAFGAKLESILLLPPGSIGQDRVGRPVTDLGIDSLVAV tenellin ATTDEEAAAAALKGFATKLEGILLLPLGSIGEDSAGRPVTDLGIDSLVAV *.*:***::* *:.*.:***.***** ****:* .***************

DmbS EIRTWFLKQLRVDVPVMKILGGSTVGQLSALAAKLARQDAKKQAQVEE-A TorS EIRTWFLKRLRVDVPVMKILGGSTIGQLSALAAKLARMDTSKESQSQGIA tenellin EIRTWFLKQLRVDVPVMKILGGSTVGQLSALAAKLARQDAKKRAQLEE-A ********:***************:************ *:.*.:* : *

DmbS SGNQHVALPPPKDK-VGPNTNGKAQDSPETAQ-VGTLIERMEPLVLAASD TorS AGKNHDSAKAPRNS-SSEAADKAVTKPPDQVTEPGTLGKTDEALLPGAPA tenellin SGNQPVALPPLNDKETGPSKKGKAQEFPETVQVVGTAAERTEPLVLEASD :*:: : . .:. . . . . *: . ** : *.*: *.

DmbS RGDSSTANLTTSSSVSELDDSLQSSALQSSENDGGSTPSKSSNCNSDSGS TorS KDD-FPTNPTISSSASELDGSLQASVQQSCETDSSSTPSKSSDYKSDSET tenellin RGGSSTANFTTSSSVSELDDSLQESTLQSSENNGESTPSKSSNCNSDSGS :.. .:* * ***.****.*** *. **.*.:. *******: :*** :

DmbS DTQAPKEIPSNGYTHPAATAPVRPNVLREASMSPAQSRIWFLSKHIAEPD TorS ESKLSKGGSSNACSELQTTKAARPNILREAQMSPAQSRIWFLSKHIAEPY tenellin DNQAPREISSNGFFT-QPAATARPNVLREAPMSPAQSRIWFLSKHIAEPD :.: .: .**. .: ..***:**** ******************

DmbS AYNMVFHYRVRGPLSMVRLRHAMQTVANHHECLRMCFYASADNGQPMQGL TorS AYNMVWHYRVHGRLNMMRLRHAMQTVTNHHECLRMCFYADAHNGQPMQGL tenellin AYNMVFHYRVRGPLSMVRLRHALQTVTNHHECLCMCFYASADNGQPMQGL *****:****:* *.*:*****:***:****** *****.*.********

DmbS LASSAFHMAHVPDCEEQDLQRELCKLKTRVWSIENGQTLELLVLG-RPGT TorS LASSAFHMSHMSDCGEEDTQRELRKLGTRAWGIENGQTLELVVLS-RPGA tenellin LASSASQMTIVPGGEEQDLQRELRKLKTRVWSVESGQTLELVVVGPRPGT ***** :*: :.. *:* **** ** **.*.:*.******:*:. ***:

DmbS ----TDEFSLLFGYHHIVMDAISFHIFLADLDKAYRMLPLDKAAAGSHLD TorS ----QEEHSLLFGYHHIVMDAISWHIFLADLDRAYRMLPLDKTAAGSHLD


tenellin AAAEEEEFSLLFGYHHIVMDAISFSIFLADLDKAYRMLPLDKASAGSHLD :*.***************: *******:*********::******

DmbS LTQLQLQQERAGAWNESLDFWQAEFETIPEMLPPLTVALPTLQRGAVGTH TorS LAEMQLQQERAGAWDDSLGYWQTEFATIPDMLPTLPMASPSSQRDTLGTH tenellin LAAHQRQQEHAGAWKESLEFWQAEFETIPEMLPPLSVALPTLQRGAVGTH *: * ***:****.:** :**:** ***:***.*.:* *: **.::***

DmbS RALRELPHEQGD--AIKKTCKNLRVSPFNLHIAILQVLLARLASIEDVCI TorS YVLRELPNEQGD--AIKNACKQLRVSSFNLHVAVLQVLLARLANIEDVCI tenellin RVLRELAHEQGGDAAIKKTCKNLRVSPFNLHIAVLQVVIARLGSIEDVCV .****.:***. ***::**:****.****:*:***::***..*****:

DmbS GIVDANRSDSRASRMVGCFVNMLPIRSRILRTATLVDVARAASSKALAAF TorS GIVDANRGETGASQMVGCFVNMLPIRSQVLGSATLADVSKAASSKALAAF tenellin GIVDANRSDSRASRMVGCFVNMLPVRSRILPSATLADVARAASSKALAAF *******.:: **:**********:**::* :***.**::**********

DmbS AHGQVPLDNILDKVKAPRPAGSTPLFQVALNYRPAAALSSKQPLGSECQM TorS AHGCVPLDKILDRFKAPRLASGTPLFQVALNYRPAASISWDQPLGSECQM tenellin AHGQVPLDSILDKVKAPRPAGSTPLFQVALNYRPAAAIASKQSLGGECEM *** ****.***:.**** *..**************::: .*.**.**:*

DmbS ELSPYDIKDAENPFEISVLVTEMPGGGLAVEMLCQKSQYTMQATEALLDA TorS ELAPYDIKDAENPFEMSVLVSEMPGGSVALELYCQKAKYTLEGSHALMDA tenellin ELLADDFKDAENPFEISVLVSEMPGGRIAVEVVCQKSRYTMQATEALLDA ** . *:********:****:***** :*:*: ***::**::.:.**:**

DmbS YLNVLVAYLSDTAQRVSDCEVHLQTEVKHALDLGKGAQKSFGWPCTLSER TorS YLNVLGSFLSDANQHVSDCAVYEQAKIERAIDIGKGSQTDFGWPATLSER tenellin YLNVLAGFLSDTAQSVGDCVVHDQSKVEHALDLGKGAQKSFGWPRTLSER ***** .:***: * *.** *: *:::::*:*:***:*..**** *****

DmbS VMSICEHHSTKSAIKDGRTELSYAQLASRVNHTASALVDAGCSVGSRIAV TorS VMIMCQQHCAKPAIKDGQSEMSYAQLASRVSDTASAIISAGCGVGSRIAV tenellin VMSICQQHSTKSAIKDGRNELSYAQLASKVNHTASALVNAGCSVGSRIAV ** :*::*.:*.*****:.*:*******:*..****::.***.*******

DmbS LCNPSIDAIVTMLAILHIGGVYVPLDTSLPEARHLSLASSCTPSLIISHA TorS LCDPSIDTIVAMLAILHMGGVYVPLDTSLPEARHVALVSNSTPSLLLFHT tenellin LCNPSIDAIVAMLAILHIGGVYVPLDTSLPEARHQSLASNCTPSLIISHA **:****:**:******:**************** :*.*..****:: *:

DmbS ATRERAHKLAAAISAPGYEPARELTVDDLSPDETGYMAPLSAEPNAPAIL


TorS ATKERVHSLRTSLPALGHQVPRELLIGSVSASALDVAASLQANADAPAIL tenellin ATRERAHKLSAVISAPGHEPARELTLDDLSPDETGYMAPLNAEPNAPAIL **:**.*.* : :.* *:: .*** :..:*.. . *.*.*:.:*****

DmbS LYTSGSTGTPKGVLLTQANFGNHIALKTDILGLKRGENVLHQSSLGFDMS TorS LYTSGSTGTPKGVLLTQANFCNHIALKTDILDLGRGECVLQQSSLGFDMS tenellin LYTSGSTGTPKGVLLTQANFGNHIALKTDILGLQRGECVLQQSSLGFDMS ******************** **********.* *** **:*********

DmbS LVQVFCALANGGCVVIVPQDARRDPVELTSLMAQHKVSLTIATPSEYLAW TorS LVQIFCALANGGCVVVVPADARRDPVELTSLMAHHRVSLTIATPSEYLAW tenellin LVQVFCALANGGCLVIVPQDVRRDPMELTSLMAQHKVSLTIATPSEYLAW ***:*********:*:** *.****:*******:*:**************

DmbS LQYGSDSLAQATSWRHLCMGGEPIPQLLKDELR-RLERKDLV-VTNCYGP TorS LQYGSSSLAQNTAWRHLCMGGEPIPQLLRDELR-HLGRRERI-LTNCYGP tenellin LQYGSDALAQATSWKHLCMGGEPIPQLLKDELRRRLERKDLVVVSNCYGP *****.:*** *:*:*************:**** :* *:: : ::*****

DmbS TETTAAISFQSIALDSDN-HE-LLVDNELAKYAVGKALPNYSVRIRDPAG TorS TEATAAASFQPISLESQG-GD-LQVEDELVRYAVGKALPNYSIRIMDAAG tenellin TETTAAISFQSIALDSQDSHEQLPGESELANYAVGKALPNYSIRIRDPAG **:*** ***.*:*:*:. : * :.**..***********:** *.**

DmbS -AWLPVNHTGEIVIGGAGVAKGYLNMPEETRARFLQTPGE-DGM-FYRTG TorS -GWLPANHTGEIAVGGAGVALGYLGLPKETQAKFIRPYGE-SGR-FYRTG tenellin GAWLPVNHTGEIVIGGAGVALGYLDMPEETRARFLQTPGEEDGMLLYRTG .***.******.:****** ***.:*:**:*:*::. ** .* :****

DmbS DKGRLLSDGTLLCFGRINGDNQVKLRGLRIELEEVEAALLQASQGLIHTA TorS DKGRLLPDGTLLCLGRIEGDSQVKLRGLRIELQEVEAALLKASEGLIQAA tenellin DKGRLLSDGTLLCFGRITGDNQVKLRGLRIELGEVEAALLQASQGLIHTA ******.******:*** **.*********** *******:**:***::*

DmbS VVSRRGDVLVAHCARSHESSDTTA--AGEQQ--ATAILRRVSELLPQYSV TorS VVSRREDVLVAHCTRSHDKTAAAA--HEEQH--VASILSRLAKLLPQYSV tenellin VVSRRGDVLVAHCARSHESSRETTGGGGEQQDAATAILRRVSELLPQYSV ***** *******:***:.: :: **: .::** *:::*******

DmbS PAAIALLPSLPTNANAKLDRKAIAALPLSPQDEAAASPS------N TorS PAAIIFLPSLPTNANGKLDRKAIAALPLAQQDQDSVRD------K


tenellin PAAIALLPSLPTNANGKLDRTAIAALPLSPQDEAAAATSPSNDNNNNNTP **** :*********.****.*******: **: :.

DmbS SLSGSEKMTVRQGELRLLWERVLPRDATTT----SVRITPESDFFLRGGN TorS SSSGGEKMTIHQGELRLLWERVLPQNATGT------RIVPGSDFFLSGGN tenellin SGGGGEKMTVRQGELRLLWERVLPRDATTTTTTNSVRITPESDFFLRGGN * .*.****::*************::** * **.* ***** ***

DmbS SLLLMKLQAAIRDSMGVRVSTKALYQASTLSGMARCVSEQREQQSDEAEA TorS SLLLMKLQAAIRDAMGVRVSTKSLYQSSTLSGMSRCIVEQREQQQDECQA tenellin SLLLMKLQAAIRESMGVRVSTKALYQASTLSGMARCVAEQRSD-DDEAEE ************::********:***:******:**: ***.: .**.:

DmbS DIDWAAEVAVPPSMLAQMEKLHSSSAGS------SARPRKTIG TorS EIDWAAEAAMPPSLLQQMDRLQSSSTAT------SWRPPKTAG tenellin DIDWAAEVAVPPSMLAQIEKLQHSSASSSSSSSSSSAGSSSTQRPRKTSG :******.*:***:* *:::*: **:.: : ** ** *

DmbS LEILLTGATGFLGGQLLERLVQSPRVSKVHCVAVPVDEQSLLEPL-QQQA TorS LEILVTGATGFLGGQLLQRLLQAPEVSKVHCVAVPADERHLLEPL-QQM- tenellin LQILLTGATGFLGGQLLERLVQSPRVSTVHCVAVPVDEQSLLEPFLQQQA *:**:************:**:*:*.**.*******.**: ****: **

DmbS D---SKVHCYTGNLAAPNLGLTAADRTHISQTIDVIVHAGSMGHCLNTYA TorS N---EKLMPYTGNLAAPDLGLGAAARAHLQESVDVIVHAGAMGHCLNTYA tenellin DGTRRKVRCYIGNLAAPALGLTAADQTALSQTADVIVHAGSMGHCLNTYA : *: * ****** *** ** :: :.:: *******:*********

DmbS TLSAPNLASTKHLCSLALARSPPIPFAFASSNRVALLTGSTAPPPGSVAA TorS TLSGPNLASTRSLCSLALARAPPIPFAFVSSSRVVLLTGSTSPAPGSVAA tenellin TLSAPNFASTRHLCALALSRSPPIPLAFASSNRVALLTGSTAPPPGSAAA ***.**:***: **:***:*:****:**.**.**.******:*.***.**

DmbS FPPPPDGTQGFTASKWASEAFLEKLAASIT------PKTTTVPTPWRVSI TorS FPPPTDGAEGYTASKWASEVFLENAAARAA------ARERPWSVSI tenellin FPPPP-GAQGFTASKWASEAFLEKLTASMSDVSKTKTKTTTTVMPWRVSI ****. *::*:********.***: :* : : ** ***

DmbS HRPCALVSEHAPNSDALNAILRYSTSMRCVPSLPEHRAQGYLDFGQVDKV TorS HRPCVLVSEQAPNSDALNSILRFSVAMRCVPSLPEERAHGYLDFGQVDKV tenellin HRPCALISDRAPNSDALNAILRYSTSMRCVPSLPEHRAEGYLDFGQVDKV ****.*:*::********:***:*.:*********.**.***********


DmbS VEEMVGDVLGLADERQEEGPAVVYKHHSGGVKVPIHEFREHMESVYGGRF TorS VEEMATDVFKLVGESQQGKPAVAYRHHSGGAKVPIHEFRAHMESVYGGRF tenellin VEEMVGDILGLADERPQEGPAVVYRHHSGGVKVPIHEFREHMESVYGGRF ****. *:: *..* : ***.*:*****.******** **********

DmbS ESVDLGQWIARAVDAGMDPLISAYLETFLEGDAPMVFPYMGEQAV TorS DSLDLAEWIARAVDAGMDPLISAYLETFLEGDAPMVFPYMGEQPV tenellin ESVQLGQWIIRAVDAGMDPLISAYLETFLEGDASMVFPYMGEQAV :*::*.:** ***********************.*********.*


9.3 PLASMID MAPS
