BIOSYNTHESIS AND MECHANISM OF NATURAL PRODUCTS IN NEMATODES

By

LIKUI FENG

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2018

© 2018 Likui Feng

To my family

ACKNOWLEDGMENTS

I would like to dedicate my success in obtaining a PhD at the University of Florida to the many people who have supported me since I came to Gainesville; without their considerate help, I do not believe that I would have obtained it.

First of all, I would like to give special thanks to my research advisor, Dr. Rebecca Butcher. It was she who gave me the chance to join this great university and program when I sought to transfer from Indiana University Bloomington. Most importantly, her knowledge, passion and confidence have helped me to become a successful scientist over the past five years. As a great mentor, she also inspired me with new ideas from multiple perspectives.

I would also like to thank my kind committee members, Dr. Steven Bruner, Dr. Nicole Horenstein, Dr. Keith Choe and Dr. Robert McKenna. These professors have provided me with valuable suggestions on many aspects of my work. They are also easy to get along with and have been good friends. In addition, I would like to thank Dr. Ben Smith and Ms. Lori Clark for their endless help in processing documents and my program transfer. Furthermore, I would like to thank Dr. Kari Basso and others in the mass spec facility for their help in running and analyzing samples.

I am very glad and lucky to have worked with so many fantastic people in the Butcher lab. It is all of you who make our big lab family full of happiness. In particular, I would like to thank Dr. Qingyao Shou, who inspired me a lot in the nemamide project and has continued to help me figure out experimental problems even after he left our lab. Dr. Xinxing Zhang, full of academic thoughts, is a very good friend both in research and in life, and provided me with a lot of fresh ideas and new techniques. I would like to thank Dr. Yue Zhou and Dr. Yuting Wang for their help in the lab and in life. I would like to thank Dr. Rouf Dar for synthesizing intermediates and Ying Liu for her collaborative work on metabolomics. In the last year of my PhD, I have been lucky to work with Matthew Gordon, who is very helpful, funny and talented as a promising scientist. I thank all other current Butcher lab members, including Dr. Guohui Li, Prashant Singh, Subhradeep Bhar, Nasser Faghih and David Perez, and previous group members Dr. Satya Chinta, Dr. Rachel Jones, Dr. Jungsoo Han, Asyegul Ozdogan, Mayra Tuiche, Jasmine Gonzalez, Priyanka Raichoudhury, and Lauren Suarez.

I am also grateful to all Bruner lab members, including Dr. Kunhua Li, Dr. Wei-Hung Chen, Dr. Matthew Burg, Aleksandra Zagulyaeva, Brian Mctavish, Dr. Naga Sandhya Guntaka, Qiang Li, Prabhanshu Tripathi and Gengnan Li. I would also like to thank my dear friends in the Department of Microbiology and Cell Sciences, because they made my PhD life colorful and amazing.

Finally, I would like to give my deepest gratitude to my family. My parents gave me life and continue to support me in many aspects of it. Their endless love has carried me through many difficulties and helped me improve myself to be a better person and a successful scientist. I am very thankful to my younger brother for his efforts to take care of our parents. Most importantly, I would like to thank my dear wife, Yuanyuan Leng, who cheers me up and has motivated me all along to succeed. She is always patient no matter how late and how long I work, and she tries to bring me out of the lab to help me relax. I will do my best to make her proud and love her forever.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
ABSTRACT

CHAPTER

1 INTRODUCTION
1.1 Nematodes and Life Cycle of C. elegans
1.2 Polyketide and Nonribosomal Peptides
1.2.1 Initiation Module of PKSs and NRPSs
1.2.2 Mechanism of Extension in PKSs and NRPSs
1.2.3 Accessory Enzymes Acting in trans or in cis
1.3 Signaling Molecules in C. elegans
1.3.1 Dauer Pheromone Ascarosides
1.3.2 Discovery of the Nemamides
1.3.3 Other Signaling Molecules in Worms
1.4 Significance and Summary

2 MECHANISM OF THE NEMAMIDES IN PROMOTING LARVAL SURVIVAL AND PATHOGEN RESISTANCE
2.1 Insulin Signaling Pathway in Worm Development and Arrest
2.2 Experimental Methods
2.2.1 Methods for Roles of Nemamides in Larval Survival
2.2.2 Screen for Pathogen Resistance
2.2.3 Methods for PKS-1b Verification and Gene Cloning
2.3 Results
2.3.1 Site of nrps-1 and pks-1 Expression in C. elegans
2.3.2 Nemamides Promoting Arrested Larval Survival
2.3.3 Screening by Pathogen Avoidance Assay and Killing Assay
2.3.4 Verification of the Presence of pks-1b
2.3.5 Gene Cloning of pks-1b
2.4 Discussion and Future Work

3 NONCANONICAL FEATURES IN BIOSYNTHESIS OF THE NEMAMIDES
3.1 Proposed Nemamide Biosynthetic Pathway
3.2 Domain Properties of PKS-1 and NRPS-1
3.2.1 Ketosynthase (KS) Domains
3.2.2 Carrier Protein (CP) Domain
3.2.3 Acyltransferase (AT) Domains
3.2.4 Dehydratase (DH) Domains
3.2.5 Ketoreductase (KR) Domains
3.2.6 Adenylation (A) Domains
3.2.7 Condensation (C) Domains
3.2.8 Thioesterase (TE) Domains
3.3 Experimental Methods
3.3.1 Strains and Transgenic Lines
3.3.2 Single Worm PCR and CRISPR-Cas9
3.3.3 Transgenic Line Construction
3.3.4 Plasmid Construction, Protein Overexpression and Purification
3.3.5 Small Scale Worm Extraction and Intermediate Extractions
3.3.6 LC-MS-based ACS and ACOX Activity Assay
3.3.7 Profiling Ascaroside Production
3.4 Functional Analysis of Enzymatic Domains
3.4.1 Ketosynthase (KS) Domain
3.4.2 Carrier Protein (CP) Domains
3.4.3 Condensation (C) Domains
3.4.4 Adenylation (A) Domains
3.4.5 Thioesterase (TE) Domains
3.5 Genome Mining of Accessory Enzymes
3.6 Functional Analysis of Accessory Enzymes
3.6.1 Transcriptional and Translational Reporters of Biosynthetic Genes
3.6.2 Generation of Deletion Mutants by CRISPR-Cas9
3.6.3 Nemamide Production in Mutants and Rescued Mutants
3.6.4 ACOX-3 Functions in both Ascaroside and Nemamide Biosynthesis
3.6.5 Gene Expressed in Both CAN and Intestine
3.6.6 Phosphopantetheinyl Transferase (PPTase)
3.6.7 ACOX-3 Catalyzes Fatty-acyl CoA into ∆Fatty-acyl CoA
3.6.8 ACS-24 Activates Fatty Acids into Fatty-acyl CoA
3.6.9 ACS-9 has Fatty-acyl AMP Ligase Activity
3.7 Complete Dissection of Nemamide Biosynthesis
3.7.1 Functional Contribution of NRPS-1 Domains
3.7.2 Unusual Trafficking Between PKS-1 and NRPS-1 by ACS-9
3.7.3 The Freestanding Methyltransferase Incorporates the Methyl Group
3.7.4 T12G3.4 Functions as a Lactonase to Hydrolyze the Product of PKS-1
3.7.5 PKS-1 C-terminal Module Might Play Aminotransferase Function
3.7.6 ACS-24 and ACOX-3 Assisted Initiation
3.8 Discussion and Future Work

4 INVESTIGATION OF NEMAMIDE-LIKE MOLECULES IN OTHER NEMATODES
4.1 Chromosomal Locus of Nemamide Biosynthetic Genes
4.2 Experimental Methods
4.2.1 Small-Scale Extraction of the Nemamides in Caenorhabditis Species
4.2.2 Large-Scale Extraction of Nemamide-like Molecules in P. pacificus
4.3 Nemamide Production in Caenorhabditis Species
4.4 Isolation and Identification of the Nemamides in P. pacificus
4.5 Proposed Structure and Biosynthesis of the Nemamides in P. pacificus

5 BIOSYNTHESIS OF RHAMNOSE AND ASCARYLOSE IN C. ELEGANS
5.1 Carbohydrate Metabolism in C. elegans
5.2 Experimental Methods
5.2.1 Isotope Labeling Experiments
5.2.2 Sugar Extraction and Identification
5.2.3 Construction, Overexpression, Purification and Activities of RML Enzymes
5.2.4 Purification and Identification of Reaction Products
5.2.5 Other Experimental Procedures
5.2.5.1 Phylogenetic tree analysis
5.2.5.2 Small- and large-scale RNAi
5.2.5.3 Sequence alignment, structural modeling and superposition
5.2.5.4 Plasmid construction, transgenesis and microscopy
5.3 Results
5.3.1 Ascarylose in Worms Originated from Glucose
5.3.2 Discovery of UDP-ascarylose in Worms
5.3.3 Homologous Genes Identified by BLAST
5.3.4 In vitro Activities of Rhamnose Biosynthetic Enzymes
5.3.5 Identification of dTDP-Rha from Enzymatic Synthesis and Inside Worms
5.3.6 Biosynthesis of dTDP-Rha Involved in Worm Molting Cycles
5.4 Discussion and Future Work

LIST OF REFERENCES

BIOGRAPHICAL SKETCH


LIST OF TABLES

2-1 Primers used for plasmid construction or genotyping
2-2 Bacterial and fungal strains used for screening
3-1 Comparison of the A domain selectivity codes
3-2 A domain selectivity codes in PKS-1_A1 domains of various nematodes
3-3 A domain selectivity codes in NRPS-1_A domains
3-4 Wild type, backcrossed and published strains
3-5 Strains with point mutations generated by CRISPR-Cas9
3-6 Mutant strains with deletions generated by CRISPR-Cas9
3-7 Double mutants generated
3-8 Transcriptional reporter lines
3-9 Translational reporter or overexpression strains
3-10 Single worm PCR primers for mutant strains crossing
3-11 sgRNA used for CRISPR-Cas9
3-12 Repair templates used for CRISPR-Cas9
3-13 Single worm PCR primers for CRISPR-Cas9 generated mutant strains
3-14 Single worm PCR information for wild type and mutants
3-15 Primers for transcriptional reporter line construction
3-16 Primers for translational reporter line construction
3-17 Primers for plasmid construction
3-18 Plasmid construction and protein overexpression
3-19 Gene expressions with enrichment ratio of more than 5-fold
4-1 m/z of fragment ions for nemamide A, B and C
5-1 Primers used for plasmid construction
5-2 Observed phenotypes for RNAi of genes
5-3 Kinetic parameters of RML-1, GMP-1, and UGP-1
5-4 Proton chemical shifts and coupling constants


LIST OF FIGURES

1-1 Life cycle of C. elegans
1-2 Schematic mechanism of PKS and NRPS in an assembly-line manner
1-3 Examples of polyketides and nonribosomal peptides
1-4 Carrier protein (CP) domains interact with other domains
1-5 Three types of initiation in PKS
1-6 Catalytic mechanism of the minimal module in PKS
1-7 Mechanism of a typical NRPS module
1-8 Accessory enzymes function in cis and/or in trans
1-9 Examples of most abundant ascarosides in C. elegans
1-10 Genomic location of pks-1 and nrps-1
1-11 Comparative metabolomics study
1-12 Elucidation of the structures of nemamide A and B
1-13 UV spectra of nemamide A and B
1-14 Domain architectures of PKS-1 and NRPS-1
2-1 Insulin signaling pathway plays important roles in worm development
2-2 Phylogeny and domain analysis of PKS-1 homologs
2-3 Expression of the transcriptional reporters pks-1p::gfp,
2-4 Recovery of arrested L1s and development to the L4 stage
2-5 Comparison of development of wild-type, pks-1, and nrps-1 worms
2-6 Dauer formation and recovery in wild-type, pks-1, and nrps-1 worms
2-7 M-cell imaging in wild-type, pks-1, and nrps-1 backgrounds
2-8 Fertility and brood size in wild-type, pks-1, and nrps-1 worms
2-9 Expression of insulins in wild-type, pks-1, and nrps-1 arrested L1s
2-10 Expression of insulins in recovered versus arrested
2-11 Nemamide production in arrested and recovered L1s
2-12 Survival of wild-type, pks-1, and nrps-1 arrested L1s over time
2-13 L1 survival for wild-type and different mutant strains
2-14 L1 survival for wild-type, pks-1, and nrps-1 worms
2-15 Feeding rate of wild-type, pks-1, and nrps-1 worms
2-16 Pharynx pumping rate of wild-type, pks-1, and nrps-1 worms
2-17 Effect of an unc-31(e928 null) mutation on survival of arrested L1s
2-18 Avoidance of different worm strains upon OP50, B20 and PA14
2-19 Killing phenotype of different worm strains
2-20 Killing assay of pks-1 and nrps-1 worms upon C. albicans DAY185
2-21 Worm reproduction after 48 h infection by C. albicans DAY185
2-22 Presence of pks-1a and pks-1b splice variants
2-23 Verification of p15TV-L pks1b by HindIII, NdeI and BseRI digestion
2-24 Model for the role of the nemamides in L1 arrest and survival
3-1 Proposed biosynthetic pathway of nemamide A in C. elegans
3-2 Sequence alignment of PKS-1_KS domains with KSQ domains
3-3 Sequence alignment of PKS-1_ACP domains with Bacillus subtilis ACP
3-4 Sequence alignment of PKS-1_PCP domains, NRPS-1_PCP domains
3-5 Comparison of PKS-1_AT domains with AT0 loading acyltransferases
3-6 Sequence alignment of PKS-1_DH domains with known DH domains
3-7 Alignment of the PKS-1 KR domains with bacterial KR domains
3-8 Sequence alignment of A domains with EntE_A
3-9 Sequence alignment of C domains with ArfA_C2
3-10 Alignment of PKS-1 and NRPS-1 TE domains
3-11 Phylogeny of the PKS-1 and NRPS-1 TE domains
3-12 Gene structures and related alleles used in this chapter are indicated
3-13 Properties of mutant pks-1(reb7[KS1_C159A])
3-14 Sequencing data for pks-1(reb7[KS1_C159A])
3-15 Sequence alignment of NRPS-1_ACP7 with PKS-1_ACP domains
3-16 Structural modeling and sequence comparison of NRPS-1_ACP7
3-17 Properties of mutant nrps-1(reb8[ACP7_S307V])
3-18 Sequencing data for nrps-1(reb8[ACP7_S307V])
3-19 Properties of mutant pks-1(reb9[C1_H6685A])
3-20 Sequencing data for pks-1(reb9[C1_H6685A])
3-21 Sequencing data for nrps-1(reb10[C3_H1486A])
3-22 Properties of mutant pks-1(reb22[A1_G7106E])
3-23 Sequencing data for pks-1(reb22[A1_G7106E])
3-24 Properties of mutant pks-1(reb11[TE1_S7593A])
3-25 Sequencing data for pks-1(reb11[TE1_S7593A])
3-26 Sequencing data for pks-1(reb13[TE1_S7593C])
3-27 Sequencing data for nrps-1(reb12[TE2_S2803A])
3-28 Expression of the transcriptional reporters
3-29 Expression of the translational reporters
3-30 Verification of biosynthetic gene mutants generated by CRISPR-Cas9
3-31 Sequencing data for nemt-1(F49C12.10, reb15) deletion
3-32 Sequencing data for C24A3.4(reb24) deletion
3-33 Sequencing data for acs-9(reb21) deletion
3-34 Sequencing data for acs-9(reb28) deletion
3-35 Sequencing data for acs-24(reb23) deletion
3-36 Sequencing data for acs-24(reb24) deletion
3-37 Nemamide production in wild type and nemt-1
3-38 Nemamide production in wild type and nemt-1, acs-24, acs-9, Y71H2B.1, C24A3.4 and T12G3.4 rescue strains
3-39 Inter-rescue of nemamide production in acs-9 and acs-24 mutants
3-40 Nemamide production in wild type and ascaroside biosynthetic mutants
3-41 Functional properties of acox-3 in nemamide production
3-42 Ascaroside production in wild type and mutants
3-43 Transcriptional reporter of PPTase T04G9.4 in wild-type L4 worms
3-44 Enzymatic activity of ACOX-3
3-45 Enzymatic activity of ACS-24
3-46 Enzymatic activity of ACS-9 towards different fatty acids
3-47 Intermediates extracted from NRPS-1 mutant worms
3-48 Proposed trafficking mechanism between PKS-1 and NRPS-1
3-49 Functional characterization of the methyltransferase NEMT-1
3-50 Characterization of the lactonase enzyme T12G3.4
3-51 Characterization of PKS-1 C terminal module
3-52 Proposed initiation mechanism via PKS-1_KS-AT-ACP module
3-53 Biosynthetic properties of the nemamide A
4-1 Chromosomal locations of nine nemamide biosynthetic genes
4-2 Production of nemamide A in Caenorhabditis species
4-3 Production of nemamide B in Caenorhabditis species
4-4 LC-MS trace of small-scale P. pacificus extracts
4-5 LC-HRMS of nemamide C
4-6 LC-HRMSMS of nemamide C [M+H]+ 761.4172
4-7 LC-HRMSMSMS of nemamide C fragment ion 541.2346
4-8 Analysis of fragment ions from peptide ring of nemamide C
4-9 Marfey’s analysis for standards and nemamide C sample
4-10 The extracted ion chromatogram for L-FDAA-L/D-Asp, L-FDAA-GABA and L-FDAA standards
4-11 The extracted ion chromatogram for L-FDAA-L/D-Asp, L-FDAA-GABA and L-FDAA in Marfey’s analysis of nemamide C
4-12 Proposed structure of most dominant nemamide (C) in P. pacificus
4-13 Domain architectures of Ppa-PKS-1 and Ppa-NRPS-1s
5-1 Biosynthetic pathway of CDP-ascarylose in bacteria
5-2 Ion extraction of ascarosides in isotope labeling experiments
5-3 Proposed conversion of glucose to ascaroside
5-4 Ion extraction for NDP-ascarylose from worm nucleotide sugar pool. m/z shown in figure are in negative mode [M-H]-
5-5 LC-MS and LC-MS/MS analysis of UDP-ascarylose
5-6 Proposed biosynthetic pathway of UDP-ascarylose in worms
5-7 RML-1-5 homologs from different species organized in neighbor-joining phylogenetic trees
5-8 Activity assays with RML-1, GMP-1, UGP-1 and RML-2
5-9 Comparison of the domain structure of RML-4 with known dehydratases and coexpression of RML-4/RML-5
5-10 Structural alignment of a modeled structure of RML-4
5-11 Dehydratase and reductase activities of different enzyme combinations
5-12 Amino acid sequence alignment of RML-3 with RmlC
5-13 Analysis of the product of the reaction of RML-3, RML-4/RML-5 with dTDP-4-keto-6-deoxyglucose
5-14 Analysis of the product of the reaction of RML-2, RML-3, RML-4/RML-5 with UDP-Glc
5-15 LC-MS analysis of sugar nucleotides. UV absorbance at 254 nm of sugar nucleotides in mixture of standards
5-16 Ion extraction and mass spectrum of dTDP-rhamnose (m/z 547)
5-17 LC-MS/MS and LC-MS/MS/MS analysis of a dTDP-rhamnose standard
5-18 LC-MS/MS and LC-MS/MS/MS analysis of dTDP-rhamnose isolated from worms grown in CeHR
5-19 Sugar nucleotide analysis in C. elegans
5-20 Expression pattern of the prml-2::gfp and prml-4::gfp reporter genes in the embryo and larval stages of transgenic worms
5-21 Monitoring the expression of rml-2 and rml-4 transcriptional reporters expressing GFP-PEST during the molting cycle
5-22 Expression patterns of prml-2::gfp-pest and prml-4::gfp-pest at specific developmental stages
5-23 Proposed biosynthetic pathway for dTDP-L-rhamnose in C. elegans


LIST OF ABBREVIATIONS

A Adenylation domain

AABA α-Aminobutyric acid

ACOX Acyl-CoA oxidase

ACP Acyl carrier protein

ACS Acyl-CoA synthetase

ADP Adenosine diphosphate

AMP Adenosine monophosphate

AMT Aminotransferase

AT Acyltransferase

BABA β-Aminobutyric acid

C Condensation domain

CAN Canal associated neuron

Cas9 CRISPR associated protein 9

CeHR C. elegans Habitation and Reproduction

CDP Cytidine diphosphate

CoA Coenzyme A

CP Carrier protein

CRISPR Clustered regularly interspaced short palindromic repeats

Cy Cyclization domain

DH Dehydratase

dTDP Thymidine diphosphate

DTT Dithiothreitol

E Epimerase domain

ER Enoyl reductase

ESI Electrospray ionization

FDAA 1-Fluoro-2,4-dinitrophenyl-5-L-alaninamide

FPLC Fast protein liquid chromatography

GABA γ-Aminobutyric acid

GDP Guanosine diphosphate

HRMS High resolution mass spectrometry

IGF-1 Insulin-like growth factor 1

IPTG Isopropyl β-D-thiogalactopyranoside

KR Ketoreductase

KS Ketosynthase

LC-MS Liquid chromatography-mass spectrometry

LC-MS/MS Liquid chromatography-tandem mass spectrometry

MESG 2-Amino-6-mercapto-7-methylpurine ribonucleoside

MT Methyltransferase

NGM Nematode growth medium

NRPS Nonribosomal peptide synthetase

Ox Oxidation domain

PCP Peptidyl carrier protein

PKS Polyketide synthase

PPTase Phosphopantetheinyl transferase

qRT-PCR Quantitative real time-polymerase chain reaction

SDS-PAGE Sodium dodecyl sulfate-polyacrylamide gel electrophoresis

SFP Surfactin phosphopantetheinyl transferase

sgRNA Small guide RNA

SIM Single ion monitoring

T Thiolation domain

TE Thioesterase

UDP Uridine diphosphate

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

BIOSYNTHESIS AND MECHANISM OF NATURAL PRODUCTS IN NEMATODES

By

Likui Feng

December 2018

Chair: Rebecca A. Butcher
Major: Chemistry

The model organism C. elegans can arrest its development in response to starvation (L1 arrest) and in response to high population density and low food (the dauer diapause). Natural products, acting as both internal and external signaling molecules, are produced by worms to regulate developmental processes and behaviors.

One of the main families of natural products produced by C. elegans, the nemamides, represents the first assembly-line hybrid polyketide-nonribosomal peptides to be identified in animals. The discovery of the nemamides and the annotation of the biosynthetic pathway have enabled the exploration of nemamide biosynthesis in the context of an animal system. Nemamide biosynthesis has multiple noncanonical features and likely requires accessory enzymes acting in trans. Moreover, biosynthetic and functional studies could provide insights into the mechanism of action of the nemamides. As the nemamide biosynthetic genes are found in most nematode species, including parasitic ones, the nemamides may play a conserved role in nematode biology. Chapter 2 describes the mechanism of the nemamides in promoting L1 larval survival during arrest, and their potential roles in resistance against pathogens such as Candida albicans. Chapter 3 explores nemamide biosynthesis by analyzing domain architectures, identifying seven accessory enzymes, and characterizing the roles of the different enzymes using comparative metabolomics and biochemical assays. Nemamides and nemamide-like molecules in other nematode species are investigated and discussed in Chapter 4.

Another family of natural products, the ascarosides, which C. elegans secretes as pheromones to induce dauer formation and coordinate various behaviors, contains the unusual 3,6-dideoxysugar ascarylose. Nothing is known about the biosynthesis of this sugar in nematodes. Chapter 5 explores how C. elegans utilizes UDP-ascarylose to synthesize ascarosides and how C. elegans produces dTDP-rhamnose, which oscillates with the worm molting cycle. Our work will greatly improve our understanding of the signaling molecules in C. elegans and their roles in regulating worm development, stress survival and aging.

CHAPTER 1 INTRODUCTION

1.1 Nematodes and Life Cycle of C. elegans

Nematodes are roundworms that are associated with many kinds of organisms, such as bacteria, fungi, plants, animals and humans. There are free-living nematodes, such as Caenorhabditis elegans (C. elegans), necromenic nematodes, such as Pristionchus pacificus (P. pacificus), and parasitic nematodes, such as Ascaris suum (A. suum). Parasitic nematodes cause plant and animal diseases that decrease crop production, affect animal survival and harm human health¹⁻³.

The model organism C. elegans was isolated from soil as a free-living nematode, and it is very easy to grow in petri dishes and liquid cultures. It is small and transparent, which makes it very easy to manipulate. It has two sexes, hermaphrodites and males, which makes it very useful for genetic work⁴,⁵. The genome of C. elegans has been fully sequenced, so multiple approaches have been used extensively to investigate its transcriptional and translational profiles⁵. An adult hermaphrodite has exactly 959 somatic cells and a large brood size of 200-300 eggs, and an adult male has exactly 1031 somatic cells. C. elegans has a short life cycle of 3 to 5 days, depending on its cultivation temperature (15 °C to 25 °C). The C. elegans hermaphrodite has a complex and fully mapped nervous system, with 302 neurons and a connectome that includes synapses. There are thirty pairs of chemosensory neurons that help worms detect and discriminate a wide range of chemical compounds. Most importantly, various techniques, such as transgenesis, RNA interference and CRISPR-Cas9, are widely applied⁶⁻⁸. These advantages make C. elegans a perfect model organism for studying genetics, neuroscience and developmental biology.

As shown in Figure 1-1, the life cycle of C. elegans begins with the embryo, proceeds through four larval stages (L1, L2, L3 and L4) and ends with the adult stage. Under normal developmental conditions, the transition between two adjacent stages is achieved through molting. However, under certain conditions, the worms enter one of two alternative arrest stages, called L1 arrest⁹ and dauer¹⁰. L1 arrest occurs when the eggs hatch in the absence of food, but these arrested L1s can resume normal development if they receive food. In L1 arrest, the basic developmental progression is not changed, and arrested L1s are morphologically similar to unarrested L1s. In contrast, under harsh conditions, such as high population density, low food and high temperature, C. elegans will enter an alternative L3 larval stage called the dauer, which is morphologically distinct from a normal worm. Dauer worms can survive for several months without feeding, and once food is re-supplied, they recover and resume development at the L4 stage¹¹.
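The stage transitions described above can be summarized as a small state machine. The following is a deliberately simplified sketch, not a model taken from this work: the names `Stage` and `next_stage` and the boolean inputs `food` and `harsh` are illustrative assumptions, and the dauer decision is collapsed into a single check at the L2 stage.

```python
from enum import Enum


class Stage(Enum):
    EMBRYO = "embryo"
    L1 = "L1"
    L1_ARREST = "L1 arrest"
    L2 = "L2"
    L3 = "L3"
    DAUER = "dauer"
    L4 = "L4"
    ADULT = "adult"


def next_stage(stage: Stage, food: bool, harsh: bool = False) -> Stage:
    """Return the next stage (hypothetical helper, simplified rules).

    food  -- whether food is available
    harsh -- stands in for high population density, low food, high temperature
    """
    if stage in (Stage.EMBRYO, Stage.L1_ARREST):
        # Hatching without food leads to L1 arrest; feeding resumes development.
        return Stage.L1 if food else Stage.L1_ARREST
    if stage is Stage.L1:
        return Stage.L2
    if stage is Stage.L2:
        # Simplification: dauer is treated as an alternative to L3.
        return Stage.DAUER if harsh else Stage.L3
    if stage is Stage.DAUER:
        # Dauers wait until food returns, then continue at L4.
        return Stage.L4 if food else Stage.DAUER
    if stage is Stage.L3:
        return Stage.L4
    return Stage.ADULT  # L4 molts into the adult; adults stay adults
```

For example, `next_stage(Stage.EMBRYO, food=False)` yields `Stage.L1_ARREST`, and `next_stage(Stage.DAUER, food=True)` yields `Stage.L4`.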

Figure 1-1. Life cycle of C. elegans.

In addition to the model organism C. elegans, Pristionchus pacificus is gradually becoming another useful and interesting tool for studying nematode development, behavior and pathogen resistance. P. pacificus can be free-living, which makes it useful for studying species development, or parasitic, which makes it useful for studying host-parasite interactions, such as those between P. pacificus and scarab beetles¹². Compared to C. elegans, P. pacificus differs in several ways: it has a longer lifespan and fewer progeny; its life cycle proceeds from J1 (juvenile) to J4 to adult; and the J1 stage occurs inside the eggshell, with the larva molting into J2 when the eggshell is broken¹³. Recently, important techniques such as transgenesis and CRISPR-Cas9 have been applied for the first time to P. pacificus¹⁴.

1.2 Polyketide and Nonribosomal Peptides

Polyketides and nonribosomal peptides represent two of the most important classes of natural products used in modern medicine. They include the anthelmintic drug ivermectin, which is used to treat parasitic worms in over 300 million people annually, the antibiotic vancomycin, which is used to treat life-threatening infections by gram-positive bacteria, and the immunosuppressant FK506, an essential drug after organ transplantation¹⁵. Polyketide synthases (PKSs) and nonribosomal peptide synthetases (NRPSs) are large, multi-domain enzymes that synthesize polyketides and nonribosomal peptides, respectively, in an assembly-line or iterative manner. PKSs incorporate building blocks of malonyl derivatives to make polyketides, and NRPSs couple amino acids to make nonribosomal peptides, as shown in Figure 1-2¹⁶,¹⁷. Biosynthesis of polyketides and nonribosomal peptides includes initiation, extension and termination steps to make various natural products. PKSs and NRPSs can also work together to produce hybrid polyketide-nonribosomal peptides.

Polyketides and nonribosomal peptides, such as antibiotics and siderophores, are commonly produced by many species of bacteria and fungi but are rare in metazoans¹⁸,¹⁹. As shown in Figure 1-3, vancomycin, rapamycin and avermectin are all produced by bacterial PKSs and NRPSs to kill competing bacteria or fungi, or to paralyze parasites. Many of these natural products have been developed into important therapeutics.

Figure 1-2. Schematic mechanism of PKS and NRPS in an assembly-line manner.

Figure 1-3. Examples of polyketides and nonribosomal peptides.

Figure 1-4 illustrates a minimal module for either a PKS or an NRPS. In both PKS and NRPS modules, carrier protein (CP) domains play essential roles in loading starter units and holding growing intermediates. PKSs use acyl carrier protein (ACP) domains to incorporate malonyl derivatives or hold fatty-acyl intermediates. NRPSs have peptidyl carrier protein (PCP) domains to load amino acids or transfer growing peptides. Interactions of CP domains with other nearby functional domains are achieved through a phosphopantetheinyl arm, which is installed by a phosphopantetheinyl transferase (PPTase) using coenzyme A onto the free serine residue in CP domains²⁰⁻²².

Figure 1-4. Carrier protein (CP) domains interact with other domains through the phosphopantetheine arm. Apo-CP, with its free serine residue, is converted into holo-CP by a phosphopantetheinyl transferase (PPTase), which catalyzes attack of the serine on coenzyme A.

1.2.1 Initiation Module of PKSs and NRPSs

To initiate polyketide or nonribosomal peptide biosynthesis, PKSs and NRPSs need to activate starter acyl esters. Three types of starter loading modules are found in bacterial and fungal PKSs, as shown in Figure 1-5: 1) The most common type is the KSQ-AT-ACP type loading module. AT domains in this kind of module strictly prefer malonyl-CoA or methylmalonyl-CoA (CoA esters of dicarboxylic acids), and KSQ domains catalyze the decarboxylation of the loaded esters to yield acetyl or propionyl units. Unlike most KS domains, KSQ domains have a glutamine instead of a cysteine as the catalytic residue. 2) The second type of loading module is the AT-ACP type, which can load a broad range of CoA esters with different carbon lengths. 3) The last type of loading module is the CoL (CoA-ligase like)-ACP type, in which a CoL enzyme can work either in cis or in trans to activate different acyl esters in an ATP-dependent manner²³ and then load them.

Figure 1-5. Three types of initiation in PKS. A) KSQ type loading module KSQ-AT-ACP. B) AT-ACP type loading module. C) CoL (CoA-ligase like)-ACP type loading module.

1.2.2 Mechanism of Extension in PKSs and NRPSs

In the PKS module shown in Figure 1-6, the monomer is usually malonyl or methylmalonyl or their derivatives. The monomers are loaded onto the thiolation (T or

ACP) domains by the preceding acyltransferase (AT) domain. These malonyl monomers are tethered as thioesters to the thiolation domains via a phosphopantetheinyl arm. The ketosynthase (KS) then catalyzes decarboxylation of the loaded monomer and a further Claisen reaction between the monomer and the growing chain intermediate. In effect, the chain is lengthened by two additional carbons.

Additional domains can also be present within the module. For example, a ketoreductase (KR) domain would convert the β-ketone to a β-hydroxyl group, a dehydratase (DH) domain would convert the β-hydroxyl to an α,β-unsaturated thioester24, and an enoylreductase (ER) domain would reduce the unsaturated double bond.

After a certain number of rounds of extension, the growing intermediate is hydrolyzed or macrocyclized into the final product by the catalytic triad S (serine)-D (aspartate)-H (histidine) of the terminal thioesterase (TE) domain25.
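
As a rough illustration of the reductive steps described above, the following Python sketch maps the set of optional domains in a module to the functional group left at the β-carbon after one round of extension. The function name `beta_carbon_state` and the string labels are hypothetical, chosen here only for illustration, and the sketch assumes that the domains act strictly in the order KR, DH, ER.

```python
def beta_carbon_state(domains):
    """Return the beta-carbon functionality left by one PKS extension module.

    `domains` is an iterable of optional reductive domain names ("KR", "DH", "ER").
    Simplifying assumption: DH needs the KR product, and ER needs the DH product.
    """
    present = set(domains)
    if "KR" not in present:
        return "beta-ketone"             # minimal module (KS-AT-ACP) only
    if "DH" not in present:
        return "beta-hydroxyl"           # KR reduces the ketone
    if "ER" not in present:
        return "alpha,beta-unsaturated"  # DH eliminates water
    return "saturated"                   # ER reduces the double bond


for combo in ([], ["KR"], ["KR", "DH"], ["KR", "DH", "ER"]):
    print(combo, "->", beta_carbon_state(combo))
```

Real modules can deviate from this simple ordering (for example, when a domain is present but inactive), so the mapping should be read as a first approximation.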

Figure 1-6. Catalytic mechanism of the minimal module in PKS.

In the NRPS module of Figure 1-7, individual monomers, in this case amino acids, are loaded onto the thiolation (T or PCP) domains by the preceding adenylation (A) domain. Using ATP, A domains activate different kinds of amino acids as aminoacyl-AMP in the presence of Mg2+ and load them onto PCP domains. Each amino acid monomer is tethered as a thioester to a thiolation domain via a phosphopantetheinyl arm and can be epimerized by epimerase (E) domains. The condensation (C) domain then catalyzes the formation of the peptide bond. Because multiple modules are generally present, the resulting dipeptide can then react with the amino acid monomer on the thiolation domain in the next module to extend the chain. A single domain can also work iteratively to incorporate the same amino acid multiple times. The final product is produced by hydrolysis or macrocyclization catalyzed by the terminal TE domain. Sometimes one or more domains might be present but inactive and thus not involved in biosynthesis of the natural product22.

Figure 1-7. Mechanism of a typical NRPS module.

1.2.3 Accessory Enzymes Acting in trans or in cis

In addition to the common domains present in PKSs and NRPSs, there can be other domains (shown in Figure 1-8) that function either in cis or in trans. In PKSs, methyltransferase (MT) domains introduce one or more methyl groups onto carbon-based26, nitrogen-based, or hydroxyl functionalities of the growing product, and aminotransferase (AMT) domains can add amino groups to the growing natural product, which may play important roles in the final macrocyclization to form a lactam ring27. In NRPSs, domains such as cyclization (Cy) or oxidation (Ox) domains are present to introduce ring structures. Some other enzymes, such as acyl-CoA ligases, fatty-acyl binding proteins, and CoA transferases, are also sometimes present28-30. Additional enzymes, like thioesterases, act as proofreading enzymes that offload aberrant intermediates in order to maintain the fidelity of natural product biosynthesis22.

Figure 1-8. Accessory enzymes function in cis and/or in trans in polyketide or nonribosomal peptide biosynthesis. MT: methyltransferase; AMT: aminotransferase; Cy: cyclization; Ox: oxidation.

1.3 Signaling Molecules in C. elegans

Because C. elegans has a sophisticated and complex nervous system, it can perceive both external and internal small-molecule signals and respond to changing environments. In this part of the introduction, the ascarosides11, which make up the dauer pheromone, and the nemamides31, which we have shown affect L1 survival, will be discussed, and some other signaling molecules will also be mentioned.

1.3.1 Dauer Pheromone Ascarosides

When worms encounter harsh conditions, such as high population density, low food, and high temperature, the worm population secretes a pheromone, called the dauer pheromone and composed of ascarosides, to communicate with each other and enter the dauer stage until external conditions return to normal32,33. Ascarosides have a modular structure in which the 3,6-dideoxysugar ascarylose is attached to fatty-acid chains of different lengths, and these ascarosides can be further modified with distinct head or terminus groups34. Figure 1-9 shows the most abundant ascarosides present in worm medium. Most of these four kinds of ascarosides, including asc-ωC3, asc-∆C9, and IC-asc-C5, have dauer-inducing activity, with asc-ωC3 functioning synergistically with other ascarosides to induce dauer formation. Ascarosides such as asc-∆C9, IC-asc-C5, and Glc-asc-C6-MK attract males and thus act as sex pheromones. Indole-modified ascarosides, such as IC-asc-C5 and IC-asc-∆C9, can induce worm aggregation at very low concentrations32,33,35-37.

Figure 1-9. Examples of the most abundant ascarosides in C. elegans. The ascarylose core of each ascaroside is highlighted in pink.

The biosynthesis of ascarosides is thought to begin with the attachment of very long-chain fatty acids to the ascarylose sugar, though the exact mechanism is yet to be determined38. To produce various medium-chain and short-chain ascarosides, peroxisomal β-oxidation shortens the side chains of the long-chain ascarosides. Each β-oxidation cycle involves four enzymes: an acyl-CoA oxidase (ACOX), the enoyl-CoA hydratase MAOC-1, the 3-hydroxyacyl-CoA dehydrogenase DHS-28, and, for the last step, the 3-ketoacyl-CoA thiolase DAF-22. ACOX initiates β-oxidation by FAD-assisted oxidation to introduce an α,β-double bond, followed by hydration catalyzed by MAOC-1. DHS-28 then further oxidizes the β-hydroxy group, and DAF-22 cleaves off acetyl-CoA to shorten the side chain by two carbons. Recent metabolomics studies, biochemical assays, and structural studies indicate that ACOXs regulate the production of distinct ascarosides in response to environmental changes, and DAF-22 activity is involved in many biosynthetic pathways34,39. Although the biosynthesis of ascarosides has been studied intensively, nothing is known about how ascarylose is biosynthesized and incorporated into the ascarosides.

1.3.2 Discovery of the Nemamides

The second significant class of signaling molecules is the nemamides, which have been shown to play an important role in promoting survival during and recovery from L1 arrest. Remarkably, the nemamides are produced by two large genes, pks-1 on chromosome X and nrps-1 on chromosome III (Figure 1-10). pks-1 encodes an 865 kDa megaenzyme polyketide synthase (PKS), and nrps-1 encodes a 333 kDa megaenzyme nonribosomal peptide synthetase (NRPS)40,41. PKSs and NRPSs are large, multidomain enzymes that produce polyketides and nonribosomal peptides, respectively. Polyketides and nonribosomal peptides are commonly found among antibiotics, siderophores, and other natural products, which are produced mainly in bacteria and fungi but are extremely rare in metazoans, such as nematodes18,19. Moreover, their complex domain structures make it very difficult to predict the structures of their products, and genome-wide efforts to annotate C. elegans genes, including mutagenesis screens, RNA interference screens, and transcriptional profiling/RNAseq experiments, do not provide any significant clues as to the biological functions of PKS-1 and/or NRPS-142. It would therefore be very intriguing to identify their products and understand the corresponding molecular mechanisms.

Figure 1-10. Genomic location of pks-1 and nrps-1.

To identify the masses of the natural products that PKS-1 and NRPS-1 produce, comparative, untargeted metabolomics was applied (Figure 1-11). Extracts from worms and from conditioned culture medium were generated from mixed-stage cultures of wild-type worms, pks-1 mutant worms, and nrps-1 mutant worms. The metabolites in the extracts were analyzed by HR-LC-MS and compared using XCMS43,44. Two peaks (m/z 757.3866 and 755.3700), termed nemamide A and nemamide B, respectively, were present in wild-type worm extracts and completely absent in both mutant worm extracts. Thus, PKS-1 and NRPS-1 likely work together to produce a hybrid polyketide/nonribosomal peptide. Nemamides A and B are associated with the worm body rather than secreted into the culture medium, as the molecules were not detected by HR-LC-MS in the culture medium extracts. Therefore, they likely serve as internal signaling molecules.

To purify enough of the nemamides to identify their structures by NMR spectroscopy, wild-type worms were grown in the axenic, semi-defined medium CeHR45, which gives about five times higher density of worms than bacteria-fed worm cultures. ~50 L of worms in CeHR were grown, and ultimately, only ~100 μg of nemamide A and less of nemamide B were purified. These compounds were extracted from freeze-dried worms and purified using a short silica gel column, followed by an HP-20 column, a Sephadex LH-20 column, and then HPLC. The extraction and purification process had to be completed for small batches of worms (from ~2 L of culture) within 1-2 days to prevent degradation of the nemamides. The exact masses of nemamide A and B indicated the molecular formulas C34H54N8O10 and C34H52N8O10, respectively. NMR spectra, including dqf-COSY, TOCSY, HSQC, HMBC, and ROESY spectra, were obtained for nemamide A and used to establish its molecular connectivity. The absolute configuration of the three asparagines in nemamide A was determined by acid hydrolysis of the molecule, followed by derivatization with Marfey’s reagent and comparison to derivatized standards by LC-MS46. All data established that nemamide A contains one L-Asn and two D-Asn. The position of the L-Asn in the macrolactam ring, as well as the absolute configuration of the stereocenter at C-18, was deduced through conformational analysis of the macrolactam ring for each of the possible stereoisomers, subject to the constraints of the observed ROESY correlations and coupling constants.

As the CD spectrum of a chiral molecule depends largely on the chiral center nearest to the UV-active portion of the molecule, the absolute configuration of the stereocenter at C-22, as well as that at C-20, was deduced through comparison of the observed CD spectrum of nemamide A to the predicted CD spectrum. Thus, the absolute configuration of nemamide A was established as 2R,6R,10S,18S,20R,22S (Figure 1-12). In comparison to nemamide A, nemamide B has one additional double bond, based on its NMR spectra, HR-MS, MS/MS, and UV spectrum (nemamide A, λmax 258, 269, 279 nm; nemamide B, λmax 286, 301, 315 nm), shown in Figure 1-13.
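
As a consistency check on the molecular formulas given above, the monoisotopic masses can be recomputed from standard isotope masses. The Python sketch below is an illustration only: it assumes that the two reported peaks are sodium adducts, [M+Na]+, which the text does not state, and the helper names (`monoisotopic_mass`, `sodium_adduct_mz`, `ppm_error`) are hypothetical.

```python
# Monoisotopic masses (u) of the most abundant isotopes.
ISOTOPE_MASS = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
NA_MASS = 22.98976928       # sodium-23
ELECTRON_MASS = 0.00054858  # removed to form the cation


def monoisotopic_mass(formula):
    """Sum isotope masses for a formula given as {element: count}."""
    return sum(ISOTOPE_MASS[el] * n for el, n in formula.items())


def sodium_adduct_mz(formula):
    """m/z of a singly charged [M+Na]+ ion (the adduct type is an assumption)."""
    return monoisotopic_mass(formula) + NA_MASS - ELECTRON_MASS


def ppm_error(observed, calculated):
    """Relative deviation of the observed m/z from the calculated m/z, in ppm."""
    return (observed - calculated) / calculated * 1e6


nemamide_a = {"C": 34, "H": 54, "N": 8, "O": 10}
nemamide_b = {"C": 34, "H": 52, "N": 8, "O": 10}

for name, formula, observed in (
    ("nemamide A", nemamide_a, 757.3866),
    ("nemamide B", nemamide_b, 755.3700),
):
    calc = sodium_adduct_mz(formula)
    print(f"{name}: calc {calc:.4f}, obs {observed:.4f}, "
          f"error {ppm_error(observed, calc):+.1f} ppm")
```

Under this assumption, both observed values fall within a few ppm of the calculated m/z. The difference of about 2.016 between the two ions corresponds to two hydrogen atoms, in line with the additional double bond in nemamide B.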

Figure 1-11. Comparative metabolomics study of wild-type, pks-1(mos-1), and nrps-1(mos-1) mutant worms. Samples were generated from both worms and worm medium. HR-LC-MS was used to identify masses missing in the mutants. Two masses, named nemamide A and nemamide B, were missing in both the pks-1 and nrps-1 mutants. The x-axis indicates the fold change of masses in mutant worms compared to wild type, and the y-axis indicates the statistical significance (P value).

Figure 1-12. Elucidation of the structures of nemamide A and B.

Figure 1-13. UV spectra of nemamide A and B. Nemamide A has λmax 258, 269, 279 nm and nemamide B has λmax 286, 301, 315 nm.

1.3.3 Other Signaling Molecules in Worms

Many other signaling molecules regulate the development and behavior of C. elegans. Levels of N-acylethanolamines (NAEs), which act as internal signals, are associated with dietary restriction-induced lifespan extension47. Similarly, another molecule, oleoylethanolamide (OEA), produced by a lysosomal acid lipase, can activate expression of NHR-49 and NHR-80 target genes and extends longevity48. Upon fungal infection and wounding, C. elegans can synthesize the internal signal 4-hydroxyphenyllactic acid (HPLA), which is recognized by the G protein-coupled receptor (GPCR) DCAR-1 to trigger the immune response49. In addition to signaling molecules produced by C. elegans, other nematode species can also secrete signaling molecules that are sensed by C. elegans. For example, P. pacificus50 can secrete sulfolipids into the medium, which are perceived by C. elegans and induce defensive responses.

1.4 Significance and Summary

The model organism C. elegans can arrest its development at two stress-resistant larval stages: L1 diapause and dauer diapause. C. elegans uses two main classes of natural products to control development at these larval stages: (1) the nemamides, which are hybrid polyketide-nonribosomal peptides that might act as internal signaling molecules to mediate worm development under normal and stress conditions, and (2) the ascarosides, which C. elegans secretes as pheromones to induce dauer formation and coordinate various behaviors.

Although PKS and NRPS genes are commonly found in many bacterial and fungal species, they are rarely seen in metazoans18,19. Phylogenetic analysis of animal PKSs that are distinct from animal fatty acid synthases suggests that simple iterative Type I PKSs and single-module NRPSs can be found in a handful of animal species18,51. In the sea urchin, for example, an iterative Type I PKS with a single module is required for the biosynthesis of the pigment echinochrome, a naphthoquinone that may function in defense51. However, to our knowledge, no assembly-line PKS or NRPS has been characterized in metazoans. The nemamides represent the first assembly-line hybrid polyketide-nonribosomal peptides to be identified in animals. The discovery of the nemamides and the annotation of the domain organization of PKS-1 and NRPS-1 (Figure 1-14) will enable the exploration of polyketide and nonribosomal peptide biosynthesis in the context of a complex animal system and may also provide new insights into the biosynthesis of an important class of natural products. Nemamide biosynthesis likely requires additional enzymes, so it would be very intriguing to dissect the functions of the biosynthetic domains and to reconstruct nemamide biosynthesis by searching for additional enzymes that act in trans. Further studies of gene expression and enzyme function could provide additional insights into the biological role of the nemamides.

Figure 1-14. Domain architectures of PKS-1 and NRPS-1.

A library of ascarosides secreted by C. elegans regulates many aspects of worm biology, including dauer formation, male attraction, and aggregation33,34,36. Recent studies have focused on the biosynthesis of ascarosides by β-oxidation in different nematode species39, but no work has been done to investigate whether worms can synthesize ascarylose and, if so, how it is synthesized. Outside of nematodes, ascarylose has only been found in pathogenic bacteria, where it is used to synthesize cell-wall components that act as antigens52,53. Its active form is CDP-ascarylose, and the biosynthesis of CDP-ascarylose has been studied extensively. A critical step in the synthesis of CDP-ascarylose requires a PMP-dependent dehydratase (E3)54-57, but no homolog of this enzyme has been found in worms.

Overall, in Chapter 1, the discovery of the nemamides and the ascarosides was discussed. Chapter 2 describes the investigation of the mechanism by which the nemamides promote L1 larval survival during starvation and the testing of how various nemamide biosynthetic mutants respond to pathogens. Chapter 3 focuses on the exploration of nemamide biosynthesis by dissecting domain architectures and searching for accessory enzymes. Nemamides and nemamide-like molecules in other nematode species are studied in Chapter 4. Chapter 5 explores the biosynthesis of ascarylose and rhamnose in C. elegans. Our work has improved our understanding of the biosynthesis of signaling molecules in C. elegans and their roles in regulating worm development, stress survival, and aging.

CHAPTER 2 MECHANISM OF THE NEMAMIDES IN PROMOTING LARVAL SURVIVAL AND PATHOGEN RESISTANCE

2.1 Insulin Signaling Pathway in Worm Development and Arrest

With sufficient food* (bacteria), C. elegans worms progress from the egg through four larval stages (L1-L4) to the adult (Figure 2-1). However, under certain conditions, worms will arrest their development. One of the best-known arrests is dauer arrest, which occurs under high population density, low food, and high temperature58,59. If eggs hatch to L1 larvae in the complete absence of food, the L1 larvae will arrest but then resume development upon addition of food. The insulin/IGF-1 pathway is an important regulator of L1 arrest9,60. Starvation leads to the downregulation of certain insulins, leading to nuclear translocation of the DAF-16 transcription factor and L1 arrest. Food, on the other hand, leads to the upregulation of certain insulins, blocking the nuclear translocation of DAF-16 and L1 arrest61.

Figure 2-1. Insulin signaling pathway plays important roles in worm development under normal and stress conditions9.

Parts adapted with permission from Shou, Q., Feng, L., Long, Y., Han, J., Nunnery, J.K., Powell, D.H., Butcher, R.A. A hybrid polyketide-nonribosomal peptide in nematodes that promotes larval survival. Nat Chem Biol 12, 770-772 (2016). Copyright 2016 Springer Nature.

2.2 Experimental Methods

2.2.1 Methods for Roles of Nemamides in Larval Survival

Strains and culture methods. Worms were maintained on E. coli OP50 according to standard methods. Strains used include wild type (N2), RAB7 pks-1(ttTi24066), RAB9 pks-1(ok3769), RAB8 nrps-1(ttTi45552), RAB10 pks-1(ttTi24066); nrps-1(ttTi45552), ayIs7[hlh-8p::gfp], RAB15 pks-1(ttTi24066); ayIs7, RAB16 nrps-1(ttTi45552); ayIs7, unc-31(e928), RAB17 unc-31(e928); pks-1(ttTi24066), RAB18 unc-31(e928); nrps-1(ttTi45552), daf-16(mu86), RAB19 daf-16(mu86); pks-1(ttTi24066), and RAB20 daf-16(mu86); nrps-1(ttTi45552). The pks-1(ttTi24066), nrps-1(ttTi45552), and pks-1(ok3769) strains were backcrossed two, four, and six times, respectively. The double mutants were constructed from single mutants using standard genetic methods, and the presence of alleles was verified by PCR (see primers in Table 2-1).

Table 2-1. Primers used for plasmid construction or genotyping

Primer              Purpose              Sequence (5’-3’)*
nrps-1p-AscI-F      GFP reporter         gcatGGCGCGCCTGCATCAGCACATACTCAATGGTC
nrps-1p-NotI-R      GFP reporter         catgGCGGCCGCTGTGCAGAGTGCTCCGCGTAG
pks-1p-SalI-F       GFP reporter         gcgcGTCGACTGTGCATACATGAGTTGTTGCT
pks-1p-NotI-R       GFP reporter         catgGCGGCCGCTTTCTCCAAATCTTAATACAAATTATAT
nrps-1-F            Mos1 detection       GGAGAAGTCATCTGTTTCCA
nrps-1-R            Mos1 detection       TTGGCGATCACTTCAAATGG
pks-1-F             Mos1 detection       GAGGGAATATTGTATCCCACC
pks-1-R             Mos1 detection       GAAAACCGTGTTTGGTCTCG
oJL11562            Mos1 detection       GCTCAATTCGCGCCAAACTATG
daf-16-F63          deletion detection   GTAGACGGTGACCATCTAGAG
daf-16-internal63   deletion detection   CGGGAATTTCAGCCAAAGAC
daf-16-R63          deletion detection   GACGATCCAGGAATCGAGAG
unc-31-F            deletion detection   TAAGACCGCCCATGTTGCAC
unc-31-internal     deletion detection   AGTTGTGGCCTCTCCAATTC
unc-31-R            deletion detection   ATTCTGAGGGCACGACTCTG
ins-11-F            qRT-PCR              TCTTCGTCAATGAGGGTCAAG
ins-11-R            qRT-PCR              CAGTCGGATGCTGTTCTCC

*Underlined bases indicate restriction sites.

nrps-1p::gfp and pks-1p::gfp reporter strains. The PEST sequence from pAF20764 (gift of Alison Frand) was subcloned into pPD114.108 at the XhoI/EcoRI sites to generate pPD114.108-gfp::pest. 4.563 kb of the pks-1 promoter and 3 kb of the nrps-1 promoter were amplified from C. elegans genomic DNA (Table 2-1). The pks-1 and nrps-1 promoters were inserted into the SalI/NotI and AscI/NotI sites, respectively, of pPD114.108 or pPD114.108-gfp::pest to obtain pks-1p::gfp and nrps-1p::gfp or pks-1p::gfp-pest and nrps-1p::gfp-pest, respectively. 50 ng/μL of the transgenes, along with 50 ng/μL of the co-injection marker unc-122p::DsRed (gift of Piali Sengupta), were injected into wild-type worms. At least three independent transgenic strains were analyzed. Imaging was conducted on a Zeiss Axiovert.A1 microscope equipped with ZEN lite 2012 camera.

L1 recovery, dauer formation, and dauer recovery assays. For L1 recovery assays, eggs were isolated from well-fed gravid worms using alkaline bleach treatment, diluted to 4-6 eggs/μL in M9 buffer, and shaken for 24 h at 22.5 °C and 225 rpm. Approximately 80-120 synchronized L1s were placed onto a 3 cm NGM plate with OP50 at 15 °C, 20 °C, or 25 °C. After a certain period of time (40 h at 25 °C, 48 h at 20 °C, and 80 h at 15 °C), the percentage of worms at or past the L4 stage was determined. For each experiment, five plates were analyzed for each strain, the percentage of worms at or past the L4 stage was calculated for each plate, and the percentages for each strain were averaged. Dauer formation assays were performed for wild-type, pks-1, and nrps-1 worms with vehicle control or 1 μM asc-C6-MK at 25 °C as described32. Dauer recovery assays were performed by taking dauers from dauer formation assay plates and moving them to a lawn of bacteria for 24 h at 20 °C before scoring for recovery.
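
The per-plate scoring described above for the L1 recovery assay amounts to a simple calculation, sketched here in Python. The counts are invented for illustration, and the function name `percent_past_l4` is hypothetical. Note that averaging per-plate percentages, as described in the text, is not the same as pooling all worms across plates.

```python
from statistics import mean


def percent_past_l4(plates):
    """Average of per-plate percentages of worms at or past the L4 stage.

    `plates` is a list of (n_at_or_past_l4, n_total) tuples, one per plate.
    """
    per_plate = [100.0 * n_l4 / n_total for n_l4, n_total in plates]
    return mean(per_plate)


# Hypothetical counts for five plates of one strain (not data from this work).
example_plates = [(90, 100), (80, 100), (96, 120), (72, 90), (85, 100)]
print(f"{percent_past_l4(example_plates):.1f}% at or past L4")
```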

Egg to L4 development assay. Worms were maintained at 15 °C, 20 °C, or 25 °C for 2-3 generations. For the egg lay assay, L4s were moved onto a new plate one day before the beginning of the experiment, and the next day 8 adults were used to perform a 1 h egg lay on each 3 cm NGM plate with OP50. Alternatively, for the egg prep experiment, eggs were isolated from well-fed gravid worms using alkaline bleach treatment, washed, and added to 3 cm NGM plates with OP50. In both egg lay and egg prep assays, eggs were then incubated at 15 °C, 20 °C, or 25 °C, and after a certain period of time (40 h at 25 °C, 48 h at 20 °C, and 80 h at 15 °C), the percentage of worms at or past the L4 stage was determined. For each experiment, five plates were analyzed for each strain, and the percentages for each strain were averaged.

Analysis of expression of insulins using qRT-PCR. Well-fed gravid adults were collected from multiple 10 cm NGM plates, and eggs were isolated using alkaline bleach treatment, diluted to 4–6 eggs/µL in M9 buffer, and shaken for 24 h at 21 °C and 225 rpm. Then 25 mg/mL of OP50 was supplied to initiate recovery. Worms at 0 h and 6 h after feeding were collected by washing with cold M9 buffer, flash-frozen, and stored at –80 °C before qRT-PCR. Total RNA extraction, an on-column DNase treatment, and cDNA generation were performed as described34, except that 0.25 µg of total RNA was used for reverse transcription. All primers for insulin genes were as described65, except for the ones for ins-3, ins-26, and ins-31, which were as described66, and the ones for ins-11, which were designed using the Real-time PCR Primer Design Tool (IDT) and are listed in Table 2-1. qPCR was performed using SYBR Green select Master Mix (Life Technologies) on a 7500 Fast Real-Time PCR system (Applied Biosystems) using the standard mode. PCR parameters included a holding stage at 50 °C for 2 min and another holding stage at 95 °C for 5 min, followed by 40 cycles of 95 °C for 10 s, 57 °C for 20 s, and 72 °C for 30 s. Relative expression levels for recovered versus arrested L1s were determined using the ∆∆Ct method67 and normalized to the expression levels of the endogenous control genes act-1, pmp-3, and Y45F10D.434. The ratio of relative expression levels for each strain was calculated by comparing the relative expression level at 6 h after feeding with the level at 0 h.
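
For readers unfamiliar with the ∆∆Ct calculation, a minimal Python sketch follows. It assumes 100% amplification efficiency (a doubling per cycle) and averages the Ct values of the reference genes before normalization. Both are simplifying assumptions rather than details given in the text, the function name `fold_change_ddct` is hypothetical, and all Ct values shown are invented.

```python
from statistics import mean


def fold_change_ddct(target_ct_test, ref_cts_test, target_ct_ctrl, ref_cts_ctrl):
    """Relative expression (test vs. control) by the delta-delta-Ct method.

    Assumes perfect doubling per PCR cycle; reference-gene Cts are averaged.
    """
    d_ct_test = target_ct_test - mean(ref_cts_test)  # normalize test sample
    d_ct_ctrl = target_ct_ctrl - mean(ref_cts_ctrl)  # normalize control sample
    dd_ct = d_ct_test - d_ct_ctrl
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: an insulin gene at 6 h after feeding vs. 0 h,
# with three reference genes (e.g. act-1, pmp-3, Y45F10D.4) in each sample.
fold = fold_change_ddct(
    target_ct_test=23.0, ref_cts_test=[18.0, 20.0, 22.0],
    target_ct_ctrl=25.0, ref_cts_ctrl=[18.0, 20.0, 22.0],
)
print(f"fold change (6 h / 0 h): {fold:.2f}")
```

A lower Ct in the fed sample relative to the references gives a fold change above 1, that is, upregulation upon feeding.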

Nemamide production. Eggs were isolated from well-fed gravid worms using alkaline bleach treatment, diluted in S basal in a 125 mL flask, and shaken for 24 h at 20 °C and 200 rpm. Then, the synchronized L1s were inoculated into 2.8 L flasks with the density adjusted to 6 L1s/μL in 200 mL S medium (starved L1s) or 190 mL S medium plus 10 mL concentrated OP50 (fed L1s). Starved and fed L1 worms were cultured at 20 °C and 150 rpm for 6 h. L1s were then harvested by washing three times with S basal. L1s were freeze-dried, ground with sand, and extracted with ethanol, as described above, and then the nemamides were analyzed by LC-MS.

M cell division assay. For the ayIs7[hlh-8p::gfp], RAB15 pks-1; ayIs7, and RAB16 nrps-1; ayIs7 strains, eggs were isolated from well-fed gravid worms using alkaline bleach treatment, diluted to 4–6 eggs/μL in M9 buffer with 0.08% ethanol, and shaken for 7 d at 22.5 °C and 225 rpm. Worms were examined for M cell division using a fluorescence microscope. More than 50 worms were examined for each genotype.

Fertility and brood size upon recovery from L1 arrest. For the wild-type, pks-1, and nrps-1 strains, eggs were isolated from well-fed gravid worms using alkaline bleach treatment, diluted to 4–6 eggs/μL in M9 buffer, and shaken for 5 d at 22.5 °C and 225 rpm. Worms were then moved to a lawn of bacteria, allowed to develop to the L4 stage, and then singled. Survival and fertility were examined 2–3 d later.

L1 survival assay. Eggs were isolated from well-fed gravid worms using alkaline bleach treatment, diluted in M9 buffer in a 125 mL flask, and shaken for 24 h at 20 °C and 200 rpm. Then, the density of the synchronized L1s was adjusted to 4-6 L1s/μL, 20–25 L1s/μL (high density), or 0.5-0.8 L1s/μL (low density) in M9 buffer. Every other day, 20 μL samples of starved L1s were seeded onto 3 cm NGM plates with OP50 at 20 °C, and three plates were seeded for each strain. The plated worms were scored after 2–3 d, and worms at or past the L2 stage were scored as surviving worms. For each experiment, three plates were analyzed for each strain, the percentage of surviving worms was calculated for each plate, and the percentages for each strain were averaged. Survival curves were statistically analyzed as previously described68.

Pumping assay. At least 30 young adults for each strain (wild-type, pks-1, and nrps-1) were analyzed on a lawn of OP50 by counting the number of pharyngeal pumps per 30 s under a dissecting microscope at room temperature.

Phylogeny and domain analysis. Phylogenetic trees were generated in MEGA 6, and domain analysis was performed using antiSMASH 3.0.69,70 Protein sequences were retrieved from NCBI or WormBase ParaSite. Phylogeny and protein domain analysis was performed for PKS-1 homologs (a) and NRPS-1 homologs (b) in the following nematode species: Ancylostoma ceylanicum (a, EYC37444.1; b, EYB85901.1), Ancylostoma duodenale (a, KIH69030.1; b, KIH67424.1), Ascaris suum (a, PRJNA80881; b, GS_05892), Brugia malayi (a, CDQ05007.1; b, XP_001901640.1), Bursaphelenchus xylophilus (a, BUX.s00713.159; b, BUX.gene.s01513.336), Caenorhabditis angaria (a, Cang_2012_03_13_00116.g4813; b, Cang_2012_03_13_00228.g7416), C. brenneri (a, EGT30644.1; b, EGT46479.1), C. briggsae (a, EGT30644.1; b, CAP32083.2), C. elegans (a, NP_508923.2; b, CAC70135.3), C. japonica (a, CJA00126; b, CJA13923), C. remanei (a, XP_003118401.1; b, EFP02416.1), C. tropicalis (a, Csp11.Scaffold626.g6628; b, Csp11.Scaffold488.g2019), Dirofilaria immitis (a, nDi.2.2.2.g06619; b, nDi.2.2.2.g03539), Haemonchus contortus (a, CDJ83277.1; b, CDJ93083.1, CDJ93084.1, CDJ82649.1), Heterorhabditis bacteriophora (a, ACKM01001433.1; b, Hba_08702), Loa loa (a, EJD75257.1; b, EFO26749.2), Necator americanus (a, ETN74557.1; b, NECAME_19208, NECAME_19210), Oesophagostomum dentatum (a, KHJ99846.1; b, KHJ98077.1), Onchocerca volvulus (a, OVOC1839; b, OVOC7029), Pristionchus exspectatus (a, scaffold450-EXSNAP2012.7; b, scaffold1344-EXSNAP2012.3), P. pacificus (a, PPA23686; b, PPA07616, PPA07617, PPA31783; in Chapter 4, newly annotated genes are used: PPA31783 is obsolete and has been replaced by PPA07616, or Ppa-NRPS-1.2; PPA38771 and PPA07617 encode Ppa-NRPS-1.1 and Ppa-NRPS-1.3, respectively), Steinernema carpocapsae (a, L596_g18665.t1; b, L596_g20331.t1), Strongyloides stercoralis (a, SSTP_0001127100.1; b, SSTP_0000446000.1), Toxocara canis (a, KHN84567.1). If available, the GenBank accession number for the protein is listed, or, if not available, the protein name from WormBase ParaSite is listed. If a given species contained multiple proteins with homology to pks-1 and/or nrps-1, the domains were annotated for all of the proteins using antiSMASH71, but only the longest protein was used for generation of the phylogenetic tree. For the H. bacteriophora pks-1 homolog, DNA sequence rather than protein sequence was analyzed (by first converting it to protein sequence using antiSMASH71).

2.2.2 Screen for Pathogen Resistance

Bacterial and fungal strains in Table 2-2 were cultured in 5 mL of the corresponding medium at the indicated temperatures until they reached saturation. 10 μL of each culture was seeded onto a 3 cm petri dish with 3 mL NGM agar and allowed to sit inside the hood for at least five days before the assay. About 20-30 L4 worms (synchronized through egg lay) were moved onto the plate with bacteria or fungi and monitored for avoidance (number of worms on the lawn), survival (touch with a platinum wire to differentiate live from dead worms), and reproduction (number of progeny and their development) every 12 h at room temperature. Three to five replicate plates were analyzed for each worm strain with each bacterial or fungal strain. Worm strains used for screening were wild type (N2), RAB7 pks-1(ttTi24066), RAB9 pks-1(ok3769), RAB8 nrps-1(ttTi45552), RAB46 nrps-1(tm3704), and RAB10 pks-1(ttTi24066); nrps-1(ttTi45552).

Table 2-2. Bacterial and fungal strains used for screening

Name                         Origin             Culture Conditions
OP50                         CGC                37 °C, LB medium
JUb38                        Marie-Anne Felix   25 °C, LB medium
JUb39                        Marie-Anne Felix   25 °C, LB medium
JUb40                        Marie-Anne Felix   25 °C, LB medium
JUb41                        Marie-Anne Felix   25 °C, LB medium
JUb42                        Marie-Anne Felix   25 °C, LB medium
JUb43                        Marie-Anne Felix   25 °C, LB medium
JUb44                        Marie-Anne Felix   25 °C, LB medium
JUb45                        Marie-Anne Felix   25 °C, LB medium
JUb46                        Marie-Anne Felix   25 °C, LB medium
JUb47                        Marie-Anne Felix   25 °C, LB medium
JUb48                        Marie-Anne Felix   25 °C, LB medium
JUb49                        Marie-Anne Felix   25 °C, LB medium
JUb50                        Marie-Anne Felix   25 °C, LB medium
JUb51                        Marie-Anne Felix   25 °C, LB medium
JUb52                        Marie-Anne Felix   25 °C, LB medium
JUb126                       Marie-Anne Felix   25 °C, LB medium+Mannitol
JUb127                       Marie-Anne Felix   25 °C, LB medium+Mannitol
JUb128                       Marie-Anne Felix   25 °C, LB medium+Mannitol
PA14                         Butcher lab        37 °C, LB medium
PA14 gacA                    Ausubel Lab        37 °C, LB medium+Kanamycin
Streptomyces avermitilis     ATCC 31267         26 °C, ATCC medium 184
Serratia marcescens          ATCC 8100          26 °C, ATCC medium 3
Staphylococcus epidermidis   Butcher lab        37 °C, LB medium
Bacillus subtilis            ATCC 55422         30 °C, ATCC medium 3/18
Photorhabdus luminescens     Butcher lab        30 °C, LB medium
CBX102                       Butcher lab        37 °C, LB medium
B20                          Butcher lab        37 °C, LB medium
Candida guilliermondii       ATCC               30 °C, YPD medium
C. albicans DAY185           Ausubel Lab        30 °C, YPD medium
SC5314                       Soll Lab           30 °C, YPD medium

2.2.3 Methods for PKS-1b Verification and Gene Cloning

C. elegans wild-type (N2) cDNA was generated as described above and used to verify the presence of both pks-1a and pks-1b. Primers were designed at the boundary of pks-1a and pks-1b to generate isoform-specific primer pairs. The pks-1a-specific primer pairs were F4 (GCTTTTGCACGATTCGCTTG)/R1 (CTTCATGACCACAATTTCTCGC) or F5 (CTCTTCCCTATAACCAGATAAGCG)/R3 (CCAATACCGAGCATCAGAGTC); the pks-1b-specific primer pairs were F3 (GAAAACCGAACCAGTGCATAC)/R1 or F2 (TCAATGTTGCTTTACTTGCCG)/R1. The PCR program was 50 °C for 2 min and 95 °C for 2 min, followed by 40 cycles of 95 °C for 15 s and 60 °C for 1 min, using SYBR Green Mix (Invitrogen). The expected lengths of the PCR products were: F4/R1, 175 bp; F5/R3, 143 bp; F3/R1, 131 bp; F2/R1, 197 bp.

The pks-1b cDNA (13.755 kb) was cloned by amplifying the upstream portion (~7 kb) and the downstream portion (~6.755 kb) separately with Phusion polymerase (New England Biolabs). The primers used for upstream cloning were pks1b15TVL_F (ttgtatttccagggcATGAGCGCATGTCTTGTTGGAAGCT) and pks1b13kb_MR (CGCTAGCAATATTGCAGATCGGAAC); the primers used for downstream cloning were pks1b15TVL_R (caagcttcgtcatcaTTAAGCCTGCTCGAAATATGGCTGG) and pks1b13kb_MF (GTTCCGATCTGCAATATTGCTAGCG). The In-Fusion HD cloning kit (Takara) was used to join these two pks-1b fragments with the vector p15TV-L, which is a derivative of pET15b designed for In-Fusion cloning72. p15TV-L was digested with the BseRI restriction enzyme (New England Biolabs) before the In-Fusion reaction with the two fragments. The two pks-1b fragments were first incubated with the In-Fusion HD mix at 50 °C for 15 min. BseRI-digested p15TV-L was then added, the incubation was continued for another 15 min, and the reaction was transformed into Stellar competent cells (Takara). Plasmids were verified by digestion with HindIII, NdeI, or BseRI and further confirmed by sequencing.
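The primer design described above can be checked with a short script. The following is a minimal sketch, not the procedure used in this work: it assumes that the lowercase 5' portion of each outer primer is the vector-homology tail required for In-Fusion cloning, and the helper names (reverse_complement, vector_tail) are illustrative only. It confirms that the two junction primers are exact reverse complements of each other, so the upstream and downstream fragments share a 25-bp overlap.

```python
# Primer sequences are copied from the text above.
# Assumption: lowercase bases are the vector-homology tails for In-Fusion cloning.
PRIMERS = {
    "pks1b15TVL_F": "ttgtatttccagggcATGAGCGCATGTCTTGTTGGAAGCT",
    "pks1b13kb_MR": "CGCTAGCAATATTGCAGATCGGAAC",
    "pks1b13kb_MF": "GTTCCGATCTGCAATATTGCTAGCG",
    "pks1b15TVL_R": "caagcttcgtcatcaTTAAGCCTGCTCGAAATATGGCTGG",
}

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, preserving case."""
    return seq.translate(COMPLEMENT)[::-1]


def vector_tail(primer):
    """Return the leading lowercase run of a primer (the assumed vector tail)."""
    tail = []
    for base in primer:
        if not base.islower():
            break
        tail.append(base)
    return "".join(tail)


# The junction primers should anneal to the same site on opposite strands.
junction_is_complementary = (
    reverse_complement(PRIMERS["pks1b13kb_MF"]) == PRIMERS["pks1b13kb_MR"]
)
junction_overlap_bp = len(PRIMERS["pks1b13kb_MF"])

# Length of the assumed vector-homology tail on each primer.
tail_lengths = {name: len(vector_tail(seq)) for name, seq in PRIMERS.items()}

# Gene-specific parts of the outer primers: start codon and (complemented) stop codon.
forward_start = PRIMERS["pks1b15TVL_F"][15:18]
reverse_stop = reverse_complement(PRIMERS["pks1b15TVL_R"][15:18])

print(junction_is_complementary, junction_overlap_bp, tail_lengths)
print(forward_start, reverse_stop)
```

Under these assumptions, each outer primer carries a 15-nt tail, the forward gene-specific sequence begins at the ATG start codon, and the reverse gene-specific sequence ends the coding strand at a TAA stop codon.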

2.3 Results

Although PKS and NRPS genes are commonly found in many bacterial and fungal species, only simple, single-module PKSs and NRPSs are present in a few animal species18,19. Thus, it is quite remarkable that the genome of the nematode C. elegans encodes a huge (865 kDa), multi-module hybrid PKS/NRPS on the X chromosome (PKS-1) and a large (333 kDa), multi-module NRPS on chromosome III (NRPS-1)40,41. Homologs of PKS-1 and NRPS-1 are present in most nematode species, including parasitic ones (Figure 2-2), suggesting their conserved roles.


Figure 2-2. Phylogeny and domain analysis of A) PKS-1 homologs and B) NRPS-1 homologs. Domains depicted include ketosynthase (KS, pink), acyl carrier protein (ACP, grey), ketoreductase (KR, green), acyl transferase (AT, yellow), peptidyl carrier protein (PCP, grey), condensation (C, light blue), adenylation (A, dark purple), and thioesterase (TE, light purple).

2.3.1 Site of nrps-1 and pks-1 Expression in C. elegans

To determine the site of nemamide biosynthesis in C. elegans, we generated transcriptional reporter strains that express GFP under the control of the pks-1 or nrps-1 promoter. Both genes are expressed during all larval stages and the adult stage, specifically in the CAN neurons, two essential neurons with a poorly defined role that extend the length of the worm and are closely associated with the excretory canals, which play a role in osmoregulation73-75 (Figure 2-3). This expression pattern was verified by using an RFP reporter under the control of a CAN-specific promoter (Figure 2-3). The coexpression of nrps-1 and pks-1 in the same tissue in C. elegans is consistent with the genes working together to biosynthesize the nemamides.

Figure 2-3. Expression of the transcriptional reporters A) pks-1p::gfp or B) nrps-1p::gfp, as well as canp::rfp (a marker for the CAN neurons), in transgenic worms. Scale bar, 20 μm. Similar results were obtained for the pks-1p::gfp-pest and nrps-1p::gfp-pest reporters, in which GFP undergoes rapid turnover64.

2.3.2 Nemamides Promoting Arrested Larval Survival

Given that pks-1 and nrps-1 are expressed neuronally, we speculated that the nemamides might play a signaling role in development. With sufficient food (bacteria), C. elegans progresses from the egg through four larval stages (L1-L4) to the adult. However, if C. elegans eggs hatch to L1 larvae in the complete absence of food, the L1 larvae arrest and then resume development upon addition of food9. We placed wild-type, pks-1, and nrps-1 arrested L1s on food, monitored the time that it took for the arrested L1s to recover and develop to L4s, and found that the mutants recover more slowly (Figure 2-4). The pks-1 and nrps-1 mutants progress from the egg to the L4 stage at the same rate as wild-type worms, so the delay is not due to alkaline-bleach treatment or a general delay in developmental rate (Figure 2-5). Thus, the mutants do not have a general defect in larval progression or development, but rather a specific defect in recovery from starvation-induced L1 arrest. Additionally, the pks-1 and nrps-1 mutants enter the dauer larval stage76 in response to dauer pheromone and recover from dauer once the pheromone is removed as well as wild type does (Figure 2-6). M-cell arrest in arrested L1s is an indication that the worm has properly arrested somatic progenitor cell division during starvation. Certain mutants in the insulin/IGF-1 pathway, such as daf-16/foxo, undergo improper M-cell division during L1 arrest60,66. Fluorescence images show that the M-cell (identified using the M-cell-specific reporter hlh-8p::gfp) does not divide during L1 arrest in wild-type, pks-1, and nrps-1 worms (Figure 2-7).

Certain mutants, such as daf-18/pten, undergo improper germline proliferation during L1 arrest, leading to fertility defects once the L1s recover and develop to the adult stage77,78. The absence of fertility defects in the pks-1 and nrps-1 mutants suggests that they maintain proper germline arrest during starvation-induced L1 arrest (Figure 2-8).


Figure 2-4. Recovery of arrested L1s and development to the L4 stage for wild-type, pks-1, and nrps-1 worms at different temperatures. The data represent the mean ± SD of three independent experiments, and two-tailed, unpaired t-tests were applied (*P ≤ 0.05, **P ≤ 0.01, ***P ≤ 0.001).

Figure 2-5. Comparison of development of wild-type, pks-1, and nrps-1 worms at different temperatures. A) Development of eggs (obtained through alkaline-bleach treatment of gravid adults) to the L4 stage. B) Development of eggs (obtained by allowing gravid adults to lay eggs) to the L4 stage. The data represent the mean ± SD of two independent experiments, and two-tailed, unpaired t-tests were applied. All P values were non-significant except as indicated (*P ≤ 0.05).


Figure 2-6. Dauer formation and recovery in wild-type, pks-1, and nrps-1 worms. A) Dauer formation in wild-type, pks-1, and nrps-1 worms exposed to 1 µM asc-C6-MK (ascr#2) in the dauer formation assay at 25 °C. B) Recovery of wild-type, pks-1, and nrps-1 dauers after being placed on a lawn of OP50 bacteria for 24 h at 20 °C. The data represent the mean ± SD of two (A) or four (B) independent experiments, and two-tailed, unpaired t-tests showed that there is no significant difference between the wild type and the mutants.

Figure 2-7. M-cell imaging in wild-type, pks-1, and nrps-1 backgrounds.


Figure 2-8. Fertility and brood size in wild-type, pks-1, and nrps-1 worms that experienced extended L1 arrest. Percent fertility A) and brood size B) after L1s were subjected to five days of L1 arrest and then allowed to recover and develop into adults on food. Data represent the mean ± SD of five independent experiments (n = 30) A) or two independent experiments (n = 20) B). Two-tailed, unpaired t-tests showed that there is no significant difference between the wild type and the mutants.

Like wild-type worms, the pks-1 and nrps-1 mutants maintain proper somatic progenitor cell and germline arrest during L1 arrest. Thus, although the pks-1 and nrps-1 mutants are defective in recovery from L1 arrest, they are not defective in L1 arrest initiation and maintenance66.

The insulin/IGF-1 pathway is an important regulator of L1 arrest and recovery, and specific insulins are down-regulated upon L1 arrest and then up-regulated following food addition9,61,79. To determine whether the nemamides affect insulin expression, we profiled the expression of all 40 C. elegans insulins by qRT-PCR during L1 arrest and recovery. In the pks-1 and nrps-1 arrested L1s, ins-4, ins-5, ins-19, and ins-37 are expressed at higher levels than in wild-type arrested L1s. daf-28 is also expressed at higher levels in pks-1 arrested L1s than in wild-type arrested L1s. Conversely, ins-33 is expressed at lower levels in pks-1 and nrps-1 arrested L1s than in wild-type arrested L1s (Figure 2-9). Higher levels of expression of ins-4 and daf-28 in arrested L1s have been associated with reduced L1 arrest survival, and deletion of ins-4 and daf-28 has been associated with increased L1 arrest survival61.

Furthermore, unlike in wild type, in the pks-1 and nrps-1 backgrounds, ins-5 and ins-19 are not induced (or not induced as much) during L1 recovery and ins-4 and ins-37 are down-regulated during L1 recovery (at least not at 6 h post-recovery) (Figure 2-10). The production of the nemamides decreases during L1 recovery (Figure 2-11). Thus, our data suggest that the nemamides are negative regulators of the expression of specific insulins, such that expression of these insulins increases as nemamide levels decrease during L1 recovery.


Figure 2-9. Expression of insulins in wild-type, pks-1, and nrps-1 arrested L1s relative to wild-type arrested L1s, as determined by qRT-PCR. Data represent the mean ± SD of three independent experiments. Two-tailed, unpaired t-tests were used to determine statistical significance (*P ≤ 0.05, **P ≤ 0.01, ***P ≤ 0.001, ****P ≤ 0.0001).


Figure 2-10. Expression of insulins in recovered versus arrested wild-type, pks-1, and nrps-1 L1s, as determined by qRT-PCR. Data represent the mean ± SD of three independent experiments. Two-tailed, unpaired t-tests were used to determine statistical significance (*P ≤ 0.05, **P ≤ 0.01, ***P ≤ 0.001, ****P ≤ 0.0001).


Figure 2-11. Nemamide production in arrested and recovered L1s. Levels of nemamides A and B in arrested L1s and recovered L1s (6 h after addition of food). Data represent the mean ± SD of four independent experiments. Two-tailed, unpaired t-tests were used to determine statistical significance (*P ≤ 0.05).

To determine whether the nemamides play a role in survival during L1 arrest, we maintained wild-type, pks-1, and nrps-1 arrested L1s without food and monitored survival daily by placing an aliquot of the L1s on food and determining whether they could progress beyond the L1 stage. The pks-1 and nrps-1 mutants show reduced survival during prolonged L1 arrest (Figure 2-12). In addition to the pks-1(ttTi24066) and nrps-1(ttTi45552) strains, we also tested L1 survival in the pks-1(ok3769) strain and the pks-1(ttTi24066); nrps-1(ttTi45552) double mutant strain and obtained similar results. That is, no statistically significant difference in mean survival was found among the tested mutants (Figure 2-13).


Figure 2-12. Survival of wild-type, pks-1, and nrps-1 arrested L1s over time. The insulin/IGF-1 pathway controls L1 survival in a manner dependent on the downstream daf-16/foxo transcription factor9,60. The poor survival of daf-16/foxo (mu86 null) was slightly enhanced by the pks-1 and nrps-1 mutations. Mean survival (days ± SE): 14.3±0.2 for wild type, 11.3±0.3 for pks-1, 11.0±0.3 for nrps-1, 4.4±0.1 for daf-16, 3.7±0.1 for pks-1; daf-16, and 3.5±0.1 for nrps-1; daf-16. In c-e, the data represent the mean ± SD of three independent experiments, and two-tailed, unpaired t-tests were applied (*P ≤ 0.05, **P ≤ 0.01, ***P ≤ 0.001).

Figure 2-13. L1 survival for wild-type and different mutant strains. The mean ± SD of three independent experiments is plotted. Mean survival (days ± SE): 12.2±0.3 for wild type, 9.4±0.4 for pks-1(ttTi24066), 8.9±0.4 for nrps-1(ttTi45552), 7.7±0.4 for pks-1(ok3769), and 8.7±0.5 for pks-1(ttTi24066); nrps-1(ttTi45552). A two-tailed, unpaired t-test was used to determine statistical significance (*P ≤ 0.05, **P ≤ 0.01, ***P ≤ 0.001).


Although it has been shown that L1 survival is density-dependent and that L1s secrete unidentified small molecules that increase survival80, the nemamides are unlikely to be a component of this pheromone, since the mutants showed reduced survival relative to wild type regardless of worm density (Figure 2-14). C. elegans mutants that eat less display reduced survival during L1 arrest9. However, the pks-1 and nrps-1 mutants are not defective in bacterial food consumption or pharynx pumping (Figure 2-15 and Figure 2-16), and thus, their reduced survival is not simply due to reduced nutrient stores. Because the insulin/IGF-1 pathway regulates L1 survival, we investigated the genetic interactions between this pathway and the nemamide pathway. UNC-31 regulates insulin secretion and acts upstream of the insulin/IGF-1 pathway, which controls L1 survival in a manner dependent on the daf-16/foxo transcription factor9,60,79. The unc-31(e928 null) mutation significantly, but not completely, suppressed the reduced survival of the pks-1 and nrps-1 mutants. Thus, the nemamides likely extend L1 survival by negatively regulating both UNC-31-mediated insulin signaling and UNC-31-independent pathways. Survival assays in insulin/IGF-1 pathway mutant backgrounds suggest that although nemamide signaling regulates insulin expression, it functions at least partially independently of the insulin/IGF-1 pathway (Figure 2-12 and Figure 2-17)9.


Figure 2-14. L1 survival for wild-type, pks-1, and nrps-1 worms at low and high population densities. Survival assays were performed at 25 °C. The mean ± SD of three independent experiments are plotted. Mean survival (days ± SE) was calculated as described in Methods: 8.3±0.2 for wild type/high, 6.4±0.4 for pks-1/high, 5.9±0.4 for nrps-1/high, 5.2±0.3 for wild type/low, 2.8±0.4 for pks-1/low, and 2.9±0.4 for nrps-1/low. A two-tailed, unpaired t-test was used to determine statistical significance (*P ≤ 0.05, **P ≤ 0.01, ***P ≤ 0.001).


Figure 2-15. Feeding rate of wild-type, pks-1, and nrps-1 worms. Ten wild-type, pks-1, or nrps-1 worms at the L4 stage were transferred to NGM-agar plates (containing 50 µM 5-fluoro-2’-deoxyuridine to prevent egg development) with a lawn of OP50 bacteria. The plates were incubated at 20 °C. The rate at which the bacterial lawn was consumed was monitored over time, and no differences between the worm strains were observed. Photos of the plates with the wild-type (a), pks-1 (b), and nrps-1 (c) worms were taken after 6 d. Three replicates were done for each strain.


Figure 2-16. Pharynx pumping rate of wild-type, pks-1, and nrps-1 worms. Data represent the mean ± SD of two independent experiments. Two-tailed, unpaired t-tests showed that there is no significant difference between the wild type and the mutants.

Figure 2-17. Effect of an unc-31(e928 null) mutation on survival of arrested L1s. Survival assays were performed at 20 °C. Mean survival (days ± SE) was calculated as described in Methods: 14.3 ± 0.2 for wild type, 10.0 ± 0.2 for pks-1, 10.9 ± 0.2 for nrps-1, 17.5 ± 0.2 for unc-31, 13.1 ± 0.3 for pks-1; unc-31, and 13.9 ± 0.2 for nrps-1; unc-31. Data represent the mean ± SD of three independent experiments. A two-tailed, unpaired t-test was used to determine statistical significance (*P ≤ 0.05, **P ≤ 0.01, ***P ≤ 0.001).
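The partial suppression by unc-31 can be illustrated with simple arithmetic on the mean survival values listed in the Figure 2-17 caption. The sketch below is only an illustration: it ignores the standard errors and the statistical tests, and the function name suppression_summary is hypothetical.

```python
# Mean survival (days) copied from the Figure 2-17 caption; SE values are ignored here.
mean_survival = {
    "wild type": 14.3,
    "pks-1": 10.0,
    "nrps-1": 10.9,
    "unc-31": 17.5,
    "pks-1; unc-31": 13.1,
    "nrps-1; unc-31": 13.9,
}


def suppression_summary(single, double, control="unc-31"):
    """Return (gain, shortfall) in days for a double mutant.

    gain: how much longer the double mutant survives than the single mutant.
    shortfall: how far the double mutant remains below the unc-31 single mutant.
    """
    gain = round(mean_survival[double] - mean_survival[single], 1)
    shortfall = round(mean_survival[control] - mean_survival[double], 1)
    return gain, shortfall


for single in ("pks-1", "nrps-1"):
    gain, shortfall = suppression_summary(single, f"{single}; unc-31")
    print(f"{single}: +{gain} d with unc-31, still {shortfall} d below unc-31 alone")
```

A positive gain together with a positive shortfall corresponds to suppression that is substantial but incomplete, which is the pattern expected if the nemamides act through both UNC-31-dependent and UNC-31-independent pathways.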


2.3.3 Screening by Pathogen Avoidance Assay and Killing Assay

To investigate whether the nemamides or the nemamide biosynthetic genes are involved in pathogen defense, we screened about 30 bacterial and fungal strains (Table 2-2) for their effects on the avoidance behavior and survival of wild-type worms and mutants of the nemamide biosynthetic genes. Avoidance behavior was scored as the number of worms remaining on the lawn. As shown in Figure 2-18, all of the worm strains preferred to stay on the OP50 lawn, but they avoided both B20 and PA14. PA14 has been shown to secrete the metabolites pyochelin and phenazine-1-carboxamide, which activate the expression of daf-7 in the ASJ chemosensory neuron81. Because the mutants avoided PA14 and B20 as wild-type worms did, these results indicate that the nemamides are not involved in avoidance of PA14 or B20.

Figure 2-18. Avoidance behavior of different worm strains on OP50, B20, and PA14 after 24 h.

We also found that two pathogenic yeasts, Candida albicans DAY185 and Candida guilliermondii, and one bacterial strain, Staphylococcus epidermidis, could kill certain worm strains, most notably the pks-1(ok3769) deletion mutant and the pks-1(ttTi24066); nrps-1(ttTi45552) double Mos1 insertion mutant (Figure 2-19). Wild-type worms and the nrps-1(ttTi45552) mutant were strongly resistant to the three pathogens and retained the ability to produce progeny. In contrast, the pks-1 single mutant and the double mutant were sensitive to these pathogens (Figure 2-20). Worms lacking functional pks-1 produced fewer progeny after infection by C. albicans DAY185 (Figure 2-21). Since pks-1 mutant worms are very sensitive to C. albicans, we propose that pks-1 might play an important role in defense against bacterial and fungal pathogens.

Figure 2-19. Killing phenotype of different worm strains on A) Candida albicans DAY185, B) Candida guilliermondii, and C) Staphylococcus epidermidis after 5 days of infection.


Figure 2-20. Killing assay of pks-1 and nrps-1 worms on C. albicans DAY185 for 48 h. Arrows indicate seeded worms.


Figure 2-21. Worm reproduction after 48h infection by C. albicans DAY185.

2.3.4 Verification of the Presence of pks-1b

We speculate that pks-1 might play a role in pathogen defense that is separate from that of nrps-1 and may produce natural products on its own that differ from the nemamides. According to WormBase, the pks-1 transcript has two splice variants, pks-1a and pks-1b. To confirm the presence of pks-1b, we designed splice variant-specific primers to amplify pks-1a or pks-1b fragments and confirmed the PCR products by sequencing. As shown in Figure 2-22, both pks-1a- and pks-1b-specific fragments could be amplified and confirmed, indicating that the pks-1b transcript is present.


Figure 2-22. Presence of pks-1a and pks-1b splice variants. A) Scheme of the pks-1a and pks-1b transcripts and location of the primers used for PCR. B) PCR products: pks-1a-specific fragments F4/R1 (175 bp) and F5/R3 (143 bp); pks-1b-specific fragments F3/R1 (131 bp) and F2/R1 (197 bp).

2.3.5 Gene Cloning of pks-1b

To enable identification of the product of PKS-1b, we cloned its cDNA sequence using the In-Fusion HD cloning method. The constructed plasmid was verified by sequencing and by restriction enzyme digestion (HindIII, NdeI, or BseRI), as shown in Figure 2-23.


Figure 2-23. Verification of p15TV-L pks1b by HindIII, NdeI, and BseRI digestion. 1 kb DNA ladder, from top to bottom: 10 kb, 8 kb, 6 kb, 5 kb, 4 kb, 3 kb (brightest), 2 kb, 1.5 kb, 1 kb, 500 bp. HindIII digest fragments: 6.587 kb, 5.418 kb, 4.077 kb, 1.904 kb, 1.474 kb. NdeI digest fragments: 8.599 kb, 6.767 kb, 2.06 kb, 1.08 kb, 618 bp, 361 bp. BseRI digest fragments: 16.262 kb, 1.239 kb, 928 bp, 752 bp, 306 bp.
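As a quick consistency check on the predicted digests, the fragment lengths for each enzyme can be summed, since a complete digest of a circular plasmid should yield fragments that add up to the plasmid length. The sketch below uses only the fragment sizes listed in the Figure 2-23 caption; the variable names are illustrative, and the source does not state why the three totals differ slightly.

```python
# Predicted fragment sizes (bp) copied from the Figure 2-23 caption.
digests_bp = {
    "HindIII": [6587, 5418, 4077, 1904, 1474],
    "NdeI": [8599, 6767, 2060, 1080, 618, 361],
    "BseRI": [16262, 1239, 928, 752, 306],
}

# For a circular plasmid cut to completion, each total should approximate the plasmid length.
totals_bp = {enzyme: sum(fragments) for enzyme, fragments in digests_bp.items()}

# Largest disagreement between the three totals.
spread_bp = max(totals_bp.values()) - min(totals_bp.values())

for enzyme, total in totals_bp.items():
    print(f"{enzyme}: {len(digests_bp[enzyme])} fragments, total {total} bp")
print(f"spread between totals: {spread_bp} bp")
```

The three totals agree to within a few tens of base pairs, a difference that would not be resolvable on an agarose gel.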

2.4 Discussion and Future Work

The mechanisms by which animals control their development and physiology in response to nutrient fluctuations are poorly understood. The nemamides could potentially serve as a chemical tool with which to dissect this process. We show that the nemamides are important for survival during and recovery from starvation-induced larval arrest. The nemamides likely influence larval development in C. elegans in part by modulating insulin signaling (Figure 2-24). The nemamides represent the first polyketide-nonribosomal peptides biosynthesized in an assembly-line manner in a metazoan. Their discovery will enable the exploration of polyketide and nonribosomal peptide biosynthesis in the context of a complex animal system. Nemamide biosynthesis likely requires additional enzymes that act in trans. Future studies of the site of expression and regulation of these enzymes could potentially provide additional insights into the biological role and site of action of the nemamides. As the nemamide biosynthetic genes are found in most nematode species, including parasitic ones, the role of the nemamides in larval development is likely conserved across nematode evolution.

Figure 2-24. Model for the role of the nemamides in L1 arrest and survival.

We also showed that certain pks-1 mutants were very sensitive to pathogenic species such as C. albicans. It would be very intriguing to know if other pathogens have similar killing effects on additional pks-1 mutants. On the other hand, it would also be interesting to know how expression of nemamide biosynthetic genes changes upon fungal infection. Some other questions include where GFP-labeled C. albicans is located inside wild-type and mutant worms, and whether extracts from C. albicans culture have any killing effects on pks-1 mutants.


CHAPTER 3 NONCANONICAL FEATURES IN BIOSYNTHESIS OF THE NEMAMIDES

3.1 Proposed Nemamide Biosynthetic Pathway

Based on the domain architectures of PKS-1 and NRPS-1,* together with the structures of the natural products, we proposed a preliminary biosynthetic pathway for the nemamides (Figure 3-1). Biosynthesis begins on PKS-1, which initially extends the growing natural product through six iterative cycles, then uses two additional PKS modules to further extend the polyketide in an assembly-line manner, and then uses the C-terminal NRPS module to incorporate β-alanine. Next, the growing natural product is passed to NRPS-1, which sequentially adds D-Asn, D-Asn, and L-Asn. Finally, the product is passed to the terminal TE domain, which catalyzes the formation of the macrolactam ring. The biosynthetic pathway, however, has several noncanonical features: (1) KR domains (specifically, KR2 and KR3) that cannot be classified as A-type or B-type, a classification that would have enabled their stereospecificities to be predicted (KR3 is predicted to be inactive); (2) missing enzymatic domains, such as methyltransferase and aminotransferase domains, that are likely encoded elsewhere in the C. elegans genome; (3) adenylation domains with protein sequences that diverge significantly from those of bacterial and fungal adenylation domains (given that the nemamides contain four amino acids and that PKS-1 and NRPS-1 together have only three A domains, A2 or A3 may act twice to incorporate two Asn residues); (4) the absence of any obvious epimerase domains despite the presence of D-Asn in the nemamides; (5) a chain-terminating thioesterase domain that is present not only at the C-terminus of NRPS-1 but also, unusually, at the C-terminus of PKS-1; (6) the absence of an enoyl reductase in PKS-1, even though the nemamide structure contains several saturated C-C bonds; and (7) the need to pass the product of PKS-1 to NRPS-1, even though no obvious carrier protein domain exists at the N-terminus of NRPS-1.

* Parts adapted with permission from Shou, Q., Feng, L., Long, Y., Han, J., Nunnery, J.K., Powell, D.H., Butcher, R.A. A hybrid polyketide-nonribosomal peptide in nematodes that promotes larval survival. Nat Chem Biol 12, 770-772 (2016). Copyright 2016 Springer Nature.

Figure 3-1. Proposed biosynthetic pathway of nemamide A in C. elegans. Domain abbreviations: acyl transferase (AT), acyl carrier protein (ACP), ketosynthase (KS), ketoreductase (KR), dehydratase (DH), methyltransferase (MT), aminotransferase (AMT), adenylation (A), peptidyl carrier protein (PCP), condensation (C), and thioesterase (TE). Domains in PKS-1 labeled with an asterisk are predicted to be inactive based on the nemamide structures. The KR and A domains are labeled with numbers for further discussion.

3.2 Domain Properties of PKS-1 and NRPS-1

3.2.1 Ketosynthase (KS) Domains

A protein sequence alignment of all of the KS domains in PKS-1 with several known starter-loading KSQ domains, which have an active-site glutamine and decarboxylate malonyl-CoA or methylmalonyl-CoA, is shown in Figure 3-2. Unlike the KSQ domains, the KS domains in PKS-1 retain the conserved cysteine that catalyzes both decarboxylation of the monomeric units and the Claisen reaction82-84. Thus, PKS-1_KS1 might incorporate another kind of starter unit, other than malonyl-CoA or methylmalonyl-CoA.

Figure 3-2. Sequence alignment of PKS-1_KS domains with KSQ domains. Sequences used for KSQ are from Pikromycin (Pik, GenBank accession no. LN881739), Amphotericin (Amp, WP_052453973), Spinosyn (Spi, AY007564), Monensin (Mon, AF440781), Niddamycin (Nid, AF016585), and Tylosin (Tyl, U78289). The active-site residue, cysteine (C) or glutamine (Q), is marked with an asterisk. The alignment was generated with ClustalW and ESPript 3.0. The PKS-1 protein sequences used here are: PKS-1_KS1 (125-182), PKS-1_KS2 (948-1005), PKS-1_KS3 (2016-2074), PKS-1_KS4 (4362-4421), PKS-1_KS5 (5364-5423).

3.2.2 Carrier Protein (CP) Domain

Alignments of the carrier protein domains (ACP or PCP) in PKS-1 and NRPS-1 (Figure 3-3 and Figure 3-4) show that they all retain the conserved serine residue to which the phosphopantetheine arm is attached posttranslationally. The characteristic motif of ACPs is GX(H/D)S(L/I), and the conserved motif in PCPs is DXFFXLGGDSL, where X indicates any residue20,85,86. The attachment of phosphopantetheine is catalyzed by a phosphopantetheinyl transferase (PPTase)87, and two C. elegans genes encode this type of enzyme, T04G9.4 and T28H10.1. A PPTase is also required for fatty acid biosynthesis, and since T04G9.4 has been shown to be essential by RNAi, it may be involved in this process, leaving T28H10.1 as the candidate for involvement in nemamide biosynthesis.
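The two carrier protein motifs can be written as regular expressions to locate the candidate phosphopantetheine attachment serine in a protein sequence. The following is a minimal sketch and not the alignment-based analysis used for Figures 3-3 and 3-4. The two test sequences are made-up toy peptides, not PKS-1 or NRPS-1 sequences, and the function name find_ppant_serines is hypothetical.

```python
import re

# Motifs from the text, with X (any residue) written as "." in regex syntax.
# The second tuple element is the 0-based offset of the conserved serine within the motif.
MOTIFS = {
    "ACP": (re.compile(r"G.[HD]S[LI]"), 3),    # GX(H/D)S(L/I)
    "PCP": (re.compile(r"D.FF.LGGDSL"), 9),    # DXFFXLGGDSL
}


def find_ppant_serines(sequence, kind):
    """Return 1-based positions of the motif serine for each non-overlapping match."""
    pattern, serine_offset = MOTIFS[kind]
    return [m.start() + serine_offset + 1 for m in pattern.finditer(sequence)]


# Hypothetical toy sequences used only to exercise the patterns.
toy_acp = "MKLEDLGLDSLDTVEL"
toy_pcp = "AADNFFELGGDSLKA"

print(find_ppant_serines(toy_acp, "ACP"))
print(find_ppant_serines(toy_pcp, "PCP"))
print(find_ppant_serines(toy_pcp, "ACP"))
```

Because the PCP motif ends in GGDSL, it also satisfies the shorter ACP pattern, so a motif match alone does not distinguish the two carrier protein types; the surrounding domain context is still needed.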


Figure 3-3. Sequence alignment of PKS-1_ACP domains with Bacillus subtilis ACP (GenBank accession no. P80643, PDB ID: 1HY8). The conserved active-site serine (S) for posttranslational phosphopantetheine attachment is marked with an asterisk. Protein sequences used here are: PKS-1_ACP1 (719-776), PKS-1_ACP2 (1776-1833), PKS-1_ACP3 (2792-2846), PKS-1_ACP4 (2940-2997), PKS-1_ACP5 (3471-3526), PKS-1_ACP6 (5154-5207). The alignment was generated with ClustalW and ESPript 3.0.

Figure 3-4. Sequence alignment of PKS-1_PCP domains, NRPS-1_PCP domains, and Yer_PCP1 (Yersiniabactin synthetase PCP1 domain, GenBank accession no. Q7CI41, PDB ID: 5U3H). The asterisked serine (S) residue for posttranslational phosphopantetheine attachment is located in the conserved PCP motif DXFFXLGGDSL. Protein sequences are PKS-1_PCP1 (6462-6521), PKS-1_PCP2 (7424-7481), NRPS-1_PCP3 (1289-1346), NRPS-1_PCP4 (1700-1758), NRPS-1_PCP5 (2648-2705). The alignment was generated with ClustalW and ESPript 3.0.

3.2.3 Acyltransferase (AT) Domains

The purpose of the AT domain is to select starter units or malonyl-CoA derivatives and load them onto an ACP domain. PKS-1 has two obvious AT domains, AT4 and AT5, which have the conserved active-site motifs GXSXS and HAFH/YASH (Figure 3-5) and thus are likely active23,88. However, the N-terminal part of PKS-1 contains three potential, cryptic AT domains, AT1, AT2, and AT3. AT1 has a GKGLG motif but lacks the HAFH or YASH motif, so the function of the AT1 domain remains unclear, although structural modeling indicates that AT1 is likely an acyltransferase. For AT2 and AT3, BLAST results indicate that they might be acyltransferase-like domains, but they are embedded inside other domains, and no conserved active-site residues could be found.

Figure 3-5. Comparison of PKS-1_AT domains with AT0 loading acyltransferases. Sequences used for AT0 are from Pikromycin (Pik, GenBank accession no. LN881739), Amphotericin (Amp, WP_052453973), Spinosyn (Spi, AY007564), Monensin (Mon, AF440781), Niddamycin (Nid, AF016585), and Tylosin (Tyl, U78289). The catalytic residues serine (S) and histidine (H) are marked with asterisks. Sequences used for the AT domains in PKS-1 are: PKS-1_AT1 (557-704), PKS-1_AT4 (3719-3875), PKS-1_AT5 (5814-5964). The alignment was generated with Clustal Omega and ESPript 3.0.

3.2.4 Dehydratase (DH) Domains

DH domains in PKSs catalyze the dehydration of a β-hydroxyl group to form an α,β-unsaturated double bond through a catalytic His-Asp dyad present in a double-hotdog fold89. There are two DH domains in PKS-1, and both contain the catalytic residues (Figure 3-6). The first DH domain should be active, based on the presence of three or four double bonds in nemamide A or B. The second DH domain is proposed to be catalytically inactive, given the presence of the methoxy group in the nemamides and the fact that O-methylation requires a hydroxyl group.

Figure 3-6. Sequence alignment of PKS-1_DH domains with known DH domains. Sequences used are PKS-1_DH1 (1469-1748), PKS-1_DH2 (3940-4210), RzxB (GenBank accession no. Q4KCD8, PDB ID: 5IL6), MlnB (QIRS66, PDB ID: 5HST), CurF_DH (Curacin polyketide synthase CurF module, PDB ID: 3KG6), and CurH (Curacin polyketide synthase CurH module, PDB ID: 3KG7). The catalytic residues histidine (H) and aspartic acid (D) are marked with asterisks. The alignment was generated with Clustal Omega and ESPript 3.0.

3.2.5 Ketoreductase (KR) Domains

Whereas A-type KR domains catalyze the formation of an L-configured hydroxyl at the 3-position relative to the thioester in the growing polyketide, B-type KR domains catalyze the formation of a D-configured hydroxyl at the 3-position relative to the thioester90. There are three KR domains in PKS-1, and all contain the NADP+-binding motif GXXGXXG (Figure 3-7). The first two KR domains should be active, given the corresponding presence of hydroxyl groups in the nemamides, and the third KR domain might be inactive, given the presence of the amino group in the nemamides, which might have a keto group as its precursor. PKS-1 KR1 is a B-type KR domain, which provides further support for the assigned configuration at C-22 in nemamide A. Although PKS-1 KR1 has an LKD motif instead of an LDD motif, the LKD sequence is seen in the chicken FAS KR domain, which is presumed to be a B-type KR domain91. PKS-1 KR2 and KR3 do not have the characteristic residues of either an A-type or a B-type KR domain.

Figure 3-7. Alignment of the PKS-1 KR domains with bacterial KR domains. The three ketoreductase (KR) domains in PKS-1 (KR1, KR2, and KR3) were aligned with bacterial KR domains, SPN_KR3 and AMP_KR2 (A-type) and SPN_KR2 (B-type). Sequences were aligned with Clustal Omega92. KR domains are from the spinosyn (Spn) and amphotericin (Amp) PKSs. Red boxes indicate possible NADP-binding domains, red residues indicate catalytic residues, pink residues (“LDD”) are characteristic of B-type KR domains, and green residues (“W”) are characteristic of A-type KR domains.


3.2.6 Adenylation (A) Domains

A domains use ATP to activate amino acids as aminoacyl-AMPs and then load the aminoacyl group onto a PCP. The substrate specificity of the A domains from bacterial or fungal NRPSs can be predicted from the identities of 10 amino acids (as well as additional amino acids) that form a ‘selectivity code’ for the amino acid substrate93-95. The PKS-1 A domain (A1) is presumed to recognize β-Ala, and the NRPS-1 A domains (A2 and A3) are presumed to recognize L-Asn or D-Asn. The three A domains contain the conserved glycine residue in the flexible loop that is involved in the interaction with the pyrophosphate leaving group during the loading of the amino acid96 (Figure 3-8). However, all of these A domains possess amino acid sequences that diverge significantly from those of bacterial A domains recognizing β-Ala and L-Asn (Table 3-1). Furthermore, although the nemamides contain two D-Asn residues, there are no obvious epimerization domains, and multiple D-amino acids have been detected inside the worm body97,98, leaving open the question of whether the relevant A domain(s) load L-Asn or D-Asn. In addition, we compared the selectivity codes of the A domains in most nematode species (Table 3-2 and Table 3-3). This comparison suggests that the A domains in PKS-1 homologs are conserved in most nematodes, whereas the A domains in NRPS-1 homologs are divergent and may incorporate different types of amino acids.


Figure 3-8. Sequence alignment of A domains with EntE_A (Enterobactin module E, PDB ID: 3RG2), SidN_A3 (A3 domain in Siderophore N synthetase, PDB ID: 3ITE), and Grs_A (Gramicidin S, PDB ID: 1AMU). Sequences used for the A domains are: PKS-1_A1 (7044-7167), NRPS-1_A2 (900-1019), NRPS-1_A3 (2274-2398). The conserved glycine residue in the flexible loop that interacts with the PPi (pyrophosphate) leaving group during amino acid loading is marked with an asterisk. The alignment was generated with Clustal Omega and ESPript 3.0.

Table 3-1. Comparison of the A domain selectivity codes. Selectivity codes for β-Ala and L-Asn in bacterial A domains are listed in red. The corresponding amino acids in the PKS-1_A1, NRPS-1_A2 and NRPS-1_A3 domains are listed for comparison.

Sequence position     235  236  239  278  299  301  322  330  331  517
β-Ala recognition     V    D    X    V    I    S    X    G    D    K
L-Asn recognition     D    L    T    K    L    G    E    V    G    K
PKS-1_A1              D    V    S    F    T    G    I    I    W    K
NRPS-1_A2             D    I    A    Y    Q    G    E    V    Y    K
NRPS-1_A3             D    N    L    L    V    G    N    A    F    K
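The divergence summarized in Table 3-1 can be quantified by counting the positions at which each nematode A domain matches the corresponding bacterial selectivity code. The sketch below is a simple position-by-position comparison, not a specificity prediction. It assumes that X in the bacterial β-Ala code denotes a position with no defined residue and therefore skips it; the function name code_identity is hypothetical.

```python
# Ten-residue selectivity codes copied from Table 3-1 (positions 235 to 517).
REFERENCE_CODES = {
    "beta-Ala": "VDXVISXGDK",
    "L-Asn": "DLTKLGEVGK",
}
NEMATODE_CODES = {
    "PKS-1_A1": "DVSFTGIIWK",
    "NRPS-1_A2": "DIAYQGEVYK",
    "NRPS-1_A3": "DNLLVGNAFK",
}


def code_identity(reference, query):
    """Return (matches, compared) over positions where the reference is not X."""
    if len(reference) != len(query):
        raise ValueError("selectivity codes must have the same length")
    pairs = [(r, q) for r, q in zip(reference, query) if r != "X"]
    matches = sum(r == q for r, q in pairs)
    return matches, len(pairs)


# Compare each C. elegans A domain with the code of its presumed substrate.
comparisons = {
    "PKS-1_A1 vs beta-Ala": code_identity(REFERENCE_CODES["beta-Ala"], NEMATODE_CODES["PKS-1_A1"]),
    "NRPS-1_A2 vs L-Asn": code_identity(REFERENCE_CODES["L-Asn"], NEMATODE_CODES["NRPS-1_A2"]),
    "NRPS-1_A3 vs L-Asn": code_identity(REFERENCE_CODES["L-Asn"], NEMATODE_CODES["NRPS-1_A3"]),
}

for label, (matches, compared) in comparisons.items():
    print(f"{label}: {matches}/{compared} positions identical")
```

By this count, PKS-1_A1 shares only the invariant lysine at position 517 with the bacterial β-Ala code, and NRPS-1_A2 is closer to the bacterial L-Asn code than NRPS-1_A3 is.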


Table 3-2. A domain selectivity codes in PKS-1_A1 domains of various nematodes.

Species             235  236  239  278  299  301  322  330  331  517
C. elegans          D    V    S    F    T    G    I    I    W    K
C. angaria          D    V    S    F    T    G    I    I    W    K
C. japonica         D    V    A    F    T    G    I    V    W    K
C. brenneri         D    V    S    F    T    G    I    I    W    K
C. remanei          D    V    S    F    T    G    I    V    W    K
C. briggsae         D    V    S    F    T    G    I    V    W    K
C. tropicalis       D    V    S    F    T    G    I    I    W    K
A. suum             D    V    M    Y    F    G    I    I    W    K
T. canis            D    V    M    F    F    G    I    I    W    K
D. immitis          D    V    M    F    Y    G    I    I    W    K
O. volvulus         D    V    V    F    Y    G    I    V    W    K
L. loa              D    V    V    F    Y    G    I    V    W    K
B. malayi           D    V    M    F    F    G    I    I    W    K
P. pacificus        D    V    F    F    I    G    I    I    W    K
P. exspectatus      D    V    F    F    I    G    I    I    W    K
S. carpocapsae      D    V    F    F    Y    G    I    I    W    K
B. xylophilus       D    V    F    F    I    G    I    I    W    K
A. ceylanicum       D    V    M    F    F    G    I    V    W    K
A. duodenale        D    V    M    F    L    G    I    I    W    K
O. dentatum         D    V    L    F    F    G    I    V    W    K
N. americanus       D    V    F    F    V    G    I    V    W    K
H. bacteriophora    D    V    V    F    F    G    I    V    W    K
H. contortus        D    V    F    F    F    G    I    V    W    K
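The conservation of the PKS-1_A1 selectivity code across nematodes can be summarized column by column. The sketch below transcribes the codes from Table 3-2 and reports which code positions are invariant across the listed species; the variable and function names are illustrative.

```python
# Code positions and ten-residue codes transcribed from Table 3-2.
POSITIONS = [235, 236, 239, 278, 299, 301, 322, 330, 331, 517]
PKS1_A1_CODES = {
    "C. elegans": "DVSFTGIIWK",
    "C. angaria": "DVSFTGIIWK",
    "C. japonica": "DVAFTGIVWK",
    "C. brenneri": "DVSFTGIIWK",
    "C. remanei": "DVSFTGIVWK",
    "C. briggsae": "DVSFTGIVWK",
    "C. tropicalis": "DVSFTGIIWK",
    "A. suum": "DVMYFGIIWK",
    "T. canis": "DVMFFGIIWK",
    "D. immitis": "DVMFYGIIWK",
    "O. volvulus": "DVVFYGIVWK",
    "L. loa": "DVVFYGIVWK",
    "B. malayi": "DVMFFGIIWK",
    "P. pacificus": "DVFFIGIIWK",
    "P. exspectatus": "DVFFIGIIWK",
    "S. carpocapsae": "DVFFYGIIWK",
    "B. xylophilus": "DVFFIGIIWK",
    "A. ceylanicum": "DVMFFGIVWK",
    "A. duodenale": "DVMFLGIIWK",
    "O. dentatum": "DVLFFGIVWK",
    "N. americanus": "DVFFVGIVWK",
    "H. bacteriophora": "DVVFFGIVWK",
    "H. contortus": "DVFFFGIVWK",
}


def residues_by_position(codes):
    """Map each code position to the sorted set of residues observed there."""
    columns = zip(*codes.values())
    return {pos: sorted(set(column)) for pos, column in zip(POSITIONS, columns)}


def invariant_positions(codes):
    """Return the code positions at which every species has the same residue."""
    return [pos for pos, residues in residues_by_position(codes).items()
            if len(residues) == 1]


observed = residues_by_position(PKS1_A1_CODES)
invariant = invariant_positions(PKS1_A1_CODES)
print("invariant positions:", invariant)
print("variable positions:", {p: r for p, r in observed.items() if len(r) > 1})
```

Six of the ten positions are identical in all 23 species listed, and two more (278 and 330) vary only between chemically similar residues, which is in line with the conclusion that the PKS-1 A domain is conserved in most nematodes.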


Table 3-3. A domain selectivity codes in NRPS-1_A domains of various nematodes. A domain numbers are labeled based on their location in NRPS-1 genes, beginning from A2, A3, A4.

Species             An  235 236 239 278 299 301 322 330 331 517
H. contortus        A2   D   I   A   Y   H   G   E   V   Y   K
D. viviparus        A2   D   I   A   Y   H   G   E   V   Y   K
S. carpocapsae      A2   D   I   A   Y   Q   G   Q   V   Y   K
A. suum             A2   D   I   A   Y   I   F   Q   V   Y   K
C. elegans          A2   D   I   A   Y   Q   G   E   V   Y   K
C. japonica         A2   D   I   A   Y   Q   G   E   V   Y   K
C. remanei          A2   D   I   A   Y   Q   G   E   V   Y   K
C. tropicalis       A2   D   I   A   L   Q   G   E   V   Y   K
C. briggsae         A2   D   I   A   Y   Q   G   E   V   Y   K
O. volvulus         A2   D   V   A   Y   I   F   Q   I   Y   K
D. immitis          A2   D   I   A   Y   I   F   Q   I   Y   K
L. loa              A2   D   I   A   Y   N   G   Q   I   Y   K
B. malayi           A2   D   I   A   Y   N   G   Q   I   Y   K
P. exspectatus      A2   D   I   C   F   V   V   L   I   N   K
P. pacificus        A2   D   I   C   F   L   T   L   I   N   K
H. bacteriophora    A2   D   A   S   H   S   G   L   C   Y   K
A. ceylanicum       A2   D   A   S   H   S   G   L   C   Y   K
H. contortus        A3   D   A   S   H   S   G   L   C   Y   K
D. viviparus        A3   D   A   S   H   S   G   L   C   Y   K
S. carpocapsae      A3   D   A   A   H   S   G   V   V   L   K
A. suum             A3   D   A   A   H   S   G   V   V   F   K
P. exspectatus      A3               L   C   G   V   T   Y   K
P. pacificus        A3   D   N   L   L   C   G   V   T   Y   K
H. bacteriophora    A3   D   N   L   L   I   G   N   A   Y   K
A. ceylanicum       A3   D   N   L   L   I   G   N   A   Y   K
C. elegans          A3   D   N   L   L   V   G   N   A   F   K
C. japonica         A3   D   N   L   L   I   G   N   A   F
C. remanei          A3   D   N   L   L   I   G   N   A   F   K
C. tropicalis       A3   D   N   L   L   I   G   N   A   F
C. briggsae         A3   D   N   L   L   I   G   N   A   F   K
C. brenneri         A2   D   N   L   L   I   G   N   A   F   K
O. volvulus         A3   D   N   L   L   I   G   N   A   Y   K
D. immitis          A3   D   N   L   L   I   G   N   A   Y   K
L. loa              A3   D   N   L   L   I   G   N   A   Y   K
B. malayi           A3   D   N   L   L   I   G   N   A   Y   K
H. contortus        A4   D   N   L   L   I   G   N   A   Y   K
D. viviparus        A4   D   N   L   L   I   G   N   A   Y   K
S. carpocapsae      A4   D   N   L   L   I   G   N   T   Y   K
A. suum             A4   D   N   L   L   I   G   N   A   Y   K


3.2.7 Condensation (C) Domains

C domains catalyze the conjugation of loaded amino acids with the preceding starter units or growing intermediates. The C domains retain the conserved HHXXXDG motif99,100 (Figure 3-9). The second His has been shown to play an important catalytic role, although the Asp might also be essential for catalysis.
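A motif such as HHXXXDG can be located in a protein sequence with a regular expression. The sketch below is a minimal Python example; the function name find_motif and the two toy sequences are hypothetical and are not taken from PKS-1 or NRPS-1. A strict pattern like this one will miss domains that carry a variant of the motif, so it is only a first-pass screen.

```python
import re

# HHXXXDG: two His, any three residues, then Asp-Gly.
C_DOMAIN_MOTIF = re.compile(r"HH.{3}DG")


def find_motif(sequence):
    """Return (0-based start, matched text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in C_DOMAIN_MOTIF.finditer(sequence)]


# Hypothetical toy fragments for illustration, not real PKS-1/NRPS-1 sequence.
TOY_WITH_MOTIF = "MSLAHHILVDGWSAL"
TOY_WITHOUT_MOTIF = "MSLAHAILVDGWSAL"

if __name__ == "__main__":
    print(find_motif(TOY_WITH_MOTIF))
    print(find_motif(TOY_WITHOUT_MOTIF))
```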

Figure 3-9. Sequence alignment of C domains with ArfA_C2 (Arthrofactin module A C2 domain), VibH_C (Vibriobactin free-standing C domain VibH, PDB ID: 1L5A), CDA_C1 (Calcium-dependent antibiotic synthase C1 domain, PDB ID: 4JN3). Sequences used for C domains are: PKS-1_C1 (6653-6701), NRPS-1_C2 (520-563), NRPS-1_C3 (1457-1502), NRPS-1_C4 (1895-1947). The second histidine (marked with an asterisk) in the HHXXXDG motif acts as a catalytic base. Alignment was generated by Clustal Omega and ESPript 3.0.

3.2.8 Thioesterase (TE) Domains

The PKS-1 and NRPS-1 TE domains were aligned with bacterial TEI and TEII domains, as shown in Figure 3-10. TEI domains cleave polyketides/nonribosomal peptides from PKS/NRPSs once biosynthesis is complete, while TEII domains have editing functions101. Both the PKS-1 TE domain and the NRPS-1 TE domain have the Ser-Asp-His catalytic triad of TEI domains. Although the PKS-1 TE domain appears to be most similar to PKS-NRPS TEI domains (Figure 3-11), it does have the sequence motif around the catalytic Ser (GHSMG) that is characteristic of TEII domains.


Figure 3-10. Alignment of PKS-1 and NRPS-1 TE domains with bacterial TE domains. Red residues indicate catalytic residues (the portion of the sequence alignment containing the conserved His is not shown). Sequences were aligned with Clustal Omega92. TE domains are from the PKS (P), NRPS (N), and PKS-NRPS (PN) assembly lines that biosynthesize soraphen (Sor), amphotericin (Amp), spinosyn (Spn), spirangien (Spi), avermectin (Ave), pikromycin (Pik), erythromycin (Ery), bacitracin (Bac), tyrocidine (Tyr), surfactin (Sur), myxothiazol (Myx), hectochlorin (Hec), tubulysin (Tub), chondramide (Cho), megalomicin (Meg), borrelidin (Bor), tylosin (Tyl), kendomycin (Ken), rifampicin (Rif), and microcystin (Mic).


Figure 3-11. Phylogeny of the PKS-1 and NRPS-1 TE domains and bacterial TE domains. Both the PKS-1 and NRPS-1 TE domains cluster with the TEI domains of bacterial hybrid PKS-NRPSs. TE domains are described in Supplementary Figure 14. Phylogenetic tree was generated in MEGA 669.

3.3 Experimental Methods

3.3.1 Strains and Transgenic Lines

Worm strains used in this chapter are listed in Table 3-4 (backcrossed and published strains), Table 3-5 (strains with point mutations generated by CRISPR-Cas9), Table 3-6 (deletion strains newly generated by CRISPR-Cas9), Table 3-7 (double mutants), Table 3-8 (transcriptional reporter strains) and Table 3-9 (translational reporter strains). Gene structures and related alleles are shown in Figure 3-12. Worms were maintained on OP50 according to standard methods. The double mutants were constructed from single mutants using standard genetic methods, and the presence of the alleles was verified by PCR using the primers in Table 3-10.


Table 3-4. Wild type, backcrossed and published strains.

Strain   Genotype                               Mutation                                   Resource  Cross  Background
N2       wild type                                                                         CGC
RAB43    nrps-1(gk186409[C4_S1934N];            Located in NRPS-1_C4                       CGC       4x     VC20469
         gk186410[C4_D1971N]) III
RAB44    T12G3.4 (gk961508) IV                  122 bp deletion                            CGC       4x     VC40695
VC40591  C24A3.4(gk964112) X                    163003 bp deletion                         CGC
RAB45    Y71H2B.1(gk712674) III                 W136Opal                                   CGC       2x     VC40597
RAB46    nrps-1(tm3704) III                     481 bp deletion with 3 bp ‘ATA’ insertion  NBRP      4x
RAB47    pks-1(gk3257) X                        327 bp deletion                            CGC       2x     VC3246
RAB48    pks-1(gk620807) X                      Q3724Ochre                                 CGC       2x     VC40402
RAB49    pks-1(gk644506) X                      C6024Opal                                  CGC       2x     VC40451
RAB1     acox-1.1(ok2257) I
RAB26    acox-1.3(reb4[stop]) I
RAB21    acox-1.4(tm6415) I
RB1985   acox-1.5(ok2619) III
RAB22    acox-3(tm4033) IV
RAB35    acox-3(tm4033); acox-1.4(reb6[E433A])
DR476    daf-22(m130) II                                                                   CGC

Table 3-5. Strains with point mutations generated by CRISPR-Cas9.

Strain   Genotype                                        Domain location of mutation  Background
RAB50    pks-1(reb7[KS1_C159A]) X                        PKS-1_KS1                    N2
RAB51    nrps-1(reb8[ACP7_S307V]) III                    NRPS-1_ACP7                  N2
RAB52    pks-1(reb9[C1_H6685A]) X                        PKS-1_C1                     N2
RAB53    nrps-1(reb10[C3_H1486A]) III                    NRPS-1_C3                    N2
RAB54    pks-1(reb11[TE1_S7593A]) X                      PKS-1_TE1                    N2
RAB55    nrps-1(reb12[TE2_S2803A]) III                   NRPS-1_TE2                   N2
RAB56    pks-1(reb13[TE1_S7593C]; reb14[TE1_G7596A]) X   PKS-1_TE1                    N2
RAB89    pks-1(reb22[A1_G7106E]) X                       PKS-1_A1                     N2


Table 3-6. Mutant strains with deletions generated by CRISPR-Cas9.

Strain   Genotype            Description                                       Background
RAB57    nemt-1(reb15) IV    306 bp deletion with 27 bp insertion              N2
RAB58    acs-9(reb21) X      154 bp deletion with single base ‘T’ insertion    N2
RAB59    acs-9(reb28) X      552 bp deletion                                   N2
RAB60    acs-24(reb23) I     991 bp deletion with 10 bp insertion              N2
RAB61    acs-24(reb24) I     993 bp deletion                                   N2
RAB62    C24A3.4(reb16) X    1361 bp deletion                                  N2
RAB63    pks-1(reb18) X      1021 bp deletion                                  N2
RAB64    pks-1(reb17) X      386 bp deletion                                   N2

Table 3-7. Double mutants generated.

Strain   Genotype
RAB65    acox-3(tm4033); C24A3.4(reb16)
RAB66    acs-9(reb28); T12G3.4 (gk961508)
RAB67    acs-9(reb28); pks-1(reb[TE1_S7593A])
RAB68    acs-9(reb28); nemt-1(reb15)
RAB69    nemt-1(reb15); pks-1(TE1_reb[S7593A])
RAB70    nemt-1(reb15); T12G3.4 (gk961508)
RAB71    T12G3.4 (gk961508); pks-1(reb13[TE1_S7593C]; reb14[TE1_G7596A])

Table 3-8. Transcriptional reporter lines.

Strain   Genotype
RAB72    rebEx15 (Pnemt-1::gfp, 50 ng/µL; CAN::mcherry, 50 ng/µL)
RAB73    rebEx16 (Pacs-9::gfp, 50 ng/µL; CAN::mcherry, 50 ng/µL)
RAB74    rebEx17 (Pacs-24::gfp, 50 ng/µL; CAN::mcherry, 50 ng/µL)
RAB75    rebEx18 (PT04G9.4::gfp, 50 ng/µL; CAN::mcherry, 50 ng/µL)


Table 3-9. Translational reporter or overexpression strains.

Strain   Genotype
RAB76    nemt-1(reb15); rebEx19(Pnemt-1::nemt-1::sl2::mcherry, 50 ng/µL)
RAB77    acs-9(reb28); rebEx20(Pacs-9::acs-9::sl2::mcherry, 50 ng/µL)
RAB78    acs-24(reb24); rebEx21(Pacs-24::acs-24::sl2::mcherry, 50 ng/µL)
RAB79    acs-24(reb24); rebEx20(Pacs-9::acs-9::sl2::mcherry, 50 ng/µL)
RAB80    acs-9(reb28); rebEx21(Pacs-24::acs-24::sl2::mcherry, 50 ng/µL)
RAB81    Y71H2B.1(gk712674); rebEx22(PY71H2B.1::Y71H2B.1::sl2::mcherry, 50 ng/µL)
RAB82    C24A3.4(reb16); rebEx23(PC24A3.4::C24A3.4::sl2::mcherry, 50 ng/µL)
RAB83    C24A3.4(gk964112); rebEx24(PC24A3.4::C24A3.4::sl2::mcherry, 50 ng/µL; Plin-44::gfp, 20 ng/µL)
RAB84    T12G3.4 (gk961508); rebEx25(CAN::T12G3.4 cDNA::sl2::mcherry, 50 ng/µL; Plin-44::gfp, 20 ng/µL)
RAB85    rebEx26(Pacox-3::gfp::acox-3, 60 ng/µL; coel::dsred, 50 ng/µL)
RAB86    acox-3(tm4033); rebIs1(Pacox-3::acox-3::sl2::mcherry, 50 ng/µL; Plin-44::gfp, 20 ng/µL)
RAB87    acox-3(tm4033); rebIs2(CAN::acox-3::sl2::mcherry, 50 ng/µL; Plin-44::gfp, 20 ng/µL)
RAB88    acox-3(tm4033); rebIs3(Pges-1::acox-3::sl2::mcherry, 50 ng/µL; Plin-44::gfp, 20 ng/µL)

Table 3-10. Single worm PCR primers for mutant strains crossing.

Strain   Genotype
         Primers

RAB43    nrps-1(gk186409[C4_S1934N]; gk186410[C4_D1971N]) III
         Forward_CTGAAGCCTTTATTCAGTGCCAAG
         Reverse_CTTGCACTGCTAGAGCTAAGCTTC
RAB44    T12G3.4 (gk961508) IV
         Forward_GATATGCAAATTCGATTCGTCATGAG
         Reverse_AGACGAATGTTATCCGGATATCCAG
RAB45    Y71H2B.1(gk712674) III
         Forward_GGAAAGCACGGAGATTTTGAAG
         Reverse_AGTGATGGGAATGGTCTCTGTT
RAB46    nrps-1(tm3704) III
         Forward_TCTGAAAGCAGGTCACTCAGAT
         Reverse_TCAAATGGGATGCTACCATGAG
RAB47    pks-1(gk3257) X
         Forward_GTCGATTTCTATGATAAGGGAAAG
         Reverse_CGACAGTTGGATAATCGAAAATATC
RAB48    pks-1(gk620807) X
         Forward_CCTGTGAACAGTGTCAAAAGTG
         Reverse_ATGTCGCTTGTTTGCCTACAAC
RAB49    pks-1(gk644506) X
         Forward_GTTGTTGGGCCAGTTGAAACGA
         Reverse_CACAAGAACGGGTTCAGATATGC
RAB22    acox-3(tm4033) IV
         Forward_GTGTTTGAACCATGGCCTACTT
         Reverse_GCATATGTACTGGGTAGGAAG


Figure 3-12. Gene structures and related alleles used in this chapter are indicated.


3.3.2 Single Worm PCR and CRISPR-Cas9

All mutants containing deletions and point mutations were generated based on the Fire lab’s CRISPR-Cas9 protocol102-104 with modifications. The Cas9 vector was used at 50 ng/µL. The plasmid containing the dpy-10 sgRNA (obtained from Optimized CRISPR-Cas9 design105) was used at 25 ng/µL, and plasmids containing sgRNAs for the target genes (Table 3-11) were used at 50-100 ng/µL. The dpy-10(cn64) donor oligonucleotide was used at 500 nM, and donor oligonucleotides carrying the desired mutations (Table 3-12) were used at 500-750 nM. To generate deletion mutants, no donor was used for either dpy-10 or the targeted genes. After injection, F1 worms with dumpy (for deletions) or roller (for point mutations) phenotypes were picked for single worm PCR using the primers listed in Table 3-13, followed by restriction digestion of the PCR products (only for point mutants) as listed in Table 3-14. The PCR products of the resulting candidates were sequenced, and dumpy worms were backcrossed with wild type worms.


Table 3-11. sgRNA used for CRISPR-Cas9.

Strain   Genotype
         sgRNA (20 bases+NGG, and imported vector)

RAB50    pks-1(reb7[KS1_C159A]) X
         AGCAGTTTCGATTCCCACGG CGG (pTM 55-FE)
RAB51    nrps-1(reb8[ACP7_S307V]) III
         CTCCAGCTCGGCGAGTCTTA AGG (pTM 55-FE)
RAB52    pks-1(reb9[C1_H6685A]) X
         ATCATATTTTAACTGATGGT TGG (pTM 55-FE)
RAB53    nrps-1(reb10[C3_H1486A]) III
         GGCTTCTACCATCGCAGATC AGG (pTM 55-FE)
RAB54    pks-1(reb11[TE1_S7593A]) X
         TTCGTTATGGGGCACTCGAT GGG (pTM 55)
         CTTCGTTATGGGGCACTCGA TGG (pTM 55)
         GTTATGGGGCACTCGATGGG TGG (pTM 55)
RAB55    nrps-1(reb12[TE2_S2803A]) III
         ACCTCTAAATTGGTGTTCAT TGG (pTM 55)
         TTCATTGGCGCCTCGTCTGC TGG (pTM 55)
RAB56    pks-1(reb13[TE1_S7593C]; reb14[TE1_G7596A]) X
         TTCGTTATGGGGCACTCGAT GGG (pTM 55)
         CTTCGTTATGGGGCACTCGA TGG (pTM 55)
         GTTATGGGGCACTCGATGGG TGG (pTM 55)
RAB57    nemt-1(reb15) IV
         TATTACTACAGTTATGGCTT TGG (pTM 55-FE)
         CGAGAAATATGGAACACAGG TGG (pTM 55-FE)
RAB58    acs-9(reb21) X
         AAACTATTGGGCACTTTCGG AGG (pTM 55-FE)
         CCCAGAATCCAGATGCATGG TGG (pTM 55-FE)
         TCTTGTGGACATATTCTGCC AGG (pTM 55-FE)
RAB59    acs-9(reb28) X
         AAACTATTGGGCACTTTCGG AGG (pTM 55-FE)
         CCCAGAATCCAGATGCATGG TGG (pTM 55-FE)
         TCTTGTGGACATATTCTGCC AGG (pTM 55-FE)
RAB60    acs-24(reb23) I
         CTTCCATTCTTCCATGCGGG TGG (pTM 55-FE)
         GACGTGATCCGGAAAGTGGA GGG (pTM 55-FE)
RAB61    acs-24(reb24) I
         CTTCCATTCTTCCATGCGGG TGG (pTM 55-FE)
         GACGTGATCCGGAAAGTGGA GGG (pTM 55-FE)
RAB62    C24A3.4(reb16) X
         GAGCAGAGGCATCGCGTCCA TGG (pTM 55-FE)
         GACATTCCTACAGAAGACTC GGG (pTM 55-FE)
RAB63    pks-1(reb18) X
         CTCTCGTTACCACCATTAGA AGG (pTM 55)
         ATCCGACTTCCTTCTAATGG TGG (pTM 55)
         ATTACGAGTACGAGTTGCGG AGG (pTM 55)
RAB64    pks-1(reb17) X
         ATCATATTTTAACTGATGGT TGG (pTM 55-FE)
RAB89    pks-1(reb22[A1_G7106E]) X
         AGGGACACCTGTTGAGCCAC TGG (pTM 55-FE)
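The "20 bases + NGG" format of the sgRNA targets in Table 3-11 can be checked mechanically before ordering or cloning. The following is a minimal Python sketch that validates a few of the listed targets; the function name is_valid_target is hypothetical, and the check covers only the sequence format, not on-target efficiency or off-target effects.

```python
import re

PROTOSPACER = re.compile(r"[ACGT]{20}")  # exactly 20 unambiguous bases
PAM = re.compile(r"[ACGT]GG")            # NGG protospacer-adjacent motif

# A few (protospacer, PAM) pairs transcribed from Table 3-11.
SGRNAS = {
    "RAB50": ("AGCAGTTTCGATTCCCACGG", "CGG"),
    "RAB51": ("CTCCAGCTCGGCGAGTCTTA", "AGG"),
    "RAB52": ("ATCATATTTTAACTGATGGT", "TGG"),
    "RAB53": ("GGCTTCTACCATCGCAGATC", "AGG"),
}


def is_valid_target(protospacer, pam):
    """True if the protospacer has 20 bases and the PAM matches NGG."""
    return bool(PROTOSPACER.fullmatch(protospacer)) and bool(PAM.fullmatch(pam))


if __name__ == "__main__":
    for strain, (protospacer, pam) in SGRNAS.items():
        print(strain, is_valid_target(protospacer, pam))
```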


Table 3-12. Repair templates used for CRISPR-Cas9.

Strain   Genotype
         Repair template

RAB50    pks-1(reb7[KS1_C159A]) X
         CTACTTTCTGAATTCGCGCGGAGCCGCCGTGGGAATCG
         AAACTGCAGCATCGTCTTCTCTCGTTGCTTTTCACCTTG
         CACGACAAGCA
         (underlined: PstI; C159A: TGC»GCA)
RAB51    nrps-1(reb8[ACP7_S307V]) III
         ATGAAGTTGAAACCACTCCTCTACCATACCTCGGAATCG
         ACGTCTTAAGACTCGCCGAGCTGGAGTACCACGTGGCT
         AGT
         (underlined: AatII; S307V: TCC»GTC)
RAB52    pks-1(reb9[C1_H6685A]) X
         TGATAACAGTCGAATTCACATCGTTTTCAATCAGCATGC
         AATTTTAACTGATGGTTGGTCAATGACTGTTCTTTCTGA
         CACTGT
         (underlined: SphI; H6685A: CAT»GCA)
RAB53    nrps-1(reb10[C3_H1486A]) III
         CTGGATGAGTAGCAAAAATAAGTTATTGACAATTTCCATT
         CACGCATTAATCTGCGATGGTAGAAGCCTGCAGATTCTC
         GAG
         (underlined: AseI; H1486A: CAC»GCA)
RAB54    pks-1(reb11[TE1_S7593A]) X
         TGCCGCACATGCCGGAAACAAGAGAATCTTCGTTATGGG
         GCATGCGATGGGTGGAATAATGAGTCGCGAAATAGTGGC
         TGAGCTCAAAAT
         (underlined: SphI; S7593A: TCG»GCG)
RAB55    nrps-1(reb12[TE2_S2803A]) III
         GTGCTGAAAATATTGAAACCTCTAAATTGGTGTTCATTGG
         CGCCGCTAGCGCTGGTACTTTTGCATTTTCCACGTCACA
         ACTTTTTG
         (underlined: NheI; S2803A: TCG»GCT)
RAB56    pks-1(reb13[TE1_S7593C]; reb14[TE1_G7596A]) X
         TGCCGCACATGCCGGAAACAAGAGAATCTTCGTTATGGG
         ACATTGCATGGGCGCCATAATGAGTCGCGAAATAGTGGC
         TGAGCTCAAAAT
         (underlined: KasI; S7593C/G7596A: TCG»TGC/GGA»GCC)
RAB89    pks-1(reb22[A1_G7106E]) X
         AGAACTCAATTTGGAAGTATTTACTCCATATTCACCAGTGA
         GTCGACAGGTGTCCCTAAAGGAGTTTTGATGGCGGAACA
         GTCA
         (underlined: SalI; G7106E: GGC»GAG)
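Each repair template in Table 3-12 introduces a restriction site that is later used for genotyping, so it is worth confirming that the site is actually present in the oligonucleotide. The minimal Python sketch below searches four of the templates for the standard recognition sequence of the named enzyme. The function name site_positions is hypothetical, and the recognition sequences are the standard ones for these enzymes rather than values stated in the table. Because all four sites are palindromic, searching one strand is sufficient.

```python
# Standard recognition sequences of enzymes named in Table 3-12.
SITES = {
    "PstI": "CTGCAG",
    "AseI": "ATTAAT",
    "NheI": "GCTAGC",
    "SalI": "GTCGAC",
}

# (strain, enzyme, repair template) transcribed from Table 3-12.
TEMPLATES = [
    ("RAB50", "PstI",
     "CTACTTTCTGAATTCGCGCGGAGCCGCCGTGGGAATCG"
     "AAACTGCAGCATCGTCTTCTCTCGTTGCTTTTCACCTTG"
     "CACGACAAGCA"),
    ("RAB53", "AseI",
     "CTGGATGAGTAGCAAAAATAAGTTATTGACAATTTCCATT"
     "CACGCATTAATCTGCGATGGTAGAAGCCTGCAGATTCTC"
     "GAG"),
    ("RAB55", "NheI",
     "GTGCTGAAAATATTGAAACCTCTAAATTGGTGTTCATTGG"
     "CGCCGCTAGCGCTGGTACTTTTGCATTTTCCACGTCACA"
     "ACTTTTTG"),
    ("RAB89", "SalI",
     "AGAACTCAATTTGGAAGTATTTACTCCATATTCACCAGTGA"
     "GTCGACAGGTGTCCCTAAAGGAGTTTTGATGGCGGAACA"
     "GTCA"),
]


def site_positions(sequence, site):
    """Return every 0-based position at which `site` starts in `sequence`."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


if __name__ == "__main__":
    for strain, enzyme, template in TEMPLATES:
        print(strain, enzyme, site_positions(template, SITES[enzyme]))
```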


Table 3-13. Single worm PCR primers for CRISPR-Cas9 generated mutant strains.

Strain   Genotype
         Primers

RAB50    pks-1(reb7[KS1_C159A]) X
         Forward_TCTACACTGCAGCGTGGTGCTAT
         Reverse_TATTGATGGCTGTAAGCTCTACCTG
RAB51    nrps-1(reb8[ACP7_S307V]) III
         Forward_GAAGGAGCAGCAAACATCGAGAA
         Reverse_ATCTGAGTGACCTGCTTTCAGAG
RAB52    pks-1(reb9[C1_H6685A]) X
         Forward_CATCTGTAAACCCTGCAGATATTGC
         Reverse_CGGCATCGCAGAAAACTGATAATGC
RAB53    nrps-1(reb10[C3_H1486A]) III
         Forward_GAAGCTGGTGGAGTTGTCCAATGCT
         Reverse_GAAACTGTATCCCAGTTCTCTGGAG
RAB54    pks-1(reb11[TE1_S7593A]) X
         Forward_GGTGATTAAATCTGGAGTAC
         Reverse_TAGTCCAGAGAAGACGTACT
RAB55    nrps-1(reb12[TE2_S2803A]) III
         Forward_TCGAGACCAAACTCGGAATC
         Reverse_TCTGAGAAAATGTTCACCGG
RAB56    pks-1(reb13[TE1_S7593C]; reb14[TE1_G7596A]) X
         Forward_GAGGTGATTAAATCTGGAGTACGGC
         Reverse_TCACTATCCGGTAGTCCAGAGAAG
RAB57    nemt-1(reb15) IV
         Forward_AGTGGCTTTGCCTTTCCTCCTT
         Reverse_AGCCCTCAACTACTTCATCAGTG
RAB58    acs-9(reb21) X
         Forward_GAGCTCGGGATTTCTCAAGGT
         Reverse_CAATTCTGCAACACAGAATGTCG
RAB59    acs-9(reb28) X
         Forward_GAGCTCGGGATTTCTCAAGGT
         Reverse_CAATTCTGCAACACAGAATGTCG
RAB60    acs-24(reb23) I
         Forward_GCTTCAACTCCAGAGAATCAGG
         Reverse_CAACGGCTCTCCGCTCTTAAG
RAB61    acs-24(reb24) I
         Forward_GCTTCAACTCCAGAGAATCAGG
         Reverse_CAACGGCTCTCCGCTCTTAAG
RAB62    C24A3.4(reb16) X
         Forward_CTCTGCCGTACCAGTGATGTTCTA
         Reverse_CTATCCATGTGCTACCAAACTTGTC
RAB63    pks-1(reb18) X
         Forward_CTGTGTTGGAGTTGAAACATCTG
         Reverse_TTGATGGGATCAGTTTGCTGATTC
RAB64    pks-1(reb17) X
         Forward_CATCTGTAAACCCTGCAGATATTGC
         Reverse_CGGCATCGCAGAAAACTGATAATGC
RAB89    pks-1(reb22[A1_G7106E]) X
         Forward_CACCACTATACCAATTCGAAGAACTG
         Reverse_AGTGACTTGTCAACTTTCCCACTTG


Table 3-14. Single worm PCR information for wild type and mutants.

Strain   Genotype                                              Wild type  Mutant    Enzyme digestion
RAB43    nrps-1(gk186409[C4_S1934N]; gk186410[C4_D1971N]) III  445 bp     445 bp    Wild type is cut by BtsIMutI to 265 bp+180 bp; no cut for mutant
RAB44    T12G3.4 (gk961508) IV                                 628 bp     506 bp
RAB45    Y71H2B.1(gk712674) III                                700 bp     700 bp    Wild type is cut by HinfI into 450 bp+200 bp+50 bp; mutant is cut by HinfI into 500 bp+200 bp
RAB46    nrps-1(tm3704) III                                    877 bp     399 bp
RAB47    pks-1(gk3257) X                                       1053 bp    726 bp
RAB48    pks-1(gk620807) X                                     861 bp     861 bp    Wild type is cut by FspI into 500 bp+361 bp; no cut for mutant
RAB49    pks-1(gk644506) X                                     689 bp     689 bp    Wild type is cut by AflII into 380 bp+309 bp; no cut for mutant
RAB22    acox-3(tm4033) IV                                     833 bp     417 bp
RAB50    pks-1(reb7[KS1_C159A]) X                              1392 bp    1392 bp   Mutant is cut by PstI into 892 bp+500 bp
RAB51    nrps-1(reb8[ACP7_S307V]) III                          820 bp     820 bp    Mutant is cut by AatII into 438 bp+382 bp
RAB52    pks-1(reb9[C1_H6685A]) X                              1098 bp    1098 bp   Mutant is cut by SphI into 750 bp+348 bp
RAB53    nrps-1(reb10[C3_H1486A]) III                          993 bp     993 bp    Mutant is cut by AseI into 658 bp+335 bp
RAB54    pks-1(reb11[TE1_S7593A]) X                            562 bp     562 bp    Mutant is cut by SphI into 355 bp+207 bp
RAB55    nrps-1(reb12[TE2_S2803A]) III                         539 bp     539 bp    Mutant is cut by NheI into 318 bp+221 bp
RAB56    pks-1(reb13[TE1_S7593C]; reb14[TE1_G7596A]) X         574 bp     574 bp    Mutant is cut by KasI into 364 bp+210 bp
RAB57    nemt-1(reb15) IV                                      981 bp     702 bp
RAB58    acs-9(reb21) X                                        1723 bp    1570 bp
RAB59    acs-9(reb28) X                                        1723 bp    1171 bp
RAB60    acs-24(reb23) I                                       1405 bp    424 bp
RAB61    acs-24(reb24) I                                       1405 bp    412 bp
RAB62    C24A3.4(reb16) X                                      1863 bp    502 bp
RAB63    pks-1(reb18) X                                        1427 bp    400 bp
RAB64    pks-1(reb17) X                                        1098 bp    712 bp
RAB89    pks-1(reb22[A1_G7106E]) X                             1350 bp    1350 bp   Mutant is cut by SalI into 914 bp+436 bp
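The band sizes used for genotyping follow from simple arithmetic: digestion fragments must sum to the undigested amplicon, and a deletion allele's amplicon equals the wild-type amplicon minus the deleted bases plus any inserted bases. The minimal Python sketch below checks a few rows of Table 3-14 in this way; the deletion and insertion sizes are taken from Table 3-6, and the function names are hypothetical.

```python
# (amplicon size, digestion fragments) in bp, transcribed from Table 3-14.
DIGESTS = {
    "RAB50 PstI": (1392, [892, 500]),
    "RAB51 AatII": (820, [438, 382]),
    "RAB52 SphI": (1098, [750, 348]),
    "RAB53 AseI": (993, [658, 335]),
    "RAB54 SphI": (562, [355, 207]),
}

# (wild-type amplicon, mutant amplicon, deleted bp, inserted bp);
# amplicons from Table 3-14, deletion/insertion sizes from Table 3-6.
DELETIONS = {
    "RAB57": (981, 702, 306, 27),
    "RAB59": (1723, 1171, 552, 0),
    "RAB60": (1405, 424, 991, 10),
    "RAB61": (1405, 412, 993, 0),
}


def fragments_match(amplicon, fragments):
    """True if the digestion fragments add up to the amplicon size."""
    return sum(fragments) == amplicon


def expected_mutant_size(wild_type, deleted, inserted):
    """Expected amplicon size of a deletion (plus insertion) allele."""
    return wild_type - deleted + inserted


if __name__ == "__main__":
    for name, (amplicon, fragments) in DIGESTS.items():
        print(name, fragments_match(amplicon, fragments))
    for strain, (wt, mutant, deleted, inserted) in DELETIONS.items():
        print(strain, expected_mutant_size(wt, deleted, inserted) == mutant)
```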


3.3.3 Transgenic Line Construction

All transcriptional reporters were inserted into pPD 114.108 expressing GFP using primers containing restriction sites listed in Table 3-15, and all translational reporters were inserted into pBS77-sl2-mcherry using primers with restriction sites listed in Table 3-16.

Table 3-15. Primers for transcriptional reporter line construction.

Strain   Genotype
         Primers

RAB72    rebEx15
         nemt-1p_SalI_Fwd_GCGCGTCGACCGAAAATCTCAAGTCTTGTCTTAA
         nemt-1p_NotI_Rev_CATGGCGGCCGCAGTGAATATGTGTTTACGAGTAAATG
RAB73    rebEx16
         acs-9p_SalI_Fwd_GCGCGTCGACGCTATAAATGGGTACCTGGCCGTAA
         acs-9p_NotI_Rev_CATGGCGGCCGCCGTAGAGAAGAAACTGTGACAGTTC
RAB74    rebEx17
         acs-24p_SalI_Fwd_GCGCGTCGACGTACCGGGAATCGAAAAATTGTCC
         acs-24p_NotI_Rev_CATGGCGGCCGCTCTTATTGTACAGAATGTTTCTTTCC
RAB75    rebEx18
         T04G9.4p_SalI_Fwd_GCGCGTCGACTGCCTGATATGCCTGTTAAGAAG
         T04G9.4p_NotI_Rev_CATGGCGGCCGCCTTTTTCTTCAAGTCCCGGCTGCC


Table 3-16. Primers for translational reporter line construction.

Strain   Genotype
         Primers

RAB76    rebEx19
         nemt-1SL2_SalI_Fwd_GCGCGTCGACCGAAAATCTCAAGTCTTGTCTTAA
         nemt-1SL2_NotI_Rev_CATGGCGGCCGCAGTGAATATGTGTTTACGAGTAAATG
RAB77    rebEx20
         acs-9SL2_PstI_Fwd_CATGCTGCAGGCTATAAATGGGTACCTGGCCGTAA
         acs-9SL2_NotI_Rev_CATGGCGGCCGCTCAATAGTACATTAGCCTATTCTTTTG
RAB78    rebEx21
         acs-24SL2_SalI_Fwd_GCGCGTCGACGTACCGGGAATCGAAAAATTGTCC
         acs-24SL2_NotI_Rev_CATGGCGGCCGCTCAATGCAACATATGACGGACTAG
RAB81    rebEx22
         Y71H2B.1SL2_PstI_Fwd_CATGCTGCAGCCAGATGAAGAAAACGATGGAACT
         Y71H2B.1SL2_NotI_Rev_CATGGCGGCCGCTTAAACCCATCCCTCTGGAGGATT
RAB82    rebEx23
         C24A3.4SL2_SalI_Fwd_GCAGGTCGACGGTCTCTACAGTGATGACACTCATT
         C24A3.4SL2_NotI_Rev_CATGGCGGCCGCTCACAGCTTGGACCGCGCCGCAAA
RAB84    rebEx84
         T12G3.4_BamHI_Fwd_TGAGGATCCATGAAGAAATCAGACGGAAGCGTCAG
         T12G3.4_NotI_Rev_CATGGCGGCCGCTCAAAGTTTAGCTCGAGCAATATAATAG
RAB86    rebIs1
         acox-3p_PstI_Fwd_GCGCCTGCAGGTGCACATAAGGGAAAATTGTGGC
         acox-3SL2_KpnI_Fwd_GCGCGGTACCAAAAATGAGTGCTCCTTTAATTGACA
         acox-3_NotI_Rev_CATGGCGGCCGCTTATAGCTTGGATCCTTGTGATC

3.3.4 Plasmid Construction, Protein Overexpression and Purification

All genes were amplified by PCR using Phusion polymerase (New England Biolabs) from a C. elegans cDNA library. Each gene was inserted separately into the corresponding vector such that it could be expressed with an N-terminal or C-terminal His tag. All primers used in this study are listed in Table 3-17, with restriction sites underlined. All gene sequences were confirmed by sequencing. The constructed plasmids were transformed independently into BL21 (DE3) cells (New England Biolabs) for expression. Transformed cells were grown in LB broth under appropriate antibiotic selection at 37 °C to an OD600 of 0.6-0.8, protein expression was induced with IPTG, and the cells were grown for a further 20 h; the IPTG concentration and induction temperature used for each construct are listed in Table 3-18.

All purification steps were carried out at 4 °C. Briefly, cells were collected by centrifugation at 3700 rpm for 10 min and resuspended in lysis buffer (20 mM Tris, 500 mM NaCl, pH 7.5). The cells were then lysed by three passes through a microfluidizer and centrifuged at 18,000 rpm for 20 min. The supernatant was incubated with 1 mL of pre-equilibrated nickel resin (Thermo Scientific) for 1 h with shaking on ice. The resin was washed with 15 mL of lysis buffer and then 15 mL of washing buffer (20 mM Tris, 500 mM NaCl, 20 mM imidazole, pH 7.5), and the protein was eluted with buffer containing 250 mM imidazole. The eluted sample was concentrated and loaded onto an FPLC connected to a Superdex 200 gel filtration column (GE Healthcare) with buffer (20 mM Tris, 100 mM NaCl, pH 7.5). Protein concentration was determined using Quick Start™ dye reagent (Bio-Rad) with 2 mg/mL bovine serum albumin as a standard. Purified proteins were flash frozen in 10% glycerol and stored at -80 °C. For purification of the acyl-CoA synthetases, a published protocol was adapted106. Moreover, 20 µM FAD was added to all buffers used for purification of the acyl-CoA oxidase ACOX-3106 to saturate the purified enzyme.


Table 3-17. Primers for plasmid construction.

Gene
         Primers

ACS-9
         NcoI_Fwd_CATGCCATGGGGGCGAAATATTATCCAGAAAC
         NotI_Rev_CATGGCGGCCGCATAGTACATTAGCCTATTC
ACS-24
         NheI_Fwd_CATGGCTAGCATGATGCGAAATTTCGGTCGTGAA
         NotI_Rev_CATGGCGGCCGCTCAATGCAACATATGACGGACTAG
C24A3.4
         NdeI_Fwd_CAGCCATATGTCACGTCTTTTATCCGGAATTAAAG
         NotI_Rev_CATGGCGGCCGCTCACAGCTTGGACCGCGCCGCAAA
T04G9.4
         NdeI_Fwd_GCGCCATATGGGAGAACATAAAAACTGCAAATG
         XhoI_Rev_CATGCTCGAGTTAAAATGTTTTTTTCGGCTTCGCC
PKS-1_AT1ACP1 (397-813)
         NcoI_Fwd_GCGCCCATGGGGCCAGAGAAACCTAGCCTTGTGCA
         NotI_Rev_CATGGCGGCCGCAGTTGTTGCTTTAGTAACTGGAAC
PKS-1_AT1 (397-728)
         NcoI_Fwd_GCGCCCATGGGGCCAGAGAAACCTAGCCTTGTGCA
         NotI_Rev_CATGGCGGCCGCACTGCTGTTAGTCTGCTCGTCAA
PKS-1_ACP1 (729-813)
         NcoI_Fwd_GCGCCCATGGGGCTTTCTGATGCGGAAATTGAGTC
         NotI_Rev_CATGGCGGCCGCAGTTGTTGCTTTAGTAACTGGAAC
NRPS-1_ACP7 (259-361)
         NcoI_Fwd_CATGCCATGGGGAGTGAAGACTCCGATGAAGAAGT
         NotI_Rev_CATGGCGGCCGCTCCGGACCCCAGCGCTTTCTCAC
ACOX-3
         Yz_0081_AscI_Fwd_CATGGGCGCGCCTGAGTGCTCCTTTAATTGAC
         Yz_0082_NotI_Rev_CATGGCGGCCGCTTATAGCTTGGATCCTTGTGA
NRPS-1_TE2 (2719-2942)
         NheI_Fwd_CATGGCTAGCGCAATTTCTGTTGTCGTTTTTCCT
         NotI_Rev_CATGGCGGCCGCTTATTTAATAGACTTCAAAACC
NRPS-1_PCP5TE2 (2642-2942)
         NheI_Fwd_CATGGCTAGCATGGAGCTGGTGAAAAATCTACCAC
         NotI_Rev_CATGGCGGCCGCTTATTTAATAGACTTCAAAACC
NRPS-1_PCP3 (1290-1371)
         NheI_Fwd_CATGGCTAGCGCTGCAATTTCTATTGCCAGAC
         NotI_Rev_CATGGCGGCCGCTTAAAAGTCCCTAGCAGGCTCTAC


Table 3-18. Plasmid construction and protein overexpression.

Gene                        Vector        Tag    Temperature (°C)  IPTG (mM)
ACS-9                       pET 16b-KH01  C-His  16                0.3
ACS-24                      pET 28a       N-His  16                0.3
C24A3.4                     pET 28a       N-His  16                0.35
T04G9.4                     pET 28a       N-His  16                0.3
SFP                         pET 28a       N-His  25                0.5
PKS-1_AT1ACP1 (397-813)     pET 16b-KH01  C-His  16                0.4
PKS-1_AT1 (397-728)         pET 16b-KH01  C-His  16                0.4
PKS-1_ACP1 (729-813)        pET 16b-KH01  C-His  16                0.4
NRPS-1_ACP7 (259-361)       pET 16b-KH01  C-His  16                0.3
ACOX-3                      pET Duet1     N-His  16                0.5
NRPS-1_TE2 (2719-2942)      pET 28a       N-His  16                0.35
NRPS-1_PCP5TE2 (2642-2942)  pET 28a       N-His  16                0.35
NRPS-1_PCP3 (1290-1371)     pET 28a       N-His  18                0.4

3.3.5 Small Scale Worm Extraction and Intermediate Extractions

C. elegans wild type and mutant worm strains were grown at room temperature on two NGM agar plates (10 cm) spread with 0.75 mL of 25X OP50 until the food on the plates was almost gone. The worms were then transferred to 1 L Erlenmeyer flasks containing S medium (350 mL). The worm cultures were grown at 22.5 °C for 3-5 d, fed with 3.5 mL of 25X OP50 every day, until no food was left. For sample collection, the culture flasks were placed in an ice bath for 30 min to 1 h to settle the worms. The worms were then transferred from the bottom of the flasks to a 50 mL centrifuge tube and centrifuged (1000 rpm for 5 min) to separate them from the worm medium. The process was repeated until most of the worms had been removed from the flasks. The collected worms were washed with water three times and centrifuged (1000 rpm for 5 min), and then soaked in 10 mL of water for 1 h in a shaking incubator (22.5 °C, 225 rpm) to remove bacteria from their digestive tract. The worms were collected by centrifugation and freeze-dried. The dried worm pellets were ground with sea sand (2 g of sand per 200 mg of dried worms) using a mortar and pestle. The ground worms were extracted with 15 mL of 190 proof ethanol for 3.5 h, and the extract was centrifuged (3500 rpm for 20 min). The supernatant was collected and dried using a speedvac. The dried worm samples were each resuspended in 100 µL of methanol, sonicated (if needed), and centrifuged (15000 rpm for 1 min) before analysis by LC-MS. The extracts were analyzed using a Luna 5μ C18 (2) column (100 × 4.6 mm; Phenomenex) coupled with an Agilent 6130 single quadrupole mass spectrometer operating in single ion monitoring (SIM) mode for nemamide A (m/z 757) and nemamide B (m/z 755). The following solvent gradient was used: 95% buffer A, 5% buffer B, 0 min; 0% buffer A, 100% buffer B, 20 min; 0% buffer A, 100% buffer B, 22 min; 95% buffer A, 5% buffer B, 23 min; 95% buffer A, 5% buffer B, 26 min (buffer A, water with 0.1% formic acid; buffer B, acetonitrile with 0.1% formic acid). The flow rate was 0.7 mL/min.
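The solvent gradient above is given as a list of time points. The minimal Python sketch below returns the percentage of buffer B at any time during the 26 min run, under the assumption (not stated in the text) that the composition changes linearly between consecutive time points. The function name percent_b is hypothetical.

```python
# (time in min, % buffer B) time points from the gradient described above.
GRADIENT = [(0, 5), (20, 100), (22, 100), (23, 5), (26, 5)]


def percent_b(t):
    """Percent buffer B at time t (min), assuming linear ramps between points."""
    if t < GRADIENT[0][0] or t > GRADIENT[-1][0]:
        raise ValueError("time is outside the programmed gradient")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t <= t1:
            return b0 + (b1 - b0) * (t - t0) / (t1 - t0)


if __name__ == "__main__":
    for minute in (0, 10, 20, 22.5, 26):
        b = percent_b(minute)
        print(f"{minute} min: {100 - b:.1f}% A, {b:.1f}% B")
```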

Wild type N2 or mutant worm cultures (50 mL) were inoculated into 2.8 L baffled flasks containing 500 mL of CeHR medium45 with 20% cow’s milk and shaken at 225 rpm for 7 d at 20 °C. Worms were collected by centrifugation, washed with water, shaken in water for 30 min to clear their intestines, and washed again with water. Worms were stored frozen at -20 °C until needed. For extraction and fractionation, worms from 2 L of culture (around 5 g of dried worms) were processed at a time. After freeze drying, the worms were ground for 15 min with 50 g of sand using a mortar and pestle. The pulverized worms were transferred to a 1 L Erlenmeyer flask, and 300 mL of 190-proof ethanol was added. The synthetic intermediate 9 (potentially the most non-polar intermediate) was added to the extract as an internal standard to identify the corresponding fractions. The flask was shaken at 300 rpm for 3.5 h. The extract was filtered using a Buchner funnel and filter paper and evaporated with a rotavap at 27 °C. The extract was then subjected to silica gel chromatography (50 g of silica gel) and eluted with a gradient of hexane, ethyl acetate, ethyl acetate/methanol (1:1) and MeOH (350 mL each) to give four fractions (A-D). Because the synthetic intermediate 9 was detected in fraction C, fractions C and D were processed separately: each was evaporated with a rotavap at 27 °C, redissolved in 12 mL of methanol, and centrifuged at 3500 rpm for 10 min. The supernatant was dried and dissolved in 10 mL of 70% methanol/water. The resulting cloudy sample was then applied to an HP-20 column (100 g of HP-20 resin) and eluted with MeOH/H2O (7:3 to 9:1 to 10:0) to give twelve subfractions (D1-D12, 125 mL each). Each fraction was dried by rotavap and analyzed by LC-MS for peaks with nemamide A- or nemamide B-type UV spectra.

3.3.6 LC-MS-based ACS and ACOX Activity Assay

To examine the activity of the acyl-CoA synthetases, an LC-MS-based assay was used as previously described106. In general, a 50 µL reaction mixture contained 100 mM potassium phosphate (pH 7.0), 5 mM MgCl2, 5 mM CoA, 5 mM ATP, and 1 mM fatty acid substrate. The reaction was initiated by adding 2 µL of 2 mg/mL purified ACS-24 or ACS-9 enzyme and incubated at 25 °C for 2 h. The reaction was quenched by adding 50 µL of methanol, vortexed and centrifuged. A 5 µL aliquot of the supernatant was analyzed by LC-MS on an Agilent 6130 single quadrupole mass spectrometer in both positive and negative full-scan modes, with a mass range of 150-1500, a fragmentor voltage of 125 V, a peak width of 0.15 min and a cycle length of 2.20 sec. The mobile phases were A, water with 10 mM ammonium acetate, and B, acetonitrile. The LC gradient started at 95% A for 2 min and was then ramped to 100% B over 24 min.
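For bench preparation, the concentrations in the ACS assay can be converted into absolute amounts per reaction. The minimal Python sketch below does this conversion, assuming that the stated concentrations refer to the 50 µL reaction volume; the function names are hypothetical. It relies on the unit identities mM × µL = nmol and mg/mL × µL = µg.

```python
REACTION_VOLUME_UL = 50

# Component concentrations (mM) from the ACS assay described above.
COMPONENTS_MM = {"MgCl2": 5, "CoA": 5, "ATP": 5, "fatty acid substrate": 1}


def amount_nmol(conc_mm, volume_ul):
    """Amount of solute in nmol (mM x µL = nmol)."""
    return conc_mm * volume_ul


def enzyme_mass_ug(stock_mg_per_ml, volume_ul):
    """Mass of enzyme added in µg (mg/mL x µL = µg)."""
    return stock_mg_per_ml * volume_ul


if __name__ == "__main__":
    for name, conc in COMPONENTS_MM.items():
        print(f"{name}: {amount_nmol(conc, REACTION_VOLUME_UL)} nmol")
    # 2 µL of a 2 mg/mL enzyme stock, as used to start the ACS reaction.
    print(f"enzyme: {enzyme_mass_ug(2, 2)} µg")
```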

An LC-MS-based method was also used to test the activity of the acyl-CoA oxidase ACOX-3106. Briefly, a 50 µL reaction mixture contained 50 mM potassium phosphate (pH 7.4), 160 µM CoA-ester substrate, and 20 µM FAD. The reaction was initiated by adding 3 µL of 2 mg/mL ACOX-3 enzyme and incubated at 25 °C for 20 min. Samples were prepared for LC-MS as described above. The standard CoA esters butyryl-CoA, hexanoyl-CoA, octanoyl-CoA, and lauroyl-CoA were used for both the ACS and ACOX activity assays.

3.3.7 Profiling Ascaroside Production

C. elegans wild-type and mutant worms were cultured in 5 mL of S medium at 20 °C and 225 rpm for 7 d before harvest. The worm medium was collected by centrifugation at 3500 rpm for 10 min and purified on a Sep-Pak C18 cartridge (500 mg, Waters). The final MeOH eluate was dried by speedvac and dissolved in 100 μL of 50% MeOH/H2O, and 5 μL of the sample was analyzed by LC-MS34.

3.4 Functional Analysis of Enzymatic Domains

To dissect the function of each domain in PKS-1 and NRPS-1 during nemamide biosynthesis, we mutated the essential residue(s) of each domain and tested whether the resulting mutants show any defects in nemamide production.


3.4.1 Ketosynthase (KS) Domain

To examine the role of the first KS domain in PKS-1, we mutated its catalytic residue Cys159 to Ala and analyzed the production of the nemamides in this mutant (Figure 3-13 and Figure 3-14). The PKS-1_KS1 mutant is defective in production of nemamide B, but it retains the ability to produce nemamide A at half the level of wild-type worms. These results indicate that the KS domain in the initiating module plays a critical role in the loading of starter units to form nemamide B, and also contributes to the biosynthesis of nemamide A.

Figure 3-13. Properties of mutant pks-1(reb7[KS1_C159A]). A) The pks-1(reb7[KS1_C159A]) mutant was generated by CRISPR-Cas9. The desired mutant was verified by single worm PCR (1392 bp) and PstI digestion to 892 bp and 500 bp. B) Nemamide A (m/z 757) and nemamide B (m/z 755) production in wild type and the pks-1(reb7[KS1_C159A]) mutant.


Figure 3-14. Sequencing data for pks-1(reb7[KS1_C159A]). A), Alignment of sequence file with wild type gene sequence. Portion of sequencing trace highlighted is shown in B).


3.4.2 Carrier Protein (CP) Domains

As mentioned earlier, PKS-1 and NRPS-1 work together to synthesize the nemamides. However, no obvious carrier protein was found at the N-terminus of NRPS-1, so our analysis of the carrier proteins in these enzymes focused mainly on this region. First, we aligned the N-terminal amino acid sequence of NRPS-1 with all other carrier protein domains in PKS-1 and NRPS-1 (Figure 3-15). The alignment shows that the ACP that we detected at the N-terminus of NRPS-1, NRPS-1_ACP7, has the conserved motif LGXDSL. Structural modeling107 (Figure 3-16) indicates that it has a very similar structure to CalE8, which was shown to have a transient interaction with the CalE7 thioesterase in calicheamicin polyketide biosynthesis108. Mutation of the conserved Ser to Val in NRPS-1_ACP7 abolishes nemamide production (Figure 3-17 and Figure 3-18), indicating that this ACP domain at the N-terminus of NRPS-1 plays an essential role.

Figure 3-15. Sequence alignment of NRPS-1_ACP7 with PKS-1_ACP domains and Bacillus subtilis ACP (GenBank accession no. P80643, PDB ID: 1HY8). The conserved active site serine (S) is marked with an asterisk. Protein sequences used here are: PKS-1_ACP1 (719-776), PKS-1_ACP2 (1776-1833), PKS-1_ACP3 (2792-2846), PKS-1_ACP4 (2940-2997), PKS-1_ACP5 (3471-3526), PKS-1_ACP6 (5154-5207) and NRPS-1_ACP7 (267-320). Alignment was generated by ClustalW and ESPript 3.0.


Figure 3-16. Structural modeling and sequence comparison of the NRPS-1_ACP7 domain (yellow) with CalE8_ACP (magenta). The active site of NRPS-1_ACP7 is indicated at Ser307.

Figure 3-17. Properties of mutant nrps-1(reb8[ACP7_S307V]). A) The nrps-1(reb8[ACP7_S307V]) mutant was generated by CRISPR-Cas9. The desired mutant was verified by single worm PCR (820 bp) and AatII digestion to 438 bp and 382 bp. B) Nemamide A (m/z 757) and nemamide B (m/z 755) production in wild type and the nrps-1(reb8[ACP7_S307V]) mutant.


Figure 3-18. Sequencing data for nrps-1(reb8[ACP7_S307V]). A), Alignment of sequence file with wild type gene sequence. Portion of sequencing trace highlighted is shown in B).


3.4.3 Condensation (C) Domains

To study the functional roles of the C domains in nemamide biosynthesis, we mutated the conserved His to Ala in the PKS-1_C1 and NRPS-1_C3 domains. For NRPS-1_C4, one mutant is available with the mutations S1934N/D1971N, in which S1934N might disrupt the function of the conserved Asp1935 in the NRPS-1_C4 motif HHLISDA. The results in Figures 3-19, 3-20 and 3-21 show that the PKS-1_C1 mutant can still produce half the amount of nemamide A and nemamide B compared to wild type, but the other two C domain mutants could not produce any nemamides. Thus, the PKS-1_C1 domain might not be required for nemamide production, whereas the NRPS-1_C3 and C4 domains are essential. A previous study showed that the VibF_C1 domain in vibriobactin synthetase might play a structural role in dimerization to properly orient nearby modules but is not essential for catalysis109. By analogy, the PKS-1_C1 domain might play a structural rather than a catalytic role.

Figure 3-19. Properties of mutant pks-1(reb9[C1_H6685A]), nrps-1(reb10[C3_H1486A]) and nrps-1(gk186409[C4_S1934N];gk186410[C4_D1971N]). A), pks-1(reb9[C1_H6685A]) mutant generated by CRISPR-Cas9. Desired mutant was verified by single worm PCR (1098 bp) and SphI digestion to 750 bp and 348 bp. B), nrps-1(reb10[C3_H1486A]) mutant generated by CRISPR-Cas9. Desired mutant was verified by single worm PCR (993 bp) and AseI digestion to 658 bp and 335 bp. C), Nemamide A (m/z 757) and nemamide B (m/z 755) production in wild type and mutant pks-1(reb9[C1_H6685A]), nrps-1(reb10[C3_H1486A]) and nrps-1(gk186409[C4_S1934N];gk186410[C4_D1971N]). nrps-1(gk186409[C4_S1934N];gk186410[C4_D1971N]) mutant was obtained from CGC and backcrossed four times with wild type.

Figure 3-20. Sequencing data for pks-1(reb9[C1_H6685A]). A), Alignment of sequence file with wild type gene sequence. Portion of sequencing trace highlighted is shown in B).

Figure 3-21. Sequencing data for nrps-1(reb10[C3_H1486A]). A), Alignment of sequence file with wild type gene sequence. Portion of sequencing trace highlighted is shown in B).

3.4.4 Adenylation (A) Domains

Since the C domain in PKS-1 is not critically involved in nemamide biosynthesis, we also investigated the function of the A domain in PKS-1. We mutated Gly-7106, which lies in a loop region of the A domain that is critical for pyrophosphate binding, to Glu (Figure 3-22). No nemamide was produced in this A domain mutant, suggesting that, unlike the C domain of PKS-1, the A domain of PKS-1 plays an essential role in nemamide biosynthesis.

Figure 3-22. Properties of mutant pks-1(reb22[A1_G7106E]). A), pks-1(reb22[A1_G7106E]) mutant generated by CRISPR-Cas9. Desired mutant was verified by single worm PCR (1350 bp) and SalI digestion to 914 bp and 436 bp. B), nemamide production in mutant pks-1(reb22[A1_G7106E]).

Figure 3-23. Sequencing data for pks-1(reb22[A1_G7106E]). A), Alignment of sequence file with wild type gene sequence. Portion of sequencing trace highlighted is shown in B).

3.4.5 Thioesterase (TE) Domains

Since there are two TE domains, one at the C-terminus of each of PKS-1 and NRPS-1, we wanted to determine whether both TE domains are active. We generated three mutants by mutating the catalytic Ser to Ala or Cys, and none of these mutants could produce the nemamides (Figure 3-24, Figure 3-25, Figure 3-26 and Figure 3-27), indicating that both TE domains are essential for nemamide biosynthesis.

Figure 3-24. Properties of mutant pks-1(reb11[TE1_S7593A]), pks-1(reb13[TE1_S7593C]) and nrps-1(reb12[TE2_S2803A]). A), pks-1(reb11[TE1_S7593A]) mutant generated by CRISPR-Cas9. Desired mutant was verified by single worm PCR (562 bp) and SphI digestion to 355 bp and 207 bp. B), pks-1(reb13[TE1_S7593C]) mutant generated by CRISPR-Cas9. Desired mutant was verified by single worm PCR (574 bp) and KasI digestion to 364 bp and 210 bp. C), nrps-1(reb12[TE2_S2803A]) mutant generated by CRISPR-Cas9. Desired mutant was verified by single worm PCR (539 bp) and NheI digestion to 318 bp and 221 bp. D), Nemamide A (m/z 757) and nemamide B (m/z 755) production in wild type and mutant pks-1(reb11[TE1_S7593A]), pks-1(reb13[TE1_S7593C]) and nrps-1(reb12[TE2_S2803A]).
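The genotyping sizes quoted in the captions of Figures 3-17, 3-19, 3-22 and 3-24 can be cross-checked with a short script. This is a sketch only, assuming that each enzyme cuts the PCR product exactly once, so that the two fragment lengths should sum to the amplicon length. The record layout and the function names are hypothetical conveniences, and the numbers are copied from the captions.

```python
# (allele, enzyme, amplicon in bp, fragment sizes in bp), copied from the figure captions.
GENOTYPING = [
    ("nrps-1(reb8[ACP7_S307V])", "AatII", 820, (438, 382)),
    ("pks-1(reb9[C1_H6685A])", "SphI", 1098, (750, 348)),
    ("nrps-1(reb10[C3_H1486A])", "AseI", 993, (658, 335)),
    ("pks-1(reb22[A1_G7106E])", "SalI", 1350, (914, 436)),
    ("pks-1(reb11[TE1_S7593A])", "SphI", 562, (355, 207)),
    ("pks-1(reb13[TE1_S7593C])", "KasI", 574, (364, 210)),
    ("nrps-1(reb12[TE2_S2803A])", "NheI", 539, (318, 221)),
]

def digest_is_consistent(amplicon_bp, fragments):
    """Assumes a single cut: the fragment lengths must sum to the amplicon."""
    return sum(fragments) == amplicon_bp

def inconsistent_records(records):
    """Return the alleles whose reported fragments do not add up."""
    return [allele for allele, _enzyme, amplicon, frags in records
            if not digest_is_consistent(amplicon, frags)]

print(inconsistent_records(GENOTYPING))
```

An empty list means that every reported digest is arithmetically consistent with its amplicon. It does not by itself confirm the position of the restriction site.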

Figure 3-25. Sequencing data for pks-1(reb11[TE1_S7593A]). A), Alignment of sequence file with wild type gene sequence. Portion of sequencing trace highlighted is shown in B).

Figure 3-26. Sequencing data for pks-1(reb13[TE1_S7593C]). A), Alignment of sequence file with wild type gene sequence. Portion of sequencing trace highlighted is shown in B).

Figure 3-27. Sequencing data for nrps-1(reb12[TE2_S2803A]). A), Alignment of sequence file with wild type gene sequence. Portion of sequencing trace highlighted is shown in B).

3.5 Genome Mining of Accessory Enzymes

Another unusual feature of nemamide biosynthesis is that some enzymes that we would predict to be involved, such as a methyltransferase and an aminotransferase, are not present in PKS-1 and NRPS-1. To mine for additional genes or enzymes involved in nemamide biosynthesis, we surveyed the available gene expression databases and chose GExplore as the main one110,111. GExplore identifies genes that are highly enriched in specific cell types. Since both PKS-1 and NRPS-1 are expressed in the CAN neurons, we extracted genes whose expression is enriched at least 5-fold in CAN neurons over other cell types (Table 3-19). We analyzed the predicted functions of all of these genes and screened the available mutants for nemamide production. The results showed that, besides PKS-1 and NRPS-1, F49C12.10(nemt-1), acs-9, C32E8.6(acs-24), C24A3.4, T12G3.4, and Y71H2B.1 are also essential for nemamide biosynthesis. Among these, NEMT-1 is predicted to have methyltransferase activity, the two ACS proteins are predicted to have acyl-CoA ligase activity, C24A3.4 is a potential CoA transferase, T12G3.4 is predicted to be a lactonase, and Y71H2B.1 might be a fatty acyl-CoA binding protein. We also found that the candidate PPTase T04G9.4 may be involved in nemamide biosynthesis. Serendipitously, we found that the acyl-CoA oxidase ACOX-3, which is involved in ascaroside biosynthesis, is also involved in nemamide biosynthesis. Unfortunately, we were unable to implicate any aminotransferase in nemamide biosynthesis. In the later part of this chapter, we discuss the expression, activity and potential roles of these enzymes in nemamide biosynthesis.

Table 3-19. Genes with an enrichment ratio of more than 5-fold in the canal-associated neuron (CAN).

Enriched genes    Predicted functions
C24A3.4           Alpha-methylacyl-CoA racemase or CoA transferase
srv-1             Serpentine Receptor, class V
T22F3.12          Peptidyl-prolyl cis-trans isomerase
F49C12.10         Methyltransferase activity
dhhc-5            Protein-cysteine S-palmitoyltransferase activity
cyk-1             Formin Homology to Drosophila diaphanous and human DIAPH1
C32E8.6           AMP-dependent ligase
T05A1.5           Ortholog of human SLC (Solute carriers) family including SLC22A24
pks-1             Polyketide synthase activity
C08G5.6           Isoforms of an unfamiliar protein paralogous to AIN-1
C03F11.4          C03F11.4 is regulated by D-glucose and Dafa#1
ZK112.6           ZK112.6 is regulated by rsr-2 and sir-2.1
Y46H3C.7          Unknown
C41A3.2           Enriched in coelomocytes, germline and in PVD and OLL neurons
aexr-1            Predicted to have G-protein coupled receptor activity
R04A9.6           Unknown
acy-2             Adenylyl cyclase
C23H5.11          C23H5.11 is regulated by clk-1
F27C8.2           Predicted to have N-acetyltransferase activity
pqn-52            Encodes proteins with (Q/N)-rich ('prion') domains
pak-1             PAK-1 is required (redundantly with its paralog, MAX-2
T12G3.4           Predicted to have lactonase/hydrolase/esterase activity
Y54E5A.2          Y54E5A.2 is involved in spermatogenesis
nrps-1            Nonribosomal peptide synthetase activity
F27C8.3           F27C8.3 is regulated by rsr-2, clk-1, let-418, chd-3, and dpy-21
ace-3             One of four C. elegans acetylcholinesterases (AChE)
ZK112.5           ZK112.5 is enriched in the coelomocytes and male-specific tissues
F13H8.9           Ortholog of human SCLY (selenocysteine lyase)
lips-13           In lipid storage; lips-13 is predicted to have hydrolase activity
F09C12.6          Predicted to have G-protein coupled receptor activity
moc-1             Ortholog of human GEPHYRIN
F42C5.6           Involved in embryo development
K10C2.12          K10C2.12 is regulated by npr-1
acbp-6            Encodes protein containing a functional acyl-CoA-binding domain
acs-9             Ortholog of human ACSBG1 (acyl-CoA synthetase family member 1)
swt-1             Ortholog of human SLC50A1 (solute carrier family 50 member 1)
Y71H2B.1          Predicted to have fatty-acyl-CoA binding activity
F40E3.5           Predicted to have hydrolase activity

Dataset was retrieved from GExplore (http://genome.sfu.ca/gexplore/gexplore_search_tissues.html)110. Genes in green were essential for nemamide biosynthesis, genes in red were shown to be not essential, and genes in blue were not verified but are thought not to be involved. The enrichment ratio is the ratio of gene expression in the most highly expressing cell type to that in the second most highly expressing cell type.
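The enrichment-ratio filter described for Table 3-19 can be sketched as follows. This is an illustration under stated assumptions, not the GExplore implementation: the input is assumed to be a per-gene mapping from cell type to expression level, the gene names and values in `toy` are hypothetical, and the function names are invented for this example.

```python
import math

def enrichment(expr_by_cell_type):
    """Return (top cell type, ratio of highest to second-highest expression).

    Expects at least two cell types per gene.
    """
    ranked = sorted(expr_by_cell_type.items(), key=lambda kv: kv[1], reverse=True)
    (top_cell, top), (_second_cell, second) = ranked[0], ranked[1]
    if top == 0:
        return top_cell, 0.0          # gene is not expressed anywhere
    ratio = math.inf if second == 0 else top / second
    return top_cell, ratio

def can_enriched(genes, min_ratio=5.0):
    """Genes whose top cell type is CAN with at least min_ratio enrichment."""
    hits = []
    for gene, expr in genes.items():
        cell, ratio = enrichment(expr)
        if cell == "CAN" and ratio >= min_ratio:
            hits.append(gene)
    return sorted(hits)

# Hypothetical expression values (arbitrary units), for illustration only.
toy = {
    "geneA": {"CAN": 120.0, "intestine": 10.0, "muscle": 4.0},  # 12-fold in CAN
    "geneB": {"CAN": 30.0, "intestine": 20.0, "muscle": 1.0},   # 1.5-fold, filtered out
    "geneC": {"CAN": 5.0, "intestine": 80.0, "muscle": 2.0},    # enriched elsewhere
}
print(can_enriched(toy))
```

Using the second-highest cell type as the denominator, as in the table footnote, makes the filter stricter than comparing against an average over all other cell types.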

3.6 Functional Analysis of Accessory Enzymes

3.6.1 Transcriptional and Translational Reporters of Biosynthetic Genes

To further verify the expression pattern of each of these candidate genes in nemamide biosynthesis, we analyzed a transcriptional reporter strain in which gfp was placed under the control of the candidate gene’s native promoter (Figure 3-28), as well as a translational reporter strain in which the native promoter and gene sequence were inserted into an SL2-type expression vector, upstream of mcherry (Figure 3-29). The results showed that the methyltransferase nemt-1, the fatty acyl-CoA binding protein Y71H2B.1, and the two acyl-CoA ligases acs-9 and acs-24 are expressed mostly in the CAN neuron. However, the “CoA transferase” C24A3.4 and the “hydrolase” T12G3.4 are expressed in both the CAN neuron and the intestine, suggesting that they might also participate in other biological processes.

Figure 3-28. Expression of the transcriptional reporters A), Pnemt-1::gfp, B), Pacs-9::gfp or C), Pacs-24::gfp, as well as CAN::mcherry (marker for CAN neurons) in transgenic worms. Scale bar, 20 μm.

Figure 3-29. Expression of the translational reporters A), Pnemt-1::nemt-1::sl2::mcherry, B), Pacs-24::acs-24::sl2::mcherry, C), Pacs-9::acs-9::sl2::mcherry, D), PY71H2B.1::Y71H2B.1::sl2::mcherry, E), PC24A3.4::C24A3.4::sl2::mcherry, F), CAN::T12G3.4::sl2::mcherry. Scale bar, 50 μm.

3.6.2 Generation of Deletion Mutants by CRISPR-Cas9

For those genes with no available mutants, we generated deletion mutants by CRISPR-Cas9 (Figure 3-30): one allele for nemt-1 (Figure 3-31), one for C24A3.4 (Figure 3-32), and two alleles each for acs-9 (Figure 3-33 and Figure 3-34) and acs-24 (Figure 3-35 and Figure 3-36).

Figure 3-30. Verification of biosynthetic gene mutants generated by CRISPR-Cas9. 1 kb DNA ladder was used for all gels. Length of PCR products for each: A), wild type 981 bp, nemt-1(reb15) deletion mutant 702 bp; B), wild type 1863 bp, C24A3.4(reb24) deletion mutant 502 bp; C), wild type 1723 bp, acs-9(reb21) deletion mutant 1570 bp; D), wild type 1723 bp, acs-9(reb28) deletion mutant 1171 bp; E), wild type 1405 bp, acs-24(reb23) deletion mutant 424 bp and acs-24(reb24) deletion mutant 412 bp.

Figure 3-31. Sequencing data for nemt-1(F49C12.10, reb15) deletion. Alignment of sequence file with wild type gene sequence generated by MAFFT. nemt-1 deletion contains 306 bp deletion (green portion) with 27 bp insertion (red portion).

Figure 3-32. Sequencing data for C24A3.4(reb24) deletion. Alignment of sequence file with wild type gene sequence generated by MUSCLE. C24A3.4 deletion contains 1361 bp deletion (green portion).

Figure 3-33. Sequencing data for acs-9(reb21) deletion. Alignment of sequence file with wild type gene sequence generated by MUSCLE. acs-9(reb21) deletion contains 154 bp deletion (green portion) with single base ‘T’ insertion (red portion).

Figure 3-34. Sequencing data for acs-9(reb28) deletion. Alignment of sequence file with wild type gene sequence generated by MUSCLE. acs-9(reb28) deletion contains 552 bp deletion (green portion).

Figure 3-35. Sequencing data for acs-24(reb23) deletion. Alignment of sequence file with wild type gene sequence generated by MAFFT. acs-24(reb23) deletion contains 991 bp deletion (green portion) with 10 bp insertion (red portion).

Figure 3-36. Sequencing data for acs-24(reb24) deletion. Alignment of sequence file with wild type gene sequence generated by MAFFT. acs-24(reb24) deletion contains 993 bp deletion (green portion).
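The PCR product sizes in Figure 3-30 and the deletion and insertion sizes in Figures 3-31 to 3-36 can be reconciled with simple arithmetic: mutant amplicon = wild-type amplicon - deleted bases + inserted bases. The sketch below uses the numbers from those captions. The dictionary layout and the function name are hypothetical conveniences for this illustration.

```python
# allele: (wild-type amplicon, deleted bp, inserted bp, reported mutant amplicon)
DELETIONS = {
    "nemt-1(reb15)": (981, 306, 27, 702),
    "C24A3.4(reb24)": (1863, 1361, 0, 502),
    "acs-9(reb21)": (1723, 154, 1, 1570),
    "acs-9(reb28)": (1723, 552, 0, 1171),
    "acs-24(reb23)": (1405, 991, 10, 424),
    "acs-24(reb24)": (1405, 993, 0, 412),
}

def expected_mutant_amplicon(wt_bp, deleted_bp, inserted_bp=0):
    """Amplicon length after removing deleted_bp and adding inserted_bp."""
    return wt_bp - deleted_bp + inserted_bp

for allele, (wt, deleted, inserted, reported) in DELETIONS.items():
    expected = expected_mutant_amplicon(wt, deleted, inserted)
    status = "ok" if expected == reported else "MISMATCH"
    print(f"{allele}: expected {expected} bp, reported {reported} bp ({status})")
```

For example, the nemt-1(reb15) allele gives 981 - 306 + 27 = 702 bp, which matches the band size reported in Figure 3-30A.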

3.6.3 Nemamide Production in Mutants and Rescued Mutants

To confirm that these genes are involved in nemamide biosynthesis, we grew all of the mutant worm strains and analyzed their ability to produce nemamides. None of these mutants could produce nemamide A or nemamide B (Figure 3-37). In addition, we reintroduced each gene into the corresponding mutant under the control of its own promoter or a CAN neuron-specific promoter to test whether it could rescue nemamide production, and nemamide production was rescued in all of these strains (Figure 3-38). Furthermore, the inability of acs-9 to rescue an acs-24 mutant and the inability of acs-24 to rescue an acs-9 mutant (Figure 3-39) indicate that these two genes do not have redundant activities.

Figure 3-37. Nemamide production in wild type and nemt-1, acs-24, acs-9, Y71H2B.1, C24A3.4 and T12G3.4 mutants. A), nemamide A, m/z 757 and B) nemamide B, m/z 755.

Figure 3-38. Nemamide production in wild type and nemt-1, acs-24, acs-9, Y71H2B.1, C24A3.4 and T12G3.4 rescue strains. A), nemamide A, m/z 757 and B) nemamide B, m/z 755. Mutants were rescued by complementing with SL2::mcherry plasmids under control of gene promoters::genes, except T12G3.4, whose cDNA was expressed under control of CAN neuron specific promoter.

Figure 3-39. Inter-rescue of nemamide production in acs-9 and acs-24 mutants by injecting Pacs-24::acs-24::sl2::mcherry or Pacs-9::acs-9::sl2::mcherry.

3.6.4 ACOX-3 Functions in both Ascaroside and Nemamide Biosynthesis

Serendipitously, we found that the acyl-CoA oxidase gene acox-3 is expressed in both the intestine and the CAN neurons. The function of ACOX-3 in the intestine is to tailor the side chains of longer-chain ascarosides and indole-containing ascarosides106. Given the expression of acox-3 in the CAN neurons, we wanted to know whether acox-3 plays a role in nemamide production. After analyzing several mutants in ascaroside biosynthetic genes for nemamide production (Figure 3-40), we found that the acox-3 mutant produces no nemamides and the acox-1.1 mutant produces a very low level of nemamides. The other mutants still produce substantial amounts of nemamides. Since acox-3 is expressed in both the CAN neuron and the intestine, we performed tissue-specific expression of acox-3 under the control of its own promoter, a CAN neuron-specific promoter, or the intestine-specific ges-1 promoter (Figure 3-41). Worms produce the nemamides only when acox-3 is expressed in the CAN neuron.

Figure 3-40. Nemamide production in wild type and ascaroside biosynthetic mutants acox-1.1(ok2257), acox-1.3(reb4, stop), acox-1.4(tm6415), acox-1.5(ok2619), acox-3(tm4033), acox-3(tm4033);acox-1.4(tm6415) double, and daf-22(ok693) mutants. A), nemamide A, m/z 757 and B) nemamide B, m/z 755.

Figure 3-41. Functional properties of acox-3 in nemamide production. A), Translational expression of Pacox-3::gfp::acox-3 in wild type L4 worms, scale bar 5 µm. B), Transcriptional expression of Pacox-3::acox-3::sl2::mcherry in acox-3(tm4033) mutant. C), Nemamide production in wild type, acox-3(tm4033), and integrated transgenic worm strains. acox-3 was transgenically expressed under control of acox-3 promoter, CAN neuron specific promoter and intestine-specific promoter ges-1.

3.6.5 Genes Expressed in Both the CAN Neuron and the Intestine

As mentioned above, three genes associated with nemamide biosynthesis are expressed in both the CAN neuron and the intestine, and acox-3 was shown to be involved in the production of longer-chain ascarosides and indole-ascarosides106. We therefore analyzed ascaroside production in single or double mutants of acox-3, the “CoA transferase” C24A3.4 and the “hydrolase” T12G3.4 (Figure 3-42). The acox-3 and C24A3.4 single and double mutants have similar ascaroside profiles, suggesting that these genes might function in the same pathway in IC-asc production, whereas T12G3.4 appears to have no effect on ascaroside production. Moreover, all three of the encoded proteins contain peroxisome-targeting signals112 (ACOX-3_SKL, C24A3.4_SKL and T12G3.4_AKL), suggesting that they might play related roles in peroxisome-like organelles.
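A check for tripeptide signals such as SKL and AKL can be sketched as below. This is a simplified illustration, not the prediction method cited in the text. It assumes that the signal is the C-terminal tripeptide and uses a commonly quoted PTS1-like consensus, [S/A/C]-[K/R/H]-[L/M], as the pattern. The sequences in `toy` are hypothetical and only their last three residues mirror the signals named above.

```python
import re

# Assumed PTS1-like consensus for the C-terminal tripeptide: [S/A/C]-[K/R/H]-[L/M].
PTS1_LIKE = re.compile(r"[SAC][KRH][LM]$")

def has_pts1_like_tail(protein_seq):
    """True if the last three residues fit the assumed consensus."""
    # Strip a trailing '*' that some translated sequences carry as a stop symbol.
    return bool(PTS1_LIKE.search(protein_seq.upper().rstrip("*")))

# Hypothetical toy sequences; only the C-terminal tripeptides come from the text.
toy = {
    "ACOX-3-like": "MTEAQSKL",
    "T12G3.4-like": "MGGLVAKL",
    "control": "MGGLVAEE",
}
print({name: has_pts1_like_tail(seq) for name, seq in toy.items()})
```

A tripeptide match alone is weak evidence for peroxisomal import, so a dedicated predictor and localization experiments would still be needed.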

[Figure 3-42 plot. Y-axis: Signal Intensity (0.00E+00 to 1.00E+08). Legend: wild type; C24A3.4(reb16); acox-3(tm4033); acox-3(tm4033)/C24A3.4(reb16); T12G3.4(gk961508).]

Figure 3-42. Ascaroside production in wild type and mutants in genes expressed in both the CAN neuron and the intestine. Mutants used here are C24A3.4(reb16), acox-3(tm4033), T12G3.4(gk961508) and double mutant acox-3(tm4033)/C24A3.4(reb16). Data represent the mean ± SD of three independent experiments.

3.6.6 Phosphopantetheinyl Transferase (PPTase)

We also analyzed the expression of two phosphopantetheinyl transferases (PPTases), T04G9.4 and T28H10.1, in worms42. No expression was seen for T28H10.1, but T04G9.4 is expressed in multiple tissues, including the hypodermal seam cells, the intestine and the CAN neurons (Figure 3-43).

Figure 3-43. Transcriptional reporter of PPTase T04G9.4 in wild-type L4 worms.

3.6.7 ACOX-3 Catalyzes Fatty-acyl CoA into ∆Fatty-acyl CoA

The acyl-CoA oxidase ACOX-3 was previously implicated in the processing of longer-chain and indole-containing ascarosides106. The enzyme was expressed in E. coli using previously published protocols113, purified, and analyzed in an HPLC-based assay. The results in Figure 3-44 reveal that ACOX-3 has broad substrate selectivity, accepting the C6, C8, and C12 CoA esters.

Figure 3-44. Enzymatic activity of ACOX-3. A) Protein expression of ACOX-3, B) Proposed function of ACOX-3 in catalyzing fatty-acyl CoA into ∆fatty-acyl CoA, C) ACOX-3 activity on different lengths of fatty acid-CoAs. Standards used were C4-CoA with m/z [M-H]- 836; C6-CoA with m/z [M-H]- 864; C8-CoA with m/z [M-H]- 892 and C12-CoA with m/z [M-H]- 948. All ∆fatty-acyl CoA products were confirmed by LC-MS.

3.6.8 ACS-24 Activates Fatty Acids into Fatty-acyl CoA

The ACS-24 enzyme was expressed in E. coli and purified, and its activity towards fatty acids of different chain lengths was examined (Figure 3-45). As an acyl-CoA synthetase, ACS-24 has high activity towards the C8, C9, and C10 fatty acids and moderate activity towards the C6 fatty acid.

Figure 3-45. Enzymatic activity of ACS-24. A) Protein expression of ACS-24, B) Proposed function of ACS-24 in activating fatty acids into fatty-acyl CoA, C) ACS-24 activity on different lengths of fatty acids. Standards used were C4-CoA with m/z [M-H]- 836; C6-CoA with m/z [M-H]- 864; C8-CoA with m/z [M-H]- 892 and C12-CoA with m/z [M-H]- 948. All products were confirmed by LC-MS.
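The m/z values of the acyl-CoA standards in Figures 3-44 and 3-45 follow a simple pattern: each additional CH2 unit adds 14 to the nominal [M-H]- value, and introducing one double bond removes two hydrogens. The sketch below extrapolates from the C4-CoA value given in the captions. The values for C9 and C10 and for the unsaturated products are predictions from this arithmetic, not measurements reported here, and the function names are hypothetical.

```python
C4_COA_MH = 836  # nominal [M-H]- of C4-CoA, from the figure captions
CH2 = 14         # nominal mass of one methylene unit

def saturated_acyl_coa_mh(n_carbons):
    """Nominal [M-H]- of a saturated, straight-chain Cn acyl-CoA."""
    if n_carbons < 4:
        raise ValueError("sketch is anchored at C4-CoA; use n_carbons >= 4")
    return C4_COA_MH + CH2 * (n_carbons - 4)

def enoyl_coa_mh(n_carbons):
    """Nominal [M-H]- after one desaturation (loss of two hydrogens)."""
    return saturated_acyl_coa_mh(n_carbons) - 2

for n in (4, 6, 8, 9, 10, 12):
    print(f"C{n}-CoA: [M-H]- {saturated_acyl_coa_mh(n)}, enoyl product {enoyl_coa_mh(n)}")
```

The function reproduces the caption values for C6-CoA (864), C8-CoA (892) and C12-CoA (948), which is a quick sanity check on the standards before extracting ions for the other chain lengths.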

3.6.9 ACS-9 has Fatty-acyl AMP Ligase Activity

We also tested the activity of another potential ACS enzyme, ACS-9 (Figure 3-46). In contrast to ACS-24, ACS-9 could only convert different fatty acids into fatty acyl-AMPs, not into fatty acyl-CoAs, suggesting that it might be involved in the activation of nemamide biosynthetic intermediates. Activation of fatty acids or other starter units has previously been reported in multiple natural product biosynthetic pathways28,30,114-118.

Figure 3-46. Enzymatic activity of ACS-9 towards different fatty acids. ACS-9 activates a broad range of fatty acids as fatty acyl-AMPs, but not as fatty acyl-CoAs. All the peaks were verified by ion extraction of the corresponding fatty acyl-AMP in LC-MS.

3.7 Complete Dissection of Nemamide Biosynthesis

To completely understand the biosynthesis of the nemamides, we characterized the functional contributions of the biosynthetic domains and genes by extracting the biosynthetic intermediates that accumulated in the inactivated mutants. Worm extracts from each strain were processed through silica gel chromatography and ion exchange chromatography (HP20 column). A synthetic compound representing a possible intermediate, expected to be the most non-polar one, was synthesized by Abdul Rouf Dar in the lab and used as an internal standard to identify the silica gel fractions containing potential intermediates. Intermediates 2, 3, 4, 5, 6, and 7 were all detected in the methanol fraction of the silica gel column, whereas the internal standard was found in the ethyl acetate/methanol (1:1) fraction. Other intermediates, such as 1 and some proposed short intermediates, have not been found so far. Intermediates in most of the nemamide biosynthetic mutants were present in very small amounts, so about 3~5 grams of dried worms from 2~3 L of CeHR medium were processed for each single or double mutant to obtain enough material for HR-LCMS and MS/MS analysis.

3.7.1 Functional Contribution of NRPS-1 Domains

As mentioned above, NRPS-1 mutants with inactivated TE2, C4, C3 or ACP7 domains could not produce any nemamides. As shown in Figure 3-47A, they also produced distinct sets of intermediates. Intermediates 2, 3, 4, and 5 were present in both the C4 and TE2 mutants (Figure 3-47B and C), intermediates 4 and 5 were produced in the C3 mutant (Figure 3-47D), and intermediate 5 was the only one detected in the ACP7 mutant (Figure 3-47E). Proposed intermediate 1 was not detected in either the TE2 or the C4 mutant, suggesting that the NRPS-1_C4-A3-PCP4-TE2 module participates in loading the last L-Asn (A3 domain), condensing it (C4 domain), and the final macrocyclization (TE2 domain). That intermediates 2 and 3 were extracted from the C4 mutant but not the C3 mutant indicates that the C3-PCP4 module is involved in loading two D-Asn or, possibly, two L-Asn, although no obvious epimerase was found. Intermediate 4 was found in the C3 mutant but not the ACP7 mutant, which suggests that β-alanine is incorporated by the first module of NRPS-1, C2-A2-PCP3, not by the C-terminal NRPS module of PKS-1, as we originally proposed. Since disruption of the TE2, C4, C3 or ACP7 domains did not affect the production of intermediate 5, this intermediate most likely originates from PKS-1 and is transferred onto NRPS-1_ACP7 for further elongation.
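The reasoning above orders the NRPS-1 domains by which intermediates still accumulate when each domain is inactivated: a domain whose mutant accumulates fewer intermediates acts earlier. The sketch below encodes the detected intermediates reported in the text and derives that order from set nesting. It is a simplified illustration of the logic, and the function name is hypothetical. It treats the pathway as strictly linear and cannot separate domains whose mutants give identical sets.

```python
# Intermediates detected in each NRPS-1 domain mutant, as reported in the text.
DETECTED = {
    "TE2": {2, 3, 4, 5},
    "C4": {2, 3, 4, 5},
    "C3": {4, 5},
    "ACP7": {5},
}

def order_by_nested_sets(detected):
    """Group mutants into tiers, earliest-acting domain first.

    Raises ValueError if the detected sets are not nested, because then a
    simple linear ordering is not supported by the data.
    """
    tiers = {}
    for domain, found in detected.items():
        tiers.setdefault(frozenset(found), []).append(domain)
    ordered_sets = sorted(tiers, key=len)
    for smaller, larger in zip(ordered_sets, ordered_sets[1:]):
        if not smaller < larger:  # proper-subset test
            raise ValueError("intermediate sets are not nested")
    return [sorted(tiers[s]) for s in ordered_sets]

print(order_by_nested_sets(DETECTED))
```

In this encoding C4 and TE2 fall into the same tier, which reflects the observation that proposed intermediate 1 was not detected in either mutant.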

Figure 3-47. Intermediates extracted from NRPS-1 mutant worms. A), Proposed biosynthetic properties of NRPS-1 in nemamide production. EIC traces of intermediates extracted from mutants with single mutations in four domains: B), NRPS-1_TE2, C), NRPS-1_C4, D), NRPS-1_C3 and E), NRPS-1_ACP7.

3.7.2 Unusual Trafficking Between PKS-1 and NRPS-1 by ACS-9

To investigate the production of intermediate 5 by PKS-1 and its transfer from PKS-1 to NRPS-1_ACP7, we extracted the pks-1_TE1 mutant, the acs-9 mutant and the acs-9;pks-1_TE1 double mutant. As shown in Figure 3-48, intermediate 5 was found only in fractions from acs-9 mutant worms, not in the pks-1_TE1 single mutant or the acs-9;pks-1_TE1 double mutant. This result indicates that intermediate 5 originates from PKS-1 and that ACS-9 functions downstream of PKS-1_TE1 by activating free intermediate 5 into its active AMP form.

Figure 3-48. Proposed trafficking mechanism between PKS-1 and NRPS-1. ACS-9 facilitates the transfer of intermediate 5 from PKS-1 to NRPS-1.

3.7.3 The Freestanding Methyltransferase Incorporates the Methyl Group

To determine how the methyl group in the polyketide chain is installed, we extracted mutants lacking methyltransferase activity. The non-methylated intermediate 6 was found in the nemt-1 mutant and in the nemt-1;acs-9 double mutant, but not in the nemt-1;pks-1_TE1 double mutant (Figure 3-49). We did not detect any of the other proposed short intermediates, which were also not found in any mutant containing inactivated PKS-1_TE1. These data suggest that NEMT-1 catalyzes the methylation step upstream of ACS-9. Moreover, since no longer intermediates were detected in the nemt-1 single mutant or the nemt-1;acs-9 double mutant, ACS-9 appears to have very strict substrate specificity for methylated intermediates. Because we did not detect any methylated lactone intermediates but did find intermediate 7 in the nemt-1 mutant (see the T12G3.4 section below), we propose that NEMT-1 functions downstream of PKS-1_TE1.

Figure 3-49. Functional characterization of the methyltransferase NEMT-1. Proposed function of NEMT-1 in incorporating the methyl group. If no NEMT-1 is present, a non-methylated intermediate 6 will be produced.

3.7.4 T12G3.4 Functions as a Lactonase to Hydrolyze the Product of PKS-1

Since the methylated intermediate 5 and the non-methylated intermediate 6 were both found in extractions, we wanted to know how wild-type worms control the fidelity of nemamide biosynthesis. We therefore focused on another enzyme, T12G3.4, which is predicted to have lactonase, hydrolase or thioesterase activity (Figure 3-50A). In our early experiments on nemamide production in the T12G3.4 mutant, we did not detect any nemamides in a small-scale extraction. However, as shown in Figure 3-50B, in a large-scale extraction the T12G3.4 mutant produced about 50% as much nemamide A and about 1% as much nemamide B as wild type. These conflicting results might arise from the different culture conditions of the worms (3-day bacteria-fed culture in S medium for the small-scale extraction and 9-day axenic culture in CeHR for the large-scale extraction). That the pathway still proceeds in the T12G3.4 mutant is further shown by blocking the activity of ACS-9, since we detected intermediate 5 in the T12G3.4;acs-9 double mutant (Figure 3-50C). We were able to detect intermediate 6 in the T12G3.4 deletion mutant (Figure 3-50D), even though intermediate 6 would be expected to be converted into intermediate 5 and then into the nemamides. We assume that the excess intermediate 6 detected was formed by non-enzymatic hydrolysis of intermediate 7 during worm storage, extraction or LC analysis. We further generated the T12G3.4;nemt-1 and T12G3.4;acs-9 double mutants, in which we found intermediates 6 and 7 (Figure 3-50D). These results suggest that in wild-type worms T12G3.4 catalyzes the hydrolysis of the lactone 7 into 6 to accelerate the production of the nemamides, whereas in the T12G3.4 mutant intermediate 7 can be converted into 6 non-enzymatically. The detection of intermediate 6 and/or 7 in the nemt-1 mutant and the nemt-1;acs-9 double mutant is also consistent with the ‘rate-limiting’ role of T12G3.4. Moreover, no intermediates were found in any mutant carrying the PKS-1_TE1 mutation. Together, these data suggest that T12G3.4 functions downstream of PKS-1 and upstream of both NEMT-1 and ACS-9. A comparable enzyme has been studied in neoantimycin biosynthesis, where NatG acts as a proofreading thioesterase that removes aberrant intermediates from the PKS/NRPS assembly line119.
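The epistasis argument of Sections 3.7.2 to 3.7.4 can be summarized as a small model of the proposed post-PKS-1 order: TE1 releases lactone 7, T12G3.4 hydrolyzes it to 6, NEMT-1 methylates 6 to 5, and ACS-9 hands 5 to NRPS-1. The sketch below is a deliberately simplified illustration of that proposal. The step marked as bypassable encodes the assumption, taken from the text, that 7 can also hydrolyze non-enzymatically. The function reports only the most advanced product a genotype can reach, so it does not capture co-accumulating species such as 7 alongside 6. All names in the code are hypothetical.

```python
# Proposed post-PKS-1 order from the text: (gene, product formed, bypassable)
STEPS = [
    ("pks-1_TE1", "7", False),      # TE1 releases lactone 7
    ("T12G3.4", "6", True),         # hydrolysis of 7 can also occur non-enzymatically
    ("nemt-1", "5", False),         # methylation of 6 gives 5
    ("acs-9", "nemamides", False),  # hand-off to NRPS-1; later steps collapsed
]

def furthest_product(mutated_genes):
    """Return the most advanced product reachable in a genotype, or None."""
    reached = None
    for gene, product, bypassable in STEPS:
        if gene in mutated_genes and not bypassable:
            break  # pathway is blocked at this step
        reached = product
    return reached

for genotype in [set(), {"acs-9"}, {"nemt-1"}, {"nemt-1", "acs-9"},
                 {"T12G3.4", "acs-9"}, {"acs-9", "pks-1_TE1"}]:
    label = ";".join(sorted(genotype)) or "wild type"
    print(f"{label}: {furthest_product(genotype)}")
```

Under these assumptions the model gives 5 for the acs-9 and T12G3.4;acs-9 mutants, 6 for the nemt-1 and nemt-1;acs-9 mutants, and no intermediate for any genotype carrying the PKS-1_TE1 mutation, in line with the observations described above.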

Figure 3-50. Characterization of the lactonase enzyme T12G3.4. A), The proposed model of post-PKS-1 tailoring. B), Nemamide production in T12G3.4 single and double mutants. C) and D), EIC traces of intermediates 5, 6, and 7 in all the mutants listed in this figure.

3.7.5 PKS-1 C-terminal Module Might Have an Aminotransferase Function

To investigate how the amino group is formed, we inactivated the PKS-1_C1 and PKS-1_A1 domains. The C1 domain mutant retained the ability to produce the nemamides, but in reduced amounts. However, inactivation of the A1 domain blocked the production of any nemamides (Figure 3-51). This result suggests that the C-terminal NRPS module in PKS-1 plays an essential function, and given its placement in the biosynthetic pathway, one potential function would be the addition of the amino group. In addition, TMHMM prediction of transmembrane regions indicated that the C-terminal region of PKS-1 may have a transmembrane domain and that the PKS-1_PCP2 domain might localize to the opposite side of the membrane from the rest of the PKS-1 megasynthase. We are currently extracting intermediates from the A1 mutant to see whether we can detect any intermediates lacking the amino group.

Figure 3-51. Characterization of PKS-1 C terminal module. Proposed aminotransferase role of the PKS-1 C terminal NRPS module. PKS-1_C1 is shown to be inactive, labeled by asterisk. EIC traces indicate the nemamide production in wild type, PKS-1_C1 domain mutant and PKS-1_A1 domain mutant.

3.7.6 ACS-24 and ACOX-3 Assisted Initiation

Since ACS-24 can activate fatty acids of different chain lengths as fatty acyl-CoAs and ACOX-3 can introduce an α,β-unsaturated double bond into fatty acyl-CoAs, we would like to determine whether ACOX-3 acts on a free fatty acyl-CoA or an ACP-bound form. Because no enoylreductase (ER) could be identified within PKS-1 or encoded elsewhere in the genome and acting in trans, we speculate that the saturated tail of the nemamides might derive from the initial loading unit(s), specifically the C8 fatty acid for nemamide A and the C6 fatty acid for nemamide B. Two models are proposed for initiation (Figure 3-52). All of the related enzymes and functional domains were overexpressed in E. coli and purified. We are currently analyzing potential modifications of the ACP1 domain using a MALDI-TOF MS-based assay.

Figure 3-52. Proposed initiation mechanism via PKS-1_KS-AT-ACP module for nemamide A biosynthesis. A) and B) are two models of loading selectivity of fatty acid C8. Difference of two models is the timing to incorporate C8-CoA. C), Purified initiating domains of 1, ACS-24 (~53 kDa); 2, ACOX-3 (~76 kDa); 3, SFP (~27 kDa); 4, PKS-1_AT1 (~39 kDa); 5, PKS-1_ACP1 (~11 kDa); 6, PKS-1-AT1ACP1 (~49 kDa).

3.8 Discussion and Future Work

To summarize, the biosynthesis of the nemamides (Figure 3-53) is initiated by ACS-24, which activates the free C8 fatty acid into C8-CoA to produce nemamide A and the C6 fatty acid into C6-CoA to make nemamide B. These CoA esters could be loaded onto the PKS-1_ACP1 domain through the initiating KS1-AT1-ACP1 module. Either the free fatty acyl-CoAs or the ACP1-bound starter units could be modified by ACOX-3 to introduce an α,β-unsaturated double bond. C24A3.4 and Y71H2B.1 might be involved in the mobilization of free fatty acids or fatty acyl-CoA esters; no intermediates have been extracted from the C24A3.4 and Y71H2B.1 mutant worms. After a certain number of iterative cycles, the growing intermediate is further extended by two PKS modules in an assembly-line manner. The C-terminal NRPS module in PKS-1 may participate in two types of reactions: the A1 domain might facilitate the formation of the amino group, which is essential for the final macrocyclization, and the TE1 domain releases the lactone product 7 from PKS-1. The product released from PKS-1 is hydrolyzed by T12G3.4 to intermediate 6, methylated by NEMT-1 to intermediate 5, and then activated by ACS-9 as an AMP ester and transferred onto the NRPS-1_ACP7 domain. The NRPS-1_A2 domain recognizes the amino acid β-alanine, and C2 catalyzes the condensation reaction to produce intermediate 4. The NRPS-1_A3 domain could possibly load D-Asn and/or L-Asn onto PCP4 to form intermediates 2 and 3 in a reaction catalyzed by the C3 domain. It is possible that an unidentified epimerase acts in trans or that an unidentified embedded enzyme catalyzes the conversion of L-Asn to D-Asn. The NRPS-1_A3 domain could also load the third L-Asn onto PCP5, and the TE2 domain catalyzes the final macrocyclization to make the nemamides. Future work will focus on the formation of the amino group, the order of the initiating steps, further confirmation of the lactone intermediate 7, and the trafficking role of ACS-9 in transferring intermediate 5 to NRPS-1.

Figure 3-53. Biosynthetic properties of the nemamide A.


CHAPTER 4 INVESTIGATION OF NEMAMIDE-LIKE MOLECULES IN OTHER NEMATODES

4.1 Chromosomal Locus of Nemamide Biosynthetic Genes

In the previous chapter, we showed that PKS-1 and NRPS-1 are conserved in most nematode species, so we analyzed and compared the other seven nemamide biosynthetic genes in C. briggsae, P. pacificus and B. malayi (Figure 4-1).

Nematodes have six chromosomes, and the nemamide biosynthetic genes may have been scattered across different chromosomes during horizontal gene transfer 40. Genes such as pks-1 and acs-9 are mainly located on Chr X, and acox-3 and T12G3.4 are mainly located on Chr IV, but other genes are present on multiple chromosomes, so it would be very intriguing to learn whether these nematodes produce the same kinds of nemamides.

Figure 4-1. Chromosomal locations of nine nemamide biosynthetic genes in nematode C. elegans (red), C. briggsae (green), P. pacificus (blue) and B. malayi (pink).


4.2 Experimental Methods

4.2.1 Small-Scale Extraction of the Nemamides in Caenorhabditis species

Caenorhabditis strains used for small-scale extractions were: C. elegans (N2), C. briggsae (AF16), C. remanei (PB4641), C. brenneri (PB2801), C. japonica (DF5081), C. angaria (PS1010), and C. tropicalis (JU1373). Strains were each grown at room temperature on two NGM agar plates (10 cm) spread with 0.75 mL of 25X OP50 until the food on the plates was almost gone. Then, the worms were transferred to 1 L Erlenmeyer flasks containing S medium (350 mL). The worm cultures were fed with 3.5 mL of 25X OP50 every day and grown at 22.5 °C for 3-5 d until no food was left. For sample collection, the culture flasks were placed in an ice bath for 30 min to 1 h to settle the worms. Then, the worms were transferred from the bottom of the flasks to a 50 mL centrifuge tube and were centrifuged (1000 rpm for 5 min) to separate the worms from the worm medium. The process was repeated until most of the worms were removed from the flasks. The collected worms were washed with water three times and centrifuged (1000 rpm for 5 min), and then they were soaked in 10 mL of water for 1 h in a shaking incubator (22.5 °C, 225 rpm) to remove bacteria from their digestive tract. The worms were collected by centrifugation and were freeze-dried. The dried worm pellets were ground with sea sand (2 g sand per 200 mg dried worms) using a mortar and pestle. The ground worms were extracted with 15 mL of 190 proof ethanol for 3.5 h, and the extract was centrifuged (3500 rpm for 20 min). The supernatant was collected and dried using a speedvac. The dried worm samples were each resuspended in 100 μL of methanol, sonicated (if needed), and centrifuged (15000 rpm for 1 min) before analysis by LC-MS. The extracts were analyzed using a Luna 5μ C18(2) column (100 × 4.6 mm; Phenomenex) coupled with an Agilent 6130 single quad mass spectrometer operating in single ion monitoring (SIM) mode for nemamide A (m/z 757) and nemamide B (m/z 755). A solvent gradient was used: 95% buffer A, 5% buffer B, 0 min; 0% buffer A, 100% buffer B, 20 min; 0% buffer A, 100% buffer B, 22 min; 95% buffer A, 5% buffer B, 23 min; 95% buffer A, 5% buffer B, 26 min (buffer A, water with 0.1% formic acid; buffer B, acetonitrile with 0.1% formic acid). The flow rate was 0.7 mL/min.
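The LC-MS gradient in this section can be read as a list of (time, % buffer B) breakpoints. The following Python sketch encodes those breakpoints and returns the solvent composition at any time point. It assumes linear ramps between consecutive breakpoints, which is an assumption made here for illustration, and the name `percent_b` is hypothetical.

```python
# Gradient breakpoints from Section 4.2.1 as (time in min, % buffer B).
# Buffer A is always 100 - %B.
GRADIENT = [(0.0, 5.0), (20.0, 100.0), (22.0, 100.0), (23.0, 5.0), (26.0, 5.0)]


def percent_b(t_min, gradient=GRADIENT):
    """Return % buffer B at t_min, assuming linear ramps between breakpoints."""
    if t_min < gradient[0][0] or t_min > gradient[-1][0]:
        raise ValueError("time is outside the programmed gradient")
    for (t0, b0), (t1, b1) in zip(gradient, gradient[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


if __name__ == "__main__":
    for t in (0, 10, 20, 22.5, 26):
        b = percent_b(t)
        print(f"{t:5.1f} min: {100 - b:5.1f}% A, {b:5.1f}% B")
```

Under this assumption, a peak eluting halfway through the first ramp would leave the column at roughly equal proportions of the two buffers.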

4.2.2 Large-Scale Extraction of Nemamide-like Molecules in P. pacificus

Wild-type worms (PS312) were shaken at 225 rpm for 7 d at 20 °C in 2.8 L baffled flasks containing 500 mL of CeHR medium 45 with 20% cow’s milk. Worms were collected by centrifugation, washed with water, shaken in water for 30 min to clear their intestines, and washed again with water. Worms were stored frozen at -20 °C until needed. For the extraction and fractionation process, worms from 4 L of culture (around 10 g of dry worms) were processed at a time. After freeze-drying, worms were ground for 15 min with 70 g of sand using a mortar and pestle. The pulverized worms were transferred to a 1 L Erlenmeyer flask, and 700 mL of 190 proof ethanol was added to the flask. The flask was shaken at 300 rpm for 3.5 h. The extract was filtered using a Buchner funnel and filter paper and evaporated with a rotavap at 27 °C. The extract was then subjected to silica gel chromatography and eluted with a gradient of hexane, ethyl acetate, ethyl acetate/methanol (1:1) and methanol to give four fractions (A-D). Fraction C was evaporated with a rotavap at 27 °C, redissolved in 12 mL of methanol, and centrifuged at 3500 rpm for 10 min. The supernatant was dried and dissolved in 10 mL of 70% methanol/water. Fraction C was then applied to an HP-20 column, eluting with MeOH/H2O (7:3 to 9:1) to give eight subfractions (C1-C8). Subfractions C6 and C7 were combined and applied to a Sephadex LH-20 column, eluting with methanol to give seven subfractions (C67-a to C67-j). Fraction C67-g, which contained the most dominant nemamide based on LC-MS analysis, was further fractionated by HPLC (Eclipse XDB-C18 column, 150 × 4.6 mm, 5 μm), using a gradient of methanol and water (ramping from 10% to 100% methanol over 30 min, holding at 100% methanol for 6 min, then returning to 10% methanol over 4 min; flow rate 1 mL/min; UV detection at 320 nm), to obtain purified nemamide C. All mass spectra were obtained on a Bruker Daltonics Impact II QTOF in positive mode (gas temperature 200 °C, drying N2 gas at 4 L/min, nebulizer at 1.0 bar). A 3 μL sample was injected into a Thermo UltiMate 3000 series system and analyzed with the following gradient: 0-5 min (98% A, 2% B), 5-35 min (2-60% B), 35-40 min (60-95% B), 40-40.1 min (95-2% B), 40.1-50 min (98% A, 2% B); A: water with 0.1% formic acid and B: acetonitrile with 0.1% formic acid. The flow rate was 25 μL/min for the loading pump and 5 μL/min for the NC pump.

Marfey’s analysis. Nemamide C (purified from the worms from 4 L of culture, about 10 g of dry worms) was hydrolyzed with 200 μL of 6 N HCl at 110 °C for 12 h. The reaction was then dried down by rotavap, and the residue was dissolved in 50 μL of water. Amino acid standards used were: L-Asp and L-Asn from Sigma Aldrich, and D-Asp, D-Asn, L-α-aminobutyric acid (L-AABA), D-α-aminobutyric acid (D-AABA), γ-aminobutyric acid (GABA), and (S)-3-aminobutyric acid (L-BABA) from Chem-Impex. D-β-homoalanine (D-BABA) was obtained by deprotection of Fmoc-D-β-homoalanine using 20% piperidine in DMF. Stock solutions (50 mM) of the amino acid standards were made in water. 20 μL of 1 M NaHCO3 and 100 μL of 1-fluoro-2,4-dinitrophenyl-5-L-alaninamide 46 (L-FDAA, Marfey’s reagent; 1% w/v in acetone) were added to 50 μL of the sample or the amino acid standards. After heating at 37 °C for 60 min, reactions were quenched by addition of 20 μL of 1 N HCl. The sample reaction was diluted with 100 μL of acetonitrile, while the reactions of the amino acid standards were diluted with 810 μL of acetonitrile. The reactions of the sample and standards were subjected to LC-MS analysis (Phenomenex Luna C18, 4.6 × 100 mm, 5 μm) using a linear gradient of water with 0.1% formic acid and acetonitrile with 0.1% formic acid (starting at 5% acetonitrile and ramping to 22% acetonitrile over 60 min; ramping to 40% acetonitrile over another 20 min; ramping to 95% acetonitrile from 80 to 85 min; and finally holding at 5% acetonitrile for 5 min; flow rate, 0.7 mL/min; UV and ESI-MS detection, 340 nm and negative ion mode).

4.3 Nemamide Production in Caenorhabditis Species

First, we analyzed the production of the nemamides in Caenorhabditis species. All species tested, including C. elegans, produced both nemamide A (Figure 4-2) and nemamide B (Figure 4-3), suggesting that nemamides A and B might play conserved roles in nematodes, or at least in Caenorhabditis species.


Figure 4-2. Production of nemamide A in Caenorhabditis species. The m/z of nemamide A is 757. Strains used were C. elegans (N2), C. briggsae (AF16), C. remanei (PB4641), C. brenneri (PB2801), C. japonica (DF5081), C. angaria (PS1010), and C. tropicalis (JU1373).


Figure 4-3. Production of nemamide B in Caenorhabditis species. The m/z of nemamide B is 755. Strains used were C. elegans (N2), C. briggsae (AF16), C. remanei (PB4641), C. brenneri (PB2801), C. japonica (DF5081), C. angaria (PS1010), and C. tropicalis (JU1373).

4.4 Isolation and Identification of the Nemamides in P. pacificus

To examine whether P. pacificus produces nemamide A and/or nemamide B, we analyzed its worm extract by LC-MS; however, we did not find either nemamide A or B. Surprisingly, multiple peaks showed nemamide B-like UV spectra with λmax 286, 301, and 315 nm, with the most dominant peak at 12.6 min (Figure 4-4), suggesting the presence of a four-double-bond structure in nemamide-like molecules from P. pacificus.

Figure 4-4. LC-MS trace of small-scale P. pacificus extracts. Potential nemamide-like peaks appear at retention times of 12.229 min, 12.687 min, 13.841 min and 14.583 min. The inset shows the UV spectrum with λmax 286, 301, and 315 nm.

To purify enough nemamide-like molecules to identify their structures by mass spectrometry, we grew 4 L of PS312 worms in CeHR medium, which produces a relatively higher density of worms than bacteria-fed worm cultures. About 10 g of freeze-dried worms were collected, and the extract was purified using a short silica gel column followed by an HP-20 column, a Sephadex LH-20 column and HPLC. The peak at 12.6 min, named nemamide C, was the most dominant and was purified for LC-HRMS, LC-HRMSMS, and LC-HRMSMSMS. The exact mass of nemamide C shown in Figure 4-5 indicated the molecular formula C36H56N8O10. LC-HRMS (m/z): [M+Na]+ calcd. for C36H56N8O10Na 783.4017, found 783.3994; [M+H]+ calcd. for C36H57N8O10 761.4197, found 761.4172. Based on the mass analysis of nemamides A and B, MS/MS of nemamide C [M+H]+ 761.4172 gave fragments of 729.3916 (loss of the -O-CH3 group in the polyketide side chain), 711.3805 (loss of both the -O-CH3 and -OH groups) and 541.2346 (peptide ring, loss of the whole polyketide part), shown in Figure 4-6 and Table 4-1. These fragments indicate that nemamide C has a very similar structural pattern to nemamide A/B and that, compared to nemamide A/B (cyclic peptide fragment ion at m/z 527), it has one extra carbon in each of the polyketide and peptide parts.
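The calculated m/z values for the [M+H]+ and [M+Na]+ ions of C36H56N8O10, and the mass error of a measured value, can be reproduced with a few lines of arithmetic. The Python sketch below is illustrative only: the function names are hypothetical, the atomic masses are rounded monoisotopic values, and the electron mass is subtracted for the cation. Small differences in the fourth decimal place from calculated values quoted elsewhere can arise from these conventions.

```python
import re

# Rounded monoisotopic masses (u) of the most abundant isotopes.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
        "O": 15.9949146, "Na": 22.9897693}
ELECTRON = 0.00054858


def mono_mass(formula):
    """Monoisotopic mass of a neutral formula such as 'C36H56N8O10'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONO[element] * (int(count) if count else 1)
    return total


def adduct_mz(formula, adduct):
    """m/z of a singly charged positive adduct ('H' or 'Na')."""
    return mono_mass(formula) + MONO[adduct] - ELECTRON


def ppm_error(found, calcd):
    """Signed mass error in parts per million."""
    return (found - calcd) / calcd * 1e6


if __name__ == "__main__":
    formula = "C36H56N8O10"  # proposed formula of nemamide C
    for adduct, found in (("H", 761.4172), ("Na", 783.3994)):
        calcd = adduct_mz(formula, adduct)
        print(f"[M+{adduct}]+ calcd {calcd:.4f}, found {found:.4f}, "
              f"error {ppm_error(found, calcd):+.1f} ppm")
```

A mass error of a few ppm for both adducts is the kind of agreement that supports a molecular formula assignment.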

Further analysis by LC-HRMSMSMS on the fragment of 541.2346 (Figure 4-7 and Figure 4-8) revealed that the cyclic peptide ring of nemamide C contains three asparagines (L- or D-Asn) and one aminobutyric acid (L/D-AABA, L/D-BABA, or GABA).

Figure 4-5. LC-HRMS of nemamide C.


Figure 4-6. LC-HRMSMS of nemamide C [M+H]+ 761.4172.

Table 4-1. m/z of fragment ions for nemamide A, B and C.

Nemamide  Molecular formula  [M+Na]+   [M+H-CH3OH]+  [M+H-CH3OH-H2O]+  Cyclic peptide  Neutral loss of polyketide
A         C34H54N8O10        757.3852  703.3738      685.3651          527.2250        208
B         C34H52N8O10        755.3692  701.3547      683.3515          527.2220        206
C         C36H56N8O10        783.3993  729.3900      711.3804          541.2353        220


Figure 4-7. LC-HRMSMSMS of nemamide C fragment ion 541.2346.


Figure 4-8. Analysis of fragment ions from peptide ring of nemamide C.

Marfey’s method was used to determine the number of L- and D-Asn residues and the type of aminobutyric acid in nemamide C. Analysis with both Asn and Asp amino acid standards indicated that Asn is converted to Asp during the acid hydrolysis step, for both the sample and the Asn standards. Retention times for the products of Marfey’s analysis shown in Figure 4-9 were: L-FDAA-L-Asn, 43.232 min; L-FDAA-D-Asn, 43.744 min; L-FDAA-L-Asp, 49.180 min; L-FDAA-D-Asp, 52.586 min; L-FDAA, 57.289 min; L-FDAA-L-BABA, 60.558 min; L-FDAA-GABA, 62.882 min; L-FDAA-L-AABA, 64.513 min; L-FDAA-D-BABA, 66.466 min; and L-FDAA-D-AABA, 69.087 min. The extracted ion chromatogram (m/z 384, negative mode) for the sample indicated the presence of L-FDAA-L-Asp and L-FDAA-D-Asp, and the extracted ion chromatogram (m/z 354, negative mode) for the sample revealed the presence of L-FDAA-GABA (Figure 4-10 and Figure 4-11).
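The extracted-ion m/z values of 384 and 354 follow from the derivatization chemistry: L-FDAA (C9H9FN4O5) reacts with the amino group of the amino acid with loss of HF, and negative-mode ESI detects [M-H]-. A minimal Python sketch of that arithmetic is given here. The function names and the rounded atomic masses are illustrative choices, not part of the original analysis.

```python
import re

# Rounded monoisotopic masses (u).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
        "O": 15.9949146, "F": 18.9984032}
PROTON = 1.00727647

FDAA = "C9H9FN4O5"  # Marfey's reagent (L-FDAA)
AMINO_ACIDS = {
    "Asp": "C4H7NO4",
    "Asn": "C4H8N2O3",
    "aminobutyric acid (AABA, BABA or GABA)": "C4H9NO2",
}


def mono_mass(formula):
    """Monoisotopic mass of a neutral formula."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONO[element] * (int(count) if count else 1)
    return total


def derivative_mz_negative(amino_acid_formula):
    """[M-H]- of the L-FDAA derivative: reagent + amino acid - HF - H+."""
    neutral = (mono_mass(FDAA) + mono_mass(amino_acid_formula)
               - mono_mass("HF"))
    return neutral - PROTON


if __name__ == "__main__":
    for name, formula in AMINO_ACIDS.items():
        print(f"L-FDAA-{name}: [M-H]- = {derivative_mz_negative(formula):.4f}")
```

Because AABA, BABA and GABA share the formula C4H9NO2, their derivatives have the same m/z and can only be distinguished by retention time against the standards.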

Figure 4-9. Marfey’s analysis for amino acid standards and nemamide C sample.


Figure 4-10. The extracted ion chromatogram for L-FDAA-L/D-Asp, L-FDAA-GABA and L-FDAA standards.


Figure 4-11. The extracted ion chromatogram for L-FDAA-L/D-Asp, L-FDAA-GABA and L-FDAA in Marfey’s analysis of nemamide C.

4.5 Proposed Structure and Biosynthesis of the Nemamides in P. pacificus

In this chapter, we mainly focused on the most dominant peak in the P. pacificus worm extract and showed that nemamide C has very similar structural properties to nemamide A and B; for example, they all have one methoxy group and one hydroxyl group in the polyketide chain. Nemamide C has two additional carbons compared to nemamide A/B, one in the polyketide part and the other in GABA, which occupies the same position as β-alanine in the peptide ring of nemamide A/B. Therefore, we proposed the chemical structure of nemamide C in P. pacificus (Figure 4-12). It contains one polyketide tail with four double bonds and one peptidyl ring containing three asparagines and one GABA. We also analyzed the domain architectures of Ppa-PKS-1 and the Ppa-NRPS-1s (Figure 4-13). Ppa-PKS-1 and Ppa-NRPS-1 have domain architectures very similar to those of their counterparts in C. elegans; however, Ppa-nrps-1 has three transcripts, PPA07616, PPA07617 and PPA38771, and the latter two encode the same protein, which is part of PPA07616 (Ppa-NRPS-1.2).

Figure 4-12. Proposed structure of the most dominant nemamide (C) in P. pacificus.


Figure 4-13. Domain architectures of Ppa-PKS-1 and Ppa-NRPS-1s.


CHAPTER 5 BIOSYNTHESIS OF RHAMNOSE AND ASCARYLOSE IN C. ELEGANS

5.1 Carbohydrate Metabolism in C. elegans

The biosynthesis of ascarosides is proposed to begin with the conjugation of NDP-ascarylose and a very long fatty acid-derived chain. Ascarylose, an unusual 3,6-dideoxysugar, is produced in some pathogenic bacteria, such as Yersinia pseudotuberculosis and Salmonella 57. The biosynthesis of glucose-derived 3,6-dideoxyhexose sugars in these bacteria is initiated by the glucose-1-phosphate cytidylyltransferase (Ep)-catalyzed coupling of α-D-glucose-1-phosphate and cytidine triphosphate (CTP) to form CDP-glucose, as shown in Figure 5-1. In the second step, CDP-D-glucose 4,6-dehydratase (Eod), in the presence of NAD+, converts CDP-glucose to CDP-6-deoxy-4-keto-glucose. The subsequent C-3 deoxygenation is mediated by a pair of enzymes: CDP-6-deoxy-D-glycero-L-threo-4-hexulose-3-dehydrase (E1), a PMP-dependent iron-sulfur-containing enzyme, and CDP-6-deoxy-D-glycero-L-threo-4-hexulose-3-dehydrase reductase (E3), a [2Fe-2S]-containing flavoprotein reductase. The intermediate can then be converted to CDP-L-ascarylose and other 3,6-dideoxyhexoses by various epimerases and ketoreductases 54-57.

Ascarosides contain a core 3,6-dideoxysugar, ascarylose; however, whether worms can synthesize ascarylose themselves has yet to be determined.

Parts adapted with permission from Feng, L., Shou, Q., Butcher, R.A. Identification of a dTDP-rhamnose biosynthetic pathway that oscillates with the molting cycle in Caenorhabditis elegans. Biochem J 473, 1507-1521 (2016). Copyright 2016 Portland Press Limited on behalf of the Biochemical Society.


Figure 5-1. Biosynthetic pathway of CDP-ascarylose in bacteria 120. Ep, glucose-1-phosphate cytidylyltransferase; Eod, CDP-D-glucose 4,6-dehydratase; E1, CDP-6-deoxy-D-glycero-L-threo-4-hexulose-3-dehydrase; E3, CDP-6-deoxy-D-glycero-L-threo-4-hexulose-3-dehydrase reductase; Eep, epimerase; Ered, reductase; PMP, pyridoxamine 5’-phosphate.

The 6-deoxysugar L-rhamnose is commonly found in polysaccharide and glycoprotein components of the cell wall of bacteria and plants, as well as in various natural products, but is rarely found in fungi and animals 121. In bacteria, L-rhamnose is a constituent of several types of O-antigens, which often are covalently linked to lipopolysaccharides in the outer leaflet of the outer cell membrane. These O-antigens are associated with bacterial virulence and facilitate bacterial evasion of immune system defenses 122. In plants, L-rhamnose is a key sugar found in rhamnogalacturonan I and II, which are primary constituents of the pectins, complex polysaccharides in the plant cell wall 123,124. L-rhamnose is also often found in some O-linked glycoproteins in the plant cell wall that influence growth, morphogenesis, and responses to various stresses 125. L-rhamnose is also part of many natural plant compounds, including flavonoids, terpenoids, and saponins. Recently, L-rhamnose was found to be a component of a phosphoglycan from trypanosomes 126.


In bacteria and plants, the biosynthesis of UDP/dTDP-rhamnose from glucose-1-phosphate (Glc-1-P) is well characterized and is usually accomplished in a few steps: the formation of UDP/dTDP-glucose, followed by a 4,6-dehydration to produce UDP/dTDP-4-keto-6-deoxyglucose, followed by 3,5-epimerization and 4-keto-reduction to form UDP/dTDP-rhamnose 56,57,121,127. In E. coli, the pathway converts dTDP-glucose (dTDP-Glc) to dTDP-rhamnose, and the four steps in the pathway are catalyzed by four separate enzymes (RmlA-D) 128-134. Conversely, in plants, the pathway converts UDP-glucose (UDP-Glc) to UDP-rhamnose, and the last three steps in the pathway are catalyzed by a single polypeptide, encoded by the RHM gene family, with dehydratase, epimerase and reductase activities 135.

In C. elegans and other nematodes, O- and N-linked glycoproteins and glycolipids have been shown to play important roles in embryonic and larval development and in mediating interactions with pathogens 136. The outer surface of C. elegans is covered by a protective cuticle, which is secreted by an underlying layer of epithelial cells, including the hypodermis and seam cells 137. Glycoproteins are present in the cuticle matrix and are also secreted, coating the outer surface of the cuticle. Glycolipids are also present in this surface coat. After hatching from an egg, C. elegans proceeds through four larval stages (L1-L4) to the adult, but will enter a stress-resistant, alternative L3 larval stage (the ‘dauer’ stage), under conditions of high population density and low food 76. At each larval stage during development, a new, stage-specific cuticle is made, resulting in changes in surface coat glycoproteins and glycolipids. Mutations that interfere with the production of glycoproteins and glycolipids in C. elegans are associated, for example, with abnormal surface epitope expression (Srf, surface antigenicity abnormal phenotype) and pathogen resistance (Bus, bacterially unswollen) 138,139. To our knowledge, rhamnose has not been detected in the C. elegans cuticle. However, a glycoprotein containing rhamnose has been extracted from the surface of the parasitic nematode Ascaridia galli 140. Thus, rhamnose may be a component of specific glycoproteins in the surface coat of certain nematodes.

During embryonic and larval development in C. elegans, glycoproteins have been shown to be critical for cell migration and patterning. For example, the latrophilin receptor homolog lat-1 is essential for the establishment of tissue polarity and the alignment of cell division planes in the developing C. elegans embryo 141. Latrophilin receptors all contain a highly conserved RBL (rhamnose-binding lectin) domain. The lat-1 RBL domain plays an essential role, since rescue of the embryonic lethality of the lat-1 mutant with a lat-1 transgene requires the presence of the RBL domain in the transgene. However, it has been reported that rhamnose-binding activity for the lat-1 RBL domain could not be detected 141. Thus, it is unclear whether this domain binds to specific carbohydrates of glycoproteins and whether this binding is necessary for the lat-1 receptor to mediate cell-cell interactions.

Although our analysis has shown that homologs of bacterial and plant rhamnose biosynthetic genes are present in most nematode species, rhamnose biosynthesis has not been studied in nematodes.

5.2 Experimental Methods

5.2.1 Isotope Labeling Experiments

C. elegans wild-type worms were cultured in 5 mL of CeHR medium at 20 °C and 225 rpm. Regular glucose (Sigma Aldrich) was fed to worms in the control experiment, and [1,2,3,4,5,6-13C]glucose (Cambridge Isotope Laboratories) was fed in place of regular glucose to follow the conversion of glucose to ascarylose. To ensure the complete depletion of unlabeled glucose in the isotope labeling experiments, worms were grown for at least two generations before analyzing ascaroside production. Worm medium was collected by centrifugation at 3500 rpm for 10 min, followed by purification through a Sep-Pak C18 cartridge (500 mg, Waters). The final eluate in MeOH was dried by speedvac and dissolved in 100 μL of 50% MeOH/H2O, and 5 μL samples were analyzed by LC-MS. The mass-to-charge ratios (m/z) of several ascarosides, ascr#3 (asc-∆C9), ascr#4 (Glc-asc-C6-MK) and icas#9 (IC-asc-C5), were compared between the control and experimental groups 34.
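For orientation, the mass shift expected from 13C incorporation can be estimated as below. This is a sketch only: it assumes that a given number of carbons in an ascaroside derive entirely from the fed 13C-labeled glucose, the function name is hypothetical, and the example m/z is a placeholder rather than a measured value.

```python
C13_MINUS_C12 = 1.0033548  # mass difference between 13C and 12C (u)

ASCARYLOSE_CARBONS = 6  # ascarylose is a 3,6-dideoxyhexose


def labeled_mz(unlabeled_mz, n_labeled_carbons, charge=1):
    """m/z expected after replacing n carbons with 13C (charge magnitude only)."""
    return unlabeled_mz + n_labeled_carbons * C13_MINUS_C12 / abs(charge)


if __name__ == "__main__":
    placeholder_mz = 300.0  # hypothetical unlabeled ion, for illustration
    shifted = labeled_mz(placeholder_mz, ASCARYLOSE_CARBONS)
    print(f"unlabeled {placeholder_mz:.4f} -> labeled {shifted:.4f} "
          f"(shift {shifted - placeholder_mz:.4f})")
```

Under this assumption, an ascaroside whose only glucose-derived carbons are those of ascarylose would shift by about 6 Da, while one that also carries a glucose unit, such as ascr#4, could shift by up to about 12 Da.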

5.2.2 Sugar Extraction and Identification

Wild-type worms were cultured in 150 mL of the axenic, semi-defined medium CeHR for 7 d or in S medium (150 mL) for 9 d. All worms were collected and washed with M9 buffer three times over 3 h to allow the worms to eliminate waste from their digestive tract. Worm samples were then frozen using liquid nitrogen, lyophilized, ground with sand using a mortar and pestle, and dissolved in 15 mL of 80% EtOH in M9 buffer. Sugar nucleotides were extracted by vortexing, sonicating for 10 min, and shaking for another 3 h. Samples were then centrifuged at 3500 rpm for 10 min and then at 18000 rpm for 25 min. The supernatants were filtered through a 0.22 μm filter, dried by speedvac, resuspended in 100 μL of 20 mM triethylammonium acetate (TEAA) buffer, pH 6.0, and centrifuged at 15000 rpm for 10 min. The resulting crude sugar nucleotide samples were stored at -80 °C before analysis by LC-MS, LC-MS/MS, and LC-MS/MS/MS. LC-MS analysis was performed with the following solvent gradient: 100% buffer A, 0% buffer B, 0 min; 100% buffer A, 0% buffer B, 15 min; 50% buffer A, 50% buffer B, 35 min; 0% buffer A, 100% buffer B, 42 min; 100% buffer A, 0% buffer B, 44 min; 100% buffer A, 0% buffer B, 49 min (buffer A, 20 mM TEAA buffer, pH 6.0; buffer B, 4% acetonitrile in 20 mM TEAA buffer, pH 6.0). The flow rate was 1 mL/min. LC-MS/MS and LC-MS/MS/MS were run on a Thermo Scientific LCQ Deca Ion Trap instrument coupled with a Hypersil Gold aQ C18 column (150 × 2.1 mm; particle size 3 μm; Thermo Scientific) with the same solvent gradient as the LC-MS above, but with a slower flow rate of 0.25 mL/min. Selected ion monitoring (SIM) was performed first on the component of interest. A selected ion chromatogram of the MS/MS fragment at m/z 323 was extracted for MS/MS, and a further fragmentation was performed at m/z 323 for MS/MS/MS (MS3).

5.2.3 Construction, Overexpression, Purification and Activities of RML Enzymes

All genes were amplified by PCR using Pfu polymerase (New England Biolabs) from a C. elegans cDNA library. The gmp-1 (C42C1.5), ugp-1 (D1005.2), rml-1 (K08E3.5), rml-2 (F53B1.4) and rml-3 (C14F11.6) genes were inserted into the pET-30a vector separately such that they could be expressed with an N-terminal His tag. For coexpression of RML-4 (C01F1.3) and RML-5 (Y71G12B.6), rml-4 was inserted into multiple cloning site 1 (MCS1) of pACYC-Duet1 with an N-terminal His tag, and rml-5 was inserted into multiple cloning site 2 (MCS2) of pACYC-Duet1 without any tag. All primers used in this study are listed in Table 5-1, with restriction sites underlined. All gene sequences were confirmed by sequencing. Constructed plasmids containing gmp-1, ugp-1, rml-1, rml-2, rml-3, or rml-4/rml-5 were transformed into BL21 (DE3) cells (New England Biolabs) independently for expression. rml-5 (N-terminal His tag) in pET-28a and rml-4 (no tag) in MCS2 of pACYC-Duet1, or rml-4 (N-terminal His tag) in pET-28a and rml-5 (no tag) in MCS2 of pACYC-Duet1, were also cotransformed into BL21 (DE3).


Table 5-1. Primers used for plasmid construction.

Primer          Sequence*
gmp-1-NcoI-F    5’-gcgcCCATGGggATGAAGGCGCTGATTCTAGT-3’
gmp-1-NotI-R    5’-catgGCGGCCGCTTACATAATAATGTCTTTCGACGG-3’
ugp-1-NcoI-F    5’-gcgcCCATGGGGATGTCCAACGATCAACTCAAAT-3’
ugp-1-NotI-R    5’-catgGCGGCCGCTTACTCAGCAATATACTCCTGA-3’
rml-1-NcoI-F    5’-gcgcCCATGGGTCCGCAGCCAGTTCATCGTCT-3’
rml-1-NotI-R    5’-catgGCGGCCGCTTAGTGCTCCAAAATGCGCAAAT-3’
rml-2-NcoI-F    5’-gcgcCCATGGGTTCCGCGTGGGAAGA-3’
rml-2-NotI-R    5’-catgGCGGCCGCTTAACCCTGGAGACGAGCTG-3’
rml-3-NcoI-F    5’-gcgcCCATGGGGATGTCGCATCCTACTCCAG-3’
rml-3-NotI-R    5’-catgGCGGCCGCTTATAGAGATTTGAATGATGCATG-3’
rml-4-BamHI-F   5’-gcgcGGATCCgATGGTCTATACCCCGAAAAAC-3’
rml-4-NotI-R    5’-catgGCGGCCGCTTATTTAGCAGCACTAGTCAT-3’
rml-4-NdeI-F    5’-ggaattcCATATGGTCTATACCCCGAAAAAC-3’
rml-4-NotI-R    5’-catgGCGGCCGCTTATTTAGCAGCACTAGTCAT-3’
rml-4-FseI-F    5’-tatcGGCCGGCCacATGGTCTATACCCCGAAAAAC-3’
rml-4-PacI-R    5’-cattTTAATTAACTATTTAGCAGCACTAGTCAT-3’
rml-5-NdeI-F    5’-ggaattcCATATGACCGTTTTGATTACCGGCG-3’
rml-5-NotI-R    5’-catgGCGGCCGCTTAAATATTATTATTAAGTCCTCC-3’
rml-5-BamHI-F   5’-gactGGATCCgATGACCGTTTTGATTACCGGCG-3’
rml-5-NotI-R    5’-catgGCGGCCGCTTAAATATTATTATTAAGTCCTCC-3’
rml-5-FseI-F    5’-tatcGGCCGGCCacATGACCGTTTTGATTACCGGCG-3’
rml-5-PacI-R    5’-cattTTAATTAACTAAATATTATTATTAAGTCCTCC-3’
prml-2-AscI-F   5’-catGGCGCGCCGGATGGTACTCTCATCAAAGGACAAAG-3’
prml-2-NotI-R   5’-atgGCGGCCGCCTCTTTTTGGCGGCGGATCTGAAGAGA-3’
prml-4-SalI-F   5’-gcgcGTCGACAGGGTTACGGTAGCCCCAAAAGTACGC-3’
prml-4-NotI-R   5’-atgGCGGCCGCCCTGGAATTCAGTTGAGAATTATCGAG-3’

*Underlined bases indicate restriction sites.
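As a quick consistency check on Table 5-1, each primer should contain the recognition sequence of the enzyme in its name. The sketch below tests a subset of the primers. The recognition sequences are the standard ones for these enzymes, while the helper name `site_position` and the convention of reading the enzyme from the primer name are assumptions made for illustration.

```python
# Standard recognition sequences (5' to 3') for the enzymes named in Table 5-1.
SITES = {
    "NcoI": "CCATGG", "NotI": "GCGGCCGC", "BamHI": "GGATCC", "NdeI": "CATATG",
    "FseI": "GGCCGGCC", "PacI": "TTAATTAA", "AscI": "GGCGCGCC", "SalI": "GTCGAC",
}

# A subset of the primers in Table 5-1 (one per enzyme).
PRIMERS = {
    "gmp-1-NcoI-F": "gcgcCCATGGggATGAAGGCGCTGATTCTAGT",
    "gmp-1-NotI-R": "catgGCGGCCGCTTACATAATAATGTCTTTCGACGG",
    "rml-4-BamHI-F": "gcgcGGATCCgATGGTCTATACCCCGAAAAAC",
    "rml-5-NdeI-F": "ggaattcCATATGACCGTTTTGATTACCGGCG",
    "rml-4-FseI-F": "tatcGGCCGGCCacATGGTCTATACCCCGAAAAAC",
    "rml-4-PacI-R": "cattTTAATTAACTATTTAGCAGCACTAGTCAT",
    "prml-2-AscI-F": "catGGCGCGCCGGATGGTACTCTCATCAAAGGACAAAG",
    "prml-4-SalI-F": "gcgcGTCGACAGGGTTACGGTAGCCCCAAAAGTACGC",
}


def site_position(primer_name, sequence):
    """Return the 0-based index of the named enzyme's site, or -1 if absent."""
    enzyme = primer_name.split("-")[-2]  # e.g. 'gmp-1-NcoI-F' -> 'NcoI'
    return sequence.upper().find(SITES[enzyme])


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name:15s} site at index {site_position(name, seq)}")
```

A result of -1 for any primer would point to a transcription error in the table or a mismatch between the primer name and its sequence.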

Cells transformed with pET-30a-gmp-1, ugp-1, rml-1, rml-2 or rml-3 were grown in LB broth under appropriate antibiotic selection at 37 °C to an OD600 of 0.6-0.8. Protein expression was induced with 0.2 mM IPTG, and cells were grown at 20 °C for 20 h. For coexpression of rml-4 and rml-5, cells transformed with pACYC-Duet1-rml-4/rml-5 or cotransformed with two plasmids were grown in LB broth under appropriate antibiotic selection, protein expression was induced with 0.1 mM IPTG, and cells were grown at 16 °C for 20 h. All purification steps were carried out at 4 °C. Briefly, cells were collected by centrifugation at 6,000 rpm for 10 min and resuspended in lysis buffer (20 mM Tris, 500 mM NaCl, pH 7.5). The cells were then lysed by three passes through a microfluidizer and centrifuged at 18,000 rpm for 20 min. The supernatant was incubated with 1 mL of pre-equilibrated nickel resin (Thermo Scientific) for 1 h with shaking on ice. The resin was washed with 15 mL of lysis buffer and 15 mL of washing buffer (20 mM Tris, 500 mM NaCl, 20 mM imidazole, pH 7.5), and the protein was eluted with buffer containing 250 mM imidazole. The eluted sample was concentrated and loaded onto an FPLC connected to a Superdex 200 gel filtration column (GE Healthcare) with buffer (20 mM Tris, 100 mM NaCl, pH 7.5). Protein concentration was determined using Quick Start™ Dye reagent (Bio-Rad) with 2 mg/mL bovine serum albumin as a standard. Purified proteins were flash frozen in 10% glycerol and stored at -80 °C.

The pyrophosphorylase activities of GMP-1, UGP-1, and RML-1 were measured using a spectrophotometric pyrophosphate assay, according to the manufacturer’s method for the EnzChek pyrophosphate assay kit (Invitrogen). The standard reaction (100 μL total volume) contained 0.25 mM 2-amino-6-mercapto-7-methylpurine ribonucleoside (MESG) (Santa Cruz Biotechnology), 1 mM sugar-1-phosphate (Glc-1-P or Man-1-P), 1 mM NTP (ATP, GTP, CTP, dTTP or UTP), 0.1 U purine nucleoside phosphorylase, and 0.04 U inorganic pyrophosphatase in 50 mM Tris, 5 mM MgCl2, 1 mM DTT, pH 7.5. Reactions were pre-incubated for at least 10 min to remove the background phosphate and initiated by adding 2 μM of purified enzyme. Reactions were carried out at 25 °C and monitored using the absorbance at 360 nm on an Agilent 8453 UV/VIS spectrophotometer. The initial rate (AU/s) was measured over the first 5 min. To confirm the formation of a particular NDP-sugar during reactions, LC-MS was used to detect the relevant ion (ADP-Glc, m/z 588; dTDP-Glc, m/z 563; GDP-Glc, m/z 604; CDP-Glc, m/z 564; UDP-Glc, m/z 565; GDP-Man, m/z 604). For kinetics, reactions contained 1 mM sugar-1-phosphate (Glc-1-P or Man-1-P) with various concentrations of NTP (GTP, dTTP or UTP) or 1 mM NTP (GTP, dTTP or UTP) with various concentrations of sugar-1-phosphate (Glc-1-P or Man-1-P), as well as 0.25 mM MESG, 0.1 U purine nucleoside phosphorylase, 0.04 U inorganic pyrophosphatase (in 50 mM Tris, 5 mM MgCl2, 1 mM DTT, pH 7.5), and 4-5 μg of recombinant enzyme. The concentration of the product pyrophosphate was calculated based on a calibration curve made using pyrophosphate (Invitrogen) as a standard. Km and kcat were calculated by fitting each data set to the Michaelis-Menten equation in GraphPad Prism software, and the standard deviation of kcat/Km was calculated using the Fenner formula 142.
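The kinetic fit described here can be illustrated with a small, dependency-free Python sketch. It is not the routine used by GraphPad Prism: it is a stand-in least-squares fit of v = Vmax·[S]/(Km + [S]) that exploits the fact that, for a fixed Km, the best Vmax has a closed form, so only Km needs to be searched (here by golden-section search). The function names, the search bounds, and the synthetic data in the usage example are assumptions for illustration, not measured values.

```python
def fit_michaelis_menten(s, v, km_lo=1e-6, km_hi=None, iterations=200):
    """Least-squares fit of v = Vmax * S / (Km + S); returns (Km, Vmax).

    Assumes the residual sum of squares has a single minimum in Km
    between km_lo and km_hi (default: 10 x the largest substrate level).
    """
    if km_hi is None:
        km_hi = 10.0 * max(s)

    def best_vmax(km):
        # For fixed Km the model is linear in Vmax, so Vmax has a closed form.
        x = [si / (km + si) for si in s]
        return sum(xi * vi for xi, vi in zip(x, v)) / sum(xi * xi for xi in x)

    def rss(km):
        vmax = best_vmax(km)
        return sum((vi - vmax * si / (km + si)) ** 2 for si, vi in zip(s, v))

    phi = (5 ** 0.5 - 1) / 2  # golden-section ratio
    a, b = km_lo, km_hi
    for _ in range(iterations):
        c, d = b - phi * (b - a), a + phi * (b - a)
        if rss(c) < rss(d):
            b = d
        else:
            a = c
    km = (a + b) / 2
    return km, best_vmax(km)


def kcat(vmax, enzyme_conc):
    """kcat = Vmax / [E], with both in consistent concentration units."""
    return vmax / enzyme_conc


if __name__ == "__main__":
    # Synthetic, noise-free data generated with Km = 0.3 and Vmax = 2.0.
    s = [0.05, 0.1, 0.2, 0.5, 1.0, 2.0]
    v = [2.0 * x / (0.3 + x) for x in s]
    km, vmax = fit_michaelis_menten(s, v)
    print(f"Km = {km:.4f}, Vmax = {vmax:.4f}")
```

With real data, replicate measurements and an estimate of the parameter uncertainties would be needed before reporting kcat/Km.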

Dehydratase activity was measured by detecting the product NDP-4-keto-6-deoxyglucose, which has a characteristic absorbance at 320 nm under alkaline conditions 143. First, all substrate NDP-sugars were generated by the pyrophosphorylase reaction, the reaction mixtures were aliquoted, and NAD+ (0.5 mM) was added. Then, reactions were initiated by adding RML-2 (2 μM final concentration) and incubated at 25 °C for 60 min. Heat-treated samples were cooled on ice and centrifuged for 10 min at 13,000 rpm. The supernatant was made alkaline (0.1 M sodium hydroxide final concentration in 100 μL total volume) and incubated at 37 °C for 20 min before measuring absorbance. Negative controls were carried out in the absence of enzyme.


The substrate dTDP-4-keto-6-deoxyglucose was produced through the overnight reaction of RML-2 (2 μM) with dTDP-Glc (1 mM) and NAD+ (0.5 mM), and the reaction mixture was then incubated with RML-3 and/or RML-4/RML-5 (2 μM) in the presence of NAD(P)H (1 mM). Negative controls were performed without RML-2 or NAD(P)H. Substrate UDP-Glc (1 mM) was incubated with RML-2 and/or RML-3 and/or RML-4/RML-5 (2 μM) in the presence of NAD+ (0.5 mM) and NAD(P)H (1 mM). A negative control was performed without any enzymes or NAD(P)H. Reactions were carried out at 25 °C for 30 min, monitoring the absorbance at 370 nm, which indicated the consumption of NAD(P)H 144,145.

5.2.4 Purification and Identification of Reaction Products

Products of all reactions above were analyzed by LC-MS using a Luna 5μ C18(2) column (100 × 4.6 mm; Phenomenex) coupled with an Agilent 6130 single quad mass spectrometer operating in negative scan mode. A solvent gradient was used: 95% buffer A, 5% buffer B, 0 min; 80% buffer A, 20% buffer B, 5 min; 0% buffer A, 100% buffer B, 10 min; 100% buffer A, 0% buffer B, 15 min (buffer A, water with 0.1% formic acid; buffer B, acetonitrile with 0.1% formic acid). The flow rate was 0.7 mL/min.

To purify the reaction products, reaction mixtures were boiled, cooled on ice, and centrifuged. The supernatants were analyzed using a SphereClone 5μ SAX HPLC column (250 × 4.6 mm; Phenomenex, CA). The column was pre-equilibrated with 30 mM KH2PO4, pH 4.33. For each run, 100 μL of sample was injected and eluted using buffer (30 mM KH2PO4, pH 4.33) over 30 min at a flow rate of 1 mL/min. NDP-sugars were detected based on their UV254 absorbance. The retention times for dTDP-Glc, dTDP-rhamnose, UDP-Glc and UDP-rhamnose were 14.8 min, 16.4 min, 12.4 min and 13.9 min, respectively.


To identify the structures of the reaction products, overnight reactions (1 mL) were HPLC-purified to produce dTDP-Glc (~500 μg), UDP-Glc (~500 μg), dTDP-rhamnose (~400 μg), and UDP-rhamnose (~300 μg). Products were lyophilized, dissolved in 99.9% deuterium oxide (Cambridge Isotopes), and analyzed by water-suppressed 1H and COSY NMR spectroscopy on a Bruker Avance II 600 MHz NMR spectrometer equipped with a 5 mm TXI cryoprobe. Chemical shifts are expressed in ppm, referenced to the internal water signal (4.800 ppm). NMR data were processed using MestReNova software.

5.2.5 Other Experimental Procedures

5.2.5.1 Phylogenetic tree analysis

The protein sequences of RML-1 to RML-5 and their homologs in both free-living and parasitic nematode species were retrieved from WormBase. MEGA 6.0 was used to generate the neighbor-joining trees 69.

5.2.5.2 Small- and large-scale RNAi

RNAi was carried out by feeding bacteria expressing double-stranded RNA as described 146 with modifications. Briefly, Nematode Growth Medium (NGM) agar plates with 25 mg/L carbenicillin and 1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) in 6 cm dishes were prepared and left at room temperature for 4-7 d before use. Bacteria carrying appropriate RNAi clones were grown in 5 mL LB containing 150 mg/L ampicillin at 225 rpm, 37°C, for 6-10 hours, and 200 μL of bacterial culture was transferred to each plate. The plates were left at room temperature overnight to let the bacterial lawn dry. 20 L4 rrf-3 worms were moved onto each plate, allowed to lay eggs for 24 h and then removed. Worms were cultivated at 20°C for 3 d, and phenotypes were examined.

Alternatively, to analyze the sugar nucleotide pool and ascaroside production in worms with reduced expression of rml-2 or rml-3, rrf-3 worms were cultured in S medium (150 mL) with IPTG-induced RNAi strains, including L4440 (control), rml-2, or rml-3, for 9 d before being collected and analyzed as described above.

5.2.5.3 Sequence alignment, structural modeling and superposition

Amino acid sequence alignment of RML-3 with Escherichia coli RmlC (GenBank accession number AFC91454) was completed using ClustalW. Modeling of the RML-4 structure was achieved via Robetta 147. Streptococcus suis RmlB without substrate bound (protein data bank code 1OC2) and EcRmlD (protein data bank code 1KBZ) were used as the templates for modeling the RML-4 N-terminal (1-344) and C-terminal (345-631) domains, respectively. Superposition of the RML-4 structure model with dTDP-Glc-bound SsRmlB (protein data bank code 1KER) was done using PyMOL.

5.2.5.4 Plasmid construction, transgenesis and microscopy

~2 kb of the rml-2 or rml-4 promoter was amplified by PCR from C. elegans genomic DNA using the primers listed in Table 5-1. The promoters were inserted into the AscI/NotI (for rml-2) or SalI/NotI (for rml-4) sites of pPD114.108 (Addgene) to obtain prml-2::gfp or prml-4::gfp. To generate prml-2::gfp-pest for monitoring dynamic expression, the pest sequence from pAF207 (gift of Alison Frand, UCLA) was subcloned into prml-2::gfp using the XhoI/EcoRI sites. To generate prml-4::gfp-pest, the pest sequence was first subcloned into pPD114.108 using the XhoI/EcoRI sites to obtain pPD114.108-GFP-pest, and then the rml-4 promoter was inserted into pPD114.108-GFP-pest at the SalI/NotI sites. Microinjections into N2 worms were conducted as described previously 148, using an injection mixture containing 50 ng/μL of either prml-2::gfp, prml-2::gfp-pest, prml-4::gfp or prml-4::gfp-pest and 50 ng/μL co-

injection marker coel::dsRED (gift of Piali Sengupta, Brandeis University). At least five independent lines were analyzed. To generate synchronized L1 worms, gravid transgenic prml-2::gfp, prml-2::gfp-pest, prml-4::gfp or prml-4::gfp-pest worms were treated with alkaline bleach to obtain eggs, which were washed twice with water and once with M9, and then allowed to hatch for 18 h in M9 buffer with shaking at 225 rpm and 22.5 °C. To monitor GFP expression during the molting cycle, 20 synchronized L1 worms (containing the prml-2::gfp-pest or prml-4::gfp-pest reporter) were seeded onto an NGM agar plate with OP50 at 22.5 °C. Worms expressing the fluorescent reporters

(n=25) were selected and moved onto a new NGM plate with OP50 at 22.5 °C, and the time course was initiated. GFP expression was recorded every 2 h on a Zeiss Axiovert.A1 microscope equipped with ZEN lite 2012 camera. For examination of gene expression in pre-dauers and dauers, dauer formation was induced as described 33. Dauers were examined 64-72 h after egg lay, and pre-dauers were examined 40-64 h after egg lay.

5.3 Results

5.3.1 Ascarylose in Worms Originated from Glucose

To investigate whether worms can synthesize ascarylose themselves, we performed isotope labeling experiments by feeding worms [1,2,3,4,5,6-13C] glucose in place of regular glucose in CeHR medium, which is bacteria-free. Analysis of ascaroside production showed that worms incorporated the isotope-labeled glucose into ascarosides (Figure 5-2).

Figure 5-2. Ion extraction of ascarosides in isotope labeling experiments. A) icas#9 (IC-asc-C5) from worms fed with regular glucose. B) icas#9 (IC-asc-C5) from worms fed with [1,2,3,4,5,6-13C] glucose. C) ascr#3 (asc-∆C9) from worms fed with regular glucose. D) ascr#3 (asc-∆C9) from worms fed with [1,2,3,4,5,6-13C] glucose. E) ascr#4 (Glc-asc-C6-MK) from worms fed with regular glucose. F) ascr#4 (Glc-asc-C6-MK) from worms fed with [1,2,3,4,5,6-13C] glucose.

The m/z of icas#9 from worms fed regular glucose was 390. In worms fed [1,2,3,4,5,6-13C] glucose, no signal at m/z 390 was found; instead, a series of masses ranging from 396 (390+6) to 401 (390+6+5) was extracted at the same retention time, suggesting that the ascarylose core was labeled and that different numbers of carbons in the side chain were also labeled. Similarly, ascr#3 had m/z 301 with regular glucose and m/z 307 (301+6) to 316 (301+6+9) with isotope-labeled glucose. Glucosylated ascr#4 had m/z 431 under regular glucose feeding but m/z 443 (431+12) to 449 (431+12+6) under [1,2,3,4,5,6-13C] glucose feeding, indicating that worms incorporated both glucose and glucose-derived ascarylose into ascr#4.
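
The mass-shift arithmetic above can be reproduced with a short sketch. It assumes singly charged ions and a nominal shift of +1 per 13C incorporated; the function name and the data layout are illustrative only and are not part of the original analysis.

```python
def labeled_mz_range(unlabeled_mz, fully_labeled_carbons, variable_carbons):
    """Return (lowest, highest) nominal m/z expected after 13C labeling.

    Assumes a singly charged ion, so each 13C that replaces a 12C
    raises the nominal m/z by 1.
    """
    low = unlabeled_mz + fully_labeled_carbons
    high = low + variable_carbons
    return low, high


# (unlabeled m/z, carbons labeled in every molecule, extra side-chain carbons)
# Values are taken from the text; the dictionary itself is illustrative.
observations = {
    "icas#9": (390, 6, 5),
    "ascr#3": (301, 6, 9),
    "ascr#4": (431, 12, 6),
}

for name, args in observations.items():
    low, high = labeled_mz_range(*args)
    print(f"{name}: m/z {low} to {high}")
```

For ascr#4, the 12 fully labeled carbons correspond to the ascarylose core plus the attached glucose, which is why its minimum shift is twice that of the other two ascarosides.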

These results suggest that worms use glucose to biosynthesize an active form of ascarylose, which is coupled to fatty-acid chains of different lengths to form ascarosides. The different numbers of labeled carbons in the side chain indicate that glucose can be broken down through glycolysis and the Krebs cycle, and that products such as acetyl-CoA are used for fatty acid biosynthesis, as shown in Figure 5-3. We also tried feeding worms isotope-labeled sodium acetate; however, the complexity of acetate conversion inside cells made it very difficult to monitor the labeling of ascarosides.

Figure 5-3. Proposed conversion of glucose to ascaroside. Structures in blue represent the conversion and incorporation of intact glucose into ascaroside; structures in magenta indicate the mobilization of acetyl-CoA from isotope-labeled glucose. Asterisks (*) indicate labeled carbons.

5.3.2 Discovery of UDP-ascarylose in Worms

Since worms can synthesize ascarylose from glucose to produce ascarosides, we asked whether an active form of ascarylose, specifically NDP-ascarylose, is present inside worms. We examined the nucleotide sugar pool of worm extract and conducted ion extraction for different kinds of NDP-ascarylose. Surprisingly, only the m/z of UDP-ascarylose ([M-H]- 533) could be extracted (Figure 5-4). Its identity was further verified by LC-MS/MS (daughter ion 323) (Figure 5-5), which is consistent with the reported daughter ion of UDP-sugars.

Figure 5-4. Ion extraction for NDP-ascarylose from worm nucleotide sugar pool. m/z shown in figure are in negative mode [M-H]-.

Figure 5-5. LC-MS and LC-MS/MS analysis of UDP-ascarylose. A) LC-MS spectra of UDP-ascarylose. B) MS/MS analysis of UDP-ascarylose.

Given that ascarylose in bacteria originates from CDP-glucose 57, we proposed a biosynthetic pathway for UDP-ascarylose based on the bacterial CDP-ascarylose biosynthetic pathway (Figure 5-6). Based on the potential enzymes involved in this pathway, we performed BLAST analysis, cloned all of the candidate genes, and overexpressed, purified, and tested the corresponding proteins for activity towards all possible substrates. However, we could not reconstitute the biosynthesis of UDP-ascarylose in worms; instead, we reconstituted the biosynthesis of dTDP-rhamnose and UDP-rhamnose. The later parts of this chapter focus on the biosynthesis of dTDP/UDP-rhamnose and their potential roles in worm molting cycles.

Figure 5-6. Proposed biosynthetic pathway of UDP-ascarylose in worms.

5.3.3 Homologous Genes Identified by BLAST

To investigate whether a rhamnose biosynthetic pathway is present in C. elegans, we used BLAST analysis. We first identified three possible NDP-sugar pyrophosphorylases, rml-1 (K08E3.5), gmp-1 (C42C1.5), and ugp-1 (D1005.2), which usually catalyze the conjugation of Glc-1-P or Man-1-P to NTP to give NDP-Glc or NDP-Man, respectively. Of these three genes, rml-1 and gmp-1 are essential in embryogenesis and larval development based on observed phenotypes in RNAi experiments or in mutants 149, listed in Table 5-2. For the additional steps in NDP-rhamnose biosynthesis, RmlB, RmlC, and RmlD from E. coli 150 and RHM2 and UER1 from Arabidopsis thaliana 135 were used to obtain putative 4,6-dehydratase, 3,5-epimerase, and 4-keto-reductase homologs in C. elegans, rml-2 (F53B1.4), rml-3 (C14F11.6), and rml-4 (C01F1.3), respectively. Whereas RNAi against rml-2 and rml-3 does not affect embryonic or larval development, rml-4 RNAi or gene deletion causes lethality and sterility in worms (Table 5-2). Homologs of these genes can be found in both free-living and parasitic nematode species (Figure 5-7), suggesting that these genes play a conserved role through nematode evolution.

Table 5-2. Observed phenotypes for RNAi of genes.

Gene                 Phenotypes
rml-1 (K08E3.5)      Embryonic lethal 151,152 (this study); extended lifespan 153; larval arrest 151 (this study); slow growth 152; transgene subcellular localization variant 154
gmp-1 (C42C1.5)      Adult lethal 152; embryonic lethal 152 (this study); larval arrest 155 (this study); larval lethal 152; maternal sterile 151
ugp-1 (D1005.2)      None
rml-2 (F53B1.4)      Sluggish 155
rml-3 (C14F11.6)     Bordering at edges of RNAi bacterial lawns (this study)
rml-4 (C01F1.3)      Early larval arrest 156; egg laying variant 151; larval arrest 155 (this study); larval lethal 155; lethal 157; locomotion variant 155; embryonic lethal (this study)
rml-5 (Y71G12B.6)    Embryonic lethal 156

Figure 5-7. Homologs of RML-1 to RML-5 from different nematode species organized in neighbor-joining phylogenetic trees. The phylogenetic trees were generated by MEGA 6.0 using sequences of the RML-1 to RML-5 homologs from 13 nematode species. The following sequences were acquired from WormBase: A) C. elegans RML-1 and its homologues, C. remanei CRE04853, C. japonica CJA03580, C. brenneri CBN11066, C. briggsae CBG18265, P. pacificus PPA08493, B. malayi Bm13963, O. volvulus OVOC8721, B. xylophilus BX:BUX.s01142.24, H. bacteriophora HB:Hba_01354, L. loa LL:EFO21664.1, A. suum AS:GS_14768, and M. incognita MI:Minc03247. B) RML-2 and its homologues, CRE00931, CJA12429, CBN25184, CBG14085, PPA20960, Bm9898, OVOC2669, BX:BUX.s01144.243, HB:Hba_07512, L. loa LL:EFO24845.2, AS:GS_19609, and MI:Minc05550. C) RML-3 and its homologues, CRE00181, CJA08012, CBN15172, CBG05010, PPA18087, Bm2207, OVOC3044, BX:BUX.s01355.1, HB:Hba_14212, LL:EFO28373.1, AS:GS_17184, and M. hapla MH:MhA1_Contig1.frz3.fgene1. D) RML-4 and its homologues, CRE22549, CJA06599, CBN09988, CBG03656, PPA20960, Bm1876, OVOC10817, BX:BUX.s00351.21, HB:Hba_20989, L. loa LL:EJD75859.1, AS:GS_21000, and MI:Minc01529. E) RML-5 and its homologues, CRE03894, CJA06758, CBN19366, CBG22228, PPA01693, B. malayi Bm10271, OVOC1741, BX:BUX.s00055.268, HB:Hba_07918, LL:EFO20509.2, AS:ASU_05503, and MI:Minc14214.

5.3.4 In vitro Activities of Rhamnose Biosynthetic Enzymes

RML-1, UGP-1, and GMP-1 were expressed in E. coli and purified using nickel affinity resin and then gel filtration (Figure 5-8A, lanes 1-3). Although RML-1 and UGP-1 are both annotated as UDP-Glc pyrophosphorylases, they showed very different substrate specificities. RML-1 demonstrated a broad substrate range towards five nucleotides (dTTP, UTP, CTP, ATP, and GTP) in the presence of Glc-1-P, with higher activity towards dTTP and UTP, and with very low activity towards UTP with Man-1-P (Figure 5-8B). UGP-1 was specifically active towards UTP with Glc-1-P, but displayed very low or no activity towards other nucleotides, with either Glc-1-P or Man-1-P (Figure 5-8C). GMP-1 was specifically active towards GTP in the presence of Glc-1-P or Man-1-P, although it preferred Man-1-P. GMP-1 showed almost no activity towards the other nucleotides (Figure 5-8D). The kinetic parameters of the three enzymes towards different substrates are summarized in Table 5-3.

Figure 5-8. Activity assays with RML-1, GMP-1, UGP-1 and RML-2. A) SDS-PAGE analysis of purified protein: lane M, molecular weight marker; lane 1, GMP-1; lane 2, UGP-1; lane 3, RML-1; lane 4, RML-2; lane 5, RML-3. B), C) and D) Substrate specificity of RML-1, UGP-1 and GMP-1. Enzymatic activities were determined for five different nucleoside triphosphates (1 mM ATP, GTP, CTP, dTTP, or UTP) in the presence of 1 mM substrates Glc-1-P or Man-1-P. E) Substrate specificity of RML-2. Substrate ADP-Glc, CDP-Glc, dTDP-Glc or UDP-Glc 1 was generated from an RML-1-catalyzed reaction of ATP, CTP, dTTP or UTP and Glc-1-P. GDP-Glc was from a GMP-1-catalyzed reaction of GTP and Glc-1-P. UDP-Glc 2 was from a UGP-1-catalyzed reaction of UTP and Glc-1-P. GDP-Man, UDP-GlcNAc and UDP-Gal are commercially available. Negative controls were reaction mixtures without either RML-2 (Cont. 1) or pyrophosphorylase (Cont. 2). The data in B), C), D) and E) are the average (± S.D.) of three independent experiments.

Table 5-3. Kinetic parameters of RML-1, GMP-1, and UGP-1 in the pyrophosphorylase activity assay.

Enzyme   Substrate*         Km (μM)†          kcat (s-1)†    kcat/Km (μM-1·s-1)†
RML-1    UTP (Glc-1-P)      9.47±1.70         21.92±0.61     2.31±0.42
         dTTP (Glc-1-P)     24.22±2.86        28.20±0.76     1.16±0.14
         Glc-1-P (UTP)      32.24±3.65        23.24±0.58     0.72±0.084
         Glc-1-P (dTTP)     49.88±4.14        23.58±0.47     0.47±0.040
GMP-1    GTP (Glc-1-P)      88.98±11.95       27.96±1.27     0.31±0.045
         GTP (Man-1-P)      16.04±1.68        29.05±0.58     1.81±0.19
         Glc-1-P (GTP)      508.90±118.40     33.10±2.90     0.065±0.016
         Man-1-P (GTP)      18.49±2.14        30.31±0.69     1.64±0.19
UGP-1    UTP (Glc-1-P)      13.85±2.95        20.02±0.70     1.45±0.31
         Glc-1-P (UTP)      77.65±7.73        22.96±0.65     0.30±0.031

* Kinetic parameters were determined with different concentrations of sugar-1-phosphate (Glc-1-P or Man-1-P) or NTP (GTP/UTP/dTTP) while concentrations of NTP or sugar-1-phosphate were fixed.
† Km and kcat are given as mean ± S.D. from three replicates, and the standard deviation of kcat/Km was calculated using the Fenner formula 142.
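
As an illustration of how the uncertainty on kcat/Km in Table 5-3 can be obtained, the sketch below propagates the standard deviations of kcat and Km through their ratio. It assumes that the cited formula is equivalent to standard first-order error propagation for a ratio of two independent quantities; that equivalence is an assumption made here, not something stated in the text, although it does reproduce the tabulated value for RML-1 with UTP. The function name is hypothetical.

```python
import math


def catalytic_efficiency(kcat, kcat_sd, km, km_sd):
    """Return (kcat/Km, S.D. of kcat/Km).

    Assumption: the errors on kcat and Km are independent, so the relative
    errors add in quadrature (first-order propagation for a ratio).
    """
    ratio = kcat / km
    relative_sd = math.sqrt((kcat_sd / kcat) ** 2 + (km_sd / km) ** 2)
    return ratio, ratio * relative_sd


# RML-1 with UTP varied and Glc-1-P fixed (values from Table 5-3).
efficiency, efficiency_sd = catalytic_efficiency(21.92, 0.61, 9.47, 1.70)
print(f"kcat/Km = {efficiency:.2f} ± {efficiency_sd:.2f}")
```

Because the relative error on Km is much larger than that on kcat in every row, the uncertainty on kcat/Km is dominated by the Km term.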

We identified RML-2, which belongs to the short-chain dehydrogenase/reductase (SDR) family, as a potential second enzyme in the rhamnose biosynthetic pathway. Similar to homologs such as the dTDP-Glc 4,6-dehydratases in E. coli (RmlB) and in humans, RML-2 contains a Rossmann domain with the conserved NAD+ binding motif GXXGXXG (15GGCGFI21G), as well as the catalytic site YXXXK (164YAAS168K), both of which are essential for the catalytic activity of this type of enzyme. RML-2 was overexpressed in E. coli and purified (Figure 5-8A, lane 4). To test which NDP-glucose was the optimal substrate for RML-2, different kinds of NDP-glucose were prepared from RML-1-, GMP-1-, or UGP-1-catalyzed overnight reactions. Aliquots of the reaction mixtures were incubated with RML-2 in the presence of the cofactor NAD+, and reaction products were monitored by measuring the absorbance at 320 nm under alkaline conditions 143. RML-2 was specifically active towards dTDP-Glc, showing little, if any, activity towards UDP-Glc and no activity towards other potential NDP-glucose substrates or towards several NDP-sugars that are abundant in C. elegans (e.g., GDP-Man, UDP-GlcNAc, and UDP-Gal) (Figure 5-8E). The product of the reaction of RML-2 with dTDP-Glc was analyzed by LC-MS, demonstrating conversion of dTDP-Glc (m/z 563) to dTDP-4-keto-6-deoxyglucose (m/z 545) (data not shown). Even when no exogenous NAD+ was provided, RML-2 still displayed activity towards dTDP-Glc (Figure 5-8E, dTDP-Glc, no NAD+). A similar result has also been seen for other NDP-Glc 4,6-dehydratases, which have been shown to bind their NAD+ cofactors tightly during purification 158,159.

RML-4 was identified as a potential fourth enzyme in the rhamnose biosynthetic pathway using BLAST analysis with the NDP-rhamnose biosynthetic genes RHM2 from Arabidopsis thaliana and RmlB from Streptococcus suis. After amino acid sequence alignment, we compared the domains of RML-4, AtRHM2, SsRmlB and RML-2 (Figure 5-9A). RML-4 has two Rossmann folds with the conserved motifs GXXGXXG and YXXXK, one in its N-terminal domain and one in its C-terminal domain. RML-4 shares 28.97% identity with AtRHM2, in which the N-terminal domain (amino acids 1-370) is a UDP-Glc 4,6-dehydratase, and the C-terminal domain (amino acids 371-667) is a bifunctional UDP-4-keto-6-deoxy-glucose 3,5-epimerase/4-keto-reductase 135. The N-terminal domain of RML-4 (protein sequence 1-334) shares around 28% identity to SsRmlB and RML-2, which have both been shown to be dTDP-Glc 4,6-dehydratases. It is important to note that the dehydratase activity of AtRHM2 has been shown to require, in addition to the essential residues in the Rossmann fold, two other residues, D96 and G193 135. The glycine and aspartate residues are conserved in AtRHM2, SsRmlB, and RML-2. Intriguingly, however, while the glycine residue in RML-4 is conserved, the aspartate residue is substituted with a threonine. This substitution potentially indicates that unlike the AtRHM2 N-terminal domain, SsRmlB, or RML-2, RML-4 may have attenuated or no dehydratase activity. To further investigate the possible role of the aspartate residue in AtRHM2, SsRmlB and RML-2, and the effect of a substitution to a threonine on RML-4 activity, we used Robetta to predict the structure of full-length RML-4 (1-631). SsRmlB without substrate bound (1OC2) was selected as the template for RML-4 N-terminal domain (1-344) modeling, and EcRmlD (1KBZ) was selected as the template for RML-4 C-terminal domain (345-631) modeling. Modeling suggested that the N-terminal domain of RML-4 has very similar secondary and tertiary structural folds as the well-characterized dehydratases. Structural alignment suggests that the aspartate residue may play a crucial role in stabilizing and/or recognizing the substrate dTDP-Glc or UDP-Glc, especially the thymine or uracil group (Figure 5-10). Thus, a substitution of the aspartate residue in AtRHM2, SsRmlB and RML-2 to threonine in RML-4 may disrupt the dehydratase activity of RML-4.

We attempted to express RML-4 with either an N- or C-terminal His, GST or MBP tag, but we were unable to express the enzyme in a stable form. Analysis of rml-4 with the Spell 160 and String 161 databases indicated that the gene is transcriptionally correlated with rml-5 (Y71G12B.6). Sequence analysis suggests that RML-5 is a member of the NAD-dependent epimerase/dehydratase family; however, it lacks the conserved motifs GXXGXXG and YXXXK. As with rml-4, loss of rml-5 gene function results in an embryonic lethal phenotype (Table 5-2). We hypothesized that RML-4 and RML-5 might work together. RML-5 could be overexpressed and purified alone, and furthermore, co-expression of RML-5 with N-terminal His-tagged RML-4 enabled the purification of stable RML-4 as an RML-4/RML-5 complex (Figure 5-9B and C).

Figure 5-9. Comparison of the domain structure of RML-4 with known dehydratases and coexpression of RML-4/RML-5. A) Comparison of the domain structures of RML-4 from C. elegans, RHM2 from A. thaliana (TAIR locus number At1g53500), RmlB in S. suis (GenBank accession number BAM95117), RML-2 and RML-5 from C. elegans. Amino acids shown are the putative conserved NAD(P)+ binding motif GXXGXXG/A, the catalytic motif YXXXK, and other residues that are essential for enzymatic activity (e.g., in AtRHM2, the D96N and G193A mutations were shown to completely abolish the dehydratase activity). B) Gel filtration chromatograms of N-terminal His-tagged RML-5 in pET28a (dashed line) and N-terminal His-tagged RML-4 and RML-5 with no tag in pACYC-Duet1 (solid line). C) SDS-PAGE analysis of gel filtration fractions containing the RML-4/RML-5 complex. Upper band indicates RML-4 (72 kDa), and lower band indicates RML-5 (41.9 kDa).

Figure 5-10. Structural alignment of a modeled structure of RML-4 and the crystal structure of SsRmlB bound to dTDP-Glc (protein data bank code 1KER). Robetta, a full-chain protein structure prediction server, was used to predict the structure of full-length RML-4 (1-631). SsRmlB without substrate (protein data bank code 1OC2) was selected as the template for modeling the RML-4 N-terminal domain (1-344), and RmlD (protein data bank code 1KBZ) was selected as the template for modeling the RML-4 C-terminal domain (345-631). Superimposition of RML-4 (cyan) with RmlB (green) was produced in PyMOL. View is centered to show the binding of the enzyme to substrate dTDP-Glc (magenta) and cofactor NAD+ (red). Residues AtRHM2 D89 and CeRML-4 T94 are shown as sticks.

Given that the N-terminal domains of RML-4 and RML-5 have homology to dehydratases (but lack catalytically important motifs/residues), we first determined whether they had any dehydratase activity. Comparison of dehydratase activities of RML-2 and RML-4/RML-5 on dTDP-Glc and UDP-Glc showed that RML-4/RML-5 had very low, if any, activity on the two substrates (Figure 5-11A). We did not detect any dehydratase activity of RML-4/RML-5 on other nucleotide-sugars (e.g., ADP-Glc, CDP-Glc or GDP-Glc, UDP-Gal, GDP-Man or UDP-GlcNAc). We also did not detect any dehydratase activity for RML-5 alone on various NDP-Glc substrates based on LC-MS analysis (data not shown).

Before testing the potential reductase activity of the RML-4/RML-5 complex, we cloned, expressed and purified a potential third enzyme in the rhamnose biosynthetic pathway, RML-3, a predicted 3,5-epimerase which shares 41.33% identity to RmlC, a well-characterized dTDP-4-keto-6-deoxyglucose/dTDP-4-dehydrorhamnose 3,5-epimerase (Figure 5-8A, lane 5 and Figure 5-12). The reductase activity of the RML-4/RML-5 complex towards dTDP-4-keto-6-deoxyglucose (produced by reaction of RML-2 with dTDP-Glc) was monitored by measuring NAD(P)H consumption. Reductase activity towards dTDP-4-keto-6-deoxyglucose was observed, but only when both RML-4/RML-5 and RML-3 were added to the reaction mixture in the presence of NAD(P)H (Figure 5-11B). Monitoring the reaction by LC-MS indicated conversion of dTDP-Glc (m/z 563) to dTDP-4-keto-6-deoxyglucose (m/z 545) by RML-2, and, subsequently, upon addition of RML-3 and RML-4/RML-5, reduction of the dTDP-4-keto-6-deoxyglucose (to give m/z 547). Similarly, we could also detect consumption of NAD(P)H in the presence of UDP-Glc only when RML-2, RML-3, and RML-4/RML-5 were all present in the reaction mixture (Figure 5-11C). Formation of the product of this reaction was verified using LC-MS by showing some conversion of UDP-Glc (m/z 565) to UDP-4-keto-6-deoxyglucose (m/z 547) by RML-2, and, subsequently, upon addition of RML-3 and RML-4/RML-5, reduction of the UDP-4-keto-6-deoxyglucose (to give m/z 549). These results show that RML-3 may catalyze the 3,5-epimerization reaction and that the RML-4/RML-5 complex functions only on the epimerized product as a 4-keto-reductase. Although UDP-Glc is not a good substrate for RML-2, inclusion of subsequent enzymes in the sequence, RML-3 and RML-4/RML-5, likely pulls the reaction forward, allowing production of the final product (m/z 549).
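
The m/z values followed by LC-MS in this section can be tracked with a small bookkeeping sketch. It assumes nominal [M-H]- masses, with the 4,6-dehydration removing one water (-18), the 3,5-epimerization leaving the mass unchanged, and the 4-keto reduction adding two hydrogens (+2). The constant and function names are illustrative only.

```python
# Nominal mass change for each enzymatic step (assumed values, see lead-in).
DEHYDRATION = -18    # 4,6-dehydratase step (RML-2): loss of H2O
EPIMERIZATION = 0    # 3,5-epimerase step (RML-3): no mass change
REDUCTION = 2        # 4-keto-reductase step (RML-4/RML-5): gain of 2 H


def pathway_mz(ndp_glc_mz):
    """Return the nominal [M-H]- m/z of each species along the pathway."""
    keto = ndp_glc_mz + DEHYDRATION
    epimer = keto + EPIMERIZATION
    rhamnose = epimer + REDUCTION
    return {
        "NDP-Glc": ndp_glc_mz,
        "NDP-4-keto-6-deoxyglucose": keto,
        "epimerized 4-keto intermediate": epimer,
        "NDP-rhamnose": rhamnose,
    }


for label, start in (("dTDP", 563), ("UDP", 565)):
    print(label, pathway_mz(start))
```

Because the epimerization does not change the mass, the RML-3 product cannot be distinguished from its substrate by m/z alone, which is consistent with the reliance on the coupled reductase assay and on NMR of the final product.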

Figure 5-11. Dehydratase and reductase activities of different enzyme combinations. A) 4,6-dehydratase activity of RML-2, RML-4/RML-5 or their combinations with either dTDP-Glc (white) or UDP-Glc (black). dTDP-Glc or UDP-Glc was produced by reaction of RML-1 with Glc-1-P and dTTP or UTP. B) RML-4/RML-5 displayed reductase activity (as indicated by a decrease of absorbance at 370 nm) on dTDP-4-keto-6-deoxyglucose (dTDP-4-keto-6-deoxy-Glc) only when RML-3 and NADH (unless noted as NADPH) were present in the mixture. The substrate for the reactions (dTDP-4-keto-6-deoxyglucose) was prepared by overnight preincubation of dTDP-Glc (generated using RML-1) with RML-2 and NAD+ (except in the negative control without RML-2). C) Only when RML-2, RML-3, and RML-4/RML-5 were coincubated with UDP-Glc, NAD+, and NADH (unless noted as NADPH) was reductase activity detected (as indicated by a decrease of absorbance at 370 nm).

RML-3 MSHPTPGKRFQLEKEVIEAIPDLLVIKPKVFPDERGFFSESYNKTEWAEKIGYTEDLQQD 60
RmlC  ------MNVIQTPLKDCVIIEPKVFGDSRGFFLEAWHKEKY-ENAGIKGNFVQD 47
      : : .: * ::*:**** *.**** *:::* :: *: * . :: **

RML-3 NHSFSHYGVLRGLHTQPH--MGKLVTVVSGEIFDVAVDIRKDSPTYGKWHGVVLNGDNKH 118
RmlC  NRSRSSRNVLRGLHFQKTKPQGKLVSVISGEVYDVAVDLRHDSETFGQYVSVLLSGKKNN 107
      *:* * .****** * ****:*:***::*****:*:** *:*:: .*:*.*.:::

RML-3 AFWIPAGFLHGFQVLSKEGAHVTYKCSAVYDPKTEFGINPFDEDINVDWPIRDKTVVIVS 178
RmlC  QLYVPPGFAHGFCVLS-EYADFHYKCTDFYDPKDEGGIIWNDPDIAIDWPVTEPLLSEKD 166
      :::*.** *** *** * *.. ***: .**** * ** * ** :***: : : .

RML-3 ERDTQHASFK-SL 190
RmlC  IKLITLAEYKKSL 179
      : *.:* **

Figure 5-12. Amino acid sequence alignment of RML-3 with RmlC (GenBank accession number AFC91454) in E. coli by ClustalW. Asterisks below the sequence indicate identical residues.

5.3.5 Identification of dTDP-Rha from Enzymatic Synthesis and Inside Worms

The product of the reaction of RML-3 and RML-4/RML-5 with dTDP-4-keto-6-deoxyglucose (generated from the reaction of RML-2 with dTDP-Glc) was purified by HPLC (based on detection of m/z 547 by LC-MS) for analysis using NMR spectroscopy (Figure 5-13A). Based on 1H NMR (Figure 5-13B) and COSY (Figure 5-13C) spectra, the chemical shifts and coupling constants of the product dTDP-sugar corresponded to those reported for dTDP-rhamnose 150. Comparison to the 1H NMR spectrum of dTDP-Glc (Figure 5-13D) showed the appearance of a doublet at 1.21 ppm, indicating the presence of the 6'' methyl protons. Coupling constants associated with the H1'' proton were J1'', P 8.70 Hz, indicating that the sugar moiety was the β-anomer. Moreover, the coupling constants J3'', 4'' 9.66 Hz and J4'', 5'' 9.60 Hz demonstrated the trans configurations of these protons in rhamnose (Table 5-4). Thus, the NMR spectra demonstrate that the final product is dTDP-rhamnose when dTDP-4-keto-6-deoxyglucose is incubated with RML-3 and RML-4/RML-5 in the presence of NAD(P)H. Although the reaction was not as efficient, incubation of UDP-Glc with RML-2, RML-3 and RML-4/RML-5 produced UDP-β-L-rhamnose, as determined by the mass, 1H NMR, and COSY spectra (Figure 5-14A-D, Table 5-4) 150,162.

Table 5-4. Proton chemical shifts and coupling constants for dTDP-rhamnose and UDP-rhamnose.*

     #     dTDP-rhamnose†                                          UDP-rhamnose‡
           H mult. (J (Hz))                                        H mult. (J (Hz))
T    6     1H, 7.64, s
     7     3H, 1.82, s
U    5                                                             1H, 5.88, d, (J5, 6 = 7.56)
     6                                                             1H, 7.84, d, (J6, 5 = 8.16)
R    1′    1H, 6.23, dd, (J1′, 2′ = 6.78, 7.20)                    1H, 5.88, d, (J1′, 2′ = 4.08)
     2′    2H, 2.27, m                                             1H, 4.27, m
     3′    1H, 4.51, m                                             1H, 4.19, m
     4′    1H, 4.08, overlap                                       1H, 4.13, m
     5′    2H, 4.08, overlap                                       2H, 4.08, m
S    1′′   1H, 5.11, d, (J1′′, P = 8.70)                           1H, 5.11, d, (J1′′, P = 8.40)
     2′′   1H, 3.98, d, (J2′′, 3′′ = 3.12)                         1H, 3.98, d, (J2′′, 3′′ = 3.18)
     3′′   1H, 3.54, dd, (J3′′, 2′′ = 3.36, J3′′, 4′′ = 9.66)      1H, 3.54, dd, (J3′′, 2′′ = 3.30, J3′′, 4′′ = 9.66)
     4′′   1H, 3.28, t, (J = 9.60)                                 1H, 3.28, t, (J = 9.60)
     5′′   1H, 3.34, m                                             1H, 3.34, m
     6′′   3H, 1.21, d, (J6′′, 5′′ = 6.00)                         3H, 1.21, d, (J6′′, 5′′ = 6.60)

* Chemical shifts in ppm are relative to the internal water signal (4.800 ppm). Assignments are based on water-suppressed 1H-NMR and COSY spectra of dTDP-rhamnose. Thymine (T), Uracil (U) and Ribose (R) are from dTDP and UDP, and Sugar (S) indicates Rhamnose.
† dTDP-rhamnose was obtained from coincubation of RML-3 and RML-4/RML-5 with dTDP-4-keto-6-deoxyglucose, generated from reaction of RML-2 with dTDP-Glc in the presence of NADH.
‡ UDP-rhamnose was obtained from coincubation of RML-2, RML-3 and RML-4/RML-5 with UDP-Glc in the presence of NADH.

Figure 5-13. Analysis of the product of the reaction of RML-3, RML-4/RML-5 with dTDP-4-keto-6-deoxyglucose by mass spectrometry and NMR spectroscopy. A) HPLC trace of reaction products and mass spectrum (inset) of peak corresponding to dTDP-rhamnose. HPLC peaks were collected and analyzed by LC-MS to define the fractions. 1H NMR, B) and COSY, C) spectra demonstrate that the product is dTDP-rhamnose. D) 1H-NMR spectrum of dTDP-Glc.

Figure 5-14. Analysis of the product of the reaction of RML-2, RML-3, RML-4/RML-5 with UDP-Glc by mass spectrometry and NMR spectroscopy. A) HPLC trace of reaction products and mass spectrum (inset) of peak corresponding to UDP-rhamnose. HPLC peaks were collected and analyzed by LC-MS to define the fractions. 1H NMR, B) and COSY, C) spectra demonstrate that the product is UDP-rhamnose. D) 1H-NMR spectrum of UDP-Glc.

To investigate whether dTDP/UDP-rhamnose is produced in vivo, we examined the sugar nucleotide pools in bacteria-fed worms and in worms grown in axenic (no bacteria) CeHR medium by LC-MS. Several common sugar nucleotides in worms were detected, including UDP-GlcNAc, which is the most abundant sugar nucleotide in worms 163, UDP-Glc, UDP-Gal, and GDP-Man. dTDP-rhamnose (m/z 547) was detected both in bacteria-fed worms and in worms grown in CeHR medium, indicating that the bacteria are not the source of dTDP-rhamnose (Figure 5-15 and Figure 5-16). UDP-rhamnose was not detected, and thus, it is likely that the biologically relevant biosynthetic pathway produces dTDP-rhamnose, not UDP-rhamnose. LC-MS/MS and LC-MS/MS/MS of the m/z 547 parent ion demonstrated that its fragmentation pattern was consistent with that of dTDP-rhamnose (547->321->195) (Figure 5-17 and Figure 5-18) 164.

Figure 5-15. LC-MS analysis of sugar nucleotides. UV absorbance at 254 nm of sugar nucleotides in mixture of standards A), worm extract from wild-type worms fed with bacteria B), and worm extract from wild-type worms grown in CeHR medium C). Sugar standards are: a, UDP-Gal (m/z, 565); b, UDP-Glc (m/z, 565); c, UDP-rhamnose (m/z, 549); d, UDP-GlcNAc (m/z, 606); e, GDP-Man (m/z, 604); f, dTDP-Glc (m/z, 563); g, dTDP-rhamnose (m/z, 547). The inset enlarges the indicated regions where dTDP-rhamnose elutes (indicated with an asterisk).

Figure 5-16. Ion extraction and mass spectrum of dTDP-rhamnose (m/z 547) from LC-MS of A) a dTDP-rhamnose standard, B) worm extract from wild-type worms fed with bacteria, and C) worm extract from worms grown in CeHR medium. The retention time of dTDP-rhamnose in all three samples was 26.2 min.

Figure 5-17. LC-MS/MS and LC-MS/MS/MS analysis of a dTDP-rhamnose standard. A) Selected ion monitoring (SIM) mode was performed on m/z 547. The retention time of dTDP-rhamnose standard was 8.64 min. B) A selected ion chromatogram of the MS/MS fragment at m/z 321 was extracted. C) LC-MS/MS/MS produced a product ion at m/z 195. Dotted lines in the chemical structure of dTDP-rhamnose indicate the sites where the fragmentations took place.

Figure 5-18. LC-MS/MS and LC-MS/MS/MS analysis of dTDP-rhamnose isolated from worms grown in CeHR. A) SIM mode was performed on m/z 547. The retention time of dTDP-rhamnose was 8.91 min. B) A selected ion chromatogram of the MS/MS fragment at m/z 321 was extracted. C) LC-MS/MS/MS produced a product ion at m/z 195. Dotted lines in the chemical structure of dTDP-rhamnose indicate the sites where the fragmentations took place.

To determine whether the dTDP-rhamnose biosynthetic pathway functions in vivo, we knocked down the expression of rml-2 and rml-3 by RNAi. Analysis of sugar nucleotides showed that RNAi against either gene significantly reduced production of dTDP-rhamnose, but not other NDP-sugars (Figure 5-19A). Thus, rml-2 and rml-3 contribute in vivo to the biosynthesis of dTDP-rhamnose. RNAi against rml-2 or rml-3 did not affect the production of the ascarosides, a group of pheromones produced by C. elegans to coordinate its development and behavior (Figure 5-19B) 165. The ascarosides are derivatives of the 3,6-dideoxy-L-sugar ascarylose, which is structurally similar to L-rhamnose but lacks a hydroxyl at the 3-position. Although little is known about the biosynthesis of ascarylose in C. elegans, its biosynthesis in bacteria has been extensively studied and includes enzymes that are weakly homologous to the enzymes in the rhamnose biosynthetic pathway, as well as additional enzymes 32,34. Our data show that ascarylose biosynthesis does not require the rhamnose biosynthetic genes rml-2 and rml-3.


Figure 5-19. Sugar nucleotide analysis in C. elegans. A) Sugar nucleotide analysis in rrf-3 worms fed with RNAi strains L4440 (control), rml-2, and rml-3. Ion extractions were conducted for each sugar nucleotide, and peak areas were integrated. Each sugar nucleotide was normalized to the NDP-sugar with the highest peak area, UDP-GlcNAc. Two-tailed, unpaired t-tests were conducted to determine the statistical significance for differences in the amount of dTDP-rhamnose (dTDP-Rha) (* indicates P ≤ 0.05). At least three independent experiments were performed. B) LC-MS analysis of ascarosides in the culture medium of rrf-3 worms fed with RNAi strains L4440 (control), rml-2, rml-3, and daf-22.
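The normalization and statistical test described in the caption of Figure 5-19A can be sketched as follows. This is a minimal illustration rather than the analysis pipeline used in this work: the peak areas are invented placeholders, the function names are hypothetical, and equal variances are assumed for the unpaired t-test (the caption does not specify this). The p-value is obtained by numerically integrating the Student's t density so that only the Python standard library is needed.

```python
import math
from statistics import mean, variance


def normalize(peak_areas, reference="UDP-GlcNAc"):
    """Express each sugar nucleotide peak area relative to the reference peak."""
    ref = peak_areas[reference]
    return {name: area / ref for name, area in peak_areas.items()}


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson integration of the density over [0, |t|]."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)


def unpaired_t_test(a, b):
    """Student's unpaired t-test (equal variances assumed). Returns (t, p)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, two_tailed_p(t, df)


# Placeholder peak areas for three replicates per condition (not real data).
control = [{"UDP-GlcNAc": 200.0, "dTDP-Rha": 20.0},
           {"UDP-GlcNAc": 200.0, "dTDP-Rha": 24.0},
           {"UDP-GlcNAc": 200.0, "dTDP-Rha": 22.0}]
knockdown = [{"UDP-GlcNAc": 200.0, "dTDP-Rha": 8.0},
             {"UDP-GlcNAc": 200.0, "dTDP-Rha": 10.0},
             {"UDP-GlcNAc": 200.0, "dTDP-Rha": 6.0}]

control_rha = [normalize(rep)["dTDP-Rha"] for rep in control]
knockdown_rha = [normalize(rep)["dTDP-Rha"] for rep in knockdown]
t_stat, p_value = unpaired_t_test(control_rha, knockdown_rha)
print(f"t = {t_stat:.2f}, two-tailed P = {p_value:.4f}, significant: {p_value <= 0.05}")
```

Normalizing to a single abundant reference peak within each sample, as in the caption, makes the ratios comparable between samples that differ in total extracted material.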

5.3.6 Biosynthesis of dTDP-Rha Involved in Worm Molting Cycles

To further investigate the role of rhamnose biosynthesis in C. elegans, we generated GFP reporter strains, prml-2::gfp and prml-4::gfp. The two reporter strains showed very similar GFP expression patterns in embryos (Figure 5-20A and B, embryo). In larval stages and the adult, both reporters showed GFP expression in the hypodermal syncytium, but not in the seam cells of the hypodermis (Figure 5-20A and B, L1, L2, L3, L4 and adult).


[Figure 5-20 image. A) prml-2::gfp and B) prml-4::gfp, each with panels for Embryo, L1, L2, L3, L4, and Adult.]

Figure 5-20. Expression pattern of the prml-2::gfp and prml-4::gfp reporter genes in the embryo, larval stages, and adult of transgenic worms. A) The prml-2::gfp transgenic array was expressed in the embryo, in L1, L2, L3, and L4 larvae, and in the adult. Scale bars, 20 μm for embryo, 100 μm for L1, L2, L3, L4 and adult, 20 μm for inset. B) prml-4::gfp was expressed throughout the four larval stages and in the adult in all transgenic lines. Higher magnification shows that prml-4::gfp expression is located in the hypodermal syncytium, but not in the seam cells of the hypodermis. Scale bars, 20 μm for embryo, 50 μm for L1, 100 μm for L2, L3, L4 and adult, 20 μm for inset.


Previous global expression profiling experiments have shown that the expression of rml-2, rml-3, rml-4 and rml-5 changes during molting and the sleep-like state of lethargus 166. In addition, the expression of these genes oscillates during C. elegans larval development 167, although rml-1 expression showed only moderate changes. To verify that the expression of these rhamnose biosynthetic genes is coupled to the worm molting cycle, we employed a destabilized GFP fused to a PEST sequence, which results in rapid protein turnover 64,168,169. To monitor changes in rml-2 and rml-4 expression during larval growth, we generated the transcriptional reporters prml-2::gfp-pest and prml-4::gfp-pest. A pulse of GFP-PEST expression was observed in the hypodermal cells during the later part of each larval stage (lL1, lL2, lL3 and lL4), but not in the early part (eL1, eL2, eL3 and eL4) (Figure 5-21A). Quantitation of GFP-PEST expression during the postembryonic molting cycles also revealed an oscillating pattern (Figure 5-21B). Together, these results indicate that the rhamnose biosynthetic genes play a role in the molting cycle of C. elegans.


Figure 5-21. Monitoring the expression of rml-2 and rml-4 transcriptional reporters expressing GFP-PEST during the molting cycle. A) The prml-2::gfp-pest and prml-4::gfp-pest reporter strains expressed GFP as pulses that oscillated with postembryonic molting. eL1, eL2, eL3, and eL4 represent early L1, L2, L3, and L4 stages, respectively. lL1, lL2, lL3, and lL4 represent late L1, L2, L3, and L4 stages, respectively. Scale bars are 100 μm. B) Percentage of prml-2::gfp-pest (solid line) or prml-4::gfp-pest (dashed line) worms that displayed fluorescence. Worms were scored every two hours post-L1 recovery for fluorescence.
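The scoring scheme described for Figure 5-21B (the percentage of fluorescent worms at two-hour intervals) can be sketched as below. The counts are invented placeholders rather than data from this work, and the function names and the 50% threshold used to call a pulse are hypothetical choices made for illustration.

```python
def percent_fluorescent(counts):
    """Convert (hours, n_fluorescent, n_scored) records to (hours, percent)."""
    return [(hours, 100.0 * positive / scored) for hours, positive, scored in counts]


def find_pulses(series, threshold=50.0):
    """Return time points that are local maxima at or above the threshold."""
    pulses = []
    for i in range(1, len(series) - 1):
        hours, value = series[i]
        previous_value = series[i - 1][1]
        next_value = series[i + 1][1]
        if value >= threshold and value > previous_value and value >= next_value:
            pulses.append(hours)
    return pulses


# Placeholder scoring data: (hours post-L1 recovery, fluorescent worms, worms scored).
COUNTS = [(0, 0, 20), (2, 1, 20), (4, 4, 20), (6, 15, 20), (8, 18, 20), (10, 6, 20),
          (12, 1, 20), (14, 2, 20), (16, 12, 20), (18, 19, 20), (20, 5, 20), (22, 0, 20)]

SERIES = percent_fluorescent(COUNTS)
PULSES = find_pulses(SERIES)
print("Pulse maxima (hours post-L1 recovery):", PULSES)
```

With a reporter that turns over rapidly, such as GFP-PEST, the fraction of fluorescent worms falls between pulses, so successive maxima in a series of this kind would be expected to track successive larval stages.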


The prml-2::gfp-pest and prml-4::gfp-pest reporters were also highly expressed in the embryo (Figure 5-22A) and in the seam cells of L1 larvae immediately after recovery from starvation-induced L1 arrest (Figure 5-22B). To investigate whether the gfp-pest reporters were expressed in the dauer stage, the strains were induced to form dauers. In the prml-4::gfp-pest strain, GFP-PEST was expressed predominantly in the seam cells of pre-dauers (Figure 5-22C), but was not expressed at all in dauers (Figure 5-22D). In the prml-2::gfp-pest strain, GFP-PEST was also expressed in the seam cells of pre-dauers, although the expression level in the seam cells varied from worm to worm (Figure 5-22E-1-3), and no GFP-PEST was expressed in dauers (Figure 5-22E-4). As the seam cells are specialized epidermal cells that are critical for cuticle synthesis and molting 138,170, expression of the rhamnose biosynthetic genes in the pre-dauer seam cells suggests that this biosynthetic pathway may be involved specifically in the production of the cuticle and/or surface coat of the dauer stage.


Figure 5-22. Expression patterns of prml-2::gfp-pest and prml-4::gfp-pest at specific developmental stages. A) Embryonic stage, scale bar 20 μm. B) Very early L1 larvae, scale bar 50 μm. C) prml-4::gfp-pest in the pre-dauer stage shows GFP expression in seam cells. D) prml-4::gfp-pest in the dauer stage shows no GFP expression. Scale bars for panels (C) and (D) are 50 μm. E) (1-3) prml-2::gfp-pest in the pre-dauer stage shows GFP expression predominantly in the seam cells. (4) prml-2::gfp-pest in the dauer stage shows no GFP expression. Scale bars are 50 μm and (for inset) 20 μm.

5.4 Discussion and Future Work

In this study, we discovered that worms convert glucose into UDP-ascarylose and uncovered a pathway for the biosynthesis of dTDP-rhamnose from dTDP-Glc in C. elegans (Figure 5-23). Our work is the first to reveal the biosynthesis of UDP-ascarylose in a eukaryote, which is distinct from the biosynthesis of CDP-ascarylose in bacteria. Our analysis of sugar nucleotide pools in C. elegans demonstrated that worms biosynthesize UDP-ascarylose and dTDP-rhamnose in vivo and that dTDP-rhamnose biosynthesis requires the rhamnose biosynthetic pathway that we have characterized. Our work is the first to characterize a rhamnose biosynthetic pathway in nematodes or other metazoans. Furthermore, we show that the pathway is specific to the biosynthesis of rhamnose and is not relevant to the biosynthesis of the related 3,6-dideoxy-L-sugar, ascarylose, which is a core component of the ascaroside pheromones in nematodes. As humans cannot biosynthesize rhamnose, but bacteria can, the rhamnose biosynthetic pathway has been proposed as a potential target for the development of new antibiotics. Similarly, the rhamnose biosynthetic pathway in nematodes could be a potential target for new anthelmintics. The identified enzymes in the C. elegans pathway, RML-1 through RML-5, show very high similarity to their homologs in other nematode species. This conservation suggests an important role for rhamnose in nematodes.

Figure 5-23. Proposed biosynthetic pathway for dTDP-L-rhamnose in C. elegans.


Our results show that RML-1 converts Glc-1-P in the presence of dTTP or UTP into dTDP-Glc or UDP-Glc, respectively. The dTDP-Glc is used for the biosynthesis of rhamnose in C. elegans, while the UDP-Glc may be used in the biosynthesis of glycogen, glycolipids, and/or N-glycans 171-173. A translational reporter for RML-1 has been used to show that this protein is expressed in the intestine, body wall muscle, and hypodermis during larval development and in the intestine and body wall muscle in the adult worm 174. Expression of RML-1 in the hypodermis is consistent with our reporter gene data for RML-2 and RML-4, which suggest that rhamnose biosynthesis occurs in the hypodermis. Expression in the muscles is consistent with a role for RML-1 in glycogen biosynthesis. In addition to RML-1, we characterized two other pyrophosphorylases, GMP-1 and UGP-1, and showed that they act as a GDP-mannose pyrophosphorylase and a UDP-Glc pyrophosphorylase, respectively. GDP-mannose pyrophosphorylase is thought to catalyze the first step of the GDP-fucose biosynthetic pathway in C. elegans, the transfer of GMP from GTP to Man-1-P. GDP-Man is further transformed by GMD-1 or GMD-2 and GER-1 to form GDP-fucose, which is used in the biosynthesis of a fucosylated glycolipid that acts as a toxin receptor 175. Unlike RML-1, the UDP-Glc pyrophosphorylase UGP-1 was previously reported to be strongly upregulated when worms are exposed to desiccation conditions 176. UGP-1 was proposed to function in the biosynthesis of trehalose, one component of the maradolipids in the cuticle, which protect worms from harsh environments 177.

The second step in the rhamnose biosynthetic pathway is catalyzed by RML-2, a dTDP-Glc 4,6-dehydratase that converts dTDP-Glc to dTDP-4-keto-6-deoxyglucose. As with other enzymes of the SDR family, such as the dTDP-Glc 4,6-dehydratase RmlB from E. coli, RML-2 has the conserved motifs GXXGXXG for NAD+ cofactor binding and YXXXK for catalysis. Our results show that even if no exogenous NAD+ is provided, RML-2 still has considerable activity, likely due to the NAD+ cofactor remaining tightly bound during the purification process. The mechanism of C-6 deoxygenation seen in other NDP-Glc 4,6-dehydratases suggests regeneration of NAD+ during the reaction cycle of RML-2 158,178.
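The two conserved motifs named above can be located in a protein sequence with a simple pattern search, sketched below in Python. The sequence is a made-up placeholder and is not the RML-2 or RmlB sequence. The function name is hypothetical, and X is treated as any single residue.

```python
import re

# Lookahead patterns so that overlapping occurrences are also reported.
MOTIFS = {
    "GXXGXXG": r"(?=(G..G..G))",  # NAD+ cofactor-binding motif
    "YXXXK": r"(?=(Y...K))",      # catalytic motif
}


def find_motifs(sequence):
    """Return {motif: [(1-based start position, matched residues), ...]}."""
    seq = sequence.upper()
    return {
        name: [(match.start() + 1, match.group(1)) for match in re.finditer(pattern, seq)]
        for name, pattern in MOTIFS.items()
    }


# Placeholder sequence for illustration only (not a real protein).
TOY_SEQUENCE = "MKILVTGGAGFIGSNLVRYLASKHDE"

HITS = find_motifs(TOY_SEQUENCE)
for motif, occurrences in HITS.items():
    print(motif, occurrences)
```

A match to such short patterns is not sufficient to assign function, since both motifs can occur by chance. In practice the positions would be checked against an alignment with characterized SDR enzymes.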

The third step in the C. elegans rhamnose biosynthetic pathway is catalyzed by RML-3, which we show catalyzes the 3,5-epimerization of dTDP-4-keto-6-deoxyglucose. The last step in the pathway, reduction of the 4-keto group, is catalyzed by RML-4, which could only be purified when coexpressed with RML-5. Although RML-4 has a very similar domain structure to the bifunctional plant enzyme RHM2, comparison of the modeled structure of RML-4 with the structure of RmlB suggests that the N-terminal domain of RML-4 is unlikely to have dehydratase activity. In RML-4, a residue that is essential in RmlB and possibly important for NDP-Glc binding has been substituted (Asp to Thr). Indeed, we were not able to detect any significant dehydratase activity for RML-4.

Our data suggest that rhamnose biosynthesis is involved in embryogenesis and in hypodermal development during the postembryonic molting cycles of C. elegans. C. elegans molts at the end of each larval stage (L1-L4) and must generate a new cuticle with each molt. Expression of rml-2 and rml-4 in the hypodermis oscillates with the molting cycle, peaking before each larval molt. It is possible that dTDP-rhamnose in worms is incorporated into glycoproteins or glycolipids during cuticle and/or surface coat synthesis. A previous study showed that a glycoprotein extracted from the surface of the parasitic nematode Ascaridia galli contains a rhamnose moiety 140. Expression of rml-2 and rml-4 in pre-dauers occurred specifically in the hypodermal seam cells, which are crucial for cuticle production and molting. Thus, rhamnose biosynthesis may be particularly important for the formation of the dauer cuticle and/or surface coat.

Phenotypic analysis of genes involved in rhamnose biosynthesis indicates that knockdown of rml-1, rml-4 and rml-5 (but not rml-2 and rml-3) causes embryonic lethality. This result may indicate that rml-2 and rml-3 work redundantly with other genes in rhamnose biosynthesis, or that rml-1, rml-4, and rml-5 play an essential role unrelated to rhamnose biosynthesis. The discovery of a rhamnose biosynthetic pathway should enable further discoveries about the role of rhamnose in nematode embryonic and larval development and could facilitate the development of new types of anthelmintics.

Future work is required to investigate the biosynthesis of UDP-ascarylose and how UDP-ascarylose is incorporated into the ascarosides by a potential ascarosyltransferase. Because RML-2 and RML-4/RML-5 showed some activity towards UDP-glucose, these enzymes might also participate in UDP-ascarylose biosynthesis. However, the enzyme(s) responsible for deoxygenation of the 3’-OH have yet to be determined.


LIST OF REFERENCES

1. Jones, J.T. et al. Top 10 plant-parasitic nematodes in molecular plant pathology. Mol Plant Pathol 14, 946-61 (2013).

2. Jasmer, D.P., Goverse, A. & Smant, G. Parasitic nematode interactions with mammals and plants. Annu Rev Phytopathol 41, 245-70 (2003).

3. Stepek, G., Buttle, D.J., Duce, I.R. & Behnke, J.M. Human gastrointestinal nematode infections: are new control methods required? Int J Exp Pathol 87, 325-41 (2006).

4. Brenner, S. The genetics of Caenorhabditis elegans. Genetics 77, 71-94 (1974).

5. Consortium, C.e.S. Genome sequence of the nematode C. elegans: a platform for investigating biology. Science 282, 2012-8 (1998).

6. Fire, A. et al. Potent and specific genetic interference by double-stranded RNA in Caenorhabditis elegans. Nature 391, 806-11 (1998).

7. Friedland, A.E. et al. Heritable genome editing in C. elegans via a CRISPR-Cas9 system. Nat Methods 10, 741-3 (2013).

8. Dickinson, D.J., Ward, J.D., Reiner, D.J. & Goldstein, B. Engineering the Caenorhabditis elegans genome using Cas9-triggered homologous recombination. Nat Methods 10, 1028-34 (2013).

9. Baugh, L.R. To grow or not to grow: nutritional control of development during Caenorhabditis elegans L1 arrest. Genetics 194, 539-55 (2013).

10. Golden, J.W. & Riddle, D.L. The Caenorhabditis elegans dauer larva: developmental effects of pheromone, food, and temperature. Dev Biol 102, 368-78 (1984).

11. Jeong, P.Y. et al. Chemical structure and biological activity of the Caenorhabditis elegans dauer-inducing pheromone. Nature 433, 541-5 (2005).

12. Sommer, R.J. Pristionchus pacificus. WormBook, 1-8 (2006).

13. Moreno, E. et al. Regulation of hyperoxia-induced social behaviour in Pristionchus pacificus nematodes requires a novel cilia-mediated environmental input. Sci Rep 7, 17550 (2017).

14. Schlager, B., Wang, X., Braach, G. & Sommer, R.J. Molecular cloning of a dominant roller mutant and establishment of DNA-mediated transformation in the nematode Pristionchus pacificus. Genesis 47, 300-4 (2009).


15. Omura, S. & Crump, A. Ivermectin: panacea for resource-poor communities? Trends Parasitol 30, 445-55 (2014).

16. Fischbach, M.A. & Walsh, C.T. Assembly-line enzymology for polyketide and nonribosomal Peptide antibiotics: logic, machinery, and mechanisms. Chem Rev 106, 3468-96 (2006).

17. Hill, A.M. The biosynthesis, molecular genetics and enzymology of the polyketide-derived metabolites. Nat Prod Rep 23, 256-320 (2006).

18. Castoe, T.A., Stephens, T., Noonan, B.P. & Calestani, C. A novel group of type I polyketide synthases (PKS) in animals and the complex phylogenomics of PKSs. Gene 392, 47-58 (2007).

19. Hojo, M. et al. Unexpected link between polyketide synthase and calcium biomineralization. Zoological Letters 1, 1-16 (2015).

20. Weissman, K.J. Genetic engineering of modular PKSs: from combinatorial biosynthesis to synthetic biology. Nat Prod Rep 33, 203-30 (2016).

21. Miyanaga, A., Kudo, F. & Eguchi, T. Protein-protein interactions in polyketide synthase-nonribosomal peptide synthetase hybrid assembly lines. Nat Prod Rep (2018).

22. Süssmuth, R.D. & Mainz, A. Nonribosomal Peptide Synthesis-Principles and Prospects. Angew Chem Int Ed Engl 56, 3770-3821 (2017).

23. Barajas, J.F., Blake-Hedges, J.M., Bailey, C.B., Curran, S. & Keasling, J.D. Engineered polyketides: Synergy between protein and host level engineering. Synth Syst Biotechnol 2, 147-166 (2017).

24. Sattely, E.S., Fischbach, M.A. & Walsh, C.T. Total biosynthesis: in vitro reconstitution of polyketide and nonribosomal peptide pathways. Nat Prod Rep 25, 757-93 (2008).

25. Du, L. & Lou, L. PKS and NRPS release mechanisms. Nat Prod Rep 27, 255-78 (2010).

26. Cooke, H.A., Guenther, E.L., Luo, Y., Shen, B. & Bruner, S.D. Molecular basis of substrate promiscuity for the SAM-dependent O-methyltransferase NcsB1, involved in the biosynthesis of the enediyne antitumor antibiotic neocarzinostatin. Biochemistry 48, 9590-8 (2009).

27. Aron, Z.D., Dorrestein, P.C., Blackhall, J.R., Kelleher, N.L. & Walsh, C.T. Characterization of a new tailoring domain in polyketide biogenesis: the amine transferase domain of MycA in the mycosubtilin gene cluster. J Am Chem Soc 127, 14986-7 (2005).


28. Kraas, F.I., Helmetag, V., Wittmann, M., Strieker, M. & Marahiel, M.A. Functional dissection of surfactin synthetase initiation module reveals insights into the mechanism of lipoinitiation. Chem Biol 17, 872-80 (2010).

29. Chooi, Y.H. & Tang, Y. Adding the lipo to lipopeptides: do more with less. Chem Biol 17, 791-3 (2010).

30. Trivedi, O.A. et al. Enzymic activation and transfer of fatty acids as acyl-adenylates in mycobacteria. Nature 428, 441-5 (2004).

31. Shou, Q. et al. A hybrid polyketide-nonribosomal peptide in nematodes that promotes larval survival. Nat Chem Biol 12, 770-2 (2016).

32. Butcher, R.A., Fujita, M., Schroeder, F.C. & Clardy, J. Small-molecule pheromones that control dauer development in Caenorhabditis elegans. Nat Chem Biol 3, 420-2 (2007).

33. Butcher, R.A., Ragains, J.R., Kim, E. & Clardy, J. A potent dauer pheromone component in Caenorhabditis elegans that acts synergistically with other components. Proc Natl Acad Sci U S A 105, 14288-92 (2008).

34. Zhang, X. et al. Acyl-CoA oxidase complexes control the chemical message produced by Caenorhabditis elegans. Proc Natl Acad Sci U S A 112, 3955-60 (2015).

35. Butcher, R.A., Ragains, J.R. & Clardy, J. An indole-containing dauer pheromone component with unusual dauer inhibitory activity at higher concentrations. Org Lett 11, 3100-3 (2009).

36. Butcher, R.A. et al. Biosynthesis of the Caenorhabditis elegans dauer pheromone. Proc Natl Acad Sci U S A 106, 1875-9 (2009).

37. Zhang, X., Noguez, J.H., Zhou, Y. & Butcher, R.A. Analysis of ascarosides from Caenorhabditis elegans using mass spectrometry and NMR spectroscopy. Methods Mol Biol 1068, 71-92 (2013).

38. Olson, S.K., Greenan, G., Desai, A., Müller-Reichert, T. & Oegema, K. Hierarchical assembly of the eggshell and permeability barrier in C. elegans. J Cell Biol 198, 731-48 (2012).

39. von Reuss, S.H. et al. Comparative metabolomics reveals biogenesis of ascarosides, a modular library of small-molecule signals in C. elegans. J Am Chem Soc 134, 1817-24 (2012).

40. O'Brien, R.V., Davis, R.W., Khosla, C. & Hillenmeyer, M.E. Computational identification and analysis of orphan assembly-line polyketide synthases. J Antibiot (Tokyo) 67, 89-97 (2014).


41. Wang, H., Fewer, D.P., Holm, L., Rouhiainen, L. & Sivonen, K. Atlas of nonribosomal peptide and polyketide biosynthetic pathways reveals common occurrence of nonmodular enzymes. Proc Natl Acad Sci U S A 111, 9259-64 (2014).

42. Harris, T.W. et al. WormBase 2014: new views of curated biology. Nucleic Acids Res 42, D789-93 (2014).

43. Smith, C.A., Want, E.J., O'Maille, G., Abagyan, R. & Siuzdak, G. XCMS: processing mass spectrometry data for metabolite profiling using nonlinear peak alignment, matching, and identification. Anal Chem 78, 779-87 (2006).

44. Gowda, H. et al. Interactive XCMS Online: simplifying advanced metabolomic data processing and subsequent statistical analyses. Anal Chem 86, 6931-9 (2014).

45. Nass, R. & Hamza, I. The nematode C. elegans as an animal model to explore toxicology in vivo: solid and axenic growth culture conditions and compound exposure parameters. Curr Protoc Toxicol Chapter 1, Unit 1.9 (2007).

46. Bhushan, R. & Brückner, H. Marfey's reagent for chiral amino acid analysis: a review. Amino Acids 27, 231-47 (2004).

47. Lucanic, M. et al. N-acylethanolamine signalling mediates the effect of diet on lifespan in Caenorhabditis elegans. Nature 473, 226-9 (2011).

48. Folick, A. et al. Aging. Lysosomal signaling molecules regulate longevity in Caenorhabditis elegans. Science 347, 83-6 (2015).

49. Zugasti, O. et al. Activation of a G protein-coupled receptor by its endogenous ligand triggers the innate immune response of Caenorhabditis elegans. Nat Immunol 15, 833-8 (2014).

50. Liu, Z. et al. Predator-secreted sulfolipids induce defensive responses in C. elegans. Nat Commun 9, 1128 (2018).

51. Calestani, C., Rast, J.P. & Davidson, E.H. Isolation of pigment cell specific genes in the sea urchin embryo by differential macroarray screening. Development 130, 4587-96 (2003).

52. Liu, H.W. & Thorson, J.S. Pathways and mechanisms in the biogenesis of novel deoxysugars by bacteria. Annu Rev Microbiol 48, 223-56 (1994).

53. Thorson, J.S., Lo, S.F., Ploux, O., He, X. & Liu, H.W. Studies of the biosynthesis of 3,6-dideoxyhexoses: molecular cloning and characterization of the asc (ascarylose) region from Yersinia pseudotuberculosis serogroup VA. J Bacteriol 176, 5483-93 (1994).


54. Johnson, D.A. & Liu, H. Mechanisms and pathways from recent deoxysugar biosynthesis research. Curr Opin Chem Biol 2, 642-9 (1998).

55. Wu, Q., Liu, Y.N., Chen, H., Molitor, E.J. & Liu, H.W. A retro-evolution study of CDP-6-deoxy-D-glycero-L-threo-4-hexulose-3-dehydrase (E1) from Yersinia pseudotuberculosis: implications for C-3 deoxygenation in the biosynthesis of 3,6-dideoxyhexoses. Biochemistry 46, 3759-67 (2007).

56. He, X. & Liu, H.W. Mechanisms of enzymatic C-O bond cleavages in deoxyhexose biosynthesis. Curr Opin Chem Biol 6, 590-7 (2002).

57. Thibodeaux, C.J., Melançon, C.E. & Liu, H.W. Unusual sugar biosynthesis and natural product glycodiversification. Nature 446, 1008-16 (2007).

58. Choe, A. et al. Ascaroside signaling is widely conserved among nematodes. Curr Biol 22, 772-80 (2012).

59. Ludewig, A.H. & Schroeder, F.C. Ascaroside signaling in C. elegans. WormBook, 1-22 (2013).

60. Baugh, L.R. & Sternberg, P.W. DAF-16/FOXO regulates transcription of cki-1/Cip/Kip and repression of lin-4 during C. elegans L1 arrest. Curr Biol 16, 780-5 (2006).

61. Chen, Y. & Baugh, L.R. Ins-4 and daf-28 function redundantly to regulate C. elegans L1 arrest. Dev Biol 394, 314-26 (2014).

62. Boulin, T. & Bessereau, J.L. Mos1-mediated insertional mutagenesis in Caenorhabditis elegans. Nat Protoc 2, 1276-87 (2007).

63. Jeong, M.H., Kawasaki, I. & Shim, Y.H. A circulatory transcriptional regulation among daf-9, daf-12, and daf-16 mediates larval development upon cholesterol starvation in Caenorhabditis elegans. Dev Dyn 239, 1931-40 (2010).

64. Frand, A.R., Russel, S. & Ruvkun, G. Functional genomic analysis of C. elegans molting. PLoS Biol 3, e312 (2005).

65. Ritter, A.D. et al. Complex expression dynamics and robustness in C. elegans insulin networks. Genome Res 23, 954-65 (2013).

66. Fukuyama, M., Kontani, K., Katada, T. & Rougvie, A.E. The C. elegans Hypodermis Couples Progenitor Cell Quiescence to the Dietary State. Curr Biol 25, 1241-8 (2015).

67. Livak, K.J. & Schmittgen, T.D. Analysis of relative gene expression data using real-time quantitative PCR and the 2^(-ΔΔCT) Method. Methods 25, 402-8 (2001).


68. Zhang, X., Zabinsky, R., Teng, Y., Cui, M. & Han, M. microRNAs play critical roles in the survival and recovery of Caenorhabditis elegans from starvation-induced L1 diapause. Proc Natl Acad Sci U S A 108, 17997-8002 (2011).

69. Tamura, K., Stecher, G., Peterson, D., Filipski, A. & Kumar, S. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol Biol Evol 30, 2725-9 (2013).

70. Weber, T. et al. antiSMASH 3.0-a comprehensive resource for the genome mining of biosynthetic gene clusters. Nucleic Acids Res 43, W237-43 (2015).

71. Weber, T. et al. antiSMASH 3.0-a comprehensive resource for the genome mining of biosynthetic gene clusters. Nucleic Acids Res (2015).

72. Pan, L., Gardner, C.L., Pagliai, F.A., Gonzalez, C.F. & Lorca, G.L. Identification of the Tolfenamic Acid Binding Pocket in PrbP from. Front Microbiol 8, 1591 (2017).

73. Modzelewska, K. et al. Neurons refine the Caenorhabditis elegans body plan by directing axial patterning by Wnts. PLoS Biol 11, e1001465 (2013).

74. Nelson, F.K. & Riddle, D.L. Functional study of the Caenorhabditis elegans secretory-excretory system using laser microsurgery. J Exp Zool 231, 45-56 (1984).

75. Buechner, M., Hall, D.H., Bhatt, H. & Hedgecock, E.M. Cystic canal mutants in Caenorhabditis elegans are defective in the apical membrane domain of the renal (excretory) cell. Dev Biol 214, 227-41 (1999).

76. Golden, J.W. & Riddle, D.L. A pheromone influences larval development in the nematode Caenorhabditis elegans. Science 218, 578-80 (1982).

77. Fukuyama, M., Rougvie, A.E. & Rothman, J.H. C. elegans DAF-18/PTEN mediates nutrient-dependent arrest of cell cycle and growth in the germline. Curr Biol 16, 773-9 (2006).

78. Fukuyama, M. et al. C. elegans AMPKs promote survival and arrest germline development during nutrient stress. Biol Open 1, 929-36 (2012).

79. Lee, B.H. & Ashrafi, K. A TRPV channel modulates C. elegans neurosecretion, larval starvation survival, and adult lifespan. PLoS Genet 4, e1000213 (2008).

80. Artyukhin, A.B., Schroeder, F.C. & Avery, L. Density dependence in Caenorhabditis larval starvation. Sci Rep 3, 2777 (2013).

81. Meisel, J.D., Panda, O., Mahanti, P., Schroeder, F.C. & Kim, D.H. Chemosensation of bacterial secondary metabolites modulates neuroendocrine signaling and behavior of C. elegans. Cell 159, 267-80 (2014).


82. Bisang, C. et al. A chain initiation factor common to both modular and aromatic polyketide synthases. Nature 401, 502-5 (1999).

83. Huitt-Roehl, C.R. et al. Starter unit flexibility for engineered product synthesis by the nonreducing polyketide synthase PksA. ACS Chem Biol 10, 1443-9 (2015).

84. Crawford, J.M., Dancy, B.C., Hill, E.A., Udwary, D.W. & Townsend, C.A. Identification of a starter unit acyl-carrier protein transacylase domain in an iterative type I polyketide synthase. Proc Natl Acad Sci U S A 103, 16728-33 (2006).

85. Di Lorenzo, M. et al. A nonribosomal peptide synthetase with a novel domain organization is essential for siderophore biosynthesis in Vibrio anguillarum. J Bacteriol 186, 7327-36 (2004).

86. Mofid, M.R., Finking, R. & Marahiel, M.A. Recognition of hybrid peptidyl carrier proteins/acyl carrier proteins in nonribosomal peptide synthetase modules by the 4'-phosphopantetheinyl transferases AcpS and Sfp. J Biol Chem 277, 17023-31 (2002).

87. Beld, J., Sonnenschein, E.C., Vickery, C.R., Noel, J.P. & Burkart, M.D. The phosphopantetheinyl transferases: catalysis of a post-translational modification crucial for life. Nat Prod Rep 31, 61-108 (2014).

88. Wang, F. et al. Structural and functional analysis of the loading acyltransferase from avermectin modular polyketide synthase. ACS Chem Biol 10, 1017-25 (2015).

89. Akey, D.L. et al. Crystal structures of dehydratase domains from the curacin polyketide biosynthetic pathway. Structure 18, 94-105 (2010).

90. Kwan, D.H. & Schulz, F. The stereochemistry of complex polyketide biosynthesis by modular polyketide synthases. Molecules 16, 6092-115 (2011).

91. Reid, R. et al. A model of structure and catalysis for ketoreductase domains in modular polyketide synthases. Biochemistry 42, 72-9 (2003).

92. Sievers, F. et al. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol 7, 539 (2011).

93. Stachelhaus, T., Mootz, H.D. & Marahiel, M.A. The specificity-conferring code of adenylation domains in nonribosomal peptide synthetases. Chem Biol 6, 493-505 (1999).

94. Challis, G.L., Ravel, J. & Townsend, C.A. Predictive, structure-based model of amino acid recognition by nonribosomal peptide synthetase adenylation domains. Chem Biol 7, 211-24 (2000).


95. Rausch, C., Weber, T., Kohlbacher, O., Wohlleben, W. & Huson, D.H. Specificity prediction of adenylation domains in nonribosomal peptide synthetases (NRPS) using transductive support vector machines (TSVMs). Nucleic Acids Res 33, 5799-808 (2005).

96. Trautman, E.P., Healy, A.R., Shine, E.E., Herzon, S.B. & Crawford, J.M. Domain-Targeted Metabolomics Delineates the Heterocycle Assembly Steps of Colibactin Biosynthesis. J Am Chem Soc 139, 4195-4201 (2017).

97. Katane, M. et al. Characterization of a homologue of mammalian serine racemase from Caenorhabditis elegans: the enzyme is not critical for the metabolism of serine in vivo. Genes Cells 21, 966-77 (2016).

98. Saitoh, Y. et al. Spatiotemporal localization of D-amino acid oxidase and D-aspartate oxidases during development in Caenorhabditis elegans. Mol Cell Biol 32, 1967-83 (2012).

99. Bloudoff, K., Rodionov, D. & Schmeing, T.M. Crystal structures of the first condensation domain of CDA synthetase suggest conformational changes during the synthetic cycle of nonribosomal peptide synthetases. J Mol Biol 425, 3137-50 (2013).

100. Keating, T.A., Marshall, C.G., Walsh, C.T. & Keating, A.E. The structure of VibH represents nonribosomal peptide synthetase condensation, cyclization and epimerization domains. Nat Struct Biol 9, 522-6 (2002).

101. Kotowska, M. & Pawlik, K. Roles of type II thioesterases and their application for secondary metabolite yield improvement. Appl Microbiol Biotechnol 98, 7735-46 (2014).

102. Arribere, J.A. et al. Efficient marker-free recovery of custom genetic modifications with CRISPR/Cas9 in Caenorhabditis elegans. Genetics 198, 837-46 (2014).

103. Farboud, B. & Meyer, B.J. Dramatic enhancement of genome editing by CRISPR/Cas9 through improved guide RNA design. Genetics 199, 959-71 (2015).

104. Kim, H. et al. A co-CRISPR strategy for efficient genome editing in Caenorhabditis elegans. Genetics 197, 1069-80 (2014).

105. Cong, L. & Zhang, F. Genome engineering using CRISPR-Cas9 system. Methods Mol Biol 1239, 197-217 (2015).

106. Zhou, Y. et al. Biosynthetic tailoring of existing ascaroside pheromones alters their biological function in. Elife 7(2018).

107. Waterhouse, A. et al. SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res 46, W296-W303 (2018).


108. Lim, J. et al. Solution structures of the acyl carrier protein domain from the highly reducing type I iterative polyketide synthase CalE8. PLoS One 6, e20549 (2011).

109. Hillson, N.J., Balibar, C.J. & Walsh, C.T. Catalytically inactive condensation domain C1 is responsible for the dimerization of the VibF subunit of vibriobactin synthetase. Biochemistry 43, 11344-51 (2004).

110. Hutter, H., Ng, M.P. & Chen, N. GExplore: a web server for integrated queries of protein domains, gene expression and mutant phenotypes. BMC Genomics 10, 529 (2009).

111. Cao, J. et al. Comprehensive single-cell transcriptional profiling of a multicellular organism. Science 357, 661-667 (2017).

112. Zhang, X., Wang, Y., Perez, D.H., Jones Lipinski, R.A. & Butcher, R.A. Acyl-CoA Oxidases Fine-Tune the Production of Ascaroside Pheromones with Specific Side Chain Lengths. ACS Chem Biol 13, 1048-1056 (2018).

113. Zhou, Y. et al. Biosynthetic tailoring of existing ascaroside pheromones alters their biological function in C. elegans. Elife 7(2018).

114. Estrada, P. et al. The pimeloyl-CoA synthetase BioW defines a new fold for adenylate-forming enzymes. Nat Chem Biol 13, 668-674 (2017).

115. Tripathi, A. et al. A Defined and Flexible Pocket Explains Aryl Substrate Promiscuity of the Cahuitamycin Starter Unit-Activating Enzyme CahJ. Chembiochem 19, 1595-1600 (2018).

116. Hemmerling, F., Lebe, K.E., Wunderlich, J. & Hahn, F. An Unusual Fatty Acyl:Adenylate Ligase (FAAL)-Acyl Carrier Protein (ACP) Didomain in Ambruticin Biosynthesis. Chembiochem 19, 1006-1011 (2018).

117. Arora, P. et al. Mechanistic and functional insights into fatty acid activation in Mycobacterium tuberculosis. Nat Chem Biol 5, 166-73 (2009).

118. Wang, N. et al. Natural separation of the acyl-CoA ligase reaction results in a non-adenylating enzyme. Nat Chem Biol 14, 730-737 (2018).

119. Skyrud, W. et al. Biosynthesis of the 15-Membered Ring Depsipeptide Neoantimycin. ACS Chem Biol 13, 1398-1406 (2018).

120. Alam, J., Beyer, N. & Liu, H.W. Biosynthesis of colitose: expression, purification, and mechanistic characterization of GDP-4-keto-6-deoxy-D-mannose-3-dehydrase (ColD) and GDP-L-colitose synthase (ColC). Biochemistry 43, 16450-60 (2004).

121. Giraud, M.F. & Naismith, J.H. The rhamnose pathway. Curr Opin Struct Biol 10, 687-96 (2000).

122. Kocíncová, D. & Lam, J.S. Structural diversity of the core oligosaccharide domain of Pseudomonas aeruginosa lipopolysaccharide. Biochemistry (Mosc) 76, 755-60 (2011).

123. Caffall, K.H. & Mohnen, D. The structure, function, and biosynthesis of plant cell wall pectic polysaccharides. Carbohydr Res 344, 1879-900 (2009).

124. Mohnen, D. Pectin structure and biosynthesis. Curr Opin Plant Biol 11, 266-77 (2008).

125. Nguema-Ona, E. et al. Cell wall O-glycoproteins and N-glycoproteins: aspects of biosynthesis and function. Front Plant Sci 5, 499 (2014).

126. Allen, S., Richardson, J.M., Mehlert, A. & Ferguson, M.A. Structure of a complex phosphoglycan epitope from gp72 of Trypanosoma cruzi. J Biol Chem 288, 11093-105 (2013).

127. Allard, S.T. et al. Toward a structural understanding of the dehydratase mechanism. Structure 10, 81-92 (2002).

128. Blankenfeldt, W. et al. The purification, crystallization and preliminary structural characterization of glucose-1-phosphate thymidylyltransferase (RmlA), the first enzyme of the dTDP-L-rhamnose synthesis pathway from Pseudomonas aeruginosa. Acta Crystallogr D Biol Crystallogr 56, 1501-4 (2000).

129. Allard, S.T., Giraud, M.F., Whitfield, C., Messner, P. & Naismith, J.H. The purification, crystallization and structural elucidation of dTDP-D-glucose 4,6-dehydratase (RmlB), the second enzyme of the dTDP-L-rhamnose synthesis pathway from Salmonella enterica serovar typhimurium. Acta Crystallogr D Biol Crystallogr 56, 222-5 (2000).

130. Giraud, M.F., Leonard, G.A., Field, R.A., Berlind, C. & Naismith, J.H. RmlC, the third enzyme of dTDP-L-rhamnose pathway, is a new class of epimerase. Nat Struct Biol 7, 398-402 (2000).

131. Stern, R.J. et al. Conversion of dTDP-4-keto-6-deoxyglucose to free dTDP-4-keto-rhamnose by the rmIC gene products of Escherichia coli and Mycobacterium tuberculosis. Microbiology 145 (Pt 3), 663-71 (1999).

132. Graninger, M., Nidetzky, B., Heinrichs, D.E., Whitfield, C. & Messner, P. Characterization of dTDP-4-dehydrorhamnose 3,5-epimerase and dTDP-4-dehydrorhamnose reductase, required for dTDP-L-rhamnose biosynthesis in Salmonella enterica serovar Typhimurium LT2. J Biol Chem 274, 25069-77 (1999).

133. Glaser, L. & Kornfeld, S. The enzymatic synthesis of thymidine-linked sugars. II. Thymidine diphosphate L-rhamnose. J Biol Chem 236, 1795-9 (1961).

134. Giraud, M.F. et al. Overexpression, purification, crystallization and preliminary structural study of dTDP-6-deoxy-L-lyxo-4-hexulose reductase (RmlD), the fourth enzyme of the dTDP-L-rhamnose synthesis pathway, from Salmonella enterica serovar Typhimurium. Acta Crystallogr D Biol Crystallogr 55, 2043-6 (1999).

135. Oka, T., Nemoto, T. & Jigami, Y. Functional analysis of Arabidopsis thaliana RHM2/MUM4, a multidomain protein involved in UDP-D-glucose to UDP-L-rhamnose conversion. J Biol Chem 282, 5389-403 (2007).

136. Schachter, H. Protein glycosylation lessons from Caenorhabditis elegans. Curr Opin Struct Biol 14, 607-16 (2004).

137. Page, A.P. & Johnstone, I.L. The cuticle. WormBook, 1-15 (2007).

138. Gravato-Nobre, M.J., Stroud, D., O'Rourke, D., Darby, C. & Hodgkin, J. Glycosylation genes expressed in seam cells determine complex surface properties and bacterial adhesion to the cuticle of Caenorhabditis elegans. Genetics 187, 141-55 (2011).

139. Politz, S.M., Philipp, M., Estevez, M., O'Brien, P.J. & Chin, K.J. Genes that can be mutated to unmask hidden antigenic determinants in the cuticle of the nematode Caenorhabditis elegans. Proc Natl Acad Sci U S A 87, 2901-5 (1990).

140. Masood, K., Sircar, K.P. & Srivastava, V.M. Purification and characterization of a glycoprotein from the surface of Ascaridia galli. J Helminthol 61, 219-24 (1987).

141. Langenhan, T. et al. Latrophilin signaling links anterior-posterior tissue polarity and oriented cell divisions in the C. elegans embryo. Dev Cell 17, 494-504 (2009).

142. Fenner, G. Das Genauigkeitsmaß von Summen, Differenzen, Produkten und Quotienten der Beobachtungsreihen. Naturwissenschaften 19, 310 (1931).

143. Okazaki, R., Okazaki, T., Strominger, J.L. & Michelson, A.M. Thymidine diphosphate 4-keto-6-deoxy-d-glucose, an intermediate in thymidine diphosphate L-rhamnose synthesis in Escherichia coli strains. J Biol Chem 237, 3014-26 (1962).

144. Marquez, L.A. & Dunford, H.B. Transient and steady-state kinetics of the oxidation of scopoletin by horseradish peroxidase compounds I, II and III in the presence of NADH. Eur J Biochem 233, 364-71 (1995).

145. Read, J.A., Wilkinson, K.W., Tranter, R., Sessions, R.B. & Brady, R.L. Chloroquine binds in the cofactor binding site of Plasmodium falciparum lactate dehydrogenase. J Biol Chem 274, 10213-8 (1999).

146. Timmons, L., Court, D.L. & Fire, A. Ingestion of bacterially expressed dsRNAs can produce specific and potent genetic interference in Caenorhabditis elegans. Gene 263, 103-12 (2001).

147. Kim, D.E., Chivian, D. & Baker, D. Protein structure prediction and analysis using the Robetta server. Nucleic Acids Res 32, W526-31 (2004).

148. Mello, C.C., Kramer, J.M., Stinchcomb, D. & Ambros, V. Efficient gene transfer in C. elegans: extrachromosomal maintenance and integration of transforming sequences. EMBO J 10, 3959-70 (1991).

149. Harris, T.W. et al. WormBase: a comprehensive resource for nematode research. Nucleic Acids Res 38, D463-7 (2010).

150. Watt, G., Leoff, C., Harper, A.D. & Bar-Peled, M. A bifunctional 3,5-epimerase/4-keto reductase for nucleotide-rhamnose synthesis in Arabidopsis. Plant Physiol 134, 1337-46 (2004).

151. Rual, J.F. et al. Toward improving Caenorhabditis elegans phenome mapping with an ORFeome-based RNAi library. Genome Res 14, 2162-8 (2004).

152. Kamath, R.S. et al. Systematic functional analysis of the Caenorhabditis elegans genome using RNAi. Nature 421, 231-7 (2003).

153. Hamilton, B. et al. A systematic RNAi screen for longevity genes in C. elegans. Genes Dev 19, 1544-55 (2005).

154. Winter, J.F. et al. Caenorhabditis elegans screen reveals role of PAR-5 in RAB-11-recycling endosome positioning and apicobasal cell polarity. Nat Cell Biol 14, 666-76 (2012).

155. Simmer, F. et al. Genome-wide RNAi of C. elegans using the hypersensitive rrf-3 strain reveals novel gene functions. PLoS Biol 1, E12 (2003).

156. Sönnichsen, B. et al. Full-genome RNAi profiling of early embryogenesis in Caenorhabditis elegans. Nature 434, 462-9 (2005).

157. Ceron, J. et al. Large-scale RNAi screens identify novel genes that interact with the C. elegans retinoblastoma pathway as well as splicing-related components with synMuv B activity. BMC Dev Biol 7, 30 (2007).

158. Creuzenet, C., Schur, M.J., Li, J., Wakarchuk, W.W. & Lam, J.S. FlaA1, a new bifunctional UDP-GlcNAc C6 Dehydratase/C4 reductase from Helicobacter pylori. J Biol Chem 275, 34873-80 (2000).

159. Vogan, E.M. et al. Crystal structure at 1.8 A resolution of CDP-D-glucose 4,6-dehydratase from Yersinia pseudotuberculosis. Biochemistry 43, 3057-67 (2004).

160. Hibbs, M.A. et al. Exploring the functional landscape of gene expression: directed search of large microarray compendia. Bioinformatics 23, 2692-9 (2007).

161. Franceschini, A. et al. STRING v9.1: protein-protein interaction networks, with increased coverage and integration. Nucleic Acids Res 41, D808-15 (2013).

162. Martinez, V. et al. Biosynthesis of UDP-4-keto-6-deoxyglucose and UDP-rhamnose in pathogenic fungi Magnaporthe grisea and Botryotinia fuckeliana. J Biol Chem 287, 879-92 (2012).

163. Novelli, J.F. et al. Characterization of the Caenorhabditis elegans UDP-galactopyranose mutase homolog glf-1 reveals an essential role for galactofuranose metabolism in nematode surface coat synthesis. Dev Biol 335, 340-55 (2009).

164. Turnock, D.C. & Ferguson, M.A. Sugar nucleotide pools of Trypanosoma brucei, Trypanosoma cruzi, and Leishmania major. Eukaryot Cell 6, 1450-63 (2007).

165. Edison, A.S. Caenorhabditis elegans pheromones regulate multiple complex behaviors. Curr Opin Neurobiol 19, 378-88 (2009).

166. Turek, M. & Bringmann, H. Gene expression changes of Caenorhabditis elegans larvae during molting and sleep-like lethargus. PLoS One 9, e113269 (2014).

167. Hendriks, G.J., Gaidatzis, D., Aeschimann, F. & Großhans, H. Extensive oscillatory gene expression during C. elegans larval development. Mol Cell 53, 380-92 (2014).

168. Meli, V.S., Osuna, B., Ruvkun, G. & Frand, A.R. MLT-10 defines a family of DUF644 and proline-rich repeat proteins involved in the molting cycle of Caenorhabditis elegans. Mol Biol Cell 21, 1648-61 (2010).

169. Perales, R., King, D.M., Aguirre-Chen, C. & Hammell, C.M. LIN-42, the Caenorhabditis elegans PERIOD homolog, negatively regulates microRNA transcription. PLoS Genet 10, e1004486 (2014).

170. Partridge, F.A., Tearle, A.W., Gravato-Nobre, M.J., Schafer, W.R. & Hodgkin, J. The C. elegans glycosyltransferase BUS-8 has two distinct and essential roles in epidermal morphogenesis. Dev Biol 317, 549-59 (2008).

171. Nomura, K.H. et al. Ceramide glucosyltransferase of the nematode Caenorhabditis elegans is involved in oocyte formation and in early embryonic cell division. Glycobiology 21, 834-48 (2011).

172. Schiller, B., Hykollari, A., Yan, S., Paschinger, K. & Wilson, I.B. Complicated N-linked glycans in simple organisms. Biol Chem 393, 661-73 (2012).

173. Peng, H.L. & Chang, H.Y. Cloning of a human liver UDP-glucose pyrophosphorylase cDNA by complementation of the bacterial galU mutation. FEBS Lett 329, 153-8 (1993).

174. van Gemert, A.M. et al. In vivo monitoring of mRNA movement in Drosophila body wall muscle cells reveals the presence of myofiber domains. PLoS One 4, e6663 (2009).

175. Rhomberg, S. et al. Reconstitution in vitro of the GDP-fucose biosynthetic pathways of Caenorhabditis elegans and Drosophila melanogaster. FEBS J 273, 2244-56 (2006).

176. Erkut, C. et al. Molecular strategies of the Caenorhabditis elegans dauer larva to survive extreme desiccation. PLoS One 8, e82473 (2013).

177. Penkov, S. et al. Maradolipids: diacyltrehalose glycolipids specific to dauer larva in Caenorhabditis elegans. Angew Chem Int Ed Engl 49, 9430-5 (2010).

178. Sturla, L. et al. Expression, purification and characterization of GDP-D-mannose 4,6-dehydratase from Escherichia coli. FEBS Lett 412, 126-30 (1997).

BIOGRAPHICAL SKETCH

Likui Feng was born in Weifang, China. He has been interested in science since he was young. He obtained his Bachelor of Science degree from Shandong University and then entered Tsinghua University in Beijing to pursue his Master’s degree. His Master’s work focused on the biochemical and biophysical characterization of human deadenylases. He came to the US in 2012 to study at Indiana University Bloomington and transferred to the University of Florida in the summer of 2013, where he joined Dr. Rebecca Butcher’s lab to pursue his Ph.D. in the Department of Chemistry in the Division of Chemical Biology. His doctoral study focused on the biosynthesis and mechanism of hybrid polyketide-nonribosomal peptides in nematodes, and he also participated in multiple additional independent and collaborative projects. Part of his work has been published and is highlighted in this dissertation (Chapter 2 and Chapter 5). At UF, he was awarded the William R & Arlene F. Ruegamer Fellowship and a Chemistry Teaching Award. He received his Ph.D. from the University of Florida in the Fall of 2018.
