<<

: Early Online, published on May 22, 2020 as 10.1534/genetics.120.303206

1 GENETICS 2

3 Modular organization of cis-regulatory control information of

4 neurotransmitter pathway in C. elegans

5

6 Esther Serrano-Saiz1,2,*, Burcu Gulez1, Laura Pereira1,3, Marie Gendrel1,4, Sze Yen 7 Kerk1, Berta Vidal1, Weidong Feng5, Chen Wang1, Paschalis Kratsios5, James B. Rand6 8 and Oliver Hobert1,*

9 10 1 Department of Biological Sciences, Columbia University, Howard Hughes Medical 11 Institute, New York, USA 12 2 Centro de Biologia Molecular Severo Ochoa/CSIC, Madrid, Spain 13 3 New York Genome Center, New York, USA 14 4 Institut de Biologie de l’Ecole Normale Supérieure (IBENS), Ecole Normale

15 Supérieure, CNRS, INSERM, PSL Research University, Paris, 75005, France 16 5 Department of Neurobiology, University of Chicago, Chicago, USA 17 6 Oklahoma Medical Research Foundation, Oklahoma, USA 18 19 * Correspondence : [email protected] and [email protected] 20

21

1

Copyright 2020. 22 ABSTRACT

23 We explore here the cis-regulatory logic that dictates expression in specific cell 24 types in the nervous system. We focus on a set of eight genes involved in synthesis, 25 transport and breakdown of three neurotransmitter systems, acetylcholine (unc- 26 17/VAChT, cha-1/ChAT, cho-1/ChT, ace-2/AChE), glutamate (eat-4/VGluT) and GABA 27 (unc-25/GAD, unc-46/LAMP, unc-47/VGAT). These genes are specifically expressed in 28 defined subsets of cells in the nervous system. Through transgenic reporter gene 29 assays, we find that the cellular specificity of expression of all of these genes is 30 controlled in a modular manner through distinct cis-regulatory elements, corroborating 31 the previously inferred piece-meal nature of specification of neurotransmitter identity. 32 This provides the mechanistic basis for the phenomenon of phenotypic 33 convergence, in which distinct regulatory pathways can generate similar phenotypic 34 outcomes (i.e. the acquisition of a specific neurotransmitter identity) in different neuron 35 classes. We also identify cases of enhancer , in which the same cis-regulatory 36 element is utilized to control gene expression in distinct neuron types. We CRISR/Cas9- 37 engineered a cis-regulatory of the vesicular acetylcholine transporter, unc- 38 17/VAChT to assess the functional contribution of a “shadowed” enhancer. We observe 39 a selective loss of unc-17/VAChT expression in one cholinergic pharyngeal pacemaker 40 motor neuron class and a behavioral that matches microsurgical removal of 41 this neuron. Our analysis illustrates the power of understanding cis-regulatory 42 information to manipulate gene expression and control behavior.

43

44

2 45 INTRODUCTION

46 Only a small fraction of an animal’s genome is transcribed into messenger, 47 regulatory, or structural . Much of the remainder of the genome carries cis- 48 regulatory information in the form of binding sites for trans-acting regulatory factors that

49 dictates patterns of gene expression (DAVIDSON 2006). Genomic binding profiles have 50 been determined for many transcriptional regulators but it is far from clear how reliably 51 such binding profiles predict the presence of functionally relevant cis-regulatory

52 elements (ECKER et al. 2012; DOOLITTLE 2013; EDDY 2013; BROWN AND CELNIKER 2015). 53 De novo prediction of functional cis-regulatory elements by bioinformatic analysis has 54 also proven to be challenging, at least in part due to the fact that binding sites of trans- 55 acting regulatory factors are often small and degenerate. An alternative approach to 56 define and decode cis-regulatory control elements employs transgenic reporter gene 57 analysis, which can reveal the functionality of cis-regulatory elements based on their

58 ability to direct reporter expression in specific cell types (YANEZ-CUNA et al. 2013).

59 The identification of cis-regulatory elements by reporter gene analysis has the 60 potential to reveal insights into the regulatory logic of cell type-specific gene expression. 61 Genes are often expressed in multiple distinct cell types. This could indicate that distinct 62 cell types may utilize the same that operats through a singular cis- 63 regulatory element to turn on expression of a given target gene in distinct cell types. 64 Alternatively, a gene could be expressed in different cell types by harboring multiple cis- 65 regulatory elements that are responsive to distinct, cell-type specific transcription 66 factors. This problem is particularly evident in the nervous system, in which distinct 67 neuron types choose to express usually one of a discrete number of different 68 neurotransmitter systems, thereby acquiring a specific “neurotransmitter identity”. 69 Neurotransmitter identity is determined by specific enzymes and transporters that 70 synthesize and synaptically package a given neurotransmitter (“neurotransmitter 71 pathway genes”) (Fig.1A). The neurotransmitter identity of a specific neuron can be 72 considered a phenotypic trait. This trait (e.g. the usage of glutamate as 73 neurotransmitter) is shared by many different neuron types. Hence, the question arises 74 as to whether such a shared trait is 1) induced by the same transcriptional regulatory 75 mechanism that operates through the same cis-regulatory elements in all neuron types

3 76 that share this neurotransmitter phenotype (“singular control”) - or 2) is independently 77 controlled by different transcription factor or transcription factor combinations in distinct 78 neuron types through distinct cis-regulatory elements (“modular control”)(Fig.1B). The 79 analysis of cis-regulatory control regions of neurotransmitter pathway genes has the 80 potential to distinguish between these two possibilities. If the phenotypic trait of a 81 specific neurotransmitter identity is regulated separately in distinct neuron types, cis- 82 regulatory information should be organized in a modular manner. That is, discrete cis- 83 regulatory elements will be responsible for driving expression of the same 84 neurotransmitter pathway gene in different neuron types.

85 There is presently ample evidence to support such modular cis-regulatory

86 organization, particularly in the nematode C. elegans (EASTMAN et al. 1999; WENICK AND

87 HOBERT 2004; FLAMES AND HOBERT 2009; KRATSIOS et al. 2011; SERRANO-SAIZ et al.

88 2013; LLORET-FERNANDEZ et al. 2018). We further extend this concept here by an 89 analysis of the cis-regulatory control regions of several neurotransmitter pathway genes. 90 However, we also uncover additional complexities. For a few neuron classes we have 91 not been able to define independent cis-regulatory elements, indicating that the same 92 transcription factor may act in different cell types, but assisted by different co-regulators 93 in distinct cell types. We discuss our findings in the context of phenotypic convergence, 94 a concept aiming to explain how a phenotypic trait shared by distinct neuron types 95 (neurotransmitter identity) can be specified by distinct regulatory means

96 (KONSTANTINIDES et al. 2018; KRATSIOS AND HOBERT 2018). We also report on evidence 97 for the existence of enhancer pleiotropy, in which similar cis-regulatory elements can 98 drive expression in distinct neuron classes.

99 Besides providing mechanistic insights into gene regulation, substantial 100 methodological benefits can be gained from the dissection of cis-regulatory control 101 regions. Cis-regulatory elements that drive expression in a small subset of cell types 102 provide “genetic access” to these cell types. That is, these elements can be used as 103 drivers to express genetic tools that allow the visualization or isolation of these cells, or 104 in the context of the nervous system, the optogenetic activation or inhibition of these 105 cells.

4 106 MATERIAL AND METHODS

107 108 Transgenes 109 A list of all transgenic strains used in this study is shown in Suppl. Table S1. Strains 110 were generated by injecting the transgene mix as simple or complex arrays. Briefly, 111 complex arrays contained the fosmid (for otIs388, otIs518, otIs354, otEx6066 and 112 otEx4431) linearized by restriction enzyme digestion or obtained as a PCR product (for 113 unc-17, cho-1, ace-2, unc-25, unc-46 and unc-47 reporters, Suppl. Table S2). The 114 reporters were mixed with bacterial genomic DNA from E. coli OP50 (previously 115 sonicated and purified with Gentra Puregene kit from QIAGEN) and a co-injection 116 marker plasmid, also linearized by restriction enzymes. For simple arrays, plasmids 117 were mixed directly with the co-injection marker plasmid. For each construct, two to 118 three transgenic lines and approximately 30 worms per line were analyzed. 119 120 DNA constructs 121 A list of all reporter constructs and sequences can be found in Suppl. Table S2. 122 Reporters were generated by different methods: 1) cloned into pDP95.75, pOH569 or 123 pPD95.67 using standard molecular cloning techniques; 2) by PCR fusion

124 described in (HOBERT 2002) or 3) fosmid recombineering described in (TURSUN et al. 125 2009). 126 127 CRISPR/Cas9 genome engineering 128 The unc-17 regulatory allele ot1025 was generated using CRISPR/Cas9 genome

129 engineering as described (DOKSHIN et al. 2018). The E3 motif was replaced with a PmeI 130 restricited site with the following oligo repair template: 131 tttttctgtctcaaaacggagctcgagcccatgacgtgtGTTTAAACtccgcccaattccttca (capital letters 132 indicate the introduced PmeI site replacing the E3 motif). The engineering was done in

133 unc-17(ot907[unc-17::mKate2::3xFLAG]IV) (PEREIRA et al. 2019)(strain 134 OH15568), generating strain OH16363 unc-17(ot907ot1025). 135 136 Neuron identification

5 137 Neuronal identification was performed by co-localization of the reporter constructs with 138 landmark strains. Specifically, individual reporter transgene expressing rfp or mChopti 139 were crossed with a green transgenic landmark strain, otIs354 [cho-1fosmid::sl2 ::yfp fosmid 140 ::h2b] (PEREIRA et al. 2015) or otIs388 [eat-4 ::sl2::yfp::h2b] (SERRANO-SAIZ et al. 141 2013), and gfp or yfp transgenic expressing lines were crossed with either otIs544 [cho- 142 1fosmid::sl2::mChopti::h2b] or otIs518 [eat-4fosmid::sl2 ::mChopti ::h2b] or otIs564 [unc- fosmid 143 47 ::sl2::h2b::mChopti (GENDREL et al. 2016). 144 Worms were anesthetized using 100 mM of sodium azide (NaN3) and mounted on 5% 145 agarose on glass slides. All images were acquired using a Zeiss confocal microscope 146 (LSM880). Representative images are shown following orthogonal projection of 2–10 z- 147 stacks. Images shown in Fig.8 were taken using an automated fluorescence 148 microscope (Zeiss, AXIO Imager Z1 Stand). Acquisition of several z-stack images was 149 performed with the Micro-Manager software. Representative images are shown using 150 the maximum intensity projection type. Image reconstruction was performed using 151 ImageJ software (Schneider et al., 2012). 152 153 Scoring of pharyngeal pumping 154 Pumping assays were performed at room temperature. Household bleach-mediated egg 155 preparations were used to synchronize C. elegans, animals were transferred from an 156 uncrowded non-starved plate to a fresh NGM plate with a uniform thin layer of OP50. 157 After 5 minutes recovery, they were observed under a Nikon Eclipse E400 upright 158 microscope equipped with DIC optics using 50X objective lens for scoring L1 stage 159 animals and 20x objective for adult animals. The number of pumps in a 20 sec period 160 was measured by focusing on a single worm and then this number was multiplied by 3 161 to obtain pumps per minute (ppm). For each stage, ppm was counted for 30 worms. 162 163

6 164 RESULTS

165 In previous work, we used fosmid (large genomic clone)-based reporter and/or 166 CRISPR/Cas9-engineered reporter to comprehensively assign neurotransmitter

167 identities throughout the entire C. elegans nervous system (SERRANO-SAIZ et al. 2013;

168 PEREIRA et al. 2015; GENDREL et al. 2016)(Fig.1C). Cholinergic neurons are marked by 169 two fosmid-based reporters: (1) the unc-17 , which encodes the C. elegans

170 orthologue of the vesicular ACh transporter VAChT (ALFONSO et al. 1993) and reports 171 on transcription of both unc-17/VAChT and cha-1/ChAT, which shares its first non-

172 coding exon with unc-17 (ALFONSO et al. 1994); and (b) cho-1, the choline reuptake

173 transporter (OKUDA et al. 2000; MULLEN et al. 2007) which marks most, but not quite all

174 cholinergic neurons (PEREIRA et al. 2015). Expression of the fosmid-based unc-17 175 reporter has been confirmed with a CRISPR/Cas9-engineered reporter insertion into the

176 unc-17 locus (PEREIRA et al. 2015). A fosmid-based reporter for ace-2, one of several 177 C. elegans acetylcholinesterase encoding genes, is expressed in a subset of cholinergic

178 neurons (PEREIRA et al. 2015). All glutamatergic neurons are marked by a fosmid-based

179 reporter of the vesicular transporter for glutamate, eat-4/VGLUT (LEE et al. 1999;

180 SERRANO-SAIZ et al. 2013). Lastly, GABAergic neurons are marked by CRISPR/Cas9- 181 engineered reporter allele of the unc-25 locus, which encode glutamic acid

182 decarboxylase (GAD; unc-25)(JIN et al. 1999; GENDREL et al. 2016) and by fosmid-

183 based reporters for the unc-47 and unc-46 loci (GENDREL et al. 2016) which encode the

184 vesicular GABA transporter unc-47/VGAT (MCINTIRE et al. 1997) and the associated

185 LAMP cofactor (unc-46)(SCHUSKE et al. 2007), respectively. We collectively refer to the 186 cohort of genes as “neurotransmitter pathway genes” (Fig.1A).

187 Armed with the knowledge of the precise expression pattern of those fosmid- 188 based and/or CRISPR/Cas9 engineered reporters, we generated a series of reporter 189 constructs which contain distinct DNA fragments of the respective loci. We generated 190 transgenic C. elegans animals carrying these reporters and analyzed their expression 191 pattern in the nervous system with single-cell resolution. This analysis was motivated by 192 a specific hypothesis. Based on previous studies of cis-regulatory control regions of 193 neurotransmitter pathway genes, as well as extensive analysis of genetic mechanisms 194 that control neurotransmitter specification, a theme of modular control of

7 195 neurotransmitter identity has been emerging (EASTMAN et al. 1999; WENICK AND HOBERT

196 2004; KRATSIOS et al. 2011; SERRANO-SAIZ et al. 2013; PEREIRA et al. 2015; GENDREL et

197 al. 2016; LLORET-FERNANDEZ et al. 2018). To test whether the theme of modularity 198 broadly applies to multiple neurotransmitter pathway genes, we carried out an extensive 199 dissection of their cis-regulatory control regions.

200 Cholinergic pathway genes: Regulation of the cholinergic locus (unc- 201 17/VAChT and cha-1/ChAT). In animal species ranging from worms to vertebrates, the 202 enzymes that synthesize (choline acetyltransferase; cha-1) and package acetylcholine 203 (ACh) into synaptic vesicles (vesicular ACh transporter; unc-17) are organized into an

204 -like conserved structure, termed the “cholinergic locus” (ALFONSO et al. 1994;

205 EIDEN 1998). Hermaphrodite animals carrying five kilobases (kb) upstream of the first 206 -coding exon of unc-17 (unc-17prom1 drives reporter gene expression in the 207 majority, but clearly not all, cholinergic neurons (prom1 in Fig.2B). Within the male- 208 specific nervous system, only a very small fraction of neurons expresses this reporter. A 209 1kb fragment preceding this 5kb fragment captures a number of additional cholinergic 210 neurons, in both the -shared as well as male-specific nervous system (unc- 211 17prom10, Fig.2B).

212 Expression of unc-17/VAChT is eliminated in cholinergic motor neurons of the 213 ventral nerve cord, the flanking retrovesicular and preanal ganglia as well as command 214 interneurons by removal of distal sequences within the 5kb fragment (compare unc- 215 17prom1 versus unc-17prom2) that contain a binding site for the UNC-3 transcription 216 factor (“COE motif”), the terminal selector of ventral cord motor neuron and command

217 interneuron identity(KRATSIOS et al. 2011; PEREIRA et al. 2015). Conversely, a construct 218 containing only distal sequences and the COE motif (unc-17prom6) is sufficient to drive

219 reporter gene expression in cholinergic MNs at VNC, RVG and PAG (KRATSIOS et al. 220 2011). Further shortening of the 5kb fragment (unc-17prom3) leads to consecutive loss 221 of expression in more neuron classes (Fig.2A).

222 Reporter animals carrying 3,249bp of cis-regulatory information upstream of the 223 non-coding first exon shared by unc-17 and cha-1 (unc-17prom5) yields almost identical 224 expression as the 5kb construct (unc-17prom1) that contained the first intron of the

8 225 locus (unc-17prom1). We broke these 4kb region into several non-overlapping pieces. 226 One small piece, which contains the COE motif is sufficient to drive expression in VNC

227 MNs (unc-17prom6), as previously described (KRATSIOS et al. 2011). Three additional, 228 non-overlapping ~1kb fragments (unc-17prom7 to prom9) show distinct patterns of 229 expression in largely (but not completely) non-overlapping neuron classes. One 230 fragment (unc-17prom7) drives expression exclusively in head motor neuron classes 231 (RMD, SMD, SMB, SIA, SIB, RMF, RMH) as well as two ring interneuron classes (RIR, 232 RIV). Another, distinct ~900 bp element captures expression in pharyngeal inter- and 233 motor neuron classes (unc-17prom8). Notably, very similar expression is observed by a 234 different regulatory element in the first intron of the unc-17 locus unc-17prom4). The 235 ~900bp upstream pharyngeal element is expressed in a few extra-pharyngeal neuron 236 classes (AIY, AWB, RIH). Lastly, the third ~1kb fragment is exclusively expressed in 237 three neuron classes of the anterior ganglion (IL2, URA, URB) (unc-17prom9).

238 We sought to increase the granularity of our cis-regulatory analysis via 239 phylogenetic footprinting. We identified multiple conserved sequence motifs in the 240 upstream region of the unc-17/cha-1 cholinergic locus. Examining these regions in 10 241 different nematode species, we identified a host of small conserved elements (Fig.3A). 242 We examined the functional relevance of two small clusters of these elements in the 243 context of two separate ~1kb fragments (Fig.2A, B). One is the ~900bp element (unc- 244 17prom8) that drives expression in a few pharyngeal and extra-pharyngeal neuron 245 classes (AIY, AWB, RIH). This element contains a small motif (“E6”) that we previously 246 identified to be a binding site for the terminal selector-type TTX-3/CEH-10 247 homeodomain complex (“AIY motif”). We found this motif to be required for unc-17 248 expression and hence the acquisition of cholinergic identity of the AIY interneuron

249 (WENICK AND HOBERT 2004). An adjacent element (“E7”) matches the consensus binding 250 site for the UNC-86 POU homeodomain transcription factor and we previously found

251 UNC-86 to be required for cholinergic identity acquisition of the RIH neuron (PEREIRA et 252 al. 2015). We find that combined deletion of E6 and E7 results in the expected loss of 253 expression of this element both in AIY and RIH (unc-17prom8_B30, Fig.2A). 254 Expression in AWB, which expresses neither ttx-3, ceh-10 nor unc-86 is also lost, 255 suggesting that a related homeodomain transcription factor that recognizes E6 and/or

9 256 E7 motifs, may control AWB cholinergic identity.

257 The other ~1kb element (unc-17prom9), which is exclusively expressed in the 258 IL2, URA and URB neuron classes, contains a stretch of conserved sequence motifs 259 (termed F2 in Fig.3B) including another predicted binding site for the UNC-86 POU 260 homeodomain transcription factor, previously shown to be required for cholinergic

261 identity acquisition of the IL2, URA and URB neuron classes (ZHANG et al. 2014). 262 Deletion of the F2 motif affects expression in all these neuron classes (unc- 263 17prom9_C39, Fig.2A). In summary, this cis-regulatory analysis indicates that (a) the 264 terminal selector UNC-86 exerts its effect on cholinergic identity specification in distinct 265 neuron classes likely via direct binding and activation of the unc-17/cha-1 cholinergic 266 locus, and (b) that UNC-86 operates through distinct regulatory elements in different 267 cholinergic neuron cell types. These points are further supported by a distal cis- 268 regulatory element that is expressed in four different neuron classes, ADF, RIH, RIR 269 and URX (unc-17prom10, Fig.2A, B). Three of these neuron classes express UNC-86 270 and at least two of them (RIH and URX) have been shown to require UNC-86 for proper

271 unc-17 expression (PEREIRA et al. 2015).

272 Lastly, we sought to define the minimal cis-regulatory element upstream of unc- 273 17 that is sufficient to drive unc-3-dependent expression in cholinergic motor neuron 274 classes in the retrovesicular ganglion (RVG), ventral nerve cord (VNC) and pre-anal 275 ganglion (PAG). We focused on unc-17prom6 (1001bp) (Fig.2A), which contains the 276 UNC-3 binding site (COE motif) previously shown to be required for reporter gene

277 expression in unc-3-dependent neuron classes in the RVG, VNC and PAG (KRATSIOS et 278 al. 2011). Shorter reporter versions of 250bp, 200bp, 150bp and 125 bp that contain the 279 COE motif are sufficient to drive expression in motor neuron classes of the RVG, VNC 280 and PAG (Fig.2C). Through the introduction of a number of deletions surrounding the 281 COE motif, we found no evidence for the existence of additional required cis-regulatory 282 elements (Fig.2C). While this may indicate that UNC-3 is the sole factor required for 283 expression of unc-17 in motor neuron classes of the RVG, VNC and PAG, we note that 284 there are highly conserved sequences directly adjacent to the COE motif. Whether 285 these sequences are covered by the UNC-3 protein (which contacts DNA beyond the

286 COE motif (TREIBER et al. 2010) is presently not clear.

10 287 Cholinergic pathway genes: Regulation of the cho-1 locus. Dissection of the 288 cho-1/ChT locus reveals a similar modular logic. A ~5kb fragment upstream of the cho-1 289 locus recapitulates almost the entire expression pattern of the fosmid-based reporter 290 (cho-1prom1, Fig.4A). Successive deletions from the 5’ end result in sequentially 291 increasing losses of expression in multiple cholinergic neuron types. The 2.7kb 292 fragment (cho-1prom3) directly upstream of the start site of the gene only contains 293 regulatory information for six classes of head and tail neurons, including the only two 294 pharyngeal cholinergic neuron classes that express cho-1. Expression in the pharyngeal 295 neurons can be separated from the remaining few head and tail neurons through a 296 deletion from the 3’-end of this construct (compare cho-1prom3 with cho-1prom11, 297 prom12, prom16). As it is the case for the unc-17 locus, there appears to be two distinct 298 regions of the cho-1 locus that can drive expression in pharyngeal neurons (Fig.4). 299 Another striking similarity to unc-17 regulation is that we isolated from both cho-1 and 300 unc-17 an element that is active in the same set of extra-pharyngeal neuron classes: 301 ADF, RIH, RIR and URX. As mentioned above, three of these neuron classes express 302 UNC-86 and two have been shown to require UNC-86 for both unc-17 and cho-1

303 expression (PEREIRA et al. 2015).

304 Successive deletions of constructs from the 3’ end also break expression into 305 distinct neuron classes (Fig.4A). Most notable is the previously reported 280bp distal 306 element that contains the unc-3 binding COE motif (cho-1prom8). The expression of this 307 element precisely matches all neurons throughout the C. elegans nervous system that 308 require UNC-3 for their cholinergic identity specification. Removal of 60bp from this 5’ 309 end of this element, leaving the COE motif intact, eliminates expression in the UNC-3(+) 310 neurons (compare prom8 and prom9) suggesting that these 60bp contain a binding site 311 for another factor that may collaborate with UNC-3. Such a factor is expected to exist 312 because UNC-3 is alone not sufficient to induce cholinergic fate (e.g. UNC-3 is 313 expressed in the non-cholinergic ASI neurons (Prasad et al. 1998).

314 Cholinergic pathway genes: Regulation of ace-2. The ace-2 gene is an 315 orthologue of human acetylcholinesterase (ACHE), an enzyme that catalyzes the

316 breakdown of ACh (CULOTTI et al. 1981). Using a fosmid-based reporter, we previously 317 determined that ace-2 is expressed in some, but not all cholinergic neuron classes

11 318 (PEREIRA et al. 2015)(Fig.5). A distal element of 1.8kb (prom1) contains two COE 319 motifs, and is sufficient for expression in cholinergic VNC motor neurons, as previously

320 described (KRATSIOS et al. 2011). Moreover, we find this element to be expressed in 321 command interneuron classes (AVA, AVB, AVE, as well as RIH and DVA neuron 322 classes, almost completely recapitulating the ace-2 fosmid expression pattern (Fig.5). 323 Supporting the notion of modularity, a shorter 1kb version of this 1.8kb element (prom2) 324 drives reporter expression only in AVA, AVB, DVA, and a limited number of VNC motor 325 neuron classes (Fig.5). A small 180bp element (prom3) at the distal end of the 1.8 kb 326 fragment drives expression in VNC motor neuron classes, as well as a dozen of head 327 (IL2D/V, URA, URB) and tail (PLM, PHC) neuron classes that normally do not express 328 ace-2 (based on fosmid-based reporter). This is one of the rare cases where an isolated 329 cis-regulatory element led to de-repression in ectopic neuron types.

330 Glutamatergic identity: Regulation of eat-4/VGluT expression. In previous 331 work, we had already mapped cis-regulatory elements that drive expression of eat-4 in

332 distinct eat-4(+) neuron classes in the hermaphrodite nervous system (SERRANO-SAIZ et 333 al. 2013). While we had examined eat-4 expression in the male nervous system as well

334 (SERRANO-SAIZ et al. 2017b), we had not examined cis-regulatory elements that drive 335 expression in the male nervous system. We now find that such elements are located in 336 distinct regions of the intergenic sequence upstream of the eat-4 locus (Fig.6A). One 337 element, the 377bp eat-4prom13 element, drives reporter gene expression exclusively 338 in the male-specific PVV neuron) (Fig.6B). This element will be a useful driver to 339 visualize and manipulate PVV neuron anatomy and . Some of the elements of 340 the eat-4 locus appear to act redundantly, i.e. separate regions drive expression in the 341 same male-specific neuron classes. For example, two non-overlapping elements are 342 expressed in PVV (eat-4prom1 and eat-4prom13) (Fig.6).

343 We also undertook a more fine-grained analysis of the previously identified 344 regulatory elements that drive expression in multiple glutamatergic neuron classes with 345 the goal of identifying putative transcription factor binding sites. Given the

346 preponderance of genes in controlling neurotransmitter identity (JIN et al.

347 1994; SERRANO-SAIZ et al. 2013; ZHANG et al. 2014; PEREIRA et al. 2015; GENDREL et al. 348 2016), we mutated a number of predicted homeodomain binding sites (TAAT) as well as

12 349 a predicted and highly conserved POU homeodomain site in various cis-regulatory 350 elements of the eat-4 locus. We found that several of these deletions lead to losses of 351 expression of the respective eat-4 reporter in very specific neuron classes. For 352 example, deletion of distinct, predicted homeodomain binding sites eliminates 353 expression in individual, neuron classes of the pharynx within the eat-4prom7 element 354 (Fig.6A). Similarly, deletion of distinct homeodomain binding sites eliminates expression 355 in distinct ray sensory neurons and deletion of a conserved POU homeodomain binding 356 site specifically eliminates expression in the male-specific PVV neuron (eat-4prom27, 357 Fig.6). Taken together, this cis-regulatory analysis confirms and extends our previous 358 conclusion that the eat-4 locus, and hence glutamatergic neurotransmitter identity, is 359 controlled in a modular manner through distinct cis-regulatory elements that likely

360 respond to neuron-type specific homeodomain transcription factors (SERRANO-SAIZ et al. 361 2013).

362 GABAergic pathway. The cis-regulatory region of the unc-25/GAD locus that 363 directs expression in all GABAergic neuron classes, except the RIB neuron class whose

364 GABAergic identity was discovered recently (GENDREL et al. 2016), had been previously

365 localized to <1kb upstream of the first exon of unc-25 (EASTMAN et al. 1999; JIN et al. 366 1999). We sought to locate the cis-regulatory elements for the previously unrecognized 367 RIB neurons, as well as for the expression in the male-specific nervous system (Fig.7). 368 We found that 600bp upstream of unc-25 locus revealed expression exclusively in the 369 RME motor neurons and the D-type motor neurons (unc-25prom2, Fig.7), consistent 370 with this element containing a previously characterized binding site for the unc-30

371 transcription factor, a terminal selector that controls unc-25 expression (EASTMAN et al. 372 1999). However, this element did not show expression in other GABAergic neuron 373 classes in the sex-shared or sex-specific nervous system. We rather found those 374 elements to reside with an intron of the unc-25 locus. Specifically, we found that 300bp 375 (3intron1) revealed expression in RIS, AVL, DVB, as well as the newly identified 376 GABAergic RIB neuron and the GABAergic neuron classes of the male-specific nervous 377 system (Fig.7A3, A4). Similar to the upstream 600bp element (prom3), the intronic 378 300bp element also drives expression in the RME neurons. Further deletion of this 379 300bp element (3intron2) eliminated reporter gene expression in the DVB motor neuron,

13 380 but retained expression in the RME neuron class (Fig.7A1).

381 A previous analysis of the unc-47/VGAT locus revealed that its cis-regulatory 382 elements also reside in a few hundred base-pairs upstream of the start of the gene

383 (EASTMAN et al. 1999). We corroborated these findings with a smaller 300bp element 384 (prom1) that also reveals expression in the RIB neurons, which were not identified in the

385 original unc-47 studies as GABAergic (MCINTIRE et al. 1993; EASTMAN et al. 386 1999)(Fig.7B). Further dissection of these 300bp separated expression in D-type motor 387 neurons from the remainder of the GABAergic neuron classes (and showed no more 388 ectopic reporter expression) (Fig.7B1).

389 The cis-regulatory regions for unc-46 expression in GABA neuron classes were

390 previously found to reside somewhere upstream of the unc-46 start site (SCHUSKE et al. 391 2007). We narrowed down this region to a 246bp element (unc-46prom1), which 392 recapitulates the expression in the entire GABAergic nervous system, including the RIB 393 interneurons (Fig.7C), as well as in a non-GABAergic neuron RIA. As with unc-47, two 394 overlapping sub-fragments again separate expression in D-type motor neurons from the 395 remainder of the GABAergic neuron classes.

396 In conclusion, the unc-30-dependent regulatory element that drives expression in 397 VNC GABAergic (D-type) MNs can be separated easily from expression in other 398 GABAergic neuron classes, but expression in other neurons is harder to separate from 399 one another. This is not surprising for the case of the RME, AVL, RIS and DVB neuron 400 classes, which appear to share several common trans-acting factors for their identity 401 specification, namely the orphan nuclear hormone nhr-67 and the LIM

402 homeobox gene lim-6 (HOBERT et al. 1999; SARIN et al. 2009; GENDREL et al. 2016).

403 Assessment of potentially redundant activity of distinct cis-regulatory 404 modules in the native genome context. In most neurotransmitter pathway genes, we 405 noted some incidences of distinct, non-overlapping cis-regulatory regions driving 406 expression in the same neuron type. For example, in the unc-17 locus, two distinct cis- 407 regulatory elements (unc-17prom4 and unc-17prom8 in Fig.2) produce the exact same 408 expression in 5 pharyngeal neuron classes (I1, M1, M2, M4, MC). At first sight, a 409 sufficiency of two distinct elements to direct expression in these neurons would argue

14 410 for redundancy, i.e. one would predict that in either element in the context of 411 the genomic locus should not result in expression defects because the other element 412 should be able to compensate for the mutated one. To test whether such redundancy 413 does indeed exist, we used CRISPR/Cas9 genome engineering to mutate a highly 414 conserved motif within the unc-17prom8 element (Fig.2A, Fig.8A). This motif, the E3 415 box, is completely conserved in 10 nematode species (Fig.3A). The E3 box 416 (Fig.8A) was engineered in a strain in which the endogenous unc-17 locus had been

417 tagged with mKate2 (ot907 allele (PEREIRA et al. 2019)). We find that this mutation 418 results in an almost fully penetrant complete loss of unc-17/VAChT expression in the 419 MC neuron class and a much milder effect on expression in I1 (Fig.8B,C). This 420 observation argues that at least for expression in the MC neuron, the existence of the 421 regulatory elements present in the unc-17prom4 fragment cannot compensate for 422 disrupting the unc-17prom8 element.

423 The unc-17/VAChT and cha-1/ChT genes share a common first exon and are

424 thought to be jointly regulated (ALFONSO et al. 1994)(Fig.2A). Therefore, unc- 425 17(ot907ot1025) animals are predicted to be unable to synthesize ACh (due to loss of 426 the CHA-1 protein) and unable to package ACh into synaptic vesicles (due to loss of the 427 UNC-17 protein) specifically in the MC neurons. The two MC neurons are the key

428 pacemaker neurons of the pharynx (AVERY AND HORVITZ 1989; RAIZEN et al. 1995;

429 TROJANOWSKI et al. 2014). Hence, we tested whether unc-17(ot907ot1025) animals 430 display pharyngeal pumping defects akin to those observed upon laser ablation or 431 genetic silencing of the MC neurons. We indeed find that unc-17(ot907ot1025) animals, 432 which appear otherwise behaviorally indistinguishable from wild-type animals, indeed 433 pump less (Fig.8C). Reductions in pumping are also observed in animals treated with a 434 nAChR agonists or in animals that lack the AChR subunit EAT-2 that is expressed in

435 MC target tissue (RAIZEN et al. 1995; MCKAY et al. 2004). Our finding of neuron-type 436 specific removal of the unc-17/VAChT therefore provide further evidence for a 437 cholinergic signal from MC to target tissue being essential for pharyngeal behavior.

438

15 439 DISCUSSION

440 Past work has defined the modular nature of neurotransmitter identity 441 specification in C. elegans from two different perspectives: First, distinct terminal 442 selector-type transcription factors were found to be required to specify the same

443 neurotransmitter identity in different neuron types (reviewed in (HOBERT 2016)). In 444 several cases, it was explicitly shown that this specificity is achieved through the binding 445 of the respective transcription factor to defined cis-regulatory elements within

446 neurotransmitter pathway genes (EASTMAN et al. 1999; WENICK AND HOBERT 2004;

447 KRATSIOS et al. 2011). Second, reporter constructs that capture distinct regions of the 448 loci that encode for neurotransmitter pathway genes were shown to be selectively 449 expressed in distinct subsets of ACh, Glu, GABA or monoaminergic neurons,

450 respectively (EASTMAN et al. 1999; WENICK AND HOBERT 2004; FLAMES AND HOBERT 2009;

451 KRATSIOS et al. 2011; SERRANO-SAIZ et al. 2013; LLORET-FERNANDEZ et al. 2018). We 452 have expanded here the analysis of cis-regulatory control regions of neurotransmitter 453 pathway genes, thereby corroborating the largely modular control of neurotransmitter 454 identity (Fig.1B). However, not each neuron type is controlled by an independent, 455 dedicated module, since the same cis-regulatory element can be employed by the same 456 transcription factor in different cell types, likely by acting in the context of distinct 457 combination of transcription factors. Hence, cell-type specific gene expression profiles 458 are controlled by a combination of different types of regulation, as illustrated in Fig.9. 459 We discuss these points in more detail below.

460 This and past analysis clearly shows that distinct, modular regulatory inputs 461 control expression of neurotransmitter pathway systems in distinct neuron types

462 (EASTMAN et al. 1999; JIN et al. 1999; WENICK AND HOBERT 2004; FLAMES AND HOBERT

463 2009; KRATSIOS et al. 2011; SERRANO-SAIZ et al. 2013; STEFANAKIS et al. 2015; LLORET-

464 FERNANDEZ et al. 2018) (illustrated in Fig.9). Apart from simple reporter gene assays, 465 we provide here a striking example of this modularity. We engineered a cis-regulatory 466 allele of the unc-17/VAChT locus in which we mutated a small, phylogenetically 467 conserved motif and find that this not only leads to selective loss of unc-17/VAChT 468 expression in a single neuron class, but also causes the phenotype that one would 469 expect from a dysfunctionality of this neuron.

16 470 Nevertheless, past work, as well as the work presented here, also demonstrates 471 that regulatory elements that drive expression in distinct neuron types can be difficult to 472 disentangle, i.e. even small cis-regulatory elements still drive expression in multiple 473 neuron classes. Does this observation contrast the modular control model and support 474 the singular control model shown in Fig.1B. Theoretically, a more finely grained 475 mutational analysis may still separate distinct elements for distinct cell types. However, 476 there are already clear cases where we know that the same transcription factor acts in 477 distinct cells to specify, for example, their cholinergic identity, seemingly in support of 478 the singular model. However, in many, if not all cases, this singular transcription factor 479 may act in combination with different sets of cell-type specific partner 480 (illustrated in Fig.9, compare neuron #1 and #4). For example, the COE-type 481 transcription factor unc-3 controls unc-17/VAChT and cho-1/ChT expression in VNC 482 MNs and their presynaptic command interneurons, which are all cholinergic and

483 specified by unc-3 (KRATSIOS et al. 2011; PEREIRA et al. 2015). Mutation of the unc-3 484 binding site leads to loss of expression in all these neuron classes. Similarly, a 485 regulatory element from the unc-17/VAChT locus shows expression in the IL2, URA and 486 URB neuron classes, all of which require unc-86 for their proper neurotransmitter

487 identity acquisition (ZHANG et al. 2014). Likewise, a regulatory element from the eat-4 488 locus expressed in the AIM and PHC neuron classes, both of which require unc-86 for 489 their glutamatergic identity specification. Cis-regulatory elements that control expression 490 of GABA pathway genes in the RIS and AVL neuron classes (both controlled by the nhr- 491 67 and lim-6 transcription factors) can also not be easily separated, while cis-regulatory 492 elements for GABAergic VNC MNs (which are dependent on the unc-30 transcription 493 factor) can be separated from RIS and AVL. These examples demonstrate that the 494 modularity of cis-regulatory control does not need to manifest itself by the existence of 495 distinct, completely separable cis-regulatory elements that drive expression in distinct 496 neuron types. Rather, the same cis-regulatory element and its cognate transcription 497 factor may be deployed in different cell types because transcription factors operate in 498 distinct, cell type-specific combinations.

499 Enhancers that are employed to drive expression in different cell types have

500 been termed “pleiotropic enhancers” (SABARIS et al. 2019). Such pleiotropy has been

17 501 mostly inferred from genome-wide binding profiles of modification marks that 502 indicate (but do not prove) enhancer activity on a relatively coarse level. Specifically, 503 marks of enhancer activity can be found in the same location in distinct tissue types

504 (MCKAY AND LIEB 2013). The underlying mechanistic basis for enhancer pleiotropy is 505 often not clear. The cis-regulatory analysis conducted in this and previous papers

506 (KRATSIOS et al. 2011; SERRANO-SAIZ et al. 2013; SERRANO-SAIZ et al. 2017a) not only 507 supports the concept of enhancer pleiotropy in an orthogonal and more granular 508 manner, but also strongly hints at its mechanistic basis. Different neuron types utilize 509 the same cis-regulatory element because this regulatory element contains binding for 510 the same trans-acting factor (Fig.9; compare neuron #1 and #4).

511 While the unc-3 case provides the clearest evidence for the same cis-regulatory 512 motif to be employed in different neuron types, our analysis also reveals that a 513 transcription factor required for acquisition of the same neurotransmitter identity in 514 distinct neuron types does not necessarily always act through the same cis-regulatory 515 element. For example, UNC-86 operates in more than a dozen different neuron types to

516 specify neurotransmitter identity (SZE et al. 2002; SERRANO-SAIZ et al. 2013; ZHANG et

517 al. 2014; PEREIRA et al. 2015). It appears that UNC-86 functions through separable cis- 518 regulatory elements in the unc-17 locus to enable cholinergic gene expression in 519 different cell types. These distinctive types of engagement of UNC-86 with its target 520 promoters is a likely reflection of distinct molecular partnerships that UNC-86 may 521 engage in (Fig.9, compare neuron #1 and #3).

522 We noted in several neurotransmitter pathway genes the existence of separable 523 elements that drive expression in the same neuron types. For example, RME motor 524 neuron expression is controlled by separable cis-regulatory elements in the unc-25/GAD 525 locus. This is akin to the “shadow enhancers” that have been observed to drive

526 expression of developmental control genes in Drosophila (HONG et al. 2008; LAGHA et 527 al. 2012). One immediate question that arises from such a constellation is whether this 528 is an indication of true enhancer redundancies. Complete redundancy would predict that 529 deletion of one of these elements would result in no expression defect because of the 530 existence of the other element. Alternatively, the sufficiency of isolated elements to 531 drive expression could be a result of the multicopy nature of concatamerized C. elegans

18 532 transgenes (MELLO et al. 1991). In the natural genomic context, several separable cis- 533 regulatory elements may need to cooperate to turn on expression in a specific cell type. 534 We tested this notion here explicitly through mutating a small conserved motif that is 535 located in one of two separate cis-regulatory elements of the unc-17 locus, which each 536 driving expression in the same set of pharyngeal neurons. We find completely penetrant 537 defects in unc-17 expression in a single neuron class, thereby indicating that in their 538 endogenous context a cis-regulatory element may be required even if another element 539 produces, in the context of a multicopy array, the same pattern of expression. This 540 finding is not without precedent. We had previously defined two separate cis-regulatory 541 elements in the cog-1 homeobox gene locus that are alone sufficient to drive expression

542 in the ASE neuron (O'MEARA et al. 2009). Nevertheless, we isolated a cis-regulatory 543 allele of cog-1 from a mutant screen in which only one of those two elements is 544 mutated. These examples illustrate an important limitation for the interpretation of 545 reporter gene analysis: The expression of isolated elements may demonstrate that an 546 element is sufficient to produce an expression pattern, but they do not prove that such 547 an element is required in its normal genomic context.

548 Pioneering work by Eric Davidson and colleagues in the sea urchin embryo 549 revealed that “nuts and bolt” differentiation genes, i.e. genes that encode not for 550 regulatory factors, but rather “function” proteins that endow a cell with specific 551 anatomical or functional properties, are largely controlled by positive regulatory inputs,

552 i.e. by transcriptional activators (DAVIDSON 2006). Our cis-regulatory analysis of 553 neurotransmitter pathway genes provides strong support for this notion. If the 554 expression patterns of neurotransmitter pathway genes were frequently shaped by 555 repressive mechanisms, our successive deletion of regulatory elements should have 556 revealed ectopic expression of reporter genes. While we observed such derepression in 557 some rare cases, repression does not appear to be a major mechanism in shaping the 558 expression of neurotransmitter pathway genes. The cis-regulatory analysis of other

559 neuronal differentiation genes, undertaken by our lab over the years (WENICK AND

560 HOBERT 2004; ETCHBERGER et al. 2007; ETCHBERGER et al. 2009; KRATSIOS et al. 2011;

561 DOITSIDOU et al. 2013; ZHANG et al. 2014; SERRANO-SAIZ et al. 2018) also provides only 562 limited evidence in favor of repression being an important mechanism to shape neuron-

19 563 type specific gene expression profiles. This is not to say that transcriptional repression 564 plays no role at all in shaping terminal differentiation program. Several in depth studies 565 have revealed that transcriptional repression is used in a number of distinct cellular 566 contexts (sensory neurons, motor neurons) to establish subtle, but important molecular 567 difference between members of the same neuron class (e.g. differences between left

568 versus right ASE neuron (ETCHBERGER et al. 2009); or differences between different

569 ventral nerve cord motor neuron classes (KERK et al. 2017)) .

570 Lastly, the dissection of cis-regulatory elements of genes that define specific 571 phenotypic traits of distinct neuron types (i.e. neurotransmitter identity) highlights the 572 underlying mechanistic basis of “phenotypic convergence”. In evolutionary theory, 573 phenotypic convergence refers to independent of similar traits in different

574 (ROSENBLUM et al. 2014). This terminology has recently been applied to the

575 acquisition of the phenotypic trait of neurotransmitter identity (KONSTANTINIDES et al. 576 2018). In both C. elegans and Drosophila, distinct transcription factors control the same

577 phenotypic trait in distinct neuron types (SERRANO-SAIZ et al. 2013; KONSTANTINIDES et

578 al. 2018; KRATSIOS AND HOBERT 2018). Cis-regulatory analyses like the ones described 579 here demonstrate that it is the cis-regulatory control regions of neurotransmitter 580 pathway genes that integrate distinct regulatory inputs to turn on a common 581 neurotransmitter trait in distinct neuron types.

582

20 583 ACKNOWLEDGEMENT. We thank Chi Chen for expert assistance in generating 584 transgenic strains, Kelsey Roberts for generating constructs, Surojit Sural for advice on 585 pumping assays and Steven Cook for help with statistics. E.S-S has been supported by 586 the Ramon y Cajal program and M.G. was supported by the European Molecular 587 Biology Organization and Human Frontier Science Program postdoctoral fellowships.

588

589

590

591

21 592 FIGURE LEGENDS

593

594 Fig.1: Introduction to the problem of neurotransmitter identity specification. 595 A: Neurotransmitter pathway genes. In this paper, fluorescence reporters were 596 generated for cis-regulatory elements contained in the regulatory regions of the genes 597 highlighted in boxes. 598 B: Conceptual models for how the shared phenotypic trait of neurotransmitter identity 599 could be regulated in different neuron types. Previous studies as well as this paper 600 corroborate the modular control model. However, as we will show here in the paper, 601 there are cases where the same transcription factor (TF) is used in multiple different 602 neuron types (seemingly in support of the singular model), but this is usually because a 603 given TF usually operates in distinct combinations in different cell types, as discussed in 604 the text. For simplicity, these two models just focus on the question whether a specific 605 neurotransmitter identity is controlled by single TFs. 606 C: Neurotransmitter atlas of the C. elegans adult, hermaphrodite head and tail. Adapted

607 from (GENDREL et al. 2016). 608 609 610 Fig.2: Analysis of cis-regulatory elements of the unc-17/cha-1 locus. 611 A: Overview of the unc-17/cha-1 locus, reporter constructs and summary of sites of 612 expression. The constructs are referred to the unc-17/cha-1 locus to scale. Expression

613 of the fosmid reporter is from (PEREIRA et al. 2015). unc-17prom4, prom5 were shown in

614 previous papers (WENICK AND HOBERT 2004; KRATSIOS et al. 2011) but expressing cells 615 were not identified. Within the matrix grey color indicates expression, “X” indicates no 616 expression while white color indicates not determined. 617 B: Representative images of several constructs in the adult head (lateral views). In the 618 images the unc-17 reporters are shown in red and the cho-1 fosmid (otIs354) is also 619 shown in green as a landmark. Cells in the unc-17prom1 are not labeled because there is 620 a complete co-localization with cho-1 fosmid. Usually 2 or 3 lines were scored for each 621 reporter and >10 animals/line. Scale bar = 50 m.

22 622 C: Analysis of an unc-3 responsive element from the unc-17 locus required for 623 expression in RVG, VNC and PAG. This small 250bp element is also expressed in a 624 few unidentified head neurons, but we only consider here the expression in the RVG, 625 VNC and PAG, which has previously been shown to require the indicated COE motif

626 and unc-3 (KRATSIOS et al. 2011). 627 628 629 Fig.3: Identification of conserved motifs upstream of the cholinergic locus (unc- 630 17/cho-1) of 10 nematode species. A: Overview of the unc-17/cho-1 loci of 10 631 nematode species. There is a high conservation of the cholinergic locus among 632 nematode species. The data was obtained from wormbase.org. B: Conserved 633 sequenced motifs within the cholinergic locus. The small element designated E6 = “AIY 634 motif” (TTX-3/CEH-10 binding site) has been functionally validated in C. elegans to be

635 required to drive unc-17 expression in AIY by (WENICK AND HOBERT 2004). The element 636 designated E7 contains the consensus binding site for UNC-86 transcription factor. The 637 combined deletion of E6 and E7 (designated unc-17prom8_B30 in Fig.2) results in the 638 loss of expression of the reporter in AIY and RIH. The red box in E3 indicates the 639 nucleotides that were replaced in ot1025 (Fig.8). The red box in F2 indicates the 640 deleted nucleotides in unc-17prom9_C39 (Fig.2). 641 642 643 Fig.4: Analysis of cis-regulatory elements of the cho-1 locus. 644 A: Overview of the cho-1 locus. Schematic of the cho-1 reporters and their expression 645 in hermaphrodite. The constructs are referred to the cho-1 locus to scale. otIs354

646 (fosmid reporter) and otEx6066 reporters were shown in previous papers (PEREIRA et al.

647 2015; STEFANAKIS et al. 2015) but expressing cells were not identified. Within the matrix 648 grey color indicates expression, “X” indicates no expression while white color indicates 649 not determined. 650 B: Representative images of several constructs in the adult head (lateral views). In the 651 images the cho-1 reporters are shown in red and the cho-1 fosmid (otIs354) is also 652 shown in green as a landmark. Usually 2 or 3 lines were scored for each reporter and

23 653 >10 animals/line. Scale bar = 50 m 654 655 656 Fig.5: Analysis of cis-regulatory elements of the ace-2 locus. 657 A: Overview of the ace-2 locus. Schematic of ace-2 reporters and their expression in 658 hermaphrodites. The constructs are referred to the ace-2 locus to scale. B: In the 659 images two ace-2 reporters are shown in red and the cho-1 fosmid (otIs354) is also 660 shown in green as a landmark. Expression of ace-2 fosmid was characterized in

661 (KRATSIOS et al. 2011)(PEREIRA et al. 2015). Constructs prom1 and prom2 were

662 described in (KRATSIOS et al. 2011), but cell identification in head and tail neurons was 663 not performed. Within the matrix, grey color indicates expression, “X” indicates no 664 expression while white color indicates not determined. Some representative images of 665 two reporter constructs are shown (also containing a cho-1 fosmid landmark reporter). 666 Usually 2 or 3 lines were scored for each reporter and >10 animals/line. Scale bar = 50 667 m 668 669 670 Fig.6: Analysis of cis-regulatory elements of the eat-4 locus. 671 A: Overview of the eat-4 locus. Schematic of eat-4 gfp reporter constructs and their 672 expression in hermaphrodite and male nervous systems. The constructs are referred to 673 the eat-4 locus to scale. Grey boxed reporters have not been published before, while

674 the information for the other constructs had been partially shown in (SERRANO-SAIZ et al. 675 2013), but only in the context of the hermaphrodite and not in the male nervous system 676 (Suppl. Table S1). The red cross indicates the position of the deletion in each 677 construct. Within the matrix, the grey intensity of each box indicates gfp relative 678 intensity; white boxes indicate no expression. (*) CP1-6 expression was not determined. 679 B: Representative images of several constructs in the adult male tail. In the images for 680 eat-4prom2 and eat-4prom27 the eat-4 fosmid (otIs518) is also shown in red as a landmark. 681 Usually 2 or 3 lines were scored for each reporter and >10 animals/line. Scale bar = 50 682 m 683

24 684 685 Fig.7: Analysis of cis-regulatory elements of GABA pathway genes 686 Analysis of the unc-25/GAD (panel A1), unc-47/VGAT (panel B1) and unc-46/LAMPG 687 (panel C1) loci. 688 A1, B1, C1: Schematic of the unc-47, unc-25 and unc-46 GABA pathway genes. The 689 constructs are referred to their respective loci to scale. Dark grey boxes indicate than in 690 at least two lines, more than 50% of the worms show gfp expression in the cell. Dashed 691 lined color indicates that, in at least two lines, more than 50% of the worms are having 692 gfp expression in at least 1 RME (if there are no dashed lines, gfp expression is always 693 present in the 4 RMEs). The black cross indicates the position of the deletion in unc- 694 25prom3del1 (TAATC). The red line indicates the position of the UNC-30 binding site 695 sequence (TAATCC). Note that there is no TAATCC sequence in the unc-46 promoter. 696 A2, A3, A4, B2 and C2: Representative confocal images adult animals for unc-25prom1 697 head (A2), unc-25_3intron1 [head (A3) and tail(A4)], unc-47prom1 head (B2) and unc- 698 46prom1 head (C2). Usually 2 or 3 lines were scored for each reporter and >10 699 animals/line. Scale bar = 50 m 700 701 702 Fig.8: Cis-regulatory alleles of unc-17, generated by CRISPR/Cas9 genome 703 engineering 704 A: Box E3 is the consensus sequence of multiple Caenorhabditis species as shown in 705 Fig.3. The E3 box was replaced with a PmeI site (ot1025 allele) in the background of 706 the ot907 allele, which designates the mKate2 tag on the unc-17 locus. 707 B: Scoring of fluorescent reporter expression of unc-17(ot907) and unc- 708 17(ot907ot1025) animals in cholinergic pharyngeal neurons. The E3 box deletion affects 709 unc-17 expression in the bilaterally symmetric MC neuron class and, mildly, the I1 710 neuron class. The fraction of neurons that show the indicated phenotype is indicated in 711 the bar chart. Representative images are shown below. Animals were scored at the L4 712 or young adult stage, n50. 713 C: Pumping rates in worms of different were compared at L1 stage and adult 714 stage. Three different cohorts of worm were tested on different days (each dot color

25 715 represent a worm from a different assay day). The ANOVA test was used to make a 716 model to fit a linear mixed-effects model (LMM) to the obtained data. Then, pairwise- 717 comparisons of genotypes were done with Tukey correction. Statistical significance 718 indicated for L1 comparison, P<2e-16 and for adult comparison, P=6e-14. 719 720 721 Fig.9: Illustrations of organization of cis-regulatory elements for neuron type 722 specific gene expression. A neurotransmitter pathway gene (NT) is controlled by 723 different sets of transcription factors (A, B, C, D, E, F) in 4 different neuron classes. 724 Reporter gene analysis (“promoter bashing”) has revealed distinct cis-regulatory 725 elements (CRE) in the NT locus that drive expression in 4 different neuron classes. The 726 indicated transcription factors act through specific binding sites (black boxes) within a 727 given CRE. The following scenarios are illustrated: Comparing neuron class #1 and 728 class #2 illustrates the simplest scenario of modularity where different combinations of 729 neuron type specific transcription factors (A+B or C+D) control NT expression in 730 different neuron classes. Comparing neuron class #1 and #3 illustrates that the same 731 transcription factor A can be used in different neuron types, but it operates through 732 distinct, separable cis-regulatory elements that each contain a cognate binding site for 733 this transcription factor. One example for this are the two distinct elements from the unc- 734 17/VAChT locus that drive expression in distinct sets of UNC-86-depending neurons 735 (RIH vs. IL2/URA/URB) with each of these elements contain validated UNC-86 binding 736 sites (see Results). Comparing neuron class #1 and #4 illustrate a cis-regulatory 737 element which drives expression in multiple distinct neuron classes and for both neuron 738 classes the same transcription factor binding site and the same cognate transcription 739 factor are required (e.g. unc-17 element driving UNC-86-dependent expression in IL2, 740 URA, URB). In such a case UNC-86 may interact with different transcription factors in 741 neuron #1 vs. neuron #4, but it will not be possible to generate reporters that separate 742 expression in neuron #1 and neuron #4. Hence, the modularity of cis-regulatory control 743 is not manifested by distinct separable elements, but by distinct transcription factor 744 binding sites that are used in a modular and combinatorial manner. 745

26 746 REFERENCES 747 748 Alfonso, A., K. Grundahl, J. S. Duerr, H. P. Han and J. B. Rand, 1993 The 749 Caenorhabditis elegans unc-17 gene: a putative vesicular acetylcholine 750 transporter. Science 261: 617-619. 751 Alfonso, A., K. Grundahl, J. R. McManus, J. M. Asbury and J. B. Rand, 1994 Alternative 752 splicing leads to two cholinergic proteins in Caenorhabditis elegans. J Mol Biol 753 241: 627-630. 754 Avery, L., and H. R. Horvitz, 1989 Pharyngeal pumping continues after laser killing of 755 the pharyngeal nervous system of C. elegans. Neuron 3: 473-485. 756 Brown, J. B., and S. E. Celniker, 2015 Lessons from modENCODE. Annu Rev 757 Genomics Hum Genet 16: 31-53. 758 Culotti, J. G., G. Von Ehrenstein, M. R. Culotti and R. L. Russell, 1981 A second class 759 of acetylcholinesterase-deficient mutants of the nematode Caenorhabditis 760 elegans. Genetics 97: 281-305. 761 Davidson, E. H., 2006 The Regulatory Genome. Academic Press. 762 Doitsidou, M., N. Flames, I. Topalidou, N. Abe, T. Felton et al., 2013 A combinatorial 763 regulatory signature controls terminal differentiation of the dopaminergic nervous 764 system in C. elegans. Genes Dev 27: 1391-1405. 765 Dokshin, G. A., K. S. Ghanta, K. M. Piscopo and C. C. Mello, 2018 Robust Genome 766 Editing with Short Single-Stranded and Long, Partially Single-Stranded DNA 767 Donors in Caenorhabditis elegans. Genetics 210: 781-787. 768 Doolittle, W. F., 2013 Is junk DNA bunk? A critique of ENCODE. Proc Natl Acad Sci U S 769 A 110: 5294-5300. 770 Eastman, C., H. R. Horvitz and Y. Jin, 1999 Coordinated transcriptional regulation of the 771 unc-25 glutamic acid decarboxylase and the unc-47 GABA vesicular transporter 772 by the Caenorhabditis elegans UNC-30 homeodomain protein. J Neurosci 19: 773 6225-6234. 774 Ecker, J. R., W. A. Bickmore, I. Barroso, J. K. Pritchard, Y. Gilad et al., 2012 Genomics: 775 ENCODE explained. Nature 489: 52-55. 776 Eddy, S. R., 2013 The ENCODE project: missteps overshadowing a success. Curr Biol 777 23: R259-261. 778 Eiden, L. E., 1998 The cholinergic gene locus. J Neurochem 70: 2227-2240. 779 Etchberger, J. F., E. B. Flowers, R. J. Poole, E. Bashllari and O. Hobert, 2009 Cis- 780 regulatory mechanisms of left/right asymmetric neuron-subtype specification in 781 C. elegans. Development 136: 147-160. 782 Etchberger, J. F., A. Lorch, M. C. Sleumer, R. Zapf, S. J. Jones et al., 2007 The 783 molecular signature and cis-regulatory architecture of a C. elegans gustatory 784 neuron. Genes Dev 21: 1653-1674. 785 Flames, N., and O. Hobert, 2009 Gene regulatory logic of dopamine neuron 786 differentiation. Nature 458: 885-889. 787 Gendrel, M., E. G. Atlas and O. Hobert, 2016 A cellular and regulatory map of the 788 GABAergic nervous system of C. elegans. Elife 5: e17686 789 Hobert, O., 2002 PCR fusion-based approach to create reporter gene constructs for 790 expression analysis in transgenic C. elegans. Biotechniques 32: 728-730.

27 791 Hobert, O., 2016 A map of terminal regulators of neuronal identity in Caenorhabditis 792 elegans. Wiley Interdiscip Rev Dev Biol 5: 474-498. 793 Hobert, O., K. Tessmar and G. Ruvkun, 1999 The Caenorhabditis elegans lim-6 LIM 794 homeobox gene regulates neurite outgrowth and function of particular 795 GABAergic neurons. Development 126: 1547-1562. 796 Hong, J. W., D. A. Hendrix and M. S. Levine, 2008 Shadow enhancers as a source of 797 evolutionary novelty. Science 321: 1314. 798 Jin, Y., R. Hoskins and H. R. Horvitz, 1994 Control of type-D GABAergic neuron 799 differentiation by C. elegans UNC-30 homeodomain protein. Nature 372: 780- 800 783. 801 Jin, Y., E. Jorgensen, E. Hartwieg and H. R. Horvitz, 1999 The Caenorhabditis elegans 802 gene unc-25 encodes glutamic acid decarboxylase and is required for synaptic 803 transmission but not synaptic development. J Neurosci 19: 539-548. 804 Kerk, S. Y., P. Kratsios, M. Hart, R. Mourao and O. Hobert, 2017 Diversification of C. 805 elegans Motor Neuron Identity via Selective Effector Gene Repression. Neuron 806 93: 80-98. 807 Konstantinides, N., K. Kapuralin, C. Fadil, L. Barboza, R. Satija et al., 2018 Phenotypic 808 Convergence: Distinct Transcription Factors Regulate Common Terminal 809 Features. Cell 174: 622-635. 810 Kratsios, P., and O. Hobert, 2018 Nervous System Development: Flies and Worms 811 Converging on Neuron Identity Control. Curr Biol 28: R1154-R1157. 812 Kratsios, P., A. Stolfi, M. Levine and O. Hobert, 2011 Coordinated regulation of 813 cholinergic motor neuron traits through a conserved terminal selector gene. Nat 814 Neurosci 15: 205-214. 815 Lagha, M., J. P. Bothma and M. Levine, 2012 Mechanisms of transcriptional precision in 816 animal development. Trends Genet 28: 409-416. 817 Lee, R. Y., E. R. Sawin, M. Chalfie, H. R. Horvitz and L. Avery, 1999 EAT-4, a homolog 818 of a mammalian sodium-dependent inorganic phosphate cotransporter, is 819 necessary for glutamatergic neurotransmission in caenorhabditis elegans. J 820 Neurosci 19: 159-167. 821 Lloret-Fernandez, C., M. Maicas, C. Mora-Martinez, A. Artacho, A. Jimeno-Martin et al., 822 2018 A transcription factor collective defines the HSN serotonergic neuron 823 regulatory landscape. Elife 7: e32785 824 McIntire, S. L., E. Jorgensen and H. R. Horvitz, 1993 Genes required for GABA function 825 in Caenorhabditis elegans. Nature 364: 334-337. 826 McIntire, S. L., R. J. Reimer, K. Schuske, R. H. Edwards and E. M. Jorgensen, 1997 827 Identification and characterization of the vesicular GABA transporter. Nature 389: 828 870-876. 829 McKay, D. J., and J. D. Lieb, 2013 A common set of DNA regulatory elements shapes 830 Drosophila appendages. Dev Cell 27: 306-318. 831 McKay, J. P., D. M. Raizen, A. Gottschalk, W. R. Schafer and L. Avery, 2004 eat-2 and 832 eat-18 are required for nicotinic neurotransmission in the Caenorhabditis elegans 833 pharynx. Genetics 166: 161-169. 834 Mello, C. C., J. M. Kramer, D. Stinchcomb and V. Ambros, 1991 Efficient gene transfer 835 in C.elegans: extrachromosomal maintenance and integration of transforming 836 sequences. Embo J 10: 3959-3970.

28 837 Mullen, G. P., E. A. Mathews, M. H. Vu, J. W. Hunter, D. L. Frisby et al., 2007 Choline 838 transport and de novo choline synthesis support acetylcholine biosynthesis in 839 Caenorhabditis elegans cholinergic neurons. Genetics 177: 195-204. 840 O'Meara, M. M., H. Bigelow, S. Flibotte, J. F. Etchberger, D. G. Moerman et al., 2009 841 Cis-regulatory Mutations in the Caenorhabditis elegans Homeobox Gene Locus 842 cog-1 Affect Neuronal Development. Genetics 181: 1679–1686. 843 Okuda, T., T. Haga, Y. Kanai, H. Endou, T. Ishihara et al., 2000 Identification and 844 characterization of the high-affinity choline transporter. Nat Neurosci 3: 120-125. 845 Pereira, L., F. Aeschimann, C. Wang, H. Lawson, E. Serrano-Saiz et al., 2019 Timing 846 mechanism of sexually dimorphic nervous system differentiation. Elife 8: e42078 847 Pereira, L., P. Kratsios, E. Serrano-Saiz, H. Sheftel, A. E. Mayo et al., 2015 A cellular 848 and regulatory map of the cholinergic nervous system of C. elegans. Elife 4: 849 e12432 850 Prasad, B. C., B. Ye, R. Zackhary, K. Schrader, G. Seydoux et al., 1998 unc-3, a gene 851 required for axonal guidance in Caenorhabditis elegans, encodes a member of 852 the O/E family of transcription factors. Development 125: 1561-1568. 853 Raizen, D. M., R. Y. Lee and L. Avery, 1995 Interacting genes required for pharyngeal 854 excitation by motor neuron MC in Caenorhabditis elegans. Genetics 141: 1365- 855 1382. 856 Rosenblum, E. B., C. E. Parent and E. E. Brandt, 2014 The Molecular Basis of 857 Phenotypic Convergence. Annual Review of , Evolution, and Systematics 858 45: 203-226. 859 Sabaris, G., I. Laiker, E. Preger-Ben Noon and N. Frankel, 2019 Actors with Multiple 860 Roles: Pleiotropic Enhancers and the Paradigm of Enhancer Modularity. Trends 861 Genet 35: 423-433. 862 Sarin, S., C. Antonio, B. Tursun and O. Hobert, 2009 The C. elegans Tailless/TLX 863 transcription factor nhr-67 controls neuronal identity and left/right asymmetric fate 864 diversification. Development 136: 2933-2944. 865 Schuske, K., M. T. Palfreyman, S. Watanabe and E. M. Jorgensen, 2007 UNC-46 is 866 required for trafficking of the vesicular GABA transporter. Nat Neurosci 10: 846- 867 853. 868 Serrano-Saiz, E., E. Leyva-Diaz, E. De La Cruz and O. Hobert, 2018 BRN3-type POU 869 Homeobox Genes Maintain the Identity of Mature Postmitotic Neurons in 870 Nematodes and Mice. Curr Biol 28: 2813-2823 871 Serrano-Saiz, E., M. Oren-Suissa, E. A. Bayer and O. Hobert, 2017a Sexually 872 Dimorphic Differentiation of a C. elegans Hub Neuron Is Cell Autonomously 873 Controlled by a Conserved Transcription Factor. Curr Biol 27: 199-209. 874 Serrano-Saiz, E., L. Pereira, M. Gendrel, U. Aghayeva, A. Battacharya et al., 2017b A 875 Neurotransmitter Atlas of the Caenorhabditis elegans Male Nervous System 876 Reveals Sexually Dimorphic Neurotransmitter Usage. Genetics 206: 1251-1269. 877 Serrano-Saiz, E., R. J. Poole, T. Felton, F. Zhang, E. D. de la Cruz et al., 2013 Modular 878 Control of Glutamatergic Neuronal Identity in C. elegans by Distinct 879 Homeodomain Proteins. Cell 155: 659-673. 880 Stefanakis, N., I. Carrera and O. Hobert, 2015 Regulatory Logic of Pan-Neuronal Gene 881 Expression in C. elegans. Neuron 87: 733-750.

29 882 Sze, J. Y., S. Zhang, J. Li and G. Ruvkun, 2002 The C. elegans POU-domain 883 transcription factor UNC-86 regulates the tph-1 tryptophan hydroxylase gene and 884 neurite outgrowth in specific serotonergic neurons. Development 129: 3901- 885 3911. 886 Treiber, T., E. M. Mandel, S. Pott, I. Gyory, S. Firner et al., 2010 Early B cell factor 1 887 regulates B cell gene networks by activation, repression, and transcription- 888 independent poising of chromatin. Immunity 32: 714-725. 889 Trojanowski, N. F., O. Padovan-Merhar, D. M. Raizen and C. Fang-Yen, 2014 Neural 890 and genetic underlies Caenorhabditis elegans feeding behavior. 891 Journal of neurophysiology 112: 951-961. 892 Tursun, B., L. Cochella, I. Carrera and O. Hobert, 2009 A toolkit and robust pipeline for 893 the generation of fosmid-based reporter genes in C. elegans. PLoS ONE 4: 894 e4625. 895 Wenick, A. S., and O. Hobert, 2004 Genomic cis-Regulatory Architecture and trans- 896 Acting Regulators of a Single Interneuron-Specific Gene Battery in C. elegans. 897 Dev Cell 6: 757-770. 898 Yanez-Cuna, J. O., E. Z. Kvon and A. Stark, 2013 Deciphering the transcriptional cis- 899 regulatory code. Trends Genet 29: 11-22. 900 Zhang, F., A. Bhattacharya, J. C. Nelson, N. Abe, P. Gordon et al., 2014 The LIM and 901 POU homeobox genes ttx-3 and unc-86 act as terminal selectors in distinct 902 cholinergic and serotonergic neuron types. Development 141: 422-435. 903

30 A Cholinergic neurons GABAergic neurons Glutamatergic neurons

Ch GAD Glu ChAT unc-25 cha-1 Glu ACh VGAT GABA LAMP other cells VAChT unc-47 unc-46 VGluT unc-17 eat-4 GABA-T gta-1 CHT1 cho-1 GAT snf-11

AChE GluT ace-2 glt-1 to glt-6 B Glu neuron type #1 Glu neuron type #2 Glu neuron type #3

Model #1 TF TF TF Singular control vglut vglut vglut cis cis cis

Model #2: TF TF TF Modular control vglut vglut vglut cis cis cis

C HMC

RID ALA ASIL RMED ADLL RIVL GLRDL ASKL IL1DL OLQDL CEPDLURXL IL2DL ASGL AVHL URADL AVJL URYDL SMDVL MI I3 ADFL AWBL AWAL I4 SAAVLRIAL M5 OLLL AVBL RIPL RMEL RMDVL ASEL AINL M1 IL2L URBL AVAL AFDL I6 M4 ASHL AVDL AVEL AIBL RIBL IL1L BAGL I1L RMDL RIML M2L MCL M3L SIBDL AWCL AQRL ADEL NSML URYVL AUAL RMGL I2L OLQVL SIBVL AIZL FLPL ADAL RMDDL RMHL ASJL RMDDL AIAL RICL GLRL RMFL I5 URAVL SAADL SMDDL AVL AIYL IL2VL CEPV GLRVL RIH SIADL IL1VL RMEV SIAVL AIML RIS AVKL SMBVL SMBDL DB2 SABD AVG RIGR SABVL RIFR DA1 VD1 SABVR VB1 VA1 DB1 VB2 AVFR AVFL RIFL RIGL DD1 AS1 VD2

Glu tyramine

ACh octopamine ALNL DVB DVA DVC PVR dopamine PVNL GABA PVCL

5HT PHCL PLML PVQL LUAL PVWL orphan PHAL PLNL PQR PHBL PDA VD13 DA9 DD6 PDB DA8

PVPR VA12 VD12 PVPL AS11 PVT

Figure 1 Shared exon Expression cha-1 A No Expression ATG unc-17 Not determined

exon 1 exon 2 Dim expression

Mid- unc-17 Pharynx Head RVG and VNC body PAG Tail Male

ATG PVU ADF L/R AIY L/R AVA L/R AS11 ALN L/R PDC PHD x2 PVS R1-6A AIM L/R AIN L/R ASJ L/R AIA L/R AVB L/R AVD L/R AVE L/R AWB L/R PVP L/R AS1-10 AVG DA8-9 PLN L/R PVC L/R PVN L/R RIH URA L/R URB L/R URX L/R HOB PCB PCC PGA PVV RIR SAA x4 RIF L/R VA1-11 VB1-11 VA12 CEM x4 CA1-9 RMF L/R RMH L/R SMB x4 SMD x4 HSN L/R SDQ L/R DA1-7 DB1-7 VC1-6 PDA PDB DVA PVX PVY PVZ SPC SPV IL2 x6 RMD x6 SIA x4 RIB L/R I1 L/R I3 M1 M2 L/R M5 MC L/R SABD SABV L/R RIV L/R SIB x4 M4 CDEF Fosmid prom1 -5000bp prom2 -4000bp COE (-4202) prom3 -3000bp G3 prom4 -822 to +15bp (837bp) prom5 -4409 to -1161bp (3249bp) vsIs48 prom6 -4356 to -3355bp (1001bp) prom7 -4000 to -2988bp (1013bp) E3 E6 E7 prom8 -3000 to -2134bp (867bp) prom8_B30 -3000 to -2134bp_B30 F2 prom9 -2133 to -1154bp (980bp) prom9_C39 -2133 to -1154bp_C39 prom10 -6000 to -4999bp (1002bp)

B cho-1fos::sl2::yfp::h2b I3 M4 M1

M2 I1 IL2 MC AIY prom4 prom1 RIH

L L A A P P R R

cho-1fos::sl2::yfp::h2b cho-1fos::sl2::yfp::h2b

IL2 M1 M1 M4 URA M4 MC AWB URB AWB M2 AIY prom8

prom3 MCI1 RIH I1 MC AIY

L L A A P AIY P R R

cho-1fos::sl2::yfp::h2b cho-1fos::sl2::yfp::h2b

RIV URX SAA RMD SIB ADF RMH RIR prom7

RMF prom10 SMD SIA RIH SMB

L L A P A P R R

No. of C Length COE Expression neurons in No. of (bp) motif intensity RVG, VNC, PAG lines 250 +++ ~42±3 3/3 200 ++ ~41±3 3/3 150 +++ ~42±4 3/3 125 +++ ~40±4 3/3 47 30 250_A47_B30 +++ / ++ ~48±3 3/3 250_C72_B30 72 30 ++ / + ~43±7 2/4

Figure 2 A α A B C D E F exon 1 G exon u2 1 2 3 4 1 2 3 4 5 1 2 3 4 5 6 7 8 1 3 4 5 6 7 8 9 12 3 4 5 6 2 3 4 5 6 7 8 12 3 3b 4 1 2 3 C. elegans

R1u R2u R2d 0.5 kb = 250 bp; R = 500 bp; α A B C D E F G,R etc. C. brenneri

α A B C D E F G,R C. briggsae

α A B C D E F C. doughertyi G,R

α A B C D E F G,R C. remanei

C. sinica α A B C D E F G,R

C. tropicalis α A B C D E F G,R

C. wallacei α A B C D E F G,R

C. zanzibari α A B C D E F G,R

C. tribulationis α A B C D E F G,R

B

ot1025

Figure 3 1kb A cho-1

Mid- Pharynx Head RVG and VNC body PAG Tail

COE AIY motif AVA L/R AVB L/R AVD L/R AVE L/R AWB L/R AIN L/R ASJ L/R ADF L/R AIA L/R AIY L/R AS11 AS1-10 ALN L/R DVA RIB L/R RIH URA L/R URB L/R URX L/R SDQ L/R IL2 x6 RIV L/R RMD x6 SIA x4 SIB x4 DA1-7 DB1-7 VC1-6 DA8-9 PDA PDB PLN L/R PVC L/R RMF L/R RMH L/R VA1-11 PVP L/R SMB x4 SMD x4 M4 M5 SABD SABV L/R (-5016) (-2981-2970) cho-1 RIR SAA x4 RIF L/R VB1-11 VA12 otIs354 Fosmid prom1 -5160bp otEx6066 Fosmid_mut-COE prom17 -4948bp prom2 -3700bp prom18 -3221bp prom3 -2700bp prom19 -2528bp prom4 -2500bp prom5 -2000bp prom6 -1500bp prom7 -1000bp prom20 -956bp prom8 -5160 to -4881bp (280bp) prom9 -5100 to -2700bp (2400bp) prom10 -5100 to -3700bp (1400bp) prom12 -3700 to -1700bp (2000bp) prom13 -3700 to -2700bp (1000bp) prom14 -3700 to -3300bp (400bp)

prom15 -3300 to -2600bp (700bp) prom16 -2000 to -700bp (1300bp)

sl2::yfp::h2b 2xnls::tagrfp yfp

Expression No Expression B cho-1fos::sl2::yfp::h2b panneuronal

URX AWB AIN ADF RMDV AIY SMDD prom2

RIH prom14 RMDD SIAD AIA L L SIAV A A P P R R

cho-1fos::sl2::yfp::h2b panneuronal

URX M4 M5 M5 ADF RIR prom16 prom3 RIH

L L A P A P R R

Figure 4 1kb ace-2 A Mid- Pharynx Head RVG and VNC body PAG Tail

COE motifs (-7008-6986) (-5763-5741) ADF L/R AIA L/R AIY L/R AVA L/R AVB L/R AVD L/R AVE L/R AWB L/R SDQ L/R AIN L/R ASJ L/R AS1-10 ALN L/R AS11 DA1-7 DB1-7 VC1-6 DA8-9 PDA PDB PLN L/R PVC L/R RIB L/R RIH RIR RMF L/R RMH L/R SAA x4 URA L/R URB L/R URX L/R RIF L/R VA1-11 VB1-11 VA12 DVA M4 M5 IL2 x6 RIV L/R RMD x6 SMB x4 SMD x4 SABD SABV L/R PVP L/R SIA x4 SIB x4

Fosmid prom1 -7093 to -5351bp (1742bp) prom2 -6281 to -5351bp (932bp) prom3 -7093 to -6913bp (180bp) *

B cho-1fos::sl2::yfp 2xNLS::DsRed2 2xNLS::YFP

Stable expression

Mosaic expression AVB No Expression AVE prom1 Not determined AVA RIH * Ectopic expression L A P R

cho-1fos::sl2::yfp prom2 AVA

L A P R

Figure 5 A eat-4a 1kb

Sex-shared Male specific

ectopic ectopic M5 I2 MI M3 OLL OLQ IL1 URY BAG RIA AFD ASK ADL AIB AWC ASE ASG ASH AUA RIM AIZ AIMI5 AQR FLP ADA RIG AVM ALM PVD DVC PHB LUA PQR PHC PVR PLM PHA PVQ expression PCA PVV HOA R2B R5A R6B R9A CP0,5,6CP7 PVN expression Fosmid reporter (otIs388, otIs518) eat-4a eat-4b prom12 (otIs557) - 1618 prom1 prom2 - 1154 - 951 prom2D11 X - 726 * prom2D12 X -189 * prom2D7 X-98 * D prom2 6 X * prom10 - 326 * prom9 (otIs461,otIs462) - 172 - 2155 prom4 prom5 - 3554 prom13 prom14 (otIs558) - 3177 prom11 (otIs519,otIs520) - 2649 prom6 (otIs392) - 4454 prom7 (otIs487) - 5038 - 4979 prom7D1 X - 4896 prom7D2 X - 4859 prom7D3 X - 4795 prom7D4 X prom15 - 4776 prom8 (otIs521) - 5327

Expression intensity Homeodomain core bs (TAAT) POU homeodomain bs No Expression Not determined * B

PCA R2B HOA PCA R2B HOA R5A PHC R5A CP7 * PVV PVR CP7 * R9A * PLM PHC * PCA L PVV R2B PCA prom14

prom5 R2B prom2 A P R5A R prom2 D 7 PVR 50 mm 50 mm

R2B R5A R9A R2B R5A PVV R9A PLM * PVR PVV PVN PVR HOA PCA

* prom8 L prom10 prom13 A PVV P prom2 D 6 R6B R9A R R5A R2B

Figure 6 A1

2000 2150 -405 -180 unc-25 2300 -1800 -600

RIB RIS AVL DVB RMEs D-type EFs 1/2 EF3 R1A R8A R9B prom1 prom2 prom3 TAATC prom3del1 x prom4

3intron1 UNC-30 site

3intron2

A2 prom1 A3 3intron1 A4 3intron1 (head) (head) R1A (tail) R9B RMED RMED GFP RIBR EF3 EF1,2 R8A RMER RIS RMER DVB R9B R8A RMEV D-type RMEV AVL R1A L L L A P A P A P R R R

B1 C1 unc-47 unc-46 -234 -300

RIB RIS AVL DVB 4 RMEs D-type RIA -200 -150 -100 -50 RIB RIS AVL DVB RMEs D-type -152 -84 prom1 prom1

prom2 prom2 N.D.

prom3 prom3

B2 prom1 (otIs348) C2 prom1 (otIs575) (head) (tail) RIAR RMED RIBR RMED RIBR RMER RMER RIS RIS AVL RMEV RMEV L L AVL D-type A P A P R R D-type

at least 50% of animals express 1 to 4 cell

dimmer

Figure 7 exon 1 A α A B C D E F G exon u2 1 2 3 4 1 2 3 4 5 1 2 3 4 5 6 7 8 1 3 4 5 6 7 8 9 12 3 4 5 6 2 3 4 5 6 7 8 12 3 3b 4 1 2 3

0.5 kb = 250 bp; = 500 bp; E3 etc.

ot1025 GTTTAAAC

M1 M2 M5 I3 M4 I1 MC B 100% 80%

60% ON DIM 40% OFF

expressing mKate2 20% Percentage of neurons

0%

ot1025 ot1025 ot1025 ot1025 ot1025 ot1025 ot1025

no mutation no mutation no mutation no mutation no mutation no mutation C no mutation unc-17(ot907[unc-17::mKate2])

I1 MC

ot907

I1

ot907 ot1025 D L1 adult p<2e-16 p=6e-14 300 250 200 150 pumps per minute 100 50 50 100 150 200 250 0 0

unc-17(ot907) unc-17 unc-17(ot907) unc-17 (ot907ot1025) (ot907ot1025) Figure 8 A B NT neuron class #1 CRE1 CRE2 CRE3

C D neuron class #2 NT

A E neuron class #3 NT

F A neuron class #4 NT

Figure 9