Development 113, 245-255 (1991)
Printed in Great Britain © The Company of Biologists Limited 1991

The Drosophila extramacrochaetae protein antagonizes sequence-specific DNA binding by daughterless/achaete-scute protein complexes

MARK VAN DOREN, HILARY M. ELLIS* and JAMES W. POSAKONY†
Department of Biology and Center for Molecular Genetics, University of California San Diego, La Jolla, CA 92093-0322, USA
* Present address: Department of Biology, Emory University, 1510 Clifton Road, Atlanta, GA 30322, USA
† Corresponding author

Summary

In Drosophila, a group of regulatory proteins of the helix-loop-helix (HLH) class play an essential role in conferring upon cells in the developing adult epidermis the competence to give rise to sensory organs. Proteins encoded by the daughterless (da) gene and three genes of the achaete-scute complex (AS-C) act positively in the determination of the sensory organ precursor fate, while the extramacrochaetae (emc) and hairy (h) gene products act as negative regulators. In the region upstream of the achaete gene of the AS-C, we have identified three 'E box' consensus sequences that are bound specifically in vitro by hetero-oligomeric complexes consisting of the da protein and an AS-C protein. We have used this DNA-binding activity to investigate the biochemical basis of the negative regulatory function of emc. Under the conditions of our experiments, the emc protein, but not the h protein, is able to antagonize specifically the in vitro DNA-binding activity of da/AS-C and putative da/da protein complexes. We interpret these results as follows: the heterodimerization capacity of the emc protein (conferred by its HLH domain) allows it to act in vivo as a competitive inhibitor of the formation of functional DNA-binding protein complexes by the da and AS-C proteins, thereby reducing the effective level of their transcriptional regulatory activity within the cell.

Key words: helix-loop-helix, sensory organ development, peripheral nervous system, negative regulation, protein-protein interaction, DNA binding.

Introduction

The adult peripheral nervous system (PNS) of Drosophila is composed of several types of multicellular sensory organs, or sensilla, that appear on the body surface in a relatively invariant spatial pattern. The precursor cells that give rise to sensory organs develop during the late larval and early pupal stages within undifferentiated epithelial sheets (imaginal discs and histoblast nests) that also give rise to ordinary epidermal cells (see Hartenstein and Posakony, 1989). Thus, a primary event in PNS development is the position-dependent assignment of one of two alternative fates, sensillum precursor or epidermal precursor, to the cells in the developing adult epidermis. A set of genes that we have referred to earlier as the 'achaete-scute interacting group' (ASIG; Ellis et al. 1990) play an essential role in establishing the competence of cells to be sensory organ precursors. The daughterless (da) gene and three of the four genes of the achaete-scute complex (AS-C) [achaete (ac) or T5, scute (sc) or T4, and asense (ase) or T8 (Gonzalez et al. 1989)] act positively to confer the sensory organ precursor cell fate: loss-of-function mutations in da and the AS-C lead to the loss of both larval and adult sensory organ precursors (Caudy et al. 1988a; Cline, 1989; Dambly-Chaudiere and Ghysen, 1987; Garcia-Bellido, 1979; Garcia-Bellido and Santamaria, 1978; Ghysen and O'Kane, 1989; Romani et al. 1989; Ruiz-Gomez and Modolell, 1987). The activities of these genes are negatively regulated by the extramacrochaetae (emc) and hairy (h) genes; loss-of-function mutations in emc and h cause the ectopic appearance of supernumerary sensilla in the adult (Botas et al. 1982; Garcia Alonso and Garcia-Bellido, 1988; Moscoso del Prado and Garcia-Bellido, 1984). The genes of the ASIG interact in a dosage-sensitive manner, such that the phenotype conferred by a mutation in one gene in the group is suppressed or enhanced by changes in the wild-type dosage of another gene in the group (Botas et al. 1982; Dambly-Chaudiere et al. 1988; Moscoso del Prado and Garcia-Bellido, 1984). The discovery that the protein products of all of these genes contain the protein dimerization domain characteristic of the helix-loop-helix (HLH) class of proteins (Murre et al. 1989a; Alonso and Cabrera, 1988; Caudy et al. 1988b; Ellis et al. 1990; Garrell and Modolell, 1990; Rushlow et al. 1989; Villares and Cabrera, 1987) has led to the suggestion that the dosage-sensitive interactions of the ASIG primarily reflect protein-protein interactions (Ellis et al. 1990; Garrell and Modolell, 1990).

Most known members of the HLH family of proteins, including the da and AS-C proteins, have a region of conserved basic residues adjacent to the HLH domain that is essential for sequence-specific DNA-binding activity (Davis et al. 1990; Lassar et al. 1989; Murre et al. 1989a; Voronova and Baltimore, 1990). These are referred to as bHLH proteins (Davis et al. 1990; Jones, 1990). They bind DNA with high affinity as dimeric or higher-order oligomeric complexes, the formation of which requires the HLH dimerization domain (Davis et al. 1990; Lassar et al. 1989; Murre et al. 1989a,b; Voronova and Baltimore, 1990). The specific binding sites that have been identified for several bHLH proteins include a common core sequence (CANNTG) called the 'E box' (see Blackwell and Weintraub, 1990; Murre and Baltimore, 1991; Murre et al. 1989a,b). Detailed studies of certain bHLH proteins, such as the myogenic regulator MyoD (Tapscott et al. 1988), have shown that they act as transcriptional activators, controlling not only the expression of cell type-specific downstream genes (Davis et al. 1990; Lassar et al. 1989), but also their own expression (positive autoregulation) (Thayer et al. 1989). By analogy to these results, the Drosophila da and AS-C proteins are thought to exert their effects on neurogenesis as transcriptional activators (Alonso and Cabrera, 1988; Murre et al. 1989b).

da and the AS-C T4 (sc) gene also play an essential role in sex determination by participating in the measurement of the X:autosome ratio in the preblastoderm embryo (Cline, 1976; Cline, 1988; Erickson and Cline, 1991; Parkhurst et al. 1990; Steinmann-Zwicky et al. 1990; Torres and Sanchez, 1989). Here again, heterodimeric complexes of the two proteins (da/T4) are thought to act as transcriptional activators of the master regulatory gene Sex-lethal (Erickson and Cline, 1991; Parkhurst et al. 1990). Consistent with these hypotheses, Murre et al. have shown that complexes of the da and AS-C T3 (lethal of scute) proteins will bind in vitro to the κE2 site of the mouse immunoglobulin κ light chain enhancer (Murre et al. 1989b).

We and others have reported previously that the predicted emc protein, like its mouse homologue Id (Benezra et al. 1990), shares the HLH dimerization domain of the bHLH proteins but lacks the conserved basic residues that constitute their DNA-binding domain (Ellis et al. 1990; Garrell and Modolell, 1990). On the basis of this finding, we proposed that the emc protein negatively regulates sensory organ determination by forming heterodimers with the da and/or AS-C proteins, thereby interfering with their DNA binding or transcriptional activation functions (Ellis et al. 1990). Benezra et al. (1990) have shown that the Id protein forms complexes in vitro with at least three mouse bHLH proteins and attenuates their ability to bind DNA as dimers.

The h protein, by contrast, has several sequence characteristics that distinguish it from both emc/Id and typical bHLH proteins. These characteristics are shared by the HLH proteins encoded by units m5, m7, and m8 of the Enhancer of split [E(spl)] gene complex (Klambt et al. 1989). E(spl) is a member of the so-called neurogenic class of genes; these are required for inhibitory cell-cell interaction events that restrict the expression of particular cell fates during both central and peripheral neurogenesis in Drosophila (see for review Campos-Ortega, 1990). Though the h and E(spl) HLH proteins share a set of basic residues adjacent to their HLH domains, the basic region consensus sequence derived from them is quite different from that derived from other bHLH proteins (see Discussion). Nevertheless, the existence of a conserved set of basic residues in the h and E(spl) proteins indicates that they may be capable of binding DNA, and suggests that their mode of action may therefore be different from that of emc and Id.

In this paper we report the identification, in the region upstream of the ac gene, of specific 'E box' binding sites for hetero-oligomeric complexes composed of the da protein and an AS-C protein. We also demonstrate that the emc protein can specifically antagonize the DNA-binding activity of da/AS-C complexes in vitro. Under the same conditions, the h and E(spl) m8 proteins fail to exhibit this inhibitory activity.

Materials and methods

General molecular biology methods
General techniques not described in detail below were carried out as described by Ausubel et al. (1987) and Sambrook et al. (1989).

Synthesis of capped mRNAs in vitro
cDNA clones representing mRNAs of the sc (AS-C T4), ac (AS-C T5), da, h, and E(spl) m8 genes were isolated from a library made from 4-8 h Drosophila embryo poly(A)+ RNA (Brown and Kafatos, 1988). These have been designated pNBT4, pNBT5, pNBda, pNBh, and pNBm8, respectively. Each clone contains a complete protein coding sequence, as determined by comparing partial sequence data to published sequences (Caudy et al. 1988b; Cronmiller et al. 1988; Klambt et al. 1989; Rushlow et al. 1989; Villares and Cabrera, 1987); details are available upon request. An emc cDNA clone (pNB5b) has been described previously (Ellis et al. 1990). All cDNAs except that representing da were transcribed (see below) directly from the library vector (pNB40).

For da, an AseI-ScaI fragment of pNBda containing the protein coding region was ligated to XbaI linkers and subcloned into the XbaI site of pBluescript KS(+) (Stratagene) so that the 5' end is oriented towards the SacI site of the vector (pKSda). The EcoRI-AvaII fragment from pNB40 containing the SP6 promoter and the Xenopus β-globin leader sequences (Brown and Kafatos, 1988) was then inserted by blunt-end ligation into the NotI site of pKSda, to yield pβGda; in vitro transcription of pβGda using SP6 polymerase generates a synthetic da mRNA containing the β-globin leader. This construct reduces the length of the 5' untranslated region of the da cDNA; transcripts from it are more efficiently translated in vitro than those containing additional 5' untranslated sequences (M. V. D., unpublished observations).

To generate a DNA template for making a truncated da protein, pKSda was digested with BstXI, which cuts within the pBluescript KS(+) vector and also after codon 348 in the da coding sequence. Following removal of single-stranded overhangs with T4 DNA polymerase, the fragment containing the vector and the carboxy-terminal half of the da coding region was isolated and religated. This generates a plasmid (pdaS) in which the first codon in the truncated da coding sequence is an ATG (codon 349 in the full-length da), which can now be used as a new translation initiation site. Translation of synthetic mRNA transcribed from this template should yield a truncated da protein (362 amino acids instead of 710) which still contains the bHLH region.

A plasmid containing an AS-C T3 (lethal of scute) cDNA inserted into the EcoRI site of the pBluescribe (pBS) vector (Stratagene) was a gift from C. Cabrera (pCCT3), and a plasmid containing a mouse CREB cDNA in the EcoRV site of pBluescript SK(-) (pαT3CREB) was a gift from D. Steger and P. Mellon.

Prior to in vitro transcription, DNA templates were linearized with an appropriate restriction enzyme, phenol/chloroform extracted, and ethanol precipitated. Enzymes used for linearization are as follows: pNBT4, pNBT5, pNB5b, and pNBm8 (NotI); pNBh (EcoRI); pCCT3 (PvuI); pβGda (BamHI); pdaS (HindIII); pαT3CREB (EcoRI).

SP6, T3, and T7 RNA polymerases and RNasin were purchased from Promega. In vitro synthesis of capped mRNA was performed essentially as suggested by Promega, using 3 μg of linearized template DNA per 50 μl reaction, and including a 10-fold excess of m⁷GpppG (Boehringer Mannheim) over GTP (see Promega protocol). Over the course of the incubation, the reactions were supplemented 2-3 times with additional RNA polymerase and GTP to increase the yield of capped RNA. The transcripts were purified by incubating for 15 min with RNAase-free DNAase (2.5 mg ml⁻¹, Worthington), extracting once with phenol/chloroform, extracting twice with dH₂O-saturated isobutanol, and ethanol precipitating twice from 2 M ammonium acetate. The RNA was redissolved in diethylpyrocarbonate-treated dH₂O and quantitated by OD₂₆₀.

In vitro translation
Wheat germ and rabbit reticulocyte lysates were purchased from Promega, and translations were performed essentially as described by the manufacturer. In vitro synthesized mRNA was added to 20 μg ml⁻¹ final concentration. Labeled L-methionine (ICN, >1000 Ci mmol⁻¹, ≈15 μM) was added as suggested (4 μl/50 μl in reticulocyte reactions, 2.5 μl/50 μl in wheat germ reactions), and unlabeled L-methionine (Calbiochem) was added to three times the estimated concentration of labeled L-methionine (3.6 μM cold L-methionine for reticulocyte lysates; 2.25 μM for wheat germ lysates). Labeled protein products were visualized by SDS-polyacrylamide gel electrophoresis and autoradiography. In each case the major labeled species migrated with an apparent relative molecular mass consistent with the predicted protein sequence, though very small amounts of lower molecular weight species (presumably representing proteolytic fragments or incomplete translation products) were also observed (data not shown). Translation reactions were quantitated by TCA precipitation: total TCA-precipitable counts were normalized to the number of methionine residues in the expected protein product as a measure of the relative molar quantity of protein produced. Translation mixtures were stored at -80°C in 10% glycerol.

Preparation of DNA probes and specific competitors
The following DNA oligonucleotide probes were synthesized (Operon Technologies); each represents the sequence surrounding a single 'E box' consensus site (bold letters) from the 5' flanking region of the ac gene of the AS-C (Villares and Cabrera, 1987; see Fig. 1A):

T5E1: AGGTAGTCACGCAGGTGGGATCCCTAGGCCC
      GGTCCATCAGTGCGTCCACCCTAGGGATCCG
T5E2: CCTCAGGTCACCAACAGCTGCGTTTTACAGA
      AGTCCAGTGGTTGTCGACGCAAAATGTCTGG
T5E3: GGGACGACAGGCAGCTGAAAATGAACAAAGG
      CCCCCTGCTGTCCGTCGACTTTTACTTGTTT
T5E4: CCGGCTGAGAGGAACAACTGATACGTTGGGC
      CCGACTCTCCTTGTTGACTATGCAACCCGGG

The mutant probe T5XX3 is identical to the T5E3 probe with the exception of a 2-bp change in the 'E box' consensus sequence from CAGCTG to AAGCGG. The two strands of a given probe were end-labeled separately using [γ-³²P]ATP (ICN) and T4 polynucleotide kinase, annealed, and purified using a G-25 Sephadex spin column. Typically, 2 pmol of labeled double-stranded oligonucleotide was recovered in a final volume of 80 μl at 1-5×10[...] cts min⁻¹ μl⁻¹.

Complementary oligonucleotides comprising a binding site for the mouse CREB protein from the human glycoprotein hormone α-subunit gene were a gift from J. Altschmied and P. Mellon (Delegeane et al. 1987):

GATCTAAATTGACGTCATGGTAA
    ATTTAACTGCAGTACCATTCTAG

[...] the ends, the [...]

Unlabeled specific competitors were prepared by annealing complementary oligonucleotides.

Electrophoretic mobility shift assays
DNA binding reactions consisted of 7 μl of translation lysate (blank lysate and/or lysate containing specific proteins synthesized in vitro, stored in 10% glycerol), 1 μl of probe (25 fmol or ≈0.5 ng at 1-5×10[...] cts min⁻¹), 1 μl of 1 mg ml⁻¹ poly(dI-dC) (Sigma), and 1 μl 10x reaction buffer (0.1 M Tris-HCl pH 7.5, 0.5 M [...], 10 mM EDTA, and 275 μg ml⁻¹ denatured salmon sperm DNA). An additional contribution of salt to the DNA binding reactions comes from the translation reaction mixtures, which have a potassium acetate concentration of 90 mM (rabbit reticulocyte) and 67.5 mM (wheat germ). For competition experiments, 0.5 μl of 100 μg ml⁻¹ unlabeled competitor probe (a 100-fold excess) was also included in the reactions. Translation products were mixed and preincubated for 20 min at 25°C. The remaining components were then added and the mixtures incubated for an additional 40 min at 25°C. The reactions were electrophoresed on prerun 0.5x TBE/4% acrylamide gels (2 mm thick × 15 cm long) for 2.5 h at 100 V (6.7 V cm⁻¹). Gels were fixed in 25% methanol/7% acetic acid for 1-2 h and dried prior to autoradiography on X-Omat AR X-ray film (Kodak) at -80°C with an intensifying screen.

Results

Identification of bHLH protein binding sites upstream of the achaete gene
In order to test the functional properties of the emc protein in vitro, we first sought to identify binding sites for the da and AS-C proteins that might be functionally significant in vivo. We were guided in this endeavor by the known positive autoregulatory activity of MyoD (Thayer et al. 1989), and by genetic evidence indicating that as little as 0.9 kb of 5' flanking sequence is sufficient for much of the wild-type activity of the ac gene (Ruiz-Gomez and Modolell, 1987). The specific binding sites that have been identified previously for bHLH proteins conform to the consensus sequence CANNTG ('E box'; see Blackwell and Weintraub, 1990; Murre and Baltimore, 1991; Murre et al. 1989a,b). We searched the DNA sequence of the 5' flanking region of ac (transcription unit T5 of the AS-C) (Villares and Cabrera, 1987) for occurrences of this consensus. In the first 877 bp upstream of the transcription start, there are four CANNTG sequences (Fig. 1A), three located close to the initiation site and one substantially further upstream. We have given these sites the designations T5E1-T5E4, in order of increasing distance from the transcription start. Fig. 1B shows a comparison of their sequences with the consensus derived from other known bHLH binding sites. It is noteworthy that the T5E1, T5E2, and T5E3 sites, but not the T5E4 site, match the consensus for the 'E2' class of 'E box' sequences (Murre and Baltimore, 1991).

T5E1           GCAGGTGG
T5E2'          GCAGCTGT
T5E3           GCAGCTGA
T5E4           ACAACTGA
T5E4'          TCAGTTGT
E2 consensus   GCAGNTGN

Fig. 1. Sequences upstream of the T5 (ac) gene of the achaete-scute complex (AS-C) that conform to the 'E box' consensus CANNTG. (A) Diagram showing the location of four 'E box' consensus sequences that occur in the first 877 bp upstream of the AS-C T5 gene. The sites are designated T5E1-T5E4 in order of increasing distance (negative numbers, in bp) from the transcription start site of the gene (arrow at +1). (B) Alignment of the four 'E box' sequences T5E1-T5E4. All four contain the core 'E box' consensus nucleotides shown in bold print (CA--TG). However, only T5E1, T5E2, and T5E3 match the consensus sequence shown at the bottom for the E2 class of bHLH binding sites (Murre and Baltimore, 1991; see Discussion). Both strands of the T5E4 sequence are shown; underlined bases violate the E2 consensus. Sequences shown with a prime are from the antisense strand.

Previous studies have indicated that other bHLH proteins bind DNA sequence-specifically and with high affinity as dimers or higher-order oligomers (Blackwell and Weintraub, 1990; Davis et al. 1990; Murre et al. 1989a,b; Voronova and Baltimore, 1990). We made use of an electrophoretic mobility shift assay (EMSA) to test whether the T5E sequences are bound specifically in vitro by complexes involving one or more HLH proteins known or thought to play a role in adult peripheral nervous system development in Drosophila (see Introduction). The various proteins were synthesized individually by in vitro translation of synthetic mRNA templates, using either wheat germ or rabbit reticulocyte lysates. The binding site probes consisted of double-stranded oligonucleotides terminally labeled with ³²P.

We tested the following proteins alone and in all possible pairwise (equimolar) combinations with one of the T5E probes, that representing T5E3: da, AS-C T3, AS-C T4 (sc), AS-C T5 (ac), emc, h, and E(spl) m8. As shown in Fig. 2A, detectable binding to the T5E3 probe was observed only with the da+T3, da+T4, and da+T5 protein combinations, suggesting that in each case a hetero-oligomeric complex is the competent binding species. None of these proteins alone exhibited specific binding to this probe; similarly, the three pairwise combinations of T3, T4, and T5 failed to bind detectably (Fig. 2A). Finally, no specific binding to the T5E3 probe was observed with the emc, h, and E(spl) m8 proteins, either alone or in pairwise combination with any other protein (data not shown).

Three aspects of the binding of the T5E3 probe by the different da/AS-C protein combinations are worthy of note (Fig. 2A). First, two different shifted complexes are observed in each case, with a much greater amount of probe present in the upper band. Second, the reduced mobilities of the da/T4 protein-DNA complexes relative to those of the da/T3 and da/T5 complexes are consistent with the substantially greater relative molecular mass of the T4 protein (38×10³ Mr, versus 29×10³ Mr for T3 and 23×10³ Mr for T5). Third, we consistently observe the substantial differences in the amount of T5E3 probe bound by the different da/AS-C combinations shown in Fig. 2A (in the order da/T5>da/T3>da/T4), while much smaller differences are observed in the binding of the same protein preparations to other probes (see below).

We investigated these DNA-binding activities of da/AS-C protein combinations in more detail in the experiments illustrated in Fig. 2B-C. Fig. 2B presents the results of two experiments designed to test the sequence specificity of da/T5 binding to the T5E3 probe. Under conditions in which the da/T5 protein combination shows a strong binding activity with the T5E3 probe, the same combination fails to bind detectably to the mutant probe T5XX3, which is identical to T5E3 except for a 2-bp change in the 'E box' sequence from CAGCTG to AAGCGG. A competition experiment (Fig. 2B) corroborates these results. Addition of excess unlabeled T5E3 oligonucleotide to the binding reaction results in a strong reduction in the amount of labeled T5E3 probe present in each of the da/T5-specific complexes, while the non-template-dependent background band (presumably due to a protein present in the wheat germ lysate) is affected very little. By contrast, addition of unlabeled T5XX3 causes only a slight reduction in the amount of specific binding. Similar results have been obtained in competition experiments testing the specificity of da/T3 and da/T4 binding to the T5E3 probe (data not shown). These findings demonstrate that the binding of da/AS-C protein complexes to the T5E3 oligonucleotide in vitro is sequence-specific and that the 'E box' nucleotides constitute at least part of the required sequence of the binding site.

Fig. 2. Electrophoretic mobility shift assays demonstrating the characteristics of da/AS-C binding to 'E box' sequences upstream of the ac gene. Proteins were synthesized in a wheat germ translation lysate. (A) The da and AS-C T3, T4, and T5 proteins were assayed for their ability to bind to the T5E3 probe (see Materials and methods) either individually or in pairwise combinations. Six molar units of protein (arbitrary units; see Materials and methods) were used in single protein lanes; three molar units of each protein were used in combination lanes. Weak 'background' band observed in all lanes is not dependent on added Drosophila proteins (see B). (B) Specificity of da/T5 binding to the T5E3 site. Left panel: Binding of da/T5 complexes to either the wild-type probe T5E3 or the mutant probe T5XX3, which is identical to T5E3 except for a 2-bp change in the 'E box' sequence from CAGCTG to AAGCGG (see Materials and methods). For control lanes (Cont), the probe was mixed with wheat germ lysate not programmed by synthetic mRNA template. The weak 'background' band observed in each lane is presumably due to the binding of an endogenous protein or proteins present in the translation lysate. Right panel: da/T5 binding to the T5E3 probe is specifically competed by a 100-fold excess of unlabeled T5E3, but not by the same quantity of unlabeled T5XX3 [compare to lane with no added competitor (None)]. (C) da/AS-C-dependent protein-DNA complexes exhibit different mobilities either when a different AS-C protein (T4 versus T5) is combined with full-length da protein (left panel), or when full-length versus truncated da protein (daS; see Materials and methods) is combined with full-length T5 protein (right panel). Central lane in each panel shows the results of combining the three proteins in each case. (D) Identical experiment to that shown in A, except that the T5E1 probe was used. A different pattern of non-template-dependent background bands is observed with this probe. The exposure time for the autoradiogram in D is approximately four times longer than for the autoradiogram in A, using probes of similar specific activity. In A, B, and D, free probe appears at the bottom of each panel.

Fig. 2C presents evidence that not only are both da and an AS-C protein required to give detectable binding to the T5E3 site in this assay (Fig. 2A), but both proteins are in fact present in the protein-DNA complex. When the da protein is mixed with the T4 protein (38×10³ Mr), the complex formed on the T5E3 site exhibits a slower mobility than the complex of da with the smaller T5 protein (23×10³ Mr) on the same site, indicating that in each case the AS-C protein is part of the bound complex (Fig. 2C, left). Moreover, when both T4 and T5 proteins are mixed with da, the two different da/AS-C complexes are formed and bind DNA independently, as expected. Conversely, when a truncated da protein (42×10³ Mr) is mixed with T5 protein, the mobility of the complex bound to the T5E3 site is considerably increased relative to that of the complex formed using full-length da (74×10³ Mr) and T5, indicating that the da protein is likewise a part of the protein-DNA complexes we are detecting in this assay (Fig. 2C, right). Again, the complexes of T5 with full-length and truncated da protein form and bind DNA independently, and no third complex of intermediate mobility is observed. We interpret these results to mean that the T5E3 'E box' site is bound in vitro by hetero-oligomeric, probably heterodimeric, complexes of da and an AS-C protein.

Using the same electrophoretic mobility shift assay, we tested the ability of the da, T3, T4 and T5 proteins (singly and in pairwise combination) to bind in vitro to the other three 'E box' sequences that we had identified. Fig. 2D shows the results for the T5E1 oligonucleotide probe. Just as with the T5E3 site (Fig. 2A), the three combinations of da and an AS-C protein (da/T3, da/T4, and da/T5) exhibit the greatest binding activity, while the AS-C proteins alone, either singly or in pairwise combination, fail to bind detectably. Two important differences from the T5E3 results are readily apparent. First, the da protein alone produces two shifted complexes with the T5E1 probe, but does not bind detectably to the T5E3 probe (Fig. 2A). The reduced mobility of the da-only T5E1 complexes relative to those of da/AS-C T5E1 complexes is consistent with the much greater relative molecular mass of the da protein compared to that of the AS-C proteins (74×10³ Mr versus 38×10³ Mr for the largest of the three), suggesting that da/da homo-oligomers bind to the T5E1 site. However, it is also possible that hetero-oligomeric complexes formed between da and a protein or proteins present in the translation lysate are responsible for these shifted bands. Second, the amounts of T5E1 probe present in the different da/AS-C complexes are much more similar than in the case of the T5E3 probe. Since the same protein preparations were used in the two experiments, these differences are likely to reflect differences in the relative affinities of da/T3, da/T4, and da/T5 hetero-oligomeric protein complexes for the [...] bind specifically to the three located close to the transcription initiation site but not to the more distantly located T5E4 sequence.

Specific inhibition of the DNA-binding activities of da/AS-C complexes by the emc protein
The finding that the predicted emc protein shares the dimerization domain of other HLH proteins but lacks an adjacent DNA-binding motif led us to propose that it acts as a negative regulator by forming heterodimers with the da and/or AS-C proteins (Ellis et al. 1990). It was suggested that, by contrast to da/AS-C complexes, emc-containing heterodimers might lack either DNA-binding or transcriptional regulatory activity. We used the EMSA to test the ability of the emc protein to antagonize the DNA-binding activities of da/AS-C protein complexes. As shown in Fig. 3A, addition of increasing amounts of emc protein to binding reactions that included the da and T5 proteins and the T5E3 oligonucleotide probe resulted in a progressive reduction of the amount of probe present in both of the da/T5-specific protein-DNA complexes. In contrast, the non-template-dependent background band is relatively unaffected by emc. The same results were obtained using the T5E1 probe, or when the proteins were synthesized in a rabbit reticulocyte lysate instead of a wheat germ lysate (data not shown). The ability of the emc protein to inhibit specific DNA binding by da/T5 complexes is not shared under these conditions by either of two other HLH proteins, h and E(spl) m8. Addition of identical molar ratios of these proteins to da/T5-T5E3 binding reactions did not appreciably affect the amount of probe present in the shifted protein-DNA complexes (Fig. 3A). Fig. 3B shows that emc is likewise able to inhibit the binding of da/T3 and da/T4 protein complexes to the T5E3 site, while again not affecting the amount of probe in the 'background' complex. Finally, emc antagonizes the binding of da-only complexes to the T5E1 site (Fig. 3C), suggesting that it is capable of forming hetero-oligomeric complexes with at least the da protein. As with the da/T5 complexes on the T5E3 site, the h protein by contrast fails to antagonize the binding of


Fig. 3. Electrophoretic mobility shift assays demonstrating the effects of the emc, h, and E(spl) m8 proteins on the ability of da/AS-C protein complexes to bind DNA. Proteins were synthesized in a wheat germ translation lysate [except for the CREB protein (see D), which was synthesized in a rabbit reticulocyte lysate]. For control lanes, the probe was mixed with wheat germ lysate not programmed by synthetic mRNA template. The 'background' band(s) observed in each lane are presumably due to the binding of endogenous proteins present in the translation lysate. (A) The emc protein effectively inhibits the binding of da/T5 to the T5E3 site, while the h and E(spl) m8 proteins do not. Amounts of protein are as follows (arbitrary units; see Materials and methods): 3 molar units each of da and T5, and 6 (2x) or 15 (5x) molar units of emc, h and E(spl) m8 (i.e. the emc protein is present in a two-fold or five-fold molar excess over either the da or T5 protein). (B) The emc protein can also inhibit the binding of da/T3 and da/T4 to T5E3. Protein amounts are 30 molar units for emc and 6 molar units each for the other proteins (i.e. a five-fold molar ratio of emc to da or emc to AS-C). (C) The emc protein, but not the h protein, inhibits the ability of putative da/da complexes to bind to the T5E1 site. Protein amounts are 9 molar units for da and 41 molar units each for emc and h (i.e. a 4.5-fold molar excess of emc or h). (D) Demonstration that the emc protein is not a general inhibitor of DNA binding, as it does not appreciably affect the binding of the mouse CREB protein to a specific site (cyclic AMP response element or CRE) from the human glycoprotein hormone α-subunit gene (Delegeane et al. 1987; see Materials and methods). Protein amounts are 3 molar units for CREB and 15 molar units for emc (i.e. a five-fold molar excess of emc). In A and D, free probe appears at the bottom of the panel.
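
The molar ratios quoted in the Fig. 3 legend can be checked with a few lines of arithmetic. The Python sketch below is illustrative only and is not part of the original methods: the function names, the dictionary layout, and the counts passed to `relative_molar_units` are assumptions made for this example. It normalizes TCA-precipitable counts to methionine content, as described in Materials and methods, and recomputes the ratio of emc to its partner protein from the molar units stated in the legend.

```python
"""Sketch: relative molar units and molar ratios, using values from the Fig. 3 legend."""


def relative_molar_units(tca_counts: float, n_methionines: int) -> float:
    """TCA-precipitable counts per methionine residue (arbitrary units)."""
    if n_methionines <= 0:
        raise ValueError("the protein must contain at least one methionine")
    return tca_counts / n_methionines


def molar_ratio(inhibitor_units: float, target_units: float) -> float:
    """Molar ratio of a candidate inhibitor (e.g. emc) to a target (e.g. da)."""
    if target_units <= 0:
        raise ValueError("target_units must be positive")
    return inhibitor_units / target_units


# Molar units quoted in the Fig. 3 legend:
# panel -> (emc units, units of the da, AS-C or CREB protein).
LEGEND_UNITS = {
    "A, 2x": (6, 3),
    "A, 5x": (15, 3),
    "B": (30, 6),
    "C": (41, 9),
    "D": (15, 3),
}

ratios = {
    panel: molar_ratio(emc, target)
    for panel, (emc, target) in LEGEND_UNITS.items()
}

if __name__ == "__main__":
    for panel, ratio in ratios.items():
        print(f"Fig. 3{panel}: {ratio:.2f}-fold")
    # Hypothetical counts, for illustration only (not data from the paper).
    print(relative_molar_units(tca_counts=12000.0, n_methionines=8))
```

For panel C the computed ratio is 41/9, or about 4.56, which the legend gives as 4.5-fold; the other panels come out at exactly two-fold or five-fold.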
complexes composed of the da protein and an AS-C protein bind in vitro to three specific sites upstream of the ac gene, and that the emc protein specifically antagonizes this activity in a dose-dependent manner. These results have a number of important implications for our understanding of the function of both positively and negatively acting HLH proteins in the control of adult sensory organ development and other processes in Drosophila.

Sequence-specific DNA-binding activities of the da and AS-C proteins

We have extended the work of Murre et al. (1989b) to demonstrate that, like other members of the bHLH class of proteins, the products of the da and AS-C T3, T4, and T5 genes are sequence-specific DNA-binding proteins in vitro, and that the conserved bases of the 'E box' consensus form at least part of the sequence required for their binding. For the 'E box' sites considered here, the strongest binding activity is exhibited by combinations of da and an AS-C protein, and we have presented evidence that in these cases, both proteins are indeed part of the bound complex. The simplest interpretation of these data is that of the possible complexes involving the da and/or AS-C proteins, hetero-oligomeric (probably heterodimeric) da/AS-C complexes represent the highest-affinity binding species for these sites. Our experiments do not allow us to distinguish whether the failure of the AS-C proteins to bind (either singly or in pairwise combination) in the absence of the da protein reflects a very low affinity of AS-C homodimers or heterodimers for these sites, or the failure of such protein complexes to form. The apparent inability of putative da homo-oligomers to bind to the T5E2 and T5E3 sites, however, does appear to be a measure of their affinity for these sites, since

required for the establishment and/or execution of the sensory organ precursor cell fate (Cline, 1989; Dambly-Chaudiere et al. 1988; Garcia-Bellido, 1979; Garcia-Bellido and Santamaria, 1978; Romani et al. 1989; Ruiz-Gomez and Modolell, 1987). By analogy to the mammalian myogenic protein MyoD (Davis et al. 1990; Lassar et al. 1989), it is likely that these Drosophila bHLH proteins act as dimeric transcriptional activators of genes involved in sensory organ development. Our results show that protein complexes composed of da and an AS-C protein possess the requisite sequence-specific DNA-binding activity for such a regulatory function. Thus, a single hypothesis may generally suffice to explain the genetically defined requirement for both da and AS-C activity in adult peripheral neurogenesis; i.e. that these proteins exert their regulatory effects on this process as part of da/AS-C

1991). Finally, these four proteins all have the sequence WRPW as their carboxy-terminal residues (Klambt et al. 1989; Rushlow et al. 1989).

Davis et al. (1990) have shown that when a single proline residue is substituted into the basic region of MyoD, the protein loses its DNA-binding activity, though dimerization is unaffected. By analogy to these results, it has been suggested that the presence of a proline residue in the basic region of the wild-type h and E(spl) proteins would render their basic regions non-functional as DNA-binding domains (Benezra et al. 1990; Davis et al. 1990; Garrell and Modolell, 1990; Jones, 1990). As a result, they would be expected to function like emc and Id; i.e. they would antagonize by heterodimerization the activity of sequence-specific DNA-binding bHLH proteins. However, this suggestion seems unpersuasive for two reasons. First, it does not explain the apparent conservation of basic residues in the h and E(spl) proteins; this is instead suggestive of a possible function of these residues in DNA binding. Second, the fact that the proline residue in the basic region of h and the E(spl) HLH proteins is a substitution for a conserved asparagine residue present in the basic region of most bHLH proteins suggests that it may have an important function. For example, the conserved asparagine residue may serve a function similar to that proposed for a conserved asparagine residue in the basic region of leucine zipper (bZIP) proteins; that is, to produce a bend or kink in the DNA-binding domain that maximizes the interaction of the basic residues with DNA (Vinson et al. 1989). Perhaps the proline residue in the h and E(spl) proteins functions in a related way.

These considerations prompted us to compare the abilities of the emc, h, and E(spl) m8 proteins to antagonize the sequence-specific DNA-binding activities of da/AS-C hetero-oligomeric complexes. We found that, under the conditions of the experiments shown in Fig. 3A and C, the emc protein behaves very differently from the other two. Neither the h nor the E(spl) m8 protein displays an inhibitory activity of the type clearly exhibited by the emc protein, consistent with the notion that they do not act by forming heterodimers with the da and/or AS-C proteins, as emc appears to. On the other hand, these in vitro experiments cannot be considered conclusive, since the contrasting behaviors of the different proteins in our assay do not necessarily indicate a different mechanism of action. For example, the h and E(spl) m8 proteins may require a post-translational modification in order to be activated (one not supplied by the translation lysate), while the emc protein does not. Thus, while the experiments that we have described do not demonstrate that the emc protein acts differently from the h and E(spl) m8 proteins in vivo, they nevertheless establish specific in vitro conditions in which these distinct types of HLH protein display distinct functional behaviors.

We are grateful to Andrew Singson, Carlos Cabrera, and David Ish-Horowicz for providing cDNA and genomic clones, and to Joachim Altschmied, David Steger, and Pamela Mellon for mouse CREB cDNA and CRE probe DNA. We thank Anne Bang, Michael Levine, Jonathan Margolis, William McGinnis, Pamela Mellon, Kees Murre, and François Schweisguth for critical review of the manuscript. This work was supported by an NIH predoctoral training grant (GM 07240; M. V. D.), a Senior Postdoctoral Fellowship from the California Division of the American Cancer Society (H. M. E.), a Pew Scholars award (J. W. P.), and an NIH grant (GM 41100).

References

ALONSO, M. C. AND CABRERA, C. V. (1988). The achaete-scute gene complex of Drosophila melanogaster comprises four homologous genes. EMBO J. 7, 2585-2591.

AUSUBEL, F. M., BRENT, R., KINGSTON, R. E., MOORE, D. D., SEIDMAN, J. G., SMITH, J. A. AND STRUHL, K. (1987). Current Protocols in Molecular Biology. New York: John Wiley and Sons.

BENEZRA, R., DAVIS, R. L., LOCKSHON, D., TURNER, D. L. AND WEINTRAUB, H. (1990). The protein Id: a negative regulator of helix-loop-helix DNA binding proteins. Cell 61, 49-59.

BLACKWELL, T. K. AND WEINTRAUB, H. (1990). Differences and similarities in DNA-binding preferences of MyoD and E2A protein complexes revealed by binding site selection. Science 250, 1104-1110.

BOTAS, J., MOSCOSO DEL PRADO, J. AND GARCIA-BELLIDO, A. (1982). Gene-dose titration analysis in the search of trans-regulatory genes in Drosophila. EMBO J. 1, 307-310.

BROWN, N. H. AND KAFATOS, F. C. (1988). Functional cDNA libraries from Drosophila embryos. J. molec. Biol. 203, 425-37.

CAMPOS-ORTEGA, J. A. (1990). Mechanisms of a cellular decision during embryonic development of Drosophila melanogaster: Epidermogenesis or neurogenesis. Adv. Genet. 27, 403-453.

CAUDY, M., GRELL, E. H., DAMBLY-CHAUDIERE, C., GHYSEN, A., JAN, L. Y. AND JAN, Y. N. (1988a). The maternal sex determination gene daughterless has zygotic activity necessary for the formation of peripheral neurons in Drosophila. Genes Dev. 2, 843-52.

CAUDY, M., VASSIN, H., BRAND, M., TUMA, R., JAN, L. Y. AND JAN, Y. N. (1988b). daughterless, a Drosophila gene essential for both neurogenesis and sex determination, has sequence similarities to myc and the achaete-scute complex. Cell 55, 1061-7.

CLINE, T. W. (1976). A sex-specific, temperature-sensitive maternal effect of the daughterless mutation of Drosophila melanogaster. Genetics 84, 723-742.

CLINE, T. W. (1988). Evidence that sisterless-a and sisterless-b are two of several discrete 'numerator elements' of the X/A sex determination signal in Drosophila that switch Sxl between two alternate stable expression states. Genetics 119, 829-863.

CLINE, T. W. (1989). The affairs of daughterless and the promiscuity of developmental regulators. Cell 59, 231-4.

CRONMILLER, C., SCHEDL, P. AND CLINE, T. W. (1988). Molecular characterization of daughterless, a Drosophila sex determination gene with multiple roles in development. Genes Dev. 2, 1666-1676.

DAMBLY-CHAUDIERE, C. AND GHYSEN, A. (1987). Independent subpatterns of sense organs require independent genes of the achaete-scute complex in the Drosophila larva. Genes Dev. 1, 1297-1306.

DAMBLY-CHAUDIERE, C., GHYSEN, A., JAN, L. Y. AND JAN, Y. N. (1988). The determination of sense organs in Drosophila: Interaction of scute with daughterless. Roux's Arch. devl Biol. 197, 419-423.

DAVIS, R. L., CHENG, P. F., LASSAR, A. B. AND WEINTRAUB, H. (1990). The MyoD DNA binding domain contains a recognition code for muscle-specific gene activation. Cell 60, 733-46.

DELEGEANE, A. M., FERLAND, L. H. AND MELLON, P. L. (1987). Tissue-specific enhancer of the human glycoprotein hormone α-subunit gene: Dependence upon cyclic AMP inducible elements. Molec. cell. Biol. 7, 3994-4002.

ELLIS, H. M., SPANN, D. R. AND POSAKONY, J. W. (1990). extramacrochaetae, a negative regulator of sensory organ development in Drosophila, defines a new class of helix-loop-helix proteins. Cell 61, 27-38.

ERICKSON, J. W. AND CLINE, T. W. (1991). Molecular nature of the Drosophila sex determination signal and its link to neurogenesis. Science 251, 1071-1074.

GARCIA-BELLIDO, A. (1979). Genetic analysis of the achaete-scute system of Drosophila melanogaster. Genetics 91, 491-520.

GARCIA-BELLIDO, A. AND SANTAMARIA, P. (1978). Developmental analysis of the achaete-scute system of Drosophila melanogaster. Genetics 88, 469-486.

GARCIA ALONSO, L. A. AND GARCIA-BELLIDO, A. (1988). Extramacrochaetae, a trans-acting gene of the achaete-scute complex of Drosophila involved in cell communication. Roux's Arch. devl Biol. 197, 328-338.

GARRELL, J. AND MODOLELL, J. (1990). The Drosophila extramacrochaetae locus, an antagonist of proneural genes that, like these genes, encodes a helix-loop-helix protein. Cell 61, 39-48.

GHYSEN, A. AND O'KANE, C. J. (1989). Neural enhancer-like elements as specific cell markers in Drosophila. Development 105, 35-52.

GONZALEZ, F., ROMANI, S., CUBAS, P., MODOLELL, J. AND CAMPUZANO, S. (1989). Molecular analysis of the asense gene, a member of the achaete-scute complex of Drosophila melanogaster, and its novel role in optic lobe development. EMBO J. 8, 3553-3562.

HABENER, J. F. (1990). Cyclic AMP response element binding proteins: A cornucopia of transcription factors. Molec. Endocrinol. 4, 1087-1094.

HARTENSTEIN, V. AND POSAKONY, J. W. (1989). Development of adult sensilla on the wing and notum of Drosophila melanogaster. Development 107, 389-405.

JONES, N. (1990). Transcriptional regulation by dimerization: Two sides to an incestuous relationship. Cell 61, 9-11.

KLAMBT, C., KNUST, E., TIETZE, K. AND CAMPOS-ORTEGA, J. A. (1989). Closely related transcripts encoded by the neurogenic gene complex Enhancer of split of Drosophila melanogaster. EMBO J. 8, 203-210.

LANDSCHULZ, W. H., JOHNSON, P. F. AND MCKNIGHT, S. L. (1988). The leucine zipper: A hypothetical structure common to a new class of DNA binding proteins. Science 240, 1759-1764.

LASSAR, A. B., BUSKIN, J. N., LOCKSHON, D., DAVIS, R. L., APONE, S., HAUSCHKA, S. D. AND WEINTRAUB, H. (1989). MyoD is a sequence-specific DNA binding protein requiring a region of myc homology to bind to the muscle creatine kinase enhancer. Cell 58, 823-31.

MARTINEZ, C. AND MODOLELL, J. (1991). Cross-regulatory interactions between the proneural achaete and scute genes of Drosophila. Science 251, 1485-1487.

MOSCOSO DEL PRADO, J. AND GARCIA-BELLIDO, A. (1984). Genetic regulation of the Achaete-scute complex of Drosophila melanogaster. Roux's Arch. devl Biol. 193, 242-245.

MURRE, C. AND BALTIMORE, D. (1991). The helix-loop-helix motif: Structure and function. In Transcriptional Regulation (ed. S. McKnight and K. Yamamoto), (in press). Cold Spring Harbor, New York: Cold Spring Harbor Laboratory Press.

MURRE, C., MCCAW, P. S. AND BALTIMORE, D. (1989a). A new DNA binding and dimerization motif in immunoglobulin enhancer binding, daughterless, MyoD, and myc proteins. Cell 56, 777-83.

MURRE, C., MCCAW, P. S., VASSIN, H., CAUDY, M., JAN, L. Y., JAN, Y. N., CABRERA, C. V., BUSKIN, J. N., HAUSCHKA, S. D., LASSAR, A. B., WEINTRAUB, H. AND BALTIMORE, D. (1989b). Interactions between heterologous helix-loop-helix proteins generate complexes that bind specifically to a common DNA sequence. Cell 58, 537-44.

PARKHURST, S. M., BOPP, D. AND ISH-HOROWICZ, D. (1990). X:A ratio, the primary sex-determining signal in Drosophila, is transduced by helix-loop-helix proteins. Cell 63, 1179-1191.

ROMANI, S., CAMPUZANO, S., MACAGNO, E. R. AND MODOLELL, J. (1989). Expression of achaete and scute genes in Drosophila imaginal discs and their function in sensory organ development. Genes Dev. 3, 997-1007.

RUIZ-GOMEZ, M. AND MODOLELL, J. (1987). Deletion analysis of the achaete-scute locus of Drosophila melanogaster. Genes Dev. 1, 1238-1246.

RUSHLOW, C. A., HOGAN, A., PINCHIN, S. A., HOWE, K. M., LARDELLI, M. AND ISH-HOROWICZ, D. (1989). The Drosophila hairy protein acts in both segmentation and bristle patterning and shows homology to N-myc. EMBO J. 8, 3095-3103.

SAMBROOK, J., FRITSCH, E. F. AND MANIATIS, T. (1989). Molecular Cloning: A Laboratory Manual. Cold Spring Harbor, New York: Cold Spring Harbor Laboratory Press.

STEINMANN-ZWICKY, M., AMREIN, H. AND NOTHIGER, R. (1990). Genetic control of sex determination in Drosophila. Adv. Genet. 27, 189-238.

TAPSCOTT, S. J., DAVIS, R. L., THAYER, M. J., CHENG, P. F., WEINTRAUB, H. AND LASSAR, A. B. (1988). MyoD1: A nuclear phosphoprotein requiring a Myc homology region to convert fibroblasts to myoblasts. Science 242, 405-11.

THAYER, M. J., TAPSCOTT, S. J., DAVIS, R. L., WRIGHT, W. E., LASSAR, A. B. AND WEINTRAUB, H. (1989). Positive autoregulation of the myogenic determination gene MyoD1. Cell 58, 241-8.

TORRES, M. AND SANCHEZ, L. (1989). The scute (T4) gene acts as a numerator element of the X:A signal that determines the state of activity of Sex-lethal in Drosophila. EMBO J. 8, 3079-86.

VILLARES, R. AND CABRERA, C. V. (1987). The achaete-scute gene complex of D. melanogaster: Conserved domains in a subset of genes required for neurogenesis and their homology to myc. Cell 50, 415-424.

VINSON, C. R., SIGLER, P. B. AND MCKNIGHT, S. L. (1989). Scissors-grip model for DNA recognition by a family of leucine zipper proteins. Science 246, 911-916.

VORONOVA, A. AND BALTIMORE, D. (1990). Mutations that disrupt DNA binding and dimer formation in the E47 helix-loop-helix protein map to distinct domains. Proc. natn. Acad. Sci. U.S.A. 87, 4722-4726.

WHELAN, J., CORDLE, S. R., HENDERSON, E., WEIL, P. A. AND STEIN, R. (1990). Identification of a pancreatic β-cell insulin gene transcription factor that binds to and appears to activate cell-type-specific expression: Its possible relationship to other cellular factors that bind to a common insulin gene sequence. Molec. cell. Biol. 10, 1564-1572.

(Accepted 4 June 1991)