The BEN Domain Is a Novel Sequence- Specific DNA-Binding Domain Conserved in Neural Transcriptional Repressors

Downloaded from genesdev.cshlp.org on September 25, 2021 - Published by Cold Spring Harbor Laboratory Press The BEN domain is a novel sequence- specific DNA-binding domain conserved in neural transcriptional repressors Qi Dai,1,3 Aiming Ren,2,3 Jakub O. Westholm,1 Artem A. Serganov,2 Dinshaw J. Patel,2 and Eric C. Lai1,4 1Department of Developmental Biology, 2Department of Structural Biology, Sloan-Kettering Institute, New York, New York 10065, USA We recently reported that Drosophila Insensitive (Insv) promotes sensory organ development and has activity as a nuclear corepressor for the Notch transcription factor Suppressor of Hairless [Su(H)]. Insv lacks domains of known biochemical function but contains a single BEN domain (i.e., a ‘‘BEN-solo’’ protein). Our chromatin immuno- precipitation (ChIP) sequencing (ChIP-seq) analysis confirmed binding of Insensitive to Su(H) target genes in the Enhancer of split gene complex [E(spl)-C]; however, de novo motif analysis revealed a novel site strongly enriched in Insv peaks (TCYAATHRGAA). We validate binding of endogenous Insv to genomic regions bearing such sites, whose associated genes are enriched for neural functions and are functionally repressed by Insv. Unexpectedly, we found that the Insv BEN domain binds specifically to this sequence motif and that Insv directly regulates transcription via this motif. We determined the crystal structure of the BEN–DNA target complex, revealing homodimeric binding of the BEN domain and extensive nucleotide contacts via a helices and a C-terminal loop. Point mutations in key DNA-contacting residues severely impair DNA binding in vitro and capacity for transcriptional regulation in vivo. We further demonstrate DNA-binding and repression activities by the mammalian neural BEN-solo protein BEND5. Altogether, we define novel DNA-binding activity in a conserved family of transcriptional repressors, opening a molecular window on this extensive gene family. [Keywords: BEN domain; neurogenesis; transcriptional repression] Supplemental material is available for this article. Received January 3, 2013; revised version accepted February 6, 2013. Sequence-specific transcription factors coordinate and annotated DNA-binding proteins that have yet to be drive all programs of development and physiology, and characterized, but perhaps some of them may represent a few dozen families of sequence-specific DNA-binding target sites of proteins whose DNA-binding activity is not folds have been characterized to date (Yusuf et al. 2012). evident from a primary sequence. With the passing years, the recognition of novel DNA- The Drosophila peripheral nervous system has been an binding domains has become increasingly infrequent. In exemplary model system for studying the sequential particular, the latest DNA-binding domains were recog- adoption of cell fates during development (Lai and Orgogozo nized from bacterial or fungal species (Boch et al. 2009; 2004). The capacity for neurogenesis within the epider- Moscou and Bogdanove 2009; Wiedenheft et al. 2009; mis is conferred by the spatially patterned expression and Lohse et al. 2010), with few new domains described in activity of proneural basic helix–loop–helix (bHLH) acti- metazoans. Nevertheless, the observation that many un- vator proteins in proneural clusters. For example, the characterized sequence motifs emerge from multigenome bHLH factors Achaete and Scute are responsible for ini- alignments and genome-wide data collection efforts tiating the development of the several hundred external (Stark et al. 2007; Lindblad-Toh et al. 2011; Neph et al. mechanosensory organs that decorate the body surface. 2012) indicates that many binding sites remain to be con- Local cell interactions mediated by the Notch signaling nected to their cognate transcription factors. Undoubtedly, pathway subsequently restrict neural potential to indi- many of these correspond to sites for the hundreds of vidual sensory organ precursors (SOPs), which then ex- ecute a stereotyped lineage to generate the cells of a 3These authors contributed equally to this work. mature sensory organ comprised of a neuron and several 4Corresponding author nonneural accessory cells (Lai 2004). Each cell division in E-mail [email protected] Article published online ahead of print. Article and publication date are the neural lineages is asymmetric and driven by Notch online at http://www.genesdev.org/cgi/doi/10.1101/gad.213314.113. signaling such that loss-of-function and gain-of-function 602 GENES & DEVELOPMENT 27:602–614 Ó 2013 by Cold Spring Harbor Laboratory Press ISSN 0890-9369/13; www.genesdev.org Downloaded from genesdev.cshlp.org on September 25, 2021 - Published by Cold Spring Harbor Laboratory Press Function and structure of DNA-binding BEN domains of the Notch pathway can induce reciprocal, symmetric Insv to multiple Notch/Su(H)-regulated targets resident cell fate outcomes. In particular, the neuron is the ulti- in the E(spl)-C, including both bHLH repressor and Bearded mate product of multiple precursor cells that successfully family genes (Duan et al. 2011). Insv mutants exhibit mild avoid activating Notch signaling. sensory organ loss and shaft-to-socket transformations A new component in Drosophila neurogenesis emerged within the sensory lineage, phenotypes that are exacer- from the recovery of Insensitive (Insv) in a molecular bated by elevation of haploinsufficiency of the charac- screen for genes specifically expressed in proneural clus- terized Su(H) corepressor Hairless. ters (Reeves and Posakony 2005). The only recognizable To gain insight into regulatory targets of Insv genome- domain on Insv is a BEN domain. Named after instances wide, we performed Insv ChIP sequencing (ChIP-seq) us- in the BANP, E5R, and NAC1 proteins, this bioinformat- ing 2.5- to 6.5-h and 6.5- to 12-h embryos. These data ically recognized motif is widely distributed among meta- revealed 5364 and 2390 peaks, respectively, relative to zoans and viruses (Abhiman et al. 2008). Some BEN control IgG ChIP-seq (Supplemental Table 1). The spec- proteins contain other motifs linked to chromatin func- ificity of Insv binding was well-illustrated at the E(spl)-C, tions, but Insv is the prototype of the ‘‘BEN-solo’’ family where Insv peaks were concentrated across the upstream that lacks other recognizable protein domains. We re- regions of most of these Notch-regulated genes (Fig. 1A), cently showed that Insv is a nuclear factor that promotes whereas binding of Insv to the flanking hundreds of peripheral neurogenesis and inhibits Notch signaling kilobases was sparse. Despite this, neither high-affinity (Duan et al. 2011). In particular, Insv has activity as a (YGTGRGA) nor low-affinity (RTGRGAR) consensus se- direct corepressor for the Notch pathway transcription quences for Su(H) (Nellesen et al. 1999) were enriched factor Suppressor of Hairless [Su(H)] and localizes in vivo among Insv peaks genome-wide, and the majority of re- to a number of Notch target genes located in the Enhancer gions strongly bound by Insv lacked Su(H)-binding sites of split-Complex [E(spl)-C] (Duan et al. 2011). (Supplemental Fig. 1). In this study, we examined the genomewide occupancy An example of such a region is shown in Figure 1B, of Insv during two stages of Drosophila embryogenesis. where both Insv ChIP-seq data sets reported on strong We confirmed binding of Insv to Notch target genes in the binding to a conserved genomic region located close to E(spl)-C; however, Insv bound thousands of regions that the promoter of the Notch modulator fringe. To under- did not contain consensus Su(H)-binding sites. Instead, stand the basis of chromatin association of Insv in the these regions defined a novel sequence motif that was en- absence of Su(H) sites, we performed de novo motif analysis riched among genes involved in neurogenesis, and genes using Weeder (Pavesi et al. 2001) and MEME (Bailey and with Insv occupied-motifs were broadly up-regulated in Elkan 1994), querying Insv peaks of various classes. We insv mutants. Surprisingly, we found that Insv bound to considered all bound regions, those regions that were ex- this motif directly via the BEN domain, and we went on clusive of Su(H)-binding sites, and regions exclusive of to solve the structure of the BEN domain in complex with highly occupied target (HOT) regions (Roy et al. 2010). its DNA target. Extensive mutational analysis confirmed Analysis by MEME reported a novel semipalindromic the structural details of BEN domain interactions with motif (TCYAATHRGAA) as the most highly enriched the DNA backbone and with critical base-specific con- sequence (Fig. 1C), while Weeder reported the perfect pal- tacts that explain the affinity of Insv for its cognate DNA indromic sequence (CCAATTGG). Indeed, these motifs target. Although BEN domains are only loosely related to were enriched even without having to preclude peaks con- each other, knowledge of the Insv structure allowed us to taining Su(H) sites. Genome-wide, the Insv ChIP peaks recognize other BEN proteins that share its base-specific bearing the TCYAATHRGAA motif were preferentially amino acids. In particular, we show that mammalian located in proximity to transcriptional start sites in both BEND5 is also a sequence-specific transcriptional repres- data sets (Fig. 1D; Supplemental Fig. 2). In addition, this sor that regulates neurogenesis, despite very limited over- motif was preferentially enriched among more strongly all homology with Insv. In summary, our work defines a bound sites in both ChIP-seq data sets, supporting its novel conserved family of metazoan transcription factors biological relevance (Fig. 1E; Supplemental Fig. 1). and provides direction to the study of the other >100 an- We confirmed in vivo binding to these genomic regions notated BEN domain proteins (Abhiman et al. 2008). by endogenous Insv. We selected Insv ChIP-seq peaks from repo, vestigial, mir-263a, fringe, grainyhead, ham- let, tramtrack, Syt1, Dlic2, Chinmo, and fne (found in Results neurons) for verification by ChIP-qPCR. All of these are genes involved in nervous system development, and none Genome-wide analysis reveals a novel DNA motif of their Insv peaks contained even low-affinity Su(H) sites.

The BEN Domain Is a Novel Sequence- Specific DNA-Binding Domain Conserved in Neural Transcriptional Repressors

Representation for Discovery of Protein Motifs

Introduction to Sequence Motif Discovery and Search the Complete

Human Induced Pluripotent Stem Cell–Derived Podocytes Mature Into Vascularized Glomeruli Upon Experimental Transplantation

Sharmin Supple Legend 150706

Tools for Motif and Pattern Searching

Degsampler: Gibbs Sampling Strategy for Predicting E3-Binding Sites with Position-Specific Prior Information

Bioinformatics: a Practical Guide to the Analysis of Genes and Proteins, Second Edition Andreas D

Sequence Conservation

Evolutionary Analysis of Viral Sequences in Eukaryotic Genomes

Alignment, Clustering and Extraction of Structured Motifs in DNA Promoter Sequences

ER-Resident Transcription Factor Nrf1 Regulates Proteasome Expression and Beyond

Lecture 7: Sequence Motif Discovery