Helix–Loop–Helix Proteins and the Advent of Cellular Diversity: 30 Years of Discovery

Helix–Loop–Helix Proteins and the Advent of Cellular Diversity: 30 Years of Discovery

Downloaded from genesdev.cshlp.org on October 4, 2021 - Published by Cold Spring Harbor Laboratory Press REVIEW Helix–loop–helix proteins and the advent of cellular diversity: 30 years of discovery Cornelis Murre Division of Biological Sciences, University of California at San Diego, La Jolla, California 92903, USA Helix–loop–helix (HLH) proteins are dimeric transcrip- In a parallel study, two proteins (E12 and E47) were iden- tion factors that control lineage- and developmental-spe- tified. Both proteins bound to an E2-box site located in the cific gene programs. Genes encoding for HLH proteins intronic immunoglobulin light chain gene enhancer but arose in unicellular organisms >600 million years ago differed from each other in the c-myc homology region and then duplicated and diversified from ancestral genes (Murre et al. 1989a). Further investigation revealed that across the metazoan and plant kingdoms to establish E12 and E47 were generated by differential splicing of ex- multicellularity. Hundreds of HLH proteins have been ons encoding for the DNA-binding domain (Murre et al. identified with diverse functions in a wide variety of cell 1989a; Yamazaki et al. 2018). This variation contributed types. HLH proteins orchestrate lineage specification, to their unique affinities: E12 bound weakly to DNA, commitment, self-renewal, proliferation, differentiation, whereas E47 bound DNA with high affinity (Murre et al. and homing. HLH proteins also regulate circadian clocks, 1989a; Sun and Baltimore 1991). Helical wheel analysis protect against hypoxic stress, promote antigen receptor of both DNA-binding domains revealed two highly con- locus assembly, and program transdifferentiation. HLH served amphipathic α helices separated by a loop domain. proteins deposit or erase epigenetic marks, activate non- The conservation of the helices led to the hypothesis that coding transcription, and sequester chromatin remodelers the helix–loop–helix (HLH) motif functioned as a dimeri- across the chromatin landscape to dictate enhancer–pro- zation domain (Murre et al. 1989a). Indeed, simple mixing moter communication and somatic recombination. Here and binding experiments showed that the HLH region me- the evolution of HLH genes, the structures of HLH do- diated homodimerization and heterodimerization (Murre mains, and the elaborate activities of HLH proteins in et al. 1989a,b). Further analysis uncovered that, adjacent multicellular life are discussed. to the HLH region, a cluster of conserved basic amino ac- ids mediated DNA binding (Davis et al. 1990). Hence, the name basic HLH (bHLH) proteins was established. More than 30 years ago, a group of DNA elements sharing conserved sequences was identified within the enhancers of the immunoglobulin heavy and light chain loci (Church et al. 1985; Ephrussi et al. 1985). In a series of elegant ex- Characterization of HLH proteins periments, these distinct DNA segments, named E-box Thirty years have passed since those initial findings, and sites, were found to be protected from in vivo methylation the field has witnessed the identification of ever-increas- in B-lineage cells. The E-box sites contained a signature ing additional HLH proteins. Based on expression patterns, core of six nucleotides: CANNTG. dimerization selectivities, and DNA-binding specificities, Simultaneously, MyoD was identified. MyoD was an the HLH proteins were segregated into distinct classes eye-opener with the remarkable ability to convert fibro- (Murre et al. 1989b). Class I HLH proteins, now categorized blasts into myoblasts. MyoD was the first example of as E proteins, include E12, E47, E2-2, HEB, daughterless programmed transdifferentiation. MyoD induced the ex- (Drosophila melanogaster), and HLH-2 (Caenorhabditis pression of hundreds of genes associated with skeletal elegans). E proteins are abundantly expressed in many lin- muscle cell identity (Davis et al. 1987). Analysis of its pri- eages and bind DNA as either homodimers or hetero- mary amino acid sequence revealed that MyoD shared a dimers. Class II HLH proteins include MyoD, myogenin, region of similarity with the myc family of proteins. The SCL, NeuroD1, NeuroD2, and members of the achaete– shared feature was called the c-myc homology region scute complex. This subset of HLH proteins binds DNA (Tapscott et al. 1988). as either homodimers or heterodimers with E proteins [Keywords: E proteins; HLH; hematopoiesis; Id proteins; myogenesis; © 2019 Murre This article is distributed exclusively by Cold Spring Har- neurogenesis; programming differentiation; somitogenesis; phylogeny; bor Laboratory Press for the first six months after the full-issue publication structure] date (see http://genesdev.cshlp.org/site/misc/terms.xhtml). After six Corresponding author: [email protected] months, it is available under a Creative Commons License (Attribution- Article is online at http://www.genesdev.org/cgi/doi/10.1101/gad.320663. NonCommercial 4.0 International), as described at http://creativecom- 118. mons.org/licenses/by-nc/4.0/. 6 GENES & DEVELOPMENT 33:6–25 Published by Cold Spring Harbor Laboratory Press; ISSN 0890-9369/19; www.genesdev.org Downloaded from genesdev.cshlp.org on October 4, 2021 - Published by Cold Spring Harbor Laboratory Press HLH proteins and the advent of cellular diversity and is lineage-restricted. Class III proteins include domain as well as a basic region that contacted residues c-myc, TFE3, sterol regulatory element-binding protein in the major groove (Fig. 1; Ellenberger et al. 1994; Ma et 1a (SREBP-1), and Mi. In addition to the HLH domain, al. 1994). More specifically, E47 folded as a parallel four- they contain a leucine zipper dimerization domain and helix bundle that intertwined, positioning the basic region function as either transcriptional activators or repressors in contact with each half of the E-box site (CAC or CAG). (Carroll et al. 2018). The majority of yeast and plant HLH Two highly conserved amino acids—a glutamate and an proteins belong to this class of HLH proteins. Class IV arginine residue in the E47 DNA-binding region—mediat- HLH proteins include Mad, Max, and Mxi. They form het- ed DNA-binding specificity. The glutamate residue erodimers with c-myc and among each other to bind a dis- directly contacted CA nucleotides, while arginine residues tinct set of E-box sites (CACGTG or CATGTTG) (Carroll stabilized DNA binding. Dimerization united the two am- et al. 2018). Class V HLH proteins contain a HLH domain phipathic helices common among all HLH proteins. The but lack a basic region. Upon interaction with bHLH pro- highly conserved hydrophobic interactions were stabilized teins, they antagonize the DNA-binding activity of their through numerous van der Waals interactions, which facil- bHLH partners (Benezra et al. 1990). Prominent members itated dimerization. are the Id proteins Id1–4 and the D. melanogaster gene The crystal structures of class III and class IV HLH do- product Extramacrochaete (Emc). Class VI HLH proteins mains and leucine zippers have also been elucidated. are characterized by the presence of a proline residue in These classes bind DNA as quasisymmetric heterodimers, their basic region. They include Hes and enhancer of split stabilized by hydrophobic and polar interactions that in- proteins. Class VI HLH proteins recognize a unique DNA volve helix I and helix II as well as residues continuous sequence, named the N box (CACGCG or CACGAG), with helix II located in the leucine zipper. The crystal and function predominantly as transcriptional repressors structure of the Myc–Max heterodimer was particularly by binding to Groucho (Paroush et al. 1994). Finally, class informative (Nair and Burley 2003). Myc and Max fold as VII HLH proteins display as their defining feature multiple two heterodimers that are oriented in a head-to-tail config- PAS domains that function as signaling sensors to light uration, resulting in an anti-parallel four-helix bundle (Fig. and oxygen. They include members of the HLH proteins 1; Nair and Burley 2003). The SREBP-1a, which modulates that control the circadian clock (BMAL and CLOCK) and cholesterol metabolism, also contains an HLH domain a response pathway to hypoxia (hypoxia-inducible factor that is continuous with a leucine zipper region (Fig. 1). α [HIF-α] and aryl hydrocarbon nuclear translocator The SREBP-1a domain is unusual, as it interacts with an [ARNT]) (Wang et al. 1995; Lowrey and Takahashi 2004; asymmetric binding site. The asymmetric binding site rec- Huang et al. 2012). This class binds predominantly ognition is instructed by a tyrosine rather than an arginine ACGTG or GCGTG core sequences and functions as tran- residue in the basic region, leading to a loss of polar inter- scriptional repressors. actions between the basic region and the major groove (Parraga et al. 1998). Of particular interest are class VII HLH proteins that The structure of the HLH domain contain PAS domains. Prominent among them are CLOCK and BMAL1, which contain ACGTG or GCGTG The modeling suggested that the E-protein DNA-binding core sequences in their binding sites. The crystal structure domain was folded into an HLH configuration. Detailed of the CLOCK/BMAL heterodimer revealed an unusual structural analysis soon followed. It validated the mo- asymmetric heterodimer in which the HLH and PAS do- deling. Crystal structures of E47 and MyoD HLH mains are intertwined to promote heterodimerization homodimers revealed two α helices connected by a loop (Fig. 1; Huang et al. 2012). The HIFα–ARNT factors also

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    21 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us