Enhancers and Super-Enhancers Have an Equivalent Regulatory Role in Embryonic Stem Cells Through Regulation of Single Or Multiple Genes
Total Page:16
File Type:pdf, Size:1020Kb
Downloaded from genome.cshlp.org on October 6, 2021 - Published by Cold Spring Harbor Laboratory Press Enhancers and super-enhancers have an equivalent regulatory role in embryonic stem cells through regulation of single or multiple genes Authors: Sakthi D Moorthy*1, Scott Davidson*1, Virlana M Shchuka1, Gurdeep Singh1, Nakisa Malek- Gilani1, Lida Langroudi1, Alexandre Martchenko1, Vincent So1, Neil N Macpherson1, and Jennifer A Mitchell1,2. * These authors contributed equally 1) Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada. 2) Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, ON, M5S 3G5, Canada. Contact: Jennifer A Mitchell, Department of Cell and Systems Biology, University of Toronto, 25 Harbord Street, Toronto, Ontario M5S 3G5, Canada. Phone: (416)978-6711, Fax: (416)978-8532, Email: [email protected]. Running Title: Enhancer and super-enhancer equivalence Keywords: Transcription, Enhancer Elements, Super-enhancers, Embryonic Stem Cells, Gene Expression Regulation. 1 Downloaded from genome.cshlp.org on October 6, 2021 - Published by Cold Spring Harbor Laboratory Press ABSTRACT Transcriptional enhancers are critical for maintaining cell type-specific gene expression and driving cell fate changes during development. Highly transcribed genes are often associated with a cluster of individual enhancers such as those found in locus control regions. Recently these have been termed stretch enhancers or super-enhancers, which have been predicted to regulate critical cell identity genes. We employed a CRISPR/Cas9-mediated deletion approach to study the function of several enhancer clusters (ECs) and isolated enhancers in mouse embryonic stem (ES) cells. Our results reveal that the effect of deleting ECs, also classified as ES cell super-enhancers, is highly variable, resulting in target gene expression reductions ranging from 12% to as much as 92%. Partial deletions of these ECs which removed only one enhancer or a sub-cluster of enhancers revealed partially redundant control of the regulated gene by multiple enhancers within the larger cluster. Many highly transcribed genes in ES cells are not associated with a super-enhancer; furthermore, super-enhancer predictions ignore 81% of the potentially active regulatory elements predicted by co-binding of 5 or more pluripotency- associated transcription factors. Deletion of these additional enhancer regions revealed their robust regulatory role in gene transcription. In addition, select super-enhancers and enhancers were identified which regulated clusters of paralogous genes. We conclude that whereas robust transcriptional output can be achieved by an isolated enhancer, clusters of enhancers acting on a common target gene act in a partially redundant manner to fine tune transcriptional output of their target genes. INTRODUCTION Distal regulatory elements are critical in establishing and maintaining tissue-specific transcriptional regulation of gene expression and are central in controlling cell identity. Furthermore, genome-wide association studies identifying disease and phenotypic trait- associated variants have found that the majority of these variants are within non-coding regions of the genome, suggesting regulatory activity (Maurano et al., 2012; Schaub et al., 2012). Further supporting this hypothesis are the observations that these non-coding regions overlap DNase I hypersensitive sites (DHSs) found in accessible chromatin and that the disease- 2 Downloaded from genome.cshlp.org on October 6, 2021 - Published by Cold Spring Harbor Laboratory Press associated variants often disrupt existing or create new transcription factor binding motifs. Alterations in regulatory DNA by large deletion or single mutation are known to cause disease and phenotypic alterations. For example, deletion of the hemoglobin subunit beta (HBB) locus control region (LCR) 50 kb upstream of the HBB gene causes β-thalassemia due to the inability of erythroid cells to produce mature globin (Kioussis et al., 1983; Tuan et al., 1989). As these critical regulatory sequences can be located tens to hundreds of kb away from the gene or genes they regulate, it can be challenging to identify regulatory elements and their target genes. A striking example of this is the limb enhancer that regulates the SHH (sonic hedgehog) gene. This enhancer is located 1 Mb away from the SHH gene in the intron of the LMBR1 gene and point mutations in this enhancer are associated with preaxial polydactyly in humans (Lettice et al., 2003). These findings highlight the necessity for a mechanistic understanding of the distal regulatory elements that regulate gene transcription. Several approaches have been used to identify transcriptional enhancers including: sequence based approaches that rely on transcription factor motif identification, sequence conservation through evolution, co-activator binding (EP300 and Mediator), histone modifications, open chromatin and approaches that combine multiple such features (Ballester et al., 2014; Chen et al., 2012; Ernst and Kellis, 2012; Hallikas et al., 2006; Heintzman et al., 2007; Hoffman et al., 2012; Pennacchio et al., 2006; Visel et al., 2009). Despite these numerous efforts and the wealth of data generated by the ENCODE Project (Bernstein et al., 2012), only a fraction (26%) of enhancer predictions display enhancer activity in reporter assays (Kwasnieski et al., 2014). Mid- to high-throughput screening methods such as MPRA, STARR-seq, and FIREWACh, facilitate testing the activity of enhancers genome-wide in transiently, or in the case of FIREWACh stably, transfected cell lines (Arnold et al., 2013; Melnikov et al., 2012; Murtha et al., 2014); however, these assays do not test activity in the endogenous genomic context and therefore do not identify the regulated gene. Recently the concept of super-enhancers and stretch enhancers was proposed and these regions were predicted to regulate cell identity genes and to confer higher expression on their target genes compared to regions termed typical enhancers (Hnisz et al., 2013; Loven et al., 2013; Parker et al., 2013; Whyte et al., 2013). Stretch enhancers are regions of ≥3 kb which 3 Downloaded from genome.cshlp.org on October 6, 2021 - Published by Cold Spring Harbor Laboratory Press exhibit an enhancer chromatin state based on the ChromHMM algorithm (Ernst et al., 2011; Parker et al., 2013). Super-enhancers have been predicted in numerous cell types where these regions have been identified by increased binding of the mediator protein MED1, enrichment of histone H3 K27 acetylation (H3K27ac), or binding of cell type-specific transcription factors (Khan and Zhang, 2016). The super-enhancer prediction involves bioinformatically stitching regions within 12.5 kb of one another in the linear genome; as a result super-enhancers are on average longer than traditional enhancers with a median size of 8.7 kb, and often contain more than one separate region which is bound by multiple transcription factors (Hnisz et al., 2013; Loven et al., 2013; Whyte et al., 2013). Although super-enhancers have been predicted in numerous cell types, few have been functionally investigated by deletion and it remains unclear whether these predicted super-enhancers or stretch enhancers represent a novel paradigm in gene regulation separate from the traditional concept of enhancers (Pott and Lieb, 2014). In fact the two classic robust distal regulatory elements, the LCRs of the Hbb and hemoglobin subunit alpha (Hba) genes, are strikingly similar to super-enhancers; they contain multiple DHSs or are enhancer clusters (ECs), with each enhancer bound by multiple transcription factors (Chen et al., 1997; Grosveld et al., 1987; Higgs et al., 1990; Mitchell et al., 2012; Tuan et al., 1989). Furthermore, both the Hbb and Hba LCRs were recently determined to fit the bioinformatic criteria for super-enhancers (Hay et al., 2016). The transcriptional regulatory networks that control cell identity of embryonic stem (ES) cells have been intensely studied making mouse ES cells an ideal model for the study of transcriptional regulation. This knowledge has led to the development of numerous reprogramming strategies to generate induced pluripotent stem (iPS) cells in multiple species. Despite this knowledge, we lack a clear understanding of where in the genome the critical regulatory elements regulating pluripotency lie and how those elements function in their endogenous context. Advances in genome editing technology have provided the opportunity to characterize this regulatory landscape with greater fidelity. One example of this approach applied to ES cells is deletion of the enhancers surrounding the Sox2 gene. Two studies revealed that a cluster of predicted enhancer regions 100 kb downstream of the Sox2 gene, termed the Sox2 control region (SCR), are required for Sox2 transcription in mouse ES cells (Li et al., 2014; 4 Downloaded from genome.cshlp.org on October 6, 2021 - Published by Cold Spring Harbor Laboratory Press Zhou et al., 2014). The SCR overlaps a predicted super-enhancer and was found to be required for >85% of the transcript levels of Sox2 in ES cells. Similarly, other studies have revealed the role of select super-enhancers in gene regulation (Hay et al., 2016; Hnisz et al., 2015; Huang et al., 2016; Mansour et al., 2014; Shin et al., 2016). At the Sox2 locus an additional super- enhancer was predicted surrounding the Sox2 gene; however, deletion of the enhancers within that region