bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1
2
3 L1 and B1 repeats blueprint the spatial organization of chromatin
4
5 J. Yuyang Lu1#, Lei Chang2#, Tong Li1#, Ting Wang1,3, Yafei Yin1,3, Ge Zhan1,3, Ke Zhang1,3,
6 Michelle Percharde4, Liang Wang1, Qi Peng1, Pixi Yan1, Hui Zhang1, Xue Han1, Xianju Bi1, Wen
7 Shao1, Yantao Hong1, Zhongyang Wu1, Peizhe Wang1, Wenzhi Li1, Jing Zhang1, Zai Chang1,
8 Yingping Hou2, Pilong Li1, Miguel Ramalho-Santos5, Jie Na1, Wei Xie1, Yujie Sun2*, Xiaohua
9 Shen1*
10
11 1Tsinghua Center for Life Sciences, School of Medicine and School of Life Sciences, Tsinghua
12 University, Beijing 100084, China
13 2 State Key Laboratory of Membrane Biology, School of Life Sciences, and Biomedical
14 Pioneering Innovation Center (BIOPIC), Peking University, Beijing 100871, China
15 3 These authors contributed equally to this work.
16 4 MRC LMS & Imperial College London, London W12 0NN, UK
17 5 Lunenfeld-Tanenbaum Research Institute, University of Toronto, Ontario M5T 3H7, Canada
18 # These authors share the co-first authorship.
19
20 * Leading correspondence should be addressed to X.S. ([email protected]).
21 Correspondence may also be addressed to Y.S. ([email protected]).
22
23
24
25 Keywords: retrotransposons, L1 repeats, B1/Alu repeats, higher-order chromatin structure 1
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1
2
3 SUMMARY
4 Despite extensive mapping of three-dimensional (3D) chromatin structures, the basic
5 principles underlying genome folding remain unknown. Here, we report a fundamental role
6 for L1 and B1 retrotransposons in shaping the macroscopic 3D genome structure. Homotypic
7 clustering of B1 and L1 repeats in the nuclear interior or at the nuclear and nucleolar
8 peripheries, respectively, segregates the genome into mutually exclusive nuclear
9 compartments. This spatial segregation of L1 and B1 is conserved in mouse and human cells,
10 and occurs dynamically during establishment of the 3D chromatin structure in early
11 embryogenesis and the cell cycle. Depletion of L1 transcripts drastically disrupts the spatial
12 distributions of L1- and B1-rich compartments. L1 transcripts are strongly associated with
13 L1 DNA sequences and induce phase separation of the heterochromatin protein HP1α. Our
14 results suggest that genomic repeats act as the blueprint of chromatin macrostructure, thus
15 explaining the conserved higher-order structure of chromatin across mammalian cells.
16
17
2
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 INTRODUCTION
2 The mammalian genome is folded nearly one million-fold within the nucleus of a cell
3 (Annunziato, 2008). Hierarchical organization of genome folding at different scales is central to
4 gene expression, cellular function and development, and alterations in genome organization often
5 lead to disease (Dekker et al., 2017; Zane et al., 2014). Microscopic and 3C-based approaches
6 reveal that chromatin forms a "fractal globule" with compartments, domains, and loop interactions
7 (Boettiger et al., 2016; Dekker et al., 2002; Mirny, 2011; Wang et al., 2016). At the megabase scale,
8 chromatin is subdivided into two spatially segregated compartments, arbitrarily labeled as A and B,
9 with distinct transcriptional activity (Lieberman-Aiden et al., 2009). The A and B compartments
10 also differ in other features, including gene density, CpG frequency, histone modifications, and
11 DNA replication timing (Bonev and Cavalli, 2016; Dekker et al., 2013; Gibcus and Dekker, 2013;
12 Rivera-Mulia and Gilbert, 2016). Simulation of three-dimensional (3D) genome structures
13 predicted that the euchromatic A compartment adopts a central position, whereas the
14 heterochromatic B compartment moves towards the nuclear periphery and perinucleolar regions
15 (Buchwalter et al., 2019; Stevens et al., 2017). Within compartments at the kilobase-to-megabase
16 scale, chromatin is organized in topologically associated domains (TADs), which serve as
17 functional platforms for physical interactions between co-regulated genes and regulatory elements
18 (Dixon et al., 2012). At a finer scale, TADs are divided into smaller loop domains, in which distal
19 regulatory elements such as enhancers come into direct contact with their target genes via
20 chromatin loops (Kadauke and Blobel, 2009).
21 The organization of the nuclear genome into euchromatin and heterochromatin, which closely
22 match the A and B compartments, respectively, appears to be conserved from ciliates to humans
23 and has been maintained in eukaryotes over 500 million years of evolution (Solovei et al., 2016).
24 Intriguingly, most A/B compartments and TADs seem to be invariant in different mouse and human
25 cell types or during cell-fate transition, whereas sub-TAD loops are more variable to facilitate
26 differential gene expression (Dixon et al., 2015; Stadhouders et al., 2018). The insulator protein 3
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 CTCF (CCCTC-binding factor) and the ring-shaped cohesin complex, considered as the weavers
2 of genome organization, co-localize at the boundaries and anchor regions of contact domains and
3 loops (Rao et al., 2014; Splinter et al., 2006). However, depletion of CTCF or cohesin disrupts
4 some sub-TAD loop contacts, but fails to affect the formation and maintenance of A/B
5 compartments in cells (Bintu et al., 2018; Gassler et al., 2017; Haarhuis et al., 2017; Nora et al.,
6 2017; Nuebler et al., 2018; Rao et al., 2017; Schwarzer et al., 2017; Wutz et al., 2017). The direct
7 molecular drivers, other than CTCF and cohesin, that govern compartmental domains are yet to be
8 defined. A few mechanisms have been proposed for the formation of nuclear compartments, such
9 as anchoring heterochromatin to the nuclear lamina (Falk et al., 2019; Guelen et al., 2008; van
10 Steensel and Belmont, 2017; Wijchers et al., 2015; Zullo et al., 2012), preferential attraction of
11 chromatin harboring similar histone modifications and regulators (Jost et al., 2014; Strom et al.,
12 2017; Wang et al., 2016), and hypothetical models involving pairing of homologous sequences
13 mediated by active transcription or phase separation (Cook, 2002; Cook and Marenduzzo, 2018;
14 Rowley et al., 2017; Tang, 2017; van de Werken et al., 2017). A recent study based on imaging and
15 polymer simulations proposes that interactions between heterochromatic regions are crucial for
16 compartmentalization of the genome in inverted and conventional nuclei (Falk et al., 2019).
17 However, the molecular basis that drives heterochromatin compartmentalization and segregation
18 from active domain remains unclear (Bouwman and de Laat, 2015; Pueschel et al., 2016).
19 Repetitive elements comprise about half of human and mouse genomes (Biemont, 2010).
20 Once regarded as genomic parasites (Orgel and Crick, 1980), retrotransposons have been recently
21 implicated in playing active roles in re-wiring the genome and gene expression programs in diverse
22 biological processes (Bourque et al., 2008; Chuong et al., 2016; Durruthy-Durruthy et al., 2016;
23 Grow et al., 2015; Liu et al., 2018; Lynch et al., 2011; Rebollo et al., 2012; Wang et al., 2014).
24 Repetitive DNA is speculated to have a role in nuclear organization (Falk et al., 2019; Solovei et
25 al., 2016; Tang, 2017). Considering the diverse nature and widespread distribution of repeats in
26 mammalian genomes (Sundaram and Wang, 2018), it is imperative to determine whether and how 4
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 repetitive sequences are involved in this process. Long and short interspersed nuclear elements
2 (LINEs and SINEs, respectively) are the two predominant subfamilies of retrotransposons in most
3 mammals (Mandal and Kazazian, 2008). L1 (also named as LINE1 or LINE-1) is the most abundant
4 subclass of LINEs, and also constitutes the most abundant type of all repeat subclasses, making up
5 to 18.8% and 16.9% (0.9~1.0 million copies) of the genome in mouse and human, respectively
6 (Taylor et al., 2013). B1 in mouse and its closely related, primate-specific Alu elements in human
7 are the most abundant subclass of SINEs, constituting 2.7%~10.6% (0.6~1.3 million copies) of
8 mouse and human genomes (Lander et al., 2001; Mouse Genome Sequencing et al., 2002). L1 and
9 B1/Alu have distinct nucleotide compositions and sequence lengths. While Alu elements are ~300
10 bp long and rich in G and C nucleotides, L1 elements are 6–7 kb long and AT-rich (Jurka et al.,
11 2004). A previous study of metaphase chromosome banding showed roughly inverse distributions
12 of L1 and Alu elements in chromosomal regions with distinct biochemical properties (Korenberg
13 and Rykowski, 1988). Further studies reported that Alu/B1 elements are preferentially enriched in
14 regions that are generally gene-rich, whereas L1 elements are enriched in gene-poor regions, which
15 tend to be located in lamina-associated heterochromatin (Deininger, 2011; Meuleman et al., 2013;
16 Wijchers et al., 2015). However, systematic mapping of L1 and B1 elements in the genome and the
17 nuclear space is still lacking.
18 Here, we report that L1 and B1/Alu repeats harbor the basic structural information in their
19 DNA sequences to instruct the 3D organization of mammalian genomes. Co-segregation of B1/Alu
20 and L1 repeats with the A and B compartments, respectively, is highly conserved in different mouse
21 and human cell types. B1/Alu-rich regions cluster in the interior of the nucleus, while L1-rich
22 regions cluster at the nuclear and nucleolar periphery. This pattern of homotypic clustering and
23 segregation is dynamically established in the cell cycle and in early embryogenesis. L1 RNA is
24 critically required for the formation and maintenance of compartment segregation and the higher-
25 order chromatin structure. Most likely, L1 RNA transacts the structural information in L1 DNA
26 repeats by acting upon its own DNA sequences to promote phase separation of HP1, thereby 5
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 inducing heterochromatin formation. Our findings uncover a fundamental principle of 3D genome
2 organization that is rooted in genomic DNA sequences, and also provide an answer to the
3 perplexing phenomenon of the conservation and robustness of chromatin compartmentalization
4 across mammalian species.
5
6 RESULTS
7 Homotypic clustering of B1 and L1 repeats segregates the A/B compartments
8 We analyzed the genomic positions of the major repeat subfamilies and observed positive
9 correlations within the L1 and SINE B1 subfamilies, but strong inverse correlations between them
10 (Figure S1A). This observation suggests that L1 and SINE B1 elements tend to be positioned away
11 from each other in the genome, while repeats from the same subfamily tend to be clustered. The
12 non-random positioning of repeat sequences in the genome prompted us to examine their relative
13 distributions in the A/B compartments. Analysis of Hi-C data in mouse embryonic stem cells
14 (mESCs) (Fraser et al., 2015) showed that dense L1 and B1 repeats appear to reside in distinct and
15 mutually exclusive compartments across the mouse genome, whereas other types of
16 retrotransposons such as ERV1, ERVK, and L2 tend to be randomly distributed (Figures 1A and
17 S1B). The compartments marked by B1 repeats show enrichment of active histone marks
18 (H3K4me3, H3K9ac, H3K27ac, H3K36me3), strong binding of RNA polymerase II (Pol II), and
19 high levels of chromatin accessibility and transcription activity. In contrast, the compartments
20 marked by L1 repeats show signatures of heterochromatin, including enrichment of the repressive
21 H3K9me2 and H3K9me3 marks, and strong binding of heterochromatin proteins such as HP1
22 and the nuclear corepressor KRAB-associated protein-1 (KAP1 or TRIM28) (Figures 1A and S1B).
23 Quantitative sequence analysis of annotated A/B compartments revealed high levels of SINE
24 repeats (including B1, B2 and B4) in the A compartments and L1 repeats in the B compartments,
25 whereas repeats in other subclasses showed no specific enrichment (Figures S1C, D). This
26 opposing distribution pattern of B1/Alu and L1 in the A and B compartments was also detected in 6
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 five other cell types (Figure 1B): human embryonic stem cells (hESCs), lymphoblastoid cells
2 (GM12878) and erythroleukemic cells (K562), and mouse neural progenitor cells (NPCs) and
3 neurons (Fraser et al., 2015; Rao et al., 2014). Furthermore, unsupervised clustering revealed that
4 the genomic positions of A/B compartments in six distinct mouse and human cell types are highly
5 similar to each other, with an average Spearman correlation coefficient >0.73 within species
6 and >0.52 between species (Figure S1E). Compared to other subclass repeats, L1 and B1 are most
7 strongly related to the high-order chromatin structures, and their distributions appear to be
8 conserved in homologous regions of the mouse and human genomes (Figure S1E). For example, a
9 region in mouse chromosome 2 (chr2: 120-140 Mb) and its syntenic region in human chromosome
10 20 (chr20: 6-50 Mb) show similar patterns of Hi-C contact probabilities, and gene and repeat
11 compositions and distributions in the corresponding A and B compartments along the DNA
12 sequences (Figure 1C). Thus, co-segregation of B1/Alu and L1 repeats with the A and B
13 compartments appears to be invariable in different cell types in mouse and human.
14 We then took mouse chromosome 17 (chr17, 95 Mb in length) as an example to overlay L1
15 and B1 repeat features on the Hi-C interaction matrix of mESCs (Figures 1D-F). Interestingly, the
16 plaid pattern of enriched and depleted interaction blocks in the Hi-C map is largely correlated with
17 the compositions and distributions of B1 and L1 along the whole chr17 (Figure 1D). In a 42-Mb
18 region of chr17, four L1-enriched B compartments (denoted by c, e, g and i) and three B1-enriched
19 A compartments (denoted by D, F and H) are alternately positioned along the linear DNA sequence
20 (Figure 1E). Strong interactions were observed between L1-enriched compartments (represented
21 by ce, cg, eg, ei and gi) or B1-enriched compartments (represented by DF, DH and FH), but not
22 between these two compartments (Figure 1E, dotted boxes). The interaction frequencies between
23 D and F (DF) or between c and e (ce) are much stronger than those of D or F with c or e (cD, De,
24 eF), despite the fact that these regions are closer in the linear sequence.
25 Conversion of Hi-C contact frequencies into Pearson correlation coefficients sharpened our
26 view of the long-range chromatin interactions (Figure 1F). Strikingly, the plaid pattern of the Hi-C 7
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 correlation map appears to perfectly match the distribution and interaction status of L1 and B1. L1-
2 rich or B1-rich regions show strong enrichment of contacts with regions containing the same repeat
3 type (red blocks in Figure 1F). We refer to these as homotypic contacts. Contacts between regions
4 containing the other repeat type (heterotypic interactions) are strongly depleted (blue blocks in
5 Figure 1F). For example, in one region of chr17 (35 to 95 Mb), L1-enriched segments (from e to
6 u) and B1-enriched segments (from F to T) exhibit high frequencies of homotypic contacts (Figure
7 1F, highlighted by arrows), but strong depletion of heterotypic contacts. Similarly, homotypic
8 contacts between L1-rich regions or B1-rich regions were also observed between chromosomes, as
9 illustrated by chromosomes 17 and 19 (Figure 1G). These results indicate that genomic regions
10 containing B1 or L1 repeats tend to interact with genomic regions containing repeat sequences
11 from similar subfamilies, but not from different subfamilies, regardless of linear proximity. Thus,
12 homotypic clustering of regions rich in B1 or L1 repeats partitions the genome into distinct A and
13 B compartments, respectively.
14
15 DNA FISH reveals the spatial segregation of L1 and B1 compartments
16 To visualize the nuclear localization of B1 and L1 repeats in cells we performed dual-color
17 DNA fluorescence in situ hybridization (FISH) using fluorescence-tagged oligonucleotide probes
18 that specifically target the consensus sequences of B1 and L1 elements. Consistent with the Hi-C
19 analysis, L1 and B1 exhibit distinct and complementary nuclear localizations in mESCs (Figures
20 2A and 2B). While B1 DNA shows punctate signals in the nuclear interior, L1 DNA exhibits highly
21 organized and concentrated signals that line the periphery of the nucleus and nucleolus. Weak L1
22 signals were also detected in a few areas of the nuclear interior subregions where B1 signals were
23 absent. Both B1 and L1 signals are absent from DAPI-dense regions, which likely represent
24 satellite repeat-enriched chromocenters (Mateos-Langerak et al., 2007). To confirm the L1
25 localization at the nucleolar periphery in mESCs, we performed immuno-FISH, which combines
26 DNA FISH with immunofluorescence staining using an antibody against the nucleolar marker 8
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 Nucleolin (NCL). Indeed, L1 signals surround and partially overlap with the ring-shaped signals
2 of NCL at the nucleolar periphery (Figure 2C).
3 To ask whether L1 and B1 localizations might vary with cell type, we performed immuno-
4 FISH in four other representative mouse and human cell lines: neural stem cells (NSC), fibroblasts
5 (NIH3T3), myoblasts (C2C12) and Hela cells (Figure 2C). Similar to mESCs, all these cells show
6 non-overlapping localizations of B1/Alu repeats, which are present in the nuclear interior, and L1
7 repeats, which are mainly located at the nuclear and nucleolar peripheries. The distributions of
8 B1/Alu and L1 are reminiscent of the nuclear localization of euchromatin and heterochromatin,
9 respectively. Together, the results from Hi-C and imaging analysis demonstrate that homotypic
10 B1/Alu or L1 DNA sequences associate together to form larger clusters, which divide the nucleus
11 into distinct territories of A/B compartments, and nuclear co-segregation of B1/Alu and L1 repeats
12 with the A and B compartments is conserved across different cell types in mouse and human.
13
14 Dynamic establishment of L1 and B1 segregation during the cell cycle and embryonic
15 development
16 As chromatin organization undergoes dynamic changes during mitosis and after fertilization,
17 we asked whether L1 and B1 compartments are established and re-constructed in these two
18 biological processes. DNA FISH analysis of synchronous mESCs showed that L1 and B1
19 localizations change dramatically at different cell-cycle stages (Figures 3A and S2A). S-phase cells
20 show non-overlapping and complementary localizations of L1 and B1 repeats (Figures 2 and 3A).
21 This is similar to the pattern we observed previously in asynchronous mESCs, most of which are
22 in the S phase of the cell cycle (Figure S2A). However, L1 and B1 DNA signals are mixed on
23 mitotic chromosomes in metaphase (M phase, including prophase and anaphase), when the nuclear
24 membrane and nucleoli are disassembled. As the cell cycle progresses into the G1 phase, L1 and
25 B1 DNA start to segregate again. To quantify the degree of segregation, we defined a FISH-based
26 segregation index as the negative value of Pearson’s correlation coefficient of L1 and B1 DNA 9
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 signals in the nucleus. The FISH segregation index is lowest in M-phase cells, but increases
2 significantly in the G1 phase and peaks in the S phase (Figure 3B).
3 To provide further molecular evidence for segregation of repeats during the cell cycle, we
4 analyzed the published Hi-C data from cell-cycle synchronized mESCs and HeLa cells (Nagano et
5 al., 2017; Naumova et al., 2013). In both cell types, G1-phase cells exhibit a classic plaid pattern
6 of hierarchical interactions, with enriched and depleted interaction blocks outside of the diagonal
7 region of the Hi-C interaction heatmap (Figures 3C, 3D, S2B, and S2C). In contrast, M-phase cells
8 exhibit stronger signals along the diagonal, which represent the linearly organized, longitudinally
9 compressed array of consecutive chromatin loops. To quantify this difference, we defined a Hi-C-
10 based segregation index by calculating the ratio of homotypic versus heterotypic interaction
11 frequencies between L1 and B1/Alu subfamilies. Indeed, the Hi-C segregation index is
12 significantly higher in G1-phase cells than in M-phase cells (Figures 3C and S2B). These results
13 indicate that segregation of B1/Alu and L1 repeats is dispersed by mitosis, and is re-established
14 when the higher-order chromatin structure forms during each cell cycle. This finding agrees with
15 previous reports that in metaphase, chromosome folding becomes homogeneous and large
16 megabase-scale A and B compartments are lost, whilst in interphase, chromosomes return to a
17 highly compartmentalized state (Nagano et al., 2017; Naumova et al., 2013).
18 After fertilization, the chromatin undergoes extensive reprogramming from a markedly relaxed
19 state in zygotes to fully organized structures in blastocysts (Du et al., 2017; Flyamer et al., 2017;
20 Ke et al., 2017). We performed time-course DNA FISH analysis of L1 and B1 in early mouse
21 embryos (Figure 3E). L1 and B1 signals are largely overlapping in zygotes and early 2-cell embryos,
22 and become progressively more segregated during embryonic divisions (Figure 3E). A plot of the
23 FISH segregation indexes along the course of blastocyst development showed that the greatest
24 change (steepest trend-line) occurred between the zygote and the late 2-cell stage (Figure 3F). This
25 observation implies that the initiation of B1 and L1 compartmentalization may coincide with the
26 zygotic genome activation (ZGA), during which massive transcription switches on (Qiu et al., 10
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 2003). We then analyzed the published Hi-C data from early mouse embryos (Du et al., 2017).
2 Consistent with our results, we detected a gradual increase of the segregation index (Figure 3G).
3 Early 2-cell embryos exhibit prevalent cis-chromosomal contacts along the diagonal of the Hi-C
4 interaction map, whereas the plaid patterns of Hi-C interactions become readily detectable in late
5 2-cell embryos and are fully established in the inner cell mass (ICM) cells of blastocysts (Figure
6 3H). Thus, in early embryos, compartmentalization of B1- and L1-rich regions appears to be
7 established in a stepwise manner, coincident with de novo establishment of higher-order chromatin
8 structures.
9
10 Repeat RNA and transcription promote L1 and B1 compartmentalization
11 Transcription and nuclear architecture are closely interdependent and influence each other
12 (Dowen et al., 2014; Rowley and Corces, 2018; Rowley et al., 2017). We sought to test whether
13 RNA transcribed from L1 and B1 repeats mediates the function of their DNA sequences. We first
14 analyzed the effects of inhibition of L1 and B1 expression in early mouse embryos, as both repeats
15 are activated and highly expressed in two-cell embryos (Fadloun et al., 2013; Jachowicz et al., 2017;
16 Rothstein et al., 1992). We recently showed that that L1 RNA can be efficiently knocked down with
17 antisense morpholinos (AMO) in mouse embryos and mESCs (Percharde et al., 2018). We
18 successfully depleted L1 and B1 transcripts by injecting antisense morpholinos or oligonucleotides
19 (ASO) into mouse zygotes (Figures S3A-B). Embryos depleted of L1 or B1 RNA were able to pass
20 the first embryonic division and their development was indistinguishable from control embryos at
21 the 2-cell stage; however, they failed to divide further and became arrested at the 2-cell stage
22 (Figure 4A). We collected these embryos for DNA FISH analysis when the control group injected
23 with scramble AMO or ASO had grown to the late 2-cell stage. In embryos depleted of L1 or B1
24 RNA, L1/B1 segregation is impaired, as indicated by significantly lower L1/B1 segregation
25 indexes compared to scramble control and noninjected late 2-cell embryos (Figures 4B and S3C-
26 E). This result suggests that formation of L1 and B1 compartments is delayed upon depletion of L1 11
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 or B1 transcripts, which is consistent with the observed embryonic arrest.
2 It was reported that inhibition of Pol II transcription by -amanitin causes embryonic arrest at
3 the late 2-cell stage (Du et al., 2017; Qiu et al., 2003). We analyzed the Hi-C data obtained by Du
4 et al and found that -amanitin-treated embryos exhibit predominantly cis-chromosomal contacts
5 along the diagonal of the Hi-C map, in a pattern similar to that of early 2-cell embryos, while the
6 control group shows the plaid-like pattern of late 2-cell embryos (Figures S4A). Consistent with
7 this, treating mESCs with the transcription inhibitors Actinomycin D (ActD) or DRB (5,6-dichloro-
8 1-b-D-ribofuranosylbenzimidizole) also altered L1/B1 compartmentalization as shown by loss of
9 L1 perinucleolar localization and gain of mixed L1 and B1 signals (Figures 4C, 4D and S4B).
10 Notably, inhibition of both Pol I and II by ActD caused a more severe effect than inhibition of Pol
11 II alone by DRB. These results indicate that transcription facilitates the nuclear organization.
12 Next, we sought to deplete repeat transcripts in mESCs in order to dissect the effects on
13 chromatin organization independent of embryonic progression. B1/Alu repeats are known to be
14 broadly involved in diverse cellular processes, including transcription and RNA processing and
15 nuclear export (Chen et al., 2009; Dominissini et al., 2011; Lubelsky and Ulitsky, 2018; Polak and
16 Domany, 2006). Treatment of mESCs with B1 ASO led to severe cell death within hours of
17 transfection (data not shown), which precluded direct assessment of the function of B1 transcripts
18 in organizing the nuclear structure. L1 repeats, the most abundant of all repeat subclasses, are
19 predominantly enriched in heterochromatic B compartments. It was suggested that heterochromatin
20 interactions play a more important role in nuclear compartmentalization than euchromatin
21 interactions (Falk et al., 2019; Houda Belaghzal, 2019). To reveal a direct role of repeat RNA in
22 the nuclear organization, we focused on L1 in subsequent investigations.
23 Treatment of mESCs with L1 AMOs drastically altered the formation of L1/B1 compartments
24 as shown by the appearance of the overlapping, yellowish FISH signals of B1 and L1 DNA (Figures
25 4E, 4F, and S5A). In particular, when L1 RNA was depleted by AMO treatment for 12 or 36 hours,
12
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 perinucleolar L1 signals were diffuse or absent and nucleoplasmic signals were dramatically
2 increased at both time-points. In contrast to the punctate pattern of B1 DNA in the control mESCs,
3 B1 signals became more uniformly dispersed in the nucleoplasm, which is indicative of decreased
4 clustering. In comparison, treatment with the drug azidothymidine (AZT), which blocks L1
5 retrotransposition activity (Xie et al., 2010), failed to affect L1 RNA levels and the nuclear
6 localization of L1 and B1 (Figures 4E, 4F, and S5B). In addition, statistical analysis indicated that
7 the FISH segregation indexes were significantly lower in L1-depleted mESCs than in mESCs
8 treated with scramble AMO or AZT. These results suggest a role of L1 transcripts in regulating the
9 nuclear localization and segregation of L1 and B1 DNA repeats, independently of L1
10 retrotransposition activity, in mESCs.
11
12 L1 RNA regulates the higher-order chromatin structure in mESCs
13 For in-depth characterization of the molecular defects caused by L1 RNA depletion, we
14 performed in situ Hi-C analysis at 36 hours after transfecting L1 AMO into mESCs. Direct
15 visualization of Hi-C interaction maps revealed obvious differences in the plaid pattern of L1-
16 depleted and control mESCs, as illustrated by chr17 (Figure 5A). In L1-depleted mESCs, Hi-C
17 contact signals were abnormally increased along the diagonal line, whereas the plaid signals outside
18 of the diagonal regions became fuzzy or even lost (Figure 5A, panel i). Accordingly, comparison
19 of Hi-C contact frequencies of the control versus L1-depleted cells showed decreased ratios (in
20 blue) across the diagonal and increased ratios (in red) in the periphery (Figure 5A, panel ii). These
21 changes indicate enhanced local chromosomal contacts but decreased long-range interactions in
22 L1-depleted mESCs.
23 A zoomed-in view of a 42-Mb region of chr17 in L1-depleted mESCs further shows decreased
24 homotypic chromatin contacts between L1-rich or B1-rich compartments, but abnormally
25 increased heterotypic contacts (Figure 5B, right; and Figure 5C). For example, B1-B1 interactions
26 (represented by DF, DH and FH) or L1-L1 interactions (represented by ce, cg, ci, eg, ei and gi) 13
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 were downregulated in L1-depleted cells, whereas aberrant B1-L1 contacts (represented by cD, De,
2 Dg, Di, eF, Fg, and Hi) were increased. Globally, L1-depleted cells show significantly lower Hi-C
3 segregation indexes compared to control mESCs (Figure 5A). Consistent with the overlapping
4 signals of L1 and B1 DNA FISH (Figure 4E), these results revealed defective nuclear segregation
5 of L1 and B1 compartments after depletion of L1 RNA (Figure 5B).
6 Next, to directly visualize changes in spatial contacts at specific loci, we performed oligopaint
7 dual-color DNA FISH (Figure 6A). We labeled four representative genomic regions on chr17, a
8 B1-enriched region (F) and three L1-enriched regions (e, g and q), with a set of 500 ~ 4,500 DNA
9 probes for each region at a density of 200 bp per probe, and measured the spatial distances between
10 the probe signals within mESC nuclei. In control mESCs, the two linearly adjacent L1- and B1-
11 enriched regions e and F are positioned away from each other in the nuclear space with a median
12 distance of 1.72 0.55 µm. Depletion of L1 RNA significantly shortened the distance between
13 them to 1.27 0.47 µm (Figures 6B and 6C, left). In comparison, the two L1-enriched
14 compartments g and q reside in close spatial proximity (1.08 0.45 µm) in the nuclei of control
15 cells even though they are separated by multiple L1 and B1 compartments in the linear genomic
16 sequence. L1 depletion significantly increased the nuclear distance between g and q to 1.81 0.82
17 µm (Figure 6C, right). Together, DNA FISH and Hi-C results demonstrate that L1 RNA is critical
18 in maintaining the high-order chromatin structure of segregated L1- and B1-enriched compartments
19 in mESCs.
20
21 L1 repeats promote phase separation of HP1
22 Recent evidence suggests that the formation of heterochromatin domains entails a phase-
23 separation mechanism (Larson et al., 2017; Li et al., 2012; Strom et al., 2017; Zwicker et al., 2014).
24 HP1, a known component of heterochromatin, forms phase-separated droplets in the presence of
25 DNA or nucleosomes in vitro (Larson et al., 2017; Strom et al., 2017). We showed that HP1 binds 14
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 strongly to L1-enriched but not B1-enriched genomic regions (Figure 7A). In addition, L1 RNA
2 specifically binds L1 DNA sequences throughout the genome, and is significantly enriched in L1
3 repeat-associated compartments, but is depleted in B1-associated compartments (Figures 1A and
4 7A), as shown by chromatin isolation by RNA purification followed by sequencing of L1 RNA
5 (ChIRP-seq) (Lu et al., paper submitted). Given the abundance and co-residence of L1 repeats with
6 HP1 in similar heterochromatin contexts within the nucleus, we hypothesized that L1 RNA may
7 facilitate heterochromatin formation by promoting HP1 phase separation.
8 We firstly tested the direct protein-RNA interaction between the recombinant human HP1
9 protein and L1 RNA fragments that were transcribed in vitro. Interestingly, the L1 RNA mix
10 captures HP1 protein in extremely high stringent washes (up to 1 M of salt and urea), indicating
11 that HP1 has a strong RNA-binding activity towards L1 (Figure 7B). Droplet formation assays
12 showed that HP1 alone fails to phase separate (Figures 7C), consistent with a previous report
13 (Larson et al., 2017). Intriguingly, upon addition of L1 RNA, HP1 forms spherical droplets with
14 the properties of liquid-like condensates, such as fusion of droplets and rapid recovery after photo-
15 bleaching (Figures 7C and S6A, B). HP1-L1 RNA droplets increase in size when increasing
16 concentrations of components are added into the assay (Figure 7C). L1 DNA also promotes the
17 phase separation of HP1 (Figure S6C). These results suggest that L1 repeats play an active role
18 in heterochromatin formation.
19
20 Discussion
21 Although tremendous efforts have been dedicated to studies of structural chromatin proteins
22 and cataloging chromatin maps, the role of DNA sequences in 3D genome organization has been
23 largely ignored. It has been speculated that genome organization may occur through polymer phase
15
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 separation mediated by homologous sequence pairing (Cook, 2002; Cook and Marenduzzo, 2018;
2 Falk et al., 2019; Rowley et al., 2017; Tang, 2017). However, it was unclear which types of repeats
3 are involved in this process, and direct evidence was still lacking. In addition, the molecular
4 determinants and mechanisms that drive formation of the A/B compartments remain unclear. In
5 this study, we provide experimental evidence to support a fundamental role of L1 and B1/Alu
6 repeats in organizing chromatin macrostructure.
7 Complementary results based on genomics, Hi-C and imaging analysis revealed that L1 and
8 B1/Alu repeats tend to cluster with sequences from their own repeat subfamily and form mutually
9 exclusive domains in the nuclear space that are highly correlated with A/B compartments (Figures
10 1 and 2). The nuclear segregation of B1/Alu-enriched A compartments in the nuclear interior and
11 L1-enriched B compartments in the nuclear and nucleolar peripheries is relatively conserved and
12 stable across a variety of mouse and human cells, and re-occurs during the cell cycle (Figure 2). In
13 addition, de novo establishment of the segregated pattern of L1- and B1-rich compartments is
14 coincident with the formation of higher-order chromatin structures during early embryogenesis,
15 and appears to be critically regulated by RNA transcripts produced from L1 and B1 DNA sequences.
16 Depletion of L1 RNA by AMOs in mESCs drastically disrupts the nuclear localization of L1 repeats,
17 and leads to impaired segregation of L1 and B1 compartments and altered higher-order chromatin
18 structures. Moreover, L1 RNA binds to its own DNA sequences in cells, and promotes HP1 phase
19 separation in vitro. Together, our findings suggest that repeat transcripts may transact the structural
20 information embedded in their DNA sequences and facilitate formation of the nuclear architecture
21 with assistance from protein partners, such as HP1 (Figures 4-7).
22 Based on our observations together with previous reports in the literature, we propose a
23 hypothetical model in which repetitive elements organize the macroscopic structure of the genome
24 at three hierarchical levels (Figure 7D). First, L1 and B1/Alu repeats serve as the genetic basis for
25 A and B compartments. The abundance and scattering of these repeat elements in the genome
16
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 provide numerous nucleation points or "structural codes" to seed the formation of nuclear
2 subdomains. Homotypic clustering of Ll-rich regions or B1-rich regions initiates genome folding.
3 Second, the structural information embedded in linear genomic DNA repeats is in part transacted
4 by their transcripts, particularly L1 RNA as demonstrated in this study, into spatially ordered
5 chromatin in the nucleus. Molecular crowding and specific interactions among similar molecules,
6 including DNA sequences and RNAs containing multiple copies of the same repeat type, and their
7 associated proteins, promote clustering into larger subdomains (Bergeron-Sandoval et al., 2016;
8 Cook, 2002; Misteli, 2007). Phase separation of individual subdomains may also lead to their
9 segregation in the nuclear space and the formation of distinct L1-enriched heterochromatin and
10 B1/Alu-enriched euchromatin domains. Third, chromatin compartmentalization may be further
11 reinforced and stabilized through attachment of repeat DNA sequences to subnuclear structures
12 such as the nucleolus and nuclear speckles (Quinodoz et al., 2018). Evidence to support this notion
13 includes our observation that L1 repeats are preferentially localized at the nuclear and nucleolar
14 peripheries, and depletion of L1 RNA disrupts the localization of L1 DNA at these sites.
15 In the absence of zygotic transcription, chromatin folding continues to occur in a significantly
16 delayed manner (Figures 4 and S4), and the higher-order chromatin structure still partially forms
17 (Du et al., 2017). This observation implies that the 3D organization of chromatin is an autonomous
18 process that is greatly facilitated by transcription. We surmise that homotypic clustering of L1-rich
19 or B1-rich regions is autonomous but dynamically regulated. L1/B1-driven nuclear
20 compartmentalization is reminiscent of the self-assembly of tandem repeat sequences such as
21 ribosomal DNA (rDNA) and satellite repeats, which promotes high-order assemblages of the
22 nucleolus and pericentromeric domains, respectively (Falahati et al., 2016; Tjong et al., 2016).
23 Consistent with our model, in mouse ESCs carrying a human artificial chromosome, human L1
24 sequences spatially interact with mouse L1-enriched genomic regions but avoid the SINE-enriched
25 regions, and vice versa for human SINEs (van de Werken et al., 2017). Specific DNA-RNA-protein
26 interactions at homologous L1 or B1 chromosomal segments not only enhance the formation of 17
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 subdomains through phase separation, but may also promote their segregation in the nucleus. A
2 recent study reported that there does not appear to be cross-talk between different transcription
3 factors; instead, they interact among themselves via sequence-specific protein-protein interactions
4 to form concentrated hubs on synthetic lacO DNA arrays (Chong et al., 2018). L1 or B1/Alu repeats
5 tend to associate with distinct sets of transcription and chromatin regulators (Lu et al., paper
6 submitted). For example, heterochromatin proteins such as HP1, KAP1, and SETDB1 are
7 specifically enriched on L1 elements, whereas general transcription factors such as GTF3C2 and
8 CEBPB and RNA Pol II subunits show enriched binding on B1/Alu. In addition, different
9 physiochemical properties of phase-separated L1 or B1 subdomains, for example chromatin
10 compactness, may further promote the segregation of L1 and B1 compartments.
11 By focusing on L1 transcripts, we demonstrate an essential role for L1 repeat RNA in
12 promoting the formation of segregated higher-order chromatin structure in mESCs (Figures 4-6).
13 Heterochromatin domains appear to be more stable than euchromatin domains (Houda Belaghzal,
14 2019). In support of a heterochromatin-dominated model (Falk et al., 2019), our data further
15 suggest that L1-mediated heterochromatin formation may drive nuclear compartmentalization.
16 This function of L1 RNA might be achieved by stabilizing the binding of chromatin proteins such
17 as HP1 to L1 DNA sequences, and by recruiting additional RNA-binding proteins to increase
18 local protein concentrations and multivalent interactions for efficient phase separation. It was
19 reported that the RNA-binding activity of mammalian HP1 is required for its localization to
20 pericentromeric chromatin (Muchardt et al., 2002; Stunnenberg et al., 2015). On the other hand,
21 heterochromatic RNA reduces the mobility and spreading of the fission yeast HP1 protein in
22 heterochromatic domains (Keller et al., 2012; Muchardt et al., 2002; Stunnenberg et al., 2015).
23 From an evolutionary perspective, the properties of co-localization, abundance, and sequence
24 preference of L1 and HP1 might have co-evolved to ensure regulated formation of L1-enriched
25 heterochromatin in organizing the nucleus. L1 RNA binds NCL, and its depletion causes relocation
18
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 of DUX genes from the perinucleolar heterochromatin to the nucleoplasm (Percharde et al., 2018).
2 Abnormal activation or silencing of L1 in mouse zygotes led to altered chromatin accessibility and
3 compaction, which failed to be rescued by synthetic L1 RNA (Jachowicz et al., 2017). Together
4 with previous reports, our results suggest a model where L1 RNA acts in cis to anchor L1 repeat-
5 enriched chromosomal segments to the nucleolar and nuclear peripheries. This scaffold function is
6 reminiscent of reported roles for several chromatin-bound noncoding RNAs in organizing
7 subnuclear domains. For example, Xist RNA binds the lamin B receptor in the inner nuclear
8 membrane to initiate and maintain the silencing of the inactive X chromosome (McHugh et al.,
9 2015). Transcription at major satellite repeats precedes chromocenter formation, and satellite
10 transcripts help to recruit the H3K9me3 methyltransferase SUV39H to centromeric DNA
11 sequences (Probst et al., 2010; Velazquez Camacho et al., 2017). However, the exact mechanism
12 underlying the function of L1 RNA remains to be fully elucidated by future studies.
13 In summary, our study starts to unravel the fundamental principle of 3D genome organization,
14 in which the distribution of L1 and B1 transposable elements provides the blueprint for chromatin
15 folding. As discovered by Anfinsen in the late 1950s, the central pillar of protein science tells us
16 that the amino acid sequence of a protein determines its structure and function (Anfinsen, 1972;
17 Anfinsen et al., 1954). Analogously, we propose that the primary DNA sequences determine how
18 the genome folds and functions. We believe that genome folding occurs autonomously, through a
19 process that is driven by homotypic clustering of regions containing L1 or B1 repeat sequences,
20 and is further facilitated by transcription and transcripts produced at these repeat elements.
21 Structural information embedded in L1 and SINE repeats may be universally recognizable, thus
22 contributing to the high degree of stability and conservation in compartmental and TAD
23 organization that is observed across mammalian species and cell types. We want to note that L1
24 and B1/Alu compartmental domains represent a structural and functional ground state of chromatin
25 organization, on which subsequent regulatory features, such as dynamic enhancer-promoter
26 interactions, are overlaid. Nevertheless, the same principle of homotypic clustering, phase 19
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 separation and spatial segregation of chromosomal segments may be reiterated at different genomic
2 scales, consequently folding the genome.
3
4 ACKNOWLEDGMENTS
5 We thank the Shen Laboratory members for insightful discussions, and we thank Dr. Hongxia Lv
6 at the core imaging facility of the School of Life Sciences, Peking University, for imaging support.
7 This work was supported in part by the National Basic Research Program of China
8 (2018YFA0107604, 2017YFA0504204 to X.S.); the National Natural Science Foundation of China
9 (31630095, 31471219 to X.S.; 21573013, 21825401 to Y.S.); the National Key R&D Program of
10 China (2017YFA0505300 to Y.S.); and the Center for Life Sciences (CLS) at Tsinghua University
11 (to X.S).
12
13 AUTHOR CONTRIBUTIONS
14 X.S. conceived of and supervised the study. X.S., Y.S., J.Y.L. and L.C. designed the experiments.
15 J.Y.L. performed bioinformatics analysis. L.C. conducted all imaging experiments with help from
16 Y.Y., G.Z., T.W., X.H., and J.Z. T.L. performed phase separation and RNA pull-down experiments.
17 K.Z., Q.P. and P.Y. prepared the Hi-C library. T.W., W.L. and P.W. did embryo injections. P.L. and
18 L.W. provide HP1 proteins. M.P and M.R.-S. contributed to early experiments of L1 RNA
19 knockdown. W.X., J.N., H.Z., X.B., W.S., Z.W. and Y.H. contributed technical
20 assistance/suggestions. X.S., J.Y.L. and L.C. wrote the manuscript with input from all authors.
21
22 DECLARATION OF INTERESTS
23 The authors declare no competing interests.
24
25
20
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 REFERENCES
2 Anfinsen, C.B. (1972). The formation and stabilization of protein structure. Biochem J 128, 737-749. 3 Anfinsen, C.B., Redfield, R.R., Choate, W.L., Page, J., and Carroll, W.R. (1954). Studies on the gross structure, 4 cross-linkages, and terminal sequences in ribonuclease. J Biol Chem 207, 201-210. 5 Annunziato, A.T. (2008). DNA Packaging: Nucleosomes and Chromatin. Nature Education 1(1):26. 6 Bergeron-Sandoval, L.P., Safaee, N., and Michnick, S.W. (2016). Mechanisms and Consequences of 7 Macromolecular Phase Separation. Cell 165, 1067-1079. 8 Biemont, C. (2010). A Brief History of the Status of Transposable Elements: From Junk DNA to Major Players in 9 Evolution. Genetics 186, 1085-1093. 10 Bintu, B., Mateo, L.J., Su, J.H., Sinnott-Armstrong, N.A., Parker, M., Kinrot, S., Yamaya, K., Boettiger, A.N., and 11 Zhuang, X. (2018). Super-resolution chromatin tracing reveals domains and cooperative interactions in single 12 cells. Science 362. 13 Boettiger, A.N., Bintu, B., Moffitt, J.R., Wang, S., Beliveau, B.J., Fudenberg, G., Imakaev, M., Mirny, L.A., Wu, C.T., 14 and Zhuang, X. (2016). Super-resolution imaging reveals distinct chromatin folding for different epigenetic 15 states. Nature 529, 418-422. 16 Bonev, B., and Cavalli, G. (2016). Organization and function of the 3D genome. Nat Rev Genet 17, 661-678. 17 Bourque, G., Leong, B., Vega, V.B., Chen, X., Lee, Y.L., Srinivasan, K.G., Chew, J.L., Ruan, Y., Wei, C.L., Ng, H.H., et 18 al. (2008). Evolution of the mammalian transcription factor binding repertoire via transposable elements. 19 Genome Res 18, 1752-1762. 20 Bouwman, B.A., and de Laat, W. (2015). Getting the genome in shape: the formation of loops, domains and 21 compartments. Genome Biol 16, 154. 22 Buchwalter, A., Kaneshiro, J.M., and Hetzer, M.W. (2019). Coaching from the sidelines: the nuclear periphery in 23 genome regulation. Nature Reviews Genetics 20, 39-50. 24 Chen, C., Ara, T., and Gautheret, D. (2009). Using Alu elements as polyadenylation sites: A case of retroposon 25 exaptation. Mol Biol Evol 26, 327-334. 26 Chong, S., Dugast-Darzacq, C., Liu, Z., Dong, P., Dailey, G.M., Cattoglio, C., Heckert, A., Banala, S., Lavis, L., 27 Darzacq, X., et al. (2018). Imaging dynamic and selective low-complexity domain interactions that control gene 28 transcription. Science 361. 29 Chuong, E.B., Elde, N.C., and Feschotte, C. (2016). Regulatory evolution of innate immunity through co-option 30 of endogenous retroviruses. Science 351, 1083-1087. 31 Cook, P.R. (2002). Predicting three-dimensional genome structure from transcriptional activity. Nat Genet 32, 32 347-352. 33 Cook, P.R., and Marenduzzo, D. (2018). Transcription-driven genome organization: a model for chromosome 34 structure and the regulation of gene expression tested through simulations. Nucleic Acids Res 46, 9895-9906. 35 Deininger, P. (2011). Alu elements: know the SINEs. Genome Biol 12, 236. 36 Dekker, J., Belmont, A.S., Guttman, M., Leshyk, V.O., Lis, J.T., Lomvardas, S., Mirny, L.A., O'Shea, C.C., Park, P.J., 37 Ren, B., et al. (2017). The 4D nucleome project. Nature 549, 219-226. 38 Dekker, J., Marti-Renom, M.A., and Mirny, L.A. (2013). Exploring the three-dimensional organization of 39 genomes: interpreting chromatin interaction data. Nat Rev Genet 14, 390-403. 40 Dekker, J., Rippe, K., Dekker, M., and Kleckner, N. (2002). Capturing chromosome conformation. Science 295, 41 1306-1311. 42 Dixon, J.R., Jung, I., Selvaraj, S., Shen, Y., Antosiewicz-Bourget, J.E., Lee, A.Y., Ye, Z., Kim, A., Rajagopal, N., Xie, 43 W., et al. (2015). Chromatin architecture reorganization during stem cell differentiation. Nature 518, 331-336. 44 Dixon, J.R., Selvaraj, S., Yue, F., Kim, A., Li, Y., Shen, Y., Hu, M., Liu, J.S., and Ren, B. (2012). Topological domains
21
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376-380. 2 Dominissini, D., Moshitch-Moshkovitz, S., Amariglio, N., and Rechavi, G. (2011). Adenosine-to-inosine RNA 3 editing meets cancer. Carcinogenesis 32, 1569-1577. 4 Dowen, J.M., Fan, Z.P., Hnisz, D., Ren, G., Abraham, B.J., Zhang, L.N., Weintraub, A.S., Schujiers, J., Lee, T.I., Zhao, 5 K., et al. (2014). Control of cell identity genes occurs in insulated neighborhoods in mammalian chromosomes. 6 Cell 159, 374-387. 7 Du, Z., Zheng, H., Huang, B., Ma, R., Wu, J., Zhang, X., He, J., Xiang, Y., Wang, Q., Li, Y., et al. (2017). Allelic 8 reprogramming of 3D chromatin architecture during early mammalian development. Nature 547, 232-235. 9 Durruthy-Durruthy, J., Sebastiano, V., Wossidlo, M., Cepeda, D., Cui, J., Grow, E.J., Davila, J., Mall, M., Wong, 10 W.H., Wysocka, J., et al. (2016). The primate-specific noncoding RNA HPAT5 regulates pluripotency during 11 human preimplantation development and nuclear reprogramming. Nat Genet 48, 44-52. 12 Fadloun, A., Le Gras, S., Jost, B., Ziegler-Birling, C., Takahashi, H., Gorab, E., Carninci, P., and Torres-Padilla, M.E. 13 (2013). Chromatin signatures and retrotransposon profiling in mouse embryos reveal regulation of LINE-1 by 14 RNA. Nat Struct Mol Biol 20, 332-338. 15 Falahati, H., Pelham-Webb, B., Blythe, S., and Wieschaus, E. (2016). Nucleation by rRNA Dictates the Precision 16 of Nucleolus Assembly. Curr Biol 26, 277-285. 17 Falk, M., Feodorova, Y., Naumova, N., Imakaev, M., Lajoie, B.R., Leonhardt, H., Joffe, B., Dekker, J., Fudenberg, 18 G., Solovei, I., et al. (2019). Heterochromatin drives compartmentalization of inverted and conventional nuclei. 19 Nature 570, 395-399. 20 Flyamer, I.M., Gassler, J., Imakaev, M., Brandao, H.B., Ulianov, S.V., Abdennur, N., Razin, S.V., Mirny, L.A., and 21 Tachibana-Konwalski, K. (2017). Single-nucleus Hi-C reveals unique chromatin reorganization at oocyte-to- 22 zygote transition. Nature 544, 110-114. 23 Fraser, J., Ferrai, C., Chiariello, A.M., Schueler, M., Rito, T., Laudanno, G., Barbieri, M., Moore, B.L., Kraemer, D.C., 24 Aitken, S., et al. (2015). Hierarchical folding and reorganization of chromosomes are linked to transcriptional 25 changes in cellular differentiation. Mol Syst Biol 11, 852. 26 Gassler, J., Brandao, H.B., Imakaev, M., Flyamer, I.M., Ladstatter, S., Bickmore, W.A., Peters, J.M., Mirny, L.A., and 27 Tachibana, K. (2017). A mechanism of cohesin-dependent loop extrusion organizes zygotic genome 28 architecture. Embo J 36, 3600-3618. 29 Gibcus, J.H., and Dekker, J. (2013). The hierarchy of the 3D genome. Mol Cell 49, 773-782. 30 Grow, E.J., Flynn, R.A., Chavez, S.L., Bayless, N.L., Wossidlo, M., Wesche, D.J., Martin, L., Ware, C.B., Blish, C.A., 31 Chang, H.Y., et al. (2015). Intrinsic retroviral reactivation in human preimplantation embryos and pluripotent 32 cells. Nature 522, 221-225. 33 Guelen, L., Pagie, L., Brasset, E., Meuleman, W., Faza, M.B., Talhout, W., Eussen, B.H., de Klein, A., Wessels, L., de 34 Laat, W., et al. (2008). Domain organization of human chromosomes revealed by mapping of nuclear lamina 35 interactions. Nature 453, 948-951. 36 Haarhuis, J.H.I., van der Weide, R.H., Blomen, V.A., Yanez-Cuna, J.O., Amendola, M., van Ruiten, M.S., Krijger, 37 P.H.L., Teunissen, H., Medema, R.H., van Steensel, B., et al. (2017). The Cohesin Release Factor WAPL Restricts 38 Chromatin Loop Extension. Cell 169, 693-707 e614. 39 Houda Belaghzal, T.B., Andrew D. Stephens , Denis L. Lafontaine , Sergey V. Venev , Zhiping Weng , John F. 40 Marko, Job Dekker (2019). Compartment-dependent chromatin interaction dynamics revealed by liquid 41 chromatin Hi-C. bioRxiv. 42 Jachowicz, J.W., Bing, X.Y., Pontabry, J., Boskovic, A., Rando, O.J., and Torres-Padilla, M.E. (2017). LINE-1 43 activation after fertilization regulates global chromatin accessibility in the early mouse embryo. Nature Genetics 44 49, 1502-1510. 45 Jost, D., Carrivain, P., Cavalli, G., and Vaillant, C. (2014). Modeling epigenome folding: formation and dynamics 22
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 of topologically associated chromatin domains. Nucleic Acids Res 42, 9553-9561. 2 Jurka, J., Kohany, O., Pavlicek, A., Kapitonov, V.V., and Jurka, M.V. (2004). Duplication, coclustering, and selection 3 of human Alu retrotransposons. Proc Natl Acad Sci U S A 101, 1268-1272. 4 Kadauke, S., and Blobel, G.A. (2009). Chromatin loops in gene regulation. Biochim Biophys Acta 1789, 17-25. 5 Ke, Y., Xu, Y., Chen, X., Feng, S., Liu, Z., Sun, Y., Yao, X., Li, F., Zhu, W., Gao, L., et al. (2017). 3D Chromatin 6 Structures of Mature Gametes and Structural Reprogramming during Mammalian Embryogenesis. Cell 170, 7 367-381 e320. 8 Keller, C., Adaixo, R., Stunnenberg, R., Woolcock, K.J., Hiller, S., and Buhler, M. (2012). HP1(Swi6) mediates the 9 recognition and destruction of heterochromatic RNA transcripts. Mol Cell 47, 215-227. 10 Korenberg, J.R., and Rykowski, M.C. (1988). Human genome organization: Alu, lines, and the molecular structure 11 of metaphase chromosome bands. Cell 53, 391-400. 12 Lander, E.S., Linton, L.M., Birren, B., Nusbaum, C., Zody, M.C., Baldwin, J., Devon, K., Dewar, K., Doyle, M., 13 FitzHugh, W., et al. (2001). Initial sequencing and analysis of the human genome. Nature 409, 860-921. 14 Larson, A.G., Elnatan, D., Keenen, M.M., Trnka, M.J., Johnston, J.B., Burlingame, A.L., Agard, D.A., Redding, S., 15 and Narlikar, G.J. (2017). Liquid droplet formation by HP1alpha suggests a role for phase separation in 16 heterochromatin. Nature 547, 236-240. 17 Li, P., Banjade, S., Cheng, H.C., Kim, S., Chen, B., Guo, L., Llaguno, M., Hollingsworth, J.V., King, D.S., Banani, S.F., 18 et al. (2012). Phase transitions in the assembly of multivalent signalling proteins. Nature 483, 336-340. 19 Lieberman-Aiden, E., van Berkum, N.L., Williams, L., Imakaev, M., Ragoczy, T., Telling, A., Amit, I., Lajoie, B.R., 20 Sabo, P.J., Dorschner, M.O., et al. (2009). Comprehensive mapping of long-range interactions reveals folding 21 principles of the human genome. Science 326, 289-293. 22 Liu, N., Lee, C.H., Swigut, T., Grow, E., Gu, B., Bassik, M.C., and Wysocka, J. (2018). Selective silencing of 23 euchromatic L1s revealed by genome-wide screens for L1 regulators. Nature 553, 228-232. 24 Lubelsky, Y., and Ulitsky, I. (2018). Sequences enriched in Alu repeats drive nuclear localization of long RNAs in 25 human cells. Nature 555, 107-111. 26 Lynch, V.J., Leclerc, R.D., May, G., and Wagner, G.P. (2011). Transposon-mediated rewiring of gene regulatory 27 networks contributed to the evolution of pregnancy in mammals. Nat Genet 43, 1154-1159. 28 Mandal, P.K., and Kazazian, H.H., Jr. (2008). SnapShot: Vertebrate transposons. Cell 135, 192-192 e191. 29 Mateos-Langerak, J., Brink, M.C., Luijsterburg, M.S., van der Kraan, I., van Driel, R., and Verschure, P.J. (2007). 30 Pericentromeric heterochromatin domains are maintained without accumulation of HP1. Molecular Biology of 31 the Cell 18, 1464-1471. 32 McHugh, C.A., Chen, C.K., Chow, A., Surka, C.F., Tran, C., McDonel, P., Pandya-Jones, A., Blanco, M., Burghard, 33 C., Moradian, A., et al. (2015). The Xist lncRNA interacts directly with SHARP to silence transcription through 34 HDAC3. Nature 521, 232-236. 35 Meuleman, W., Peric-Hupkes, D., Kind, J., Beaudry, J.B., Pagie, L., Kellis, M., Reinders, M., Wessels, L., and van 36 Steensel, B. (2013). Constitutive nuclear lamina-genome interactions are highly conserved and associated with 37 A/T-rich sequence. Genome Res 23, 270-280. 38 Mirny, L.A. (2011). The fractal globule as a model of chromatin architecture in the cell. Chromosome Res 19, 39 37-51. 40 Misteli, T. (2007). Beyond the sequence: cellular organization of genome function. Cell 128, 787-800. 41 Mouse Genome Sequencing, C., Waterston, R.H., Lindblad-Toh, K., Birney, E., Rogers, J., Abril, J.F., Agarwal, P., 42 Agarwala, R., Ainscough, R., Alexandersson, M., et al. (2002). Initial sequencing and comparative analysis of the 43 mouse genome. Nature 420, 520-562. 44 Muchardt, C., Guilleme, M., Seeler, J.S., Trouche, D., Dejean, A., and Yaniv, M. (2002). Coordinated methyl and 45 RNA binding is required for heterochromatin localization of mammalian HP1 alpha. Embo Reports 3, 975-981. 23
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 Nagano, T., Lubling, Y., Vaarnai, C., Dudley, C., Leung, W., Baran, Y., Cohen, N.M., Wingett, S., Fraser, P., and 2 Tanay, A. (2017). Cell-cycle dynamics of chromosomal organization at single-cell resolution. Nature 547, 61- 3 67. 4 Naumova, N., Imakaev, M., Fudenberg, G., Zhan, Y., Lajoie, B.R., Mirny, L.A., and Dekker, J. (2013). Organization 5 of the mitotic chromosome. Science 342, 948-953. 6 Nora, E.P., Goloborodko, A., Valton, A.L., Gibcus, J.H., Uebersohn, A., Abdennur, N., Dekker, J., Mirny, L.A., and 7 Bruneau, B.G. (2017). Targeted Degradation of CTCF Decouples Local Insulation of Chromosome Domains from 8 Genomic Compartmentalization. Cell 169, 930-944. 9 Nuebler, J., Fudenberg, G., Imakaev, M., Abdennur, N., and Mirny, L.A. (2018). Chromatin organization by an 10 interplay of loop extrusion and compartmental segregation. Proc Natl Acad Sci U S A 115, E6697-E6706. 11 Orgel, L.E., and Crick, F.H. (1980). Selfish DNA: the ultimate parasite. Nature 284, 604-607. 12 Percharde, M., Lin, C.J., Yin, Y., Guan, J., Peixoto, G.A., Bulut-Karslioglu, A., Biechele, S., Huang, B., Shen, X., and 13 Ramalho-Santos, M. (2018). A LINE1-Nucleolin Partnership Regulates Early Development and ESC Identity. Cell 14 174, 391-405 e319. 15 Polak, P., and Domany, E. (2006). Alu elements contain many binding sites for transcription factors and may 16 play a role in regulation of developmental processes. Bmc Genomics 7, 133. 17 Probst, A.V., Okamoto, I., Casanova, M., El Marjou, F., Le Baccon, P., and Almouzni, G. (2010). A strand-specific 18 burst in transcription of pericentric satellites is required for chromocenter formation and early mouse 19 development. Dev Cell 19, 625-638. 20 Pueschel, R., Coraggio, F., and Meister, P. (2016). From single genes to entire genomes: the search for a function 21 of nuclear organization. Development 143, 910-923. 22 Qiu, J.J., Zhang, W.W., Wu, Z.L., Wang, Y.H., Qian, M., and Li, Y.P. (2003). Delay of ZGA initiation occurred in 2- 23 cell blocked mouse embryos. Cell Res 13, 179-185. 24 Quinodoz, S.A., Ollikainen, N., Tabak, B., Palla, A., Schmidt, J.M., Detmar, E., Lai, M.M., Shishkin, A.A., Bhat, P., 25 Takei, Y., et al. (2018). Higher-Order Inter-chromosomal Hubs Shape 3D Genome Organization in the Nucleus. 26 Cell 174, 744-757 e724. 27 Rao, S.S., Huntley, M.H., Durand, N.C., Stamenova, E.K., Bochkov, I.D., Robinson, J.T., Sanborn, A.L., Machol, I., 28 Omer, A.D., Lander, E.S., et al. (2014). A 3D map of the human genome at kilobase resolution reveals principles 29 of chromatin looping. Cell 159, 1665-1680. 30 Rao, S.S.P., Huang, S.C., St Hilaire, B.G., Engreitz, J.M., Perez, E.M., Kieffer-Kwon, K.R., Sanborn, A.L., Johnstone, 31 S.E., Bascom, G.D., Bochkov, I.D., et al. (2017). Cohesin Loss Eliminates All Loop Domains. Cell 171, 305-320. 32 Rebollo, R., Romanish, M.T., and Mager, D.L. (2012). Transposable Elements: An Abundant and Natural Source 33 of Regulatory Sequences for Host Genes. Annual Review of Genetics, Vol 46 46, 21-42. 34 Rivera-Mulia, J.C., and Gilbert, D.M. (2016). Replication timing and transcriptional control: beyond cause and 35 effect-part III. Curr Opin Cell Biol 40, 168-178. 36 Rothstein, J.L., Johnson, D., Deloia, J.A., Skowronski, J., Solter, D., and Knowles, B. (1992). Gene-Expression during 37 Preimplantation Mouse Development. Gene Dev 6, 1190-1201. 38 Rowley, M.J., and Corces, V.G. (2018). Organizational principles of 3D genome architecture. Nat Rev Genet 19, 39 789-800. 40 Rowley, M.J., Nichols, M.H., Lyu, X.W., Ando-Kuri, M., Rivera, I.S.M., Hermetz, K., Wang, P., Ruan, Y.J., and Corces, 41 V.G. (2017). Evolutionarily Conserved Principles Predict 3D Chromatin Organization. Mol Cell 67, 837-852. 42 Schwarzer, W., Abdennur, N., Goloborodko, A., Pekowska, A., Fudenberg, G., Loe-Mie, Y., Fonseca, N.A., Huber, 43 W., C, H.H., Mirny, L., et al. (2017). Two independent modes of chromatin organization revealed by cohesin 44 removal. Nature 551, 51-56. 45 Solovei, I., Thanisch, K., and Feodorova, Y. (2016). How to rule the nucleus: divide et impera. Current Opinion in 24
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 Cell Biology 40, 47-59. 2 Splinter, E., Heath, H., Kooren, J., Palstra, R.J., Klous, P., Grosveld, F., Galjart, N., and de Laat, W. (2006). CTCF 3 mediates long-range chromatin looping and local histone modification in the beta-globin locus. Genes Dev 20, 4 2349-2354. 5 Stadhouders, R., Vidal, E., Serra, F., Di Stefano, B., Le Dily, F., Quilez, J., Gomez, A., Collombet, S., Berenguer, C., 6 Cuartero, Y., et al. (2018). Transcription factors orchestrate dynamic interplay between genome topology and 7 gene regulation during cell reprogramming. Nature Genetics 50, 238-249. 8 Stevens, T.J., Lando, D., Basu, S., Atkinson, L.P., Cao, Y., Lee, S.F., Leeb, M., Wohlfahrt, K.J., Boucher, W., 9 O'Shaughnessy-Kirwan, A., et al. (2017). 3D structures of individual mammalian genomes studied by single-cell 10 Hi-C. Nature 544, 59-64. 11 Strom, A.R., Emelyanov, A.V., Mir, M., Fyodorov, D.V., Darzacq, X., and Karpen, G.H. (2017). Phase separation 12 drives heterochromatin domain formation. Nature 547, 241-245. 13 Stunnenberg, R., Kulasegaran-Shylini, R., Keller, C., Kirschmann, M.A., Gelman, L., and Buhler, M. (2015). H3K9 14 methylation extends across natural boundaries of heterochromatin in the absence of an HP1 protein. Embo J 15 34, 2789-2803. 16 Sundaram, V., and Wang, T. (2018). Transposable Element Mediated Innovation in Gene Regulatory Landscapes 17 of Cells: Re-Visiting the "Gene-Battery" Model. Bioessays 40. 18 Tang, S.J. (2017). Potential Role of Phase Separation of Repetitive DNA in Chromosomal Organization. Genes 19 (Basel) 8. 20 Taylor, M.S., LaCava, J., Mita, P., Molloy, K.R., Huang, C.R., Li, D., Adney, E.M., Jiang, H., Burns, K.H., Chait, B.T., 21 et al. (2013). Affinity proteomics reveals human host factors implicated in discrete stages of LINE-1 22 retrotransposition. Cell 155, 1034-1048. 23 Tjong, H., Li, W.Y., Kalhor, R., Dai, C., Hao, S.L., Gong, K., Zhou, Y.G., Li, H.C., Zhou, X.J., Le Gros, M.A., et al. 24 (2016). Population-based 3D genome structure analysis reveals driving forces in spatial genome organization. 25 P Natl Acad Sci USA 113, E1663-E1672. 26 van de Werken, H.J.G., Haan, J.C., Feodorova, Y., Bijos, D., Weuts, A., Theunis, K., Holwerda, S.J.B., Meuleman, 27 W., Pagie, L., Thanisch, K., et al. (2017). Small chromosomal regions position themselves autonomously 28 according to their chromatin class. Genome Res 27, 922-933. 29 van Steensel, B., and Belmont, A.S. (2017). Lamina-Associated Domains: Links with Chromosome Architecture, 30 Heterochromatin, and Gene Repression. Cell 169, 780-791. 31 Velazquez Camacho, O., Galan, C., Swist-Rosowska, K., Ching, R., Gamalinda, M., Karabiber, F., De La Rosa- 32 Velazquez, I., Engist, B., Koschorz, B., Shukeir, N., et al. (2017). Major satellite repeat RNA stabilize 33 heterochromatin retention of Suv39h enzymes by RNA-nucleosome association and RNA:DNA hybrid 34 formation. Elife 6. 35 Wang, J., Xie, G., Singh, M., Ghanbarian, A.T., Rasko, T., Szvetnik, A., Cai, H., Besser, D., Prigione, A., Fuchs, N.V., 36 et al. (2014). Primate-specific endogenous retrovirus-driven transcription defines naive-like stem cells. Nature 37 516, 405-409. 38 Wang, S., Su, J.H., Beliveau, B.J., Bintu, B., Moffitt, J.R., Wu, C.T., and Zhuang, X. (2016). Spatial organization of 39 chromatin domains and compartments in single chromosomes. Science 353, 598-602. 40 Wijchers, P.J., Geeven, G., Eyres, M., Bergsma, A.J., Janssen, M., Verstegen, M., Zhu, Y., Schell, Y., Vermeulen, C., 41 de Wit, E., et al. (2015). Characterization and dynamics of pericentromere-associated domains in mice. Genome 42 Res 25, 958-969. 43 Wutz, G., Várnai, C., Nagasaka, K., Cisneros, D.A., Stocsits, R.R., Tang, W., Schoenfelder, S., Jessberger, G., Muhar, 44 M., Hossain, M.J., et al. (2017). Topologically associating domains and chromatin loops depend on cohesin and 45 are regulated by CTCF, WAPL, and PDS5 proteins. The EMBO journal 36, 3573-3599. 25
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 Xie, Y., Rosser, J.M., Thompson, T.L., Boeke, J.D., and An, W. (2010). Characterization of L1 retrotransposition 2 with high-throughput dual-luciferase assays. Nucleic acids research 39, e16-e16. 3 Zane, L., Sharma, V., and Misteli, T. (2014). Common features of chromatin in aging and cancer: cause or 4 coincidence? Trends Cell Biol 24, 686-694. 5 Zullo, J.M., Demarco, I.A., Pique-Regi, R., Gaffney, D.J., Epstein, C.B., Spooner, C.J., Luperchio, T.R., Bernstein, 6 B.E., Pritchard, J.K., Reddy, K.L., et al. (2012). DNA sequence-dependent compartmentalization and silencing of 7 chromatin at the nuclear lamina. Cell 149, 1474-1487. 8 Zwicker, D., Decker, M., Jaensch, S., Hyman, A.A., and Julicher, F. (2014). Centrosomes are autocatalytic droplets 9 of pericentriolar material organized by centrioles. P Natl Acad Sci USA 111, E2636-E2645. 10 11
26
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 FIGURE LEGENDS
2 Figure 1. B1- and L1-rich genomic regions homotypically interact and correlate with A and
3 B compartments.
4 (A) Heatmaps of the distribution densities of B1, L1, L2, and ERV1 repeats and random genomic
5 regions (panel i), DNase I hypersensitive sites (DHS) and ChIP-seq signals of Pol II, H3K4me3,
6 HP1α, and H3K9me3 (panel ii), and ChIRP-seq signals of Malat1 and L1 RNA (panel iii), and
7 RNA-seq (panel iv) in mESCs across two adjacent compartments (Cn, Cn+1). All signals in 696
8 compartments annotated in mESCs were sorted according to the B1 distributions shown in
9 panel (i).
10 (B) Relative contents of various repeats across the A and B compartments annotated in various cell
11 types in mouse (top) and human (bottom). Three repeat subclasses are shown (B1 or Alu, L1,
12 and ERV1), together with random genomic regions as the negative control.
13 (C) Genome browser shots showing conserved domain structures as indicated by heatmaps of the
14 Hi-C contact matrix over a syntenic region in mouse (top) and human (bottom) ESCs. The B1
15 and L1 repeat densities, the A/B compartments shown by eigenvalues of the Hi-C contact
16 matrix, and Refseq gene annotations are shown underneath each heatmap.
17 (D) Heatmap of normalized interaction frequencies at 100-kb resolution on chromosome 17 in
18 mESCs. Genomic densities of B1 and L1 repeats are shown at the left and bottom.
19 (E) A zoomed-in view of the interaction matrix of the genomic region from 18 Mb to 60 Mb on
20 mouse chr17 (40-kb resolution). Genomic densities of B1 and L1 repeats, eigenvalues of the
21 Hi-C matrix representing A/B compartments from mouse mESCs, neural progenitor cells (NPC)
22 and neurons, and Pol II ChIP-seq signals in mESCs are also shown underneath. B1-enriched
23 regions are arbitrarily labeled as D, F, and H in uppercase. L1-enriched regions are labeled as
24 c, e, g, and i in lowercase. Some strong homotypic interactions between compartments rich in
25 the same repeat subfamily (for example, between the B1-rich regions DF, DE and FH, and
26 between the L1-rich regions ce and cg), are highlighted by dotted boxes. 27
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 (F) Correlation heatmap showing Pearson correlation coefficients of the interaction frequencies of
2 any two paired regions in a sub-region on chr17 (500-kb resolution). B1-enriched regions are
3 labeled in uppercase as F, H … R, T. L1-enriched regions are labeled in lowercase as e, g … q,
4 u. Dotted boxes and arrows highlight positive correlations (in red) of the anchor region F
5 (indicated by *) with other B1-enriched genomic regions (horizontal), and of the anchor region
6 g (indicated by *) with other L1-enriched genomic regions (vertical).
7 (G) Correlation heatmap showing Pearson correlation coefficients of inter-chromosomal interaction
8 frequencies between chromosomes 17 and 19 at 500-kb resolution. B1-enriched regions on
9 chr19 are arbitrarily labeled in uppercase as V, W, X, Y, and Z. L1-enriched regions on chr17
10 are labeled as shown in panels (E-F) from c, e … q, to u. The horizontal dotted box and arrows
11 highlight positive correlations of an L1-enriched region on chr19 with other L1-enriched
12 regions on chr17. The vertical dotted box and arrows show positive correlations of a B1-
13 enriched region on chr17 with other B1-enriched regions on chr19.
14
28
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 Figure 2. DNA FISH reveals the spatial segregation of L1 and B1 compartments.
2 (A) Representative images of L1 (green) and B1 (red) repeats revealed by DNA FISH in mESCs.
3 DNA is labeled by DAPI (blue). All scale bars, 5 μm.
4 (B) Normalized fluorescence intensity profiles along the white line in the merged B1/L1 image
5 shown in panel (A). The L1 channel shows four intensity peaks that surround the two intensity
6 peaks in the B1 channel. These two peaks in the B1 channel correspond to valleys in the L1
7 channel.
8 (C) Representative images of immuno-DNA FISH analysis of L1 (green) and B1 (red) DNA repeats,
9 and NCL protein (purple) in mESCs (n=37), NSC (n=23), NIH3T3 (n=18), C2C12 (n=11) and
10 Hela cells (n=15).
11
29
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 Figure 3. Dynamic segregation of L1 and B1 compartments during the cell cycle and
2 embryonic development.
3 (A) and (B) DNA FISH analysis of synchronized mESCs at different cell cycle stages.
4 Representative images and the scatterplot of the segregation index of L1 and B1 DNA are shown
5 in panels (A) and (B), respectively. Data are presented as mean ± standard deviation (SD). N,
6 number of nuclei analyzed. M phase and G1 phase data are compared to S phase using the two-
7 tailed Student’s t-test. p values are shown at the top.
8 (C) Boxplot analysis of the ratio of homotypic interactions versus heterotypic interactions for B1
9 and L1 based on single-cell Hi-C data from mESCs. The y-axis shows the segregation index (Hi-
10 C) of L1 and B1, which represents the ratio of average interaction frequency between
11 compartments containing similar repeats (B1.B1 and L1.L1) to that between compartments
12 containing different repeats (B1.L1) for all chromosomes (except X and Y). Larger values
13 indicate a higher degree of homotypic interaction between B1- or L1-enriched compartments. p
14 values are calculated with the two-tailed Student’s t-test.
15 (D) Heatmaps of normalized interaction frequencies at 500-kb resolution on chromosome 17 in
16 mESCs at M (left) and G1 (middle) phase. A comparison of contact frequencies between M and
17 G1 phase [log2(G1/M)] for the whole of chromosome 17 is shown on the right. Genomic
18 densities of B1 and L1 repeats are shown at the bottom.
19 (E) and (F) DNA FISH analysis of early embryos. Representative FISH images and the scatterplot
20 of the segregation index of L1 and B1 DNA are shown in panels (E) and (F), respectively. Data
21 are presented as mean ± SD (embryos were collected and processed in two independent
22 experiments). N, number of nuclei analyzed. Each sample is compared to the blastocyst stage
23 and p values are calculated with the two-tailed Student’s t-test. Dotted lines show the trend-line.
24 (G) Boxplot showing a gradual increase of homotypic versus heterotypic interactions between L1-
25 rich and B1-rich regions based on Hi-C analysis of early embryos from zygotes (including PN3
26 and PN5 stages) to 8-cell embryos and pluripotent mESCs (in vitro equivalent of the inner cell 30
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 mass cells of blastocysts). Each sample is compared to mESCs and p values are calculated with
2 the two-tailed Student’s t-test.
3 (H) The upper panels show heatmaps of normalized interaction frequencies at 500-kb resolution
4 on chromosome 17 in mouse embryos at the early 2-cell, late 2-cell and inner cell mass (ICM)
5 stages. The lower panels show comparisons of the contact frequencies between Late 2-cell and
6 Early 2-cell (left), and ICM and Early 2-cell (right). Genomic densities of B1 and L1 repeats are
7 also shown.
8
31
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 Figure 4. Repeat RNA and transcription promote the spatial segregation of L1 and B1
2 compartments.
3 (A) Developmental analysis of embryos microinjected with scramble or L1 AMO (left) or with
4 scramble or two different B1 ASOs (ASO-1 and ASO-2) (right) at the zygote stage. N, number
5 of embryos analyzed.
6 (B) Embryos depleted of L1 RNA by AMO show poorer L1/B1 segregation, as indicated by
7 significantly lower L1/B1 segregation indexes compared to the scramble AMO control and
8 noninjected late 2-cell embryos. Each dot represents an embryo analyzed. Black and grey dots
9 represent injected and noninjected embryos, respectively. Data were collected from three
10 independent experiments. p values are calculated with the two-tailed Student’s t-test.
11 (C) and (D) DNA FISH analysis of L1 (green) and B1 (red) repeats in mESCs treated with
12 transcription inhibitors for 3 hours. Representative images and the scatterplot of the segregation
13 index of L1 and B1 DNA are shown in (C) and (D), respectively. DMSO: mock control; DRB
14 (100 μM): inhibits the release and elongation of Pol II; ActD (1 μg/ml): inhibits both Pol I and
15 II. Both drug treatments disrupt the perinucleolar staining of L1 DNA and induce mixing of the
16 L1 and B1 DNA signals. Treatment with ActD elicits a stronger mixing effect than DRB.
17 (E) and (F) DNA FISH analysis of L1 (green) and B1 (red) repeats in mESCs transfected with
18 scramble AMO for 36 hours, or L1 AMO for 12 hours and 36 hours, or treated with the drug
19 AZT for 12 hours. Representative images and the scatterplot of the segregation index of L1 and
20 B1 DNA are shown in (E) and (F), respectively.
21 All scale bars, 5 μm. N, number of nuclei analyzed. Data are presented as mean ± SD (two
22 independent experiments), and p values are calculated with two-tailed Student’s t-test.
23
24
32
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 Figure 5. L1 RNA is required for the formation and maintenance of higher-order chromatin
2 structure.
3 (A-B) Hi-C analysis of mESCs treated with scramble control (SCR) and L1 AMOs for 36 hours.
4 Two independent experiments of AMO transfection followed by Hi-C were performed. Panel A
5 shows Hi-C heatmaps of contact frequencies for SCR and L1 AMO (panel i) and the comparison
6 of contact frequencies (panel ii) between scramble and L1 AMOs [log2(SCR/L1)]. The whole
7 of chromosome 17 (chr17) is shown on the left (at 500-kb resolution) and the boxed sub-region
8 (18-60 Mb) of chr17 is shown enlarged on the right. To better orient the visualization and
9 comparison of these three sub-region heatmaps, representative homotypic interactions (ce, DF,
10 and eg) are labeled with black boxes in the right panels. Relevant B1- and L1-enriched
11 compartments are labeled as in Figure 1. Panel B shows a boxplot comparing the Hi-C-based
12 segregation indexes. L1 AMO led to decreased ratios of homotypic interaction versus
13 heterotypic interaction between L1-rich and B1-rich regions compared to SCR AMO. The p
14 value was calculated by the two-tailed Student’s t-test.
15 (C) Schematic representation of compartmental interactions before and after depletion of L1 RNA
16 based on Hi-C data shown in the panel (A). The control mESCs show strong homotypic
17 interactions (indicated by blue solid lines), whereas L1-depleted cells show weakened
18 homotypic interactions (dotted blue lines) and abnormal increases of heterotypic interactions
19 (indicated by brown solid lines).
20
33
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 Figure 6. L1 RNA is required for spatial segregation of L1 and B1 compartments.
2 Oligopaint DNA FISH analysis of four representative loci before and after depletion of L1 RNA
3 for 36 hours in mESCs. (A) Scheme of relative chromosomal localizations and spatial distances
4 of the four compartments labeled by Oligopaint DNA FISH probes. B1-enriched domain: F. L1-
5 enriched domains: e, g, and q. Representative images and quantification of the spatial distances
6 between two labeled compartments are shown in panels (B) and (C), respectively. Each dot in
7 panel (C) represents a nucleus. In cells treated with L1 AMO, the median distance between the
8 two B1 and L1 compartments e and F is significantly decreased compared to the control, while the
9 median distance between the two L1 compartments g and q is significantly increased. Data are
10 presented as mean ± SD. N, number of nuclei analyzed. p values are calculated by the two-tailed
11 Student’s t-test. Scale bars, 5 μm.
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26 34
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 Figure 7. L1 RNA binds to its own DNA and promotes the phase separation of HP1α.
2 (A) Genome browser view of mouse chr17: 18 - 60 Mb (left) and chr19: 3 - 12 Mb (right). The
3 first five rows show the genomic density of B1 repeats, B1-related sequencing data (ChIRP-
4 seq signals of Malat1, ChIP-seq signals of Pol II, and H3K27ac), and RNA-seq in mESCs. The
5 lower tracks show the genomic density of L1 repeats (highlighted by beige shading) and L1-
6 related sequencing data, including ChIRP-seq signals of L1 RNA, ChIP-seq signals of HP1α
7 and H3K9me3. Refseq gene annotations are also included.
8 (B) Biotinylated L1 RNA pulls down recombinant HP1α protein. Reactions without addition of L1
9 RNA were served as mock control.
10 (C) Representative images of droplet formation at different concentrations of HP1α protein and L1
11 RNA. Concentrations of HP1α and RNA are indicated at the top and left of the images,
12 respectively. Scale bars, 10 μm. Data are representative of 3 independent experiments.
13 (D) The model. First, the intrinsic self-assembly property of L1 and B1/Alu repeats provides
14 numerous nucleation points to seed the formation of nuclear subdomains. Repetitive DNA
15 sequences also serve as anchor sites for transcription machinery, regulatory proteins and RNAs.
16 Second, the embedded structural information in DNA repeats may be translated by their RNA
17 transcripts, together with interacting DNA and/or RNA-binding proteins. Molecular crowding
18 generated by interactions among DNA sequences, RNAs, and proteins in subnuclear domains
19 seeded by individual clusters of L1 or B1/Alu, may drive the aggregation of compartments
20 containing the same repeat type through a phase-separation mechanism, which consequently
21 folds the genome. Third, the nuclear segregation of L1-enriched B compartments and B1/Alu-
22 enriched A compartments may be further reinforced by attaching their DNA sequences to
23 subnuclear structures such as nuclear speckles and the nucleolus, respectively, which serve as
24 scaffolds to stabilize the nuclear architecture.
35
bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Figure 1
A i) Genomic distribution ii) chromatin state iii) ChIRP iv) RNA C
B A B A
B1 L1 A/B 696 696 compartmentsin mESC Gene ISM1 FOXA2 EIF6 TOP1 PLTP Cn Cn+1 Low High (abundance) Gene B/A B (i) Mouse 0.2 mESC 0.2 NPC 0.2 Neuron L1 B1 0.1 0.1 B A B A 0 0 0 -0.1 -0.1 B1/Alu Relativecontent High -0.2 -0.2 -0.2 L1 hESC K562 (ii) Human 0.06 0.06 GM128780.06 ERV1 Low 0.04 0.04 Random 0.02 0.02 0 0 0 -0.02 -0.02 -0.04 -0.04
Relativecontent mESC: intra-chromosome correlation -0.06 A -0.06 B A -0.06 B A B F (chr17:35 - 95 Mb) D compartments mESC: intra-chromosome interaction [chr17: Log2 (interaction matrix)] k m o q u i H J L N P R T F H J L N P R T F* e g L1 B1 B1 B1 L1
0 20 40 60 80 95 (Mb) L1 e g* i k m o q u B1 L1 B1 L1 G mESC: inter-chromosome correlation E (Fraser et al., 2015) c e g i k m o q u
cg chr17:18 - 60 Mb DH ce DF ei cD De eg FH 61 Mb)
gi - c D e F g H i B1 D F H
c g Chr19 (0 L1 e i mESC
NPC W X Y Z V W Y X Z Neuron V Pol II B1 L1 B1 (mESC) B1 A compartment B compartment L1 L1 c e g i k m o q u Chr17 (0 - 95 Mb) bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Figure 2 A B
B1 L1 DAPI
B1 L1 1.0
0.8
0.6
B1/L1 L1/DAPI B1/L1/DAPI 0.4
0.2
Fluorescent Fluorescent intensity (A.U.) 0.0 0 5 10 15 20 Distance (μm)
DNA FISH IF Merge C DNA FISH IF Merge BB11 ((Alu)Alu) LINEL1 1 NCLNCL DAPIDAPI LINE L1/B11/B1 ( Alu ) LINE L1/1/NCLNCL B1 (B1/L1/NCLAlu)/LINE1/ NCL
mESC
NSC
NIH3T3
C2C12
Hela Hela Hela C2C12 NIH3T3 NSC ESC Early Late 4 2 2 cell stage Blastocyst G cell stage cell stage Anaphase Prophase S phase 1 1 Zygote E A Morula phase
Blastocyst Morula 4-cell Late 2-cell Early 2-cell PN4 Zygote S phase G1 phase Anaphase Prophase ♀ bioRxiv preprint B1 L1 Merge L1 B1 B B1 L1 Merge L1 B1 B 1 1 was notcertifiedbypeerreview)istheauthor/funder.Allrightsreserved.Noreuseallowedwithoutpermission. doi: LINE LINE https://doi.org/10.1101/802173 1 1 Merge Merge H F D B
Segregation index (FISH) ; Log2 (interaction matrix) Log2 (interaction matrix) Log2 (interaction matrix) chr17 this versionpostedOctober13,2019. of L1 and B1 - - -
0 25 50 75 100 2 (Late Log2 0 25 50 75 95 (Mb) 0.9 0.6 0.3 0.0 0.3 0.6 0 25 50 75 95 (Mb) -
1.0 1.0 1.0 Segregation index (FISH) N=11
2.4E of L1 and B1 0.0 - - 0.4 0.2 0.0 0.2 0.4 - Early 2 Early 15 Based on DNA FISH DNA on Based N=15 Based on DNA FISH DNA on Based M phase G1 phase log phase G1 phase M - 4.5 ( cell / Early Early 2 / cell N=12 2.6E p p E value vs Blastocyst) vsvalue value vs Sphase vsvalue - N=21 - 18 cell Late 2 Late cell - 10 4.0E 10 6.2 N=12 E N=17 4.0 Single cell Hi cell Single - 15 - - 04 1 cell) Log2 (ICM / Early Early 2 / (ICM Log2 cell) A/B .6E N=21 B1 N=21 L1 - 09 5.0E N=22 - - C analysis of the cell cycle (mESC) cycle cell of the C analysis 4 C The copyrightholderforthispreprint(which G
- Segregation index (Hi-C) cell ICM ICM cell of L1 and B1 0.5 1.0 1.5 2.0 2.5 Segregation index (Hi-C) 3 4 5 of L1 and B1 - cell) cell) 1.0 1.5 2.0 2.5 3.0 9.5E - 11
B1 Hi on Based Based on Hi on Based ( 0.0002 p 7.6E
L1 mESC) vsvalue (Nagano et al, 2017) et(Nagano - 0.0 - 2.0 2.0 2.0 A/B 08 5.9E (Du et al, 2017) etal, (Du - C - 06 - (G1/ M) (G1/ C Figure 3 Figure 5.0 bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Figure 4
A N=52 N=46 N=38 N=31 N=48 B n.s. 1 100 0.4
80 0.2 2.5E-05 n.s . 0.0013
60 0.0 0.5 40 -0.2 of L1 and of L1 and B1 Developmental progression (%) 20 -0.4 0 0 Segregation index (FISH) -0.6
Embryos beyond 2-cell stage 2-cell stage
C B1B1 LINEL1 1 DAPIDAPI MergeMerge D 0.0002 0.4 0.008 DMSO 0.2
0.0 DRB N=12 -0.2 of L1 and of L1 and B1 N=15 N=15 -0.4 Segregation Segregation index (FISH)
ActD ActD DRB DMSO
E F B1 L1 DAPI Merge 0.20 3.0E-12 0.4 2.5E-07 Control 0.2
0.0 N=29 N=14 -0.2 of L1 and of L1 and B1
L1 L1 AMO -0.4 Segregation Segregation index (FISH) N=12 N=29 -0.6 AZT Treatment Treatment 36 h 12 h SCR AMO bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Figure 5
A i) Log2 (interaction matrix) of chr17 Zoom-in views of chr1718-60 Mb a) SCR AMO
cF Dg ce DF cD De eg eF c *D *e F g H B1 D F H c e g i L1
0 25 50 75 95 A/B A compartment B compartment 3.0 6.0 b) L1 AMO
ce DF eg
c *D *e F g H B1 D F H c e g i L1 A/B 0 25 50 75 95 3.0 6.0 A compartment B compartment ii) Log2 (SCR/L1) of chr17
ce DF eg
c D *e *F g H B1 D F H c e g L1 i
A/B A compartment B compartment
Mb: 0 25 50 75 95 -1.0 1.0 B C chr17:18 - 60 Mb 6.2E-15 B1-B1: Strong 2.5 B1-L1: D F H absent
C) c e g i - L1-L1:strong
2.0 L1 AMO of L1 and B1
B1-B1: weak Segregation index (Hi 1.5 B1-L1 D F H increased #1 #2 #1 #2 c e g i L1-L1: weak SCR AMO L1 AMO bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Figure 6
Oligopaint FISH A chr17:35 - 95 Mb C F (i) e and F (ii) g and q B1 m)
μ 2.3E-04 5 5.9E-06 L1 4 4 e g q 3 3 2 e F g q 2 1 L1 AMO 1 0 0 Distance Distance of two closest alleles ( e F g q A1A and1 and B1 B1 B (i) e in green and F in red (ii)B g2 inB greenand2 and and B q3 inB red3 SCR AMO SCR AMO
SCRSCR AMO AMO 36 36h h
e, F g, q
L1 AMO L1 AMO
LINELINE1 AMO1 AMO 36 36h h
e, F g, q bioRxiv preprint doi: https://doi.org/10.1101/802173; this version posted October 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Figure 7 A C HP1α (μM)
B1 DNA 4.8 6.25 12.5 25 50
Malat1 ChIRP-seq 6.0 200 Pol II 1.0 seq - H3K27ac 1.8 100 ChIP RNA-seq 40
L1 DNA 2.5 75
L1 ChIRP-seq 25
0.2 50
HP1α ng/μL) seq ( - H3K9me3 0.5 ChIP 25
Refseq genes L1 RNA chr17:18 - 60 Mb chr19:3- 12 Mb 12.5
B L1 RNA (biotinylated) Mock 6.25 NaCl (M) 0.15 0.3 0.075 0.15 0.3 0.5 1 1 +Urea
Elution 0
Input
anti-HP1α western blot
D Nucleus LAD L1-enriched compartments NAD
Nuclear speckle Nucleolus
lamina DNA: L1 SINE (B1/Alu) RNA: L1 others (eg: Malat1) Proteins: Nucleolin (NCL) B1-enriched Heterochromatic, eg: HP1α compartments Euchromatic, eg: Pol II Biophysical forces: Self-aggregation Phase separation