bioRxiv preprint doi: https://doi.org/10.1101/2020.03.11.987180; this version posted March 11, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

1 Article 2 3 Super-enhancer switching drives a burst in germline 4 expression at the mitosis-to-meiosis transition 5 6 So Maezawa1, 2, 3*, Masashi Yukawa2, 4, Xiaoting Chen5, Akihiko Sakashita1, 2, Kris G. 7 Alavattam1, 2, Matthew T. Weirauch2, 5, 6, Artem Barski2, 4, and Satoshi H. Namekawa1, 2* 8 9 1 Division of Reproductive Sciences, Division of Developmental Biology, Perinatal Institute, 10 Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, 45229, USA 11 2 Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, 45229, 12 USA 13 3 Department of Animal Science and Biotechnology, School of Veterinary Medicine, Azabu 14 University, Sagamihara, Kanagawa 252-5201, Japan 15 4 Division of Allergy and Immunology, Division of Genetics, Cincinnati Children’s 16 Hospital Medical Center, Cincinnati, Ohio, 45229, USA 17 5 Center for Autoimmune Genomics and Etiology, Cincinnati Children’s Hospital Medical Center, 18 Cincinnati, Ohio, 45229, USA 19 6 Divisions of Biomedical Informatics and Developmental Biology, Cincinnati Children’s Hospital 20 Medical Center, Cincinnati, Ohio, 45229, USA 21 22 *Corresponding authors: E-mail: [email protected]; [email protected] 23 24 25 Abstract 26 27 The testis has the most diverse and complex transcriptome of all organs due to bursts in 28 expression of thousands of germline-specific . Much of this unique takes 29 place when mitotic germ cells differentiate and enter into meiotic prophase. Here, we 30 demonstrate that the genome-wide reorganization of super-enhancers (SEs) drives bursts of 31 germline genes after the mitosis-to-meiosis transition. At the mitosis-to-meiosis transition, 32 mitotic SEs dissolve while meiotic SEs are established. Meiotic SEs are associated with the 33 activation of key germline genes, defining the cellular identity of germ cells. This SE switching 34 is regulated by the establishment of meiotic SEs via A-MYB (MYBL1), a key transcription 35 factor for germline genes, and by the resolution of mitotic SEs via SCML2, a germline-specific 36 Polycomb required for spermatogenesis-specific gene expression. Prior to the entry 37 into meiosis, meiotic SEs are preprogrammed in mitotic spermatogonia, serving to direct the 38 unidirectional differentiation of spermatogenesis. We identify key regulatory factors for both 39 mitotic and meiotic enhancers, revealing a molecular logic for the concurrent activation of 40 mitotic enhancers and suppression of meiotic enhancers in the somatic and/or mitotically 41 proliferating phase. bioRxiv preprint doi: https://doi.org/10.1101/2020.03.11.987180; this version posted March 11, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

42 Introduction 43 Meiosis is an essential step in the preparation of haploid gametes, and the transition from 44 mitotic proliferation to meiosis is a fundamental event in the maturation of germ cells. In the 45 mammalian male germline, this mitosis-to-meiosis transition coincides with a fundamental change 46 to the transcriptome: a dynamic and massive change in genome-wide gene expression1-4. Due to 47 the burst in expression of thousands of germline genes at the mitosis-to-meiosis transition, the testis 48 has the most diverse and complex transcriptome of all organs2,5,6. Further, during spermatogenesis, 49 progressive, dynamic chromatin remodeling takes place to produce haploid spermatids7. Together 50 with genome-wide changes in gene expression, the mitosis-to-meiosis transition accompanies the 51 dynamic reorganization of epigenetic modifications, accessible chromatin, and 3D chromatin 52 conformation to prepare for the next generation of life4,8-12. Yet despite these recent advances in 53 our understanding of the mitosis-to-meiosis transition, it remains largely unknown how DNA 54 regulatory elements underlie the massive, dynamic transcriptional change in the male germline. 55 56 With this in mind, one class of DNA regulatory elements is of particular interest: enhancers. 57 Enhancers play key roles in the control of cell type-specific gene expression programs through the 58 binding of transcription factors (TFs) and interaction with promoters13-15. Mammalian cells contain 59 thousands of active enhancers16. A portion of these enhancers cluster in aggregate to regulate the 60 expression of genes important for establishing cellular identity17,18; these enhancer clusters have 61 been termed ‘super-enhancers’ (SEs)17. SEs are prevalent in various cell and tissue types, and are 62 also found in cancer cells, where they direct the expression of key tumor pathogenesis genes19. 63 However, for the most part, the characterization of SEs has been limited to somatic and/or 64 mitotically proliferating cells. Given the massive scale and scope of the mitosis-to-meiosis 65 transcriptome change, there are compelling questions as to the existence of a uniquely meiotic type 66 of SEs and, if present, to the function of SEs in the meiotic phase. Although a previous study 67 suggested that enhancer activation is not involved in mouse spermatogenesis8, the detailed profiles 68 of active enhancers remain undetermined in spermatogenesis. 69 70 Here, we determine the profiles of active enhancers in representative stages of 71 spermatogenesis in mice and identify a meiotic type of SEs. We demonstrate that the switch from 72 mitotic to meiotic types of SEs drives a dynamic change in the transcriptome at the mitosis-to- 73 meiosis transition. This SE switching is regulated by the establishment of meiotic SEs via A-MYB 74 (MYBL1), a key for germline genes20,21, and by the resolution of mitotic SEs 75 via SCML2, a germline-specific Polycomb protein required for spermatogenesis-specific gene 76 expression3. We found that meiotic SEs are preprogrammed in undifferentiated spermatogonia 77 prior to the mitosis-to-meiosis transition, suggesting that gene activation in meiosis takes place 78 based on epigenetic mechanisms of preprograming. Through systematic analyses, we identified key 79 regulatory factors for both mitotic and meiotic enhancers, thereby exposing the molecular logic of 80 concurrent activation mechanisms for mitotic enhancers and suppression mechanisms for meiotic 81 enhancers in the somatic and/or mitotically proliferating phase. 82 83 Results 84 The landscape of active enhancers during spermatogenesis 85 To determine the landscape of active enhancers in spermatogenesis, we performed 86 chromatin immunoprecipitation with sequencing (ChIP-seq) for the histone modification H3K27ac, 87 a marker of active enhancers22. We analyzed four representative stages of wild-type 88 spermatogenesis: THY1+ undifferentiated spermatogonia, a population that contains both 89 spermatogonial stem cells and progenitor cells; KIT+ differentiating spermatogonia from P7 testes; 90 purified pachytene spermatocytes (PS) in meiotic prophase; and postmeiotic round spermatids (RS) 91 from adult testes (Fig. 1a). We carried out H3K27ac ChIP-seq for two independent biological 92 replicates, and we confirmed that ChIP-seq signals are consistent at genomic loci between the

2 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.11.987180; this version posted March 11, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

93 replicates (Fig. 1b, Supplementary Fig. 1). (While generated for and analyzed in this study, our 94 H3K27ac ChIP-seq data for wild-type PS and RS was initially introduced in another study that 95 analyzed active enhancers on the sex chromosomes23.) Consistent with the massive, dynamic 96 transcriptional change occurring at the mitosis-to-meiosis transition, we observed H3K27ac peaks 97 in mitotically proliferating spermatogonia (blue shadow), H3K27ac peaks unique to meiotic 98 spermatocytes (red shadow), and constitutive peaks (gray shadow: Fig. 1b). 99 100 For the quantitative comparison of active enhancers during spermatogenesis, we analyzed 101 H3K27ac ChIP-seq peaks ±1 kb outside transcription start sites (TSSs); hereafter, we refer to such 102 peaks as ‘distal peaks’ and peaks within ±1 kb as ‘proximal peaks.’ Through the use of MACS224, 103 a program that identifies the significant enrichment of ChIP-seq signals, we detected 11,433 distal 104 H3K27ac ChIP-seq peaks that were present in at least one stage of spermatogenesis (for these 105 analyses, we permitted only distal peaks with a normalized enrichment value of ≥4; see Methods; 106 Supplementary Table 1). The distal peaks were categorized into 9 clusters through k-means 107 clustering; we further organized these into three classes as follows (Fig. 1c). The first class (705 108 peaks, comprising clusters 1-3) represents constitutive active enhancers, i.e., those observed 109 throughout spermatogenesis. The second class (2,524 peaks, comprising clusters 4 and 5) represents 110 enhancers that are active in the mitotic proliferation phase of spermatogenesis (i.e., the ‘mitotic 111 phase’) but are inactive in meiotic and postmeiotic phases. Notably, the third class (8,204 peaks, 112 comprising clusters 6-10) consists of enhancers that are largely inactive in the mitotic phase yet are 113 highly active in meiotic and postmeiotic stages. The number of H3K27ac ChIP-seq peaks gradually 114 increases over the course of spermatogenic differentiation. Taken together, on the contrary to a 115 previous view that mouse spermatogenesis is not involved in enhancer activation8, these results 116 demonstrate that large numbers of active enhancers are established de novo in spermatogenesis. 117 118 To elucidate how active enhancers change in the course of spermatogenesis, we examined 119 the dynamics of H3K27ac peaks at various transitions from one stage to another. For these analyses, 120 we used MAnorm, a peak-analysis program that facilitates quantitative comparisons of peaks 121 derived from two pairwise next-generation sequencing datasets (see Methods)25. Although 122 H3K27ac proximal peaks were largely common at each transition, distal H3K27ac peaks 123 representative of active enhancers underwent dynamic changes in spermatogenesis (Fig. 1d). With 124 regard to distal H3K27ac peaks, a large fraction persisted between THY1+ and KIT+ 125 spermatogonia; we refer to such peaks as ‘common’ (Fig. 1d). These data suggest that, for the most 126 part, THY1+ and KIT+ spermatogonia share a largely common profile of active enhancers. In 127 comparison, MAnorm analysis between KIT+ spermatogonia and PS revealed a dynamic change in 128 the distribution of active enhancers at the mitosis-to-meiosis transition. This result suggests that a 129 majority of active enhancers in KIT+ spermatogonia disappear prior to meiosis, and the extensive 130 de novo formation of active enhancers takes place in meiotic prophase (Fig. 1d). For the large part, 131 this de novo formation of active enhancers occurred in intergenic and intronic regions. The 132 continued alteration of active enhancers occurred from meiotic PS to postmeiotic RS, and we 133 observed the additional de novo establishment of active enhancers in RS (Fig. 1d). 134 135 De novo establishment of super-enhancers after the mitosis-to-meiosis transition facilitates 136 robust expression of key spermatogenesis genes 137 To elucidate how massive, dynamic transcriptional change is stimulated at the mitosis-to- 138 meiosis transition, we sought to test the following hypothesis: The transcriptional change of the 139 mitosis-to-meiosis transition is associated with the establishment of “super-enhancers (SEs).” SEs 140 are genomic regulatory units made up of multiple enhancers, bound by transcription factors to 141 control key regulatory genes required for cellular identity17,19,26. SEs have been defined as large 142 chromatin domains enriched with H3K27ac and/or other active enhancer marks27; drawing on this 143 definition, we identified SEs based on elevated H3K27ac enrichment in spermatogenesis using the

3 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.11.987180; this version posted March 11, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

144 following criteria: 1) Enhancers within 12.5 kb of each other were consolidated into a single entity; 145 (2) enhancer entities were ranked by H3K27ac ChIP-seq signal intensity; finally, (3) enhancer 146 entities enriched with H3K27ac signal above a certain cutoff were defined as SEs (see Methods). 147 Using this definition, we found that SEs are established in the course of spermatogenesis, and they 148 increase in number as germ cells mature: We identified 65 SEs in THY1+ spermatogonia, 182 SEs 149 in KIT+ spermatogonia, 487 SEs in PS, and 1,114 SEs in RS (Fig. 2a, Supplementary Table 2). The 150 number of SEs established de novo increases as spermatogenesis progresses (Fig. 2b). Among the 151 65 SEs in THY1+ spermatogonia, 85% (55/65) are common to SEs identified in KIT+ 152 spermatogonia (Fig. 2c), indicating a largely common profile of SEs in mitotically dividing THY1+ 153 and KIT+ spermatogonia. However, among the 182 SEs in KIT+ spermatogonia, only 32% (59/182) 154 are common to SEs in PS (Fig. 2b). These data reveal the dynamic, de novo formation of SEs at the 155 mitosis-to-meiosis transition. After the mitosis-to-meiosis transition, 57% (278/487) of SEs in PS 156 were common to SEs in RS; we observed the establishment of 836 new SEs in RS (Fig. 2b). 157 158 We identified distinct characteristics of SEs common to the mitotic stages (i.e., between 159 THY1+ and KIT+ spermatogonia), and the SEs common to the meiotic (i.e., PS) and postmeiotic 160 (i.e., RS) stages. First, with respect to SEs common to mitotic stages (THY1+ and KIT+ 161 spermatogonia), H3K27ac was decreased in PS, but tended to persist throughout spermatogenesis 162 into at least as late as the RS stage; for example, H3K27ac was still present after the resolution of 163 a SE on 15 (Fig. 2c). We designate SEs unique to THY1+ and/or KIT+ spermatogonia 164 ‘mitotic SEs.’ Through average tag density analyses, we confirmed that decreased H3K27ac 165 persisted to as late as the RS stage at the genomic loci of mitotic SEs (Fig. 2d). On the other hand, 166 with respect to SEs unique to PS, H3K27ac was largely absent from corresponding genomic loci 167 in THY1+ and KIT+ spermatogonia, but H3K27ac was robustly established during the mitosis-to- 168 meiosis transition (Fig. 2c, d). We term SEs unique to PS and/or RS ‘meiotic SEs.’ Intriguingly, 169 meiotic SEs tend to consist of large and broad H3K27ac peaks, while mitotic SEs tend to comprise 170 clusters of distinct, narrow H3K27ac peaks (Figs. 2c and 2d); SEs with this distinct, narrow 171 conformation have been reported as a general feature of tissue-specific SEs in other mitotically 172 proliferating cells17-19. 173 174 Next, we examined whether meiotic SEs are associated with specific gene expression 175 programs after the mitosis-to-meiosis transition. analysis revealed that genes 176 adjacent to meiotic SEs (i.e., genes within 20 kb upstream to 50 kb downstream of SEs) are enriched 177 for roles in “spermatogenesis” and “male gamete function” (Fig. 2e). We identified 101 genes that 178 were categorized for “spermatogenesis,” and this gene group is highly expressed after the mitosis- 179 to-meiosis transition (Fig. 2f, Supplementary Table 3). This group includes key regulators of 180 spermatogenesis such as Brdt, a key chromatin regulator in spermatogenesis28; and Trdr1 and 181 Piwil1 (also known as Miwi), components of the Piwi-interacting RNA pathway29,30; Msh4, an 182 essential gene for meiotic recombination31; and other spermatogenesis genes including Ggn, Prm3, 183 and Tssk2 (Fig. 2f). Next, we investigated whether genes adjacent to meiotic SEs are subject to 184 higher expression than genes that are not adjacent to meiotic SEs. Among 2,623 late 185 spermatogenesis genes (i.e., genes that are not highly expressed in spermatogonia but are highly 186 expressed in PS and/or RS by a ≥4-fold change compared to spermatogonia: a gene list is included 187 in Supplementary Table 4), 652 genes are located adjacent to meiotic SEs (i.e., genes within 20 kb 188 upstream to 50 kb downstream of SEs). These genes are robustly expressed compared to the 189 remaining 1,971 late spermatogenesis genes that are not adjacent to SEs (Fig. 2g). These results 190 suggest that the de novo establishment of meiotic SEs facilitates robust expression of key 191 spermatogenesis genes. 192 193 A-MYB binding is associated with the establishment of meiotic SEs for targeted activation of 194 germline genes

4 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.11.987180; this version posted March 11, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

195 Enhancers contain transcription factor (TF)-binding sites to regulate the expression of 196 target genes32. Thus, we sought to identify TF-binding motifs that underlie gene expression 197 programs unique to spermatogenesis, specifically those associated with the mitosis-to-meiosis 198 transition. Using the motif analysis program MEME-ChIP33, we identified consensus motifs similar 199 to known TF-binding motifs in active enhancers. To identify TF-binding sites indicative of the 200 mitotic stage versus the meiotic stage, we compared H3K27ac ChIP-seq peaks between KIT+ 201 spermatogonia versus PS. Among the H3K27ac peaks unique to KIT+, the TF-binding motif with 202 the lowest E value (i.e., an E value = 4.3×10-16; E values are expected values output by the MEME 203 expectation maximization algorithm)34 contained motifs for STAT family transcription factors 204 (STAT1, STAT3, and STAT5a). This is in line with the function of STAT3 in spermatogonial 205 differentiation35. The motif with the second lowest E-value (E-value = 1.5×10-13) contains a 206 common binding motif for FOX family transcription factors (FOXK1, FOXJ3, and FOXG1). 207 208 Of note, these data for motif enrichment in KIT+ spermatogonia are distinct from data 209 obtained for KIT+ spermatogonia in our previous study of accessible chromatin (detected via 210 ATAC-seq; like H3K27ac ChIP-seq speaks, ATAC-seq peaks are indicators of cis-regulatory 211 elements)9. To more carefully examine the union and/or exclusivity of motifs in H3K27ac and 212 ATAC peaks, we performed HOMER motif analyses36 using an expanded TF binding motif library 213 taken from the Cis-BP database37 (see Methods). H3K27ac and ATAC double-positive peaks in 214 KIT+ spermatogonia contain consensus motifs such as NR5A2, a TF implicated in germ cell 215 development; the retinoid receptors RXRA and RXRB; and binding motifs for FOS, FOSL2, and 216 JUND. These data are common with previously identified motifs in ATAC peaks in KIT+ 217 spermatogonia9 (Fig. 3b, Supplementary Table 5). Notably, in KIT+ spermatogonia, consensus 218 motifs for DMRT1, a key TF that regulates the mitosis-to-meiosis transition38, were enriched only 219 in ATAC-positive/H3K27ac-negative peaks, suggesting that DMRT1 functions outside of active 220 enhancers (Fig. 3b). A similar feature was also found in PS: The binding motifs for the POU/OCT 221 family of TFs (POU2F, POU3F, and POU1F1) were found only in ATAC-positive/H3K27ac- 222 negative peaks, suggesting that the POU/OCT family of TFs functions outside of active enhancers 223 in PS (Fig. 3b). 224 225 The PS-unique motifs found within H3K27ac peaks revealed a common and notable 226 feature between MEME-ChIP and HOMER motif analyses: A strong enrichment of MYB binding 227 motifs was identified by both analysis pipelines (MEME-ChIP analysis, GGCAGTT: E value = 228 5.6×10-25, Fig. 3a; HOMER motif analysis, Fig. 3b, Supplementary Table 5). These MYB motifs 229 are recognized by A-MYB (also known as MYBL1), a key transcription factor for germline genes20, 230 and are also strongly enriched in A-MYB testes ChIP-seq peaks21. Using this previously published 231 dataset, we identified A-MYB ChIP-seq peaks at the center of SE-associated H3K27ac peaks on 232 the and autosomes in PS (Fig. 3c). Given the role of A-MYB in the regulation of 233 meiotic transcription, these data raise the possibility that A-MYB binding may nucleate 234 establishment of meiotic enhancers and SEs onto the surrounding chromatin. MAnorm analyses 235 revealed that, genome-wide, most A-MYB peaks overlap H3K27ac peaks (Fig. 3d), which further 236 supports the function of A-MYB in priming meiotic enhancers. 237 238 Importantly, we found that meiotic SEs overlap pachytene piRNA clusters, which produce 239 pachytene piRNAs (Fig. 3c, right panel, and Fig. 3e). Among pachytene piRNA clusters, we found 240 that BTBD18-dependent piRNA loci are highly likely to overlap meiotic SEs (Fig. 3e); BTBD18 241 is an essential factor for pachytene piRNA production by way of transcriptional elongation39. Since 242 A-MYB is essential for the production of pachytene piRNA21, these data lend further support to the 243 assertion that A-MYB functions in the priming of meiotic SEs; furthermore, A-MYB and meiotic 244 SEs may comprise, in part or whole, a potential mechanism for the production of pachytene piRNA. 245

5 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.11.987180; this version posted March 11, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

246 To test whether meiotic SE-associated genes are regulated by A-MYB, we analyzed 247 previously published RNA-seq data from the testes of A- mutants (Mybl1repro9) and 248 heterozygous littermate controls at P1435 (Fig. 3f). We observed a significant overlap of meiotic 249 SE adjacent genes and genes differentially expressed in A-myb mutants (211 differentially 250 expressed genes out of 652 meiotic SE adjacent genes (32.7%); this is in comparison to 1,705 251 differentially expressed genes out of all 22,661 RefSeq genes in the genome (7.5%); P = 1.0×10-79; 252 hypergeometric probability test), and many of the differentially expressed genes were found in the 253 downregulated genes of A-myb mutants. Together, these results suggest that A-MYB-binding might 254 trigger the establishment of meiotic SEs to activate target germline genes (Fig. 3g). 255 256 SCML2 is required for the resolution of mitotic SEs during meiosis 257 Next, we sought to determine a mechanism underlying the resolution of mitotic SEs at the 258 mitosis-to-meiosis transition. We focused our investigation on the function of SCML2, a germline 259 specific Polycomb protein that is responsible for dynamic transcriptional changes at the transition3; 260 in mice deficient for Scml2 (i.e., Scml2-knockout (KO) mice), somatic/progenitor genes were 261 derepressed on autosomes after the mitosis-to-meiosis transition, and robust activation of late- 262 spermatogenesis genes was compromised as well3. Although H3K27ac peaks were comparable 263 between the THY1+ and KIT+ spermatogonia of both Scml2-KO mice and wild-type littermate 264 controls (Supplementary Fig. 2a), MAnorm analysis revealed a large number of unique H3K27ac 265 peaks at intergenic and intronic regions in PS of Scml2-KO mice compared to wild-type controls 266 (at intergenic regions, 1,549 peaks in wild-type and 1,951 peaks in Scml2-KO; at intronic regions, 267 845 peaks in wild-type and 4,461 peaks Scml2-KO; Fig. 4a). Intriguingly, the increased number of 268 H3K27ac peaks in Scml2-KO PS appeared to result from the retention of mitotic enhancers after 269 the mitosis-to-meiosis transition: H3K27ac peaks at mitotic SEs, which are resolved in wild-type 270 PS, were, for the most part, retained in Scml2-KO PS and RS (Supplementary Fig. 2b). Average 271 tag density analysis confirmed the genome-wide retention of H3K27ac at SEs from Scml2-KO KIT+ 272 spermatogonia to PS and RS (Fig. 4b). Given this evidence, and since SCML2 suppresses 273 somatic/progenitor genes in meiosis3, these results suggest that the SCML2-mediated resolution of 274 mitotic SEs constitutes a potential mechanism for the suppression of somatic/progenitor genes at 275 the mitosis-to-meiosis transition. 276 277 We also observed that the intensity of H3K27ac at meiotic SEs was slightly decreased in 278 Scml2-KO PS (Fig. 4c), and Scml2-KO RS saw a further decrease in H3K27ac intensity (Fig. 4d). 279 These observations are consistent with the down-regulation of late-spermatogenesis genes in PS 280 and RS Scml2-KO mice3. SCML2 is required for the establishment of H3K27me3 during meiosis, 281 forming two major classes of bivalent genomic domains comprised of H3K27me3 and 282 H3K4me2/3: Class I domains, which are associated with developmental regulator genes; and Class 283 II domains, which are associated with somatic/progenitor genes40. We observed an increase in 284 H3K27ac signal intensity at both classes of bivalent domains in Scml2-KO mice (Supplementary 285 Fig. 2c). We presume that this is the consequence—at least in part—of an antagonistic relationship 286 between H3K27me3 and H3K27ac, since both post-translational modifications occupy the same 287 amino acid residue (K27) of the histone H3 tail. 288 289 SCML2 is required for the formation of SEs on the X chromosome during meiosis 290 Switching focus to meiosis, we performed analyses to elucidate the mechanisms governing 291 the establishment of active enhancers on the male sex . During male meiosis, the sex 292 chromosomes undergo regulation distinct from autosomes due to a central regulatory mechanism 293 known as meiotic sex chromosomes inactivation (MSCI)41,42. MSCI engages a DNA damage 294 response (DDR) pathway to catalyze and regulate sex chromosome gene silencing in PS, an 295 essential occurrence prior to the activation of a subset of sex chromosome genes in RS43,44. RNF8, 296 an E3 and key DDR factor, is responsible for the establishment of ubiquitination on the sex

6 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.11.987180; this version posted March 11, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

297 chromosomes, along with the establishment of active histone modifications such as the enhancer 298 mark H3K27ac, thereby regulating the activation of a subset of sex chromosome genes that escape 299 post-meiotic silencing23,44. Contrasting with our genome-wide profiles of H3K27ac (Fig. 1), we 300 observed a scarcity of distal H3K27ac peaks on the sex chromosomes of THY1+ and KIT+ 301 spermatogonia (Fig. 5a). Of note, in accordance with the chromosome-wide establishment of 302 H3K27ac signals on the XY body as detected by fluorescence microscopy23, many H3K27ac peaks 303 (930 intergenic and 327 intronic peaks of total 1851 peaks) were established de novo on the sex 304 chromosomes in the intergenic and intronic regions from KIT+ spermatogonia to PS (Fig. 5a). 305 306 To dissect the regulatory mechanisms underlying this process, we focused on SCML2, 307 which has a critical regulatory function on the sex chromosomes independent of its functions on 308 autosomes3. SCML2 functions downstream of the DDR pathway that initiates MSCI, where it 309 cooperates with RNF8 to establish H3K27ac23. MAnorm analysis revealed that, on the sex 310 chromosomes of PS and RS, a large portion of distal H3K27ac peaks (particularly those in 311 intergenic and intronic regions) depend on SCML2 (Fig. 5b). Our prior report showed that, 312 interestingly, ATAC-seq peaks appeared specifically on the sex chromosomes of PS in an SCML2- 313 dependent fashion too9. Together, these data indicate that SCML2 is a key regulatory factor for 314 chromatin accessibility and H3K27ac deposition on the sex chromosomes in meiosis. Accordingly, 315 26 SEs are established on the X chromosomes in meiosis (Fig. 5c), and these largely depend on 316 SCML2 (Fig. 5d). Intriguingly, this is unlike SCML2’s function to resolve mitotic SEs after the 317 mitosis-to-meiosis transition (Fig. 4); as such, we observed increased numbers of SEs on the 318 autosomes of Scml2-KO PS (Fig. 5d). Together, these results demonstrate distinct autosome- and 319 sex chromosome-specific functions for SCML2 in the regulation of enhancers in spermatogenesis. 320 321 Meiotic super-enhancers on autosomes are poised in undifferentiated spermatogonia 322 SEs on the sex chromosomes are established downstream of the DDR pathway that initiates 323 MSCI. This DDR-dependent regulation of the sex chromosomes is specific to the unpaired — or 324 unsynapsed — status of the hemizygous male sex chromosomes in meiosis, when homologous 325 autosomes otherwise fully synapse to facilitate recombination42. Given this difference between sex 326 chromosomes and autosomes, we suspected that SEs on the autosomes are regulated by a distinct 327 mechanism. Thus, to determine the mechanism by which autosomal SEs are established, we 328 examined the epigenetic status of meiotic SEs in progenitor cells. We examined H3K4me2 and 329 H3K4me3, active marks that were previously reported to be associated with poised gene promoters 330 during spermatogenesis4. Notably, prior to the establishment of H3K27ac, H3K4me2 was present 331 on autosomal meiotic SEs in THY1+ and KIT+ spermatogonia (Fig. 6a). Additionally, H3K4me3 is 332 also enriched on autosomal meiotic SEs in THY1+ spermatogonia (Fig. 6a). These features were 333 unique to meiotic SEs: Other meiotic enhancers detected through analyses of distal H3K27ac peaks 334 did not exhibit these features (Supplementary Fig. 3a). These results suggest that meiotic SEs are 335 poised as early as the THY1+ spermatogonia phase to prepare for the expression of key 336 spermatogenesis genes after the mitosis-to-meiosis transition. These features were not observed on 337 meiotic SEs associated with the X chromosome (Fig. 6b), lending further support for the distinct 338 regulation of meiotic SEs between autosomes and the X chromosome. While poised chromatin is 339 unique to autosomal meiotic SEs—and not associated with other meiotic enhancers—we found that 340 the TSSs of late spermatogenesis genes are also broadly poised for activation in spermatogonia 341 (Supplementary Fig. 3b)4. Together, these data indicate that SE-associated late spermatogenesis 342 genes on autosomes are poised for activation in two layers: SEs and TSSs (Fig. 6c). We propose 343 that this form of epigenomic programming ensures the unidirectional differentiation of 344 spermatogenesis (Fig. 6c). 345 346 Next, we sought to identify mechanisms for the expression of postmeitoic spermatid 347 specific genes. On autosomes in PS, at distal H3K27ac peaks around RS-specific genes, we

7 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.11.987180; this version posted March 11, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

348 observed an increase in H3K27ac and H3K4me3 signals prior to activation in RS, and H3K27ac 349 and H3K4me3 signals became highly enriched in RS (Supplementary Fig. 3c). On the other hand, 350 on the X chromosome in PS, at distal H3K27ac peaks around RS-specific genes, H3K27ac and 351 H3K4me2 became temporary enriched in PS prior to gene activation in RS (Supplementary Fig. 352 3d). Since, on the sex chromosomes, accumulation of H3K4me2 and H3K27ac takes place 353 downstream of RNF8 function, and accumulation of H3K4me2 and H3K27ac is regulated by 354 SCML2 too23,44, these results serve to further reveal gene activation mechanisms that are distinct 355 between autosomes and the sex chromosomes in haploid RS. 356 357 Identification of key regulatory factors for both mitotic and meiotic enhancers 358 Finally, we took advantage of our new data sets to infer general mechanisms underlying 359 the regulation of mitotic and meiotic enhancers. Since the meiotic gene program is, by and large, 360 repressed in cell types that undergo mitotic divisions, we sought to identify putative TFs that 361 meet one of two counteractive conditions: (1) Those that can operate on and/or promote the 362 activity of mitotic enhancers, and (2) those that can suppress meiotic enhancers. To this end, we 363 used our recently published RELI (Regulatory Element Intersection) algorithm45, to 364 compare the genomic locations of our H3K27 peak data with a large collection of publicly 365 available ChIP-seq data. Taking the genomic location information for H3K27ac peaks detected to 366 be mitotic enhancers in KIT+ spermatogonia, we analyzed the intersections between these data and 367 publicly available ChIP-seq data sets for many TFs in many contexts. Since the overwhelming 368 majority of public ChIP-seq data are from somatic cells that undergo mitotic divisions in between 369 cell cycles, this informs (at least) one interpretation for such an experiment: The enrichment of 370 intersections between mitotic enhancers and TFs could be indicative of general mechanisms that 371 operate on mitotic enhancers. In addition to TFs that were previously associated with 372 spermatogonia, such as STAT3, TCF3, MAZ, and ETS135,46,47, we identified additional factors 373 with enriched ChIP-seq peaks at distal H3K27ac peaks in KIT+ spermatogonia: SRF, TCF12, 374 GATA4, BCL6, CEBPB, and MAX (Fig. 7a, Supplementary Table 6). When we applied the same 375 analysis to mitotic SEs, we identified UBTF, RBP1, CHD1, ZFX, and as specific factors that 376 may be involved in their regulation (Fig. 7b, Supplementary Table 7). Among them, ZFX was 377 previously implicated in spermatogenesis48. Next, we applied this strategy to identify factors that 378 suppress meiotic enhancers in the mitotic phase. Compellingly, at the sites for meiotic enhancers 379 as determined by the loci of distal H3K27ac peaks in PS, we revealed high enrichment for factors 380 that comprise, in part, transcriptional silencing machinery, including REST, TRIM28, RCOR2, 381 SIN3A, and YY1 (Fig. 7c, Supplementary Table 8). Of note, when we applied this analysis to 382 meiotic SEs in the mitotic phase, we identified KDM5A, a histone that acts on 383 H3K4me3, as the factor having the highest enrichment at meiotic SEs (Fig. 7d, Supplementary 384 Table 9). Together, these analyses systematically identify putative regulators of mitotic and meiotic 385 enhancers, providing clues for understanding their underlying molecular mechanisms. 386 387 Discussion 388 In this study, we determined the profiles of active enhancers in representative stages of 389 spermatogenesis, and we demonstrated that SE switching underlies the dynamic transcriptome 390 change of the mitosis-to-meiosis transition. Our results establish an overarching molecular logic 391 for this switching: SCML2 resolves mitotic SEs, and A-MYB establishes meiotic SEs to regulate 392 meiotic transcription. Recent reports indicate that it is, generally speaking, the nature of SEs to 393 regulate gene expression that underlies the cellular identity of cell types17,18; with this in mind, the 394 A-MYB-dependent regulation of meiotic SEs becomes a conceivable mechanism for gene 395 expression that defines the cellular identity of germ cells in late spermatogenesis. Our analyses 396 revealed SE-adjacent genes such as Brdt, Miwil1, and Tdrd1—genes that are critical for late 397 spermatogenesis. Thus, these genes may be candidate genes for cellular identity. In particular, 398 Miwil1 and Tdrd1 are involved in the regulation of piRNA29,30, and A-MYB is required for the

8 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.11.987180; this version posted March 11, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

399 expression of Miwil1 and Tdrd121. For example, A-MYB directly regulates transcription of 400 pachytene piRNA clusters, and robust pachytene piRNA production is thought to be regulated by 401 a feed forward loop as part of the transcriptional activation of piRNA clusters and the expression 402 of piRNA regulators such as MIWIL1 and TDRD121. Our data support the possibility that this 403 feedforward loop is mediated by the establishment of meiotic SEs. Interestingly, the A-MYB- 404 dependent activation of germline genes is an ancient mechanism also found in rooster testes21. So, 405 it is interesting to speculate that the establishment of meiotic SEs by A-MYB could be an ancient 406 mechanism too—and one that was possibly exploited by, or otherwise coopted by, pachytene 407 piRNA production in the course of evolutionary history. Given the robust and evolutionarily 408 conserved nature of germline gene activation via SEs, such a mechanism stands in stark contrast to 409 a concomitant mechanism whereby rapidly evolved enhancers, driven by endogenous retroviruses, 410 activate species-specific genes after the mitosis-to-meiosis transition of spermatogenesis (Sakashita 411 et al., co-submitted). 412 413 In tumor cells, SEs are regulated by BRD4, a member of the bromodomain and 414 extraterminal (BET) subfamily of , and inhibition of BRD4 results in the dysregulation of 415 SE-associated genes, including the proto-oncogene26. In spermatogenesis, a testis-specific 416 member of the BET family, BRDT, is required for the meiotic gene expression program28, and 417 small-molecule inhibition of BRDT caused spermatogenic failure49. Given the molecular 418 similarities between BRD4 and BRDT, it is possible that BRDT could be a binding protein for 419 meiotic SEs, and loss of function of BRDT could represent loss of function of meiotic SEs. Since 420 we identified Brdt as a meiotic SE-adjacent gene, BRDT may contribute to a feedforward loop that 421 putatively establishes meiotic SEs. Curiously, another protein containing bromodomains, BRWD1 422 (Bromodomain And WD Repeat Domain Containing 1), can also recognize acetylated lysine 423 residues and is required for postmeiotic transcription in spermatids50. Likewise, BRD4 is also 424 associated with gene expression in spermatids51. So, given the increasing numbers of active 425 enhancers established as PS progresses to become RS (Fig. 1), these observations collectively 426 represent an important direction for future work: to determine the functions of bromodomain 427 proteins in the regulation of germline enhancer activity. 428 429 Our study elucidated distinct forms of regulation for active enhancers on autosomes versus 430 active enhancers on sex chromosomes in spermatogenesis; these forms of regulation are mediated 431 by the distinct functions of SCML2 on autosomes versus sex chromosomes. On the autosomes, 432 SCML2, a highly expressed protein in undifferentiated spermatogonia3, is involved in the 433 resolution of mitotic SEs after the mitosis-to-meiosis transition (Fig. 4), while meiotic SEs are 434 already poised with H3K4me2 in undifferentiated spermatogonia (Fig. 6). Therefore, it is 435 conceivable that these dual mechanisms preprogram meiotic gene expression as early as the 436 undifferentiated spermatogonia phase of spermatogenesis, all in preparation for the unidirectional 437 differentiation of spermatogenesis. On the other hand, on the sex chromosomes, H3K27ac 438 deposition depends on RNF823, a DDR factor, in addition to SCML2 (Fig. 5). Given that MDC1, a 439 DDR factor and key regulator of MSCI43, is necessary for the localization of SCML2 on the XY 440 body3, our results indicate that active enhancers and postmeiotic gene expression are directly 441 downstream of a DDR pathway specific to the sex chromosomes. 442 443 Finally, through genome-wide analyses using publicly available ChIP-seq data from 444 many different cell types, we revealed transcription factors that might bind mitotic and meiotic 445 enhancers, as well as mitotic and meiotic SEs (Fig. 7). Among the factors we identified, the 446 transcriptional repressor KDM5A (also known as RBP2 or JARID1A) evinced the highest 447 enrichment value for meiotic SEs. This is of particular interest because KDM5A was originally 448 identified as an RB (Retinoblastoma)-binding protein and is implicated in tumorigenesis52. Since 449 many germline-associated genes are expressed in many cancer types—or, put another way, many

9 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.11.987180; this version posted March 11, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

450 germline-associated genes are so-called cancer/testis genes53—it is interesting to consider that the 451 regulation of meiotic SEs could, in turn, drive or otherwise regulate germline gene expression in 452 various cancers. 453 454 In summary, our current study provides a framework to understand enhancer activity and 455 the regulation of gene expression during spermatogenesis. Because our study focuses on 456 representative stages, it will be important to further dissect these mechanisms in order to fully 457 understand the complex and well-coordinated nature of spermatogenesis. Recent studies using 458 single cell analyses have revealed new details on the shifting, transitory transcriptomic and 459 epigenomic environments of progressive cell types in human and mouse spermatogenesis54-59. Such 460 dynamism is achievable through the functional interplay of complex combinations of TFs and 461 enhancers, as well as other regulatory elements. Indeed, more than a thousand TFs are differentially 462 expressed during the mitosis-to-meiosis transition in spermatogenesis9. Of note, the testis has the 463 largest number of specifically expressed TFs of all organs60. The systematic determination of 464 germline cis-regulatory elements makes for a compelling future research direction to understand 465 germline mechanisms, including fundamental aspects of the mitotic and meiotic programs. 466 467 Methods 468 Animals 469 Scml2-KO mice were previously reported 3. 470 471 Germ cell fractionation 472 Pachytene spermatocytes and round spermatids were isolated via BSA gravity sedimentation as 473 previously described61. Purity was confirmed by nuclear staining with Hoechst 33342 using 474 fluorescence microscopy. In keeping with previous studies from the Namekawa lab3,9,11,40, ≥90% 475 purity was confirmed for all purifications. Spermatogonia were isolated as described previously8 476 and collected from C57BL/6N mice aged 6-8 days. Testes were collected in a 24-well plate in 477 Dulbecco’s Modified Eagle Medium (DMEM) supplemented with GlutaMax (Thermo Fisher 478 Scientific), non-essential amino acids (NEAA) (Thermo Fisher Scientific), and penicillin and 479 streptomycin (Thermo Fisher Scientific). After removing the tunica albuginea membrane, testes 480 were digested with collagenase (1 mg/ml) at 34°C for 20 min to remove interstitial cells, then 481 centrifuged at 188×g for 5 min. Tubules were washed with the medium and then digested with 482 trypsin (2.5 mg/ml) at 34°C for 20 min to obtain a single cell suspension. Cells were filtered with 483 a 40-μm strainer to remove Sertoli cells, and the cell suspension was plated in a 24-well plate for 1 484 h in the medium supplemented with 10% fetal bovine serum, which promotes adhesion of 485 remaining somatic cells. Cells were washed with magnetic cell-sorting (MACS) buffer (PBS 486 supplemented with 0.5% BSA and 5 mM EDTA) and incubated with CD117 (KIT) MicroBeads 487 (Miltenyi Biotec) on ice for 20 min. Cells were washed and resuspended with MACS buffer, and 488 filtered with a 40-μm strainer. Cells were separated by autoMACS Pro Separator (Miltenyi Biotec) 489 with the program “possel.” Cells in the flow-through fraction were washed with MACS buffer and 490 incubated with CD90.2 (THY1) MicroBeads (Miltenyi Biotec) on ice for 20 min. Cells were 491 washed and resuspended with MACS buffer and filtered with a 40-μm strainer. Cells were 492 separated by autoMACS Pro Separator (Miltenyi Biotec) with the program “posseld.” Purity was 493 confirmed by immunostaining. 494 495 ChIP-seq library preparation and sequencing 496 Cells were suspended in chilled 1× PBS. One-eleventh volume of crosslinking solution (50 mM 497 HEPES-NaOH pH 7.9, 100 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, and 8.8% formaldehyde) was 498 added to the cell suspension and incubated on ice for 8 min. One-twentieth volume of 2 M glycine 499 was added to the cell suspension and incubated at room temperature for 5 min to stop the reaction. 500 Cells were washed twice with PBS, frozen at -80°C, and lysed at 4°C for 10 min each in ChIP lysis

10 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.11.987180; this version posted March 11, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

501 buffer 1 (50 mM HEPES pH 7.9, 140 mM NaCl, 10% glycerol, 0.5% IGEPAL-630, 0.25% Triton 502 X-100). After centrifugation at 2,000×g for 10 min at 4°C, pellets were resuspended with ChIP 503 lysis buffer 2 (10 mM Tris-HCl pH 8.0, 200 mM NaCl, 1 mM EDTA, 0.5 mM EGTA) and 504 incubated at 4°C for 10 min. After centrifugation at 2,000×g for 10 min at 4°C, pellets were washed 505 with TE containing 0.1% SDS and protease inhibitors (Sigma; 11836145001), and resuspended 506 with the same buffer. Chromatin was sheared to approximately 200–500 bp by sonication using a 507 Covaris sonicator at 10% duty cycle, 105 pulse intensity, 200 burst for 2 min. Sheared chromatin 508 was cleared by centrifugation at 20,000×g for 20 min, followed by pre-incubation with Dynabeads 509 Protein G (Thermo Fisher Scientific). Chromatin immunoprecipitation was carried out on an SX- 510 8X IP-STAR compact automated system (Diagenode). Briefly, Dynabeads Protein G were pre- 511 incubated with 0.1% BSA for 2 h. Then, the cleared chromatin was incubated with beads 512 conjugated to antibodies against H3K27ac (Active Motif; 39133) at 4°C for 8 h, washed 513 sequentially with wash buffer 1 (50 mM Tris-HCl pH 8.0, 150 mM NaCl, 1 mM EDTA, 0.1% SDS, 514 0.1% NaDOC, and 1% Triton X-100), wash buffer 2 (50 mM Tris-HCl pH 8.0, 250 mM NaCl, 1 515 mM EDTA, 0.1% SDS, 0.1% NaDOC, and 1% Triton X-100), wash buffer 3 (10 mM Tris-HCl pH 516 8.0, 250 mM LiCl, 1 mM EDTA, 0.5% NaDOC, and 0.5% NP-40), wash buffer 4 (10 mM Tris- 517 HCl pH 8.0, 1 mM EDTA, and 0.2% Triton X-100), and wash buffer 5 (10 mM Tris-HCl). DNA 518 libraries were prepared through the ChIPmentation method 62. Briefly, beads were resuspended in 519 30 μl of the tagmentation reaction buffer (10 mM Tris-HCl pH 8.0 and 5 mM MgCl2) containing 1 520 μl Tagment DNA from the Nextera DNA Sample Prep Kit (Illumina) and incubated at 521 37°C for 10 min in a thermal cycler. The beads were washed twice with 150 μl cold wash buffer 1, 522 incubated with elution buffer (10 mM Tris-HCl pH 8.0, 1 mM EDTA, 250 mM NaCl, 0.3% SDS, 523 0.1 μg/μl Proteinase K) at 42°C for 30 min, and then incubated at 65°C for another 5 h to reverse 524 cross-linking. DNA was purified with the MinElute Reaction Cleanup Kit (Qiagen) and amplified 525 with NEBNext High-Fidelity 2× PCR Master Mix (NEB). Amplified DNA was purified by 526 Agencourt AMPure XP (Beckman Coulter). Afterwards, DNA fragments in the 250- to 500-bp size 527 range were prepared by agarose gel size selection. DNA libraries were adjusted to 5 nM in 10 mM 528 Tris-HCl pH 8.0 and sequenced with an Illumina HiSeq 2500. 529 530 Code availability: ChIP-seq and RNA-seq data 531 RNA-seq data from THY1+ spermatogonia, PS, and RS were downloaded from the Gene 532 Expression Omnibus (accession number: GSE55060)3. ChIP-seq data for A-MYB and RNA-seq 533 data from A-MYB mutant and control testes were downloaded from the Gene Expression Omnibus 534 (accession number: GSE44690)21. ChIP-seq data for H3K4me3, and H3K4me2, and RNA-seq data 535 from KIT+ spermatogonia, were downloaded from Gene Expression Omnibus (accession number: 536 GSE89502)40. While generated for and analyzed in this study, our H3K27ac ChIP-seq data for wild- 537 type PS and RS were initially introduced in another study that analyzed active enhancers on the sex 538 chromosomes23; ChIP-seq data for H3K27ac from PS and RS, were downloaded from Gene 539 Expression Omnibus (accession number: GSE107398)23. ChIP-seq data for H3K27ac from 540 embryonic stem cells was downloaded from Gene Expression Omnibus (accession number: 541 GSE29184)63. ChIP-seq data for H3K27ac from sperm were downloaded from Gene Expression 542 Omnibus (accession number: GSE79230)64. 543 544 ChIP-seq and RNA-seq data analysis 545 Data analysis for ChIP-seq was performed in the Wardrobe Experiment Management System 546 (https://code.google.com/p/genome-tools/)65. Briefly, reads were aligned to the mouse genome 547 (mm10) with Bowtie (version 1.2.0)66, assigned to RefSeq genes (or isoforms) using the Wardrobe 548 algorithm, and displayed on a local mirror of the UCSC genome browser as coverage. ChIP-seq 549 peaks for H3K27ac, H3K4me2, H3K4me3, and A-MYB were identified using MACS2 (version 550 2.1.1.20160309)24. Pearson correlations for the genome-wide enrichment of H3K27ac peaks among 551 ChIP-seq library replicates were analyzed using SeqMonk (Babraham Institute). MAnorm,

11 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.11.987180; this version posted March 11, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

552 software designed for quantitative comparisons of ChIP-seq datasets25, was used to compare the 553 genome-wide ChIP-seq peaks among stages in spermatogenesis. Unique peaks were defined using 554 the following criteria: (1) defined as “unique” by the MAnorm algorithm; (2) P-value <0.01; (3) 555 raw counts of unique reads >10. Common peaks between two stages were defined using the 556 following criteria: (1) defined as “common” by MAnorm algorithm; (2) raw read counts of both 557 stages >10. Average tag density profiles were calculated around transcription start sites for gene 558 sets of somatic/progenitor genes, late spermatogenesis genes, constitutive active genes, and 559 constitutive inactive genes as described previously4. Resulting graphs were smoothed in 200-bp 560 windows. Enrichment levels for ChIP-seq experiments were calculated for 4-kb windows, promoter 561 regions of genes (±2 kb surrounding TSSs) and enhancer regions. To normalize tag value, read 562 counts were multiplied by 1,000,000 and then divided by the total number of reads in each 563 nucleotide position. The total amount of tag values in promoter or enhancer regions were calculated 564 as enrichment. The k-means clustering of differential enhancer peaks were analyzed using Cluster 565 3.0 software. The results were further analyzed using JavaTreeview software67 to visualize as 566 heatmaps. MEME-ChIP 68 was used for motif discovery as described in the text. For all motif 567 analyses, we used only peak regions (±250 bp from the peak summit) outside of ±1 kb from TSSs; 568 we chose a maximum of 3,000 peak regions from the lowest P-values (P <0.01) via MAnorm 569 analysis, and we extracted those sequences using the Table Browser69. The HOMER software 570 package36 was used for motif enrichment analyses using a customized version of HOMER that 571 employs a log base 2 scoring system and motifs contained in the Cis-BP motif database37. To 572 identify SEs, H3K27ac ChIP-seq data were used with the same criteria and software as previously 573 described17,26. 574 575 RELI (Regulatory Element Locus Intersection) analysis was performed as described 576 previously45 In brief, genomic regions of interest (e.g., ChIP-seq peaks) were systematically 577 aligned with a large collection of publicly available ChIP-seq data for various TFs in various 578 cellular contexts largely taken from mouseENCODE, and the significance of the intersection of 579 each dataset was calculated using RELI. 580 581 Analyses of RNA-seq data were performed in the Wardrobe Experiment Management System65. 582 Briefly, reads were aligned by STAR (version STAR_2.5.3a)70 with “--outFilterMultimapNmax 1 583 --outFilterMismatchNmax 2”. RefSeq annotation from the mm10 UCSC genome browser 71 was 584 used. The --outFilterMultimapNmax parameter was used to allow unique alignments only, and the 585 --outFilterMismatchNmax parameter was used to allow a maximum of 2 errors. All reads from the 586 resulting .bam files were split for related isoforms with respect to RefSeq annotation. Then, the EM 587 algorithm was used to estimate the number of reads for each isoform. 588 589 Accession Codes 590 H3K27ac ChIP-seq data reported in this study are deposited to the Gene Expression Omnibus 591 (GEO) under the accession number GSE130652. The following secure token has been created to 592 allow review of record GSE130652 while it remains in private status: sbslqciwldgxpof 593 594 Acknowledgements 595 We thank members of the Namekawa laboratory for discussion and helpful comments regarding 596 the manuscript. Funding sources: The research project grant by the Azabu University Research 597 Services Division, Ministry of Education, Culture, Sports, Science and Technology (MEXT)- 598 Supported Program for the Private University Research Branding Project, (2016-2019), Grant-in- 599 Aid for Research Activity Start-up (19K21196), and The Uehara Memorial Foundation Research 600 incentive grant (2018) to S.M.; Lalor Foundation Postdoctoral Fellowship to A.S.; Albert J. Ryan 601 Fellowship to K.G.A.; CCHMC Endowed Scholar and CpG grant awards to M.T.W.; National 602 Institute of Health (NIH) DP2 GM119134 to A.B.; NIH R01 GM122776 and GM098605 to S.H.N.

12 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.11.987180; this version posted March 11, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

603 604 Author contributions 605 The manuscript was written by S.M., K.G.A., and S.H.N., with critical feedback from all other 606 authors, and S.M. and S.H.N. designed the study. S.M. performed the H3K27ac ChIP-seq 607 experiments. S.M. M.Y., X.C., A.S., K.G.A., M.T.W., A.B., and S.H.N. designed and interpreted 608 the computational analyses; S.M. performed the majority of computational analyses. S.H.N. 609 supervised the project. 610 611 Competing Interest Statement 612 A.B. is a cofounder of Datirium, LLC. 613 614 References 615 616 1. Namekawa, S.H. et al. Postmeiotic sex chromatin in the male germline of mice. Curr Biol 617 16, 660-7 (2006). 618 2. Soumillon, M. et al. Cellular source and mechanisms of high transcriptome complexity in 619 the mammalian testis. Cell Rep 3, 2179-90 (2013). 620 3. Hasegawa, K. et al. SCML2 Establishes the Male Germline Epigenome through Regulation 621 of Histone H2A Ubiquitination. Dev Cell 32, 574-88 (2015). 622 4. Sin, H.S., Kartashov, A.V., Hasegawa, K., Barski, A. & Namekawa, S.H. Poised chromatin 623 and bivalent domains facilitate the mitosis-to-meiosis transition in the male germline. BMC 624 Biol 13, 53 (2015). 625 5. Ramskold, D., Wang, E.T., Burge, C.B. & Sandberg, R. An abundance of ubiquitously 626 expressed genes revealed by tissue transcriptome sequence data. PLoS Comput Biol 5, 627 e1000598 (2009). 628 6. Brawand, D. et al. The evolution of gene expression levels in mammalian organs. Nature 629 478, 343-8 (2011). 630 7. Kimmins, S. & Sassone-Corsi, P. Chromatin remodelling and epigenetic features of germ 631 cells. Nature 434, 583-9 (2005). 632 8. Hammoud, S.S. et al. Chromatin and transcription transitions of mammalian adult germline 633 stem cells and spermatogenesis. Cell Stem Cell 15, 239-53 (2014). 634 9. Maezawa, S., Yukawa, M., Alavattam, K.G., Barski, A. & Namekawa, S.H. Dynamic 635 reorganization of open chromatin underlies diverse transcriptomes during spermatogenesis. 636 Nucleic Acids Res 46, 593-608 (2018). 637 10. Wang, Y. et al. Reprogramming of Meiotic Chromatin Architecture during 638 Spermatogenesis. Mol Cell 73, 547-561.e6 (2019). 639 11. Alavattam, K.G. et al. Attenuated chromatin compartmentalization in meiosis and its 640 maturation in sperm development. Nat Struct Mol Biol 26, 175-184 (2019). 641 12. Patel, L. et al. Dynamic reorganization of the genome shapes the recombination landscape 642 in meiotic prophase. Nat Struct Mol Biol 26, 164-174 (2019). 643 13. Bulger, M. & Groudine, M. Functional and mechanistic diversity of distal transcription 644 enhancers. Cell 144, 327-39 (2011). 645 14. Ong, C.T. & Corces, V.G. Enhancer function: new insights into the regulation of tissue- 646 specific gene expression. Nat Rev Genet 12, 283-93 (2011). 647 15. Calo, E. & Wysocka, J. Modification of enhancer chromatin: what, how, and why? Mol 648 Cell 49, 825-37 (2013). 649 16. Heintzman, N.D. et al. Histone modifications at human enhancers reflect global cell-type- 650 specific gene expression. Nature 459, 108-12 (2009). 651 17. Whyte, W.A. et al. Master transcription factors and mediator establish super-enhancers at 652 key cell identity genes. Cell 153, 307-19 (2013).

13 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.11.987180; this version posted March 11, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

653 18. Parker, S.C. et al. Chromatin stretch enhancer states drive cell-specific gene regulation and 654 harbor human disease risk variants. Proc Natl Acad Sci U S A 110, 17921-6 (2013). 655 19. Hnisz, D. et al. Super-enhancers in the control of cell identity and disease. Cell 155, 934- 656 47 (2013). 657 20. Bolcun-Filas, E. et al. A-MYB (MYBL1) transcription factor is a master regulator of male 658 meiosis. Development 138, 3319-30 (2011). 659 21. Li, X.Z. et al. An ancient transcription factor initiates the burst of piRNA production during 660 early meiosis in mouse testes. Mol Cell 50, 67-81 (2013). 661 22. Creyghton, M.P. et al. Histone H3K27ac separates active from poised enhancers and 662 predicts developmental state. Proc Natl Acad Sci U S A 107, 21931-6 (2010). 663 23. Adams, S.R. et al. RNF8 and SCML2 cooperate to regulate ubiquitination and H3K27 664 acetylation for escape gene activation on the sex chromosomes. PLoS Genet 14, e1007233 665 (2018). 666 24. Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol 9, R137 (2008). 667 25. Shao, Z., Zhang, Y., Yuan, G.C., Orkin, S.H. & Waxman, D.J. MAnorm: a robust model 668 for quantitative comparison of ChIP-Seq data sets. Genome Biol 13, R16 (2012). 669 26. Loven, J. et al. Selective inhibition of tumor oncogenes by disruption of super-enhancers. 670 Cell 153, 320-34 (2013). 671 27. Adam, R.C. et al. Pioneer factors govern super-enhancer dynamics in stem cell plasticity 672 and lineage choice. Nature 521, 366-70 (2015). 673 28. Gaucher, J. et al. Bromodomain-dependent stage-specific male genome programming by 674 Brdt. Embo j 31, 3809-20 (2012). 675 29. Chuma, S. et al. Tdrd1/Mtr-1, a tudor-related gene, is essential for male germ-cell 676 differentiation and nuage/germinal granule formation in mice. Proc Natl Acad Sci U S A 677 103, 15894-9 (2006). 678 30. Deng, W. & Lin, H. miwi, a murine homolog of piwi, encodes a cytoplasmic protein 679 essential for spermatogenesis. Dev Cell 2, 819-30 (2002). 680 31. Kneitz, B. et al. MutS homolog 4 localization to meiotic chromosomes is required for 681 chromosome pairing during meiosis in male and female mice. Genes Dev 14, 1085-97 682 (2000). 683 32. Spitz, F. & Furlong, E.E. Transcription factors: from enhancer binding to developmental 684 control. Nat Rev Genet 13, 613-26 (2012). 685 33. Ma, W., Noble, W.S. & Bailey, T.L. Motif-based analysis of large nucleotide data sets using 686 MEME-ChIP. Nat Protoc 9, 1428-50 (2014). 687 34. Bailey, T.L. et al. MEME SUITE: tools for motif discovery and searching. Nucleic Acids 688 Res 37, W202-8 (2009). 689 35. Oatley, J.M., Kaucher, A.V., Avarbock, M.R. & Brinster, R.L. Regulation of mouse 690 spermatogonial stem cell differentiation by STAT3 signaling. Biol Reprod 83, 427-33 691 (2010). 692 36. Heinz, S. et al. Simple combinations of lineage-determining transcription factors prime 693 cis-regulatory elements required for macrophage and identities. Mol Cell 38, 576-89 694 (2010). 695 37. Weirauch, M.T. et al. Determination and inference of eukaryotic transcription factor 696 sequence specificity. Cell 158, 1431-1443 (2014). 697 38. Matson, C.K. et al. The mammalian doublesex homolog DMRT1 is a transcriptional 698 gatekeeper that controls the mitosis versus meiosis decision in male germ cells. Dev Cell 699 19, 612-24 (2010). 700 39. Zhou, L. et al. BTBD18 Regulates a Subset of piRNA-Generating Loci through 701 Transcription Elongation in Mice. Dev Cell 40, 453-466.e5 (2017). 702 40. Maezawa, S. et al. Polycomb protein SCML2 facilitates H3K27me3 to establish bivalent 703 domains in the male germline. Proc Natl Acad Sci U S A 115, 4957-4962 (2018).

14 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.11.987180; this version posted March 11, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

704 41. Ichijima, Y., Sin, H.S. & Namekawa, S.H. Sex chromosome inactivation in germ cells: 705 emerging roles of DNA damage response pathways. Cell Mol Life Sci 69, 2559-72 (2012). 706 42. Turner, J.M. Meiotic Silencing in Mammals. Annu Rev Genet 49, 395-412 (2015). 707 43. Ichijima, Y. et al. MDC1 directs chromosome-wide silencing of the sex chromosomes in 708 male germ cells. Genes Dev 25, 959-71 (2011). 709 44. Sin, H.S. et al. RNF8 regulates active epigenetic modifications and escape gene activation 710 from inactive sex chromosomes in post-meiotic spermatids. Genes Dev 26, 2737-48 711 (2012). 712 45. Harley, J.B. et al. Transcription factors operate across disease loci, with EBNA2 implicated 713 in autoimmunity. Nat Genet 50, 699-707 (2018). 714 46. Zhu, Z. et al. Dynamics of the Transcriptome during Human Spermatogenesis: Predicting 715 the Potential Key Genes Regulating Male Gametes Generation. Sci Rep 6, 19069 (2016). 716 47. Ryser, S. et al. Gene expression profiling of rat spermatogonia and Sertoli cells reveals 717 signaling pathways from stem cells to niche and testicular cancer cells to surrounding 718 stroma. BMC Genomics 12, 29 (2011). 719 48. Luoh, S.W. et al. Zfx mutation results in small animal size and reduced germ cell number 720 in male and female mice. Development 124, 2275-84 (1997). 721 49. Matzuk, M.M. et al. Small-molecule inhibition of BRDT for male contraception. Cell 150, 722 673-84 (2012). 723 50. Pattabiraman, S. et al. Mouse BRWD1 is critical for spermatid postmeiotic transcription 724 and female meiotic chromosome stability. J Cell Biol 208, 53-69 (2015). 725 51. Bryant, J.M. et al. Characterization of BRD4 during mammalian postmeiotic sperm 726 development. Mol Cell Biol 35, 1433-48 (2015). 727 52. Defeo-Jones, D. et al. Cloning of cDNAs for cellular proteins that bind to the 728 retinoblastoma gene product. Nature 352, 251-4 (1991). 729 53. Scanlan, M.J., Simpson, A.J. & Old, L.J. The cancer/testis genes: review, standardization, 730 and commentary. Cancer Immun 4, 1 (2004). 731 54. Sohni, A. et al. The Neonatal and Adult Human Testis Defined at the Single-Cell Level. 732 Cell Rep 26, 1501-1517.e4 (2019). 733 55. Guo, J. et al. The adult human testis transcriptional cell atlas. Cell Res 28, 1141-1157 734 (2018). 735 56. Green, C.D. et al. A Comprehensive Roadmap of Murine Spermatogenesis Defined by 736 Single-Cell RNA-Seq. Dev Cell 46, 651-667.e10 (2018). 737 57. Hermann, B.P. et al. The Mammalian Spermatogenesis Single-Cell Transcriptome, from 738 Spermatogonial Stem Cells to Spermatids. Cell Rep 25, 1650-1667.e8 (2018). 739 58. Chen, Y. et al. Single-cell RNA-seq uncovers dynamic processes and critical regulators in 740 mouse spermatogenesis. Cell Res 28, 879-896 (2018). 741 59. Wang, M. et al. Single-Cell RNA Sequencing Analysis Reveals Sequential Cell Fate 742 Transition during Human Spermatogenesis. Cell Stem Cell 23, 599-614.e4 (2018). 743 60. Lambert, S.A. et al. The Human Transcription Factors. Cell 172, 650-665 (2018). 744 61. Bellve, A.R. Purification, culture, and fractionation of spermatogenic cells. Methods 745 Enzymol 225, 84-113 (1993). 746 62. Schmidl, C., Rendeiro, A.F., Sheffield, N.C. & Bock, C. ChIPmentation: fast, robust, low- 747 input ChIP-seq for histones and transcription factors. Nat Methods 12, 963-965 (2015). 748 63. Shen, Y. et al. A map of the cis-regulatory sequences in the mouse genome. Nature 488, 749 116-20 (2012). 750 64. Jung, Y.H. et al. Chromatin States in Mouse Sperm Correlate with Embryonic and Adult 751 Regulatory Landscapes. Cell Rep 18, 1366-1382 (2017). 752 65. Kartashov, A.V. & Barski, A. Wardrobe - an integrated system for analysis of epigenomics 753 and transcriptomics data, (2014).

15 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.11.987180; this version posted March 11, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

754 66. Langmead, B., Trapnell, C., Pop, M. & Salzberg, S.L. Ultrafast and memory-efficient 755 alignment of short DNA sequences to the . Genome Biol 10, R25 (2009). 756 67. Saldanha, A.J. Java Treeview--extensible visualization of microarray data. Bioinformatics 757 20, 3246-8 (2004). 758 68. Machanick, P. & Bailey, T.L. MEME-ChIP: motif analysis of large DNA datasets. 759 Bioinformatics 27, 1696-7 (2011). 760 69. Karolchik, D. et al. The UCSC Table Browser data retrieval tool. Nucleic Acids Res 32, 761 D493-6 (2004). 762 70. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15-21 763 (2013). 764 71. Meyer, L.R. et al. The UCSC Genome Browser database: extensions and updates 2013. 765 Nucleic Acids Res 41, D64-9 (2013). 766

16 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.11.987180; this version posted March 11, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

a c d Mitotic stages Meiotic stage Post-meiotic stages k-means clustering analysis MAnorm analysis Postmeiotic sex XY body chromatin Distal H3K27ac peaks Intergenic (PMSC) Total 11,433 peaks Intron Distal peaks Exon Cluster 1 + (n = 133) Upstream (-1 kb < -20 kb)

+ Proximal peaks Promoter (±1 kb from TSSs) Chromocenter Cluster 2 (n = 4) Undifferentiated Differentiating Pachytene Round Sperm THY1 KIT PS RS Distal peaks spermatogonia spermatogonia spermatocytes spermatids Cluster 3 + + + (n = 568) THY1 vs. KIT KIT vs. PS PS vs. RS THY1+ KIT+ PS RS Constitutive (n = 705) 20,000 Cluster 4 b (n = 741) 100 kb 15,000 RefSeq chr9 Tmem136 Cluster 5 Genes Arhgef12 Pou2f3 Oaf D630033O11Rik Trim29 (n = 1,783) Mitotic (n = 2,524) 10,000 ES 5,000 rep1 THY1+ 0 rep2 Cluster 6 (n = 452) Proximal peaks 10,000 rep1 Cluster 7 Numbers of peaks (n) KIT+ (n = 519) rep2 Cluster 8 5,000 (n = 2,205)

rep1 Meiotic (n = 8,204) Cluster 9 H3K27ac 0 PS (n = 1,354) Cluster 10 rep2 (n = 3,674) -unique -unique -unique + + +

rep1 common + PS-unique PS-unique RS-unique KIT RS KIT -PS common + THY1 rep2 -KIT log2 (RPKM) + PS-RS common KIT Sperm

14 16 THY1 Constitutive Mitotic Meiotic

Figure 1. The landscape of active enhancers during spermatogenesis. (a) Schematic of mouse spermatogenesis and the five representative stages analyzed in this study: THY1+, undifferentiated spermatogonia; KIT+, differentiating spermatogonia; PS, pachytene spermatocyte; RS, round spermatids; Sperm, epididymal spermatozoa. (b) Track view of H3K27ac ChIP-seq data with biological repli- cates for each stage of spermatogenesis. ES: embryonic stem cells. (c) Heatmap of distal H3K27ac peaks during spermatogenesis by k-means clustering analysis. (d) MAnorm analysis of H3K27ac peaks at each transition of spermatogenesis. The genomic distribution of each peak is shown with colored bars.

bioRxiv preprint doi: https://doi.org/10.1101/2020.03.11.987180; this version posted March 11, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

a

20 + + 15 THY1 KIT 15 PS RS 15 15 10 10 65 10 182 487 1,114 10 SEs SEs SEs SEs 5 (x10,000) 5 5 5

0 0 0 0 0 1 2 3 0 42 6 0 4 8 12 0 5 10 2015 H3K27ac signal at enhancers Enhancers ranked by H3K27ac signal (x1,000)

b Overlap of SEs e Genes adjacent to meiotic SEs f RNA-seq profiles of 101 RS 1,114 “Spermatogenesis” genes PS PS -log10(P-value) + KIT+ KIT+ 487 487 GO biological process 0 2 4 6 8 10 + THY1+

182 182 THY1 KIT PS RS 65 Spermatogenesis Transcription, DNA-templated Tssk2 10 12755 42859123 278209 836 Prm3 mRNA processing Brdt Cell cycle Ggn Ubiquitin-dependent protein Msh4 catabolic process Piwil1 Tdrd1 Mitosis-to-meiosis Phosphorylation Z-score transition Cellular response to DNA damage stimulus 0-1.51.5 c RefSeq chr15 20 kb chr5 20 kb genes AU022754 Mirlet7bhg Barhl2 THY1+ g Expression of late spermatogenesis genes KIT+ PS RS PS 300 *** *** H3K27ac 200 RS

RPKM 100

+ + SEs in THY1 and KIT SEs in PS and RS 0 SEs in KIT+ SEs in PS SEs in RS d

4 6 + 6 5 KIT 3 5 PS 4 SE-adjacent SE-adjacent 4 RS 2 3 3 2 2 Non SE-adjacent Non SE-adjacent 1 1 1

H3K27ac enrichment -5 kb Start +5 kbEnd-5 kb Start +5 kbEnd-5 kb Start +5 kbEnd

Figure 2. Identification of super-enhancers during spermatogenesis. (a) Identification of SEs in representative stages of spermatogenesis. (b) Overlap of SEs in each transition during spermatogenesis. (c) Track view of H3K27ac ChIP-seq data on representative SEs in spermatogenesis. (d) Average tag density of H3K27ac ChIP-seq reads at SEs. (e) Gene ontology analysis of genes adjacent to meiotic SEs. (f) RNA-seq profiles of 101 “spermatogenesis” genes. (g) Box-and-whisker plots showing distribution of RPKM values for RNA-seq data of late spermatogenesis genes. Central bars represent medi- ans, the boxes encompass 50% of the data points, and the whiskers indicate 90% of the data points. *** P < 0.0001, Mann-Whitney U test.

bioRxiv preprint doi: https://doi.org/10.1101/2020.03.11.987180; this version posted March 11, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

+ a MEME-ChIP analyses: KIT vs PS c chrX 50 kb chr9 50 kb C2cd4b E-value TFs in motif RefSeq Genes Ccnb3 Mir7241 Mir7242

2 [0-3] Akap4 [0-5] STAT1 ZNF143 + 1 bits KIT G GT STAT3 EBF1 0 1 2 3 4 A5 6 C GG A H3K27ac [0-3] [0-5] 4.3x10-16 STAT5A

2 PS -unique

+ FOXK1 GM397 1 bits C ZSCAN4 A 0 T 2 3 4 5 1 FOXJ3 KIT A ACA [0-25] [0-25] 5.4x10-13 FOXG1 A-MYB 2 MYB SNAI2 (whole testis) 1 bits G G 0 2 3 4 5 6 7 8 G1 ACAGTTA MYBL1 ASCL2 Meiotic SEs Meiotic SEs 5.6x10-25 ID4 TGIF1 NHLH1 2 piRNA cluster NKX3-1 1 bits T FOXL1 0 2 3 4 5 6 7 8 C1 ATACACA 7.8x10-11 SOX17 PS-unique 2 STAT6 1 bits d MANorm e Overlap between SEs and A BCL6B 0 2 3 4 5 A1 GAAC piRNA clusters -10 4.5x10 A-MYB vs. H3K27ac-PS no overlap 100 15,000 meiotic SE b Homer analyses: KIT+ vs. PS 80 common SE

H3K27ac+ / ATAC+ H3K27ac- / ATAC+ clusters (n) 10,000 A mitotic SE NR5A1:NR5A2 DMRT1:DMRTA1 60 ESRRA:ESRRB, ESRRG 5,000 40 JUN:JUNB:JUND KIT+ FOS:FOSB:FOSL1:FOSL2 Numbers of peaks (n) 0 20 BATF SMARCC1:SMARCC2 0 Numbers of piRN Pre- BTBD18- BTBD18- RFX POU2F:POU3F: pachytene independent dependent NYFA POU1F1 common piRNA pachytene pachytene MYB:MYBL1:MYBL2 Intergenic PS Intron A-MYB-unique (n = 99) piRNA piRNA

SREBF H3K27ac-unique Exon A-MYB-H3K27ac (n = 50) (n = 65) ATF2:ATF7:CREB5 Promoter (±1 kb) ATF1:CREB1:CREM Upstream (-1 kb < -20 kb) piRNA clusters

f RNA-seq: A-myb mutants vs. controls at P14

10 All genes (n = 22,661) g Differentially expressed genes (n = 1,705) Meiotic SE adjacent genes overlapping

mutants 5 with differentially expressed genes (n = 211: 211/652), P = 1.0 × 10-79 fold change

2 A-MYB binding Upregulated in A-myb Active meiotic SEs

log on meiotic SEs 0

Meiotic SEs Genes regulated by meiotic SEs Tdrd1 -5 A-MYB H3K27ac

mutants Brdt

Mutants/controls -10 Piwil1 Downregulated in A-myb 0 5 10 15

Average log2 RPKM

Figure 3. A-MYB binding is associated with the establishment of meiotic SEs for targeted activation of germline genes. (a) Representative data of MEME-ChIP analysis of H3K27ac reads between KIT+ and PS. (b) Summary of HOMER analysis of H3K27ac reads between KIT+ and PS. (c) Track view of H3K27ac ChIP-seq data on meiotic SEs in spermatogenesis. (d) MAnorm analysis between A-MYB peaks in testes and PS-H3K27ac peaks. (e) Overlap between SEs and piRNA clusters. (f) RNA-seq analysis of A-myb mutant versus heterozy- gous control testes at 14 days post-partum (dpp). The 1,705 genes evincing significant changes in expression (>2-fold change, P adj < 0.01: a binomial test) in A-myb mutants are represented by blue circles. P value is based on a hypergeometric probability test. The 211 dysregulated genes (represented by red circles)/652 meiotic SE-adjacent genes (P = 1.0 × 10-79) compared to 1,705 dysregulated genes/all 22,661 RefSeq genes in the genome. (g) A model of A-MYB dependent establishment of meiotic SEs. bioRxiv preprint doi: https://doi.org/10.1101/2020.03.11.987180; this version posted March 11, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

a MAnorm WT vs. Scml2-KO b H3K27ac Wild-type c H3K27ac in PS Wild-type Scml2-KO Scml2-KO SEs in KIT+ SEs in PS 32,278 PS RS 30,000 KIT+ 7 25,922 4 5 20,000 3 9,802 3 10,000 8,078 7,279 2 3,251 1 1 Numbers of peaks (n) 0 H3K27ac enrichment -5 kb Start +5 kbEnd PS 4

3

KO-unique KO-unique d H3K27ac in RS WT-unique WT-unique Wild-type 2 Scml2-KO

WT-KO common SEs in RS WT-KO common 1 H3K27ac enrichment Intergenic 6 Intron 4 RS 5 Exon 4 Promoter (±1 kb) 3 Upstream (-1 kb < -20 kb) 3 2 2 1 1 H3K27ac enrichment -5 kb Start +5 kbEnd -5 kb Start +5 kbEnd

Figure 4. SCML2 is required for the resolution of mitotic SEs during meiosis. (a) MAnorm analysis of H3K27ac peaks in PS and RS between wild-type and Scml2-KO. (b) Average tag density of H3K27ac ChIP-seq reads at SEs in KIT+ in each stage of spermatogenesis (KIT+, PS, RS). (c, d) Average tag density of H3K27ac ChIP-seq reads at SEs in PS (c) and SE in RS (d).

bioRxiv preprint doi: https://doi.org/10.1101/2020.03.11.987180; this version posted March 11, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

a b c MAnorm: MAnorm: 30 26 Sex chromosomes (WT) Sex chromosomes (WT vs. Scml2-KO) THY1+ vs KIT+ KIT+ vs PS PS RS Intergenic 20 2,000 1,851 1,200 1,088 Intron Exon 10 842 1,500 900 Promoter (±1 kb) 0 0 2 + 650 Upstream (-1 kb < -20 kb) 0 + 600 558 1,000 PS RS KIT Numbers of SEs on the X chromosomes (n) 500 321 368 300 d THY1 72 130 Chr X Autosomes 7 29 77 30 1,200

Numbers of peaks (n) 0 Numbers of peaks (n) 0 26 1,038 900 20 600 -unique -unique -unique 461 + + +

common 10 +

PS-unique 300 KO-unique KO-unique WT-unique WT-unique KIT KIT 3 Numbers of SEs in PS (n) -PS common + THY1 -KIT 0 0 + WT Scml2 WT Scml2 WT-KO common KIT WT-KO common -KO -KO THY1

Figure 5. SCML2 is required for the formation of SEs on the X chromosome during meiosis. (a) MAnorm analysis of H3K27ac peaks on the sex chromosomes between THY1+ and KIT+ spermatogonia, and between KIT+ spermatogonia and PS. (b) MAnorm analysis of H3K27ac peaks on the sex chromosomes in PS and RS between wild-type and Scml2-KO. (c) Numbers of SEs on the X chromosome in each stage of spermatogenesis. (d) Number of SEs in PS on the X chromosome and autosomes.

bioRxiv preprint doi: https://doi.org/10.1101/2020.03.11.987180; this version posted March 11, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

a Enrichment around autosomal meiotic SEsSEs: 1218 SEs H3K27ac H3K4me2 H3K4me3 10 8 *** 10 *** 8 6 8 6 6 4 4 4 2 2 2 0 0 0 Enrichment per kb + + + + + + PS PS PS RS RS RS KIT KIT KIT THY1 THY1 THY1

b Enrichment around X-linked meiotic SEs: 26 SEs H3K27ac H3K4me2 H3K4me3

8 *** 8 10 6 6 8 6 4 4 4 2 2 2 0 0 0 Enrichment per kb + + + + + + PS PS PS RS RS RS KIT KIT KIT THY1 THY1 THY1

c THY1+ spermatogonia Pachytene spermatocytes Establishment of H3K4me2 Recruitment of A-MYB on autosomal meiotic SEs Establishment of H3K27ac

Poised meiotic SEs and promoters Active meiotic SEs

Meiotic SEs Genes regulated by meiotic SEs H3K4me2 A-MYB H3K27ac

Figure 6. Meiotic super-enhancers on autosomes are poised in undifferentiated spermatogonia. (a, b) Box-and-whisker plots showing distribution of ChIP-seq read enrichment around autosomal meiotic SEs (a) and around X-linked meiotic SEs (b). Central bars represent medians, the boxes encompass 50% of the data points, and the whiskers indicate 90% of the data points. *** P < 0.0001, Mann-Whitney U test. (c) A model of poised meiotic SEs and promoters in THY1+ spermatogonia and establishment of active meiotic SEs in pachytene spermatocytes. bioRxiv preprint doi: https://doi.org/10.1101/2020.03.11.987180; this version posted March 11, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

a Putative regulators of mitotic enhancers b Putative regulators of mitotic SEs

SRF UBTF 200 TCF12 60 SRF STAT3 MAX GATA4 50 GATA4 BCL6 RBP1 Specific activators of 150 CEBPB CHD1 mitotic SEs TCF3 40 TCF3 ETS1 TCF12 100 MAX 30 ZFX MAZ KLF4 20 50 10 ChIP-seq peak intersection significance (-log P-value) 0 0 0 50 100 150 200 0 50 100 150 200 250 Ranking of ChIP-seq libraries (n) Ranking of ChIP-seq libraries (n)

c Putative suppressors of meiotic enhancers d Putative suppressors of meiotic SEs KDM5A Specific supressors of

+ REST + 15 TRIM28 15 SIN3A meiotic SEs RCOR2 RCOR2 10 SIN3A 10 REST YY1 5 5 Enriched on SEs in PS Enriched on PS enhancers 50 100 150 200 0 0

+ 50 100 150 200 250 -5 -5 -10 -10 enhancers

+ Relative enrichment PS/KIT Relative enrichment PS/KIT -15 Enriched on SEs in KIT -15 IT

Enriched on K Ranking of ChIP-seq libraries (n) Ranking of ChIP-seq libraries (n)

Figure 7. Identification of key regulatory factors for mitotic and meiotic enhancers and SEs. (a, b) Identification of putative regulators for mitotic enhancers and SEs. The Y-axis indicates the –log of the P-value for the overlap between publically available ChIP-seq datasets for various TFs and mitotic enhancers (a) or SEs (b) based on the RELI algorithm (see Methods). TFs of interest are highlighted. (c, d) Comparison between enriched TFs in meiotic PS vs. KIT+ enhancers (c) or SEs (d). The Y-axis indicates the ratio of the –log of the P-value for the overlap between publically available ChIP-seq datasets for various TFs and enhancers (c) or SEs (d) based on the RELI algorithm.