bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

1 The role of H3K36 methylation and associated

2 methyltransferases in ‐specific regulation

3

4 Henrik Lindehell1, Alexander Glotov1, Eshagh Dorafshan1, Yuri B. Schwartz1 and Jan Larsson1*

5 1Department of Molecular Biology, Umeå University, SE‐90187 Umeå, Sweden

6

7 *Corresponding author: Jan Larsson, Department of Molecular Biology, Umeå University, SE‐ 8 90187 Umeå, Sweden, Tel: +46 (0)90 7856785; Fax: +46 (0)90 778007; Email: 9 [email protected]

10

11 Data deposition footnote: The RNA‐seq data reported in this paper have been deposited in 12 the Gene Expression Omnibus database (GSE166934).

1

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

13 Abstract 14 In Drosophila, two require special mechanisms to balance their transcriptional 15 output to the rest of the genome. These are the male‐specific lethal complex targeting the 16 male X‐chromosome, and Painting of fourth targeting chromosome 4. The two systems are 17 evolutionarily linked to dosage compensation of the X‐chromosome and the chromosomes 18 involved display specific chromatin structures. Here we explore the role of histone H3 tri‐ 19 methylated at lysine 36 (H3K36me3) and the associated methyltransferases in these two 20 chromosome‐specific systems.

21 We show that the loss of Set2 impairs the MSL complex mediated dosage compensation; 22 however, the effect is not recapitulated by H3K36 replacement and indicates an alternative 23 target of Set2. Unexpectedly, balanced transcriptional output from the 4th chromosome 24 requires intact H3K36 and depends on the additive functions of NSD and the Trithorax group 25 Ash1.

26 We conclude that H3K36 methylation and the associated methyltransferases are important 27 factors to balance transcriptional output of the male X‐chromosome and the 4th 28 chromosome. Furthermore, our study highlights the pleiotropic effects of these enzymes.

29

30 Keywords: Dosage compensation; Drosophila; H3K36 methylation; NSD; Ash1; Set2; Painting 31 of fourth 32

2

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

33 Introduction 34 The evolution of sex‐chromosomes, for example the X and Y chromosome pairs, often leads 35 to sex differences in gene dose. In Drosophila, the Y chromosome has degenerated when it 36 comes to gene functions, but is retained through its specific roles in male fertility (Bachtrog 37 2013). This leads to a difference in X‐chromosome gene doses in males and females. 38 Although some located on the X‐chromosome are expressed in a sex‐specific manner, 39 most genes require equal expression levels in males and females (Stenberg and Larsson 40 2011; Mank 2013). The sex differences in gene doses of X‐chromosomal genes have resulted 41 in a need to equalize gene expression between the single X in males and the two X‐ 42 chromosomes in females, and to balance this level of expression with that of the two sets of 43 autosomal chromosomes (Oliver 2007; Stenberg and Larsson 2011; Mank 2013; Kuroda et al. 44 2016). In Drosophila, the gene dosage problem has been solved such that X‐chromosomal 45 gene expression is increased by a factor of approximately two in males (Kuroda et al. 2016). 46 This increased expression is partly mediated by chromosome‐specific targeting and 47 stimulation of the male X‐chromosome by the male‐specific lethal (MSL) complex. The MSL 48 complex consists of at least five protein components, originally identified through their 49 male‐specific lethal phenotype when lost; MSL1, MSL2, MSL3, MLE and MOF, and two long 50 non‐coding RNAs (lncRNAs) named RNA on the X (roX1 and roX2) (Kuroda et al. 2016; Samata 51 and Akhtar 2018). The targeting of the MSL complex to the male X‐chromosome is in 52 principle described in two steps. The MSL complex initially binds specifically to roughly 250 53 specific sites, denoted: chromatin entry sites (CES); MSL recognition element (MRE); high‐ 54 affinity sites (HAS); or pioneering sites on the X (Pion‐X), largely overlapping but initially 55 isolated and classified using different criteria (Kelley et al. 1999; Alekseyenko et al. 2008; 56 Straub et al. 2008; Villa et al. 2016). Next, the MSL complex spreads from these high‐affinity 57 sites to neighbouring transcriptionally active genes; this spreading requires the presence of 58 at least one of the two roX RNAs (Figueiredo et al. 2014). The resulting X‐chromosome 59 specific targeting is accompanied by an enriched histone 4 lysine 16 acetylation (H4K16ac), 60 mediated by the acetyltransferase MOF (Akhtar and Becker 2000; Smith et al. 2000).

61 Importantly, at least one more chromosome has been the subject of as similar an 62 evolutionary process as the current X‐chromosome. The 4th chromosome in Drosophila 63 melanogaster was ancestrally an X‐chromosome that has reverted to an autosome (Vicoso

3

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

64 and Bachtrog 2013; Vicoso and Bachtrog 2015). This sex‐chromosome reversal is likely to 65 explain that the 4th chromosome in D. melanogaster is the subject of chromosome‐specific 66 targeting and regulatory mechanisms mediated by the protein Painting of fourth (POF) 67 (Larsson et al. 2001; Larsson et al. 2004; Johansson et al. 2007a; Kim et al. 2018a). We have 68 previously hypothesised that POF and its stimulatory function became trapped on the 4th 69 chromosome when the latter reverted to being an autosome (Kim et al. 2018a). The 70 stimulatory effect on transcript output mediated by POF is comparable in level to the 71 compensation mediated by the MSL complex on the X‐chromosome. The hypothesised origin 72 of POF as a dosage compensation system is also supported by the fact that the POF 73 mediated compensation of the 4th chromosome is essential for the survival of flies with a 74 monosomy of chromosome 4 (Johansson et al. 2007a). The 4th chromosome, targeted by 75 POF, has several unique characteristics. It is the smallest chromosome in the Drosophila 76 genome, it is replicated late, and in principle, the entire 4th chromosome can be considered 77 heterochromatic (Riddle and Elgin 2006; Filion et al. 2010). This means that the 4th 78 chromosome is enriched in HP1a, the histone methyltransferase Setdb1, as well as 79 methylated H3K9 (Riddle et al. 2011; Figueiredo et al. 2012).

80 Accurate targeting and the precise stimulatory effect of these chromosome‐specific systems 81 require the contributions of several factors to be well coordinated. In addition to specific 82 DNA sequence motifs mentioned above, these may include distinct DNA topology and spatial 83 positioning of the dosage‐compensated chromosomes (Samata and Akhtar 2018; Valsecchi 84 et al. 2020) as well as the interaction between the chromo‐domain of MSL3 and histone H3 85 tri‐methylated at lysine 36 (H3K36me3). It was reported that in Set21 mutants, with reduced 86 levels of H3K36me3, the localization of MSL2 and MSL3 toward gene ends was impaired 87 (Larschan et al. 2007) and that the interaction between MSL3 and H3K36me3 helps the MSL 88 complex to bind active genes (Larschan et al. 2007; Sural et al. 2008). However, structural 89 studies suggest that the chromo‐domain of MSL3 has higher affinity to the more abundant 90 histone H4 mono‐ and di‐methylated at lysine 20 (H4K20) (Kim et al. 2010; Moore et al. 91 2010); the rival hypothesis contends that the MSL3 chromo‐domain interacts with 92 methylated H4K20 to present H4 tails for acetylation at lysine 16 by MOF (Moore et al. 93 2010). Therefore, the extent and mechanisms by which H3K36 methylation and/or

4

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

94 associated methyltransferases contribute to balanced transcriptional output from the male X 95 and the 4th chromosomes remains an open question.

96 Three evolutionarily conserved Drosophila : Absent, small or homeotic discs 1 97 (Ash1), SET domain containing 2 (Set2) and Nuclear receptor binding SET domain containing 98 protein (NSD), are believed to methylate H3K36. In vitro, Ash1 can add one or two methyl 99 groups to H3K36 (Tanaka et al. 2007) and this activity is further enhanced by Mrg15 and Caf1 100 proteins with which Ash1 forms a complex (Huang et al. 2017; Schmahling et al. 2018; Hou et 101 al. 2019; Lee et al. 2019). Flies lacking ash1 function show an approximated two‐fold 102 reduction of bulk H3K36me1 but no detectable loss of overall H3K36me2 or H3K36me3 103 (Schmahling et al. 2018; Dorafshan et al. 2019). These ash1 mutants display multiple 104 homeotic transformations and die at the larval stage due to erroneous repression of 105 developmental genes by Polycomb group mechanisms (Tripoulas et al. 1994; Klymenko and 106 Muller 2004; Dorafshan et al. 2019). The Set2 protein can methylate H3K36 in vitro using 107 mono‐ and di‐methylated H3K36 as a substrate (Bell et al. 2007). Whether it can also add 108 methyl groups to unmethylated H3K36 is not entirely clear although it has been 109 demonstrated that the human orthologue (SETD2) is able to do so in reconstituted reactions 110 (Yuan et al. 2009). Consistently, the Drosophila Set2 mutants have 10‐fold lower levels of 111 H3K36me3 overall and at specific genes (Larschan et al. 2007; Dorafshan et al. 2019) but 112 display no major changes in the bulk H3K36me2 and H3K36me1 (Dorafshan et al. 2019). 113 Taken together, these observations suggest that Set2 produces most of the Drosophila 114 H3K36me3 and may contribute to cellular pools of H3K36me1 and H3K36me2. The loss of 115 Set2 function is lethal (Larschan et al. 2007; Dorafshan et al. 2019), but the corresponding 116 mutant shows no signs of excessive Polycomb repression (Dorafshan et al. 2019) suggesting 117 that Set2 and Ash1 affect Drosophila development in distinct ways. Less is known about the 118 biochemical properties of the fly NSD. It has three closely related orthologues in mammals 119 (NSD1, NSD2, NSD3) whose methyltransferase activity in vitro has been extensively studied. 120 These studies concur that NSD proteins can mono‐ and di‐methylate H3K36 (Rayasam et al. 121 2003; Li et al. 2009; Kudithipudi et al. 2014). However, methylation of other substrates is 122 also suggested. A systematic screen indicates that these include K168 of histones H1.5 and 123 H1.2, K169 of histone H1.3, K44 of histone H4 as well as K1033 of the chromatin re‐modeller 124 ATRX and K189 of the U3 small nucleolar RNA‐associated protein 11 (Kudithipudi et al.

5

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

125 2014). NSD proteins are essential for proper mammalian development and 126 haploinsufficiencies in the NSD1 and NSD2 genes were linked to Sotos and Wolf‐Hirschhorn 127 genetic syndromes, respectively (Kurotaki et al. 2002). Whether methylation of H3K36 or any 128 of the substrates above are relevant for the developmental functions of the NSD proteins 129 remains an open question. Unlike its mammalian counterparts, Drosophila NSD loss of 130 function mutants are viable, fertile, and show no obvious morphological defects (Dorafshan 131 et al. 2019). Although early experiments in cultured cells suggested that NSD knock‐down 132 reduces bulk levels of H3K36me2 and H3K36me3 (Bell et al. 2007), the NSD loss of function 133 mutants display no obvious changes in the overall levels of H3K36me1, H3K36me2 and 134 H3K36me3 (Dorafshan et al. 2019).

135 Whether H3K36 is the only (or even the major) physiological substrate is a question that 136 appears equally relevant for Ash1 and Set2. One might expect that animals in which H3K36 is 137 replaced with an amino acid that cannot be methylated, will display defects similar to those 138 seen in mutants that lack the corresponding methyltransferase. Mounting experimental 139 evidence suggests the contrary. Thus complete substitution of the zygotic H3 with a variant 140 in which K36 is replaced with arginine (R) does not cause excessive repression of homeotic 141 genes seen in ash1 mutants (Dorafshan et al. 2019). Likewise, flies with replication‐coupled 142 histone H3.2 replaced with the H3K36R variant, display no cryptic transcription initiation or 143 altered splice site choice as reported for yeast Set2 or human SETD2 mutants (Meers et al. 144 2017).

145 To investigate the role of H3K36 and the associated histone methyltransferases in 146 chromosome‐specific gene regulation we evaluated relative levels of methylated H3K36 on 147 individual chromosome arms and compared those to transcriptional imbalance caused by 148 the loss of Set2, NSD and Ash1 or the replacement of the histone H3.2 with a variant in 149 which lysine 36 is substituted to arginine.

150 Confirming previous reports, we found that loss of Set2 impairs MSL complex mediated 151 dosage compensation. However, the effect is not recapitulated in H3K36R mutants and 152 suggests an alternative target for Set2. This implies that the model in which Set2‐mediated 153 H3K36me3 helps the MSL complex to bind active genes may need to be revised. 154 Unexpectedly, our results indicate that the balanced transcriptional output from the 4th

6

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

155 chromosome depends on the additive contribution from Ash1 and NSD and requires intact 156 H3K36. We conclude that H3K36 methylation and the associated methyltransferases are 157 important factors for balanced transcriptional output of the male X‐chromosome and the 4th 158 chromosome. Our study also emphasises the importance of pleiotropic effects of these 159 enzymes.

160 Results

161 H3K36me3 is reduced on the male X‐chromosome and enriched on the 4th chromosome 162 Drosophila chromosome‐specific regulatory systems are accompanied by chromosome‐ 163 specific enrichment of specific histone modifications. The male X‐chromosome is enriched in 164 H4K16ac (Turner et al. 1992) and the 4th chromosome is enriched in H3K9me2/me3 (Czermin 165 et al. 2002). Considering the proposed role of methylated H3K36 in dosage compensation, 166 we decided to analyse methylation of H3K36 and the associated methyltransferases further. 167 Immunostaining of polytene chromosomes from male third instar larvae with antibodies 168 against mono‐ di‐ and tri‐methylated H3K36 shows that these modifications are differentially 169 distributed between the chromosome arms. Mono‐methylated H3K36 (H3K36me1) was 170 evenly distributed in a banded pattern throughout the entire genome (Supplementary Fig. 171 S1A). Di‐methylated H3K36 (H3K36me2) was also ubiquitous genome‐wide, but the banding 172 pattern of H3K36me2 shows only minor overlaps with H3K36me1 (Supplementary Figs. S1A, 173 B). A noticeable difference is the increased enrichment of H3K36me2 in the pericentromeric 174 heterochromatin as well as on the heterochromatic fourth chromosome compared to the 175 major chromosome arms (Supplementary Fig. S1A). Increased enrichment on the fourth 176 chromosome was also consistently observed for H3K36me3 (Fig. 1A) but unlike H3K36me2, 177 less relative enrichment in pericentromeric heterochromatin was observed for H3K36me3 178 (Supplementary Fig. S1C). Unexpectedly, H3K36me3 appeared less abundant on the male X‐ 179 chromosome (Fig. 1A, Supplementary Fig. S1C) compared to the autosomes. Importantly, 180 weaker anti‐H3K36me3 immunostaining of the X‐chromosome was not observed in females 181 (Fig. 1B). In males, the number of chromatids of the polytene X‐chromosome is half that of 182 the autosomes. This difference may account for some of the reduced staining observed. 183 Nevertheless, the decrease in H3K36me3 staining is still more pronounced than that seen for 184 H3K36me1 (Supplementary Fig. S1A) or H3K36me2 (Supplementary Fig. S1C). This argues

7

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

185 that the male polytene X‐chromosome has a lower level of H3K36me3 than the autosomes. 186 The apparent lower level of H3K36me3 on the male X‐chromosome, together with its 187 enrichment on the fourth chromosome, is at odds with the model in which Set2‐mediated 188 H3K36me3 helps the MSL complex to bind active genes (Larschan et al. 2007; Sural et al. 189 2008). We, therefore, investigated the question further.

190 Chromosome‐specific differences in tri‐methylated H3K36 are established during early 191 development 192 To validate and quantify our observations in polytene tissue, we analysed H3K36me3 ChIP‐ 193 chip data from third instar D. melanogaster larvae (mixed sexes) available from modENCODE 194 (mod et al. 2010) as well as the more recent H3K36me3 ChIP‐seq data (Prayitno et al. 2019). 195 Since H3K36me3 is predominantly enriched within the coding regions of genes (Schwartz et 196 al. 2009), we calculated relative enrichment values of H3K36me3 within exons for each gene. 197 The average ChIP signals divided by chromosomes confirmed our observation from the 198 polytene chromosome stainings. We found that H3K36me3 ChIP signals within gene bodies 199 in 3rd instar larvae were higher on the fourth chromosome and lower on the male X‐ 200 chromosome as compared to the autosomes (Fig. 1C). Analysis of H3K36me3 in late 201 embryonic stage (mixed sexes) ChIP‐Seq data (Prayitno et al. 2019) showed the same 202 distribution pattern (Fig. 1D).

203 Next, we analysed the H3K36me3 abundance over developmental time by looking at 204 additional ChIP datasets from different developmental stages (mod et al. 2010; Prayitno et 205 al. 2019). ChIP‐chip data in early embryos (2 – 4 h) indicate that the amount of H3K36me3 is 206 uniformly the same for all chromosomes (Fig. 1G). In contrast, the ChIP‐seq data show 207 reduced H3K36me3 ChIP signal on the X‐chromosome, even at this early stage (Fig. 1F). In a 208 late embryonic stage (14 ‐ 16 h), the fourth chromosome shows significantly stronger 209 immunoprecipitation with H3K36me3 antibodies in both ChIP assays (Figs. 1D, E). However, 210 only the ChIP‐seq data show a reduced ChIP signal on the X‐chromosome (Fig. 1D). By the 3rd 211 instar larval stage, as shown in Fig. 1C, the ChIP‐signal on the X‐chromosome is lower than 212 on the autosomes and the signal on the fourth is even more pronounced.

213 The level of H3K36me3 within genes correlates with the degree of transcriptional activity 214 and gene length

8

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

215 We and others have shown that the two chromosome‐specific systems studied here affect 216 genes differently depending on gene type, expression levels, gene length and replication 217 timing (Schubeler et al. 2002; Gilfillan et al. 2006; Stenberg et al. 2009; Regnard et al. 2011; 218 Lundberg et al. 2013b; Philip and Stenberg 2013; Lubelsky et al. 2014; Lee et al. 2016; Kim et 219 al. 2018b; Prayitno et al. 2019; Belyi et al. 2020; Ekhteraei‐Tousi et al. 2020). We therefore 220 determined how these criteria relate to the levels of H3K36me3 within genes. We found that 221 the H3K36me3 ChIP signal at the gene level positively correlates with wildtype transcript 222 abundance (Fig. 2A) and longer genes accumulate more H3K36me3 compared to short genes 223 (Fig. 2B). We also observed higher enrichment of H3K36me3 in early replicating genes, 224 compared to late replicating genes, on all chromosomes (Fig. 2C). Genome‐wide 225 housekeeping genes were also significantly more enriched by H3K36me3 than non‐ 226 housekeeping genes (Fig. 2D).

227 Set2 is required for balanced transcription of X‐linked genes bound by the MSL complex 228 The level of H3K36me3 within genes correlates with their transcriptional activity, and Set2 229 has been reported to help in maintaining the transcriptional balance of the male X‐ 230 chromosome. Yet the level of H3K36me3 on this chromosome appears to be lower than it is 231 on autosomes. Puzzled by this paradox, we decided to analyse transcriptional outputs from 232 each chromosome in mutants of the three known H3K36 specific methyltransferases: Set2, 233 NSD and Ash1. In flies, Set2 is responsible for most of the H3K36me3 while the potential 234 division of labour and redundancy in H3K36 methylation mediated by NSD and Ash1 is not 235 fully understood. As reported previously (Alekseyenko et al. 2014), we found that NSD is 236 enriched in pericentromeric heterochromatin and on the 4th chromosome on polytene 237 chromosomes (Supplementary Fig. S2), with a staining pattern similar to HP1a and to the 238 H3K36me2 enrichment (Supplementary Fig. S1B) (James et al. 1989; Figueiredo et al. 2012; 239 Alekseyenko et al. 2014).

240 It has been reported that in Set21 mutants the MSL complex correctly targets the male X‐ 241 chromosome but with reduced binding levels to its target genes (Larschan et al. 2007). The 242 Set21 allele corresponds to a deletion of the N‐terminal half of its open reading frame, 243 including the catalytic SET domain (Larschan et al. 2007) and leads to approximately a 10‐ 244 fold reduction in global H3K36me3 (Larschan et al. 2007; Dorafshan et al. 2019). 245 Homozygous Set21, but not ash122/ash19011 or NSDds46 (see below), displays a significant

9

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

246 relative decrease in transcriptional output from the male X‐chromosome (Fig. 3A) and a 247 nearly complete loss of H3K36me3 staining on polytene chromosomes (Supplementary Fig. 248 S3). By dividing genes according to their wildtype transcript levels we found that highly 249 transcribed genes are those most affected by loss of Set2 (Fig. 3B). We have previously 250 shown that highly transcribed genes require high levels of MSL complex binding (Kim et al. 251 2018b). We therefore asked if the reduced expression observed in Set21 correlates with 252 binding levels of the MSL complex. To address this question, all X‐chromosome genes were 253 divided into five bins based on their immunoprecipitation with antibodies against the MSL1 254 subunit of the MSL complex (Kind et al. 2008; Philip and Stenberg 2013). Thus, bin 1 included 255 unbound and weakly bound genes, while bin 5 included genes highly enriched in MSL 256 proteins. As illustrated in Fig. 3C, genes binding MSL complex tend to show reduced 257 transcript abundance upon Set2 loss. To further substantiate this result we grouped the 258 genes by distance to the nearest High Affinity Site (HAS) since it has been shown that genes 259 more distal to HAS are less sensitive to loss of a functional MSL complex (Straub et al. 2008; 260 Kim et al. 2018b). As expected, we observed that genes close to a HAS are, on average, more 261 affected by the loss of Set2 compared to genes further away. Genes more distal than about 262 30 kb from a HAS are essentially unaffected by the loss of Set2 (Fig. 3D). This is in line with 263 what we have observed after ablation of the roX1 roX2 non‐coding RNAs (Kim et al. 2018b). 264 We conclude that the reduced transcript levels on the male X‐chromosome observed in 265 Set21 corresponds well to a dysfunction in MSL complex mediated dosage compensation.

266 H3K36R mutants show no imbalance of transcriptional output from the male X‐chromosome 267 Loss of Set2 function has been shown to cause reduced MSL complex binding to target genes 268 (Larschan et al. 2007) and our results indicate that the Set21 mutation causes a relative 269 decrease in transcriptional output from the male X‐chromosome in a pattern that mimics 270 impaired function of the MSL complex. It is tempting to speculate that the loss of Set2 leads 271 to lower H3K36me3 which, in turn, impairs the binding of MSL3 (Larschan et al. 2007; Sural 272 et al. 2008). However, the interaction between H3K36me3 and the chromo‐domain of MSL3, 273 proposed for this mechanism, is at odds with structural data (Kim et al. 2010; Moore et al. 274 2010) and our previous proximity ligation analysis did not detect a direct interaction 275 between MSL3 and H3K36me3 (Lindehell et al. 2015). If the reduction of the transcriptional 276 output from the male X‐chromosome observed in the Set21 mutants is caused by a

10

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

277 concomitant reduction in H3K36me3, we would expect to see a similar transcriptional 278 imbalance in the H3K36R histone replacement mutant, ΔHisC; 12xH3K36R. These mutant flies 279 carry the deletion of the histone gene cluster ΔHisC combined with a transgenic construct 280 carrying twelve copies of the 5 kb histone repeat unit in which H3.2K36 is mutated to 281 arginine (12xH3K36R) (Gunesdogan et al. 2010; McKay et al. 2015).

282 Polytene chromosomes of the ΔHisC; 12xH3K36R flies display a dramatic decrease of 283 immunostaining with antibodies against H3K36me3 (Fig. 4A). This is in accordance to what 284 has previously been shown (McKay et al. 2015) and is similar to that seen in Set21 mutants 285 (Supplementary Fig. S3). Western blot analysis showed that H3K36me3 methylation is, as 286 expected, strongly reduced in ΔHisC; 12xH3K36R flies (Fig. 4B). Notably, some H3K36me3 signal 287 is still detected. This may correspond to the tri‐methylated H3.3 variant histone still present 288 in ΔHisC; 12xH3K36R flies; but we cannot exclude the possibility that some of it may be caused 289 by antibody cross‐reactivity. Despite the five‐fold reduction in the overall H3K36me3, we 290 observed no X‐chromosome specific transcriptional imbalance in ΔHisC; 12xH3K36R animals 291 (Fig. 4C), which suggests that imbalance in transcription of genes on male X‐chromosome 292 after Set2 knock‐out is not linked to a concomitant loss of H3K36 methylation. The 293 interpretation of histone replacement experiments may, however, be less straightforward if 294 it is the un‐methylated H3K36 that is required for the process, which Set2 may aim to 295 prevent. Thus, recent structural studies suggest that unmodified H3K36 is required for 296 unimpeded methylation of lysine 27 of histone H3 (H3K27) by the Polycomb Repressive 297 Complex 2 (PRC2) (Jani et al. 2019; Finogenova et al. 2020). Consistently, flies in which the 298 zygotic histone H3.2 is replaced with a variant that contains arginine instead of K36, show 299 mild defects in Polycomb repression of homeotic selector genes (Finogenova et al. 2020). 300 We, therefore, considered a possibility that reduced activity of PRC2 in the ΔHisC; 12xH3K36R 301 mutant may lead to an increased transcription of the X‐chromosome genes, which in turn, 302 may mask the effect of the Set2 knock‐out. To address this issue, we took advantage of the 303 RNA‐seq data from (Lee et al. 2015) where the PRC2 function was disrupted by a 304 temperature shift of a cell line homozygous for the temperature sensitive E(z)61 allele (Jones 305 and Gelbart 1990; Lee et al. 2015). We calculated transcript abundance ratios for all genes 306 divided by chromosomes and found the transcription of genes on the X‐chromosome 307 relative to autosomes to be reduced, not increased (Fig. 4D). Whether this relative reduction

11

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

308 is a consequence of global de‐repression of low and non‐expressed genes as noted 309 previously (Lee et al. 2015) remains an open question which we have not attempted to 310 investigate further at this stage. Nevertheless, we find no evidence that the lack of an X‐ 311 chromosome specific effect upon H3K36R replacement is caused by potential crosstalk with 312 PRC2. Taken together, our results argue that the reduced transcription of dosage 313 compensated genes after Set2 knock‐out is not linked to a concomitant loss of H3K36 314 methylation.

315 We next considered the possibility that methylation of variant histone H3 (H3.3) is critical for 316 balanced transcriptional output from the male X‐chromosome and the regulatory 317 mechanism mediated by Set2 methylation. It has previously been shown that the replication 318 independent histone variant H3.3 is enriched on the male X‐chromosome (Mito et al. 2005). 319 We therefore analysed gene transcription in flies with both H3.3 genes deleted, ΔH3.3B; 320 ΔH3.3A. Despite the reported enrichment of H3.3 on the male X‐chromosome we observed 321 no reduction in X‐chromosome transcript abundance as compared to autosomes in the 322 ΔH3.3B; ΔH3.3A double mutant (Fig. 4E).

323 A minor chromosome‐specific effect on a subset of genes could be masked when all genes 324 are analysed together. Intriguingly, when the genes were binned into subcategories, 325 interesting trends emerged. We noted that genes with moderate to high transcription levels 326 were more sensitive to the loss of H3.3. However, this trend was observed on the autosomes 327 as well as on the male X‐chromosome (Fig. 4F). Likewise, the transcript levels from 328 housekeeping genes were decreased as compared to non‐housekeeping genes (Fig. 4G). We 329 also observed a small but significant relative increase of transcript abundance from late 330 replicating genes (Supplementary Fig. S4). The most striking effect was in relation to gene 331 length, where we observed a clear positive correlation between the gene length and the 332 transcriptional changes in ΔH3.3B; ΔH3.3A mutants (Fig. 4H). We conclude that, despite the 333 relative enrichment of H3.3 on the male X‐chromosome, the loss of H3.3 has no 334 chromosome‐specific effect measurable in our assay. However, the proper transcription of 335 short, moderately to highly transcribed genes, is still dependent on H3.3 in a non 336 chromosome‐specific manner. Taken together, the lack of an X‐chromosome specific effect 337 in ΔHisC; 12xH3K36R as well as in ΔH3.3B; ΔH3.3A mutants argues that Set2 contributes to 338 increased expression of dosage compensated genes independently of H3K36 methylation.

12

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

339

340 The 3’‐UTR length correlates with transcriptional activity and amount of H3K36me3 within 341 genes 342 It has been shown that H3K36me3 levels positively correlate with the length of the 3’‐UTR in 343 Drosophila as well as in C. elegance (Pu et al. 2015). In agreement with this, we observed a 344 positive correlation of both the wildtype gene transcription (transcript abundance) and the 345 strength of H3K36me3 ChIP signal within gene bodies to the length of the 3’‐UTR (Fig. 5A, B). 346 Importantly, when we calculated the average length of the 3’‐UTR chromosome wise, we 347 found that the 4th chromosome and the X‐chromosome have 2.07 and 1.46 times longer 3’‐ 348 UTRs on average, compared to the autosomal average (Fig. 5C). We therefore asked whether 349 the decrease in the transcriptional output from the X‐chromosome genes in the Set21 350 mutants differs depending on UTR length, and compared it to those in the ΔHisC; 12xH3K36R 351 and the ΔH3.3B; ΔH3.3A mutants (Figs. 5D‐F). We found no correlation between the 352 decrease in the transcriptional output from the X‐chromosome genes in the Set21 mutants 353 and the length of their UTRs (Fig. 5D). However, genes with long UTRs showed increased 354 transcript abundance in the ΔH3.3B; ΔH3.3A mutants (Fig. 5F), which is in line with the 355 increased transcript abundance observed for long genes (Fig. 4H). We conclude that the two 356 chromosomes with chromosome‐specific gene regulatory mechanisms both have longer 357 average 3’UTRs, a characteristic that correlates with higher transcript abundance and higher 358 H3K36me3 ChIP signal.

359 Ash1 and NSD together maintain balanced transcriptional output from the 4th chromosome 360 Immunostaining of polytene chromosomes and ChIP indicate that the fourth chromosome is 361 enriched in both H3K36me2 (Supplementary Fig. S1A, C) and H3K36me3 (Fig. 1, 362 Supplementary Fig. S1C). Furthermore, the 4th chromosome genes have longer 3’‐UTRs, a 363 feature that correlated with higher immunoprecipitation with anti‐H3K36me3 antibodies 364 (Fig. 5A, C). Despite this, Set2 knock‐out caused no imbalance in transcription of 4th 365 chromosome genes compared to those on autosomes.

366 In contrast, the ash122/ash19011 and the NSDds46 mutants displayed a small but significant 367 imbalance in the transcriptional output from the 4th chromosome (Figs 6A, B). The effect was 368 markedly more pronounced in flies that lacked both ash1 and NSD functions (ash122

13

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

369 NSDds46/ash19011 NSD ds46) (Fig. 6C). By dividing the genes into housekeeping and non‐ 370 housekeeping classes we found that the bulk of the change corresponded to imbalanced 371 transcriptional output of the non‐housekeeping genes (Fig. 6D). Interestingly, the magnitude 372 of the drop in the relative transcript abundance of the non‐housekeeping genes and the 373 stronger effect on this class of genes resembled those observed in Pof mutants, the key 374 component of the regulatory mechanism specific to chromosome 4 (Stenberg et al. 2009; 375 Lundberg et al. 2013b). We conclude that Ash1 and NSD together are required for balanced 376 transcriptional output from chromosome 4. To test if the observed reduction in ash122 377 NSDds46/ash19011 NSD ds46 is a consequence of impaired methylation of H3K36, we included 378 ΔHisC; 12xH3K36R for comparison. Importantly, the expression levels from chromosome 4 379 dropped in the H3K36R histone replacement mutants, ΔHisC; 12xH3K36R, by a similar extent as 380 that in the ash1 NSD double mutant (Fig. 6E). Taken together these observations argue that 381 methylation of H3K36 by both Ash1 and NSD is required to balance transcriptional output of 382 the 4th chromosome.

383 Discussion 384 Two main conclusions can be drawn from the observations presented here. First, 385 transcriptome profiling of Set2 deficient animals indicates that this methyltransferase is 386 required for the balanced transcriptional output from the single male X‐chromosome. 387 However, the comparison with mutants where the major histone H3 isoform is replaced with 388 a variant that carries arginine instead of lysine 36, provides evidence that the process does 389 not involve methylation of H3K36. Second, the balanced transcriptional output from the 4th 390 chromosome requires intact H3K36 and depends on the additive functions of NSD and the 391 Trithorax group protein Ash1.

392 The latter conclusion was unexpected. We and others have previously shown that Set2 is 393 responsible for bulk of H3K36me3 (Larschan et al. 2007; Dorafshan et al. 2019). Consistently, 394 we see an almost complete loss of H3K36me3 signal on polytene chromosomes in Set21 395 mutants. In contrast, the loss of Set2 causes no major changes in the bulk H3K36me2 and 396 H3K36me1 (Dorafshan et al. 2019). We therefore speculate that Ash1 and NSD stimulate 397 transcription of the 4th chromosome genes by di‐methylation of H3K36. On polytene 398 chromosomes, H3K36me2 appears enriched in pericentric heterochromatin. These regions

14

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

399 (enriched in H3K36me2) coincide with the regions strongly enriched in H3K9me3 and HP1a 400 (James et al. 1989; Figueiredo et al. 2012). We have previously shown that HP1a and POF 401 together are involved in the chromosome‐specific regulation of the 4th chromosome 402 (Johansson et al. 2007a; Johansson et al. 2007b; Lundberg et al. 2013b). The reduced 403 transcriptional output from the 4th chromosome upon the loss of Ash1 and NSD suggests 404 that di‐methylated H3K36 may counteract methylation of H3K9 and/or HP1a binding and 405 repression, or that di‐methylated H3K36 somehow facilitates POF functions.

406 The 4th chromosome is enriched in repetitive DNA, and is therefore the Drosophila 407 chromosome most similar in genetic composition to the . In contrast to 408 Drosophila, human NSD proteins are essential for proper development, and 409 haploinsufficiencies in NSD1 and NSD2 have been linked to Sotos and Wolf‐Hirschhorn 410 genetic syndromes, respectively (Kurotaki et al. 2002). It is tempting to speculate that the 411 stimulatory effect of NSD proteins may be mechanistically similar but quantitatively more 412 important for mammalian genes embedded in the repeat‐rich genome. The Drosophila 4th 413 chromosome displays a high and unusual tolerance to dosage differences and mis‐ 414 expression (Hochman 1976; Johansson et al. 2007a; Johansson et al. 2007b; Stenberg et al. 415 2009; Johansson et al. 2012). Therefore, the decreased transcriptional output from the 4th 416 chromosome of the NSD mutants is not expected to have severe physiological 417 consequences.

418 Paradoxically, our observations indicate that H3K36me3 is less abundant on the male X‐ 419 chromosome while it is enriched on chromosome 4, yet it is not directly involved in either 420 chromosome‐specific regulatory system, suggesting that the differences represent a 421 consequence rather than a cause of these chromosomes‐specific functions. How could we 422 explain the observed differences in H3K36me3 levels? Although still a matter of discussion 423 and debate (Kuroda et al. 2016), it has been proposed that transcription elongation 424 efficiency is higher on the male X‐chromosome compared to autosomes (Larschan et al. 425 2011; Prabhakaran and Kelley 2012). Likewise, we have previously shown that there is a 426 significant reduction of the transcription elongation efficiency (elongation density index) on 427 the 4th chromosome compared to the other autosomes (Johansson et al. 2012). Considering 428 that Set2 travels with RNA polymerase II, we speculate that the faster elongation speeds on 429 the dosage‐compensated X‐chromosome, leave less time for Set2 to add three methyl

15

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

430 groups to H3K36 resulting in the overall lower H3K36me3 levels. Conversely, the slower 431 elongation on the 4th chromosome leads to more extensive H3K36 tri‐methylation.

432 H3K36me3 is associated to, and correlates with, gene expression; i.e., higher levels of 433 H3K36me3 within genes correlate with higher transcription output. We have also shown that 434 higher levels of H3K36me3 correlate with gene length and the length of the 3’‐UTRs. We 435 were intrigued to see that both chromosomes with specific regulatory systems have a 436 significantly increased average length of their genes’ 3’‐UTRs. Both for the X‐chromosome 437 and the autosomes, the H3K36me3 levels positively correlate with the length of 3’‐UTRs, but 438 for the X‐chromosome, the overall levels of H3K36me3 are lower. H3K36me3 levels are 439 higher on the 4th chromosome compared to the other autosomes which is in line with the 440 increased length of 3’‐UTRs on this chromosome. In fact, despite the 4th chromosome’s 441 heterochromatic nature, the average level of gene expression of chromosome 4 is 442 comparable to, or even higher than, that of genes on other chromosomes (Haddrill et al. 443 2008; Johansson et al. 2012).

444 Finally, our observations indicate that the loss of Set2 impairs the MSL complex mediated 445 dosage compensation while the H3.2K36R replacement does not. We hypothesise that Set2 446 exerts its functions by methylation of (as yet unknown) non‐histone targets similar to what 447 has recently been suggested for Ash1 (Dorafshan et al. 2019). How this contributes to 448 dosage compensation at the mechanistic level remains to be discovered. Nevertheless, our 449 findings imply that models that involve the binding of MSL complex to tri‐methylated H3K36 450 need to be revised.

451 Materials and methods

452 Fly strains and genetic crosses 453 All fly strains were cultivated and crossed at 25°C in vials containing potato mash‐yeast‐agar. 454 For the RNA preparations, brains from male third instar larvae were collected. Oregon R flies 455 were used as wildtype in all experiments except in the histone preplacement comparison as 456 described below. To generate the ash1 mutant larvae, ash122 flies (w/+; ash122, 457 P[w+mW.hs=FRT(whs) 2A]/TM3, Ser e P[w+mC ActGFP]) (Dorafshan et al. 2019) were crossed 458 with w1118; Df(3L)Exel9011/TM3, Ser e P[w+mC ActGFP], hereafter ash19011 (Dorafshan et al.

16

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

459 2019) and the ash122/ash19011 trans‐heterozygote first instar larvae were isolated, based on 460 the lack of a GFP signal under fluorescent stereomicroscopy, and grown to the third instar 461 larvae at 25°C. The CRISPR/Cas9 generated NSD loss of function allele (NSDds46) has 462 previously been described (Dorafshan et al. 2019) and the ash1 NSD double mutant larvae 463 (ash122 NSDds46/ash19011 NSD ds46), were generated as described previously (Dorafshan et al. 464 2019). To obtain Set2 mutant larvae, we used the Set21 loss of function allele (Larschan et al. 465 2011). Progeny from y1 w67c23 Set21/FM7c, P[GAL4‐Kr.C]DC1 P[UAS‐GFP.S65T]DC5 were 466 screened under a fluorescent stereomicroscope and non‐GFP larvae (homozygous and 467 hemizygous for Set21) were transferred to separate vials. To generate ΔHisC; 12xH3K36R and 468 ΔHisC; 12xH3K36K larvae, the fly strain w; UAS‐2xYFP ΔHisC/CyO, P[ftz‐lacZ] was crossed with 469 either y w; elav‐Gal4, ΔHisC/CyO; VK33[H3K36Rx12]/TM6B,Tb (to supplement the progeny 470 with the transgenic mutant Histone cluster where Lys36 residue of His3.2 was replaced by 471 Arg), or y w; elav‐Gal4, ΔHisC/CyO; VK33[H3K36Kx12]/TM6B,Tb (to supplement the progeny 472 with the transgenic wildtype Histone cluster) (McKay et al. 2015). The larvae with the correct 473 genotype were then selected based on the presence of a GFP signal (hence homozygous for 474 deletion of endogenous Histone cluster (ΔHisC)), and an absence of the Tb marker (hence 475 presence of the transgenic Histone cluster). The ΔH3.3B; ΔH3.3A mutant larvae (Hodl and 476 Basler 2009) were isolated based on the lack of a GFP signal under fluorescent 477 stereomicroscopy in the progeny from the cross w, ΔHis3.3B, hsp‐Flp; Df(2L)His3.3A/CyO, 478 P[w+mC ActGFP]JMR1 × w, ΔHis3.3B, hsp‐Flp/Y; Df(2L)His3.3A/CyO, P[w+mC ActGFP]JMR1.

479 Immunostaining of polytene chromosomes 480 Immunostaining of polytene chromosomes was done essentially as described previously 481 (Johansson et al. 2012; Lundberg et al. 2013a). The antibodies used in the protocol are listed 482 in Supplementary Table S1. Preparations were analysed using a Zeiss Axiophot microscope 483 (Plan Apochromat 40x/0.95 objective) equipped with a KAPPA DX20C CCD camera and with 484 Zeiss Apotome Microscope (Plan Apochromat 63x/1.40 oil DIC M27 objective, filters set: 485 63HE for red channel, 38HE for green channel and 49 for DAPI) equipped with AxioCam MR 486 R3 camera. For comparisons of targeting between different genotypes, the protocol was run 487 in parallel and nuclei with clear cytology were chosen on the basis of DAPI staining and 488 photographed. At least 20 nuclei per slide were used in these comparisons and at least five 489 slides per genotype. Images were processed with ZenPro software (v2.3, Zeiss).

17

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

490 Western blot 491 Total tissue extracts were prepared from hand‐dissected brains and imaginal discs of third 492 instar larvae. For analysis of the H3K36 methylation, protein extracts were loaded on a 15% 493 SDS–PAGE gel and blotted to a PVDF membrane for 60 minutes at 200 mA. Primary and 494 secondary antibodies were diluted in 1xPBS, 1% BSA, 0.05% Tween‐20. The antibodies used 495 in the protocol are listed in Supplementary Table S1.

496 Library Preparation 497 For each sample, five male third instar larvae brains were homogenised and RNA was 498 purified using the Directzol microprep kit (Zymo research) and using the standard protocol 499 with on‐column DNase treatment. RNA quality was determined with a Fragment Analyser 500 5200 (Agilent Technologies, Inc.) using the DNF‐471 Standard Sensitivity RNA reagent kit. 501 Total‐RNA libraries were synthesised using the Ovation RNA‐Seq system for Drosophila 502 (NuGEN). Standard protocol procedure with integrated DNase treatment was used to make 503 the sequencing libraries. Fragmentation was performed using a Covaris E220 Focused‐ 504 ultrasonicator with the recommended settings for 200 target length. The library 505 quality was evaluated on a Fragment Analyser using the DNF‐920 DNA reagent kit.

506 Sequencing and data analyses 507 In the first round, a total of five samples per genotype were sequenced (Oregon R, Set21, 508 ash122/ash19011, NSDds46, ash122 NSDds46/ash19011 NSD ds46, ΔHisC; 12xH3K36R and ΔHisC; 509 12xH3K36K) on Illumina HiSeq2500 (Illumina Cambridge Ltd.) by Science for Life Laboratory, 510 Stockholm and 125 bp paired‐end reads were obtained. In the second round, seven samples 511 of ΔH3.3B; ΔH3.3A and an additional six Oregon R samples were sequenced at Science for 512 Life Laboratory, Stockholm using Illumina NovaSeq SP (Illumina Cambridge Ltd.) with a 513 paired‐end read length of 150 bp. Reads from both rounds were mapped to the D. 514 melanogaster genome version BDGP6 using STAR v2.5.3 (Dobin et al. 2013) with default 515 settings. Read counts were obtained with featureCounts v1.5.1 with default settings (Liao et 516 al. 2014). Gene abundances were calculated using StringTie v1.3.3 (Pertea et al. 2015). 517 Replicate 3 of Set21 and replicate 5 of ΔHisC; 12xH3K36K did not pass the quality control in 518 RSeQC (Wang et al. 2012) and were therefore excluded in the downstream analysis. DESeq2 519 v1.18 (Love MI et al. 2014) was used for differential expression analysis using default 520 parameters and normal log fold change shrinkage.

18

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

521 Distance to High Affinity Sites (HAS) and MSL gene enrichments 522 Genomic locations for 188 previously published High Affinity Sites, compiled by (Philip and 523 Stenberg 2013), originating from (Alekseyenko et al. 2008) (Straub et al. 2008) were used in 524 the analysis. The distance to the closest HAS was calculated for each gene on the X‐ 525 chromosome. The gene enrichment data for the MSL complex component MSL1 were from 526 (Philip and Stenberg 2013). The binding data were subdivided into five equally sized bins 527 based on the average MSL1 enrichment, bin 1 comprising the genes with the lowest MSL1 528 enrichments, up to bin 5 which comprised genes with the highest MSL1 enrichments.

529 Classification of genes as early or late replicating 530 Genes were classified as early or late replicating as previously described (Kim et al. 2018b). 531 The data on early and late replicating genomic domains were kindly provided by David 532 MacAlpine (Lubelsky et al. 2014) and the coordinates were converted from Drosophila 533 genome release 5 to release 6 using flybase.org’s online coordinate converter. The 534 annotating tool from BEDTools (Quinlan and Hall 2010) was used to calculate the overlap 535 between genes and replication domains. Genes were classified as early or late replicating if 536 the entire transcript was within a domain classified as early or late replicating in the male S2 537 cell line.

538 Classification of genes as housekeeping or non‐housekeeping 539 Genes with >6 as the expression level in all 12 FlyAtlas‐specified tissue types (Chintapalli et 540 al. 2007) are defined as housekeeping genes and genes with >6 expression levels in 11 or 541 fewer tissue types are here defined as non‐housekeeping genes or differentially expressed 542 genes.

543 Wildtype gene expression 544 The average Transcript per million (TPM) score for wildtype samples (Oregon R) was 545 calculated for all genes and subsequently binned according to Flybase RNA‐Seq convention 546 (Thurmond et al. 2019). The bins are 0‐0 TPM, 1‐3 TPM, 4‐10 TPM, 11‐25 TPM, 26‐50 TPM, 547 51‐100 TPM, 101‐1000 TPM and >1000 TPM for Unexpressed, Very low expression, Low 548 expression, Moderate expression, Moderately high expression, High expression, Very high 549 expression and Extremely high expression, respectively.

550 Datasets and calculated gene enrichments

19

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

551 Raw and processed sequencing data generated were deposited to NCBI GEO under accession 552 GSE166934. ChIP‐chip data sets were obtained from GSE23457, GSE23458 and GSE32793. 553 ChIP‐seq data sets were obtained from GSE127177 (Prayitno et al. 2019) and, in processed 554 form, kindly provided by Tamás Schauer and Peter Becker. RNA‐seq data from the 555 temperature sensitive E(z)61 cell line were obtained from GSE61307 (Lee et al. 2015) The E(z)61 556 cell line was classified as male based on the expression of roX1, roX2, msl2 and traF.

557 Enrichment scores for the ChIP‐chip experiments is a log2 ratio of ChIP over input. A custom 558 script was used to calculate the top 50 percent of the total exonic region for each gene. For 559 ChIP‐seq the Homer suite v4.11 (Heinz et al. 2010) was used to obtain H3K36me3 peaks over 560 input. The peak score used for downstream analysis is defined as position adjusted reads from 561 initial peak region. Only peaks originating in exons were used when calculating the 562 chromosomal average.

563 Bioinformatics and plotting of the figures 564 All calculations after the compilation of the raw counts table were performed using R 565 (R_Core_Team 2016) and plots were generated using the ggplot2 R package (Wickham 566 2009).

567 Competing interest statement 568 The authors declare no competing interests.

569 Acknowledgments 570 We thank Tamás Schauer and Peter Becker for sharing H3K36me3 ChIP‐seq data; Dirk 571 Schübeler for the NSD antibody; Mitzi Kuroda, Gregory Matera and Konrad Basler for fly 572 strains. We also thank the Science for Life Laboratory, Stockholm, Sweden; the National 573 Genomics Infrastructure (NGI), Stockholm, Sweden; and UPPMAX, Uppsala, Sweden, for 574 assistance with the RNA sequencing and for providing the computational infrastructure. This 575 work was supported by grants from the Swedish Research Council (2016‐03306 to J.L. and 576 2017‐03918 to Y.B.S.) Knut and Alice Wallenberg Stiftelse (2014.0018 to J.L. and Y.B.S.), Nils 577 Erik Holmstens Forskningsstiftelse (to Y.B.S.) and Swedish Cancer Foundation (CAN 2017/342 578 to J.L.).

20

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

579 Author contributions: JL and HL conceived the project. HL performed the RNA‐seq 580 experiments and all bioinformatics. AG and JL performed the immunostainings and AG 581 performed the Western blots. HL and ED performed the fly genetics. HL, AG, ED, YBS and JL 582 together analysed the data. HL, YBS and JL wrote the manuscript with contributions from all 583 authors.

584 References

585 Akhtar A, Becker PB. 2000. Activation of transcription through histone H4 acetylation by MOF, an 586 acetyltransferase essential for dosage compensation in Drosophila. Mol Cell 5: 367‐375. 587 doi:10.1016/s1097‐2765(00)80431‐1 588 Alekseyenko AA, Gorchakov AA, Zee BM, Fuchs SM, Kharchenko PV, Kuroda MI. 2014. 589 Heterochromatin‐associated interactions of Drosophila HP1a with dADD1, HIPP1, and 590 repetitive RNAs. Genes Dev 28: 1445‐1460. doi:10.1101/gad.241950.114 591 Alekseyenko AA, Peng S, Larschan E, Gorchakov AA, Lee OK, Kharchenko P, McGrath SD, Wang CI, 592 Mardis ER, Park PJ et al. 2008. A sequence motif within chromatin entry sites directs MSL 593 establishment on the Drosophila X chromosome. Cell 134: 599‐609. 594 doi:10.1016/j.cell.2008.06.033 595 Bachtrog D. 2013. Y‐chromosome evolution: emerging insights into processes of Y‐chromosome 596 degeneration. Nat Rev Genet 14: 113‐124. doi:10.1038/nrg3366 597 Bell O, Wirbelauer C, Hild M, Scharf AN, Schwaiger M, MacAlpine DM, Zilbermann F, van Leeuwen F, 598 Bell SP, Imhof A et al. 2007. Localized H3K36 methylation states define histone H4K16 599 acetylation during transcriptional elongation in Drosophila. EMBO J 26: 4974‐4984. 600 doi:10.1038/sj.emboj.7601926 601 Belyi A, Argyridou E, Parsch J. 2020. The influence of chromosomal environment on X‐Linked gene 602 expression in Drosophila melanogaster. Genome Biol Evol 12: 2391‐2402. 603 doi:10.1093/gbe/evaa227 604 Chintapalli VR, Wang J, Dow JA. 2007. Using FlyAtlas to identify better Drosophila melanogaster 605 models of human disease. Nat Genet 39: 715‐720. doi:10.1038/ng2049 606 Czermin B, Melfi R, McCabe D, Seitz V, Imhof A, Pirrotta V. 2002. Drosophila enhancer of Zeste/ESC 607 complexes have a histone H3 methyltransferase activity that marks chromosomal polycomb 608 sites. Cell 111: 185‐196. doi:10.1016/S0092‐8674(02)00975‐3 609 Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR. 2013. 610 STAR: ultrafast universal RNA‐seq aligner. Bioinformatics 29: 15‐21. 611 doi:10.1093/bioinformatics/bts635 612 Dorafshan E, Kahn TG, Glotov A, Savitsky M, Walther M, Reuter G, Schwartz YB. 2019. Ash1 613 counteracts Polycomb repression independent of histone H3 lysine 36 methylation. EMBO 614 Rep 20. doi:10.15252/embr.201846762 615 Ekhteraei‐Tousi S, Lewerentz J, Larsson J. 2020. Painting of Fourth and the X‐Linked 1.688 satellite in 616 D. melanogaster is involved in chromosome‐wide gene regulation. Cells 9. 617 doi:10.3390/cells9020323 618 Figueiredo ML, Kim M, Philip P, Allgardsson A, Stenberg P, Larsson J. 2014. Non‐coding roX RNAs 619 prevent the binding of the MSL‐complex to heterochromatic regions. PLoS Genet 10: 620 e1004865. doi:10.1371/journal.pgen.1004865 621 Figueiredo ML, Philip P, Stenberg P, Larsson J. 2012. HP1a recruitment to promoters is independent 622 of H3K9 methylation in Drosophila melanogaster. PLoS Genet 8: e1003061. 623 doi:10.1371/journal.pgen.1003061

21

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

624 Filion GJ, van Bemmel JG, Braunschweig U, Talhout W, Kind J, Ward LD, Brugman W, de Castro IJ, 625 Kerkhoven RM, Bussemaker HJ et al. 2010. Systematic protein location mapping reveals five 626 principal chromatin types in Drosophila cells. Cell 143: 212‐224. 627 doi:10.1016/j.cell.2010.09.009 628 Finogenova K, Bonnet J, Poepsel S, Schafer IB, Finkl K, Schmid K, Litz C, Strauss M, Benda C, Muller J. 629 2020. Structural basis for PRC2 decoding of active histone methylation marks H3K36me2/3. 630 Elife 9. doi:10.7554/eLife.61964 631 Gilfillan GD, Straub T, de Wit E, Greil F, Lamm R, van Steensel B, Becker PB. 2006. Chromosome‐wide 632 gene‐specific targeting of the Drosophila dosage compensation complex. Genes Dev 20: 858‐ 633 870. doi:10.1101/gad.1399406 634 Gunesdogan U, Jackle H, Herzig A. 2010. A genetic system to assess in vivo the functions of histones 635 and histone modifications in higher eukaryotes. EMBO Rep 11: 772‐776. 636 doi:10.1038/embor.2010.124 637 Haddrill PR, Waldron FM, Charlesworth B. 2008. Elevated levels of expression associated with regions 638 of the Drosophila genome that lack crossing over. Biol Lett 4: 758‐761. 639 doi:10.1098/rsbl.2008.0376 640 Heinz S, Benner C, Spann N, Bertolino E, Lin YC, Laslo P, Cheng JX, Murre C, Singh H, Glass CK. 2010. 641 Simple combinations of lineage‐determining transcription factors prime cis‐regulatory 642 elements required for macrophage and B cell identities. Mol Cell 38: 576‐589. 643 doi:10.1016/j.molcel.2010.05.004 644 Hochman B. 1976. The fourth chromosome of Drosophila melanogaster. in The Genetics and biology 645 of Drosophila (eds. M Ashburner and E Novitski), pp. 903‐928. Academic Press, London, 646 United Kingdom. 647 Hodl M, Basler K. 2009. Transcription in the absence of histone H3.3. Curr Biol 19: 1221‐1226. 648 doi:10.1016/j.cub.2009.05.048 649 Hou P, Huang C, Liu CP, Yang N, Yu T, Yin Y, Zhu B, Xu RM. 2019. Structural insights into stimulation of 650 Ash1L's H3K36 methyltransferase activity through Mrg15 binding. Structure 27: 837‐845 651 e833. doi:10.1016/j.str.2019.01.015 652 Huang C, Yang F, Zhang Z, Zhang J, Cai G, Li L, Zheng Y, Chen S, Xi R, Zhu B. 2017. Mrg15 stimulates 653 Ash1 H3K36 methyltransferase activity and facilitates Ash1 Trithorax group protein function 654 in Drosophila. Nat Commun 8: 1649. doi:10.1038/s41467‐017‐01897‐3 655 James TC, Eissenberg JC, Craig C, Dietrich V, Hobson A, Elgin SC. 1989. Distribution patterns of HP1, a 656 heterochromatin‐associated nonhistone chromosomal protein of Drosophila. Eur J Cell Biol 657 50: 170‐180. 658 Jani KS, Jain SU, Ge EJ, Diehl KL, Lundgren SM, Muller MM, Lewis PW, Muir TW. 2019. Histone H3 tail 659 binds a unique sensing pocket in EZH2 to activate the PRC2 methyltransferase. Proc Natl 660 Acad Sci U S A 116: 8295‐8300. doi:10.1073/pnas.1819029116 661 Johansson AM, Stenberg P, Allgardsson A, Larsson J. 2012. POF regulates the expression of genes on 662 the fourth chromosome in Drosophila melanogaster by binding to nascent RNA. Mol Cell Biol 663 32: 2121‐2134. doi:10.1128/MCB.06622‐11 664 Johansson AM, Stenberg P, Bernhardsson C, Larsson J. 2007a. Painting of fourth and chromosome‐ 665 wide regulation of the 4th chromosome in Drosophila melanogaster. EMBO J 26: 2307‐2316. 666 doi:10.1038/sj.emboj.7601604 667 Johansson AM, Stenberg P, Pettersson F, Larsson J. 2007b. POF and HP1 bind expressed exons, 668 suggesting a balancing mechanism for gene regulation. PLoS Genet 3: e209. 669 doi:10.1371/journal.pgen.0030209 670 Jones RS, Gelbart WM. 1990. Genetic analysis of the enhancer of zeste locus and its role in gene 671 regulation in Drosophila melanogaster. Genetics 126: 185‐199. 672 Kelley RL, Meller VH, Gordadze PR, Roman G, Davis RL, Kuroda MI. 1999. Epigenetic spreading of the 673 Drosophila dosage compensation complex from roX RNA genes into flanking chromatin. Cell 674 98: 513‐522. doi:10.1016/S0092‐8674(00)81979‐0

22

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

675 Kim D, Blus BJ, Chandra V, Huang P, Rastinejad F, Khorasanizadeh S. 2010. Corecognition of DNA and 676 a methylated histone tail by the MSL3 chromodomain. Nat Struct Mol Biol 17: 1027‐1029. 677 doi:10.1038/nsmb.1856 678 Kim M, Ekhteraei‐Tousi S, Lewerentz J, Larsson J. 2018a. The X‐linked 1.688 satellite in Drosophila 679 melanogaster promotes specific targeting by Painting of fourth. Genetics 208: 623‐632. 680 doi:10.1534/genetics.117.300581 681 Kim M, Faucillion ML, Larsson J. 2018b. RNA‐on‐X 1 and 2 in Drosophila melanogaster fulfill separate 682 functions in dosage compensation. PLoS Genet 14: e1007842. 683 doi:10.1371/journal.pgen.1007842 684 Kind J, Vaquerizas JM, Gebhardt P, Gentzel M, Luscombe NM, Bertone P, Akhtar A. 2008. Genome‐ 685 wide analysis reveals MOF as a key regulator of dosage compensation and gene expression in 686 Drosophila. Cell 133: 813‐828. doi:10.1016/j.cell.2008.04.036 687 Klymenko T, Muller J. 2004. The histone methyltransferases Trithorax and Ash1 prevent 688 transcriptional silencing by Polycomb group proteins. EMBO Rep 5: 373‐377. 689 doi:10.1038/sj.embor.7400111 690 Kudithipudi S, Lungu C, Rathert P, Happel N, Jeltsch A. 2014. Substrate specificity analysis and novel 691 substrates of the protein lysine methyltransferase NSD1. Chem Biol 21: 226‐237. 692 doi:10.1016/j.chembiol.2013.10.016 693 Kuroda MI, Hilfiker A, Lucchesi JC. 2016. Dosage compensation in Drosophila‐a model for the 694 coordinate regulation of transcription. Genetics 204: 435‐450. 695 doi:10.1534/genetics.115.185108 696 Kurotaki N, Imaizumi K, Harada N, Masuno M, Kondoh T, Nagai T, Ohashi H, Naritomi K, Tsukahara M, 697 Makita Y et al. 2002. Haploinsufficiency of NSD1 causes Sotos syndrome. Nat Genet 30: 365‐ 698 366. doi:10.1038/ng863 699 Larschan E, Alekseyenko AA, Gortchakov AA, Peng S, Li B, Yang P, Workman JL, Park PJ, Kuroda MI. 700 2007. MSL complex is attracted to genes marked by H3K36 trimethylation using a sequence‐ 701 independent mechanism. Mol Cell 28: 121‐133. doi:10.1016/j.molcel.2007.08.011 702 Larschan E, Bishop EP, Kharchenko PV, Core LJ, Lis JT, Park PJ, Kuroda MI. 2011. X chromosome 703 dosage compensation via enhanced transcriptional elongation in Drosophila. Nature 471: 704 115‐118. doi:10.1038/nature09757 705 Larsson J, Chen JD, Rasheva V, Rasmuson‐Lestander A, Pirrotta V. 2001. Painting of fourth, a 706 chromosome‐specific protein in Drosophila. Proc Natl Acad Sci U S A 98: 6273‐6278. 707 doi:10.1073/pnas.111581298 708 Larsson J, Svensson MJ, Stenberg P, Mäkitalo M. 2004. Painting of fourth in genus Drosophila 709 suggests autosome‐specific gene regulation. Proc Natl Acad Sci U S A 101: 9728‐9733. 710 doi:10.1073/pnas.0400978101 711 Lee H, Cho DY, Whitworth C, Eisman R, Phelps M, Roote J, Kaufman T, Cook K, Russell S, Przytycka T 712 et al. 2016. Effects of gene dose, chromatin, and network topology on expression in 713 Drosophila melanogaster. PLoS Genet 12: e1006295. doi:10.1371/journal.pgen.1006295 714 Lee HG, Kahn TG, Simcox A, Schwartz YB, Pirrotta V. 2015. Genome‐wide activities of Polycomb 715 complexes control pervasive transcription. Genome Res 25: 1170‐1181. 716 doi:10.1101/gr.188920.114 717 Lee Y, Yoon E, Cho S, Schmahling S, Muller J, Song JJ. 2019. Structural basis of MRG15‐mediated 718 activation of the ASH1L histone methyltransferase by releasing an autoinhibitory loop. 719 Structure 27: 846‐852 e843. doi:10.1016/j.str.2019.01.016 720 Li Y, Trojer P, Xu CF, Cheung P, Kuo A, Drury WJ, 3rd, Qiao Q, Neubert TA, Xu RM, Gozani O et al. 721 2009. The target of the NSD family of histone lysine methyltransferases depends on the 722 nature of the substrate. J Biol Chem 284: 34283‐34295. doi:10.1074/jbc.M109.034462 723 Liao Y, Smyth GK, Shi W. 2014. featureCounts: an efficient general purpose program for assigning 724 sequence reads to genomic features. Bioinformatics 30: 923‐930. 725 doi:10.1093/bioinformatics/btt656

23

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

726 Lindehell H, Kim M, Larsson J. 2015. Proximity ligation assays of protein and RNA interactions in the 727 male‐specific lethal complex on Drosophila melanogaster polytene chromosomes. 728 Chromosoma 124: 385‐395. doi:10.1007/s00412‐015‐0509‐x 729 Lubelsky Y, Prinz JA, DeNapoli L, Li Y, Belsky JA, MacAlpine DM. 2014. DNA replication and 730 transcription programs respond to the same chromatin cues. Genome Res 24: 1102‐1114. 731 doi:10.1101/gr.160010.113 732 Lundberg LE, Kim M, Johansson AM, Faucillion ML, Josupeit R, Larsson J. 2013a. Targeting of Painting 733 of fourth to roX1 and roX2 proximal sites suggests evolutionary links between dosage 734 compensation and the regulation of the fourth chromosome in Drosophila melanogaster. G3 735 3: 1325‐1334. doi:10.1534/g3.113.006866 736 Lundberg LE, Stenberg P, Larsson J. 2013b. HP1a, Su(var)3‐9, SETDB1 and POF stimulate or repress 737 gene expression depending on genomic position, gene length and expression pattern in 738 Drosophila melanogaster. Nucleic Acids Res 41: 4481‐4494. doi:10.1093/nar/gkt158 739 Mank JE. 2013. Sex chromosome dosage compensation: definitely not for everyone. Trends Genet 29: 740 677‐683. doi:10.1016/j.tig.2013.07.005 741 McKay DJ, Klusza S, Penke TJ, Meers MP, Curry KP, McDaniel SL, Malek PY, Cooper SW, Tatomer DC, 742 Lieb JD et al. 2015. Interrogating the function of metazoan histones using engineered gene 743 clusters. Dev Cell 32: 373‐386. doi:10.1016/j.devcel.2014.12.025 744 Meers MP, Henriques T, Lavender CA, McKay DJ, Strahl BD, Duronio RJ, Adelman K, Matera AG. 2017. 745 Histone gene replacement reveals a post‐transcriptional role for H3K36 in maintaining 746 metazoan transcriptome fidelity. Elife 6. doi:10.7554/eLife.23249 747 Mito Y, Henikoff JG, Henikoff S. 2005. Genome‐scale profiling of histone H3.3 replacement patterns. 748 Nat Genet 37: 1090‐1097. doi:10.1038/ng1637 749 mod EC, Roy S, Ernst J, Kharchenko PV, Kheradpour P, Negre N, Eaton ML, Landolin JM, Bristow CA, 750 Ma L et al. 2010. Identification of functional elements and regulatory circuits by Drosophila 751 modENCODE. Science 330: 1787‐1797. doi:10.1126/science.1198374 752 Moore SA, Ferhatoglu Y, Jia Y, Al‐Jiab RA, Scott MJ. 2010. Structural and biochemical studies on the 753 chromo‐barrel domain of male specific lethal 3 (MSL3) reveal a binding preference for mono‐ 754 or dimethyllysine 20 on histone H4. J Biol Chem 285: 40879‐40890. 755 doi:10.1074/jbc.M110.134312 756 Oliver B. 2007. Sex, dose, and equality. PLoS Biol 5: e340. doi:10.1371/journal.pbio.0050340 757 Pertea M, Pertea GM, Antonescu CM, Chang TC, Mendell JT, Salzberg SL. 2015. StringTie enables 758 improved reconstruction of a transcriptome from RNA‐seq reads. Nat Biotechnol 33: 290‐ 759 295. doi:10.1038/nbt.3122 760 Philip P, Stenberg P. 2013. Male X‐linked genes in Drosophila melanogaster are compensated 761 independently of the Male‐Specific Lethal complex. Epigenet Chromatin 6: 35. 762 doi:10.1186/1756‐8935‐6‐35 763 Prabhakaran M, Kelley RL. 2012. Mutations in the transcription elongation factor SPT5 disrupt a 764 reporter for dosage compensation in Drosophila. PLoS Genet 8: e1003073. 765 doi:10.1371/journal.pgen.1003073 766 Prayitno K, Schauer T, Regnard C, Becker PB. 2019. Progressive dosage compensation during 767 Drosophila embryogenesis is reflected by gene arrangement. EMBO Rep 20: e48138. 768 doi:10.15252/embr.201948138 769 Pu M, Ni Z, Wang M, Wang X, Wood JG, Helfand SL, Yu H, Lee SS. 2015. Trimethylation of Lys36 on H3 770 restricts gene expression change during aging and impacts life span. Genes Dev 29: 718‐731. 771 doi:10.1101/gad.254144.114 772 Quinlan AR, Hall IM. 2010. BEDTools: a flexible suite of utilities for comparing genomic features. 773 Bioinformatics 26: 841‐842. doi:10.1093/bioinformatics/btq033 774 R_Core_Team. 2016. R: A language and environment for statistical computing. R Foundation for 775 Statistical Computing, Vienna, Austria.

24

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

776 Rayasam GV, Wendling O, Angrand PO, Mark M, Niederreither K, Song L, Lerouge T, Hager GL, 777 Chambon P, Losson R. 2003. NSD1 is essential for early post‐implantation development and 778 has a catalytically active SET domain. EMBO J 22: 3153‐3163. doi:10.1093/emboj/cdg288 779 Regnard C, Straub T, Mitterweger A, Dahlsveen IK, Fabian V, Becker PB. 2011. Global analysis of the 780 relationship between JIL‐1 kinase and transcription. PLoS Genet 7: e1001327. 781 doi:10.1371/journal.pgen.1001327 782 Riddle NC, Elgin SCR. 2006. The dot chromosome of Drosophila: Insights into chromatin states and 783 their change over evolutionary time. Chromosome Res 14: 405‐416. doi:10.1007/s10577‐006‐ 784 1061‐6 785 Riddle NC, Minoda A, Kharchenko PV, Alekseyenko AA, Schwartz YB, Tolstorukov MY, Gorchakov AA, 786 Jaffe JD, Kennedy C, Linder‐Basso D et al. 2011. Plasticity in patterns of histone modifications 787 and chromosomal proteins in Drosophila heterochromatin. Genome Res 21: 147‐163. 788 doi:10.1101/gr.110098.110 789 Samata M, Akhtar A. 2018. Dosage compensation of the X chromosome: A complex epigenetic 790 assignment involving chromatin regulators and long noncoding RNAs. Annu Rev Biochem 87: 791 323‐350. doi:10.1146/annurev‐biochem‐062917‐011816 792 Schmahling S, Meiler A, Lee Y, Mohammed A, Finkl K, Tauscher K, Israel L, Wirth M, Philippou‐Massier 793 J, Blum H et al. 2018. Regulation and function of H3K36 di‐methylation by the trithorax‐group 794 protein complex AMC. Development 145. doi:10.1242/dev.163808 795 Schubeler D, Scalzo D, Kooperberg C, van Steensel B, Delrow J, Groudine M. 2002. Genome‐wide DNA 796 replication profile for Drosophila melanogaster: a link between transcription and replication 797 timing. Nat Genet 32: 438‐442. doi:10.1038/ng1005 798 Schwartz S, Meshorer E, Ast G. 2009. Chromatin organization marks exon‐intron structure. Nat Struct 799 Mol Biol 16: 990‐995. doi:10.1038/nsmb.1659 800 Smith ER, Pannuti A, Gu WG, Steurnagel A, Cook RG, Allis CD, Lucchesi JC. 2000. The Drosophila MSL 801 complex acetylates histone h4 at lysine 16, a chromatin modification linked to dosage 802 compensation. Mol Cell Biol 20: 312‐318. doi:10.1128/Mcb.20.1.312‐318.2000 803 Stenberg P, Larsson J. 2011. Buffering and the evolution of chromosome‐wide gene regulation. 804 Chromosoma 120: 213‐225. doi:10.1007/s00412‐011‐0319‐8 805 Stenberg P, Lundberg LE, Johansson AM, Ryden P, Svensson MJ, Larsson J. 2009. Buffering of 806 segmental and chromosomal aneuploidies in Drosophila melanogaster. PLoS Genet 5: 807 e1000465. doi:10.1371/journal.pgen.1000465 808 Straub T, Grimaud C, Gilfillan GD, Mitterweger A, Becker PB. 2008. The chromosomal high‐affinity 809 binding sites for the Drosophila dosage compensation complex. PLoS Genet 4: e1000302. 810 doi:10.1371/journal.pgen.1000302 811 Sural TH, Peng S, Li B, Workman JL, Park PJ, Kuroda MI. 2008. The MSL3 chromodomain directs a key 812 targeting step for dosage compensation of the Drosophila melanogaster X chromosome. Nat 813 Struct Mol Biol 15: 1318‐1325. doi:10.1038/nsmb.1520 814 Tanaka Y, Katagiri Z, Kawahashi K, Kioussis D, Kitajima S. 2007. Trithorax‐group protein ASH1 815 methylates histone H3 lysine 36. Gene 397: 161‐168. doi:10.1016/j.gene.2007.04.027 816 Thurmond J, Goodman JL, Strelets VB, Attrill H, Gramates LS, Marygold SJ, Matthews BB, Millburn G, 817 Antonazzo G, Trovisco V et al. 2019. FlyBase 2.0: the next generation. Nucleic Acids Res 47: 818 D759‐D765. doi:10.1093/nar/gky1003 819 Tripoulas NA, Hersperger E, La Jeunesse D, Shearn A. 1994. Molecular genetic analysis of the 820 Drosophila melanogaster gene absent, small or homeotic discs1 (ash1). Genetics 137: 1027‐ 821 1038. 822 Turner BM, Birley AJ, Lavender J. 1992. Histone H4 isoforms acetylated at specific lysine residues 823 define individual chromosomes and chromatin domains in Drosophila polytene nuclei. Cell 824 69: 375‐384. doi:10.1016/0092‐8674(92)90417‐b 825 Valsecchi CIK, Basilicata MF, Georgiev P, Gaub A, Seyfferth J, Kulkarni T, Panhale A, Semplicio G, 826 Manjunath V, Holz H et al. 2020. RNA nucleation by MSL2 induces selective X chromosome 827 compartmentalization. Nature. doi:10.1038/s41586‐020‐2935‐z 25

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

828 Vicoso B, Bachtrog D. 2013. Reversal of an ancient sex chromosome to an autosome in Drosophila. 829 Nature 499: 332‐335. doi:10.1038/nature12235 830 Vicoso B, Bachtrog D. 2015. Numerous transitions of sex chromosomes in Diptera. PLoS Biol 13: 831 e1002078. doi:10.1371/journal.pbio.1002078 832 Villa R, Schauer T, Smialowski P, Straub T, Becker PB. 2016. PionX sites mark the X chromosome for 833 dosage compensation. Nature 537: 244‐248. doi:10.1038/nature19338 834 Wang L, Wang S, Li W. 2012. RSeQC: quality control of RNA‐seq experiments. Bioinformatics 28: 835 2184‐2185. doi:10.1093/bioinformatics/bts356 836 Wickham H. 2009. ggplot2: Elegant graphics for data analysis. Springer‐Verlag, New York. 837 Yuan W, Xie J, Long C, Erdjument‐Bromage H, Ding X, Zheng Y, Tempst P, Chen S, Zhu B, Reinberg D. 838 2009. Heterogeneous nuclear ribonucleoprotein L Is a subunit of human KMT3a/Set2 839 complex required for H3 Lys‐36 trimethylation activity in vivo. J Biol Chem 284: 15701‐15707. 840 doi:10.1074/jbc.M808431200

841

842 Figure legends 843 Figure 1. H3K36 tri‐methylation is enriched on the 4th chromosome and reduced on the male 844 X‐chromosome. (A) Immunostaining of a male third instar larvae polytene chromosome shows 845 an accumulation of H3K36me3 on chromosome 4 (red arrow) and a reduction on the X‐ 846 chromosome (green arrow) as compared to autosomal signals. H3K36me3 in yellow and DAPI 847 staining of DNA in blue. (B) Immunostaining of female third instar larvae polytene 848 chromosome with no observable differences between any chromosome arms. H3K36me3 in 849 red and DAPI staining in blue. The 4th chromosome and the X‐chromosome are indicated with 850 red and green arrows, respectively. (C) Average exon H3K36me3 ChIP‐chip enrichment scores 851 per chromosome in mixed‐sex third instar larvae. This ChIP data analysis confirms the 852 observations from the immunostainings that H3K36me3 is significantly enriched on the 4th 853 chromosome and likewise reduced on the male X‐chromosome. (D) Average exon H3K36me3 854 ChIP‐Seq enrichment scores per chromosome in mixed‐sex late embryos (14 – 16 h). 855 Chromosome 4 is significantly more enriched in H3K36me3 while the X‐chromosome is 856 reduced in H3K36me3 as compared to the major autosome arms. (E) Average exon H3K36me3 857 ChIP‐chip enrichment scores per chromosome in mixed‐sex late embryos (14 – 16 h). 858 Chromosome 4 is significantly more enriched in H3K36me3 compared to the autosomes. (F, 859 G) Average exon H3K36me3 ChIP‐Seq enrichment scores per chromosome in mixed‐sex early 860 embryos (2 – 4 h). The ChIP‐Seq data show lower H3K36me3 levels on the X‐chromosome (F). 861 No significant differences were observed in ChIP‐chip from early embryos (G). Enrichment

862 scores for the ChIP‐chip experiments (C, E, G) is the ChIP over input log2 ratio of H3K36me3 in

26

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

863 the top 50% of the exon regions per gene. For ChIP‐seq (D, F) the H3K36me3 enrichment 864 scores are reads adjusted for difference in position between ChIP and input. In ChIP‐seq only 865 peaks in exons were used. Error bars (C ‐ G) indicate the 95% confidence intervals. 866 Chromosome 4 is shown in red, autosomes in grey and the X‐chromosome is shown in green.

867 Figure 2. Gene levels of H3K36me3 ChIP signal positively correlate with gene lengths and 868 expression levels. Third instar larvae ChIP‐chip gene exon enrichment level (ChIP over input, 869 log2) plotted against: (A) Transcript abundance in wildtype male third instar larval brains as 870 determined by the RNA‐Seq experiment. Highly transcribed genes show a higher 871 accumulation of H3K36me3 than do genes with lower transcript abundance. (B) Gene length, 872 i.e., the total gene length (5’‐3’) divided into five equal sized bins. Longer genes accumulate 873 more H3K36me3. (C) Early replicating genes accumulate more H3K36me3 than late 874 replicating genes. (D) Housekeeping genes accumulate more H3K36me3 than do non‐ 875 housekeeping genes. H3K36me3 ChIP‐chip enrichment scores represent the ChIP over input 876 log2 ratio of H3K36me3 in the top 50% of the exonic regions per gene. In (A, B) the error 877 bars indicate the 95% confidence intervals. For (C, D) the statistical significances were 878 determined by Unpaired Two‐Sample Wilcoxon Tests.

879 Figure 3. Set2 is required for balanced transcription output from the single male X‐

880 chromosome. (A) Boxplot showing chromosome‐specific transcript abundance ratios (log2 881 fold change) between Set21 and wildtype for chromosome 4 (red), X‐chromosome (green) 882 and the autosome arms (grey). The transcript abundance is significantly reduced for the X‐ 883 chromosomes as compared to the autosomes in Set21. Significance levels were determined 884 by Unpaired Two‐Sample Wilcoxon Tests between the X‐chromosome and chromosome 885 arm2L (the least unlike autosome arm). (B) Transcript abundance in Set21 versus wildtype

886 plotted as fold change ratio (log2). Genes were binned according to Flybase RNA‐Seq 887 expression level intervals. Genes with higher wildtype expression output are significantly 888 more affected on the X‐chromosome than on the autosomes. (C) Boxplot showing average 889 transcript abundances of the X‐chromosomal genes in Set21 versus wildtype. The genes are 890 grouped in equally sized bins based on their MSL1 binding strength; 1 (lowest) to 5 (highest). 891 Genes with higher MSL1 levels show reduced transcript abundance compared to genes with

892 no or low amounts of MSL1 bound. (D) Average log2 fold change for X‐chromosomal genes

27

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

893 binned by distance to high‐affinity sites (HAS). In (B) and (D) the error bars indicate the 95% 894 confidence intervals.

895 Figure 4. The replication independent histone variant H3.3 is required for proper 896 transcription of short, moderately to high expressed genes. (A) Staining of H3K36me3 in 897 salivary glands for wildtype (top row) and ΔHisC; 12xH3K36R histone replacement (bottom 898 row). Staining of BEAF‐32 serves as an internal control. (B) Western blot of brains and 899 imaginal discs of third instar larvae showing significant decrease of H3K36me3 (top row) in 900 ΔHisC; 12xH3K36R versus wildtype. Total Histone 3 levels serve as loading control (2‐fold

901 dilutions steps). (C) Boxplot showing log2 fold change of transcript abundance per 902 chromosome for ΔHisC; 12xH3K36R versus ΔHisC; 12xH3K36K (wt). Note that chromosome 4 903 shows a significant reduction in transcript abundance compared to the other chromosome

904 arms. (D) Box plot showing log2 fold change of transcript abundance per chromosome for the 905 E(z)61 cell line at 31°C (strongly reduced E(z) function) versus 25°C). The transcript abundance 906 is significantly reduced for the X‐chromosomes as compared to the major autosome arms

907 upon reduced E(z) function. (E) Boxplot showing log2 fold change of transcript abundance 908 per chromosome for ΔH3.3B; ΔH3.3A versus wildtype flies. No significant differences are

909 observed between chromosomes. (F) Transcript abundance ratio (log2 fold change) in 910 ΔH3.3B; ΔH3.3A versus wildtype in relation to gene expression levels in wildtype. Genes 911 were binned based on wildtype expression output level. Genes on the X‐chromosome, as 912 well as genes on autosomes with moderate to high expression levels, were more sensitive to

913 the loss of H3.3. (G) Boxplot showing log2 fold change of transcript abundance in ΔH3.3B; 914 ΔH3.3A compared to wildtype with genes divided into non‐housekeeping (Non‐HK) and 915 housekeeping (HK). Genes on chromosome 4 and the X‐chromosome are indicated (4th) and

916 (X), respectively. (H) Boxplot showing log2 fold change of transcript abundance in ΔH3.3B; 917 ΔH3.3A compared to wildtype, genome‐wide. Genes were binned by total gene lengths. The 918 increase in transcript abundance in ΔH3.3B; ΔH3.3A compared to wildtype positively 919 correlates with gene length. For (C ‐ F) significant differences were determined by Unpaired 920 Two‐Sample Wilcoxon Tests. Error bars in (F) indicate 95% confidence intervals.

921 Figure 5. The X‐ and the 4th chromosomes have longer 3’‐UTRs, a characteristic that 922 positively correlates with gene expression output and H3K36me3 enrichment. (A) Line graph 923 showing a positive correlation between gene level enrichments of H3K36me3 and gene

28

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

924 average 3’‐UTR length for the X‐chromosome (green) and the autosomes (grey). The genes 925 are binned into equally sized groups based on their 3’‐UTR length. (B) Line graph showing a 926 positive correlation between wildtype transcript abundance and gene average 3’‐UTR length 927 for the X‐chromosome (green) and the autosomes (grey). The genes are binned into equally 928 sized groups based on their 3’‐UTR length. (C) Mean values of average 3’‐UTR length per 929 chromosome arm. P‐values between the X‐chromosome and the 4th chromosome to 930 chromosome arm 3L (the least unlike autosome arm) are indicated. (D, E, F) Boxplots

1 931 showing log2 fold change of transcript abundance in Set2 versus wildtype (D), ΔHisC; 932 12xH3K36R versus ΔHisC; 12xH3K36K (wt) (E), and ΔH3.3B; ΔH3.3A versus wildtype (F) with 933 genes binned into equally sized groups based on their 3’UTR lengths. Error bars in (A‐C) 934 represent 95% confidence intervals. Significant differences in (C ‐ F) were determined by 935 Unpaired Two‐Sample Wilcoxon Tests.

936 Figure 6. Ash1 and NSD show additive effects and are required for balanced transcription

937 output of genes on chromosome 4. Boxplot showing chromosome‐specific transcript

938 abundance ratios (log2 fold change) compared to wildtype for chromosome 4 (red), 939 autosomes (grey) and the X‐chromosome (green). (A) In ash122/ash19011 the transcript 940 abundance from chromosome 4 is reduced by 8.0% compared to the main autosome arms. 941 (B) In NSDds46 the transcript abundance from chromosome 4 is reduced by 5.1% compared to 942 the main autosome arms. (C) In ash122 NSDds46/ash19011 NSD ds46, transcript abundance is 943 reduced by 14.9% compared to the main autosome arms. (D) Boxplot showing chromosome 944 4 transcript abundance change in ash122 NSDds46/ash19011 NSD ds46 compared to wildtype. The 945 genes are grouped as non‐housekeeping (left) and housekeeping genes (right). In ash122 946 NSDds46/ash19011 NSD ds46, the expression output is reduced by 21.2% in non‐housekeeping 947 genes, which is significantly greater than the 11.7% reduction observed in housekeeping

948 genes. (E) Chromosome‐wide transcript abundance ratios (log2 fold change) in ΔHisC; 949 12xH3K36R compared to ΔHisC; 12xH3K36K (wt). H3K36R causes a reduction on chromosome 4 950 transcript abundance compared to the main chromosome arm by 10.7%. The statistical 951 significance was determined by an Unpaired Two‐Sample Wilcoxon Test. In (A, B, C, E), 952 chromosome 4 is compared to chromosome arm 2L and in (D), chromosome 4 is compared 953 to all autosome genes located on chromosomes 2 and 3.

29

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

954 Figure legends Supplementary figures 955 Supplementary Figure S1. Differential enrichments of methylated H3K36. (A) 956 Immunostaining of third instar larvae polytene chromosome showing H3K36me1 (yellow), 957 H3K36me2 (green) and DAPI staining of DNA in blue. Note the accumulation of H3K36me2 in 958 pericentromeric heterochromatin. (B) The H3K36me2 (green) staining shows only minor 959 overlaps with H3K36me1 (yellow). (C) Immunostaining of a male third instar larvae polytene 960 chromosome showing H3K36me2 (green), H3K36me3 (yellow) and DAPI staining of DNA in 961 blue. Note the accumulation of H3K36me2 in pericentromeric heterochromatin, H3K36me3 962 on the 4th chromosome and the reduced amount of H3K36me3 on the X‐chromosome.

963 Supplementary Figure S2. NSD targets pericentric heterochromatin. Immunostaining of third 964 instar larvae polytene chromosome showing NSD (yellow), and DAPI staining of DNA in blue. 965 Note the accumulation of NSD in pericentromeric heterochromatin and region 2L:31, similar 966 to typical enrichments of HP1a.

967 Supplementary Figure S3. H3K36me3 is lost in Set21 mutants. Staining of H3K36me3 in 968 salivary glands for wildtype (top row) and Set21 (bottom row). Staining of BEAF‐32 serves as 969 an internal control.

970 Supplementary Figure S4. Boxplot showing log2 fold change of transcript abundance in 971 ΔH3.3B; ΔH3.3A compared to wildtype with genes divided by chromosome arms, and into 972 early or late replicating. Note the small but significant relative increase of expression output 973 from late replicating genes. The statistical significance was determined by an Unpaired Two‐ 974 Sample Wilcoxon Test.

30

Lindehell_Fig1 A DAPI H3K36me3

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

B DAPI H3K36me3

C 1.5 D 3rd instar larvae ChIP-chip Embryo 14-16h ChIP-Seq 2000 1.0

0.5 1000

H3K36me3 H3K36me3 0 0 4 2L 2R 3L 3R X 4 2L 2R 3L 3R X Chromosome Chromosome E 1.0 F Embryo 14-16h ChIP-chip Embryo 2-4h ChIP-Seq 2000

0.5

1000

H3K36me3 H3K36me3 0 0 4 2L 2R 3L 3R X 4 2L 2R 3L 3R X Chromosome Chromosome

G Embryo 2-4h ChIP-chip 1.0

0.5 H3K36me3 0 4 2L 2R 3L 3R X Chromosome bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Lindehell_Fig2

A 1.25 3rd instar larvae ChIP-chip 1.00 0.75 0.50 H3K36me3 0.25

Wildtype expression level

B 1.0 3rd instar larvae ChIP-chip 0.8

0.6 H3K36me3 0.4

0.2 1 2 3 4 5 (Short) (Long) Binned gene length

p < 2.2e−16 p < 2.2e−16 C 2 D 4

3

1 2 H3K36me3 H3K36me3 1 0

0

-1 Early Late Non-HK HK Replication timing Housekeeping bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Lindehell_Fig3 A Transcript abundance ratio p = 2.8e−16 0.50 - wt 0.25

0 Set2[KO] −0.25

log2FC −0.50 4 2L 2R 3L 3R X Chromosome B Transcript abundance ratio 0.1 - wt 0 −0.1

Set2[KO] −0.2 Autosomes Male X −0.3 log2FC Wildtype expression Transcript abundance ratio C 0.50 - wt 0.25

0 Set2[KO] −0.25

log2FC −0.50 1 2 3 4 5

MSL abundance Transcript abundance ratio D 0 - wt

−0.05 Set2[KO]

−0.10 log2FC

< 1 kb < 5 kb < 10 kb < 20 kb < 30 kb > 30 kb Distance to nearest HAS bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Lindehell_Fig4 A B Chromosome

4 Oregon R Autosomes H3K36me3

H3 total X H3K36R Oregon R H3K36R

DAPI BEAF-32 H3K36me3

Transcript abundance ratio Transcript abundance ratio Transcript abundance ratio C p = 4.6e-13 D p = 0.32 p = 1.3e-09 E p = 0.63 0.50 1 0.50 wt wt

0.25 - 25°C 0.5 0.25 ∆H3.3B - ; 31°C

0 61 0 0 H3K36R - E(z) ∆H3.3A −0.25 −0.5 −0.25 log2FC log2FC

−1 log2FC −0.50 −0.50 4 2L 2R 3L 3R X 4 2L 2R 3L 3R X 4 2L 2R 3L 3R X Chromosome Chromosome Chromosome Transcript abundance ratio Transcript abundance ratio Transcript abundance ratio 0.4 p = 0.076 p < 2.2e−16 p = 6.1e−14 F G 0.50 H 1.0 wt wt wt 0.25 0 0.5 ∆H3.3B - ∆H3.3B - ∆H3.3B - 0 ; ; ;

Autosomes −0.4 −0.25 0 ∆H3.3A ∆H3.3A ∆H3.3A X −0.50 4th Autosomes X −0.5 log2FC −0.8 log2FC log2FC HK HK HK 0−1 1−5 Non-HK Non-HK Non-HK 5−1010−2020−5050−100 >100 Wildtype gene expression Non-Housekeeping / Housekeeping Gene length (kb) bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Lindehell_Fig5 A 3rd instar larvae ChIP-chip B Transcript abundance 1.00 Autosomes Autosomes X-chromosome 100 X-chromosome 0.75

0.50 50 H3K36me3 Wildtype TPM 0.25

<61 61- 88- 116- 155- 207- 285- 408- 602- >972 <61 61- 88- 116- 155- 207- 285- 408- 602- >972 88 116 155 207 285 408 602 972 88 116 155 207 285 408 602 972 Length of 3'UTR (bp) Length of 3'UTR (bp) C D Transcript abundance ratio p < 2.22e−16 Autosomes X p = 0.00036 p = 2.5e−10 p = 7.8e−12 p = 4.8e−16 p = 0.00081 1000 p = 2.3e−09 0.50 - wt

800 0.25 0

600 Set2[KO] −0.25 400 3'UTR length (bp) log2FC 2L 2R 3L 3R 4 X −0.50 <88 88-155 155-285 285-602 >602 Chromosome Length of 3'UTR (bp)

Transcript abundance ratio Transcript abundance ratio E Autosomes X F Autosomes X wt 0.50 p = 0.4 p = 0.16 p = 0.75 p = 0.14 p = 0.89 0.50 p = 0.14 p = 0.45 p = 0.52 p = 0.013 p = 0.56 wt p = 0.89 0.25 0.25 ∆H3.3B - 0 ; 0 H3K36R - −0.25 −0.25 ∆H3.3A

log2FC −0.50 −0.50 <88 88-155 155-285 285-602 >602 <88 88-155 155-285 285-602 >602

Length of 3'UTR (bp) log2FC Length of 3'UTR (bp) bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Lindehell_Fig6

Transcript abundance ratio Transcript abundance ratio A p = 5.6e-07 B p = 0.00027 0.50 0.50 - wt

- wt 0.25 0.25

0 0 NSD[KO] ash1[KO]

−0.25 −0.25 log2FC log2FC

−0.50 −0.50 4 2L 2R 3L 3R X 4 2L 2R 3L 3R X Chromosome Chromosome Transcript abundance ratio Transcript abundance ratio C p = 6.5e-10 D p = 6.5e-06 p = 1.3e-05 0.50 1.0 - wt - wt 0.5 0.25

0 0

−0.5 ash1[KO] NSD[KO] −0.25 ash1[KO] NSD[KO]

−1.0 log2FC −0.50 log2FC Non-HK HK 4 2L 2R 3L 3R X 4 Auto 4 Auto Chromosome Chromosome Transcript abundance ratio E p = 4.6e-13 0.50 Chromosome

wt 0.25 4

0

H3K36R - Autosomes

−0.25 log2FC X

−0.50 4 2L 2R 3L 3R X Chromosome bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Lindehell_Supp.Fig1 A DAPI H3K36me1 H3K36me2

B DAPI H3K36me2

H3K36me1 Split me1/me2

C DAPI H3K36me2 H3K36me3 bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Lindehell_Supp.Fig2

DAPI

NSD bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Lindehell_Supp.Fig3 Oregon R ¹ Set2

DAPI BEAF-32 H3K36me3 bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Lindehell_Supp.Fig4

2L 2R 3L 3R X

p = 0.0013 p = 0.00041 p = 0.03 p = 0.0048 p = 0.0066 1.0 wt

0.5 ∆H3.3B - ; 0 ∆H3.3A

-0.5 log2FC

-1.0

Early Late Early Late Early Late Early Late Early Late Replication Timing bioRxiv preprint doi: https://doi.org/10.1101/2021.03.04.433843; this version posted March 5, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Lindehell_Supp.Table S1 Supplementary Table S1: List of antibodies

Antigen Host Reference Polytene Western chromosomes H3K36me3 Rabbit monoclonal Cell Signaling, D5A7, Cat#4909 1:200 1:2000 H3K36me3 Rabbit polyclonal Abcam, ab9050 1:200 - H3K36me2 Mouse monoclonal Active Motifs, MABI 0332 1:50 - H3K36me1 Rabbit polyclonal Abcam, ab9048 1:100 - NSD Mouse monoclonal Bell et al. EMBO J (2007) 26: 1:10 4974-4984 BEAF-32 Mouse monoclonal DHSB, anti-BEAF 1:5 - H3 Rabbit polyclonal Abcam, ab1791 - 1:100 000 Secondary Antibody to Rabbit Goat polyclonal Abcam, ab150078 1:500 - IgG - (H+L), Alexa Fluor® 555 conjugate Secondary Antibody to Mouse Donkey polyclonal Invitrogen, A-21202 1:500 - IgG (H+L) Highly Cross- Adsorbed Secondary Antibody, Alexa Fluor® 488 conjugate Secondary Antibody to Rabbit Donkey polyclonal Invitrogen, A-31572 1:500 - IgG (H+L) Highly Cross- Adsorbed Secondary Antibody, Alexa Fluor® 555 conjugate Secondary antibody to Mouse Goat polyclonal Invitrogen, A-21424 1:500 - IgG (H+L) Highly Cross- Adsorbed Secondary Antibody, Alexa Fluor® 555 Anti-Rabbit AP conjugated Goat polyclonal Promega, S3731 - 1:10000 Anti-Mouse IgG AP conjugated Goat polyclonal Sigma, A3562 - 1:10000