<<

bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

Structural basis for COMPASS recognition of an H2B-ubiquitinated

Evan J. Worden1, Xiangbin Zhang1, Cynthia Wolberger1*

1Department of Biophysics and Biophysical Chemistry, Johns Hopkins University School

of Medicine, Baltimore, MD 21205

*Correspondence: [email protected]

1 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

ABSTRACT

Methylation of H3K4 is a hallmark of actively transcribed that depends on

mono-ubiquitination of (H2B-Ub). H3K4 in yeast is catalyzed

by Set1, the subunit of COMPASS. We report here the cryo-EM

structure of a six- core COMPASS subcomplex, which can methylate H3K4 and

be stimulated by H2B-Ub, bound to a ubiquitinated nucleosome. Our structure shows

that COMPASS spans the face of the nucleosome, recognizing on one face of

the nucleosome and methylating H3 on the opposing face. As compared to the structure

of the isolated core complex, Set1 undergoes multiple structural rearrangements to

cement interactions with the nucleosome and with ubiquitin. The critical Set1 RxR motif

adopts a helix that mediates bridging contacts between the nucleosome, ubiquitin and

COMPASS. The structure provides a framework for understanding mechanisms of

trans-histone cross-talk and the dynamic role of H2B ubiquitination in stimulating

.

2 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

1 INTRODUCTION

2

3 The histone that package eukaryotic DNA into (Andrews and Luger,

4 2011) are subject to a huge variety of post-translational modifications that regulate

5 chromatin structure, nucleosome positioning and protein recruitment, thereby playing a

6 central role in regulating (Kouzarides, 2007). Methylation of at

7 4 (H3K4) is a mark of actively transcribed genes and is enriched in

8 regions (Barski et al., 2007). H3K4 is methylated in yeast by the Set1 methyltransferase

9 (Roguev et al., 2001), which can attach up to three methyl groups to the lysine ε-amino

10 group (Santos-Rosa et al., 2002), and in humans by the six related SET/MLL1 family of

11 (Meeks and Shilatifard, 2017). Methylation of nucleosomal H3K4

12 depends on the prior ubiquitination of histone H2B (H2B-Ub) at Lysine 120 (Lys 123 in

13 yeast) (Dover et al., 2002; Shahbazian et al., 2005; Sun and Allis, 2002), an example of

14 histone modification “cross-talk” in which attachment of one histone mark templates the

15 deposition of another. H2B-Ub and H3K4 di- and tri-methylation are strongly associated

16 with active transcription in both yeast and humans (Barski et al., 2007; Jung et al., 2012;

17 Liu et al., 2005; Minsky et al., 2008; Pokholok et al., 2005; Santos-Rosa et al., 2002;

18 Shieh et al., 2011; Steger et al., 2008). H3K4 methylation can serve as a recruitment

19 signal for various transcription activators (Sims et al., 2007; Vermeulen et al., 2010;

20 Vermeulen et al., 2007; Wysocka et al., 2006) including SAGA, whose acetyltransferase

21 activity is stimulated by H3K4 methylation (Bian et al., 2011; Ringel et al., 2015).

22

3 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

23 In yeast, H3K4 is methylated by the evolutionarily conserved COMplex of Proteins

24 ASsociated with Set1 (COMPASS)(Miller et al., 2001; Shilatifard, 2012), which contains

25 Cps15, Cps25, Cps30, Cps35, Cps40, Cps50, Cps60, as well as Set1 (Miller et al.,

26 2001; Shilatifard, 2012). Set1, which contains a conserved SET methyltransferase

27 domain, is inactive on its own (Avdic et al., 2011; Dou et al., 2006; Patel et al., 2009;

28 Southall et al., 2009). Multiple studies have identified a core subcomplex comprising

29 Cps25, Cps30, Cps50, Cps60 and Set1 which constitutes the minimal set of subunits

30 required to support full Set1 methyltransferase activity (Dou et al., 2006; Patel et al.,

31 2009; Schneider et al., 2005; Southall et al., 2009). This core subcomplex of COMPASS

32 is highly conserved across all , with human WDR5/RbBP5/ASH2L/DPY30

33 corresponding to yeast Cps30/50/60/25 (Miller et al., 2001; Shilatifard, 2012). The ability

34 of COMPASS activity to be fully stimulated on by H2B ubiquitination

35 depends upon the presence of an additional COMPASS subunit, Cps40, as well as the

36 RXXXRR (RxR) motif in the N-set region of Set1 (Kim et al., 2013).

37

38 Structural studies of the core subcomplex (Hsu et al., 2018) and the H2B-ubiquitin-

39 sensing subcomplex (Qu et al., 2018; Takahashi et al., 2012) have shown that

40 COMPASS adopts a Y-shaped, highly intertwined structure with the Set1 catalytic

41 domain at its core. Furthermore, a recent structure of the related human MLL1 core

42 complex bound to a ubiquitinated nucleosome revealed the underlying mechanisms of

43 nucleosome recognition by human COMPASS-like complexes (Xue et al., 2019).

44 However, there is currently no structural information on how the full H2B-ubiquitin-

45 sensing COMPASS subcomplex binds and recognizes the H2B-Ub containing

4 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

46 nucleosome. We report here the 3.95 Å resolution cryo-EM structure of the H2B-Ub

47 sensing COMPASS subcomplex from the yeast, , bound to

48 an H2B-Ub nucleosome. The structure shows that COMPASS contains multiple

49 structural elements that position the complex on the nucleosome disk though

50 interactions with nucleosomal DNA and three of the core . The position of the

51 Set1 catalytic domain suggests that COMPASS methylates H3K4 in an asymmetric

52 manner by targeting H2B-Ub and H3K4 on opposite sides of the nucleosome.

53 Structuring of a critical RxR motif to form a helix enables the complex to associate with

54 the nucleosome acidic patch, with the RxR helix forming the bottom edge of an

55 extended ubiquitin interaction crevice that underlies the structural basis of H2B-ubiquitin

56 recognition by COMPASS. Comparison with other ubiquitin-activated

57 methyltransferases shows that interactions with the H2B-linked ubiquitin are highly

58 plastic and suggests how a single ubiquitin mark can be utilized by several different

59 . Our findings shed light on the long-standing mystery of how H2B-Ub is

60 recognized by COMPASS and provide the first example of trans-nucleosome histone

61 crosstalk.

62

63 RESULTS

64

65 Architecture of the COMPASS H2B-Ub nucleosome complex

66 We determined the cryo-EM structure of the minimal, H2B-ubiquitin-sensing

67 subcomplex of Saccharomyces cerevisiae COMPASS bound to a Xenopus laevis

68 nucleosome core particle ubiquitinated at histone H2B K120 via a non-hydrolyzable

5 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

69

70 Figure 1: Architecture of the COMPASS H2B-Ub nucleosome complex 71 (a) Cryo-EM structure of the COMPASS-nucleosome complex. The unsharpened EM density 72 showing two COMPASS molecules bound to the nucleosome is depicted as a semitransparent 73 surface. The sharpened EM density of the complex is depicted as an opaque surface and 74 colored according to the different subunits of the complex. (b) The model of the COMPASS- 75 nucleosome complex is shown and colored as in panel a. The unstructured histone H3 tail 76 residues between the H3 exit point in the nucleosome and the Set1 active site are depicted as a 77 blue dashed line. (c) Large scale structural motions of COMPASS from the nucleosome-free 78 state to the nucleosome-bound state. Free COMPASS (PDB: 6BX3) is colored gray and 79 Nucleosome-bound COMPASS is colored according to panel b. The largest motion in the 80 structural transition at the end of Cps40 is shown as a black dashed arrow. (d) The COMPASS- 81 nucleosome structure viewed from the dyad axis. The distances between the cis-H3 and trans- 82 H3 subunits and the H3 residues in the Set1 active site are shown as dashed black lines.

6 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

83 dichloroacetone (DCA) linkage (Morgan et al., 2016). The ubiquitinated residue

84 corresponds to K123 of yeast H2B. To drive tight association between COMPASS and

85 the nucleosome, we utilized a variant of histone H3 in which K4 was substituted with the

86 non-native , norleucine (Nle) (Worden et al., 2019). Lysine-to-norleucine

87 mutations have been shown to greatly increase the affinity of SET-domain

88 methyltransferases for their substrates in a S-adenosylmethionine (SAM)-dependent

89 manner (Jayaram et al., 2016; Lewis et al., 2013; Worden et al., 2019). To assess the

90 gain in affinity imparted by the H3K4Nle substitution, we used gel mobility shift assays

91 to measure binding of COMPASS to different nucleosome variants in the presence of

92 SAM (Figure S1). Surprisingly, COMPASS binds to unmodified and H2B-Ub

93 nucleosomes with the same apparent affinity, indicating that H2B-Ub does not

94 contribute significantly to the energy of COMPASS binding to the nucleosome (Figure

95 S1). However, H2B-Ub nucleosomes that also contain the H3K4Nle mutant bind

96 COMPASS with 2-5 fold higher affinity than unmodified nucleosomes (Figure S1). We

97 therefore prepared complexes between COMPASS and H2B-Ub nucleosomes

98 containing the H3K4Nle mutation in the presence of saturating SAM and determined the

99 structure of the complex to 3.95 Å by single particle cryo-EM (Figure 1a, Figures S2-S3

100 and Table 1).

101

102 In the reconstruction, two COMPASS complexes are bound to opposite faces of the

103 nucleosome in a pseudo-symmetric 2:1 arrangement (Figure 1a, Figure S2). However,

104 only one of the two bound COMPASS assemblies resolved to high resolution. The final

105 model therefore includes one COMPASS complex and the nucleosome core particle

7 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

106 (Figure 1a-c). To build the yeast COMPASS complex, models of Cps40 and the N-set

107 region of Set1 were taken from the cryo-EM structure of S. cerevisiae COMPASS (Qu et

108 al., 2018) and docked into the EM density. For the rest of the COMPASS model, crystal

109 structures of K. lactis Cps60, Cps50, Cps30, Cps25 and Set1 subunits (Hsu et al.,

110 2018) were utilized to create homology models with the S. cerevisiae sequence (25%-

111 50% sequence identity) using Swiss-model (Waterhouse et al., 2018). The homology

112 models were docked into the EM density, manually re-built in COOT(Emsley et al.,

113 2010) and refined using Phenix (Adams et al., 2010) (see Methods). Cps40, Cps60 and

114 the Cps25 dimer are less well resolved than the other COMPASS subunits due to their

115 location on the periphery of the complex and consequent higher mobility (Figure 1,

116 Figure S3).

117

118 The COMPASS complex spans the entire diameter of the nucleosome and is anchored

119 by contacts between DNA and Cps60/Set1 at one end of the complex and Cps50 and

120 Cps40 at the other (Figures 1b, Figure 2). These DNA contacts position COMPASS

121 such that Cps50 and the Set1 catalytic domain can contact the central histone core.

122 Compared to isolated S. cerevisiae COMPASS (Qu et al., 2018), nucleosome-bound

123 COMPASS flexes around the central Set1 catalytic domain and moves Cps30, Cps40

124 and Cps50 toward the nucleosome by ~30Å (Figure 1c). This movement allows Cps40

125 to bind the nucleosomal DNA, and allows Cps50 to interact with the histone core.

126 Notably, a subset of particles in the cryo-EM structure of COMPASS in the absence of

127 nucleosome (Qu et al., 2018) exhibited flexing about the same axis, although not to the

128 extent observed here when COMPASS is bound to a nucleosome (Figure 1c). Our

8 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

129 structure suggests that the previously observed conformational flexibility of COMPASS

130 is important for nucleosome recognition.

131

132 The Set1 active site is oriented away from the nucleosome and contains density for the

133 SAM cofactor and the bound H3 tail (Figure 1d, Figure S3). The intervening sequence

134 of the H3 tail, from its exit point in the nucleosome (P38) to the first resolved residue in

135 the Set1 active site (T6), is not visible in the maps, suggesting that these residues are

136 highly mobile (Figures 1b, d). To determine which copy of histone H3 was connected to

137 the portion of the H3 tail bound to Set1, we compared the distance between the Set1

138 active site and the H3 residue (P38) on each face of the nucleosome. The Set1 active

139 site is positioned much closer to the exit point of the H3 subunit on the opposite face of

140 the nucleosome (trans-H3, ~36Å) than to the exit point of the H3 subunit on the same

141 side of the nucleosome (cis-H3, ~84Å, Figure 1d). The 84 Å end-to-end distance

142 between the exit point of cis-H3 and the Set1 active site is too long to be spanned by

143 the intervening unstructured H3 tail residues given that an even greater distance (~100

144 Å) would be needed for the H3 residues to wrap around the nucleosomal DNA near the

145 dyad axis. This arrangement therefore indicates that COMPASS methylates the

146 nucleosome in an asymmetric manner by recognizing ubiquitin on one face of the

147 nucleosome and targeting H3K4 on the opposite, trans-H3, face of the nucleosome.

148

149 COMPASS also interacts directly with the core histone proteins. A pair of loops in

150 Cps50 anchor COMPASS near the C-terminal H2B helix (Figure 3). In addition, a long

151 helix in Set1 containing the RxR motif (Kim et al., 2013) extends along the surface of

9 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

152 the histone core and makes multiple interactions with the nucleosome acidic patch

153 formed by histones H2A and H2B (Figure 4). Finally, Set1, Cps60 and Cps50 bind to

154 the H2B-linked ubiquitin (Figures 5 and 6), providing a structural basis for the crosstalk

155 between H2B-Ubiquitination and H3K4 methylation.

156

157 COMPASS interacts with DNA using three distinct interfaces

158 COMPASS interacts with DNA in three distinct locations, thereby orienting the complex

159 on the face of the nucleosome (Figure 2). At one end of COMPASS, the nucleosomal

160 DNA docks into a concave surface at the interface of Cps60 and Set1 (Figure 2a, b).

161 This concave surface is lined with several basic residues that can potentially contact the

162 DNA. In particular, Cps60 K318 and Set1 K1026 are in a position to interact directly with

163 the sugar-phosphate backbone (Figure 2b). We note that we did not observe clear

164 sidechain density for either Cps60 K318 or Set1 K1026, so the positions of these

165 sidechains are inferred from the conformation of the protein backbone. Set1 residues

166 R1034 and K1029 are also located at this interface (Figure 2b) but are too far away to

167 directly contact the DNA. Instead, these basic residues likely serve to increase the local

168 positive charge of the concave DNA binding surface. On the opposing edge of the

169 nucleosome, COMPASS contacts the DNA at two distinct interfaces mediated by Cps40

170 and Cps50 (Figure 2c,d). Fragmented density connecting Cps40 and the nucleosomal

171 DNA corresponds to a loop in Cps40 that is disordered in our structure (residues 241-

172 261). This Cps40 loop (Figure 2c) contains a patch of positively-charged amino acids

173 (KRKKKK) that likely interact with the negatively charged DNA backbone and major

174 groove at SHL 2.5. Cps50 interacts with the nucleosomal DNA near SHL 3 using two

10 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

175

176 Figure 2: COMPASS makes multiple contacts with nucleosome DNA 177 (a) Model of the COMPASS-nucleosome structure shown in cartoon representation and colored 178 as in Figure 1. (b-d) Detailed views of COMPASS interactions with nucleosomal DNA. In all 179 views the experimental EM density is shown as a semi-transparent gray surface. (b) DNA 180 contact between Set1 and Cps60. Potential DNA-interacting COMPASS residues are shown in 181 stick representation. (c) Contact between a flexible loop in Cps40 and the nucleosomal DNA. 182 The unmodeled loop of Cps40 is depicted as a yellow dashed line and the positively charged, 183 putative DNA-interacting residues in the loop are shown as yellow letters. (d) Interaction 184 between Cps50 and the nucleosome DNA. Cps50 residues that contact DNA are shown in stick 185 representation. 186

11 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

187 highly conserved basic resides, R236 and K266, which emanate from loops in blade 5

188 of the Cps50 WD40 domain (Figure 2d). Substitution of Cps50 K266 with an alanine

189 decreased H3K4 di- and tri-methylation in yeast (Figure 3d), indicating that the loss of

190 even a single DNA contact can impair COMPASS function. As discussed further below,

191 this region of Cps50 is highly conserved and also mediates contacts between Cps50

192 and the core , making this part of Cps50 a hotspot of COMPASS

193 interaction with the nucleosome.

194

195 Importantly, similar interactions between these COMPASS subunits (Cps60, Cps40 and

196 Cps50) and nucleosomal DNA have recently been observed in the human MLL1

197 complex (Xue et al., 2019) and the K. lactis COMPASS complex (Hsu et al., 2019),

198 indicating that these DNA interactions are critical for COMPASS function and are highly

199 conserved.

200

201 A conserved loop in Cps50 contacts the core histone octamer

202 The structure reveals that Cps50 contains two loops that interact with three different

203 histones in the nucleosome core (Figure 3). Cps50 Loop 1 connects β21 and β22, and,

204 along with the edge of β-strand 25, embraces the H2B C-terminal helix, ɑ4 (Figure 3b).

205 At the tip of loop 1, V263 and I264 insert into a small hydrophobic crevice at the three-

206 helix interface consisting of H2B helices, ɑ3 and ɑ4, and H2A ɑ3 (Figure 3b, c). This

207 hydrophobic crevice includes H2A Y50, and H2B V118, which are in van der Waals

208 contact with Cps50 V263 and I264. In addition to the hydrophobic interactions in loop 1,

209

12 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

210 211 Figure 3: A conserved loop in Cps50 contacts the histone core 212 (a) The COMPASS-nucleosome model with Cps50 surrounded by a semi-transparent blue 213 surface. (b) Close up view of the contact between Cps50 and the histone core. Cps50 is shown 214 as a blue cartoon and the two loops that contact the histone core are colored red. (c) Detailed 215 view of the specific contacts made between loop 1 and loop 2 of Cps50 and the histone core. 216 Potential hydrogen bonds are shown as black dashed lines and the sharpened EM density is 217 shown as a transparent gray surface. (d) Western blot analysis of H3K4 methylation states in 218 cps50Δ yeast strains transformed with containing the indicated Cps50 variants. 219 220

13 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

221 Cps50 N265 hydrogen bonds with H2B Q95, further stabilizing the loop 1 interaction

222 with the histone core (Figure 3c).

223

224 To assess the importance of the interaction between Cps50 loop 1 and the H2A/H2B

225 hydrophobic crevice, we examined the effects of alanine substitutions on histone H3K4

226 methylation in S. cerevisiae. As shown in Figure 3d, the substitution with the greatest

227 effect was Cps50 I264A, which completely abolished H3K4 di- and trimethylation and

228 greatly reduced H3K4 mono-methylation (Figure 3d). I264 is positioned in the center of

229 the H2A/H2B hydrophobic crevice and the strong defect in H3K4 methylation seen for

230 the I264A mutation indicates that this interaction is critical for COMPASS activity in vivo.

231 The V263A mutant slightly decreased H3K4 tri-methylation, but did not significantly

232 change H3K4 mono-, or di-methylation, whereas an N265A mutation had no effect

233 (Figure 3d). Loop 1 is highly conserved (Figure S4) and the residues that correspond to

234 V263 and I264 are always hydrophobic in character, indicating that that the interaction

235 we observe between Loop 1 and the histone octamer is structurally conserved among

236 Cps50 homologs. Indeed, the human MLL1 core complex subunit, Rbbp5, binds to the

237 histone octamer using a similar interface (Xue et al., 2019) and a recent preprint reports

238 that the K. lactis COMPASS complex Cps50/Swd1 subunit also utilizes the hydrophobic

239 cleft interface in a similar manner (Hsu et al., 2019).

240

241 In addition to Cps50 loop 1, loop 2 connects β-strands β23 and β24 and is oriented

242 toward the histone H3 ɑ2-ɑ3 loop and ɑ3 (Figure 3b,c). At the tip of loop 2,

243 S289 forms a with H3 K79. As compared to loop 1, loop 2 is not well

14 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

244 conserved (Figure S4) and the contact it makes with H3K79 is not recapitulated in other

245 COMPASS-like complexes (Xue et al., 2019). Taken together, these results show that

246 the interaction between Cps50 and the nucleosome is conserved from yeast to humans

247 and is critical for COMPASS function in vivo.

248

249 The Set1 RxR motif forms an adaptor helix that bridges the nucleosome acidic

250 patch and ubiquitin

251 The -rich RxR motif in the N-set region of Set1 is critical for stimulation of

252 methyltransferase activity by H2B ubiquitination and mutants in this motif impair

253 COMPASS activity in yeast (Kim et al., 2013). Our structure reveals the critical role that

254 the RxR motif plays in COMPASS interactions with both the nucleosome and ubiquitin.

255 In isolated yeast COMPASS (Qu et al., 2018), the RxR motif is disordered. When bound

256 to the H2B-Ubnucleosome, Set1 residues 901-919 become ordered, forming a long

257 helix that passes underneath COMPASS and makes extensive contacts with the

258 nucleosome acidic patch (Figure 4a-e), as well as with the H2B-linked ubiquitin (see

259 below). The RxR helix docks on the nucleosome surface parallel to the C-terminal helix

260 ɑ4 of histone H2B, positioning multiple arginine residues opposite the negatively

261 charged residues of H2A that make up the acidic patch (Figure 4d, e). Residues in the

262 Set1 RxR helix form several specific electrostatic interactions with acidic patch residues:

263 R908 contacts H2A residue E56 and H2B E113, R904 contacts H2A residue E61 and

264 R901 contacts H2A residues D90 and E92 (Figure 4d, e). The extensive electrostatic

265 network mediated by the RxR helix is flanked on both sides by polar interactions

266 between Set1 R936 and H2A N68 on one side and Set1 R909 and H2B S112 on the

15 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

267 268 Figure 4: The RxR helix binds to the H2A/H2B acidic patch 269 (a) The COMPASS-nucleosome structure with COMPASS subunits surrounded by semi- 270 transparent colored surfaces. The RxR helix passes underneath the H2B-linked ubiquitin and is 271 shown in cartoon representation. (b) Cutaway view of the COMPASS nucleosome structure 272 showing the RxR helix passing between COMPASS and the nucleosome. (c) Sharpened EM 273 density surrounding the RxR helix, which is shown in stick representation. (d-e) End-on view (d) 274 and side view (e) of the RxR helix. Contacting residues between the RxR helix and the 275 H2A/H2B acidic patch are shown in stick representation. (f) Sharpened EM density surrounding 276 the section of Set1 that undergoes a conformational change. The corresponding part of Set1 in 277 the isolated COMPASS structure (PBD: 6BX3, Apo) is colored yellow and shown as a cartoon. 278 The reformed Set1 residues are shown in stick representation and colored green. (g) Western 279 blot analysis of H3K4 methylation states in Set1Δ yeast strains transformed with plasmids 280 containing the indicated Set1 variants. 281

16 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

282 other side. Interestingly, the RxR motif is conserved in human Set1 homologs, but not in

283 the human paralogs MLL1-4 (Figure S4) which have recently been shown not to bind

284 the nucleosome acidic patch (Xue et al., 2019).

285

286 In addition to formation of the RxR helix, nucleosome binding is accompanied by a

287 profound restructuring of Set1 ɑ-helix, 926-933, which unravels and forms an extended

288 strand that lies parallel to the RxR helix (Figure 4f). This extended strand lies along the

289 histone core, orienting R936 toward the nucleosome surface (Figure 4d). The

290 conformational change in this region of Set1 alleviates what would otherwise be steric

291 clash between the N-terminus of the helix formed by these residues in uncomplexed

292 (Apo) COMPASS.

293

294 Previous mutagenesis studies of the N-set region of yeast Set1 (Kim et al., 2013)

295 showed that a combined R909, R908 and R904 triple mutant abolished COMPASS

296 activity in vitro and in vivo. Furthermore, a large-scale alanine screen of histone

297 residues in yeast previously determined that residues in the H2A/H2B acidic patch are

298 required for H3K4 methylation by COMPASS (Nakanishi et al., 2008). To assess the

299 contribution of individual Set1 amino acids to H3K4 methylation by COMPASS in vivo,

300 we generated yeast strains in which a set1 deletion was complemented with mutant

301 Set1 with point mutations designed to disrupt interactions with the nucleosome acidic

302 patch. As compared to wild-type, R936A and R936E Set1 mutations both reduced H3K4

303 methylation (Figure 4g). The charge reversal mutation, R936E, was more severe and

304 almost entirely abolished di- and tri-methylation by COMPASS (Figure 4g). Individual

17 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

305 alanine substitutions of R901A, R904A and R901A reduced mono-methylation by

306 COMPASS and abolished di-, and tri-methylation. The R909A mutation, which does not

307 form electrostatic contacts with the acidic patch, greatly reduced di-and tri-methylation.

308 As expected, the quadruple mutant R901A, R904A, R908A and R909A (4R->4A)

309 completely abolished all methylation of H3K4. Charge reversal mutants R901E, R904E,

310 R908E, R909E and the quadruple 4R->4E mutants all abolished H3K4 methylation,

311 expect for R909E, which retained some residual H3K4 mono-methylation. Together,

312 these structural and mutational data indicate that the interactions between the RxR helix

313 and the acidic patch are critical for COMPASS activity

314

315 Structural basis of ubiquitin recognition

316 The COMPASS-Ubiquitin interaction is different from, and much more extensive than,

317 the interaction between ubiquitin and the MLL1 core complex subunit RbBP5 (Xue et

318 al., 2019) or Dot1L (Anderson et al., 2019; Valencia-Sanchez et al., 2019; Worden et

319 al., 2019), the histone H3K79 methyltransferase which is also stimulated by H2B-Ub

320 (Briggs et al., 2002; McGinty et al., 2008; Ng et al., 2002) (Figure 5a-c). Moreover, the

321 H2B-linked ubiquitin is positioned in different orientations relative to the face of the

322 nucleosome in each of these complexes (Figure 5d), indicating that H2B-Ub is a

323 conformationally plastic epitope that can be recognized in structurally distinct ways.

324 Ubiquitin binds to COMPASS in a large cleft located between Cps50, Set1 and Cps60

325 (Figure 6a), burying 930 Å2 of total surface area. The H2B-linked ubiquitin sits on top of

326 the Set1 RxR helix and makes multiple contacts with the N-terminal and C-terminal

327 extensions of Cps50 (Figure 6a, b). In addition to the Set1 and Cps50 contacts, there is

18 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

328 Figure 5: H2B-Ubiquitin recognition by different methyltransferases 329 (a-c) Structures of H2B-ubiquitin activated methyltransferases bound to the H2B-Ub 330 nucleosome. The structures are encompassed by a semitransparent gray surface, except for 331 ubiquitin which is colored purple and depicted as a cartoon. Surface representations of the 332 bound ubiquitin are shown on the right with the buried surface area colored red. The locations of 333 the I44 and I36 patches are indicated with colored yellow and blue lines, respectively. (a) 334 Structure of COMPASS bound the H2B-Ub nucleosome. (b) Structure of the MLL1 core 335 complex bound to the H2B-Ub nucleosome (PDB: 6KIU). (c) Structure of Dot1L bound to the 336 H2B-Ub nucleosome (PDB: 6NJ9). (d) Superimposition of the structures from a-c but only 337 showing the ubiquitin to compare the relative positions of the H2B-linked Ubiquitin. H2B K120 is 338 shown as spheres and colored blue.

19 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

339 substantial connecting density between Cps60 and the N-terminus of ubiquitin that likely

340 corresponds to a 38 amino acid loop in Cps60 (residues 140 – 178) that is unmodeled

341 in our structure (Figure 6a). This Cps60 loop is not found in human or other yeast

342 COMPASS complexes and appears to be specific to the S. cerevisiae COMPASS

343 complex (Figure S4).

344

345 The primary contact between ubiquitin and COMPASS is mediated by the N- and C-

346 terminal extensions of Cps50 (Figure 6b). These Cps50 extensions are critical for

347 COMPASS assembly and mediate multiple contacts between COMPASS subunits (Hsu

348 et al., 2018; Qu et al., 2018). Compared to the nucleosome-free (apo) structure of

349 COMPASS (Qu et al., 2018), the N- and C- terminal extensions of Cps50 change

350 conformation when bound to ubiquitin (Figure 6 c,d). These Cps50 extensions present

351 an extended surface of hydrophobic residues which interact with the ubiquitin C-terminal

352 tail and the hydrophobic “I44 patch” on ubiquitin comprising I44, V70, L8 and H68

353 (Figures 5a and 6b). The I44 patch is a canonical interaction surface that is contacted

354 by many different ubiquitin binding proteins (Komander and Rape, 2012), and by the

355 related MLL1 complex subunit, Rbbp5 (Xue et al., 2019) (Figure 5b). At the center of the

356 I44 patch interaction, Cps50 L12 and V401 bind to Ub I44 and Cps50 F9 interacts with

357 Ub L8, V70 and H68 (Figure 6b,e). The I44 patch contact is flanked by electrostatic

358 interactions between Cps50 E14 and Ub R42 on one side, and between Cps50 E397

359 and Ub K48 on the other (Figure 6b).

20 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

360 361 Figure 6: Structural basis for COMPASS recognition of H2B-ub 362 (a) Model of the COMPASS H2B-Ub nucleosome complex. The unsharpened experimental EM 363 density is shown as a semi-transparent gray surface. The connecting density between Cps60 364 and ubiquitin is shown. (b) Detailed view of the interactions made between Ubiquitin and Cps50 365 and Set1. Ubiquitin is enclosed in a semi-transparent pink surface and the interfacial residues 366 are shown in stick representation. (c-d) Conformational changes in Cps50 that occur upon 367 ubiquitin binding. The nucleosome-free (apo) structure of COMPASS is shown in gray (PDB:

21 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

368 6BX3). (c) Restructuring of the Cps50 C-terminal extension. Upon binding ubiquitin, the C- 369 terminal extension moves by 17Å toward the ubiquitin binding interface. (d) Restructuring of the 370 Cps50 N-terminal extension. Ubiquitin binding causes a register shift that moves F9 by 9Å to 371 interact with ubiquitin. (e) Detailed view of Cps50 F9 interaction with ubiquitin. The sharpened 372 EM density around Cps50 is shown as a semi-transparent gray surface and the EM density 373 around ubiquitin is shown as a semi-transparent pink surface. (f) Detailed view of the interaction 374 between Set1 L928 and ubiquitin. The sharpened EM density around Set1 is shown as a semi- 375 transparent gray surface. (g) Western blot analysis of H3K4 methylation states in Cps50Δ yeast 376 strains transformed with plasmids containing the indicated Cps50 variants. 377

378 In addition to the I44 patch contacts, Set1 contacts the hydrophobic “I36 patch” of

379 ubiquitin consisting of I36, L71 and L73 (Figure 6b,f) (Komander and Rape, 2012).

380 While the I36 patch of ubiquitin is not as widely utilized by ubiquitin-binding proteins, it is

381 also used by the Dot1L methyltransferase (Figure 5c). At the end of the RxR helix Ub

382 L73 contacts I914 and the aliphatic portion of N917 through van der Waals interactions

383 (Figure 6b). The importance of the L73 contact is supported by previous in vitro studies

384 showing that ubiquitin residues, L71 and L73, are critical for ubiquitin-dependent

385 COMPASS activity (Holt et al., 2015). Finally, Set1 L928 inserts into a small

386 hydrophobic pocket within ubiquitin and interacts with ubiquitin I13, I36, T7 and the

387 aliphatic portion of K11 (Figure 6f).

388

389 To assess the contribution of interfacial Cps50 residues in ubiquitin-dependent H3K4

390 methylation in vivo, we generated cps50 deletion yeast strains expressing mutant

391 Cps50 and measured the effects on H3K4 methylation. As shown in Figure 6g, the

392 Cps50 L12A mutation completely abolished H3K4 methylation by COMPASS in vivo,

393 indicating that the interaction with between this Cps50 residue and Ub I44 is critical to

394 COMPASS function. Cps50 E14A and E14R mutations both greatly decreased H3K4 di-

395 and tri-methylation and also reduced H3K4 mono-methylation to an intermediate level

22 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

396 (Figure 6g). Unsurprisingly, a Cps50 F9A, L12A and E14A triple mutant (FLE)

397 completely abolished H3K4 methylation. Finally, the E397R mutation, which lies on the

398 periphery of the ubiquitin interaction, modestly reduced all methylation states of H3K4

399 (Figure 6g). Together, these data show that the contacts observed between COMPASS

400 and the H2B-linked ubiquitin are critical to activate COMPASS for H3K4 methylation.

401

402 DISCUSSION

403

404 Our structure reveals the basis of crosstalk between H2B-ubiquitination and H3K4

405 methylation by Saccharomyces cerevisiae COMPASS and shows that it methylates its

406 target lysine in an asymmetric manner by recognizing the H2B-Ub and H3K4 on

407 opposite sides of the nucleosome (Figure 1d). This asymmetric recognition is distinct

408 from the H3K79 methyltransferase, Dot1L, which is also stimulated by H2B-Ub but

409 which methylates H3 on the same, cis-H3, side of the nucleosome (Figure 5c)

410 (Anderson et al., 2019; Valencia-Sanchez et al., 2019; Worden et al., 2019). To our

411 knowledge, the asymmetric recognition of H2B-Ub and H3K4 by COMPASS and

412 COMPASS-related complexes (Hsu et al., 2019; Xue et al., 2019) is the first example of

413 trans-nucleosome histone crosstalk. This observation highlights the importance of

414 asymmetry in histone modifications. Previous studies have shown that “repressive”

415 H3K27 tri-methyl and “active” H3K4 tri-methyl marks can be deposited asymmetrically in

416 the same nucleosome, but on opposite H3 tails (Voigt et al., 2012). This asymmetric,

417 bivalent modification of H3K27 and H3K4 is believed to be associated with maintaining

418 promoters in a poised state during differentiation. Our structure provides a mechanistic

23 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

419 framework to understand how asymmetric nucleosome modifications can be read out

420 and deposited during histone crosstalk.

421

422 Compared with other histone modifications such as methylation and ,

423 ubiquitination is a large, highly complex mark that presents a chemically rich interaction

424 surface to potential binding partners and also decompacts chromatin (Fierz et al., 2011).

425 In addition, because it is conjugated to the nucleosome through its flexible C-terminus,

426 ubiquitin can adopt a range of orientations on the nucleosome, enabling even greater

427 complexity it its recognition. Several structures now exist of H2B-Ub-activated

428 methyltransferases bound to ubiquitinated nucleosomes which reveal highly divergent

429 strategies for ubiquitin recognition (Figure 5). The distinct ubiquitin binding modes

430 employed by COMPASS (this study), Dot1L (Anderson et al., 2019; Valencia-Sanchez

431 et al., 2019; Worden et al., 2019) and the MLL1 core complex (Xue et al., 2019) reveal a

432 striking plasticity in how ubiquitin is able to interact with and activate these different

433 histone methyltransferases. The high complexity of the ubiquitin mark likely enables

434 H2B-Ub to communicate with these different enzymatic complexes and template the

435 deposition of “activating” marks during transcription. The size of ubiquitin might also

436 impose a significant steric obstacle that could inhibit the activity of enzymes which do

437 not directly recognize the ubiquitin. Indeed, H2B-Ub may even impose a steric barrier to

438 the transcriptional machinery as evidenced by the observation that efficient transcription

439 elongation requires removal of H2B-Ub (Wyce et al., 2007).

440

24 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

441 Our structure suggests a mechanism by which H2B-Ub stimulates COMPASS to

442 methylate H3K4. Compared to the nucleosome-free Saccharomyces cerevisiae

443 COMPASS structure (Qu et al., 2018) and the structure of Kluyveromyces lactis

444 COMPASS bound to an H3 peptide and SAM (Hsu et al., 2018), there are no

445 discernable conformational changes in the Set1 catalytic domain that could explain how

446 H2B-Ub stimulates methylation (Figure S5). It is notable that, for both COMPASS and

447 Dot1L, the presence of ubiquitin conjugated to H2B does not appear to increase affinity

448 of the for the nucleosome (Figure S1) (Worden et al., 2019). The lack of a

449 discernable effect on binding energy suggests that ubiquitin binding primarily affects

450 catalytic activation and, in the case of Dot1L, has been shown to increase kcat but not

451 KM (McGinty et al., 2009; Worden et al., 2019). Alternatively, It has been suggested that

452 ubiquitin activates Dot1L by using its binding energy to pay the energetic cost of

453 inducing a conformational change in the globular core of histone H3, thereby inserting

454 K79 into the Dot1L active site (Worden et al., 2019). We speculate that H2B-Ub may

455 similarly activate COMPASS by providing the binding energy needed to induce the

456 disordered Set1 RxR motif to form a helix that mediates contacts with the nucleosome.

457 Ubiquitin binding may also compensate for the energetic cost of inducing the Set1 helix,

458 326-331, to unravel and form an extended β-strand that buttresses the RxR helix and

459 the catalytic domain of Set1 (Figure 4). Our structure provides a basis for further

460 elucidating the role that H2B ubiquitination plays in stimulating histone

461 methyltransferase activity.

462

463

25 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

464 METHODS

465

466 Expression and purification of COMPASS

467 Genes encoding subunits of COMPASS were cloned from yeast genomic DNA (S288C)

468 using PCR amplification and inserted into pLIB vector from the biGBac baculovirus

469 expression system (Weissmann et al., 2016) using Gibson assembly primers as

470 described in the biGBac assembly protocol. Full-length Cps15, Cps25, Cps35, Cps50

471 and Cps60 were assembled into the pBIG1a expression vector. Full-length Cps30,

472 Cps40 containing two N-terminal strep tags (twin-strep-Cps40), and residues 762-1080

473 of Set1 with an N-terminal hexahistidine tag followed by three binding sites for the FLAG

474 anitbody (6xHis-3xFLAG-Set1(762-1080)) were assembled into the pBIG1b expression

475 vector. The cassettes from the pBIG1a and pBIG1b expression vectors were

476 excised using PmeI and cloned into pBIG2ab to form the final COMPASS expression

477 assembly containing all eight COMPASS genes. DH10Bac E. coli (Thermo-Fisher) was

478 transformed with the pBIG2ab COMPASS expression vector to produce viral bacmids

479 for baculovirus expression. SF9 insect cells were transfected with the COMPASS

480 bacmid and the viral stock was amplified two times to obtain the high titer p3 viral stock.

481 For protein production, 2L of 3 million cells/ml of High Five insect cells (Thermo-Fisher)

482 were infected with the p3 viral stock at a MOI of 1. The insect cells were harvested after

483 three days by centrifugation and then resuspended in 120 ml of lysis buffer (50 mM Tris

484 pH 8.0, 200 mM NaCl, 10% glycerol) supplemented with 1 tablet of cOmplete, EDTA-

485 free protease inhibitor (Roche) per 50 ml of buffer. The resuspended tablet was flash

486 frozen in liquid nitrogen and stored at -80 oC.

26 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

487

488 To purify the complex, the cell pellet was thawed at room temperature and then placed

489 on ice. All subsequent purification steps were conducted at 0-4 oC. The thawed cell

490 pellet was diluted to 150 ml using lysis buffer and lysed with an LM10 Microfluidizer

491 (Microfluidics). The lysate was clarified by centrifugation and then filtered through a 0.45

492 µm fast-flow bottle-top filter (Nalgene). The filtered lysate was batch-adsorbed to 5 ml of

493 freshly equilibrated M2-FLAG affinity gel (Millipore Sigma) and incubated for 1 hour. The

494 sample was washed in batch with 40 ml of lysis buffer, transferred to a gravity flow

495 column and washed with 20 ml of lysis buffer. COMPASS was then eluted from the

496 FLAG resin using lysis buffer supplemented with 0.15 mg/ml 3x FLAG peptide. 5 mM β-

497 mercapto-ethanol (BME) was added to the eluate and immediately poured onto a

498 column containing with Streptactin-XT resin (IBA). The column was washed with five

499 column volumes of lysis buffer and the complex eluted with lysis buffer supplemented

500 with 50 mM . The strep eluate was concentrated to 0.5 ml and then injected onto a

501 Superose 6 size exclusion column (GE) equilibrated with GF buffer (30 mM HEPES pH

502 7.8, 150 mM NaCl, 1 mM TCEP, 5% glycerol). The following day, peak fractions were

503 pooled, concentrated, flash frozen in liquid nitrogen and stored at -80 oC.

504

505 Purification of histone proteins

506 Unmodified Xenopus laevis histone proteins, ubiquitinated histone H2B and histone H3

507 containing norleucine in place of H3K4 were purified as described previously (Dyer et

508 al., 2004; Worden et al., 2019). In brief, ubiquitinated H2B-K120Ub was prepared using

509 H2B containing a K120C substitution and ubiquitin containing a G76C substitution.

27 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

510 Ubiquitin was cross-linked to H2B-K120C using dichloroacetone (DCA) as described

511 (Morgan et al., 2016). Histone H3 with K4 substituted with norleucine was generated by

512 mutating K4 to methionine and expressing the H3K4M protein in minimal medium

513 supplemented with all amino acids except methionine, as well as with added norleucine

514 (Worden et al., 2019).

515

516 Nucleosome reconstitution

517 Nucleosomes were prepared as previously described (Dyer et al., 2004).

518

519 Electrophoretic Mobility Shift assays

520 For the electrophoretic mobility shift assays (EMSA), 50 nM of each nucleosome variant

521 was mixed with COMPASS at the indicated concentrations and incubated at room

522 temperature for 30 minutes in EMSA buffer (20 mM HEPES pH 7.5, 100 mM NaCl, 1

523 mM DTT, 0.2mg/ml bovine serum albumin (BSA), 100 µM S-adenosyl methionine

524 (SAM)). Samples were then diluted with an equal volume of 2x EMSA sample buffer (40

525 mM HEPES pH 7.6, 100 mM NaCl, 10% Sucrose, 2 mM DTT, 0.2mg/ml BSA) and 10 µl

526 of sample was loaded onto a 6% native Tris-borate EDTA (TBE) gel at 4 oC. The gel

527 was stained with SybrGold DNA stain (Thermo-Fisher) to visualize bands.

528

529 Yeast strains

530 All yeast strains were prepared from the BY4743 background and cultured using

531 standard methods. The Cps50Δ strain was obtained from the Yeast Knock-Out

532 collection (Dharmacon) and was a generous gift from Dr. Carol Greider. The Set1Δ

28 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

533 strain was prepared using PCR-mediated gene disruption with the KanMX gene as a

534 selectable marker. Wild type (WT) CPS50 and SET1 were isolated by PCR from

535 Saccharomyces cerevisiae genomic DNA with a native 600 (bp) upstream

536 promotor and a 200 bp . The isolated CPS50 and SET1 genes were cloned

537 into pRS415 and mutants were generated using inverse PCR. Empty pRS415 vector or

538 plasmids containing WT or mutant variants of CPS50 or SET1 were introduced to the

539 Cps50Δ or Set1Δ strains using standard yeast transformation techniques and Leucine

540 selection.

541

542 Protein extraction and Western Blot analysis

543 Yeast deletion strains containing the pRS415 vector, WT or mutant variants of Cps50 or

o 544 Set1 were grown in 50 ml of SD-Leu media at 30 C to an OD600 of 0.7-1.0. A volume of

545 40 ml of the yeast culture was pelleted and resuspended in 10% tri-chloroacetic acid

546 (TCA) to a final volume with an OD600 of 6 and incubated at room temperature for 30

547 min. 3 ODs of cells (500 µl) were aliquoted into 1.7 ml Eppendorf tubes, pelleted and

548 frozen. For total protein extraction, the cell pellet was thawed on ice and resuspended in

549 250 µl of 20% TCA. A volume of 250 µl of 0.25 mm - 0.5 mm glass beads were added

550 to the resuspended cell pellet and the cells were lysed by vortexing for 6 minutes. The

551 bottom of the Eppendorf tube was punctured, placed into a fresh tube, and the lysed

552 cells were collected by centrifugation. The glass beads were washed with an additional

553 300 µl of 5% TCA and discarded. Total protein was pelleted by centrifugation at 20,000

554 G for 10 minutes at 4 oC. The resulting pellet was washed with 100% ethanol at -20 oC

555 and resuspended in 2x SDS sample buffer. The efficiency of the total protein extraction

29 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

556 was evaluated by SDS PAGE followed by stain-free protein imaging (BioRad). For

557 Western blotting, equal amounts of protein extraction were separated by SDS-PAGE,

558 transferred to PVDF membranes, blocked with 5% milk in TBST buffer and probed with

559 anti ɑ-H3me1 (Abcam, ab8895) and anti ɑ-H3me2 (Abcam, ab7766) and anti ɑ-H3me3

560 (Abcam, ab8580) antibodies. A total of three technical replicates were analyzed from

561 the same yeast growth.

562

563 Cryo EM sample preparation

564 A 2.44 ml volume of 300nM COMPASS, 100 nM of nucleosome containing H2B-Ub and

565 H3K4Nle, and 200 µM SAM was prepared in crosslinking buffer (25 mM HEPES pH7.5,

566 100 mM NaCl, 1 mM DTT). The sample was incubated on ice for 30 min and then mixed

567 with 2.44 ml of 0.14% glutaraldehyde to initiate crosslinking. The crosslinking reaction

568 progressed on ice and was quenched after one hour by the addition of 1 M Tris pH 7.5

569 to a final concentration of 100 mM. The reaction was incubated for 1 hour on ice and

570 then concentrated to ~50 µl using an Amicon Ultra 30K MWCO spin concentrator. The

571 concentration of the complex was determined using the absorption at 260 nm of the 601

572 nucleosome DNA. Quantifoil R2/2 grids were glow discharged for 45 seconds at 15 mA

573 using a Pelco Easyglow glow discharger. A volume of 3 µl of 0.5 mg/ml crosslinked

574 sample was added to the glow-discharged grids and flash frozen in liquid ethane using

575 a Vitrobot (Thermo Fisher) at 4 oC and 100% humidity with a 3.5 sec blot time.

576

577 EM data collection and refinement

30 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

578 All data were collected at the National Cryo-Electron Microscopy Facility (NCEF) at the

579 National Cancer Institute on a Titan Krios (Thermo-Fisher) at 300 kV utilizing a K3

580 (Gatan) direct electron detector in counting-mode with at a nominal magnification of

581 81,000 and a pixel size of 1.08 Å. Data were collected at a nominal dose of 50 e-/Å2 with

582 40 frames per movie and 1.25 e-/frame. A total of 5,784 movies were collected.

583 The dataset was processed in Relion 3.0 (Zivanov et al., 2018). All movies were motion-

584 corrected and dose-weighted using the Relion 3.0 implementation of MotionCorr2

585 (Zivanov et al., 2018). An initial batch of 500 micrographs were randomly selected and

586 used to pick 315,372 particles using the Laplacian of gaussian auto-picking feature.

587 After 3 rounds of 2D-classification to remove junk particles, 216,107 particles were used

588 for 3D classification. A single good class of 79,832 particles was refined and used as a

589 model for template-based picking on the entire dataset, resulting in 2,036,654 particles.

590 The particles were extracted and binned by a factor of 4. 1,357,004 particles were

591 retained after 2D classification and used for 3D classification with 6 classes. Two well

592 resolved classes emerged from the 3D classification, a 2:1 complex with 2 COMPASS

593 molecules per nucleosome and a 1:1 complex with 1 COMPASS molecule per

594 nucleosome. 553,234 particles from the 2:1 complex were refined and subjected to

595 focused 3D classification using a mask that encompassed COMPASS and the

596 nucleosome. A single class of 179,588 particles that contained all COMPASS subunits

597 was selected, the particles were re-extracted without binning and subjected to masked

598 refinement using the same mask that was used for the focused classification. The

599 unbinned particle stack was subjected beam tilt correction and per-particle contrast

600 transfer function (CTF) estimation in Relion 3.0. The final particle stack was then

31 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

601 subjected to masked refinement using the same mask that was used in the previous

602 masked refinement and classification steps. The final structure was sharpened using

603 the Relion postprocessing tool with a soft mask that encompassed the more well

604 resolved COMPASS molecule and the nucleosome. A sharpening B-factor of -135.69 Å2

605 was applied. The final resolution of the COMPASS-nucleosome structure is 3.95 Å

606 according to the Fourier shell correlation (FSC) 0.143 criterion.

607

608 Model building and refinement

609 Coordinates for the Xenopus laevis nucleosome core particle (PBD: 6NJ9),

610 Saccharomyces cerevisiae Cps40, and the N-set domain of Set1 (PDB: 6BX3) were

611 docked into the EM density using Chimera. Crystal structures of Kluyveromyces lactis

612 Cps25, Cps30, Cps50, Cps60 and the Set1 catalytic domain (PDB: 6CHG) fit the EM

613 density better than the existing cryo-electron microscopy structures (PDB: 6BX3) of S.

614 cerevisiae Cps25 (25% identity, 67% in modeled area), Cps30 (50% identity), Cps50

615 (50% identity), Cps60 (42% Identity) and Set1 (41% identity, 86% in modeled area).

616 Therefore, homology models of K. lactis Cps25, Cps30, Cps50, Cps60 and the Set1

617 catalytic domain were prepared using the Swiss-model software and docked into the

618 EM density using Chimera. The resulting model was iteratively refined in Phenix

619 (Afonine et al., 2018) using initial model reference restraints and edited in COOT

620 (Emsley et al., 2010). To avoid overfitting, we refined the COMPASS-nucleosome

621 model against a sharpened, masked half-map (Mapwork) and validated the model against

622 the other half-map that was not used in the refinement (Maptest), but was similarly

623 sharpened. The FSC between the model and Maptest (FSCtest) agrees well with the FSC

32 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

624 of the model against Mapwork (FSCwork). Furthermore, the FSC 0.5 resolution estimate of

625 the model/map (FSCfull) exceeds the FSC 0.5 resolution estimate of the two half maps,

626 and there is no significant correlation between the model and map that exceeds the

627 calculated map resolutions, indicating that the model is not overfit.

628

33 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

ACKNOWLEDGMENTS

This research was supported, in part, by the National Cancer Institute’s National Cryo-

EM Facility at the Frederick National Laboratory for Cancer Research under contract

HSSN261200800001E. E.J.W. is a Damon Runyon Fellow supported by the Damon

Runyon Cancer Research Foundation (DRG 2308-17). This work was supported by

grant GM130393 from the National Institute of General Medical Sciences.

34 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

AUTHOR CONTRIBUTIONS

C.W. and E.J.W. conceived of the study. E.J.W. isolated COMPASS genes, purified

proteins, prepared cryo-EM samples, screened cryo-EM conditions, determined the

structure and conducted the yeast experiments. X.Z. cloned the COMPASS expression

constructs and expressed COMPASS. X.Z cloned COMPASS mutants in collaboration

with E.J.W. E.J.W. and C.W. wrote the manuscript.

35 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

COMPETING INTERESTS

C.W. is a member of the scientific advisory board of ThermoFisher Scientific

36 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

SUPPLEMENTAL FIGURES

Figure S1: Analysis of COMPASS binding to the Nucleosome Electrophoretic mobility shift assay (EMSA) showing COMPASS binding to different Nucleosome (NCP) variants.

37 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

Figure S2: Cryo-EM processing pipeline

38 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

Figure S3: Cryo-EM map and model validation and example density (a) Fourier Shell Correlation for the COMPASS nucleosome structure. (b) Model-Map FSC curves calculated from the refined atomic coordinates and the half-map used for refinement (Mapwork), the half-map used for validation (Maptest) and the full map (Mapfull). (c) Local map resolution estimates calculated from ResMap (Kucukelbir et al., 2014) and colored according to local resolution. Top left, full sharpened map. Bottom left, a slice through the center of the map. (d) A vertical slice through the structure of COMPASS bound to the H2B-Ub nucleosome. All chains in the atomic model are shown as sticks and the EM density is shown as a gray mesh. (e) Example EM density around the H3 tail in the Set1 active site. (f) Example EM density for the SAM cofactor.

39 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

Figure S4: Multiple sequence alignments of Cps50, Set1 and Cps60 (a-c) Multiple sequence alignments were generated using the Clustle Omega software (Sievers et al., 2011). Species names are abbreviated (Sc = Saccharomyces cerevisiae, Sp = Schizosaccharomyces pombe, Kl = Kluyveromyces lactis, Dm = Drosophila melanogaster, Xt = Xenopus tropicalis, Mm = Mus musculus, Hs = Homo sapiens). Important features from the COMPASS structure are highlighted. (a) Multiple sequence alignment of Cps50 and its homologs. (b) Multiple sequence alignment of Set1 and its homologs. (c) Multiple sequence alignment of Cps60 and its homologs.

40 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

Figure S5: Comparison of Set1 catalytic domains from COMPASS in different states. (a) Superimposition of the H2B-Ub nucleosome bound Set1 catalytic domain from S. cerevisiae (green) with the substrate bound, K. lactis Set1 catalytic domain from isolated COMPASS (PDB: 6CHG). (b) Superimposition of the H2B-Ub nucleosome bound Set1 catalytic domain from S. cerevisiae (green) with the substrate-free S. cerevisiae Set1 catalytic domain from isolated COMPASS (PDB: 6BX3).

41 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

Table 1: Cryo-EM data collection, refinement and validation statistics

Complex between COMPASS and the H2B-Ub Nucleosome (EMDB-xxxx) (PDB xxxx) Data collection and processing Magnification 81,000 Voltage (kV) 300 Electron exposure (e–/Å2) 50 Defocus range (μm) -1.0 to -2.5 Pixel size (Å) 1.08 Symmetry imposed C1 Initial particle images (no.) 2,036,654 Final particle images (no.) 179,588 Map resolution (Å) 3.95 (0.143) FSC threshold Map resolution range (Å) 999 – 3.95

Refinement Initial model used (PDB code) PDB: 6NJ9, 6BX3, 6CHG Model resolution (Å) 4.16 (0.5) FSC threshold Model resolution range (Å) 39.2 to 4.16 Map sharpening B factor (Å2) -135.7 Model composition Non-hydrogen atoms 24,554 Protein residues 2,354 Ligands 2 B factors (Å2) Protein 255.3 Ligand 267.5 R.m.s. deviations Bond lengths (Å) 0.009 Bond angles (°) 0.930 Validation MolProbity score 2.2 Clashscore 20.2 Poor rotamers (%) 0.05 Ramachandran plot Favored (%) 95.53 Allowed (%) 4.47 Disallowed (%) 0

42 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

REFERENCES

Adams, P.D., Afonine, P.V., Bunkóczi, G., Chen, V.B., Davis, I.W., Echols, N., Headd, J.J., Hung, L.W., Kapral, G.J., Grosse-Kunstleve, R.W., McCoy, A.J., Moriarty, N.W., Oeffner, R., Read, R.J., Richardson, D.C., Richardson, J.S., Terwilliger, T.C., and Zwart, P.H. (2010). PHENIX: A comprehensive Python-based system for macromolecular structure solution. Acta Crystallographica Section D: Biological Crystallography 66, 213-221.

Afonine, P.V., Poon, B.K., Read, R.J., Sobolev, O.V., Terwilliger, T.C., Urzhumtsev, A., and Adams, P.D. (2018). Real-space refinement in PHENIX for cryo-EM and crystallography. Acta Crystallogr D Struct Biol 74, 531-544.

Anderson, C.J., Baird, M.R., Hsu, A., Barbour, E.H., Koyama, Y., Borgnia, M.J., and McGinty, R.K. (2019). Structural Basis for Recognition of Ubiquitylated Nucleosome by Dot1L Methyltransferase. Cell Reports 26, 1681-1690.e1685.

Andrews, A.J., and Luger, K. (2011). Nucleosome structure(s) and stability: variations on a theme. Annu Rev Biophys 40, 99-117.

Avdic, V., Zhang, P., Lanouette, S., Groulx, A., Tremblay, V., Brunzelle, J., and Couture, J.F. (2011). Structural and biochemical insights into MLL1 core complex assembly. Structure 19, 101-108.

Barski, A., Cuddapah, S., Cui, K., Roh, T.Y., Schones, D.E., Wang, Z., Wei, G., Chepelev, I., and Zhao, K. (2007). High-Resolution Profiling of Histone in the Human . Cell 129, 823-837.

Bian, C., Xu, C., Ruan, J., Lee, K.K., Burke, T.L., Tempel, W., Barsyte, D., Li, J., Wu, M., Zhou, B.O., Fleharty, B.E., Paulson, A., Allali-Hassani, A., Zhou, J.Q., Mer, G., Grant, P.A., Workman, J.L., Zang, J., and Min, J. (2011). Sgf29 binds histone H3K4me2/3 and is required for SAGA complex recruitment and histone H3 acetylation. EMBO J 30, 2829-2842.

Briggs, S.D., Xiao, T., Sun, Z.-W., Caldwell, J.A., Shabanowitz, J., Hunt, D.F., Allis, C.D., and Strahl, B.D. (2002). Trans-histone regulatory pathway in chromatin. Nature 418, 498.

Dou, Y., Milne, T.a., Ruthenburg, A.J., Lee, S., Lee, J.W., Verdine, G.L., Allis, C.D., and Roeder, R.G. (2006). Regulation of MLL1 H3K4 methyltransferase activity by its core components. Nature structural & molecular biology 13, 713-719.

Dover, J., Schneider, J., Tawiah-Boateng, M.A., Wood, A., Dean, K., Johnston, M., and Shilatifard, A. (2002). Methylation of histone H3 by COMPASS requires ubiquitination of histone H2B by Rad6. J Biol Chem 277, 28368-28371.

43 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

Dyer, P.N., Edayathumangalam, R.S., White, C.L., Bao, Y., Chakravarthy, S., Muthurajan, U.M., and Luger, K. (2004). Reconstitution of nucleosome core particles from recombinant histones and DNA. Methods in enzymology 375, 23-44.

Emsley, P., Lohkamp, B., Scott, W.G., and Cowtan, K. (2010). Features and development of Coot. Acta Crystallogr D Biol Crystallogr 66, 486-501.

Fierz, B., Chatterjee, C., McGinty, R.K., Bar-Dagan, M., Raleigh, D.P., and Muir, T.W. (2011). Histone H2B ubiquitylation disrupts local and higher-order chromatin compaction. Nature chemical biology 7, 113-119.

Holt, M.T., David, Y., Pollock, S., Tang, Z., Jeon, J., Kim, J., Roeder, R.G., and Muir, T.W. (2015). Identification of a functional hotspot on ubiquitin required for stimulation of methyltransferase activity on chromatin. Proceedings of the National Academy of Sciences of the United States of America 112, 10365-10370.

Hsu, P.L., Li, H., Lau, H.-T., Leonen, C., Dhall, A., Ong, S.-E., Chatterjee, C., and Zheng, N. (2018). Crystal Structure of the COMPASS H3K4 Methyltransferase Catalytic Module. Cell 0, 1-11.

Hsu, P.L., Shi, H., Leonen, C., Kang, J., Chatterjee, C., and Zheng, N. (2019). Structural Basis of H2B Ubiquitination-Dependent H3K4 Methylation by COMPASS. bioRxiv, 740738.

Jayaram, H., Hoelper, D., Jain, S.U., Cantone, N., Lundgren, S.M., Poy, F., Allis, C.D., Cummings, R., Bellon, S., Lewis, P.W., Arrowsmith, C.H., Dent, S.Y.R., and Marmorstein, R. (2016). S- adenosyl methionine is necessary for inhibition of the methyltransferase G9a by the lysine 9 to methionine mutation on histone H3. Pnas 113, 6182-6187.

Jung, I., Kim, S.K., Kim, M., Han, Y.M., Kim, Y.S., Kim, D., and Lee, D. (2012). H2B monoubiquitylation is a 5'-enriched active transcription mark and correlates with exon- structure in human cells. Genome Res 22, 1026-1035.

Kim, J., Kim, J.A., McGinty, R.K., Nguyen, U.T.T., Muir, T.W., Allis, C.D., and Roeder, R.G. (2013). The n-SET Domain of Set1 Regulates H2B Ubiquitylation-Dependent H3K4 Methylation. Molecular Cell 49, 1121-1133.

Komander, D., and Rape, M. (2012). The Ubiquitin Code. Annual Review of Biochemistry 81, 203-229.

Kouzarides, T. (2007). Chromatin Modifications and Their Function. Cell 128, 693-705.

Kucukelbir, A., Sigworth, F.J., and Tagare, H.D. (2014). Quantifying the local resolution of cryo- EM density maps. Nat Methods 11, 63-65.

44 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

Lewis, P.W., Müller, M.M., Koletsky, M.S., Cordero, F., Lin, S., Banaszynski, L.a., Garcia, B.a., Muir, T.W., Becher, O.J., and Allis, C.D. (2013). Inhibition of PRC2 activity by a gain-of-function H3 mutation found in pediatric glioblastoma. Science (New York, NY) 340, 857-861.

Liu, C.L., Kaplan, T., Kim, M., Buratowski, S., Schreiber, S.L., Friedman, N., and Rando, O.J. (2005). Single-nucleosome mapping of histone modifications in S. cerevisiae. PLoS Biol 3, e328.

McGinty, R.K., Kim, J., Chatterjee, C., Roeder, R.G., and Muir, T.W. (2008). Chemically ubiquitylated histone H2B stimulates hDot1L-mediated intranucleosomal methylation. Nature 453, 812-816.

McGinty, R.K., Köhn, M., Chatterjee, C., Chiang, K.P., Pratt, M.R., and Muir, T.W. (2009). Structure-activity analysis of semisynthetic nucleosomes: Mechanistic insights into the stimulation of Dot1L by ubiquitylated histone H2B. ACS Chemical Biology 4, 958-968.

Meeks, J.J., and Shilatifard, A. (2017). Multiple Roles for the MLL/COMPASS Family in the Epigenetic Regulation of and in Cancer. Annual Review of Cancer Biology 1, 425-446.

Miller, T., Krogan, N.J., Dover, J., Erdjument-Bromage, H., Tempst, P., Johnston, M., Greenblatt, J.F., and Shilatifard, A. (2001). COMPASS: a complex of proteins associated with a trithorax- related SET domain protein. Proceedings of the National Academy of Sciences of the United States of America 98, 12902-12907.

Minsky, N., Shema, E., Field, Y., Schuster, M., Segal, E., and Oren, M. (2008). Monoubiquitinated H2B is associated with the transcribed region of highly expressed genes in human cells. Nat Cell Biol 10, 483-488.

Morgan, M.T., Haj-Yahya, M., Ringel, A.E., Bandi, P., Brik, A., and Wolberger, C. (2016). Structural basis for histone H2B deubiquitination by the SAGA DUB module. Science 351, 725- 728.

Nakanishi, S., Sanderson, B.W., Delventhal, K.M., Bradford, W.D., Staehling-Hampton, K., and Shilatifard, A. (2008). A comprehensive library of histone mutants identifies nucleosomal residues required for H3K4 methylation. Nat Struct Mol Biol 15, 881-888.

Ng, H.H., Xu, R.M., Zhang, Y., and Struhl, K. (2002). Ubiquitination of histone H2B by Rad6 is required for efficient Dot1-mediated methylation of histone H3 lysine 79. Journal of Biological Chemistry 277, 34655-34657.

Patel, A., Dharmarajan, V., Vought, V.E., and Cosgrove, M.S. (2009). On the mechanism of multiple lysine methylation by the human mixed lineage leukemia protein-1 (MLL1) core complex. J Biol Chem 284, 24242-24256.

45 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

Pokholok, D.K., Harbison, C.T., Levine, S., Cole, M., Hannett, N.M., Lee, T.I., Bell, G.W., Walker, K., Rolfe, P.A., Herbolsheimer, E., Zeitlinger, J., Lewitter, F., Gifford, D.K., and Young, R.A. (2005). Genome-wide map of nucleosome acetylation and methylation in yeast. Cell 122, 517-527.

Qu, Q., Takahashi, Y.-h., Yang, Y., Hu, H., Zhang, Y., Brunzelle, J.S., Couture, J.-F., Shilatifard, A., and Skiniotis, G. (2018). Structure and Conformational Dynamics of a COMPASS Histone H3K4 Methyltransferase Complex. Cell 0, 1-10.

Ringel, A.E., Cieniewicz, A.M., Taverna, S.D., and Wolberger, C. (2015). Nucleosome competition reveals processive acetylation by the SAGA HAT module. Proceedings of the National Academy of Sciences of the United States of America 112, E5461-5470.

Roguev, A., Schaft, D., Shevchenko, A., Pijnappel, W.W., Wilm, M., Aasland, R., and Stewart, A.F. (2001). The Saccharomyces cerevisiae Set1 complex includes an Ash2 homologue and methylates histone 3 lysine 4. EMBO J 20, 7137-7148.

Santos-Rosa, H., Schneider, R., Bannister, A.J., Sherriff, J., Bernstein, B.E., Emre, N.C., Schreiber, S.L., Mellor, J., and Kouzarides, T. (2002). Active genes are tri-methylated at K4 of histone H3. Nature 419, 407-411.

Schneider, J., Wood, A., Lee, J.S., Schuster, R., Dueker, J., Maguire, C., Swanson, S.K., Florens, L., Washburn, M.P., and Shilatifard, A. (2005). Molecular regulation of histone H3 trimethylation by COMPASS and the regulation of gene expression. Molecular Cell 19, 849-856.

Shahbazian, M.D., Zhang, K., and Grunstein, M. (2005). Histone H2B ubiquitylation controls processive methylation but not monomethylation by Dot1 and Set1. Molecular Cell 19, 271- 277.

Shieh, G.S., Pan, C.H., Wu, J.H., Sun, Y.J., Wang, C.C., Hsiao, W.C., Lin, C.Y., Tung, L., Chang, T.H., Fleming, A.B., Hillyer, C., Lo, Y.C., Berger, S.L., Osley, M.A., and Kao, C.F. (2011). H2B ubiquitylation is part of chromatin architecture that marks exon-intron structure in budding yeast. BMC Genomics 12, 627.

Shilatifard, A. (2012). The COMPASS family of histone H3K4 methylases: mechanisms of regulation in development and disease pathogenesis. Annual review of biochemistry 81, 65-95.

Sievers, F., Wilm, A., Dineen, D., Gibson, T.J., Karplus, K., Li, W., Lopez, R., McWilliam, H., Remmert, M., Soding, J., Thompson, J.D., and Higgins, D.G. (2011). Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol 7, 539.

Sims, R.J., 3rd, Millhouse, S., Chen, C.F., Lewis, B.A., Erdjument-Bromage, H., Tempst, P., Manley, J.L., and Reinberg, D. (2007). Recognition of trimethylated histone H3 lysine 4 facilitates the recruitment of transcription postinitiation factors and pre-mRNA splicing. Mol Cell 28, 665-676.

46 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

Southall, S.M., Wong, P.S., Odho, Z., Roe, S.M., and Wilson, J.R. (2009). Structural Basis for the Requirement of Additional Factors for MLL1 SET Domain Activity and Recognition of Epigenetic Marks. Molecular Cell 33, 181-191.

Steger, D.J., Lefterova, M.I., Ying, L., Stonestrom, A.J., Schupp, M., Zhuo, D., Vakoc, A.L., Kim, J.- e., Chen, J., Lazar, M.A., Blobel, G.A., and Vakoc, C.R. (2008). DOT1L/KMT4 recruitment and H3K79 methylation are ubiquitously coupled with gene transcription in mammalian cells. Molecular and cellular biology 28, 2825-2839.

Sun, Z.-W., and Allis, C.D. (2002). Ubiquitination of histone H2B regulates H3 methylation and gene silencing in yeast. Nature 418, 104-108.

Takahashi, Y., Westfield, G., Oleskie, A., Trievel, R., Shilatifard, A., and Skiniotis, G. (2012). Structural analysis of the core COMPASS family of histone H3K4 methylases from yeast to human. Proceedings of the National Academy of Sciences 12, 871-874.

Valencia-Sanchez, M.I., De Ioannes, P., Wang, M., Vasilyev, N., Chen, R., Nudler, E., Armache, J.P., and Armache, K.J. (2019). Structural Basis of Dot1L Stimulation by Histone H2B Lysine 120 Ubiquitination. Mol Cell.

Vermeulen, M., Eberl, H.C., Matarese, F., Marks, H., Denissov, S., Butter, F., Lee, K.K., Olsen, J.V., Hyman, A.A., Stunnenberg, H.G., and Mann, M. (2010). Quantitative interaction proteomics and genome-wide profiling of epigenetic histone marks and their readers. Cell 142, 967-980.

Vermeulen, M., Mulder, K.W., Denissov, S., Pijnappel, W.W., van Schaik, F.M., Varier, R.A., Baltissen, M.P., Stunnenberg, H.G., Mann, M., and Timmers, H.T. (2007). Selective anchoring of TFIID to nucleosomes by trimethylation of histone H3 lysine 4. Cell 131, 58-69.

Voigt, P., LeRoy, G., Drury, W.J., 3rd, Zee, B.M., Son, J., Beck, D.B., Young, N.L., Garcia, B.A., and Reinberg, D. (2012). Asymmetrically modified nucleosomes. Cell 151, 181-193.

Waterhouse, A., Bertoni, M., Bienert, S., Studer, G., Tauriello, G., Gumienny, R., Heer, F.T., de Beer, T.A.P., Rempfer, C., Bordoli, L., Lepore, R., and Schwede, T. (2018). SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res 46, W296-W303.

Weissmann, F., Petzold, G., VanderLinden, R., Huis In 't Veld, P.J., Brown, N.G., Lampert, F., Westermann, S., Stark, H., Schulman, B.A., and Peters, J.M. (2016). biGBac enables rapid gene assembly for the expression of large multisubunit protein complexes. Proc Natl Acad Sci U S A 113, E2564-2569.

Worden, E.J., Hoffmann, N.A., Hicks, C.W., and Wolberger, C. (2019). Mechanism of Cross-talk between H2B Ubiquitination and H3 Methylation by Dot1L. Cell 176, 1490-1501 e1412.

47 bioRxiv preprint doi: https://doi.org/10.1101/841262; this version posted November 13, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license.

Wyce, A., Xiao, T., Whelan, K.A., Kosman, C., Walter, W., Eick, D., Hughes, T.R., Krogan, N.J., Strahl, B.D., and Berger, S.L. (2007). H2B ubiquitylation acts as a barrier to Ctk1 nucleosomal recruitment prior to removal by Ubp8 within a SAGA-related complex. Mol Cell 27, 275-288.

Wysocka, J., Swigut, T., Xiao, H., Milne, T.A., Kwon, S.Y., Landry, J., Kauer, M., Tackett, A.J., Chait, B.T., Badenhorst, P., Wu, C., and Allis, C.D. (2006). A PHD finger of NURF couples histone H3 lysine 4 trimethylation with chromatin remodelling. Nature 442, 86-90.

Xue, H., Yao, T., Cao, M., Zhu, G., Li, Y., Yuan, G., Chen, Y., Lei, M., and Huang, J. (2019). Structural basis of nucleosome recognition and modification by MLL methyltransferases. Nature 573, 445-449.

Zivanov, J., Nakane, T., Forsberg, B.O., Kimanius, D., Hagen, W.J., Lindahl, E., and Scheres, S.H. (2018). New tools for automated high-resolution cryo-EM structure determination in RELION-3. Elife 7.

48