bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

1 Modulation of macrophage inflammatory function through selective inhibition 2 of the epigenetic reader SP140 3 4 Mohammed Ghiboub1,3,5, Jan Koster2, Peter D. Craggs4, Andrew Y.F. Li Yim3,6, Anthony Shillings4, Sue 5 Hutchinson4, Ryan P. Bingham4, Kelly Gatfield4, Ishtu L. Hageman1, Gang Yao7, Heather P. O'Keefe7, 6 Aaron Coffin8, Amish Patel9, Lisa A. Sloan4, Darren J. Mitchell4, Laurent Lunven4, Robert J. Watson4, 7 Christopher E. Blunt4, Lee A. Harrison4, Gordon Bruton4, Umesh Kumar3, Natalie Hamer3, John R. 8 Spaull4, Danny A. Zwijnenburg2, Olaf Welting1, Theodorus B.M. Hakvoort1, Johan van Limbergen1,5, 9 Peter Henneman6, Rab K. Prinjha3, Menno PJ. de Winther10,11, Nicola R. Harker3¶, David F. Tough12¶✉, 10 Wouter J. de Jonge1,13¶✉

11 1Tytgat Institute for Liver and Intestinal Research, Amsterdam Gastroenterology & Metabolism, 12 Amsterdam University Medical Centers, University of Amsterdam, the Netherlands. 13 2Department of Oncogenomics, Amsterdam University Medical Centers, University of Amsterdam and 14 Cancer Center Amsterdam, Amsterdam, Netherlands. 15 3Immuno-Epigenetics, Adaptive Immunity Research Unit, GlaxoSmithKline, Medicines Research Centre, 16 Stevenage, United Kingdom. 17 4Medicine Design, Medicinal Science and Technology, GlaxoSmithKline, Stevenage, United Kingdom. 18 5Department of Pediatrics, Division of Pediatric Gastroenterology & Nutrition, Emma Children's Hospital, 19 Amsterdam University Medical Centers, University of Amsterdam, the Netherlands. 20 6Department of Clinical Genetics, Genome Diagnostics Laboratory, Amsterdam Reproduction & 21 Development, Amsterdam University Medical Centers, University of Amsterdam, the Netherlands. 22 7GlaxoSmithKline, Cambridge, Massachusetts, USA. 23 8Constellation Pharmaceuticals, Cambridge, Massachusetts, USA. 24 9WuXi AppTec, Cambridge, Massachusetts, USA. 25 10Department of Medical Biochemistry, Amsterdam University Medical Centers, University of 26 Amsterdam, Amsterdam, Netherlands. 27 11Institute for Cardiovascular Prevention (IPEK), Munich, Germany. 28 12Adaptive Immunity Research Unit, GlaxoSmithKline, Medicines Research Centre, Stevenage, United 29 Kingdom. 30 13Department of Surgery, University of Bonn, Bonn, Germany. 31 ¶ Shared senior authorship. 32 ✉Corresponding authors: [email protected]; [email protected]. 33 34

35

1 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

36 Abstract

37 Speckled 140 KDa (SP140) is a nuclear body protein, mainly expressed in immune cells, which contains 38 multiple domains suggestive of an epigenetic reader function; namely a bromodomain, a PHD domain and 39 a SAND domain. Single nucleotide polymorphisms and epigenetic modifications in the SP140 locus have 40 been linked to autoimmune and inflammatory diseases including Crohn’s disease (CD). However, little is 41 known about the cellular function of SP140; this is due in part to the fact that, unlike for other many other 42 epigenetic , no small molecule inhibitors have been available to investigate the biological role of 43 SP140. We report the discovery of the first small molecule SP140 inhibitor (GSK761) and utilize this to 44 elucidate SP140 function in innate immune cells. We show that SP140 is highly expressed in CD68+ CD 45 mucosal macrophages and in in vitro-generated inflammatory macrophages. SP140 inhibition through 46 GSK761 reduced monocyte differentiation into inflammatory macrophages and lipopolysaccharide (LPS)- 47 induced inflammatory activation, whilst inducing the generation of CD206+ regulatory macrophages that 48 mark anti-TNF remission induction in CD patients. ChIP-seq analyses revealed that SP140 preferentially 49 occupies transcriptional start sites (TSS) in inflammatory macrophages, with enrichment at loci 50 encoding pro-inflammatory cytokines/chemokines and inflammatory pathways. GSK761 specifically 51 reduced SP140 binding and thereby expression of SP140-dependent downstream inflammatory . 52 Notably, in CD14+ macrophages isolated from CD intestinal-mucosa, GSK761 inhibited the spontaneous 53 expression of cytokines, including TNF. Together, this study identifies SP140 as a druggable epigenetic 54 reader and therapeutic target for CD.

55 Introduction

56 Although various therapeutic strategies provide benefit in Crohn’s disease (CD), the unmet need remains 57 high with only approximately 30% of patients maintaining long term remission1,2,3,4. Intestinal 58 macrophages, which arise mainly from blood-derived monocytes, play a central role in maintaining gut 59 homeostasis in health, but are also implicated as key contributors to CD5,6. These opposing activities are 60 linked to the capacity of macrophages to take on different functional properties in response to 61 environmental stimuli. Epigenetic processes play an important role in the regulation of macrophage 62 differentiation and gene expression, mediated by DNA methylation and histone post-translational 63 modification7,8. As such, epigenetic mechanisms mediated through the activity of specific histone 64 acetylases and deacetylases activity are implicated in CD pathogenesis9,10,11.

65 Speckled 140 KDa (SP140) is a nuclear body protein12,13 predominantly expressed in immune cells14,15. 66 Genome-wide association studies identified an association between single nucleotide polymorphisms in 67 SP140 and CD16 while rare SP140-associated variants have been identified in pediatric CD patients17.

2 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

68 Further, we identified two differentially hyper-methylated positions at the SP140 locus in blood cells of 69 CD patients18, suggesting aberrant regulation of its expression. Although little is known about how SP140 70 may be functionally linked to CD, a recent study reported that SP140 knockdown in in vitro-generated 71 inflammatory macrophages results a decrease in levels of pro-inflammatory cytokines10. Lower expression 72 of SP140 in intestinal biopsies was correlated with an improved response to anti-TNF therapy in CD 73 patients10. Conversely, in biopsies from those who showed resistance to anti-TNF therapy, SP140 was 74 highly expressed10. Together, these previous findings suggest a role for SP140 in CD pathogenesis.

75 SP140 protein harbors both a bromodomain (Brd) and a PHD finger10,15. Since Brd and PHD domains are 76 capable of binding to histones, this suggests that SP140 might function as an epigenetic reader. Readers 77 are recruited to chromatin possessing specific epigenetic ‘marks’ and regulate gene expression through 78 mechanisms that modulate DNA accessibility to transcription factors (TFs) and the transcriptional 79 machinery19. Brds are tractable to small molecule antagonists, with selective inhibitors of several Brd- 80 containing proteins (BCPs) reported20. Such compounds have helped elucidate the function of their BCP 81 targets; for instance, multiple inhibitors targeting BRD2, BRD3 and BRD4 have been used to demonstrate 82 the role of these proteins in selectively regulating gene expression in the context of cancer and 83 inflammation9,21,22. To date, no inhibitors targeting SP140 have been reported.

84 Here, we describe the discovery of the first specific SP140 inhibitor and utilize this compound to 85 investigate the functional relevance of SP140 in CD pathogenesis, focusing on its role in macrophages. 86 We show that SP140 functions as an epigenetic reader in controlling the generation of an inflammatory 87 macrophage phenotype, and in regulating the expression of pro-inflammatory and CD-associated genes. 88 We show that GSK761 inhibits SP140 binding to an array of inflammatory genes and demonstrate its 89 efficacy in CD colon macrophages ex vivo, presenting evidence for SP140 as a target for regulating 90 inflammation in CD. In addition, this study suggests that SP140 inhibition may also serve to support anti- 91 TNF therapy in non-responder CD patients.

92 Results 93 94 Elevated SP140 expression in inflammatory diseases and mucosal macrophages of CD patients. 95 SP140 gene expression in a variety of cells and tissues was analyzed using an in-house GSK microarray 96 profiler. SP140 was predominantly expressed in blood, CD8+ and CD4+ T cells, monocytes and in 97 secondary immune organs such as lymph node and spleen (Fig. 1a). We then investigated SP140 98 expression in white blood cells (WBC) and intestinal tissue of IBD patients versus controls. While SP140 99 gene expression in WBC was comparable in IBD and healthy controls (Fig. 1b), SP140 gene and protein 100 expression were found to be increased in inflamed colonic mucosa of both CD and UC patients (Fig. 1c,d

3 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

101 and S1). Notably, an increase in CD68+SP140+ but not CD68+SP140- macrophages was observed in CD 102 inflamed colonic mucosa compared to normal control mucosa (Fig. 1e,f). Elevated SP140 expression was 103 also found in other inflammatory conditions including in WBC of systemic lupus erythematosus and 104 rheumatoid arthritis patients (Fig. 1b), and in inflamed tissues of appendicitis, sarcoidosis, psoriatic 105 arthritis, rheumatoid arthritis, Hashimoto's thyroiditis and Sjogren's syndrome patients (Fig. S1). High 106 expression of SP140 was also observed in chronic lymphocytic leukemia (Fig. 1b).

107 Next, we analyzed data set from a publicly available single-cell RNA-sequencing experiment23 of 108 inflamed and uninflamed ileal tissue (biopsies) obtained from CD patients. Unsupervised clustering 109 analysis identified 22 cluster blocks (Fig. 2a). Based on the expression of CD14, CD16 (FCGR3A), CD64 110 (FCGR1A), CD68, CD163 and CD206 (MRC1), we identified cluster 6 as containing monocytes, 111 macrophages and dendritic cells (Fig. 2b,c), representing mononuclear phagocytes. Inflamed ileal biopsies 112 contained more cells in cluster 6 as compared to uninflamed biopsies (Fig. S2). Interestingly, within 113 cluster 6 the number of SP140-expressing cells was higher in inflamed ileal biopsies as well (Fig. 2d). In 114 addition, the percentage of SP140 expressing cells in cluster 6 was higher in inflamed compared with 115 uninflamed tissue, which coincides with an increased number of CD68-, CD64- CD14- and CD16- 116 expressing cells (Fig. 2e).

117 SP140 mediates inflammatory M1 macrophage function. Given the high SP140 expression in CD 118 mucosal macrophages, we investigated whether SP140 is associated with inflammatory activation. Human 119 CD14+ monocytes or THP-1 cells were differentiated into macrophages and then polarized into M1 and 120 M2 phenotypes using IFN-γ and IL-4 respectively or left without treatment (M0) (Fig. 3a). The 121 polarization was verified by measuring gene expression of CD64 and CCL5 (markers of M1 macrophages) 122 and CD206 and CCL22 (markers of M2 macrophages) (Fig. S3a,b). SP140 mRNA was expressed at 123 significantly higher levels in M1 compared to M0 or M2 macrophages derived from primary monocytes 124 (Fig. 3b) or from THP-1 cells (Fig. S3c). Protein staining showed increased numbers of SP140 protein- 125 containing speckles in the nucleus of M1 compared to M2 macrophages derived from both human 126 monocytes (Fig. 3c) and THP-1 cells (Fig. S3d). LPS induced an increase in SP140 gene expression in M0 127 and M1 macrophages, along with enhanced CCL5 and decreased CCL22 gene expression (Fig. S3e). The 128 expression levels of 13 other BCPs (SP140L, SP100, SP110, BRD2, BRD3, BRD4, BRD9, BRDT, BAZ2A, 129 BAZ2B, PCAF, EP300 and CREBBP) showed no increase in M1 compared to M0 macrophages (Fig. 130 S4a), indicating that the association between SP140 and inflammatory macrophages was not a common 131 phenomenon amongst BCPs.

4 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

132 To investigate SP140 function, we initially used siRNA-mediated knock-down to reduce SP140 133 expression in M1 macrophages, achieving approximately 75% reduction in mRNA (Fig. 3e) and a clear 134 decrease in the number of SP140 protein-containing speckles (Fig. 3f). Knockdown seemed specific to 135 SP140, as expression of related family members SP110 (Fig. S4b) and SP100 was not affected (Fig. 136 S4b,c). After knockdown, cells were simulated with LPS or kept unstimulated (Fig. 3d). SP140 silencing 137 led to a decrease in LPS-induced IL-6 and TNF mRNA and protein levels (Fig. 3g). After transcriptional 138 profiling, principal component analysis (PCA) revealed that the variance was most associated with SP140 139 siRNA treatment in LPS-stimulated M1 macrophages (Fig. 3h). Downregulation of some key CD- 140 associated genes, notably CXCL9, CEACAM1 and JAK2, was apparent amongst SP140-siRNA-treated 141 unstimulated macrophages, and IL2RA, TRIM69 and ADAM9 amongst SP140-siRNA LPS-stimulated 142 macrophages (Fig. 3i). SP140 silencing affected common inflammatory pathways, such as TNF-signaling 143 via NFKB, IFN-γ response, inflammatory response (Fig. 3j) and cytokine-signaling (Fig. 3k).

144 The development of a selective inhibitor of SP140 (GSK761). To identify selective SP140 binding 145 compounds, we utilized encoded library technology to screen the GSK proprietary collection of DNA- 146 linked small molecule libraries29, 30. Affinity selection utilizing a recombinant protein construct spanning 147 the PHD and Brd domains of SP140 was carried out, leading to identification of an enriched building 148 block combination in DNA Encoded Library 68 (DEL68) (Fig. 4a). Representative compounds of the 149 identified three-cycle benzimidazole chemical series were synthesized, which yielded the small molecule, 150 GSK761 (Fig. 4b). To quantify the interaction between GSK761 and SP140 (aa 687-867), the dissociation

151 constant (Kd) was determined using a fluorescence polarization (FP) binding approach. A fluorophore-

152 conjugated version of GSK761, GSK064 (Fig. 4c), was prepared and subsequently used to determine a Kd 153 value of 41.07 ± 1.42 nM for the interaction with SP140 (aa 687-867) (Fig. 4d). To further validate this 154 interaction, competition studies were carried out using GSK761 in a FP-binding assay configured using 155 GSK064 and SP140 (aa 687-867). Competitive displacement of GSK064 from SP140 (aa 687-867) by

156 GSK761 was observed, subsequently leading to the determination of an IC50 value of 77.79 ± 8.27 nM 157 (Fig. 4e). Prior to utilizing GSK761 for cell-based binding studies to endogenous SP140, the cell 158 penetration capacity of the compound was assessed using mass spectrometry and the methodology 28 159 described by . Concentration measurements showed that GSK761 had a pΔCtotal value of 1.45 ± 0.21 160 (data not shown), indicating that GSK761 permeates human cells and accumulates intracellularly by an 161 order of magnitude, when compared to GSK761 free in solution.

162 To confirm the binding of GSK761 to full-length SP140, immobilized GSK761 was used to probe for 163 SP140 in nuclear extracts from HuT78 cells and HeLa cells transfected with Halo-tagged SP140. Both 164 endogenous and Halo-tagged SP140 were pulled down with biotinylated GSK761 and visualized via

5 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

165 Western Blotting (Fig. 4f). Endogenous SP140 is observed as a doublet containing the four largest 166 isoforms (predicted 86-96KDa)10 while Halo-tagged SP140 exhibits a single band. The specificity of 167 GSK761 for SP140 was profiled using the BROMOscan® Assay, in which DNA tagged BCPs were 168 incubated with an increasing concentration of GSK761 or DMSO. Binding is assessed by measuring the 169 amount of bromodomain captured in GSK761 vs DMSO samples by using an ultra-sensitive qPCR. There 170 was evidence of low affinity interaction between GSK761 and several tested BCPs, with binding detected 171 at concentrations >21000 nM. However, no binding was detected at concentrations ≤ 21000 nM indicating 172 a high degree of specificity of GSK761 for SP140 (Supplemental table 2).

173 GSK761 reduces the inflammatory activation of macrophages and expression of pro-inflammatory 174 cytokines. SP140 expression was selectively increased in M1 compared to M0 macrophages, prompting 175 us to test whether SP140 is required for the polarization to inflammatory macrophage phenotype. We first 176 tested whether GSK761 demonstrated any toxicity in macrophages to determine the appropriate dose 177 range. At ≤0.12 µM, GSK761 showed no cytotoxicity (Fig. S5a,b). To investigate the effect of GSK761 178 on macrophage polarization to inflammatory phenotype, M0 macrophages were treated with either DMSO 179 or GSK761 for 3 days in presence of IFN-γ or IL-4 (during the polarization to M1 and M2 macrophages, 180 respectively). GSK761 enhanced mRNA expression of the anti-inflammatory marker CD206 in both M1 181 and M2 macrophages and decreased the pro-inflammatory marker CD64 in M1 macrophages (Fig. 5a). 182 FACS analysis showed that GSK761-treatment prior to IFN-γ (M1) polarization reduced CD64+ cells and 183 CD64 protein expression (Fig. 5b) and increased CD206+ cells and CD206 protein expression (Fig. 5c). 184 Adding GSK761 to the cells during IFN-γ (M1) polarization lowered TNF (M1 polarization marker) gene 185 expression in these cells compared to the control (Fig. S5c). Altogether, these data suggest that SP140 186 inhibition during M1 polarization biased differentiation towards a regulatory M2 phenotype.

187 We next assessed the effect of SP140 inhibition on the response of M1 polarized macrophages to 188 inflammatory stimuli. LPS-stimulated M1 macrophages pretreated with GSK761 showed a strong 189 reduction in secretion of IL-6, TNF, IL-1β and IL-12 (Fig. 5e) at concentrations where cytotoxicity was 190 not observed (0.04 and 0.12 µM). When employing RNA transcriptional profiling using a customized 191 qPCR array, SP140 inhibition was found to reduce the expression of many other pro-inflammatory 192 cytokines and chemokines including GM-CSF, CCL3, CCL5 and CCL1 (Fig. 5d).

193 The marked anti-inflammatory effects of GSK761 on macrophages in vitro suggests that targeting SP140 194 may be an effective approach for IBD. Unfortunately, due to poor in vivo pharmokinetics (data not 195 shown), GSK761 was not suitable to evaluate the effects of SP140 inhibition in in vivo animal models of 196 colitis. To evaluate the impact of SP140 inhibition in human CD, CD14+ mucosal macrophages were 197 isolated from CD anti-TNF refractory patients’ colonic-mucosa and then cultured in vitro with either

6 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

198 DMSO or GSK761 for 4h. Spontaneous gene expression of TNF, IL6 and IL10 was strongly decreased in 199 GSK761-treated macrophages, demonstrating that GSK761 inhibited their immune reactivity (Fig. 5f). 200 Previous studies have demonstrated that successful anti-TNF therapy is associated with an increase in 201 CD206+ regulatory macrophages31 and a decrease in pro-inflammatory cytokines10 and SP140 expression10 202 in CD mucosa. Thus, our data suggests that SP140 inhibition could be also beneficial in supporting the 203 anti-TNF therapy.

204 SP140 preferentially interacts with transcription start sites (TSS) and enhancer regions. SP140 205 possesses multiple chromatin binding domains, namely Brd, PHD and SAND (SP100, AIRE-1, 206 NucP41/75, DEAF-1) domains, which may allow it to function as an epigenetic reader 10, 32. Indeed, we 207 found that recombinant SP140 Brd-PHD protein bound to histone H3 peptides with sub-µM affinity (Fig. 208 S6a). The strongest binding was observed for an unmodified peptide corresponding to the N-terminal 21 209 aa, while methylation at lysine residue 4 (K4) led to substantially (~30-fold) reduced affinity (Fig. S6b). 210 K9 acetylation also resulted in lower affinity binding (4-5-fold), while SP140 binding was unaffected by 211 either acetylation or methylation of K14 (Fig. S6c). Similar binding affinity was observed for unmodified

212 H31-21 vs H31-18, while reduced (4-5-fold) but still significant SP140 binding was measured for H31-9 (Fig. 213 S6c). Taken together, these data suggest that the SP140 Brd-PHD module functions as a reader for un- 214 modified histone H3, with binding focused around the N-terminal 9 aa.

215 To evaluate whether native SP140 is also capable of binding to histones, we used immobilized histone H3 216 peptides with varying modifications to precipitate proteins from nuclear extracts of anti-CD3/CD28- 217 stimulated HuT78 T cells and probed via Western blotting for the presence of SP140. SP140 was

218 efficiently pulled down by un-modified H31-21 (Fig. 6a). Notably, H31-21 peptides bearing certain 219 methylation and acetylation marks (K4me3, K9me3, K9ac and K14ac) were also capable of capturing 220 native SP140 (Fig. 6a), despite the lower measured affinity of some of these for recombinant SP140 Brd- 221 PHD (see above). To evaluate whether SP140 interacts with histone H3 in a cellular setting, we utilized a 222 NanoBRET system (Fig. S6d). NanoBRET is a proximity-based assay that can detect protein interactions 223 by measuring energy transfer from a bioluminescent protein donor NanoLuc® (NL) to a fluorescent 224 protein acceptor HaloTag® (HT). This energy transfer was observed in the nuclei of HEK293 cells 225 transfected with SP140-NL and Histone3.3-HT DNA, indicating close proximity of the two proteins.

226 To assess whether SP140 associates with chromatin in macrophages and how this might be regulated in 227 inflammation, we initially conducted ChIP-qPCR experiments in unstimulated or LPS-stimulated M1 228 macrophages. Binding of SP140 to the TSS of TNF and IL6 genes was observed in unstimulated cells, and 229 this was increased following LPS-stimulation (Fig. 6b). We then evaluated chromatin occupancy of SP140 230 on a genome-wide level using ChIP-seq and tested the effects of GSK761 on both SP140 binding and gene

7 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

231 expression (RNA-seq) in the context of LPS-stimulation. In DMSO-treated, LPS-stimulated macrophages, 232 epigenome roadmap scan analysis revealed that the majority of SP140 occupancy was at strong 233 transcription associated-regions, enhancers and TSS regions (Fig. 6c). However, this occupancy was 234 decreased when the cells were pretreated with GSK761 at 1h (Fig. S6f) and 4h (data not shown) post LPS- 235 stimulation. Heatmap rank ordering of SP140 occupancy in unstimulated and LPS-stimulated M1 236 macrophages showed a strong SP140 enrichment at the TSS (Fig. 6d). The top 20 enriched genes 237 belonged mostly to those involved in the innate immune response, including TNF, ICAM4, IRF1, LITAF, 238 TNFAIP2 and NFKBIA (Fig. 6d). However, the most SP140-enriched gene FLOT1 (Fig. 6d) has been 239 reported to be strongly involved in tumorigenesis33. We found SP140 also to occupy active enhancers, as 240 marked by H3K27Ac in human macrophages by overlapping the publicly available H3K27Ac ChIP-seq 241 data (GSE54972) with our SP140 ChIP-seq dataset at 1h of LPS-stimulation (Fig. S6e). ChIP-qPCR 242 showed a strong reduction of SP140 occupancy at the TNF-TSS in GSK761-treated M1 macrophages, 243 which was reduced to that of unstimulated cells (Fig. 6e), correlating with the previously observed 244 reduced expression of TNF in GSK761-treated macrophages.

245 Following LPS-stimulation, we observed a marked increase in binding at the TSS of the top 1000 genes 246 occupied by SP140, reaching its maximum at 4h (Fig. 6f). GSK761-treatment prior to LPS-stimulation 247 strongly reduced SP140 occupancy at the TSS (Fig. 6f) associated with an altered LPS-induced gene 248 expression (Fig. 6g). PCA of RNA-seq data revealed a large separation in global gene expression between 249 DMSO- and GSK761-treated macrophages at both time points post of LPS-stimulation (4h and 8h) (Fig. 250 6g).

251 and Reactome enrichment analysis of SP140-bound genes showed enrichment of genes 252 which participate in the processes of cytokine response and immune response that were inhibited through 253 GSK761 pre-exposure (Fig. 6h and S6g). To elucidate the regulatory pathways enriched by SP140 and 254 affected by GSK761, we carried out hallmark pathway analysis for differentially SP140 bound-genes 255 (DBGs) (Fig. 6i) and DEGs (Fig. 6j) upon DMSO- or GSK761-treatment at each time point of LPS- 256 stimulation. In DMSO-treated M1 macrophages, we found that SP140 binding is predominantly enriched 257 at many pathways typically defined as inflammatory such as TNF-signaling via NFKB, and inflammatory 258 response (Fig. 6i). However, this enrichment was no longer seen in GSK761-treated macrophages (Fig. 259 6i). Similar pathways were observed for DEGs and were downregulated in GSK761-treated macrophages, 260 especially at 4h (Fig. 6j). In addition, we observed an up-regulation of MYC targets and oxidative 261 phosphorylation gene sets in GSK761-treated macrophages, whereas these normally are down-regulated 262 because of the induction of aerobic glycolysis upon LPS-stimulation of macrophages (Fig. S7a)34. These

8 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

263 data suggest SP140 as a critical regulator of genes involved in the inflammatory response and that 264 GSK761 can inhibit this response through displacing SP140 binding to TSS and enhancer regions.

265 SP140 preferentially controls the expression of specific gene sets involved in the innate immune 266 response. We next investigated the functional significance of inhibiting SP140 binding for gene 267 expression. Heatmap (Top 100) and volcano plots of DEGs after 4h LPS-stimulation demonstrated a 268 strong effect of GSK761 on gene sets that are involved in innate immune response such as the 269 downregulation of TNFSF9, IL6, F3, CXCL1, CCL5, IL1β, TRAF1, IL23A, IL18 and GEM and the 270 upregulation of PLAU and CXCR4 (Fig. 7a,b). We next explored the global DBGs in 1h LPS-stimulated 271 M1 macrophages using R2 TSS-plot (Fig. 7c) and R2 TSS-peak calling (Fig. 7d). Interestingly, the top 272 DBGs belonged to TNF-signaling via NFKB such as TNFAIP2, TNF, LTA, TRAF1, IRF1 and NFKBIA 273 (Fig. 7c,d). GSK761 clearly reduced SP140 binding to those genes (Fig.7,d). Similarly, gene expression of 274 TNFAIP2, TAF1, TNF, LTA, and IRF1 was reduced (Fig. 7e). By conducting a focused TNF-signaling 275 enrichment analysis (Fig. S7b) and R2 TSS-plot analysis (Fig. S7c), we illustrated a new set of DBGs 276 including TFs (TNFAIP3, CEBPB, ATF4, MAP3K8, MAP2K3 and JUNB), cytokines (IL1β, IL23A) and 277 adhesion molecules (ICAM1). Most of these genes were DEGs (Fig. S7d).

278 We next verified the impact of global reduced SP140 binding on gene expression by integrating the DBGs 279 and the DEGs. Interrogating the top 1000 DBGs (at 1h or 4h LPS) for their gene expression (at 4h or 8h 280 LPS), implicated genes that are involved in TNF-signaling pathway (NFKBIA, SOD2, LITAF, SGK1, 281 PNRC1, IRF1 and RNF19B) and cytokine-mediated signaling pathway (For example: IL10RA, NFKBIA, 282 IRF1, SOD2, CAMK2D, IRF2 and ISG15) which showed the strongest concordant differential binding and 283 expression (Fig. 7f and S8).

284 In addition to SP140 binding to chromatin, we speculated that SP140 may also interact with other TFs 285 proteins to regulate gene expression. To this end, we performed homer known motif enrichment analysis 286 (HKMEA) at DMSO_1h LPS (Fig. 7g). We also conducted a homo de novo motif discovery to define new 287 DNA sequence motifs that are recognized specifically by SP140 (Fig. 7g). Interestingly, HKMEA 288 suggests that SP140 may bind the same DNA sequence motifs recognized by certain TFs defined as key 289 proteins in TNF-signaling via NFKB (ATF3, FOSL2, FOSL1, NFKB-p65-rel, JUNB and JUN-AP1) and 290 of cytokine-mediated signaling pathway (BATF, NFKB-p65-rel, JUNB, IRF8 and IRF3) (Fig. 7g).

291 LPS-stimulation strongly recruits SP140 to multiple human leukocyte antigen (HLA) genes, including 292 HLA-A, HLA-B, HLA-C, HLA-F, HLA-DPA and HLA-DPB (Fig. S7e), binding which was strongly 293 reduced by GSK761. This indicates a role of SP140 in regulating antigen presentation-associated genes. 294 Notably, the Brd-containing protein 2 (BRD2) gene was occupied by SP140, and GSK761 reduced this

9 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

295 binding and lowered BRD2 gene expression (Fig. S7e and table 5). BRD2 has been reported to play a key 296 role in inflammatory response in murine macrophages and in inducing insulin resistance35.

297 Finally, we noticed that SP140 occupancy on TSS of chemokine activity genes was dramatically reduced 298 by GSK761, inducing a reduced global chemokine activity at mRNA level (Fig. 7h). R2 TSS-plots and R2 299 TSS-peak calling indicate a strong SP140 occupancy at several inflammatory macrophage-associated 300 chemokines (CCL2, CCL3, CCL4, CCL5, CCL8, CXCL1, CXCL3, CXCL8 CXCL9, CXCL10 and 301 CXCL11), chemokine ligands (CCL4L1, CCL3L1, CCL3L3, CCL4L2), chemokine receptors (CCR7), TFs 302 (STAT1, JAK2, PIK3R5, RELA) and protein kinases (PRKCD, FGR and HCR) (Fig. S9a,b,c). This binding 303 was reduced by GSK761, affecting expression of many genes involved in chemokine-signaling (Fig. S9d). 304 However, SP140 binding was selective as it did not bind to the TSS of a range of other chemokines such 305 as CXCL6, CXCL5, CCL11, and CCL7 (Fig. S9c).

306 Discussion

307 SP140 has been implicated in CD and other autoimmune diseases through genetic and epigenetic 308 association studies16,36. Here, we describe the first small molecule inhibitor of SP140 protein, which has 309 been used to investigate its function. The novel SP140-binding small molecule GSK761 was shown to 310 compete with the N-terminal tail of histone H3 for interactions with the SP140 Brd-PHD module. In 311 human macrophages, SP140 associated with the regulatory regions of immune-related/inflammatory genes 312 in a stimulus-dependent manner. GSK761 inhibited SP140 binding to the TSS of many inflammatory 313 genes, correlating with compound-induced decreased expression of these genes and reduced macrophage 314 inflammatory function. The predominant SP140 expression in immune cells and in particular 315 inflammatory macrophages, further identify SP140 as an epigenetic reader protein that contributes to 316 inflammatory gene expression.

317 Inflammatory macrophages generated in vitro or in CD patients were shown to express high levels of 318 SP140, and SP140 silencing in macrophages reduced expression of a range of inflammatory genes. These 319 anti-inflammatory effects were extended using GSK761, which was able to inhibit spontaneous cytokine 320 expression from CD14+ macrophages isolated from CD anti-TNF refractory patients’ colonic mucosa. 321 LPS-stimulation of macrophages was shown to cause SP140 recruitment to the TSS of a specific set of 322 inflammatory genes, which was prevented by GSK761, dampening the induced expression of those genes. 323 The strongest SP140 occupancy was at the TSS of a range of genes involved in TNF-signaling. TNF- 324 signaling plays an integral role in CD as evidenced by the efficacy of chimeric anti-TNF mAbs-therapy37. 325 Notably, anti-TNF therapy response rests on TNF neutralisation, but specifically in CD also on its potency 326 to bind membrane bound TNF on CD-tissue infiltrated macrophages38, which causes differentiation of

10 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

327 macrophages to a more regulatory CD206+ M2-like phenotype31,39. This is for example evidenced by the 328 clinical observation using etanercept (does not bind membrane bound TNF), which is not effective in CD, 329 whilst infliximab is39. A reduced expression of SP140 in CD-biopsies was shown to be correlated with a 330 better anti-TNF response10. In this study, we found that GSK761 increased CD206 expressing 331 macrophages in in vitro. Furthermore, GSK761 reduced pro-inflammatory cytokine expression in CD- 332 mucosal macrophages of anti-TNF resistant patients. Thus, we anticipate that SP140 inhibition would be 333 useful in enhancing anti-TNF remission-induction in CD patients. In this respect, SP140 was also shown 334 by ChIP-seq analysis to bind and regulate expression of some key CD-associated genes such as IL23A40, 335 TYK241, JAK241, NOD242 and CARD943.

336 In IBD, epigenetic mechanisms control macrophage inflammatory activation and polarization44. These 337 macrophages are central components of the inflamed mucosa and contribute to disease pathology by 338 producing inflammatory cytokines, which promote the differentiation and activation of Th1 and Th17 339 cells45. In this study, SP140 is shown to be critical for both the polarization of inflammatory macrophages 340 and in the response of polarised macrophages to inflammatory stimuli. In particular, the TFs STAT1 and 341 STAT2 are crucial for inflammatory macrophages polarization5,46,47. We found high levels of SP140 342 binding associated with the STAT1 and STAT2 genes; GSK761 disrupted SP140 binding to these genes 343 and reduced their expression, providing a probable mechanistic basis for the inhibitory effects of the 344 compound on M1 polarisation. Notably, GSK761 inhibited macrophage-induced cytokines and co- 345 stimulatory molecules required for inflammatory T cell activation including IL23A40, IL1248, IL1849, 346 IL1β49, CD4050 and CD8051. Therefore, in addition to reducing the direct inflammatory function of 347 macrophages, inhibition of SP140 may inhibit downstream activation/polarisation of responding T cells.

348 While the mechanism by which SP140 is targeted to gene regulatory regions remains to be determined, 349 motif analysis indicated a strong overlap between the DNA sequences enriched for SP140 binding and the 350 binding motifs of some key TFs involved in regulating the inflammatory response as well as the 351 differentiation and polarization to inflammatory macrophages, such as NFkB-p5052, IRF353 and IRF854. 352 Thus, it is possible that SP140 is recruited to chromatin via binding to stimulation-induced TFs – either 353 before or after the TFs bind to DNA. Conversely, SP140 may associate with other proteins recruited to 354 sites bound by these TFs. By conducting ChIP-seq at multiple time points (1 and 4h), we observed 355 differential binding kinetics of SP140 to different sets of genes; for example, SP140 is bound to the CCL5 356 TSS by 1h after LPS-stimulation but the TSS of CCL2 only after 4h. The factors controlling the 357 differential binding kinetics are currently unknown but could include differences in the initial chromatin 358 environments of these genes in addition to distinct kinetics of TF activation and binding.

11 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

359 In a previous study, SP140 was found to be enriched at the HOXA genes in macrophages, and SP140 360 binding at these locations was proposed to play a role in repression of these lineage-inappropriate genes10. 361 However, SP140 binding to this region was minimal in our study in both unstimulated and LPS-stimulated 362 macrophages. The reason for this discrepancy is unclear, but the use of different SP140 antibodies in the 363 two studies is one possibility. The antibody used in our ChIP experiments was found to be the most 364 SP140-specific amongst 5 antibodies that we tested. Importantly, the displacement of SP140 binding by 365 GSK761 provides strong support for the specificity of the SP140 binding identified in our ChIP-seq study.

366 Analysis of the gene sets identified in our SP140 ChIP-seq study suggested a role for SP140 in immune 367 defence against microbes (in accordance with55 and56) and some other diseases, such as graft-versus-host- 368 disease, cancers and rheumatoid arthritis. The intestinal microbiota is thought to play a central role in the 369 pathogenesis of CD, and subsequent investigations could consider the role of SP140 in this respect as a 370 previous study linked SP140 SNPs to microbial dysbiosis in human intestinal microbiota57. In this study, 371 ChIP-seq analysis was mainly focused on investigating SP140 binding to coding genes. However, by 372 including non-coding RNA molecules, we observed very strong SP140 binding to a limited set of 373 microRNAs (miRNAs) mainly involved in tumorigenesis, such as MIR368758, MIR364859 and 374 MIR663A60. Accordingly, previous studies have demonstrated that certain BCPs (such as BRD4) are 375 capable of binding and modulating transcription of some miRNAs involved in tumorogenesis61,62. 376 Contrary to the coding genes, GSK761 was unable to affect SP140 binding to miRNAs. This suggests that 377 SP140 binding in these locations is independent of the Brd-PHD region with which GSK761 interacts; 378 binding of SP140 to DNA via the SAND domain is one possibility. While not the focus of this study, other 379 potential links between SP140 and cancer were observed, including high expression of SP140 in chronic 380 lymphocytic leukemia cells (although reduced expression in acute myelogenous leukemia), and high 381 binding of SP140 to the FLOT1 gene, which has been implicated in tumorigenesis33

382 We conclude that SP140 is an epigenetic reader protein that regulates gene expression in macrophages in 383 response to inflammatory stimuli, including those present in the CD intestinal mucosa of patients failing 384 therapy. Targeting SP140 with a selective inhibitor was shown to displace SP140 from chromatin and 385 decrease inflammatory gene expression in macrophages. This study suggests that SP140 inhibition could 386 represent a promising approach for CD, which might include in addition to remission induction, the 387 potential for combination with anti-TNF therapy.

388 Competing interests: PDC, AYFLY, AS, SH, RPB, KG, GY, HPK, AC, AP, LAS, DJM, LL, RJW, CEB, 389 LAH, GB, UK, NH, JRP, RKP, NRH and DFT were employed by GSK at the time of conducting this 390 study. MG, JK, ILH, DAZ, OW, TBMH, JVL, PH, MPJW and WJDJ were employed by Amsterdam

12 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

391 University Medical Centers at the time of conducting this study. GSK has a patent EP2643462B1 related 392 to the therapeutic use of SP140 inhibitors. 393 Acknowledgment: We thank Charlotte J. Barrett, previous student trainee at GSK for the assistance with 394 a compound synthetic work. Funding: This project was funded by European Union’s Horizon 2020 395 research and innovation program under Grant Agreement No. ITN-2014-EID-641665. Further grant 396 support is acknowledged from Dutch Ministry of Economical Affairs, LSH TKI, grants nr TKI-LSH 397 T2017, and European Crohn’s and Colitis Organization (ECCO) Pioneer Grant, 2018. 398 Author Contributions. Conduction of the study, laboratory and writing of the manuscript: MG; study 399 design: MG, WJdJ, NRH and DFT; bioinformatics analyses; MG, JK, AYFLY and DAZ; supervision: 400 WJdJ, NRH, DFT, MPJV and PH; reviewing and editing: WJdJ, NRH, DFT, MPJV, JVL, TBMH, PH and 401 RKP; compound development: PDC, AS, SH, RPB, KG, LAS, DJM, GY, HPO, AC, AP, LL, RJW, CEB, 402 LAH and UK; laboratory analysis: OW, NH, JRS and ILH; technical support: GB. All authors have read 403 and agreed to the published version of the manuscript. 404 405 Materials and methods

406 The human biological samples were sourced ethically, and their research use was in accord with the terms 407 of the informed consents under an IRB/EC approved protocol. Written informed consent was obtained 408 from each donor, as approved by the UK East of England - Cambridgeshire and Hertfordshire Research 409 Ethics Committee.

410 In-house GSK microarray profiler. Human gene expression data, along with accompanying sample 411 descriptions, was purchased from GeneLogic (GeneLogic Division, Ocimum Biosolutions, Inc.) in 2006, 412 and later organized by sample type. Gene expression data for each sample had been determined using 413 mRNA amplification protocols as recommended by Affymetrix (Affymetrix, Inc.) and subsequently 414 hybridized to the Affymetrix U133_plus2 chip. Purchased data was subject to reported quality control 415 measures including ratios for ACTB and GAPDH, as well as maximal scale factors as reported by 416 Affymetrix MAS 5.0. Expression data was normalized using MAS5.0 with a target intensity of 150.

417 Immunohistochemistry. An anti-SP140 antibody produced in rabbit (HPA006162; Sigma Prestige 418 Antibody) was used to visualize SP140 protein in a Cambridge Bioscience 69571061 Normal Human 419 tissue microarray, a Cambridge Bioscience 4013301 Human Autoimmune Array, a Cambridge Bioscience 420 4013101 Human Colitis Array and on in-house rheumatoid arthritis synovial samples. Anti-SP140 421 antibody was detected with a Leica polymer secondary antibody. Sections were de-waxed using 422 proprietary ER1, low pH 6 and ER2, high pH 8 buffers for 20 minutes at 98°C. Sites of antibody binding

13 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

423 were visualized with peroxidase and DAB. Staining was performed on a Leica Bondman immunostaining 424 instrument, using protocol IHC F.

425 Tissue immunofluorescence of SP140 and CD68. Paraffin embedded tissue sections from inflamed and 426 non-inflamed colon were obtained from patients with CD during surgical procedures, were first 427 deparaffinized and washed with TBS. For antigen retrieval, slides were treated at 96°C for 10 minutes in 428 0.01 M sodium citrate buffer pH 6.0. The tissue sections were blocked and permeabilized with PBT (PBS, 429 0.1% Triton X-100 (Biorad), 1% w/v BSA (Sigma-Aldrich)) and incubated with primary antibodies for 2 430 hours (h) at room temperature (RT), to detect pan-macrophage marker CD68 (dilution 1:200) (M0876, 431 Dako) and SP140 (dilution 1:200) (ab171141; Abcam). After washing, slides were stained with the 432 following secondary antibodies: polyclonal goat anti-Mouse Alexa Fluor488 to detect CD68 protein 433 (Green color) (A-11029, Invitrogen) or polyclonal goat anti-Rabbit Alexa Fluor546 to detect SP140 434 protein (Orange color) (A-11035, Invitrogen). Slides were mounted in SlowfadeGold reagent containing 435 DAPI (4′,6-diamidino-2-phenylindole, Thermo-Fisher) and imaged using a Leica DM6000B microscope 436 equipped with LAS-X software (Leica Microsystems).

437 SP140 expression in ileal macrophages. Publicly available single-cell RNA-sequencing data from 11 438 involved (inflamed) and 11 paired uninvolved (uninflamed) ileal biopsies was downloaded from the 439 sequence read archive accession number SRP21627323. Raw reads were aligned against GRCh38 using 440 Cellranger (v3.1.0). The resulting unique molecular identifier (UMI) count matrices were then imported 441 into the R statistical environment (v3.6.3) whereupon the samples were analyzed in an integrative fashion 442 using Seurat (v3.1.5)24,25. Clustering analyses were performed using the top 2000 most variable genes and 443 the top 15 principal components yielding 22 clusters whereupon they were visualized through UMAP 444 (arXiv:1802.03426). We identified the monocytes and macrophages containing cluster through the 445 expression of CD68, FCER1A (CD64), CD163, MRC1 (CD206), CD14 and FCGR3A (CD16A), after 446 which SP140 was visualized in the cluster representing monocytes/macrophages. Comparative cell count 447 analyses were performed using the Wald test as implemented in DESeq2 (v1.24.0). Comparative 448 percentage non-zero cells were performed using t-tests.

449 Isolation, differentiation and polarization of primary human monocytes and THP-1 cells. Peripheral 450 blood mononuclear cells (PBMCs) were obtained from whole blood of healthy donors (from Sanquin 451 Institute Amsterdam or from GSK Stevenage Blood Donation Unit) by Ficoll density gradient 452 (Invitrogen). CD14+ monocytes were positively selected from PBMCs using CD14 Microbeads according 453 to the manufacturer’s instructions (Miltenyi Biotec). CD14+ cells were differentiated with 20 ng/mL of 454 macrophage colony-stimulating factor (M-CSF) (R&D systems) for 3 days followed by 3 days of

14 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

455 polarization into naïve macrophages (M0) (media only)26, classically activated (inflammatory) M1 456 macrophages (100 ng/mL IFN-γ; R&D systems)26 or alternative activated (regulatory) M2 macrophages 457 (40 ng/mL IL4;R&D systems)26. Human monocytic cell line (THP-1) cells were differentiated with 100 458 nM phorbol myristate acetate (Sigma-Aldrich) for 3 days and then the same polarization protocol was 459 performed. Cells were incubated in Isocove’s Modified Dulbecco’s Medium (Lonza) supplemented with 460 10 % fetal bovine serum (FBS) (Lonza), 2 mM l-glutamine (Lonza), 100 U/mL penicillin (Lonza) and 100 461 U/mL streptomycin (Lonza), at 37 °C, 5 % CO2.

462 siRNA-mediated SP140 knockdown. Human M1 macrophages were generated in vitro from human 463 primary CD14+ monocytes as described above. M1 macrophages were transfected with siGENOME 464 human smartpool SP140 siRNA or non-targeting scrambled siRNA for 48h with DharmaFECT™ 465 transfection reagents according to manufacturer’s protocol (Dharmacon). The cells were left unstimulated 466 or stimulated with 100 ng/mL LPS (E. coli 0111:B4; Sigma) for 4h (for qPCR) or 24h (for Elisa). The 467 supernatant was harvested for cytokine measurement and the cells were lysed (ISOLATE II RNA Lysis 468 Buffer RLY- Bioline) for RNA extraction.

469

470 RNA isolation and Reverse Transcriptase, Polymerase Chain Reaction (PCR) and Real-Time 471 Quantitative PCR. Total RNA was extracted from macrophages using RNeasy Mini Kit (Qiagen) 472 following the manufacturer’s instructions. The concentration of RNA was determined using 473 spectrophotometry (Nano-Drop ND-1000). Complementary DNA (cDNA) was synthesized with qScript 474 cDNA SuperMix (Quanta Biosciences). PCR amplification of SP140, SP100, SP110, SP140L, BRDT, 475 BRD2, BRD3, BRD4, BRD9, EP300, BAZ2A, BAZ2B, PCAF, CREBBP, TNF, IL6, IL10, IL8, CCL5, 476 CCL22, CD206 and CD64 was performed by Fast Start DNA Masterplus SYBR Green I kit on the Light 477 Cycler 480 (Roche, Applied Science). Relative RNA expression levels were normalized to the geometric 478 mean of two reference genes RPL37A and ACTB. Primer sequences are listed in Supplementary table 1.

479 Immunofluorescence cell staining. Primary human monocyte derived macrophages were polarized to 480 M1 phenotype on coverslips. The cells were fixed with 4% paraformaldehyde and then permeabilized in 481 0.2% Triton (Biorad)/PBS. Blocking buffer 2% BSA (Sigma-Aldrich) was added for 30 minutes. Rabbit 482 polyclonal anti-SP140 antibody (dilution 1:200) (ab171141; Abcam) or mouse polyclonal anti-SP100 483 antibody (dilution 1:200) (ab167605; Abcam) were added for 2h) followed by 2h of secondary antibody, 484 Polyclonal goat anti-Rabbit, Alexa Fluor546 (A-11035, Invitrogen) (dilution of 1:1000) or goat anti- 485 mouse, Alexa Fluor546 (A-21123, Life Technologies) (dilution of 1:500), respectively. DAPI (Thermo- 486 Fisher) was used for nuclear detection.

15 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

487 Microarray. RNA from human M1 macrophages transfected with SP140 siRNA or scrambled siRNA 488 was extracted using a RNeasy mini-kit (Qiagen). 150 ng total RNA was labelled using the cRNA labelling 489 kit for Illumina BeadArrays (Ambion) and hybridized with Ref8v3 BeadArrays (Illumina). Arrays were 490 scanned on a BeadArray 500GX scanner and data were normalized using quantile normalization with 491 background subtraction (GenomeStudio software; Illumina). Genes with negative values were removed 492 from analysis. Differentially expressed genes (DEGs) had a P-value <0.05 (analysis of variance). The data 493 was analyzed in R (v3.6.3) and R2 Genomics Analysis and Visualization Platform-UMC (r2.amc.nl). 494 Gene ontology overrepresentation analyses were performed in ShinyGO v0.6027.

495 Discovery and synthesis of the compounds (GSK761, GSK064, GSK675 and GSK306): The text 496 describing the discovery and the synthesis of different compounds is added as supplementary information.

497 Fluorescence Polarization (FP) Binding Affinity Studies. 6HisFLAGTEVSP140 (687-867) was serially 498 diluted in the presence of 3 nM GSK064 in Assay Buffer (50 mM HEPES, pH 7.5, 150mM NaCl, 0.05 % 499 Pluronic F-127, 1 mM DTT) in a total assay volume of 10 µl. Following incubation for 60 minutes, FP 500 was measured on a Perkin Elmer Envision multi-mode plate reader, by exciting the Alexa647 fluorophore 501 of GSK064 at a wavelength of 620 nm and then measuring emission at 688 nm in both parallel and 502 perpendicular planes. The FP measurement, expressed as milliP (mP), was then calculated using the 503 following equation;

(푝푎푟푎푙푙푒푙 푓푙푢표푟푒푠푐푒푛푐푒 푖푛푡푒푛푠푖푡푦) − (푝푒푟푝푒푛푑푖푐푢푙푎푟 푓푙푢표푟푒푠푐푒푛푐푒 푖푛푡푒푛푠푖푡푦) 504 푚푃 = (푝푎푟푎푙푙푒푙 푓푙푢표푟푒푠푐푒푛푐푒 푖푛푡푒푛푠푖푡푦) + (푝푒푟푝푒푛푑푖푐푢푙푎푟 푓푙푢표푟푒푠푐푒푛푐푒 푖푛푡푒푛푠푖푡푦)

505 These data were then used to calculate an apparent Kd value by application of the following Langmuir-Hill

506 equation; 퐴퐵 = (퐴0 + 퐵)/(퐾d + 퐵) where AB is the concentration of the bound complex, A0 is the total

507 amount of one binding molecule added, B is the free concentration of the second binding molecule and Kd

508 is the dissociation complex. IC50 determination, GSK761 was serially diluted in DMSO (1% final assay 509 concentration) and tested in the presence of 50 nM SP140 and 3 nM GSK064 in the same assay buffer and

510 volume as above. Reactions were incubated for 60 minutes, and IC50 values calculated using a four- 퐼 511 parameter logistic equation; 푦 = 푥/ [1 + ( ) nH + 퐷] where y is the mP signal in the presence of the 퐼퐶50

512 inhibitor at concentration I, x is the mP signal without the inhibitor, IC50 is the concentration of the 513 inhibitor that gives 50% inhibition, nH is the Hill coefficient and D is the assay background.

514 Pull-down of endogenous SP140 and Halo-transfected SP140 with GSK761. HuT78 cells were 515 stimulated with 100 ng/mL anti-CD3 antibody (in-house reagent) and 3 mg/mL anti-CD28 antibody (in-

16 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

516 house reagent) for 96h. Nuclear extract from HuT78 cells (endogenous SP140) was prepared following 517 manufacturer’s protocol (Active Motif nuclear extract and co-IP kit). Protein concentration was 518 determined by Bradford Assay (Pierce). Nuclear extracts were incubated with GSK675 (biotinylated 519 GSK761) prebound to streptavidin Dynabeads (Thermo Fisher). Dynabeads were incubated with 2X SDS 520 reducing buffer (Invitrogen) and the eluted proteins resolved on a 4-12% Bis-Tris SDS-PAGE (Invitrogen) 521 with MagicMark (Invitrogen) and SeeBlue 2 (Invitrogen) protein ladders. The gel was subjected to 522 Western blotting (Invitrogen) and incubated with Rabbit anti-SP140 antibody (HPA-006162; Sigma) 523 followed by anti-rabbit IgG HRP (A4914; Sigma). The Western blot was developed with Super Signal 524 West femto kit (Thermo Fisher) and imaged on Carestream chemilluminescence imager (Kodak). A 525 chimeric gene encoding Halo-SP140 was synthesized using overlapping oligonucleotides and cloned in an 526 expression vector pCDNA3.1. HEK293 cells were transfected with expression vector using Lipofectamine 527 and incubated for 24h. The transfected cells were labelled with HaloTag TMR ligand following 528 manufacturer’s protocol (HaloTag® TMR Ligand Promega). Nuclear extract from transfected HEK293 529 cells (Halo-SP140) was prepared and SP140 immunoprecipitation was carried out as described above. 530 Eluted proteins were resolved on a 4-12% Bis-Tris SDS-PAGE gel (Invitrogen) with SeeBlue 2 531 (Invitrogen) protein ladder. Halo-SP140 was visualised using Versa Doc scanner and 520LP UV 532 Transilluminator.

533 BROMOscan® Bromodomain Profiling. BROMOscan® bromodomain profiling was provided by

534 Eurofins DiscoverX Corp (Fremont, CA, USA, http://www.discoverx.com). Determination of the Kd 535 between test compounds and DNA tagged bromodomains was achieved through binding competition 536 against a proprietary reference immobilized ligand.

537 Cell penetration assessment assay of GSK761. Intracellular GSK761 compound concentration was 538 measured by RapidFire Mass Spectrometry utilizing the methodology described by28.

539 Cell viability and cytotoxicity assays. M1 macrophages were generated in vitro from human primary 540 CD14+ monocytes as described above. M1 macrophages were plated into opaque-walled 96-well plate at 541 10 × 105 cells per well and incubated with a concentration gradient of GSK761 (0.04–1.11 μM) for 1h 542 (0.1% DMSO was used as control). The cells were left unstimulated or stimulated with 100 ng/mL LPS 543 for 24h. Cell viability was assessed using CellTiter-Glo® Luminescent Cell Viability Assay kit (Promega) 544 according to the manufacturer's protocol. This assay quantifies ATP, an indicator of metabolically active 545 cells. An equal volume of freshly prepared CellTiter-Glo® reagent was added to each well the plate was 546 shaken for 10 minutes at RT and luminescent signals were recorded using a plate reader (SpectraMax

17 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

547 M5). The index of cellular viability was calculated as the fold change of luminescence with respect to 548 untreated control cells.

549 Cell viability was also determined using Muse® Count & Viability Kit (Luminex corp) according to the 550 manufacturer's protocol and measured on the Guava® Muse® Cell Analyzer (Luminex corp).

551 Histone H3 peptide displacement. Various H3 peptides (Anaspec library (https://www.anaspec.com/)) 552 were serially diluted in DMSO (1% final assay concentration) and tested in the presence of 10 nM 553 GSK306 (FAM-labelled version of GSK761) and 80 nM 6HisFLAGTEVSP140 (687-867) (2x apparent

554 Kd for this ligand) in 50 mM HEPES (Sigma-Aldrich), pH 7.5, 50 mM NaCl (Sigma-Aldrich), 1 mM 555 CHAPS (Sigma-Aldrich), 1 mM DTT (Sigma-Aldrich) in a total volume of 10 µl. Reactions were 556 incubated for 30 minutes and FP was measured on a Perkin Elmer Envision multi-mode plate reader 557 (PerkinElmer), by exciting the FAM fluorophore of GSK306 at a wavelength of 485 nm and then 558 measuring emission at 535 nm in both parallel and perpendicular planes. The FP measurement, expressed 559 as milliP (mP), was then calculated as described previously, and normalised to free- and bound- ligand

560 controls to determine % response. IC50 values were calculated using the described four-parameter logistic 561 equation.

562 SP140/Histone Nano-bioluminescence resonance energy transfer™ (NanoBRET™) Assay Testing. 563 The NanoBRET™ System is a proximity-based assay that can detect protein interactions by measuring 564 energy transfer from a bioluminescent protein donor NanoLuc® (NL) to a fluorescent protein acceptor 565 HaloTag® (HT). Briefly, SP140-NL and Hitone3.3-HT DNA was transfected into HEK293 cells using the 566 following ratios: 1:1, 1:10 and 1:100, respectively. Signal window was determined by relative NL/HT- 567 fused protein expression. Minus HT controls were used as the baseline in this experiment for each 568 condition. The data is presented as NanoBRET response (mBU) which is dependent on the presence of the 569 HT Ligand, and by microscopic imaging of NL/HT-fused protein signal. For more information about 570 NanoBRET assay protocol: (https://www.promega.com/-/media/files/resources/protocols/technical- 571 manuals/101/nanobret-proteinprotein-interaction-system-protocol.pdf?la=en)

572 Flow cytometry (FACS). Primary human CD14+ monocytes were positively selected from PBMCs using 573 CD14 Microbeads according to the manufacturer’s instructions (Miltenyi Biotec) and differentiated for 3 574 days with 20 ng/mL M-CSF. The monocytes were then washed with PBS and treated with 0.1% DMSO or 575 0.04 µM GSK761. After 1h, 100 ng/mL IFN-γ (R&D systems) was added to the cells to generate M1 576 macrophages. After 3 days of incubation, the cells were harvested and subsequently permeabilized using 577 1% Saponin (Sigma-Aldrich) for 10 minutes on ice. Cells were then stained using the conjugated

18 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

578 antibodies; PerCP-Cy5 mouse anti-human CD64 (dilution 1:50) (305023, Biolegend) and PE mouse anti- 579 human CD206 (dilution 1:20) (2205525, Sony Biotechnology). The analysis was performed by flow 580 cytometry (FACS) using the LSRFortessa and FACSCalibur (both BD Biosciences). FlowJo (BD 581 LSRFortessa™ cell analyzer) was used for data analysis.

582 Cytokine analysis. Cytokine expression in the supernatant of DMSO- or GSK761-treated M1 583 macrophages (LPS-stimulated or unstimulated) were measured using electro-chemiluminescence assays 584 (Meso Scale Discovery [MSD])-Human ProInflammatory 7-Plex Tissue Culture Kit (IFN-γ, IL-1β, IL-6, 585 IL-8, IL-10, IL-12p70, TNF) according to manufacturer's protocols and analyzed on an MSD 1250 Sector 586 Imager 2400 (Mesoscale). Supernatant IL-6, IL-8 and TNF from SP140 siRNA or scrambled-treated M1 587 macrophages (LPS-stimulated or unstimulated) were determined by Sandwich Enzyme-Linked 588 Immunosorbent Assay (ELISA; R&D systems) according to manufacturer’s protocol.

589 Customized qPCR-Array. Human M1 macrophages were generated in vitro from human primary CD14+ 590 monocytes as described above (from 5 independent donors). M1 macrophages were treated either with 591 0.1% DMSO or 0.04 µM GSK761 for 1h. The cells where then washed with PBS and stimulated with LPS 592 for 4h. Total RNA was isolated using RNeasy Mini Kit (Qiagen) and treated with DNaseI (Qiagen) 593 according to the manufacturer’s instructions. RNA was reverse transcribed using the First Strand 594 Synthesis Kit (Qiagen) and loaded onto a customized RT2 profiler array for selected 89 genes according to 595 the manufacturer’s instructions (Qiagen) and run on QuantStudio 7 Flex (software v1.0). Qiagen’s online 596 GeneGlobe Data Analysis Center (https://geneglobe.qiagen.com/us/analyze/) was used to determine the 597 DEGs. The data was presented as scatter plot. All data were normalized to the geometric mean of two 598 reference genes (RPL37A and ACTB). The list of genes included in this experiment were selected from 599 DEG in SP140 silenced M1 macrophages in 10 and from our MSD, qPCR and microarray datasets of 600 SP140 silenced M1 macrophages.

601 Genome wide expression profiling (RNA-sequencing (RNA-seq)). M1 macrophages were generated in 602 vitro from human primary CD14+ monocytes as described above (from 3 independent donors). M1 603 macrophages were incubated for 1h with either 0.1% DMSO or 0.04 µM GSK761. M1 macrophages were 604 then kept unstimulated or stimulated with 100 ng/mL LPS (E. coli 0111:B4; Sigma) for 4h or 8h. Total 605 RNA was isolated from macrophages using the RNAeasy mini kit (Qiagen) and transcribed into cDNA by 606 qScript cDNA SuperMix (Quanta Biosciences) according to manufacturer’s instructions. Sequencing of 607 the cDNA was performed on the Illumina HiSeq4000 to a depth of 35M reads at the Amsterdam UMC 608 Core Facility Genomics. Quality control of the reads was performed with FastQC (v0.11.8) and 609 summarization through MultiQC (v1.0). Raw reads were aligned to the (GRCh38) using

19 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

610 STAR (v2.7.0) and annotated using the Ensembl v95 annotation. Post-alignment processing was 611 performed through SAMtools (v1.9), after which reads were counted using the featureCounts application 612 in the Subread package (v1.6.3). Differential expression (DE) analysis was performed using DESeq2 613 (v1.24.0) in the R statistical environment (v3.6.3), in R2 Genomics Analysis and Visualization Platform- 614 UMC (r2.amc.nl) and ShinyGO v0.6027.

615 Chromatin immunoprecipitation (ChIP). M1 macrophages were generated in vitro from human primary 616 CD14+ monocytes as described above (107 cells were used for each condition). M1 macrophages were 617 either incubated for 1h with 0.1% DMSO or with 0.04 µM GSK761 and left unstimulated or simulated 618 with 100 ng/mL LPS for 1h or 4h. The cells were cross-linked with 1% formaldehyde for 10 minutes at 619 RT and quenched with 2.5 M glycine (Diagenode) for 5 minutes at RT. The ChIP assay was performed 620 using the iDeal ChIP kit for Transcription Factors (Diagenode) and sonication was performed using the 621 PicoruptorTM (Diagenode) according to the manufacturer’s protocols. Chromatin shearing was verified by 622 migration on a 1% agarose gel (E-Gel, Thermo-Fisher) and visualised using E-Gel imager (Thermo- 623 Fisher). Immunoprecipitation was performed with a polyclonal SP140 antibody (H00011262-M07, 624 Abnova). DNA was purified using IPure kit (Diagenode) according to the manufacturer’s protocol. For 625 validation, quantitative real-time ChIP-qPCR was performed on DNA isolated from input (unprecipitated) 626 chromatin and SP140 ChIP DNA with primer pairs specific for the TSS of TNF and IL6 genes. For 627 detailed PCR primer sequences, please see supplemental table 1. Quantitative PCR was performed using 628 SYBR Green (Applied Biosystems) and StepOnePLus (Applied Biosystems). Results were quantitated 629 using the delta–delta CT (ΔΔCT) method. Libraries of input DNA and ChIP DNA were prepared from 630 gel-purified >300– DNA.

631 Global profiling of chromatin binding sites. The DNA was used to generate sequencing libraries 632 according to the manufacturer's procedure (Life Technologies). The DNA was end polished and dA tailed, 633 and adaptors with barcodes were ligated. The fragments were amplified (eight cycles) and quantified with 634 a Bioanalyzer (Agilent). Libraries were sequenced using the HiSeq PE cluster kit v4 (Illumina) with the 635 HiSeq 2500 platform (Illumina), resulting in 125-bp reads. FastQ files were adapter clipped and quality 636 trimmed using BBDukF (java -cp BBTools.jar jgi.BBDukF -Xmx1g in=fastqbef.fastq.gz out=fastq.fastq 637 minlen=25 qtrim=rl trimq=10 ktrim=r k=25 mink=11 hdist=1 ref=adapters.fa). Subsequently the reads 638 were mapped to NCBI37/HG19 using Bowtie (bowtie2 -p 8 -x hg19 fastq.fastq > sample.sam) and sorted 639 and converted into bam files using samtools. Peaks were called using MACS2, with the reads being 640 extended to 200bp (callpeak --tempdir /data/tmp -g hs -B -t sample.bam -c control.bam -f BAM -n result/ - 641 -nomodel --extsize 149) and duplicates removed. BDG files were binned in 25bp regions and loaded in the 642 R2 platform (r2.amc.nl) for subsequent analyses and visualization. For motif analyses, Summits obtained

20 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

643 from the MACS2 analyses were extended by 250 bases and subsequently analyzed on a repeat masked 644 hg19 genome with HOMER to identify over represented sequences (both known as well as de novo). The 645 data was uploaded and analyzed in R2 Genomics Analysis and Visualization Platform-UMC (r2.amc.nl).

646 SP140 inhibition in isolated human mucosal macrophages. Colon tissues were obtained during surgery 647 procedures from patients with CD. The mucosa was stripped and dissociated in GentleMACS tubes in 648 digestion medium (DM; complete medium (RPMI 1640 with PGA/L-glutamine/10% FCS) with 1 mg/mL 649 Collagenase D (Roche), 1 mg/mL soybean Trypsin inhibitor (Sigma), 50 µg/mL DNase I (Roche)), and 650 then mechanically dissociate on GentleMACS using program B01. The mucosa was then incubated in DM 651 for 1h at 37°C while shaking. During digestion the dissociation was repeated two times. The tissue 652 suspension was passed through a cell strainer (200-300 um) and centrifuged at 1500 rpm for 10 minutes. 653 The dissociated cells were resuspended in cold MACS Buffer and macrophages were isolated using CD14 654 MicroBeads according to the manufacturer’s instructions (MiltenyiBiotec). The macrophages were 655 incubated with either 0.1% DMSO or 0.04 µM GSK761 for 4h. RNA was extracted as described above 656 and gene expression of IL6, TNF, IL10 and CD64 was measured by qPCR. All data were normalized to 657 the reference gene ACTB. 658 659 Statistics analysis. Statistical analysis was performed with GraphPad Prism v8.0.2.263 (GraphPad 660 Software Inc.). For group analysis, data were subjected to one-way ANOVA or Student’s t-test. The two- 661 tailed level of significance was set at p ≤ 0.05 (*), 0.01 (**), 0.001 (***) or 0.0001 (****) for group 662 differences. Data is shown as mean ± SEM. The figures were prepared using Inkscape 0.92.4.

663 Data Availability. Processed microarray data is publicly available and can be explored at r2.amc.nl under 664 the data set name ‘Exp SP140 siRNA in M1 macrophages - deJonge - 12 - custom - ilmnht12v4’. The 665 summary statistics for microarray data can be found in supplemental tables 3 and 4. Raw data of RNA-seq 666 and ChIP-seq are publicly available and have been deposited in European Genome-phenome Archive 667 (https://ega-archive.org/studies/EGAS00001004460). The counts and the summary statistics for RNA-seq 668 data can be found in supplemental table 5, 6 and 7. The normalized read count for ChIP-seq data can be 669 found in supplemental tables 8, 9, 10, 11, 12 and 13.

670 Figure legends

671 Figure 1: SP140 is highly associated with inflammatory diseases and mucosal CD68+ macrophages of 672 CD patients. 673 (a) SP140 gene expression in human tissues and cell types. (b) SP140 gene expression in white blood cells 674 of normal healthy controls (N), Crohn’s disease (CD), ulcerative colitis (CD), systemic lupus

21 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

675 erythematosus (SLE), chronic lymphocytic leukemia (CCL), acute myeloid leukemia (AML) and 676 rheumatoid arthritis (RA) patients. (c) SP140 gene expression in human colon tissue obtained from N, 677 inflamed and non-inflamed CD and UC colonic tissues. Data was collected from in-house GSK 678 microarray profiler. (d) Immunohistochemistry of SP140 protein in colon tissue obtained from N and 679 inflamed CD tissue, scale bar: 50 µm. (e) Immunofluorescence staining of DAPI (blue), CD68 (green) and 680 SP140 (red) in N or inflamed CD colon tissue, scale bar: 100 µm. (f) Mucosal cell count per visual field of 681 total CD68+ macrophages, SP140+ CD68+ macrophages and SP140- CD68+ macrophages in N and inflamed 682 CD tissue (n=3). Statistical significance is indicated as follows: *P < 0.05, **P < 0.01, ****P < 0.0001. 683 684 Figure 2. The percentage of SP140 expressing macrophages in cluster 6 is higher in inflamed tissue 685 compared with uninflamed CD-ileal biopsies. 686 Publicly available single-cell RNA-sequencing was used to investigate SP140 expression in ileal 687 macrophages in inflamed (n=11) and uninflamed (n=11) biopsies of CD patients. (a) Clustering analyses 688 of all cells were performed using the top 2000 most variable genes and the top 15 principal components 689 yielding 22. (b) Visual illustration of the actual counts of cells expressing different 690 monocytes/macrophages markers. Darker blue represents more reads per cell. (d) The percentage 691 (represented by the hue) of cells that are non-zero for the indicated macrophage marker genes (c). UMAP 692 of cluster 6 only with blue dots representing SP140 expression per cell. (e) Comparative analyses of the 693 percent non-zero cells when comparing inflamed with uninflamed, where the percentage was calculated 694 relative to cluster 6 only. 695 696 Figure 3: SP140 knock-down reduces the activity of the inflammatory macrophages. 697 (a) Polarization protocol of human CD14+ monocytes to M0, M1 and M2 macrophage phenotypes. 698 Freshly isolated CD14+ monocytes were differentiated for 3 days using 20 ng/mL M-CSF. The cells were 699 then washed with PBS and polarized for 3 days to M0, M1 and M2 macrophages using media only, 100 700 ng/mL IFN-γ or 40ng/mL IL-4, respectively. (b) Relative gene expression (qPCR) of SP140 in M0, M1 701 and M2 macrophages, n=6. (c) Immunofluorescence staining of SP140 speckles in M1 and M2 702 macrophages imaged by microscopy (left) or nuclear bodies count (right). (d) SP140 silencing protocol; 703 M1 macrophages were incubated with either scrambled control siRNA or SP140 siRNA for 48h. The cells 704 were then washed with PBS and stimulated with 100 ng/mL LPS for 4h (for qPCR) and 24h (for Elisa) or 705 kept without LPS-stimulation. (e) The efficiency of SP140 silencing was assessed by measuring relative 706 gene expression (qPCR) of SP140 (n=6) and (f) immunofluorescence staining of SP140 speckles nuclear 707 bodies (left) and nuclear bodies count (right). (g) Relative gene expression (qPCR) (top) n=6, and protein 708 levels (ELISA) (bottom) n=3, of TNF, IL-6 and IL-8. (h) PCA of microarray dataset; PC1 represents most

22 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

709 variance associated with the data (LPS-stimulation) and PC2 represents second most variance (siRNA), 710 n=3. (i) Heatmap of top 50 DEGs at 0h LPS or 4h of 100 ng/mL LPS (non-annotated genes were not 711 included). (j) Hallmark pathways analysis at 0h (top) and 4h of 100 ng/mL LPS (bottom). (k) Reactome 712 pathway analysis carried out using ShinyGO v0.60 in unstimulated M1 macrophages (bottom) or 4h of 713 100 ng/mL LPS-stimulated M1 macrophages (bottom). In all assays, statistical significance is indicated as 714 follows: *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001. 715 716 Figure 4: The Development of the first small molecule inhibitor of SP140 (GSK761) and 717 investigating its affinity and selectivity. 718 (a) Three-cycle benzimidazole library (left) and cube view of the SP140 selection output (right). Each 719 individual dot in the cube represents a DEL68 molecule, while the size of the dot corresponds to the total 720 copy number (2-24) identified for that library molecule. The dots are colored by the BB3s that compose 721 the library molecules. Library members with a single copy were removed to simplify visualization 722 revealing a prominent line defined by a specific BB1&BB3 combination. (d) Biochemical characterization 723 of the interaction between (b) GSK761 and recombinant SP140 in a FP binding assay using (c) GSK064,

724 generated by fluorescent labelling of GSK761. The mean binding affinity for this interaction was a Kd = 725 41.07 ± 1.42 nM (n = 5). (e) A FP binding assay was configured using recombinant SP140 and GSK064, 726 which was used to determine the potency of GSK761.Displacement of GSK064 from SP140 by GSK761

727 (circles) was achieved and determined to have a mean IC50 of 77.79 ± 8.27 nM (n = 3). No effect on 728 GSK064 motion was observed in the presence of varying concentrations of GSK761 (triangles). The data 729 presented in (d) and (e) are representative data form a single experimental replicate, affinities and 730 potencies are mean values determined from multiple test occasions. (f) An endogenous SP140 (HuT78 731 nuclear extracts) and Halo-transfected SP140 were pulled-down using biotinylated GSK761 and 732 visualized by Western blotting and gel electrophoresis.

733 734 Figure 5: GSK761 biases differentiation towards an CD206 expressing macrophages and reduces 735 pro-inflammatory cytokine expression in CD14+ macrophages isolated from CD anti-TNF 736 refractory patients’ colonic-mucosa. 737 Human primary CD14+ monocytes were differentiated with 20 ng/mL M-CSF for 3 days. The cells where 738 then washed with PBS and treated with 0.1% DMSO or 0.04 µM GSK761 for 1h prior to 3 days 739 polarization to M1 and M2 macrophages phenotypes with 100 ng/mL INF-γ or 40 ng/mL IL-4, 740 respectively (GSK761 and DMSO were not washed and kept in the culture during the 3 days of 741 polarization). (a) Gene expression of the inflammatory marker CD64 (M1 marker) and anti-inflammatory 742 marker CD206 (M2 marker) was measured by qPCR in IFN-γ(M1) and IL-4(M2) polarized macrophages

23 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

743 and (b,c) FACS analysis was performed in IFN-γ(M1) polarized macrophages to assess: (b) the frequency 744 of CD64+ macrophages (left) and CD64 protein expression intensity (right) and (c) the frequency of 745 CD206+ macrophages (left) and CD206 protein expression intensity, n=3. For FACS data, the count was 746 normalized to mode. (d) M1 polarized macrophages were pretreated for 1h with 1% DMSO or with 0.04 747 µM GSK761. The cells were then stimulated for 4h with 100 ng/mL LPS and Customized RT2 Profiler 748 PCR Arrays was performed, the scatter plot illustrates the differentially expressed genes (2-fold change), 749 n=4 to 5. (e) M1 polarized macrophages were pretreated for 1h with 1% DMSO or with an increasing 750 concentration of GSK761 (0.01, 0.04, 0.12, 0.37. 1.11 µM). The cells were then stimulated for 24h with 751 100 ng/mL LPS. Protein levels of TNF, IL-6, IL-1β, IL-10, IL-8 and IL-12p70 were measured in the 752 supernatant using ProInflammatory 7-Plex MSD kit, n=4 to 7. The Y axis indicates the fold change in 753 cytokine protein expression relative to DMSO control. (f) CD14+ cells were isolated from inflamed CD 754 mucosa. The cells were then incubated ex vivo for 4h with either 0.1% DMSO or 0.04 µM GSK761. 755 Relative gene expression of TNF, IL6, IL10 and CD64 were measured using qPCR, n=2. The Y axis 756 indicates the fold change in mRNA level relative to DMSO control. Statistical significance is indicated as 757 follows: *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001. 758 759 Figure 6: LPS-stimulation enhances SP140 protein recruitment to chromatin; GSK761 reduces this 760 recruitment and dampens inflammatory pathways in inflammatory macrophages. 761 (a) Nuclear extracts from αCD3/αCD28 stimulated HuT78 cells were incubated with unmodified or 762 modified (acetylated and methylated) histone H3 peptides. SP140 was then pulled–down and its 763 interaction with H3 peptides was visualized. (b) ChIP-qPCR of SP140 occupancy at TSS of TNF and IL6 764 in M1 macrophages stimulated with 100 ng/mL LPS for 4h or without LPS-stimulation, n=3 donors (DN). 765 (c) Epigenome roadmap scan illustrating proportions of SP140 genome-wide occupancy after SP140 766 ChIP-seq experiment. (d) Heatmap (1000 top SP140-bound genes) of SP140 ChIP-seq reads ranked on 1h 767 LPS-stimulated (left) or on 4h LPS-stimulated (right) M1 macrophages rank-ordered from high to low 768 occupancy centered on TSS. Top 20 genes with high SP140 occupancy were listed. (e) ChIP-qPCR of 769 SP140 occupancy at TSS of TNF in M1 macrophages pretreated with 0.1% DMSO or 0.04 µM GSK761 770 for 1h and then stimulated with 100 ng/mL LPS for 1 or 4h or kept unstimulated (0h LPS). (f) Metagene 771 created from normalized genome-wide average reads for SP140 centered on TSS. (g) PCA of RNA-seq 772 comparing 0.1% DMSO to 0.04 µM GSK761 treated M1 macrophages after 4h of LPS-stimulation (left) 773 or after 8h of LPS-stimulation (right). (h) SP140 ChIP-seq gene ontology analysis of the most enriched 774 molecular function and biological process after 1h of LPS-stimulation, comparing 0.1% DMSO with 0.04 775 µM GSK761 treated M1 macrophages. (i,j) Hallmarks pathway enrichment analysis at (i) 0, 1 and 4h of 776 100 ng/mL LPS-stimulation for ChIP-seq and at (j) 0, 4 and 8h of 100 ng/mL LPS-stimulation for RNA-

24 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

777 seq. (j) The direction and color of the arrow indicates the direction and size of the enrichment score, the

778 size of the arrow is proportional to the –log10(p-value), and non-transparent arrows represent significantly 779 affected pathways. 780 781 Figure 7: SP140 preferentially controls the expression of specific gene sets involved in the innate 782 immune response. 783 (a) Heatmap of the top 100 DEGs and (b) volcano plots of the all genes comparing 0.1% DMSO- with 784 0.04 µM GSK761-treated M1 macrophages after 4h of 100 ng/mL LPS-stimulation. (c) R2 TSS-plot 785 comparing a global DBGs after 1h of 100 ng/mL LPS-stimulation of 0.1% DMSO- and 0.04 µM GSK761- 786 treated M1 macrophages. (d) SP140 ChIP-seq genome browser view of some of the most affected DBGs; 787 TNF, TRAF1, IRF1 and TRAFAIP2. Y axis represents signal score of recovered sequences in 0.1% DMSO 788 and 0.04 µM GSK761 treated macrophages after 0, 1 and 4h of 100 ng/mL LPS-stimulation. (e) RNA- 789 sequencing derived gene expression of some of the most DBGs. (f) Comparative analyses of the top 1000 790 DBGs (signal) with their gene expression (Wald statistic). Gene expression at 4h and ChIP at 1h (left) and 791 gene expression at 4h and ChIP at 4h (right), Y axis represents the SP140 differential binding signal (BD). 792 (g) Homer Known Motif Enrichment using TF motifs and their respective p-value scoring (top) and 793 Homer de novo Motif results with best match TFs (bottom) in 0.1% DMSO treated M1 macrophages after 794 1h of 100 ng/mL LPS-stimulation. (h) An enrichment analysis targeted chemokine activity for SP140 795 ChIP-seq (top) and RNA-seq (bottom) comparing 0.1% DMSO- with 0.04 µM GSK761 treated-M1 796 macrophages at 0, 1 and 4h of 100 ng/mL LPS-stimulation.

797 Supplementary figure legends

798 Supplementary figure 1: SP140 expression is associated with inflammatory diseases. 799 Immunohistochemistry of SP140 in ulcerative colitis (colon), appendicitis (appendix), sarcoidosis (lung), 800 psoriatic arthritis (synovium), rheumatoid arthritis (synovium), Hashimoto’s thyroiditis (thyroid) and 801 Sjogren’s syndrome (cervical cyst). SP140 is illustrated by peroxide staining. 802 803 Supplementary figure 2: CD-ileal inflamed tissue has consistently more cells than uninflamed tissue 804 in cluster 6. 805 Publicly available single-cell RNA-sequencing was used to investigate SP140 expression in ileal 806 macrophages in inflamed (n=11) and uninflamed (n=11) biopsies of CD patients. Comparative analyses of 807 the cell counts per cluster when comparing inflamed with uninflamed. Graphs are ranked by p-values, 808 which were calculated through the Wald test as implemented in DESeq2. 809

25 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

810 Supplementary figure 3: Inflammatory stimulus induces SP140 expression. 811 (a,b) Relative gene expression of CD64, CCL5, CD206 and CCL22 in M0, M1 and M2 macrophages 812 derived from (a) human primary CD14+ monocytes or (b) from THP-1 cells, n=4. (c) Relative gene 813 expression of SP140 in M0, M1 and M2 macrophages derived from THP-1 cells, n=5. (d) 814 Immunofluorescence staining of SP140 in M1 and M2 macrophages derived from THP-1 cells. DAPI 815 staining nucleus in blue and SP140 speckles in red. (e) Relative gene expression of SP140, CCL5 and 816 CD206 after 24h of 100 ng/mL LPS-stimulated or unstimulated M0, M1 and M2 macrophages derived 817 from human primary CD14+ monocytes, n=4. Relative gene expression was measured by qPCR. Statistical 818 significance is indicated as follows: *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001.

819 Supplementary figure 4: The association between SP140 and inflammatory macrophages is not a 820 common phenomenon amongst BCPs. 821 (a) Relative gene expression (qPCR) of 12 BCPs; SP100, SP110, SP140L, BRD2, BRD3, BRD4, BRD9, 822 PCAF, EP300, CRBBP, BAZ2A and BAZ2B in M0, M1 and M2 macrophages derived from human 823 primary CD14+ monocytes, n= 4 to 8. (b) Relative gene expression of SP110 and SP100 in M1 824 macrophages, transfected with siRNA against SP140 (grey bars) or a scrambled control siRNA (black 825 bars), and then stimulated or not with LPS (100 ng/mL for 4h). (c) Immunofluorescence staining of SP140 826 in M1 macrophages, transfected with siRNA against SP140 or a scrambled control siRNA, and then 827 stimulated with LPS (100 ng/mL for 24h). SP140 speckles (red dots) were imaged and SP140 nuclear 828 bodies were counted. DAPI (blue) was used to stain the nucleus. 829 830 Supplementary figure 5: GSK761 lowers the expression of the M1 macrophage polarization marker 831 TNF. 832 (a) ATP luminescence measurement within M1 macrophages treated with an increasing concentration of 833 GSK761 (0.01, 0.04, 0.12, 0.37. 1.11 µM) or 1% DMSO in presence or absence of 100 ng/mL of LPS for 834 24h, n=6, statistical significance is indicated as follows: *P < 0.05, **P < 0.01. The Y axis indicates the 835 fold change in detected ATP luminescence (relative to unstimulated DMSO control ). (b) M1 836 macrophages were pretreated with 0.1% DMSO or with GSK761 (0.04 µM, 0.12 µM or 0.37 µM), and 837 then stimulated with 100 ng/mL for 24h. The cells were stained with Muse® Count & Viability Reagent, 838 and then analyzed on the Muse® Cell Analyzer. The data shown as viability vs nucleated cells and 839 presented as % viability. (c) Primary human CD14+ monocytes were differentiated with 20 ng/mL M-CSF 840 for 3 days. The cells were then washed with PBS and treated with 0.1% DMSO or 0.04 µM GSK761 for 841 1h prior to 3 days polarization to M1 phenotype with 100 ng/mL INF-γ (DMSO and GSK761 were not

26 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

842 washed and kept in the culture during the 3 days of polarization). TNF gene expression was measured by 843 qPCR, n=3. 844 845 Supplementary figure 6: SP140 protein binds to histone H3. 846 (a) IC50 (0.36 +/- 0.03 µM) for the displacement of GSK761 from SP140 by H3 peptide (1-21, 847 ARTKQTARKSTGGKAPRKQLA). (b) IC50 for the displacement of GSK761 by H3 peptide (circles) 848 compared to IC50s for H3 peptide tri-methylated at position K4 (1-21, H3K4(3Me)) (Squares) and H3 849 peptide tri-methylated at position K14 (1-21, H3K14(3Me)) (10 +/- 1 and 0.28 + 0.03 µM, respectively). 850 (c) IC50s for the displacement of a range of modified H3 and truncated peptides. Open circles; 851 Biotinylated H3 peptide (1-21, IC50 = 0.51 µM), open squares; Biotinylated H3K9(Ac) (1-21, IC50 = 852 2.22 µM), open triangles up; Biotinylated H3K4(3Me)K9(Ac) (1-21, IC50 = 11.32 µM), open triangles 853 down; Biotinylated H3K4(3Me) (1-21, IC50 = 17.66 µM), open diamonds; biotinylated H3K14(Ac) (1-21, 854 IC50= 0.43 µM), open octagon; H3K4(3Me) (1-10, IC50 = 12.47 µM), plus symbol; H3K4(3Me) (1-21, 855 IC50 = 6.10 µM), star symbol; biotinylated H3(+YCK ) (1-18 +YCK, IC50 = 0.27 µM), solid circle; 856 H3K4(Ac)K14(3Me) (1-21, IC50 = 13.10 µM), solid square; H3K4(3Me)K14(Ac) (1-21 IC50 = 9.99 µM 857 and solid triangle up; biotinylated H3 (1-9 IC50 = 2.83 µM). (b) Peptides tested and data not shown (due 858 to no fit); Biotinylated H3K14(Ac) (9-20), Biotinylated H3 (15-40) and Biotinylated H3 (5-24). (d) 859 SP140-NL and Histone3.3-HT DNA was transfected into HEK293 cells. NL/HT-fused protein green 860 signal was imaged by microscopy (left) and NanoBRET response (mBU) was measured (right). (e) SP140 861 enrichment at H3K27ac after 0h, 1h and 4h of 100 ng/mL LPS-stimulation of M1 macrophages. (f) 862 Roadmap comparing the DMSO versus GSK761 on the peaks that were called in both instances (In 1h 863 LPS-stimulated M1 macrophages). If the peaks in both conditions are overlapping, then we call them 864 unaffected (a peak is found under both conditions) (right). If there was a peak in DMSO, but in the 865 presence of GSK761, SP140 binding is reduced or absent, then this is described as ‘SP140 reduced’ 866 (middle). Finally, if there was no peak called in DMSO samples, but a peak appears in the GSK761 867 condition, then SP140 binding is described as 'SP140 appears' (left). For the circles: All has been 868 compared to the epigenome roadmap data for 1 particular tissue (primary monocytes from peripheral 869 blood). This provides an annotation for every 200bp region in the genome and states its association in 15 870 different categories. The narrow outer circle reflects the proportion of the annotated genome in that 871 particular tissue (primary monocytes from peripheral blood). The thicker inner circle reflects the 872 proportions of the annotated genome where SP140 binding is seen (basepair with the highest signal in a 873 peak) for the peaks as termed above. The numbers inside the circles reflect the number of SP140 enriched 874 genes. (g) Reactome pathway analysis of SP140 DBGs in unstimulated, 1h LPS- and 4h LPS-stimulated 875 M1 macrophages comparing 0.1% DMSO and 0.04 µM GSK761.

27 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

876 877 Supplementary figure 7: GSK761 prevents SP140 enrichment at gene set involved in TNF-signaling 878 and antigen presentation. 879 (a) A complete data set of hallmark pathway analysis for global DEGs in LPS-unstimulated or stimulated 880 (4 or 8h) M1 macrophages (pretreated with 0.1% DMSO or 0.04 µM GSK761). The direction and color of 881 the arrow indicates the direction and size of the enrichment score, the size of the arrow is proportional to

882 the –log10(p-value), and non-transparent arrows represents significantly affected pathways,n=3. (b) An 883 enrichment pathway analysis targeting TNF-signaling for SP140 ChIP-seq (top) and RNA-seq (bottom) 884 comparing 0.1% DMSO- with 0.04 µM GSK761 treated-M1 macrophages at 0, 1 and 4h of 100 ng/mL 885 LPS-stimulation. (c) R2 TSS-plot targeting TNF-signaling pathway genes, comparing 0.1% DMSO with 886 0.04 µM GSK761 treated M1 macrophages after 1h of 100 ng/mL LPS-stimulation. The indicated genes 887 in the graph present the most DBGs. (d) Heatmap of most DEGs that are involved in TNF-signaling, 888 comparing 0.1% DMSO with 0.04 µM GSK761 treated M1 macrophages after 4h of 100 ng/mL LPS- 889 stimulation.(e) SP140 ChIP-seq genome browser view of HLA-DMB, HLA-DMA, HLA-DQA, HLA-DPA1, 890 HLA-DPB1, HLA-DPB2, HLA-C, HLA-B, HLA-F, HLA-A and BRD2. Y axis represents signal score of 891 recovered sequences in 0.1% DMSO and 0.04 µM GSK761 treated macrophages after 0, 1 and 4h of 100 892 ng/mL LPS-stimulation. 893 894 Supplementary figure 8: Genes that are involved in innate immune response showed the strongest 895 concordant differential SP140 binding and gene expression. 896 Comparative analyses of the top 1000 DBGs (signal) with their gene expression (Wald statistic). Gene 897 expression at 8h LPS and ChIP at 1h LPS (top) and gene expression at 8h LPS and ChIP at 4h LPS 898 (bottom). Y axis represents the SP140 differential binding signal (BD). 899 900 Supplementary figure 9: GSK761 reduces SP140 enrichment at several chemokine genes. 901 (a) R2 TSS-plot of chemokine-signaling genes, comparing 0.1% DMSO with 0.04 µM GSK761-treated 902 M1 macrophages after 1h of 100 ng/mL LPS-stimulation. The indicated genes in the graph represent the 903 most DBGs involved in chemokine activity/response. (b, c) SP140 ChIP-seq genome browser view of 904 some of most SP140 differentially bound chemokines at different time points of LPS-stimulation in M1 905 macrophages.SP140-bound chemokines were highlighted. (d) Volcano plot of DEGs involved in 906 chemokine signaling/genes after 1h of 100 ng/mL LPS-stimulation.

907 Supplementary table legends

908 Supplemental table 1. Primer sequences used in the quantitative PCR analysis of the genes of interest.

28 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

909 Supplemental table 2. GSK761 screen against human BCPs (Bromoscan assay) reveals no binding 910 at Kd of ≤ 30000 nM (for most of tested BCPs) and at Kd of ≤ 21000 nM for PBRM1(5), indicating a high 911 degree of specificity of GSK761 for SP140

912 Supplemental table 3: Summary statistics for microarray data (SP140 siRNA M1 macrophages vs 913 scrambled siRNA-M1 macrophages)

914 Supplemental table 4: Summary statistics for microarray data (SP140 siRNA-M1 macrophages 915 stimulated with 4h LPS (100 ng/mL) vs scrambled siRNA-M1 with 4h LPS (100 ng/mL))

916 Supplemental table 5: Counts and summary statistics for RNA-seq data (GSK761-pretreated M1 917 macrophages vs DMSO-pretreated M1 macrophages)

918 Supplemental table 6: Counts and summary statistics for RNA-seq data (GSK761-pretreated M1 919 macrophages stimulated with 4h LPS (100 ng/mL) vs DMSO-pretreated M1 macrophages stimulated with 920 4h LPS (100 ng/mL))

921 Supplemental table 7: Counts and summary statistics for RNA-seq data (GSK761-pretreated M1 922 macrophages stimulated with 8h LPS (100 ng/mL) vs DMSO-pretreated M1 macrophages stimulated with 923 8h LPS (100 ng/mL))

924 Supplemental table 8: The normalized read count for ChIP-seq (DMSO-pretreated M1 macrophages)

925 Supplemental table 9: The normalized read count for ChIP-seq (DMSO-pretreated M1 macrophages 926 stimulated with 1h LPS (100 ng/mL))

927 Supplemental table 10: The normalized read count for ChIP-seq (DMSO-pretreated M1 macrophages 928 stimulated with 4h LPS (100 ng/mL))

929 Supplemental table 11: The normalized read count for ChIP-seq (GSK761-pretreated M1 macrophages)

930 Supplemental table 12: The normalized read count for ChIP-seq (GSK761-pretreated M1 macrophages 931 stimulated with 1h LPS (100 ng/mL))

932 Supplemental table 13: The normalized read count for ChIP-seq (GSK761-pretreated M1 macrophages 933 stimulated with 4h LPS (100 ng/mL))

934 935 936

29 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

937 References 938 939 1. Rutgeerts, P. A critical assessment of new therapies in inflammatory bowel disease. Journal of 940 Gastroenterology and Hepatology 17, S176-S185 (2002). 941 942 2. Papadakis, K.A. et al. Safety and efficacy of adalimumab (D2E7) in Crohn's disease patients with 943 an attenuated response to infliximab. The American journal of gastroenterology 100, 75-79 944 (2005). 945 946 3. Feagan, B.G. et al. Treatment of active Crohn's disease with MLN0002, a humanized antibody to 947 the alpha4beta7 integrin. Clinical gastroenterology and hepatology : the official clinical practice 948 journal of the American Gastroenterological Association 6, 1370-1377 (2008). 949 950 4. Sandborn, W.J. et al. A randomized trial of Ustekinumab, a human interleukin-12/23 monoclonal 951 antibody, in patients with moderate-to-severe Crohn's disease. Gastroenterology 135, 1130-1141 952 (2008). 953 954 5. Bain, C.C. & Schridde, A. Origin, Differentiation, and Function of Intestinal Macrophages. 955 Frontiers in immunology 9, 2733-2733 (2018). 956 957 6. Mahida, Y.R. The key role of macrophages in the immunopathogenesis of inflammatory bowel 958 disease. Inflammatory bowel diseases 6, 21-33 (2000). 959 960 7. Chen, S., Yang, J., Wei, Y. & Wei, X. Epigenetic regulation of macrophages: from homeostasis 961 maintenance to host defense. Cellular & molecular immunology (2019). 962 963 8. Schmidt, S.V. et al. The transcriptional regulator network of human inflammatory macrophages is 964 defined by open chromatin. Cell research 26, 151-170 (2016). 965 966 9. Chan, C.H. et al. BET bromodomain inhibition suppresses transcriptional responses to cytokine- 967 Jak-STAT signaling in a gene-specific manner in human monocytes. European journal of 968 immunology 45, 287-297 (2015). 969 970 10. Mehta, S. et al. Maintenance of macrophage transcriptional programs and intestinal homeostasis 971 by epigenetic reader SP140. Science immunology 2 (2017). 972 973 11. Ray, G. & Longworth, M.S. Epigenetics, DNA Organization, and Inflammatory Bowel Disease. 974 Inflammatory bowel diseases 25, 235-247 (2019). 975 976 12. Bloch, D.B., de la Monte, S.M., Guigaouri, P., Filippov, A. & Bloch, K.D. Identification and 977 characterization of a leukocyte-specific component of the nuclear body. The Journal of biological 978 chemistry 271, 29198-29204 (1996). 979 980 13. Dent, A.L. et al. LYSP100-associated nuclear domains (LANDs): description of a new class of 981 subnuclear structures and their relationship to PML nuclear bodies. Blood 88, 1423-1426 (1996). 982 983 14. Karaky, M. et al. SP140 regulates the expression of immune-related genes associated with 984 multiple sclerosis and other autoimmune diseases by NF-kappaB inhibition. Human molecular 985 genetics 27, 4012-4023 (2018). 986

30 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

987 15. Zucchelli, C. et al. Sp140 is a multi-SUMO-1 target and its PHD finger promotes SUMOylation 988 of the adjacent Bromodomain. Biochimica et biophysica acta. General subjects 1863, 456-465 989 (2019). 990 991 16. Franke, A. et al. Genome-wide meta-analysis increases to 71 the number of confirmed Crohn's 992 disease susceptibility loci. Nat Genet 42, 1118-1125 (2010). 993 994 17. Christodoulou, K. et al. Next generation exome sequencing of paediatric inflammatory bowel 995 disease patients identifies rare and novel variants in candidate genes. Gut 62, 977-984 (2013). 996 997 18. Li Yim, A.Y.F. et al. Peripheral blood methylation profiling of female Crohn’s disease patients. 998 Clinical Epigenetics 8, 65 (2016). 999 1000 19. Li, J., Zhao, G. & Gao, X. Development of neurodevelopmental disorders: a regulatory 1001 mechanism involving bromodomain-containing proteins. Journal of Neurodevelopmental 1002 Disorders 5, 4 (2013). 1003 1004 20. Dawson, M.A. et al. Inhibition of BET recruitment to chromatin as an effective treatment for 1005 MLL-fusion leukaemia. Nature 478, 529-533 (2011). 1006 1007 21. Mertz, J.A. et al. Targeting MYC dependence in cancer by inhibiting BET bromodomains. Proc 1008 Natl Acad Sci U S A 108, 16669-16674 (2011). 1009 1010 22. Copsel, S.N. et al. BET Bromodomain Inhibitors Which Permit Treg Function Enable a 1011 Combinatorial Strategy to Suppress GVHD in Pre-clinical Allogeneic HSCT. Frontiers in 1012 immunology 9, 3104 (2018). 1013 1014 23. Martin, J.C. et al. Single-Cell Analysis of Crohn's Disease Lesions Identifies a Pathogenic 1015 Cellular Module Associated with Resistance to Anti-TNF Therapy. Cell 178, 1493-1508.e1420 1016 (2019). 1017 1018 24. Butler, A., Hoffman, P., Smibert, P., Papalexi, E. & Satija, R. Integrating single-cell 1019 transcriptomic data across different conditions, technologies, and species. Nature Biotechnology 1020 36, 411-420 (2018). 1021 1022 25. Stuart, T. et al. Comprehensive Integration of Single-Cell Data. Cell 177, 1888-1902.e1821 1023 (2019). 1024 1025 26. van der Mark, V.A. et al. Phospholipid flippases attenuate LPS-induced TLR4 signaling by 1026 mediating endocytic retrieval of Toll-like receptor 4. Cellular and Molecular Life Sciences 74, 1027 715-730 (2017). 1028 1029 27. Ge, S.X., Jung, D. & Yao, R. ShinyGO: a graphical gene-set enrichment tool for animals and 1030 plants. Bioinformatics (2019). 1031 1032 28. Gordon, L.J. et al. Direct Measurement of Intracellular Compound Concentration by RapidFire 1033 Mass Spectrometry Offers Insights into Cell Permeability. Journal of biomolecular screening 21, 1034 156-164 (2016). 1035

31 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

1036 29. Wellaway, C.R. et al. Discovery of a Bromodomain and Extraterminal Inhibitor with a Low 1037 Predicted Human Dose through Synergistic Use of Encoded Library Technology and Fragment 1038 Screening. Journal of Medicinal Chemistry 63, 714-746 (2020). 1039 1040 30. Kazmierski, W.M. et al. DNA-Encoded Library Technology-Based Discovery, Lead 1041 Optimization, and Prodrug Strategy toward Structurally Unique Indoleamine 2,3-Dioxygenase-1 1042 (IDO1) Inhibitors. Journal of Medicinal Chemistry (2020). 1043 1044 31. Koelink, P.J. et al. Anti-TNF therapy in IBD exerts its therapeutic effect through macrophage IL- 1045 10 signalling. Gut (2019). 1046 1047 32. Filippakopoulos, P. et al. Histone recognition and large-scale structural analysis of the human 1048 bromodomain family. Cell 149, 214-231 (2012). 1049 1050 33. Zhao, L. et al. Flotillin1 promotes EMT of human small cell lung cancer via TGF-β signaling 1051 pathway. Cancer Biol Med 15, 400-414 (2018). 1052 1053 34. Netea, M.G. et al. Trained immunity: A program of innate immune memory in health and disease. 1054 Science 352, aaf1098-aaf1098 (2016). 1055 1056 35. Belkina, A.C., Nikolajczyk, B.S. & Denis, G.V. BET protein function is required for 1057 inflammation: Brd2 genetic disruption and BET inhibitor JQ1 impair mouse macrophage 1058 inflammatory responses. Journal of immunology (Baltimore, Md. : 1950) 190, 3670-3678 (2013). 1059 1060 36. Beecham, A.H. et al. Analysis of immune-related loci identifies 48 new susceptibility variants for 1061 multiple sclerosis. Nat Genet 45, 1353-1360 (2013). 1062 1063 37. Bradford, E.M. et al. Epithelial TNF Receptor Signaling Promotes Mucosal Repair in 1064 Inflammatory Bowel Disease. Journal of immunology (Baltimore, Md. : 1950) 199, 1886-1897 1065 (2017). 1066 1067 38. Bloemendaal, F.M. et al. TNF-anti-TNF Immune Complexes Inhibit IL-12/IL-23 Secretion by 1068 Inflammatory Macrophages via an Fc-dependent Mechanism. Journal of Crohn's & colitis 12, 1069 1122-1130 (2018). 1070 1071 39. Vos, A.C. et al. Anti-tumor necrosis factor-alpha antibodies induce regulatory macrophages in an 1072 Fc region-dependent manner. Gastroenterology 140, 221-230 (2011). 1073 1074 40. Neurath, M.F. IL-23: a master regulator in Crohn disease. Nature Medicine 13, 26-27 (2007). 1075 1076 41. De Vries, L.C.S., Wildenberg, M.E., De Jonge, W.J. & D’Haens, G.R. The Future of Janus Kinase 1077 Inhibitors in Inflammatory Bowel Disease. Journal of Crohn's and Colitis 11, 885-893 (2017). 1078 1079 42. Bonen, D.K. et al. Crohn's disease-associated NOD2 variants share a signaling defect in response 1080 to lipopolysaccharide and peptidoglycan. Gastroenterology 124, 140-146 (2003). 1081 1082 43. Leshchiner, E.S. et al. Small-molecule inhibitors directly target CARD9 and mimic its protective 1083 variant in inflammatory bowel disease. Proceedings of the National Academy of Sciences 114, 1084 11392-11397 (2017). 1085

32 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

1086 44. Patel, U. et al. Macrophage polarization in response to epigenetic modifiers during infection and 1087 inflammation. Drug Discov Today 22, 186-193 (2017). 1088 1089 45. Steinbach, E.C. & Plevy, S.E. The role of macrophages and dendritic cells in the initiation of 1090 inflammation in IBD. Inflammatory bowel diseases 20, 166-175 (2014). 1091 1092 46. Stark, G.R., Kerr, I.M., Williams, B.R., Silverman, R.H. & Schreiber, R.D. How cells respond to 1093 interferons. Annual review of biochemistry 67, 227-264 (1998). 1094 1095 47. Darnell, J.E., Jr., Kerr, I.M. & Stark, G.R. Jak-STAT pathways and transcriptional activation in 1096 response to IFNs and other extracellular signaling proteins. Science (New York, N.Y.) 264, 1415- 1097 1421 (1994). 1098 1099 48. Grohmann, U. et al. Positive regulatory role of IL-12 in macrophages and modulation by IFN- 1100 gamma. Journal of immunology (Baltimore, Md. : 1950) 167, 221-227 (2001). 1101 1102 49. Nakanishi, K. Unique Action of Interleukin-18 on T Cells and Other Immune Cells. Front 1103 Immunol 9, 763-763 (2018). 1104 1105 50. Danese, S., Sans, M. & Fiocchi, C. The CD40/CD40L costimulatory pathway in inflammatory 1106 bowel disease. Gut 53, 1035-1043 (2004). 1107 1108 51. Rugtveit, J., Bakka, A. & Brandtzaeg, P. Differential distribution of B7.1 (CD80) and B7.2 1109 (CD86) costimulatory molecules on mucosal macrophage subsets in human inflammatory bowel 1110 disease (IBD). Clin Exp Immunol 110, 104-113 (1997). 1111 1112 52. Genard, G., Lucas, S. & Michiels, C. Reprogramming of Tumor-Associated Macrophages with 1113 Anticancer Therapies: Radiotherapy versus Chemo- and Immunotherapies. Frontiers in 1114 immunology 8, 828 (2017). 1115 1116 53. Gunthner, R. & Anders, H.J. Interferon-regulatory factors determine macrophage phenotype 1117 polarization. Mediators of inflammation 2013, 731023 (2013). 1118 1119 54. Jia, Y. et al. IRF8 is the target of SIRT1 for the inflammation response in macrophages. Innate 1120 immunity 23, 188-195 (2017). 1121 1122 55. Madani, N. et al. Implication of the lymphocyte-specific nuclear body protein Sp140 in an innate 1123 response to human immunodeficiency virus type 1. Journal of virology 76, 11133-11138 (2002). 1124 1125 56. Regad, T. & Chelbi-Alix, M.K. Role and fate of PML nuclear bodies in response to interferon and 1126 viral infections. Oncogene 20, 7274-7286 (2001). 1127 1128 57. Ruhlemann, M.C. et al. Application of the distance-based F test in an mGWAS investigating beta 1129 diversity of intestinal microbiota identifies variants in SLC9A8 (NHE8) and 3 other loci. Gut 1130 microbes 9, 68-75 (2018). 1131 1132 58. Hagio, K. et al. High miR-3687 Expression Affects Migratory and Invasive Ability of 1133 Oesophageal Carcinoma. Anticancer research 39, 557-565 (2019). 1134 1135 59. Xing, R. miR-3648 Promotes Prostate Cancer Cell Proliferation by Inhibiting Adenomatous 1136 Polyposis Coli 2. Journal of nanoscience and nanotechnology 19, 7526-7531 (2019).

33 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

1137 1138 60. Tian, W. et al. MALAT1–miR663a negative feedback loop in colon cancer cell functions through 1139 direct miRNA–lncRNA binding. Cell Death & Disease 9, 857 (2018). 1140 1141 61. Mensah, A.A. et al. Bromodomain and extra-terminal domain inhibition modulates the expression 1142 of pathologically relevant microRNAs in diffuse large B-cell lymphoma. Haematologica 103, 1143 2049-2058 (2018). 1144 1145 62. Mio, C. et al. BET bromodomain inhibitor JQ1 modulates microRNA expression in thyroid 1146 cancer cells. Oncology reports 39, 582-588 (2018). 1147 1148

34 Figure 1 a 600 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted AugustPresent 20, 2020. The copyright holder for this preprint 500 (which was not certified by peer review) is the author/funder, who has granted bioRxiv a licenseMarginal to display or not the detectable preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license. 400

300 expression 200

SP140 100

0 skin vein lung liver aorta testis colon Ileum Blood cervix bones artery breast uterus cecum spleen kidney rectum tendon thymus bladder adipose prostate stomach brain-BA jejunum pancreas bronchus appendix exocervix omentum esophagus left atrium duodenum monocytes endocervix soft tissues gallbladder lymph node right atrium vas deferens brain-nonBA left ventricle myometrium thyroid gland adrenal gland salivary gland endometrium fallopian tube right ventricle small intestine seminal vesicle muscle skeletal coronary artery mononuclear (MS) lymphocyte-T CD8+ bronchial brushings digestive muscularis lymphocytes-T CD4+ mononuclear (lupus) lymphocyte (heart dis) b c d White blood cells Colon 800 250 **** N CD Uninflamed * **** 200 600 Inflamed ** 150 400 ** 100

200

expression SP140 expression **** SP140 50 0 0 N CD UC SLE CLL AML RA NCDCDUCUC e DAPI CD68 SP140 Merged

N

CD

f 40 Total CD68+ macrophages 30 CD68+ SP140+ macrophages CD68+ SP140- macrophages

20

eld fi Count per per Count 10

0 N CD Figure 2

a b CD68 CD64 (FCGR1A) CD14 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

CD16 (FCGR3A) CD206 (MRC1) CD163

c % non-zero cells CD163 CD206 CD16 CD14 CD64 CD68 Cluster 0 Cluster 1 Cluster 2 Cluster 3 Cluster 4 Cluster 5 Cluster 6 Cluster 7 Cluster 8 Cluster 9 Cluster 10 Cluster 11 Cluster 12 Cluster 13 Cluster 14 Cluster 15 Cluster 16 Cluster 17 Cluster 18 Cluster 19 Cluster 20 Cluster 21 d 0 10 20 30 SP140 in cluster 6 Uninflamed Inflamed

e SP140 CD68 CD64 CD14 p-value = 0.04 p-value = 0.37 p-value = 0.14 p-value = 0.17 0.100 0.7 0.5 0.15 0.4 0.075 0.6 0.10 0.3 0.050 0.5 0.2 0.4 0.05 0.025 0.1 0.3 0.000 0.00 0.0 CD16 CD206 CD163 p-value = 0.08 p-value = 0.71 p-value = 0.76 Phenotype 0.3 0.5 Inflamed 0.4 0.4 Uninflamed 0.2 0.3 % non-zero cells in cluster 6 0.2 0.1 0.2 0.1 0.0 0.0 0.0 Figure 3 b SP140 c a 0.0015 15 **** DAPI SP140 Merged bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint Media M0 (which was not3 days certified by peer review) is0.0010 the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It 10is made available under aCC-BY-NC 4.0M1 International license. IFN� M-CSF M1 3 days 3 days CD14+ 0.0005 05

IL-4 M2 ve gene expression gene ve Monocytes 3 days M2 �

0.0000 bodies nuclear SP140

Rela 0 M0 M1 M2 M1 M2 d e f DAPI SP140 Merged 0.020 SP140 20 scrambled Scrambled siRNA scrambled Wash SP140 siRNA SP140 siRNA 48 hours 0.015 15 M1 -/+ LPS M1 SP140 siRNA 0.010 Wash 10 48 hours ***

0.005 * M2 05 ve gene expression gene ve � *

0.000 bodies nuclear SP140 0 Rela - LPS + LPS g scrambled SP140 siRNA h 5 1.5 TNF 0.08 IL6 IL8 Treatment 0.50 Scrambled 4 SP140 siRNA 0.06 LPS 1.0 ** 0.25 0h 3 4h * 0.04 2 0.00 0.5 0.02 (10.26%) PC2

1 -0.25

ve gene expression gene ve � ve gene expression gene ve � expression gene ve �

Rela Rela 0.0 Rela 0.00 0 -0.50 - LPS + LPS - LPS + LPS - LPS + LPS -0.3 -0.2 -0.1 0.0 0.1 0.2 0.3 15000 TNF 2000 IL-6 2000 IL-8 PC1 (51.37%)

1500 1500 i 10000 * Treatment 1000 1000 Scrambled

* EBI2 IL2RA SP140 siRNA

pg/mL pg/mL pg/mL G0S2 FLJ10489 5000 DCAF6 USP41 RTN4R MFHAS1 500 500 ABCC3 C13orf29 WIPI1 SP140 GCNT1 TRIM69 GRIN3A C13orf31 0 0 0 CXCL9 TGFBR2 SFT2D2 CDC40 - LPS + LPS - LPS + LPS - LPS + LPS ANKRD22 SCP2 JAK2 CEP170 GBP5 TPM3 GBP4 ADAM9 j Hallmark pathways analysis, 0h_LPS k Reactome analysis, 0h_LPS SP140 EPHA5 PLAC8 MLL3 Interleukin-10 signaling GBP1 PDIA6 INTERFERON ALPHA SIGNALING GVIN1 TMEM150B DCTN4 NEK7 TNFA SIGNALING VIA NFKB Signaling by Receptor Tyrosine Kinases ITGA11 U1SNRNPBP PIGN EIF3CL MHC class II antigen presentation LRRK2 NOMO3 XENOBIOTIC METABOLISM PAQR8 GPR89A FRMD3 HNRNPA1 INFLAMMATORY RESPONSE cytokine Signaling in Immune system GPR18 EIF5A RRAGA PIPSL SLC35B1 MED12 INTERFERON GAMMA RESPONSE Immune System DGKA CENPT THG1L ZEB2 STRADB C9orf167 0 5 10 15 0 1 2 3 4 FYCO1 NUP153 TMCO1 COX7B2 MRPL36 EPHB6 Hallmark pathways analysis, 4h_LPS Reactome analysis, 4h_LPS RNASE1 ARRDC4 FLJ20489 ACSF2 SUMOylation NGFRAP1 CUTL1 TNFA SIGNALING VIA NFKB C17orf63 SPATA13 SUMO E3 ligases SUMOylate target proteins PON2 AK2P2 IL2 STAT5 SIGNALING UNFOLDED APLP2 TNPO3 HRASLS3 HRASLS3 Metabolism of lipids MOBKL1A LAMB2 PROTEIN RESPONSE SNX10 RHOBTB3 FCGBP NFIA cytokine Signaling in Immune system FAM107B TOMM40L INTERFERON GAMMA RESPONSE SIPA1L1 ELAC2 Immune System CEACAM1 RRP12 ANDROGEN RESPONSE AMY1A WDR5 CDK6 C3orf58 MDK SLC29A1 0 2 4 6 8 0 2 4 6 MLF2 SLC39A10 p-value -Log10(p-value) -Log10( ) 0h_LPS 4h_LPS -3 0 3 Figure 4 a b BB2

bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyrightN holder for this preprint DEL68(which was not certified by peer review) is the author/funder, who has granted bioRxiv a licenseO to displayH the preprint in perpetuity. ItNH is made available under aCC-BY-NC 4.0 International license. N N O O Most enriched warhead O

BB1

O BB1 c O H N N NH N O BB3 O HO BB3 O S O O O O + d e S N O O- CH3 520 360 HN O CH3 500

480 340 O 460 S N O 440 HO 320 H C 420 3 400 300 380 O S O HO FP (mP) 360 FP (mP) 340 280 320 300 260 280 1e-9 1e-8 1e-7 1e-6 1e-8 1e-7 1e-6 [SP140](M) [GSK761](M)

f Pull down of Pull down of endogenous SP140 transfected Halo‐ (Western blot) SP140 (Halo visualisation) Figure 5 a b 0.20 CD64 0.10 CD206 100 DMSO DMSO bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475IFN�; this version posted August 20, 2020.GSK761 The copyright holder for this preprint (which was not certified by 0.08peer review) is the author/funder, whoIL-4 has granted80 bioRxiv a license to display the preprint in perpetuity. It is madeGSK761 0.15 available under aCC-BY-NC 4.0 International license. 0.06 60 0.10 * 0.04 40

0.05 cells CD64+ %

ve gene expression gene ve � ve gene expression gene ve

� 0.02 20 Rela Rela 0.00 0 0.00 3 3 4 5 -10 0 10 10 10 DMSO GSK761 DMSO GSK761 CD64-PE-Cy5-A c d 1.0 Up-regulated in GSK761 IL8 80 DMSO 0.5 DMSO GSK761 0.0 GSK761 aCt)

lt IL6 60 −0.5 CCL3 IL1B NCF1 TNF CCL5 −1.0

2^-De CDKN1A 40 −1.5 CCL1 −2.0 HLA-C CSF3 % CD206+ cells CD206+ % 20 IL18 (GSK761 GM-CSF

10 −2.5 IL2RA Down-regulated in GSK671 0 Log −3.0 IL12A 3 3 4 5 -10 0 10 10 10 CD206_APC-A −3.5 −3.0 −2.0 −1.0 0.0 1.0 Log (DMSO 2^-DeltaCt) e f 10 1.5 TNF 1.5 IL-6 1.5 TNF 1.5 IL6 1.0 1.0 DMSO GSK761 ** * 0.5 ** 0.5 *** ** ** 1.0 1.0 *** *** 0.0 0.0 0 0.01 0.04 0.12 0.37 1.11 0 0.01 0.04 0.12 0.37 1.11 0.5 0.5 2.0 IL-12p70 1.5 IL-1β 1.5 1.0 1.0 * 0.0 0.0 0.5 0.5 * * IL10 CD64 ** 1.5 1.5 ** ** Fold change (pg/mL) change Fold *** 0.0 0.0 0 0.01 0.04 0.12 0.37 1.11 0 0.01 0.04 0.12 0.37 1.11 1.0 1.0 2.0 IL-8 1.5 IL-10 1.5 1.0

1.0 change) (fold expression gene Relative 0.5 0.5 0.5 0.5 * * ** *** 0.0 0.0 0.0 0.0 0 0.01 0.04 0.12 0.37 1.11 0 0.01 0.04 0.12 0.37 1.11 GSK761 (uM) Figure 6 a b c - LPS bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475+ LPS ; this version posted August 20, 2020. The copyrightChroma holder for� thisn states preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made only 21 K4me321 K9me321 K9Ac 21 K14Ac 21 K9Ac/K14Ac21 ------TNF-TSSavailable under aCC-BY-NCIL6-TSS 4.0 International license. Tx: Strong transcription 1 1 1 1 1 1 0.8 0.4 Enh: Enhancers

H3 H3 H3 H3 H3 H3 Beads MagicMark TxWk: Weak transcription 220 EnhG: Genic enhancers TssA: Active TSS 120 0.6 0.3 12.33% 100 TssAFlnk: Flanking Active TSS 13.09% Quies: Quiescent/Low 80 11.75% TxFlnk: Transcr. at gene 5 and 3 0.2 Input 0. ZNF/Rpts: ZNF genes & repeats

60 4 BivFlnk: Flanking Bivalent TSS/Enh

50 % 12.55% % Input 23.86% EnhBiv: Bivalent Enhancer 40 ReprPC: Repressed PolyComb 0. 0.1 9.93% 30 2 ReprPCWk: Weak Repressed PolyComb Heterochromatin 20 TssBiv: Bivalent/Poised TSS 0.0 0.0 Total=100% DN1 DN2 DN3 DN1 DN2 DN3 d Low High e f 0h 1h 4h 0h 1h 4h TNF-TSS Top 20 Top 20 0h LPS_DMSO FLOT1 FLOT1 0.8 SERPINA1 SCO2 DMSO 1h LPS_DMSO VIM-AS1 FTH1 GSK761 4h LPS_DMSO CEBPB-AS1 TMEM138 1.2 PHF19 SLC35B2 0h_LPS GSK761 NFKBIA GLUL 0.6 1h LPS_GSK761 LOC100130476 BZRAP1-AS1 1.0 LINC00910 NFE2L2 4h LPS_GSK761 FTH1 CEBPB-AS1

TNFAIP2 TYMP signal 0.8 DUSP5P1 SGK1 0.4

e IRF1 ICAM4 g 0.6

ICAM4 NFKBIA % Input SCO2 IRF1 ROCK1P1 0.4 SOD2 era TNFα LINC00910 0.2 NUP62 DMXL2

Av Ranked on 1h LPS Ranked on 4h LPS 0.2 ACTR3 ROCK1P1 LITAF CTSB SGPP2 TNFAIP2 0.0 0.0 -5k TSS +5k -5k TSS +5k -5k TSS +5k -5k TSS +5k -5k TSS +5k -5k TSS +5k -5K -4K -3K -2K -1K TSS 1K 2K 3K 4K 5K Distance from TSS (kb) Distance from TSS (kb) 0h 1h 4h Distance from TSS (kb) g h Ontology_ChIP_1h LPS 4h_LPS 0.00 8h_LPS

0.50 Response to stress 0.50 Response to organic substance Immune response GSK761 0.25 0.25 Cellular response to cytokine stimulus Response to cytokine 0.00

0.00 Response to stress Response to organic substance PC2(10.67%) PC2(10.67%) -0.25 Immune response DMSO -0.25 Cellular response to cytokine stimulus -0.50 Response to cytokine -0.475 -0.450 -0.425 -0.400 -0.375 -0.350 -0.50 -0.45 -0.40 -0.35 -0.30 PC1(22.24%) PC1(22.24%) 0 10 20 30 40 50 -log10(p-value) DMSO GSK761B i j DMSO_0h LPS DMSO_1h LPS DMSO_1h LPS GSK761 VS DMSO INFLAMMATORY_RESPONSE 0h 4h 8h Enrichment 2 INTERFERON_ALPHA_RESPONSE 1 0 IL6_JAK_STAT3_SIGNALING TNFA_SIGNALING_VIA_NFKB -1 -2 INTERFERON_GAMMA_RESPONSE -3 INTERFERON_GAMMA_RESPONSE TNFA_SIGNALING_VIA_NFKB Status INTERFERON_ALPHA_RESPONSE Upregulated GSK761_0h LPS GSK761_1h LPS GSK761_1h LPS Downregulated

INFLAMMATORY_RESPONSE INFLAMMATORY_RESPONSE -log10(pval) INTERFERON_ALPHA_RESPONSE 1 IL6_JAK_STAT3_SIGNALING 2 IL6_JAK_STAT3_SIGNALING 3 INTERFERON_GAMMA_RESPONSE

TNFA_SIGNALING_VIA_NFKB

0 2 4 6 0 2 4 60 2 4 6 Normalized read count Figure 7 a DMSO GSK761 b c TRIM32 CYB561D1 40 TNFAIP2 PER1 HHEX bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint CXCR4 (which was not certified 30by peer review) is the author/funder, who has granted bioRxiv a license35 to display the preprint in perpetuity. It DDX11L16is made ZNF766 TRIM25available under aCC-BY-NC 4.0 International license. FAM46A TSC22D3 IL18 PLAU 30 BBS12 ENSG00000279117 ENSG00000235027 ADPGK-AS1 LINC00641

ZNF628 DMSO_1h LPS ADM

TFAP4 _ LTB value) 25 ZNF12 20 ENSG00000265206 CERK LTA TNF GPAA1 GAS5 p- MIR6732 TRAF1 MIR7641-2 ZNF627 IL6 MIR8078 10 IRF1 ZNF57 20 TYMP AC068580.3 NFKBIA FTH1 PATJS AC068580.1 VIM-AS1

-Log C10ORF55 15 RGS1 10 PDIK1L ZNF844 ZNF526 10 ZNF319 ZNF441 DCLRE1B 5 SOWAHD 0 ZNF230 CDKN1B -5.0 -2.5 0.0 2.5 5.0 Normalized read count 0 EDN1 0 5 10 15 20 25 30 35 40 LOC101928809 Mean effect size ST7-AS1 Non-significant Down-regulated Up-regulated Normalized read count_GSK761_1h LPS PTGER4 TRIM25 d e DMSO AL157871.2 ADPGK-AS1 140 140 140 140 DMSO_0h LPS GSK761 OSR2 120000 TNFAIP2 12000 TRAF1 140000 TNF 500 LTA AC092723.1 0 0 0 0 450 100000 10000 120000 NT5DC2 140 140 140 140 DMSO_1h LPS 400 100000 MIR29B2CHG 80000 8000 350 BCL2L14 80000 300 TNFSF9 0 0 0 0 60000 6000 250 140 140 140 140 60000 200 EGR4 DMSO_4h LPS 40000 4000 AC004687.1 40000 150 20000 2000 100 IL6 Expression 20000 0 0 0 0 50 F3 140 140 140 140 0 0 0 0 FAM43A GSK761_0h LPS

Signal IRF1 LTB YUMP NFKBIA BMP2 40000 90 35000 80000 CXCL1 0 0 0 0 35000 80 30000 70000 140 140 140 140 70 IL1B GSK761_1h LPS 30000 25000 60000 60 KCNA3 25000 50000 PMAIP1 50 20000 0 0 0 0 20000 40000 40 15000 ZNF503 140 140 140 140 15000 30000 GSK761_4h LPS 30 BTG2 10000 IER5 10000 20 20000 5000 10 5000 10000 0 0 0 0 Expression KCNJ2 IRF1 TRAF1 0 0 0 0 TP53INP2 TNF IRF1 OASL TRAF1 IRF1 TRAFAIP2 TRAF1 IRF1 PTGS2 IRF1 LINC00515 DNAJB4 GSK761 vs DMSO GSK761 vs DMSO FAM65C f g � Expression_4h, ChIP_1h Homer Known Mo f Enrichment GPR132 Expression_4h, ChIP_4h Mo�f Protein P-value TRAF1 AGER ATF3 1e-9 PELI1 GG C ) TAA C TT ATG CA FOSL2(FRA2) 1e-9 GA

G T A T CCG GGG FSCN1 0 CCCTTAGTACATCTCTACAGAG AP001972.5 T G T 0 GCA G AG C BATF 1e-9 T

G T CCG GAGG AC018644.1 CTTAGTACTACACTCTACAA GG C LINC01465 TAA G TA SLC35B2 IL10RA CTG C C FOSL1(FRA1) G T T A G CA GAT GTT CCCG CCGCG ACT TAATA AG 1e-8 LINC00641 BTBD19 NCF1B ISG15 T A LINC01465 RNF19B CYB561D1 TNFAIP2 LITAF -10 IL10RA C NFkB-p65-Rel T G

CT CC GTG A GGTCG AGAT DRAM1 IRF2 GCAT TATACTCATCGA 1e-8 ADM IRF1 NABP1 G AA C PNRC1 G GT CDKN2B LINC01465 G T C -20 C CA IRF8 A C C A G A T G C GT CTCGC TT C 1e-7 T GT TTGACC T TNFAIP2 IRF1 A GAA G MEGF8 AAA G A SLC35B2 GA C T SGK1 AG C CLCF1 T JunB TAAAG 1e-7 CA G TC GTT CA G CTGTGCGATCTCGCAA NCR3LG1 -20 DB (peak signal (peak DB G CC AA C TTA TG CAT LINC02605 A Jun-AP1

G T T CCTG GTG TAACTCAGACG C G A C TGA TCA GG 1e-7 IL23A AG AG GT TC GC C A C A C G C CGGG A IRF3

G -40 CTGAATA CCAGAGA CCDC200 T TTTCTATTCT 1e-7 MIR3945HG G TCC -30 AA G TG TG G CCA AA FOSL2 (FRA2) T CG CA TA T 1e-7 G G CC GC TG CAA G CT CTATAAGT MIR22HG -10 T SERPINB9 -5 0 5 -10 -5 0 5 ZC3HAV1 Differen�al expression (Wald sta�s�c) BTBD19 DBG and DEG DBG and non-DEG Homo de novo Mo�f IL18 Mo� f P-value Best match E2F7 SLC25A34 H GO: Chemokine ac�vity (ChIP_0h) GO: Chemokine ac�vity (ChIP_1h) GO: Chemokine ac�vity (ChIP_4h)

GT T T GTTT CGGGGGCGCGGG AC124798.1 NES:-1.3286, p-value:0.01 NES:-1.3290, p-value:0.01 NES:-1.2698, p-value: 3.43E-3 TACACTACACTACAGTACTATACACACA PH0158.1_Rhox11_2 ACOD1 Reverse opposite 1e-23 CCL5 0.00 0.0 0.0 /Jaspar(0.620)

GTGT T T TT CCCGTTCTCGCG GEM TAGATACAGCACGAGAGCAGACTAGACA AL031590.1 -0.25 -0.2 -0.2 TNIP3

T G

TTTTT G T TT GGCGG CTG GG ESPL1 ACACCGCACAATAAC -0.4 CAGAAATGCCCA HIC2/MA0738.1 LOC101927686 -0.50 -0.4 Reverse opposite 1e-22 TNFAIP6 /Jaspar(0.675) -0.6 T C

GT T T GGTGT CC CGG CCGCC CD300E AAGCAACTACGTAAAAAA -0.75 TG G A TTCTG HIVEP1 -0.6 AC006449.6 -0.8 CAMK1G G A C MITF/MA0620.2

T T G T CTCTGTCGCTGT 0 10000 20000 0 10000 20000 0 10000 20000 GACAGTACGACACGATACTAGTACGCAGA 1e-22 -3 0 3 Reverse opposite /Jaspar(0.741) GO: Chemokine ac�vity (RNA-seq_0h) GO: Chemokine ac�vity (RNA-seq_4h) GO: Chemokine ac�vity (RNA-seq_8h) G T NES:-1.8842, p-value:1.21E-3 NES:-2.1996, p-value:1.97E-4 NES: 0.7486, p-value: 0.8466 C

T T G T TCCTTGTCTTTG CAGAGACGAGCACAGCATAGCACGAGACA 0.0 0.0 0.2

Enrichment score

-0.2 0.1 -0.2

-0.4 0.0 -0.4 -0.6 -0.1

-0.6 -0.8 -0.2 0 5000 10000 0 10000 20000 30000 0 5000 10000 15000 Rank Supplementary figure 1

bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint Ulcera�ve coli(which�s, wascolon not certified by peerAppendici review) �iss, the appendix author/funder, who hasSarcoidosis, granted bioRxiv lung a license to display the preprintPsoria �inc perpetuity.arthri�s, synoviumIt is made available under aCC-BY-NC 4.0 International license.

Rheumatoid arthri�s, synovium Hashimoto's thyroidi�s, thyroid Sjogren's syndrome, cervical cyst Supplementary figure 2

Phenotype bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxivIn afl licenseamed to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 InternationalUnin licenseflamed.

Cluster 9 Cluster 21 Cluster 8 Cluster 6 Cluster 11 p-value = 5.09e-03 p-value = 7.24e-03 p-value = 3.02e-02 p-value = 3.03e-02 p-value = 6.64e-02 20 600 2000 500 15 900 400 1500 400 10 600 300 1000 200 200 5 300 500 100 0 0 0 0 Cluster 13 Cluster 10 Cluster 15 Cluster 14 Cluster 1 p-value = 7.42e-02 p-value = 7.64e-02 p-value = 8.63e-02 p-value = 1.17e-01 p-value = 1.28e-01 500 800 400 400 600 300 600 300 300 400 200 200 400 200 200 100 200 100 100

0 0 0 Cluster 17 Cluster 0 Cluster 18 Cluster 2 Cluster 16 p-value = 1.31e-01 p-value = 2.14e-01 p-value = 2.27e-01 p-value = 5.11e-01 p-value = 5.73e-01

800 200 300 1500 200 600 150 150 200 1000 100 100 400

Cell count 500 100 50 50 200

0 0 0 Cluster 12 Cluster 7 Cluster 3 Cluster 4 Cluster 5 p-value = 5.81e-01 p-value = 6.43e-01 p-value = 6.88e-01 p-value = 8.27e-01 p-value = 8.38e-01 2500 600 1000 1000

600 2000 750 750 400 1500 400 500 500 1000 200 200 250 250 500

0 0 0 0 Cluster 19 Cluster 20 p-value = 9.24e-01 p-value = 9.69e-01

80 120

60 80

40 40 20

0 Supplementary figure 3

a bioRxiv preprintCD64 doi: https://doi.org/10.1101/2020.08.10.239475CCL5 ; this version postedCD206 August 20, 2020. The copyright holderCCL22 for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license. 0.15 0.020 0.15 0.20 * M0 M1 0.015 0.15 0.10 0.10 M2 0.010 0.10 0.05 ** 0.05 0.005 0.05 * * eaiegene expression Relative eaiegeneRelative expression 0.00 0.000 geneRelative expression 0.00 geneRelative expression 0.00

b CD64 CCL5 CD206 CCL22 M0 0.25 50 0.003 1.0 M1 * ** M2 0.20 40 0.8 0.002 0.15 30 0.6

0.10 20 0.4 0.001 0.05 * ** 10 0.2 Relative gene expression eaiegeneRelative expression eaiegeneRelative expression ** * Relative gene expression 0.00 0 0.000 0.0

c d SP140 M0 SP140 in M1 SP140 in M2 0.020 M1 * M2 0.015 expression 0.010 gene gene 0.005

Relative 0.000

e SP140 CCL5 CD206 0.015 0.08 0.4 ** * 0.3 0.06 0.010 ** 0.2 0.04 0.005 * * 0.1 * 0.02 * * Relative gene0.000 expression **** Relative gene expression 0.0 Relative gene0.00 expression M0 M1 M2 M0 M1 M2 M0 M1 M2 M0 M1 M2 M0 M1 M2 M0 M1 M2 - LPS + LPS - LPS + LPS - LPS + LPS Supplementary figure 4 a bioRxivSP100 preprint doi: https://doi.org/10.1101/2020.08.10.239475SP110 ; this version postedSP140L August 20, 2020. The copyright holderBRD2 for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 0.020 0.015 available under aCC-BY-NC0.008 4.0 International license. 0.003 M0 M1 0.015 0.006 M2 0.010 0.002

0.010 p=0.06 0.004

0.005 0.001 0.005 0.002 Relative gene expression Relative gene expression Relative gene expression 0.000 0.000 0.000 Relative gene expression 0.000

BRD3 BRD4 BRD9 PCAF 0.008 0.005 0.0020 0.0010

0.004 0.0008 0.006 0.0015 0.003 0.0006 0.004 0.0010 0.002 0.0004 0.002 0.0005 0.001 0.0002 Relative gene expression Relative gene expression Relative gene expression 0.000 0.000 Relative gene expression 0.0000 0.0000

EP300 CREBBP BAZ2A BAZ2B 0.008 0.010 0.0025 0.003

0.008 0.0020 0.006 0.002 0.006 0.0015 0.004 0.004 0.0010 0.001 0.002 0.002 0.0005 Relative gene expression Relative gene expression Relative gene expression 0.000 Relative gene expression 0.000 0.0000 0.000 b c Scrambled SP140 siRNA DAPI SP100 Merged SP110 SP100 Scrambled SP140 siRNA 0.04 0.08 20

0.03 0.06 15

Scram bled

0.02 0.04 10

siRNA 0.01 0.02 5 SP100 nuclear bodies Relative gene expression Relative gene expression

0.00 0.00 SP140 0 -LPS + LPS -LPS + LPS Supplementary figure 5

a b DMSO 0.04 µM GSK761 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which1.5 was not certified byns peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under- LPSaCC-BY-NC 4.0 International license. +LPS 1.0

* *

0.5 0.12 µM GSK761 0.37 µM GSK761 **** **** 0.0

ATP luminescence (Fold change) 0.00 0.01 0.04 0.12 0.37 1.11 Concentration of GSK761 (�M)

c TNF 0.25 DMSO GSK761 0.20

0.15

0.10

0.05

Relative gene expression 0.00 Supplementary figure 6 a b c 1-21,H3 Pep�de 120 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint 100 (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to 100display the preprint in perpetuity. It is made available100 under aCC-BY-NC 4.0 International license. 80 1-21,H3 Pep�de

50 60 501 1

40

% Response % % Response % % Response % 0 20 0 0 1e-8 1e-7 1e-6 1e-5 1e-4 1e-9 1e-8 1e-7 1e-6 1e-5 1e-8 1e-7 1e-6 1e-5 1e-4 [H3 Peptide] M [Peptide] M [Peptide] M d e SP140 enrichment at H3K27ac 0h 1h 4h 300 + HaloTag Ligand 250 - HaloTag Ligand

200

mBU 150

100

50

0 1:1 1:10 1:100 SP140_NanoLuc:Histone_HaloTag Active TSS Flanking Active TSS f Transcr. at gene 5 and 3 Strong transcription Weak transcription Genic enhancers Enhancers SP140 appears SP140 reduced SP140 unaffected ZNF genes & repeats 59 6828 915 Heterochromatin Bivalent/Poised TSS Flanking Bivalent TSS/Enh Bivalent Enhancer Repressed PolyComb Weak Repressed PolyComb Quiescent/Low g GSK761B vs DMSO_0h LPS_ChIP GSK761B vs DMSO_1h LPS_ChIP GSK761B vs DMSO_4h LPS_ChIP

Enrichment FDR Functional Category Enrichment FDR Functional Category Enrichment FDR Functional Category 7.50E-05 Reactome:R-HSA-168256 Immune System 1.60E-22 Reactome:R-HSA-1280215 Cytokine Signaling in 7.10E-07 Reactome:R-HSA-1280215 Cytokine Immune system Signaling in Immune system 3.30E-04 Reactome:R-HSA-449147 Signaling by 3.10E-20 Reactome:R-HSA-168256 Immune System 7.70E-07 Reactome:R-HSA-449147 Signaling by Interleukins Interleukins 3.90E-04 Reactome:R-HSA-1280215 Cytokine Signaling 5.30E-15 Reactome:R-HSA-449147 Signaling by Interleukins 8.40E-07 Reactome:R-HSA-168256 Immune System in Immune system 4.60E-04 Reactome:R-HSA-168928 DDX58/IFIH1- 9.50E-12 Reactome:R-HSA-8953897 Cellular responses to 1.70E-05 Reactome:R-HSA-8953897 Cellular mediated induction of interferon-alpha/beta external stimuli responses to external stimuli 5.20E-03 Reactome:R-HSA-447115 Interleukin-12 family 1.10E-10 Reactome:R-HSA-2262752 Cellular responses to 1.80E-05 Reactome:R-HSA-8939211 ESR-mediated signaling stress signaling 5.20E-03 Reactome:R-HSA-75893 TNF signaling 6.90E-09 Reactome:R-HSA-913531 Interferon Signaling 2.20E-05 Reactome:R-HSA-9018519 Estrogen- dependent gene expression 8.00E-03 Reactome:R-HSA-5357905 Regulation of 2.90E-08 Reactome:R-HSA-6785807 Interleukin-4 and 13 2.20E-05 Reactome:R-HSA-6798695 Neutrophil TNFR1 signaling signaling degranulation 1.80E-02 Reactome:R-HSA-5357956 TNFR1-induced 1.50E-06 Reactome:R-HSA-1280218 Adaptive Immune 6.30E-05 Reactome:R-HSA-2559583 Cellular NFkappaB signaling pathway System Senescence 1.80E-02 Reactome:R-HSA-933542 TRAF6 mediated NF- 1.50E-06 Reactome:R-HSA-168249 Innate Immune System 1.90E-04 Reactome:R-HSA-2262752 Cellular kB activation responses to stress 1.90E-02 Reactome:R-HSA-1606322 ZBP1(DAI) 1.60E-06 Reactome:R-HSA-1169410 Antiviral mechanism by 1.90E-04 Reactome:R-HSA-74160 Gene expression mediated induction of type I IFNs IFN-stimulated genes (Transcription) Supplementary figure 7

a b Hallmark: TNFa signaling via NFkB Hallmark: TNFa signaling via NFkB Hallmark: TNFa signaling via NFkB NES: -1.4145, p-value: 1.00E-4 NES: -1.4145, p-value: 1.00E-4 NES: -1.4373, p-value: 1.00E-4

bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.2394750.00 ; this version posted August0.0 20, 2020. The copyright holder0.0 for this preprint HALLMARK_XENOBIOTIC_METABOLISM(which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made

HALLMARK_WNT_BETA_CATENIN_SIGNALING available under aCC-BY-NC 4.0 International-0.2 license. -0.2 -0.25 HALLMARK_UV_RESPONSE_UP -0.4 -0.4 HALLMARK_UV_RESPONSE_DN -0.50

-0.6 HALLMARK_UNFOLDED_PROTEIN_RESPONSE -0.6

e -0.75 ChIP-seq_0h ChIP-seq_1h ChIP-seq_4h HALLMARK_TNFA_SIGNALING_VIA_NFKB -0.8 or -0.8 HALLMARK_TGF_BETA_SIGNALING 0 10000 20000 0 10000 20000 0 10000 20000

HALLMARK_SPERMATOGENESIS t sc Hallmark: TNFa signaling via NFkB Hallmark: TNFa signaling via NFkB Hallmark: TNFa signaling via NFkB NES: -2.7270, p-value: 1.97E-4 NES: 0.3647, p-value: 0.7293 HALLMARK_REACTIVE_OXIGEN_SPECIES_PATHWAY NES: -2.2659, p-value: 2.03E-4

HALLMARK_PROTEIN_SECRETION 0.0

hmen 0.2 HALLMARK_PI3K_AKT_MTOR_SIGNALING 0.0 -0.2 HALLMARK_PEROXISOME

Enric 0.1

HALLMARK_P53_PATHWAY -0.2 -0.4 HALLMARK_OXIDATIVE_PHOSPHORYLATION 0.0 -0.6 HALLMARK_NOTCH_SIGNALING -0.4 RNA-seq_0h RNA-seq_4h -0.1 RNA-seq_8h

HALLMARK_MYOGENESIS -0.8 0 5000 10000 0 10000 20000 30000 0 5000 10000 15000 HALLMARK_MYC_TARGETS_V2 Rank HALLMARK_MYC_TARGETS_V1 DMSO GSK761 c d TR AF5 HALLMARK_MTORC1_SIGNALING C AS P3 ATF2 HALLMARK_MITOTIC_SPINDLE 25 MAPK14 TNF PIK3R 1 HALLMARK_KRAS_SIGNALING_UP LTA C R E B1 TRAF1 ITCH HALLMARK_KRAS_SIGNALING_DN MAP2K4 MLKL HALLMARK_INTERFERON_GAMMA_RESPONSE MAP2K7 20 TNFR S F1A HALLMARK_INTERFERON_ALPHA_RESPONSE BAG4 IKBKB NFKBIA TAB1 HALLMARK_INFLAMMATORY_RESPONSE TNF TRADD HALLMARK_IL6_JAK_STAT3_SIGNALING IL6 E DN1 HALLMARK_IL2_STAT5_SIGNALING _DMSO_1h LPS 15 MMP9 MAP3K8 HALLMARK_HYPOXIA TNFAIP3 CEBPB C C L20 HALLMARK_HEME_METABOLISM C XC L1 CCL5 IL1B IC AM1 HALLMARK_HEDGEHOG_SIGNALING TNFAIP3 10 PTGS 2 JUNB LTA HALLMARK_GLYCOLYSIS SOCS3 R IPK1 ICAM1 TR AF1 HALLMARK_G2M_CHECKPOINT MAP2K3 CFLAR NFKB1 HALLMARK_FATTY_ACID_METABOLISM IL1B BIR C 3 MAP3K7 HALLMARK_ESTROGEN_RESPONSE_LATE 5 CXCL10 MAP3K8 C XC L10 ATF4 CXCL1 C R E B3 MAP2K3 PIK3R5 C C L2 HALLMARK_ESTROGEN_RESPONSE_EARLY MMP9 TR AF3 RELA HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION CXCL2, CCL2, JAG1 Normalized read count ATF4 J AG1 HALLMARK_E2F_TARGETS C C L5 0 0 1 2 3 4 5 6 7 PIK3R 5 HALLMARK_DNA_REPAIR CFLAR Normalized read count_GSK761_1h LPS HALLMARK_COMPLEMENT -3 0 3 HALLMARK_COAGULATION e HALLMARK_CHOLESTEROL_HOMEOSTASIS HALLMARK_BILE_ACID_METABOLISM 60 DMSO_0h LPS HALLMARK_APOPTOSIS 0 HALLMARK_APICAL_SURFACE 60 DMSO_1h LPS HALLMARK_APICAL_JUNCTION 0 HALLMARK_ANGIOGENESIS 60 DMSO_4h LPS HALLMARK_ANDROGEN_RESPONSE HALLMARK_ALLOGRAFT_REJECTION 0

HALLMARK_ADIPOGENESIS Signal 60 GSK761_0h LPS

0h 4h 8h 0 60 GSK761_1h LPS Significant Enrichment Status -log10(pval) FALSE 2 Upregulated 1 0 1 60 GSK761_4h LPS TRUE Downregulated 2 0 3 0 -1 HLA-F HLA-DMB HLA-DOA HLA-DPA1 HLA-C HLA-B HLA-C HLA-F HLA-A -2 HLA-DMA HLA-DPA1 HLA-F HLA-A BRD2 HLA-DPA1 -3 BRD2 HLA-DPB1 BRD2 HLA-DPB2 BRD2 BRD2 Supplementary figure 8

bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license. GSK761 vs DMSO Expression_8h, ChIP_1h DBG and DEG DBG and non-DEG

0

-10 S CO2 CDH23 S OD2 ACAT2 LITAF NUP62 DRAM1 NFKBIA

-20

VIM-AS 1 PHF19 DB (peak signal) (peak DB -30

-40

-15 -10 -5 0 5 10 15 Differential expression (Wald sta�s�c)

GSK761 vs DMSO Expression_8h, ChIP_4h DBG and DEG DBG and non-DEG

0

-10

NUP62

IFITM10 CAMK2D

DB (peak signal) (peak DB RGL2 S LC35B2 S GK1 S CO2 -20 DMXL2 S OD2 GLUL

-30

-10 0 10 20 Differential expression (Wald sta�s�c) Supplementary figure 9 a b

30 80 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprintDMSO_0h LPS (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International0 license.

25 80 DMSO_1h LPS

0

20 80 DMSO_4h LPS

_DMSO_1h LPS 0

Signal 15 80 GSK761_0h LPS

0

80 10 GSK761_1h LPS 0

80 GSK761_1h LPS 5 0 Normalized read count

0 0 1 2 3 4 5 6 7 8 Normalized read count_GSK761_1h LPS c

80 80 80 DMSO_0h LPS

0 0 0

80 80 80 DMSO_1h LPS

0 0 0

80 80 80 DMSO_4h LPS

0 0 0 GSK761_0h LPS 80 80 80

Signal

0 0 0

80 80 80 GSK761_1h LPS

0 0 0

80 80 80 GSK761_1h LPS

0 0 0

d 3.0 Expression Up-regula�on Down-regula�on Low

2.5 High

2.0

value)

(p- 1.5

10

Log

- 1.0

5

0 -1.5 -1.0 -0.5 0.0 0.5 1.0 1.5 Logfold DMSO/GSK761 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made Supplemental tables. available under aCC-BY-NC 4.0 International license.

Supplemental table 1. Primer sequences used in the quantitative PCR analysis of the genes of interest.

Gene Forward (5’-3’) Reverse (5’-3’) SP140 AGGATGGTCGCAGAGATCCA TGGCCTTGTTATTGCACTTGC SP110 CAGCACACCTTCAGACAAGAA TCTCTACCGTGGAGTTACAAGTT SP100 TCCATGACAAATTGCCTCTCC GAGATGGGGAACCCGAAGG SP140L GGGCTGAACGGAGGTGTTT AGTGTCATAGACAAGTCCCTCAT BRD2 GTGGTTCTCGGCGGTAAG ACACCCCGGATTACATACCC BRD3 TTGGCAAACCTCATCTCAAA GATGTCCGGCTGATGTTCTC BRD4 GATGGTGCTTCTTCTGCTCC AGTCCAGCTCCTCTGACAGC BRDT TCCCTTGAACGTGGTACAGG GCAGGAGTTGTTGTATCTGCT BRD9 GCAATGACATACAATAGGCCAGA GAGCTGCCTGTTTGCTCATCA BAZ2A ATGGAAATGGAGGCAAACGAC GAGACCCGTTAGTGTAGAGGC BAZ2B ATGGAGTCTGGAGAACGGTTA AATGTCCACATGGGTTGATTGT EP300 TCTGGTAAGTCGTGCTCCAA GCGGCCTAAACTCTCATCTC PCAF CGAATCGCCGTGAAGAAAGC CTTGCAGGCGGAGTACACT CREBBP CAACCCCAAAAGAGCCAAACT CCTCGTAGAAGCTCCGACAGT CD64 ACCCCATACAGCTGGAAATC TTATCCTTCCACGCATGACA CD206 GGGTTGCTATCACTCTCTATGC TTTCTTGTCTGTTGCCGTAGTT CCL5 CCAGCAGTCGTCTTTGTCAC CTCTGGGTTGGCACACACTT

CCL22 CGCGTGGTGAAACACTTCTA GGATCGGCACAGATCTCCT TNF ATGTTGTAGCAAACCCTCAAGC GGACCTGGGAGTAGATGAGGT IL6 AGTGAGGAACAAGCCAGAGC GTCAGGGGTGGTTATTGCAT IL8 AAATTTGGGGTGGAAAGGTT TCCTGATTTCTGCAGCTCTGT ACTB AATGTGGCCGAGGACTTTGA TGGCTTTTAGGATGGCAAGG RPL37A CCAAACGTACCAAGAAAGTCGG GCGTGCTGGCTGATTTCAA TSS-TNF GGGACATATAAAGGCAGTTGTTGG TCCCTCTTAGCTGGTCCTCTGC TSS-IL6 AATGAAACCATCCAGCCATCC CAGAGACGGTGGTCCTCTGC bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made Supplemental table 2. GSK761available screen under against aCC-BY-NC human 4.0 BCPs International (Bromoscan license. assay) reveals no binding

at Kd of ≤ 30000 nM (for most of tested BCPs) and at Kd of ≤ 21000 nM for PBRM1(5), indicating a high degree of specificity of GSK761 for SP140

Compound Name DiscoveRx Gene Symbol Kd (nM) GSK761 ATAD2A 30000 GSK761 ATAD2B 30000 GSK761 BAZ2A 30000 GSK761 BAZ2B 30000 GSK761 BRD1 30000 GSK761 BRD2(1) 30000 GSK761 BRD2(1,2) 30000 GSK761 BRD2(2) 30000 GSK761 BRD3(1) 30000 GSK761 BRD3(1,2) 30000 GSK761 BRD3(2) 30000 GSK761 BRD4(1) 30000 GSK761 BRD4(1,2) 30000 GSK761 BRD4(2) 30000 GSK761 BRD4(full-length,short-iso.) 30000 GSK761 BRD7 30000 GSK761 BRD8(1) 30000 GSK761 BRD8(2) 30000 GSK761 BRD9 30000 GSK761 BRDT(1) 30000 GSK761 BRDT(1,2) 30000 GSK761 BRDT(2) 30000 GSK761 BRPF1 30000 GSK761 BRPF3 30000 GSK761 CECR2 30000 GSK761 CREBBP 30000 GSK761 EP300 30000 GSK761 FALZ 30000 GSK761 GCN5L2 30000 GSK761 PBRM1(2) 30000 GSK761 PBRM1(5) 21000 GSK761 PCAF 30000 GSK761 SMARCA2 30000 GSK761 SMARCA4 30000 GSK761 TAF1(2) 30000 GSK761 TAF1L(2) 30000 GSK761 TRIM24(Bromo.) 30000 GSK761 TRIM24(PHD,Bromo.) 30000 GSK761 TRIM33(PHD,Bromo.) 30000 GSK761 WDR9(2) 30000 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 1 Supplementary informationavailable (compound under aCC-BY-NC discovery 4.0 International and synthesis license.)

2 Screening of DNA encoded libraries (GSK761 discovery). Affinity screening of DNA Encoded 3 Library (DEL) was done using AntiFlag matrix tips (Phynexus). Before use, the AntiFlag matrix tips 4 were washed four times with 1x selection buffer and stored at 4° C. The SP140 protein (24 kDa) used 5 for the affinity selection was: 6His-Flag-Tev-SP140 (687-867). During the screening process, each 6 AntiFlag matrix tip was loaded with 5 μg of protein and the No target tips was treated with buffer 7 only. The following 1x selection buffer was used in the affinity selection: 50 mM HEPES (7.5), 150 8 mM NaCl, 0.1% Tween 20, 1 mM BME, 1 mg/mL sheared salmon sperm DNA (sssDNA, Ambion). 9 Three rounds of affinity selection were performed for each screening condition. No target controls 10 (buffer only) were run in parallel as a control.

11 Selection Round 1: Prior to initiating target selections, 5 μg of SP140 (aa 687-867) protein was 12 immobilized on prepared Anti-Flag resin tip (Phynexus), tips were washed four times with 100 μL of 13 1 x selection buffer. For each selection, 5 nM of DEL molecules (pool DEL 34-79) in 60 μL of 1 x 14 selection buffer was incubated with the immobilized SP140 (aa 687-867) by pipetting up and down 15 for 1h (RT). Tips were then washed eight times with 100 μL of 1x Selection buffer, and another 2 16 times with 100 μL of DNA free 1x Selection buffer. In order to release the bound DEL molecules off 17 the tip, a heat elution was performed by treating the tip in 60 μL of 1 x selection buffer (minus 18 sssDNA) for 12 minutes at 80 °C. Eluted samples were post cleared to remove denatured protein by 19 passing over a fresh IMAC resin tip to remove any denatured protein for 10 minutes at RT. This step 20 was repeated once. 1 μL of round 1 elution was retained to be used for qPCR. sssDNA and buffer 21 were added to bring the total volume of the eluted material to 60 uL to be used for next round of 22 selection.

23 Selection Round 2: The 2nd round selection was performed by binding 5 μg of fresh SP140 (aa 687- 24 867) protein to a fresh prepared Anti-Flag tip. The above selection procedure was repeated using the 25 eluted material from round 1. At the end of round 2, 5 μL of the elution was retained. The eluted 26 material was post cleared twice to remove denatured protein, as described above. sssDNA and buffer 27 were added to bring the volume to 60 μL in order to begin round 3 of the selection.

28 Selection Round 3: The above selection procedure was repeated with eluted material from round 2. 29 However, no post-clear step was performed for round 3 selection. At the end of round 3 selection, a 30 quantitative PCR was run to assess yield from each round of selection. Target and No target samples 31 from Round 3 elution were sequenced on an Illumina sequencer.

32 Cloning, expression and purification. DNA spanning the SP140 region encoding the BRD and PHD 33 domains (encoding SP140 amino acids 687-867) was prepared in a construct, which was subsequently bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license. 34 inserted into a construct that comprised His6 and FLAG (DYKDDDDK) affinity chromatography 35 peptides and a TEV protease cleavage site (ENLYFQ\S, “\” denotes the cleaved peptide bond). This 36 DNA sequence was then cloned into a ET11c vector (Bioduro) and then subsequently used to 37 transform E. coli BL21 CodonPlus (DE3) RIPL (Stratgene) in media containing 100 μg/mL ampicillin 38 and 34 μg/mL chloramphenicol as selection antibiotics. A culture in Luria Bertani (LB) medium was 39 grown at 30°C, 240 rpm, overnight in Erlenmeyer shake flasks to provide a seed for a larger 40 expression culture in auto-inducible Overnight Express medium (supplemented with 1% Glycerol and

41 0.1 mM ZnCl2), in shake flasks, at 37 °C and 200 rpm. Incubation temperature was reduced to 25°C 42 when O.D.600nm was 1.2. The culture was pelleted by centrifugation after a further 20h incubation 43 and stored at -80° C. For protein purification, a 40 g cell pellet was processed by mixing with 200 mL 44 of PBS, 50 mM Imidazole, 10% Glycerol, PIC III, 1 mg/mL lysozyme, Benzonase (Merck) pH7.4 and 45 left to stir for 30 minutes at 4°C. The resulting suspension was lysed by sonication on ice for 15 46 minutes (10 seconds on /10 seconds off) and then harvested by centrifugation for 90 minutes at 30000 47 rpm. The lysate supernatant was collected and then passed through a 20 mL HisTrap HP column (GE 48 Healthcare) and then washed back with buffer A. (Buffer A: PBS, 2 mM DTT, 10% Glycerol, pH 49 7.4). Bound SP140 was eluted using stepped protocol with buffer B (PBS, 2 mM DTT, 10% Glycerol, 50 500 mM Imidazole, and pH 7.4) yielding 434.4 mg of total protein. This pool was diluted down to 5 51 mScm using 20 mM Hepes, 2 mM DTT, 10% Glycerol, and pH 7.5 and further purified using 100 mL 52 Source15Q anion exchange. Elution of the SP140 was carried out using a segmented linear NaCl 53 gradient (0-50% buffer B over 20 column volumes (Cvs), 50-100% over 1 Cv). Three peaks were 54 obtained from the post anion exchange material (Peak 1, 2 and 3) and pooled separately. The pooled 55 fractions from absorbance peak 2 were further purified using gel filtration (Superdex 75 320 mL Cv), 56 which lead to a final yield of purified SP140 protein of 7.87 mg, in a buffer comprising 10 mM 57 Potassium phosphate, 100 mM NaCl, 0.5 mM TCEP, 5% Glycerol , pH7.4. 58 59 Compound synthesis. Unless otherwise stated, all reactions were carried using anhydrous solvents. 60 Solvents and reagents were purchased from commercial suppliers and used as received. Reactions 61 were monitored by LCMS. Silica flash chromatography was carried out using SP4 apparatus using 62 RediSep® pre-packed silica cartridges. Ion exchange chromatography was carried out using Biotage 63 Isolute cartridges and extracted organic mixtures were dried using Biotage PTFE hydrophobic phase 64 separator frits unless otherwise stated. NMR spectra were recorded at RT (unless otherwise stated) 65 using standard pulse methods on a Bruker AV-400 spectrometer (1H = 400 MHz, 13C = 101 MHz). 66 Chemical shifts are referenced to trimethylsilane (TMS) or the residual solvent peak, and are reported 3 67 in ppm. Coupling constants are reported in Hz and refer to JH-H couplings, unless otherwise stated. 68 Coupling constants are quoted to the nearest 0.1 Hz and multiplicities are given by the following 69 abbreviations and combinations thereof: s (singlet), d (doublet), t (triplet), q (quartet), m (multiplet), 70 br. (broad). LCMS analysis was carried out on a Waters Acquity UPLC instrument equipped with a bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 71 BEH or CSH column (50 mm xavailable 2.1 mm, under 1.7 a μmCC-BY-NC packing 4.0 diameter)International andlicense Waters. micromass ZQ MS 72 using alternate-scan positive and negative electrospray. Analytes were detected as a summed UV 73 wavelength of 210 – 350 nm. Two liquid phase methods were used: Formic: 40 °C, 1 mL/min flow 74 rate. Gradient elution with the mobile phases as (A) water containing 0.1% volume/volume (v/v) 75 formic acid and (B) acetonitrile containing 0.1% (v/v) formic acid. Gradient conditions were initially 76 1% B, increasing linearly to 97% B over 1.5 min, remaining at 97% B for 0.4 min then increasing to 77 100% B over 0.1 min. High pH: 40 °C, 1 mL/min flow rate. Gradient elution with the mobile phases 78 as (A) 10 mM aqueous ammonium bicarbonate solution, adjusted to pH 10 with 0.88 M aqueous 79 ammonia and (B) acetonitrile. Gradient conditions were initially 1% B, increasing linearly to 97% B 80 over 1.5 min, remaining at 97% B for 0.4 min then increasing to 100% B over 0.1 min. Mass directed 81 automatic purification (MDAP): High pH MDAP: The HPLC analysis was conducted on an Xselect 82 CSH C18 column (150 mm x 30 mm i.d. 5 μm packing diameter) at ambient temperature, eluting with 83 10 mM ammonium bicarbonate in water adjusted to pH 10 with ammonia solution (solvent A) and 84 acetonitrile (solvent B) using an elution gradient of between 0 and 100% solvent B over 15 or 25 min. 85 The UV detection was an averaged signal from wavelength of 210 nm to 350 nm. The mass spectra 86 were recorded on a Waters ZQ Mass Spectrometer using alternate-scan positive and negative 87 electrospray. Ionisation data was rounded to the nearest integer. 88 GSK761

89 N-(3-(2-(tert-Butoxy)ethyl)phenyl)-4-formylbenzamide

90

91 4-Carboxybenzaldehyde (2.3g, 15.32 mmol) was dissolved in Dichloromethane (DCM) (25 mL), then 92 oxalyl chloride (5.83 g, 46.0 mmol) and N,N’-dimethylformamide (DMF) (0.024 mL, 0.306 mmol) 93 were added and the mixture stirred till a colorless solution was obtained. This was evaporated in 94 vacuo and the residue redissolved in DCM (25 mL) and cooled in an ice bath. Pyridine (3.72 mL, 95 46.0 mmol) was added, followed by 3-(2-tertbutoxyethyl)aniline (2.96 g, 15.32 mmol) and the 96 resulting suspension stirred at 0°C to room temperature for 1h. 97 The reaction mixture was diluted with DCM (50 mL), then washed with water (50 mL) and brine (50 98 mL), dried and evaporated in vacuo to a dark brown gum. This was dissolved in DCM (10 mL) and 99 loaded onto a 100 g silica column, then eluted with 0-50% EtOAc/cyclohexane to give N-(3-(2-(tert- 100 butoxy)ethyl)phenyl)-4-formylbenzamide (2.75 g, 8.45 mmol, 55.2 % yield) as an amber gum. 101 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 102 1H NMR (CHLOROFORM-d, 400available MHz) under δ 10.13aCC-BY-NC (s, 1H), 4.0 International 8.0-8.1 (m, license 4H),. 7.83 (br s, 1H), 7.5-7.6 (m, 103 2H), 7.33 (t, 1H, J=7.6 Hz), 7.0-7.1 (m, 1H), 3.60 (t, 2H, J=7.6 Hz), 2.87 (t, 2H, J=7.3 Hz), 1.45 (s, 9H) 104 LCMS (2 min Formic): Rt = 1.12 min, [M - H]+ = 324

105 Methyl 2-(4-((3-(2-(tert-butoxy)ethyl)phenyl)carbamoyl)phenyl)-1-methyl-1H- 106 benzo[d]imidazole-5-carboxylate

107

108 N-(3-(2-(tert-butoxy)ethyl)phenyl)-4-formylbenzamide (2.6 g, 7.99 mmol) and methyl 4- 109 (methylamino)-3-nitrobenzoate (1.679 g, 7.99 mmol) were combined in ethanol (50 mL), then sodium 110 dithionite (2.78 g, 15.98 mmol) in water (20 mL) was added and the mixture was stirred at 70°C 111 overnight. The mixture was cooled and evaporated in vacuo to half volume, then extracted with 112 EtOAc (2 x 50 mL) and the organic layer dried and evaporated in vacuo to give a pale yellow solid. 113 The solid was dissolved in DCM (10 mL) and loaded onto a 100 g silica column, then eluted with 0- 114 80% EtOAc/cyclohexane and product-containing fractions evaporated in vacuo to give methyl 2-(4- 115 ((3-(2-(tert-butoxy)ethyl)phenyl)carbamoyl)phenyl)-1-methyl-1H-benzo[d]imidazole-5-carboxylate 116 (2.35 g, 4.84 mmol, 60.6 % yield)

117 1H NMR (CHLOROFORM-d, 400 MHz) δ 8.55 (s, 1H), 8.18 (s, 1H), 8.11 (dd, 1H, J=1.0, 8.6 Hz), 118 8.04 (d, 2H, J=8.1 Hz), 7.89 (d, 2H, J=8.1 Hz), 7.5-7.6 (m, 2H), 7.46 (d, 1H, J=8.6 Hz), 7.33 (t, 1H, 119 J=7.6 Hz), 7.09 (d, 1H, J=7.6 Hz), 3.98 (s, 3H), 3.93 (s, 3H), 3.60 (t, 2H, J=7.6 Hz), 2.87 (t, 2H, 120 J=7.3 Hz), 1.20 (s, 9H) 121 LCMS (2 min Formic): Rt = 1.14 min, [MH]+ = 486

122 2-(4-((3-(2-(tert-Butoxy)ethyl)phenyl)carbamoyl)phenyl)-1-methyl-1H-benzo[d]imidazole-5- 123 carboxylic acid

124

125 Methyl 2-(4-((3-(2-(tert-butoxy)ethyl)phenyl)carbamoyl)phenyl)-1-methyl-1H-benzo[d]imidazole-5- 126 carboxylate (2.3 g, 4.74 mmol) was dissolved in THF (10 mL), then LiOH (0.340 g, 14.21 mmol) in bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 127 water (5 mL) was added and theavailable solution under was aCC-BY-NC heated at 4.0 60 International°Cfor 2h, licensethen cooled. to room temperature 128 and evaporated in vacuo. The residue was dissolved in water (20 mL) and acidified with 2M HCl to 129 pH 4, then the resulting beige solid collected by filtration and the solid dried in the vacuum oven 130 overnight at 60°C to give 2-(4-((3-(2-(tert-butoxy)ethyl)phenyl)carbamoyl)phenyl)-1-methyl-1H- 131 benzo[d]imidazole-5-carboxylic acid (2.2 g, 4.67 mmol, 98 % yield).

1 132 H NMR (DMSO-d6, 400 MHz) δ 10.40 (s, 1H), 8.33 (d, 1H, J=1.0 Hz), 8.20 (d, 2H, J=8.6 Hz), 8.08 133 (d, 2H, J=8.6 Hz), 8.02 (dd, 1H, J=1.0, 8.6 Hz), 7.86 (d, 1H, J=8.6 Hz), 7.6-7.7 (m, 2H), 7.28 (t, 1H, 134 J=7.8 Hz), 7.02 (d, 1H, J=7.6 Hz), 4.01 (s, 4H), 3.54 (t, 2H, J=7.3 Hz), 2.75 (t, 2H, J=7.1 Hz), 1.14 (s, 135 9H) 136 LCMS (2 min Formic): Rt = 1.02 min, [MH]+ = 472 137 138 N-(3-(2-(tert-Butoxy)ethyl)phenyl)-2-(4-((3-(2-(tert-butoxy)ethyl)phenyl)carbamoyl)phenyl)-1- 139 methyl-1H-benzo[d]imidazole-5-carboxamide

140

141 To a mixture of 2-(4-((3-(2-(tert-butoxy)ethyl)phenyl)carbamoyl)phenyl)-1-methyl-1H- 142 benzo[d]imidazole-5-carboxylic acid (100 mg, 0.212 mmol) and 3-(2-(tert-butoxy)ethyl)aniline (62 143 mg, 0.321 mmol) in DMF (1 mL) were added HATU (121 mg, 0.318 mmol) and DIPEA (0.056 mL, 144 0.318 mmol) and the reaction mixture was stirred at room temperature for 3 hours. LCMS showed 145 complete consumption of starting material. The solution was then partitioned between EtOAc and 146 water. The organic layer was washed with water (3x 10 mL), dried over magnesium sulphate and 147 evaporated under vacuum. The sample was loaded in DCM and purified on a 10 g silica cartridge 148 eluting with 0-100% EtOAc-cyclohexane. The appropriate fractions were combined and evaporated in 149 vacuo to give the required product N-(3-(2-(tert-butoxy)ethyl)phenyl)-2-(4-((3-(2-(tert- 150 butoxy)ethyl)phenyl)carbamoyl)phenyl)-1-methyl-1H-benzo[d]imidazole-5-carboxamide (118 mg, 151 0.182 mmol, 86 % yield), as a colorless glass. 152 153 1HNMR (CHLOROFORM-d ,400MHz): δ (ppm) 8.28 (d, J=1.3 Hz, 1H), 8.01 - 8.06 (m, 3H), 7.94 - 154 7.99 (m, 2H), 7.89 (d, J=8.3 Hz, 2H), 7.56 (s, 2H), 7.52 - 7.56 (m, 2H), 7.50 (d, J=8.6 Hz, 1H), 7.27 - 155 7.36 (m, 2H), 7.01 - 7.10 (m, 2H), 3.94 (s, 3H), 3.58 (t, J=7.5 Hz, 4H), 2.86 (t, J=7.5 Hz, 4H), 1.19 (s, 156 18H) bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 157 LCMS (2 min Formic): Rt = 1.30available min, under[MH] a+CC-BY-NC = 647 4.0 International license. 158 159 GSK064

160 4-((2,2-Dimethyl-4-oxo-3,8,11-trioxa-5-azatridecan-13-yl)amino)-3-nitrobenzoic acid

161

162 N-Boc-2,2'-(ethylenedioxy)diethylamine (1.300 mL, 5.48 mmol) and diisopropylethylamine (2.781 163 mL, 16.25 mmol) were added to a stirred solution of 4-fluoro-3-nitrobenzoic acid (1.0033 g, 5.42

164 mmol) in ethanol (15 mL) at ambient temperature. The resulting mixture was stirred under N2 165 atmosphere at 80 °C for 3.5 hr. The solvent was evaporated and dried in vacuo to give the crude 166 product, 4-((2,2-dimethyl-4-oxo-3,8,11-trioxa-5-azatridecan-13-yl)amino)-3-nitrobenzoic acid 167 (3.5849 g, 8.67 mmol, 160 % yield) as an orange/yellow oil. 168 The crude product was used in the next step without further purification. 169 170 1H NMR (CHLOROFORM-d, 400 MHz) δ 8.92 (d, 1H, J=2.0 Hz), 8.43 (br t, 1H, J=4.8 Hz), 8.16 171 (dd, 1H, J=1.8, 8.8 Hz), 6.84 (d, 1H, J=8.6 Hz), 5.1 (br s, 1H), 3.81 (t, 2H, J=5.3 Hz), 3.6-3.7 (m, 172 4H), 3.5-3.6 (m, 4H), 3.3-3.4 (m, 2H), 1.40 (s, 9H) 173 LCMS (2 min Formic): Rt = 0.92 min, [MH+]= 414. 174 175 3-Amino-4-((2,2-dimethyl-4-oxo-3,8,11-trioxa-5-azatridecan-13-yl)amino)benzoic acid 176

177 178

179 Under N2 atmosphere ethanol (10 mL) was to added palladium, 10 wt. % (dry basis) on activated 180 carbon, wet, Degussa type E101 NE/W (0.3776 g, 3.55 mmol) in a hydrogenation flask. A solution of 181 the crude starting material 4-((2,2-dimethyl-4-oxo-3,8,11-trioxa-5-azatridecan-13-yl)amino)-3- 182 nitrobenzoic acid (3.548 g equivalent to 2.241 g, 5.42 mmol) dissolved in ethanol (15 mL) was then 183 added to the catalyst mixture under vacuum. The resulting mixture was placed under an atmosphere of 184 hydrogen gas at room temperature and pressure and stirred vigorously. Once hydrogen uptake had 185 ceased, the reaction mixture was filtered through Celite™ under nitrogen. The solvent was evaporated bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 186 from the filtrate in vacuo to leaveavailable a residue under thataCC-BY-NC was dried 4.0 International to give the license crude. product, 3-amino-4-((2,2- 187 dimethyl-4-oxo-3,8,11-trioxa-5-azatridecan-13-yl)amino)benzoic acid (3.141 g, 8.19 mmol, 151 % 188 yield), as a brown oil, which was used without purification in the next step. 189 LCMS (2 min Formic): Rt = 0.73 min, [MH+]= 384. 190 2-(4-Carboxyphenyl)-1-(2,2-dimethyl-4-oxo-3,8,11-trioxa-5-azatridecan-13-yl)-1H- 191 benzo[I]imidazole-5-carboxylic acid

192

193 Acetic acid (14 mL) was added to the crude 3-amino-4-((2,2-dimethyl-4-oxo-3,8,11-trioxa-5- 194 azatridecan-13-yl)amino)benzoic acid (2.350 g crude, equivalent to 1.557 g, 4.06 mmol) and 4- 195 carboxybenzaldehyde (0.6122 g, 4.08 mmol). The resulting mixture was stirred under nitrogen at 196 room temperature for 66 hr. The solvent was evaporated in vacuo to give a dark reddish-brown oil. 197 The oil was loaded onto a 100 g silica cartridge as a partial suspension in dichloromethane and then 198 purified by silica flash chromatography, eluting with 0-100 % of (1:10:89 acetic 199 acid:MeOH:DCM)/DCM. The required fractions were combined and the solvent evaporated in vacuo 200 to give the final product, 2-(4-carboxyphenyl)-1-(2,2-dimethyl-4-oxo-3,8,11-trioxa-5-azatridecan-13- 201 yl)-1H-benzo[I]imidazole-5-carboxylic acid (799.2 mg, 1.323 mmol, 32.6 % yield), as a brown 202 crunchy foam.

1 203 H NMR (400 MHz, METHANOL-d4) δ 8.62 (s, 1H), 8.43 – 8.37 (m, 2H), 8.26 (dd, J = 8.6, 1.5 Hz, 204 1H), 8.18 – 8.14 (m, 2H), 7.96 – 7.92 (m, 1H), 4.76 – 4.71 (m, 2H), 4.07 – 4.02 (m, 2H), 3.70 – 3.56 205 (m, 2H), 3.53 – 3.46 (m, 4H), 3.28 – 3.25 (m, 2H), 1.59 (s, 9H) 206 LCMS (2 min Formic): Rt = 0.78 min, [MH+]= 514 207 208 tert-Butyl (2-(2-(2-(5-((3-(2-(tert-butoxy)ethyl)phenyl)carbamoyl)-2-(4-((3-(2-(tert- 209 butoxy)ethyl)phenyl)carbamoyl)phenyl)-1H-benzo[d]imidazol-1- 210 yl)ethoxy)ethoxy)ethyl)carbamate 211 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

212 213 Diisopropylethylamine (0.400 mL, 2.337 mmol) was added to a solution of 3-(2-(tert- 214 butoxy)ethyl)aniline (181.3 mg, 0.938 mmol), 2-(4-carboxyphenyl)-1-(2,2-dimethyl-4-oxo-3,8,11- 215 trioxa-5-azatridecan-13-yl)-1H-benzo[d]imidazole-5-carboxylic acid (399.4 mg, 0.778 mmol) and 216 HATU (356.7 mg, 0.938 mmol) in DMF (5 mL) and the resulting mixture stirred at room temperature 217 under nitrogen for 150 min. The solvent was evaporated under a stream of nitrogen, citric acid (10 % 218 aq) (50 mL) was added and the aqueous phase extracted with EtOAc (3 x 50 mL). The combined 219 organic phases were filtered through a hydrophobic frit and the solvent evaporated in vacuo to give a 220 brown residue. The residue was dissolved in MeOH and loaded onto a 50 g silica cartridge, which was 221 then dried in vacuo and eluted with 30-100 % (1 % acetic acid in ethyl acetate)/cyclohexane. The 222 desired product and mono-coupled by-products co-eluted and fractions containing the desired 223 products were combined and the solvent evaporated in vacuo to give a pale yellow residue. 224 The resulting residue was dissolved in dichloromethane and loaded on to a 50g silica cartridge, then 225 eluted with 0-10 % MeOH/DCM to give tert-butyl (2-(2-(2-(5-((3-(2-(tert- 226 butoxy)ethyl)phenyl)carbamoyl)-2-(4-((3-(2-(tert-butoxy)ethyl)phenyl)carbamoyl)phenyl)-1H- 227 benzo[d]imidazol-1-yl)ethoxy)ethoxy)ethyl)carbamate (125.7 mg, 0.131 mmol, 16.83 % yield as a 228 pale yellow solid. 229 1H NMR (CHLOROFORM-d, 400 MHz) δ 8.3-8.4 (m, 1H), 7.9-8.2 (m, 6H), 7.6-7.7 (m, 5H), 7.3-7.4 230 (m, 2H), 7.07 (t, 2H, J=7.1 Hz), 4.8-5.0 (m, 1H), 4.4-4.6 (m, 2H), 3.92 (br s, 2H), 3.60 (t, 4H, J=7.6 231 Hz), 3.49 (br s, 4H), 3.3-3.4 (m, 2H), 3.1-3.2 (m, 2H), 2.8-2.9 (m, 4H), 1.43 (s, 9H), 1.21 (s, 18H) 232 LCMS (2 min Formic): Rt = 1.40 min, [MH+] = 864 . 233 234 1-(2-(2-(2-Aminoethoxy)ethoxy)ethyl)-N-(3-(2-(tert-butoxy)ethyl)phenyl)-2-(4-((3-(2-(tert- 235 butoxy)ethyl)phenyl)carbamoyl)phenyl)-1H-benzo[d]imidazole-5-carboxamide 236

237 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 238 Hydrogen chloride solution (4.0available M in dioxane,under aCC-BY-NC 2 mL, 8.004.0 International mmol) was license added. to tert-butyl (2-(2-(2-(5- 239 ((3-(2-(tert-butoxy)ethyl)phenyl)carbamoyl)-2-(4-((3-(2-(tert- 240 butoxy)ethyl)phenyl)carbamoyl)phenyl)-1H-benzo[d]imidazol-1-yl)ethoxy)ethoxy)ethyl)carbamate 241 (112.1 mg, 0.130 mmol) and the resulting mixture stirred for 2 min at room temperature. The reaction 242 was quenched with saturated sodium carbonate solution (2 mL), then water (3 mL) and EtOAc (5 mL) 243 were added and the phases separated. The aqueous phase was extracted with further ethyl acetate (2 x 244 5 mL) and the combined organic phases were filtered through a hydrophobic frit and the solvent 245 evaporated. The resulting residue was dissolved in methanol (1 mL) and purified by high pH MDAP 246 to give 247 1-(2-(2-(2-aminoethoxy)ethoxy)ethyl)-N-(3-(2-(tert-butoxy)ethyl)phenyl)-2-(4-((3-(2-(tert- 248 butoxy)ethyl)phenyl)carbamoyl)phenyl)-1H-benzo[d]imidazole-5-carboxamide (46.1 mg, 0.057 249 mmol, 44.2 % yield), was obtained as a very pale yellow glass. 250 251 1H NMR (CHLOROFORM-d, 400 MHz) δ 8.72 (s, 1H), 8.3-8.4 (m, 2H), 8.0-8.1 (m, 4H), 7.95 (dd, 252 1H, J=1.8, 8.3 Hz), 7.5-7.6 (m, 4H), 7.32 (dt, 2H, J=3.5, 7.8 Hz), 7.07 (t, 2H, J=7.8 Hz), 4.51 (t, 2H, 253 J=5.3 Hz), 3.91 (t, 3H, J=5.1 Hz), 3.60 (t, 6H, J=7.3 Hz), 3.5-3.6 (m, 4H), 3.39 (t, 2H, J=5.3 Hz), 2.88 254 (t, 4H, J=7.6 Hz), 2.80 (t, 2H, J=5.3 Hz), 1.21 (s, 18H) 255 LCMS (2 min high pH): Rt = 1.27 min, [MH+]= 764 256 257 3-(2-((1E,3E,5Z)-5-(3-(6-((2-(2-(2-(5-((3-(2-(tert-Butoxy)ethyl)phenyl)carbamoyl)-2-(4-((3-(2(tert- 258 butoxy)ethyl)phenyl)carbamoyl)phenyl)-1H-benzo[d]imidazol-1yl)ethoxy)ethoxy)ethyl)amino)6- 259 oxohexyl)-3-methyl-5-sulfo-1-(3-sulfopropyl)indolin-2-ylidene)penta-1,3-dien-1-yl)-3,3-dimethyl- 260 5-sulfo-3H-indol-1-ium-1-yl)propane-1-sulfonate, 3 Ammonia salt

261

262 Alexa Fluor 647 carboxylic acid succinimidyl ester (5mg) (Invitrogen) on 15th July 2011; Lot 263 890799; cat# A20106; mf not given; Exact Mass: not given; Molecular Weight: not given but cited as 264 "~1250". bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 265 Structure of AlexaFluor 647 determinedavailable under from aCC-BY-NC WO02/26891 4.0 International Compound license 9.. This structure contains two 266 pendant propyl sulfonic acids contrary to some reports suggesting that the molecule contains butyl 267 sulfonic acids. Salt form from patent example of tris(triethylamine) has MWt. 1259.67 which 268 approximates to the vendor's citation of "~1250". 269 270 3-(2-((1E,3E,5Z)-5-(3-(6-((2,5-dioxopyrrolidin-1-yl)oxy)-6-oxohexyl)-3-methyl-5-sulfo-1-(3- 271 sulfopropyl)indolin-2-ylidene)penta-1,3-dien-1-yl)-3,3-dimethyl-5-sulfo-3H-indol-1-ium-1- 272 yl)propane-1-sulfonate, 3 N,N-diethylethanamine salt (5 mg, 3.97 µmol) in anhydrous DMF (200 µL) 273 was added to 1-(2-(2-(2-aminoethoxy)ethoxy)ethyl)-N-(3-(2-(tert-butoxy)ethyl)phenyl)-2-(4-((3-(2- 274 (tert-butoxy)ethyl)phenyl)carbamoyl)phenyl)-1H-benzo[d]imidazole-5-carboxamide (5.4 mg, 7.07 275 µmol). Diisopropylethylamine (1.4 µL, 8.02 µmol) was added and the resulting mixture stirred at 276 room temperature in a vial wrapped in foil for 16 hr. Acetonitrile was added to increase the sample 277 volume to 1 mL. The sample was then purified by high pH MDAP to give 3-(2-((1E,3E,5Z)-5-(3-(6- 278 ((2-(2-(2-(5-((3-(2-(tert-butoxy)ethyl)phenyl)carbamoyl)-2-(4-((3-(2-(tert- 279 butoxy)ethyl)phenyl)carbamoyl)phenyl)-1H-benzo[d]imidazol-1-yl)ethoxy)ethoxy)ethyl)amino)-6- 280 oxohexyl)-3-methyl-5-sulfo-1-(3-sulfopropyl)indolin-2-ylidene)penta-1,3-dien-1-yl)-3,3-dimethyl-5- 281 sulfo-3H-indol-1-ium-1-yl)propane-1-sulfonate, 3 Ammonia salt (6.7 mg, 3.84 µmol, 97 % yield), as 282 a blue solid.

1 283 H NMR (METHANOL-d4, 400 MHz) δ 8.2-8.5 (m, 3H), 8.17 (d, 2H, J=8.6 Hz), 8.0-8.1 (m, 3H), 284 7.9-7.9 (m, 3H), 7.8-7.9 (m, 2H), 7.6-7.7 (m, 4H), 7.43 (dd, 2H, J=8.3, 10.9 Hz), 7.31 (t, 2H, J=7.8 285 Hz), 7.07 (dd, 2H, J=3.8, 7.3 Hz), 6.6-6.8 (m, 1H), 6.45 (br dd, 2H, J=5.3, 13.4 Hz), 4.57 (t, 2H, 286 J=4.8 Hz), 4.3-4.4 (m, 4H), 3.88 (t, 2H, J=5.1 Hz), 3.65 (dt, 6H, J=2.5, 7.1 Hz), 3.42 (br d, 6H, J=3.5 287 Hz), 3.1-3.2 (m, 2H), 2.9-3.1 (m, 5H), 2.8-2.9 (m, 5H), 2.2-2.3 (m, 5H), 1.92 (s, 2H), 1.73 (s, 4H), 288 1.70 (s, 7H), 1.4-1.5 (m, 2H), 1.17 (s, 18H), 1.1-1.2 (m, 1H), 0.83 (br d, 1H, J=3.0 Hz), 0.5-0.7 (m, 289 1H) 290 LCMS (2 min high pH): Rt = 0.88 min, [M+2H]2+= 804 291

292 GSK675 293 N-(3-(2-(tert-Butoxy)ethyl)phenyl)-4-chloro-3-nitrobenzamide

294 295 296 4-Chloro-3-nitrobenzoic acid (4194.8 mg, 20.81 mmol) was taken up into DCM (100 mL) in an ice- 297 bath under nitrogen. Oxalyl dichloride (5.45 mL, 62.4 mmol) was slowly added to the mixture bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 298 followed by DMF (2 drops), at availablewhich point under theaCC-BY-NC mixture 4.0 started International to slightly license .bubble. The reaction was 299 then allowed to stir at RT for 2.5h then the volatiles removed in vacuo. 300 301 The crude acid chloride was taken up into DCM (100 mL) in an ice-bath under nitrogen. 3-(2-(tert- 302 butoxy)ethyl)aniline (4023 mg, 20.81 mmol) was slowly added followed by pyridine (3.36 mL, 41.6 303 mmol). The reaction was allowed to stir at RT for 2h, then evaporated in vacuo to give a brown thick 304 oil. The crude was dissolved in DCM (40 mL) and washed with saturated aq sodium bicarbonate 305 solution (2 x 40 mL) then with hydrochloric acid (1M aq, 40 mL), dried with magnesium sulfate and 306 filtered under vacuum. The solvent was evaporated under vacuum to give N-(3-(2-(tert- 307 butoxy)ethyl)phenyl)-4-chloro-3-nitrobenzamide (8132.3 mg, 20.50 mmol, 99 % yield) as a brown 308 oil.

1 309 H NMR (DMSO-d6, 400 MHz) δ 10.46 (s, 1H), 8.64 (d, 1H, J=2.3 Hz), 8.27 (dd, 1H, J=2.1, 8.5 Hz), 310 7.97 (d, 1H, J=8.6 Hz), 7.6-7.7 (m, 2H), 7.2-7.4 (m, 1H), 7.03 (d, 1H, J=7.6 Hz), 3.52 (t, 2H, J=7.1 311 Hz), 2.74 (t, 2H, J=7.1 Hz), 1.12 (s, 9H) 312 LCMS (2 min Formic): Rt = 1.28 min, [M - H]+ = 375, 377 (1 Cl) 313 314 4-((2-((2-Aminoethyl)(methyl)amino)ethyl)amino)-N-(3-(2-(tert-butoxy)ethyl)phenyl)-3- 315 nitrobenzamide

316 317 318 N-(3-(2-(tert-Butoxy)ethyl)phenyl)-4-chloro-3-nitrobenzamide (1.03g, 2.73 mmol) and N1-(2- 319 aminoethyl)-N1-methylethane-1,2-diamine (0.480 g, 4.10 mmol) were stirred in DMF (10 mL) and 320 triethylamine (0.381 mL, 2.73 mmol) for 3h at 50°C, then the solution was cooled and diluted with 321 water (50ml), extracted with EtOAc (2 x 50 mL) and the organic layer washed with water (2 x 50 322 mL), dried and evaporated in vacuo. The residue was loaded onto a 50 g silica column and eluted 323 with 0-20% 2 M methanolic ammonia/DCM to give 4-((2-((2- 324 aminoethyl)(methyl)amino)ethyl)amino)-N-(3-(2-(tert-butoxy)ethyl)phenyl)-3-nitrobenzamide (0.85g, 325 1.858 mmol, 68.0 % yield) as a bright yellow gum 326

327 1H NMR (CHLOROFORM-d, 400 MHz) δ 8.7-8.8 (m, 1H), 8.67 (d, 1H, J=2.5 Hz), 8.52 (s, 1H), 8.0- 328 8.1 (m, 1H), 7.5-7.6 (m, 2H), 7.2-7.3 (m, 1H), 7.00 (d, 1H, J=7.6 Hz), 6.85 (d, 1H, J=9.1 Hz), 3.56 (t, bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 329 2H, J=7.3 Hz), 3.38 (q, 2H, J=5.6available Hz), under2.8-2.9 aCC-BY-NC (m, 4H), 4.0 2.7 International-2.8 (m, license2H), 2.4. -2.6 (m, 2H), 2.28 (s, 330 3H), 1.17 (s, 9H) 331 LCMS (2 min high pH): Rt = 1.10 min, [MH]+ = 458 332 333 tert-Butyl (2-((2-((4-((3-(2-(tert-butoxy)ethyl)phenyl)carbamoyl)-2- 334 nitrophenyl)amino)ethyl)(methyl)amino)ethyl)carbamate

335

336 4-((2-((2-Aminoethyl)(methyl)amino)ethyl)amino)-N-(3-(2-(tert-butoxy)ethyl)phenyl)-3-

337 nitrobenzamide (0.84g, 1.836 mmol) and Boc2O (0.426 mL, 1.836 mmol) were dissolved in DCM 338 (20 mL) and allowed to stand for 2h, then the solvent was evaporated in vacuo and the residue was 339 purified by chromatography on a 50 g silica column eluting with 0-10% 2 M methanolic 340 ammonia/DCM to give tert-butyl (2-((2-((4-((3-(2-(tert-butoxy)ethyl)phenyl)carbamoyl)-2- 341 nitrophenyl)amino)ethyl)(methyl)amino)ethyl)carbamate (0.85g, 1.524 mmol, 83 % yield) as a 342 bright yellow gum.

343 1H NMR (CHLOROFORM-d, 400 MHz) δ 8.7-8.9 (m, 1H), 8.69 (br s, 1H), 7.9-8.2 (m, 2H), 7.5-7.6 344 (m, 2H), 7.2-7.4 (m, 1H), 7.03 (d, 1H, J=7.6 Hz), 6.90 (d, 1H, J=9.1 Hz), 5.10 (br s, 1H), 3.58 (t, 2H, 345 J=7.3 Hz), 3.40 (q, 2H, J=5.6 Hz), 3.27 (q, 2H, J=5.6 Hz), 2.85 (t, 2H, J=7.6 Hz), 2.77 (t, 2H, J=5.8 346 Hz), 2.58 (t, 2H, J=5.8 Hz), 2.34 (s, 3H), 1.40 (s, 9H), 1.19 (s, 9H) 347 LCMS (2 min high pH): Rt = 1.41 min, [MH]+ = 558

348 1-(2-((2-Aminoethyl)(methyl)amino)ethyl)-N-(3-(2-(tert-butoxy)ethyl)phenyl)-2-(4-((3-(2-(tert- 349 butoxy)ethyl)phenyl)carbamoyl)phenyl)-1H-benzo[d]imidazole-5-carboxamide

350 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 351 tert-Butyl (2-((2-((4-((3-(2-(tertavailable-butoxy)ethyl)phenyl)carbamoyl) under aCC-BY-NC 4.0 International-2- license. 352 nitrophenyl)amino)ethyl)(methyl)amino)ethyl)carbamate (0.21g, 0.377 mmol) and N-(3-(2-(tert- 353 butoxy)ethyl)phenyl)-4-formylbenzamide (0.123 g, 0.377 mmol) were dissolved in ethanol (10 mL), 354 then a solution of sodium dithionite (0.197 g, 1.130 mmol) in water (3 mL) was added and the mixture 355 was heated at reflux for 3h. The solvent was evaporated in vacuo and the residue was partitioned 356 between EtOAc (10 mL) and water (5 mL), the organic layer dried and evaporated to give a pale 357 yellow gum. 358 The crude product was dissolved in DCM (3 mL) and HCl (1M in ether, 1 mL, 1.000 mmol) was 359 added, the mixture stirred for 2h, then evaporated in vacuo. A sample (20mg) of the crude product 360 was purified by high pH MDAP to give 1-(2-((2-aminoethyl)(methyl)amino)ethyl)-N-(3-(2-(tert- 361 butoxy)ethyl)phenyl)-2-(4-((3-(2-(tert-butoxy)ethyl)phenyl)carbamoyl)phenyl)-1H- 362 benzo[d]imidazole-5-carboxamide (10mg, 0.014 mmol, 3.62 % yield) as a pale yellow gum. 363 364 The remaining crude amine was purified by chromatography on silica (25 g column) eluting with 0- 365 20% 2 M methanolic ammonia/DCM to give 1-(2-((2-aminoethyl)(methyl)amino)ethyl)-N-(3-(2-(tert- 366 butoxy)ethyl)phenyl)-2-(4-((3-(2-(tert-butoxy)ethyl)phenyl)carbamoyl)phenyl)-1H- 367 benzo[d]imidazole-5-carboxamide (85 mg, 0.116 mmol, 30.8 % yield) as a colorless foam.

368 1H NMR (CHLOROFORM-d, 400 MHz) δ 8.48 (s, 1H), 8.27 (s, 1H), 8.14 (s, 1H), 8.03 (d, 3H, J=8.1 369 Hz), 7.96 (dd, 1H, J=1.5, 8.6 Hz), 7.87 (d, 3H, J=8.6 Hz), 7.5-7.6 (m, 7H), 7.3-7.4 (m, 3H), 7.07 (t, 370 2H, J=8.8 Hz), 4.40 (br t, 2H, J=6.6 Hz), 3.60 (dt, 5H, J=1.8, 7.5 Hz), 2.87 (t, 5H, J=7.3 Hz), 2.71 (t, 371 2H, J=6.3 Hz), 2.4-2.5 (m, 2H), 2.3-2.4 (m, 2H), 2.09 (s, 3H), 1.20 (s, 18H)

372 LCMS (2 min high pH): Rt = 1.27 min, [MH]+ = 733

373 N-(3-(2-(tert-Butoxy)ethyl)phenyl)-2-(4-((3-(2-(tert-butoxy)ethyl)phenyl)carbamoyl)phenyl)-1-(2- 374 (methyl(2-(6-(5-((3aS,4S,6aR)-2-oxohexahydro-1H-thieno[3,4-d]imidazol-4- 375 yl)pentanamido)hexanamido)ethyl)amino)ethyl)-1H-benzo[d]imid 376 azole-5-carboxamide

377 bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 378 1-(2-((2-Aminoethyl)(methyl)amino)ethyl)available under- NaCC-BY-NC-(3-(2-(tert 4.0- butoxy)ethyl)phenyl)International license. -2-(4-((3-(2-(tert- 379 butoxy)ethyl)phenyl)carbamoyl)phenyl)-1H-benzo[d]imidazole-5-carboxamide (15 mg, 0.020 mmol) 380 and 2,5-dioxopyrrolidin-1-yl 6-(5-((3aS,4S,6aR)-2-oxohexahydro-1H-thieno[3,4-d]imidazol-4- 381 yl)pentanamido)hexanoate (9.30 mg, 0.020 mmol) were dissolved in DMF (0.9 mL) and triethylamine 382 (8.56 µl, 0.061 mmol) and the solution was allowed to stand overnight. LCMS (2 min high pH ): Rt 383 = 1.21 min, [MH]+ = 1072. The solution was purified by high pH MDAP to give N-(3-(2-(tert- 384 butoxy)ethyl)phenyl)-2-(4-((3-(2-(tert-butoxy)ethyl)phenyl)carbamoyl)phenyl)-1-(2-(methyl(2-(6-(5- 385 ((3aS,4S,6aR)-2-oxohexahydro-1H-thieno[3,4-d]imidazol-4- 386 yl)pentanamido)hexanamido)ethyl)amino)ethyl)-1H-benzo[d]imidazole-5-carboxamide (8.5 mg, 7.93 387 µmol, 38.7 % yield) as a colorless solid.

1 388 H NMR (METHANOL-d4, 400 MHz) δ 8.39 (d, 1H, J=1.5 Hz), 8.19 (d, 2H, J=8.6 Hz), 8.04 (dd, 1H, 389 J=1.5, 8.6 Hz), 8.0-8.0 (m, 2H), 7.83 (d, 1H, J=8.6 Hz), 7.5-7.7 (m, 4H), 7.32 (dt, 2H, J=3.0, 7.8 Hz), 390 7.08 (t, 2H, J=6.8 Hz), 4.54 (t, 2H, J=6.1 Hz), 4.47 (dd, 1H, J=5.1, 8.1 Hz), 4.2-4.3 (m, 1H), 3.66 (t, 391 5H, J=7.3 Hz), 3.1-3.2 (m, 1H), 3.00 (t, 2H, J=6.6 Hz), 2.8-2.9 (m, 5H), 2.8-2.8 (m, 2H), 2.6-2.7 (m, 392 1H), 2.37 (t, 2H, J=6.3 Hz), 2.17 (t, 2H, J=7.3 Hz), 2.06 (t, 2H, J=7.6 Hz), 1.4-1.8 (m, 11H), 1.3-1.3 393 (m, 3H), 1.21 (s, 18H) 394 LCMS (2 min high pH ): Rt = 1.04 min, M/2+H observed 536 395 396 GSK306

397 N-(3-(2-(tert-Butoxy)ethyl)phenyl)-2-(4-((3-(2-(tert-butoxy)ethyl)phenyl)carbamoyl)phenyl)-1-(2- 398 (2-(2-(3',6'-dihydroxy-3-oxo-3H-spiro[isobenzofuran-1,9'-xanthen]-5- 399 ylcarboxamido)ethoxy)ethoxy)ethyl)-1H-benzo[d]imidazole-5-carboxamide

400

401 DMF (0.4 mL) and triethylamine (4.42 µL, 0.032 mmol) were added to 5-carboxyfluorescein N- 402 succinimidyl ester (5.1 mg, 10.77 µmol) and 1-(2-(2-(2-aminoethoxy)ethoxy)ethyl)-N-(3-(2-(tert- 403 butoxy)ethyl)phenyl)-2-(4-((3-(2-(tert-butoxy)ethyl)phenyl)carbamoyl)phenyl)-1H- bioRxiv preprint doi: https://doi.org/10.1101/2020.08.10.239475; this version posted August 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 404 benzo[d]imidazole-5-carboxamideavailable (13.5 under mg, a CC-BY-NC0.018 mmol) 4.0 International and the resulting license. mixture stirred at room 405 temperature in a vial wrapped in foil for 17.5 hr. The mixture was left to stand overnight, then diluted 406 with methanol to 1 mL and then the sample purified by MDAP (high pH) to give N-(3-(2-(tert- 407 butoxy)ethyl)phenyl)-2-(4-((3-(2-(tert-butoxy)ethyl)phenyl)carbamoyl)phenyl)-1-(2-(2-(2-(3',6'- 408 dihydroxy-3-oxo-3H-spiro[isobenzofuran-1,9'-xanthen]-5-ylcarboxamido)ethoxy)ethoxy)ethyl)-1H- 409 benzo[d]imidazole-5-carboxamide (6.8 mg, 5.76 µmol, 53.4 % yield), as an orange solid.

1 410 H NMR (METHANOL-d4, 400 MHz) δ 8.37 (dd, 2H, J=1.3, 11.9 Hz), 8.1-8.2 (m, 2H), 8.0-8.1 (m, 411 3H), 7.98 (dd, 1H, J=1.5, 8.6 Hz), 7.80 (d, 1H, J=8.6 Hz), 7.6-7.7 (m, 2H), 7.58 (br d, 2H, J=7.6 Hz), 412 7.2-7.3 (m, 2H), 7.21 (d, 1H, J=8.1 Hz), 7.0-7.1 (m, 2H), 6.6-6.8 (m, 4H), 6.5-6.6 (m, 2H), 4.57 (br t, 413 2H, J=4.8 Hz), 3.94 (br t, 2H, J=5.1 Hz), 3.63 (dt, 4H, J=2.0, 7.1 Hz), 3.5-3.6 (m, 8H), 2.81 (br t, 4H, 414 J=6.6 Hz), 1.18 (m, 18H) (5 exchangeable protons not seen). 415 LCMS (2 min high pH ): Rt = 0.98 min, [MH+]= 1122 416