The University of Manchester Research

Chromatin remodelling factor SMARCD2 regulates transcriptional networks controlling differentiation of neutrophil granulocytes DOI: 10.1038/ng.3833

Document Version Accepted author manuscript

Link to publication record in Manchester Research Explorer

Citation for published version (APA): Witzel, M., Petersheim, D., Fan, Y., Bahrami, E., Racek, T., Rohlfs, M., Puchalka, J., Mertes, C., Gagneur, J., Ziegenhain, C., Enard, W., Stray-Pedersen, A., Arkwright, P., Abboud, M. R., Pazhakh, V., Lieschke, G. J., Krawitz, P. M., Dahlhoff, M., Schneider, M. R., ... Klein, C. (2017). remodelling factor SMARCD2 regulates transcriptional networks controlling differentiation of neutrophil granulocytes. Nature Genetics, 49(5), 742-752. https://doi.org/10.1038/ng.3833 Published in: Nature Genetics

Citing this paper Please note that where the full-text provided on Manchester Research Explorer is the Author Accepted Manuscript or Proof version this may differ from the final Published version. If citing, it is advised that you check and use the publisher's definitive version.

General rights Copyright and moral rights for the publications made accessible in the Research Explorer are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights.

Takedown policy If you believe that this document breaches copyright please refer to the University of Manchester’s Takedown Procedures [http://man.ac.uk/04Y6Bo] or contact [email protected] providing relevant details, so we can investigate your claim.

Download date:10. Oct. 2021 Chromatin remodelling factor SMARCD2 regulates transcriptional networks controlling differentiation of neutrophil granulocytes

1,2 1 1 1 1 Maximilian Witzel , Daniel Petersheim , Yanxin Fan , Ehsan Bahrami , Tomas Racek , Meino

1 1§ 2 2,3 4 Rohlfs , Jacek Puchałka , Christian Mertes , Julien Gagneur , Christoph Ziegenhain , Wolfgang

4 5 6 7 8 Enard , Asbjørg Stray-Pedersen , Peter D. Arkwright , Miguel R. Abboud , Vahid Pazhakh , Graham

8 9 10 10 10 J. Lieschke , Peter M. Krawitz , Maik Dahlhoff , Marlon R. Schneider , Eckhard Wolf , Hans-Peter

Horny11, Heinrich Schmidt1, Alejandro A. Schäffer12 and Christoph Klein1,2

1Department of Pediatrics, Dr. von Hauner Children's Hospital, Ludwig-Maximilians-Universität München, Munich, Germany

2Gene Center Munich, Ludwig-Maximilians-Universität München, Munich, Germany

3Department of Informatics, Technical University of Munich, Munich, Germany

4Anthropology and Human Genomics, Department Biology II, Faculty of Biology, Ludwig-Maximilians- Universität München, Munich, Germany

5Norwegian National Unit for Newborn Screening, Oslo University Hospital, Oslo, Norway

6 University of Manchester, Department of Paediatric Allergy & Immunology, Royal Manchester Children's Hospital, Manchester, UK

7Department of Pediatrics and Adolescent Medicine, American University of Beirut Medical Center, Beirut, Lebanon

8Australian Regenerative Medicine Institute, Monash University, Clayton, Australia

9Medical Genetics and Human Genetic, Charite University Hospital, Berlin, Germany

10Molecular Animal Breeding and Biotechnology, Center Ludwig-Maximilians-Universität München, Munich, Germany

11Pathology Institute, Faculty of Medicine, Ludwig-Maximilians-Universität München, Munich, Germany

12National Center for Biotechnology Information, National Institutes of Health, Department of Health and Human Services, Bethesda, MD USA

§Deceased (in September 2015)

1

Correspondence:

Christoph Klein MD PhD

Department of Pediatrics

Dr. von Hauner Children’s Hospital

Ludwig-Maximilians-Universität München

Lindwurmstrasse 4

D-80337 Munich

Tel +49-89-4400 5 7700

Fax +49-89-4400 5 7703

Email [email protected]

Key words:

Hematopoiesis, primary immunodeficiency disorder, SWI/SNF complex, specific granule deficiency, leukemogenesis, congenital neutropenia

2 1 Abstract

2 Differentiation of hematopoietic stem cells follows a hierarchical program of

3 transcription factor regulated events1-3. Early myeloid cell differentiation is dependent

4 on PU.1 and CEBPA (CCAAT/enhancer binding alpha), late myeloid

5 differentiation is orchestrated by CEBPE (CCAAT/enhancer binding protein epsilon)4.

6 The influence of SWI/SNF (SWItch/Sucrose Non-Fermentable) chromatin

7 remodelling factors as novel master regulators of hematopoietic differentiation is only

8 beginning to be explored3,5,6. Here, we identify SMARCD2 (SWI/SNF related, matrix

9 associated, actin dependent regulator of chromatin, subfamily d, member 2) as a

10 critical regulator of myeloid differentiation in humans, mice, and zebrafish. Studying

11 patients from three unrelated pedigrees characterized by neutropenia, specific granule

12 deficiency, myelodysplasia with excess of blast cells and various developmental

13 aberrations, we identified three loss-of-function mutations in SMARCD2. Using mice

14 and zebrafish as model systems, we showed that SMARCD2 controls early steps in the

15 differentiation of myeloid-erythroid progenitor cells. In vitro, SMARCD2 interacts

16 with the transcription factor CEBPE and controls expression of neutrophil

17 stored in specific granules. Defective expression of SMARCD2 leads to transcriptional

18 and chromatin changes in acute myeloid leukemia (AML) human promyelocytic cells.

19 In summary, SMARCD2 is a key factor controlling myelopoiesis and a potential

20 tumor suppressor in leukemia.

21

22 Article

23

24 Differentiation of hematopoietic cells is controlled by transcription-factor mediated

25 instructive events and less well-defined permissive events orchestrated by a variety of

26 epigenetic modulators7,8. Dynamic chromatin remodelling adds another level of complexity.

3 27 Embedding of promoter DNA into nucleosome landscapes restricts accessibility of cognate

28 binding sites to transcription factors and restricts gene expression 9-11. The SWI/SNF complex

29 is composed of multimeric units that use energy derived from ATP hydrolysis to unwrap or

30 restructure nucleosomes12. SMARCD2 is a component of the SWI/SNF complex in

31 hematopoietic stem cells (HSC) and other hematopoietic cells6,13,14. The two paralogous

32 proteins, SMARCD1 (BAF60A) and SMARCD3 (BAF60C), control embryonic stem cell15

33 and heart muscle cell differentiation16, respectively.

34 Here, we investigated three independent pedigrees with four patients who presented as

35 neonates with delayed separation of umbilical cord and subsequently developed severe

36 bacterial infections associated with neutropenia, parasitosis or chronic diarrhea (Table S1).

37 Extrahematopoietic findings included mild to moderate developmental delay and dysmorphic

38 features (Figure S1, Table S1). The bone marrow of patients showed hypercellularity, paucity

39 of neutrophil granulocytes, dysplastic features (Figure 1), and progressive development of

40 myelodysplasia (Figure 1, S2). Neutrophil granulocytes were characterized by absence of

41 granules (Figure S3).

42 In search of the underlying genetic defect, we performed homozygosity mapping and whole

43 exome sequencing (WES), followed by Sanger sequencing of patients and family members

44 (see Supplementary Materials & Methods for details). Homozygosity mapping identified an

45 especially large perfect marker interval of over 50Mbp in family A on 17;

46 within this interval, family B had two non-adjacent perfect intervals spanning 1.8Mbp and

47 0.5Mbp. The asymptotic LOD scores for these intervals are +4.2 (+1.8 for A and +2.4 for B)

48 and peak observed LOD scores with a more realistic disease haplotype frequency of 0.05, are

49 3.0 (+1.2, +1.8). There were approximately 36 located in the two shared intervals,

50 including SMARCD2.

51

4 52 We identified distinct segregating homozygous mutations in SMARCD2 in all three pedigrees

53 (Figure 2a, b, c). Mutations are described by their putative effect on transcript SMARCD2-001

54 (ENST00000448276). Effects on hypothetical transcripts are shown in Table S2. At the DNA

55 level, the mutations in pedigrees A and C affect splice sites, while the mutation in pedigree B

56 is a duplication of 25bp, leading to a frameshift and premature termination (Table S2).

57 Western blot analyses showed an absence of SMARCD2 protein in patient cells (Figure 2d).

58 To confirm that the SMARCD2 mutations lead to a loss of function, we sequenced reverse

59 transcribed mRNA from patient cells (Figure 2e) and determined their putatively encoded

60 proteins. We then cloned 2 isoforms of patient AII.1 (AII.1a: p.Ile362Cysfs*3 and AII1b:

61 p.Ser394Argfs*1), one isoform of patient BII.1. (BII.1: p.Gln147Glufs*5) and one isoform of

62 patient CII.1 (CII.1: p.Arg73Valfs*8). FLAG-tagged expression vectors carrying mutated

63 SMARCD2 versions and a red fluorescence protein gene separated by an Internal Ribosomal

64 Entry Sequence (IRES.RFP) were transfected into 293T cells and were investigated for co-

65 immunoprecipitation with native SWI/SNF core members. As shown in Figure 2f, only the

66 wild-type version of SMARCD2 was able to co-precipitate with SMARCA4 (BRG1),

67 SMARCC2 (BAF170), SMARCC1 (BAF155), and SMARCB1 (BAF47); none of the mutant

68 versions were able to co-precipitate with any of these proteins, suggesting that the mutations

69 constitute loss-of-function alleles.

70

71 Since all SMARCD2-deficient patients had either been subjected to allogeneic HSCT or had

72 died due to their disease, primary SMARCD2-deficient hematopoietic stem cells were not

73 available for further experiments. To further study the role of SMARCD2 in neutrophil

74 differentiation, we established several in vivo and in vitro models.

75 As a first model, we used zebrafish (Danio rerio), in which smarcd2 (XP_692749.2) is the

76 ortholog of human SMARCD2. Using antisense morpholino-oligonucleotides (MOs), we

77 created Smarcd2-deficient zebrafish in two reporter strains with fluorescent neutrophil

5 78 granulocytes: Tg (mpx:EGFP)i114 (Figure S4a,b,c) and Tg (lyz:dsRed)nz50 (Figure 3c)17-19.

79 Smarcd2 MOs were designed to block either translation initiation (label ATG) or splicing

80 (labels SB1 and SB2, for MOs targeting splice donor and acceptor sites respectively) of

81 smarcd2. In both fish strains, there was a significant reduction in the number of neutrophil

82 granulocytes compared to controls at 72 hours post fertilization (hpf) for the ATG and SB1

83 MOs (Figure S4c, 3c). MO SB2, which failed to disrupt smarcd2 splicing (Figure S4a),

84 provided an additional negative control indicating specificity of the on-target smarcd2-MO

85 effect to reduce neutrophil abundance. Using CRISPR/Cas9 genome editing in zebrafish, we

86 created a frameshift mutant smarcd2 allele Smarcd21/1 (Figure S4d), which also showed

87 reduced granulocyte abundance at 72 hpf compared to wild-type controls (Figure 3a, b).

88 Collectively, these zebrafish models provide concordant evidence that a requirement for

89 SMARCD2 in neutrophil granulocyte differentiation is evolutionarily conserved.

90

91 A second in vivo model was generated by injection of Smarcd2+/- murine ES cells (KOMP

92 repository) into blastocysts and transferring them into pseudo-pregnant mice. Chimeric

93 offspring were mated with wild-type mice, resulting in Smarcd2+/- mice, which were

94 intercrossed (Figure S5a, b, f). We found that Smarcd2-/- embryos died late during fetal

95 development (Figure 5c, d) and are characterized by reduced size, pallor, and decreased

96 temporal vascularization (Figure 3d), suggestive of a compromised hematopoietic system.

97 However, we did find Mendelian ratios of Smarcd2-/- embryos at 14.5dpc (Figure S5d, e).

98 Flow cytometry analysis of fetal liver single cell suspensions showed comparable numbers of

99 hematopoietic stem cells (Figure S5e), yet a striking reduction in CD11b+Gr1+ neutrophil

100 granulocytes and CD11b+Ly6c+ monocytes in Smarcd2-/- embryos (Figure 3h,k).

101 To assess the differentiation capacity of hematopoietic stem cells, we next purified

102 CD45.2+Lin-Mac+/low Sca1+cKit+ (LSK) cells from wild-type, heterozygous and homozygous

103 fetal livers and performed colony-forming unit (CFU) assays, in vitro. In comparison to CFU

6 104 colonies derived from wild-type or heterozygous mice, Smarcd2-/- CFU colonies showed a

105 marked reduction in size and numbers (data not shown and Figure S6a) and maturation arrest

106 (Figure 3f). Smarcd2-/- myeloid CFU colonies, generated in the presence of myeloid cytokine

107 cocktail, were deficient in cell surface expression of CD11b, Gr1 and Ly6c (Figure S6b). A

108 block in myeloid differentiation was also seen when LSK cells (native) were exposed to either

109 GM-CSF, M-CSF, or G-CSF, suggesting that none of the corresponding cytokine-receptors

110 were able to induce myeloid cell growth (Figure 3i).

111

112 Aberrant hematopoiesis was not restricted to the myeloid compartment in Smarcd2-/-

113 embryos, but also affected erythroid differentiation. Fetal/umbilical cord blood cytology at

114 14.5dpc showed marked dysplastic changes in Smarcd2-/- erythropoiesis: In contrast to wild-

115 type embryos, characterized by normochromic, orthochromatic erythrocytes and presence of

116 few nucleated erythrocytes, Smarcd2-/- embryos showed extensive anisocytosis of

117 erythrocytes, multinucleated cells, perturbed mitosis and increased apoptosis (Figure 3e).

118 Furthermore, in vitro erythroid differentiation of LSK cells in the presence of rm SCF, rm IL-

119 3, rh IL-6, rh EPO hints towards a partial differentiation block or delay at the immature S1

120 stage, as determined by CD71/Ter119 expression20 in Smarcd2-/- GEMM colonies (Figure

121 3j,l). Taken together, murine SMARCD2-deficient hematopoietic cell differentiation is

122 characterized by a maturation arrest in myeloid and erythroid cells in vitro and in vivo,

123 reminiscent of the hematological phenotype in SMARCD2-/- patients.

124 Various previous studies identified that SWI/SNF complex members increase or decrease

125 primitive or definite hematopoiesis6. Here, we hypothesize that A) the functional effects of

126 SMARCD2 deficiency on granulopoiesis are due to its absence from SWI/SNF complexes, B)

127 SWI/SNF complexes that contain SMARCD2 have a specific role in granulopoiesis, and C)

128 mechanistically, SMARCD2 governs granulopoiesis via chromatin accessibility and

129 interaction with CEBPE.

7 130 To identify alterations in transcriptional networks controlling differentiation of fetal liver

131 hematopoietic stem cells, we isolated LSK cells from 5 Smarcd2+/+ and 9 Smarcd2-/- fetal

132 livers and profiled their transcriptome by RNA-sequencing. Among a total of 12,362 detected

133 genes, we found 4,290 to be differentially expressed at a False Discovery Rate (FDR, see

134 Material and Methods) lower than 10%; Smarcd2 showed the largest expression ratio among

135 all genes, as expected (Figure 3g, Table S4). Interestingly, the majority (79%) of the 605

136 genes with a relatively large difference (fold-change > 1.4, FDR<1%) were upregulated and

137 not downregulated. This had also been reported for embryonic fibroblasts deficient for

138 SMARCB1 (Snf5) and SMARCA4 (Brg1), two other members of the SWI/SNF complex11.

139 The upregulated genes were most enriched in categories related to membrane proteins,

140 including MHC complexes, immunoglobulin domains and G-protein coupled receptors that

141 included signalling pathways related to immunodeficiency and host defence (Table S5, Figure

142 S7a). A subset of CEBPE-dependent genes is also deregulated in Smarcd2-/- murine LSK cells

143 (Figure S7 b,c). Consistent with the finding that CpG island (CGI) promoters can facilitate

144 promiscuous induction without a requirement for SWI/SNF21, we found that genes containing

145 CGI promoters are significantly under-represented within the group of differentially

146 expressed genes (Fisher’s exact test, p = 0.00441004, odds ratio=0.71).

147 Thus, a considerable fraction of the genes that are found to be differentially expressed are

148 directly dependent on SWI/SNF and/or transcription factors.

149

150 Even though these experiments suggest that SMARCD2 orchestrates transcriptional networks

151 in early hematopoietic stem cells, they cannot directly explain the striking absence of

152 neutrophil granules and perturbed differentiation of mature neutrophils seen in SMARCD2-

153 deficient patients. To shed light on the mechanisms of SMARCD2 in late neutrophil

154 maturation, we set out to establish a human in vitro system to further study the function of

155 SMARCD2. We chose the promyelocytic cell line NB4 that is responsive to retinoic acid

8 156 signalling and can be differentiated toward mature neutrophil granulocytes in vitro. Since our

157 attempts to generate SMARCD2-deficient NB4 cells using CRISPR/Cas9 tools was

158 unsuccessful, we decided to make use of RNA interference to establish cell lines

159 characterized by lower SMARCD2 protein expression. We designed lentiviral shRNA

160 constructs expressing a SMARCD2-specific shRNA and the marker gene GFP, transduced and

161 flow-sorted NB4 cells for further analysis.

162

163 NB4 cells express SMARCD1, SMARCD2, SMARCD3 and CEBPE RNA/cDNA at detectable

164 levels (Figure 4a and 22). RNA expression of SMARCD2, but not of the family members

165 SMARCD1 and SMARCD3 was significantly reduced upon lentiviral expression of shRNA

166 directed against SMARCD2 (Figure 4a). The expression of CEBPE was not affected by

167 SMARCD2 knock down and increased after differentiation with all trans retinoic acid (ATRA)

168 (data not shown) as previously described (e.g.23). Next, we systematically analysed RNA

169 expression of genes encoding proteins that are expressed and stored in primary and specific

170 granules in neutrophil granulocytes. Interestingly, during differentiation with ATRA,

171 transcript levels of primary granule proteins cathelicidin (CAMP) and alpha-1-antitrypsin

172 (AAT) as well as specific granule proteins matrix metalloproteinase-8 (MMP8),

173 transcobalamin (TCN1) and lactoferrin (LTF), were all significantly reduced (Figure 4a) in

174 SMARCD2-deficient cells.

175 Mice with targeted mutations in Cebpe24 and human patients with rare mutations in CEBPE25

176 are characterized by specific granule deficiency and susceptibility to bacterial infections. In

177 view of these phenotypic similarities, we asked whether SMARCD2 controls the effects of

178 CEBPE. RNA-expression of CEBPE was not directly affected in SMARCD2-deficient cells.

179 As an alternative, we hypothesized that SMARCD2 may be relevant for recruiting CEBPE to

180 open chromatin and thus facilitating expression of CEBPE-dependent genes. Indeed, co-

181 expression and immune precipitation of HA-tagged CEBPE and Flag-tagged SMARCD2

9 182 suggested a direct protein-protein interaction of both proteins in mammalian cells (Figure 4c).

183 A functional link between SMARCD2 and CEBPE is further supported by our finding that

184 documented CEPBE-dependent genes (Table S3) are deregulated in the absence of

185 SMARCD2 in human (Figure 4g, S8c, d) and murine hematopoietic cells (Figure S7b, c).

186

187 The consequences of defective nucleosome positioning in dysfunctional SWI/SNF molecules

188 may be complex. We attempted to interrogate effects of SMARCD2 deficiency on global

189 chromatin accessibility using ATAC sequencing. We compared all genes that showed

190 differential chromatin accessibility in SMARCD2-knockdown cells with differentially

191 expressed genes determined by RNA-sequencing studies in undifferentiated and ATRA

192 differentiated promyelocytic leukemia cell line NB4. A specific subset of genes was found

193 deregulated in both assays, ATAC-Seq and RNA-Seq (Figure 4c-f), affecting vesicular

194 trafficking, migration and signalling. Differentially expressed genes in both, murine

195 transcriptome (Table S4) and human transcriptome (Table S6,S7), cluster significantly in

196 signalling pathways relevant to immune system functions (Figure S7a and Figure S8a, b,

197 respectively). Taken together, DNA accessibility studies, transcriptome studies and protein-

198 protein interaction studies suggest that SMARCD2 has a direct role to remodel the chromatin

199 and to mediate downstream effects partly by interaction with the myeloid transcription factor

200 CEBPE. In contrast to CEBPE deficiency, SMARCD2 deficiency causes not only absence of

201 specific granule expression, but also defects in early hematopoietic cells associated with

202 AML/myelodysplasia (Figure S2) as well as non-hematopoietic syndromic features (Figure

203 S1, Table S1).

204

205 In summary, our clinical and molecular characterization of a previously unrecognized human

206 disease reveals SMARCD2 as a key factor controlling transcriptional networks governing

10 207 hematopoietic stem cell differentiation and highlights the relevance for chromatin remodelling

208 in lineage specification in the hematopoietic system.

209

11 210 Patients, Materials, and Methods

211 Patients

212 Patients were referred by AS-P, PDA, and MA for genetic assessment of congenital

213 neutrophil deficiencies. The study was approved by the ethics committee of the University

214 Medical School of Hannover and the Faculty of Medicine at LMU, Munich. Patient

215 recruitment, genetic analysis, and data handling were done in accordance with the tenets of

216 the Declaration of Helsinki. Patients or their parents gave informed consent for the genetic

217 and functional studies and for publication of their pictures.

218

219 Hematology, biochemistry, and pathological bone marrow studies

220 Clinical laboratory-based assays, such as blood cell counting, were done by referring centers

221 according to good clinical practices. Bone marrow histological studies were performed on

222 paraffin-embedded samples provided by the referring clinical immunology centers.

223 Following standard histopathological procedures, specimens were cut by microtome (Leica)

224 and stained by SAKURA Tissue-Tek Prisma & Film Automated Slide Stainer (hematoxylin-

225 eosin) or BenchMark XT fully automated IHC/ISH staining instrument (immune

226 histochemistry). In addition to anti-lactoferrin antibody ab15811 (Abcam), antibodies against

227 myeloperoxidase #A0398 (Dako) , CD15 #PNIM1921 (Beckman Coulter), glycophorin C

228 #M0820 (Dako) and CD61 #760-4249 (Ventana/Roche) were used according to the

229 manufacturers’ instructions.

230

231 Homozygosity mapping and next generation sequencing

232 Patient AII.1 served as the index case. Patient BII.1, previously described as clinical case

233 report26 and patient BII.2 (not described) served as reference case for homozygosity mapping

234 using the Affymetrix 6.0 chip as in27.We searched for perfectly segregating intervals in the

235 SNP data using the software findhomoz28. To compute LOD scores, we assumed that the

12 236 parents of the affected individuals are second cousins as in29,30 because they are known not to

237 be first cousins and if they are more distantly related than second cousins, then the LOD

238 scores would be higher. Indeed, in the initial case report, family B was described as "non-

239 consanguineous"26. Asymptotic LOD scores, assuming the frequency of the disease-

240 associated marker haplotype decreases in the limit towards 0, were computed by hand using

241 the principle that each meiosis after the first contributes log102 to the score. We used

242 FASTLINK v. 4.1P31-33 to check the asymptotic scores and to compute scores with more

243 realistic marker allele frequencies. For LOD score computations, we assumed full penetrance

244 and a very rare disease associated allele. The scores shown here are computed for the two

245 families separately and summed.

246 Genomic DNA of the two parents and two affected children in family B was enriched for all

247 coding exons using Agilent’s SureSelect Human All Exon kit V3-50MB (Agilent

248 Technologies) according to the manufacturer’s protocol and subjected to sequencing on an

249 Illumina Genome Analyzer II. Short sequence reads were mapped to the human reference

250 genome GRCh37 with Novoalign and variants were detected as previously described34-36. For

251 each possible mutation found in family B, we designed a sequencing assay to test the affected

252 individual in family A (our index patient) for that mutation. Since this failed, we performed

253 high-throughput sequencing in family A, and identified a likely pathological variant in

254 SMARCD2: c.1181+1G>A (NM_001098426, ENST00000448276) confirmed by Sanger

255 sequencing. Sanger sequencing of SMARCD2 in family B revealed a large homozygous

256 insertion in patients BII.1 and BII.2 (c.414_438dup), segregating in family B.

257 Within our cohort of SCN patients, in patient CII.1, a homozygous mutation in SMARCD2

258 (c.401+2T>C) was identified by whole exome sequencing with SureSelect XT Human All

259 Exon V3 + UTRs kit according to the manufacturer’s instructions (Agilent Technologies)

260 using SOLiD 5500 next generation sequencing platform (LifeTechnologies) to an average

261 coverage depth of 100x (75 bp forward and 35bp reverse pair-end). Segregation of this

13 262 variant in family C was confirmed by Sanger sequencing. In all three families (A, B, C),

263 CEBPE and several other candidate genes were excluded (i.e., shown not to contain germline

264 biallelic mutations) by Sanger sequencing or whole exome sequencing (26, and new data, not

265 shown).

266

267 Sanger sequencing of SMARCD2

268 Human SMARCD2 isoform SMARCD2-001 (ENST00000448276) is consistently annotated

269 (CCDS45756) and was used as the reference sequence for specific sequence-based

270 experiments. Targeted sequencing included all 13 exons of ENST00000448276 as well as one

271 potential alternative Exon 1 derived from isoform SMARCD2-003 (ENST00000323347).

272 Throughout the text, mutations are described by their putative effect on transcript

273 SMARCD2-001 only. Effects on other transcripts are shown in Supplementary Table S2.

274 DNA was extracted from adherent cells or suspension cells with QIAamp DNA Blood Mini

275 Kit (Qiagen #51106) and used for further application. RNA was extracted with Qiagen

276 RNeasy Micro Kit #74004, RNase inhibitor ribolock (Thermo) was added, RNA was stored at

277 -80°C or used for cDNA transcription using High-Capacity cDNA Reverse Transcription Kit

278 # 4368813 (Life Technologies).

279 Sanger sequencing was performed on both gDNA and cDNA. The exonic regions of gDNA

280 were amplified by PCR reaction. Per reaction, 2.5µl HiFi buffer, 2.5µl dNTP 2mmol/l,

281 0.125µl HiFi polymerase, DMSO 1.25µl or betaine 5µl, 1.5µl primer forward/ reverse

282 10pmol/l, ≥20ng DNA, up to 25µl nuclease free water. The PCR reaction conditions were

283 95°C melting for 5-10’, followed by 35-40 cycles of loops consisting of 90°C melting for

284 30’’, 56°C annealing for 30’’, 72°C elongation for 30’’ (primer list for exons upon request) or

285 1’30’’ for amplification of full length cDNA exon 1 to exon 13 of ENST00000448276

286 (SMARCD2-FW GAGCGATGTCGGGCCGAG; SMARCD2-REV

287 ATCCCTGAGCAGTTAGGTCAGGCGAAT). The full length SMARCD2 transcript

14 288 amplification was performed with the aim to enrich all potential transcript variants conserved

289 in the N and C termini of SMARCD2-001 ENST00000448276. It includes mutated transcripts

290 of SMARCD2-001 (ENST00000448276) altered by exon skipping, intron retention or

291 insertion/ duplications. Bands were visualized on 1% agarose gel. Clean up of PCR products

292 was performed with ExoSAP-IT (Affimetrix AF 78202) or with QIAquick Gel Extraction

293 (Qiagen #28706). Sanger sequencing was performed in house on a ABI 3130xl Genetic

294 Analyzer or outsourced to Eurofins Company, Munich, Germany. Results were analyzed by

295 Seqman (DNASTAR) or ApE (M. Wayne Davis, Utah37) software.

296

297 Cell lines

298 Standard cell lines (NB4, 293T) were purchased from the German Collection of

299 Microorganisms and Cell Cultures (DSMZ). Patient cell lines were cultivated from skin

300 biopsies (fibroblasts) or peripheral blood after infection with Epstein-Barr virus (EBV) (the

301 term we use for the transformed cells lines is EBV-transformed lymphoblastoid cell lines).

302 Adherent cell lines 293T and fibroblasts from healthy donors and patients AII.1 and BII.1

303 were cultured in DMEM, supplemented with 10%FCS, 50U/ml penicillin, 50 µg/ml

304 streptomycin, 2mM L-Glu. Suspension cell lines NB4 and EBV-LCL were cultured in RPMI,

305 supplemented with 10% FCS, 50 U/ml penicillin, 50 µg/ml streptomycin, 2mM L-Glu, 10mM

306 HEPES buffer.

307

308

309 Plasmids and molecular cloning

310 SMARCD2 was amplified from a human healthy donor sample or from patients’ cDNA

311 (SMARCD2-FW GACGGGACGGAGCGATGT; SMARCD2-REV

312 GAGCAGTTAGGTCAGGCGAATT). Analysis on agarose gels revealed differences in

313 fragment size and/or number of fragments in patients versus healthy donor. Fragments were

15 314 gel extracted and cloned into an in-house, CMV driven plasmid modified from the

315 pCHGFPW plasmid38 using 5’ prime XhoI primer (with Kozak and FLAG tag: SMARCD2-

316 Xho1-FLAG-FW:

317 AAACTCGAGGCCACCATGGACTACAAAGACGATGACGACAAGTCGGGCCGAGGC

318 GCG) and 3’ prime SpeI primer (SMARCD2-SpeI-REV:

319 TTTACTAGTTTAGGTCAGGCGAATTCCC). Due to individual truncations at the 3´-

320 terminus of mutated proteins, cloning was restricted to 5´-Flag tags for mutated proteins.

321 SMARCD2-specific pGIPZ. shRNA constructs and pGIPZ non-silencing control were

322 purchased from Thermo Scientific (shRNA 1-3: clone ID V3LHS_300463; V3LHS_300461;

323 V3LHS_400374 and non-silencing control # RHS4531). The shRNA sequences were cloned

324 via Mlu1 and Xho1 restriction sites into pGIPZ.SF.GFP.2 plasmid (kindly provided by Axel

325 Schambach, MHH Hannover). Viral particles were produced in 293T cells with gag-pol,

326 VSVG and rev helper plasmids. NB4 cell lines were stably transduced, GFP sorted and

327 expanded. Knockdown efficiency was determined by expression of protein (Western blot) and

328 mRNA/cDNA (qPCR method, see section: Expression of neutrophil specific granule genes in

329 NB4 cells).

330

331 Immunoprecipitation experiments and western blotting:

332 Transfection of wild-type and mutant SMARCD2 Flag-tagged proteins or HA-tagged CEBPE

333 with calcium phosphate into semi-confluent 293T with 10µg plasmid/10cm dish was

334 performed. Cells were harvested on day 3, pelleted and lysed in freshly prepared RIPA Buffer

335 (450mM NaCl, 25mM TrisHCl pH7.5, 1mM EDTA, 1%NP40, 5% Glycerol, 25mM Na-

336 Pyrophosphate, 50mM Na-Fluoride, EDTA-free protease inhibitor (Roche)). For each

337 immune-precipitation one confluent 10cm dish with 293T cells has been used. Lysates were

338 cleared by centrifugation (21,000g x for 15 min at 4°C). IP with FLAG affinity gel (Sigma) or

339 HA affinity gel (Thermos Scientific) was performed overnight at 4°C on a rotating laboratory

16 340 wheel. Samples were washed 5 times in RIPA buffer. Elution was performed with FLAG

341 peptide 37.5µg (Sigma F3290-4MG) or boiling in 2x Laemmli sample buffer. Lysates were

342 analyzed by Western blot.

343 Similarly, SMARCD2 expression in fibroblasts and EBV-LCLs of healthy donor and patients

344 was analyzed. Cells were lysed with freshly prepared RIPA buffer. Lysates were cleared by

345 centrifugation (21,000 x g, 10 min, 4 °C). Protein quantification was performed with Bradford

346 reagent (Bio-Rad Laboratories) using ELISA plate readers (Synergy H1 Hybrid Reader,

347 BioTek; infinite M200, Tecan). Equal amounts of protein - achieved by Bradford

348 (SMARCD2 expression in healthy donor and patient cells) or counting of input cells (Immune

349 precipitation) were separated by sodium dodecyl sulphate polyacrylamide (SDS-PAGE) gel

350 electrophoresis and blotted onto polyvinyl difluoride (PVDF) membranes. The membranes

351 were blocked in PBS containing 0.1 % Tween-20 (PBS-T) supplemented with 5 % BSA or

352 5% non-fat dry milk for 2 hours, followed by incubation with primary antibodies overnight at

353 4 °C or for 1 h at room temperature in (PBS-T) with 5 % BSA or 5% non-fat dry milk.

354 Antibodies used included anti-Flag, mouse, clone M2, F1804-200ug (Sigma) or anti FLAG-

355 HRP, A8592-.2MG (Sigma), SMARCD2, mouse monoclonal antibody clone F-34, SC-

356 101162, (SCBT), GAPDH mouse monoclonal antibody clone 6C5, SC-32233 (SCBT), anti-

357 HA, rabbit, ab9110 (Abcam), anti BRG1, rabbit, clone EPNCIR111A, ab110641, (Abcam),

358 anti SMARCC2/BAF170, rabbit, #8829S (CST), anti SMARCC1/ BAF 155, rabbit, clone

359 D7F8S, #11956S (CST), anti SNF5/BAF47, rabbit, clone D9C2, #8745S (CST). After

360 washing the PVDF membranes in PBS-T, secondary horse-radish peroxidase conjugated

361 antibodies anti-mouse (BD Pharmingen) and anti-rabbit (CST) were added for 1h at room

362 temperature. After development with chemiluminescent substrate (Pierce ECL western

363 blotting substrate), digital images were acquired on a Chemidoc XRS Imaging System (Bio-

364 Rad Laboratories). Blots were stripped between detection of different antibody probes using

17 365 Restore Western Blot Stripping Buffer (Thermo Scientific). Data analysis was performed

366 using Image LabTM software (Bio-Rad Laboratories).

367

368 Mouse model

369 Generation of SMARCD2-deficient murine model

370 The C57BL/6 embryonic stem cell clone 11930A-F4 carrying a mutant Smarcd2 allele was

371 generated by Regeneron Pharmaceuticals and obtained from the KOMP repository

372 (www.komp.org). To generate Smarcd2-deficient mice, clonal embryonic stem cells were

373 injected into C57BL/6BrdCrHsd-Tyrc (albino) blastocysts and transferred to pseudo-pregnant

374 NMRI foster mothers. The resulting chimeras were crossed to C57BL/6 albino mice to

375 identify germ line transmission of the targeted allele and to produce mice heterozygous for the

376 mutation. F1 intercrosses of heterozygous mice resulted in Smarcd2+/+, Smarcd2+/-, and

377 Smarcd2-/- embryos/ mice, which were genotyped using standard PCR reaction conditions and

378 the primers for the wild-type allele: FW: CCATCTGTAACGAAATCCGATGCCC; REV:

379 TTATCCCTCAGGTTCCTGACAAGGC, amplicon size 264bp and for the knock-out allele:

380 FW: GAGTCTAGGGCCTTCTCTTCCTTGC; REV:

381 GCAGCCTCTGTTCCACATACACTTCA, amplicon size 569bp (see Fig S5).

382 Animals were maintained under specific pathogen-free conditions at 23°C, 65% humidity and

383 with 12h light/dark cycle and had free access to a standard rodent diet (V1534, Ssniff, Soest,

384 Germany) and water. All animal experiments were carried out in accordance with the German

385 Animal Welfare Act with permission from the responsible veterinary authority.

386

387 Flow cytometry (FACS)

388 For FACS analysis of fetal liver hematopoietic cells, single cell suspensions by

389 homogenization of fetal liver tissue with a 1ml Eppendorf pipette and Hank´s buffered salt

390 solution (HBSS) with 3% fetal calf serum (FCS) were prepared. Fetal liver cells were kept on

18 391 ice until the genotyping. For FACS analysis of CFU derived hematopoietic cells, CFU

392 colonies were picked after light microscopic evaluation and washed once in HBSS with 3%

393 FCS.

394

395 Fetal liver hematopoietic cells and CFU derived cells were stained with the following

396 fluorochrome- or biotin-conjugated monoclonal antibodies for 20 min on ice: anti-B220-

397 AlexaFluor®780 (eBioscience), anti-CD3ɛ-FITC(eBioscience), anti-CD19-

398 PeCy7(eBioscience), anti-Ter119-PE (BD Pharmigen), anti-Gr1-FITC (BD Pharmigen), anti-

399 Ly6c-PerCP-Cy5.5 (eBioscience), anti-Mac2/CD11b-biotin/- eFluor 450 (eBioscience) and

400 anti-CD71-FITC (BD Pharmigen). Cells stained with biotinylated monoclonal antibodies

401 were washed and incubated with Streptavidin-APC (eBioscience). Samples were acquired on

402 either FACSCanto or LSR II flow cytometer (BD), and data were analyzed using FlowJo

403 software (Tree Star). Fluorescence intensity plots are shown in log10 scales. Relative

404 abundances (percentage of parental gate) were analyzed by Prism software (GraphPad);

405 statistical center value: Mean, standard error: SEM, p-values, and two-tailed unpaired t-tests

406 were used.

407

408 Flow cytometry sorting

409 Murine fetal liver Lin-Sca1+ckit+ (LSK) early progenitor cells, were isolated to perform CFU

410 assays and RNA Sequencing (RNA-Seq). Individual embryos were genotyped, fetal liver

411 tissue was suspended in HBSS with 3% FCS, and LSK cells were defined as follows:

412 CD45.2+, lineage-, Mac1low/+, ckit+ and Sca1+ cells. Cells were then stained with anti-CD45.2-

413 FITC (BD Pharmingen), biotinylated lineage antibodies (anti-B220, -CD3, Gr-1, and -Ter-

414 119; all BD), anti-Mac1/CD11b-eFluor 450 (eBioscience), anti-CD117/c-kit-Alexa

415 Fluor®780 (eBioscience), and anti-Sca-1-PeCy7 (eBioscience). Biotinylated monoclonal

416 antibodies were labeled by incubation with Streptavidin-PerCP/PerCP-Cy5.5 (eBioscience).

19 417 LSK cells were sorted into Iscove's Modified Dulbecco's Medium (IMDM) with 3% FCS for

418 CFU assays or directly into 1%TritonX supplemented with RNAse Inhibitor (Promega) for

419 RNA-Seq. Cell sorting was performed using a FACS Aria III cell sorter and FACS Diva

420 software.

421

422 Colony Forming Unit Assays

423 Flow-sorted fetal liver LSK cells were washed and resuspended in 50µl IMDM without FCS.

424 Between 500-1500 LSK cells were plated per 35mm Petri dishes containing 1.3ml

425 MethoCult® (M3231 or M3434, Stem Cell Technologies). M3434 (rm SCF, rm IL-3, rh IL-6,

426 rh EPO) was used to examine erythro-myeloid maturation. M3231 (with addition of murine

427 G-CSF [50 ng/ml], murine GM-CSF [50ng/ml] or murine M-CSF [50ng/ml] (cytokines from

428 Peprotech)) was used to assess myeloid maturation to specific cytokines. CFU colonies were

429 assessed daily from day 3 onwards. Colony forming units (>20 cells) and lineage

430 differentiation potential were assessed using an inverted microscope (Axiovert-II, Zeiss) at

431 day 7 - 12. CFUs were photographed (data not shown), counted and analyzed by FACS.

432 Cytological assessment was performed by May-Grünwald-Giemsa stain after cytospin

433 centrifugation (Shandon Cytofunnel Thermo). CFU counts were normalized to LSK cell input

434 and analyzed by Prism software (GraphPad); statistical center value: Mean, standard error:

435 SEM, p-values, and two-tailed unpaired t-tests were used.

436

437 Mouse fetal blood cytology

438 Fetal blood was recovered from sacrificed embryos and washed in HBSS with 3% FCS.

439 Cytological assessment of equal numbers or nucleated cells was performed by cytospins

440 (Shandon Cytofunnel Thermo) and May-Grünwald-Giemsa stain. Blood cells were

441 morphologically assessed using an inverted microscope (Axiovert-II, Zeiss) and

442 photographed.

20 443 Zebrafish experiments

444 Zebrafish

445 Tg(mpx:EGFP)i114 19 and Tg(lyz:dsRed)nz50 17 strains were used. Fish were held in the

446 FishCore (Monash University) using standard practices. Embryos were held in egg water

447 (0.06 g/L salt (Red Sea, Sydney, Australia)) or E3 medium (5mM NaCl, 0.17 mM KCl, 0.33

448 mM CaCl2, 0.33 mM MgSO4, equilibrated to pH 7.0); from 12 hpf, 0.003% 1-phenyl-2-

449 thiourea (Sigma-Aldrich) was added to inhibit pigmentation. Embryos were held at 28°C in

450 an incubator (Thermoline Scientific) upon collection. Animal experiments followed NHMRC

451 guidelines (“Australian code of the care and use of animals for scientific purposes” 8th edition,

452 NHMRC, 2013) and were approved by the Monash University Animal Ethics Committees.

453

454 Morpholino knockdown experiments

455 Microinjection of morpholino oligonucleotides was performed as follows: Antisense

456 morpholino oligonucleotides (Gene Tools, LLC (Eugene, OR)) (Table S3) were resuspended

457 as stock in milli-Q water at 1 mM and microinjected at highest non-toxic concentration for

458 each morpholino (700 µM for all MO-smarcd2). 10-20 min post fertilization embryos were

459 collected in egg water and placed on a 4% agarose gel block aligned to grooves on the gel

460 surface. Microinjection of 1-cell embryos was performed using a standard microinjection

461 apparatus and large-bore needle, positioned at the border of cell and yolk sac.

462 Knockdowns of smarcd2 by the splice-blocking morpholino oligonucleotides were examined

463 by RT-PCR as follows: Whole embryo RNA was extracted using TRIzol® Reagent (Life

464 Technologies) and cDNA synthesis was performed using SuperScript™ III Reverse

465 Transcriptase (Invitrogen). Phusion High Fidelity DNA Polymerase (Thermo scientific) was

466 used for cDNA amplification. 50 µl PCR reaction was consisting of 1 µl Phusion DNA

467 Polymerase, 10 µl 5X Phusion HF Buffer, 1 µl dNTP (10 mM), 1 µl forward primer (10 µM),

468 1 µl reverse primer (10 µM), 2 µl RT reaction product (cDNA) and 34 µl of nuclease free

21 469 water. Biorad T100 thermal cycler with following program was used for amplification: 90

470 seconds at 95°C as initial denaturation, followed by 30 cycles of 30 sec at 95°C for

471 denaturation, 30 sec at 56 °C for annealing, 30 sec at 72 °C for extension, and final extension

472 at 72 °C for 5 min. Primer sequences in Table S3.

473 Total numbers of fluorescent neutrophils in digital images of control and morphant embryos

474 were manually counted at 72 hours post fertilization (hpf) using an Olympus MVX10

475 microscope fitted with an Olympus DP72 camera.

476

477 CRISPR/Cas9 mutagenesis in zebrafish model

478 Single guide RNA (sgRNA) synthesis for CRISPR mutagenesis

479 The zebrafish smarcd2 gene was mutated by CRISPR/Cas9 technology using the method of

480 Gagnon et al.39 . Briefly, the web tool “CHOPCHOP” (https://chopchop.rc.fas.harvard.edu/)39

481 was used to design gene-specific spacer sequences to contribute to two single guide RNAs

482 (sgRNAs) for smarcd2 targeting (named S1 and S2 in Table S3a). All CHOPCHOP results

483 were checked on zebrafish genome database by Ensembl genome browser. DNA templates

484 for sgRNA synthesis resulted from annealing two single-stranded DNA oligonucleotides

485 (Sigma Aldrich) followed by T4 DNA polymerase (NEB) fill-in, to make a full double-

486 stranded DNA oligonucleotide. Each for each sgRNA DNA template, one oligonucleotide

487 provides the site specific sequence (incorporating either S1 or S2) and the second “constant”

488 oligonucleotide one supplied the binding site for Cas9 enzyme. sgRNAs were generated by in

489 vitro transcription (mMESSAGE mMACHINE® SP6 or T7 Transcription Kit, Thermo Fisher

490 Scientific).Transcribed sgRNA was cleaned (Sephadex G-50 spin columns, Roche

491 Diagnostics) and its integrity was checked on 1% agarose (Bioline, BIO-41025) gel made in

492 0.5% TBE.

493

22 494 sgRNA microinjection

495 Individual sgRNAs (50-200 ng/µl) mixed with Cas9 Nuclease 20 µM (NEB) at a 1:1 ratio

496 were microinjected (500–1000 pg) into the cytoplasm of 1-cell stage Tg(mpx:EGFP)

497 embryos.

498

499 Genotyping of zebrafish

500 Smard2 locus genotyping was performed by DNA sequencing. DNA samples were extracted

501 from single embryos or fin clips of adult fish using the HotSHOT protocol40 and amplified by

502 PCR (primers and PCR conditions, see Table S3a) . Following gel electrophoresis, excised

503 bands (AccuPrep® Gel purification kit, BIONEER) were sequenced in the Micromon

504 sequencing facility (Monash University) using an “Applied Biosystems 3730s Genetic

505 Analyzer”. F0 genotyping documented sgRNA activity. F1 genotyping was used to identify

506 founders carrying mutated alleles. F2 genotyping assisted colony management and confirmed

507 the genotypes of all embryos contributing to the phenotype comparison.

508

509 Sequencing analysis of zebrafish results

510 Sequencing traces were analyzed in DNASTAR navigator (Version 2.2.1.1) and ApE (A

511 Plasmid Editor v.2.0.47, similar to37). Analysis of complex compound CRISPR/Cas9

512 genotypes required manual curation and interpretation of sequence chromatograms.

513

514 Phenotype analysis of zebrafish

515 EGFP-positive neutrophils in digital images of control and F2 Tg(mpx:EGFP) embryos of

516 various smarcd2 allelotypes were manually counted at 72 hours post fertilization (hpf) in the

517 tail distal to the tip of the yolk extension, which includes the leukocyte-rich caudal

518 hematopoietic tissue (CHT) using Olympus MVX10 microscope fitted with an Olympus

519 DP72 camera.

23 520 Descriptive and analytical statistics were prepared in Prism 5.0c (GraphPad Software Inc). p-

521 values are from two-tailed unpaired t-tests, statistical center value: Mean, error bar: standard

522 deviation.

523

524 Murine LSK transcriptome – RNA-Seq

525 Murine LSK cell populations were sorted into lysis buffer composed of 0.2 % Triton X-100

526 (Sigma) and 2 U/µl of RNase Inhibitor (Promega). ERCC spike- in controls (Life

527 Technologies) were added to the cell lysis mix at 1:1,000 dilution. RNA was cleaned up from

528 the crude lysate with Agencourt RNAclean XP SPRI beads (Beckman-Coulter). cDNA was

529 synthesized and pre-amplified from 5 µl of lysate as described elsewhere41. 0.7 ng of pre-

530 amplified cDNA was used as input for tagmentation by the Nextera XT Sample Preparation

531 Kit (Illumina), where a second amplification round was performed for 12 cycles. For each

532 sample, 5 ng of final library was pooled.10 pmol of the library pool was sequenced 1 x 50

533 bases on an Illumina HiSeq1500.

534

535 RNA-Seq data analysis

536 We chose a minimum sample size of n=5 according to recommendations of power in RNA-

537 Seq42. All sorted murine LSK samples were processed, none was excluded. The murine fetal

538 LSK samples/cell lysates were randomized for RNA-Seq library preparation by assigning a

539 random sample number. During analysis, samples had to be unblinded. Sequencing reads

540 were demultiplexed from the Nextera (i5 and i7) indices. Demultiplexed reads were aligned to

541 the mouse genome (mm10) and ERCC reference using NextGenMap43. Count data were

542 generated from mapped reads using feature Counts44 on Ensembl gene models (GRCm38.74).

543 To remove noise from genes with low expression levels, count data sets were subjected to

544 data-driven gene filtering using the HTSFilter R package45.

24 545 Differential expression (DE) analysis was done in the DESeq2 R package46. The full set of the

546 12362 detected genes, their estimated log2 fold-change and the adjusted p-value of the Wald

547 test is given in Table S4. For Figure 3g, we used the 50 genes showing the largest difference

548 between the two groups and applied hierarchical clustering gene-wise and sample-wise with

549 complete linkage based on Euclidian distances of variance stabilized counts of DE genes. We

550 displayed the two-dimensional hierarchical cluster results as a heatmap. The reference

551 expression value is the expression average of wild-type LSK cells. For testing enrichment of

552 functional categories we used upregulated (log2 fold change>0.5) and downregulated genes

553 (<-0.5) as the input list and all 12,362 detected genes as background list for functional

554 annotation clustering using DAVID47 . Results obtained using default parameters and a cutoff

555 of a 2-fold enrichment are shown in Table S5. Data deposition: GSE84703

556

557 Expression of neutrophil specific granule genes in NB4 cells

558 NB4 AML cells transduced with specific shRNA against SMARCD2 clone 1 V3LHS_300463

559 and clone 2 V3LHS_400374 or non-silencing control # RHS4531 were cultivate in complete

560 RPMI medium with ATRA 1µM (dissolved in DMSO), after 3 days RPMI medium

561 supplemented with ATRA was exchanged. Cells were analyzed on day 3 and 6. RNA was

562 extracted, cDNA was transcribed and expression levels of SMARCD1 (FW:

563 GTCAGATGCCGAGGATGGGGA; REV: GTGGCATCATATTTGGACAAGGCTG),

564 SMARCD2 (FW: ATCTCTTGGCTTTTGAGCGGAAGCTG; REV:

565 CTTGCTGGGACTGAACGTATTGGA), SMARCD3 (FW: GGAGCCGCAGTGCCAAGA;

566 REV: TAAGCCTGGGACTCGGGGAC), as well as granule genes LFT (FW:

567 GCTGGAGACGTGGCTTTTATCAGA; REV GTAACTCATACTCGTCCCTTTCAGC),

568 MMP8 (FW: CCGAAGAAACATGGACCAACACCTC; REV:

569 TGAGCGAGCCCCAAAGAATG), TCN1 (FW: CACATTTAGCACAGGAGAAGC; REV:

570 TGTTGGCAATTCCAGTCAT), CALM (FW: AGAAATCACCCAGCAGGGCAAA; REV:

25 571 GTATGGGGACAGTGACCCTCAACC), AAT (FW:

572 GAACTCACCCACGATATCATCACC; REV: TGGACAGTTTGGGTAAATGTAAGC)

573 normalized to GAPDH (FW: TGATGACATCAAGAAGGTGGTGAAG; REV:

574 TCCTTGGAGGCCATGTGGGCCAT) were detected by SYBR green based qPCR on an

575 ABI Step one plus cycler. Differential expression of genes was calculated by the ΔΔ Ct

576 method. Data points represent the relative fold change of shRNA clone 1 or 2 vs non-

577 silencing control and individual repeat differentiation experiments. Descriptive and analytical

578 statistics were prepared in Prism 5.0 (GraphPad Software Inc) and p-values are from two-

579 tailed unpaired t-tests, statistical center value: Mean, error bar: standard deviation.

580

581 RNA-Seq in differentiated NB4 AML cells

582 NB4 AML cells transduced with specific shRNA against SMARCD2 clone 1 V3LHS_300463

583 and clone 2 V3LHS_400374 or non-silencing control # RHS4531 were cultivate in complete

584 RPMI medium with ATRA 1µM (dissolved in DMSO) or DMSO-only control or 3 days.

585 RNA was extracted from 1,000,000 NB4 cells (shRNA treated cells (clone 1 V3LHS 300463

586 and clone 2 V3LHS 400374) and the control cells (RHS4531)) with or without ATRA

587 differentiation. Cells were extracted by GeneJET RNA Purification Kit (Thermo Fisher

588 Scientific) and RNA-Seq library preparation was performed with NEBNext® Ultra RNA

589 Library Prep Kit from Illumina® (#E7530 S) according to the manufacturer´s instructions and

590 sequenced on an Illumina NextSeq 500 at the Dr. von Hauner Children´s Hospital NGS

591 facility. 6 libraries were sequenced together using a Mid output cartridge (FC-404-2001, 150

592 cycles, paired-end sequencing) reaching approximately 2x 5 Gb per sample.

593

26 594 ATAC-Seq in differentiated NB4 AML cells

595 Assay for Transposase-Accessible Chromatin with high throughput sequencing was

596 performed as described previously48 .

597 NB4 cells (ACC207) were cultured as described above. DMSO (Sigma-Aldrich) was used as

598 a carrier for ATRA. Cells were kept in logarithmic growth and stimulated with 1µM ATRA

599 or DMSO as a control. After 72 hours, 50,000 cells per condition were harvested and nuclei

600 preparation was done as described48 . Isolated nuclei were treated with Tn5 transposase from

601 the Nextera DNA Library Preparation Kit (Illumina, Catalog # FC-140-1089) for 30 minutes,

602 300 rpm in a Thermomixer. Transposed DNA was purified with the Qiagen MinElute

603 Reaction Cleanup Kit (Qiagen, Catalog # 28204) and amplified with Illumina Tn5 compatible

604 barcoding primers (NEBNext Multiplex Oligos for Illumina,NEB). We ran a qPCR side-

605 reaction with 5µl of the previously amplified library to determine the minimum number of

606 additional PCR cycles needed (Primers: FW 5ʹ AATGATACGGCGACCACCGAGAT 3ʹ and

607 REV 5ʹ CAAGCAGAAGACGGCATACGA 3ʹ). Minimally PCR-amplified libraries were

608 again purified with the Qiagen MinElute Reaction Cleanup Kit. Libraries were analyzed on an

609 Agilent Bioanalyzer 2100 (High Sensitivity DNA Chip) and size selection for the fragments

610 was performed using AMPure beads.16 ATAC libraries were pooled and sequenced using a

611 Mid output cartridge (FC-404-2001, 150 cycles, paired-end sequencing) on a Illumina

612 NextSeq 500 reaching approximately 2x1.8 Gb per sample.

613

614 Human RNA-Seq and ATAC-Seq data analysis

615 Demultiplexed FASTQ files were generated using bcl2fastq v2.17 (BCL2FASTQ Conversion

616 Software 2.17, Illumina). The ATAC-Seq reads were mapped with BWA-MEM49 with default

617 parameters to the (GRCh37.p13). The RNA-Seq reads were mapped with

618 STAR (v2.5.0a) to the same genome in combination with the gene model annotation of

619 GENCODE 1950.

27 620 The R/Bioconductor51 package GenomicAlignments was utilized to generate count data for

621 the RNA-Seq data based on the gene level. The same gene model as in the alignment step was

622 used. Only reads mapping uniquely to one feature were counted by setting the

623 summarizeOverlaps function option to mode=‘Union’. Furthermore all other options were set

624 to FALSE (ignore.strand=FALSE, inter.feature=FALSE, fragments=FALSE). To reduce

625 noise from lowly expressed genes, only genes were kept, if the 95 % quantile of the coverage

626 across all samples was below 10 reads as suggested by DESeq246 package. DESeq2 was

627 utilized to carry out a differential expression analysis between shRNA treated cells (clone 1

628 V3LHS 300463 and clone 2 V3LHS 400374) and the control cells (RHS4531) of the

629 remaining 13,244 genes. Furthermore, a differential expression analysis was performed

630 between ATRA treated cells and DMSO treated cells as control. A significance cutoff of FDR

631 < 0.1 was applied to the results before further downstream analysis. The DESeq2 results of

632 the significantly differentially expressed genes are listed in Table S6 and S7 with their raw

633 counts and their normalized counts.

634 For the ATAC-Seq analysis we followed the proposed workflow by Lun and Smyth52. Hence

635 the reads were first pulled together from all samples. On all reads MACS253 was used with

636 default parameters to call peaks. The resulting peaks were used as genomic features to

637 generate count data for each sample with the R/Bioconductor package GenomicAlignment53.

638 The count data was again subjected to filtering steps to reduce the noise. To call significantly

639 differentially occupied genomic regions the R/ Bioconductor package EdgeR54 was utilized.

640 A significance cutoff of p value < 0.005 was applied to the results before further downstream

641 analysis. The EdgeR results of the significantly differentially occupied peaks are listed in

642 Table S6 and S7 with their raw counts and their normalized counts.

643 Heatmaps were plotted based on the log2-fold-changes. The dendrogram was obtained by

644 gene-wise and sample-wise hierarchical clustering with complete linkage. Fold changes and

28 645 p-values per gene are given in Table S6 (DMSO/undifferentiated) and Table S7

646 (ATRA/differentiated).

647

648 Pathway analysis

649 Pathway analysis of human and murine transcriptomes was carried out by Cytoscape 3.3.055

650 together with Reactome Functional Interaction plugin56,57. In brief, for pathway analysis,

651 differentially expressed murine genes (log2 FC -0.50.5, padj<0.1) or differentially

652 expressed human genes (padj<0.1) or intersections thereof with CEBPE target genes were

653 loaded into the Gene set / Mutation Analysis interface. Networks were generated with or

654 without linker genes as indicated. Spectral partition based network clustering according to58

655 was performed and individual spectral clusters where analyzed by Reactome Pathway

656 Enrichment5959. Abstraction of spectral clusters (Figure S7, S8) and lists of significantly

657 enriched pathways are shown (Table S3). For intersection of gene lists the Venny Venn online

658 tool60 was used.

659

660 Acknowledgements

661 We thank all medical and laboratory staff members involved in taking care of patients and

662 performing scientific experiments, in particular Raffaele Conca (FACS sorting) and Jens

663 Hinke (genomic facility), Peter Robinson and Stefan Mundlos for NGS expertise and

664 contribution to the identification of homozygous variants in family A and B. We thank

665 Bernward Zeller for clinical care of patient AII.1 and Anne Tierens for intial FACS and

666 histological workup of patient AII.1. We acknowlegde Julie Lessard for providing antibodies

667 and critical discussion.

668

669 The study has been supported by the European Research Council (ERC Advanced Grant

670 „Explore“), the Else-Kröner Fresenius Research School, the German Center for Infection

29 671 Research DZIF, the Deutsche Forschungsgemeinschaft (Gottfried Wilhelm Leibniz Program),

672 the German PID-NET (BMBF) and the Care-for-Rare Foundation.

673 VP was supported by a Monash International Postgraduate Research Scholarship (MIPRS)

674 and Monash Graduate Scholarship (MGS). GL was supported by NHMRC (1069284,

675 1044754). The Australian Regenerative Medicine Institute (ARMI) is supported by grants

676 from the State Government of Victoria and the Australian Government. This research was

677 supported in part by the Intramural Research Program of the National Institutes of Health,

678 NLM. WE and CZ were supported by the Deutsche Forschungsgemeinschaft (DFG) through

679 LMUexcellent and the SFB1243 (Subproject A14)

680

681 Contributions:

682 MW designed, performed and interpreted experiments, performed writing and editing of the

683 paper. DP performed ATAC-Seq and RNA-Seq, YF, EB TR, MR were involved in genomic

684 and biochemical analyses, JP led the computational biology efforts, CM, JG analysed ATAC-

685 Seq, RNA Seq, CZ and WE performed mouse LSK RNA-Seq and DGE analysis. AS-P, PDA,

686 MRA provided clinical care of patients, VP and GJL generated and analysed zebrafish

687 models, PMK analysed WES in intial patients. MD, MRS, EW generated mice. HPH

688 performed ICH analysis of bone marrow biopsies, HS provided expert clinical genetic

689 consulting, AAS guided bioinformatics studies and helped with writing, editing. CK designed

690 and guided the study, supervised MW, provided laboratory resources and wrote the paper.

691

692

693

694

30 695 Author information:

696 The authors declare no competing financial interests. Murine LSK RNA-Seq data has been

697 deposited here: GSE84703. Correspondence and requests for material should be addressed to

698 CK ([email protected])

31 699 References 700 701 1 Krumsiek, J., Marr, C., Schroeder, T. & Theis, F. J. Hierarchical differentiation of

702 myeloid progenitors is encoded in the transcription factor network. PloS one 6,

703 e22649, doi:10.1371/journal.pone.0022649 (2011).

704 2 Orkin, S. H. & Zon, L. I. Hematopoiesis: an evolving paradigm for stem cell biology.

705 Cell 132, 631-644, doi:10.1016/j.cell.2008.01.025 (2008).

706 3 Krosl, J. et al. A mutant allele of the Swi/Snf member BAF250a determines the

707 pool size of fetal liver hemopoietic stem cell populations. Blood 116, 1678-1684,

708 doi:10.1182/blood-2010-03-273862 (2010).

709 4 Friedman, A. D. Transcriptional control of granulocyte and monocyte

710 development. Oncogene 26, 6816-6828, doi:10.1038/sj.onc.1210764 (2007).

711 5 Griffin, C. T., Brennan, J. & Magnuson, T. The chromatin-remodeling enzyme BRG1

712 plays an essential role in primitive erythropoiesis and vascular development.

713 Development 135, 493-500, doi:10.1242/dev.010090 (2008).

714 6 Huang, H. T. et al. A network of epigenetic regulators guides developmental

715 haematopoiesis in vivo. Nature cell biology 15, 1516-1525, doi:10.1038/ncb2870

716 (2013).

717 7 Alvarez-Errico, D., Vento-Tormo, R., Sieweke, M. & Ballestar, E. Epigenetic control

718 of myeloid cell differentiation, identity and function. Nature reviews. Immunology

719 15, 7-17, doi:10.1038/nri3777 (2015).

720 8 Cedar, H. & Bergman, Y. Epigenetics of haematopoietic cell development. Nature

721 reviews. Immunology 11, 478-488, doi:10.1038/nri2991 (2011).

722 9 Cairns, B. R. The logic of chromatin architecture and remodelling at promoters.

723 Nature 461, 193-198, doi:10.1038/nature08450 (2009).

32 724 10 de la Serna, I. L., Ohkawa, Y. & Imbalzano, A. N. Chromatin remodelling in

725 mammalian differentiation: lessons from ATP-dependent remodellers. Nature

726 reviews. Genetics 7, 461-473, doi:10.1038/nrg1882 (2006).

727 11 Tolstorukov, M. Y. et al. Swi/Snf chromatin remodeling/tumor suppressor

728 complex establishes nucleosome occupancy at target promoters. Proceedings of

729 the National Academy of Sciences of the United States of America 110, 10165-

730 10170, doi:10.1073/pnas.1302209110 (2013).

731 12 Wilson, B. G. & Roberts, C. W. SWI/SNF nucleosome remodellers and cancer.

732 Nature reviews. Cancer 11, 481-492, doi:10.1038/nrc3068 (2011).

733 13 Buscarlet, M. et al. Essential role of BRG, the ATPase subunit of BAF chromatin

734 remodeling complexes, in leukemia maintenance. Blood 123, 1720-1728,

735 doi:10.1182/blood-2013-02-483495 (2014).

736 14 Jojic, V. et al. Identification of transcriptional regulators in the mouse immune

737 system. Nature immunology 14, 633-643, doi:10.1038/ni.2587 (2013).

738 15 Alajem, A. et al. Differential association of chromatin proteins identifies

739 BAF60a/SMARCD1 as a regulator of embryonic stem cell differentiation. Cell

740 reports 10, 2019-2031, doi:10.1016/j.celrep.2015.02.064 (2015).

741 16 Lickert, H. et al. Baf60c is essential for function of BAF chromatin remodelling

742 complexes in heart development. Nature 432, 107-112,

743 doi:10.1038/nature03071 (2004).

744 17 Hall, C., Flores, M. V., Storm, T., Crosier, K. & Crosier, P. The zebrafish lysozyme C

745 promoter drives myeloid-specific expression in transgenic fish. BMC

746 developmental biology 7, 42, doi:10.1186/1471-213X-7-42 (2007).

747 18 Liongue, C., Hall, C. J., O'Connell, B. A., Crosier, P. & Ward, A. C. Zebrafish

748 granulocyte colony-stimulating factor receptor signaling promotes myelopoiesis

33 749 and myeloid cell migration. Blood 113, 2535-2546, doi:10.1182/blood-2008-07-

750 171967 (2009).

751 19 Renshaw, S. A. et al. A transgenic zebrafish model of neutrophilic inflammation.

752 Blood 108, 3976-3978, doi:10.1182/blood-2006-05-024075 (2006).

753 20 Koulnis, M. et al. Identification and analysis of mouse erythroid progenitors using

754 the CD71/TER119 flow-cytometric assay. Journal of visualized experiments : JoVE,

755 doi:10.3791/2809 (2011).

756 21 Ramirez-Carrozzi, V. R. et al. A unifying model for the selective regulation of

757 inducible transcription by CpG islands and nucleosome remodeling. Cell 138,

758 114-128, doi:10.1016/j.cell.2009.04.020 (2009).

759 22 Iriyama, N. et al. Enhancement of differentiation induction and upregulation of

760 CCAAT/enhancer-binding proteins and PU.1 in NB4 cells treated with

761 combination of ATRA and valproic acid. International journal of oncology 44, 865-

762 873, doi:10.3892/ijo.2013.2236 (2014).

763 23 Tanaka, M., Gombart, A. F., Koeffler, H. P. & Shiohara, M. Expression of

764 bactericidal/permeability-increasing protein requires C/EBP epsilon.

765 International journal of hematology 85, 304-311, doi:10.1532/IJH97.05162

766 (2007).

767 24 Lekstrom-Himes, J. & Xanthopoulos, K. G. CCAAT/enhancer binding protein

768 epsilon is critical for effective neutrophil-mediated response to inflammatory

769 challenge. Blood 93, 3096-3105 (1999).

770 25 Lekstrom-Himes, J. A., Dorman, S. E., Kopar, P., Holland, S. M. & Gallin, J. I.

771 Neutrophil-specific granule deficiency results from a novel mutation with loss of

772 function of the transcription factor CCAAT/enhancer binding protein epsilon. The

773 Journal of experimental medicine 189, 1847-1852 (1999).

34 774 26 Wynn, R. F. et al. Intractable diarrhoea of infancy caused by neutrophil specific

775 granule deficiency and cured by stem cell transplantation. Gut 55, 292-293,

776 doi:10.1136/gut.2005.081927 (2006).

777 27 Kotlarz, D. et al. Loss-of-function mutations in the IL-21 receptor gene cause a

778 primary immunodeficiency syndrome. The Journal of experimental medicine 210,

779 433-443, doi:10.1084/jem.20111229 (2013).

780 28 Glocker, E. O. et al. Inflammatory bowel disease and mutations affecting the

781 interleukin-10 receptor. The New England journal of medicine 361, 2033-2045,

782 doi:10.1056/NEJMoa0907206 (2009).

783 29 Bohn, G. et al. A novel human primary immunodeficiency syndrome caused by

784 deficiency of the endosomal adaptor protein p14. Nature medicine 13, 38-45,

785 doi:10.1038/nm1528 (2007).

786 30 Hamada, T. et al. Lipoid proteinosis maps to 1q21 and is caused by mutations in

787 the extracellular matrix protein 1 gene (ECM1). Human molecular genetics 11,

788 833-840 (2002).

789 31 Cottingham, R. W., Jr., Idury, R. M. & Schaffer, A. A. Faster sequential genetic

790 linkage computations. American journal of human genetics 53, 252-263 (1993).

791 32 Lathrop, G. M., Lalouel, J. M., Julier, C. & Ott, J. Strategies for multilocus linkage

792 analysis in humans. Proceedings of the National Academy of Sciences of the United

793 States of America 81, 3443-3446 (1984).

794 33 Schaffer, A. A., Gupta, S. K., Shriram, K. & Cottingham, R. W., Jr. Avoiding

795 recomputation in linkage analysis. Human heredity 44, 225-237 (1994).

796 34 Krawitz, P. et al. Microindel detection in short-read sequence data. Bioinformatics

797 26, 722-729, doi:10.1093/bioinformatics/btq027 (2010).

35 798 35 Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics

799 25, 2078-2079, doi:10.1093/bioinformatics/btp352 (2009).

800 36 Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing

801 genomic features. Bioinformatics 26, 841-842,

802 doi:10.1093/bioinformatics/btq033 (2010).

803 37 Zhu, L. J., Holmes, B. R., Aronin, N. & Brodsky, M. H. CRISPRseek: a bioconductor

804 package to identify target-specific guide RNAs for CRISPR-Cas9 genome-editing

805 systems. PloS one 9, e108424, doi:10.1371/journal.pone.0108424 (2014).

806 38 Van Maele, B., De Rijck, J., De Clercq, E. & Debyser, Z. Impact of the central

807 polypurine tract on the kinetics of human immunodeficiency virus type 1 vector

808 transduction. Journal of virology 77, 4685-4694 (2003).

809 39 Gagnon, J. A. et al. Efficient mutagenesis by Cas9 protein-mediated

810 oligonucleotide insertion and large-scale assessment of single-guide RNAs. PloS

811 one 9, e98186, doi:10.1371/journal.pone.0098186 (2014).

812 40 Meeker, N. D., Hutchinson, S. A., Ho, L. & Trede, N. S. Method for isolation of PCR-

813 ready genomic DNA from zebrafish tissues. BioTechniques 43, 610, 612, 614

814 (2007).

815 41 Picelli, S. et al. Smart-seq2 for sensitive full-length transcriptome profiling in

816 single cells. Nature methods 10, 1096-1098, doi:10.1038/nmeth.2639 (2013).

817 42 Liu, Y., Zhou, J. & White, K. P. RNA-seq differential expression studies: more

818 sequence or more replication? Bioinformatics 30, 301-304,

819 doi:10.1093/bioinformatics/btt688 (2014).

820 43 Sedlazeck, F. J., Rescheneder, P. & von Haeseler, A. NextGenMap: fast and accurate

821 read mapping in highly polymorphic genomes. Bioinformatics 29, 2790-2791,

822 doi:10.1093/bioinformatics/btt468 (2013).

36 823 44 Liao, Y., Smyth, G. K. & Shi, W. featureCounts: an efficient general purpose

824 program for assigning sequence reads to genomic features. Bioinformatics 30,

825 923-930, doi:10.1093/bioinformatics/btt656 (2014).

826 45 Rau, A., Gallopin, M., Celeux, G. & Jaffrezic, F. Data-based filtering for replicated

827 high-throughput transcriptome sequencing experiments. Bioinformatics 29,

828 2146-2152, doi:10.1093/bioinformatics/btt350 (2013).

829 46 Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and

830 dispersion for RNA-seq data with DESeq2. Genome biology 15, 550,

831 doi:10.1186/s13059-014-0550-8 (2014).

832 47 Huang da, W., Sherman, B. T. & Lempicki, R. A. Systematic and integrative analysis

833 of large gene lists using DAVID bioinformatics resources. Nature protocols 4, 44-

834 57, doi:10.1038/nprot.2008.211 (2009).

835 48 Buenrostro, J. D., Wu, B., Chang, H. Y. & Greenleaf, W. J. ATAC-seq: A Method for

836 Assaying Chromatin Accessibility Genome-Wide. Current protocols in molecular

837 biology / edited by Frederick M. Ausubel ... [et al.] 109, 21 29 21-29,

838 doi:10.1002/0471142727.mb2129s109 (2015).

839 49 Li, H. Aligning reads, clone sequences and assembly contigs with BWAMEM.

840 http://arxiv.org/abs/1303.3997v2 (2013).

841 50 Harrow, J. et al. GENCODE: the reference human genome annotation for The

842 ENCODE Project. Genome research 22, 1760-1774, doi:10.1101/gr.135350.111

843 (2012).

844 51 Lawrence, M. et al. Software for computing and annotating genomic ranges. PLoS

845 computational biology 9, e1003118, doi:10.1371/journal.pcbi.1003118 (2013).

37 846 52 Lun, A. T. & Smyth, G. K. De novo detection of differentially bound regions for

847 ChIP-seq data using peaks and windows: controlling error rates correctly. Nucleic

848 acids research 42, e95, doi:10.1093/nar/gku351 (2014).

849 53 Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome biology 9,

850 R137, doi:10.1186/gb-2008-9-9-r137 (2008).

851 54 Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for

852 differential expression analysis of digital gene expression data. Bioinformatics 26,

853 139-140, doi:10.1093/bioinformatics/btp616 (2010).

854 55 Shannon, P. et al. Cytoscape: a software environment for integrated models of

855 biomolecular interaction networks. Genome research 13, 2498-2504,

856 doi:10.1101/gr.1239303 (2003).

857 56 Croft, D. et al. The Reactome pathway knowledgebase. Nucleic acids research 42,

858 D472-477, doi:10.1093/nar/gkt1102 (2014).

859 57 Milacic, M. et al. Annotating cancer variants and anti-cancer therapeutics in

860 reactome. Cancers 4, 1180-1211, doi:10.3390/cancers4041180 (2012).

861 58 Newman, M. E. Modularity and community structure in networks. Proceedings of

862 the National Academy of Sciences of the United States of America 103, 8577-8582,

863 doi:10.1073/pnas.0601602103 (2006).

864 59 ReactomeWiki. ReactomeFIViz.

865 http://wiki.reactome.org/index.php?title=ReactomeFIViz&oldid=7168 (2016).

866 60 Oliveros, J. C. Venny. An interactive tool for comparing lists with Venn's diagrams.

867 http://bioinfogp.cnb.csic.es/tools/venny/index.html (2007-2015).

868

38 869 Figure Legends

870 Figure 1: Bone marrow and peripheral blood cell analysis

871 (a-c) Bone marrow and peripheral blood cells from healthy donor. (a) Hematoxylin eosin

872 (HE) stained normal cellular bone marrow biopsy showing regular maturation of all

873 hematopoietic cell lineages and no infiltration by blast cells. Insert shows magnification of

874 normal hematopoiesis. (b) Giemsa stained healthy donor peripheral blood cells showing

875 mature, segmented neutrophil granulocytes (c) Giemsa stained healthy donor bone marrow

876 cells showing maturation of red and white blood cell lineages.

877 (d-f) Bone marrow and peripheral blood cells from patient AII.1. (d) HE stained bone

878 marrow biopsy with diffuse and compact infiltration by blast cells, absence of

879 megakaryocytes and erythroid islands. Insert shows immature neutrophilic cells. (e) Giemsa

880 stained bone marrow cells showing atypical mature neutrophilic cells with hypogranulated

881 cytoplasm, hyposegmented nuclei and Pseudo Pelger-Huët anomaly (PPHA) (black arrow

882 head). (f) Giemsa stained bone marrow cells showing left shifted neutrophilic granulopoiesis,

883 blast cells and PPHA (black arrow head)

884 (g-i) Bone marrow and peripheral blood cells from family B (g) HE stained bone marrow

885 biopsy from patient BII.2 (boy) showing a marked hypercellularity with subtotal depletion of

886 adipocytes and normal erythroid precursors. Diffuse infiltration by blast cells and starry sky

887 pattern with disseminated activated macrophages (empty arrow heads). Insert shows

888 immature neutrophilic cells. (h) Giemsa stained peripheral blood cells from patient BII.1 (girl)

889 showing blood smear with circulating atypical neutrophil cells and PPHA (black arrow head).

890 (i) Giemsa stained bone marrow cells from patient BII.1 (girl) showing left shifted atypical

891 neutrophilic granulocytopoiesis with increase of blast cells. PPHA (black arrow head) and

892 atypical mature neutrophils (empty arrow head).

39 893 (j-l) Bone marrow cells from patient CII.1. (j) HE stained bone marrow biopsy from patient

894 CII.1 showing marked hypercellularity with subtotal depletion of adipocytes and depletion of

895 normal erythrocytes. Diffuse and compact infiltration by blast cells, scattered activated

896 macrophages (empty arrow heads). Insert showing pleomorphic blast cells with round nuclei

897 and small but distinguishable nucleoli (black arrow heads). (k) Bone marrow anti GlycoC

898 staining demonstrates superseding of erythropoiesis (empty arrow heads) by blast invasion

899 (asterisks). (l) immunohistochemical anti CD-61 staining shows loosely scattered, small and

900 immature megakaryocytes (micro-megakaryocytes).

901

902

903

904

905

40 906 Figure 2: Identification of biallelic loss-of-function mutations in SMARCD2

907 (a, b, c.) Pedigrees and Sanger sequencing chromatogram of patient (Pat) compared to

908 reference sequence (Ref), and specification of homozygous mutations (Mut).

909 (d) Western blot revealing absence of SMARCD2 protein expression (molecular weight

910 58.9kDa, filled arrow head) in fibroblasts (healthy donors 1 (HD1) and healthy donor 2

911 (HD2), patients AII.1, patients BII.1) and in EBV-transformed B cell lines (Healthy donor

912 (HD), patient CII.1)

913 (e) SMARCD2 mRNA transcripts detected in patients cells, open reading frames are shown in

914 black. Healthy donor (HD) transcript ENST00000448276, CCDS45756, in comparison to

915 patients AII.1 (a: p.I362CfsX2, b: p.S394RfsX1, c: p.I362VfsX85), BII.1 (p.Q147EfsX4) and

916 CII.1 (p.R73VfsX8). See also Table S2.

917 (f) Immunoprecipitation showing defective binding of patient-specific SMARCD2-mutated

918 proteins to SWI/SNF core complex members BRG1, BAF170, BAF155, and BAF47. FLAG-

919 tagged SMARCD2 proteins (wildtype and mutants), expressed in 293T cells, were

920 immunoprecipitation using anti-FLAG tag. Native co-immunoprecipitation of SWI/SNF

921 complex members was visualized using Western blotting of input and immunoprecipitated

922 samples. Short exposure of FLAG stained membrane shows similar amounts of FLAG-

923 precipitated SMARCD2-WT and SMARCD2-AII1.a and SMARCD2-AII.1b proteins (no

924 shown); long exposure of FLAG stained membrane reveals presence of FLAG-IP precipitated

925 SMARCD2-WT, SMARCD2-AII1.a, SMARCD2-AII.1b, SMARCD2-BII.1 and SMARCD2-

926 CII.1 proteins.

927

928

41 929 Figure 3: Smarcd2 deficiency in genetic model organisms

930 (a-c) Defective neutrophil differentiation in zebrafish models

931 (a) Fluorescence image of Danio rerio strain Tg(mpx:EGFP)i114, smarcd2wt/wt and

932 smarcd21/1 (knock out). Reduced numbers of GFP expressing neutrophils are observed in the

933 smarcd21/1 mutant fish embryo. Numbers of fluorescence labled neutrophils were evaluated in

934 the caudal hematopoietic tissue per individual fish embryos.

935 (b) Statistics of neutrophil numbers in wildtype smarcd2wt/wt vs. knock out smarcd21/1

936 zebrafish. Numbers of fluorescence labled neutrophils were evaluated in the caudal

937 hematopoietic tissue per individual fish embryos. (Center value: mean and error bar: SD, p-

938 values: two-tailed unpaired t-tests.)

939 (c) Statistics of neutrophil numbers in Tg(lyz:dsRed)nz50 zebrafish at 72 hours post

940 fertilization after injection of morpholino oligonucleotides (unspecific control versus

941 translation start site blocker (ATG) and splice site blocker (SB1 and SB2) against smarcd2.

942 Data points represent numbers of fluorescence labeled neutrophils per individual fish

943 embryos. (Center value: mean and error bar: SD, p-values: two-tailed unpaired t-tests).

944 (d-l) Defective hematopoiesis in murine Smarcd2-deficient embryos

945 (d) Gross morphology of murine litter mate embryos Smarcd2+/+, Smarcd2 +/- and Smarcd2-/-

946 at 14.5dpc, (representative pictures chose out of 4 litters (Wt n=4, Ht n=10, Ko n=9)), (e)

947 Smarcd2+/+ and Smarcd2-/- blood cytology at 14.5dpc, May-Grünwald/Eosin 20x and 63x

948 showing an increase of dysplastic red cell precursors in Smarcd2-/- blood e.g. anisocytosis

949 (empty arrow head, 23x), increased mitosis (black arrow heads, 63x) and multinucleated cells

950 (empty arrow heads, 63x), 2 independent experiments in 2 litter, (f) May-Grünwald/Eosin

951 stained CFU cells derived from Smarcd2+/+, Smarcd2+/- and Smarcd2-/- hematopoietic stem

952 cells upon differentiation with SCF, Il3, Il6 show promyelocytic arrest. Murine neutrophils

42 953 (anular shaped nucleus) are absent in Smarcd2-/- colonies. (g) RNA-Seq analysis of

954 Smarcd2+/+ (n=5) and Smarcd-/- (n=9) 14-15dpc fetal liver LSK cell samples. Shown is a

955 heatmap of 50 significantly (padj < 0.1) differentially expressed genes with highest absolute

956 fold change. Each column represents a LSK sample of one embryo. Smarcd2+/+ is depleted in

957 all Smarcd2-/- samples (black arrow). Color key below heat map indicates range of log2 fold

958 changes. For detailed statistic methodology please refer to material/ method section.

959 (h, k) Representative FACS scatter blots of fetal liver cells stained for CD11b+Gr1+ neutrophil

960 granulocytes and CD11b+Ly6c+ monocytes and statistical analysis. (h) Here, Smarcd2+/+ fetal

961 liver blood cells express CD11b+ (21.3%) vs CD11b+Gr1+ (12.1%) and CD11b+Ly6c+ (10%),

962 Smarcd2+/- fetal liver blood cells express CD11b+ (33.1%) vs CD11b+Gr1+ (8.27%) and

963 CD11b+Ly6c+ (7.51%) and Smarcd2-/- fetal liver blood cells express CD11b+ (13.9%) vs

964 CD11b+Gr1+ (0.47%) and CD11b+Ly6c+ (1.1%). (k) statistical analysis of (h), CD11b+Gr1+

965 and CD11b+Ly6c+ cells are significantly reduced in Smarcd2-/- vs Smarcd2+/+ . Representation

966 of 2 experiments with 6 litters (Wt n=10, Ht n=14, Ko n=9) with center value: Mean, Error

967 bar: SEM p-values: two-tailed unpaired t-tests. Experiment was repeated 3 times with a total

968 of 8 litters.

969 (i) Count of myeloid colonies derived from Smarcd2+/+, Smarcd2+/- and Smarcd2-/- LSK cells

970 upon differentiation with myeloid cocktail M3434, GM-CSF, M-CSF or G-CSF, respectively.

971 CFU from LSK of Wt n=4, Ht n=5, Ko n= 5, derived from 5 litter, 3 independent

972 experiments, center value: Mean, Error bar: SD, p-values: two-tailed unpaired t-tests).

973 (j,l) FACS analysis of erythropoietic progenitors derived from Smarcd2+/+, Smarcd2+/- and

974 Smarcd2-/- CFU GEMM colonies after differentiation with M3434 (myeloid cocktail). (j)

975 FACS scatter blots show representative CD71/Ter119 distribution of erythropoietic cells from

976 GEMM colonies (l) Display shows distribution of of erythroid stages S0-S5 in 8 GEMM

977 colonies derived per each of 3 WT (i.e. 3 x 8 GEMM colonies), 2 HT (i.e. 2x 8 GEMM

43 978 colonies) and 3 KO (i.e. 3x8 GEMM colonies) from fetal liver LSKs (i.e. 64 data points).

979 Statistical analysis by 2-way ANOVA, center value: Mean, Error bar: SEM)

980

981

44 982 Figure 4: SMARCD2, granule formation and transcriptional regulation

983 (a) Relative mRNA expression of SMARCD1 (dots), SMARCD2 (dots), SMARCD3 (dots) and

984 primary granule genes (empty squares) LL37, AAT and secondary granule genes (filled

985 squares) MMP8, TCN1, LTF is shown. The human AML-NB4 cell line was lentivirally

986 transduced with either an unspecific control (CTRL) or 1 of 2 specific shRNAs against

987 SMARCD2. Data points show the relative expression of shRNA vs CTRL in 3 independent

988 experiments with 2 shRNAs (i.e. 6 data points) for SMARCD1 , SMARCD3, LL37, AAT,

989 MMP8, or in 4 independent experiments with 2 shRNAs (i.e. 8 data points) for SMARCD2

990 and LTF. SMARCD1, SMARCD2 and SMARCD3 expression levels were determined in

991 undifferentiated cells; granule gene expression was measured after differentiation with ATRA

992 1µM for 6 days. To describe the effects of SMARCD2 knock down, the relative expression

993 levels in all samples were compared to the relative expression of SMARCD1. Statistics: center

994 value: Mean, p-values: two-tailed unpaired t-tests. Relative expression of granule genes after

995 knock down and differentiation with ATRA 1µM for 3 days show similar results (data not

996 shown).

997 (b) Co- overexpression of CEBPE-N-HA and SMARCD2-N-FLAG shows protein-protein

998 interaction in vitro in 293T cells. SMARCD2-N-FLAG co-precipitates with HA-immune

999 precipitated CEBPE-N-HA and vice versa.

1000 (c-f) The intersection of differentially enriched genes (ATAC-Seq and RNA-Seq) in NB4

1001 knock down SMARCD2 vs. control is shown.

1002 (c) In undifferentiated NB4 cells (UD) a distinct subset of genes shows both, changes of

1003 chromatin compaction measured by ATAC-Seq and gene expression measured by RNA-Seq.

1004 (d) Fold change of transcription (FC RNA) and chromatin accessibility (FC ATAC) are

1005 indicated for genes, affected in both assays in undifferentiated NB4 cells. Color key (same for

45 1006 (d) and (f)) below heat map indicates range of log2 fold changes. For detailed statistic

1007 methodology please refer to material/ method section.

1008 (e) In differentiated NB4 cells (ATRA 1µM for 3 days), a distinct subset of genes shows both,

1009 changes of chromatin compaction measured by ATAC-Seq and gene expression measured by

1010 RNA-Seq.

1011 (f) Fold change of transcription (FC RNA) and chromatin accessibility (FC ATAC) are

1012 indicated for genes, affected in both assays in differentiated NB4 cells. Color key below heat

1013 map (d) indicates range of log2 fold changes (for (d) and (f)). For detailed statistic

1014 methodology please refer to material/ method section.

1015 (g) SMARCD2 regulates expression of CEBPE dependent genes. Intersection of differentially

1016 expressed genes in undifferentiated vs differentiated SMARCD2 knock down cells vs CEBPE

1017 targets is shown, for intersections see Table S3.

1018 For detailed statistical methodology of ATAC-Seq and RNA-Seq data analysis please refer to

1019 Materials and Methods section.

1020

46 Fig. 1 Bone marrow and peripheral blood cell analysis

a b

HD c

40X 60X

d e

Pat AII.1

f Giemsa

40X 60X

g h

Pat BII.2 i

40X 60X

j k

* * GlycoC 40X * Pat CII.1 l CD61 40X 40X 60X

HE Fig. 2 Identification of biallelic loss-of-function mutations in SMARCD2

AI.1 AI.2 BI.1 BI.2 CI.1 CI.2 a b c

AII.1 AII.2 BII.1 BII.2 CII.1 CII.2

Mut c.1181+1G>A) Mut c.414_438dup) Mut c.401+2T>C) Ref T T T A C C T A A Ref T T C CT Ref G G GG T A A G A G Pat T T T A T C T A A Pat T T C A G G T A G A A C C T T A T C T G C C A T C T T C C T Pat G G GG C A A G A G

d e HD1 HD2 BII.1 AII.1 HD CII.1 kDa HD 1 2 3 4 5 6 7 8 9 10 11 12 13

100 a 1 2 3 4 5 6 7 8 10 11 12 13 70 IVS9 55 AII.1 b 1 2 3 4 5 6 7 8 9 10 11 12 13 SMARCD2 40 IVS8 IVS9

c 1 2 3 4 5 6 7 8 9 10 11 12 13

BII.1 1 2 3 4 5 6 7 8 9 10 11 12 13

CII.1 1 2 3 4 5 6 7 8 9 10 11 12 13 40 GADPH 35

f INPUT FLAG - IP

pET.empty.RFP pET.SMARCD2_N.RFP pET.AII.1b_N.RFP pET.AII.1a_N.RFP pET.BII.1_N.RFP pET.CII.1_N.RFP pET.empty.RFP pET.SMARCD2_N.RFP pET.AII.1b_N.RFP pET.AII.1a_N.RFP pET.BII.1_N.RFP pET.CII.1_N.RFP empty kDa BRG1 240

BAF170 170 150 BAF155 170 150

BAF 47 40

70 55 Flag_HRP 40 Long 35 exposure 25

15

10

GAPDH 35 Fig. 3 Smarcd2 deficiency in genetic model organisms a b c ns

**** *** *** ** 150 250

200

Smarcd2 wt/wt 100 150 Neutrophils Neutrophils 100 # # 50 50

0 0 Smarcd2 1/1 WT/WT 1/1 CTRL ATG SB1 SB2

−2 −10 1 2 Value +/+ -/- d e f g Calcoco1Calcoco1 Stat2Stat2 H2−H2-T23T23 +/+ +/+ CttnCttn Gbp9 +/+ Gbp9 Gsn 20x 63x Gsn Slc35d3Slc35d3 MyclMycI H2−H2-ObOb -/- Gm15925Gm15925 +/- Gm15448Gm15448 Gm14548Gm14548 20x 63x Klrd1Klrd Ddx58Ddx58 +/- h Il10raIl10ra

21.3 10 5 Cd33 10 21.3% 105 Cd33 Gm16439 4 Gm16439 10 104 -/- P2ry12P2ry12 3 10 103 Slamf6 12.1 +/+ Slamf6

2 10 102 Ccl4Ccl4 0 0 12.1% 10% H2−H2-DMaDMa 2 3 4 5 0 10 10 10 10 0 102 103 104 105 i via 33.1 via 7.51 Ptger2 FL murine105 E14-5,2f,33.1% E16-5_45 FL11 Tube B myeloid_011.fcsFL murine105 E14-5,2f, E16-5_45 FL11 Tube B myeloid_011.fcs Ptger2 Event Count: 38000 Event Count: 38000 Ap3m2Ap3m2 4 104 10 Ndrg1 100 100 *** +/+ Ndrg1 3 -/- 103 10 +/- +/- Trbv4Trbv4 8.27 * CD11b CD11b 2 102 Cass4 10 80 80 -/- Cass4 0 0 8.27% 7.51% H2−H2-T24T24 2 3 4 5 0 102 103 104 105 0 10 10 10 10 ns ns * Lilra6 via 13.9 via 1.1 60 Lilra6 5 13.9% 5 60 FL murine10 E14-5,2f, E16-5_41 FL7 Tube B myeloid_007.fcsFL 10murine E14-5,2f, E16-5_41 FL7 Tube B myeloid_007.fcs Event Count: 35407 Event Count: 35407 * * ** Drc7Drc7 colonies 104 104 40 40 ItgaxItgax

103 103 -/- Ndrg2Ndrg2

0.47 # CFU Tmem45a 102 102 20 20 Tmem45a 0 0.47% 0 1.1% AldocAldoc

0 102 103 104 105 0 102 103 104 105 Sept4Sept4 via via 0 0 FL murine E14-5,2f, E16-5_44 FL10 Tube B myeloid_010.fcsFL murine E14-5,2f, E16-5_44 FL10 Tube B myeloid_010.fcs Event Count: 51539 Ly6c Event Count: 51539 Gr1 Rtn1 M3434 GM-CSF M-CSF G-CSF Rtn1 Akap5Akap5 M3434 M-CSF G-CSF GM-CSF Usp18Usp18 +/+ +/- -/- j k Rsad2Rsad2 50.5 6.91 5 10 5.21% 70.7%10 5 0.77% 69.9% 105 50.5% 25.3% 5.21 6.88 0.77S1-S52.41 merged to WT HT KO Dhx58Dhx58 4.74 7.69 ns 4 Irf7 10 104 104 S1 S2 S3 Irf7 2 70.7 69.9 40 ns +/+ 25.3 Pira2Pira2 3 3 10 10 103 WT

CD71 1.52 4.62 +/- Pirb : CD71 : CD71 : CD71 0.104 ht S4 Pirb 2 10 102 2 S0 -/- 10 0.0779 KO 30 Il7r Il7r 0.0632 S5 0 0 0 8.9 2.65 Tnip3Tnip3 2 3 4 5 2 3 4 5 cells 0 10 10 10 10 0 10 10 10 10 0 102 103 104 105 : TER119 : TER119 : TER119 ns ns Bnip3 Via Via Via Bnip3 Specimen_001_28-5 WT 2_013.fcs Specimen_001_28-6 HT 2_006.fcs Specimen_001_28-2 KO 6_042.fcs 20 Event Count: 3167 Ter119 l Event Count: 2079 Event Count: 3851 Ankrd37Ankrd37 100100 blood ns *** ns *** ns ns *** *** fl Smarcd2Smarcd2 % 10 Psmc5Psmc5 8080 +/+ Msr1Msr1 Colec12 cells +/- 0 Colec12 6060 -/- CD11b+ CD11b+ CD11b+ Gr1+ Ly6c+

4040 KO22.2 KO21.2 KO20.3 KO21.3 KO24.5 KO22.5 KO22.4 KO19.5 KO19.6 WT21.5 WT21.1 WT19.2 WT22.8 %CFU GEMM WT24.10 2020

-2−2 -1−10 0 1 1 2 2 Value 00 Calcoco1 S1 S2 S3 S4 S5 Stat2 S0 S1 S2 S3 S4 S5 H2−T23 non ery Cttn Gbp9 Gsn Slc35d3 Mycl H2−Ob Gm15925 Gm15448 Gm14548 Klrd1 Ddx58 Il10ra Cd33 Gm16439 P2ry12 Slamf6 Ccl4 H2−DMa Ptger2 Ap3m2 Ndrg1 Trbv4 Cass4 H2−T24 Lilra6 Drc7 Itgax Ndrg2 Tmem45a Aldoc Sept4 Rtn1 Akap5 Usp18 Rsad2 Dhx58 Irf7 Pira2 Pirb Il7r Tnip3 Bnip3 Ankrd37 Smarcd2 Psmc5 Msr1 Colec12 KO22.2 KO21.2 KO20.3 KO21.3 KO24.5 KO22.5 KO22.4 KO19.5 KO19.6 WT21.5 WT21.1 WT19.2 WT22.8 WT24.10 <) <) <) <) Fig. 4 SMARCD2, granule formation and transcriptional regulation

*** ** ** * a 1.5 * ns **

1.0 expression

relative 0.5

0.0 SMARCD1 SMARCD1 TCN1 SMARCD2 SMARCD2 SMARCD3 LL37 AAT MMP8 LTF

b INPUT FLAG-IP INPUT HA-IP pET.empty.GFP + - - - + - - - pET.empty.GFP + - - - + - - - SMARCD2 N-FLAG - + - + - + - + CEBPE N-HA - + - + - + - + SMARCD2 C-FLAG - - + - - - + - CEBPE C-HA - - + - - - + - CEBPE N-HA - - - + - - - + kDa SMARCD2 N-FLAG - - - + - - - + kDa

100 70 70 55 FLAG 55 FLAG 40 40

55 55 40 40 HA HA 35 35

40 40 GAPDH GAPDH 35 35

c d f g ATAC RNA RNA ATAC RNA ATAC UD ATRA

HDAC9 SORL1 LDLR MBNL3

AZU1 PALLD 307 12 204 149 61 713 UD IRS1 OTUD4 BCL7A 1 LIPA 5 21 TFRC SMAD3 DST DUSP6 287 PTPRN2 TGFA NRP1 GLTSCR1 MAML2

e NEO1 BLVRA CEBPE targets ATAC RNA TMSB4X UNC5B

HK2 KXD1 RNASET2 PAK1 245 16 780

ATRA SULF2

-3 -2 -1 0 1 2 3