Cell Stem Cell Resource

Identification of Regulatory Networks in HSCs and Their Immediate Progeny via Integrated Proteome, Transcriptome, and DNA Methylome Analysis

Nina Cabezas-Wallscheid,1,2,10 Daniel Klimmeck,1,2,3,10 Jenny Hansson,3,10 Daniel B. Lipka,4,10 Alejandro Reyes,3,10 Qi Wang,5,9 Dieter Weichenhan,4 Amelie Lier,2,6 Lisa von Paleske,1,2 Simon Renders,1,2 Peer Wu¨ nsche,1,2 Petra Zeisberger,1,2 David Brocks,4 Lei Gu,4,5,9 Carl Herrmann,5,9 Simon Haas,2,7 Marieke A.G. Essers,2,7 Benedikt Brors,5,8,9 Roland Eils,5,8,9 Wolfgang Huber,3,11 Michael D. Milsom,2,6,11 Christoph Plass,4,8,11 Jeroen Krijgsveld,3,11 and Andreas Trumpp1,2,8,11,* 1Division of Stem Cells and Cancer, Deutsches Krebsforschungszentrum (DKFZ), 69120 Heidelberg, Germany 2Heidelberg Institute for Stem Cell Technology and Experimental Medicine (HI-STEM gGmbH), 69120 Heidelberg, Germany 3European Molecular Biology Laboratory (EMBL), Genome Biology Unit, 69117 Heidelberg, Germany 4Division of Epigenomics and Cancer Risk Factors, DKFZ, 69120 Heidelberg, Germany 5Division of Theoretical Bioinformatics, Department of Bioinformatics and Functional Genomics, DKFZ, 69120 Heidelberg, Germany 6Junior Research Group Experimental Hematology, Division of Stem Cells and Cancer, DKFZ, 69120 Heidelberg, Germany 7Junior Research Group Stress-induced Activation of Hematopoietic Stem Cells, Division of Stem Cells and Cancer, DKFZ, 69120 Heidelberg, Germany 8German Cancer Consortium (DKTK), 69120 Heidelberg, Germany 9Institute for Pharmacy and Molecular Biotechnology (IPMB) and BioQuant, Heidelberg University, 69120 Heidelberg, Germany 10Co-first author 11Co-senior author *Correspondence: [email protected] http://dx.doi.org/10.1016/j.stem.2014.07.005

SUMMARY 2010; Wilson et al., 2009). They give rise to a series of multipotent progenitors (MPPs) with decreasing self-renewal potential, fol- In this study, we present integrated quantitative pro- lowed by differentiation toward committed progenitors and teome, transcriptome, and methylome analyses of he- more mature cells (Adolfsson et al., 2005; Forsberg et al., matopoietic stem cells (HSCs) and four multipotent 2006). MPPs have been subdivided immunophenotypically into progenitor (MPP) populations. From the characteriza- MPP1, MPP2, MPP3, and MPP4 populations based on a step- tion of more than 6,000 , 27,000 transcripts, wise gain of CD34, CD48, and CD135 as well as loss of CD150 and 15,000 differentially methylated regions (DMRs), expression (Wilson et al., 2008). However, despite recent efforts to characterize changes in expression and epigenome we identified coordinated changes associated with modifications that occur at distinct stages of differentiation early differentiation steps. DMRs show continuous (Gazit et al., 2013; Kent et al., 2009; McKinney-Freeman et al., gain or loss of methylation during differentiation, 2012; Bock et al., 2012), the distinct functional characteristics and the overall change in DNA methylation correlates and the molecular programs that maintain HSC self-renewal inversely with gene expression at key loci. Our data and drive progenitor differentiation are poorly characterized. reveal the differential expression landscape of 493 We have taken advantage of recent technological advances transcription factors and 682 lncRNAs and highlight enabling analysis of rare cell populations to establish compre- specific expression clusters operating in HSCs. We hensive mass spectrometry-based proteome, transcriptome also found an unexpectedly dynamic pattern of tran- (RNA sequencing [RNA-seq]), and genome-wide DNA methyl- script isoform regulation, suggesting a critical regula- ome (tagmentation-based whole genome bisulfite sequencing, tory role during HSC differentiation, and a cell cycle/ TWGBS) data for HSCs and MPPs. We provide a comprehensive insight into the molecular mechanisms that are dynamically DNA repair signature associated with multipotency regulated during early HSC commitment through the MPP1– in MPP2 cells. This study provides a comprehensive MPP4 populations. We uncovered molecular changes at the pro- genome-wide resource for the functional exploration tein, RNA, and DNA levels as they occur in vivo in the context of of molecular, cellular, and epigenetic regulation at physiologic commitment processes. the top of the hematopoietic hierarchy.

RESULTS INTRODUCTION The five stem/progenitor populations corresponding to HSC and Hematopoietic stem cells (HSCs) are unique in their capacity to MPP1–MPP4 (Wilson et al., 2008) were isolated by fluorescence- self-renew and replenish the entire blood system (Orkin and activated cell sorting (FACS) from the bone marrow of C57BL/6J Zon, 2008; Purton and Scadden, 2007; Seita and Weissman, mice (Figure 1; Figures S1A–S1C available online). These cells

Cell Stem Cell 15, 507–522, October 2, 2014 ª2014 Elsevier Inc. 507 Cell Stem Cell Molecular Landscape of Early Hematopoiesis

Multipotent Myeloid Mature A stem and progenitor cells committed effector cells C Linneg Sca-1+ c-Kit+ Linneg Sca-1- c-Kit+ Mega MEP Plat CD34- CD34+ MPP2 CD34+ CMP Bone CD48- CD48- CD48+ Ery CD150+ CD150+ CD150+ GMP marrow isolation CD135- CD135- CD135- Neu

Mac + HSC MPP1 MPP3 CD34 Lymphoid CD48+ CD150- committed - CD135 Linneg Sca-1low c-Kitlow FACS sorting MPP4 CD34+ CLP T-cells of primary cells CD48+ (n=3) CD150- NK CD135+ B-cells

Quantitative RNA- DNA Reconstitution B 5 5 5 10 10 10 proteomics sequencing methylome experiments 4 4 4 31.4 10 10 4.1 10 HSC; MPP1 (TWGBS) HSC; MPP1

3 3 3 10 c-Kit 10 10 HSC; MPP1 MPP2; MPP3 HSC; MPP1 MPP2; MPP3 CD135 Lin neg MPP4 MPP2; MPP3/4 MPP4 0 0 0 98.1 5.9 31.9

0 50K 100K 150K 200K 250K 0103 104 105 0103 104 105 FSC-A Sca-1 CD34 Fig. 2,3,4 Fig. 3,4,5,6,7 Fig. 6,7 Fig. 4

Data integration 105 HSC 105 MPP1 MPP2 105 d 72.3 14.1 8.6 HSC and MPP signatures (RNA, ) Functional 104 104 104 RNA-protein correlation readout Differential exon usage d 103 103 103 Differentiation CD150 CD150 CD150 lncRNAs potential 0 0 0 Differential DNA methylation MPP3 41.6 MPP4 83.3 RNA-DNA methylation correlation

0103 104 105 0103 104 105 0103 104 105 CD48 CD48 CD48

Figure 1. Global Analysis of Early Adult Hematopoiesis (A) The mouse hematopoietic system. HSCs give rise to MPPs, which commit to more mature progenitors with restricted cell fate. CMP, common myeloid progenitor; MEP, megakaryocyte erythroid progenitor; GMP, granulocyte macrophage progenitor; CLP, common lymphoid progenitor; Mega, megakaryocyte; Pla, platelet; Neu, neutrophil; Mac, macrophage; NK, natural killer cell. (B) FACS scheme used to isolate primary mouse cells: HSC (Linneg Sca-1+ c-Kit+, LSK, CD34- CD135- CD150+ CD48À), MPP1 (LSK CD34+ CD135À CD150+ CD48À), MPP2 (LSK CD34+ CD135À CD150+ CD48+), MPP3 (LSK CD34+ CD135À CD150À CD48+), and MPP4 (LSK CD34+ CD135+ CD150À CD48+). L, lineage-negative. (C) Experimental study design. Shown are the different generated data sets and their respective figure numbers. See also Figures S1 and S7. were used as a source for quantitative proteomics, RNA-seq, In total, 4,037 proteins were quantified in all three replicates TWGBS, and functional reconstitution experiments. (Figure S2C). Of these, only 47 proteins were expressed differen- tially (false discovery rate [FDR] = 0.1), together with nine pro- Proteome Differences between HSCs and MPP1 Cells teins detected exclusively in MPP1 (Figure 2C; Figure S2D; Table Uncovered Key Molecular Players of Long-Term S1). This suggests that the differential self-renewal potential be- Reconstitution Potential tween HSCs and MPP1 is elicited by a relatively small number of The transition from CD34À HSCs to CD34+ MPP1 is accompa- proteins. nied by a switch in the cells’ reconstitution capabilities. We trans- (GO) enrichment analysis revealed that cell cy- planted mice with 50 HSCs and observed that 100% of primary cle as well as related processes, such as DNA replication and cell and 80% of secondary recipients showed multilineage repopu- proliferation, were strongly overrepresented in MPP1 (Figure 2D). lating activity (Figures S1D–S1F). In contrast, 56% of mice trans- Indeed, protein-protein interaction network analysis highlighted planted with MPP1 cells showed reconstitution of the primary a large group of interconnected cell cycle proteins being ex- recipient, and no engraftment was detected in secondary recip- pressed at elevated levels in MPP1 compared with HSCs. All ma- ients. This is consistent with previous reports showing that the jor processes associated with the cell cycle machinery, including two populations are very similar but display a measurable differ- DNA polymerases (Pol1a, Pole), cell cycle checkpoint proteins ence in long-term self-renewal (Ema et al., 2006; Osawa et al., (Chek1), DNA methylation maintenance and cell cycle progres- 1996). To investigate the molecular basis of this difference in sion (Cdk1, Cdk6), and others were represented in the network self-renewal, we compared the proteomes of HSC and MPP1 (Figure 2E). In contrast, HSCs were enriched in the monosaccha- cells in a quantitative mass spectrometry-based approach (Fig- ride metabolic process, including the glycolytic enzymes lactate ure 2A; Figure S2A). From 400,000 HSCs and MPP1 purified in dehydrogenase b and d (Ldhb, Ldhd) as well as Pygm. In addi- biological triplicate, 6,389 protein groups were identified (Table tion, cellular ion homeostasis, including two iron transporters S1). These covered a broad range of protein classes, e.g. recep- (Fth1, Ftl2), oxidation reduction (Rrm1, Rrm2), and response to tors (222) and transcription factors (549) (Figure 2B), as well as hypoxia (Mecp2) were enriched in HSCs. Together, the data low-abundance proteins, as judged from the estimated protein are consistent with an anaerobic metabolic program employed levels that spanned more than seven orders of magnitude by quiescent HSCs, whereas MPP1 cells become primed for (Figure S2B). entry into the cell cycle and start proliferating.

508 Cell Stem Cell 15, 507–522, October 2, 2014 ª2014 Elsevier Inc. Cell Stem Cell Molecular Landscape of Early Hematopoiesis

AB C 22 Protein count Igf2bp2 0 500 1000 1500 nucleic acid binding HSC MPP1 hydrolase Hmga2 5 transferase 4x10 cells, enzyme modulator n=3 transcription factor 2 cytoskeletal protein Ldhd Hmga1 oxidoreductase Fscn1 Pygm Hmga1 Gda kinase Irgm2 Myof Cell lysis/protein extraction/ ligase Nagk Serpinb6a Ifitm1 Camk2d Cd38 protein digestion transporter Cbr3 receptor Gstm1 membrane traffic protein Tgm2 signaling molecule Slamf1 Peptide labeling protease Ldhb Cmas Gstm5 calcium-binding protein 0 transfer/carrier protein Stmn1 Dnmt1 Pola1 chaperone Smc2 Ndrg1 Peptide fractionation phosphatase Hat1 Plek isomerase Lig1 lyase Tyms defense/immunity protein Chek1 Hells Uhrf1 cell adhesion molecule log2 ratio HSC/MPP1 Cdk6 extracellular matrix protein Kif4 Serpinb1a structural protein Kif2c Dut Top2a cell junction protein Thy1 Ncaph Rrm2 tr.membr. rec. reg. −2 Pole Anxa2 High-resolution nano LC-MS/MS surfactant Gins3 Cdk1 storage protein viral protein

Dnajc6; Sptlc2; Mtmr9; Zc3h10 Mis12; Mllt1; Faah; Steap3; Ncf4 0.0 0.5 1.0 1.5 25 −log10(adjusted p−value HSC/MPP1) D E +9 Hat1 Dnmt1 HSC #proteins MPP1 20 0 20 40 Cdk6 Pole Lig1 oxidation reduction 14 2 Pola1 Uhrf1 Stmn1 F monosaccharide metabolic process 8 1 Hells Cdk1 cellular ion homeostasis 6 1 2 regulation of organization 3 1 neuron projection development 4 1 Plek Dut Smc2 Ncaph response to hypoxia 3 1 PP1 1

Chek1 Kif4 M C/ DNA replication 0 19 0 Rrm2 Gins3 S cell cycle 5 29 Ncf4 H chromosome condensation 0 7 Top2a Kif2c DNA packaging 2 10 io -1

chromosome organization 5 16 Tyms Mis12 at r cell division 3 14 -2 2

Mphase 2 13 g

cellular response to stress 1 12 Hmga1 Gstm1 Pygm Ldhb lo -3 mitosis 2 9 cell proliferation 1 8 DNA repair 0 9 Hmga2 Gstm5 Ldhd deoxyribonucleotide metabolic process 0 4 cell activation 2 7 neg. reg. of transcription, DNA-dependent 1 7 Edge color Node fill color Node size Node shape

0.41 1 -2 22.7 Uniquely Combined score Average log2 ratio detected 0.02 0.1 STRING HSC/MPP1 adj.p value MPP1

Figure 2. Proteomic Comparison of HSCs and MPP1 (A) Quantitative proteomics workflow. (B) Classification of identified proteins. Bars show the number of proteins within each functional class. (C) Differential protein expression. Proteins expressed significantly higher (FDR = 0.1) in HSCs and MPP1 are shown in red and blue, respectively. The lower right corner shows proteins exclusively detected in MPP1. (D) Overrepresented biological processes of differentially expressed proteins (202 with FDR = 0.15 and 9 exclusively detected in MPP1). Top, HSC-enriched. Bottom, MPP1-enriched. Neg. reg. of transcription, negative regulation of transcription. (E) Protein network of differentially expressed proteins. The edges show known and predicted protein-protein interactions. STRING, Search Tool for the Retrieval of Interacting /Proteins; adj., adjusted. (F) CD cell surface proteins expressed differentially. The average log2 ratio ± SD is shown. See also Figure S2.

Of the 22 proteins with higher expression in HSCs, we found tion and drive self-renewal (Shyh-Chang et al., 2013; Yaniv and three high mobility group AT hook proteins; namely, two isoforms Yisraeli, 2002). High expression of this pathway in HSCs suggests of Hmga1 as well as Hmga2 (Figure 2C; Figure S2E). These chro- a role in self-renewal of HSCs, whereas its downregulation in matin-modulating transcriptional regulators have been reported MPP1 cells correlates with decreasing self-renewal activity. recently to be potent rheostats of tumor progression (Morishita Two glutathione S-transferases (GST), Gstm1 and Gstm5, et al., 2013; Shah et al., 2013) and HSC self-renewal, proliferation, were found to be expressed at higher levels in HSCs compared and lineage commitment checkpoints (Battista et al., 2003; Cop- to MPP1 (Figure 2C). Moreover, elevated levels in HSCs were ley et al., 2013). In addition, the protein showing the highest fold consistent for all 11 GSTs quantified (Figure S2F). This points change was the Hmga2 target Igf2bp2 (Figure 2C; Cleynen to a requirement for this enzyme class in HSCs, which may relate et al., 2007). Hmga1, Hmga2, and Igf2bp2 are downstream medi- to their ability to mediate the conjugation of xenobiotics for the ators of the Lin28-let7 pathway that link metabolism to prolifera- purpose of detoxification and defense against environmental

Cell Stem Cell 15, 507–522, October 2, 2014 ª2014 Elsevier Inc. 509 Cell Stem Cell Molecular Landscape of Early Hematopoiesis

stress and cellular damage (Tew and Townsend, 2012). This sug- MPP1 in agreement with the proteome findings and increasing gests that homeostatic HSCs have an array of mechanisms in proliferative activity. place to protect their cellular integrity, extending an observation To systematically investigate potential posttranscriptional that we have made previously in hematopoietic progenitors regulation at the transition from HSCs to MPP1, we correlated (Klimmeck et al., 2012). Along these lines, two interferon-induc- the transcriptome data with the proteome data (Figures 3F–3G; ible proteins involved in host defense (Ifitm1 and Irgm2) were Table S3). Out of the quantified proteins, we were able to assign expressed at higher levels in HSCs compared with MPP1 (Fig- 99.3% (4,007 of 4,036) to their corresponding transcripts (Fig- ure 2C), suggesting that the type I interferon pathway is not ure 3F; Figures S3D–S3E). Notably, in addition to a strong positive only critical for the response to stress but also during homeosta- correlation between both data sets, all proteins that were ex- sis (Essers et al., 2009; Trumpp et al., 2010). Moreover, HSCs pressed differentially were also consistently upregulated or down- and MPP1 employ different intracellular serpins for protection regulated at the transcript level (Figure 3F). For example, the against death during stress because Serpinb6a and Serpinb1a response to cytokine stimulus was overrepresented at both the were expressed at higher levels in either HSC or MPP1, respec- RNA and protein levels in HSCs, whereas genes belonging to tively (Figure 2C). Taken together, HSCs harbor proteins involved the cell cycle and DNA replication were expressed strongly in in immune defense and detoxification, indicative of an increased MPP1 (Figure 3G). As a notable exception to this overall pattern, self-protecting repertoire compared with MPP1. enzymes in glycolysis/gluconeogenesis (Cmas, Gstm5, and Among the 86 quantified cluster of differentiation (CD) surface Nagk), as well as several downstream targets of the Lin28-let7 proteins (Table S1), nine showed a strong differential expression pathway (Hmga1, Hmga2, and Igf2bp2), showed only modest between HSC and MPP1 (Figure 2F). These included CD34 as changes at the mRNA level (Figures 3F and 3G), in line with their well as other membrane proteins described previously in the suggested regulation at the posttranscriptional level (Shyh-Chang context of stem/progenitors, namely CD38, CD41, CD49b, and and Daley, 2013). In addition, we found the GO term chromosome CD90 (Benveniste et al., 2010; Dumon et al., 2012; Weissman enriched and several long-lived histones related to chromosome and Shizuru, 2008). Additionally, our analysis identified CD82 organization (e.g. H2afz, H2afy2) to be downregulated at the and CD13 to be expressed at higher levels in HSCs, whereas RNA level in MPP1. Because these were not detected as being ex- CD11b had a higher expression level in MPP1. These findings pressed differentially at the protein level (Figures 3F and 3G), we may be used to develop additional marker combinations to hypothesize that the regulatory initiation of nuclear reorganization distinguish HSCs and MPP1 as well as to further refine the clas- starts already in MPP1 but becomes only operational at subse- sification of stem/progenitor intermediates. quent stages. The overall high similarity between the transcrip- tome and proteome suggests that posttranscriptional regulation The Transcriptome and Proteome Are Highly in HSC/progenitors might be used preferentially only for some Coordinated upon HSC Differentiation pathways, including the Lin28-Hmga-Igf2bp2 axis. We next analyzed the transcriptome of HSCs and MPP1 using high-throughput RNA sequencing starting from 30,000 FACS- MPP2 Cells Are Multipotent, but MPP3 and MPP4 Show sorted primary cells (Figure 3A; Figure S3A). Robust and repro- a Differentiation Lineage Bias ducible data were obtained for all samples, with more than 2 3 To investigate the functional attributes of MPP2–MPP4, we per- 108 total readings per population (Figure S3B). In total, tran- formed in vivo reconstitution assays using FACS-sorted cells scripts corresponding to 27,881 genes were identified (Table (Figure 4A; Figures S4A–S4B). The contribution of transplanted S2). Those genes were classified, according to their database MPP2 cells to the T cell, B cell, and myeloid progeny in peripheral annotation, into 21 RNA categories, and, as expected, protein- blood (PB) after 4 months was 23%, 35%, and 32%, respec- coding transcripts were highly represented (68.9%) (19,219; tively, compared with the 56%, 70%, and 82% observed for Figure 3B). In line with the proteome data, the protein-coding transplanted HSCs (Figures 4B and 4C; Figure S4A). In contrast, transcripts displayed a high diversity of functionalities (Fig- MPP3 and MPP4 cells generated only a small number of differ- ure 3C), including transcription factors (TF, 1,776 genes), recep- entiated cells (mostly below 2%), although the progeny were re- tors (1,796), and cell adhesion molecules (584). Additionally, the tained in the PB for at least 4 months. MPP3-transplanted mice expression of 8,662 noncoding RNA species was identified, showed a bias toward the production of myeloid cells, which including pseudogenes (4,034), microRNAs (miRNAs) (642) and was evident from 1 week posttransplantation but decreased long noncoding RNAs (lncRNAs) (589). over time. MPP4-transplanted animals preferentially generated We found 479 genes to be expressed differentially between lymphoid B cell progeny starting at 3 weeks posttransplantation, HSCs and MPP1 (FDR = 0.1; Table S2; Figure 3D). Consistent whereas myeloid progeny remained below 1% during the entire with the GO terms enriched in the proteome data, we found observation period. In line with this, T cells peaked after 3 weeks an overrepresentation of processes related to metabolism (pos- in the thymi of MPP4-transplanted mice (Figure S4B). In addition, itive regulation of the phosphate metabolic process and sulfur colony-forming unit (CFU) assays showed that, like MPP2 and compound metabolic process) and response to hypoxia (cellular MPP3 progenitors, MPP4 cells also generate myeloid colonies response to oxygen-containing compounds) within the genes (Figure S4C), in line with earlier studies performed with expressed at higher levels in HSCs (Figure 3E). Furthermore, CD135+ lymphoid-primed multipotent progenitors (Adolfsson TFs involved in different aspects of developmental signaling et al., 2005). In summary, MPP2 generates a large number of (cellular developmental process, cell differentiation Mecom, long-term myeloid, B cell, and T cell progeny upon transplanta- Hoxb7) were enriched in HSCs. In contrast, the DNA metabolic tion. In contrast, the other two populations generate only a process, cell cycle, and nuclear division were enriched in limited number of progeny in vivo, with a significant lineage

510 Cell Stem Cell 15, 507–522, October 2, 2014 ª2014 Elsevier Inc. Cell Stem Cell Molecular Landscape of Early Hematopoiesis

Count A C 0 500 1000 1500 2000 2500 D nucleic acid binding HSC MPP1 hydrolase n=4 n=3 transferase 30,000 cells enzyme modulator 277 transcription factor polyA-RNA cytoskeletal protein oxidoreductase kinase High throughput sequencing ligase 1 sample/lane transporter Paired-end reads receptor 100bp membrane traffic protein signaling molecule protease B calcium-binding protein 68.93% 19219 protein coding transfer/carrier protein 14.47% 4034 pseudogene chaperone log2 fold change (HSC/MPP1) 202 3.77% 1052 antisense phosphatase isomerase 2.30% 642 miRNA lyase Mean of normalized counts 2.11% 589 lncRNA defense/immunity protein 2.07% 577 snoRNA cell adhesion molecule 2.02% 562 processed transcript extracellular matrix protein structural protein 2.01% 560 snRNA cell junction protein 0.88% 244 misc RNA tr.membr. rec. reg. 0.42% 117 IG LV gene surfactant 1.02% 285 other non-coding species storage protein viral protein

HSC # transcripts E MPP1 F 180 120 60 0 60 120 180 4,007 RNA, protein coding response to stimulus 167 111 19,219 RNA Δ RNA FDR=0.1 32 47 cellular developmental process 83 53 Protein +9 Protein cell differentiation 80 50 15 Δ Protein FDR=0.1 regulation of localization 44 24 63 RNA & RNA & protein Δ cell migration 25 12 Protein RNA & protein FDR=0.1 positive regulation of phosphate metabolic process 24 10 357 cellular response to oxygen-containing compound 16 7 sulfur compound metabolic process 9 2 response to alcohol 8 4 435

cell cycle 29 91 DNA metabolic process 17 53 nuclear division 7 52 Gbp10 microtubule-based process 7 50 regulation of cell cycle 21 39 DNA recombination 3 13

1.0 G Vitamin C Myof Gstm1 Pygm Igf2bp2 Ifitm1 Zc3h10 Gda vacuolar cellular response to Type cytokine stimulus Mis12 Hmga2 0.5 membrane GOBP name Hmga1 GOCC name Sptlc2 Mllt1 Cmas KEGG name Anxa2 Faah Gstm5 RNA Keywords Mtmr9 Tyms Steap3 Nagk Extracellular matrix Ncf4 Cdk6 H2afz Glycolysis/Gluconeogenesis log10(Size) Dnajc6 Serpinb1a H2afy2 0.0 1 Top2a Cdk1 Hmgb2 2 Stm1 Kif4 Hist1h1a 3 Smc2 log2 fold change (HSC/MPP1) nucleosome FDR

HSC/MPP1 RNA score −0.5 Tricarboxylic acid cycle Chromosome 0.015 Nuclear pore Cell cycle complex 0.010 0.005 0.000 −1.0 DNA-dependent DNA replication initiation −1.0 −0.5 0.0 0.5 1.0 Protein HSC/MPP1 protein score log2 fold change (HSC/MPP1)

Figure 3. Comparison of the Transcriptome and Proteome of HSCs and MPP1 Cells (A) RNA-seq workflow. (B) RNA categories of 27,881 quantified genes. Shown are the percentage and number of RNA species within each RNA class. (C) Classification of quantified protein-coding genes. Bars show the number of genes within each functional class, ranked based on the protein coverage(Figure 2B). (D) Differential gene expression. Genes highly expressed (FDR = 0.1) in HSCs or MPP1 are shown in red and blue, respectively. (E) Overrepresented biological processes of differentially expressed genes. (F) Overlap and correlation between protein and RNA expression changes. Top panel, integration of proteome and transcriptome data sets. Quantified proteins (blue) were mapped to protein-coding transcripts (green). Differentially expressed transcripts (dark green, 78 mapped of 435) and differentially expressed proteins (dark blue, 47 significant plus 9 exclusively detected proteins) were assessed for overlap (red). Bottom panel, significant (FDR = 0.1) changes at RNA (green), protein (blue), and both levels (red, correlation coefficient R = 0.93), respectively. The box on the left indicates exclusively detected proteins. (G) 2D GO enrichment analysis of protein and RNA expression changes. Red regions correspond to concordant enrichment or lower expression. Blue and green regions highlight terms that are enriched or lower in one direction but not in the other, whereas terms in yellow regions show anticorrelating behavior. GOBP, gene ontology biological process; GOCC, gene ontology cellular compartment; KEGG, Kyoto Encyclopedia of Genes and Genomes. See also Figure S3.

Cell Stem Cell 15, 507–522, October 2, 2014 ª2014 Elsevier Inc. 511 Cell Stem Cell Molecular Landscape of Early Hematopoiesis

My-cells MPP2 MPP2 MPP3 MPP4 A CD45.1/2 B 100 100 100 B-cells Supportive T-cells 10 10 10 MPP3 %)

( 1 1 1 2 CD45.2 B6J MPP4 CD45.1 . Donor Recipient 0.10 0.10 0.10 D45

1 -16 weeks C 0.01 0.01 0.01 13691216 1 3 6 9 12 16 13691216 Donor

Read-out Weeks PB, BM, SPL & THY 1139126163912616 C My-cells CD45.1 Fold change MY-cells LY-cells B-cells CD45.2 x0.0003 T-cells x0.1

MPP2 x1 D E MPP3 MPP4 800 Count 0 CD34 −3 −1 1 3 Row Z-score MPP4 HSC MPP1 MPP2 MPP3 MPP4 40 F MPP4 G 4,288 20 HSC 3,240 1,436 4,043 CD150 MPP1 0 MPP3 n 2,437 MPP3 PC 2 o HSC MPP1 i -20 479

ss 3,721 e MPP2 1,797 -40 1,040 xpr e CD135 -40 -20 0 20 40 MPP2 NA 3,826

nR PC 1 ea

m B cell receptor Retinoic acid H Mitosis signaling pathway metabolic process CD48 2 3 HSC 2 2 MPP1 1 1 1 MPP2 MPP3 0 0 MPP4 0 -1 -1 -2 -1 HSCMPP1 MPP2 MPP3 MPP4 -3

Mean log2 fold change -2 4 Os P I Cd96 J K G Ngenes N MPP3 MP MPP1 HSC MPP2 Itgax Cluster Selp 100 Ldhb 239060 H2−Aa 35410 Ctla2b Retinoic acid metabolism Regulation of metabolism Igtp 41484247 Htatip2 5270 Sdpr Aldh1a1* Bco2* Rbp1* Cyp26b1* Lin28b* Igf2bp2* Insr Igf1* Hif1a 6860 Gp1bb * * * * * Aqp1 * * * * 7 155 16 Tal1 * Bpgm * 8497431 Aldh1a1 9360 Fhl1 Aldh1a7* Aldh1a3 Crabp2 Cyp26a1 Lin28a*Hmga1 Insrr* Igf2* Hif3a* 10 57 0 Fli1 * * * * * * * * * 11 20 0 Mpl * * Pdzk1ip1 * * 12 22 0 vWF 13 204 50 Nckap1 Adh1 Aldh1a2* Hmgb2* Hmga2* Irs2* Igf2r* Foxa3* 14 28 0 Gata2 * * * * * 15 178 57 Epor * * * * Zfpm1 * 16 17 4 Tspan33 17 84 0 Cmas Retinoic acid metabolic process Clu 18 340 261 Gpr64 19 25 0 Gstt1 Rdh10* Rdh5* Rxra Lin28b Klf1 * * 20 170 47 Itga2b * * * 21 39 0 * ** Gata1 * RNA DEG HSC-MPPs FDR=0.1 22 14 0 Hemgn Let7 Gp5 23 42 0 Rara* Rarb* Rarg* Ldhb* F2r * 24 3 0 Mrvi1 * * * * * RNA DEG HSC/MPP1,2,3 or 4 Rgs10 * Igf2bp2; Hmga1; Hmga2 FDR=0.1 25 709 324 * 26 227 41 Lmna Insr; Igf1; Igf2; Igf2r Mean expression Proteome DEP HSC/MPP1 Thpo 27 58 0 Car1 Pparg Ppard* Ppargc1a* FDR=0.1 HSC 28 4 0 MPP1 MPP2 MPP3 Cpa3 * * * * * MPP4 Il1rl1 Anaerobic OxPhos; TCA Ldhb: RNA mean expression >700 29 1176 306 Gadd45a * Glycolysis Proliferation 30 11 0 H1234 Ldhc: RNA mean expression <700 RAR / PPAR signaling 31 330 306 Enriched 32 0 0 Row Z-score Non-enriched −3 0 3

Rtel1 Mus81 Xrcc3

Emsy Btbd12 L M Fancb Gm15428 Rad51c Fancc Rad18 Topb p1

HSC (#2) 35 84 86 115 40 5 4 Rbbp8 Apitd1 Brc a1 MPP1 (#3) 14 5 3 Rad51 Paxip1 Brip1 HSC-MPP1 (#4) 103 62 51 301 338 113 414 152 70 24 8828747 Ercc1 MPP1-MPP2 (#7) 12 7 Faap24 Fanci HSC-MPP1-MPP2 (#8) 56 38 45 22 135 136 48 201 67 32 35 15 9 24 11 18 12 8 21 37 510 Rpa2 Rad54b Uhrf1 Exo1 Hmgb2 MPP2-MPP3 (#13) 17 13 16 7 6 Pold3 Lig1 18 18 6 13 9 Fanca MPP1-MPP2-MPP3 (#15) Clspn 5 HSC-MPP1-MPP2-MPP3 (#16) Ube2a Fancd2 Rad50 Rpa1 18 16 76 73 118 39 18 6 27 7 22 29 6 Rad52 Chek1 HSC-MPP4 (#18) Ogg1 Chaf1a Rad54l HSC-MPP1-MPP4 (#20) 46 4 3 Nudt1 MPP3-MPP4 (#25) 49 38 47 36 157 149 53 235 76 35 38 10 12 18 12 7 22 21 8 3 39 10 13 Smc4 Mlh1 Cdca5 Eme1 Smarc b1 HSC-MPP3-MPP4 (#26) 53 70 7 19 Neil3 Mbd4 Mcm8 Kif22 MPP2-MPP3-MPP4 (#29) 48 77 313 52 63 536 77 24 48 17 45 32 16 10 19 15 10 LOC677447 Kat5 MPP1-MPP2-MPP3-MPP4 (#31) 30 51 9 47 67 41 33 35 21 Rad9 Smc1a Msh6 Pif1 y H2afx Trip13 Pola1 Nono rium ration ration unity mulus vesicle m vation ylation Pcna Act l6a ounding fe r Bard1 Foxm1 Pole w unication ric region elopment elopment Smc2 m DNA repair erentiationferentiationv v feron−beta T cascadeferentiation ferentiation nucleosomecell division e e endocytosisA coagulation lipid binding Nsmc e2 Ssrp1 Hus1b 0 0.2 1 cell adhesion cell mig T neurogenesis adjusted p-value e d kinase activity Cul4a cell proli DNA replication v MAPK cascadephospho Uchl5 Trp53 Baz1b ocyte chemotaxis protein secretionplatelet acti metabolic process k AK−S Nsmc e1 Rad51ap1 Dtl extracellular regioncell comelopmental process , centrome cytokine productionT cell diffB cell dif ner leukin−1 productionJ Polq Dna2 response to sti e leu r Polh Dek responseinflammatory to dev response interleukin−6forelimb production morphogenesisinte pre−B cell difcalciumregulation ion homeostasis of proteolysis eron−gamma production Ruvbl1 Eef1e1 cytoskeleton organization melanosome organization ERK1 and ERK2 cascadef chromosome organization ratory system d fense response to bacte response to inter yonic inter face receptor signaling pathway lymphocytede mediated im r Node size chromosom respi myeloid leukocyte dif Edge color Node color emb cell sur e regulation of RNA metabolic process v MPP1 MPP2 MPP3 MPP4 canonical Wnt receptor signaling pathwa 0.4 1 Highestexpression positi Combined score 1 56 STRING Degrees

(legend on next page) 512 Cell Stem Cell 15, 507–522, October 2, 2014 ª2014 Elsevier Inc. Cell Stem Cell Molecular Landscape of Early Hematopoiesis

bias toward mostly myeloid (MPP3) or lymphoid (MPP4) cell megakaryocytic erythroid signature described previously types. (Ma˚ nsson et al., 2007; Gekas and Graf, 2013; Sanjuan-Pla et al., 2013) were expressed preferentially in HSCs, MPP1, and Clustering Analyses Identify Global Molecular MPP2 compared with MPP3 and MPP4 (Figure 4I). Differences during HSC Differentiation Consistent with the differential expression of metabolic en- We sequenced the transcriptomes of the MPP2, MPP3, and zymes between HSCs and MPP1 (Figures 2Dand3E), we found MPP4 populations in biological triplicates (Figures 4D–4M; Fig- genes belonging to the retinoic acid metabolic process to be ures S4D–S4G). The surface markers used for FACS showed highly expressed in HSCs compared with all other populations consistency between transcript and protein levels (Figure 4D). (Figures 4H and 4J), suggesting a role in adult HSCs. Furthermore, We detected 6,487 genes to be expressed differentially across we found evidence for stage-dependent and isoform-specific all five populations (FDR = 0.1; Table S2). The differential expres- expression of essential glycolytic enzymes in HSCs (e.g. Aldoc, sion of a set of genes was confirmed by quantitative real-time Ldhb; Figure S4D), extending recent studies demonstrating the PCR (Figure S3F), validating the robustness of the data. Unsu- relevance of anaerobic glucose metabolism for the maintenance pervised clustering of gene expression profiles grouped together of HSC self-renewal (Takubo et al., 2013). In contrast, the large HSC and MPP1 as well as MPP3 and MPP4 (Figure 4E). This was majority of enzymes involved in the tricarboxylic acid cycle and verified by principal component analysis (Figure 4F) and by the oxidative phosphorylation showed MPP2-enriched expression numbers of differentially expressed genes in each pairwise com- (e.g. Fh1, Aco2). Interestingly, although Lin28b and its target parison (Figure 4G). In all analyses, MPP2 was placed between genes, including Hmga2 and Igf2bp2, were highly expressed in the HSC-MPP1 and MPP3-MPP4 clusters (Figures 4B and C; HSCs compared with MPP1, the family member Lin28a was spe- Figures S4A and S4B). cifically upregulated in MPP3 (Figure 4J). Therefore, our findings support the recently established role of the Lin28-let7 axis in Clustering of Gene Expression Profiles Suggests glucose metabolism in stem cells (Shyh-Chang and Daley, 2013) Processes Operating in HSC/MPP1-MPP4 Cells and suggest Lin28b as a candidate for an upstream posttranscrip- The 6,487 genes with differential expression across the five tional regulator of glycolytic enzymes in HSCs (Figures 3F and 3G). cell populations were enriched in specific GO terms (http:// Next we assigned each of the 6,487 differentially expressed www-huber.embl.de/SMHSC/HSCGO/goAnalysis.html; Sup- genes to one of the 32 possible relative expression patterns (Fig- plemental Information), of which we highlight three selected ure 4K) and tested for overrepresented GO terms within each examples (Figure 4H). Based on the expression data, MPP2 is pattern (Table S4). Population-specific GO terms are displayed the mitotically most active population, whereas HSC is the in Figure 4L. Consistent with the strong upregulation of cell most quiescent one. In agreement with the functional lymphoid cycle-associated genes in MPP2, we observed that 43% of the differentiation bias observed in MPP4-transplanted recipients genes involved in DNA repair also showed the highest expres- (Figures 4B and 4C), the B cell receptor signaling pathway was sion in MPP2 (Figure S4F). The protein-protein interaction enriched in this population. Consistent with this, mapping network of the HSC-suppressed DNA repair genes comprised, of the HSC-MPP data against granulocyte macrophage or e.g., Brca1 and Exo1 (Figure 4M), representing potential players lymphoid-primed progenitor signatures described previously re- for the switch in DNA repair mechanisms between HSCs and vealed an enrichment for MPP3 and MPP4 populations, respec- MPP2. Interestingly, the two common mechanisms for repair of tively (Ma˚ nsson et al., 2007; Figure S4E). In addition, genes of a DNA double strand breaks (homologous recombination and

Figure 4. Lineage Potential and Whole Transcriptome Molecular Signatures of HSC-MPP1-4 Populations (A) Experimental workflow for investigation of the in vivo reconstitution potential. MPP2, MPP3, and MPP4 were transplanted into lethally irradiated mice, and the myeloid (MY) and lymphoid (LY) outcomes were monitored over time by flow cytometry. SPL, spleen; THY, thymus. (B and C) Monitoring the myeloid and lymphoid outcomes in peripheral blood after transplantation of MPP2, MPP3, and MPP4. The relative percentage (B) and contribution (C) of donor myeloid (blue), B cells (gray), and T cells (purple) were monitored from the peripheral blood of MPP2-, MPP3-, and MPP4-transplanted animals at 1–16 weeks. n = 8-12 per group. The sizes of the circles in the right plot represent fold change relative to HSC contribution after 3 weeks. (D) Differential RNA expression of the surface markers used for FACS. Average RNA expression ± SD is shown. (E) Clustered heat map. The colors represent the normalized average read count in each of the five cell populations for all differentially expressed genes (FDR = 0.1). (F) Principal component analysis. (G) Relative distances are shown based on numbers of differentially expressed genes in pairwise comparisons. (H) GO enrichment analysis of 6,387 genes with differential expression. Three representative examples of significantly enriched GO terms out of 1,001 terms are shown. (I) Expression of megakaryocytic/erythrocytic genes. (J) Transcriptome and proteome signatures of metabolism. Average RNA expression ± SD is shown (arbitrary units). RAR, retinoic acid receptor; PPAR, peroxisome proliferator-activated receptor; OxPhos, oxidative phosphorylation; DEG, differentially expressed genes. (K) Gene expression clusters. 6,487 differentially expressed genes were grouped into 32 clusters based on higher expression (enriched) in one or several cell population(s) compared with mean expression level across all cell populations. (L) Representative overrepresented GO terms of gene expression clusters. The number of genes within each cluster is shown for significantly enriched GO terms (FDR = 0.1). (M) Network of HSC-suppressed DNA repair genes. Genes with low expression in HSCs (Z-score < À1.5) and with known and predicted protein-protein interactions are shown. The colors depict the MPP population with the highest expression. Note that most genes are most highly expressed in MPP2. See also Figure S4.

Cell Stem Cell 15, 507–522, October 2, 2014 ª2014 Elsevier Inc. 513 Cell Stem Cell Molecular Landscape of Early Hematopoiesis

Frizzled Wnt1 Wnt2a Wnt2b Wnt3 Wnt3a Wnt4 Wnt5a Wnt5b HSC ABHSC p -log10( value) ligands MPP1 0123 MPP2 MPP3 Nuclear Receptors 4 Wnt6* Wnt7a Wnt7b Wnt10a* Wnt10b Wnt11 Wnt16 MPP4 Monoamine GPCRs 2 * DEG HSC/MPP1 FDR=0.1 TGF Beta Signaling Pathway 4 Retinol metabolism 2 Myc* DEG HSC-MPPs FDR=0.1 GPCRs, Other 3 Wnt1 RNA mean expression <700 Diurnally Regulated Genes with Circadian Orthologs 3 Fzd7 RNA mean expression >700 Ovarian Infertility Genes 2 Exercise-induced Circadian Regulation 3 Wnt Fzd1 Fzd2* Fzd3* Fzd4* Fzd5* Fzd6 Fzd7 Fzd8* Fzd9* Fzd10 Ldlr Senescence and Autophagy 4 receptors Splicing factor NOVA regulated synpatic proteins 2

HSC-MPP1 -log10(p value) Dvl1 Dvl2 Dvl3 Rhoa Racgap1* 024 Protein Kinase C * Cytoplasmic Ribosomal Proteins 20 Prkca* Prkcb* Retinol metabolism 6 Matrix Metalloproteinases 4 Endochondral Ossification 9 Prkcc Prkcd* Ppp2r5c Ppp2r5e Mapk9 Mapk10 G Protein Signaling Pathways 13 Cytoskeleton Ovarian Infertility Genes 5 Axin1 ACE Inhibitor Pathway 2 Prkce* Prkci Nuclear receptors in lipid metabolism and toxicity 4 Frat1 Dopminergic Neurogenesis 3 Csnk1e Apoptosis Adipogenesis 14 Focal Adhesion 19 Prkch* Prkd1* Gsk3b Apc Regulation of Actin Cytoskeleton 14 Nuclear Receptors 5 B-catenin Fbxw2 Pafah1b1 Glutathione metabolism 3 Prkcq Prkcz Wnt Signaling Pathway 8 P Purine metabolism 15 Ctnnb1 Wnt Signaling Pathway and Pluripotency 10 Phosphorylated

Tcf1 Ubiquitin tagged B-catenin 26S Proteosome Transcriptional activation C Counts Nucleus degradation 0 50 100 Mycn* Myc* Ccnd1* Ccnd2* Ccnd3* Fosl1 Jun* Plau* nucleic acid binding enzyme modulator receptor transcription factor transferase hydrolase D Foxj3 (ENSMUSG00000032998+) cytoskeletal 5000 transporter 2000 HSC signaling 1000 MPP1 kinase differential exon usage oxidoreductase 100 cell adhesion exons 10

ligase Fitted expression defense/immunity extracellular matrix phosphatase 1 lyase calcium-binding 2 membrane traffic 3 structural 4 transfer/carrier 5 isomerase 6 protease chaperone 7 cell junction 8 viral 9 surfactant 10 tr.membr. rec. reg. 119537004 119557474 119577944 119598414 119618884 Genomic region

Figure 5. Gene Expression Signatures and Transcript Isoform Regulation in HSCs and MPP1–MPP4 Cells (A) Signaling pathway enrichment analysis. Shown is the overrepresentation of signaling pathways within HSC- and HSC-MPP1 clusters. NOVA, neuro- oncological ventral antigen; ACE, angiotensin-converting enzyme. (B) Wnt signaling. Shown is a pathway analysis based on WikiPathways (WP403). Average RNA expression ± SD is shown (arbitrary units). (C) Association of differentially spliced genes to protein classes. (D) Representative example for a TF (Foxj3) showing differential exon use. The first Foxj3 exon was detected at higher levels in HSC (red) compared with MPP1 (blue). nonhomologous end joining) also showed a different expression To investigate the signaling pathway complexity of both the pattern across the five cell populations. Both repair processes HSC and the HSC-MPP1 self-renewal clusters, we tested for are low in HSCs and are most enriched in MPP2 (Figure S4G). overrepresentation of differentially expressed genes in pathways In summary, our data indicate that the increased mitotic activity annotated in REACTOME within each group of genes. We found, during HSC differentiation, starting in MPP1 and peaking in among others, G protein-coupled receptor (Gpr143, Pde1c) MPP2, is associated with a parallel increase in expression of and transforming growth factor b (Inhba, Bmp4) signaling to DNA repair pathway genes. be enriched (Figure 5A; Table S4). In accordance with the GO

514 Cell Stem Cell 15, 507–522, October 2, 2014 ª2014 Elsevier Inc. Cell Stem Cell Molecular Landscape of Early Hematopoiesis

enrichments (Figure 4L), Wnt signaling was prominent in self- 81% and 83% and were not significantly different across popu- renewing HSCs and MPP1. Although the role of canonical lations, whereas pairwise comparisons identified a total b -catenin-mediated Wnt signaling in adult HSCs remains of 15,887 distinct differentially methylated regions (DMRs) controversial (Koch et al., 2008; Luis et al., 2012), noncanonical (Figure 6B; Table S5). Mapping these DMRs to previous DNA signaling has recently been suggested to mediate critical inter- methylome data generated on HSC/MPPs using reduced repre- actions between HSCs and their niches (Sugimura et al., 2012). sentation bisulfite sequencing (RRBS) (Bock et al., 2012), we Therefore, we interrogated individual expression patterns of found that 85% of all DMRs were exclusively identified using the entire Wnt signaling pathway during differentiation of HSCs the TWGBS analysis (e.g. Mecom; Figure S5C), demonstrating toward MPP4. Although the pathway was highly overrepre- the additional coverage of our data set. Early commitment steps sented in HSC-MPP1, none of the Wnt ligands were expressed correlated with lower numbers of DMRs (1,121, HSC-MPP1), at high levels, arguing against autocrine production of Wnt which likely reflects close ontogenic and functional relationships, ligands (Figure 5B). However, even low levels of Wnt ligands sup- and were mainly associated with loss of methylation (71%, HSC- port stem cell self-renewal (Luis et al., 2012). In agreement with MPP1). In contrast, transitions between more differentiated MPP this, three frizzled receptors (Fzd4, Fzd8, and Fzd9) were highly populations showed higher numbers of DMRs (1,874, MPP2- expressed in HSC-MPP1, consistent with reported Fzd4 expres- MPP3/MPP4) and gain of methylation (75%, MPP2-MPP3/ sion in human CD34+ cells (Tickenbrock et al., 2008) and a role of MPP4; Figure 6C; Figure S5D; Table S5). Globally, methylation noncanonical Fzd8 in the maintenance of quiescent long-term changes showed continuous progressive behavior (gain or HSCs (Sugimura et al., 2012). Taken together, our analysis un- loss) through HSC differentiation (Figure 6C). derscores the relevance of Wnt signaling for HSCs but also out- Next we investigated the overlap of DMRs with annotated lines the complex isoform-specific enzyme utilization operational genomic features derived from Refseq (Figure 6D; Pruitt et al., in HSCs and the different MPPs. This likely contributes to the dy- 2009). The majority of DMRs were located in intragenic regions namic regulation of this pathway, which is critical for most stem/ (57%), whereas promoters and intergenic regions comprised progenitor cell types. 9% and 34%, respectively. Comparison of our data set with experimentally defined functional genomic elements (Stama- Transcription Factor and Transcript Isoform Regulatory toyannopoulos et al., 2012; Shen et al., 2012) demonstrated a Landscape significant overrepresentation of DNaseI hypersensitivity sites Among the genes expressed differentially across the five popu- (DHSs), promoters, enhancers, and/or transcription factor bind- lations, we identified 490 TFs, including members of the Fox, ing sites (TFBSs) across DMRs (Figure S5E; all p values < 10À16). Gata, and Hox families (Table S2). TF splice isoforms can have Interestingly, 74% of all DMRs that overlapped with DHSs stage- and tissue-specific expression patterns throughout (10,887) mapped to regions annotated as TFBSs, promoters, development (Lo´ pez, 1995; Sebastian et al., 2013) and alterna- and/or enhancers, suggesting that dynamic methylation upon tive splicing has been shown to trigger switches between acti- hematopoietic commitment occurs predominantly at cis-regula- vating and repressive TF isoforms (Taneri et al., 2004). We tested tory elements (Figure 6E). Together, these data support the for differential exon use for detection of alternative transcription hypothesis that many DMRs represent cis-regulatory elements start sites, alternative splicing, and alternative terminations sites in HSC-MPPs. (Anders et al., 2012) in the HSC MPP1–MPP4 transcriptome data. Among the 497 genes expressing transcript isoform vari- Gene Expression Anticorrelates Globally with Changes ants across HSC MPPs (FDR = 0.1), we identified 46 TFs (Fig- in DNA Methylation ure 5C; http://www-huber.embl.de/SMHSC/HSCDEU/overall/ Given the significant overlap between DMRs and functionally testForDEU.html; Supplemental Information). We observed that important regulatory regions of the genome, we postulated the first exon of the TF forkhead box J3, Foxj3, was included that the methylation changes could be associated with gene more frequently in the HSC transcripts compared with MPP1, expression during HSC differentiation. Indeed, we found that suggesting the expression of a specific Foxj3 isoform in HSCs DMRs coincided with previously identified cis-acting regulatory (Figure 5D). Although the role of this Foxj3 variant in hematopoi- sites at the Sfpi1 (Rosenbauer et al., 2004) and Gata2 (Gao esis is uncharacterized, an alternative splicing switch for another et al., 2013) loci (Figure S5F), which both encode TFs described forkhead family member, Foxp1, has recently been shown to as effectors of hematopoietic commitment. We generated gene regulate embryonic stem cell pluripotency and reprogramming sets based on DMRs that were stratified for methylation increase by changing its DNA-binding preference (Gabut et al., 2011). or decrease and interrogated the associated gene expression levels between HSCs and each of the MPP populations. DMRs Genome-wide DNA Methylation Analysis of HSCs/MPPs detected at the early transition of HSC to MPP1 were signifi- Identifies Candidate Regions for Epigenetic Regulation cantly associated with anticorrelated gene expression along To investigate the methylation status of all cytosine residues differentiation (Figure 6F). In addition, pairwise comparisons within the genome of HSCs and MPPs, we subjected R10,000 demonstrated a global anticorrelation between transcription FACS-sorted cells per biological replicate to TWGBS (Wang and DNA methylation (Figure 6G; Supplemental Information). et al., 2013)(Figure 6A; Figure S5A). Robust data were obtained Among the top anticorrelating genes between HSC-MPP1 (80; for all samples, with more than 6 3 108 reads and a combined Table S6), we found a number of well documented markers genomic cytosine-phosphate-guanine (CpG) coverage of more of early hematopoietic differentiation and HSC function (e.g. than 33-fold per population across the three biological replicates Hoxb2, Rorc, Cd34). For example, the increasing expression (Figure S5B). Global levels of DNA methylation ranged between of Cd34 and Wnt-inhibitory factor (Wif1) from HSC to MPP1 is

Cell Stem Cell 15, 507–522, October 2, 2014 ª2014 Elsevier Inc. 515 Cell Stem Cell Molecular Landscape of Early Hematopoiesis

HSC MPP1 MPP2 MPP3/4 HSC MPP1 MPP2 MPP3/4 A B C R1 R2 R3 R1 R2 R3 R1 R2 R3 R1 R2 R3 R1 R2 R3 R1 R2 R3 R1 R2 R3 R1 R2 R3

HSC MPP1 MPP2 MPP3/4 4,712 n=3 DNA methylation 10-30 ng DNA/replicate 1.0 HSC MPP1 MPP2 MPP3/4 1,121 1,412 1,874 0.8

Tagmentation-based whole- 0.6

genome bisulfite sequencing 7,411 0.4

(TWGBS) 0.2 100bp paired-end reads gain of methylation 0 6.7 - 19.9 x 108 aligned reads loss of methylation 12,013

DMRs (HSC – MPP1) DMRs (MPP2 – MPP3/4)

D 57.0% E F Promoter Enhancer 8.7% n=4,745 n=4,561 34.3% DHS (29.7%) (28.5%) TFBS n=14,617 n=8,798 (91.4%) 111 99 (55.0%)

778 12 23

HSC MPP1 MPP2 MPP3/4 3730 258 7 162

1313 1296 51 1553 2215

Intergenic 3474 Promoter Intragenic

Il4 Eomes HoxB Chr11: 96,270,000 G Lsr Tenc1 H I HSC MPP4 Slc9a3r2 Plekhg6 MPP1 MPP2 MPP3 Perp Clec14a Hoxb1* Hoxb2* Doc2b Flrt1 hylation Hoxb3* Pde1b Ndnf Hoxb4* Efna1 Cpne8 Hoxb5* Hoxb Hoxb6* Pdgfc Kcnc1 Hoxb7*

Rorc Cttnbp2nl Hoxb8 Relative Met Tinagl1 Sox7 Hoxb9* Gfi1 Hoxb2 Hoxb13 0.0 0.2 0.4 0.6 0.8 1.0 St8sia1 Dpy19l2 Hoxd1 Mir10a Hoxd3 b9 b8 b7 b6 b5 b4 b3 b2 b1 Hoxd8 Gene Haao Spns3 Hoxd9 Promoter Hoxd Hoxd10 DMR Mrvi1 Wdr16 Hoxd11 Enhancer Mc5r F2rl2 Hoxd12 Cd163 Zfat Hoxd13 Tex22 Sept3 Alx3 Endou −2 0 2 HoxD Chr2:74,660,000 Anp32e Rfc2 Z-score Cd34 Poc1a * DEG HSC-MPPs FDR=0.1 Hoxb1 with DMRs

Slc16a7 Zscan10 0.8 1.0 Hoxb7 no DMRs Wif1 Gfra1 Ltc4s Tyms Mrc2 Rabgap1l 0.4 0.6

Relative Methylation HSC MPP1

0.0 0.2 MPP2 1700109F18Rik MPP3/4 d13 d12 d11d10 d9 d8 d3 d1 b1 Hoxb1 Gene Promoter with DMRs DMR d7 Hoxd7 Enhancer no DMRs Transcription direction

Figure 6. Global DNA Methylation Analysis and Anticorrelation with Gene Expression (A) DNA methylome workflow. (B) Gain or loss of methylation DMRs between HSC-MPPs. The numbers indicate total DMRs. (C) Clustering of DMRs identified between HSC-MPP1 or MPP2-MPP3/MPP4. Each horizontal dash represents a DMR. R1–R3, replicates 1–3. (D) Overlap of DMRs with gene-centric Refseq genomic regions. (E) Percent overlap of all DMRs with experimentally defined functional genomic elements based on available data from the mouse ENCODE project. (F) Overall comparison of DNA methylation to gene expression. The box plots represent relative gene expression associated with either gain (left) or loss (right) of methylation in the transition from HSC and MPP1. (G) Pairwise comparison of DNA methylation to gene expression. The box plots represent log2 fold change of gene expression associated with either gain or loss of methylation in the transition from HSC to MPP1. The top 48 anticorrelated genes (24 highest/24 lowest expression in HSCs) are indicated. (H) Gene expression levels of all members of the HoxB and HoxD clusters. (I) Relative DNA methylation profile for the HoxB and HoxD clusters. Red arrows indicate examples of DMRs. See also Figure S5.

516 Cell Stem Cell 15, 507–522, October 2, 2014 ª2014 Elsevier Inc. Cell Stem Cell Molecular Landscape of Early Hematopoiesis

associated with decreasing methylation at the DMRs of these and/or have unknown functions. The lncRNAs H19 and Meg3 loci. In contrast, the cyclic AMP-mediating cyclic nucleotide that are highly expressed in HSCs compared with MPPs (Fig- phosphodiesterase 1b (Pde1b), as well as the retinoic acid re- ure 7C, validated by quantitative RT-PCR; Figure S6B) are core ceptor orphan receptor g(Rorc), become methylated and down- members of the transcriptional imprinted gene network (IGN), regulated during differentiation toward MPP1. which has been postulated to regulate embryonic growth (Var- Hox TF clusters are critical during normal and malignant he- rault et al., 2006). In adult hematopoiesis, the genetic deletion matopoiesis (Argiropoulos and Humphries, 2007), but little is of several members of this network affects HSC self-renewal known about their epigenetic regulation. We found that most integrity (e.g. Cdkn1c/p57KIP2)(Berg et al., 2011; Rossi et al., members of the HoxB family showed the highest expression in 2012). Therefore, we further investigated the expression of the HSCs and were associated with DMRs (seven out of nine; Fig- IGN members and found an overall strong expression in HSCs ures 6H and 6I). Notably, HSC differentiation showed a contin- but a steady decrease during the differentiation process (Fig- uous increase in DMR methylation and paralleled decrease of ure 7D), suggesting a contribution of the IGN in the maintenance RNA expression (Figure 6I, see arrows for Hoxb2 and Hoxb4). of self-renewal and/or quiescence of HSCs. In contrast, the members within the HoxD cluster showed no Finally, we interrogated the DMRs within the loci encoding the specific expression pattern in HSCs and MPPs, and none of its 682 quantified lncRNAs. This revealed a significant enrichment genes were associated with a DMR (Figure 6I). Similar results of DMRs in the differentially expressed lncRNAs. These DMRs were found for HoxA and HoxC clusters, respectively (Figures were enriched within a 10 kb window centered on the transcrip- S5G and S5H). In conclusion, the integrated analysis of the tion start site (Figure 7E). A notable example is the H19 locus, methylome and transcriptome provides a resource of genes which exhibits a DMR in an enhancer region outside of its whose expression is, at least partially, regulated by DNA methyl- imprinting control region (Figures 7F and 7G). The increasing ation and that are possibly involved in the hard wiring of the hier- level of methylation at this enhancer during the transition from archical organization of the HSC/progenitor populations. HSCs to MPPs correlates with decreasing expression during dif- ferentiation (Figure 7F). Overall, these data suggest that differen- Differential lncRNA Expression and the Imprinted Gene tial DNA methylation of regulatory regions is a likely mechanism Network in Stem/Progenitors by which lncRNA expression is controlled in HSCs and their Long noncoding RNAs (lncRNAs) have been implicated in guid- progeny. ing chromatin remodeling complexes to specific target genes and mediating gene activation or silencing by recruiting the DISCUSSION DNA demethylation machinery to the promoter (Arab et al., 2014) or recruiting repressive complexes such as PRC2, respec- In this study, we describe a combined proteome, transcriptome, tively (Rinn and Chang, 2012). However, little is known about and DNA methylome analysis of highly purified primary HSCs their functional role or regulation in stem cells and hematopoiesis and four downstream MPPs, which we characterized addition- (Paralkar and Weiss, 2013). We identified 682 lncRNAs ex- ally using in vitro and in vivo functional assays (Figure S7). Our pressed in HSC-MPPs, 79 of which were differentially expressed data sets uncover progressively changing cell type-specific across all five populations (FDR = 0.1; Figure 7A; Table S2). Un- methylation, gene, and protein expression landscapes starting supervised clustering (Figure 7B) and pairwise comparisons of with quiescent CD34-CD150+CD48-LSK HSCs that sit at the differentially expressed lncRNAs revealed the same population pinnacle of the hematopoietic hierarchy. These differentiate to- relationship as using the entire transcriptome data set, again ward slowly cycling multipotent MPP1, followed by multipotently placing MPP2 between the HSC-MPP1 and the MPP3-MPP4 cycling MPP2. The steady increase in the activity of the cell cycle populations (compare Figures 4E–4G and S6A). Next we clus- and proliferation machinery is paralleled by the robust upregula- tered the significantly changed lncRNAs based on their relative tion of the entire DNA repair machinery. This raises the possibility expression levels across all populations (Figure 7C; Table S4). that physiological DNA replication in proliferating early progeni- As shown in cluster 2, 12 lncRNAs are strongly expressed in tors generates significant replicative stress that needs to be HSCs compared with the rest of the populations, and none of counteracted by the activation of the DNA repair machinery to these has yet been functionally annotated or studied (e.g. ensure genome integrity (Bakker and Passegue´ , 2013). 2410080I02Rik, Gm12474). In addition, 14 lncRNAs are coex- We found that the majority of DMRs either progressively gain pressed in HSC-MPP1 (cluster 4), representing additional candi- or lose DNA methylation through early HSC differentiation. More- dates for regulation of self-renewal (e.g. H19, Malat1, Meg3). In over, we observed a global anticorrelation between DNA methyl- agreement, the imprinted H19 lncRNA has recently been shown ation and gene expression. Because this was observed at a to mediate HSC quiescence by inhibiting insulin growth factor global level by the whole genome DNA methylome analysis (IGF) signaling (Venkatraman et al., 2013). Thirteen lncRNAs (TWGBS) but not using previous approaches (array-based and were enriched in MPP3-MPP4, suggesting regulatory roles in RRBS) (Ji et al., 2010; Bock et al., 2012), our data suggest that these lineage-restricted populations (cluster 25; e.g. Gm568, many regulatory regions, including, e.g., distal enhancers critical Neat1). Although Neat1 is essential for the integrity of the nuclear for gene expression, are only covered by TWGBS analysis. substructure, it has also been linked to the immune response af- The observed high overall correlation between RNA and pro- ter HIV-1 infection (Zhang et al., 2013) and might, therefore, be tein levels suggests that posttranscriptional regulation is not a involved in the maintenance of immune regulatory circuits in predominant mechanism by which gene expression is regulated MPP3-MPP4. Notably, most of the differentially expressed in homeostatic HSCs. However, a small number of specific func- lncRNAs identified here have not been studied in hematopoiesis tions in the stem/progenitor compartment may be regulated in

Cell Stem Cell 15, 507–522, October 2, 2014 ª2014 Elsevier Inc. 517 Cell Stem Cell Molecular Landscape of Early Hematopoiesis

A C D

Mus musculus

E

B

F G

Figure 7. Expression Landscape of lncRNAs and the Imprinted Gene Network (A) Workflow for lncRNA analysis. (B) Clustering of 79 differentially expressed lncRNAs. (C) LncRNA expression clusters. Differentially expressed lncRNAs were grouped into 32 clusters based on higher expression (enriched) in one or several cell population(s) compared to mean expression level across all cell populations. Right, an example diagram for each of the most enriched clusters. Average RNA expression ± SD is shown. (D) Expression of imprinted gene network genes. (E) Differential DNA methylation of quantified lncRNAs. Top panel, comparison of DMRs between differentially and nondifferentially expressed lncRNAs. Bottom panel, heatmap showing DMRs in red. LncRNAs are ranked by increasing adjusted p value (gray scale). Distances are relative to transcription start sites. *p = 0.01–0.001; **p < 0.001. (F) Differential methylation of the lncRNA H19 locus. The H19 locus shows increasing methylation from HSC (red) to MPP1 (blue), MPP2 (green), and MPP3/4 (brown) in two DMRs. The inset shows a diagram depicting average H19 RNA expression (mean ± SD). (G) Differential methylation of H19 locus at enhancer region 3. Black and white dots represent methylated and unmethylated CpGs, respectively. The red box indicates DMR. See also Figure S6.

518 Cell Stem Cell 15, 507–522, October 2, 2014 ª2014 Elsevier Inc. Cell Stem Cell Molecular Landscape of Early Hematopoiesis

this way. For example, our results point to a potential posttran- In summary, the global signatures for stemness and multipo- scriptional regulation of glycolytic metabolism in HSCs via tency generated in this study represent not only a compre- Lin28b-Hmga-Igfbp2 signaling (Shyh-Chang and Daley, 2013), hensive reference but also suggest distinct areas of stem cell likely because of differences in relative protein synthesis rather regulation (progressive DNA methylation, alternative splicing, than degradation rate (Kristensen et al., 2013; Signer et al., and lncRNAs). This study significantly extends the current under- 2014). Moreover, our data suggest that HSC self-renewal and standing of HSC progenitor biology at the global level and quiescence are regulated by an interplay between the Lin28b- provides a solid basis for functional studies exploring the net- let7-Hmga-Igfbp2 axis, the IGN regulatory network, and HSC- works responsible for stem cell quiescence, self-renewal, and enriched pathways such as Wnt and retinoic acid (RA) signaling. differentiation. Although some of these pathways are also implicated in embry- onic HSC emergence from the hemogenic endothelium (Varrault EXPERIMENTAL PROCEDURES et al., 2006; Chanda et al., 2013; Copley et al., 2013), the individ- ual players and mechanisms governing HSC function remain to Animals be explored in the embryo and in the adult. As an example, the Eight- to twelve-week-old female C57BL/6 (CD45.2) mice purchased from Harlan Laboratories and B6.SJL-Ptprca Pepcb/BoyJ (CD45.1) animals pur- Lin28b target Igf2bp2 was found to be one of the most differen- chased from Charles River Laboratories were used throughout the study. tially expressed transcripts and proteins in HSCs. It is known to CD45.1/CD45.2 heterozygotes (F1) for transplant auxiliary bone marrow modulate expression of the lncRNA H19, leading to suppression were bred in-house at the Deutsches Krebsforschungszentrum. of proproliferative IGF signaling as well as Let7 miRNAs and has been suggested to mediate HSC quiescence (Runge et al., 2000; Bone Marrow Reconstitution Experiments Kallen et al., 2013; Venkatraman et al., 2013). In agreement, our Fifty (HSC, MPP1) or 2,000 cells (HSC, MPP2, MPP3, MPP4) were FACS- 3 5 data show increasing H19 enhancer methylation during the sorted and injected intravenously together with 2 10 supportive bone marrow (BM) cells (CD45.1/2) into lethally irradiated (2 3 450 Gy) (CD45.1) transition from HSC to MPP1, providing an explanation for the recipient mice. CD45.2-donor cells were monitored at 1, 3–4, 6, 9, 12, and release of HSCs out of quiescence associated with loss of self- 16 weeks posttransplantation. For secondary transplantations, whole BM renewal, which might be enhanced further by suppression of was isolated at 16 weeks posttransplant, and 1 3 106 cells were retrans- the IGN activity (i.e. p57; Zou et al., 2011; Tesio and Trumpp, planted into lethally irradiated CD45.1 recipient mice. Myeloid (CD11b+Gr1+) 2011). RA signaling is not only known to be critical for embryonic and lymphoid (T cells, CD4+CD8+; B cells, B220+) lineages were addressed HSC emergence (Chanda et al., 2013) but also for the regulation by FACS. of Hox gene expression by chromatin reorganization in embry- Proteomic Analysis onic stem cells (Kashyap et al., 2011). Because we observed FACS-sorted HSCs and MPP1 (4 3 105) were lysed, and proteins were ex- high expression and low DNA methylation of most members of tracted, reduced/alkylated, and digested with trypsin. Peptides were labeled the HoxA/B clusters in HSCs, it is plausible that RA signaling differentially with stable isotope dimethyl labeling on a column as described contributes to the control of the epigenetic landscape of HoxA/ previously (Boersema et al., 2009) and fractionated by OFFGEL isoelectric B transcription factors in HSCs. Moreover, because many Hox focusing (Agilent Technologies). In technical duplicates, peptides were sepa- genes are mutated in leukemias, these mechanisms may also rated by nanoflow ultra-high-performance liquid chromatography on a be relevant for leukemic stem cells (Alharbi et al., 2013). 120 min gradient and analyzed by electrospray ionization-tandem mass spec- trometry (ESI-MS/MS) on a linear trap quadrupole Orbitrap Velos or Orbitrap An unexpected finding is the degree of alternatively spliced Velos Pro (Thermo Fisher Scientific). MS raw data files were processed with transcript isoforms present in HSCs and their progeny. To MaxQuant (version 1.3.0.5) (Cox and Mann, 2008). The derived peak list was date, only rare cases of HSC regulation through alternative searched using the built-in Andromeda search engine (version 1.3.0.5) in splicing have been reported (Bowman et al., 2006). In this study, MaxQuant against the Uniprot mouse database (2013.02.20). A 1% FDR we identified almost 500 genes with alternative transcript iso- was required at both the protein level and the peptide level. Differential expres- form regulation. Although the underlying regulatory mechanism sion was assessed using the Limma package in R/Bioconductor (Gentleman et al., 2004; Smyth, 2004), and proteins with an adjusted p value of less than remains unknown, the lncRNA Malat1, which is highly expressed 0.1 were considered expressed differentially between HSC and MPP1. in HSCs, has been suggested to be a regulator of alternative splicing (Tripathi et al., 2010). In line with this, Malat1 has been RNA-seq implicated in multiple types of human cancer (Gutschner et al., Total RNA isolation was performed using an ARCTURUS PicoPure RNA isola- 2013), and numerous genetic mutations encoding factors of tion kit (Life Technologies, Invitrogen) according to the manufacturer’s instruc- the splicing machinery have been detected in patients with tions. Total RNA was used for quality controls and for normalization of the chronic lymphoid leukemia (Martı´n-Subero et al., 2013) and mye- starting material (Figure S3). cDNA libraries were generated with 10 ng of total RNA for HSC-MPP using the SMARTer Ultra Low RNA kit for Illumina lodysplastic syndrome, a disease derived directly from HSCs sequencing (Clontech Laboratories) according to the manufacturer’s indica- (Lindsley and Ebert, 2013; Medyouf et al., 2014). Our catalog tions. Sequencing was performed with a HiSeq2000 device (Illumina) and of splicing variants, with Foxj3 as an example of an HSC-specific one sample per lane. Sequenced read fragments were mapped to the mouse splice isoform, will serve as a starting point to explore this largely reference genome GRCm38 (ENSEMBL release 69) using the Genomic Short- uncharted area of regulation in HSCs and their immediate prog- Read Nucleotide Alignment program (version 2012-07-20). DESeq2 (Love eny. In addition to Malat1, we identified more than 70 differen- et al., 2014) and DEXSeq (Anders et al., 2012) were used to test for differential expression (FDR = 0.1) and differential exon use, respectively. tially expressed lncRNAs, the vast majority of which have no documented role in hematopoiesis. The variety of molecular TWGBS functions assigned to lncRNAs is expanding steadily, and their DNA methylation analysis using TWGBS (Adey and Shendure, 2012) was biological roles include regulation of genomic imprinting, differ- performed as described previously (Wang et al., 2013). Genomic mouse entiation, and self-renewal (Fatica and Bozzoni, 2014). DNA (10-30 ng) was used as input, and each sequencing library was subjected

Cell Stem Cell 15, 507–522, October 2, 2014 ª2014 Elsevier Inc. 519 Cell Stem Cell Molecular Landscape of Early Hematopoiesis

to 101 bp paired-end sequencing on a single lane of a HiSeq2000 device Adolfsson, J., Ma˚ nsson, R., Buza-Vidas, N., Hultquist, A., Liuba, K., Jensen, (Illumina). C.T., Bryder, D., Yang, L., Borge, O.-J., Thoren, L.A., et al. (2005). Animal procedures were performed according to protocols approved by the Identification of Flt3+ lympho-myeloid stem cells lacking erythro-megakaryo- German authorities (Regierungspra¨ sidium Karlsruhe [no. Z110/02, DKFZ nos. cytic potential a revised road map for adult blood lineage commitment. Cell 261, G175-12, and G140-13]). 121, 295–306. Alharbi, R.A., Pettengell, R., Pandha, H.S., and Morgan, R. (2013). The role of ACCESSION NUMBERS HOX genes in normal hematopoiesis and acute leukemia. Leukemia 27, 1000– 1008. Proteome data have been deposited to the ProteomeXchange Consortium Anders, S., Reyes, A., and Huber, W. (2012). Detecting differential usage of (http://proteomecentral.proteomexchange.org) via the Proteomics Identifica- exons from RNA-seq data. Genome Res. 22, 2008–2017. tions Database (PRIDE) partner repository (Vizcaı´no et al., 2013) with the data set identifier PXD000572. RNA-seq data are available in the ArrayExpress Arab, K., Park, Y.J., Lindroth, A.M., Scha¨ fer, A., Oakes, C., Weichenhan, D., database (http://www.ebi.ac.uk/arrayexpress) under accession number E- Lukanova, A., Lundin, E., Risch, A., Meister, M., et al. (2014). Long MTAB-2262. TWGBS data can be accessed under Gene Expression Omnibus Noncoding RNA TARID Directs Demethylation and Activation of the Tumor under accession no. GSE52709. Suppressor TCF21 via GADD45A. Mol. Cell, in press. Published online July 30, 2014. http://dx.doi.org/10.1016/j.molcel.2014.06.031. SUPPLEMENTAL INFORMATION Argiropoulos, B., and Humphries, R.K. (2007). Hox genes in hematopoiesis and leukemogenesis. Oncogene 26, 6766–6776. Supplemental Information includes Supplemental Experimental Procedures, Bakker, S.T., and Passegue´ , E. (2013). Resilient and resourceful: genome seven figures, six tables, one item and two online resources can be found maintenance strategies in hematopoietic stem cells. Exp. Hematol. 41, with this article online at http://dx.doi.org/10.1016/j.stem.2014.07.005. 915–923. Battista, S., Pentimalli, F., Baldassarre, G., Fedele, M., Fidanza, V., Croce, AUTHOR CONTRIBUTIONS C.M., and Fusco, A. (2003). Loss of Hmga1 gene function affects embryonic stem cell lympho-hematopoietic differentiation. FASEB J. 17, 1496–1498. N.C.-W., D.K., and A.T. designed and coordinated the study. A.T., J.K. (prote- ome), C.P. (methylome), M.D.M. (methylome), and W.H. (bioinformatics) de- Benveniste, P., Frelin, C., Janmohamed, S., Barbara, M., Herrington, R., signed and supervised the experiments and interpreted the data. J.H., D.K., Hyam, D., and Iscove, N.N. (2010). Intermediate-term hematopoietic stem and N.C.-W. performed the proteome experiments, bioinformatic analysis, cells with extended but time-limited reconstitution potential. Cell Stem Cell and/or data interpretation. N.C.-W., A.R., D.K., D.B.L., and J.H. performed 6, 48–58. the RNA-seq experiments, bioinformatic analysis, and/or data interpretation. Berg, J.S., Lin, K.K., Sonnet, C., Boles, N.C., Weksberg, D.C., Nguyen, H., D.B.L., Q.W., N.C.-W., D.K., A.R., D.W., A.L., D.B., L.G., C.H., S.H., Holt, L.J., Rickwood, D., Daly, R.J., and Goodell, M.A. (2011). Imprinted genes M.A.G.E., B.B., and R.E. performed the methylome experiments, bioinformatic that regulate early mammalian growth are coexpressed in somatic stem cells. analysis, and/or data interpretation. D.K., N.C.-W., and A.L. (methylome) per- PLoS ONE 6, e26410. formed FACS. N.C.-W. and D.K. coordinated the animal experiments and per- Bock, C., Beerman, I., Lien, W.H., Smith, Z.D., Gu, H., Boyle, P., Gnirke, A., formed reconstitution experiments, FACS analysis, qPCR, and CFU assays. Fuchs, E., Rossi, D.J., and Meissner, A. (2012). DNA methylation dynamics L.v.P., S.R., P.W., and P.Z. assisted with FACS analysis, reconstitution exper- during in vivo differentiation of blood and skin stem cells. Mol. Cell 47, iments, and qPCR. N.C.-W., D.K., J.H., D.B.L., and A.R., together with W.H., 633–647. J.K., C.P., M.D.M., and A.T., wrote the manuscript. Boersema, P.J., Raijmakers, R., Lemeer, S., Mohammed, S., and Heck, A.J. ACKNOWLEDGMENTS (2009). Multiplex peptide stable isotope dimethyl labeling for quantitative pro- teomics. Nat. Protoc. 4, 484–494. We thank A. Rathgeb from the Central Animal Laboratory; S. Schmidt, U. Ernst, Bowman, T.V., McCooey, A.J., Merchant, A.A., Ramos, C.A., Fonseca, P., A. Hotz-Wagenblatt, C. Previti, and S. Wolf from the Genomics and Proteomics Poindexter, A., Bradfute, S.B., Oliveira, D.M., Green, R., Zheng, Y., et al. Core Facility; S. Schmitt and G. de la Cruz from the Flow Cytometry Core Fa- (2006). Differential mRNA processing in hematopoietic stem cells. Stem cility at the DKFZ; and the EMBL Proteomics Core Facility for expert assis- Cells 24, 662–670. tance. We also thank M. Ba¨ hr and M. Helf for technical assistance. We further Chanda, B., Ditadi, A., Iscove, N.N., and Keller, G. (2013). Retinoic acid thank C.C. Oakes, A.M. Lindroth, and R. Claus for helpful discussions as well signaling is essential for embryonic hematopoietic stem cell development. as P. Komardin for support in handling data submission to the GEO database. Cell 155, 215–227. This work was supported by the BioRN Spitzencluster ‘‘Molecular and Cell Based Medicine’’ supported by the German Bundesministerium fu¨ r Bildung Cleynen, I., Brants, J.R., Peeters, K., Deckers, R., Debiec-Rychter, M., Sciot, und Forschung (to A.T.), by the EU FP7 Programs ‘‘EuroSyStem’’ (to A.T.) R., Van de Ven, W.J.M., and Petit, M.M.R. (2007). HMGA2 regulates transcrip- and ‘‘Radiant’’ (to W.H. and A.R.), by the Dietmar Hopp Foundation (to A.T.), tion of the Imp2 gene via an intronic regulatory element in cooperation with by the Sonderforschungsbereich SFB 873 funded by the Deutsche For- nuclear factor-kappaB. Mol. Cancer Res. 5, 363–372. schungsgemeinschaft (to A.T.), by a Vidi grant from the Netherlands Organiza- Copley, M.R., Babovic, S., Benz, C., Knapp, D.J.H.F., Beer, P.A., Kent, D.G., tion for Scientific Research (NWO) (to J.K.), by the DFG Priority Program Wohrer, S., Treloar, D.Q., Day, C., Rowe, K., et al. (2013). The Lin28b-let-7- SP1463 (to D.B.L. and C.P.), by an intramural grant from the DKFZ (Epigene- Hmga2 axis determines the higher self-renewal potential of fetal haemato- tics@DKFZ) (to D.B.L. and M.D.M.), and by a Humboldt research fellowship poietic stem cells. Nat. Cell Biol. 15, 916–925. for postdoctoral researchers (to Q.W.). Cox, J., and Mann, M. (2008). MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide pro- Received: February 11, 2014 tein quantification. Nat. Biotechnol. 26, 1367–1372. Revised: June 25, 2014 Dumon, S., Walton, D.S., Volpe, G., Wilson, N., Dasse´ , E., Del Pozzo, W., Accepted: July 18, 2014 Landry, J.-R., Turner, B., O’Neill, L.P., Go¨ ttgens, B., and Frampton, J. Published: August 21, 2014 (2012). Itga2b regulation at the onset of definitive hematopoiesis and commit- REFERENCES ment to differentiation. PLoS ONE 7, e43300. Ema, H., Morita, Y., Yamazaki, S., Matsubara, A., Seita, J., Tadokoro, Y., Adey, A., and Shendure, J. (2012). Ultra-low-input, tagmentation-based Kondo, H., Takano, H., and Nakauchi, H. (2006). Adult mouse hematopoietic whole-genome bisulfite sequencing. Genome Res. 22, 1139–1143. stem cells: purification and single-cell assays. Nat. Protoc. 1, 2979–2987.

520 Cell Stem Cell 15, 507–522, October 2, 2014 ª2014 Elsevier Inc. Cell Stem Cell Molecular Landscape of Early Hematopoiesis

Essers, M.A., Offner, S., Blanco-Bose, W.E., Waibler, Z., Kalinke, U., Martı´n-Subero, J.I., Lo´ pez-Otı´n, C., and Campo, E. (2013). Genetic and epige- Duchosal, M.A., and Trumpp, A. (2009). IFNalpha activates dormant haemato- netic basis of chronic lymphocytic leukemia. Curr. Opin. Hematol. 20, poietic stem cells in vivo. Nature 458, 904–908. 362–368. Fatica, A., and Bozzoni, I. (2014). Long non-coding RNAs: new players in cell McKinney-Freeman, S., Cahan, P., Li, H., Lacadie, S.A., Huang, H.-T., Curran, differentiation and development. Nat. Rev. Genet. 15, 7–21. M., Loewer, S., Naveiras, O., Kathrein, K.L., Konantz, M., et al. (2012). The tran- Forsberg, E.C., Serwold, T., Kogan, S., Weissman, I.L., and Passegue´ ,E. scriptional landscape of hematopoietic stem cell ontogeny. Cell Stem Cell 11, (2006). New evidence supporting megakaryocyte-erythrocyte potential of 701–714. flk2/flt3+ multipotent hematopoietic progenitors. Cell 126, 415–426. Medyouf, H., Mossner, M., Jann, J.C., Nolte, F., Raffel, S., Herrmann, C., Lier, Gabut, M., Samavarchi-Tehrani, P., Wang, X., Slobodeniuc, V., O’Hanlon, D., A., Eisen, C., Nowak, V., Zens, B., et al. (2014). Myelodysplastic cells in pa- Sung, H.K., Alvarez, M., Talukder, S., Pan, Q., Mazzoni, E.O., et al. (2011). tients reprogram mesenchymal stromal cells to establish a transplantable An alternative splicing switch regulates embryonic stem cell pluripotency stem cell niche disease unit. Cell Stem Cell 14, 824–837. and reprogramming. Cell 147, 132–146. Morishita, A., Zaidi, M.R., Mitoro, A., Sankarasharma, D., Szabolcs, M., Gao, X., Johnson, K.D., Chang, Y.I., Boyer, M.E., Dewey, C.N., Zhang, J., and Okada, Y., D’Armiento, J., and Chada, K. (2013). HMGA2 is a driver of tumor Bresnick, E.H. (2013). Gata2cis-element is required for hematopoietic stem metastasis. Cancer Res. 73, 4289–4299. cell generation in the mammalian embryo. J. Exp. Med. 210, 2833–2842. Orkin, S.H., and Zon, L.I. (2008). Hematopoiesis: an evolving paradigm for Gazit, R., Garrison, B.S., Rao, T.N., Shay, T., Costello, J., Ericson, J., Kim, F., stem cell biology. Cell 132, 631–644. Collins, J.J., Regev, A., Wagers, A.J., et al. (2013). Transcriptome analysis Osawa, M., Hanada, K., Hamada, H., and Nakauchi, H. (1996). Long-term lym- identifies regulators of hematopoietic stem and progenitor cells. Stem Cell phohematopoietic reconstitution by a single CD34-low/negative hematopoiet- Rev. 1, 1–15. ic stem cell. Science 273, 242–245. Gekas, C., and Graf, T. (2013). CD41 expression marks myeloid-biased adult Paralkar, V.R., and Weiss, M.J. (2013). Long noncoding RNAs in biology and hematopoietic stem cells and increases with age. Blood 121, 4463–4472. hematopoiesis. Blood 121, 4842–4846. Gentleman, R.C., Carey, V.J., Bates, D.M., Bolstad, B., Dettling, M., Dudoit, S., Ellis, B., Gautier, L., Ge, Y., Gentry, J., et al. (2004). Bioconductor: open soft- Pruitt, K.D., Tatusova, T., Klimke, W., and Maglott, D.R. (2009). NCBI ware development for computational biology and bioinformatics. Genome Reference Sequences: current status, policy and new initiatives. Nucleic Biol. 5, R80. Acids Res. 37, D32–D36. Gutschner, T., Ha¨ mmerle, M., and Diederichs, S. (2013). MALAT1 — a para- Purton, L.E., and Scadden, D.T. (2007). Limiting factors in murine hematopoi- digm for long noncoding RNA function in cancer. J. Mol. Med. 91, 791–801. etic stem cell assays. Cell Stem Cell 1, 263–270. Ji, H., Ehrlich, L.I., Seita, J., Murakami, P., Doi, A., Lindau, P., Lee, H., Aryee, Rinn, J.L., and Chang, H.Y. (2012). Genome regulation by long noncoding M.J., Irizarry, R.A., Kim, K., et al. (2010). Comprehensive methylome map of RNAs. Annu. Rev. Biochem. 81, 145–166. lineage commitment from haematopoietic progenitors. Nature 467, 338–342. Rosenbauer, F., Wagner, K., Kutok, J.L., Iwasaki, H., Le Beau, M.M., Okuno, Kallen, A.N., Zhou, X.B., Xu, J., Qiao, C., Ma, J., Yan, L., Lu, L., Liu, C., Yi, J.S., Y., Akashi, K., Fiering, S., and Tenen, D.G. (2004). Acute myeloid leukemia Zhang, H., et al. (2013). The imprinted H19 lncRNA antagonizes let-7 induced by graded reduction of a lineage-specific transcription factor, PU.1. microRNAs. Mol. Cell 52, 101–112. Nat. Genet. 36, 624–630. Kashyap, V., Gudas, L.J., Brenet, F., Funk, P., Viale, A., and Scandura, J.M. Rossi, L., Lin, K.K., Boles, N.C., Yang, L., King, K.Y., Jeong, M., Mayle, A., and (2011). Epigenomic reorganization of the clustered Hox genes in embryonic Goodell, M.A. (2012). Less is more: unveiling the functional core of hematopoi- stem cells induced by retinoic acid. J. Biol. Chem. 286, 3250–3260. etic stem cells through knockout mice. Cell Stem Cell 11, 302–317. Kent, D.G., Copley, M.R., Benz, C., Wo¨ hrer, S., Dykstra, B.J., Ma, E., Cheyne, Runge, S., Nielsen, F.C., Nielsen, J., Lykke-Andersen, J., Wewer, U.M., and J., Zhao, Y., Bowie, M.B., Zhao, Y., et al. (2009). Prospective isolation and Christiansen, J. (2000). H19 RNA binds four molecules of insulin-like growth molecular characterization of hematopoietic stem cells with durable self- factor II mRNA-binding protein. J. Biol. Chem. 275, 29562–29569. renewal potential. Blood 113, 6342–6350. Sanjuan-Pla, A., Macaulay, I.C., Jensen, C.T., Woll, P.S., Luis, T.C., Mead, A., Klimmeck, D., Hansson, J., Raffel, S., Vakhrushev, S.Y., Trumpp, A., and Moore, S., Carella, C., Matsuoka, S., Bouriez Jones, T., et al. (2013). Platelet- Krijgsveld, J. (2012). Proteomic cornerstones of hematopoietic stem cell differ- biased stem cells reside at the apex of the haematopoietic stem-cell hierarchy. entiation: distinct signatures of multipotent progenitors and myeloid Nature 502, 232–236. committed cells. Mol. Cell. Peoteomics 11, 1–54. Sebastian, S., Faralli, H., Yao, Z., Rakopoulos, P., Palii, C., Cao, Y., Singh, K., Koch, U., Wilson, A., Cobas, M., Kemler, R., Macdonald, H.R., and Radtke, F. Liu, Q.C., Chu, A., Aziz, A., et al. (2013). Tissue-specific splicing of a ubiqui- (2008). Simultaneous loss of beta- and gamma-catenin does not perturb tously expressed transcription factor is essential for muscle differentiation. hematopoiesis or lymphopoiesis. Blood 111, 160–164. Genes Dev. 27, 1247–1259. Kristensen, A.R., Gsponer, J., and Foster, L.J. (2013). Protein synthesis rate is Seita, J., and Weissman, I.L. (2010). Hematopoietic stem cell: self-renewal the predominant regulator of protein expression during differentiation. Mol. versus differentiation. Wiley Interdiscip. Rev. Syst. Biol. Med. 2, 640–653. Syst. Biol. 9, 689. Shah, S.N., Cope, L., Poh, W., Belton, A., Roy, S., Talbot, C.C., Jr., Sukumar, Lindsley, R.C., and Ebert, B.L. (2013). Molecular pathophysiology of myelo- S., Huso, D.L., and Resar, L.M. (2013). HMGA1: a master regulator of tumor dysplastic syndromes. Annu. Rev. Pathol. 8, 21–47. progression in triple-negative breast cancer cells. PLoS ONE 8, e63419. Lo´ pez, A.J. (1995). Developmental role of transcription factor isoforms gener- ated by alternative splicing. Dev. Biol. 172, 396–411. Shen, Y., Yue, F., McCleary, D.F., Ye, Z., Edsall, L., Kuan, S., Wagner, U., Dixon, J., Lee, L., Lobanenkov, V.V., and Ren, B. (2012). A map of thecis-reg- Love, M.I., Huber, W., and Anders, S. (2014). Moderated estimation of fold ulatory sequences in the mouse genome. Nature 488, 116–120. change and dispersion for RNA-Seq data with DESeq2. BioRxiv. Preprint. Published online February 19, 2014. http://dx.doi.org/10.1101/002832. Shyh-Chang, N., and Daley, G.Q. (2013). Lin28: primal regulator of growth and metabolism in stem cells. Cell Stem Cell 12, 395–406. Luis, T.C., Ichii, M., Brugman, M.H., Kincade, P., and Staal, F.J. (2012). Wnt signaling strength regulates normal hematopoiesis and its deregulation is Shyh-Chang, N., Zhu, H., Yvanka de Soysa, T., Shinoda, G., Seligson, M.T., involved in leukemia development. Leukemia 26, 414–421. Tsanov, K.M., Nguyen, L., Asara, J.M., Cantley, L.C., and Daley, G.Q. Ma˚ nsson, R., Hultquist, A., Luc, S., Yang, L., Anderson, K., Kharazi, S., (2013). Lin28 enhances tissue repair by reprogramming cellular metabolism. Al-Hashmi, S., Liuba, K., Thore´ n, L., Adolfsson, J., et al. (2007). Molecular Cell 155, 778–792. evidence for hierarchical transcriptional lineage priming in fetal and adult Signer, R.A., Magee, J.A., Salic, A., and Morrison, S.J. (2014). Haematopoietic stem cells and multipotent progenitors. Immunity 26, 407–419. stem cells require a highly regulated protein synthesis rate. Nature 509, 49–54.

Cell Stem Cell 15, 507–522, October 2, 2014 ª2014 Elsevier Inc. 521 Cell Stem Cell Molecular Landscape of Early Hematopoiesis

Smyth, G.K. (2004). Linear models and empirical Bayes methods for assessing lates an imprinted gene network critically involved in the control of embryonic differential expression in microarray experiments. Stat. Appl. Genet. Mol. Biol. growth. Dev. Cell 11, 711–722. 3, 1–25. Venkatraman, A., He, X.C., Thorvaldsen, J.L., Sugimura, R., Perry, J.M., Tao, Stamatoyannopoulos, J.A., Snyder, M., Hardison, R., Ren, B., Gingeras, T., F., Zhao, M., Christenson, M.K., Sanchez, R., Yu, J.Y., et al. (2013). Maternal Gilbert, D.M., Groudine, M., Bender, M., Kaul, R., Canfield, T., et al.; Mouse imprinting at the H19-Igf2 locus maintains adult haematopoietic stem cell ENCODE Consortium (2012). An encyclopedia of mouse DNA elements quiescence. Nature 500, 345–349. (Mouse ENCODE). Genome Biol. 13, 418. Vizcaı´no, J.A., Coˆ te´ , R.G., Csordas, A., Dianes, J.A., Fabregat, A., Foster, J.M., Sugimura, R., He, X.C., Venkatraman, A., Arai, F., Box, A., Semerad, C., Haug, Griss, J., Alpi, E., Birim, M., Contell, J., et al. (2013). The PRoteomics J.S., Peng, L., Zhong, X.B., Suda, T., and Li, L. (2012). Noncanonical Wnt IDEntifications (PRIDE) database and associated tools: status in 2013. signaling maintains hematopoietic stem cells in the niche. Cell 150, 351–365. Nucleic Acids Res. 41, D1063–D1069. Takubo, K., Nagamatsu, G., Kobayashi, C.I., Nakamura-Ishizu, A., Kobayashi, Wang, Q., Gu, L., Adey, A., Radlwimmer, B., Wang, W., Hovestadt, V., Ba¨ hr, H., Ikeda, E., Goda, N., Rahimi, Y., Johnson, R.S., Soga, T., et al. (2013). M., Wolf, S., Shendure, J., Eils, R., et al. (2013). Tagmentation-based whole- Regulation of glycolysis by Pdk functions as a metabolic checkpoint for cell genome bisulfite sequencing. Nat. Protoc. 8, 2022–2032. cycle quiescence in hematopoietic stem cells. Cell Stem Cell 12, 49–61. Weissman, I.L., and Shizuru, J.A. (2008). The origins of the identification and Taneri, B., Snyder, B., Novoradovsky, A., and Gaasterland, T. (2004). isolation of hematopoietic stem cells, and their capability to induce donor-spe- Alternative splicing of mouse transcription factors affects their DNA-binding cific transplantation tolerance and treat autoimmune diseases. Blood 112, domain architecture and is tissue specific. Genome Biol. 5, R75. 3543–3553. Tesio, M., and Trumpp, A. (2011). Breaking the cell cycle of HSCs by p57 and Wilson, A., Laurenti, E., Oser, G., van der Wath, R.C., Blanco-Bose, W., friends. Cell Stem Cell 9, 187–192. Jaworski, M., Offner, S., Dunant, C.F., Eshkind, L., Bockamp, E., et al. Tew, K.D., and Townsend, D.M. (2012). Glutathione-s-transferases as deter- (2008). Hematopoietic stem cells reversibly switch from dormancy to self- minants of cell survival and death. Antioxid. Redox Signal. 17, 1728–1737. renewal during homeostasis and repair. Cell 135, 1118–1129. Tickenbrock, L., Hehn, S., Sargin, B., Choudhary, C., Ba¨ umer, N., Buerger, H., Wilson, A., Laurenti, E., and Trumpp, A. (2009). Balancing dormant and self- Schulte, B., Mu¨ ller, O., Berdel, W.E., Mu¨ ller-Tidow, C., and Serve, H. (2008). renewing hematopoietic stem cells. Curr. Opin. Genet. Dev. 19, 461–468. Activation of Wnt signalling in acute myeloid leukemia by induction of Yaniv, K., and Yisraeli, J.K. (2002). The involvement of a conserved family of Frizzled-4. Int. J. Oncol. 33, 1215–1221. RNA binding proteins in embryonic development and carcinogenesis. Gene Tripathi, V., Ellis, J.D., Shen, Z., Song, D.Y., Pan, Q., Watt, A.T., Freier, S.M., 287, 49–54. Bennett, C.F., Sharma, A., Bubulya, P.A., et al. (2010). The nuclear-retained Zhang, Q., Chen, C.Y., Yedavalli, V.S., and Jeang, K.T. (2013). NEAT1 long noncoding RNA MALAT1 regulates alternative splicing by modulating SR noncoding RNA and paraspeckle bodies modulate HIV-1 posttranscriptional splicing factor phosphorylation. Mol. Cell 39, 925–938. expression. mBio. 4, e00596–00512. Trumpp, A., Essers, M., and Wilson, A. (2010). Awakening dormant haemato- Zou, P., Yoshihara, H., Hosokawa, K., Tai, I., Shinmyozu, K., Tsukahara, F., poietic stem cells. Nat. Rev. Immunol. 10, 201–209. Maru, Y., Nakayama, K., Nakayama, K.I., and Suda, T. (2011). p57(Kip2) and Varrault, A., Gueydan, C., Delalbre, A., Bellmann, A., Houssami, S., Aknin, C., p27(Kip1) cooperate to maintain hematopoietic stem cell quiescence through Severac, D., Chotard, L., Kahli, M., Le Digarcher, A., et al. (2006). Zac1 regu- interactions with Hsc70. Cell Stem Cell 9, 247–261.

522 Cell Stem Cell 15, 507–522, October 2, 2014 ª2014 Elsevier Inc.