bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Single-cell transcriptomics identifies CD44 as a new marker and regulator of haematopoietic stem cells development

Morgan Oatley1*, Özge Vargel Bölükbası1,2*, Valentine Svensson3,4,5, Maya Shvartsman1, Kerstin Ganter1, Katharina Zirngibl6, Polina V. Pavlovich1,7, Vladislava Milchevskaya6,8, Vladimira Foteva1, Kedar N. Natarajan3,9, Bianka Baying10, Vladimir Benes10, Kiran R. Patil6, Sarah A. Teichmann3 & Christophe Lancrin1,11

1 European Molecular Biology Laboratory, EMBL Rome - Epigenetics and Neurobiology Unit, Monterotondo, Italy. 2 Current address: Boston's Children Hospital/Harvard Medical School, Boston, USA. 3 Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, UK. 4 European Molecular Biology Laboratory, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK. 5 Current address: Pachter Lab, Caltech, California, USA. 6 European Molecular Biology Laboratory, Structural and Computational Biology Unit, Heidelberg, Germany 7 Moscow Institute of Physics and Technology, Institutskii Per. 9, Moscow Region, Dolgoprudny 141700, Russia. 8 Current address: Institut für Medizinische Statistik und Bioinformatik, Köln, Germany 9 Current address: The University of Southern Denmark, Danish Institute for Advanced Study, Department of Biochemistry and Molecular Biology, Odense, Denmark. 10 European Molecular Biology Laboratory, Genomics Core Facility, Heidelberg, Germany. 11 Correspondence: [email protected] * Co-first authors

1 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Abstract

The endothelial to haematopoietic transition (EHT) is the process whereby haemogenic endothelium differentiates into haematopoietic stem and progenitor cells (HSPCs). The intermediary steps of this process are unclear, in particular the identity of endothelial cells that give rise to HSPCs is unknown. Using single-cell transcriptome analysis and antibody screening we identified CD44 as a new marker of EHT enabling us to isolate robustly the different stages of EHT in the aorta gonad mesonephros (AGM) region. This allowed us to provide a very detailed phenotypical and transcriptional profile for haemogenic endothelial cells, characterising them with high expression of related to Notch signalling, TGFbeta/BMP antagonists (Smad6, Smad7 and Bmper) and a downregulation of genes related to glycolysis and the TCA cycle. Moreover, we demonstrated that by inhibiting the interaction between CD44 and its ligand hyaluronan we could block EHT, identifying a new regulator of HSPC development.

2 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Introduction

Understanding the developmental origin of haematopoietic stem and progenitor cells (HSPCs) is of critical importance to efforts to produce blood and blood products in vitro for medical applications. Haematopoietic stem and progenitor cells (HSPCs) originate from endothelial cells in the aorta gonad mesonephros (AGM) of mid- gestation 1,2. This process known as the endothelial to haematopoietic transition (EHT) requires drastic morphological changes that have been directly visualised through time-lapse imaging studies both in vitro and in vivo 3-6. EHT is a highly conserved process that has been studied across vertebrate models from xenopus and zebrafish to mice 7. Importantly, the human definitive blood system also has an endothelial origin 8. The best tools so far to detect endothelial cells with haemogenic capabilities rely on using fluorescent reporters under the control of Runx19 or Gfi1 10 regulatory elements, two keys transcription factors in the process of EHT. Cells expressing these transcription factors already co-express blood and endothelial genes. However, we still do not know the nature of endothelial cells which will acquire the expression of these transcription factors. We still do not know if any endothelial cells in the AGM can initiate the haematopoietic program or if a certain type of endothelial cells is primed to undergo the EHT. Despite our lack of characterisation of the definitive precursor to HSPC development, haemogenic endothelium (HE), recent advances have been made in terms of in vitro HSPC generation via an endothelial intermediate. Through the use of transcription factor cocktails both human pluripotent stem cells via a HE stage and adult mouse endothelial cells have been successfully reprogrammed into multi-potent, definitive haematopoietic stem cells (HSCs) 11,12. However, the use of endothelial populations in the process emphasises the importance of an improved understanding of HE. The early haematopoietic hierarchy has been described as a three-step process based on phenotypic characteristics. Specifically, Pro-HSC, Pre-HSC type I and type II populations have been defined based on their expression of the cell surface markers CD41, CD43 and CD45, as well as the time taken mature into definitive HSCs in OP9 co-culture 13. Recently, an in depth transcriptional investigation was performed on the

3 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

type I and type II Pre-HSC populations at day 11 of mouse embryonic development, however, the earlier stages of EHT remain largely uncharacterised 14. Indeed, gaining a solid understanding of HE and the initial steps that endothelial cells must take to become HSPCs has proved difficult in the absence of a robust marker. Through antibody screening and single-cell RNA sequencing (sc-RNA-seq) we identified CD44 as a novel marker of EHT, enabling the isolation of key cellular stages of blood cell formation in the embryonic vasculature. CD44 is a cell surface principally involved in the binding of the molecule hyaluronan 15. Its cell surface expression has been used to identify stem cell populations and has been strongly linked to the metastatic potential of many 16-20. While previous research has revealed the importance of CD44 in HSPCs migration to the bone marrow, the role of the receptor in early embryonic haematopoiesis has not been characterised 21. Using CD44 expression in conjunction with VE- (VE-Cad) and Kit we could clearly differentiate between vascular endothelium, HE, Pre-HSPC-I and Pre-HSPC-II more accurately than using the combination of VE-Cad, CD41, CD43 and CD45 markers. This has allowed us to perform extensive transcriptional profiling making it possible to characterise the very earliest changes in haematopoietic differentiation from endothelial cells. Moreover, by disrupting the interaction of CD44 and its ligand, we could inhibit EHT, demonstrating an unexpected role for CD44 in the emergence of HSPCs.

4 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Results

Antibody screening and single-cell RNAseq identifies CD44 as a potential marker of early haematopoietic fate It is well established that HSPCs have an endothelial origin within the 4-6,22. In order to better characterise the transition between endothelial and haematopoietic identity, we performed both an in vitro antibody screen and an in vivo single-cell RNAseq experiment to identify new markers allowing the identification of subpopulations within VE-Cad+ cells (Supplementary Fig. S1). Antibodies against 176 cell surface markers 23 were tested against the VE-Cad+ population generated from our in vitro ESC differentiation system into blood cells which recapitulates faithfully the early blood development process 24. Forty-two of these markers were expressed on VE-Cad+ cells (Supplementary Table S1). We looked for bimodal expression to separate distinct endothelial populations and identified a short-list of sixteen candidates to test in vivo including CD41 and CD117 (also known as Kit, a marker of HSPCs) already known to split VE-Cad+ cells 13. Eight of these markers (CD44, CD51, CD55, CD61, CD93, Icam1, Madcam1 and Sca1) were found to split the VE-Cad+ endothelial population of the AGM in two (Fig. 1a). In parallel, VE- Cad+ cells were isolated from the AGM region of E10.5 embryos and their transcriptional profiles analysed sc-RNA-seq (Fig. 1b-d). Clustering analysis identified a population with both haematopoietic and endothelial expression, distinct from the other endothelial population (Fig. 1d). Bioinformatics analysis showed that Cd44 is one of the best marker genes for this population of transitioning cells co-expressing endothelial and haematopoietic genes (Fig. 1c). The expression of Cd44 was also positively correlated with other known haematopoietic markers such as Runx1, Gfi1 and Adgrg1 (Gpr56) (Fig. 1d). Given the association of Cd44 with endothelial cells undergoing EHT at both the and mRNA level we decided to further investigate its role in embryonic haematopoiesis.

CD44 marks transitioning cells with differing morphology To validate our screening results and investigate the identity of CD44+ cells, we performed immunofluorescence and more detailed flow cytometry analysis on the

5 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

AGM region of mouse embryos at embryonic day 9.5 (E9.5) and 10.5 (E10.5) (Fig. 2). Immunofluorescence of cross-sections of mouse AGMs revealed that CD44 marked cells that were part of the vascular wall and cells that were incorporated in haematopoietic clusters (Fig 2a). Flow cytometry revealed that CD44 expression significantly increased in the VE-cad+ endothelium of the AGM between E9.5 and E10.5 when cells are undergoing EHT (Fig. 2b-c). Furthermore, by staining with an antibody against Kit (a marker of intra-aortic haematopoietic clusters), we found that the majority of cells with lower levels of CD44 expressed little or no Kit (Fig 2d). By grouping the cells based on their cell surface expression of CD44 and Kit, we found these populations to be significantly different in terms of cell size, suggestive of cells undergoing a morphological transition (Fig. 2e). Altogether, our results indicate that CD44 marks a subset of endothelial cells and cells in the haematopoietic clusters of the AGM during the key window of HSPCs development in the mouse embryo.

Single-cell q-RT-PCR analysis identifies three homogeneous CD44+ populations with an increasing haematopoietic profile Using the Biomark HD single-cell qPCR platform, we analysed the expression of 95 genes associated with both endothelial and haematopoietic cell types 24. We performed extensive transcriptional profiling on the CD44Neg, CD44LowKitNeg, CD44LowKitPos and CD44High populations identified (Fig. 2d) between E9.5 and E11.5 (342 cells in total). We found that the CD44LowKitPos and CD44High populations expressed numerous haematopoietic genes (Fig. 3a). The CD44LowKitPos cells also expressed many endothelial genes as in our sc-RNA-seq analysis (Fig. 1d), however the CD44High cells appeared to be more advanced in the EHT process and lacked endothelial (Fig. 3a). Of note, the CD44LowKitPos population expressed specifically Gfi1 and Itgb3 (coding for CD61) as well as the heptad of transcription factors (Gata2, Runx1, Lyl1, Erg, Fli1, Lmo2 and Tal1) whose simultaneous co-expression is responsible for the dual endothelial-haematopoietic identity of Pre-HSPCs (Supplementary Fig. S2) 24. Conversely, the CD44Neg and CD44LowKitNeg populations both showed specific endothelial gene signatures and lacked haematopoietic gene expression (Fig. 3a). Despite their different CD44 expression patterns, these cell populations clustered together (Supplementary Fig. S2). We repeated this experiment using a new selection

6 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

of 96 genes based on our sc-RNA-seq experiment (Supplementary Table S2). With our new gene list, we were able to confirm the dual endothelial-haematopoietic and haematopoietic identities of the CD44LowKitPos and CD44High populations, respectively. Surprisingly, the CD44Neg and CD44LowKitNeg populations formed two distinct clusters (Fig. 3b). In addition to Cd44, we found the genes Adgrg6, Emb, Fbn1, Pde3a, Plcg2, Serpinf1, Smad6, Smad7, Sox6 and Stxbp2 to be up-regulated in the CD44LowKitNeg cells compared to CD44Neg. In contrast, Bmp4, Kit, Hmmr and Pde2a were more expressed in CD44Neg endothelial cells (Fig. 3c). Although the four groups defined by VE-Cad, CD44 and Kit were confirmed to be distinct transcriptionally, our clustering analysis found a fifth population (SC_3) composed of cells from both CD44LowKitNeg and CD44LowKitPos groups (Fig. 3a). Its transcriptional profile was found to be intermediary between SC_2 (CD44LowKitNeg) and SC_4 (CD44LowKitPos) e.g. it expressed Adgrg1, Runx1, Itgb3 and Spi1 like SC_4 but was still expressing Adgrg6 and Pcdh12 like SC_2 (Fig. 3c). This suggests a developmental link between the CD44Low populations where CD44LowKitNeg cells would be the direct precursors of the CD44LowKitPos population which would then go on to generate CD44High cells. Interestingly, it is within this transitional SC_3 population that we saw the up regulation of Runx1, Spi1 and Gfi1, which are three of the four transcription factors used to reprogram adult mouse endothelial cells into HSCs 12.

VE-Cad, CD44 and Kit could segregate the earliest stages of haematopoietic development more accurately than the combination of VE-cad, CD41, CD43 and CD45 markers To place our results in context of known AGM subpopulations, we performed transcriptional analysis of the Pro-HSC, Pre-HSC type I, and Pre-HSC type II populations defined according to the combination of VE-cad, CD41, CD43 and CD45 markers (Fig. 4a) 13. Following hierarchical clustering, we found three clusters: the first mostly composed of Pro-HSCs, a second being a mix of Pro-HSCs and Pre-HSCs type I and a third one composed only of Pre-HSCs type II (Supplementary Fig. S3). We then performed a clustering analysis in conjunction with the populations defined by CD44 (Fig. 4b). This revealed that the SC_5 (CD44High) population closely associated with the Pre-HSC type II and the SC_4 (CD44LowKitPos) population with Pre-HSC type I and part of the Pro-HSCs. SC_2 (CD44LowKitNeg) clustered closely

7 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

with the remaining Pro-HSCs. Finally, only five cells with Pre-HSCs type I and Pro- HSCs phenotype clustered with the SC_1 (CD44Neg). Ninety-seven per cent of Pro- HSCs, Pre-HSCs type I, and Pre-HSCs type II were CD44 positive (Supplementary Fig. S3 and Fig. 4c). Overall, we have demonstrated that the phenotypes based on CD44 expression could allow us to isolate all key populations in the process of HSCs formation more accurately than the phenotypes previously described.

Bulk RNA-seq analysis further distinguishes CD44Neg and CD44LowKitNeg endothelial populations and identifies early changes in the differentiation process To further compare the two endothelial populations found in the AGM, we performed RNAseq on 25-cell bulk samples from these populations across three different time points (E9.5, E10 and E11). We analysed as well the more advanced stages in EHT: CD44LowKitPos at E9.5 and E10 and CD44High at E11. The bulk RNA- seq approach allowed us to detect low abundant genes (such as genes encoding transcription factors) more efficiently than sc-RNA-seq and also to measure smaller changes in gene expression between populations. The samples clustered according to their marker expression, despite the difference in developmental time, confirming our previous experiments (Fig. 5a). We identified several haematopoietic genes switched on in the CD44LowKitNeg population including, Ctsc, Nfe2, Runx1 and Ifitm1, suggesting that these cells are subjected to the EHT process (Fig. 5b). The expression pattern of endothelial genes fits with our previous observations – more highly expressed in CD44Neg and CD44LowKitNeg, moderately expressed in CD44LowKitPos and absent in CD44High (Fig 5b). Moreover, we used this dataset to check the expression of genes corresponding to we found in our antibody screen (Fig. 1a). Cd93 and Madcam1 showed an endothelial expression pattern and could be used to separate endothelial from blood cells in the AGM. In contrast, Itgb3 marks specifically the CD44LowKitPos population while Ly6e marks both CD44LowKitPos and CD44High populations as shown previously in Supplementary Fig. S2 and Fig. 3. Interestingly, this transcriptome analysis showed strong differences between the two endothelial populations of the AGM. We found 1605 genes differentially expressed between these two populations (p-value < 0.01, Wald test). Among them, several genes from the Notch pathway (Hey2, Jag1, Dll4, Hey1 and Notch1) were

8 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

significantly more expressed in the CD44LowKitNeg population compared to CD44Neg. Similarly, antagonists of the TGFbeta/BMP pathway, including Smad6, Smad7 and Bmper, were up-regulated in CD44LowKitNeg cells compared to CD44Neg ones. In contrast, target genes of the Wnt pathway (Lef1, Ccnd1 and Myc) were more highly expressed in CD44Neg compared to CD44LowKitNeg. From this analysis we could also identify specific markers for each of the endothelial populations (Fig. 5c). Nr2f2, Pde2a, Aplnr and Kcne3 marked specifically CD44Neg cells while Adgrg6, Hey2, Akr1c14, Fas and Ltbp1 strongly marked the CD44LowKitNeg population. Sc-RNA-seq done by another group investigated the different stages of EHT in the AGM 14, but they did not identify the CD44LowKitNeg population. However, the gene expression pattern they obtained for the three other populations was very similar to the one we found with our bulk RNA analysis (Supplementary Fig. S4). Our data supports the hypothesis that the CD44Neg cells and the CD44LowKitNeg cells belong to two distinct endothelial populations. The CD44Neg population expresses venous (Aplnr, Nr2f2 and Nrp2) and arterial (Sox17, Bmx and Efnb2) markers while the CD44LowKitNeg has a clear arterial signature with stronger expression of Bmx, Jag1 and Hey2. Moreover, there are distinct transcriptional links between the two CD44Low populations and the initiation of several blood markers already at the CD44LowKitNeg stage suggests that this population is in fact the endothelial precursor of CD44LowKitPos and CD44High cells, and hence of haematopoietic development.

The CD44Neg and CD44LowKitNeg endothelial populations feature distinct and autophagy signatures Remarkably, a large number of the differentially expressed genes between the two endothelial populations belonged to metabolic processes (395 out of 1605 genes; 1.32-fold enrichment, p-value <0.05, Fisher’s exact test). We therefore identified specific metabolic pathways distinguishing the two populations (Fig. 6a and Supplementary Table S3). Notably, the CD44LowKitNeg population showed a pronounced down regulation of genes coding for involved in glycolysis, TCA cycle and respiration, suggesting reduced ATP generation. Furthermore, and nucleotide biosynthesis genes were also down regulated. Altogether it suggests that the CD44LowKitNeg is a non-proliferative, metabolically rather quiescent state which is in line with the smaller size of this population compared to the CD44Neg

9 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

(Fig. 2e). Given that endothelial cells are known to obtain most of their energy from glycolysis 25, this change in metabolic status supports a loss of endothelial identity. These cells also show a marked increase in the expression of pathways leading to with regulatory function: glycerolipids, glycerophospholipids, phosphatidylinositol and sphingolipids (Fig. 6a). Interestingly, we observe that several genes involved in autophagy, a process known to be regulated by phosphatidylinositol and sphingolipids 26-29, are also highly upregulated in the CD44LowKitNeg population (Fig. 6b). Concordantly, two key processes accompanying autophagy, ubiquitylation and , are also upregulated. As autophagy has been shown to play a key role in embryonic development and haematopoiesis 30-32, this provides further support to the CD44LowKitNeg cells being in transit from endothelial to haematopoietic cells.

Runx1 is not required for the formation of CD44LowKitNeg endothelial cells CD44 has allowed us for the first time to clearly define the key VE-Cad+ populations in the AGM. The transcription factor RUNX1 is a key driver of HSPC development and is known to down-regulate endothelial identity through its target genes GFI1 and GFI1B 10,33. Next, we decided to evaluate the impact of Runx1 loss of function on the different CD44+ cells. Using a Runx1 knockout mouse model 34, we stained for VE-Cad, CD44 and Kit expression and performed transcriptional profiling on the sorted cells (Fig. 7). We found that in the absence of Runx1 there is a loss of CD44High and CD44LowKitPos cells and we observed a concomitant increase in the frequency of the CD44LowKitNeg population (Fig. 7a and 7b). Interestingly, we found no obvious transcriptional differences between the CD44LowKitNeg populations derived from Runx1+/+ versus Runx1-/- embryos indicating that Runx1 is not necessary for the formation of these endothelial cells but for the promotion of the transition into CD44LowKitHigh cells (Supplementary Fig. S5).

All CD44+ populations display haematopoietic potential To understand the haematopoietic potential of the different populations defined by VE-Cad, CD44 and Kit expression, we performed ex vivo assays using an OP9 co-culture system. No colonies were formed at the single-cell level from either the CD44LowKitNeg or the CD44Neg populations. However, by plating cells at a density

10 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

of 300 cells per well, we could observe round cell colony formation from the CD44LowKitNeg population but not from the CD44Neg (Fig. 8a). In contrast, using single-cell sorting, we found that the CD44LowKitPos population was the most potent with an average of 43% of single cells forming round cell colonies after three days of growth. Similarly, the CD44High population produced haematopoietic colonies but with a lesser frequency, on average 11% of single cells showed the ability to form haematopoietic colony on OP9 (Fig. 8b-c). To uncover the differentiation potential of the cells generated by CD44LowKitPos and CD44High, we performed CFU assays following the OP9 co-culture. Both populations readily generated both erythroid and myeloid colonies with the CD44LowKitPos population demonstrating a significantly higher capacity than the CD44High population (Fig. 8d-e). However, the four-fold increase in the number of CD44LowKitPos colonies on OP9 did not correspond to a four-fold increase in CFU colonies suggesting a higher replating efficiency of the CD44High cells compared to CD44LowKitPos (Fig 8f). We further tested the lymphoid potential of the CD44High population by growing cells for 21 days on either OP9 or OP9-DL1 with lymphoid promoting cytokines. We demonstrated that this population could give rise to both B and T cells ex vivo (Fig. 8h). Overall, we found that all populations expressing CD44 displayed haematopoietic differentiation capacity including CD44LowKitNeg reinforcing the differentiation link between them as suggested by the transcriptome analyses described previously.

Interrupting hyaluronan binding to the CD44 receptor inhibits the endothelial to haematopoietic transition ex vivo and in vitro So far, we demonstrated that CD44 was a very useful marker to distinguish the different stages of EHT. Although the CD44 knock-out mice do not have a severe haematopoietic phenotype 35, compensatory mechanisms through other Hyaluronan receptors (e.g. Hmmr 36 expressed by some CD44LowKitPos and CD44High cells in Fig. 3a) may be at play to diminish the consequences of CD44 loss of function. In order to explore the functional role of CD44 in EHT we employed a pharmacological approach. By treating CD44High sorted cells with a CD44 blocking antibody, we found that round cell colony formation could be inhibited in a dose dependent manner (Fig. 9a-b). The blocking antibody inhibited not only the number of colonies deriving from the ex vivo sorted cells but also the size of the colonies generated (Fig 9c).

11 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

To further investigate the role of CD44 in EHT we used an ESC in vitro differentiation system that mimics embryonic haematopoiesis. We performed a haemangioblast culture and analysed CD44 expression at day 1, 2 and 3. Endothelial cells expressed CD44 at all tested time points (Supplementary Fig. S6). We therefore applied the blocking antibody from the beginning of the culture and found that treatment with the antibody halved the number of HSPCs (VE-Cad- CD41+) produced and significantly increased the percentage of endothelial cells in the culture (Fig. 8d- e). Given that the inhibitory antibody is known to bind close to the hyaluronan binding site on the extracellular domain of CD44 29, we next attempted to manipulate the amount of hyaluronan in the culture. When 300 μg/mL of hyaluronidase was applied to the haemangioblast culture, we again observed a block in EHT characterised by a significant decrease in blood cell formation and an increase in the percentage of Pre-HSPCs (VE-Cad+ CD41+) (Fig. 8f). Using the 4MU inhibitor which blocks the synthesis of hyaluronan, we also observed the reduction of CD41+ cell number. Combining 4MU with hyaluronidase had a much more potent effect with a clear block in EHT at the Pre-HSPC stage. In conclusion, these results for the first time demonstrate a regulatory role for hyaluronan and its receptor CD44 in the formation of HSPCs.

12 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Discussion

Using antibody screening and sc-RNA-seq to dissect the endothelial populations in the AGM, we discovered that CD44 was a robust marker to distinguish all the main populations in the EHT process. Combining CD44 with Kit and VE-Cad allowed us to discriminate the different stages of EHT more accurately that the method based on VE-Cad, CD41, CD43 and CD45 cell surface markers 13. In addition, we showed that CD44 had an unexpected regulatory function in the EHT process. Our work has been instrumental in distinguishing the different types of endothelial cells in the AGM region. The CD44LowKitNeg has a gene expression signature strongly compatible with arterial identity (e.g. expression of Efbn2 and Sox17 and upregulation of Notch pathway target Hey2) while the CD44Neg cells co- expressed genes related to the venous (e.g. Nr2f2 and Aplnr) and arterial cell fates (e.g. Efnb2 and Sox17) (Fig. 5d). This co-expression of venous and arterial genes supports previous work indicating that the dorsal aorta can contribute to the formation of the cardinal vein 37. Another interesting finding was the striking metabolic state difference between the two endothelial populations. The CD44Neg cells appeared much more metabolically active than the CD44LowKitNeg suggesting that the latter population is becoming quiescent. It is surprising since the acquisition of a quiescent phenotype in endothelial cells occurs normally after birth following the completion of angiogenesis. FOXO1 has been described as an important transcription factor to induce quiescence in endothelial cells through suppression of Myc 38. While we observe specific down- regulation of Myc in CD44LowKitNeg cells, Foxo1 has a similar level of expression in the two endothelial populations. Recently, a study comparing lung endothelial cells between infant to adult mice showed that SMAD6 and SMAD7 higher expression in adult endothelial cells was linked with the induction of endothelial cell quiescence in adulthood 39. Interestingly, all the CD44LowKitNeg cells co-expressed Smad6 and Smad7 at the single-cell level suggesting that the quiescence phenotype we observed could also be linked to the co-expression of the two inhibitory Smads. The acquisition of quiescence in this context could be the first indication of the divergence from an endothelial phenotype. The metabolic distinction of the CD44LowKitNeg population was coupled with an increase in genes involved in

13 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

autophagy. This is in line with the known role of autophagy in embryogenesis, haematopoiesis and stem cell maintenance 30-32,40,41, such as protein and organelle turnover and protection from reactive oxygen species. Our data thus suggest that these cells mark the resetting of metabolic and regulatory states to initiate the EHT process. We consequently propose that the CD44LowKitNeg population is the source of HSPCs in the AGM, hence the haemogenic endothelium (Fig. 10). Although it was suggested that HE cells in the human pluripotent stem cells differentiation model do not display arterial identity 42, our results clearly support the arterial source of HSPCs in the AGM. Another characteristic for the initiation of the EHT is the inhibition of the BMP and TGFbeta pathways as suggested by the increased expression of Smad6, Smad7 and Bmper in the HE population. Expression of RUNX1 in endothelial cells would then trigger the haematopoietic cell fate. The dynamic interaction between the heptad of transcription factors GATA2, RUNX1, LYL1, ERG, FLI1, LMO2 and TAL1 co-expressed at the Pre-HSPC-I stage would eventually lead to the loss of endothelial gene expression 24 and give rise to the Pre-HSPC-II stage, cells expressing only haematopoietic genes. Our detailed transcriptomics analysis of hemogenic and non-hemogenic endothelial cell populations in the AGM revealed for the first time the prerequisites needed by endothelial cells to generate HSPCs. Thus, our study may help to design new protocols for the generation of HSCs ex vivo without the use of transcription factors, which still represents a considerable health risk. The manipulation of the interaction between CD44 and hyaluronan could offer a new strategy for reprogramming endothelial cells into HSPCs. In addition, our work could shed new on CD44 function in other cellular transitions such as the epithelial-mesenchymal transition occurring during 23. Given the role of CD44 in the metastatic process, it is possible that there is an overlap in its function for these transformations. Therefore, understanding the down- stream targets of the CD44-hyaluronan interaction could also have implications also for cancer biology.

14 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Methods

Timed mating and embryo dissection For timed pregnancies, C57BL/6 wild-type mice or Runx1+/- mice were mated overnight. Embryos were collected in PBS supplemented with 10% FBS (PAA Laboratories). The yolk sac of embryos derived from Runx1+/- mice were genotyped using the Kappa Mouse Genotyping Kit (KAPA Biosystems) according to the manufacturer’s instructions. E9.5 to E11.5 embryos were staged based on counts. After removing the yolk sac, the AGM region was dissected by removing the head and tail above and below the limb buds then removing the limb buds, organs and by cutting ventrally and dorsally either side of the AGM. All experiments were performed following the guidelines and regulations defined by the European and Italian legislations (Directive 2010/63/EU and DLGS 26/2014, respectively). They apply to foetal forms of mammals as from the last third of their normal development (from day 14 of gestation in the mouse). They do not cover experiments done with day 12 mouse embryos and at earlier stages. Therefore, no experimental protocol or license was necessary for the performed experiments. Mice were bred and maintained at the EMBL Rome Animal Facility in accordance with European and Italian legislations (EU Directive 634/2010 and DLGS 26/2014, respectively).

C1 Fluidigm chip and single-cell RNA sequencing AGM from E10 embryos were isolated and stained with anti-VE-Cadherin and anti-CD41 antibodies. VE-Cad+ cells from E10 embryos were isolated using FACS and mixed with Fluidigm suspension reagent (Fluidigm) in a 3:2 ratio. A primed Fluidigm C1 chip and 5ul of cell suspension was loaded onto a C1 instrument. Cell capture was then assessed with a 40x bright field microscope and wells scored as single-cell, doublet or debris. Lysis, reverse transcription and PCR reagents (Clontech Takara) along with ERCC spike-ins at 1 in 4000 dilution (Ambion) were added and mRNA-Seq RT & Amp script were performed overnight. The cDNA was then harvested and diluted in Fluidigm C1 DNA dilution buffer.

15 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Single-cell RNA sequencing data analysis The paired 2 x 101bp Illumina reads from the libraries were quantified using Salmon 43 with the setting -l IU to indicate library topology, and the optional flags -- posBias and --gcBias to account for coverage and amplification biases present in sc- RNA-seq protocols. As an index the cDNA annotation of Ensembl release 85 for GRCm38.p4 was used, together with ERCC spike-in sequences. The TPM values were rescaled to not include ERCC expression and only consider endogenous gene expression. Technical features of the data were compared with manual annotation of samples in C1 chambers through microscopy. Samples with less than 500,000 mapped reads and more than 30% mitochondrial content were discarded from analysis. This left 78 single cells in the in vivo experiment. To identify cells as endothelial or haematopoietic, expression levels of 10 known markers were analysed (Cdh5, Kdr, Pecam1, Pcdh12, Sox7, Gfi1, Gfi1b, Myb, Runx1, Spi1). Cells were clustered using Principal Component Analysis and a Gaussian Mixture Model with two components on these markers (Fig. 1b). A cluster of 10 cells considered haematopoietic was identified (and annotated based on high Runx1 expression). Analysis was performed using the decomposition.PCA and mixture.GaussianMixture classes in the scikit-learn package. PCA was performed on the log transformed TPM values of the markers, and the first two principal components were used for Gaussian Mixture. Novel markers for hematopoietic cells were identified using a likelihood ratio test, where the alternative model included a binary term for whether the cell was haematopoietic, and the null model just assumed a common mean for all the cells. The P-values from the likelihood ratio test were corrected for multiple testing by the Bonferroni procedure. The top differentially expressed genes were investigated to find markers which could be used in follow-up experiments, and Cd44 was considered a good candidate (Fig. 1c).

In vitro ES cell differentiation system The A2lox Empty Embryonic Stem (ES) cell line was maintained and differentiated as previously described 24. Briefly, the ES cells were maintained on mouse embryonic fibroblasts in DMEM knockout medium (Gibco) supplemented with 1% Pen/Strep (Gibco), 1% L-glutamine (Gibco), 1% NEAA, 15% FBS (Gibco),

16 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

LIF (EMBL Protein expression and purification core facility) and 2-Mercaptoethanol (Gibco). For differentiation the cells were plated on 0.1% gelatin for 24 hours in DMEM-ES cell medium and for 24 hours with IMDM instead on DMEM-knockout and in the absence of NEAA. To generate embryoid bodies (EBs) ES cells were transferred to petri dishes at a density of 0.3x106 cells per 10cm2 dish and plated in IMDM (Lonza) medium supplemented with 1% Pen/Strep, 1% L-glutamine (Gibco), 15% FBS, transferrin (Roche), MTG (Sigma) and 50mg/μL of ascorbic acid (Sigma). After 3 days in culture Flk1+ progenitor cells were isolated through magnetic activated cell sorting (MACS) using an anti-Flk1 APC conjugated antibody (eBiosciences) and anti-APC microbeads (Miltenyi Biotec). For the haemangioblast differentiation, Flk1+ cells were cultured on gelatin for 24 to 72 hours in IMDM medium supplemented with 1% Pen/Strep, 1% L-glutamine (Gibco), 15% FBS (Gibco), transferrin (Roche), MTG (Sigma) and 50mg/μL of ascorbic acid (Sigma), 15% D4T, 10μg/mL VEGF (R&D), 10μg/mL IL6 (R&D). The cells could then be harvested with TrypLE express and cell populations analysed by flow cytometry using anti-VE-Cad, anti-CD41 and anti-Kit antibodies (supplementary table 1). For in vitro CD44-hyaluronan interaction experiments Flk1+ cells were plated on gelatin, as per the haemangioblast differentiation protocol, and incubated with either 10μg/mL anti-CD44 antibody [KM201], 300μg/mL of hyaluronidase (Sigma H4272), 500μM of 4MU (sigma). Generation of haematopoietic and endothelial populations was assessed by flow cytometry after 48hrs in culture.

Antibody screen, Flow cytometry and cell sorting The antibody screen was performed using the Mouse Cell Surface Marker Screening Panel (BD Bioscience, see table below) according to the manufacturer’s instructions. Cells from haemangioblast culture were harvested with TrypLE (Gibco) at 37 degrees for 5 minutes and deactivated with PBS supplemented with 10% FBS. Dissected AGMs were dissociated with collagenase (Sigma) at 37 degrees for 30 minutes and the collagenase deactivated with PBS supplemented with 10% FBS (Gibco). Following the generation of single cell suspension, cells were stained with different combination of antibodies (see list below). Cells were washed, filtered and analysed using FACS Aria III (Becton Dickinson) and FACS Diva software. Data was later analysed using FlowJo v10.1r5 (Tree Star Inc.). For single-cell qPCR analysis

17 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

cells were sorted using an 85μm nozzle directly into reaction buffer (Cells Direct qRT-PCR Kit, Invitrogen) and snap frozen. For OP9 co-culturing assays cells were sorted using 100μm nozzle directly into a 96 well culture dish (Costar).

Antibody Reference Experiment Anti-Flk1-APC eBiosciences 17-5821-81 Antibody screen Haemangioblast assay Mouse Cell Surface BD Biosciences 562208 Antibody screening Marker Screening Panel (176 antibodies) anti-VE-Cadherin- eBiosciences 50-1441-82 Antibody screen eFluor660 Haemangioblast assay AGM FACS sort for sc- RNA-seq, sc-q-RT-PCR and bulk RNA-seq. AGM FACS analysis anti-CD41-PE eBiosciences 12-0411-82 Antibody screen Haemangioblast assay AGM FACS sort for sc- RNA-seq anti-CD44-PE BD Biosciences 553134 Antibody screen Haemangioblast assay AGM FACS sort for sc- RNA-seq and bulk RNA- seq anti-Kit-BV421 BD Biosciences 562609 AGM FACS sort for sc- RNA-seq, sc-q-RT-PCR and bulk RNA-seq. AGM FACS analysis anti-CD45-FITC BD Biosciences 553079 AGM FACS sort for sc-q- RT-PCR. AGM FACS analysis anti-CD43 PerCP-Cy5-5 BD Biosciences 562865 AGM FACS sort for sc-q-

18 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

RT-PCR. AGM FACS analysis anti-CD19 PE eBiosciences 12-0191-81 OP9 B-lymphoid culture anti-CD45 BV605 BD Biosciences 563053 OP9 B-lymphoid culture anti-CD11b APC eBiosciences 17-0112-81 OP9 B-lymphoid culture anti-CD8a FITC BD Biosciences 557688 OP9-Dl1 T-lymphoid culture anti-CD4 PE-Cy7 BD Biosciences 561099 OP9-Dl1 T-lymphoid culture anti-VE-Cadherin eBiosciences 14-1441-81 Immunofluorescence unconjugated anti-CD44 unconjugated Abcam ab157107 Immunofluorescence anti-CD44 unconjugated Abcam ab25340 Inhibition of CD44 and hyaluronan interaction

Immunofluorescence and confocal microscopy Mid-gestation embryos were dissected and fixed in 4% paraformaldehyde for 15 minutes at room temperature then incubated in 15% sucrose solution for 1 hour before freezing in OCT (-Tek). 10μm transverse cryo-sections of the AGM region were then placed on superfrost plus slides (Thermoscientific). Sections were washed in PBS, incubated in 1M glycine solution and permeabilised with 0.3% Triton X-100 (Sigma). Blocking solution consisting of 5% donkey serum, 5% chicken serum and 0.1% Tween-20 (Sigma) in TBS buffer was applied to sections for 2 hours at room temperature. Sections were incubated in primary antibodies overnight at 4°C and then washed. Secondary antibodies were applied for 1 hour at room temperature and washed before DAPI nuclear stain (Invitrogen) was applied for 15 minutes. Slides were washed before being mounted with Prolong gold (Life Technologies) and imaged on a Leica SP5 confocal microscope.

Single-cell q-RT-PCR Single cells were sorted directly into lysis buffer and snap frozen. Samples were reverse transcribed with superscript III reverse transcriptase from the Cells Direct one-step qRT-PCR Kit (Invitrogen) for 15 minutes at 50°C. The cDNA was

19 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

then pre-amplified for twenty cycles with 25nM final concentration of each outer- primer for a set of 96 target genes (Supplementary Table S2). The cDNA was then diluted with loading reagent (Fluidigm) and SoFastTM EvaGreen supermix (Biorad) and loaded onto a chip with (50μM) of inner primer mix. Amplification of the target genes was measured with the Fluidigm Biomark HD system with the Biomark Data Collection software and the GE96 x 96 + Meltv2.pcl program.

Single-cell qPCR data analysis Analysis of single-cell qPCR data was performed as previously described 24. Briefly, initial analysis was performed using the Fluidigm Real Time PCR analysis software. Hierarchical clustering and principal component analysis was performed using the SINGuLAR analysis toolkit (Fluidigm version 3.5) in R software (version 3.2.1).

Bulk RNA sequencing Bulk RNA sequencing was performed as described by the SmartSeq2 protocol 44. Briefly, 25 cells from each group (see table below) were FACS sorted directly into lysis buffer containing 0.2% Triton X-100, oligo-dT primers and dNTP mix, and then snap frozen. Reverse transcription was then performed, followed by pre-amplification for 14 cycles. Nextera libraries were then prepared and sequenced on the Illumina Next Seq sequencer.

E9.5 E10 E11 VE-CadPos CD44Neg 4 x 25-cells 3 x 25-cells 3 x 25-cells (CD44Neg) VE-CadPos CD44Low KitNeg 1 x 25-cells 2 x 25-cells 3 x 25-cells (CD44Low KitNeg) VE-CadPos CD44Low KitNeg 1 x 25-cells 2 x 25-cells Not done (CD44Low KitNeg) VE-CadNeg CD44High Not done Not done 3 x 25-cells (CD44High)

20 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Bulk RNA-seq data analysis Sequencing data was analysed with the aid of the EMBL Galaxy tools (galaxy.embl.de) 45 - specifically, FASTX for adaptor clipping, RNA STAR for mapping and htseq-count for obtaining raw gene expression counts. The R software (version 3.2.1, http://www.R-project.org) was then used to generate heatmaps and tSNE plots using the DESeq2, Scater, Biobase and pHeatmap packages.

OP9 co-culturing assays OP9 cells were maintained in MEM alpha medium (Gibco) with 20% FBS (ATCC 30-2020). For limiting dilution co-cultures cells were sorted directly onto a confluent OP9 stromal layer and incubated in medium conducive to haematopoietic development, IMDM (Lonza) treated with 5% penicillin-streptomycin (Gibco) and supplemented with 10% FBS (PAA Laboratories), L-glutamine, transferrin, MTG, ascorbic acid, LIF, 50ng/ml SCF, 25ng/ml IL3, 5ng/ml IL11, 10ng/ml IL6, 10ng/ml Oncostatin M, 1ng/ml bFGF. Round cell colonies were quantified after three to six days in culture. For ex vivo CD44 blocking antibody experiments 20 VE- Cad+CD44High cells were sorted, as per OP9 co-culturing assays, into a medium containing either no antibody, 5μg/mL or 10μg/mL of anti-CD44 antibody [KM201] (Abcam). The number of colonies were assessed after 3 days in culture.

Haematopoietic colony forming assay One hundred cells were initially sorted onto a confluent OP9 stromal layer as per OP9 co-culturing assay. After three days in culture cells were harvested with TrypLE express (Gibco) and colony-forming unit-culture (CFU-C) assays were initiated using Methocult complete medium (Stem Cell Technologies). Cells were grown in 35mm culture dishes and colonies quantified after 7 days.

Lymphocyte progenitor assay Fifty cells were sorted onto confluent OP9 or OP9-DL1 stromal layers in MEM-alpha medium (Gibco) and supplemented with growth factors conducive to lymphocyte development, 20% FBS (PAA Laboratories), 50ng/ml SCF, 5ng/ml Flt- 3L and 1ng/ml IL7. Medium was changed every 4-5 days and cells were split as necessary. Cells were cultured for 21 days before harvesting with TrypLE express for flow cytometry analysis.

21 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

References

1. Medvinsky, A. & Dzierzak, E. Definitive hematopoiesis is autonomously initiated by the AGM region. Cell 86, 897–906 (1996). 2. de Bruijn, M. F. T. R. et al. Hematopoietic stem cells localize to the endothelial cell layer in the midgestation mouse aorta. Immunity 16, 673–683 (2002). 3. Bertrand, J. Y. et al. Haematopoietic stem cells derive directly from aortic endothelium during development. Nature 464, 108–111 (2010). 4. Boisset, J.-C. et al. In vivo imaging of haematopoietic cells emerging from the mouse aortic endothelium. Nature 464, 116–120 (2010). 5. Kissa, K. & Herbomel, P. Blood stem cells emerge from aortic endothelium by a novel type of cell transition. Nature 464, 112–115 (2010). 6. Lancrin, C. et al. The haemangioblast generates haematopoietic cells through a haemogenic endothelium stage. Nature 457, 892–895 (2009). 7. Ciau-Uitz, A., Monteiro, R., Kirmizitas, A. & Patient, R. Developmental hematopoiesis: ontogeny, genetic programming and conservation. Experimental Hematology 42, 669–683 (2014). 8. Oberlin, E., Tavian, M., Blazsek, I. & Peault, B. Blood-forming potential of vascular endothelium in the human embryo. Development 129, 4147–4157 (2002). 9. Swiers, G. et al. Early dynamic fate changes in haemogenic endothelium characterized at the single-cell level. Nature Communications 4, 1–10 (2014). 10. Thambyrajah, R. et al. GFI1 proteins orchestrate the emergence of haematopoietic stem cells through recruitment of LSD1. Nat Cell Biol 18, 21– 32 (2016). 11. Sugimura, R. et al. Haematopoietic stem and progenitor cells from human pluripotent stem cells. Nature 545, 432–438 (2017). 12. Lis, R. et al. Conversion of adult endothelium to immunocompetent haematopoietic stem cells. Nature 545, 439–445 (2017). 13. Rybtsov, S. et al. Tracing the origin of the HSC hierarchy reveals an SCF- dependent, IL-3-independent CD43(-) embryonic precursor. STEMCR 3, 489– 501 (2014).

22 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

14. Zhou, F. et al. Tracing haematopoietic stem cell formation at single-cell resolution. Nature 533, 487–492 (2016). 15. Aruffo, A., Stamenkovic, I., Melnick, M., Underhill, C. B. & Seed, B. CD44 is the principal for hyaluronate. Cell 61, 1303–1313 (1990). 16. Takaishi, S. et al. Identification of gastric cancer stem cells using the cell surface marker CD44. STEM CELLS 27, 1006–1020 (2009). 17. Prince, M. E. et al. Identification of a subpopulation of cells with cancer stem cell properties in head and neck squamous cell carcinoma. Proceedings of the National Academy of Sciences 104, 973–978 (2007). 18. Lopez, J. I. et al. CD44 attenuates metastatic invasion during breast cancer progression. Cancer Research 65, 6755–6763 (2005). 19. Liu, C. et al. The microRNA miR-34a inhibits prostate cancer stem cells and metastasis by directly repressing CD44. Nature Medicine 17, 211–215 (2011). 20. Du, L. et al. CD44 is of functional importance for colorectal cancer stem cells. Clin. Cancer Res. 14, 6751–6760 (2008). 21. Avigdor, A. et al. CD44 and hyaluronic acid cooperate with SDF-1 in the trafficking of human CD34+ stem/progenitor cells to bone marrow. Blood 103, 2981–2989 (2004). 22. Zovein, A. C. et al. Fate tracing reveals the endothelial origin of hematopoietic stem cells. Cell Stem Cell 3, 625–636 (2008). 23. Pastushenko, I. et al. Identification of the tumour transition states occurring during EMT. Nature 556, 463–468 (2018). 24. Bergiers, I. et al. Single-cell transcriptomics reveals a new dynamical function of transcription factors during embryonic hematopoiesis. eLife 7, R106 (2018). 25. Quintero, M., Colombo, S. L., Godfrey, A. & Moncada, S. Mitochondria as signaling organelles in the vascular endothelium. Proceedings of the National Academy of Sciences 103, 5379–5384 (2006). 26. Yu, X., Long, Y. C. & Shen, H.-M. Differential regulatory functions of three classes of phosphatidylinositol and phosphoinositide 3-kinases in autophagy. Autophagy 11, 1711–1728 (2015). 27. Nascimbeni, A. C., Codogno, P. & Morel, E. Phosphatidylinositol-3-phosphate in the regulation of autophagy membrane dynamics. FEBS J. 284, 1267–1278 (2017).

23 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

28. Tan, X., Thapa, N., Liao, Y., Choi, S. & Anderson, R. A. PtdIns(4,5)P2 signaling regulates ATG14 and autophagy. Proc. Natl. Acad. Sci. U.S.A. 113, 10896–10901 (2016). 29. Dall'Armi, C., Devereaux, K. A. & Di Paolo, G. The role of lipids in the control of autophagy. Curr. Biol. 23, R33–45 (2013). 30. Riffelmacher, T. & Simon, A. K. Mechanistic roles of autophagy in hematopoietic differentiation. FEBS J. 284, 1008–1020 (2017). 31. Phadwal, K., Watson, A. S. & Simon, A. K. Tightrope act: autophagy in stem cell renewal, differentiation, proliferation, and aging. Cell. Mol. Life Sci. 70, 89–103 (2013). 32. Mizushima, N. & Levine, B. Autophagy in mammalian development and differentiation. Nat Cell Biol 12, 823–830 (2010). 33. Lancrin, C. et al. GFI1 and GFI1B control the loss of endothelial identity of hemogenic endothelium during hematopoietic commitment. Blood 120, 314– 322 (2012). 34. Pinheiro, I. et al. Discovery of a new path for red blood cell generation in the mouse embryo. bioRxiv 309054 (2018). doi:10.1101/309054 35. Schmits, R. et al. CD44 regulates hematopoietic progenitor distribution, granuloma formation, and tumorigenicity. Blood 90, 2217–2233 (1997). 36. Nedvetzki, S. et al. RHAMM, a receptor for hyaluronan-mediated motility, compensates for CD44 in inflamed CD44-knockout mice: A different interpretation of redundancy. Proceedings of the National Academy of Sciences 101, 18081–18086 (2004). 37. Lindskog, H. et al. Molecular identification of venous progenitors in the dorsal aorta reveals an aortic origin for the cardinal vein in mammals. Development 141, 1120–1128 (2014). 38. Wilhelm, K. et al. FOXO1 couples metabolic activity and growth state in the vascular endothelium. Nature 529, 216–220 (2016). 39. Schlereth, K. et al. The transcriptomic and epigenetic map of vascular quiescence in the continuous lung endothelium. eLife 7, 464 (2018). 40. Mortensen, M. et al. The autophagy protein Atg7 is essential for hematopoietic stem cell maintenance. Journal of Experimental Medicine 208, 455–467 (2011). 41. Ho, T. T. et al. Autophagy maintains the metabolism and function of young and old stem cells. Nature 543, 205–210 (2017).

24 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

42. Ditadi, A. et al. Human definitive haemogenic endothelium and arterial vascular endothelium represent distinct lineages. Nat Cell Biol 17, 580–591 (2015). 43. Patro, R., Duggal, G., Love, M. I., Irizarry, R. A. & Kingsford, C. Salmon provides fast and bias-aware quantification of transcript expression. Nature Methods 14, 417–419 (2017). 44. Picelli, S. et al. Full-length RNA-seq from single cells using Smart-seq2. Nature Protocols 9, 171–181 (2014). 45. Afgan, E. et al. The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2016 update. Nucleic Acids Research 44, W3–W10 (2016).

25 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Author Contributions Morgan Oatley, Conceptualization, Formal analysis, Investigation, Visualization, Writing—original draft, Writing—review and editing; Özge Vargel Bölükbası, Conceptualization, Formal analysis, Investigation, Visualization, Writing—review and editing; Valentine Svensson, Formal analysis, Visualization, Writing—review and editing; Maya Shvartsman, Investigation, Formal analysis, Writing—review and editing; Kerstin Ganter, Investigation, Visualization, Writing—review and editing; Katharina Zirngibl, Formal analysis, Visualization, Writing—review and editing; Polina V. Pavlovich, Formal analysis, Writing—review and editing; Vladislava Milchevskaya, Formal analysis, Writing—review and editing; Vladimir Foteva, Investigation, Writing—review and editing; Kedar N. Natarajan, Investigation, Writing—review and editing; Bianka Baying, Investigation, Writing—review and editing; Vladimir Benes, Supervision, Writing— review and editing; Kiran R. Patil, Supervision, Investigation, Writing— review and editing; Sarah A. Teichmann, Supervision, Investigation, Writing— review and editing; Christophe Lancrin, Conceptualization, Formal analysis, Supervision, Investigation, Visualization, Methodology, Writing—original draft, Project administration, Writing—review and editing.

Acknowledgments We thank Kalina Stantcheva and Cora Chadick (EMBL Rome FACS Facility, Italy) for cell sorting; Andreas Buness (EMBL Rome Bioinformatics Services, Italy) and Tallulah Andrews (Wellcome Trust Sanger Institute, UK) for help with bioinformatics analysis; Paul Collier, (EMBL Genomics Core Facility, Germany) for bulk RNA-seq; Gisela Luz (Patil Group, EMBL, Germany) for scientific illustration; Inke Nathke (University of Dundee, Scotland), Paul Heppenstall (EMBL, Italy) and Alexander Aulehla (EMBL, Germany) for fruitful scientific discussions. The European Molecular Biology Laboratory and the Wellcome Trust supported this work.

26 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Figure Legends

Figure 1: Search for a new marker to dissect the endothelial to hematopoietic transition (a) FACS plot of cells isolated from AGM region at E11, stained with VE-Cad and indicated cell surface markers selected from the antibody screen. (b) Principal Component Analysis of the single-cell RNA-seq data. Cells expressing hematopoietic genes are marked in red while the other cells are marked in green. (c) Volcano plot showing a selection of marker genes specific of the groups of cells expressing haematopoietic genes. The Cd44 gene is highlighted with a red circle. (d) Heatmap displaying the expression of a selection of genes in endothelial and haematopoietic clusters. Cd44 is highlighted in red. See also Supplementary Figure S1 and File S1.

Figure 2: CD44 splits the AGM VE-Cadherin+ cells in different populations with various morphologies (a) Immunofluorescence of VE-Cad (magenta) and CD44 (green) expression in a cross section of the AGM region of a wild-type embryo at E10 (32 somite pairs). Images 1 and 2 show higher magnification of the areas highlighted in the main image, showing CD44 marking endothelial cells in the vascular wall and cells making up a haematopoietic cluster. (b) FACS plots indicating percentage of cells expressing high levels of VE-Cad from dissected AGMs of wild-type embryos. The histograms indicate the percentage of VE-CadHigh cells positive for CD44 at both E9.5 (28 somite pairs) and E10.5 (35 somite pairs). (c) Percentage of CD44+ cells within the VE- CadHigh fraction, each data point represents an independent experiment, E9.5 n = 4 and E10.5 n = 5. Significance was determined by two-tailed, independent t-test. (d) FACS plots of VE-Cad and CD44 expression in the AGM region of wild-type embryos at E10 (30-34 somite pairs). Expression of Kit cell surface marker is highlighted for the CD44Low population. (e) Mean FSC-A as an indication of cell size is plotted for each population (CD44Neg, CD44LowKitNeg, CD44LowKitPos, CD44High) for five independent experiments on a litter of E10 wild-type embryos. Significance was determined by two-tailed, paired t-tests.

27 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Figure 3: The VE-Cad+ subpopulations defined by CD44 are transcriptionally distinct (a) Single cells from each VE-Cad+ populations were isolated and tested for the expression of 96 genes by single-cell q-RT-PCR. The heatmap shows the result of the hierarchical clustering analysis (cells were clustered by Euclidian distance). A selection of genes has been highlighted: endothelial genes group, blood genes groups I, II and III. (b) tSNE plot from single-cell q-RT-PCR data shown in (a). Five groups are indicated. (c) Heatmap showing average expression of endothelial (blue), haematopoietic (red) and various (black) genes in the indicated groups (genes are clustered using Pearson correlation). The number of cells for each cluster is indicated at the bottom of the panel. See also Supplementary Fig. S2 and File S3.

Figure 4: Comparison of the CD44 populations to Pro-HSC, Pre-HSC type I and type II. (a) FACS plots of VE-Cad, CD45, CD43 and CD41 expression in the AGM region at E10 (31-32 somite pairs). Single cells from Pro-HSCs (VE-Cad+ CD41+ CD45- CD43- ), Pre-HSCs type I (VE-Cad+ CD41+ CD45- CD43+), Pre-HSCs type II (VE-Cad+ CD45+) populations were isolated. (b) tSNE plot from single-cell q-RT-PCR data shown in (a). (c) Heatmap showing average expression of endothelial (blue), haematopoietic (red) and various (black) genes in the indicated groups (genes are clustered using Pearson correlation). The number of cells for each cluster is indicated at the bottom of the panel. See also Supplementary Fig. S3 and File S4.

Figure 5: Bulk RNA sequencing further distinguishes CD44Neg and CD44LowKitNeg endothelial populations and identifies early changes in the differentiation process (a) tSNE plot from 25-cell bulk RNA sequencing generated from E9.5, E10 and E11 AGM. Four groups are indicated. (b) Heatmap of gene expression highlighting a selection of genes. (c) Heatmap of gene expression highlighting the most differentially expressed genes between CD44Neg and CD44LowKitNeg endothelial populations (p-value < 0.01). (d) Heatmap of gene expression highlighting arterial (Efbn2, Gja5, Sox17, Bmx and Hey2) and venous (Aplnr, Nr2f2, Nrp2 and Ephb4)

28 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

coding genes in CD44LowKitNeg and CD44Neg populations. See also Supplementary File S5 and Fig. S4.

Figure 6: CD44LowKitNeg endothelial cells feature altered expression of genes involved in metabolism and autophagy (a) Overview of key metabolic nodes and pathways enriched in differentially expressed genes when comparing the CD44LowKitNeg and CD44Neg endothelial populations. These were selected based on reporter metabolite analysis. Pathway boxes summarize multiple genes / reporter metabolites (Supplementary Table S3). The mentioned upregulated and downregulated genes refer to the expression in the CD44LowKitNeg population compared to CD44Neg. (b) Schematic representation of the autophagy process marking differentially expressed genes. Included are genes coding for structural as well as regulatory aspects of autophagy. The colour code is the same as in (a).

Figure 7: Runx1 is required for the generation of CD44LowKitPos and CD44High cells but not for the formation of the CD44LowKitNeg population (a) FACS plots of VE-Cad and CD44 expression in the AGM region at E10.5 from Runx1+/+ (left) and Runx1-/- (right) embryos. (b) tSNE plots from single-cell q-RT- PCR data shown in (a). See also Supplementary Fig. S5.

Figure 8: All CD44+ populations have haematopoietic potential (a) Images of OP9 co-cultures after 4 and 6 days of incubation. Haematopoietic potential was observed from CD44LowKitNeg cells with colonies of round cells resulting from 300 cells sorted per well. No round cell colonies were observed with CD44Neg cells. (b) Images of OP9 co-culture after 3 days of culture are shown. A single CD44High or CD44LowKitPos cell was FACS sorted onto a confluent OP9 stromal layer and incubated in HE medium. (c) The percentage of single cells giving rise to colonies was quantified across four independent experiment. The graph compares the frequency of growth from single CD44High cells with CD44LowKitPos cells. Statistical significance was determined by a two-tailed, paired t-test. (d) Colony forming unit assays were performed following three days OP9 culture of CD44LowKitPos and CD44High cells (100 cells per well). Cells were kept for a further 7 days in methocult medium before quantification. Images show representative CFU-E and CFU-GEMM

29 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

colonies. (e) The bar graphs show the number of CFUs generated per 100 initial FACS sorted cells. Error bars indicate standard deviation (n=4). Significance was determined by two-tailed, paired t-tests. (f) The bar graph indicates the total number of colony forming units formed per initial colony grown on the OP9 stromal layer. Although the CD44LowKitPos population gives rise to approximately 6 times more round cell colonies on OP9 then the CD44High population, only approximately 2.5 times more CFUs are generated. (g) B and T cell lymphoid assays were performed following 21 days of OP9 (B-Cells) and OP9-DL1 (T-cells) culture of 50 FACS sorted CD44High cells. Percentages of CD19+ (B-cells) and CD4+CD8a+ (immature T- cells) are shown.

Figure 9: Blocking the interaction between CD44 and its ligand hyaluronan inhibits the endothelial-haematopoietic transition. (a) Images of round cell colonies generated from CD44High cells after 4 days of OP9 co-culture with different concentrations of KM201 anti-CD44 blocking antibody. Dotted line indicates approximate size of colonies. Scale bar corresponds to 100 μm. (b) Dot plot comparing number of round cells colonies formed as a function of the concentration of anti-CD44 blocking antibody applied. (c) Dot plot indicating the number of cells per colony as a function of the concentration of anti-CD44 blocking antibody applied. (b-c) Significance was determined by two-tailed, independent t-test where * indicates p-value < 0.05, ** indicates p-value < 0.01 and *** indicates p- value < 0.001, n = 3. (d) Representative FACS plots of VE-Cad and CD41 expression after two days of haemangioblast differentiation. Cells were either untreated (control) or treated with anti-CD44 blocking antibody, hyaluronidase , 4MU hyaluronan synthase inhibitor or a combination (e) Dot plot showing the population percentage for vascular (VSM)(VE-Cad-CD41-), endothelial cells (VE-Cad+CD41-), Pre-HSPCs (VE-Cad+CD41+) and HSPCs (VE-Cad-CD41+) after two days of haemangioblast differentiation, summarising the results of FACS analysis shown in (d) (n ≥ 4). Significance was determined by two-tailed, independent t-tests where * indicates p-value < 0.05, ** indicates p-value < 0.01 and *** indicates p- value < 0.001. See also Supplementary Fig. S6.

30 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Figure 10: New model for the progression of EHT. Scheme summarising the findings of the present study. We identified two different sets of endothelial cell populations in the AGM with very distinct properties in term of signalling pathways and metabolic states. Expression of Runx1 in the haemogenic endothelium population triggers the upregulation of haematopoietic genes and the formation of the Pre-HSPC-I which co-expresses endothelial and haematopoietic genes. Continuous expression of haematopoietic genes and interaction between CD44 and hyaluronan eventually lead to the loss of endothelial genes and the formation of Pre-HSPC-II expressing only haematopoietic genes.

31 FigurebioRxiv 1: Search preprint doi: for https://doi.org/10.1101/338178 a new marker to; this dissect version posted the June endothelial 6, 2018. The copyright to holder haematopoietic for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under transition aCC-BY-NC-ND 4.0 International license. a CD44 CD51 CD55 CD61 0.515 0.669 0.238 1.93 0.665 1.49 1.58 0.721 105 105 105 105

104 104 104 104

103 103 103 103 : ve-cadh

102 102 102 102

0 0 0 0 86.5 12.3 11.4 86.4 82.1 15.7 81.5 16.2

0 102 103 104 105 0 102 103 104 105 0 102 103 104 105 0 102 103 104 105 : CD44 Icam1 Madcam1 Sc a 1 VE-Cad CD93 0.225 2 0.125 2.2 0.392 1.92 1.98 0.195 105 105 105 105

104 104 104 104

103 103 103 103

102 102 102 102

0 0 0 0 74.5 23.2 89.6 8.06 88.5 9.19 93.9 3.91

0 102 103 104 105 0 102 103 104 105 0 102 103 104 105 0 102 103 104 105 Antibody

b Clustering based on marker genes c Volcano plot - Rest versus Haematopoietic Haematopoietic (q value) (q PC2 10 -Log

PC1 Effect size (+ = Haematopoietic) d

GroupID 3.5GroupID Cluster Endo

Gene Expression Expression Gene Log EmcnEmcn Blood GroupID 3.5 GroupID 3 Ramp2Ramp2 Endo

10 Pecam1 Emcn Blood Pecam1 2.5 3 3 (1+TPM) KdrKdr Ramp2 2 TekTek Pecam1 2.5 Cdh5Cdh5 1.5 Kdr Col4a2Col4a2 1 2 Sox7Sox7 Tek Pcdh12 Pcdh12 0.5 Cdh5 1.5 Acvrl1Acvrl1 0 Col4a2 PtprbPtprb Adgrg1 1 Adgrg1 Sox7 1 Cd44Cd44 Pcdh12 Nfe2Nfe2 0.5 Coro1a Acvrl1 Coro1a Runx1Runx1 0 Ptprb Ikzf2Ikzf2 Adgrg1 Cluster Lgals9Lgals9 Ptk2bPtk2b Cd44 Endothelial Gfi1Gfi1 Nfe2 Haematopoietic Dok2Dok2 Coro1a Rab38Rab38 Gm28112Gm28112 Runx1 OclnOcln Ikzf2 MybMyb Krt18 Lgals9 Krt18 Ikzf1Ikzf1 Ptk2b Spi1Spi1 Gfi1 Prtn3Prtn3 HlfHlf Dok2 Mctp1Mctp1 Rab38 AI467606AI467606 Gm28112 Gfi1bGfi1b EmbEmb Ocln sc1_cell1_salmon_out sc1_cell71_salmon_out sc1_cell53_salmon_out sc1_cell61_salmon_out sc1_cell84_salmon_out sc1_cell90_salmon_out sc1_cell10_salmon_out sc1_cell58_salmon_out sc1_cell92_salmon_out sc1_cell17_salmon_out sc1_cell2_salmon_out sc1_cell25_salmon_out sc1_cell36_salmon_out sc1_cell34_salmon_out sc1_cell12_salmon_out sc1_cell65_salmon_out sc1_cell32_salmon_out sc1_cell52_salmon_out sc1_cell87_salmon_out sc1_cell86_salmon_out sc1_cell20_salmon_out sc1_cell57_salmon_out sc1_cell38_salmon_out sc1_cell68_salmon_out sc1_cell50_salmon_out sc1_cell56_salmon_out sc1_cell22_salmon_out sc1_cell33_salmon_out sc1_cell55_salmon_out sc1_cell51_salmon_out sc1_cell19_salmon_out sc1_cell4_salmon_out sc1_cell63_salmon_out sc1_cell94_salmon_out sc1_cell47_salmon_out sc1_cell48_salmon_out sc1_cell96_salmon_out sc1_cell23_salmon_out sc1_cell78_salmon_out sc1_cell79_salmon_out sc1_cell27_salmon_out sc1_cell67_salmon_out sc1_cell44_salmon_out sc1_cell5_salmon_out sc1_cell62_salmon_out sc1_cell80_salmon_out sc1_cell43_salmon_out sc1_cell69_salmon_out sc1_cell64_salmon_out sc1_cell66_salmon_out sc1_cell13_salmon_out sc1_cell49_salmon_out sc1_cell54_salmon_out sc1_cell60_salmon_out sc1_cell40_salmon_out sc1_cell28_salmon_out sc1_cell42_salmon_out sc1_cell76_salmon_out sc1_cell26_salmon_out sc1_cell29_salmon_out sc1_cell18_salmon_out sc1_cell82_salmon_out sc1_cell93_salmon_out sc1_cell70_salmon_out sc1_cell46_salmon_out sc1_cell74_salmon_out sc1_cell16_salmon_out sc1_cell14_salmon_out sc1_cell45_salmon_out sc1_cell35_salmon_out sc1_cell77_salmon_out sc1_cell8_salmon_out sc1_cell11_salmon_out sc1_cell15_salmon_out sc1_cell6_salmon_out sc1_cell30_salmon_out sc1_cell39_salmon_out sc1_cell81_salmon_out

Myb

Krt18

Ikzf1

Spi1

Prtn3

Hlf

Mctp1

AI467606

Gfi1b

Emb sc1_cell1_salmon_out sc1_cell71_salmon_out sc1_cell53_salmon_out sc1_cell61_salmon_out sc1_cell84_salmon_out sc1_cell90_salmon_out sc1_cell10_salmon_out sc1_cell58_salmon_out sc1_cell92_salmon_out sc1_cell17_salmon_out sc1_cell2_salmon_out sc1_cell25_salmon_out sc1_cell36_salmon_out sc1_cell34_salmon_out sc1_cell12_salmon_out sc1_cell65_salmon_out sc1_cell32_salmon_out sc1_cell52_salmon_out sc1_cell87_salmon_out sc1_cell86_salmon_out sc1_cell20_salmon_out sc1_cell57_salmon_out sc1_cell38_salmon_out sc1_cell68_salmon_out sc1_cell50_salmon_out sc1_cell56_salmon_out sc1_cell22_salmon_out sc1_cell33_salmon_out sc1_cell55_salmon_out sc1_cell51_salmon_out sc1_cell19_salmon_out sc1_cell4_salmon_out sc1_cell63_salmon_out sc1_cell94_salmon_out sc1_cell47_salmon_out sc1_cell48_salmon_out sc1_cell96_salmon_out sc1_cell23_salmon_out sc1_cell78_salmon_out sc1_cell79_salmon_out sc1_cell27_salmon_out sc1_cell67_salmon_out sc1_cell44_salmon_out sc1_cell5_salmon_out sc1_cell62_salmon_out sc1_cell80_salmon_out sc1_cell43_salmon_out sc1_cell69_salmon_out sc1_cell64_salmon_out sc1_cell66_salmon_out sc1_cell13_salmon_out sc1_cell49_salmon_out sc1_cell54_salmon_out sc1_cell60_salmon_out sc1_cell40_salmon_out sc1_cell28_salmon_out sc1_cell42_salmon_out sc1_cell76_salmon_out sc1_cell26_salmon_out sc1_cell29_salmon_out sc1_cell18_salmon_out sc1_cell82_salmon_out sc1_cell93_salmon_out sc1_cell70_salmon_out sc1_cell46_salmon_out sc1_cell74_salmon_out sc1_cell16_salmon_out sc1_cell14_salmon_out sc1_cell45_salmon_out sc1_cell35_salmon_out sc1_cell77_salmon_out sc1_cell8_salmon_out sc1_cell11_salmon_out sc1_cell15_salmon_out sc1_cell6_salmon_out sc1_cell30_salmon_out sc1_cell39_salmon_out sc1_cell81_salmon_out bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

LogSC_1 2SC_2 GeneSC_3 SC_4 SC_5 expression GroupID CD44high CD44lo_kitneg CD44lo_kitpos CD44neg CD44high CD44lo_kitneg CD44lo_kitpos CD44neg SC_1 SC_2 SC_3 SC_4 SC_5 e10 e11 e10 e11 10 8 6 4 2 0 10 8 6 4 2 0 GroupID GroupID GroupID GroupID GroupID 14 12 10 8 6 4 2 0 14 12 10 8 6 4 2 0 14 12 10 8 6 4 2 0 14 12 10 8 6 4 2 0 14 12 10 8 6 4 2 0 Blood Genes Group II Blood Genes Group III Endothelial Endothelial Genes Group Blood Genes Group I

SFPI1 TEK Emcn KDR CDH5 PECAM1 Adgrg1 KIT Bcl11a SMAD6 SMAD7 Pde3a FBN1 Sox17 CD44 EMB Csf1r PTPRC FLI1 LMO2 TAL1 GPR126 PCDH12 Cbfb Pde2a ERG BMP4 GroupID GFI1 Nfe2 Ikzf2 ITGB3 Sox6 Dnmt3b GATA2 Ikzf1 ITGAM RUNX1

SC_1 SC_2 SC_3 SC_4 SC_5

GroupID

bh1 bh1 bh1 bh1 bh1

10 8 6 4 2 0

− − − − − Smad7 Pde3a Fbn1 Sox17 Cd44 Emb Csf1r Ptprc Fli1 Lmo2 Tal1 Adgrg6 Pcdh12 Cbfb Pde2a Erg Bmp4 Gfi1 Nfe2 Ikzf2 Itgb3 Sox6 Dnmt3b Gata2 Ikzf1 Itgam Runx1 Spi1 Tek Emcn Kdr Cdh5 Pecam1 Adgrg1 Kit Bcl11a Smad6 Sox6 Serpinf1 Acta2 Cxcl12 Col3a1 Gata1 Epor Gfi1B Hbb Spi1 Laptm5 Ly6e Alox5ap Dapp1 Adgrg1 Kit Bcl11a Nfe2 Gfi1 Rab38 Stk26 Mctp1 Med12l Plcg2 Atp2a3 Myb Ptk2b Ikzf2 Angpt1 Itgb3 Mycn Csf1r Ptprc Dock8 Stxbp2 Pag1 Itgam Lrp1 Hmmr Notch2 Emb Cd44 Gpx1 Cd47 Tfgb1 Fli1 Tal1 Lmo2 Pfas Nmt2 Rbm38 Smad3 Smad2 Cbfb Gata2 Dnmt3b Itga2b GroupID Ppia Cdh5 Mdk Pecam1 Acvrl1 Ptprb Col4a2 Tek Ramp2 Kdr Emcn Erg Esam Cbfa2t3 Pbx1 Meis2 Smad6 Smad7 Gpr126 Col4a5 Pde3a Fbn1 Sox17 Pcdh12 Cdh2 Pde2a Notch1 Snai1 Bmp4 Lgals9 Sla Ikzf1 Dok2 Runx1 Was Pla2g4a Dock2 Ptpn6 Coro1a Nmt2 Rbm38 Smad3 Smad2 Cbfb Gata2 Dnmt3b Itga2b Sox6 Serpinf1 Acta2 Cxcl12 Col3a1 Gata1 Epor Gfi1B Hbb Stk26 Mctp1 Med12l Plcg2 Atp2a3 Myb Ptk2b Ikzf2 Angpt1 Itgb3 Mycn Csf1r Ptprc Dock8 Stxbp2 Pag1 Itgam Lrp1 Hmmr Notch2 Emb Cd44 Gpx1 Cd47 Tfgb1 Fli1 Tal1 Lmo2 Pfas Fbn1 Sox17 Pcdh12 Cdh2 Pde2a Notch1 Snai1 Bmp4 Lgals9 Sla Ikzf1 Dok2 Runx1 Was Pla2g4a Dock2 Ptpn6 Coro1a Spi1 Laptm5 Ly6e Alox5ap Dapp1 Adgrg1 Kit Bcl11a Nfe2 Gfi1 Rab38 GroupID Ppia Cdh5 Mdk Pecam1 Acvrl1 Ptprb Col4a2 Tek Ramp2 Kdr Emcn Erg Esam Cbfa2t3 Pbx1 Meis2 Smad6 Smad7 Gpr126 Col4a5 Pde3a Cd47 Tfgb1 Fli1 Tal1 Lmo2 Pfas Nmt2 Rbm38 Smad3 Smad2 Cbfb Gata2 Dnmt3b Itga2b Sox6 Serpinf1 Acta2 Cxcl12 Col3a1 Gata1 Epor Gfi1B Hbb Nfe2 Gfi1 Rab38 Stk26 Mctp1 Med12l Plcg2 Atp2a3 Myb Ptk2b Ikzf2 Angpt1 Itgb3 Mycn Csf1r Ptprc Dock8 Stxbp2 Pag1 Itgam Lrp1 Hmmr Notch2 Emb Cd44 Gpx1 Pde3a Fbn1 Sox17 Pcdh12 Cdh2 Pde2a Notch1 Snai1 Bmp4 Lgals9 Sla Ikzf1 Dok2 Runx1 Was Pla2g4a Dock2 Ptpn6 Coro1a Spi1 Laptm5 Ly6e Alox5ap Dapp1 Adgrg1 Kit Bcl11a GroupID Ppia Cdh5 Mdk Pecam1 Acvrl1 Ptprb Col4a2 Tek Ramp2 Kdr Emcn Erg Esam Cbfa2t3 Pbx1 Meis2 Smad6 Smad7 Gpr126 Col4a5 Myb Ptk2b Ikzf2 Angpt1 Itgb3 Mycn Csf1r Ptprc Dock8 Stxbp2 Pag1 Itgam Lrp1 Hmmr Notch2 Emb Cd44 Gpx1 Cd47 Tfgb1 Fli1 Tal1 Lmo2 Pfas Nmt2 Rbm38 Smad3 Smad2 Cbfb Gata2 Dnmt3b Itga2b Sox6 Serpinf1 Acta2 Cxcl12 Col3a1 Gata1 Epor Gfi1B Hbb GroupID Ppia Cdh5 Mdk Pecam1 Acvrl1 Ptprb Col4a2 Tek Ramp2 Kdr Emcn Erg Esam Cbfa2t3 Pbx1 Meis2 Smad6 Smad7 Gpr126 Col4a5 Pde3a Fbn1 Sox17 Pcdh12 Cdh2 Pde2a Notch1 Snai1 Bmp4 Lgals9 Sla Ikzf1 Dok2 Runx1 Was Pla2g4a Dock2 Ptpn6 Coro1a Spi1 Laptm5 Ly6e Alox5ap Dapp1 Adgrg1 Kit Bcl11a Nfe2 Gfi1 Rab38 Stk26 Mctp1 Med12l Plcg2 Atp2a3 Pfas Nmt2 Rbm38 Smad3 Smad2 Cbfb Gata2 Dnmt3b Itga2b Sox6 Serpinf1 Acta2 Cxcl12 Col3a1 Gata1 Epor Gfi1B Hbb Rab38 Stk26 Mctp1 Med12l Plcg2 Atp2a3 Myb Ptk2b Ikzf2 Angpt1 Itgb3 Mycn Csf1r Ptprc Dock8 Stxbp2 Pag1 Itgam Lrp1 Hmmr Notch2 Emb Cd44 Gpx1 Cd47 Tfgb1 Fli1 Tal1 Lmo2 Pde3a Fbn1 Sox17 Pcdh12 Cdh2 Pde2a Notch1 Snai1 Bmp4 Lgals9 Sla Ikzf1 Dok2 Runx1 Was Pla2g4a Dock2 Ptpn6 Coro1a Spi1 Laptm5 Ly6e Alox5ap Dapp1 Adgrg1 Kit Bcl11a Nfe2 Gfi1 GroupID Ppia Cdh5 Mdk Pecam1 Acvrl1 Ptprb Col4a2 Tek Ramp2 Kdr Emcn Erg Esam Cbfa2t3 Pbx1 Meis2 Smad6 Smad7 Gpr126 Col4a5 Emb Ikzf2 Mycn Dock8 Lrp1 Cd47 Smad2 Gata2 Sox6 Cxcl12 Gfi1b Ppia Smad7 Pde3a Cdh2 Snai1 Sla Runx1 Dock2 Spi1 Dapp1 Gfi1 Mctp1 Atp2a3

Cdh5 Cdh5 Mdk Pecam1 Acvrl1 Acvrl1 Ptprb Col4a2 Col4a2 Tek Ramp2 Ramp2 Kdr Kdr Emcn Erg Erg Esam Cbfa2t3 Pbx1 Meis2 Smad6 Gpr126 Gpr126 Col4a5 Col4a5 Fbn1 Fbn1 Sox17 Pcdh12 Pde2a Notch1 Bmp4 Bmp4 Lgals9 Lgals9 Ikzf1 Dok2 Dok2 Was Pla2g4a Ptpn6 Ptpn6 Coro1a Coro1a Laptm5 Ly6e Ly6e Alox5ap Alox5ap Adgrg1 Adgrg1 Kit Kit Bcl11a Bcl11a Nfe2 Rab38 Rab38 Stk26 Med12l Plcg2 Plcg2 Myb Ptk2b Angpt1 Angpt1 Itgb3 Itgb3 Csf1r Ptprc Stxbp2 Stxbp2 Pag1 Itgam Hmmr Notch2 Cd44 Cd44 Gpx1 Gpx1 Tgfb1 Tgfb1 Fli1 Fli1 Tal1 Lmo2 Lmo2 Pfas Nmt2 Rbm38 Rbm38 Smad3 Cbfb Dnmt3b Itga2b Itga2b Serpinf1 Serpinf1 Acta2 Col3a1 Col3a1 Gata1 Epor Hbb-bh1 Hbb-bh1 SMAD7 Pde3a FBN1 Sox17 CD44 EMB Csf1r PTPRC FLI1 LMO2 TAL1 GPR126 PCDH12 Cbfb Pde2a ERG BMP4 GroupID GFI1 Nfe2 Ikzf2 ITGB3 Sox6 Dnmt3b GATA2 Ikzf1 ITGAM RUNX1 SFPI1 TEK Emcn KDR CDH5 PECAM1 Adgrg1 KIT Bcl11a SMAD6 Cluster Phenotype point Time P1_CD44low.C.._01_23.1 P1_CD44low.C.kit._01_23.1 P1_CD44low.C.kit._01_23.1P1_CD44low.C.kit._01_23.1 P1_CD44low.C.kit._01_23.1 P1_CD44low.C.kit._01_16.1 P1_CD44low.C.kit._01_16.1 P1_CD44low.C.kit._01_16.1P1_CD44low.C.kit._01_16.1 P1_CD44low.C.kit._01_16.1 P1_CD44low.C.kit._01_12.1 P1_CD44low.C.kit._01_12.1 P1_CD44low.C.kit._01_12.1P1_CD44low.C.kit._01_12.1 P1_CD44low.C.kit._01_12.1 P1_CD44low.C.kit._01_14.1 P1_CD44low.C.kit._01_14.1 P1_CD44low.C.kit._01_14.1P1_CD44low.C.kit._01_14.1 P1_CD44low.C.kit._01_14.1 P1_CD44low.C.kit._01_06.1 P1_CD44low.C.kit._01_06.1 P1_CD44low.C.kit._01_06.1P1_CD44low.C.kit._01_06.1 P1_CD44low.C.kit._01_06.1 P1_CD44low.C.kit._01_18.1 P1_CD44low.C.kit._01_18.1 P1_CD44low.C.kit._01_18.1P1_CD44low.C.kit._01_18.1 P1_CD44low.C.kit._01_18.1 P2_CD44low.C.kit._01_15.1 P2_CD44low.C.kit._01_15.1 P2_CD44low.C.kit._01_15.1P2_CD44low.C.kit._01_15.1 P2_CD44low.C.kit._01_15.1 P2_CD44low.C.kit._01_07.1 P2_CD44low.C.kit._01_07.1SC_5 P2_CD44low.C.kit._01_07.1P2_CD44low.C.kit._01_07.1 P2_CD44low.C.kit._01_07.1 P2_CD44low.C.kit._01_01.1 P2_CD44low.C.kit._01_01.1 P2_CD44low.C.kit._01_01.1P2_CD44low.C.kit._01_01.1 P2_CD44low.C.kit._01_01.1 P1_CD44low.C.kit._01_19.1 P1_CD44low.C.kit._01_19.1 P1_CD44low.C.kit._01_19.1P1_CD44low.C.kit._01_19.1 P1_CD44low.C.kit._01_19.1 P1_CD44low.C.kit._01_08.1 P1_CD44low.C.kit._01_08.1 P1_CD44low.C.kit._01_08.1P1_CD44low.C.kit._01_08.1 P1_CD44low.C.kit._01_08.1 P2_CD44low.C.kit._01_13.1 P2_CD44low.C.kit._01_13.142 cells SC_5 P2_CD44low.C.kit._01_13.1P2_CD44low.C.kit._01_13.1 P2_CD44low.C.kit._01_13.1 P2_CD44low.C.kit._01_05.1 P2_CD44low.C.kit._01_05.1 P2_CD44low.C.kit._01_05.1P2_CD44low.C.kit._01_05.1 P2_CD44low.C.kit._01_05.1 P1_kitP7_.1._18 P1_kitP7_.1._18 P1_kitP7_.1._18P1_kitP7_.1._18 P1_kitP7_.1._18 P1_kitP7_.1._05 P1_kitP7_.1._05 P1_kitP7_.1._05P1_kitP7_.1._05 P1_kitP7_.1._05 P1_CD44low.C.kit._01_20.1 P1_CD44low.C.kit._01_20.1 P1_CD44low.C.kit._01_20.1P1_CD44low.C.kit._01_20.1 P1_CD44low.C.kit._01_20.1 P1_CD44low.C.kit._01_11.1 P1_CD44low.C.kit._01_11.1 P1_CD44low.C.kit._01_11.1P1_CD44low.C.kit._01_11.1 P1_CD44low.C.kit._01_11.1 P2_CD44low.C.kit._01_18 P2_CD44low.C.kit._01_18 P2_CD44low.C.kit._01_18P2_CD44low.C.kit._01_18 P2_CD44low.C.kit._01_18 P2_CD44low.C.kit._01_12.1 P2_CD44low.C.kit._01_12.1 SC_5 P2_CD44low.C.kit._01_12.1P2_CD44low.C.kit._01_12.1 P2_CD44low.C.kit._01_12.1 P2_CD44low.C.kit._01_04.1 P2_CD44low.C.kit._01_04.1 P2_CD44low.C.kit._01_04.1P2_CD44low.C.kit._01_04.1 P2_CD44low.C.kit._01_04.1 P2_CD44low.C.kit._01_06.1 P2_CD44low.C.kit._01_06.1 P2_CD44low.C.kit._01_06.1P2_CD44low.C.kit._01_06.1 P2_CD44low.C.kit._01_06.1 P1_kitP7_.1._07 P1_kitP7_.1._07 P1_kitP7_.1._07P1_kitP7_.1._07 P1_kitP7_.1._07 P1_kitP7_.1._04 P1_kitP7_.1._04 P1_kitP7_.1._04P1_kitP7_.1._04 P1_kitP7_.1._04 P2_CD44low.C.kit._01_16 P2_CD44low.C.kit._01_16 P2_CD44low.C.kit._01_16P2_CD44low.C.kit._01_16 P2_CD44low.C.kit._01_16 P1_CD44low.C.kit._01_10.1 P1_CD44low.C.kit._01_10.1SC_4 P1_CD44low.C.kit._01_10.1P1_CD44low.C.kit._01_10.1 P1_CD44low.C.kit._01_10.1 P1_CD44low.C.kit._01_04.1 P1_CD44low.C.kit._01_04.1 P1_CD44low.C.kit._01_04.1P1_CD44low.C.kit._01_04.1 P1_CD44low.C.kit._01_04.1 P1_CD44high_01_14 P1_CD44high_01_14 P1_CD44high_01_14P1_CD44high_01_14 P1_CD44high_01_14 P1_CD44low.C.kit._01_15.1 P1_CD44low.C.kit._01_15.1 P1_CD44low.C.kit._01_15.1P1_CD44low.C.kit._01_15.1 P1_CD44low.C.kit._01_15.1 P1_kitP7_.1._13 P1_kitP7_.1._13 49 cells P1_kitP7_.1._10SC_4 P1_kitP7_.1._10 P1_kitP7_.1._13P1_kitP7_.1._13 P1_kitP7_.1._13 P1_kitP7_.1._14 P1_kitP7_.1._14 P1_kitP7_.1._10P1_kitP7_.1._10 P1_kitP7_.1._10 P1_kitP7_.1._06 P1_kitP7_.1._06 P1_kitP7_.1._14P1_kitP7_.1._14 P1_kitP7_.1._14 P1_kitP6_.1._18 P1_kitP6_.1._18 P1_kitP7_.1._06P1_kitP7_.1._06 P1_kitP7_.1._06 P1_kitP7_.1._23 P1_kitP7_.1._23 P1_kitP6_.1._18P1_kitP6_.1._18 P1_kitP6_.1._18 P2_CD44low.C.kit._01_10.1 P2_CD44low.C.kit._01_10.1 P1_kitP7_.1._23P1_kitP7_.1._23 P1_kitP7_.1._23 P2_CD44low.C.kit._01_03.1 P2_CD44low.C.kit._01_03.1 P2_CD44low.C.kit._01_10.1P2_CD44low.C.kit._01_10.1 P2_CD44low.C.kit._01_10.1 P2_CD44low.C.kit._01_17.1 P2_CD44low.C.kit._01_17.1 P2_CD44low.C.kit._01_03.1P2_CD44low.C.kit._01_03.1 P2_CD44low.C.kit._01_03.1 P2_CD44low.C.kit._01_08.1 P2_CD44low.C.kit._01_08.1 P2_CD44low.C.kit._01_17.1P2_CD44low.C.kit._01_17.1 P2_CD44low.C.kit._01_17.1 P2_CD44low.C.kit._01_14.1 P2_CD44low.C.kit._01_14.1 P2_CD44low.C.kit._01_08.1P2_CD44low.C.kit._01_08.1 P2_CD44low.C.kit._01_08.1 P1_CD44low.C.kit._01_22.1 P1_CD44low.C.kit._01_22.1 P2_CD44low.C.kit._01_14.1P2_CD44low.C.kit._01_14.1 P2_CD44low.C.kit._01_14.1 P1_CD44low.C.kit._01_02.1 P1_CD44low.C.kit._01_02.1 P1_CD44low.C.kit._01_22.1P1_CD44low.C.kit._01_22.1 P1_CD44low.C.kit._01_22.1 P1_kitP6_.1._06 P1_CD44low.C.kit._01_02.1P1_CD44low.C.kit._01_02.1 P1_CD44low.C.kit._01_02.1 P1_kitP6_.1._06 SC_3 P1_kitP6_.1._06P1_kitP6_.1._06 P1_kitP6_.1._06 P1_CD44low.C.kit._01_07.1 P1_CD44low.C.kit._01_07.1 P1_CD44low.C.kit._01_07.1P1_CD44low.C.kit._01_07.1 P1_CD44low.C.kit._01_07.1 P1_kitP7_.1._15 P1_kitP7_.1._15 P1_kitP7_.1._15P1_kitP7_.1._15 P1_kitP7_.1._15 P1_kitP6_.1._02 P1_kitP6_.1._02 P1_kitP6_.1._02P1_kitP6_.1._02 P1_kitP6_.1._02 P1_CD44high_01_22 P1_CD44high_01_22 7 cells P1_CD44high_01_24 P1_CD44high_01_22P1_CD44high_01_22 P1_CD44high_01_22 SC_3 P1_CD44high_01_24 P1_CD44high_01_24P1_CD44high_01_24 P1_CD44high_01_24 P1_CD44high_01_05 P1_CD44high_01_05 P1_CD44high_01_05P1_CD44high_01_05 P1_CD44high_01_05 P1_CD44low.C.kit._01_09.1 P1_CD44low.C.kit._01_09.1 P1_CD44low.C.kit._01_09.1P1_CD44low.C.kit._01_09.1 P1_CD44low.C.kit._01_09.1 P1_kitP6_.1._21 P1_kitP6_.1._21 P1_kitP6_.1._21P1_kitP6_.1._21 P1_kitP6_.1._21 P1_kitP6_.1._11 P1_kitP6_.1._11 P1_kitP6_.1._11P1_kitP6_.1._11 P1_kitP6_.1._11 P1_CD44high_01_12 P1_CD44high_01_12 P1_CD44high_01_12P1_CD44high_01_12 P1_CD44high_01_12 P1_CD44high_01_06 P1_CD44high_01_06 P1_CD44high_01_06P1_CD44high_01_06 P1_CD44high_01_06 P2_CD44low.C.kit._01_09.1 P2_CD44low.C.kit._01_09.1 P2_CD44low.C.kit._01_09.1P2_CD44low.C.kit._01_09.1 P2_CD44low.C.kit._01_09.1 P1_CD44high_01_21 P1_CD44high_01_21 P1_CD44high_01_21P1_CD44high_01_21 P1_CD44high_01_21 P1_CD44high_01_04 P1_CD44high_01_04 P1_CD44high_01_04P1_CD44high_01_04 P1_CD44high_01_04 P1_CD44high_01_03 P1_CD44high_01_03 P1_CD44high_01_03P1_CD44high_01_03 P1_CD44high_01_03 P1_CD44high_01_15 P1_CD44high_01_15 P1_CD44high_01_15 P1_CD44high_01_15 . P1_CD44high_01_15 P1_CD44high_01_13 P1_CD44high_01_13 P1_CD44high_01_13P1_CD44high_01_13 P1_CD44high_01_13 P2_CD44low.C.kit._01_11.1 P2_CD44low.C.kit._01_11.1SC_2 P2_CD44low.C.kit._01_11.1P2_CD44low.C.kit._01_11.1 P2_CD44low.C.kit._01_11.1 P1_CD44low.C.kit._01_04 P1_CD44low.C.kit._01_04 P1_CD44low.C.kit._01_04P1_CD44low.C.kit._01_04 P1_CD44low.C.kit._01_04 P1_CD44high_01_20 P1_CD44high_01_20 P1_CD44high_01_20P1_CD44high_01_20 P1_CD44high_01_20 P1_CD44high_01_11 P1_CD44high_01_11 P1_CD44high_01_11P1_CD44high_01_11 P1_CD44high_01_11 The copyright holder for this preprint (which was not preprint (which for this holder The copyright P1_CD44high_01_19 P1_CD44high_01_19 59 cells P1_CD44high_01_19P1_CD44high_01_19 P1_CD44high_01_19 P1_CD44high_01_09 P1_CD44high_01_09 P1_CD44high_01_09 SC_2 P1_CD44high_01_09P1_CD44high_01_09 P1_CD44high_01_07 P1_CD44high_01_07 P1_CD44high_01_07P1_CD44high_01_07 P1_CD44high_01_07 P1_CD44high_01_01 P1_CD44high_01_01 P1_CD44high_01_01P1_CD44high_01_01 P1_CD44high_01_01 P1_CD44high_01_08 P1_CD44high_01_08 P1_CD44high_01_08P1_CD44high_01_08 P1_CD44high_01_08 P1_kitP6_.1._22 P1_kitP6_.1._22 P1_kitP6_.1._22P1_kitP6_.1._22 P1_kitP6_.1._22 P1_kitP6_.1._12 P1_kitP6_.1._12 P1_kitP6_.1._12P1_kitP6_.1._12 P1_kitP6_.1._12 P1_kitP6_.1._08 P1_kitP6_.1._08 P1_kitP6_.1._08P1_kitP6_.1._08 P1_kitP6_.1._08 P1_kitP6_.1._17 P1_kitP6_.1._17 SC_4 P1_kitP6_.1._17P1_kitP6_.1._17 P1_kitP6_.1._17 P1_kitP6_.1._05 P1_kitP6_.1._05 P1_kitP6_.1._05P1_kitP6_.1._05 P1_kitP6_.1._05 P1_kitP6_.1._07 P1_kitP6_.1._07 P1_kitP6_.1._07P1_kitP6_.1._07 P1_kitP6_.1._07 P1_kitP6_.1._20 P1_kitP6_.1._20 P1_kitP6_.1._20P1_kitP6_.1._20 P1_kitP6_.1._20 P1_CD44low.C.kit._01_03.1 P1_CD44low.C.kit._01_03.1 P1_CD44low.C.kit._01_03.1 P1_CD44low.C.kit._01_03.1 P1_CD44high_01_16 P1_CD44high_01_16SC_1 P1_CD44low.C.kit._01_03.1 P1_CD44high_01_02 P1_CD44high_01_02 P1_CD44high_01_16P1_CD44high_01_16 P1_CD44high_01_16 P1_CD44high_01_18 P1_CD44high_01_18 P1_CD44high_01_02P1_CD44high_01_02 P1_CD44high_01_02 P1_CD44high_01_10 P1_CD44high_01_10 P1_CD44high_01_18P1_CD44high_01_18 P1_CD44high_01_18 P1_kitP6_.1._03 P1_kitP6_.1._03 P1_CD44high_01_10P1_CD44high_01_10 P1_CD44high_01_10 P1_kitP6_.1._23SC_1 P1_kitP6_.1._23 P1_kitP6_.1._03P1_kitP6_.1._03 P1_kitP6_.1._03 P1_CD44high_01_23 P1_CD44high_01_23 56 cells P1_kitP6_.1._23P1_kitP6_.1._23 P1_kitP6_.1._23 P1_kitP6_.1._15 P1_kitP6_.1._15 P1_CD44high_01_23P1_CD44high_01_23 P1_CD44high_01_23 P1_kitP6_.1._04 P1_kitP6_.1._04 P1_kitP6_.1._15P1_kitP6_.1._15 P1_kitP6_.1._15 P1_kitP6_.1._24 P1_kitP6_.1._24 P1_kitP6_.1._04P1_kitP6_.1._04 P1_kitP6_.1._04 P1_kitP6_.1._13 P1_kitP6_.1._13 P1_kitP6_.1._24P1_kitP6_.1._24 P1_kitP6_.1._24 P1_CD44high_01_17 P1_CD44high_01_17 P1_kitP6_.1._13P1_kitP6_.1._13 P1_kitP6_.1._13 P1_kitP6_.1._19 P1_kitP6_.1._19 P1_CD44high_01_17P1_CD44high_01_17 P1_CD44high_01_17 P1_kitP6_.1._10 P1_kitP6_.1._10 P1_kitP6_.1._19P1_kitP6_.1._19 P1_kitP6_.1._19 P1_kitP6_.1._09 P1_kitP6_.1._09 P1_kitP6_.1._10P1_kitP6_.1._10 P1_kitP6_.1._10 P2_CD44low.C.kit._01_13 P2_CD44low.C.kit._01_13 P1_kitP6_.1._09P1_kitP6_.1._09 P1_kitP6_.1._09 P2_CD44low.C.kit._01_04 P2_CD44low.C.kit._01_04 P2_CD44low.C.kit._01_13P2_CD44low.C.kit._01_13 P2_CD44low.C.kit._01_13 P1_CD44low.C.kit._01_17 P1_CD44low.C.kit._01_17 P2_CD44low.C.kit._01_04P2_CD44low.C.kit._01_04 P2_CD44low.C.kit._01_04 P1_CD44low.C.kit._01_21 P1_CD44low.C.kit._01_21 P1_CD44low.C.kit._01_17P1_CD44low.C.kit._01_17 P1_CD44low.C.kit._01_17 P1_CD44low.C.kit._01_15 P1_CD44low.C.kit._01_15 P1_CD44low.C.kit._01_21P1_CD44low.C.kit._01_21 P1_CD44low.C.kit._01_21 P2_CD44low.C.kit._01_12 P2_CD44low.C.kit._01_12 P1_CD44low.C.kit._01_15P1_CD44low.C.kit._01_15 P1_CD44low.C.kit._01_15 P2_CD44low.C.kit._01_08 P2_CD44low.C.kit._01_08 P2_CD44low.C.kit._01_12P2_CD44low.C.kit._01_12 P2_CD44low.C.kit._01_12 P1_CD44low.C.kit._01_06c P1_CD44low.C.kit._01_06 P2_CD44low.C.kit._01_08P2_CD44low.C.kit._01_08 P2_CD44low.C.kit._01_08 P2_CD44low.C.kit._01_02 P2_CD44low.C.kit._01_02 P1_CD44low.C.kit._01_06P1_CD44low.C.kit._01_06 P1_CD44low.C.kit._01_06 P2_CD44low.C.kit._01_01 P2_CD44low.C.kit._01_01 P2_CD44low.C.kit._01_02P2_CD44low.C.kit._01_02 P2_CD44low.C.kit._01_02 P2_CD44low.C.kit._01_15 P2_CD44low.C.kit._01_15 P2_CD44low.C.kit._01_01P2_CD44low.C.kit._01_01 P2_CD44low.C.kit._01_01 P1_CD44low.C.kit._01_23 P1_CD44low.C.kit._01_23 P2_CD44low.C.kit._01_15P2_CD44low.C.kit._01_15 P2_CD44low.C.kit._01_15 P2_CD44low.C.kit._01_11 P2_CD44low.C.kit._01_11 P1_CD44low.C.kit._01_23P1_CD44low.C.kit._01_23 P1_CD44low.C.kit._01_23 P1_CD44low.C.kit._01_01 P1_CD44low.C.kit._01_01 P2_CD44low.C.kit._01_11P2_CD44low.C.kit._01_11 P2_CD44low.C.kit._01_11 P2_CD44low.C.kit._01_06 P2_CD44low.C.kit._01_06 P1_CD44low.C.kit._01_01P1_CD44low.C.kit._01_01 P1_CD44low.C.kit._01_01 P2_CD44low.C.kit._01_07 P2_CD44low.C.kit._01_07 P2_CD44low.C.kit._01_06P2_CD44low.C.kit._01_06 P2_CD44low.C.kit._01_06 P1_CD44low.C.kit._01_22 P1_CD44low.C.kit._01_22 P2_CD44low.C.kit._01_07P2_CD44low.C.kit._01_07 P2_CD44low.C.kit._01_07 P2_CD44low.C.kit._01_09 P2_CD44low.C.kit._01_09 P1_CD44low.C.kit._01_22P1_CD44low.C.kit._01_22 P1_CD44low.C.kit._01_22 P1_CD44low.C.kit._01_18 P1_CD44low.C.kit._01_18 P2_CD44low.C.kit._01_09P2_CD44low.C.kit._01_09 P2_CD44low.C.kit._01_09 P1_CD44low.C.kit._01_02 P1_CD44low.C.kit._01_02 P1_CD44low.C.kit._01_18P1_CD44low.C.kit._01_18 P1_CD44low.C.kit._01_18 P2_CD44low.C.kit._01_03 P2_CD44low.C.kit._01_03 P1_CD44low.C.kit._01_02P1_CD44low.C.kit._01_02 P1_CD44low.C.kit._01_02 P1_CD44low.C.kit._01_20 P1_CD44low.C.kit._01_20 P2_CD44low.C.kit._01_03P2_CD44low.C.kit._01_03 P2_CD44low.C.kit._01_03 defined by CD44 by distinct are defined transcriptionally P1_CD44low.C.kit._01_20P1_CD44low.C.kit._01_20 P1_CD44low.C.kit._01_20 P1_CD44low.C.kit._01_07 P1_CD44low.C.kit._01_07 P1_CD44low.C.kit._01_07P1_CD44low.C.kit._01_07 P1_CD44low.C.kit._01_07 P1_CD44low.C.kit._01_05 P1_CD44low.C.kit._01_05 P1_CD44low.C.kit._01_05P1_CD44low.C.kit._01_05 P1_CD44low.C.kit._01_05 P2_CD44low.C.kit._01_10 P2_CD44low.C.kit._01_10 P2_CD44low.C.kit._01_10P2_CD44low.C.kit._01_10 P2_CD44low.C.kit._01_10 SC_4 SC_1 SC_2 SC_5 P1_CD44low.C.kit._01_19 SC_3 P1_CD44low.C.kit._01_19 P1_CD44low.C.kit._01_19P1_CD44low.C.kit._01_19 P1_CD44low.C.kit._01_19 P1_nokitP7_.1._22 P1_nokitP7_.1._22 P1_nokitP7_.1._22P1_nokitP7_.1._22 P1_nokitP7_.1._22 P1_nokitP7_.1._01 P1_nokitP7_.1._01 P1_nokitP7_.1._01P1_nokitP7_.1._01 P1_nokitP7_.1._01 P2_CD44low.C.kit._01_05 P2_CD44low.C.kit._01_05 P2_CD44low.C.kit._01_05P2_CD44low.C.kit._01_05 P2_CD44low.C.kit._01_05 P1_CD44low.C.kit._01_12 SC_1 SC_2 SC_3 SC_4 SC_5 P1_CD44low.C.kit._01_12 P2_CD44low.C.kit._01_14 P2_CD44low.C.kit._01_14 P1_CD44low.C.kit._01_12P1_CD44low.C.kit._01_12 P1_CD44low.C.kit._01_12 P1_CD44low.C.kit._01_10 P1_CD44low.C.kit._01_10 P2_CD44low.C.kit._01_14P2_CD44low.C.kit._01_14 P2_CD44low.C.kit._01_14 Cluster P1_CD44low.C.kit._01_10P1_CD44low.C.kit._01_10 P1_CD44low.C.kit._01_10 this version posted June 6, 2018. June 6, posted this version P1_CD44low.C.kit._01_13.1 P1_CD44low.C.kit._01_13.1 samples$GroupID P1_CD44low.C.kit._01_13.1P1_CD44low.C.kit._01_13.1 P1_CD44low.C.kit._01_13.1 P1_CD44low.C.kit._01_11 P1_CD44low.C.kit._01_11 SC_3 P1_CD44low.C.kit._01_11P1_CD44low.C.kit._01_11 P1_CD44low.C.kit._01_11 ; P1_CD44low.C.kit._01_09 P1_CD44low.C.kit._01_09 P1_CD44low.C.kit._01_09P1_CD44low.C.kit._01_09 P1_CD44low.C.kit._01_09 P1_CD44low.C.kit._01_03 P1_CD44low.C.kit._01_03 P1_CD44low.C.kit._01_03P1_CD44low.C.kit._01_03 P1_CD44low.C.kit._01_03 P1_CD44low.C.kit._01_13 P1_CD44low.C.kit._01_13 P1_CD44low.C.kit._01_13P1_CD44low.C.kit._01_13 P1_CD44low.C.kit._01_13 CC-BY-NC-ND 4.0 International license CC-BY-NC-ND P1_nokitP7_.1._18 P1_nokitP7_.1._18 P1_nokitP7_.1._18P1_nokitP7_.1._18 P1_nokitP7_.1._18 P1_nokitP7_.1._13 P1_nokitP7_.1._13 P1_nokitP7_.1._13P1_nokitP7_.1._13 P1_nokitP7_.1._13 a P1_nokitP7_.1._17 P1_nokitP7_.1._17 P1_nokitP7_.1._17P1_nokitP7_.1._17 P1_nokitP7_.1._17 P1_nokitP7_.1._08 P1_nokitP7_.1._08 P1_nokitP7_.1._08P1_nokitP7_.1._08 P1_nokitP7_.1._08 P1_nokitP7_.1._11 P1_nokitP7_.1._11 P1_nokitP7_.1._11P1_nokitP7_.1._11 P1_nokitP7_.1._11

P2_CD44low.C.kit._01_17 P2_CD44low.C.kit._01_1720 P1_nokitP7_.1._24 P1_nokitP7_.1._24 P2_CD44low.C.kit._01_17P2_CD44low.C.kit._01_17 P2_CD44low.C.kit._01_17 P1_CD44low.C.kit._01_16 P1_CD44low.C.kit._01_16 P1_nokitP7_.1._24P1_nokitP7_.1._24 P1_nokitP7_.1._24 P1_nokitP7_.1._12 P1_nokitP7_.1._12 P1_CD44low.C.kit._01_16P1_CD44low.C.kit._01_16 P1_CD44low.C.kit._01_16 P1_nokitP7_.1._07 P1_nokitP7_.1._07 P1_nokitP7_.1._12P1_nokitP7_.1._12 P1_nokitP7_.1._12 P1_nokitP7_.1._21 P1_nokitP7_.1._21 P1_nokitP7_.1._07P1_nokitP7_.1._07 P1_nokitP7_.1._07 P1_nokitP7_.1._14 P1_nokitP7_.1._14 P1_nokitP7_.1._21P1_nokitP7_.1._21 P1_nokitP7_.1._21 P1_nokitP7_.1._10 P1_nokitP7_.1._10 P1_nokitP7_.1._14P1_nokitP7_.1._14 P1_nokitP7_.1._14 P1_nokitP7_.1._20 P1_nokitP7_.1._20 P1_nokitP7_.1._10P1_nokitP7_.1._10 P1_nokitP7_.1._10 P1_nokitP7_.1._05 P1_nokitP7_.1._05 P1_nokitP7_.1._20P1_nokitP7_.1._20 P1_nokitP7_.1._20 P1_nokitP7_.1._04 P1_nokitP7_.1._04 P1_nokitP7_.1._05P1_nokitP7_.1._05 P1_nokitP7_.1._05 P1_nokitP7_.1._02 P1_nokitP7_.1._02 P1_nokitP7_.1._04P1_nokitP7_.1._04 P1_nokitP7_.1._04 P1_nokitP7_.1._23 P1_nokitP7_.1._23 P1_nokitP7_.1._02P1_nokitP7_.1._02 P1_nokitP7_.1._02 P1_nokitP7_.1._16 P1_nokitP7_.1._16 P1_nokitP7_.1._23P1_nokitP7_.1._23 P1_nokitP7_.1._23 P1_nokitP7_.1._06 P1_nokitP7_.1._06 P1_nokitP7_.1._16P1_nokitP7_.1._16 P1_nokitP7_.1._16 P1_CD44low.C.kit._01_08 P1_CD44low.C.kit._01_08 P1_nokitP7_.1._06P1_nokitP7_.1._06 P1_nokitP7_.1._06 P1_nokitP7_.1._19 P1_nokitP7_.1._19 P1_CD44low.C.kit._01_08P1_CD44low.C.kit._01_08 P1_CD44low.C.kit._01_08 P1_kitP5_.1._22 P1_kitP5_.1._22 P1_nokitP7_.1._19P1_nokitP7_.1._19 P1_nokitP7_.1._19 P1_kitP5_.1._09 P1_kitP5_.1._09 P1_kitP5_.1._22P1_kitP5_.1._22 P1_kitP5_.1._22 P1_nokitP7_.1._03 P1_nokitP7_.1._0310 P1_kitP5_.1._09P1_kitP5_.1._09 P1_kitP5_.1._09 P1_kitP5_.1._11 P1_kitP5_.1._11 P1_nokitP7_.1._03P1_nokitP7_.1._03 P1_nokitP7_.1._03 P1_kitP5_.1._21 P1_kitP5_.1._21 P1_kitP5_.1._11P1_kitP5_.1._11 P1_kitP5_.1._11 P1_kitP5_.1._16 P1_kitP5_.1._16 P1_kitP5_.1._21P1_kitP5_.1._21 P1_kitP5_.1._21 P1_kitP5_.1._04 P1_kitP5_.1._04 P1_kitP5_.1._16P1_kitP5_.1._16 P1_kitP5_.1._16 P1_kitP5_.1._18 P1_kitP5_.1._18 P1_kitP5_.1._04P1_kitP5_.1._04 P1_kitP5_.1._04 P1_kitP5_.1._06 P1_kitP5_.1._06 P1_kitP5_.1._18P1_kitP5_.1._18 P1_kitP5_.1._18 P1_kitP5_.1._05 P1_kitP5_.1._05 P1_kitP5_.1._06P1_kitP5_.1._06 P1_kitP5_.1._06 P1_kitP5_.1._17 P1_kitP5_.1._17 P1_kitP5_.1._05P1_kitP5_.1._05 P1_kitP5_.1._05 P1_kitP5_.1._14 P1_kitP5_.1._14 P1_kitP5_.1._17P1_kitP5_.1._17 P1_kitP5_.1._17 P1_CD44neg_01_02 P1_CD44neg_01_02 P1_kitP5_.1._14P1_kitP5_.1._14 P1_kitP5_.1._14 P1_CD44neg_01_08 P1_CD44neg_01_08 P1_CD44neg_01_02P1_CD44neg_01_02 P1_CD44neg_01_02 P1_CD44neg_01_01 P1_CD44neg_01_01 P1_CD44neg_01_08P1_CD44neg_01_08 P1_CD44neg_01_08 P1_CD44neg_01_17 P1_CD44neg_01_01P1_CD44neg_01_01 P1_CD44neg_01_01 subpopulations P1_CD44neg_01_17 P1_CD44neg_01_17P1_CD44neg_01_17 P1_CD44neg_01_17 P1_CD44neg_01_09 P1_CD44neg_01_09 P1_CD44neg_01_09P1_CD44neg_01_09 P1_CD44neg_01_09 P1_CD44neg_01_16 P1_CD44neg_01_16 P1_CD44neg_01_16P1_CD44neg_01_16 P1_CD44neg_01_16 + P1_CD44neg_01_05 P1_CD44neg_01_05 P1_CD44neg_01_10 P1_CD44neg_01_05P1_CD44neg_01_05 P1_CD44neg_01_05

P1_CD44neg_01_100 P1_kitP7_.1._03 P1_kitP7_.1._03 P1_CD44neg_01_10P1_CD44neg_01_10 P1_CD44neg_01_10 P1_kitP5_.1._08 P1_kitP5_.1._08 P1_kitP7_.1._03P1_kitP7_.1._03 P1_kitP7_.1._03 P1_kitP5_.1._10 P1_kitP5_.1._10 P1_kitP5_.1._08P1_kitP5_.1._08 P1_kitP5_.1._08 P1_kitP7_.1._22 P1_kitP7_.1._22 P1_kitP5_.1._10P1_kitP5_.1._10 P1_kitP5_.1._10 P1_kitP7_.1._19 P1_kitP7_.1._19 P1_kitP7_.1._22P1_kitP7_.1._22 P1_kitP7_.1._22 P1_CD44low.C.kit._01_21.1 P1_CD44low.C.kit._01_21.1 P1_kitP7_.1._19P1_kitP7_.1._19 P1_kitP7_.1._19 P1_kitP7_.1._17 P1_CD44low.C.kit._01_21.1P1_CD44low.C.kit._01_21.1 P1_CD44low.C.kit._01_21.1

P1_kitP7_.1._17Dimension 1 P1_kitP5_.1._07 P1_kitP5_.1._07 SC_2 P1_kitP7_.1._17P1_kitP7_.1._17 P1_kitP7_.1._17 P1_kitP7_.1._02 P1_kitP7_.1._02 P1_kitP5_.1._07P1_kitP5_.1._07 P1_kitP5_.1._07 P1_kitP7_.1._16 P1_kitP7_.1._02P1_kitP7_.1._02 P1_kitP7_.1._02

https://doi.org/10.1101/338178 P1_kitP7_.1._16

Dimension 1 P1_kitP7_.1._16P1_kitP7_.1._16 P1_kitP7_.1._16 P1_kitP7_.1._12 P1_kitP7_.1._12 P1_kitP7_.1._12P1_kitP7_.1._12 P1_kitP7_.1._12 P1_kitP7_.1._11 P1_kitP7_.1._11 P1_kitP7_.1._11P1_kitP7_.1._11 P1_kitP7_.1._11 P1_kitP7_.1._21 P1_kitP7_.1._21 P1_kitP7_.1._21P1_kitP7_.1._21 P1_kitP7_.1._21 P1_CD44neg_01_15 P1_CD44neg_01_15 P1_CD44neg_01_15P1_CD44neg_01_15 P1_CD44neg_01_15 P1_kitP5_.1._20 P1_kitP5_.1._20 P1_kitP5_.1._20P1_kitP5_.1._20 P1_kitP5_.1._20 P1_kitP5_.1._15 P1_kitP5_.1._15 P1_kitP5_.1._15P1_kitP5_.1._15 P1_kitP5_.1._15 P1_kitP7_.1._20 P1_kitP7_.1._20 P1_kitP7_.1._20P1_kitP7_.1._20 P1_kitP7_.1._20 doi: P1_kitP5_.1._02 P1_kitP5_.1._02 P1_kitP5_.1._02 P1_nokitP7_.1._15 P1_nokitP7_.1._1510 P1_kitP5_.1._02P1_kitP5_.1._02 − P1_nokitP7_.1._15P1_nokitP7_.1._15 P1_nokitP7_.1._15 P1_kitP5_.1._03 P1_kitP5_.1._03 P1_kitP5_.1._03P1_kitP5_.1._03 P1_kitP5_.1._03 P1_kitP7_.1._08 P1_kitP7_.1._08 P1_kitP7_.1._08P1_kitP7_.1._08 P1_kitP7_.1._08 P1_kitP5_.1._12 P1_kitP5_.1._12 P1_kitP5_.1._12P1_kitP5_.1._12 P1_kitP5_.1._12 P1_kitP5_.1._13 P1_kitP5_.1._13 P1_kitP5_.1._13P1_kitP5_.1._13 P1_kitP5_.1._13 P1_kitP5_.1._19 P1_kitP5_.1._19 P1_kitP5_.1._19P1_kitP5_.1._19 P1_kitP5_.1._19 P1_kitP5_.1._01 P1_kitP5_.1._01 P1_kitP5_.1._01P1_kitP5_.1._01 P1_kitP5_.1._01 P1_kitP7_.1._09 P1_kitP7_.1._09 P1_kitP7_.1._09P1_kitP7_.1._09 P1_kitP7_.1._09 P1_CD44neg_01_19 P1_CD44neg_01_19 P1_CD44neg_01_19P1_CD44neg_01_19 P1_CD44neg_01_19 P1_CD44neg_01_21 P1_CD44neg_01_21 P1_CD44neg_01_21P1_CD44neg_01_21 P1_CD44neg_01_21 P1_CD44neg_01_06 P1_CD44neg_01_06 P1_CD44neg_01_06P1_CD44neg_01_06 P1_CD44neg_01_06 P1_CD44neg_01_20 P1_CD44neg_01_20 P1_CD44neg_01_20P1_CD44neg_01_20 P1_CD44neg_01_20 P1_CD44neg_01_13 P1_CD44neg_01_13 P1_CD44neg_01_13P1_CD44neg_01_13 P1_CD44neg_01_13 P1_CD44neg_01_23 P1_CD44neg_01_23 P1_CD44neg_01_23P1_CD44neg_01_23 P1_CD44neg_01_23 P1_CD44neg_01_22 P1_CD44neg_01_22 P1_CD44neg_01_22P1_CD44neg_01_22 P1_CD44neg_01_22 P1_CD44neg_01_07 P1_CD44neg_01_07 P1_CD44neg_01_07P1_CD44neg_01_07 P1_CD44neg_01_07 P1_CD44neg_01_14 P1_CD44neg_01_14 P1_CD44neg_01_14P1_CD44neg_01_14 P1_CD44neg_01_14 P1_CD44neg_01_12 P1_CD44neg_01_12 P1_CD44neg_01_12P1_CD44neg_01_12 P1_CD44neg_01_12 P2_CD44low.C.kit._01_02.1 P2_CD44low.C.kit._01_02.120 P2_CD44low.C.kit._01_02.1P2_CD44low.C.kit._01_02.1 P2_CD44low.C.kit._01_02.1 P1_CD44low.C.kit._01_05.1 P1_CD44low.C.kit._01_05.1− P1_CD44low.C.kit._01_05.1P1_CD44low.C.kit._01_05.1 P1_CD44low.C.kit._01_05.1 P1_CD44low.C.kit._01_01.1 P1_CD44low.C.kit._01_01.1 P1_CD44low.C.kit._01_01.1P1_CD44low.C.kit._01_01.1 P1_CD44low.C.kit._01_01.1 P1_CD44low.C.kit._01_14 P1_CD44low.C.kit._01_14 P1_CD44low.C.kit._01_14P1_CD44low.C.kit._01_14 P1_CD44low.C.kit._01_14 P1_kitP6_.1._01 P1_kitP6_.1._01 P1_kitP6_.1._01P1_kitP6_.1._01 P1_kitP6_.1._01 P1_nokitP7_.1._09 P1_nokitP7_.1._09 P1_nokitP7_.1._09P1_nokitP7_.1._09 P1_nokitP7_.1._09 P1_kitP7_.1._01 P1_kitP7_.1._01 P1_kitP7_.1._01P1_kitP7_.1._01 P1_kitP7_.1._01

Neg Neg

Pos bioRxiv preprint bioRxiv

0

certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under in perpetuity. It is to display the preprint bioRxiv a license who has granted peer review) is the author/funder, certified by

Kit Kit

20 20

− Dimension 2 Dimension Dimension 2 2 Dimension

Neg Neg

Low

Low

High High

Log2CD44high CD44lo_kitneg CD44lo_kitpos GeneCD44neg Expression

GroupID

2 0 4 8 6 10 14 12 0 2 4 6 8 10 12 14

E11 E11

E10

CD44 CD44 CD44 CD44

SC_1 SC_2 SC_3 SC_4 SC_5 SC_1

b a

Time point

Phenotype

Cluster Figure 3: The VE-Cad 3:The Figure BH1 −

SERPINF1 ITGA2B Sox6

Cbfb GATA2 Dnmt3b EPOR GFI1B Mycn HBB CXCL12 Col3a1 GATA1 HMMR Notch2 ACTA2 ITGAM Lrp1 ITGB3 Pag1 Ptk2b Ikzf2 Angpt1 ATP2A3 MYB Med12l Plcg2

Rab38 Stk26 Mctp1 Nfe2 GFI1 SNAI1 BMP4

CDH2 Pde2a NOTCH1 Sox17 PCDH12 Pde3a FBN1 SMAD7 GPR126 COL4A5 Pbx1 MEIS2 SMAD6 ESAM Cbfa2t3 Bcl11a ERG FLI1 Adgrg1 KIT CD47 TGFB1 CD44 Gpx1 Dock8 Stxbp2 EMB Csf1r PTPRC SMAD3 SMAD2

Pfas Nmt2 Rbm38 TAL1 LMO2 ALOX5AP Dapp1 SFPI1 Laptm5 Ly6e Dock2 Ptpn6 CORO1A Was PLA2G4A Dok2 RUNX1 LGALS9 SLA Ikzf1 KDR Emcn TEK RAMP2 ACVRL1 PTPRB COL4A2 Mdk PECAM1 PPIA CDH5 GroupID P1_CD44low.C.kit._01_23.1 P1_CD44low.C.kit._01_16.1 P1_CD44low.C.kit._01_12.1 P1_CD44low.C.kit._01_14.1 P1_CD44low.C.kit._01_06.1 P1_CD44low.C.kit._01_18.1 P2_CD44low.C.kit._01_15.1 P2_CD44low.C.kit._01_07.1 P2_CD44low.C.kit._01_01.1 P1_CD44low.C.kit._01_19.1 P1_CD44low.C.kit._01_08.1 P2_CD44low.C.kit._01_13.1 P2_CD44low.C.kit._01_05.1 P1_kitP7_.1._18 P1_kitP7_.1._05 P1_CD44low.C.kit._01_20.1 P1_CD44low.C.kit._01_11.1 P2_CD44low.C.kit._01_18 P2_CD44low.C.kit._01_12.1 P2_CD44low.C.kit._01_04.1 P2_CD44low.C.kit._01_06.1 P1_kitP7_.1._07 P1_kitP7_.1._04 P2_CD44low.C.kit._01_16 P1_CD44low.C.kit._01_10.1 P1_CD44low.C.kit._01_04.1 P1_CD44high_01_14 P1_CD44low.C.kit._01_15.1 P1_kitP7_.1._13 P1_kitP7_.1._10 P1_kitP7_.1._14 P1_kitP7_.1._06 P1_kitP6_.1._18 P1_kitP7_.1._23 P2_CD44low.C.kit._01_10.1 P2_CD44low.C.kit._01_03.1 P2_CD44low.C.kit._01_17.1 P2_CD44low.C.kit._01_08.1 P2_CD44low.C.kit._01_14.1 P1_CD44low.C.kit._01_22.1 P1_CD44low.C.kit._01_02.1 P1_kitP6_.1._06 P1_CD44low.C.kit._01_07.1 P1_kitP7_.1._15 P1_kitP6_.1._02 P1_CD44high_01_22 P1_CD44high_01_24 P1_CD44high_01_05 P1_CD44low.C.kit._01_09.1 P1_kitP6_.1._21 P1_kitP6_.1._11 P1_CD44high_01_12 P1_CD44high_01_06 P2_CD44low.C.kit._01_09.1 P1_CD44high_01_21 P1_CD44high_01_04 P1_CD44high_01_03 P1_CD44high_01_15 P1_CD44high_01_13 P2_CD44low.C.kit._01_11.1 P1_CD44low.C.kit._01_04 P1_CD44high_01_20 P1_CD44high_01_11 P1_CD44high_01_19 P1_CD44high_01_09 P1_CD44high_01_07 P1_CD44high_01_01 P1_CD44high_01_08 P1_kitP6_.1._22 P1_kitP6_.1._12 P1_kitP6_.1._08 P1_kitP6_.1._17 P1_kitP6_.1._05 P1_kitP6_.1._07 P1_kitP6_.1._20 P1_CD44low.C.kit._01_03.1 P1_CD44high_01_16 P1_CD44high_01_02 P1_CD44high_01_18 P1_CD44high_01_10 P1_kitP6_.1._03 P1_kitP6_.1._23 P1_CD44high_01_23 P1_kitP6_.1._15 P1_kitP6_.1._04 P1_kitP6_.1._24 P1_kitP6_.1._13 P1_CD44high_01_17 P1_kitP6_.1._19 P1_kitP6_.1._10 P1_kitP6_.1._09 P2_CD44low.C.kit._01_13 P2_CD44low.C.kit._01_04 P1_CD44low.C.kit._01_17 P1_CD44low.C.kit._01_21 P1_CD44low.C.kit._01_15 P2_CD44low.C.kit._01_12 P2_CD44low.C.kit._01_08 P1_CD44low.C.kit._01_06 P2_CD44low.C.kit._01_02 P2_CD44low.C.kit._01_01 P2_CD44low.C.kit._01_15 P1_CD44low.C.kit._01_23 P2_CD44low.C.kit._01_11 P1_CD44low.C.kit._01_01 P2_CD44low.C.kit._01_06 P2_CD44low.C.kit._01_07 P1_CD44low.C.kit._01_22 P2_CD44low.C.kit._01_09 P1_CD44low.C.kit._01_18 P1_CD44low.C.kit._01_02 P2_CD44low.C.kit._01_03 P1_CD44low.C.kit._01_20 P1_CD44low.C.kit._01_07 P1_CD44low.C.kit._01_05 P2_CD44low.C.kit._01_10 P1_CD44low.C.kit._01_19 P1_nokitP7_.1._22 P1_nokitP7_.1._01 P2_CD44low.C.kit._01_05 P1_CD44low.C.kit._01_12 P2_CD44low.C.kit._01_14 P1_CD44low.C.kit._01_10 P1_CD44low.C.kit._01_13.1 P1_CD44low.C.kit._01_11 P1_CD44low.C.kit._01_09 P1_CD44low.C.kit._01_03 P1_CD44low.C.kit._01_13 P1_nokitP7_.1._18 P1_nokitP7_.1._13 P1_nokitP7_.1._17 P1_nokitP7_.1._08 P1_nokitP7_.1._11 P2_CD44low.C.kit._01_17 P1_nokitP7_.1._24 P1_CD44low.C.kit._01_16 P1_nokitP7_.1._12 P1_nokitP7_.1._07 P1_nokitP7_.1._21 P1_nokitP7_.1._14 P1_nokitP7_.1._10 P1_nokitP7_.1._20 P1_nokitP7_.1._05 P1_nokitP7_.1._04 P1_nokitP7_.1._02 P1_nokitP7_.1._23 P1_nokitP7_.1._16 P1_nokitP7_.1._06 P1_CD44low.C.kit._01_08 P1_nokitP7_.1._19 P1_kitP5_.1._22 P1_kitP5_.1._09 P1_nokitP7_.1._03 P1_kitP5_.1._11 P1_kitP5_.1._21 P1_kitP5_.1._16 P1_kitP5_.1._04 P1_kitP5_.1._18 P1_kitP5_.1._06 P1_kitP5_.1._05 P1_kitP5_.1._17 P1_kitP5_.1._14 P1_CD44neg_01_02 P1_CD44neg_01_08 P1_CD44neg_01_01 P1_CD44neg_01_17 P1_CD44neg_01_09 P1_CD44neg_01_16 P1_CD44neg_01_05 P1_CD44neg_01_10 P1_kitP7_.1._03 P1_kitP5_.1._08 P1_kitP5_.1._10 P1_kitP7_.1._22 P1_kitP7_.1._19 P1_CD44low.C.kit._01_21.1 P1_kitP7_.1._17 P1_kitP5_.1._07 P1_kitP7_.1._02 P1_kitP7_.1._16 P1_kitP7_.1._12 P1_kitP7_.1._11 P1_kitP7_.1._21 P1_CD44neg_01_15 P1_kitP5_.1._20 P1_kitP5_.1._15 P1_kitP7_.1._20 P1_kitP5_.1._02 P1_nokitP7_.1._15 P1_kitP5_.1._03 P1_kitP7_.1._08 P1_kitP5_.1._12 P1_kitP5_.1._13 P1_kitP5_.1._19 P1_kitP5_.1._01 P1_kitP7_.1._09 P1_CD44neg_01_19 P1_CD44neg_01_21 P1_CD44neg_01_06 P1_CD44neg_01_20 P1_CD44neg_01_13 P1_CD44neg_01_23 P1_CD44neg_01_22 P1_CD44neg_01_07 P1_CD44neg_01_14 P1_CD44neg_01_12 P2_CD44low.C.kit._01_02.1 P1_CD44low.C.kit._01_05.1 P1_CD44low.C.kit._01_01.1 P1_CD44low.C.kit._01_14 P1_kitP6_.1._01 P1_nokitP7_.1._09 P1_kitP7_.1._01 FigurebioRxiv 4: Comparison preprint doi: https://doi.org/10.1101/338178 of the CD44 populations; this version posted June to 6,Pro-HSCs, 2018. The copyright Pre-HSCs holder for this preprinttype (which I and was typenot II certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

5 a 10 Pre-HSCs type II 5 4 10 VE-cad+ 10 CD45+

3 10 4

10 CD43

2 10

0

3 2 10 -10 2 3 4 5 CD45 VE-cad+ 0 10 10 10 10 CD45- CD41 2 10

0 5 10 2 -10 Pre-HSCs 3 4 5 4 0 10 10 10 10 type I

3 VE-Cad 10 CD43

2

10 Pro_1 0 Pro-HSCs 2 -10 2 3 4 5 0 10 10 10 10 CD41

Pro-HSC Pro-HSC Pre-HSC Pre-HSC b c Cluster 1 Cluster 2 type I type II GroupID GroupID Pro_1 Group ID 10 ItgamITGAM Pro_2 Pre_I 20 SC_1 SFPI1 Pre_II Spi1 8

Alox5apALOX5AP Pro_2 SC_2 6 Pla2g4aPLA2G4A SC_3 Ikzf1 4 SC_4 Runx1RUNX1

2 SC_5 Laptm5

Cd44CD44 Pro-HSCs 0 EmbEMB Pre-HSCs type I 10 PtprcPTPRC Pre-HSCs type II Smad7SMAD7

Tal1TAL1 samples$GroupID Fli1FLI1

Pre_I Lmo2LMO2

Pre_II Cdh5CDH5 Pre_I Pro KdrKDR SC_1 Pecam1PECAM1 SC_2 0 Ikzf2 Dimension 2 Dimension 2 SC_3 Adgrg1 SC_4 KitKIT SC_5 Gata2GATA2

Dnmt3b

Gfi1GFI1

Itgb3ITGB3

ErgERG

SERPINF1 −10 Serpinf1

Pcdh12PCDH12 Pre_II Adgrg6GPR126

Sox6

Col4a2COL4A2

TekTEK

Smad6SMAD6 Pde3a

−20 −10 0 10 20 Fbn1FBN1

Sox17 FBN1 Pde3a SMAD6 TEK COL4A2 Sox6 GPR126 PCDH12 SERPINF1 ERG ITGB3 GFI1 DimensionDnmt3b GATA2 1 KIT Adgrg1 Ikzf2 PECAM1 KDR CDH5 LMO2 FLI1 TAL1 SMAD7 PTPRC EMB CD44 Laptm5 RUNX1 Ikzf1 PLA2G4A ALOX5AP SFPI1 ITGAM GroupID Dimension 1 Sox17 Pro_1 Pro_2 Pre_I Pre_II 19 cells 36 cells 103 cells 61 cells

Log2 Gene expression

0 0 2 2 4 4 6 6 8 8 1010 GroupID Pre_II Pre_I Pro_2 Pro_1 CD44neg CD44lo_kit_neg CD44lo_kit_pos CD44high CD44neg CD44lo_kit_neg CD44lo_kit_pos CD44high CD44neg CD44lo_kit_neg CD44lo_kit_pos CD44high CD44neg CD44lo_kit_neg CD44lo_kit_pos CD44high GroupID GroupID GroupID GroupID 12 10 8 6 4 2 4 2 12 10 8 12 6 10 4 8 2 6 12 10 8 6 4 2 Nr2f2 Prl3b1 Pde2a Nrp2 Piezo2 Bcat1 Snai1 Stab2 Aplnr Ltbp4 Adgrg6 Hey2 Akr1c14 Ankrd29 Fas Ltbp1 Bmx Lama3 Mllt3 Pdlim1 Bcl11a Hacd4 Efnb2 Gja5 Sox17 Bmx Hey2 Aplnr Nr2f2 Nrp2 Ephb4 Ephb4 Kdr Flt1 Cd93 Madcam1 Itgb3 Ly6e Hey1 Hey2_a Jag1 Dll4 Notch1 Lef1 Ccnd1 Myc Id1 Id2 Bmp4 Bmper Bmpr1a Ltbp1_a Smad6 Smad7 Nr2f2_a Prl3b1 Pde2a Nrp2_a Piezo2 Bcat1 Snai1 Stab2 Aplnr_a Ltbp4 Adgrg6 Hey2_b Akr1c14 Ankrd29 Fas Ltbp1 Bmx_a Lama3 Mllt3 Pdlim1 Bcl11a Hacd4 Cd44 Efnb2 Gja5 Sox17 Bmx Hey2 Aplnr Nr2f2 Nrp2 GroupID Ctsc Nfe2 Spi1 Ifitm1 Runx1 Mpo Gfi1 Fgd5 Sox17_a Cldn5 Tie1 Tek Cdh5 Ramp2 Sox17_a Flt1 Cldn5 Cd93 Tie1 Madcam1 Tek Itgb3 Cdh5 Ly6e Ramp2 Hey1 Kdr Hey2_a Flt1 Jag1 Cd93 Dll4 Madcam1 Notch1 Itgb3 Lef1 Ly6e Ccnd1 Hey1 Myc Hey2_a Id1 Jag1 Id2 Dll4 Bmp4 Notch1 Bmper Lef1 Bmpr1a Ccnd1 Ltbp1_a Myc Smad6 Id1 Smad7 Id2 Nr2f2_a Bmp4 Prl3b1 Bmper Pde2a Bmpr1a Nrp2_a Ltbp1_a Piezo2 Smad6 Bcat1 Smad7 Snai1 Nr2f2_a Stab2 Prl3b1 Aplnr_a Pde2a Ltbp4 Nrp2_a Adgrg6 Piezo2 Hey2_b Bcat1 Akr1c14 Snai1 Ankrd29 Stab2 Fas Aplnr_a Ltbp1 Ltbp4 Bmx_a Adgrg6 Lama3 Hey2_b Mllt3 Akr1c14 Pdlim1 Ankrd29 Bcl11a Fas Hacd4 Ltbp1 Cd44 Bmx_a Efnb2 Lama3 Gja5 Mllt3 Sox17 Pdlim1 Bmx Bcl11a Hey2 Hacd4 Aplnr Cd44 Nr2f2 Efnb2 Nrp2 Gja5 Ephb4 Sox17 Bmx Hey2 Aplnr Nr2f2 Nrp2 Ephb4 GroupID Ctsc Nfe2 Spi1 Ifitm1 Runx1 Mpo GroupID Gfi1 Ctsc Fgd5 Nfe2 Sox17_a Spi1 Cldn5 Ifitm1 Tie1 Runx1 Tek Mpo Cdh5 Gfi1 Ramp2 Fgd5 Kdr Prl3b1 Pde2a Nrp2_a Piezo2 Bcat1 Snai1 Stab2 Aplnr_a Ltbp4 Adgrg6 Hey2_b Akr1c14 Ankrd29 Fas Ltbp1 Bmx_a Lama3 Mllt3 Pdlim1 Bcl11a Hacd4 Cd44 Efnb2 Gja5 Sox17 Bmx Hey2 Aplnr Nr2f2 Nrp2 Ephb4 GroupID Ctsc Nfe2 Spi1 Ifitm1 Runx1 Mpo Gfi1 Fgd5 Sox17_a Cldn5 Tie1 Tek Cdh5 Ramp2 Kdr Flt1 Cd93 Madcam1 Itgb3 Ly6e Hey1 Hey2_a Jag1 Dll4 Notch1 Lef1 Ccnd1 Myc Id1 Id2 Bmp4 Bmper Bmpr1a Ltbp1_a Smad6 Smad7 Nr2f2_a s22 s22 s22 s22 s21 s21 s21 s21 s20 s20 s20 s20 s19 s19 s19 s19 s18 s18 s18 s18 Neg s17 s17 s17 s17 s16 s16 s16 s16

Kit s15 s15 s15 s15 s14 s14 s14 s14

Low s13 s13 s13 s13 s12 s12 s12 s12 s11 s11 s11 s11 s10 s10 s10 s10 s9 s9 s9 s9 s8 s8 s8 s8 s7 s7 s7 s7 s6 s6 s6 s6

and and CD44 s5 s5 s5 s5 s4 s4 s4 s4 s3 s3 s3 s3 Neg s2 s2 s2 s2 s1 s1 s1 s1 g s g s . Ne Po Ne Po Kit Kit Kit Kit The copyright holder for this preprint (which was not preprint (which for this holder The copyright Nornalised Gene Expression Neg Low Low Nornalised Gene Expression High Neg Low Low High CD44neg CD44lo_kit_neg CD44lo_kit_pos CD44high CD44neg CD44lo_kit_neg CD44lo_kit_pos CD44high (rlog)

(rlog) GroupID GroupID 12 10 8 6 4 2 12 10 8 6 4 2 CD44 CD44 CD44 CD44 4 8 2 12 6 CD44 CD44 10 CD44 CD44 4 8 2 12 6 10

d c Smad6 Smad7 Nr2f2_a Prl3b1 Pde2a Nrp2_a Piezo2 Bcat1 Snai1 Stab2 Aplnr_a Ltbp4 Adgrg6 Hey2_b Akr1c14 Ankrd29 Fas Ltbp1 Bmx_a Lama3 Mllt3 Pdlim1 Bcl11a Hacd4 Cd44 Efnb2 Gja5 Sox17 Bmx Hey2 Aplnr Nr2f2 Nrp2 Ephb4 Ctsc Nfe2 Spi1 Ifitm1 Runx1 Mpo Gfi1 Fgd5 Sox17_a Cldn5 Tie1 Tek Cdh5 Ramp2 Kdr Flt1 Cd93 Madcam1 Itgb3 Ly6e Hey1 Hey2_a Jag1 Dll4 Notch1 Lef1 Ccnd1 Myc Id1 Id2 Bmp4 Bmper Bmpr1a Ltbp1_a GroupID Aplnr Nr2f2 Nrp2 Ephb4 Bmp4 Bmper Bmpr1a Ltbp1_a Smad6 Smad7 Nr2f2_a Prl3b1 Pde2a Nrp2_a Piezo2 Bcat1 Snai1 Stab2 Aplnr_a Ltbp4 Adgrg6 Hey2_b Akr1c14 Ankrd29 Fas Ltbp1 Bmx_a Lama3 Mllt3 Pdlim1 Bcl11a Hacd4 Cd44 Efnb2 Gja5 Sox17 Bmx Hey2 GroupID Ctsc Nfe2 Spi1 Ifitm1 Runx1 Mpo Gfi1 Fgd5 Sox17_a Cldn5 Tie1 Tek Cdh5 Ramp2 Kdr Flt1 Cd93 Madcam1 Itgb3 Ly6e Hey1 Hey2_a Jag1 Dll4 Notch1 Lef1 Ccnd1 Myc Id1 Id2 /BMP

s22 s22 pathway Antibody Antibody screen genes Notch pathway genes Endothelial genes TGFbeta pathway genes Wnt genes Blood Blood genes s21 s21 CD44lo_kit_neg CD44lo_kit_pos CD44high CD44neg CD44lo_kit_neg CD44lo_kit_pos CD44high CD44neg GroupID GroupID 10 8 6 4 2 12 12 10 8 6 4 2 s20 s20 this version posted June 6, 2018. June 6, posted this version Myc Id1 Id2 Bmp4 Bmper Bmp1ra Ltbp1 Smad6 Smad7 Ccnd1 Cldn5 Tie1 Tek Cdh5 Ramp2 Kdr Flt1 Cd93 Madcam1 Itgb3 Ly6e Hey1 Hey2 Jag1 Dll4 Notch1 Lef1 Ctsc Nfe2 Spi1 Ifitm1 Runx1 Mpo Gfi1 Fgd5 Sox17 Sox17 Bmx Hey2 Aplnr Nr2f2 Nrp2 Ephb4 Aplnr_a Ltbp4 Adgrg6 Hey2_b Akr1c14 Ankrd29 Fas Ltbp1 Bmx_a Lama3 Mllt3 Pdlim1 Bcl11a Hacd4 Cd44 Efnb2 Gja5 Ccnd1 Myc Id1 Id2 Bmp4 Bmper Bmpr1a Ltbp1_a Smad6 Smad7 Nr2f2_a Prl3b1 Pde2a Nrp2_a Piezo2 Bcat1 Snai1 Stab2 Cldn5 Tie1 Tek Cdh5 Ramp2 Kdr Flt1 Cd93 Madcam1 Itgb3 Ly6e Hey1 Hey2_a Jag1 Dll4 Notch1 Lef1 GroupID Ctsc Nfe2 Spi1 Ifitm1 Runx1 Mpo Gfi1 Fgd5 Sox17_a Gja5 Sox17 Bmx Hey2 Aplnr Nr2f2 Nrp2 Ephb4 Hey2_b Akr1c14 Ankrd29 Fas Ltbp1 Bmx_a Lama3 Mllt3 Pdlim1 Bcl11a Hacd4 Cd44 Efnb2 Ltbp1_a Smad6 Smad7 Nr2f2_a Prl3b1 Pde2a Nrp2_a Piezo2 Bcat1 Snai1 Stab2 Aplnr_a Ltbp4 Adgrg6 Hey1 Hey2_a Jag1 Dll4 Notch1 Lef1 Ccnd1 Myc Id1 Id2 Bmp4 Bmper Bmpr1a Fgd5 Sox17_a Cldn5 Tie1 Tek Cdh5 Ramp2 Kdr Flt1 Cd93 Madcam1 Itgb3 Ly6e GroupID Ctsc Nfe2 Spi1 Ifitm1 Runx1 Mpo Gfi1 ; s22 s22 CC-BY-NC-ND 4.0 International license CC-BY-NC-ND s19 s19 a s21 s21 50 s20 s20 s18 s19 s18 s19 s18 s18 s17 s17 s17 s17 s16 s16 25 s15 s15 s16 s16 s14 s14 s13 s13 s12 s15 s15 s12 0 s11 s11 s10 s10 Dimension 1 s14 s14 s9 s9 https://doi.org/10.1101/338178 s8 s8 s7 doi: s13 s13 s7 25 - s6 s6 s5 s5 s12 s12 s4 s4

0 s3 s3 -

25 25

- s11 Dimension 2 Dimension s11 s2 s2 s1 s1 g bioRxiv preprint bioRxiv s s10 s10 Neg Pos Ne Po certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under in perpetuity. It is to display the preprint bioRxiv a license who has granted peer review) is the author/funder, certified by Kit Kit Kit Kit Low Low High Neg Nornalised Gene Expression Neg Low Low High s9 CD44neg CD44lo_kit_neg CD44lo_kit_pos CD44high s9 (rlog)

GroupID E11

E10

2 4 6 8 10 12 E9.5

CD44 CD44 CD44 CD44

CD44 CD44 CD44 CD44 4 2 8 12 6 10

s8 s8 Time Point endothelial populations and identifies early changes in the differentiation process Figure 5: Bulk RNA sequencing further distinguishes CD44 Figure 5: Bulk RNA Group ID Group

a b

s7 s7 Ephb4 Nrp2 Nr2f2 Aplnr Hey2 Sox17 Bmx Gja5 Efnb2 Cd44 Hacd4 Bcl11a Pdlim1 Mllt3 Lama3 Bmx_a Ltbp1 Fas Ankrd29 Akr1c14 Adgrg6 Hey2_b Ltbp4 Aplnr_a Stab2 Snai1 Bcat1 Piezo2 Nrp2_a Pde2a Prl3b1 Nr2f2_a Smad7 Smad6 Ltbp1_a Bmper Bmpr1a Bmp4 Id2 Id1 Myc Ccnd1 Lef1 Notch1 Dll4 Jag1 Hey2_a Hey1 Ly6e Itgb3 Cd93 Madcam1 Flt1 Kdr Ramp2 Cdh5 Tek Tie1 Cldn5 Sox17_a Fgd5 Gfi1 Mpo Runx1 Spi1 Ifitm1 Nfe2 Ctsc GroupID

s6 s6 s22

s5 s5 s21

s4 s4 s20

s3 s3 s19

s2 s2 s18

s1 s1 s17

s16

s15

s14

s13

s12

s11

s10

s9

s8

s7

s6

s5

s4

s3

s2

s1 Figure 6: RNA sequencing analysis revealed different metabolic signatures between bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not CD44Negcertifiedand by peer CD44 review)Low is theKit author/funder,Neg endothelial who has granted populations bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. a

b Figure 7: Runx1 is required for the generation of CD44LowKitPos and CD44High cells but bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not not forcertified the by formation peer review) is the of author/funder, the CD44 who hasLow grantedKitNeg bioRxivpopulation a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. a Runx1+/+ Runx1-/- 0.39% 0.06% 0.12% 0.68% 0.21% 0%

4 4 10 10

3 3 104 104 Cad 10 10 - VE

03 03 10 10

3 4 5 3 4 5 0 10 10 10 0 10 10 10 0 0 CD44 250K 250K 3 4 5 3 4 5 0 10 10 10 0 10 10 10

200K 60% 28% 200K 82.2% 6.5%

250K 250K

150K 150K

200K 200K

100K 100K A - 150K 150K

50K 50K

100K 100K

SSC 0 0 3 3 4 5 3 3 4 5 -10 0 10 10 10 -10 0 10 10 10 50K 50K

0 0 3 3 4 5 3 3 4 5 -10 0 10 10 10 -10 0 10 10 10 Kit b Genotype Group ID SC_1 Wild type SC_2 Runx1-/-

SC_3 20 20 SC_4 SC_5 samples$GroupID

Rx1ko_44lo_Kitneg -/- Rx1ko_44lo_KitposRunx1 CD44Neg Rx1ko_CD44neg samples$GroupID -/- Rx1ko SC_1 Runx1 0 WT 0 Low Neg SC_2 CD44 Kit Dimension 2 -/- Dimension 2 Runx1 SC_3 Low Pos CD44 Kit 2 Dimension Dimension 2 Dimension SC_4

SC_5

−20 −20

−20 0 20 −20 0 20 Dimension 1 Dimension 1 Dimension 1 Dimension 1 bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. bioRxiv preprint doi: https://doi.org/10.1101/338178; this version posted June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. FigurebioRxiv 10: preprintNew doi:model https://doi.org/10.1101/338178 for the progression; this version posted of EHT June 6, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

AORTA-GONAD-MESONEPHROS REGION

Vascular Endothelium Haemogenic Endothelium Pre-HSPC-I Pre-HSPC-II

VE-Cad+ CD44Neg VE-Cad+ CD44LowKitNeg VE-Cad+ CD44LowKitPos VE-Cad+ CD44High

?

- High expression - High expression - Increase of - High expression of endothelial of endothelial haematopoietic of haematopoietic genes genes gene expression genes - Expression of - High expression - Co-expression of - No expression of Wnt pathway of Notch pathway transcription endothelial genes target genes genes factors: Gata2, - No expression of - Expression of Runx1, Lyl1, haematopoietic TGF-b/BMP Lmo2, Tal1, Fli1 genes inhibitors and Erg - Upregulation of - Decrease of signalling endothelial gene pathways expression - Decrease of Glycolysis and TCA cycle