1 A niche of trophoblast progenitor cells identified by integrin α2 is

2 present in first trimester human placentas

3 Cheryl QE Lee1,2, Margherita Turco1,2, Lucy Gardner1,2, Benjamin Simons3,4,5, Myriam 4 Hemberger*2,6, Ashley Moffett*1,2

5 1) Department of Pathology, Tennis Court Road, University of Cambridge, Cambridge 6 CB2 1QP, UK

7 2) Centre for Trophoblast Research, Tennis Court Road, University of Cambridge, 8 Cambridge CB2 3DY, UK

9 3) Cavendish Laboratory, Department of Physics, University of Cambridge, J.J. Thomson 10 Avenue, Cambridge CB3 0HE, UK

11 4) The Wellcome Trust/Cancer Research UK Gurdon Institute, University of Cambridge, 12 Tennis Court Road, Cambridge CB2 1QN, UK

13 5) Wellcome Trust-Medical Research Council Stem Cell Institute, University of 14 Cambridge, Cambridge CB2 1QR, UK

15 6) Epigenetics Programme, The Babraham Institute, Babraham Research Campus, 16 Cambridge CB22 3AT, UK

17 *Joint last authors

18 Corresponding author:

19 Prof. Ashley Moffett

20 Email: [email protected]

21

22 Running title: ITGA2 marks trophoblast progenitors

23 Key words: Placenta, stem cells, integrin α2, progenitor, lineage tracing

24

25 Summary statement

26 Integrin α2 (ITGA2) marks a trophoblast progenitor niche at the base of the cytotrophoblast 27 cell columns in the first trimester human placenta.

28

29 Abstract

30 During pregnancy the trophoblast cells of the placenta are the only fetal cells in direct 31 contact with maternal blood and decidua. Their functions include transport of nutrients and 32 oxygen, secretion of pregnancy hormones, remodelling the uterine arteries, and 33 communicating with maternal cells. Despite the importance of trophoblast cells in placental 34 development and successful pregnancy, little is known about the identity, location and 35 differentiation of human trophoblast progenitors. We identify a proliferative trophoblast 36 niche at the base of the cytotrophoblast cell columns in first trimester placentas that is 37 characterised by integrin α2 (ITGA2) expression. Pulse-chase experiments with 5-Iodo-2ʹ- 38 deoxyuridine (IdU) imply that these cells can contribute to both villous (VCT) and extravillous 39 (EVT) lineages. These proliferating trophoblast cells can be isolated using ITGA2 as a marker 40 by flow cytometry and express from both VCT and EVT. Microarray expression analysis 41 shows that ITAG2+ cells display a unique transcriptional signature including NOTCH signalling 42 and a combination of epithelial and mesenchymal characteristics. ITGA2 thus marks a niche 43 allowing the study of pure populations of trophoblast progenitor cells.

44 Introduction

45 Despite the rapid growth of the human placenta in the early weeks of gestation, little is 46 known about the identity and location of a proliferative or even self-renewing niche of 47 trophoblast stem or progenitor cells. By definition, stem cells are cells capable of unlimited 48 self-renewal and differentiation into several lineages. Murine trophoblast stem cells (TSCs) 49 can self-renew in culture and contribute to all the trophoblast lineages in vivo (Tanaka et al., 50 1998). Because the existence of putative TSCs in the human placenta is unknown, we refer 51 to the proliferative cells in human placentas as trophoblast progenitors (TPs). Trophoblast 52 differentiates along two main pathways, villous and extravillous. In the first trimester, the 53 placental villus consists of a stromal core covered by two layers of trophoblast: an inner 54 layer of villous cytotrophoblast (VCT) from which cells differentiate and fuse to form an 55 outer layer of syncytiotrophoblast (ST). Extravillous cytotrophoblast (EVT) push through the 56 ST in places forming cytotrophoblast cell columns (CCCs) from where EVT can invade into the 57 maternal decidua. Other villi float freely in maternal blood in the intervillous space, the site 58 of maternal/fetal transfer of nutrients and gases.

59 Stem cells in other tissues are often characterized by expression of specific types of 60 integrins; for example, integrin β1 demarcates stem cells in epithelia and mammary glands 61 (Jensen et al., 1999; Jones and Watt, 1993; Shackleton et al., 2006; Stingl et al., 2006; Taddei 62 et al., 2008). Integrins have important functional roles in these cells, as alteration of integrin 63 levels can affect proliferation and differentiation, either by direct signalling or indirectly by 64 anchoring the cells in a specific niche (Ellis and Tanentzapf, 2010; Hirsch et al., 2002; Sastry 65 et al., 1996). Therefore, integrins are good candidates to identify proliferative trophoblast 66 and to isolate live cells.

67 To characterise human TPs, we first confirmed the location of putative TP niches by staining 68 first trimester placentas for proliferative markers (Arnholdt et al., 1991; Bulmer et al., 1988; 69 Chan et al., 1999; Enders, 1968; Muhlhauser et al., 1993; Vićovac et al., 1995). We focussed 70 on first trimester placentas as there is a negative correlation between gestational age and 71 the proportion of proliferative placental cells (Arnholdt et al., 1991; Hemberger et al., 2010; 72 Horii et al., 2016). We found a surface , integrin α2 (ITGA2) expressed on the 73 proliferative trophoblast cells at the base of the CCCs, enabling us to isolate and characterise 74 human TPs using flow cytometry. The expression profile of these cells reveals that they 75 are enriched in NOTCH signalling pathways and unusual mesenchymal-like characteristics. 76 Using thymidine analogues, we were able to pulse chase these cells and show that they

77 contribute to both VCT and EVT. Our findings confirm previous reports that suggested there 78 is a TP niche at the base of the CCC (Muhlhauser et al., 1993; Vićovac et al., 1995).

79 Results

80 Location of proliferating trophoblast cells in the first trimester

81 To identify the location of TPs in first trimester placentas, we first stained for proliferative

82 cells using Ki67, expressed in the entire cell cycle except G0 phase, and 5-Iodo-2ʹ- 83 deoxyuridine (IdU), a base analogue incorporated during S-phase (Gerdes et al., 1984). Fresh 84 placental explants were incubated in IdU for an hour and fixed immediately for IHC. In the 85 villous placenta, no Ki67-positive cells are ever seen in ST and staining in VCT is patchy (Fig. 86 1A, n = 6, gestational age (G.A.) = 7-12 weeks). In the CCCs, Ki67 staining is restricted to the 87 base region. Similar findings were made for IdU with most cells in S phase seen at the base 88 of the CCC and in adjacent VCT (Fig. 1B, n = 10, G.A.= 7-12 weeks). In the CCCs, the IdU+ 89 trophoblast are aligned to form strings of consecutively labelled cells, suggesting that 90 neighbouring cells have synchronised cell cycles and are probably recent relatives (Fig. 1C). 91 In VCT, patches of cells in S phase are occasionally observed but these are rare and most villi 92 contain only a few IdU+ VCT (Fig. 1B, arrowheads). Co-staining for IdU and Ki67 shows that 93 IdU+ cells are concentrated in a smaller region at the base of the CCC, whereas Ki67 staining 94 extends for more layers distally (Fig. 1D, n = 3, G.A.= 8-11 weeks). This indicates that cells in 95 the CCCs lose their proliferative potential as they move further away from the villus and that 96 the proximal CCC is the most likely location for TPs.

97 ITGA2 marks proliferative cells at the base of the CCC

98 To search for a surface marker that could be used to isolate putative TPs, we studied integrin 99 expression in the proliferative niche at the base of the CCCs, and found that ITGA2 is 100 expressed specifically in trophoblast cells in this location (Fig. 2A, n = 3, G.A.= 8-9 weeks). 101 Endothelial cells in the villous core are also ITGA2+, seen by staining of serial sections with 102 CD31 (PECAM1; Fig. 2B). In addition, a smaller cluster of CD31+ cells is present at the base of 103 the CCCs (Fig. 2B, n = 3, G.A.= 8-10 weeks). We therefore stained serial sections for another 104 endothelial marker, CD34, to ensure that ITGA2+ and CD31+ cells at the base of the CCCs are 105 not endothelial cells. CD34, CD31 and ITGA2 co-localise on endothelial cells but only CD31 106 and ITGA2 are present at the base of the CCCs (Fig. 2B, n = 3, G.A.= 8-10 weeks). As further 107 proof that the ITGA2+ cells are trophoblast, serial sections were stained for the epithelial 108 marker cytokeratin 7 (KRT7) and transcription factors (TF) characteristic of trophoblast,

109 GATA3 and TFAP2C (Lee et al 2016). The ITGA2+ cells at the base of the CCCs express these 110 TFs and KRT7, confirming that they are trophoblast (Fig. 2C, n = 3, G.A. = 8-9 weeks).

111 As ITGA2 and CD31 are expressed in a region with high proliferative activity, we co-stained 112 ITGA2 and CD31 with Ki67. The majority of ITGA2+ and CD31+ trophoblast cells are Ki67+ and 113 these cells cluster near the basement membrane at the base of the CCCs, surrounded by 114 Ki67+ITGA2-/CD31- cells (Fig. 2D, n = 3, G.A.= 7-8 weeks). These findings show that CD31 and 115 ITGA2 mark a niche of proliferative trophoblast cells.

116 Characterisation of isolated ITGA2+ cells

117 To isolate these cells by flow cytometry we used an to ITGA2 because CD31 is 118 sensitive to the enzymatic digestion used in our isolation procedure. Single, live cells were 119 gated, and leukocytes and endothelial cells were excluded based on CD45 and CD34 120 expression, respectively (Fig. 3A). About 3.60 ± 0.74 % (mean ± s.e.m.) of cells in the 121 remaining fraction are ITGA2+ (Fig. 3A, n = 3, G.A.= 7-8 weeks). These ITGA2+ cells are all 122 KRT7+ and contain both HLA-G+ and EGFR+ cells demarcating EVT and VCT respectively (Fig. 123 3A, B, n = 3, G.A.= 7-8 weeks)(Apps et al., 2011). We further determined that the percentage 124 of Ki67+ cells is much higher in the ITGA2+ population than in the ITGA2- fraction (Fig. 3C; 125 71.3 ± 0.3 % vs. 42.5 ± 4.3% (mean ± s.e.m.); n = 3, G.A.= 7-8 weeks).

126 To provide further confirmation of the trophoblast identity of ITGA2+ cells, we investigated 127 the methylation status of the ELF5 promoter. The ELF5 promoter is hypomethylated in 128 human and mouse trophoblast cells compared to cells that originate from the embryonic 129 lineage, which includes mesenchymal cells of the villous core and vascular endothelial cells 130 (Lee et al., 2016; Ng et al., 2008). We compared the ELF5 promoter by bisulphite sequencing 131 in three trophoblast populations; ITGA2+ cells, VCT and EVT. ITGA2+ cells were isolated by 132 flow cytometry first, followed by VCT and EVT from the remaining cells using EGFR and HLA- 133 G respectively (A-E-G populations). The ELF5 promoter is hypomethylated in all three 134 populations, with no indication that ELF5 is differentially methylated in proliferative or 135 differentiated trophoblast (Fig. S1, n = 8 independent clones per donor from two donors). 136 We have shown in a previous publication that the ELF5 promoters in placental mesenchymal 137 cells are hypermethylated (Lee et al., 2016). Similar findings for other non-trophoblast cells 138 have been reported recently (Okae et al., 2018). The hypomethylation of ELF5 promoter in 139 ITGA2+ cells therefore provides additional proof of the trophoblast identity of ITGA2+ cells. 140

141 These findings confirm that ITGA2 can be used as a marker to isolate a unique sub- 142 population of trophoblast cells from the proliferative niche at the base of the CCC.

143 profile of ITGA2+ trophoblast

144 To understand how cells located in the proximal CCC differ from those in the villus and distal 145 CCC, the transcriptomes of the A-E-G populations from 4 placentas (G.A.= 8-9 weeks) were 146 compared by microarray. The samples cluster according to cell type by principal component 147 analysis (PCA), suggesting that there is high purity of the three populations with little 148 differences between individual donors (Fig. 4A). Hierarchical clustering provides further 149 support that the ITGA2+ cells are a distinct population from EVT and VCT (Fig. S2A). Genes 150 differentially expressed between VCT and EVT are similar to our previous findings (Fig. 151 S2B)(Apps et al., 2011). Levels for ITGA2, EGFR and HLA-G are also the highest in the ITGA2+ 152 cells, VCT and EVT respectively, further verifying the identity and purity of the isolated cell 153 types (Fig. 4B). The microarray results were validated on several candidate genes by RT-qPCR 154 (Fig. 4C).

155 As the RT-qPCR results showed that a 1.23-fold difference in gene expression on the 156 microarray was significant, we used a false discovery rate (FDR) of less than 0.05 and 1.23- 157 fold difference in gene expression and found that 102 genes are more highly expressed in 158 ITGA2+ cells than the other two trophoblast populations (Table S1, Fig. 4D). Using these 159 genes for analysis identified the terms wound healing, tissue regeneration 160 and proliferation amongst the categories with strongest significance and enrichment (Fig. 161 4E). There are far fewer genes downregulated in ITGA2+ cells (Fig S2C, Table S2).

162 We confirmed expression of three highly expressed surface in ITGA2+ cells: EpCAM, 163 TM4SF1, CD81 and COL17A1. EpCAM (also known as TACSTD1) is expressed by other stem 164 cells and more importantly, is present specifically on a subset of trophoblast progenitors in 165 the mouse placenta (Ueno et al., 2013). In human placentas, EpCAM is present on VCT and 166 at the base of the columns at 6-7 weeks, but is restricted to the base of the CCC after 8 167 weeks (Fig. 5A, n = 3, G.A.= 6-7 weeks; n = 3, G.A.= 8-10 weeks). The expression for TM4SF1 168 and CD81 is more widespread than EpCAM, but the strongest expression for both surface 169 proteins is at the base of the CCC (Fig. S3A, n = 3, G.A.= 8-9 weeks). We also identified 170 components of hemidesmosomes, COL17A1 and LAMB3 enriched in ITGA2+ cells (Fig. S3B). 171 COL17A1 is expressed mainly at the base of the CCCs and in some EVT (Fig. S3C). The 172 expression of different surface proteins in the proximal CCC will allow further analysis of the 173 heterogeneity within this niche in the future.

174 Apart from these surface markers, one of the top differentially expressed genes is thymosin 175 beta 4 (TMSB4X), which is associated with the stemness and differentiation of progenitor 176 and cancer cells (highlighted in Fig. 4D)(Table S1)(Bock-Marquette et al., 2009; Lv et al., 177 2013; Wirsching et al., 2013). TMSB4X increases NOTCH1 activity, and NOTCH1 is also 178 upregulated in ITGA2+ cells (highlighted in Fig. 4D)(Table S1)(Huang et al., 2016; Lv et al., 179 2013). We therefore investigated the levels of genes involved in NOTCH signalling in the 180 ITGA2+ population. NOTCH1 and downstream targets CDKN1A and PLAU are significantly 181 upregulated in ITGA2+ cells compared to VCT and EVT, suggesting that NOTCH signalling is 182 active in TPs (Fig. 5B)(Anova two-way group analysis, Tukey’s multiple comparisons test). 183 HES2 expression as another target of NOTCH signalling is also enriched in ITGA2+ cells with 184 an FDR <0.05. The location of NOTCH1 in human placentas has been controversial (De Falco 185 et al., 2007; Haider et al., 2014; Herr et al., 2011; Hunkapiller et al., 2011). Our results agree 186 with Haider et al. that NOTCH1 is present at the base of the columns and demarcates a 187 progenitor-like population.

188 Since NOTCH1 is involved in the regulation of epithelial-mesenchymal transition (EMT), and 189 this pathway also came out of our gene set enrichment analyses (Fig. 5C), we looked at the 190 expression of EMT genes amongst the A-E-G populations (Fig. 5D, S3D)(Wang et al., 2011). 191 EMT genes that are 1.23 times higher in at least one of the populations were selected. Most 192 of the gene expression changes between the EGFR+ and HLA-G+ populations reflect the 193 epithelial and mesenchymal nature of VCT and EVT respectively, as previously described (Fig. 194 S3D)(Davies et al., 2016). The ITGA2+ subgroup of cells, however, shows an intermediate 195 phenotype between these two extremes. As such, it is more akin to VCT in terms of CDH1 (E- 196 Cadherin) expression, but at the same time expresses high levels of TPM1 and TPM2, more 197 similar to EVT cells (Fig. S3D). Moreover, we identified a particular group of EMT-related 198 genes that are exclusively upregulated in the ITGA2+ population (Fig. 5D).

199 Taken together, the transcriptome of the A-E-G populations shows that there is a 200 proliferative population of cells at the base of the CCCs marked by ITGA2. To understand 201 more about the differentiation capacity of these cells, we performed lineage tracing on 202 human first trimester placentas.

203 Cells in the proximal column can contribute to VCT and EVT

204 Since lineage tracing by genetic manipulation is not practicable in human placentas, we used 205 IdU labelling of explants as a proxy hereditary label to trace the fate of cells. Specifically, 206 cells were treated with IdU for an hour on Day 0, and half of the explants were fixed

207 immediately while the other half were fixed after three days (Fig. 6A). With IdU portioned 208 equally between daughter progeny following division, the positional fate of cells could be 209 traced over time.

210 Following short term IdU incorporation, IdU+ cells were visible in a region close to the base 211 of the CCCs as described above, localizing high proliferative activity to cells in this niche (Fig. 212 1B). To trace the fate of these proliferative cells, we then examined the pattern of IdU 213 staining after 3 days of chase. This allowed the migration of IdU labelled cells into the 214 columns to be assessed in two ways: The first method relied on the tendency of IdU+ cells to 215 form “strings” of consecutive positively labelled cells in the proximal CCC. We reasoned that 216 the change in the length of the string could provide an estimate of the cell migration rate 217 from the base region. To control for potential clone variation associated with lateral division, 218 only cells in the longest string in each column were scored. In this way, we obtained an 219 estimate for the average length of the longest string per column for each donor placenta 220 between Day 0 and Day 3. An example of how cells in the string were counted is shown in 221 Fig. 6B. The results show that there is a net increase in string size after three days (Fig. 6C, n 222 = 6, G.A.= 8-10 weeks, p < 0.01, Mann-Whitney U test). To verify that the strings were 223 representative of what is happening in the entire column, we measured the area of the 224 regions containing labelled cells and the area of the entire columns. By comparing the ratio 225 of the area of labelled cells to that of the entire column between Day 0 and 3, we confirmed 226 that there are more labelled cells in the column after three days (Fig 6D, n = 6, G.A.= 8-10 227 weeks, p < 0.01, Mann-Whitney U test).

228 A similar strategy was employed for studying whether the proliferative trophoblast in the 229 proximal CCC can contribute to the villus. As there tend to be consecutive IdU+ VCT at the 230 edge of the CCCs, the average number of consecutive positive VCT was compared between 231 Day 0 and Day 3. There was a net gain in the number of consecutive IdU+ VCT near the edge 232 of the columns (Fig. 6E, n = 6, G.A.= 8-10 weeks, p < 0.01, Mann-Whitney U test). This 233 suggests that the proliferative cells at the base of the CCCs could contribute to both villi and 234 columns.

235 Discussion

236 The identity and location of TP cells in early human placentas are still unknown. We have 237 identified ITGA2 as a surface marker on a subpopulation of proliferative trophoblast residing 238 at the base of the CCC that can probably contribute to both VCT and EVT. Using it, we have 239 been able to isolate and characterise putative TPs. We used placentas from the first 240 trimester to study TPs, as expression of proliferative markers decreases with gestational age 241 in the human placenta (Arnholdt et al., 1991; Hemberger et al., 2010; Horii et al., 2016). 242 Similarly, EpCAM is expressed in VCT early in the first trimester but becomes restricted to 243 the proximal CCC at approximately 8 weeks. This evidence indicates that the proliferative 244 niche becomes more restricted as gestation proceeds in keeping with the exponential 245 growth of the placenta very early in gestation. This is analogous to the gut where 246 proliferation is initially present throughout the epithelium but becomes confined to the 247 crypts during development (Noah et al., 2011).

248 Sites of proliferation are commonly associated with progenitor cells. To locate these in early 249 pregnancy, we used both Ki67 and IdU on sections of first trimester placentas. In agreement 250 with others, we found that proliferative markers are concentrated at the base of the CCCs, 251 with less proliferation occurring in the VCT population scattered around the villi between 7- 252 11 weeks (Arnholdt et al., 1991; Bulmer et al., 1988; Hemberger et al., 2010; Korgun et al., 253 2006; Lash et al., 2006; Muhlhauser et al., 1993; Westerman et al., 2004). Our lineage 254 tracing results suggest that these cells localised at the base of the CCC can contribute to 255 both VCT and EVT. Next, because the stem/progenitor cell niche is often characterized by 256 the expression of particular integrin subtypes in a range of tissues, we looked for an integrin 257 whose expression was restricted to the base of the CCC (Chen et al., 2013). We found that 258 ITGA2 marks a small group of cells in this proliferative niche and can be used to isolate the 259 putative TPs. Villous endothelial cells also express ITGA2 but these can be excluded using the 260 specific endothelial marker, CD34. ITGA2 can only form heterodimers with ITGB1, and a2b1 261 integrin binds to collagen I, II, IV, XI and XXIII as well as laminin (Tuckwell et al., 1995; 262 Tuckwell et al., 1996; Tulla et al., 2001; Veit et al., 2011). Although collagen I was 263 upregulated in ITGA2+ trophoblast in our microarray analysis, we found significant 264 expression only in the villous stromal core and not in the CCC (unpublished results), thus 265 making a functional interaction with one of the other binding partners more likely. Mice 266 lacking the Itga2 gene are viable and exhibit no placental defects, suggesting that it is not 267 essential for murine trophoblast development (Chen et al., 2002). Whilst the role of ITGA2 in

268 trophoblast cells is unclear, it also marks proliferating cells in other organs (Beaulieu, 1992; 269 Lussier et al., 2000; Zutter and Santoro, 1990).

270 We used flow cytometric cell sorting to isolate the ITGA2+ cells at the base of the CCCs and 271 performed a transcriptome analysis to identify their specific expression profile. Amongst the 272 upregulated genes in the ITGA2+ trophoblast, we identified COL17A1 and LAMB3, 273 components of hemidesmosomes. COL17A1 is critical for the maintenance of hair follicle 274 stem cells and deficiency of COL17A1 leads to loss of the stem cell signature and premature 275 hair ageing (Tanimura et al., 2011). Moreover, the product of SFN (Stratifin or 14-3-3s), 276 another ITGA2+ upregulated gene, can bind both COL17A1 and keratins, therefore bridging 277 the hemidesmosomes to the rest of the cytoskeleton (Li et al., 2007). Indeed, the balance 278 between proliferation and differentiation in skin has parallels with the CCC and 279 differentiation to EVT.

280 We also find that ITGA2+ trophoblast are enriched in genes involved in the NOTCH1 281 signalling pathway, HES2, CDKN1A, PLAU and MYC. Moreover, TMSB4X as an ITGA2+- 282 enriched factor is known to increase the expression of NOTCH1, providing evidence for the 283 self-reinforcing nature of the ITGA2 niche (Huang et al., 2016; Lv et al., 2013). As expression 284 of different NOTCH receptors is restricted to subsets of trophoblast, and the inhibition of 285 NOTCH signalling elicits two opposing effects of proliferation and differentiation in a mixed 286 population of trophoblast, it will be interesting to study the specific inhibition of NOTCH1 in 287 ITGA2+ trophoblast in vitro (Haider et al., 2014).

288 As cells in the CCC move away from the basement membrane and invade the decidua upon 289 attachment to the uterus, they undergo a process typical of EMT (Davies et al., 2016). We 290 find that ITGA2+ trophoblast cells at the base of the CCC exhibit an unusual expression 291 repertoire of both epithelial and mesenchymal markers. Even within the group of genes 292 generically classified as “mesenchymal”, most of them are not expressed in EVT. These data 293 suggest that the ITGA2+ subpopulation is in a unique state between epithelial-like VCT and 294 the more mesenchymal-like EVT. This phenotype is in line with the position of ITGA2+ cells at 295 the interface between both groups, supporting the notion that they may be capable of 296 contributing to both of these major trophoblast populations.

297 The flow cytometry data further corroborate this point as the ITGA2+ population is 298 proliferative and contains both EGFR+ VCT and HLA-G+ EVT cells. The truly bipotential 299 capacity of these cells is difficult to prove however, as neither lineage tracing of human

300 trophoblast or live imaging are possible because of the optical density of the CCCs and the 301 floating nature of the villi. We used a thymidine analogue pulse-chase strategy as a 302 preliminary attempt to study the differentiation potential of the cells in this proliferative 303 niche. Based on the contribution of IdU+ cells, we conclude that the cells at the base of the 304 CCCs do contribute to both VCT and EVT, although we cannot strictly rule out the presence 305 of separate progenitor cells with different lineage contributions.

306 Taken together, we have identified a cell surface marker, ITGA2, which marks a proliferative 307 TP compartment in the first trimester placenta that is regulated by NOTCH signalling and 308 exhibits unique expression characteristics. These insights will help elucidate the putative 309 stem or progenitor cell niche in the early human placenta for future attempts at culturing a 310 self-renewing human trophoblast cell population.

311

312 Materials and methods

313 Ethics

314 Ethical approval was obtained from the Cambridgeshire 2 Research Ethics Committee 315 (reference no. 04/Q0108/23; Cambridge, United Kingdom) and informed written consent 316 was obtained from each patient.

317 Immunohistochemistry and confocal microscopy

318 All except anti-IdU and anti-COL17A1 were stained on frozen sections. Pieces of 319 placenta were embedded in OCT, snap-frozen in liquid nitrogen and sectioned at 5 µm thick, 320 then fixed in acetone. Paraffin sections were dewaxed and placed at 125°C for 30 s and then 321 90°C for 10 s in citrate buffer (A Menarini). To stain with DAB, the sections were blocked in 322 2.5% horse serum, and then incubated in primary antibody, biotinylated secondary antibody 323 (Vectastain #BA-2000, #BA-9500, BA-1100) and HRP-conjugated ABC complex (Vectastain 324 #PK-6100). Each incubation was 30 min long, with 2 x 5 min 0.1% tween/PBS washes 325 between each incubation. Incubation for anti-COL17A1 was done overnight at 4°C. The 326 sections were then developed with 3, 3’- diaminobenzidine (DAB)(Sigma #D4168).

327 For co-immunofluorescence staining, the sections were incubated in primary antibody 328 overnight at 4°C and in fluorophore-conjugated secondary antibody (Invitrogen #A-11029,

329 #A-11011, diluted 1:400) for 2 h. Coverslips were mounted using Vectashield mounting 330 medium, containing DAPI (#H1200) and images were acquired with Zeiss LSM 700.

331 Table of antibodies used for IHC and co-immunofluorescence

Dilution Antigen Brand Catalog number Clone factor ITGA2 R&D MAB1233 HAS3 1/200 TFAP2C R&D AF5059 - 1/100 IdU BD Biosciences 347580 B44 1/50 GATA3 R&D AF2605 - 1/100 Ki67 A Menarini MP-325-CRM01 SP6 1/100 TM4SF1 Sigma Aldrich HPA002823 - 1/350 CD34 Dako M716529-2 QBEnd10 1/100 KRT7 Dako M701829-2 OVTL12/30 1/200

COL17A1 Abcam Ab186415 EPR14758 1/100 EpCAM BD Biosciences 347197 EBA-1 3/100 CD31 Dako M082329-2 JC70A 1/100 CD81 BD Biosciences 561957 JS-81 3/100 332

333 Isolation of cells from human placentas

334 To obtain single cells from the placenta, the chorionic villi were scraped off from the 335 membranes and incubated in 0.2% trypsin at 37°C for 9 min. The resulting mixture was 336 sieved through muslin cloth and the flow through was centrifuged. The pellet was re- 337 suspended in 8 mL of Ham’s F12 medium, filtered through a 100 µm cell sieve, then layered 338 onto 8 mL of Lymphoprep™ (Axis-shield, #1114544) and centrifuged at 700 RCF for 20 min at 339 r.t.p. Mononuclear cells were recovered from the interphase between the Lymphoprep™ 340 and Ham’s F12 medium.

341 Staining for surface and intracellular proteins for flow cytometry 342 Live cells were blocked with 0.25 mg/mL human immunoglobulin (Sigma #I4506), followed 343 by incubation in fluorophore-conjugated antibodies for 30 min at 4°C. They were then 344 washed in 1% FCS/PBS and fixed in 2% PFA. For cell sorting, cells were not fixed and were 345 sorted by Cytomation MoFlo® immediately.

346 To stain for KRT7 and Ki67, cells were fixed in Foxp3 fixation/ permeabilisation reagent 347 (eBioscience #00-5521-00) for 30 min in the dark after staining for surface proteins and 348 permeabilised in eBioscience Permeabilisation Buffer (#00-8333). They were then blocked in 349 human immunoglobulin and incubated with fluorophore-conjugated antibodies for 15 min 350 at room temperature. They were subsequently washed in eBioscience Permeabilisation 351 Buffer and fixed in 2% PFA.

352 The data acquisition was done via either Cytek Development DxP 8 colours (488/637/561) or 353 BD LSRFortessa (405/488/637/561). All compensation was applied digitally post-acquisition. 354 The data was analysed using FlowJo (Tree Star Inc.).

355

356 Table of antibodies used for flow cytometry

Catalog Dilution Antigens Brand Number Clone Fluorophore factor ITGA2 R&D FAB1233P HAS3 PE 1/100 HLA-G Lab-produced - G233 Pacific blue 1/200 Ki67 BD Biosciences 556026 MOPC-21 FITC 3/100 KRT7 Millipore CBL194F LP5K FITC 1/100 CD45 R&D FAB1430A 2D1 APC 3/40 CD34 BD Biosciences 555824 581 APC 1/100 EGFR AbD Serotec MCA1784F ICR10 FITC 1/20 357

358 Extraction of RNA

359 Cells were lysed in TRIzol™ Reagent (Invitrogen #15596026) and 200 µL of chloroform was 360 added for every 1 mL of TRIzol™. The mixture was then vortexed and centrifuged at 12000 361 RCF, 4°C, for 15 min. The upper aqueous phase was transferred to a new tube, 1.5 times 362 volume of ethanol was added and the entire content was then centrifuged through 363 Purelink™ columns (Invitrogen #12183-016). RNA was purified from the columns following 364 the manufacturer’s protocol, including the DNase step. The final product was eluted in 12 µL 365 of RNase-free water.

366 RT-qPCR

367 RNA was converted into cDNA using the Transcriptor First Strand cDNA Synthesis Kit from 368 Roche (#04379012001). 10 µL of 2x Fast SYBR® Green Mastermix (Applied Bioscience 369 #4385612) and 10 ng of cDNA were used per test, and each gene were tested in triplicates 370 using the Applied Biosystems® 7500 Real-Time PCR System. The settings of the machine 371 were: 95°C 5 min, then 40x (95°C 10 s, 60°C 30 s). Melting curve analysis was done at the 372 end, to ensure that there was only one amplicon.

373

374 Table of primers used for RT-qPCR

Gene name Forward primer (5’ to 3’) Reverse primer (5’ to 3’)

CDH1 GAACGCATTGCCACATACAC ATTCGGGCTTGTTGTCATTC EMP3 CGAGAATGGCTGGCTGAAG GCCACGCTGGTGCAAAG ITGA2 TCACCAGGAACATGGGAACT GTCAGAACACACACCCGTTG ITGB6 CTACCTGTGGTGACCCCTGTAAC GCTTGGCCAGCTGCTGAC LAIR2 CACACCTCACTGCTCTCCTG TGAAAGTCACATGGCTCCCC RARRES3 TGGGCCCTGTATATAGGAGATG GGACTGAGAAGACACTGGAGGA SOX13 AAGGATGAGCGGAGGAAGAT GACTTCCAGCGAGATCCAAG TBP GAGCTGTGATGTGAAGTTTCC TCTGGGTTTGATCATTCTGTAG 375

376 Microarray

377 Four donor sets of three cell types with RNA integrity number (RIN) above 7 as measured by 378 2100 Bioanalyzer Instrument (Agilent Technologies) were amplified using the Ovation® Pico 379 WTA System v2 (NuGEN, #3302) according to the manufacturer’s protocol. Good quality 380 final products as assessed by the 2100 Bioanalyzer Instrument were biotinylated using the 381 Encore Biotin Module by NuGEN (#4200-12) and hybridised to Illumina Human HT-12 V4 382 BeadArray (#BD-103-0204) using the manufacturer’s protocol. After correcting for 383 background, the output from Illumina BeadStudio were logged to the base of two and 384 multiple testing corrections was applied to paired-wise analysis of each cell type against 385 each other in the R statistical programme. The data was filtered to remove probes for which 386 the detection p-value was not above 0.01 for at least one sample for the analysis.

387 Microarray work and analysis was done by Cambridge Genomic Services. Gene functional 388 annotation analysis was done using Database for Annotation, Visualization and Integrated 389 Discovery (DAVID)(Huang et al., 2008; Huang et al., 2009) and Gene Set Enrichment Analysis 390 (GSEA)(Aravind Subramanian et al., 2005; Mootha et al., 2003).

391

392 Bisulphite sequencing

393 DNA was extracted from TRIzol™-lysed samples after the RNA had been removed from the 394 aqueous phase, according to manufacturer’s protocol. 10 ng of DNA was used for bisulphite 395 conversion using the EpiTect Bisulfite Kit (Qiagen #59110), according to manufacturer’s 396 protocol. The -432 to -3bp region upstream of ELF5 transcription start site was then 397 amplified via nested PCR. The primer sequences were:

Primer name Sequence

hELF5-2b BiS -483F GGAAATGATGGATATTGAATTTGA

hELF5-2b BiS +31R CAATAAAAATAAAAACACCTATAACC

hELF5-2b BiS -432F GAGGTTTTAATATTGGGTTTATAATG

hELF5-2b BiS -3R ATAAATAACACCTACAAACAAATCC

398

399 Amplicons were purified from agarose gel using QIAquick Gel Extraction Kit (Qiagen, #28704) 400 and cloned into the pGEM-T Easy vector (Promega, #A137A). The ligations were transformed

401 into Library Efficiency® DH5α™ Chemically Competent Cells (Invitrogen, #18263012) and 402 eight colonies per group were selected to be sequenced; these were confirmed as 403 representing distinct alleles.

404 Lineage tracing by pulse chasing

405 Samples from seven donors were cut into roughly 10 x 10 x 10 mm pieces, avoiding pieces nearer to 406 the edge of the placenta. During IdU incubation, they were suspended in 10% FCS/ DMEM ± 80 µM 407 IdU, with three pieces of explants/well in a 6-well plate coated with 1% gelatin (1 mL/ well). At least 408 three random explants per donor were collected at each time point. The explants were fixed in 4% 409 PFA overnight, switched to 70% ethanol for storage, and embedded in paraffin for sectioning. For 410 antigen retrieval and DNA denaturation, sections were immersed in Access Super RTU buffer (pH 9; 411 Menarini Diagnostics) and placed in a Menarini Access Retrieval Unit and subjected to high pressure 412 at 125°C for 1 min and then 90°C for 10 s. The slides were then stained as per normal.

413

414 Statistics

415 All error bars on graphs are standard error bars. Types of statistical tests are indicated in the 416 figure captions.

417 Acknowledgements

418 We thank all donors for providing tissue and Diane Moore for coordinating the tissue 419 collection. We are also grateful to Nigel Miller and Julien Bauer for their expertise on flow 420 cytometric sorting and analysis of microarray data respectively.

421 Competing interests

422 No competing of interest declared.

423 Funding

424 This work was funded by the Centre for Trophoblast Research (CTR), the Wellcome Trust 425 (090108/Z/09/Z and 085992/Z/08/Z), Medical Research Council (MR/P001092/1), and the 426 Biotechnology and Biological Sciences Research Council (BB/P013406/1). CL and LG are 427 funded by the CTR. CL is also supported by Agency for Science, Technology and Research. 428 MYT has received funding from CTR and E.U. 7th Framework Programme for research, 429 technological development and demonstration under grant agreement to no PIEF-GA-2013- 430 629785. 431

432 Data availability

433 The microarray is deposited at the GEO repository under GSE106852.

434

435 References

436 Apps, R., Sharkey, A., Gardner, L., Male, V., Trotter, M., Miller, N., North, R., Founds, S. and 437 Moffett, A. (2011). Genome-wide expression profile of first trimester villous and 438 extravillous human trophoblast cells. Placenta 32, 33–43.

439 Aravind Subramanian, Tamayo, P., Mootha, V. K., Mukherjeed, S., Ebert, B. L., Gillette, M. 440 A., Paulovichg, A., Pomeroyh, S. L., Golub, T. R., Lander, E. S., et al. (2005). GSEA : 441 Gene set enrichment analysis Gene set enrichment analysis : A knowledge-based 442 approach for interpreting genome-wide expression profiles. Proc. Natl. Acad. Sci. 102, 443 15545–15550.

444 Arnholdt, H., Meisel, F., Fandrey, K. and Löhrs, U. (1991). Proliferation of villous trophoblast 445 of the human placenta in normal and abnormal pregnancies. Virchows Arch. B, Cell 446 Pathol. 60, 365–72.

447 Beaulieu, J.-F. (1992). Differential expression of the VLA family of integrins along the crypt- 448 villus axis in the human small intestine. J. Cell Sci. 102, 427–36.

449 Bock-Marquette, I., Shrivastava, S., Pipes, G. C. T., Thatcher, J. E., Blystone, A., Shelton, J. 450 M., Galindo, C. L., Melegh, B., Srivastava, D., Olson, E. N., et al. (2009). Thymosin β4 451 mediated PKC activation is essential to initiate the embryonic coronary developmental 452 program and epicardial progenitor cell activation in adult mice. J. Mol. cell Cardiol. 46, 453 728–738.

454 Bulmer, J. N., Morrison, L. and Johnson, P. M. (1988). Expression of the proliferation 455 markers Ki67 and by human trophoblast populations. J. Reprod. 456 Immunol. 14, 291–302.

457 Chan, C. C. W., T T, L. and Cheung, A. N. Y. (1999). Apoptotic and proliferative activities in 458 first trimester placentae. Placenta 20, 223–227.

459 Chen, J., Diacovo, T. G., Grenache, D. G., Santoro, S. A. and Zutter, M. M. (2002). The α2 460 integrin subunit-deficient mouse. Am. J. Pathol. 161, 337–344.

461 Chen, S., Lewallen, M. and Xie, T. (2013). Adhesion in the stem cell niche: biological roles 462 and regulation. Development 140, 255–265.

463 Davies, J. E., Pollheimer, J., Yong, H. E. J., Kokkinos, M. I., Kalionis, B., Knöfler, M. and 464 Murthi, P. (2016). Epithelial-mesenchymal transition during extravillous trophoblast

465 differentiation. Cell Adhes. Migr. 10, 310–321.

466 De Falco, M., Cobellis, L., Giraldi, D., Mastrogiacomo, A., Perna, A., Colacurci, N., Miele, L. 467 and De Luca, A. (2007). Expression and distribution of notch protein members in 468 human placenta throughout pregnancy. Placenta 28, 118–26.

469 Ellis, S. J. and Tanentzapf, G. (2010). Integrin-mediated adhesion and stem-cell-niche 470 interactions. Cell Tissue Res. 339, 121–30.

471 Enders, A. C. (1968). Fine structure of anchoring villi of the human placenta. Am. J. Anat. 472 122, 419–451.

473 Gerdes, J., Lemke, H., Baisch, H., Wacker, H. H., Schwab, U. and Stein, H. (1984). Cell cycle 474 analysis of a cell proliferation-associated human nuclear antigen defined by the 475 monoclonal antibody Ki-67. J. Immunol. 133, 1710–5.

476 Haider, S., Meinhardt, G., Velicky, P., Otti, G. R., Whitley, G., Fiala, C., Pollheimer, J. and 477 Knöfler, M. (2014). Notch signaling plays a critical role in motility and differentiation of 478 human first-trimester cytotrophoblasts. Endocrinology 155, 263–74.

479 Hemberger, M., Udayashankar, R., Tesar, P., Moore, H. and Burton, G. J. (2010). ELF5- 480 enforced transcriptional networks define an epigenetically regulated trophoblast stem 481 cell compartment in the human placenta. Hum. Mol. Genet. 19, 2456–67.

482 Herr, F., Schreiner, I., Baal, N., Pfarrer, C. and Zygmunt, M. (2011). Expression patterns of 483 Notch receptors and their ligands Jagged and Delta in human placenta. Placenta 32, 484 554–63.

485 Hirsch, E., Barberis, L., Brancaccio, M., Azzolino, O., Xu, D., Kyriakis, J. M., Silengo, L., 486 Giancotti, F. G., Tarone, G., Fässler, R., et al. (2002). Defective Rac-mediated 487 proliferation and survival after targeted mutation of the beta1 integrin cytodomain. J. 488 Cell Biol. 157, 481–92.

489 Horii, M., Li, Y., Wakeland, A. K., Pizzo, D. P., Nelson, K. K., Sabatini, K., Laurent, L. C., Liu, 490 Y. and Parast, M. M. (2016). Human pluripotent stem cells as a model of trophoblast 491 differentiation in both normal development and disease. Proc. Natl. Acad. Sci. 113, 492 E3882–E3891.

493 Huang, D. W., Sherman, B. T. and Lempicki, R. A. (2008). Systematic and integrative analysis 494 of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4, 44–57.

495 Huang, D. W., Sherman, B. T. and Lempicki, R. A. (2009). Bioinformatics enrichment tools: 496 paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids 497 Res. 37, 1–13.

498 Huang, D., Wang, S., Wang, A., Chen, X. and Zhang, H. (2016). Thymosin beta 4 silencing 499 suppresses proliferation and invasion of non-small cell lung cancer cells by repressing 500 Notch1 activation. Acta Biochim. Biophys. Sin. (Shanghai). 48, 788–794.

501 Hunkapiller, N. M., Gasperowicz, M., Kapidzic, M., Plaks, V., Maltepe, E., Kitajewski, J., 502 Cross, J. C. and Fisher, S. J. (2011). A role for Notch signaling in trophoblast 503 endovascular invasion and in the pathogenesis of pre-eclampsia. Development 138, 504 2987–98.

505 Jensen, U. B., Lowell, S. and Watt, F. M. (1999). The spatial relationship between stem cells 506 and their progeny in the basal layer of human epidermis: a new view based on whole- 507 mount labelling and lineage analysis. Development 126, 2409–18.

508 Jones, P. H. and Watt, F. M. (1993). Separation of human epidermal stem cells from transit 509 amplifying cells on the basis of differences in integrin function and expression. Cell 73, 510 713–724.

511 Korgun, E. T., Celik-Ozenci, C., Acar, N., Cayli, S., Desoye, G. and Demir, R. (2006). Location 512 of cell cycle regulators cyclin B1, cyclin A, PCNA, Ki67 and cell cycle inhibitors p21, p27 513 and p57 in human first trimester placenta and deciduas. Histochem. Cell Biol. 125, 615– 514 24.

515 Lash, G. E., Otun, H. A., Innes, B. A., Bulmer, J. N., Searle, R. F. and Robson, S. C. (2006). 516 Low oxygen concentrations inhibit trophoblast cell invasion from early gestation 517 placental explants via alterations in levels of the urokinase plasminogen activator 518 system. Biol. Reprod. 74, 403–9.

519 Lee, C. Q. E. C. Q. E., Gardner, L., Turco, M., Zhao, N., Murray, M. J. M. J., Coleman, N., 520 Rossant, J., Hemberger, M. and Moffett, A. (2016). What Is trophoblast? A 521 combination of criteria define human first-trimester trophoblast. Stem Cell Reports 6, 522 257–272.

523 Li, Y., Lin, X., Kilani, R. T., Jones, J. C. R. and Ghahary, A. (2007). 14-3-3 sigma isoform 524 interacts with the cytoplasmic domain of the transmembrane BP180 in keratinocytes. J. 525 Cell. Physiol. 212, 675–681.

526 Lussier, C., Basora, N., Bouatrouss, Y. and Beaulieu, J. F. (2000). Integrins as mediators of 527 epithelial cell-matrix interactions in the human small intestinal mucosa. Microsc. Res. 528 Tech. 51, 169–78.

529 Lv, S., Cheng, G., Zhou, Y. and Xu, G. (2013). Thymosin beta4 induces angiogenesis through 530 Notch signaling in endothelial cells. Mol. Cell. Biochem. 381, 283–90.

531 Mootha, V. K., Lindgren, C. M., Eriksson, K.-F., Subramanian, A., Sihag, S., Lehar, J., 532 Puigserver, P., Carlsson, E., Ridderstrale, M., Laurila, E., et al. (2003). PGC-1[alpha]- 533 responsive genes involved in oxidative phosphorylation are coordinately 534 downregulated in human diabetes. Nat Genet 34, 267–273.

535 Muhlhauser, J., Crescimanno, C., Kaufmann, P., Hofler, H., Zaccheo, D. and Castellucci, M. 536 (1993). Differentiation and proliferation patterns in human trophoblast revealed by c- 537 erbB-2 oncogene product and EGF-R. J. Histochem. Cytochem. 41, 165–173.

538 Ng, R. K., Dean, W., Dawson, C., Lucifero, D., Madeja, Z., Reik, W. and Hemberger, M. 539 (2008). Epigenetic restriction of embryonic cell lineage fate by methylation of Elf5. Nat. 540 Cell Biol. 10, 1280–1290.

541 Noah, T. K., Donahue, B. and Shroyer, N. F. (2011). Intestinal development and 542 differentiation. Exp. Cell Res. 317, 2702–2710.

543 Okae, H., Toh, H., Sato, T., Hiura, H., Takahashi, S., Shirane, K., Kabayama, Y., Suyama, M., 544 Sasaki, H. and Arima, T. (2018). Derivation of human trophoblast stem cells. Cell Stem 545 Cell 22, 50–63.

546 Sastry, S. K., Lakonishok, M., Thomas, D. A., Muschler, J. and Horwitz, A. F. (1996). Integrin 547 alpha subunit ratios, cytoplasmic domains, and growth factor synergy regulate muscle 548 proliferation and differentiation. J. Cell Biol. 133, 169–84.

549 Shackleton, M., Vaillant, F., Simpson, K. J., Stingl, J., Smyth, G. K., Asselin-Labat, M.-L., Wu, 550 L., Lindeman, G. J. and Visvader, J. E. (2006). Generation of a functional mammary 551 gland from a single stem cell. Nature 439, 84–8.

552 Stingl, J., Eirew, P., Ricketson, I., Shackleton, M., Vaillant, F., Choi, D., Li, H. I. and Eaves, C. 553 J. (2006). Purification and unique properties of mammary epithelial stem cells. Nature 554 439, 993–7.

555 Taddei, I., Deugnier, M.-A., Faraldo, M. M., Petit, V., Bouvard, D., Medina, D., Fässler, R.,

556 Thiery, J. P. and Glukhova, M. A. (2008). Beta1 integrin deletion from the basal 557 compartment of the mammary epithelium affects stem cells. Nat. Cell Biol. 10, 716–22.

558 Tanaka, S., Kunath, T., Anna-Katerina, H., Nagy, A. and Rossant, J. (1998). Promotion of 559 trophoblast stem cell proliferation by FGF4. Science (80-. ). 282, 2072–2075.

560 Tanimura, S., Tadokoro, Y., Inomata, K., Binh, N. T., Nishie, W., Yamazaki, S., Nakauchi, H., 561 Tanaka, Y., McMillan, J. R., Sawamura, D., et al. (2011). Hair follicle stem cells provide 562 a functional niche for melanocyte stem cells. Cell Stem Cell 8, 177–87.

563 Tuckwell, D., Calderwood, D. A., Green, L. J. and Humphries, M. J. (1995). I- 564 domain is a binding site for collagens. J. Cell Sci. 108, 1629–37.

565 Tuckwell, D. S., Reid, K. B., Barnes, M. J. and Humphries, M. J. (1996). The A-domain of 566 integrin alpha 2 binds specifically to a range of collagens but is not a general receptor 567 for the collagenous motif. Eur. J. Biochem. 241, 732–9.

568 Tulla, M., Pentikäinen, O. T., Viitasalo, T., Käpylä, J., Impola, U., Nykvist, P., Nissinen, L., 569 Johnson, M. S. and Heino, J. (2001). Selective binding of collagen subtypes by integrin 570 α1I, α2I, and α10I domains. J. Biol. Chem. 276, 48206–48212.

571 Ueno, M., Lee, L. K., Chhabra, A., Kim, Y. J., Sasidharan, R., Van Handel, B., Wang, Y., 572 Kamata, M., Kamran, P., Sereti, K.-I., et al. (2013). c-Met-dependent multipotent 573 labyrinth trophoblast progenitors establish placental exchange interface. Dev. Cell 27, 574 373–386.

575 Veit, G., Zwolanek, D., Eckes, B., Niland, S., Käpylä, J., Zweers, M. C., Ishada-Yamamoto, A., 576 Krieg, T., Heino, J., Eble, J. A., et al. (2011). Collagen XXIII, novel ligand for integrin 577 alpha2beta1 in the epidermis. J. Biol. Chem. 286, 27804–13.

578 Vićovac, L., Jones, C. J. P. and Aplin, J. D. (1995). Trophoblast differentiation during 579 formation of anchoring villi in a model of the early human placenta in vitro. Placenta 580 16, 41–56.

581 Wang, Z., Li, Y., Kong, D. and Sarkar, F. H. (2011). The role of Notch signaling pathway in 582 epithelial-mesenchymal transition (EMT) during development and tumor 583 aggressiveness. Curr. Drug Targets 11, 745–751.

584 Westerman, B. A., Poutsma, A., Steegers, E. A. P. and Oudejans, C. B. M. (2004). C2360, a 585 nuclear protein expressed in human proliferative cytotrophoblasts, is a representative

586 member of a novel protein family with a conserved coiled coil-helix-coiled coil-helix 587 domain. Genomics 83, 1094–104.

588 Wirsching, H.-G., Krishnan, S., Florea, A.-M., Frei, K., Krayenbühl, N., Hasenbach, K., 589 Reifenberger, G., Weller, M. and Tabatabai, G. (2013). Thymosin beta 4 gene silencing 590 decreases stemness and invasiveness in glioblastoma. Brain 137, 433–448.

591 Zutter, M. M. and Santoro, S. A. (1990). Widespread histologic distribution of the alpha 2 592 beta 1 integrin cell-surface collagen receptor. Am. J. Pathol. 137, 113–120.

593

594 Figure Legends

595 Figure 1. Proliferative trophoblast cells mostly reside at the base of the CCCs. (A, B) 596 Placental sections stained for (A) Ki67 (n = 6) and (B) IdU (n = 10). Arrowheads pointing to 597 IdU+ cells within villi that are highly proliferative. Scale bars = 100 μm. (C) Example of strings 598 of IdU+ cells in the CCCs. Scale bar = 250 μm. (D) Co-immunostaining for Ki67 and IdU (n = 3). 599 Scale bar = 100 μm. VC = villous core; CCC = cytotrophoblast cell column.

600 Figure 2. Proliferative trophoblast cells mostly reside at the base of the CCCs. (A) Serial 601 sections show that ITGA2 is expressed on Ki67+ cells (n = 3). (B) Placental serial sections 602 stained for ITGA2, CD31 and CD34 (n = 3). Only CD31 and ITGA2 are present on proximal CCC 603 (top) while all three proteins are expressed on endothelial cells (bottom). (C) Serial sections 604 to show that ITGA2+ trophoblast are positive for Ki67, TFAP2C, GATA3 and KRT7. (D) Co- 605 immunostaining for CD31 and ITGA2 with Ki67 (n = 3). Scale bars = 100 μm. VC = villous core; 606 CCC = cytotrophoblast cell column.

607 Figure 3. ITGA2 can be used to characterise and isolate TPs by flow cytometry. (A) After 608 gating out debris, cell doublets, CD45+ leukocytes and CD34+ endothelial cells, a proportion 609 of KRT7+ trophoblast cells are ITGA2+ in the remaining fraction (3.60 ± 0.74 %)(mean ± 610 s.e.m.; n = 3). (B) Expression of EGFR (VCT) and HLA-G (EVT) on ITGA2+ (middle panel) and 611 ITGA2- cells (right panel)(n = 3). Gating of ITGA2+ cells shown on left panel. (C) Ki67+ cells on 612 cells gated as in B. ITGA2+ (middle panel) and ITGA2- cells (right panel)(71.3 ± 0.3 % vs. 42.5 ± 613 4.3%)(mean ± s.e.m.; n = 3).

614 Figure 4. ITGA2+ trophoblast cells have a unique gene expression profile compared to EVT 615 and VCT. (A) Principal component (PC) analysis is based on 23196 genes with sd/mean more 616 than 0.1. The analysis shows that ITGA2+ cells, VCT and EVT isolated from 4 different donors 617 separate into distinct clusters. (B) EGFR, HLA-G and ITGA2 are upregulated in their 618 corresponding cell populations on the microarray. (C) RT-qPCR validation of some candidate 619 genes as indicated by the microarray results. * = p < 0.05; ** = p < 0.01; *** = p < 0.005; 620 **** = p < 0.0001; One-way ANOVA followed by Tukey’s multiple comparisons test. (D) 621 Heatmap of genes upregulated specifically in ITGA2+ trophoblast cells. Genes that we have 622 mentioned in the text are emphasised in bold. (E) Functional clusters that are enriched in 623 ITGA2+ cells, based on gene ontology analysis using DAVID.

624 Figure 5. ITGA2+ trophoblast are enriched in NOTCH signalling and exhibit distinct EMT 625 marker expression. (A) EpCAM becomes restricted from VCT to base of CCC between 7-8

626 weeks (n = 3 for 6-7 weeks and for 8-10 weeks). VC = villous core; CCC = cytotrophoblast cell 627 column. (B) Microarray data of expression of NOTCH pathway components. CDKN1A and 628 HES are downstream mediators of the NOTCH receptors. NOTCH4 and other HES genes were 629 not detectable on the microarray. (C) Gene Set Enrichment Analysis (GSEA) shows that the 630 EMT pathway is activated in ITGA2+ trophoblast. (D) A specific subset of EMT genes, 631 indicative of a mixed epithelial and mesenchymal identity, is upregulated in ITGA2+ 632 trophoblast. Levels of each gene is normalised to the lowest expressing cell type and only 633 genes with at least 1.23-fold difference are shown. Stats for individual genes: * = p < 0.05; ** 634 = p < 0.01; *** = p < 0.005; **** = p < 0.0001; One-way ANOVA followed by Tukey’s multiple 635 comparisons test.

636 Figure 6. Cells at the base of the CCC could contribute to both VCT and EVT. (A) Timeline for 637 lineage tracing. (B) An example of how the longest consecutive IdU+ streak is counted in 638 CCCs in Day 0 and Day 3 samples. IdU+ cells within the longest string is marked by a red dot. 639 (C-E) There is increase in the average number of consecutive IdU+ cells in the CCC (C), in the 640 area of IdU+ cells in the CCCs (D), and in the number of IdU+ VCT in the villi adjacent to CCCs 641 (E), after three days. (n = 6) ** = p < 0.01 using Mann-Whitney U test.

A B C VC CCC CCC

Ki67 VCIdU VC

VC CCC IdU Isotype control IdU D

VC

CCC IdU Ki67 DNA Merged AB VC

CCC

ITGA2 ITGA2 CD31 CD34

VC CCC

Ki67 ITGA2 CD31 CD34 C

ITGA2 TFAP2C

CCC VC KRT7 GATA3 Isotype control D

CCC

VC ITGA2 Ki67 DNA Merged

CCC

VC

CD31 Ki67 DNA Merged A 250K 250K 5 10

200K 200K 4 10 150K 150K

3

10 SSC-A 100K SSC-A 100K

50K 50K -CD45/CD34 � 0

0 0 an 3 4 5 0 50K 100K 150K 200K 250K 0 50K 100K 150K 200K 250K 0 10 10 10 FSC-A SSC-W Live/dead marker

5 5 10 10

4 4 10 3.10% 10 0.01%

3 3

10 10

-KRT7 � -KRT7 �

an an

0 0

3 4 5 3 4 5 0 10 10 10 0 10 10 10 an�- ITGA2 Isotype control B ITGA2+ cells ITGA2- cells

5 5 5 10 10 10 ITGA2+ cells 4 4 24.4% 4 81.0% 10 10 10 71.2% 6.7%

3 3 3

10 10 10

-EGFR � -EGFR �

ITGA2 ITGA2 �‐

an an an 2 10 0 0 0 2 - -10 ITGA2 cells 3 4 5 3 4 5 0 50K 100K 150K 200K 250K 0 10 10 10 0 10 10 10 FSC-A an�-HLA-G an�-HLA-G C ITGA2+ cells ITGA2- cells

5 5 5 10 10 10

4 4 72.1% 4 36.4% 10 ITGA2+ cells 10 10

3 3 3

10 10 10

-Ki67 � -Ki67 �

-ITGA2 �

an an

an 2 10 0 0 0 2 - -10 ITGA2 cells 0 50K 100K 150K 200K 250K 0 50K 100K 150K 200K 250K 0 50K 100K 150K 200K 250K FSC-A FSC-A FSC-A A D -2 20

ITGA2+ VCT EVT PC2 (12.4%)

PC1 (50.8%) B VCT 5 **** **** ** ITGA2+ cells 4 *** ** ** EVT 3 2 *** 1

Relative levels Relative 0

EGFR HLA-G ITGA2 C ITGA2 CDH1 10 50 ** 8 ** 40 * 6 30 4 20 2 10 0 0

TBP LAIR2 25 VCT 20 * + 15 ITGA2 cells 10 EVT 5 0

RARRES3 ITGB6 5 * 50 4 40 *** 3 30 + 2 20 VCT ITGA2 cells EVT 1 10 0 0

Normalised expression to expression Normalised E Response to wounding (14) Wound healing (8) SOX13 EMP3 Annexin/Calcium-binding proteins (3) 4 2.5 ** * Epidermis/skin development (7) 3 2.0 1.5 Tissue regeneration (4) 2 1.0 Regulation of proliferation (6) 1 0.5 0 -1 -2 -3 -4 -5 -6 -7 0 0.0 10 10 10 10p-Value10 10 10 10 A B C VC CCC VCT + 2.0 ** ITGA2 cells *** *** * EVT EpCAM - 7 weeks 1.5 ** ** *

1.0

0.5 VC 0.0 Relative level to VCT to level Relative

CCC MYC HES1 HES2 HES4 NOTCH1NOTCH2NOTCH3 EpCAM - 8 weeks CDKN1A

D 8 ** EMT genes 6 VCT 4 + 4 * ITGA2 cells EVT 3 ******* * * ** * *** * 2 ** *** ** 1 Relative levels Relative

0

VIM MGPCDH2MYL9 CLDN4TIMP3 THBS1 TAGLNFBLN2 ACTG2 PLAUR COL1A1 COL1A2IGFBP4 COL5A1 SERPINE1 Epithelial Mesenchymal A IdU IdU added removed 1h 3 days

Half of the explants Remaining explants washed and xed washed and xed B

IdU IdU Day 0 Day 3

C D

15 * 8 *

6 10 4 5 2 cells/ villus cells/ column 0 0 No of consecutive IdU+ No of consecutive IdU+ Day 0 Day 3 Day 0 Day 3 E 1.0 *

0.8

0.6

0.4

0.2

0.0 Proportion of column covered by IdU+ cells Day 0 Day 3 Figure S1

Figure S1. The methylation profile of the ELF5 promoter in ITGA2+ cells, VCT and EVT. Open circle = unmethylated CpG; closed circle = methylated CpG.

Figure S2

Figure S2. Validation of the microarray. (A) Hierarchical clustering of the A-E-G populations. (B) Highly upregulated genes in VCT and EVT based on a previous microarray are also highly expressed in these cells on our microarray (Apps et al., 2011). (C) Overview of the relative expression levels between ITGA2+ trophoblast compared to those in VCT and EVT. There are very few genes that are lower in ITGA2+ trophoblast than in VCT and EVT. Figure S3

Figure S3. Expression profile of EMT genes, CD81 and TM4SF1. (A) CD81 and TM4SF1 are upregulated at the base of the CCCs (arrow heads). VC = villous core; CCC = cytotrophoblast cell column. (B) Expression of LAMB3 and COL17A1 on the microarray. Stats for individual genes: * = p < 0.05; ** = p < 0.01; *** = p < 0.005; **** = p < 0.0001; One-way ANOVA followed by Tukey’s multiple comparisons test. (C) COL17A1 is upregulated at the base of the columns (arrowheads) and in some EVT (arrows). Scale bar = 50 µm. VC = villous core; CCC = cytotrophoblast cell column. (D) The expression of epithelial and mesenchymal genes is higher in VCT and EVT respectively, based on the microarray. The following genes were either not detected or not significantly different between all groups and were not included in the graph: SNAI1, SNAI2, TWIST1, TCF, ACTA2, LEF1, ZEBs, CDH3, CRB3, OCLN, TJP1, Claudins, DSP, SPARC, ID1, CFL1 and other MMPs.

Table S1. Genes upregulated in ITGA2+ cells, compared to VCT and EVT, on microarray. FDR <0.05, Fold change ≥1.23 Fold increase in Fold increase in ITGA2+ cells ITGA2+ cells Symbol compared to EVT compared to VCT TMSB4X 3.26 5.06 COL17A1 2.01 4.67 SFN 2.36 4.24 THBS1 5.11 3.59 ANXA8L2 1.72 3.53 C8orf4 1.90 3.46 MARCKS 2.42 3.05 BIRC7 2.48 2.98 LOC652688 2.51 2.77 TAGLN 2.56 2.76 ARMCX3 2.49 2.67 TM4SF1 2.69 2.66 ACTG2 1.94 2.66 CARHSP1 1.55 2.50 LOC647954 1.47 2.44 NPPB 2.53 2.43 SLCO2A1 1.88 2.33 CST6 2.23 2.29 HSPB1 1.71 2.20 DSE 1.43 2.15 HIST2H2AA3 1.41 2.11 MYL9 1.50 1.99 CRYAB 1.76 1.97 GBP2 1.49 1.93 TGIF1 1.77 1.89 COL1A2 1.81 1.87 SERPINE1 3.13 1.82 CD81 1.44 1.80 LOC646993 1.51 1.80 CLDN4 1.77 1.79 KCTD11 1.60 1.78 FBLN2 2.44 1.78 LAMB3 1.67 1.77 CD99 2.06 1.77 CXCL2 1.66 1.76 LOC646723 1.49 1.75 SCD5 1.95 1.75 LPHN3 1.61 1.74 NAV2 1.60 1.73 ITGB6 1.95 1.73 SNORD89 1.67 1.71 NUAK2 1.44 1.70 LOC729952 1.43 1.69 PLAU 1.74 1.69 CCRN4L 1.48 1.66 TSC22D1 1.72 1.66 MGP 1.54 1.65 PMP22 1.29 1.62 PDGFA 1.86 1.61 KRT18P17 1.41 1.60 CDKN1A 1.64 1.59 LOC652846 2.04 1.58 AHNAK 1.33 1.58 MIR612 1.56 1.58 C1S 1.56 1.57 LOC340598 1.35 1.56 ANXA8 2.15 1.56 TIMP3 2.37 1.56 LOC100133950 1.47 1.56 LOC728779 1.44 1.56 DDAH1 1.87 1.55 LOC647295 1.42 1.55 FAM65B 2.17 1.54 TMOD3 1.43 1.54 ITGA2 1.45 1.53 KRT18P30 1.36 1.52 KRT18P28 1.46 1.51 RRAD 1.51 1.50 NUPR1 2.19 1.48 RASGRF1 1.34 1.48 NOTCH1 1.31 1.46 AFAP1L2 1.63 1.46 LOC100133408 1.40 1.45 SAPS1 1.58 1.45 GPRC5B 1.71 1.44 GLIPR2 1.53 1.43 RHOD 1.35 1.41 P8 1.85 1.41 LOC646779 1.32 1.41 IDS 1.91 1.39 RCAN1 1.29 1.39 LOC100130133 1.56 1.38 KRT18P13 1.61 1.38 ANXA8L1 1.51 1.38 TNNT3 1.60 1.37 ZSWIM4 1.83 1.36 PSMB10 1.30 1.35 ESPL1 1.37 1.35 SCCPDH 1.89 1.34 LOC653107 1.59 1.33 LOC100132640 1.41 1.32 DNMT3B 1.63 1.32 HES2 1.34 1.32 LOC728687 1.28 1.32 MALAT1 1.36 1.31 FLJ40504 1.61 1.30 ATF4 1.45 1.30 CRYBA4 1.34 1.29 AK2P2 1.43 1.29 TACSTD1 2.25 1.26 CCNL2 1.45 1.26 DHRS7B 1.32 1.25 Table S2. Genes downregulated in ITGA2+ cells, compared to VCT and EVT, on microarray. FDR <0.05, Fold change ≥1.23 Fold decrease in ITGA2+ Fold decrease in ITGA2+ Symbol cells compared to EVT cells compared to VCT GSTA4 1.53 3.00 IDH1 1.61 2.07 HIBADH 1.28 1.55 ATAD1 1.31 1.48 FAM89A 1.44 1.47 LIPA 1.50 1.44 ADAP2 1.36 1.42 POP5 1.29 1.40 RSU1 1.33 1.40 POLR3B 1.40 1.38 PPHLN1 1.27 1.37 FLAD1 1.37 1.35 GXYLT1 1.31 1.34 PAQR3 1.36 1.34 CYFIP2 1.65 1.32