bioRxiv preprint doi: https://doi.org/10.1101/562447; this version posted February 27, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

1 Modeling the Human Segmentation Clock with Pluripotent Stem Cells 2 3 Mitsuhiro Matsuda1,10, Yoshihiro Yamanaka2,10, Maya Uemura2,3, Mitsujiro Osawa4, 4 Megumu K. Saito4, Ayako Nagahashi4, Megumi Nishio3, Long Guo5, Shiro Ikegawa5, 5 Satoko Sakurai6, Shunsuke Kihara7, Michiko Nakamura6, Tomoko Matsumoto6, Hiroyuki 6 Yoshitomi2,3, Makoto Ikeya6, Takuya Yamamoto6,8, Knut Woltjen6,9, Miki Ebisuya1*, 7 Junya Toguchida2,3, Cantas Alev2* 8 9 1 Laboratory for Reconstitutive Developmental Biology, RIKEN Center for Biosystems 10 Dynamics Research (RIKEN BDR), Kobe 650-0047, Japan. 11 2 Department of Cell Growth and Differentiation, Center for iPS Cell Research and 12 Application (CiRA), Kyoto University, Kyoto 606-8507, Japan. 13 3 Department of Regeneration Science and Engineering, Institute for Frontier Life and 14 Medical Sciences, Kyoto University, Kyoto 606-8507, Japan. 15 4 Department of Clinical Application, Center for iPS Cell Research and Application 16 (CiRA), Kyoto University, Kyoto 606-8507, Japan. 17 5 Laboratory for Bone and Joint Diseases, RIKEN Center for Integrative Medical 18 Sciences (RIKEN IMS), Tokyo 108-8639, Japan. 19 6 Department of Life Science Frontiers, Center for iPS Cell Research and Application 20 (CiRA), Kyoto University, 606-8507, Kyoto 108-8639, Japan. 21 7 Department of Fundamental Cell Technology, Center for iPS Cell Research and 22 Application (CiRA), Kyoto University, Kyoto 606-8507, Japan. 23 8 AMED-CREST, AMED 1-7-1 Otemachi, Chiyodaku, Tokyo 100-004, Japan. 24 9 Hakubi Center for Advanced Research, Kyoto University, Kyoto 606-8501, Japan. 25 10 These authors contributed equally to this work 26 27 * To whom correspondence should be addressed: 28 Cantas Alev, M.D., Ph.D. (lead contact) 29 E-Mail: [email protected] 30 31 Miki Ebisuya, Ph.D. 32 E-Mail: [email protected] 33 34 Keywords: segmentation , somitogenesis, human mesoderm development, induced 35 pluripotent stem cells, step-wise in vitro mesoderm differentiation, disease modeling

1 bioRxiv preprint doi: https://doi.org/10.1101/562447; this version posted February 27, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

36 Pluripotent stem cells (PSCs) have increasingly been used to model different aspects 37 of embryogenesis and organ formation1. Despite recent advances in the in vitro 38 induction of major mesodermal lineages and mesoderm-derived cell types2,3, 39 experimental model systems that can recapitulate more complex biological features 40 of human mesoderm development and patterning are largely missing. Here, we 41 utilized induced pluripotent stem cells (iPSCs) for the stepwise in vitro induction of 42 presomitic mesoderm (PSM) and its derivatives to model distinct aspects of human 43 somitogenesis. We focused initially on modeling the human segmentation clock, a 44 major biological concept believed to underlie the rhythmic and controlled 45 emergence of somites, which give rise to the segmental pattern of the vertebrate 46 axial skeleton. We succeeded to observe oscillatory expression of core segmentation 47 clock , including HES7 and DKK1, and identified novel oscillatory genes in 48 human iPSC-derived PSM. We furthermore determined the period of the human 49 segmentation clock to be around five hours and showed the presence of dynamic 50 traveling wave-like expression within in vitro induced human PSM. Utilizing 51 CRISPR/Cas9-based genome editing technology, we then targeted genes, for which 52 mutations in patients with abnormal axial skeletal development such as 53 spondylocostal dysostosis (SCD) (HES7, LFNG and DLL3) or spondylothoracic 54 dysostosis (STD) (MESP2) have been reported. Subsequent analysis of patient-like 55 iPSC knock-out lines as well as patient-derived iPSCs together with their genetically 56 corrected isogenic controls revealed gene-specific alterations in oscillation, 57 synchronization or differentiation properties, validating the overall utility of our 58 model system, to recapitulate not only key features of human somitogenesis but also 59 to provide novel insights into diseases associated with the formation and patterning 60 of the human axial skeleton. 61 We initially aimed to mimic and recreate in vitro the signaling events responsible 62 for the step-wise emergence of PSM and its derivatives during embryonic development, 63 as also recently attempted by others2,4,5, via selective activation or inhibition of 64 appropriate signaling pathways, using human iPSCs as starting material (Fig. 1a). We 65 characterized the ability of our in vitro induced human PSM cells to differentiate into 66 somitic mesoderm and its two main derivatives, sclerotome and dermomyotome, which 67 give rise to bone and cartilage of the axial skeleton and skeletal muscle and dermis of the 68 emerging embryo respectively. RNA-sequencing (RNA-seq) analysis and subsequent 69 characterization of in vitro derived human PSM samples revealed, that at each step of our 70 induction and differentiation protocol, markers expected to be present based on either 71 embryological studies in animal models or recent reports utilizing stem cells2,4-6, were 72 appropriately expressed at both transcript and levels (Fig. 1b-f, Extended Data 73 Fig. 1a-f, Extended Data Table 1), indicating that our step-wise approach is following the 74 developmental trajectory and recapitulating ontogeny seen during embryonic somitic 75 mesoderm development. 76 We detected in our in vitro derived human PSM samples expression of TBX6 and 77 DLL1, two well-established markers of presomitic mesoderm7, at transcript and protein 78 levels (Fig. 1d-f, Extended Data Fig. 1d-f). We also detected high-level expression of 79 HES7, a known regulator of the segmentation clock in murine PSM8 (Fig. 1b, Extended 80 Data Fig. 1c). Based on this observation we generated a luciferase-based iPSC reporter 81 line for human HES7 promoter activity (HES7-reporter). We observed clear oscillation of

2 bioRxiv preprint doi: https://doi.org/10.1101/562447; this version posted February 27, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

82 the HES7-reporter in our human PSM samples (Fig. 1g) and determined the period of the 83 human segmentation clock to be 5-6 hours, which is similar to the 4-6 hour period 84 reported for somite formation in primary human embryo samples9,10. To our knowledge, 85 this is the first real-time observation of oscillatory expression in the human segmentation 86 clock. 87 We then asked, whether we could also observe traveling wave-like expression, 88 another hallmark of the segmentation clock, which is caused by the synchronization 89 among oscillations in neighboring cells. Such traveling waves have been reported in the 90 context of explant studies utilizing reporter mice and mouse ESC-derived PSM11,12, but 91 have never been observed in human PSM. Using a sphere of human PSM induced in 3D 92 culture, we could see sustained oscillation and the clear presence of traveling waves 93 (Extended Data Movie 1), which is also indicated by the tilted slope in the kymograph 94 (Fig. 1h). 95 In order to ensure that our in vitro-derived PSM is indeed comparable to its in 96 vivo counterpart, we further characterized its differentiation capacity into somitic 97 mesoderm, sclerotome and dermomyotome. To induce somitic mesoderm we mimicked 98 the decrease in Fgf and Wnt activity along the posterior-anterior axis of the PSM reported 99 in the embryonic context13, by simultaneous inhibition of both pathways, leading to the 100 rapid and robust induction of somitic mesoderm expressing TCF15, a well established 101 marker of somite development14, at the transcript (Fig. 1b, Extended Data Fig. 1c) and the 102 protein level (Extended Data Fig. 1g). In addition, MESP2, a marker of segmentation 103 during somitogenesis, showed localized albeit weak expression in induced human somitic 104 mesoderm, similar to the remnant expression of TBX6 at this stage, indicating that, 105 despite a strong signal for differentiation, patterning reminiscent of somite segmentation 106 might take place in vitro, as recently shown for mouse ESC-derived PSM12 (Extended 107 Data Fig. 1g, right side of panel). Furthermore, dermomytome and sclerotome cells 108 derived from in vitro induced human somitic mesoderm expressed appropriate 109 developmental stage-specific markers at the transcript and protein level (Extended Data 110 Fig. 1c, 2a). To further demonstrate that our induced human somitic mesoderm and the 111 sclerotome derived thereof, is, indeed functional and can give rise to cartilage and bone, 112 as seen during embryogenesis15, we performed in vitro and in vivo differentiation and 113 transplantation assays (Extended Data Fig. 2b-e). As also recently reported by others4, we 114 could see self-organization of bone and cartilage, including properly patterned 115 endochondral bone, from transplanted human sclerotome cells in vivo (Fig. 1i, Extended 116 Data Fig. 2b, 2c, 2e). We then examined our in vitro induced dermomyotome cells for 117 their ability to give rise to human skeletal muscle in vitro. We could observe robust 118 induction of skeletal muscle cells, showing expression of proper muscle markers at the 119 protein level (Fig. 1j, Extended Data Fig. 2f). Functional analysis of dermomyotome cells 120 derived from a calcium reporter iPSC line (Gen1C) revealed the reproducible presence of 121 beating skeletal muscle cells after three weeks of in vitro 2D differentiation culture (Fig. 122 1j, Extended Data Movie 2). We thus showed that our in vitro step-wise induction 123 protocol can give rise to human PSM cells and their proper derivatives that appear to 124 recapitulate key features of their corresponding developmental counterparts and 125 respective embryological stages. 126 Since the oscillation and synchronization in human PSM have never been 127 characterized before, we asked whether we could deepen our understanding of the human

3 bioRxiv preprint doi: https://doi.org/10.1101/562447; this version posted February 27, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

128 segmentation clock even further, using the experimental system at hand. To this end we 129 collected PSM samples during oscillation by monitoring the oscillatory activity of the 130 HES7-reporter (Extended Data Fig. 3a) and performed RNA-seq analysis of the isolated 131 RNA samples (16 samples each for two independent sets of experiments). 132 Our NGS-based analysis of the different PSM time-points revealed a core-set of 133 about two hundred oscillating genes (Fig. 2a, Extended Data Table 2). Pathway and gene 134 ontology (GO) analysis of the identified gene clusters revealed that in addition to 135 enrichment of pathway members previously associated with the segmentation clock, such 136 as Notch, Wnt or Fgf signaling16,17 novel pathways were also represented in our data set, 137 including members of the TGFβ pathway or Hippo signaling (Fig. 2b, Extended Data Fig. 138 3b, Extended Data Table 3). Interestingly, the Hippo pathway component YAP was 139 recently reported to be an important regulator of oscillatory activity in mouse PSM18, 140 suggesting that this role might be conserved in human PSM, which showed oscillatory 141 activity of core Hippo pathway components (TEAD4, AMOTL2) (Fig. 2a, 2b, Extended 142 Data Fig. 3b). We also detected several HOX genes showing oscillatory expression in 143 human PSM (HOXA3, HOXA5, HOXD1) (Fig. 2a, 2b, Extended Data Fig. 3b). 144 Oscillatory expression of HOXD1 was previously shown during somitogenesis in 145 mouse19 suggesting that its expression pattern and biological function might be conserved 146 in human PSM. We could also identify oscillation of putative modulators of the 147 cytoskeleton (ARHGAP24, ARHGEF2, RHOU, PLEKHG2) as well as histone modifiers 148 (KDM6B, JADE1) as oscillating genes (Fig. 2a, 2c, Extended Data Fig. 3b). The 149 intriguing possibility that above-mentioned cytoskeleton associated oscillating genes 150 might represent a link between the segmentation clock and the actual process of 151 segmentation, characterized by mesenchymal to epithelial transition and associated with 152 major cytoskeletal rearrangements, remains to be elucidated. Furthermore, identification 153 of oscillating histone modifiers in the PSM, suggests a possible role of epigenetic 154 modifications in the regulation or maintenance of the segmentation clock, and will be the 155 topic of future research efforts. 156 Among the identified oscillatory genes, two thirds were oscillating in phase with 157 HES7 (142) and about one third (77) showing an anti-phase oscillatory expression pattern 158 (Fig. 2a, Extended Data Table 2). The phase cluster of human oscillating genes contained 159 Notch-pathway associated genes such as LFNG20, while the anti-phase cluster contained 160 Wnt-pathway associated negative feedback regulators such as DKK1 and SP521,22 (Fig. 161 2a, 2c), as previously reported for posterior PSM of mouse embryos23. Generating a dual 162 luciferase-activity based reporter line for DKK1 and HES7 promoter activities we 163 confirmed clear phase and anti-phase reporter oscillations in human iPSC-derived PSM 164 samples (Fig. 2d), suggesting that our induced PSM may represent posterior immature 165 PSM rather than anterior mature PSM. 166 In order to also show the utility of our system to model anomalies during human 167 axial skeletogenesis such as SCD or STD, known to be caused by mutations in genes 168 associated with the segmentation clock (e.g. HES7, LFNG, DLL3 and MESP2)24, we 169 utilized CRISPR/Cas9-mediated genome editing technology to generate knock-out 170 reporter iPSC lines aiming to induce frameshifts or deletion mutations in these target 171 genes in the HES7 luciferase reporter line background (Extended Data Fig. 4) and 172 analyzed their putative loss-of-function effect on oscillatory HES7-reporter activity. 173 Knock-out of endogenous HES7 itself led to complete loss of oscillatory activity of the

4 bioRxiv preprint doi: https://doi.org/10.1101/562447; this version posted February 27, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

174 HES7-reporter, in a 2D-oscillation assay (Fig. 3a), similar to previous embryological 175 studies utilizing knock-out mice25. Interestingly knock-out reporter lines for LFNG, DLL3 176 or MESP2 continued to show strong oscillatory HES7 activity (Fig. 3b) even though 177 knock-out mice for LFNG and DLL3 were reported to show defective oscillation 178 patterns20,26. We reasoned that in this 2D-oscillation assay the phase (i.e. timing) of 179 oscillations is initially reset by medium change, showing collective oscillation even in the 180 absence of a strong synchronization mechanism. We then examined the synchronization 181 ability of knock-out reporter lines of aforementioned genes using 3D-spheres of PSM 182 (Fig. 3c). The control (healthy/wild-type) and the knock-out reporter line for MESP2 183 showed sustained oscillations and occasional traveling waves (Extended Data Movie 3), 184 indicating intact synchronization among neighboring cells. In the knock-out lines of 185 LFNG or DLL3, by contrast, oscillation damped quickly and clear traveling waves were 186 not observed (Extended Data Movie 3, Fig. 3c, 3d, Extended Data Fig. 5a). We 187 interpreted the observed quick loss of oscillatory activity as a sign of diminished 188 synchronization. Unlike the 2D-oscillation assay, spheres spread dynamically on the 189 culture dish in the 3D-synchronization assay, and cell movement desynchronizes the 190 oscillation phases. Without proper synchronization mechanism collective oscillation is 191 quickly lost even though oscillations in individual cells continue. Thus, our assay systems 192 using induced human PSM can detect defects in both oscillation and synchronization. 193 Flow cytometric evaluation as well as RNA-seq based transcriptome analysis 194 showed no major differences between control cells and aforementioned reporter lines 195 (Extended Data Fig. 5b, 5c). PSM induction efficiency was high and comparable to wild- 196 type (healthy donor) cells in DLL3, LFNG and MESP2 knock-out reporter lines, while 197 slightly reduced for HES7 knock-out iPSC lines (Extended Data Fig. 5b). Few differences 198 in at the iPSC and PSM stages were observed when comparing knock- 199 out lines with the original healthy donor line, with HES7, MESP2 and LFNG showing 200 higher expression in HES7 knock-out derived PSM, as previously also shown in mice27 201 (Extended Data Fig. 5c). 202 Taken together, these results underline the overall value of an higher order assay 203 system that can assess not only gene or protein expression but also more complex 204 features such as oscillation or synchronization in human PSM, thus opening up the path 205 to work out functionally relevant and possibly disease-associated features specific to each 206 loss- or gain-of-function mutation, otherwise not accessible. 207 To further evaluate the utility of our experimental model system to assess not only 208 key features of the human segmentation clock and somitogenesis but also address 209 causative molecular mechanisms associated with diseases affecting axial skeletogenesis, 210 we established iPSC lines from 12 individuals afflicted by SCD or STD (data not shown). 211 In one of the established STD patient-derived iPSC lines (Extended Data Fig. 6) we 212 identified via Exome sequencing compound heterozygous loss-of-function mutations in 213 MESP2 (rs1452984345: c.256delGCCA, p.fsTer118; rs71647808: c.307T, p.E103X) 214 (Fig. 4a). Induction of PSM from these patient iPSCs (STD-A and STD-F; two different 215 iPSC clones from the same patient) appeared to be not affected, as assessed by flow 216 cytometric analysis of DLL1 expression (Fig. 4b; Extended Data Fig. 7a). We observed 217 for the patient line (STD-A) harboring MESP2 loss-of-function mutations, clear 218 oscillatory activity of HES7 in 2D-oscillation assay (Fig. 4c), similar to what we saw for 219 our human MESP2 knock-out reporter lines (Fig. 3b). In the 3D-synchronization assay

5 bioRxiv preprint doi: https://doi.org/10.1101/562447; this version posted February 27, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

220 this patient line also showed sustained collective oscillation and occasional traveling 221 waves, indicating an intact synchronization mechanism (Fig. 4d), again similar to results 222 seen for MESP2 knock-out lines (Fig. 3c). In order to facilitate the molecular and 223 functional analysis of our patient lines, we generated isogenic controls by correcting the 224 underlying putative disease causing mutations via gene targeting with CRISPR/Cas9. 225 Allele-specific gene correction of MESP2 was achieved using sgRNAs targeting either 226 the c.256delGCCA or c.307T mutation and homologous recombination with donor 227 vectors bearing normal MESP2 gene sequence. Microhomology-assisted excision 228 (MhAX) was used to remove the selection cassette28 (Fig. 4e, 4f, Extended Data Fig. 8), 229 thus effectively rescuing the disease-causing loss of MESP2, albeit heterozygously. 230 Gene-edited iPSCs were confirmed to be karyotypically similar to the parental patient 231 iPSC line (Extended Data Fig. 9). As no clear oscillation or synchronization phenotype 232 could be observed for the analyzed patient line, we asked whether we could see possible 233 differences at the functional or molecular level by comparing patient (STD-A and STD- 234 F) and corresponding rescued iPSC lines (STD-resA and STD-resF). To this end we 235 induced and compared all stages of our in vitro induction and differentiation protocol via 236 RNA-seq analysis using patient and rescue lines (Fig. 4g, Extended Data Fig. 7b). 237 Comparison of patient clones with wild-type (healthy) and heterozygously 238 corrected lines revealed the presence of an up-regulated gene cluster at the somitic 239 mesoderm stage in the analyzed patient lines, which could be reversed upon correction of 240 either mutated allele (Fig. 4g). Genes apparently up-regulated in patient somitic 241 mesoderm and reduced upon rescue of either MESP2 mutation, included FGF4, FGF18 242 and DUSP5 (Fig. 4g, 4h), indicating abnormal Fgf signaling as a possible novel disease 243 associated molecular feature in STD. MESP2 knock-out iPSC-derived somitic mesoderm 244 samples also showed higher levels of expression of FGF4, FGF18 and DUSP5 (Fig. 4h). 245 Interestingly human EPHA3, which was previously reported to have a dominant negative 246 effect on somite formation and axial organization in fish29, was also found to be up- 247 regulated in STD-patient and MESP2-knock-out derived somitic mesoderm (Fig. 4g, 4h). 248 Knock-out and patient-lines showed further higher levels of expression for MESP1 and 249 MESP2 as compared to the levels found in healthy or genetically corrected control 250 samples, indicating possible disrupted negative feed-back regulation by human MESP2. 251 UNCX, a gene associated with rostro-caudal patterning of forming somites and found to 252 be up-regulated during abnormal somitogenesis in MESP2 knock-out mice30 was also 253 found to be highly expressed in MESP2 patient iPSC-derived somitic mesoderm samples 254 as compared to genetically rescued controls. Interestingly, several other genes associated 255 with patterning during somitogenesis and for which genetic mutations in SCD patients 256 were recently reported, including LFNG, RIPPLY2 and DMRT224 were also up-regulated 257 in somitic mesoderm samples derived from STD patient lines (STD-A and STD-F) 258 harboring MESP2 loss-of-function mutations (Fig. 4g, 4h, Extended Data Fig. 7c), 259 indicating reciprocal regulatory mechanisms, possibly connecting these disease 260 associated genes at the molecular and functional level during the pathogenesis of axial 261 skeletogenic abnormalities including SCD and STD. 262 In summary, we have shown for the first time phase- and anti-phase oscillation 263 and traveling wave-like gene expression of key segmentation clock genes in human PSM, 264 and identified a putative molecular network of known and novel members comprising the 265 mammalian segmentation clock. We furthermore assessed the function of several disease-

6 bioRxiv preprint doi: https://doi.org/10.1101/562447; this version posted February 27, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

266 associated members of the human segmentation clock, applying our experimental model 267 system in combination with patient-like or patient-derived iPSCs, thus effectively 268 creating the first human PSC-based model for congenital scoliosis, which may contribute 269 to decipher the molecular mechanisms underlying normal and pathological human axial 270 skeletogenesis. 271 Furthermore, being able to induce iPSCs from non-human model systems 272 including evolutionary distant species, may, in combination with our proposed 273 experimental model system, also pave the way for interspecies comparisons of 274 segmentation clocks, and address as yet elusive molecular and evolutionary biological 275 questions, such as the presence or absence of an evolutionary conserved core network of 276 segmentation clock genes. Having access to a robust experimental model system that can 277 be easily manipulated without the need for transgenic animals or primary tissues, while 278 allowing assessment of not only genetic but also environmental or epigenetic factors, will 279 undoubtedly facilitate our molecular and functional understanding of proper as well as 280 abnormal human somitogenesis and axial skeletal development. 281 282 REFERENCES

283 1 McCauley, H. A. & Wells, J. M. Pluripotent stem cell-derived organoids: using principles 284 of developmental biology to grow human tissues in a dish. Development 144, 958-962, 285 doi:10.1242/dev.140731 (2017). 286 2 Chal, J. et al. Differentiation of pluripotent stem cells to muscle fiber to model Duchenne 287 muscular dystrophy. Nat Biotechnol 33, 962-969, doi:10.1038/nbt.3297 (2015). 288 3 Lee, J. H., Protze, S. I., Laksman, Z., Backx, P. H. & Keller, G. M. Human Pluripotent 289 Stem Cell-Derived Atrial and Ventricular Cardiomyocytes Develop from Distinct 290 Mesoderm Populations. Cell Stem Cell 21, 179-194 e174, 291 doi:10.1016/j.stem.2017.07.003 (2017). 292 4 Loh, K. M. et al. Mapping the Pairwise Choices Leading from Pluripotency to Human 293 Bone, Heart, and Other Mesoderm Cell Types. Cell 166, 451-467, 294 doi:10.1016/j.cell.2016.06.011 (2016). 295 5 Xi, H. et al. In Vivo Human Somitogenesis Guides Somite Development from hPSCs. 296 Cell Rep 18, 1573-1585, doi:10.1016/j.celrep.2017.01.040 (2017). 297 6 Pourquie, O. Vertebrate somitogenesis. Annu Rev Cell Dev Biol 17, 311-350, 298 doi:10.1146/annurev.cellbio.17.1.311 (2001). 299 7 Hofmann, M. et al. WNT signaling, in synergy with T/TBX6, controls Notch signaling 300 by regulating Dll1 expression in the presomitic mesoderm of mouse embryos. Genes Dev 301 18, 2712-2717, doi:10.1101/gad.1248604 (2004). 302 8 Bessho, Y. et al. Dynamic expression and essential functions of Hes7 in somite 303 segmentation. Genes Dev 15, 2642-2647, doi:10.1101/gad.930601 (2001). 304 9 Turnpenny, P. D. et al. Abnormal vertebral segmentation and the notch signaling 305 pathway in man. Dev Dyn 236, 1456-1474, doi:10.1002/dvdy.21182 (2007). 306 10 Hubaud, A. & Pourquie, O. Signalling dynamics in vertebrate segmentation. Nat Rev Mol 307 Cell Biol 15, 709-721, doi:10.1038/nrm3891 (2014). 308 11 Lauschke, V. M., Tsiairis, C. D., Francois, P. & Aulehla, A. Scaling of embryonic 309 patterning based on phase-gradient encoding. Nature 493, 101-105, 310 doi:10.1038/nature11804 (2013). 311 12 Matsumiya, M., Tomita, T., Yoshioka-Kobayashi, K., Isomura, A. & Kageyama, R. ES 312 cell-derived presomitic mesoderm-like tissues for analysis of synchronized oscillations in 313 the segmentation clock. Development 145, doi:10.1242/dev.156836 (2018).

7 bioRxiv preprint doi: https://doi.org/10.1101/562447; this version posted February 27, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

314 13 Wahl, M. B., Deng, C., Lewandoski, M. & Pourquie, O. FGF signaling acts upstream of 315 the NOTCH and WNT signaling pathways to control segmentation clock oscillations in 316 mouse somitogenesis. Development 134, 4033-4041, doi:10.1242/dev.009167 (2007). 317 14 Burgess, R., Rawls, A., Brown, D., Bradley, A. & Olson, E. N. Requirement of the 318 paraxis gene for somite formation and musculoskeletal patterning. Nature 384, 570-573, 319 doi:10.1038/384570a0 (1996). 320 15 Fleming, A., Kishida, M. G., Kimmel, C. B. & Keynes, R. J. Building the backbone: the 321 development and evolution of vertebral patterning. Development 142, 1733-1744, 322 doi:10.1242/dev.118950 (2015). 323 16 Krol, A. J. et al. Evolutionary plasticity of segmentation clock networks. Development 324 138, 2783-2792, doi:10.1242/dev.063834 (2011). 325 17 Dequeant, M. L. et al. A complex oscillating network of signaling genes underlies the 326 mouse segmentation clock. Science 314, 1595-1598, doi:10.1126/science.1133141 327 (2006). 328 18 Hubaud, A., Regev, I., Mahadevan, L. & Pourquie, O. Excitable Dynamics and Yap- 329 Dependent Mechanical Cues Drive the Segmentation Clock. Cell 171, 668-682 e611, 330 doi:10.1016/j.cell.2017.08.043 (2017). 331 19 Zakany, J., Kmita, M., Alarcon, P., de la Pompa, J. L. & Duboule, D. Localized and 332 transient transcription of Hox genes suggests a link between patterning and the 333 segmentation clock. Cell 106, 207-217 (2001). 334 20 Okubo, Y. et al. Lfng regulates the synchronized oscillation of the mouse segmentation 335 clock via trans-repression of Notch signalling. Nat Commun 3, 1141, 336 doi:10.1038/ncomms2133 (2012). 337 21 Huggins, I. J. et al. The WNT target SP5 negatively regulates WNT transcriptional 338 programs in human pluripotent stem cells. Nat Commun 8, 1034, doi:10.1038/s41467- 339 017-01203-1 (2017). 340 22 Pedersen, L., Jensen, M. H. & Krishna, S. Dickkopf1--a new player in modelling the Wnt 341 pathway. PLoS One 6, e25550, doi:10.1371/journal.pone.0025550 (2011). 342 23 Sonnen, K. F. et al. Modulation of Phase Shift between Wnt and Notch Signaling 343 Oscillations Controls Mesoderm Segmentation. Cell 172, 1079-1090 e1012, 344 doi:10.1016/j.cell.2018.01.026 (2018). 345 24 Bouman, A. et al. Homozygous DMRT2 variant associates with severe rib malformations 346 in a newborn. Am J Med Genet A 176, 1216-1221, doi:10.1002/ajmg.a.38668 (2018). 347 25 Kageyama, R., Niwa, Y. & Shimojo, H. Rhythmic gene expression in somite formation 348 and neural development. Mol Cells 27, 497-502, doi:10.1007/s10059-009-0068-1 (2009). 349 26 Takahashi, Y., Inoue, T., Gossler, A. & Saga, Y. Feedback loops comprising Dll1, Dll3 350 and Mesp2, and differential involvement of Psen1 are essential for rostrocaudal 351 patterning of somites. Development 130, 4259-4268 (2003). 352 27 Chen, J., Kang, L. & Zhang, N. Negative feedback loop formed by Lunatic fringe and 353 Hes7 controls their oscillatory expression during somitogenesis. Genesis 43, 196-204, 354 doi:10.1002/gene.20171 (2005). 355 28 Kim, S. I. et al. Microhomology-assisted scarless genome editing in human iPSCs. Nat 356 Commun 9, 939, doi:10.1038/s41467-018-03044-y (2018). 357 29 Lackmann, M. et al. Distinct subdomains of the EphA3 mediate ligand binding 358 and receptor dimerization. J Biol Chem 273, 20228-20237 (1998). 359 30 Takahashi, Y. et al. Transcription factors Mesp2 and Paraxis have critical roles in axial 360 musculoskeletal formation. Dev Dyn 236, 1484-1494, doi:10.1002/dvdy.21178 (2007). 361

8 bioRxiv preprint doi: https://doi.org/10.1101/562447; this version posted February 27, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

362 ACKNOWLEDGEMENTS 363 The authors would like to thank Brendan McIntyre and Paul O’Neill for critical reading 364 of the manuscript; Kanae Mitsunaga for help with FACS analysis; Yuhei Ashida for help 365 with development of 3D-synchronization assay; Junya Asahira for help with RNA-seq 366 experiments; Akihiro Yamashita for help with 3D-CI experiments; Mitsuaki Shibata and 367 Taiki Nakajima for help with development of one-step PSM induction protocol; Noriaki 368 Kawakami for help with general information on SCD; Monika Ohno and Sayaka 369 Nishimura for help with iPSC quality control and validation; the CiRA Genome 370 Evaluation Group, in particular Hiromi Dohi, Fumiyo Kitaoka, Masaki Nomura, Tomoko 371 Takahashi, Masafumi Umekage, and Naoko Takasu for performing SNP array analysis. 372 This work was supported by the CiRA Fellowship Program of Challenge to C.A.; Naito 373 Foundation Scientific Research Grant to C.A.; Grant-in-Aid for Challenging Exploratory 374 Research (KAKENHI Number 16K15664) to C.A.; Grant-in-Aid for Scientific Research 375 on Innovative Areas (KAKENHI Number 17H05777) to M.M.; Takeda Science 376 Foundation Grant to M.E.; Japan Agency for Medical Research and Development 377 (AMED) Grants Number 12103610 and 17935423 to M.K.S. for iPSC generation and 378 qualification; the Core Center for iPS Cell Research and the Acceleration Program for 379 Intractable Disease Research Utilizing Disease Specific iPS Cells (AMED) to J.T.; the 380 Cooperative Research Program (Joint Usage/Research Center Program) of Institute for 381 Frontier Life and Medical Sciences, Kyoto University to J.T., L.G and S.I.. 382 383 AUTHOR CONTRIBUTIONS 384 C.A. conceived the study; C.A. and M.E. designed the project and supervised it; M.M. 385 developed and performed 2D-oscillation assay and 3D-synchronization assay; Y.Y., M.U. 386 and C.A. developed induction and differentiation protocols and performed majority of 387 remaining in vitro and in vivo experiments; S.K. supported microscopy and calcium 388 imaging; M.Ni. helped with xeno-transplantation experiments; M.O., M.K.S. and A.N. 389 established patient iPSC lines used in this study and performed quality control of iPSCs; 390 M.O. helped with FACS data analysis; L.G. and S.I. performed exome sequencing and 391 database analysis; T.Y. analyzed RNA-seq and qRT-PCR-data; S.S. helped with RNA- 392 seq analysis; K.W. designed gene knockout and editing strategies; T.M. performed gene 393 editing and genotyping of patient iPSCs; M.Na. performed genotyping of patient and 394 gene-edited iPSCs; Y.Y., M.U. and C.A. generated knock-out lines with the help of 395 M.Na., K.W. and performed molecular/functional assays using knock-out lines, patient 396 iPSCs and corrected controls; M.I. developed one-step PSM induction protocol; M.K.S. 397 and H.Y. shared reagents; J.T. helped with establishment of patient lines and provided 398 administrative support; C.A. analyzed and interpreted the data and wrote the paper with 399 the support of M.E. and K.W.. 400 401 AUTHOR INFORMATION 402 The authors declare no competing interests. Correspondence and request for materials 403 should be addressed to M.E. ([email protected]) or C.A. ([email protected]).

404

9 Figure 1 405 FIGUREFigureStep-wiseS AND 1 in FIGURE vitro induction LEGENDS of human MIXL1 PSM and its derivatives from iPSCs ● ● ● a b ● FGF SCL a ACTIVIN WNT SHHSHH SCL b Group1 (364 genes) a FGF BMPi b Group1 (364 genes) FGF ACTIVIN BMPiWNT FGFi BMPi Sclerotome NANOG, POU5F1

FGFi expression Relative WNT FGF BMPi WNTi Sclerotome NANOG,DNMT3B, POU5F1 iPS PS TGFβi PSM WNTi SM + + PAX1 /PAX9Day 6 ● WNT TGFβi DNMT3B, SOX2 ●● Day 6 ●● ● ●●● ●●● ● PS PSM SM 0.0 0.4 0.8 1.2 iPS Primitive24hrs streak Presomitic24hrs mesoderm24hrs Somite Group2 (299iPS genes) PS PSM SM SCL DM Pluripotent Primitive Presomitic Somitic Group2MIXL1, (299 T genes) Day 0Stem Cell Day 1Streak Mesoderm Day 2Mesoderm Day 3 DM DKK1, MESP1 NANOG+/OCT4+ T+/MIXL+ HES7+/MSGN1+ TCF15+/MEOX1+ DLL1 HES7 MIXL1, T MSGN1 BMPBMP DM DKK1, MESP1 Dermomyotome ● ● WNTWNT Group3 (280 genes) ●● SHHi ●● Dermo- ● Group3TBX6, (280 HES7 genes) 30 SHHi Day 5 ● myotome TBX6,MSGN1, HES7 RSPO3 ●● ● ALX4+/PRRX1+ ● MSGN1, RSPO3 c PSM Group4 (216 genes) 20 ● Group4 (216 genes) 201B7 ● TCF15, MEOX1 ● ● ● TCF15,PAX3, MEOX1 RIPPLY1 ●●

Relative expression Relative expression Relative PAX3, expression Relative RIPPLY1 PSM ● Group (258genes) 10 ● ● Group5 (258 genes)●●●● PS ●●●● ●●●● ●●● ●●●● ●●●● ●●● ●●●● ●●●● PAX9, TBX18●●● ●●●● ●●●● ●●●● 0.0 0.4 0.8 1.2 0.0 0.4 0.8 1.2 PAX9, TBX180.0 0.2 0.4 0.6 0.8 1.0 1.2 iPS PS PSM SM SCL DM iPS PS PSM SM SCL DM TWIST1 iPS PS PSM SM SCL DM SM TWIST1, 0 PS DM SM Group (402 genes) 0 10 20 30 DM GroupALX4,6 (402 PRRX1 genes) PC2 (18.14 %) Ex1 ALX4,ID2, PRRX1 HAND2 ID2, HAND2 PC2 (18.14 %)

−10 Ex2 SCL Ex1 Ex3 Ex1 Ex2 Ex3 Ex1 Ex2 Ex3 Ex1 Ex2 Ex3 Ex1 Ex2 Ex3 Ex1 Ex2 Ex3 Ex1 Ex2 Ex3

−20 Ex2 SCL Ex1 Ex2 Ex3 Ex1 Ex2 Ex3 Ex1 Ex2 Ex3 Ex1 Ex2 Ex3 Ex1 Ex2 Ex3 Ex1 Ex2 Ex3

−20 −10 iPS PS PSM SM SCL DM iPS Ex3 iPS PS PSM SM SCL DM iPS −40 −20 0 20 −40 −20 0 20 PC1 (29.87 %) −2 −1 012 PC1 (29.87 %) Normalized−2 −1 expression012 values (log 2) Normalized expression values (log 2)

1.2 TBX6 dd iPS 0.0% e PSM 79.2% e f f PSM PSM

(Tentative) 1.0 ●●

d e f ●TBX6 TBX6 TBX6 DAPI 0.8

● 0.6

> 90% 0.4

DLL1 0.2

Relative expression Relative DLL1 intensity 0.0

TBX6 mRNA iPS PS PSM SM SCL DM ● ● ●● 2.7% 0.0 0.2 0.4 0.6 0.8 1.0 1.2 10.8% ●●● ● TBX6 TBX6 iPS PS PSM SM SCL DM ●●●● ●●●● ●●●● 0.0 0.2 0.4 0.6 0.8 1.0 1.2 iPS PS PSM SM SCL DM A Space B A Space B TBX6 Spreading sphere 0 hSpreading sphere 0 g 1.2 h g 1.2 Spreading A Space B g 1.2 h 0 1.0 PSM hHES7 Luciferase2 nls d1PEST hHES7 3’UTR PSM sphere PSM 1.0 5.9 kb 1.0 0.80.8 0.8 hHES7 NanoLuc nls d1PEST hHES7 3’UTR 0.60.6 5.9 kb nls d1PEST Time (min)

hDKK1 Luciferase2 hDKK1 3’UTR Time (min) 2.2 kb 0.40.60.4 Ex1 Ex2 Ex3 0.2 Ex1 Ex2 Ex3 B B 5060(min) Time 5060 0.20.4 PeriodPeriod = 311= 311 min min 316 316 min min 313 313 min min A

Hes7 reporter activity A HES7 reporter activity Hes7 reporter activity 0.00.0 Ex1 Ex2 Ex3 HES7 reporter activity

HES7reporter activity 0.2 5060 0 0 Period = 311500500 min 316 min10001000 313 min 15001500 TimeTime (min) (min) 0.0 HES7reporter activity 0 500 1000 1500 Time (min) SCL DM i j jMYH DAPI GCaMP i SCL è Cartilage/Bone j DM è Muscle III

#2 4mm II GCaMP MYH/MYOSIN

I 406 Pentachrome Pentachrome SAA SAA/DAPI 407 Fig.Figure 1 | Molecular1 and functional analysis of human PSC-derived presomitic 408 mesoderm (PSM) and its derivatives. a, Schematic overview of step-wise induction and 409 differentiation of human PSM mimicking embryonic development via activation or 410 inhibition of appropriate signaling pathways. b, Heatmap of gene expression levels of

10 411 step-wise induced PSM and derivatives. FPKM values of each gene were normalized to 412 mean of all samples. Results shown for samples derived from iPSC-line 201B7. See 413 Extended Data Fig. 1 for data obtained for other iPSC-line (1231A3) and Extended Data 414 Table 1 for complete list of identified genes. c, PCA analysis of transcript levels at 415 different stages of applied protocol (for 201B7-derived samples). d, Protein expression of 416 DLL1 and TBX6 in human iPSCs and iPSC-derived PSM (201B7). e, Expression of 417 TBX6 transcript during different stages of human PSM induction and differentiation 418 (qRT-PCR results for four independent experiments using 201B7 are shown). f, 419 Expression of TBX6 protein in human PSM. Left panel shows TBX6 staining of entire 420 well. Enlarged view of highlighted area is shown on right side of panel. Scale bar: 100 421 µm. g, Oscillation of HES7-reporter activity in induced PSM. The signal was normalized 422 to the maximum oscillation peak. The period was calculated as the average elapsed time 423 between the peaks. Data of three independent experiments are shown. Schematic 424 depiction of utilized reporter construct shown on top. h, Synchronization of HES7- 425 reporter activity in spreading PSM sphere. Left: A PSM sphere was induced in 3D 426 culture, before attaching to dish. Right: A kymograph along the yellow line in the left 427 panel. Normalization (1500, 9000). Scale bar: 500 µm. Representative images of three 428 independent experiments are shown. See also Extended Data Movie 1. i, Representative 429 whole mount (upper panel) and histological staining of section (lower panel) of human 430 sclerotome (SCL)-derived in vivo cartilage and bone. Pentachrome staining of marked 431 area reminiscent of in vivo human endochondral bone formation. I: proliferative human 432 cartilage, II: hypertrophic cartilage, III: ossifying cartilage and forming human bone. 433 Scale bar: 100 µm. See also Extended Data Fig. 2e. j, Skeletal muscle derivation from 434 step-wise induced human dermomyotome (DM). Left upper side of panel: GCaMP 435 positive beating colonies of induced skeletal muscle. Left lower and right side of panel: 436 Staining of in vitro induced human skeletal muscle with myosin heavy chain (MYH) and 437 myosin (upper panel) or sarcomeric alpha-actinin (SAA) (lower half of panel). Scale bar: 438 100 µm. See also Extended Data Fig. 2f and Extended Data Movie 2. 439

11 a b Ex1 Ex2 hase (142 genes) JADE1 Notch Signalling (IPA, Canonical Pathways) KDM6b NOTCH1 MAML3 HES7 LFNG DUSP4 PTEN Signaling (IPA, Canonical Pathways) CCND1 ITGA5 FOXO3 FGFRL1 SPRY4 FGF2 (IPA, Up tream Regulators) RHOU CCND1 CCND2 FOXO3 ICAM1 ITGA5 KCNC1 LEF1 NOTCH1 PDGFA PRKCD SPRY1 SPRY4 TEAD4 ARHGEF2 TGFB1 (IPA, Upstream Regulators) PDGFA ABCG1 ARHGEF2 BDKRB2 CCND1 CCND2 DUSP4 FGF9 PLEKHG2 FOSL2 FOXO3 GFPT2 GPRC5B ICAM1 ITGA5 KDM6B LEF1 LPAR1 MYC NFATC1 NOTCH1 PDGFA PPP1R13B PPP2R5A TCF20 RHOU S1PR2 SKIL SMAD3 SPRY1 TACC2 WNT11 ZFPM2 LFNG ID2 (IPA, Upstream Regulators) BCORL1 CCND1 CCND2 DUSP4 ETS2 FOXO3 ICAM1 MYC NFATC1 NOTCH1 CXXC5 Wnt signaling pathway (DAVID, KEGG Pathway) HES7 TBL1XR1 CCND1 CCND2 PPP3R1 LEF1 WNT11 MYC NFATC1 Luciferase Hippo signaling pathway (DAVID, KEGG Pathway) SMAD3 CCND1 CCND2 TEAD4 SMAD3 LEF1 WNT11 MYC Skeletal system development (DAVID, GO TERM) NOTCH1 CMKLR1 ETS2 TEAD4 FGFRL1 HES7 PAX2

Phase (142 genes + Luciferase) Locomotory behavior (DAVID, GO TERM) WNT11 SOBP ABAT GPRC5B RASD2 PCDH1 TEAD4 MAML3 nti-phase (77 genes) DKK1 SHH (IPA, Up tream Regulators) HOXD1 ADAMTS1 CCNG2 EIF2AK3 FGF10 ARHGAP24 HK2 ZIC2 CTNNB1 (IPA, Up tream Regulators) DACT1 BHLHE40 DKK1 FOXB1 HOXA5 MCL1 SMAD7 SOX11 HOXA5 SP5 TGFA TGIF1 TNFRSF19 TSC22D1 ZEB2 PRICKLE2 H (IPA, Up tream Regulators) ADAMTS1 BDNF CCNG2 KLF10 OSR1 ZEB2 KLF15 Skeletal system development (DAVID, GO TERM ) FGF10 SOX11 KLF10 EXT1 EIF2AK3 SP5 Anterior/posterior pattern specification (DAVID, GO TERM) HOXA3, HOXA5, YY1 HK2 Anti-phase (77 genes) AMOTL2 Time Time KLF10 Low High Expression Values

c qRT-PCR d

HES7 LFNG hHES7 NanoLuc nls d1PEST hHES7 3’UTR 5.9 kb hDKK1 Luciferase2 nls d1PEST hDKK1 3’UTR 2 2.2 kb 1.0 01 3 012345 Ex1 Ex2 Ex1 Ex2 0.8 CXXC5 SMAD3 0.6

0.4

0.2 Phase 012 012 Ex1 Ex2 Ex1 Ex2 0 RHOU PLEKHG2 050010001500200025003000 -0.2

-0.4

Reporter activity Time (min) -0.6 HES7 reporter 01 012 Ex1 Ex2 Ex1 Ex2 -0.8 DKK1 reporter

-1.0 DKK1 ARHGAP24 01 01 Ex1 Ex2 Ex1 Ex2 DACT1 PRICKLE2 1 Anti-phase 01 0 440 Ex1 Ex2 Ex1 Ex2 441 Fig. 2 | Identification of phase and anti-phase oscillating genes of the human 442 segmentationFigure 2 clock. a, Heatmap of normalized gene expression levels for oscillating 443 genes in human PSM. RNA-sequencing results for two independent data sets, with 16 444 samples each. Examples of identified phase and anti-phase oscillating genes are 445 highlighted on right side of panel. See also Extended Data Table 2 for complete list of

12 446 identified oscillating genes. b, Pathway analysis result of phase and anti-phase oscillating 447 genes. See also Extended Data Table 3 for complete results of pathway and GO analyses. 448 c, Validation of RNA-sequencing results via qRT-PCR for selected phase and anti-phase 449 oscillating genes. Error bars indicate S.D. of three technical replicas for each time point 450 and sample set. d, Results obtained for dual luciferase reporter assay of HES7-reporter 451 (NanoLuc) and DKK1-reporter (Luciferase2) shown in lower half of panel. The one-step 452 induction protocol was used. The signal was detrended (± 2 hours) and normalized to the 453 maximum oscillation peak. Representative graph of three independent experiments is 454 shown. Schematic overview of used reporter constructs are shown on top. 455

13 a 1.2 b 1.2 1.0 1.0

0.8 0.8

0.6 0.6 DLL3 KO 0.4 WT 0.4 LFNG KO HES7 KO 0.2 0.2 MESP2 KO HES7 reporter activity HES7 reporter activity 0.0 0.0 0 500 1000 1500 0 500 1000 1500 Time (min) Time (min)

WT DLL3 KO LFNG KO MESP2 KO c 0 Time (min) HES7 reporter activity 5060 Space

d 80 WT 60 DLL3 KO LFNG KO 40 MESP2 KO 20 the 1st peak (%)

Amplitude relative to 0 2 3 4 5 6 7 8 456 Peak number 457 Fig. 3 | Molecular and functional evaluation of targeted disruption of selected 458 segmentation clock genes in human PSM. a, 2D-oscillation assay for wild-type (WT) 459 and HES7-knock-out PSMs. The signal was normalized to the maximum oscillation peak. 460 b, 2D-oscillation assay for LFNG-, DLL3- and MESP2-knock-out PSMs. c, 3D- 461 synchronization assay for knock-out PSMs. A kymograph along the yellow line in 462 Extended Data Movie 3 is shown. Normalization (1500, 35000). Representative graphs 463 and images of three independent experiments are shown. d, Damping rate of oscillation 464 amplitude in knock-out PSMs. The signal value of each oscillation peak is shown. Data 465 of three independent experiments are shown. See also Extended Data Fig. 5a.

Figure 3

14 a MESP2 * = c.256delGCCA or c.307T b c patient iPSCs = c.256GCCA or c.307G STD-A 1.2 * e1 e2 1.0 * 93.2% 0.8 exon1 85 . . . 101 . . 118 0.6 S A S G T S C * c.256delGCCA Δ AGCGCGAGCGGG CACGAGTTGC TGA 0.4 c.307T AGCGCCAGCGAG CACTAGTTGC STD-A S A S E H * . . . . 0.2 85 102 HES7 reporter activity STD-A/F 0 c.256delGCCA / c.307T DLL1 0 500 1000 1500 Time (min)

d STDA f MESP2 Targeting CAG::eGFP Donor Vector gene corrected iPSCs 0 - CAG::mCh-IRES-puro 85 . . . 102 . . +/ - ps1 ps1 S A S E H E L µH µH AGCGCCAGCGA G CACGAGTTGC

MESP2 * 201B7 Mutant Allele GFPneg Normal / Normal pos Excision mCh STD-resA1

Time (min) +/ - Normal / c.307T Targeted Allele STD-resF1 Normal / c.307T MMEJ

HES7 reporter activity STD-resF2 neg c.256delGCCA 5060 Normal Allele mCh / Normal Space

g h 35 150 30 3 folds a/b>0.5 25 100 q-value<0.05 q-value<0.05 20 15 10 50 Relative expression Relative expression Relative b a 5

0 0 F F A A SLC25A21 D resF resF resA resA T STD S STD Day3 STD

Expression MIPOL1

(SM) STD STD STD STD (WT) 201B7 FRRS1L (WT) 201B7

(WT) 1231A3 (WT) 1231A3 MESP2-KO#7 MESP2-KO#7

NGFR MESP2-KO#11 MESP2-KO#11

14 WT WNT7A 25 12 Patients

Patients (rescued) SYT4 SAMD3 20 10 8 MYBPC1 15 KIAA1755 6 10 ALDH1A3 4 Relative expression Relative Relative expression Relative RIPPLY2 5 2

FBXO32 0 7 F F A A 3 folds b/a>0.5 resF resF resA FGF4 resA q-value<0.05 q-value<0.05 STD STD STD RN7SK STD

STD STD STD STD (WT) 201B7 (WT) 201B7

EPHA3 (WT) 1231A3 (WT) 1231A3 MESP2-KO# a b MESP2-KO#7 MESP2-KO#11 MESP1 MESP2-KO#11 LINC01230 Day3 UNCX Expression (SM) 8 3 FHAD1 10

GNA14 6 KCNJ8 8 6 ABCC9 4

WT EBF4 4 FGF18

Patients 2 Relative expression Relative Relative expression Relative Patients (rescued) H19 2 GADD45B 0 0 F F A A resF resF resA DUSP5 resA STD STD STD NEFM STD

STD STD STD STD (WT) 201B7 (WT) 201B7

DMRT2 (WT) 1231A3 (WT) 1231A3 MESP2-KO#7 MESP2-KO#7 MESP2-KO#11 OVOS2 MESP2-KO#11 ADAMTS15 SLIT2 20 MESP2 100 201B7 201B7 201B7 AFA1 F2 F3 ex1 ex2 ex3 15 80

60 WT Patients Patients (rescued) 10 40

Relative expression Relative 5 Relative expression Relative 20

0 0 F A 1 F −3 0 3 A #1 resF resA resF Normalized expression values resA STD STD STD STD D (log 2) -KO

STD STD ST STD P2 (WT) 201B7 (WT) 201B7

(WT) 1231A3 (WT) 1231A3 MESP2-KO#7 MESP2-KO#7 MESP2-KO#11 MES 466 Healthy (WT) MESP2-KO STD-Patient STD-Rescue 467 Fig. 4 | In vitro recapitulation and molecular analysis of disease-phenotype using 468 spondylothoracic dysostosis (STD) patient iPSCs and genetically corrected controls. 469 a, Compound heterozygous genotype of STD-A/F patient iPSCs. b, Evaluation of DLL1 470 positive PSM induction efficiency of STD-A patient iPSCs. Representative results of

15 471 DLL1 protein expression in STD-A derived PSM, n=3. See also Extended Data Fig. 7a 472 for representative results of comparison (n=3) between patient (STD-A/F) and rescue 473 clones (STD-resA/resF). c, 2D-oscillation assay for STD-A PSM. The signal was 474 normalized to the maximum oscillation peak. Representative graph of three independent 475 experiments is shown. d, 3D-synchronization assay for STD-A PSM. Normalization 476 (1500, 4000). Representative image of three independent experiments is shown. e, 477 Schematic of the gene targeting procedure for allele-specific correction of MESP2 478 mutations using MhAX. Details for the targeting and genotyping procedures are provided 479 in Extended Data Fig. 8. f, Genotype of heterozygously corrected iPSC subclones. 201B7 480 is included as a reference. g, Heatmap of gene expression levels for STD-related genes. 481 The genes were upregulated or downregulated in the STD-A and STD-F patient lines and 482 increases or decreases were inhibited in rescued cell lines. See also Extended Data Fig. 483 7b. h, qRT-PCR-based validation of RNA-sequencing results. Data of three independent 484 experiments are shown. See also Extended Data Fig. 7c for qRT-PCR-based validation of 485 additional candidates.

16 486 METHODS 487 488 Pluripotent stem cell generation and culture 489 Experiments were performed using mainly two human induced pluripotent stem (iPS) 490 cell lines derived from healthy donors, 1231A331 and 201B732. Pluripotent stem cells of 491 patients suffering from spondylocostal dysostosis (SCD) or spondylothoracic dysostosis 492 (STD) were induced using patient-derived primary cell samples. The following cell line 493 of a STD patient was obtained from the NIGMS Human Genetic Cell Repository at the 494 Coriell Institute for Medical Research: GM13539. Reprogramming was performed with 495 episomes (pCE-hOCT3/4, pCE-hSK, pCE-hUL, pCE-mp53DD, pCXB-EBNA1) under 496 feeder-free conditions using StemFit® medium and laminin-coated dishes (iMatrix511)31. 497 Human iPSCs were maintained without feeder cells and cultured on iMatrix-511 silk 498 (Nippi) coated dishes or plates with StemFit® AK02N (Ajinomoto) medium 499 supplemented with 50 U penicillin and 50 mg/ml streptomycin (Gibco). 500 501 Step-wise induction of human somitic mesoderm 502 Human iPSCs were seeded on iMatrix-511 silk-coated plates or dishes at appropriate 503 densities as single cells (e.g. 1.3 × 104 cells/well into 6 well plate; 8.0 × 104 cells/dish 504 into 10 cm dish) 5 days before induction. All differentiation and induction steps were 505 performed in chemically defined medium with insulin (CDMi)33 if not otherwise 506 mentioned. Our step-wise protocol is similar to a recently published mesoderm induction 507 protocol4, albeit with some differences. Human primitive streak (PS) cells were induced 508 by treatment of iPSCs with bFGF (20 ng/ml), CHIR99021 (10 µM) and Activin A (50 509 ng/ml) for 24 hours. Presomitic mesoderm (PSM) cells were induced from PS cells by 510 exposure to SB431542 (10 µM), CHIR99021 (3 µM), LDN193189 (250 nM) and bFGF 511 (20 ng/ml) for 24 hours. Subsequently, somitic mesoderm (SM) cells were induced from 512 PSM cells using PD173074 (100 nM) and XAV939 (1 µM) for 24 hours. For details of 513 used recombinant human and small molecule agonists or inhibitors see Extended 514 Data Table 4. 515 516 Human sclerotome and dermomyotome induction 517 Following initial step-wise somitic mesoderm (SM) induction, human sclerotome (SCL) 518 cells were induced with combination of SAG (100 nM) and LDN193189 (600 nM)34 for 519 72 hours. Dermomyotome (DM) cells were induced from human somitic mesoderm as 520 previously described4, using a combination of CHIR99021 (3 µM), GDC0449 (150 nM) 521 and BMP4 (50 ng/ml) for 48 hours. 522 523 In vitro 3D-chondrogenic induction (3D-CI) 524 Step-wise induced human sclerotome (SCL) cells were dissociated using Accutase (Life 525 Tech), centrifuged and resuspended in CDMi before being seeded (2.0 × 105 cells/well) 526 into 96 well low attachment plates containing sclerotome induction medium with ROCK- 527 inhibitor Y27632 (Wako), forming 3D aggregates overnight. Initial 3D-SCL spheres were 528 transferred into low attachment plates or dishes containing 3D-CI medium35 and cultured 529 under standard conditions. Medium was changed every three days. 530 531

17 532 In vitro skeletal muscle induction 533 Dermomyotome (DM) cells were dissociated using Accutase (Life Tech), centrifuged, 534 resuspended in CDMi and seeded (2.5 × 105 cells/well) into Matrigel coated 12 well 535 plates in muscle induction medium containing ROCK-inhibitor Y27632 (Wako). In order 536 to induce human skeletal muscle cells, we applied the N2 medium established 537 previously36 with some modifications (DMEM/F12 (Gibco), 1% ITS (Corning), 1% N2 538 Supplement (Gibco), 0.2% Penicillin/Streptomycin (Gibco), 1% L-Glutamine (Gibco), 539 2% Horse serum (Sigma-Aldrich)). Medium was changed every three days. Calcium 540 imaging of DM-derived skeletal muscle activity in GCaMP reporter line (Gen1C)37 was 541 performed using Nikon A1R MP (Multiphoton+N-STORM). 542 543 In vivo xeno-transplantation of PSM derivatives 544 NOD/ShiJic-scidJcl mice were purchased from CLEA Japan. Human sclerotome cells 545 derived from healthy donor (WT) or homozygous/heterozygous luciferase reporter lines 546 (625-A4 and 625-D4) were dissociated using Accutase (Life Tech) and resuspended in 547 100 µl of CDMi before being mixed with the same volume of Matrigel as previously 548 described4. Numbers of transplanted cells ranged from ~ 5.0 × 105 – 1.2 × 106 549 cells/injection. Cells were injected into mice subcutaneously with 26 G needle and 1 ml 550 syringe (Terumo). Forming cartilage and bone tissues were taken out at 2 months post 551 injection. Bioluminescence images were taken with IVIS Spectrum (PerkinElmer). 552 Whole mount photos were taken with LEICA M205FA (Leica). Animal experiments 553 were approved by the institutional animal committee of Kyoto University and performed 554 in strict accordance with the Regulation on Animal Experimentation at Kyoto University. 555 556 Quantitative real-time PCR (qRT-PCR) 557 RNA was extracted with the RNeasy mini kit (Qiagen) following the manufacturer’s 558 instructions. cDNA was synthesized using Superscript III Reverse Transcriptase 559 (Invitrogen) from 1 µg total RNA. cDNA was diluted 1:10 in RNase-free water. qRT- 560 PCR was performed using Thunderbird SYBR qPCR Mix (Toyobo) and QuantStudio™ 561 12K Flex Real-Time PCR System (Thermo Fisher). The expression values of target genes 562 were normalized by b-actin expression from the same cDNA templates. Details of 563 utilized qRT-PCR primers are listed in Extended Data Table 5. 564 565 Immunocytochemistry 566 Cells were fixed with 2% paraformaldehyde (PFA) for 30 minutes and washed twice with 567 PBS. Samples were permeabilized with 0.2% Triton® X-100 (Sigma-Aldrich) in PBS for 568 10 minutes at room temperature and then washed with PBST (1% TWEEN® 20 (Sigma- 569 Aldrich) in PBS). Subsequently, samples were blocked in 5% skim milk for 1 hour at 570 room temperature and then stained with primary antibodies for overnight at 4°C. Samples 571 were then washed with PBST three times and stained with secondary antibodies for 1 572 hour at room temperature. Antibodies were diluted in 10% blocking solution (5% skim 573 milk) in PBST, washed with PBST twice and stained with DAPI for nuclear 574 counterstaining for 5 minutes at room temperature. All images were taken using Nikon 575 A1R MP (Multiphoton+N-STORM). For details of used primary and secondary 576 antibodies see Extended Data Table 6. 577

18 578 Histological analysis 579 Tissues were fixed with 4% PFA overnight at 4°C. Fixed samples were washed with PBS 580 twice and embedded in paraffin. Sections were sliced at 3 µm for immunostaining and 5 581 µm for other stainings. Sections were stained with Hematoxylin-Eosin (HE), Safranin O, 582 von Kossa, Pentachrome, anti-type I Collagen (COL1) antibody, anti-type II Collagen 583 (COL2) antibody, and anti-Human Nuclear Antigen (HNA) antibody. Sections stained 584 with antibodies were incubated for overnight at 4°C. Secondary antibodies were applied 585 with N-Histofine® Simple Stain™ MAX PO (Nichirei Bioscience Inc.) for 30 minutes at 586 room temperature. Signals were detected by N-Histofine® DAB-3S kit (Nichirei 587 Bioscience Inc.). Details of used antibodies are listed in Extended Data Table 6. 588 589 Flow cytometric analysis 590 Cells were washed with PBS and dissociated using Accutase (Life Technologies) and 591 centrifuged. Cells were resuspended (1.0 × 107 cells/ml) in FACS buffer (0.1% BSA in 592 PBS) and stained with allophycocyanin (APC)-conjugated DLL1 antibody for 30 minutes 593 at 4°C. Then, cells were stained with DAPI to eliminate dead cells after washing with 594 FACS buffer once and finally strained through a filter mesh. As for the co-staining of 595 intracellular molecules TBX6 and with DLL1, cells were fixed with 4% 596 paraformaldehyde (PFA) for 20 minutes at 4°C after initial staining with DLL1 antibody 597 and washed twice with staining medium, which contained PBS with 2% fetal bovine 598 serum (FBS). Samples were permeabilized with BD Perm/Wash buffer (BD Biosciences) 599 for 15 minutes at room temperature and stained with TBX6 primary antibody or 600 phycoerythrin (PE)-conjugated BRACHYURY antibody for 60 minutes at room 601 temperature and washed with BD Perm/Wash buffer twice. The cells stained with TBX6 602 antibody were stained with Alexa Fluor® 488-conjugated secondary antibody for 60 603 minutes at room temperature. The samples were washed with BD Perm/Wash buffer 604 twice and suspended into staining medium. Flow cytometric analysis was performed 605 using LSR or BD FACSAria II cell sorter (BD Biosciences). FACS data was analyzed 606 and graphs were generated using FlowJo software (FlowJo LLC). For details of used 607 antibodies see Extended Data Table 6. 608 609 Reporter constructs 610 For the HES7-reporter, human HES7 promoter (5937 bp) and 3’UTR were fused to 611 Luciferase2-NLS-d1PEST38. For the dual reporter assay, the HES7 promoter and 3’UTR 612 were fused to NanoLuc-NLS-d1PEST, while human DKK1 promoter (2218 bp) and 613 3’UTR were fused to Luciferase2-NLS-d1PEST. These reporters were integrated into the 614 genome using piggyBac transposition. See Fig. 1g and Fig. 2d for schematic overviews of 615 used reporter constructs. 616 617 2D-oscillation assay 618 Luminescence was measured in the presence of D-luciferin (200 µM) with Kronos Dio 619 Luminometer (Atto). For the dual reporter assay, HES7- and DKK1-reporter constructs 620 were simultaneously introduced into the cells, and each luminescence was filtered and 621 measured in the presence of Furimazine (400 nM) and D-luciferin (1 mM). HES7- 622 reporter cells were seeded on a 35 mm dish coated with iMatrix-511 at 3000 cells/dish. 623 After 4 days culture, medium was changed into CDMi containing SB431542 (10 µM),

19 624 DMH1 (2 µM), CHIR99021 (10 µM) and bFGF (20 ng/ml). After additional 3 days 625 culture, the medium was changed into CDMi without inhibitors for measurement with 626 Kronos Dio Luminometer (Atto). This (modified) one-step protocol39 was used for Fig. 627 1g and 2d. All other oscillation measurements were performed using our standard step- 628 wise PSM induction protocol. 629 630 3D-synchronization assay 631 To make 3D cell spheres, HES7-reporter iPSCs were seeded into non-adhesive round 632 bottom 96 well plates at 1000-3000 cells/well and cultured in CDMi containing BMP4 633 (50 ng/ml), CHIR99021 (10 µM) and Y27632 (10 µM). After one day of culture, Y27632 634 was removed. After 18 hours culture the medium was changed into CDMi containing 635 DMH1 (2 µM) and CHIR99021 (10 µM). After 6 hours culture, the cell sphere was 636 transferred to a fibronectin-coated glass bottom dish with CDMi containing DMH1 (2 637 µM) and D-luciferin (1 mM), and luminescence was imaged with a customized incubator 638 microscope LCV110 (Olympus). Obtained Image was analyzed with Metamorph 639 (Molecular Devices) and Excel (Microsoft), and kymograph was made by averaging 640 signals over 10 pixels with Metamorph. 641 642 Sampling for RNA-seq analysis of oscillating genes 643 Our standard step-wise PSM induction protocol was used with the following 644 modifications. HES7-reporter cells were seeded on a 35 mm dish coated with Matrigel. 645 At 12 hours during the 2nd-step (PSM induction), the cells were split into multiple 35 646 mm dishes at 4.0 × 105 cells/dish and cultured in CDMi containing SB431542 (10 µM), 647 LDN193189 (250 nM) and CHIR99021 (3 µM). After 12 hours culture the medium was 648 changed into CDMi containing SB431542 (10 µM), LDN193189 (250 nM), CHIR99021 649 (3 µM) and bFGF (20 ng/ml). The luminescence was continuously monitored with 650 Kronos Dio Luminometer with one sample, and the other samples were frozen at each 651 time point. 652 653 Library preparation for RNA-sequencing 654 Total RNA was extracted using RNeasy mini kit (Qiagen) following the manufacturer's 655 instructions. RNA-seq libraries were generated from 200-300 ng total RNA with the 656 TruSeq Stranded mRNA LT Sample Prep Kit (Illumina, San Diego, CA, USA) according 657 to the manufacturer's protocol, with the exception of the libraries used for RNAseq 658 analysis of oscillating genes, which were generated from 120 ng total RNA using 659 NeoPrep system (Illumina) following the manufacturer's instructions. The obtained RNA- 660 seq libraries were sequenced on NextSeq 500 (75 bp – 86 bp single-end reads, Illumina). 661 662 RNA-sequencing data analyses 663 The sequenced reads were mapped to the hg38 human reference genome plus the 664 luciferase reporter sequence using HISAT2 (version 2.1.0)40 with the GENCODE v25 665 annotation gtf file after trimming adaptor sequences and low-quality bases by cutadapt- 666 1.1441. The mapped reads with high mapping quality (MAPQ >= 20) were used for 667 further analyses. For identification of the differentiation stage-related genes, the 668 differentially expressed genes (>= 5-folds changes and q-values =< 0.05 between any pair 669 of samples) were extracted using Cuffdiff42 within Cufflinks version 2.2.1 package and

20 670 GENCODE v25 annotation file, and the extracted genes were grouped into six stages 671 based on the maximum expression levels (FPKM values determined by Cuffdiff) among 672 the differentiation stages. The low expressed genes (=< 10 FPKM) across all stages were 673 filtered out before grouping. For identification of the oscillation genes, the uniquely 674 mapped reads were counted and normalized to calculate the gene expression levels using 675 HTSeq (version 0.6.1)43 with GENCODE v25 annotation gtf file (protein-coding genes) 676 and edgeR (version 3.18.1)44 after filtering low expressed genes (cpm =< 1) across all 677 conditions in each experiment, while rhythmic genes were identified by ARSER (version 678 2.0)45 with FDR_BH =< 0.03 in both two independent experiments. The filtering genes 679 for noise judged by ARSER in both experiments were excluded from oscillation genes. 680 For pathway and analyses, DAVID web tools46 and IPA (Qiagen Inc., 681 https://www.qiagenbioinformatics.com/products/ingenuitypathway-analysis) were used. 682 For identification of the patient (STD-A/F) related genes, fold changes with q-values 683 were calculated with HTSeq (version 0.6.1)43 with GENCODE v25 annotation gtf file 684 and edgeR (version 3.18.1)44 . The genes whose expression values were upregulated or 685 downregulated (>= 3 folds changes, q<0.05, STD lines vs. WT), and increases or 686 decreases were inhibited (>=50%, q<0.05) in the rescued lines, were defined as STD- 687 related upregulated or downregulated genes, respectively. The genes whose expression 688 levels were low (average cpm =< 5) both in wild type and STD-lines were filtered out. 689 For comparisons of expression profiles between knock-out (KO) cell lines and their 690 parental cell lines (wild type), FPKM values, fold changes and q-values were calculated 691 using Cuffdiff42 within Cufflinks version 2.2.1 package and GENCODE v25 annotation 692 file (protein-coding genes). The principal component analysis (PCA) and representation 693 of heatmaps and scatter plots were performed using R software. 694 695 CRISPR/Cas9 gene knock-out 696 Gene knock-out was performed using transient transfection of pSpCas9(BB)-2A-Puro 697 (PX459) V2.0 (a gift from Feng Zhang, Addgene plasmid #62988). Oligonucleotides 698 encoding sgRNA protospacer sequences (Extended Data Table 5) were annealed and 699 cloned as described previously47. sgRNAs were verified by sequencing. Plasmid DNA (1 700 ug) was transfected into iPS cells by electroporation followed by selection with 0.5 701 µg/mL puromycin for 48 hours. Surviving cells were allowed to recover and then 702 replated at low density before picking isolated colonies. For overview of knock-out 703 reporter line establishment and details of used sgRNAs see Extended Data Fig. 4 and 704 Extended Data Table 5.4. 705 706 Whole exome sequencing and variant calling 707 Whole exome sequencing was performed as previously described48,49. Briefly, DNA (3 708 µg) was sheared with S2 Focused-ultrasonicator (Covaris, Wobum, MA, USA) and 709 processed by SureSelectXT Human All Exon V5 (Agilent Technologies, Santa Clara, 710 CA, USA). Captured DNA was sequenced using HiSeq 2000 (Illumina, San Diego, CA, 711 USA) with 101 bp pair-end reads with seven indices. Image analysis and base calling 712 were performed using HCS, RTA and CASAVA software (Illumina). Reads were 713 mapped to the reference (hg19) by Novoalign-3.02.04. Aligned reads 714 were processed by Picard to remove PCR duplicates. Variants were called by GATK 715 v2.7-4 following GATK Best Practice Workflow v350 and annotated by ANNOVAR51.

21 716 All the variants of the candidate genes, which have been reported to cause SCD or CS, 717 were evaluated using five databases: gnomAD, Human Gene Mutation Database 718 (HGMD), SIFT, PolyPhen-2, MutationTaster. 719 720 Quality control of established iPSCs 721 Morphological images of iPSC colonies were captured using Olympus CKX41 722 microscope with a PlanApo 10×/0.75 objective lens (Olympus) and Nikon digital camera 723 DS-Fil. Chromosomal G-banding analyses were performed by Chromocenter Inc, Tottori, 724 Japan. Genomic DNA and total RNA were extracted with AllPrep DNA/RNA mini kit 725 (Qiagen) following the manufacturer’s instructions. Genomic DNA was diluted into 25 726 ng/ml in distilled water. cDNA was synthesized using PrimeScript™ RT Master Mix 727 (Takara) from 500 ng total RNA and diluted 1:10 in RNase-free water for OCT3/4 and 728 NANOG mRNA expression analysis, and 1 µg total RNA for TaqMan™ hPSC 729 Scorecard™ analysis. OCT3/4 and NANOG mRNA expression were confirmed by 730 quantitative real-time PCR (qRT-PCR) with TaqMan™ assay using StepOnePlus™ Real- 731 Time PCR Systems (Thermo Fisher). Primers and probe sequences are provided in 732 Extended Data Table 5.2. The expression values of target genes were normalized by 733 GAPDH expression from the same cDNA templates and calculated relative to 201B7 iPS 734 cell line. Residual plasmids used for iPSC establishment were analyzed by TaqMan™ 735 quantitative PCR using StepOnePlus™ Real-Time PCR Systems (Thermo Fisher). Primer 736 and probe sequences (cmCAG: common-CAG) are designed on CAG-promoter region 737 included in all of the episomal vectors for iPSC generation and listed in Extended Data 738 Table 5.2. The residual plasmid numbers were determined by a standard curve method 739 with pCE-OCT3/4 episomal plasmid of known quantity using 50 ng genomic DNA of 740 STD-iPSCs at passage 6. 741 742 Initial validation of established iPSCs 743 Established (patient) iPSCs together with control human PSCs were differentiated into 744 ectoderm, mesoderm and endoderm lineages using STEMdiff™ Trilineage 745 Differentiation Kit (STEMCELL Technologies). hPSCs reaching 70–80% confluency 746 were harvested with TrypLE™ Select Enzyme (1X) (Thermo Fisher) and plated as a 747 single cell suspension in mTeSR1 medium (STEMCELL Technologies) containing 10 748 µM Y27632 (Wako) on 6-well plates coated with Matrigel (BD Biosciences). The cells 749 were plated at 4.0 × 105 cells, 2.0 × 105 cells and 4.0 × 105 cells per well for ectoderm, 750 mesoderm and endoderm differentiation culture respectively and differentiated following 751 the manufacturer’s instructions. For FACS-based evaluation of undifferentiated PSCs and 752 each of the three germ layers (1.0 × 106 cells each) were fixed with 4% paraformaldehyde 753 phosphate buffer solution (4% PFA/PBS) for 20 minutes at 4°C and washed twice with 754 staining medium, which contained PBS with 2% fetal bovine serum (FBS). Samples were 755 permeabilized with BD Perm/Wash buffer (BD Biosciences) for 15 minutes at room 756 temperature and stained with fluorescence-conjugated primary antibodies listed in 757 Extended Data Table 6.3. The samples were washed with BD Perm/Wash buffer twice 758 and suspended into staining medium. Flow cytometric analysis was performed using LSR 759 (BD Biosciences). FACS data was analyzed and graphs were generated using FlowJo 760 software (FlowJo LLC). For transcript level assessment of differentiation capacity, qPCR 761 was performed with 384-well TaqMan™ hPSC Scorecard™ panel (Thermo Fisher) by

22 762 QuantStudio™ 12K Flex Real-Time PCR System (Thermo Fisher) using undifferentiated 763 PSC and each of the three germ layers cDNA samples. Pluripotency and differentiation 764 property into ectoderm, mesoderm and endoderm lineages were scored by hPSC 765 Scorecard Analysis software, which is available on the Thermo Fisher website 766 (https://www.thermofisher.com/jp/en/home/life-science/stem-cell-research/taqman-hpsc- 767 scorecard-panel.html). 768 769 Gene correction of patient iPSCs 770 Correction of mutations in patient iPSCs was performed using the MhAX method as 771 previously described28. Briefly, donor plasmids for correction of each mutant allele were 772 created by PCR amplification of the right arm from the cloned genomic DNA of STD 773 patient cells corresponding to the matching mutant allele, and the left arm from normal 774 201B7 iPSC using the common primers listed in Extended Data Table 5. InFusion 775 cloning (Clontech) was used to assemble the arms with a restriction-digested 776 CAG::mCherry-IRES-puro selection cassette (Addgene plasmid #113876) and 777 CAG::GFP plasmid backbone (Addgene plasmid #107281). PCR-amplified regions and 778 InFusion junctions were verified by sequencing. Oligonucleotides encoding sgRNA 779 protospacer sequences (Extended Data Table 5) were annealed and cloned into pX330- 780 U6-Chimeric_BB-CBh-hSpCas9 (a gift from Feng Zhang, Addgene plasmid # 42230) as 781 described previously47. sgRNAs were verified by sequencing. For gene targeting, allele- 782 matched donor plasmids (3 µg) and Cas9 / sgRNA expression plasmids (1 µg) were co- 783 transfected by electroporation into 1.0 x 106 STD-A/F patient iPSCs, which were then 784 divided and plated under feeder-free conditions for 48 hours in AK02N medium 785 (Ajinomoto) containing 10 µM ROCK inhibitor Y27632 (Wako) before initiating 786 antibiotic selection (0.5 µg/mL puromycin, Sigma-Aldrich). Nine days after plating, 787 puromycin-resistant cells were pooled and passaged. GFPneg / mChpos colonies were 788 isolated, cultured, stored, and processed for genomic DNA isolation under feeder-free 789 conditions in 96-well format. iPSC clones positive for PCR genotyping and sequencing 790 were defrosted and expanded for genomic DNA extraction and Southern blot verification. 791 For cassette excision, 3 µg of the pX-EGFP-g1 expression plasmid (Addgene plasmid 792 #107273)28 was transfected into 1.0 x 106 gene-targeted patient iPSCs, which were then 793 divided and plated under feeder-free conditions for 48 hours in AK02N medium 794 containing 10 µM Y27632, followed by growth without selection for a total of 6 days. 795 mChneg cells were isolated by FACS on a BD FACSAria II cell sorter, and plated at low 796 density for clonal isolation after 8 days. Isolated clones were cultured, stored in 96-well 797 format, then genotyped for cassette excision by PCR and sequencing before final 798 verification by Southern blot. 799 800 Genomic DNA extraction 801 Genomic DNA for PCR amplification and sequencing was isolated from 0.5 – 1.0 x 106 802 iPS cells using a DNeasy Blood and Tissue Kit (Qiagen). Genomic DNA for Southern 803 blotting was isolated from a single confluent well of a 6-well dish using lysis buffer (100 804 mM Tris-HCl, pH 8.5, 5 mM EDTA, 0.2% SDS, 200 mM NaCl, and 1 mg/mL Proteinase 805 K) followed by phenol-chloroform extraction and ethanol precipitation from the aqueous 806 phase. Genomic DNA was eluted from columns or resuspended from precipitate in TE 807 pH 8.0.

23 808 809 iPSC genotyping 810 PCR primers flanking annotated coding exons of DLL3 (Accession NG_008256.1), HES7 811 (Accession NG_015816.1), and LFNG (Accession NG_008109.2), MESP2 (Accession 812 NG_008608.1) were designed using NCBI Primer-BLAST with optional settings filtering 813 human repeats and SNPs, with primer pair specificity checking to H. sapiens 814 (taxid:9606). PCR primers for genotyping gene-edited cell lines were designed using 815 similar principles. All genotyping primers are listed in Extended Data Table 5. Genomic 816 PCR was carried out using KAPA HiFi HotStart (KAPA Biosystems) on a Veriti 96-well 817 Thermal Cycler (Applied Biosystems) according to the manufacturer’s instructions. 818 Specific PCR conditions are available upon request. PCR products were treated with 819 ExoSAP-IT Express (Affymetrix) and sequenced with the primers indicated in Extended 820 Data Table 5 using BigDye Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems) 821 on a 3130xl Genetic Analyzer (Applied Biosystems). Sequence analysis was performed 822 using variant calling in Sequencher (Genecodes) or alignment in Snapgene (GSL Biotech 823 LLC.). For MESP2 gene correction, patient and rescued iPS cells were analyzed by 824 Southern blotting as described previously28. Probe regions were PCR amplified with Ex 825 Taq (Takara) directly from genomic DNA or cloned plasmid templates to incorporate 826 DIG-labeled dUTP (Roche) using the primers described in Extended Data Table 5. 827 Genomic DNA (5-10 µg) was digested with EcoRI. Genomic DNA from patient iPSCs 828 and iPSC clones rescued by gene editing were genotyped using an Infinium 829 OmniExpress-24 v1.2 (Illumina) SNP array according to the manufacturer’s 830 recommendations. Data collection was performed on an iScan Bead Array Scanner 831 (Illumina). Data was compared to the reference human genome (hg19) using a 832 combination of PennCNV, cnvPartition, GWAS tools, and MAD. Karyograms were 833 prepared in R (version 3.2.5) using GWASTools (version 1.16.1)52. 834 835 Data accessibility 836 All RNA sequencing data utilized for this study have been deposited in Gene Expression 837 Omnibus (https://www.ncbi.nlm.nih.gov/geo/) under the accession number GSE116935. 838 839 31 Nakagawa, M. et al. A novel efficient feeder-free culture system for the derivation of 840 human induced pluripotent stem cells. Sci Rep 4, 3594, doi:10.1038/srep03594 (2014). 841 32 Takahashi, K. et al. Induction of pluripotent stem cells from adult human fibroblasts by 842 defined factors. Cell 131, 861-872, doi:10.1016/j.cell.2007.11.019 (2007). 843 33 Wataya, T. et al. Minimization of exogenous signals in ES cell culture induces rostral 844 hypothalamic differentiation. Proc Natl Acad Sci U S A 105, 11796-11801, 845 doi:10.1073/pnas.0803078105 (2008). 846 34 Zhao, J. et al. Small molecule-directed specification of sclerotome-like 847 chondroprogenitors and induction of a somitic chondrogenesis program from embryonic 848 stem cells. Development 141, 3848-3858, doi:10.1242/dev.105981 (2014). 849 35 Yamashita, A. et al. Generation of scaffoldless hyaline cartilaginous tissue from human 850 iPSCs. Stem Cell Reports 4, 404-418, doi:10.1016/j.stemcr.2015.01.016 (2015). 851 36 Chal, J. et al. Generation of human muscle fibers and satellite-like cells from human 852 pluripotent stem cells in vitro. Nat Protoc 11, 1833-1850, doi:10.1038/nprot.2016.110 853 (2016).

24 854 37 Mandegar, M. A. et al. CRISPR Interference Efficiently Induces Specific and Reversible 855 Gene Silencing in Human iPSCs. Cell Stem Cell 18, 541-553, 856 doi:10.1016/j.stem.2016.01.022 (2016). 857 38 Li, X. et al. Generation of destabilized green fluorescent protein as a transcription 858 reporter. J Biol Chem 273, 34970-34975 (1998). 859 39 Nakajima, T. et al. Modeling human somite development and fibrodysplasia ossificans 860 progressiva with induced pluripotent stem cells. Development 145, 861 doi:10.1242/dev.165431 (2018). 862 40 Kim, D., Langmead, B. & Salzberg, S. L. HISAT: a fast spliced aligner with low memory 863 requirements. Nat Methods 12, 357-360, doi:10.1038/nmeth.3317 (2015). 864 41 Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. 865 EMBnet.journal 17, 10-12, doi:10.14806/ej.17.1.200pp. 10-12 (2011). 866 42 Trapnell, C. et al. Differential analysis of gene regulation at transcript resolution with 867 RNA-seq. Nat Biotechnol 31, 46-53, doi:10.1038/nbt.2450 (2013). 868 43 Anders, S., Pyl, P. T. & Huber, W. HTSeq--a Python framework to work with high- 869 throughput sequencing data. Bioinformatics 31, 166-169, 870 doi:10.1093/bioinformatics/btu638 (2015). 871 44 McCarthy, D. J., Chen, Y. & Smyth, G. K. Differential expression analysis of multifactor 872 RNA-Seq experiments with respect to biological variation. Nucleic Acids Res. 40, 4288- 873 4297, doi:10.1093/nar/gks042 (2012). 874 45 Yang, R. & Su, Z. Analyzing circadian expression data by harmonic regression based on 875 autoregressive spectral estimation. Bioinformatics 26, i168-174, 876 doi:10.1093/bioinformatics/btq189 (2010). 877 46 Huang da, W., Sherman, B. T. & Lempicki, R. A. Systematic and integrative analysis of 878 large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4, 44-57, 879 doi:10.1038/nprot.2008.211 (2009). 880 47 Ran, F. A. et al. Genome engineering using the CRISPR-Cas9 system. Nat Protoc 8, 881 2281-2308, doi:10.1038/nprot.2013.143 (2013). 882 48 Guo, L. et al. Novel and recurrent XYLT1 mutations in two Turkish families with 883 Desbuquois dysplasia, type 2. J Hum Genet 62, 447-451, doi:10.1038/jhg.2016.143 884 (2017). 885 49 Guo, L. et al. Identification of a novel LRRK1 mutation in a family with osteosclerotic 886 metaphyseal dysplasia. J Hum Genet 62, 437-441, doi:10.1038/jhg.2016.136 (2017). 887 50 McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for 888 analyzing next-generation DNA sequencing data. Genome Res 20, 1297-1303, 889 doi:10.1101/gr.107524.110 (2010). 890 51 Wang, K., Li, M. & Hakonarson, H. ANNOVAR: functional annotation of genetic 891 variants from high-throughput sequencing data. Nucleic Acids Res 38, e164, 892 doi:10.1093/nar/gkq603 (2010). 893 52 Gogarten, S. M. et al. GWASTools: an R/Bioconductor package for quality control and 894 analysis of genome-wide association studies. Bioinformatics 28, 3329-3331, 895 doi:10.1093/bioinformatics/bts610 (2012). 896 897 898 899

25 BRA BRA MIXL1 MIXL1 ● ● a c ● ● ● ● c ● a! ! ● ● ● ● ● ● PSM ● ● ●

SM ● ●● ● ●

R ●● R ● R R ● PS ●● ●●●● ●●●● ●●●● ●●●● ●● ● ●●● ●●● ●●

SCL 0.0 0.4 0.8 1.2 ●● 0.0 0.4 0.8 1.2 ●●●● ●●●● ●●●● ●●●● ●● ● ●●● ●●● ● 0.0 0.4 0.8 1.2 iPS PS0.0 PSM 0.4 SM 0.8 SCL 1.2 DM iPS PS PSM SM SCL DM iPS PS PSM SM SCL DM iPS PS PSM SM SCL DM

PC2 (15.92 %) HES7 MSGN1

DM HES7 1.2 MSGN1 ● ●● ● ●● ● ●● ●●

● ● 0.8 Ex1Ex1 ● ● ●

Ex2Ex2

BRAEx3Ex3 MIXL1 0.4 R R iPS ● ● ● ● ●●●● R ● ●●●● ●●●● ●●● ●●●● ●●●● ●●● R ●●●● ●●●● ●●●●

● 0.0 0.4 0.8 1.2 0.0 ● iPS PS PSM SM SCL DM iPS PS PSM SM SCL DM ● ●●●● BRA ● ●●●● ●●●● ●●● ●●●● ●●●● ●●● ●●●● ●●●● ●●●● ● MIXL1 RSPO3_8 FOXC2_90.0 0.4 0.8 1.2 0.0MIXL1_9 0.4 0.8 1.2 iPS PS PSM SM SCL DM iPS PS PSM SM SCL DM PC1 (26.31 %) ● ● ● TCF15 ● MEOX1 ● ●● ● ●1.2 ● ● ●● 900 EXTENDED DATA FIGURES ● ●● ● ● TCF15 MEOX1 ● ●● ● ● ● ● ●● ● ●● ● ● ● ● ●● R R ●●

● 0.8 ●● ●● ● ● ● ● ● ● ● ● ●● b ●● ● a c ●● ● ●●● ●●● ● ● ● ●●●● ●●●● ●●●●group ●●●●1 ● ● 0.0 0.4 0.8 1.2 0.0 0.4 0.8 1.2 ● ● ●● ● iPS PS PSM SM SCL DM iPS PS PSM SM SCL DM ● ● b! 0.4 ● ● ●● ● ● R ● R ● R R ● ●●● ● HES7 ● MSGN●●●●●●1 ●●●● ●●●● ●●●● ●●●● ●●●● ●●●●● ●●●● ●●●● ●●●● ●●●● ●●●●●●● ●● 0.0 ●●● ● ●●● ●●● ●●● 0.0 0.4 0.8 1.2 ●● R R 0.0 0.4 0.8 1.2 0.0 0.4 0.8 1.2 ●●●● ●●●● ●●●● 1.2 ●●●● ●●●● iPS● PS● PSM SM SCL●● DM ●●iPS PS PSM● SM●●● SCL ●●● DM ● group2 ●● 0.0 0.4 0.8 1.2 0.0 0.4 0.8 1.2 0.0 0.4 0.8 1.2 Relative expression Relative iPS PS● PSM SM SCL DM iPS PS PSM● SM SCL DM expression Relative iPS PS PSM SM SCL DM iPS PS●● PSM SM SCL DM iPS PS PSM SM SCL DM ●● ●●●● ●●●● ●●●● ●●●● ● ●●●● ●●●● ●●●● ●●●●

0.0 0.4 0.8 1.2 0.0●● 0.4 0.8 1.2 ● ●●●● ●●●● ●●●● ●●●● ●● ● ●●● ●●● ● 0.0 0.4 0.8 1.2 PiPSAX1 PS PSM SM SCL0.0 DM 0.4 0.8PAX9 1.2 iPS PS PSM SM SCL DM ● HES7 0.8 MSGN1 ● DLL1_10 HES7_11 ● MSGN1_11 group3 ● ● ● ● ● ●●● PAX1 PAX9 ●● ●● ● ● ● ●● 0.4 ● ●● ● ● ●● ● ● ● R ● group4 R ● ●●● ● ● ● ● ● ● ● ●●●● ● R ●●●● ●●●● ●●● ●●●● ●●●● ●●● R ●●●● ●●●● ●●●● 0.0 0.4 0.8 1.2 0.0 iPS PS PSM SM SCL DM iPS PS PSM SM SCL DM ●group ● ● ● ● ● ●● ●●● ●● ●●● ●●● ● ●●●● ●●●● ●●●● ●●●● ●●●● ●

● ● 0.0 0.5 1.0 1.5 ● 0.0 0.5 1.0 1.5 iPS PS PSM SM SCL DM ● iPS PS PSM SM SCL DM R ●● R Relative expression Relative expression Relative R ● R

TCF15 ● ●●●●MEOX1 ● ●●●● ●●●● ●●●● ●●● ●●●● ●●●● ●●● ●●●● ●●●● ●●●●●●● ●●●●●●●● ●●●● ●●●●● ● ●●●● ●●●● ●●●● ●●●●

● 0.0 0.4 0.8 1.2 0.0 0.2 0.4 0.6 0.8 1.0 1.2 1.2 ● ●● ●●● ●● ●●● ●●● ●●●● ●●●● ●●●● ●●●● ●●●● ●●●● 0.0●●●● 0.4 0.8 ●●●● 1.2 ●●● 0.0●●●● 0.4 ●●●● 0.8 1.2 ●●● ●●●● ●●●● ●●● ●●●● ●●●● ●●●● 0.0 0.5 1.0 1.5 ALX4 PRRX10.0 0.5 1.0 1.5

0.0 0.4 0.8 1.2 0.0 0.4 0.8 1.2 ●● 0.0 0.2 0.4 0.6 0.8 1.0 1.2 iPS PS PSM●● SM SCL DM iPS PS PSM SM SCL DM iPS PS PSM SMgroup SCL DM iPS PS PSM SMiPS SCL PS DM PSM SM SCLiPS DM PS PSMiPS SM PS SCL PSM DM SM SCL DM ● ● ●● ● ●● ●● 0.8 ●● ● ●● TCF15DLL3_12● MEOX1LNFG_12●● ALX4 NOTCH1_13 PRRX1● Ex1 Ex2 Ex3 Ex1 Ex2 Ex3 Ex1 Ex2 Ex3 Ex1 Ex2 Ex3 Ex1 Ex2 Ex3 Ex1 Ex2 Ex3 ● ● ● ● ● ● ● ● ●● ● ●● ●● ●● ●● ● iPS PS PSM SM SCL DM ● ● ● ● ● ●● ● ● 0.4 ● ●● ● ● ● ●

R R ● ● ● ● R R ● ● ● ● ● ● ● ●● ● ● ●● ● ●●●● ●● ●●● ●●●● ●● ●●●●● ●●●● ●●●● ●●● ●●●●

012 0.0 0.4 0.8 1.2 0.0 0.4 0.8 1.2 ● ● ●●●● ●●●● ●●●● ●●●● ●●●● ●●●● ●●●● ●● ●●●●● Normalized expression values (log 2) expression Relative iPS● PS PSM SM SCL DM expression Relative iPS PS PSM SM SCL DM

0.0 ● 0.0 0.4 0.8 1.2 ● ● iPS PS PSM SM SCL DM iPS PS PSM SM SCL DM● ● ●●●● ●●●● ●●●● ● ●●●● ●●●● ●●●● ●●●● ●●●● R ● R

● 0.0 0.2 0.4 0.6 0.8 1.0 1.2 ● 0.0● 0.4 0.8● 1.2 ●● BRA BRA ● MIXL1 MIXL1 ● R R ●● ●● ● ● ●● ●● ●● ●●● ●●●● ●● ●●●● ●●●● ●●●● ●●● ●●●● ● ● ●●● 0.0 0.4 0.8 1.2 ● 0.0 0.4 0.8 1.2 a c ● ● d DLL1 PAX1 e PAX9 ● ● iPS PS PSM SM SCL DM c iPS PS PSM SM SCL DM ● ba ! ! ● ●●●● ●●●● ●●●● ●●●● ●●●● ●●●● ●●●●● ● ●●●● ●●●● ●●●● ●●●● ●●●● ●●●● ●●●● A ●●●● ● A ● ● 0.0 0.4 0.8 1.2 ● 0.0 0.4 0.8 1.2 ● PSM 0.0 0.4 0.8 1.2 0.0 0.4 0.8 1.2 ● ● ● 0.0 0.4 0.8 1.2 ● iPS PS PSM SM SCL DM iPS PS PSM SM SCL● DM ● 1.2 iPS PS PSM SM SCL DM iPS PS PSM SM SCL DM iPS PS PSM SM SCL DM ● ● ● ● ● ●● SM d! e! ● DLL1 TBX6PAX1 DKK1_14●● RSPO3_14 PAX9● ●

R ●● R ● ● R ● ● R ● ● PS● ● ●● ● ● ● ● ●● ● ●●●● ●●●● ●●●● ●●●● ●● ● ●●● ●●● ●● ● ● ●

SCL 0.0 0.4 0.8 1.2 ●● ● ● ● 0.0 0.4 0.8 1.2 ●●●● ●●●● ●●●● ●●●● ● ●● ● ●●● ●●● ●

● ● expression Relative ● expression Relative ● ● 0.0 0.4 0.8 1.2 iPS PS0.0 PSM 0.4 SM 0.8 SCL 1.2 DM iPS PS PSM SM SCL DM ●● ● R ● R iPS PS PSM SM SCL DM iPS PS PSM SM SCL DM MIXL1 R ● ● ●● ● ● ● ●● ●● ●●● ●● ●●● ●●● ●●●● ●●●● ●●●● ●●●● ●●●● 0.0 0.5 1.0 1.5 0.0 0.5 1.0 1.5 ● PC2 (15.92 %) ● ● ● ● HES7 MSGN1 ●●●● ●●●● ●●● ●● ●●● ●● ●●● ●●● ●●●● ●●●● ●●●● ●●●● ●●●● ● 1.2 0.0 0.5 1.0 1.5 0.0 0.5 1.0 1.5 0.0 0.4 0.8 DM HES7 MSGN1 ● iPS PS PSM SM SCL DM iPS PS PSM SM SCL DM iPS PS PSM SM SCL DM● ●● ● ● ● ● ● ●● ● ● ● ● ● A ●● ● ● RR ●● ●

●● ● ●● 0.8 ● ● R R R R ● Ex1Ex1 ●● ● ● ● ● PRRX1● ● ●●● ●ALX4 ●● ●● ● Ex2Ex2 ● ●●● ● ● ● ● ●●● ●●●● ●●●● ●●● ●●●●●● ●●● ●● ●●● ●●●●●●● ●●●● ●●●●●●●● ●●●● ●●●● ●●●●●●●● ●●●●●●●● ●●●● ●●●● ●●●● ●●●●

BRAEx3Ex3 0.4 0.0 0.2 0.4 0.6 0.8 1.0 1.2 0.0 0.4 0.8 1.2 0.0 0.4 0.8 1.2 0.0 0.5 1.0 1.5 0.0 0.5 1.0 1.5 0.0 0.4 0.8 1.2 PSM PSM ●● MIXL1SM SM ● ●● ● ● R iPSiPS PS PS PSM PSM SM SM SCL SCL DM DM iPS PS PSM SM SCL DM R iPS PS PSM SM SCL DM f iPS PS PSM SM SCL DM iPS PS PSM SM SCL DM● iPS ● ● ● ● ●●●● R R ● ●●●● ●●●● ●●● ●●●● ●●●● ●●● ●●●● ●●●● ●●●● expression Relative

● 0.0 0.4 0.8 1.2 0.0 ● iPS PS PSM SM● SCL DM iPS PS PSM SM SCL DM Relative expression Relative HEYL_15 ● RIPPLY2_15 ● ● expression Relative PAX1_16●●●● BRA ● ●●●● ●●●● ●●● ●●●● ●●●● ●●● ●●●● ●●●● ●●●● ● ● MIXL1 RSPO3_8 ● FOXC2_90.0 0.4 0.8 1.2 0.0MIXL1_9 0.4 0.8 1.2 ●● ALX4 PRRX1 ● ●● ●● ● ●●● ●●● ● ●● ●●●● ●●●●iPS ●●●● PS● PSM SM SCL DM●●●● ●●●● ●●●●iPS ●●● PS ●●●● PSM SM SCL DM PC1 (26.31 %) ● ● ● ● 0.0 0.4 0.8 1.2 0.0 0.4 0.8 1.2 0.0 0.4 0.8 1.2 ● ●● ●● TCF15 ● MEOX1 PSM ● PSM SM ●● ●● SM iPS PS PSM SM SCL DM ● ●1.2 ● ● ● ●● ● ●● R ● R ● TCF15 MEOX1 ●● ● ● ● f! ● ● ● ●● ● ●● ● ● ● ●● ● ●● R ● R ● ●● 89.8% ●● ●● ●●● ●●●●●83.3%●● ●● ●●●● ●●●● ●●●●0.8 ●●●● DLL1 ●●●● HES7 MSGN1 PSM PSM ● ●●● ●● ● ● 0.0 0.4 0.8 1.2 0.0 0.4 0.8 1.2 ● ● ● ● ● d e ● f b ● ●● ● ●● iPS PS PSM SM SCL● DM iPS PS PSM SM SCL DM ● ● ●●●● ●●●● ●●●●group ●●●●1 ●● ● ●●● ●●●● ● ● 0.0 0.4 0.8 1.2 0.0 0.4 0.8 1.2 ●● 1.2 ●●● ● ●● ● ●● iPS PS PSM SM SCL DM●● iPS PS PSM SM SCL● DM●● ● iPS b! 0.4 ● ●● ● ● ● ●● TBX6/TCF15/DAPI TBX6/TCF15/DAPI●● TBX6/TCF15●●●/DAPI TBX6● /TCF15● /DAPI ● ● R R ● ● ● ● R R R R ●● ● ●● ●●●● ● ●● ● ● ● PSM ●●●●● ●●●●● ● ● ● HES7 ● MSGN●●●●●●1 ●●●● ●●●● ●●●● ●●●●●●● ●●●● ●●● ●●●● ●●●● ●●●● ●●●● ●●● ●●●● ●●●● ●●●●●● ●●●●●●● ●● 0.0 ●●● ● ●●● ●●● ●●● 0.0 0.4 0.8 1.2 ●● ● 0.0 0.4 0.8 1.2 0.0 0.5 1.0 1.5 2.0 0.0 0.5 1.0 1.5 R R 0.0 0.4 0.8 1.2 0.0 0.4 0.8 1.2 ●●●●●● ●● ●●● ●●●●●●●●group ●●●● ●●2 1.2 ●●●● ●●●●iPS ●●●●● PS ●●● PSM ●●●● SM SCL●● DM ●●iPS PS PSM● SM●●● SCL ●●● DM ● 0.0 0.4 0.8 1.2 0.0 0.4 0.8 1.2 PSM 0.0iPSiPS 0.4 PS PS 0.8● PSM PSM 1.2 PSM SM SM SCL SCL DM DM 0.0iPSiPS 0.4 PS PS 0.8 PSMSM PSM 1.2 SM SM SCL SCL DM DM 0.0iPS 0.4SM PS 0.8 PSM 1.2 SMSM SCL DM DLL1 iPSiPS PS PS PSM PSM SM SM SCL SCL DM DM iPSiPS PS PS●● PSM PSM SM SM● SCL SCL DM DM iPS PS PSM SM SCL DM ●● ●●●●● ●●●● ●●●● ●●●● ●●●● ●●●● ●●●● ●●●●

● 0.0 0.4● 0.8● 1.2 0.0 0.4 0.8 1.2 iPS● PS PSM SM SCL DM iPS PS PSM SM SCL DM 0.4 0.8 PAX1 ●● PAX9SCL Relative expression Relative ● HES7 0.8 MSGN1 ● 4.5% ● DLL1_1013.1% HES7_11 ● MSGN1_11

TBX6/TCF15/DAPI TBX6/TCF15/DAPIgroup3 TBX6/TCF15/DAPI TBX6/ expression Relative TCF15/DAPI ● expression Relative ● DM ● ● ● ● ●● ● ● ●●●●●●● PAX1●●● PAX9 ●●

0.0 ●● ● ●●●● ● ● ●●●● ●●●●●● ●●● ●●●● ●●●● ●●● ●●●● ●●●● ●●●● 0.4 ● ● 0.0 0.4 0.8 1.2 0.0 0.2 0.4 0.6 0.8 1.0 1.2 ● PSM PSM SM ● SM ● R ● group4 R iPS PS PSM● SM SCL DM iPS PS PSM SM SCL DM TBX6 BRA ● DLL1● ●●● ● ● ● ● ●●●● R ●●●● ●●●● ●●● ●●●● ●●●● ●●● R ●●●● ●●●● ●●●● 0.0 0.4 0.8 1.2 0.0 iPS PS PSM SM SCL DM iPS PS PSM SM SCL DM ●group ● ● ● ● g PSM PSM ● SM ●● ●●● ●● ●●● ●●● SM● ●●●● ●●●● ●●●● ●●●● ●●●● SM●

● ● 0.0 0.5 1.0 1.5 ● 0.0 0.5 1.0 1.5 iPS PS PSM SM SCL DM ● iPS PS PSM SM SCL DM R BRA/TCF15/DAPI BRA/TCF15/DAPI ●● BRA/TCF15R /DAPI BRA/TCF15/DAPI R ● R

TCF15 ● ●●●●MEOX1 ●●●● ●●●● ●●● ●●●● ●●●●● ●●● ●●●● ●●●● ●●●●● ● ● 1.2 ● ●● ●●● ●● ●●● ●●● ●●●● ●●●● ●●●● ●●●● ●●●● ●●●● 0.0●●●● 0.4 0.8 ●●●● 1.2 ●●● 0.0●●●● 0.4 ●●●● 0.8 1.2 ●●● ●●●● ●●●● ●●● ●●●● ●●●● ●●●● 0.0 0.5 1.0 1.5 ALX4 PRRX10.0 0.5 1.0 1.5

0.0 0.4 0.8 1.2 0.0 0.4 0.8 1.2 ●● 0.0 0.2 0.4 0.6 0.8 1.0 1.2 iPS PS PSM●● SM SCL DM iPS PS PSM SM SCL DM Extended Figure 1 iPS PS PSM SMgroup SCL DM iPS PS PSM SMiPS SCL PS DM PSM SM SCLiPS DM PS PSMiPS SM PS SCL PSM DM SM SCL DM ● ● ●● ● ●● ●● 0.8 ●● ● TCF15DLL3_12● MEOX1LNFG_12 ALX4 NOTCH1_13 PRRX1● Ex1 Ex2 Ex3 Ex1 Ex2 Ex3 Ex1 Ex2 Ex3 Ex1 Ex2 Ex3 Ex1 Ex2 Ex3 Ex1 Ex2 Ex3 ● ● ● ● ● ● ●● ●● ●● ●● ● BRAiPS/TCF15PS /DAPIPSM SM BRASCL/TCF15DM ●/DAPI BRA/TCF15/DAPI● BRA/TCF15/DAPI ● ● ● ● 0.4 ● ●● ● ● ● R R ● R R ● ● ● ● ● ● ● ●● ● ● ●● TBX6/TCF15/DAPI TBX6/TCF15/DAPI TBX6● /TCF15/DAPI ●●TBX6●● ●● ●●●/TCF15 ●●●● ●● /DAPI ●●●●● ●●●●TBX6 ●●●●/ ●●●MESP2 ●●●● /DAPI

012 0.0 0.4 0.8 1.2

Extended Figure 1! 0.0 0.4 0.8 1.2 ● ● ●●●● ●●●● ●●●● ●●●● ●●●● ●●●● ●●●● ●● ●●●●● Normalized expression values (log 2) iPS● PS PSM SM SCL DM iPS PS PSM SM SCL DM

0.0 ● 0.0 0.4 0.8 1.2 ● ● iPS PS PSM SM SCL DM iPS PS PSM SM SCL DM● ● ● R ● R ● ● ● ● ●● PSM PSM SM SM● SM ● R R ●● ●● ● ● ●● ●● ●● ●●● ●●●● ●● ●●●● ●●●● ●●●●●●● ●●● ●●●● 0.0 0.4 0.8 1.2 ● TBX60.0 0.4 0.8 1.2 d DLL1 PAX1 e PAX9 iPS PS PSM SM SCL DM iPS PS PSM SM SCL DM ●●●● ●●●● ●●●● ●●●● ●●●● ●●●● ●●●● ●●●● ●●●● ●●●● ●●●● ●●●● ●●●● ●●●● ●●●● ● 0.0 0.4 0.8 1.2 ● 0.0 0.4 0.8 1.2 ●● 0.0iPS 0.4 PS 0.8 PSM 1.2 SM SCL DM 0.0iPS 0.4 PS 0.8 1.2 PSM SM● SCL DM 0.0 0.4 0.8 1.2

1.2 iPS PS PSM SM SCL DM iPS PS PSM● SM SCL DM iPS PS PSM SM SCL DM ● ● ● ● ●● d! DLL1 TBX6PAX1 e! DKK1_14PAX9 RSPO3_14

● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● R ● R R ● ● ●● ●● Relative expression Relative ● ● ● ● ● BRA/TCF15●●●● ●●●●/DAPI BRA●●● /TCF15●●/DAPI ●●● ●● ●●● BRA●●●/TCF15/DAPI●●●● ●●●● ●●●●BRA ●●●● /TCF15●●●● /DAPI MESP2 ● 0.0 0.5 1.0 1.5 0.0 0.5 1.0 1.5 0.0 0.4 0.8 ● ●● 901 iPS PS PSM SM SCL DM iPS PS PSM SM SCL DM iPS PS PSM●●● SM SCL● DM ● ● ● ●●●● ●●●●● ●●●● ● ● ● 0.0 0.2 0.4 0.6 0.8 1.0 1.2 ● ● 902 Extended Data Fig. ●● 1 | Characterization of step-wise induced human presomitic R R R ExtendedR Figure 1 ● ● ● ● PRRX1● ●●● ●ALX4 ●● ●● ● ● ●●● ● ● ● ● ●●● ●●●● ●●●● ●●● ●●●●●● ●●● ●● ●●● ●●●●●●● ●●●● ●●●●●●●● ●●●● ●●●● ●●●●●●●● ●●●●●●●● ●●●● ●●●● ●●●● ●●●● 0.0 0.2 0.4 0.6 0.8 1.0 1.2 0.0 0.4 0.8 1.2 0.0 0.4 0.8 1.2

(PSM) and somitic mesoderm0.0 0.5 (SM). 1.0 1.5 a, Heatmap of gene expression levels in step-wise

903 0.0 0.5 1.0 1.5 0.0 0.4 0.8 1.2 PSM PSM ●● SM SM f iPS PS PSM SM SCL DM iPSiPS PS PS PSM PSM SM SM SCL SCL DM DM iPSiPS PS PS PSM PSM SM SM SCL SCL●● DM DM iPS PS PSM SM SCL DM 904 induced human PSM and its derivatives (using iPSC-line 1231A3). FPKM values of each ● HEYL_15 ● RIPPLY2_15 ● PAX1_16 ALX4 ● PRRX1

● ● ● ● ●● PSM PSM SM ●● SM ● R R f! ● ● ●● ●● ●● ●●● ●●●● ●● ●●●● ●●●● ●●●● ●●●● ●●●● ● ●● ● ● ● 0.0 0.4 0.8 1.2 0.0 0.4 0.8 1.2 ● ● iPS PS PSM SM SCL● DM iPS PS PSM SM SCL● DM ●● ●●● 26 ● ●● ●● ●● ● ● TBX6/TCF15/DAPI TBX6/TCF15/DAPI TBX6/TCF15●●●/DAPI TBX6● /TCF15● /DAPI ● R R ●● ● ● ● ●●●● ● ● ●●●● ●●● ● ● ● ●●● ●●● ●●● ●●● ●●● ●● 0.0●● 0.4 ●● 0.8 ●●● 1.2 ●●●● ●● 0.0●●●● 0.5 1.0 ●●●● 1.5 ●●●● 2.0 ●●● ●●●● 0.0 0.5 1.0 1.5 0.0 0.4 0.8 1.2 PSM 0.0 0.4iPS 0.8 PS 1.2 PSMPSM SM SCL DM iPS PSSM PSM SM SCL DM iPSSM PS PSM SM SCL DM iPS PS PSM SM SCL DM iPS PS PSM SM SCL DM TBX6/TCF15/DAPI TBX6/TCF15/DAPI TBX6/TCF15/DAPI TBX6/TCF15/DAPI PSM PSM SM SM

BRA/TCF15/DAPI BRA/TCF15/DAPI BRA/TCF15/DAPI BRA/TCF15/DAPI Extended Figure 1

BRA/TCF15/DAPI BRA/TCF15/DAPI BRA/TCF15/DAPI BRA/TCF15/DAPI Extended Figure 1! 905 gene were normalized to mean of all samples. The gene order is the same as in Fig. 1b. b, 906 PCA analysis of transcript expression levels in human PSM and derivatives (1231A3). c, 907 qRT-PCR based validation of RNA-seq results; summary of four independent 908 experiments with three technical replicas each using 201B7. Similar results were obtained 909 for 1231A3 (data not shown). It should be noted that open circles in some conditions are 910 less than four because no Ct values in the samples were obtained after 45 cycles of PCR 911 to calculate expression values. d, Flow cytometry-based evaluation of DLL1 and TBX6 912 (left) as well as DLL1 and BRACHYURY (BRA) (right) expression at PSM stage 913 (1231A3). e, Expression of DLL1 on transcript level throughout the different stages of 914 induction (201B7). f, Expression of DLL1 on protein level. Correlation of FACS data 915 with qRT-PCR results (201B7) shown in (e). g, Immunofluorescence staining of PSM 916 markers TBX6 and BRA as well as somitic mesoderm marker TCF15 at PSM (left half 917 panels) and SM (right half panels) stages; entire wells (left) and magnified views of 918 selected areas (right). Staining of segmentation marker MESP2 (alone or co-staining with 919 TBX6) shown in far right side of panel. Scale bar: 100 µm.

27 a DM DM SCL SCL

PAX7 PAX7/DAPI FOXC2 FOXC2/DAPI b

c SCL è Cartilage/Bone

#1 4mm SCL è Cartilage/Bone

#1 #2 #3

#3 4mm 1231A3 SCL (WT) 625-D4 SCL (LUC/GFP-reporter - hetero) 625-A4 SCL (LUC/GFP-reporter - homo)

d in vitro e in vivo

Safranin O Safranin O COL1 von Kossa in vitro in vivo

COL2 Safranin O COL2 HNA

f DM è Muscle DM è Muscle DM è Muscle DM è Muscle

MYOSIN/DAPI MYOSIN/DAPI SAA/DAPI SAA/DAPI 920 921 Extended Data Fig. 2 | Molecular and functional characterization of step-wise 922 induced human PSM-derivatives, sclerotome (SCL) and dermomyotome (DM). a, 923 Developmental-stage specific expression of DM (PAX7) and SCL (FOXC2) markers on Extended Figure 2 28 924 protein level (201B7). Scale bar: 100 µm. b, Assessment of in vivo bone and cartilage 925 forming ability of human sclerotome (SCL). Subcutaneous transplantation of PSC- 926 derived SCL step-wise induced from WT (1231A3) and luciferase reporter lines (625-D4 927 and 625-A4). Evaluation of transplanted cells via IVIS at two months post- 928 transplantation. Injection sides are marked by dashed/colored circles. Cartilage and bone 929 forming areas of WT iPSC line (1231A3) marked by white arrows. c, Whole-mount 930 images of WT SCL-derived in vivo cartilage/bone tissues isolated from transplanted mice 931 #1 and #3. Explant isolated from mice #2 is shown in Fig. 1i. d, Staining of in vitro 932 human SCL derived cartilage (3D-CI) sections. Observed Safranin O and type II collagen 933 (COL2) signals are indicative of in vitro cartilage formation. e, Sections and staining of 934 area shown in Fig. 1i; Safranin O and COL2 staining in human in vivo SCL-derived 935 cartilage areas; von Kossa and COL1 staining in ossifying cartilage and forming bone 936 areas. Majority of cells contributing to cartilage/bone formation are HNA positive and of 937 human origin (right lower panel). Scale bar: 100 µm. f, Evaluation of in vitro muscle 938 induction from human DM. Myosin and sarcomeric alpha-actinin (SAA) staining of in 939 vitro DM-derived skeletal muscle; representative images of entire well (left) and 940 magnified areas (right). Scale bar: 100 µm.

29 a Sampling Ex1 Sampling Ex2 1.2 1.2 500 840 1 543 885 1 545 930 590 0.8 458 585 798 0.8 895 975 505 640 945 628 755 415 855 990 0.6 713 0.6 463 690 1040 670 815 373 775 0.4 330 0.4 423 735 380 0.2 0.2 Hes7 reporter activity Hes7 reporter activity 0 0 0 200 400 600 800 1000 0 200 400 600 800 1000 Time (min) Time (min)

Luciferase DLL1 3 b 2 12 01 0 Ex1 Ex2 Ex1 Ex2 BCORL1 KDM6B 3 012 012 Ex1 Ex2 Ex1 Ex2 MAML3 JADE1 2 02468 01 Phase Ex1 Ex2 Ex1 Ex2 NOTCH1 HK2 2 01 01 Ex1 Ex2 Ex1 Ex2 SPRY4 TEAD4 01 01 Ex1 Ex2 Ex1 Ex2

AXIN2 AMOTL2 1 1 0 0 Ex1 Ex2 Ex1 Ex2 HOXA5 HOXD1 1 Anti-phase 01 0 941 Ex1 Ex2 Ex1 Ex2 942 ExtendedExtended Data Figure Fig. 3 | 3 Isolation and transcriptome analysis of timely coordinated 943 human PSM samples. a, Sampling for RNA-seq. HES7-reporter activity was 944 continuously monitored with one sample, and the other samples were frozen at each time 945 point indicated in the graph. b, qRT-PCR validation of identified novel phase and anti- 946 phase oscillating genes. DLL1 and AXIN2 show clear phase or anti-phase oscillation 947 despite not being included into high stringency cut-off RNA-seq candidate list. Error bars 948 indicate S.D. of three technical replicas for each time point and sample set.

30 HES7 e1 e2 e3 e4

KO1 KO2 sgRNA-KO1 PAM sgRNA-KO2 PAM REF TGGTCGCAGATGCTCAAGCCGC-TTGTGGAGAAG REF CCTGGAAGAGCTGAGGCTGCTGCTGCTGGAGCGGACCCGGGACCAGGTCAGT KO #7, 8 TGGTCGCAGATGCTCAAGCCGCCTTGTGGAGAAG ins1 KO #1 CCTGGAAGAGCTGAGGCTGCTGCTGCTGGAGCGGA------GGTCAGT del10 TGGTCGCAGATGCTCAAGCCGCTTTGTGGAGAAG ins1 > Thr43Argfs*63 > Leu19fs*170 CCTGG------TCAGT del42 > delGlu33-Gln46

DLL3 e1 e2 e3 e4 e5 e6 e7 e8

KO1 KO2 PAM sgRNA-KO1 sgRNA-KO2 PAM REF CTGCGCCCCTGCGCACCGCTCGAGGACGAATGT REF CAGCCCGGTGAATGCCGATGCC-TAGAGGGCTGG KO #6 CTGCGCC------GCTCGAGGACGAATGT del10 KO #2 CAGCCCGGTGAATGCCGATGCCCTAGAGGGCTGG ins1 CTGCGCC------GCTCGAGGACGAATGT del10 CAGCCCGGTGAATGCCGATGCCCTAGAGGGCTGG ins1 > Pro206fs*237 > Leu240fs*279

LFNG e1 e2 e3 e4 e5 e6 e7 e8

KO2 sgRNA-KO2 PAM sgRNA-KO2 PAM REF ACCCGCACACGCGGGACGTCTA-CGTCGGCAAGCCCAGCCTGGACAG REF ACCCGCACACGCGGGACGTCTA-CGTCGGCAAGC KO #2 ACCCGCACACGCGGGACGTCTAGCGTCGGCAAGCCCAGCCTGGACAG ins1 KO #12 ACCCGCACACGCGGGACGTCTAACGTCGGCAAGC ins1 ACCCGCACACGCGGGACGTCTA------GACAG del19 ACCCGCACACGCGGGACGTCTAACGTCGGCAAGC ins1 > Tyr224* > Tyr224*

MESP2 e1 e2

KO2 KO1 sgRNA-KO2 PAM PAM sgRNA-KO1 REF CGACGACGCCCAGACGAGCGCGCACCGGACCAGCGGGCGGACAG REF TGCCTCCCTCCTT-GGCGCCGGCCGGCCAGAGCC KO #11 CGACGACGCCCAGACGAGCG------GACAG del19 KO #7 TGCCTCCCTCCTTTGGCGCCGGCCGGCCAGAGCC ins1 CGACGACGCCCAGACGAGCG------GACAG del19 TGCCTCCCTCCTTTGGCGCCGGCCGGCCAGAGCC ins1 > Arg75fs*113 > Leu112fs*366 949 950 Extended Data Fig. 4 | Overview of knock-out reporter line generation. Schematics 951 of the HES7, DLL3, LFNG, and MESP2 genes. Positions of the sgRNAs used in this 952 study are shown. sgRNAs were designed to target at or near regions of known pathogenic 953 mutations, particularly those resulting in frameshifts and premature termination. 954 Sequence analysis of iPSC clones used in this study indicating indel mutations generated 955 by Cas9. Predicted effects on the protein sequence are listed below the sequence 956 alignments.

Extended Figure 4

31 Supplementary figure xx

a WT DLL3 KO 400 Peaks 400 300 1 300 1 2 200 3 200 4 5 6 7 8 2 100 100 3 4 5 6 78 0 0 050010001500200025003000 050010001500200025003000 -100 -100 -200 Time (min) -200 -300 -300

-400 -400 Avg intensity of Hes7 reporter

LFNG KO MESP2 KO 400 1 400

300 300 1 200 2 200 2 5 6 7 8 3 3 4 100 4 56 78 100 0 0 050010001500200025003000 050010001500200025003000 -100 -100

-200 -200

-300 -300

-400 b 201B7 HES7-KO #7 DLL3-KO #2 LFNG-KO #2 MESP2-KO #7

96.8% 70.2% 93.3% 93.7% 92.0%

HES7 Reporter #10 HES7-KO #8 DLL3-KO #6 LFNG-KO #12 MESP2-KO #11

86.2% 64.6% 95.7% 90.8% 93.9%

DLL1 iPS 15 15 15 c 15 10 10 10 10 HES7 HES7 HES7 HES7 MESP2 MESP2 MESP2 MESP2 5 5 5 5 DLL3 KO HES7 KO LFNG LFNG LFNG KO LFNG MESP2 KO LFNG

DLL3 DLL3 DLL3 DLL3 0 0 0 0

0 5 10 15 0 5 10 15 0 5 10 15 0 5 10 15 WT WT WT WT PSM 15 15 15 15 LFNG HES7 LFNG LFNG HES7 HES7 HES7 10 10 10 10 MESP2

DLL3 MESP2 DLL3 MESP2 DLL3 DLL3 KO 5 5 5 5 HES7 KO LFNG KO

DLL3 MESP2 KO

LFNG MESP2 0 0 0 0

0 5 10 15 0 5 10 15 957 0 5 WT 10 15 WT 0 5 WT 10 15 WT 958 ExtendedExtended FigureData Fig. 5 5 | Characterization of knock-out reporter lines. a, Oscillation 959 damping rate of knock-out PSMs in 3D-synchronization assay. Average signal of the 960 entire image at each time point was measured using the sample shown in Extended Data 961 Movie 3. The signal was detrended (± 2 hours), and each oscillation peak was detected to 962 define the amplitude. Representative images and graphs of three independent experiments 963 are shown. Scale bar: 500 µm. b, Evaluation of DLL1 expression via FACS analysis at 964 iPSC and PSM stages of healthy control and knock-out iPSC lines. PSM induction 965 efficiency is high in all analyzed samples; slight reduction of DLL1 induction efficiency 966 in HES7-KO lines. Representative results of two different knock-out lines each are 967 shown. c, Scatter plot of transcriptome analysis of wild type and KO lines at iPSC and 968 PSM stages. Positions of expression values for MESP2, DLL3, LFNG and HES7 are 969 highlighted with colored arrows.

32 a STD-A b STD-A

Residual c OCT3/4 NANOG plasmid 1.5 1.5 1.0

0.8

1.0 1.0 STD-F STD-F 0.6

0.4 0.5 0.5

Relative expression Relative expression 0.2 Plasmid copy number / cell

0.0 0.0 0.0

201B7 201B7 SCD3-A SCD3-F SCD3-A SCD3-F SCD3-A SCD3-F

d H9 hESC STD-A STD-F 67.8% 75.1% 82.0% e PSC Ectoderm 8 8

OCT3/4 6 6 PSC

4 4

2 2 NANOG 0 0

96.7% 95.3% 96.2% -2 -2 Algorithm Score Algorithm Score

-4 -4 H9 H9 -6 STD-A -6 STD-A PAX6 STD-F STD-F -8 -8 PSC Ect Mes End PSC Ect Mes End Ectoderm Gene sets Gene sets SOX2

94.0% 94.4% 91.7% Mesoderm Endoderm 8 8

6 6

4 4

BRACHYURY 2 2 Mesoderm NCAM 0 0 -2 -2 Algorithm Score Algorithm Score

44.8% 89.0% 89.4% -4 -4 H9 H9 -6 STD-A -6 STD-A STD-F STD-F -8 -8

SOX17 PSC Ect Mes End PSC Ect Mes End Gene sets Gene sets Endoderm FOXA2 970 971 Extended Data Fig. 6 | Initial characterization of STD patient iPSC line. a, Bright 972 field view of STD iPSC clones STD-A and STD-F. Scale bar: 500 µm. b, Normal 973 karyotype (46, XX) in both clones of STD patient iPSC line by chromosomal G-banding 974 analysisExtended. c, Expression Figure of6 pluripotency markers OCT3/4 and NANOG in STD-A and 975 STD-F cells compared to WT iPSC line (201B7). Quantification of residual plasmid 976 levels in STD clones (right side of panel). d, FACS-based evaluation of differentiation 977 capacity into three germlayers of healthy control (H9 hESC) and patient lines (STD-A 978 and STD-F). e, Quantification of differentiation capacity into ectoderm, mesoderm and 979 endoderm at transcript level. 980 981

33 STD-A STD- a

93.2% 92.7%

STD-F STF

91.3% 92.4%

DLL1

Day0 CTD-2228K2.7 MIPOL1 (iPS) C11orf74 GRPR KCNC3 Day2 PWAR5 ANPEP (PSM) ABCA1 GREB1L 3.0 DLL1 3.0 DLL3 Day5 LINC00632 (DM) KCNC3 2.5 2.5 SLCO1A2 2.0 2.0 Day6 KCNC3 CTGF (SCL) 1.5 1.5

SDR42E1 1.0 1.0 PEG3 Relative expression Relative RP11-69I8.2 expression Relative 0.5 0.5 Day0 SYNDIG1L TMEM266

(iPS) PLEKHA2 0.0 0.0 HMX2 SCUBE1 STD−F STD−F

RP11-132A1.3 STD−A SIX6 STD−A

STD−resF STD−resF STD−resA STD−resA (WT) 201B7 (WT) 201B7

WT) 1231A3

(WT) 1231A3 GPR50 ( MESP2-KO#7 Day1 NAT2 MESP2-KO#7 MESP2-KO#11 FTCD MESP2-KO#11 (PS) CTD-2501M5.1 PEG3 LAD1 ALDH1A3 LFNG NODAL 5 HSD17B7P2 120 APLN 4 RPL10AP6 100 Day2 MT-TT RAB17 (PSM) CLDN4 80 3 CTD-2501M5.1 VSNL1 60 CLDN7 2 LINC00458 40 Relative expression Relative Relative expression Relative 1 GPNMB 20 GSTM1 TACR1

0 NECTIN4 0 ST14 KRT7 STD−F STD−A STD−F STD−F TINCR STD−A

SLC27A6 STD−resF STD−resA STD−resF STD−resA (WT) 201B7

DLX6 (WT) 201B7

(WT) 1231A3 (WT) 1231A3 MESP2-KO#7 WNT6 MESP2-KO#7

PROM2 MESP2-KO#11 TMEM132C MESP2-KO#11 ERP27 HTR1B HHIP LINC00865 WNT7A SAMD3 EPS8L2 80 30 CLDN7 VTCN1 25 MAP7 60 RGL3 20 Day5 AP1M2 ESRP2 (DM) SPNS2 40 15 TRIML2 RPS3AP5 10 ADAMTSL4

20 expression Relative GRHL1 expression Relative FBXL21 5 RAB25

DLX5 0 0 CYP1B1 MYO5B

CD24P4 STD−F STD−A STD−F MTND4P12 STD−A

STD−resF STD−resA STD−resF PPL STD−resA (WT) 201B7 (WT) 201B7

(WT) 1231A3

SPINT1 (WT) 1231A3 MESP2-KO#7 PEG3 MESP2-KO#7 MESP2-KO#11 ABCG2 MESP2-KO#11 MTCO2P12 CGN MTND1P23 NBEAL2 H19 CDH1 Healthy (WT) MESP2-KO STD-Patient STD-Rescue DCN THBD WNT7A MARCH3 Day6 SCG3 PEG3 (SCL) BMP3 PDE6B UNCX TFCP2L1 DMRT2 201B7 201B7 201B7 A F A1 F2 F3 ex1 ex2 ex3

WT Patients Patients (rescued)

−3 0 3 Normalized expression values 982 (log 2) 983 ExtendedExtended Data Figure Fig. 77 | Comparison of patient vs. rescue STD lines. a, Representative 984 DLL1 expression in PSM-derived from STD patient (STD-A and STD-F) and 985 corresponding rescue lines (STD-resA and STD-resF); n=3. b, Heatmap of gene 986 expression levels of transcripts differentially expressed in patient lines STD-A and STD- 987 F, when compared to wild-type (201B7) and corrected rescue clones (STD-resA (A1) and 988 STD-resF (F2 and F3)). Analysis covers all stages of step-wise PSM induction and 989 differentiation. For SM stage data see Fig. 4e and Extended Fig. 7c. c, qRT-PCR-based 990 validation of additional candidates found via RNA-seq to be up-regulated in STD patient 991 lines at the SM stage. Data of three independent experiments are shown.

34 992 993

994 995 Extended Data Fig. 8 | Allele-specific gene correction of MESP2 in patient iPS cells. 996 a, Detailed schematic of the gene correction strategy described in Fig. 4e, 4f. Depicted 997 are the two mutant or corrected MESP2 alleles with coding and non-coding exons (grey 998 and white), overlapping donor vector homology arms (HA-L, HA-R), engineered 51 bp 999 microhomology (µ51, blue), inverted protospacers for cassette excision (ps1, green), 1000 genotyping primers (red arrows), and Southern blotting probes (black bars). The 1001 sequences of mutation-specific sgRNAs are shown below each mutant allele. The gene- 1002 targeted intermediate shows details of the CAG::mCherry-IRES-puro cassette used for 1003 enrichment. b, Southern blot analysis of the targeted iPSC clones. Samples marked with 1004 an asterisk were selected for cassette excision. c, Southern blot analysis of gene corrected 1005 iPSC clones following selection marker removal. Samples marked with an asterisk were 1006 selected for phenotyping (067-1-3, STD-resA1; 067-2-5, STD-resF2; 067-3-4, STD- 1007 resF3).

35 a b STD-A STD-resA1 Chr8 Chr8 Chr1 Chr1 Chr22 Chr22 Chr15 Chr15 MT MT Chr9 Chr9 Chr2 Chr2 Chr16 Chr16 ChrX ChrX Chr3 Chr3 Chr17 Chr17 Chr10 Chr10 ChrY ChrY Chr4 Chr4 Chr11 Chr11 Chr18 Chr18 Chr5 Chr5 Chr19 Chr19 Chr12 Chr12 Chr6 Chr6 Chr20 Chr20 Chr13 Chr13 Chr7 Chr7 Chr21 Chr21 Chr14 Chr14 c d STD-F STD-resF1 Chr8 Chr8 Chr1 Chr1 Chr22 Chr22 Chr15 Chr15 MT MT Chr9 Chr9 Chr2 Chr2 Chr16 Chr16 ChrX ChrX Chr3 Chr3 Chr17 Chr17 Chr10 Chr10 ChrY ChrY Chr4 Chr4 Chr11 Chr11 Chr18 Chr18 Chr5 Chr5 Chr19 Chr19 Chr12 Chr12 Chr6 Chr6 Chr20 Chr20 Chr13 Chr13 Chr7 Chr7 Chr21 Chr21 Chr14 Chr14 e STD-resF2 Chr8 Chr1 Chr22 Chr15 MT Chr9 Chr2 Chr16 ChrX Chr3 Chr10 Chr17 ChrY Chr4 Chr11 Chr18 Chr5 Chr19 Chr12 Chr6 Chr20 Chr13 Chr7 Chr21 1008 Chr14 1009 ExtendedExtended Figure Data Fig.9. 9 | Evaluation of patient and rescued iPSCs. a, b, Resulting 1010 karyograms from SNP array analysis of STD patient iPSC clone A (STD-A) and 1011 corresponding rescued iPSC line (STD-resA1). c-e, Karyograms from SNP array analysis 1012 of STD patient iPSC clone F (STD-F) and corresponding rescued iPSC lines (STD- 1013 resF1/F2). No de novo CNVs were detected following gene editing and subcloning. 1014 These figures were created by Illumina Genome Viewer (IGV) (Version 1.9.0) on 1015 Illumina GenomeStudio V2011.1 with Human:Build 37 genome.

36 1016 EXTENDED DATA FILES 1017 1018 Extended Data Movie 1 | 3D-synchronization assay for wild-type PSM. 1019 Bright field view (left) and HES7 luciferase reporter images (right). Representative data 1020 of three independent experiments are shown. Scale bar: 500 µm. 1021 1022 Extended Data Movie 2 | Calcium imaging of contracting DM-derived muscle. 1023 Representative movies of dermomyotome (DM) derived human skeletal muscle. GCaMP 1024 reporter line activity (green fluorescence) indicating calcium influx into contracting 1025 muscle cells. Magnified view of contracting muscle cells showing concomitant calcium 1026 activity (right side of movie panel). Scale bar: 100 µm. 1027 1028 Extended Data Movie 3 | 3D-synchronization assay for knock out PSMs. 1029 HES7-reporter activity is shown for WT and DLL3-, LFNG-, MESP2-knock-out PSMs. 1030 Representative data of three independent experiments are shown. Scale bar: 500 µm. 1031 1032 Extended Data Table 1 | RNA-seq analysis of human PSM and derivatives. 1033 Expressed genes arranged into six major expression clusters/groups corresponding to the 1034 six distinct differentiation and developmental stages analyzed. 1035 1036 Extended Data Table 2 | List of oscillating human segmentation clock genes. 1037 Complete list of all phase and anti-phase oscillating genes (high stringency cut-off) 1038 identified by ARSER algorithm (for details see Methods section). 1039 1040 Extended Data Table 3 | Pathway-analysis of identified oscillating genes. 1041 Complete results of pathway and GO analyses for phase and anti-phase oscillating genes. 1042 1043 Extended Data Table 4 | Recombinant proteins & small molecules used in this study. 1044 List of utilized recombinant human proteins (4.1) and small molecule agonists and 1045 inhibitors (4.2). 1046 1047 Extended Data Table 5 | Primers used in this study. 1048 List of utilized qRT-PCR primers for differentiation and oscillation assays (5.1), qRT- 1049 PCR primers for iPSC quality control (5.2), exon-specific primers for genotyping (5.3), 1050 oligos for sgRNA cloning (5.4), InFusion primers for MhAX targeting vectors (5.5), PCR 1051 genotyping for MhAX targeting and excision (5.6). 1052 1053 Extended Data Table 6 | Antibodies used in this study. 1054 List of utilized primary antibodies (6.1) and secondary antibodies (6.2) for 1055 immunostaining, and antibodies used for flow cytometric analysis (6.3). 1056 1057

37 1058 Extended Data Table 4 | Utilized recombinant proteins and small molecules 1059 1060 Extended Data Table 4.1 | Human recombinant proteins 1061 Recombinant Company Catalog Number Human Protein Activin A R&D Systems 338-AC bFGF Wako 068-04544 BMP4 R&D Systems 314-BP-050 1062 1063 1064 Extended Data Table 4.2 | Small molecule agonists/inhibitors 1065 Small Molecule Company Catalog Number Description CHIR99021 Axon Medchem 1386 WNT agonist LDN193189 Stemgent 04-0074 BMP inhibitor PD173074 Tocris 3044 FGFR inhibitor SAG Calbiochem 566661 HH agonist SB431542 Selleck Chemicals S1067 TGFβ inhibitor GDC0449 Cellagen Technology C4044-5 HH inhibitor XAV939 Tocris 3748 WNT inhibitor 1066

38 1067 Extended Data Table 5 | Primers used in this study 1068 1069 Extended Data Table 5.1 | qRT-PCR primers (differentiation and oscillation) 1070 Gene Forward Primer Sequence Reverse Primer Sequence hALX4 ATGAATGCTGAGACTTGCGTC GGGAAATGCCCTAAAAGGCG hAMOTL2 CCTGGGACCAGATGGTCAAT GCCACTCTCAGCCATTTCAG hARHGAP24 TCGGGAACCCCACAAATGTT GTGCTGGCCTAACTCTCCAG hAXIN2 GTATCGTCTGCGGGTCTTCC CGAGCTCACACTCAATTCGC hBCORL1 AGAGGCTGAAGCTTTCCTCG CACCTGCCTGTAGAACTCGG hT CACTGCATCTTTCGGGACCT GACATCGTGGACAGCCAGTA hCXXC5 TCAGTGAGCCCCTCAACAAG TTGTCAGCAGACTCTCCGC hDACT1 CTGAAGAAAGCGGGGGCTTA ACAAAGGTTTTTGCGGGGTG hDKK1 GCACCTTGGATGGGTATTCCA GCACAACACAATCCTGAGGC hDLL1 GGTCCTGCACGAGGTTATAGT AACACCAACAAGAAGGCGGAC hDMRT2 TCTCGGTGAATGAACCACTGT GAGAGACACGAGTGATGGCA hDUSP5 TGCCTACCCACTCAACAGTC CCGATTCCTCCATCCCAGTTT hFGF18 TTTCATGAAGCGCTACCCCA CTTGGTCACCGTCGTGTACT hHES7 CTCGGATCTACCGGCTTGG TGCTACTTGTCCGGTTTCCG hHK2 GCCCGCCAGAAGACATTAGA GCTCAGACCTCGCTCCATTT hHOXA5 CAATCCTCCTTCTGCGGGTC CCTACACGCGCTACCAGAC hHOXD1 CACCTACCCCAAGTCCGTCT TTTCATCCACTCGAACGTGCT hJADE1 GGCTGCCTGAGGAAGTAGTG CCGCTTGGCTAGATTGTCCT hKDM6B GTCCATGAAGCACTGCCAGGT GTTGCAGTAGTAGGCTGGCTC hLFNG GCCTGCTTGGAGGAAGGATT AGAAAGGCACCTACGCATCG Luciferase CGAGGCTACAAACGCTCTCA TTGATCAGGCTCTTCAGCCG hMAML3 CCAGACATGACCATTGGCCT AGTCGTGGATGCTAACACGG hMEOX1 CTCTTGCCCTTTTTGGGGTG GAGGGAGGTCAACGTGAGTT hMESP1 CGTCAGTTGTCCCTTGTCACTT GCTGGCTCTGTTGGAGACCT hMESP2 GGCTTCCCTCTTTCCATCCA GGAGCCTTGGCTAAAGGAGA hMIXL1 GGTACCCCGACATCCACTTG ACCTGGAAGAGGGGAGAAAAT hMSGN1 CTGCACACCCTCCGGAATTA AGGAGGTCTGTGAGTTCCCC hNOTCH1 CCAGTCGGAGACGTTGGAAT TACTCCTCGCCTGTGGACAA hPAX1 CGAAGGCACCGCTTACTCTC AAGGCAGGTTTCTCTAGCCC hPAX9 CACAACTGTGACATTCCGGC GCCGTGACAGAATGACTACCT hPLEKHG2 GGCTCTCTAGACATTCAGGGC AAGACTCCAACCTGGCAACG hPRICKLE2 AGCTCGATCGCTTGTGACGTA TGATTGGTGTTCCACCTGCT hPRRX1 AGGCTTTGGAGCGTGTCTTT GTTACCTGCACTCTCGCCTC hRHOU TCAAAGCCGCCTCCTACATC CTGTTGCTGAGTGTCCGAGT hSMAD3 TGAAACACACGAGCAAACCC AAAGTGGGGCCAAAGGGTAA hSPRY4 CCTGCACCAAACACATGCAC ATGCAAGGAGTGTGCATCCC hTBX6 GCGCGGTGTATGGTAGAGG CTACTCGGCTGCATTTCTGG hTCF15 CCTCCTTCTCCAGGGTCCAG CAGCTGCTTGAAGGTGAGGG hTEAD4 GGCACCATTACCTCCAACGA CCCTCTGCGTCATTGTCGAT hUNCX GGAGAAGGCGTTCAACGAGA GGAACCAGACCTGAACTCGG 1071

39 1072 Extended Data Table 5.2 | qRT-PCR primers (iPSC quality control) 1073 Gene TaqMan Assay ID Probe label hOCT3/4 Hs00999634_gH FAM/MGB hNANOG Hs02387400_g1 FAM/MGB 1074 Gene Forward Primer Sequence Reverse Primer Sequence Probe Sequence Probe label GAPDH TGCACCACCAACTGCTTAGC TCTTCTGGGTGGCAGTGATG ACTCATGACCACAGTCCA VIC/MGB cmCAG GGCTCTGACTGACCGCGTTA CAGAAAAGAAACAAGCCGTCATT TGTAATTAGCGCTTGGTT FAM/MGB 1075 1076 1077 Extended Data Table 5.3 | Exon genotyping primers 1078 Gene Exon no. ID# /Name Sequence dna2235 CGACCACAGACGGAATCTC 1, 2 dna2236 CCCCAGTGTGTAGAGTTTAGAG dna2237 TGAACTCTGGCCTTCATTGAG 3 dna2238 CTACATCCAGTACCAGGGACA dna2239 CTCCGTATGCATCCATGTTCG 4 dna2240 GTACCCTGAAGAGGGTGAGTG dna2241 GTACCATCTAGTCCCGATGATT DLL3 5 dna2242 TAATCCTACTTCAAGGCCCCA dna2243 AGTGTAGACAGAAGCGAGCA 6 dna2244 ATGAGATAGGGAATGTCTCCTTG dna2245 GATGACAGAGCTGGGAAACAG 7 dna2246 ATCACAAAGGGGAACCCAAAA dna2247 GGTTCCCCTTTGTGATGGGTA 8 dna2248 ACATCAAATACAAAGCATTCAGCC dna2699 CAGAAAGCTGCAATTCTGGAG 1 dna2700 ACAGCCAGAGTGGAGCAA HES7 dna2227 GCGAGCTACAGAAACTGATCT 2, 3, 4 dna2230 TGGCAGAACAGATAAACGAGA dna2231 TCGCACCTTTGGTCAACATAA MESP2 1, 2 dna2234 GAGAAGGAAAGAGCAGCAGAA dna2960 TCTCTCTGCAAATCCTTCCCA v3-1 dna2961 GGCTGGAGATAAAGAGAAGCC dna2962 AAGACTTTCCAGAAGTCCCCT v3-2 dna2963 CATGTCAGTGGGATGGGATAC dna2964 CATGGGAGACTTGTCACTTGG v4 dna2965 CCCCTCGGAGTTATGTTCACG dna2966 CCACCCCAGTTTGCAAGG LFNG 1 dna2967 ATCCAGGACTCTGGAGCTG dna2968 CGAGTGGGGAAACCAAGGC 2, 3, 4, 5 dna2969 CTCTGAAACCCAGAGGGAAGT dna2253 CCTCTCCCTGAGGAGTGC 6, 7 dna2254 TAGAGTCATGCCGCTTAGAGA dna2970 CGCTGAGCTACGGTATGTTTG v3-3 dna2971 CCTAACAGAATTGCCCTCGG 1079

40 1080 Extended Data Table 5.4 | Oligos for sgRNA cloning 1081 Gene Target Name Sequence MESP2-S-KO1-sgRNA caccgTGGCCGGCCGGCGCCAAGGA KO1 MESP2-AS-KO1-sgRNA aaacTCCTTGGCGCCGGCCGGCCAc MESP2-S-KO2-sgRNA caccgACGCCCAGACGAGCGCGCAC KO2 MESP2-AS-KO2-sgRNA aaacGTGCGCGCTCGTCTGGGCGTc MESP2 MESP2-c307T-Xs caccgAAAGCGGCGCAACTaGTGCA c.307T MESP2-c307T-Xas2 aaacTGCACtAGTTGCGCCGCTTTc MESP2-c256del4-Xs caccgGACAGCGGCAGAGCGCGAGC c.256delGCCA MESP2-c256del4-Xas aaacGCTCGCGCTCTGCCGCTGTCc HES7-S-KO1-sgRNA caccGCAGATGCTCAAGCCGCTTG KO1 HES7-AS-KO1-sgRNA aaacCAAGCGGCTTGAGCATCTGCc HES7 HES7-S-KO2-sgRNA caccgTGCTGCTGCTGGAGCGGACC KO2 HES7-AS-KO2-sgRNA aaacGGTCCGCTCCAGCAGCAGCAc DLL3-S-KO1-sgRNA caccgCGTCCTCGAGCGGTGCGCAG KO1 DLL3-AS-KO1-sgRNA aaacCTGCGCACCGCTCGAGGACGc DLL3 DLL3-S-KO2-sgRNA caccgCGGTGAATGCCGATGCCTAG KO2 DLL3-AS-KO2-sgRNA aaacCTAGGCATCGGCATTCACCGc LFNG-S-KO2-sgRNA caccgCACACGCGGGACGTCTACGT LFNG KO2 LFNG-AS-KO2-sgRNA aaacACGTAGACGTCCCGCGTGTGc 1082 1083 1084 Extended Data Table 5.5 | MhAX InFusion primers 1085 Gene Name Sequence MESP2-u51-A GCGAATTGGGTACctcgcacctttggtcaacataa MESP2-u51-B CTGCTCGAGAATtaGGGCACGGGCAGCTTGCCGGTGGccaggcgcagcgtct MESP2 MESP2-u51-C TACGGTACCATCGGGGCACGGGCAGCTTGCCGGTGGcgccggccggccaga MESP2-u51-D TCATGGCCGGTACctcctggagtagataagctggg 1086 1087 1088 Extended Data Table 5.6 | Genotyping primers 1089 Gene Target Site Name Sequence MESP2-5'F GTTTTGACACCTCTCTGCAAC 5’ arm pDestPT35-R2 AGCTTGGCGTAATCATGGTC MESP2-5'F GTTTTGACACCTCTCTGCAAC Spanning MESP2-Exon2-R GAGAAGGAAAGAGCAGCAGAA MESP2 SV40-polyA-4 GGACAAACCACAACTAGAATGC 3’ arm MESP2-Exon2-R GAGAAGGAAAGAGCAGCAGAA MESP2-Exon2-F CATACCATGGCAACCAGCC 3’ external probe MESP2-Exon2-R GAGAAGGAAAGAGCAGCAGAA 1090 1091 1092 1093 1094 1095 1096

41 1097 Extended Data Table 6 | Antibodies used in this study 1098 1099 Extended Data Table 6.1 | Primary antibodies used for immunostaining 1100 Primary Antibodies Company Catalog Number Dilution BRACHYURY R&D Systems AF2085 1:100 COL1 Sourthern Biotech 1310-01 1:200 COL2 Sourthern Biotech 1320-01 1:200 FOXC2 DSHB 1B6 1:10 HNA Merck MAB1281 1:50 MESP2 DSHB 1D4 1:10 MYH Abcam ab91506 1:2000 MYOSIN DSHB MF20-s 1:20 PAX7 DSHB PAX7-s 1:10 SAA Abcam ab9465 1:1000 TBX6 R&D Systems AF4744 1:100 TCF15 Abcam ab204045 1:50 1101 1102 1103 Extended Data Table 6.2 | Secondary antibodies used for immunostaining 1104 Secondary Antibodies Company Catalog Number Dilution Alexa Fluor® 488 Donkey Anti-Rabbit IgG (H+L) Invitrogen A-21206 1:500 Alexa Fluor® 488 Goat Anti-Mouse IgG (H+L) Invitrogen A-10680 1:500 Alexa Fluor® 555 Donkey Anti-Goat IgG H&L Abcam ab150130 1:500 Alexa Fluor® 555 Goat Anti-Mouse IgG (H+L) Invitrogen A-21422 1:500 Alexa Fluor® 647 Donkey Anti-Mouse IgG (H+L) Invitrogen A-31571 1:500 Donkey Anti-Rabbit IgG Cy3 Merck AP182C 1:500 1105 1106 1107 Extended Data Table 6.3 | Antibodies used for flow cytometric analysis 1108 Antibodies Company Catalog Number Dilution Application DLL1-APC R&D Systems FAB1818A 1:200 Primary antibody TBX6 R&D Systems AF4744 1:25 Primary antibody BRACHYURY-PE R&D Systems IC2085P 1:50 Primary antibody OCT3/4-Alexa Fluor® 647 BD Biosciences 560329 1:25 Primary antibody NANOG-FITC BD Biosciences 560791 1:25 Primary antibody PAX6-FITC BD Biosciences 561664 1:25 Primary antibody SOX2-BV421 BioLegend 656114 1:50 Primary antibody BRACHYURY-PE R&D Systems IC2085P 1:50 Primary antibody NCAM-BV421 BioLegend 318328 1:25 Primary antibody SOX17-Alexa Fluor® 647 BD Biosciences 561589 1:25 Primary antibody FOXA2-PE BD Biosciences 562594 1:50 Primary antibody Alexa Fluor® 488 Anti-Goat IgG Abcam Ab150129 1:50 Secondary antibody for TBX6

42 APC-conjugated Mouse IgG2b,κ BD Biosciences 555745 1:200 Isotype control for DLL1-APC Unconjugated Goat IgG R&D Systems AB108C 1:25 Isotype control for TBX6 Isotype control for BRACHYURY- PE-conjugated Goat IgG R&D Systems IC108P 1:50 PE 1109

43