Washington University School of Medicine Digital Commons@Becker

Open Access Publications

2019 A Hox-TALE regulatory circuit for neural crest patterning is conserved across Hugo J. Parker

Bony De Kumar

Stephen A. Green

Karin D. Prummel

Christopher Hess

See next page for additional authors

Follow this and additional works at: https://digitalcommons.wustl.edu/open_access_pubs Authors Hugo J. Parker, Bony De Kumar, Stephen A. Green, Karin D. Prummel, Christopher Hess, Charles K. Kaufman, Christian Mosimann, Leanne M. Wiedemann, Marianne E. Bronner, and Robb Krumlauf ARTICLE

https://doi.org/10.1038/s41467-019-09197-8 OPEN A Hox-TALE regulatory circuit for neural crest patterning is conserved across vertebrates

Hugo J. Parker 1, Bony De Kumar1, Stephen A. Green2, Karin D. Prummel3, Christopher Hess3, Charles K. Kaufman4, Christian Mosimann 3, Leanne M. Wiedemann 1,5, Marianne E. Bronner2 & Robb Krumlauf1,6

In jawed vertebrates (gnathostomes), Hox genes play an important role in patterning head

1234567890():,; and jaw formation, but mechanisms coupling Hox genes to neural crest (NC) are unknown. Here we use cross-species regulatory comparisons between gnathostomes and lamprey, a jawless extant , to investigate conserved ancestral mechanisms regulating Hox2 genes in NC. Gnathostome Hoxa2 and Hoxb2 NC enhancers mediate equivalent NC expression in lamprey and gnathostomes, revealing ancient conservation of Hox upstream regulatory components in NC. In characterizing a lamprey hoxα2 NC/hindbrain enhancer, we identify essential Meis, Pbx, and Hox binding sites that are functionally conserved within Hoxa2/Hoxb2 NC enhancers. This suggests that the lamprey hoxα2 enhancer retains ancestral activity and that Hoxa2/Hoxb2 NC enhancers are ancient paralogues, which diverged in hindbrain and NC activities. This identifies an ancestral mechanism for Hox2 NC regulation involving a Hox-TALE regulatory circuit, potentiated by inputs from Meis and Pbx proteins and Hox auto-/cross-regulatory interactions.

1 Stowers Institute for Medical Research, Kansas City, MO 64110, USA. 2 Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA. 3 Institute of Molecular Life Sciences, University of Zurich, 8057 Zürich, Switzerland. 4 Division of Medical Oncology, Washington University in Saint Louis, St. Louis, MO 63110, USA. 5 Department of Pathology and Laboratory Medicine, Kansas University Medical Center, Kansas City, KS 66160, USA. 6 Department of Anatomy and , Kansas University Medical Center, Kansas City, KS 66160, USA. Correspondence and requests for materials should be addressed to M.E.B. (email: [email protected]) or to R.K. (email: [email protected])

NATURE COMMUNICATIONS | (2019) 10:1189 | https://doi.org/10.1038/s41467-019-09197-8 | www.nature.com/naturecommunications 1 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-09197-8

eural crest (NC) cells are a migratory and multipotent cell The sea lamprey offers an opportunity to address conserved type, representing a defining characteristic of verte- ancestral mechanisms of Hox2 NC regulation in vertebrates. The N 1–3 brates . The cranial NC emerges from the mid/hind- extant jawless vertebrates (cyclostomes), lamprey and hagfish, brain region and contributes to cartilage and bone of the pharynx. represent a sister group to gnathostomes, so comparisons between Gnathostomes (jawed vertebrates) exhibit nested these lineages can provide insights into the evolution of domains (Hox codes) along the anteroposterior (AP) regulatory programs in vertebrates18–20. Such comparisons can extent of both the neural tube and adjacent NC. However, little is reveal aspects of vertebrate biology that were present in the known regarding shared versus independent regulation of Hox common ancestor of extant vertebrates and which have been expression in these two tissues. In NC of the pharyngeal arches conserved in each lineage. These studies can also identify features (PAs), the NC Hox code confers positional identity to each arch4. that differ between each lineage, which could represent diver- There is also evidence for an earlier role of Hox genes in regional gence from the ancestral vertebrate state in either/both lineages. generation of NC from the hindbrain4,5. Emergence of NC and While comparisons between cyclostomes and gnathostomes its underlying Hox code played an important role in craniofacial may reveal ancestral features of vertebrate hox2 regulation in evolution, giving rise to unique structures of the head and neck, the NC, little is known about the enhancers and regulatory including the jaws and hyoid apparatus used in feeding and factors underlying Hox NC expression in cyclostomes. Here, we respiration. Genome analyses together with gene expression and employ cross-species regulatory comparisons between lamprey functional studies have described a common framework for a and gnathostomes to isolate and characterize NC enhancers and gene regulatory network (GRN) that drives NC induction, spe- investigate ancestral regulation of Hox2 in vertebrates. cification, and migration across vertebrates, and some regulatory components are present in non-vertebrate chordates1,6,7. How- ever, Hox genes have not yet been integrated within the current Results formulation of the NC GRN. This is in part because the Expression of hox2 genes in lamprey cranial NC. In two species mechanisms regulating Hox expression in NC are relatively of lamprey, sea lamprey (Petromyzon marinus) and Arctic lam- unclear compared to the current knowledge of Hox regulation prey (Lethenteron camtschaticum), only two Hox2 paralogues in hindbrain segmentation4. Hence, despite the established (hoxα2 and hoxδ2) have been identified, but their orthology to functional roles of Hox genes in cranial NC4,8, whether the Hox gnathostome Hoxa2/Hoxb2 remains unresolved (Fig. 2a)21–23. code is coupled to this conserved NC GRN and if so how, is The sea lamprey has six Hox clusters, compared to the four unknown. It is unclear whether Hox networks that control axial clusters inferred in ancestral gnathostomes22. A recent recon- patterning and generation of NC are integrated within or are struction based on comparisons of gene order at the chromosomal working in parallel, independently from the NC GRN. level between vertebrate species supports a model in which the Hox2 genes are critical components of the cranial NC Hox ancestor of cyclostomes and gnathostomes also had four Hox code and provide an important avenue for addressing this clusters22, suggesting that the two additional Hox clusters in question. Experimental alterations in expression of gnathostome lamprey arose from duplication event/s in the lamprey/cyclostome Hox2 paralogues (Hoxa2 and Hoxb2) lead to homeotic transfor- lineage. To investigate the duplication history of lamprey Hox mations. In mice, ectopic Hoxa2 suppresses jaw formation9, clusters, previous work employed a chromosome-wide analysis while Hoxa2 mutants exhibit mirror image jaw duplication, of genomic synteny (duplicate gene retention) between lamprey indicating a role as a selector gene in NC AP identity4,8. Reg- Hox-bearing chromosomes22. These pairwise comparisons indi- ulatory analyses in mouse, chick, and zebrafish have identified cated that the chromosomes bearing the hoxα and hoxδ clusters evolutionarily conserved enhancers in the Hoxa2 and the Hoxb2 display a significantly closer relationship to each other than to genes that mediate their expression in NC and hindbrain seg- any other Hox-bearing chromosomes. Similarly, the chromosomes ments (rhombomeres (r)) (Fig. 1a, b)4,10–13. Analyses of mouse carrying the hoxβ and hoxε clusters also show a significant Hoxa2 and Hoxb2 NC enhancers provide conflicting mechanisms enrichment for shared paralogues with each other. This suggests for NC expression of Hox genes. A single 5′-flanking enhancer that these pairs of chromosomes derive from duplication event/s drives Hoxb2 in r4 and its NC11, supporting a model for shared that occurred more recently than the duplication events that regulation in these tissues (Fig. 1b). In contrast, independent gave rise to the other Hox-bearing chromosomes22. Phylogenetic exonic/intronic and 5′-flanking enhancers regulate Hoxa2 analyses reveal hoxβ and hoxε paralogues consistently clustering expression in r412 versus r4-derived NC14 (PA2), suggesting in protein trees21,22, while hoxα and hoxδ paralogues do not show the evolution of distinct regulatory mechanisms between the any clear patterns of clustering. This may be due to the limitations hindbrain and NC (Fig. 1a). Characterisation of cis-elements of phylogenetic analyses in resolving the relative timing of ancient required for the activity of these enhancers has provided some duplication events22. Thus, the evidence from synteny analysis insight into their underlying regulatory mechanisms (Fig. 1a, b). leads us to infer that hoxα2 and hoxδ2 genes are paralogues that Both the Hoxb2 r4/NC and the Hoxa2 exonic/intronic r4 arose from duplication in the lamprey/cyclostome lineage. enhancers depend upon the combined inputs from Meis and In exploring hox gene NC regulation in sea lamprey embryos, our Pbx-Hox binding sites for their activities11,15–17. However, the time-course analysis of hox1–3 expression revealed nested domains inputs into the independent Hoxa2 NC enhancer are largely in the pharynx from stage (st) 23, reminiscent of those in Arctic unknown, except for a binding site for AP2-α lamprey24,25 and in gnathostomes4,8 (Fig. 2b–e). hoxα2 is expressed (Tfap2α) in the mouse enhancer that is not conserved across in the NC posterior to PA1 and in the hindbrain posterior to gnathostome species10,12. Taken together, these analyses reveal r1, similar to gnathostome Hoxa2.Incontrast,hoxδ2 is expressed differences between Hoxa2 and Hoxb2 in the enhancers and in r3/r5, notochord, and in posterior pharyngeal endoderm and regulatory mechanisms underlying their expression in r4 and NC. mesenchyme, but not at high levels in NC (Fig. 2b, c). If hoxα/δ Hence, in mouse, each gene supports a different model with clusters arose by duplication in the lamprey/cyclostome lineage, respect to shared versus independent regulation in the hindbrain as supported by synteny analysis, this suggests that hoxδ2 and NC. Given that Hoxa2 and Hoxb2 are paralogous genes, expression diverged after this duplication. Given the similarity this raises two questions: what is the evolutionary relationship in NC expression of hoxα2 to gnathostome Hoxa2, we focused between their NC enhancers and which of these mechanisms our comparative regulatory analysis on mechanisms mediating is ancestral? lamprey hoxα2 expression in NC.

2 NATURE COMMUNICATIONS | (2019) 10:1189 | https://doi.org/10.1038/s41467-019-09197-8 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-09197-8 ARTICLE

a Mouse Hoxa2

Krox20 ? Tfap2α HoxPG1-2, Pbx, Meis ?

Sox RE1Krox20 Krox20 RE4 RE3 RE2 RE5

NC5 NC2 NC3 NC4 NC1 r3/r5 (479 bp) r4 (1250 bp) r2 (264 bp) NC (661 bp)

r2 r2 r3 r3 r3 r3 r3 r4 2 r4 2 r4 2 r4 2 r4 2 r5 NC r5 r5 r5 r5 NC 3 3 3 3 3

Independent enhancers Hindbrain element Endogenous gene for NC and r4 Neural crest element expression

b Mouse Hoxb2

Krox20 Hoxb1, Pbx, Meis

Sox Krox20 Krox20

Meis Pbx-Hox r3/r5 (691 bp) r4 + NC (181 bp)

r3 r3 r3 r4 2 r4 2 r4 2 r5 NC r5 r5 NC 3 3 3

Shared enhancer for NC and r4

c Lamprey hox2 ?

Neural + NC (4050 bp) r2 + r4 (2625 bp)

r2 r2 Independent or shared r3 r3 enhancers for NC and r4? r4 2 r4 2 r5 r5 NC 3 3

Fig. 1 Characterized Hox2 enhancers regulating neural crest (NC) and rhombomeric expression from mouse and lamprey. Schematic diagrams depicting the known enhancers regulating mouse Hoxa2 (a) and Hoxb2 (b), and lamprey hoxα2 (c), in rhombomeres (r) and NC. For each locus, the gene exons are represented by grey boxes and the transcriptional start site by an arrow. Enhancers are marked as black lines below the loci, with their activity domains illustrated by blue shading in schematic dorsal views of the hindbrain (r2–5) and pharyngeal arches (2–3). For the mouse loci, characterised cis-elements contributing to enhancer function are depicted as coloured boxes: hindbrain elements in purple and NC elements in green (not drawn to scale). Known direct inputs from transcription factors into these cis-elements are depicted by arrows, with unknown inputs shown as question marks. Hoxa2 is regulated in r4 and r4-derived NC (PA2) by independent enhancers (a). Hoxb2 expression in r4 and NC is mediated by a single enhancer, through cis-elements bound by Meis and Pbx-Hox factors (b). Since these elements have dual hindbrain/NC activity they are depicted in both purple and green. Genomic regions from the lamprey hoxα2 locus have enhancer activity, with an r2/r4 enhancer positioned within the exons and intron (c). The hoxα2-hoxα3 intergenic region drives reporter expression in the hindbrain and NC. However, it is not known whether this is through independent or shared NC/hindbrain enhancers, specific cis-elements have not been identified, and the relationship of this region to the gnathostome Hoxa2 and Hoxb2 enhancers is unclear

Gnathostome Hox2 NC enhancers function in lamprey.To Initial green fluorescent protein (GFP) expression in pre- explore conservation of the regulatory network upstream of Hox2 migratory NC was observed at st21 and maintained as NC in the NC, we sought to perform cross-species enhancer analysis delaminated and migrated ventro-laterally to populate the phar- by testing gnathostome NC enhancers for activity in lamprey ynx (Supplementary Figure 1b–c). Frontal sections at st24 embryos. As part of this approach, we wanted a way to globally revealed that GFP transcripts are present in the NC-derived monitor the NC in vivo. In zebrafish, the crestin promoter/ pharyngeal mesenchyme (Fig. 3d). These data demonstrate that enhancer element is a highly specific tool for monitoring NC NC cells can be labelled and visualized in vivo by a reporter assay development, as it is active from pre- to post-migratory stages in the developing lamprey and suggest that the crestin element across multiple axial levels, making it a good candidate for is interpreted in lamprey by conserved upstream components marking NC in lamprey26. In lamprey transgenic reporter assays, of an ancestral NC GRN. This serves as a proof of concept for the crestin element mediates spatiotemporal expression in NC interspecies analysis of NC enhancers. similar to its activity in zebrafish, and its activity is sensitive Next, we investigated whether upstream regulatory inputs to perturbation of the same transcription factor binding sites required for Hoxa2 expression in gnathostome NC are present (Supplementary Figure. 1a–d; Fig. 3b; Supplementary Table 1). in lamprey using homologous Hoxa2 NC enhancers from three

NATURE COMMUNICATIONS | (2019) 10:1189 | https://doi.org/10.1038/s41467-019-09197-8 | www.nature.com/naturecommunications 3 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-09197-8

a Hox paralogue group 14 13 12 11 10 9 8 7 6 5 4 3 2 1 hox- hox-  Sea lamprey hox- hox- hox- hox-

HoxA Hox gene HoxB Mouse Strong expression in cNC HoxC Weak expression in cNC HoxD No expression in cNC

b st21 st22 st23 st24.5 st26 r4 r4 r4 r4

hox1

123 4 5 6 12345678 r3 r5 r3 r5 r5 r3

hox2

123456 1 234 5 6 78 r5 r3 r5 r3 hox2

123456 12345678 r5 r5

hox3

123456 1234 5678

c hox1 hox2 hox2 hox3

m m 1 1 mo 1 1 nc 1 1 1 2 2 2 2 2 2 3 3 3 3 en nc 3 4 4 4 ec m ec en en 4 ec 5 5 5 5 6 6 nc 6 6 7 7 7 nc en en 8 st22 st23st25 st23 st24.5st26 st24.5

d e hox1 Hindbrain hox2 expression hox2 pp1 Neural crest (nc) (st23)  2 hox 3 Pharynx Mesoderm (m) r3 r4 r5 pp2 r2 Ectoderm (ec) 3 3 1 2 pp3 Endoderm (en)  Neural crest hox 2 hox2 4 expression (st24) st24.5 hox3 gnathostomes (zebrafish (zf), fugu (f), mouse (m)) (Fig. 3a). In prominently expressed in NC-derived pharyngeal mesenchyme both lamprey and zebrafish embryos, all three gnathostome equivalent to crestin reporter expression domains and endogen- enhancers mediated GFP reporter expression in the developing ous hoxα2 (Fig. 3d). pharynx posterior to PA1 (Fig. 3b; Supplementary Table 2; In gnathostomes, NC Hoxa2 expression is regulated by Supplementary Figure 2). Frontal sections of transient transgenic 5′-flanking elements (NC1–5) that partially overlap those of a lamprey embryos confirmed that GFP transcripts were separate r3/r5 enhancer (RE1-5, Krox20, Sox)(Figs.1a, 3a)10,27.

4 NATURE COMMUNICATIONS | (2019) 10:1189 | https://doi.org/10.1038/s41467-019-09197-8 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-09197-8 ARTICLE

Fig. 2 Embryonic time course showing expression of hox genes in the lamprey hindbrain and cranial neural crest (NC). a Genomic organization of Hox genes in lamprey and mouse. Boxes represent Hox genes, which are organized into paralogue groups based on their sequence. The arrow above the clusters denotes the direction of Hox gene transcription. Lamprey hox genes from paralogue groups 1–3 were examined for NC expression in this study and their expression in cranial NC is denoted by green/white shading. b Lateral views of lamprey embryos from stages (st)21 to 26, showing hox gene expression domains in the developing head. Pharyngeal arches are numbered and rhombomere-specific domains (r) indicated. The arrowhead marks weak hoxδ2 expression in mandibular mesoderm at st26. c Frontal sections through lamprey embryos showing hox gene expression domains within the developing pharynx. Pharyngeal arches are numbered. d Schematic of a frontal section through the lamprey st24.5 embryonic pharynx with tissue domains annotated; NC domains are shaded in blue. Scale bars: 200 μm(b); 100 µm(c). e Schematic depicting hox expression in the lamprey hindbrain and NC at st23 and st24. ec, ectoderm; en, endoderm; m, mesoderm; mo, mouth; nc, neural crest; r, rhombomere; st, stage

Of these, the NC3 element is the most highly conserved: global functionally required for enhancer activity in gnathostomes and sequence alignment using Multi-LAGAN28 identified sequence lamprey, notably NC3 (Fig. 4a, f, h–j, Supplementary conservation of NC3 extending to sharks (Fig. 3a). In previous Figure 3a–b), suggesting that Meis, Pbx, and Hox factors may work, two 15 bp deletions within NC3 were each found to abolish provide conserved and essential inputs into these enhancers. NC reporter expression in mouse10. To determine whether the Thus, lamprey hoxα2 and gnathostome Hoxa2 appear to be same cis-elements are required for NC activity of Hoxa2(m) in regulated in the hindbrain and NC through conserved transcrip- lamprey, we generated two variants with these deletions in NC3. tion factor binding sites retained during vertebrate evolution. This While hindbrain activity persisted, these reporters exhibited severely provides further support for an ancestral GRN upstream of Hox2 diminished NC activity in zebrafish and lamprey (Fig. 3a, c; in NC patterning that has been retained in lamprey and Supplementary Figure 3a, b; Supplementary Table 3). Analo- gnathostomes. gously, activity of a gnathostome Hoxb2 NC enhancer (hoxb2a (zf)) also depended upon the same regulatory sites in lamprey as Hoxa2 Hoxb2 in zebrafish (Supplementary Figure 3c, d). Since Hoxa2(m) and and NC enhancers are divergent paralogues. The identification of conserved Meis, Pbx, and Hox binding motifs in hoxb2a(zf) NC enhancers require the same cis-motifs across α distant species, this suggests that an ancestral GRN upstream of the lamprey hox 2 and gnathostome Hoxa2 enhancers is sig- nificant as equivalent sites are present and functionally required Hox2 in NC patterning has been retained in lamprey and fi gnathostome lineages. in the mouse Hoxb2 and zebra sh hoxb2a enhancers (Fig. 1b; Supplementary Figure 3c). To explore whether these are homo- logous sites and to interrogate common and diverged features of Conservation of binding sites in a lamprey hoxα2 NC enhan- α α the hox 2, Hoxa2, and Hoxb2 enhancers, we used the Krox20 cer. Gnathostome Hoxa2 and lamprey hox 2 share similarity in sites, implicated in r3/r5 expression, as an anchor to align the their cis-regulatory architecture for rhombomeric hindbrain enhancer sequences. This revealed striking conservation of the expression, as r2 and r4 enhancers are embedded in conserved 13,18 sequence and order of Krox20, Sox, Meis, and Pbx-Hox sites locations (Fig. 1a, c) . However, our global sequence align- between these enhancers, with relatively low sequence conserva- ment failed to reveal conservation of NC regulatory elements – ′ tion within the intervening regions (Fig. 5a; Supplementary Fig- (NC1 5)5 of the lamprey hox2 genes (Fig. 3a). Hence, we ure 6). Based on these conserved sites and relative positions, we manually searched for short sequence motifs within an enhancer α − ′ fl α infer that Hoxa2 and Hoxb2 NC enhancers are ancient para- (hox 2 4 kb) in the 5 - anking region of hox 2 that recapitu- logues, derived from an ancestral vertebrate Hox2 enhancer that lates endogenous hoxα2 expression in the hindbrain (r3–r5), NC – contained these sites. These paralogous enhancers appear to have posterior to PA1 and somites (Fig. 4a e; Supplementary Figure 4; diverged in gnathostomes, such that the mouse Hoxa2 enhancer Supplementary Table 2). In mouse Hoxa2, the most highly con- is inactive in r4 but expressed in r4-derived NC, while the Hoxb2 served elements required for NC expression are NC3 and part of enhancer is active in both r4 and its NC (Fig. 5a)11. The lamprey NC2, while those necessary for r3/r5 activity are Krox20, Sox, RE2, α fi hox 2 NC enhancer exhibits the combined activity of both mouse and RE3 sites. Surveying the lamprey enhancer, we identi ed Hoxa2 and Hoxb2 enhancers, suggesting that it may reflect the short matching sequences for Krox20, Sox, and NC3 (Fig. 4f; ancestral state. We searched upstream of lamprey hoxδ2, finding Supplementary Figure 5). To address whether a smaller region conservation of Krox20 and Sox sites, but no Meis or Pbx-Hox containing these sites retains enhancer activity, we cloned a 1530 sites (Fig. 5b). This suggests that the ancestral sites for r4/NC bp region encompassing the conserved sites with ~500 bp on each enhancer activity were lost upstream of hoxδ2, consistent with side (hoxα2 elementA) and demonstrated that it mediates reporter δ α − the expression of hox 2 in r3/r5 but not in r4 and r4-derived expression equivalent to hox 2 4kb in lamprey (Fig. 4g, j). To NC (Fig. 2b, c). test if the conserved sites are required for tissue-specific enhancer activity, we assayed variants of hoxα2 −4kb in which these sites were mutated (Fig. 4f, h–j). Deleting Krox20-Sox sites (ΔKrox20) Lamprey meisC and hoxα2 are similarly expressed in NC. resulted in the loss of r3/r5 expression, but maintenance in r4 and Identification of Meis and Pbx-Hox consensus binding motifs in NC (Fig. 4f, h). In contrast, deleting the conserved sites within the hox2 enhancers of lamprey and gnathostomes implies that NC3 (ΔNC3) caused loss of all rhombomeric and NC activities, these factors may play conserved roles in regulating vertebrate but somitic expression is retained (Fig. 4f, i–j). Hox2 genes in the hindbrain and NC. Meis, Prep, and Pbx are Inspection of the conserved sites between the lamprey hoxα2 members of the TALE (Three-Amino-Acid-Loop-Extension) and gnathostome Hoxa2 enhancers revealed that they match homeodomain family29 that have diverse roles in patterning closely to consensus transcription factor binding site motifs tissues, including hindbrain and NC30–33. Additionally, they can for factors involved in early hindbrain and NC patterning. In act as cofactors for other transcription factors, including Hox addition to the previously characterised Krox20 sites, we proteins34. To initially explore whether Meis factors are linked identified three short blocks of conserved sequence that with regulation of NC in lamprey, we characterized expression correspond to consensus binding motifs for Sox, Meis, and of four lamprey meis genes. We found that meisC shows early Pbx-Hox factors (Fig. 4f). These motifs each fall within regions NC expression with a spatiotemporal pattern similar to hoxα2

NATURE COMMUNICATIONS | (2019) 10:1189 | https://doi.org/10.1038/s41467-019-09197-8 | www.nature.com/naturecommunications 5 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-09197-8

a Sox RE1Krox20 Krox20 RE4 RE3 RE2 RE5 Hindbrain element

Neural crest element NC5 NC2 NC3 NC4 NC1

Hoxa3 Hoxa2

Mouse Chicken Zebrafish Fugu (a) Fugu (b) Shark Lamprey α Lamprey δ

1 kb

Jawed vertebrate enhancers: c-Fos eGFP promoter hoxa2b(zf) hoxa2a(f) Hoxa2(m) Hoxa2(m)NC3-1 Hoxa2(m)NC3-2

b Crestin-1 kb (zf) hoxa2b (zf) Hoxa2a (f) Hoxa2 (m)c Hoxa2(m)NC3-1 Hoxa2(m)NC3-2 r5 r3 r3 r4 r5 r5 r3 r5 r3 Zebrafish 3 45 3 34 34 2 2 4 5 2 5 2 5

3 4 Lamprey 3 4 4 4 2 2 2 3 23 st24

d Crestin-1 kb (zf) hoxa2b (zf) hoxa2a (f) Hoxa2 (m)

1 1 1 1 2 2 2 2 3 3 3 3 4

st24

Fig. 3 Conserved activity of gnathostome Hoxa2 neural crest (NC) enhancers in zebrafish and lamprey. a Sequence alignment of gnathostome Hoxa2-Hoxa3 and lamprey hox2-hox3 gene loci against the human locus. Conserved non-coding sequences (pink), untranslated regions (UTRs) (cyan) and coding sequences (blue) are highlighted. The relative locations of the mouse hindbrain and NC cis-elements (top) are shown. Gnathostome Hoxa2 enhancers used for cross-species reporter analysis are detailed below the alignment. Letters within parenthesis indicate species of origin of the enhancer: zf, zebrafish; f, fugu; m, mouse. b, c Green fluorescent protein (GFP) reporter expression in zebrafish and lamprey embryos (lateral views), mediated by wild-type (b) and mutated (c) gnathostome NC enhancers. For zebrafish, the otic vesicle is circled and GFP expression in rhombomeres (r) and pharyngeal arches (2–5) indicated. Lamprey pharyngeal arches are labelled (2–4). GFP-expressing embryos shown are representative of the expression potential of the reporter construct in each case, as inferred from screening many (typically more than 100) injected embryos. Supplementary Table 2 provides the number of embryos and details of specific expression for all constructs in lamprey. Injection statistics for the transient transgenic zebrafish embryos shown in c are given in Supplementary Table 3. d Frontal sections through the transient transgenic lamprey embryos shown in Fig. 3b, with GFP transcripts detected by in situ hybridisation, revealing expression in NC-derived mesenchyme (arrowheads) in the pharyngeal arches (numbered). Scale bars: 100 µm

6 NATURE COMMUNICATIONS | (2019) 10:1189 | https://doi.org/10.1038/s41467-019-09197-8 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-09197-8 ARTICLE

a Hox genomic loci: Hoxa3 Hoxa2 r5 Mouse b r2r3 r4 c hox3 hox2 2 Lamprey

1 kb 2 3 α

Lamprey enhancers: 3 45 hox eGFP 1 2 hox2 -4 kb 4 c-Fos hox2 -4 kb cFosV1 hox2  hox 2 -4 kb cFosV2 r3–5 e hox2 elementA d hox2 -4 kb Krox20 2 hox2 -4 kb NC3 3 eGFP

Sox 1 2 345 4 RE1Krox20 Krox20 RE4 RE3 RE2 RE5 hox2 -4 kb cFosV1

NC5 NC2 NC3 NC4 NC1

f Krox20 Sox RE3 RE2 NC3 Human --TCACCCACGC--AGCCC---CTGACAAAGCCTCAG CTTAAGGGCTAGAAGCTGTCAAGGCTTTT--GGTGAGCAAGATTGATCGCGCCCAGACT Mouse --TCACCCACGC--AGCC-----TGACAAAGCCCAAT CTTAAGGGCCAGAAGCTGTCAGGGCTTTT--GGTGAGCAAGATTGATCACACCCAGACT ~250-400bp Chicken G-ACACCCACGC--AGGTTTTACTCACAAAGCCCGGT CTTAAGGGCCAGAAGCTGTCAAACCTTTT--GGCGGCCAAGATTGATCGCGCGCAGACA interval of Zebrafish G-ACACCCACGTTCTTCTTT---ACACAATAGCTCCT CTTA-GGGCACTAAGCTGTCAGACCTTTT--GGCGTGTAAGATTGATAGCGTGCAGGCT A-CCACCCACTCATTTCCTTG-GACACAAAGCC---T non-aligning GCGA-GGGCACCGAGCTGTCAGACCTTTT--GGCGAGTAAGATTGATCGCGCACAGGCT Fugu (a) sequence Fugu (b) A-ACACCCACTC---ACCTCAGCGGACAAATGATCCT CTTA-GGGCAGGGAGCTGTCAGACCTTTT--GGCGTGAAAGATTGATCACACTCAGGGA Lamprey GCACACCCACGCGCAAACTTCCCGCACAAAGGTTACC GTGCAGGGCAAG-AGCTGTCAAGGCTCGCCCGGCCGGGAAGATTGATAGCGAGG-GGCT

Krox20 Sox Meis Pbx-Hox Krox20 NC3

g r3–5 h r4 ij s s s r3 r3 r4 r4 r4 r5 r5 1 2 345 1 2 3 45 1 2 3 45 hoxα2-4 kb ElementA Krox20 NC3 cFosV1 ElementA Krox20 NC3

Fig. 4 Characterization of a lamprey hoxα2 neural crest (NC)/hindbrain enhancer. a The mouse Hoxa2-Hoxa3 genomic region and its equivalent from the lamprey hoxα cluster are shown, with Hox gene exons annotated (blue arrows). hoxα2 upstream regions assayed for reporter activity in this study, with or without the c-Fos minimal promoter, are shown. b–e Lateral views (b, d) and frontal sections (c, e) of st24.5 lamprey embryos, comparing endogenous expression of hoxα2 (b, c) to GFP reporter expression mediated by hoxα2 −4kb (d, e). Pharyngeal arches are numbered and rhombomeric expression detailed. Arrowheads point to PA2 NC expression. f Multiple sequence alignment of the Hoxa2 NC enhancer from gnathostomes with the lamprey hoxα2 enhancer, showing conserved sites (yellow). The positions of characterized mouse cis-elements (Krox20, Sox, RE2-3, NC3) are marked above the alignment. The enhancer schematic (a) shows the position of these elements within the assayed hoxα2 upstream regions, with conserved (shaded boxes) or divergent (empty boxes) cis-elements highlighted. Consensus binding motifs from the JASPAR database76 for Krox2077, Sox1178, Meis179, and Pbx-Hox80 factors are shown below the alignment, as well as sequences deleted in hoxα2 −4kbΔKrox20 and ΔNC3 variants. The non-aligning interval between these conserved regions is ~250–400 bp and varies in length between species. Supplementary Figure 5 contains the full alignment. g–j Lateral (g-i) and dorsal (j) views of st24.5 lamprey embryos showing GFP reporter expression driven by the enhancers detailed in a. Pharyngeal arches are numbered, with expression in rhombomeres (r) and somites (s) annotated. GFP-expressing embryos shown are representative of the expression potential of the reporter construct in each case, as inferred from screening many (typically more than 100) injected embryos. Supplementary Table 2 provides the number of embryos and details of specific expression for all constructs in lamprey

(Fig. 6a–d). This expression data and the conserved consensus enhancer suggests that such factors may also regulate its activity in Meis binding sites are consistent with the notion that meis genes the hindbrain and NC. This is significant because previous char- may have played an ancestral role in regulating Hox2 in NC acterisation of this NC enhancer did not uncover factors that during vertebrate evolution. bind to NC3 nor provide insight into its underlying regulatory mechanism10,35. The divergent activities of the Hoxa2 and Hoxb2 Occupancy of TALE and Hox proteins on Hox2 enhancers. enhancers, particularly in r4, suggest that there may be differences in Hoxb1, Pbx, and Meis bind to sites within the mouse Hoxb2 their interactions with upstream regulatory factors despite the pre- hindbrain/NC enhancer that are required for its activity11,17.The sence of homologous binding sites. This led us to investigate binding presence of homologous sites within NC3 of the mouse Hoxa2 properties of Meis, Pbx, and Hox proteins on these enhancers.

NATURE COMMUNICATIONS | (2019) 10:1189 | https://doi.org/10.1038/s41467-019-09197-8 | www.nature.com/naturecommunications 7 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-09197-8

a Hox2 upstream enhancers from mouse and lamprey NC enhancer activity Sox RE1Krox20 Krox20 RE4 RE3 RE2 RE5 Hoxa2(m) r3 r4 2 NC5 NC2 NC3 NC4 NC1 Hoxa2(m) r5 NC 315 bp 3

Hoxa2(m) TCACCCACGCAGC------CTGACAAAGCC -AGCTGTCAGGG--CTTTTGGTGAGCAAGATTGATCA hoxα2(l) -CACCCACGCGCAAACT--TCCCGCACAAAGGT GAGCTGTCAAGGCTCGCCCGGCCGGGAAGATTGATAG Hoxb2(m) CCACCCACGCGGACGTTACACTGACAGAAAGTT GAGCTGTCAGGG------GGCTAAGATTGATCG r3 hox2(l) r4 2 r5 3 Krox20 Sox Meis Pbx-Hox

354 bp

Sox r3 Krox20 Krox20 Hoxb2(m) r4 2 Hoxb2(m) r5 Meis 3 Pbx-Hox

b Lamprey Hox genomic loci hox3 hox2 hox cluster hoxδ3 hox2 hox cluster 2 kb

Sox RE1Krox20 Krox20 RE4 RE3 RE2 RE5

NC5 NC2 NC3 NC4 NC1

Krox20Sox Krox20

Human A2 CTGAGGCAG-CAGAAAGCTC--TCACCCACGCA--GCCC---CTGACAAAGCCTCAGAGATATGGGGGCAG----CGCTGCTC--CCT--CTCTGCACC Mouse A2 CTGAGGAAGGCAAAAAGCTTTTTCACCCACGCA--GCC-----TGACAAAGCC-CAATGC--TGTGGGCAG----CCCTGCTT--TCTAACTTTCCTCT Chicken A2 ---ACAAATGCGCAGAATAGAGACACCCACGCA--GGTTTTACTCACAAAGCCCGGTTAAAGTCCAAACCTCTCTCTTTATTATTTGCCATCCTGCTGT Zebrafish A2b --GGCATTTTTAAAGAATGCCGACACCCACGT---TCTTCTTTACACAATAGCTCCTGTGTAGCTGACAAAAACACGTTTTTCCTCTTACAATCAATAT Fugu A2a --CAGAGCTGCACCAAATGCCACCACCCACTCATTTCCT-TGGACACAAAGCCTGTGCGTAATTCTAAAAAAAAAAAAAAAACGCCATCAATCTCTTCC Fugu A2b --GCGTTTCTCCTGAAAGCC-AACACCCACTC---ACCTCAGCGGACAAATGATCCTGAA---CAAAAGCTGCTCCGTTCGTTTCTATAGCACTAAGCA Lamprey 2 --GAGATAACAGAGTCCCGC--ACACCCACGCGCAAACTTCCCGCACAAAGGTTACCGGGATTCCGGCCACAAT-CGTTCCCC--CCCCCGTCCCCCCC Lamprey 2 ---AAACACGAACCTCAGCATTCCACCCACGAT--GGCT----GAACAATGCGCACGGGGATTCCCCGCAGCCC-CCTCGGAAGGCCGCAGACGGCGCA

Fig. 5 Hoxa2 and Hoxb2 neural crest (NC) enhancers are ancient paralogues and the lamprey hoxα2 enhancer appears to reflect the ancestral state. a Sequence alignment of mouse (m) Hoxa2 and Hoxb2 NC enhancers with that of lamprey (l) hoxα2, revealing short conserved sequence blocks (yellow). Corresponding consensus binding motifs for Krox20, Sox, Meis, and Pbx-Hox factors are shown below the alignment. These conserved sequences map to characterized cis-elements required for hindbrain (purple) or NC (green) activity in the mouse Hoxa2 (above) and Hoxb2 (below) enhancers. The 315 and 354 bp refer to the precise distances between the 5′ end of the Krox20 site and the 3′ end of the Pbx-Hox site of the mouse Hoxa2 and Hoxb2 enhancers, respectively. The activity of each NC enhancer in the hindbrain and NC is shown in schematic dorsal views. This activity differs between each enhancer, with hoxα2(l) showing the combined output of Hoxa2(m) and Hoxb2(m). b Multiple sequence alignment of gnathostome Hoxa2 NC enhancers with a homologous region upstream of lamprey hoxδ2. The lamprey hoxα2-hoxα3 and hoxδ2-hoxδ3 genomic loci are depicted, with hox gene exons annotated (blue arrows). The multiple sequence alignment reveals conservation of a Krox20 and a Sox site upstream of hoxδ2 (yellow shading in alignment), but other cis-elements, including NC3, are not conserved in sequence. This is depicted in the enhancer schematic, which details the conserved (shaded boxes) and divergent (empty boxes) cis-elements upstream of hoxδ2

For insight into NC, we harnessed published genome-wide that Hoxa2 also binds to these enhancers in PA2, suggesting binding data for Meis, Pbx, and Hoxa2 in PA2 of mouse that there may be auto- and cross-regulatory inputs from embryos36,37. Focusing on the NC enhancers of Hoxa2 (NC3) Hox2 proteins into the NC Hox code. This is analogous to the and Hoxb2 (HRE), we observed similar binding profiles for important roles for auto- and cross-regulatory circuits in these factors over each of the enhancers, consistent with their regulating Hox expression in other tissues, including hindbrain regulatory activity in NC. There is enrichment for occupancy rhombomeres11,13,15. Since Pbx and Meis can act as Hox co- of Meis and Pbx, in keeping with a role for these TALE factors factors38, they may interact with Hoxa2 on these NC in controlling Hox2 NC activity (Fig. 7a, b). It is interesting enhancers. Together with the presence of essential Meis motifs

8 NATURE COMMUNICATIONS | (2019) 10:1189 | https://doi.org/10.1038/s41467-019-09197-8 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-09197-8 ARTICLE

abcdmeisC meisC meisC meisC

2 mb 2

3 2 3 3 23 4 5 6 st23 st24.5 hox2hox2hox hox2 2

r3 r5 2 2 3 2 3 3 23 4 5 6 st23 st24.5

Fig. 6 Endogenous expression of meisC in neural crest (NC) overlaps with that of hoxα2 in lamprey embryos. a Lateral views (a, c) and frontal sections (b, d) are shown for embryos at st23 (a, b) and st24.5 (c, d). White lines in a and c denote planes of sections in b and d. Pharyngeal arches are numbered, arrows denote expression in NC. Scale bars: 200 μm(a, c); 100 µm(b, d). mb, mid-brain

a Hoxa3 NC3 Hoxa2 Hoxb3HRE Hoxb2 50 Hoxa2 90 Meis 70 Pbx Pharyngeal arch 2 1.4 Hoxb1 0.6 Pbx 0.9 Meis 11 Prep1 4.3 Prep2 Neural cell culture 0.6 p300

mm10 chr6:52,162,064–52,170,930 2 kb mm10 chr11: 96,346,792–96,355,658

b Hoxa2 Hoxa2 Pbx, Meis Pbx, Meis Pharyngeal arch 2

NC3Hoxa2 HRE Hoxb2

Pbx, Meis Hoxb1 Prep1–2 Pbx, Meis Hindbrain r4

NC3Hoxa2 HRE Hoxb2

Fig. 7 Hoxa2 and Hoxb2 enhancers exhibit differential TALE (Three-Amino-Acid-Loop-Extension) and Hox binding correlating with their tissue-specific activities. a DNA-binding profiles for Hox, TALE, and p300 factors in neural cell culture and pharyngeal arch 2 tissues at the mouse Hoxa2 (NC3) and Hoxb2 (HRE) neural crest (NC) enhancers (highlighted in purple). Genes are annotated (top) and are transcribed from left-to-right. b Summary diagram of characterized differential regulatory inputs (purple arrows) from Hox and TALE factors (inferred from a) into the mouse Hoxa2 and Hoxb2 NC enhancers in pharyngeal arch 2 (NC) and hindbrain r4 in vivo. Activation or inactivation of transcription is depicted by green arrows or a black cross, respectively. Purple arrows from the Hoxa2 gene indicate auto-/cross-regulation and bipartite Pbx-Hox sites, this raises the possibility of both feasible for individual rhombomeres due to the small number Hox-dependent and -independent inputs of Pbx and Meis of cells and limiting amounts of embryonic material. In into NC Hox expression. addition, there is no suitable anti-Hoxb1 antibody for ChIP-seq With respect to the developing hindbrain, chromatin experiments. To circumvent these challenges, we generated a immunoprecipitation-sequencing (ChIP-seq) approaches are not mouse embryonic stem (ES) cell line (KH2) carrying an inducible

NATURE COMMUNICATIONS | (2019) 10:1189 | https://doi.org/10.1038/s41467-019-09197-8 | www.nature.com/naturecommunications 9 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-09197-8 locus-specific insertion of Hoxb1 marked with Flag epitopes and may have multiple roles in coupling Hox expression to the core used it in combination with programmed differentiation of ES NC GRN. These findings raise a number of interesting points and cells into neural fates. This enabled us to use anti-Flag antibodies avenues for further investigation. for Hoxb1 ChIP-seq and to obtain sufficient cell populations for a At the mechanistic level, little is known with respect to shared comparative series of genome-wide binding experiments. This cell versus independent inputs that govern axial patterning in the culture system has previously been shown to exhibit global hindbrain and NC. This is because the mechanisms regulating changes of gene expression that are similar to early in vivo phases Hox expression in the NC are relatively unclear compared to the of neural development, including the sequential activation of current knowledge of rhombomeric Hox regulation4. Analyses of hindbrain-expressed Hox genes and their cofactors (such as mouse Hoxa2 and Hoxb2 NC enhancers provided conflicting Meis2)39,40. We have previously applied this system to investigate mechanisms for NC expression of Hox genes. Hoxa2 supported the genome-wide binding properties of Hoxa1 and TALE proteins evidence for independent enhancers mediating expression in r4 in neural cells, uncovering in vivo regulatory interactions relevant and r4-derived NC, since the NC enhancer is not active in r4 and to hindbrain patterning39–42. a separate intronic/exonic enhancer drives r4 expression10,15.In Using this same approach for neural cells, we found similarities contrast, Hoxb2 uses common elements to control r4 and NC and significant differences in binding patterns of Hox and TALE expression, suggesting similar or shared regulatory requirements proteins between the paralogous Hox2 enhancers (Fig. 7a, b). The in these tissues11. Our analyses resolve this paradox, providing Hoxb2 (HRE) enhancer shows robust binding of Hoxb1, Pbx, and evidence that the Hoxa2/Hoxb2 NC enhancers each retain con- Meis, plus prominent p300 recruitment, consistent with their served Meis, Pbx, and Hox binding sites, which are deployed in established in vivo role in mediating Hoxb2 expression in r411,17. slightly different ways in mediating tissue-specific activities. In contrast, the Hoxa2 (NC3) enhancer lacks discernable binding hoxα2 is the only lamprey hox2 paralogue expressed in PA2 NC of Hoxb1, has reduced levels of Pbx and Meis occupancy, and (Fig. 2b, c) and we uncovered the presence of homologous displays differential binding patterns of Prep1 and Prep2. These functional motifs (Meis, Pbx, and Hox) in an upstream enhancer properties, in combination with absence of p300, directly with activity in r4 and NC. Based on sequence conservation, correlate with its lack of activity in r4. These differences in position, and regulatory activity, we infer that this enhancer is TALE and Hoxb1 binding and r4 activity presumably reflect homologous to gnathostome Hoxa2/Hoxb2 NC enhancers. sequence divergence or differences in epigenetic states between Sequence comparisons suggest that lamprey hoxδ2 has diverged the two enhancers. Studies on Pbx-Hox protein binding have and is missing these NC motifs but retains Krox20 and Sox sites, shown that small sequence variations within the canonical Pbx- consistent with its expression in r3/r5. Since the lamprey hoxα2 Hox bipartite binding sites influence the selectivity for specific NC enhancer exhibits the combined activity of the Hoxa2 and Hox proteins38. However, sequence comparisons of the Hoxb2, Hoxb2 enhancers, we consider it likely that it reflects the ancestral Hoxa2, and hoxα2 enhancers show that the core consensus Meis state. Thus, we suggest that Hox2 was ancestrally regulated in r4 and Pbx-Hox sites are identical (Figs. 4f, 5a; Supplementary and NC by a shared enhancer through inputs by Meis, Pbx, and Figure 6). In contrast, the sequences around these sites differ Hox (Fig. 8b). Alternatively, if the ancestral NC enhancer did not considerably between enhancers: for example, sequences imme- have the r4 activity, then gnathostome Hoxb2 and the lamprey diately 5′ of the Meis site are shared between Hoxb2 and hoxα2 hoxα2 NC enhancers independently evolved the ability to mediate but not Hoxa2 (Supplementary Figure 6). While these differences expression in r4. in neighbouring sequences may have arisen by sequence drift and The loss of r4 activity from the Hoxa2 NC enhancer may have be functionally neutral, an intriguing alternative is that they may been mitigated by the existence of a second r4 enhancer, located have functional significance in modulating binding to the in the exon/intronic region. We have previously shown that a conserved motifs and impacting r4 activity. Taken together, region containing the lamprey hoxα2 exon1, intron, and exon2 ChIP-seq data link TALE (Pbx and Meis) and Hox proteins to was also capable of driving reporter expression in r4 in lamprey regulation of Hox2 genes in both the hindbrain and NC. embryos18. This implies that the rhombomeric activity of this genomic region is ancestral to extant vertebrates. Hence, in lamprey hoxα2 there are two r4 enhancers, which may have Discussion partially redundant/shadow activities. It is notable that while both Here, we have investigated the ancestral regulation of Hox2 genes mediate expression in r4 in lamprey, only the upstream enhancer in the NC and hindbrain of vertebrates, using interspecies cis- drives expression in NC, which indicates that there is something regulatory comparisons between gnathostomes and lamprey. We unique about the upstream enhancer that helps to potentiate its demonstrated that gnathostome Hoxa2 and Hoxb2 NC enhancers activity in both rhombomeres and NC. are capable of driving equivalent NC expression in lamprey as The close proximity and partial overlap between the r3/r5 and they do in gnathostomes and that a homologous enhancer is r4/NC enhancers in hoxα2, in a manner analogous to its gna- present upstream of lamprey hoxα2. Sequence comparisons and thostome counterparts (Fig. 1), suggests that some components of regulatory analysis revealed conserved Meis, Pbx, and Hox these cis-elements may be required for expression in both tissues. binding sites between gnathostome Hoxa2/Hoxb2 and lamprey Our experiments with the lamprey hoxα2 enhancer showed that hoxα2 NC enhancers that are required for their activity. The deletion of the Meis and Pbx-Hox sites not only led to the loss of lamprey hoxα2 NC enhancer appears to have retained ancestral r4 and NC expression but also to the loss of r3/r5 expression activity in both NC and hindbrain, while the paralogous Hoxa2 (Fig. 4j). This implies that Meis and/or Pbx-Hox factors are also and Hoxb2 NC enhancers have differentially partitioned NC and involved in regulating the expression of hoxα2 in r3/r5. In this hindbrain activities in the gnathostome lineage. Regulatory regard, the mouse Hoxa2 NC enhancer partially overlaps ele- divergence has also occurred between lamprey Hox2 paralogues, ments required for r3/r5 activity: deletion of the Meis site in NC3 with hoxδ2 appearing to have lost the regulatory sites for r4/NC (ΔNC3_1) causes the loss of NC activity and also reduces r3/r5 enhancer activity. This suggests that a regulatory circuit with expression when tested in mouse10,44 and zebrafish (Fig. 3c; input from TALE and Hox proteins was an important component Supplementary Figure 3a–b). Thus, the Meis site contributes to of the GRN for Hox2-dependent NC patterning in ancestral both r3/r5 and NC activities of the Hoxa2 enhancer. Deletion of vertebrates that has been maintained during evolution (Fig. 8a). the Pbx-Hox site (ΔNC3_2) removes NC expression but does not TALE factors are part of an ancient patterning system38,43 that influence r3/r5 activity in mouse10, which may reflect the control

10 NATURE COMMUNICATIONS | (2019) 10:1189 | https://doi.org/10.1038/s41467-019-09197-8 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-09197-8 ARTICLE

a Evolution of the neural crest Hox code in vertebrates

Non-vertebrate deuterostomes Cyclostomes Gnathostomes

SaccoglossusAmphioxus Ciona Lamprey Zebrafish Mouse

- hox2 and hox2 - Hoxa2 and Hoxb2 enhancer divergence enhancer divergence

Genome duplication/s - NC GRN - Hox GRN for hindbrain and NC Hox code - TALE- and Hox-dependent Hox2 hindbrain/NC enhancer - NC-like cells

- Nested Hox expression in the neuroectoderm, coincident with Meis

b Divergence of Hox2 hindbrain/NC enhancers

Putative ancestral Hox2 Lamprey hox2 enhancers Mouse Hox2 enhancers upstream enhancer activity

r3 r3 r3 r3 r3 r4 2 r4 2 r4 2 r4 2 r4 2 hox2hox2 Hoxa2 Hoxb2 r5 NC r5 r5 r5 r5 3 3 3 3 3

Sox Krox20

Meis Pbx-Hox Conserved functional motifs in No anterior NC No NC activity No r4 activity ancestral Hox2 upstream enhancer or r4 activity posterior to PA2

Fig. 8 Evolutionary model for regulation of Hox2 genes in vertebrates. a A model for evolution of the neural crest (NC) Hox code, based on our data. a NC gene regulatory network (GRN) and NC Hox -code evolved in ancestral vertebrates and are conserved between cyclostomes and gnathostomes. In ancestral vertebrates, Hox2 NC expression was regulated by TALE (Three-Amino-Acid-Loop-Extension) and Hox factors, through a putative ancestral enhancer with shared NC and hindbrain activities. b A model for the divergence of lamprey and mouse Hox2 NC/hindbrain enhancers. Enhancer activity domains are depicted in blue in schematic dorsal views of the hindbrain and pharyngeal arches. Conserved functional motifs (Krox20, Sox, Meis, Pbx-Hox) present upstream of lamprey and mouse Hox2 genes are shown. Lamprey and mouse enhancers show divergent activities. Comparison between expression domains and conserved motifs leads us to suggest that a putative ancestral vertebrate Hox2 enhancer contained cis-elements for r3/r5 expression (Krox20, Sox) and r4/NC expression (Meis, Pbx-Hox).These scenarios assume that duplication events that gave rise to four Hox clusters in early vertebrates occurred prior to the cyclostome/gnathostome split, as the most parsimonious explanation22,81. However, it is also possible that independent genome duplication events may have occurred in cyclostome and gnathostome lineages (see Holland and Ocampo Daza82 for a recent discussion) of r3/r5 activity being Hox-independent. For Hoxb2, r3/r5 activity migratory NC. Such auto-regulation has been shown for rhom- appears to be independent of the Meis and Pbx-Hox sites17,45, bomeric expression of many Hox genes11,13,15 and may also be a suggesting that Hoxa2 and Hoxb2 enhancers have also diverged general mechanism for pre-patterning the NC. Intriguingly, there in their degree of dependence on Meis sites for r3/r5 activity. appears to be context-dependent inputs that modulate the ability Grafting experiments in gnathostome embryos have revealed of Hox-response elements containing Meis and Pbx-Hox sites to roles for both maintenance of neural tube Hox expression (pre- potentiate activity in the hindbrain versus NC. For example, like patterning) and plasticity in shaping the NC Hox code46–49. the Hoxb2 NC/r4 enhancer, Hoxb1 has an auto-/cross-regulatory Initial AP Hox patterning in the neural tube plays an instructive element dependent on Meis and Pbx-Hox sites, but it only role in establishing NC Hox expression, which is then modulated mediates the expression in r4 and not in r4-derived NC13. by permissive signals in the PA environments. Hence, the current Similarly, since Hoxa2 is expressed in r2 but not in r2-derived model is that Hox expression initiated in the neural tube is not NC46, other regulatory mechanisms presumably prevent Hox simply passively retained by migrating NC cells. The character- expression in the r2-derived NC. This could include fibroblast ization of essential sites bound by Hoxa2, Meis, and Pbx in the growth factor signalling from the isthmus, which plays an mouse Hoxa2 and Hoxb2 NC enhancers suggests that Hox and important role in patterning PA147. Further regulatory analyses TALE-dependent auto-/cross-regulation may provide a mechan- will be required to elucidate the generality of Hox auto-/cross- ism for potentiating Hox2 expression that is set up in the pre- regulation in NC.

NATURE COMMUNICATIONS | (2019) 10:1189 | https://doi.org/10.1038/s41467-019-09197-8 | www.nature.com/naturecommunications 11 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-09197-8

The emergence of the NC during vertebrate evolution provides In summary, our finding of functionally conserved Meis and a key example of how regulatory codes coevolved with novel Pbx-Hox sites in lamprey and gnathostome Hox2 NC enhancers cell types in an animal body plan. Non-vertebrate deuterostomes, focuses attention on the role of these factors in NC development. like the hemichordate Saccoglossus and the cephalochordate Meis and Pbx play important roles in patterning diverse tissues amphioxus, lack NC but deploy nested AP domains of Hox during development, including the hindbrain and NC, some of expression to pattern their nervous system50–52. This raises the which may be independent of Hox30–33. For example, mouse intriguing possibility that the NC Hox code in ancestral verte- embryos with a conditional deletion of Meis2 in NC display brates evolved from the transfer of a deuterostome neural Hox abnormalities in patterning the bones and connective tissue in prepattern14. Alternatively, the NC Hox code may have arisen PA1, where Hox genes are not expressed32. Hence, they could independently, by evolution of new regulatory inputs into Hox also serve as cofactors for other transcription factors34, or have genes. A further possibility invokes a combination of both, with independent roles in patterning NC. Therefore, while they have shared inputs creating a Hox prepattern and independent inputs not been linked to the current NC-GRN, TALE factors (Pbx and evolving to modulate this in a tissue-specific manner. Our Meis) may be important components in this network. If so, investigation of ancestral Hox2 NC regulation in vertebrates sets these transcription factors could be part of a mechanism that the stage for examining the emergence of Hox regulation in NC couples Hox genes to the NC GRN in vertebrate evolution. The during evolution. This requires comparison of deuter- conserved expression of Meis genes in NC from gnathostomes ostome development, with a focus on non-vertebrate deuter- and lamprey is consistent with an ancestral role in NC and their ostome cell types that may be evolutionarily related to NC1. roles and interactions in NC development require further study. Studies in tunicates and cephalochordates suggest that they employ similar gene regulatory programs to specify the neural Methods 53–55 plate border . Recent studies in tunicates, the vertebrate sister Sequence alignment. Global sequence alignment of Hox genomic loci (Fig. 3a) group (Fig. 8a), have identified certain embryonic cell populations was performed using Multi-LAGAN28, with human as the baseline sequence and that display some characteristics of NC cells. For example, in conserved sequences defined by 60% conservation over 40 bp. Sequence alignments Ecteinascidia turbinata, a colonial tunicate, the trunk lateral cells of Hox2 enhancers (Figs. 4f, 5a, b; Supplementary Figures 3, 5, 6) were performed using AlignX in VectorNTI (Life Technologies). originate beside the neural tube and migrate to give rise to pig- mented cell types, leading to their designation as NC-like cells56. fi Enhancer elements. Enhancer elements were selected from the published data or Trunk lateral cells are also identi able in Ciona, where they based on sequence conservation in cross-species alignments. The DNA for each express homologues of some key genes of the vertebrate NC- element was amplified by PCR from genomic DNA or from pre-existing plasmids GRN, including Tfap2α and Twist57. However, the homology of using KOD Hot Start Master Mix (Novagen). The following primers were used for trunk lateral cells to NC has been called into question58. amplification. The sequences in uppercase represent homology to genomic DNA, Ciona Hox genes are dispersed across two chromosomes and and adaptor sequences for cloning are in lowercase text. References are given for primary literature in which the enhancers were identified. residual spatial colinearity of expression in the nervous system has crestin−1 kb(zf)26, been detected for some of them59. This may be a general feature of F:5′-ccctcgaggtcgacGCTGAAATCTTGGGCATCTC-3′; tunicates, with similar Hox cluster disintegration seen in species R: 5′-gaggatatcgagctcGCTGGGTTACTGAGGTGAC-3′; − 26 from other tunicate classes60,61. Comparison with amphioxus, crestin 296 bp , F: 5′-agggtaatgagggccCCGCAGATGTTCTAGTACCC-3′; which has a single Hox cluster and nested colinear Hox expression R: 5′-gaggatatcgagctcgGGGTTAAAACAACACATTGATTAACCTGG-3′; 51,52,62 along the AP axis of the neuroepithelium , suggests that Hoxa2(m)ΔNC3-1/ΔNC3-210, tunicates are relatively divergent in terms of their Hox genomic F: 5′-agggtaatgagggcccAGATCTGAATGCTGGAGC-3′; content and expression. Ciona intestinalis Hox2 expression has not R: 5′-tcgcccttcatagcctcgagGGTACCTTCTCTCCCTCAAAC-3′; hoxα2 elementA, been detected in the developing neural tube or neural plate border, F: 5′-agggtaatgagggccCCATCGACATGTAAACGTGGG-3′; 63 but has been described in the larval ectodermal atrial primordia . R: 5′-tcctacgtcactggcGAGTAAGCGAGGTCGTGG-3′. No defects in larval morphogenesis were detected upon The following enhancer elements were cloned into the Hugo’s lamprey 63 construct (HLC) reporter vector in a previous study focusing on the hindbrain:18 morpholino-mediated knockdown of Hox2 in Ciona embryos ,so α − the roles of Hox2 in tunicate embryonic patterning and its place- hoxa2b(zf), Hoxa2a(f), Hoxa2(m), and hox 2 4kb. ment in tunicate developmental GRNs remain unclear. In other non-vertebrate deuterostomes, the coincident Generation of reporter constructs. The HLC vector was created in a previous 50,64 study18. PCR-amplified enhancer elements were purified using the QIAquick PCR expression of Hox and Meis genes in Saccoglossus and fi 52,65 Puri cation Kit (Qiagen) and cloned into HLC by Gibson Assembly using the amphioxus neuroectoderm leads us to speculate that these Gibson Assembly Master Mix (NEB). The mouse c-Fos promoter was cloned into factors may have comprised an ancestral deuterostome regulatory the hoxα2 −4 kb-HLC vector that had been linearized either by NcoI (for hoxα2 −4 circuit involved in neuroectodermal patterning (Fig. 8a). Upon kb cfosV1)orbyAscI and NcoI (for hoxα2 −4 kb cfosV2). The following primer evolution of NC, pre-existing auto- and cross-regulatory inter- pairs were used to amplify the mouse c-Fos promoter from a plasmid template: hoxα2 −4 kb cfosV1, actions within this network may have served to maintain F: 5′-ctccgtcaaggcagcCCAGTGACGTAGGAAGTCCATC-3′; expression of these factors in the migrating NC. Investigation of R: 5′-ctcgcccttgctcaccatggTGGCGACCGGTGGATCCT-3′; regulatory interactions between Hox and TALE genes in inver- hoxα2 −4 kb cfosV2, ′ ′ tebrate deuterostomes, combined with characterisation of the cis- F: 5 -cgcctattggctgggCCAGTGACGTAGGAAGTCCATC-3 ; R: 5′-ctcgcccttgctcaccatggTGGCGACCGGTGGATCCT-3′. regulatory elements involved, could help to address whether such Site-directed mutagenesis of enhancers was achieved by Gibson Assembly. For Hox-TALE interactions were ancestral to deuterostomes and were each mutation variant, two partially overlapping amplicons (left (L) and right (R)) employed in coupling Hox genes to NC during chordate evolu- containing the desired mutation were generated by PCR and then assembled into tion. Amphioxus lacks NC54, but interspecies regulatory analysis, the linearized HLC vector by 3-fragment assembly. The following primers were used. assaying activity of regions of the amphioxus Hox cluster in crestin−296 bpΔSox10, chicken and mouse embryos, revealed that some cis-elements are L_F: 5′-agggtaatgagggccCCGCAGATGTTCTAGTACCC-3′; capable of mediating reporter expression in the hindbrain, pla- L_R: 5′-ctagagatcgtcgcaTCTCTACGAAATTGTGCTTCTAGCAG-3′; codes, and NC66. The activity of these elements in amphioxus is R_F: 5′-cgtagagatgcgacGATCTCTAGAAACATTAATGCATATGAACAAAAGC-3′; R_R: 5′-gaggatatcgagctcgGGGTTAAAACAACACATTGATTAACCTGG-3′; unknown but it will be important to investigate whether these crestin−296 bpΔTfap2α, represent ancestral neural elements with a capacity for mediating L_F: 5′-agggtaatgagggccCCGCAGATGTTCTAGTACCC-3′; NC activity in vertebrates. L_R: 5′-ggatgtgcttaagttGAGCACATGACCAGGAGTC-3′;

12 NATURE COMMUNICATIONS | (2019) 10:1189 | https://doi.org/10.1038/s41467-019-09197-8 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-09197-8 ARTICLE

R_F:5′-atgtgctcaacttaaGCACATCCTGCTAGAAGCAC-3′; hoxδ2 (585 bp, exonic), R_R: 5′-gaggatatcgagctcgGGGTTAAAACAACACATTGATTAACCTGG-3′; F: 5′-ACCTCTGCGCGACTCCTC-3′; crestin−296 bpΔMyc, R: 5′-CCAGACCTCCTCCTCCTCT-3′; L_F: 5′-agggtaatgagggccCCGCAGATGTTCTAGTACCC-3′; meisC (573 bp, exonic), L_R: 5′-aaggcgagtctagaACCAGGAGTCAATTAAAAGTCTCGTG-3′; F: 5′-CTTTGAGAAGTGCGAGCTGG-3′; R_F:5′-ctcctggttctagaCTCGCCTTGGGCACATCC-3′; R: 5′-GAAAATGCCGCGCTTCTTCT-3′. R_R: 5′-gaggatatcgagctcgGGGTTAAAACAACACATTGATTAACCTGG-3′; eGFP, hoxα2, and hoxα3 probe sequences were previously reported18. hoxα2 −4kb Δkrox20, L_F: 5′-agggcccgggatcccTCGAGCCTGCAGGAAGCTTAAG-3′; ′ ′ Lamprey in situ hybridisation. Digoxygenin-labelled probes were generated by L_R: 5 -cccggtaaGCGGGACTCTGTTATCTCC-3 ; fi ′ ′ standard methods and puri ed using the MEGAclear Transcription Clean-Up Kit R_F: 5 -agtcccgcTTACCGGGATTCCGGCCAC-3 ; 69 ′ ′ (Ambion). Lamprey embryos were staged according to Tahara et al. . Lamprey R_R: 5 -tctacgacgacgacgacgtcgaggTCGACGCAAAGAAGCCGG-3 ; whole-mount in situ hybridisation was performed on MEMFA-fixed embryos hoxα2 −4kb ΔNC3, following established protocols70, with the following additions to the protocol:71 L_F: 5′-agggcccgggatcccTCGAGCCTGCAGGAAGCTTAAG-3′ methanol-stored embryos were transferred into ethanol and left overnight prior to L_R: 5′-ccctcgctCTTGCCCTGCACAAATACTCAG-3′; rehydration; embryos were treated with 0.5% acetic anhydride in 0.1 M trietha- R_F: 5′-agggcaagAGCGAGGGGCTCCGGAAAG-3′; nolamine after proteinase K treatment. For imaging, embryos were cleared in R_R: 5′-tctacgacgacgacgacgtcgaggtcgaCGCAAAGAAGCCGGCCCC-3′. benzyl alcohol:benzyl benzoate and mounted in Permount (Fisher Scientific). For sectioning after in situ hybridisation, embryos were transferred into 30% Zebrafish and lamprey experiments. This study was conducted in accordance sucrose in phosphate-buffered saline, embedded in O.C.T. Compound and with the Guide for the Care and Use of Laboratory Animals of the National sectioned to 10-µm-thick cryosections. Images were taken using a Zeiss Axiovert Institutes of Health and protocols were approved by the Institutional Animal Care 200 microscope with AxioCam HRc camera and AxioVision Rel 4.8.2 software. and Use Committees of the Stowers Institute (zebrafish, RK Protocol #2015-0149 and Protocol #2018-0184) and California Institute of Technology (lamprey, Pro- ES cell culture. ES cells were cultured in feeder-free conditions using N2B27 + 2i tocol #1436-17). media supplemented with 2000U mL−1 of ESGRO (Millipore) on a gelatinized plate. KH2 ES cells72 with epitope-tagged Hoxb1 (3XFLAG-MYC) were used for Zebrafish reporter assay. The wild-type Slusarski AB zebrafish line was used for Hoxb1 ChIP using anti-flag antibody (F1804-Sigma). Unmodified KH2 cell lines embryo micro-injection experiments using Tol2-mediated transgenesis67. A stan- were used for ChIP experiments for Pbx (SC-888; Santa Cruz), Meis (SC-25412; dard injection mix containing 25 ng µl−1 reporter plasmid (generated by mini- Santa Cruz), Prep1 (ab55603; Abcam), Prep2 (sc-292315X; Santa Cruz) and EP300 prep), 35 ng µl−1 Tol2 transposase messenger RNA, and 0.05% phenol red was (Sc-585X; Santa Cruz). Cells were differentiated to neuroectoderm in differentia- micro-injected into one-celled embryos at an injection volume of 3–5 nl. Embryos tion media containing DMEM + 10% (vol/vol) Serum + NEAA + 3 µM RA for a were screened at 24–30 h post fertilization for fluorescent reporter expression using requisite length of time. Cells were harvested at 80–90% confluency. a Leica M205FA microscope. In assaying reporter constructs by transient trans- genesis, for each injected construct the tissue-specific GFP expression domains Chromatin immunoprecipitation-sequencing. ChIP-seq was performed accord- were noted, along with the number and proportion of screened embryos exhibiting ing to the Upstate protocol as described73 with modifications. Cells were fixed with GFP expression in each of those domains. The empty HLC reporter vector (without 1% formaldehyde by incubating at 37 °C for 11 min. The reaction was quenched for an enhancer) directs weak mosaic GFP expression in multiple cell types including 5 min by the addition of 1/10th volume of 1.25 M glycine. Cells were sonicated for neurons and muscle cells (Supplementary Figure 7a). The following pre-existing 25 min in a Bioruptor at high setting and 30 s on–off cycle. Respective antibodies transgenic reporter lines were used for this study: Tg(dr.hoxa2b:eGFP), Tg(fr. attached to sepharose-A beads were used for immunoprecipitation. Sequencing of Hoxa2a:eGFP)12,18. The line Tg(mm.Hoxa2b:eGFP) was generated in this study ChIP-seq libraries was performed on the Illumina HiSeq 2500, 51 bp single end. from a founder that had been micro-injected with Hoxa2(m)-HLC. Fluorescent and Raw reads were aligned to the UCSC mm10 mouse genome with bowtie2 2.2.074. bright-field imaging were performed with Leica DFC360FX and DFC405C cameras Primary reads from each bam were normalized to reads per million and bigWig and LAS AF imaging software. Images were cropped and altered for brightness and tracks visualized at the UCSC genome browser (https://genome.ucsc.edu/). contrast using Adobe Photoshop CS5.1. fi − Zebra sh crestin reporter expression was analysed using Tg2( 4.5_crestin: Assay for transposase-accessible chromatin-sequencing. Assay for EGFP) that uses the core long terminal repeat (LTR) elements of crestin located transposase-accessible chromatin-sequencing was performed as described pre- − 26 4.5 kb upstream of the putative crestin open reading frame . Transient assays for viously75. Fifty thousand cells were counted using the sceptre 2.0 cell counter mutated transcription factor binding sites in lamprey were performed using the (EMD Millipore). The tagmentation reaction was performed using the Nextera minimal 296 bp crestin LTR element with the previously reported mutations as DNA Library Prep Kit (Illumina, FC-121-1030) and libraries indexed using fi 26 tested in zebra sh . the Nextera Index Kit (Illumina, FC-121-1011). Libraries were size selected by Bluepippin (Sage Science) and sequenced on Illumina HiSEq. 2500 instrument. Lamprey reporter assay. Lamprey transient transgenesis was performed using P. Following sequencing, Illumina Real Time Analysis v1.18.64 and CASAVA v1.8.2 marinus embryos at the one-cell stage and I-SceI meganuclease-mediated were run to demultiplex reads and generate FASTQ files. transgenesis18,68. Injection mixes containing 20 ng µl−1 reporter plasmid (gener- −1 ated by miniprep), 1× CutSmart buffer (NEB), and 0.5U µl I-SceI enzyme (NEB) Reporting summary. Further information on experimental design is available in in water were incubated at 37 °C for 30 min prior to micro-injection at a volume of the Nature Research Reporting Summary linked to this article. ~2 nl per embryo. Embryos were screened for fluorescent reporter expression using a Zeiss SteREO Discovery V12 microscope. For each injected construct, the tissue- specific GFP expression domains were noted, along with the number and pro- Data availability fi portion of screened embryos exhibiting GFP expression in each of those domains. The authors declare that all data supporting the ndings of this study are available within The empty HLC reporter vector (without an enhancer) directs GFP expression in the article and its supplementary information files or from the corresponding author ectoderm, yolk cells, as well as in cells dorsal to the yolk (Supplementary Fig- upon reasonable request. All raw sequencing data from this study underlying Fig. 7a have ure 7b). Since transient reporter assays generate mosaic reporter expression pat- been deposited in the NCBI BioProject database [https://www.ncbi.nlm.nih.gov/ terns, variation in levels and domains of GFP expression are observed between bioproject] under accession code PRJNA341679 and Sequence Read Archive under embryos. For imaging we selected embryos with GFP-expressing patterns repre- accession code SRP079975 and PRJNA503882. Original data underlying this manuscript sentative of the expression potential of the reporter construct, as inferred from can be accessed from the Stowers Original Data Repository at [http://odr.stowers.org/ screening more than 100 injected embryos. GFP-expressing embryos were imaged websimr/]. A reporting summary for this Article is available as a Supplementary using a Zeiss SteREO Discovery V12 microscope and a Zeiss Axiocam MRm Information file. camera with AxioVision Rel 4.6 software. Images were cropped and altered for brightness using Adobe Photoshop CS5.1. Selected GFP-expressing embryos were fixed in MEMFA and dehydrated in methanol for in situ hybridisation. Received: 5 December 2018 Accepted: 26 February 2019

Cloning lamprey in situ hybridisation probes. Probes were designed based on characterised or predicted gene sequences22, amplified from P. marinus genomic DNA or st18–26 embryonic complementary DNA by PCR using KOD Hot Start Master Mix (Novagen) and cloned into the pCR4-TOPO vector (Life Technologies). The size of each amplified fragment is indicated (in bp). The following primers References were used for PCR: 1. Green, S. A., Simoes-Costa, M. & Bronner, M. E. Evolution of vertebrates hoxβ1 (674 bp, partial 3′-untranslated region), as viewed from the crest. Nature 520, 474–482 (2015). F: 5′-ATGCTCCCTCAACTCCATCC-3′; 2. Le Douarin, N. & Kalcheim, C. The Neural Crest 2nd edn (Cambridge R: 5′-TGACCTCTTCTCGCATGTAAGA-3′; University Press, Cambridge, 1999).

NATURE COMMUNICATIONS | (2019) 10:1189 | https://doi.org/10.1038/s41467-019-09197-8 | www.nature.com/naturecommunications 13 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-09197-8

3. Northcutt, R. The new head hypothesis revisited. J. Exp. Zool. B 304, 274–297 32. Machon, O., Masek, J., Machonova, O., Krauss, S. & Kozmik, Z. Meis2 is (2005). essential for cranial and cardiac neural crest development. BMC Dev. Biol. 15, 4. Parker, H. J., Pushel, I. & Krumlauf, R. Coupling the roles of Hox genes to 40 (2015). regulatory networks patterning cranial neural crest. Dev. Biol. https://doi.org/ 33. Deflorian, G. et al. Prep1.1 has essential genetic functions in hindbrain 10.1016/j.ydbio.2018.1003.1016 (2018). development and cranial neural crest cell differentiation. Development 131, 5. Gavalas, A., Trainor, P., Ariza-McNaughton, L. & Krumlauf, R. Synergy 613–627 (2004). between Hoxa1 and Hoxb1: the relationship between arch patterning and the 34. Laurent, A., Bihan, R., Omilli, F., Deschamps, S. & Pellerin, I. PBX proteins: generation of cranial neural crest. Development 128, 3017–3027 (2001). much more than Hox cofactors. Int. J. Dev. Biol. 52,9–20 (2008). 6. Martik, M. L. & Bronner, M. E. Regulatory logic underlying diversification of 35. Tümpel, S., Maconochie, M., Wiedemann, L. M. & Krumlauf, R. Conservation the neural crest. Trends Genet 33, 715–727 (2017). and diversity in the cis-regulatory networks that integrate information 7. Sauka-Spengler, T., Meulemans, D., Jones, M. & Bronner-Fraser, M. Ancient controlling expression of Hoxa2 in hindbrain and cranial neural crest cells in evolutionary origin of the neural crest gene regulatory network. Dev. Cell 13, vertebrates. Dev. Biol. 246,45–56 (2002). 405–420 (2007). 36. Amin, S. et al. Hoxa2 selectively enhances Meis binding to change a branchial 8. Minoux, M. & Rijli, F. M. Molecular mechanisms of cranial neural crest cell arch ground state. Dev. Cell 32, 265–277 (2015). migration and patterning in craniofacial development. Development 137, 37. Donaldson, I. J. et al. Genome-wide occupancy links Hoxa2 to Wnt-β-catenin 2605–2621 (2010). signaling in mouse embryonic development. Nucleic Acids Res 40, 3990–4001 9. Kitazawa, T. et al. Distinct effects of Hoxa2 overexpression in cranial neural (2012). crest populations reveal that the mammalian hyomandibular-ceratohyal 38. Merabet, S. & Mann, R. S. To be specific or not: the critical relationship boundary maps within the styloid process. Dev. Biol. 402, 162–174 (2015). between Hox and TALE proteins. Trends Genet. 32, 334–347 (2016). 10. Maconochie, M. et al. Regulation of Hoxa2 in cranial neural crest cells 39. De Kumar, B. et al. Analysis of dynamic changes in retinoid-induced involves members of the AP-2 family. Development 126, 1483–1494 (1999). transcription and epigenetic profiles of murine Hox clusters in ES cells. 11. Maconochie, M. K. et al. Cross-regulation in the mouse HoxB complex: the Genome Res. 25, 1229–1243 (2015). expression of Hoxb2 in rhombomere 4 is regulated by Hoxb1. Genes Dev. 11, 40. De Kumar, B. et al. HOXA1 and TALE proteins display cross-regulatory 1885–1896 (1997). interactions and form a combinatorial binding code on HOXA1 targets. 12. McEllin, J. A., Alexander, T. B., Tumpel, S., Wiedemann, L. M. & Krumlauf, R. Genome Res. 27, 1501–1512 (2017). Analyses of fugu hoxa2 genes provide evidence for subfunctionalization of 41. De Kumar, B. et al. Dynamic regulation of Nanog and stem cell-signaling neural crest cell and rhombomere cis-regulatory modules during vertebrate pathways by Hoxa1 during early neuro-ectodermal differentiation of ES cells. evolution. Dev. Biol. 409, 530–542 (2016). Proc. Natl Acad. Sci. USA 114, 5838–5845 (2017). 13. Parker, H. J. & Krumlauf, R. Segmental arithmetic: summing up the Hox gene 42. De Kumar, B. et al. Hoxa1 targets signaling pathways during neural regulatory network for hindbrain development in . Wiley Interdiscip. differentiation of ES cells and mouse embryogenesis. Dev. Biol. 432, 151–164 Rev. Dev. Biol. 6., https://doi.org/10.1002/wdev.286(2017). (2017). 14. Wada, H., Escriva, H., Zhang, S. & Laudet, V. Conserved RARE localization in 43. Hudry, B. et al. Molecular insights into the origin of the Hox-TALE patterning amphioxus Hox clusters and implications for Hox code evolution in the system. Elife 3, e01939 (2014). vertebrate neural crest. Dev. Dyn. 235, 1522–1531 (2006). 44. Maconochie, M. K., Nonchev, S., Manzanares, M., Marshall, H. & 15. Tümpel, S. et al. Expression of Hoxa2 in rhombomere 4 is regulated by a Krumlauf, R. Differences in Krox20-dependent regulation of Hoxa2 and conserved cross-regulatory mechanism dependent upon Hoxb1. Dev. Biol. Hoxb2 during hindbrain development. Dev. Biol. 233, 468–481 (2001). 302, 646–660 (2007). 45. Sham, M. H. et al. The zinc finger gene Krox-20 regulates Hoxb-2 (Hox2.8) 16. Lampe, X. et al. An ultraconserved Hox-Pbx responsive element resides in the during hindbrain segmentation. Cell 72, 183–196 (1993). coding sequence of Hoxa2 and is active in rhombomere 4. Nucleic Acids Res. 46. Prince, V. & Lumsden, A. Hox-a2 expression in normal and transposed 36, 3214–3225 (2008). rhombomeres: independent regulation in the neural tube and neural crest. 17. Ferretti, E. et al. Segmental expression of Hoxb2 in r4 requires two separate Development 120, 911–923 (1994). sites that integrate cooperative interactions between Prep1, Pbx and Hox 47. Trainor, P. A., Ariza-McNaughton, L. & Krumlauf, R. Role of the isthmus and proteins. Development 127, 155–166 (2000). FGFs in resolving the paradox of neural crest plasticity and prepatterning. 18. Parker, H. J., Bronner, M. E. & Krumlauf, R. A Hox regulatory network of Science 295, 1288–1291 (2002). hindbrain segmentation is conserved to the base of vertebrates. Nature 514, 48. Trainor, P. & Krumlauf, R. Plasticity in mouse neural crest cells reveals a new 490–493 (2014). patterning role for cranial mesoderm. Nat. Cell Biol. 2,96–102 (2000). 19. Green, S. A., Uy, B. R. & Bronner, M. E. Ancient evolutionary origin of 49. Schilling, T. F., Prince, V. & Ingham, P. W. Plasticity in zebrafish hox vertebrate enteric neurons from trunk-derived neural crest. Nature 544,88–91 expression in the hindbrain and cranial neural crest. Dev. Biol. 231, 201–216 (2017). (2001). 20. Shimeld, S. M. & Donoghue, P. C. Evolutionary crossroads in developmental 50. Aronowicz, J. & Lowe, C. J. Hox gene expression in the hemichordate biology: cyclostomes (lamprey and hagfish). Development 139, 2091–2099 Saccoglossus kowalevskii and the evolution of deuterostome nervous systems. (2012). Integr. Comp. Biol. 46, 890–901 (2006). 21. Mehta, T. K. et al. Evidence for at least six Hox clusters in the Japanese 51. Wada, H., Garcia-Fernandez, J. & Holland, P. W. Colinear and segmental lamprey (Lethenteron japonicum). Proc. Natl Acad. Sci. USA 110, expression of amphioxus Hox genes. Dev. Biol. 213, 131–141 (1999). 16044–16049 (2013). 52. Schubert, M., Holland, N. D., Laudet, V. & Holland, L. Z. A -Hox 22. Smith, J. J. et al. The sea lamprey germline genome provides insights into hierarchy controls both anterior/posterior patterning and neuronal programmed genome rearrangement and vertebrate evolution. Nat. Genet. 50, specification in the developing central nervous system of the cephalochordate 270–277 (2018). amphioxus. Dev. Biol. 296, 190–202 (2006). 23. Pascual-Anaya, J. et al. Hagfish and lamprey Hox genes reveal conservation of 53. Medeiros, D. M. The evolution of the neural crest: new perspectives from temporal colinearity in vertebrates. Nat. Ecol. Evol. 2, 859–866 (2018). lamprey and neural crest-like cells. Wiley Interdiscip. Rev. Dev. 24. Takio, Y. et al. Evolutionary biology: lamprey Hox genes and the evolution of Biol. 2,1–15 (2013). jaws. Nature 429,1–2 (2004). 54. Yu, J. K., Meulemans, D., McKeown, S. J. & Bronner-Fraser, M. Insights from 25. Takio, Y. et al. Hox gene expression patterns in Lethenteron japonicum the amphioxus genome on the origin of vertebrate neural crest. Genome Res. embryos—insights into the evolution of the vertebrate Hox code. Dev. Biol. 18, 1127–1132 (2008). 308, 606–620 (2007). 55. Denes, A. S. et al. Molecular architecture of annelid nerve cord supports 26. Kaufman, C. K. et al. A zebrafish melanoma model reveals emergence of common origin of nervous system centralization in bilateria. Cell 129, neural crest identity during melanoma initiation. Science 351, aad2197 (2016). 277–288 (2007). 27. Nonchev, S. et al. Segmental expression of Hoxa-2 in the hindbrain is directly 56. Jeffery, W. R., Strickler, A. G. & Yamamoto, Y. Migratory neural crest-like regulated by Krox-20. Development 122, 543–554 (1996). cells form body pigmentation in a urochordate embryo. Nature 431, 696–699 28. Brudno, M. et al. LAGAN and Multi-LAGAN: efficient tools for large-scale (2004). multiple alignment of genomic DNA. Genome Res. 13, 721–731 (2003). 57. Jeffery, W. R. et al. Trunk lateral cells are neural crest-like cells in the ascidian 29. Burglin, T. R. Analysis of TALE superclass genes (MEIS, PBC, Ciona intestinalis: insights into the ancestry and evolution of the neural crest. KNOX, Iroquois, TGIF) reveals a novel domain conserved between plants and Dev. Biol. 324, 152–160 (2008). animals. Nucleic Acids Res. 25, 4173–4180 (1997). 58. Hall, B. K. & Gillis, J. A. Incremental evolution of the neural crest, neural 30. Moens, C. B. & Selleri, L. Hox cofactors in vertebrate development. Dev. Biol. crest cells and neural crest-derived skeletal tissues. J. Anat. 222,19–31 291, 193–206 (2006). (2013). 31. Choe, S. K., Vlachakis, N. & Sagerstrom, C. G. Meis family proteins are 59. Ikuta, T., Yoshida, N., Satoh, N. & Saiga, H. Ciona intestinalis Hox gene required for hindbrain development in the zebrafish. Development 129, cluster: Its dispersed structure and residual colinear expression in 585–595 (2002). development. Proc. Natl Acad. Sci. USA 101, 15118–15123 (2004).

14 NATURE COMMUNICATIONS | (2019) 10:1189 | https://doi.org/10.1038/s41467-019-09197-8 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-09197-8 ARTICLE

60. Seo, H. C. et al. Hox cluster disintegration with persistent anteroposterior 82. Holland, L. Z. & Ocampo Daza, D. A new look at an old question: when did order of expression in Oikopleura dioica. Nature 431,67–71 (2004). the second whole genome duplication occur in vertebrate evolution? Genome 61. Sekigami, Y. et al. Hox gene cluster of the ascidian, Halocynthia roretzi, reveals Biol. 19, 209 (2018). multiple ancient steps of cluster disintegration during ascidian evolution. Zool. Lett. 3, 17 (2017). 62. Garcia-Fernandez, J. & Holland, P. W. H. Archetypal organisation of the Acknowledgements amphioxus Hox gene cluster. Nature 370, 563–566 (1994). We thank Dorit Hockman, Tetsuto Miyashita, and Megan Martik for lamprey husbandry 63. Ikuta, T., Satoh, N. & Saiga, H. Limited functions of Hox genes in the larval assistance, the Stowers Institute aquatics facility for zebrafish care, and Histology facility development of the ascidian Ciona intestinalis. Development 137, 1505–1513 for sectioning assistance. This study was conducted in accordance with the recommen- (2010). dations in the Guide for the Care and Use of Laboratory Animals of the NIH and 64. Lemons, D., Fritzenwanker, J. H., Gerhart, J., Lowe, C. J. & McGinnis, W. Co- protocols approved by the Institutional Animal Care and Use Committees of the Stowers option of an anteroposterior head axis patterning system for proximodistal Institute (Zebrafish, RK Protocol: #2015-0149), California Institute of Technology patterning of appendages in early bilaterian evolution. Dev. Biol. 344, 358–362 (lamprey, MEB Protocol: #1436-11), and the veterinary office of UZH and the Canton (2010). of Zürich. K.D.P., C.H., and C.M. were supported by Science Foundation (SNSF) 65. Albuixech-Crespo, B. et al. Molecular regionalization of the developing professorship (C.M. grant 170623), a Marie Curie Career Integration Grant from the amphioxus neural tube challenges major partitions of the vertebrate brain. European Commission (C.M. grant PCIG14-GA-2013-631984), the Swiss Cancer League, PLoS Biol. 15, e2001573 (2017). and the Canton of Zürich. H.J.P., B.D.K., L.M.W., and R.K. were supported by the 66. Manzanares, M. et al. Conservation and elaboration of Hox gene regulation Stowers Institute (R.K. grant #2013-1001). S.A.G. and M.E.B. were supported by grants during evolution of the vertebrate head. Nature 408, 854–857 (2000). R01NS086907 and R01DE017911. 67. Fisher, S. et al. Evaluating the biological relevance of putative enhancers using Tol2 transposon-mediated transgenesis in zebrafish. Nat. Protoc. 1, 1297–1305 (2006). Author contributions 68. Parker, H. J., Sauka-Spengler, T., Bronner, M. & Elgar, G. A reporter assay in H.J.P., M.E.B. and R.K. conceived this research programme. H.J.P., B.D.K., K.D.P. and C. lamprey embryos reveals both functional conservation and elaboration of H. conducted the experiments. S.A.G. performed lamprey husbandry. C.K.K. and C.M. fi vertebrate enhancers. PLoS ONE 9, e85492 (2014). developed the crestin transgenic zebra sh line and associated constructs. H.J.P., B.D.K., 69. Tahara, Y. Normal stages of development in the lamprey, Lampetra reissneri C.M., L.M.W., M.E.B. and R.K. analysed the data, discussed the ideas and interpretations, (Dybowski). Zool. Sci. 5, 109–118 (1988). and wrote the manuscript. 70. Nikitina, N., Bronner-Fraser, M. & Sauka-Spengler, T. The sea lamprey Petromyzon marinus: a model for evolutionary and . Cold Spring Harb. Protoc. 2009, pdb emo113 (2009). Additional information 71. Lowe, C. J., Tagawa, K., Humphreys, T., Kirschner, M. & Gerhart, J. Supplementary Information accompanies this paper at https://doi.org/10.1038/s41467- Hemichordate embryos: procurement, culture, and basic methods. Methods 019-09197-8. Cell Biol. 74, 171–194 (2004). 72. Beard, C., Hochedlinger, K., Plath, K., Wutz, A. & Jaenisch, R. Efficient Competing interests: The authors declare no competing interests. method to generate single-copy transgenic mice by site-specific integration in embryonic stem cells. Genesis 44,23–28 (2006). Reprints and permission information is available online at http://npg.nature.com/ 73. Smith, K. T., Martin-Brown, S. A., Florens, L., Washburn, M. P. & Workman, reprintsandpermissions/ J. L. Deacetylase inhibitors dissociate the histone-targeting ING2 subunit from the Sin3 complex. Chem. Biol. 17,65–74 (2010). Journal peer review information: Nature Communications thanks David McCauley, 74. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Elena Silva and the other anonymous reviewer(s) for their contribution to the peer review Methods 9, 357–359 (2012). of this work. Peer reviewer reports are available. 75. Buenrostro, J. D., Wu, B., Chang, H. Y. & Greenleaf, W. J. ATAC-seq: a ’ method for assaying chromatin accessibility genome-wide. Curr. Protoc. Mol. Publisher s note: Springer Nature remains neutral with regard to jurisdictional claims in fi Biol. 109,212921–21 29 29 (2015). published maps and institutional af liations. 76. Khan, A. et al. JASPAR 2018: update of the open-access database of transcription factor binding profiles and its web framework. Nucleic Acids Res. 46, D1284 (2018). Open Access This article is licensed under a Creative Commons 77. Meng, X., Brodsky, M. H. & Wolfe, S. A. A bacterial one-hybrid system for Attribution 4.0 International License, which permits use, sharing, determining the DNA-binding specificity of transcription factors. Nat. adaptation, distribution and reproduction in any medium or format, as long as you give Biotechnol. 23, 988–994 (2005). appropriate credit to the original author(s) and the source, provide a link to the Creative 78. Badis, G. et al. Diversity and complexity in DNA recognition by transcription Commons license, and indicate if changes were made. The images or other third party factors. Science 324, 1720–1723 (2009). material in this article are included in the article’s Creative Commons license, unless 79. Chang, C. P. et al. Meis proteins are major in vivo DNA binding partners for indicated otherwise in a credit line to the material. If material is not included in the – wild-type but not chimeric Pbx proteins. Mol. Cell. Biol. 17, 5679 5687 (1997). article’s Creative Commons license and your intended use is not permitted by statutory 80. Lu,Q.,Wright,D.D.&Kamps,M.P.Fusion with E2A converts the Pbx1 regulation or exceeds the permitted use, you will need to obtain permission directly from homeodomain protein into a constitutive transcriptional activator in human the copyright holder. To view a copy of this license, visit http://creativecommons.org/ – leukemias carrying the t(1;19) translocation. Mol. Cell. Biol. 14,3938 3948 (1994). licenses/by/4.0/. 81. Sacerdot, C., Louis, A., Bon, C., Berthelot, C. & Roest Crollius, H. Chromosome evolution at the origin of the ancestral vertebrate genome. Genome Biol. 19, 166 (2018). © The Author(s) 2019

NATURE COMMUNICATIONS | (2019) 10:1189 | https://doi.org/10.1038/s41467-019-09197-8 | www.nature.com/naturecommunications 15