
Neoblast specialization during regeneration of the planarian S. mediterranea

by Kellie M. Kravarik

B.S., Biological Sciences, Carnegie Mellon University (2011)
B.S., Decision Science, Carnegie Mellon University (2011)

Submitted to the Department of Biology in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Biology at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY

February 2018

Massachusetts Institute of Technology 2018. All rights reserved.

Author: Department of Biology, September 22, 2017 (signature redacted)

Certified by: Peter W. Reddien, Professor of Biology, Thesis Supervisor (signature redacted)

Accepted by: Amy E. Keating, Professor of Biology, Co-Chair, Biology Graduate Committee (signature redacted)

Note: Table 3.1 is missing from page 167.

Neoblast specialization during regeneration of the planarian S. mediterranea by Kellie M. Kravarik

Submitted to the Department of Biology on September 22, 2017, in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Biology

Abstract

Planarians are well known for their ability to regenerate an entire animal from small tissue fragments. Planarian regeneration requires a population of dividing cells called neoblasts that are distributed throughout the body. Historically, neoblasts have been considered a homogeneous population of stem cells capable of differentiating into all cell types. Most studies, however, analyze neoblasts at the population rather than the single-cell level, making it difficult to determine how heterogeneous the neoblast population is. A bulk RNA sequencing approach with expression screening identified 33 new transcription factors transcribed in specific differentiated cells that were also expressed in small fractions of neoblasts during regeneration. Transcription factors of distinct differentiated tissues were expressed in different subsets of neoblasts, whereas transcription factors expressed in the same differentiated tissues were expressed in the same neoblasts. These results suggest roles for neoblast-expressed transcription factors in the specification of distinct tissues. Furthermore, the transcription factors klf, Pax3/7, and FoxA were required for the differentiation of cintillo-expressing sensory neurons, dopamine-beta-hydroxylase-expressing neurons, and the pharynx, respectively.

The planarian nervous system is composed of numerous different cell types, providing an opportunity to study how neoblasts acquire the diverse cell fates that comprise a particular tissue. We used single-cell sequencing to identify the transcriptomes of hundreds of planarian neurons and neoblasts. Using computational analysis of these data, we identified the transcriptomes of several specific types of planarian neuronal cells, including cholinergic, dopaminergic, and serotonergic neurons, as well as glial cell types. In neoblasts, we identified a population of cells that expressed both markers of differentiated neurons and transcription factors expressed in various neural cell types, which we hypothesize to be neural specialized neoblasts. We found a number of unique populations of neural neoblasts that correspond with specific neural sub-types. Interestingly, however, these neural specialized neoblasts do not express a detectable unified regulatory network. These results are consistent with direct specification of neural sub-types in neoblasts and suggest that neoblasts do not differentiate down a highly hierarchical lineage path as has been described for many developmental lineages.

Thesis Supervisor: Peter W. Reddien Title: Professor of Biology

Acknowledgments

Thank you to my graduate advisor and thesis committee members: Professor Peter Reddien, Professor Frank Solomon, Professor Richard Hynes, and Professor Terry Orr-Weaver. Thank you to the members of the Reddien Lab past and present, including but not limited to M. Srivastava and J. Meisel for phylogenetic analysis advice, I. Wang for illustrations and cell FISH advice, J. van Wolfswinkel for cell FISH advice and transcriptome analysis, I. Wang and L. Cote for experimental assistance, O. Wurtzel for computational support, and all those who supported me and my work in countless ways. Thank you to A. Morishige for copyediting and technical support. The Howard Hughes Medical Institute, NIH (R01GM080639) and the National Science Foundation graduate research fellowship grant 1122374 provided essential financial support.

Contents

Acknowledgements

List of Figures

List of Tables

References to published work

1 Introduction
1.1 Cell differentiation and lineage specification during development
1.2 Neural Specification
1.3 Hematopoietic Stem Cell Specification
1.4 Regeneration through evolution
1.5 Stem cells in axolotls
1.6 Planarians as a model system
1.7 Neoblasts as the source of planarian regeneration
1.8 Neoblasts as pluripotent stem cells
1.9 Neoblasts as homogenous or specialized

2 Wound-induced neoblast specialization in Schmidtea mediterranea
2.1 Contributions
2.2 Summary
2.3 Introduction
2.4 Results
2.4.1 X1 neoblasts from wounded planarians express tissue-associated transcription factors
2.4.2 Known tissue-associated transcription factors are expressed in X1 neoblasts following wounding
2.4.3 Numerous transcription factors expressed in cells of the nervous system are expressed in neoblasts following wounding
2.4.4 Tissue-associated transcription factors are expressed in non-overlapping subsets of neoblasts from wounded planarians
2.4.5 Tissue-associated transcription factors are required for regeneration of distinct cell types
2.5 Discussion
2.6 Materials and methods
2.6.1 Animals and radiation treatment
2.6.2 mRNA purification and Illumina sequencing
2.6.3 RNAi
2.6.4 Statistical analysis
2.6.5 Accession numbers
2.6.6 Nomenclature
2.6.7 Whole-mount and cell in situ hybridizations
2.6.8 qPCR analysis
2.6.9 Supplemental Table 1

3 Single-cell sequencing reveals diverse neoblast specialization for the Schmidtea mediterranea nervous system
3.1 Contributions
3.2 Summary
3.3 Introduction
3.4 Results
3.4.1 Single-cell sequencing of neurons and neoblasts from the planarian head
3.4.2 Single-cell sequencing identifies the transcriptomes of dopaminergic, cholinergic, serotonergic, and afferent neurons
3.4.3 Cholinergic neurons are divided into pax6A and skil sub-types
3.4.4 Dopaminergic neurons express a gene-regulatory network involving ETS-family transcription factors
3.4.5 Afferent neurons express a gene regulatory network including prox-1, fli2, -1, and su(H)
3.4.6 Single-cell sequencing reveals planarian pigment cells and glial cells express many similar genes
3.4.7 Single-cell sequencing of >2C isolated neoblasts reveals expression of neural markers
3.4.8 Numerous transcription factors expressed in distinct cells of the nervous system are detected in distinct neoblasts
3.5 Discussion
3.5.1 Characterization of neuronal cell types
3.5.2 Evidence for neural neoblast specialization
3.5.3 Lack of evidence for pan-neural progenitor
3.6 Materials and methods
3.6.1 Animal treatment
3.6.2 Single-cell mRNA amplification
3.6.3 Single Cell Sequencing
3.6.4 Computational Analysis of clusters
3.6.5 in situ hybridization
3.6.6 Gene sequences
3.6.7 Acknowledgements
3.7 Annotation Genes
3.8 Computational Appendix 1
3.9 Computational Appendix 2

4 Discussion
4.1 Regeneration requires unique regulation of tissue production
4.2 Planarian neoblasts are the site of major cell differentiation in the animal
4.3 Planarian neoblasts specialize into many different cell types
4.4 Evidence that planarian neoblasts directly specify into cell types
4.5 Regeneration may decouple a need to coordinate cell differentiation with appropriate growth or timing

Bibliography

List of Figures

1-1 Invertebrate neurogenesis.
1-2 Mammalian neurogenesis and hematopoiesis.
1-3 Table of regenerating animals through animal phylogeny.
1-4 Planarian Organ Systems and Regeneration
1-5 Neoblast specialization in planarians

2-1 mRNA Sequencing Analysis of Sorted X1 Neoblasts from Regenerating Planarians
2-2 Expression of new transcription factors in intact animals (related to Figure 2-1)
2-3 Expression of new transcription factors in intact animals (related to Figure 2-1)
2-4 Expression of new transcription factors in intact animals (related to Figure 2-1)
2-5 Expression of new transcription factors in intact animals (related to Figure 2-1)
2-6 A candidate gene approach identified tissue-associated transcription factors expressed in X1 neoblasts from regenerating planarians
2-7 Expression of pharynx, gut, and muscle-associated transcription factors (related to Figure 2-6)
2-8 Expression of pharynx, gut, and muscle-associated transcription factors (related to Figure 2-6)
2-9 mRNA Sequencing Analysis of Sorted X1 Neoblasts from Regenerating Planarians
2-10 Expression of brain-associated transcription factors in intact and regenerating animals (related to Figure 2-9)
2-11 Expression of brain-associated transcription factors in intact and regenerating animals (related to Figure 2-9)
2-12 Non-overlapping expression of different tissue-associated genes in wounded X1 cells
2-13 Phylogenetic analysis of SMED-Pax3/7 and role of klf and pax3/7 in neural tissue specification (related to Figure 2-14)
2-14 Regeneration defects of neural cell types following tissue-associated RNAi
2-15 The role of FoxA in tissue specification (related to Figure 2-16)
2-16 FoxA RNAi disrupts pharynx regeneration
2-17 Cell FISH in sorted X1 cells using the RNA probes FoxA, pax3/7, and klf (related to Figure 2-18)
2-18 Proposed model: specialization of neoblasts into different lineages following wounding

3-1 Single-cell mRNA sequencing analysis of neurons and sorted neoblasts from planarian heads.
3-2 Related to Figure 3-1. Single-cell mRNA sequencing analysis of neurons and sorted neoblasts from planarian heads.
3-3 Neurotransmitter and transcription factor expression in neurons and sorted neoblasts from planarian heads.
3-4 Related to Figure 3-3. Neurotransmitter and transcription factor expression in neurons and sorted neoblasts from planarian heads.
3-5 Analysis of cholinergic neurons in sorted neoblasts and neurons from planarian heads.
3-6 Analysis of dopaminergic neurons in sorted neoblasts and neurons from planarian heads.
3-7 Related to Figure 3-6. Expression of genes enriched in isolated dopaminergic neurons.
3-8 Analysis of afferent neurons in sorted neoblasts and neurons from planarian heads.
3-9 Related to Figure 3-8. Expression of genes enriched in isolated afferent neurons.
3-10 Related to Figure 3-8. Expression of genes enriched in isolated afferent neurons.
3-11 Analysis of glial and pigment cells in sorted neoblasts and cells from planarian heads
3-12 Related to Figure 3-11. Expression of glial and pigment genes in isolated head cells.
3-13 Analysis of sorted neoblasts from planarian heads.
3-14 Related to Figure 3-13. Analysis of sorted neoblasts from planarian heads.
3-15 Analysis of neural transcription factor expression in sorted neoblasts and neoblasts in planarian heads.
3-16 Related to Figure 3-15. Analysis of neural transcription factor expression in sorted neoblasts and neoblasts in planarian heads.
3-17 Proposed model: specialization of neural neoblasts into different sub-lineages.

List of Tables

2.1 Supplementary Table: List of Genes that Showed Upregulation in X1 Cells following Wounding by mRNA Illumina Sequencing

3.1 List of genes and transcriptome IDs used to annotate single-cell sequencing data

References to Published Work

Chapter 2 largely appears in the following publication:

M. Lucila Scimone, Kellie M. Kravarik, Sylvain W. Lapan, Peter W. Reddien, "Neoblast Specialization in Regeneration of the Planarian Schmidtea mediterranea," Stem Cell Reports 3, 339-352 (2014).

Chapter 1

Introduction

1.1 Cell differentiation and lineage specification during development

Multicellular animals distribute the cellular requirements of their bodies to cells with specialized functions. For example, a muscle cell is created to contract and allow body movement, while a neuron is created to transmit and receive electrical signals, and a gut cell is created to break down and absorb nutrients from food. Thus, a key goal in the development of an animal is the creation of a large number of diverse cell types from what was once a single cell. This requires the coordination of two distinct processes: (1) specialization of a cell, and thus of the genes it expresses to produce specialized proteins, and (2) amplification of these specialized cells. Cell type specification can be thought of as the activation of a circuit of regulatory genes. These regulatory genes can coordinate the transcription of all genes needed for a specialized cell type, and are known as a cell's Gene Regulatory Network (GRN) [1]. Amplification of a cell type occurs either through controlled mitosis of a specialized cell to create two identical daughter cells, or through asymmetric division of a stem cell to create a specialized cell while maintaining the mother cell population. During development, most bilaterian animals undergo gastrulation to create ectoderm, endoderm and mesoderm, distinct layers of cells that amplify and specify to become distinct types of tissues throughout the animal. As the germ layers grow and further specify, distinct cell types such as neurons and gut cells are generated from distinct germ layers (ectoderm for neurons, and endoderm for the gut). After the adult body has formed, the demand for cells continues, with each cell type and organ requiring replacement cells due to cell death and damage.

We can use these known examples of cell differentiation to understand the molecular underpinnings of how to make these different cells in different contexts. How do cells determine their fate? How is a gene regulatory network for a specialized cell initiated? How is the appropriate number of specialized cells created?

1.2 Neural Specification

The nervous system of bilaterian animals is a diverse network of neurons integrated with one another and with their surrounding tissue. The emergence of the earliest nervous system is thought to have occurred in the last common ancestor of metazoans [2], and centralization increased as organisms became more complex [3]. All nervous systems require diversity of neuronal function. This diversity arises through a complex process that includes differentiation of neurons with distinct gene regulatory networks, growth and maintenance of diverse cell structures, establishment of distinct connections with the surroundings, and communication via specific neurotransmitters. As such, the nervous system is an example of the extreme diversity possible in a single cell type.

Consistent with the challenge of making such different cell sub-types, there are diverse ways to specify neurons across animals, though common themes are frequently maintained. The cnidarian Nematostella begins neurogenesis during gastrulation, when neural progenitor cells (NPCs) that express NvSoxB(2), a sox family transcription factor, undergo asymmetric cell divisions to produce sensory cells, ganglion cells, and nematocytes, the three neural cell types in the organism [4]. It is an open question whether each NvSoxB(2)+ NPC is multipotent and can produce daughter cells of each cell type, or whether the NPC population is a mix of unipotent NPCs for each cell type (Figure 1-1A). However, it is clear that Nematostella up-regulates a conserved circuit of pro-neural bHLH transcription factor homologs in neural progenitors that are necessary and sufficient for continued neurogenesis [5].

[Figure 1-1, panels A-C. Panel A title: "NvSoxB neuroblasts produce all neural cells"; adapted from Rentzsch, F., Layden, M. & Manuel, M. The cellular and molecular basis of cnidarian neurogenesis. WIREs Dev Biol 6, e257 (2016). Panel B: adapted from Hobert, O. Neurogenesis in the nematode Caenorhabditis elegans. WormBook 1-24 (2010). doi:10.1895/wormbook.1.12.2. Panel C title: "Temporal patterning of NB and INP cells produces diversity"; adapted from Bayraktar, O. A. & Doe, C. Q. Combinatorial temporal patterning in progenitors expands neural diversity. Nature 498, 449-455 (2013).]

Figure 1-1: Invertebrate neurogenesis. A. Diagram of Nematostella neuroblast regulation. Top Model: NvSoxB+ neuroblasts are each individually multipotent and able to produce unipotent progenitors and differentiated neural cell types (Bottom). Bottom Model: Alternatively, neuroblasts may be a heterogeneous mixture of unipotent or bipotent cells that are individually restricted, but collectively able to produce all neural diversity (Top). Adapted from [4]. B. Lineage of ABal fate in C. elegans embryogenesis. Though one daughter makes a clonal population of neurons in the worm (ABala, progeny marked in red), the other daughter (ABalpp) gives birth to progenitors of neurons (red), as well as hypodermal or epidermal fate (yellow) and mesoderm (green). Adapted from [6]. C. Diagram of simultaneous temporal patterning of Drosophila type II neuroblasts (black to gray) and their intermediate progenitor progeny (blue, pink, green), which produces two axes of diversity in final neuron production. Adapted from [7].
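The two axes of diversity in panel C can be made concrete with a small enumeration. The following is a minimal sketch, not a model taken from the cited work: the INP factor order (Dichaete, Grh, Eyeless) follows the text of this section, while the neuroblast window labels `NB-early`, `NB-mid` and `NB-late` are hypothetical placeholders, since the neuroblast temporal-identity cascade is not specified here. Each pairing of a neuroblast window with an INP factor is treated as one candidate neuronal identity.

```python
from itertools import product

# Hypothetical placeholder labels for type II neuroblast temporal windows;
# the actual neuroblast cascade is not given in the text.
NB_WINDOWS = ["NB-early", "NB-mid", "NB-late"]

# INP temporal factors, in the order named in the text.
INP_FACTORS = ["Dichaete", "Grh", "Eyeless"]


def combinatorial_identities(nb_windows, inp_factors):
    """Return every (neuroblast window, INP factor) pairing."""
    return list(product(nb_windows, inp_factors))


identities = combinatorial_identities(NB_WINDOWS, INP_FACTORS)
for nb, inp in identities:
    print(f"{nb} x {inp}")
print(len(identities), "candidate identities")
```

With three states on each axis the product gives nine pairings, which is the sense in which two short temporal series multiply rather than add.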

In the sea star P. miniata, neural specification occurs through the expression of several neurogenic transcription factors within the ectoderm, such as soxb1, another sox family transcription factor [8]. As the animal continues to develop, Wnt and Notch signaling pathways restrict the neurogenic ectoderm, and allow a forkhead transcription factor homolog, foxg, to define the cells that will specify the ciliary band nervous system [8]. Despite these data indicating zonal restriction of soxb1 precursor cells, the potential of any individual neuroblast is unknown. Thus, it remains possible that specific types of neurons are pre-determined within the soxb1 compartment, or that soxb1 sub-specialization relies on temporal restriction of fate.

The nematode C. elegans has a stereotyped development, and the lineage of each cell in the adult is known. Instead of specifying neuroblasts that produce all neurons, neurons are largely non-clonally specified and can be derived from cells that also produce other ectodermal cells or even mesodermal cells [6]. For example, the cell ABala produces two daughter cells, of which one produces exclusively neurons and one produces a mix of neurons and mesoderm (Figure 1-1B). Furthermore, neurons of the same type, such as ventral nerve cord motor neurons or dopaminergic neurons, are specified through distinct lineages rather than from a common source. Instead, neurons are specified by a combinatorial code of transcription factors that define a terminal fate. For example, lin-32 and the achaete-scute-like bHLH transcription factor hlh-14 are required for neural specification of several cells, while the lin-22/hairy-type bHLH transcription factor represses a neural fate to allow specification of hypodermal (epidermal) fate. Furthermore, loss of the transcription co-activator p300/cbp-1 allows neural fate in many cell types. Together, these data raise the possibility that neural fate can be specifically induced, but also may be a default state in the absence of repression.

In Drosophila melanogaster, the embryonic nervous system is specified from neuroblasts (NBs) that delaminate from a neuroepithelial tissue [9]. They are bipotent cells that divide asymmetrically to create one daughter cell specified toward neural fate. This neural fate is selected by the expression of pro-neural bHLH transcription factors such as achaete-scute and lethal of scute, while low levels of transcription factor expression allow neuroectodermal maintenance in the neuroblast itself. These NBs are highly mitotic, and produce hundreds of progeny without losing their capacity for neural differentiation. The potential of these NBs and their ability to differentiate into specific neural cells has been studied at a very fine resolution. Because neuroblasts produce spatially stereotyped daughter cells, and because Drosophila is very amenable to combining clonal analysis with high-resolution imaging, it was discovered that NBs alter their ability to make specific neural types over time using a stereotyped progression of transcription factors. This temporal fate restriction is a major driver of neural diversity.

As embryonic NBs delaminate from the neuroectoderm and acquire further fate, the SoxB genes SoxN and Dichaete are up-regulated [10]. At the same time, a temporally regulated set of transcription factors controls daughter cell fate.
It has been clearly shown that single NBs cycle through sequential expression of the Ikaros family transcription factor hunchback (Hb), the zinc-finger transcription factors kruppel (Kr) and castor (cas), and the POU domain protein Pdm [11]. These transcription factors cycle on and off as embryogenesis proceeds to induce a ganglion mother cell that is specified towards different fates, with the nuclear receptor homolog seven-up (svp) controlling the transition of Hb to Kr.

During larval neurogenesis in Drosophila, this same theme of temporal patterning is recapitulated and varied to produce an increasing diversity of neurons in the optic lobe, the mushroom body, and other neural structures. Type I larval neuroblasts expressing Cas give rise to neurons expressing a new transcription factor, Chronologically inappropriate morphogenesis (Chinmo). However, as larval development proceeds, svp again controls a transition in birthed neurons from Chinmo to Broad-complex genes.

A second type of Drosophila neuroblast, the type II neuroblast, follows a more complex specification program [7]. Type II neuroblasts produce an intermediate neural progenitor (INP) daughter cell, which itself can divide asymmetrically for several rounds of mitosis. This INP provides expansion of type II progeny, but it also provides another layer of temporal patterning [7]. Even as the type II neuroblasts cycle through a still mysterious temporal-identity factor cascade to produce distinct INPs over time, the INPs begin their own cycle of identity factor expression through the transcription factors Dichaete, Grh and Eyeless over time. This combined temporal patterning, together with the pre-existing patterning of the embryonic neuroectoderm, provides a source of expanded neural diversity through development (Figure 1-1C) [7].

Finally, the larval optic-lobe neuroblasts also use a sequence of temporal identity factors to expand into more than 70 different sub-types across 40,000 cells by expressing a sequence of Homothorax, Eyeless, Sloppy paired, Dichaete and Tailless [11]. Together, this shows the robust neural diversity possible through coordinated variance of gene regulatory networks in both space and time from just a small number of neuroblasts.

Mammalian neurogenesis is perhaps the most extreme example of cell diversification, with hundreds of thousands of different neurons developing throughout the cerebral cortex, optic lobe, and other brain regions.
Although thousands of different cell types need to be generated during development, it is not known exactly how many distinct neural types truly exist or even what mechanisms underlie their production in such great numbers [12, 13]. As in Drosophila, in mouse and human neurogenesis it is clear that the neuroectoderm residing in the neural tube is pre-patterned by many factors, such as signaling pathways from across the embryo and more local signals such as those from the notochord [9, 14]. This establishes spatial zones of progenitors that express different transcription factors and will become coarsely different neuronal cell types.

In the vertebrate retina, a mechanism similar to temporal regulation has been proposed to generate diverse cell types from multipotent neural progenitors. Homologs of Hb, Kr, Pdm, and Cas are expressed sequentially in these progenitors over time [11]. Further, Ikaros, a homolog of Hb, is required for early-born neurons [15]. However, there is clearly more complexity to this process in mammals. Instead of highly stereotyped birth specification, in vivo lineage tracing of single progenitors has revealed more stochastic patterns than in the bulk population [16], suggesting that there may be heterogeneity within retinal progenitors. Indeed, some retinal progenitors appear to favor production of specific sub-types; though, in general, earlier retinal progenitors do appear more self-renewing than later progenitors, even at a single-cell level [11].

The mammalian cortex is the most complex neural organ in metazoans, with tens of thousands of different cell types expanding over development. In the cortex, neocortical progenitors are generated from radial glia residing in the ventricular zone (VZ) and then the sub-ventricular zone (SVZ) of the developing neural tube [17]. These radial glial (RG) cells span from the apical layer attached to the ventricle to the basal surface of the neural tube. Through development, this plate of cells will expand into several layers of distinct neural types, birthed in an "inside out" order with older neurons at the base of the column and newer neurons migrating to the surface of the neocortex, ultimately growing the cortex upward over time [13]. As a whole, the radial glial population will generate waves of specific neurons that correspond to these sequential layers of neuronal diversity, before switching to a glial fate and specifying the macroglial cell types that populate the cortex [11].

Many aspects of radial glial specification use themes similar to neurogenesis in other organisms. There are transcription factor networks that specify particular cell fates, such as FezF, which specifies deeper-layer cortical fates [18, 19].
Homologs of other genes involved in neurogenesis in many species also play a role; each is expressed in specific layers of differentiated neurons in the cortex.

In general, radial glial cells divide asymmetrically, with one daughter cell delaminating and migrating basally along the tracts provided by radial glial cell bodies [13]. These radial glial progeny, the Tbr2+ intermediate (or transient) progenitors, can themselves divide a number of times, providing further expansion of the neuronal population. However, the potential of such intermediate progenitors is thought to be restricted to specific fates [20], and they more commonly divide symmetrically to produce two neuronal daughter cells. This allows such progeny to expand their own numbers by producing exclusively intermediate progenitors for a few cell divisions before proceeding through terminal neural differentiation.

The observation that radial glial cells produce sequential layers of distinct cell types over time developed into a historically favored model of cortical neurogenesis as a progressive restriction of potential (Figure 1-2A, top). In this model, a single radial glial cell can produce neurons of every layer [13], with its gene regulation and potential controlled over time and gradually narrowing through development. Indeed, much data strongly supports this model. Lineage tracing experiments have shown that single radial glia can produce neurons of every lineage [21], and clonal analysis induced at increasingly late time-points correlates with specification of upper-layer neuronal differentiation [13].

[Figure 1-2, panels A and B. Recoverable labels: HPSC; Progenitors; Erythroid, Myeloid, Lymphoid; Developmental Time. Panel B adapted from Hamey, F. K. & Göttgens, B. Demystifying blood stem cell fates. Nat Cell Biol 19, 261-263 (2017).]

Figure 1-2: Mammalian neurogenesis and hematopoiesis. A. Generating diversity during mammalian corticogenesis. Top: basic schematic in which radial glial cells in the ventricular zone (VZ) and then the sub-ventricular zone (SVZ) produce neural progenitors of distinct types in a sequential manner. Bottom: recent evidence points to the existence of cux-2+ radial glial cells (Blue) that emerge out of the SV but remain inactive until production of upper layer neurons (also Blue). B. Diagram of hematopoiesis. Historically, there was a strict hierarchical model that governed how a HPSC can differentiate into the many cell types of the blood and lymphatic system (Top). Recent evidence suggests that at least some branches of this model are more plastic, and that in vivo adult humans retain few of the intermediate progenitor states, suggesting that more direct differentiation may be possible (Bottom). Adapted from [22].
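The contrast drawn in panel B can be sketched as a small directed graph. This is an illustrative sketch only: the node names follow the progenitor classes named in Section 1.3, the edges encode the classical hierarchy as described there (with lineage-restricted progenitors collapsed into their mature cell types for brevity), and a single `MPP -> MEP` edge stands in for the reported direct route that skips the CMP. The function names `shortest_path_length` and `with_shortcut` are assumptions introduced for this example.

```python
from collections import deque

# Classical hierarchy as described in Section 1.3 (parent -> children).
# Lineage-restricted progenitors are collapsed into mature cell types.
CLASSICAL = {
    "HPSC": ["MPP"],
    "MPP": ["CLP", "CMP"],
    "CMP": ["MEP", "GMP", "DC progenitor"],
    "CLP": ["B cell", "T cell", "NK cell", "DC progenitor"],
    "MEP": ["megakaryocyte", "erythrocyte"],
    "GMP": ["granulocyte", "macrophage"],
}


def shortest_path_length(graph, start, goal):
    """Breadth-first search; returns the number of edges, or None if unreachable."""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for child in graph.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append((child, dist + 1))
    return None


def with_shortcut(graph, parent, child):
    """Return a copy of the graph with one extra parent -> child edge."""
    revised = {key: list(children) for key, children in graph.items()}
    revised.setdefault(parent, []).append(child)
    return revised


# Hypothetical revised graph: MPPs may reach the MEP without a CMP step.
REVISED = with_shortcut(CLASSICAL, "MPP", "MEP")

print(shortest_path_length(CLASSICAL, "HPSC", "erythrocyte"))  # 4 edges
print(shortest_path_length(REVISED, "HPSC", "erythrocyte"))    # 3 edges
```

In this toy representation the classical route to an erythrocyte passes through four transitions, and the single added edge shortens it to three, which is one way to picture "more direct differentiation" without discarding the rest of the hierarchy.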

If radial glia are a homogeneous population that cycles through gene regulatory networks, then there should be no differences in the potential of different radial glial cells. However, a population of radial glia was recently discovered for specific superficial-layer neuron types. They are located in a basal layer in the ventricular zone (SV) and they express the transcription factor Cux-2 [20]. These Cux-2+ cells are detectable at very early stages of corticogenesis, and yet they only become mitotically active at later stages of development when later layers of neurons are being produced, resulting in production of exclusively upper-layer neurons (Figure 1-2A, bottom). These data suggest that there may be some radial glial diversity, and a model of heterogeneous pre-specified radial glia has been proposed.

In fact, the models need not be mutually exclusive, as the extreme diversity demanded during corticogenesis may require aspects of both the progressive restriction of potential and the pre-determined models. Indeed, there can be both sequential diversification and specialized pre-patterning or bias for some glial sub-populations. More data are needed to test whether all radial glial cells are competent to produce multiple layers, or whether there is a mix of potential and pre-specification throughout the radial glial population of the sub-ventricular zone (SVZ) and VZ.

1.3 Hematopoietic Stem Cell Specification

The blood and lymphoid cell types of vertebrates are continuously derived from a population of multipotent stem cells that reside in adult bone marrow. Decades of work studying the hematopoietic system have established it as the preeminent model of stem cell biology. During development of the mouse, embryonic myeloid progenitors, lymphoid progenitors and neonatal hematopoietic stem cells (HPSCs) emerge at several sites including the yolk sac, the placenta, and around the developing mesoderm and aorta [23]. These cells enter the circulation and colonize the fetal liver, followed by migration to the spleen and thymus in later embryogenesis. The adult hematopoietic system produces more than 10 distinct mature cell types, including red blood cells (erythrocytes), populations of white blood cells, and lymphocytes [24]. Much research has established robust cell surface markers of HPSCs [25], allowing isolation and the ability to study colony formation of isolated cells, as well as potential in vivo through transplantation. From this collective work, we know that the HPSC population produces all of these lineages, and can be transplanted into lethally irradiated and otherwise HPSC-deficient mammalian hosts for both research and clinical treatments.

The HPSC lineage produces six major classes of cell types: red blood cells (erythrocytes), megakaryocytes (platelets), mast cells, T- and B-lymphocytes, natural killer (NK) cells, and dendritic cells (DCs) [26]. Historically, these cells were thought to differentiate down a Waddington Landscape [27] of sequentially and hierarchically restricted choices, narrowing from a pluripotent state to a series of increasingly restricted fates [22]. Much evidence underpinned this model: cell surface markers were established for each of these increasingly restricted states, and cells isolated with those markers differentiated into the expected cell types [28].
This hierarchical differ- entiation model places the HPSC at the top of a path whereby it produces daughter cells expressing a series of multipotent progenitors (MPPs) which then divide into the oligopotent common lymphoid progenitor (CLP) and the common myeloid pro- genitor (CMP), representing a bifurcation in fate [221. The CMP then gives rise to

28 the multipotent progenitors of megakaryocytes/erythrocytes (MEPs) and the gran- ulocyte/macrophage progenitor (GMPs), which then produce lineage restricted pro- genitors of platelets, erythrocytes, granulocytes and macrophages. The CLP directly produces lineage-restricted progenitors for B- and T-Cells, as well as an NK-cells. Finally, dendritic cell progenitors are produced by both the CMP and CLP (Figure 1-2B). In spite of extensive work on understanding HPSC differentiation, recent work sug- gests that this strict hierarchical picture may be more complicated [22, 23, 29, 30]. The latest evidence now points to a revision in the CMP to MEP transition, with findings that some multipotent HPSC progenitors can directly differentiate into the MEP and skip a CMP intermediate 130]. Further, it is clear that MMP cells retain unappreciated heterogeneity in and cell surface markers [311, raising the possibility that there may be occult but pre-fated progenitor populations within the HPSC, MPP, MEP, CMP, CLP and CMP populations. Indeed, HPSCs display aspects of heterogeneity. Clonal HPSC transplants in mice showed several classes of diverse repopulation kinetics, and showed that an isolatable population displayed preference for a subset of those kinetics that was maintained even through serial transplantation [23]. Given the diverse developmental sources of the HPSC population in the embryo, this heterogeneity may reflect presence of progenitors for more limited populations of hematopoietic cells. Single cell analysis, including single-cell cultures and single-cell RNA sequencing, has increased the resolution available for cataloging the HPSC lineage, and provided new interpretations on how differentiation of HPSCs occurs [25, 30, 32, 33]. First, multiple attempts to using single-cell RNAseq on isolated blood cells have identified a wide array of new sub-types for many cell types. 
In particular, there is much more diversity in the types of DCs than previously appreciated. Another consequence of single-cell RNA sequencing has been the description of the HPSC and MPP populations. Instead of finding clear populations corresponding to classic cell-surface defined populations, groups are now finding that there appears to be a continuum of transcriptome states present in the HPSCs/MPP populations, including clear evidence of fate priming

29 within progenitor populations 126, 301 as observed by early expression of progenitor markers. Indeed, in adult hematopoietic tissue, there was little evidence of bi- or tri-potent progenitor populations such as the CMP, CLP and CMP, but rather a collection of pre-fated unipotent progenitors. These data suggest that instead of a hierarchical model, the HPSC system may differentiate through a gradual uni-lineage commitment that begins to be established much earlier than appreciated (Figure 1-1B).
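The classical hierarchy described in this section can be written down as a small tree, which makes explicit what "restricted potential" means at each node. The following is a minimal sketch in Python, assuming the lineage relationships exactly as stated in the text. The dictionary name, the helper function, and the assignment of platelets and erythrocytes to the MEP and of granulocytes and macrophages to the GMP (implied by the progenitor names) are illustrative choices, not taken from the cited work.

```python
# Classical (hierarchical) model of hematopoiesis as described in the text.
# Keys are progenitors; values are the populations they give rise to.
# Names and structure are an illustrative assumption, not a published data set.
CLASSICAL_HIERARCHY = {
    "HPSC": ["MPP"],
    "MPP": ["CLP", "CMP"],
    "CMP": ["MEP", "GMP", "DC progenitor"],
    "CLP": ["B cell", "T cell", "NK cell", "DC progenitor"],
    "MEP": ["platelet", "erythrocyte"],
    "GMP": ["granulocyte", "macrophage"],
}


def terminal_fates(cell, tree=CLASSICAL_HIERARCHY):
    """Return the set of end points reachable from `cell` in the tree."""
    children = tree.get(cell)
    if not children:
        # A population with no listed descendants is treated as an end point.
        return {cell}
    fates = set()
    for child in children:
        fates |= terminal_fates(child, tree)
    return fates


if __name__ == "__main__":
    for progenitor in ("HPSC", "CMP", "CLP", "MEP"):
        print(progenitor, sorted(terminal_fates(progenitor)))
```

Under this representation the CMP and CLP branches share only the dendritic cell progenitor. The single-cell data discussed above would instead correspond to a much flatter structure, in which unipotent branches are already distinguishable near the top of the tree.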

[Figure 1-3: table of regenerating animals; rows not recoverable from the source scan. Column headings: stem cell type, distribution, GMP markers, regeneration.]

Adapted from Gehrke, A. R. & Srivastava, M. Neoblasts and the evolution of whole-body regeneration. Curr Opin Genet Dev 40, 131-137 (2016).

Figure 1-3: Table of regenerating animals through animal phylogeny. Adapted from [34].

1.4 Regeneration through evolution

Regeneration is the ability to replace missing tissue after injury. In contrast, healing of an injury simply closes wounds and prevents further tissue loss. For adult organisms that face the threat of predatory injury, the ability to regenerate increases their fitness in a harsh environment. Regeneration also allows a heightened ability to heal following illness or aging. But regeneration is also a challenging process for multicellular organisms to manage: it requires a system for on-demand development. Whereas development is frequently stereotyped, regeneration is random and undertaken in the adult habitat. Regeneration requires not just activation of new cell specification, but also mechanisms for stopping the process and integrating new cells with old structures. Thus, studying regenerative mechanisms, in comparison to developmental systems, provides an opportunity to understand cell differentiation on demand, within the infinite variation of adult tissue contexts.

Indeed, many animals retain the ability to replace missing tissue as adults, and this process is not restricted to specific clades of metazoan evolution (Figure 1-3). The cnidarians Hydra and Nematostella can regenerate oral and aboral structures. Tapeworms, acoels, and planarians can undergo regeneration of large body structures [35]. Annelids can regenerate body segments. Later-branching animals have less ability to regenerate their whole bodies, but nonetheless can frequently regenerate organs throughout life, as well as digit tips or limbs during specific developmental time periods [36]. For example, zebrafish can regenerate specific organs and structures, including fins, notochord, scales, and liver [37]. Spiny mice can regenerate large parts of their skin and whole parts of their ears throughout life [38]. Axolotls, a group of neotenic salamanders, can regenerate limbs, tails, spinal cords, and even neurons in the brain [36, 39]. Even humans have organs that can regenerate, and neonates demonstrate increased healing without scarring, such as digit-tip regeneration in young babies and wound closure from fetal operations on the spinal cord [40].

Although robust throughout animals, these regenerative abilities also highlight a split during evolution: early-branching lineages commonly retain large populations of pluripotent or presumed pluripotent stem cells that frequently express a network of genes termed the Germline Maintenance Program, including genes like piwi and vasa that form chromatoid bodies and perform DNA surveillance and protection from transposon-mediated mutation [34]. By contrast, later-branching bilaterians such as vertebrates, with their further compartmentalization of body plans into somites and enclosed organs, display more organ- and tissue-restricted regeneration and possess multipotent and tissue-specific stem cells or the ability to dedifferentiate into such a cell.

These common regenerative capabilities provide the elegant hypothesis that many examples of regeneration in animals, like development, might be conserved from a last common ancestor that also retained regenerative mechanisms. They also provide a diversity of ways to solve a regenerative question under the varied constraints of maintaining pluripotency versus overcoming compartmentalization. To better understand this process, and to dissect the common molecular mechanisms required for regeneration, we must therefore study multiple examples of regenerating animals to uncover the generalized concepts each may have.

1.5 Stem cells in axolotls

Axolotls are salamanders that do not metamorphose into their adult terrestrial form, but grow to reach sexual maturity. They are water-dwelling amphibians, and they possess an immense ability to regenerate tissues of their body. They are capable of regenerating whole limbs, as well as substantial parts of their nervous system, including the spinal cord and neurons in their brain [36, 41].

The source of regenerating tissue in axolotls has been intensely studied. Instead of maintaining a population of pluripotent stem cells, axolotls are able to maintain progenitors of restricted cell types, or induce such progenitors from differentiated cell types. The spinal cord of axolotls can regenerate from amputation [42-44], and lineage tracing has shown that resident spinal cord cells, including radial glial cells, at the site of injury regrow into the regenerating spinal cord. Interestingly, these cells also have the capacity to regenerate sporadic cells outside of the spinal cord, including blood vessels, muscle and cartilage, suggesting that the neural cells may have plasticity in their regenerative capacity. It is also known that these neural stem cells require expression to regenerate the spinal cord, but not the mesoderm of the amputated tail [45]. In contrast, regenerated muscle is produced by resident PAX7+ satellite cells, a muscle stem cell population [46, 47]. In particular, there are two populations of these satellite cells that contribute different tissues during regeneration. Pax7B+ cells can repair damaged fibers, while Pax7A+ cells more frequently form new fibers. Further, there is evidence in digit tip regeneration that the source of bone may be endothelial in origin, pointing to an endothelial progenitor cell type [36].

Axolotls can also regenerate large portions of their central nervous system, producing electrophysiologically active neurons and the correct neural diversity required in the injury site both before and after induced metamorphosis [39]. Notably, early after injury, many BrdU+ cells that divide as a result of injury express Sox2 and Gfap, which are markers of ependymoglial cells [39]. Ependymoglial cells are the source of neural regeneration in newts and zebrafish, and may be the source of regenerating neural diversity in axolotls as well.

1.6 Planarians as a model system

Planarians are lophotrochozoan flatworms that have an immense capability for regeneration: they can regenerate from virtually any injury, and are able to completely replace any tissue or organ structure (Figure 1-4B). They have a musculature that functions as a skeleton, a tubular epithelial gut, a pharynx that they use to feed and defecate, protonephridial excretory organs, and a central nervous system with bilobed cephalic ganglia and two ventral nerve cords that run the length of the body (Figure 1-4A).

[Figure 1-4: images not recoverable from the source scan. Panel A labels: muscle; excretory system; gut and pharynx; CNS and anterior pole. Panel B: regeneration time course.]

Adapted from Newmark, P. A. & Alvarado, A. S. Not your father's planarian: a classic model enters the era of functional genomics. Nat Rev Genet 3, 210-219 (2002).

Figure 1-4: Planarian Organ Systems and Regeneration. A. Planarians are lophotrochozoan flatworms with diverse organ systems. They have a robust musculature (red) that acts as a skeletal system, an excretory organ system distributed throughout their body (blue), a tubular gut and pharynx for feeding (green), as well as bilobed cephalic ganglia (purple) and specialized structures such as the anterior pole (pink). Adapted from [48]. B. Planarians can regenerate their entire body in 7-14 days. Time course images adapted from [49] with permission.

1.7 Neoblasts as the source of planarian regeneration

In all of these model organisms, there must be a source of the missing tissue. This could come in the form of (1) tissue-specific resident stem cells, (2) activation of a quiescent system or , or (3) a resident population of pluripotent stem cells. In the planarian Schmidtea mediterranea, initial work identified a histologically unified population of small round cells [50-53] in which all mitotic events were observed. These cells were termed neoblasts, and it was found that they were absolutely required for regeneration of the animal [54-56]. Irradiation ablates both these cells and the ability of the animal to regenerate, such that animals can close their wounds but cannot replace any missing tissue. This includes the ability to replace tissue lost to cell turnover, and therefore elimination of the stem cell compartment is lethal in < 10 days.

Earlier work in several species of planarians indicated the presence of mesenchymal cells (located in between the organs) throughout the body. The first description of these cells is from Dubois [57, 58], who identified neoblasts as cells with a low cytoplasm/nucleus ratio, i.e., mostly nucleus. These cells could be isolated from the animal and plated on slides, where some could be observed to divide [50-53]. They were also observed in whole animals sectioned and imaged using electron microscopy. Again, all dividing cells were observed to have "neoblast" cytological characteristics.

1.8 Neoblasts as pluripotent stem cells

As research moved into the molecular era, it became possible to isolate neoblasts through fluorescence-activated cell sorting of cells with >2C DNA content [56], and mitosis could be visualized by BrdU pulse-chase labeling [54]. Through these methods, it was then discovered that neoblasts express members of a germline multipotency program that are required for their expansion and maintenance, such as piwi, bruli, tudor, pumilio and vasa [59-63]. Additionally, it was discovered that single neoblasts transplanted into irradiated hosts, single neoblasts remaining after sub-lethal irradiation, and isolated >2C DNA content cells transplanted in bulk could all engraft and differentiate into all tissues in the body [64].

1.9 Neoblasts as homogenous or specialized

These data were consistent with a model in which neoblasts are a uniform population of pluripotent stem cells that are distributed throughout the body (Figure 1-5A). Yet, it was soon noticed that neoblasts also up-regulated genes associated with differentiated tissue. A study of protonephridia regeneration found that genes required for protonephridia cell differentiation were expressed in mitotic neoblasts expressing stem cell marker genes [65] (Figure 1-5B). Further studies of neoblast gene expression found that many transcription factors expressed in differentiated cells were also expressed in neoblasts. Studies of the nervous system found that the transcription factors single-minded and Lhx5-1 were both required for regeneration of specific neural cell types, and were also expressed in neoblasts near the brain during regeneration [66-69]. Finally, studies of eye regeneration used lineage tracing through gene expression signatures to find that photoreceptor neurons and optic cup cells were specified in neoblasts through expression of a gene regulatory network including ovo, eya, and dlx [70, 71] (Figure 1-5B). These data supported the hypothesis that neoblasts may in fact be a heterogeneous mixture of stem cells and specified progenitor cells of many cell types (Figure 1-5C).

[Figure 1-5: diagrams not recoverable from the source scan. Panel A: homogenous neoblast stem cells; differentiation occurs in post-mitotic progeny. Panel C: specialized neoblast stem cells; differentiation continues in post-mitotic progeny. Panel B: transcription factors whose expression was observed in isolated neoblasts or smedwi-1+ cells, listed by major site of expression:
Eye: ovo, sp6-9, eya (Lapan, 2011; Lapan, 2012)
Protonephridia: POU2/3, Six1/2-2, sall (Scimone, 2011)
Serotonergic neurons: Lhx1/5-1, pitx (Currie, 2013; Marz, 2013)
Subset of ChAT+, gad+, tbh+, th+, and tph+ neurons: sim, coe, hesl (Cowles, 2013)
TrpA+ neurons: ap-2 (Wenemoser, 2012)]

Figure 1-5: A. Historically emerging model that neoblasts are homogenous pluripotent stem cells. B. Diagram of evidence against neoblasts as a homogenous stem cell population. Small populations of isolated neoblasts or smedwi-1+ neoblasts in vivo have been shown to co-express transcription factors expressed in, and required for regeneration of, the eye, the protonephridia, serotonergic neurons, various neural sub-populations, and TrpA+ neurons. Adapted from [48]. C. Hypothesis that neoblasts are a population of heterogeneous progenitor cells.

Chapter 2

Wound-induced neoblast specialization in Schmidtea mediterranea

2.1 Contributions

M. Lucila Scimone, Kellie M. Kravarik, Sylvain W. Lapan, and Peter W. Reddien were responsible for the overall study design and analysis. M.L.S. and K.M.K. performed RNAi experiments; K.M.K. and M.L.S. performed X1 FISH; K.M.K. performed X1 RNA sequencing; M.L.S. and S.W.L. performed in vivo neoblast FISH; M.L.S. and S.W.L. performed the neural transcription factor screen; and K.M.K., M.L.S., and S.W.L. performed other transcription factor identification. M.L.S., K.M.K., S.W.L., and P.W.R. wrote the manuscript.

2.2 Summary

Planarians can regenerate any missing tissue in a process requiring dividing cells called neoblasts. Historically, neoblasts have been considered a homogeneous stem cell population. Most studies, however, analyzed neoblasts at the population rather than the single-cell level, leaving the degree of heterogeneity in this population unresolved. We combined RNA sequencing of neoblasts from wounded planarians with expression screening, and identified 33 new transcription factors transcribed in specific differentiated cells and in small fractions of neoblasts during regeneration. Many neoblast subsets expressing distinct tissue-associated transcription factors were present, suggesting candidate roles in specification of multiple lineages. Consistent with this possibility, klf, Pax3/7, and FoxA were required for the differentiation of cintillo-expressing sensory neurons, dopamine-β-hydroxylase-expressing neurons, and the pharynx, respectively. Together these results suggest that specification of cell fate for most to all regenerative lineages occurs within neoblasts, with regenerative cells of blastemas being generated from a highly heterogeneous collection of lineage-specified neoblasts.

2.3 Introduction

Planarians are flatworms capable of regenerating any missing tissue after injury. Regeneration in the planarian Schmidtea mediterranea requires a population of small mesenchymal cells called neoblasts, which are the only dividing cells of the adult animal. Irradiation eliminates neoblasts, blocking regeneration and tissue turnover [59]. Following injury, neoblasts rapidly divide throughout the animal, with mitotic numbers peaking at six hours after wounding. If the wound requires the replacement of missing tissue, a second peak of neoblast proliferation occurs at 48 hours [72]. At this time, neoblasts accumulate at the wound site and their progeny form an unpigmented bud of regenerated tissue called the blastema.

Recently, two neoblast models for planarian regeneration have been proposed: the naive and the specialized models [73]. The naive model posits that all neoblasts are stem cells with the same potential, and are therefore a largely homogeneous population. Cell specification occurs only in the non-dividing neoblast progeny. Conversely, the specialized model predicts that neoblasts involved in producing missing cells have largely restricted fates and are therefore a heterogeneous population containing many different lineage-committed dividing cells.

Neoblasts have frequently been considered a uniform population of pluripotent stem cells. Recent data have shown that at least some neoblasts, termed cNeoblasts, are indeed pluripotent stem cells that can rescue tissue homeostasis and regeneration in lethally irradiated animals by single-cell transplantation [64]. The abundance of cNeoblasts in the neoblast population, however, is unknown. smedwi-1 encodes a PIWI-family protein that is expressed in all dividing adult planarian cells [59], and is a canonical neoblast marker. All smedwi-1+ cells rapidly disappear within one day following irradiation [74]. Some smedwi-1+ cells have been found to express tissue-specific transcription factors required for specification of a few distinct tissues, such as the eye [70, 71], the nephridia [65], the anterior pole [75], and some neurons [66, 68, 76]. Expression of these transcription factors is induced in a small number of smedwi-1+ cells following wounding, with only rare neoblasts expressing these transcription factors in intact animals [66, 71]. These data provide support for the specialized neoblast model for at least several lineages.

Determining whether the specification of most, or all, cell lineages occurs within neoblasts is essential for understanding the cellular basis for planarian regeneration. Specifically, at what cellular step in regeneration is the identity of new cells specified? On the basis of prior results demonstrating the specialization of smedwi-1+ cells for case-study tissues, such as the eye, we sought to test the breadth of the specialized neoblast model. Because it is possible that the smedwi-1+ cell population contains both dividing cells (neoblasts) and immediate non-dividing neoblast progeny cells, we purified S and G2/M phase neoblasts (X1 neoblasts) using fluorescence-activated cell sorting (FACS) and used RNA sequencing (RNA-seq) to identify transcription factors up-regulated in X1 neoblasts following wounding. We combined this approach with broad gene expression screening to identify transcription factors expressed in many different tissues and in neoblasts of wounded planarians. Several conserved transcription factors expressed in neoblasts following wounding were required for regeneration of specific cell types, as predicted by the specialized neoblast model. Our results identify a large collection of transcription factors expressed in small subsets of neoblasts at wounds. Together with previous data, these results indicate that neoblasts are a heterogeneous population of pluripotent stem cells and lineage-restricted progenitors, with regenerative cells of blastemas being generated from a collection of lineage-specified cells.

2.4 Results

2.4.1 X1 neoblasts from wounded planarians express tissue-associated transcription factors

Planarians are flatworms with complex internal anatomy (Figure 2-1A). They possess a branched intestine that connects to a centrally located pharynx (the only opening of the animal); a nervous system consisting of two cephalic ganglia, two ventral nerve cords, sensory neurons including those of the eyes, and many peripheral neurons; an excretory system (a network of ciliated ducts); and body wall muscle fibers. Because a small fragment of planarian tissue can regenerate an entire animal, regeneration requires mechanisms for the specification of the many distinct cell types of adult planarians.

Transcription factors have essential roles in development because they can orchestrate the differential expression of regulatory and structural genes needed for cell differentiation. In planarians, several transcription factors are expressed in neoblasts for the specification of neoblast fate. In order to establish whether expression of transcription factors in neoblasts following wounding is a hallmark of the specification of most regenerative cell lineages, we first utilized FACS to purify neoblasts that were in S or G2/M phases of the cell cycle (the X1 fraction: DNA content of more than 2C) [56] from the pre-pharyngeal region 48 hours following either head (posterior-facing wound) or trunk (anterior-facing wound) amputations, and subjected these neoblasts to RNA sequencing (RNA-seq) (Figure 2-1B). This experimental design allowed identification of genes activated during regeneration, and controlled for natural variability in gene expression across the anterior-posterior axis by using the same (pre-pharyngeal) population of neoblasts. Expression in X1 neoblasts would confirm these transcription factors are active in cells that undergo one or more cell divisions. A few transcription factors were significantly up-regulated (Padj, adjusted p-value, < 0.05) in X1 neoblasts following wounding (Figure 2-1C). We also examined (and tested below) transcription factors that were up-regulated with a log2 fold change > 0.7 and Punadj (unadjusted p-value) < 0.05 (Figure 2-1C and Table 2.1). By these criteria, we found that neoblasts from wounded planarians expressed the previously described genes sp6-9, which is expressed in optic cup cells and peripheral neurons, and required for eye regeneration [71]; meis and otxA, which are expressed in and required for the regeneration of photoreceptor neurons [71]; eya, which is expressed in and required for regeneration of eyes and protonephridia [65, 70, 71]; and ap-2, which is expressed in multiple neuron types and required for TrpA+ neuron regeneration [76] (Figure 2-1C).

In addition, several previously uncharacterized planarian transcription factor homologs were also expressed in X1 neoblasts 48 hours following wounding (Figures 2-1C, 2-2 to 2-5, 2-7A and Table 2.1). We examined the expression of some of these new transcription factor homologs (7 out of 13 genes were tested) in wild-type and irradiated uninjured animals and found that most (6 out of 7 genes; zfp-2 did not show expression above background) were primarily expressed in differentiated cell types rather than in neoblasts (Figure 2-4C). To assess the expression of these transcription factors in neoblasts, we isolated X1 cells from pre-pharyngeal regions at 48 hours following both anterior and posterior amputation and performed cell fluorescence in situ hybridizations (FISH). All of the tested transcription factors identified from this RNA-seq analysis (unknown and previously reported, 15 out of 23 genes tested) were expressed in small subsets of X1 cells from wounded animals (Figures 2-1D, 2-6A, 2-9 and 2-4D).
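The two selection tiers used in this section can be summarized as a simple decision rule. The sketch below is a hedged illustration in Python: the function name, the tier labels, and the example values are hypothetical, and this is not the differential expression software used in the study. It only encodes the thresholds stated in the text (Padj < 0.05, or log2 fold change > 0.7 together with Punadj < 0.05).

```python
def selection_tier(log2_fc, p_unadj, p_adj):
    """Classify a transcription factor by the thresholds stated in the text.

    Returns "significant" (Padj < 0.05), "candidate" (log2 fold change > 0.7
    and Punadj < 0.05), or None. Tier labels are illustrative assumptions.
    """
    if log2_fc <= 0:
        return None  # only up-regulated genes are considered
    if p_adj < 0.05:
        return "significant"
    if log2_fc > 0.7 and p_unadj < 0.05:
        return "candidate"
    return None


# Hypothetical values for illustration only; gene names are placeholders.
examples = {
    "geneA": (1.5, 1e-8, 1e-4),  # passes the adjusted p-value threshold
    "geneB": (0.8, 9e-3, 1.0),   # passes only the relaxed criteria
    "geneC": (0.3, 0.2, 1.0),    # passes neither
}
tiers = {gene: selection_tier(*values) for gene, values in examples.items()}

if __name__ == "__main__":
    for gene, tier in tiers.items():
        print(gene, tier)
```

Genes in the relaxed tier are treated in the text as candidates that require validation, which is why expression was subsequently tested by FISH on sorted X1 cells.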

[Figure 2-1: images and table not recoverable from the source scan. Panel labels: (A) muscle; excretory system; gut and pharynx; CNS and anterior pole. (B) cut; harvest wounds at 48 hours; X1 isolation and RNA sequencing. (C) table of S. mediterranea genes with log2 fold change, P, Padj, and X1 validation. (D) X1 cell isolation 48 hours after wounding.]

Figure 2-1: mRNA Sequencing Analysis of Sorted X1 Neoblasts from Regenerating Planarians. (A) Cartoon shows the planarian gastrovascular system with pharynx, the nervous system with photoreceptors, the excretory system, muscle, and the anterior pole. (B) Schematic of the experimental approach. Prepharyngeal tissues were harvested 48 hr after amputation. Cell suspensions were labeled with Hoechst 33342 and X1 cells isolated by FACS. RNA was then purified from sorted X1 cells and mRNA Illumina sequencing performed. (C) The table lists all the transcription factors that showed upregulated expression in X1 cells from wounded animals with Padj ≤ 0.05 (above double line) and Punadj < 0.05 with a log2 fold change ≥ 0.7 (below double line). Check marks indicate expression was validated in sorted X1 cells from wounded planarians; nd, not determined. @From published work [75]; *best BLASTx hit to human transcripts. (D) Schematic: X1 cells were isolated from prepharyngeal regions by FACS 48 hr after amputations, and cell FISH was performed using RNA probes from genes found in (C). DAPI labeled DNA (gray). Percentages of X1 cells expressing each transcription factor are shown in the upper left corners. Total number of X1 cells counted: sp6-9+ 14/202; neuroD-1+ 22/1034; zfp-2+ 7/180; eya+ 9/635; pax6A+ 107/1113; FoxF+ 3/205; ap-2+ 34/1137; Tlx1+ 5/152; lhx2/9+ 20/1222; otxA+ 16/155; Fli1+ 11/1463; scratch+ 4/766. Images shown are maximal-intensity projections. Scale bars, 10 μm.
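The percentages referred to in panel (D) follow directly from the cell counts given in the legend. As a small worked example, the Python sketch below recomputes them; the counts are copied from the legend, while the variable names are illustrative.

```python
# (positive cells, total X1 cells counted), as listed in the Figure 2-1D legend.
counts = {
    "sp6-9": (14, 202),
    "neuroD-1": (22, 1034),
    "zfp-2": (7, 180),
    "eya": (9, 635),
    "pax6A": (107, 1113),
    "FoxF": (3, 205),
    "ap-2": (34, 1137),
    "Tlx1": (5, 152),
    "lhx2/9": (20, 1222),
    "otxA": (16, 155),
    "Fli1": (11, 1463),
    "scratch": (4, 766),
}

# Percentage of sorted X1 cells positive for each transcription factor.
percent_positive = {
    gene: 100.0 * positive / total for gene, (positive, total) in counts.items()
}

if __name__ == "__main__":
    ranked = sorted(percent_positive.items(), key=lambda kv: kv[1], reverse=True)
    for gene, pct in ranked:
        print(f"{gene:10s} {pct:5.1f}%")
```

Every transcription factor in this list marks a minority of the sorted X1 cells, which is the pattern expected if each factor labels a distinct small neoblast subset.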

[Figure 2-2: phylogenetic trees (panel A) not recoverable from the source scan.]

Figure 2-2: Expression of new transcription factors in intact animals (related to Figure 2-1). (A) Phylogenetic analysis of SMED-snail, SMED-scratch, SMED-Lhx2/9, SMED-Lhx3/4, SMED-Fli1, SMED-elf-1, SMED-Nkx2, SMED-Nkx6, SMED-SoxB-2. All proteins for all trees were aligned using MUSCLE with default settings and trimmed with Gblocks. Maximum likelihood analyses were run using PhyML with 1,000 bootstrap replicates, the WAG model of amino acid substitution, four substitution rate categories and the proportion of invariable sites estimated from the dataset. We used proteins from diverse organisms: 18 snail/scratch proteins, 38 Lhx proteins, 18 proteins of the ets family, 27 proteins of the Nkx family and 25 proteins of the Sox family. The result provides strong support: for both SMED-snail and SMED-scratch (1,000 out of 1,000, highlighted in red); for SMED-Lhx2/9 (851 out of 1,000) and SMED-Lhx3/4 (807 out of 1,000, both highlighted in red); for SMED-Fli1 (542 out of 1,000) and SMED-elf-1 (870 out of 1,000, both highlighted in red); for SMED-Nkx2 (666 out of 1,000) and SMED-Nkx6 (964 out of 1,000, both highlighted in red); and for SMED-SoxB-2 (827 out of 1,000, highlighted in red). All ML bootstrap values are shown. Hs, Homo sapiens; Dm, Drosophila melanogaster; Smed, Schmidtea mediterranea; Ci, Ciona intestinalis; Nv, Nematostella vectensis; Dr, Danio rerio; Bf, Branchiostoma floridae; Ce, Caenorhabditis elegans; Gg, Gallus gallus; Pd, Platynereis dumerilii.

[Figure 2-3: table (panel B), partly recoverable from the source scan. Entries are listed as gene: best BLAST hit, e-value; PFAM domain.
zfp-2: zinc finger protein 714, 1E-30; Zinc-finger double domain
zfp-3: zinc finger protein 619 isoform 1, 2E-06; None
HoxC: homeobox protein Hox-C12, 1E-15; Homeobox domain
nuclear factor: nuclear factor 1 X-type isoform 3, 6E-13; None
nkx6: homeobox protein Nkx-6.1, 2E-36; Homeobox domain
glass: glass [Macrostomum sp. MF-2005], 4.00E-70; Zinc-finger double domain
castor: Castor zinc finger 1, 5.00E-57; None
Tlx1: T-cell leukemia homeobox protein 3, 4E-40; Homeobox domain
NR-1: nuclear receptor subfamily 2 group E member 1 isoform b, 2.00E-34; Zinc finger, C4 type (two domains), and Ligand-binding domain of nuclear hormone receptor
prox-2: homeodomain protein; Homeo-prospero domain
otxB: DjotxB [Dugesia japonica], 1.00E-65; None
Tcf/Lef-1: HMG protein TCF/LEF [Dugesia japonica], 1.0E-47; None
PFAM e-values as extracted (row assignment unclear): 2.10E-07, 2.10E-20, 1.70E-20, 6.80E-09, 3.60E-21, 2.10E-08, 5.30E-05.]

Figure 2-3: Expression of new transcription factors in intact animals (related to Figure 2-1). (B) Table shows all genes for which no phylogenetic analysis has been performed. Genes have been named based on the best human BLASTx hit unless otherwise noted, and PFAM domain analysis is shown.

[Figure 2-4: images not recoverable from the source scan. Panel C labels: Tlx1, Fli1, scratch, FoxF, Lhx2/9, egr-2, Tcf/Lef-1, Pax6A, neuroD-1, ap-2; each shown as dorsal (D), ventral (V), and irradiated (irrad) views. Panel D: cell FISH.]

Figure 2-4: Expression of new transcription factors in intact animals (related to Figure 2-1). (C) In situ hybridizations using RNA probes of transcription factors found in the X1 RNA sequencing experiment. Dorsal (left panel) and ventral (middle panel) expression in intact wild-type animals and in lethally irradiated (6000 rads) animals (right panel). Confocal images are maximal intensity projections. Images are representative of results seen in >5 animals per panel. Scale bars, 100 μm. (D) Cell FISH with sorted X1 cells from pre-pharyngeal regions of wounded planarians using the RNA probe egr-2 (n=5/194 X1 cells, 2.6%). Scale bars, 10 μm.

[Figure 2-5, panel E: table. Columns include gene, tissue, X1 expression, smedwi-1+ co-expression, approach, and references. Most cell entries, including the check marks, are illegible in this reproduction. Legible gene, tissue, and reference entries: sim, CNS (Cowles, 2013); Lhx1/5-1, CNS (Currie, 2013 and Marz, 2013*); hnf4, gut (Wagner, 2011*); gata4/5/6, gut (Wagner, 2011*); ovo, eye (Lapan, 2012); twist, pharynx and muscle (Cowles, 2013); FoxD, anterior pole (Scimone, 2014); neuroD-1, scattered cells (here and Cowles, 2013); otxA, CNS (Lapan, 2011); meis, eye/pharynx (here and Lapan, 2012); sp6-9, CNS and eye (Lapan, 2011); eya, eyes, pharynx and nephridia (Lapan, 2011 and Scimone, 2011); egr-2, diffuse and CNS (wound-induced; Wenemoser, 2012).]

Figure 2-5: Expression of new transcription factors in intact animals (related to Figure 2-1). (E) Table shows all the transcription factors found in this study by either the RNA sequencing or the candidate approach. These transcription factors were co-expressed with smedwi-1 in day three regenerating blastemas and/or were expressed in isolated X1 cells from wounded planarians. nd, not determined; check, detection found. Tissue denotes intact gene expression pattern. *Co-expression with SMEDWI-1 protein has been shown.

2.4.2 Known tissue-associated transcription factors are expressed in X1 neoblasts following wounding

We reasoned that known tissue-associated transcription factors might be expressed in neoblasts from wounded planarians, but expression of such genes might not be significantly detectable by RNA-seq differential gene expression analysis as a consequence of expression occurring in only rare neoblasts. We therefore analyzed the X1 neoblast expression of transcription factors expressed in distinct tissues. First, we determined the expression frequency of transcription factors previously shown to be associated with distinct tissues and expressed in smedwi-1+ cells, using FISH in sorted X1 neoblasts. As expected, a small fraction of X1 cells expressed the nephridia transcription factors POU2/3, sall, and osr (1.6%) [65]; the eye-specific transcription factor ovo (0.5%) [71]; and the anterior pole transcription factors FoxD and prep, and the marker notum (0.4%) [75] (Figure 2-6A).


Figure 2-6: A candidate gene approach identified tissue-associated transcription factors expressed in X1 neoblasts from regenerating planarians. (A) FISH using previously known and new tissue-associated transcription factors with sorted X1 cells from pre-pharyngeal regions of amputated planarians. Percentages of X1 cells expressing the transcription factors are shown in the upper left corner. Total number of X1 cells counted: for protonephridia, a mixture of RNA probes POU2/3, odd-skipped related (osr), and sal-like (sall) n=8/504; for the eye, ovo n=1/194; for the anterior pole, a mixture of the RNA probes FoxD, prep, and notum n=1/250; for pharynx, FoxA n=54/704 and meis n=20/777 with FoxA and meis co-expression observed in n=7/15 of meis+ X1 cells; for the gut, hnf4 n=337/1459 and gata4/5/6 n=665/5249 with co-expression observed in n=56/119 of hnf4+ X1 cells; for the muscle, collagen n=72/1357 and the transcription factors myoD n=47/732 and snail n=45/1945, with myoD and collagen co-expression observed in n=20/45 myoD+ X1 cells. Positional control genes were also expressed in wounded X1 cells: nou-darake (ndk, n=23/6432), secreted frizzled-related protein 2 (sFRP-2, n=9/4402), slit (n=8/2763), wntless (n=24/3367), and secreted frizzled-related protein 1 (sFRP-1, n=7/732). Because some genes were expressed in multiple cell types, progenitor numbers for some tissues will be an overestimate. DAPI was used to label nuclear DNA (gray). Scale bars, 10 µm. (B) Co-expression of the gut transcription factor hnf4, the muscle gene collagen, and the pharynx transcription factors FoxA and meis (magenta) with the neoblast marker smedwi-1 (green) in day three regenerating anterior blastemas or tail fragments (for the pharynx genes). Higher magnification on the right shows cells co-expressing both genes (scale bars, 10 µm). DAPI was used to label nuclear DNA (gray). All images shown are maximal intensity projections. Anterior, up. Scale bars, 100 µm.
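The percentages quoted in the text follow directly from the cell counts listed in this caption. As a minimal sketch of that arithmetic (the dictionary and function names are illustrative and not part of the original analysis), the counts can be converted to percentages as follows:

```python
# Counts (positive X1 cells, total X1 cells counted) as listed in the Figure 2-6 caption.
counts = {
    "POU2/3 + osr + sall": (8, 504),
    "ovo": (1, 194),
    "FoxD + prep + notum": (1, 250),
    "FoxA": (54, 704),
    "meis": (20, 777),
    "hnf4": (337, 1459),
    "collagen": (72, 1357),
    "myoD": (47, 732),
    "snail": (45, 1945),
}


def percent_positive(positive, total):
    """Return the percentage of counted cells that scored positive."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * positive / total


percentages = {gene: percent_positive(p, t) for gene, (p, t) in counts.items()}

for gene, pct in percentages.items():
    print(f"{gene}: {pct:.1f}%")
```

Because several probes were pooled for the protonephridia and the anterior pole, those entries give the fraction of cells positive for any probe in the mixture.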


Figure 2-7: Expression of pharynx, gut, and muscle-associated transcription factors (related to Figure 2-6). (A) Phylogenetic analysis of SMED-FoxA and SMED-FoxF. 94 Fox proteins from diverse organisms were aligned using MUSCLE with default settings and trimmed with Gblocks. Maximum likelihood analyses were run using PhyML with 1,000 bootstrap replicates, the WAG model of amino acid substitution, four substitution rate categories, and the proportion of invariable sites estimated from the dataset. The result provides strong support for SMED-FoxA (958 out of 1,000, highlighted in red) to be a class A member and moderate support for SMED-FoxF (588 out of 1,000, highlighted in red) to be a class F member of the Forkhead transcription factor family. All ML bootstrap values are shown. Hs, Homo sapiens; Mm, Mus musculus; Dm, Drosophila melanogaster; Smed, Schmidtea mediterranea; Xl, Xenopus laevis; Sd, Suberites domuncula; Bf, Branchiostoma floridae; Ci, Ciona intestinalis; Hv, Hydra vulgaris; Nv, Nematostella vectensis; Dj, Dugesia japonica; Cs, Ciona savignyi; Ml, Mnemiopsis leidyi.


Figure 2-8: Expression of pharynx, gut, and muscle-associated transcription factors (related to Figure 2-6). (B) FISH using FoxA, meis, and twist RNA probes (magenta) in day three regenerating tail fragments. FoxA, meis, and twist are expressed in the pharynx primordium. FoxA and meis were both co-expressed in the same cells during regeneration (day three), scale bar, 10 µm. twist was expressed in smedwi-1+ cells during pharynx regeneration (day three, tail fragments), and twist was expressed in isolated X1 cells from wounded planarians (n=41/1964), scale bar, 10 µm. (C) Dorsal (D) and ventral (V) views of FoxA, meis, and twist expression in intact animals by FISH show expression in the pharynx as well as in other scattered cells. Yellow arrows point to pharynx expression. ph, pharynx. Images are maximal intensity projections and representative of results seen in >5 animals per panel. Anterior, up. Scale bars, 100 µm. (D) Double FISH using gata4/5/6 (magenta) and hnf4 (green) RNA probes. Both genes were co-expressed in the regenerating gut in day three anterior blastemas. Scale bars, 10 µm. (E) Double FISH using snail (magenta) and collagen (green) RNA probes in intact animals (left two panels). snail-expressing cells also co-expressed the muscle cell marker collagen (83.8% of snail+ cells, 181 snail+collagen+ out of 216 snail+ cells). Left panel depicts an intact tail; right panel shows a higher magnification image with cells co-expressing both genes. Scale bars, 10 µm. On the right, double FISH using snail (magenta) and smedwi-1 (green) RNA probes in day three regenerating tail fragments. snail is co-expressed with smedwi-1. Yellow arrows point to co-expressing cells. Lower row: a higher magnification image showing co-expression of snail and smedwi-1 in a regenerating day three tail fragment (left, scale bars, 10 µm). X1 cell FISH using snail (magenta) and myoD (green) (second to left panel). Both genes are co-expressed in X1 neoblasts from pre-pharyngeal regions of wounded planarians (35.7%, 5 snail+myoD+ cells out of 14 myoD+ cells) (scale bars, 10 µm). FISH of intact animals using myoD RNA probe (magenta), and higher magnification of a double FISH using myoD (magenta) and collagen (green) in an intact animal showing co-expression of both genes (95.1% of myoD+ cells, 136 myoD+collagen+ out of 143 myoD+ cells, scale bars, 10 µm). DAPI was used to label DNA (gray). Anterior, up. All images shown are maximal intensity projections and representative of results seen in >4 animals per panel. Scale bars, 100 µm.

Second, we examined the expression of transcription factors known or predicted to be active in the pharynx, gut, and muscle (Figures 2-1A and 2-5E). FoxA is a forkhead transcription factor expressed in the endoderm lineage across metazoans. It is expressed in cells that intercalate, polarize, and form tight junctions in the digestive tracts of the mouse, the sea urchin, and the nematode C. elegans [77]. A planarian FoxA homolog is expressed in the pharynx of the related species Dugesia japonica [78].

We found that during regeneration, FoxA was expressed in the pharynx primordium in the planarian S. mediterranea (Smed-FoxA, Figures 2-7A, 2-8B, 2-6B and 2-16B). FoxA expression occurred in smedwi-1+ cells and in isolated X1 neoblasts following wounding (7.7% of X1 cells) (Figure 2-6). The homeobox gene meis was previously shown to be expressed in eyes and required for the regeneration of photoreceptor neurons [71], and was also expressed at the pharynx primordium during regeneration (Figures 2-6B and 2-8B). meis was expressed in isolated X1 neoblasts (2.6%) following wounding, and some meis+ X1 neoblasts also co-expressed FoxA (46.7% of meis+ cells, which is higher than expected if expression of these genes in neoblasts were independent, p=0.0001, Fisher's exact test), suggesting that both genes might be involved in pharynx specification (Figure 2-6A). Expression of meis was also observed in smedwi-1+ cells during pharynx regeneration in tail fragments three days following wounding (Figure 2-6B). FoxA and meis were expressed in the pharynx of intact animals (Figure 2-8C). Finally, a homolog of twist was expressed in the pharynx in intact animals as well as during regeneration ([66], Figures 2-8B and 2-8C) and it was expressed in smedwi-1+ cells during regeneration (Figure 2-8B). Moreover, twist was expressed in isolated X1 neoblasts following wounding (2.1%, Figure 2-8B).

Endodermal transcription factors that control differentiation of the gut and its derivatives during embryonic development of multiple species include the hepatocyte nuclear factors (HNF) and GATA families [79]. An HNF homolog, hnf4, and a GATA4/5/6 homolog, gata4/5/6, are both expressed in the gut of S. mediterranea [64]. These gut transcription factors were expressed together in the same cells during gut regeneration (day three anterior blastemas, Figure 2-8D) and were expressed in isolated X1 cells from wounded planarians (hnf4: 23.1% of X1 and gata4/5/6: 12.6%; Figure 2-6A).
Moreover, hnf4 and gata4/5/6 were co-expressed in the same X1 cells (47.1% of hnf4+ cells, p=0.0038, Fisher's exact test, Figure 2-6A). In addition, hnf4 was expressed in smedwi-1+ cells during gut regeneration (Figure 2-6B).

Muscle cells develop in most embryos from the mesoderm tissue layer, and their development frequently involves expression of myoD, a well-known myogenic bHLH transcription factor [80]. Muscle cells are marked by collagen expression in S. mediterranea [81]. collagen and myoD [66] were expressed in isolated X1 neoblasts following wounding (myoD: 6.4% and collagen: 5.3%), and 44.4% of myoD+ cells co-expressed collagen (p<0.0001, Fisher's exact test; Figure 2-6A). myoD was also co-expressed with collagen in intact animals (95.1% of myoD+ cells, Figure 2-8E). In addition, collagen was co-expressed with smedwi-1 in cells of day three blastemas (Figure 2-6B).

The snail gene family encodes zinc finger proteins that can act as transcriptional repressors and control entry into myogenic differentiation [82]. A planarian snail homolog was expressed together with collagen in intact animals (83.8% of snail+ cells, Figure 2-8E). snail was also expressed in X1 neoblasts (2.3%, Figure 2-6A), was co-expressed with myoD in X1 cells from wounded animals (35.7%, Figure 2-8E), and was expressed in smedwi-1+ cells during regeneration (Figure 2-8E), suggesting a possible role for this gene in muscle specification.

A different category of genes, the positional control genes (PCGs), described or predicted to have roles in planarian patterning [83], are highly expressed in muscle cells in planarians [81] and are required for the patterning of regenerated tissue. We observed that in addition to this muscle expression, several PCGs were expressed at low frequency in the X1 compartment (Figure 2-6A); these PCGs included slit, a midline-expressed repulsive axon guidance cue; nou-darake (ndk), which is required for proper restriction of the brain to the planarian head; sFRP-1 and sFRP-2, which are expressed as anterior gradients; and the wound-induced and posteriorly expressed gene wntless [83]. Although these PCGs are not transcription factors, these results further demonstrate the gene expression heterogeneity that exists in neoblasts during regeneration.
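The co-expression comparisons above ask whether two genes appear together in the same X1 cells more often than independent expression would predict. The sketch below illustrates that reasoning for FoxA and meis using only the counts reported in the text. It is not the Fisher's exact test used in the thesis, because the full 2x2 contingency table is not given here. It instead uses a one-sided binomial tail, with the overall FoxA frequency in X1 cells as an assumed background rate. All function and variable names are illustrative.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Reported counts: FoxA+ cells among all X1 cells counted,
# and FoxA+ cells among meis+ X1 cells.
foxa_pos, x1_total = 54, 704
coexpressing, meis_pos = 7, 15

# Assumption: the FoxA frequency in the whole X1 sample is the rate expected
# in meis+ cells if the two genes were expressed independently.
background_rate = foxa_pos / x1_total
expected_coexpressing = background_rate * meis_pos
p_tail = binomial_upper_tail(coexpressing, meis_pos, background_rate)

print(f"expected FoxA+ among {meis_pos} meis+ cells if independent: "
      f"{expected_coexpressing:.2f}")
print(f"observed: {coexpressing}; binomial P(X >= {coexpressing}) = {p_tail:.2e}")
```

Under independence, roughly one of the 15 meis+ cells would be expected to express FoxA, compared with the seven observed.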

2.4.3 Numerous transcription factors expressed in cells of the nervous system are expressed in neoblasts following wounding

Despite the ability of planarians to completely regenerate the nervous system following head amputation, few genes required for the regeneration of specific neuronal lineages have been described [66, 68, 69, 76]. We reasoned that many transcription factors might exist that specify the diverse, numerous cell types in the brain, and the specialized neoblast model predicts many of these would be expressed in neoblasts during regeneration. The nervous system was therefore a good target for testing the specialized neoblast hypothesis.

The neoblast RNA-seq data described above identified up-regulation of several transcription factors associated with the nervous system in the neoblast population following wounding (Figure 2-1C), such as sp6-9, eya, ap-2, Pax6A, and otxA (Figure 2-4C). In addition, expression of other nervous system-associated genes such as Tcf/Lef-1 (Kobayashi et al., 2007), neuroD-1 [66], and lhx2/9 was observed in neoblasts following wounding (Figure 1C). These genes were indeed expressed in the nervous system of S. mediterranea (Figures 2-4C, 2-10A and 2-9), as well as in X1 neoblasts from wounded planarians and in smedwi-1+ cells near day three regenerating anterior blastemas (Figures 2-1D, 2-9, and 2-10B).


Figure 2-9: A broad panel of CNS-associated transcription factors was expressed in X1 neoblasts. For each gene tested (except last row), left upper panel shows the expression of the transcription factor (magenta) in the intact head (scale bars, 100 µm), right upper panel shows expression in sorted X1 cells from pre-pharyngeal regions of amputated animals 48 hours after wounding (scale bars, 10 µm), left bottom panel shows the expression of the transcription factor and smedwi-1 (green) in day three regenerating anterior blastemas (scale bars, 100 µm), bottom right panel shows a higher magnification of a cell co-expressing the transcription factor (magenta) and smedwi-1 (green) in the regenerating blastema (scale bars, 10 µm). DAPI was used to label nuclear DNA (gray). Percentages of X1 cells expressing the transcription factors are shown in the upper left corner. Numbers of X1 cells counted: pax6B n=10/2302, Pax3/7 n=12/2299, klf n=15/2746, Six3 n=11/1826, single minded (sim) n=1/150, Tcf/Lef-1 n=3/432, nkx2 n=9/2435, nkx6 n=2/395, FoxQ2 n=2/180, castor n=11/1954, otxB n=4/2111, and nr-1 n=31/1637. Images shown are maximal intensity projections. Anterior, up.


Figure 2-10: Expression of brain-associated transcription factors in intact and regenerating animals (related to Figure 2-9). (A) Double FISH using the RNA probes of all the brain-associated transcription factors analyzed (magenta) and the brain marker ChAT (green) in intact animals. Inset on upper right shows higher magnification of a cell co-expressing the transcription factor and ChAT. prox-2 is the only gene for which no co-expression with ChAT was detected. Images shown are maximal intensity projections. Images are representative of results seen in >4 animals per panel.


Figure 2-11: Expression of brain-associated transcription factors in intact and regenerating animals (related to Figure 2-9). (B) Double FISH using the RNA probes of brain-associated transcription factors (magenta) and the neoblast gene smedwi-1 (green) in day three regenerating anterior blastemas. Higher magnification panels on the right show cells co-expressing the transcription factor and smedwi-1. DAPI was used to label DNA (gray). Scale bars, 10 µm. Images shown are maximal intensity projections. Images are representative of results seen in >5 animals per panel. Anterior, up. Scale bars, 100 µm. (C) Cell FISH in sorted X1 cells from the pre-pharyngeal region of wounded planarians (48 hours) using RNA probes for soxB, soxB-2, and prox-2. Upper left corner shows percentage of expression calculated as: soxB n=12/364 X1 cells, soxB-2 n=1/521 X1 cells, and prox-2 n=2/413 X1 cells.

We next searched for conserved transcription factors with roles in nervous system development in other organisms. We found 40 putative transcription factor homologs that were expressed in the planarian brain, co-expressing or closely associated with the neuronal marker gene choline acetyltransferase (ChAT). Of those 40 genes, 18 were detectably expressed following wounding in smedwi-1+ cells in regenerating blastemas and/or in isolated X1 neoblasts of wounded animals (Figures 2-9, 2-11B and 2-11C). Collectively, using both the RNA-seq and candidate approaches, a total of 26 neuron-associated transcription factors displayed expression in neoblasts of regenerating planarians.

Paired-box homeodomain-encoding genes are transcription factors with essential roles in organogenesis, including brain development and patterning. Within the nervous system, they are involved in cellular specification, proliferation, progenitor cell maintenance, and neural differentiation in diverse organisms [84]. Planarians have two Pax6 homologs, pax6A and pax6B (Figure 2-13A and [85]). pax6A was abundantly expressed throughout the brain (Figures 2-4C and 2-10A), was significantly up-regulated in X1 cells (Figure 2-1C), and was expressed in isolated X1 cells from wounded planarians (9.6%, Figure 2-1D). pax6A was also expressed in smedwi-1+ cells during regeneration ([76] and Figure 2-9B). pax6B expression was most abundant in the outer lobes of the planarian brain in intact animals (Figures 2-9 and 2-10A), was present in rare X1 cells from wounded animals (0.4%), and was co-expressed with smedwi-1 during head regeneration (Figure 2-9). In addition, a planarian gene encoding a homolog of Pax3/7-family transcription factors (Figure S4A) was expressed in the ventral midline of the brain and within the ventral nerve cords (Figures 2-9 and 2-10A). Pax3/7 was expressed in isolated X1 cells (0.5%) from wounded planarians and co-expressed with smedwi-1 during anterior regeneration (Figure 2-9).

Drosophila cephalic gap genes have important roles in brain development, as do their orthologs in vertebrate brain development [86]. Planarians have two homologs of the Drosophila orthodenticle gene, otxA and otxB. Both genes were abundantly expressed in the planarian brain (Figures 2-9 and 2-10 and [70, 87]). otxA was up-regulated in X1 neoblasts according to RNA-seq (Figure 2-1C), was detected in isolated X1 cells from wounded animals (10.3%, Figure 2-1D), and was co-expressed with smedwi-1 during anterior regeneration (Figure 2-10B); otxB was also expressed in X1 cells from wounded animals (0.2%, Figure 2-9).

The nkx2.2 gene functions in establishing progenitor domains of the ventral spinal cord and hindbrain of vertebrates during embryogenesis [88]. A planarian homolog, nkx2 (Figure 2-2A), was expressed in the medial planarian brain; cells expressing nkx2 mostly localized to the space between the midline neurons and the outer lobe of the brain (Figures 2-9 and 2-10A). nkx6.1 is another homeobox gene with important roles in patterning of the ventral vertebrate CNS [89]. A planarian homolog, nkx6 (Figure 2-1A), was expressed in scattered cells throughout the medial brain lobes (Figures 2-9 and 2-10A). nkx2 and nkx6 were expressed in smedwi-1+ cells during regeneration and in isolated X1 cells from wounded animals (nkx2, 0.4% and nkx6, 0.5%, Figure 2-9).

castor encodes a zinc finger transcription factor expressed in a subset of Drosophila embryonic neuroglioblasts and controls neural differentiation [90].
A planarian castor homolog (Figure 2-3B) was expressed ventrally in many scattered cells, with some of these cells co-expressing the neuronal marker ChAT (Figure 2-10A). Following wounding, castor was expressed in 0.6% of X1 neoblasts (Figure 2-9).

Nuclear receptors are ligand-activated transcription factors that have several roles during development. We found a nuclear receptor (nr-1) gene in S. mediterranea to be expressed in neurons in an inner arc of the brain (Figures 2-9 and 2-10A). nr-1 was also expressed in 1.9% of isolated X1 neoblasts from wounded animals (Figure 2-9).

Several genes found in a transcriptome study of the planarian eye [71] were also expressed in different domains of the brain. For example, the forkhead-family gene FoxQ2, as well as klf, are expressed in the eye and required for photoreceptor neuron regeneration [70]. In addition, FoxQ2 was expressed in the ventral midline of the brain and in ventral nerve cords, and klf was expressed in peripheral sensory neurons and in the outer brain branches (Figures 2-9 and 2-10A). Both of these genes were also expressed in isolated X1 cells from wounded planarians (1.1% and 0.6%, respectively) and were expressed in smedwi-1+ cells during anterior regeneration (Figure 2-9).

The HLH-PAS domain transcription factor single minded (sim) is expressed in the planarian brain and is required for the regeneration of several types of neurons [66]. In agreement with this report, we found expression of sim in isolated X1 neoblasts from wounded planarians (0.7%) and in smedwi-1+ cells in day three regenerating anterior blastemas (Figure 2-9).

An ortholog of the sine oculis/Six family transcription factor Six3 [76, 85] was expressed in the outer lobes of the planarian brain and in peripheral neurons (Figures 2-9 and 2-10A) and in X1 neoblasts following wounding (0.6%), and was co-expressed with smedwi-1 in day three regenerating blastemas (Figure 2-9). Moreover, genes encoding two SOX proteins (soxB-2 and soxB, Figure S1A and [71]), as well as the transcription factor-encoding genes glass, Lhx3/4, prox-2, and elf-1 (Figures 2-2A and 2-3B), were co-expressed with smedwi-1 during anterior regeneration (Figure 2-11B), and some of these were expressed in isolated X1 cells from wounded planarians (Figure 2-11C).

In conclusion, the large number of transcription factors expressed in distinct brain regions and in neoblasts from wounded planarians supports a model of neuronal fate specification occurring for most or all lineages within the neoblast compartment during regeneration.

2.4.4 Tissue-associated transcription factors are expressed in non-overlapping subsets of neoblasts from wounded planarians

We hypothesized that if the neoblast expression of tissue-associated transcription factors is a result of the specification of distinct lineages, as opposed to reflecting stochastic gene expression or other roles for these genes, only specific combinations of genes would be expressed together in the same neoblast cells. Indeed, for the eye and protonephridia, combinations of transcription factors expressed in differentiated cells are co-expressed in a small number of neoblasts [65, 70, 71]. We found multiple other instances of transcription factors (and/or specific markers) expressed together in the same differentiated tissue in intact animals that were co-expressed in the same isolated X1 neoblasts following wounding (Figure 2-6). Specifically, hnf4 and gata4/5/6 (gut-associated), collagen, myoD, and snail (muscle-associated), and FoxA and meis (pharynx-associated) were expressed together in neoblasts (Figure 2-6).


Figure 2-12: Non-overlapping expression of different tissue-associated genes in wounded X1 cells. Cell FISH from sorted X1 cells from pre-pharyngeal regions 48 hours following amputations using several combinations of genes expressed in different tissues. Pharynx: FoxA; CNS: pax6A, neuroD-1 and sp6-9; eye: ovo and sp6-9; protonephridia: POU2/3, sall, osr; muscle: collagen; anterior pole: FoxD, notum, prep; and gut: hnf4 and gata4/5/6. No overlapping expression within the same X1 cells was detected (one-tailed paired t-test, p=0.0064 when compared to expected frequencies). Total number of X1 cells counted: FoxA/pax6A n=693, FoxA/ovo n=198, POU2/3/sall/osr/pax6A n=489, FoxD/prep/notum/pax6A n=255, collagen/sp6-9 n=198, collagen/FoxA n=443, FoxA/POU2/3/sall/osr n=295, hnf4/sp6-9 n=448, collagen/POU2/3/sall/osr n=427, hnf4/pax6A n=216, POU2/3/sall/osr/gata4/5/6 n=2422, neuroD-1/gata4/5/6 n=201. Images shown are maximal intensity projections. Scale bars, 10 µm.


Figure 2-13: Phylogenetic analysis of SMED-Pax3/7 and role of klf and pax3/7 in neural tissue specification (related to Figure 2-14). (A) 37 homeodomain-containing proteins from diverse organisms were aligned using MUSCLE with default settings and trimmed. Maximum likelihood analyses were run using PhyML with 1,000 bootstrap replicates, the WAG model of amino acid substitution, four substitution rate categories, and the proportion of invariable sites estimated from the dataset. The result provides support for the SMED-Pax3/7 clade (523 out of 1,000, highlighted in red) as well as for the SMED-Pax6B clade (726 out of 1,000, highlighted in red). All ML bootstrap values are shown. Hs, Homo sapiens; Mm, Mus musculus; Dm, Drosophila melanogaster; Smed, Schmidtea mediterranea; Dr, Danio rerio; Gg, Gallus gallus; Eg, Echinococcus granulosus; Ct, Capitella teleta. (B) Two regions of the regenerating brain are shown: the upper panel depicts sensory neurons co-expressing the transcription factor klf and the marker cintillo; the lower panel depicts ventral midline neurons co-expressing the transcription factor pax3/7 and the enzyme dopamine-β-hydroxylase in day three anterior blastemas. Yellow arrows point to co-expressing cells. (C) Ventral views of FISH using the RNA probes for single minded (sim), the outer branches and photoreceptor neuron gene Na+-dependent Cl-/HCO3-, and the tryptophan hydroxylase (Tph) gene show normal expression of these genes in the regenerating heads of the different RNAi animals. Images are representative of results seen in >5 animals. (D) FISH showing no expression of klf in klf(RNAi) animals and strongly reduced expression of pax3/7 in pax3/7(RNAi) animals. Images shown are maximal intensity projections. Anterior, up. Scale bars, 100 µm.

To test for the possibility that co-expression of transcription factors could be coincidental, we performed double FISH with isolated X1 cells from wounded planarians using combinations of RNA probes for genes expressed in distinct differentiated cell types. In all such tested cases, we failed to detect any neoblast that displayed co-expression of transcription factors associated with different tissues. Taken together, these data suggest substantial restriction against co-expression (one-tailed paired t-test, p=0.0064, Figure 2-12). Specifically, pax6A (neurons) was never detectably co-expressed in the same X1 cell with FoxA (pharynx), or with POU2/3, sall, and osr (nephridia), or with the anterior pole genes FoxD, notum, and prep (Figure 2-12). Similarly, FoxA was never detectably co-expressed in the same X1 cell together with the nephridia genes, the muscle gene collagen, or the eye gene ovo (Figure 2-12). Moreover, the gut genes hnf4 or gata4/5/6 were never detectably co-expressed with the neuronal and eye gene sp6-9, the neuronal genes neuroD-1 and pax6A, or the nephridia genes in X1 cells (Figure 2-12). Together, these results indicate that multiple distinct neoblast populations express specific combinations of transcription factors during regeneration that are associated with differentiated tissues.
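The comparison against expected frequencies can be illustrated with a small calculation. If two genes were expressed independently in X1 cells, the expected number of double-positive cells among n scored cells would be the product of the two single-gene frequencies times n. The sketch below applies this to four of the cross-tissue probe pairs. The text does not state how the expected frequencies were derived for the paired t-test, so treating the single-gene frequencies from separate experiments as marginals is an assumption made here for illustration, and all names are hypothetical.

```python
# Single-gene frequencies in X1 cells reported elsewhere in this chapter.
frequency = {
    "FoxA": 54 / 704,
    "pax6A": 0.096,  # reported as 9.6%; the underlying counts are not given in the text
    "ovo": 1 / 194,
    "collagen": 72 / 1357,
    "hnf4": 337 / 1459,
}

# Number of X1 cells scored for each cross-tissue probe pair (Figure 2-12 caption).
cells_scored = {
    ("FoxA", "pax6A"): 693,
    ("FoxA", "ovo"): 198,
    ("collagen", "FoxA"): 443,
    ("hnf4", "pax6A"): 216,
}


def expected_double_positive(gene_a, gene_b, n_cells):
    """Expected double-positive cells if the two genes were expressed independently."""
    return frequency[gene_a] * frequency[gene_b] * n_cells


expected = {
    pair: expected_double_positive(pair[0], pair[1], n)
    for pair, n in cells_scored.items()
}

for (gene_a, gene_b), value in expected.items():
    print(f"{gene_a}/{gene_b}: expected {value:.2f} double-positive cells, observed 0")
```

For pairs involving a rare marker such as ovo, the expected count is well below one cell, so the absence of double-positive cells is informative mainly for pairs of more frequently expressed genes.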

2.4.5 Tissue-associated transcription factors are required for regeneration of distinct cell types

The large collection of transcription factors identified here provides a resource for detailed investigation of different candidate lineage-specification events in planarian regeneration. The specialized neoblast model predicts that most of the transcription factors described above will be essential for the specification of numerous lineages. Identifying all such candidate lineages is beyond the scope of this work; however, we sought to test this prediction by the analysis of case studies. To assess the role of some of the identified neoblast-expressed transcription factors in the regeneration of specific cell types, we performed RNA interference (RNAi) and analyzed regeneration outcomes. We restricted our analysis to three transcription factors expressed in cell types for which we had other markers as the result of extensive in situ screening: the neuronally expressed genes klf and Pax3/7, and the pharynx gene FoxA.

klf was shown to be required for the proper formation of photoreceptor neurons during eye regeneration [71]. However, klf had a second expression domain in the outer branches of the planarian head, suggesting a possible role in the differentiation of sensory neurons (Figure 2-14A). cintillo encodes a protein similar to the degenerin/epithelial super-family of sodium channels and is expressed in the anterior dorsal margin of the planarian head (Oviedo et al., 2003). We found that neurons expressing klf in the outer head rim also expressed cintillo both in intact and regenerating animal heads (Figures 2-14A and 2-13B).

Pax3/7 was expressed in the ventral midline of the planarian brain (Figure 5A). We identified the planarian homolog of the enzyme dopamine β-hydroxylase (DBH), which is required for the conversion of dopamine into the neurotransmitter norepinephrine. DBH was also expressed in scattered cells on the ventral midline of the planarian brain, and some of these cells co-expressed Pax3/7 in intact and regenerating planarians (Figures 2-14A and 2-13B).


Figure 2-14: Regeneration defects of neural cell types following tissue-associated transcription factor RNAi. (A) Two regions of the brain are shown: the left panel depicts sensory neurons co-expressing the transcription factor klf and the marker cintillo; the right panel depicts ventral midline neurons co-expressing the transcription factor Pax3/7 and the enzyme dopamine-β-hydroxylase (scale bars, 100 µm). Higher magnification panels show co-expression of the genes analyzed in intact head regions (scale bars, 10 µm). DAPI was used to label nuclear DNA (gray). (B) Upper row: normal head regeneration (day seven) in control, klf, or Pax3/7 RNAi animals. Animals were fed four times followed by head amputation. Second row: cintillo+ cells are not detected in klf(RNAi) animals; graph on the right shows total number of cintillo+ cells (mean ± SD) in regenerating heads (n>14 animals per RNAi condition, Student's t-test, p<0.0001). Third row: dopamine-β-hydroxylase-expressing cells are strongly reduced in regenerating Pax3/7(RNAi) animals; graph on the right shows total number of dopamine-β-hydroxylase cells (mean ± SD) in regenerating heads (n>13 animals per RNAi condition, Student's t-test, p<0.0001). Fourth row: dorsal view of control labelings using the RNA probes for sim, the outer branches and photoreceptor neuron gene Na+-dependent Cl-/HCO3-, and the tryptophan hydroxylase (Tph) gene show normal expression of these genes in the regenerating heads of the different RNAi animals. Images shown are maximal intensity projections. Images are representative of results seen in >5 animals (except otherwise labeled) per panel. Anterior, up. Scale bars, 100 µm.

Figure 2-15: The role of FoxA in tissue specification (related to Figure 2-16). (A) qPCR for FoxA in control and FoxA RNAi animals showed decreased levels of FoxA mRNA. Data are shown as Ct mean ± SD. p-value is indicated for unpaired Student's t-test. (B) Double FISH using the RNA probes hnf4 (green) and FoxA (magenta) in intact animals. A subset of cells in the lining of the pharynx cavity co-expressed both genes. Dotted box is shown in higher magnification on the right panel (scale bars, 10 µm). (C) Left: FoxA(RNAi) animals showed defective gut regeneration (day seven, anterior blastemas) as observed by double FISH using the RNA probes hnf4 (magenta) and mat (green). Images are representative of results seen in >4 animals. Right: Co-expression of FoxA and hnf4 in a subset of sorted X1 neoblasts from wounded planarians at 48 hours following head and trunk amputation (13/48 FoxA+ cells, 27.1%). Scale bars, 10 µm. All images shown are maximal intensity projections. Anterior, up. Scale bars, 100 µm.

[Figure 2-16 image. Panel labels: (A) trunk d7; (B) tail fragment d4; control and FoxA RNAi.]

Figure 2-16: FoxA RNAi disrupts pharynx regeneration. (A) FoxA(RNAi) animals show dorsal lesions in the old pharynx (in trunk pieces, n=18/18) and do not regenerate a new pharynx (in tail fragments, n=15/20) seven days following amputation. Light images show no pharynx opening on the ventral side in regenerating tail fragments of FoxA(RNAi) animals. (B) FISH using the RNA probes mhc-1 and FoxA shows a defective regenerating pharynx in tail fragments of FoxA(RNAi) animals at four and seven days following amputation. Co-expression of the muscle marker mhc-1 (green) and the pharynx transcription factor FoxA (magenta) is observed in cells of the pharynx primordium in control RNAi animals at four days following tail amputation. The right panels show expression of mhc-1 (magenta) in day seven regenerating tail fragments of control and FoxA RNAi animals. Images shown are maximal intensity projections. Images are representative of results seen in >9 animals per panel. Anterior, up. Scale bars for all panels, 100 µm.

[Figure 2-17 image. Bar graph of X1 cells expressing FoxA, Pax3/7, and klf at 0 hours and 48 hours.]

Figure 2-17: Cell FISH in sorted X1 cells using the RNA probes FoxA, pax3/7, and klf (related to Figure 2-18). The graph shows percentages of X1 cells expressing the transcription factors FoxA, pax3/7, and klf. Animals were amputated (heads and trunks) and pre-pharyngeal regions were collected at 0 hours (isolated immediately after amputation) and 48 hours post-amputation. X1 cells were then isolated by FACS and cell FISH was performed. Numbers of counted X1 cells from wounded planarians expressing the different transcription factors are: FoxA+ 0 h: 23/1208, 48 h: 54/704; pax3/7+ 0 h: 4/2906, 48 h: 12/2299; klf+ 0 h: 3/3614, 48 h: 15/2746. A two-tailed Fisher's exact test was used; p-values are shown.
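The comparison reported in this legend can be sketched in Python as follows. This is an illustrative reimplementation of a two-tailed Fisher's exact test using only the standard library, not the software used for the original analysis; the function and variable names are assumptions introduced here. The counts are those listed in the legend.

```python
from math import comb

def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    lowest, highest = max(0, col1 - row2), min(row1, col1)

    def weight(k):
        # Unnormalized hypergeometric probability of k positives in row 1.
        return comb(row1, k) * comb(row2, col1 - k)

    observed = weight(a)
    # Sum every table with the same margins that is no more probable than the observed one.
    extreme = sum(w for w in map(weight, range(lowest, highest + 1)) if w <= observed)
    return extreme / comb(row1 + row2, col1)

# (positive cells, total cells) at 0 h and 48 h, from the Figure 2-17 legend.
counts = {
    "FoxA": ((23, 1208), (54, 704)),
    "pax3/7": ((4, 2906), (12, 2299)),
    "klf": ((3, 3614), (15, 2746)),
}

percentages = {}
p_values = {}
for gene, ((pos0, n0), (pos48, n48)) in counts.items():
    percentages[gene] = (100 * pos0 / n0, 100 * pos48 / n48)
    p_values[gene] = fisher_exact_two_tailed(pos0, n0 - pos0, pos48, n48 - pos48)
    print(f"{gene}: {percentages[gene][0]:.2f}% -> {percentages[gene][1]:.2f}%, p = {p_values[gene]:.2g}")
```

Each table compares positive and negative X1 cells at 0 hours against 48 hours, so a small p-value indicates that the fraction of expressing cells differs between the two time points.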

[Figure 2-18 image. (A) Wounding schematic. (B) Model diagram with columns: cNeoblast, Specialized Neoblasts, Differentiated Tissues, References. Tissues shown: eye, protonephridia, anterior pole, serotonergic neurons, other neuron subsets, cintillo+ sensory neurons, DBH+ ventral midline neurons, pharynx, gut, muscle, and CNS.]

Figure 2-18: Proposed model: specialization of neoblasts into different lineages following wounding. cNeoblasts will give rise, directly or indirectly, to specialized neoblasts for most-to-all tissues. Both the cNeoblasts and the specialized neoblasts express the smedwi-1 marker. As specialized neoblasts further differentiate, they will lose expression of smedwi-1. A summary of all known transcription factors expressed in neoblasts and functionally associated with distinct lineages by RNAi analysis is shown in the upper part of the model (white background). The transcription factors expressed in neoblasts and in specific tissues, but that have not been functionally shown to be required for their specification, are shown in the lower part of the model (blue background). During the production of this manuscript the role of FoxA in pharynx regeneration was also described in [91]. *Functional requirement for pharynx specification was not tested. **Co-expression with the smedwi-1 protein was shown. The transcription factors dlx, six1/2, soxB, otxA, as well as hunchback and eya have been shown to be required for the regeneration of the eye [70, 71] and the nephridia [65], but no co-expression with the neoblast gene smedwi-1 or expression in X1 neoblasts has been tested.

klf, Pax3/7, and control RNAi animals regenerated anterior blastemas with largely normal brain formation and patterning (Figures 2-14B and 2-13C). Specifically, normal expression was observed for Smed-tryptophan hydroxylase (tph) [71, 92], sim, and a gene expressed in eyes and cephalic branches with homology to a sodium-dependent chloride/bicarbonate exchanger (Na+-dependent Cl-/HCO3-) (Figures 2-14B and 2-13C). klf and Pax3/7 transcripts were greatly reduced in klf and Pax3/7 RNAi animals, respectively (Figure 2-15D). Despite displaying normal regeneration of many neurons, klf(RNAi) animals had a near-complete absence of cintillo-expressing cells (n=14/14, Figure 2-14B) and also lacked photoreceptor neurons (Na+-dependent Cl-/HCO3--expressing cells) [71], suggesting a specific role for klf in the regeneration of this subset of peripheral sensory neurons. Similarly, Pax3/7(RNAi) animals showed significantly reduced numbers of DBH-expressing cells (n=12/13, Figure 2-14B), consistent with a role for this gene in the specification of this subset of neurons. In both cases, it is unknown whether the neuron classes are absent or are present but not expressing the appropriate genes.

FoxA was expressed in the pharynx primordium and in neoblasts during pharynx regeneration (Figures 2-6 and 2-8B). We inhibited FoxA with RNAi and amputated animals into head, trunk, and tail fragments. Seven days later, all regenerating trunk pieces (which already had pharynxes) had lesions at the pharynx (n=18/18 animals, Figure 2-16A). Most tail fragments failed to regenerate a pharynx (n=15/20, Figure 2-16A). Because the pharynx is a muscular organ, the mhc-1 muscle marker was used to determine the presence and structure of the pharynx. We found that FoxA(RNAi) animals showed limited pharynx regeneration (n=9/9), indicating a requirement for FoxA in pharynx specification (Figure 2-16B). Moreover, regenerating tail fragments of FoxA(RNAi) animals four days after amputation completely lacked expression of both FoxA and mhc-1 (n=9/9, Figure 2-16B). By contrast, control RNAi animals showed FoxA expression together with mhc-1 in the pharynx primordium (Figure 2-16B). Moreover, FoxA mRNA levels were significantly reduced in FoxA(RNAi) animals compared with control RNAi animals (Figure 2-15A). FoxA expression was not exclusively restricted to the pharynx in intact worms (Figure 2-8C). Similarly, the gut transcription factor hnf4 was expressed in a few scattered cells in and around the pharynx in intact animals (Figure 2-15B). We found some cells around the pharynx co-expressing FoxA and hnf4 in intact animals (Figure 2-15B). Moreover, some isolated X1 cells from wounded animals co-expressed FoxA and hnf4 (27.1% of FoxA+ cells, Figure 2-15C). Anterior blastemas in FoxA(RNAi) animals showed defects in the regeneration of gut branches (Figure 2-15C), reflecting either a direct role in anterior gut formation or a secondary effect of failed pharynx regeneration. A FoxA homolog in C. elegans (pha-4) is essential for the establishment of pharynx identity, with pha-4 mutants lacking the entire pharynx. Animals from cnidarians to humans have FoxA transcription factors that are commonly associated with development of digestive tracts [93]. The phenotype observed in FoxA(RNAi) planarians is reminiscent of the C. elegans pha-4 mutants and the forkhead mutants in Drosophila that lack foregut and hindgut, consistent with an evolutionarily ancient role for FoxA transcription factors in foregut/pharynx specification and differentiation [93].

2.5 Discussion

Planarian neoblasts have long attracted interest as dividing cells in adult animals that are required for regeneration and cell turnover [94]. A largely unaddressed question has been whether neoblasts are a homogeneous population or are composed of multiple different cell types, such as stem cells and lineage-committed progenitors. Two models for regeneration have been considered [73]. The naive neoblast model unites all neoblasts as pluripotent stem cells, with fate specification occurring in non-dividing neoblast progeny. The specialized neoblast model posits that the fate of regenerating cells is specified in neoblasts themselves.

A variety of recent reports have demonstrated instances of smedwi-1+ cell specialization. Transcription factors required for the regeneration of the eye [70, 71], nephridia [65], the anterior pole [75], and several neuron classes [66, 68, 76] are expressed in smedwi-1+ cells during regeneration. These data provide support for the specialized neoblast hypothesis. If this hypothesis is correct, it predicts that numerous additional transcription factors, perhaps for the specification of every missing cell type, would be expressed in distinct subsets of neoblasts during regeneration.

We tested this prediction here by seeking transcription factors expressed in small numbers of X1 neoblasts and/or smedwi-1+ cells following wounding. We coupled an RNA-seq approach with expression screening approaches using transcription factors expressed in specific differentiated planarian tissues. In total, 41 transcription factors (8 known from prior reports and 33 reported here) were expressed in X1 neoblasts and/or smedwi-1+ cells (Figure 2-5E). Of these 41 transcription factors, 36 were detectably expressed in sorted X1 neoblasts from wounded planarians, indicating expression in cells that will complete at least one round of division. These identified transcription factors displayed expression in specific differentiated cells and in small numbers of neoblasts at wound sites, consistent with the specialized neoblast model.

Many specific lineages could be investigated to understand the formation of specific cell types during regeneration, and the transcription factors described here provide a resource for such inquiry. In vivo lineage-tracing methods are presently limited in planarians. Therefore, in principle, a transcription factor induced in neoblasts at wounds might have some role in neoblast physiology other than lineage specification. However, two lines of evidence suggest that many or all of these transcription factors will be involved in specifying progenitors for the regeneration of specific differentiated cell types. First, RNAi of a number of identified transcription factors, reported here and in published work, ablates the regeneration of their specific tissues. Here, we demonstrated a requirement for FoxA in the pharynx, pax3/7 in DBH+ ventral midline neurons, and klf in cintillo+ sensory neurons. In prior work, sp6-9, eya, and ovo were required for the eye [70, 71]; six1/2-2, POU2/3, and sall were required for the nephridia [65]; FoxD was required for the anterior pole [75]; and ap-2, lhx1/5-1, pitx, coe, hesl, and sim were required for different subsets of neurons [66, 68, 76]. Second, we found that transcription factors that were expressed in different differentiated tissues were expressed in distinct subsets of neoblasts. By contrast, in multiple cases, transcription factors expressed together in the same differentiated tissues were expressed together in the same neoblasts. It is still possible that one transcription factor might have a role in the specification of more than one lineage. For instance, klf is required for the specification of both photoreceptor neurons and the sensory cintillo+ neurons (this study and [71]). Similarly, eya has been shown to be required for the specification of eye progenitors [70] as well as nephridia cells [65].

The concept of neoblast specialization at wounds is important in understanding planarian regeneration and opens many avenues for future inquiry. For example, how is the neoblast response tailored to the identity of missing tissues? Do specialized neoblasts amplify their population, or do they rapidly cease division and differentiate? There is an ongoing need for differentiated cells in planarians, and low levels of specialized neoblasts appear to exist in intact animals [66, 71]. For example, the ovo+ eye progenitors are abundant near anterior-facing wounds at 48 hours following amputation; however, a small number of them can also be observed near the head of intact animals [71]. Similarly, we observed an increased number of X1 neoblasts expressing FoxA, klf, and Pax3/7 at 48 hours following wounding when compared to their expression in X1 neoblasts from intact animals (Figure 2-17).

Regeneration requires instructions that specify missing tissue to be replaced. To understand the connection between regenerative instructions and the production of appropriate cell types, it is essential to address the cellular step at which specification of the fate of regenerative cells occurs. We propose that cNeoblasts (directly, or via their descendants) begin expressing numerous transcription factors of specific lineages in distinct neoblast cells (Figure 2-18). In this model, almost all of the lineages formed during development could be reconstituted during regeneration, with progenitors that generate and comprise the planarian blastemas being a heterogeneous patchwork of lineage-specified cells.

2.6 Materials and methods

2.6.1 Animals and radiation treatment

Asexual Schmidtea mediterranea strain (CIW4) animals starved 7-14 days prior to experiments were used. Animals were exposed to a 6,000 rad dose of radiation using a dual Gammacell-40 cesium-137 source and fixed three days after irradiation. Double-stranded RNA-expressing bacteria cultures were mixed with 70% liver solution in a 1:300 ratio. RNAi animals were fed four times (days 0, 4, 7, and 11) and amputated at day 12. Seven days following amputation, animals were fixed and in situ hybridizations were performed. Details of histological methods can be found in Supplementary Information 2.1.

2.6.2 mRNA purification and Illumina sequencing

Total RNA was purified using TRIzol reagent, and mRNA was sequenced using the TruSeq RNA Sample Preparation Kit v2 (Illumina) on a HiSeq. More than 15 million reads per triplicate were mapped using Bowtie 2 to a S. mediterranea transcriptome ([95], GBEE01000000 in the Transcriptome Shotgun Assembly database) generated from paired-end sequencing of both intact and mixed-stage wounded animals, with summation of reads associated with Trinity-generated isotigs for increased statistical power. Differential gene expression was calculated for 0 to 48 hours after wounding for both anterior- and posterior-facing wounds using DESeq's negative binomial statistical tests with Benjamini-Hochberg multiple test correction for a 10% false discovery rate.
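For readers unfamiliar with the Benjamini-Hochberg correction, the adjustment can be sketched as below. This is a minimal standard-library illustration of the procedure, not DESeq's implementation; the function name and the p-values are hypothetical and chosen only to show the arithmetic.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical unadjusted p-values for five contigs (illustration only).
p_values = [0.001, 0.008, 0.039, 0.041, 0.60]
adjusted = benjamini_hochberg(p_values)
# Contigs called significant at a 10% false discovery rate.
significant = [q <= 0.10 for q in adjusted]
print(adjusted, significant)
```

A contig is retained at a 10% false discovery rate when its adjusted p-value is at most 0.10.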

2.6.3 RNAi

dsRNA-expressing bacteria cultures were mixed with 70% liver solution in a 1:300 ratio to culture volume. RNAi animals were fed four times (days 0, 4, 7, and 11) and amputated at day 12. Seven days following amputation, animals were fixed and in situ hybridizations were performed.

2.6.4 Statistical analysis

Co-expression frequencies for meis and FoxA, gata4/5/6 and hnf4, and collagen and myoD (Figure 2-6A) were tested for deviation from the frequencies expected if the genes were expressed independently, using a two-tailed Fisher's exact test. A significant p-value therefore indicates that the two genes tested were not independently expressed. The observation that tissue-specific transcription factors do not co-express within the same neoblast (Figure 2-12) was tested using a one-tailed paired t-test. In this case, the null hypothesis was that the observed co-expression between all combinations of tissue-associated transcription factors occurs at frequencies expected from their individual expression within the X1 population. A significant p-value therefore indicates that the observed frequency of co-expression between all combinations of tissue-associated transcription factors is less than expected. An unpaired t-test was used to determine the significance of the difference in the number of cintillo+ and DBH+ cells between control and klf and Pax3/7 RNAi conditions, respectively (Figure 2-14). An unpaired t-test was used to determine the significance of the mRNA level changes between control and FoxA RNAi conditions (Figure 2-15A). A two-tailed Fisher's exact test was used to determine the significance of FoxA, klf, and pax3/7 wound-induced expression in X1 neoblasts between 0 and 48 hours (Figure 2-17).
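The independence expectation underlying the paired comparison can be illustrated with a short sketch. The counts below are hypothetical and the helper names are assumptions introduced for illustration; only the t statistic is computed, and the one-tailed p-value would be read from a t distribution with (number of pairs - 1) degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

def expected_coexpression(n_cells, n_a, n_b):
    """Expected double-positive cells if genes A and B are expressed independently."""
    return n_a * n_b / n_cells

def paired_t_statistic(observed, expected):
    """Paired t statistic for observed minus expected, one pair per gene combination."""
    diffs = [o - e for o, e in zip(observed, expected)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# Hypothetical counts: (cells scored, A+ cells, B+ cells, A+B+ cells).
pairs = [(1000, 50, 40, 0), (1200, 60, 30, 1), (900, 45, 45, 0)]
expected = [expected_coexpression(n, a, b) for n, a, b, _ in pairs]
observed = [both for *_, both in pairs]
t_stat = paired_t_statistic(observed, expected)
print(expected, observed, round(t_stat, 3))
```

A negative t statistic corresponds to fewer co-expressing cells than independence would predict, which is the direction tested by the one-tailed comparison.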

2.6.5 Accession numbers

Illumina sequencing data were deposited to the Gene Expression Omnibus with the accession number GSE57226. Sequences of genes identified in this study were deposited to GenBank with the accession numbers KJ934799 to KJ934818.

2.6.6 Nomenclature

If applicable, existing published S. mediterranea gene names were used. If a characterized gene fell into a gene family with an existing name/number system in S. mediterranea, this system was adopted for the new gene. For genes with similarity to multiple members of a gene family, phylogenetic analysis was performed (Figure 2-2A). Genes with similarity to the same gene in multiple organisms, with or without clear PFAM domains, were named according to the convention for that family, and if that family was expected to have several members, a "-1" was added. For all other genes, we assigned "homology names" based on the best BLASTx hit to human or to the closely related species Dugesia japonica and Macrostomum lignano (Figures 2-1C, 2-3B, 2-4C and Table 2.1).

2.6.7 Whole-mount and cell in situ hybridizations

Animals were amputated and fixed seven days following injury, or fixed without injury (intact), in 4% formaldehyde-PBST solution, and nitroblue tetrazolium/5-bromo-4-chloro-3-indolyl phosphate (NBT/BCIP) colorimetric whole-mount in situ hybridizations or FISH were performed as described [96], with the addition of sodium azide inactivation [97]. For cell FISH, animals were head and trunk amputated, and 48 hours later, pre-pharyngeal regions were macerated, treated with collagenase (10 mg/ml) in CMFB media, and labeled with Hoechst 33342 [98]. FACS analysis was performed as described [56] and X1 neoblasts were purified. X1 cells were then placed on coverslips for 30 minutes in CMF, followed by fixation in 4% paraformaldehyde in CMF. Cell FISH was performed as described above, with slight modifications: all hybridization washes were limited to alternating 10 and 20 minute intervals, all other washes were limited to 10 minutes, and WBBR (Roche) was used as a blocking agent before antibody development. Only cells with robust expression of transcription factors or markers were counted as positive, using Fiji on maximal intensity projections. Fluorescent images were taken with a Zeiss LSM700 Confocal Microscope and light images were taken with a Zeiss Discovery Microscope. Images were adjusted for brightness and contrast.

2.6.8 qPCR analysis

Total RNA was extracted from three biological replicates (consisting of 3 animals each) for FoxA and control RNAi animals. cDNA was synthesized using oligo(dT) and SuperScript. Transcript levels were detected using the SsoFast EvaGreen Supermix (Bio-Rad) and the 7500 Fast Real Time PCR System (Applied Biosystems). The selected qPCR primers were 95% to 110% efficient in test reactions, did not amplify transcripts in control samples prepared without the addition of reverse transcriptase, and targeted a region of the gene that did not overlap with the dsRNA (left: ACCAAGTACTACTAGATTGTTGA, right: ACATGCTTGCACTCATTGGG for FoxA; left: AGCTCCATTGGCGAAAGTTA, right: CTTTTGCTGCACCAGTTGAA for Gapdh). Ct values were normalized by the levels of a housekeeping gene (Gapdh), and the corresponding relative RNA level values were normalized by the mean value for the target gene in three control RNAi replicates.
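The two normalization steps described above can be sketched as follows. This is a minimal illustration that assumes perfect doubling per cycle (100% primer efficiency); the Ct values and helper names are hypothetical and are not data from this study.

```python
from statistics import mean

def relative_level(ct_target, ct_reference):
    """Target level relative to the housekeeping gene, assuming doubling each cycle."""
    return 2.0 ** -(ct_target - ct_reference)

# Hypothetical (target Ct, housekeeping Ct) pairs, three replicates per condition.
control = [(24.0, 18.0), (24.2, 18.1), (23.9, 18.0)]
foxa_rnai = [(26.0, 18.0), (26.3, 18.2), (25.9, 17.9)]

# Step 1: normalize each sample to the housekeeping gene.
control_levels = [relative_level(t, r) for t, r in control]
rnai_levels = [relative_level(t, r) for t, r in foxa_rnai]

# Step 2: normalize to the mean of the control RNAi replicates.
baseline = mean(control_levels)
control_norm = [x / baseline for x in control_levels]
rnai_norm = [x / baseline for x in rnai_levels]
print(control_norm, rnai_norm)
```

With this scaling the control replicates average 1.0, so a value below 1.0 in the RNAi condition indicates a reduced relative mRNA level.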

2.6.9 Supplemental Table 1

List of genes that showed up-regulation in X1 cells following wounding by mRNA Illumina sequencing. All contigs are listed that meet the criteria for up-regulation 48 hours after anterior and posterior amputation used in this study (log2 fold change in expression > 0.7 and unadjusted p-value (Punadj) < 0.05, with a BLASTx E-value less than 10^-5). Known planarian genes were identified by BLASTn to GenBank sequences. Previously unknown planarian transcription factors were defined by best human BLASTx homology, followed by PFAM domain analysis (Figure 2-3B) and in some cases phylogenetics (Figure 2-2A). The transcriptome is known to be partially redundant, as in the case of ap-2, which is assembled as two contigs.
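The selection criteria stated above can be expressed as a simple filter. The sketch below is illustrative only: the record type, field names, and contig values are hypothetical, and only the three thresholds come from the text.

```python
from dataclasses import dataclass

@dataclass
class Contig:
    name: str
    log2_fold_change: float   # 48 h versus 0 h after amputation
    p_unadjusted: float       # unadjusted p-value (Punadj)
    blastx_evalue: float      # best BLASTx E-value

def is_upregulated(contig, min_log2_fc=0.7, max_p=0.05, max_evalue=1e-5):
    """Apply the three up-regulation thresholds given in the text."""
    return (contig.log2_fold_change > min_log2_fc
            and contig.p_unadjusted < max_p
            and contig.blastx_evalue < max_evalue)

# Hypothetical contigs (illustration only).
contigs = [
    Contig("contig_a", 1.2, 0.010, 1e-30),  # passes all three criteria
    Contig("contig_b", 0.5, 0.001, 1e-40),  # fold change too small
    Contig("contig_c", 2.0, 0.200, 1e-12),  # p-value too large
    Contig("contig_d", 1.5, 0.030, 1e-3),   # weak BLASTx homology
]
selected = [c.name for c in contigs if is_upregulated(c)]
print(selected)
```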

Table 2.1: Supplementary Table: List of Genes that Showed Upregulation in X1 Cells following Wounding by mRNA Illumina Sequencing

http://www.cell.com/action/showlmagesData?pii=S2213-6711%2814%2900186-6

Chapter 3

Single-cell sequencing reveals diverse neoblast specialization for the Schmidtea mediterranea nervous system

3.1 Contributions

Kellie M. Kravarik (K.M.K.) and Peter W. Reddien (P.W.R.) were responsible for the overall study design and interpretation of data. K.M.K. performed all experiments, sequencing and computational analysis. K.M.K. and P.W.R. wrote the manuscript.

3.2 Summary

The planarian nervous system is composed of numerous different cell types, providing an opportunity to study how neoblasts acquire the diverse cell fates that make up a particular tissue. We used single-cell RNA-sequencing to identify the transcriptomes of hundreds of planarian neurons and neoblasts. Using computational analysis of these data, we identified the transcriptomes of several specific types of planarian neuronal cells, including cholinergic, dopaminergic, and serotonergic neurons, as well as glial cell types. In neoblasts, we identified a population of cells that expressed both markers of differentiated neurons and transcription factors expressed in various neural cell types, which we hypothesize to be neural specialized neoblasts. We found a number of unique populations of neural neoblasts that correspond to specific neural sub-types. Interestingly, these neural specialized neoblasts do not express a detectable unified gene regulatory network. These results are consistent with direct specification of neural sub-types in neoblasts and suggest that neoblasts do not differentiate down a highly hierarchical lineage path as has been described for many developmental lineages.

3.3 Introduction

Regenerating a nervous system requires producing an extreme diversity of cell types. Nervous system regeneration also requires that the organism reestablish, on demand, the specific neural cell types lost during injury and then integrate them into the existing adult nervous system. Such neural regeneration is of broad biological interest because of its potential application to the many sources of human neural damage. Because developmental systems begin with a fixed starting point but regenerative systems must begin with whatever tissue remains after injury, regeneration offers a context to understand the plasticity of tissue differentiation. The planarian Schmidtea mediterranea is a model for understanding whole-head, and therefore whole-brain, regeneration. Regeneration of the S. mediterranea head requires replacement of the cephalic ganglia and the peripheral nervous system, which together comprise thousands of neurons and glial cells. This provides a context for understanding how the diversity of cell types that constitute a complex tissue can be generated in adult animals.

Planarians regenerate all tissue from a population of mesenchymal cells termed neoblasts, which contain pluripotent cells that can differentiate into every cell type of the body [64]. Neoblasts are the only dividing cell type in planarians, but single-cell studies have revealed heterogeneity and evidence of progenitor specification. Evidence exists for specialization of muscle, eye, protonephridia, pharynx, and gut cell types, and a large population of epidermal progenitors is maintained in neoblasts [99][48, 65, 68, 69, 71, 100].

The planarian nervous system is complex. The head contains bilobed cephalic ganglia attached to two ventral nerve cords that run the length of the body. Studies of neural gene expression indicate that there are sensory cells and cells producing every major neurotransmitter found in vertebrates [101]. There is evidence of neoblast specialization for neural differentiation: ovo, sp6-9, and eya [70, 71], single-minded [66-69, 102], and coe, pax3/7, and klf [48][100] are all expressed in neoblasts and required for regeneration of specific neural cell populations. There is also evidence of a population of neoblasts, isolated by fluorescence activated cell sorting (FACS), that express markers of differentiated neurons, though they express relatively few markers of neoblast state, raising the possibility that such cells are in fact post-mitotic progenitors [103]. We sought to evaluate the differentiation of the nervous system at a systems level, to better characterize the types of neural diversity that exist, and to leverage that information to understand how neoblasts generate that diversity.

3.4 Results

3.4.1 Single-cell sequencing of neurons and neoblasts from the planarian head

To characterize the planarian nervous system and the neoblasts that generate it, we isolated individual live cells from dissociated planarian heads by Fluorescence Activated Cell Sorting (FACS) and amplified cDNA for each cell (Figure 3-1A). 438 neoblasts (marked by >2C DNA content) and 427 differentiated cells (marked by 2C DNA content) were recovered from intact heads. To increase the diversity of neurons isolated, 161 differentiated cells from whole planarian heads, 129 differentiated cells from dissected head rims, and 138 differentiated cells from the dissected ventral mid-brain regions were recovered from animals 5 days after lethal irradiation, which are depleted of stem and progenitor cells. Amplified cDNA from single cells was screened by qPCR to remove low-quality cells and cells of the well-characterized epidermal lineage from downstream analysis. We then made next-generation RNA-sequencing libraries for all desired cells and sequenced them at an average depth of 550,000 reads per cell. After removing libraries with <10,000 reads and/or libraries with >25 percent contamination of human, mouse, yeast, or E. coli sequences, we mapped all libraries to a planarian transcriptome [104] and further filtered our data to cells with sequenced reads from >800 and <10,000 unique planarian transcripts. In total, 374 neoblasts and 360 2C cells of high quality were recovered for analysis, with an average of 2,981 transcripts associated with uniquely mapped reads.

Computational analysis of these libraries using the Seurat R Package [105] described these cells with 17 informative principal components (PCs) identified from 4,608 differentially expressed transcripts. Embedding of these PCs in 2D space using t-SNE [106] revealed a complex mixture of cells with different gene expression (Figure 3-1B), described by the clusters of cells seen in the plot. Using published markers of planarian cell types [107], we observed that neoblasts are distinct from all other cells (Figure 3-2A, smedwi-1). We also recovered only a small number of gut, epidermal, and nephridia cells, indicating that our strategy to enrich for neuronal cell types was successful (Figure 3-2A). We found 7 clusters that contained substantial expression of neural genes (Figure 3-1B, circled). Of these 7 clusters, 5 expressed markers of ciliated neurons (Figure 3-2B). Additionally, we identified two clusters that expressed markers of recently identified planarian glial cells (Figure 3-1C, circled) [108][102].
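The library-level quality filters described above can be summarized in a short sketch. Only the thresholds are taken from the text; the function name, the tuple layout, and the example libraries are hypothetical, and in the actual workflow the transcript count is determined after mapping rather than in a single pass.

```python
def passes_qc(total_reads, contaminant_fraction, transcripts_detected):
    """Apply the library filters described in the text to one single-cell library."""
    enough_reads = total_reads >= 10_000        # drop libraries with <10,000 reads
    clean = contaminant_fraction <= 0.25        # drop >25% human, mouse, yeast, or E. coli reads
    complexity_ok = 800 < transcripts_detected < 10_000
    return enough_reads and clean and complexity_ok

# Hypothetical libraries: (cell id, reads, contaminant fraction, transcripts detected).
libraries = [
    ("cell_01", 550_000, 0.02, 2_981),
    ("cell_02", 8_000, 0.01, 900),      # too few reads
    ("cell_03", 400_000, 0.40, 3_100),  # too contaminated
    ("cell_04", 300_000, 0.05, 650),    # too few transcripts detected
]
kept = [cell for cell, reads, frac, n_tx in libraries if passes_qc(reads, frac, n_tx)]
print(kept)
```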

[Figure 3-1 image. Panel labels: (A) 6,000 rads and 0 rads, with FACS plots (axis: dye content, AU); (C) glial gene expression.]

Figure 3-1: Single-cell mRNA sequencing analysis of neurons and sorted neoblasts from planarian heads. (A) Schematic diagram showing the isolation of neoblasts and neurons from intact heads, irradiated intact heads, and irradiated lateral and medial brain regions. Cell suspensions were labeled with Hoechst 33342 and individual cells were isolated by FACS followed by immediate lysis and freezing. RNA was then amplified using SmartSeq2, Nextera libraries were prepared, and Illumina mRNA sequencing was performed. (B) Top: whole animal FISH image of the neural marker PC2. Anterior, left. White box depicts region isolated for sequencing. Image shown is a maximal intensity projection. Scale bar, 100 µm. Bottom: tSNE plot showing principal component analysis from RNA libraries of isolated neurons and neoblasts from planarian heads. Each dot is a cell, and cells grouped together have similar gene expression profiles. The magenta hue of a cell depicts individual expression of all neural markers (ciliated and non-ciliated) from [107], as shown in Table 3.1. Black ellipses denote clusters with substantial neural gene expression. (C) Top: whole animal FISH image of the glial marker eaat2-2. White box depicts region isolated for sequencing. Image shown is a maximal intensity projection. Scale bar, 100 µm. Bottom: tSNE plot showing principal component analysis from RNA libraries of isolated neurons and neoblasts from planarian heads. Each dot is a cell, and cells grouped together have similar gene expression profiles. The magenta hue of a cell depicts individual expression of all glial markers from [102, 108], as shown in Table 3.1. Black ellipses denote clusters with substantial glial gene expression.

[Figure 3-2 image.]

Figure 3-2: Single-cell mRNA sequencing analysis of neurons and sorted neoblasts from planarian heads. (A) tSNE plots of the results of principal component analysis from RNA libraries of isolated neurons and neoblasts from planarian heads. Each dot is a cell, and cells grouped together have similar gene expression profiles. The magenta hue of a cell depicts individual expression of: Muscle Markers, Protonephridia Markers, Gamma (gut) Neoblast Markers, Parapharyngeal Markers, Zeta (epidermal) Neoblast Markers, Epidermal Lineage Markers, and the neoblast marker smedwi-1, all from [107], as shown in Table 3.1. (B) tSNE plots of the results of principal component analysis from RNA libraries of isolated neurons and neoblasts from planarian heads. Each dot is a cell, and cells grouped together have similar gene expression profiles. The magenta hue of a cell depicts individual expression of: Nonciliated Neurons and Ciliated Neurons, all from [107], as shown in Table 3.1.

3.4.2 Single-cell sequencing identifies the transcriptomes of dopaminergic, cholinergic, serotonergic, and afferent neurons

To characterize the diversity of neural types obtained through sequencing, we first analyzed these clusters for markers of known neural populations and lineages. Pla- narian brains contain cell types expressing the biosynthesis enzymes for several known monoamine neurotransmitters, including serotonin [921, dopamine [109, 1101, GABA [111], acetylcholine[112], and octopamine [113][101]. Our data displayed cells with expression of choline acetyltransferase (ChAT), marking cholinergic neurons, Tyro- sine hydroxylase (Th), marking dopaminergic neurons, and Tryptophan hydroxylase (Tph), marking serotonergic neurons (Figure 3-3A). Each of these genes is expressed in distinct brain regions. We also observed sporadic expression of glutamate decar- boxylase and dopamine beta hydroxylase in sequenced cells (Figure 3-4A), suggesting that we did not sample glutamatergic and octopaminergic neural populations deeply enough to enable computational clustering. However, the enrichment of ChAT, Th, and Tph in specific clusters suggests that separation of planarian neural types into classes associated with distinct neurotransmitter expression is a major source of cell- cell variation.

106 The lineage for planarian Tph+ serotonergic neurons has been shown to require the transcription factors lhx5-1, pitx, and islet-i in both adult neurons and specialized serotonergic neoblasts [68, 691. Whereas detected Tph expression was not uniform throughout the cells in its cluster, some cells in this cluster expressed lhx5-1 and islet- 1. pitx-1 and lhx5-1 were co-expressed in rare neoblast cells (Figure 3-3B and Figure 3-4B). The octopaminergic neuron lineage is marked by dopamine beta hydroxylase (Dbh) and requires Pax3/7-like/arx (pax3/7)and nkx2-like/nkx2. 1 (nkx2). Both of these transcription factors are expressed in neoblasts near the brain, throughout the regenerating animal, and in differentiated neuronal cells [48111001. Consistent with previous findings, the measured expression of nkx2 was much broader (expressed in many more cells) than pax3/7 (Figure 3-3C) indicating that it might be more diversely involved in neural identity. This may explain why nkx2 RNAi caused a dramatic movement defect 1100] whereas pax3/7 RNAi and the loss of the Dbh+ cells did not [481. Single-minded (sim), a bHLHtype transcription factor, is expressed in neoblasts and ChAT, tbh, and Th neurons of the ventral midline. Consistent with this, we observed expression of sim in a subset of cells within the ChAT+ cell cluster (Figure 3-3D). A second gene encoding a bHLH transcription factor, coe, is expressed throughout the cephalic ganglia and in neoblasts, and is required for ChA T, gad, Tbh, th, and tph type neurons. We observed expression of coe in several distinct clusters, including a subset of ChA T+ cells and three other groups (Figure 3-3E). One of those clusters robustly expressed the transcription factor-encoding dbx, pou41-1, and soxb2-2 genes, whereas another expressed Post-2d/AbdBA 1114][115][1161 (Figure 3-3E, Figure 3-4C). 
This suggests that coe is required for the specification and/or biology of several different neuronal cell types and provides an opportunity for dissecting how coe functions in conjunction with other transcriptional regulators to establish and maintain neural diversity.

There are three other well-characterized neural lineages in planarians. The transcription factor Klf-like-1 is required for regeneration of cintillo+ sensory neurons [48], the transcription factor ap-2 is required for regeneration of inner-lobe trpA+ sensory neurons [76], and a network of ovo, eya, otxA, six1/2, dlx, and sp6-9 is required for differentiation of optic cup cells and photoreceptor neurons in the brain [70][71]. Each of these lineages displays evidence of neoblast specialization, involving the expression of one or more transcription factors required for neuron formation in rare subsets of neoblasts. Possibly because of sampling limitations, we only observed expression of klf, eya, and sp6-9 in isolated cells and neoblasts (Figure 3-4D-F), and we did not see substantial enrichment of these genes in any single cell cluster. Based on these data, we suggest that our single-cell sequencing experiment had sufficient power to detect neural populations of moderate size, but did not recover all such neural types. Thus, further work sequencing a higher number of cells will be required to fully appreciate the molecular basis for the neural diversity of planarians.
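The sampling argument can be made concrete with a simple binomial model. The sketch below is not part of the thesis analysis. It assumes cells are sampled independently, and the number of sequenced cells, the minimum number of cells needed to form a cluster, and the population frequencies are hypothetical values chosen only for illustration.

```python
from math import comb

def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of capturing at least k cells of a type."""
    return 1.0 - sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k))

n_cells = 500    # hypothetical number of sequenced cells
min_cells = 10   # hypothetical minimum number of cells needed to resolve a cluster

for freq in (0.10, 0.02, 0.005):  # hypothetical frequencies of a neural type
    p = prob_at_least(min_cells, n_cells, freq)
    print(f"frequency {freq:.3f}: P(>= {min_cells} cells) = {p:.3f}")
```

Under these assumptions, the probability of capturing enough cells falls steeply as a population becomes rarer, which is the intuition behind sequencing more cells to recover rare neural types.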

Figure 3-3: Neurotransmitter and transcription factor expression in neurons and sorted neoblasts from planarian heads. (A) Top: whole animal FISH images of the cholinergic neural marker ChAT, the dopaminergic neural marker Th, and the serotonergic neural marker Tph. White box depicts region isolated for sequencing. Images shown are maximal intensity projections. Scale bars, 100 µm. Bottom: tSNE plots of the results of principal component analysis from RNA libraries of isolated neurons and neoblasts from planarian heads. Each dot is a cell, and cells grouped together have similar gene expression profiles. The magenta hue of a cell depicts individual expression of ChAT, Th, and Tph. Black ellipses denote clusters with substantial ChAT, Th, and Tph expression. (B) tSNE plots as in (A). The magenta hue of a cell depicts individual expression of the serotonergic lineage transcription factors lhx5-1 and islet-1, and summed expression of the serotonergic lineage transcription factors lhx5-1 and pitx in cells where both were detected (denoted as co-expression). Black ellipses denote clusters with substantial expression of these transcription factors. (C) tSNE plots as in (A). The magenta hue of a cell depicts individual expression of the octopaminergic lineage transcription factors nkx2 and pax3/7. Black ellipses denote clusters with substantial expression of both transcription factors. (D) tSNE plot as in (A). The magenta hue of a cell depicts individual expression of the midline neural transcription factor single-minded. Black ellipses denote clusters with substantial expression. (E) tSNE plots as in (A). The magenta hue of a cell depicts individual expression of the neural transcription factor coe and of the transcription factors enriched in coe+ clusters: dbx*, post-2d, and pou4-1. Black ellipses denote clusters with substantial expression of these transcription factors. *Gene named by best BLAST hit to the .

Figure 3-4: Neurotransmitter and transcription factor expression in neurons and sorted neoblasts from planarian heads. tSNE plots showing principal component analysis from RNA libraries of isolated neurons and neoblasts from planarian heads. Each dot is a cell, and cells grouped together have similar expression of genes. The magenta hue of a cell depicts individual expression of (A) the octopaminergic neural marker dopamine beta hydroxylase and the GABAergic neural marker glutamate decarboxylase, (B) the serotonergic transcription factor pitx, (C) a transcription factor enriched in a coe+ cluster, soxb2-2 (black ellipses denote the cluster with substantial expression), (D) the sensory neural transcription factor klf, (E) the eye and neural transcription factor eyes-absent (eya), and (F) the eye and neural transcription factor sp6-9.

3.4.3 Cholinergic neurons are divided into pax6A and skil sub-types

Cholinergic neurons are one of the most abundant types of neurons in the planarian brain. Many transcription factors expressed in the planarian nervous system appear to be expressed in the ChAT+ regions of the cephalic ganglia. To better characterize the regulatory diversity of cholinergic neurons, we analyzed the expression of genes encoding transcription factors enriched in isolated cholinergic neurons. Two transcription factors were broadly expressed in these cells: pax6A and skil (Figure 3-5A). Planarian pax6A has been shown to be expressed in the cephalic ganglia [76, 117] and in neoblasts following injury [48]. Pax6 family transcription factors are required for photoreceptor differentiation and eye formation across species. However, in planarians, pax6A is dispensable for eye regeneration [70, 71, 117]. Proteins of the Ski oncogene family are transcriptional regulators that can negatively regulate TGF-beta signaling by binding to Smad4 [118-120]. Ski oncogene proteins have also been implicated in transcriptional repression through associations with HDAC complexes, and in cell differentiation through binding to various transcription factors. Mice require the expression of a homolog of skil for development of the nervous system, face, and muscle [120]. Xenopus and zebrafish also require ski for neural and muscle development, suggesting a conserved role for skil genes in the regulation of neural lineages [118].

Using in situ hybridization for skil and pax6A transcripts, we observed that pax6A and skil were expressed in a large number of ChAT+ neurons of the cephalic ganglia in vivo (Figure 3-5B). Together, skil and pax6A were expressed in the majority of isolated ChAT+ neurons of one cluster (Figure 3-5A). Consistent with the single-cell sequencing data, we observed co-expression of these transcription factor-encoding genes in cells of the cephalic ganglia (Figure 3-5C, arrow), but the majority of their expression was in distinct but intermingled cells throughout the central nervous system. Interestingly, we found that a large proportion of these planarian cephalic ganglia ChAT+ neurons expressed at least one of these two transcriptional regulators. RNAi of pax6A has not been reported to cause a defect in neural differentiation, but the markers produced by single-cell sequencing of these neurons will provide a new opportunity for evaluating the requirement of pax6A in planarian neural differentiation.
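The statement that most ChAT+ neurons expressed at least one of the two regulators, while few expressed both, corresponds to a four-way tally of marker-positive cells. The following is a minimal sketch of such a tally, not the method used here; the expression values, the detection threshold, and the function names are hypothetical.

```python
def coexpression_tally(cells, marker, gene_a, gene_b, threshold=0.0):
    """Among marker-positive cells, count cells by detection of gene_a and gene_b."""
    tally = {"both": 0, "a_only": 0, "b_only": 0, "neither": 0}
    for expr in cells:
        if expr.get(marker, 0.0) <= threshold:
            continue  # skip cells without detectable marker expression
        has_a = expr.get(gene_a, 0.0) > threshold
        has_b = expr.get(gene_b, 0.0) > threshold
        if has_a and has_b:
            tally["both"] += 1
        elif has_a:
            tally["a_only"] += 1
        elif has_b:
            tally["b_only"] += 1
        else:
            tally["neither"] += 1
    return tally

def fraction_with_either(tally):
    """Fraction of tallied cells expressing at least one of the two genes."""
    total = sum(tally.values())
    return (total - tally["neither"]) / total if total else 0.0

# Hypothetical toy cells (normalized expression values).
cells = [
    {"ChAT": 4.0, "pax6A": 2.0},
    {"ChAT": 3.5, "skil": 1.5},
    {"ChAT": 2.0, "pax6A": 1.0, "skil": 0.5},
    {"ChAT": 5.0},
    {"Th": 3.0, "skil": 1.0},  # not ChAT+, so ignored by the tally
]
tally = coexpression_tally(cells, "ChAT", "pax6A", "skil")
print(tally, fraction_with_either(tally))
```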

Figure 3-5: Analysis of cholinergic neurons in sorted neoblasts and neurons from planarian heads. (A) tSNE plots showing principal component analysis from RNA libraries of isolated neurons and neoblasts from planarian heads. Each dot is a cell, and cells grouped together have similar expression of genes. The magenta hue of a cell depicts individual expression of ChAT and two transcription factors enriched in the ChAT cluster: pax6A and skil. (B) Single FISH using the RNA probes skil and pax6A (magenta), and triple FISH using the RNA probes pax6A (magenta), skil (green), and ChAT (blue) in intact heads (scale bars, 100 µm). DAPI labels nuclear DNA (gray). Images shown are maximal intensity projections. Anterior, up. (C) Triple FISH using the RNA probes pax6A (magenta), skil (green), and ChAT (blue) in intact heads (scale bars, 100 µm). DAPI labels nuclear DNA (gray). Images shown are a single 2 µm confocal slice. Anterior, up.

3.4.4 Dopaminergic neurons express a gene-regulatory network involving ETS-family transcription factors

Dopaminergic neurons are involved in a wide array of animal behaviors, including feeding, mood regulation, and movement. The degeneration of dopaminergic neurons causes both Parkinson's disease and Lewy body dementia, and is associated with severe mortality and morbidity [121]. Thus, understanding an example of dopaminergic regeneration is of specific interest to the scientific community. The gene tyrosine hydroxylase (Th), required for dopamine synthesis, was expressed in a small cell cluster in the single-cell sequencing data (Figure 3-6A). In vivo, Th was expressed in two large domains: the cephalic ganglia and the periphery of the animal, including robust expression at the head rim (Figure 3-6A) [109][110].

Gene enrichment analysis of the dopaminergic cells identified a set of genes encoding transcription factors with enriched expression in these cells: Fli-like (fli1) [48] and elf-1-like/FLI-1 (elf1) [48, 102] (Figure 3-6A, Figure 3-7). An ETS family transcription factor is required for dopaminergic cell identity in C. elegans [122], where it binds a cis-regulatory element of dopamine biosynthesis pathway genes and acts as a terminal fate selector. A similar role for ETS family transcription factors has not been found in mice, despite expression of several homologs in dopaminergic brain regions [123]. However, in D. melanogaster, ETS96B, an ETS transcription factor, is expressed in dopaminergic regions of the fly brain and is required for proper regulation of triglyceride metabolism and behavior [124]. ETS96B is a homolog of mouse Etv5, which is also expressed in murine dopaminergic brain regions, and whose human homolog has been associated with bipolar disorder and obesity [124]. Both fli1 and elf1 encode homologs of this ETS transcription factor family. The diverse functions of ETS transcription factors in terminal fate identity or feeding behavior in other organisms make the study of these genes in lophotrochozoans very compelling.

fli1 and elf1 were co-expressed in Th+ cells in the head rim of the animal in vivo (Figure 3-6B), whereas elf1 was also expressed in a ChAT+ population in the ventral ganglion (Figure 3-6B). Neither fli1 nor elf1 was expressed in the ganglion arc of dopaminergic cells (marked by Th expression), raising the possibility that there is an independent gene regulatory network associated with a second type of dopaminergic cell in planarians. However, gene enrichment analysis identified a number of other genes expressed in the same head-rim region, and even in the ganglion arc (Figure 3-7), such as a noradrenergic receptor gene, which suggests that some of these neurons are responsive to a catecholamine stimulus. Interestingly, elf1 RNAi animals have been reported to be deficient in the ability to find food sources [102], which is consistent with a possible role in a sensory neuron and is interesting given the role of ETS transcription factors in dopaminergic neurons in flies and humans.

Figure 3-6: Analysis of dopaminergic neurons in sorted neoblasts and neurons from planarian heads. (A) tSNE plots of the results of principal component analysis from RNA libraries of isolated neurons and neoblasts from planarian heads. Each dot is a cell, and cells grouped together have similar expression of genes. The magenta hue of a cell depicts individual expression of Th and two transcription factors enriched in the Th+ cluster: fli1 and elf1. (B) Right top: Triple FISH showing co-expression of fli1 (magenta), elf1 (green), and ChAT (blue) in dorsal intact heads (scale bars, 100 µm). Right bottom: Triple FISH showing co-expression of elf1 (magenta), fli1 (green), and Th (blue) in dorsal intact heads (scale bars, 100 µm). Left: ventral intact heads of the same FISH. Far left: single-nuclei co-expression of genes in intact heads, denoted by white boxes (scale bars, 10 µm). DAPI labels nuclear DNA (gray). Images shown are of the same single 2 µm confocal slice. Anterior, up.

Figure 3-7: FISH-detected expression of genes enriched in isolated dopaminergic neurons, in intact animal heads. Each gene is in magenta. Images are maximum intensity projections (scale bars, 100 µm). *Gene named by best BLAST hit to the Human genome.

3.4.5 Afferent neurons express a gene regulatory network including prox-1, fli2, six3-1, and su(H)

Afferent neurons sense the environment and are involved in the detection of environmental stimuli through contact with the surface of the organism and expression of irritant receptors. Transient receptor potential channel (TRP) proteins are canonically expressed on such sensory neurons, where their physical activation by a wide variety of environmental stimuli (pH, light, noxious molecules, heat, cold) induces a conformational change allowing ions into the cell and triggering a neural depolarization [125]. We recovered a small but distinct group of cells that expressed a large number of transient receptor potential channels; gene enrichment analysis identified more than eight TRP channel genes expressed in the cluster (Figure 3-8A, Figure 3-9A), as well as several genes encoding proteins with ankyrin domains (Figure 3-10). Several of the genes with enriched expression in this cell cluster were expressed throughout the animal in vivo; however, this cell cluster was unique in that it co-expressed the transcription factor-encoding genes prox-1, su(H), six3-1, and fli-2 (Figure 3-8B, Figure 3-9B). prox-1, su(H), and fli-2 were all co-expressed with trpA in the planarian head rim (Figure 3-8C). su(H) and six3-1 were also co-expressed with prox-1 in the planarian head rim (Figure 3-8D). We therefore interpreted the isolated afferent neurons from single-cell sequencing to be located around the head rim. Despite this expression being reminiscent of the dopaminergic neurons we also isolated (Figure

Figure 3-8: Analysis of afferent neurons in sorted neoblasts and neurons from planarian heads. tSNE plots of the results of principal component analysis from RNA libraries of isolated neurons and neoblasts from planarian heads. Each dot is a cell, and cells grouped together have similar expression of genes. The magenta hue of a cell depicts individual expression of (A) trpA and trpM. Black ellipses denote the cluster with substantial expression of both genes. (B) Expression of three transcription factors enriched in the trpA+ cluster: prox-1, su(H), and six3-1. Black ellipses denote clusters with substantial expression of these transcription factors. (C) Double FISH of prox-1, su(H), and fli2 (magenta) and trpA (blue) in dorsal intact heads (scale bars, 100 µm). Right panel: single-nuclei co-expression of genes in intact heads, denoted by white boxes (scale bars, 10 µm). Images shown are of the same single 2 µm confocal slice. (D) Double FISH of su(H) (magenta) and prox-1 (green) in intact heads (scale bars, 100 µm). Right panel: single-nuclei co-expression of genes in intact heads, denoted by white boxes (scale bars, 10 µm). Images shown are of the same single 2 µm confocal slice. Anterior, up. (E) Double FISH of the afferent neuron transcription factor prox-1 (magenta) and the dopaminergic transcription factor fli1 (green) in intact heads (scale bars, 100 µm). Right panel: single-nuclei expression of genes in intact heads, denoted by white boxes (scale bars, 10 µm). Image shown is of the same single 2 µm confocal slice. All FISH images are maximum intensity projections unless otherwise noted. DAPI labels nuclear DNA (gray). All images are anterior, up.

Figure 3-9: Expression of genes enriched in isolated afferent neurons. tSNE plots of the results of principal component analysis from RNA libraries of isolated neurons and neoblasts from planarian heads. Each dot is a cell, and cells grouped together have similar expression of genes. The magenta hue of a cell depicts individual expression of (A) genes with homology to trpA enriched in isolated afferent neurons, (B) the isolated afferent neuron-enriched transcription factor fli2, and (C) the isolated afferent neuron-enriched transcription factor notch-2. *Gene named by best BLAST hit to the Human genome.

Figure 3-10: Expression of genes enriched in isolated afferent neurons. FISH-detected expression of genes enriched in isolated afferent neurons, in intact animal heads. Each gene is in magenta. Images are maximum intensity projections (scale bars, 100 µm). *Gene named by best BLAST hit to the Human genome.

3.4.6 Single-cell sequencing reveals planarian pigment cells and glial cells express many similar genes

Recent work has established that planarians possess a population of glial cells that encircle nerves in the neuropil and ventral nerve cords [108][102]. These CNS glia express many genes associated with neurotransmitter uptake and neural environment regulation, such as glutamine synthetase (gs), GABA, creatine and taurine transporter (gat), and two glutamate transporter-encoding genes, eaat2-1 and eaat2-2, as well as a gene encoding a glucose transporter ortholog, glut. Hh signaling has also been implicated in regulation of these CNS glial cells, repressing the expression of a number of target genes in cells outside of the neuropil [108]. Despite extensive expression screening, no transcription factor or regulatory protein has been discovered to date in planarian glia. We were therefore interested in recovering the transcriptome of CNS and non-CNS glia. We recovered a small number of cells that expressed markers of planarian glia, enriched in two separate clusters of cells as demonstrated by eaat2-2 (Figure 3-11A, Figure 3-12A-B). One cluster is enriched in genes regulated by Hedgehog signaling, and contains all of the markers found for planarian glia, as indicated by the expression of IF-1. Gene enrichment analysis of this population revealed further expression of a number of genes encoding lysosome biogenesis proteins and scavenging proteins. Interestingly, we observed expression of a number of transcription factors, including several members of the Forkhead transcription factor family, and robust expression of delta-5, suggesting a role for Notch signaling in planarian glia similar to that in other organisms (Figure 3-11E, Figure 3-12C).

The second, larger population of cells showing enriched expression of glial markers also expressed a number of the genes encoding lysosome biogenesis proteins, transporters, and scavenging proteins. However, these cells also robustly express a set of genes required for the biology of planarian pigment cells, including ALAD-1, ALAS, and PBGD-1 [126] (Figure 3-11A, Figure 3-12A). To understand this further, we performed in situ hybridizations with glial and pigment cell markers and looked at the sub-epidermal region of planarians, where pigment cells reside. We observed that the glial markers glut and gs were co-expressed in the periphery of the animal (Figure 3-11C), and the glial marker eaat2-2 was indeed expressed in peripheral cells also expressing the pigment synthesis gene PBGD-1. In addition, we observed that PBGD-1 expression was co-localized with glut expression within cells in the same location. Furthermore, there is substantial overlap between the genes enriched in the glial and pigment cell populations, as determined by gene enrichment analysis of each population (Figure 3-11B). These data raise the intriguing possibility that planarian pigment cells might play a similar role in amino acid and neurotransmitter scavenging as do glia in the periphery of the animal, or perhaps be a specialized form of planarian glial cell.
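The degree of overlap between two enriched gene lists can be summarized numerically. The sketch below is an illustration rather than the analysis behind Figure 3-11B: it computes a Jaccard index and a one-sided hypergeometric probability of seeing at least the observed overlap by chance. The gene identifiers and the background size are hypothetical placeholders.

```python
from math import comb

def overlap_stats(genes_a, genes_b, n_background):
    """Return (shared genes, Jaccard index, hypergeometric P(overlap >= observed))."""
    a, b = set(genes_a), set(genes_b)
    shared = a & b
    union = a | b
    jaccard = len(shared) / len(union) if union else 0.0
    # Upper tail: draw len(b) genes from n_background, of which len(a) are "marked".
    p_value = sum(
        comb(len(a), i) * comb(n_background - len(a), len(b) - i)
        for i in range(len(shared), min(len(a), len(b)) + 1)
    ) / comb(n_background, len(b))
    return shared, jaccard, p_value

# Hypothetical enriched gene lists for two clusters and a hypothetical background size.
glia_enriched = {"gene_a", "gene_b", "gene_c"}
pigment_enriched = {"gene_b", "gene_c", "gene_d"}
shared, jaccard, p_value = overlap_stats(glia_enriched, pigment_enriched, n_background=10)
print(sorted(shared), jaccard, round(p_value, 4))
```

The background size should reflect the number of genes that could have been called enriched in either population; the value used here is arbitrary.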

Figure 3-11: Analysis of glial and pigment cells in sorted neoblasts and cells from planarian heads. (A) tSNE plots of the results of principal component analysis from RNA libraries of isolated neurons and neoblasts from planarian heads. Each dot is a cell, and cells grouped together have similar expression of genes. The magenta hue of a cell depicts individual expression of the broad glial marker eaat2-2, the central nervous system glial marker IF-1, and the pigment cell marker PBGD-1. (B) Hierarchical clustering of genes enriched in either isolated glial cells or isolated pigment cells. Low expression (grey), high expression (red). Rows of the heatmap are cells, columns are enriched genes. A substantial number of enriched genes are shared between these isolated cells (black brackets). (C) Double FISH of the glia markers glut (magenta) and gs (green), and the glia marker eaat2-2 (magenta) and the pigment cell marker PBGD-1 (green) in ventral intact tails (scale bars, 100 µm). DAPI was used to label nuclear DNA (gray). Images shown are of the same single 2 µm confocal slice. White arrows denote nuclei that co-express these genes. Anterior, up. (D) Double FISH of the pigment cell marker PBGD (magenta) and the glial cell marker glut (green) in intact ventral tails (scale bars, 10 µm). DAPI was used to label nuclear DNA (gray). Images are a series of single 2 µm confocal slices, ordered ventral to dorsal. Anterior, up. White arrows denote nuclei that co-express PBGD-1 and glut.

Figure 3-12: Expression of glial and pigment genes in isolated head cells. (A) tSNE plots of the results of principal component analysis from RNA libraries of isolated neurons and neoblasts from planarian heads. Each dot is a cell, and cells grouped together have similar expression of genes. The magenta hue of a cell depicts individual expression of the pigment cell markers ALAD and ALAS as well as the CNS glial marker Estrella, in isolated pigment cells and glia. (B) tSNE plot as in (A). The magenta hue of a cell depicts individual expression of the broad glial marker eaat2-1 in isolated pigment cells and glia. (C) tSNE plots as in (A). The magenta hue of a cell depicts individual expression of transcription factors enriched in glial or pigment cells. *Gene named by best BLAST hit to the Human genome.

3.4.7 Single-cell sequencing of >2C isolated neoblasts reveals expression of neural markers

Expression of nervous system transcription factors has been reported in neoblasts previously [48][103][68][66, 67, 101][71], and interpreted to identify candidate progenitors for each respective neural cell type. There is much evidence that many G2/M neoblasts will produce at least one daughter cell of differentiated fate [99, 127]. The current model of neoblast dynamics posits that there are three heterogeneous classes of X1 cells: zeta neoblasts, which produce all epidermis and are already poised for specific epidermal cell fates; gamma neoblasts, which are associated with the intestinal lineage; and sigma neoblasts, which are all non-zeta and non-gamma cells. By virtue of being defined as all non-gamma and non-zeta neoblasts, sigma neoblasts must contain the specialized neoblasts of all other cell types, including the muscle, nephridia, pharynx, and nervous system. One other group has posited the existence of Nu neoblasts, and describes them as neural progenitors that generically express many genes of neural fate [103].

We sought to better understand the heterogeneity of neural specialized neoblasts by analyzing the expression of the transcription factors described above in the single-cell sequencing data. We separately re-clustered and analyzed the 374 X1 neoblasts, which were isolated by FACS as having >2C DNA content, and therefore were in G2 or M phase of the cell cycle. We recovered clear populations of zeta and gamma neoblasts, as well as distinct populations of neoblasts expressing parapharyngeal genes and muscle genes that we propose are parapharyngeal and muscle specialized neoblasts (Figure 3-13A, Figure 3-14). The remaining bulk of cells not partitioned to these clusters had two distinct signatures: they expressed high levels of canonical sigma markers (which were also expressed in muscle and parapharyngeal neoblasts) and they expressed neural markers (Figure 3-13B). Importantly, we saw near-ubiquitous expression of smedwi-1, a canonical neoblast marker [59], throughout all of these G2/M cells (Figure 3-14), arguing against any potential contamination of the FACS gate with further differentiated post-mitotic progenitor cells of any lineage. This expression also separates our findings from those described for the Nu neoblasts, which expressed markedly lower levels of multiple neoblast markers. The collective expression of neural-associated transcription factor-encoding genes overlaps with the expression of differentiated markers like synapsin, synaptotagmin homologs, and PC2 (Figure 3-13B, Figure 3-14). Gene enrichment analysis of these X1 cells returned other markers of differentiated neural fate. However, we did not find evidence of any transcriptional regulators that broadly distinguish these neural specialized neoblasts from non-neural sigma neoblasts, which argues against a model where neural neoblasts are specified as a common population of cells, as is the case for the epidermal progenitors, the zeta neoblasts.
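Assigning neoblasts to classes by the collective expression of a marker set can be sketched as a per-cell module score. The example below is a simplified illustration and not the clustering procedure used here. The marker sets are small placeholders built from genes named in the text (the full lists in Table 3.1 are not reproduced), and the expression values, scoring rule, and cutoff are hypothetical.

```python
def module_scores(cell, modules):
    """Mean expression of each marker set (module) in one cell."""
    return {name: sum(cell.get(gene, 0.0) for gene in genes) / len(genes)
            for name, genes in modules.items()}

def assign_class(cell, modules, min_score=0.5):
    """Label a cell by its highest-scoring module, or 'unassigned' below the cutoff."""
    scores = module_scores(cell, modules)
    best = max(scores, key=scores.get)
    return best if scores[best] >= min_score else "unassigned"

# Placeholder marker sets; real analyses would use the full marker lists.
modules = {
    "zeta": ["soxP3", "p53"],
    "gamma": ["hnf4"],
    "neural": ["synapsin", "PC2"],
}
# Hypothetical neoblasts, all expressing the pan-neoblast marker smedwi-1.
cells = {
    "nb1": {"smedwi-1": 8.0, "soxP3": 3.0, "p53": 2.0},
    "nb2": {"smedwi-1": 7.5, "hnf4": 4.0},
    "nb3": {"smedwi-1": 9.0, "synapsin": 1.0, "PC2": 2.0},
    "nb4": {"smedwi-1": 8.2},
}
labels = {name: assign_class(expr, modules) for name, expr in cells.items()}
print(labels)
```

A winner-take-all label is a deliberate simplification; it would hide cells that score for more than one module, which matters when asking whether neural neoblasts form a single common population.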

Figure 3-13: Analysis of sorted neoblasts from planarian heads. (A) tSNE plots of the results of principal component analysis from RNA libraries of isolated neoblasts from planarian heads. Each dot is a cell, and cells grouped together have similar expression of genes. The magenta hue of a cell depicts individual expression of: Epidermal (Zeta) Neoblast Markers, Gamma Neoblast Markers, and Sigma Neoblast Markers, all from [107] as shown in Table 3.1. (B) tSNE plots as in (A). Right: the magenta hue of a cell depicts individual expression of the neural gene synapsin. Middle: the magenta hue of a cell depicts individual expression of the neural transcription factors soxB-2, soxb2-2, pax6b, pax3/7, otxB, nuclear receptor 1, nkx6, nkx2, /4, lhx2/9-like, glass, elf-like, soxB, foxQ2, ap2, sp6-9, otxA, neuroD-2, hesl-1, hesl-2, hesl-3, coe, sim, pitx, and lhx5-1. (C) Cell FISH from sorted neoblasts of whole animals showing the neural gene synapsin (magenta) and a pool of neural transcription factors (green) expressed in the same isolated cells (scale bars, 10 µm). DAPI labels nuclear DNA (gray). Neural transcription factor pool: soxB-2, pax6b, pax3/7, otxB, nkx6, nkx2.

Figure 3-14: tSNE plots of the results of principal component analysis from RNA libraries of isolated neoblasts from planarian heads. Each dot is a cell, and cells grouped together have similar expression of genes. The magenta hue of a cell depicts individual expression of: the pan-neoblast marker smedwi-1, a selection of neural markers as shown in Table 3.1, and Muscle and Parapharyngeal Markers from [107], as shown in Table 3.1.

3.4.8 Numerous transcription factors expressed in distinct cells of the nervous system are detected in distinct neoblasts

We examined the expression of transcription factors associated with planarian neural lineages that we and others have described. We observed enrichment of transcripts associated with distinct neural lineages in small populations of isolated neoblasts, both in the single-cell sequencing data and with in situ hybridization of dissociated neoblasts. pax6A and skil, which were expressed in largely separate populations of ChAT+ neurons, were expressed together in a few neoblasts, but were largely expressed in separate cells (Figure 3-15A), as was observed in the single-cell sequencing data and in the smedwi-1+ neoblasts of intact animals.

Figure 3-15: Analysis of neural transcription factor expression in sorted neoblasts and neoblasts in planarian heads. (A) Far left: graphical diagram of cholinergic, dopaminergic, and afferent neurons in the whole body. Left column: cell FISH of isolated neoblasts from whole worms expressing skil (magenta) and pax6A (green), elf1 (magenta) and fli1 (green), and prox-1 (magenta) and su(H) (green). DAPI labels nuclear DNA (gray). All images represent >300 isolated neoblasts (scale bars, 10 µm). Right column: triple FISH of skil (magenta) and pax6A (green), elf1 (magenta) and fli1 (green), and prox-1 (magenta) and su(H) (green), each with the neoblast marker smedwi-1 (blue), showing single-nuclei co-expression of genes in intact heads, denoted by white arrows (scale bars, 10 µm). DAPI was used to label nuclear DNA (gray). Image shown is of a single 2 µm confocal slice. (B) Cell FISH of isolated neoblasts from whole worms expressing fli1 (magenta) and pax6A (green), fli1 (magenta) and prox-1 (green), and su(H) (magenta) and pax6A (green). DAPI was used to label nuclear DNA (gray). All images represent >300 isolated neoblasts (scale bars, 10 µm).

su(H) and prox-1, which were expressed in afferent neurons, were also co-expressed in isolated neoblasts and in smedwi-1+ neoblasts of intact animals. fli1 and elf1, which were expressed in peripheral dopaminergic neurons, were broadly co-expressed in isolated neoblasts. In intact animals, fli1 was more abundantly detected in smedwi-1+ neoblasts, but rare fli1+, elf1+, smedwi-1+ neoblasts were also detected. Furthermore, we observed similar results, albeit with more limited detection power, in the single-cell sequencing data from neoblasts (Figure 3-16A).

Figure 3-16: Analysis of neural transcription factor expression in sorted neoblasts and neoblasts in planarian heads. (A) tSNE plots of the results of principal component analysis from RNA libraries of isolated neoblasts from planarian heads. Each dot is a cell, and cells grouped together have similar expression of genes. The magenta hue of a cell depicts individual expression of the cholinergic lineage transcription factors ski-1 and pax6A. Black ellipses denote clusters with substantial expression of these transcription factors. (B) tSNE plots as in (A). The magenta hue of a cell depicts individual expression of the dopaminergic lineage transcription factors fli1 and elf1, and summed expression of fli1 and elf1 in cells where both were detected (denoted as co-expression). Black ellipses denote clusters with substantial expression of these transcription factors. (C) tSNE plots as in (A). The magenta hue of a cell depicts individual expression of the afferent neuron transcription factors prox-1 and su(H), and summed expression of prox-1 and su(H) in cells where both were detected (denoted as co-expression). Black ellipses denote clusters with substantial expression of these transcription factors. (D) Cell FISH of isolated neoblasts from whole worms expressing: the dopaminergic transcription factor fli1 (magenta) and the zeta neoblast markers soxP3 (green), the cholinergic transcription factor pax6A (magenta) and the zeta neoblast markers soxP3 p53 (green), the afferent neuron transcription factor su(H) (magenta) and the zeta neoblast markers soxP3 p53 (green), and the cholinergic transcription factor pax6A (magenta) and the gamma neoblast marker hnf4 (green). DAPI was used to label nuclear DNA (gray). All images represent >300 isolated neoblasts (scale bars, 10 µm).

Conversely, genes encoding transcription factors that were expressed in different mature neural cell types were expressed in different neoblasts. pax6A and fli1 were not detectably co-expressed in isolated neoblasts (Figure 3-15B). Likewise, neither fli1 and prox-1 nor su(H) and pax6A were ever observed to be expressed in the same neoblasts (Figure 3-15B). Similar exclusion of co-expression of these transcription factor gene combinations was observed in the single-cell sequencing data from neoblasts (Figure 3-16A-C). Furthermore, we observed that expression of transcription factor genes from the dopaminergic, cholinergic, and afferent populations was also excluded from isolated gamma and zeta neoblasts, which express hnf4 and soxP3/p53, respectively (Figure 3-16D).
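The co-expression comparison described above can be illustrated with a small tally of cells in which two genes are both detected. This is only a minimal sketch: the function name `count_coexpressing`, the detection threshold, and the toy expression values are hypothetical and are not taken from the thesis data.

```r
# Count cells in which both genes are detected above a threshold.
# 'expr' is a genes x cells matrix; the threshold of 0 is an assumption.
count_coexpressing <- function(expr, gene_a, gene_b, threshold = 0) {
  sum(expr[gene_a, ] > threshold & expr[gene_b, ] > threshold)
}

# Toy matrix with made-up values, for illustration only.
expr <- rbind(
  "prox-1" = c(3, 0, 2, 0, 0),
  "su(H)"  = c(1, 0, 4, 0, 0),
  "pax6A"  = c(0, 5, 0, 2, 0)
)
colnames(expr) <- paste0("cell_", 1:5)

# Genes of the same neural class versus genes of different classes.
same_lineage <- count_coexpressing(expr, "prox-1", "su(H)")
diff_lineage <- count_coexpressing(expr, "su(H)", "pax6A")
```

In this toy example the two afferent-neuron genes share cells while su(H) and pax6A do not, which is the pattern the paragraph above reports for the real data.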


Figure 3-17: Diagram of specialized neoblasts for dopaminergic, cholinergic, and afferent neurons. The specialized neoblasts express the smedwi-1 marker (grey). Transcription factors expressed in neural sub-types are expressed in neoblasts with the marker smedwi-1. As specialized neoblasts further differentiate, they lose expression of smedwi-1 but retain expression of sub-type transcription factors.

These data are consistent with the existing model of fates of progenitors for different tissues being specified in different neoblasts, and suggest a similar principle might govern sub-type specification in the planarian nervous system (Figure 3-17).

3.5 Discussion

3.5.1 Characterization of neuronal cell types

Using single-cell sequencing and in vivo validation experiments, we described here the transcriptomes of three new classes of specialized neoblasts. We found evidence that dopaminergic neurons are marked by expression of two genes encoding ETS family transcription factors, pointing to a conserved role of ETS transcription factor-mediated regulation of dopaminergic neural biology. We observed a population of putative sensory neurons expressing a plethora of transient receptor potential channels, and marked by expression of genes encoding a Prospero-family transcription factor and homologs of Su(H) and Notch. These findings suggest a role for Notch signaling in planarian sensory neuron biology. We also described the transcriptomes of cholinergic neurons, identifying expression of either pax6A or skil in most cholinergic cells of the animal. Finally, we present the transcriptomes of glial cells and pigment cells, and find surprising overlap in the expression of genes predicted to be part of the differentiated functions of both cells.

3.5.2 Evidence for neural neoblast specialization

Through single-cell sequencing of the planarian nervous system and neoblasts, we described a sub-population of sigma neoblasts that heterogeneously express an abundance of genes encoding neural transcription factors and markers. We found that these candidate neural-specified neoblasts expressed combinations of genes associated with specific neural identity, rather than mixed or hybrid states. These cells were isolated by FACS to be in a definitive G2/M state, and their neoblast state was controlled for by assessment of smedwi-1 levels. These data are consistent with a model wherein neural specialized neoblasts are a heterogeneous mixture of cells, many of which have fates specified towards specific neural lineages.

3.5.3 Lack of evidence for pan-neural progenitor

Although we did see evidence of fate specification for neural lineages in the neoblast compartment, differential expression analysis of those neural neoblasts revealed enrichment only for broad sigma markers. Though it remains possible that one or more genes are required for specifying a progenitor state that will generate many or all of the different classes of neurons, no such gene or gene set emerged as a good candidate from these single-cell sequencing data. These data therefore raise interesting questions about how neoblasts choose a neural state. It is possible that neoblasts become specified independently to adopt the fates of different neuron classes without transiting through or requiring a common precursor state. It remains possible that our data do not sample neoblasts deeply enough to detect a population of pan-neural neoblasts; however, our data readily identified epidermal and intestinal progenitors, as well as candidate progenitors of specific neural classes. These findings set the stage for exploring possible differentiation from a pluripotent state into diverse specialized neural precursors in a rapid process for regeneration that does not require prolonged embryonic-like lineages with gradual fate restriction.

3.6 Materials and methods

3.6.1 Animal treatment

Asexual Schmidtea mediterranea strain (CIW4) animals starved for more than 7 days prior to experiments were used. Animals were exposed to a 6,000 rad dose of radiation using a dual Gammacell-40 cesium-137 source, and cells were isolated five days after irradiation.

3.6.2 Single-cell mRNA amplification

Amplified and screened cDNA from individual cells was sequenced using a modified Nextera 2 library kit on a HiSeq 2000 as described previously. Cells with greater than 25 percent of reads mapping to the human, mouse, rat, yeast or E. coli genomes were discarded. Reads per cell were cleaned of adapter sequences using CutAdapt and mapped to a S. mediterranea transcriptome [104] using Bowtie 2. Read counts mapping to all isotigs of each transcriptome contig were collapsed. Cells with more than 4,000-10,000 (the upper cutoff depended on the sequencing batch) or fewer than 800 uniquely mapped contigs were discarded. Analysis of single-cell libraries was then performed with the R-language package Seurat v1.0 [105]. Briefly, genes with high dispersion were used to perform Principal Component Analysis (PCA). Descriptive PCs were selected based on their collective ability to describe >2 standard deviations of variance in the data. See attached Computational Appendix 3.8 and 3.9 for more details.

3.6.3 Single Cell Sequencing

Amplified and screened cDNA from individual cells was sequenced using a modified Nextera 2 library kit on a HiSeq 2000 as described previously [107]. Cells with greater than 25% of reads mapping to the human, mouse, rat, yeast or E. coli genomes were removed. Reads were cleaned of adapter sequences using CutAdapt and mapped to a S. mediterranea transcriptome [104] using Bowtie 2, with the isotigs of each transcriptome contig collapsed. Cells with more than 4,000-10,000 (the upper cutoff depended on the sequencing batch) or fewer than 800 uniquely mapped contigs were discarded. Analysis of single-cell libraries was then performed with the R-language package Seurat v1.0 [105]. Briefly, genes with high dispersion were used to perform Principal Component Analysis (PCA). Descriptive PCs were selected based on their collective ability to describe greater than 2 standard deviations of variance in the data set. See Computational Appendix 3.8 and 3.9 for details.
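The contig-count filter described above can be sketched in base R as follows. This is an illustrative sketch only: the function name `filter_cells` and the toy matrix are hypothetical, and the cutoffs used in the example are scaled down from the 800 and 4,000-10,000 values quoted in the text. The rule of counting a contig as detected when it has more than one read mirrors the `colSums(x > 1)` expressions in Computational Appendix 3.8.

```r
# Keep cells whose number of detected contigs lies strictly between two cutoffs.
# A contig counts as detected when it has more than 'min_reads' reads.
filter_cells <- function(counts, min_detected, max_detected, min_reads = 1) {
  detected <- colSums(counts > min_reads)
  keep <- detected > min_detected & detected < max_detected
  counts[, keep, drop = FALSE]
}

# Toy matrix (10 contigs x 3 cells) with made-up read counts.
counts <- cbind(
  low  = c(rep(5, 2), rep(0, 8)),   # 2 detected contigs: too few
  ok   = c(rep(5, 5), rep(0, 5)),   # 5 detected contigs: retained
  high = rep(5, 10)                 # 10 detected contigs: too many
)
rownames(counts) <- paste0("contig_", 1:10)

# Hypothetical cutoffs chosen to suit the toy matrix.
filtered <- filter_cells(counts, min_detected = 3, max_detected = 8)
```

Applying the same function once per batch, each time with its own upper cutoff, would reproduce the batch-dependent filtering described in the text.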

3.6.4 Computational Analysis of clusters

Default parameters for Seurat 1.4 and Seurat 2.0 [105] were used for all gene enrichment analysis via bimodal and ROC testing. See Computational Appendix 3.8 and 3.9 for more details.
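To make the idea behind ROC-based enrichment concrete, the sketch below computes an area under the ROC curve (AUC) for one gene, comparing cells inside a cluster against all other cells. It is a rank-based illustration under stated assumptions, not Seurat's internal implementation; the function name `auc_marker` and the expression values are hypothetical.

```r
# AUC for one gene: the probability that a randomly chosen cell inside the
# cluster has higher expression than a randomly chosen cell outside it,
# with ties counted as one half (rank-sum formulation).
auc_marker <- function(expr_in, expr_out) {
  n_in <- length(expr_in)
  n_out <- length(expr_out)
  r <- rank(c(expr_in, expr_out))  # average ranks handle ties
  (sum(r[seq_len(n_in)]) - n_in * (n_in + 1) / 2) / (n_in * n_out)
}

# Made-up expression values for illustration only.
in_cluster <- c(5, 6, 7)
other_cells <- c(0, 1, 6)
auc <- auc_marker(in_cluster, other_cells)
```

An AUC near 1 indicates a gene that separates the cluster from the remaining cells well, while an AUC near 0.5 indicates no discriminating power.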

3.6.5 in situ hybridization

Genes of interest were cloned according to previous methods [107]. In situ hybridization of mRNA was detected using previously published methods. In some experiments, the pre-hybridization and hybridization buffers were modified to remove formamide by substitution with 4M urea and the addition of 0.01 percent SDS and 0.1 percent heparin.

3.6.6 Gene sequences

The sequences of the genes identified in this study are found in Table 3.1.

3.6.7 Acknowledgements

P.W.R. is an Investigator of the Howard Hughes Medical Institute and an associate member of the Broad Institute of Harvard and MIT. We acknowledge support from the NIH (R01GM080639) and an NSF graduate research fellowship to K.M.K. We thank Josien van Wolfswinkel and Omri Wurtzel for qPCR and single-cell sequencing advice, Omri Wurtzel for computational support, Lauren Cote for experimental assistance, and members of the Reddien lab for their comments on the manuscript.

3.7 Annotation Genes

Marker Type          Contig
Hh Sensitive Glia    dSev41901
Hh Sensitive Glia    dd_Smed_v4_12254_0_1
Hh Sensitive Glia    dd_Smed_v4_1792_0_1
Hh Sensitive Glia    dd_Smed_v4_9961_0_1
Glia                 dd_Smed_v4_1106_0_1
Glia                 dd_Smed_v4_3514_0_1
Glia                 dd_Smed_v4_646_0_1
Glia                 dd_Smed_v4_5447_0_1
Glia                 dd_Smed_v4_3620_0_1
Glia                 dd_Smed_v4_313015_0_1
Pigment              dd_Smed_v4_1226_0_1
Pigment              dd_Smed_v4_6364_0_1
Pigment              dd_Smed_v4_6316_0_1
Synapse Markers      dd_Smed_v4_10835_0_1
Synapse Markers      dd_Smed_v4_6859_0_1
Synapse Markers      dd_Smed_v4_5361_0_1
Synapse Markers      dd_Smed_v4_10835_0_1
Synapse Markers      dd_Smed_v4_2985_0_1
Synapse Markers      dd_Smed_v4_3977_0_1
Synapse Markers      dd_Smed_v4_3135_0_1
Synapse Markers      dd_Smed_v4_3135_0_1
Synapse Markers      dd_Smed_v4_7243_0_1
Synapse Markers      dd_Smed_v4_6730_0_1
Synapse Markers      dd_Smed_v4_4222_0_1
Synapse Markers      dd_Smed_v4_5266_0_1
Synapse Markers      dd_Smed_v4_7111_0_1
Synapse Markers      dd_Smed_v4_10375_0_1
Synapse Markers      dd_Smed_v4_11887_0_1
Synapse Markers      dd_Smed_v4_12647_0_1
Synapse Markers      dd_Smed_v4_12772_0_1
Synapse Markers      dd_Smed_v4_13079_0_1
Synapse Markers      dd_Smed_v4_13340_0_1
Synapse Markers      dd_Smed_v4_13680_0_1
Synapse Markers      dd_Smed_v4_13706_0_1
Synapse Markers      dd_Smed_v4_16195_0_1
Synapse Markers      dd_Smed_v4_16731_0_1
Synapse Markers      dd_Smed_v4_1798_0_1
Synapse Markers      dd_Smed_v4_18661_0_1
Synapse Markers      dd_Smed_v4_19213_0_1
Synapse Markers      dd_Smed_v4_19328_0_1
Synapse Markers      dd_Smed_v4_20033_0_1
Synapse Markers      dd_Smed_v4_20523_0_1
Synapse Markers      dd_Smed_v4_21069_0_1
Synapse Markers      dd_Smed_v4_22061_0_1
Synapse Markers      dd_Smed_v4_23389_0_1
Synapse Markers      dd_Smed_v4_25279_0_1
Synapse Markers      dd_Smed_v4_4222_0_1
Synapse Markers      dd_Smed_v4_4335_0_1
Synapse Markers      dd_Smed_v4_5370_0_1
Synapse Markers      dd_Smed_v4_5946_0_1
Synapse Markers      dd_Smed_v4_6730_0_1
Synapse Markers      dd_Smed_v4_6920_0_1
Synapse Markers      dd_Smed_v4_7243_0_1
Synapse Markers      dd_Smed_v4_8032_0_1
Synapse Markers      dd_Smed_v4_8321_0_1
Synapse Markers      dd_Smed_v4_8438_0_1
Synapse Markers      dd_Smed_v4_8909_0_1
Zeta neoblasts                             All from Table S2 (Wurtzel et al., 2015)
Sigma neoblasts                            All from Table S2 (Wurtzel et al., 2015)
Gamma neoblasts                            All from Table S2 (Wurtzel et al., 2015)
Protonephridia                             All from Table S2 (Wurtzel et al., 2015)
Parapharyngeal                             All from Table S2 (Wurtzel et al., 2015)
Neural (non-ciliated)                      All from Table S2 (Wurtzel et al., 2015)
Neural (both ciliated and non-ciliated)    All from Table S2 (Wurtzel et al., 2015)
Neoblast                                   All from Table S2 (Wurtzel et al., 2015)
Muscle                                     All from Table S2 (Wurtzel et al., 2015)
Gut                                        All from Table S2 (Wurtzel et al., 2015)
Epidermal lineage (early + late)           All from Table S2 (Wurtzel et al., 2015)

Table S2 reference: Wurtzel, O. et al. A Generic and Cell-Type-Specific Wound Response Precedes Regeneration in Planarians. Dev Cell 35, 632-645 (2015). Available at http://www.cell.com/cms/attachment/2055813165/2061173206/mmc3.xlsx

3.8 Computational Appendix 1

---
title: "Computational Appendix 1"
output:
  pdf_document:
    fig_caption: yes
    keep_tex: yes
  html_document: default
---

```{r generic, message=FALSE, warning=FALSE}
library(methods)
library(knitr)
library("Seurat", lib.loc = "/lab/solexareddien/Kellie/Rconfig/R/x86_64-pc-linux-gnu-library/3.4")
library(openxlsx)
library(svglite)
source("seuratextracode.R")
knitr::opts_chunk$set(fig.path = '/lab/solexareddien/Kellie/Dissertation/Ch3_figs/',
                      echo = T, warning = FALSE, message = FALSE,
                      dev = 'svglite', fig.ext = 'svg',
                      results = "hide", eval = F)
```

Load raw read file of cells pre-filtered for RNA contamination and <3000 total reads

```{r import non-X1 cells}
file1 = read.table("/lab/solexareddien/Kellie/X1_single_cell_seq/NewSequencing/161103Red/161103RedXinsrawReadCounts.txt")

# basic rawReadCount filtering: contigs with more than one read per cell
mRNAsMapped = colSums(file1 > 1)
hist(mRNAsMapped, breaks = 100, ylim = c(0, 35), xlim = c(0, 25000))

# cut cells under low reads and above high reads
file1 = file1[, colSums(file1 > 1) < 4000]
file1 = file1[, colSums(file1 > 1) > 800]
```

```{r import X1 batch 2}
file2 = read.table("/lab/solexareddien/Kellie/X1_single_cell_seq/NewSequencing/161103Red/161103Red_X1rawReadCounts.txt")

# basic rawReadCount filtering: contigs with more than one read per cell
mRNAsMapped = colSums(file2 > 1)
hist(mRNAsMapped, breaks = 100, ylim = c(0, 35), xlim = c(0, 25000))

# cut cells under low reads and above high reads
file2 = file2[, colSums(file2 > 1) < 5500]
file2 = file2[, colSums(file2 > 1) > 800]
```

```{r import X1 batch 1}
file3 = read.table("/lab/solexareddien/Kellie/X1_single_cell_seq/ddV4/OWdataincluded/ddV4.condensed.rawReadFile.txt",
                   header = T, row.names = "Contig")
file3$To = NULL

# basic rawReadCount filtering: contigs with more than one read per cell
mRNAsMapped = colSums(file3 > 1)
hist(mRNAsMapped, breaks = 100, ylim = c(0, 35), xlim = c(0, 25000))

# cut cells under low reads and above high reads
file3 = file3[, colSums(file3 > 1) < 10000]
file3 = file3[, colSums(file3 > 1) > 800]
```

```{r create allrawReadCounts}
# contigs excluded from all batches before merging
drops = c('dd_Smed_v4_0_0_1', 'dd_Smed_v4_217_0_1', 'dd_Smed_v4_521_0_1',
          'dd_Smed_v4_717_0_1', 'dd_Smed_v4_7_0_1', 'dd_Smed_v4_4_1_1', '*')

# merge the three batches on contig name
all_rawReadCounts = merge(file1[!(row.names(file1) %in% drops),],
                          file2[!(row.names(file2) %in% drops),], by = "row.names")
row.names(all_rawReadCounts) = all_rawReadCounts$Row.names
all_rawReadCounts$Row.names = NULL
all_rawReadCounts = merge(all_rawReadCounts,
                          file3[!(row.names(file3) %in% drops),], by = "row.names")
row.names(all_rawReadCounts) = all_rawReadCounts$Row.names
all_rawReadCounts$Row.names = NULL

# basic rawReadCount filtering:
mRNAsMapped = colSums(all_rawReadCounts > 1)
hist(mRNAsMapped, breaks = 100)
mean(mRNAsMapped)
```

Create and setup the Seurat object. Include all genes detected in >= 3 cells, and all cells with > 800 genes

```{r open cells Seurat object}
library(Seurat)
cells = new("seurat", raw.data = as.matrix(all_rawReadCounts))
cells = Setup(cells, project = "NewSequencing", min.cells = 3, min.genes = 800, is.expr = 0.00001, names.field = 1, names.delim = "

library(openxlsx)
libraryinfo = read.xlsx(xlsxFile = "/lab/solexareddien/Kellie/X1_single_cell_seq/NewSequencing/161103Red/LibraryQAandMappingStats.xlsx")
metadata = merge(cells@data.info, libraryinfo, by.x = "row.names", by.y = "CellName")
row.names(metadata) = metadata$Row.names
metadata = metadata[, c("ContamFlag", "IsolationType")]

# Add contamination flag and mRNA isolation batch
cells = AddMetaData(cells, metadata)
cells@data.info
levels(cells@data.info$orig.ident)
length(which(cells@data.info$orig.ident %in% c("plate01", "plate02", "X1Plate1", "X1Plate2", "X1Plate3")))
length(which(cells@data.info$orig.ident %in% c("OIBrainPlate1", "OIBrainPlate2", "OIBrainPlate3", "XinsPlate1", "XinsPlate2")))
```

Generate variable genes to perform analysis on from differentiated cells, which will have more signal and less transcriptional noise than neoblasts.

```{r create an Xins only object and extract VarGenes}
file1 = read.table("/lab/solexareddien/Kellie/X1_single_cell_seq/NewSequencing/161103Red/161103RedXinsrawReadCounts.txt")
# cut cells under low reads and above high reads
file1 = file1[, colSums(file1 > 1) < 5500]
file1 = file1[, colSums(file1 > 1) > 800] #193
drops = c('dd_Smed_v4_0_0_1', 'dd_Smed_v4_217_0_1', 'dd_Smed_v4_521_0_1',
          'dd_Smed_v4_717_0_1', 'dd_Smed_v4_7_0_1', 'dd_Smed_v4_4_1_1', '*')
Xins_rawReadCounts = file1[!(row.names(file1) %in% drops),]
library(Seurat)
Xins = new("seurat", raw.data = as.matrix(Xins_rawReadCounts))
Xins = Setup(Xins, project = "NewSequencing", min.cells = 3, min.genes = 800, is.expr = 0.00001, names.field = 1, names.delim = "
```

```{r calculate cellvar}
# grab the initial var genes of Xins
Xins = MeanVarPlot(Xins, y.cutoff = 1, do.plot = TRUE, x.low.cutoff = 0.001, x.high.cutoff = 15,
                   fxn.x = expMean, fxn.y = logVarDivMean, set.var.genes = TRUE)
cell_var = Xins@var.genes
```

Proceed with clustering of cells using the gene set all_vargenes

Run PCA on all genes and cells

```{r fig.width= 10, fig.height= 20}
cells = PCA(cells, do.print = FALSE, pc.genes = cell_var, pcs.store = 100)
cells = ProjectPCA(cells, do.print = FALSE, do.center = FALSE)
library(parallel)
cells = JackStrawMC(cells, num.replicate = 1000, num.pc = 100, prop.freq = 0.025, num.cores = detectCores())
PCElbowPlot(cells, num.pc = 100)
```

```{r explore PCs of cells}
library(openxlsx)
# output of Seurat 1.4 BatchGene() using default settings with cells from X1 Batch 1 and Batch 2
batch_genes = read.xlsx("/lab/solexareddien/Kellie/X1_single_cell_seq/SmedNeuralTypes/Tak3_Rstudio/allcellsanalysis/batchgenes.xlsx", colNames = FALSE)
batch_genes = as.character(batch_genes$X1)

# score each PC by how many of its top genes are batch-associated genes
PCs = list()
batchscore = list()
for (number in 1:100) {
  PCs[[number]] = as.data.frame(PCTopGenes(cells, pc.use = number))
  batchscore[[number]] = length(intersect(batch_genes, PCTopGenes(cells, pc.use = number)))
}
hist(unlist(batchscore))
# PCs sharing more than 10 top genes with the batch gene list
batch_pcs = which(batchscore > 10)
batch_pcs
```
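The batch-scoring step above can be tried without Seurat on a toy list of top genes per component. This is a minimal sketch, assuming each principal component is summarized by a character vector of its top genes; the helper name `flag_batch_pcs`, the gene names, and the overlap cutoff of 1 are hypothetical stand-ins for `PCTopGenes()` output and the cutoff of 10 used in the appendix.

```r
# Flag components whose top genes overlap the batch gene list too strongly.
flag_batch_pcs <- function(top_genes_per_pc, batch_genes, max_overlap) {
  overlap <- vapply(top_genes_per_pc,
                    function(g) length(intersect(g, batch_genes)),
                    integer(1))
  which(overlap > max_overlap)
}

# Toy inputs with made-up gene names.
top_genes <- list(
  PC1 = c("g1", "g2", "g3"),
  PC2 = c("g4", "g5", "b1"),
  PC3 = c("b1", "b2", "b3")
)
batch_genes <- c("b1", "b2", "b3", "b4")

batch_pcs <- flag_batch_pcs(top_genes, batch_genes, max_overlap = 1)

# Components kept for downstream tSNE and clustering,
# mirroring the appendix's setdiff(1:20, batch_pcs).
kept_pcs <- setdiff(seq_along(top_genes), batch_pcs)
```

Here only the third component exceeds the cutoff, so the first two are kept for the downstream steps.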

```{r run tSNE on cells}
# exclude batch-associated PCs from tSNE and clustering
cells = RunTSNE(cells, max_iter = 500, dims.use = setdiff(1:20, batch_pcs), perplexity = 11)
TSNEPlot(cells)
cells = FindClusters(cells, pc.use = setdiff(1:20, batch_pcs), resolution = 1.5,
                     print.output = T, save.SNN = T, k.param = 5, k.scale = 2)
TSNEPlot(cells, do.label = F)
```

```{r generate cluster markers for cells}
# cellsmarkers = FindAllMarkers(cells, test.use = "roc", return.thresh = 0.7, only.pos = T)
# cellsmarkersbimod = FindAllMarkers(cells, test.use = "bimod", return.thresh = 0.05, only.pos = T)
```

```{r save markers for cells}
# allmarkers = full_join(cellsmarkers, cellsmarkersbimod, by = c("cluster", "gene"), suffix = c(".roc", ".bimod"))
# df = allmarkers
# df = annotateTable(df, "/lab/solexareddien/Kellie/Transcriptomes/ddV4/hs.ddV4.txt", df.key = "gene")
# write.xlsx(df, "cellMarkers.xlsx")
allmarkers = read.xlsx("cellMarkers.xlsx")
```

```{r load tissue markers}
library(openxlsx)
markers = read.xlsx("/lab/solexareddien/Kellie/Transcriptomes/ddV4/Wurtzel2015_clusterIDs.xlsx")
listofclusters = unique(markers$Cluster)

# build a named list: one vector of marker contigs per cluster
marker_genes = list()
i = 1
for (each in listofclusters) {
  marker_genes[[each]] = markers$Contig[markers$Cluster %in% listofclusters[i]]
  i = i + 1
}
```
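The marker list built by the loop above can also be produced in a single call to base R's `split()`. This is a sketch on a made-up table: the column names `Cluster` and `Contig` follow the appendix, but the contig names here are hypothetical placeholders.

```r
# Toy marker table with hypothetical contig names.
markers <- data.frame(
  Cluster = c("Glia", "Glia", "Pigment"),
  Contig  = c("contig_a", "contig_b", "contig_c"),
  stringsAsFactors = FALSE
)

# One character vector of contigs per cluster, named by cluster.
marker_genes <- split(markers$Contig, markers$Cluster)

# Look up one cluster's markers by name, as the plotting calls do.
glia_markers <- marker_genes[["Glia"]]
```

One difference to keep in mind is ordering: `split()` returns clusters sorted alphabetically, whereas the loop keeps them in order of first appearance. Lookups by name behave the same either way.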

```{r tissue markers on cells}
library(dplyr)
library(tidyr)
TSNEPlot(cells, group.by = "orig.ident")
feature.plot.rpkm_summed(cells, marker_genes[["Epidermal lineage (early + late + mature)"]], features.plot.name = "Epidermal Lineage Markers")
feature.plot.rpkm(cells, "dd_Smed_v4_659_0_1")
feature.plot.scaledsummed(cells, marker_genes[["Zeta neoblasts"]], features.plot.name = "Epidermal (Zeta) Neoblasts")
feature.plot.scaledsummed(cells, marker_genes[["Gamma neoblasts"]], features.plot.name = "Gamma Neoblasts")
feature.plot.scaledsummed(cells, marker_genes[["Muscle"]], features.plot.name = "Muscle")
feature.plot.scaledsummed(cells, marker_genes[["Neural (both ciliated and non-ciliated)"]], features.plot.name = "All Neural Markers")
feature.plot.scaledsummed(cells, marker_genes[["Neural (ciliated)"]], features.plot.name = "Ciliated Neurons")
feature.plot.scaledsummed(cells, marker_genes[["Neural (non-ciliated)"]], features.plot.name = "Non-ciliated Neurons")
feature.plot.scaledsummed(cells, marker_genes[["Parapharyngeal"]], features.plot.name = "Parapharyngeal")
feature.plot.scaledsummed(cells, marker_genes[["Protonephridia"]], features.plot.name = "Protonephridia")
feature.plot.scaledsummed(cells, marker_genes[["Hh Sensitive Glia"]], features.plot.name = "Hh Regulated Glia Markers")
feature.plot.scaledsummed(cells, marker_genes[["Glia"]], features.plot.name = "All Glia Markers")
feature.plot.scaledsummed(cells, marker_genes[["Pigment"]], features.plot.name = "Pigment Cell Markers")
```

Neural Cell Types

```{r neural cell types}
feature.plot.rpkm(cells, "dd_Smed_v4_12653_0_1", features.plot.name = "Glutamate decarboxylase") # glutamate decarboxylase, GABAergic neurons
feature.plot.rpkm(cells, "dd_Smed_v4_16581_0_1", features.plot.name = "Tyrosine hydroxylase") # tyrosine hydroxylase, dopaminergic neurons
feature.plot.rpkm(cells, "dd_Smed_v4_8392_0_1", features.plot.name = "Tryptophan hydroxylase") # tryptophan hydroxylase, serotonergic neurons
feature.plot.rpkm(cells, "dd_Smed_v4_6208_0_1", features.plot.name = "ChAT") # ChAT, cholinergic neurons
feature.plot.rpkm(cells, "dd_Smed_v4_11968_0_1", features.plot.name = "ChAT") # ChAT, cholinergic neurons
feature.plot.rpkm(cells, "dd_Smed_v4_42610_0_1", features.plot.name = "tyramine beta hydroxylase") # tyramine beta hydroxylase, octopaminergic neurons
feature.plot.rpkm(cells, "dd_Smed_v4_80459_0_1", features.plot.name = "tyramine beta hydroxylase") # tyramine beta hydroxylase, octopaminergic neurons
```

TRP Population Analysis

```{r trp channel expression}
feature.plot.rpkm(cells, "dd_Smed_v4_14207_0_1", features.plot.name = "trpA")
feature.plot.rpkm(cells, "dd_Smed_v4_12031_0_1", features.plot.name = "transient receptor potential cation channel subfamily A member 1 isoform X1*")
feature.plot.rpkm(cells, "dd_Smed_v4_17857_0_1", features.plot.name = "transient receptor potential cation channel subfamily M member 3 isoform X13*")
feature.plot.rpkm(cells, "dd_Smed_v4_26481_0_1", features.plot.name = "transient receptor potential cation channel subfamily M member 3 isoform X13*")
feature.plot.rpkm(cells, "dd_Smed_v4_20683_0_1", features.plot.name = "transient receptor potential cation channel subfamily A member 1 isoform X2*")
feature.plot.rpkm(cells, "dd_Smed_v4_41539_0_1", features.plot.name = "transient receptor potential cation channel subfamily A member 1 isoform X2*")
feature.plot.rpkm(cells, "dd_Smed_v4_28432_0_1", features.plot.name = "transient receptor potential cation channel subfamily A member 1 isoform X2*")
feature.plot.rpkm(cells, "dd_Smed_v4_59015_0_1", features.plot.name = "transient receptor potential cation channel subfamily A member 1 isoform X2*")
feature.plot.rpkm(cells, "dd_Smed_v4_72282_0_1", features.plot.name = "transient receptor potential cation channel subfamily M member 2 isoform X3*")
```

```{r trp transcription factors}
feature.plot.rpkm(cells, "dd_Smed_v4_13772_0_1", features.plot.name = "prox-1") # prox-1
feature.plot.rpkm(cells, "dd_Smed_v4_28005_0_1", features.plot.name = "fli2-like") # fli2-like
feature.plot.rpkm(cells, "dd_Smed_v4_15178_0_1", features.plot.name = "six3-1") # six3-1
feature.plot.rpkm(cells, "dd_Smed_v4_6047_0_1", features.plot.name = "Su(H)")
```

```{r notch receptor}
feature.plot.rpkm(cells, "dd_Smed_v4_7067_0_1", features.plot.name = "notch-2")
```

Dopa Population Analysis

```{r dopa transcription factors}
feature.plot.rpkm(cells, "dd_Smed_v4_16581_0_1", features.plot.name = "Tyrosine hydroxylase") # tyrosine hydroxylase, dopaminergic neurons
feature.plot.rpkm(cells, "dd_Smed_v4_11113_0_1", features.plot.name = "Fli-1-like") # Fli-1-like
feature.plot.rpkm(cells, "dd_Smed_v4_14611_0_1", features.plot.name = "elf1-like/FLI-1") # newmark Fli-1/elf-like
feature.plot.rpkm(cells, "dd_Smed_v4_8104_0_1", features.plot.name = "soxB1-2") # sox2
```

Glia and Pigment Population Analysis

```{r glia pigment genes}
feature.plot.rpkm(cells, "dd_Smed_v4_6910_0_1", features.plot.name = "forkhead box protein F1*")
feature.plot.rpkm(cells, "dd_Smed_v4_6316_0_1", features.plot.name = "forkhead box protein P4 isoform 3*")
feature.plot.rpkm(cells, "dd_Smed_v4_7583_0_1", features.plot.name = "forkhead box protein K1*")
feature.plot.rpkm(cells, "dd_Smed_v4_5767_0_1", features.plot.name = "forkhead box protein K1*")
feature.plot.rpkm(cells, "dd_Smed_v4_6626_0_1", features.plot.name = "ETS-related transcription factor Elf-4 isoform X3*")
feature.plot.rpkm(cells, "dd_Smed_v4_7470_0_1", features.plot.name = "gli-1")
feature.plot.rpkm(cells, "dd_Smed_v4_1226_0_1", features.plot.name = "ALAS")
feature.plot.rpkm(cells, "dd_Smed_v4_6364_0_1", features.plot.name = "ALAD")
feature.plot.rpkm(cells, "dd_Smed_v4_626_0_1", features.plot.name = "PBDG-1")
feature.plot.rpkm(cells, "dd_Smed_v4_1792_0_1", features.plot.name = "Estrella")
feature.plot.rpkm(cells, "dd_Smed_v4_12254_0_1", features.plot.name = "IF-1")
feature.plot.rpkm(cells, "dd_Smed_v4_10221_0_1", features.plot.name = "Delta-5")
feature.plot.rpkm(cells, "dd_Smed_v4_1106_0_1", features.plot.name = "eaat2-1")
feature.plot.rpkm(cells, "dd_Smed_v4_3514_0_1", features.plot.name = "eaat2-2")
```

```{r fig.height=25, fig.width = 15}
TSNEPlot(cells, do.label = T)
glial_clusters = c(23, 8)
genes = unique(allmarkers[which(allmarkers$cluster %in% glial_clusters),]$gene)
library(gplots)
library(RColorBrewer)
# expression matrix restricted to glial cluster marker genes and glial cluster cells
hmap = as.matrix(cells@scale.data[row.names(cells@data) %in% genes, which(cells@data.info$res.1.5 %in% glial_clusters)])
# average-linkage clustering on Pearson correlation distance
hclustfunc = function(x) hclust(x, method = "average")
distfunc = function(x) as.dist(1 - cor(t(x), method = "pearson", use = "pairwise.complete.obs"))
heatmap.2(hmap, hclustfun = hclustfunc, distfun = distfunc, na.rm = T,
          breaks = c(seq(-2, 2, length = 1000)),
          col = colorRampPalette(rev(brewer.pal(10, "PiYG")))(999),
          na.color = "black", key = T, keysize = 1, density.info = "none", trace = "none",
          cexRow = 1, cexCol = 1, margins = c(20, 12), lhei = c(1, 20),
          main = "Heatmap of expression levels")
```

Lhx5-1/Pitx

```{r serotonergic}
feature.plot.rpkm(cells, "dd_Smed_v4_8392_0_1", features.plot.name = "Tph")
feature.plot.rpkm(cells, "dd_Smed_v4_11521_0_1", features.plot.name = "Lhx5-1") # lhx5-1
feature.plot.rpkm(cells, "dd_Smed_v4_15253_0_1", features.plot.name = "pitx") # pitx
feature.plot.rpkm(cells, "dd_Smed_v4_8820_0_1", features.plot.name = "islet-1")
coexpressionPlot(cells, "dd_Smed_v4_11521_0_1", "dd_Smed_v4_15253_0_1")
coexpressionPlot(cells, "dd_Smed_v4_8820_0_1", "dd_Smed_v4_15253_0_1")
coexpressionPlot(cells, "dd_Smed_v4_11521_0_1", "dd_Smed_v4_8820_0_1")
```

nkx2.1 and arx

```{r pax37}
feature.plot.rpkm(cells, "dd_Smed_v4_13898_0_1", features.plot.name = "nkx2") # nkx2-like
feature.plot.rpkm(cells, "dd_Smed_v4_21801_0_1", features.plot.name = "pax3/7-like/alx")
feature.plot.rpkm(cells, c("dd_Smed_v4_80459_0_1", "dd_Smed_v4_42610_0_1"), features.plot.name = "dopamine beta hydroxylase")
```

klf

```{r klf}
feature.plot.rpkm(cells, "dd_Smed_v4_95726_0_1", features.plot.name = "klf-like") # klf, but check
```

SIM

```{r sim}
feature.plot.rpkm(cells, "dd_Smed_v4_17731_0_1", features.plot.name = "Single-Minded")
```

COE

```{r coe}
feature.plot.rpkm(cells, "dd_Smed_v4_9893_0_1", features.plot.name = "coe") # coe
TSNEPlot(cells, do.label = T)
feature.plot.rpkm(cells, "dd_Smed_v4_14213_0_1", features.plot.name = "PREDICTED: doublesex- and mab-3-related transcription factor C2 isoform X2 [Homo sapiens]")
feature.plot.rpkm(cells, "dd_Smed_v4_15555_0_1", features.plot.name = "pou4l-1")
feature.plot.rpkm(cells, c("dd_Smed_v4_25321_0_1", "dd_Smed_v4_20944_0_1"), features.plot.name = "soxb2-2")
feature.plot.rpkm(cells, "dd_Smed_v4_12317_0_1", features.plot.name = "Post-2d/AbdBa")
```

```{r eyes}
feature.plot.rpkm(cells, "dd_Smed_v4_17385_0_1", features.plot.name = "sp6-9")
feature.plot.rpkm(cells, "dd_Smed_v4_11372_0_1", features.plot.name = "eya")
```

ChAT Population Analysis

```{r chat}
feature.plot.rpkm(cells, "dd_Smed_v4_10921_0_1", features.plot.name = "skil")
feature.plot.rpkm(cells, "dd_Smed_v4_6208_0_1", features.plot.name = "ChAT") # ChAT
feature.plot.rpkm(cells, "dd_Smed_v4_17726_0_1", features.plot.name = "pax6A") # pax6A
```

3.9 Computational Appendix 2

---
title: "Computational Appendix 2"
output:
  pdf_document:
    fig_caption: yes
    keep_tex: yes
  html_document: default
---

```{r generic}
library(methods)
library(knitr)
library("Seurat", lib.loc = "/usr/local/lib/R/site-library")
library(openxlsx)
library(svglite)
source("seuratextracode.R")
knitr::opts_chunk$set(fig.path = '/lab/solexareddien/Kellie/Dissertation/nbfigs/',
                      echo = T, warning = FALSE, message = FALSE,
                      dev = 'svglite', fig.ext = 'svg', results = "hide")
```

```{r import X1 batch 1}
deep = read.table("/lab/solexareddien/Kellie/X1_single_cell_seq/ddV4/OWdataincluded/ddV4.condensed.rawReadFile.txt",
                  header = T, row.names = "Contig")
deep$To = NULL

# basic rawReadCount filtering: contigs with more than one read per cell
mRNAsMapped = colSums(deep > 1)
hist(mRNAsMapped, breaks = 100, ylim = c(0, 35), xlim = c(0, 25000))

# cut cells under low reads and above high reads
deep = deep[, colSums(deep > 1) < 10000]
deep = deep[, colSums(deep > 1) > 800]
drops = c('dd_Smed_v4_0_0_1', 'dd_Smed_v4_217_0_1', 'dd_Smed_v4_521_0_1',
          'dd_Smed_v4_717_0_1', 'dd_Smed_v4_7_0_1', 'dd_Smed_v4_4_1_1', '*')
deep = deep[!(row.names(deep) %in% drops),]
deep = CreateSeuratObject(raw.data = deep, min.cells = 3, min.genes = 200)
deep = NormalizeData(object = deep, normalization.method = "LogNormalize", scale.factor = 10000)
deep = ScaleData(object = deep)
deep = FindVariableGenes(object = deep, do.plot = FALSE)
```

```{r import X1 batch 2}
# Read raw read counts for the second (shallower) sequencing batch.
shallow = read.table("/lab/solexareddien/Kellie/X1_singlecell_seq/NewSequencing/161103Red/161103Red_X1rawReadCounts.txt")

# basic rawReadCount filtering: number of contigs with more than one read in each cell
mRNAsMapped = colSums(shallow > 1)
hist(mRNAsMapped, breaks = 100, ylim = c(0, 35), xlim = c(0, 25000))

# cut cells under low reads and above high reads
shallow = shallow[, colSums(shallow > 1) < 5500]
shallow = shallow[, colSums(shallow > 1) > 800]

# remove selected contigs before building the Seurat object
drops = c('dd_Smed_v4_0_0_1', 'dd_Smed_v4_217_0_1', 'dd_Smed_v4_521_0_1',
          'dd_Smed_v4_717_0_1', 'dd_Smed_v4_7_0_1', 'dd_Smed_v4_4_1_1', '*')
shallow = shallow[!(row.names(shallow) %in% drops),]

shallow = CreateSeuratObject(raw.data = shallow, min.cells = 3, min.genes = 200)
shallow = NormalizeData(object = shallow, normalization.method = "LogNormalize",
                        scale.factor = 10000)
shallow = ScaleData(shallow)
shallow = FindVariableGenes(object = shallow, do.plot = FALSE)

# take the union of the top 2000 variable genes from each batch and label each batch
hvg.deep = rownames(x = head(x = deep@hvg.info, n = 2000))
hvg.shallow = rownames(x = head(x = shallow@hvg.info, n = 2000))
hvg.union = union(x = hvg.deep, y = hvg.shallow)
deep@meta.data[, "protocol"] = "SmartSeq_1"
shallow@meta.data[, "protocol"] = "SmartSeq_2"

# canonical correlation analysis across the two batches
X1s <- RunCCA(object = deep, object2 = shallow, genes.use = hvg.union, num.cc = 30)

# visualize results of CCA plot CC1 versus CC2 and look at a violin plot
p1 <- DimPlot(object = X1s, reduction.use = "cca", group.by = "protocol",
              pt.size = 0.5, do.return = TRUE)
p2 <- VlnPlot(object = X1s, features.plot = "CC1", group.by = "protocol",
              do.return = TRUE)
plot_grid(p1, p2)
```
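The gene-selection step before CCA takes the most variable genes from each batch and pools them. The sketch below shows that idea in base R under stated assumptions: the dispersion values, gene names, and the helper `top_n_genes` are hypothetical, and it ranks genes explicitly rather than relying on the ordering of a Seurat `hvg.info` table.

```r
# Hypothetical per-gene dispersion estimates for two batches
disp_deep    <- c(geneA = 3.1, geneB = 0.4, geneC = 2.2, geneD = 1.0)
disp_shallow <- c(geneA = 0.2, geneB = 2.9, geneC = 2.5, geneD = 0.1)

# Hypothetical helper: names of the n genes with the highest dispersion
top_n_genes <- function(dispersion, n) {
  names(sort(dispersion, decreasing = TRUE))[seq_len(n)]
}

# Union of the top genes from each batch; union() drops the shared entry
hvg_union <- union(top_n_genes(disp_deep, 2), top_n_genes(disp_shallow, 2))
print(hvg_union)
```

Pooling by union rather than intersection keeps genes that vary strongly in only one batch, so a difference in sequencing depth does not by itself exclude informative genes.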

```{r, fig.height=20, fig.width=10}
# heatmaps of the genes driving each of the first 30 canonical correlation vectors
DimHeatmap(object = X1s, reduction.type = "cca", cells.use = 500,
           dim.use = 1:30, do.balanced = TRUE)
```

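```{r CCA}
# ratio of variance explained by CCA versus PCA for each cell, grouped by protocol
X1s <- CalcVarExpRatio(object = X1s, reduction.type = "pca",
                       grouping.var = "protocol", dims.use = 1:20)

# We discard cells where the variance explained by CCA is <2-fold (ratio <
# 0.5) compared to PCA
# Note: no SubsetData() filtering call appears in this chunk before alignment.
X1s <- AlignSubspace(object = X1s, reduction.type = "cca",
                     grouping.var = "protocol", dims.align = 1:20)
```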

```{r load tissue markers}
library(openxlsx)
markers = read.xlsx("/lab/solexareddien/Kellie/Transcriptomes/ddV4/Wurtzel2015_clusterIDs.xlsx")
listofclusters = unique(markers$Cluster)

# build a named list: one vector of contig IDs per annotated cluster
marker_genes = list()
i = 1
for (each in listofclusters){
  marker_genes[[each]] = markers$Contig[markers$Cluster %in% listofclusters[i]]
  i = i + 1
}
```
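The loop above builds a named list with one vector of contig IDs per cluster. A compact base R equivalent uses `split()`. The sketch below runs on a hypothetical toy table whose column names mirror the `Cluster` and `Contig` columns used above; the contig names are invented for illustration.

```r
# Hypothetical marker table with the same two columns as the spreadsheet above
markers <- data.frame(
  Cluster = c("Muscle", "Neoblasts", "Muscle", "Glia"),
  Contig  = c("contig_1", "contig_2", "contig_3", "contig_4"),
  stringsAsFactors = FALSE
)

# split() groups the contig IDs by cluster label and returns a named list
marker_list <- split(markers$Contig, markers$Cluster)

# Look up the markers for one cluster by name
print(marker_list[["Muscle"]])
```

One difference from the loop is ordering: `split()` sorts the list names, whereas the loop keeps clusters in order of first appearance. Lookup by name is unaffected.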

```{r tissue markers}
X1s <- RunTSNE(object = X1s, reduction.use = "cca.aligned", dims.use = unique(1:8),
               do.fast = TRUE, perplexity = 16)

X1s <- FindClusters(object = X1s, reduction.type = "cca.aligned", dims.use = unique(1:8),
                    temp.file.location = "/lab/solexareddien/Kellie/Disseration",
                    resolution = .2, print.output = T, save.SNN = T,
                    k.param = 5, k.scale = 2)

# t-SNE colored by protocol (batch) and by cluster
p1 <- TSNEPlot(object = X1s, group.by = "protocol", do.return = TRUE, pt.size = 0.5)
p2 <- TSNEPlot(object = X1s, do.return = TRUE, pt.size = 0.5)
plot_grid(p1, p2)

library(dplyr)
library(tidyr)
TSNEPlot(X1s, group.by = "orig.ident")

# expression of published tissue marker sets on the t-SNE map
feature.plot.rpkm_summed(X1s, marker_genes[["Epidermal lineage (early + late + mature)"]],
                         features.plot.name = "Epidermal Lineage Markers")
feature.plot.rpkm(X1s, "dd_Smed_v4_659_0_1")
feature.plot.scaledsummed(X1s, marker_genes[["Neoblasts"]],
                          features.plot.name = "Neoblasts")
feature.plot.scaledsummed(X1s, marker_genes[["Zeta neoblasts"]],
                          features.plot.name = "Epidermal (Zeta) Neoblasts")
feature.plot.scaledsummed(X1s, marker_genes[["Gamma neoblasts"]],
                          features.plot.name = "Gamma Neoblasts")
feature.plot.scaledsummed(X1s, marker_genes[["Sigma neoblasts"]],
                          features.plot.name = "Sigma Neoblasts")
feature.plot.scaledsummed(X1s, marker_genes[["Gamma neoblasts"]],
                          features.plot.name = "Gamma Neoblasts")
feature.plot.scaledsummed(X1s, marker_genes[["Muscle"]],
                          features.plot.name = "Muscle")
feature.plot.scaledsummed(X1s, marker_genes[["Neural (both ciliated and non-ciliated)"]],
                          features.plot.name = "All Neural Markers")
feature.plot.scaledsummed(X1s, marker_genes[["Neural (ciliated)"]],
                          features.plot.name = "Ciliated Neurons")
feature.plot.scaledsummed(X1s, marker_genes[["Neural (non-ciliated)"]],
                          features.plot.name = "Non-ciliated Neurons")
feature.plot.scaledsummed(X1s, marker_genes[["Parapharyngeal"]],
                          features.plot.name = "Parapharyngeal")
feature.plot.scaledsummed(X1s, marker_genes[["Protonephridia"]],
                          features.plot.name = "Protonephridia")
feature.plot.scaledsummed(X1s, marker_genes[["Hh Sensitive Glia"]],
                          features.plot.name = "Hh Regulated Glia Markers")
feature.plot.scaledsummed(X1s, marker_genes[["Glia"]],
                          features.plot.name = "All Glia Markers")
feature.plot.scaledsummed(X1s, marker_genes[["Pigment"]],
                          features.plot.name = "Pigment Cell Markers")
```

```{r generate cluster markers for cells}
# positive cluster markers by two tests: ROC classification power and the bimodal likelihood-ratio test
cellsmarkers = FindAllMarkers(X1s, test.use = "roc", return.thresh = 0.7, only.pos = T)
cellsmarkers_bimod = FindAllMarkers(X1s, test.use = "bimod", return.thresh = 0.05,
                                    only.pos = T)
```

```{r save markers for cells}
# combine both marker tables, keeping genes found by either test
allmarkers = full_join(cellsmarkers, cellsmarkers_bimod, by = c("cluster", "gene"),
                       suffix = c(".roc", ".bimod"))
df = allmarkers
df = annotateTable(df, "/lab/solexareddien/Kellie/Transcriptomes/ddV4/hs.ddV4.txt",
                   df.key = "gene")
write.xlsx(df, "cellMarkers_X1s.xlsx")
```
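The join above keeps every (cluster, gene) pair reported by either test and disambiguates shared column names with suffixes. The sketch below reproduces that behavior with base R `merge()` instead of the dplyr join, on hypothetical toy tables; the `score` column is an invented placeholder, not an actual `FindAllMarkers` output column.

```r
# Hypothetical marker tables from two tests; "score" stands in for each test's statistic
roc <- data.frame(cluster = c(0, 0, 1), gene = c("g1", "g2", "g3"),
                  score = c(0.90, 0.80, 0.75), stringsAsFactors = FALSE)
bimod <- data.frame(cluster = c(0, 1, 1), gene = c("g1", "g3", "g4"),
                    score = c(0.001, 0.020, 0.040), stringsAsFactors = FALSE)

# all = TRUE makes this a full outer join; suffixes label the clashing "score" columns
joined <- merge(roc, bimod, by = c("cluster", "gene"), all = TRUE,
                suffixes = c(".roc", ".bimod"))
print(joined)
```

Rows found by only one test carry `NA` in the other test's column, which makes it easy to see which markers are supported by both criteria.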

```{r synapse genes}
synapseGenes = c('dd_Smed_v4_10835_0_1', 'dd_Smed_v4_6859_0_1', 'dd_Smed_v4_5361_0_1',
                 'dd_Smed_v4_10835_0_1', 'dd_Smed_v4_2985_0_1', 'dd_Smed_v4_3977_0_1',
                 'dd_Smed_v4_3135_0_1', 'dd_Smed_v4_3135_0_1', 'dd_Smed_v4_7243_0_1',
                 'dd_Smed_v4_6730_0_1', 'dd_Smed_v4_4222_0_1', 'dd_Smed_v4_5266_0_1',
                 'dd_Smed_v4_7111_0_1', 'dd_Smed_v4_10375_0_1', 'dd_Smed_v4_11887_0_1',
                 'dd_Smed_v4_12647_0_1', 'dd_Smed_v4_12772_0_1', 'dd_Smed_v4_13079_0_1',
                 'dd_Smed_v4_13340_0_1', 'dd_Smed_v4_13680_0_1', 'dd_Smed_v4_13706_0_1',
                 'dd_Smed_v4_16195_0_1', 'dd_Smed_v4_16731_0_1', 'dd_Smed_v4_1798_0_1',
                 'dd_Smed_v4_18661_0_1', 'dd_Smed_v4_19213_0_1', 'dd_Smed_v4_19328_0_1',
                 'dd_Smed_v4_20033_0_1', 'dd_Smed_v4_20523_0_1', 'dd_Smed_v4_21069_0_1',
                 'dd_Smed_v4_22061_0_1', 'dd_Smed_v4_23389_0_1', 'dd_Smed_v4_25279_0_1',
                 'dd_Smed_v4_4222_0_1', 'dd_Smed_v4_4335_0_1', 'dd_Smed_v4_5370_0_1',
                 'dd_Smed_v4_5946_0_1', 'dd_Smed_v4_6730_0_1', 'dd_Smed_v4_6920_0_1',
                 'dd_Smed_v4_7243_0_1', 'dd_Smed_v4_8032_0_1', 'dd_Smed_v4_8321_0_1',
                 'dd_Smed_v4_8438_0_1', 'dd_Smed_v4_8909_0_1')

# unique() removes the repeated IDs in the list above before plotting
feature.plot.scaledsummed(X1s, unique(synapseGenes), features.plot.name = "Synapse Genes")
feature.plot.rpkm(X1s, "dd_Smed_v4_3135_0_1", features.plot.name = "Synapsin")
feature.plot.rpkm(X1s, "dd_Smed_v4_659_0_1", features.plot.name = "smedwi-1")
```

```{r neural TFs}
# import list of neural marker ddIDs
neuralGenes = read.table(file = "/lab/solexareddien/Kellie/X1_singlecell_seq/synapse_markerenrichment/output.neuralTFtoddV4.txt")
neuralGenes = neuralGenes$V2

# keep only the genes present in the expression matrix
neuralGenes = neuralGenes[neuralGenes %in% row.names(X1s@data)]
neuralGenes = as.data.frame(neuralGenes)
neuralGenes = annotateTable(neuralGenes,
                            "/lab/solexareddien/Kellie/Transcriptomes/ddV4/hs.ddV4.txt",
                            df.key = "neuralGenes")
feature.plot.scaledsummed(X1s, unique(neuralGenes$neuralGenes),
                          features.plot.name = "Neural Transcription Factors")
```

```{r chat}
feature.plot.rpkm(X1s, "dd_Smed_v4_10921_0_1", features.plot.name = "skil")
feature.plot.rpkm(X1s, "dd_Smed_v4_6208_0_1") #ChAT
feature.plot.rpkm(X1s, "dd_Smed_v4_10293_0_1", features.plot.name = "pax6A") #pax6A
coexpressionPlot(X1s, "dd_Smed_v4_10293_0_1", "dd_Smed_v4_10921_0_1")
```

```{r dopa}
feature.plot.rpkm(X1s, "dd_Smed_v4_11113_0_1", features.plot.name = "Fli-1-like") #Fli-1-like
feature.plot.rpkm(X1s, "dd_Smed_v4_14611_0_1", features.plot.name = "elf-1") #newmark Fli-1/elf-like
coexpressionPlot(X1s, "dd_Smed_v4_11113_0_1", "dd_Smed_v4_14611_0_1")
```

```{r trp}
feature.plot.rpkm(X1s, "dd_Smed_v4_13772_0_1", features.plot.name = "prox-1") #prox-1
feature.plot.rpkm(X1s, "dd_Smed_v4_28005_0_1", features.plot.name = "fli2-like") #fli2-like
feature.plot.rpkm(X1s, "dd_Smed_v4_15178_0_1", features.plot.name = "six3-1") #six3-1
feature.plot.rpkm(X1s, "dd_Smed_v4_6047_0_1", features.plot.name = "Su(H)")
coexpressionPlot(X1s, "dd_Smed_v4_13772_0_1", "dd_Smed_v4_6047_0_1")
```

Table 3.1: List of genes and transcriptome IDs used to annotate single-cell sequencing data

Chapter 4

Discussion

4.1 Regeneration requires unique regulation of tissue production

Compared to development, regeneration of adult structures requires dynamic regulation of cell production on demand and the ability to integrate regenerated cells into existing tissues. However, regeneration also provides an opportunity to use existing adult tissues to guide the production of new tissue. In planarians, recent work points to the body musculature as a source of patterning information. The body muscle expresses secreted components of a number of cell signaling pathways that guide or specify types of tissue across the body [81][128], and the expression of these genes dynamically rescales in response to injury. Having an adult tissue that can provide this information may allow stem cells to remain plastic in their regulation of differentiation. Indeed, recent work has shown that for the planarian eye, loss of the tissue does not induce increased specification [129]. Instead, the amount of missing tissue and the prolonged activation of a molecular wound response are correlated with the amount of new generic tissue production. Any tissue generally located near removed tissue is produced, and the amount of new tissue is scaled to the amount of missing tissue. This suggests that planarian neoblasts do not actively surveil the differentiation needs of individual tissues, but rather rely on a global homeostatic rate of all tissue production. In such a model, the rate of this homeostatic process is simply increased to accommodate new tissue production.

4.2 Planarian neoblasts are the site of major cell differentiation in the animal

Much work now establishes planarian neoblasts as the major site of cell specification during planarian regeneration. The planarian neoblast compartment was shown to contain pluripotent stem cells (cNeoblasts) that can differentiate into all tissues of the body [64]. Single-cell qPCR allowed further interrogation of neoblasts and revealed a population of epidermal progenitors that represents up to 50% of all isolated neoblasts and is produced from the non-epidermal neoblast population [99]. The same study identified a population of putative gut progenitors, which were later found to be required for gut regeneration [130]. A broad single-cell analysis of many cell types in planarians also found that neoblasts contained putative progenitors for muscle, gut, nephridial, and neural cell types, as shown in Chapter 2 [48]. Together, these data strongly suggest a model in which neoblasts are a deeply heterogeneous mixture of progenitors for, or cells poised towards, specific tissue fates. Importantly, it will be of future interest to establish how neoblasts can largely be a population of specified cells, yet remain a source of pluripotent stem cells. The question of specialized neoblast potential and plasticity has yet to be systematically interrogated, and it will be important to understand whether all or only some specialized neoblasts retain the ability to act as pluripotent cNeoblasts, and whether such potency is controlled by undiscovered molecular mechanisms.

4.3 Planarian neoblasts specialize into many different cell types

In addition to being the site of tissue specification, increasing evidence also points to planarian neoblasts being the source of cell sub-type specification. A study of the differentiation of the planarian epidermis established that neoblasts not only contain progenitors for all epidermal cells, but also contain cells already fated towards a ventral versus dorsal epidermal fate [131]. The study found that neoblasts, and not later epidermal progenitors, respond to levels of BMP signaling in order to choose a dorsal versus ventral epidermal fate. This suggests that neoblasts integrate signals from the body in choosing which cells, and cell sub-types, to differentiate into. Further, multiple studies of neural differentiation from neoblasts have established that combinations of transcription factors that are required for, or expressed in, very specific neural cell types are also expressed in neoblasts. There is now evidence that serotonergic [69][68], octopaminergic [48][100], and sensory [48] neurons, as well as cholinergic and dopaminergic neurons (this thesis, Chapter Three), are all specialized in planarian neoblasts. It will be of interest to establish how neoblasts choose to differentiate into these specific fates, and whether signaling environments or adult tissues influence this decision as in the epidermal lineage. There is already some evidence that surrounding tissues or signals from the muscle can affect neural neoblast differentiation. Modulation of wntA, largely expressed in the brain, can influence the number of neurons produced in the animal [132]. Likewise, modulation of Hedgehog signaling, largely expressed in the ventral medial serotonergic and octopaminergic cells, can modulate the fate of neurons in the brain [100]. Understanding other factors that affect neural specification choices or probabilities will be of future interest.

4.4 Evidence that planarian neoblasts directly specify into cell types

How neoblasts specialize into tissue sub-types is still an open question. While there is evidence of epidermal sub-type specification in neoblasts, there are still a number of transcriptional regulators expressed in most zeta neoblasts, including the transcription factor zfp-1, which is required for all epidermal cell differentiation. Similarly, in gamma neoblasts the transcription factors hnf4 and gata4/5/6 are expressed broadly and are required for gut differentiation. These data suggested a model whereby specialized neoblasts for a tissue may make rough bifurcations in fate to maintain multipotent tissue progenitors.

We studied this topic using the model system of the planarian nervous system, specifically looking for a population of pan-neuronal progenitor cells. Using single-cell sequencing of the planarian nervous system and the neoblasts of the planarian head, we observed evidence of neoblasts specialized towards different neural cell fates. Clustering of the neoblasts alone revealed previously described populations of zeta and gamma neoblasts, and the predicted population of muscle progenitors, as well as a putative population of protonephridia progenitors. Neoblasts that expressed neural transcription factors and markers clustered together as well, but it was readily apparent that these were more heterogeneous than a population such as the zeta or gamma populations. The neural neoblasts expressed high levels of what were previously described as sigma neoblast markers (genes that primarily distinguish non-zeta and non-gamma neoblasts) rather than transcription factors or markers that uniquely describe the neural population. Further, genes that were specific to just the neural population were largely transcription factors and neural markers that were sparsely expressed in just a few cells of the whole neural population. While single-cell sequencing can be subject to high drop-out rates, this expression heterogeneity is much greater than that observed for zfp-1 in the zeta population, or for hnf4 and gata4/5/6 in the gamma population. Such expression dynamics are consistent with a model where neural neoblasts are more heterogeneous than zeta or gamma neoblasts, and do not have unifying expression of a broad pro-neural program.

Of course, it is still possible that there is a population of pan-neural progenitors that differentiate into the observed neural neoblasts, but do not express any genes associated with observed neural neoblasts. However, no such population is readily apparent as a cluster of cells with single-cell sequencing. Alternatively, it is possible that there is a temporally regulated gene expression program, as in Drosophila neural differentiation, deployed in neural specialized neoblasts, which when observed in a population of unsynchronized progenitors appears as progenitor heterogeneity. It is also possible that we simply did not sequence enough neoblasts with enough coverage to detect markers of pan-neural fate. Or, it is possible that neural fate is a "default" state for neoblasts, such that all other tissues require a repressive mechanism that is observed as a multipotent progenitor gene expression program. These and many other models are possible, and further experiments will need to determine which, if any, have strong supporting evidence. At this time, our data on neural differentiation are most parsimoniously explained by a model where neural fates are directly sampled or specified as a mixture of heterogeneous neural progenitor cells. In contrast, other described tissues such as the epidermis may utilize a unified gene expression program. Further work in this area will hopefully reveal more molecular mechanisms related to how these choices are governed for the nervous system and other tissues.
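The comparison between broadly and sparsely expressed markers can be made concrete as a per-gene detection fraction within a cluster. The following is a minimal sketch in base R on invented values, not the thesis data: the matrix `expr` and the gene labels `broad_marker` and `sparse_marker` are hypothetical, and a simple "expression above zero" rule is assumed as the definition of detection.

```r
# Hypothetical normalized expression for one cluster: rows are genes, columns are cells
expr <- rbind(
  broad_marker  = c(2.1, 0, 1.7, 3.0, 2.4, 1.1, 0, 2.8),
  sparse_marker = c(0,   0, 0,   2.5, 0,   0,   0, 0)
)

# Fraction of cells in the cluster with any detected expression of each gene
detection_fraction <- rowMeans(expr > 0)
print(detection_fraction)
```

Under these assumptions, drop-out alone would be expected to lower the detection fraction of every gene by a broadly similar amount, so a marker detected in only a small minority of cells while other cluster markers are detected in most cells is more consistent with heterogeneity than with technical loss.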

4.5 Regeneration may decouple a need to coordinate cell differentiation with appropriate growth or timing

The lack of evidence for a planarian pan-neuronal progenitor population might be associated with, or enable, the ability of regeneration to occur from an unlimited number of starting points (different missing cell types and numbers). Adult tissue turnover and regeneration could favor molecular mechanisms that rely on cell sorting rather than hierarchical differentiation. We have reviewed evidence that cells of the human hematopoietic stem cell system (this thesis, Introduction) and planarian stem cells do not always appear to go through sequential steps towards differentiation in adults. Regeneration, by definition, also requires animals to contend with robust cell plasticity. This might favor the prioritization of cell diversification over cell amplification during regeneration, with a regenerating animal making as many different cell types as possible in a short amount of time. In contrast, during development, without the ability to template off remaining adult cells or signaling cues, an embryo may need to build up sufficient cell numbers before triggering layered lineage differentiation processes. It will therefore be interesting to systematically compare cell lineage dynamics during development versus regeneration in many species, to test whether regeneration or adult tissue dynamics are consistently correlated with more direct differentiation paths.

Bibliography

[1] Eric H Davidson. The Regulatory Genome. Gene Regulatory Networks In Development And Evolution. Academic Press, July 2010.

[2] Andreas Schmidt-Rhaesa, Steffen Harzsch, and Günter Purschke. Structure and Evolution of Invertebrate Nervous Systems. Oxford University Press, December 2015.

[3] Detlev Arendt, Alexandru S Denes, Gáspár Jékely, and Kristin Tessmar-Raible. The evolution of nervous system centralization. Philosophical Transactions of the Royal Society B: Biological Sciences, 363(1496):1523-1528, April 2008.

[4] Fabian Rentzsch, Michael Layden, and Michaël Manuel. The cellular and molecular basis of cnidarian neurogenesis. Wiley Interdisciplinary Reviews: Developmental Biology, 6(1):e257, November 2016.

[5] Gemma Sian Richards and Fabian Rentzsch. Regulation of Nematostella neural progenitors by SoxB, Notch and bHLH genes. Development, 142(19):3332-3342, October 2015.

[6] Oliver Hobert. Neurogenesis in the nematode Caenorhabditis elegans. WormBook, pages 1-24, October 2010.

[7] Omer Ali Bayraktar and Chris Q Doe. Combinatorial temporal patterning in progenitors expands neural diversity. Nature, 498(7455):449-455, June 2013.

[8] Kristen A Yankura, Claire S Koechlein, Abigail F Cryan, Alys Cheatle, and Veronica F Hinman. Gene regulatory network for neurogenesis in a sea star embryo connects broad neural specification and localized patterning. PNAS, 110(21):8591-8596, May 2013.

[9] Chris Q Doe. Neural stem cells: balancing self-renewal with differentiation. Development, 135(9):1575-1587, March 2008.

[10] Dylan R Farnsworth and Chris Q Doe. Opportunities lost and gained: Changes in progenitor competence during nervous system development. Neurogenesis (Austin, Tex.), 4(1):e1324260, January 2017.

[11] Minoree Kohwi and Chris Q Doe. Temporal fate specification and neural progenitor competence during development. Nature Reviews Neuroscience, 14(12):823-838, November 2013.

[12] Simona Lodato and Paola Arlotta. Generating Neuronal Diversity in the Mammalian Cerebral Cortex. Annual review of cell and developmental biology, 31(1):699-720, November 2015.

[13] Bradley J Molyneaux, Paola Arlotta, Joao R L Menezes, and Jeffrey D Macklis. Neuronal subtype specification in the cerebral cortex. Nature Reviews Neuroscience, 8(6):427-437, June 2007.

[14] C Niehrs. On growth and form: a Cartesian coordinate system of Wnt and BMP signaling specifies bilaterian body axes. Development, 137(6):845-857, February 2010.

[15] J Elliott, C Jolicoeur, V Ramamurthy, and M Cayouette. Ikaros Confers Early Temporal Competence to Mouse Retinal Progenitor Cells. Neuron, 2008.

[16] Francisco L A F Gomes, Gen Zhang, Felix Carbonell, José A Correa, William A Harris, Benjamin D Simons, and Michel Cayouette. Reconstruction of rat retinal progenitor cell lineages in vitro reveals a surprising degree of stochasticity in cell fate decisions. Development, 138(2):227-235, January 2011.

[17] Santos J Franco and Ulrich Müller. Shaping our minds: stem and progenitor cell diversity in the mammalian neocortex. Neuron, 77(1):19-34, January 2013.

[18] Bin Chen, Song S Wang, Alexis M Hattox, Helen Rayburn, Sacha B Nelson, and Susan K McConnell. The Fezf2-Ctip2 genetic pathway regulates the fate choice of subcortical projection neurons in the developing cerebral cortex. Proceedings of the National Academy of Sciences, 105(32):11382-11387, August 2008.

[19] Jie-Guang Chen, Mladen-Roko Rasin, Kenneth Y Kwan, and Nenad Sestan. Zfp312 is required for subcortical axonal projections and dendritic morphology of deep-layer pyramidal neurons of the cerebral cortex. PNAS, 102(49):17792-17797, December 2005.

[20] Denis Jabaudon. Fate and freedom in developing neocortical circuits. Nature Communications, 8:16042, July 2017.

[21] David L Turner, Evan Y Snyder, and Constance L Cepko. Lineage-independent determination of cell type in the embryonic mouse retina. Neuron, 4(6):833-845, June 1990.

[22] Fiona K Hamey and Berthold Göttgens. Demystifying blood stem cell fates. Nature Cell Biology, 19(4):261-263, March 2017.

[23] Mihaela Crisan and Elaine Dzierzak. The many faces of hematopoietic stem cell heterogeneity. Development, 143(24):4571-4581, December 2016.

[24] Stuart H Orkin and Leonard I Zon. Hematopoiesis: An Evolving Paradigm for Stem Cell Biology. Cell, 132(4):631-644, February 2008.

[25] Guoji Guo, Sidinh Luc, Eugenio Marco, Ta-Wei Lin, Cong Peng, Marc A Kerenyi, Semir Beyaz, Woojin Kim, Jian Xu, Partha Pratim Das, Tobias Neff, Keyong Zou, Guo-Cheng Yuan, and Stuart H Orkin. Mapping cellular hierarchy by single-cell analysis of the cell surface repertoire. Cell stem cell, 13(4):492-505, October 2013.

[26] Jun Seita and Irving L Weissman. Hematopoietic stem cell: self-renewal versus differentiation. Wiley Interdisciplinary Reviews: Systems Biology and Medicine, 2(6):640-653, April 2010.

[27] Stefan Semrau and Alexander van Oudenaarden. Studying Lineage Decision-Making In Vitro: Emerging Concepts and Novel Tools. Annual review of cell and developmental biology, 31(1):317-345, November 2015.

[28] Tariq Enver, Martin Pera, Carsten Peterson, and Peter W Andrews. Stem cell states, fates, and the rules of attraction. Cell stem cell, 4(5):387-397, May 2009.

[29] Elaine Dzierzak and Nancy A Speck. Of lineage and legacy: the development of mammalian hematopoietic stem cells. Nature immunology, 9(2):129-136, February 2008.

[30] Faiyaz Notta, Sasan Zandi, Naoya Takayama, Stephanie Dobson, Olga I Gan, Gavin Wilson, Kerstin B Kaufmann, Jessica McLeod, Elisa Laurenti, Cyrille F Dunant, John D McPherson, Lincoln D Stein, Yigal Dror, and John E Dick. Distinct routes of lineage development reshape the human blood hierarchy across ontogeny. Science (New York, NY), 351(6269):aab2116, January 2016.

[31] Lars Velten, Simon F Haas, Simon Raffel, Sandra Blaszkiewicz, Saiful Islam, Bianca P Hennig, Christoph Hirche, Christoph Lutz, Eike C Buss, Daniel Nowak, Tobias Boch, Wolf-Karsten Hofmann, Anthony D Ho, Wolfgang Huber, Andreas Trumpp, Marieke A G Essers, and Lars M Steinmetz. Human haematopoietic stem cell lineage commitment is a continuous process. Nature Cell Biology, 19(4):271-281, March 2017.

[32] Ryo Yamamoto, Yohei Morita, Jun Ooehara, Sanae Hamanaka, Masafumi Onodera, Karl Lenhard Rudolph, Hideo Ema, and Hiromitsu Nakauchi. Clonal analysis unveils self-renewing lineage-restricted progenitors generated directly from hematopoietic stem cells. Cell, 154(5):1112-1126, August 2013.

[33] Sonia Nestorowa, Fiona K Hamey, Blanca Pijuan Sala, Evangelia Diamanti, Mairi Shepherd, Elisa Laurenti, Nicola K Wilson, David G Kent, and Berthold Göttgens. A single-cell resolution map of mouse hematopoietic stem and progenitor cell differentiation. Blood, 128(8):e20-31, August 2016.

[34] Andrew R Gehrke and Mansi Srivastava. Neoblasts and the evolution of whole-body regeneration. Current opinion in genetics & development, 40:131-137, October 2016.

[35] Qiao Li, Hao Yang, and Tao P Zhong. Regeneration across Metazoan Phylogeny: Lessons from Model Organisms. Journal of Genetics and Genomics, 42(2):57-70, February 2015.

[36] Elly M Tanaka. The Molecular and Cellular Choreography of Appendage Regeneration. Cell, 165(7):1598-1608, June 2016.

[37] Matthew Gemberling, Travis J Bailey, David R Hyde, and Kenneth D Poss. The zebrafish as a model for complex tissue regeneration. Trends in Genetics, 29(11):611-620, November 2013.

[38] Ashley W Seifert, Stephen G Kiama, Megan G Seifert, Jacob R Goheen, Todd M Palmer, and Malcolm Maden. Skin shedding and tissue regeneration in African spiny mice (Acomys). Nature, 489(7417):561-565, September 2012.

[39] Ryoji Amamoto, Violeta Gisselle Lopez Huerta, Emi Takahashi, Guangping Dai, Aaron K Grant, Zhanyan Fu, and Paola Arlotta. Adult axolotls can regenerate original neuronal diversity in response to brain injury. eLife, 5, January 2016.

[40] Warnakulasuriya Akash Fernando, Eric Leininger, Jennifer Simkin, Ni Li, Carrie A Malcom, Shyam Sathyamoorthi, Manjong Han, and Ken Muneoka. Developmental Biology. Developmental Biology, 350(2):301-310, February 2011.

[41] Elly M Tanaka and Patrizia Ferretti. Considering the evolution of regeneration in the central nervous system. Nature Reviews Neuroscience, 10(10):713-723, October 2009.

[42] Karen Echeverri and Elly M Tanaka. Ectoderm to mesoderm lineage switching during axolotl tail regeneration. Science (New York, NY), 298(5600):1993-1996, December 2002.

[43] Levan Mchedlishvili, Hans H Epperlein, Anja Telzerow, and Elly M Tanaka. A clonal analysis of neural progenitors during axolotl spinal cord regeneration reveals evidence for both spatially restricted and multipotent progenitors. Development, 134(11):2083-2093, June 2007.

[44] Fabian Rost, Aida Rodrigo Albors, Vladimir Mazurov, Lutz Brusch, Andreas Deutsch, Elly M Tanaka, and Osvaldo Chara. Accelerated cell divisions drive the outgrowth of the regenerating spinal cord in axolotls. eLife, 5, November 2016.

[45] Ji-Feng Fei, Maritta Schuez, Akira Tazaki, Yuka Taniguchi, Kathleen Roensch, and Elly M Tanaka. CRISPR-Mediated Genomic Deletion of Sox2 in the Axolotl Shows a Requirement in Spinal Cord Neural Stem Cell Amplification during Tail Regeneration. Stem Cell Reports, 3(3):444-459, September 2014.

[46] Tatiana Sandoval-Guzmán, Heng Wang, Shahryar Khattak, Maritta Schuez, Kathleen Roensch, Eugeniu Nacu, Akira Tazaki, Alberto Joven, Elly M Tanaka, and András Simon. Fundamental differences in dedifferentiation and stem cell recruitment during skeletal muscle regeneration in two salamander species. Cell stem cell, 14(2):174-187, February 2014.

[47] Ashley L Siegel, David B Gurevich, and Peter D Currie. A myogenic precursor cell that could contribute to regeneration in zebrafish and its similarity to the satellite cell. FEBS Journal, 280(17):4074-4088, May 2013.

[48] M Lucila Scimone, Kellie M Kravarik, Sylvain W Lapan, and Peter W Reddien. Neoblast Specialization in Regeneration of the Planarian Schmidtea mediterranea. Stem Cell Reports, July 2014.

[49] Phillip A Newmark and Alejandro Sánchez Alvarado. Not your father's planarian: A classic model enters the era of functional genomics. Nature reviews Genetics, 3(3):210-219, March 2002.

[50] Jaume Baguñà. The planarian neoblast: the rambling history of its origin and some current black boxes. The International journal of developmental biology, 56(1-2-3):19-37, 2012.

[51] Knud Jorgen Pedersen. Cytological studies on the planarian neoblast. Cell and Tissue Research, 50(6):799-817, 1959.

[52] Lorraine S Woodruff and Allison L Burnett. The origin of the blastemal cells in Dugesia tigrina. Experimental cell research, 1965.

[53] Elizabeth D Hay and Stuart J Coward. Fine structure studies on the planarian, Dugesia: I. Nature of the "neoblast" and other cell types in noninjured worms. Journal of ultrastructure research, 1975.

[54] Phillip A Newmark and Alejandro Sánchez Alvarado. Bromodeoxyuridine Specifically Labels the Regenerative Stem Cells of Planarians. Developmental Biology, 220(2):142-153, January 2000.

[55] Alejandro Sánchez Alvarado and Hara Kang. Multicellularity, stem cells, and the neoblasts of the planarian Schmidtea mediterranea. Experimental cell research, 2005.

[56] Tetsutaro Hayashi, Maki Asami, Sayaka Higuchi, Norito Shibata, and Kiyokazu Agata. Isolation of planarian X-ray-sensitive stem cells by fluorescence-activated cell sorting. Development, Growth & Differentiation, 48(6):371-380, August 2006.

[57] F Dubois. Contribution à l'étude de la migration des cellules de régénération chez les Planaires dulcicoles. Bulletin biologique de la France et de la Belgique, pages 213-283, September 1949.

[58] E Wolff and F Dubois. Sur la migration des cellules de régénération chez les planaires. Revue suisse de zoologie, pages 218-227, September 1948.

[59] Peter W Reddien, Nestor J Oviedo, Joya R Jennings, James C Jenkin, and Alejandro Sánchez Alvarado. SMEDWI-2 is a PIWI-like protein that regulates planarian stem cells. Science (New York, NY), 310(5752):1327, 2005.

[60] Tingxia Guo, Antoine H F M Peters, and Phillip A Newmark. A Bruno-like gene is required for stem cell maintenance in planarians. Developmental cell, 11(2):159-169, August 2006.

[61] Norito Shibata, Yoshihiko Umesono, Hidefumi Orii, Takashige Sakurai, Kenji Watanabe, and Kiyokazu Agata. Expression of vasa(vas)-Related Genes in Germline Cells and Totipotent Somatic Stem Cells of Planarians. Developmental Biology, 206(1):73-87, February 1999.

[62] Celina E Juliano, S Zachary Swartz, and Gary M Wessel. A conserved germline multipotency program. Development, 137(24):4113-4126, December 2010.

[63] Yanqing Yuwen, Zimei Dong, Xiaohui Si, and Guangwen Chen. A pumilio homolog in Polycelis sp. Development genes and evolution, 224(1):53-56, February 2014.

[64] Daniel E Wagner, Irving E Wang, and Peter W Reddien. Clonogenic Neoblasts Are Pluripotent Adult Stem Cells That Underlie Planarian Regeneration. Science (New York, NY), 332(6031):811-816, May 2011.

[65] M Lucila Scimone, Mansi Srivastava, George W Bell, and Peter W Reddien. A regulatory program for excretory system regeneration in planarians. Development, 138(20):4387-4398, October 2011.

[66] Martis W Cowles, David D R Brown, Sean V Nisperos, Brianna N Stanley, Bret J Pearson, and Ricardo M Zayas. Genome-wide analysis of the bHLH gene family in planarians identifies factors required for adult neurogenesis and neuronal regeneration. Development, October 2013.

[67] Martis W Cowles, Kerilyn C Omuro, Brianna N Stanley, Carlo G Quintanilla, and Ricardo M Zayas. COE loss-of-function analysis reveals a genetic program underlying maintenance and regeneration of the nervous system in planarians. PLoS genetics, 10(10):e1004746, October 2014.

[68] Ko W Currie and Bret J Pearson. Transcription factors lhx1/5-1 and pitx are required for the maintenance and regeneration of serotonergic neurons in planarians. Development, 140(17):3577-3588, September 2013.

180 [691 Martin Mdrz, Florian Seebeck, and Kerstin Bartscherer. A Pitx transcription factor controls the establishment and maintenance of the serotonergic lineage in planarians. Development, October 2013.

[70] Sylvain W Lapan and Peter W Reddien. dlx and sp6-9 Control Optic Cup Regeneration in a Prototypic Eye. PLoS genetics, 7(8):e1002226, 2011.

[71] Sylvain W Lapan and Peter W Reddien. Transcriptome Analysis of the Pla- narian Eye Identifies ovo as a Specific Regulator of Eye Regeneration. Cell reports, August 2012.

[721 Danielle Wenemoser and Peter W Reddien. Planarian regeneration involves dis- tinct stem cell responses to wounds and tissue absence. Developmental Biology, 344(2):979-991, August 2010.

[73] Peter W Reddien. Specialized progenitors and regeneration. Development, 140(5):951-957, February 2013.

[74] George T Eisenhoffer, Hara Kang, and Alejandro Sánchez Alvarado. Molecular analysis of stem cells and their descendants during cell turnover and regeneration in the planarian Schmidtea mediterranea. Cell stem cell, 3(3):327-339, September 2008.

[75] M Lucila Scimone, Sylvain W Lapan, and Peter W Reddien. A forkhead Transcription Factor Is Wound-Induced at the Planarian Midline and Required for Anterior Pole Regeneration. PLoS genetics, 10(1):e1003999, January 2014.

[76] Danielle Wenemoser, Sylvain W Lapan, Alex W Wilkinson, George W Bell, and Peter W Reddien. A molecular wound response program associated with regeneration initiation in planarians. Genes and Development, 26(9):988-1002, 2012.

[77] Smadar Ben-Tabou de Leon. The conserved role and divergent regulation of foxa, a pan-eumetazoan developmental regulatory gene. Developmental Biology, 357(1):21-26, September 2011.

[78] Satoshi Koinuma, Yoshihiko Umesono, Kenji Watanabe, and Kiyokazu Agata. Planaria FoxA (HNF3) homologue is specifically expressed in the pharynx-forming cells. Gene, 259(1-2):171-176, December 2000.

[79] K Zaret. Developmental competence of the gut endoderm: genetic potentiation by GATA and HNF3/fork head proteins. Developmental Biology, 209(1):1-10, May 1999.

[80] H Weintraub, R Davis, S Tapscott, M Thayer, M Krause, R Benezra, T K Blackwell, D Turner, R Rupp, and S Hollenberg. The myoD gene family: nodal point during specification of the muscle cell lineage. Science (New York, NY), 251(4995):761-766, February 1991.

[81] Jessica N Witchley, Mirjam Mayer, Daniel E Wagner, Jared H Owen, and Peter W Reddien. Muscle cells provide instructions for planarian regeneration. Cell reports, 4(4):633-641, August 2013.

[82] Vahab D Soleimani, Hang Yin, Arezu Jahani-Asl, Hong Ming, Christel E M Kockx, Wilfred F J van Ijcken, Frank Grosveld, and Michael A Rudnicki. Snail regulates MyoD binding-site occupancy to direct enhancer switching and differentiation-specific transcription in myogenesis. Molecular cell, 47(3):457-468, August 2012.

[83] Peter W Reddien. Constitutive gene expression and the specification of tissue identity in adult planarian biology. Trends in Genetics, 27(7):277-285, July 2011.

[84] Ewan J D Robson, Shu-Jie He, and Michael R Eccles. A PANorama of PAX genes in cancer and development. Nature reviews Cancer, 6(1):52-62, January 2006.

[85] David Pineda and Emili Salo. Planarian Gtsix3, a member of the Six/so gene family, is expressed in brain branches but not in eye cells. Gene expression patterns : GEP, 2(1-2):169-173, November 2002.

[86] Lars Kammermeier and Heinrich Reichert. Common developmental genetic mechanisms for patterning invertebrate and vertebrate brains. Brain research bulletin, 55(6):675-682, August 2001.

[87] Yoshihiko Umesono, Kenji Watanabe, and Kiyokazu Agata. Distinct structural domains in the planarian brain defined by the expression of evolutionarily conserved homeobox genes. Development genes and evolution, 209(1):31-39, January 1999.

[88] J Briscoe, L Sussel, P Serup, D Hartigan-O'Connor, T M Jessell, J L Rubenstein, and J Ericson. Homeobox gene Nkx2.2 and specification of neuronal identity by graded Sonic hedgehog signalling. Nature, 398(6728):622-627, April 1999.

[89] Maike Sander, Sussan Paydar, Johan Ericson, James Briscoe, Elizabeth Berber, Michael German, Thomas M Jessell, and John L Rubenstein. Ventral neural patterning by Nkx homeobox genes: Nkx6.1 controls somatic motor neuron and ventral interneuron fates. Genes and Development, 14(17):2134-2139, September 2000.

[90] Dervla M Mellerick, Judith A Kassis, Shan-Ding Zhang, and Ward F Odenwald. castor encodes a novel zinc finger protein required for the development of a subset of CNS neurons in Drosophila. Neuron, 9(5):789-803, November 1992.

[91] Carolyn E Adler, Chris W Seidel, Sean A McKinney, and Alejandro Sánchez Alvarado. Selective amputation of the pharynx identifies a FoxA-dependent regeneration program in planaria. eLife, 3:e02238, 2014.

[92] Kaneyasu Nishimura, Yoshihisa Kitamura, Takeshi Inoue, Yoshihiko Umesono, Kanji Yoshimoto, Kosei Takeuchi, Takashi Taniguchi, and Kiyokazu Agata. Identification and distribution of tryptophan hydroxylase (TPH)-positive neurons in the planarian Dugesia japonica. Neuroscience Research, 59(1):101-106, September 2007.

[93] Susan E Mango. The Molecular Basis of Organ Formation: Insights From the C. elegans Foregut. Annual review of cell and developmental biology, 25(1):597-628, November 2009.

[94] Peter W Reddien and Alejandro Sánchez Alvarado. Fundamentals of planarian regeneration. Annual review of cell and developmental biology, 20:725-757, 2004.

[95] Mansi Srivastava, Kathleen L Mazza-Curll, Josien C van Wolfswinkel, and Peter W Reddien. Whole-Body Acoel Regeneration Is Controlled by Wnt and Bmp-Admp Signaling. Current Biology, 24(10):1107-1113, April 2014.

[96] Bret J Pearson, George T Eisenhoffer, Kyle A Gurley, Jochen C Rink, Diane E Miller, and Alejandro Sánchez Alvarado. Formaldehyde-based whole-mount in situ hybridization method for planarians. Developmental dynamics: an official publication of the American Association of Anatomists, 238(2):443-450, February 2009.

[97] Ryan S King and Phillip A Newmark. In situ hybridization protocol for enhanced detection of gene expression in the planarian Schmidtea mediterranea. BMC developmental biology, 13(1):8, March 2013.

[98] M Lucila Scimone, Joshua Meisel, and Peter W Reddien. The Mi-2-like Smed-CHD4 gene is required for stem cell differentiation in the planarian Schmidtea mediterranea. Development, 137(8):1231-1241, April 2010.

[99] Josien C van Wolfswinkel, Daniel E Wagner, and Peter W Reddien. Single-cell analysis reveals functionally distinct classes within the planarian stem cell compartment. Cell stem cell, 15(3):326-339, September 2014.

[100] Ko W Currie, Alyssa M Molinaro, and Bret J Pearson. Neuronal sources of hedgehog modulate neurogenesis in the adult planarian brain. eLife, 5, November 2016.

[101] Kelly G Ross, Ko W Currie, Bret J Pearson, and Ricardo M Zayas. Nervous system development and regeneration in freshwater planarians. Wiley Interdisciplinary Reviews: Developmental Biology, 6(3), May 2017.

[102] Rachel H Roberts-Galbraith, John L Brubacher, and Phillip A Newmark. A functional genomics screen in planarians reveals regulators of whole-brain regeneration. eLife, 5, September 2016.

[103] Alyssa M Molinaro and Bret J Pearson. In silico lineage tracing through single cell transcriptomics identifies a neural stem cell population in planarians. Genome biology, 17:87, 2016.

[104] S Y Liu, C Selck, B Friedrich, R Lutz, M Vila-Farré, A Dahl, H Brandl, N Lakshmanaperumal, I Henry, and J C Rink. Reactivating head regrowth in a regeneration-deficient planarian species. Nature, 500(7460):81-84, August 2013.

[105] Evan Z Macosko, Anindita Basu, Rahul Satija, James Nemesh, Karthik Shekhar, Melissa Goldman, Itay Tirosh, Allison R Bialas, Nolan Kamitaki, Emily M Martersteck, John J Trombetta, David A Weitz, Joshua R Sanes, Alex K Shalek, Aviv Regev, and Steven A McCarroll. Highly Parallel Genome-wide Expression Profiling of Individual Cells Using Nanoliter Droplets. Cell, 161(5):1202-1214, May 2015.

[106] L Van der Maaten. Accelerating t-SNE using Tree-Based Algorithms. Journal of Machine Learning Research, 2014.

[107] Omri Wurtzel, Lauren E Cote, Amber Poirier, Rahul Satija, Aviv Regev, and Peter W Reddien. A Generic and Cell-Type-Specific Wound Response Precedes Regeneration in Planarians. Developmental cell, 35(5):632-645, December 2015.

[108] Irving E Wang, Sylvain W Lapan, M Lucila Scimone, Thomas R Clandinin, and Peter W Reddien. Hedgehog signaling regulates gene expression in planarian glia. eLife, 5, September 2016.

[109] Kaneyasu Nishimura, Yoshihisa Kitamura, Takeshi Inoue, Yoshihiko Umesono, Shozo Sano, Kanji Yoshimoto, Masatoshi Inden, Kazuyuki Takata, Takashi Taniguchi, Shun Shimohama, and Kiyokazu Agata. Reconstruction of dopaminergic neural network and locomotion function in planarian regenerates. Developmental Neurobiology, 67(8):1059-1078, July 2007.

[110] Natsuka Tashiro, Kaneyasu Nishimura, Kanako Daido, Tomoe Oka, Mio Todo, Asami Toshikawa, Jun Tsushima, Kazuyuki Takata, Eishi Ashihara, Kanji Yoshimoto, Kiyokazu Agata, and Yoshihisa Kitamura. Pharmacological assessment of methamphetamine-induced behavioral hyperactivity mediated by dopaminergic transmission in planarian Dugesia japonica. Biochemical and biophysical research communications, 449(4):412-418, July 2014.

[111] K Nishimura, Y Kitamura, Y Umesono, K Takeuchi, K Takata, T Taniguchi, and K Agata. Identification of glutamic acid decarboxylase gene and distribution of GABAergic nervous system in the planarian Dugesia japonica. Neuroscience, 153(4):1103-1114, June 2008.

[112] K Nishimura, Y Kitamura, T Taniguchi, and K Agata. Analysis of motor function modulated by cholinergic neurons in planarian Dugesia japonica. Neuroscience, 168(1):18-30, June 2010.

[113] Kaneyasu Nishimura, Yoshihisa Kitamura, Takeshi Inoue, Yoshihiko Umesono, Kanji Yoshimoto, Takashi Taniguchi, and Kiyokazu Agata. Characterization of tyramine beta-hydroxylase in planarian Dugesia japonica: cloning and expression. Neurochemistry International, 53(6-8):184-192, December 2008.

[114] Ko W Currie, David D R Brown, Shujun Zhu, ChangJiang Xu, Veronique Voisin, Gary D Bader, and Bret J Pearson. HOX gene complement and expression in the planarian Schmidtea mediterranea. EvoDevo, 7:7, 2016.

[115] Marta Iglesias, Jose Luis Gomez-Skarmeta, Emili Saló, and Teresa Adell. Silencing of Smed-betacatenin1 generates radial-like hypercephalized planarians. Development, 135(7):1215-1221, April 2008.

[116] Taisaku Nogi and Kenji Watanabe. Position-specific and non-colinear expression of the planarian posterior (Abdominal-B-like) gene. Development, Growth & Differentiation, 43(2):177-184, April 2001.

[117] David Pineda, Leonardo Rossi, Renata Batistoni, Alessandra Salvetti, Maria Marsal, Vittorio Gremigni, Alessandra Falleni, Javier Gonzalez-Linares, Paolo Deri, and Emili Salo. The genetic network of prototypic planarian eye regeneration is Pax6 independent. Development, 129(6):1423-1434, March 2002.

[118] Julien Deheuninck and Kunxin Luo. Ski and SnoN, potent negative regulators of TGF-β signaling. Cell Research, 19(1):47-57, January 2009.

[119] Shannon L Stroschein, Wei Wang, Dan Chen, Eric Martens, Sharleen Zhou, Qiang Zhou, and Kunxin Luo. The Ski oncoprotein interacts with the Smad proteins to repress TGFβ signaling. Genes and Development, 13(17):2196, September 1999.

[120] M Berk, S Y Desai, H C Heyman, and C Colmenares. Mice lacking the ski proto-oncogene have defects in neurulation, craniofacial patterning, and skeletal muscle development. Genes and Development, 11(16):2029-2039, August 1997.

[121] D James Surmeier, Jose A Obeso, and Glenda M Halliday. Selective neuronal vulnerability in Parkinson disease. Nature Reviews Neuroscience, 18(2):101-113, February 2017.

[122] Nuria Flames and Oliver Hobert. Gene regulatory logic of dopamine neuron differentiation. Nature, 458(7240):885-889, March 2009.

[123] S Wang and E E Turner. Expression of Dopamine Pathway Genes in the Midbrain Is Independent of Known ETS Transcription Factor Activity. Journal of Neuroscience, 30(27):9224-9227, July 2010.

[124] Michael J Williams, Anica Klockars, Anders Eriksson, Sarah Voisin, Rohit Dnyansagar, Lyle Wiemerslage, Anna Kasagiannis, Mehwish Akram, Sania Kheder, Valerie Ambrosi, Emilie Hallqvist, Robert Fredriksson, and Helgi B Schiöth. The Drosophila ETV5 Homologue Ets96B: Molecular Link between Obesity and Bipolar Disorder. PLoS genetics, 12(6):e1006104, June 2016.

[125] Jie Zheng. Molecular Mechanism of TRP Channels. Comprehensive Physiology, 3(1):221, January 2013.

[126] Bradford M Stubenhaus, John P Dustin, Emily R Neverett, Megan S Beaudry, Leanna E Nadeau, Ethan Burk-McCoy, Xinwen He, Bret J Pearson, and Jason Pellettieri. Light-induced depigmentation in planarians models the pathophysiology of acute porphyrias. eLife, 5, 2016.

[127] Kai Lei, Hanh Thi-Kim Vu, Ryan D Mohan, Sean A McKinney, Chris W Seidel, Richard Alexander, Kirsten Gotting, Jerry L Workman, and Alejandro Sánchez Alvarado. Egf Signaling Directs Neoblast Repopulation by Regulating Asymmetric Cell Division in Planarians. Developmental cell, August 2016.

[128] M Lucila Scimone, Lauren E Cote, Travis Rogers, and Peter W Reddien. Two FGFRL-Wnt circuits organize the planarian anteroposterior axis. eLife, 5, 2016.

[129] Samuel A LoCascio, Sylvain W Lapan, and Peter W Reddien. Eye Absence Does Not Regulate Planarian Stem Cells during Eye Regeneration. Developmental cell, 40(4):381-391.e3, February 2017.

[130] Alejandro González-Sastre, Nidia De Sousa, Teresa Adell, and Emili Saló. The Smed-gata456-1 is required for gut cell differentiation and maintenance in planarians. The International journal of developmental biology, 61(1-2):53-63, 2017.

[131] Omri Wurtzel, Isaac M Oderberg, and Peter W Reddien. Planarian Epidermal Stem Cells Respond to Positional Cues to Promote Cell-Type Diversity. Developmental cell, 40(5):491-504.e5, March 2017.

[132] E M Hill and C P Petersen. Wnt/Notum spatial feedback inhibition controls neoblast differentiation to regulate reversible growth of the planarian brain. Development, 142(24):4217-4229, December 2015.
