Investigating the Roles of Neurogenin 3 in Human Pancreas and Intestine

University of Cincinnati

March 2, 2016

Investigating the roles of Neurogenin 3 in pancreas and

intestine development and disease.

A dissertation submitted to the

Graduate School of the University of Cincinnati

in partial fulfillment of the requirements for the degree of

Doctor of Philosophy

in the Graduate Program in Molecular and Developmental Biology

of the College of Medicine

by

Patrick Sean McGrath

Bachelor of Science, Colorado State University, May 2009

Committee Chair: James M. Wells, Ph.D.

Brian Gebelein, Ph.D.

Michael A. Helmrath, MD, MS

Stacey S. Huppert, Ph.D.

Aaron M. Zorn, Ph.D.

Summary

The incidence of diabetes mellitus, a disease defined by the inability to properly

regulate glucose, is increasing at an alarming rate and affects an estimated 26 million

people in the U.S., creating an economic cost of $245 billion per year (American

Diabetes Association, 2013). Glucose homeostasis is regulated through a complex interplay of

hormones produced by both the pancreatic and intestinal endocrine systems.

Neurogenin 3 (NEUROG3) is a basic helix-loop-helix (bHLH) transcription factor that acts as a

“master regulator” of endocrine cell specification. Mouse studies have shown Neurog3

to be necessary and sufficient for the formation of all endocrine lineages in both the pancreas and intestine. Human patients harboring NEUROG3 mutations have a complete loss of intestinal endocrine cells, but surprisingly retain at least some

endocrine pancreas function, calling into question whether NEUROG3 is required for

human endocrine pancreas development. To test this directly, we generated human

embryonic stem cell (hESC) lines where both alleles of NEUROG3 were disrupted using

CRISPR/Cas9-mediated targeting. Upon directed differentiation, NEUROG3-/- hESC lines efficiently formed pancreatic progenitors but lacked detectable NEUROG3 and did not form any endocrine cells in vitro. Moreover, NEUROG3-/- hESC lines were

unable to form mature pancreatic endocrine cells following engraftment of

PDX1+/NKX6.1+ pancreatic progenitors into mice, confirming that NEUROG3 is required

for human pancreatic endocrine development. Similarly, we found that NEUROG3 is

essential for endocrine specification in PSC-derived human intestinal organoids (HIOs).

The NEUROG3-/- hPSC line was then used as a null background in which we could

ectopically express NEUROG3 using an rtTA inducible lentiviral vector. Furthermore,

NEUROG3 was also mutated to include the patient mutations R107S, L135P, or E28X.

NEUROG3R107S, but not NEUROG3L135P, was functional and induced endocrine cell

formation in pancreatic precursors but at significantly reduced levels compared to wild

type NEUROG3. None of the NEUROG3 mutants were able to rescue endocrine

formation in HIOs, perfectly mimicking the human phenotype. Furthermore, we showed

NEUROG3R107S has a significantly shorter half-life, which may contribute to its reduced function. Lastly, we utilized a NEUROG3ERT2 construct to identify direct and

indirect NEUROG3 targets in differentiated human pancreas. In summary, these studies help

define the requirement for NEUROG3 function in human pancreatic and intestinal

endocrine development. More broadly, the methods utilized here are a robust approach

by which we can interrogate human development or disease entirely in vitro.

Table of Contents

Chapter 1. Introduction

Introduction
In vivo development of pancreas and β cells
Directed differentiation of human pluripotent stem cells into pancreas
The function of Neurogenin 3 in endocrine specification
Mouse versus human requirement for Neurog3
Gene editing and disease modeling in PSC derived tissues
References
Figure Legends
Figures

Chapter 2. The basic helix-loop-helix NEUROG3 is required for development of the human endocrine pancreas

Summary
Introduction
Methods
Results
Discussion
References
Figure Legends
Tables
Figures

Chapter 3. Functional characterization and disease modeling of NEUROGENIN 3 in human pluripotent stem cell-derived pancreas and intestinal organoids

Summary
Introduction
Materials and Methods
Results
Discussion
References
Figure Legends
Tables
Figures

Chapter 4. Discussion

Major Findings
Using hPSCs to study pancreatic endocrine development and the importance of studying NEUROG3 mutations in the correct context
Modeling NEUROG3 mutations and some experimental limitations
Differential requirement for NEUROG3 in pancreatic versus intestinal endocrine development and mechanisms leading to tissue specific target genes and endocrine lineages
Modeling human development and disease
References
Figure Legends
Figures

CHAPTER 1

Introduction.

Patrick S. McGrath1 and James M. Wells*1,2.

1Division of Developmental Biology, 2Division of Endocrinology, Cincinnati Children’s

Hospital Medical Center, 3333 Burnet Avenue, Cincinnati, OH 45229-3039

***Portions of this chapter have been reproduced from (McGrath et al., 2015).

In vivo development of pancreas and β cells

Embryonic development of pancreatic endocrine cells can be broadly subdivided into several steps including endoderm formation, organ specification, endocrine specification and maturation (Gittes, 2009; Murtaugh, 2007; Sinner et al., 2006; Zorn and Wells, 2009). The definitive endoderm is derived from a bipotential mesendoderm progenitor marked by Brachyury and Mixl1 (Pearce and Evans, 1999; Wilkinson et al.,

1990). Specification into definitive endoderm is then marked by a number of transcription factors including Sox17, Foxa2, and Goosecoid (Blum et al., 1992; Hudson et al., 1997; Sasaki and Hogan, 1993). A series of signaling events then subdivides the definitive endoderm first into foregut, then into a subdomain that forms the pancreas.

These include the BMP, FGF, retinoic acid (RA), and Sonic hedgehog signaling

pathways that act either positively or negatively to pattern the foregut, regulate initiation

of pancreatic development (Hebrok et al., 1998; Jung et al., 1999; Molotkov et al.,

2005; Rossi et al., 2001; Wang et al., 2006b), and promote proliferation of pancreatic progenitor cells (Bhushan et al., 2001).

The mammalian pancreas first appears at approximately e8.5 in mouse as an

evagination of the foregut endoderm and is marked by PDX1, the first pancreas specific

transcription factor. From e8.5 to e12.5 (termed the first transition), two stalks of

undifferentiated pancreatic epithelial cells originating from the ventral and dorsal gut

tube extend and then fuse to become a single pancreatic anlage. Each stalk contains multipotent progenitor cells destined to become pancreas. Removal of cells from this progenitor pool permanently affects final adult pancreas size (Stanger et al., 2007).

Neurog3 expression can first be detected around e9.0 in mice, and a limited number of

these first-transition progenitor cells differentiate into hormone-producing endocrine

cells, although only a minute fraction of the adult endocrine pancreas derives from these

cells (Wang et al., 2009). The majority of endocrine cells form during the second

transition (e12.5-e16.5). At this point, the growing pancreatic stalk undergoes branching

morphogenesis and extensively differentiates into either endocrine or exocrine tissue. Finally, during a third transition (e16.5 to neonatal), the differentiated endocrine cells migrate through the pancreas and coalesce into the stereotypic islets of

Langerhans. By the end of gestation, the formation of endocrine cells from undifferentiated pancreatic precursors is essentially complete. During the early postnatal period, there is continued restructuring and a significant expansion of endocrine cells entirely through proliferation (Teta et al., 2007).

Directed differentiation of human pluripotent stem cells into pancreas

Most successful efforts to direct the differentiation of human pluripotent stem cells (both embryonic and induced pluripotent) into pancreatic cells do so by temporally manipulating specific signaling pathways to mimic these developmental processes

(Spence and Wells, 2007; Spence et al., 2011) (Summarized in Fig. 1). For example, the Nodal signaling pathway is required for endoderm formation across vertebrate species (Gamer and Wright, 1995; Green and Smith, 1990; Henry et al., 1996;

Thomsen et al., 1990), and the Nodal mimetic Activin A induces differentiation of hPSCs into definitive endoderm (D’Amour et al., 2005). In vivo, endoderm is then patterned into posterior foregut by several signaling molecules originating from the neighboring mesoderm, including fibroblast growth factor (FGF), retinoic acid (RA), and hedgehog.

Following these developmental cues, differentiated definitive endoderm can be treated with exogenous FGF and retinoic acid to promote commitment to pancreatic endoderm

marked by PDX1 and NKX6.1. The resulting pancreatic endoderm most closely

resembles the pool of undifferentiated pancreatic precursors formed during the primary

transition (Hrvatin et al., 2014).

The final step is the specification of endocrine precursor cells from pancreatic

endoderm. Typically, there are no exogenous signaling molecules to facilitate this step.

This is due in part to a lack of knowledge about exactly what initiates endocrine differentiation. It is known that Neurog3 is the most upstream transcription factor that

orchestrates endocrine development, but what initiates Neurog3 expression remains

poorly understood. One study mapped the Neurog3 promoter and found that Hnf1 and

Foxa2, together with Notch signaling, may regulate early Neurog3 expression (Lee et al.,

2001a). Differentiated pancreatic precursors spontaneously express Neurog3, from

which endocrine precursors form. The resulting endocrine cells express various

pancreatic hormones including insulin, glucagon, and somatostatin. It is important to

note that the endocrine cells that form in vitro are often polyhormonal and are not

functional, nor do they give rise to functional endocrine cells when maintained in vitro. It

has been shown that PDX1/NKX6.1-positive pancreatic progenitors have the potential

to give rise to more mature and functional beta cells, but only after engraftment and

growth in mice.

The function of Neurogenin 3 in endocrine specification

In mice, pancreatic progenitor cells give rise to functional endocrine cells via an

endocrine progenitor intermediate that expresses the bHLH transcription factor

Neurogenin3 (Neurog3) (Apelqvist et al., 1999; Gu et al., 2002; Miettinen et al., 2000;

Schwitzgebel et al., 2000). Neurog3 is required for development of all pancreatic endocrine cell types in mice (Gradwohl et al., 2000; Lee et al., 2001b; Xu et al., 2008), and does this through direct and indirect regulation of downstream targets including the transcription factors NeuroD1 (Huang et al., 2000), Rfx6 (Soyer et al., 2010), Pax4

(Sosa-Pineda et al., 1997), Nkx6.1 (Henseleit et al., 2005; Sander et al., 2000), Arx

(Collombat et al., 2003) and others. Neurog3+ cells are first observed during the primary transition in mouse between e9 and e12.5. While these primary transition endocrine cells contribute to adult islets (Gu et al., 2002), the majority of endocrine cell mass forms during a second wave of endocrine cell development between e12.5 and e16.5.

A cell within the pool of multipotent pancreatic progenitors becomes endocrine-committed immediately following the expression of Neurog3. Notch-mediated lateral inhibition keeps the majority of the pool as undifferentiated progenitors (Apelqvist et al., 1999; Murtaugh et al.,

2003). Neurog3 directs the formation of different endocrine cell types at different stages, first promoting a glucagon cell fate, then insulin, PP and somatostatin fates in turn

(Johansson et al., 2007). Neurog3-expressing cells appear to be unipotent progenitors, each giving rise to only one endocrine cell type (Desgraz and Herrera, 2009).

Neurog3 is also required for development of intestinal (enteric) enteroendocrine cells (EECs) in mice (Jenny et al., 2002; Lee et al., 2002; Lopez-Diaz et al., 2007; Ootani et al., 2009). This also appears to be the case in humans. All patients with identified

mutations in NEUROG3 are born with intractable malabsorptive diarrhea due to loss of

EECs, a condition known as enteric anendocrinosis (Pinney et al., 2011; Rubio-Cabezas et al.,

2011; Wang et al., 2006a). These mutations occur in the bHLH domain of NEUROG3

and were predicted to render the protein transcriptionally inactive, yet all patients were

born with circulating C-peptide suggesting beta cell function at birth. Most patients

develop diabetes, however, indicating a role for NEUROG3 in postnatal beta cell

development, function, and/or maintenance. Therefore the role of NEUROG3 in

embryonic development of the human endocrine pancreas has been unclear. If

development of the endocrine pancreas in humans does not require NEUROG3, it is

somewhat troubling given the emphasis placed on NEUROG3 as a therapeutic linchpin to generate pancreatic endocrine cells from ES cells or via neogenesis from adult cell types.

Mouse versus human requirement for Neurog3

Neurog3-/- mice fail to develop any endocrine cells of the pancreas and

intestine (Gradwohl et al., 2000; Jenny et al., 2002). The mouse has proven suitable as a

largely accurate model for human development and it was therefore assumed that

NEUROG3 was similarly required for human endocrine cell specification. The first

described human mutations in NEUROG3 were homozygous loss-of-function and were

identified in patients with a rare form of congenital malabsorptive diarrhea resulting from

a complete loss of enteroendocrine cells (Wang et al., 2007). Interestingly, the patients

did not present with neonatal diabetes, which suggests pancreatic islet development still

occurred. Since this initial report, a total of 9 patients have been characterized with 6

unique mutations in NEUROG3 and all have the same phenotype of a complete loss of

enteroendocrine cells but not a loss of pancreatic endocrine cells (Fig. 1) (Jensen et al.,

2007; Pinney et al., 2011). One patient was born with neonatal diabetes, but blood samples showed the presence of circulating C-peptide, suggesting the formation of functional β-cells (Rubio-Cabezas et al., 2011). There were two likely explanations for the difference between mouse and human phenotypes: either the NEUROG3 mutations were

hypomorphic and retained some function or an unidentified factor might compensate for

NEUROG3 in human pancreas development. We confirm using human PSCs that

NEUROG3-/- pancreas is incapable of differentiating endocrine cells which suggests that

all the known human mutations are likely hypomorphic and some function must be

retained.

Knowing that NEUROG3 is required for both pancreatic and enteroendocrine

development, it is important to note the difference in pancreas versus intestinal

sensitivity to the mutations. It is somewhat surprising that although NEUROG3 is

required for both endocrine lineages, the intestinal endocrine cells appear to be much

more severely affected in all of the described mutations. Intestinal biopsies were

performed on some of the patients which showed 99%+ loss of enteroendocrine cells

(Pinney et al., 2011). No pancreas samples were gathered so it is unknown how

different, if at all, proband pancreata are from wild type. There are a couple of explanations that could account for this difference. First, there may be a dosage difference in the amount of NEUROG3 expressed in the pancreas compared to the intestine. For example, the pancreas may express sufficiently high levels of NEUROG3

to overcome the loss of NEUROG3 activity resulting from the hypomorphic

mutations, whereas if the intestine expresses NEUROG3 at a lower level it would be more

sensitive to a reduction in functional activity. No studies have directly analyzed this

possibility. Anecdotally, a human must have mutations in both alleles to observe a

phenotype as none of the heterozygous parents show any sign of diabetes or

malabsorptive diarrhea. This provides some evidence that intestinal endocrine

development can tolerate decreased NEUROG3 levels. It is striking to note that all the

human mutations fall in the well conserved helix-loop-helix region of NEUROG3 or

result in a truncation and complete loss of the HLH domain. This suggests the possibility

that the pancreas versus intestine phenotype could be due to tissue-specific dimerization

partners that mediate DNA binding. There are other possibilities as well, including epigenetic differences, differential affinities for binding sites in the pancreas compared to intestine, or tissue-specific co-activators and/or co-repressors.

Gene editing and disease modeling in PSC derived tissues

Mice and other organisms have long served as powerful models; however, they are not without limitations. Approximately 1% of human genes have no identifiable mouse homolog. Additionally, while many genes play conserved roles across species, this is not always the case. GATA6 is a notable example in the context of pancreas development. Patients with a heterozygous inactivating mutation in GATA6 present with pancreatic agenesis (Lango Allen et al., 2012). Gata6 heterozygous mice show no obvious phenotype, and homozygous null mice die during gastrulation, making it

impossible to study pancreas development (Lango Allen et al., 2012). This highlights the

potential importance of using human models to study development and disease.

There are two main varieties of human pluripotent stem cells: human embryonic stem cells (hESCs) and induced pluripotent stem cells (iPSCs) (Takahashi et al., 2007;

Thomson et al., 1998). hESCs are derived from embryos, while iPSCs are made by introducing a combination of reprogramming factors into a somatic cell type, commonly fibroblasts, as they can be obtained from a skin biopsy with very little invasiveness. Both types can self-renew in culture indefinitely while retaining the capacity to theoretically become any cell type in the body. In practice, there are only relatively few cell types for which efficient differentiation protocols exist, but the list is ever expanding.

One advantage to using hiPSCs is that any stem cells derived from a patient will harbor that patient’s exact genotype, including any disease alleles. Once the stem cell line is derived, the disease can be studied without further patient contact. This approach is best suited to monogenic diseases, which are more likely to retain their phenotype upon differentiation into a relevant tissue. Differences in genetic background affect phenotype, making experimental interpretation difficult, so it is important to think about controls. Even a healthy sibling control shares, on average, only about half of the patient’s inherited genetic variants. Furthermore, studies have shown that reprogramming protocol, somatic cell source/epigenetic state, number of passages, and even donor age can affect stem cell differentiation capacity and the potential ability to study a resulting phenotype (Kim et al., 2011; Mayshar et al., 2010;

Reinhardt et al., 2013).

The best possible control is an isogenic cell line, with the only difference being the possible disease causing mutation. This situation largely eliminates the above

confounding factors. Historically, gene targeting by homologous recombination was the

only means by which DNA could be mutated in a directed manner, whether it be knock-in or knock-out. The vast majority of genetically altered mice have been mutated in this way using embryonic stem cell lines. The same technique has proven to be more difficult in human stem cell lines (Zwaka and Thomson, 2003). The recent emergence of genome editing technologies has made this possible with relative ease.

Zinc finger nucleases (ZFNs), TALENs, and CRISPR/Cas9 all work on the same basic principle. Each construct is designed to target a specific gene sequence, which is then cut with a nuclease. CRISPRs, which we employed in our studies, are derived from bacteria where they function to eliminate foreign DNA. The Cas9 nuclease is guided to the target site by a short RNA complementary to a 20 bp DNA sequence adjacent to an NGG nucleotide motif (the protospacer adjacent motif, PAM), where it then cuts the DNA (Mali et al., 2013). The resulting double strand break can then be repaired by one of two pathways: homology directed repair (HDR) or non-homologous end joining (NHEJ) (Fig. 1). NHEJ simply fuses the broken DNA back together without a repair template. It is a robust process, but it is error prone and frequently introduces insertion/deletion mutations

(INDELs). These INDELs can be strategically placed in coding sequences, causing frameshift mutations that abolish gene function. HDR uses a homologous piece of DNA as a template to fill in and repair the double strand break and is best when trying to introduce a specific sequence. This sequence may be a single base change or a protein coding gene. Conceptually, mutating by HDR is very similar to homologous recombination, just catalyzed by first cleaving the DNA at the desired insertion point.
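The targeting rule and the frameshift logic described above can be illustrated with a minimal Python sketch. It is not the design procedure used in these studies: the function names and the example sequence are hypothetical, only the given strand is scanned, and real guide design would also weigh off-target matches and cutting efficiency.

```python
import re

def find_cas9_targets(seq, protospacer_len=20):
    """Return (start, protospacer, pam) for each site on the given strand
    where a protospacer is immediately followed by an NGG motif.
    Hypothetical helper for illustration only."""
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]

def causes_frameshift(indel_length):
    """An insertion (+n) or deletion (-n) within a coding sequence shifts the
    reading frame unless its length is a multiple of three."""
    return indel_length % 3 != 0

# Made-up sequence, not taken from NEUROG3 or any real gene.
example = "ATGCGTACCTGAACTTCGATAGGTTCA"
targets = find_cas9_targets(example)
```

For the made-up sequence, the only candidate is the 20 bp stretch beginning at position 0, which is followed by AGG. A 1 bp deletion at the cut site would shift the reading frame, whereas a 3 bp deletion would remove one codon and leave the frame intact, which is why not every NHEJ event yields a null allele.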

There are now many examples of using gene editing to create or correct a mutation which is then profiled in a differentiated cell type. One of the first comprehensive studies

looked at a Parkinson’s disease mutation in the gene LRRK2 (Reinhardt et al., 2013).

Three patient iPS lines were created and corrected using ZFNs. In parallel, they took a

wild type iPSC line and introduced the disease mutation. Expression profiling of the

multiple pairs of diseased and wild type lines yielded dysregulated genes with extremely

high accuracy.

Here we sought to unambiguously determine if NEUROG3 is, or is not,

functionally required for pancreatic endocrine cell development in humans using

pancreatic differentiation of human embryonic stem cells as a model system. We used

two methods to disrupt NEUROG3 function, shRNA knockdown and direct modification

of the NEUROG3 locus with CRISPR/Cas-mediated gene editing. We found that

NEUROG3 is absolutely essential for the specification and development of pancreatic

endocrine cells. Then, using the NEUROG3-/- hPSC line, we added an rtTA construct in

which wild type or mutant NEUROG3 could be ectopically expressed. This provided

exquisite control and allowed us to interrogate the function and mechanism of multiple

patient mutations in the context of human pancreas and intestine. NEUROG3 mutated

to include the R107S missense mutation was hypomorphic with impaired function when

ectopically expressed in pancreas. Unexpectedly, NEUROG3-L135P did not show any function at all. Neither mutation was able to specify endocrine cells when

expressed in the context of intestinal organoids. Finally, we profiled the transcriptomes

of pancreas and intestine downstream of NEUROG3 to identify organ specific

expression profiles which could orchestrate NEUROG3 function in a tissue specific way.

References

Apelqvist, A., Li, H., Sommer, L., Beatus, P., Anderson, D.J., Honjo, T., Hrabe de Angelis, M., Lendahl, U., and Edlund, H. (1999). Notch signalling controls pancreatic cell differentiation. Nature 400, 877–881.

Bhushan, A., Itoh, N., Kato, S., Thiery, J.P., Czernichow, P., Bellusci, S., and Scharfmann, R. (2001). Fgf10 is essential for maintaining the proliferative capacity of epithelial progenitor cells during early pancreatic organogenesis. Development 128, 5109–5117.

Blum, M., Gaunt, S.J., Cho, K.W.Y., Steinbeisser, H., Blumberg, B., Bittner, D., and De Robertis, E.M. (1992). Gastrulation in the mouse: The role of the homeobox gene goosecoid. Cell 69, 1097–1106.

Collombat, P., Mansouri, A., Hecksher-Sorensen, J., Serup, P., Krull, J., Gradwohl, G., and Gruss, P. (2003). Opposing actions of Arx and Pax4 in endocrine pancreas development. Genes Dev. 17, 2591–2603.

D’Amour, K.A., Agulnick, A.D., Eliazer, S., Kelly, O.G., Kroon, E., and Baetge, E.E. (2005). Efficient differentiation of human embryonic stem cells to definitive endoderm. Nat. Biotechnol. 23, 1534–1541.

Desgraz, R., and Herrera, P.L. (2009). Pancreatic neurogenin 3-expressing cells are unipotent islet precursors. Development 136, 3567–3574.

Gamer, L.W., and Wright, C.V. (1995). Autonomous endodermal determination in Xenopus: regulation of expression of the pancreatic gene XlHbox 8. Dev. Biol. 171, 240–251.

Gittes, G.K. (2009). Developmental biology of the pancreas: a comprehensive review. Dev. Biol. 326, 4–35.

Gradwohl, G., Dierich, A., LeMeur, M., and Guillemot, F. (2000). neurogenin3 is required for the development of the four endocrine cell lineages of the pancreas. Proc. Natl. Acad. Sci. U. S. A. 97, 1607–1611.

Green, J.B., and Smith, J.C. (1990). Graded changes in dose of a Xenopus activin A homologue elicit stepwise transitions in embryonic cell fate. Nature 347, 391–394.

Gu, G., Dubauskaite, J., and Melton, D.A. (2002). Direct evidence for the pancreatic lineage: NGN3+ cells are islet progenitors and are distinct from duct progenitors. Development 129, 2447–2457.

Hebrok, M., Kim, S.K., and Melton, D.A. (1998). Notochord repression of endodermal Sonic hedgehog permits pancreas development. Genes Dev. 12, 1705–1713.

Henry, G.L., Brivanlou, I.H., Kessler, D.S., Hemmati-Brivanlou, A., and Melton, D.A. (1996). TGF-beta signals and a pattern in Xenopus laevis endodermal development. Development 122, 1007–1015.

Henseleit, K.D., Nelson, S.B., Kuhlbrodt, K., Hennings, J.C., Ericson, J., and Sander, M. (2005). NKX6 transcription factor activity is required for alpha- and beta-cell development in the pancreas. Development 132, 3139–3149.

Hrvatin, S., O’Donnell, C.W., Deng, F., Millman, J.R., Pagliuca, F.W., Diiorio, P., Rezania, A., Gifford, D.K., and Melton, D.A. (2014). Differentiated human stem cells resemble fetal, not adult, β cells. Proc. Natl. Acad. Sci. U. S. A.

Huang, H.-P.H., Liu, M.I.N., El-Hodiri, H.M., Chu, K., Jamrich, M., and Tsai, M.-J. (2000). Regulation of the Pancreatic Islet-Specific Gene BETA2 (neuroD) by Neurogenin 3. Mol. Cell. Biol. 20, 3292–3307.

Hudson, C., Clements, D., Friday, R.V., Stott, D., and Woodland, H.R. (1997). Xsox17-alpha and -beta mediate endoderm formation in Xenopus. Cell 91, 397–405.

Jenny, M., Uhl, C., Roche, C., Duluc, I., Guillemot, F., Jensen, J., and Kedinger, M. (2002). Neurogenin3 is differentially required for endocrine cell fate specification in the intestinal and gastric epithelium. 21.

Jensen, J.N., Rosenberg, L.C., Hecksher-Sørensen, J., and Serup, P. (2007). Mutant Neurogenin-3 in Congenital Malabsorptive Diarrhea. N. Engl. J. Med. 356, 1781–1782.

Johansson, K.A., Dursun, U., Jordan, N., Gu, G., Beermann, F., Gradwohl, G., and Grapin-Botton, A. (2007). Temporal control of neurogenin3 activity in pancreas progenitors reveals competence windows for the generation of different endocrine cell types. Dev. Cell 12, 457–465.

Jung, J., Zheng, M., Goldfarb, M., and Zaret, K.S. (1999). Initiation of mammalian liver development from endoderm by fibroblast growth factors. Science 284, 1998–2003.

Kim, K., Zhao, R., Doi, A., Ng, K., Unternaehrer, J., Cahan, P., Hongguang, H., Loh, Y.-H., Aryee, M.J., Lensch, M.W., et al. (2011). Donor cell type can influence the epigenome and differentiation potential of human induced pluripotent stem cells. Nat. Biotechnol. 29, 1117–1119.

Lango Allen, H., Flanagan, S.E., Shaw-Smith, C., De Franco, E., Akerman, I., Caswell, R., Ferrer, J., Hattersley, A.T., and Ellard, S. (2012). GATA6 haploinsufficiency causes pancreatic agenesis in humans. Nat. Genet. 44, 20–22.

Lee, C.S., Perreault, N., Brestelli, J.E., and Kaestner, K.H. (2002). Neurogenin 3 is essential for the proper specification of gastric enteroendocrine cells and the maintenance of gastric epithelial cell identity. Genes Dev. 16, 1488–1497.

Lee, J.C., Smith, S.B., Watada, H., Lin, J., Scheel, D., Wang, J., Mirmira, R.G., and German, M.S. (2001a). Regulation of the pancreatic pro-endocrine gene neurogenin3. Diabetes 50, 928–936.

Lee, J.C., Smith, S.B., Watada, H., Lin, J., Scheel, D., Wang, J., Mirmira, R.G., and German, M.S. (2001b). Regulation of the Pancreatic Pro-Endocrine Gene Neurogenin3. Diabetes 50, 928–936.

Lopez-Diaz, L., Jain, R.N., Keeley, T.M., VanDussen, K.L., Brunkan, C.S., Gumucio, D.L., and Samuelson, L.C. (2007). Intestinal Neurogenin 3 directs differentiation of a bipotential secretory progenitor to endocrine cell rather than goblet cell fate. Dev. Biol. 309, 298–305.

Mali, P., Yang, L., Esvelt, K.M., Aach, J., Guell, M., DiCarlo, J.E., Norville, J.E., and Church, G.M. (2013). RNA-guided human genome engineering via Cas9. Science 339, 823–826.

Mayshar, Y., Ben-David, U., Lavon, N., Biancotti, J.C., Yakir, B., Clark, A.T., Plath, K., Lowry, W.E., and Benvenisty, N. (2010). Identification and classification of chromosomal aberrations in human induced pluripotent stem cells. Cell Stem Cell 7, 521–531.

McGrath, P.S., Watson, C.L., Ingram, C., Helmrath, M.A., and Wells, J.M. (2015). The Basic Helix-Loop-Helix Transcription Factor NEUROG3 Is Required for Development of the Human Endocrine Pancreas. Diabetes 64, 2497–2505.

Miettinen, P.J., Huotari, M., Koivisto, T., Ustinov, J., Palgi, J., Rasilainen, S., Lehtonen, E., Keski-Oja, J., and Otonkoski, T. (2000). Impaired migration and delayed differentiation of pancreatic islet cells in mice lacking EGF-receptors. Development 127, 2617–2627.

Molotkov, A., Molotkova, N., and Duester, G. (2005). Retinoic acid generated by Raldh2 in mesoderm is required for mouse dorsal endodermal pancreas development. Dev. Dyn. 232, 950–957.

Murtaugh, L.C. (2007). Pancreas and beta-cell development: from the actual to the possible. Development 134, 427–438.

Murtaugh, L.C., Stanger, B.Z., Kwan, K.M., and Melton, D.A. (2003). Notch signaling controls multiple steps of pancreatic differentiation. Proc. Natl. Acad. Sci. U. S. A. 100, 14920–14925.

Ootani, A., Li, X., Sangiorgi, E., Ho, Q.T., Ueno, H., Toda, S., Sugihara, H., Fujimoto, K., Weissman, I.L., Capecchi, M.R., et al. (2009). Sustained in vitro intestinal epithelial culture within a Wnt-dependent stem cell niche. Nat. Med. 15, 701–706.

Pearce, J.J.H., and Evans, M.J. (1999). Mml, a mouse Mix-like gene expressed in the primitive streak. Mech. Dev. 87, 189–192.

Pinney, S.E., Oliver-Krasinski, J., Ernst, L., Hughes, N., Patel, P., Stoffers, D.A., Russo, P., and De León, D.D. (2011). Neonatal diabetes and congenital malabsorptive diarrhea attributable to a novel mutation in the human neurogenin-3 gene coding sequence. J. Clin. Endocrinol. Metab. 96, 1960–1965.

Ran, F.A., Hsu, P.D., Wright, J., Agarwala, V., Scott, D.A., and Zhang, F. (2013). Genome engineering using the CRISPR-Cas9 system. Nat. Protoc. 8, 2281–2308.

Reinhardt, P., Schmid, B., Burbulla, L.F., Schöndorf, D.C., Wagner, L., Glatza, M., Höing, S., Hargus, G., Heck, S.A., Dhingra, A., et al. (2013). Genetic correction of a LRRK2 mutation in human iPSCs links parkinsonian neurodegeneration to ERK-dependent changes in gene expression. Cell Stem Cell 12, 354–367.

Rossi, J.M., Dunn, N.R., Hogan, B.L., and Zaret, K.S. (2001). Distinct mesodermal signals, including BMPs from the septum transversum mesenchyme, are required in combination for hepatogenesis from the endoderm. Genes Dev. 15, 1998–2009.

Rubio-Cabezas, O., Jensen, J.N., Hodgson, M.I., Codner, E., Ellard, S., Serup, P., and Hattersley, A.T. (2011). Permanent neonatal diabetes and enteric anendocrinosis associated with biallelic mutations in NEUROG3. Diabetes 60, 1349–1353.

Sander, M., Sussel, L., Conners, J., Scheel, D., Kalamaras, J., Dela Cruz, F., Schwitzgebel, V., Hayes-Jordan, A., and German, M. (2000). Homeobox gene Nkx6.1 lies downstream of Nkx2.2 in the major pathway of beta-cell formation in the pancreas. Development 127, 5533–5540.

Sasaki, H., and Hogan, B.L. (1993). Differential expression of multiple fork head related genes during gastrulation and axial pattern formation in the mouse embryo. Development 118, 47–59.

Schiesser, J.V., and Wells, J.M. (2014). Generation of β cells from human pluripotent stem cells: Are we there yet? Ann. N. Y. Acad. Sci. 1–14.

Schwitzgebel, V.M., Scheel, D.W., Conners, J.R., Kalamaras, J., Lee, J.E., Anderson, D.J., Sussel, L., Johnson, J.D., and German, M.S. (2000). Expression of neurogenin3 reveals an islet cell precursor population in the pancreas. Development 127, 3533–3542.

Sinner, D., Kirilenko, P., Rankin, S., Wei, E., Howard, L., Kofron, M., Heasman, J., Woodland, H.R., and Zorn, A.M. (2006). Global analysis of the transcriptional network controlling Xenopus endoderm formation. Development 133, 1955–1966.

Sosa-Pineda, B., Chowdhury, K., Torres, M., Oliver, G., and Gruss, P. (1997). The Pax4 gene is essential for differentiation of insulin-producing beta cells in the mammalian pancreas. Nature 386, 399–402.

Soyer, J., Flasse, L., Raffelsberger, W., Beucher, A., Orvain, C., Peers, B., Ravassard, P., Vermot, J., Voz, M.L., Mellitzer, G., et al. (2010). Rfx6 is an Ngn3-dependent winged helix transcription factor required for pancreatic islet cell development. Development 137, 203–212.

Spence, J.R., and Wells, J.M. (2007). Translational embryology: using embryonic principles to generate pancreatic endocrine cells from embryonic stem cells. Dev. Dyn. 236, 3218–3227.

Spence, J.R., Mayhew, C.N., Rankin, S.A., Kuhar, M.F., Vallance, J.E., Tolle, K., Hoskins, E.E., Kalinichenko, V.V., Wells, S.I., Zorn, A.M., et al. (2011). Directed differentiation of human pluripotent stem cells into intestinal tissue in vitro. Nature 470, 105–109.

Stanger, B.Z., Tanaka, A.J., and Melton, D.A. (2007). Organ size is limited by the number of embryonic progenitor cells in the pancreas but not the liver. Nature 445, 886–891.

Takahashi, K., Tanabe, K., Ohnuki, M., Narita, M., Ichisaka, T., Tomoda, K., and Yamanaka, S. (2007). Induction of Pluripotent Stem Cells from Adult Human Fibroblasts by Defined Factors. Cell 131, 861–872.

Teta, M., Rankin, M.M., Long, S.Y., Stein, G.M., and Kushner, J.A. (2007). Growth and regeneration of adult beta cells does not involve specialized progenitors. Dev. Cell 12, 817–826.

Thomsen, G., Woolf, T., Whitman, M., Sokol, S., Vaughan, J., Vale, W., and Melton, D.A. (1990). Activins are expressed early in Xenopus embryogenesis and can induce axial mesoderm and anterior structures.
Cell 63, 485–493. Thomson, J. a, Itskovitz-Eldor, J., Shapiro, S.S., Waknitz, M. a, Swiergiel, J.J., Marshall, V.S., and Jones, J.M. (1998). Embryonic stem cell lines derived from human blastocysts. Science 282, 1145–1147. Wang, J., Galen, C., Wu, V., Tran, R., Cho, J.-H., Tsai, M.-J., Bailey, T.J., Jamrich, M., Ament, M.E., Treem, W.R., et al. (2006a). Mutant neurogenin-3 in congenital malabsorptive diarrhea. N. Engl. J. Med. 356, 1781–1782; author reply 1782. Wang, J., Galen, C., Wu, V., Tran, R., Cho, J.-H., Tsai, M.-J., Bailey, T.J., Jamrich, M., Ament, M.E., Treem, W.R., et al. (2007). Mutant neurogenin-3 in congenital malabsorptive diarrhea. N. Engl. J. Med. 356, 1781–1782; author reply 1782. Wang, S., Jensen, J.N., Seymour, P.A., Hsu, W., Dor, Y., Sander, M., Magnuson, M.A., Serup, P., and Gu, G. (2009). Sustained Neurog3 expression in hormone- expressing islet cells is required for endocrine maturation and function. Proc. Natl. Acad. Sci. 106, 9715–9720. Wang, Z., Doll, P., Cardoso, W. V., and Niederreither, K. (2006b). Retinoic acid regulates morphogenesis and patterning of posterior foregut derivatives. Dev.

20 Biol. 297, 433–445. Wilkinson, D.G., Bhatt, S., and Herrmann, B.G. (1990). Expression pattern of the mouse T gene and its role in mesoderm formation. Nature 343, 657–659. Xu, X., D’Hoker, J., Stangé, G., Bonné, S., De Leu, N., Xiao, X., Van de Casteele, M., Mellitzer, G., Ling, Z., Pipeleers, D., et al. (2008). Beta cells can be generated from endogenous progenitors in injured adult mouse pancreas. Cell 132, 197– 207. Zorn, A.M., and Wells, J.M. (2009). Vertebrate endoderm development and organ formation. Annu. Rev. Cell Dev. Biol. 25, 221–251. Zwaka, T.P., and Thomson, J. a (2003). Homologous recombination in human embryonic stem cells. Nat. Biotechnol. 21, 319–321.

Figure Legends

Figure 1. Schematic depicting key developmental stages and corresponding morphogenetic processes occurring in the embryo during pancreas formation (from Schiesser and Wells, 2014).

Figure 2. Illustration indicating the two repair mechanisms following a CRISPR-mediated DNA double-strand break (from Ran et al., 2013).

A double-strand break (DSB) induced by Cas9 (yellow) can be repaired by the error-prone non-homologous end joining (NHEJ) pathway, which can create frameshift mutations and a premature stop codon. Alternatively, the homology-directed repair (HDR) pathway can be leveraged if a repair template is provided. The repair template can include mutations of interest or coding sequences.

Figure 1

Figure 2

The basic helix-loop-helix transcription factor NEUROG3 is required for development of the human endocrine pancreas.

Patrick S. McGrath1, Carey L. Watson2, Cameron Ingram1, Michael A. Helmrath2 and James M. Wells*1,3

1Division of Developmental Biology, 2Department of Pediatric and General Thoracic Surgery, 3Division of Endocrinology, Cincinnati Children’s Hospital Medical Center, 3333 Burnet Avenue, Cincinnati, OH 45229-3039

2Department of Molecular and Cellular Physiology, University of Cincinnati, Cincinnati, OH 45267

*Correspondence:

[email protected]

Telephone: 513-636-8767

Fax: 513-636-4317

Running title: NEUROG3 is required for human endocrine development

Summary

Neurogenin 3 (NEUROG3) is a basic helix-loop-helix transcription factor that is required for development of the endocrine pancreas in mice. In contrast, human patients with NEUROG3 mutations are born with endocrine pancreas function, calling into question whether NEUROG3 is required for human endocrine pancreas development. To test this directly, we generated human embryonic stem cell (hESC) lines in which both alleles of NEUROG3 were disrupted using CRISPR/Cas9-mediated gene targeting. NEUROG3-/- hESC lines efficiently formed pancreatic progenitors, but lacked detectable NEUROG3 protein and did not form any endocrine cells in vitro. Moreover, NEUROG3-/- hESC lines were unable to form mature pancreatic endocrine cells following engraftment of PDX1+/NKX6.1+ pancreatic progenitors into mice. In contrast, a 75-90% knockdown of NEUROG3 caused a reduction, but not loss, of pancreatic endocrine cell development. We conclude that NEUROG3 is essential for endocrine pancreas development in humans and that as little as 10% of normal NEUROG3 expression is sufficient for formation of pancreatic endocrine cells.

Introduction

In mice, pancreatic progenitor cells give rise to functional endocrine cells via an endocrine progenitor intermediate that expresses the bHLH transcription factor Neurogenin3 (Neurog3) (1–4). Neurog3 is required for development of all pancreatic endocrine cell types in mice (5–7), acting through direct and indirect regulation of downstream targets including the transcription factors NeuroD1 (8), Rfx6 (9), Pax4 (10), Nkx6.1 (11,12), Arx (13), and others. Neurog3+ cells are first observed during the primary transition in mouse between e9 and e12.5. While some of these primary transition endocrine cells may contribute to adult islets (3), the majority of endocrine cell mass forms during a second wave of endocrine cell development between e12.5 and e16.5.

Neurog3 is also required for development of intestinal enteroendocrine cells (EECs) in mice (14–17). Similarly, human patients with biallelic mutations in NEUROG3 are born with intractable malabsorptive diarrhea due to loss of EECs, also known as enteric anendocrinosis (18–20). Most mutations occur in, or result in a truncation of, the well-conserved bHLH domain of NEUROG3, and have previously been reported to render the protein transcriptionally inactive. However, all of these patients were born with circulating C-peptide, suggesting that, unlike in mice, NEUROG3 may not be required for the development of the human endocrine pancreas (21).

Here we sought to determine unambiguously whether NEUROG3 is functionally required for human pancreatic endocrine cell development, using pancreatic differentiation of human embryonic stem cells as a model system. We used two methods to disrupt NEUROG3 function: shRNA knockdown and direct modification of the NEUROG3 locus with CRISPR/Cas9-mediated gene editing. All hESC lines generated pancreatic progenitor cells with equal efficiency; however, NEUROG3-/- hESC lines were deficient in endocrine cell development in vitro and following engraftment into mice. In contrast, knockdown of NEUROG3 transcripts by up to 90% using shRNAs had only a marginal effect on the production of hormone-expressing cells in vitro. These data are consistent with the idea that the published NEUROG3 mutations are hypomorphic rather than complete loss-of-function alleles, thus allowing these patients to be born with a functional endocrine pancreas.

Methods

Cell culture and differentiation

The human embryonic stem cell line H1 (WiCell) was maintained in mTeSR (StemCell Technologies) on Matrigel-coated plates. Prior to differentiation, cells were dispersed with Accutase (StemCell Technologies), washed, collected, resuspended in mTeSR containing 10μM ROCK inhibitor (Y-27632, Tocris Bioscience), and plated at a density of 1x10^5 cells/cm^2 onto Matrigel-coated, 24-well Nunclon plates (Delta treated). Differentiation was initiated when cells reached ~75% confluency, approximately 48 hours after plating. At the start of differentiation (day 0), cells were switched to RPMI 1640 supplemented with non-essential amino acids, 100ng/ml Activin A (Cell Guidance Systems) and 50ng/ml BMP-4 (R&D Systems). Day 1-2 media included 0.2% FBS (Hyclone) and did not contain BMP-4. On day 3 the media was changed to RPMI 1640 containing 2% FBS, 50ng/ml FGF-7 (R&D Systems), and 50ng/ml Noggin (R&D Systems). On days 5 and 7 the media was switched to high-glucose (HG) DMEM (Gibco) containing 50ng/ml Noggin, 2μM all-trans retinoic acid (Stemgent), and 1% (0.5x) B27 without vitamin A (Gibco). Finally, day 9-11 media was prepared using HG-DMEM supplemented with 1% B27 and 25ng/ml Noggin.
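For readers who want the media schedule in machine-readable form, the following minimal Python sketch transcribes the day-by-day conditions described above. It rests on two assumptions that the text does not state explicitly: a medium stays in use until the next listed change, and the day 1-2 medium is the day 0 medium plus 0.2% FBS and minus BMP-4. The names `SCHEDULE` and `medium_for_day` are hypothetical and serve only as illustration.

```python
# Media schedule transcribed from the Methods text.
# Each entry is (first_day, base_medium, supplements).
# Assumption: a medium stays in use until the next listed change.
# Assumption: days 1-2 use the day 0 medium plus 0.2% FBS, minus BMP-4.
SCHEDULE = [
    (0, "RPMI 1640", ["non-essential amino acids", "100 ng/ml Activin A", "50 ng/ml BMP-4"]),
    (1, "RPMI 1640", ["non-essential amino acids", "100 ng/ml Activin A", "0.2% FBS"]),
    (3, "RPMI 1640", ["2% FBS", "50 ng/ml FGF-7", "50 ng/ml Noggin"]),
    (5, "HG-DMEM", ["50 ng/ml Noggin", "2 uM all-trans retinoic acid", "1% B27 without vitamin A"]),
    (9, "HG-DMEM", ["1% B27", "25 ng/ml Noggin"]),
]
LAST_DAY = 11  # last day with a described medium


def medium_for_day(day):
    """Return (base_medium, supplements) in use on a given differentiation day."""
    if not 0 <= day <= LAST_DAY:
        raise ValueError("day must be between 0 and %d" % LAST_DAY)
    current = None
    for first_day, base, supplements in SCHEDULE:
        if first_day <= day:
            current = (base, supplements)
    return current


if __name__ == "__main__":
    for d in range(LAST_DAY + 1):
        base, supplements = medium_for_day(d)
        print(d, base, "; ".join(supplements))
```

Keeping the schedule as data rather than prose makes it easy to compare against other published protocols, but the text above remains the authoritative description.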


CRISPR Design and targeted mutagenesis

The plasmid encoding Cas9-2A-GFP (22) was acquired from Addgene (#44719). Guide RNAs were designed to target downstream of the NEUROG3 start codon (gRNA1 5’ GTGGGCGCACCCGAGGGTTGAGG, gRNA2 5’ GGAAGGACCGCTCCGTCTCACGG). All gRNAs were synthesized as gBlocks by Integrated DNA Technologies and PCR cloned into the pENTR/D-TOPO vector (Life Technologies). H1 cells were transfected with 2.5μg of each plasmid using the Amaxa P3 Primary Cell 4D-Nucleofector Kit (Lonza). Positively transfected H1 cells were then collected by FACS (utilizing the 2A-GFP) and plated at a limiting dilution for subcloning. Individual colonies were isolated and clonally expanded. Genomic DNA was collected using the HotSHOT method (23). PCR primers were designed flanking the targeted sequence and iProof (Bio-Rad) was used for amplification. PCR products were column purified (Qiagen) and then Sanger sequenced. The Mixed Sequence Reader (24) was used to screen resulting mixed traces for INDELs. Predicted genotypes were then confirmed by subcloning using the Zero Blunt TOPO PCR Cloning Kit (Life Technologies) followed by Sanger sequencing.
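As a quick sanity check on the two target sites listed above, the sketch below splits each 23-nt sequence into a 20-nt protospacer and a 3-nt protospacer-adjacent motif (PAM). It assumes, without confirmation from the text, that each listed sequence is written as protospacer followed by an NGG PAM, the layout commonly used for Streptococcus pyogenes Cas9. The function name `split_target` is hypothetical.

```python
# Target sites copied from the Methods text.
GUIDES = {
    "gRNA1": "GTGGGCGCACCCGAGGGTTGAGG",
    "gRNA2": "GGAAGGACCGCTCCGTCTCACGG",
}


def split_target(site):
    """Split a 23-nt target site into (protospacer, PAM).

    Assumption: the site is written 5' to 3' as a 20-nt protospacer
    followed by a 3-nt NGG PAM.
    """
    if len(site) != 23:
        raise ValueError("expected a 23-nt site, got %d nt" % len(site))
    if set(site) - set("ACGT"):
        raise ValueError("site contains non-ACGT characters")
    protospacer, pam = site[:20], site[20:]
    if pam[1:] != "GG":
        raise ValueError("PAM %s does not match NGG" % pam)
    return protospacer, pam


if __name__ == "__main__":
    for name, site in GUIDES.items():
        protospacer, pam = split_target(site)
        print(name, protospacer, pam)
```

Both sequences pass the check, ending in AGG and CGG respectively, which is consistent with the assumed layout but does not by itself establish which Cas9 variant was used.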

Generating shRNA NEUROG3 knockdown and reporter lines

Lentiviral vectors for NEUROG3 shRNA were obtained from the CCHMC Lenti-shRNA Library Core (TRCN0000020034, Mission Library, Sigma-Aldrich) and the mCherry reporter was generated using a 5.5kb promoter region 5’ to the NEUROG3 transcriptional start site. Vectors were packaged into high-titer lentivirus by the CCHMC viral production core. The shRNA is designed against the NEUROG3 sequence 5’ CAGTCTGGCTTTCTCAGATTT. Low-passage H1 ES cells were dissociated into a single-cell suspension using Accutase and then replated in mTeSR + 10μM Y-27632. shNEUROG3 viral particles were added to the cells immediately prior to plating. After 24 hours, the media was replaced to remove the viral particles. Puromycin selection (2μg/ml) was started 72 hours after transduction and lines were maintained under selection.

Aggregation of hESC-derived pancreatic progenitors and transplantation

Day 12 cultures were lifted off the plate by treatment with Dispase and gentle scraping, collected by centrifugation, dispersed into 100-500μm pieces, and aggregated for 24-48 hours in ultra-low attachment 6-well plates. Aggregates were then embedded in purified type I collagen (rat tail collagen; BD Biosciences) 12 hours prior to surgery, then transplanted under the kidney capsule or directly into the splenic lobe of the pancreas of immune-deficient NOD-Scid IL-2Rnull (NSG) mice. Grafts were harvested 6 weeks after engraftment.

qPCR

All RNA was column purified using a NucleoSpin RNA kit (Macherey-Nagel) including an on-column DNase digestion. cDNA was produced with the SuperScript VILO cDNA synthesis kit (Invitrogen) following the manufacturer’s instructions. For each reaction, 5ng of cDNA was amplified with QuantiTect SYBR Green (Qiagen) on a CFX96 Real-Time PCR Detection System (Bio-Rad). All primers are listed in the supplemental materials.

Cell and Tissue Processing and Immunofluorescence

Monolayers were fixed for 20 minutes at room temperature in 4% paraformaldehyde (PFA). Transplants were fixed overnight in 4% PFA at 4°C, cryoprotected overnight in 30% sucrose, frozen in OCT, then cryosectioned into 8-10μm sections. Prior to staining, monolayers and sections were blocked for 30 minutes (5% donkey serum and 0.5% Triton-X in PBS). Primary antibodies were diluted in PBS + 0.1% Tween and incubated with the samples overnight at 4°C. Secondary antibodies were incubated for 2 hours at room temperature. Cells were stained with DAPI (5μg/ml in PBS) for 5 minutes. Sections were mounted using Fluoromount-G.

Image acquisition and analysis

Confocal images were captured using a Nikon A1R confocal microscope with PMT-based detectors and a motorized stage. The microscope has 405nm, 488nm, 561nm, and 640nm lasers with appropriate filters. All image analysis was performed using Bitplane Imaris software. Figures were assembled using Adobe Photoshop and Illustrator CS6.

Statistical Analysis

All results are expressed as the mean ± SEM unless otherwise noted. Statistical significance between two groups was tested using a two-tailed unpaired t-test. P-values are as follows: *p<0.05, **p<0.01, ***p<0.001.
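The reporting conventions above can be illustrated with a short Python sketch that computes a mean and standard error of the mean (SEM) and maps a p-value to the asterisk notation. The function names and the example values are hypothetical, and the sketch assumes the SEM is the sample standard deviation divided by the square root of the number of replicates. The t-test itself is not reimplemented here.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, SEM), with SEM = sample SD / sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two replicates")
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


def significance_stars(p):
    """Map a p-value to the asterisk notation used in the figures."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""


if __name__ == "__main__":
    replicates = [1.0, 2.0, 3.0]  # illustrative values, not data from this study
    m, sem = mean_sem(replicates)
    print("%.2f +/- %.2f" % (m, sem))
    print(repr(significance_stars(0.004)))
```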

Results

Targeted mutagenesis of the NEUROG3 locus in hESCs

To investigate a role for human NEUROG3 during endocrine pancreas development we generated human ESC lines with targeted disruption of NEUROG3 using CRISPR-Cas9 technology. In this approach we used guide RNAs (gRNAs) to target the Cas9 nuclease to sequences just downstream of the NEUROG3 start codon (Fig. 1A). Potential gRNAs were screened and ranked for specificity using BLAST algorithms and the CRISPR Design tool to minimize the risk of off-target effects (Ran et al., 2013, crispr.mit.edu) (Supplementary Figs. 1A and F). Moreover, two separate gRNAs that recognize different target sequences in NEUROG3 were used to generate independent NEUROG3+/- and NEUROG3-/- lines, with the rationale that a similar phenotype arising from different target sequences is exceedingly unlikely to be due to off-target effects. Clonal lines were generated and mutations in NEUROG3 were detected by Sanger sequencing (Fig. 1B and Supplementary Fig. 1B). Approximately 25% of clones had no insertions or deletions (INDELs) in NEUROG3 (NEUROG3+/+), ~50% of clones had INDELs in one allele (NEUROG3+/-), and ~25% of clones had INDELs in both alleles (NEUROG3-/-). All lines exhibited a characteristic PSC morphology, expressed the pluripotency markers OCT4 and NANOG (Supplementary Figs. 1C and D), grew at a rate similar to that of the parental H1 line, and were karyotypically normal (Supplementary Fig. 1E). Importantly, the NEUROG3 paralogs NEUROG1 and NEUROG2 were sequenced and confirmed to be normal in all NEUROG3-/- lines (data not shown).
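The approximate 25% / 50% / 25% split of genotypes is what one would expect if each of the two alleles independently acquired an INDEL with a probability of about 0.5. This is offered only as an interpretive sketch: the independence assumption and the function name `genotype_fractions` are ours, and the text reports only the observed fractions.

```python
def genotype_fractions(p):
    """Expected fractions of (+/+, +/-, -/-) clones.

    Assumption: each of the two alleles independently acquires an INDEL
    with probability p.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    wild_type = (1 - p) ** 2
    heterozygous = 2 * p * (1 - p)
    homozygous = p ** 2
    return wild_type, heterozygous, homozygous


if __name__ == "__main__":
    print(genotype_fractions(0.5))
```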

Human NEUROG3 is essential for formation of pancreatic endocrine cells in vitro

For differentiation of hESCs into pancreatic progenitors and endocrine cells we used a 4-step protocol similar to several previous methods (26–29) (summarized in Fig. 1C): differentiation of hESC monolayers (marked by OCT4) into definitive endoderm (DE, marked by SOX17 and FOXA2), then into posterior foregut (marked by PDX1, >95%), and finally formation of pancreatic progenitor cells (marked by NKX6.1, 61%) (Fig. 1D, Supplementary Fig. 2A,B,D). Differentiation into DE, posterior foregut, and pancreatic progenitors was comparable in all lines tested (NEUROG3+/+, NEUROG3+/-, NEUROG3-/-) (Fig. 1D). In NEUROG3+/+ hESCs, NEUROG3 transcripts were detectable starting around day 9 (Fig. 4A) and approximately 6% of cells expressed NEUROG3 protein by day 12 (Figs. 1E,F). In contrast, NEUROG3-/- hESCs had virtually no NEUROG3 protein (Fig. 1E) or mRNA (Supplementary Figs. 3A,B). We observed an 80% decrease in NEUROG3-expressing cells in NEUROG3+/- heterozygous lines compared to NEUROG3+/+ wild-type controls, consistent with a published report that Neurog3 haploinsufficiency in mice causes a reduction in pancreatic endocrine cell mass and impaired glucose regulation (30). Interestingly, quantitative analysis of NEUROG3 protein levels in individual cells indicated that NEUROG3 levels were the same in NEUROG3+/- and NEUROG3+/+ cells (Fig. 1G). During pancreas development, PDX1 and NKX6.1 expression is initiated before endocrine specification, and early expression of these genes was similar across genotypes (Fig. 1D). To investigate whether expression of PDX1 and NKX6.1 in endocrine cells is altered by NEUROG3 haploinsufficiency, we quantified the level of nuclear PDX1 and NKX6.1 in cells either positive or negative for NEUROG3 protein (NEUROG3pos, NEUROG3neg) across genotypes. NEUROG3 mutations had no impact on PDX1 levels (Fig. 1H), but the range of PDX1 expression was much broader in NEUROG3pos cells than in NEUROG3neg cells. However, there was a small reduction in NKX6.1 protein levels in NEUROG3+/- and NEUROG3-/- lines.

At day 12 we observed differentiated endocrine cells in NEUROG3+/+ and NEUROG3+/- cultures that expressed the pan-endocrine marker chromogranin A (CHGA, Fig. 2A, Supplementary Fig. 2A) and the hormones insulin (INS), glucagon (GCG), and somatostatin (SST, Fig. 2B). In contrast, NEUROG3-/- cultures showed no evidence of any endocrine differentiation: CHGA protein and mRNA were completely absent (Figs. 2A,C) and there were no hormone-expressing cells (Fig. 2B), demonstrating that NEUROG3 is required for endocrine specification in vitro. Quantification of INS, GCG, and SST cells in day 12 cultures demonstrated that many cells were polyhormonal. NEUROG3+/- cultures had about 75% fewer hormone-expressing cells overall (Supplementary Fig. 2H) and there were slight changes in the relative proportions of the hormone-expressing cell types.

We next investigated the impact of NEUROG3 loss on the expression of several transcription factors that are involved in pancreatic endocrine development. Our data suggested that expression of these factors falls into one of three profiles: expression that was independent of, partially dependent on, or completely dependent on NEUROG3 (Fig. 2C and Supplementary Fig. 3). NEUROG3 status had little effect on the levels of PDX1 and NKX6.1 at the pancreatic precursor stage, consistent with the protein data (Fig. 1H and Supplementary Fig. 3C). However, PTF1A, IA1, and ARX were decreased in NEUROG3+/- cells and further reduced in NEUROG3-/- cells, indicating a partial dependence on NEUROG3. Transcription factors that were absent in NEUROG3-/- cells include NEUROD1, PAX4, PAX6, and MYT1. To further investigate whether the expression of these transcription factors was elevated in NEUROG3-expressing cells, we generated a transgenic NEUROG3 reporter line using a 5.5kb promoter region 5’ to the NEUROG3 transcriptional start site to drive expression of mCherry fluorescent protein (NEUROG3mCherry) in NEUROG3-expressing cells (Supplementary Fig. 3E,F). Sorted NEUROG3/mCherry-expressing cells had high levels of NEUROD1, NKX2.2, PAX4, RFX6, IA1, and the hormones INS and GCG (Supplementary Fig. 2G), as compared to mCherry-negative cells. Conversely, the levels of PDX1, NKX6.1, and PTF1A were not dramatically different in mCherry-positive versus mCherry-negative cells, suggesting that NEUROG3 expression levels did not correlate with expression of these factors.

It was surprising that we did not observe a change in expression of RFX6, IA1, and ARX in NEUROG3-/- pancreatic precursors, because expression of these factors is completely lost in mice lacking Neurog3 (9,13,31). Also contrary to mouse studies, NKX2.2 appeared to be dependent on NEUROG3, as its expression was absent in NEUROG3-/- lines. This observation is consistent with NKX2.2 expression during human fetal pancreas development, where NKX2.2 is first detected shortly after the onset of NEUROG3 expression (32). Together these data indicate that endocrine cell development is a NEUROG3-dependent process in humans, but that there are qualitative differences between mice and humans regarding the transcription factors downstream of NEUROG3.

Human NEUROG3 is essential for formation of mature pancreatic endocrine cells

Of the endocrine cells derived in vitro, 40% were polyhormonal (Figs. 2B’ and 3C) and did not co-express beta cell transcription factors such as NKX6.1 and PDX1, suggesting that they were not definitive pancreatic endocrine cells. To investigate whether NEUROG3 is required for the development of mature endocrine cells that arise during the secondary transition, we engrafted hPSC-derived pancreatic progenitors into NOD-Scid IL-2Rnull (NSG) mice, a procedure known to promote their development into more mature, functional endocrine cells (27). Progenitors were engrafted either into the splenic lobe of the pancreas or under the kidney capsule and were matured for 6 weeks (Supplementary Figs. 4A and B). Interestingly, we observed that the pancreas seemed to support better growth and survival, with 8/11 grafts recovered from the pancreas as compared to 6/19 grafts recovered from the kidney. NEUROG3+/+ cells transplanted into the pancreas contained an average of 16% endocrine cells expressing the hormones INS, GCG, and SST. NEUROG3+/+ and NEUROG3+/- endocrine cells were 99% and 91% monohormonal, respectively (Figs. 3A,B). NEUROG3+/- transplants showed roughly a 50% decrease in endocrine cell numbers and an increase in the number of polyhormonal cells as compared to NEUROG3+/+ lines (Supplementary Fig. 4C). The only hormone-positive cells observed in NEUROG3-/- transplants expressed glucagon (7 of 57,393 counted cells from n=3 transplants), similar to what was observed in Neurog3-/- mice (33). Unlike insulin-expressing cells derived in vitro, insulin-expressing cells that were matured in vivo co-expressed the definitive beta cell transcription factors NKX6.1 and PDX1, indicating that these are more mature beta cells (Fig. 3D). The protein levels of NKX6.1 and PDX1 were quantified by immunofluorescence in INSpos and INSneg cells. Both NKX6.1 and PDX1 were more highly expressed in INSpos cells than in INSneg cells. There were no major differences in NKX6.1 and PDX1 expression between NEUROG3+/+, NEUROG3+/-, and NEUROG3-/- transplants (Supplementary Fig. 4D,E).
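To put the counts reported in this section on a common scale, the short sketch below converts them to percentages. It uses only the numbers given in the text; the helper name `percent` is hypothetical, and no statistical comparison is implied because none is reported.

```python
def percent(numerator, denominator):
    """Express a count as a percentage of a total."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * numerator / denominator


# Counts taken from the text above.
pancreas_recovery = percent(8, 11)      # grafts recovered from the pancreas
kidney_recovery = percent(6, 19)        # grafts recovered from the kidney
gcg_positive_ko = percent(7, 57393)     # glucagon+ cells in NEUROG3-/- transplants

if __name__ == "__main__":
    print("pancreas graft recovery: %.1f%%" % pancreas_recovery)
    print("kidney graft recovery: %.1f%%" % kidney_recovery)
    print("glucagon+ cells in NEUROG3-/- grafts: %.3f%%" % gcg_positive_ko)
```

On these numbers, roughly 73% of pancreas grafts and 32% of kidney grafts were recovered, and glucagon-positive cells made up about 0.012% of counted cells in NEUROG3-/- transplants.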

Efficient knockdown of NEUROG3 does not block pancreatic endocrine cell development.

These data demonstrate that genetic ablation of the NEUROG3 locus results in a complete loss of specification of human pancreatic endocrine cells differentiated from PSCs. However, patients with homozygous or biallelic NEUROG3 mutations are all born with endocrine pancreas function. It is possible that the reported NEUROG3 mutations retain enough residual activity to allow for development of pancreatic endocrine cells. To investigate the impact of reduced NEUROG3 levels on endocrine pancreas development we generated hESC lines expressing shRNA constructs for NEUROG3 (shNEUROG3) and differentiated these into pancreatic precursors. shNEUROG3 hESCs formed pancreatic progenitors (marked by PDX1, NKX6.1) with the same efficiency as control lines (Fig. 4A) and had up to an 89% reduction in NEUROG3 mRNA at the endocrine differentiation stages. However, despite this level of knockdown, NEUROG3 target genes such as NEUROD1 were still expressed at significant levels. INS and GCG mRNA levels were reduced by only 40% and 75%, respectively. NEUROG3 knockdown also had a modest effect on the number of insulin-expressing cells in vitro as assessed by immunofluorescence (Fig. 4B). These data demonstrate that human pancreatic endocrine cell formation is reduced but not lost in cells with reduced NEUROG3 function.
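The knockdown figures above are stated as reductions, whereas the residual level is often the more useful quantity. The sketch below makes the conversion explicit, assuming each reduction is expressed relative to the control lines. The function name `residual_percent` is hypothetical.

```python
def residual_percent(reduction_percent):
    """Convert a percent reduction into the percent of control remaining.

    Assumption: the reduction is expressed relative to control lines.
    """
    if not 0 <= reduction_percent <= 100:
        raise ValueError("reduction must be between 0 and 100")
    return 100 - reduction_percent


# Reductions reported in the text above.
REDUCTIONS = {"NEUROG3 mRNA (maximum)": 89, "INS mRNA": 40, "GCG mRNA": 75}

if __name__ == "__main__":
    for name, reduction in REDUCTIONS.items():
        print("%s: %d%% of control remaining" % (name, residual_percent(reduction)))
```

An 89% reduction therefore corresponds to about 11% of control NEUROG3 mRNA remaining, with 60% of INS and 25% of GCG mRNA remaining.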

Discussion

Neurog3 is necessary for development of pancreatic and gastrointestinal endocrine cells in the mouse. Patients with biallelic mutations in NEUROG3 present with an absence of intestinal enteroendocrine cells, thus phenocopying the mouse. In contrast to the mouse data, all patients are born with a functional endocrine pancreas (18–20,34,35). Here we use genetically modified human ES cells to provide definitive evidence that NEUROG3 is required for the development of human pancreatic endocrine cells. These data suggest that the NEUROG3 mutations identified in humans are not complete loss-of-function alleles, as they still support some degree of endocrine pancreas development. Consistent with this, our data show that as little as 11% of normal NEUROG3 mRNA levels is sufficient to allow pancreatic endocrine cell development but is insufficient for intestinal enteroendocrine development (36).

The molecular basis for the differential requirement for NEUROG3 in pancreatic versus enteroendocrine development is unknown. It is possible that pancreatic endocrine cells express higher levels of NEUROG3 than enteroendocrine cells, such that even a hypomorphic protein would be present at sufficient levels to specify a pancreatic endocrine fate. Another possibility is context-dependent association with transcriptional co-factors, since basic helix-loop-helix transcription factors function as dimers. Consistent with this possibility, most of the point mutations identified in NEUROG3 in humans occur in the HLH dimerization domain. Lastly, these mutations may impact posttranslational processing and/or stability of the protein, as is the case with neurogenin1, whose half-life is regulated by phosphorylation of a threonine residue (T188) in the loop region, which is highly conserved across neurogenin paralogs (37).

We also observed that loss of one allele of NEUROG3 resulted in a substantial reduction in endocrine cell number. Furthermore, we found a substantial increase in polyhormonal cells following in vivo engraftment of NEUROG3+/- pancreatic precursors relative to NEUROG3+/+, consistent with the observation that timing and dose of NEUROG3 may impact specification of endocrine subtypes (38). The developmental phenotype associated with Neurog3 haploinsufficiency in mice is not as dramatic; however, postnatally these animals have reduced islet mass and are glucose intolerant (30). In contrast, NEUROG3 heterozygous parents have normal glucose tolerance (20), suggesting either that any reduction in islet mass is not sufficient to impair glucose regulation or that the mutations are only partial loss-of-function and retain sufficient activity for normal endocrine pancreas development. Lastly, the in vitro nature of our model may render pancreatic cells more sensitive to reduced levels of NEUROG3. Another interesting observation is that loss of one allele of NEUROG3 caused more than a 50% reduction in mRNA, consistent with mouse studies showing that Neurog3 participates in a feed-forward loop with both Foxa2 (39) and Myt1 (33) and that a certain threshold of Neurog3 is required to maintain this regulatory loop.

In conclusion, we have used genetic targeting to demonstrate that NEUROG3 is required for development of human endocrine pancreatic cells. These studies suggest that the described human mutations in NEUROG3 are hypomorphic and that the residual function is sufficient for endocrine pancreas function in human patients. Moreover, we have demonstrated that this approach can be used to manipulate the human genome and study human embryonic organ development in a way that was previously only possible in model organisms. However, these studies also demonstrate that endocrine pancreas development is highly conserved between humans and mice and emphasize the utility of the mouse as a model organism to study human development.

Acknowledgements

P.S.M. and J.M.W. designed the study, interpreted results and wrote the manuscript. P.S.M. performed all experiments with the exception of the mouse transplantations, which were carried out by C.W. C.I. contributed to experiments, tissue processing, and image quantitation. C.I. and M.H. read and provided input on the manuscript. We thank Drs. Aaron Zorn and Kyle McCracken for scientific discussion. This study was supported by NIH grants R01DK080823 and R01DK092456. We also acknowledge core support from the Cincinnati Digestive Disease Center Award (P30 DK0789392) and the Clinical Translational Science Award (U54 RR025216). We thank the CCHMC Pluripotent Stem Cell Facility, Confocal Imaging Core, Flow Cytometry Core, Cytogenetics Core, and Lenti-shRNA Library Core for support and services.

1. Schwitzgebel VM, Scheel DW, Conners JR, Kalamaras J, Lee JE, Anderson DJ, et al. Expression of neurogenin3 reveals an islet cell precursor population in the pancreas. Development. 2000 Aug;127(16):3533–42.

2. Apelqvist A, Li H, Sommer L, Beatus P, Anderson DJ, Honjo T, et al. Notch signalling controls pancreatic cell differentiation. Nature. 1999 Aug 26;400(6747):877–81.

3. Gu G, Dubauskaite J, Melton DA. Direct evidence for the pancreatic lineage: NGN3+ cells are islet progenitors and are distinct from duct progenitors. Development. 2002 May;129(10):2447–57.

4. Miettinen PJ, Huotari M, Koivisto T, Ustinov J, Palgi J, Rasilainen S, et al. Impaired migration and delayed differentiation of pancreatic islet cells in mice lacking EGF-receptors. Development. 2000;127:2617–27.

5. Gradwohl G, Dierich A, LeMeur M, Guillemot F. Neurogenin3 Is Required for the Development of the Four Endocrine Cell Lineages of the Pancreas. Proc Natl Acad Sci U S A. 2000 Feb 15;97(4):1607–11.

6. Lee JC, Smith SB, Watada H, Lin J, Scheel D, Wang J, et al. Regulation of the Pancreatic Pro-Endocrine Gene Neurogenin3. Diabetes. 2001;50(May):928–36.

7. Xu X, D’Hoker J, Stangé G, Bonné S, De Leu N, Xiao X, et al. Beta cells can be generated from endogenous progenitors in injured adult mouse pancreas. Cell. 2008 Jan 25;132(2):197–207.

8. Huang H-PH, Liu MIN, El-Hodiri HM, Chu K, Jamrich M, Tsai M-J. Regulation of the Pancreatic Islet-Specific Gene BETA2 (neuroD) by Neurogenin 3. Mol Cell Biol. 2000 May 1;20(9):3292–307.

9. Soyer J, Flasse L, Raffelsberger W, Beucher A, Orvain C, Peers B, et al. Rfx6 is an Ngn3-dependent winged helix transcription factor required for pancreatic islet cell development. Development. 2010 Jan;137(2):203–12.

10. Sosa-Pineda B, Chowdhury K, Torres M, Oliver G, Gruss P. The Pax4 gene is essential for differentiation of insulin-producing beta cells in the mammalian pancreas. Nature. 1997;386:399–402.

11. Henseleit KD, Nelson SB, Kuhlbrodt K, Hennings JC, Ericson J, Sander M. NKX6 transcription factor activity is required for alpha- and beta-cell development in the pancreas. Development. 2005 Jul;132(13):3139–49.

12. Sander M, Sussel L, Conners J, Scheel D, Kalamaras J, Dela Cruz F, et al. Homeobox gene Nkx6.1 lies downstream of Nkx2.2 in the major pathway of beta-cell formation in the pancreas. Development. 2000 Dec;127(24):5533–40.

40 13. Collombat P, Mansouri A, Hecksher-Sorensen J, Serup P, Krull J, Gradwohl G, et al. Opposing actions of Arx and Pax4 in endocrine pancreas development. Genes Dev. 2003 Oct 15;17(20):2591–603.

14. Jenny M, Uhl C, Roche C, Duluc I, Guillemot F, Jensen J, et al. Neurogenin3 is differentially required for endocrine cell fate specification in the intestinal and gastric epithelium. EMBO J. 2002;21(23).

15. Lee CS, Perreault N, Brestelli JE, Kaestner KH. Neurogenin 3 is essential for the proper specification of gastric enteroendocrine cells and the maintenance of gastric epithelial cell identity. Genes Dev. 2002 Jun 15;16(12):1488–97.

16. Lopez-Diaz L, Jain RN, Keeley TM, VanDussen KL, Brunkan CS, Gumucio DL, et al. Intestinal Neurogenin 3 directs differentiation of a bipotential secretory progenitor to endocrine cell rather than goblet cell fate. Dev Biol. 2007;309:298–305.

17. Ootani A, Li X, Sangiorgi E, Ho QT, Ueno H, Toda S, et al. Sustained in vitro intestinal epithelial culture within a Wnt-dependent stem cell niche. Nat Med. 2009;15:701–6.

18. Wang J, Galen C, Wu V, Tran R, Cho J-H, Tsai M-J, et al. Mutant neurogenin-3 in congenital malabsorptive diarrhea. N Engl J Med. 2006 Apr 26;356(17):1781–2; author reply 1782.

19. Pinney SE, Oliver-Krasinski J, Ernst L, Hughes N, Patel P, Stoffers DA, et al. Neonatal diabetes and congenital malabsorptive diarrhea attributable to a novel mutation in the human neurogenin-3 gene coding sequence. J Clin Endocrinol Metab. 2011 Jul;96(7):1960–5.

20. Rubio-Cabezas O, Jensen JN, Hodgson MI, Codner E, Ellard S, Serup P, et al. Permanent neonatal diabetes and enteric anendocrinosis associated with biallelic mutations in NEUROG3. Diabetes. 2011 Apr;60(4):1349–53.

21. Rubio-Cabezas O, Codner E, Flanagan SE, Gómez JL, Ellard S, Hattersley AT. Neurogenin 3 is important but not essential for pancreatic islet development in humans. Diabetologia. 2014 Aug 14;2:3–6.

22. Ding Q, Regan SN, Xia Y, Oostrom LA, Cowan CA, Musunuru K. Enhanced efficiency of human pluripotent stem cell genome editing through replacing TALENs with CRISPRs. Cell Stem Cell. 2013 Apr 4;12(4):393–4.

23. Truett GE, Heeger P, Mynatt RL, Truett AA, Walker JA, Warman ML. Preparation of PCR-Quality Mouse Genomic DNA with Hot Sodium Hydroxide and Tris (HotSHOT). Biotechniques. 2000;29(1):52–4.

24. Chang C-T, Tsai C-N, Tang CY, Chen C-H, Lian J-H, Hu C-Y, et al. Mixed sequence reader: a program for analyzing DNA sequences with heterozygous base calling. ScientificWorldJournal. 2012 Jan;2012:365104.

25. Ran FA, Hsu PD, Wright J, Agarwala V, Scott DA, Zhang F. Genome engineering using the CRISPR-Cas9 system. Nat Protoc. 2013 Nov;8(11):2281–308.

26. D’Amour KA, Agulnick AD, Eliazer S, Kelly OG, Kroon E, Baetge EE. Efficient differentiation of human embryonic stem cells to definitive endoderm. Nat Biotechnol. 2005 Dec;23(12):1534–41.

27. Kroon E, Martinson LA, Kadoya K, Bang AG, Kelly OG, Eliazer S, et al. Pancreatic endoderm derived from human embryonic stem cells generates glucose-responsive insulin-secreting cells in vivo. Nat Biotechnol. 2008 Apr;26(4):443–52.

28. Rezania A, Bruin JE, Riedel MJ, Mojibian M, Asadi A, Xu J, et al. Maturation of human embryonic stem cell-derived pancreatic progenitors into functional islets capable of treating pre-existing diabetes in mice. Diabetes. 2012;61:2016–29.

29. Pagliuca FW, Melton DA. How to make a functional Beta-cell. Development. 2013 May 28;140(12):2472–83.

30. Wang S, Yan J, Anderson DA, Xu Y, Kanal MC, Cao Z, et al. Neurog3 gene dosage regulates allocation of endocrine and exocrine cell fates in the developing mouse pancreas. Dev Biol. 2010 Mar 1;339(1):26–37.

31. Mellitzer G, Bonné S, Luco RF, Van De Casteele M, Lenne-Samuel N, Collombat P, et al. IA1 is NGN3-dependent and essential for differentiation of the endocrine pancreas. EMBO J. 2006 Mar 22;25(6):1344–52.

32. Jennings RE, Berry AA, Kirkwood-Wilson R, Roberts NA, Hearn T, Salisbury RJ, et al. Development of the human pancreas from foregut to endocrine commitment. Diabetes. 2013 Oct;62(10):3514–22.

33. Wang S, Hecksher-Sorensen J, Xu Y, Zhao A, Dor Y, Rosenberg L, et al. Myt1 and Ngn3 form a feed-forward expression loop to promote endocrine islet cell differentiation. Dev Biol. 2008 May 15;317(2):531–40.

34. Ohsie S, Gerney G, Gui D, Kahana D, Martín MG, Cortina G. A paucity of colonic enteroendocrine and/or enterochromaffin cells characterizes a subset of patients with chronic unexplained diarrhea/malabsorption. Hum Pathol. 2009 Jul;40(7):1006–14.

35. Sayar E, Islek A, Yilmaz A, Akcam M, Flanagan SE, Artan R. Extremely rare cause of congenital diarrhea: Enteric anendocrinosis. Pediatr Int. 2013 Oct;55(5):661–3.

36. Spence JR, Mayhew CN, Rankin SA, Kuhar MF, Vallance JE, Tolle K, et al. Directed differentiation of human pluripotent stem cells into intestinal tissue in vitro. Nature. 2011 Feb 3;470(7332):105–9.

37. Vosper JMD, Fiore-Heriche CS, Horan I, Wilson K, Wise H, Philpott A. Regulation of neurogenin stability by ubiquitin-mediated proteolysis. Biochem J. 2007 Oct 15;407(2):277–84.

38. Desgraz R, Herrera PL. Pancreatic neurogenin 3-expressing cells are unipotent islet precursors. Development. 2009;136:3567–74.

39. Ejarque M, Cervantes S, Pujadas G, Tutusaus A, Sanchez L, Gasa R. Neurogenin3 cooperates with Foxa2 to autoactivate its own expression. J Biol Chem. 2013 Apr 26;288(17):11705–17.

Figure legends

Figure 1. CRISPR/Cas9-mediated mutagenesis disrupts expression of NEUROG3 in

differentiated pancreatic precursors.

(A) Adapted image from UCSC genome browser illustrating the full NEUROG3 gene with

aligned sites targeted by the designed guide RNAs (gRNA1 and gRNA2) and the primers used

for sequencing. Vertebrate conservation is illustrated by the histogram.

(B) Sequenced genotypes of NEUROG3 wild type, heterozygous (+/-), and knock-out (-/-) clones

generated independently using either gRNA-1 or -2. The NEUROG3 start codon is indicated in

green. The targeted mutation in each NEUROG3 allele (Al-1 and Al-2) is indicated. The Cas9 endonuclease cut sites are indicated by the scissor and the protospacer adjacent motif (PAM) is indicated in red. The insertions (red) or deletions (-) in NEUROG3 are indicated on the right.

(C) Schematic summarizing the 4-stage directed differentiation of human PSCs to pancreatic

precursors. The Y-axis lists the reagents and growth factors used and the X-axis shows the time

and stage that each factor was used.

(D) Representative time-course of H1 NEUROG3+/+, NEUROG3+/-, and NEUROG3-/- hESCs differentiated to pancreatic precursors. mRNA for markers of several developmental stages indicated in C were assessed by qPCR (n=3, representative of 4 separate experiments).

(E) NEUROG3 protein expression in NEUROG3+/+, NEUROG3+/-, and NEUROG3-/- pancreatic

precursors (differentiation day 12). NEUROG3-positive cells (examples illustrated with yellow

arrowheads) were counted (F) and nuclear expression was quantified (G) by

immunofluorescence and high content analysis.

(H) The nuclear levels of PDX1 protein in cells either positive or negative for NEUROG3 were

compared across cell lines (NEUROG3+/+, NEUROG3+/-, NEUROG3-/-). The data are displayed

with a box-and-whisker plot and the whiskers represent the minimum and maximum values.

n=number of total nuclei counted. Scale bars = 50 μm.

Data are represented as mean ± SEM. *p<0.05, **p<0.01, ***p<0.001.
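The indel sizes annotated in panel B can be checked for their likely effect on the reading frame. The following is a minimal sketch, not the authors' analysis: it assumes that a net length change not divisible by 3 shifts the reading frame downstream of the start codon, and it ignores other ways an indel can disrupt a protein. The function names are hypothetical; the indel annotations are those shown in the Figure 1B alignment.

```python
import re

# Indel annotations (bp) as shown in the Figure 1B alignment.
# "-2/+6" denotes a 2-bp deletion combined with a 6-bp insertion.
ALLELES = {
    "C3 allele 1 (gRNA-1, -/-)": "-7",
    "C3 allele 2 (gRNA-1, -/-)": "-1",
    "C8 allele 2 (gRNA-1, +/-)": "-8",
    "A5.3 allele 1 (gRNA-2, -/-)": "+11",
    "A5.3 allele 2 (gRNA-2, -/-)": "-2/+6",
    "B2.3 allele 2 (gRNA-2, +/-)": "-19",
}


def net_indel(annotation: str) -> int:
    """Sum the signed components of an annotation such as '-2/+6'."""
    return sum(int(part) for part in re.findall(r"[+-]\d+", annotation))


def shifts_frame(annotation: str) -> bool:
    """Assumption: a net change not divisible by 3 causes a frameshift."""
    return net_indel(annotation) % 3 != 0


for allele, annotation in ALLELES.items():
    effect = "frameshift" if shifts_frame(annotation) else "in-frame"
    print(f"{allele}: net {net_indel(annotation):+d} bp, {effect}")
```

Under this assumption every allele listed above is a frameshift, whereas a 3-bp deletion would be classified as in-frame.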

Figure 2. NEUROG3 is required for specification of human pancreatic endocrine cells in

vitro.

(A) NEUROG3+/+, NEUROG3+/-, and NEUROG3-/- hESC lines were differentiated and then

analyzed on day 12 for markers of pancreatic precursors (PDX1 and NKX6.1) and endocrine

cells (CHGA) by immunofluorescence. Representative images show a complete loss of the

pan-endocrine marker CHGA in NEUROG3-/- cells.

(B) Analysis of hormone expressing cells in day 12 cultures. Insulin - INS, glucagon - GCG, and somatostatin - SST.

(B’) High magnification image of highlighted box in B with separated channels to show expression of individual hormones in the same cells.

(C) Analysis of genes involved in endocrine lineage commitment and development by qPCR.

Genes were subdivided into NEUROG3-dependent, -partially dependent, or -independent expression categories (n=3, representative of 4 separate experiments). NEUROG3+/- and

NEUROG3-/- lines generated using gRNA-1 were compared to parental H1 ESCs

(NEUROG3+/+).

(D) NEUROG3 target genes have the same response to loss of NEUROG3 in hESC clones

generated from a second NEUROG3 guide RNA (gRNA-2 shown in figure 1A). NEUROG3+/- and NEUROG3-/- lines show reduced and absent expression of NEUROD1 and CHGA as

compared to a NEUROG3+/+ control line.

Scale bars = 50 μm. Data are represented as mean ± SEM. *p<0.05, **p<0.01, ***p<0.001.
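The qPCR panels report expression as fold change relative to a calibrator sample (definitive endoderm or PSCs). The legends do not state how fold change was computed, so the sketch below assumes the common 2^-ΔΔCt method with a single reference gene. The function name and all Ct values are hypothetical and serve only to illustrate the arithmetic.

```python
def fold_change(ct_target: float, ct_ref: float,
                ct_target_cal: float, ct_ref_cal: float) -> float:
    """Relative expression by the 2^-ddCt method (assumed, not stated in the source).

    ct_target / ct_ref: Ct of the gene of interest and of the reference gene
    in the sample; ct_target_cal / ct_ref_cal: the same in the calibrator
    (for example, definitive endoderm).
    """
    delta_sample = ct_target - ct_ref          # normalize sample to reference gene
    delta_calibrator = ct_target_cal - ct_ref_cal
    return 2.0 ** -(delta_sample - delta_calibrator)


# Hypothetical Ct values: the target amplifies 5 cycles earlier in the
# sample than in the calibrator, with identical reference-gene Ct values.
print(fold_change(ct_target=20.0, ct_ref=18.0,
                  ct_target_cal=25.0, ct_ref_cal=18.0))
```

Each cycle of difference corresponds to a two-fold change, which is why fold changes in the tens of thousands, as plotted for some genes, correspond to Ct differences of roughly 13 to 16 cycles.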

Figure 3. NEUROG3 is required for endocrine maturation in vivo.

(A) Human endocrine precursors WT, het, and null for NEUROG3 were transplanted into the pancreas of NSG mice and allowed to mature for 6 weeks, then analyzed for expression of pancreatic hormones insulin (INS), glucagon (GCG), and somatostatin (SST) by immunofluorescence. Human tissue is distinguished from mouse by co-staining for human nuclear antigen (HNUC).

(B) The total numbers of mono- and polyhormonal cells were quantified by high content imaging

(n=3 transplants each for NEUROG3+/+, NEUROG3+/-, and NEUROG3-/- lines, data represented as the total number of endocrine cells as a percentage of all human cells counted).

(C) A similar analysis was performed on in vitro derived hormone expressing cells to compare the relative proportion of polyhormonal cells in vitro with in vivo matured cells.

(D) Immunofluorescence staining for cells co-expressing INS, NKX6.1, and PDX1. Co-expression of these markers indicates the presence of mature β-cells. Scale bars = 50 μm.
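The mono- versus polyhormonal quantification in panels B and C amounts to counting, per cell, how many of the three hormones are detected and expressing each class as a percentage of all cells counted. A minimal sketch follows, assuming per-cell positive/negative calls are already available from image analysis; the data structure, function names and cell counts are hypothetical.

```python
from collections import Counter

HORMONES = ("INS", "GCG", "SST")


def classify(cell: dict) -> str:
    """Classify one cell by the number of hormones scored positive."""
    n_positive = sum(bool(cell.get(h)) for h in HORMONES)
    if n_positive == 0:
        return "hormone-negative"
    return "monohormonal" if n_positive == 1 else "polyhormonal"


def percentages(cells: list) -> dict:
    """Percentage of all counted cells that are mono- or polyhormonal."""
    counts = Counter(classify(c) for c in cells)
    total = len(cells)
    return {k: 100.0 * counts[k] / total
            for k in ("monohormonal", "polyhormonal")}


# Hypothetical population: 8 INS-only, 2 INS+GCG, 90 hormone-negative cells.
cells = ([{"INS": True}] * 8
         + [{"INS": True, "GCG": True}] * 2
         + [{}] * 90)
print(percentages(cells))
```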

Figure 4. Efficient shRNA-based knockdown of NEUROG3 reduces, but does not abolish, hormone expression in differentiated human ESCs.

(A) Human PSCs constitutively expressing a NEUROG3-silencing shRNA were differentiated into pancreatic precursors. Markers of pancreas and endocrine differentiation (PDX1, NEUROD,

INS, GCG) were assessed by quantitative PCR (n=2, representative of 3 separate experiments).

(B) Day 12 cultures were analyzed by immunofluorescence for PDX1 and INS.

Scale bars = 50 μm. Data are represented as mean ± SEM.


Figure S1. (A) UCSC Genome Browser view of region immediately downstream of the

NEUROG3 start codon. All candidate gRNA target sites which meet the G(N19)NGG constraint are aligned to illustrate the flexibility of CRISPR/Cas9 targeting. The sequences chosen for targeted mutagenesis are highlighted in red. (B) A list of all genotypes and sequences from expanded subclones illustrating all identified insertions and deletions (INDELs). The clone is indicated by the first letter/number (C1-C8 for gRNA#1) (A5, B2, C2, C5 for gRNA#2). The wild type (WT) allele and alleles with

INDELs are indicated, with the insertions (red) or deletions (-) in NEUROG3 indicated on the right. Some genotypes are only predicted using the Mixed Sequence Reader

(indicated by asterisk, see supplemental methods). Genotypes for all cell lines used for differentiation experiments were confirmed by subcloning and Sanger sequencing and are indicated as wild type (WT), heterozygous (+/-) or homozygous (-/-) loss-of-function for NEUROG3. (C-E) Analysis of expanded clones. All cell lines exhibited normal morphology (C), expressed markers of pluripotency such as NANOG and OCT4 (D), and had normal karyotypes (E) (only NEUROG3-/- clone C3 is shown as an example).

(F) Prediction of guide RNAs with low off-target potential using the CRISPR Design Tool

(http://www.genome-engineering.org). The guide RNA sequences used for NEUROG3

have no predicted matches to any other sites in the genome. The most similar sites,

which had 3 predicted mismatches with both gRNA-1 and gRNA-2 are highlighted in

red. Each off-target site is classified as nongenic or intronic; no off-target sites were

located in exons. Scale bars = 50 μm.
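The G(N19)NGG constraint in panel A and the mismatch counts in panel F can both be reproduced with a few lines of code. The sketch below is illustrative only and is not the CRISPR Design Tool algorithm: it scans both strands of the wild-type fragment printed in the genotype alignment for 23-nt sites matching G(N19)NGG, and counts mismatches over the 20-nt protospacer only (an assumption; the PAM is not scored). The sequences are those printed in Supplemental Figure 1; the function names are hypothetical.

```python
import re

# Wild-type NEUROG3 fragment as printed in the H1(WT) row of the alignment.
WT = ("GATGACGCCTCAACCCTCGGGTGCGCCCACTGTCCAAGTGACCCGTGAGACGGAGCGG"
      "TCCTTCCCCAGAGCCTCGGAAGAC")
GRNA1 = "GTGGGCGCACCCGAGGGTTGAGG"  # gRNA-1 target including PAM
GRNA2 = "GGAAGGACCGCTCCGTCTCACGG"  # gRNA-2 target including PAM

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


# A 5' G, 19 further protospacer bases, then an NGG PAM (23 nt in total).
# The lookahead allows overlapping sites to be reported.
SITE = re.compile(r"(?=(G[ACGT]{19}[ACGT]GG))")


def candidate_sites(seq: str) -> list:
    """Return (strand, site) pairs for every G(N19)NGG match on both strands."""
    found = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        found.extend((strand, m.group(1)) for m in SITE.finditer(s))
    return found


def protospacer_mismatches(guide: str, site: str) -> int:
    """Count mismatches over the first 20 nt, ignoring the 3-nt PAM."""
    return sum(a != b for a, b in zip(guide[:20].upper(), site[:20].upper()))


sites = candidate_sites(WT)
print(("-", GRNA1) in sites, ("-", GRNA2) in sites)
print(protospacer_mismatches(GRNA1, "GTGGGgaCACCCGAGGGcTGTGG"))  # OT-2 row
```

Both chosen guides are found on the strand opposite to the one printed in the alignment, which is consistent with the guide sequences being displayed in reverse orientation above the wild-type sequence in panel B.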

Figure S2. (A) Analysis of efficiency of directed differentiation of hESCs into pancreatic

progenitors at different stages of differentiation. The efficiency of definitive endoderm

differentiation was assessed by co-expression of SOX17 and FOXA2, posterior foregut

by PDX1 expression, and pancreatic precursors by NKX6.1 expression. Endocrine

differentiation was indicated by expression of Chromogranin A (CHGA). Images are

representative of >10 separate experiments. (B) Differentiation into pancreatic

precursors is very efficient with >95% of monolayer cultures expressing PDX1, and 58%

of cells co-expressing NKX6.1. 4% of cells expressed CHGA as assessed by high

content tile-scan imaging. (C) Nuclear NKX6.1 expression levels for NEUROG3+/+,

NEUROG3+/-, and NEUROG3-/- day 12 cultures were compared by

immunofluorescence. (D) Example of a high-resolution confocal tile-scan of day 12 pancreatic

precursors illustrating how data were collected for high-content imaging and quantitative

analysis. PDX1 staining was uniform across the entire monolayer. Analysis of CHGA

expression showed reduction of endocrine cells in NEUROG3+/- lines and complete absence in NEUROG3-/- lines. To generate a NEUROG3 reporter hESC line, the 5.5 kb

promoter region of NEUROG3 was cloned upstream of the fluorescent protein mCherry.

The NEUROG3 reporter construct was stably transduced into H1 stem cells as a

lentiviral vector. (E) Colocalization and quantitation (F) of mCherry fluorescence and

immunostaining for NEUROG3 in day 12 pancreatic precursors. Approximately 90% of

cells expressing mCherry also express NEUROG3, indicating faithful expression. Just

over 50% of NEUROG3+ cells express the reporter mCherry. (G) NEUROG3

expressing cells were collected from day 12 pancreatic precursors by FACS using the

mCherry reporter and compared to NEUROG3-negative cells. mRNA levels of various

genes important for pancreatic and endocrine differentiation were assessed by qPCR.

(H) Day 12 pancreatic precursors were stained for INS, GCG, and SST (see Fig. 2B,B’)

and quantified by high content image analysis (5x5 tile scan). The numbers are total

counts for the number of cells expressing each hormone in both NEUROG3+/+ and

NEUROG3-/- cultures (NEUROG3-/- did not have any detectable hormone expression).

The areas of the circles represent the relative ratios of hormone expression. Overlapping

areas of the circles represent polyhormonal cells (i.e., the yellow area illustrates the

fraction of cells expressing both INS and GCG). Scale bars = 100 μm.
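The two reporter statistics in panels E and F are directional and use different denominators: the share of mCherry-positive cells that also stain for NEUROG3 measures reporter fidelity, while the share of NEUROG3-positive cells that also express mCherry measures reporter coverage. A minimal sketch with hypothetical cell counts, chosen only to land near the approximate percentages quoted in the legend:

```python
def colocalization(both: int, reporter_only: int, protein_only: int) -> tuple:
    """Return (fidelity %, coverage %) from three cell counts.

    fidelity: fraction of reporter-positive cells that are protein-positive.
    coverage: fraction of protein-positive cells that are reporter-positive.
    """
    reporter_total = both + reporter_only
    protein_total = both + protein_only
    return (100.0 * both / reporter_total, 100.0 * both / protein_total)


# Hypothetical counts (not taken from the figure).
fidelity, coverage = colocalization(both=90, reporter_only=10, protein_only=85)
print(f"mCherry+ cells that are NEUROG3+: {fidelity:.1f}%")
print(f"NEUROG3+ cells that are mCherry+: {coverage:.1f}%")
```

A reporter can therefore be faithful (few false positives) while still labeling only about half of the NEUROG3-expressing population.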

Figure S3. Analysis of endocrine cell markers by qPCR in NEUROG3+/+, NEUROG3+/-,

and NEUROG3-/- lines generated from gRNA-1 (A) and gRNA-2 (B). All lines were

differentiated until the pancreatic progenitor stage. Data are represented as mean ±

SEM. *p<0.05, **p<0.01, ***p<0.001.
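The summary statistics used throughout the legends (mean ± SEM and the asterisk notation) can be written out explicitly. In this sketch the SEM is taken as the sample standard deviation divided by the square root of n, and the star thresholds are those given in the legends; the statistical test used to obtain the p-values is not stated in this section, so none is implemented. The function names and replicate values are hypothetical.

```python
import math
import statistics


def mean_sem(values: list) -> tuple:
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


def stars(p: float) -> str:
    """Map a p-value to the legend notation: *p<0.05, **p<0.01, ***p<0.001."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""


# Hypothetical fold-change replicates (n=3).
mean, sem = mean_sem([1.0, 2.0, 3.0])
print(f"{mean:.2f} ± {sem:.2f} {stars(0.03)}")
```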

Figure S4. (A) Engraftment of hESC-derived pancreatic progenitors into the splenic

lobe of the pancreas of immune-deficient NOD-Scid IL-2Rnull (NSG) mice. (B) A

high-resolution confocal tile-scan of engrafted progenitors from NEUROG3+/+, NEUROG3+/-,

NEUROG3-/- lines 6 weeks after transplantation. Staining for insulin (INS), glucagon

(GCG) and somatostatin (SST) detects beta, alpha, and delta cells in the pancreas.

Human cells were distinguished from mouse islets with an antibody that recognizes a

human nuclear antigen (HNUC) and are outlined with yellow dots. (C) The number of

cells expressing INS, GCG, and/or SST was assessed by high content image analysis

for NEUROG3+/+ and NEUROG3+/- transplants. The area of each circle represents the

total cell counts from representative sections (n≥3) of 6-week old transplants (n=3).

Overlapping areas of the circles represent polyhormonal cells (i.e., the yellow area

illustrates the fraction of cells expressing both INS and GCG). (D) Nuclear PDX1 and

(E) NKX6.1 fluorescence were quantified for representative NEUROG3+/+ and

NEUROG3+/- 6-week transplants. Scale bars = 100 μm.

Supplementary Table 1

List of primary antibodies used in this study.

Antigen        Animal       Company, Cat #             Concentration
CHGA           rabbit       Immunostar #20086          1:500
FoxA2          goat         Santa Cruz sc6554 (M-20)   1:500
Glucagon       rabbit       Zymed                      1:500
HNUC           mouse        Chemicon MAB 1281          1:1000
INS            guinea pig   DAKO AO564                 1:1000
NANOG          rabbit       Cell Signaling 4903        1:1000
NEUROG3        mouse        DSHB                       1:100
Nkx6.1         mouse        DSHB F55A10                1:100
Oct4           mouse        Santa Cruz sc-5279         1:500
Pdx1           goat         abcam ab47383-100          1:5000
Somatostatin   goat         sc-7819                    1:1000
Sox17          goat         R&D                        1:500

Supplementary Table 2

List of primers used in this study.

qPCR Primers

Gene      Forward                   Reverse
ARX       CTGCCTTCTCCCGCTTG         CACTACCCGGACGTCTTCAC
CHGA      TGTGTCGGAGATGACCTCAA      GTCCTGGCTCTTCTGCTCTG
GCG       CAGCAAGTATCTGGACTCCAGG    CCAGTTTATAAAGTCCCTGGC
IA1       ACACAACGTAAAAGTGGGGG      GAAAGTGTCGTCTCCGCTTC
INS       GAACCAACACCTGTGCGGCTCA    TGCCTGCGGGCTGCGTCTAGT
MYT1      CCGTGTGTCCACCTCTGATT      TTCATGATTGCTTTCCGTGA
NEUROD    ATCAGCCCACTCTCGCTGTA      GCCCCAGGGTTATGAGACTAT
NEUROG3   CGCCGGTAGAAAGGATGAC       GACGTGGGGCAGGTCACTT
NKX2.2    GGAGCTTGAGTCCTGAGGG       TCTACGACAGCAGCGACAAC
NKX6.1    CGAGTCCTGCTTCTTCTTGG      GGGGATGACAGAGAGTCAGG
OCT4      GTGGAGGAAGCTGACAACAA      CTCCAGGTTGCCTCTCACTC
PAX4      TGTGCAGAGATGATTCCTGG      GAGGGTCTGGTTTTCCAACA
PDX1      CGTCCGCTTGTTCTCCTC        CCTTTCCCATGGATGAAGTC
PTF1A     AGAGAGTGTCCTGCTAGGGG      CCAGAAGGTCATCATCTGCC
RFX6      CCAGTTTTTGAGCTAAGCGAA     TGGCATCAAAGAGAGCAGTG
SOX17     GGCGCAGCAGAATCCAGA        CCACGACTTGCCCAGCAT

Genotyping

Gene                 Forward                Reverse
NEUROG3 Genotyping   CGGTCGTTGGCCTTCTTTCG   CCACCTAGCCTCGGAATCG
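Basic properties of the primers listed above, such as length and GC content, can be computed directly from the sequences. The sketch below also includes the Wallace rule, Tm = 2(A+T) + 4(G+C), as a rough melting-temperature estimate; this rule is an assumption chosen for simplicity, is only approximate for primers of this length, and is not described in the source as the design criterion. The function names are hypothetical.

```python
# A few primers from Supplementary Table 2 (forward, reverse).
PRIMERS = {
    "NEUROG3": ("CGCCGGTAGAAAGGATGAC", "GACGTGGGGCAGGTCACTT"),
    "PDX1": ("CGTCCGCTTGTTCTCCTC", "CCTTTCCCATGGATGAAGTC"),
    "NEUROG3 Genotyping": ("CGGTCGTTGGCCTTCTTTCG", "CCACCTAGCCTCGGAATCG"),
}


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the primer."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough Tm estimate in degrees C: 2 per A/T plus 4 per G/C (Wallace rule)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for gene, pair in PRIMERS.items():
    for direction, seq in zip(("forward", "reverse"), pair):
        print(f"{gene} {direction}: {len(seq)} nt, "
              f"GC {gc_percent(seq):.1f}%, Tm ~{wallace_tm(seq)} C")
```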

Figure 1

(A) UCSC Genome Browser view (hg19, chr10, approximately 71,331,900 to 71,333,200; scale bar 500 bases). Tracks: Blat-aligned sequences (fwd primer, rvs primer, gRNA1, gRNA2), RefSeq Genes (5’ to 3’), and 100 vertebrates basewise conservation by PhyloP (-4.5 to 4.88).

(B) Genotype alignments.

gRNA Target #1 (with PAM): GGAGTTGGGAGCCCACGCGGGTG

Clone             Sequence                                                                             INDEL
WT (+)            GATGACGCCTCAACCCTCGGGTGCGCCCACTGTCCAAGTGACCCGTGAGACGGAGCGGTCCTTCCCCAGAGCCTCGGAAGA
-/- (C3 Al1)      GATGAC------CCCTCGGGTGCGCCCACTGTCCAAGTGACCCGTGAGACGGAGCGGTCCTTCCCCAGAGCCTCGGAAGA     -7
    (C3 Al2)      GATGACGCCTCAA-CCTCGGGTGCGCCCACTGTCCAAGTGACCCGTGAGACGGAGCGGTCCTTCCCCAGAGCCTCGGAAGA    -1
+/- (C8 Al2)      GATGAC------CCTCGGGTGCGCCCACTGTCCAAGTGACCCGTGAGACGGAGCGGTCCTTCCCCAGAGCCTCGGAAGA      -8

gRNA Target #2 (with PAM): GGCACTCTGCCTCGCCAGGAAGG

Clone             Sequence                                                                             INDEL
WT (+)            GATGACGCCTCAACCCTCGGGTGCGCCCACTGTCCAAGTGACCCGTGAGACGGAGCGGTCCTTCCCCAGAGCCTCGGAAGA    NA
-/- (A5.3 Al-1)   GATGACGCCTCAACCCTCGGGTGCGCCCACTGTCCAAGTGACCCGTCCAAGTGTCCAAGGACGGAGCGGTCCTTCCCCAGA    +11
    (A5.3 Al-2)   GATGACGCCTCAACCCTCGGGTGCGCCCACTGTCCAAGTG--GGGAAGGACGGAGCGGTCCTTCCCCAGAGCCTCGGAAGA    -2/+6
+/- (B2.3 Al-2)   GATGACGCCTCAACCCTCGGGTGCGCCCACTGTCCAAGTGACCCGTGAG------GAGCCTCGGAAGA                 -19

(C) Differentiation timeline, days -2 to 12: Split hESC (2 days), Definitive Endoderm (3 days), Gut Tube (2 days), Posterior Foregut (4 days), Pancreatic Precursors (3 days). Reagents and media as labeled: Y-27632, mTesR, 0.2% FBS, 2% FBS, RPMI 1640, 1% B27, HG-DMEM, [50ng/ml] BMP4, [100ng/ml] Activin A, [50ng/ml] FGF-7, [2μM] RA, [50ng/ml] [25ng/ml] Noggin.

(D) qPCR time course (PSC, D3, D9, D12) for OCT4, SOX17, PDX1, and NKX6.1 in +/+ (H1), +/- (Het), and -/- (Null) lines. Y-axis: fold change (to PSC).

(E) NEUROG3 immunofluorescence in NEUROG3+/+, NEUROG3+/-, and NEUROG3-/- cultures. (F) NEUROG3-positive cells, % of all counted cells. (G) Nuclear NEUROG3 (RFU).

(H) Nuclear PDX1 (RFU) in NEUROG3-positive (pos) and NEUROG3-negative (neg) cells for each genotype. Nuclei counted, as labeled: pos (n=0), (n=46), (n=121); neg (n=6915), (n=8155), (n=5002). Annotations: **, p=0.084, N.A.

Figure 2

(A) Immunofluorescence panels for NEUROG3-/-, NEUROG3+/-, and NEUROG3+/+ cultures.

(B, B’) Immunofluorescence for INS, GCG, and SST with DAPI in NEUROG3+/+, NEUROG3+/-, and NEUROG3-/- cultures.

(C) gRNA-1 pancreatic precursors, qPCR for NEUROD1, PTF1A, PDX1, CHGA, NKX2.2, NKX6.1, ARX, INS, RFX6, IA1, GCG, and SST in +/+, +/-, and -/- lines. Y-axis: fold change (compared to DE). Genes are grouped as completely dependent, partially dependent, or independent.

(D) gRNA-2 pancreatic precursors, qPCR for NEUROD1, CHGA, and PDX1 in +/+, +/-, and -/- lines. Y-axis: fold change (compared to DE). Genes are grouped as completely dependent or independent.

Figure 3

(A) Immunofluorescence for INS, GCG, SST, and HNUC in NEUROG3+/+, NEUROG3+/-, and NEUROG3-/- transplants.

(B) Matured (in vivo): polyhormonal and monohormonal cells by NEUROG3 genotype (+/+, +/-, -/-). Y-axis: % of all counted cells.

(C) Precursors (in vitro): polyhormonal and monohormonal cells by NEUROG3 genotype (+/+, +/-, -/-). Y-axis: % of total cells counted.

(D) Immunofluorescence for INS, NKX6.1, and PDX1 in NEUROG3+/+ and NEUROG3+/- transplants.

Figure 4

(A) qPCR time course (D3, D9, D11, D12) for PTF1A, PDX1, INS, NEUROG3, NEUROD, and GCG in WT and SH lines. X-axis: days of differentiation. Y-axis: fold change (normalized to D3).

(B) Immunofluorescence for PDX1 and INS (PDX1/INS merge) in H1 control and H1 shNEUROG3 cultures.

Supplemental Figure 1

(A) UCSC Genome Browser view (hg19, chr10, approximately 71,332,550 to 71,332,800; scale bar 100 bases). Tracks: Blat-aligned candidate gRNA sites (g1, g2, F1 to F9, R1 to R8), RefSeq Genes (NEUROG3), and 100 vertebrates basewise conservation by PhyloP (-4.5 to 4.88).

(B) Genotype alignments.

gRNA Target #1 (with PAM): GGAGTTGGGAGCCCACGCGGGTG

Genotype  Clone            Sequence                                                                              INDEL
          H1(WT)           GATGACGCCTCAACCCTCGGGTGCGCCCACTGTCCAAGTGACCCGTGAGACGGAGCGGTCCTTCCCCAGAGCCTCGGAAGAC
-/-       C3(Allele1)      GATGAC------CCCTCGGGTGCGCCCACTGTCCAAGTGACCCGTGAGACGGAGCGGTCCTTCCCCAGAGCCTCGGAAGAC     -7
          C3(Allele2)      GATGACGCCTCAA-CCTCGGGTGCGCCCACTGTCCAAGTGACCCGTGAGACGGAGCGGTCCTTCCCCAGAGCCTCGGAAGAC    -1
+/-       C8(het)          GATGAC------CCTCGGGTGCGCCCACTGTCCAAGTGACCCGTGAGACGGAGCGGTCCTTCCCCAGAGCCTCGGAAGAC      -8
          *C2(Allele1/2)   GATGACGCCT---CCCTCGGGTGCGCCCACTGTCCAAGTGACCCGTGAGACGGAGCGGTCCTTCCCCAGAGCCTCGGAAGAC    -3
          *C5(Allele1)     G------GTGCGCCCACTGTCCAAGTGACCCGTGAGACGGAGCGGTCCTTCCCCAGAGCCTCGGAAGAC                 -19
          *C5(Allele2)     GATGACGC------CCTCGGGTGCGCCCACTGTCCAAGTGACCCGTGAGACGGAGCGGTCCTTCCCCAGAGCCTCGGAAGAC    -6
          *C7(Allele1)     GATGAC------CCTCGGGTGCGCCCACTGTCCAAGTGACCCGTGAGACGGAGCGGTCCTTCCCCAGAGCCTCGGAAGAC      -8
          *C7(Allele2)     GATGAC------CCCTCGGGTGCGCCCACTGTCCAAGTGACCCGTGAGACGGAGCGGTCCTTCCCCAGAGCCTCGGAAGAC     -7
          *C6(het)         GATGACGCCTGGACCCTCGGGTGCGCCCACTGTCCAAGTGACCCGTGAGACGGAGCGGTCCTTCCCCAGAGCCTCGGAAGAC    -2/+2
          *C4(WT)
          *C1(WT)

gRNA Target #2 (with PAM): GGCACTCTGCCTCGCCAGGAAGG

Genotype  Clone            Sequence                                                                              INDEL
          H1(WT)           GATGACGCCTCAACCCTCGGGTGCGCCCACTGTCCAAGTGACCCGTGAGACGGAGCGGTCCTTCCCCAGAGCCTCGGAAGAC
          A5-2(het)        GATGACGCCTCAACCCTCGGGTGCGCCCACTGTCCAAG------GACGGAGCGGTCCTTCCCCAGAGCCTCGGAAGAC        -10
-/-       A5-3(Allele1)    GATGACGCCTCAACCCTCGGGTGCGCCCACTGTCCAAGTGACCCGTCCAAGTGTCCAAGGACGGAGCGGTCCTTCCCCAGAG    +11
          A5-3(Allele2)    GATGACGCCTCAACCCTCGGGTGCGCCCACTGTCCAAGTG--GGGAAGGACGGAGCGGTCCTTCCCCAGAGCCTCGGAAGAC    -2/+6
+/-       B2-3(het)        GATGACGCCTCAACCCTCGGGTGCGCCCACTGTCCAAGTGACCCGTGAG------GAGCCTCGGAAGAC                 -19
          C5-3(Allele1)    GATGACGCCTCAACCCTCGGGTGCGCCCACTGTCCAAGTGACCCGTG-GACGGAGCGGTCCTTCCCCAGAGCCTCGGAAGAC    -1
          C5-3(Allele2)    GATGACGCCTCAACCCTCGGGTGCGCCCACTGTCCAAGTGACCCG---GAAGGACGGAGCGGTCCTTCCCCAGAGCCTCGGA    -3/+4
          B1-1(het)        GATGACGCCTCAACCCTCGGGTGCGCCCACTGTCCAAGTGA------CCTTCCCCAGAGCCTCGGAAGAC                -18
          C3-2(WT)         GATGACGCCTCAACCCTCGGGTGCGCCCACTGTCCAAGTGACCCGTGAGACGGAGCGGTCCTTCCCCAGAGCCTCGGAAGAC    NA
          *A5-1(WT)        GATGACGCCTCAACCCTCGGGTGCGCCCACTGTCCAAGTGACCCGTGAGACGGAGCGGTCCTTCCCCAGAGCCTCGGAAGAC    NA
          *C2-2(het)       GATGACGCCTCAACCCTCGGGTGCGCCCACTGTCCAAGTGACCCCT------GAGCGGTCCTTCCCCAGAGCCTCGGAAGAC    -6
          *C2-3(WT)        GATGACGCCTCAACCCTCGGGTGCGCCCACTGTCCAAGTGACCCGTGAGACGGAGCGGTCCTTCCCCAGAGCCTCGGAAGAC    NA
          *C5-1(het)       GATGACGCCTCAACCCTCGGGTGCGCCCACTGTCCAAGTGACCCGT------GAGCGGTCCTTCCCCAGAGCCTCGGAAGAC    -6

*Predicted with Mixed Sequence Reader, not confirmed by subcloning

(C) Morphology. (D) Immunofluorescence for OCT4 and NANOG with DAPI. (E) Karyotype.

(F) Predicted off-target sites.

gRNA-1: GTGGGCGCACCCGAGGGTTGAGG

Off Target   Sequence                  Mismatches   Position         Strand   Location   RefSeq ID   Gene
OT-1         GTGGGCGCACCCGAGGGTTGAGG   0            chr10 69573014   +        target     NM_020999   NEUROG3
OT-2         GTGGGgaCACCCGAGGGcTGTGG   3            chr14 88186220   -        intron     NM_138318   KCNK10
OT-3         GTGGGCtCACaCGgGGGTTGAGG   3            chr1 29712999    -        nongenic
OT-4         cTGGGCcCACCCGAGGGcTGGAG   3            chr9 1.37E+08    -        nongenic
OT-5         GTGGGCaCAtCtGAGGGTTGCAG   3            chrX 54554143    +        intron     NM_019067   GNL3L
OT-6         GTGaGCaCACCCGAGGGaTGAAG   3            chr16 29661652   +        nongenic
OT-7         GTGtGCGaACCCGAGtGTTGCAG   3            chr13 35880332   +        intron     NM_004734   DCLK1

gRNA-2: GGAAGGACCGCTCCGTCTCACGG

Off Target   Sequence                  Mismatches   Position         Strand   Location   RefSeq ID      Gene
OT-1         GGAAGGACCGCTCCGTCTCACGG   0            chr10 69572979   +        target     NM_020999      NEUROG3
OT-2         GGAAtGcCCGCTCCaTCTCATGG   3            chr3 55653890    +        intron     NM_015576      ERC2
OT-3         GGgAGcACCGCaCCGTCTCACGG   3            chr9 504194      +        intron     NM_001256876   KANK1
OT-4         GGAAGGACgaCTCCcTCTCACAG   3            chr20 57943314   +        nongenic
OT-5         GGAAGGACCaCTCtGTgTCAGAG   3            chr20 62427912   -        nongenic
OT-6         GGAAGGACCaCTaCcTCTCAGAG   3            chr10 80524940   -        nongenic

Supplemental Figure 2

(A) Immunofluorescence at Day 3 (DE: SOX17, FOXA2), Day 9 (PF: PDX1), and Day 12 (pancreatic precursors: NKX6.1, PDX1, CHGA).

(B) PDX1+, PDX1+/NKX6.1+, and PDX1+/NKX6.1+/CHGA+ cells. Y-axis: % of all counted cells.

(C) Nuclear NKX6.1 (RFU) for NEUROG3+/+, NEUROG3+/-, and NEUROG3-/- cultures.

(D) PDX1/CHGA tile scans for NEUROG3+/+, NEUROG3+/-, and NEUROG3-/- cultures.

(E) Reporter schematic (5.5kb NEUROG3 promoter, mCherry) and immunofluorescence for NEUROG3 and mCherry. (F) % colocalization: mCherry colocalized with NEUROG3, and NEUROG3 colocalized with mCherry.

(G) qPCR of mCherry-sorted cells for NEUROG3, NEUROD, NKX2.2, PAX4, CHGA, GCG, SST, INS, IA1, RFX6, PDX1, NKX6.1, and PTF1A. Y-axis: fold change (to mCherry-negative cells).

(H) Venn diagrams of INS-, GCG-, and SST-expressing cell counts for NEUROG3+/+, NEUROG3+/-, and NEUROG3-/- cultures.

Supplemental Figure 3

(A) gRNA-1 pancreatic precursors, qPCR for NEUROG3, INS, GCG, PAX4, and MYT1 in +/+, +/-, and -/- lines. Y-axis: fold change (compared to DE). Category label: completely dependent.

(B) gRNA-2 pancreatic precursors, qPCR for NKX6.1, RFX6, PTF1A, IA1, ARX, GCG, MYT1, NEUROG3, INS, PAX4, and NKX2.2 in +/+, +/-, and -/- lines. Y-axis: fold change (compared to DE). Category labels: independent, partially dependent, completely dependent.

Supplemental Figure 4

(A) Engraftment into splenic lobe of the pancreas, 4-6 weeks in vivo.

(B) Immunofluorescence for INS, GCG, SST, and HNUC in NEUROG3+/+, NEUROG3+/-, and NEUROG3-/- transplants.

(C) Venn diagrams of INS-, GCG-, and SST-expressing cell counts for NEUROG3+/+ and NEUROG3+/- transplants.

(D) Nuclear PDX1 (RFU) and (E) nuclear NKX6.1 (RFU) in INS(+) and INS(-) cells for NEUROG3+/+, NEUROG3+/-, and NEUROG3-/- transplants (N.A. where not applicable).

Functional characterization and disease modeling of NEUROGENIN 3 in human

pluripotent stem cell-derived pancreas and intestinal organoids

Patrick S. McGrath1, Xinghao Zhang1, Jamie Schweitzer1, Jacqueline V. Schiesser1 and

James M. Wells*1,2.

1Division of Developmental Biology, 2Division of Endocrinology, Cincinnati Children’s

Hospital Medical Center, 3333 Burnet Avenue, Cincinnati, OH 45229-3039

*Correspondence:

[email protected]

Telephone: 513-636-8767

Summary

Gastrointestinal hormones are secreted by endocrine cells found throughout the stomach, pancreas, and intestines and are important in regulating various functions such as glycemia, secretion, absorption, satiety, and motility. Neurogenin 3

(NEUROG3) is a basic helix-loop-helix transcription factor required for the development

of all endocrine lineages of the pancreas and intestine. Despite the common lineage

determinant NEUROG3, the pancreas endocrine subtypes are largely independent from

intestinal endocrine cells. Furthermore, multiple patient NEUROG3 mutations have

been characterized by a loss of all intestinal endocrine cells while the endocrine

pancreas remains largely intact. Together, this suggests a context dependent difference

in pancreas versus intestinal requirement for NEUROG3. To investigate this, we utilized

a previously generated NEUROG3-/- human pluripotent stem cell (hPSC) line stably

transduced with an rtTA-NEUROG3 transgene. Upon directed differentiation into

pancreatic or intestinal tissue, endocrine specification is robustly rescued following

NEUROG3 induction with doxycycline. To use this platform to model human disease,

rtTA-NEUROG3 was mutated to include the patient mutations R107S or L135P (missense)

or E28X (nonsense). Encouragingly, NEUROG3R107S perfectly models the human phenotype and

rescues endocrine specification in pancreatic precursors, albeit with reduced efficiency, but is unable to specify endocrine cells in the context of human intestinal organoids (HIOs). Neither

NEUROG3L135P nor NEUROG3E28X exhibit any observable function in pancreatic

precursors or HIOs and are unable to specify any endocrine cell lineages. Finally, we

utilized the inducible construct to identify downstream targets of NEUROG3 in the

context of human pancreatic precursors and HIOs using RNA-seq to identify novel context-specific gene regulatory networks governing human endocrine specification.

Introduction

Endocrine cells are present throughout the gastrointestinal tract and secrete

hormones which regulate many bodily functions including digestion, motility, nutrient absorption, satiety, and blood glycemia. The development of endocrine cells is completely dependent on and regulated by the basic helix-loop-helix transcription factor

Neurogenin 3 (Neurog3). Neurog3 is first detectable in the developing pancreas and intestine at E9.5 and E12.5, respectively (Gradwohl et al., 2000; Jenny et al., 2002).

Neurog3 directs endocrine specification through direct and indirect regulation of a cascade of genes including the transcription factors NeuroD (Huang et al., 2000),

NKX2.2 (Prado et al., 2004; Watada et al., 2003), Pax4 (Smith et al., 2003; Sosa-

Pineda et al., 1997), Arx (Collombat et al., 2003), Rfx6 (Soyer et al., 2010), NKX6.1

(Henseleit et al., 2005; Sander et al., 2000), amongst others. Mice lacking Neurog3 do not form any endocrine lineages of the pancreas or intestine and die shortly after birth with severe hyperglycemia (Gradwohl et al., 2000; Jenny et al., 2002). Ectopic Neurog3 expression in pancreatic progenitors shunts developing cells toward an endocrine fate

(Apelqvist et al., 1999; Schwitzgebel et al., 2000). Similarly, overexpressing Neurog3 throughout the developing intestinal epithelium yields an increased number of enteroendocrine cells at the expense of goblet cells (Lopez-Diaz et al., 2007). Taken together, it is evident that Neurog3 is both necessary and sufficient for endocrine development in the pancreas and intestine.

NEUROG3 is similarly required in humans. Patients with biallelic mutations in

NEUROG3 are born with malabsorptive diarrhea due to enteric anendocrinosis, a loss of enteroendocrine cells (EECs) (Pinney et al., 2011; Rubio-Cabezas et al., 2011, 2014; Sayar et al., 2013;

Wang et al., 2006). Interestingly, some of these patients maintain perfect glycemic

control into adulthood indicating a fully developed endocrine pancreas, while others are

diabetic at birth. We showed previously that NEUROG3 is required for human pancreatic endocrine development (McGrath et al., 2015), suggesting that patients retaining endocrine function have a hypomorphic allele. The stark difference in intestinal versus pancreatic sensitivity to missense mutations in NEUROG3 highlights the importance of studying gene function in developmentally relevant systems.

To this end, for our studies of NEUROG3 we have utilized human pancreatic precursors and 3-D intestinal organoids derived from pluripotent stem cells. Using a temporal series of growth factor manipulations (Kelly et al., 2011; Kroon et al., 2008;

Rezania et al., 2012) we can model the stages of fetal gut development including

definitive endoderm formation (gastrulation), subsequent patterning to foregut and

hindgut, and finally differentiation into pancreatic precursors or intestine (Spence et al.,

2011). Pancreatic precursors are >95% PDX1-positive and express the endocrine

markers NEUROG3, NKX2.2, and NKX6.1. The intestinal organoids consist of a three-dimensional CDX2-positive columnar epithelium surrounded by mesenchyme.

In the present study we use human NEUROG3-/- pluripotent stem cells

differentiated into pancreas precursors and intestinal organoids to model several human

NEUROG3 mutations. We find that the model systems recapitulate the human

phenotype, giving us the ability to discern a potential mechanism by which the

mutations affect NEUROG3. Furthermore, we transcriptionally profile the pancreas and

intestinal tissues with or without NEUROG3 to identify novel target genes and also shed

light on differences between pancreatic and intestinal endocrine cell differentiation.

Materials and Methods

Pluripotent stem cell culture

The parent human embryonic stem cell line WA01 (H1) was obtained from

WiCell. The NEUROG3-/- hESC line was created previously using CRISPR/Cas9 to

disrupt endogenous expression with a frame-shift INDEL (McGrath et al., 2015). hESCs

were maintained in mTeSR (StemCell Technologies) on hESC-qualified Matrigel (BD

Biosciences) coated plates. Cells were routinely passaged every four days with dispase

(Invitrogen).

Plasmid construction

The human NEUROG3 cDNA was acquired from the Harvard PlasmID database

(Plasmid ID HSCD00345748). The NEUROG3 cDNA was then subcloned using the

pENTR/D-TOPO kit (Thermo Fisher Scientific). The HA tag and various NEUROG3 mutations were added by site-directed mutagenesis (Liu and Naismith, 2008). The GFP-2A-NEUROG3ERT2 construct was synthesized by Thermo Fisher Scientific. The various

constructs were then Gateway cloned into pINDUCER20 (Fig. 1C, Addgene #44012)

using LR Clonase (primers are listed in the supplemental materials).

Lentivirus Constructs

Vectors were packaged into lentivirus by the CCHMC viral production core. Virus

was added to the media of newly plated hESCs following normal dispase passaging.

The media was replaced with mTeSR containing selective antibiotic (G418, 500ng/ml)

after 24 hours. All transduced cell lines were maintained under selection as a

heterogeneously transduced population.

Differentiation to pancreatic precursors

Stem cells were dispersed with Accutase (StemCell Technologies), washed,

collected, resuspended in mTeSR containing 10μM ROCK inhibitor (Y-27632, Tocris

Bioscience), and plated at a concentration of 1x10^5 cells/cm^2 onto Matrigel-coated, 24-well Nunclon plates (Delta treated). Differentiation was initiated when cells reached

~75% confluency, approximately 48 hours after plating. At the start of differentiation

(day 0), cells were switched to RPMI 1640 supplemented with non-essential amino

acids, 100ng/ml Activin A (Cell Guidance Systems) and 50ng/ml BMP-4 (R&D

Systems). Day 1-2 media included 0.2% tetracycline-free FBS (Hyclone) and did not

contain BMP-4. On day 3 the media was changed to RPMI 1640 containing 2% FBS,

50ng/ml FGF-7 (R&D Systems), and 50ng/ml Noggin (R&D Systems). On days 5 and 7 the media was switched to high-glucose (HG) DMEM (Gibco) containing 50ng/ml

Noggin, 2μM all-trans retinoic acid (Stemgent), and 1% (0.5x) B27 without vitamin A

(Gibco). Finally, day 9-11 media was prepared using HG-DMEM supplemented with 1%

B27 and 25ng/ml Noggin.

Differentiation and three-dimensional culture of intestinal organoids

Stem cells were handled as stated above, but only 0.5x10^5 cells were plated.

Differentiation was initiated when cells reached ~40% confluency, approximately 48 hours

after plating. Cells were treated with activin A for 3 consecutive days in RPMI 1640

(Invitrogen) with increasing concentrations of tetracycline-free FBS (0%, 0.2%, and 2% on days 1, 2, and 3, respectively). Definitive endoderm was then incubated for four days in

DMEM-F12 containing 2% tetracycline-free FBS, 400ng/ml FGF4, and 3µM

CHIR99021 (Stemgent). Spheroids were then collected and embedded in a 50µl bubble of Matrigel (BD Biosciences). The bubble was allowed to solidify (30 minutes at 37°C), and then gut media was added (advanced DMEM/F12 (Invitrogen), L-glutamine, 10µM HEPES, 1x N2 supplement (R&D Systems), 1x B27 (Invitrogen), pen/strep, and 100ng/ml EGF (R&D Systems)). Gut media was replaced every four days.

Immunofluorescent staining

Monolayers were fixed for 20 minutes at room temperature in 4% paraformaldehyde (PFA). Organoids were fixed for 30-120 minutes in 4% PFA at 4°C, cryopreserved overnight in 30% sucrose, frozen in OCT, then cryosectioned into 8-10μm sections. Prior to staining, monolayers and sections were blocked for 30 minutes (5% donkey serum and 0.1% Triton-X in PBS). Primary antibodies were diluted in PBS +

0.1% Tween and incubated with the samples overnight at 4°C. Secondary antibodies were incubated for 2 hours at room temperature. Cells were stained with DAPI (5μg/ml in PBS) for 5 minutes. Sections were mounted using Fluoromount-G. All antibodies are listed in the supplemental materials.

Image acquisition and analysis

Images were captured using a Nikon A1R confocal microscope with PMT-based detectors and a motorized stage. The microscope has 405nm, 488nm, 561nm, and

640nm lasers with appropriate filters. All image analysis was performed using Nikon

Elements. Figures were assembled using the Adobe Creative Suite.

FACS sorting of HIOs

Human intestinal organoids (HIOs) were harvested using ice-cold PBS to remove them from Matrigel, transferred into a 15ml Falcon tube, and centrifuged at 1000rpm for 4 minutes. The supernatant was then removed and the HIOs were washed in 5ml ice-cold

PBS and triturated with a P1000 pipette. The washes were repeated 3-5 times depending on the amount of Matrigel present. The HIOs were then resuspended in 5ml of TrypLE Select (Thermo Fisher Scientific, cat#12563011) and incubated at 37°C for 20 minutes, then centrifuged at 1500rpm for 4 minutes. Following removal of the supernatant, the pellet was resuspended in 3ml PBS+2% FCS and filtered into an appropriate number of 40µm cell-strainer cap 5ml tubes (Corning, cat#352235).

Cells were stained with either CD326(EpCAM)-488 antibody (Biolegend, cat#324210) or mouse IgG2b,k-488 antibody (Biolegend cat#400329) for 20 minutes at room temperature in the dark. Flow cytometric sorting was performed using a FACSAria

(BD Biosciences) with cells sorted directly into lysis buffer (Macherey-Nagel) for downstream RNA isolation. FACS plots were processed using FACSDiva (BD

Biosciences).

RNA isolation and quantitative PCR

All RNA was column purified using a NucleoSpin RNA kit (Macherey-Nagel)

including an on-column DNase digestion. cDNA was produced with the SuperScript

VILO cDNA synthesis kit (Invitrogen) following the manufacturer’s instructions. 5ng of

cDNA per reaction was combined with QuantiTect SYBR Green (Qiagen) and amplified

using a QuantStudio 6 or StepOnePlus real-time PCR detection system (Thermo Fisher

Scientific). Primers are listed in the supplemental materials.
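The text does not state how qPCR signal was converted to relative expression. As an illustration only, the sketch below uses the common comparative Ct (2^-ddCt) calculation, which is an assumption and not necessarily the method used here. The Ct values are hypothetical, and the calculation assumes roughly 100% amplification efficiency for both the target and the reference gene.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the comparative Ct method.

    Assumes (hypothetically) that target and reference primers amplify
    with ~100% efficiency, so one cycle equals a two-fold difference.
    """
    # Normalize the target to the reference gene within each sample.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Compare the treated sample with the control sample.
    delta_delta = delta_treated - delta_control
    return 2 ** (-delta_delta)


# Hypothetical Ct values: a target gene in dox-treated vs untreated cultures.
fold = relative_expression(ct_target_treated=22.0, ct_ref_treated=18.0,
                           ct_target_control=27.0, ct_ref_control=18.0)
print(fold)
```

A lower Ct after dox treatment with an unchanged reference Ct gives a fold change above 1, which is the pattern expected for an induced target gene.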

Western blotting and quantification

Pancreatic precursors (differentiation day 9) were treated with 100ng/ml

doxycycline for 24 hours to induce NEUROG3 expression. At time t=0, 100µM

cycloheximide (CHX) was applied to block further protein synthesis. At each time point,

harvested cells were immediately lysed in ice-cold RIPA buffer with protease inhibitor.

Samples were mixed with 2x Laemmli buffer and boiled for 10 minutes at 100°C.

Proteins were separated on SDS-PAGE gels (15% Tris-glycine) and transferred onto a

PVDF membrane. Membranes were blocked in Blocking Buffer (TBS, Odyssey) for 1 hour followed by incubation with primary antibodies at 4°C overnight (diluted in Blocking

Buffer). Secondary antibodies were then applied at room temperature for 1 hour.

Images were acquired and quantified on an Odyssey CLx Imaging System. Protein degradation half-lives were calculated using first-order rate kinetics. Antibodies are listed in the supplemental materials.
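The half-life calculation from first-order kinetics can be sketched as follows. This is a minimal illustration, not the authors' analysis script: it assumes the band intensity decays as S(t) = S0·exp(-kt) after CHX addition, fits ln(S) against time by ordinary least squares, and reports t1/2 = ln(2)/k. The time points and intensities are synthetic placeholders.

```python
import math


def half_life_first_order(times_h, signals):
    """Return (k, t_half) from a least-squares fit of ln(signal) vs time.

    times_h: time after CHX addition, in hours.
    signals: background-corrected band intensities (must be positive).
    """
    logs = [math.log(s) for s in signals]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    slope = sxy / sxx          # slope of ln(S) vs t equals -k
    k = -slope
    return k, math.log(2) / k


# Synthetic example: intensities generated with a 1.0 hour half-life.
times = [0.0, 0.5, 1.0, 2.0, 3.0]
signals = [100.0 * 0.5 ** (t / 1.0) for t in times]
k, t_half = half_life_first_order(times, signals)
print(round(k, 3), round(t_half, 3))
```

Fitting on the log scale weights all time points equally in relative terms, which is a design choice; a nonlinear fit on the raw intensities would weight early, brighter bands more heavily.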

RNA-Seq and data analysis

RNA samples were assessed for quality and all samples had a RIN (RNA integrity number) greater than 9. Nonamplified, polyadenylated mRNA was used for the synthesis of cDNA. Samples were sequenced using the HiSeq 2500 (Illumina, CA) with 75 bp, single-end reads. Following primer and barcode removal, sequences were aligned to the hg38 human genome using Ensembl transcripts. All genomic analysis was performed in GeneSpring NGS. Reads were quantified as reads per kilobase per million reads (RPKM), then normalized using DESeq.

Gene ontology (GO) analysis was performed using DAVID (Huang et al., 2009).
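For reference, the RPKM quantity mentioned above follows the standard definition: read count scaled by transcript length in kilobases and by library size in millions of mapped reads. The sketch below shows only that arithmetic; it does not reproduce GeneSpring NGS or the DESeq normalization step, and the example numbers are hypothetical.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    kilobases = gene_length_bp / 1_000
    millions_of_reads = total_mapped_reads / 1_000_000
    return read_count / (kilobases * millions_of_reads)


# Hypothetical gene: 500 reads on a 2,000 bp transcript in a library
# of 20 million mapped reads.
value = rpkm(read_count=500, gene_length_bp=2_000, total_mapped_reads=20_000_000)
print(value)
```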

Results

To study the function of NEUROG3 in pancreatic endocrine development, we

used a previously described differentiation protocol in which human pluripotent stem

cells (hPSCs) could be robustly differentiated into pancreatic precursors (Fig. 1A,

Supplementary Fig. 1A). Pancreatic precursors are >95% PDX1-positive, express

NEUROG3, and contain specified endocrine cells evidenced by the pan-endocrine

marker CHGA (Fig. 1B). We previously generated a NEUROG3-/- hPSC line using

CRISPR/Cas9 targeted to generate frameshift INDELs immediately downstream of the start codon in each allele, thereby completely disrupting NEUROG3 expression

(McGrath et al., 2015). Directed differentiation of NEUROG3-/- hPSCs to PDX1+

pancreatic precursors shows a complete loss of NEUROG3 expression and endocrine

specification (Fig. 1B).

The complete NEUROG3 cDNA sequence was acquired and modified to add the

epitope tag hemagglutinin (HA) to the N-terminus by site-directed mutagenesis, and

then cloned into the tet-inducible lentiviral vector pINDUCER20 (Meerbrey et al., 2011;

Addgene #44012) (Fig. 1C). Then, NEUROG3-/- hPSCs were stably transduced, yielding

a stem cell line in which wild-type NEUROG3 could only be expressed by forced

induction with the addition of the antibiotic doxycycline (dox) to the culture media

(hereafter referred to as NEUROG3iWT). To find the window of competence in which

differentiating cultures can respond to NEUROG3, NEUROG3iWT hPSCs were

differentiated into pancreatic precursors and treated with dox on various days during the

differentiation to ectopically express NEUROG3. Interestingly, only after differentiation

day 7 are the cultures competent to specify endocrine cells evidenced by the expression

of various endocrine specific genes (GHR, INS, SST, GCG, CHGA, NKX2.2) (Fig. 1D).

PDX1 begins to be robustly expressed at this time, indicating that the posterior

foregut is adopting a pancreatic fate which seems to facilitate NEUROG3 function. For

the remaining studies, day 9 was chosen as the time point to induce NEUROG3

expression as it resulted in the highest average expression of endocrine markers, and

also corresponds to the highest level of endogenous NEUROG3 expression in wild-type

cultures (data not shown). NEUROG3iWT cultures robustly express NEUROG3 following

an 8 hour pulse with dox (Fig. 1E). By day 12, ectopic NEUROG3 is completely

eliminated from the cells (Fig. 2E) but NKX2.2, INS, GCG, SST, and CHGA expression indicates robust endocrine specification (Fig. 1F, 2M).

This model system is amenable to screening multiple NEUROG3 mutants. Wild-type

NEUROG3 was mutated to include the previously described patient mutations R107S,

L135P, or E28X and then transduced into the NEUROG3-/- hPSCs and differentiated

into pancreatic precursors. As before, cultures were pulsed with dox on differentiation day 9 for 8 hours and analyzed for NEUROG3 expression by immunofluorescence utilizing the HA tag. NEUROG3R107S and NEUROG3L135P were expressed at levels

comparable to NEUROG3WT (Fig. 2A-C). The severely truncated NEUROG3E28X was never detected by immunostaining (Fig. 2D). After three more days of differentiation, pancreatic precursors were analyzed for endocrine formation. NEUROG3WT robustly

induced endocrine cells (CHGA, NKX2.2, INS, GCG, SST) (Fig. 2E-G). As expected

from patient data, NEUROG3R107S was also competent to induce endocrine specification

albeit at significantly reduced numbers (CHGA, NKX2.2), and hormone positive cells

were exceedingly rare (INS, GCG, SST) (Fig. 2H-J). Conversely, NEUROG3L135P and

NEUROG3E28X showed absolutely no function and were unable to activate NKX2.2 or

specify endocrine cells (Fig. 2F-P).

It was surprising that no function was observed with the NEUROG3L135P mutation

as one patient with this mutation exhibited normal glucose homeostasis for years after

birth, indicating a normally functioning endocrine pancreas (Rubio-Cabezas et al., 2014).

By increasing the concentration and duration of dox treatment to 300ng/ml and 24

hours, respectively, we can expose pancreas cultures to higher levels of

NEUROG3L135P. Despite the increased exposure of pancreatic precursors to

NEUROG3L135P, there was absolutely no evidence of function when assessing the direct

target genes NEUROD, NKX2.2, or PAX4 (Supplementary Fig. S2).

We noticed while cloning the NEUROG3 cDNA used for the inducible vector that

it contained the common SNP F199S (c.596T>C). This SNP has been

shown to not associate with onset of diabetes (Bosque-plata et al., 2001; Jensen et al.,

2001) but is of particular interest since it has an allele frequency of 43% (Sherry et al.,

2001). While the F199S SNP may not affect NEUROG3 function independently, we

hypothesized that the L135P mutation may be non-functional when combined with

F199S in the same allele. To test this, F199S was corrected using site directed

mutagenesis. The resulting wild-type NEUROG3199F was then expressed in pancreatic precursors as before and directly compared to NEUROG3F199S. NEUROG3199F was equally able to induce the expression of target genes NKX2.2 and NEUROD as assessed by immunofluorescence and qPCR, respectively (Supplementary Fig. 3A,B).

Furthermore, there appeared to be no difference in endocrine specification evidenced by the induction of CHGA, which correlates well with previous patient studies. The

L135P mutation was then introduced using the wild-type NEUROG3199F. Despite

correcting the compound mutation, there was still no detectable function of

NEUROG3L135P when expressed in pancreatic precursors and assayed for activation of

target genes. This is a confounding result that is further covered in the discussion.

The R107S mutation maps to a hydrophobic core formed between the two

helices of the bHLH motif (using the crystal structure of NeuroD as a model, Longo et

al., 2008) and thus could be predicted to destabilize NEUROG3R107S folding, thereby

reducing its half-life. To investigate this possibility, the half-life of NEUROG3WT,

NEUROG3R107S, and NEUROG3L135P was determined. Day 9 pancreatic precursors

were dosed with dox for 24 hours before the media was exchanged to include

translation inhibitor cycloheximide (CHX) to prevent further protein synthesis. Protein was then harvested at various time points up to 3 hours following removal of dox, and

NEUROG3 was quantified by western blot (Fig. 3A-D). Using first order kinetics, the half-life of NEUROG3WT and NEUROG3L135P was calculated to be the same at 1.0 hours. The half-life of NEUROG3R107S was significantly lower at 0.6 hours (Fig. 3E).
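To make the size of this difference concrete, the worked equations below apply the first-order decay law to the two reported half-lives (1.0 and 0.6 hours) and the 3-hour end of the chase. The rate constants and remaining fractions are derived here from those reported values; they are not additional measurements.

```latex
\begin{align*}
\frac{N(t)}{N_0} &= e^{-kt} = 2^{-t/t_{1/2}}, \qquad k = \frac{\ln 2}{t_{1/2}} \\[4pt]
\text{WT, L135P } (t_{1/2} = 1.0\ \text{h}):\quad
  k &\approx 0.69\ \text{h}^{-1}, \qquad \frac{N(3\ \text{h})}{N_0} = 2^{-3} = 0.125 \\[4pt]
\text{R107S } (t_{1/2} = 0.6\ \text{h}):\quad
  k &\approx 1.16\ \text{h}^{-1}, \qquad \frac{N(3\ \text{h})}{N_0} = 2^{-5} \approx 0.031
\end{align*}
```

Under these assumptions, about 12.5% of the wild-type or L135P protein would remain after 3 hours, compared with about 3% of the R107S protein.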

If the observed NEUROG3R107S activity loss is due entirely to decreased stability, we should be able to compensate for its diminished function by expressing it at higher levels with a higher concentration of dox. A dose response was performed in which

NEUROG3iWT and NEUROG3iR107S pancreatic precursors were treated with a range of

dox concentrations. At each tested concentration, NEUROG3R107S expression was

comparable to NEUROG3WT despite the reduced stability (Fig. 4A,B). Furthermore,

activation of direct target genes NEUROD, NKX2.2, and PAX4 was significantly reduced

for NEUROG3R107S compared to NEUROG3WT (Fig. 4C). This is particularly obvious for

NEUROD, as NEUROG3R107S dosed with 300ng/ml dox is unable to activate it to the same level as NEUROG3WT dosed with 30ng/ml dox despite the 10-fold increase in

expression.

Patients with NEUROG3R107S and NEUROG3L135P mutations may retain some

endocrine pancreas, which is not surprising as we have now shown that these

mutations likely retain some function. Unlike in the pancreas, however, these patients

have a complete loss of intestinal endocrine cells resulting in a severe malabsorptive

diarrhea phenotype. We previously developed a protocol in which hPSCs could be

differentiated into 3-dimensional human intestinal organoids (HIOs) in a dish (Fig. 5A)

(McCracken et al., 2011; Spence et al., 2011). We wanted to determine if the human

intestinal anendocrinosis phenotype could be replicated using this model system.

The NEUROG3iWT, NEUROG3iR107S, NEUROG3iL135P, and NEUROG3E28X hPSC lines were used for differentiation into intestinal organoids. Day 28 organoids were treated with dox for 8 hours to induce NEUROG3 expression. WT, R107S, and L135P variants were robustly expressed compared to a no dox control (Fig. 5B-E). As before,

NEUROG3E28X was undetectable by immunofluorescence (Fig. 5F). NEUROG3 was

expressed throughout the HIO including the epithelial compartment marked by ECAD

(Fig. 5B’-F’). It is interesting to note that the ectopic expression seemed to be at a

higher level in the mesenchyme. This may suggest that NEUROG3 is more stable in the

mesenchyme. Alternatively, it could be a technical issue of dox being unable to

penetrate as well into the epithelial layer. There was also a complete absence of CHGA

staining within the epithelium, indicating that endocrine cell differentiation takes longer

than 8 hours following treatment with NEUROG3 (Fig. 5B,C; Compare to Fig. 5H).

Induced NEUROG3iWT, NEUROG3iR107S, and NEUROG3iL135P mRNA was expressed

at comparable levels (Fig. 5G). The target genes NEUROD, NKX2.2, and PAX4 are

robustly activated by NEUROG3WT. NEUROG3R107S is barely able, and NEUROG3L135P

is completely unable to activate these target genes.

Seven days following treatment with dox, the HIOs were collected and analyzed for

the formation of endocrine cells. At this time point, there was no observable NEUROG3

protein (HA, Fig. 5H-K). CHGA positive cells were easily detectable in the epithelium of

NEUROG3WT HIOs (Fig. 5H,H’). Conversely, there was a complete absence of endocrine formation in NEUROG3R107S, NEUROG3L135P, or NEUROG3E28X HIOs (Fig.

5I-K).

NEUROG3R107S retains sufficient activity to induce endocrine specification in

pancreatic precursors but not HIOs. This is an interesting case study because it

highlights a context dependent difference in requirement for NEUROG3. To further

explore this, we performed transcription profiling to identify differences in NEUROG3

target genes in pancreatic precursors compared to HIOs.

We generated a bicistronic GFP-P2A-NEUROG3ERT2 construct that was then

cloned into the pINDUCER20 rtTA lentiviral vector (Fig. 6A). The resulting vector was

then transduced into the NEUROG3-/- hPSC line (NEUROG3ERT2) and differentiated into pancreatic precursors. The process to produce functional NEUROG3 protein using this vector has two parts (summarized in Fig. 6B). First, day 9 pancreatic precursors were dosed with doxycycline for 24 hours, which induces the synthesis of NEUROG3ERT2

(Fig. 6E). The ERT2 domain functions to sequester NEUROG3 away from DNA thereby rendering it unable to activate target genes such as NEUROD, PAX4, or NKX2.2 (Fig.

6C,I). We found that NEUROG3ERT2 could be induced with a dox concentration as

high as 200ng/ml without exhibiting any leakiness (indicated by activation of

downstream genes prior to treatment with tamoxifen) (Supplementary Fig. 4). Upon

addition of the tamoxifen metabolite 4-hydroxytamoxifen (4OHT), NEUROG3ERT2 is able to move freely in the nucleus and regulate target genes (Fig. 6C,J). To identify only direct target genes, the translation inhibitor cycloheximide (CHX) was added together with 4OHT. In this way, NEUROG3 target genes such as NKX2.2 are transcribed (Fig.

6C) but protein cannot be translated (Fig. 6K) and therefore cannot activate or repress downstream targets.

We utilized RNA-seq and the NEUROG3ERT2 hPSC line to identify all targets of

NEUROG3 in pancreatic precursors and also to attempt to separate direct from

indirectly affected genes. The 6 treatment groups were sequenced in triplicate for

analysis: no treatment, dox only, 4OHT only, CHX only (controls); dox+4OHT (direct

and indirect targets); and dox+4OHT+CHX (direct targets only). Using a 1.5-fold change

as the cutoff, the dox+4OHT sample had only 301 differentially expressed genes

(DEGs) compared to 226 and 237 DEGs for the dox or 4OHT samples, respectively.

CHX treated samples clustered together and yielded 5058 and 5134 DEGs for CHX and

Dox+4OHT+CHX, respectively, compared to untreated samples. Direct and indirect

targets of NEUROG3 are represented in the Dox+4OHT sample, of which 142 genes

were uniquely differentially expressed in the Dox+4OHT sample compared to both Dox

and 4OHT (Fig. 7A). 136 unique DEGs were identified in the Dox+4OHT+CHX sample,

representing putative direct targets of NEUROG3 (Fig. 7B).
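The filtering logic described in this paragraph can be sketched as a fold-change threshold followed by a set difference. This is an interpretation under stated assumptions, not the GeneSpring workflow: it assumes fold changes are expressed relative to untreated samples, that the 1.5-fold cutoff applies in both directions, and that "unique" means absent from the DEG lists of the single-agent controls. Gene names and values are hypothetical.

```python
def degs(fold_changes, cutoff=1.5):
    """Genes whose fold change vs untreated passes the cutoff, up or down."""
    return {
        gene for gene, fc in fold_changes.items()
        if fc >= cutoff or fc <= 1 / cutoff
    }


def unique_degs(sample_degs, control_deg_sets):
    """DEGs in the sample that are not DEGs in any control treatment."""
    return sample_degs - set().union(*control_deg_sets)


# Hypothetical fold changes relative to untreated cultures.
dox_4oht = {"GENE_A": 3.0, "GENE_B": 0.4, "GENE_C": 1.2, "GENE_D": 2.0}
dox_only = {"GENE_A": 1.1, "GENE_B": 1.0, "GENE_C": 1.0, "GENE_D": 1.8}
ohT_only = {"GENE_A": 1.0, "GENE_B": 0.9, "GENE_C": 1.6, "GENE_D": 1.0}

candidates = unique_degs(degs(dox_4oht), [degs(dox_only), degs(ohT_only)])
print(sorted(candidates))
```

In this toy example GENE_D is excluded because it also responds to dox alone, which mirrors the purpose of the dox-only and 4OHT-only controls.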

Finally, the unique Dox+4OHT and Dox+4OHT+CHX DEGs were compared (Fig.

7C-F). Surprisingly, all of the well characterized direct targets of NEUROG3 (NKX2-2,

NEUROD1, INSM1, PAX4) were differentially expressed only in the Dox+4OHT group and not in

Dox+4OHT+CHX, even though the latter should capture all direct targets. This could be explained by a feed-forward mechanism associated with the activation of these genes that is inhibited by the addition of CHX. Conversely, a large number of genes are uniquely differentially expressed in the Dox+4OHT+CHX group. This was also unexpected, as the

Dox+4OHT treatment should have included all direct and indirect targets. It may indicate that negative feedback loops are required for the repression of these potential target genes, and that these loops are likewise inhibited by the addition of CHX. Gene ontology was analyzed using the DEGs from the Dox+4OHT sample. Many of the top predicted biological processes are involved with neurogenesis, nerve function, and endocrine pancreas development. It is not surprising that neural GO terms are highly significant as many targets of NEUROG3 are shared by the other proneural bHLH factors NEUROG1 and NEUROG2. Furthermore, much of the hormone-releasing machinery is also utilized for neuropeptide release. Interestingly, muscle contraction was also a significantly enriched biological process.

While the endocrine cells of the pancreas and intestine are similar in that all are dependent on NEUROG3, the pancreatic lineages are largely independent of enteroendocrine cells. Very little is known about the mechanism by which such a diverse population of cells can descend from the single linchpin transcription factor

NEUROG3. Utilizing the inducible NEUROG3iWT HIO system, it should be possible

to identify differences in target gene expression downstream of NEUROG3 in intestinal organoids compared to pancreatic precursors.

Day 28 NEUROG3iWT HIOs were treated with or without dox for 24 hours. Dox was then removed from the media and the HIOs were cultured for an additional 7 days.

Samples from day 29 (immediately after incubation with dox) and day 35 were FACS sorted using EPCAM to separate epithelium from mesenchyme (Fig. 8A). RNA was then collected from the resulting fractions and analyzed by quantitative PCR. As expected, the epithelial fraction was highly enriched for ECAD and the intestinal marker CDX2 while the mesenchyme highly expressed VMNT (Fig. 8B-D). Induced NEUROG3 was detectable in both the mesenchyme and epithelium of dox treated samples, and was completely absent by day 35 (Fig. 8E). The target genes NEUROD, NKX2.2, and PAX4 were quantified to assess NEUROG3 function. Interestingly, all three were expressed exclusively in the epithelium despite NEUROG3 being expressed at even higher levels in the mesenchyme (Fig. 8F-H). This indicates that tissue-specific context is a major determinant of NEUROG3 function. Markers of endocrine differentiation were robustly expressed by day 35 including CHGA, GIP, and SST (Fig. 8I-K).

Discussion

In this study we have utilized a directed differentiation approach to interrogate

NEUROG3 function in the context of human pancreas and intestine. Using a

NEUROG3-null PSC background we could ectopically express NEUROG3, either wild-type or mutated, at a relevant time and dosage to cleanly assess

downstream function. We found that NEUROG3R107S is able to rescue endocrine

specification in differentiated pancreas, albeit at severely reduced levels compared to

NEUROG3WT. This could be due in part to a reduction in stability caused by the R107S mutation. However, when reduced stability is compensated for by overexpression of

R107S, function does not return to wild-type levels. This suggests there may be

another mechanism by which R107S has reduced function.

It was surprising that we were unable to identify any function in the

NEUROG3L135P variant. Patients homozygous for this mutation have been reported to live for up to 13 years without developing diabetes (Rubio-Cabezas et al., 2014), clearly indicating a fully formed and functioning endocrine pancreas. Our previous studies suggest NEUROG3 is required for endocrine pancreas development, thus we fully expected to identify some form of regulatory activity. There are a couple of possible explanations for this apparent discrepancy. First, the in vitro pancreas model we have utilized may not fully recapitulate all of the necessary aspects of in vivo development.

Certainly, it has been previously shown that endocrine cells generated from PSCs more closely resemble fetal pancreas than adult endocrine cells (Hrvatin et al., 2014). It is possible that NEUROG3L135P is functional only in the secondary wave of endocrine

formation. It may be necessary to further mature pancreatic precursors by transplanting

into mice in order to see function. The rtTA system employed here is well suited for this experiment, as NEUROG3L135P can simply be induced in the transplant at any time point

by giving the mouse dox chow. Second, there can be significant differences in the

propensity of a stem cell line to differentiate to a particular tissue (Osafune et al., 2008).

It is not completely understood why this is; however, there is evidence that genetic

background variation is a major confounding factor that must be considered when

comparing stem cell lines (Choi et al., 2015), together suggesting that we may see

function in NEUROG3L135P if the experiments are repeated with a different hPSC line.

This may also explain why two patients with the same L135P mutation can develop

diabetes at dramatically different ages (neonatal versus 13 years) (Rubio-Cabezas et

al., 2014).

Patient mutations in NEUROG3 result in a complete loss of intestinal endocrine

cells but not pancreatic. Our lab is uniquely positioned to study this phenotype using a

previously described differentiation protocol with which we can make three dimensional

intestinal organoids. NEUROG3WT is able to robustly induce endocrine specification in

the context of intestinal organoids. Interestingly, NEUROG3R107S is unable to specify

endocrine cells in intestinal organoids. This perfectly matches the patient phenotype for

this mutation: successful formation of endocrine cells in the pancreas but not the intestine.

Many groups have transcriptionally profiled Neurog3 expressing cells (Gu et al.,

2004; Juhl et al., 2008; Petri et al., 2006; Soyer et al., 2010; White et al., 2008), yielding

many downstream targets that are now known to also be important in endocrine

differentiation. All of these methods utilize pooled cells with varying durations of

exposure to Neurog3, making it impossible to determine the hierarchy of target genes

downstream of Neurog3 expression. To address this, we generated an rtTA-inducible

NEUROG3ERT2 construct. NEUROG3ERT2 was able to activate target gene expression following induction with doxycycline and activation with 4-OHT. To decipher direct from indirect targets, we also induced transcription factor activity (addition of 4-OHT) in the presence of a translational inhibitor which prevents detection of indirect targets expressed in response to primary target activities. Following analysis, we identified 142

target genes (Dox+4OHT treatment), 93 of which are upregulated, and 136 direct target

genes (Dox+4OHT+CHX treatment), 64 of which are upregulated. Counterintuitively,

these two gene lists only overlap by 23 genes. It is also important to note that several

previously characterized direct targets (NKX2.2, NEUROD1, PAX4) were not identified

as direct targets in our experiment. This may suggest that there is a feed-forward

mechanism that helps establish their expression, which is lost with translation inhibition

(addition of CHX). There are also many genes identified as direct targets (Dox+4OHT+CHX) that are not identified among the Dox+4OHT targets. This is puzzling, but could be explained by negative regulatory feedback (abolished by treatment with CHX) that represses expression immediately following activation by NEUROG3.
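The extent of the mismatch between the two lists follows directly from the counts reported above (142 genes, 136 genes, 23 shared). The short calculation below derives the list-specific counts and a Jaccard index from those three numbers only; the Jaccard index is added here as one convenient summary of overlap and is not a statistic reported in the study.

```python
def overlap_summary(n_a, n_b, n_shared):
    """Summarize the overlap of two gene lists from their sizes alone."""
    union = n_a + n_b - n_shared
    return {
        "only_a": n_a - n_shared,      # in list A but not list B
        "only_b": n_b - n_shared,      # in list B but not list A
        "union": union,                # genes in at least one list
        "jaccard": n_shared / union,   # shared fraction of the union
    }


# Counts taken from the text: Dox+4OHT (142), Dox+4OHT+CHX (136), shared (23).
summary = overlap_summary(n_a=142, n_b=136, n_shared=23)
print(summary["only_a"], summary["only_b"], summary["union"],
      round(summary["jaccard"], 3))
```

By this arithmetic, 119 genes are specific to Dox+4OHT and 113 are specific to Dox+4OHT+CHX, so fewer than one in ten genes in the combined set appears in both lists.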

The differential capacity for NEUROG3R107S to specify endocrine cells in the

pancreas but not intestine is interesting and suggests context-dependent mechanisms

by which endocrine lineages differentiate in either the pancreas or intestine. To address

this, we also interrogated NEUROG3 target genes in intestinal epithelium and compared them to target genes identified in the pancreas. Interestingly, the list of NEUROG3 target genes in the pancreas is significantly different from that in the intestine (data in process and not

shown). This could shed light on a mechanism by which endocrine cells are formed in such a tissue-specific manner.

Acknowledgements

P.S.M. and J.M.W. designed the study, interpreted results and wrote the manuscript.

P.S.M. and X.Z. performed all experiments. J.S. contributed to experiments, tissue processing, and image acquisition. J.V.S. performed FACS sorting. We thank Dr. Aaron

Zorn for scientific discussion. This study was supported by NIH grants R01DK080823 and R01DK092456. We also acknowledge core support from the Cincinnati Digestive

Disease Center Award (P30 DK0789392) and the Clinical Translational Science Award

(U54 RR025216). We thank the CCHMC Pluripotent Stem Cell Facility, Confocal

Imaging Core, Flow Cytometry Core, and Vector Production Core for support and services.

Apelqvist, A., Li, H., Sommer, L., Beatus, P., Anderson, D.J., Honjo, T., Hrabe de Angelis, M., Lendahl, U., and Edlund, H. (1999). Notch signalling controls pancreatic cell differentiation. Nature 400, 877–881.

Bosque-plata, L., Lin, J., Horikawa, Y., Schwarz, P.E.H., Cox, N.J., Iwasaki, N., Ogata, M., Iwamoto, Y., German, M.S., and Bell, G.I. (2001). Mutations in the Coding Region of the Neurogenin 3 Gene (NEUROG3) Are Not a Common Cause of Maturity-Onset Diabetes of the Young in Japanese Subjects. Diabetes 50, 694–696.

Choi, J., Lee, S., Mallard, W., Clement, K., Tagliazucchi, G.M., Lim, H., Choi, I.Y., Ferrari, F., Tsankov, A.M., Pop, R., et al. (2015). A comparison of genetically matched cell lines reveals the equivalence of human iPSCs and ESCs. Nat. Biotechnol. 33, 1173–1181.

Collombat, P., Mansouri, A., Hecksher-Sorensen, J., Serup, P., Krull, J., Gradwohl, G., and Gruss, P. (2003). Opposing actions of Arx and Pax4 in endocrine pancreas development. Genes Dev. 17, 2591–2603.

Gradwohl, G., Dierich, A., LeMeur, M., and Guillemot, F. (2000). neurogenin3 is required for the development of the four endocrine cell lineages of the pancreas. Proc. Natl. Acad. Sci. U. S. A. 97, 1607–1611.

Gu, G., Wells, J.M., Dombkowski, D., Preffer, F., Aronow, B., and Melton, D.A. (2004). Global expression analysis of gene regulatory pathways during endocrine pancreatic development. Development 131, 165–179.

Henseleit, K.D., Nelson, S.B., Kuhlbrodt, K., Hennings, J.C., Ericson, J., and Sander, M. (2005). NKX6 transcription factor activity is required for alpha- and beta-cell development in the pancreas. Development 132, 3139–3149.

Hrvatin, S., O’Donnell, C.W., Deng, F., Millman, J.R., Pagliuca, F.W., Diiorio, P., Rezania, A., Gifford, D.K., and Melton, D.A. (2014). Differentiated human stem cells resemble fetal, not adult, β cells. Proc. Natl. Acad. Sci. U. S. A.

Huang, D.W., Lempicki, R.A., and Sherman, B.T. (2009). Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4, 44–57.

Huang, H., Liu, M., El-hodiri, H.M., Chu, K., and Liu, M.I.N. (2000). Regulation of the Pancreatic Islet-Specific Gene BETA2 (neuroD) by Neurogenin 3. 2.

Jenny, M., Uhl, C., Roche, C., Duluc, I., Guillemot, F., Jensen, J., and Kedinger, Á. (2002). Neurogenin3 is differentially required for endocrine cell fate specification in the intestinal and gastric epithelium. 21.

Jensen, J.N., Hansen, L., Ekstrøm, C.T., Pociot, F., Nerup, J., Hansen, T., and Pedersen, O. (2001). Polymorphisms in the neurogenin 3 gene (NEUROG) and their relation to altered insulin secretion and diabetes in the Danish Caucasian population. Diabetologia 44, 123–126.

Juhl, K., Sarkar, S.A., Wong, R., Jensen, J., and Hutton, J.C. (2008). Mouse Pancreatic Endocrine Cell Transcriptome in the Embryonic Ngn3-Null Mouse. Diabetes 57.

Kelly, O.G., Chan, M.Y., Martinson, L.A., Kadoya, K., Ostertag, T.M., Ross, K.G., Richardson, M., Carpenter, M.K., D’Amour, K.A., Kroon, E., et al. (2011). Cell-surface markers for the isolation of pancreatic cell types derived from human embryonic stem cells. Nat. Biotechnol. 29, 750–756.

Kroon, E., Martinson, L.A., Kadoya, K., Bang, A.G., Kelly, O.G., Eliazer, S., Young, H., Richardson, M., Smart, N.G., Cunningham, J., et al. (2008). Pancreatic endoderm derived from human embryonic stem cells generates glucose-responsive insulin-secreting cells in vivo. Nat. Biotechnol. 26, 443–452.

Liu, H., and Naismith, J.H. (2008). An efficient one-step site-directed deletion, insertion, single and multiple-site plasmid mutagenesis protocol. BMC Biotechnol. 8, 91.

Longo, A., Guanga, G.P., and Rose, R.B. (2008). Crystal structure of E47-NeuroD1/beta2 bHLH domain-DNA complex: heterodimer selectivity and DNA recognition. Biochemistry 47, 218–229.

Lopez-Diaz, L., Jain, R.N., Keeley, T.M., VanDussen, K.L., Brunkan, C.S., Gumucio, D.L., and Samuelson, L.C. (2007). Intestinal Neurogenin 3 directs differentiation of a bipotential secretory progenitor to endocrine cell rather than goblet cell fate. Dev. Biol. 309, 298–305.

McCracken, K.W., Howell, J.C., Wells, J.M., and Spence, J.R. (2011). Generating human intestinal tissue from pluripotent stem cells in vitro. Nat. Protoc. 6, 1920–1928.

McGrath, P.S., Watson, C.L., Ingram, C., Helmrath, M.A., and Wells, J.M. (2015). The Basic Helix-Loop-Helix Transcription Factor NEUROG3 Is Required for Development of the Human Endocrine Pancreas. Diabetes 64, 2497–2505.

Meerbrey, K.L., Hu, G., Kessler, J.D., Roarty, K., Li, M.Z., Fang, J.E., Herschkowitz, J.I., Burrows, A.E., Ciccia, A., Sun, T., et al. (2011). The pINDUCER lentiviral toolkit for inducible RNA interference in vitro and in vivo. Proc. Natl. Acad. Sci. U. S. A. 108, 3665–3670.

Osafune, K., Caron, L., Borowiak, M., Martinez, R.J., Fitz-Gerald, C.S., Sato, Y., Cowan, C.A., Chien, K.R., and Melton, D.A. (2008). Marked differences in differentiation propensity among human embryonic stem cell lines. Nat. Biotechnol. 26, 313–315.

Petri, A., Ahnfelt-Rønne, J., Frederiksen, K.S., Edwards, D.G., Madsen, D., Serup, P., Fleckner, J., and Heller, R.S. (2006). The effect of neurogenin3 deficiency on pancreatic gene expression in embryonic mice. J. Mol. Endocrinol. 37, 301–316.

Pinney, S.E., Oliver-Krasinski, J., Ernst, L., Hughes, N., Patel, P., Stoffers, D.A., Russo, P., and De León, D.D. (2011). Neonatal diabetes and congenital malabsorptive diarrhea attributable to a novel mutation in the human neurogenin-3 gene coding sequence. J. Clin. Endocrinol. Metab. 96, 1960–1965.

Prado, C.L., Pugh-Bernard, A.E., Elghazi, L., Sosa-Pineda, B., and Sussel, L. (2004). Ghrelin cells replace insulin-producing beta cells in two mouse models of pancreas development. Proc. Natl. Acad. Sci. U. S. A. 101, 2924–2929.

Rezania, A., Bruin, J.E., Riedel, M.J., Mojibian, M., Asadi, A., Xu, J., Gauvin, R., Narayan, K., Karanu, F., O’Neil, J.J., et al. (2012). Maturation of human embryonic stem cell-derived pancreatic progenitors into functional islets capable of treating pre-existing diabetes in mice. Diabetes 61, 2016–2029.

Rubio-Cabezas, O., Jensen, J.N., Hodgson, M.I., Codner, E., Ellard, S., Serup, P., and Hattersley, A.T. (2011). Permanent neonatal diabetes and enteric anendocrinosis associated with biallelic mutations in NEUROG3. Diabetes 60, 1349–1353.

Rubio-Cabezas, O., Codner, E., Flanagan, S.E., Gómez, J.L., Ellard, S., and Hattersley, A.T. (2014). Neurogenin 3 is important but not essential for pancreatic islet development in humans. Diabetologia 2, 3–6.

Sander, M., Sussel, L., Conners, J., Scheel, D., Kalamaras, J., Dela Cruz, F., Schwitzgebel, V., Hayes-Jordan, A., and German, M. (2000). Homeobox gene Nkx6.1 lies downstream of Nkx2.2 in the major pathway of beta-cell formation in the pancreas. Development 127, 5533–5540.

Sayar, E., Yilmaz, A., Islek, A., Elpek, G.O., Flanagan, S.E., and Artan, R. (2013). Chromogranin-A staining reveals enteric anendocrinosis in unexplained congenital diarrhea. J. Pediatr. Gastroenterol. Nutr. 57, e21.

Schwitzgebel, V.M., Scheel, D.W., Conners, J.R., Kalamaras, J., Lee, J.E., Anderson, D.J., Sussel, L., Johnson, J.D., and German, M.S. (2000). Expression of neurogenin3 reveals an islet cell precursor population in the pancreas. Development 127, 3533–3542.

Sherry, S.T., Ward, M.H., Kholodov, M., Baker, J., Phan, L., Smigielski, E.M., and Sirotkin, K. (2001). dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 29, 308–311.

Smith, S.B., Gasa, R., Watada, H., Wang, J., Griffen, S.C., and German, M.S. (2003). Neurogenin3 and hepatic nuclear factor 1 cooperate in activating pancreatic expression of Pax4. J. Biol. Chem. 278, 38254–38259.

Sosa-Pineda, B., Chowdhury, K., Torres, M., Oliver, G., and Gruss, P. (1997). The Pax4 gene is essential for differentiation of insulin-producing beta cells in the mammalian pancreas. Nature 386, 399–402.

Soyer, J., Flasse, L., Raffelsberger, W., Beucher, A., Orvain, C., Peers, B., Ravassard, P., Vermot, J., Voz, M.L., Mellitzer, G., et al. (2010). Rfx6 is an Ngn3-dependent winged helix transcription factor required for pancreatic islet cell development. Development 137, 203–212.

Spence, J.R., Mayhew, C.N., Rankin, S.A., Kuhar, M.F., Vallance, J.E., Tolle, K., Hoskins, E.E., Kalinichenko, V.V., Wells, S.I., Zorn, A.M., et al. (2011). Directed differentiation of human pluripotent stem cells into intestinal tissue in vitro. Nature 470, 105–109.

Wang, J., Galen, C., Wu, V., Tran, R., Cho, J.-H., Tsai, M.-J., Bailey, T.J., Jamrich, M., Ament, M.E., Treem, W.R., et al. (2006). Mutant neurogenin-3 in congenital malabsorptive diarrhea. N. Engl. J. Med. 356, 1781–1782; author reply 1782.

Watada, H., Scheel, D.W., Leung, J., and German, M.S. (2003). Distinct gene expression programs function in progenitor and mature islet cells. J. Biol. Chem. 278, 17130–17140.

White, P., May, C.L., Lamounier, R.N., Brestelli, J.E., and Kaestner, K.H. (2008). Defining pancreatic endocrine precursors and their descendants. Diabetes 57.

Figure Legends

Figure 1. Ectopically expressed NEUROG3 under the control of rtTA rescues

endocrine specification in NEUROG3-/- pancreatic precursors.

(A) Schematic summarizing the 4-stage differentiation protocol of hPSCs to pancreatic

precursors. See also Figure S1.

(B) hPSCs were differentiated to day 12 pancreatic precursors and characterized for

endocrine specification by immunofluorescence. NEUROG3 and the pan-endocrine

marker CHGA are robustly expressed in wild-type cells but are completely lost in

NEUROG3-/- pancreatic precursors. PDX1 staining shows the NEUROG3-/- hPSCs form

pancreatic precursors with equal efficiency to wild-type cells.

(C) Schematic of the lentiviral vector (pINDUCER20 backbone) in which NEUROG3 is

under the control of rtTA. NEUROG3 is ectopically expressed when doxycycline is

added to the cell culture media. A hemagglutinin (HA) tag was added to the N-terminus

of NEUROG3. The listed patient mutations were created by mutagenesis for further

study.

(D) NEUROG3-/- hPSCs were transduced with the inducible NEUROG3 construct

(NEUROG3-/-,iWT) and differentiated into pancreatic precursors. The cells were pulsed

for 8 hours with dox on the indicated day of differentiation (x-axis) and harvested for

RNA on day 12. The resulting endocrine specification was assessed by quantitative

PCR with a panel of endocrine markers.

(E) NEUROG3-/- and NEUROG3-/-,iWT pancreatic cultures were dosed with dox on day 9 of differentiation. NEUROG3 expression was assessed by immunofluorescence using the HA tag.

(F) Endocrine specification in pancreatic precursors was assessed on day 12 by

immunofluorescence. The endocrine markers NKX2.2 and CHGA can only be detected

in NEUROG3-/- cells treated with the inducible construct.

Scale bars = 50µm. Error bars represent SEM.

Figure 2. Modelling NEUROG3 mutations in pancreatic precursors.

(A-D) NEUROG3-/- hPSCs were individually transduced with each inducible NEUROG3

mutant construct and then differentiated into pancreatic precursors. Cultures were

pulsed with dox for 8 hours on day 9, and then fixed. Induced NEUROG3 expression was assessed by immunofluorescence using the HA tag. Overall differentiation quality was confirmed by expression of PDX1. All images were captured and processed using the same settings.

(E-H) Cultures pulsed with dox on day 9 were differentiated into day 12 pancreatic precursors and then fixed for analysis by immunofluorescence. By day 12, ectopic

NEUROG3 was eliminated regardless of mutation.

(I-L) Endocrine specification was assessed using the markers NKX2.2 and CHGA for each mutation.

(M-P) INS, GCG, and SST expression for each mutation.

Scale bars = 50µm. Error bars represent SEM.

Figure 3. NEUROG3R107S, but not NEUROG3L135P, is less stable than wild type

NEUROG3 protein.

(A) Day 9 posterior foregut cultures were treated with dox for 24 hours for the wild type

and mutant NEUROG3 lines (R107S, L135P). Dox was removed from the media and

replaced with cycloheximide to prevent further protein translation. Protein was collected

at the indicated time points following removal of dox, analyzed by western blot, and

quantified by immunofluorescence. NEUROG3 was visualized using the HA tag (green)

in combination with a pan-actin antibody (red).

(B-D) Quantified NEUROG3 protein at each time point was normalized to actin and then

plotted as a function of time. An exponential curve was fit to the data (equations shown on the corresponding graphs).

(E) Protein degradation half-lives were calculated for NEUROG3WT, NEUROG3R107S,

and NEUROG3L135P using first-order kinetics (t1/2 = ln(2)/decay constant). The resulting half-life is an average of three independent experiments. Error bars represent SEM.
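The half-life calculation in (E) follows from first-order decay, N(t) = N0·exp(-kt), which gives t1/2 = ln(2)/k. As an illustration only, the Python sketch below estimates k from a least-squares line through ln(signal) versus time. This is one common way to fit an exponential and is not necessarily the procedure used for the graphs in (B-D). The time points and signal values are synthetic, and the function names are hypothetical.

```python
import math


def fit_first_order_decay(times, signals):
    """Fit ln(signal) = ln(N0) - k*t by ordinary least squares.

    Returns (k, N0). Signals must be positive (e.g. actin-normalized).
    """
    logs = [math.log(s) for s in signals]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return -slope, math.exp(intercept)


def half_life(decay_constant):
    """First-order half-life: t1/2 = ln(2) / k."""
    return math.log(2) / decay_constant


# Synthetic, noise-free time course (arbitrary time units) with a known k.
times = [0.0, 0.5, 1.0, 2.0, 4.0]
true_k = 0.7
signals = [math.exp(-true_k * t) for t in times]

k, n0 = fit_first_order_decay(times, signals)
t_half = half_life(k)
print(round(k, 3), round(t_half, 3))
```

A shorter half-life corresponds to a larger fitted decay constant, so a less stable protein appears as a steeper line on the log-transformed plot.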

Figure 4. Reduced activity in NEUROG3R107S is not explained by dosage.

(A) NEUROG3-/-,iWT and NEUROG3-/-,iR107S were induced in day 9 posterior foregut for 8

hours with 0, 30, 100, or 300ng/ml dox. Monolayers were then fixed using PFA and

analyzed by immunofluorescence. Ectopic NEUROG3 expression was detected using

the HA tag.

(B) Same as A. Overall quality of the differentiation was assessed using the pancreatic

marker PDX1. Homogeneous PDX1 expression can be seen across all dox dosages and

for both NEUROG3-/-,iWT and NEUROG3-/-,iR107S lines.

(C) RNA was collected to assess transcript levels by quantitative PCR. Primers

designed specific to the ectopically expressed NEUROG3 mRNA were used to assess

the level of induction. Function was interpreted by quantifying NEUROD, NKX2.2, and

PAX4 mRNA, direct targets of NEUROG3 (n=2, representative of 3 separate

experiments).

Scale bars = 50µm.

Figure 5. Modelling NEUROG3 mutations in intestinal organoids.

(A) Schematic summarizing the differentiation protocol of hPSCs to intestinal organoids.

See also Figure S1.

(B-F) NEUROG3 (WT, R107S, L135P, and E28X) was individually induced in day 28

intestinal organoids with an 8 hour pulse of dox. Organoids were collected immediately after induction, sectioned, and characterized by immunofluorescence. NEUROG3 (HA)

expression was confirmed for all constructs but E28X. There were no detectable

endocrine cells (CHGA). The epithelial component of the organoids is denoted with the

yellow dashed line (identified using the epithelial marker ECAD).

(G) Day 28 organoids pulsed with dox were collected for RNA. Induced NEUROG3 was

assessed by quantitative PCR and found to be at comparable levels across all lines.

The direct targets NEUROD, NKX2.2, and PAX4 were used to characterize the level of

NEUROG3 function with each mutation (n=3, representative of 3 separate experiments).

(H-K) Intestinal organoids pulsed with dox at day 28 were collected for analysis at day

35. Sections were analyzed by immunofluorescence for NEUROG3 (HA) and endocrine specification (CHGA).

Scale bars = 50µm. Error bars represent SEM.

Figure 6. Design of a dox inducible, tamoxifen regulated NEUROG3 construct.

(A) Schematic of the bicistronic GFP-P2A-NEUROG3ERT2 construct cloned into the pINDUCER20 lentiviral vector.

(B) NEUROG3-/- hPSCs were transduced with the inducible NEUROG3ERT2 construct

and then differentiated into posterior foregut. The schematic summarizes the treatment

timings and durations.

(C) Pancreatic precursors following treatment with Dox, 4OHT, and/or CHX were

analyzed by quantitative PCR (n=3, representative of 5 experiments). Known direct

targets NEUROD, PAX4, NKX2.2 were used to assess the ability of NEUROG3ERT2 to

activate downstream genes and also leakiness with various chemical combinations.

Overall quality of the pancreas differentiation was confirmed with PDX1.

(D-G) NEUROG3 (HA) expression was confirmed by immunofluorescence and

compared to the inducible tracer GFP.

(H-K) Expression of NKX2.2, a direct target of NEUROG3, was assessed by

immunofluorescence. Importantly, NKX2.2 expression is only observable in cultures

treated with both dox and 4OHT. NKX2.2 expression is lost when also cultured with the

translation blocker CHX.

Figure 7. Transcriptional profiling to identify direct and indirect targets of

NEUROG3 in pancreatic precursors

(A) RNA-seq was performed using pancreatic precursors treated with dox, 4OHT,

and/or CHX (Experiment summarized in Figure 6). Dox, 4OHT, and Dox+4OHT gene

expression was compared to no treatment controls. A 1.5-fold change was used as the

cutoff. The lists were compared for shared and unique entities, illustrated by the Venn

diagram.

(B) Venn diagram comparing the genes that have a >1.5-fold change in

Dox+4OHT+CHX treated pancreatic precursors compared to cultures treated with Dox,

4OHT, or CHX individually.

(C) Genes with altered expression unique to the treatments Dox+4OHT or

Dox+4OHT+CHX (from A and B) were compared for overlap.

(D) List of genes with the largest fold change, unique to the Dox+4OHT treatment (from

C).

(E) List of genes with the largest fold change, unique to the Dox+4OHT+CHX treatment

(from C).

(F) List of genes with the largest fold change, shared by the Dox+4OHT and

Dox+4OHT+CHX treatments.

(G) The complete gene list from Dox+4OHT treated cells was assessed for gene ontology. The most significantly enriched biological processes are listed.
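The filtering described in (A) and (B) can be sketched in a few lines of Python. This is an illustrative sketch only and not the RNA-seq pipeline used here. It assumes a signed fold-change convention in which decreases are reported as negative values, as in Tables 3 and 4, and it treats the 1.5-fold cutoff as inclusive. The gene names, expression values, and function names are all hypothetical.

```python
CUTOFF = 1.5  # fold-change cutoff stated in the legend


def signed_fold_change(treated, control):
    """Ratio >= 1 is returned as-is; a decrease is returned as -1/ratio."""
    ratio = treated / control
    return ratio if ratio >= 1 else -1.0 / ratio


def changed_genes(expression, control, cutoff=CUTOFF):
    """Genes whose absolute signed fold change meets the cutoff."""
    return {
        gene
        for gene, value in expression.items()
        if abs(signed_fold_change(value, control[gene])) >= cutoff
    }


def unique_to(treatment, others):
    """Genes changed in one treatment and in none of the other treatments."""
    result = set(treatment)
    for other in others:
        result -= other
    return result


# Hypothetical expression values for three placeholder genes.
control = {"GENE_A": 10.0, "GENE_B": 10.0, "GENE_C": 10.0}
dox = {"GENE_A": 10.5, "GENE_B": 16.0, "GENE_C": 10.0}
ohT = {"GENE_A": 9.8, "GENE_B": 10.2, "GENE_C": 10.1}
dox_4oht = {"GENE_A": 40.0, "GENE_B": 17.0, "GENE_C": 5.0}

unique = unique_to(
    changed_genes(dox_4oht, control),
    [changed_genes(dox, control), changed_genes(ohT, control)],
)
print(sorted(unique))
```

In this toy example GENE_B is excluded because it also responds to dox alone, which is the kind of treatment-specific background that the Venn comparisons are designed to remove.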

Figure 8. Ectopically expressed NEUROG3 is only functional in the epithelial component of human intestinal organoids.

(A) NEUROG3-/-,iWT HIOs were treated for 24 hours ±dox on day 28. The mesenchymal

component could then be sorted from the intestinal epithelium by FACS using the cell

surface marker EPCAM. Sorted cells were collected on day 29 (immediately following

removal of dox), and on day 35 (one week following NEUROG3 pulse). On day 35, an

EPCAM-Lo population was collected in addition to EPCAM-Neg and EPCAM-Hi. The percentage of the total population collected in each gate is shown.

(B-G) RNA was collected from the FACS separated intestinal mesenchyme (EPCAM-) and epithelium (EPCAM+) and then profiled by quantitative PCR. Efficient separation was confirmed using epithelial (CDX2, ECAD) and mesenchymal (VMNT) markers.

NEUROG3 induction was assessed using primers specific to the ectopic transcript

(iNEUROG3).

(F-H) NEUROG3 target genes.

(I-K) Markers of endocrine differentiation.

Data represent a single sample from pooled cells following FACS.

Supplemental Figure S1. Detailed differentiation protocols.

(A) 2-week protocol for the directed differentiation of human pluripotent stem cells into pancreatic precursors

(B) 4- to 5-week protocol for the directed differentiation of human pluripotent stem cells into 3-dimensional intestinal organoids

Supplemental Figure S2. NEUROG3R107S but not NEUROG3L135P is partially functional in pancreatic precursors.

(A) NEUROG3WT, NEUROG3R107S, NEUROG3L135P, and NEUROG3E28X were induced in day 9 pancreas cultures with various concentrations of dox (x-axis) for 24 hours and then harvested for RNA. Primers designed specific to the ectopically expressed

NEUROG3 mRNA were used to assess the level of induction by quantitative PCR (n=2, representative of 3 separate experiments).

(B) NEUROG3 function with each mutation was interpreted by quantifying NEUROD,

NKX2.2, and PAX4 mRNA, direct targets of NEUROG3.

Error bars represent SEM.

Supplemental Figure S3. The common SNP F199S does not affect NEUROG3 function

(A) Wild type NEUROG3 (NEUROG3WT) and NEUROG3 containing the common SNP

F199S (NEUROG3F199S) were ectopically expressed in day 9 NEUROG3-/- posterior

foregut for 8 hours with dox. Monolayers were then fixed and analyzed by

immunofluorescence. NEUROG3 (HA tag) was similarly expressed in each line. The

NEUROG3 target gene NKX2.2 was also similarly activated in either line. The

differentiations were of comparable quality evidenced by homogeneous expression of

the pancreas marker PDX1.

(B) RNA was collected from both NEUROG3WT and NEUROG3F199S lines at day 9

(posterior foregut) and day 11 (pancreatic precursors) for analysis by quantitative

PCR. NEUROG3 was expressed at similar levels when ectopically induced. There was

no significant difference found in endocrine specification following expression of either

NEUROG3WT or NEUROG3F199S (CHGA, NEUROD).

Supplemental Figure S4. Characterization of the NEUROG3-/-,iERT2 inducible

construct in posterior foregut

(A) NEUROG3-/-,iERT2 were differentiated into posterior foregut. On day 8, dox was

added for 24 hours at 0, 25, 50, 100, or 200ng/ml. Cultures induced with 100ng/ml dox

were then treated with 0.1, 0.3, or 1µM 4OHT for an additional 8 hours. Monolayers were then collected for RNA. The NEUROG3 target genes NEUROD and NKX2.2 were interrogated by quantitative PCR (n=2, representative of 3 separate experiments). There was no observable leakiness in the induced NEUROG3ERT2

construct up to 200ng/ml dox. Furthermore, the lowest tested dose of 0.1µM 4OHT was

sufficient to produce the highest NEUROG3 activity. No treatments had a discernable

effect on the quality of differentiation (PDX1).

Error bars represent SEM.

Table 3. (Direct and indirect targets) Genes with fold change >1.5 only in Dox+4OHT treatment. Samples compared to those treated with Dox only and 4OHT only (See Fig. 7A,D).

Gene Symbol | Fold Change | ID | Description
ZIC5 | 6.05 | 85416 | Zic family member 5
NKX2-2 | 5.94 | 4821 | NK2 homeobox 2
NEUROD1 | 4.16 | 4760 | neuronal differentiation 1
INSM1 | 3.95 | 3642 | insulinoma-associated 1
PPARGC1B | 3.83 | 133522 | peroxisome proliferator-activated receptor gamma, coactivator 1 beta
KRT77 | 2.79 | 374454 | keratin 77
C1QL3 | 2.67 | 389941 | complement component 1, q subcomponent-like 3
MYBPC1 | 2.56 | 4604 | myosin binding protein C, slow type
CBY3 | 2.53 | 646019 | chibby homolog 3 (Drosophila)
SCN7A | 2.52 | 6332 | sodium channel, voltage-gated, type VII, alpha subunit
IRX3 | 2.46 | 79191 | iroquois homeobox 3
ARHGAP20 | 2.44 | 57569 | Rho GTPase activating protein 20
APLNR | 2.36 | 187 | apelin receptor
TM4SF19-TCTEX1D2 | 2.29 | 100534611 | TM4SF19-TCTEX1D2 readthrough
GNRHR2 | 2.29 | 114814 | gonadotropin-releasing hormone (type 2) receptor 2
FABP7 | 2.25 | 2173 | fatty acid binding protein 7, brain
MUC15 | 2.14 | 143662 | mucin 15, cell surface associated
TRIM29 | 2.12 | 23650 | tripartite motif containing 29
MIR205HG | 2.03 | 642587 | MIR205 host gene (non-protein coding)
LOC100131825 | 2.00 | 100131825 | uncharacterized LOC100131825
SUSD3 | 1.95 | 203328 | sushi domain containing 3
MYOT | 1.92 | 9499 | myotilin
PCDHA9 | 1.89 | 9752 | protocadherin alpha 9
ARX | 1.89 | 170302 | aristaless related homeobox
DOK2 | 1.89 | 9046 | docking protein 2, 56kDa
C8B | 1.88 | 732 | complement component 8, beta polypeptide
SLC7A5P1 | 1.87 | 81893 | solute carrier family 7 (amino acid transporter light chain, L system), member 5 pseudogene 1
PCOLCE-AS1 | 1.87 | 100129845 | PCOLCE antisense RNA 1
C1QL1 | 1.84 | 10882 | complement component 1, q subcomponent-like 1
C10orf10 | 1.81 | 11067 | chromosome 10 open reading frame 10
CPSF4L | 1.79 | 642843 | cleavage and polyadenylation specific factor 4-like
RGS5 | 1.79 | 8490 | regulator of G-protein signaling 5
RAPSN | 1.78 | 5913 | receptor-associated protein of the synapse
BHMT | 1.77 | 635 | betaine--homocysteine S-methyltransferase
GLT6D1 | 1.75 | 360203 | glycosyltransferase 6 domain containing 1
RYR1 | 1.75 | 6261 | ryanodine receptor 1 (skeletal)
LOC646168 | 1.73 | 646168 | uncharacterized LOC646168
SLC6A2 | 1.73 | 6530 | solute carrier family 6 (neurotransmitter transporter, noradrenalin), member 2
AARD | 1.71 | 441376 | alanine and arginine rich domain containing protein
SPRR2E | 1.71 | 6704 | small proline-rich protein 2E
PPP2R2B | 1.71 | 5521 | protein phosphatase 2, regulatory subunit B, beta
BMPER | 1.70 | 168667 | BMP binding endothelial regulator
PAX4 | 1.69 | 5078 | paired box 4
NOX1 | 1.68 | 27035 | NADPH oxidase 1
LRRC55 | 1.67 | 219527 | leucine rich repeat containing 55
COL6A5 | 1.66 | 256076 | collagen, type VI, alpha 5
ANGPT1 | 1.66 | 284 | angiopoietin 1
ANPEP | 1.65 | 290 | alanyl (membrane) aminopeptidase
TRPM3 | 1.65 | 80036 | transient receptor potential cation channel, subfamily M, member 3
LOC644838 | 1.64 | 644838 | uncharacterized LOC644838
NGFR | 1.63 | 4804 | nerve growth factor receptor
FLJ43663 | 1.62 | 378805 | uncharacterized LOC378805
LOC100129858 | 1.61 | 100129858 | uncharacterized LOC100129858
TFPI2 | 1.61 | 7980 | tissue factor pathway inhibitor 2
RYR2 | 1.60 | 6262 | ryanodine receptor 2 (cardiac)
LOC284581 | 1.60 | 284581 | uncharacterized LOC284581
LINC00639 | 1.60 | 283547 | long intergenic non-protein coding RNA 639
C6orf222 | 1.59 | 389384 | chromosome 6 open reading frame 222
ANXA10 | 1.59 | 11199 | annexin A10
LDHA | 1.58 | 3939 | lactate dehydrogenase A
DIRAS2 | 1.57 | 54769 | DIRAS family, GTP-binding RAS-like 2
LFNG | 1.56 | 3955 | LFNG O-fucosylpeptide 3-beta-N-acetylglucosaminyltransferase
DUOXA2 | 1.54 | 405753 | dual oxidase maturation factor 2
KIAA0408 | 1.54 | 9729 | KIAA0408
PCP4 | 1.54 | 5121 | Purkinje cell protein 4
SNORD12 | 1.53 | 692057 | small nucleolar RNA, C/D box 12
MAMDC2 | 1.53 | 256691 | MAM domain containing 2
DOC2A | 1.53 | 8448 | double C2-like domains, alpha
SPC24 | 1.53 | 147841 | SPC24, NDC80 kinetochore complex component, homolog (S. cerevisiae)
GPR34 | 1.53 | 2857 | G protein-coupled receptor 34
SPRY1 | 1.52 | 10252 | sprouty homolog 1, antagonist of FGF signaling (Drosophila)
PTCHD4 | -1.50 | 442213 | patched domain containing 4
CHGB | -1.53 | 1114 | chromogranin B (secretogranin 1)
CCDC152 | -1.53 | 100129792 | coiled-coil domain containing 152
CAMK4 | -1.54 | 814 | calcium/calmodulin-dependent protein kinase IV
NPAS4 | -1.54 | 266743 | neuronal PAS domain protein 4
GPR128 | -1.54 | 84873 | G protein-coupled receptor 128
MAGEC3 | -1.55 | 139081 | melanoma antigen family C, 3
IGFBP1 | -1.55 | 3484 | insulin-like growth factor binding protein 1
THBD | -1.56 | 7056 | thrombomodulin
MBNL1-AS1 | -1.57 | 401093 | MBNL1 antisense RNA 1
LOC440894 | -1.58 | 440894 | uncharacterized LOC440894
PDE7B | -1.58 | 27115 | phosphodiesterase 7B
SCML4 | -1.58 | 256380 | sex comb on midleg-like 4 (Drosophila)
DIO3 | -1.59 | 1735 | deiodinase, iodothyronine, type III
EPS8L3 | -1.60 | 79574 | EPS8-like 3
CCDC70 | -1.61 | 83446 | coiled-coil domain containing 70
SPATA32 | -1.62 | 124783 | spermatogenesis associated 32
HCG26 | -1.62 | 352961 | HLA complex group 26 (non-protein coding)
PVRL3-AS1 | -1.63 | 100506555 | PVRL3 antisense RNA 1
LOC100144602 | -1.65 | 100144602 | uncharacterized LOC100144602
LINC00324 | -1.65 | 284029 | long intergenic non-protein coding RNA 324
KCNRG | -1.66 | 283518 | potassium channel regulator
DHH | -1.68 | 50846 | desert hedgehog
C12orf39 | -1.68 | 80763 | chromosome 12 open reading frame 39
LOC286059 | -1.70 | 286059 | tumor necrosis factor receptor superfamily, member 10d, decoy with truncated death domain pseudogene
EDAR | -1.71 | 10913 | ectodysplasin A receptor
HIST1H3E | -1.71 | 8353 | histone cluster 1, H3e
ICAM1 | -1.72 | 3383 | intercellular adhesion molecule 1
HIST1H2BG | -1.73 | 8339 | histone cluster 1, H2bg
B3GALT5 | -1.76 | 10317 | UDP-Gal:betaGlcNAc beta 1,3-galactosyltransferase, polypeptide 5
C19orf77 | -1.81 | 284422 | chromosome 19 open reading frame 77
FAM154A | -1.89 | 158297 | family with sequence similarity 154, member A
PCDHB17 | -1.93 | 54661 | protocadherin beta 17 pseudogene
CYP2A7 | -1.96 | 1549 | cytochrome P450, family 2, subfamily A, polypeptide 7
TMED7-TICAM2 | -1.98 | 100302736 | TMED7-TICAM2 readthrough
C15orf32 | -2.03 | 145858 | chromosome 15 open reading frame 32
FITM1 | -2.07 | 161247 | fat storage-inducing transmembrane protein 1
HIST1H3D | -2.07 | 8351 | histone cluster 1, H3d
USP50 | -2.09 | 373509 | ubiquitin specific peptidase 50
HLA-DRB5 | -2.10 | 3127 | major histocompatibility complex, class II, DR beta 5
RERG | -2.15 | 85004 | RAS-like, estrogen-regulated, growth inhibitor
NDST4 | -2.17 | 64579 | N-deacetylase/N-sulfotransferase (heparan glucosaminyl) 4
FLJ35024 | -2.20 | 401491 | uncharacterized LOC401491
EPHA5 | -2.22 | 2044 | EPH receptor A5
CCR1 | -2.53 | 1230 | chemokine (C-C motif) receptor 1
PKD2L2 | -3.32 | 27039 | polycystic kidney disease 2-like 2
MED4-AS1 | -3.52 | 100873965 | MED4 antisense RNA 1
LOC728407 | -3.99 | 728407 | poly (ADP-ribose) glycohydrolase pseudogene

Table 4. (Direct targets) Genes with fold change >1.5 only in Dox+4OHT+CHX treatment. Samples compared to those treated with dox only, 4OHT only, and CHX only. (See Fig. 7B, E).

Gene Symbol | Fold Change | Entrez ID | Description
APOA4 | 12.45 | 337 | apolipoprotein A-IV
RASGRF1 | 5.48 | 5923 | Ras protein-specific guanine nucleotide-releasing factor 1
C1orf105 | 3.98 | 92346 | chromosome 1 open reading frame 105
KRT7 | 3.29 | 3855 | keratin 7
ADRB3 | 2.98 | 155 | adrenoceptor beta 3
GABRG2 | 2.90 | 2566 | gamma-aminobutyric acid (GABA) A receptor, gamma 2
RAG2 | 2.87 | 5897 | recombination activating gene 2
FOXQ1 | 2.87 | 94234 | forkhead box Q1
PCDHGA7 | 2.77 | 56108 | protocadherin gamma subfamily A, 7
MEOX2 | 2.61 | 4223 | mesenchyme homeobox 2
HEYL | 2.47 | 26508 | hairy/enhancer-of-split related with YRPW motif-like
MROH9 | 2.45 | 80133 | maestro heat-like repeat family member 9
C11orf96 | 2.42 | 387763 | chromosome 11 open reading frame 96
PIK3CG | 2.42 | 5294 | phosphatidylinositol-4,5-bisphosphate 3-kinase, catalytic subunit gamma
BANK1 | 2.29 | 55024 | B-cell scaffold protein with ankyrin repeats 1
ID4 | 2.13 | 3400 | inhibitor of DNA binding 4, dominant negative helix-loop-helix protein
PCDHGA5 | 2.10 | 56110 | protocadherin gamma subfamily A, 5
HIST1H1T | 2.07 | 3010 | histone cluster 1, H1t
C17orf105 | 2.06 | 284067 | chromosome 17 open reading frame 105
CDKN2B | 1.96 | 1030 | cyclin-dependent kinase inhibitor 2B (p15, inhibits CDK4)
ARMS2 | 1.95 | 387715 | age-related maculopathy susceptibility 2
HOXD-AS2 | 1.94 | 100506783 | HOXD cluster antisense RNA 2
LOC285878 | 1.90 | 285878 | uncharacterized LOC285878
ANKUB1 | 1.86 | 389161 | ankyrin repeat and ubiquitin domain containing 1
EBLN1 | 1.83 | 340900 | endogenous Bornavirus-like nucleoprotein 1
ADAP1 | 1.78 | 11033 | ArfGAP with dual PH domains 1
HIST1H4K | 1.76 | 8362 | histone cluster 1, H4k
EIF1AX | 1.75 | 1964 | eukaryotic translation initiation factor 1A, X-linked
PRM1 | 1.75 | 5619 | protamine 1
C1orf64 | 1.75 | 149563 | chromosome 1 open reading frame 64
HUS1B | 1.72 | 135458 | HUS1 checkpoint homolog b (S. pombe)
DDAH1 | 1.71 | 23576 | dimethylarginine dimethylaminohydrolase 1
ENC1 | 1.66 | 8507 | ectodermal-neural cortex 1 (with BTB domain)
PPM1D | 1.66 | 8493 | protein phosphatase, Mg2+/Mn2+ dependent, 1D
RNF126 | 1.65 | 55658 | ring finger protein 126
SEPT11 | 1.65 | 55752 | septin 11
PFN2 | 1.65 | 5217 | profilin 2
CREBBP | 1.63 | 1387 | CREB binding protein
YY1 | 1.63 | 7528 | YY1 transcription factor
SEC24A | 1.61 | 10802 | SEC24 family, member A (S. cerevisiae)
KCNS2 | 1.61 | 3788 | potassium voltage-gated channel, delayed-rectifier, subfamily S, member 2
PHAX | 1.61 | 51808 | phosphorylated adaptor for RNA export
SFPQ | 1.60 | 6421 | splicing factor proline/glutamine-rich
BRPF3 | 1.59 | 27154 | bromodomain and PHD finger containing, 3
C1orf198 | 1.59 | 84886 | chromosome 1 open reading frame 198
CAPZA2 | 1.58 | 830 | capping protein (actin filament) muscle Z-line, alpha 2
CAPZA1 | 1.58 | 829 | capping protein (actin filament) muscle Z-line, alpha 1
RHOU | 1.58 | 58480 | ras homolog family member U
DHX32 | 1.58 | 55760 | DEAH (Asp-Glu-Ala-His) box polypeptide 32
TAX1BP1 | 1.58 | 8887 | Tax1 (human T-cell leukemia virus type I) binding protein 1
HIF1A | 1.57 | 3091 | hypoxia inducible factor 1, alpha subunit (basic helix-loop-helix transcription factor)
DMXL1 | 1.57 | 1657 | Dmx-like 1
CTDSP1 | 1.56 | 58190 | CTD (carboxy-terminal domain, RNA polymerase II, polypeptide A) small phosphatase 1
RAI14 | 1.56 | 26064 | retinoic acid induced 14
MACC1 | 1.55 | 346389 | metastasis associated in colon cancer 1
SENP1 | 1.55 | 29843 | SUMO1/sentrin specific peptidase 1
COLGALT2 | 1.55 | 23127 | collagen beta(1-O)galactosyltransferase 2
STK24 | 1.55 | 8428 | serine/threonine kinase 24
MYLIP | 1.54 | 29116 | myosin regulatory light chain interacting protein
DDX1 | 1.54 | 1653 | DEAD (Asp-Glu-Ala-Asp) box helicase 1
DIP2B | 1.52 | 57609 | DIP2 disco-interacting protein 2 homolog B (Drosophila)
CYTH4 | 1.52 | 27128 | cytohesin 4
SLC10A2 | 1.52 | 6555 | solute carrier family 10 (sodium/bile acid cotransporter family), member 2
SOX21 | 1.52 | 11166 | SRY (sex determining region Y)-box 21
SHROOM3 | 1.52 | 57619 | shroom family member 3
NINJ1 | 1.52 | 4814 | ninjurin 1
IGFBP5 | 1.51 | 3488 | insulin-like growth factor binding protein 5
RNF40 | 1.51 | 9810 | ring finger protein 40, E3 ubiquitin protein ligase
PSMC1 | 1.51 | 5700 | proteasome (prosome, macropain) 26S subunit, ATPase, 1
PLK2 | 1.51 | 10769 | polo-like kinase 2
PILRB | -1.59 | 29990 | paired immunoglobin-like type 2 receptor beta
PLXNC1 | -1.59 | 10154 | plexin C1
STK36 | -1.60 | 27148 | serine/threonine kinase 36
RELN | -1.61 | 5649 | reelin
TMBIM6 | -1.66 | 7009 | transmembrane BAX inhibitor motif containing 6
SEC31B | -1.67 | 25956 | SEC31 homolog B (S. cerevisiae)
TRPV1 | -1.67 | 7442 | transient receptor potential cation channel, subfamily V, member 1
DHCR24 | -1.68 | 1718 | 24-dehydrocholesterol reductase
NFE2L1 | -1.69 | 4779 | nuclear factor (erythroid-derived 2)-like 1
PDIA3 | -1.70 | 2923 | protein disulfide isomerase family A, member 3
SGCD | -1.71 | 6444 | sarcoglycan, delta (35kDa dystrophin-associated glycoprotein)
NCSTN | -1.71 | 23385 | nicastrin
HNRNPUL2-BSCL2 | -1.72 | 100534595 | HNRNPUL2-BSCL2 readthrough
SLC22A17 | -1.72 | 51310 | solute carrier family 22, member 17
GNS | -1.72 | 2799 | glucosamine (N-acetyl)-6-sulfatase
FBN1 | -1.73 | 2200 | fibrillin 1
CLSTN1 | -1.73 | 22883 | calsyntenin 1
MXRA7 | -1.73 | 439921 | matrix-remodelling associated 7
VSIG10 | -1.74 | 54621 | V-set and immunoglobulin domain containing 10
CD81 | -1.75 | 975 | CD81 molecule
GPC3 | -1.75 | 2719 | glypican 3
NAT8L | -1.75 | 339983 | N-acetyltransferase 8-like (GCN5-related, putative)
SLC39A7 | -1.76 | 7922 | solute carrier family 39 (zinc transporter), member 7
MEGF6 | -1.77 | 1953 | multiple EGF-like-domains 6
COL26A1 | -1.77 | 136227 | collagen, type XXVI, alpha 1
MLEC | -1.78 | 9761 | malectin
F10 | -1.79 | 2159 | coagulation factor X
CD151 | -1.80 | 977 | CD151 molecule (Raph blood group)
TMC6 | -1.80 | 11322 | transmembrane channel-like 6
PRCP | -1.81 | 5547 | prolylcarboxypeptidase (angiotensinase C)
ADAMTS12 | -1.83 | 81792 | ADAM metallopeptidase with thrombospondin type 1 motif, 12
C14orf37 | -1.83 | 145407 | chromosome 14 open reading frame 37
CTSC | -1.83 | 1075 | cathepsin C
CACNA2D2 | -1.85 | 9254 | calcium channel, voltage-dependent, alpha 2/delta subunit 2
SPOCK2 | -1.89 | 9806 | sparc/osteonectin, cwcv and kazal-like domains proteoglycan (testican) 2
QSOX1 | -1.93 | 5768 | quiescin Q6 sulfhydryl oxidase 1
B3GNT1 | -2.00 | 11041 | UDP-GlcNAc:betaGal beta-1,3-N-acetylglucosaminyltransferase 1
COL6A4P2 | -2.03 | 646300 | collagen, type VI, alpha 4 pseudogene 2
HABP2 | -2.09 | 3026 | hyaluronan binding protein 2
NTRK2 | -2.10 | 4915 | neurotrophic tyrosine kinase, receptor, type 2
WNT8B | -2.17 | 7479 | wingless-type MMTV integration site family, member 8B
SLC25A18 | -2.40 | 83733 | solute carrier family 25 (glutamate carrier), member 18
CD97 -2.54 976CD97 molecule

Table 5. Genes with fold change >1.5 only in both Dox+4OHT and Dox+4OHT+CHX samples (See Fig. 7C,F)

Gene Symbol    Fold Change (-CHX)  Fold Change (+CHX)  Entrez ID  Description
KCNK17         20.12                5.35               89822      potassium channel, subfamily K, member 17
CLCA1          13.04               10.67               1179       chloride channel accessory 1
PPP1R17         9.95                6.44               10842      protein phosphatase 1, regulatory subunit 17
C1orf173        8.69                5.74               127254     chromosome 1 open reading frame 173
KIF19           8.58               19.57               124602     kinesin family member 19
CHRNA1          4.81               11.69               1134       cholinergic receptor, nicotinic, alpha 1 (muscle)
DRD2            4.33                4.32               1813       dopamine receptor D2
PCDHGA3         3.58                4.42               56112      protocadherin gamma subfamily A, 3
RGS16           3.58                5.74               6004       regulator of G-protein signaling 16
ASCL1           2.95                3.67               429        achaete-scute complex homolog 1 (Drosophila)
RD3             2.42               -2.20               343035     retinal degeneration 3
IGFN1           2.33                3.40               91156      immunoglobulin-like and fibronectin type III domain containing 1
DLL3            2.29                2.86               10683      delta-like 3 (Drosophila)
VSX1            1.93                1.64               30813      visual system homeobox 1
P2RX1           1.88                2.07               5023       purinergic receptor P2X, ligand-gated ion channel, 1
MORF4L2-AS1     1.68               -1.79               340544     MORF4L2 antisense RNA 1
SLC38A5         1.66                4.73               92745      solute carrier family 38, member 5
PTGDR2          1.59                3.64               11251      prostaglandin D2 receptor 2
SLC18A3         1.57                4.71               6572       solute carrier family 18 (vesicular acetylcholine), member 3
DES             1.54                3.69               1674       desmin
HIST1H2AB       1.53                3.35               8335       histone cluster 1, H2ab
FAM65C          1.53                2.68               140876     family with sequence similarity 65, member C
IRF4           -1.88                2.63               3662       interferon regulatory factor 4
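The selection rule stated in the Table 5 caption can be illustrated with a short sketch. It assumes that the >1.5 cutoff is applied to the absolute fold change in each sample, since rows such as RD3 and IRF4 change sign between conditions; the dissertation text here does not spell this out. The function name `changed_in_both` and the entry `GENE_X` are hypothetical and exist only for illustration; the other values are copied from Table 5.

```python
# Fold changes as (-CHX, +CHX) pairs; the first three rows are copied from Table 5.
table5_rows = {
    "KCNK17": (20.12, 5.35),
    "RD3": (2.42, -2.20),
    "IRF4": (-1.88, 2.63),
}

# Hypothetical gene (not from the dissertation) that misses the cutoff with CHX.
hypothetical_rows = {"GENE_X": (3.10, 1.20)}


def changed_in_both(fold_changes, cutoff=1.5):
    """Return genes whose absolute fold change exceeds the cutoff in both samples.

    Assumption: the cutoff is applied to the magnitude of the fold change,
    so up- and down-regulated genes are both retained.
    """
    return sorted(
        gene
        for gene, (minus_chx, plus_chx) in fold_changes.items()
        if abs(minus_chx) > cutoff and abs(plus_chx) > cutoff
    )


all_rows = {**table5_rows, **hypothetical_rows}
shared = changed_in_both(all_rows)
print(shared)
```

Under this assumption every row of Table 5 passes the filter, while a gene that responds only without cycloheximide, like the hypothetical `GENE_X`, is excluded.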

Figure 1

[Figure image not reproduced; recovered labels are summarized below.]
Panel A. Differentiation schematics.
  Pancreatic: Split hESC, Definitive Endoderm, Gut Tube, Posterior Foregut, Pancreatic Precursors (2, 3, 2, 4 and 3 days; days -2 to 12). Media labels: Y-27632, mTesR, 0.2% FBS, 2% FBS, RPMI+NEAA, 1% B27, HG-DMEM. Factor labels: [50ng/ml] BMP4, [100ng/ml] Activin A, [50ng/ml] FGF-7, [2μM] RA, [50ng/ml] [25ng/ml] Noggin.
  Intestinal: Split hESC, Definitive Endoderm, Intestinal Spheroids, Intestinal Organoids (2, 3, 4 and 28 days; days -2 to 35). Media labels: Y-27632, mTesR, 0.2% FBS, 2% FBS, RPMI+NEAA, [1x B27, 2mM L-Glut, P/S, 15mM HEPES] HG-DMEM. Factor labels: [100ng/ml] Activin A, [3μM] Chiron, [500ng/ml] FGF-4, [100ng/ml] EGF.
Panel B. Staining for NEUROG3, CHGA, PDX1 and Merge in NEUROG3+/+ and NEUROG3-/- cultures.
Panels C, D. pINDUCER20 construct (5' LTR, TRE2, HA-NEUROG3 (WT, R107S, L135P, E28X), UBC, rtTA3, IRES, Neo, 3' LTR; DOX, rtTA3) and mRNA fold change (to -dox) for GHR, INS, SST, GCG, CHGA and NKX2.2 against the day of differentiation dox was added (-dox, D5 to D10).
Panel E. Day 9 (100ng/ml dox added for 8 hrs): HA, PDX1.
Panel F. Day 12 (3 days post dox): NKX2.2, CHGA, PDX1. Genotypes: NEUROG3-/- and NEUROG3-/-,iWT.

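Figure 2

[Figure image not reproduced; recovered labels are summarized below.]
Columns: WT, R107S, L135P, E28X.
Panels A-D. 8h dox [100ng/ml]: HA(NEUROG3), PDX1.
Panels E, H, K, N. HA(NEUROG3), PDX1.
Panels F, I, L, O. 72h post dox (Day 12): CHGA, NKX2.2.
Panels G, J, M, P. 72h post dox (Day 12): SST, GCG, INS.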

Figure 3

[Figure image not reproduced; recovered labels are summarized below.]
Panel A. Western blots of HA (NEUROG3) and Pan-Actin (30kD marker) for WT, R107S and L135P; lanes: -dox, then 0, 30, 45, 60, 90, 120 and 180 minutes following removal of dox.
Panel B. WT: HA/Actin versus Time (hours); fit y = 0.229e^(-0.676x), R² = 0.9135.
Panel C. R107S: HA/Actin versus Time (hours); fit y = 0.1511e^(-1.365x), R² = 0.8896.
Panel D. L135P: HA/Actin versus Time (hours); fit y = 0.9691e^(-0.767x), R² = 0.8901.
Panel E. NEUROG3 Half Life (hours) for WT, R107S and L135P, with t1/2 = ln(2)/(decay constant); annotations: ns, p=.0008.
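The half-life relation printed in Figure 3, t1/2 = ln(2)/(decay constant), can be applied directly to the decay constants of the three exponential fits shown in the figure. The following is a minimal sketch, not the author's analysis: it uses only the single fitted curves whose equations appear in the figure, so the results need not match the bar graph exactly if that graph summarizes replicate experiments. The function name `half_life` is an assumption introduced for illustration.

```python
import math

# Decay constants (per hour) read from the exponential fits shown in Figure 3.
decay_constants = {"WT": 0.676, "R107S": 1.365, "L135P": 0.767}


def half_life(decay_constant):
    """Half-life for first-order decay: t1/2 = ln(2) / decay constant."""
    return math.log(2) / decay_constant


# Half-life in hours for each NEUROG3 variant.
half_lives = {variant: half_life(k) for variant, k in decay_constants.items()}

for variant, hours in half_lives.items():
    print(f"{variant}: {hours:.2f} h")
```

With these fits the R107S curve decays roughly twice as fast as the WT curve, while the L135P curve is close to WT.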

Figure 4

[Figure image not reproduced; recovered labels are summarized below.]
Panel A. HA (NEUROG3) staining at 0ng, 30ng, 100ng and 300ng Dox; rows: WT NEUROG3 and R107S NEUROG3.
Panel B. HA (NEUROG3) and PDX1 staining at 0ng, 30ng, 100ng and 300ng Dox.
Panel C. mRNA Fold Change (to 0ng, WT) versus Dox Concentration [ng/ml] (0ng, 30ng, 100ng, 300ng) for induced NEUROG3, NEUROD, NKX2-2 and PAX4; series: WT and R107S.
Schematic labels. Pancreatic differentiation: Split hESC, Definitive Endoderm, Gut Tube, Posterior Foregut, Pancreatic Precursors (2, 3, 2, 4 and 3 days; days -2 to 12). Media labels: Y-27632, mTesR, 0.2% FBS, 2% FBS, RPMI+NEAA, 1% B27, HG-DMEM. Factor labels: [50ng/ml] BMP4, [100ng/ml] Activin A, [50ng/ml] FGF-7, [2μM] RA, [50ng/ml] [25ng/ml] Noggin.

Figure 5

[Figure image not reproduced; recovered labels are summarized below.]
Panel A. Intestinal differentiation schematic: Split hESC, Definitive Endoderm, Intestinal Spheroids, Intestinal Organoids (2, 3, 4 and 28 days; days -2 to 35). Media labels: Y-27632, mTesR, 0.2% FBS, 2% FBS, RPMI+NEAA, [1x B27, 2mM L-Glut, P/S, 15mM HEPES] HG-DMEM. Factor labels: [100ng/ml] Activin A, [3μM] Chiron, [500ng/ml] FGF-4, [100ng/ml] EGF.
Panels B-F and B'-F'. 0ng/ml Dox (WT) and 8hrs, 100ng/ml Dox (WT, R107S, L135P, E28X); stains: DAPI, HA, CHGA, ECAD; HA(NEUROG3).
Panel G. mRNA Fold Change (to WT, 0ng) at 0 and 100 ng/ml Dox for indNGN3, NEUROD, NKX2.2 and PAX4; series: WT, R107S, L135P, E28X.
Panels H-K and H'-K'. WT, R107S, L135P, E28X; stains: DAPI, HA, CHGA, ECAD; CHGA.

Figure 6

[Figure image not reproduced; recovered labels are summarized below.]
Panels A, B. pINDUCER20 construct labels (TRE2, GFP, P2A, HA-NEUROG3, ERT2, rtTA3, DOX) and treatment timeline (differentiation day 9 to 12) with 200ng/ml dox, 100nM OHT and 10μM CHX; durations labeled (24h) and (8h).
Panel C. Conditions -dox, +dox, dox, 4OHT and dox, 4OHT, CHX, labeled as controls, direct and indirect targets, and direct targets.
Panels D-G and D'-G'. GFP and HA (NEUROG3) staining for -dox, +dox, dox, 4OHT and dox, 4OHT, CHX.
Panels H-K, H'-K' and H''-K''. GFP and NKX2.2 staining.
Bar graphs. mRNA Fold Change (to no treatment) for PDX1, NEUROD, PAX4 and NKX2.2 across no treatment, 4OHT, CHX, Dox, Dox + 4OHT and Dox+4OHT+CHX.
The extracted page also repeats the pancreatic and intestinal differentiation schematic labels listed for Figure 1.

Figure 7

[Panels A-C: image not reproduced; sample labels: Dox, 4OHT, CHX, Dox+4OHT, Dox+4OHT+CHX.]

Panels D-F. Gene lists with fold changes.

DEGs shared between Dox+4OHT+CHX and Dox+4OHT (n=23): Gene Symbol, Fold Change (-CHX), Fold Change (+CHX). The 21 genes shown (KCNK17 through HIST1H2AB) and their values are the same as the corresponding rows of Table 5.

DEGs unique to Dox+4OHT (n=119)     DEGs unique to Dox+4OHT+CHX (n=113)
Gene Symbol        Fold Change      Gene Symbol   Fold Change
ZIC5               6.05             APOA4         12.45
NKX2-2             5.94             RASGRF1        5.48
NEUROD1            4.16             C1orf105       3.98
INSM1              3.95             KRT7           3.29
PPARGC1B           3.83             ADRB3          2.98
KRT77              2.79             GABRG2         2.90
C1QL3              2.67             RAG2           2.87
MYBPC1             2.56             FOXQ1          2.87
CBY3               2.53             PCDHGA7        2.77
SCN7A              2.52             MEOX2          2.61
IRX3               2.46             HEYL           2.47
ARHGAP20           2.44             MROH9          2.45
APLNR              2.36             C11orf96       2.42
TM4SF19-TCTEX1D2   2.29             PIK3CG         2.42
GNRHR2             2.29             BANK1          2.29
FABP7              2.25             ID4            2.13
MUC15              2.14             PCDHGA5        2.10
TRIM29             2.12             HIST1H1T       2.07
MIR205HG           2.03             C17orf105      2.06
LOC100131825       2.00             CDKN2B         1.96
SUSD3              1.95             ARMS2          1.95

Panel G.

Biological Process                             # Genes  p-Value
muscle contraction                             8        3.86E-05
regulation of neurogenesis                     7        5.06E-04
cell-cell signaling                            12       9.53E-04
regulation of nervous system development       7        1.08E-03
transmission of nerve impulse                  9        1.25E-03
regulation of cell development                 7        1.51E-03
synaptic transmission                          8        2.18E-03
positive regulation of cell differentiation    7        2.64E-03
regulation of sequestering of calcium ion      3        3.15E-03
endocrine pancreas development                 3        4.65E-03
ion transport                                  12       6.45E-03
positive regulation of developmental process   7        6.78E-03
cellular ion homeostasis                       8        7.51E-03
endocrine system development                   4        8.44E-03

Figure 8

[Figure image not reproduced; recovered labels are summarized below.]
Panel A. Flow cytometry plots (BD FACSDiva 8.0) at Day 28 and Day 35 (7 days after dox), -dox and +dox. Gate percentages recovered from the plots: 57%, 28%, 63%, 19%, 6.5%, 47%, 33%, 55%, 23%, 10%.
Bar graphs. mRNA Fold Change (to Day 28, EPCAM-) for -dox and +dox samples in the sorted populations D28, EPCAM- and D28, EPCAM+, and D35, EPCAM-neg, D35, EPCAM-Lo and D35, EPCAM-Hi. Genes: CDX2, NKX2-2, GIP, INS, NEUROD, ECAD, GHRL, PAX4, iNEUROG3, VMNT, PDX1, SST, CHGA, CXCR4, FOXA2, GCG.
45000 40000 45000 40000 40000 40000 45 4000040 45 4000040 4000040 4040000 140 40 14040 40 40 7000 35 35 35 35 40 40 40 40 1400 1400 1201400 1201400 35 120 120 7000 35 6000120 6000 120 1206000 6000120 6000 6000 6000 6000 1200 1200 7 1200 1200 7 40 4030000 40 40 30000 1400100 1400100 1400100 1001400 120 120 50 50 120 50120 50 600 600 600 600 40 40 40 40 40000 40000 35000 35000 3500040 35000 40 403500035 403500035 403500035 354035000 140035 140035 140035 140035 120 120 120 120 50 600 600 50 600 600 1400 1400 1400 1400 120 120 120 6000 120 120 120 30 60030 600 30 30600 600 40000 40000 35 35 40 120 12035 35 40 120 120 600 600 600 600 6000 1000 1000 35 35 35 35 1200 1200120 1001200 1001200120 100 100 6000 5000100 5000 100 6000 1005000 5000100 5000 5000 5000 5000 1000 1000 35000 35000 35 35 35 35 35 35 1200 1200 1200 1200 100 100 100 100 500 500 500 500 35 35 35 35 30000 30000 30000 30000 3000030 3000030 3000030 3030000 80 80 120030 120030 80 30 80120030 120030 40 100 30 40 100 40 100 40 100 500 500 500 500 6 6 35000 1200 1200 35000 1200 1200 30 30 35 25000 30 30 35 25000 Global Sheet1 Page 1 of 1 Printed on: Thu Nov 5, 2015 09:05:39 EST 25 25 25 25 30100 30 100 10030 30 100 100 500 500 100 500 500 5000 100 5000 100 100 100 500 500 500 500 1000 1000 100080 801000 80 80 80 80 804000 400080 4000 4000 800 800 800 800 30 30 30 30 30000 30000 25000 25000 2500030 25000 30 302500025 302500025 302500025 253025000 100 1000 1000 25 10025 1000 100025 25 5000 4000 4000 5000 40 4000 4000 40 1000 1000 1000 80 1000 80 80 80 400 400 400 400 30000 1000 1000 30000 1000 -dox 1000 -dox 25 25 30 -dox 25 -dox 25 30 -dox-dox -dox-dox 60-dox-dox 60 -dox-dox -dox -dox-dox-dox60 25 60-dox-dox -dox -dox-dox-dox30 25 -dox30 -dox -dox -dox-dox30-dox 30-dox-dox -dox -dox-dox20-dox 20-dox -dox -dox20-dox 20 -dox -dox -dox -dox -dox -dox 5 -dox -dox 5 -dox 25 25 25 25 80 80 80 80 80 80 80 80 400 400 400 400 400 400 400 400 20000 20000 20000 20000 80-dox -dox80 252000020 25802000020 -dox 80 252000020 
-dox80 800 202520000 800400 -dox 400-dox8020800 80020 400 -dox 20400 4000-dox 20 -dox -dox4000 -dox -dox -dox -dox -dox -dox 600 600 25 25 25 25 25000 25000 +dox +dox 25 +dox 20000 25 -dox+dox -dox +dox+dox +dox+dox 20000-dox -dox +dox +dox+dox60 -dox+dox 60 -dox+dox+dox+dox +dox+dox60 +dox-dox60 +dox+dox+dox-dox +dox3000+dox60 +dox -dox 3000 60 +dox+dox+dox-dox +dox+dox603000 +dox 300060-dox +dox+dox+dox -dox 3000+dox +dox -dox 3000 +dox+dox -dox 3000+dox +dox 3000 -dox +dox +dox -dox +dox +dox +dox 600 600 +dox +dox 25000 -dox 25000 -dox 25 -dox 25 -dox -dox -dox 80 -dox800-dox 800 -dox800 80080 -dox800 -dox 800-dox800 800 4000-dox -dox -dox 4000 -dox-dox -dox -dox -dox -dox -dox -dox -dox -dox -dox -dox800 800 -dox -dox800 800 20 -dox 20 -dox+dox +dox20 -dox 20 -dox +dox +dox-dox -dox +dox +dox -dox -dox +dox +dox 60 -dox 60 -dox-dox+dox +dox60 60 +dox +dox300 15 300 +dox 15 +dox300 15 300 15 +dox +dox -dox -dox 20 20 -dox 20 20 20 20 -dox 20 -dox -dox -dox -dox 20000 20000 15000 15000 1500020 15000 20 +dox1500015 +dox 1500015 +dox-dox +dox1500015 +dox 1515000 +dox 40+dox 40 +dox15 +dox-dox 15+dox+dox40 20 40+dox15 +dox15-dox +dox+dox20 60 20 2060+dox 60 +dox 60 +dox20 +dox 60 20-dox60+dox 60 60 +dox 30 +dox +dox300 300+dox -dox300 +dox 300 +dox 30+dox300 300+dox 300 300 +dox +dox-dox +dox +dox 4 -dox 4 -dox 20 20 +dox 20+dox 20 +dox +dox+dox +dox +dox +dox +dox +dox60+dox 60 +dox +dox60 +dox 60 60+dox600 600300+dox+dox 300 60 600+dox 600 300+dox +dox 300 3000 +dox +dox+dox 3000 +dox 400 400 20000 +dox 20000 15+dox 15 20 +dox15 15 20 +dox 600 600 +dox60040 40600 600 60060040 +dox 40600 200040 2000 40 402000 2000+dox40 2000 2000 2000 2000 400 400 60 60 3000 +dox 3000 10 10 10 10 600 600 10000 600 10000 600 10000 10000 15 1515000 151000010 151000010 +dox 15 151000010 15000 101510000 10 +dox 10 10 40 10+dox 40 40 40 +dox 200 200 +dox 200 200 +dox +dox +dox 15 15 15 15 15000 15000 15 15 40 40 40 40 200 200 200 200 400 400 400 400 40 40 40 40 200 200 200 200 EPCAM-) 
28, Day (to Change Fold 200 EPCAM-) 28, Day (to Change Fold 200 Fold Change (to Day 28, EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day EPCAM-) (to 28, Change Fold Day (to Change Fold EPCAM-) 28, Day (to Change EPCAM-) Fold 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) EPCAM-) 28, 28, Day Day (to Change (to Fold Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, EPCAM-) 28, Day (toDay Change (to Fold Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to EPCAM-) Change 28, Fold Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold

15000 15000 15 15 EPCAM-) 28, Day (to Change Fold 20 EPCAM-) 28, Day (to Change Fold 20 EPCAM-) 28, Day (to Change Fold 20 EPCAM-) 28, Day (to Change Fold 20 EPCAM-) 28, Day (to Change Fold 10 EPCAM-) 28, Day (to Change Fold 10 EPCAM-) 28, Day (to Change Fold 10 EPCAM-) 28, Day (to Change Fold 10 EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold Fold Change (to Day 28, EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold 10 10 10 10 40 400 400 40 400 400400 2000 400 2000 EPCAM-) 28, Day (to Change Fold 10 40 10 40 1040 10 40 200 200 40020 20400 200 15 20200 20 15 100020 1000 20 201000 100020 1000 1000 1000 1000 3 200 200 3 400 400 5000 400 5000 400 5000 5000 10 50005 1050005 10 50005 105 5000 40 5 540 5 5 2000 2000 5 5 5 5 10 10 10 10 10000 10000 EPCAM-) 28, Day (to Change Fold 10 EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold 10 EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold 20 EPCAM-) 28, Day (to Change Fold 20 EPCAM-) 28, Day (to Change Fold 20 EPCAM-) 28, Day (to Change Fold 20 EPCAM-) 28, Day (to Change Fold 100 20 EPCAM-) 28, Day (to Change Fold 100 EPCAM-) 28, Day (to Change Fold 100 EPCAM-) 28, Day (to Change Fold 10020 Fold Change (to Day 28, EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold 
EPCAM-) 28, Day (to Change Fold 10000 10000 5 5 EPCAM-) 28, Day (to10 Change Fold EPCAM-) 28, Day (to Change Fold 5 5 10 EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold 200 200 EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold 200 200 EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold 20 20 EPCAM-) 28, Day (to Change Fold 20 20 EPCAM-) 28, Day (to Change Fold 20 20 EPCAM-) 28, Day (to Change Fold 20 20 EPCAM-) 28, Day (to Change Fold 100 100 EPCAM-) 28, Day (to Change Fold 100 100 EPCAM-) 28, Day (to Change Fold 100 100 EPCAM-) 28, Day (to Change Fold 100 100 0 0 Fold Change (to Day 28, EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold 200 200 EPCAM-) 28, Day (to Change Fold 200 200 200 200200 200 EPCAM-) 28, Day (to Change Fold Fold Change (to Day 28, EPCAM-) 28, Day (to Change Fold Fold Change (to Day 28, EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold 0 EPCAM-) 28, Day (to Change Fold 0 EPCAM-) 28, Day (to Change Fold 0 0 EPCAM-) 28, Day (to Change 5 Fold 100005 EPCAM-) 28, Day (to Change Fold 5 00 EPCAM-) 28, Day (to Change Fold 55 0 0 5 EPCAM-) 28, Day (to Change Fold 5 0 0 1000020 05 0 EPCAM-) 28, Day (to Change Fold 0 EPCAM-) 28, Day (to Change Fold 0 2000 0 0 EPCAM-) 28, Day (to Change Fold 0 0 EPCAM-) 28, Day (to Change Fold 0 0 1000 0 0 0 0 00 1000 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0

Fold Change (to Day 28, EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold 20 20 EPCAM-) 28, Day (to Change Fold 20 20 EPCAM-) 28, Day (to Change Fold 100 100 EPCAM-) 28, Day (to Change Fold 100 100 EPCAM-) 28, Day (to Change Fold D28, EPCAM- D35, EPCAM-, D28, EPCAM- D35, EPCAM-,

200 200 200 200 10 EPCAM-) 28, Day (to Change Fold 10 2 2 5 5 5 5 5000 5000 EPCAM- EPCAM+ EPCAM- EPCAM+ 5 EPCAM-neg EPCAM-LoEPCAM-neg EPCAM-Hi EPCAM-Lo EPCAM-Hi 5 EPCAM-EPCAM- EPCAM+ EPCAM+ EPCAM-EPCAM- EPCAM+ EPCAM+ EPCAM-negEPCAM-neg EPCAM-Lo EPCAM-LoEPCAM-neg EPCAM-HiEPCAM-neg EPCAM-Hi20 EPCAM-Lo EPCAM-Lo EPCAM-Hi EPCAM-Hi D28,EPCAM- EPCAM- D28, EPCAM+ EPCAM+ 20D28, EPCAM- D28, EPCAM+ EPCAM+ EPCAM-negD35, EPCAM- D35, EPCAM-Lo EPCAM-LoD35, D35,EPCAM-neg EPCAM-HiEPCAM- EPCAM-Hi D35,1000 EPCAM-Lo EPCAM-Lo D35, EPCAM-Hi EPCAM-Hi D28, EPCAM- D28, D28, EPCAM+ EPCAM+ D28,D28, EPCAM- 1000EPCAM- D28, D28, EPCAM+ EPCAM+ D35,D35, EPCAM- EPCAM- D35, D35, EPCAM-Lo EPCAM-Lo D35,D35,D35, D35, EPCAM-Hi EPCAM- EPCAM- EPCAM-Hi D35, D35, EPCAM-LoEPCAM-Lo D35, EPCAM-Hi D28, EPCAM- D28, EPCAM+ D28, EPCAM- D28, EPCAM+ D35, EPCAM- D35, EPCAM-Lo D35,D35, EPCAM-Hi EPCAM- D35, EPCAM-Lo D35, EPCAM-Hi 5000 5000 0 0 5 0 0 5 0 0 EPCAM- EPCAM+ EPCAM-0 EPCAM+ 0 EPCAM-neg EPCAM-LoEPCAM-neg EPCAM-Hi EPCAM-Lo0 EPCAM-Hi 0 EPCAM- EPCAM+ EPCAM-0 EPCAM+ 0 EPCAM-neg EPCAM-LoEPCAM-neg EPCAM-Hi EPCAM-Lo0 EPCAM-Hi D28,0 EPCAM- D28, EPCAM+ D28, EPCAM- D28,0 EPCAM+ D35, EPCAM-0 D35, EPCAM-LoD35, D35, EPCAM- EPCAM-Hi D35, EPCAM-Lo D35, EPCAM-Hi +dox +dox D35, EPCAM- D35, EPCAM-LoD35, D35, EPCAM- EPCAM-Hi D35, EPCAM-Lo D35, EPCAM-Hi 0 0 0 00 0 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 EPCAM- EPCAM+ EPCAM- EPCAM+ 0 EPCAM-neg EPCAM-LoEPCAM-neg EPCAM-Hi EPCAM-Lo EPCAM-Hi 0 EPCAM- EPCAM+ EPCAM- EPCAM+ 0 EPCAM-neg EPCAM-LoEPCAM-neg EPCAM-Hi EPCAM-Lo EPCAM-Hi 0 D28, EPCAM- D28, EPCAM+ D28, EPCAM- D28, EPCAM+ 0 D35, EPCAM- D35, EPCAM-LoD35, D35, EPCAM- EPCAM-Hi D35, EPCAM-Lo D35, EPCAM-Hi D28, EPCAM-10 D28, EPCAM+ D28, EPCAM- D28, EPCAM+ D35, EPCAM- D35, EPCAM-Lo10 D35, D35, EPCAM- EPCAM-Hi D35, EPCAM-Lo D35, EPCAM-Hi 0 0 0 0 0 0 0 0 0 EPCAM-) 28, Day (to Change Fold 0 EPCAM- EPCAM+ 0 EPCAM- EPCAM+ 0 EPCAM-neg EPCAM-) 28, Day (to Change Fold EPCAM-LoEPCAM-neg EPCAM-Hi 0 EPCAM-Lo EPCAM-Hi 0 EPCAM- 
EPCAM+ 0EPCAM- EPCAM-) 28, Day (to Change Fold EPCAM+ 0 EPCAM-neg EPCAM-Lo EPCAM-HiEPCAM-neg EPCAM-Lo EPCAM-Hi EPCAM-) 28, Day (to Change Fold D28, EPCAM- D28, EPCAM+ D28, EPCAM- D28, EPCAM+ D35, EPCAM- D35, EPCAM-Lo D35,D35, EPCAM-Hi EPCAM-) 28, Day (to Change EPCAM- Fold D35, EPCAM-Lo D35, EPCAM-Hi D28, EPCAM- D28, EPCAM+ D28, EPCAM- EPCAM-) 28, Day (to Change Fold D28, EPCAM+ D35, EPCAM- D35, EPCAM-Lo D35,D35, EPCAM-Hi EPCAM- D35, EPCAM-Lo D35, EPCAM-Hi EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold 0 0 0 EPCAM- EPCAM+5000 EPCAM- EPCAM+ 0 EPCAM-neg EPCAM-LoEPCAM-neg EPCAM-Hi EPCAM-Lo5000 EPCAM-Hi 0 EPCAM- EPCAM+ EPCAM- EPCAM+ 0 EPCAM-neg EPCAM-LoEPCAM-neg EPCAM-Hi EPCAM-Lo EPCAM-Hi 0 D28, EPCAM- D28, EPCAM+ D28, EPCAM- D28, EPCAM+ 0 D35, EPCAM- D35, EPCAM-LoD35, D35, EPCAM- EPCAM-Hi D35, EPCAM-Lo D35, EPCAM-Hi D28, EPCAM- D28, EPCAM+ D28, EPCAM- D28, EPCAM+ D35, EPCAM- D35, EPCAM-LoD35, D35, EPCAM- EPCAM-Hi D35, EPCAM-Lo D35, EPCAM-Hi EPCAM- EPCAM+ EPCAM- EPCAM+ EPCAM-neg EPCAM-LoEPCAM-neg EPCAM-Hi EPCAM-LoEPCAM- EPCAM-Hi EPCAM+ EPCAM- EPCAM+ EPCAM-EPCAM-neg EPCAM+ EPCAM-Lo EPCAM-HiEPCAM-neg EPCAM-LoEPCAM-neg EPCAM-Hi EPCAM-LoEPCAM- EPCAM-Hi EPCAM+ D28, EPCAM- D28, EPCAM+ D28,EPCAM-neg EPCAM- D28, EPCAM+ EPCAM-Lo EPCAM-HiD35, EPCAM- D35, EPCAM-Lo D35,D35, EPCAM-Hi EPCAM- D35, EPCAM-LoD28, EPCAM- D35, EPCAM-Hi D28, EPCAM+ D28, EPCAM- D28, EPCAM+ D28,D35, EPCAM- EPCAM- D28, D35, EPCAM+ EPCAM-Lo D35, EPCAM-HiD35, EPCAM-5 D35, EPCAM-Lo D35,D35, EPCAM-Hi EPCAM- D35,D28, EPCAM-Lo EPCAM- D35, D28, EPCAM-Hi EPCAM+ 5 D35, EPCAM- D35, EPCAM-Lo D35, EPCAM-Hi 1 1 EPCAM- EPCAM+ iNEUROG3EPCAM-negiNEUROG3 EPCAM-Lo EPCAM-Hi iNEUROG3 EPCAM-iNEUROG3 EPCAM+ iNEUROG3VMNT EPCAM-negiNEUROG3VMNT EPCAM-Lo EPCAM-Hi iNEUROG3VMNT iNEUROG3D28,VMNT EPCAM- D28,CHGA EPCAM+ CHGAVMNTPDX1 D35, EPCAM-PDX1VMNT D35, EPCAM-LoCHGA D35, EPCAM-Hi CHGAPDX1VMNT PDX1VMNTD28, EPCAM-CXCR4 D28, EPCAM+ CXCR4PDX1SST SSTPDX1D35, EPCAM-CXCR4 D35, EPCAM-Lo D35, EPCAM-HiCXCR4PDX1SST PDX1SST FOXA2 
FOXA2SST SST FOXA2 FOXA2SST SST GCG GCG GCG GCG 120 120 120 NEUROD120 NEUROD 0 120 60 NEURODNEUROD 12060 NEURODNEUROD 12060 0 NEUROD60120ECAD NEUROD35000 ECAD 35000 4060 ECAD ECAD40 60 35000 ECAD0 ECAD 350004060 ECAD40 60 GHRL ECAD40 GHRL0 40140040 GHRL 1400 40 GHRL40 GHRL GHRL40401400 GHRL 140040 0 PA XGHRL 4 60 PA X 4601400 PA X 4 1400 60 PA X 4 40 60 1400 PA X 4 PA X 4 1400 8 PA X 4 8 0 8 8 0 iNEUROG3 iNEUROG3 VMNT NEUROD NEUROD VMNTD NEUROD NEUROD PDX1 ECAD ECAD PDX1 ECAD ECAD GHRL GHRL SST GHRL GHRL PA X 4 PA X 4 PA X 4 PA X 4 NEUROD NEUROD NEUROD NEUROD ECAD ECAD ECAD ECAD GHRL EPCAM-GHRL EPCAM+ GHRL GHRL EPCAM-neg EPCAM-Lo EPCAM-Hi D28, EPCAM-PA X 4 D28, EPCAM+ PA X 4 SST D35, EPCAM- D35, EPCAM-Lo D35, EPCAM-Hi D28, EPCAM- D28, EPCAM+ D35, EPCAM- D35, EPCAM-Lo D35, EPCAM-Hi D28, EPCAM- D28, EPCAM+ D35, EPCAM- D35, EPCAM-Lo D35, EPCAM-Hi iNEUROG3 iNEUROG3 45000 45000 VMNT 4500045000 45000 45000 VMNT 45000 45 45000 45 PDX1PA X 4 453545 PA X 4 354545 PDX1 4535 140 3545 140 14035 140 140 140 14035SST 1407000 7000 7000 7000 7000 7000 7000 7000 45000 45000 45000 45000 45 45 45 45 35 140 SST 351200140 1200 35 35 140 35 1200 140 120035 1200 1200 7000 1200 7000 7 7 7 7 100 100 100 100 100 50 10050 10050 50100 30000 H 30000 50 50 30000 3000050 50 7000 7000 1200 45000 45000 45000 45000 120 45 45120 45 45 60 140 14060 140 140 40 40 7000 7000 1400 K 1400 50 50 50 50 7000 7000 120 120 40000 4000060 4000040000 4000060 40000 40000 40 40000 4040 4040 404040 40 40 1400 1400 30 30 30 30 12030 12030 12030 12030 6000 6000 40000 40000 40000 40000 40 40 40 40 120 120 30 301000 120 1000 30120 30 6000 6000 60001000 6000 60001000 6000 6 6 6 6 35 25000 2500035 25000 25000 120 120 120 1000 120 1000 6000 1000 6000 1000 40000 40000 40000 40000 40 40 40 40 1200 1200 6000 6000 100 100 80 80 80 3500050 80 35000 120 12050350003500080 40 35000 350001208040 120 350008040 35 350004080 35356000 6000354035 35353540 6000 35406000 3540 1200 40 40 40 40 35000 35000 35000 35000 35 35 25 25 35 35 25 
25 25 2525 25 25 1200 2525 25 5 5 5 5 100 100 50 50 100 100 100800 100 800100 100 100 100 5000800 5000 5000800 5000 5000 5000 35000 35000 35000 35000 35 35 35 35 30 20000 2000030 20000 20000 800 5000800 5000 800 800 30000 30000 3000030000 30000 30000 30000 30 30000 30 3030 3030 30 1000 30 100 100 100 100 5000 5000 5000 5000 -dox -dox 30000100 -dox 30000 100 -dox 30000100 -dox-dox 30000100 -dox-dox 30 30 -dox-dox 30 -dox-dox -dox 30 -dox500030-dox-dox 30-dox-dox5000 -dox -dox-dox-dox -dox -dox -dox1000 -dox-dox-dox -dox-dox -dox -dox-dox-dox -dox -dox -dox-dox -dox -dox -dox -dox -dox -dox -dox -dox -dox 60 60 60 60 60 30 6030 6030 3060 5000 5000 2030 20 30 2030 20 30 100020 2020 20 20 1000 2020 20 30 30 30 30 4 4 4 4 30000 30000 30000 30000 80 30 3080 30 30 40 40 80600 60080 80 80 4000600 4000600 4000 4000 80 80 +dox +dox 40 +dox 25000+dox 40 25000 +dox+dox 25000+dox+dox 25 25000 +dox +dox2525+dox 25 +dox+dox +dox25 +dox 80 25 80+dox+dox +dox +dox 80 +dox+dox80 +dox+dox600 4000600 +dox+dox4000 +dox 4000 +dox 4000 +dox600 600 +dox +dox 25000 25000 25000 25000 -dox -dox 25 25 15000-dox 15000-dox25 +dox 25 +dox-dox15000 15000-dox 800 +dox +dox-dox 80 -dox 80 +dox +dox-dox 80 -dox 80 +dox -dox+dox 4000 -dox +dox4000 +dox-dox 4000 -dox 4000 +dox +dox-dox -dox +dox +dox +dox +dox 25000 -dox 25000 -dox 25000 -dox 25000 -dox 25 25 -dox 25 -dox 25 25 -dox 25 -dox -dox -dox800 -dox -dox -dox -dox -dox -dox 80 80 -dox -dox 80 80 -dox -dox4000 4000 15 -dox 15 -dox4000 154000 -dox15 800-dox15 1515 -dox 15 15 -dox 1515 15 -dox -dox -dox 3 -dox 3 3 3 25000 25000 25000 25000 25 25 40 25 40 25 40 40 2000040 20 200004020 +dox 20000+dox4020 200002040 +dox 20+dox20 2020 +dox 20+dox20 2020 +dox +dox +dox 800 +dox +dox 20 +dox20 -dox 20 -dox 20 +dox +dox -dox -dox 60 -dox -dox-dox 60 -dox -dox -dox 2000030 -dox 20000 -dox-dox 3020000-dox 20000 -dox -dox 20 20-dox 20 -dox -dox 2020 -dox 20 -dox -dox -dox -dox-dox 60400 40060 -dox60 60 3000400 3000400 +dox 3000+dox 3000 -dox -dox 20000 
+dox 20000 +dox-dox 20000 +dox 20000 +dox-dox 2010000+dox 2010000+dox-dox 2010000 +dox 1000020 -dox+dox60 60 +dox +dox-dox60 60 +dox400 +dox3000400-dox 3000 +dox +dox3000 3000 400 +dox 400 +dox 60 60 30 +dox 30+dox +dox 20+dox 10 +dox 10 20+dox 10 600 +dox10 +dox10 60 1010 60 +dox 10 10 +dox 60 1010 60 10 +dox +dox3000 3000 +dox +dox3000 3000 +dox 2 +dox 2 2 2 20000 20000 +dox 20000+dox 20000 +dox 20+dox+dox 20 +dox 20+dox +dox 20 +dox +dox+dox +dox15000 15000+dox +dox 15000 +dox 15000 +dox+dox 15 15 +dox 15 +dox15 600 +dox+dox 600 +dox 15000 15000 60 6015000 15000 60 60 15 15 3000 3000 15 +dox 15 3000+dox 3000 Fold Change (to Day 28, EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold +dox +dox +dox +dox +dox +dox 40200 +dox 20040 600 40 40+dox 2000200 EPCAM-) 28, Day (to 2000Change Fold 200 2000 2000 Fold Change (to Day 28, EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) EPCAM-) 28, 28, Day Day (to (to Change Change Fold Fold EPCAM-) 28, Day (to Change Fold EPCAM-) EPCAM-) 28, 28, Day Day (to (to Change Change Fold Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change EPCAM-) Fold 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold 20 20 20 20 15000 15000 20 10 EPCAM-) 28, Day (to 15000Change Fold 2010 15000 2010 15 EPCAM-) 28, Day (to 10Change Fold 20 15 15 1510 EPCAM-) 28, Day (to Change Fold 10 15 15 10 EPCAM-) 28, Day (to Change Fold 10 EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold 10 10 10 10 EPCAM-) 28, Day (to Change Fold Fold Change (to Day 28, EPCAM-) 28, Day (to Change Fold 5000 EPCAM-) 28, Day (to Change Fold 5000 EPCAM-) 28, Day (to Change 
Fold 5000 EPCAM-) 28, Day (to Change Fold 5000 40 40 EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold 40 EPCAM-) 28, Day (to Change Fold 40 EPCAM-) 28, Day (to Change Fold 200 2000200 EPCAM-) 28, Day (to Change Fold 2000 EPCAM-) 28, Day (to Change Fold 2000 EPCAM-) 28, Day (to Change Fold 2000 EPCAM-) 28, Day (to Change Fold 200 200 EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold 15000 15000 15000 15000 40 15 1540 15 15 20 20 10000 10000 10000 10000 15 10 5 51015 10 5 400 510 5 40 5 5 40 5 5 40 5 5 40 5 2000 2000 2000 2000 1 1 1 1 40 40 10000 1000020 10000 1000020 10 10 10 10 400 400 40 40 EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold 40 40 EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold 2000 2000 EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold 2000 EPCAM-) 28, Day (to Change Fold 2000 EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold 10000 10000 10000 10000 10 10 10 10 10 10 20 0 020 400 20 20 EPCAM-) 28, Day (to Change Fold 1000 0 EPCAM-) 28, Day (to Change Fold 1000 0 1000 1000

Fold Change (to Day 28, EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold 5000 EPCAM-) 28, Day (to Change Fold 5000 5000 EPCAM-) 28, Day (to Change Fold 5000 EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change 5Fold EPCAM-) 28, Day (to Change Fold 5 5 EPCAM-) 28, Day (to Change Fold 5 EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold 10000 10000 10000 10000 10 10 0 10 0 10 0 0 0 0 0 00 0 0 10 0 0 0 0 100 0 00 20 0 0 20 0 0 0 D28, EPCAM-20 D35, EPCAM-,0 D28, EPCAM-020 D35, EPCAM-, 0 0 0 EPCAM-) 28, Day (to Change Fold 00 EPCAM-) 0 28, Day (to Change Fold 0 D28, EPCAM-1000 D35, EPCAM-, D28,0 EPCAM-1000 D35, EPCAM-, 0 0 0 0 0 0 0 Fold Change (to Day 28, EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold 1000 1000 EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold Fold Change (to Day 28, EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold 5000 5000 5000 5000 5 5 5 5 EPCAM-) 28, Day (to Change Fold 200 20 20 20 20 1000 1000 1000 1000 Fold Change (to Day 28, EPCAM-) 28, Day (to Change Fold 20 EPCAM-) 28, Day (to Change Fold 20 EPCAM- EPCAM+ EPCAM- EPCAM+ EPCAM-) 28, Day (to Change Fold 10 EPCAM-neg EPCAM-LoEPCAM-neg EPCAM-Hi EPCAM-Lo EPCAM-Hi EPCAM-) 28, Day (to Change Fold 10 EPCAM-EPCAM-EPCAM+ EPCAM+ EPCAM-EPCAM- EPCAM+EPCAM+ EPCAM-negEPCAM-neg EPCAM-) 28, Day (to Change Fold EPCAM-Lo EPCAM-LoEPCAM-negEPCAM-neg EPCAM-Hi EPCAM-Hi EPCAM-Lo EPCAM-LoEPCAM- EPCAM-Hi EPCAM-Hi EPCAM+ EPCAM-) 28, Day (to Change Fold EPCAM-D28,EPCAM- EPCAM- EPCAM+ D28, EPCAM+ EPCAM+ 
D28, EPCAM-EPCAM-EPCAM-neg D28, EPCAM+ EPCAM+ EPCAM-LoEPCAM-negD35,EPCAM-neg EPCAM-Hi EPCAM- D35, EPCAM-Lo EPCAM-Lo EPCAM-LoD35, D35, EPCAM-HiEPCAM-neg EPCAM- EPCAM-Hi EPCAM-Hi D35, EPCAM-Lo EPCAM-LoD28, D35, EPCAM- EPCAM-Hi EPCAM-Hi D28, EPCAM+ D28,D28, EPCAM-) 28, Day (to Change Fold EPCAM- EPCAM- D28, D28, EPCAM+ EPCAM+ D28, EPCAM-D35, EPCAM- D28, EPCAM+ D35, EPCAM-LoD35,D35, D35, EPCAM-D35, EPCAM- EPCAM-Hi EPCAM- D35, D35, EPCAM-Lo D35, EPCAM-Lo EPCAM-Lo D35, D35,D35, D35,EPCAM-Hi EPCAM-Hi EPCAM- EPCAM-Hi D35, D35, EPCAM-LoEPCAM-LoD28, EPCAM- D35, D35, EPCAM-HiEPCAM-Hi D28, EPCAM+ D28, EPCAM- D28, EPCAM+ D35, EPCAM- D35, EPCAM-LoD35, D35, EPCAM-D35, EPCAM-Hi EPCAM- D35, EPCAM-Lo D35, EPCAM-Lo D35, EPCAM-Hi D35,D35, EPCAM-Hi EPCAM- D35, EPCAM-LoD28, EPCAM- D35, EPCAM-Hi D28, EPCAM+D28, EPCAM- D28, EPCAM+ D35, EPCAM- D35, EPCAM-LoD35, D35,EPCAM- EPCAM-Hi D35, EPCAM-Lo D35, EPCAM-Hi Fold Change (to Day 28, EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold +dox +dox +dox +dox Fold Change (to Day 28, EPCAM-) 28, Day (to Change Fold 500020 5000 20 205000 500020 EPCAM-) 28, Day (to Change Fold 5 EPCAM-) 28, Day (to Change Fold 5 10005 5 1000 200 200

Fold Change (to Day 28, EPCAM-) 28, Day (to Change Fold 20 EPCAM-) 28, Day (to Change Fold 20 EPCAM-) 28, Day (to Change Fold 10 EPCAM-) 28, Day (to Change Fold 10 EPCAM-) 28, Day (to Change Fold 1000 1000 EPCAM-) 28, Day (to Change Fold EPCAM-) 28, Day (to Change Fold 5000 5000 5000 5000 5 5 5 5 0 0 0 5 0 50 0 0 0 0 0 200 0 0 0 0 0 0 0 0 0 EPCAM- EPCAM+0 EPCAM- EPCAM+ EPCAM-neg0 EPCAM-Lo EPCAM-HiEPCAM-neg0 5 EPCAM-Lo EPCAM-Hi 0 EPCAM- EPCAM+ 0 5 EPCAM- EPCAM+ EPCAM-neg EPCAM-Lo0 EPCAM-neg EPCAM-Hi EPCAM-Lo0 EPCAM-Hi D28, EPCAM-0 D28, EPCAM+ D28, EPCAM-0 D28, EPCAM+ D35, EPCAM- D35, EPCAM-Lo0 D35,D35, EPCAM-Hi EPCAM- D35, EPCAM-Lo0 D35, EPCAM-Hi D28, EPCAM- D28,0 EPCAM+ D28, EPCAM- D28,0 EPCAM+ D35, EPCAM- D35, EPCAM-Lo D35,D35, EPCAM-Hi EPCAM- D35, EPCAM-Lo D35, EPCAM-Hi 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 EPCAM- EPCAM+ EPCAM-0 EPCAM+ 0 0 EPCAM-neg EPCAM-Lo0EPCAM-neg EPCAM-Hi EPCAM-Lo0 EPCAM-Hi 0 EPCAM- EPCAM+ 0 EPCAM- EPCAM+ 0 0 EPCAM-neg EPCAM-LoEPCAM-neg0 EPCAM-Hi EPCAM-Lo EPCAM-Hi0 D28,D28, EPCAM- EPCAM- D35, D28, EPCAM-, EPCAM+0 D28, EPCAM- D28, EPCAM+ 0 D35, EPCAM- D35, EPCAM-LoD35, D35, EPCAM- EPCAM-Hi D35, EPCAM-Lo D35, EPCAM-Hi D28, EPCAM- D28, EPCAM+ D28, EPCAM- D28, EPCAM+ D35, EPCAM- D35, EPCAM-LoD35, D35, EPCAM- EPCAM-Hi D35, EPCAM-Lo D35, EPCAM-Hi 0 0 0 EPCAM- EPCAM+ EPCAM- EPCAM+ 0 EPCAM-neg EPCAM-LoEPCAM-neg EPCAM-Hi EPCAM-Lo EPCAM-Hi 0 EPCAM- EPCAM+ EPCAM- EPCAM+ 0 EPCAM-neg EPCAM-LoEPCAM-neg EPCAM-Hi EPCAM-Lo EPCAM-Hi D28, EPCAM- D28, EPCAM+ D28, EPCAM- D28, EPCAM+ 0 D35, EPCAM- D35, EPCAM-LoD35, D35, EPCAM- EPCAM-Hi D35, EPCAM-Lo D35, EPCAM-Hi D28, EPCAM- D28, EPCAM+ D28, EPCAM- D28, EPCAM+ D35, EPCAM- D35, EPCAM-LoD35, D35, EPCAM- EPCAM-Hi D35, EPCAM-Lo D35, EPCAM-Hi EPCAM- EPCAM+ EPCAM- EPCAM+ EPCAM-neg EPCAM-LoEPCAM-neg EPCAM-Hi EPCAM-LoEPCAM- EPCAM-HiEPCAM+ EPCAM- EPCAM+ EPCAM-EPCAM-neg EPCAM+ EPCAM-LoCHGA EPCAM-HiEPCAM-negCHGA EPCAM-LoEPCAM-neg EPCAM-Hi EPCAM-LoEPCAM- EPCAM-HiCHGA EPCAM+ D28, EPCAM-CHGA D28, EPCAM+ 
D28,EPCAM-neg EPCAM- D28,CHGA EPCAM+ EPCAM-LoCXCR4 EPCAM-HiD35, EPCAM-CXCR4CHGA D35, EPCAM-Lo D35,D35, EPCAM-Hi EPCAM- D35, EPCAM-LoD28, EPCAM-CHGACXCR4 D35, EPCAM-Hi D28, EPCAM+ CXCR4CHGAD28, EPCAM- D28, EPCAM+ D28,D35, EPCAM- EPCAM-CXCR4FOXA2 D28, D35, EPCAM+ EPCAM-Lo D35, EPCAM-HiFOXA2D35,CXCR4 EPCAM- D35, EPCAM-Lo D35,D35, EPCAM-Hi EPCAM-FOXA2CXCR4 D35, EPCAM-Lo D35, EPCAM-Hi+dox FOXA2CXCR4D28, EPCAM- D35, EPCAM-, FOXA2GCGD35, EPCAM- D35, EPCAM-LoGCGFOXA2 D35, EPCAM-Hi FOXA2GCG FOXA2GCG GCG GCG GCG GCG EPCAM- EPCAM+ EPCAM-neg EPCAM-Lo EPCAM-Hi EPCAM- EPCAM+ EPCAM-neg EPCAM-Lo EPCAM-Hi D28, EPCAM- D28, EPCAM+ D35, EPCAM- D35, EPCAM-Lo D35, EPCAM-Hi +dox D35, EPCAM- D35, EPCAM-Lo D35, EPCAM-Hi 35000 35000 35000 35000 3500040 iNEUROG3 3500040 iNEUROG3 3500040 iNEUROG34035000 iNEUROG3 4060 VMNT 60 40 VMNT 6040 VMNT60 40 VMNT 8 60 PDX1 8 60 PDX1 608 PDX1 8 60 PDX1 8 SST 8 SST 8 SST 8 SST iNEUROG3 iNEUROG3 iNEUROG3E iNEUROG3 VMNT VMNT VMNT VMNT PDX1 PDX1 PDX1 PDX1 SST SST SST SST iNEUROG3 120iNEUROG335 12035 iNEUROG3 120 35 iNEUROG3 35120 VMNT 6035 VMNT 6035 VMNT6035 VMNT6035 PDX1 407 PDX1 7 40 40 7 PDX1 740PDX1 14007 SST SST14007 14007 SST 14007 SST iNEUROG3 iNEUROG3 iNEUROG3 iNEUROG3CHGA VMNT VMNT30000 CHGA 30000 VMNT 30000 VMNT CXCR430000 PDX1 30000 PDX1 CXCR4 30000 PDX1 30000 PDX1 FOXA230000 FOXA2 SST SST GCG GCG CHGA CHGA 120 120 CXCR4 120 120 CXCR4 60 60 FOXA2SST 5060 SST 5060 FOXA2 50 40 50 40 GCG 50 40 50 40 GCG50 140050 1400 1400 1400 120 120 120 120 60 60 60 60 40 35 40 35 40 35 40 35 1400 1400 1400 1400 120 120 120 120 35000 60 6035000 60 60 40 40 40 40 30 4030 40 30 60 30 1400 14006030 30 1400 301400 8 30 6 8 6 6 6 12006 12006 12006 12006 25000 25000 25000 25000 10025000 10025000 10025000 10025000 50 50 50 50 35000 35000 40 40 60 60 35 835 35 35 8 40 40 40 40 40 40 40 120040 1200 1200 1200 100 100 100 100 50 50 50 50 35 30 35 30 35 30 35 30 1200 1200 1200 1200 35 10035 100 35 35 25 3510025 100 35 25 25 50 50 25 25 50 50 25 7 25 5 7 5 5 5 10005 10005 10005 
[Figure panels: CHGA, CXCR4, FOXA2, and GCG expression, -dox and +dox. Y-axis: Fold Change (to Day 28, EPCAM-). X-axis: D28 EPCAM-, D28 EPCAM+, D35 EPCAM-neg, D35 EPCAM-Lo, D35 EPCAM-Hi.]

Supp Fig 1

[A: Pancreatic differentiation timeline, Day of Differentiation -2 to 12. Stages: Split hESC, Definitive Endoderm, Gut Tube, Posterior Foregut, Pancreatic Precursors. Durations: 2 Days, 3 Days, 2 Days, 4 Days, 3 Days. Media and supplements: Y-27632, mTesR, RPMI+NEAA (0.2% FBS, 2% FBS), HG-DMEM (1% B27). Factors: [50ng/ml] BMP4, [100ng/ml] Activin A, [50ng/ml] FGF-7, [2μM] RA, [50ng/ml] [25ng/ml] Noggin.]

[B: Intestinal differentiation timeline, Day of Differentiation -2 to 35. Stages: Split hESC, Definitive Endoderm, Intestinal Spheroids, Intestinal Organoids. Durations: 2 Days, 3 Days, 4 Days, 28 Days. Media and supplements: Y-27632, mTesR, RPMI+NEAA (0.2% FBS, 2% FBS), HG-DMEM [1x B27, 2mM L-Glut, P/S, 15mM HEPES]. Factors: [100ng/ml] Activin A, [3μM] Chiron, [500ng/ml] FGF-4, [100ng/ml] EGF.]

Supp Fig 2

[A: Ind-Ngn3 expression. B: NEUROD, NKX2.2, and PAX4 expression. Y-axis: Fold Change to 0ng WT. X-axis: ng/ml Dox (0, 30, 100, 300) for WT, R107S, L135P, and E28X.]

Supp Fig 3

[A: Immunostaining for HA (NEUROG3), NKX2.2, and PDX1, with Merge, for WT (199F) and F199S. B: indNEUROG3, CHGA, NEUROD, and PDX1 expression for 199F and F199S, -dox and +dox, at Day 9 and Day 11. Y-axis: Fold Change (to Day 9 199F -dox).]

Supp Fig 4

[A: NEUROD, NKX2.2, and PDX1 expression. Y-axis: Fold Change (to 0ng/ml Dox). X-axis: 0ng Dox; 25ng Dox; 50ng Dox; 100ng Dox; 200ng Dox; 100ng Dox, 1uM 4OHT; 100ng Dox, 0.1uM 4OHT; 100ng Dox, 0.3uM 4OHT.]

CHAPTER 4

Discussion.

Patrick S. McGrath1 and James M. Wells*1,2.

1Division of Developmental Biology, 2Division of Endocrinology, Cincinnati Children’s

Hospital Medical Center, 3333 Burnet Avenue, Cincinnati, OH 45229-3039

Major Findings

There has been some controversy about the requirement for NEUROG3 in human pancreatic endocrine development. To test this directly, we generated human

embryonic stem cell (hESC) lines where both alleles of NEUROG3 were disrupted using

CRISPR/Cas9-mediated gene targeting. Cas9 robustly created INDELs at the target

sites in NEUROG3 and yielded PSC lines that expressed normal stem cell markers, had

no chromosomal abnormalities, and were able to differentiate normally into tissues of

interest. NEUROG3-/- hESC lines efficiently formed pancreatic progenitors, but lacked

detectible NEUROG3 protein and did not form any endocrine cells in vitro. Moreover,

NEUROG3-/- hESC lines were unable to form mature pancreatic endocrine cells

following engraftment of PDX1+/NKX6.1+ pancreatic progenitors into mice. In contrast,

a 75-90% knockdown of NEUROG3 caused a reduction, but not loss, of pancreatic

endocrine cell development. This suggests that a hypomorphic allele would need to

retain only a little residual function to still induce endocrine differentiation (Chapter 2).

We then utilized an rtTA construct to reintroduce NEUROG3 which could be

ectopically expressed following treatment with doxycycline. Upon directed differentiation

into pancreatic or intestinal tissue, endocrine specification is robustly rescued following

NEUROG3 induction with doxycycline. NEUROG3 was mutated to include the patient

mutations R107S, L135P, or E28X. NEUROG3R107S perfectly recapitulates

the human phenotype and rescues endocrine specification in pancreatic precursors,

albeit with reduced efficiency, but is unable to specify endocrine cells in the context of

intestinal organoids. NEUROG3L135P did not exhibit any observable function in

pancreatic precursors or HIOs and was unable to specify any endocrine cell lineages.

Stability testing suggests that a reduced half-life may partly explain the reduction in

function for NEUROG3R107S. Finally, we transcriptionally profiled pancreatic precursors

for direct and indirect targets of NEUROG3 using ectopic expression of a

NEUROG3ERT2 fusion construct. Similarly, NEUROG3 targets were identified in

differentiated intestinal epithelium (Chapter 3). Comparing the tissue specific targets of

NEUROG3 will identify novel pathways by which endocrine cells may be formed in a

context specific manner.

Using hPSCs to study pancreatic endocrine development and the importance of

studying NEUROG3 mutations in the correct context

It is now well known from mouse studies that Neurog3 is absolutely essential for

the formation of pancreatic endocrine cells (Gradwohl et al., 2000). Human patients

with homozygous mutations in NEUROG3 have been identified because of the severe malabsorptive diarrhea that results from a complete loss of enteroendocrine cells (EECs) (reviewed in Rubio-Cabezas et al., 2014). Interestingly, all of the patients have detectible C-peptide and many did not present with diabetes until years after birth, strongly indicating the presence of functional β-cells. Heterozygous parents do not present with any phenotype. The various NEUROG3 mutations were previously thought to be loss-of-function, largely based on in vitro assays (discussed more thoroughly later). It has therefore been suggested that NEUROG3 may be dispensable, at least for pancreatic endocrine development.

This possibility prompted the generation of a human PSC line in which

NEUROG3 expression was completely abolished using CRISPR/Cas9. We clearly show

that NEUROG3 is required for the specification of endocrine cells in pancreatic precursors in vitro. Furthermore, NEUROG3-/- differentiated pancreas matured in vivo also shows no evidence of endocrine formation (McGrath et al., 2015). This finding aligns with previous mouse studies, and together they strongly suggest that NEUROG3 is essential for pancreatic endocrine development in humans.

As our data suggest NEUROG3 is required for endocrine specification in the pancreas, we hypothesize that the various patient mutations are hypomorphic and retain some ability to specify endocrine cells.

Confusingly, a variety of functional analyses have been performed previously on a subset of the known NEUROG3 mutations (R107S, R93L, L135P, and E123X), and these studies have largely concluded that each mutation renders NEUROG3 non-functional (Pinney et al., 2011; Rubio-Cabezas et al., 2011; Wang et al., 2006). This is consistent with the fact that each missense or nonsense mutation affects well-conserved, and therefore important, amino acid residues in the functionally required bHLH domain of NEUROG3 (conservation extends to species as distant as C. elegans; Wang et al.,

2006). It is worth noting that the “gold standard” assay for testing each of these mutations has been a transactivation assay using the NeuroD promoter to drive expression of luciferase following co-transfection into a stable cell line (e.g., HeLa, P19, and 293T cells have been used). All of the cell lines previously used are very different in cell type from a developing posterior foregut, and this has profound effects on the ability of even wild-type Neurog3 to activate a target promoter. This was clearly shown when

NEUROG3R93L and NEUROG3R107S were electroporated into developing chick

duodenum which ectopically induced the expression of the endocrine hormone

glucagon, suggesting the formation of α-cells (Jensen et al., 2007). More recently,

NEUROG3R107S has been transduced into Neurog3null developing mouse pancreas by

microinjection, and there was observable function evidenced by the activation of

NEUROG3 target genes and the formation of hormone positive cells (Pauerstein et al.,

2015). This contradicts the in vitro findings, which suggested that NEUROG3R93L and

NEUROG3R107S are non-functional, highlighting the importance of performing these types of studies in a relevant tissue type, in our case differentiated pluripotent stem cells.

Modeling NEUROG3 mutations and some experimental limitations

Having NEUROG3null hPSCs provides a “blank-slate” background in which we

can ectopically express NEUROG3 harboring any mutation we desire using the

pINDUCER lentiviral construct. We have now constructed most of the published patient

mutations into the doxycycline inducible vector. E28X, R107S, and L135P are

characterized in Chapter 3. We have also generated R93L, E123X, and S171fsX68.

Stemming from the conclusion that NEUROG3 is required for endocrine development,

we would predict that each of these mutations should have some detectible function,

with the exception of E28X which is an early truncation we use as a negative control

(the patient harboring this mutation is heterozygous E28X/L135P; thus the L135P allele is

presumably providing all the NEUROG3 function). NEUROG3R107S is indeed able to activate target genes and leads to the formation of hormone positive cells (Chapter 3).

Similarly R93L, E123X, and S171fsX68 are all able to activate target genes and yield

hormone positive cells (data not shown). All of the mutations show impaired function

compared to wild type protein. Thus in total, four out of five of the tested mutations are

confirmed to be hypomorphic. It is particularly interesting that NEUROG3E123X is

functional, since it carries a nonsense mutation that truncates the protein halfway through the

helix-loop-helix domain. It is hard to imagine how this could still produce a functional

transcription factor and will require further scrutiny.

NEUROG3L135P is the one outlier and remains extremely puzzling because there

is no evidence of function in any assay we have tried thus far. The L135P mutation

affects an amino acid in the helical region that forms a dimer with another bHLH factor.

It could reasonably be predicted that the L135P mutation yields a compromised ability to

dimerize and therefore would be unable to bind DNA. A DNA binding assay (executed

by Joseph Salomone) was performed to assess this possibility. WT, R107S, or L135P

NEUROG3 was synthesized, purified, and then dimerized with the well-known bHLH

binding partner E47. DNA binding capacity was then assessed using a canonical E-box

probe. NEUROG3WT and NEUROG3R107S were both able to bind DNA, indicated by the

gel shift (Fig. 1A,B). Conversely, no shift is observed with NEUROG3L135P indicating it is

completely unable to bind DNA (Fig. 1C). There is also no observable binding with

SELEX probes, so the inability to bind DNA is not limited to this particular E-box.

NEUROG3 is unable to bind DNA as a homodimer, as previously reported (Huang et

al., 2000; Wang et al., 2006). Modeling would predict that the loss of DNA binding is due to an inability to dimerize; however, this still needs to be tested. These data further support our observation that NEUROG3L135P is non-functional; however, this conflicts

with the idea that NEUROG3 is required for endocrine formation.

The simplest explanation is that L135P function is so severely inhibited that it is

difficult to see significant differences compared to a negative control, but enough

function is retained to initiate the endocrine differentiation program. This is certainly

plausible; in fact, using an shRNA knockdown approach, we found that a 90% reduction in

NEUROG3 mRNA was unable to abolish endocrine formation in pancreatic precursors

(Chapter 2) suggesting that only the smallest amount of residual function is required.

Using our inducible rtTA system, we could compensate for this by dramatically overexpressing NEUROG3L135P and/or extending the duration of its expression, by

modulating the concentration of dox and/or the duration of dox exposure, respectively.

We found that pancreatic precursors exposed to high levels of NEUROG3L135P for up to

48 hours were still unable to activate any target genes.

If NEUROG3 is indeed required (as we have suggested), there are a few

possibilities that could explain this apparent lack of L135P function. Pancreas

development proceeds in two major waves known as the primary and secondary

transition. Some endocrine specification does occur during the primary transition, but

the vast majority of NEUROG3 expression and subsequent endocrine specification

occurs during the secondary transition (Villasenor et al., 2008). Furthermore, temporally

restricted lineage tracing studies show that the vast majority of endocrine cells

contributing to an adult islet arise from the secondary transition (Gu et al., 2002). RNA

profiling indicates that hPSC derived insulin positive cells more closely resemble

primary transition endocrine cells than adult β-cells (Hrvatin et al., 2014), suggesting

that in vitro differentiation models likely do not represent the secondary wave of

endocrine specification. Thus I would hypothesize that NEUROG3L135P is only functional

during the secondary transition, and we are unable to reproduce its function in vitro without further maturation. We showed previously that the differentiated pancreatic precursors can be matured following transplantation into a mouse host, and this could serve as a good model system for the secondary transition. Following transplantation and a predetermined amount of maturation, NEUROG3L135P would be expressed using

dox chow. Transplants could then be harvested and assessed for activation of

NEUROG3 target genes and the formation of endocrine cells.

So far, we have assessed NEUROG3 mutant function with only a single pluripotent stem cell line. It is well known that genetic background can have a profound effect on inherited disease penetrance. One example is Hirschsprung disease, characterized by the absence of ganglion cells innervating the distal intestine.

Incomplete penetrance and inter-familial variation are frequently observed in

Hirschsprung gene mutations, and variation even between family members harboring the same mutation is not uncommon (Badner et al., 1990; Wallace and Anderson, 2011).

Similarly, genetic background also drives transcriptional variation in human pluripotent stem cells (Rouhani et al., 2014), which could lead to significant differences in ability to differentiate or function once specified. To further characterize function in the

NEUROG3L135P mutant, it would thus be valuable to repeat these studies using several different hPSC lines. Further supporting this possibility, two different patients harboring homozygous L135P mutations have been identified. One patient developed transient neonatal diabetes by 3 weeks of age while the other did not develop diabetes until 13 years of age (Rubio-Cabezas et al., 2014), again highlighting the possible importance of genetic background. The best experiment would be to derive iPSC lines

from each of these patients and assess the ability of each line to form endocrine cells in vitro by directed differentiation into pancreas. It would be very interesting if the more severely affected patient’s iPSC line similarly had a reduced ability to differentiate into endocrine cells compared to that of the less affected patient.

Finally, it is possible that NEUROG3L135P truly is a complete loss-of-function

allele. Because each patient harboring NEUROG3 mutations has circulating C-peptide,

we can surmise they have functioning β-cells. A few of the patients’ pancreata have

been imaged and were determined to be completely normal, but this technique does not

have the sensitivity to detect islets. Furthermore, a pancreas biopsy has never been

collected to confirm the presence or absence of hormone-positive cells. Thus it is formally

possible that these patients in fact do not have pancreatic endocrine cells. While

NEUROG3 may be required for the formation of pancreatic endocrine cells, humans

may have uniquely developed a mechanism (or mice lost the ability) to form functional

β-cells in other tissues resulting from the absence of pancreatic β-cells. Certainly it has

been shown that related tissues such as antral stomach, liver, and intestine are all

competent to form insulin expressing cells if given the right combination of factors (Al-Masri et al., 2010; Ariyachet et al., 2016; Banga et al., 2012). However, since we have confirmed that every other NEUROG3 mutant is hypomorphic, it seems far more plausible that L135P is hypomorphic as well and that we are simply unable to demonstrate it.

Ultimately, tissues generated by our directed differentiation strategies are similar to their in vivo counterparts in many ways. However, it is important to note that they are not the same. The lack of overall maturation, function, or formation of various cell types is at least in part reflective of insufficiencies in our understanding of the molecular events that govern how these tissues form. Additionally, while differentiated tissues and

organoids can be quite complex, containing multiple cell types, they are relatively simple

compared to an in vivo system. Therefore, many signaling inputs are surely lacking (for example, circulating blood, immune cells, and neural innervation). Because of these differences, care must be taken when drawing conclusions about developmental processes modeled in

vitro. As the stem cell and developmental biology field moves forward, these knowledge

gaps will surely diminish.

Differential requirement for NEUROG3 in pancreatic versus intestinal endocrine development and mechanisms leading to tissue specific target genes and endocrine lineages

One major observation that came out of these studies is that enteroendocrine

specification is more profoundly affected by particular mutations in NEUROG3 than

pancreatic endocrine specification. This aligns nicely with previously described patients, all of whom have a complete loss of EECs but retain at least some pancreatic endocrine function. We have been able to show that all but one of these

mutations retain some functional capacity, so what is the difference in requirement

between pancreas and intestine?

One mechanism that can explain this outcome is relative dosage. It is well established that NEUROG3 is required for pancreatic and intestinal endocrine development. However, the relative levels of NEUROG3 expression that occur

developmentally in vivo are unknown. Even from mouse studies, nobody has directly compared pancreatic Neurog3 expression levels to intestinal levels. This is likely due to a lack of good reagents with which we can visualize Neurog3 expression. All available

antibodies stain with high background making interpretation difficult. A good readout of

protein levels would be a polycistronic fluorophore knocked into the Neurog3 locus (e.g., a Neurog3:P2A:GFP). So far, this mouse has not been generated. Alternatively, this

approach could be applied instead to PSCs and expression could be directly compared

between differentiated pancreas and intestine. One group has actually made this human

PSC line (NEUROG3-2A-eGFP), however they have not differentiated it into intestine

for comparison to pancreatic tissues (Liu et al., 2014).

If dosage is a contributing factor, I would expect NEUROG3 to be expressed at lower levels in intestine. Conceivably, if this is the case then the intestine could be more

sensitive to a reduction in NEUROG3 function due to mutations. The experiments presented here attempted to take this possibility into consideration. One of the advantages of expressing the mutant NEUROG3 from an rtTA construct is that you have some

control over ectopic expression, and therefore dosage. For this reason, experiments

comparing pancreas and intestine always utilized the same dosage of doxycycline

(100 ng/ml) and the same duration (8 hours). Technically, however, it is not this

simple. The pancreas cultures consist of a monolayer of cells that would have

immediate access to dox upon addition to the media, and immediately begin expressing

NEUROG3. The HIOs, however, are 3-dimensional and embedded in a bubble of Matrigel. The relevant tissue in which NEUROG3 should be expressed is the epithelium, which is in the exact middle of all this tissue and extracellular matrix. It may take dox an extended time to penetrate through, making the 8h treatment with dox effectively much shorter. Again, due to a lack of good reagents it has proven difficult to definitively show that the induced levels are truly equivalent in both tissues. We now have preliminary data

indicating that by increasing the concentration and duration of treatment with dox, we

are able to see some function from the NEUROG3R107S mutant, evidenced by activation

of target genes and later formation of endocrine cells. This further highlights the importance of confirming the relative expression levels in the experimental model presented here.

Dosage is not the only mechanism by which NEUROG3 can elicit a tissue specific response. Our data strongly suggest that NEUROG3 has largely different target genes upon expression in pancreas compared to intestine. This is not necessarily surprising since the endocrine lineages specified downstream of NEUROG3 are almost exclusively found in either pancreas or intestine with no overlap (the single exception is somatostatin, which is broadly expressed throughout both the pancreas and intestine). For example, insulin and glucagon expressing cells develop only in the pancreas, whereas cells expressing motilin, GIP, GLP-1, NTS (and various others) develop only in the intestine. Despite the tissue restricted distribution of these endocrine cells, they have in common that they are all initially dependent on the expression of

NEUROG3 to develop. So how can this be possible?

Transcription is able to proceed when a transcription factor recognizes and binds a cis-regulatory DNA region and recruits the transcriptional machinery. How transcription factors can recognize specific sequences is not a trivial problem.

NEUROG3 is a basic helix-loop-helix transcription factor and thus typically binds to a consensus sequence called an E-box coded by the palindromic sequence CANNTG

(Longo et al., 2008). E-boxes are scattered throughout the genome, and most of them

have no functional impact on gene regulation. It is interesting to consider mechanisms that guide transcription factors to bind the few appropriate regulatory targets out of the immense number of similar or even identical sequences. Furthermore, how can these mechanisms elicit tissue specific cellular responses during development?
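To give a sense of how common the bare consensus is, the following Python sketch scans a sequence for CANNTG. It is illustrative only: the function names and the example sequence are invented here, and the frequency estimate assumes equal, independent base frequencies, which real genomes do not have.

```python
import re

# CANNTG, where N is any base. The lookahead reports overlapping matches too.
# The pattern is its own reverse complement, so scanning one strand is enough.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_eboxes(seq):
    """Return (0-based start, hexamer) for every CANNTG in seq."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in EBOX.finditer(seq)]


def expected_eboxes(length):
    """Expected CANNTG count in a random sequence of the given length,
    assuming each base occurs independently with probability 1/4."""
    windows = max(length - 5, 0)   # number of 6-bp windows
    return windows / 4 ** 4        # four fixed positions, two free


# Made-up example sequence containing three E-boxes, two of them overlapping.
hits = find_eboxes("ttCAGCTGaaCACATGTG")
```

Under the stated assumption a match is expected roughly once every 256 bp, which illustrates why the motif alone cannot specify the relevant regulatory targets.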

An effective mechanism that reduces the total number of putative response elements, thereby broadly reducing complexity, is to bury irrelevant binding sites in heterochromatin, which physically renders them unavailable. This epigenetic

control of transcription occurs through methylation and/or acetylation modifications of

the DNA-organizing histones. A very elegant study using stepwise in vitro differentiation

of hESCs to pancreatic or liver lineages clearly shows that the epigenetic landscape can

change quickly as cells develop and transition through various stages (Wang et al.,

2015). Furthermore, poised and active enhancers, marked by H3K4me1 and H3K27Ac

respectively, are very different when comparing two different tissue types. They further

show that the establishment of enhancers is guided by FOXA1, suggesting that key

tissue restricted transcription factors orchestrate the tissue specific enhancer landscape. It

is very likely a similar mechanism at least in part regulates the activity of NEUROG3 in

pancreas versus intestine. There are various pancreas and intestine specific

transcription factors (for example PTF1A in the pancreas and CDX2 in the intestine) that

could establish tissue specific enhancer “landscapes”. Conceivably, in this fashion

NEUROG3 would be able to activate one set of targets in the context of pancreas and a considerably different set of targets in intestine.

Conveniently, the study published by Wang et al. provides a wealth of information that is directly applicable to the work presented here. Because they also used an hESC differentiation model, the datasets they have already generated likely translate directly to our pancreas differentiation protocol. They have even published

similar enhancer analysis using differentiated intestine. Going forward, it would be very

informative to intersect our NEUROG3 target data with their data of active and poised

enhancers. We could conceivably identify putative NEUROG3 binding sites by

screening for active enhancers near target genes that also contain a canonical E-box.

With enough binding sites, we could then use in silico methods to screen through the

immediate flanking DNA to identify other transcription factor binding sites and thus

infer putatively important factors that cooperate with NEUROG3 to regulate target gene

activation. I would hypothesize that these “other factors” are significantly different in the

pancreas versus intestine, which at least in part contributes to the difference in target gene activation. The in silico analysis would be nicely complemented by ChIP-seq data to further compare and contrast NEUROG3 binding in the pancreas and intestine.

Technically, our model system is ideally set up for this experiment. Endogenous

NEUROG3 is abolished, thus all binding sites should be occupied solely by our dox-

inducible NEUROG3. Since we added an HA-epitope, there are readily available ChIP-

grade antibodies and the results should be very clean.
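The proposed screen for putative NEUROG3 binding sites could be sketched roughly as follows in Python. This is a minimal sketch under stated assumptions, not an established pipeline: the `Enhancer` record, the `candidate_sites` function, the 50 kb window, and all coordinates and gene names are hypothetical, and a real analysis would read enhancer calls and target gene lists from the published datasets.

```python
import re
from dataclasses import dataclass

EBOX = re.compile(r"CA[ACGT]{2}TG")  # canonical E-box, CANNTG


@dataclass(frozen=True)
class Enhancer:
    chrom: str
    start: int
    end: int
    seq: str      # enhancer DNA sequence
    active: bool  # e.g. called active from histone marks (assumed input)


def candidate_sites(enhancers, target_tss, window=50_000):
    """Pair each target gene with active, E-box-containing enhancers whose
    midpoint lies within `window` bp of the gene's transcription start site.

    target_tss maps gene name -> (chromosome, TSS position).
    The window size is an arbitrary assumption for illustration.
    """
    hits = []
    for enh in enhancers:
        if not enh.active or not EBOX.search(enh.seq.upper()):
            continue
        midpoint = (enh.start + enh.end) // 2
        for gene, (chrom, tss) in target_tss.items():
            if chrom == enh.chrom and abs(midpoint - tss) <= window:
                hits.append((gene, enh))
    return hits


# Toy data only; none of these coordinates or names are real.
toy_enhancers = [
    Enhancer("chr1", 1000, 1200, "GGCAGCTGGG", True),   # kept
    Enhancer("chr1", 900000, 900200, "CAGCTG", True),   # too far away
    Enhancer("chr1", 2000, 2200, "GGGGGG", True),       # no E-box
    Enhancer("chr1", 3000, 3200, "CATATG", False),      # not active
    Enhancer("chr2", 1000, 1200, "CAGCTG", True),       # wrong chromosome
]
toy_targets = {"GENE_A": ("chr1", 5000)}
toy_hits = candidate_sites(toy_enhancers, toy_targets)
```

The flanking sequence of each retained enhancer could then be passed to a motif-scanning tool to look for co-occurring transcription factor binding sites.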

Another possible mechanism by which NEUROG3 can activate tissue specific

targets is by context dependent dimerization partners. NEUROG3 is a bHLH

transcription factor and therefore must dimerize before it can interact with DNA. Studies

looking at the closely related Neurog2 in chicken retinal development show bHLH

protein interactions with target promoters undergo rapid changes as development

progresses within a particular tissue type. Also, promoters bound by Neurog2 are vastly

different in different cell types, indicating that Neurog2 DNA binding is cell context specific (Skowronska-Krawczyk et al., 2004). Some bHLH factors are able to homodimerize, such as the ubiquitously expressed Class I protein E47 (TCF3). However,

the proneural class of bHLH proteins in particular are unable to form homodimers and

only bind E-box regulatory elements as heterodimers (Cabrera and Alonso, 1991;

Johnson et al., 1992; Lee et al., 2005), and indeed it has been shown that the target

sequence is dependent on the binding partner. However, experiments have failed to

identify residue combinations within the E-box that are specific to particular

combinations of bHLH heterodimers indicating dimer partners alone may not determine

the target E-box sequence (Blackwell and Weintraub, 1990). It has been clearly shown

that nucleotides flanking the E-box play a crucial role in determining binding specificity.

For example, the presence of a thymidine residue flanking the consensus sequence

inhibits the bHLH protein but not Max (Fisher et al., 1993). It has also been shown

that the heterodimeric CLOCK/BMAL1 complex not only binds the canonical E-box but also two non-canonical 7-bp E-boxes (Wang et al., 2013). Taken together, tissue restricted dimerization partners are a viable mechanism by which NEUROG3 can have different effects in two different tissues.

We now have transcriptome data for both pancreas and intestine. An obvious future direction is to analyze these datasets to identify candidate bHLH factors that are expressed only in the pancreas or only in the intestine. Direct interaction can be confirmed using co-immunoprecipitation. The effect on DNA binding can be tested by EMSA using labelled probes. Finally, the functional relevance of the interaction can be confirmed using knock-down and overexpression approaches in our pancreas and intestine differentiation models. It would be a very exciting result to see insulin or glucagon expression in the intestine upon ectopic expression of a pancreas specific dimerization partner, for example.
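As a rough illustration of the first step, the Python sketch below picks out candidate factors that are expressed in one tissue and essentially absent from the other. Everything in it is an assumption made for illustration: the function name, the expression thresholds, the gene names, and the expression values are hypothetical and do not come from our datasets.

```python
def tissue_restricted(expr, genes, on=10.0, off=1.0):
    """Split candidate genes into pancreas-only and intestine-only sets.

    expr maps gene -> {"pancreas": value, "intestine": value}, where the
    values are normalized expression levels (e.g. TPM). A gene counts as
    restricted to a tissue if it is at or above `on` there and at or below
    `off` in the other tissue. Both cutoffs are arbitrary assumptions.
    """
    restricted = {"pancreas": [], "intestine": []}
    for gene in genes:
        if gene not in expr:
            continue  # candidate not detected in the dataset
        pancreas = expr[gene]["pancreas"]
        intestine = expr[gene]["intestine"]
        if pancreas >= on and intestine <= off:
            restricted["pancreas"].append(gene)
        elif intestine >= on and pancreas <= off:
            restricted["intestine"].append(gene)
    return restricted


# Made-up expression values for hypothetical bHLH factors.
toy_expr = {
    "BHLH_A": {"pancreas": 45.0, "intestine": 0.2},
    "BHLH_B": {"pancreas": 0.0, "intestine": 30.5},
    "BHLH_C": {"pancreas": 80.0, "intestine": 75.0},  # expressed in both
}
toy_result = tissue_restricted(toy_expr, ["BHLH_A", "BHLH_B", "BHLH_C", "BHLH_D"])
```

Candidates passing such a filter would still need the biochemical and functional validation described above, since co-expression alone does not demonstrate dimerization.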

It is important to note that tissue restricted interacting proteins are not limited to dimerization partners: NEUROG3 does not initiate transcription by itself but instead likely works in concert with many factors. For example, a Drosophila study found that the

bHLH heterodimer duplex Achaete/Scute was able to interact with a GATA transcription

factor via a bridging factor Chip (Ramain et al., 2000). It is well known that GATA

proteins are regionally restricted transcription factors important in directing

developmental processes. In fact, 50% of human pancreatic agenesis cases are caused

by heterozygous mutations in GATA6 (Lango Allen et al., 2012). It is also possible that

bridging factors such as Chip are expressed in a tissue restricted manner as well.

To add another level of complexity, mouse studies further show that Neurog3

and other related proneural bHLH factors are under strict control by post-translational

modifications. Neurog3 stability is controlled by the ubiquitin proteasome system and

turnover is regulated in part by dimerization to a bHLH binding partner (Roark et al.,

2012). Furthermore, ubiquitination and phosphorylation have been shown to be

prerequisite for transcriptional activity of bHLH factors in some contexts (Ali et al., 2014;

Kim et al., 2003).

Taken together, the final NEUROG3 transcriptional activity is a summation of epigenetic landscape, protein concentration, DNA-binding affinity, interaction with dimerization partners, and direct or indirect interactions with other transcription factors or various other proteins that make up the transcription initiation complex. A combination of many or maybe all of these mechanisms likely contributes to the tissue specific function of NEUROG3 in pancreas compared to intestine.

Modeling human development and disease

Our current molecular understanding of pancreatic and intestinal development originates predominantly from mouse studies, where genetic approaches have identified both intrinsic and extrinsic factors that coordinate and guide development. These data have frequently been extrapolated to human development. Although the basic developmental program may be similar, the identity and/or timing of activity of some factors may be critically different. Detailed characterization of human fetal development would provide information regarding the spatiotemporal expression of intrinsic and extrinsic factors; however, for obvious reasons it is not possible to perform these sorts of studies.

Modeling human development and disease is an exciting strength of in vitro human culture systems. The development of iPSC technologies has made it relatively straightforward to obtain stem cell lines derived from diseased patients. In our case, patient samples were unavailable, but the recent development of gene editing technologies has made it possible to introduce mutations of interest into any cell line of choice. The stem cells harboring patient mutations can then be differentiated into a relevant tissue type and studied for mechanism or to identify possible treatments with no invasiveness to the patient. The experimental paradigm employed in this work is well suited to study any monogenic disorder insofar as we are able to differentiate PSCs to a relevant tissue. For example, maturity onset diabetes of the young (MODY) comprises forms of monogenic diabetes caused by mutations in genes important for either the differentiation or function of β cells. These genes, and mutations therein, could similarly be studied in this system to identify novel disease mechanisms that may lead to better future interventions or therapeutics.

Acknowledgements

Many thanks to James Wells, Aaron Zorn, Brian Gebelein, Stacey Huppert, Michael

Helmrath, and the many wonderful colleagues in the Wells and Zorn labs for countless valuable discussions and insights throughout my dissertation (Xinghao Zhang, Katie

Sinagoga, Jorge Munera, Heather McCauley, Jacqui Schiesser, Jamie Schweitzer, Jay

Stone, Matt Kofron, Kyle McCracken, Anna Method, Christopher Mayhew, Amy Pitstick,

Mike Workman, Stephen Trisno, Scott Rankin, Alan Kenny, Lu Han, Mariana Louza,

Zak Agricola, Sang-wook Cha).

Ali, F.R., Cheng, K., Kirwan, P., Metcalfe, S., Livesey, F.J., Barker, R.A., and Philpott, A. (2014). The phosphorylation status of Ascl1 is a key determinant of neuronal differentiation and maturation in vivo and in vitro. Development 141, 2216–2224.

Al-Masri, M., Krishnamurthy, M., Li, J., Fellows, G.F., Dong, H.H., Goodyer, C.G., and Wang, R. (2010). Effect of forkhead box O1 (FOXO1) on beta cell development in the human fetal pancreas. Diabetologia 53, 699–711.

Ariyachet, C., Tovaglieri, A., Xiang, G., Lu, J., Shah, M.S., Richmond, C.A., Verbeke, C., Melton, D.A., Stanger, B.Z., Mooney, D., et al. (2016). Reprogrammed Stomach Tissue as a Renewable Source of Functional β Cells for Blood Glucose Regulation. Cell Stem Cell 18, 410–421.

Badner, J.A., Sieber, W.K., Garver, K.L., and Chakravarti, A. (1990). A genetic study of Hirschsprung disease. Am J Hum Genet 46, 568–580.

Banga, A., Akinci, E., Greder, L.V., Dutton, J.R., and Slack, J.M.W. (2012). In vivo reprogramming of Sox9+ cells in the liver to insulin-secreting ducts. Proc. Natl. Acad. Sci. 109, 15336–15341.

Blackwell, T.K., and Weintraub, H. (1990). Differences and similarities in DNA-binding preferences of MyoD and E2A protein complexes revealed by binding site selection. Science 250, 1104–1110.

Cabrera, C.V., and Alonso, M.C. (1991). Transcriptional activation by heterodimers of the achaete-scute and daughterless gene products of Drosophila. EMBO J. 10, 2965–2973.

Fisher, D.E., Parent, L.A., and Sharp, P.A. (1993). High affinity DNA-binding Myc analogs: recognition by an alpha helix. Cell 72, 467–476.

Gradwohl, G., Dierich, A., LeMeur, M., and Guillemot, F. (2000). neurogenin3 is required for the development of the four endocrine cell lineages of the pancreas. Proc. Natl. Acad. Sci. U. S. A. 97, 1607–1611.

Gu, G., Dubauskaite, J., and Melton, D.A. (2002). Direct evidence for the pancreatic lineage: NGN3+ cells are islet progenitors and are distinct from duct progenitors. Development 129, 2447–2457.

Hrvatin, S., O’Donnell, C.W., Deng, F., Millman, J.R., Pagliuca, F.W., Diiorio, P., Rezania, A., Gifford, D.K., and Melton, D.A. (2014). Differentiated human stem cells resemble fetal, not adult, β cells. Proc. Natl. Acad. Sci. U. S. A.

Huang, H., Liu, M., El-hodiri, H.M., Chu, K., and Liu, M.I.N. (2000). Regulation of the Pancreatic Islet-Specific Gene BETA2 (neuroD) by Neurogenin 3.

Jensen, J.N., Rosenberg, L.C., Hecksher-Sørensen, J., and Serup, P. (2007). Mutant Neurogenin-3 in Congenital Malabsorptive Diarrhea. N. Engl. J. Med. 356, 1781–1782.

Johnson, J.E., Birren, S.J., Saito, T., and Anderson, D.J. (1992). DNA binding and transcriptional regulatory activity of mammalian achaete-scute homologous (MASH) proteins revealed by interaction with a muscle-specific enhancer. Proc. Natl. Acad. Sci. U. S. A. 89, 3596–3600.

Kim, S.Y., Herbst, A., Tworkowski, K.A., Salghetti, S.E., and Tansey, W.P. (2003). Skp2 regulates Myc protein stability and activity. Mol. Cell 11, 1177–1188.

Lango Allen, H., Flanagan, S.E., Shaw-Smith, C., De Franco, E., Akerman, I., Caswell, R., Ferrer, J., Hattersley, A.T., and Ellard, S. (2012). GATA6 haploinsufficiency causes pancreatic agenesis in humans. Nat. Genet. 44, 20–22.

Lee, S., Lee, B., Ruiz, E.C., and Pfaff, S.L. (2005). Olig2 and Ngn2 function in opposition to modulate gene expression in motor neuron progenitor cells. 282–294.

Liu, H., Yang, H., Zhu, D., Sui, X., Li, J., Liang, Z., Xu, L., Chen, Z., Yao, A., Zhang, L., et al. (2014). Systematically labeling developmental stage-specific genes for the study of pancreatic β-cell differentiation from human embryonic stem cells. Cell Res. 1–20.

Longo, A., Guanga, G.P., and Rose, R.B. (2008). Crystal structure of E47-NeuroD1/beta2 bHLH domain-DNA complex: heterodimer selectivity and DNA recognition. Biochemistry 47, 218–229.

McGrath, P.S., Watson, C.L., Ingram, C., Helmrath, M.A., and Wells, J.M. (2015). The Basic Helix-Loop-Helix Transcription Factor NEUROG3 Is Required for Development of the Human Endocrine Pancreas. Diabetes 64, 2497–2505.

Pauerstein, P.T., Sugiyama, T., Stanley, S.E., McLean, G.W., Wang, J., Martín, M.G., and Kim, S.K. (2015). Dissecting human gene functions regulating islet development with targeted gene transduction. Diabetes 64, 1–51.

Pinney, S.E., Oliver-Krasinski, J., Ernst, L., Hughes, N., Patel, P., Stoffers, D.A., Russo, P., and De León, D.D. (2011). Neonatal diabetes and congenital malabsorptive diarrhea attributable to a novel mutation in the human neurogenin-3 gene coding sequence. J. Clin. Endocrinol. Metab. 96, 1960–1965.

Ramain, P., Khechumian, R., Khechumian, K., Arbogast, N., Ackermann, C., and Heitzler, P. (2000). Interactions between chip and the achaete/scute-daughterless heterodimers are required for pannier-driven proneural patterning. Mol. Cell 6, 781–790.

Roark, R., Itzhaki, L., and Philpott, A. (2012). Complex regulation controls Neurogenin3 proteolysis. Biol. Open 1, 1264–1272.

Rouhani, F., Kumasaka, N., de Brito, M.C., Bradley, A., Vallier, L., and Gaffney, D. (2014). Genetic Background Drives Transcriptional Variation in Human Induced Pluripotent Stem Cells. PLoS Genet. 10.

Rubio-Cabezas, O., Jensen, J.N., Hodgson, M.I., Codner, E., Ellard, S., Serup, P., and Hattersley, A.T. (2011). Permanent neonatal diabetes and enteric anendocrinosis associated with biallelic mutations in NEUROG3. Diabetes 60, 1349–1353.

Rubio-Cabezas, O., Codner, E., Flanagan, S.E., Gómez, J.L., Ellard, S., and Hattersley, A.T. (2014). Neurogenin 3 is important but not essential for pancreatic islet development in humans. Diabetologia 2, 3–6.

Skowronska-Krawczyk, D., Ballivet, M., Dynlacht, B.D., and Matter, J.-M. (2004). Highly specific interactions between bHLH transcription factors and chromatin during retina development. Development 131, 4447–4454.

Villasenor, A., Chong, D.C., and Cleaver, O. (2008). Biphasic Ngn3 expression in the developing pancreas. Dev. Dyn. 237, 3270–3279.

Wallace, A.S., and Anderson, R.B. (2011). Genetic interactions and modifier genes in Hirschsprung’s disease. World J. Gastroenterol. 17, 4937–4944.

Wang, A., Yue, F., Li, Y., Xie, R., Harper, T., Patel, N.A., Muth, K., Palmer, J., Qiu, Y., Wang, J., et al. (2015). Epigenetic priming of enhancers predicts developmental competence of hESC-derived endodermal lineage intermediates. Cell Stem Cell 16, 386–399.

Wang, J., Galen, C., Wu, V., Tran, R., Cho, J.-H., Tsai, M.-J., Bailey, T.J., Jamrich, M., Ament, M.E., Treem, W.R., et al. (2006). Mutant neurogenin-3 in congenital malabsorptive diarrhea. N. Engl. J. Med. 356, 1781–1782; author reply 1782.

Wang, Z., Wu, Y., Li, L., and Su, X.-D. (2013). Intermolecular recognition revealed by the complex structure of human CLOCK-BMAL1 basic helix-loop-helix domains with E-box DNA. Cell Res. 23, 213–224.

Figure Legends

Figure 1. Characterization of WT, R107S, and L135P ability to bind DNA

(A) An electrophoretic mobility shift assay was performed using purified E47 or

NEUROG3WT-E47 and a labeled E-box containing oligonucleotide. E47 alone is able to bind the DNA probe. A band-shift is observed upon the addition of NEUROG3, indicating it is able to bind DNA. NEUROG3 is unable to homodimerize, as NEUROG3 alone cannot bind the probe.

(B) Same as A, using NEUROG3R107S. NEUROG3R107S is able to bind DNA, possibly

with slightly lower efficiency compared to wild-type.

(C) Same as A, using NEUROG3L135P. The L135P mutation renders NEUROG3

completely unable to bind the E-box probe.

Figure 1 (panels A, B, and C)