UNIVERSITY OF MANCHESTER

Transdifferentiation to pancreatic progenitors

A thesis submitted to the University of Manchester for the degree of Doctor of Philosophy in the Faculty of Biology, Medicine and Health

2019

Stephen John Wearne

School of Medical Sciences
Division of Diabetes Endocrinology and Gastroenterology

Contents
Contents
List of tables
List of figures
List of abbreviations
List of publications
Abstract
Declaration
Copyright statement
Acknowledgements
1. Introduction
1.1. Overview of the pancreas
1.2. Development towards pancreatic progenitors
1.2.1. Endoderm to early pancreas signalling network
1.2.2. Transcription factor network of the early pancreas
1.2.3. Late pancreas signalling network
1.2.4. Transcription factor expression in human and mouse pancreas development
1.3. Pancreatic cell production
1.3.1. Transdifferentiation
1.3.2. In vitro differentiation
1.3.3. Organoid culture
1.4. Project goal and methods
2. Materials and Methods
2.1. Materials
2.1.1. Solutions
2.1.2. Primers
2.1.3. Plasmids
2.1.4. Antibodies
2.2. Methods
2.2.1. Bioinformatic analysis
2.2.2. Molecular cloning
2.2.3. AAV production and testing
2.2.4. qPCR
2.2.5. Cell culture
2.2.6. Western blot
2.2.7. Immunofluorescence
2.2.8. RNAscope
2.2.9. Immunohistochemistry
3. Results
3.1. LgPCA and Mogrify identify a shared cohort of active in specifying the pancreas
3.2. In vitro differentiation allows the derivation of the crucial factors for transdifferentiation
3.3. The generation and validation of a reliable AAV production protocol
3.4. PDX1 AAV is able to repress hepatic genes in the HepG2 cell line
3.5. Combined exposure of key transcription factors via AAV delivery can induce endogenous expression of tissue specific genes
3.6. The spatial expression profile of PTF1A, RBPJL and NR5A2 overlap during late embryogenesis
3.7. LgPCA and Mogrify determined transcription factors are able to alter expression of pancreatic genes in FLFs
3.8. PDX1, MNX1 and FOXA2 AAVs are able to induce endogenous expression of multiple markers of pancreatic progenitors
3.9. LgPCA but not Mogrify unveils further genes specifying for the early pancreas
4. Discussion
4.1. Bioinformatic analysis of potential transcription factors able to transdifferentiate to the pancreas
4.2. The description of temporal expression during in vitro differentiation to pancreatic progenitors
4.3. An improved method for AAV generation and purification
4.4. AAVs are effective vectors for gene delivery and expression
4.5. AAVs are able to induce transdifferentiation
4.6. The characterisation of exocrine transcription factors during late embryogenesis
4.7. Transcription factors derived from bioinformatic analysis have capability to transdifferentiate cells
4.8. PDX1, MNX1 and FOXA2 are potent transcription factors for transdifferentiation to pancreatic progenitors
4.9. Further bioinformatic analysis validates established factors as well as deriving novel factors
5.0. Summary and future work
Bibliography
Appendixes
Appendix I
Appendix II
Appendix III
Appendix IV
Appendix V
Appendix VI
Appendix VII
Appendix VIII
Appendix IX
Appendix X
Appendix XI
Appendix XII
Appendix XIII
Appendix XIV
Appendix XV
Appendix XVI

Word Count: 39225


List of tables
Table  Title
1      The embryonic stages of development in humans and mice.
2      Transcription factor presence during human and mouse embryonic development.
3      Solutions used throughout project.
4      Primers used throughout the project.
5      The plasmids used and produced throughout the project.
6      Antibodies used throughout project.
7      qPCR cycle setup.
8      STEMCELL Technologies pancreatic differentiation timetable.
9      A summary of gene expression change upon the addition or removal of 8 key transcription factor encoding AAVs.
10     A summary of gene expression change upon the addition or removal of 4 key transcription factor encoding AAVs.


List of figures
Figure  Title
1       Transcription factor regulation in the early human and mouse pancreas.
2       The advance of transdifferentiation protocols towards pancreatic cell types.
3       The advance of in vitro differentiation protocols towards pancreatic cell types.
4       The advance of organoid expansion protocols towards pancreatic cell types.
5       Bioinformatic analyses used to determine genes for transdifferentiation.
6       LgPCA and Mogrify analysis of transcription factors for transdifferentiation.
7       Day by day expression during SCT differentiation of genes active during pancreas development and from LgPCA/Mogrify analysis.
8       A summary of AAV production and validation.
9       GFP expression following varying MOIs of GFP AAV.
10      analysis of AAV induced expression.
11      PDX1 AAV exposure to HepG2 cells.
12      HNF4A, HNF1A, FOXA3 AAV exposure to FLF cells.
13      PTF1A, RBPJL and NR5A2 AAV exposure to FLF cells.
14      FLF protein expression following PTF1A, RBPJL and NR5A2 AAV treatment.
15      PTF1A, RBPJL and NR5A2 localisation within a CS18 human embryonic pancreas.
16      Single AAV factor exposure to FLF cells.
17      8 AAV factor exposure to FLF cells.
18      4 AAV factor exposure to FLF cells.
19      PDX1, MNX1 and FOXA2 binding to upstream genomic regions of pancreatic genes.
20      Bonfanti and Trott media exposure to PMF treated FLF cells.
21      PDX1, MNX1 and FOXA2 AAV exposure to FLF cells.
22      FLF cell morphology and protein expression following PDX1, MNX1 and FOXA2 AAV treatment.
23      Fibroblast gene expression following HHF, PtRN and PMF treatment.
24      Novel LgPCA and Mogrify analysis.


List of abbreviations
Abbreviation  Full Term
A*STAR        Agency for Science, Technology and Research
A-P           anterior posterior
AAT           alpha-1 antitrypsin
AAV           adeno-associated virus
ACD           Advanced Cell Diagnostics
ACT           β-actin
ActA          activin A
AFP           alpha fetoprotein
AIP           anterior intestinal portal
ALB           albumin
ALK5iII       activin receptor-like kinase 5 inhibitor II
AMY1A         amylase alpha 1A
APOA2         apolipoprotein A-II
AXLi          AXL receptor tyrosine kinase inhibitor
bHLH          basic helix-loop-helix
BIX           BIX-01294
BMP           bone morphogenetic protein
BNC1          basonuclin 1
bp            base pairs
BSA           bovine serum albumin
CDH1          E-cadherin
cDNA          complementary deoxyribonucleic acid
CDKN1A        cyclin dependent kinase inhibitor 1A
CDX1          caudal type homeobox 1
CDX2          caudal type homeobox 2
ChIP          chromatin immunoprecipitation
CIP           caudal intestinal portal
CMV           cytomegalovirus
CNOT1         CCR4-NOT transcription complex subunit 1
COL3A1        collagen type 3 A 1
CPA1          carboxypeptidase A1
CS            Carnegie stage
CYC           cyclopamine
DAB           3,3'-diaminobenzidine
DAPT          N-[N-(3,5-difluorophenacetyl)-l-alanyl]-S-phenylglycine t-butyl ester
dpc           days post conception
DMEM          Dulbecco’s modified Eagle’s medium
DNA           deoxyribonucleic acid
E             embryonic day
ECM           extracellular matrix
EDTA          ethylenediaminetetraacetic acid
EGF           epidermal growth factor
EGFR          epidermal growth factor receptor
EHF           ETS homologous factor
EMT           epithelial to mesenchymal transition
Ex4           exendin 4
FACS          fluorescence activated cell sorting
FCS           fetal calf serum
FGF1          fibroblast growth factor 1
FGF2          fibroblast growth factor 2
FGFR2B        fibroblast growth factor receptor 2B
FGF4          fibroblast growth factor 4
FGF7          fibroblast growth factor 7
FGF10         fibroblast growth factor 10
FLF           fetal lung fibroblasts
FOXA1         forkhead box protein A1
FOXA2         forkhead box protein A2
FOXA3         forkhead box protein A3
FSP1          fibroblast-specific protein 1
GATA4         GATA transcription factor 4
GATA5         GATA transcription factor 5
GATA6         GATA transcription factor 6
GCG           glucagon
GDF8          growth differentiation factor 8
GFP           green fluorescent protein
GLP1          glucagon-like peptide 1
GMP           good manufacturing practice
GSiXX         γ-secretase inhibitor XX
GSK3β         glycogen synthase kinase 3 β
GYS2          glycogen synthase
HA            hemagglutinin
HBSS          Hank’s balanced salt solution
HEK293        human embryonic kidney 293 cells
HES1          hairy and enhancer of split-1
HEYL          hairy/enhancer-of-split related with YRPW motif-like protein
hESC          human embryonic stem cells
HHEX          hematopoietically expressed homeobox
HHF           HNF4A, HNF1A, FOXA3
HNF1A         hepatocyte nuclear factor 1A
HNF1B         hepatocyte nuclear factor 1B
HNF4A         hepatocyte nuclear factor 4A
HNF6          hepatocyte nuclear factor 6
hPSC          human pluripotent stem cells
IF            immunofluorescence
IGF           insulin-like growth factor
IHC           immunohistochemistry
INS           insulin
INSR          insulin receptor
iPSCs         induced pluripotent stem cells
ISL1          insulin gene enhancer protein ISL-1
KLF4          Kruppel like factor 4
LB            lysogeny broth
LDN           LDN-193189
LgPCA         lineage-guided principal component analysis
LGR5          leucine-rich repeat-containing G-protein coupled receptor 5
LIF           leukemia inhibitory factor
MAFA          v-maf avian musculoaponeurotic fibrosarcoma oncogene homolog A
MAFB          v-maf avian musculoaponeurotic fibrosarcoma oncogene homolog B
MIST1         muscle, intestine and stomach expression 1
MNX1          motor neuron and pancreas homeobox 1
MODY          maturity onset diabetes of the young
MOI           multiplicity of infection
mRNA          messenger ribonucleic acid
MYC           avian myelocytomatosis
MEF2C         myocyte enhancer factor 2C
N-Cys         N-acetyl cysteine
NEAA          non-essential amino acids
NECA          5'-N-ethylcarboxamidoadenosine
NFKBIZ        NF-kappa-B inhibitor zeta
NGN3          neurogenin 3
NICD          notch intracellular domain
NKX2.1        Nirenberg and Kim homeobox factor 2.1
NKX2.2        Nirenberg and Kim homeobox factor 2.2
NKX2.3        Nirenberg and Kim homeobox factor 2.3
NKX6.1        Nirenberg and Kim homeobox factor 6.1
NR0B2         nuclear receptor subfamily 0 group B member 2
NR5A2         nuclear receptor subfamily 5, group A, member 2
OCT4          octamer-binding transcription factor 4
P57           cyclin-dependent kinase inhibitor 1C
PAS           periodic acid-Schiff
PBS           phosphate buffered saline
PCR           polymerase chain reaction
PDX1          pancreatic and duodenal homeobox 1
PEI           polyethyleneimine
PKC           protein kinase C
PLA2G1B       phospholipase A2 group IB
PMF           PDX1, MNX1, FOXA2
PNM           PDX1, NGN3, MAFA
PP            pancreatic polypeptide
PRSS1         trypsin 1
PSC           pluripotent stem cells
PTF1A         pancreas transcription factor 1 subunit a
PtRN          PTF1A, RBPJL, NR5A2
qPCR          quantitative polymerase chain reaction
RA            retinoic acid
RALDH2        retinaldehyde dehydrogenase 2
RBPJ          recombining binding protein suppressor of hairless
RBPJL         recombining binding protein suppressor of hairless-like
RFX6          regulatory factor X 6
RIPA          radioimmunoprecipitation assay
RNA           ribonucleic acid
ROCKi         rho-associated protein kinase inhibitor Y
SANT          SHH antagonist
SCT           Stem Cell Technologies
SDS PAGE      sodium dodecyl sulphate polyacrylamide gel electrophoresis
SHH           sonic hedgehog
siRNA         small interfering RNA
SIX1          sine oculis homeobox homolog 1
SIX4          sine oculis homeobox homolog 4
SOX2          SRY HMG-box factor 2
SOX6          SRY HMG-box factor 6
SOX9          SRY HMG-box factor 9
SOX17         SRY HMG-box factor 17
SSS           sonicated salmon sperm
STZ           streptozotocin
SV40          simian vacuolating virus 40
T3            tri-iodothyronine
TBP           TATA-binding protein
TBE           tris-borate EDTA
TBI           TGFβ inhibitor IV
TBST          tris-buffered saline Tween
TBX5          T-box transcription factor 5
TESC          tescalcin
TF            transferrin
TG            tris-glycine
TGFβ          transforming growth factor β
TGIF2         TGFβ induced factor homeobox 2
TLX2          T-cell leukemia homeobox protein 2
Tm            melting temperature
TOB1          transducer of ERBB2 1
TSHZ3         teashirt homolog 3
TTNPB         4-[(E)-2-(5,6,7,8-Tetrahydro-5,5,8,8-tetramethyl-2-naphthalenyl)-1-propenyl]benzoic acid
UPR           unfolded protein response
UTR           untranslated region
VEGF          vascular endothelial growth factor
VIM           vimentin
VitC          vitamin C
WHO           The World Health Organisation
WNT           wingless/integrated
wpc           weeks post conception
XBP1          X-box binding protein 1
ZNF469        zinc finger protein 469

List of publications
Jennings, R.E., Berry, A.A., Gerrard, D.T., Wearne, S.J., Strutt, J., Withey, S., Chhatriwala, M., Piper Hanley, K., Vallier, L., Bobola, N., and Hanley, N. (2017). Laser Capture and Deep Sequencing Reveals the Transcriptomic Programmes Regulating the Onset of Pancreas and Liver Differentiation in Human Embryos. Stem Cell Reports.

List of corrections
Page                                   Description of Corrections
9                                      Typo: factos -> factors
17                                     Typo: SOX7 -> SOX17
48                                     Error: MafB is not a marker of α cells in humans
51                                     Typo: wash -> was
57, 58, 59, 60                         Addition: expanded upon aims into three clear goals and reorganised this section
58                                     Typo: tease -> tease out
65                                     Addition: antibody suppliers
69, 75                                 Error: rpm -> g
70                                     Addition: FACS methods
72                                     Addition: a statistics section was made
73                                     Addition: Barker lab -> Barker lab (A*STAR, Singapore)
79                                     Typo: soughtafter -> sought after
86                                     Figure 8: edited axis to include GFP
87                                     Figure 9: edited axis to include GFP
90, 93, 96, 102, 104, 106, 113, 118    Error: statistical analysis used stated in figure legends
94, 99, 100, 116, 150                  Error: stated that proliferation was not quantified and this is critiqued in discussion
96                                     Figure 15: included arrows and changed layout so bigger and can be viewed
101, 102                               Figure 16: fixed erroneous MNX1 bar and changed axis to zero
143                                    Italics: in vitro
146                                    Addition: discussed flaws in the use of ΔΔCt
156                                    Error: rewrote sentence and fixed typo


Abstract
Diabetes is rapidly becoming a global epidemic, with the International Diabetes Federation noting that close to 9% of the global adult population suffers from the disorder. Over the past few decades, many protocols have been developed to differentiate viable pancreatic cells in vitro for cell therapy. However, this process is currently not cost effective and other avenues for production may be preferable. Transdifferentiation is one such avenue, as overexpression of key transcription factors can shift an initial cell type toward a target cell type. Here transdifferentiation was employed to generate pancreatic progenitors, which could express markers from all key lineages of the pancreas. The specific transcription factors used for this process were determined by two powerful and pioneering bioinformatic analyses, Lineage-guided Principal Component Analysis (LgPCA) and Mogrify.

LgPCA investigation was undertaken to deduce the most potent transcription factors from this novel genome-wide analysis of 19,362 genes. Mogrify complemented this data set with transdifferentiation specific factors, and the combination provided a highly optimised cohort of genes to study. A comprehensive study of known regulatory networks in mouse and human pancreas development was combined with expression profiling during the in vitro differentiation of human embryonic stem cells (hESCs) to early pancreatic beta-cells. Having selected a prioritised cohort of 8 transcription factors, an enhanced adeno-associated virus (AAV) generation protocol was developed for their delivery into cells in vitro. Transcription factors were applied to human fetal fibroblasts to assess their capacity to induce transdifferentiation toward pancreatic progenitors. This was evaluated by gene expression profiling, including the expression of the endogenous genes encoding the transduced factors. This approach identified the best trio of transcription factors for transdifferentiation toward the phenotype of a pancreatic progenitor. While further optimisation is needed, including in vivo transplantation, the data indicate that cell phenotype can be reproducibly altered for at least 1-2 weeks by the transduction of a very limited set of transcription factors. The conclusion is that transdifferentiation could offer a viable route to the ultimate large-scale production of insulin-secreting beta-cells from fibroblasts.


Declaration
No portion of the work referred to in the thesis has been submitted in support of an application for another degree or qualification of this or any other university or other institute of learning.

Copyright statement
I. The author of this thesis (including any appendices and/or schedules to this thesis) owns certain copyright or related rights in it (the “Copyright”) and s/he has given The University of Manchester certain rights to use such Copyright, including for administrative purposes.

II. Copies of this thesis, either in full or in extracts and whether in hard or electronic copy, may be made only in accordance with the Copyright, Designs and Patents Act 1988 (as amended) and regulations issued under it or, where appropriate, in accordance with licensing agreements which the University has from time to time. This page must form part of any such copies made.

III. The ownership of certain Copyright, patents, designs, trademarks and other intellectual property (the “Intellectual Property”) and any reproductions of copyright works in the thesis, for example graphs and tables (“Reproductions”), which may be described in this thesis, may not be owned by the author and may be owned by third parties. Such Intellectual Property and Reproductions cannot and must not be made available for use without the prior written permission of the owner(s) of the relevant Intellectual Property and/or Reproductions.

IV. Further information on the conditions under which disclosure, publication and commercialisation of this thesis, the Copyright and any Intellectual Property and/or Reproductions described in it may take place is available in the University IP Policy (see http://documents.manchester.ac.uk/DocuInfo.aspx?DocID=24420), in any relevant Thesis restriction declarations deposited in the University Library, The University Library’s regulations (see http://www.library.manchester.ac.uk/about/regulations/) and in The University’s policy on Presentation of Theses.


Acknowledgements
I would like to give many thanks to my supervisors Professor Neil Hanley and Doctor Ray Dunn. Their guidance and advice have been invaluable throughout my PhD. I am also truly thankful for their patience and understanding, not only in dealing with me but also my mother.

I wish to thank all those in both Hanley laboratories who aided me in my first and last year. Thank you to Doctor Andy Berry for teaching me a vast array of useful techniques as well as providing me with several entertaining nicknames. Thank you to Doctor Rachel Jennings for sorting all the reagents upon my return to Manchester and for being a great source of gossip. Thank you to Doctor Kim Su and Doctor Elliot Jokl for their expertise and wizardry in the running of RNAscope. Thank you to Lindsay Birchall for her support in dealing with the many, many administrative banalities and laboratory safety forms.

I also want to thank all those in the Dunn laboratory who supported me from my second to last year; you were all a constant source of hilarity and happiness. Thank you to Rau Zan for the many chats regarding upcoming games. Thank you to Doctor Kim Jee Goh for being a brilliant source of entertainment in the laboratory. Thank you to Shermaine Eng for the never ending barrage of bad jokes. Thank you to Sheena Ong for the advice, conversation and crazy stories. Thank you to Jennica Tan for her unwavering patience and guidance in the laboratory. Thank you to Doctor Jamie Trott for all the incredible insights he provided, which were vital to this project. Thank you to Doctor James Strutt who, throughout my time in both laboratories, provided me with amazing advice, amusement and anecdotes that kept me going even on the worst days of my PhD.

Finally, I would like to thank Ignacius Tay for her unyielding support over the past two years. Without her optimism, encouragement and boundless knowledge of delicious food I would not have been able to complete this PhD.


1. Introduction
1.1. Overview of the pancreas
The pancreas is a multifaceted and integral organ with a heterogeneous structure specifically developed to maintain homeostasis. The primary function of the pancreas is keeping blood glucose levels stable; additionally, it produces a variety of enzymes used during digestion (Pan and Wright, 2011; Jennings et al., 2015). These diverse functions are reflected in the structure of the pancreas, which comprises three distinct tissues: the endocrine, exocrine and ductal. The endocrine cells of the pancreas are found within clusters known as the Islets of Langerhans. These islets contain various hormone producing cell types including α cells (glucagon), β cells (insulin), δ cells (somatostatin), ε cells (ghrelin) and PP cells (pancreatic polypeptide, previously known as γ cells). The β cells constitute the majority of the islets and secrete the hormone insulin, which allows for the maintenance of blood glucose levels. Insulin acts to promote glycogen synthesis from glucose, allowing long term stable storage of glucose, whilst glucagon stimulates glycogen breakdown to glucose. The exocrine and ductal tissues make up the bulk of the pancreas. Here, acinar cells of the exocrine tissue secrete digestive enzymes, which are transported out of the pancreas via a network of ductal cells to the duodenum. Given the importance of the pancreas, diseases related to it are severe, with diseases such as diabetes mellitus requiring constant management throughout life.

Many diabetic patients are reliant on external insulin in order to regulate their blood sugar levels. As a result, considerable study has been dedicated toward developing long term treatments. Famously, a method was established for the transplantation of human islet cells from deceased donors into patients suffering from type 1 diabetes (Shapiro et al., 2000). This method, dubbed the Edmonton protocol, has been largely successful in providing patients with a long term reprieve from insulin dependence (Shapiro et al., 2006). However, human cadaveric islets are a scarce commodity, severely hampering the scope of the protocol’s effectiveness. Following the advent of culturable human embryonic stem cells (hESCs) prior to the end of the century, work began towards addressing this issue and providing patients with a long term treatment in the form of cell therapy (Thomson et al., 1998; Assady et al., 2001).

In the ensuing 20 years a range of technologies and methods have been employed to this end. Pancreatic cell types can now be reliably differentiated from pluripotent stem cells (PSCs) with the use of growth factors and small molecules (Rezania et al., 2014; Trott et al., 2017; Nair et al., 2019). Furthermore, overexpression of key transcription factors with viral vectors has also led to transdifferentiation of cells towards pancreatic cells (Zhou et al., 2008; Akinci et al., 2013; Furuyama et al., 2019). Therefore, the ability to treat those who have lost islet cells due to autoimmune damage (type 1 diabetes) as well as the potential to aid those with insulin resistance (type 2 diabetes) on a large scale is drawing closer to reality.

Current knowledge concerning pancreas development is based primarily upon data established in mice or other vertebrates (reviewed: Pan and Wright, 2011; Serup 2012; Cano et al., 2014). This is chiefly because of insufficient early human pancreatic tissue available for study. Recent work has begun to shed light on early human development, as well as expose discrepancies between the human and rodent developmental program (Lyttle et al., 2008; Jennings et al., 2013; Benner et al., 2014). Importantly, these differences demonstrate that rodent data does not provide an entirely comparable model for human pancreas development. Therefore, it is crucial to determine human specific traits exhibited during pancreas development to improve the generation of pancreatic cell types.


1.2. Development towards pancreatic progenitors
Whilst knowledge of human development is growing rapidly, much still remains a mystery due to limited tissue availability and tight regulation of its use. Moreover, the practical realities of biology limit the study of the earliest stages of human development, due to the time taken before confirmation of postponed menstruation. Consequently, what is known in both rodent and human biology with regards to development towards the pancreas will be detailed here, whilst also outlining the deviations or similarities between the two. Pancreas development can be succinctly characterised as beginning with definitive endoderm, leading to gut tube formation, with the pancreas originating from the posterior foregut of the gut tube. The pancreas then forms two buds (dorsal and ventral) which eventually combine to form the whole body of the organ. As the buds expand, pancreatic progenitors, also referred to as multipotent progenitor cells, form. These cells are critical to pancreas development as they are able both to proliferate extensively and to generate the various cell types found within the mature pancreas such as the islets, acini and ducts (Zhou et al., 2007; Pan and Wright, 2011). This is in stark contrast to islet β cells, which have been found to have exceedingly low replication rates (Teta et al., 2005). Furthermore, the regeneration of human adult β cells does not rely upon a β specific progenitor cell type but rather simple self-renewal (Teta et al., 2007). Consequently, pancreatic progenitors are a critical cell type to study and produce because of their inherent proliferative capabilities. However, in order to produce these cells, how they are generated in vivo must first be understood.

1.2.1. Endoderm to early pancreas signalling network
The pancreas forms from the endoderm, one of the three germ layers. Mouse studies have shown that nodal is a key component in defining endoderm formation (Brennan et al., 2001). Nodal, the activins and the bone morphogenetic proteins (BMPs) are members of the transforming growth factor β (TGFβ) superfamily of growth factors, which are active throughout development and adult life. High levels of TGFβ signalling initiate endoderm formation whilst low levels initiate mesoderm formation in both mice and humans (Kubo et al., 2004; D’Amour et al., 2005; Teo et al., 2012; Wang et al., 2015). Mutations in the type II activin receptors A and B disrupt the TGFβ signalling pathway, causing defects in stomach, spleen and pancreas formation (Kim et al., 2000). Additionally, the underlying transcriptional network activated by TGFβ signalling is highly conserved throughout vertebrates and includes gene families such as the SRY HMG-box factors (SOX), GATA transcription factors (GATA) and forkhead factors (FOX) (Rojas et al., 2005; Sinner et al., 2006; Spence and Wells, 2007; Spence et al., 2009).

Another key signalling molecule is retinoic acid (RA), which is able to induce expression of the master regulator of the pancreas, pancreatic and duodenal homeobox 1 (Pdx1), in stem cells (Micallef et al., 2005). Additionally, mice with mutated retinaldehyde dehydrogenase 2 (Raldh2), which encodes an enzyme required for RA synthesis, lack dorsal pancreatic Pdx1 expression and the dorsal pancreatic bud (Martin et al., 2005; Molotkov et al., 2005). RA has also been shown to work in tandem with another major group of signalling molecules, the fibroblast growth factor (FGF) family, which act as mitogens throughout the body (Ornitz and Marie et al., 2015). When these molecules are applied to hESCs that have already been exposed to Activin A (ActA), they induce robust PDX1 expression (Johannesson et al., 2009). During early mouse endoderm development FGF4 induces posterior gene expression whilst repressing anterior gene expression; intermediate levels of FGF4 activate Pdx1 (Wells and Melton 2000; Dessimoz 2006). In human stem cells FGF2 at intermediate levels also induces PDX1 expression and inhibits hepatocyte differentiation (Ameri et al., 2010).

The resulting effect of these layered signalling family interactions is the definition of organ specific transcriptional activities which act to further characterise the anterior posterior (A-P) axis of the endoderm. Key transcription factors mark specific portions and future organs of the endoderm: sine oculis homeobox homolog 1 and 4 (Six1 and Six4) define the pharyngeal ends (Zou et al., 2006), SRY HMG-box factor 2 (Sox2) the stomach (Que et al., 2007), Nirenberg and Kim homeobox factor 2.1 (Nkx2.1) the lungs (Minoo et al., 1999), hematopoietically expressed homeobox (Hhex) the liver (Martinez-Barbera 2000; Keng et al., 2000), Pdx1 the pancreas (Offield 1996), and caudal type homeobox 1 and 2 (Cdx1 and Cdx2) the intestine (Beck 1995; Beck et al., 2003). Each of these factors defines its respective organ along the A-P axis.

In mice, development is described in embryonic days, beginning with fertilisation. The specification of organ identity along the endoderm occurs in the early stages of mouse embryogenesis, just after gastrulation, around embryonic day 6.5-8.5 (E6.5-8.5). During this time the foregut endoderm begins to fold, forming the anterior and caudal intestinal portals (AIP and CIP). In mice this occurs at around E7.5, forming a complete internalised tube by E8.75, the posterior end of which will be the site of pancreas dorsal and ventral bud formation (Lawson and Pedersen, 1987). Unlike mouse development, human embryonic development is defined by 23 distinct morphological states known as Carnegie Stages (CS), lasting roughly 8 weeks from fertilisation (O’Rahilly and Muller, 2010). In humans, foregut endoderm folding happens slightly later, at Carnegie Stage 10 (Jennings et al., 2013). Table 1 highlights the corresponding human to mouse stages. In time the two pancreatic buds, dorsal and ventral, begin forming independently and will eventually come together to form the whole body of the pancreas. This occurs around E9.5-10 in mice and CS13 in humans.

Table 1. The embryonic stages of development in humans and mice. Human development is described in Carnegie Stages (CS) displayed here with their respective time in days post conception (dpc). The equivalent points in mouse development are in embryonic days (E).

Human (CS)   Human (dpc)    Mouse (E)
CS9          22 - 25 dpc    E7.5–E8
CS10         26 - 27 dpc    E8–E8.5
CS11         27 - 29 dpc    E8.5–E9
CS12         29 - 31 dpc    E9–E9.5
CS13         31 - 33 dpc    E9.5–E10
CS14         33 - 35 dpc    E10–E11.5
CS15         35 - 37 dpc    E11.5–E12.25
CS16         37 - 40 dpc    E12.25–E12.75
CS17         39 - 42 dpc    E12.75–E13.25
CS18         42 - 45 dpc    E13.25–E14
CS19         45 - 47 dpc    E14–E14.5
CS20         47 - 49 dpc    E14.5–E15
CS21         49 - 52 dpc    E15–E15.5
CS22         52 - 55 dpc    E15.5–E16
CS23         53 - 58 dpc    E16–E16.5


In humans sonic hedgehog (SHH) is present at CS10 (Jennings et al., 2013). Previous work has shown that lack of SHH allows Pdx1 expression. This occurs because, during development, the foregut endoderm is blocked from SHH signalling by the adjacent notochord. Activin βB and FGF2 from the notochord can repress the endodermal SHH, thus permitting Pdx1 expression. Shh mutant mice have considerably increased pancreas size and greater numbers of pancreatic endocrine cells, demonstrating the importance of SHH signalling repression in pancreas development (Hebrok et al., 1998; Hebrok et al., 2000). SHH in humans has been shown to be present at CS10 in the ventral pancreas with PDX1 expression succeeding it at CS12 (Jennings et al., 2013). Interestingly, this timing is delayed relative to mice, highlighting another discrepancy between the two species during pancreas development. Like many organs, the pancreas is defined by several transcription factors operating in an intricate network within each cell. Therefore, in order to understand pancreas development one must first understand the transcription factors active within the pancreas and the complex signalling network that they control. These various regulatory relationships are shown in Figure 1 and expanded upon in detail below.

1.2.2. Transcription factor network of the early pancreas
1.2.2.1. SOX17
In mice, SRY HMG-box factor 17 (Sox17) is expressed from E3.5 in the preimplantation embryo and by E7.5 it works to define the pancreatobiliary border (Spence et al., 2009; Artus et al., 2011). That Sox17 is one of the earliest transcription factors active in pancreas development implies a significant function. This is evident in Sox17 null mice, which are unable to form the gut endoderm (Kanai-Azuma et al., 2002). One study has implicated Wingless/Integrated (Wnt) signalling as a regulator of Sox17. Upon deletion of β-catenin, a transducer of Wnt signalling, Sox17 expression is lost in the endoderm (Engert et al., 2013). Intriguingly, in humans SOX17 is expressed at CS10 but only in areas lacking SHH, and it remains detectable until CS12 (Jennings et al., 2013).


Figure 1. Transcription factor regulation in the early human and mouse pancreas. Each transcription factor node has arrows indicating known induction/regulation or a stub indicating inhibition of another transcription factor. The corresponding model organism is indicated above each factor, with human networks being marked in grey. The initial reference for each interaction is shown below.

1.2.2.2. FOXA2

The winged helix protein forkhead box protein A2 (FOXA2, previously named HNF3β) has been implicated in endoderm and pancreas development for many years. Several studies described its role in the formation and maintenance of the definitive endoderm. FOXA2 works in concert with its related forkhead box factors 1 and 3 (FOXA1 and FOXA3) to define axis formation across the embryo from E6.5 onwards, as well as other organs such as the liver (Ang et al., 1993; Monaghan et al., 1993; Sasaki and Hogan, 1993; Frank et al., 2007; Nowotschin et al., 2019). FOXA2 binds to an intronic enhancer and distal promoter of GATA transcription factor 4 (Gata4), another key factor active in endoderm development, driving its expression in the endoderm before further specifying expression to the stomach and dorsal pancreas (Rehorn et al., 1996; Ritz-Laser et al., 2005; Rojas et al., 2005; Rojas et al., 2010). These two factors then work in partnership to open chromatin, presumably aiding downstream expression of key endoderm genes (Cirillo et al., 2002).

One of the most intensely studied regulatory actions of FOXA2 is its activation of Pdx1. Deletion mutation analysis identified a nuclease-hypersensitive site at -2007 to -1996 base pairs (bp) in the Pdx1 promoter that can be bound by FOXA2 (Wu et al., 1997). Multiple papers expanded upon the work of Wu et al., describing the regulatory control by FOXA2 and the structure of the Pdx1 promoter. Further study of the initial hypersensitive site found there to be three conserved regions (78-89%) shared between human, mouse and chick. These were dubbed: area I (-2694 to -2561 bp), area II (-2139 to -1958 bp) and area III (-1879 to -1799 bp). Deletions in areas I and II perturbed FOXA2 binding; it was also revealed that PDX1 binds to area I, acting cooperatively with FOXA2 to regulate its own expression. In addition, PDX1 can also activate Foxa2 expression (Gerrish et al., 2000; Marshak et al., 2000; Oliver-Krasinski et al., 2009). In vivo analysis displayed that upon β cell Foxa2 deletion in mice, Pdx1 mRNA and PDX1 protein in islets were reduced (Lee et al., 2002). Additional regions were described, each of which FOXA2 binds, one at


-3700 to -3450 bp and another dubbed area IV (-6200 to -5670 bp). Areas I-IV were also revealed to have active chromatin in β cells, indicating FOXA2 may not only regulate Pdx1 expression but also the accessibility of its chromatin (Ben-Shushan et al., 2001; Gerrish et al., 2004). FOXA2 also acts with FOXA1, binding close to area IV as determined via chromatin immunoprecipitation (ChIP) and ChIP sequencing. If Foxa2 and Foxa1 are both deleted in the early mouse pancreas, a complete loss of Pdx1 expression is observed. This is followed by the ductal, exocrine and endocrine portions of the pancreas failing to differentiate, resulting in pancreatic hypoplasia and postnatal death (Gao et al., 2008). FOXA2 also works to restructure the endoderm, ensuring cells are epithelialized and exhibit polarity whilst simultaneously suppressing mesenchymal gene expression, as Foxa2 mutants are unable to form cell junctions or maintain polarity (Burtscher and Lickert, 2009). Other transcription factors have also been noted to induce Foxa2 expression, such as hepatocyte nuclear factor 6 (HNF6) (Samadani and Costa, 1996; Landry et al., 1997). FOXA2 also has regulatory ties to SRY HMG-box factor 9 (Sox9): FOXA2 regulates its expression and, likewise, SOX9 binds to the Foxa2 promoter, forming an autoregulatory loop (Lynn et al., 2007). In human pancreatic progenitor cells FOXA2 has been identified as being bound to promoters of several transcription factors such as GATA transcription factor 6 (GATA6), motor neuron and pancreas homeobox 1 (MNX1), PDX1, hepatocyte nuclear factor 1B (HNF1B), SOX9 and hairy and enhancer of split-1 (HES1) (Cebola et al., 2015).

Furthermore, studies of gene expression during human pancreas development have detailed the spatial and temporal expression of FOXA2. FOXA2 co-localises with many transcription factors associated with pancreas development such as PDX1, SOX17, GATA4 and SOX9. Its expression is first detected at CS10 in all endoderm epithelial cells and is maintained throughout pancreas development (McDonald et al., 2012; Jennings et al., 2013). A recent study identified a human mutant for FOXA2 who presented with childhood-onset diabetes along with a host of other issues, most likely due to the integral role the gene plays (Stekelenburg et al., 2019).


1.2.2.3. GATA4 and GATA6

The GATA family of transcription factors is a well-established sextet of zinc finger proteins active across several tissues during development (Patient and McGhee, 2002). Of particular interest are GATA4 and GATA6, which both act throughout pancreas development in varying roles. Both are active at the earliest stages of development, with GATA6 being present in the primitive endoderm from roughly E3 and GATA4 expression being established shortly after, around E3.5 (Artus et al., 2010; Simon et al., 2018). GATA6 serves to drive endoderm formation throughout development whilst GATA4 has a role in both endoderm and heart formation; both are observed in the pancreas from around E9.5 onwards (Molkentin et al., 1997; Bossard and Zaret, 1998; Morrisey et al., 1998; Molkentin et al., 2000; Decker et al., 2006; Cai et al., 2008).

As previously mentioned, Gata4 is activated by FOXA2. It has also been noted that PDX1 can bind to a distal enhancer of GATA4 to regulate its expression in both mice and humans; in mice GATA4 also regulates itself (Rojas et al., 2005; Rojas et al., 2009; Rojas et al., 2010; Teo et al., 2015; Wang et al., 2018). PDX1 has also been shown via ChIP-seq to regulate GATA6 in humans (Teo et al., 2015; Wang et al., 2018). It has additionally been illustrated that both GATA4 and GATA6 can bind to areas I and III of the Pdx1 promoter and regulate its expression (Carrasco et al., 2012). Work in human cells has shown RA can also induce GATA4 expression (Arceci et al., 1993). In addition, the BMP antagonist noggin was shown to reduce Gata4 expression, with SMAD binding sites found upstream of the gene, implicating BMP signalling as a regulator of Gata4 expression (Rojas et al., 2005). RNA-sequencing of Gata4 and Gata6 double mutants compared to wild-type mice found strong upregulation of intestinal and gastric genes, implying Gata4 and Gata6 drive gene expression to maintain the pancreatic identity. Moreover, the same study demonstrated that GATA4 and GATA6 bind to a distal enhancer that is able to repress SHH signalling, further supporting their role in pancreas specification (Xuan and Sussel, 2016).


However, of most interest is the complex interplay these two factors have during pancreas development and the varying phenotypes exhibited between mice and humans. In the developing mouse pancreas epithelium Gata4 and Gata6 are expressed in the same regions, but by E15.5 Gata4 has become restricted to the exocrine compartment of the pancreas, known as the acinar cells, whilst Gata6 expression is maintained in just the endocrine and ductal cells (Ketola et al., 2004). However, work has highlighted the presence and activity of GATA6 in acinar development and maintenance, where it is able to bind to the promoters of key exocrine genes such as recombining binding protein suppressor of hairless-like (Rbpjl) and muscle, intestine and stomach expression 1 (Mist1) (Martinelli et al., 2012).

Interestingly, pancreas specific mutation of either Gata4 or Gata6 in mice had little effect on pancreas development and any defects present were resolved after birth, indicating a level of functional redundancy between the two genes. Accordingly, double mutant mice had pancreatic agenesis (failure of the organ to develop), were hyperglycaemic and died soon after birth. To determine the hierarchy of the two genes, mice were generated that were heterozygous mutants for one gene and homozygous mutants for the other. Mice with mutant Gata6 and one wild-type Gata4 allele were able to develop a normal pancreas; in contrast, mutant Gata4 mice with one wild-type Gata6 allele had a severe reduction in the mass of the pancreas, suggesting Gata4 serves a more vital role in the mouse pancreas (Carrasco et al., 2012; Xuan et al., 2012).

In humans, mutations in GATA6 result in several phenotypes, the most severe being pancreatic agenesis, of which GATA6 mutations are the most common cause (Lango Allen et al., 2012; De Franco et al., 2013; Yau et al., 2017). Mutations in GATA4 appear to be less extreme, causing only neonatal or childhood onset diabetes, varying exocrine defects and, to date, only one case of pancreatic agenesis (D’Amato et al., 2010; Shaw-Smith et al., 2014). GATA4 expression has been shown to initiate at CS12 and be maintained in the pancreatic bud; similar to that in mice, its expression eventually becomes sequestered to the acinar cells where it is maintained (Jennings et al., 2013). GATA6 has so far only been observed at CS16-18, potentially acting as a gatekeeper towards further development (Cebola et al., 2015). Work on GATA4 and GATA6 gene deletions in hESCs has explicated the importance of GATA6 during pancreas directed differentiation. GATA6 deletion abrogates definitive endoderm formation significantly, whilst GATA4 deletion does not, and GATA6 deletion has a greater effect on reducing PDX1 expression than GATA4 deletion (Shi et al., 2017; Chia et al., 2019). These studies hence implicate Gata4 as the more integral of the GATA factors in mouse pancreas development, in contrast to human development, where GATA6 evidently plays a pivotal role in pancreas formation, as evidenced by its links to pancreatic agenesis and its vital requirement for pancreatic differentiation.

1.2.2.4. PDX1

As mentioned above, Pdx1 expression begins around E8.5 in mice and marginally later in humans at around CS12 (Offield et al., 1996; Jennings et al., 2013). At around E9.5-11.5 the ducts are defined, whilst the exocrine and endocrine tissue develops more slowly from E8.5 onwards (Gu et al., 2002). PDX1 also acts to define the pancreatic progenitor population from CS13 onwards in humans (Jennings et al., 2013). Over the years various papers have described the severe phenotypes exhibited by mouse and human subjects. The first of these found that loss of Pdx1 in mice resulted in pancreatic agenesis; additional work showed that conditional mutants with loss of Pdx1 in β cells developed diabetes in time and were unable to maintain insulin secretion and β cell identity (Jonsson et al., 1994; Ahlgren et al., 1996; Offield et al., 1996; Ahlgren et al., 1998).

In addition to the study of the PDX1 gene, its promoter in mouse has also been described in detail. Initially three nuclease-hypersensitive sites were identified in the 5’ flanking region of the Pdx1 gene (-2560 to -1880 bp, -1330 to -800 bp and -260 to +180 bp), with FOXA2 able to bind to a site at -2007 to -1996 bp (Wu et al., 1997). Work following this homed in on the site at -2560 to -1880 bp; within this region specific areas were defined by high conservation across species and by being nuclease-hypersensitive sites. As stated earlier these were: area I (-2694 to -2561 bp), area II (-2139 to -1958 bp) and area III (-1879 to -1799 bp). By testing Pdx1 driven reporter activity in β and non β cells it was discovered that areas I and II drive β cell specific expression and area III drives endocrine cell expression. It was also shown that FOXA2 binds to areas I and II, whilst PDX1 itself can bind to area I, both of which activate Pdx1 (Gerrish et al., 2000; Marshak et al., 2000; Gannon et al., 2001). In time an additional area active in delineating Pdx1 expression was observed and dubbed area IV (-6200 to -5670 bp). FOXA2 and PDX1 were identified as being able to bind to this area, as was Nirenberg and Kim homeobox factor 2.2 (Nkx2.2), a transcription factor active later in pancreas development (Gerrish et al., 2004). To assess the importance of the Pdx1 promoter, mutant mice lacking it were generated; these mice were deficient for ventral pancreatic bud specification and had a hypoplastic dorsal bud. The exocrine tissue formed in the dorsal bud but endocrine development was reduced. Interestingly, heterozygotes for Pdx1 promoter deletion had a more severe prediabetic phenotype than that of Pdx1 gene heterozygotic mutants. Given that area III was unique in its role, further study into its function was made, and several groups noted that it drives pancreas wide Pdx1 expression. This would therefore include the exocrine and ductal tissue. These studies found that pancreas transcription factor 1 subunit α (PTF1A), a defining factor of pancreatic progenitors as well as acinar cells, and another protein active in the exocrine cells, recombining binding protein suppressor of hairless (RBPJ), form a complex which could bind to area III of the promoter to regulate Pdx1 (Miyatsuka et al., 2007; Wiebe et al., 2007). Area II has also been indicated to drive Pdx1 expression in order to aid β cell differentiation, as upon its deletion endocrine cells fail to develop into mature mono hormonal cells. All of this highlights that, given the place Pdx1 holds in the hierarchy of pancreas development, its regulation is just as imperative as its expression.


As described earlier, several growth factors and signalling molecules can induce PDX1 expression. Another key factor able to do so is exendin 4 (Ex4), a glucagon-like peptide 1 (GLP1) receptor agonist. Ex4 has been shown to increase histone acetylase activity, reversing epigenetic modifications that silence Pdx1 and allowing up-regulation of its expression (Aviv et al., 2009; Pinney et al., 2011a). Besides the aforementioned transcription factors that are regulated by and regulate PDX1, as the master regulator of pancreas development it operates an expansive network of additional transcription factors described below, including itself in human cells (Teo et al., 2015; Wang et al., 2018). HNF6, which is able to induce Foxa2, has also been established as an activator of Pdx1 expression, yet in humans PDX1 in turn regulates HNF6 (Marshak et al., 2000; Jacquemin et al., 2003). Another hepatocyte nuclear factor active in pancreas development, HNF1B, is known to be activated by PDX1 in both mice and humans (Oliver-Krasinski et al., 2009; Wang et al., 2018). Although not a transcription factor, the cell adhesion marker E-cadherin (CDH1) is integral in maintaining early pancreatic identity through mediation of adherens junctions (Dahl et al., 1996). Comparison of Pdx1 null mice against wild-type mice via a microarray found downregulation of Cdh1 in the mutants; more recently it has been elucidated that PDX1 binds to the Cdh1 promoter to activate its expression (Svensson et al., 2007; Marty-Santos and Cleaver, 2016). PDX1 also activates transcription factors required to drive pancreas development towards the endocrine lineage: Nirenberg and Kim homeobox factor 6.1 (Nkx6.1) and neurogenin 3 (Ngn3) (Pedersen et al., 2005; Svensson et al., 2007; Burlison et al., 2008; Oliver-Krasinski et al., 2009). Additionally, PDX1 activates and aids the development of the exocrine portion of the pancreas. PTF1A is another major transcription factor in pancreas development and has been described as the key factor in defining the exocrine portion of the pancreas (Krapp et al., 1998). PTF1A is able to activate Pdx1 expression and PDX1 can also activate PTF1A (Svensson et al., 2007; Thompson et al., 2011; Wang et al., 2018). PDX1 is also able to activate nuclear receptor subfamily 5, group A, member 2 (NR5A2), another factor active during acinar development (Annicotte et al., 2003). In liver cells PDX1 expression has also been demonstrated to repress hepatic genes, emphasising its power in organ specification (Teo et al., 2015). PDX1 expression is maintained throughout pancreas development, where it continues to control an ever expanding network of transcription factors to specify the endocrine lineages.

In humans several mutations within the PDX1 gene have been described, all with varying phenotypes. The first described and one of the most severe is a deletion at codon P63 causing a frameshift mutation, resulting in erroneous sequence and a truncated PDX1 protein lacking the key DNA-binding homeodomain. The patient was homozygous for this mutation and because of it developed neonatal diabetes due to pancreatic agenesis and also had pancreatic exocrine insufficiency; this phenotype has since been documented again (Stoffers et al., 1997; Thomas et al., 2009). Another paper went on to describe two patients heterozygous for missense mutations, Q59L and D76N; both had a reduced insulin response to glucose but were otherwise healthy. The same group also described an individual heterozygous for an insertion mutation of PDX1, CCG243, disrupting the PDX1 protein and causing maturity onset diabetes of the young (MODY), more precisely MODY4 (Hani et al., 1999). An individual with compound heterozygous missense mutations at E164D and E178K presented with neonatal diabetes and pancreatic agenesis; mutations in these codons disrupted the helices of the protein’s homeodomain (Schwitzgebel et al., 2003). A patient homozygous for the mutation E178G had a similar phenotype, though it was noted that they only presented with subclinical exocrine insufficiency (Nicolino et al., 2010). A later paper went on to define various new mutations. An insertion causing a frame shift P87L and an identical substitution P87L were found in a compound heterozygotic individual; these residues were part of a proline rich region of the PDX1 protein. As a result of these mutations the patient had neonatal diabetes but no exocrine insufficiency. A heterozygous nonsense mutation at C18X was able to cause neonatal diabetes but again with no pancreatic exocrine insufficiency. Lastly, two homozygotic substitutions in the homeodomain of PDX1, A152G and R176Q, both led to neonatal diabetes (De Franco et al., 2013). All these studies signify several things: first, PDX1 is of clear importance in pancreas development as its loss can be catastrophic to the organ; second, the varying phenotypes illuminate its function in the development of the various compartments of the pancreas; third, they allow further understanding of the importance of specific domains within the PDX1 protein as well as its promoter.

1.2.2.5. MNX1

MNX1 is a homeobox transcription factor active throughout pancreas development and into adulthood, acting to maintain the β cells. In 1999 two separate groups published work highlighting that loss of Mnx1 (at that time known as Hlxb9) resulted in agenesis of the dorsal pancreas with a reduced number of insulin positive cells in the remaining section of the pancreas (Harrison et al., 1999; Li et al., 1999). RA has been demonstrated to influence Mnx1 expression, as Raldh2 mutant mice have dramatically reduced numbers of MNX1 expressing cells (Martin 2005). Mnx1 expression begins at E8.0 and by E9.5 exhibits a unique gradient across the early pancreas, with the dorsal pancreas having high expression compared to low expression in the ventral pancreas (Sherwood et al., 2009). It has been noted that Mnx1 expression is mediated by PTF1A, which can bind to a distal enhancer of Mnx1; this interaction is believed to aid in the maintenance of pancreatic progenitors (Thompson et al., 2011).

Human mutations within the homeodomain of MNX1 result in neonatal diabetes (Bonnefond et al., 2013; Flanagan et al., 2014). In human embryo studies MNX1 expression has been documented at CS14-15 co-localised with SOX9 (Pan et al., 2015). MNX1 has been shown to have steady expression during pancreas development and it has been established that HNF1B can bind to the promoter region of MNX1 (Jeon et al., 2009; Teo et al., 2016). Recently two studies utilising RNA-sequencing highlighted the prominence of MNX1 in pancreas development. Laser captured samples of the human dorsal pancreatic bud implicated MNX1 as a key factor highly enriched in this region (Jennings et al., 2017), whilst single cell analysis found that MNX1 distinguishes early pancreatic populations during in vitro differentiation. In addition, cells initially expressing MNX1 seemingly mark those that gain NKX6.1 expression, a key marker of endocrine precursor cells (Petersen et al., 2017).

1.2.2.6. HNF6

Hnf6, also known as onecut1 (OC1), is expressed in the mouse foregut endoderm from E9 and then in the pancreas from E10.5 onwards, and aids in the maintenance and regulation of Foxa2 expression (Samadani and Costa, 1996; Landry et al., 1997; Rausa et al., 1997). Pdx1 expression can be activated by HNF6 (Jacquemin et al., 2003; Teo et al., 2015). Hnf6 is also activated by HNF1B binding via an intronic enhancer, and HNF6 can in turn regulate Hnf1b (Maestro et al., 2003; Poll et al., 2006). Alongside this, Hnf6 expression can be regulated by SOX9 and PTF1A (Lynn et al., 2007; Thompson et al., 2011). The addition of epidermal growth factor (EGF) in combination with leukemia inhibitory factor (LIF) is able to induce Hnf6, as well as Ngn3, in rat exocrine cells. The initiation of Ngn3 expression may be a result of HNF6, as it is able to regulate and induce Ngn3 (Jacquemin et al., 2000; Baeyens et al., 2006; Oliver-Krasinski et al., 2009). A recent paper indicated that HNF6 in mice is integral to exocrine differentiation, regulating factors like Gata4, Ptf1a and Nr5a2 (Kropp et al., 2019).

HNF6 is initially localised within the pancreatic progenitors but as the pancreas develops it is lost from the endocrine cells and remains in the acinar and ductal cells (Benitez et al., 2014). Upon Hnf6 deletion in mice, ductal and endocrine development is perturbed: the branched network is lost and cysts begin to form, islet structures form abnormally and genes regulated by HNF6 such as Hnf1b also have reduced expression (Jacquemin et al., 2000; Pierreux et al., 2006). To further investigate the role of Hnf6 it was inactivated in endocrine precursor cells; this resulted in fewer endocrine cells but otherwise no serious defects (Zhang et al., 2009). Overall this indicates that Hnf6 in mice is more crucial to earlier events within the developing pancreas.


Very little has been discerned concerning HNF6 in humans; it is present at CS16-18 and is consistently used as a marker during hESC differentiation protocols (Cebola et al., 2015; Trott et al., 2017). As endocrine differentiation protocols progress, HNF6 expression is slowly reduced over time until absent, similar to that seen in mouse development (Petersen et al., 2017). Recent cell sorting analysis of fetal pancreases found that the HNF6 expression pattern was similar to that of SOX9 and CDH1, both of which mark the pancreatic progenitor population (Ramond et al., 2018).

1.2.2.7. HNF1B

The next factor discussed is another member of the hepatocyte nuclear factor family, Hnf1b. This gene has been shown to be at the centre of a large regulatory network during pancreas development. It is present in the early pancreas at around E9.5 and marks the earliest regions of the pancreatic buds (Haumaitre et al., 2005; Nammo et al., 2008). HNF6 regulates Hnf1b expression, forming a feed forward loop in which each activates the other, and in humans PDX1 can regulate HNF1B (Maestro et al., 2003; Poll et al., 2006; Oliver-Krasinski et al., 2009; Teo et al., 2015; Wang et al., 2018). Beyond these interactions, Hnf1b is regulated by SOX9 and it in turn regulates Sox9 expression (Lynn et al., 2007). HNF1B can also bind to the promoter of Ngn3, the expression of which delineates the initiation of endocrine development. This was further proven by observations that HNF1B positive cells in the pancreas at E13-E18 specify cells that will gain Ngn3 expression (Lee et al., 2001; Maestro et al., 2003). In hESCs it has additionally been noted that HNF1B can bind to and regulate MNX1 (Teo et al., 2016).

Hnf1b loss is lethal in mice, so in order to assess it further conditional mutations within β cells were generated. These mutants displayed poor glucose tolerance and secretion of insulin (Wang et al., 2004). Another group worked around this issue by tetraploid aggregation, allowing the survival of the pups whilst still maintaining the Hnf1b deletion. The resulting mutants exhibited pancreas agenesis: both Pdx1 and Ptf1a expression were lost and ectopic Shh signalling was observed (Haumaitre et al., 2005). Further analysis of mutants by conditional knockout of Hnf1b uncovered more in regard to its function and importance. Deletion in the early stages of pancreas development resulted in a reduced pancreatic progenitor pool, whilst later deletion caused the formation of cystic ducts showing no polarity; acinar defects were also seen. Finally, Ngn3 positive endocrine precursor cells were lost upon Hnf1b deletion (De Vas et al., 2015). This study highlighted that HNF1B is a major contributor to the morphogenesis of the pancreatic progenitors and ductal portion of the pancreas.

In humans HNF1B loss has been well studied, with more than 50 documented cases ranging from substitutions to whole gene deletions. Of interest, two mutant fetuses were analysed in detail. They exhibited pancreatic hypoplasia with islet disorganisation and downregulation of both β-catenin and CDH1, which most likely contributed to the reduced pancreas size (Haumaitre et al., 2006; El-Khairi and Vallier, 2016). Alongside this, defects in β cell maturation were also displayed, all of which is believed to cause reduced insulin secretion, resulting in MODY5.

1.2.2.8. HNF4A and HNF1A

These two closely related genes have not been studied in the same level of detail as many of the aforementioned genes; however, that is not to say they do not play an important role in pancreas development. Hnf4a is known to be activated early in the mouse embryo around E6.5 and then later within the developing pancreas at E9.5. Hnf1a comes on later within the pancreas at around E12.5 (Nammo et al., 2008). For many years it was known that HNF4A could bind and activate Hnf1a in the liver, demonstrating a hierarchy between the two genes, as here these two genes play a major role in the development of hepatocytes (Gragnoli et al., 1997). More recent work, though, suggests that the regulation is mutual. Hnf4a in fact has two promoters, both of which are utilised during liver development whilst just one is used in the pancreas (Boj et al., 2001; Thomas et al., 2001). HNF4A and HNF1A each bind the other’s promoter, and HNF4A even binds its own, forming a multicomponent loop able to maintain expression of many downstream genes (Odom et al., 2004; Boj et al., 2010). Beyond these interactions, Hnf4a expression is known to be regulated by HNF6 and GATA6 (Landry et al., 1997; Morrisey et al., 1998; Odom et al., 2004), whilst Hnf1a expression is mediated by PDX1 and NKX6.1 (Ben-Shushan et al., 2001; Donelan et al., 2010; Wang et al., 2018).

Intriguingly, in humans HNF4A and HNF1A have been found to be the most commonly mutated genes that cause MODY (MODY1 and MODY3 respectively). HNF1A is in fact the most common cause of MODY in people of European background at around 60-70% of all cases, with HNF4A at around 5% (Frayling et al., 2001). Patients with each form of MODY present with very similar symptoms, most likely because Hnf1a and Hnf4a synergistically regulate the same cohort of genes; when one factor is lost, the resulting regulation of these genes is diminished (Boj et al., 2010). This further explains why human mutants for these genes express relatively similar phenotypes such as progressive hyperglycaemia and glucose intolerance (Kyithar et al., 2011). From this it can be inferred that HNF4A and HNF1A are important in shaping a mature pancreas given their commonality in MODY cases.

1.2.2.9. SOX9

SOX9 is another HMG-box containing protein considered a key marker of the pancreas, and loss of Sox9 in mice causes a pancreas to liver conversion of fate (Seymour et al., 2012). Its expression has been identified from E8.5 onwards in the early pancreas and in time comes to co-localise with PDX1 (Kopp et al., 2011). Much of the activity elicited by SOX9 has already been discussed, as it is known to regulate Foxa2, Hnf6 and Hnf1b, whilst FOXA2 and HNF1B regulate it in return (Lynn et al., 2007). However, SOX9 can also activate Ngn3 and Hes1, both of which have varying expression dependent upon Notch signalling (Lynn et al., 2007; Seymour et al., 2007; Shih et al., 2012). It has been suggested that Sox9 expression is also Notch dependent, being activated by intermediate levels of Notch (Shih et al., 2012). That said, the most important regulatory feature of SOX9 is its ability to activate FGF receptor 2b (Fgfr2b); this receptor is required in order to transduce FGF10, which in pancreatic progenitors is able to induce Sox9, forming a feed forward loop. This loop is integral within the progenitors as it is thought to be one of the key mechanisms that allows the growth and expansion of the population (Seymour et al., 2012). This is further corroborated by the identification of SOX9 as a key marker of the pancreatic progenitor population in mice (Seymour et al., 2007; Benitez et al., 2014). It should be noted that as pancreas development progresses to maturity, SOX9 is maintained but in time the pancreatic progenitor population is lost (Kopp et al., 2011).

In humans a similar pattern of expression is found. SOX9 expression co-localises with that of FOXA2, GATA4, PDX1 and NKX6.1 at CS12-CS13, a pattern which is maintained throughout embryonic development (Jennings et al., 2013). SOX9 also co-localises with NGN3, though in time their expression patterns diverge, eventually leading to SOX9 expression being confined to the ductal portion of the pancreas (McDonald et al., 2012; Jennings et al., 2013). To assess the importance of SOX9, siRNA was utilised to knock down the gene in fetal pancreas cells; the result was a reduction in NGN3 and Insulin expression with a concomitant increase in HES1, HNF1A and Glucagon (McDonald et al., 2012). This emphasises that SOX9 does not only mark progenitors but also outlines the initiation of endocrine commitment.

1.2.3.0. NKX6.1

NKX6.1 is considered a marker for the initiation of endocrine commitment and a key factor involved in the progression towards β cell differentiation; in mice its expression is first evident at E9.5 (Hald et al., 2008; Schaffer et al., 2013). Early mouse mutants indicated that loss of Nkx6.1 resulted in a reduction of β precursor cells, whilst siRNA knockdown of Nkx6.1 led to loss of suppression of Glucagon and reduction of insulin secretion in response to glucose (Sander et al., 2000; Schisler et al., 2005). Further study of Nkx6.1 found that the genes it regulated were needed for insulin biosynthesis and β cell proliferation (Taylor et al., 2013).

Like many of the factors previously mentioned, NKX6.1 also has a considerable number of regulatory ties within the developing pancreas. NKX6.1 can bind to its own promoter and so maintain its own expression (Iype et al., 2004). NKX6.1 is also able to regulate Hnf1a as well as inhibit Ptf1a expression; conversely, PTF1A can inhibit Nkx6.1 expression (Donelan et al., 2010; Schaffer et al., 2010; Thompson et al., 2011). This mutual inhibition is integral in determining the balance between endocrine and acinar cell specification. At E9.5 in the mouse pancreas, cells are positive for PDX1, PTF1A and NKX6.1, seemingly marking a progenitor population, but soon these expression patterns diverge as mentioned above, resulting in segregation between the exocrine and endocrine cells by E12.5 (Hald et al., 2008).

NKX6.1 has been studied extensively in human tissue in recent years. During development it is apparent in the dorsal pancreas from CS13 and is maintained during embryonic development (Jennings et al., 2013). It has been recognised in human cells that PDX1 can regulate NKX6.1 (Wang et al., 2018). Moreover, differentiation protocols are able to induce NKX6.1 expression via a combination of EGF and nicotinamide (Nostro et al., 2015; Petersen et al., 2017).

1.2.3.1. HES1

HES1 is of particular importance within pancreas development as it is a key target of the canonical Notch signalling pathway. Hes1 expression has been detected in the mouse pancreas from E9.5 (Apelqvist et al., 1999). High Notch signalling initiates Hes1 expression; likewise, loss of Notch signalling results in diminished Hes1 expression (Masjkur et al., 2016). HES1 is a basic helix-loop-helix (bHLH) protein and member of the Hes/Hey family of proteins (Kageyama et al., 2007). Mouse mutants for Hes1 display pancreatic hypoplasia, due to the acceleration of endocrine differentiation (Jensen et al., 2000). Following this it was found that HES1 is able to inhibit Ngn3, a gene which initiates endocrine differentiation; hence a lack of HES1 would result in precocious Ngn3 expression, causing the accelerated development of endocrine cells seen in the Hes1 mutants. It was therefore surmised that Notch signalling is the defining factor in the inhibition of exocrine and endocrine differentiation (Esni et al., 2004a; Lee et al., 2001). Hes1 expression permits the growth and expansion of early pancreatic progenitor cells. Intriguingly, it was found that Hes1 is upstream of cyclin-dependent kinase inhibitor 1C (P57). Mutants for this gene have increased cell cycling and a similar phenotype is seen when Hes1 expression is lost, implying that Hes1 may control cell growth via the activity of P57 (Georgia et al., 2006). PTF1A has been found to initiate Hes1, which in turn maintains Ptf1a expression in order to expand the progenitor pool (Ahnfelt-Ronne et al., 2012). Finally, work in human cells has established HES1 as a target of PDX1 (Wang et al., 2018).

1.2.3.2. NGN3

The bHLH protein NGN3 plays a vital role during pancreas development, acting to drive the transition from pancreatic progenitors to endocrine precursors. When it was knocked out in mice no endocrine cells were produced and pups died soon after birth; the authors also noted that no co-expression of RNA was seen between Ngn3 and insulin or glucagon (Gradwohl et al., 2000). It was later observed that Ngn3 expression in mice is biphasic, beginning initially at E8.75 and lasting until E11, with expression reappearing from E12 onwards. This undulation in expression leads to what are dubbed the ‘first’ and ‘second’ transitions: two separate periods of endocrine differentiation. Interestingly, far more transcript than protein was detected in the mouse embryos, which suggested that Ngn3 is post-transcriptionally regulated (Villasenor et al., 2008). By utilising an inducible system for Ngn3 within pancreatic progenitors, one study was able to determine that initiation of Ngn3 expression at differing time points produces separate endocrine cell types. Expression prior to E11.5 made glucagon positive cells, expression post E11.5 resulted in insulin positive cells, and following E14.5 somatostatin positive cells were observed. This work highlights that in mice the biphasic expression of Ngn3 serves to promote the generation of the many endocrine cell types that make up the pancreas (Johansson et al., 2007).

As stated earlier, Ngn3 is not expressed in the presence of Notch activity as it is inhibited by HES1 binding to its promoter (Lee et al., 2001). This in turn allows the pancreatic progenitor population to expand (Shih et al., 2012). When Ngn3 expression is attenuated the number of endocrine cells decreases whilst a key marker of progenitors, SOX9, increases (Prasadan et al., 2010). As development progresses Notch expression diminishes, allowing Ngn3 expression to rise. At this point the endocrine precursor cells begin proliferating more slowly as NGN3 upregulates cyclin dependent kinase inhibitor 1a (Cdkn1a) (Miyatsuka et al., 2011). Following on from this, the inhibition of Cdks in turn reduces CDK phosphorylation of NGN3, meaning active levels of NGN3 can increase further (Azzarelli et al., 2017). All of this drives cells towards the endocrine lineage, yet Notch signalling represents just one way in which Ngn3 levels are regulated.

PDX1 and HNF6 are known to bind to the Ngn3 promoter to regulate its expression. Additionally, Pdx1 mouse mutants lack Ngn3 positive cells past E9.5, indicating PDX1 is needed to maintain Ngn3 (Jacquemin et al., 2000; Burlison et al., 2008; Oliver-Krasinski et al., 2009). Initial expression of Ngn3 is most likely activated by FOXA2, which aids in the formation of an auto-activation loop whereby NGN3 binds to its own promoter to regulate itself (Ejarque et al., 2013). Other transcription factors able to induce Ngn3 expression include SOX9 and HNF1B, the latter of which has been indicated as a marker of Ngn3 precursor cells (Maestro et al., 2003; Lynn et al., 2007; De Vas et al., 2015). NGN3 itself is able to regulate a host of downstream genes related to endocrine specification such as Nkx2.2, neurogenic differentiation 1 (NeuroD1) and paired box gene 4 (Pax4) (Gasa et al., 2004). Among the aforementioned genes, NGN3 has been demonstrated to activate expression of Pdx1 and Mnx1 (Gasa et al., 2004). An additional way in which it regulates the transition towards endocrine cells is the upregulation of Snail2, which can then downregulate Cdh1, a marker gene of pancreatic progenitors (Gouzi et al., 2011).

Compared to many of the other genes discussed here, NGN3 has been well documented in humans. Study of fetal pancreases found that NGN3 co-localises with insulin and glucagon, in contrast to what is seen in mice (Lyttle et al., 2008). Another divergence between species has since been found, as in humans NGN3 is expressed continually from about CS21-22 rather than in two separate waves (Jennings et al., 2013). NGN3 expression peaks between 10-14 weeks post conception (wpc); it then steadily declines and by 35 wpc expression is gone (Salisbury et al., 2014). Nonetheless, siRNA knockdown of SOX9 in cultured human fetal pancreas cells resulted in a reduction of NGN3 positive cells, implying that as in mice SOX9 regulates NGN3 (McDonald et al., 2012). To add to this, PDX1 has also been established in human cells as a regulator of NGN3 expression (Wang et al., 2018). Alongside this, a mutation within NGN3 has been observed in humans, with the patient being diagnosed with neonatal diabetes (Pinney et al., 2011b). Work on human cells highlighted that upon NGN3 loss hESC lines could no longer form mature endocrine cells following in vivo engraftment (McGrath et al., 2015). To conclude, whilst the temporal and spatial patterns of NGN3 expression may vary between mice and humans, its expression remains imperative across species.

1.2.3.4. PTF1A

Concomitant with the development of the endocrine portion of the pancreas, the exocrine tissue is also developing. The chief factor active in determining not only the exocrine pancreas but also the pancreas itself is the bHLH factor Ptf1a; its loss in mice results in pancreatic agenesis and glucose intolerance due to a lack of insulin secretion (Fukuda et al., 2008). Furthermore, pancreatic progenitors lacking Ptf1a expression shift their phenotype towards that of duodenal epithelium. Ptf1a expression is first detected at E9 (Kawaguchi et al., 2002). Early studies of PTF1A found it is able to bind the promoters of several key enzymes produced by the acinar cells of the exocrine pancreas: amylase, elastase and trypsin (Cockell et al., 1989; Cockell et al., 1995).

To achieve this regulatory control PTF1A complexes with several other proteins. Of significant importance is P48, a DNA binding subunit within PTF1A, which upon deletion in mice results in complete loss of the exocrine tissue. Given that the exocrine tissue comprises the majority of the pancreas, following Ptf1a loss endocrine cells exhibit abnormal spatial organisation (Krapp et al., 1996; Krapp et al., 1998). This suggests that the presence of exocrine cells is critical to correct endocrine differentiation. PTF1A binds an additional protein and becomes able to further define acinar development; this can be either RBPJ or RBPJL. RBPJ is a Notch dependent protein; however, upon binding to PTF1A this dependence is lost, as its Notch binding domain is utilised in the formation of the protein complex. RBPJL, by contrast, is an exocrine specific factor found only in acinar cells (Beres et al., 2006). Initially PTF1A is bound to RBPJ, but this is replaced in favour of RBPJL during development. Fascinatingly, if PTF1A is mutated so that it cannot bind RBPJ but is still able to bind RBPJL, no development of acinar or islet cells is observed. Thus, it can be inferred that the PTF1A/RBPJ complex is crucial in pancreas development (Masui et al., 2007). The PTF1A/RBPJL complex instead drives acinar development, working with factors like NR5A2 to regulate expression of multiple digestive enzymes (Holmstrom et al., 2011).

Due to the major role PTF1A plays throughout pancreas development it is commonly noted as a key factor in pancreatic progenitors along with PDX1 (Zhou et al., 2007). It is also expressed alongside PDX1 and NKX6.1 at E9.5, though their expression patterns diverge by E12.5 with PTF1A becoming isolated to acinar cells (Hald et al., 2008). During development a host of transcription factors regulate Ptf1a and likewise PTF1A regulates many as well. In the first instance, conditional loss of Pdx1 at E10.5-12.5 results in a loss of Ptf1a expression (Hale et al., 2005). PDX1 can in turn bind and regulate Ptf1a, conveying that these two critical transcription factors form a feed forward loop during the pancreatic progenitor phase in order to define the cell population (Miyatsuka et al., 2007; Wiebe et al., 2007; Thompson et al., 2011). The complex of PTF1A/RBPJL can also bind to the promoter of Ptf1a, driving expression through an autoregulatory loop (Masui et al., 2008). ChIP-seq analysis has elucidated HES1 and NR5A2 as able to bind to Ptf1a enhancer-promoter regions (Hale et al., 2014; De Lichtenburg et al., 2018). PTF1A and NKX6.1 have an antagonistic relationship in which each inhibits the other (Schaffer et al., 2010). Downstream, PTF1A is able to bind to the promoters of a number of genes such as Mnx1, Hnf6 and Nr5a2 (Thompson et al., 2011). Unfortunately PTF1A, like many of the other exocrine factors, is not well described in the developing human pancreas. It should be stated, though, that loss of PTF1A in humans results in pancreatic and cerebellar agenesis, highlighting the indispensable role it plays in development (Sellick et al., 2004).
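The regulatory relationships described in this section can be written down as a small signed graph, which makes the loops easier to see. The following is only an illustrative sketch: the edge list is restricted to interactions for which the text states a direction and a sign, the +1/-1 encoding is an assumption made for the example, and the names `EDGES`, `autoregulators` and `mutual_pairs` are hypothetical.

```python
# Signed regulatory edges taken from statements in this section.
# +1 = activates or maintains, -1 = inhibits.
# Binding events whose sign is not stated in the text (for example HES1 and
# NR5A2 at the Ptf1a enhancer-promoter regions) are deliberately left out.
# Node names refer to the factor without distinguishing gene from protein.
EDGES = {
    ("PDX1", "PTF1A"): +1,     # loss of Pdx1 leads to loss of Ptf1a expression
    ("PTF1A", "PTF1A"): +1,    # PTF1A/RBPJL autoregulatory loop
    ("NKX6.1", "NKX6.1"): +1,  # NKX6.1 binds its own promoter
    ("PTF1A", "NKX6.1"): -1,   # mutual antagonism
    ("NKX6.1", "PTF1A"): -1,
    ("PTF1A", "HES1"): +1,     # PTF1A initiates Hes1
    ("HES1", "PTF1A"): +1,     # HES1 maintains Ptf1a
    ("HES1", "NGN3"): -1,      # HES1 represses Ngn3
}


def autoregulators(edges):
    """Return factors that regulate their own expression."""
    return sorted(src for (src, dst) in edges if src == dst)


def mutual_pairs(edges, sign):
    """Return pairs of distinct factors that act on each other with the same sign."""
    pairs = set()
    for (src, dst), edge_sign in edges.items():
        if src != dst and edge_sign == sign and edges.get((dst, src)) == sign:
            pairs.add(tuple(sorted((src, dst))))
    return sorted(pairs)


AUTO = autoregulators(EDGES)
ANTAGONISTS = mutual_pairs(EDGES, -1)
MUTUAL_ACTIVATORS = mutual_pairs(EDGES, +1)
```

Under these assumptions the sketch recovers the PTF1A and NKX6.1 autoregulatory loops, the PTF1A/NKX6.1 antagonism and the PTF1A/HES1 positive loop. It says nothing about interaction strength or timing.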

1.2.3.5. RBPJ and RBPJL

RBPJ and its paralog RBPJL both play vital and varying roles during the development of the pancreas. As stated previously, RBPJ is Notch responsive and upon its loss in mice a phenotype equivalent to that of Notch signalling loss is witnessed, whereby endocrine differentiation is accelerated (Apelqvist et al., 1999; Fujikura et al., 2005). Conditional inactivation of Rbpj in Ptf1a expressing cells results in reduced expression of key genes such as Hes1 and Pdx1 as well as aberrant exocrine tissue formation (Fujikura et al., 2007). In work published in the same time frame, another group illustrated that PTF1A and RBPJ are able to activate Pdx1 expression (Miyatsuka et al., 2007). Work in human cells has conversely found that PDX1 targets and can activate RBPJ (Wang et al., 2018).

As development of the pancreas ensues, PTF1A bound to RBPJ is slowly exchanged for RBPJL from E12.5 onwards (Masui et al., 2007). RBPJL loss in mice results in considerable acinar defects with a decrease in expression of nearly all digestive enzymes made by the pancreas. Conversely, many liver genes were upregulated following loss of Rbpjl, indicating a role beyond that of acinar specification (Masui et al., 2010). At a regulatory level Rbpjl expression is known to be controlled by GATA6 and NR5A2 (Martinelli et al., 2012; Hale et al., 2014). In humans RBPJL loss has been implicated as a risk modifier for type 2 diabetes (Nair et al., 2018). It has also been described as a central factor in defining the early human pancreas (Jennings et al., 2017).

1.2.3.6. NR5A2

NR5A2 is a zinc finger DNA binding factor and an important gene throughout development across various tissues, in particular the pancreas and liver. Its expression in mice is observed from E9.5 (Annicotte et al., 2003). Mutant mice without Nr5a2 have a smaller pool of pancreatic progenitor cells, and all three lineages of the pancreas exhibit reduced cell number and abnormal morphologies. This is especially evident within the exocrine tissue where cell numbers are reduced by as much as 90%. ChIP-seq data from mice identified Foxa2, Gata6, Ptf1a and Rbpjl as downstream target genes (Hale et al., 2014), whilst alternative ChIP-seq data showed that PTF1A binds to the Nr5a2 promoter (Thompson et al., 2011). Additionally, retroviral overexpression of PDX1 in human cells is able to induce NR5A2 expression (Annicotte et al., 2003).

1.2.3.7. MYC

The avian myelocytomatosis (MYC) gene, otherwise known as cellular MYC (C-MYC), has been studied extensively due to its role as an oncogene in humans. MYC is another bHLH protein and is also known to act during pancreas and acinar development. In 2007 Zhou and colleagues noted proliferative progenitor cells positive for Pdx1 and Ptf1a as well as c-Myc, suggesting that it may be the presence of c-Myc expression that aids early expansion (Zhou et al., 2007). The acinar factor NR5A2 is able to sustain c-Myc expression and upon Nr5a2 deletion the number of C-MYC positive cells decreases by 75%, implying that C-MYC expression defines a major portion of the exocrine tissue (Hale et al., 2014). These two studies are backed up by work in which c-Myc was conditionally inactivated within the pancreas during development. As a result of c-Myc inactivation a loss in proliferation of both progenitors and acinar precursor cells was seen (Nakhai et al., 2008a; Bonal et al., 2009). Further study has revealed that C-MYC has a more expansive role, as it binds to and represses Ptf1a in order to allow proper maturation from pre-acinar to acinar cell, highlighting a functionality beyond that of proliferation (Lobo et al., 2017).

1.2.3.8. MIST1

Another transcription factor known to be active in acinar development is Mist1. Mice lacking Mist1 have disorganised exocrine tissue with a phenotype resembling that of pancreatic injury (Pin et al., 2001). During acinar development Mist1 is responsible for maturation of the various enzyme secreting cells located within the exocrine compartment of the pancreas (Ramsey et al., 2007). If Mist1 expression is restored in mutants lacking the gene it is able to rescue acinar function and complete maturation of the tissue, demonstrating that Mist1 plays an important role in the development of acinar cells (Direnzo et al., 2012). Mist1 expression is known to be regulated by GATA6 as well as NR5A2; in mice lacking Nr5a2, Mist1 levels become reduced by 50%, implying it is an integral downstream target (Martinelli et al., 2012; Hale et al., 2014).

1.2.3. Late pancreas signalling network

A multitude of signalling pathways are active as pancreas development proceeds, each of which defines and activates its own unique cohort of genes in order to aid in differentiation and maturation of the many cell types present. EGF signalling, as mentioned previously, is able to initiate expression of several key genes when combined with other growth factors. When the EGF receptor (EGFR) is knocked out in mice a decrease in proliferation is noted, ductal branching is diminished and islet cells do not cluster properly (Miettinen et al., 2000). EGF can also be used to maintain a pancreatic progenitor cell population via regulation of Nestin (Esni et al., 2004b). Likewise, a human study employing cultured fetal pancreatic organoids found that the presence of EGF was able to maintain cells in a proliferative state but after its removal cells underwent differentiation (Bonfanti et al., 2015). All of this shows that EGF signalling is essential in the maintenance of proliferation within the emerging pancreas.

FGF signalling is also active during pancreas development. Early studies of mice found that the FGFR2B ligands FGF1, FGF7 and FGF10 are all present. In addition, human studies have also identified FGF7 and FGF10 as present from CS11 (Duvillie and Scharfmann et al., 2005). When mouse embryonic pancreatic explants were exposed to these ligands, explant growth was stimulated. In contrast, when FGFR2B was inhibited explants were reduced in size (Miralles et al., 1999). Following this, work detailing Fgf10 null mice found that the pancreatic epithelium was unable to proliferate (Bhushan et al., 2001). In time the mechanism underlying this proliferation was established: FGF10 is able to activate Notch signalling, which through HES1 expression and repression of Ngn3 can maintain cells in the pancreatic progenitor state (Hart et al., 2003; Norgaard et al., 2003; Miralles et al., 2006). SOX9 can activate Fgfr2b, which in turn permits the transduction of the FGF10 signal, which then activates Sox9, forming a feed forward loop (Seymour et al., 2012). Accordingly, it is Sox9 expression which is imperative in initiating and maintaining the pancreatic progenitor population through FGF10 and Notch signalling.

Evidently Notch is a major player during the development of the mouse pancreas, as conditionally activated mutants of Notch within Pdx1 expressing cells become unable to differentiate towards either the endocrine or exocrine lineage. Instead cells are trapped within a progenitor state. Moreover, constitutive overexpression of Notch greatly represses endocrine cell formation and completely abrogates exocrine cell development (Hald et al., 2003; Murtaugh et al., 2003). This is in stark contrast to Notch pathway mutants, which undergo premature endocrine differentiation (Apelqvist et al., 1999; Nakhai et al., 2008b). All of this is achieved through regulation of Hes1 and subsequently Ngn3 expression, as both factors act to define their respective cell type through activation of downstream transcription factors (Shih et al., 2012).

The Wnt signalling pathway plays an active role in shaping the pancreas, as mouse mutants for β-catenin are still able to form islet cells, though in reduced numbers. However, the exocrine portion is almost completely absent and as a result the pancreas is severely hypoplastic (Murtaugh et al., 2005; Baumgartner et al., 2014). Conditional mutants for β-catenin within Pdx1 expressing cells had a similar phenotype and it was noted that Ptf1a expression was considerably reduced (Wells et al., 2007). Sadly, neither Notch nor Wnt signalling has been characterised in the developing human pancreas.

1.2.4. Transcription factor expression in human and mouse pancreas development

As discussed, considerably more is described in the mouse system than in the human in relation to pancreas development. The table below summarises the protein expression of key transcription factors in each organism (Table 2). It appears that the mouse pancreas developmental program is slightly ahead of, and expedited relative to, that of humans, as several transcription factors appear to be present a stage earlier. Moreover, GATA6 loss in humans causes greater perturbation to pancreas development than the loss of its counterpart GATA4 (Shi et al., 2017). This is contrary to mouse development, in which Gata4 loss has a more pronounced effect on the developing pancreas (Carrasco et al., 2012; Xuan et al., 2012). This suggests that GATA6 expression is more vital in the generation of human pancreatic cells. The considerable difference in NGN3 expression is also noteworthy (Villasenor et al., 2008; Jennings et al., 2013). It may be that the extended absence of NGN3 in humans allows proper expansion of the pancreatic progenitor population. This is most likely accomplished through Notch signalling.

That being said, multiple factors are detected together at the same time in both species, such as FOXA2 + SOX17 and PDX1 + SOX9. Moreover, NKX6.1 seemingly appears at a comparable time in humans and mice (Table 2). This implies the mouse system does at least provide a reasonably similar model. In addition, many of the proteins such as MNX1, GATA6 and HNF6 have only been probed at specific time points, meaning our existing knowledge may not be a complete reflection of their expression. Many exocrine specific factors have also not been described during human development; hopefully further study will clarify their expression pattern. The downstream regulatory network that these factors control also appears to be largely conserved given current available data (Fig. 2). Human PDX1 gene regulation has been studied the most and shares gene control of GATA4, PDX1, HNF1B and NKX6.1 with that of PDX1 in mice. Once again, the limited number of shared targets identified so far may reflect a lack of study rather than any interspecies difference. Looking at the regulatory network within the mouse pancreas it is clear that many factors regulate one another in an overlapping mesh. We can therefore assume a comparable matrix is likely to be active during human pancreas development. With this assumption we can also begin to define pancreatic progenitors. In mice many transcription factors have been used to describe these cells. However, consensus seems to suggest that a combination of PDX1, PTF1A and SOX9 represents the most critical factors in maintaining pancreatic progenitor identity (Pan and Wright et al., 2011). Work in the future should aim to validate this and the regulatory actions of pancreas specific transcription factors in human cells.


Table 2. Transcription factor presence during human and mouse embryonic development. Human development is described in Carnegie Stages (CS) and mouse in embryonic days (E). The initial detection of transcription factor protein is noted with reference. (*) SOX17 expression is no longer present in the pancreas by CS13. (†) NGN3 expression diminishes by E11 but returns from E12 onwards. (‡) RBPJ is initially bound to PTF1A but is progressively exchanged in favour of RBPJL from E9 to E12.5.

Human stage | Human initial protein detection | Human references | Mouse stage | Mouse initial protein detection | Mouse references
CS9 | - | - | E7.5–E8 | FOXA2, SOX17 | Monaghan et al., 1993; Spence et al., 2009
CS10 | FOXA2, SOX17* | Jennings et al., 2013 | E8–E8.5 | MNX1 | Sherwood et al., 2009
CS11 | - | - | E8.5–E9 | HES1, NGN3†, PDX1, SOX9 | Apelqvist et al., 1999; Villasenor et al., 2008; Gu et al., 2002; Kopp et al., 2011
CS12 | GATA4, PDX1, SOX9 | Jennings et al., 2013 | E9–E9.5 | PTF1A, RBPJ‡ | Kawaguchi et al., 2002; Masui et al., 2007
CS13 | NKX6.1 | Jennings et al., 2013 | E9.5–E10 | GATA4, GATA6, HNF1B, HNF4A, NKX6.1, NR5A2 | Decker et al., 2006; Haumaitre et al., 2005; Nammo et al., 2008; Hald et al., 2008; Annicotte et al., 2003
CS14 | MNX1 | Pan et al., 2015 | E10–E11.5 | HNF6 | Landry et al., 1997
CS15 | - | - | E11.5–E12.25 | NGN3† | Villasenor et al., 2008
CS16 | GATA6, HNF6 | Cebola et al., 2015 | E12.25–E12.75 | HNF1A, RBPJL‡ | Nammo et al., 2008; Masui et al., 2007
CS17–CS20 | - | - | E12.75–E15 | - | -
CS21 | NGN3 | Jennings et al., 2013 | E15–E15.5 | - | -
CS22–CS23 | - | - | E15.5–E16.5 | - | -
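The comparison drawn from Table 2 can be reproduced with a short sketch that encodes the first-detection rows and counts how many table rows separate human and mouse onset for each shared factor. This is only an illustration under a strong assumption: the row alignment of Table 2 is treated as a stage equivalence between species, and only the first detection of each factor is used. The names `ROWS`, `first_rows` and `OFFSETS` are hypothetical.

```python
# Each tuple: (human stage, factors first detected in human,
#              mouse stage, factors first detected in mouse), following Table 2.
ROWS = [
    ("CS9", [], "E7.5–E8", ["FOXA2", "SOX17"]),
    ("CS10", ["FOXA2", "SOX17"], "E8–E8.5", ["MNX1"]),
    ("CS11", [], "E8.5–E9", ["HES1", "NGN3", "PDX1", "SOX9"]),
    ("CS12", ["GATA4", "PDX1", "SOX9"], "E9–E9.5", ["PTF1A", "RBPJ"]),
    ("CS13", ["NKX6.1"], "E9.5–E10",
     ["GATA4", "GATA6", "HNF1B", "HNF4A", "NKX6.1", "NR5A2"]),
    ("CS14", ["MNX1"], "E10–E11.5", ["HNF6"]),
    ("CS15", [], "E11.5–E12.25", ["NGN3"]),
    ("CS16", ["GATA6", "HNF6"], "E12.25–E12.75", ["HNF1A", "RBPJL"]),
    ("CS17–CS20", [], "E12.75–E15", []),
    ("CS21", ["NGN3"], "E15–E15.5", []),
    ("CS22–CS23", [], "E15.5–E16.5", []),
]


def first_rows(rows, column):
    """Map each factor to the index of the first row in which it appears."""
    first = {}
    for index, row in enumerate(rows):
        for factor in row[column]:
            first.setdefault(factor, index)  # keep the earliest row only
    return first


HUMAN_FIRST = first_rows(ROWS, 1)
MOUSE_FIRST = first_rows(ROWS, 3)

# Positive offset: detected later in human than in mouse (in table rows).
OFFSETS = {
    factor: HUMAN_FIRST[factor] - MOUSE_FIRST[factor]
    for factor in HUMAN_FIRST
    if factor in MOUSE_FIRST
}
```

With this encoding, `OFFSETS` maps FOXA2, SOX17, PDX1 and SOX9 to one row later in human, NKX6.1 to the same row, and NGN3 to seven rows later. The offsets are counts of table rows, not units of developmental time.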

Combining knowledge regarding the temporal expression of transcription factors (Table 2) with the complex network of gene expression elicited by these factors (Fig. 1), as well as the understanding of signalling pathways active during development, provides a strong basis from which to generate pancreatic cells. Consequently, various methods have been created to produce these cells in vitro.

1.3. Pancreatic cell production

Insulin is currently the safest and most effective treatment for those suffering from diabetes, yet it only allows for a transient release from the disorder. Moreover, as it is only a treatment, those who require it are dependent for life. The transplantation of islets is a feasible long-term treatment for those with type 1 diabetes, but viable islet cells are not only in high demand but also scarce (Shapiro et al., 2000; Shapiro et al., 2006). More importantly, type 2 diabetes represents the vast majority of those suffering from the disorder. The World Health Organisation (WHO) projects that diabetes will be the 7th leading cause of death globally by 2030 (WHO Global Report on Diabetes, 2016). Type 2 diabetes is characterised by peripheral insulin resistance, typically as a result of a sedentary lifestyle or unhealthy diet. This means transplantation of pancreatic tissue may have a more limited effect, but work does suggest that defects within β cells are also present, meaning in turn that transplantation of pancreatic tissue may still provide some minor benefit (Butler et al., 2003). Accordingly, research groups across the globe are striving to generate protocols capable of producing islet cells en masse. Currently three methods have been established for the development and expansion of pancreatic cell types: transdifferentiation, in vitro differentiation and organoid culture. Here I will describe each and their various strengths and flaws.

1.3.1. Transdifferentiation

Transdifferentiation, also known as lineage reprogramming, is the process whereby one cell type is converted to another, usually through the use of small molecules, overexpression of transcription factors or a combination of the two. The first illustration of this phenomenon occurred in 1987, when a group used 5-azacytidine, a potent inhibitor of DNA methyltransferases, to convert mouse fibroblasts into myoblasts (Davis et al., 1987). Almost 20 years later the discovery of induced pluripotent stem cells (iPSCs) transformed modern day regenerative medicine, offering the potential for personalised cell therapies. By the delivery and overexpression of four transcription factors (octamer-binding transcription factor 4, Oct4; Sox2; c-Myc; and Kruppel like factor 4, Klf4) into fibroblasts, Yamanaka and colleagues were able to reprogram cells to a pluripotent stem cell state (Takahashi et al., 2006). Through the years countless studies have attempted to transdifferentiate to all manner of cell types, such as cardiac myocytes through the overexpression of the transcription factors Gata4, Myocyte enhancer factor 2c (Mef2c) and T-box transcription factor 5 (Tbx5) (Inagawa et al., 2012). Neural stem cells have also been made via exposure to an intricate mix of small molecules (Zhang et al., 2016). Hepatocytes have been produced through lentiviral delivery and subsequent genomic integration of FOXA3, HNF1A, HNF4A and simian vacuolating virus 40 (SV40) large T antigen, the last of which aids expansion of cells (Huang et al., 2014).

The pancreas was no exception to this enquiry; early experiments utilised Pdx1 expression delivered by virus into mouse liver cells in order to transdifferentiate them into insulin secreting β cells (Ferber et al., 2000; Sapir et al., 2005). In a seminal paper from 2008 Zhou et al. developed a robust trio of factors for transdifferentiation towards β cells in vivo. This was accomplished by introducing 9 genes into mice and successively removing one at a time until an ideal trio was formed. These genes were Pdx1, Ngn3 and v-maf avian musculoaponeurotic fibrosarcoma oncogene homolog A (MafA), referred to as PNM. As already mentioned, Pdx1 and Ngn3 both have integral roles in pancreas and endocrine specification, whereas MafA is critical specifically to the differentiation and maturation of β cells (Zhang et al., 2005; Hang et al., 2014). The adenovirally delivered genes were injected into the exocrine portion of a mouse pancreas, allowing long-term expression of the genes without integration into the genome. The adenovirus delivers DNA which then remains free within the nucleus, allowing for transcription of encoded genes and subsequent protein expression. The result was the creation of β cells mirroring endogenous β cells and able to produce insulin; more notably, they were able to rescue streptozotocin (STZ) treated hyperglycaemic mice (Zhou et al., 2008). STZ treatment in mice leads to β cell death and is employed to induce hyperglycaemia in mice in vivo.


Since the elucidation of these three factors multiple groups have utilised PNM to transdifferentiate an assortment of cell types (Fig. 2). The trio of factors was introduced in vitro to AR42j-B13 cells, a rat pancreatic exocrine tumour line. The newly generated cells flattened out, stopped dividing and expressed high levels of insulin. However, the cells were unable to respond to glucose levels and instead secreted insulin constitutively. In addition, the genetic profile of the cells suggested they were not fully mature β cells, as many of the β cell membrane channels were not upregulated, explaining the inability to respond to glucose. Exocrine genes also retained expression but at reduced levels (Akinci et al., 2012). Alongside these experiments the group also applied the genes in vivo into the livers of STZ treated mice. Here, upon adenovirus exposure, cells became SOX9 positive for approximately 2 weeks, suggesting SOX9 expression may be a marker of fate transition; eventually the cells matured to secrete insulin, losing SOX9. More importantly, the nascent β cells were glucose responsive instead of constitutively insulin expressing, though the insulin levels produced did not match those of wild-type mice (Banga et al., 2012).

The same group went on to further assess PNM across a variety of rodent cell types combined with the addition of small molecules. They tested 8 cell lines, from rodent hepatocytes to mouse embryonic and adult fibroblasts, as well as the original AR42j-B13 line. As expected, the AR42j-B13 cells gave the best response to the factors, as the line is derived from the same tissue as β cells. Therefore, the authors attempted to increase the efficiency of transdifferentiation by the addition of small molecules (Akinci et al., 2013). They screened 13 small molecules and found that a combination of three increased the number of dual positive insulin and PDX1 cells from 2% to 12%: N-[N-(3,5-difluorophenacetyl)-l-alanyl]-S-phenylglycine t-butyl ester (DAPT), 5'-N-ethylcarboxamidoadenosine (NECA) and BIX-01294 (BIX). DAPT is a γ-secretase inhibitor which blocks the cleavage of the Notch intracellular domain (NICD); as a result the NICD is unable to translocate to the nucleus and regulate transcription. Thus, the Notch target gene Hes1 will no longer be activated, allowing elevation of Ngn3 levels (Shih et al., 2012). NECA is an adenosine receptor agonist and has been demonstrated to increase the number of β cells in zebrafish (Andersson et al., 2012). Lastly, BIX is a histone methyltransferase inhibitor which reduces H3K9Me2 marks (Haumaitre et al., 2008).

A separate group opted for an alternative approach towards β cell transdifferentiation. Beginning instead with human primary dermal fibroblasts, they overexpressed Pdx1 whilst simultaneously knocking out v-maf avian musculoaponeurotic fibrosarcoma oncogene homolog B (MafB). MafB is known to be a key marker of α cells in mice, so reducing its expression presumably shifts the cells towards the β cell phenotype. However, it should be noted that recent human data demonstrate that MafB expression is also found in human β cells (Benner et al., 2014). Finally, they also exposed the cells to 5-azacytidine and the histone deacetylase inhibitor romidepsin. The cells generated were able to respond to glucose levels in vivo but were polyhormonal, as they also expressed glucagon (Katz et al., 2013). A more recent study employed human iPSCs followed by PNM exposure, but through the use of expression switches the authors were able to stagger the induction of each factor so as to mimic development (Saxena et al., 2016). All three genes were under the synthetic control of vanillic acid, which was steadily increased over time. The resulting expression pattern for each gene was as follows: Pdx1 (On-Off-On), Ngn3 (Off-On-Off) and MafA (Off-Off-On). When compared to human islets, many genes were expressed at close to endogenous levels; although insulin levels were still relatively low, the cells responded well to glucose.
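The staggered induction pattern described above can be written out as a small lookup table, which makes it easy to see which factors are active together at each step. This is only an illustrative sketch: the On/Off values are those quoted in the text, whereas the phase labels and the helper function name are assumptions introduced here and do not come from the cited study.

```python
# On/Off pattern as quoted in the text (Saxena et al., 2016).
# The phase labels below are illustrative assumptions, not terms from the study.
PHASES = ("phase 1", "phase 2", "phase 3")
SCHEDULE = {
    "Pdx1": ("On", "Off", "On"),
    "Ngn3": ("Off", "On", "Off"),
    "MafA": ("Off", "Off", "On"),
}


def active_factors(schedule, phase_index):
    """Return the factors switched on during the given phase, sorted by name."""
    return sorted(
        gene for gene, states in schedule.items() if states[phase_index] == "On"
    )


for i, phase in enumerate(PHASES):
    print(phase, active_factors(SCHEDULE, i))
```

Read this way, Pdx1 is the only factor active at the start, Ngn3 alone is active in the middle step, and Pdx1 and MafA are active together at the end.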

Pdx1, Ngn3 and MafA are not the only genes used in transdifferentiation. One group found that the gene TGFβ induced factor homeobox 2 (Tgif2), when introduced with lentiviruses into cultured mouse hepatocytes, was able to alter their cell fate to that of a progenitor cell type. When the same gene was later delivered into the mouse liver with adeno-associated viruses (AAVs), the cells became PDX1 positive; however, no insulin was seen. The authors surmised that the hyperglycaemic state induced by STZ may allow for further differentiation in vivo. AAVs, like adenoviruses, operate in a largely non-integrative manner but elicit a reduced immune response. Here, delivered DNA forms large concatemers or episomes which maintain gene expression (Berns and Muzyczka, 2017; Colella et al., 2018). Injection of Tgif2 virus into the liver of STZ treated mice gave rise to insulin production as posited (Cerda-Esteban et al., 2017). Most recently, work on donated human α and PP cells found that overexpression of Pdx1 and MafA alone was sufficient for cells to become insulin expressing. Once transplanted into STZ mice, these cells could not only reverse diabetes and respond to glucose appropriately but also remain functional past six months (Furuyama et al., 2019). These studies establish transdifferentiation as a major player in the race to generate pancreatic cell types. The ability to start with any cell type, which could even be derived from the patient, offers a level of personalisation in treatment and streamlines production to the target cell type. In addition, viral overexpression is a cheaper method for initiating target gene expression, but this comes with the caveat of utilising a virus: not only may the virus prompt a strong immune response, there is also the potential for aberrant integration of ectopic DNA sequences into the genome.


Figure 2. The advance of transdifferentiation protocols towards pancreatic cell types. Each study referred to is shown on the left. The starting cell type is shown along with its species of origin. Genes utilised for transdifferentiation are noted, those with a strikethrough indicate genes which were deleted. The vector employed for transdifferentiation is described as well as if delivered in vivo or in vitro and the time period cells were cultured for. If any small molecules were used to enhance transdifferentiation they are noted. The right hand side highlights the final protein expression elicited by transdifferentiated cells.

1.3.2. In vitro differentiation

In vitro differentiation towards pancreatic lineages from mouse stem cells, hESCs and iPSCs has developed extensively over the past two decades (reviewed: Zhou and Melton, 2018). Initially cells must be specified towards the endoderm; members of the TGFβ family combined with low serum containing media are well established in generating endoderm cell populations (Brennan et al., 2001; Kubo et al., 2004; D’Amour et al., 2005). Various factors have also been elucidated that drive endoderm to pancreas development. As previously mentioned, lack of SHH signalling is able to induce Pdx1 expression and initiate pancreas development, therefore a SHH signalling inhibitor is often employed (Hebrok et al., 1998; Kim and Melton, 1998). RA is also able to define both pancreas and liver at this early stage so has been utilised in differentiation (Stafford and Prince, 2002; Molotkov et al., 2005). The addition of FGF10 aids in the maintenance of Sox9 expression and thus the pancreatic progenitor population (Bhushan et al., 2001; Seymour et al., 2007). These factors were subsequently brought together in one study by D’Amour et al., 2006. In stage 1, human embryonic stem cells were exposed to ActA and Wnt to form the endoderm; in stage 2, to FGF10 and cyclopamine (CYC; an inhibitor of SHH signalling); and in stage 3 these two factors were accompanied by RA. Stage 4 of their protocol used exendin-4 and DAPT to prompt differentiation towards the endocrine lineage. At stage 5 of the differentiation cells were exposed to exendin-4, insulin-like growth factor (IGF) and hepatocyte growth factor (HGF); these final two supplements were discovered empirically to aid hormone production, although the authors did note that removal of the factors made little difference to the terminally generated cells (D’Amour et al., 2006). This protocol provided the skeleton upon which all subsequent methods have built (Fig. 3).


The same group soon published an updated protocol in which FGF10 was replaced by FGF7 (KGF) and noggin was added; for the latter part of differentiation the cells were not exposed to any further growth factors. When the cells were transplanted into STZ mice, the mice were able to retain stable blood glucose levels (Kroon et al., 2008). From here they pursued work to generate pancreatic endocrine cells in a scalable manner. Only a few minor changes to their protocol were made: the addition of TGFβ inhibitor IV (TBI), the replacement of RA with a more stable analog, 4-[(E)-2-(5,6,7,8-Tetrahydro-5,5,8,8-tetramethyl-2-naphthalenyl)-1-propenyl]benzoic acid (TTNPB), and finally the addition of EGF. Through the growth of multiple batches in ‘cell factories’, Schulz et al. were able to reproducibly generate vast amounts of pancreatic endocrine cells capable of rescuing STZ treatment in mice. However, approximately 50% of the cohort had grafts develop into cysts, suggesting a lack of maturity in the cells (Schulz et al., 2012).

In 2014 a ground-breaking 7 stage protocol was developed to optimise β cell production in vitro. Earlier studies had been able to restore glucose stability in STZ mice but the cells were still polyhormonal. In stage 1 cells were exposed to growth differentiation factor 8 (GDF8; a TGFβ family member and substitute for ActA) as well as a glycogen synthase kinase 3 β (GSK3β) inhibitor (acting as a Wnt3a substitute) to produce definitive endoderm. As in previous protocols, FGF7 was utilised in stage 2, here alongside vitamin C (VitC). The presence of VitC was shown to increase the number of PDX1 and NKX6.1 positive cells; this is believed to be because it aids in the synthesis of the extracellular matrix (ECM) (Choi et al., 2008). The following stage featured several substitutions and additions: SHH antagonist (SANT), TPB (an activator of protein kinase C (PKC)), LDN-193189 (LDN; a noggin analog) and RA. Stage 4 is identical to the previous stage but with reduced RA (1 µM to 100 nM). In the subsequent stages cells were transferred to an air-liquid interface, permitting the formation of 3D structures with cell polarity. In stage 5 two new supplements were added: tri-iodothyronine (T3) and the TGFβ inhibitor analog activin receptor-like kinase 5 inhibitor II (ALK5iII). T3 has been implicated in β cell maturation as it increases MafA expression (Aguayo-Mazzucato et al., 2013). In the penultimate stage an analog for DAPT, γ-secretase inhibitor XX (GSiXX), was introduced. Rezania and colleagues observed that it reduced exocrine gene markers whilst concurrently increasing endocrine gene expression. At the end of this stage 50% of the population was insulin positive and glucagon negative, but insulin levels were still not comparable to adult human islets. These cells also did not secrete insulin in response to glucose, indicating that they were still immature. That said, stage 6 cells were able to rescue STZ induced hyperglycaemia in mice, signifying that rescue of STZ treated mice may not be an ideal model for assessing the true maturity of β cells.

The group therefore screened a host of molecules and identified an AXL receptor tyrosine kinase inhibitor (AXLi) known as R428, which was able to induce MAFA expression. Previous work had determined that AXL receptor tyrosine kinase activity reduces MAFA expression, so that β cell maturation is permitted closer to birth (Haase et al., 2013). Additionally, N-acetyl cysteine (N-Cys) was introduced to the media; this is an antioxidant and is known to prevent the loss of MAFA (Harmon et al., 2005). To assess the relative maturity of their generated cells, gene expression was compared to that of adult human islets. Key genes such as insulin, NGN3, NEUROD1, PDX1, NKX6.1, NKX2.2 and MAFA all had equivalent or elevated levels of expression. Nevertheless, several other pancreas specific genes were still relatively under expressed, suggesting that the cells had yet to reach full maturation. The stage 7 cells were also capable of reversing STZ induced hyperglycaemia in mice, and did so more rapidly than their stage 6 counterparts. Lastly, they elicited a limited response to glucose, whereas adult β cells exhibited an appropriate Ca2+ response. The authors observed that around 50% of their stage 7 cells were positive for NKX6.1, MAFA and insulin (Rezania et al., 2014).

Rezania et al., 2014 provided significant advances towards in vitro differentiation of pancreatic cell types; the studies that followed worked towards scalable methods for production. One such protocol utilised the scalable model from Schulz et al., 2012 and added the final stages from Rezania et al., 2014 to create a protocol for large scale production of more mature differentiated β cells (Russ et al., 2015). Furthermore, a novel method for the long term culture of pancreatic progenitors was recently created. Here the authors cultured hESCs to a PDX1 and SOX9 positive state; cells were then transferred to be cultured on 3T3-J2 feeder cells in a fixed media. This media was composed of RA, EGF, FGF10, DAPT and SB431542 (TGFβ inhibitor). As a result the cells could be maintained in a pancreatic state for at least 25 passages; moreover, cells could then be re-plated and expanded towards the endocrine lineage (Trott et al., 2017). The use of feeder cells may revolutionise how pancreatic cell types are grown and expanded without the loss of key marker gene expression over long periods of time. Most recently, a study employed an insulin-GFP reporter hESC line to differentiate cells through the Rezania protocol. At stage 7 GFP positive cells were isolated by fluorescence activated cell sorting (FACS) and aggregated into discrete clusters before being cultured for a further week. The resulting cells were 99% endocrine and 90% insulin positive and monohormonal; upon transplantation into STZ treated mice they were able to revert the phenotype in around 13 days and no tumour formation was observed after 8 months. They were also able to rapidly and robustly respond to glucose and stimulate an appropriate Ca2+ response (Nair et al., 2019). To date in vitro differentiation represents the most reproducible, safe and scalable method for the production of pancreatic cell types and specifically β cells. However, it is prohibitively expensive to set up differentiation, especially in a scalable manner. Moreover, stored cells, feeder cells and Matrigel would not be applicable in therapy due to potential rejection by the patient. Cells would therefore not only have to be expanded and differentiated from iPSCs generated from the patient, but the process would also have to avoid reagents of unknown composition which could elicit a severe immune response.


Figure 3. The advance of in vitro differentiation protocols towards pancreatic cell types. Each study referred to is shown on the left. The stages of differentiation are indicated at the top, whilst markers of each stage are at the base. Dotted lines denote a change in media composition within the same stage. The growth factors and small molecules used in each study are written. Colour coded items highlight the use of substitutes or analogs with similar functions. Red = TGFβ proteins, orange = Wnt signalling activators, yellow = FGFR2B binding proteins, green = SHH signalling antagonists, light blue = RA receptor activators, blue = BMP signalling inhibitors, purple = TGFβ signalling inhibitors, magenta = Notch signalling inhibitors.

1.3.3. Organoid culture

Transdifferentiation and hESC differentiation are not the only avenues available for the expansion of pancreatic cell types. The discovery of leucine-rich repeat-containing G-protein coupled receptor 5 (Lgr5) as the driving force behind intestinal stem cell expansion paved the way for organoid research (Barker et al., 2007). In the years following, mouse pancreatic organoid cultures were established, employing the Wnt agonist R-spondin to activate Lgr5 expression (Fig. 4). Here organoids, which could form both ductal and endocrine cells, could be expanded fivefold per week and cultured for more than 40 weeks (Huch et al., 2013). Later work utilising human organoids was published by Bonfanti et al., who isolated fetal pancreas cells to be expanded in vitro in a fixed medium. Cells were cultured within Matrigel and media containing: N-Cys, nicotinamide, gastrin, noggin, FGF10, EGF, R-spondin, Rho-associated protein kinase inhibitor Y (ROCKi) and Ex4. The majority of these factors have already been defined, with most aiding cell proliferation or maintenance. The remaining supplements act similarly; nicotinamide maintains cell growth, as does gastrin, which has been regularly utilised in organoid cell culture (Barker et al., 2010).


After 4 weeks of expansion the cultured organoids expressed several key marker genes including PDX1, SOX9 and NKX6.1. Specific factors were then removed from the media to assess what effect their loss would incur. ROCKi loss led to a substantial reduction in proliferation. Similarly, R-spondin and FGF10 removal were also accompanied by reduced cell growth. Importantly, EGF removal caused a distinct shift in both gene expression and morphology. Organoids lacking EGF were reduced in size relative to their EGF treated counterparts. The expression of genes such as NGN3, NKX2.2, PTF1A, glucagon and insulin was altered; beyond that, approximately 75% of the organoid population was insulin positive and glucagon negative (Bonfanti et al., 2015). This paper highlighted that organoids could not only be expanded but in time differentiated through simple means to create a majority monohormonal insulin positive cell population.

Most recently, adult human pancreatic tissue was utilised for organoid culture followed by in vitro differentiation to specify the endocrine lineage. Organoids were cryopreserved and later entered into the in vitro differentiation protocol alongside fresh organoids; the authors noted no difference between the two cultures, providing a protocol for long term storage of viable endocrine cells (Loomans et al., 2018).

Nonetheless, no group has so far used organoids to attempt to abrogate STZ treatment in mice or for further analysis. The need for in vitro differentiation to generate islet cell types also means this method may incur large costs. Lastly, given that the cultures require an initial pancreatic sample, how these could be personalised to a patient represents a considerable challenge moving forward. For now organoids remain a fantastic model and tool for pancreatic culture ex vivo.


Figure 4. The advance of organoid expansion protocols towards pancreatic cell types. Each study referred to is shown on the left. The starting cell type is shown along with its species of origin. The culture media components used to maintain organoids are indicated. Further processing of organoids is indicated along with time frame of experiment. The right hand side highlights the final protein expression elicited by organoids after further experiments.

Each route towards pancreatic cell types is distinct in its method, yet all are able to produce insulin secreting cells. Both transdifferentiation and in vitro differentiation are able to generate β cells that show a robust response to glucose, offering the potential for future cell therapy. That being said, a consistent set of concerns does arise. Diabetes will continue to affect more and more of the world’s population, so for any cell therapy to be successful it must in the first instance be cost effective as a treatment. It must also be scalable in order to cater to the vast needs of the world’s population. Any manufacturing process must abide by strict GMP conditions, meaning common components such as Matrigel cannot be used due to their batch to batch variation and complex uncharacterised composition. It must also be safe and ensure no severe immune reaction can occur, ideally through autologous cell therapy. Finally, any transplanted cells must be able to be maintained long term within the patient without risk of abnormal growth or necrosis. Thus, for any cell therapy to be successful each of these criteria must be met.


1.4. Project goal and methods

The goal and key question of my PhD project was ‘Can I transdifferentiate human cells to pancreatic progenitors?’ Pancreatic progenitors are the last proliferative cell type present during development with the innate potential to later form pancreatic endocrine cells, as following specification to the endocrine lineage the replicative ability of these cells is slowly lost (Teta et al., 2005; Miyatsuka et al., 2011). Therefore, these cells are of particular interest as they represent the gateway to potential large scale expansion and subsequent differentiation of islet cells. As discussed, groups across the globe are currently able to reliably differentiate hESCs to pancreatic progenitors in vitro. However, this process is highly expensive and so far is limited to hESCs and hiPSCs, meaning any autologous cell therapy would first have to reprogram cells to iPSCs before eventually differentiating them to β cells. Transdifferentiation instead offers the most cost effective and direct route from harvested cells to the target, pancreatic progenitors. From there it is only a case of expanding the cells in order to later differentiate them to β cells.

During my PhD I carried out work across two separate laboratories. I began my project within the Hanley laboratory at the University of Manchester, which has access to early human tissue. I became acquainted with pancreas development and began devising a transdifferentiation strategy. After one year I transitioned to the Dunn laboratory at the Agency for Science, Technology and Research (A*STAR) in Singapore. Here the group specialised in the in vitro differentiation of human stem cell lines to the pancreas. I spent the next two years and nine months working towards transdifferentiating to pancreatic progenitors. Given the scope of this project, I broke the process down into several key stages. The first was the derivation of key transcription factors for transdifferentiation from bioinformatic analysis. The next was the development and validation of a system for gene delivery into cells to allow overexpression. The last was the screening of chosen factors so as to determine their regulatory potency and aptitude for transdifferentiation.


To transdifferentiate cells I first had to determine a set of genes capable of reprogramming cells. The Hanley laboratory’s access to human embryonic and fetal tissue allowed the inference of factors active during the early stages of pancreas development. I employed data taken from Gerrard and Berry et al., 2016, in which human embryos were collected and 15 separate tissues harvested for RNA-sequencing. The authors then applied a novel form of analysis to the RNA-sequencing data known as lineage-guided principal components analysis (LgPCA). LgPCA differs from regular PCA as it applies developmental lineages to the PCA in order to tease out genes which may be shared across related tissues such as the pancreas and liver, whilst simultaneously determining genes that are shared across unrelated tissues such as brain and pancreas (Fig. 5A). This data set analysed 19,362 genes (including 1,601 transcription factors). The result is the derivation of developmental genes crucial to forming each organ of the body (Gerrard and Berry et al., 2016). The initial list of factors was then supplemented and validated by an additional form of analysis known as Mogrify. Mogrify is an online tool developed by Rackham et al., 2016 utilising the FANTOM5 and STRING databases, which aims to elucidate the optimal transcription factors to transdifferentiate to target cell types. Mogrify has two key criteria in the selection of genes: firstly, to identify transcription factors that can activate a large cohort of genes, not just a few (Fig. 5B); secondly, to determine an ideal set of these factors that will complement one another and give the most regulatory coverage over the target cell type (Fig. 5C) (Rackham et al., 2016). Following these analyses a list of genes was generated, ready for further examination and distillation down to an ideal trio of transcription factors.
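The second selection criterion, choosing factors that complement one another to maximise regulatory coverage, can be illustrated with a generic greedy coverage heuristic. The sketch below is not Mogrify's actual algorithm, which is described by Rackham et al., 2016; it is a simplified stand-in under stated assumptions. The factor names, target genes and the function name are all hypothetical and were invented purely for illustration.

```python
def greedy_factor_set(targets_by_factor, max_factors=3):
    """Pick factors one at a time, each time taking the factor that adds the
    most not-yet-covered target genes (a greedy maximum-coverage heuristic)."""
    chosen, covered = [], set()
    for _ in range(max_factors):
        best, best_gain = None, 0
        for factor in sorted(targets_by_factor):  # sorted for deterministic ties
            if factor in chosen:
                continue
            gain = len(targets_by_factor[factor] - covered)
            if gain > best_gain:
                best, best_gain = factor, gain
        if best is None:  # no remaining factor adds any new coverage
            break
        chosen.append(best)
        covered |= targets_by_factor[best]
    return chosen, covered


# Hypothetical factors and target genes, for illustration only.
toy_network = {
    "TF_A": {"g1", "g2", "g3", "g4"},
    "TF_B": {"g3", "g4", "g5"},
    "TF_C": {"g6"},
    "TF_D": {"g1", "g2"},
}
chosen, covered = greedy_factor_set(toy_network, max_factors=3)
print(chosen, sorted(covered))
```

In this toy network TF_D regulates two genes but is never selected, because both are already covered by TF_A. A factor with fewer targets (TF_C) is preferred because it adds coverage the others lack, which is the sense in which a chosen set is complementary.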


Figure 5. Bioinformatic analyses used to determine genes for transdifferentiation. (A) Figure taken from Gerrard and Berry et al., 2016 with permission from authors. LgPCA applies developmental lineages to PCA. 15 PCs are shown with samples taken from human embryonic organs and human embryonic stem cells. PC dimensions are shown in black (high) or white (low) with scale shown by circle size. Broad gene expression patterns across many organs are seen with low scores seen in PC4 for example. High scores for specific organs are seen in later PCs or between tissues developmentally unrelated. (B) Figures B and C adapted from Rackham et al., 2016. Mogrify determines genes with higher ability to induce differential expression of transcription factors. (C) Mogrify then utilises these high scoring transcription factors to elucidate the ideal combination to give the greatest regulatory coverage over the target cell type.

With target genes selected, the next key aim was identifying a vector capable of delivering the genes into cells. I desired for the genes to not be integrated into the genome, so as to facilitate the formation of a more plastic cell type that could differentiate given time. Adenoviruses and lentiviruses were not chosen due to the high risk of immune response shown in patients and the possibility of improper insertion into or recombination of a host’s genome (Schlimgen et al., 2016; Lee et al., 2017; Milone and O’Doherty et al., 2018). As a result AAVs were instead chosen. AAVs elicit a minimal immune response, as it is believed close to 80% of the population may be positive for the AAV2 serotype. They also provide long term and non-integrative gene expression through the formation of large concatemers (Berns and Muzyczka, 2017; Colella et al., 2018). Moreover, the usage of AAVs in the pancreas is well established and effective (Wang et al., 2006).


With the target cell type in mind it is of chief importance to define what a pancreatic progenitor is, as only with this definition can screening be undertaken. In the case of β cells one can simply pursue insulin expression. However, for pancreatic progenitors it is more ambiguous. Across the literature many groups have used numerous transcription factors to characterise pancreatic progenitors, including PDX1, MNX1, HNF6, SOX9, NKX6.1, HES1, PTF1A and C-MYC, to list a few (Seymour et al., 2007; Zhou et al., 2007; Hald et al., 2008; Ahnfelt-Ronne et al., 2012; Petersen et al., 2017). Because of this, when generating pancreatic progenitor like cells a multitude of pancreatic progenitor associated genes were tested. However, from these genes three markers were considered key to the formation of pancreatic progenitors. The first is PDX1: at the pancreatic progenitor stage PDX1 is critical to pancreas development, it controls a large transcription factor network, its loss causes pancreatic agenesis and it is required for β cell differentiation (Schwitzgebel et al., 2003; Pan and Wright, 2011). Next is PTF1A, the loss of which, like that of PDX1, results in pancreatic agenesis; it sits atop an elaborate transcriptional network and is integral to acinar cell formation (Cockell et al., 1989; Krapp et al., 1996; Sellick et al., 2004). The final factor chosen is SOX9; this transcription factor is vital in maintaining the proliferation of the pancreatic progenitors through mediation of FGF as well as Notch signalling and the resulting Hes1 expression. SOX9 is later associated with the ductal lineage (Seymour et al., 2007; Kopp et al., 2011; McDonald et al., 2012). Together these genes help develop the three lineages of the pancreas and activate a host of downstream genes needed for pancreas development. As a result these were key factors of interest during the screening process but were used in tandem with other genes so as to have the clearest possible picture of any potential transdifferentiation. Throughout this project I have employed both of the other aforementioned methods for generating pancreatic cells, in vitro differentiation and organoid culture, so as to not only understand each but also to use them as tools to help unravel the complexities of pancreatic cell culture and expansion. This was done because transdifferentiating cells and activating pancreatic genes necessitates an in depth understanding of all methods in order to generate the optimal cell type.


2. Materials and Methods

2.1. Materials

2.1.1. Solutions

Table 3. Solutions used throughout the project. All the key solutions and instructions for production are described below; stock solutions were diluted before use.

Solution | Contents
Lysogeny broth (LB) | Per 1 L: 10.0 g tryptone, 5.0 g NaCl, 5.0 g yeast extract, 2 ml 1 M NaOH, dH2O to 1 L
Phosphate buffered saline (PBS) | Per 1 L: 80.0 g NaCl, 2.0 g KCl, 14.4 g Na2HPO4, 2.4 g KH2PO4, pH to 7.4, dH2O to 1 L
Tris-borate ethylenediaminetetraacetic acid (EDTA) (TBE) 10x | Per 1 L: 121.1 g tris base, 61.8 g boric acid, 7.4 g EDTA, dH2O to 1 L
Sodium dodecyl sulphate polyacrylamide gel electrophoresis (SDS PAGE) loading buffer 6x | Per 10 ml: 2.4 ml 0.5 M tris-HCl pH 6.8, 0.8 g SDS, 50 mg bromophenol blue, 2.5 ml 80% glycerol, 5.1 ml dH2O
Sodium dodecyl sulphate polyacrylamide gel electrophoresis (SDS PAGE) running buffer 10x | Per 1 L: 30 g tris base, 144 g glycine, 100 ml 10% SDS, dH2O to 1 L
Tris-glycine (TG) buffer 10x | Per 1 L: 30.3 g tris base, 144.0 g glycine, dH2O to 1 L
Transfer buffer | Per 1 L: 100 ml 10x tris-glycine buffer, 200 ml methanol, 700 ml dH2O
Blocking buffer | Per 100 ml: 5 g milk powder, 1 μl 10% Tween, PBS to 100 ml
Primary/Secondary blocking buffer | Per 100 ml: 1 g milk powder, 1 μl 10% Tween, PBS to 100 ml with antibody
Immunofluorescence blocking buffer | Per 100 ml: 20% donkey serum, 0.1% bovine serum albumin (BSA), 0.3% Triton X-100, PBS to 100 ml
Immunofluorescence washing buffer | Per 100 ml: 0.1% BSA, PBS to 100 ml
Fluorescence-activated cell sorting buffer | Per 100 ml: 2% fetal bovine serum (FBS), PBS to 100 ml
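Since several of the solutions above are prepared as 10x stocks and diluted before use, the dilution arithmetic (C1 × V1 = C2 × V2) can be sketched as a short helper. This is a minimal illustration rather than part of the original protocol; the function name and its parameters are assumptions introduced here.

```python
def stock_volume_ml(stock_x, working_x, final_volume_ml):
    """Volume of concentrated stock needed, from C1 * V1 = C2 * V2.

    stock_x and working_x are fold-concentrations (e.g. 10 for a 10x stock).
    """
    if working_x <= 0 or stock_x < working_x:
        raise ValueError(
            "stock must be at least as concentrated as the working solution"
        )
    return final_volume_ml * working_x / stock_x


# 1x working solution from a 10x stock, made up to a final volume of 1 L.
stock_ml = stock_volume_ml(stock_x=10, working_x=1, final_volume_ml=1000)
diluent_ml = 1000 - stock_ml
print(stock_ml, diluent_ml)
```

This agrees with the transfer buffer recipe in Table 3, where 100 ml of 10x tris-glycine buffer is made up to 1 L with 200 ml methanol and 700 ml dH2O.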


2.1.2. Primers

Table 4. Primers used throughout the project. A list of the primers utilised, each with a basic description relating to its purpose; red highlights indicate additional sequence to aid cloning or detection. Product size for all primer pairs is indicated in base pairs (bp). Melting temperatures (Tm) are given for each individual primer; red highlighted melting temperatures indicate that additional genetic elements were taken into account in the calculation.

Primer | Sequence (5’ – 3’) | Description | Product (bp) | Tm (°C)
PDX1_F1 | GAGTGGGAACGCCACACAG | PCR | 988 | 60.7
PDX1_R1 | GAGTGGTTGAAGCCCCTCAG | PCR | | 60.1
RBPJL_F5 | TCTAGAGCCACCATGGACCCCGC | PCR | 1556 | 68.0
RBPJL_R5_HA | CTCGAGCTAAGCGTAATCTGGAACATCGTATGGGTAAGTCTGGATGAAGAGGTGGAAGT | PCR | | 72.6
MNX1_F11 | GTCAAGGCCCACCATGG | PCR | 1228 | 58.3
MNX1_R11 | CATGAGGCCCTACTGGGG | PCR | | 59.8
NR5A2_F1 | TCTAGAATGTCTTCTAATTCAGATACTGGGG | PCR | 1489 | 58.3
NR5A2_R1 | CTCGAGCTTATGCTCTTTTGGCATGCA | PCR | | 63.4
FOXA2_F1 | GTTAAATTTTAAACTGCCATGCACTCGG | PCR | 1421 | 58.6
FOXA2_R1 | CCGTCGTCTTCTTAAGAGGAGTTC | PCR | | 58.8
ONCT1_F6 | CTGGCCACATCGATGTTGTGTCCG | PCR | 1447 | 64.5
ONCT1_R3 | CCTTCATGCTTTGGTACAAGTGC | PCR | | 58.2
PTF1A_F6 | CGGACGGGCCTTAGAAACTC | PCR | 1086 | 60.3
PTF1A_R6 | ATCTTCAGCCGAGTCTGGGA | PCR | | 59.7
SOX9_F4 | GTATGAATCTCCTGGACCCCTTC | PCR | 1534 | 58.9
SOX9_R4 | CCTCAAGGTCGAGTGAGCTG | PCR | | 59.5
FOXA3_F1 | GGGATGCTGGGCTCAGTG | PCR | 1062 | 60.9
FOXA3_R1 | CCCCTGCTAGGATGCATTAAGC | PCR | | 60.1
HNF4At2_F1 | GAGGCAGGGAGAATGCGAC | PCR | 1447 | 60.9
HNF4At2_R1 | CAGCGGCTTGCTAGATAACTTCC | PCR | | 60.0
HNF1A_KOZAK_F1 | GCCACCATGGTTTCTAAACTGAG | PCR | 1902 | 57.7
HNF1A_STOP_R1 | TCACTGGGAAGAGGCCATC | PCR | | 58.8
Q5_PDX1_N196S_F1 | TTCATGCGGCGGCTTTGGAACCAGATCTTGATGTGT | Q5 mutagenesis | 5706 | 70.4
Q5_PDX1_N196S_R1 | ACACATCAAGATCTGGTTCCAAAGCCGCCGCATGAA | Q5 mutagenesis | | 70.4
Q5_DJ8_F2 | CTACCAACCTCCAGCGAGGCAACAGACAAGCAGCTACCGC | Q5 mutagenesis | 7333 | 75.2
Q5_DJ8_R2 | GCGGTAGCTGCTTGTCTGTTGCCTCGCTGGAGGTTGGTAG | Q5 mutagenesis | | 75.2
AAVPolyA_F1 | CATGGTCCTGCTGGAGTTCG | PCR | 583 | 60.1
AAVPolyA_R1 | CCGCACGTGGTTACCTACAAA | PCR | | 59.1
CMV_Seq | ATAGAAGACACCGGGACCGA | Sequencing | - | 59.3
Intron_Seq | CTTATCTTCCTCCCACAGCTCC | Sequencing | - | 59.0
IRES_Seq | CACATTGCCAAAAGACGGCA | Sequencing | - | 58.2
PolyA_Seq | GTGGCATCCCTGTGACC | Sequencing | - | 58.0
qPDX1_F1 | AAAGGCCAGTGGGCAGGCGG | qPCR | 135 | 68.7
qPDX1_R1 | GCGCGGCCGTGAGATGTACT | qPCR | | 64.9
qPDX1_UTR_F5 | GATTGGCGTTGTTTGTGGCT | qPCR | 81 | 58.2
qPDX1_UTR_R5 | GCCGGCTTCTCTAAACAGGT | qPCR | | 59.3
qRBPJL_F1 | CCCTCGCAGAAGAAGCAGTC | qPCR | 89 | 60.4
qRBPJL_R2 | TGTGAGCGCAGGCGGTTGAA | qPCR | | 64.6
qRBPJL_UTR_F3 | CACACATCCAGGCATAGGGG | qPCR | 123 | 59.7
qRBPJL_UTR_R3 | CCGAATTCCCCAGGTGAGAC | qPCR | | 60.0
qMNX1_F1 | TCGCTCATGCTCACCGAGA | qPCR | 115 | 60.5
qMNX1_R1 | CCTTCTGTTTCTCCGCTTCCT | qPCR | | 58.8
qMNX1_UTR_F3 | AACTTGAAACCGCCTCTGGA | qPCR | 104 | 58.3
qMNX1_UTR_R3 | ACGCTCGTGACATAATCCCC | qPCR | | 58.9
qNR5A2_F1 | ACGGACTTACACCTATTGTGTCT | qPCR | 139 | 56.3
qNR5A2_R1 | CCCTTGCAGCTTTCACAGGT | qPCR | | 59.7
qNR5A2_UTR_F1 | GAGTCCAGGGAAAGACTTGCT | qPCR | 114 | 58.6
qNR5A2_UTR_R1 | GCCTTGGGAAGGACACATCA | qPCR | | 59.3
qFOXA2_F1 | GGGAGCGGTGAAGATGGA | qPCR | 89 | 59.3
qFOXA2_R1 | TCATGTTGCTCACGGAGGAGTA | qPCR | | 59.4
qFOXA2_UTR_F2 | GGTGAAATCCAGGTCTCGGG | qPCR | 101 | 60.0
qFOXA2_UTR_R2 | TTGTGGAACTCTGGCCCTTG | qPCR | | 59.4
qONCT1_F1 | TGTTGCCTCTATCCTTCCCA | qPCR | 108 | 57.2
qONCT1_R1 | GGAGGATGTGGAAGTGGCT | qPCR | | 58.9
qHNF6_UTR_F2 | CGCTGTGCTTGGCTGTTTAG | qPCR | 108 | 58.6
qHNF6_UTR_R2 | ACACCTTCGTGGCATGGTAG | qPCR | | 58.7
qPTF1A_F2 | CAGGCCCAGAAGGTCATCA | qPCR | 80 | 58.8
qPTF1A_R2 | AGGGGAGGGAGGCCATAA | qPCR | | 60.1
qPTF1A_UTR_F1 | TCCCAGACTCGGCTGAAGAT | qPCR | 110 | 59.7
qPTF1A_UTR_R1 | ACAGTTGATTGCCATTCGAAAA | qPCR | | 54.0
qSOX9_F1 | GTACCCGCACTTGCACAAC | qPCR | 72 | 58.8
qSOX9_R1 | TCGCTCTCGTTCAGAAGTCTC | qPCR | | 58.2
qSOX9_UTR_F1 | TCCAAGCGCATTACCCACTT | qPCR | 75 | 58.5
qSOX9_UTR_R1 | TCGTTGATTTCGCTGCTCCA | qPCR | | 58.5
qXBP1_F1 | CTGCCAGAGATCGAAAGAAGGC | qPCR | 140 | 60.2
qXBP1_R1 | CTCCTGGTTCTCAACTACAAGGC | qPCR | | 59.3
qNKX2.3_F1 | TCCACACGGTCCTGCGAGACT | qPCR | 108 | 66.3
qNKX2.3_R1 | CTAGAGACTTCTTCAGCTGGCAG | qPCR | | 60.7
qEHF_F1 | CGTGCAATGTTTCCAGTGGG | qPCR | 80 | 58.6
qEHF_R1 | CCACACCTGGTACTTGGTCC | qPCR | | 59.5
qHEYL_F1 | TGGAGAAAGCCGAGGTCTTGCA | qPCR | 147 | 64.9
qHEYL_R1 | ACCTGATGACCTCAGTGAGGCA | qPCR | | 63.0
qZNF469_F1 | GCTCATTCTGAAGATCGTGCAGC | qPCR | 222 | 61.4
qZNF469_R1 | TCCTTCCTCTTCTCGCCTCGG | qPCR | | 64.0
qTLX2_F1 | CGCCGCTTTGCCAAGGACCG | qPCR | 146 | 69.6
qTLX2_R1 | TCCAACTCCAGCACCTGTGAGC | qPCR | | 65.4
qPLA2G1B_F1 | ACAACTACGGCTGCTACTGTGG | qPCR | 142 | 62.0
qPLA2G1B_R1 | GTGTACGGGTTGTCCAGCAGAA | qPCR | | 63.2
qFOXA3_F1 | CTGGCCGAGTGGAGCTACTA | qPCR | 110 | 60.4
qFOXA3_R1 | GAGCTTAGAGGATTCAGGGTCA | qPCR | | 57.7
qFOXA3_UTR_F1 | TAACATCTGGGTGGGTCT | qPCR | 161 | 53.6
qFOXA3_UTR_R1 | CAGTGGATTAGCCAATAACA | qPCR | | 50.5
qHNF4A_F2 | GAGATCCATGGTGTTCAAGGA | qPCR | 61 | 55.8
qHNF4A_R2 | GTGCCGAGGGACAATGTAGT | qPCR | | 58.7
qHNF4A_UTR_F1 | CAACCCAACCTCATCCTC | qPCR | 208 | 54.3
qHNF4A_UTR_R1 | GTCCCATCTCACCTGCTC | qPCR | | 57.1
qHNF1A_F2 | ACACCTCAACAAGGGCACTC | qPCR | 148 | 59.3
qHNF1A_R2 | TGGTAGCTCATCACCTGTGG | qPCR | | 58.1
qHNF1A_UTR_F1 | CCAGGACAAGCATGGTCCCACAT | qPCR | 184 | 63.5
qHNF1A_UTR_R1 | TCCACCGCATTTCTCCTTGACTTTA | qPCR | | 59.9
qGATA4_F2 | TCCCTCTTCCCTCCTCAAAT | qPCR | 194 | 56.8
qGATA4_R2 | TCAGCGTGTAAAGGCATCTG | qPCR | | 56.6
qGATA6_F1 | CAGTTCCTACGCTTCGCATC | qPCR | 121 | 57.9
qGATA6_R1 | TTGGTCGAGGTCAGTGAACA | qPCR | | 57.4
qHNF1B_F1 | CTGGCACCTCAGACAATCCACTC | qPCR | 162 | 61.8
qHNF1B_R1 | CAGTACGGCTTTCTTGCTTCCTC | qPCR | | 60.1
qHES1_F1 | AGCACACTTGGGTCTGTGC | qPCR | 127 | 59.7
qHES1_R1 | TGAAGAAAGATAGCTCGCGG | qPCR | | 56.4
qNGN3_F2 | AGCCGGCCTAAGAGCGAGTT | qPCR | 158 | 63.4
qNGN3_R2 | TTGGTGAGCTTCGCGTCGTC | qPCR | | 62.6
qNKX6.1_F1 | GCCCGCCCTGGAGGGACGCA | qPCR | 186 | 73.8
qNKX6.1_R1 | ACGAATAGGCCAAACGAGCCC | qPCR | | 62.3
qECAD_F1 | GAAGGTGACAGAGCCTCTGGAT | qPCR | 122 | 60.4
qECAD_R1 | GATCGGTTACCGTGATCAAAATC | qPCR | | 55.4
qAMY1A_F1 | GATAATGGGAGCAACCAAGTGGC | qPCR | 119 | 60.6
qAMY1A_R1 | CAGTATGTGCCAGCAGGAAGAC | qPCR | | 59.7
qCPA1_F1 | AGTAAGCGTCCAGCCATCTG | qPCR | 83 | 58.7
qCPA1_R1 | TTTGCAAACCAGACCCCACT | qPCR | | 58.6
qMIST1_F1 | TCCAAGATCGAGACGCTCAC | qPCR | 76 | 58.6
qMIST1_R1 | TGCTGGACATGGTCAGGATG | qPCR | | 58.7
qMYC_F1 | CCTGGTGCTCCATGAGGAGAC | qPCR | 128 | 62.0
qMYC_R1 | CAGACTCTGACCTTTTGCCAGG | qPCR | | 59.8
qPRSS1_F1 | GTTCTGTGTGGGCTTCCTTGAG | qPCR | 146 | 60.1
qPRSS1_R1 | CCTTGGTGTAGACTCCAGGCTT | qPCR | | 60.4
qAFP_F2 | AAGACATCCTCAGCTTGCTGTCTC | qPCR | 150 | 60.6
qAFP_R2 | GCTTGGCTCTCCTGGATGTATTTC | qPCR | | 59.8
qALB_F1 | TCAGCTCTGGAAGTCGATGA | qPCR | 137 | 57.0
qALB_R1 | TTCACGAGCTCAACAAGTGC | qPCR | | 57.3
qAPOA2_F1 | CAGGCCGAGGCCAAGTCTTA | qPCR | 130 | 61.7
qAPOA2_R1 | ACTGGGTGGCAGGCTGTGTT | qPCR | | 63.7
qAAT_F1 | TATGATGAAGCGTTTAGGC | qPCR | 227 | 51.0
qAAT_R1 | CAGTAATGGACAGTTTGGGT | qPCR | | 53.1
qTTR_F1 | TGGGAGCCATTTGCCTCTG | qPCR | 240 | 59.8
qTTR_R1 | AGCCGTGGTGGAATAGGAGTA | qPCR | | 59.2
qTransferrin_F1 | TGTCTACATAGCGGGCAAGT | qPCR | 199 | 57.3
qTransferrin_R1 | GTTCCAGCCAGCGGTTCT | qPCR | | 60.1
qGYS2_F1 | CCAGTGACCACGCACAACATGA | qPCR | 142 | 61.6
qGYS2_R1 | GTAAGGGACTGGTGGAGGATAG | qPCR | | 58.2
qCOL3A1_F1 | TGGTCTGCAAGGAATGCCTGGA | qPCR | 108 | 63.1
qCOL3A1_R1 | TCTTTCCCTGGGACACCATCAG | qPCR | | 60.7
qVIM_F1 | AGGCAAAGCAGGAGTCCACTGA | qPCR | 100 | 62.6
qVIM_R1 | ATCTGGCGTTCCAGGGACTCAT | qPCR | | 62.2
qFSP1_F1 | CAGAACTAAAGGAGCTGCTGACC | qPCR | 126 | 59.5
qFSP1_R1 | CTTGGAAGTCCACCTCGTTGTC | qPCR | | 59.8
qACT_F1 | CCAACCGCGAGAAGATGA | qPCR | 97 | 56.6
qACT_R1 | CCAGAGGCGTACAGGGATAG | qPCR | | 58.7
qTBP_F1 | CTGCGGTAATCATGAGGATAAGAG | qPCR | 108 | 56.4
qTBP_R1 | CTGCCAGTCTGGACTGTTCTTC | qPCR | | 59.5
qGFP_F3 | AAGGACGACGGCAACTACAA | qPCR | 77 | 58.0
qGFP_R3 | TTCAGCTCGATGCGGTTCA | qPCR | | 58.6
qAAVPolyA_F1 | GCTGGAGTGCAGTGGCACAA | qPCR | 90 | 62.8
qAAVPolyA_R1 | CCAACAACTCGGGAGGCTGA | qPCR | | 61.8

64

2.1.3. Plasmids

Table 5. The plasmids used and produced throughout the project. Each plasmid has a basic description of its purpose; all plasmids were produced during the project except those with catalogue codes.

Plasmid | Initials | Description
PCR blunt II TOPO | PB | Gateway vector (450245)
pAAV CMV MCS | C | Expression vector (VPK-410)
pAAV CMV GFP | GFP | GFP expression vector (AAV-400)
pHelper | pHelper | Helper plasmid (VPK-402)
DJ8 | DJ8 | Packaging plasmid (VPK-420)
DJ1 | DJ1 | Packaging plasmid (Can bind to heparin)
PCR blunt II TOPO PDX1 | PPB | Gateway vector containing PDX1
pAAV CMV MCS PDX1 | PC | Expressing PDX1
pAAV CMV MCS N196S PDX1 | N196S PC | Expressing N196S PDX1 (DNA binding mutant)
IMAGE clone RBPJL | IcR | RBPJL clone (HsCD00297063)
PCR blunt II TOPO RBPJL | RPB | Gateway vector containing RBPJL
pAAV CMV MCS RBPJL | RC | Expressing RBPJL
IMAGE clone MNX1 | IcM | MNX1 clone (HsCD00295487)
PCR blunt II TOPO MNX1 | MPB | Gateway vector containing MNX1
pAAV CMV MCS MNX1 | MC | Expressing MNX1
PCR blunt II TOPO NR5A2 | NRPB | Gateway vector containing NR5A2
pAAV CMV MCS NR5A2 | NRC | Expressing NR5A2
PCR blunt II TOPO FOXA2 | FPB | Gateway vector containing FOXA2
pAAV CMV MCS FOXA2 | FC | Expressing FOXA2
IMAGE clone HNF6 | IcH | HNF6 clone (IRCBp5005J1012Q)
PCR blunt II TOPO HNF6 | HPB | Gateway vector containing HNF6
pAAV CMV MCS HNF6 | HC | Expressing HNF6
PCR blunt II TOPO PTF1A | PtPB | Gateway vector containing PTF1A
pAAV CMV MCS PTF1A | PtC | Expressing PTF1A
PCR blunt II TOPO SOX9 | SPB | Gateway vector containing SOX9
pAAV CMV MCS SOX9 | SC | Expressing SOX9
PCR blunt II TOPO FOXA3 | F3PB | Gateway vector containing FOXA3
pAAV CMV MCS FOXA3 | F3C | Expressing FOXA3
PCR blunt II TOPO HNF4A | H4PB | Gateway vector containing HNF4A
pAAV CMV MCS HNF4A | H4C | Expressing HNF4A
PCR blunt II TOPO HNF1A | H1PB | Gateway vector containing HNF1A
pAAV CMV MCS HNF1A | H1C | Expressing HNF1A

2.1.4. Antibodies

Table 6. Antibodies used throughout the project. Each antibody is listed with its Western blot (WB) dilution, IF/IHC dilution and host species.

Antibody | Supplier | WB dilution | IF/IHC dilution | Host species
hPDX1 (AF2419) | R&D Systems | 1:750 | - | Goat
hPDX1 (AB47308) | Abcam | - | 1:200 | Guinea Pig
hMNX1 (81.5C10) | DSHB | 1:1000 | - | Mouse
hNR5A2 (PCRP-NR5A2-1A11) | DSHB | 1:100 | 1:10 | Mouse
hFOXA2 (AF2400) | R&D Systems | 1:2000 | - | Goat
hHNF6 (SC-376167) | Santa Cruz | 1:500 | 1:50 | Mouse
hSOX9 (AB5535) | Abcam | 1:5000 | - | Rabbit
hPRSS1 (AB219060) | Abcam | - | 1:100 | Rabbit
HA (3724) | Cell Signaling Technology | 1:5000 | - | Rabbit
TUBA4A (T9026) | Sigma-Aldrich | 1:5000 | - | Mouse


2.2. Methods

2.2.1. Bioinformatic analysis

Key genes for transdifferentiation were derived by two separate analyses. The first, LgPCA, was performed by Dr. David Gerrard. The output of this analysis was a list of key genes required for the development of the human embryo (Gerrard and Berry et al., 2016). Fifteen embryonic tissues were harvested for RNA-sequencing and these data were then analysed by LgPCA. A developmental tree was applied to the data in order to partition it into groups. Groups of co-varying genes could then be correlated, both between groups that could not be explained by developmental relation to one another (brain and pancreas) and between groups of genes derived from related lineages (pancreas and liver). Groups of genes which do not show a relationship to the input tree are found within the low PCs (usually spread across multiple organs). Each successive PC shows a greater relation to the tree (higher PCs have almost organ-specific groups). Moreover, within each PC, groups of genes have either a positive or a negative loading by correlation to the tree. PCs with relevant pancreas signal were compiled, and from these PCs a list of genes was generated. Any data derived from this pipeline are presented with the permission of the authors.

The online tool Mogrify (http://www.mogrify.net/) was also used to determine genes for transdifferentiation (Rackham et al., 2016). The starting cell type input was ‘fibroblast of lung’ and the target cell type was ‘pancreas – adult’. Any further analysis was achieved by varying the input cell type. This algorithm utilises RNA-sequencing and CAGE datasets available online, combined with the STRING database, to determine transcription factors able to induce high levels of downstream factors. From these genes, an ideal cohort with the greatest combined coverage over the regulatory network of the target cell type is generated. The two analyses were then cross-referenced against one another to create a final shortlist of genes.


Gene sequences were taken from NCBI databases (https://www.ncbi.nlm.nih.gov/) or the UCSC genome database (https://genome.ucsc.edu/). All DNA sequences were analysed in SnapGene Viewer and confirmed with the BLAT tool available on the UCSC website. Alignments were created using the online tool Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/). Primers were designed with the Thermo Fisher Tm calculator (Taq), ensuring an annealing temperature range of 53-58 °C (https://www.thermofisher.com/sg/en/home/brands/thermo-scientific/molecular-biology/molecular-biology-learning-center/molecular-biology-resource-library/thermo-scientific-web-tools/tm-calculator.html). Primers were also validated with the in silico PCR tool found on the UCSC website. Tissue expression data were referred to throughout the project; three online atlases were used (https://www.proteinatlas.org/), (https://www.ebi.ac.uk/gxa/home) and (http://biogps.org/#goto=welcome).

Transcription factor binding sites were identified with MatInspector (Genomatix). For PDX1 binding the matrix family ‘V$PDX1’ was used. For MNX1 binding the matrix family ‘V$HBOX’ was used; within that family only consensus sequences of 5’-AATT-3’ were utilised. For FOXA2 the detailed matrix information was searched for ‘FOXA2’ to determine binding regions. The 10,000 bp upstream of specific genes was tested (https://www.genomatix.de/online_help/help_matinspector/matinspector_help.html).
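As a rough illustration of the MNX1 filtering step only, the sketch below scans a sequence for the 5’-AATT-3’ core. It is not MatInspector's matrix-based scoring; the function names, the plus-strand coordinate convention and the toy sequence are assumptions made for this example. Because AATT is its own reverse complement, scanning a single strand is sufficient.

```python
# Illustrative consensus-core scan; this is NOT MatInspector's matrix algorithm.

def find_core_sites(sequence: str, core: str = "AATT") -> list[int]:
    """Return 0-based start positions of every (overlapping) match of `core`."""
    seq = sequence.upper()
    core = core.upper()
    hits = []
    start = seq.find(core)
    while start != -1:
        hits.append(start)
        start = seq.find(core, start + 1)  # step by one base to allow overlaps
    return hits


def upstream_window(chromosome_seq: str, tss_index: int, length: int = 10_000) -> str:
    """Slice up to `length` bases immediately 5' of a plus-strand TSS (hypothetical helper)."""
    return chromosome_seq[max(0, tss_index - length):tss_index]


# Toy sequence for demonstration, not a real promoter.
toy = "GGAATTCCAATTAATTGG"
print(find_core_sites(upstream_window(toy, len(toy))))  # [2, 8, 12]
```

A real analysis would still rely on the matrix family scores reported by MatInspector; the core scan only mirrors the additional restriction to AATT-containing matches.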

All estimation statistics were run using the online tool under the settings for shared control (https://www.estimationstats.com/#/).

2.2.2. Molecular cloning

A variety of vectors were made during the project (see Table 5). All genes used for reprogramming purposes were amplified by PCR using GXL polymerase (Takara). Human fetal cDNA was used as the template for amplifying the genes PDX1, NR5A2, FOXA2, PTF1A, SOX9, FOXA3 and HNF4A. IMAGE clones were purchased for the genes RBPJL (HsCD00297063), MNX1 (HsCD00295487) and HNF6 (IRCBp5005J1012Q). A cDNA ORF clone for HNF1A was purchased (OHu25248). PCR products were gel extracted and purified according to the QIAquick Gel Extraction protocol (Qiagen). Excised PCR products were ligated into the gateway vector PCR Blunt. The vectors for all target genes were digested with restriction endonucleases (New England Biolabs) and ligated into the pAAV-CMV vector (Cell Biolabs).

Plasmids were then transformed into Escherichia coli (E. coli) XL10-Gold cells (Agilent) mixed with β-mercaptoethanol. Each plasmid was added to cells at a 1:10 ratio. The plasmid and XL10-Gold cell mix was kept on ice for a minimum of 20 minutes. It was then transferred to a heat block at 42 °C for 45 seconds, before being placed back on ice for 2 minutes. 500 µl of LB was added and cells were incubated for 30 minutes at 37 °C. The cells were then plated out onto ampicillin or kanamycin containing plates, depending on the resistance conferred by the transformed plasmid. Bacterial plates were incubated overnight at 37 °C; colonies were picked the next day and grown up overnight in 1 ml LB with either ampicillin or kanamycin. All plasmids were mini-prepped or maxi-prepped in accordance with the QIAprep Spin Miniprep Kit/HiSpeed Plasmid Purification protocols (Qiagen). Resulting plasmids were quantified with a NanoDrop One (Thermo Scientific) and utilised for downstream processes or sent for sequencing (AITBiotech).

Purification of AAVs required packaging of the virus with a DJ plasmid, as this allows heparin binding. The non-heparin binding plasmid DJ8 (Cell Biolabs) was mutated at two key residues (Q587R and Q590R) to change the protein sequence to match that of DJ; I named the resulting plasmid DJ1. The QuikChange II Site-Directed Mutagenesis kit (Agilent) was used to mutate the plasmid with the primers listed in Table 4. In addition, the pAAV CMV N196S PDX1 plasmid was generated, in which a key amino acid within the DNA binding region of the PDX1 protein was mutated (N196S), perturbing its function. Finally, given the limited availability of antibodies for RBPJL, the human influenza hemagglutinin (HA) tag sequence 5' TAC CCA TAC GAT GTT CCA GAT TAC GCT 3' was cloned onto the terminus of the RBPJL gene.
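As a quick sanity check on the tag sequence quoted above, the following minimal sketch translates the 27 bases codon by codon. The codon table is deliberately restricted to the five codons that occur in this sequence, and the function name is only illustrative.

```python
# Minimal codon table: only the codons present in the HA tag sequence quoted above.
CODON_TO_AA = {"TAC": "Y", "CCA": "P", "GAT": "D", "GTT": "V", "GCT": "A"}

HA_TAG_DNA = "TAC CCA TAC GAT GTT CCA GAT TAC GCT"


def translate(dna: str) -> str:
    """Translate a spaced or unspaced in-frame DNA string into one-letter amino acids."""
    bases = dna.replace(" ", "").upper()
    if len(bases) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TO_AA[bases[i:i + 3]] for i in range(0, len(bases), 3))


print(translate(HA_TAG_DNA))  # YPYDVPDYA
```

The output is the nine-residue HA epitope, which is what the HA antibody listed in Table 6 is expected to recognise.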


2.2.3. AAV production and testing

2.2.3.1. Transfections

The transfection reagent polyethyleneimine (PEI; Polysciences) was used to introduce plasmids into human embryonic kidney 293 (HEK293) cells. PEI stocks were made by dissolving 50 mg in 50 ml of warm phosphate buffered saline (PBS) at pH 4.5, resulting in a 1 mg/ml PEI stock at pH 7.0, which was passed through a 0.22 μm syringe filter. HEKs were transfected with pHelper, DJ1 and gene of interest containing plasmids (Cell Biolabs). These plasmids allowed the production of AAVs of serotype DJ. HEKs were transfected with the three plasmids at a 1:1:1 ratio based upon vector size, mixed with 135 μg PEI (per 15 cm dish) and 170 mM NaCl in OptiMEM. The HEKs were incubated with plasmids for 48-72 hours.

2.2.3.2. AAV harvest and purification

At 48-72 hours post transfection HEKs were harvested: cells were manually detached from plates into their media with a cell scraper and spun down at 3000 g for 10 minutes. The supernatant was discarded and the cell pellet was resuspended in PBS (200 μl per 15 cm dish). Resuspended cells were transferred into microcentrifuge tubes to be lysed by freeze-thawing. AAV containing cells were placed in dry ice for 10 minutes and moved to 37 °C for 10 minutes; the process was repeated a total of four times. Lysed cells were spun down at 13200 g for 10 minutes and the supernatant containing AAVs was transferred to fresh microcentrifuge tubes. AAV stocks were then either frozen and stored at -80 °C or purified immediately.

AAV supernatants were treated with 50 units per ml benzonase nuclease (Sigma) and incubated at 37 °C for 30 minutes. Following this, AAVs were spun at 13200 g for 10 minutes and the supernatant was transferred to fresh microcentrifuge tubes. AAVs were purified through 1 ml HiTrap heparin columns (GE Healthcare) in a method adapted from McClure et al., 2011. Initially columns were washed with 10 ml 150 mM NaCl, 20 mM tris pH 8.0, and the viral supernatant was then applied to the column. Columns were washed with 20 ml 100 mM NaCl, 20 mM tris pH 8.0, and the flow through was discarded. To elute virus bound to heparin the salt concentration was increased, and all fractions from this point onwards were kept. Columns were washed with the following gradient: 1 ml 200 mM NaCl, 20 mM tris pH 8.0; 1 ml 300 mM NaCl, 20 mM tris pH 8.0; 1.5 ml 400 mM NaCl, 20 mM tris pH 8.0; 3 ml 450 mM NaCl, 20 mM tris pH 8.0; and 1.5 ml 500 mM NaCl, 20 mM tris pH 8.0. A total of 8 ml flow through was collected. These fractions were applied to a 100,000 Da molecular weight cut off filter (Merck). The concentrators were spun at 2000 rcf for 2 minutes and the flow through was discarded. This centrifugation process was repeated until all fractions had been applied to the concentrator and the final volume was roughly 250 μl. The final 250 μl was mixed with 250 μl PBS (500 μl total), aliquoted and stored at -80 °C. HiTrap heparin columns were regenerated for further use by a series of washes. Following elution, columns were washed with 10 ml 1 M NaCl, 20 mM tris pH 8.0, then 10 ml 150 mM NaCl, 20 mM tris pH 8.0, then 10 ml dH2O, and were finally washed and stored in 10 ml 20% ethanol.

2.2.3.3. AAV titration, purification and FACS analysis

To validate AAV production, HEK293 cells were exposed to GFP encoding virus taken from each stage of the purification process. The cells were incubated for 48 hours before being harvested in Accutase (Sigma-Aldrich) and FACS buffer at a 1:1 ratio. Cells were assessed for GFP fluorescence with a FACSCalibur cytometer system using the program CellQuest Pro (Becton Dickinson), and results were analysed with FlowJo Version 10. A minimum of 15,000 cells were tested for all treatments. Fluorescence was measured in the green and red channels (FL1-H and FL4-H).

To determine AAV titre the number of genome copies was measured via qPCR. Primers were designed within the poly-adenylated tail of the pAAV-CMV construct (Table 4). To produce a dilution series of standards, the pAAV-CMV poly-adenylated tail was cloned into PCRII TOPO and excised via restriction digest. This product was diluted in sonicated salmon sperm (SSS; to reduce non-specific binding; ThermoFisher Scientific) to 1.0 × 10^8 molecules/μl and serially diluted five times to 1 × 10^3 molecules/μl. To test AAV samples, 4 μl of virus was diluted in 16 μl of 1 ng/μl SSS and mixed with 20 μl AL buffer; samples were heated at 56 °C for 10 minutes, then diluted further, 1 μl in 99 μl SSS. These samples were used for qPCR and run against the standards, and the number of genome copies was calculated via regression.
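The regression step can be sketched as follows. This is a minimal illustration rather than the exact calculation used: it assumes 1 μl of template per qPCR well, treats the standards as copies per well, and uses an overall sample dilution of 1:1000 (4 μl in a final 40 μl, then 1 μl in 100 μl). The Ct values in the example are invented purely to show the arithmetic.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def genome_copies_per_ul(sample_ct, standards, dilution_factor=1000.0, template_ul=1.0):
    """Interpolate a sample Ct on the standard curve and scale back to the undiluted stock.

    standards: list of (copies per well, Ct) pairs.
    dilution_factor and template_ul are assumptions stated in the text above.
    """
    log_copies = [math.log10(copies) for copies, _ in standards]
    cts = [ct for _, ct in standards]
    slope, intercept = fit_line(log_copies, cts)
    copies_in_well = 10 ** ((sample_ct - intercept) / slope)
    return copies_in_well / template_ul * dilution_factor


# Invented, idealised standards from 10^3 to 10^8 copies (slope of -3.3 cycles per log10).
example_standards = [(10.0 ** k, 38.0 - 3.3 * k) for k in range(3, 9)]
print(genome_copies_per_ul(18.2, example_standards))  # approximately 1e9 copies per µl
```

The slope returned by `fit_line` is the same quantity from which amplification efficiency is usually derived, so the standards double as a check on primer performance.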

2.2.3.4. AAV tests

Virus tests were carried out on fetal lung fibroblasts (FLFs), which were plated out on 6 well plates at 100,000 cells per well. AAVs were added to cells at 4 points over a two day period at a multiplicity of infection (MOI) of 10,000 each time. MOI denotes the ratio of viral particles to cells; I therefore applied 10,000 AAV particles for every cell. During exposure to virus the cells were maintained in FLF media or HepG2 media depending on cell type. AAVs were first applied to cells upon passaging, then again at 6 hours post passage, then 12 hours later (18 hours post passage) and finally 24 hours after the last addition (36 hours post passage). For certain key experiments, such as transdifferentiation to liver and the final transdifferentiation to pancreatic progenitors, cells were exposed to 30 μM nocodazole (Sigma Aldrich) for approximately 2-3 hours, beginning 4-5 hours after the final viral addition. Following this, at 48 hours post passaging, FLFs were given fresh media specific to the desired cellular identity. Cells exposed to AAVs were then grown for a further 6-8 days before being harvested for analysis via qPCR or IF.
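The dosing arithmetic implied by the MOI can be sketched as below. The titre used is a hypothetical value, and the sketch assumes that viral particles are counted as the genome copies measured by qPCR.

```python
def virus_volume_ul(cells: int, moi: float, titre_gc_per_ul: float) -> float:
    """Volume of AAV stock (µl) that delivers `moi` genome copies per cell."""
    return cells * moi / titre_gc_per_ul


cells_per_well = 100_000      # FLFs plated per well of a 6 well plate
moi = 10_000                  # particles per cell at each addition
titre = 1.0e9                 # genome copies per µl; hypothetical stock titre

per_addition = virus_volume_ul(cells_per_well, moi, titre)
print(per_addition)           # 1.0 µl per addition at this assumed titre
print(4 * per_addition)       # 4.0 µl per well across the four additions
```

At an MOI of 10,000, each addition therefore requires 10^9 genome copies per well, independent of the titre of the stock.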

2.2.4. qPCR

2.2.4.1. qPCR reaction and analysis

All primers used for qPCR can be found in Table 4. RNA extractions were performed on samples following the RNeasy Mini Kit protocol (Qiagen), and RNA was quantified via NanoDrop One. 1 μg of RNA was reverse transcribed to cDNA in 20 μl reactions using the High-Capacity cDNA Reverse Transcription Kit (Applied Biosystems). Reactions were left to run for 2 hours at 37 °C, followed by a 5 second inactivation period at 85 °C. Samples were diluted with 80 μl dH2O to a total volume of 100 μl. All qPCR reaction volumes were as follows: 5 μl SYBR Select Master Mix (Applied Biosystems), 3 μl dH2O, 1 μl cDNA sample and 1 μl primer mix. All qPCR plates were analysed on a StepOnePlus System (Applied Biosystems). Relative normalised expression was calculated based upon two control genes, β-actin (ACT) and TATA-binding protein (TBP), for all plates. All primers were checked prior to use via a dilution series in order to calculate R² and efficiency; only primers with R² above 0.95 and efficiency between 90 and 110% were used. Melt curves were assessed in order to identify any primer pairs that generated multiple products. Any such primer pairs were discarded and a new pair was designed. Below is the reaction protocol used for all qPCR reactions.

Table 7. qPCR cycle setup. Reaction setup for all qPCR machines used; the denaturing and annealing steps were cycled 40 times. Time in seconds and temperature in °C.

Step | Time (s) | Temperature (°C)
Holding Stage | 600 | 95
Denaturing Stage (cycled 40X) | 15 | 95
Annealing Stage (cycled 40X) | 60 | 60
Melt Curve Stage | 1050 | 60-95 (0.75 °C per minute)

Analysis of outputted data was completed within Excel (Microsoft). The Livak method for relative gene expression was utilised (Livak and Schmittgen, 2001). Ct values were first normalised against the geometric mean Ct of ACT and TBP to generate ΔCt. Data were then made relative to a specific sample, generating ΔΔCt (this varied depending on experiment, but typically positive controls were used). The relative expression was calculated as 2^-ΔΔCt. For all experiments n = 3 or more was implemented. The only exception to this was qPCR data including human week 12 fetal pancreas, which had n = 1 due to scarcity of supply. Means and standard deviations were calculated for every sample.
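The normalisation described above can be written out as a short sketch. The dictionary layout and the Ct values are assumptions chosen so that the arithmetic is easy to follow; the analysis itself was performed in Excel.

```python
import math


def delta_ct(target_ct: float, act_ct: float, tbp_ct: float) -> float:
    """Normalise a target Ct against the geometric mean Ct of the two control genes."""
    reference_ct = math.sqrt(act_ct * tbp_ct)
    return target_ct - reference_ct


def relative_expression(sample: dict, calibrator: dict) -> float:
    """Livak 2^-ΔΔCt; each dict holds Ct values under the keys 'target', 'ACT' and 'TBP'."""
    ddct = (delta_ct(sample["target"], sample["ACT"], sample["TBP"])
            - delta_ct(calibrator["target"], calibrator["ACT"], calibrator["TBP"]))
    return 2 ** (-ddct)


# Invented Ct values: the reference genes are unchanged, the target is 3 cycles earlier.
calibrator = {"target": 25.0, "ACT": 16.0, "TBP": 25.0}   # ΔCt = 25 - 20 = 5
sample = {"target": 22.0, "ACT": 16.0, "TBP": 25.0}       # ΔCt = 22 - 20 = 2

print(relative_expression(sample, calibrator))  # 8.0, i.e. 2^3
```

A difference of three cycles in ΔCt corresponds to an eightfold change, which is why small Ct shifts translate into large differences in relative expression.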

2.2.4.2. Statistics

Statistical analysis was run within Excel. Firstly, populations were assessed for equal variance via an F-test. Secondly, if the populations had equal variance a two-sample t-test was run; if the populations had unequal variance Welch’s two-sample t-test was run. All t-tests used a two-tailed distribution unless the control population exhibited no detectable expression via qPCR, in which case a one-tailed distribution was used. Samples with no detectable expression were considered to have a Ct of 40 in order to have a numerical value to define the population. Estimation statistics are found in Appendices VII-XV.
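The test statistics behind this decision rule can be sketched with the standard library alone. This is an illustration under stated assumptions: p-values are not computed here (in practice they come from the F and t distributions, as provided by Excel), the function names are hypothetical, and undetected wells are represented by `None` before being set to a Ct of 40.

```python
import math
from statistics import mean, variance


def fill_undetected(cts, undetected_ct=40.0):
    """Replace undetected wells (None) with the Ct of 40 used in the text."""
    return [undetected_ct if ct is None else ct for ct in cts]


def f_ratio(a, b):
    """Ratio of the larger to the smaller sample variance (F statistic)."""
    va, vb = variance(a), variance(b)
    if min(va, vb) == 0:
        return math.inf  # e.g. a population made up entirely of Ct 40 values
    return max(va, vb) / min(va, vb)


def two_sample_t(a, b, equal_variance: bool):
    """Return (t, degrees of freedom) for Student's (pooled) or Welch's t-test."""
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)
    diff = mean(a) - mean(b)
    if equal_variance:
        pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
        return diff / math.sqrt(pooled * (1 / na + 1 / nb)), na + nb - 2
    sa, sb = va / na, vb / nb
    df = (sa + sb) ** 2 / (sa ** 2 / (na - 1) + sb ** 2 / (nb - 1))  # Welch-Satterthwaite
    return diff / math.sqrt(sa + sb), df


control = [1.0, 2.0, 3.0]   # invented values
treated = [2.0, 4.0, 6.0]
print(f_ratio(control, treated))
print(two_sample_t(control, treated, equal_variance=True))
print(two_sample_t(control, treated, equal_variance=False))
```

With equal group sizes the two tests give the same t value and differ only in their degrees of freedom, so the choice between them matters most for small or unbalanced groups.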

2.2.5. Cell culture

2.2.5.1. In vitro cell culture

All cell culture was carried out in a Class II biosafety cabinet. Three key cell lines were cultured: HEK293 cells, the hepatocellular carcinoma line HepG2 and FLF cells from aborted samples M196 and M197. HEK293 cells were cultured in Dulbecco’s modified Eagle’s medium – high glucose (DMEM; Gibco) with 10% fetal calf serum (FCS) and 1:100 non-essential amino acids (NEAA; Gibco). The HEK cells were passaged every 3-4 days at 1:6-1:10 depending on needs. FLFs were grown in DMEM – high glucose with 10% FCS, 2 mM L-glutamine (ThermoFisher Scientific) and 1:1000 penicillin-streptomycin (pen-strep; Gibco). Cell stocks for FLFs were kept at passage 6 and 7, and FLFs were grown to a maximum of passage 16-17. HepG2 cells were grown in DMEM – low glucose with 10% FCS, 2 mM L-glutamine and 1:1000 pen-strep; FLFs were also cultured in this media depending on experiment.

FLFs transdifferentiated towards pancreatic lineages were maintained in media adapted from Bonfanti et al., 2015, consisting of advanced DMEM with nutrient mixture F12 (AdDMEM/F12; Gibco), 5% FCS, 2 mM L-glutamine, 1:1000 pen-strep, 50 ng/ml FGF10 (Source Bioscience), 50 ng/ml EGF (R&D Systems), 100 ng/ml noggin (Peprotech), 10 mM nicotinamide (Sigma-Aldrich), 100 nM exendin 4 (Peprotech) and 10% R-Spondin media kindly donated by the Barker lab (A*STAR, Singapore). FLFs transdifferentiated towards hepatocytes were cultured in media adapted from that of Huang et al., 2014, consisting of DMEM low glucose with 5% FCS, 40 ng/ml transforming growth factor α (TGFα; ThermoFisher Scientific), 50 ng/ml EGF, 30 nM dexamethasone (STEMCELL Technologies), 544 ng/ml ZnCl2 (Sigma-Aldrich), 750 ng/ml ZnSO4·7H2O (Sigma-Aldrich), 200 ng/ml CuSO4·5H2O (Sigma-Aldrich), 25 ng/ml MnSO4 (Sigma-Aldrich), 1:100 NEAA, 2 mg/ml galactose (Sigma-Aldrich), 7 ng/ml insulin (Abcam) and 15 ng/ml (Sigma-Aldrich).

In addition, the human stem cell line H9 was cultured; cells were maintained in mTeSR™1 media (STEMCELL Technologies). Plates were coated in Matrigel diluted 1:50 in AdDMEM/F12 for 3-4 hours prior to plating. 1 million cells per well were plated out (12 well plate, 500 µl media per well). H9 cells were also differentiated towards pancreatic progenitors using the STEMCELL Technologies kit (Cat#05120). One minor change was made to the protocol: cells were exposed to one additional day of supplement 1B media, making the entire length of the protocol 15 days (Table 8).

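Table 8. STEMCELL Technologies pancreatic differentiation timetable. Supplement given and corresponding basal medium for each day shown, supplements given at 1/100 the volume of basal media.

Day | Stage | Supplements | Basal Medium
0 | Plating (mTeSR + ROCKi) | |
1 | Stage 1: Definitive Endoderm | 1A+1B | Stage 1 Basal Medium
2 | Stage 1: Definitive Endoderm | 1B | Stage 1 Basal Medium
3 | Stage 1: Definitive Endoderm | 1B | Stage 1 Basal Medium
4 | Stage 2: Primitive Gut Tube | 2A+2B | Stage 2-4 Basal Medium
5 | Stage 2: Primitive Gut Tube | 2B | Stage 2-4 Basal Medium
6 | Stage 2: Primitive Gut Tube | 2B | Stage 2-4 Basal Medium
7 | Stage 3: Posterior Foregut | 3 | Stage 2-4 Basal Medium
8 | Stage 3: Posterior Foregut | 3 | Stage 2-4 Basal Medium
9 | Stage 3: Posterior Foregut | 3 | Stage 2-4 Basal Medium
10 | Stage 4: Pancreatic Endoderm | 4 | Stage 2-4 Basal Medium
11 | Stage 4: Pancreatic Endoderm | 4 | Stage 2-4 Basal Medium
12 | Stage 4: Pancreatic Endoderm | 4 | Stage 2-4 Basal Medium
13 | Stage 4: Pancreatic Endoderm | 4 | Stage 2-4 Basal Medium
14 | Stage 4: Pancreatic Endoderm | 4 | Stage 2-4 Basal Medium
15 | Stage 4: Pancreatic Endoderm | 4 | Stage 2-4 Basal Medium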

Cells were taken from the endpoint of this experiment and maintained in a culture media described by Trott et al., 2017. The media was made up in AdDMEM/F12 (Table 6): 50 ng/ml FGF10 (Source Bioscience), 50 ng/ml EGF (R&D Systems), 30 nM dexamethasone (STEMCELL Technologies), 3 mM RA (Sigma-Aldrich), 10 mM SB431542 (Calbiochem), 1 mM DAPT (Sigma-Aldrich), 2 mM L-glutamine (ThermoFisher Scientific), 100 U/mL penicillin/streptomycin (ThermoFisher Scientific), 1X N2 supplement (ThermoFisher Scientific) and 1X B27 supplement (ThermoFisher Scientific).


2.2.5.2. Human tissue collection

The collection and use of human embryos and fetal samples were carried out with ethical approval from the North West Research Ethics Committee under the Codes of Practice of the UK Human Tissue Authority and legislation of the U.K. Human Tissue Act 2008. Embryos and fetuses were consented for collection from medical and surgical terminations of pregnancy.

2.2.5.3. Organoid cell culture

Isolated pancreas cultures were carried out as described in Bonfanti et al., 2015 with some minor modifications. Embryonic and fetal samples were collected from 8 to 11 wpc. The pancreas was excised from each sample and treated with collagenase IV (250 U/ml) in Hank’s balanced salt solution (HBSS; Sigma-Aldrich) at 37 °C for 20 minutes to remove any residual mesenchyme. Cells were washed multiple times in AdDMEM/F12, 10% FCS, 1:1000 pen-strep and mechanically dissociated. The cells were spun down at 1000 g for 3 minutes and the pellet was resuspended in 100-200 μl of media depending on age. Once resuspended, 10 μl of cells were mixed with 20 μl of Matrigel (Sigma-Aldrich) and seeded onto either non-coated 6-well plates (6 colonies per well) or 96-well plates (1 colony per well). The Matrigel was allowed to set for 30 minutes at 37 °C.

Matrigel colonies were bathed in media (2.5 ml per well on 6-well plates and 150 μl per well on 96-well plates). The culture medium was made up of the following components in AdDMEM/F12 (Table 6): 500 ng/ml R-Spondin (Peprotech), 100 ng/ml FGF10 (Peprotech), 50 ng/ml EGF (Peprotech), 10 μM Rho-associated protein kinase (ROCK) inhibitor (Y-27632; Sigma-Aldrich), 100 ng/ml noggin, 100 nM gastrin (Sigma-Aldrich), 10 mM nicotinamide, 1 mM N-acetylcysteine (Sigma-Aldrich), 100 nM exendin 4, 1:100 N2 (Gibco), 1:50 B27 (Gibco) and 1:1000 pen-strep (Sigma-Aldrich). Cells were expanded for one week with the medium changed every 2 days.


2.2.6. Western blot

Cell samples were harvested in radioimmunoprecipitation assay (RIPA) buffer (Thermo Scientific). Cold RIPA buffer was applied to cells. Cells were scraped off into the buffer and incubated on ice for 5 minutes. After this the solution was spun down at 13200 g for 15 minutes. The supernatant was retained for analysis. To determine protein concentration a set of bovine serum albumin (BSA) standards was produced at 0.05, 0.1, 0.15, 0.2 and 0.25 mg/ml BSA. 10 µl of 30X diluted protein samples, 30X diluted RIPA buffer alone and the BSA standards were placed onto a flat-bottom 96 well plate, each in triplicate. This was followed by 200 µl of 5X diluted Bradford Reagent (Bio-Rad). Absorbance at 595 nm was then measured via a microplate reader (SpectraMax), allowing the derivation of protein concentration.
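One way the derivation of protein concentration could be carried out is sketched below. It assumes that the diluted RIPA-only wells serve as the blank for the samples and that only the slope of the BSA standard curve is used; the absorbance values are invented, and the function names are illustrative.

```python
from statistics import mean


def standard_curve_slope(concentrations, absorbances):
    """Least-squares slope of A595 against BSA concentration (mg/ml)."""
    mean_c = mean(concentrations)
    mean_a = mean(absorbances)
    sxx = sum((c - mean_c) ** 2 for c in concentrations)
    sxy = sum((c - mean_c) * (a - mean_a) for c, a in zip(concentrations, absorbances))
    return sxy / sxx


def protein_mg_per_ml(sample_a595, blank_a595, slope, dilution=30.0):
    """Blank-correct the mean of triplicate readings and scale by the 30X dilution."""
    net = mean(sample_a595) - mean(blank_a595)
    return net / slope * dilution


bsa = [0.05, 0.10, 0.15, 0.20, 0.25]          # mg/ml, as in the text
a595 = [0.40, 0.50, 0.60, 0.70, 0.80]         # invented mean readings per standard
slope = standard_curve_slope(bsa, a595)

stock = protein_mg_per_ml([0.52, 0.52, 0.52], [0.32, 0.32, 0.32], slope)
print(stock)          # approximately 3.0 mg/ml in the undiluted lysate
print(25.0 / stock)   # µl of lysate containing 25 µg of protein
```

The final line gives the lysate volume needed for the 25 μg loading used for SDS-PAGE.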

SDS-PAGE was run on 12.5% polyacrylamide gels to identify proteins. 4x SDS loading buffer was mixed with protein samples at a 1:5 ratio and heated at 95 °C for 2-5 minutes. 10 µl samples (25 μg protein) were loaded onto gels and electrophoresis was performed at room temperature for 60-75 minutes at 120 V in SDS-PAGE running buffer, until the dye front reached the bottom of the gel tank.

Gels were blotted onto Amersham Hybond PVDF membrane (GE Healthcare) in transfer buffer at 30-40 V for 90 minutes. Once transferred, the membrane was bathed in blocking buffer and left shaking overnight at 4 °C. The membrane was then placed in fresh blocking buffer along with the respective primary antibody (Table 6) and incubated for an hour at room temperature. The membrane was rinsed three times for 2-3 minutes in TBST. The membrane was then incubated in blocking buffer with the respective secondary antibody (HRP) for an hour before being rinsed again with TBST. The membrane was soaked in ECL fluid according to the kit’s instructions (Thermo Scientific). Chemiluminescence was detected via Amersham Hyperfilm (GE Healthcare), which was applied to the membrane and subsequently developed for 1-20 minutes, depending on signal.


2.2.7. Immunofluorescence

Once cells on a 12 well plate reached the desired point, the media was removed and cells were washed twice with PBS. Following the washes, 500 µl of 4% paraformaldehyde fixative solution was added to each well and the cells were incubated for 20 minutes at room temperature. Fixative was removed and cells were washed twice more with PBS. The cells were then washed twice in 500 µl washing buffer; once this was removed, 500 µl blocking buffer was applied to the cells. The cells were incubated for 1 hour at room temperature. Blocking buffer was removed and a fresh 500 µl of blocking buffer was added along with the required amount of primary antibody. Cells were left shaking overnight at 4 °C. The next day primary antibody was removed, cells were washed three times with wash buffer, and 500 µl blocking buffer was applied along with the required secondary antibody at 1:500 dilution. Cells were incubated at room temperature in the dark on a shaker. The secondary antibody was removed and cells were washed three times with wash buffer. After the washes, 500 µl PBS and one drop of NucBlue (ThermoFisher) were added to each well, and the cells were incubated at room temperature for 15 minutes in the dark. Cells were rinsed with PBS and washed twice with PBS. Immunofluorescence was imaged with an Olympus FV1000 Inverted microscope.

2.2.8. RNAscope

All RNAscope experiments were run on a Leica Bond RX machine (run programmed by Dr. Kim Su and Dr. Elliot Jokl). Probes were obtained from Advanced Cell Diagnostics (ACD) for PDX1 (437088), PTF1A (524428) and RBPJL (584328-C3). These probes were tested on a CS18 embryo. The embryo was fixed in 4% PFA, embedded in paraffin wax and sectioned through the sagittal plane. All sectioned slides were kept at 4 °C prior to testing. Upon completion of the automated RNAscope procedure, slides were imaged with a Zeiss Axio Imager A1 microscope for non-fluorescent probes and a Zeiss Axio Imager upright fluorescence microscope for fluorescent probes.

2.2.9. Immunohistochemistry

As with RNAscope, immunohistochemistry was tested upon sagittal sections of a fixed CS18 embryo. Sections were rehydrated, beginning with two 3 minute xylene washes, a 2 minute 100% ethanol wash and then a 2 minute 90% ethanol wash, followed by rinsing in water. Sections were bathed in PBS with 0.3% H2O2 (Sigma-Aldrich) for 20 minutes. Sections were then washed for 5 minutes, three times, in PBS. The sectioned slides were then boiled in sodium citrate pH 6.0 for 10 minutes and kept in warm sodium citrate for a further 20 minutes. A wax border was prepared around the embryo section and the sections were once again washed for 5 minutes, three times, in PBS. Primary antibody for PDX1 and NR5A2 was diluted as indicated in Table 6 into PBS with 0.1% Triton X 100 and 3% goat serum. 100 µl of primary antibody mixture was applied to sections. Sections were incubated at 4 °C overnight in a humidified container.

The following day sections were first washed for 5 minutes, three times, in PBS. Biotinylated IgG secondary antibodies were diluted at 1:200 (anti-mouse, BA-2000; anti-guinea pig, BA-7000) in PBS. Secondary antibodies were applied to slides and incubated for 2 hours in a humidified container at room temperature. Once again sections were washed three times, each for 5 minutes, in PBS. Next, 100 µl PBS 0.1% Triton X 100 with the addition of streptavidin-HRP (Vector Labs) at 1:200 dilution was applied to sectioned slides. The sections were incubated for an hour in a humidified container at room temperature, followed by another three 5 minute PBS washes. Colour detection was achieved by the application of 3,3'-diaminobenzidine (DAB) substrate. DAB was mixed at a 1:10 ratio into peroxide buffer (ThermoFisher Scientific), with 100 µl applied per section. Colour was left to develop for 10-30 minutes; once it was visible, sections were rinsed in PBS. Sectioned slides were then counterstained in toluidine blue for 2 minutes. Sections were rinsed in water and dehydrated by 10 seconds in 70% ethanol, 10 seconds in 90% ethanol, 3 minutes in 100% ethanol and two 2 minute xylene washes. Finally, sections were mounted in Entellan (Millipore) and a coverslip (Deltalab) was placed atop the section. Sections were imaged with a Zeiss Axio Imager A1 microscope.


3. Results

3.1. LgPCA and Mogrify identify a shared cohort of genes active in specifying the pancreas

Initially, work focused upon the generation of target genes for utilisation in transdifferentiation. As stated previously, the Hanley lab’s privileged access to human embryonic and fetal tissue allowed the derivation of genes specifically active in shaping early human development. Here I display highly sought-after data derived by Dave Gerrard, taken from Gerrard and Berry et al., 2016. PCs with signal relating to the pancreas were compiled and then, depending on positive or negative loading, their genes were filtered to generate a definitive shortlist. To rank genes output from relevant PCs, the enrichment of transcription factors within the pancreas was compared to the mean of all other organs (Fig. 5A; Fig. 6A). The top ten transcription factors from this list were taken forward for further study. These were: RBPJL, PDX1, MNX1, HNF6, Nirenberg and Kim homeobox factor 2.3 (NKX2.3), ETS homologous factor (EHF), NR5A2, zinc finger protein 469 (ZNF469), T-cell leukemia homeobox protein 2 (TLX2) and PTF1A. To validate this list, additional analysis was undertaken using the online tool Mogrify. Mogrify provided a bar chart indicating the top eight transcription factors with the highest percentage regulatory coverage in the pancreas: X-box binding protein 1 (XBP1), HNF6, NR5A2, FOXA2, GATA4, PTF1A, RBPJL and hairy/enhancer-of-split related with YRPW motif-like protein (HEYL). Supplementary data generated alongside this chart gave the next two factors, phospholipase A2 group IB (PLA2G1B) and MNX1, resulting in two separate lists of ten potential genes (Fig. 6B). These lists were arbitrarily capped at ten genes in order to have a feasible list of genes to work with.

To compile these genes into one comprehensive list each gene was ranked, 1 being the highest and 10 being the lowest, any gene shared across both lists would tally their rank (Fig. 6C). Genes shared across both analyses were expectedly found towards the top of the list given their commonality (HNF6,

RBPJL, NR5A2, MNX1 and PTF1A). The final list comprised of 15 genes, many of which are known factors characterised during pancreas development (Fig. 1). In order to reduce the list down to the most

79 applicable factors additional analysis of their expression kinetics during in vitro differentiation was undertaken.
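The tallying of the two ranked lists can be illustrated with a short sketch. It rests on two assumptions that are not stated explicitly in the text: that the order in which the genes are listed above reflects their rank within each analysis, and that a rank is converted to points as 11 minus the rank before the points are summed. The function names are hypothetical and the exact tallying used for Fig. 6C may differ.

```python
# Gene order is copied from the text; treating that order as rank 1..10 is an assumption.
LGPCA = ["RBPJL", "PDX1", "MNX1", "HNF6", "NKX2.3",
         "EHF", "NR5A2", "ZNF469", "TLX2", "PTF1A"]
MOGRIFY = ["XBP1", "HNF6", "NR5A2", "FOXA2", "GATA4",
           "PTF1A", "RBPJL", "HEYL", "PLA2G1B", "MNX1"]


def tally_ranks(*ranked_lists):
    """Sum rank-derived points per gene (assumed scheme: rank 1 of 10 scores 10 points)."""
    scores = {}
    for ranked in ranked_lists:
        n = len(ranked)
        for position, gene in enumerate(ranked, start=1):
            scores[gene] = scores.get(gene, 0) + (n + 1 - position)
    return scores


def shared_genes(first, second):
    """Genes present in both lists, sorted alphabetically."""
    return sorted(set(first) & set(second))


scores = tally_ranks(LGPCA, MOGRIFY)
# Highest combined score first; ties are broken alphabetically for reproducibility.
ranking = sorted(scores, key=lambda gene: (-scores[gene], gene))
shared = shared_genes(LGPCA, MOGRIFY)
```

Under these assumptions the combined list contains 15 genes and the factors shared by both analyses accumulate the highest scores, which is consistent with the description above.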

Figure 6. LgPCA and Mogrify analysis of transcription factors for transdifferentiation. (A) Ranked 10 highest fold enriched transcription factors of the pancreas over all other organs in common logarithmic scale determined by LgPCA. (B) 10 highest ranked regulatory transcription factors by percentage coverage determined by Mogrify. NA = not applicable, due to data omission from graphical analysis but described as next two factors to use. (C) A compiled list of transcription factors from both analyses tallied and ranked. Factors highlighted in red are conserved across both analyses.


3.2. In vitro differentiation allows the derivation of the crucial factors for transdifferentiation

A large cohort of the transcription factors active in pancreas development were tested along with genes outputted from the bioinformatic analyses. I aimed to elucidate the expression patterns of genes for transdifferentiation but also those active across pancreas development. This would not only help me in my choice of factors but also provide a deeper understanding of pancreas development. The Dunn laboratory is well equipped and experienced in the handling of hPSCs. Through the utilisation of the 15 day SCT pancreatic progenitor in vitro differentiation protocol, hPSCs were robustly differentiated to the beginning of endocrine specification. This model system, as discussed, is a valuable tool not only for pancreatic cell production but also for the study of pancreas development itself. The differentiation employed here can be summarised as having four stages. Stage 1: definitive endoderm (day 1-3), stage 2: primitive gut tube (day 4-6), stage 3: posterior foregut (day 7-9) and stage 4: pancreatic endoderm (day 10-15). Cells were harvested at each day of the 15 day protocol (including day 0) across three biological replicates. Samples were then tested via qPCR and mRNA expression was determined relative to D15 expression levels and normalised to ACT and TBP. Day by day images of cells can be found in Appendix I.
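The normalisation described above can be sketched as follows. This is a minimal illustration assuming the comparative Ct (2^-ΔΔCt) method with 100% amplification efficiency and the mean Ct of the two reference genes as the normaliser; the function name and the Ct values in the usage note are hypothetical and are not data from this thesis.

```python
from statistics import mean


def relative_expression(ct_target, ct_refs, ct_target_cal, ct_refs_cal):
    """Fold expression of a target gene relative to a calibrator sample.

    ct_target / ct_refs: Ct of the target and of the reference genes in the sample.
    ct_target_cal / ct_refs_cal: the same measurements in the calibrator (here, day 15).
    Averaging reference Ct values is equivalent to taking the geometric mean
    of the reference gene quantities.
    """
    delta_sample = ct_target - mean(ct_refs)
    delta_calibrator = ct_target_cal - mean(ct_refs_cal)
    return 2 ** -(delta_sample - delta_calibrator)
```

For example, a hypothetical sample with a target Ct of 24.0 and reference Cts of 18.0 and 22.0, compared with a day 15 calibrator with a target Ct of 22.0 and the same reference Cts, gives `relative_expression(24.0, [18.0, 22.0], 22.0, [18.0, 22.0])`, which evaluates to 0.25, i.e. a quarter of the day 15 level.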

Several of the first factors expressed were those required for early development: SOX17, FOXA2, FOXA3, GATA4, GATA6, HNF4A and HNF1A (Fig. 7A-E, J, K). SOX17 expression is marked by a short burst of expression from day 2-5 (Fig. 7A). FOXA2 and FOXA3 both initiate expression early on and are maintained throughout the 15 day course. They also exhibited a marginally increased expression during the primitive gut tube stage (Fig. 7B, C). Intriguingly, the GATA factors have seemingly opposing expression kinetics. GATA4 has one bifurcated expression peak at day 6-8, which is flanked on either side by expression peaks of GATA6 at day 4 and day 10 respectively (Fig. 7D, E). HNF4A and HNF1A have steadily increasing expression during differentiation, beginning at day 3 and 4 respectively (Fig. 7J, K). Moreover, the related transcription factors mimic this pattern; HNF1B initiates expression on day 3 and HNF6 at day 7 (Fig. 7H, I). A further two factors with analogous expression to this are PDX1, starting at day 7, and SOX9 at day 8, each progressively increasing over time (Fig. 7F, L). The temporal expression of MNX1 is of interest as it first peaks at day 4 before decreasing to negligible levels, only to return and continue expression from day 8 onwards (Fig. 7G). Likewise, HES1 expression spikes at day 2 before declining, only to gradually increase during differentiation (Fig. 7N). The two key factors in the induction of endocrine specification, NKX6.1 and NGN3, both peak towards the latter end of differentiation, at day 13 and day 9 respectively (Fig. 7M, O). Lastly, the three exocrine transcription factors all seem to be expressed in parallel to endocrine specification. NR5A2 is first to be expressed at day 7, followed by PTF1A at day 8 and finally increasing RBPJL from day 9 (Fig. 7P-R). It should be noted that RBPJL expression was quite weak compared to other genes. Interestingly, the last two spike in expression whilst NR5A2 is maintained to the endpoint of differentiation.

The three key markers of pancreatic progenitors, PDX1, PTF1A and SOX9, all seem to commence expression around the same time, at day 7-8 (Fig. 7F, L, P). PTF1A only has a small window of expression with its peak at day 10. Given this, transcription factors capable of transdifferentiating to pancreatic progenitors must be expressed during this time or in the preceding period. Furthermore, one would expect that a factor directly involved in specifying pancreatic progenitor fate would increase or at least maintain expression as differentiation advances. With these criteria in mind I then established which of the 15 genes from the prior analysis would be most applicable in transdifferentiation (Fig. 6C).
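The two selection criteria above can be expressed as a simple filter. The sketch below is illustrative only: the function name, the detection threshold, the onset cut-off and the fraction of peak expression that counts as maintained are all assumptions rather than values used in this thesis, and the example profiles are invented, not measured data.

```python
def fits_criteria(profile, onset_by=8, detect=0.1, keep_fraction=0.5):
    """Check a day -> expression profile (relative to day 15) against both criteria.

    Criterion 1: expression is first detected on or before `onset_by`.
    Criterion 2: expression at the final day is at least `keep_fraction` of the peak,
    i.e. it is maintained or increasing rather than transient.
    """
    days = sorted(profile)
    onset = next((day for day in days if profile[day] >= detect), None)
    if onset is None or onset > onset_by:
        return False
    peak = max(profile.values())
    return profile[days[-1]] >= keep_fraction * peak


# Invented profiles for illustration only (expression relative to day 15).
maintained = {0: 0.0, 4: 0.0, 7: 0.2, 10: 0.6, 15: 1.0}
transient = {0: 0.0, 4: 0.0, 7: 20.0, 10: 4.0, 15: 1.0}
undetected = {0: 0.0, 4: 0.0, 7: 0.0, 10: 0.0, 15: 0.0}
```

With these defaults the maintained profile passes, whereas the transient and undetected profiles are rejected. In practice the thresholds would need to be chosen by inspection of the qPCR data, as was done here.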


Figure 7. Day by day expression during SCT differentiation of genes active during pancreas development and from LgPCA/Mogrify analysis. (A-X) Day by day gene expression from day 0 (D0) to day 15 (D15). Gene of interest is indicated above each graph. Expression is normalised by ACT and TBP genes and relative to day 15. Each day has n = 3 biological replicates; the mean and standard deviation are shown for each.

My aim was to reduce the list to half its original size, so as to have a workable set of factors for cloning. Starting with the lowest ranked genes, PLA2G1B, TLX2 and ZNF469 all had no expression detected throughout in vitro differentiation. Consequently, none of these genes were utilised for transdifferentiation. HEYL expression decreased as differentiation progressed, so it was not considered (Fig. 7V). EHF, on the other hand, increased expression over time and so was put onto the list, provided that other factors above it were not more applicable (Fig. 7U). The expression pattern of NKX2.3 is unique, defined by a drastic peak at day 7-8 before plummeting away (Fig. 7T). As mentioned, GATA4 is expressed as a forked peak; both GATA4 and NKX2.3 were active at the correct period but lost expression over time and so were not tested further (Fig. 7D). The FOXA2, PDX1 and MNX1 expression patterns all fit the set criteria of being expressed at the right time period and maintaining expression over time, and so these factors were chosen. PTF1A was chosen as it was output by both analyses and is a defining gene in pancreas development. Above them was XBP1 which, whilst present, seemed to be expressed at a fixed level during differentiation; because of this it was not used (Fig. 7S). The final three factors, NR5A2, RBPJL and HNF6, also fit the conditions set and so were utilised. Given my desire to cut the list down, EHF was not included, giving me 7 factors to test downstream. Lastly, as both PDX1 and PTF1A were included in this list I chose to include SOX9. This was done not only to assess the effect of its overexpression in generating pancreatic progenitors, given its importance, but also as an additional control. SOX9 was not featured in either of the data analyses. Accordingly, I expected SOX9 to have minimal ability to influence transdifferentiation. The final list of genes used in transdifferentiation was: PDX1, RBPJL, MNX1, NR5A2, FOXA2, HNF6, PTF1A and SOX9. Further experimentation was designed to reduce these genes down further to an ideal trio of factors.


3.3. The generation and validation of a reliable AAV production protocol

The production of functional and pure AAVs was a stepwise procedure summarised in Fig. 8A. In the first instance each gene was PCR amplified from IMAGE clones or human pancreas cDNA and cloned into the gateway plasmid PCR Blunt. Through the use of specific restriction enzymes each gene was then transferred into the final pAAV CMV MCS vector. This vector allowed for the expression of target genes under the control of the cytomegalovirus (CMV) promoter. Alongside these vectors two additional control plasmids were used, the pAAV CMV GFP and DNA binding mutant pAAV CMV N196S PDX1.

HEK293 cells were transfected with the packaging plasmid (DJ1), which allowed for heparin binding, the helper plasmid supplying additional adenoviral genes (pHelper) and finally the pAAV CMV MCS with a specific target gene inserted, for example PDX1. After 2 to 3 days of incubation HEK293 cells were harvested, lysed through freeze-thaw and exposed to benzonase nuclease to degrade any remaining plasmid DNA. The resulting solution contained both AAVs and cellular debris. In order to purify the viruses I modified a method developed by McClure et al., 2011; this utilised HiTrap heparin columns to bind AAVs, allowing the removal of cellular debris. AAVs were applied to the column and bound to the heparin; debris was removed with low salt washes. A successive increase in salt allowed the elution of AAVs. The eluate was then applied to a molecular weight cut off filter to simultaneously remove salt and increase AAV concentration.


Figure 8. A summary of AAV production and validation. (A) A step by step cartoon summary of the pipeline used to generate AAVs. (B) FACS data for untreated HEK293 cells, the percentage of cells present in each quadrant is indicated. (C) 1/1000 crude lysate treatment. (D) 1/1000 heparin column flow through treatment. (E) 1/1000 heparin column eluate treatment. (F) 1/1000 filter flow through treatment. (G) 1/1000 purified and filtered virus treatment. (H) An example qPCR standard curve utilised to determine genome copies of virus.

To validate this approach I utilised the GFP AAV. Samples from key stages of GFP AAV purification were taken: (1) crude lysate following freeze-thaw lysis and benzonase nuclease treatment, (2) heparin column flow through prior to elution, (3) heparin column eluate from high salt washes (purified AAVs), (4) molecular weight cut off filter flow through, and (5) retained solution from molecular weight cut off filter (purified and concentrated AAVs). These solutions were then applied to HEK293 cells for two days at 1/1000 dilution. Cells were then harvested for FACS analysis. Untreated HEK293 cells exhibited no fluorescence (Fig. 8B). Following crude lysate treatment 68.3% of cells were GFP positive (Fig. 8C). Cells exposed to heparin column flow through also had no GFP (Fig. 8D). The heparin eluate had reduced GFP expression compared to the crude lysate at 21.9% (Fig. 8E). The flow through from the filter exhibited no fluorescence (Fig. 8F). The final solution acquired from the molecular weight cut off filter was able to generate 84.7% GFP positive cells (Fig. 8G), an increase of 16.4 percentage points over the initial crude lysate. Moreover, this solution had none of the impurities and debris found within crude lysate. The resulting virus was then titred via qPCR employing a standard curve to calculate genome copies (Fig. 8H).

The number of virus particles was determined through qPCR; to calculate the MOI, the number of starting cells was then counted. Tests were run to establish an ideal MOI for viral application. GFP AAVs were applied to FLFs at MOIs of 1,000, 10,000 and 100,000. An MOI of 10,000 was found to be an optimal balance between infection and amount of virus utilised (Fig. 9), as it was able to induce GFP expression in 97% of cells (Fig. 9C). To infect cells for transdifferentiation I used an MOI of 10,000 (this MOI was employed across all cell types), applied four times over a 2 day period.
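The titre and MOI calculation can be sketched as below. This is a generic illustration rather than the exact calculation used here: it assumes a linear standard curve of Ct against log10 genome copies fitted by ordinary least squares, and it takes MOI to mean genome copies applied per cell. The function names and the standard values in the usage note are hypothetical.

```python
def fit_standard_curve(log10_copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    n = len(log10_copies)
    mean_x = sum(log10_copies) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_copies, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate genome copies in a reaction."""
    return 10 ** ((ct - intercept) / slope)


def virus_volume_ul(moi, n_cells, titre_gc_per_ul):
    """Volume of virus stock (in µl) needed to apply `moi` genome copies per cell."""
    return moi * n_cells / titre_gc_per_ul


# Hypothetical standards: tenfold dilutions from 10^3 to 10^7 copies per reaction.
slope, intercept = fit_standard_curve([3, 4, 5, 6, 7], [30.0, 26.7, 23.4, 20.1, 16.8])
```

With these invented standards the fitted slope is approximately -3.3 cycles per tenfold dilution, and a sample Ct of 23.4 corresponds to roughly 10^5 copies. As a usage example, applying an MOI of 10,000 to 50,000 cells from a stock of 10^9 genome copies per µl would require `virus_volume_ul(10_000, 50_000, 1e9)`, i.e. 0.5 µl.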


Figure 9. GFP expression following varying MOIs of GFP AAV. (A) FACS data for untreated FLF cells, the percentage of cells present in each quadrant is indicated. (B) GFP AAV treated FLFs, MOI 1,000. (C) GFP AAV treated FLFs, MOI 10,000. (D) GFP AAV treated FLFs, MOI 100,000.

The generated AAVs were tested for their ability to generate their target protein in FLF cells. Western blots for PDX1, RBPJL (HA), MNX1, NR5A2, FOXA2, HNF6, SOX9 and PTF1A were run. TUBA4A (αTUB) was used as an input control. All constructs demonstrated reliable expression except that of PTF1A (Fig. 10). Multiple antibodies were tested for PTF1A; however, none could detect expression in either AAV treated samples or samples obtained from in vitro differentiation.

Figure 10. Protein analysis of AAV induced expression. (A) GFP AAV exposed FLF cells, scale bar 100 µm. (B-H) Western blot analysis with target protein and TUBA4A. (B) N196S and PDX1 AAV exposed FLF cells (PDX1 antibody). (C) RBPJL exposed FLF cells (HA antibody). (D) MNX1 exposed FLF cells (MNX1 antibody). (E) NR5A2 exposed FLF cells (NR5A2 antibody). (F) FOXA2 exposed FLF cells (FOXA2 antibody). (G) HNF6 exposed FLF cells (HNF6 antibody). (H) SOX9 exposed FLF cells (SOX9 antibody).


3.4. PDX1 AAV is able to repress hepatic genes in the HepG2 cell line

Previous work carried out by Dunn and colleagues found that overexpression of PDX1 was able to repress hepatic gene expression (Teo et al., 2015). There, the group applied lentivirus encoding PDX1 to the HepG2 hepatocellular carcinoma cell line. They noted a significant decrease in expression of crucial hepatic genes: alpha fetoprotein (AFP), albumin (ALB), transthyretin (TTR) and apolipoprotein A-II (APOA2). To validate the ability of the newly generated AAVs to express their encoded genes I applied them to HepG2 cells in the same manner. Along with PDX1 AAV, the HepG2 cells were also individually exposed to GFP AAV and N196S PDX1 AAV as negative controls, all at an MOI of 10,000.

HepG2 cells were cultured for 8 days following the addition of each AAV (Fig. 11A). Samples were harvested and analysed with qPCR. HepG2 cells exposed to N196S PDX1 AAV and PDX1 AAV each elicited high PDX1 expression compared to untreated and GFP AAV treated cells, as anticipated (Fig. 11B). Following PDX1 AAV treatment all four hepatic genes were significantly downregulated compared to GFP AAV treatment, as observed in Teo et al., 2015. Additionally, AFP, ALB and TTR were all significantly downregulated when compared to untreated HepG2 as well (Fig. 11C-F). Given that the results for HepG2 PDX1 AAV exposure resembled those of HepG2 PDX1 lentiviral exposure, I was confident AAVs provided an effective vehicle for overexpression. However, I wished to further extend my validation to assess the ability of AAVs to transdifferentiate cells.


Figure 11. PDX1 AAV exposure to HepG2 cells. (A) HepG2 cells at day 8 of culture, scale bar 100 µm. (B-F) Gene expression of untreated HepG2 cells, GFP AAV treated HepG2 cells, N196S PDX1 AAV treated HepG2 cells and PDX1 AAV treated HepG2 cells, all at day 8. Gene of interest is indicated above each bar chart. Expression is displayed in common logarithmic scale, normalised by ACT and TBP genes and relative to HepG2 untreated samples. Each sample has n = 3 biological replicates; the mean and standard deviation are shown for each, t-test (*P < 0.05, **P < 0.005, ***P < 0.0005).


3.5. Combined exposure of key transcription factors via AAV delivery can induce endogenous expression of tissue specific genes

To validate AAV mediated transdifferentiation I chose to reproduce work by Huang et al., 2014. In that study lentiviruses expressing HNF4A, HNF1A and FOXA3 (HHF) were utilised to transdifferentiate human fetal fibroblasts to generate what they dubbed human induced hepatocytes (hiHeps). These factors and target cell type were chosen as much work has already been described in the transdifferentiation to β cells. However, no work has yet validated this protocol; therefore I saw this as a prime opportunity for corroboration of both the method and AAV transdifferentiation potency. Each gene was cloned into the pAAV CMV MCS vector. Following the creation of each plasmid, high concentration viruses were derived through the aforementioned AAV production pipeline. However, due to unavailability of antibodies, protein expression for HNF4A, HNF1A and FOXA3 was not assessed.

The AAVs encoding all three genes were then applied to FLFs maintained in HepG2 media along with a brief dose of nocodazole. Cells were expanded for a further 8 days in hepatocyte specific media. FLF morphology changed dramatically over this period; cells flattened out and became more adherent to one another, as opposed to the fibrous mesh exhibited by untreated FLFs (Fig. 12A-B). Cells were harvested at 10 days for qPCR analysis. Gene expression of each introduced factor was first assessed; expression in all samples was relative to that of HepG2 cells. AAVs induced expression equal to or beyond that of HepG2 cells for FOXA3 and HNF1A (Fig. 12D-E). HNF4A expression was significantly lower than in the HepG2 cell line; regardless, it was still exceedingly high compared to untreated samples (Fig. 12C). These tests indicated that AAVs were at least capable of generating high levels of transcription for target genes. Next it was important to assess if these factors could induce their own expression. As AAVs contained only the coding region of each gene, any expression detected from the untranslated region (UTR) can be assumed to be endogenous. Therefore, by evaluating expression from both the coding region and UTR I could resolve exogenous and endogenous expression.
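The logic of the coding region versus UTR comparison can be summarised in a few lines. The sketch below is only an illustration of the reasoning; the function name and the use of simple detected/not detected calls (rather than quantitative expression levels) are assumptions.

```python
def classify_expression(cds_detected, utr_detected):
    """Interpret a coding-region (CDS) assay and a UTR assay for one gene.

    The CDS amplicon is present in both the AAV transcript and the cell's own
    transcript, whereas the UTR amplicon is absent from the AAV cassette.
    """
    if utr_detected:
        # UTR signal can only originate from the endogenous gene.
        return "endogenous"
    if cds_detected:
        # CDS signal without UTR signal points to the AAV-derived transcript alone.
        return "exogenous only"
    return "not detected"
```

For instance, `classify_expression(True, False)` returns `"exogenous only"`, the pattern expected for a factor that is expressed from the vector but has not activated its own locus.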


Figure 12. HNF4A, HNF1A, FOXA3 AAV exposure to FLF cells. (A) Untreated FLF cells at day 10 of culture, scale bars 100 µm. (B) HHF treated FLF cells at 10 days of culture. (C-N) Gene expression of FLF untreated cells day 10, HepG2 untreated cells and HHF treated FLF cells day 10. Gene of interest is indicated above each bar chart. Expression is displayed in common logarithmic scale, normalised by ACT and TBP genes and relative to HepG2 untreated samples. Each sample has n = 3 biological replicates; the mean and standard deviation are shown for each, t-test (*P < 0.05, **P < 0.005, ***P < 0.0005).

Whilst HNF4A exogenous expression was present, no detectable endogenous expression was observed in transdifferentiated cells (Fig. 12F). The HNF1A UTR was already expressed at low amounts in untreated FLFs and exposure to AAVs resulted in no change of this expression (Fig. 12G). However, the FOXA3 UTR was activated following transdifferentiation (Fig. 12H). Various liver specific genes which were assessed in the original paper were utilised here to determine an alteration in cell fate. No AFP expression was observed from FLFs treated with HHF. Gene expression of APOA2, TTR, transferrin (TF) and glycogen synthase 2 (GYS2) was induced in FLFs following exposure to HHF treatment (Fig. 12J, K, M, N). Alpha-1 antitrypsin (AAT) expression was also significantly increased after viral treatment, though ALB expression was not (Fig. 12I, L).

I did not induce proliferation by the addition of SV40 as I did not wish to alter the cells to the extent of resulting in aberrant growth. The cells formed a confluent layer of cells by day 5-6 but did not expand further. If left to expand, the cells slowly died off. In the original 2014 paper the authors employed periodic acid-Schiff (PAS) staining to mark glycogen. I also wished to assess glycogen levels generated by the cells but was not able to due to limited samples. Therefore, a proxy in the form of GYS2 expression was used; HHF treatment was able to induce expression equal to that present in HepG2 cells (Fig. 12N). Consequently, I was reasonably confident in the capacity of AAVs to transdifferentiate to specific cell types.


One additional set of experiments was run prior to testing the 8 established transcription factors. I hypothesised that PTF1A, RBPJL and NR5A2 might be able to transdifferentiate specifically to an exocrine phenotype. FLFs were exposed to the combination of the PTF1A, RBPJL and NR5A2 (PtRN) encoding AAVs. After 8 days (2 days in the presence of virus) cells were harvested for RNA and qPCR analysis. Gene expression was compared to week 12 human fetal pancreas, as in vitro differentiation samples express limited exocrine markers. Exogenous expression of PTF1A, RBPJL and NR5A2 was at least equal to or exceeding that of the fetal pancreas (Fig. 13C-E). In contrast, endogenous expression of each gene was negligible in both untreated and PtRN AAV treated FLFs (Fig. 13F-H).

In the first instance SOX9 expression was investigated, and was found to be significantly decreased compared to untreated FLFs (Fig. 13I). C-MYC also appeared to have decreased expression following treatment (Fig. 13K). However, the transcription factor MIST1 increased in expression after PtRN AAV application (Fig. 13J). Additionally, three key enzymes known to be secreted by acinar cells were tested: amylase alpha 1A (AMY1A), carboxypeptidase A1 (CPA1) and trypsin 1 (PRSS1), with all three showing a significant increase in gene expression (Fig. 13L-N). PtRN treated FLFs did not seem to display a dramatic shift in cell morphology, only a decrease in proliferation as determined by viewing the cells at the end of the experiment (Fig. 13A, B). Immunofluorescence analysis highlighted that treated cells could induce protein expression of PRSS1, though at minimal levels (Fig. 14).


Figure 13. PTF1A, RBPJL and NR5A2 AAV exposure to FLF cells. (A) Untreated FLF cells at day 8 of culture, scale bars 100 µm. (B) PtRN treated FLF cells at 8 days of culture. (C-N) Gene expression of FLF untreated cells at day 8, week 12 human pancreas and day 8 PtRN AAV treated FLFs. Gene of interest is indicated above each bar chart. Expression is displayed in common logarithmic scale, normalised by ACT and TBP genes and relative to week 12 human pancreas. Each sample has n = 3 biological replicates, except pancreas n = 1; the mean and standard deviation are shown for each, t-test (*P < 0.05, **P < 0.005, ***P < 0.0005).

Figure 14. FLF protein expression following PTF1A, RBPJL and NR5A2 AAV treatment. (A) Untreated FLF cells at 8 days of culture. (B) Day 8 PtRN treated FLF cells. (A, B) All stained with DAPI (blue), PDX1 (green), HNF6 (red) and merged channels. Scale bars 100 µm.


3.6. The spatial expression profile of PTF1A, RBPJL and NR5A2 overlap during late embryogenesis

Eight factors were provided from LgPCA and Mogrify analysis. Of those factors PDX1, MNX1, FOXA2, HNF6 and SOX9 have all previously been visualised in the developing human embryo (Table 2). Therefore, study was directed toward the remaining three factors: PTF1A, RBPJL and NR5A2. In the first instance NR5A2 was tested on a CS18 embryo; in tandem, PDX1 was also tested as a positive control.

PDX1 expression was strongly localised across the entire pancreas (Fig. 15A, B). On a neighbouring section of tissue NR5A2 was investigated. Here its expression was also found across the entire pancreas and in the liver (Fig. 15C, D). As no antibody was available to determine PTF1A or RBPJL localisation, RNAscope probes were acquired for both. A positive control probe was first run, resulting in ubiquitous signal across the whole embryo (Fig. 15E, F; signal is identified by red points). PTF1A was found to have expression across the entire pancreas (indicated by black arrows; the grey arrow highlights the lack of PTF1A outside of the pancreas), though a few cells had greater signal than others (Fig. 15G, H). Finally, RBPJL was tested via fluorescent RNAscope together with PDX1. Once again PDX1 was found across all of the pancreas, co-expressing with RBPJL. Yet RBPJL itself was preferentially located along the edges of the pancreas (Fig. 15I-L).


Figure 15. PTF1A, RBPJL and NR5A2 localisation within a CS18 human embryonic pancreas. (A, B) IHC of PDX1. (C, D) IHC of NR5A2, black arrows indicate signal. (E, F) RNAscope universal control. (G, H) PTF1A RNAscope, black arrows indicate signal and grey arrow lack of signal. (I-L) PDX1 and RBPJL RNAscope, nuclei stained with DAPI and all channels merged. Scale bars 100 µm.


3.7. LgPCA and Mogrify determined transcription factors are able to alter expression of pancreatic genes in FLFs

With 8 transcription factor encoding AAV vectors made and the AAV system validated, I set out to determine what downstream genes these factors could initiate and what ideal combination of factors could be used to transdifferentiate cells to pancreatic progenitors. In the first instance AAVs encoding GFP, N196S PDX1, PDX1, RBPJL, MNX1, NR5A2, FOXA2, HNF6, PTF1A and SOX9 were applied individually to FLFs and cultured for 8 days. This allowed the determination of which genes were significantly up or downregulated following exposure to each virus. A large cohort of genes associated with pancreas development and pancreatic progenitors were assessed: FOXA2, FOXA3, GATA4, GATA6, PDX1, MNX1, HNF6, HNF1B, HNF4A, HNF1A, SOX9, NKX6.1, HES1, NGN3, PTF1A, RBPJL, NR5A2, CDH1 as well as the UTR expression for all induced genes. Two formats for presenting this data were generated. The format shown in Figure 16 was designed to aid the choice of which AAV factors to take forward for further experimentation. As a result it is a condensed form compared to the second layout. The second format (Appendix II) displays expression by gene instead of AAV treatment, and includes positive controls (day 11 and 15 SCT differentiation), allowing the observation of expression relative to physiological levels. Images detailing cell morphology at day 8 can be found in Appendix III. Each AAV treatment led to a varying reduction in cell proliferation, as fewer cells were present upon harvesting.

FLFs were exposed to two separate controls in the form of GFP AAV and N196S PDX1 AAV; all expression levels shown are relative to untreated FLFs. Untreated FLFs grew well, and FLFs exposed to GFP AAVs demonstrated GFP expression 2 days after initial viral exposure (Appendix III). Untreated FLFs expressed low levels of several key pancreas genes such as GATA6, MNX1, HNF6, HNF1B, SOX9 and HES1 (Fig. 16). Differential gene expression following GFP AAV treatment was minimal, with only a significant decrease in GATA6 and HNF6 as well as an increase in HES1 (Fig. 16A). The morphology of N196S PDX1 AAV exposed FLFs looked similar to wild type, only with mild clustering; cells also exhibited high PDX1 levels (Appendix II; III). All exogenous expression induced by AAVs is indicated in Appendix II. N196S PDX1 AAV treatment resulted in a significant decrease of GATA6 and HNF6 expression, with an increase in SOX9 (Fig. 16B).

FLF cells treated with PDX1 AAV had a distinct morphology; cells rounded up and decreased in size (Appendix III). PDX1 AAV treatment significantly increased many markers of pancreas development: GATA6, MNX1, HNF6, HNF1B, NKX6.1 and RBPJL. It was the only factor capable of inducing NGN3, PTF1A, CDH1 and its own UTR expression in FLFs. Alongside this it was also one of the few factors that could induce GATA4 (Fig. 16C). RBPJL AAV treated cells displayed a phenotype similar to that of untreated FLFs (Appendix III). Its addition to FLFs significantly increased GATA6, PDX1, MNX1, HNF1B, SOX9 and NKX6.1 levels. However, this was accompanied by a reduction in HES1 (Fig. 16D). MNX1 AAV treatment led to mild rounding of cells (Appendix III). MNX1 AAV treatment was also able to activate or increase many genes such as FOXA2, GATA4, GATA6, PDX1, HNF6, HNF1B, SOX9, NKX6.1 and RBPJL (Fig. 16E). The NR5A2 AAV treated cells grew well and had no major change in cell morphology (Appendix III). FLFs treated with NR5A2 AAV had significantly increased PDX1, HNF1B, NKX6.1, RBPJL and especially high GATA6 levels. That said, it also lowered HES1 significantly (Fig. 16F).

Upon the addition of FOXA2 AAV, cells elongated and nuclei rounded (Appendix III). FOXA2 AAV was able to significantly increase several genes in pancreas development: GATA4, GATA6, PDX1, MNX1, HNF6, HNF1B, HES1, RBPJL and NR5A2, and it was the only factor able to induce FOXA3. However, levels of SOX9 did decrease after delivery (Fig. 16G). HNF6 AAV treatment resulted in minimal change to cell morphology (Appendix III). It increased expression of many of the same markers mentioned prior: GATA6, PDX1, MNX1, HNF1B, NKX6.1, RBPJL and NR5A2, whilst decreasing HES1 expression (Fig. 16H). Cells rounded slightly upon PTF1A AAV exposure (Appendix III). PTF1A AAV addition resulted in high expression of PDX1 in particular (Appendix II). PTF1A AAV also upregulated GATA6, PDX1, MNX1, HNF6, HNF1B, HES1 and RBPJL (Fig. 16I). Overexpression of SOX9 by AAV caused mild expansion of cell size (Appendix III). SOX9 expression was already at high levels in FLFs (Appendix II). SOX9 AAV introduction to cells increased GATA6, PDX1, MNX1, HNF6, HNF1B, RBPJL and NR5A2. Yet SOX9 also downregulated NKX6.1 and HES1 expression (Fig. 16J).


Figure 16. Single AAV factor exposure to FLF cells. (A-J) Each treatment is relative to untreated FLF cells, all samples at day 8. The transcription factor AAV applied to cells is indicated beneath each chart. Expression is shown in common logarithmic scale normalised by ACT and TBP genes. Each sample has n = 3 biological replicates, the mean for each is shown and standard deviation, t-test (*P < 0.05, **P < 0.005, ***P < 0.0005).

In parallel to this a similar experiment was run, not only to further aid the choice of AAVs but also to validate the single factor addition study. Here FLF cells were exposed to all 8 AAVs. Gene expression was then compared to cells treated with 7 of the 8 transcription factors. This allowed the derivation of factors vital in the maintenance of specific genes. As with the single factor work, an alternative format sorted by genes was also produced to allow comparison to positive controls and exogenous expression (Appendix II). Cell morphology for each treatment changed little but clustering was observed (Appendix III).

Exposure to the various combinations of factors induced high levels of exogenous gene expression (Appendix II). The UTR expression for all induced genes was tested. However, all UTRs except the SOX9 UTR exhibited no expression (Fig. 17). Additionally, no combination of factors was able to induce NGN3 expression. When FLFs were treated with all 8 factors the expression of the following genes was seen: FOXA3, GATA4, GATA6, HNF1B, HNF4A, HNF1A, SOX9 UTR, NKX6.1, HES1 and CDH1. Cell shape seemed not unlike that of a fibroblast but the cells did seem to cluster together (Appendix III). I then assessed how these levels altered if cells were treated with only 7 of the 8 AAVs or not treated. Untreated FLFs exhibited a significant decrease in GATA4, HNF1B, HNF4A, HES1 and CDH1. The only increase was that of the SOX9 UTR (Fig. 17A).

The lack of PDX1 AAV upon FLFs caused an increase in the expression of GATA4 and NKX6.1 as well as a concomitant decrease in GATA6 and HNF4A levels (Fig. 17B). Cells treated without RBPJL AAV displayed an increase in GATA6 and NKX6.1 and a loss of HNF4A (Fig. 17C). MNX1 AAV removal increased GATA4, SOX9 UTR and NKX6.1. However, a decrease in HNF1B and HES1 gene expression was seen (Fig. 17D). Without NR5A2 AAV little change in gene expression was noted; FOXA3 levels increased and GATA6 expression decreased (Fig. 17E). In contrast, when not exposed to FOXA2 AAV, cells exhibited a dramatic change in gene expression. An increase in GATA4, SOX9 UTR and NKX6.1 was noted, whilst GATA6, HNF1B, HNF4A, HES1 and CDH1 all significantly decreased in expression (Fig. 17F). If HNF6 AAV was not in the combination, FOXA3 and NKX6.1 levels increased and HNF4A expression decreased (Fig. 17G). PTF1A AAV removal caused downregulation of HES1 and upregulation of NKX6.1 (Fig. 17H). Finally, the lack of SOX9 AAV also had little effect on overall gene expression, with only FOXA3, NKX6.1 and HES1 showing significant differential levels (Fig. 17I).

Figure 17. 8 AAV factor exposure to FLF cells. (A-I) Each treatment is relative to 8 transcription factor treated FLF cells, all samples at day 8. The transcription factor AAV removed from combination is indicated beneath each chart. Expression is shown in common logarithmic scale normalised by ACT and TBP genes. Each sample has n = 3 biological replicates, the mean for each is shown and standard deviation, t-test (*P < 0.05, **P < 0.005, ***P < 0.0005).
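The normalisation described in the legend can be illustrated with a minimal sketch. It assumes the widely used 2^-ΔΔCt approach, with the Ct values of the two reference genes averaged (equivalent to the geometric mean of their quantities). The function names and Ct values are hypothetical and are not taken from this study.

```python
import math
from statistics import mean


def delta_ct(target_ct, reference_cts):
    """Ct of the target gene minus the mean Ct of the reference genes.

    Averaging Ct values corresponds to the geometric mean of the reference
    gene quantities (an assumption of this sketch).
    """
    return target_ct - mean(reference_cts)


def log10_relative_expression(sample, calibrator):
    """log10 fold change of a sample versus a calibrator using 2^-ddCt.

    Each argument is a (target_ct, [reference_cts]) tuple.
    """
    ddct = delta_ct(*sample) - delta_ct(*calibrator)
    return -ddct * math.log10(2)


# Hypothetical Ct values: (target gene, [ACT, TBP])
calibrator = (24.0, [18.0, 26.0])  # e.g. FLFs treated with all 8 AAVs
treated = (27.0, [18.0, 26.0])     # e.g. FLFs lacking one AAV

print(round(log10_relative_expression(treated, calibrator), 3))
```

A negative value indicates lower expression than in the calibrator sample, which is how a loss of expression upon AAV removal would appear on a common logarithmic scale.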

All the data were then collated in order to reduce the number of transcription factors. Each AAV was given a score from both sets of analysis, which were then combined. In the single factor analysis a point was allocated for every gene significantly increased (as expression was induced) and a point was lost for each gene that significantly decreased in expression. The single factor data was combined with the 7 factor data. Here the scoring was reversed relative to the single factor analysis. A point was given for every gene that was significantly decreased (expression was lost upon AAV removal), and a point was taken away for a significant increase in expression. Table 9 highlights the consolidation of this data. FOXA2 AAV had the highest total, followed by MNX1 AAV, with PDX1 and PTF1A AAV being tied in fourth.

Intriguingly, the non-predicted factor SOX9 AAV was next above several factors outputted from the bioinformatic analysis. These factors were HNF6, RBPJL and NR5A2 AAVs all with low totals.

Table 9. A summary of gene expression change upon the addition or removal of 8 key transcription factor encoding AAVs. Each factor is indicated on the left, the specific change in gene expression is shown at the top. An increase in each gene being expressed is considered positive and given a point in the single factor study. A decrease in genes being expressed is thus negative and a point is removed. A decrease in a gene being expressed is considered a positive in the 7 factor study and a point is awarded. Therefore, an increase in a gene being expressed is negative and a point is subtracted. From all of these values a total is calculated.

AAV factor   Single factor   Single factor   7 factor        7 factor        Total (+/-)
             gene increase   gene decrease   gene increase   gene decrease
PDX1         11              0               -2              2               11
RBPJL        6               -1              -2              1               4
MNX1         9               0               -3              3               9
NR5A2        6               -1              -1              1               5
FOXA2        10              -1              -3              5               11
HNF6         7               -1              -2              1               5
PTF1A        7               0               -2              1               6
SOX9         7               -1              -2              1               5
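The scoring scheme behind Table 9 can be expressed as a short sketch. The counts below are transcribed from Table 9 as absolute numbers of significantly changed genes; the function and variable names are illustrative only.

```python
def aav_score(single_up, single_down, removal_up, removal_down):
    """Score one AAV from counts of significantly changed genes.

    Single factor study: +1 per gene increased, -1 per gene decreased.
    7 factor study (AAV removed): +1 per gene decreased, -1 per gene increased.
    """
    return (single_up - single_down) + (removal_down - removal_up)


# (single factor increase, single factor decrease,
#  7 factor increase, 7 factor decrease), as counts from Table 9
TABLE_9_COUNTS = {
    "PDX1": (11, 0, 2, 2),
    "RBPJL": (6, 1, 2, 1),
    "MNX1": (9, 0, 3, 3),
    "NR5A2": (6, 1, 1, 1),
    "FOXA2": (10, 1, 3, 5),
    "HNF6": (7, 1, 2, 1),
    "PTF1A": (7, 0, 2, 1),
    "SOX9": (7, 1, 2, 1),
}

totals = {factor: aav_score(*counts) for factor, counts in TABLE_9_COUNTS.items()}
for factor, total in totals.items():
    print(factor, total)
```

Keeping the two studies as separate arguments makes the sign reversal between addition and removal experiments explicit.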

PDX1, MNX1, FOXA2 and PTF1A AAVs all induced high levels of pancreatic gene expression. Taking these four factors I undertook one final round of testing to establish an ideal trio of transcription factors.


Figure 18. 4 AAV factor exposure to FLF cells. (A-E) Each treatment is relative to 4 transcription factor treated FLF cells, all samples at day 8. The transcription factor AAV removed from combination is indicated beneath each chart. Expression is shown in common logarithmic scale normalised by ACT and TBP genes. Each sample has n = 3 biological replicates, the mean for each is shown and standard deviation, t-test (*P < 0.05, **P < 0.005, ***P < 0.0005).

In the final round of testing FLF cells were exposed to a combination of PDX1, MNX1, FOXA2 and PTF1A AAVs. Gene expression in the resulting cells was then compared to that of cells that had only been exposed to 3 of the 4 factors. Neither GATA4 nor NGN3 could be induced, whilst FOXA3, HNF4A and HNF1A were not tested due to limited expression in previous rounds. The combinatorial action of all 4 factors was able to activate many pancreas genes but not the UTR expression of PDX1, FOXA2 or PTF1A (Fig. 18). As with previous analyses a secondary format of expression data can be found in Appendix IV. Cells exposed to the multiple combinations became rounded, mirroring the morphology of PDX1 or FOXA2 AAV treated FLFs (Appendix V). Untreated FLFs expectedly had a significant reduction in the majority of genes: GATA6, MNX1 UTR, HNF6, HNF1B,

NKX6.1, HES1 and NR5A2 (Fig. 18A). As with the 7 factor analysis, a factor was considered more integral if its removal led to a reduction in gene expression and vice versa.

The removal of PDX1 AAV caused little change beyond the increase in GATA6 and HES1 expression

(Fig. 18B). Similarly, no MNX1 AAV led to just an increase in MNX1 UTR and HES1 expression (Fig.

18C). The lack of FOXA2 AAV had a more severe effect: only the MNX1 UTR significantly increased in expression. This was accompanied by a downregulation in HNF6, HES1 and NR5A2 expression (Fig.

18D). PTF1A AAV removal resulted in the significant increase in a host of genes: MNX1 UTR, HNF6,

HNF1B, SOX9, NKX6.1, PTF1A UTR, RBPJL and NR5A2, with only GATA6 levels falling (Fig. 18E).

These results were similarly collated as with the last two analyses (Table 10).

Table 10. A summary of gene expression change upon the addition or removal of 4 key transcription factor encoding AAVs. Each factor is indicated on the left, the specific change in gene expression is shown at the top. A decrease in a gene being expressed is considered a positive in the 4 factor study and a point is awarded. Therefore, an increase in a gene being expressed is negative and a point is subtracted. From all of these values a total is calculated.

AAV factor (-)   3 factor gene increase   3 factor gene decrease   Total
PDX1             -2                       0                        -2
MNX1             -2                       0                        -2
FOXA2            -1                       3                        2
PTF1A            -8                       1                        -7

FOXA2 AAV once more had the greatest total, ensuring its inclusion in the trio. Beneath it were PDX1 and MNX1 AAV, which were tied and so made up the two remaining places. PTF1A AAV was excluded as its removal led to a significant increase in the expression of multiple genes. Therefore the ideal trio of factors from the final 8 genes is PDX1, MNX1 and FOXA2. Work following this focused upon the application of these three genes in transdifferentiation.


3.8. PDX1, MNX1 and FOXA2 AAVs are able to induce endogenous expression of multiple markers of pancreatic progenitors

Prior to testing the AAVs, bioinformatic analysis of PDX1, MNX1 and FOXA2 binding to key pancreatic genes was undertaken. MatInspector was employed to define transcription factor binding in silico. The 10,000 bp upstream of each key pancreatic gene was taken and run through MatInspector to determine where

PDX1, MNX1 and FOXA2 bind in this region (Fig. 19). All genes had multiple binding sites for each transcription factor. However, certain genes such as GATA4, HES1 and NGN3 had reduced points of binding compared to the other genes (Fig. 19B, J, K). The only other large scale pattern of note was that the initial 5,000 bp upstream of MNX1 had minimal binding points (Fig. 19E). The data was then analysed to find binding patterns exhibited by each transcription factor.

In order to cut through the noise in the output, a single point of binding was not considered to be real; instead, distinct patterns found across multiple examples were sought. PDX1 binding seemed to be characterised by binding on both positive and negative strands, at the same area (< 20 bp between each other) of sequence. This binding pattern is observed across every gene tested, often multiple times per gene. When tallied, the genes bound the most times by PDX1 are FOXA2, MNX1 and NR5A2 at 7 (Fig. 19A, E). PTF1A and RBPJL are close behind, each bound by PDX1 at 6 points (Fig. 19M, N).

After this HNF6 and NKX6.1 have PDX1 binding at the same area on both strands a total of 5 times each (Fig. 19F, I). Many upstream regions are bound 3 times by PDX1 such as: PDX1 itself, HNF1B,

SOX9 and NGN3 (Fig. 19D, G, H, K). The two GATA factors were each bound twice and HES1 had just a single point (Fig. 19B, C, J). In the case of HES1 and NGN3 only PDX1 demonstrated a recognisable pattern of binding.

As with PDX1, MNX1 binding was characterised by binding to both strands within the same area (< 20 bp between each other). MNX1 thus bound the most (6 times) to the PTF1A and NR5A2 upstream regions (Fig. 19L, N). The next most MNX1 bound gene regions were that of FOXA2, PDX1, HNF1B

and RBPJL all with 4 binding points. MNX1 bound itself and SOX9 at 3 upstream points. The upstream regions of GATA6 and HNF6 were each bound twice by MNX1 and NKX6.1 had just a single point of binding. The binding pattern was not observed for GATA4, HES1 and NGN3.

Lastly, unlike the other two factors, FOXA2 binding was not regularly observed on both strands within close proximity to one another. However, what was noted was binding on opposing strands separated by roughly 100-300 bp. This binding pattern was observed a maximum of 3 times, in the upstream region of NR5A2 (Fig. 19N). FOXA2, PDX1 and HNF1B all had 2 FOXA2 binding patterns within the 10,000 bp region (Fig. 19A, D, F). Several upstream regions had just 1 FOXA2 binding pattern: GATA4, GATA6,

MNX1, HNF6, SOX9, NKX6.1 and RBPJL (Fig. 19B, C, E, F, H, I). No binding was observed for the upstream region of HES1, NGN3 and PTF1A (Fig. 19J, K, L).

The last observed patterns were that of potential co-binding of the three factors. PDX1 and MNX1 binding on each strand was often found in very close proximity to one another for example within the upstream regions of PDX1 and PTF1A (Fig. 19D, L). The aforementioned binding profile elicited by

FOXA2 was typically more isolated from the others. However, it should be stated that across all genes binding sites for any of the three factors usually were in the same areas, appearing as ‘hotspots’ for binding.
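The pattern counting described above can be sketched as follows. This is an illustrative reconstruction rather than the procedure actually used: it assumes predicted sites are available as (position, strand) pairs and pairs them greedily, and the example coordinates are hypothetical.

```python
def count_paired_sites(sites, min_gap, max_gap):
    """Count non-overlapping pairs of sites on opposite strands.

    sites: iterable of (position, strand) with strand '+' or '-'.
    A pair is counted when the separation lies within [min_gap, max_gap] bp.
    Each site is used in at most one pair (greedy matching).
    """
    plus = sorted(pos for pos, strand in sites if strand == "+")
    minus = sorted(pos for pos, strand in sites if strand == "-")
    used = set()
    pairs = 0
    for p in plus:
        for i, m in enumerate(minus):
            if i not in used and min_gap <= abs(p - m) <= max_gap:
                used.add(i)
                pairs += 1
                break
    return pairs


# Hypothetical predicted sites, positions relative to the 5' UTR start (0)
hypothetical_sites = [
    (-9500, "+"), (-9490, "-"),  # both strands within 20 bp
    (-4000, "+"), (-3000, "-"),  # too far apart to count
    (-1200, "+"), (-1185, "-"),  # both strands within 20 bp
    (-600, "+"),                 # isolated site, ignored as noise
]

# PDX1 or MNX1 style pattern: opposite strands less than 20 bp apart
print(count_paired_sites(hypothetical_sites, 0, 19))
# FOXA2 style pattern: opposite strands roughly 100-300 bp apart
print(count_paired_sites(hypothetical_sites, 100, 300))
```

Requiring a partner on the opposite strand is what filters out isolated single predictions, in line with treating a lone binding point as noise.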


Figure 19. PDX1, MNX1 and FOXA2 binding to upstream genomic regions of pancreatic genes. (A-N) Gene of interest is indicated above each graph. 10,000 bp for each gene was tested, furthest point from gene 5’ UTR is on the left, 0 indicates the initiation of the 5’ UTR. + and – signs indicate each DNA strand. PDX1 binding points are shown by blue diamond, MNX1 binding points by an orange rectangle and FOXA2 binding points by a grey triangle.

With the 3 transcription factors chosen for transdifferentiation and analysed in silico, they were applied to FLFs. As with all previous tests cells were exposed to each virus at an MOI of 10,000, four times over 2 days, briefly treated with nocodazole and initially maintained in FLF media. Following this period FLFs were then cultured in a pancreas specific media for a further 6 days to ensure cell expansion. Two media were tested for their ability to maintain pancreatic progenitor gene expression: one for pancreatic organoid culture (Bonfanti et al., 2015) and the other for the expansion of pancreatic progenitor cells from the end point of in vitro differentiation (Trott et al., 2017). Cells were tested in each medium in tandem so as to identify any differences. The majority of tested genes did not differ; however, FOXA2 UTR, HNF1B, SOX9 and HES1 expression were all significantly reduced in the presence of the Trott media (Fig. 20). Therefore, following AAV treatment cells were grown in Bonfanti organoid media.


Figure 20. Bonfanti and Trott media exposure to PMF treated FLF cells. (A) Human pancreatic organoids derived from a fetal pancreas. (B) SCT in vitro differentiated H9 stem cells. (C-N) Gene expression of PMF treated FLF cells cultured in either Bonfanti or Trott media. Expression normalised by ACT and TBP genes and relative to Bonfanti media. Each sample has n = 3 biological replicates, the mean for each is shown and standard deviation, t-test (*P < 0.05, **P < 0.005, ***P < 0.0005).


Following PDX1, MNX1 and FOXA2 (PMF) AAV exposure, gene expression was assessed for pancreatic progenitor markers by qPCR. Here gene expression in PMF treated FLFs was compared to untreated cells and day 11 and 15 SCT differentiation samples. These two days were chosen as several genes exhibited greater expression at day 11 relative to day 15 (NGN3, PTF1A and RBPJL), meaning that with the inclusion of two positive controls a better gauge of physiological levels could be made. Exogenous levels of PDX1, MNX1 and FOXA2 exceeded that of day 15 SCT differentiation samples significantly

(Fig. 21A-C). Endogenous expression of each delivered gene had also been activated to varying degrees. PDX1 UTR was above that of the differentiation samples, MNX1 UTR levels were approximately equal to that of the positive control, whilst FOXA2 UTR was only induced significantly relative to untreated FLFs (Fig. 21D-F).

Both GATA factors had significantly increased expression post PMF treatment relative to untreated samples but were not yet comparable to differentiation levels (Fig. 21G, H). The hepatocyte nuclear factors both exhibited exceptional increases in gene expression when compared to the untreated cells, yet, the expression was still not equivalent to that of SCT differentiation (Fig. 21I, J). SOX9 expression remained stable relative to FLFs though not as elevated as cells following in vitro differentiation (Fig.

21K). NKX6.1 expression appeared following PMF treatment (Fig. 21L). HES1 expression was elevated post treatment to close to physiological levels (Fig. 21M). Very mild NGN3 expression was observed in

PMF treated cells but significantly lower than that of positive control samples (Fig. 21N). Moreover, low levels of CDH1 were also observed in cells (Fig. 21O). The acinar factors also showed considerable expression: PTF1A was induced, whilst RBPJL and NR5A2 provided close to differentiation levels of expression (Fig. 21P-R).


Figure 21. PDX1, MNX1 and FOXA2 AAV exposure to FLF cells. (A-R) Gene expression of FLF untreated cells at day 8, day 11 and 15 SCT in vitro differentiation cells and day 8 PMF AAV treated FLFs. Gene of interest is indicated above each bar chart. Expression is displayed in common logarithmic scale, normalised by ACT and TBP genes and relative to day 15 SCT differentiation samples. Each sample has n = 3 biological replicates, the mean for each is shown and standard deviation, t-test (*P < 0.05, **P < 0.005, ***P < 0.0005).

Cells showed a reduction in proliferation post PMF treatment compared to fibroblasts, as evidenced by the number of cells present at the experiment end point. Morphology of PMF treated cells became rounded and clustered, although not all cells demonstrated this change; some instead seemed to maintain the original shape of a fibroblast (Fig. 22B). To infer expression at the protein level I therefore assessed it via immunofluorescence. An antibody for PDX1 was used to indicate AAV treated cells and HNF6 to mark transdifferentiated cells. These were applied to untreated FLF cells, day 11 SCT differentiation acting as the positive control and the PMF treated cells. Untreated FLFs expectedly showed no PDX1 or

HNF6 expression (Fig. 22C). In vitro differentiated H9 cells exhibited uniform PDX1 and HNF6 expression (Fig. 22D). PMF treated FLFs had high PDX1 expression though not across every cell; nuclear HNF6 was co-expressed in some, though not all, of these cells (Fig. 22E). This confirmed that transdifferentiation was able to induce expression at the protein level as well as the mRNA level.


Figure 22. FLF cell morphology and protein expression following PDX1, MNX1 and FOXA2 AAV treatment. (A) Untreated FLF cells at day 8 of culture. (B) PMF treated FLF cells at day 8 of culture. (C) Day 8 untreated FLF cells. (D) Day 11 SCT in vitro differentiation. (E) Day 8 PMF treated FLF cells. (C-E) All stained with DAPI (blue), PDX1 (green), HNF6 (red) and merged channels. Scale bars 100 µm.


As an additional assessment of transdifferentiation towards the desired phenotypes, qPCR analysis of fibroblast specific genes was run upon HHF treated cells (hepatocyte transdifferentiation), PtRN treated cells (acinar transdifferentiation) and the PMF treated cells (pancreatic progenitor transdifferentiation).

Cells were tested for three fibroblast genes collagen type 3 A1 (COL3A1), fibroblast-specific protein 1

(FSP1) and vimentin (VIM). Expression for both COL3A1 and FSP1 was reduced in the HHF and PMF treated samples, significantly so in the latter’s case (Fig. 23A, B). PtRN treated cells seemed to exhibit unchanged, even mildly increased levels of COL3A1 and FSP1. VIM was unaffected post PMF treatment. However, HHF and PtRN treatment led to a significant increase in VIM expression, in particular it was elevated greatly following HHF exposure (Fig. 23C). When combined with previous analysis this data displays the change of PMF treated cells from that of a fibroblast towards the pancreatic lineage.

Figure 23. Fibroblast gene expression following HHF, PtRN and PMF treatment. (A-C) Gene expression of FLF untreated cells at day 8, day 10 HHF AAV treated FLFs, day 8 PtRN AAV treated FLFs and day 8 PMF AAV treated FLFs. Gene of interest is indicated above each bar chart. Expression is normalised by ACT and TBP genes and relative to day 8 untreated cells. Each sample has n = 3 biological replicates, the mean for each is shown and standard deviation, t-test (*P < 0.05, **P < 0.005, ***P < 0.0005).


3.9. LgPCA but not Mogrify unveils further genes specifying for the early pancreas

Following the assessment of PDX1, MNX1 and FOXA2 as ideal factors for pancreatic progenitor transdifferentiation, an updated LgPCA schematic with additional replicates and tissues was generated

(Fig. 24A). Therefore, the analysis was run once more, so as to hopefully validate established genes and potentially describe any novel transcription factors. Thus I used this opportunity to also delineate further genes from Mogrify.

In this instance raw LgPCA data was provided by Dave Gerrard, but further analysis was run by myself.

Initially as with previous LgPCA all PCs relevant to the pancreas (PC3, 6, 7, 10, 11, 14 and 15) were combined to generate one list (Fig. 24A, B). Transcription factors were selected by sorting for positive loadings in PC10, 11, 13, 14 and 15 whilst also having negative loadings in PC3, 6 and 7. The top 10 transcription factors remaining from this filtering process are noted in Fig. 24B. This new round of analysis maintained the majority of factors from the original LgPCA list: PDX1, NKX2.3, RBPJL, PTF1A,

HNF6, NR5A2 and ZNF469. However, MNX1, EHF and TLX2 were replaced with SRY HMG-box factor

6 (SOX6), zinc finger protein 587 (ZNF587) and lysine methyltransferase 2D (KMT2D).
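The loading-based selection can be illustrated with a minimal sketch. It assumes that a gene must be positive in all of the listed pancreas PCs and negative in all of the excluded PCs, and that the retained genes are ranked by their summed positive loadings; the ranking rule, gene names and loading values are hypothetical and are not taken from the LgPCA data.

```python
POSITIVE_PCS = ("PC10", "PC11", "PC13", "PC14", "PC15")
NEGATIVE_PCS = ("PC3", "PC6", "PC7")


def pancreas_candidates(loadings, top_n=10):
    """Return up to top_n genes passing the sign filter.

    loadings: {gene: {pc_name: loading}}.
    Ranking by summed positive loadings is an assumption of this sketch.
    """
    kept = [
        gene
        for gene, pcs in loadings.items()
        if all(pcs[pc] > 0 for pc in POSITIVE_PCS)
        and all(pcs[pc] < 0 for pc in NEGATIVE_PCS)
    ]
    kept.sort(
        key=lambda gene: sum(loadings[gene][pc] for pc in POSITIVE_PCS),
        reverse=True,
    )
    return kept[:top_n]


def make_row(positive, negative):
    """Build a hypothetical loading row with uniform values."""
    row = {pc: positive for pc in POSITIVE_PCS}
    row.update({pc: negative for pc in NEGATIVE_PCS})
    return row


hypothetical_loadings = {
    "GENE_A": make_row(0.2, -0.1),
    "GENE_B": make_row(0.5, -0.3),
    "GENE_C": {**make_row(0.4, -0.2), "PC3": 0.2},  # fails the negative filter
}

print(pancreas_candidates(hypothetical_loadings))
```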

With such a vast array of data, further analysis was run. Each pancreas specific PC was investigated to attempt to infer any extra information. PCs that contained transcription factors which, if mutated, result in pancreatic agenesis were assessed in detail. PC 11 and 13 provided pertinent data; the top 10 factors from each are described in Fig. 24B. These two lists encompass almost every factor from the previous LgPCA analysis except NR5A2, EHF and ZNF469. PC 11 included GATA6, hence its study, and fellow family member GATA transcription factor 5 (GATA5). It also contained multiple forkhead factors such as FOXA2 but also forkhead box protein P1 and F1 (FOXP1; FOXF1). Several novel factors were also outputted: regulatory factor X 6 (RFX6), TOX high mobility group box family member 3 (TOX3), insulin gene enhancer protein ISL-1 (ISL1), basonuclin 1 (BNC1) and teashirt homolog 3 (TSHZ3). PC 13 provided many previously established genes such as: MNX1, PDX1,

NKX2.3, RBPJL, PTF1A, HNF6 and TLX2. That being said, some new additions were also found:

SOX6, NF-kappa-B inhibitor zeta (NFKBIZ) and RBPJ.

For Mogrify analysis I chose to shift the initial starting cell type with the aim of establishing any additional transcription factors. I altered the starting cell type from fibroblast of the lung to lung tissue but maintained the target cell type as pancreas. The change in the top 10 genes saw a loss of FOXA2 and HEYL, replaced with insulin (INS) and insulin receptor (INSR). Furthermore, shifting starting cell type to cardiac fibroblast or cardiac tissue made no change to outputted factors (data not shown). As a result of this I went back to the original input of lung fibroblast to study the additional outputted genes past the top 10. Mogrify listed a further 7 genes: INS, INSR, MIST1, FOXA3, transducer of ERBB2 1 (TOB1), tescalcin (TESC) and nuclear receptor subfamily 0 group B member 2 (NR0B2). Subsequent analysis found that all that changing the starting cell type did was exchange various genes from this list with genes initially outputted from Mogrify. Thus, limited novel data could be derived from Mogrify.


Figure 24. Novel LgPCA and Mogrify analysis. (A) Updated LgPCA schematic with additional tissues and replicates. 17 PCs are shown with samples taken from human embryonic organs and human embryonic stem cells. PC dimensions are shown in black (high) or white (low) with scale shown by circle size. (B) Top 10 genes from PC 11 and 13 from new LgPCA. (C) Top 10 genes from Mogrify when starting cell type is altered along with all additional factors derived from lung fibroblasts to pancreas settings.


4. Discussion

Transdifferentiation offers the potential for robust generation of specific cell types. This project utilised transdifferentiation to produce pancreatic progenitors, a key cell type in the development of the pancreas. Through bioinformatic analysis, a list of transcription factors capable of manipulating gene expression towards a pancreatic cell fate was determined. Subsequent investigation then sought to reduce and validate this list to a potent trio of factors capable of generating pancreatic progenitors.

4.1. Bioinformatic analysis of potential transcription factors able to transdifferentiate to the pancreas

The critical first step was to derive transcription factors capable of transdifferentiating towards the pancreas. This was achieved by the use of two separate potent bioinformatic analyses: LgPCA and

Mogrify (Gerrard and Berry et al., 2016; Rackham et al., 2016). LgPCA as discussed employed a developmental tree to tease apart genes active in generating each organ of the body. Utilising LgPCA genes specific to the pancreas were then elucidated. It outputted the following 10 transcription factors in the pancreas: RBPJL, PDX1, MNX1, HNF6, NKX2.3, EHF, NR5A2, ZNF469, TLX2 and PTF1A.

Expectedly, the majority of the genes are well described in the pancreas as RBPJL, PDX1, MNX1,

HNF6, NR5A2 and PTF1A each have key functions during early pancreas development (Pan and

Wright, 2011; Jennings et al., 2015). This suggests that LgPCA was able to deduce valid vital transcription factors active in shaping pancreas development.

Of the 4 remaining genes, only EHF has been previously described in the pancreas. EHF has been recognised as an inhibitor of human pancreatic cancers by controlling CDH1 levels (Zhao et al., 2017).

The following factor, Nkx2.3, when deleted in mice is postnatal lethal, as it causes abnormalities in small intestine and spleen development (Pabst et al., 1999). Given the proximity of these organs to the pancreas, one could speculate that it may also function in some form during pancreas development.

Work in human intestinal cells has indicated that NKX2.3 is a regulator of vascular endothelial growth

factor (VEGF), implying it could also function within the pancreas to promote angiogenesis (Yu et al., 2011). The next gene, ZNF469, has been implicated in the human eye. Upon ZNF469 loss defects in corneal structure occur, leading to keratoconus (Vincent et al., 2014). How ZNF469 may be associated with pancreas development is unclear but it may aid in the definition of pancreas structure and maintenance. The last novel factor outputted by LgPCA was TLX2. In mice, Tlx2 was identified downstream of BMP signalling during early mesoderm development (E6.5), with mutation being lethal to the embryo (Tang et al., 1998). It may be that TLX2 is active prior to pancreas commitment in humans, laying the groundwork for organ formation.

The strength of LgPCA stems from its ability to determine genes that are active across various organs and those that specify certain lineages, which when combined can generate a list of genes that truly defines a specific organ. However, with the goal of transdifferentiation there was a need to discover key factors with greater regulatory coverage as well as relation to the pancreas. Given LgPCA could not deduce factors with high regulatory potency, an additional round of analysis was undertaken to detect these transcription factors.

Mogrify analysis utilises the STRING and FANTOM5 databases to specifically determine factors for transdifferentiation. It has two criteria: firstly, to determine transcription factors with high regulatory coverage; secondly, to find an ideal combination of these factors to transdifferentiate to the desired cell type, in this case the pancreas. From Mogrify another 10 transcription factors were identified: XBP1, HNF6, NR5A2, FOXA2, GATA4, PTF1A, RBPJL, HEYL, PLA2G1B and MNX1. As with the factors derived from LgPCA many are well established in the pancreas: HNF6, NR5A2, FOXA2, GATA4, PTF1A, RBPJL and MNX1 (Pan and Wright, 2011). Moreover, 5 of the transcription factors (HNF6, RBPJL, NR5A2,

MNX1 and PTF1A) were shared between the two analyses further corroborating their validity.
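The overlap between the two analyses can be checked with a short set comparison. The LgPCA list is as given above; RBPJL is included in the Mogrify set here because the text names it among the shared factors. Variable names are illustrative.

```python
lgpca = {
    "RBPJL", "PDX1", "MNX1", "HNF6", "NKX2.3",
    "EHF", "NR5A2", "ZNF469", "TLX2", "PTF1A",
}
mogrify = {
    "XBP1", "HNF6", "NR5A2", "FOXA2", "GATA4",
    "PTF1A", "RBPJL", "HEYL", "PLA2G1B", "MNX1",
}

shared = lgpca & mogrify    # factors proposed by both analyses
combined = lgpca | mogrify  # non-redundant list carried forward

print(sorted(shared))
print(len(combined))
```

Five factors are common to both lists, so merging the two sets of 10 yields 15 distinct genes.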


The other genes provided by Mogrify have varying levels of association to the pancreas. In mice the disruption of Xbp1 within the acinar cells causes apoptosis, but is followed by extensive regeneration of acinar tissue (Hess et al., 2011). This suggests that XBP1 loss may be beneficial to pancreas growth.

Next, no direct link between the Notch effector HEYL and the pancreas has been surmised. Research in mice noted it present in the early mesoderm, with its loss in somites following Notch1 mutations

(Leimeister et al., 2000). How HEYL activity may pertain to pancreas development is unknown but given the importance of Notch signalling in pancreas development it would not be unexpected. Lastly,

Pla2g1b overexpression in mice causes weight gain and increased insulin intolerance whilst its loss results in the opposite (Cash et al., 2011). As with Xbp1, Pla2g1b loss seems advantageous to pancreatic function and health.

Both lists from LgPCA and Mogrify were combined, providing a final list of 15 genes. The vast majority of genes generated from LgPCA and Mogrify are well documented in defining the early pancreas, providing credibility to the results. This is compounded by the fact that three of the genes cause pancreatic agenesis in humans (GATA4, PDX1 and PTF1A). Additionally, the commonality of factors between LgPCA and Mogrify gives further credence to the capability of these analyses to deduce key pancreatic genes. That being said, two genes that when mutated result in pancreatic agenesis were not found in either list. Given the severity of phenotype it would be expected that such integral factors would be outputted from at least one of the analyses.

The first is a novel gene called CCR4-NOT transcription complex subunit 1 (CNOT1), whose loss has been recognised as causing human pancreatic agenesis. Little is known concerning this gene, although it is speculated to act in a complex to define stemness (De Franco et al., 2019). The authors noted that

Cnot1 mutant mice had reduced Pdx1, Ptf1a and Insulin levels along with increased Shh, indicating that CNOT1 may downregulate Shh levels, which in turn would promote early pancreas development. It is therefore of intrigue that such a factor was not derived from either analysis. Moreover, GATA6, which accounts for the majority of cases of pancreatic agenesis in humans, was not present either (Lango

Allen et al., 2012). Online data from The Human Protein Atlas and EMBL-EBI Expression Atlas details that both CNOT1 and GATA6 are expressed at high levels across multiple organs within the body.

These discrepancies may be because CNOT1 and GATA6 provide broad specification to multiple organs, so were lost from LgPCA. Alongside this, to run the Mogrify algorithm a starting cell type must first be input, in this case fibroblasts of the lung. Given GATA6 and CNOT1 are expressed highly in this organ, it is understandable that Mogrify would not propose the use of factors that were already present within the initial cell type. Nonetheless, their absence from both lists is of note given the inclusion of other pancreatic agenesis causing genes. Despite this, further work focused upon reducing the list of 15 genes down to a more manageable number.


4.2. The description of temporal gene expression during in vitro differentiation to pancreatic progenitors

Before initiating transdifferentiation experiments, the temporal expression of transcription factors was assessed using the SCT pancreatic progenitor in vitro differentiation kit. The expression kinetics of a cohort of transcription factors associated with pancreas development, as well as all genes derived from the bioinformatic analysis, were tested by qPCR. The key stages of the differentiation can be defined as definitive endoderm (day 1-3), primitive gut tube (day 4-6), posterior foregut (day 7-9) and pancreatic endoderm (day 10-15).

The first gene to be investigated was SOX17; its expression peaked dramatically before dissipating in the in vitro system. This closely resembles the pattern of SOX17 in the developing human pancreas

(Jennings et al., 2013). FOXA2 and FOXA3 both had similar expression patterns to one another. Given these two transcription factors work in unison in the developing mouse embryo it is unsurprising their expression kinetics mirror this (Ang et al., 1993). Moreover, FOXA2 in the pancreas is present and maintained from CS10 onwards similar to that observed during differentiation (Jennings et al., 2013).

The two GATA factors elicited an intriguing expression profile. Their expression overlaps, though the levels seem inversely correlated to one another. Given what is known in the current literature this further suggests a functional redundancy, as at any one point only one GATA factor is expressed at high amounts (Carrasco et al., 2012; Xuan et al., 2012). Additionally, it may demonstrate a hierarchy between the two factors; GATA6 is expressed first suggesting that it may initiate or at least bolster

GATA4 expression. This is in line with data implicating GATA6 as the more integral factor in human pancreas development. GATA6 expression is also initiated promptly during differentiation, its loss would likely abrogate definitive endoderm formation as shown in previous work (Chia et al., 2019). In addition, in mice GATA6 is known to regulate Rbpjl (Martinelli et al., 2012). Thus, it is of note that the

latter GATA6 expression peak during differentiation also appears a day prior to peak RBPJL expression, indicating that the role of GATA6 in acinar development may be conserved in humans.

PDX1 exhibits an almost linear increase in expression over time through the differentiation, coming on at the onset of the posterior foregut stage. This is just a few days after SOX17 and FOXA2 expression initiation, as seen during human pancreas development (Jennings et al., 2013). The expression pattern of MNX1 is rather unique, as it peaks once at day 4, before decreasing only to return two days after. The second period of MNX1 expression initiates at the same time as SOX9 in the differentiation. SOX9 is to date the only protein shown to be co-expressed with MNX1 in the human embryonic pancreas (Pan et al., 2015). SOX9 expression during the pancreatic differentiation program appears to be analogous to that of PDX1, as found in human pancreas development, illustrating the reliability and robustness of the in vitro model (Jennings et al., 2013).

All of the hepatocyte nuclear factors demonstrate similar expression kinetics. As HNF4A, HNF1A and HNF1B all begin expression early in the pancreatic differentiation protocol it is understandable that human mutants for these genes result in MODY: MODY1, MODY3 and MODY5 respectively (Frayling et al., 2001; Haumaitre et al., 2006). HNF6 differs slightly, coming on a few days later but otherwise retaining the same expression pattern as differentiation continues.

HES1 had one early peak of expression, succeeded by a steady increase throughout the differentiation procedure. Its expression profile implies that as differentiation progresses, Notch signalling becomes increasingly active and cells likely replicate at an increased rate. This increase in replication is observed during differentiation. Furthermore, a drop in HES1 expression is observed following peak NGN3 expression on day 9. Therefore, it can be inferred that high NGN3 levels may lead to downregulation of HES1, as seen in mouse cells (Shih et al., 2012).

By the pancreatic endoderm stage, NKX6.1 expression is present. When compared with PTF1A expression, there is a clear inverse correlation. It has been established in mice that PTF1A and NKX6.1 inhibit one another (Schaffer et al., 2010). These expression profiles support this interaction being active within human cells. Each of the acinar transcription factors evaluated had a spike in expression in the final days of differentiation: NR5A2 is first at day 9 (and maintains expression), PTF1A at days 10-11 and RBPJL at day 11. These three factors work together to drive acinar gene expression in mice, in particular PTF1A and RBPJL (Masui et al., 2007). Therefore, it is unsurprising that the expression patterns for these genes overlap at days 9-11. Unfortunately, due to oversight, the expression of RBPJ was not studied. The study of RBPJ would have been of interest in relation to RBPJL, to see how the profile of each gene changes during in vitro differentiation.

Additionally, the genes derived from prior bioinformatic analysis were also examined. XBP1 did not exhibit any great variation in expression over the entire differentiation program. XBP1 is a known component of the unfolded protein response (UPR), an integral and conserved response pathway that ensures aberrant proteins do not accumulate within the endoplasmic reticulum (Iwakoshi et al., 2003). Thus, steady expression of XBP1 would not be unexpected given its essential function. NKX2.3 expression appears during the primitive gut tube stage with a sharp peak. This period is equivalent to the beginning of intestinal and spleen specification; this is of note as data established in mice found that Nkx2.3 mutants had defects in these organs (Pabst et al., 1999). However, how this expression relates to pancreas development remains unclear. EHF expression gradually increases, and EHF is known to regulate CDH1 (Zhao et al., 2017). Given EHF is active alongside PDX1, it may be that these two transcription factors work together to regulate CDH1 throughout pancreas development (Svensson et al., 2007). HEYL provided a noteworthy expression profile, decreasing during differentiation with one increase at day 10, which matches the drop in expression elicited by HES1. As both these factors are related and defined by Notch signalling, it is of interest that they exhibit such divergent expression kinetics. As Hes1 is associated with high Notch signalling, HEYL may be active in the absence of Notch signalling, potentially as a result of NGN3 regulation (Shih et al., 2012).

The remaining factors outputted from LgPCA and Mogrify (ZNF469, TLX2 and PLA2G1B) did not show any expression during the in vitro differentiation protocol. It is understandable that ZNF469 expression was not observed, given there is negligible data pertaining to its role in pancreas development. Likewise, Tlx2 has only been associated with mesoderm development, thus expression in an endodermally derived tissue and its precursors would not be anticipated (Tang et al., 1998). Lastly, PLA2G1B, which encodes an acinar secreted enzyme, was not seen either. This is likely a result of the minimal exocrine specification displayed by the differentiation, as cells are driven towards pancreatic endoderm. As such, genes like PTF1A, RBPJL and NR5A2 are expressed only briefly, and RBPJL in particular is only weakly expressed, as reflected in its Ct values.

With all the in vitro differentiation data collected, I utilised pre-set criteria to reduce the list of outputted genes by half. These criteria were: expression from days 7-8, as this marks the onset of PDX1, PTF1A and SOX9 expression, all of which are key markers of pancreatic progenitors; and maintained or increasing expression over time. As ZNF469, TLX2 and PLA2G1B were not present at all during the in vitro differentiation protocol, none were employed for downstream testing. This decision is strengthened by the knowledge that ZNF469 and TLX2 are not associated with the pancreas and that work in mice suggests a lack of Pla2g1b is in fact beneficial to pancreas health (Cash et al., 2011). HEYL was not taken forward either, given that its expression decreased over time and that it seemingly marks reduced Notch activity. EHF did fit the criteria, but with several factors ranked higher it was not utilised.

The expression patterns of NKX2.3 and GATA4 are both unique and associated with endoderm development, but as they lose expression over time they were not used. Of the remaining factors (PTF1A, FOXA2, PDX1, MNX1, XBP1, NR5A2, RBPJL and HNF6), all fit the criteria except PTF1A and XBP1. PTF1A was still taken forward given its importance in pancreas development and its presence in both LgPCA and Mogrify. Only XBP1 was not employed for downstream testing, as its expression remained completely unchanged over the differentiation. In addition, in mice Xbp1 loss is in fact advantageous (Hess et al., 2011). All the other transcription factors fit the pre-established criteria; many were conserved in both analyses and are well described in the pancreas.
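The selection criteria described above can be illustrated with a minimal sketch. It assumes one possible reading of the criteria (a gene must be detected within the day 7-8 window, and its level on the final sampled day must be at least as high as in that window). The function name, the detection limit and the toy profiles are hypothetical and are not taken from the thesis data.

```python
from typing import Dict


def fits_criteria(profile: Dict[int, float],
                  onset_days=(7, 8),
                  detection_limit: float = 0.0) -> bool:
    """Return True if a gene is detected in the onset window and its
    expression is maintained or increased by the final time point."""
    onset_levels = [profile.get(day, 0.0) for day in onset_days]
    expressed_at_onset = max(onset_levels) > detection_limit
    final_level = profile[max(profile)]  # level on the last sampled day
    maintained_or_increasing = final_level >= max(onset_levels)
    return expressed_at_onset and maintained_or_increasing


# Toy relative-expression profiles (day -> level); hypothetical values only.
toy_profiles = {
    "rising_gene": {4: 0.0, 7: 1.0, 8: 2.0, 11: 5.0},
    "falling_gene": {4: 5.0, 7: 3.0, 8: 2.0, 11: 0.5},
    "absent_gene": {4: 0.0, 7: 0.0, 8: 0.0, 11: 0.0},
}

shortlist = [name for name, profile in toy_profiles.items()
             if fits_criteria(profile)]
```

A rule of this kind is deliberately blunt: it would exclude any gene that peaks early and then declines, however important that gene may be.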

The final list of transcription factors for testing was PDX1, RBPJL, MNX1, NR5A2, FOXA2, HNF6, PTF1A and SOX9. SOX9 was included given its importance in pancreas development, but also as a control to assess how credible the predictive analyses may be. It should be noted that the derivation of this list has multiple flaws. The arbitrary cut off at 10 transcription factors from both analyses, as well as the later reduction in size to 7 factors, severely limits the scope of potential genes that could have been studied. However, given the exhaustive downstream analysis, it was paramount that a small list of transcription factors be generated. Furthermore, assessing which genes are applicable for transdifferentiation through the use of in vitro differentiation assumes that differentiation can be considered an accurate representation of human pancreas development. Whilst many transcription factors seem to exhibit analogous expression compared to that of the embryonic pancreas, this system is still not a completely representative model. The SCT pancreatic progenitor in vitro differentiation protocol utilises 2D culture with daily media changes; as a result, this model will never truly reflect the dynamic 3D system of the developing human. Its use was one of convenience and reliability, as this model is highly replicable and simple to conduct. Finally, the criteria set were not without flaw. The preference for genes that maintain or increase expression over time leads to the omission of factors which could be integral but lose expression as differentiation progresses, for example GATA4 and NKX2.3. Whilst these would have been omitted regardless due to the cut off of genes taken forward, their study would still be of interest. To summarise, whilst this final list of genes is based on certain assumptions and is not without flaw, the key intention of these criteria was to efficiently generate a viable and workable list of transcription factors to test.

4.3. An improved method for AAV generation and purification

In order to introduce these genes to cells, a method of gene delivery was required. This is summarised in Figure 8A. The first step in this process was the cloning of each gene into the target pAAV CMV MCS vector. Once each vector was created, it was transfected into HEK293 cells with DJ1 and pHelper plasmids. During the early development of this protocol, PEI was utilised for transfection as it is a cheap and effective reagent for transfecting plasmids. Cells were then harvested and lysed by freeze thawing. At this point, the AAV purification method developed by McClure et al., 2011 was integrated. In the first instance, post lysis supernatant was resuspended at an increased concentration and in a reduced volume compared to the paper’s method. This ensured the viral supernatant would require less benzonase nuclease, reducing expense greatly. Following benzonase nuclease treatment, cells were spun down once more and the supernatant was either stored at -80 °C or purified immediately.

The following stages deviated little from McClure et al., 2011. The HiTrap heparin columns were equilibrated as described in the paper, then the AAV supernatant was applied to the column. The subsequent washes were identical to those in the paper and allowed the collection of virus eluate. From here virus was concentrated in the same manner as the paper, utilising 100 kDa molecular weight cut off filters to concentrate the purified virus. However, one additional step was added to save further cost. All HiTrap heparin columns were regenerated. This was achieved by a four step wash process. Firstly, high salt was washed through the column to remove any residual virus not already eluted. Secondly, a low concentration salt wash was applied to remove any lingering debris. Thirdly, a wash of water removed excess salt from the previous washes. Lastly, columns were washed and stored in 20% ethanol as per the manufacturer’s instructions. This ensured the columns could be reused multiple times for the purification procedure.

To test the efficacy of the protocol, samples from each phase of GFP AAV purification were taken, applied to HEK293 cells and tested via FACS. The crude lysate was able to induce GFP expression in almost 70% of the population. Heparin column eluate was able to induce mild GFP expression, close to 20%. This reduction is expected given the eluate was of considerably larger volume compared to the initial crude lysate. The column flow through and filter flow through both elicited no GFP expression, as expected. The purified and concentrated virus could induce GFP in over 80% of cells. It could be argued that such a minimal increase does not merit all the downstream processing from the crude lysate. However, throughout this project it became clear that the presence of debris and contaminants in crude lysates had detrimental effects on cells, often leading to excessive cell death and reduced proliferation. Hence, purification and concentration were beneficial not only in terms of optimisation but also for cell survival, which is undoubtedly critical to any further analysis.

Every virus generated was assessed for the number of genome copies by qPCR. From this value, and knowing the number of cells at the start of any given experiment, a specific MOI could be applied to cells. An MOI of 10,000 was found to result in close to 100% GFP expression following GFP AAV delivery to cells. This number was also chosen as previous studies employing AAVs in reprogramming applied MOIs close to 20,000 across several cell types including fibroblasts (Khan et al., 2010; Weltner et al., 2012). Thus, it could be assumed that exposure to multiple viruses would mean the majority of cells would express all the transduced genes. All pancreatic transcription factors were observed to generate protein via Western blot, except PTF1A due to the lack of a functioning antibody.
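The dosing arithmetic implied here can be sketched as follows, assuming MOI is taken as viral genome copies per cell and the titre is expressed as genome copies per microlitre. The function name and the example cell number and titre are hypothetical illustrations, not values from this work.

```python
def virus_volume_ul(cell_count: int, moi: float, titre_gc_per_ul: float) -> float:
    """Volume of virus stock (in microlitres) needed to reach a target MOI.

    Assumes MOI = genome copies per cell, with the titre measured by qPCR
    as genome copies (gc) per microlitre.
    """
    if titre_gc_per_ul <= 0:
        raise ValueError("titre must be positive")
    genome_copies_needed = cell_count * moi
    return genome_copies_needed / titre_gc_per_ul


# Hypothetical example: 50,000 cells at an MOI of 10,000 with a stock
# of 1e9 gc/ul requires 5e8 genome copies, i.e. 0.5 ul of stock.
example_volume = virus_volume_ul(50_000, 10_000, 1e9)
```

When several AAVs are combined, the same calculation would be repeated per virus, since each stock has its own titre.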

4.4. AAVs are effective vectors for gene delivery and expression

With the production of AAVs set up, validation of their function was necessary. Therefore, the replication of a previously established experiment was undertaken. Previously, Teo et al., 2015 noted that lentivirally transduced PDX1 could reduce hepatic gene expression in the HepG2 cell line. Thus, I aimed to repeat this result by employing AAVs in place of lentivirus. Induction of the same three genes was achieved through AAV exposure; these were GFP, N196S PDX1 (mutated PDX1 unable to bind DNA) and PDX1. Cells were treated for the same length of time as in the original paper with an MOI of 10,000 to maximise the number of infected HepG2 cells.

PDX1 levels were significantly elevated in N196S PDX1 and PDX1 AAV treated HepG2 cells. Four hepatic genes were examined as in Teo et al., 2015: AFP, ALB, APOA2 and TTR. These genes were tested for mRNA expression in the three AAV treatments and, unlike in the original paper, in untreated HepG2 cells. As anticipated, all hepatic genes showed a significant decrease in expression in PDX1 AAV treated cells compared to GFP AAV treated cells, as was seen with lentiviral treatment. That being said, GFP AAV treatment did lead to a minor increase in hepatic gene expression. Therefore, it was important to additionally report untreated HepG2 expression levels. Barring TTR, all hepatic genes also displayed significantly reduced expression relative to untreated HepG2 cells. From this it was evident that AAVs could be utilised effectively as a vector for delivering and inducing ectopic gene expression.

4.5. AAVs are able to induce transdifferentiation

As AAVs could now be deemed effective in gene delivery, their ability to transdifferentiate was in turn investigated. To this end I hoped to replicate work by Huang et al., 2014 in which human fibroblasts were transdifferentiated to hepatocytes, dubbed hiHeps. This study was chosen as many laboratories have already validated work regarding transdifferentiation towards β cells (Akinci et al., 2012; Akinci et al., 2013; Saxena et al., 2016). Instead, this work offered the opportunity to further validate not only AAVs but also the transcription factors themselves in transdifferentiation to hepatocytes (HNF4A, HNF1A and FOXA3).

AAVs encoding each gene were applied to FLFs, and cells were grown for a total of 10 days. To further enhance AAV transdifferentiation, treatment was supplemented with a brief dose of nocodazole. Nocodazole disrupts microtubule assembly, and its addition has been observed to increase AAV delivery into the nucleus. Perinuclear accumulation is common upon AAV treatment and limits viral transduction. Through the disruption of the microtubule network this accumulation is abolished, allowing greater viral uptake into the nucleus (Xiao et al., 2016).

Following exposure to HNF4A, HNF1A and FOXA3 AAVs, treated FLFs exhibited high ectopic expression of each transcription factor. However, endogenous expression was only observed at the FOXA3 UTR. Nonetheless, multiple markers of hepatocytes were induced or increased post AAV exposure. These markers were the same as those described in Huang et al., 2014, emphasising that these cells had been pushed toward a similar hepatic phenotype. This was further highlighted by cell morphology, which came to mimic that of hiHeps. In addition, as with hiHeps, no AFP expression was observed in HHF treated FLFs, further illustrating the reproducibility of this protocol.

Despite this success, differences were noted as a result of modifications to the protocol. In the original paper the authors found that cells lost the ability to proliferate after exposure to the three transcription factors. Accordingly, they introduced the SV40 large T antigen via lentivirus to promote expansion of the transdifferentiated cell population. Here SV40 was not employed, and so cells showed minimal growth past days 5-6 and eventually began to die. This meant the cells were not able to mature and express hepatic genes to their highest potential. Moreover, due to limited options, HepG2 cells were used as a positive control against which to benchmark hepatic gene expression. It should be noted that these cells typically express genes at higher than physiological levels compared to human hepatocytes. Finally, due to limited cell number, cells were not assessed by PAS staining; instead GYS2 expression was used as a proxy. The data suggest that HHF treatment induced high GYS2 expression; however, this is no true substitute for staining.

As an addendum to this, I also attempted transdifferentiation towards the exocrine tissue, specifically the acinar cells of the pancreas. This portion of the pancreas is considerably understudied compared to its endocrine counterpart, despite the fact that it comprises the majority of the organ. Work in mice has established PTF1A, RBPJL and NR5A2 as critical transcription factors in the development of exocrine tissue. As all of these were output from both bioinformatic analyses, their usage as a trio of factors, specifically with the intent of generating acinar cells, was assessed here.

Application of PTF1A, RBPJL and NR5A2 AAVs to FLFs was investigated in a similar vein to that of transdifferentiation to hepatocytes. A number of acinar genes were tested via qPCR and compared to the expression found in a 12 week fetal pancreas. High exogenous expression of each factor was noted, as well as the induction of many acinar genes. However, genes relating to growth and proliferation did decrease significantly. Additionally, UTR expression was not present for any gene; from this it could be inferred that these factors are unable to maintain their own expression. This would explain the spikes in expression seen during differentiation, meaning these genes are induced by other transcription factors.

This experiment served to convey the possibilities of transdifferentiation to an acinar phenotype and the potential for future study in this vastly underexplored portion of pancreatic tissue. Work is beginning to detail the importance of the acinar cells, particularly in regard to cell therapy, as the addition of these cells may in fact aid the process (Loganathan et al., 2010; Loganathan et al., 2011). Both of these approaches described how AAVs can be used to drive change in cell type. Neither protocol represents a complete transdifferentiation, most likely due to the short time frame, the lack of SV40 for hepatocytes and, in the case of the acinar transdifferentiation, the lack of a complementary medium for the cells. Nevertheless, they are a clear indicator that when the correct transcription factors are paired with AAVs transdifferentiation is possible, paving the way for further downstream testing.

4.6. The characterisation of exocrine transcription factors during late embryogenesis

The exocrine transcription factors of the pancreas have typically been less well documented, and thus less well understood in terms of scope and function. With access to early human tissue I sought to visualise these factors during embryonic development. The three factors of interest were PTF1A, RBPJL and NR5A2.

PTF1A was observed to be present across the whole pancreas at CS18, having an equivalent expression pattern to that of PDX1. This further strengthens the designation of these two factors as integral to pancreatic progenitors. Moreover, at this point in development co-expression of SOX9 with PDX1 has also been noted (Jennings et al., 2013). Thus at CS18 all three defining factors of the pancreatic progenitors are co-expressed with one another. This period of co-expression is likely not limited to CS18, but additional study will be required to truly determine this. Furthermore, as RNAscope was employed here, this expression may not truly reflect the protein expression of PTF1A. Hopefully in time a robust antibody for PTF1A can be generated to provide greater insight into this vital transcription factor.

Likewise, RBPJL was also assessed for expression via RNAscope due to the lack of an effective antibody. Here multiplex RNAscope established PDX1 and RBPJL expression. PDX1 expression mimicked that of its perceived protein expression, implying RNAscope is a reliable proxy for protein expression. RBPJL expression appeared to overlap with that of PDX1. However, RBPJL was more localised toward the edges of the developing pancreas, likely the regions that will go on to form the acinar tips. This is unsurprising given the association of RBPJL with exocrine development in mice (Masui et al., 2011). A similar expression pattern was seen for GATA4 at CS19 during embryonic development (Jennings et al., 2013).

Thankfully, a reliable antibody for NR5A2 was available, having been utilised previously for Western blot analysis. NR5A2 was expressed across the entire pancreas. Given that loss of Nr5a2 in mice causes a reduced pancreatic progenitor pool, it may be that its co-expression along with PDX1, PTF1A and SOX9 has an additive effect on progenitor proliferation (Annicotte et al., 2003). In mice NR5A2 has also been indicated to regulate many transcription factors such as Foxa2 and Gata6 as well as Ptf1a and Rbpjl (Hale et al., 2014). Evidently at this stage PTF1A, RBPJL and NR5A2 are co-expressed, and previous data has found FOXA2 and GATA6 expression in the pancreas at CS18 (Cebola et al., 2015). It could therefore be suggested that NR5A2 may retain these regulatory powers during human development.

These data, taken together with all previously established work regarding PDX1, RBPJL, MNX1, NR5A2, FOXA2, HNF6, PTF1A and SOX9, highlight that all of these factors are active within the pancreas at the same period and locations. These observations are encouraging as they signify that LgPCA and Mogrify have generated physiologically relevant factors for transdifferentiation. In addition, they demonstrate that during this period the transcription factor profile of pancreatic progenitors is highly diverse, as factors from all three lineages are seen co-expressing.

4.7. Transcription factors derived from bioinformatic analysis have capability to transdifferentiate cells

From LgPCA and Mogrify analysis 7 transcription factors were derived, with the addition of SOX9 AAV. Additionally, two controls were employed during testing: GFP and N196S PDX1 AAVs. With all AAVs generated, they were applied to FLFs, and cells were grown and then harvested for qPCR analysis across a large number of pancreatic progenitor related genes. The intention was to determine the regulatory power of each transcription factor tested, so as to discern an ideal trio of transcription factors for transdifferentiation. A complementary set of experiments was also run, in which the opposing protocol was carried out: cells were exposed to various cocktails of AAVs, each of which lacked one of the 8 transcription factors. Combined, these data sets helped explicate many details surrounding transcriptional regulation of pancreatic transcription factors.
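The leave-one-out design can be enumerated with a short sketch. The factor names come from the final list given earlier; the function name and the dictionary layout are illustrative assumptions rather than part of the experimental protocol.

```python
from typing import Dict, List

# The 8 transcription factors taken forward for AAV testing.
FACTORS: List[str] = ["PDX1", "RBPJL", "MNX1", "NR5A2",
                      "FOXA2", "HNF6", "PTF1A", "SOX9"]


def leave_one_out_cocktails(factors: List[str]) -> Dict[str, List[str]]:
    """Map each omitted factor to the cocktail of remaining factors."""
    return {omitted: [f for f in factors if f != omitted]
            for omitted in factors}


cocktails = leave_one_out_cocktails(FACTORS)
# For example, cocktails["PDX1"] holds the 7 factors applied without PDX1 AAV.
```

Pairing each 7 factor cocktail with the matching single factor treatment gives two complementary views of the same factor: what it can induce alone, and what is lost when only it is absent.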

The first phase of testing involved the application of individual vector control AAVs to FLF cells. GFP exposure to FLFs was assessed first, and only minimal changes in gene expression were noted. Cells had a mild change in morphology and elicited GFP expression. Similarly, N196S PDX1 AAV caused downregulation of a couple of the same genes, and FLF morphology was similar to that of GFP treated cells. As no endogenous or functional genes are being expressed, it could be inferred that these effects are a result of the AAV treatment itself.

Upon PDX1 AAV addition to cells, several pancreatic genes displayed elevated expression. As previous data in humans has identified, GATA4, GATA6, PDX1, HNF6, HNF1B and NKX6.1 were all upregulated following PDX1 AAV treatment (Shi et al., 2017; Wang et al., 2018). Furthermore, NGN3, PTF1A and CDH1 were also upregulated, all of which have been established as downstream of PDX1 in mice, illustrating that the regulatory landscape of PDX1 is shared between humans and mice (Dahl et al., 1996; Svensson et al., 2007; Oliver-Krasinski et al., 2009). It should be stated that PDX1 AAV is the only treatment able to induce PTF1A, NGN3 and CDH1 expression, serving as testament to its importance during development. Two further factors, MNX1 and RBPJL, were also upregulated. Known interactions regarding these two genes are minimal due to lack of study, though their upregulation is not unanticipated given the integral nature of PDX1 in pancreas development. Especially in the case of RBPJL, given its homolog is known to be regulated by PDX1, it could be inferred that the same mechanism of regulation is at play here (Wang et al., 2018). Intriguingly, PDX1 AAV exposed cells became rounded and smaller, potentially as a result of CDH1 upregulation. Lack of PDX1 AAV in the following experiment had a smaller impact on gene expression, most likely due to the high level of transcription factors still being expressed via AAVs. Intriguingly, genes such as GATA4 and NKX6.1 increased when no PDX1 AAV was present. This hints towards a more complex interaction between the two than simple initiation of expression. The increase in NKX6.1 may mean that PDX1 ensures NKX6.1 levels do not become excessive and have an inhibitory effect on PTF1A, which would cause a reduction in the pancreatic progenitor pool.

RBPJL AAV also increased expression of several genes. As mentioned, little is described concerning RBPJL; however, the one known regulatory role described in mice, over PDX1, is conserved in human cells (Miyatsuka et al., 2007). Evidently RBPJL has less of an overall effect on gene expression compared to PDX1, which is to be expected given its more niche role in the exocrine pancreas. This is further displayed by cell morphology, which is not overly distinct from that of untreated FLFs. RBPJL AAV removal has little effect on gene expression, though GATA6 levels increase significantly. GATA6 in mice regulates RBPJL; with GATA6 expression increasing in this way, it implies RBPJL may inhibit GATA6 expression. This would be in keeping with the kinetics observed during in vitro differentiation, which portray an almost mutually exclusive pattern of expression.

The downstream regulatory capabilities of MNX1 have yet to be described. Here this work defines it as a major player, able to cause differential expression of many genes: FOXA2, GATA4, GATA6, PDX1, HNF1B, SOX9, NKX6.1, HES1, RBPJL, NR5A2 and CDH1. Given MNX1 is expressed early on during the in vitro development, it is possible that it acts as a vital transcription factor, capable of inducing many genes. Additionally, it is able to upregulate both SOX9 and HES1, signifying a potential connection to the Notch signalling pathway. This relation to Notch signalling is compounded by its similar expression kinetics to that of HES1 during differentiation. This hypothesis is further supported by the observation that exposure to 7 factors without MNX1 AAV also leads to a reduction in HES1 expression.

The acinar factor NR5A2 was only able to increase expression of a few genes. Several of these have been found to be regulated by NR5A2 in mice: GATA6, PTF1A and RBPJL (Hale et al., 2014). In the case of GATA6, NR5A2 AAV elevated its expression the most out of the 8 transcription factors. Looking at the in vitro differentiation data, it could be speculated that NR5A2 is responsible for the second wave of GATA6 activation. GATA6 is known to be active within the exocrine portion of the pancreas, so this regulatory interaction may be critical during acinar development (Martinelli et al., 2012). As expected, lack of NR5A2 AAV causes a significant decrease in GATA6. This demonstrates the strong regulatory potency of NR5A2 upon GATA6.

FOXA2 AAV can activate a multitude of genes relating to pancreas development. In particular, it can alter expression of its fellow family member FOXA3 along with many other genes: GATA4, GATA6, PDX1, MNX1, HNF6, HNF1B, HNF1A, SOX9, HES1, RBPJL and NR5A2. FOXA2 is active from the earliest stages of development, aiding in the formation of the definitive endoderm, and maintains its expression through pancreas development (Ang et al., 1993; Jennings et al., 2013). Consequently, its capability to activate a multitude of genes is to be expected. Mouse work has demonstrated that three of these factors are targets of FOXA2 (GATA4, PDX1 and SOX9), highlighting the conservation of regulatory interactions between humans and mice. Given the importance of FOXA2, its removal during the 7 transcription factor analysis resulted in many genes reducing in expression. Several genes it upregulated in the single factor analysis were downregulated: GATA6, HNF1B and HES1. Furthermore, FOXA2 overexpression results in a reduction of SOX9 expression, a critical factor for cell proliferation (Seymour et al., 2012). This may explain why FOXA2 levels decrease over time during the in vitro differentiation, in order to permit stable SOX9 expression.

The application of HNF6 AAV to cells led to an increase in expression of a number of genes. Known targets in mice such as PDX1 and HNF1B were found to be upregulated following HNF6 overexpression (Jacquemin et al., 2003; Maestro et al., 2003). No NGN3 expression was displayed in HNF6 AAV exposed cells but a reduction in HES1 was observed, suggesting HNF6 may also manipulate NGN3 levels via HES1. Lack of HNF6 resulted in little change to gene expression, though FOXA3 expression rose, indicating HNF6 could regulate its expression as it does the related gene FOXA2. Both Forkhead factors do reduce in expression at a similar period to the induction of HNF6, making this interaction a possibility.

PTF1A AAV activated a similar group of genes to that of HNF6, though it in fact increased HES1 expression. Alongside this, as found in mice, PDX1, MNX1 and HNF6 are all regulated by PTF1A (Miyatsuka et al., 2007; Thompson et al., 2011). It can also induce expression of RBPJL, a binding partner transcription factor active in specifying the acinar cells (Beres et al., 2006). Its removal in the 7 factor study sees HES1 levels decrease, again hinting at a possible connection to Notch signalling. PDX1, PTF1A and SOX9 may work in unison to promote HES1 expression within the pancreatic progenitors and promote expansion of the cell population. Interestingly, NKX6.1 rises in expression without PTF1A AAV present. This is to be expected given the inhibitory role played by PTF1A upon Nkx6.1 (Schaffer et al., 2010).

Lastly, SOX9 AAV also provides a comparable activation pattern to the latter two factors. HNF6, HNF1B and HES1 all show differential expression following SOX9 overexpression, in accordance with mouse studies (Lynn et al., 2007; Seymour et al., 2007). Oddly, lack of SOX9 also causes a decrease in HES1 expression, implying a greater complexity to their interaction than previously understood. Further work regarding this, most likely in relation to Notch signalling, may be required.

With both sets of experiments complete, the results were synthesised into one coherent data set. I chose to determine factors of greater transdifferentiating power based on their ability to significantly increase pancreatic gene expression or, in the case of their removal, to cause a significant decrease in downstream gene expression. This was selected as it was a quantifiable method for homing in upon key transcription factors. However, it is not without its weaknesses: this method takes no account of the physiological levels exhibited by in vitro differentiation, instead focusing only on significant differences with or without treatment. This means factors which may be able to induce high levels of gene expression are weighted the same as those able only to mildly shift expression. To counteract this, estimation statistics were run, which highlighted that the significance aptly represented the data itself (Appendix VII-XV). Moreover, it could be argued that certain key genes, such as PDX1, PTF1A and SOX9, should be awarded greater weighting if activated. This was not implemented given all three genes were introduced into cells via AAVs. Thus, to give each greater importance within these tests would be inherently flawed. The choice of genes tested was designed to encompass all the major factors active in the early stages of pancreas development so as to provide the clearest picture of pancreatic transdifferentiation capability. It should be noted that for the 8 factor work only the SOX9 UTR was tested. All other genes had no UTR expression, most likely as the genes themselves were present exogenously at extreme levels.
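The unweighted tally described in this paragraph can be sketched as follows. The sketch assumes each factor scores one point per gene significantly increased when it is applied alone and one point per gene significantly decreased when it alone is omitted. The function name, the input layout and the placeholder factor and gene labels are hypothetical and do not reproduce the results table.

```python
from typing import Dict, List, Set, Tuple


def rank_regulatory_power(up_when_added: Dict[str, Set[str]],
                          down_when_removed: Dict[str, Set[str]]
                          ) -> List[Tuple[str, int]]:
    """Rank factors by an unweighted count of significant gene changes.

    Every significant change counts once, regardless of effect size,
    which mirrors the weakness acknowledged in the text.
    """
    factors = set(up_when_added) | set(down_when_removed)
    scores = {
        factor: len(up_when_added.get(factor, set()))
        + len(down_when_removed.get(factor, set()))
        for factor in factors
    }
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Placeholder inputs (hypothetical labels, not experimental results).
toy_up = {"TF_A": {"gene1", "gene2"}, "TF_B": {"gene1"}}
toy_down = {"TF_A": {"gene3"}, "TF_C": set()}
ranking = rank_regulatory_power(toy_up, toy_down)
```

Weighting by fold change or by the identity of the target gene would be a straightforward extension, at the cost of introducing the circularity the text cautions against for genes delivered by AAV.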

The table generated found that FOXA2, PDX1, MNX1 and PTF1A elicited greater regulatory power than the other 4 factors. In particular, FOXA2 was found to be a potent transcription factor in altering downstream gene expression. This is understandable given its importance in defining the early endoderm and later the pancreas, which is no doubt vital to the induction of many downstream genes. Of note, SOX9 was found to be a more effective manipulator of gene expression than several transcription factors teased out from both LgPCA and Mogrify, bringing into question the validity of the insight gleaned from these two analyses. This suggests that the choice of transcription factors for transdifferentiation may be better informed by essential or important genes active within the tissue of interest. With these final 4 factors elucidated, subsequent work was intended to determine which factor could be taken out to form the final ideal trio.

To assess the final 4 factors, an approach analogous to the previous stage of analysis was utilised. Cells were exposed to all 4 transcription factors or to a combination of three. Each combination was then compared to the treatment with all 4 AAVs. Untreated FLFs had reduced expression of almost every gene except SOX9. Given that SOX9 expression was seemingly essential to FLF growth and survival, it is unsurprising that this gene did not falter.
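The design of this elimination step can be written out explicitly. The following is only a sketch of how the treatment groups are enumerated, with a hypothetical helper name; it does not represent any software used in the project.

```python
from itertools import combinations

FACTORS = ("FOXA2", "PDX1", "MNX1", "PTF1A")


def leave_one_out(factors):
    """Return (omitted factor, remaining factors) for every three-factor group."""
    full = set(factors)
    design = []
    for trio in combinations(factors, len(factors) - 1):
        omitted = (full - set(trio)).pop()
        design.append((omitted, trio))
    return design


design = leave_one_out(FACTORS)
for omitted, trio in design:
    # Each group is compared against the treatment containing all 4 AAVs.
    print(f"without {omitted}: {', '.join(trio)}")
```

Four factors give four three-factor groups, each lacking exactly one AAV, so any change relative to the four-factor treatment can be attributed to the omitted factor.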

Lack of PDX1 AAV caused minimal change in gene expression. Notably, the upregulation of GATA4 was conserved from the previous stage of analysis, which is interesting as PDX1 is also one of the few factors able to induce GATA4. This would be in keeping with mouse data which indicate that both factors are able to regulate one another (Rojas et al., 2009; Carrasco et al., 2012). PDX1, FOXA2 and PTF1A AAV treatment also showed little change compared to 4 factor treatment. The MNX1 UTR increased, indicating a potential compensatory effect and implying a level of importance in MNX1 expression. This was seen again in cells lacking FOXA2 AAV, along with a reduction in HNF6. HNF6 is an important marker of pancreas identity, and its loss of expression on removal of FOXA2 AAV signifies a key regulatory relationship, especially when combined with the single factor data. Lastly, the removal of PTF1A AAV led to many genes being increased, suggesting this transcription factor to be the least potent, in line with data taken from the previous stage of analysis.

Once again, significant changes in gene expression were employed to define the final trio of factors. This does fall into the same traps as the previous analysis. However, because this final stage eliminated just a single transcription factor, the likelihood of removing a potentially integral factor was lessened. Combined with the data that clearly indicated PTF1A AAV as the most expendable choice, the final trio was clear: PDX1, MNX1 and FOXA2 (PMF). The key takeaway from this analysis was the conservation of known regulatory interactions between mice and humans. The conservation highlights that whilst expression kinetics may vary between mice and humans, the regulation elicited by transcription factors does not. This implies that differences in expression profile may play a greater role in species difference than the proteins themselves. That being said, future work should seek to verify the regulatory data explicated here. One final takeaway should be the flaws in using the Livak method (ΔΔCt) for deriving gene expression levels. The success of this methodology relies on a reliable and applicable control, without which interpreting results can be a challenge. In this case the control was untreated samples, for the sake of simplicity. However, the expanded figures in the appendix address this to an extent by using in vitro derived pancreatic progenitors as a control. Nevertheless, absolute quantification may be more applicable in the future, especially for genes that may be expressed at physiologically low levels.
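The dependence of the Livak method on the chosen control can be seen in a short worked sketch. The Ct values below are hypothetical and chosen only to make the arithmetic easy to follow; the function names are illustrative.

```python
def delta_ct(ct_target, ct_reference):
    """Normalise the target gene Ct to the reference (housekeeping) gene Ct."""
    return ct_target - ct_reference


def fold_change(sample, control):
    """Relative expression by the Livak method: 2 ** -(ddCt).

    `sample` and `control` are (ct_target, ct_reference) tuples.
    """
    dd_ct = delta_ct(*sample) - delta_ct(*control)
    return 2 ** (-dd_ct)


# Hypothetical Ct values (target gene, reference gene), for illustration only.
treated = (24.0, 18.0)       # AAV treated cells
untreated = (30.0, 18.0)     # untreated cells, target barely detectable
progenitors = (25.0, 18.0)   # in vitro derived pancreatic progenitors

print(fold_change(treated, untreated))    # 64.0
print(fold_change(treated, progenitors))  # 2.0
```

The same treated sample reads as a 64-fold increase against untreated cells but only a 2-fold difference against progenitors, which is why a control with physiologically relevant expression makes the result easier to interpret. When the target does not amplify in the control at all, no ΔΔCt can be calculated, which is one argument for absolute quantification.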

4.8. PDX1, MNX1 and FOXA2 are potent transcription factors for transdifferentiation to pancreatic progenitors

With the three factors established, their binding to upstream promoter regions of key pancreatic genes was examined in silico. The output provided a plethora of data to study. By referring back to available ChIP-seq data it is evident that the results are valid. Cebola et al., 2015 details FOXA2 ChIP-seq data sets; looking at peaks upstream of PDX1, the binding regions match those predicted by MatInspector. Similarly, PDX1 ChIP-seq at PDX1 itself also matched the in silico data (Wang et al., 2018).

The trends noted by comparison of all regions tested highlighted various points of interest. Upstream regions of GATA4, HES1 and NGN3 were the least regulated. In the first instance this may be because transcription factors bind within intronic DNA or enhancers not investigated here. However, in the single factor analysis GATA4 and NGN3 were only ever induced at low levels compared to other genes, suggesting reduced regulation. As for Hes1, it is known to be regulated by Notch in mice and thus may have limited transcriptional regulation beyond that (Shih et al., 2012). Any change noted in the single factor analysis could therefore have arisen through direct regulation of Notch by the specific transcription factor. The lack of binding close to the MNX1 start site is most likely because that region of DNA encodes an antisense RNA for MNX1 and will thus be devoid of many sequences related to regulation.

The pattern of PDX1 and MNX1 binding seemed to be similar, with two binding points on opposing strands in close proximity to one another. This is to be expected of homeodomain containing proteins, as interaction with both strands of DNA is made by the α-helix. The homeodomain of PDX1 has been demonstrated to act in such a way (Longo et al., 2007). It can thus be hypothesised that MNX1 acts in a corresponding manner. However, FOXA2, which binds DNA through a forkhead domain rather than a homeodomain, appears to bind DNA in a more distinct manner. Recent study of FOXA2 DNA binding has found that it is likely flexible in how it binds DNA (Li et al., 2017). In addition, it has been found that the forkhead family member FOXA1 homodimerises to bind DNA (Wang et al., 2018b). Given the similarity in sequence, it can be postulated that FOXA2 likewise homodimerises, resulting in the binding pattern observed from MatInspector.

PDX1 and MNX1 binding sites being located close to one another is most probably due to the related consensus sequences shared by these proteins. The DNA regions they bind are AT rich, hence their association with one another. How all three of these factors come together to form a complex remains unclear. However, given that all three seem to typically bind in similar ‘hotspots’, they likely drive the same promoter regions, hence their synergy in transcriptional regulation.
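The idea of paired sites on opposing strands can be illustrated with a simple two-strand motif scan. This is a sketch only: it uses exact matching of the TAAT core commonly recognised by homeodomain proteins, whereas MatInspector scores weight matrices, and the promoter fragment below is an invented sequence rather than a real upstream region.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(sequence, motif):
    """Return (position, strand) for each exact motif match on either strand.

    Positions are 0-based on the forward strand.
    """
    sequence = sequence.upper()
    motif = motif.upper()
    rc = reverse_complement(motif)
    hits = []
    for i in range(len(sequence) - len(motif) + 1):
        window = sequence[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        if window == rc and rc != motif:  # avoid double counting palindromes
            hits.append((i, "-"))
    return hits


# Invented promoter fragment, for illustration only.
fragment = "GGCTAATGCCATTAGG"
print(find_sites(fragment, "TAAT"))  # [(3, '+'), (10, '-')]
```

Here the forward-strand match at position 3 and the reverse-strand match at position 10 sit within a few bases of each other, which is the kind of arrangement described for PDX1 and MNX1. AT rich regions will naturally produce many such matches, so proximity alone should not be taken as evidence of cooperative binding.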

Prior to testing the trio of transcription factors, two separate media were assessed to find an optimal culture environment in which to maintain and expand cells. PMF treated FLFs were cultured concomitantly in media from either Bonfanti et al., 2015 (organoid culture) or Trott et al., 2017 (stem cell culture). The Bonfanti media was found to maintain gene expression at relatively higher levels. Whilst the media share multiple components, one notable difference is apparent: the presence of DAPT. DAPT in the Trott media led to Notch signalling inhibition and progression towards a more endocrine phenotype. The presence of DAPT likely accounts for the fall in SOX9 and HES1 expression levels, given that they are known to be controlled by Notch signalling in mice (Shih et al., 2012). Therefore, cells were cultured in Bonfanti media.

PMF treated cells exhibited high exogenous and endogenous expression of a large selection of pancreatic genes, in many cases close to the physiological levels exhibited by day 11 and 15 in vitro differentiation samples. This is particularly evident in the case of the exocrine markers, which match or even exceed differentiation levels. It should be stated that the exocrine genes did not exhibit high levels of expression in the in vitro differentiation. Key markers from other lineages were also expressed, such as the ductal markers HNF1B and HNF6 and the endocrine markers NGN3 and NKX6.1, as well as the aforementioned exocrine markers. All of this highlights the ability of these factors to induce expression of genes from all three compartments of the pancreas, which are first specified by pancreatic progenitors. Moreover, as initially stated, the three key factors PDX1, PTF1A and SOX9 are all expressed at levels close to those of in vitro differentiation.

The majority of cells exhibited rounded cell morphology, akin to that observed in PDX1 and FOXA2 AAV treated cells. Rounded cells also seemed to form clusters with one another, indicating that activation of CDH1 expression may be promoting cell polarity and organisation (Dahl et al., 1996). However, it was clear that not all cells had undergone this morphological change, as a portion of cells still resembled fibroblasts. This implies that potentially not all cells had been infected with all three viruses. To assess this, immunofluorescence was undertaken for PDX1 and HNF6, a downstream marker not delivered via AAV. Whilst overlapping expression was found across many cells, it was evident that not every cell was positive for PDX1 expression. This highlights that not every cell was taking up virus, and presumably the same was true for MNX1 and FOXA2 AAVs. This lack of uniform infection would likely result in the diverse cell population displayed by PMF treated cells. Here a flaw becomes apparent in the use of qPCR as a representation of a cell population’s phenotype: cells are harvested and pooled prior to analysis, making the phenotype of individual cells impossible to determine.
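A rough sense of how incomplete uptake compounds across three vectors can be obtained from a Poisson model of infection. This is a sketch under stated assumptions, not an analysis performed in this project: it assumes each virus is taken up independently, that all cells are equally susceptible, and that the "effective MOI" (functional transducing events per cell, which may be far lower than the nominal dose for AAV) is known. The values used are hypothetical.

```python
import math


def fraction_infected(moi):
    """Poisson estimate of the fraction of cells receiving at least one particle."""
    return 1.0 - math.exp(-moi)


def fraction_with_all(mois):
    """Fraction of cells receiving every vector, assuming independent uptake."""
    result = 1.0
    for moi in mois:
        result *= fraction_infected(moi)
    return result


# Hypothetical effective MOIs, applied equally to all three AAVs.
for effective_moi in (0.5, 1.0, 3.0):
    triple = fraction_with_all([effective_moi] * 3)
    print(f"effective MOI {effective_moi}: {triple:.1%} of cells carry all three")
```

Under this model even a single-vector infection rate of around 63% (effective MOI of 1) leaves only about a quarter of cells carrying all three factors, so modest shortfalls in each virus could plausibly account for the mixed morphology observed.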

The result illustrated that the 3 transcription factors were able to transdifferentiate cells towards pancreatic progenitors, but that they were unable to do so in a uniform and robust manner. Therefore, considerable work would still need to be undertaken in order to produce the target cell type. However, the potency of PDX1, MNX1 and FOXA2 in transdifferentiation is evident. FOXA2 is presumably required to set the stage, acting as a broad director of early downstream expression, with PDX1 driving expression towards the pancreatic lineage. What remains to be understood is the complete regulatory role of MNX1 in the context of endoderm to pancreas development. It is interesting that MNX1 loss in mice results in dorsal pancreatic agenesis, suggesting a greater importance than is currently known. Recent work found that MNX1 can promote proliferation via the manipulation of Wnt signalling. This may be critical to pancreatic progenitors, acting as a mechanism for the expansion of cell numbers (Yang et al., 2019).

As found with previous treatments, the addition of PMF via AAVs resulted in a reduction in cell proliferation, although this was not formally measured and rests on a subjective assessment of cell number. Recent studies have revealed that overexpression of transcription factors is not without its flaws in this regard. Induced overexpression impedes DNA replication and proliferation, the latter of which is critical for successful transdifferentiation (Babos et al., 2019). The authors identified epigenetic blocks which prohibit high levels of simultaneous transcription and replication. They then went on to remove these blocks through the deletion of specific genes and the addition of growth factors. This approach could be utilised as a means to aid successful transdifferentiation in the future, instead of driving proliferation by SV40 as seen in Huang et al., 2014. Another study found that overexpression of genes induced a large immune response, resulting in inflammation and a reduced transdifferentiated population. By reducing expression levels and removing macrophages, a more successful transdifferentiation was accomplished (Clayton et al., 2016). These two approaches could be applied together so as to maximise transdifferentiation potential.

The final piece of analysis regarding transdifferentiation was the investigation of fibroblast gene expression post AAV treatment. Here the three aforementioned transdifferentiated cell types (HHF, PtRN and PMF) were assessed for fibroblast gene expression. Expression of both COL3A1 and FSP1 was significantly decreased following PMF treatment. This is expected and indicates that the cells can no longer be considered true fibroblasts. HHF treatment also reduced expression of COL3A1 and FSP1, but not significantly, whereas PtRN treatment had no effect. This is unsurprising, given that the morphology of these cells still resembles that of a fibroblast and that these cells were continually maintained in standard fibroblast media. Lastly, VIM expression was unchanged in PMF treated cells but significantly upregulated in HHF and PtRN treated cells. Increased VIM expression is often associated with the epithelial to mesenchymal transition (EMT), suggesting these cells are not fully transdifferentiated (Ivaska, 2011). Additionally, EMT is often a key process associated with cancer, meaning these cells may be less applicable to further study (Brabletz et al., 2018). To summarise, PMF transdifferentiation was a success, but considerable work is required to optimise this novel production method for pancreatic progenitors.

4.9. Further bioinformatic analysis validates established factors as well as deriving novel factors

Post testing of PDX1, MNX1 and FOXA2, LgPCA was updated with additional replicates and tissues. The opportunity was therefore taken to determine whether the same factors would be output and whether any new genes could be derived. Given this, I also decided to extend testing to further enquiry of Mogrify.

LgPCA was run with this novel data and all relevant PCs were once more combined, generating a list of genes, most of which were preserved from the former analysis. However, MNX1, EHF and TLX2 were lost from the list. The removal of MNX1 was surprising given how potent a transcription factor it has proven to be. The loss of EHF and TLX2 is less unexpected, as neither made the final shortlist of previous analyses. Their replacements, SOX6, KMT2D and ZNF587, have variable levels of association with the pancreas. SOX6 has been linked to the pancreas in various instances, most notably as a regulator of proliferation in β cells (Iguchi et al., 2007). Likewise, KMT2D has been associated not only with pancreatic cancers but also with neonatal diabetes in humans (Gohda et al., 2015; Dawkins et al., 2016). ZNF587, on the other hand, had very limited annotation regarding its role, not just in relation to the pancreas. This makes its inclusion intriguing and hints that it may merit further study. This list is curious, as it contains only one factor from the ideal trio this project derived. Hence further work regarding LgPCA was undertaken.

I believe that by selecting all pancreas specific PCs, data regarding shared signals, e.g. pancreas/liver development, may be lost. I therefore hypothesised that by examining only PCs containing pancreatic agenesis related transcription factors, further genes of greater relevance would appear. PCs containing transcription factors that cause pancreatic agenesis if mutated were therefore identified. PC 11 was chosen as it contained GATA6, the gene most associated with pancreatic agenesis in humans. This PC also contained several genes known to be involved in pancreas development. The first of these is RFX6. This transcription factor has been identified downstream of Ngn3 in mice, where it is required for islet formation (Soyer et al., 2010), whilst in humans mutations of RFX6 are known to result in neonatal diabetes (Spiegel et al., 2011). The next factor present was FOXA2. This gene was not previously elucidated from LgPCA, but given the data generated from this project its potency for gene activation is clear. Additionally, two other forkhead factors were present in this list, FOXP1 and FOXF1. FOXP1 is necessary for α cell growth and functionality in mice (Spaeth et al., 2015). FOXF1 mutations in humans result in complications of the lungs and an annular pancreas (Miranda et al., 2013). The transcription factor ISL1 has long been associated with pancreas development, specifically islet formation (Ahlgren et al., 1997). Gata5 has been implicated in regulating tgif2 in Xenopus, aiding in the specification of endoderm (Spagnoli and Brivanlou, 2008). Lastly, recent bioinformatic analysis of the human pancreas from childhood to adulthood noted an increase of TSHZ3 in β cells (Arda et al., 2016). Two genes within the list have no currently discerned connection to the pancreas. These are TOX3, a known cancer causing gene, and BNC1, which is needed for touch in humans (Price et al., 2000; Hong et al., 2015).

The other PC containing pancreatic agenesis genes was PC 13. PC 13 provided a similar list of genes to that originally derived from LgPCA, with only NR5A2, EHF and ZNF469 lost and replaced by SOX6, NFKBIZ and RBPJ. The removal of these factors suggests they may not have been as important as previously believed, vindicating the choice not to study EHF and ZNF469. In addition, NR5A2 was found to be a weak driver of downstream gene change, making its disappearance from this list logical. SOX6, as stated above, has been connected to pancreas development. RBPJ is well known in relation to both Notch signalling and pancreas development (Apelqvist et al., 1999). However, the full extent of its role during pancreas development remains to be seen, and may provide insights into not only its relation with RBPJL but also Notch signalling and its regulation. The remaining factor, NFKBIZ, has only been loosely associated with pancreatic cancers, with hints toward its use as a biomarker (Winter et al., 2013). Overall, LgPCA has provided a plethora of new factors which could be applied in the future for transdifferentiation, and it has validated previously derived factors. Arguably the most valuable feature of LgPCA is its tractability in gene inference. Criteria can be easily shifted from organ specific to lineage specific so as to distinguish more relevant factors. Lastly, the presence of PDX1, MNX1 and FOXA2 at the top of these two relevant PCs signifies the power of LgPCA.
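The shifting of selection criteria described above can be sketched as a filter over per-PC gene lists. This is an illustration of the selection logic only and is not the LgPCA implementation: the PC11 entry lists a subset of genes named in the text, while `PC_x`, `PC_y` and the `GENE_*` names are placeholders, and the function names are hypothetical.

```python
# Top-loaded transcription factors per principal component.
# PC11 holds a subset of the genes named in the text; the other entries
# are placeholders and do not reflect the real LgPCA output.
pc_top_genes = {
    "PC11": ["GATA6", "RFX6", "FOXA2", "ISL1"],
    "PC_x": ["GENE_A", "GENE_B"],
    "PC_y": ["PDX1", "GENE_C"],
}


def select_pcs(pc_genes, criterion_genes):
    """Return the PCs whose top-loaded genes include any criterion gene."""
    wanted = set(criterion_genes)
    return sorted(pc for pc, genes in pc_genes.items() if wanted & set(genes))


def candidate_factors(pc_genes, selected_pcs):
    """Pool the genes from the selected PCs, keeping order and dropping repeats."""
    pooled = []
    for pc in selected_pcs:
        for gene in pc_genes[pc]:
            if gene not in pooled:
                pooled.append(gene)
    return pooled


# Criterion: genes whose mutation causes pancreatic agenesis.
selected = select_pcs(pc_top_genes, ["GATA6", "PDX1"])
print(selected)
print(candidate_factors(pc_top_genes, selected))
```

Changing the criterion list, for example from agenesis genes to a set of lineage markers, changes which PCs are pooled without rerunning the underlying analysis, which is the flexibility attributed to LgPCA here.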

Alongside this, additional analysis by Mogrify was also run. Multiple changes to the starting cell type were made in order to tease out more factors. The initial shift to beginning with lung instead of fibroblasts of the lung saw the inclusion of insulin and INSR in place of FOXA2 and HEYL. However, given that this project was founded upon the usage of transcription factors for transdifferentiation, these new genes were of no use. Thus more changes in starting cell type were made, to drastically different tissues formed from separate germ layers. Even this change made no difference, as not a single novel transcription factor was output. As a result, a closer examination was made of the residual factors generated by the initial Mogrify run. Mogrify yielded an additional seven genes, four of which were not transcription factors: insulin, TESC, NR0B2 and INSR. Two of the transcription factors are well studied, MIST1 and FOXA3, which specify the acinar cells and define the early endoderm respectively (Ang et al., 1993; Ramsey et al., 2007). The last transcription factor was TOB1, another factor undescribed in pancreas development. It is known to have inhibitory regulatory control over β-catenin, but how this role may be applicable to the pancreas will require study in the future (Xiong et al., 2006). However, beyond these factors no further genes could be derived from Mogrify analysis despite multiple manipulations of the software.

With all this data compiled and combined with what has been explicated from this project, it is clear that LgPCA offers greater insight into transcription factor choice in relation to transdifferentiation. The flexibility of the LgPCA system allows for the derivation of a vast array of transcription factors. The ability to set criteria to identify organ specific genes, or to alter them so as to find more broadly spread signals, is a particular strength. It should be stated that Mogrify did aid in the validation of genes of interest and output all three of the ideal factors. However, when pressed for additional data, little more could be deduced. Moreover, many of the genes output from Mogrify were not transcription factors, limiting the potential scope of their use. LgPCA thus provides a broader and more detailed picture of human development and may be able to infer more niche factors which may be indispensable to future experiments.

5.0. Summary and future work

The goal of this project was to transdifferentiate cells to pancreatic progenitors. I believe this goal has been achieved, and that the work has also paved the way for a considerable amount of further research and validation. However, with the long term target being potential cell therapy, the current procedure is in no way close to optimal. The use of in vitro differentiation offers a highly replicable and effective, albeit expensive, route toward the generation of pancreatic cell types en masse. Through the study of transdifferentiation, in vitro differentiation and organoid culture, it has become clear that in vitro differentiation remains the gold standard. I only hope that in time transdifferentiation towards pancreatic cells can transform the field as iPSCs have done.

However, this is not to say that this project is without its successes. To reach this point several notable achievements were made. The first was the creation of a cost effective method for AAV generation and purification, which may aid those requiring a reliable and, in particular, non-pathogenic method for gene delivery. The second was the validation of two published papers; whilst often overlooked, such authentication is of vital importance to modern science. The third was the novel visualisation of several key but understudied transcription factors. Next came the assessment of multiple pancreatic transcription factors and their downstream effects, providing valuable insight into transcriptional regulation of the pancreas, from which 3 key transcription factors able to transdifferentiate cells to pancreatic progenitors were derived. Finally, additional analysis and validation were carried out regarding potential genes for transdifferentiation. To conclude, the enhanced understanding of the emerging pancreas’ transcriptional network offers hope for new routes in cell therapy, which may in turn help revolutionise the treatment of pancreas related illnesses.

To further assess the data derived in this project, a number of experimental changes and novel strategies could be introduced to drive this research forward. In the first instance, whilst AAVs have provided a malleable system for gene delivery during initial rounds of testing, it is clear that their inherent qualities may in fact be detrimental to the project. This work highlights that not every cell takes up the transcription factors provided, resulting in an assorted cell population despite high MOIs. Now that three clear factors have been determined, the usage of lentivirus may be more applicable. This change in method would allow for selection of cells that have integrated the genes into their genome, ensuring that 100% of cells would be positive for all 3 transcription factors.

To add to this, there is a need for greater visualisation of protein expression and localisation. Future work regarding PDX1, MNX1 and FOXA2 treatment should focus upon extensive protein studies of transdifferentiated cells. Ideally a reporter gene could be incorporated alongside all three genes so as to aid in visualisation and potential isolation of transdifferentiated cells. Furthermore, ChIP could be run to unequivocally determine target genes. Alongside this, pulldown assays would help elucidate whether the members of the trio combine to form a complex. MNX1 is of particular interest given the minimal data available regarding it. Future study should aim to characterise it, ideally during human development, and in particular to define any association with Notch and Wnt signalling, as this may provide valuable insight concerning the pancreatic progenitor population and its expansion.

Throughout this study only one cell type has been studied, the FLFs. Future work would aim to expand the number of cell types studied. This would be of benefit not only to validate that these factors retain their transdifferentiating function across cell types, but also to identify other cell types which may be even more amenable to transdifferentiation. Moreover, it would be of interest to see if these factors act in a different capacity depending on the cell type to which they are exposed, for example FOXA2 eliciting different expression in a liver context compared to that of a fibroblast.

All work described above has been carried out in vitro. Over the years many papers have detailed the capabilities of in vivo transplantation in aiding developmental progression and maturation. Given that the aim of this project was to generate pancreatic progenitor cells, it would be of great interest to investigate how they would develop following in vivo incubation. The hope would be to see each lineage of the pancreas form within transplanted tissue. This would provide confirmation that the cells produced via transdifferentiation are able to give rise to the pancreas. Moreover, these cells could be tested in STZ treated diabetic mice. This would be a novel approach to typical STZ rescue experiments, as the transplanted tissue would not simply be comprised of endocrine cells but rather an in vitro derived mix of all three lineages which define the pancreas.

Finally, only a limited selection of transcription factors was assessed in detail in this project. LgPCA has proven to be a potent tool for explicating critical factors required for development. Future work should build upon this data and aim to establish the regulatory capabilities of other transcription factors not tested here. This scope need not be limited to the pancreas. The strength of this data set is its ability to predict key factors for all organs. Therefore, I believe its application towards transdifferentiation of any and all organs should be investigated. Hopefully, in time, success can be achieved in producing novel cell types by transdifferentiation.

Bibliography

Aguayo-Mazzucato, C., Zavacki, A.M., Marinelarena, A., Hollister-Lock, J., El Khattabi, I., Marsili, A., Weir, G.C., Sharma, A., Larsen, P.R., and Bonner-Weir, S. (2013). Thyroid hormone promotes postnatal rat pancreatic β-cell development and glucose-responsive insulin secretion through MAFA. Diabetes.
Ahlgren, U., Jonsson, J., and Edlund, H. (1996). The morphogenesis of the pancreatic mesenchyme is uncoupled from that of the pancreatic epithelium in IPF1/PDX1-deficient mice. Development.
Ahlgren, U., Jonsson, J., Jonsson, L., Simu, K., and Edlund, H. (1998). β-cell-specific inactivation of the mouse Ipf1/Pdx1 gene results in loss of the β-cell phenotype and maturity onset diabetes. Genes Dev.
Ahlgren, U., Pfaff, S.L., Jessell, T.M., Edlund, T., and Edlund, H. (1997). Independent requirement for ISL1 in formation of pancreatic mesenchyme and islet cells. Nature.
Ahnfelt-Rønne, J., Jørgensen, M.C., Klinck, R., Jensen, J.N., Füchtbauer, E.M., Deering, T., MacDonald, R.J., Wright, C.V.E., Madsen, O.D., and Serup, P. (2012). Ptf1a-mediated control of Dll1 reveals an alternative to the lateral inhibition mechanism. Development.
Akinci, E., Banga, A., Greder, L. V., Dutton, J.R., and Slack, J.M.W. (2012). Reprogramming of pancreatic exocrine cells towards a beta (β) cell character using Pdx1, Ngn3 and MafA. Biochem. J.
Akinci, E., Banga, A., Tungatt, K., Segal, J., Eberhard, D., Dutton, J.R., and Slack, J.M.W. (2013). Reprogramming of various cell types to a beta-like state by Pdx1, Ngn3 and MafA. PLoS One.
Allen, H.L., Flanagan, S.E., Shaw-Smith, C., De Franco, E., Akerman, I., Caswell, R., Ferrer, J., Hattersley, A.T., and Ellard, S. (2012). GATA6 haploinsufficiency causes pancreatic agenesis in humans. Nat. Genet.
Ameri, J., Ståhlberg, A., Pedersen, J., Johansson, J.K., Johannesson, M.M., Artner, I., and Semb, H. (2010). FGF2 specifies hESC-derived definitive endoderm into foregut/midgut cell lineages in a concentration-dependent manner. Stem Cells.
Andersson, O., Adams, B.A., Yoo, D., Ellis, G.C., Gut, P., Anderson, R.M., German, M.S., and Stainier, D.Y.R. (2012). Adenosine signaling promotes regeneration of pancreatic β cells in vivo. Cell Metab.
Ang, S.L., Wierda, A., Wong, D., Stevens, K.A., Cascio, S., Rossant, J., and Zaret, K.S. (1993). The formation and maintenance of the definitive endoderm lineage in the mouse: Involvement of HNF3/forkhead proteins. Development.
Annicotte, J.-S., Fayard, E., Swift, G.H., Selander, L., Edlund, H., Tanaka, T., Kodama, T., Schoonjans, K., and Auwerx, J. (2003). Pancreatic-Duodenal Homeobox 1 Regulates Expression of Liver Receptor Homolog 1 during Pancreas Development. Mol. Cell. Biol.
Apelqvist, Å., Li, H., Sommer, L., Beatus, P., Anderson, D.J., Honjo, T., Hrabě De Angelis, M., Lendahl, U., and Edlund, H. (1999). Notch signalling controls pancreatic cell differentiation. Nature.
Arceci, R.J., King, A.A., Simon, M.C., Orkin, S.H., and Wilson, D.B. (1993). Mouse GATA-4: a retinoic acid-inducible GATA-binding transcription factor expressed in endodermally derived tissues and heart. Mol. Cell. Biol.

Arda, H.E., Li, L., Tsai, J., Torre, E.A., Rosli, Y., Peiris, H., Spitale, R.C., Dai, C., Gu, X., Qu, K., et al. (2016). Age-dependent pancreatic gene regulation reveals mechanisms governing human β cell function. Cell Metab.
Artus, J., Panthier, J.J., and Hadjantonakis, A.K. (2010). A role for PDGF signaling in expansion of the extra-embryonic endoderm lineage of the mouse blastocyst. Development.
Artus, J., Piliszek, A., and Hadjantonakis, A.K. (2011). The primitive endoderm lineage of the mouse blastocyst: Sequential transcription factor activation and regulation of differentiation by Sox17. Dev. Biol.
Assady, S., Maor, G., Amit, M., Itskovitz-Eldor, J., Skorecki, K.L., and Tzukerman, M. (2001). Insulin production by human embryonic stem cells. Diabetes 50, 1691–1697.
Aviv, V., Meivar-Levy, I., Rachmut, I.H., Rubinek, T., Mor, E., and Ferber, S. (2009). Exendin-4 promotes liver cell proliferation and enhances the PDX-1-induced liver to pancreas transdifferentiation process. J. Biol. Chem.
Azzarelli, R., Hurley, C., Sznurkowska, M.K., Rulands, S., Hardwick, L., Gamper, I., Ali, F., McCracken, L., Hindley, C., McDuff, F., et al. (2017). Multi-site Neurogenin3 Phosphorylation Controls Pancreatic Endocrine Differentiation. Dev. Cell.
Babos, K.N., Galloway, K.E., Kisler, K., Zitting, M., Li, Y., Shi, Y., Quintino, B., Chow, R.H., Zlokovic, B. V., and Ichida, J.K. (2019). Mitigating Antagonism between Transcription and Proliferation Allows Near-Deterministic Cellular Reprogramming. Cell Stem Cell.
Baeyens, L., Bonné, S., German, M.S., Ravassard, P., Heimberg, H., and Bouwens, L. (2006). Ngn3 expression during postnatal in vitro beta cell neogenesis induced by the JAK/STAT pathway. Cell Death Differ.
Banga, A., Akinci, E., Greder, L. V., Dutton, J.R., and Slack, J.M.W. (2012). In vivo reprogramming of Sox9+ cells in the liver to insulin-secreting ducts. Proc. Natl. Acad. Sci. U. S. A.
Barker, N., Huch, M., Kujala, P., van de Wetering, M., Snippert, H.J., van Es, J.H., Sato, T., Stange, D.E., Begthel, H., van den Born, M., et al. (2010). Lgr5+ve Stem Cells Drive Self-Renewal in the Stomach and Build Long-Lived Gastric Units In Vitro. Cell Stem Cell.
Barker, N., Van Es, J.H., Kuipers, J., Kujala, P., Van Den Born, M., Cozijnsen, M., Haegebarth, A., Korving, J., Begthel, H., Peters, P.J., et al. (2007). Identification of stem cells in small intestine and colon by marker gene Lgr5. Nature.
Baumgartner, B.K., Cash, G., Hansen, H., Ostler, S., and Murtaugh, L.C. (2014). Distinct requirements for beta-catenin in pancreatic epithelial growth and patterning. Dev. Biol.
Beck, F., Erler, T., Russell, A., and James, R. (1995). Expression of Cdx-2 in the mouse embryo and placenta: Possible role in patterning of the extra-embryonic membranes. Dev. Dyn.
Beck, F., Chawengsaksophak, K., Luckett, J., Giblett, S., Tucci, J., Brown, J., Poulsom, R., Jeffery, R., and Wright, N.A. (2003). A study of regional gut endoderm potency by analysis of Cdx2 null mutant chimaeric mice. Dev. Biol.


Benitez, C.M., Qu, K., Sugiyama, T., Pauerstein, P.T., Liu, Y., Tsai, J., Gu, X., Ghodasara, A., Arda, H.E., Zhang, J., et al. (2014). An Integrated Cell Purification and Genomics Strategy Reveals Multiple Regulators of Pancreas Development. PLoS Genet. Benner, C., van der Meulen, T., Cacéres, E., Tigyi, K., Donaldson, C.J., and Huising, M.O. (2014). The transcriptional landscape of mouse beta cells compared to human beta cells reveals notable species differences in long non-coding RNA and protein-coding gene expression. BMC Genomics. Ben-Shushan, E., Marshak, S., Shoshkes, M., Cerasi, E., and Melloul, D. (2001). A Pancreatic β-Cell-specific Enhancer in the Human PDX-1 Gene Is Regulated by Hepatocyte Nuclear Factor 3β (HNF-3β), HNF-1α, and SPs Transcription Factors. J. Biol. Chem. Beres, T.M., Masui, T., Swift, G.H., Shi, L., Henke, R.M., and MacDonald, R.J. (2006). PTF1 Is an Organ-Specific and Notch-Independent Basic Helix-Loop-Helix Complex Containing the Mammalian Suppressor of Hairless (RBP-J) or Its Paralogue, RBP-L. Mol. Cell. Biol. Berns, K.I., and Muzyczka, N. (2017). AAV: An Overview of Unanswered Questions. Hum. Gene Ther. Bhushan, A., Itoh, N., Kato, S., Thiery, J.P., Czernichow, P., Bellusci, S., and Scharfmann, R. (2001). Fgf10 is essential for maintaining the proliferative capacity of epithelial progenitor cells during early pancreatic organogenesis. Development. Boj, S.F., Párrizas, M., Maestro, M.A., and Ferrer, J. (2001). A transcription factor regulatory circuit in differentiated pancreatic cells. Proc. Natl. Acad. Sci. U. S. A. Boj, S.F., Petrov, D., and Ferrer, J. (2010). Epistasis of transcriptomes reveals synergism between transcriptional activators Hnf1α and Hnf4α. PLoS Genet. Bonal, C., Thorel, F., Ait-Lounis, A., Reith, W., Trumpp, A., and Herrera, P.L. (2009). Pancreatic Inactivation of c-Myc Decreases Acinar Mass and Transdifferentiates Acinar Cells Into Adipocytes in Mice. Gastroenterology.
Bonfanti, P., Nobecourt, E., Oshima, M., Albagli-Curiel, O., Laurysens, V., Stangé, G., Sojoodi, M., Heremans, Y., Heimberg, H., and Scharfmann, R. (2015). Ex Vivo Expansion and Differentiation of Human and Mouse Fetal Pancreatic Progenitors Are Modulated by Epidermal Growth Factor. Stem Cells Dev. Bonnefond, A., Vaillant, E., Philippe, J., Skrobek, B., Lobbens, S., Yengo, L., Huyvaert, M., Cavé, H., Busiah, K., Scharfmann, R., et al. (2013). Transcription factor gene MNX1 is a novel cause of permanent neonatal diabetes in a consanguineous family. Diabetes Metab. Bossard, P., and Zaret, K.S. (1998). GATA transcription factors as potentiators of gut endoderm differentiation. Development. Brabletz, T., Kalluri, R., Nieto, M.A., and Weinberg, R.A. (2018). EMT in cancer. Nat. Rev. Cancer. Brennan, J., Lu, C.C., Norris, D.P., Rodriguez, T.A., Beddington, R.S., and Robertson, E.J. (2001). Nodal signalling in the epiblast patterns the early mouse embryo. Nature 411, 965–969. Burlison, J.S., Long, Q., Fujitani, Y., Wright, C.V.E., and Magnuson, M.A. (2008). Pdx-1 and Ptf1a concurrently determine fate specification of pancreatic multipotent progenitor cells. Dev. Biol. Burtscher, I., and Lickert, H. (2009). Foxa2 regulates polarity and epithelialization in the endoderm germ layer of the mouse embryo. Development. Butler, A.E., Janson, J., Bonner-Weir, S., Ritzel, R., Rizza, R.A., and Butler, P.C. (2003). β-cell deficit and increased β-cell apoptosis in humans with type 2 diabetes. Diabetes.
Cai, K.Q., Capo-Chichi, C.D., Rula, M.E., Yang, D.H., and Xu, X.X.M. (2008). Dynamic GATA6 expression in primitive endoderm formation and maturation in early mouse embryogenesis. Dev. Dyn. Cano, D.A., Soria, B., Martin, F., and Rojas, A. (2014). Transcriptional control of mammalian pancreas organogenesis. Cell. Mol. Life Sci. 71, 2383–2402. Carrasco, M., Delgado, I., Soria, B., Martin, F., and Rojas, A. (2012). GATA4 and GATA6 control mouse pancreas organogenesis. J. Clin. Invest. 122, 3504–3515. Cash, J.G., Kuhel, D.G., Goodin, C., and Hui, D.Y. (2011). Pancreatic acinar cell-specific overexpression of group 1B phospholipase A2 exacerbates diet-induced obesity and insulin resistance in mice. Int. J. Obes. Cebola, I., Rodríguez-Seguí, S.A., Cho, C.H.H., Bessa, J., Rovira, M., Luengo, M., Chhatriwala, M., Berry, A., Ponsa-Cobas, J., Maestro, M.A., et al. (2015). TEAD and YAP regulate the enhancer network of human embryonic pancreatic progenitors. Nat. Cell Biol. Cerdá-Esteban, N., Naumann, H., Ruzittu, S., Mah, N., Pongrac, I.M., Cozzitorto, C., Hommel, A., Andrade-Navarro, M.A., Bonifacio, E., and Spagnoli, F.M. (2017). Stepwise reprogramming of liver cells to a pancreas progenitor state by the transcriptional regulator Tgif2. Nat. Commun. Chia, C.Y., Madrigal, P., Denil, S.L.I.J., Martinez, I., Garcia-Bernardo, J., El-Khairi, R., Chhatriwala, M., Shepherd, M.H., Hattersley, A.T., Dunn, N.R., et al. (2019). GATA6 Cooperates with EOMES/SMAD2/3 to Deploy the Gene Regulatory Network Governing Human Definitive Endoderm and Pancreas Formation. Stem Cell Reports. Cirillo, L.A., Lin, F.R., Cuesta, I., Friedman, D., Jarnik, M., and Zaret, K.S. (2002). Opening of compacted chromatin by early developmental transcription factors HNF3 (FoxA) and GATA-4. Mol. Cell.
Clayton, H.W., Osipovich, A.B., Stancill, J.S., Schneider, J.D., Vianna, P.G., Shanks, C.M., Yuan, W., Gu, G., Manduchi, E., Stoeckert, C.J., et al. (2016). Pancreatic Inflammation Redirects Acinar to β Cell Reprogramming. Cell Rep. Cockell, M., Stevenson, B.J., Strubin, M., Hagenbüchle, O., and Wellauer, P.K. (1989). Identification of a cell-specific DNA-binding activity that interacts with a transcriptional activator of genes expressed in the acinar pancreas. Mol. Cell. Biol. Cockell, M., Stolarczyk, D., Frutiger, S., Hughes, G.J., Hagenbüchle, O., and Wellauer, P.K. (1995). Binding sites for hepatocyte nuclear factor 3 beta or 3 gamma and pancreas transcription factor 1 are required for efficient expression of the gene encoding pancreatic alpha-amylase. Mol. Cell. Biol. Colella, P., Ronzitti, G., and Mingozzi, F. (2018). Emerging Issues in AAV-Mediated In Vivo Gene Therapy. Mol. Ther. - Methods Clin. Dev. Dahl, U., Sjödin, A., and Semb, H. (1996). Cadherins regulate aggregation of pancreatic β-cells in vivo. Development. D’Amato, E., Giacopelli, F., Giannattasio, A., D’Annunzio, G., Bocciardi, R., Musso, M., Lorini, R., and Ravazzolo, R. (2010). Genetic investigation in an Italian child with an unusual association of atrial septal defect, attributable to a new familial GATA4 gene mutation, and neonatal diabetes due to pancreatic agenesis. Diabet. Med.


D’Amour, K.A., Bang, A.G., Eliazer, S., Kelly, O.G., Agulnick, A.D., Smart, N.G., Moorman, M.A., Kroon, E., Carpenter, M.K., and Baetge, E.E. (2006). Production of pancreatic hormone-expressing endocrine cells from human embryonic stem cells. Nat. Biotechnol. D’Amour, K.A., Agulnick, A.D., Eliazer, S., Kelly, O.G., Kroon, E., and Baetge, E.E. (2005). Efficient differentiation of human embryonic stem cells to definitive endoderm. Nat. Biotechnol. 23, 1534–1541. Davis, R.L., Weintraub, H., and Lassar, A.B. (1987). Expression of a single transfected cDNA converts fibroblasts to myoblasts. Cell. Dawkins, J.B.N., Wang, J., Maniati, E., Heward, J.A., Koniali, L., Kocher, H.M., Martin, S.A., Chelala, C., Balkwill, F.R., Fitzgibbon, J., et al. (2016). Reduced expression of histone methyltransferases KMT2C and KMT2D correlates with improved outcome in pancreatic ductal adenocarcinoma. Cancer Res. De Franco, E., Shaw-Smith, C., Flanagan, S.E., Edghill, E.L., Wolf, J., Otte, V., Ebinger, F., Varthakavi, P., Vasanthi, T., Edvardsson, S., et al. (2013). Biallelic PDX1 (insulin promoter factor 1) mutations causing neonatal diabetes without exocrine pancreatic insufficiency. Diabet. Med. De Franco, E., Shaw-Smith, C., Flanagan, S.E., Shepherd, M.H., Hattersley, A.T., and Ellard, S. (2013). GATA6 mutations cause a broad phenotypic spectrum of diabetes from pancreatic agenesis to adult-onset diabetes without exocrine insufficiency. Diabetes. De Franco, E., Watson, R.A., Weninger, W.J., Wong, C.C., Flanagan, S.E., Caswell, R., Green, A., Tudor, C., Lelliott, C.J., Geyer, S.H., et al. (2019). A Specific CNOT1 Mutation Results in a Novel Syndrome of Pancreatic Agenesis and Holoprosencephaly through Impaired Pancreatic and Neurological Development. Am. J. Hum. Genet. De Lichtenberg, K.H., Funa, N.S., Nakic, N., Ferrer, J., Zhu, Z., Huangfu, D., and Serup, P. (2018). Genome-wide identification of HES1 target genes uncover novel roles for HES1 in pancreatic development. BioRxiv.
De Vas, M.G., Kopp, J.L., Heliot, C., Sander, M., Cereghini, S., and Haumaitre, C. (2015). Hnf1b controls pancreas morphogenesis and the generation of Ngn3+ endocrine progenitors. Dev. Decker, K., Goldman, D.C., Grasch, C.L., and Sussel, L. (2006). Gata6 is an important regulator of mouse pancreas development. Dev. Biol. Dessimoz, J., Opoka, R., Kordich, J.J., Grapin-Botton, A., and Wells, J.M. (2006). FGF signaling is necessary for establishing gut tube domains along the anterior-posterior axis in vivo. Mech. Dev. Direnzo, D., Hess, D.A., Damsz, B., Hallett, J.E., Marshall, B., Goswami, C., Liu, Y., Deering, T., MacDonald, R.J., and Konieczny, S.F. (2012). Induced Mist1 expression promotes remodeling of mouse pancreatic acinar cells. Gastroenterology. Donelan, W., Koya, V., Li, S.W., and Yang, L.J. (2010). Distinct regulation of hepatic nuclear factor 1α by NKX6.1 in pancreatic beta cells. J. Biol. Chem. Ejarque, M., Altirriba, J., Gomis, R., and Gasa, R. (2013). Characterization of the transcriptional activity of the basic helix-loop-helix (bHLH) transcription factor Atoh8. Biochim. Biophys. Acta - Gene Regul. Mech.


El-Khairi, R., and Vallier, L. (2016). The role of hepatocyte nuclear factor 1β in disease and development. Diabetes, Obes. Metab. Engert, S., Burtscher, I., Liao, W.P., Dulev, S., Schotta, G., and Lickert, H. (2013). Wnt/β-catenin signalling regulates Sox17 expression and is essential for organizer and endoderm formation in the mouse. Dev. Esni, F., Ghosh, B., Biankin, A. V., Lin, J.W., Albert, M.A., Yu, X., MacDonald, R.J., Civin, C.I., Real, F.X., Pack, M.A., et al. (2004a). Notch inhibits Ptf1 function and acinar cell differentiation in developing mouse and zebrafish pancreas. Development. Esni, F., Stoffers, D.A., Takeuchi, T., and Leach, S.D. (2004b). Origin of exocrine pancreatic cells from nestin-positive precursors in developing mouse pancreas. Mech. Dev. Ferber, S., Halkin, A., Cohen, H., Ber, I., Einav, Y., Goldberg, I., Barshack, I., Seijffers, R., Kopolovic, J., Kaiser, N., et al. (2000). Pancreatic and duodenal homeobox gene 1 induces expression of insulin genes in liver and ameliorates streptozotocin-induced hyperglycemia. Nat. Med. Flanagan, S.E., De Franco, E., Lango Allen, H., Zerah, M., Abdul-Rasoul, M.M., Edge, J.A., Stewart, H., Alamiri, E., Hussain, K., Wallis, S., et al. (2014). Analysis of transcription factors key for mouse pancreatic development establishes NKX2-2 and MNX1 mutations as causes of neonatal diabetes in man. Cell Metab. Frank, D.U., Elliott, S.A., Park, E.J., Hammond, J., Saijoh, Y., and Moon, A.M. (2007). System for inducible expression of Cre-recombinase from the Foxa2 locus in endoderm, notochord, and floor plate. Dev. Dyn. Frayling, T.M., Evans, J.C., Bulman, M.P., Pearson, E., Allen, L., Owen, K., Bingham, C., Hannemann, M., Shepherd, M., Ellard, S., et al. (2001). β-cell genes and diabetes: Molecular and clinical characterization of mutations in transcription factors. In Diabetes, p. Fujikura, J., Hosoda, K., Kawaguchi, Y., Noguchi, M., Iwakura, H., Odori, S., Mori, E., Tomita, T., Hirata, M., Ebihara, K., et al. (2007).
Rbp-j regulates expansion of pancreatic epithelial cells and their differentiation into exocrine cells during mouse development. Dev. Dyn. Fukuda, A., Kawaguchi, Y., Furuyama, K., Kodama, S., Horiguchi, M., Kuhara, T., Kawaguchi, M., Terao, M., Doi, R., Wright, C.V.E., et al. (2008). Reduction of Ptf1a gene dosage causes pancreatic hypoplasia and diabetes in mice. Diabetes. Furuyama, K., Chera, S., van Gurp, L., Oropeza, D., Ghila, L., Damond, N., Vethe, H., Paulo, J.A., Joosten, A.M., Berney, T., et al. (2019). Diabetes relief in mice by glucose-sensing insulin-secreting human α-cells. Nature. Gannon, M., Gamer, L.W., and Wright, C.V.E. (2001). Regulatory regions driving developmental and tissue-specific expression of the essential pancreatic gene pdx1. Dev. Biol. Gao, N., LeLay, J., Vatamaniuk, M.Z., Rieck, S., Friedman, J.R., and Kaestner, K.H. (2008). Dynamic regulation of Pdx1 enhancers by Foxa1 and Foxa2 is essential for pancreas development. Genes Dev.


Gasa, R., Mrejen, C., Leachman, N., Otten, M., Barnes, M., Wang, J., Chakrabarti, S., Mirmira, R., and German, M. (2004). Proendocrine genes coordinate the pancreatic differentiation program in vitro. Proc. Natl. Acad. Sci. U. S. A. Georgia, S., Soliz, R., Li, M., Zhang, P., and Bhushan, A. (2006). p57 and Hes1 coordinate cell cycle exit with self-renewal of pancreatic progenitors. Dev. Biol. Gerrard, D.T., Berry, A.A., Jennings, R.E., Hanley, K.P., Bobola, N., and Hanley, N.A. (2016). An integrative transcriptomic atlas of organogenesis in human embryos. Elife. Gerrish, K., Gannon, M., Shih, D., Henderson, E., Stoffel, M., Wright, C.V.E., and Stein, R. (2000). Pancreatic β cell-specific transcription of the pdx-1 gene. The role of conserved upstream control regions and their hepatic nuclear factor 3β sites. J. Biol. Chem. Gerrish, K., Van Velkinburgh, J.C., and Stein, R. (2004). Conserved transcriptional regulatory domains of the pdx-1 gene. Mol. Endocrinol. Gohda, Y., Oka, S., Matsunaga, T., Watanabe, S., Yoshiura, K.I., Kondoh, T., and Matsumoto, T. (2015). Neonatal case of novel KMT2D mutation in Kabuki syndrome with severe hypoglycemia. Pediatr. Int. Gouzi, M., Kim, Y.H., Katsumoto, K., Johansson, K., and Grapin-Botton, A. (2011). Neurogenin3 initiates stepwise delamination of differentiating endocrine cells during pancreas development. Dev. Dyn. Gradwohl, G., Dierich, A., LeMeur, M., and Guillemot, F. (2000). neurogenin3 is required for the development of the four endocrine cell lineages of the pancreas. Proc. Natl. Acad. Sci. U. S. A. Gragnoli, C., Lindner, T., Cockburn, B.N., Kaisaki, P.J., Gragnoli, F., Marozzi, G., and Bell, G.I. (1997). Maturity-onset diabetes of the young due to a mutation in the hepatocyte nuclear factor-4α binding site in the promoter of the hepatocyte nuclear factor-1α gene. Diabetes. Gu, G., Dubauskaite, J., and Melton, D.A. (2002). Direct evidence for the pancreatic lineage: NGN3+ cells are islet progenitors and are distinct from duct progenitors.
Development. Hald, J., Sprinkel, A.E., Ray, M., Serup, P., Wright, C., and Madsen, O.D. (2008). Generation and characterization of Ptf1a antiserum and localization of Ptf1a in relation to Nkx6.1 and Pdx1 during the earliest stages of mouse pancreas development. J. Histochem. Cytochem. Hale, M.A., Kagami, H., Shi, L., Holland, A.M., Elsässer, H.P., Hammer, R.E., and MacDonald, R.J. (2005). The homeodomain protein PDX1 is required at mid-pancreatic development for the formation of the exocrine pancreas. Dev. Biol. Hale, M.A., Swift, G.H., Hoang, C.Q., Deering, T.G., Masui, T., Lee, Y.K., Xue, J., and MacDonald, R.J. (2014). The nuclear hormone receptor family member NR5A2 controls aspects of multipotent progenitor cell formation and acinar differentiation during pancreatic organogenesis. Dev. Hang, Y., Yamamoto, T., Benninger, R.K.P., Brissova, M., Guo, M., Bush, W., Piston, D.W., Powers, A.C., Magnuson, M., Thurmond, D.C., et al. (2014). The MafA transcription factor becomes essential to islet β-cells soon after birth. Diabetes.


Hani, E.H., Stoffers, D.A., Chèvre, J.C., Durand, E., Stanojevic, V., Dina, C., Habener, J.F., and Froguel, P. (1999). Defective mutations in the insulin promoter factor-1 (IPF-1) gene in late-onset type 2 diabetes mellitus. J. Clin. Invest. Harmon, J.S., Stein, R., and Robertson, R.P. (2005). Oxidative stress-mediated, post-translational loss of MafA protein as a contributing mechanism to loss of insulin gene expression in glucotoxic beta cells. J. Biol. Chem. Harrison, K.A., Thaler, J., Pfaff, S.L., Gu, H., and Kehrl, J.H. (1999). Pancreas dorsal lobe agenesis and abnormal islets of Langerhans in Hlxb9-deficient mice. Nat. Genet. Hart, A., Papadopoulou, S., and Edlund, H. (2003). Fgf10 maintains notch activation, stimulates proliferation, and blocks differentiation of pancreatic epithelial cells. Dev. Dyn. Haumaitre, C., Barbacci, E., Jenny, M., Ott, M.O., Gradwohl, G., and Cereghini, S. (2005). Lack of TCF2/vHNF1 in mice leads to pancreas agenesis. Proc. Natl. Acad. Sci. U. S. A. Haumaitre, C., Lenoir, O., and Scharfmann, R. (2008). Histone Deacetylase Inhibitors Modify Pancreatic Cell Fate Determination and Amplify Endocrine Progenitors. Mol. Cell. Biol. Haumaitre, C., Fabre, M., Cormier, S., Baumann, C., Delezoide, A.L., and Cereghini, S. (2006). Severe pancreas hypoplasia and multicystic renal dysplasia in two human fetuses carrying novel HNF1β/MODY5 mutations. Hum. Mol. Genet. Hebrok, M., Kim, S.K., St-Jacques, B., McMahon, A.P., and Melton, D.A. (2000). Regulation of pancreas development by hedgehog signaling. Development. Hebrok, M., Kim, S.K., and Melton, D.A. (1998). Notochord repression of endodermal sonic hedgehog permits pancreas development. Genes Dev. Hess, D.A., Humphrey, S.E., Ishibashi, J., Damsz, B., Lee, A., Glimcher, L.H., and Konieczny, S.F. (2011).
Extensive pancreas regeneration following acinar-specific disruption of Xbp1 in mice. Gastroenterology. Holmstrom, S.R., Deering, T., Swift, G.H., Poelwijk, F.J., Mangelsdorf, D.J., Kliewer, S.A., and Macdonald, R.J. (2011). LRH-1 and PTF1-L coregulate an exocrine pancreas-specific transcriptional network for digestive function. Genes Dev. Hong, J.J., Seksenyan, A., Yuan, X., Knudsen, B., Audeh, W., and Kaye, J. (2015). TOX3 as a novel biomarker in luminal B breast cancer. J. Clin. Oncol. Huang, P., Zhang, L., Gao, Y., He, Z., Yao, D., Wu, Z., Cen, J., Chen, X., Liu, C., Hu, Y., et al. (2014). Direct reprogramming of human fibroblasts to functional and expandable hepatocytes. Cell Stem Cell. Huch, M., Dorrell, C., Boj, S.F., Van Es, J.H., Li, V.S.W., Van De Wetering, M., Sato, T., Hamer, K., Sasaki, N., Finegold, M.J., et al. (2013). In vitro expansion of single Lgr5+ liver stem cells induced by Wnt-driven regeneration. Nature. Iguchi, H., Urashima, Y., Inagaki, Y., Ikeda, Y., Okamura, M., Tanaka, T., Uchida, A., Yamamoto, T.T., Kodama, T., and Sakai, J. (2007). SOX6 suppresses cyclin D1 promoter activity by interacting with β-catenin and histone deacetylase 1, and its down-regulation induces pancreatic β-cell proliferation. J. Biol. Chem.
Inagawa, K., Miyamoto, K., Yamakawa, H., Muraoka, N., Sadahiro, T., Umei, T., Wada, R., Katsumata, Y., Kaneda, R., Nakade, K., et al. (2012). Induction of cardiomyocyte-like cells in infarct hearts by gene transfer of Gata4, Mef2c, and Tbx5. Circ. Res. Ivaska, J. (2011). Vimentin: Central hub in EMT induction? Small GTPases. Iwakoshi, N.N., Lee, A.H., Vallabhajosyula, P., Otipoby, K.L., Rajewsky, K., and Glimcher, L.H. (2003). Plasma cell differentiation and the unfolded protein response intersect at the transcription factor XBP-1. Nat. Immunol. Iype, T., Taylor, D.G., Ziesmann, S.M., Garmey, J.C., Watada, H., and Mirmira, R.G. (2004). The transcriptional repressor Nkx6.1 also functions as a deoxyribonucleic acid context-dependent transcriptional activator during pancreatic β-cell differentiation: Evidence for feedback activation of the nkx6.1 gene by Nkx6.1. Mol. Endocrinol. Jacquemin, P., Durviaux, S.M., Jensen, J., Godfraind, C., Gradwohl, G., Guillemot, F., Madsen, O.D., Carmeliet, P., Dewerchin, M., Collen, D., et al. (2000). Transcription Factor Hepatocyte Nuclear Factor 6 Regulates Pancreatic Endocrine Cell Differentiation and Controls Expression of the Proendocrine Gene ngn3. Mol. Cell. Biol. Jacquemin, P., Lemaigre, F.P., and Rousseau, G.G. (2003). The Onecut transcription factor HNF-6 (OC-1) is required for timely specification of the pancreas and acts upstream of Pdx-1 in the specification cascade. Dev. Biol. Jennings, R.E., Berry, A.A., Gerrard, D.T., Wearne, S.J., Strutt, J., Withey, S., Chhatriwala, M., Piper Hanley, K., Vallier, L., Bobola, N., et al. (2017). Laser Capture and Deep Sequencing Reveals the Transcriptomic Programmes Regulating the Onset of Pancreas and Liver Differentiation in Human Embryos. Stem Cell Reports.
Jennings, R.E., Berry, A.A., Kirkwood-Wilson, R., Roberts, N.A., Hearn, T., Salisbury, R.J., Blaylock, J., Hanley, K.P., and Hanley, N.A. (2013). Development of the human pancreas from foregut to endocrine commitment. Diabetes 62, 3514–3522. Jensen, J., Pedersen, E.E., Galante, P., Hald, J., Heller, R.S., Ishibashi, M., Kageyama, R., Guillemot, F., Serup, P., and Madsen, O.D. (2000). Control of endodermal endocrine development by Hes-1. Nat. Genet. Jeon, J., Correa-Medina, M., Ricordi, C., Edlund, H., and Diez, J.A. (2009). Endocrine cell clustering during human pancreas development. J. Histochem. Cytochem. Johannesson, M., Ståhlberg, A., Ameri, J., Sand, F.W., Norrman, K., and Semb, H. (2009). FGF4 and retinoic acid direct differentiation of hESCs into PDX1-expressing foregut endoderm in a time- and concentration-dependent manner. PLoS One. Johansson, K.A., Dursun, U., Jordan, N., Gu, G., Beermann, F., Gradwohl, G., and Grapin-Botton, A. (2007). Temporal Control of Neurogenin3 Activity in Pancreas Progenitors Reveals Competence Windows for the Generation of Different Endocrine Cell Types. Dev. Cell. Kageyama, R., Ohtsuka, T., and Kobayashi, T. (2007). The Hes gene family: Repressors and oscillators that orchestrate embryogenesis. Development.


Kanai-Azuma, M., Kanai, Y., Gad, J.M., Tajima, Y., Taya, C., Kurohmaru, M., Sanai, Y., Yonekawa, H., Yazaki, K., Tam, P.P.L., et al. (2002). Depletion of definitive gut endoderm in Sox17-null mutant mice. Development. Katz, L.S., Geras-Raaka, E., and Gershengorn, M.C. (2013). Reprogramming adult human dermal fibroblasts to islet-like cells by epigenetic modification coupled to transcription factor modulation. Stem Cells Dev. Kawaguchi, Y., Cooper, B., Gannon, M., Ray, M., MacDonald, R.J., and Wright, C.V.E. (2002). The role of the transcriptional regulator Ptf1a in converting intestinal to pancreatic progenitors. Nat. Genet. Keng, V.W., Yagi, H., Ikawa, M., Nagano, T., Myint, Z., Yamada, K., Tanaka, T., Sato, A., Muramatsu, I., Okabe, M., et al. (2000). Homeobox gene Hex is essential for onset of mouse embryonic liver development and differentiation of the monocyte lineage. Biochem. Biophys. Res. Commun. Ketola, I., Otonkoski, T., Pulkkinen, M.A., Niemi, H., Palgi, J., Jacobsen, C.M., Wilson, D.B., and Heikinheimo, M. (2004). Transcription factor GATA-6 is expressed in the endocrine and GATA-4 in the exocrine pancreas. Mol. Cell. Endocrinol. Khan, I.F., Hirata, R.K., Wang, P.R., Li, Y., Kho, J., Nelson, A., Huo, Y., Zavaljevski, M., Ware, C., and Russell, D.W. (2010). Engineering of human pluripotent stem cells by AAV-mediated gene targeting. Mol. Ther. Kim, S.K., Hebrok, M., Li, E., Oh, S.P., Schrewe, H., Harmon, E.B., Lee, J.S., and Melton, D.A. (2000). Activin receptor patterning of foregut organogenesis. Genes Dev. Kopp, J.L., Dubois, C.L., Schaffer, A.E., Hao, E., Shih, H.P., Seymour, P.A., Ma, J., and Sander, M. (2011). Sox9+ ductal cells are multipotent progenitors throughout development but do not produce new endocrine cells in the normal or injured adult pancreas. Development. Krapp, A., Knöfler, M., Frutiger, S., Hughes, G.J., Hagenbüchle, O., and Wellauer, P.K. (1996).
The p48 DNA-binding subunit of transcription factor PTF1 is a new exocrine pancreas-specific basic helix-loop-helix protein. EMBO J. Krapp, A., Knöfler, M., Ledermann, B., Bürki, K., Berney, C., Zoerkler, N., Hagenbüchle, O., and Wellauer, P.K. (1998). The bHLH protein PTF1-p48 is essential for the formation of the exocrine and the correct spatial organization of the endocrine pancreas. Genes Dev. Kroon, E., Martinson, L.A., Kadoya, K., Bang, A.G., Kelly, O.G., Eliazer, S., Young, H., Richardson, M., Smart, N.G., Cunningham, J., et al. (2008). Pancreatic endoderm derived from human embryonic stem cells generates glucose-responsive insulin-secreting cells in vivo. Nat. Biotechnol. Kropp, P.A., Zhu, X., and Gannon, M. (2019). Regulation of the Pancreatic Exocrine Differentiation Program and Morphogenesis by Onecut 1/Hnf6. Cell. Mol. Gastroenterol. Hepatol. Kubo, A., Shinozaki, K., Shannon, J.M., Kouskoff, V., Kennedy, M., Woo, S., Fehling, H.J., and Keller, G. (2004). Development of definitive endoderm from embryonic stem cells in culture. Development. Kyithar, M.P., Bacon, S., Pannu, K.K., Rizvi, S.R., Colclough, K., Ellard, S., and Byrne, M.M. (2011). Identification of HNF1A-MODY and HNF4A-MODY in Irish families: Phenotypic characteristics and therapeutic implications. Diabetes Metab.


Landry, C., Clotman, F., Hioki, T., Oda, H., Picard, J.J., Lemaigre, F.P., and Rousseau, G.G. (1997). HNF-6 is expressed in endoderm derivatives and nervous system of the mouse embryo and participates to the cross-regulatory network of liver-enriched transcription factors. Dev. Biol. Lango Allen, H., Flanagan, S.E., Shaw-Smith, C., De Franco, E., Akerman, I., Caswell, R., Ferrer, J., Hattersley, A.T., and Ellard, S. (2012). GATA6 haploinsufficiency causes pancreatic agenesis in humans. Nat. Genet. 44, 20–22. Lawson, K.A., Pedersen, R.A., and Van De Geer, S. (1987). Cell fate, morphogenetic movement and population kinetics of embryonic endoderm at the time of germ layer formation in the mouse. Development. Lee, C.S., Sund, N.J., Vatamaniuk, M.Z., Matschinsky, F.M., Stoffers, D.A., and Kaestner, K.H. (2002). Foxa2 controls Pdx1 gene expression in pancreatic β-cells in vivo. Diabetes. Lee, C.S., Bishop, E.S., Zhang, R., Yu, X., Farina, E.M., Yan, S., Zhao, C., Zeng, Z., Shu, Y., Wu, X., et al. (2017). Adenovirus-mediated gene delivery: Potential applications for gene and cell-based therapies in the new era of personalized medicine. Genes Dis. Lee, J.C., Smith, S.B., Watada, H., Lin, J., Scheel, D., Wang, J., Mirmira, R.G., and German, M.S. (2001). Regulation of the pancreatic pro-endocrine gene neurogenin3. Diabetes. Leimeister, C., Schumacher, N., Steidl, C., and Gessler, M. (2000). Analysis of HeyL expression in wild-type and Notch pathway mutant mouse embryos. Mech. Dev. Li, H., Arber, S., Jessell, T.M., and Edlund, H. (1999). Selective agenesis of the dorsal pancreas in mice lacking homeobox gene Hlxb9. Nat. Genet. Li, J., Dantas Machado, A.C., Guo, M., Sagendorf, J.M., Zhou, Z., Jiang, L., Chen, X., Wu, D., Qu, L., Chen, Z., et al. (2017). Structure of the Forkhead Domain of FOXA2 Bound to a Complete DNA Consensus Site. Biochemistry. Livak, K.J., and Schmittgen, T.D. (2001).
Analysis of relative gene expression data using real-time quantitative PCR and the 2^(-ΔΔCT) method. Methods. Loganathan, G., Dawra, R.K., Pugazhenthi, S., Wiseman, A.C., Sanders, M.A., Saluja, A.K., Sutherland, D.E.R., Hering, B.J., and Balamurugan, A.N. (2010). Culture of impure human islet fractions in the presence of alpha-1 antitrypsin prevents insulin cleavage and improves islet recovery. In Transplantation Proceedings, p. Loganathan, G., Dawra, R.K., Pugazhenthi, S., Guo, Z., Soltani, S.M., Wiseman, A., Sanders, M.A., Papas, K.K., Velayutham, K., Saluja, A.K., et al. (2011). Insulin degradation by acinar cell proteases creates a dysfunctional environment for human islets before/after transplantation: Benefits of α-1 antitrypsin treatment. Transplantation. Loomans, C.J.M., Williams Giuliani, N., Balak, J., Ringnalda, F., van Gurp, L., Huch, M., Boj, S.F., Sato, T., Kester, L., de Sousa Lopes, S.M.C., et al. (2018). Expansion of Adult Human Pancreatic Tissue Yields Organoids Harboring Progenitor Cells with Endocrine Differentiation Potential. Stem Cell Reports. Lynn, F.C., Smith, S.B., Wilson, M.E., Yang, K.Y., Nekrep, N., and German, M.S. (2007). Sox9 coordinates a transcriptional network in pancreatic progenitor cells. Proc. Natl. Acad. Sci. U. S. A.


Lyttle, B.M., Li, J., Krishnamurthy, M., Fellows, F., Wheeler, M.B., Goodyer, C.G., and Wang, R. (2008). Transcription factor expression in the developing human fetal endocrine pancreas. Diabetologia. Maestro, M.A., Boj, S.F., Luco, R.F., Pierreux, C.E., Cabedo, J., Servitja, J.M., German, M.S., Rousseau, G.G., Lemaigre, F.P., and Ferrer, J. (2003). Hnf6 and Tcf2 (MODY5) are linked in a gene network operating in a precursor cell domain of the embryonic pancreas. Hum. Mol. Genet. Marshak, S., Benshushan, E., Shoshkes, M., Havin, L., Cerasi, E., and Melloul, D. (2000). Functional Conservation of Regulatory Elements in the pdx-1 Gene: PDX-1 and Hepatocyte Nuclear Factor 3β Transcription Factors Mediate β-Cell-Specific Expression. Mol. Cell. Biol. Martín, M., Gallego-Llamas, J., Ribes, V., Kedinger, M., Niederreither, K., Chambon, P., Dollé, P., and Gradwohl, G. (2005). Dorsal pancreas agenesis in retinoic acid-deficient Raldh2 mutant mice. Dev. Biol. Martinelli, P., Cañamero, M., Del Pozo, N., Madriles, F., Zapata, A., and Real, F.X. (2013). Gata6 is required for complete acinar differentiation and maintenance of the exocrine pancreas in adult mice. Gut. Martinez-Barbera, J.P., Rodriguez, T.A., and Beddington, R.S.P. (2000). The homeobox gene Hesx1 is required in the anterior neural ectoderm for normal forebrain formation. Dev. Biol. Masjkur, J., Poser, S.W., Nikolakopoulou, P., Chrousos, G., McKay, R.D., Bornstein, S.R., Jones, P.M., and Androutsellis-Theotokis, A. (2016). Endocrine pancreas development and regeneration: Noncanonical ideas from neural stem cell biology. Diabetes. Masui, T., Swift, G.H., Hale, M.A., Meredith, D.M., Johnson, J.E., and MacDonald, R.J. (2008). Transcriptional Autoregulation Controls Pancreatic Ptf1a Expression during Development and Adulthood. Mol. Cell. Biol. Masui, T., Long, Q., Beres, T.M., Magnuson, M.A., and MacDonald, R.J. (2007).
Early pancreatic development requires the vertebrate Suppressor of Hairless (RBPJ) in the PTF1 bHLH complex. Genes Dev. Masui, T., Swift, G.H., Deering, T., Shen, C., Coats, W.S., Long, Q., Elsässer, H.P., Magnuson, M.A., and MacDonald, R.J. (2010). Replacement of Rbpj With Rbpjl in the PTF1 Complex Controls the Final Maturation of Pancreatic Acinar Cells. Gastroenterology. McClure, C., Cole, K.L.H., Wulff, P., Klugmann, M., and Murray, A.J. (2011). Production and titering of recombinant adeno-associated viral vectors. J. Vis. Exp. McDonald, E., Li, J., Krishnamurthy, M., Fellows, G.F., Goodyer, C.G., and Wang, R. (2012). SOX9 regulates endocrine cell differentiation during human fetal pancreas development. Int. J. Biochem. Cell Biol. McGrath, P.S., Watson, C.L., Ingram, C., Helmrath, M.A., and Wells, J.M. (2015). The basic helix-loop-helix transcription factor neurog3 is required for development of the human endocrine pancreas. Diabetes. Micallef, S.J., Janes, M.E., Knezevic, K., Davis, R.P., Elefanty, A.G., and Stanley, E.G. (2005). Retinoic acid induces Pdx1-positive endoderm in differentiating mouse embryonic stem cells. Diabetes.


Miettinen, P.J., Huotari, M.A., Koivisto, T., Ustinov, J., Palgi, J., Rasilainen, S., Lehtonen, E., Keski-Oja, J., and Otonkoski, T. (2000). Impaired migration and delayed differentiation of pancreatic islet cells in mice lacking EGF-receptors. Development. Milone, M.C., and O’Doherty, U. (2018). Clinical use of lentiviral vectors. Leukemia. Minoo, P., Su, G., Drum, H., Bringas, P., and Kimura, S. (1999). Defects in tracheoesophageal and lung morphogenesis in Nkx2.1(-/-) mouse embryos. Dev. Biol. Miralles, F., Czernichow, P., Ozaki, K., Itoh, N., and Scharfmann, R. (1999). Signaling through fibroblast growth factor receptor 2b plays a key role in the development of the exocrine pancreas. Proc. Natl. Acad. Sci. U. S. A. Miralles, F., Lamotte, L., Couton, D., and Joshi, R.L. (2006). Interplay between FGF10 and Notch signalling is required for the self-renewal of pancreatic progenitors. Int. J. Dev. Biol. Miranda, J., Rocha, G., Soares, P., Morgado, H., Baptista, M.J., Azevedo, I., Fernandes, S., Brandão, O., Sen, P., and Guimarães, H. (2013). A novel mutation in FOXF1 gene associated with alveolar capillary dysplasia with misalignment of pulmonary veins, intestinal malrotation and annular pancreas. Neonatology. Miyatsuka, T., Kosaka, Y., Kim, H., and German, M.S. (2011). Neurogenin3 inhibits proliferation in endocrine progenitors by inducing Cdkn1a. Proc. Natl. Acad. Sci. U. S. A. Miyatsuka, T., Matsuoka, T.A., Shiraiwa, T., Yamamoto, T., Kojima, I., and Kaneto, H. (2007). Ptf1a and RBP-J cooperate in activating Pdx1 gene expression through binding to Area III. Biochem. Biophys. Res. Commun. Molkentin, J.D. (2000). The Zinc Finger-containing Transcription Factors GATA-4, -5, and -6. J. Biol. Chem. Molkentin, J.D., Lin, Q., Duncan, S.A., and Olson, E.N. (1997). Requirement of the transcription factor GATA4 for heart tube formation and ventral morphogenesis. Genes Dev. Molotkov, A., Molotkova, N., and Duester, G. (2005).
Retinoic acid generated by Raldh2 in mesoderm is required for mouse dorsal endodermal pancreas development. Dev. Dyn. Monaghan, A.P., Kaestner, K.H., Grau, E., and Schutz, G. (1993). Postimplantation expression patterns indicate a role for the mouse forkhead/HNF-3 α, β and γ genes in determination of the definitive endoderm, chordamesoderm and neuroectoderm. Development. Morrisey, E.E., Tang, Z., Sigrist, K., Lu, M.M., Jiang, F., Ip, H.S., and Parmacek, M.S. (1998). GATA6 regulates HNF4 and is required for differentiation of visceral endoderm in the mouse embryo. Genes Dev. Murtaugh, L.C., Law, A.C., Dor, Y., and Melton, D.A. (2005). β-Catenin is essential for pancreatic acinar but not islet development. Development. Murtaugh, L.C., Stanger, B.Z., Kwan, K.M., and Melton, D.A. (2003). Notch signaling controls multiple steps of pancreatic differentiation. Proc. Natl. Acad. Sci. U. S. A.

Nair, A.K., Sutherland, J.R., Traurig, M., Piaggi, P., Chen, P., Kobes, S., Hanson, R.L., Bogardus, C., and Baier, L.J. (2018). Functional and association analysis of an Amerindian-derived population-specific p.(Thr280Met) variant in RBPJL, a component of the PTF1 complex. Eur. J. Hum. Genet.

Nair, G.G., Liu, J.S., Russ, H.A., Tran, S., Saxton, M.S., Chen, R., Juang, C., Li, M.-L., Nguyen, V.Q., Giacometti, S., et al. (2019). Recapitulating endocrine cell clustering in culture promotes maturation of human stem-cell-derived β cells. Nat. Cell Biol.

Nakhai, H., Siveke, J.T., Klein, B., Mendoza-Torres, L., Mazur, P.K., Algül, H., Radtke, F., Strobl, L., Zimber-Strobl, U., and Schmid, R.M. (2008b). Conditional ablation of Notch signaling in pancreatic development. Development.

Nakhai, H., Siveke, J.T., Mendoza-Torres, L., and Schmid, R.M. (2008a). Conditional inactivation of Myc impairs development of the exocrine pancreas. Development.

Nammo, T., Yamagata, K., Tanaka, T., Kodama, T., Sladek, F.M., Fukui, K., Katsube, F., Sato, Y., Miyagawa, J.-I., and Shimomura, I. (2008). Expression of HNF-4α (MODY1), HNF-1β (MODY5), and HNF-1α (MODY3) proteins in the developing mouse pancreas. Gene Expr. Patterns.

Nicolino, M., Claiborn, K.C., Senée, V., Boland, A., Stoffers, D.A., and Julier, C. (2010). A novel hypomorphic PDX1 mutation responsible for permanent neonatal diabetes with subclinical exocrine deficiency. Diabetes.

Norgaard, G.A., Jensen, J.N., and Jensen, J. (2003). FGF10 signaling maintains the pancreatic progenitor cell state revealing a novel role of Notch in organ development. Dev. Biol.

Nostro, M.C., Sarangi, F., Yang, C., Holland, A., Elefanty, A.G., Stanley, E.G., Greiner, D.L., and Keller, G. (2015). Efficient generation of NKX6-1+ pancreatic progenitors from multiple human pluripotent stem cell lines. Stem Cell Reports.

Nowotschin, S., Setty, M., Kuo, Y.Y., Liu, V., Garg, V., Sharma, R., Simon, C.S., Saiz, N., Gardner, R., Boutet, S.C., et al. (2019). The emergent landscape of the mouse gut endoderm at single-cell resolution. Nature.

Odom, D.T., Zizlsperger, H., Gordon, D.B., Bell, G.W., Rinaldi, N.J., Murray, H.L., Volkert, T.L., Schreiber, J., Rolfe, P.A., Gifford, D.K., et al. (2004). Control of Pancreas and Liver Gene Expression by HNF Transcription Factors. Science.

Offield, M.F., Jetton, T.L., Labosky, P.A., Ray, M., Stein, R.W., Magnuson, M.A., Hogan, B.L.M., and Wright, C.V.E. (1996). PDX-1 is required for pancreatic outgrowth and differentiation of the rostral duodenum. Development.

Oliver-Krasinski, J.M., Kasner, M.T., Yang, J., Crutchlow, M.F., Rustgi, A.K., Kaestner, K.H., and Stoffers, D.A. (2009). The diabetes gene Pdx1 regulates the transcriptional network of pancreatic endocrine progenitor cells in mice. J. Clin. Invest.

O’Rahilly, R., and Müller, F. (2010). Developmental stages in human embryos: Revised and new measurements. Cells Tissues Organs.

Ornitz, D.M., and Marie, P.J. (2015). Fibroblast growth factor signaling in skeletal development and disease. Genes Dev.

Pabst, O., Zweigerdt, R., and Arnold, H.H. (1999). Targeted disruption of the homeobox transcription factor Nkx2-3 in mice results in postnatal lethality and abnormal development of small intestine and spleen. Development.

Pan, F.C., Brissova, M., Powers, A.C., Pfaff, S., and Wright, C.V.E. (2015). Inactivating the permanent neonatal diabetes gene Mnx1 switches insulin-producing β-cells to a δ-like fate and reveals a facultative proliferative capacity in aged β-cells. Dev.

Pan, F.C., and Wright, C. (2011). Pancreas organogenesis: From bud to plexus to gland. Dev. Dyn. 240, 530–565.

Patient, R.K., and McGhee, J.D. (2002). The GATA family (vertebrates and invertebrates). Curr. Opin. Genet. Dev.

Pedersen, J.K., Nelson, S.B., Jorgensen, M.C., Henseleit, K.D., Fujitani, Y., Wright, C.V.E., Sander, M., and Serup, P. (2005). Endodermal expression of Nkx6 genes depends differentially on Pdx1. Dev. Biol.

Petersen, M.B.K., Azad, A., Ingvorsen, C., Hess, K., Hansson, M., Grapin-Botton, A., and Honoré, C. (2017). Single-Cell Gene Expression Analysis of a Human ESC Model of Pancreatic Endocrine Development Reveals Different Paths to β-Cell Differentiation. Stem Cell Reports.

Pierreux, C.E., Poll, A. V., Kemp, C.R., Clotman, F., Maestro, M.A., Cordi, S., Ferrer, J., Leyns, L., Rousseau, G.G., and Lemaigre, F.P. (2006). The Transcription Factor Hepatocyte Nuclear Factor-6 Controls the Development of Pancreatic Ducts in the Mouse. Gastroenterology.

Pin, C.L., Rukstalis, J.M., Johnson, C., and Konieczny, S.F. (2001). The bHLH transcription factor Mist1 is required to maintain exocrine pancreas cell organization and acinar cell identity. J. Cell Biol.

Pinney, S.E., Jaeckle Santos, L.J., Han, Y., Stoffers, D.A., and Simmons, R.A. (2011a). Exendin-4 increases histone acetylase activity and reverses epigenetic modifications that silence Pdx1 in the intrauterine growth retarded rat. Diabetologia.

Pinney, S.E., Oliver-Krasinski, J., Ernst, L., Hughes, N., Patel, P., Stoffers, D.A., Russo, P., and De León, D.D. (2011b). Neonatal diabetes and congenital malabsorptive diarrhea attributable to a novel mutation in the human neurogenin-3 gene coding sequence. In Journal of Clinical Endocrinology and Metabolism, p.

Prasadan, K., Tulachan, S., Guo, P., Shiota, C., Shah, S., and Gittes, G. (2010). Endocrine-committed progenitor cells retain their differentiation potential in the absence of neurogenin-3 expression. Biochem. Biophys. Res. Commun.

Price, M.P., Lewin, G.R., McIlwrath, S.L., Cheng, C., Xie, J., Heppenstall, P.A., Stucky, C.L., Mannsfeldt, A.G., Brennan, T.J., Drummond, H.A., et al. (2000). The mammalian sodium channel BNC1 is required for normal touch sensation. Nature.

Que, J., Okubo, T., Goldenring, J.R., Nam, K.T., Kurotani, R., Morrisey, E.E., Taranova, O., Pevny, L.H., and Hogan, B.L.M. (2007). Multiple dose-dependent roles for Sox2 in the patterning and differentiation of anterior foregut endoderm. Development.

Rackham, O.J.L., Firas, J., Fang, H., Oates, M.E., Holmes, M.L., Knaupp, A.S., Suzuki, H., Nefzger, C.M., Daub, C.O., Shin, J.W., et al. (2016). A predictive computational framework for direct reprogramming between human cell types. Nat. Genet.

Ramond, C., Beydag-Tasöz, B.S., Azad, A., van de Bunt, M., Petersen, M.B.K., Beer, N.L., Glaser, N., Berthault, C., Gloyn, A.L., Hansson, M., et al. (2018). Understanding human fetal pancreas development using subpopulation sorting, RNA sequencing and single-cell profiling. Dev.

Ramsey, V.G., Doherty, J.M., Chen, C.C., Stappenbeck, T.S., Konieczny, S.F., and Mills, J.C. (2007). The maturation of mucus-secreting gastric epithelial progenitors into digestive-enzyme secreting zymogenic cells requires Mist1. Development.

Rausa, F., Samadani, U., Ye, H., Lim, L., Fletcher, C.F., Jenkins, N.A., Copeland, N.G., and Costa, R.H. (1997). The cut-homeodomain transcriptional activator HNF-6 is coexpressed with its target gene HNF-3β in the developing murine liver and pancreas. Dev. Biol.

Rehorn, K.P., Thelen, H., Michelson, A.M., and Reuter, R. (1996). A molecular aspect of hematopoiesis and endoderm development common to vertebrates and Drosophila. Development.

Rezania, A., Bruin, J.E., Arora, P., Rubin, A., Batushansky, I., Asadi, A., O’Dwyer, S., Quiskamp, N., Mojibian, M., Albrecht, T., et al. (2014). Reversal of diabetes with insulin-producing cells derived in vitro from human pluripotent stem cells. Nat. Biotechnol. 32, 1121–1133.

Ritz-Laser, B., Mamin, A., Brun, T., Avril, I., Schwitzgebel, V.M., and Philippe, J. (2005). The zinc finger-containing transcription factor Gata-4 is expressed in the developing endocrine pancreas and activates glucagon gene expression. Mol. Endocrinol.

Roglic, G. (2016). WHO Global report on diabetes: A summary. Int. J. Noncommunicable Dis.

Rojas, A., De Val, S., Heidt, A.B., Xu, S.-M., Bristow, J., and Black, B.L. (2005). Gata4 expression in lateral mesoderm is downstream of BMP4 and is activated directly by Forkhead and GATA transcription factors through a distal enhancer element. Development 132, 3405–3417.

Rojas, A., Schachterle, W., Xu, S.M., Martin, F., and Black, B.L. (2010). Direct transcriptional regulation of Gata4 during early endoderm specification is controlled by FoxA2 binding to an intronic enhancer. Dev. Biol. 346, 346–355.

Rojas, A., Schachterle, W., Xu, S.-M., and Black, B.L. (2009). An endoderm-specific transcriptional enhancer from the mouse Gata4 gene requires GATA and homeodomain protein-binding sites for function in vivo. Dev. Dyn. 238, 2588–2598.

Russ, H.A., Parent, A.V., Ringler, J.J., Hennings, T.G., Nair, G.G., Shveygert, M., Guo, T., Puri, S., Haataja, L., Cirulli, V., et al. (2015). Controlled induction of human pancreatic progenitors produces functional beta-like cells in vitro. EMBO J.

Boj, S.F., Petrov, D., and Ferrer, J. (2010). Epistasis of transcriptomes reveals synergism between transcriptional activators Hnf1α and Hnf4α. PLoS Genet.

Salisbury, R.J., Blaylock, J., Berry, A.A., Jennings, R.E., De Krijger, R., Hanley, K.P., and Hanley, N.A. (2014). The window period of NEUROGENIN3 during human gestation. Islets.

Samadani, U., and Costa, R.H. (1996). The transcriptional activator hepatocyte nuclear factor 6 regulates liver gene expression. Mol. Cell. Biol.

Sánchez-Arévalo Lobo, V.J., Fernández, L.C., Carrillo-De-Santa-Pau, E., Richart, L., Cobo, I., Cendrowski, J., Moreno, U., Del Pozo, N., Megías, D., Bréant, B., et al. (2018). C-Myc downregulation is required for preacinar to acinar maturation and pancreatic homeostasis. Gut.

Sander, M., Sussel, L., Conners, J., Scheel, D., Kalamaras, J., Dela Cruz, F., Schwitzgebel, V., Hayes-Jordan, A., and German, M. (2000). Homeobox gene Nkx6.1 lies downstream of Nkx2.2 in the major pathway of β-cell formation in the pancreas. Development.

Sapir, T., Shternhall, K., Meivar-Levy, I., Blumenfeld, T., Cohen, H., Skutelsky, E., Eventov-Friedman, S., Barshack, I., Goldberg, I., Pri-Chen, S., et al. (2005). Cell-replacement therapy for diabetes: Generating functional insulin-producing tissue from adult human liver cells. Proc. Natl. Acad. Sci. U. S. A.

Sasaki, H., and Hogan, B.L.M. (1993). Differential expression of multiple fork head related genes during gastrulation and axial pattern formation in the mouse embryo. Development.

Saxena, P., Heng, B.C., Bai, P., Folcher, M., Zulewski, H., and Fussenegger, M. (2016). A programmable synthetic lineage-control network that differentiates human IPSCs into glucose-sensitive insulin-secreting beta-like cells. Nat. Commun.

Schaffer, A.E., Freude, K.K., Nelson, S.B., and Sander, M. (2010). Nkx6 transcription factors and Ptf1a function as antagonistic lineage determinants in multipotent pancreatic progenitors. Dev. Cell.

Schaffer, A.E., Taylor, B.L., Benthuysen, J.R., Liu, J., Thorel, F., Yuan, W., Jiao, Y., Kaestner, K.H., Herrera, P.L., Magnuson, M.A., et al. (2013). Nkx6.1 Controls a Gene Regulatory Network Required for Establishing and Maintaining Pancreatic Beta Cell Identity. PLoS Genet.

Schisler, J.C., Jensen, P.B., Taylor, D.G., Becker, T.C., Knop, F.K., Takekawa, S., German, M., Weir, G.C., Lu, D., Mirmira, R.G., et al. (2005). The Nkx6.1 homeodomain transcription factor suppresses glucagon expression and regulates glucose-stimulated insulin secretion in islet beta cells. Proc. Natl. Acad. Sci. U. S. A.

Schlimgen, R., Howard, J., Wooley, D., Thompson, M., Baden, L.R., Yang, O.O., Christiani, D.C., Mostoslavsky, G., Diamond, D. V., Duane, E.G., et al. (2016). Risks associated with lentiviral vector exposures and prevention strategies. In Journal of Occupational and Environmental Medicine, p.

Schulz, T.C., Young, H.Y., Agulnick, A.D., Babin, M.J., Baetge, E.E., Bang, A.G., Bhoumik, A., Cepa, I., Cesario, R.M., Haakmeester, C., et al. (2012). A scalable system for production of functional pancreatic progenitors from human embryonic stem cells. PLoS One.

Schwitzgebel, V.M., Mamin, A., Brun, T., Ritz-Laser, B., Zaiko, M., Maret, A., Jornayvaz, F.R., Theintz, G.E., Michielin, O., Melloul, D., et al. (2003). Agenesis of human pancreas due to decreased half-life of insulin promoter factor 1. J. Clin. Endocrinol. Metab.

Sellick, G.S., Barker, K.T., Stolte-Dijkstra, I., Fleischmann, C., Coleman, R.J., Garrett, C., Gloyn, A.L., Edghill, E.L., Hattersley, A.T., Wellauer, P.K., et al. (2004). Mutations in PTF1A cause pancreatic and cerebellar agenesis. Nat. Genet.

Serup, P. (2012). Signaling pathways regulating murine pancreatic development. Semin. Cell Dev. Biol.

Seymour, P.A., Freude, K.K., Tran, M.N., Mayes, E.E., Jensen, J., Kist, R., Scherer, G., and Sander, M. (2007). SOX9 is required for maintenance of the pancreatic progenitor cell pool. Proc. Natl. Acad. Sci. U. S. A.

Seymour, P.A., Shih, H.P., Patel, N.A., Freude, K.K., Xie, R., Lim, C.J., and Sander, M. (2012). A Sox9/Fgf feed-forward loop maintains pancreatic organ identity. Dev.

Shapiro, A.M.J., Lakey, J.R.T., Ryan, E.A., Korbutt, G.S., Toth, E., Warnock, G.L., Kneteman, N.M., and Rajotte, R. V. (2000). Islet transplantation in seven patients with type 1 diabetes mellitus using a glucocorticoid-free immunosuppressive regimen. N. Engl. J. Med.

Shapiro, A.M.J., Ricordi, C., Hering, B.J., Auchincloss, H., Lindblad, R., Robertson, R.P., Secchi, A., Brendel, M.D., Berney, T., Brennan, D.C., et al. (2006). International trial of the Edmonton protocol for islet transplantation. N. Engl. J. Med.

Shaw-Smith, C., De Franco, E., Allen, H.L., Batlle, M., Flanagan, S.E., Borowiec, M., Taplin, C.E., Van Alfen-Van Der Velden, J., Cruz-Rojo, J., De Nanclares, G.P., et al. (2014). GATA4 mutations are a cause of neonatal and childhood-onset diabetes. Diabetes.

Sherwood, R.I., Chen, T.Y.A., and Melton, D.A. (2009). Transcriptional dynamics of endodermal organ formation. Dev. Dyn.

Shi, Z.D., Lee, K., Yang, D., Amin, S., Verma, N., Li, Q. V., Zhu, Z., Soh, C.L., Kumar, R., Evans, T., et al. (2017). Genome Editing in hPSCs Reveals GATA6 Haploinsufficiency and a Genetic Interaction with GATA4 in Human Pancreatic Development. Cell Stem Cell.

Shih, H.P., Kopp, J.L., Sandhu, M., Dubois, C.L., Seymour, P.A., Grapin-Botton, A., and Sander, M. (2012). A Notch-dependent molecular circuitry initiates pancreatic endocrine and ductal cell differentiation. Dev.

Simon, C.S., Zhang, L., Wu, T., Cai, W., Saiz, N., Nowotschin, S., Cai, C.L., and Hadjantonakis, A.K. (2018). A Gata4 nuclear GFP transcriptional reporter to study endoderm and cardiac development in the mouse. Biol. Open.

Sinner, D., Kirilenko, P., Rankin, S., Wei, E., Howard, L., Kofron, M., Heasman, J., Woodland, H.R., and Zorn, A.M. (2006). Global analysis of the transcriptional network controlling Xenopus endoderm formation. Development.

Soyer, J., Flasse, L., Raffelsberger, W., Beucher, A., Orvain, C., Peers, B., Ravassard, P., Vermot, J., Voz, M.L., Mellitzer, G., et al. (2010). Rfx6 is an Ngn3-dependent winged helix transcription factor required for pancreatic islet cell development. Development.

Spaeth, J.M., Hunter, C.S., Bonatakis, L., Guo, M., French, C.A., Slack, I., Hara, M., Fisher, S.E., Ferrer, J., Morrisey, E.E., et al. (2015). The FOXP1, FOXP2 and FOXP4 transcription factors are required for islet alpha cell proliferation and function in mice. Diabetologia.

Spagnoli, F.M., and Brivanlou, A.H. (2008). The Gata5 target, TGIF2, defines the pancreatic region by modulating BMP signals within the endoderm. Development.

Spence, J.R., Lange, A.W., Lin, S.C.J., Kaestner, K.H., Lowy, A.M., Kim, I., Whitsett, J.A., and Wells, J.M. (2009). Sox17 Regulates Organ Lineage Segregation of Ventral Foregut Progenitor Cells. Dev. Cell.

Spence, J.R., and Wells, J.M. (2007). Translational embryology: Using embryonic principles to generate pancreatic endocrine cells from embryonic stem cells. Dev. Dyn.

Spiegel, R., Dobbie, A., Hartman, C., de Vries, L., Ellard, S., and Shalev, S.A. (2011). Clinical characterization of a newly described neonatal diabetes syndrome caused by RFX6 mutations. Am. J. Med. Genet. Part A.

Stafford, D., and Prince, V.E. (2002). Retinoic acid signaling is required for a critical early step in zebrafish pancreatic development. Curr. Biol.

Stekelenburg, C., Gerster, K., Blouin, J.L., Lang-Muritano, M., Guipponi, M., Santoni, F., and Schwitzgebel, V.M. (2019). Exome sequencing identifies a de novo FOXA2 variant in a patient with syndromic diabetes. Pediatr. Diabetes.

Stoffers, D.A., Zinkin, N.T., Stanojevic, V., Clarke, W.L., and Habener, J.F. (1997). Pancreatic agenesis attributable to a single nucleotide deletion in the human IPF1 gene coding sequence. Nat. Genet.

Svensson, P., Williams, C., Lundeberg, J., Rydén, P., Bergqvist, I., and Edlund, H. (2007). Gene array identification of Ipf1/Pdx1-/- regulated genes in pancreatic progenitor cells. BMC Dev. Biol.

Takahashi, K., and Yamanaka, S. (2006). Induction of Pluripotent Stem Cells from Mouse Embryonic and Adult Fibroblast Cultures by Defined Factors. Cell.

Tang, S.J., Hoodless, P.A., Lu, Z., Breitman, M.L., McInnes, R.R., Wrana, J.L., and Buchwald, M. (1998). The Tlx-2 homeobox gene is a downstream target of BMP signalling and is required for mouse mesoderm development. Development.

Taylor, B.L., Liu, F.F., and Sander, M. (2013). Nkx6.1 Is Essential for Maintaining the Functional State of Pancreatic Beta Cells. Cell Rep.

Teo, A.K.K., Ali, Y., Wong, K.Y., Chipperfield, H., Sadasivam, A., Poobalan, Y., Tan, E.K., Wang, S.T., Abraham, S., Tsuneyoshi, N., et al. (2012). Activin and BMP4 synergistically promote formation of definitive endoderm in human embryonic stem cells. Stem Cells.

Teo, A.K.K., Lau, H.H., Valdez, I.A., Dirice, E., Tjora, E., Raeder, H., and Kulkarni, R.N. (2016). Early Developmental Perturbations in a Human Stem Cell Model of MODY5/HNF1B Pancreatic Hypoplasia. Stem Cell Reports.

Teo, A.K.K., Tsuneyoshi, N., Hoon, S., Tan, E.K., Stanton, L.W., Wright, C.V.E., and Ray Dunn, N. (2015). PDX1 binds and represses hepatic genes to ensure robust pancreatic commitment in differentiating human embryonic stem cells. Stem Cell Reports.

Teta, M., Long, S.Y., Wartschow, L.M., Rankin, M.M., and Kushner, J.A. (2005). Very slow turnover of β-cells in aged adult mice. Diabetes.

Teta, M., Rankin, M.M., Long, S.Y., Stein, G.M., and Kushner, J.A. (2007). Growth and Regeneration of Adult β Cells Does Not Involve Specialized Progenitors. Dev. Cell.

Thomas, H. (2001). A distant upstream promoter of the HNF-4alpha gene connects the transcription factors involved in maturity-onset diabetes of the young. Hum. Mol. Genet.

Thomas, I.H., Saini, N.K., Adhikari, A., Lee, J.M., Kasa-vubu, J.Z., Vazquez, D.M., Menon, R.K., Chen, M., and Fajans, S.S. (2009). Neonatal diabetes mellitus with pancreatic agenesis in an infant with homozygous IPF-1 Pro63fsX60 mutation. Pediatr. Diabetes.

Thompson, N., Gesina, E., Scheinert, P., Bucher, P., and Grapin-Botton, A. (2012). RNA Profiling and Chromatin Immunoprecipitation-Sequencing Reveal that PTF1a Stabilizes Pancreas Progenitor Identity via the Control of MNX1/HLXB9 and a Network of Other Transcription Factors. Mol. Cell. Biol.

Thomson, J.A. (1998). Embryonic stem cell lines derived from human blastocysts. Science.

Trott, J., Tan, E.K., Ong, S., Titmarsh, D.M., Denil, S.L.I.J., Giam, M., Wong, C.K., Wang, J., Shboul, M., Eio, M., et al. (2017). Long-Term Culture of Self-renewing Pancreatic Progenitors Derived from Human Pluripotent Stem Cells. Stem Cell Reports.

Villasenor, A., Chong, D.C., and Cleaver, O. (2008). Biphasic Ngn3 expression in the developing pancreas. Dev. Dyn.

Vincent, A.L., Jordan, C.A., Cadzow, M.J., Merriman, T.R., and McGhee, C.N. (2014). Mutations in the zinc finger protein gene, Znf469, contribute to the pathogenesis of keratoconus. Investig. Ophthalmol. Vis. Sci.

Wang, L., Coffinier, C., Thomas, M.K., Gresh, L., Eddu, G., Manor, T., Levitsky, L.L., Yaniv, M., and Rhoads, D.B. (2004). Selective deletion of the Hnf1β (MODY5) gene in β-cells leads to altered gene expression and defective insulin release. Endocrinology.

Wang, X., Sterr, M., Burtscher, I., Chen, S., Hieronimus, A., Machicao, F., Staiger, H., Häring, H.U., Lederer, G., Meitinger, T., et al. (2018). Genome-wide analysis of PDX1 target genes in human pancreatic progenitors. Mol. Metab.

Wang, X., Srivastava, Y., Jankowski, A., Malik, V., Wei, Y., Del Rosario, R.C.H., Cojocaru, V., Prabhakar, S., and Jauch, R. (2018). DNA-mediated dimerization on a compact sequence signature controls enhancer engagement and regulation by FOXA1. Nucleic Acids Res.

Wang, Z., Li, W., Chen, T., Yang, J., Wen, Z., Yan, X., and Liang, R. (2015). Biotechnol. Lett. 37, 1711. https://doi.org/10.1007/s10529-015-1829-x

Wang, Z., Zhu, T., Rehman, K.K., Bertera, S., Zhang, J., Chen, C., Papworth, G., Watkins, S., Trucco, M., Robbins, P.D., et al. (2006). Widespread and stable pancreatic gene transfer by adeno-associated virus vectors via different routes. Diabetes.

Wells, J.M., Esni, F., Boivin, G.P., Aronow, B.J., Stuart, W., Combs, C., Sklenka, A., Leach, S.D., and Lowy, A.M. (2007). Wnt/β-catenin signaling is required for development of the exocrine pancreas. BMC Dev. Biol.

Wells, J.M., and Melton, D.A. (2000). Early mouse endoderm is patterned by soluble factors from adjacent germ layers. Development.

Weltner, J., Anisimov, A., Alitalo, K., Otonkoski, T., and Trokovic, R. (2012). Induced Pluripotent Stem Cell Clones Reprogrammed via Recombinant Adeno-Associated Virus-Mediated Transduction Contain Integrated Vector Sequences. J. Virol.

Wiebe, P.O., Kormish, J.D., Roper, V.T., Fujitani, Y., Alston, N.I., Zaret, K.S., Wright, C.V.E., Stein, R.W., and Gannon, M. (2007). Ptf1a Binds to and Activates Area III, a Highly Conserved Region of the Pdx1 Promoter That Mediates Early Pancreas-Wide Pdx1 Expression. Mol. Cell. Biol.

Winter, J.M., Yeo, C.J., and Brody, J.R. (2013). Diagnostic, prognostic, and predictive biomarkers in pancreatic cancer. J. Surg. Oncol.

Wu, K.L., Gannon, M., Peshavaria, M., Offield, M.F., Henderson, E., Ray, M., Marks, A., Gamer, L.W., Wright, C.V., and Stein, R. (1997). Hepatocyte nuclear factor 3beta is involved in pancreatic beta-cell-specific transcription of the pdx-1 gene. Mol. Cell. Biol.

Xiong, B., Rui, Y., Zhang, M., Shi, K., Jia, S., Tian, T., Yin, K., Huang, H., Lin, S., Zhao, X., et al. (2006). Tob1 Controls Dorsal Development of Zebrafish Embryos by Antagonizing Maternal β-Catenin Transcriptional Activity. Dev. Cell.

Xuan, S., Borok, M.J., Decker, K.J., Battle, M.A., Duncan, S.A., Hale, M.A., Macdonald, R.J., and Sussel, L. (2012). Pancreas-specific deletion of mouse Gata4 and Gata6 causes pancreatic agenesis. J. Clin. Invest.

Xuan, S., and Sussel, L. (2016). GATA4 and GATA6 regulate pancreatic endoderm identity through inhibition of hedgehog signaling. Dev.

Yang, X., Pan, Q., Lu, Y., Jiang, X., Zhang, S., and Wu, J. (2019). MNX1 promotes cell proliferation and activates Wnt/β-catenin signaling in colorectal cancer. Cell Biol. Int.

Yang, Y., Akinci, E., Dutton, J.R., Banga, A., and Slack, J.M.W. (2013). Stage specific reprogramming of mouse embryo liver cells to a beta cell-like phenotype. Mech. Dev.

Yau, D., Franco, E., Flanagan, S.E., Ellard, S., Blumenkrantz, M., and Mitchell, J.J. (2017). Case report: Maternal mosaicism resulting in inheritance of a novel GATA6 mutation causing pancreatic agenesis and neonatal diabetes mellitus. Diagn. Pathol.

Ye, F., Duvillié, B., and Scharfmann, R. (2005). Fibroblast growth factors 7 and 10 are expressed in the human embryonic pancreatic mesenchyme and promote the proliferation of embryonic pancreatic epithelial cells. Diabetologia.

Yu, W., Hegarty, J.P., Berg, A., Chen, X., West, G., Kelly, A.A., Wang, Y., Poritz, L.S., Koltun, W.A., and Lin, Z. (2011). NKX2-3 transcriptional regulation of endothelin-1 and VEGF signaling in human intestinal microvascular endothelial cells. PLoS One.

Zhang, C., Moriguchi, T., Kajihara, M., Esaki, R., Harada, A., Shimohata, H., Oishi, H., Hamada, M., Morito, N., Hasegawa, K., et al. (2005). MafA Is a Key Regulator of Glucose-Stimulated Insulin Secretion. Mol. Cell. Biol.

Zhang, H., Ables, E.T., Pope, C.F., Washington, M.K., Hipkens, S., Means, A.L., Path, G., Seufert, J., Costa, R.H., Leiter, A.B., et al. (2009). Multiple, temporal-specific roles for HNF6 in pancreatic endocrine and ductal differentiation. Mech. Dev.

Zhang, M., Lin, Y.H., Sun, Y.J., Zhu, S., Zheng, J., Liu, K., Cao, N., Li, K., Huang, Y., and Ding, S. (2016). Pharmacological reprogramming of fibroblasts into neural stem cells by signaling-directed transcriptional activation. Cell Stem Cell.

Zhao, T., Jiang, W., Wang, X., Wang, H., Zheng, C., Li, Y., Sun, Y., Huang, C., Han, Z.B., Yang, S., et al. (2017). ESE3 inhibits pancreatic cancer metastasis by upregulating E-cadherin. Cancer Res.

Zhou, Q., Brown, J., Kanarek, A., Rajagopal, J., and Melton, D.A. (2008). In vivo reprogramming of adult pancreatic exocrine cells to β-cells. Nature.

Zhou, Q., Law, A.C., Rajagopal, J., Anderson, W.J., Gray, P.A., and Melton, D.A. (2007). A Multipotent Progenitor Domain Guides Pancreatic Organogenesis. Dev. Cell.

Zhou, Q., and Melton, D.A. (2018). Pancreas regeneration. Nature.

Zou, D., Silvius, D., Davenport, J., Grifone, R., Maire, P., and Xu, P.X. (2006). Patterning of the third pharyngeal pouch into thymus/parathyroid by Six and Eya1. Dev. Biol.

Appendixes

Appendix I

Appendix I. SCT differentiation images

Appendix II

Appendix II. Single factor (A-W) and 8 factor (V-AN) alternate layout.

Appendix III

Appendix III. Single factor (A-L) and 8 factor (M-U) images.

Appendix IV

Appendix IV. 4 factor alternate layout.

Appendix V

Appendix V. 4 factor images.

Appendix VI

Appendix VI. PMF treatment additional data.

Appendix VII

Appendix VII. Estimation statistics PDX1 HepG2.

Appendix VIII

Appendix VIII. Estimation statistics HHF treatment.

Appendix IX

Appendix IX. Estimation statistics PtRN treatment.

Appendix X

Appendix X. Estimation statistics single factor analysis.

Appendix XI

Appendix XI. Estimation statistics 8 factor analysis.

Appendix XII

Appendix XII. Estimation statistics 4 factor analysis.

Appendix XIII

Appendix XIII. Estimation statistics Bonfanti vs Trott media.

Appendix XIV

Appendix XIV. Estimation statistics PMF treatment.

Appendix XV

Appendix XV. Estimation statistics fibroblast gene test.

Appendix XVI

Appendix XVI. Consent form for use of termination of pregnancy samples.
