
UNIVERSITY OF NEW SOUTH WALES

Exploring the function of Glutamine fructose-6-phosphate transaminase 2 (Gfpt2) in embryonic development

by

MICHELLE WOOLFORD

A thesis submitted in fulfilment of the requirements for the degree of Doctor of Philosophy

Victor Chang Cardiac Research Institute and School of Biotechnology and Biomolecular Sciences UNIVERSITY OF NEW SOUTH WALES 2012

Australia

ORIGINALITY STATEMENT

‘I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at UNSW or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UNSW or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project's design and conception or in style, presentation and linguistic expression is acknowledged.’

Signed ……………………………………………......

Date ……………………………………………......

COPYRIGHT STATEMENT

‘I hereby grant the University of New South Wales or its agents the right to archive and to make available my thesis or dissertation in whole or part in the University libraries in all forms of media, now or here after known, subject to the provisions of the Copyright Act 1968. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation. I also authorise University Microfilms to use the 350 word abstract of my thesis in Dissertation Abstract International (this is applicable to doctoral theses only). I have either used no substantial portions of copyright material in my thesis or I have obtained permission to use copyright material; where permission has not been granted I have applied/will apply for a partial restriction of the digital copy of my thesis or dissertation.'

Signed ……………………………………………......

Date ……………………………………………......

AUTHENTICITY STATEMENT

‘I certify that the Library deposit digital copy is a direct equivalent of the final officially approved version of my thesis. No emendation of content has occurred and if there are any minor variations in formatting, they are the result of the conversion to digital format.’

Signed ……………………………………………......

Date ……………………………………………......

Abstract

The aim of this thesis was to identify genes expressed in the heart during early mouse development, and to determine their function in mouse heart development. Microarray technology was used to profile gene expression in the progenitor tissue of the heart, the mesoderm. A subset of genes identified as enriched in the progenitor tissues of the heart, with little or no published expression data, were selected for examination by whole mount RNA in situ hybridisation in 7.5 dpc, 8.5 dpc and 9.5 dpc mouse embryos. None of the genes examined were expressed in a restricted manner in the developing heart.

In another screen being undertaken in our laboratory, Glutamine fructose-6-phosphate transaminase 2 (Gfpt2) was identified as being expressed in a restricted manner in the developing heart at 9.5 dpc. Gfpt is the rate-limiting enzyme in the hexosamine biosynthesis pathway (HBP). This pathway accounts for 1-3 % of glucose metabolism in the cell, and converts fructose-6-phosphate to uridine diphosphate-N-acetylglucosamine (UDP-GlcNAc), which is responsible for the majority of glycosylation in the cell. In the mouse, Gfpt is encoded by two separate and differentially regulated genes, Gfpt1 and Gfpt2. Prior to this study, Gfpt expression had not been examined in mouse development.

Gfpt1 and Gfpt2 are differentially expressed. Gfpt2 expression was detected in the foregut endoderm at 8.5 dpc, and in the myocardium underlying the cardiac cushions at 9.5 dpc. Gfpt2 was also detected in the pre-somitic mesoderm at 9.5 dpc and later stages, and in the branchial arches and forebrain of the embryo. Gfpt1 expression was not detected in the mouse embryo at these stages, but was detected in the placenta from 9.5 dpc.

The expression pattern of Gfpt2 in the developing mouse and the known function of Gfpt2 in the HBP led to the hypothesis that Gfpt2 expression correlates with an increased requirement for UDP-GlcNAc in the tissues in which it is expressed. To investigate this, mice carrying gene traps within the enzymatic domain of Gfpt2 were created. It was found that the gene trap insertions are likely to be functionally null alleles. However, mice homozygous for the gene trap insertions survive and appear normal, suggesting that Gfpt2 is dispensable for mouse development.

Acknowledgements

I would like to take this opportunity to thank A/Prof Sally Dunwoodie for letting me come to the lab to do my PhD. Also thanks to Prof Patrick Tam and the whole Tam lab for making me feel so welcome and teaching me how to use the robot. Thanks to Dr Duncan Sparrow and Wendy Chua who first identified the Gfpt2 expression in the heart. Also thanks to Duncan for his assistance in the lab and feedback on my writing. Thanks to Dr Jost Preis who taught me all about ES cells and to Natalie Wise for doing the blastocyst injections and making sure I had a good supply of wildtype females when I was collecting embryos for the in situ screens. Thanks also to the BTF and BioCore staff who looked after my mice. Thanks to Dr Gavin Chapman for his help with the localisation studies and to Dr Sharon Pursglove for being my coffee-buddy and stepping in to proof read at the last minute. An extra big thank you to Dr Kylie Lopes Floro and Stanley Artap for being my lab-based cheer squad and teaching me so many things. Especially Kylie who called to check I was doing ok when I was home writing.

A special thanks to the other members of my cheer squad, Dr Jacque-Lynne Johnson, who, as well as being available for a beer as needed, also got me going to boot camp and gave me occasional rides home (thanks Paul). Humungous thanks to Dr Tanya Kranenburg, who has done lots of proof reading and reassured me that I was going to make it whenever I had doubts. Even from afar you have been an enormous support. Lastly, thanks to my family who are always there for me, even if not in person, and to my biggest and most important supporter, Andrew Blair. I would never have even started without your support. Here’s to our new adventure together.

Table of Contents

ORIGINALITY STATEMENT
COPYRIGHT STATEMENT
ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
ABBREVIATIONS

Chapter 1: Introduction
1.1 Mouse gastrulation: Morphogenesis of the three germ layers
1.2 Fates of the germ layers
1.2.1 Ectoderm fates
1.2.2 Endoderm fates
1.2.3 Mesoderm fates
1.3 Establishment of the left-right axis
1.4 Morphogenesis of the early heart: Cardiac crescent to chamber formation
1.4.1 The primary heart field and linear heart tube
1.4.2 The secondary heart field
1.4.3 Neural crest
1.4.4 Chamber morphogenesis
1.5 Cardiac cushions and the development of the heart valves
1.5.1 Signalling pathways involved in cushion EMT
1.6 Septation of the heart
1.7 Heart malformations
1.8 Aim of the thesis

Chapter 2: Expression Profiling of the Mouse Germ Layers and Primitive Streak
2.1 Introduction
2.1.1 Identification of novel genes using the germ layer libraries
2.2 Germ layer microarrays
2.2.1 Array hybridisation
2.2.2 Analysis of cDNA library microarray data
2.2.3 Analysis 1 in silico analysis of genes
2.2.4 Analysis 1 whole mount in situ hybridisation
2.2.5 Analysis 2 microarray analysis
2.2.6 Analysis 2 in silico analysis
2.2.7 Analysis 2 whole mount in situ hybridisation
2.3 Discussion
2.3.1 The experimental design of the microarrays
2.3.2 Microarray data analysis

Chapter 3: Expression Pattern of glutamine fructose-6-phosphate transaminase 2
3.1 Introduction
3.1.1 Somitogenesis
3.1.2 The hexosamine biosynthesis pathway
3.1.3 N-acetylglucosamine and the importance of glycosylation
3.1.4 Gfpt2
3.2 Gfpt2 expression in the mid-gestation embryo
3.3 Gfpt1 but not Gfpt2 is expressed in the developing placenta
3.4 Cell localisation by tagged expression construct
3.5 Discussion

Chapter 4: Generation of Gfpt2 gene-trapped mouse lines
4.1 Introduction
4.1.1 Gene trapping
4.2 Mapping the gene-trap insertions
4.3 Generation of mice carrying the gene-trap alleles
4.4 Creation of the IRES gene trap lines
4.4.1 Is the Neo transcript degraded in IRES mice?
4.5 Design of genotyping primers
4.6 Expression of gene trapped alleles
4.7 Discussion
4.7.1 Summary of results
4.7.2 Disadvantages to gene trapping over targeted knockout
4.7.3 Alternative splicing could lead to exon skipping and failure to express the gene trap allele
4.7.4 Conclusion

Chapter 5: Functional analysis of Gfpt2 in the mouse
5.1 Introduction
5.2 Gfpt2Gt(CMHD-GT_352F9-3)Cmhd heterozygous matings
5.3 Delta-IRES mice
5.3.1 Gfpt2Gt(CMHD-GT_352F9-3)d1Cmhd heterozygous inter-crosses
5.3.2 Gfpt2Gt(CMHD-GT_305A09-3)d1Cmhd heterozygous inter-crosses
5.3.3 Adult delta-IRES mice survival
5.4 Discussion
5.4.1 Does Gfpt1 compensate for the loss of Gfpt2 in the gene trapped mice?

Chapter 6: Discussion
6.1 Summary of results
6.1.1 Gfpt2 is expressed in a restricted manner in the developing mouse embryo
6.1.2 Generation of the Gfpt2 gene trap mouse lines
6.2 Gfpt2 is dispensable for mouse embryonic development
6.3 Discussion
6.3.1 The delta gene trap insertions should create functionally null alleles
6.3.2 Gfpt1 potentially compensates for loss of Gfpt2
6.3.3 Is GAG production affected in Gfpt2 gene trapped cardiac cushions?
6.3.4 Future studies to determine the function of Gfpt2 in the mouse
6.4 Final comments

Chapter 7: Materials and Methods
7.1 Chemicals and Reagents
7.1.1 Chemicals
7.1.2 Antibodies/Fluorophores
7.1.3 Kits
7.1.4
7.1.5 Miscellaneous
7.1.6 Plasmids
7.2 Buffers and Solutions
7.2.1 General Molecular Biology Solutions
7.2.2 Embryo and RNA in situ hybridisation solutions
7.2.3 Cell culture solutions
7.3 Microarray method
7.4 RNA in situ hybridisation methods
7.4.1 Embryo dissection
7.4.2 Embryo processing for whole mount RNA in situ hybridisation
7.4.3 Embryo processing for wax embedding (if not processed for whole mount RNA in situ hybridisation)
7.4.4 Placenta dissection and processing for cryosectioning
7.4.5 Synthesis of Riboprobes (for manual in situ hybridisation)
7.4.6 Whole mount RNA in situ hybridisation using Intavis in situ robot
7.4.7 Whole mount RNA in situ hybridisation method (manual)
7.4.8 Cryosection RNA in situ hybridisation
7.4.9 Processing wax section slides
7.5 Molecular Biology Methods
7.5.1 Bacteria growth plates & media
7.5.2 Maxiprep
7.5.3 Ethanol precipitation
7.5.4 DNA isolation from ES cells
7.5.5 DNA isolation from mouse tails or ear-clips
7.5.6 DNA isolation from yolk sacs
7.5.7 RNA isolation from ES cells and embryos
7.5.8 PCR primers
7.5.9 PCR protocols
7.5.10 Automated capillary sequencing of plasmid DNA
7.6 Cell culture
7.6.1 Mouse embryonic fibroblast (MEF) generation and culturing
7.6.2 Mitomycin C treatment of MEFs
7.6.3 ES cell culture
7.6.4 Transfection of C2C12 cells
7.7 Mouse strains
7.8 Statistical analysis

Appendices
Appendix 1 GO terms over-represented in the cDNA libraries
Appendix 2 Analysis 1 candidate genes selected for screening by whole mount RNA in situ hybridisation
Appendix 3 Analysis 2 candidate genes selected for screening by whole mount RNA in situ hybridisation

References

List of Tables

Table 1.1 Incidence of congenital heart disease
Table 2.1 cDNA libraries were labelled with Cy3/Cy5 and hybridised to Compugen/Sigma-Genosys OligoLibrary arrayed chip
Table 2.2 Number of Genes identified as enriched in each cDNA library by fold-change in Analysis 1
Table 2.3 Percentage of mesoderm and primitive streak enriched genes with known embryo expression and phenotypic data at different fold-changes
Table 2.4 Analysis 1 whole mount RNA in situ hybridisation
Table 2.5 Analysis 1 observed patterns by embryo stage
Table 2.6 Percentage of mesoderm and mesoderm and endoderm enriched genes with known embryo expression and phenotypic data at 3-fold enrichment
Table 2.7 Analysis 2 Whole mount RNA in situ hybridisation screen
Table 2.8 Analysis 2 observed gene expression pattern by embryo stage
Table 4.1 Percent chimerism of mice generated from 352F9 and 305A09 ES cell clones
Table 4.2 Genotyping primers for gene trap mouse lines
Table 5.1 Gfpt2Gt(CMHD-GT_352F9-3)Cmhd embryos at 18.5 dpc
Table 5.2 Gfpt2Gt(CMHD-GT_352F9-3)Cmhd heterozygous matings pups and weaners
Table 5.3 Gfpt2Gt(CMHD-GT_352F9-3)d1Cmhd embryos at 17.5 dpc
Table 5.4 Gfpt2Gt(CMHD-GT_352F9-3)d1Cmhd pups and weaned mice
Table 5.5 Gfpt2Gt(CMHD-GT_305A09-3)d1Cmhd embryos at 17.5 dpc
Table 5.6 Gfpt2Gt(CMHD-GT_305A09-3)d1Cmhd pups and weaned mice
Table 5.7 Adult weight for Gfpt2Gt(CMHD-GT_305A09-3)d1Cmhd at different ages
Table 5.8 Adult weight for Gfpt2Gt(CMHD-GT_352F9-3)d1Cmhd at different ages
Table 7.1 Oligonucleotides

List of Figures

Figure 1.1 Mouse development overview: Fertilisation to gastrulation
Figure 1.2 Fate map of the mouse gastrula embryo
Figure 1.3 Mesoderm fate map
Figure 1.4 Breaking of symmetry in the node and propagation of asymmetric cues in the lateral plate mesoderm
Figure 1.5 Heart development
Figure 1.6 The primary and secondary heart fields
Figure 1.7 Contributions by the cardiac neural crest to heart development
Figure 1.8 Chamber morphogenesis from heart tube
Figure 1.9 Cushion development
Figure 1.10 Septation of the atria
Figure 1.11 Outflow tract septation
Figure 1.12 Schematic of the adult heart highlighting structures affected by congenital heart defects
Figure 2.1 Schematic and micrographs of the 7.5 dpc embryonic region dissected into its germ layer components and primitive streak fraction
Figure 2.2 Schematic for the microarray strategy for microarray chip 1, Mesoderm library compared to Primitive Streak library
Figure 2.3 Flow schematic of raw data restriction and identification of mesoderm enriched genes
Figure 2.4 Pie graphs representing the best 10 GO terms associated with genes enriched in each germ layer and the primitive streak
Figure 2.5 Examples of embryo expression patterns
Figure 2.6 Flow chart schematic of raw data filtering of the Mesoderm arrays in Analysis 2
Figure 2.7 Analysis 2, identification of genes enriched in both the mesoderm and endoderm
Figure 2.8 Pie graphs showing best 10 GO terms associated with genes enriched in the mesoderm, endoderm and enriched in both the mesoderm and endoderm
Figure 3.1 The clock-wavefront model of somitogenesis
Figure 3.2 Glucose metabolism and HBP pathway
Figure 3.3 Schematic of Gfpt2
Figure 3.4 Expression of Gfpt2 at 8.5 dpc
Figure 3.5 Expression of Gfpt2 at 9.5 dpc
Figure 3.6 Expression of Gfpt2 at 10.5 dpc and 11.5 dpc
Figure 3.7 Gfpt1 expression in placental sections
Figure 3.8 Gfpt2 is localised to the cytoplasm of C2C12 cells
Figure 4.1 Schematic of gene trap vectors
Figure 4.2 Mapping of the gene trap insertions
Figure 4.3 Schematic of Gfpt2 intron/exon structure and domains
Figure 4.4 Excision of UPA trap vector IRES by Cre recombinase
Figure 4.5 Loss of Neo transcript due to excision of the IRES
Figure 4.6 Primer design for genotyping
Figure 4.7 GFP is expressed dorsal to the heart at 8.5 dpc
Figure 5.1 Scatter plot of GfatGT embryos at 18.5 dpc
Figure 5.2 Scatter plot of Gfpt2Gt(CMHD-GT_352F9-3)d1Cmhd embryos at 17.5 dpc
Figure 5.3 Sections of embryonic hearts
Figure 5.4 Scatter plot of Gfpt2Gt(CMHD-GT_305A09-3)d1Cmhd 17.5 dpc embryos

Abbreviations

χ2              chi-square test
β-gal           beta-galactosidase
β-geo           fusion protein encoded by the lacZ (encoding β-gal) and neo genes
Afp             alpha fetoprotein
al              allantois
ANOVA           analysis of variance, statistical test
AV              atrioventricular
AVC             atrioventricular canal
ba              branchial arch
BMP             bone morphogenetic protein
Bmpr1a          bone morphogenetic protein receptor, type 1A
Bmpr2           bone morphogenetic protein receptor, type II (serine/threonine kinase)
bp              base pair(s)
BSA             bovine serum albumin
°C              degrees Celsius
cDNA            complementary DNA
cds             coding sequence
Cited1          Cbp/p300-interacting transactivator with Glu/Asp-rich carboxy-terminal domain 1
Cited2          Cbp/p300-interacting transactivator with Glu/Asp-rich carboxy-terminal domain 2
CMHD            Centre for Modeling Human Disease
Crabp2          cellular retinoic acid binding protein 2
cvp             caudal venous pole
Cy3             cyanine 3
Cy5             cyanine 5
DAPI            4',6-diamidino-2-phenylindole
Ddx5            DEAD (Asp-Glu-Ala-Asp) box polypeptide 5
Dennd5a         DENN/MADD domain containing 5A
DIG-UTP         Digoxigenin-11-uridine-5'-triphosphate

Dll3            delta-like 3
DNA             deoxyribonucleic acid
dNTP            deoxyribonucleotide tri-phosphate
dpc             days post coitum
E. coli         Escherichia coli
E1              embryonic day 1
EDTA            Ethylenediaminetetra-acetic acid
EGFP            Enhanced green fluorescent protein
EMT             epithelial to mesenchyme transition
ENU             N-ethyl-N-nitrosourea
epc             ectoplacental cone
ES              embryonic stem
fb              forebrain
FCS             foetal calf serum
Flt1            FMS-like tyrosine kinase 1, formerly known as VEGFR1
G               G-force
g               gram
Gata4           GATA binding protein 4
GFP             green fluorescent protein
Gfpt1           glutamine fructose 6 phosphate transaminase 1
Gfpt2           glutamine fructose 6 phosphate transaminase 2
Gnpnat1         glucosamine-phosphate N-acetyltransferase 1
GO              gene ontology
GXD             gene expression database
h               hour
Has2            hyaluronan synthase 2
hb              hindbrain
HBP             hexosamine biosynthesis pathway
HEPES           N-2-hydroxyethyl piperazine-N-ethane sulphonic acid
het             heterozygous
hom             homozygous
IPTG            Isopropyl β-D-1-thiogalactopyranoside

IRES            internal ribosome entry site
Irx5            Iroquois related homeobox 5
Kdr             kinase insert domain protein receptor, formerly known as VEGFR2
L               litre
lb              limb bud
LoxP            locus of X-over P1, site on the Bacteriophage P1 consisting of 34 bp
MAB             Maleic acid buffer
mb              midbrain
mg              milligram

MgCl2           Magnesium chloride
MGD             mouse genome database
MGI             mouse genome informatics
min             minute
Mixl1           Mix1 homeobox-like 1
mL              millilitre
mM              millimolar
mm              millimetre
MQ              Water purified by reverse osmosis and filtration
mRNA            messenger RNA
NaAc            sodium acetate
NaCl            sodium chloride
NBT/BCIP        nitro blue tetrazolium chloride (NBT)/ 5-Bromo-4-chloro-3-indolyl phosphate (BCIP)
nc              neural crest
NCBI            National Center for Biotechnology Information
Ndfip1          Nedd4 family interacting protein 1
Neo             Neomycin resistance
nf              neural fold
Nf1             neurofibromatosis 1
Nfatc1          nuclear factor of activated T-cells, cytoplasmic, calcineurin-dependent 1
ng              nanogram
Nkx2.5          NK2 transcription factor related, locus 5

nm              nanometre
NMD             nonsense mediated decay
Nppa            natriuretic peptide precursor type A

NTMT            NaCl, Tris-HCl, MgCl2, Tween-20
Nuf2            NDC80 kinetochore complex component, homolog (S. cerevisiae)
OCT             optimal cutting temperature compound
OFT             outflow tract
Oligo           oligonucleotide
ORF             open reading frame
P1              postnatal day 1
pA              polyadenylation sequence
PBS             phosphate buffered saline
pc              prechordal plate
PCR             polymerase chain reaction
PFA             paraformaldehyde
PGK             phosphoglycerate kinase 1
Phlda2          pleckstrin homology-like domain, family A, member 2
polyA           polyadenylation
PSM             pre-somitic mesoderm
RNA             ribonucleic acid
RNAse           ribonuclease
RNAsin          ribonuclease inhibitor
rpm             revolutions per minute
RT-PCR          reverse transcription polymerase chain reaction
S. cerevisiae   Saccharomyces cerevisiae
SA              splice acceptor site
SD              splice donor site
Sdcbp           syndecan binding protein
SDS             sodium dodecyl sulfate
sec             second
SHF             secondary heart field
Smoc1           SPARC related modular calcium binding 1

Smpx            small muscle protein, X-linked
SNP             single nucleotide polymorphism
Sp5             trans-acting transcription factor 5
SSC             Sodium chloride and sodium citrate
ssDNA           Sheared herring sperm DNA
StD             standard deviation
TAE             Tris acetate EDTA
Tbx5            T-box 5
TGFβ2           transforming growth factor beta 2
Tgfbr3          transforming growth factor, beta receptor III
Tm2d2           TM2 domain containing 2
Tris-base       Tris(hydroxymethyl) aminomethane
Tris-HCl        Tris(hydroxymethyl) aminomethane hydrochloride
Tween-20        Polyoxyethylene (20) sorbitan monolaurate
U               units
UTR             untranslated region
V               volts
Vegf            vascular endothelial growth factor
wt              wildtype
x-gal           5-bromo-4-chloro-3-indolyl-beta-D-galactopyranoside
Zfpm2           zinc finger protein, multitype 2, also called Fog2
µL              microlitre
µM              micromolar

Chapter 1: Introduction

The mouse has proven a useful tool for investigating development in mammals due to its size, relative breeding ability and the availability of genetic tools to directly mutate specific genes. In mouse development, following fertilisation, the cells divide over the course of several days to form the 16-cell morula at approximately 3 days post coitum (dpc) (Figure 1.1, top panel). At 3.5 dpc the embryo, now called a blastocyst, consists of two major cell types: the inner cell mass, which will form the epiblast and primitive endoderm, and the trophectoderm. At 4.5 dpc, the blastocyst implants and the primitive endoderm forms, which consists of both parietal and visceral endoderm. The trophectoderm and primitive endoderm will contribute to extra-embryonic tissues whereas the embryonic tissues will form from the epiblast. Following implantation, the epiblast forms a cup shaped epithelium surrounded by a layer of visceral endoderm. At approximately 6.5 dpc, gastrulation begins with formation of the primitive streak at the posterior side of the embryo (Figure 1.1, bottom panel) (Ang and Behringer, 2002).

1.1 Mouse gastrulation: Morphogenesis of the three germ layers

All the somatic tissue of the mouse develops from the epiblast. During the process of gastrulation, three germ layers are formed: definitive endoderm, mesoderm and ectoderm. In the mouse, gastrulation begins with the formation of the primitive streak at approximately 6.5 dpc (Figure 1.1, bottom panel), a specialised structure that arises at and defines the posterior of the embryo. This structure marks the first visible manifestation of the anterior-posterior axis in the embryo. However, there is some evidence in the mouse that this axis is predisposed as early as fertilisation (Gray et al., 2004a; Piotrowska et al., 2001; Piotrowska and Zernicka-Goetz, 2002), though these findings are controversial.


Figure 1.1 Mouse development overview: Fertilisation to gastrulation

Schematic of the first 7.5 days of mouse development: Top panel, pre-implantation. Bottom panel, post-implantation. Following fertilisation, over the next 3 days, cells undergo several rounds of cell division to form the 16-cell morula. The animal pole is defined by the presence of the second polar body. At 3.5 dpc, the embryo, now called a blastocyst, consists of an inner cell mass and blastocoel (fluid-filled region), surrounded by a layer of trophectoderm. At this stage the proximal and distal axis can be identified, proximal being on the inner cell mass side and distal on the blastocoel side. At 4.5 dpc, the blastocyst implants and consists of 3 cell types, the epiblast and primitive endoderm derived from the inner cell mass and the trophectoderm. Following implantation, the embryo consists of the extraembryonic ectoderm on the proximal side and the two populations of endoderm, parietal and visceral. At approximately 6.5 dpc, gastrulation commences with the formation of the primitive streak marking the first visible manifestation of the posterior of the embryo. As gastrulation proceeds, the mesoderm and (definitive) endoderm emerge from epiblast cells that ingress through the primitive streak and the ectoderm is derived from non-ingressing epiblast cells. The germ layers at 6.5 and 7.5 dpc are also shown schematically as flattened sheets. The axes of the mouse are shown superimposed over an adult mouse. Abbreviations: h hours, dpc days post coitum. Modified from Cell, 96(2), Beddington and Robertson, “Axis development and early asymmetry in mammals”, pages 195-209, copyright (1999), with permission from Elsevier.

As cells ingress through the primitive streak, they differentiate to form either mesoderm (including extra-embryonic mesoderm) or definitive endoderm depending on their relative starting position within the epiblast (Lawson et al., 1991) (Figure 1.1, bottom panel). In contrast, the ectoderm is formed from cells that do not ingress through the primitive streak. At this time in mouse development, embryos are staged according to the extent of the primitive streak (Figure 1.2). At early streak stage, the primitive streak is just forming at the posterior side of the embryo; at mid-streak stage, the primitive streak extends approximately half way down the posterior side of the embryo; and at late streak stage, the primitive streak extends the whole way down the posterior of the embryo. Most early streak stage cell descendants contribute to the primitive streak itself and anterior-most mesoderm (Lawson et al., 1991) (Figure 1.2A). In mid-streak stage embryos the embryonic mesoderm forms wings between the epiblast and endoderm (Figure 1.2B). Much of the mesoderm that will contribute to the extraembryonic tissues, including the yolk sac, amniotic mesoderm and the mesothelium of the allantoic bud, emerges from the primitive streak at early primitive streak stage, but extra-embryonic mesoderm continues to be produced throughout gastrulation (Kinder et al., 1999; Lawson et al., 1991; Parameswaran and Tam, 1995). The cells fated to form endoderm ingress at mid-streak stage, and most are derived from the distal-most extension of the primitive streak (Lawson et al., 1991). Endoderm is rarely produced later in gastrulation (Carey et al., 1995). By late streak stages ingressing cells mainly contribute to the mesoderm, whereas by head-fold stage epiblast cells contribute mainly to ectoderm (Carey et al., 1995), suggesting that these cells do not ingress (Figure 1.2C).

1.2 Fates of the germ layers

The three germ layers are the source of all somatic tissue in the mouse and each layer is fated to give rise to particular tissues and organ anlagen. It is important to note, however, that the individual germ layers do not necessarily act in isolation. Interactions and signalling between adjacent germ layers are very important in the correct formation of many organs and tissues (Arai et al., 1997; Hallaq et al., 2004; Lough and Sugi, 2000).


Figure 1.2 Fate map of the mouse gastrula embryo

A) Early primitive streak stage. Primitive streak begins to extend down posterior of embryo. B) Mid-primitive streak stage. Primitive streak extends approximately half way down the posterior of the embryo. C) Late streak to early bud stage. Primitive streak extends the whole way down the posterior of the embryo. Abbreviations: A, anterior; P, posterior. Colours: purple, primitive streak; pink to red, mesodermal wings; grey, definitive endoderm; green, axial mesoderm; blue, ectoderm; yellow, epiblast. Below dashed line in (A) denotes cap region. Reproduced from BioEssays 23, Tam et al., “Morphogenetic tissue movement and the establishment of body plan during development from blastocyst to gastrula in the mouse”, pages 508-517, copyright (2001), with permission from John Wiley and Sons.

1.2.1 Ectoderm fates

The cells of the epiblast that do not ingress through the primitive streak form the ectoderm. This gives rise to the neural tube, with almost all the precursors for this layer being contained within the anterior-most region of the cap (bottom) of the embryo at early primitive streak stage (Figure 1.2A) (Quinlan et al., 1995). Cells from the anterior-most regions of the cap and anterior cells from outside the anterior side of the cap contribute mainly to the surface ectoderm and the amnion, but also make some contribution to the neurectoderm. Cells in the more posterior regions of the cap contribute to surface ectoderm on the posterior side of the embryo (Quinlan et al., 1995).

1.2.2 Endoderm fates

Cells originating in the lateral and posterior epiblast, particularly those from the distal-most extension of the primitive streak at mid-streak stages, contribute to the foregut endoderm (Lawson et al., 1991). The foregut endoderm also contributes to some of the branchial pouches (Lewis and Tam, 2006; Peters et al., 1998; Wallin et al., 1996). Signalling between the ventral foregut endoderm and the cardiac mesoderm is important in heart formation (Arai et al., 1997; Hallaq et al., 2004; Lough and Sugi, 2000). The mid and hindgut arise from endoderm from the distal and posterior regions of the epiblast (Tam et al., 2004). As development proceeds, the embryonic gut endoderm gives rise to the epithelial lining of the respiratory and digestive tracts and associated organs such as the lungs and pancreas (Wells and Melton, 1999).

1.2.3 Mesoderm fates

Parameswaran and Tam (1995) showed by orthotopic grafting (cells transplanted to equivalent sites in the host embryo) of early streak embryos that lateral and posterior epiblast cells are predominantly fated to become mesoderm. Distal parts of lateral and posterior epiblast contributed mainly to cranial mesenchyme, heart, somites and lateral plate mesoderm; this region also makes minor contributions to yolk sac mesoderm, cranial neural plate and foregut endoderm (Figure 1.3) (Parameswaran and Tam, 1995). Taken together, these observations suggest that even cells from the same region of the epiblast can contribute to both embryonic and extraembryonic mesoderm, as well as endoderm derivatives.

Figure 1.3 Mesoderm fate map

Fate maps from early to late streak stage showing the localisation of mesoderm progenitors. The tissue composition of the mesoderm reflects the types of progenitors that have been recruited from the epiblast through the primitive streak in the immediately preceding developmental stage, but not those that are currently ingressing into the primitive streak. At early streak stage (ESS), the ingressing mesoderm is fated to form mainly extraembryonic mesoderm including the allantois and yolk sac endothelium. Cells fated to form vasculature, heart, cranial and lateral plate mesoderm also begin ingressing at this stage. At mid-streak stages (MSS), most heart-fated mesoderm ingresses, along with mesoderm fated to form cranial, lateral plate and paraxial mesoderm. Mesoderm contributing to the vasculature also ingresses at this stage. At late streak stage (LSS), most ingressing cells are fated to form lateral plate and paraxial mesoderm. Epiblast/ectoderm (light blue), primitive streak (black bar), mesoderm (indicated by dotted outline on the embryo and pulled-away layer). Different mesodermal lineages are colour coded according to colour key. Reproduced from Trends in Cardiovascular Medicine, 11(5), Kinder, Loebel and Tam, “Allocation and early differentiation of cardiovascular progenitors in the mouse embryo”, pages 177-184, copyright 2001, with permission from Elsevier.

Kinder et al. (1999) examined the progression of mesoderm cells and their allocation to their derivatives from early streak to early bud stage. In agreement with other studies, they found that most mesoderm produced at early streak stage contributes to extraembryonic mesoderm, including the yolk sac, allantois and amnion, with minor contributions to vascular mesoderm, including erythrocytes, and to lateral plate mesoderm (Figure 1.3) (Kinder et al., 1999; Lawson et al., 1991). Cranial mesoderm is produced from early streak stage through to early bud stage, with mesoderm fated to form more rostral (anterior) cranial mesoderm ingressing at earlier stages, followed by caudal (posterior) cranial mesoderm at later stages (Figure 1.3) (Kinder et al., 1999).

In this thesis, Chapter 3 describes gene expression in the somites, regular repeated balls of mesenchyme surrounded by an epithelial layer that give rise to the skeletal muscle, vertebrae, ribs, axial tendons and dermis of the back (Brent et al., 2003; Gossler and Hrabe de Angelis, 1998). Thus, the origins of the somitic mesoderm are described here. After cells fated to form rostral cranial mesoderm ingress at mid-streak stages, cells fated to form somitic mesoderm first arise from the region closest to the node. The node is a small indentation transiently present at the posterior base of the embryo and is important in the establishment of left-right asymmetry (Section 1.3). The early bud stage embryo contains precursors of somites more caudal to the fourth somite, and the pre-somitic mesoderm. Trunk somites are allocated from mid-late primitive streak stage and continue to be allocated from the pre-somitic mesoderm through early organogenesis (Figure 1.3) (Kinder et al., 1999).

The major aim of this thesis is to identify genes involved in heart development; thus, the origins of the heart mesoderm are described here. Cells are allocated to the heart lineage at mid-streak stage (Figure 1.3) (Kinder et al., 1999; Tam et al., 1997). Lateral plate mesoderm also arises from this region at mid-streak stage but, unlike heart mesoderm, cells continue to be allocated to the lateral plate mesoderm through to early bud stage (Figure 1.3) (Kinder et al., 1999).

1.3 Establishment of the left-right axis

The establishment of the left-right axis is important for the correct development and positioning of organs in the body and the identity of structures within organs. For example, the heart is located to the left of the midline and the left and right atria and ventricles have differing identities. Situs solitus refers to the normal asymmetric arrangement of organs. Failure to correctly establish the left-right axis can lead to an abnormal arrangement of the organs, broadly termed laterality defects. Laterality defects can be classified into two categories: situs inversus, where the left and right sides are reversed, and heterotaxia, where the normal asymmetry for each organ appears to be determined independently. Isomerism (right or left) represents one specific subcategory of heterotaxia. It occurs when one side of the body mirrors the other, for example a heart with two atria with right-sided morphology. Complete isomerism is incompatible with life. In the mouse, left-right asymmetry is determined by signalling within the node, the lateral plate mesoderm (LPM, the most lateral trunk mesoderm) and the midline (notochord and floorplate).

Symmetry is first broken in the developing mouse embryo in a specialised structure called the node at approximately 8 dpc (Figure 1.4). The node is a small indentation located at the midline at the anterior tip of the primitive streak. Symmetry is broken by the leftward flow of fluid across the node, termed nodal flow (Nonaka et al., 1998). Nodal flow is first apparent at the one- to two-somite stage and disappears by the six-somite stage. Nodal flow initiates expression of Nodal in the perinodal region (Figure 1.4A) (Nonaka et al., 1998). The absence of nodal flow results in abnormal left-right patterning, and reversing the direction of nodal flow by introducing an artificial flow results in a reversed left-right pattern in mice, demonstrating that nodal flow itself is required for correct left-right patterning (Nonaka et al., 2002; Okada et al., 2005).

Once symmetry has been broken in the node, the signal is relayed via an unknown mechanism to the left LPM, resulting in the expression of Nodal only in the left LPM (Figure 1.4B). NODAL itself is a candidate for the signal that is relayed to the left LPM (Marques et al., 2004; Saijoh et al., 2003). In the node, an antagonist of NODAL, DAN domain family, member 5 (Dand5) (also known as Cerl2), is expressed asymmetrically with a much higher level of expression on the right. In the absence of Dand5, Nodal is expressed bilaterally or on the right side, suggesting that DAND5 regulates the asymmetric expression of Nodal in the left LPM (Figure 1.4B) (Marques et al., 2004; Pearce et al., 1999). Additionally, a hypomorphic allele of Nodal, which does not express Nodal in the node or LPM and displays laterality defects, can be rescued by a transgene that specifically expresses Nodal in the node. This demonstrates that the expression of NODAL within the node is required for NODAL signalling within the LPM (Saijoh et al., 2003).

Figure 1.4 Breaking of symmetry in the node and propagation of asymmetric cues to the lateral plate mesoderm

A) Nodal (red) is initially expressed in the perinodal cells. B) Nodal flow and asymmetric expression of Dand5 result in asymmetric expression and distribution of Nodal and, via an unknown mechanism, the Nodal signal is transferred to the left LPM. C) Nodal propagates its own expression throughout the left LPM and induces the expression of Lefty1 and Lefty2. L left, LPM lateral plate mesoderm, R right. Blue oval, node; grey circles, somites; blue line, midline; red, Nodal expression. Adapted from Developmental Biology 256(1), Saijoh et al. “Left–right patterning of the mouse lateral plate requires nodal produced in the node” pages 161-173, copyright 2003, with permission from Elsevier.

Once Nodal expression has commenced in the left LPM, it is able to propagate its own signal via a regulatory feedback loop and its expression extends along the anterior-posterior axis of the left LPM (Figure 1.4C). Expression of Nodal in only the left LPM requires a midline barrier as well as positive feedback loops. Nodal induces expression of left-right determination factor 1 (Lefty1) and Lefty2 (Yamamoto et al., 2003). Lefty1 is expressed in the midline and Lefty2 is expressed in the left LPM (Figure 1.4C). The loss of either LEFTY1 or LEFTY2 results in leakage of NODAL signal to the right-hand side, resulting in bilateral Nodal expression (Meno et al., 1998; Meno et al., 2001).

NODAL also induces the expression of asymmetric genes such as paired-like homeodomain transcription factor 2 (Pitx2), the expression of which persists in the left LPM after Nodal and Lefty2 expression has stopped (Logan et al., 1998; Yoshioka et al., 1998). Pitx2 is a major regulator of asymmetry, as mice that lack PITX2c, the asymmetrically expressed isoform, have laterality defects in most visceral organs (Lin et al., 1999; Liu et al., 2001). PITX2 cannot be the only important laterality signal, since cardiac looping and embryonic turning occur normally in the absence of PITX2; however, to date no other genes have been identified. How the PITX2 signal directs left-sidedness in different organs is still not understood.

1.4 Morphogenesis of the early heart: Cardiac crescent to chamber formation

The adult heart is a bi-circulatory pump. The right side of the heart pumps deoxygenated blood from the systemic circulation through the right atrium and ventricle to the pulmonary artery for oxygenation. The oxygenated blood from the lungs returns to the left side of the heart and is pumped through the left atrium and ventricle into the systemic circulatory system via the aorta. Heart development requires a number of coordinated processes that enable a small population of progenitor cells to develop into a functional adult heart, made up of multiple cell types and specialized structures.

1.4.1 The primary heart field and linear heart tube

Heart development begins with a small group of mesodermal cells that emerge at the mid-streak stage and come to lie on either side of the prechordal plate, a small region at the anterior tip of the notochord where the endoderm and ectoderm are in contact, under the cephalic neural plate (head-forming region). These bilateral pools of cells migrate laterally to form a crescent shape below the neural folds, called the primary heart field or cardiac crescent (Figure 1.5A, Figure 1.6). The cells of the primary heart field express known cardiac markers such as NK2 transcription factor related, locus 5 (Nkx2.5) and cardiac alpha-actin (Lough and Sugi, 2000). At late primitive streak stage, heart mesoderm is located at the anterior proximal region of mesoderm underneath the neural plate (Parameswaran and Tam, 1995). The morphogenetic movement of cells and ingression through the primitive streak are not required for cardiogenic specification (Tam et al., 1997). This suggests that the signals required for heart development are present in the heart field and that epiblast cells are capable of responding to these signals to become cardiac cells. The cardiomyocyte program has been shown to require bone morphogenetic protein (BMP) and fibroblast growth factor (FGF) signalling, and transcription factors such as hematopoietically expressed homeobox (HHEX), from underlying endoderm (Arai et al., 1997; Chen and Fishman, 2000; Gaussin et al., 2005; Hallaq et al., 2004; Lough and Sugi, 2000).

After cardiac crescent formation, the cells of the primary heart field fuse at the midline to form a linear heart tube that gives rise to the atria and left ventricle (Figure 1.5B, 1.6) (Stalsberg and DeHaan, 1969). The heart tube consists of an inner layer of endothelial cells and an outer layer of myocardium. The myocardium begins beating at approximately this stage and blood is pumped from the inflow region, the caudally located venous pole, to the arterial outflow region, which is more cranially located.

In response to left-right signalling, the heart tube loops in a rightward direction. This rightward looping positions the outflow tract (OFT) on the right of the developing heart and places the presumptive atrial region above the common ventricle (Figure 1.5C). During looping the heart tube lengthens by the addition of myocardium at the venous pole.

1.4.2 The secondary heart field

The primary heart field only gives rise to the atria and left ventricle. The remaining cells of the heart derive from the secondary heart field (SHF), which lies dorsal to the primary heart field in a sub-population of splanchnic mesoderm (Kelly et al., 2001; Mjaatvedt et al., 2001; Waldo et al., 2001) (Figure 1.6A). Fate mapping done with either a transgene under the control of the Fgf10 promoter, or DiI (cyanine dye), showed that the SHF is initially located in splanchnic mesoderm adjacent to the cardiac crescent (Figure 1.6A, B) (Kelly et al., 2001). As the primary heart field forms the heart tube, the SHF is located dorsal to the heart tube and extends into the branchial arches (Figure 1.6B) (Kelly et al., 2001; Kelly and Buckingham, 2002). SHF cells within the branchial arches form a discrete mesodermal core (Kelly et al., 2001). In the heart, the SHF gives rise to the OFT and the right ventricle, with minor contributions to the atria and some cells within the left ventricle (Figure 1.6C) (Kelly et al., 2001; Kelly and Buckingham, 2002).

1.4.3 Neural crest

A third cell population, the cardiac neural crest (CNC), is also important in heart development. CNC cells migrate to populate the heart through the OFT and contribute to the formation of the great vessels and outflow septum (Figure 1.7). The CNC cells arise from a sub-population of cranial neural crest cells from the dorsal neuroepithelium at the level of the first four occipital somites (Fukiishi and Morriss-Kay, 1992). In the mouse, these cells delaminate and begin migration from the neural tube at approximately 8.5 dpc and are located in the 3rd and 4th branchial arches by 9.5 dpc (Figure 1.7, panel A) (Fukiishi and Morriss-Kay, 1992; Jiang et al., 2000; Lo et al., 1997). By 10.5 dpc, CNC cells are found in the 4th and 6th branchial arches (Fukiishi and Morriss-Kay, 1992; Jiang et al., 2000; Lo et al., 1997).

Figure 1.5 Heart development

Schematic of heart development from primary heart field (cardiac crescent) stage (7.75 dpc) to the remodelled heart (12.5 dpc). The bottom section of each panel is a coronal section of the whole embryo view above. A) The two bilateral pools of cardiac progenitors fuse across the midline and form the primary heart field (cardiac crescent) (heart mesoderm in bottom panel). Two coeloms form, separating the splanchnic mesoderm from the somatopleure that will line the future pericardium (neural folds are behind the plane of section). (Also see Figure 1.6 for primary and secondary heart fields.) B) The primary heart field fuses to form the linear heart tube. Blood flows from bottom (inflow) to top (outflow). C) The heart tube loops to enable convergence of the outflow tract (OFT) and atrioventricular canal (AVC). In both the AVC and OFT, swellings of cardiac jelly form at defined locations to form the cardiac cushions (see also Figure 1.9). The outer curvature of the heart balloons to commence formation of the chambers (see also Figure 1.8). D) At later stages after chamber formation, septa form between the chambers and canals of the heart. Abbreviations: avc atrioventricular canal, avs atrioventricular septum, ca common atrium, cc cardiac crescent, co intra-embryonic coelom, cvp caudal venous pole, dm dorsal mesocardium, dp dorsal pericardium, dpc days post coitum, e endocardium, ec endocardial cushions, fe foregut endoderm, fg foregut, hm head mesenchyme, ht heart, ias inter-atrial septum (atrial septum primum), ivs inter-ventricular septum, la left atrium, lv left ventricle, m myocardium, n notochord, ne neural epithelium of neural folds, nf neural folds, oft outflow tract, pc prechordal plate, ra right atrium, rv right ventricle, som somatopleure, spm splanchnic mesoderm, tr trabeculae. Reproduced from Development 132(22), Stennard and Harvey, “T-box transcription factors and their roles in regulatory hierarchies in the developing heart”, pages 4897-910, copyright 2005, with permission from Company of Biologists, dev.biologists.org.

Figure 1.6 The primary and secondary heart fields

A) Ventral view at cardiac crescent stage and section (below). The secondary heart field (SHF) (green) lies dorsal to the cardiac crescent (red) in a sub-population of splanchnic mesoderm. The section at the level of the line shows the SHF in contact with the cardiac crescent and foregut endoderm (yellow). B) Lateral view at heart tube stage (approximately 8.5 dpc in the mouse) and section below. The SHF (green) lies dorsal to the heart tube (red). Section taken at the level of the line shows the SHF is contiguous with the heart tube and branchial arch (pharyngeal) (yellow) endoderm. C) Ventral view section of heart at 11.5 dpc, schematic of relative contributions to the heart by the cardiac crescent/heart tube (red) and the SHF (green). The cardiac crescent mainly contributes to the right and left atria and left ventricle. The SHF mainly contributes to the right ventricle and outflow tract region. There is a minor overlap between the two progenitor fields. Abbreviations: LA left atrium, LV left ventricle, RA right atrium, RV right ventricle. A) and B) Reproduced from Semin Cell Dev Biol 18(1), Dunwoodie, “Combinatorial signaling in the heart orchestrates cardiac induction, lineage specification and chamber formation” pages 54-66, copyright 2007, with permission from Elsevier. C) Reproduced from Development 132(22), Stennard and Harvey, “T-box transcription factors and their roles in regulatory hierarchies in the developing heart”, pages 4897-910, copyright 2005, with permission from Company of Biologists, dev.biologists.org.

Figure 1.7 Contributions by the cardiac neural crest to heart development

Panel A, Schematics of lateral view of 8.5 dpc embryo and lateral and ventral views of 9.5 dpc embryo, showing location and migration of cardiac neural crest (CNC) cells into the developing heart. Primary heart field and myocardial contribution shown in red, SHF and derivatives of myocardium in dark green and vascular endothelial cells in pale green, CNC in yellow and proepicardial organ (PEO), which gives rise to the epicardium and epicardial derivatives, in blue. CNC cells begin to migrate from 8.5 dpc from the dorsal neuroepithelium through the outflow tract of the heart and contribute to the great vessels and outflow septum. Panel B, Schematic of the remodelling of the branchial arch (BA) arteries, ventral view. Initially the BA arterial network is symmetric (11 dpc) but undergoes asymmetric remodelling from 11.5 dpc caused by increased blood flow in the left BA6. This results in stabilisation of the aortic arch on the left side at the expense of the right side. Segments of the dorsal aortae (DAo) break down, leading to the formation of individual common carotid arteries (LCC and RCC). At 14.5 dpc the left BA4 contributes to the segment of the aortic arch between the left common carotid artery (LCC) and the left subclavian artery (LSA). The right BA4 forms a segment connecting the right subclavian artery (RSA) to the brachiocephalic trunk (BT), which is all that remains of the right aortic arch. The BT is also connected to the RCC. The left BA6 contributes to the ductus arteriosus (DA), an embryonic shunt that connects the left dorsal aorta to the pulmonary trunk. The DA closes at birth, allowing the establishment of the pulmonary and systemic blood circulation.
Abbreviations: Ao aorta, AoS aortic sac, BA branchial arch, BT brachiocephalic trunk, CNC cardiac neural crest, DA ductus arteriosus, DAo dorsal aorta, LCC left common carotid artery, LSA left subclavian artery, PA pulmonary arteries, PEO proepicardial organ, RCC right common carotid artery, RSA right subclavian artery, SA subclavian arteries, T trachea, VP venous pole. Panel A, Reproduced from Current Topics in Developmental Biology, 90, Vincent and Buckingham, “How to make a heart: the origin and regulation of cardiac progenitor cells” pages 1-41, copyright 2010, with permission from Elsevier. Panel B, Adapted from Kaufman and Bard, The Anatomical Basis of Mouse Development, San Diego, Academic Press, copyright 1999, with permission from Wiley Interscience.

A sub-population of the CNC cells migrates into the heart via aortic sac mesenchyme, which adjoins the OFT, and the branchial arch arteries (Jiang et al., 2000). From 9.5 dpc CNC cells can be observed in the OFT in the distal OFT cushions (Figure 1.7, panel A) (Jiang et al., 2000; Lo et al., 1997; Waldo et al., 2005). CNC cells also form the aortico-pulmonary septum, which fuses with the OFT cushions for proper OFT septation (described in section 1.6) (Jiang et al., 2000; Waldo et al., 2005).

CNC cells are also important for aortic arch remodelling (Figure 1.7, panel B). The aortic arch delivers blood from the left ventricle to the systemic circulation. It and the associated vasculature initially develop bilaterally symmetrically from paired branchial arches and dorsal aortae (Figure 1.7, panel B). From 11.5 dpc, the branchial arch arteries undergo a process of asymmetric remodelling and selective cell survival, regulated by cardiac neural crest cells and left-right patterning, to ultimately result in a left-sided aortic arch (Figure 1.7, panel B) (Brown et al., 2001; Kioussi et al., 2002).

1.4.4 Chamber morphogenesis

Following looping of the heart tube, the four chambers of the heart develop. The chambers of the heart form at specific regions along the anterior-posterior axis of the linear heart tube (Figure 1.8). Seminal studies by Christoffels et al. (2000) demonstrated that the formation of the chambers occurs at the outer curvature of the heart loop, where the proliferation rate is highest. Initially the linear heart tube consists mainly of primary non-chamber myocardium (Figure 1.8A). In response to the specific expression of genes such as natriuretic peptide precursor type A (Nppa) (also known as ANF), T-box 5 (Tbx5), small muscle protein, X-linked (Smpx) (also known as Chisel), Iroquois related homeobox 5 (Irx5) and gap junction protein, alpha 1 (Gja1) (also known as Cx43) in chamber myocardium, increased proliferation and differentiation of the heart chambers occurs at particular sites in the looped heart tube (Figure 1.8B) (Christoffels et al., 2000; Rumyantsev, 1977). Continued proliferation and differentiation leads to growth and trabeculation of the chambers (Figure 1.8C, D). The observation that proliferation and differentiation of the chambers occurred at particular sites in the looped heart led to the proposal of a ballooning model of chamber morphogenesis, in which ventricles and atria balloon (grow) out ventrally (ventricles) and dorsolaterally (atria) from the looped heart tube (Christoffels et al., 2000). This ballooning of the atria effectively internalises the inflow tract region as the atria balloon around the inflow tract. Non-chamber myocardium forms the cardiac conduction system and regions of endocardial cushion growth (Figure 1.8C, D) (Christoffels et al., 2000).

1.5 Cardiac cushions and the development of the heart valves

After the heart tube loops, the atrial region is located above the common ventricle (Figure 1.5C). Following looping, at approximately 9 dpc in the mouse, cardiac cushions form from non-chamber myocardium between the myocardial and endocardial layers of the heart tube in the atrioventricular canal (AVC) and in the distal OFT (conotruncus) (Christoffels et al., 2000). At this stage, the cardiac cushions, the first manifestations of the cardiac valves, are swellings of the extracellular matrix (ECM) called cardiac jelly (Figure 1.9A).

The cardiac jelly consists of a number of glycosaminoglycans (GAGs, such as hyaluronic acid and chondroitin sulfate), collagens, glycoproteins (such as periostin), versican, fibulin, fibrillin and laminin, and proteoglycans (Little and Rongish, 1995). The make-up of the jelly itself is important for cardiac cushion development, as it must be permissive for signalling from the myocardium to the endocardium, as discussed below. Mutations and gene knockout studies in mice have confirmed the importance of some cardiac jelly components in valve development. For example, disruptions of the genes encoding cartilage link protein 1 (Crtl1), periostin and hyaluronic acid synthase 2 (Has2) (essential for hyaluronic acid production in the cardiac jelly) all lead to valve developmental defects, as does the disruption of versican cleavage (Camenisch et al., 2000; Kern et al., 2007; Kern et al., 2006; Snider et al., 2008; Wirrig et al., 2007).

Cellularisation of the cardiac cushions occurs via a process of epithelial to mesenchyme transition (EMT) (Figure 1.9B). The atrioventricular cushions undergo EMT first, with the OFT cushions undergoing development at a later stage. Initially, at approximately 9 dpc, the cushions form as swellings of cardiac jelly. From 9.5 dpc, EMT is initiated by soluble factors that are secreted from the myocardium and diffuse across the cardiac jelly to the endocardium. Signalling from the myocardium causes some endocardial cells to delaminate from the epithelial sheet. These cells undergo a transition to become mesenchyme and invade the cardiac jelly. This invasion requires matrix metalloproteinase family members (Song et al., 2000). Following EMT, mesenchymal cells align into layers, expanding the cushions towards each other.

Figure 1.8 Chamber morphogenesis from heart tube

Schematic of chamber development showing non-chamber myocardium in grey, inflow and outflow regions in green, ventricles in red, atria in blue and AV canal elements, including cushions, in yellow. A) At 8 dpc the heart tube consists of primary non-chamber myocardium and the inflow and outflow regions. B-C) Following heart looping, the myocardium of the outer curvature of the heart balloons outward to give rise to the chamber-specific myocardium (ventricles in red, atria in blue). D) Non-chamber myocardium contributes to the valves and conduction system. Abbreviations: a common atrium, avb atrioventricular bundle branches, avc atrioventricular canal, avj atrioventricular junction, avn atrioventricular node, bb bundle branches, dbb distal bundle branches, ht heart, icv inferior vena cava, ift inflow tract, ivs inter-ventricular septum, la left atrium, lv left ventricle, oft outflow tract, pf Purkinje fibers, ra right atrium, rv right ventricle, san sinuatrial node, scv superior vena cava. Reproduced from Development 132(22), Stennard and Harvey, “T-box transcription factors and their roles in regulatory hierarchies in the developing heart”, pages 4897-910, copyright 2005, with permission from Company of Biologists, dev.biologists.org.

Figure 1.9 Cushion development

A) Transverse view of the four-chambered heart prior to septation. The endocardial cushions (yellow and orange) have formed between the atria (blue) and ventricles (red) in the AV canal and OFT. (1) Inferior cushion, (2) superior cushion, (3) parietal, (4) septal. The cushions form canals to direct blood flow in the developing heart. B) Schematic of endocardial cushion EMT. At 9 dpc the cardiac cushions form as swellings of cardiac jelly. At 9.5-10.5 dpc, signals from the myocardium traverse the cardiac jelly to the endocardial endothelium to stimulate delamination, mesenchymal transdifferentiation (transition) and migration of cells into the jelly to cellularise the cushion. By 11 dpc, EMT is complete and the cellularised cushion commences remodelling to form the mature valves. Abbreviations: AVC atrioventricular canal, AS atrial spine (spina vestibuli), dpc days post coitum, EMT epithelial to mesenchyme transition, LA left atrium, LV left ventricle, MC mesenchymal cell, OFT outflow tract, RA right atrium, RV right ventricle, VEGF vascular endothelial growth factor. (A) Reproduced from Circulation Research, 91(2), Lamers & Moorman, “Cardiac septation: a late contribution of the embryonic primary myocardium to heart morphogenesis”, pages 93-103, copyright 2002, with permission from Wolters Kluwer Health. (B) Reproduced from Cell, 118(5), Lambrechts and Carmeliet, “Sculpting heart valves with NFATc and VEGF”, pages 532-534, copyright 2004, with permission from Elsevier.

The final morphogenesis of the cardiac valves remains poorly characterised. As the cardiac cushions undergo EMT they expand towards each other. When the opposing crests meet, the cushions fuse, forming a partial separation between the atria and ventricles. In a seminal study using genetic labelling experiments, de Lange et al. demonstrated that valves originate primarily from endocardially-derived mesenchyme with little or no contribution from myocardial or neural crest cells (de Lange et al., 2004).

1.5.1 Signalling pathways involved in cushion EMT

Several signalling pathways have been implicated in EMT of the cardiac cushions. Much of the understanding of this process has come from ex vivo studies in avian and mouse systems. In these studies, the myocardium and endocardium of the developing cushion are explanted on type I collagen gel and EMT is allowed to progress, with cells invading the collagen gel and undergoing differentiation (Bernanke and Markwald, 1982). The conditions under which the explanted cushions are cultured can be altered, for example by changing glucose concentration or oxygen tension, or by adding soluble factors such as VEGF, and this can lead to a decrease or increase in the EMT observed (Barnett and Desgrosellier, 2003; Dor et al., 2001a; Dor et al., 2003; Enciso et al., 2003; Person et al., 2005). These studies have been complemented by studies in the mouse where “knockout” phenotypes affecting cardiac cushion development have been observed.

Studies in both chick and mouse have shown that BMP and TGFβ signalling are important in cushion development. Multiple BMP molecules and receptors have been implicated in cardiac cushion EMT. Precisely which BMPs are required has been difficult to determine, presumably due to compensatory effects from other members of the family. However, double knockout studies of BMPs are beginning to elucidate their roles. Bmp5;Bmp7 and Bmp6;Bmp7 double-null mice exhibit defects in cushion formation (Kim et al., 2001b; Solloway and Robertson, 1999). Mice with a targeted deletion of bone morphogenetic protein receptor, type 1A (Bmpr1a, also known as Alk3), or a hypomorphic allele of bone morphogenetic protein receptor, type II (Bmpr2), also have defects in cushion development (Delot et al., 2003; Gaussin et al., 2005).

Complementary explant, expression and mouse knockout studies have shown that transforming growth factor beta 2 (TGFβ2) is expressed in the myocardium and is required for proper cushion development (Bartram et al., 2001; Camenisch et al., 2002a; Dickson et al., 1993; Sanford et al., 1997). Transforming growth factor, beta receptor III (Tgfbr3) appears to be the receptor important for cushion development and is expressed in the endothelial cells of the OFT and AVC cushions (Brown et al., 1999). Other TGFβ family members are also expressed in the cushions and may be important in their development (reviewed in Barnett and Desgrosellier, 2003). Cross talk between BMP and TGFβ pathways may be important for initiation of EMT (Armstrong and Bischoff, 2004; Barnett and Desgrosellier, 2003).

In addition to being a major component of the ECM of the cardiac cushions, hyaluronic acid is also important in EMT. The hyaluronic acid in the cushions is produced by HAS2, which polymerises alternating units of glucuronic acid (GlcUA) and N-acetylglucosamine (GlcNAc) (Camenisch et al., 2000). Hyaluronic acid indirectly activates v-erb-b2 erythroblastic leukaemia viral oncogene homolog 2, neuro/glioblastoma derived oncogene homolog (ERBB2) and v-erb-b2 erythroblastic leukaemia viral oncogene homolog 3 (ERBB3) via an unknown mechanism, leading to the activation of the small GTPase, RAS (Camenisch et al., 2002b). Activation of RAS is required for the transition step, from epithelial cells to mesenchymal cells, in cushion EMT (Camenisch et al., 2002b; Camenisch et al., 2000). Other cushion ECM components, such as versican, may interact directly with hyaluronic acid (Seyfried et al., 2005).

Vascular endothelial growth factor A (Vegfa) expression is up-regulated in the myocardium as EMT progresses and highly up-regulated by the time of completion, approximately 11.5 dpc in the mouse. VEGF receptors, FMS-like tyrosine kinase 1 (Flt1, formerly known as Vegfr1) and kinase insert domain protein receptor (Kdr, formerly known as Vegfr2), are expressed in the endocardium and are down-regulated upon transformation (Dor et al., 2001a; Dor et al., 2003). Furthermore, the addition of anti-VEGF antibodies, which inactivate endogenously produced VEGF, to cushion explants has been shown to induce EMT at a stage when EMT has normally ceased (Dor et al., 2003). This suggests that VEGF concentration may play an important role, acting as an in vivo inhibitor of EMT and preventing further cushion growth.

Vegfa expression may be regulated by physiological hypoxia, the hypoxia experienced by eutherian mammals during normal development, since Vegfa is expressed in response to hypoxic conditions and the myocardium of the cushions has been shown to be hypoxic (Dor et al., 2001b; MacLean and Dunwoodie, unpublished observations). Glucose concentration may also be important: in maternal diabetes, elevated glucose levels also cause an increase in Vegfa expression, leading to congenital abnormalities that are potentially due to defects in cardiac cushion development (Loffredo et al., 2001; Madri et al., 2003). Explant studies have shown that high glucose levels result in decreased mesenchymal cell invasion, and thus a failure of EMT (Enciso et al., 2003). These observations indicate that a careful balance of both the concentration and timing of Vegfa expression is required for proper EMT and cushion formation.

Other pathways involved in cardiac cushion EMT include Wnt/β-catenin signalling, Notch signalling and FGF signalling (reviewed in (Armstrong and Bischoff, 2004; Barnett and Desgrosellier, 2003; Person et al., 2005)). Matrix metalloproteinase family members are required for invasion of the cardiac jelly by newly formed mesenchymal cells (Song et al., 2000). Other genes implicated in cardiac cushion development, based on their mouse knockout phenotypes, include Cited2 (Weninger et al., 2005), nuclear factor of activated T-cells, cytoplasmic, calcineurin-dependent 1 (Nfatc1) (de la Pompa et al., 1998; Ranger et al., 1998), Nkx2.5 (Biben et al., 2000), Gata4 (Rivera-Feliciano et al., 2006), zinc finger protein, multitype 2 (Zfpm2, also called Fog2) (Flagg et al., 2007), Sox9 (Akiyama et al., 2004) and neurofibromatosis 1 (Nf1) (Lakkis and Epstein, 1998), reviewed in (Armstrong and Bischoff, 2004; Barnett and Desgrosellier, 2003; Person et al., 2005).

1.6 Septation of the heart

Septation is the closure of communication between the respective chambers of the heart to form the four-chambered heart with one-way valves. Proper septation is required to prevent mixing of oxygenated and de-oxygenated blood. In addition to their role in cardiac valve formation, the cardiac cushions are also essential in septation of the heart.

The septation of the atria is complex and involves several tissues. At approximately 10.5 dpc, a myocardial protrusion, the atrial septum primum, extends from the roof of the common atrium towards the AVC cushions (Figure 1.10A, see also Figure 1.9 for cushion identities) (Anderson et al., 2003a). The septum primum has a mesenchymal cap, which fuses with the superior (cranial) AVC cushion, leaving communication open between the leading edge of the septum primum and the inferior cushion (not shown). At 12 dpc, a mass of mesenchyme called the spina vestibuli arises to the right of the orifice of the developing pulmonary vein and grows through the right hand side of the orifice and into the atrium (not shown) (Webb et al., 1998). Fusion and merging of the mesenchymal cap from the atrial septum primum, the spina vestibuli and the AVC cushions closes the foramen (opening) primum (Figure 1.10B). Before closure of the foramen primum by the atrial septum primum, the upper part of the primary septum breaks down to form the secondary interatrial foramen (Figure 1.10B). At 13.5 dpc in the mouse, the atrial septum secundum forms as a deep in-folding in the roof of the atrial myocardium, forming a flap over the secondary interatrial foramen and finally closing communication between the atria (Figure 1.10C) (Anderson et al., 2003a; Webb et al., 1998).

Formation of the atrioventricular septum is a result of endocardial cushion fusion and morphological rearrangements. Initially the heart is linear with blood pumping from the sinus venosus to the OFT. In the developed heart, blood flows from the right atrium to right ventricle. To achieve the morphological remodelling required for this to occur, the AVC undergoes differential expansion. First, the junction between the atria and AVC expands to form a funnel. This funnel shaped canal then expands along the inner curvature of the distal region (right side) of the AVC, ultimately aligning over both ventricles (Kim et al., 2001a). Following this, the right caudal AVC expands and the AVC cushions fuse, leading to the formation of two separate inlet canals draining the two atria.

[Figure 1.10, panels A to C. Labelled structures: superior vena cava, inferior vena cava, septum primum, septum secundum, foramen primum, foramen secundum, foramen ovale, spina vestibuli, right atrium, left atrium, left, right and dorsal endocardial cushions, sectioned atrioventricular canal, muscular ventricular septum, right ventricle and left ventricle; the direction of blood flow is indicated.]

Figure 1.10 Septation of the atria

(A) The septum primum (blue) grows down from the roof of the common atrium in the direction of the purple arrow towards the atrioventricular canal (AVC) cushions. (B) The septum primum and the associated spina vestibuli (not shown) fuse with the superior AVC endocardial cushion. Small perforations, called the foramen secundum, form in the region proximal to the atrium. Concurrently the atrial myocardium forms an infolding called the septum secundum (green). During foetal life, blood is able to flow from the right to left atrium through the foramen ovale of the septum secundum. Reproduced from Schoenwolf et al., Larsen’s Human Embryology, Philadelphia, Churchill Livingstone, copyright 2009, with permission from Elsevier.

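[Figure 1.11, panels A to D. Labelled structures: truncoconal septae, right atrium, right ventricle, left ventricle, superior and inferior endocardial cushions (EC), right and left atrioventricular canals (AVC), muscular interventricular septum, future pulmonary trunk and future aorta.]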

Figure 1.11 Outflow tract septation

(A) The hemi-clockwise rotation of the outflow tract (OFT) gives the OFT cushions (yellow and green) a spiral appearance. (B) The cushions expand towards each other and fuse. (C-D) An outgrowth from the inferior atrioventricular cushion extends caudally towards the newly formed inter-ventricular septum and fuses with it, completing ventricular septation. This directs blood flow from the right ventricle to the pulmonary artery (blue arrow) and from the left ventricle to the aorta (red arrow). Abbreviations: AVC atrioventricular canal, EC endocardial cushion. Reproduced from Schoenwolf et al., Larsen’s Human Embryology, Philadelphia, Churchill Livingstone, copyright 2009, with permission from Elsevier.


Septation of the ventricles occurs after the alignment of the inlet canals. In the mouse, as the ventricles grow and trabeculate, the medial walls fuse forming the muscular septum. From 11-14 dpc the septum grows towards and fuses with an outgrowth from the inferior AVC cushion forming the membranous septum (see also Figure 1.9 for cushion identities) (Anderson et al., 2003a; Lamers and Moorman, 2002; Webb et al., 1998). The outflow tract cushions also fuse with the muscular septum and AVC cushions, contributing to the membranous septum and completing inter-ventricular septation (Anderson et al., 2003a; Lamers and Moorman, 2002; Webb et al., 1998).

The OFT cushions are important in establishing the arteries. Heart looping results in a hemi-clockwise rotation of the OFT, giving the cardiac cushions a spiral appearance (Figure 1.11A, see also Figure 1.9 for cushion identities). The OFT cushions fuse with each other, dividing the OFT into the aorta and pulmonary trunk (Figure 1.11B). Fusion of the cushions with the aortico-pulmonary septum (part of the aortic sac) and interventricular septum partitions the blood flow (Figure 1.11C, D). As a result, the aorta can only receive blood from the left ventricle, and the pulmonary artery receives blood only from the right ventricle. Connections between the aorta and the left fourth branchial arch artery, and between the pulmonary artery and the left sixth branchial arch artery, are also established by cushion fusion with the aortico-pulmonary septum (Anderson et al., 2003b).

1.7 Heart malformations

Heart development is a complex process with multiple cellular origins and types. Congenital heart disease is the most common type of congenital abnormality in humans, affecting approximately 1 % of live births (Hoffman and Kaplan, 2002; Hoffman et al., 2004). Heart malformations occur when one or more of the fundamental processes such as heart looping, endocardial cushion formation, cardiac neural crest migration and cardiac septation are perturbed (Bajolle et al., 2009; Bruneau, 2008; Gruber and Epstein, 2004) (Figure 1.12). Table 1.1 lists types of congenital heart defects and their incidence, and some of the genes associated with those defects (Bruneau, 2008; Gruber and Epstein, 2004).

Figure 1.12 Schematic of heart showing structures affected by congenital heart defects

Structures affected with approximate incidence per 1000 live births indicated in brackets. Abbreviations: AC aortic coarctation, AS aortic stenosis, ASD atrial septal defect, AVSD atrioventricular septal defect, BAV bicuspid aortic valve, DORV double outlet right ventricle, Ebstein’s Ebstein’s anomaly of the tricuspid valve, HLHS hypoplastic left heart syndrome, HRHS hypoplastic right heart, IAA interrupted aortic arch, MA mitral atresia, MS mitral stenosis, PDA patent ductus arteriosus, PS pulmonary artery stenosis, PTA persistent truncus arteriosus, TA tricuspid atresia, TAPVR total anomalous pulmonary venous return, TGA transposition of the great arteries, TOF tetralogy of Fallot, VSD ventricular septal defect. Reproduced from Nature, 451(7181), Bruneau, “The developmental genetics of congenital heart disease”, pages 943-948, copyright 2008, with permission from Nature Publishing Group.

Table 1.1 Incidence of congenital heart disease

Defects and incidence from Gruber and Epstein, 2004, associated genes from Bruneau, 2008. TBX1, TBX5, JAG1 and NOTCH2 are all associated with congenital heart diseases that are a part of wider syndromes. Abbreviations: GATA4 GATA binding protein 4, JAG1 jagged 1, MED13L mediator complex subunit 13-like, MYH6 myosin, heavy polypeptide 6, cardiac muscle, alpha, NKX2.5 NK2 transcription factor related, locus 5, NOTCH1 Notch gene homolog 1, NOTCH2 Notch gene homolog 2, TBX1 T-box 1, TBX5 T-box 5, TBX20 T-box 20.

Defect                                               Incidence  Associated genes
Ventricular septal defect (VSD)                      1:280      NKX2.5, GATA4, TBX20, MYH6, TBX5, TBX1
Atrial septal defect (ASD)                           1:1062     NKX2.5, GATA4, TBX20, MYH6, TBX5
Atrioventricular canal (AVC) defect                  1:1372     NKX2.5, MED13L
Tetralogy of Fallot (TOF)                            1:2375     NKX2.5, NOTCH1, NOTCH2, JAG1, TBX1
Transposition of the great arteries (TGA)            1:3175
Hypoplastic left heart syndrome (HLHS)               1:3759     NOTCH1
Double outlet right ventricle (DORV)                 1:6369     NKX2.5, MED13L
Pulmonary atresia                                    1:7576
Ebstein's anomaly                                    1:8772     NKX2.5
Truncus arteriosus                                   1:9346     NKX2.5
Total anomalous pulmonary venous connection (TAPVC)  1:10638
Tricuspid atresia                                    1:12658    NKX2.5

Heart looping defects are often a result of defective left-right patterning. Failure of the heart to loop correctly and improper rotation of the OFT can lead to malpositioning of the chambers, OFT, AVC and endocardial cushions. This leads to hearts with improper arrangement of chambers and vessels. Some of the congenital heart defects that can arise due to incorrect heart looping include double outlet right ventricle (DORV), in which the majority of blood entering the aorta is from the right ventricle, or overriding aorta (OA), in which blood from both ventricles is funnelled into the aorta. Both DORV and OA can arise from a failed clockwise rotation of the OFT or malpositioning of the OFT over the ventricles (Bajolle et al., 2006; Gittenberger-de Groot et al., 2005). OFT malpositioning over the ventricles can also result in ventricular septal defects (VSD) as the OFT cushions are also malpositioned and are required for ventricular septation (as described in section 1.5). Similarly, the AVC cushions are also incorrectly located as a result of improper looping. As fusion between the AVC cushions is required in septum formation, this can result in a common or single AVC, allowing blood from either atrium to enter either ventricle. These defects can be observed in both patients and mouse models with left-right patterning defects (Maclean and Dunwoodie, 2004).

Incorrect positioning, development or fusion of the OFT and AVC cushions can lead to congenital heart defects. Abnormalities in the EMT that is essential for cellularising the cardiac cushions can result in either hypoplastic or hyperplastic endocardial cushions. Since the cardiac cushions give rise to the heart valves, valvular defects result. In hyperplastic cushion defects, this can lead to double inlet left ventricle (DILV) as a result of cushion material blocking the atrial septum primum, preventing closure of the atrial septum. More commonly, endocardial cushion defects can lead to a VSD. The membranous ventricular septum is derived from the inferior AV cushion. Failure of this septum to form or to fuse with the muscular ventricular septum results in VSD. ASD can also be a result of cushion defects, although this is rare (Barnett and Desgrosellier, 2003).

1.8 Aim of the thesis

The aim of this thesis was to identify genes expressed in the heart during early mouse development, and to determine their function in mouse heart development. An understanding of the function of genes in mouse heart development has previously been shown to be a useful tool for identifying genes that when disrupted in humans can cause congenital heart disease, for example NKX2.5, T-box 20 (TBX20) and GATA binding protein 4 (GATA4) (Table 1.1) (Elliott et al., 2003; Kirk et al., 2007; McElhinney et al., 2003; Reamon-Buettner and Borlak, 2006; Schluterman et al., 2007).

The approach taken to identify genes expressed in a restricted manner in the developing mouse heart was to expression profile the progenitor tissue of the heart, the mesoderm (Chapter 2). RNA from cDNA libraries of the three germ layers and a primitive streak fraction was transcribed and labelled, then hybridised to a 22,000 gene mouse oligo microarray library (Owen Prall, Victor Chang Cardiac Research Institute). Analysis of the data obtained from these microarrays and identification of genes enriched in particular germ layers or the primitive streak was performed. A subset of genes identified as enriched in the progenitor tissue(s) of the heart with no or little published expression data were selected for examination by whole mount RNA in situ hybridisation in 7.5 dpc, 8.5 dpc and 9.5 dpc mouse embryos. None of the genes examined were expressed in a restricted manner in the developing heart.

In another screen being undertaken in our laboratory, designed to identify genes with disrupted cycling expression in somitogenesis, Glutamine fructose-6-phosphate transaminase 2 (Gfpt2) was found to be expressed in a restricted manner in the developing heart at 9.5 dpc. Since the aim of this research was to identify genes that may be important in heart development and no appropriate candidates were identified in Chapter 2, Gfpt2 was selected for further investigation.

GFPT2 and the closely related protein GFPT1 can both function as the rate-limiting enzyme of the hexosamine biosynthesis pathway (HBP). The HBP is responsible for a small percentage of glucose metabolism in the cell and branches off from glycolysis (Zhivkov et al., 1975). It produces UDP-N-acetylglucosamine (UDP-GlcNAc) from fructose-6-phosphate via a series of reactions. The production of UDP-GlcNAc is important for many processes in the cell. Multimers of GlcNAc can form glycosaminoglycans that are important in connective tissues and extracellular matrices. UDP-GlcNAc can also be used to glycosylate proteins, modulating their function, and this modification can be considered analogous to phosphorylation in this regard (Hanover, 2001).

In Chapter 3, the expression pattern of Gfpt2 in the mid gestation mouse (8.5 dpc to 11.5 dpc) was examined. Gfpt2 was expressed in the foregut endoderm underlying the heart at 8.5 dpc. At 9.5 dpc, Gfpt2 was expressed in the myocardium underlying the cardiac cushions. Gfpt2 was also expressed in the pre-somitic mesoderm of some embryos at 9.5 dpc and 11.5 dpc.

The expression pattern of Gfpt2 in the developing mouse and the known function of GFPT2 in the HBP led to the hypothesis that Gfpt2 expression correlated to an increased requirement for UDP-GlcNAc in the tissues in which it was expressed. The foregut endoderm is a known source of signals for the developing heart, while in the pre-somitic mesoderm, glycosylation of the Notch receptor by the glycosyltransferase Lunatic Fringe has been shown to modulate Notch signalling in somitogenesis, suggesting that Gfpt2 expression might be required to produce sufficient UDP-GlcNAc to modulate signalling in these tissues. At 9.5 dpc, Gfpt2 is expressed in the myocardium underlying the cardiac cushions. The cardiac cushions consist of an extracellular matrix, called the cardiac jelly, made up of various glycosaminoglycans and proteoglycans. Thus UDP-GlcNAc might be required for the production of the proteins that make up the cardiac jelly. Alternatively, the cellularisation of the cardiac cushions requires signalling from the myocardium to the endocardium to initiate epithelial to mesenchyme transition and invasion of the cardiac jelly. UDP-GlcNAc may be required to modulate proteins involved in this signalling.

To address these possibilities, the approach taken here was to generate mice containing gene trap insertions, which disrupt Gfpt2 function. The characterisation and generation of mice carrying the gene trap insertions is described in Chapter 4, and the functional effects of the gene trap insertions are described in Chapter 5. It was found that the gene trap insertions are likely to be functionally null alleles, but there is no effect on the survival of mice homozygous for the gene trap insertions.

Chapter 2: Expression Profiling of the Mouse Germ Layers and Primitive Streak

2.1 Introduction

The aim of this thesis was to identify genes expressed in the heart during early mouse development. To this end, the mesoderm and the primitive streak were expression profiled, as the heart is derived from the mesoderm and the mesoderm part of the primitive streak fraction. RNA from cDNA libraries of the three germ layers and a primitive streak fraction was transcribed and labelled, then hybridised to a 22,000 gene mouse oligo microarray library, representative of the mouse genome. The cDNAs associated with each germ layer were compared in order to identify those enriched in or specific to each germ layer and the primitive streak. Generally, the endoderm was found to be most distinct from the other germ layers and the primitive streak, with regard to both the number of enriched genes and the gene ontologies associated with the genes identified.

Gene expression profiling was revolutionised by the advent of microarray technology. Microarray technology has been used to compare gene or protein expression between “normal” and “diseased” states or between different tissues, and to locate enhancer and promoter sequences important for gene regulation. Microarray technology has now been superseded by next generation sequencing. Next generation sequencing is a high throughput sequencing method, which is cheaper and faster than traditional Sanger sequencing, making it feasible to directly sequence the entire transcriptome of a cell, tissue or even whole organism. This approach is now being used in gene profiling, whole genome sequencing and discovery of transcription factor binding sites (Metzker, 2005; Morozova and Marra, 2008). At the time this study was initiated and conducted, next generation sequencing was not available, thus microarray technology was used to expression profile the 7.5 days post coitum (dpc) mouse embryo to identify genes that are specific to either particular germ layers or the primitive streak.

Two whole mount RNA in situ hybridisation analyses were undertaken. In the first analysis, genes enriched in the mesoderm and primitive streak fraction were characterised, as the heart is derived from the mesoderm and the mesoderm part of the primitive streak fraction. Information was collated regarding known expression in the mouse embryo and a subset of genes with no known expression were examined by whole mount RNA in situ hybridisation to determine if they were expressed in the developing mouse heart at 7.5-9.5 dpc. This did not lead to the identification of any genes expressed in a differential manner in the embryonic heart. A second analysis was undertaken with differing criteria and focusing on the mesoderm and endoderm, since signalling between these tissues is known to be important for heart development. This also did not identify any genes with differential expression in the developing mouse heart.

2.1.1 Identification of novel genes using the germ layer libraries

All the somatic tissue of the mouse develops from the epiblast. During the process of gastrulation, three germ layers are formed: endoderm, mesoderm and ectoderm (see Chapter 1). During gastrulation, epiblast cells ingress through the primitive streak and differentiate to form mesoderm and endoderm, whereas the ectoderm is formed from epiblast cells that differentiate in situ (Lawson et al., 1991). Since the germ layers give rise to particular organ primordia (Kinder et al., 1999; Parameswaran and Tam, 1995; Tam et al., 1997), genes displaying enriched expression in a particular germ layer at 7.5 dpc may be important markers or developmental cues for tissues derived from that layer.

There are a number of ways in which novel genes, pertinent to embryonic development, can be identified. These include subtractive hybridisation of cDNA libraries (Harrison et al., 1995), differential display by PCR (Liang and Pardee, 1992), suppression subtractive hybridisation, whole-mount in situ hybridisation of randomly selected clones, sequence clustering (Sousa-Nunes et al., 2003), analysis of phenotypes resulting from ENU mutagenesis (Acevedo-Arozena et al., 2008), SAGE (serial analysis of gene expression) display (Velculescu et al., 1995) or generation of gene trap mouse lines (Matsuda et al., 2004).

One successful approach identified novel genes expressed in a restricted manner in the early mouse embryo using a unique series of germ layer-specific cDNA libraries created by dissection of 7.5 dpc embryos (Harrison et al., 1995). These libraries were constructed from pooled 7.5 dpc embryos that had been dissected into their ectoderm, mesoderm, endoderm and a primitive streak component (Figure 2.1). Due to the nature of the dissection, the primitive streak component contained the primitive streak itself, a small proportion of the newly formed mesoderm and endoderm, and a small amount of ectoderm (Harrison et al., 1995). In their study, Harrison et al. (1995) performed subtractive hybridisation, in which the Mesoderm library was subtracted from the Endoderm library. Consistent with the subtraction, a subset of the identified cDNAs was confirmed as being more abundant in the endoderm as compared to the mesoderm by Southern hybridisation and by whole mount RNA in situ hybridisation (Harrison et al., 1995).

Using these libraries, genes with enriched expression in a particular germ layer at 7.5 dpc were identified by subtractive hybridisation (Dunwoodie et al., 1997; Dunwoodie et al., 1998; Harrison et al., 1995) or sequence analysis of libraries (Sousa-Nunes et al., 2003). Phlda2 (pleckstrin homology-like domain, family A, member 2, also known as Ipl) (Dunwoodie and Beddington, 2002), Dll3 (delta-like 3) (Dunwoodie et al., 2002), Sp5 (trans-acting transcription factor 5) (Harrison et al., 2000) and Cited1 (Dunwoodie et al., 1998) were all identified as being enriched in the primitive streak fraction, compared to the ectoderm and endoderm. Since these analyses were based on comparative expression between germ layers and not gene homologies, they represent an unbiased approach for identifying genes that are relevant to mammalian development (Barbera et al., 2002; Dunwoodie and Beddington, 2002; Dunwoodie et al., 2002; Harrison et al., 2000; Rodriguez et al., 2004; Weninger et al., 2005).

2.2 Germ layer microarrays

Although the germ layer-specific cDNA libraries have thus far proved a useful tool for identifying novel genes, the advent of new technologies such as microarrays placed us in a position to fully exploit this unique resource. Expression profiling by microarray of the germ layer cDNAs can identify genes specific to a particular layer without the need to create subtracted libraries. This allows for a more comprehensive examination of gene expression, since the expression pattern of genes across all germ layers can be examined at once. In the microarrays used here, the OligoLibrary arrayed on the microarray chip was designed using a Compugen transcriptome database to include splice variants whilst excluding SNPs (single nucleotide polymorphisms), repeated sequences, chimeras and intron contamination. As such, the library represents 22,000 unique sequences corresponding to approximately 21,500 genes.

Figure 2.1 Schematic and micrographs of the 7.5 dpc embryonic region dissected into its germ layer components and primitive streak fractions

Note that the primitive streak fraction also contains some mesoderm and ectoderm. Reproduced from Development, 121(8), Harrison, Dunwoodie, Arkell, Lehrach, and Beddington, “Isolation of novel tissue-specific genes from cDNA libraries representing the individual tissue constituents of the gastrulating mouse embryo”, pages 2479-89. Copyright (1995) with permission from The Company of Biologists. dev.biologists.org (Scale bar 200 μm).

2.2.1 Array hybridisation

The microarray hybridisation was performed by Owen Prall (Victor Chang Cardiac Research Institute) prior to the commencement of this study by the candidate. To identify genes that are enriched in a particular germ layer or the primitive streak, cDNAs present in each library (Ectoderm, Mesoderm, Endoderm and Primitive Streak) needed to be compared. To facilitate this, six microarray chips were used to compare each cDNA library to each other cDNA library (Table 2.1). For each microarray chip, plasmids containing cDNAs from the individual libraries (Ectoderm, Mesoderm, Endoderm and Primitive Streak) were linearised and RNA transcribed. Purified RNA from each library was labelled with one of the fluorochrome cyanine dyes, Cy3 (570 nm emission) or Cy5 (670 nm emission), and hybridised to the microchip arrayed with the 22,000 mouse gene Compugen/Sigma-Genosys OligoLibrary. Figure 2.2 shows the strategy for microarray chip 1 (Table 2.1), comparing the Mesoderm and Primitive Streak libraries. For microarray chip 1, plasmids containing cDNA from the Mesoderm and Primitive Streak libraries were linearised and RNA transcribed. Purified RNA from the Mesoderm library was labelled with Cy3 and RNA from the Primitive Streak library was labelled with Cy5. Labelled RNA from both libraries was hybridised to the same microarray chip. RNA was hybridised to complementary oligonucleotides on the chip and, after washing away non-hybridised sequences, the efficiency of the Cy3 and Cy5 hybridisation at each oligonucleotide was visualised by measuring the relative fluorescence (Figure 2.2).
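The count of six chips follows from comparing each of the four libraries with every other library once. A minimal sketch of that enumeration is given below; it only lists the unordered pairs and makes no assumption about which library in a pair received Cy3 or Cy5 (the dye assignments are those given in Table 2.1). The function name `pairwise_comparisons` is hypothetical and used for illustration only.

```python
from itertools import combinations

# The four cDNA libraries named in the text.
LIBRARIES = ["Ectoderm", "Mesoderm", "Endoderm", "Primitive Streak"]


def pairwise_comparisons(libraries):
    """Return every unordered pair of libraries (one two-colour chip per pair)."""
    return list(combinations(libraries, 2))


pairs = pairwise_comparisons(LIBRARIES)
for first, second in pairs:
    print(f"{first} vs {second}")
print(len(pairs), "chips needed")
```

Four libraries give 4 x 3 / 2 = 6 pairs, matching the six microarray chips used.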

2.2.2 Analysis of cDNA library microarray data

At the commencement of this study, the candidate obtained the normalised microarray data already entered into the Genespring software (Version 6.1, Silicon Genetics). An in-depth analysis of the microarray data was performed by the candidate using Genespring software in order to identify genes with enriched expression in particular germ layer(s) or the primitive streak.

Figure 2.2 Schematic for the microarray strategy for microarray chip 1, Mesoderm library compared to Primitive Streak library

Plasmids containing cDNA from the Mesoderm (brown) and Primitive Streak libraries (blue) were linearised and RNA transcribed. Purified RNA from the Mesoderm library was labelled with Cy3 (green) and RNA from the Primitive Streak library was labelled with Cy5 (red). Labelled RNA from both libraries was hybridised to the same microarray chip. RNA will hybridise to complementary oligonucleotides on the chip and, after washing away non-hybridised sequences, the efficiency of the Cy3 and Cy5 hybridisation at each oligonucleotide is visualised by measuring the relative fluorescence. The more “green” a particular oligonucleotide “spot”, the more enriched it is in the Mesoderm library compared to the Primitive Streak library; the more “red” the spot, the more enriched it is in the Primitive Streak library compared to the Mesoderm. Yellow spots are neutral and grey spots are not represented in the libraries.

Figure 2.3 Flow schematic of raw data restriction and identification of mesoderm enriched genes

[Figure 2.3: list of the six chips with their Cy3 and Cy5 channels (as in Table 2.1), followed by a Venn diagram of genes 2-fold enriched on microarray chips 2, 1 and 5. Counts shown in the diagram: 315, 1113, 759, 275, 276, 362 and 688.]

Six microarray chips were created as described in Figure 2.2. For each chip, RNA from the cDNA libraries was labelled with either Cy3 (green text) or Cy5 (red text) and hybridised to the chips. A raw data filter was applied to the Cy3 channel such that the raw data value was greater than or equal to 50 in at least 2 of the 6 experiments. If all experiments required a raw data value of at least 50, then genes not expressed in a given layer would be excluded. This created a subset list of genes (Analysis 1 raw data filtered) (black circle). From the Analysis 1 raw data filtered gene subset, genes 2-fold enriched in the mesoderm over the ectoderm (red circle), primitive streak (green circle) and endoderm (blue circle) were identified and, using an overlapping Venn diagram, genes 2-fold enriched in the mesoderm were identified. This approach identified 275 genes that were 2-fold enriched in the mesoderm compared to the other layers and the primitive streak.

Table 2.1 cDNA libraries were labelled with Cy3/Cy5 and hybridised to Compugen/Sigma-Genosys OligoLibrary arrayed chip.

Chip  Cy3 labelled      Cy5 labelled      Channel filtered  Analysis 1: Mesoderm enriched  Analysis 2: Endoderm enriched
1     Mesoderm          Primitive Streak  Mesoderm          Mesoderm                       -
2     Ectoderm          Mesoderm          Ectoderm          Mesoderm                       -
3     Endoderm          Ectoderm          Endoderm          -                              Endoderm
4     Primitive Streak  Endoderm          Primitive Streak  -                              Endoderm
5     Mesoderm          Endoderm          Mesoderm          Mesoderm                       Endoderm
6     Ectoderm          Primitive Streak  Ectoderm          -                              -

Candidate genes for further analysis by whole mount RNA in situ hybridisation were selected using two different approaches resulting in two analyses, Analysis 1 (described below and in Section 1.2.3) and Analysis 2 (described in Section 1.2.5). RNA that binds at a very low level can confound results. For example, raw binding intensities of 10 and 50 are represented as a 5-fold change, as are intensities of 100 and 500. However, an intensity of 10 is likely to be at baseline, whereas an intensity of 100 is more likely to reflect a genuine interaction. To overcome this, a basic expression restriction of a minimum raw data fluorescence unit value of 50 was applied to the raw data to exclude genes that were expressed at a very low level.
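The arithmetic behind this restriction can be illustrated with a short sketch. It is not the Genespring implementation; the function names are hypothetical, and the only values taken from the text are the example intensities (10, 50, 100, 500) and the minimum of 50.

```python
MIN_RAW_VALUE = 50  # minimum raw fluorescence unit value stated in the text


def fold_change(lower, higher):
    """Ratio of two raw intensities."""
    return higher / lower


def passes_minimum(raw_value, minimum=MIN_RAW_VALUE):
    """True when a raw intensity is at or above the minimum."""
    return raw_value >= minimum


# Both pairs of intensities give the same 5-fold change ...
low_pair_fold = fold_change(10, 50)
high_pair_fold = fold_change(100, 500)
print(low_pair_fold, high_pair_fold)

# ... but only the second pair starts from a value above the minimum.
print(passes_minimum(10), passes_minimum(100))
```

The ratio alone cannot separate the two cases, which is why a restriction on the raw value is applied before fold-change is considered.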

In Analysis 1, the restriction was applied by setting a minimum raw data value of 50 in the Cy3 channel in any 2 out of 6 microarray chip experiments (Table 2.1 and Figure 2.3). For example, in array 1 genes in the mesoderm must have a raw data value of at least 50, while in array 4 the genes in the primitive streak must be at least 50 (Table 2.1). As the germ layer and primitive streak layers were all represented in the Cy3 channel in at least one of the microarray chip experiments, the restriction was applied to any 2 out of 6 arrays to avoid exclusion of genes only enriched in one particular layer.
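The Analysis 1 restriction can be illustrated with a minimal sketch. The thresholds (a minimum raw value of 50 in at least 2 of 6 arrays) are taken from the text; the gene names and intensity values are hypothetical and are not data from the arrays.

```python
MIN_RAW = 50     # minimum raw fluorescence value (from the text)
MIN_ARRAYS = 2   # number of arrays that must reach the minimum (from the text)


def fold_change(numerator, denominator):
    """Ratio of two raw intensities."""
    return numerator / denominator


def passes_raw_filter(cy3_values, min_raw=MIN_RAW, min_arrays=MIN_ARRAYS):
    """True if at least `min_arrays` Cy3 raw values are >= `min_raw`."""
    return sum(value >= min_raw for value in cy3_values) >= min_arrays


# Hypothetical Cy3 raw values for three genes across the six arrays.
cy3_raw = {
    "gene_low": [10, 12, 8, 50, 9, 11],               # reaches 50 in one array only
    "gene_layer_specific": [120, 15, 9, 14, 95, 10],  # high only where mesoderm is in Cy3
    "gene_broad": [300, 280, 310, 290, 305, 295],     # high in every array
}

kept = [name for name, values in cy3_raw.items() if passes_raw_filter(values)]

# The same 5-fold ratio can arise near baseline or at clearly detectable levels,
# which is why a ratio alone is not enough.
baseline_ratio = fold_change(50, 10)
genuine_ratio = fold_change(500, 100)
```

In this sketch the layer-specific gene is retained because mesoderm occupies the Cy3 channel in two arrays (1 and 5), whereas requiring a value of 50 in all six arrays would have excluded it.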

Using the raw data filtered gene subset, genes with enriched expression in a particular germ layer were identified using relative fold-expression level, from 2-fold to 10-fold, of genes from normalised data (Figure 2.2 and Table 2.2). The data is normalised to

account for general differences in hybridisation efficiency. For example the maximal intensity of one fluorophore (Cy3 or Cy5) might be generally higher compared to the other. If the raw data were used, this would create a bias towards a particular channel, potentially obscuring small fold-change differences. The analysis showed that each cDNA library is distinct as there were clear differences in the number of genes identified as being particular to each germ layer or the primitive streak. The endoderm library stands out as the most distinct in terms of gene complexity as it has the largest number of enriched genes, whereas the ectoderm has the fewest enriched genes (Table 2.2).

Table 2.2 Number of Genes identified as enriched in each cDNA library by fold-change in Analysis 1

Fold-change | Mesoderm | Primitive Streak | Endoderm | Ectoderm
2-fold | 274 | 199 | 442 | 122
3-fold | 85 | 49 | 160 | 18
4-fold | 40 | 21 | 78 | 4
5-fold | 23 | 6 | 45 | 3
10-fold | 6 | - | 15 | -

Differential gene expression is only one measure of the differences between the germ layers and primitive streak. Analysis of the types of genes expressed is also an important measure as this provides information as to the function. To address this, the gene ontology (GO) terms associated with each germ layer were determined. Gene ontology is a controlled vocabulary used to describe gene and gene product attributes in any organism according to the molecular function of gene products; their role in multi-step biological processes; and their localization to cellular components. As such, GO terms can be used to provide information about the biology of a particular set of genes and this information can prove useful in prioritising genes for further investigation by whole mount RNA in situ hybridisation.

To determine which GO terms are associated with the different germ layers and primitive streak fractions, the accession numbers assigned to the genes 2-fold enriched in each layer were entered into the GOstat program (Beissbarth and Speed, 2004). GOstat determines which GO terms are associated with a list compared to a reference

list (in this case MGI – Mouse Genome Informatics). By using the MGI data set as the reference list, all annotated mouse genes are represented, thus the comparison is between the germ layers and the whole mouse genome. The best 30 terms over-represented in each library were identified (Appendix 1), and the top 10 terms are represented in pie graphs in Figure 2.4.
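The idea of a GO term being over-represented in a gene list relative to a reference list can be sketched with a one-sided hypergeometric test. This is an illustration of the general principle only: it is not GOstat's implementation, it omits any multiple-testing correction, and all counts below are hypothetical.

```python
from math import comb


def over_representation_p(k, n, K, N):
    """P(X >= k) for a hypergeometric draw.

    N: genes in the reference list
    K: reference genes annotated with the GO term
    n: genes in the enriched list
    k: enriched genes annotated with the GO term
    """
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Hypothetical counts: a reference of 20,000 genes, 1,500 of which carry the
# term, and an enriched list of 274 genes, 40 of which carry the term.
p_value = over_representation_p(k=40, n=274, K=1500, N=20000)
```

A small p-value indicates that the term occurs in the enriched list more often than expected if the list had been drawn at random from the reference.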

GO terms are hierarchical in that processes and gene functions are described in terms of biological process, cellular components and molecular function, thus there may be several overlapping GO terms representing a common pathway. For example, the following GO terms were over-represented in the primitive streak:
- Transcription
- Regulation of transcription, DNA dependent
- Regulation of gene expression
- Transcription, DNA dependent
These GO terms may all be seen as describing a singular process, transcription, although not all terms will apply to all genes.

The germ layers and primitive streak are also distinct as judged by the GO terms associated with the genes for which they are enriched. The primitive streak and ectoderm are the most similar, with many GO terms in common. The mesoderm also has several terms in common with both the primitive streak and ectoderm, but the endoderm has very few terms in common with the ectoderm and none with the primitive streak (Figure 2.4, Appendix 1). Thus, in agreement with the results observed for gene complexity, endoderm is also the most distinct as judged by gene ontology. It is surprising that there are not more terms in common between the endoderm and mesoderm as both these layers are directly derived from cells that ingress through the primitive streak. The germ layers, but not the primitive streak, all have GO terms associated with organ development (multicellular organismal development in both endoderm and ectoderm and development in mesoderm), which is to be expected given that the entire embryo develops from these tissues (Appendix 1).

2.2.3 Analysis 1 in silico analysis of genes

Extensive in silico analysis was undertaken by collating information about the genes identified, including alternative names, published expression data, predicted protein domains, gene ontology and phenotypic alleles, mainly sourced from the Mouse Genome Database (MGD) and Gene Expression Database (GXD) at the Mouse Genome Informatics (MGI) website (GXD, May 2005; MGD, May 2005). Current gene names are correct as of July 2011. Genes known to have restricted expression within a particular germ layer or the primitive streak at 7.5 dpc were identified in the array studies as being enriched in that layer compared to other layers. For example, mesoderm restricted genes Phlda2 and Twist1 were found to be 10-fold and 4-fold enriched in mesoderm respectively; alpha foetoprotein (Afp) was found to be 10-fold enriched in the endoderm; CrabpII was identified as an ectoderm enriched gene over other layers and the primitive streak, and Mix1 homeobox-like 1 (Mixl1) was 5-fold enriched in the primitive streak compared to the germ layers. This provided a level of confidence that the arrays were true to the libraries and represented genuine embryo expression profiles.

The aim of this project was to identify previously uncharacterised genes that are relevant to heart development. Since the heart is derived from the mesoderm, and the mesoderm is formed through the primitive streak, it was hypothesised that genes relevant to heart development would be expressed in the mesoderm and primitive streak rather than the endoderm or ectoderm at 7.5 dpc. Due to the nature of the dissection, the primitive streak fraction cDNA library contained the primitive streak itself, a small proportion of the newly formed mesoderm and endoderm, and a small amount of ectoderm (Harrison et al., 1995). Microarray analysis revealed that 274 genes are expressed at least 2-fold higher in the mesoderm than in the other layers, and 199 genes are expressed at least 2-fold higher in the primitive streak than in the germ layers (Figure 2.2).

Further characterisation of the genes identified as enriched in the mesoderm and primitive streak was carried out by comparing the number of genes for various parameters collated from the MGD and GXD (May 2005), at differing fold-changes (Table 2.3). A higher proportion of genes enriched 4-fold in either mesoderm or


Figure 2.4 Pie graphs representing the best 10 GO terms associated with genes enriched in each germ layer and the primitive streak

(A) Graph showing the best 10 terms associated with genes enriched in the mesoderm. (B) Graph showing the best 10 terms associated with genes enriched in the primitive streak fraction. (C) Graph showing the best 10 terms associated with genes enriched in the ectoderm. (D) Graph showing the best 10 terms associated with genes enriched in the endoderm.

Table 2.3 Percentage of mesoderm and primitive streak enriched genes with known embryo expression and phenotypic data at different fold-changes

Data was collated from MGI (May, 2005) and each gene was assigned as having no known expression (none), expression determined by in situ hybridisation (section, whole-mount RNA or protein) (ISH) or expression data determined by other methods e.g. PCR, Northern blot, Western blot (other data). The resulting phenotype from gene disruption provides phenotypic data and information about the function of genes. Genes were assigned as having none, targeted knockout (targeted) or non-targeted e.g. gene trap or mapped spontaneous mutation. Genes for which there are both targeted and non-targeted alleles were assigned as targeted. Expression and gene disruption data shown as percentage of total (n) for each fold-change.

 | Mesoderm 4-fold (n=40) | Mesoderm 3-fold (n=85) | Mesoderm 2-fold (n=274) | Primitive Streak 4-fold (n=21) | Primitive Streak 3-fold (n=46) | Primitive Streak 2-fold (n=199)
Expression data
None | 52.5 | 58.8 | 67.5 | 47.6 | 67.4 | 68.3
ISH | 42.5 | 34.1 | 27.0 | 47.6 | 30.4 | 27.1
Other data | 5 | 7.1 | 5.5 | 4.8 | 2.2 | 4.5
Gene disruption
None | 75 | 77.6 | 79.9 | 71.4 | 80.4 | 78.9
Targeted | 22.5 | 18.8 | 18.2 | 28.6 | 17.4 | 19.1
Non-targeted | 2.5 | 3.5 | 1.8 | 0 | 2.2 | 2.0

primitive streak have expression data (47.5 % vs. 32.5 % for mesoderm, 52.4 % vs. 31.6 % for primitive streak) and/or gene disruptions (25 % vs. 20 % for mesoderm, 28.6 % vs. 21.1 % for primitive streak) compared to a 2-fold enrichment. This may be because genes more highly enriched in a tissue of interest are more likely to be identified by other methods e.g. subtractive hybridisation. The majority of genes enriched in these tissues could be represented by GO terms or have known protein domains (75.2 % of mesoderm enriched genes, 73.9 % of primitive streak enriched genes) (Table 2.3).
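The percentages quoted in this paragraph appear to be the sums of the relevant rows of Table 2.3 (ISH plus other data for expression, targeted plus non-targeted for gene disruption). The short sketch below checks that arithmetic using the 4-fold and 2-fold values from the table; the dictionary layout and function names are illustrative only.

```python
# Values copied from Table 2.3 (percentages of genes at each fold-change).
table_2_3 = {
    ("mesoderm", "4-fold"): {"ish": 42.5, "other": 5.0, "targeted": 22.5, "non_targeted": 2.5},
    ("mesoderm", "2-fold"): {"ish": 27.0, "other": 5.5, "targeted": 18.2, "non_targeted": 1.8},
    ("primitive streak", "4-fold"): {"ish": 47.6, "other": 4.8, "targeted": 28.6, "non_targeted": 0.0},
    ("primitive streak", "2-fold"): {"ish": 27.1, "other": 4.5, "targeted": 19.1, "non_targeted": 2.0},
}


def with_expression_data(row):
    """Percentage of genes with any expression data (ISH or other methods)."""
    return round(row["ish"] + row["other"], 1)


def with_gene_disruption(row):
    """Percentage of genes with any disrupted allele (targeted or non-targeted)."""
    return round(row["targeted"] + row["non_targeted"], 1)


summary = {
    key: (with_expression_data(row), with_gene_disruption(row))
    for key, row in table_2_3.items()
}
```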

2.2.4 Analysis 1 whole mount in situ hybridisation

The following criteria were used to select a subset of mesoderm and primitive streak specific genes to screen by whole mount in situ hybridisation for expression in the developing heart:
- No published expression in embryo
- Genes expressed 3-fold in mesoderm or primitive streak fraction
- No genetic models (knockouts, spontaneous mutations, gene traps etc.)
- Availability of cDNA clones (IMAGE)
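The four criteria above amount to a simple conjunctive filter over the annotated gene list. The sketch below assumes a hypothetical record layout; the field names and example genes are invented for illustration and do not correspond to genes from the arrays.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    symbol: str
    tissue: str                         # "mesoderm" or "primitive streak"
    fold_enrichment: float
    published_embryo_expression: bool
    has_genetic_model: bool             # knockout, spontaneous mutation, gene trap etc.
    image_clone_available: bool


def meets_analysis1_criteria(gene, min_fold=3.0):
    """Apply the four Analysis 1 selection criteria to one gene."""
    return (
        not gene.published_embryo_expression
        and gene.fold_enrichment >= min_fold
        and not gene.has_genetic_model
        and gene.image_clone_available
    )


# Hypothetical examples, one failing each of three criteria and one passing all.
genes = [
    Candidate("geneA", "mesoderm", 4.1, False, False, True),
    Candidate("geneB", "mesoderm", 2.4, False, False, True),          # below 3-fold
    Candidate("geneC", "primitive streak", 3.6, True, False, True),   # already published
    Candidate("geneD", "primitive streak", 5.0, False, False, False), # no IMAGE clone
]

selected = [g.symbol for g in genes if meets_analysis1_criteria(g)]
```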

IMAGE cDNA clones for 47 of the genes identified according to these criteria were obtained, corresponding to 14 primitive streak and 33 mesoderm candidates (Table 2.4 and Appendix 2). Sequencing was performed to confirm the identity of the cDNA clones. The expression patterns of 40 correctly identified genes were examined by whole mount RNA in situ hybridisation (7.5-9.5 dpc), initially using the Intavis InsituPro robot, which automates the process. In the first experiment using the robot, brachyury (T) and caudal type homeobox 1 (Cdx1) were used as positive controls, and a no-probe negative control was also included. All robot experiments included the negative control and a positive control, although the positive control varied from T to a previously screened gene. Images of controls were not taken; however, as expected, the negative controls exhibited no expression and the positive controls showed the expected expression pattern for the relevant gene. For example, T was expressed in the primitive streak at 7.5 dpc and in the tail bud and notochord at 8.5 dpc and 9.5 dpc. This method yielded generally poor results, with most genes showing broad expression, and with younger embryos (7.5 dpc) as well as the heads of older embryos (9.5 dpc) exhibiting trapping. There were several potentially significant discrepancies between the protocol used with the robot and the protocol used in our lab for manual whole mount RNA in situ hybridisation, including:
- No proteinase K treatment. Proteinase K slightly digests the embryo, allowing easier access by the probe and other solutions to all surfaces of the embryo.
- No RNase treatment following hybridisation. RNase treatment mops up any excess unbound probe, reducing background staining.
- Rocking of embryos in solutions. This may have resulted in increased trapping.

Given these differences, the whole mount RNA in situ hybridisation screen was repeated using a manual protocol. In these experiments, no negative control was used, and in the first experiment Cited2 was used as a positive control to confirm that the protocol was working. Again, images of controls were not taken at the time, but Cited2 showed the expected expression pattern at each stage, being expressed in the cardiac crescent and blood islands at 7.5 dpc; in the cardiac crescent, anterior lateral mesoderm, PM and blood islands at 8.5 dpc; and widely at 9.5 dpc in the heart, branchial arches, forebrain, limb bud, tail bud and somites. Positive controls were not routinely included in the manual protocol; in practice, however, confirmation of the result from a previously screened gene served as an effective positive control. Furthermore, since each well contained different probes, the protocol could be confirmed as working correctly if the genes screened in any particular experiment showed different expression patterns. All genes already examined using the Intavis robot, as well as seven genes yet to be examined, were screened using the manual protocol. While this improved the trapping problems, no genes with differential expression within the heart were identified across the stages examined.

Genes were subjectively classified as restricted, widespread, ubiquitous, not detected or no result at each stage examined (Table 2.4, Figure 2.5 and Table 2.5). Where gene expression patterns were not identical between the robot protocol and the manual protocol, they were classified based on their manual result as this method was thought to be more robust. Syndecan binding protein (Sdcbp), NDC80 complex homolog (Saccharomyces cerevisiae) (Nuf2) and Riken clone 2810408A11Rik (2810408A11Rik) were identified as enriched in the primitive streak (Figure 2.5A, E, F, I). At 7.5 dpc, the expression of 2810408A11Rik is restricted to the primitive streak (Figure 2.5I), whereas Nuf2 expression is ubiquitous. Sdcbp is expressed in the neural folds and pre-somitic mesoderm at ~8 dpc and 8.5 dpc, and is also expressed in the branchial arch at 8.5 dpc (Figure 2.5A, B, C). At 9.5 dpc, Sdcbp is widely expressed in the forebrain, neural tube, limb bud, PSM and branchial arches (Figure 2.5D). Nuf2 was also identified as enriched in the primitive streak, but is expressed widely at 7.5 dpc and 9.5 dpc with stronger expression in some tissues (Figure 2.5E, F). EFR3 homolog A (S. cerevisiae) (Efr3a), Claudin domain containing 1 (Cldnd1) and Ankyrin repeat and SOCS box-containing protein 3 (Asb3) were identified as enriched in the mesoderm. At 7.5 dpc, Efr3a appeared to be expressed in the mesoderm and also in the primitive

Table 2.4 Analysis 1 whole mount RNA in situ hybridisation

The following candidate genes that were enriched in either the primitive streak or mesoderm (fold-enrichment in brackets) were selected for screening by whole mount RNA in situ hybridisation for heart and/or somite expression at 7.5, 8.5 and 9.5 dpc. Gene names and symbols are current as of March 2012 (Mouse Genome Database). Gene expression patterns were subjectively characterised as “ubiquitous” - similar expression observed across all tissues; “widespread” - expression in several but not all tissues, or in many tissues at differential levels; “restricted” - expression in just a few tissues at the stage examined; or “not detected”. For a few genes at particular stages, no result was obtained (“no result”). Abbreviations: AL allantois, BA branchial arches, EPC ectoplacental cone, FB forebrain, FG foregut, HB hindbrain, LB limb buds, MB midbrain, NF neural folds, NT neural tube, OC optic cup, OV otic vesicle, PS primitive streak, PSM presomitic mesoderm.

Tissue enriched | Expression | Gene symbol | Expression pattern, 7.5 dpc | Expression pattern, 8.5 dpc | Expression pattern, 9.5 dpc
Primitive Streak | 4-fold | 2810408A11Rik | Restricted. PS | Widespread | Widespread
Primitive Streak | 4-fold | Sdcbp | Restricted. PSM. | Restricted. PSM, BA. | Widespread
Primitive Streak | 4-fold | Pgp | Ubiquitous (in embryo region) | Widespread | Widespread
Primitive Streak | 3-fold | 0610030E20Rik | Widespread | Widespread | Widespread
Primitive Streak | 3-fold | 1200014J11Rik | Not detected. | Restricted. Head, PSM, NT. | Restricted. FB, HB, BA, OV, LB, Heart, PSM.
Primitive Streak | 3-fold | Mtmr14 | Ubiquitous | Ubiquitous | Widespread
Primitive Streak | 3-fold | Tm9sf3 | Not detected | Not detected | Restricted. BA, LB, NT, FB. (PSM unknown as broken off embryo)
Primitive Streak | 3-fold | Spef1 | Widespread | Widespread | Widespread
Primitive Streak | 3-fold | Stk25 | No result | Widespread | Widespread
Primitive Streak | 3-fold | Snx18 | No result | No result | Widespread
Primitive Streak | 3-fold | Cldn25 | No result | Restricted. NT, NF | Restricted. NT, BA, PSM
Primitive Streak | 3-fold | Ankrd61 | Widespread | Restricted (9 dpc). NF. | Restricted. NT
Primitive Streak | 3-fold | Kirrel3 | Widespread | Widespread | Widespread
Primitive Streak | 3-fold | Nuf2 | Widespread | Restricted. NF, PSM | Widespread
Mesoderm | 4-fold | Snhg8 | Widespread (8 dpc) | Widespread | Widespread
Mesoderm | 4-fold | 9030425E11Rik | Not detected | Restricted. NF. | Restricted. FB, NT, LB, BA.
Mesoderm | 4-fold | Msrb2 | Restricted (weak). HF, AL. | Widespread | Restricted. PSM, BA, LB.
Mesoderm | 4-fold | Rhbdd2 | Ubiquitous (weak) | Widespread | Widespread
Mesoderm | 4-fold | Trmt61b | Not detected. | Widespread | Widespread
Mesoderm | 4-fold | 9130221H12Rik | Ubiquitous | Widespread | Widespread
Mesoderm | 4-fold | Efr3a | Widespread. (Includes mesoderm - not confirmed by sectioning) | Restricted. NF. | Widespread
Mesoderm | 4-fold | 1500011H22Rik | Ubiquitous (weak) | Widespread | Widespread
Mesoderm | 4-fold | Acadsb | Not detected | Restricted (weak). NF | Widespread
Mesoderm | 4-fold | 2900073G15Rik | Restricted. NF | Restricted. NF, PSM/AL boundary | Restricted. FB, PSM, LB
Mesoderm | 4-fold | Asb3 | Restricted. NF, AL | Restricted. NF, PSM, AL | Restricted. NT, PSM, BA, FB.
Mesoderm | 3-fold | 1700100L14Rik | Not detected | Restricted. NF, PSM | Restricted. FB, PSM
Mesoderm | 3-fold | cDNA sequence BC003331 | Not detected | Not detected | Restricted. NT
Mesoderm | 3-fold | Fam36a | Not detected. | Restricted. NF, PSM | Restricted. NT, BA
Mesoderm | 3-fold | Gucy1b3 | Ubiquitous (in embryo region) | Ubiquitous | Widespread
Mesoderm | 3-fold | Gm9751 | Widespread | Ubiquitous | Restricted
Mesoderm | 3-fold | Gmpr2 | Widespread | Widespread | Restricted
Mesoderm | 3-fold | 2610305D13Rik | Not detected | Restricted (very weak). NF, PSM | Widespread
Mesoderm | 3-fold | Wdr82 | Restricted. NF, PS | Restricted (weak). NF, PSM | Restricted. FB, BA, PSM, LB
Mesoderm | 3-fold | Mum1 | Ubiquitous | Widespread | Widespread
Mesoderm | 3-fold | Mecr | Not detected | Restricted (weak). NT. | Widespread
Mesoderm | 3-fold | Riok2 | Restricted (weak). Endoderm (not confirmed by sectioning) | Widespread (weak) | Widespread
Mesoderm | 3-fold | D930014E17Rik | Restricted (weak). NF, PS | Restricted. NF, PSM, BA | Widespread
Mesoderm | 3-fold | Abtb2 | Restricted. NF | Restricted. NF, FG | Restricted. FB, LB, PSM
Mesoderm | 3-fold | Dus4l | Restricted (weak). NF | Restricted (weak). NF | Restricted. Head, BA
Mesoderm | 3-fold | Dpp8 | No result | Widespread | Widespread
Mesoderm | 3-fold | Wipi2 | Ubiquitous | Widespread | Widespread


Figure 2.5 Examples of embryo expression patterns

Embryos were scored as having restricted (A-C, I, J, L-N), widespread (E, F, K) or ubiquitous (G, H) expression patterns. Sdcbp (A-D), Nuf2 (E, F) and 2810408A11Rik (I) were identified as enriched in the primitive streak. Efr3a, Cldnd1 and Asb3 were identified as enriched in the mesoderm in Analysis 1 (J, K, L). Tm2d2 was identified as enriched in the mesoderm in Analysis 2 (G, H). Smoc1 was identified as enriched in the mesoderm-endoderm population in Analysis 2 (M, N). (A-D) Sdcbp is expressed in the neural folds and pre-somitic mesoderm at ~8 dpc (A, B) and 8.5 dpc, and in the branchial arch at 8.5 dpc (C). At 9.5 dpc (D), Sdcbp is expressed in the forebrain, neural tube, limb bud, PSM and branchial arches. (E, F) Nuf2 was identified as enriched in the primitive streak and is expressed widely at 7.5 dpc (E) and 9.5 dpc (F), with stronger expression in some tissues. (G, H) Tm2d2 is ubiquitously expressed at 7.5 dpc (G) and 8.5 dpc (H), with similar levels of expression across all tissues. (I) At 7.5 dpc, 2810408A11Rik is expressed in a restricted manner in the primitive streak. (J) At 9.5 dpc, Cldnd1 is expressed in the forebrain, branchial arches, limb buds, neural tube and otic vesicle. (K) At 7.5 dpc, Efr3a is expressed in the mesoderm, primitive endoderm and blood islands. (L) Asb3 is expressed in the foregut endoderm and next to the optic cup at 8.5 dpc. (M, N) Smoc1 is expressed in a restricted manner at 8.5 dpc and 9.5 dpc, being present in the somites, PSM and branchial arches at both stages and additionally in the limb bud, hind-, mid- and fore-brain at 9.5 dpc. (A) Posterior view, (B, E) anterior view, (G) anterior to left, (C, D, F, H, I-N) lateral view. Abbreviations: AL allantois, BA branchial arch, BI blood islands, EPC ectoplacental cone, FB forebrain, FGE foregut endoderm, H heart, LB limb bud, MB midbrain, NF neural fold, OC optic cup, OV otic vesicle, PE primitive endoderm, PS primitive streak, PSM presomitic mesoderm, S somites. Dotted line denotes mesoderm wing. Scale bar (A-H): (A, B) 220 µm, (C) 280 µm, (D) 480 µm, (E) 180 µm, (F) 430 µm, (G) 210 µm, (H) 205 µm. Scale bar (I-N): (I) 200 µm, (J, N) 400 µm, (K) 160 µm, (L) 250 µm, (M) 245 µm.

Table 2.5 Analysis 1 observed gene expression patterns by embryo stage

Gene expression patterns were classified as restricted, widespread, ubiquitous or not detected at each stage examined (percentage, n=41 genes).

Stage | Restricted | Widespread | Ubiquitous | Not detected | No result
7.5 dpc | 39.0 | 17.1 | 19.5 | 13.3 | 9.8
8.5 dpc | 43.9 | 41.5 | 7.3 | 2.4 | 4.9
9.5 dpc | 37.8 | 53.3 | 0.0 | 0.0 | 0.0

endoderm and ectoplacental cone, however without sectioning, these tissue classifications are not certain (Figure 2.5K).

At 7.5 dpc, some genes identified by microarray analysis as being enriched in the primitive streak did exhibit stronger expression in this region, for example 2810408A11Rik. Efr3a appeared to be expressed in the mesoderm at 7.5 dpc, as expected from the microarray analysis; however, without sectioning, it is difficult to determine whether genes identified as enriched in the mesoderm were indeed expressed there specifically. As such, expression in these embryos may have been described as ubiquitous even though sectioning might have revealed no expression within the inner ectoderm layer. Although genes such as Sdcbp, 2810408A11Rik, Cldnd1 and Asb3 showed restricted expression at some, but not all, of the stages examined, they were not expressed in the tissues of interest.

2.2.5 Analysis 2 microarray analysis

Analysis 1 did not yield many genes with restricted expression and none that were differentially expressed in the heart. The processing of the data, including the way the raw data was filtered, the tissues examined and the final selection criteria of genes to be examined, may all have contributed to the lack of success. Therefore, to enhance the chances of finding genes relevant to mouse heart development, the approach to selecting candidate genes for examination by manual whole mount RNA in situ hybridisation was re-examined. In the initial analysis, the raw data was filtered by setting a minimum value of 50 in 2 out of the 6 arrays performed comparing the cDNA libraries. The selection of 2 out of 6, rather than all experiments, was undertaken to avoid exclusion of genes enriched in only one particular germ layer or the primitive streak. Raw data filtering in this way was not specific to any particular germ layer or the primitive streak, and was applied only in the Cy3 channel (Table 2.1 and Figure 2.2). Because only one channel was filtered, any general difference in the hybridisation efficiency of each dye may have biased the array results.

For the second analysis, the raw data minimum of 50 was applied specifically to a particular germ layer in each array, e.g. mesoderm (Figure 2.6). This filtering scheme identified 90 genes that are enriched at least 3-fold in mesoderm (Figure 2.6) over the other layers, compared to 85 genes identified using the raw data filter in Analysis 1. Of these, 72 genes were common to the Analysis 1 and Analysis 2 mesoderm enriched lists, 13 genes were unique to Analysis 1 and 18 genes were unique to Analysis 2. At the time of this study, only three of the 13 genes unique to Analysis 1 had an assigned identity and the remainder had clone identities. One of the genes excluded by this filtering, matrilin 2, is reportedly expressed in the heart at 10.5 dpc (Segat et al., 2000). Of the 18 genes identified as unique to Analysis 2, 7 have clone identities only and 11 had been assigned a gene name (MGD, May 2005; MGD, September 2009). Of these, lysophosphatidic acid receptor 3 has been shown to be important in implantation and embryo spacing; anoctamin 10 is expressed during cephalic development; Fgfr1 oncogene partner is expressed during somitogenesis; zinc finger protein 60 is involved in cartilage differentiation; and a mouse mutation in sema domain, immunoglobulin domain (Ig), transmembrane domain (TM) and short cytoplasmic domain, (semaphorin) 4D has been shown to cause defects in the immune system (Ganss and Kobayashi, 2002; Gritli-Linde et al., 2009; Hughes et al., 2009; Ye et al., 2005). Three other genes have been identified in screens examining the embryonic brain and/or the transcriptional profile of the peri-implantation embryo, or retinal development: zinc finger protein 398, zinc finger and BTB domain containing 48 and methylenetetrahydrofolate dehydrogenase (NADP+ dependent) 2-like (Blackshaw et al., 2004; Gray et al., 2004b; Guo and Robson, 2008). The remaining genes unique to Analysis 2 have no reported expression data (MGD, September 2009).
So as to not inadvertently exclude potential heart genes, genes present in Analysis 1 but not present in Analysis 2 following raw data filtering were still included as potential candidates.
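The comparison of the two 3-fold mesoderm lists is a set comparison, sketched below. The gene identifiers are hypothetical integers constructed only so that the list sizes and overlap match the counts reported in the text (85, 90, 72, 13 and 18); they are not the real gene lists.

```python
# Hypothetical identifiers chosen to reproduce the reported counts.
analysis1 = set(range(0, 85))     # 85 genes, 3-fold enriched in mesoderm (Analysis 1)
analysis2 = set(range(13, 103))   # 90 genes, 3-fold enriched in mesoderm (Analysis 2)

common = analysis1 & analysis2            # identified by both analyses
unique_to_1 = analysis1 - analysis2       # excluded by the Analysis 2 filter
unique_to_2 = analysis2 - analysis1       # newly identified in Analysis 2

# Genes unique to Analysis 1 were retained as potential candidates,
# so the candidate pool is the union of the two lists.
candidate_pool = analysis1 | analysis2
```

By arithmetic, retaining the genes unique to Analysis 1 gives a pool of 72 + 13 + 18 = 103 mesoderm enriched genes before the later selection criteria are applied.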


Figure 2.6 Flow chart schematic of raw data filtering of the Mesoderm arrays in Analysis 2

Six microarray chips were created as described in Figure 2.2. For each chip, RNA from the cDNA libraries was labelled with either Cy3 (green text) or Cy5 (red text) and hybridised to the chips. A raw data minimum of 50 was applied to the mesoderm (bolded text) irrespective of whether it was in the Cy3 channel or Cy5 channel. The genes with a raw data value in the mesoderm greater than 50 in each of the three arrays (arrays 1, 2 and 5) were identified by overlapping Venn diagram. This created a subset list of mesoderm raw data filtered genes (black circle). From the mesoderm raw data filtered gene subset, genes 3-fold enriched in the mesoderm over the ectoderm (red circle), primitive streak (green circle) and endoderm (blue circle) were identified using an overlapping Venn diagram. This approach identified 90 genes that were 3-fold enriched in the mesoderm compared to the ectoderm, endoderm and primitive streak.


Figure 2.7 Analysis 2, identification of genes enriched in both the mesoderm and endoderm

(A) Diagram representing the raw data strategy for identification of genes enriched in both the mesoderm and endoderm. The raw data minimum was set to 50 for the mesoderm (Figure 2.4) and endoderm in each array, and the resulting lists were then overlapped to provide a list of genes with minimum raw data values of 50 in both the mesoderm and endoderm. (B) Diagram representing the number of genes enriched 3-fold in the mesoderm and endoderm compared to the ectoderm. The primitive streak enriched genes are included since cells that ingress through the primitive streak form mesoderm and endoderm.

The Analysis 2 raw data filter would be expected to be a better approach as it directly excluded genes expressed at a low level in the tissue(s) of interest.

Signals from the foregut endoderm to the developing heart have been shown to be important regulators of heart development (Lough and Sugi, 2000; Rochais et al., 2009; Sugi et al., 1995; Zhu et al., 1996). Furthermore, genes that are expressed in both the endoderm and mesoderm may be important for heart development, for example Cited2 and Gata4. Cited2 is expressed in both endoderm and mesoderm at early stages, is expressed in the heart, and mice carrying targeted knockout mutations in Cited2 exhibit heart defects (Barbera et al., 2002; Dunwoodie et al., 1998; Weninger et al., 2005).

To identify genes that are highly expressed in endoderm and mesoderm, raw data filtering was applied separately to each layer as described above (Figure 2.6, Table 2.1). The resulting mesoderm and endoderm raw data filtered gene lists were overlapped to provide an endoderm-mesoderm raw data filtered list (Figure 2.7A). Since the mesoderm and endoderm are formed during gastrulation from epiblast cell ingression through the primitive streak, fold-change was only compared between mesoderm and ectoderm, and endoderm and ectoderm. Thus the endoderm-mesoderm list also includes genes enriched in the primitive streak. 129 genes were identified as being expressed 3-fold greater in mesoderm and endoderm, compared to ectoderm (Figure 2.7B).
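The endoderm-mesoderm strategy can be sketched as two tissue-specific raw data filters followed by an overlap and a fold-change test against ectoderm only. The thresholds (raw minimum of 50, 3-fold) come from the text. The record layout, gene names and values are hypothetical, and representing each tissue comparison by a single normalised fold-change value is a simplifying assumption, since the text does not state how values were combined across arrays.

```python
MIN_RAW = 50     # raw data minimum applied to the tissue of interest (from the text)
MIN_FOLD = 3.0   # fold-change threshold relative to ectoderm (from the text)

# Hypothetical records: raw values for each tissue in the three arrays that
# contain it, plus an assumed single fold-change value relative to ectoderm.
genes = {
    "gene_a": {"meso_raw": [120, 90, 200], "endo_raw": [150, 80, 95],
               "meso_vs_ecto": 4.2, "endo_vs_ecto": 3.5},
    "gene_b": {"meso_raw": [300, 250, 40], "endo_raw": [500, 450, 480],
               "meso_vs_ecto": 5.0, "endo_vs_ecto": 6.0},   # fails mesoderm raw filter
    "gene_c": {"meso_raw": [75, 60, 88], "endo_raw": [70, 65, 90],
               "meso_vs_ecto": 3.4, "endo_vs_ecto": 1.8},   # endoderm fold-change too low
}


def raw_filtered(records, key):
    """Genes whose raw value for one tissue reaches MIN_RAW in every array."""
    return {name for name, rec in records.items()
            if all(value >= MIN_RAW for value in rec[key])}


mesoderm_filtered = raw_filtered(genes, "meso_raw")
endoderm_filtered = raw_filtered(genes, "endo_raw")

# Overlap of the two raw data filtered lists (as in Figure 2.7A).
endo_meso_filtered = mesoderm_filtered & endoderm_filtered

# Enrichment is tested against ectoderm only, so primitive streak enriched
# genes are not excluded (as in Figure 2.7B).
endo_meso_enriched = sorted(
    name for name in endo_meso_filtered
    if genes[name]["meso_vs_ecto"] >= MIN_FOLD
    and genes[name]["endo_vs_ecto"] >= MIN_FOLD
)
```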

To determine the types of genes enriched in the endoderm-mesoderm list, the accession numbers for genes enriched 3-fold in the endoderm-mesoderm were entered into the GOstat program (Beissbarth and Speed, 2004). As in Analysis 1, the best 30 terms over-represented were identified, and the best 10 terms are represented in pie graph form (Figure 2.8, Appendix 1, also compare to Figure 2.4). The endoderm-mesoderm contains terms and GO hierarchies common between each of the germ layers and the primitive streak. For example, terms over-represented in the endoderm-mesoderm and the germ layers and primitive streak include:
- Organelle and related terms: mesoderm, ectoderm and primitive streak
- Localization and transport and related terms: endoderm
- Nucleic acid binding and related terms: ectoderm and primitive streak
Interestingly, the endoderm remains the most distinct gene list as judged by gene ontology. The endoderm-mesoderm enriched gene list has as many terms in common with ectoderm, despite enrichment being directly compared to ectoderm, but it is most “like” the mesoderm with eight out of the top 10 GO terms in common.

2.2.6 Analysis 2 in silico analysis

As in Analysis 1, reported expression pattern, mutation (knockout, spontaneous and gene-trap), gene ontology, and information on protein domain (predicted and demonstrated) were collated using MGD and GXD (GXD, June, 2006; MGD, June, 2006), for all genes identified in Analysis 2. 36.6 % of genes identified as being enriched in the mesoderm in Analysis 2 and 39.8 % of genes enriched in both the mesoderm and endoderm had previously been examined by in situ hybridisation (Table 2.6). This is similar to the percentage of genes 3-fold enriched in the mesoderm in Analysis 1, where 34.1 % of genes had published in situ hybridisation data (Table 2.3). 22.2 % of mesoderm enriched genes had disrupted alleles in Analysis 2, which is similar to Analysis 1 (22.3 %); however, 32 % of genes enriched in the mesoderm and endoderm had disrupted alleles (Table 2.6, compare to Table 2.3).

To further refine the list of candidate genes, a comparison was performed between the genes identified as candidates as described above, and genes identified in another array investigating heart development (Masino et al., 2004). In this array, cardiac cells in the embryo were transgenically labelled from cardiac crescent (~7.75 dpc) stage to 9.5 dpc and compared to non-cardiac cells from the same stages (Masino et al., 2004). A higher proportion of the genes common to the germ layer microarrays and the Masino array had previously been examined by in situ hybridisation (80 % of Analysis 2 mesoderm enriched genes and 63.9 % of genes enriched in both the mesoderm and endoderm). This was also the case for gene disruptions, with 60 % of mesoderm enriched genes and 52.4 % of genes enriched in the mesoderm and endoderm having disrupted alleles (Table 2.6).

In addition to the new raw data filtering and the inclusion of genes enriched in both endoderm and mesoderm as well as genes enriched only in the mesoderm, the selection criteria for candidates to be examined by whole mount RNA in situ hybridisation were reviewed. In Analysis 1, the main criteria were 3-fold greater expression in mesoderm or primitive streak, no genetic models and no reported expression in the embryo. The first two of these criteria were maintained (except for gene-traps with no phenotypic data), but for this analysis only those genes for which heart expression in the embryo has been examined by in situ hybridisation have been excluded (i.e. genes with published northern blot or RT-PCR data that report heart expression may still be included). Genes that had previously been examined in the mouse were more likely to be expressed at levels detectable by whole mount RNA in situ hybridisation compared to genes that were completely novel with respect to expression data. This should increase the chances of identifying genes that are detectable by whole mount RNA in situ hybridisation, and for which a role in the heart may not have been considered.

Table 2.6 Percentage of mesoderm and mesoderm and endoderm enriched genes with known embryo expression and phenotypic data at 3-fold enrichment

Data were collated from MGD and GXD (GXD, June, 2006; MGD, June, 2006) and each gene was assigned as having no known expression (none), expression determined by in situ hybridisation (section, whole-mount RNA or protein) (ISH), which was further divided into heart expression not examined (not heart), or heart expression examined (heart examined), or expression data determined by other methods e.g. PCR, Northern blot, Western blot (other data). The resulting phenotype from gene disruption provides phenotypic data and information about the function of genes. Genes were assigned as having none, targeted knockout (targeted) or non-targeted e.g. gene trap or mapped spontaneous mutation. Genes for which there are both targeted and non-targeted alleles were assigned as targeted. Expression and gene disruption data shown as percentage of total (n) for 3-fold enrichment. Mesoderm list only includes data from Analysis 2, not combined with any Analysis 1 data. Masino array numbers represent the percentage of genes represented as upregulated in Nkx2.5 expressing cells compared to other embryonic cells at 7.75 dpc (Masino et al., 2004).

                        Mesoderm                            Mesoderm-Endoderm
                        Total percent     Masino array      Total percent     Masino array
                        (n = 90)          (n = 15; 16.6 %)  (n = 128)         (n = 36; 28.1 %)
Expression data
  None                  58.9              13.3              50.8              30.6
  ISH (not heart)       14.4              20.0              18.0              13.9
  ISH (heart examined)  22.2              60.0              21.8              50.0
  Other data            4.4               6.7               9.4               11.1
Gene disruption
  None                  77.8              40.0              68.0              47.2
  Targeted              17.8              46.7              25.0              47.2
  Non-targeted          4.4               13.3              7.0               5.6

[Figure 2.8: three pie graphs, panels A to C. Slice labels recovered from the figure:
A) nucleus; protein binding; intracellular; intracellular membrane-bound organelle; intracellular part; primary metabolic process; organelle; cellular metabolic process; intracellular organelle; membrane-bound organelle
B) multicellular organismal development; cytoplasm; transport; extracellular space; protein binding; establishment of localization; localisation; extracellular region part; cytoplasmic part; developmental process
C) cytoplasm; intracellular; macromolecule metabolic process; intracellular membrane-bound organelle; intracellular part; primary metabolic process; organelle; cytoplasmic part; intracellular organelle; membrane-bound organelle]

Figure 2.8 Pie graphs showing best 10 GO terms associated with genes enriched in the mesoderm, endoderm and enriched in both the mesoderm and endoderm

A) Graph showing the best 10 terms associated with genes enriched in the mesoderm. B) Graph showing the best 10 terms associated with genes enriched in the endoderm. C) Graph showing the best 10 terms associated with genes enriched in the mesoderm and endoderm.

Genes identified as being 3-fold enriched in mesoderm, or mesoderm and endoderm, that were also represented in the Masino et al. (2004) array as being present in cardiac cells were given higher priority. Genes not reported as present in the cardiac cell array may still be expressed in the developing heart, since only a subset of genes was present on both microarray chips and some genes known to be expressed in the developing heart, e.g. Cited1 (Dunwoodie et al., 1998), were not present on the chip used in their study.

Gene ontology, predicted protein domains and the assigned name of genes were all considered in reducing the candidate list. Genes encoding transcription factors or activators, proteins involved in protein binding, transmembrane proteins, secretion or signalling proteins, proteins involved in apoptosis, or structural or cytoskeletal proteins were prioritised over other classifications. By prioritising these pathways/functions it was hoped that genes with functional relevance to heart development would be identified since these pathways are known to be important in the heart. For example, mutations in the transcription factors NKX2.5 and GATA4; transcriptional activators T-box 1 (TBX1) and T-box 20 (TBX20); transmembrane proteins Jagged 1 (JAG1) and NOTCH2; and the signalling molecule protein tyrosine phosphatase, non-receptor type 11 (PTPN11) are associated with human congenital heart malformations (Benson et al., 1999; Eldadah et al., 2001; Garg et al., 2003; Kirk et al., 2007; Nemer et al., 2006; Rauch et al., 2004; Schott et al., 1998; Weismann et al., 2005). Apoptosis has been demonstrated to be important in shaping the heart, and the cytoskeletal protein, small muscle protein, X-linked (also known as Chisel) is important in mouse heart development (Barbosky et al., 2006; Bruneau, 2008; Palmer et al., 2001; Sharma et al., 2004). Utilising these criteria, 55 genes were selected, and cDNA clones (IMAGE or RIKEN) were obtained (Appendix 3).

Table 2.7 Analysis 2 Whole mount RNA in situ hybridisation screen

The following candidate genes that were 3-fold enriched in the mesoderm or enriched in both the mesoderm and endoderm were selected for screening by whole mount RNA in situ hybridisation for heart and/or somite expression at 7.5, 8.5 and 9.5 dpc. Gene names and symbols are current as of March 2012 (Mouse Genome Database). Expression patterns were subjectively characterised as “ubiquitous” - similar expression observed across all tissues; “widespread” - expression in several but not all tissues, or in many tissues at differential levels; “restricted” - expression in just a few tissues at stage examined; or “not detected”. For a few stages, no result was obtained (“no result”). Abbreviations: AL allantois, BA branchial arches, EPC ectoplacental cone, FB forebrain, HB hindbrain, MB midbrain, NF neural folds, NT neural tube, PS primitive streak, PSM presomitic mesoderm.

                                              Expression pattern
Tissue enriched    Expression  Gene symbol    7.5 dpc                                  8.5 dpc                   9.5 dpc
Mesoderm-Endoderm  3-fold      Smoc1          Restricted, borders between germ layers  Restricted. Somites, PSM  Restricted. Somites, PSM, BA, FB, MB, HB
Mesoderm-Endoderm  3-fold      Wls            Restricted. PS, NF                       No result                 Widespread
Mesoderm-Endoderm  3-fold      Ndufs2         Ubiquitous                               Widespread                Widespread
Mesoderm-Endoderm  3-fold      Pik3r4         Ubiquitous                               Ubiquitous (weak)         Restricted. FB
Mesoderm-Endoderm  3-fold      Gng12          Ubiquitous                               Ubiquitous (very weak)    Ubiquitous
Mesoderm-Endoderm  3-fold      Pcgf5          Restricted. EPC, NF                      Restricted. NF            Restricted. BA, NT heart
Mesoderm-Endoderm  3-fold      Ndfip1         Restricted. NF                           Restricted. NF            Widespread
Mesoderm-Endoderm  3-fold      Dek            Not detected                             Restricted. NF            Widespread
Mesoderm-Endoderm  3-fold      Dhx33          Restricted. AL, NF, anterior PS          No result                 Widespread
Mesoderm-Endoderm  3-fold      Ddx10          Not detected                             Restricted. NF            Restricted. Head, optic cup
Mesoderm-Endoderm  3-fold      Cep192         Ubiquitous                               Widespread                Widespread
Mesoderm-Endoderm  3-fold      Zfp330         Restricted. PS, NF                       No result                 Widespread
Mesoderm-Endoderm  3-fold      Cmtm3          Widespread                               Widespread                Widespread
Mesoderm-Endoderm  3-fold      Stk38l         Not detected                             Not detected              Restricted. PSM, FB
Mesoderm-Endoderm  3-fold      Ddx5           Ubiquitous, embryonic region             No result                 Widespread
Mesoderm-Endoderm  3-fold      Dennd5a        Ubiquitous                               Restricted. NF            Widespread
Mesoderm-Endoderm  3-fold      St6galnac2     Widespread                               Restricted. NF            Widespread
Mesoderm-Endoderm  3-fold      2610307P16Rik  Ubiquitous (embryonic region)            Ubiquitous (weak)         Not detected
Mesoderm-Endoderm  3-fold      Snhg12         Not detected                             No result                 Restricted. FB
Mesoderm-Endoderm  3-fold      Tmem216        Widespread                               Ubiquitous                Widespread
Mesoderm-Endoderm  3-fold      Tmem192        Ubiquitous                               Ubiquitous                Widespread
Mesoderm-Endoderm  3-fold      Atxn7l3b       Widespread                               Widespread                Widespread
Mesoderm-Endoderm  3-fold      Fam92a         Widespread (weak)                        Widespread (weak)         Widespread (weak)
Mesoderm-Endoderm  3-fold      Rsrc2          No result                                No result                 Widespread
Mesoderm-Endoderm  3-fold      Ift46          No result                                No result                 Widespread
Mesoderm-Endoderm  3-fold      9530006C21Rik  No result                                No result                 Restricted. Head, BA
Mesoderm-Endoderm  3-fold      Nck2           Ubiquitous                               Widespread                Widespread
Mesoderm-Endoderm  3-fold      Kif4-ps        Restricted (weak)                        Not detected              Restricted. FB
Mesoderm           3-fold      Ift57          No result                                No result                 Widespread
Mesoderm           3-fold      Zbtb48         No result                                Restricted. NF            Restricted. Head, BA
Mesoderm           3-fold      Tm2d2          Ubiquitous (weak)                        Widespread                Widespread
Mesoderm           3-fold      Zfp60          No result                                Restricted. NF            Widespread
Mesoderm           3-fold      Fgfr1op        Ubiquitous                               Ubiquitous                Ubiquitous
Mesoderm           3-fold      Erlec1         Not detected                             Ubiquitous                Ubiquitous
Mesoderm           3-fold      Zfp423         No result                                Widespread                Widespread
Mesoderm           3-fold      Zfp398         Ubiquitous (weak)                        Ubiquitous (weak)         Ubiquitous (weak)
Mesoderm           3-fold      Zfp618         Not detected                             Not detected              Not detected
Mesoderm           3-fold      Tmem81         Ubiquitous (weak)                        Ubiquitous                Widespread
Mesoderm           3-fold      Mtap9          No result                                Widespread                Widespread
Mesoderm           3-fold      4930451G09Rik  Restricted. NF, PS                       Widespread                Widespread
Mesoderm           3-fold      Dleu2          Not detected                             Not detected              Not detected
Mesoderm           3-fold      Hrnr           No result                                Restricted. AL, NF        Restricted. Head, PSM
Mesoderm           3-fold      6030492E11Rik  Restricted. NF                           Restricted. NF            Restricted. FB, PSM
Mesoderm           3-fold      2610034E01Rik  No result                                Widespread                Ubiquitous
Mesoderm           3-fold      8430406H22Rik  Ubiquitous (weak)                        Restricted. PSM           Ubiquitous (weak)

2.2.7 Analysis 2 whole mount in situ hybridisation

Gene expression patterns were examined by whole mount RNA in situ hybridisation in 7.5-9.5 dpc mouse embryos for 45 correctly identified genes (Table 2.7). Gene expression patterns were categorised as restricted, widespread or ubiquitous as in Analysis 1 (see Figure 2.5 for typical expression patterns; 2.5G, H, M, N are examples from Screen 2). As observed in Analysis 1, the majority of genes (55.6 %) were widespread in their expression at 9.5 dpc. In contrast to Analysis 1, more genes with ubiquitous expression were observed at 7.5 dpc although, as sectioning was not performed, some genes described as ubiquitously expressed at 7.5 dpc may not have been expressed in the inner ectoderm layer (Tables 2.7 and 2.8).

Table 2.8 Analysis 2 observed gene expression pattern by embryo stage

Gene expression patterns were classified as restricted, widespread, ubiquitous or not detected at each stage examined (percentage, n=45 genes).

Stage      Restricted   Widespread   Ubiquitous   Not detected   No result
7.5 dpc    20.0         11.1         31.1         15.6           22.2
8.5 dpc    26.7         24.4         20.0         8.9            20.0
9.5 dpc    24.4         55.6         13.3         6.7            0.0
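The percentages in Table 2.8 follow directly from tallying the classifications in Table 2.7. As a minimal sketch of that arithmetic, the Python below recomputes the 7.5 dpc row; the counts were tallied here from the 7.5 dpc column of Table 2.7, and the function name `stage_percentages` is a hypothetical helper rather than anything used in the original analysis.

```python
# Counts of 7.5 dpc classifications, tallied from Table 2.7 (45 genes in total).
N_GENES = 45
counts_7_5_dpc = {
    "Restricted": 9,
    "Widespread": 5,
    "Ubiquitous": 14,
    "Not detected": 7,
    "No result": 10,
}


def stage_percentages(counts, total):
    """Convert per-category gene counts into percentages of the total,
    rounded to one decimal place as in Table 2.8."""
    if sum(counts.values()) != total:
        raise ValueError("category counts do not add up to the number of genes")
    return {category: round(100 * n / total, 1) for category, n in counts.items()}


percentages_7_5_dpc = stage_percentages(counts_7_5_dpc, N_GENES)
for category, value in percentages_7_5_dpc.items():
    print(f"{category}: {value} %")
```

The same helper could be applied to the 8.5 dpc and 9.5 dpc columns once their classifications are tallied in the same way.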

2.3 Discussion

In this chapter, the expression profiling of cDNA libraries using microarray technology has been described. Gene expression profiling of mouse embryonic tissues and organs has previously proven a useful approach to identify genes pertinent to embryonic development (Dunwoodie and Beddington, 2002; Dunwoodie et al., 1997; Dunwoodie et al., 1998; Harrison et al., 1995; Harrison et al., 2000; Sewell et al., 2009). The aim of this work was to identify genes that were expressed in the heart during early mouse development. To this end, two analyses were undertaken, as a lack of positive results in Analysis 1 prompted alteration to the initial criteria to enhance the chances of success.

The two analyses identified some genes known to be expressed during heart development and genes with restricted expression in the developing embryo; however, they did not identify any novel genes expressed in the developing mouse heart. This could be due to the experimental design and starting material being inadequate for this purpose. Alternatively, the method of analysis may not have been sufficient or sophisticated enough to identify genes expressed in the cardiac precursors.

2.3.1 The experimental design of the microarrays

The expression profiling utilised established cDNA libraries from the mouse germ layers (mesoderm, endoderm and ectoderm) and a primitive streak fraction. It is possible that this source material was not appropriate for identifying genes expressed in the developing heart. This seems unlikely as these same libraries have previously been successfully utilised to identify genes with germ layer restricted expression by subtractive hybridisation and sequence clustering (Harrison et al., 1995; Sousa-Nunes et al., 2003). Genes relevant to embryonic development, such as Dll3, Sp6, and Phlda2, were identified using this approach (Dunwoodie and Beddington, 2002; Dunwoodie et al., 1997; Harrison et al., 2000), and Cited1, which is expressed in the developing heart, was also identified (Dunwoodie et al., 1998).

This chapter describes using microarray analysis to identify genes enriched or unique to the mouse germ layers or primitive streak fraction. However, the method used differs from typical microarray experiments in which RNA is directly extracted from the tissue of interest, labelled and hybridised to microarrayed oligonucleotides. To expression profile the germ layer libraries using microarray technology, the cDNA clones for each library were linearised and RNA transcribed, labelled and hybridised to the microarrayed oligonucleotides. In this step, a bias towards short transcripts may have been introduced, as these would be transcribed at a greater rate than longer transcripts. If this did occur, the microarrayed RNA would not be true to the cDNA libraries. This could be avoided by ensuring a linear RNA amplification. Despite this, confidence can be placed in the cDNA to RNA conversion as both analyses identified genes known to be enriched in particular germ layers and the primitive streak.

Dye swap experiments, in which each array is hybridised twice, with the dye (Cy3 and Cy5) assignment reversed in the second hybridisation, were also not performed. Dye swap experiments help to account for bias caused by one dye being taken up by RNA at a different rate to the other dye, as only gene expression profiles that are consistent between the two experiments are analysed (Dobbin et al., 2005; Martin-Magniette et al., 2005; Rosenzweig et al., 2004). To set up complete dye swap experiments for the germ layer microarrays, an additional six arrays would have been required, making it an expensive procedure. Additionally, dye swap experiments have been shown to be most effective when performed with biological replicates rather than as technical replicates (Dobbin et al., 2005; Rosenzweig et al., 2004). The creation of biological replicates was not feasible, as the dissection of 7.5 dpc embryos into their germ layer components is technically difficult and time consuming, and many embryos are required (Harrison et al., 1995). Across the six arrays performed, each germ layer and the primitive streak were labelled with the opposite dye in at least one array (Table 2.1), which would partially compensate for the lack of complete dye-swap experiments. Confidence in the validity of the arrays being true to the germ layer libraries was established by the identification of genes known to be expressed in a particular layer being enriched in that layer in the microarray data. As a result of the lack of replicates and of dye swap analysis, no statistical analysis was performed to assess the strength of a candidate gene of interest. Performing this type of statistical analysis may have reduced the likelihood of identifying only the most highly differential genes, and of selecting a high proportion of both false positive and false negative genes for examination by whole mount RNA in situ hybridisation.
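To make the rationale for dye swaps concrete, the sketch below shows one common way a forward and a dye-swapped hybridisation can be combined so that a constant dye bias cancels out. It is an illustration under stated assumptions, not a procedure used in this study (dye swaps were not performed here): the intensity values are invented, the bias is assumed to act as a constant multiplicative factor on the Cy5 channel, and the function name `dye_swap_estimate` is hypothetical.

```python
import math


def log_ratio(cy5, cy3):
    """log2 of the Cy5/Cy3 intensity ratio for one spot on one array."""
    return math.log2(cy5 / cy3)


def dye_swap_estimate(forward, swapped):
    """Combine a forward array (sample A in Cy5, sample B in Cy3) with its
    dye-swapped partner (sample A in Cy3, sample B in Cy5).

    Each argument is a (cy5, cy3) intensity pair. Returns the bias-corrected
    log2(A/B), the estimated dye bias, and whether the two arrays agree on
    the direction of the change."""
    m_forward = log_ratio(*forward)   # log2(A/B) plus dye bias
    m_swapped = log_ratio(*swapped)   # log2(B/A) plus dye bias
    log_fold = (m_forward - m_swapped) / 2   # the bias term cancels
    dye_bias = (m_forward + m_swapped) / 2   # the biological term cancels
    # A genuine change should reverse sign when the dyes are swapped.
    consistent = (m_forward > 0) != (m_swapped > 0)
    return log_fold, dye_bias, consistent


# Invented example: A is truly 4-fold higher than B (400 vs 100 units),
# and the Cy5 channel reads 2-fold too bright on both arrays.
forward = (800.0, 100.0)   # Cy5 = A x 2, Cy3 = B
swapped = (200.0, 400.0)   # Cy5 = B x 2, Cy3 = A

log_fold, dye_bias, consistent = dye_swap_estimate(forward, swapped)
print(2 ** log_fold, 2 ** dye_bias, consistent)
```

In this invented case the forward array alone would suggest an 8-fold difference, whereas combining the pair recovers the 4-fold difference and attributes the remaining 2-fold to the dye.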

The use of the germ layer cDNA libraries as the source material may not have been appropriate for the identification of heart genes. The cardiac progenitor cells are derived from the anterior mesoderm, thus the mesoderm was the focus for the identification of candidate genes. By profiling the mesoderm rather than profiling the developing heart directly, it was hoped that the earliest genes turned on in the cardiac program might be identified. The mesoderm was maintained as the tissue of interest in both Analysis 1 and Analysis 2 as the heart is derived from mesoderm (Kinder et al., 2001; Kinder et al., 1999; Lawson et al., 1991; Tam et al., 2001; Tam et al., 1997). In Analysis 1, genes enriched in the primitive streak were also considered since cells that ingress through the primitive streak form either mesoderm or endoderm and the primitive streak library also contains some mesoderm (Lawson et al., 1991). In Analysis 2, candidate genes enriched in a new subset of mesoderm enriched genes (Figure 2.4b) or enriched in both the mesoderm and endoderm (Figure 2.5b) were selected for examination by whole mount RNA in situ hybridisation. Genes enriched in both the mesoderm and endoderm were selected as the endoderm underlying the heart field is known to be an important source of signals for the developing heart, and genes expressed in both tissues, such as Cited2 and Gata4, have been shown to be important for heart development (Lough and Sugi, 2000; Molkentin et al., 1997; Rochais et al., 2009; Sugi et al., 1995; Weninger et al., 2005; Zhu et al., 1996).

One criticism of this approach would be that not all mesoderm contributes to the heart, and not all of the endoderm is important for signalling (only the foregut endoderm underlying the cardiac crescent). Rather than expression profiling the mesoderm, primitive streak and endoderm, a better approach may have been to directly expression profile the cardiac crescent and foregut endoderm to identify genes expressed in the early cardiac program. It would be extremely technically challenging to directly dissect out these tissues due to their size and the requirement for many embryos to obtain sufficient RNA. It was also felt unnecessary to go to such lengths since the cDNA libraries had previously been utilised to identify genes enriched in the germ layer derivatives including the heart (Dunwoodie and Beddington, 2002; Dunwoodie et al., 1997; Dunwoodie et al., 1998; Harrison et al., 1995; Harrison et al., 2000; Sousa-Nunes et al., 2003). Furthermore, Masino et al. (2004) utilised transgenically labelled cells expressing GFP under the control of Nkx2.5, one of the earliest known markers of the heart, and compared these by microarray to non-cardiac cells. A subset of genes was common between the germ layer microarrays and the Masino array. Some genes that were identified as enriched in cardiac versus non-cardiac tissue at cardiac crescent stage as judged by microarray analysis, such as Nedd4 family interacting protein 1 (Ndfip1), but not others, such as DEAD (Asp-Glu-Ala-Asp) box polypeptide 5 (Ddx5) and DENN/MADD domain containing 5A (Dennd5a), were expressed differentially at 7.5-8.5 dpc but had much broader expression by 9.5 dpc. However, Ndfip1 appeared to be in the head-folds rather than the cardiac crescent at 7.5 dpc, although this was not confirmed by sectioning. Genes with a restricted expression pattern in the developing heart were not identified in Analysis 2.

2.3.2 Microarray data analysis

The method of analysis of the microarray data may have been inappropriate. The two analyses differed with respect to raw data filtering, tissues selected and candidate gene selection (Figure 2.2 and Figure 2.4). In Analysis 1, the raw data was filtered such that the minimum raw data value was 50 in at least two of the six microarray chips which compared each of the germ layers to each other and the primitive streak. Genes enriched in the mesoderm or primitive streak were identified and candidate genes selected for examination by whole mount RNA in situ hybridisation (Figure 2.2). In Analysis 2, the raw data was filtered to specifically exclude genes with raw data values lower than 50 in either the mesoderm, or in mesoderm and endoderm (Figure 2.6). Interestingly, this raw data filtering approach identified more genes 3-fold enriched in the mesoderm compared to the approach taken in Analysis 1 (85 genes in Analysis 1 and 90 genes in Analysis 2). Nonetheless, the Analysis 2 raw data filter would likely be a better approach as it directly excluded genes expressed at low levels in the tissue(s) of interest. In both cases, the criteria set were not optimal for the identification of heart genes, and collaboration with a bioinformatics department may have assisted in this.
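The difference between the two raw data filters can be illustrated with a short Python sketch. It is a simplified reading of the rules described above, not the software used for the analyses: the intensity values are invented, the function names are hypothetical, and it is assumed that a single raw value per tissue is available for the Analysis 2 filter.

```python
RAW_THRESHOLD = 50  # raw data cut-off quoted in the text


def passes_analysis1(chip_values, threshold=RAW_THRESHOLD, min_chips=2):
    """Analysis 1 style filter: the raw value reaches the threshold on at
    least `min_chips` of the microarray chips, regardless of tissue."""
    return sum(value >= threshold for value in chip_values) >= min_chips


def passes_analysis2(tissue_values, tissues_of_interest, threshold=RAW_THRESHOLD):
    """Analysis 2 style filter: the raw value reaches the threshold in every
    tissue of interest (mesoderm alone, or mesoderm and endoderm)."""
    return all(tissue_values[tissue] >= threshold for tissue in tissues_of_interest)


# Invented gene that is bright only where ectoderm is on the chip.
ecto_gene_chips = [120, 95, 30, 22, 18, 40]
ecto_gene_tissues = {"mesoderm": 30, "endoderm": 22, "ectoderm": 120}

# Invented gene that is well expressed in both mesoderm and endoderm.
meso_endo_gene_tissues = {"mesoderm": 80, "endoderm": 75, "ectoderm": 15}

kept_by_analysis1 = passes_analysis1(ecto_gene_chips)
kept_by_analysis2 = passes_analysis2(ecto_gene_tissues, ("mesoderm",))
kept_meso_endo = passes_analysis2(meso_endo_gene_tissues, ("mesoderm", "endoderm"))
print(kept_by_analysis1, kept_by_analysis2, kept_meso_endo)
```

The first invented gene passes the chip-count rule but is removed by the tissue-specific rule, which is the sense in which the Analysis 2 filter directly excludes genes expressed at low levels in the tissue(s) of interest.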

Initial analysis of the data generated from the microarray experiments, using comparative fold-changes to identify enrichment in each tissue, suggested that the microarray data was true to the libraries. Genes known to have restricted expression in a particular germ layer or the primitive streak were enriched in these tissues in the microarray studies. For example, Phlda2 was enriched in the mesoderm, Afp was enriched in the endoderm, CrabpII was enriched in the ectoderm and Mixl1 was enriched in the primitive streak. This provided a level of confidence that the microarray studies, like the germ layer libraries, represented genuine embryo expression profiles. However, since sectioning was not performed, enrichment in a particular tissue (primitive streak, mesoderm, or mesoderm and endoderm) was not confirmed for the genes examined by whole mount RNA in situ hybridisation in this study. Sectioning of 7.5 dpc embryos would have demonstrated whether the genes identified as enriched in each particular tissue were in fact enriched in that tissue, and provided further confidence that the microarray studies did indeed represent genuine embryo expression profiles. Since the aim of this study was to identify genes expressed in the heart, rather than the expression profile of the germ layers per se, it was felt best to restrict the time and resource investment for sectioning of embryos to those genes that were identified for further study. Furthermore, genes known to be expressed in the developing heart, such as Hand1 and Foxc2, were enriched in the mesoderm in both Analysis 1 and Analysis 2, while Mixl1 was enriched in the primitive streak, and Cited2 and Cited1 were enriched in the mesoderm and endoderm.

The success of these analyses relied upon taking the quantitative information about gene expression from the microarrays without having an appreciation of the spatial distribution, which is the ultimate aim of these studies. In the two analyses, enrichment in the mesoderm, primitive streak or mesoderm and endoderm was judged by fold-changes, but the fold-change was assessed by examining the spatial distribution of candidate genes by whole mount RNA in situ hybridisation. This approach could lead to false positives where a relative fold-change is not reflected by the expression pattern, for example if the gene is ubiquitously expressed at levels detectable by whole mount RNA in situ hybridisation in two layers. Although the absolute levels may be different, whole mount RNA in situ hybridisation is qualitative, not quantitative. Alternatively, a gene may not be selected for examination if it is ubiquitously expressed in one germ layer but highly expressed in only a restricted region of another germ layer. Despite this, the subtractive hybridisation conducted with the cDNA libraries by Harrison et al. (1995) and other studies were based on the same quantitative differences as the microarrays, and did identify genes with differential expression by whole mount RNA in situ hybridisation (Dunwoodie and Beddington, 2002; Dunwoodie et al., 1997; Dunwoodie et al., 1998; Harrison et al., 1995; Harrison et al., 2000; Sousa-Nunes et al., 2003).

Another approach to improve the likelihood of candidate gene identification was to trim the list of genes enriched in the mesoderm, primitive streak or mesoderm and endoderm by high baseline expression to identify genes that were more likely to be specific rather than enriched. A preliminary investigation using high baseline expression was able to only identify very small numbers of genes enriched in any layer. Furthermore, it was felt that genes expressed at high baseline levels would likely have been already identified by other approaches, thus it would exclude novel genes. Indeed this appeared to be the case since most of the genes with high baseline expression levels identified as enriched in a particular germ layer were already described.

Previous studies utilising the cDNA libraries have identified genes with restricted gene expression and genes with restricted expression in the developing heart (Dunwoodie and Beddington, 2002; Dunwoodie et al., 1997; Harrison et al., 1995; Harrison et al., 2000; Sousa-Nunes et al., 2003; Weninger et al., 2005). Harrison et al. (1995) performed subtractive hybridisation, subtracting the mesoderm library from the endoderm library. They were able to identify genes known to be enriched in the endoderm, and also examined three novel genes by whole mount RNA in situ hybridisation (Harrison et al., 1995). The genes examined showed restricted expression in the endoderm. Another study utilising the germ layer libraries used sequence clustering to identify genes enriched in the endoderm cDNA library (Sousa-Nunes et al., 2003). In this study, 160 candidate genes were examined by whole mount RNA in situ hybridisation in the mouse from 6.5 dpc to 9.5 dpc. They found that 18 % of genes exhibited a restricted expression pattern in at least one stage examined (Sousa-Nunes et al., 2003). This is comparable to the percentage of genes identified as restricted in Analysis 2 (20 % at 7.5 dpc, 26.7 % at 8.5 dpc and 24.4 % at 9.5 dpc), but Analysis 1 identified more genes with restricted expression patterns at the stages examined (39 % at 7.5 dpc, 44 % at 8.5 dpc and 37.8 % at 9.5 dpc). Since characterisation of the expression pattern is subjective, it is difficult to directly compare these results.

Many tissues are derived from the mesoderm including the heart, somites and pre-somitic mesoderm and cranial mesoderm, and the genes examined reflected this, being expressed in a range of mesoderm derivatives. The majority of genes examined showed a restricted expression pattern in at least one embryonic stage examined (Tables 2.4, 2.5, 2.7 and 2.8), however none were expressed in a restricted manner in the developing heart. Genes that were selected as enriched in the primitive streak at 7.5 dpc did appear to be enriched in this tissue. It is not clear if the genes identified as being enriched in the mesoderm or enriched in both the mesoderm and endoderm were enriched specifically in these tissues as sectioning of 7.5 dpc embryos was not performed.

Genes with restricted expression were identified, but none were identified in the tissue of interest, the developing heart. For example, SPARC related modular calcium binding 1 (Smoc1) was identified as enriched in the mesoderm-endoderm compared to the ectoderm. Smoc1 had previously been shown to be expressed in the endodermal basement membrane and Reichert’s membrane at 7 dpc (Gersdorff et al., 2006). In this screen, Smoc1 was shown to be expressed in a restricted manner in the somites, PSM, prospective branchial arch region and forebrain at 8.5 dpc (Figure 2.5M). At 9.5 dpc, Smoc1 was detected in the branchial arches, limb bud, and in the fore-, mid- and hindbrain (Figure 2.5N). These tissues are derived from mesoderm (somites, limb buds, cranial mesenchyme) and endoderm (branchial arches), thus Smoc1 was expressed in tissues at 8.5 dpc and 9.5 dpc which are derived from the population in which it was enriched at 7.5 dpc. This demonstrates that genes with interesting expression patterns could be identified using the germ layer microarrays, and that genes enriched in particular germ layers at 7.5 dpc can also be expressed in the derivatives of those germ layers at later stages.

Recently, the expression pattern and function of Smoc1 have been examined by others from 9.5 dpc to 13.5 dpc (Okada et al., 2011). They found the same expression pattern as above at 9.5 dpc, and found that Smoc1 is expressed in the optic stalk at 10.5 dpc and in the developing limb buds from 9.5 dpc to 13.5 dpc. Using a gene trap mutation of Smoc1, they showed that Smoc1 is required for limb and ocular development (Okada et al., 2011).

In this study, only genes that had restricted expression in the heart by whole mount RNA in situ hybridisation at the stages examined would be considered for further investigation by generation of mice carrying mutations in the genes of interest. Restricted expression, however, does not necessarily determine functional relevance in development. Cited1 has a restricted expression pattern in the developing heart, and is also expressed in other tissues; however, targeted gene knockout of Cited1 in the mouse demonstrated that despite its promising expression pattern, it is not required for heart development, but rather has a role in placental development (Dunwoodie et al., 1998; Rodriguez et al., 2004). Additionally, genes with widespread expression patterns can have quite specific phenotypes when disrupted. For example, Cited2 is widely expressed at both 8.5 dpc and 9.5 dpc, and targeted knockout of Cited2 in the mouse results in quite specific heart defects and exencephaly (Barbera et al., 2002; Dunwoodie et al., 1998; Weninger et al., 2005). Since the creation of mouse lines carrying mutations in specific genes is an expensive and time-consuming process, and there was no supporting information to suggest a role for any of the genes examined by whole mount RNA in situ hybridisation, no genes from the microarray studies were selected for further investigation. It is possible that some genes examined, or genes that were identified as enriched in particular germ layers or the primitive streak but not selected for examination by whole mount RNA in situ hybridisation, may have important roles in development of the heart.

No genes were identified for further study using this approach. However, in another screen being undertaken in our laboratory, designed to identify genes involved in somitogenesis, Glutamine fructose-6-phosphate transaminase 2 (Gfpt2) was found to be expressed in a restricted manner in the developing heart at 9.5 dpc. Gfpt2 was not identified in either microarray analysis as being enriched in any particular germ layer or the primitive streak. Since the aim of this research was to identify genes that may be important in heart development and no appropriate candidates were identified in this chapter, Gfpt2 was selected for further investigation (Chapters 3-5). Chapter 3 describes the expression of Gfpt2 and its potential roles in development.

Chapter 3: Expression Pattern of glutamine fructose-6-phosphate transaminase 2

3.1 Introduction

Glutamine fructose-6-phosphate transaminase 2 (Gfpt2) was identified as a candidate gene involved in somitogenesis in a microarray screen in our laboratory (Sewell et al., 2009). The aim of this screen was to identify genes that are critically required for somitogenesis. Mutations of most members of the Notch pathway, such as Notch1, Delta-like 1 (Dll1), Lunatic Fringe (Lfng), mesoderm posterior 2 (Mesp2) and Hairy enhancer of split 7 (Hes7), lead to a severe decrease in expression of most Notch targets. However, the mutation of Delta-like 3 (Dll3) only leads to a reduction in the expression of some targets, although the phenotype of Dll3 mutant mice and humans is as severe as that of Lfng, Hes7 and Mesp2 mutants. Thus it was hypothesised that as yet unidentified genes critical for somitogenesis might be disrupted in the Dll3 mutants, and that the comparison of Dll3 null embryos with wildtype might uncover these genes (Sewell et al., 2009). Preliminary work by Duncan Sparrow and Wendy Chua showed that Gfpt2 was expressed in the developing 9.5 dpc mouse heart (Victor Chang Cardiac Research Institute, unpublished observations). Since the aim of this study was to identify genes that may be important in heart development, this gene was further investigated in this study.

To date, little information exists concerning Gfpt1 or Gfpt2 expression in humans and no information exists in the mouse. In this chapter, the expression patterns of Gfpt2 and Gfpt1 were examined by whole mount RNA in situ hybridisation in the mid-gestation mouse embryo, 7.5 dpc to 11.5 dpc, and in placental development.

3.1.1 Somitogenesis

Gfpt2 was identified in our laboratory in a screen looking for genes involved in somitogenesis. To understand the relevance of Gfpt2 expression during somitogenesis, a brief overview of somitogenesis is presented here.

Somitogenesis is the regular segmentation of the presomitic mesoderm (PSM), a domain of paraxial mesoderm in the tail bud of the embryo, into somites. Somites are balls of mesenchyme surrounded by an epithelial layer. Formed somites have defined rostral and caudal domains, but as they mature, they can be divided into three sections, the sclerotome, myotome and dermatome. The sclerotome gives rise to the ossified bones of the vertebrae and ribs, the myotome gives rise to skeletal muscle and the dermatome gives rise to the dermis of the back. A sub-compartment within the sclerotome, the syndetome, gives rise to the axial tendons (Brent et al., 2003).

In the mouse, somite formation begins at the end of gastrulation (approximately 7.75 dpc) with the formation of the cranial somites, and ends at approximately 13.5 dpc after the formation of between 63 and 65 somites. During somitogenesis, the PSM proliferates to form more paraxial cells at the caudal tip, compensating for the budding-off of somites from the rostral-most PSM. The regular formation of somites is required for proper patterning of the vertebral column, as evidenced by the vertebral patterning defects caused by mutations in genes involved in somitogenesis (Kusumi, 2007).

Somites form in pairs on each side of the notochord in a rostral to caudal fashion. The number of somites formed and the rate at which new somites bud off from the PSM are species dependent. The rhythmic formation of somites led to the proposal of the clock-wavefront model of somitogenesis (Cooke and Zeeman, 1976). This model proposes that cells in the PSM oscillate between a permissive and non-permissive state for somite formation. A wavefront for maturation moves in a posterior direction along the embryo. When PSM cells in the permissive phase pass by the wavefront, they mature into somites. This model has been modified and refined in the past 30 years to incorporate recent findings in somitogenesis.

The clock-wavefront model of somitogenesis is supported by the observation that some genes have periodic expression patterns (clock) in the PSM during somite formation and that the timing of this corresponds to the rate of somite formation. In the mouse, most of the cycling genes identified to date belong to the Notch signalling pathway, for example Hairy enhancer of split (Hes) family members, hairy/enhancer-of-split related with YRPW motif 2 (Hey2), and Lfng (Aulehla and Johnson, 1999; Bessho et al., 2001; Dequeant et al., 2006; Dequeant and Pourquie, 2008; Forsberg et al., 1998; Jouve et al., 2000; Leimeister et al., 2000; Niwa et al., 2007). Some Wnt (Axin2 and Naked cuticle 1 (Nkd1)) and FGF pathway genes (Sprouty 2, Dual specificity phosphatase 6, Dual specificity phosphatase 4 and Snail 1) exhibit cycling of expression in the PSM (Aulehla et al., 2003; Dale et al., 2006; Dequeant et al., 2006; Dequeant and Pourquie, 2008; Ishikawa et al., 2004).

The second part of the model (i.e. the wave) is also called the determination front. The determination front is currently defined as opposing gradients of FGF/Wnt and retinoic acid signalling (Dequeant and Pourquie, 2008) (Figure 3.1). During one cycle of somitogenesis, the determination front moves caudally a distance of a single somite. When PSM cells pass by the determination front, they are able to respond to the periodic signal from the clock (cycling PSM genes). In response to this signal cells located between the determination front and the caudal boundary of the previously formed somite activate Mesp2, in a stripe of expression, marking the next somite domain (Figure 3.1). Mesp2 then stabilises Lfng expression, which in turn inhibits Notch signalling in this region and generates an interface between cells activating and cells repressing Notch. This Notch interface marks the level of the following somite boundary (Dequeant and Pourquie, 2008; Morimoto et al., 2005).
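The relationship between the clock and the determination front described above can be illustrated with a toy calculation. This is a minimal sketch only and is not taken from the cited studies: the function name, the axis length, the wavefront speed and the clock period are all hypothetical values chosen for illustration. It assumes that the determination front regresses caudally at a constant speed and that one somite is allocated per clock cycle, so that somite length equals the distance travelled by the front during one cycle.

```python
def simulate_somite_boundaries(axis_length, wavefront_speed, clock_period):
    """Toy clock-wavefront model (all parameter values are illustrative assumptions).

    The determination front moves caudally at `wavefront_speed` (position units
    per time unit). Once per `clock_period`, the cells passed by the front since
    the previous cycle are grouped into one somite.
    Returns a list of (rostral, caudal) boundary positions, rostral-most first.
    """
    # Distance the front travels in one clock cycle = length of one somite.
    somite_length = wavefront_speed * clock_period
    boundaries = []
    rostral = 0.0
    # Keep allocating somites while a whole somite still fits on the axis.
    while rostral + somite_length <= axis_length + 1e-9:
        caudal = rostral + somite_length
        boundaries.append((rostral, caudal))
        rostral = caudal
    return boundaries


# Hypothetical example: same axis and front speed, two different clock periods.
slow_clock = simulate_somite_boundaries(axis_length=10.0, wavefront_speed=0.5, clock_period=2.0)
fast_clock = simulate_somite_boundaries(axis_length=10.0, wavefront_speed=0.5, clock_period=1.0)
print(len(slow_clock), len(fast_clock))
```

In this sketch, halving the clock period halves the somite length and doubles the number of somites formed along the same length of axis, which is the qualitative link between clock timing and somite size that the model implies.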

Thus, in somitogenesis there are genes expressed in morphogenic gradients and genes that cycle periodically with somite formation. The exact nature of the somitogenesis clock is still unclear at this stage. The mutation of many Notch pathway components, including Dll3, Mesp2, Lfng and Hes7, results in abnormal somite formation (Dunwoodie et al., 2002; Evrard et al., 1998; Hirata et al., 2004; Saga et al., 1997).

Figure 3.1 Clock-wavefront model of somitogenesis

This schematic represents two cycles of somitogenesis (2 new somites are formed). The opposing gradients of retinoic acid (green) and FGF/Wnt signalling (purple) represent the determination (wave) front (black line). The cycling clock genes are represented in brown and shown only on one side of the PSM in this schematic but are expressed symmetrically on both sides of the PSM. When the wave front crosses the cycling expression domain, Mesp2 is expressed (black box), defining the future segment domain. During the next cycle, Mesp2 expression becomes restricted to the anterior-most compartment of the next forming somite (grey stripe). Abbreviations: PSM presomitic mesoderm, T time in segmentation clock cycle units. Reproduced from Nature Reviews Genetics 9(5), Dequeant and Pourquie, “Segmental patterning of the vertebrate embryonic axis”, pages 370-82, copyright 2008, with permission from Nature Publishing Group.

3.1.2 The hexosamine biosynthesis pathway

The hexosamine biosynthesis pathway (HBP) plays a minor role in glucose metabolism (Figure 3.2). When glucose enters the cell, it is mainly metabolised via the glycolytic pathway (glycolysis). Glucose-6-phosphate is isomerised to fructose-6-phosphate, which can then be converted to fructose-1,6-bisphosphate, ultimately giving rise to two pyruvate molecules. These are either aerobically metabolised via the Krebs cycle, or anaerobically metabolised to lactic acid (Alberts et al., 1994). The HBP branches off from glycolysis and produces UDP-N-acetylglucosamine (UDP-GlcNAc) from fructose-6-phosphate via a series of reactions (Figure 3.2). Between 2 % and 5 % of fructose-6-phosphate is directed to the pathway (Zhivkov et al., 1975).

Glutamine fructose-6-phosphate transaminase (Gfpt) is the rate-limiting enzyme of the HBP. Gfpt is conserved from Escherichia coli to human. Gfpt proteins are class 2 (PurF) type glutamine amidotransferases. They contain a glutamine amidotransferase (GATase) domain at the amino terminus (N-terminus) of the protein. The GATase domain catalyses the removal of the ammonia group from glutamine and transfers this to a substrate to form a new carbon-nitrogen group (van den Heuvel et al., 2002). At the carboxyl terminus (C-terminus) are two sugar isomerase (SIS) domains. The SIS domain is a phosphosugar-binding domain found in many phosphosugar isomerases and phosphosugar binding proteins. SIS domains are also found in proteins that regulate the expression of genes involved in synthesis of phosphosugars, possibly by binding to the end product of the pathway (Bateman, 1999; Teplyakov et al., 1998). The GATase domain transfers an ammonia group from glutamine to fructose-6-phosphate, which is isomerised by the SIS domains, leading to the formation of glucosamine-6-phosphate (Denisot et al., 1991; Teplyakov et al., 2001; Teplyakov et al., 1999). Studies of protein structure and function in E. coli show that the Gfpt homolog, GlmS, homodimerises (Teplyakov et al., 2001).

3.1.3 N-acetylglucosamine and the importance of glycosylation

The production of GlcNAc is important for glycosylation. Multimers of UDP-GlcNAc can form glycosaminoglycans (GAGs). GAGs are important components of many connective tissues such as bone, cartilage and extracellular matrices (ECM) including cardiac jelly. Glycosylation is also important for cellular localisation of proteins, for example glycosylphosphatidylinositol (GPI)-anchors, which anchor proteins to the membrane of the cell or vesicles. Furthermore, addition of UDP-GlcNAc to proteins and lipids can regulate function and is analogous to phosphorylation in this regard (Hanover, 2001).

Figure 3.2 Glucose metabolism and HBP pathway

A) Glucose metabolism and the hexosamine biosynthesis pathway (dotted box). Glucose enters the cell and is converted to glucose-6-phosphate (glucose-6-P) by hexokinase. Glucose-6-P can enter the pentose pathway. Glucose-6-P is converted to fructose-6-phosphate (fructose-6-P). Fructose-6-P can continue along the glycolysis pathway to the Krebs cycle or can enter the hexosamine biosynthesis pathway (HBP). Gfpt converts fructose-6-P to glucosamine-6-phosphate (glucosamine-6-P) as the rate-limiting step of the HBP. Via a series of reactions, glucosamine-6-P is converted to UDP-N-acetylglucosamine. UDP-N-acetylglucosamine is a substrate for N-glycosylation and O-glycosylation of proteins. B) More detailed version of the HBP. Abbreviations: G6PD glucose-6-phosphate dehydrogenase, NADP Nicotinamide adenine dinucleotide phosphate, Gfpt Glutamine fructose-6-phosphate transaminase, Gpi-linker glycosylphosphatidylinositol-linker. A) Reproduced from Birth Defects Research Part A: Clinical and Molecular Teratology 70(8), Horal et al., “Activation of the hexosamine pathway causes oxidative stress and abnormal embryo gene expression: involvement in diabetic teratogenesis”, pages 519-27, copyright 2004, with permission from John Wiley and Sons. B) Reproduced from EMBO Journal 19(19), Boehmelt et al., “Decreased UDP-GlcNAc levels abrogate proliferation control in EMeg32-deficient cells”, pages 5092-104, copyright 2000, with permission from Nature Publishing Group.

The glycosylation of substrates such as proteins and lipids requires glycosyltransferases. With the exception of hyaluronic acid, which is synthesised by hyaluronic acid synthases, GAGs have a protein core and so also require glycosyltransferases for their formation. Studies on different glycosyltransferases have demonstrated the essential requirement of glycosylation and UDP-GlcNAc in many processes (Hanover, 2001; Slawson et al., 2005; Stanley, 2007). O-linked glycosylation catalysed by O-linked N-acetylglucosamine transferase (OGT) transfers a single N-acetylglucosamine (O-GlcNAc) molecule to serine/threonine residues. O-linked glycosylation has been shown to be reciprocal with phosphorylation of some proteins such as RNA polymerase II (Comer and Hart, 2001). Phosphorylation is known to modify the function of proteins, as does glycosylation; thus either phosphorylation or glycosylation at particular residues provides two levels of post-translational modification. Dynamic O-glycosylation of proteins during the cell cycle and the observation that overexpression of OGT resulted in defects in cytokinesis suggest that glycosylation is important for cell cycle regulation (Slawson et al., 2005). O-linked glycosylation has also been implicated in cellular stress, transcription, protein degradation and protein-protein interactions (Hanover, 2001; Love and Hanover, 2005; Slawson et al., 2006; Zachara and Hart, 2006).

The glycosylation of proteins has been shown to modulate their function. For example, protein O-fucosyltransferase 1 (Pofut1) adds O-fucose groups to Notch EGF-like repeat domains. Lfng elongates the O-fucose group by addition of UDP-GlcNAc. The glycosylation of the Notch receptor potentiates Notch signalling. Disruption of either of these genes in the mouse results in defects in somitogenesis (Evrard et al., 1998; Shi and Stanley, 2003). Another example of glycosylation in signalling involves hyaluronic acid. In addition to its role as an essential component of many extracellular matrices, hyaluronic acid acts as a signalling molecule activating ErbB2 and ErbB3 in cardiac cushion formation (Camenisch et al., 2002b).

3.1.4 Gfpt2

In mammals there are two Gfpt proteins encoded by separate genes. Gfpt2 was first identified in mouse and human by Oki et al. (1999), who found a human expressed sequence tag and identified it as a novel GFPT subtype by sequence similarity to GFPT1. GFPT1 had previously been identified in human and mouse (McKnight et al., 1992; Sayeski et al., 1994). The mouse Gfpt2 gene has 19 coding exons and produces a transcript of 2,947 base pairs. It is located on chromosome 11 of the mouse, and encodes a protein of 682 residues. Gfpt1 and Gfpt2 are highly homologous, sharing 74.7 % similarity at the protein level (Oki et al., 1999). Gfpt1 and Gfpt2 are functional glutamine fructose-6-phosphate transaminase enzymes (McKnight et al., 1992; Oki et al., 1999; Sayeski et al., 1994). The GATase domain of Gfpt2 corresponds to amino acids 2 to 264, and the SIS domains correspond to amino acids 363 to 495 and 534 to 668 (Figure 3.3).
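The domain coordinates given above can be checked with a short calculation. The sketch below is illustrative only: the residue ranges and the total length of 682 residues are taken from the text, while the function names and the convention of treating the ranges as 1-based and inclusive are assumptions made for this example.

```python
# Residue ranges for mouse Gfpt2 as stated in the text.
# Assumption: coordinates are 1-based and inclusive.
PROTEIN_LENGTH = 682
DOMAINS = [
    ("GATase", 2, 264),
    ("SIS 1", 363, 495),
    ("SIS 2", 534, 668),
]


def domain_lengths(domains):
    """Return {name: number of residues} for inclusive residue ranges."""
    return {name: end - start + 1 for name, start, end in domains}


def gaps_between(domains, protein_length):
    """Return the lengths of the regions not covered by any listed domain,
    ordered from the N-terminus to the C-terminus."""
    gaps = []
    previous_end = 0
    for _, start, end in domains:
        gaps.append(start - previous_end - 1)  # residues before this domain
        previous_end = end
    gaps.append(protein_length - previous_end)  # residues after the last domain
    return gaps


lengths = domain_lengths(DOMAINS)
gaps = gaps_between(DOMAINS, PROTEIN_LENGTH)
print(lengths)
print(gaps)
```

Under these assumptions, the domains and the intervening regions together account for all 682 residues, and the longest uncovered stretch is the spacer between the GATase domain and the first SIS domain.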

Gfpt1 and Gfpt2 expression has not previously been examined in the mouse embryo and no studies have examined Gfpt1 or Gfpt2 function in mouse embryonic development. Previous studies have been conducted on human adult tissue (Nerlich et al., 1998; Oki et al., 1999). The localisation of Gfpt has been examined in human adult tissue using an antibody directed against amino acids 666-681 of the human Gfpt protein (Nerlich et al., 1998). This study was conducted prior to the identification of Gfpt2, and the antibody used in the localisation studies binds to a region that is identical between Gfpt1 and Gfpt2, and thus would be unable to distinguish between the two proteins. Oki et al. (1999) examined Gfpt2 and Gfpt1 transcript localisation by Northern blot analysis on human tissues. They found that Gfpt1 and Gfpt2 were differentially expressed in the tissues examined. Gfpt2 but not Gfpt1 was detected in the brain, whereas Gfpt1 but not Gfpt2 was detected in the kidney and pancreas. Gfpt1 and Gfpt2 were both detected in the adult heart and placenta (Oki et al., 1999).

3.2 Gfpt2 expression in the mid-gestation embryo

Gfpt2 expression was examined by whole mount RNA in situ hybridisation in the developing mouse embryo from 7.5 dpc to 11.5 dpc, using a probe to the 3’ untranslated region (UTR) of the transcript. The probe was designed to the 3’ UTR to avoid any cross-reactivity with Gfpt1. No Gfpt2 expression was detected at 7.5 dpc. At 8.5 dpc, Gfpt2 was detected dorsal to the heart and at the junction between the PSM and allantois (Figure 3.4A). Stained embryos were embedded in wax and sectioned. Sectioning revealed that Gfpt2 was expressed in the foregut endoderm dorsal to the heart (Figure 3.4B and C).

Figure 3.3 Schematic of Gfpt2 intron/exon structure and protein domains

A) Schematic of intron/exon structure of Gfpt2. Exons are represented by vertical lines separated by introns. The short vertical line in the 1st exon is untranslated, as is the open-ended box at the 3’ end (right). Gfpt2 contains 19 exons and produces a transcript of 2,947 base pairs. The blue bar marks the approximate exons that encode the GATase domain. The green bars mark the approximate exons that encode the SIS domains. B) Schematic of protein structure with domains. Gfpt2 protein consists of a GATase domain corresponding to amino acids 2-264, a spacer region, then two SIS domains corresponding to amino acids 363-495 and 534-668 respectively. The total protein is 682 residues with an estimated molecular weight of 77 kDa.

Figure 3.4 Expression of Gfpt2 at 8.5 dpc

(A) Lateral view of an 8.5 dpc mouse embryo probed with Gfpt2, rostral to the right. Gfpt2 is expressed dorsal to the heart tube and where the presomitic mesoderm (PSM) meets the allantois. Careful examination of several embryos showed that this expression is not in the PSM. (B, C) Transverse sections of a whole mount RNA in situ hybridised embryo probed with Gfpt2, dorsal at top. Gfpt2 is expressed in the endoderm of the foregut, dorsal to the heart. Scale bar, 200 μm (A), 50 μm (B, C). Abbreviations: al allantois, fe foregut endoderm, fg foregut, h heart, nf neural fold, PSM presomitic mesoderm.

At 9.5 dpc, Gfpt2 was expressed in the atrioventricular canal (AVC) and outflow tract (OFT) of the heart. Expression was also detected in the forebrain and branchial arches. In approximately 50 % of embryos, expression was also detected as a stripe in the PSM (Figure 3.5A - C). Wax sections of stained embryos showed that Gfpt2 was expressed in the myocardium underlying the atrioventricular (AV) and OFT cushions (Figure 3.5D and E).

Expression of Gfpt2 was also examined at 10.5 dpc and 11.5 dpc by whole mount RNA in situ hybridisation (Figure 3.6). At 10.5 dpc, the strongest region of expression was in the optic vesicle (Figure 3.6A and B). Heart expression was largely lost by this stage and was no longer detected in the OFT or AVC. At 11.5 dpc, Gfpt2 may be expressed in the head region, but trapping of antibody and probe at this stage made it difficult to distinguish the true expression pattern (Figure 3.6C). Gfpt2 was detected in the PSM as a small stripe of expression at 11.5 dpc. At 10.5 dpc, PSM expression was not detected, but given that PSM expression is not detected in all embryos, and only 3 embryos were examined, this may be due to chance.
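The suggestion that the absence of PSM staining at 10.5 dpc may be due to chance can be made concrete with a simple binomial calculation. This is a sketch under stated assumptions, not an analysis performed in this study: it assumes that each 10.5 dpc embryo independently shows PSM staining with the same probability of approximately 0.5 observed at 9.5 dpc, and the function name is hypothetical.

```python
from math import comb


def prob_k_positive(n_embryos, k_positive, p_detect):
    """Binomial probability that exactly k of n embryos show PSM staining,
    assuming each embryo is independent with detection probability p_detect."""
    return (
        comb(n_embryos, k_positive)
        * p_detect ** k_positive
        * (1 - p_detect) ** (n_embryos - k_positive)
    )


# Assumption: detection probability of 0.5 per embryo, taken from the 9.5 dpc data.
p_none = prob_k_positive(n_embryos=3, k_positive=0, p_detect=0.5)
print(f"P(no PSM staining in 3 embryos) = {p_none:.3f}")
```

Under these assumptions, the probability of seeing no PSM staining in any of 3 embryos is 0.5 cubed, or 0.125, so a negative result from this sample size does not exclude PSM expression at 10.5 dpc.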

The coding region of mouse Gfpt1 has 71 % identity to the coding region of Gfpt2 by BlastN search (Altschul et al., 1997). There have been no studies on Gfpt1 expression in the mouse embryo. Given that Gfpt1 is functionally similar to Gfpt2, Gfpt1 expression was examined in the 9.5 dpc embryo using specific probes to both the 3’ UTR and the coding region of Gfpt1, but no expression was detected (results not shown). Although the coding regions of Gfpt1 and Gfpt2 are similar, the 3’ UTRs are distinct.

Figure 3.5 Expression of Gfpt2 at 9.5 dpc

(A, B) Lateral views of 9.5 dpc mouse embryos and (C) caudal part of an embryo with somites (top) and tail bud and presomitic mesoderm (bottom right), probed with Gfpt2. Gfpt2 is expressed in the atrioventricular canal (A) and outflow tract of the heart (B), with weaker expression in the forebrain and branchial arches (A, B), and presomitic mesoderm (A, B, higher magnification in C). Dashed circle in (C) represents last formed somite. (D, E) Wax sections of 9.5 dpc heart probed with Gfpt2. Gfpt2 is expressed in the myocardium underlying the endocardial cushions. Atrioventricular cushion (D) and outflow tract cushion (E). Scale bars, 200 μm (A, B), 125 μm (C), 100 μm (D, E). Abbreviations: a atrium, av atrioventricular, avc atrioventricular canal, ba branchial arch, ec endocardial cushion, ecd endocardium, fb forebrain, mc myocardium, mes mesenchymal cells, oft outflow tract, ov otic vesicle, psm presomitic mesoderm, v ventricle.

Figure 3.6 Expression of Gfpt2 at 10.5 dpc and 11.5 dpc

(A-C) Lateral views of 10.5 dpc (A, B) and 11.5 dpc (C) mouse embryos probed with Gfpt2. (D) Caudal region of the 11.5 dpc embryo shown in (C), caudal at bottom left, rostral at top right. Gfpt2 is expressed in the optic vesicle and otic vesicle at 10.5 dpc (A, B). At 11.5 dpc, Gfpt2 is possibly expressed in the head region (although this may be trapping) and is expressed in a weak band in the PSM (C, D). Line in (D) represents last formed somite boundary. Scale bars 60 μm (A, B), 90 μm (C), 45 μm (D). Abbreviations: ba branchial arches, fb forebrain, h heart, hb hind brain, lb limb bud, mb mid brain, ov optic vesicle, otv otic vesicle, PSM presomitic mesoderm.

3.3 Gfpt1 but not Gfpt2 is expressed in the developing placenta

Given that GFPT1 and GFPT2 are both expressed in the human placenta (Oki et al., 1999), Gfpt1 and Gfpt2 expression was examined using RNA in situ hybridisation on cryosectioned mouse placentas. Gfpt2 expression was not detected in 9.5 dpc, 11.5 dpc or 15.5 dpc placentas (results not shown). In contrast, Gfpt1 was expressed in the decidua and ectoplacental cone portions of the 9.5 dpc placenta (Figure 3.7A and B). Examination under higher magnification additionally showed expression in the allantoic mesenchyme. At 11.5 dpc, there was weak expression in the labyrinthine layer, with strong expression in the spongiotrophoblast layer and trophoblast giant cells within this layer (Figure 3.7C and D). At 15.5 dpc, expression was restricted to the spongiotrophoblast layer (Figure 3.7E and F).

Interestingly, Northern blot analysis of GFPT1 and GFPT2 expression in human tissues by Oki et al. (1999) detected both GFPT1 and GFPT2 in the placenta. Here Gfpt1, but not Gfpt2, was detected in mouse placentas. This may represent a difference in regulation of Gfpt1 and Gfpt2 between mouse and human.

3.4 Cell localisation by tagged expression construct

Nerlich et al. (1998) showed by immunohistochemistry that human Gfpt is localised to the cytoplasm in most human tissues. The antibody was designed prior to the identification of GFPT2 and is expected to detect both GFPT1 and GFPT2, as the epitope is a region identical in both proteins. For this reason, the localisation of Gfpt2 was examined in C2C12 cells, a mouse muscle satellite cell line. Gfpt2 was HA-tagged at the N-terminus (5’ end of the coding sequence) and cells were transiently transfected with the tagged construct (HA-Gfpt2 pCMX.PL2). After 24 hours, the cells were fixed and transfected cells were visualised by immunohistochemistry using a mouse monoclonal anti-HA antibody. In agreement with Nerlich et al. (1998), Gfpt2 is localised to the cytoplasm of C2C12 cells (Figure 3.8). No images of the negative control were taken at the time this experiment was conducted, and no nuclear staining, such as DAPI (4',6-diamidino-2-phenylindole), was performed. The staining observed was relatively weak but clear, and given that it agreed with the localisation previously observed for human Gfpt, DAPI staining was not considered necessary.

Figure 3.7 Gfpt1 expression in placental sections

Transverse sections of mouse placental cryosections probed with Gfpt1 (A-F) or Prl3b1 (G, H). (A, B) 9.5 dpc, (C, D) 11.5 dpc, (E - H) 15.5 dpc. At 9.5 dpc (A, B) Gfpt1 is expressed at the placental/ectoplacental cone boundary in the chorionic ectoderm and in the decidua. At 11.5 dpc (C, D) and 15.5 dpc (E, F), Gfpt1 is expressed in the spongiotrophoblast layer. At 11.5 dpc, Gfpt1 is also expressed in some cells in the labyrinthine layer (C, D), but this expression is no longer present at 15.5 dpc. Prl3b1 marks a subset of secondary trophoblast giant cells in the spongiotrophoblast and labyrinthine layers, and is shown for comparison. B, D, F and H are magnifications of the boxed regions in A, C, E and G respectively. Scale bar 600 μm (A, C, G), 475 μm (E), 150 μm (B, D, F), 60 μm (H). Abbreviations: al-ch allantois mesoderm/chorionic ectoderm boundary area, dec decidua, EPC ectoplacental cone, LL labyrinthine layer, STB spongiotrophoblast layer.

Figure 3.8 Gfpt2 is localised to the cytoplasm of C2C12 cells

C2C12 cells transiently transfected with HA-tagged Gfpt2 detected by immunohistochemistry (red). HA-tagged Gfpt2 is localised to the cytoplasm of cells (A). In some cells, there are dots of expression (B). This may be due to a high level of expression in these cells. Scale bar: 20 μm.

3.5 Discussion

In this chapter, the expression pattern of Gfpt2 was examined in the developing mouse embryo. In general, Gfpt2 expression was weak and restricted with regard to tissue type and developmental period. Gfpt2 was expressed in the foregut endoderm, dorsal to the heart, at 8.5 dpc. At 9.5 dpc, Gfpt2 was expressed in the myocardium underlying the endocardial cushions. The heart expression was absent by 10.5 dpc. At 10.5 dpc, Gfpt2 was expressed in the optic vesicle. At 9.5 dpc and 11.5 dpc, Gfpt2 was also detected in the PSM of some embryos (50 %). At 11.5 dpc, Gfpt2 was possibly expressed in the head region. Since nothing was known regarding the expression of Gfpt1 in the mouse embryo, Gfpt1 expression was also examined, but was not detected in the mouse embryo from 7.5 dpc-9.5 dpc.

The whole mount RNA in situ hybridisation at 10.5 dpc and 11.5 dpc could have been more informative. The trapping of probe in whole embryos is a common complication at these stages and could have been avoided or alleviated by sectioning the examined embryos (as was performed at 8.5 dpc and 9.5 dpc), or by performing in situ hybridisation on cryosectioned embryos as was performed for the placentas. Despite these issues, it was possible to draw conclusions for the two tissues of most interest to this study, the heart and the PSM: Gfpt2 expression was absent from the heart at 10.5 dpc and 11.5 dpc, and was present in the PSM of the 11.5 dpc embryo in the same on/off fashion as shown at 9.5 dpc.

Gfpt is the rate-limiting enzyme of the HBP, which produces GlcNAc. As discussed at the beginning of this chapter, the production of GlcNAc is important for glycosylation. Multimers of UDP-GlcNAc can form glycosaminoglycans (GAGs). Glycosylation of proteins has been shown to modulate function and be involved in signalling in a variety of contexts, including during heart development and somitogenesis (Allen and Rapraeger, 2003; Evrard et al., 1998; Hanover, 2001; Love and Hanover, 2005; Shi and Stanley, 2003; Zachara and Hart, 2006). Taking together the known function of Gfpt and the observed expression of Gfpt2 in the mouse embryo, the following hypotheses were made regarding the function of Gfpt2 in the mouse.

Gfpt2 is expressed in the foregut endoderm at the appropriate time to influence signalling from the foregut endoderm to the overlying mesoderm of the secondary heart field (SHF). The foregut endoderm is known to be an important source of signals for the developing heart (Arai et al., 1997; Chen and Fishman, 2000; Lough and Sugi, 2000). GlcNAc is known to modify FGF signalling; FGF signalling from the foregut endoderm to the overlying mesoderm is known to be important in heart development (Allen and Rapraeger, 2003; Kelly et al., 2001; Park et al., 2008; Zhu et al., 1996). Thus, flux through the HBP might be important in the interaction between the foregut endoderm and the surrounding mesoderm of the SHF.

At 9.5 dpc, Gfpt2 expression is restricted to the myocardium of the AVC and OFT cushions. The cardiac cushions are important for cardiac valve development; defects in cardiac cushion formation can result in both valve and septal defects (de Lange et al., 2004; Eisenberg and Markwald, 1995; Person et al., 2005) (see Chapter 1). The cardiac cushions are swellings of extracellular matrix, secreted by the underlying myocardium and containing molecules including GAGs, that form at approximately 9-9.5 dpc. Cellularisation of the cushions occurs when the myocardium underlying the cardiac cushions signals to the endocardium to initiate epithelial to mesenchymal transition (EMT) (Armstrong and Bischoff, 2004; Barnett and Desgrosellier, 2003) (see Chapter 1). The temporally and spatially restricted expression of Gfpt2 in the myocardium underlying the cardiac cushions suggests that Gfpt2 might be required either for GAG production to make the cardiac cushion swellings, or for the initiation of EMT to cellularise the cushions. The HBP produces GlcNAc, which is an essential component of the GAGs that make up the cardiac cushion extracellular matrix (Little and Rongish, 1995). The glycosylation of proteins has previously been demonstrated to modulate their functions (Hanover, 2001; Zachara and Hart, 2006), suggesting that production of UDP-GlcNAc by the HBP could function in signalling to initiate EMT.

Finally, Gfpt2 is also detected in the PSM of approximately 50 % of embryos at 9.5 dpc and later stages. The expression of Gfpt2 in only a proportion of embryos suggests that Gfpt2 expression may cycle in this tissue (Dequeant et al., 2006; Dequeant and Pourquie, 2008). Another gene that cycles in the PSM is Lfng (Dunwoodie, 2009; Evrard et al., 1998). Lfng is a glycosyltransferase that glycosylates Notch and alters Notch signalling (Dunwoodie, 2009). Thus, flux through the HBP may be important to produce sufficient UDP-GlcNAc for Notch glycosylation.
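The reasoning that staining in only a proportion of embryos is consistent with cycling expression can be illustrated with a toy calculation. This is a sketch under stated assumptions and not a model of Gfpt2 itself: it assumes a sinusoidal transcript level scaled between 0 and 1, a fixed detection threshold for in situ hybridisation, and embryos fixed at phases spread evenly across the clock cycle. The function name and all values are hypothetical.

```python
import math


def detected_fraction(threshold, n_phases=10000):
    """Fraction of evenly spaced clock phases at which a sinusoidal transcript
    level (scaled to the range 0..1) exceeds the detection threshold.

    Assumption: embryos are collected at phases spread evenly over the cycle,
    so this fraction approximates the proportion of embryos that would stain.
    """
    detected = 0
    for i in range(n_phases):
        # Sample the midpoint of each of n_phases equal slices of one cycle.
        phase = 2 * math.pi * (i + 0.5) / n_phases
        level = 0.5 * (1 + math.sin(phase))
        if level > threshold:
            detected += 1
    return detected / n_phases


half_max = detected_fraction(threshold=0.5)
stricter = detected_fraction(threshold=0.75)
print(half_max, stricter)
```

In this sketch, a threshold at half-maximal expression gives staining in half of the sampled phases, and a higher threshold gives staining in a smaller fraction. A figure of approximately 50 % is therefore compatible with an oscillating transcript, although it does not by itself demonstrate cycling.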

These hypotheses will be tested in the mouse by generating mice containing gene trap insertions, which disrupt Gfpt2 function. The characterisation and generation of mice carrying the gene trap insertions is described in Chapter 4, and the functional effects of the gene trap insertions are described in Chapter 5.

Chapter 4: Generation of Gfpt2 gene-trapped mouse lines

4.1 Introduction

In order to determine the function of a gene in embryonic development, one can delete it from the genome, reduce its expression or overexpress the gene. In this study, the aim was to disrupt the expression of the gene Glutamine fructose-6-phosphate transaminase 2 (Gfpt2), to determine the functional consequences of loss of its expression during mouse development. In the mouse, there are two different approaches that can be taken to disrupt gene expression: targeted or random. Gene targeting is a genetic technique that uses homologous recombination, that is, recombination between similar DNA sequences, to alter a gene. This approach can be used to delete a whole gene or to remove specific exons. Gene targeting can also add a gene or introduce point mutations, and can be designed to be conditional, to turn on or turn off gene expression in particular tissues or at particular developmental time points. Gene targeting requires the design of a specific vector for each gene of interest. Gene trapping is the random insertion of a gene trap cassette into the genome. In this approach, the same vector can insert into any gene; however, since integration is random, the effect of the insertion is dependent on precisely where in the gene the insertion occurs. Here, embryonic stem (ES) cell clones heterozygous for gene traps inserted into Gfpt2 have been obtained, and the mapping of the insertion points and the generation of mouse lines from these clones are described in this chapter.

4.1.1 Gene trapping

Gene trapping is the random insertion of a gene trap cassette into the genome. Generally, the cassette consists of a splice acceptor site and a reporter gene, followed by a polyadenylation sequence (polyA) (Figure 4.1A). The gene trap cassette can be introduced into ES cells by either electroporation or retroviral infection. In many gene trap cassettes there is also a second selective marker, e.g. the neomycin resistance (Neo) gene, under the control of a constitutive promoter such as phosphoglycerate kinase 1 (PGK) to enable selection of transfected cells. In this case, Neo is expressed in every cell that has been transfected, irrespective of whether the gene trap itself is expressed and of the insertion point in the genome. There are different types of gene trap cassettes that enable selection for integration of the gene trap into either exons or introns of genes. The ES cells can then be selected for the incorporation of the gene trap into the genome, and the insertion point can be mapped. Those insertions that map to the introns or exons of genes can then be expanded and ES cell clones stored, and/or injected into blastocysts to generate mouse lines carrying the gene trapped allele.

Figure 4.1 Schematic of gene trap vectors

Gene trap vectors and how they are hypothetically inserted into the genome. A) Promoter trap. B) Modified polyA trap, pUPA. In the case of the promoter trap vector (A), the gene trap vector is incorporated into the gene transcript as an exon and contains a polyA sequence, resulting in the remainder of the transcript (everything 3’ to the insertion point) no longer being transcribed. The resultant protein is a fusion protein between any 5’ transcribed exons and the gene trap reporter (β-gal and NeoR). In the case of the polyA trap vector (B), the gene trap insertion results in two messages. The first transcript includes the exons 5’ to the insertion and the reporter (in this case, GFP), and the second transcript consists of Neo and the remaining exons 3’ to the insertion point. Two proteins result from the first transcript, the 5’ exons and GFP (the reporter). In this example, the typical polyA type vector has been modified to also include an IRES flanked by loxP sites (triangles) after the Neo gene. Therefore, two proteins could be translated from the second transcript, NeoR and the remainder of the gene 3’ to the transgene insertion point. Abbreviations: β-geo fusion protein encoded by lacZ, encoding β-gal (beta-galactosidase), and the neomycin resistance gene, dEN enhancer deletion (deletion of viral enhancers from the vector that may have impacted positively or negatively on the expression of the gene into which the gene trap has inserted), IRES internal ribosome entry site, 5’LTR 5’ long terminal repeat, NeoR neomycin resistance gene, pA poly-adenylation sequence, PGK phosphoglycerate kinase 1 promoter sequence, SA splice acceptor site, SD splice donor site. The reporter in B) is enhanced green fluorescent protein (EGFP). Triangles represent loxP sites. Figure reproduced with minor adaptation with permission from http://www.cmhd.ca/genetrap/index.html (June, 2009).

There are two types of gene trap vector that are commonly used, promoter trap and PolyA trap. Promoter trap vectors may or may not contain a splice acceptor (SA) site. Without a SA site, the promoter trap can only be detected if it is incorporated directly into the exons of a gene as otherwise the reporter gene would not be expressed, whereas inclusion of the SA site allows gene trap cassettes inserted into introns to be expressed (Figure 4.1A). In either case, the promoter trap cassette comes under the control of the endogenous gene into which it has inserted. Incorporation of the promoter trap cassette into the genome can create null alleles of trapped genes (Chen et al., 1994; Friedrich and Soriano, 1991). One disadvantage of the promoter trap approach to gene trapping is that only genes expressed in the ES cells can be identified.

PolyA trap cassettes consist of the same elements as the promoter trap, but additionally contain a selective marker, e.g. Neo, without a polyA sequence and under the control of a constitutive promoter, followed by a splice donor (SD) site. This allows the polyA trap to be incorporated into the transcript as an exogenous exon (Figure 4.1B). Genes not expressed in ES cells are able to be trapped, as they can still be selected due to the selective marker having its own promoter (Niwa et al., 1993; Salminen et al., 1998; Yoshida et al., 1995). One problem with this approach, however, is the reduced likelihood of creating null alleles, due to a bias towards polyA trap cassettes being incorporated into the 3’ end of genes. This is thought to be caused by nonsense mediated decay (NMD) of the selective marker transcript, which occurs in eukaryotes when there is a termination codon more than 55 base pairs (bp) upstream of the final splice junction site (Nagy and Maquat, 1998).
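The position rule for NMD described above can be illustrated with a minimal sketch. It is not part of the original analysis: the function name, the use of mRNA coordinates and the example positions are assumptions made purely for illustration, and only the 55 bp threshold is taken from the text (Nagy and Maquat, 1998).

```python
from typing import Sequence


def predicts_nmd(stop_codon_end: int,
                 exon_junctions: Sequence[int],
                 threshold_bp: int = 55) -> bool:
    """Apply the simplified positional rule for nonsense mediated decay.

    stop_codon_end: position (in spliced mRNA coordinates) of the last
        base of the termination codon. Hypothetical input.
    exon_junctions: positions of exon-exon junctions in the same
        coordinates. Hypothetical input.
    threshold_bp: distance upstream of the final junction beyond which
        the transcript is predicted to be degraded (55 bp in the text).
    """
    if not exon_junctions:
        # No splice junction, so the rule cannot be triggered.
        return False
    final_junction = max(exon_junctions)
    return (final_junction - stop_codon_end) > threshold_bp


# Illustrative positions only (not measured from any vector or gene).
# Stop codon 400 bp upstream of the final junction: predicted NMD.
print(predicts_nmd(800, [300, 1200]))
# Stop codon 20 bp upstream of the final junction: predicted to escape.
print(predicts_nmd(1180, [300, 1200]))
```

Under this simplified rule, an insertion that leaves further exon-exon junctions well downstream of the selective marker stop codon would be flagged, which is in keeping with the proposed explanation for the 3’ insertion bias.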

To investigate the function of Gfpt2 in mouse development, two mouse embryonic stem (ES) cell clones heterozygous for gene traps inserted into Gfpt2, 352F9 and 305A09, were obtained from the Centre for Modeling Human Disease (CMHD), Toronto (To et al., 2004). Gene trapped ES cells were selected rather than targeted knockout of the gene in ES cells, as it was felt that this approach would progress more rapidly than generating a knockout line in house, and the gene trapped lines for Gfpt2 were available. The aim of the CMHD project is to generate gene trap insertions to develop a library of mutagenised mouse ES cells for the scientific community. The CMHD is one of several institutes involved in the International Gene Trap Resource (IGTR). The intent of the IGTR is to create mutagenic gene trap alleles in every gene in the mouse genome and make these readily available to researchers across the world.

Both clones obtained for Gfpt2 from CMHD were trapped using the UPA trap vector (Figure 4.1B) (Shigeoka et al., 2005). The UPA trap vector is a modified polyA type, RET vector (Ishida and Leder, 1999), and contains three internal ribosome entry sites (IRES) flanked by loxP sites downstream of the stop codon for the selectable marker. The addition of the IRES enables the remainder of the transcript to be translated and appears to overcome the 3’ insertion bias of polyA trap vectors (Shigeoka et al., 2005), presumably by overcoming or inhibiting NMD.

4.2 Mapping the gene-trap insertions

To ensure that the obtained ES cell clones did contain a gene trap insertion in Gfpt2 and to pinpoint the location of the insertion, cDNA and genomic mapping were performed. 3’ rapid amplification of cDNA ends by PCR (3’RACE), which can amplify the sequence of an RNA transcript from a small known sequence, followed by cDNA sequencing was performed at CMHD and indicated that both gene trap insertions were approximately located in intron 5-6. Upon receiving the gene trap ES cell clones, cDNA was made from RNA extracted from each clone. RT-PCR was performed using an internal primer in the Neo sequence of the gene trap (p711) and a primer in a predicted downstream exon (exon 7, p712). This yielded a PCR product from the Neo sequence of the gene trap to exon 7 of Gfpt2 that was 1078 base pairs (bp) for the 352F9 clone (Figure 4.2). In the case of the 305A09 clone, RT-PCR was also performed between Neo and exon 5 (p730), producing a PCR product of 966 bp (Figure 4.2). The PCR product for each ES cell line was cloned into the pGemT vector and sequenced. This unexpectedly revealed that the 352F9 clone insertion point was in intron 4-5 and the 305A09 clone insertion point was located in intron 3-4 (Figure 4.2, 4.3). At the protein level, this corresponds to the insertion being after amino acids 1-114 (exon 1 to exon 4) for the 352F9 insertion and after amino acids 1-71 (exons 1-3) for the 305A09 insertion. This places both gene trap insertions in the glutamine amidotransferase (GATase) domain, and disruptions at these points would be expected to interfere with the binding of fructose-6-phosphate and with the transfer of the ammonia group from glutamine (Figure 4.3).

Figure 4.2 Mapping of the gene trap insertions

[Panels: A, pUPA gene trap insert; B, cDNA primer schematic and RT-PCR gel; C, gDNA primer schematic and PCR gel. In the gDNA schematic, the intron between exon 4 and exon 5 is labelled 1.8 kb (352F9) and the intron between exon 3 and exon 4 is labelled 1.24 kb (305A09).]

A) Schematic of pUPA gene trap insert. B) RT-PCR of heterozygous ES cells for the 352F9 (lane 1) and 305A09 (lane 2) gene trap insertion. Lane 1, amplification using a Neo primer and a Gfpt2 exon 7 primer (size 1078 bp). Lane 2, amplification using a Neo primer and a Gfpt2 exon 5 primer (size 966 bp). C) PCR of genomic DNA from heterozygous 352F9 mice (lane 1) or ES cells (lane 2), and heterozygous 305A09 ES cells (lane 3) or mice (lane 4). Lane 1, amplification of the 5’ insertion point of the 352F9 gene trap insertion using a primer in Gfpt2 intron 4-5 and a primer 5’ to the SA site in the UPA trap vector. Lane 2, amplification of the 3’ end of the 352F9 gene trap insertion using a primer 3’ to the SD site in the UPA trap vector and a primer in Gfpt2 intron 4-5. Lane 3, amplification of the 5’ insertion point of the 305A09 gene trap insertion using a primer in Gfpt2 intron 3-4 and a primer 5’ to the SA site in the UPA trap vector. Lane 4, amplification of the 3’ end of the 305A09 gene trap insertion using a primer 3’ to the SD site in the UPA trap vector and a primer in Gfpt2 exon 4. Abbreviations: Bcl2 SA splice acceptor site from the Bcl2 gene, bp base pairs, EGFP enhanced green fluorescent protein gene, HPR IRES internal ribosome entry site, LTR long terminal repeat, Neo neomycin resistance gene, M marker lane, 500 bp marker (each band represents 500 bp), pA poly-adenylation sequence, RNAPII RNA polymerase II. Triangles represent loxP sites. Numbers to the side of each gel correspond to the number of base pairs in the marker lane. The red arrow points to the relevant band in C) lane 2, as demonstrated by sequencing. Approximate primer sites are represented by black arrow heads.

Having identified the intron location of the gene trap in each clone, the precise genomic location needed to be determined to facilitate genotyping. DNA was isolated from ES cells or the resulting mouse lines (see section 4.3), and genomic mapping was performed. For the 352F9 clone, primers from within Gfpt2 intron 4-5 (p807 and p800) were used in conjunction with primers at each end of the gene trap cassette (p724 and p722) to map both the 5’ and 3’ ends of the insertion point. In the 352F9 clone, the gene trap has inserted 560 bp downstream of Gfpt2 exon 4 in intron 4-5 (Figure 4.2, 4.3). For the 305A09 clone, primers in Gfpt2 intron 3-4 (p801) and exon 4 (p731) were used in conjunction with a primer at each end of the gene trap cassette (p724 and p729) to map the insertion point. In the 305A09 clone, the gene trap has inserted 630 bp downstream of exon 3 in intron 3-4 (Figure 4.2). Sequencing was performed to confirm the genomic integration sites.

4.3 Generation of mice carrying the gene-trap alleles

ES cells carrying the gene trap alleles were injected into blastocysts by Natalie Wise to generate chimeric mice. The gene trap cassettes had been transfected into R1 ES cells, which were created from an F1 cross of 129X1/SvJ x 129S1 (Nagy A., 2002). For each clone, the ES cells were injected into C57BL/6J blastocysts. Each cell in a chimeric mouse can derive from either the host blastocyst or the R1 ES cells. The C57BL/6J blastocysts and R1 ES cells carry different coat colour markers, C/c; P/p; Aw/Aw for R1 cells and C/C; P/P; a/a for the C57BL/6J host blastocysts. Thus, coat colour acts as a readout for the relative contribution of the ES cells to the chimera. The Aw allele is dominant over the a (black) allele, so in this case, agouti coat colour represents cells derived from the R1 ES cells. For generation of mouse lines, some of the germ line cells of the chimera must be derived from the ES cells and thus carry the gene trap allele.

Figure 4.3 Schematic of Gfpt2 intron/exon structure and protein domains

A) Schematic of intron/exon structure of Gfpt2. Exons are represented by vertical lines separated by introns. The short vertical line in the 1st exon is untranslated, as is the open-ended box at the 3’ end (right). Gfpt2 contains 19 exons encoded for by 2,947 base pairs. The blue bar represents the approximate exons that encode the GATase domain. Green bars represent the approximate exons that encode the SIS domains. Red triangles represent the approximate location of the Gfpt2GtA09d1 insertion. Black triangles represent the approximate location of the Gfpt2GtF9d1 insertion. B) Schematic of protein structure with domains. Gfpt2 protein consists of a GATase domain corresponding to amino acids 2-264, a spacer region, then two SIS domains corresponding to amino acids 363-495 and 534-668 respectively. The total protein is 682 residues.

Nine chimeras were obtained for the 352F9 clone, and three were obtained for the 305A09 clone (Table 4.1). Chimeric males were mated to C57BL/6J females. If the injected ES cells have contributed to the germ line (sperm cells) then mice heterozygous for the gene trap allele can be obtained. For the 352F9 clone, male mouse numbers 1, 2, 4, 5, 6, 9 and 11 produced offspring heterozygous for the gene trap when mated to C57BL/6J females (Table 4.1). The mouse line created was named Gfpt2Gt(CMHD-GT_352F9-3)Cmhd (Gfpt2GtF9). For the 305A09 clone, chimera male numbers 4, 7 and 8 were mated to C57BL/6J females and offspring heterozygous for the gene trap were obtained (Table 4.1). The resultant mouse line was named Gfpt2Gt(CMHD-GT_305A09-3)Cmhd (Gfpt2GtA09).

4.4 Creation of the IRES-deleted gene trap lines

The insertion of the UPA gene trap vector into Gfpt2 may not generate a functionally null allele. The presence of the IRES element in the pUPA gene trap cassette overcomes the 3’ insertional bias that occurs with polyA trap vectors. It is presumed that this is achieved by evading the normal NMD caused by premature stop codons (Shigeoka et al., 2005). As a result, for both the Gfpt2GtF9 and the Gfpt2GtA09 alleles, the entire Gfpt2 transcript may be made, albeit disrupted by the insertion of the UPA vector. In the case of the Gfpt2GtF9 allele, Gfpt2 exons 1-4 and enhanced green fluorescent protein (EGFP), the reporter for the pUPA vector, could be transcribed and translated from the endogenous Gfpt2 promoter, and Neo and the remainder of Gfpt2 would be transcribed and translated from the constitutively active PGK promoter (refer to Figure 4.1B for a basic schematic of the transcripts obtained from the pUPA trap vector). Similarly, in the case of the Gfpt2GtA09 mouse line, Gfpt2 exons 1-3 and the gene trap reporter would be transcribed and translated under the endogenous promoter, and Neo and Gfpt2 exons 4-19 under the control of the constitutive promoter. In both cases the first few exons and EGFP would be expressed following the endogenous expression pattern, and Neo and the remainder of Gfpt2 would be expressed constitutively.

Table 4.1 Percent chimerism of mice generated from 352F9 and 305A09 ES cell clones

Blastocysts were injected with ES cells heterozygous for each clone. Unshaded chimeras were mated to C57BL/6J females to generate mice heterozygous for the gene trap alleles.

| Mouse number | Percentage chimerism, 352F9 clone | Percentage chimerism, 305A09 clone |
|---|---|---|
| 1 | 100 | 0 (black) |
| 2 | 80 | 0 (black) |
| 3 | 85 | 90 |
| 4 | 90 | 80 |
| 5 | 90 | 0 (black) |
| 6 | 90 | 0 (black) |
| 7 | 0 (black) | 85 |
| 8 | 0 (black) | 95 |
| 9 | 90 | 0 (black) |
| 10 | 90 | 0 (black) |
| 11 | 90 | 0 (black) |
| 12 | 0 (black) | 0 (black) |
| 13 | 0 (black) | 0 (black) |
| 14 | 0 (black) | 0 (black) |

Alleles that are more likely to be functionally null can be created by excising the IRES sequence with Cre recombinase (Cre). To achieve this, heterozygous Gfpt2GtF9/+ and Gfpt2GtA09/+ female mice were mated to Cre deleter (Tg(CMV-cre)1Cgn) males (Schwenk et al., 1995), which express Cre in every cell, to facilitate Cre-lox recombination and excision of the IRES. Female mice were genotyped using primers immediately upstream and downstream of the IRES to determine those in which the recombination was successful (Figure 4.4, compare to Figure 4.1B). Females in which Cre-mediated excision was successful were mated to C57BL/6J males and the resulting mouse lines were denoted Gfpt2Gt(CMHD-GT_352F9-3)d1Cmhd (Gfpt2GtF9d1) and Gfpt2Gt(CMHD-GT_305A09-3)d1Cmhd (Gfpt2GtA09d1). In the case of the 305A09 clone, chimeric males were also mated to Cre deleter females to generate mice containing the UPA vector with the IRES deleted. This expedited the generation of this line.

Figure 4.4: Excision of UPA trap vector IRES by Cre recombinase

A) Schematic of IRES excision by Cre recombinase and the resultant DNA sequence, transcripts and protein products (compare to Figure 4.1B). B) Primer design. Primers are positioned each side of the loxP sites and IRES sequence. When there is no excision, the full length IRES sequence is amplified (909 bp). When excision occurs, the IRES sequence is excised and a smaller product is amplified (250 bp). C) PCR using primers from each side of the loxP recombination sites to detect IRES deletion. Lane 1, Gfpt2Gt(CMHD-GT_305A09-3)Cmhd heterozygous mouse, recombination has not occurred, product size 909 bp. Lane 2, Gfpt2Gt(CMHD-GT_305A09-3)d1Cmhd heterozygous mouse, recombination has occurred, product size ~250 bp. Arrow heads, approximate primer location. Black arrow heads, no Cre excision; purple arrow heads, Cre excision has occurred. Abbreviations: bp base pairs, dEN enhancer deletion, IRES internal ribosome entry site, 5’LTR 5’ long terminal repeat, M marker lane, 500 bp marker (each band represents 500 bp), Neo neomycin resistance gene, pA poly-adenylation sequence, PGK phosphoglycerate kinase 1 promoter sequence, SA splice acceptor site, SD splice donor site. Modified with permission from http://www.cmhd.ca/genetrap/index.html (June, 2009).
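The PCR readout for IRES excision described for Figure 4.4 can be sketched as a simple size classification. This is an illustrative sketch only and was not used in this study: the function name and the tolerance window are assumptions, and only the expected product sizes (909 bp without excision, approximately 250 bp after excision) are taken from the figure legend.

```python
# Expected product sizes (bp) from the Figure 4.4 legend.
UNEXCISED_BP = 909
EXCISED_BP = 250


def classify_ires_band(band_bp: float, tolerance_bp: float = 50.0) -> str:
    """Classify a PCR band from the primers flanking the loxP sites.

    tolerance_bp is an assumed allowance for reading band size off a gel;
    it is not a value reported in the text.
    """
    if abs(band_bp - UNEXCISED_BP) <= tolerance_bp:
        return "IRES present"
    if abs(band_bp - EXCISED_BP) <= tolerance_bp:
        return "IRES excised"
    return "unexpected product"


# Example band sizes as they might be estimated against the marker lane.
for size in (909, 260, 600):
    print(size, classify_ires_band(size))
```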

4.4.1 Is the Neo transcript degraded in IRES-deleted mice?

The deletion of the IRES should result in NMD of the remaining transcript, because the stop codon of the Neo gene is located more than 55 bp upstream of the final splice junction (Nagy and Maquat, 1998). In the pUPA trap cassette, the presence of the IRES sequence enables the Neo transcript to evade NMD despite its premature stop codon (Shigeoka et al., 2005). Thus, when the IRES is excised by Cre, the Neo transcript should be degraded by NMD. In the case of the Gfpt2GtF9d1 allele, NMD should cause degradation of the transcript containing Neo and Gfpt2 from exon 5 onwards. In the case of the Gfpt2GtA09d1 allele, it should cause degradation of the transcript containing Neo and Gfpt2 from exon 4 onwards. To determine if the transcripts for the respective alleles are indeed degraded, RNA was extracted from two pooled 9.5 dpc embryos per genotype (wildtype, heterozygous, homozygous) from Gfpt2GtF9 and Gfpt2GtF9d1 heterozygous matings. This embryonic stage was chosen as endogenous Gfpt2 is expressed at relatively high levels at this stage (see Chapter 3). cDNA was transcribed and RT-PCR performed using primers from within the Neo gene to Exon7 (expected size 1078 bp).

The Neo-Exon7 product is present in heterozygotes and homozygotes from Gfpt2GtF9 intercrosses, but not from Gfpt2GtF9d1 intercrosses (Figure 4.5A, arrows). This suggests that the transcript from the constitutive promoter within the gene trap insert is degraded in the Gfpt2GtF9d1 embryos. It is hypothesised that the Gfpt2GtF9d1 insertion would create a null allele, as the majority of the protein would not be produced and the GATase domain is disrupted, although the first 4 exons may be translated. Similarly, the Neo-Exon7 product is absent in embryos heterozygous and homozygous for the Gfpt2GtA09d1 allele (Figure 4.5B), and this allele is also expected to be null. There were multiple non-specific bands present, especially in Figure 4.5B, and attempts were made to alter the PCR parameters to improve this; however, the non-specific bands remained. Importantly, the only band of the appropriate size (Neo to Exon7) is absent in the Cre deleted lines. That this was indeed the appropriate band could have been confirmed by extracting the band and sequencing.

Figure 4.5: Loss of Neo transcript due to excision of the IRES

Cre excision results in recombination around the IRES of the gene trap lines (Figure 4.4). This results in loss of the Neo transcript due to nonsense mediated decay. The Neo transcript is present in the Gfpt2GtF9 (A) and Gfpt2GtA09 (B) heterozygote and homozygote mice (arrows), but absent in the Gfpt2GtF9d1 (A) and Gfpt2GtA09d1 (B) mice. Abbreviations: Gfpt2GtF9 Gfpt2Gt(CMHD-GT_352F9-3)Cmhd, Gfpt2GtF9d1 Gfpt2Gt(CMHD-GT_352F9-3)d1Cmhd, Gfpt2GtA09 Gfpt2Gt(CMHD-GT_305A09-3)Cmhd, Gfpt2GtA09d1 Gfpt2Gt(CMHD-GT_305A09-3)d1Cmhd, het heterozygote, hom homozygote, -ve negative, M 500 bp marker (marks every 500 bp), PCR polymerase chain reaction, wt wild type. Arrows in A and B indicate the Neo-Exon 7 product.

4.5 Design of genotyping primers

Genotyping was performed by PCR. Primers were designed for the wildtype allele for each gene trap insert such that the PCR product encompassed the insertion site (primers 807, 808 in the case of the Gfpt2GtF9 allele, and 801, 802 for the Gfpt2GtA09 allele). If the gene trap insertion is present in both alleles, a PCR product may be made but will be much larger than the wildtype product, and would generally not be produced under the genotyping PCR conditions. To genotype the gene-trap alleles, PCR primers were designed to amplify GFP or, in the case of Gfpt2GtF9d1 and Gfpt2GtA09d1, spanning the loxP sites either side of the IRES to detect the Cre mediated deletion (Table 4.2, Figure 4.6).

Table 4.2 Genotyping primers for gene trap mouse lines

| Line | Allele | Primer number | Locus |
|---|---|---|---|
| Gfpt2Gt(CMHD-GT_352F9-3)Cmhd | Wildtype | 807, 808 | Gfpt2 intron 4-5 |
| Gfpt2Gt(CMHD-GT_352F9-3)Cmhd | Gene trap | 852, 853 | GFP |
| Gfpt2Gt(CMHD-GT_352F9-3)d1Cmhd | Wildtype | 807, 808 | Gfpt2 intron 4-5 |
| Gfpt2Gt(CMHD-GT_352F9-3)d1Cmhd | Gene trap | 817, 818 | Encompassing loxP sites and IRES in UPA trap vector |
| Gfpt2Gt(CMHD-GT_305A09-3)Cmhd | Wildtype | 801, 802 | Gfpt2 intron 3-4 |
| Gfpt2Gt(CMHD-GT_305A09-3)Cmhd | Gene trap | 852, 853 | GFP |
| Gfpt2Gt(CMHD-GT_305A09-3)d1Cmhd | Wildtype | 801, 802 | Gfpt2 intron 3-4 |
| Gfpt2Gt(CMHD-GT_305A09-3)d1Cmhd | Gene trap | 817, 818 | Encompassing loxP sites and IRES in UPA trap vector |
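The way the two genotyping reactions combine into a genotype call can be sketched as follows. This is a minimal illustration rather than a record of how genotypes were scored in this study: the dictionary layout, the function name and the "no call" outcome are assumptions, while the primer pairs are those listed in Table 4.2. The logic assumes, as described above, that the wildtype product is generally not produced from an allele carrying the insertion.

```python
# Primer pairs from Table 4.2, keyed by the short allele names used in the text.
GENOTYPING_PRIMERS = {
    "Gfpt2GtF9":    {"wildtype": (807, 808), "gene_trap": (852, 853)},
    "Gfpt2GtF9d1":  {"wildtype": (807, 808), "gene_trap": (817, 818)},
    "Gfpt2GtA09":   {"wildtype": (801, 802), "gene_trap": (852, 853)},
    "Gfpt2GtA09d1": {"wildtype": (801, 802), "gene_trap": (817, 818)},
}


def call_genotype(wildtype_band: bool, gene_trap_band: bool) -> str:
    """Combine the wildtype and gene trap PCR results into a genotype.

    A wildtype band alone indicates two wildtype alleles, a gene trap band
    alone indicates two trapped alleles, and both bands indicate one of each.
    """
    if wildtype_band and gene_trap_band:
        return "heterozygous"
    if wildtype_band:
        return "wildtype"
    if gene_trap_band:
        return "homozygous"
    # Neither product amplified: the reactions would need repeating.
    return "no call"


print(GENOTYPING_PRIMERS["Gfpt2GtA09d1"]["gene_trap"])
print(call_genotype(wildtype_band=True, gene_trap_band=True))
```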

4.6 Expression of gene trapped alleles

Alternative splicing allows the generation of different transcripts from a single gene sequence by splicing the mature transcript to include or exclude particular exons. This enables the organism to modify the function of the protein at the transcript level. Since the gene traps insert into introns, alternative splicing of transcript around the insertion site could result in the gene trap not being expressed and the full-length Gfpt2 transcript being produced (Figure 4.1B, 4.2). GFP should be expressed in both lines under the control of the endogenous Gfpt2 promoter. To confirm that the gene trap alleles were expressed, 8.5 dpc mouse embryos from Gfpt2GtA09d1 heterozygote matings were dissected and examined for GFP fluorescence. GFP fluorescence was detected dorsal to the heart tube in heterozygous (not shown) and homozygous Gfpt2GtA09d1 8.5 dpc embryos, in a similar pattern to that of Gfpt2 transcript in wildtype embryos (Figure 4.7), but was not detected in the PSM/allantois junction in any of the embryos examined.

Figure 4.6 Primer design for genotyping

Schematic showing the approximate locations of primers for genotyping Gfpt2 gene trapped mouse lines. A) Schematic showing the approximate insert site of the Gfpt2GtF9 and Gfpt2GtF9d1 insertions in Gfpt2 intron 4-5 and primer locations (p807, p808) for genotyping the Gfpt2GtF9 and Gfpt2GtF9d1 wildtype allele. B) Schematic showing the approximate insert site of the Gfpt2GtA09 and Gfpt2GtA09d1 insertions in Gfpt2 intron 3-4 and primer locations (p801, p802) for genotyping the Gfpt2GtA09 and Gfpt2GtA09d1 wildtype allele. C) Schematic showing the pUPA gene trap vector with approximate primer locations for genotyping EGFP for the gene trapped allele (p852, p853) and either side of the IRES (p817, p818); the lower schematic shows the vector when the IRES is deleted, as genotyped for the Gfpt2GtF9d1 and Gfpt2GtA09d1 gene trap alleles. [Panel D: gel of genotyping PCR products for Gfpt2GtF9d1, Gfpt2GtA09d1 and Gfpt2GtF9, with lanes labelled wt (807, 808 or 801, 802), gt (852, 853) and gtd1 (817, 818).] Abbreviations: Bcl2 SA splice acceptor site from the Bcl2 gene, bp base pairs, EGFP enhanced green fluorescent protein, IRES internal ribosome entry site, LTR long terminal repeat, Neo neomycin resistance, M marker lane, 100 bp marker (each band represents 100 bp, 100-1000 bp), pA poly-adenylation sequence, RNAPII RNA polymerase II binding site. Triangles represent loxP sites. Light blue, exon; green, intron; red arrow, insert site; black arrows, approximate primer location.

Figure 4.7 GFP is expressed dorsal to the heart at 8.5 dpc

Lateral views of 8.5 dpc embryos. A) Neural folds to right. B) Neural folds to left. Expression was examined by whole mount RNA in situ hybridisation on fixed wild type embryos (A) or by GFP fluorescence in live 8.5 dpc embryos heterozygous or homozygous for the Gfpt2GtA09d1 allele (B). GFP was detected dorsal to the heart in a pattern similar to that observed by whole mount RNA in situ hybridisation of wild type embryos at this stage. Abbreviations: al allantois, h heart, nf neural folds. Scale bar: 200 µm.

Whole mount RNA in situ hybridisation was performed on embryos heterozygous or homozygous for the Gfpt2GtF9 and Gfpt2GtF9d1 alleles. GFP expression was detected in the heart of 9.5 dpc heterozygotes for both alleles, in a similar expression pattern to that observed for Gfpt2 (not shown). GFP was not detected in the PSM at 9.5 dpc; however, it cannot be ruled out that GFP is expressed in the PSM at this stage, as relatively few embryos have been examined and Gfpt2 is not detected in the PSM of all embryos.

4.7 Discussion

4.7.1 Summary of results

To investigate the function of Gfpt2 in mouse embryo development, ES cells carrying gene trap alleles of Gfpt2 were obtained from CMHD. A gene trap approach rather than a targeted knockout approach was taken since the gene trap alleles of Gfpt2 were available and they were reported to disrupt Gfpt2 in a location that was considered likely to disrupt protein function. It was felt therefore, that obtaining the gene trap alleles would be more expedient than creation of a targeted knockout by homologous recombination in house.

Upon obtaining the gene trap alleles, mapping was performed to confirm the location of the gene trap insertion points. The 352F9 clone insertion point was in intron 4-5, corresponding to the insertion being after amino acid 114, and the 305A09 clone insertion point was located in intron 3-4, corresponding to the insertion being after amino acid 71 (Figure 4.2, 4.3). Both gene trap insertions are located in the GATase domain of Gfpt2. More precise mapping was performed to determine the exact location of each gene trap within the respective introns to facilitate genotyping (Figure 4.2). The ES cells carrying the gene trap alleles were injected into C57BL/6J blastocysts and chimeric mice obtained. Chimeric males were mated to C57BL/6J females and four mouse lines, Gfpt2GtF9, Gfpt2GtA09, Gfpt2GtF9d1 and Gfpt2GtA09d1, were derived. Three of these, Gfpt2GtF9, Gfpt2GtF9d1 and Gfpt2GtA09d1, have been maintained.

The pUPA gene trap vector used to generate our alleles contains an IRES flanked by loxP sites downstream of the Neo gene. The presence of the IRES suppresses NMD of the Neo transcript in the Gfpt2GtF9 and Gfpt2GtA09 alleles. This means that in these mice, Gfpt2 is transcribed as two transcripts, the first from exon 1 to the UPA trap GFP and the second from the UPA trap Neo to the end of the Gfpt2 transcript (Figure 4.1B). This would disrupt the GATase domain (amino acids 2-71 or 2-114 of a domain which encompasses amino acids 2-264), with part of the domain being translated from the first transcript with GFP, and the remainder of the GATase domain and the SIS domains being translated from the second transcript with Neo. It is not known if the two protein products could re-assemble into functional Gfpt2. If so, the Gfpt2GtF9 and Gfpt2GtA09 alleles may be functionally wildtype. There is also constitutive expression of part of the protein (amino acids 115-682 in the case of Gfpt2GtF9 and amino acids 72-682 in the case of Gfpt2GtA09), as the Neo transcript is under the control of the constitutively active PGK promoter. The constitutive expression of this partial protein could potentially interfere with normal Gfpt signalling. If this is the case, the Gfpt2GtF9 and Gfpt2GtA09 alleles may be functionally hypomorphic.

To create alleles that are more likely to be functionally null, the IRES was excised by mating heterozygous Gfpt2GtF9/+ and Gfpt2GtA09/+ female mice to Cre deleter males to facilitate Cre-lox recombination and excision of the IRES. The excision of the IRES was confirmed by genotyping and by loss of the Neo transcript (Figure 4.4, 4.5).

4.7.2 Disadvantages to gene trapping over targeted knockout

There are different approaches that can be taken in the mouse to delete gene expression, targeted or random. Gene targeting allows for the specific deletion of a particular gene or exons of a gene, whereas gene trapping is the random insertion of a gene trap cassette into the genome, which can then be selected for integration into the intron or exon of a gene (depending on the design of the gene trap cassette). Gene targeting requires the generation of a separate vector for each gene of interest; however, gene trapping can be achieved using a single vector, which can integrate randomly anywhere in the genome. Gene targeting vectors can also be designed to allow conditional expression in either timing or tissue type (or both).

Gene trapping has several disadvantages compared to generation of targeted knockout of a gene. Because the integration is random, gene traps may or may not generate functionally null alleles and may have little or no effect on protein function. The International Gene Trap Resource lists gene traps available from multiple sources for particular genes. Each gene trap is approximately mapped so it is possible to select a gene trap allele that is more likely to be functionally null based on its integration site, and information about the gene trap cassette obtained from the source site. The gene traps we obtained from CMHD mapped to different introns than had been reported, emphasising the importance of confirming the gene trap insertion location prior to the time and expense of generating the mouse lines.

One factor that was not formally addressed was the possibility of the gene traps having integrated more than once, and thus also being located in genes other than Gfpt2. However, it was established that the gene trap was expressed temporally and spatially in the same way as Gfpt2, by examining GFP expression (Figure 4.7 and data not shown). In addition, initial confirmation of the gene trap location was performed by 3’RACE at CMHD, which showed that both gene traps were inserted in Gfpt2, and the exact location of each gene trap was confirmed in this study (Figure 4.2). While the CMHD did not report any other genes identified, and their protocols are optimised to give single insertions, this alone does not exclude the possibility of multiple insertions. Given these results, it was not felt necessary at the time to confirm a single integration; however, to show incontrovertibly that the gene traps inserted only once, a Southern blot on DNA from the gene trapped ES cells or from the gene trapped mice could be performed. DNA could be digested with EcoRI, which would cut in intron 3-4, immediately downstream of GFP in the gene trap, and in intron 5-6. Using GFP as a probe for the presence of the gene trap, one would expect a fragment size of 5370 bp for the Gfpt2GtA09d1 gene trap allele, and 7233 bp for the Gfpt2GtF9d1 allele. If the gene trap has not integrated anywhere else, GFP would not be detected at any other size.
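The reasoning behind the proposed Southern blot, that a probe detects the single restriction fragment bounded by the cut sites on either side of it, can be sketched as below. This is a hedged illustration only: the function name and all coordinates are hypothetical, with the example cut sites chosen simply so that their spacing reproduces the 5370 bp fragment quoted in the text for the Gfpt2GtA09d1 allele. They are not mapped EcoRI positions.

```python
from bisect import bisect_right
from typing import Sequence


def probed_fragment_length(cut_sites: Sequence[int], probe_pos: int) -> int:
    """Return the length of the restriction fragment containing the probe.

    cut_sites: positions of restriction sites along the locus (any order).
    probe_pos: position at which the probe hybridises.
    Raises ValueError if the probe is not flanked by a cut site on each side.
    """
    sites = sorted(cut_sites)
    i = bisect_right(sites, probe_pos)
    if i == 0 or i == len(sites):
        raise ValueError("probe is not flanked by cut sites on both sides")
    return sites[i] - sites[i - 1]


# Hypothetical coordinates: one site upstream of the insertion, one
# immediately downstream of GFP, and one further downstream in the gene.
hypothetical_sites = [1000, 6370, 9000]
hypothetical_gfp_probe = 5000
print(probed_fragment_length(hypothetical_sites, hypothetical_gfp_probe))
```

A second integration elsewhere in the genome would place the probe between a different pair of sites, and so would be expected to appear as an additional band of a different size.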

4.7.3 Alternative splicing could lead to exon skipping and failure to express the gene trap allele

Another potential disadvantage of the gene trapping approach is the possibility of alternative splicing of transcripts, leading to exon skipping. Alternative splicing allows the generation of different transcripts from a single gene sequence by splicing the mature transcript to include or exclude particular exons. This enables the organism to modify the function of the protein at the transcript level. Alternative splicing can be problematic when generating a targeted knockout of particular exons if those exons are skipped due to alternative splicing. In the case of gene trap insertions, especially those which insert into introns such as pUPA, alternative splicing may result in the gene trap cassette not being transcribed, and therefore the gene trap insertion would not disrupt the gene in any way. Depending on the relative abundance and tissue expression profiles of the alternative splice transcripts, this could result in null, hypomorphic or phenotypically normal expression levels in some or all tissues. One example of this in relation to gene trapping is the Crim1KST264 gene trap mouse line. In this line, the gene trap insertion point is located between exon 1 and exon 2; however, alternative splicing of Crim1 can produce a transcript that skips exon 2 and the gene trap insertion (Pennisi et al., 2007). In that study, the alternatively spliced transcript was shown to be a minor splice variant and the gene trap insertion Crim1KST264 was shown to be hypomorphic rather than null (Pennisi et al., 2007). No alternative splicing events have been reported for Gfpt2 in Ensembl (Ensembl, Dec 2009). However, this does not definitively mean that there cannot be splicing variation around the gene trap insertions in this study. Potentially, even a small number of “normal” transcripts could provide sufficient Gfpt2 protein for full function.
To determine if any primary transcripts are present, RT-PCR using primers flanking the insertion sites for the gene trap on cDNA from wildtype, heterozygous and homozygous 8.5 dpc or 9.5 dpc embryos could be performed. At these stages, the expression of Gfpt2 has been clearly established (Figure 3.4 and Figure 3.5). Both gene trap lines could be examined using the same primers if primers in exon 2 and exon 5 were selected. In the wildtype and heterozygous cDNA, one would expect one band of approximately 280 bp, and for this band to be absent in the homozygous cDNA if no Gfpt2 is produced. The expression of the gene trap alleles was confirmed by examination of GFP by fluorescence in 8.5 dpc Gfpt2GtA09d1 homozygous (Figure 4.7) and heterozygous embryos (not shown) and by examination of GFP expression in Gfpt2GtF9/+ and Gfpt2GtF9d1/+ 9.5 dpc embryos by whole mount RNA in situ hybridisation (not shown).

4.7.4 Conclusion

Based on the location of the gene trap insertions in the GATase domain of Gfpt2 one exon apart, it is expected that both the Gfpt2GtF9d1 and Gfpt2GtA09d1 alleles should be functionally null. Gfpt catalyses the removal of the ammonia group from glutamine, via the GATase domain, and transfers it to fructose-6-phosphate, which is isomerised by the SIS domains leading to the formation of glucosamine-6-phosphate. Both the Gfpt2GtF9d1 and Gfpt2GtA09d1 alleles would disrupt the GATase domain at amino acid 71 in the case of Gfpt2GtA09d1, and at amino acid 114 in the case of the Gfpt2GtF9d1 allele. Since the remainder of the transcript undergoes NMD in these mouse lines (Figure 4.5), there are no SIS domains present.

Heterozygous mice from the Gfpt2GtF9, Gfpt2GtF9d1 and Gfpt2GtA09d1 mouse lines were intercrossed to determine the functional consequences of the gene trap insertions (Chapter 5).

Chapter 5: Functional analysis of Gfpt2 in the mouse

5.1 Introduction

In Chapter 3, the expression pattern of Gfpt2 was examined in mouse embryo development. Gfpt2 was generally expressed weakly, transiently and in a highly restricted manner in the foregut endoderm at 8.5 dpc, the myocardium underlying the cardiac cushions at 9.5 dpc and, in some embryos, the pre-somitic mesoderm (PSM) at 9.5 dpc and 11.5 dpc. At the protein level, Gfpt is the rate-limiting enzyme of the hexosamine biosynthesis pathway (HBP). This pathway produces UDP-N-acetylglucosamine (UDP-GlcNAc), which is important for the production of glycosaminoglycans (GAGs) and for protein glycosylation.

Based on the expression pattern of Gfpt2 and its known function in the HBP, the following potential roles of Gfpt2 in mouse embryonic heart and somite development were postulated.

1. FGF signalling from the foregut endoderm to the developing heart has been shown to be important for heart development (Lough and Sugi, 2000; Sugi et al., 1995; Zhu et al., 1996). Heparan sulfate, a GAG consisting of glucuronic acid linked to N-acetylglucosamine (GlcNAc), has been shown to modulate FGF signalling (Allen and Rapraeger, 2003). Gfpt2 might be required in the foregut endoderm to provide a pool of GlcNAc to glycosylate signalling molecules such as FGF, and thus impact upon heart development.

2. In cardiac cushion development, GAGs make up the cardiac jelly. Signalling from the myocardium of the cushions to the endocardium initiates epithelial to mesenchyme transition (EMT) of cells. These cells then invade the cardiac jelly, leading to the cellularisation of the cushions (Armstrong and Bischoff, 2004). Gfpt2 might be required to provide a pool of GlcNAc either for the production of the GAGs that make up the cardiac jelly, or to glycosylate signalling molecules, for example BMP family members, which are known to be important in EMT of the cushions and are glycosylated (Esko and Selleck, 2002; Uchimura et al., 2009).

3. In the PSM, glycosylation is known to potentiate Notch signalling. Mutations in the glycosyltransferase Lunatic Fringe, which glycosylates Notch, cause defects in somitogenesis (Evrard et al., 1998; Sparrow et al., 2006). Gfpt2 might be required in somitogenesis to provide a source of UDP-GlcNAc for Notch glycosylation (Evrard et al., 1998; Shi and Stanley, 2003).

To investigate the role of Gfpt2 in mouse embryonic development, and to determine if a lack of Gfpt2 can cause defects in heart or somite formation, mice carrying gene trap insertions were generated as described in Chapter 4. To determine the effects of the Gfpt2 gene trap insertions in vivo, mice heterozygous for the gene trap alleles, Gfpt2Gt(CMHD-GT_352F9-3)Cmhd (Gfpt2GtF9), Gfpt2Gt(CMHD-GT_352F9-3)d1Cmhd (Gfpt2GtF9d1) and Gfpt2Gt(CMHD-GT_305A09-3)d1Cmhd (Gfpt2GtA09d1) were mated to generate homozygous mice for each allele. As described in Chapter 4, the Gfpt2GtF9d1 and Gfpt2GtA09d1 alleles are expected to be functionally null.

5.2 Gfpt2 Gt(CMHD-GT_352F9-3)Cmhd heterozygous matings

Heterozygotes for Gfpt2GtF9 were inter-crossed to generate mice homozygous for the gene trap insertion. Embryos were collected at 18.5 dpc and examined macroscopically for differences in appearance and weight. Embryos were then genotyped to look for any deviation from the expected Mendelian ratios for wildtype, heterozygotes and homozygotes. Eighty embryos were collected from heterozygous matings and 72 were weighed. One Gfpt2GtF9 heterozygous embryo exhibited oedema and a kinky tail. No other embryos exhibited this phenotype. There was no significant difference between observed and expected ratios of wildtype, heterozygote and homozygote embryos, thus the gene trap insertion has no effect on embryo survival (χ² test, p-value = 0.71). Whilst there was variation within genotypes, there was no significant difference in embryo weight (ANOVA, p-value = 0.93) (Table 5.1, Figure 5.1).
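The comparison of observed and expected genotype counts can be reproduced from the counts alone. The following is a minimal Python sketch, not the software used for the analysis in this study (which is not stated here); the function name mendelian_chi_square is hypothetical. It assumes a goodness-of-fit test against a 1:2:1 ratio with two degrees of freedom, in which case the p-value is exactly exp(-χ²/2).

```python
import math


def mendelian_chi_square(observed):
    """Chi-square goodness-of-fit test against a 1:2:1 Mendelian ratio.

    observed: counts as (wildtype, heterozygote, homozygote).
    Returns (expected counts, chi-square statistic, p-value).
    """
    total = sum(observed)
    # Expected counts for a heterozygous inter-cross (assumed 1:2:1).
    expected = [total * ratio for ratio in (0.25, 0.5, 0.25)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Three categories give 2 degrees of freedom; for df = 2 the
    # chi-square survival function is exactly exp(-x / 2).
    p_value = math.exp(-chi2 / 2)
    return expected, chi2, p_value


# Observed 18.5 dpc genotype counts from Table 5.1.
expected, chi2, p = mendelian_chi_square((23, 37, 20))
print(expected, round(chi2, 3), round(p, 2))
```

For the counts in Table 5.1 (23, 37 and 20) this gives expected counts of 20, 40 and 20, χ² = 0.675 and a p-value of 0.71, in agreement with the value reported above.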

To test for survival post-birth, pups from Gfpt2GtF9 heterozygous matings were collected between post-natal day 0 (P0), day of birth, and P5 (n = 162), and also at weaning (n = 49). The observed numbers of wildtypes, heterozygotes and homozygotes were not significantly different to expected (χ² test P = 0.11) (Table 5.2). There were no gross morphological differences in any pups observed and no deaths recorded for pups allowed to progress to weaning.

Figure 5.1 Scatter Plot of Gfpt2Gt(CMHD-GT_352F9-3)Cmhd 18.5 dpc embryo weight
Graphic showing Gfpt2Gt(CMHD-GT_352F9-3)Cmhd 18.5 dpc embryo weights by genotype. Each dot (wild type), square (heterozygote) or triangle (homozygote) represents an individual embryo. Mean and standard deviation are shown for each genotype (bars). No significant difference in weight is detected between the different genotypes, ANOVA p-value = 0.93. Weight is measured in grams (g). Abbreviations: wt wild type, het heterozygote, hom homozygote.

Table 5.1 Gfpt2Gt(CMHD-GT_352F9-3)Cmhd embryos at 18.5 dpc

Embryos from heterozygous matings of Gfpt2GtF9 were dissected and collected at 18.5 dpc, 1 day prior to birth. Embryos were weighed and examined for any external morphological phenotype, then genotyped. The observed value is the number of embryos genotyped for each category, the expected is the Mendelian ratio predicted for the number of embryos. Average weight is in grams (g) ± standard deviation (StD). Statistical tests are χ² test for observed vs. expected genotypes, and ANOVA for embryo weight by genotype.

| | Wildtype | Heterozygote | Homozygote | Statistical test (p-value) |
|---|---|---|---|---|
| Observed | 23 | 37 | 20 | |
| Expected | 20 | 40 | 20 | 0.71 |
| Average weight (g) (± StD) | 1.01 ± 0.09 | 1.22 ± 0.11 | 1.22 ± 0.08 | 0.93 |

Table 5.2 Gfpt2Gt(CMHD-GT_352F9-3)Cmhd heterozygous matings pups and weaners

Pups from heterozygous matings of Gfpt2Gt(CMHD-GT_352F9-3)Cmhd were genotyped from P0-P5 or at weaning (3 weeks). The observed value is the number of embryos genotyped for each category, the expected is the Mendelian ratio predicted for the number of embryos. Sum is the observed combined value for pups and weaned mice.

| | Wildtype | Heterozygous | Homozygous | χ² test (p-value) |
|---|---|---|---|---|
| Pups observed | 52 | 72 | 38 | |
| Pups expected | 40.5 | 81 | 40.5 | 0.11 |
| Weaned observed | 16 | 26 | 7 | |
| Weaned expected | 12.25 | 24.5 | 12.25 | 0.17 |
| Sum | 68 | 98 | 45 | 0.11 |

5.3 Delta-IRES mice

The insertion of the gene trap into Gfpt2 at intron 4-5 in the Gfpt2GtF9 mouse line may not generate a functionally null allele since Gfpt2 exons 1-4 and GFP would be transcribed and translated from the endogenous promoter, and Neo and the remainder of Gfpt2 would be transcribed and translated from the constitutively active phosphoglycerate kinase (PGK) promoter (as previously described, Chapter 4, Section 4.4). To determine if Gfpt2 has an essential function in mouse embryo development, a functionally null allele is required.

As previously described (Chapter 4, Section 4.4), Gfpt2GtF9/+ and Gfpt2GtA09/+ mice were mated to Cre deleter mice to excise the UPAtrap IRES sequence, generating the new mouse lines Gfpt2GtF9d1 and Gfpt2GtA09d1. The deletion of the IRES sequence resulted in the degradation of the Neo transcript for both the Gfpt2GtF9d1 allele and the Gfpt2GtA09d1 allele (Chapter 4, Figure 4.4). This should result in functionally null alleles in each line as only exons 1-3 (for Gfpt2GtA09d1) and exons 1-4 (for Gfpt2GtF9d1) and GFP from the UPA trap cassette, would be transcribed and translated. Heterozygous mice for each of Gfpt2GtF9d1/+ and Gfpt2GtA09d1/+ were inter-crossed to determine if either allele results in a phenotype when homozygous.

5.3.1 Gfpt2 Gt(CMHD-GT_352F9-3)d1Cmhd heterozygous inter-crosses

Heterozygous Gfpt2GtF9d1 mice were inter-crossed to determine the functional consequence of homozygosity for the gene trap. Sixty-three 17.5 dpc embryos from heterozygous Gfpt2GtF9d1 inter-crosses were collected and weighed. Embryos in the Gfpt2GtF9 study were collected at 18.5 dpc, but pups in that study were occasionally born one day early, so embryos from the Gfpt2GtF9d1 line were collected at 17.5 dpc to maximise numbers at collection. There were no significant differences in observed Mendelian ratios for wildtype, heterozygous and homozygous embryos (χ² test, p-value = 0.19) (Table 5.3) and no significant differences in weights observed between the different genotypes (ANOVA, p-value = 0.50) (Figure 5.2).

Gfpt2 is expressed in the myocardium underlying the cardiac cushions at 9.5 dpc. The cardiac cushions give rise to the heart valves and also contribute to the ventricular septum. To determine if there were any structural defects in the hearts of the gene trapped embryos, particularly with respect to the derivatives of the cardiac cushions, wildtype and homozygous Gfpt2GtF9d1 17.5 dpc embryonic hearts were embedded in wax, then sectioned. Sections were counterstained with eosin to enable visualisation of the tissue, and morphology examined. No structural defects were observed in the Gfpt2GtF9d1/GtF9d1 17.5 dpc embryo heart compared to wildtype (Figure 5.3) (n = 3 homozygotes).

Figure 5.2 Scatter Plot of Gfpt2Gt(CMHD-GT_352F9-3)d1Cmhd 17.5 dpc embryo weight
Graphic showing Gfpt2Gt(CMHD-GT_352F9-3)d1Cmhd 17.5 dpc embryo weights by genotype. Each dot (wild type), square (heterozygote) or triangle (homozygote) represents an individual embryo. Each colour represents a different litter. Mean and standard deviation are shown for each genotype (bars). No significant difference in weight is detected between the different genotypes, ANOVA p-value = 0.50. Weight is measured in grams (g). Abbreviations: wt wild type, het heterozygote, hom homozygote.

Table 5.3 Gfpt2Gt(CMHD-GT_352F9-3)d1Cmhd embryos at 17.5 dpc

Embryos from heterozygous matings of Gfpt2GtF9d1 were dissected and collected at 17.5 dpc. Embryos were weighed and examined for any external morphological phenotype, then genotyped. The observed value is the number of embryos genotyped for each category, the expected is the Mendelian ratio predicted for the number of embryos. Average weight is in grams (g) ± standard deviation (StD). Statistical tests are χ² test for observed vs. expected genotype ratios, and ANOVA for embryo weight by genotype.

| | Wildtype | Heterozygous | Homozygous | Statistical test (p-value) |
|---|---|---|---|---|
| Observed | 14 | 27 | 22 | |
| Expected | 15.75 | 31.5 | 15.75 | 0.19 |
| Average weight (g) (± StD) | 0.97 ± 0.12 | 0.94 ± 0.12 | 0.94 ± 0.07 | 0.50 |

To determine if the Gfpt2GtF9d1 mice have a postnatal phenotype, pups (n = 54) and weaned mice (n = 44) from heterozygous matings were genotyped and examined for any gross morphological differences. No live mice examined showed any external phenotype. The observed numbers of wildtype, heterozygous and homozygous Gfpt2GtF9d1 mice were not significantly different to the expected Mendelian ratios (χ² test, p-value = 0.85 for pups, 0.26 for weaned mice, 0.31 for combined data) (Table 5.4). This suggests that Gfpt2 expression, which is lost due to the Gfpt2GtF9d1 gene trap insertion, is not essential for normal development.

5.3.2 Gfpt2 Gt(CMHD-GT_305A09-3)d1Cmhd heterozygous inter-crosses

Heterozygous Gfpt2GtA09d1 mice were inter-crossed and embryos were collected at 17.5 dpc. The embryos were weighed and examined for any external morphological phenotype, before genotyping. One heterozygous embryo exhibited an open brain and blood filled decidua. The placenta of this embryo appeared normal, but was lighter compared to litter mates. No other embryos exhibited this phenotype or any other external defects.


Figure 5.3 Sections of embryonic hearts
Representative eosin stained frontal wax sections of 18.5 dpc hearts from wild type (A-D), Gfpt2GtF9d1/GtF9d1 (E-I) and 17.5 dpc hearts Gfpt2GtA09d1/GtA09d1 (J-N same embryo, O different embryo). (A, E, J) Global view of heart showing all four chambers, the pulmonary arterial valve (PAV) and intraventricular septum (IVS). (B, F, K) Higher magnification view of PAV. (C, G, L) Higher magnification view of aortic valve (AOV). Higher magnification view of tricuspid valve (TCV) and (D) mitral valve. (I, N) Higher magnification view of mitral valve. (O) Ventricular septal defect (VSD) in Gfpt2GtA09d1/GtA09d1 embryo. There are no morphological differences observed in the derivatives of the cardiac cushions between wild type (A-D) and Gfpt2GtF9d1/GtF9d1 (E-I) embryos (n = 4). Gfpt2GtA09d1/GtA09d1 hearts were also morphologically normal (J-N) except for one embryo exhibiting a VSD (O) (n = 6). Abbreviations: aov aortic valve, ivs intraventricular septum, la left atrium, lv left ventricle, mv mitral valve, pav pulmonary arterial valve, ra right atrium, rv right ventricle, tcl tricuspid leaflet, tcv tricuspid valve, vsd ventricular septal defect. Scale bars: (A, E, J) 180 μm, (B-D, E-I, K-O) 90 μm.

Table 5.4 Gfpt2Gt(CMHD-GT_352F9-3)d1Cmhd pups and weaned mice

Pups from heterozygous matings of Gfpt2Gt(CMHD-GT_352F9-3)d1Cmhd were genotyped from P0-P3 or at weaning. The observed value is the number of embryos genotyped for each category, the expected is the Mendelian ratio predicted for the number of embryos. Sum is the observed combined value for pups and weaned mice.

| | Wildtype | Heterozygous | Homozygous | χ² test (p-value) |
|---|---|---|---|---|
| Pups observed | 15 | 25 | 14 | |
| Pups expected | 13.5 | 27 | 13.5 | 0.85 |
| Weaned observed | 15 | 17 | 12 | |
| Weaned expected | 11 | 22 | 11 | 0.26 |
| Sum | 30 | 42 | 26 | 0.31 |

Thirty-seven embryos were collected from Gfpt2GtA09d1 heterozygous matings. Genotyping was performed to identify any deviation from the expected Mendelian ratio. There is a trend towards fewer homozygotes than expected, but this does not reach statistical significance (χ² test, p-value = 0.37) (Table 5.5). Embryos were weighed and an ANOVA test applied to determine if there was a difference in weight distribution between genotypes. There was no statistical difference in weight between genotypes (ANOVA p-value = 0.31) (Table 5.5, Figure 5.4).

Hearts from wildtype (n = 2) and homozygous (n = 6) Gfpt2GtA09d1 embryos were embedded in wax, sectioned and counterstained with eosin. In five of the six homozygous embryos, there were no differences observed between wildtype and Gfpt2GtA09d1 homozygous 17.5 dpc embryonic hearts (Figure 5.3). One embryo exhibited a ventricular septal defect (VSD) (Figure 5.3). Due to technical problems with sectioning, including tearing and folding over of the sections, it is unclear if the heart valves of this embryo were also affected.

To determine if the Gfpt2GtA09d1 mice have a post-natal phenotype, pups (n = 18) and weaned mice (n = 48) from heterozygous matings were genotyped and examined for any gross morphological differences. No live mice examined showed any external phenotype. The observed numbers of wildtype, heterozygous and homozygous Gfpt2GtA09d1 mice were not significantly different to the expected Mendelian ratios (χ² test, p-value = 0.57 for weaned mice, 0.87 for combined data) (Table 5.6). This is in agreement with the previous result observed for the Gfpt2GtF9d1 mice, suggesting that Gfpt2 is dispensable for normal development.

Figure 5.4 Scatter Plot of Gfpt2Gt(CMHD-GT_305A09-3)d1Cmhd 17.5 dpc embryo weight
Graphic showing Gfpt2Gt(CMHD-GT_305A09-3)d1Cmhd 17.5 dpc embryo weights by genotype. Each dot (wild type), square (heterozygote) or triangle (homozygote) represents an individual embryo. Each colour represents a different litter. Mean and standard deviation are shown for each genotype (bars). No significant difference in weight is detected between the different genotypes, ANOVA p-value = 0.31. Weight is measured in grams (g). Abbreviations: wt wild type, het heterozygote, hom homozygote.

Table 5.5 Gfpt2Gt(CMHD-GT_305A09-3)d1Cmhd embryos at 17.5 dpc

Embryos from heterozygous matings of Gfpt2GtA09d1 were dissected and collected at 17.5 dpc. Embryos were weighed and examined for any external morphological phenotype, then genotyped. The observed value is the number of embryos genotyped for each category, the expected is the Mendelian ratio predicted for the number of embryos. Average weight is in grams (g) ± standard deviation (StD). Statistical tests were χ² test for observed vs. expected genotype ratios, and ANOVA for embryo weight by genotype.

| | Wildtype | Heterozygous | Homozygous | Statistical test (p-value) |
|---|---|---|---|---|
| Observed | 12 | 19 | 6 | |
| Expected | 9.25 | 18.5 | 9.25 | 0.37 |
| Average weight (g) (± StD) | 0.94 ± 0.10 | 1.03 ± 0.09 | 0.93 ± 0.03 | 0.31 |

Table 5.6 Gfpt2Gt(CMHD-GT_305A09-3)d1Cmhd pups and weaned mice

Pups from heterozygous matings of Gfpt2GtA09d1 were genotyped from P0-P2 or at weaning. The observed value is the number of embryos genotyped for each category, the expected is the Mendelian ratio predicted for the number of embryos. Sum is the observed combined value for pups and weaned mice.

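| | Wildtype | Heterozygous | Homozygous | χ² test (p-value) |
|---|---|---|---|---|
| Pups observed | 3 | 10 | 5 | |
| Pups expected | 4.5 | 9 | 4.5 | 0.72 |
| Weaned observed | 15 | 23 | 10 | |
| Weaned expected | 12 | 24 | 12 | 0.57 |
| Sum | 18 | 33 | 15 | 0.87 |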

5.3.3 Adult delta-IRES mice survival

While a full study has not been conducted, wildtype, heterozygous and homozygous weaned mice from Gfpt2GtF9d1 and Gfpt2GtA09d1 inter-crosses are currently being aged to determine if there is any trend towards unexpected death in the homozygotes. As part of this preliminary study, mice were monitored and weighed over five months. Since mice in this preliminary study are not age or litter matched and the numbers in each group are small, one must be very cautious in drawing any conclusions about differences between genotypes.

In the case of the Gfpt2GtF9d1 line, six wildtype, five heterozygous and two homozygous female mice, and three each of wildtype, heterozygous and homozygous male mice from heterozygous inter-crosses were weighed over a five month period encompassing 23-26 weeks to 44-47 weeks (Table 5.7). At the 31-34 week, 35-38 week and 44-47 week time points, there was no statistical difference in weight by genotype for the female or male mice. At the 23-26 week time point, the heterozygous female mice were lighter and this reached statistical significance (P-value = 0.04). Since this trend did not continue at later time-points, this is likely to be a statistical anomaly. At 39 weeks, several male mice were culled due to fighting, thus at the 44-47 week time-point there are insufficient males to draw conclusions about weight. No unexplained deaths were observed for the Gfpt2GtF9d1 line, suggesting that these mice are healthy under normal conditions. These mice are still being aged.

Table 5.7 Adult weight for Gfpt2Gt(CMHD-GT_352F9-3)d1Cmhd at different ages

Mice from heterozygous Gfpt2GtF9d1 inter-crosses were weighed over a five month period. Mice were not age or litter matched. Abbreviations: F female, M male, n number, n/a not applicable, St Dev standard deviation. * At 44-47 week age range n = 1 male wildtype and heterozygote and n = 2 male homozygote (total of 4 males culled due to fighting). ** statistically significant.

Average weight (g) ± St Dev, by age range (weeks)

| | 23-26 | 31-34 | 35-38 | 44-47 |
|---|---|---|---|---|
| F wildtype (n=6) | 31.1 ± 2.4 | 34.7 ± 3.2 | 35.4 ± 4.1 | 37.8 ± 5.0 |
| F heterozygote (n=5) | 27.8 ± 2.4 | 31.0 ± 4.3 | 30.38 ± 3.16 | 32.52 ± 4.61 |
| F homozygote (n=2) | 32.8 | 33.0 | 34.6 | 34.0 |
| ANOVA female | 0.04** | 0.28 | 0.10 | 0.21 |
| M wildtype (n=3)* | 34.5 ± 3.0 | 36.7 ± 4.7 | 37.0 ± 6.2 | 47.0 |
| M heterozygote (n=3)* | 39.5 ± 1.9 | 42.7 ± 2.5 | 41.9 ± 4.6 | 48 |
| M homozygote (n=3)* | 37.5 ± 2.5 | 37.33 ± 6.5 | 37.33 ± 10.3 | 42.8 |
| ANOVA male | 0.12 | 0.32 | 0.68 | n/a |
| ANOVA combined | 0.37 | 0.99 | 0.89 | n/a |

In the case of the Gfpt2GtA09d1 inter-crosses, six wildtype, three heterozygous and four homozygous female mice, and three wildtype, three heterozygous and four homozygous male mice were weighed over the five month period, encompassing 8-14 weeks to 28-34 weeks (Table 5.8). Over the five months, there was no statistical difference in weight observed for female Gfpt2GtA09d1 mice. At the 8-14 week and 16-22 week time points, there was no difference in weight observed for the male Gfpt2GtA09d1 mice. There was a statistical difference at the 20-26 week and 28-34 week time points, with the homozygotes being heavier than their wildtype and heterozygous counterparts. However, the homozygous mice are four weeks older than the wildtype and heterozygous males to which they are being compared and this would account for their larger size. One homozygous Gfpt2GtA09d1 male mouse died for unknown reasons at 11 weeks. No autopsy was performed due to the condition of the mouse when discovered, but this mouse appeared healthy when weighed the week before. No other unexplained deaths have occurred to date and these mice are still being aged.

Table 5.8 Adult weight for Gfpt2Gt(CMHD-GT_305A09-3)d1Cmhd at different ages

Mice from heterozygous Gfpt2GtA09d1 inter-crosses were weighed over a five-month period. Mice were not age or litter matched. Abbreviations: F female, M male, n number, St Dev standard deviation. * One homozygote mouse died at 11 weeks. ** statistically significant.

Average weight (g) ± St Dev, by age range (weeks)

| | 8-14 | 16-22 | 20-26 | 28-34 |
|---|---|---|---|---|
| F wildtype (n=6) | 24.0 ± 2.4 | 26.5 ± 2.4 | 30.0 ± 5.3 | 33.3 ± 5.7 |
| F heterozygote (n=3) | 22.0 ± 0.4 | 25.0 ± 2.7 | 28.5 ± 5.3 | 27.6 ± 4.1 |
| F homozygote (n=4) | 23.7 ± 2.0 | 29.5 ± 4.8 | 30.9 ± 5.3 | 31.6 ± 2.8 |
| ANOVA female | 0.41 | 0.23 | 0.84 | 0.28 |
| M wildtype (n=3) | 28.9 ± 2.7 | 33.0 ± 2.7 | 33.3 ± 2.2 | 33.6 ± 3.2 |
| M heterozygote (n=3) | 28.9 ± 2.1 | 35.0 ± 4.4 | 32.7 ± 1.6 | 33.7 ± 2.1 |
| M homozygote (n=4)* | 30.2 ± 3.2 | 33.3 ± 2.2 | 38.5 ± 2.1 | 41.7 ± 1.4 |
| ANOVA male | 0.76 | 0.72 | 0.02** | 0.01** |
| ANOVA combined | 0.73 | 0.60 | 0.36 | 0.20 |

5.4 Discussion

In Chapter 3, Gfpt2 was shown to be expressed in a transient and restricted manner in the developing mouse embryo. Gfpt2 was expressed in the foregut endoderm at 8.5 dpc, in the myocardium underlying the cardiac cushions at 9.5 dpc and, in some embryos, in the presomitic mesoderm at 9.5 dpc and 11.5 dpc. In the mouse, two genes encode Gfpt proteins, Gfpt1 and Gfpt2, and both proteins have been shown to be the rate-limiting enzyme of the HBP. The HBP is an essential pathway for the production of UDP-GlcNAc, which is important for the production of glycosaminoglycans (GAGs) and protein glycosylation. Based on the temporal and spatial expression pattern of Gfpt2 and its known role as the rate-limiting enzyme of the HBP, it was hypothesised that Gfpt2 expression might be important in heart development and/or somitogenesis. These hypotheses were tested in this chapter by inter-crossing mice heterozygous for gene trap insertions in the Gfpt2 gene, to generate embryos that were homozygous for the gene trapped alleles.

The expression of the gene trapped alleles was confirmed by detection of GFP fluorescence in embryos homozygous and heterozygous (not shown) for Gfpt2GtA09d1 (Chapter 4, Figure 4.7), and by whole mount RNA in situ hybridisation performed on 9.5 dpc embryos for GFP for Gfpt2GtF9 and Gfpt2GtF9d1 (not shown). This confirmed that the Gfpt2 transcript is disrupted by the gene trap insertions and that alternative splicing does not bypass the insertion. GFP is under the control of the endogenous Gfpt2 promoter in the gene trapped lines and was detected in a similar expression domain as endogenous Gfpt2.

The Gfpt2GtF9 gene trap insertion did not affect embryo survival when homozygous. The observed numbers of wildtype, heterozygote and homozygote embryos at 18.5 dpc were not significantly different to those expected (Table 5.1, χ² P = 0.71). The embryos were also weighed and no statistical difference was observed in weight between the different genotypes (Table 5.1, ANOVA P = 0.93, Figure 5.1). In the Gfpt2GtF9 line, two transcripts are produced: exons 1-4 of Gfpt2 with GFP, and Neo followed by exons 7-19 of Gfpt2 (Chapter 4, Figure 4.1d). The Neo transcript is under the control of the constitutively active PGK promoter, and thus will be expressed in all cells. It is possible that the aberrant expression of Neo is causing the loss of some heterozygotes and homozygotes in the Gfpt2GtF9 line at or around birth. Aberrant Neo expression has been shown to result in hypomorphic alleles for some mouse lines (Stanley et al., 2002).

Gfpt2GtF9d1 heterozygotes were inter-crossed and the gene trap insertion had no effect on embryo survival. The observed numbers of wildtype, heterozygotes and homozygotes at 17.5 dpc for this line were not significantly different to expected (Table 5.3, χ² P = 0.19). Embryos were also weighed and no significant difference was observed in weight between the different genotypes (Table 5.3, ANOVA P = 0.5, Figure 5.2). Since Gfpt2 was expressed in the myocardium underlying the cardiac cushions, it was hypothesised that Gfpt2 might play a role in cardiac cushion and therefore heart valve development. Three homozygous Gfpt2GtF9d1 embryo hearts were sectioned and morphology examined, particularly with respect to the heart valves. The heart valves and ventricular septum (which is also derived from the cardiac cushions) of homozygous Gfpt2GtF9d1 embryos did not appear different to those of wildtype embryos at the same stage (Figure 5.3). These embryos came from a missed plug, and thus based on embryo length might be aged at 18.5 dpc.

The Gfpt2GtF9d1 gene trap insertion also had no effect on survival of pups to weaning (3 weeks) and beyond. The observed number of wildtype, heterozygotes and homozygotes at weaning was not significantly different to expected, suggesting that the Gfpt2GtF9d1 gene trap insertion has no effect on survival (Table 5.4, χ² P = 0.31). A small number of wildtype, heterozygous and homozygous mice are being aged to determine if the Gfpt2GtF9d1 gene trap insertion has any effect on longer term survival. As part of their monitoring, these mice were weighed over a period of 5 months. No significant difference has been observed between wildtype and homozygous Gfpt2GtF9d1 mice over that period. There was initially a difference in female heterozygote weight at the first measurement (ANOVA P = 0.04) but this evened out over the following months (Table 5.7).

Gfpt2GtA09d1 heterozygotes were inter-crossed and the gene trap insertion had no effect on embryo survival. The observed numbers of wildtype, heterozygotes and homozygotes at 17.5 dpc for this line were not significantly different to expected (Table 5.5, χ² P = 0.37). Embryos were also weighed and no significant difference was observed in weight between the different genotypes (Table 5.5, ANOVA P = 0.31, Figure 5.4). As for the Gfpt2GtF9d1 line, Gfpt2GtA09d1 homozygous 17.5 dpc embryo hearts (n = 6) were sectioned and the morphology examined. Five hearts were normal, but one embryo exhibited a VSD (Figure 5.3). Due to sectioning difficulties with this embryo, it was not possible to determine if the heart valves were also affected. As for Gfpt2GtF9d1, the Gfpt2GtA09d1 gene trap insertion had no effect on pup survival to weaning (3 weeks) and beyond (Table 5.6, χ² P = 0.87). A small number of wildtype, heterozygous and homozygous mice are being aged to determine if the Gfpt2GtA09d1 gene trap insertion has any effect on longer term survival. Homozygous male mice are significantly heavier at the 4-month (ANOVA P = 0.02) and 5-month time-points (ANOVA P = 0.01); however, this is likely due to two out of three of these mice being four weeks older than their heterozygous and wildtype counterparts (Table 5.8). In hindsight, this is not the correct way to do the analysis as weights of mice at different ages are not comparable, but this was done as there were not enough data at each time point.

It is not clear if the observed VSD in the Gfpt2GtA09d1 homozygote at 17.5 dpc is a direct result of the gene trap insertion or merely a chance defect. Since Gfpt2 was expressed in the myocardium underlying the cardiac cushion, a cushion defect may have been expected. A membranous VSD (as observed) can be the result of a cushion defect as the membranous region of the ventricular septum is derived in part from the cardiac cushions (Anderson et al., 2003a; Lamers and Moorman, 2002; Webb et al., 1998). However, this was only observed in one embryo (n = 6) and close to the expected number of Gfpt2GtA09d1 homozygotes was observed after birth (observed 15, expected 16.5 for combined pup and weaned mice data, Table 5.6). Previously reported mouse models of VSD were fatal either in utero or perinatally, and affected animals were smaller than litter mates (Uchimura et al., 2009; Zhou et al., 2004). No difference in weight was observed in 17.5 dpc embryos in this study, but birth weight and weight at weaning (3 weeks) were not measured. Weights were measured at later time-points but mice were not age or litter matched. One homozygous male mouse died at 11 weeks of unknown causes. This mouse did not appear unhealthy or small when examined the previous week, and its death is unexplained.

Since the gene trapped mouse lines are on a mixed genetic background (129X1/SvJ x 129S1 x C57Bl/6J), the observed VSD and the unexplained death might be strain dependent, and the incidence of this defect may increase depending on strain background. A previous study of VSD in the mouse suggested that strain may be a factor in the observed incidence; however, there was still a significant reduction in the expected number of homozygous mice in that study (Uchimura et al., 2009). This could be addressed by breeding the gene trapped mice for 10 generations on to either C57Bl/6J or 129X1/SvJ.

5.4.1 Does Gfpt1 compensate for the loss of Gfpt2 in the gene trapped mice?

Gfpt is the rate limiting enzyme of the HBP. There is a requirement for the HBP in the developing mouse embryo, thus the lack of phenotype observed for the gene trapped Gfpt2 mouse lines is due to non-requirement for Gfpt2, rather than the pathway itself. Mice lacking glucosamine-phosphate N-acetyltransferase 1 (Gnpnat1, formerly Gnpnat1), which encodes for glucosamine-6-phosphate acetyltransferase, an enzyme that functions down-stream of Gfpt in the HBP, die at 7.5 dpc with general proliferation defects (Boehmelt et al., 2000). Since the HBP is required, and the gene trap alleles (Gfpt2GtF9 and Gfpt2GtF9d1) are null, Gfpt1 is likely to be compensating for the lack of Gfpt2. Gfpt1 is closely related to Gfpt2, and both encode a functional Gfpt protein (McKnight et al., 1992; Oki et al., 1999). Given that Gfpt is the rate-limiting enzyme of the HBP and that the pathway is required as indicated by the requirement for Gnpnat1 expression, Gfpt1 must be expressed in the early embryo, or provided maternally in the absence of Gfpt2. Gfpt1 expression was not examined in early embryos in this study.

Gfpt1 expression may also compensate for a lack of Gfpt2 expression in the foregut endoderm at 8.5 dpc and cardiac cushions at 9.5 dpc. Gfpt1 expression was not detected at 8.5 dpc or 9.5 dpc; however, it is possible that Gfpt1 expression at earlier or later stages (that were not examined) is sufficient to compensate for the lack of Gfpt2 at 8.5 dpc and 9.5 dpc. Alternatively, Gfpt1 may be expressed at a level too low to be detected by whole mount RNA in situ hybridisation. The Gfpt1 coding region probe was cloned from RNA extracted from 9.5 dpc embryos by Duncan Sparrow, suggesting that there is expression at this stage (personal communication). Another possibility is that Gfpt (and the HBP) are not required at this stage and that expression of Gfpt1 after 9.5 dpc is sufficient to compensate for an early lack of Gfpt2. Potentially, Gfpt1 expression might be upregulated in response to the loss of Gfpt2. This could be addressed by performing comparative in situ hybridisation or RT-PCR for Gfpt1 on Gfpt2 wildtype, heterozygous and homozygous embryos from the gene trapped Gfpt2 mice.

In conclusion, Gfpt2 is not required during mouse embryonic development. Mice homozygous for two different gene trap insertions that likely result in null alleles survive to weaning in expected ratios. Adult mice lacking Gfpt2 appear generally healthy under normal conditions.

Chapter 6: Discussion

In this thesis I sought to identify genes novel with respect to heart development. Working from the hypothesis that genes important for heart development might be expressed in the progenitor tissue of the heart, the mesoderm, I expression profiled cDNA libraries representing the mouse germ layers and primitive streak using microarray technology to identify genes that were enriched in the mesoderm. Additionally, since the mesoderm and endoderm are derived from cells that ingress through the primitive streak, and the endoderm also has roles in heart development, I also identified genes enriched in the primitive streak and a population enriched in both the mesoderm and endoderm. From these, candidate genes were selected and their spatial expression patterns in the developing mouse were examined by whole mount RNA in situ hybridisation. Unfortunately, no genes with expression patterns restricted to the developing heart were identified.

In a separate screen in our laboratory that aimed to identify genes that cycle during somitogenesis, Duncan Sparrow and Wendy Chua identified Glutamine fructose-6-phosphate transaminase 2 (Gfpt2). They found that Gfpt2 was expressed in the developing heart at 9.5 days post-coitum (dpc). Gfpt is the rate-limiting enzyme of the hexosamine biosynthesis pathway (HBP) and catalyses the conversion of fructose-6-phosphate to glucosamine-6-phosphate. The HBP produces UDP-N-acetyl glucosamine (UDP-GlcNAc), which is essential for the production of glycosaminoglycans, and the glycosylation of proteins and lipids. The HBP also accounts for approximately 2-5 % of glucose metabolism in the cell. Gfpt2, and closely related Gfpt1, are both functional Gfpt enzymes (McKnight et al., 1992; Oki et al., 1999; Sayeski et al., 1994).

The characterisation of the expression pattern, generation of mouse lines carrying gene trap alleles for Gfpt2 and the functional analysis of these mouse lines became the main focus of this thesis.

6.1 Summary of results

6.1.1 Gfpt2 is expressed in a restricted manner in the developing mouse embryo

In Chapter 3, the expression pattern of Gfpt2 was examined. Gfpt2 was expressed in the foregut endoderm at 8.5 dpc, in the myocardium underlying the cardiac cushions at 9.5 dpc and in the PSM of some embryos (50 %) at 9.5 dpc and 11.5 dpc (Chapter 3, Figure 3.4, 3.5, 3.6). In general, Gfpt2 expression was weak, and restricted in regards to tissue type and developmental time frame. Heart expression was absent by 10.5 dpc. At 10.5 dpc, the strongest region of Gfpt2 expression is in the optic vesicle (Chapter 3, Figure 3.5). At 11.5 dpc, Gfpt2 was expressed in the head region (Chapter 3, Figure 3.5).

The expression pattern of Gfpt1 was also examined in the developing mouse embryo as the two genes are closely related, have similar coding sequences, and the proteins perform a similar function in the HBP. Gfpt1 expression was not detected in the embryo at 8.5 dpc or 9.5 dpc, but was detected in the developing placenta at 9.5 dpc, 11.5 dpc and 15.5 dpc (Chapter 3, Figure 3.7). Interestingly, Gfpt2 was not detected in the placenta, which is in contrast to previously published northern blot results on human tissue, in which Gfpt1 and Gfpt2 were both detected in the placenta (Oki et al., 1999).

Glycosylation of proteins has been shown to modulate function and be involved in signalling in a variety of contexts, including during heart development and somitogenesis (Allen and Rapraeger, 2003; Evrard et al., 1998; Hanover, 2001; Love and Hanover, 2005; Shi and Stanley, 2003; Zachara and Hart, 2006). Given the general importance of UDP-GlcNAc and therefore the HBP, one would have expected either Gfpt1 or Gfpt2 to be expressed in almost every tissue, thus the highly restricted expression of Gfpt2 in the developing embryo in the absence of Gfpt1 detection, was surprising and suggested that Gfpt expression is not required in most tissues at the stages of development examined. Taken together, the known function of Gfpt and the observed expression of Gfpt2 in the mouse embryo, led to the hypothesis that the upregulation of Gfpt2 in particular tissues at particular times was due to an increased requirement for the end product of the HBP, UDP-GlcNAc.

6.1.2 Generation of the Gfpt2 gene trap mouse lines

The highly restricted temporal and spatial expression of Gfpt2 suggested that its expression might be upregulated only where there was a particular developmental requirement for UDP-GlcNAc. It was hypothesised that Gfpt2 might be required in the foregut endoderm at 8.5 dpc to glycosylate signals such as FGF and/or BMP that act on the SHF to promote cell survival and proliferation (Brand, 2003; Lough and Sugi, 2000; Sugi et al., 1995; Zhu et al., 1996). In the cardiac cushions at 9.5 dpc, Gfpt2 expression in the myocardium underlying the cushions might be required for either the production of the GAGs that make up the cardiac jelly of the cushions or to glycosylate signalling molecules to initiate epithelial to mesenchyme transition (EMT) and cellularisation of the cushions. In the PSM, Gfpt2 expression was not detected in all embryos. Many genes cycle in the PSM; however, the expression pattern of Gfpt2 is not reminiscent of cycling genes, which have moving domains of expression rather than the apparent presence/absence observed for Gfpt2. Glycosylation of Notch is known to be important in somitogenesis, thus Gfpt2 expression might be required to provide a sufficient source of UDP-GlcNAc for this purpose, or for the glycosylation of ligands or receptors of other signalling pathways involved in somitogenesis.

To investigate the requirement of Gfpt2 expression in mouse development, two ES cell clones containing gene trap insertions in the Gfpt2 gene were obtained from the Centre for Modeling Human Disease (Chapter 4). The locations of the gene traps in Gfpt2 were confirmed by PCR mapping, followed by sequencing (Chapter 4, Figure 4.2). This revealed that the gene trap insertions were located in intron 4-5 for the 352F9 lines (Gfpt2GtF9 and Gfpt2GtF9d1) and in intron 3-4 for the Gfpt2GtA09d1 line. This corresponds to the gene trap insertions occurring after the first 114 amino acids in the case of the Gfpt2GtF9 and Gfpt2GtF9d1 alleles, and after the first 71 amino acids in the case of the Gfpt2GtA09d1 allele (Chapter 4, Figure 4.2). Both these insertions disrupt the glutamine amidotransferase (GATase) domain of Gfpt2. The GATase domain catalyses the transfer of the ammonia group from glutamine to fructose-6-phosphate (Denisot et al., 1991; Teplyakov et al., 2001).

Three mouse lines, Gfpt2Gt(CMHD-GT_352F9-3)Cmhd (Gfpt2GtF9), Gfpt2Gt(CMHD-GT_352F9-3)d1Cmhd (Gfpt2GtF9d1) and Gfpt2Gt(CMHD-GT_305A09-3)d1Cmhd (Gfpt2GtA09d1) containing gene traps in Gfpt2 were created. The first of these, Gfpt2GtF9, may not generate a functionally null allele. In this allele, the gene trap cassette insertion results in the production of two transcripts. The first transcript, under the control of the endogenous promoter, consists of exons 1-4 followed by the enhanced green fluorescent protein (EGFP) reporter of the gene trap cassette. The second transcript is under the control of the constitutively active phosphoglycerate kinase (PGK) promoter and consists of Neo followed by the remaining exons (5-19) of Gfpt2 (Chapter 4, Figure 4.1, 4.4). Since much of the coding sequence is transcribed and presumably translated, albeit disrupted by the gene trap insertion, the protein may be able to recombine sufficiently to function, although this is unlikely.

Gfpt2GtF9d1 and Gfpt2GtA09d1 are likely to be functionally null alleles. In both these lines, the IRES sequence present in the gene trap cassette has been excised by Cre. The excision of the IRES sequence resulted in loss of the Neo transcript in Gfpt2GtF9d1 and Gfpt2GtA09d1 heterozygous and homozygous embryos (Chapter 4, Figure 4.4). This is likely due to nonsense mediated decay of the Neo transcript, due to the presence of a stop codon more than 55 base pairs from the polyadenylation sequence. The expression of all three alleles was confirmed by expression of GFP (Chapter 4, Figure 4.5 and data not shown).

6.2 Gfpt2 is dispensable for mouse embryonic development

To determine the importance of Gfpt2 in mouse development, mice heterozygous for the gene trap alleles Gfpt2GtF9, Gfpt2GtF9d1 and Gfpt2GtA09d1 were mated to generate homozygotes for each allele. Based on the known expression pattern and function of the Gfpt2 protein, it was hypothesised that Gfpt2 expression might be required for heart and/or somite development. For each allele, embryos from a late stage of development (17.5 dpc for Gfpt2GtF9d1 and Gfpt2GtA09d1, or 18.5 dpc for Gfpt2GtF9) were collected, genotyped, weighed and examined for any external phenotype.

Homozygous embryos for the Gfpt2GtF9 allele were present in Mendelian ratios and no significant differences in embryo weights between different genotypes were observed (Chapter 5, Table 5.1, Figure 5.1). One heterozygous embryo exhibited oedema and a kinked tail, but no other embryos exhibited this or any other gross morphological phenotype. There was no significant difference in the observed numbers of wildtype, heterozygous and homozygous pups and weaned mice, although a slight trend towards an excess of wildtype mice was observed post birth in the absence of any observed post-natal deaths (Chapter 5, Table 5.2). Additionally, there was no significant difference in the weight distribution between genotypes of 18.5 dpc embryos.

Homozygous embryos, pups and weaned mice for the Gfpt2GtF9d1 allele were present in Mendelian ratios and no significant difference was observed in the weight distribution between genotypes (Chapter 5, Table 5.3, Figure 5.2). There was also no significant difference in the observed numbers of Gfpt2GtA09d1 embryos at 17.5 dpc (Chapter 5, Table 5.5). There was no significant difference in weight distribution between genotypes for these embryos (Chapter 5, Figure 5.4).

Since Gfpt2 is expressed in the myocardium of the cardiac cushions, which give rise to the heart valves and membranous interventricular septum (IVS), wildtype and homozygous embryonic hearts from Gfpt2GtF9 and Gfpt2GtA09d1 were dissected from fixed embryos, embedded in wax, sectioned and counterstained with eosin to examine heart morphology in these lines. Six Gfpt2GtA09d1 homozygous hearts were sectioned. Five of the six hearts exhibited normal morphology; however, one embryo exhibited a ventricular septal defect (VSD). The VSD was located in the membranous ventricular septum, which is derived from the cardiac cushions (de Lange et al., 2004). It was not possible to examine the morphology of the other heart valves in this embryo due to poor sections. Three homozygous embryonic hearts were sectioned for the Gfpt2GtF9d1 allele, and no VSD or valve defects were observed for this line (Chapter 5, Figure 5.3).

There was no significant difference in the observed numbers of Gfpt2GtF9d1 wildtype, heterozygous and homozygous pups or weaned mice, suggesting that this gene trap insertion has no effect on mouse development. These mice have been aged and were also weighed over a period of 5 months. No homozygous deaths have been observed and there is no significant difference in weight between wildtype and homozygotes observed over the time period (Chapter 5, Table 5.7). There was also no significant difference in the observed numbers of Gfpt2GtA09d1 pups or weaned mice (Chapter 5, Table 5.6). These mice have been aged and except for one homozygote male mouse which died for unknown reasons at 11 weeks, the homozygotes appear normal.

6.3 Discussion

6.3.1 The delta gene trap insertions should create functionally null alleles

There was no phenotype observed for any of the gene trap alleles (Gfpt2GtF9, Gfpt2GtF9d1, and Gfpt2GtA09d1). One possible reason for this may have been that the gene trap alleles do not create functional null alleles and Gfpt2 is still produced. The insertion point for each gene trap is in the GATase domain of Gfpt2, corresponding to a disruption at amino acid 114 in the case of the Gfpt2GtF9 and Gfpt2GtF9d1 alleles, and at amino acid 71 for the Gfpt2GtA09d1 allele (Chapter 4, Figure 4.2). As described in Section 6.1.2, the Gfpt2GtF9 insertion may not generate a null allele. In the case of the Gfpt2GtF9d1 and Gfpt2GtA09d1 alleles, the transcripts from the point of the insertions are not transcribed (Chapter 4, Figure 4.4). Both the Gfpt2GtF9d1 and Gfpt2GtA09d1 insertions are close to the middle of the GATase domain of Gfpt2 and would be expected to disrupt the function of this domain, and the remainder of the protein would not be produced. In Gfpt2, the GATase domain transfers an ammonia group from glutamine to fructose-6-phosphate, which is isomerised by the two sugar isomerase (SIS) domains, leading to the formation of glucosamine-6-phosphate (Denisot et al., 1991; Teplyakov et al., 2001; Teplyakov et al., 1999). Disruption of the Gfpt2 protein at this point would result in non-functional protein as, although the catalytic residue in the GATase domain is still present, the domain is disrupted and the SIS domains are not transcribed (Teplyakov et al., 2001).

An alternative possibility is that the gene trap alleles are not expressed. Since they are located in introns, alternative splicing could result in no disruption of the allele. As discussed in Chapter 4 (Section 4.7.3, Figure 4.7), the gene trap alleles are expressed as judged by GFP expression, and no alternative splicing has been reported for Gfpt2 (Ensembl, Dec 2009). However, it has not been demonstrated conclusively that full-length transcripts are absent, and even a small number might be enough to rescue function. The absence of full-length transcripts could be shown by RT-PCR, using primers that flank the insertion site and comparing wildtype, heterozygous and homozygous embryos. While the location and the expression of the gene trap inserts provide good evidence that the alleles should be null, an antibody specific to Gfpt2 would determine if the protein is expressed. An antibody that is able to detect both Gfpt1 and Gfpt2 is available, but antibodies that distinguish between the two proteins have not been reported (Nerlich et al., 1998).

A ventricular septal defect (VSD) was observed in one Gfpt2GtA09d1 homozygous embryo (Chapter 5, Figure 5.3). A VSD would be considered a potential phenotype for a Gfpt2 null given that Gfpt2 is expressed in myocardium underlying the cardiac cushions and cushion defects can lead to VSD. However, because five other Gfpt2GtA09d1 homozygous embryos did not show this phenotype and there was no significant difference in survival rates for this line at weaning, it is unclear if this was simply a one-off observation or indicates a low penetrance phenotype caused by the mixed genetic background that these mice are on. The mice are currently on a mixed genetic background of 129X1/SvJ x 129S1 from the ES cells and C57BL/6J from the host blastocysts. The mice are being backcrossed to C57BL/6J, but all experiments with Gfpt2GtA09d1 were conducted on mice that had only been backcrossed two to three times. Genetic background has been demonstrated to affect the penetrance of phenotypes (Bamforth et al., 2004; Uchimura et al., 2009), thus backcrossing these mice to either C57BL/6J or 129X1/SvJ might result in an increase in the incidence of VSD. Mouse models of VSD are often fatal perinatally, and affected animals are smaller than their littermates, even in mixed genetic backgrounds (Uchimura et al., 2009; Zhou et al., 2004). This was not observed for the Gfpt2GtA09d1 or Gfpt2GtF9d1 homozygous mice (Chapter 5, Tables 5.4, 5.6, 5.7, 5.8). Alternatively, if the VSD is sufficiently small, the mice may be able to survive, and it might only become apparent under certain stress conditions. This could be addressed by exposing the mice to cardiac stress tests such as treadmill running or swimming. Reduced exercise tolerance may indicate a small VSD.

The Gfpt2GtF9d1 and Gfpt2GtA09d1 gene trap disruptions would be expected to result in similar phenotypes given that both disrupt the GATase domain of Gfpt. However, the VSD was observed in only one Gfpt2GtA09d1 embryo, and the survival of pups from both the Gfpt2GtF9d1 and Gfpt2GtA09d1 lines was not affected. Thus it seems most likely that the VSD was due to a random chance event rather than directly related to the gene trap insertion, and Gfpt2 is dispensable for mouse development.

6.3.2 Gfpt1 potentially compensates for loss of Gfpt2

As discussed in Chapter 5 (Section 5.4.1), it is likely that Gfpt1 is able to compensate for the loss of Gfpt2 expression during mouse development. The HBP is required during mouse development since mutation in the downstream gene, Gnpnat1, results in general proliferation defects and death at 7.5 dpc (Boehmelt et al., 2000). Since Gfpt is the rate-limiting enzyme of the HBP, either Gfpt1 or Gfpt2 must be present for the pathway to function.

Gfpt1 expression was not detected at 8.5 dpc or 9.5 dpc by whole mount RNA in situ hybridisation, when Gfpt2 is expressed in a restricted manner in the foregut endoderm at 8.5 dpc, and the myocardium underlying the cardiac cushions and the PSM at 9.5 dpc (Chapter 3, Figures 3.4, 3.5). This suggests that Gfpt expression is not directly required in these tissues at these stages of development. Alternatively, Gfpt1 may be expressed at these stages, but below the detection threshold for whole mount RNA in situ hybridisation. This could be assessed by quantitative real-time PCR. Assuming the first of these possibilities, expression of Gfpt1 prior to 8.5 dpc and after 9.5 dpc may be sufficient to compensate for a lack of Gfpt2 in the foregut endoderm, myocardium underlying the cardiac cushions and PSM. Another possibility is that Gfpt1 might be upregulated directly in response to the lack of Gfpt2 expression. This could be assessed by RT-PCR or comparative in situ hybridisation of wildtype, heterozygous and homozygous gene trapped Gfpt2 embryos, probing with Gfpt1 and comparing the level of Gfpt1 expression. However, a negative result would not definitively demonstrate that Gfpt1 does not compensate for loss of Gfpt2, due to the recycling and solubility of UDP-GlcNAc.
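As an illustration of how a quantitative real-time PCR comparison of this kind is commonly analysed, the following minimal sketch applies the widely used comparative Ct (2^-ΔΔCt) calculation. This analysis was not performed in this study. The Ct values, the use of a single reference gene and the function name are all hypothetical, and the calculation assumes approximately 100 % amplification efficiency for both amplicons.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target transcript in a sample relative to a control.

    Uses the comparative Ct (2^-ddCt) calculation, which assumes roughly
    100 % amplification efficiency for target and reference amplicons.
    """
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_sample - d_ct_control
    return 2 ** (-dd_ct)


# Hypothetical Ct values: Gfpt1 and a reference gene measured in a
# homozygous (sample) and a wildtype (control) embryo. Not thesis data.
fold = relative_expression(
    ct_target_sample=24.0, ct_ref_sample=18.0,
    ct_target_control=25.0, ct_ref_control=18.0,
)
print(f"Gfpt1 fold change (homozygous vs wildtype): {fold:.1f}")
```

In this hypothetical case, Gfpt1 crosses threshold one cycle earlier in the homozygote after normalisation, which corresponds to a two-fold increase.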

6.3.3 Is GAG production affected in Gfpt2 gene trapped cardiac cushions?

It is possible that loss of Gfpt2 reduced or changed the composition of GAGs present in the cardiac jelly, but that this was overcome. One of the major components of cardiac jelly is hyaluronic acid. Hyaluronic acid is produced via a pathway that is independent of the hexosamine biosynthesis pathway (Camenisch et al., 2000; McDonald and Camenisch, 2002; Spicer et al., 2002). However, it is known that GlcNAc is required in the cardiac jelly since mutations in genes that encode GAGs that do contain GlcNAc, Crtl1 and chondroitin sulfate proteoglycan core protein 2, which encodes versican, result in cushion defects (Kern et al., 2007; Kern et al., 2006; Mjaatvedt et al., 1998; Wirrig et al., 2007). Glycosylation is a reversible process (Hart et al., 2007). Proteins can be O-glycosylated by O-linked N-acetylglucosamine transferase and deglycosylated by O-GlcNAcases (Hart et al., 2007), thus a sufficient level of UDP-GlcNAc may be obtained by recycling from existing proteins. UDP-GlcNAc is also secreted to the ECM (Hanover, 2001; Zachara and Hart, 2006), thus it is possible that tissues could obtain sufficient substrate through circulating UDP-GlcNAc.

To determine if the cardiac jelly in the Gfpt2 gene trapped homozygotes was affected, sections of somite stage matched 9.5 dpc wildtype and homozygous embryos could be examined histologically using Alcian blue to detect acidic glycosaminoglycans. Since the cushions start to become cellularised at this stage, somite stage matching is more accurate than dpc. If Gfpt2 is required for GAG production, one might expect a reduction in Alcian blue staining, but not a complete loss, since hyaluronic acid, which is a major component of cardiac jelly, is also detected by Alcian blue staining (Camenisch et al., 2000). Another approach is to compare the relative size and cellularisation of the cushions. Cushion cellularisation and size, including the complete lack of cushions, have been shown to be affected in mouse mutants for cardiac jelly components such as Has2 (Camenisch et al., 2000). Ultimately, even if an effect on GAG production in the cardiac jelly in the Gfpt2 gene trapped lines was observed, this study indicates that it was not severe enough to have any effect on mouse survival.

6.3.4 Future studies to determine the function of Gfpt2 in the mouse

To determine if Gfpt1 is able to compensate for the loss of Gfpt2, a mouse model lacking both Gfpt1 and Gfpt2 would be required. At present, there are no mouse models of Gfpt1 available, although recently gene trapped cell lines have become available (MGI and NCBI, 2008).

A knockout mouse model or otherwise null allele of Gfpt1 would enable several questions regarding the function of Gfpt1 in mouse development to be addressed. Firstly, it could be established if Gfpt1 is required for mouse development. It is possible that Gfpt2 may be able to compensate for lack of Gfpt1, with the proteins being interchangeable during mouse development. Compound homozygotes or heterozygotes for Gfpt1 and Gfpt2 could also help to identify the different roles of the two proteins. It is unlikely that neither Gfpt1 nor Gfpt2 is required during mouse development given that the HBP is required, as evidenced by the knockout phenotype of Gnpnat1 (Boehmelt et al., 2000), and Gfpt is the rate-limiting enzyme for this pathway.

Ideally, Gfpt1 and Gfpt2 alleles would be designed such that Cre-lox technology could be utilised to excise each gene in particular tissues of interest in a spatial and temporal manner (Gu et al., 1994; Kuhn et al., 1995; St-Onge et al., 1996). This approach could be used to excise Gfpt2 directly in the heart using a mouse line in which Cre is under the control of a heart specific promoter, such as Nkx2.5 (McFadden et al., 2005), in a background that also lacks or has reduced Gfpt1 expression. Using this approach, it could be determined if Gfpt2 is required in heart development. A similar approach could be used to excise Gfpt2 from the foregut endoderm using a Sonic hedgehog Cre mouse line (Harris-Johnson et al., 2009).

6.4 Final comments

The completion of the mouse genome project has identified the majority of protein coding genes in the mouse; however, the function of many genes and their products is still being elucidated. Collaborative projects such as the International Gene Trap Consortium and the International Knockout Mouse Consortium aim to either disrupt or mutate every protein-coding gene in the mouse. These are large-scale undertakings and a combination of gene trap and targeted knockout strategies are used to disrupt genes in mouse ES cells. Generating mouse lines from these ES cell lines is expensive, thus prioritising genes for functional analysis in the mouse remains an issue.

In this study, Gfpt2 was shown to be expressed in a restricted manner in tissues that pattern the developing heart (foregut endoderm), give rise to the cardiac valves (myocardium underlying the cardiac cushions) and that play a role in vertebral segmentation (presomitic mesoderm). Taken together with the known role of Gfpt2 in the hexosamine biosynthesis pathway and the importance of the glycosylation of substrates in both cardiac cushion development and somitogenesis, it seemed likely that disruption of the Gfpt2 gene in the mouse could result in embryonic defects. However, using two different gene traps that both disrupted Gfpt2, no effect on embryo survival was observed and structural defects (VSD) were observed in only one homozygous Gfpt2GtA09d1 heart.

Chapter 7: Materials and Methods

7.1 Chemicals and Reagents

7.1.1 Chemicals

Ajax Finechem: Ethanol, Isopropanol, N-Hexane
Amyl Media: Bacto tryptone, Bacto yeast extract
Astral: PBS tablets
Ambion: Tri-Reagent, RNAlater
BDH Chemicals: Formamide and Xylenes
Gibco/BRL: FCS
ICN: Glycerol, Tween-20
Invitrogen: Lipofectamine LTX reagent
Progen: Ampicillin (sodium salt) and X-Gal
Promega: RNasin
Pro Sci Tech: OCT compound
Roche: DIG-11-dUTP, dNTPs, NBT/BCIP tablets, and ssDNA
Sigma: Agar, Agarose, Alcoholic Eosin, BCIP tablets, Benzyl Benzoate, Benzyl Alcohol, Bouins fixative, BSA, CaCl2, Citric Acid, Ethidium Bromide, EDTA, Ferricyanide, Ferrocyanide, Glucose, Glutaraldehyde, Glycine, Hematoxylin, Heparin sodium salt, HEPES Acid, MgCl2, Maleic Acid, NaCl, NaOH, Orange G, Paraplast, PFA, Phenol, Phenol Red, Potassium Acetate, Potassium Chloride, Rubidium chloride, Sodium acetate, Sodium Bicarbonate, SDS, Sheep Serum, Sodium lactate, Sodium Pyruvate, Spermidine, Sucrose, Torula, Tris HCl, Tris Base, and Trisodium citrate.

7.1.2 Antibodies/Fluorophores

Amersham: Cy3, Cy5 (#PA23001, #PA25001)
Roche: Anti-DIG-AP Fab fragments
Clontech: Mouse monoclonal anti-HA

7.1.3 Kits

Clontech: Chromaspin 100 DEPC columns
Qiagen: QIAquick Gel Extraction Kit, QIAquick PCR Purification Kit
Invitrogen: PureLink™ Micro-Mini Total RNA Purification Kit, PureLink™ HiPure Plasmid Maxiprep Kit

7.1.4 Enzymes

New England Biolabs: Restriction endonucleases
Ambion: T7, T3 and SP6 RNA polymerase
Roche: Proteinase K; RNase, DNase free; Taq DNA polymerase

7.1.5 Miscellaneous

DNA markers: 100 base pair (bp) and 500 bp ladder markers were purchased from Geneworks. Band sizes ranged from 100 bp to 1 kb (100 bp intervals), and 500 bp to 5 kb (500 bp intervals). DNA fragment sizes and approximate concentrations were determined by loading agarose mini-gels with 500 ng of marker DNA.

7.1.6 Plasmids

IMAGE clones: Obtained from MGC
RIKEN clones: Obtained from University of Queensland

Refer to Appendices 2 and 3 for the list of clones examined by whole mount RNA in situ hybridisation. Plasmids for examining localisation were obtained from Gavin Chapman (Victor Chang Cardiac Research Institute).

7.2 Buffers and Solutions

7.2.1 General Molecular Biology Solutions

5x ABI dilution buffer: 400 mM Tris-HCl (pH 9.0), 10 mM MgCl2

1x TAE: 40 mM Tris-HCl (pH 8.2), 20 mM NaAc and 10 mM EDTA (pH 8.2)

1x TE: 10 mM Tris-HCl (pH 7.5) and 1 mM EDTA

Murine tail DNA lysis solution: 100 mM Tris (pH 8.8), 1 M Trizma-HCl (pH 8.8), 200 mM NaCl, 5 mM EDTA and 0.2 % SDS

Yolk sac DNA lysis solution: 50 mM Tris (pH 8.0), 1 M Trizma-HCl (pH 8.0), 1 mM EDTA and 0.5 % Tween-20

ES cell DNA lysis solution: 10 mM Tris-HCl (pH 7.5), 10 mM EDTA, 10 mM NaCl, 0.5 % Sarkosyl

Ethanol-Salt solution: 75 mM NaCl in 100 % Ethanol. Store at -20 ºC.

Orange G loading Dye: 50 % Glycerol, Orange G to colour.

PBS: Four Phosphate buffer saline tablets (Astral) were dissolved in 400 mL of MQ water and the solution was autoclaved.

7.2.2 Embryo and RNA in situ hybridisation solutions

M2: M2 was prepared by the addition of 10 % Solution A (947.0 mM NaCl, 47.7 mM KCl, 11.9 mM Potassium phosphate, 11.8 mM Magnesium sulfate, 230 mM Sodium lactate, 50 mM Glucose); 1.6 % Solution B (Sodium bicarbonate 264.3 M, Phenol Red); 1 % Solution C (33 M Sodium Pyruvate); 1 % Solution D (171.45 mM CaCl2.2H2O); 8.4 % Solution E (250 mM HEPES Acid, 282.18 M Phenol Red); 11 % heat inactivated FCS. This solution was buffered to pH 7.4 before filter sterilisation and storage at 4 °C for up to two weeks.

PBT: Phosphate buffered saline with 0.1 % Tween-20

20x SSC: 3 M NaCl, 0.3 M Trisodium citrate. pH adjusted to 4.5 or 7 using citric acid.

Prehybridisation Solution: 50 % Formamide, 5X SSC (pH 7.0), 0.1 % Tween-20, Heparin (50 μg/mL)

Hybridisation Solution: 50 % Formamide, 5X SSC (pH 7.0), 0.1 % Tween-20, Heparin (50 μg/mL), Torula yeast RNA (100 μg/mL), herring sperm DNA (100 μg/mL), antisense RNA probe (1:50-1:200 dilution, 0.2-1 μg/mL)

Wash Solution I: 50 % Formamide, 5X SSC (pH 4.5), 1 % SDS

Wash Solution II: 0.5 M NaCl, 0.01 M Tris pH 7.5, 0.1 % Tween-20

Wash Solution III: 50 % Formamide, 2X SSC (pH 4.5)

10x TBS: 1.37 M NaCl, 26.83 mM Potassium chloride, 250 mM Tris (pH 7.5)

TBST: 1X TBS, 0.1 % Tween-20

NTMT: 0.1 M NaCl, 0.1 M Tris pH 9.5, 0.05 M MgCl2, 0.1 % Tween-20

NBT/BCIP stain: 337.5 μg NBT and 175 μg BCIP (dissolved in DMF) per mL of NTMT, or 1 tablet NBT/BCIP dissolved in 10 mL MQ.H2O

5x MAB(T): 500 mM maleic acid, 750 mM NaCl, (0.5 % Tween-20), pH 7.5 with NaOH.

10x Salt: 1.95 M NaCl, 90 mM Tris-HCl, pH 7.5, 10 mM Tris Base, 50 mM NaH2PO4.2H2O, 35 mM NaH2PO4, 50 mM EDTA.

Blocking reagent: Boehringer Blocking Reagent (BM 1096 176) is made up in maleic acid buffer (MAB) as 10 % stocks.

100x Denhardt's solution: 2 % weight/volume (w/v) Bovine serum albumin (BSA), 2 % w/v Ficoll™, 2 % w/v PVP.

Cryosection hybridisation buffer: 1x salt, 50 % formamide, 10 % dextran sulphate, 1 mg/mL Torula rRNA, 1x Denhardt’s solution.
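Several of the recipes above are stated as dilutions of concentrated stocks (for example 5X SSC prepared from 20x SSC). The following is a minimal sketch of the underlying C1V1 = C2V2 arithmetic. The 50 mL batch volume and the function name are illustrative assumptions and are not taken from the protocols.

```python
def stock_volume(stock_conc, final_conc, final_volume):
    """Volume of stock needed so that final_volume is at final_conc (C1*V1 = C2*V2).

    Concentrations only need to share a unit (e.g. both in 'x' or both in %).
    """
    if final_conc > stock_conc:
        raise ValueError("final concentration cannot exceed stock concentration")
    return final_conc * final_volume / stock_conc


# Hypothetical 50 mL batch of prehybridisation solution (batch size is an assumption).
batch_ml = 50.0
ssc_ml = stock_volume(20, 5, batch_ml)          # 20x SSC diluted to 5X
formamide_ml = stock_volume(100, 50, batch_ml)  # neat formamide diluted to 50 %
print(f"20x SSC: {ssc_ml} mL, formamide: {formamide_ml} mL")
```

The remaining volume would be made up with water and the minor components (Tween-20, heparin), which are calculated in the same way from their own stocks.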

7.2.3 Cell culture solutions

ES cell media: Dulbecco’s Modified Eagle’s Medium (DMEM) with high glucose (Gibco 11960-044), 1 % GlutaMAX (GlutaMAX-1, Gibco 35050-061), 100 μM β-mercaptoethanol (Sigma, M7522), 1 mM Sodium Pyruvate (Gibco, 11360-070), 100 μM Non-essential amino acids (Gibco, 11140-050), 1000 U/mL of LIF (Leukaemia inhibitory factor) (Chemicon ESG1107), penicillin/streptomycin (Gibco 15140-148, 50 μg/mL each), 15 % foetal calf serum (FCS).

C2C12 cell media: DMEM (Gibco-BRL), 10 % FCS (Sigma), 1 % Glutamine, 50 μg/mL penicillin/streptomycin.

MEF media: DMEM (Gibco-BRL), 10 % FCS (Sigma), 1 % Glutamine, 50 μg/mL penicillin/streptomycin.

7.3 Microarray method (Performed by Owen Prall, Victor Chang Cardiac Research Institute)

The cDNA libraries were linearised with NotI, and RNA was transcribed from the T7 promoter. The RNA was then purified using Tri-reagent. Purified RNA was reverse transcribed and labelled with either Cy3 (“green”) or Cy5 (“red”) (Amersham #PA23001, #PA25001) (protocol modified from TIGR http://pga.tigr.org/protocols.shtml). These probes were then hybridised to the 22,000 mouse gene Compugen/Sigma-Genosys OligoLibrary (http://www.sigma-genosys.com/). The OligoLibrary was designed using a Compugen transcriptome database to include splice variants whilst excluding SNPs (single nucleotide polymorphisms), repeated sequences, chimeras and intron contamination. As such, the library comprises 22,000 unique sequences representing approximately 21,500 genes (http://www.sigma-genosys.com/).

7.4 RNA in situ hybridisation methods

7.4.1 Embryo dissection

After detection of the vaginal plug, pregnant females were sacrificed on the appropriate day. Females were dissected to reveal the uterine horns, which were then placed in PBS. The deciduas were removed from the uterus by making longitudinal tears just adjacent to each decidua and sliding the deciduas out. The deciduas were then placed in M2. Each decidua was pear shaped, with the embryo lying in the narrower end. For embryos younger than 9 dpc, the deciduas were split in half to reveal the embryo. The embryo was freed by scoring the adjacent tissue around it, and once the embryo was free the Reichert’s membrane was removed. In older embryos the decidua surrounding the embryo is reduced, so the embryo can easily be seen and freed; these embryos were collected by making a circular incision around the junction of the placenta and the yolk sac. Foetuses older than 15.5 dpc were sacrificed by severing the spinal cord at the neck. Hearts were stopped by addition of potassium chloride (100 mM final) and the foetuses were fixed. Hearts were then dissected out from the fixed embryos.

7.4.2 Embryo processing for whole mount RNA in situ hybridisation

Embryos were fixed by rocking in 4 % PFA (paraformaldehyde) at 4 ºC overnight. The following day, embryos were serially dehydrated by two washes in PBS, one wash in each of 25 %, 50 % and 75 % methanol and two washes in 100 % methanol. Embryos were then stored at -20 ºC until use.

7.4.3 Embryo processing for wax embedding (if not processed for whole mount RNA in situ hybridisation)

Embryos were fixed overnight in Bouins fixative and then serially dehydrated once in 70 %, twice in 80 %, twice in 95 %, and thrice in 100 % ethanol washes. Some embryos were fixed in 4 % PFA in place of Bouins fixative, in which case they were washed twice in PBS and then washed once in 30 % and once in 50 % ethanol before moving to 70 % ethanol.

Ethanol-dehydrated embryos were then transferred to xylene for 2 washes, prior to being thrice equilibrated in paraplast at 55 ºC for 30 min, and finally orientated. The length of the ethanol and xylene washes varied depending on embryo age. Typically, embryos older than 15.5 dpc were dehydrated with 1 hour ethanol washes and 30 minute xylene washes; 9.5 dpc embryos were dehydrated with 10 minute ethanol washes, one 5 minute xylene wash and one 10 minute xylene wash; and 8.5 dpc embryos were dehydrated with 5 minute 30-70 % ethanol washes, 3 minute 80-100 % ethanol washes, one 2 minute xylene wash and one 3 minute xylene wash. Embedded embryos were left overnight prior to sectioning at 8-10 μm using a Leica DSC1 microtome. Embryos were stored either orientated in wax or in 70 % ethanol.

7.4.4 Placenta dissection and processing for cryosectioning

Placentas were dissected as described above for embryos. Excess tissue was removed and the umbilical cord was cut close to the placenta. Placentas were fixed by rocking overnight in 4 % PFA at 4 ºC. The following day, placentas were washed twice in PBS, followed by two washes in 30 % sucrose. They were then placed back at 4 ºC until the placenta sank to the bottom of the vial. Placentas were then equilibrated in OCT (optimal cutting temperature compound) for 10 minutes before being placed in moulds filled with OCT. The OCT set on contact with frozen N-hexane. The placentas in frozen OCT were stored at -80 ºC prior to sectioning. Embedded placentas were sectioned at 10 μm using a Leica JUNG CM 300 cryostat (chamber temperature -20 ºC, object temperature -20 ºC).

7.4.5 Synthesis of Riboprobes (for manual in situ hybridisation)

20 μg of plasmid DNA was linearised with the appropriate restriction enzyme in a 200 μL reaction volume. The linearised DNA was purified by phenol extraction followed by ethanol precipitation. The resulting pellet was resuspended at 1 mg/mL in RNA quality MQ. In vitro transcription was subsequently performed in a 50 μL reaction containing 2.5 μg template DNA, 40 U RNAsin (Promega), 1x Transcription buffer (Ambion, matched to each respective polymerase), 50 U of polymerase, 0.5 mM each of GTP, ATP and CTP, 0.32 mM UTP and 0.18 mM DIG-11-rUTP. The in vitro transcription was performed at 37 ºC for 2 hours. Chromaspin 100 DEPC columns (Clontech) were then pre-centrifuged at 500 G for 3 minutes. The samples were loaded onto the columns and the products were collected in RNase-free tubes by spinning at 500 G for 5 minutes. Probes were stored in aliquots at -80°C (for up to 3 years).

7.4.6 Whole mount RNA in situ hybridisation using Intavis in situ robot

The whole mount in situ hybridisation procedure used was a compilation of the methods used by Henrique et al. Nature 375: 787 - 790 (1995), David Wilkinson (Dept. of Developmental Neurobiology, NIMR, The Ridgeway, Mill Hill, London, NW7 1AA) personal communications and other modifications made in Patrick Tam’s laboratory. The main modifications to note are: an Ampliscribe Kit (Epicentre Technologies) was used in conjunction with Dig-11-UTP (Roche Biomedical Chemicals) to synthesise the RNA probes; SDS was used instead of CHAPS in the hybridisation solution and SSC was 5x; only 0.2 μg of probe was used per mL of hybridisation solution; there was no proteinase K digestion; hybridisation and post hybridisation washes were carried out at 70 ºC; washes after hybridisation were high stringency and excluded formamide; and no RNase digestion was done after hybridisation.

7.4.7 Whole mount RNA in situ hybridisation method (manual)

Embryos and tissues were fixed in 4 % PFA overnight at 4 ºC. Subsequently embryos were dehydrated by washing in PBS, 25 %, 50 %, 75 % and 100 % methanol for 10 minutes at room temperature (RT), and were stored at -20 ºC. When ready to proceed, embryos were rehydrated by washing in 75 %, 50 % and 25 % methanol in PBS, and subsequently washed twice in PBT. Embryos were bleached for 1 hour in 6 % hydrogen peroxide, after which they were washed three times in PBT and then incubated in 10 μg/mL of Proteinase K in PBT for 10 minutes (7.5 and 8.5 dpc embryos) or 15 minutes (9.5 dpc embryos). The digestion was stopped by washing the embryos in freshly prepared 2 mg/mL glycine in PBS. After two washes in PBT the embryos were refixed in 0.2 % glutaraldehyde/4 % PFA in PBS for 20 minutes, followed by an additional two washes in PBT. The embryos were then blocked in prehybridisation solution at 70 ºC for one hour, after which they were incubated overnight at 70 ºC in hybridisation solution containing denatured probe. The following day embryos were washed twice in wash solution I for 30 minutes at 70 ºC, once in a one-to-one ratio of wash solution I and II for 10 minutes, and three times in wash solution II. Embryos were then incubated in wash solution II containing 100 μg/mL of RNaseA for 30 minutes at 37 ºC. The embryos were then washed once in both wash solution II and wash solution III. Subsequently they were washed in wash solution III for 30 minutes at 65 ºC, and then three times in TBST before they were blocked in 10 % sheep serum in TBST for 1-2.5 hours. The embryos were then incubated overnight in a 1 in 2000 dilution of anti-DIG-AP Fab fragments in 1 % sheep serum in TBST. The following day the embryos were washed 10 times in TBST for one hour with the last wash overnight at 4 ºC.

On the following morning the embryos were washed three times in NTMT for 10 minutes, after which they were incubated at 37 ºC in stain (see reagents and buffers for the various concentrations) in the dark until the appropriate colour had been obtained. The reaction was stopped by washing the embryos twice in NTMT and PBT, and the coloured product was subsequently fixed overnight in 4 % PFA/0.1 % glutaraldehyde at 4 ºC before being stored in 0.1 % PFA/PBT. All washes were for 5 minutes at room temperature unless otherwise indicated.

7.4.8 Cryosection RNA in situ hybridisation

Slides with sections were defrosted for at least 30 minutes at room temperature (but not more than 3 hours). Each slide was washed in PBS for at least 2 minutes in a Coplin jar that had been baked at 180 ºC overnight to destroy RNases. 400 μL of hybridisation buffer (mixed by vortexing) was added to each slide. Sections were prehybridised for 1 hour at 65-70 ºC in a sealed plastic box with 2 sheets of paper towel wetted with 1x salts:50 % formamide solution, with the rocker set on slow. The probe was diluted in hybridisation buffer to 0.1-1 μg/mL and denatured for 5-10 min at 70 ºC, before being vortexed and centrifuged down. 100 μL of probe mix was added to each slide and slides were covered with a cover slip. Sections were hybridised overnight at 65-70 ºC in the humidified box without rocking. The wash solution (1x SSC pH 4.5, 50 % formamide) was pre-warmed to 65 ºC. Slides were transferred to a Coplin jar containing wash solution for an initial wash of 15 minutes at 65 ºC to allow the cover slips to fall off, followed by two 30 minute washes at 65 ºC. The slides were washed twice for 30 minutes in MABT containing 100 mg levamisole powder per 200 mL MABT, then blocked in MABT, 2 % blocking reagent and 20 % heat inactivated sheep serum for at least 1 hour at room temperature. Anti-DIG Fab fragments were diluted 1:1000 in MABT, 2 % blocking reagent and 20 % heat inactivated sheep serum, and 110 μL was added to each slide. Slides were cover slipped and incubated in a humidified chamber (towels soaked in PBS or water) at room temperature overnight. The following day, the slides were washed 4-5 times for 20 minutes in MABT containing 100 mg levamisole per 200 mL. Slides were then removed from the Coplin jar and wiped. 110 μL of staining buffer was added per slide and the colour reaction was developed in the dark in a humidified chamber (a long white box with the slides resting on paper towel dampened with H2O). If the staining reaction was left overnight, fresh staining buffer was added. The staining reaction was stopped by 2 washes in PBT + 1 mM EDTA. Slides could then be stored for up to 2 days at 4 ºC before the sections were dehydrated in 70 % ethanol and then 2 washes in 100 % ethanol, followed by two washes in xylene, and then mounted with Depex. Slides could also be counterstained by washing in water and staining with Eosin for approximately 10 seconds, prior to the ethanol and xylene washes and mounting.

7.4.9 Processing wax section slides

Paraffin sections were de-waxed by two twenty second xylene washes. If sections were to be counterstained they were rehydrated in 100 % and 70 % ethanol washes for twenty seconds each before equilibration in water. All sections, paraffin or cryosections, were then counterstained in eosin for up to 1 minute and then quickly dehydrated by dipping in 70 % and 100 % ethanol. Prior to Depex mounting, sections were equilibrated by two quick dips in xylene.

7.5 Molecular Biology Methods

7.5.1 Bacteria growth plates & media Luria broth agar: 10 g/L Bacto tryptone peptone digest, 5 g/L Bacto yeast extract, 10 g/L Sodium chloride, 15 g/L Agar agar; gum agar. Made up to a volume of 1 L with RO Water before autoclaving. Once the media had cooled to approximately 55 ºC ampicillin (100 μg/mL) was added. Subsequently the plates were poured and stored at 4 ºC.

Luria broth: 10 g/L Bacto tryptone peptone digest, 5 g/L Bacto yeast extract, 10 g/L Sodium chloride, 15 g/L Agar agar; gum agar. Made up to a volume of 1 L with RO Water before autoclaving.

7.5.2 Maxiprep

500 mL of Luria broth containing 100 μg/mL ampicillin was inoculated either with a single bacterial colony or 5 mL from an overnight culture, and incubated overnight at 37°C in an orbital shaker. The cells were harvested by centrifugation at 4000 G for 5 minutes at 4°C, and the bacterial pellets drained. Plasmid DNA was extracted using the Purelink Invitrogen kit, according to the kit instructions. Yield and quality of plasmid DNA were determined, at wavelengths of 260 nm and 280 nm, using a spectrophotometer or Nanodrop.
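As an illustration of how yield and quality follow from the 260 nm and 280 nm readings, the sketch below uses the standard conversion of one A260 unit to approximately 50 μg/mL of double-stranded DNA at a 1 cm path length. The example readings, dilution factor and function names are hypothetical and are not values from this work.

```python
# Standard conversion for double-stranded DNA at a 1 cm path length.
DSDNA_UG_PER_ML_PER_A260 = 50.0


def dna_concentration(a260, dilution_factor=1.0):
    """Concentration (ug/mL) of the undiluted dsDNA sample."""
    return a260 * DSDNA_UG_PER_ML_PER_A260 * dilution_factor


def purity_ratio(a260, a280):
    """A260/A280 ratio; values near 1.8 are commonly taken to indicate clean DNA."""
    return a260 / a280


# Hypothetical readings from a 1:100 dilution of a maxiprep.
conc = dna_concentration(0.5, dilution_factor=100)
ratio = purity_ratio(0.9, 0.5)
print(f"{conc:.0f} ug/mL, A260/A280 = {ratio:.2f}")
```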

7.5.3 Ethanol precipitation

Samples were precipitated in 1/10 volume of 3 M NaAc (pH 5.2) and 2 volumes of 100 % ethanol (stored at -20 ºC) at -20°C for 20 minutes. The samples were then centrifuged at 12000 G for 20 minutes. The pellet was subsequently washed in 70 % ethanol (1 volume) and air-dried for 10 minutes before resuspending in 1x TE.
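The reagent volumes for a given sample can be worked out as in the sketch below. It assumes that each "volume" is measured relative to the starting sample volume (rather than the sample plus NaAc), which the protocol does not state explicitly; the function name and the 100 μL example are illustrative.

```python
def precipitation_volumes(sample_ul):
    """Reagent volumes (uL) for ethanol precipitation of a sample.

    Assumption: all 'volumes' are relative to the starting sample volume.
    """
    naac_ul = sample_ul / 10      # 1/10 volume of 3 M NaAc (pH 5.2)
    ethanol_ul = 2 * sample_ul    # 2 volumes of 100 % ethanol
    wash_ul = sample_ul           # 1 volume of 70 % ethanol for the wash
    return {
        "naac_ul": naac_ul,
        "ethanol_ul": ethanol_ul,
        "total_ul": sample_ul + naac_ul + ethanol_ul,
        "wash_70pct_ethanol_ul": wash_ul,
    }


# Hypothetical 100 uL sample.
print(precipitation_volumes(100.0))
```

The total volume is useful for checking that the mixture fits in the chosen tube before centrifugation.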

7.5.4 DNA isolation from ES cells

ES cells were grown in 6 cm plates to approximately 70-80 % confluency. Media was aspirated off and cells were washed once in PBS. Cells were lysed in 1 mL ES cell DNA lysis solution containing 0.5 mg/mL Proteinase K. The plate was incubated in a humidified chamber for 2-3 hours or overnight. 2 mL ethanol-salt solution was added to the plate and allowed to sit at room temperature without mixing. The entire volume was transferred to two 1.5 mL microcentrifuge tubes and centrifuged at 12000 G for 5 minutes. The DNA was washed twice with 70 % ethanol to remove salts, and air-dried for 10 minutes before resuspending in MQ filtered water.

7.5.5 DNA isolation from mouse tails or ear-clips

DNA was extracted from all mice by cutting a small piece of tail (approximately 3 mm) or ear clip from each mouse. This tissue was lysed in 500 μL murine tail DNA lysis solution containing Proteinase K (0.5 mg/mL) at 55°C overnight. Any undigested tissue was removed by centrifugation for 5 minutes at 12000 G. Following centrifugation the sample was precipitated with an equal volume of isopropanol and centrifuged for 5 minutes. The sample was then washed in 70 % ethanol and air-dried at RT for 15 minutes. Prior to PCR the DNA was resuspended in MQ filtered water at RT.

7.5.6 DNA isolation from yolk sacs

The yolk sac, or part thereof, was removed and rinsed in MQ prior to digestion in 40 μL of yolk sac DNA lysis solution containing 5 mg/mL Proteinase K for one to twelve hours at 55°C. Any remaining undigested tissue was removed by centrifugation for 5 minutes, prior to Proteinase K heat inactivation at 95°C for 5 minutes.

7.5.7 RNA isolation from ES cells and embryos

ES cells were grown to 70-80 % confluency on 6 cm tissue culture plates. The media was aspirated off and cells were washed once in PBS. RNA was extracted from cells using the PureLink™ Micro-Mini Total RNA Purification Kit according to the manufacturer’s instructions.

7.5.8 PCR primers Synthetic DNA oligonucleotides, used for sequencing and genotyping, were synthesised by Geneworks (Sigma-Aldrich) using a 380B Applied Biosystems DNA Synthesiser and are listed in Table 7.1. Primers were designed using Primer3 http://primer3.sourceforge.net/webif.php

7.5.9 PCR protocols

PCR for genotyping: initial denaturation at 96°C for 1 min, then 35 cycles of denaturation at 96°C for 30 sec, annealing at 58°C for 30 sec and extension at 72°C for 30 sec, with a 10 min extension at 72°C in the final cycle.

PCR for cloning: initial denaturation at 98°C for 1 min, then 35 cycles of denaturation at 98°C for 30 sec, annealing at 58°C for 30 sec and extension at 72°C for 5 min, with a 10 min extension at 72°C in the final cycle.

Extended cloning protocol: initial denaturation at 98°C for 1 min, then 10 cycles of denaturation at 98°C for 30 sec, annealing at 65°C for 30 sec and extension at 72°C for 5 min, followed by 25 cycles of denaturation at 98°C for 30 sec, annealing at 58°C for 30 sec and extension at 72°C for 5 min, with a 10 min extension at 72°C in the final cycle.
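The three cycling programs can be written down as data, which makes the nominal run time easy to compare. This is a sketch only: it assumes the 30 sec denaturation within each cycle is at the same temperature as the initial denaturation, treats the 10 min final extension as a separate hold after the last cycle, and ignores ramp times. The class and variable names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int
    seconds: int


@dataclass(frozen=True)
class Stage:
    steps: tuple
    cycles: int = 1


def run_time_seconds(program):
    """Nominal hold time of a program, ignoring ramping between temperatures."""
    return sum(stage.cycles * sum(s.seconds for s in stage.steps) for stage in program)


GENOTYPING = (
    Stage((Step(96, 60),)),                                          # initial denaturation
    Stage((Step(96, 30), Step(58, 30), Step(72, 30)), cycles=35),
    Stage((Step(72, 600),)),                                         # final extension (assumed separate hold)
)
CLONING = (
    Stage((Step(98, 60),)),
    Stage((Step(98, 30), Step(58, 30), Step(72, 300)), cycles=35),
    Stage((Step(72, 600),)),
)
EXTENDED_CLONING = (
    Stage((Step(98, 60),)),
    Stage((Step(98, 30), Step(65, 30), Step(72, 300)), cycles=10),  # higher annealing temperature first
    Stage((Step(98, 30), Step(58, 30), Step(72, 300)), cycles=25),
    Stage((Step(72, 600),)),
)

for name, program in [("genotyping", GENOTYPING), ("cloning", CLONING),
                      ("extended cloning", EXTENDED_CLONING)]:
    print(f"{name}: {run_time_seconds(program) / 60:.1f} min")
```

Under these assumptions the two cloning programs have the same nominal duration, since the extended protocol only changes the annealing temperature of the first 10 cycles.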

Table 7.1 Oligonucleotides. Fwd indicates forward primer; Rev indicates reverse primer.

Oligo Number | Locus targeted | Primer use(s) | Primer sequence
711 | pUPA Neo (fwd) | ES cell cloning/mapping | GCTATCAGGACATAGCGTTGGCTAC
712 | Gfpt2 exon 7 (rev) | ES cell cloning/mapping | GCTTCTCCCGGGTAGTGAAT
730 | Gfpt2 exon 5 (rev) | ES cell cloning/mapping | ATGATCCCGTTGTGGATGAC
731 | Gfpt2 exon 4 (rev) | ES cell cloning/mapping | TGCCGAAGTGTGTCTCAAA
722 | pUPA LTR fwd | ES cell cloning/mapping | GACAATCGGACAGACACAGA
724 | pUPA LTR rev | ES cell cloning/mapping | TGGTCCAGGCTCTAGTTTTGA
729 | 3’ pUPA LTR fwd | ES cell cloning/mapping | GGACGTCTCCCAGGGTTG
736 | pUPA SA rev | ES cell cloning/mapping | GCAGGCATGTTGACTTCACT
737 | pUPA SD rev | ES cell cloning/mapping | TGTTGGATATGCCCTCGACT
800 | Gfpt2 intron 4-5 (rev) | ES cell cloning/mapping | TGAAAATTGGGGACAACCAT
583 | Gfpt2s exon1 - HindIII | HA tagging Gfpt2 | GAGCCAAGCTTTGCGGAATC
584 | Gfpt2 rev cds - XbaI | HA tagging Gfpt2 | GACGACAGTCTAGAGATAGAAGTCC
649 | Gfpt2 internal primer | Sequencing | TCGGGGTACGAAGCAAATAC
426* | Gfpt2 3’UTR (rev) | Cloning for Gfpt2 in situ probe | TTTCAGGGGACAGGAATCAG
507* | New forward Gfpt2 3’ UTR (fwd) | Cloning for Gfpt2 in situ probe | GCCAAGTCTGTCACTGTGGA
579 | Gfpt1 3’UTR (fwd) | Cloning for in situ probe | AAACTATTGCCTCCTGAAAGC
580 | Gfpt1 3’UTR (rev) | Cloning for in situ probe | AGTTGCGAGAAAATGCCAAC
727* | Gfpt1 cds (fwd) | Cloning for in situ probe | CAAAGGCCTTCAGAGACTGG
728* | Gfpt1 cds (rev) | Cloning for in situ probe | GGACCGACTTCTGGTGGTAA
801 | Gfpt2 intron 3-4 (fwd) | Genotyping | CAGTTTGAGGCCAGTTTGGT
802 | Gfpt2 intron 3-4 (rev) | Genotyping | ATTCCAGCTCTGGGAAAACA
807 | Gfpt2 intron 4-5 (fwd) | Genotyping | GGTCGTAGTTCTAGGGGCAGA
808 | Gfpt2 intron 4-5 (fwd) | Genotyping | CCACTCTTTGTGGGAGAGGA
817 | pUPA IRES fwd | Genotyping | CGTTGGCTACCCGTGATATT
818 | pUPA IRES rev | Genotyping | AGTCGAGGGCATATCCAACA
852 | Gfp (fwd) | Genotyping | GCACCATCTTCTTCAAGGACGAC
853 | Gfp (rev) | Genotyping | AACTCCAGCAGGACCATGTGATCG
21 | Neo (fwd) | Genotyping | CTGTGCTCGACGTTGTCACTGAAG
20 | Neo (rev) | Genotyping | TATTCGGCAAGCAGGCATCGCCA
2 | T3 RNA polymerase promoter | Sequencing | ATTAACCCTCACTAAAGGGA
79 | T7 RNA polymerase promoter | Sequencing | TAATACGACTCACTATAGGG
80 | Sp6 RNA polymerase promoter | Sequencing | ATTTAGGTGACACTATAG

* cloning performed by Duncan Sparrow
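Basic properties of the primers in Table 7.1 can be checked with a few lines of code. The sketch below computes length, GC content and a rough melting temperature using the Wallace rule, Tm = 2(A+T) + 4(G+C). This rule is a swapped-in approximation for illustration; it is not the model used by Primer3, and the function names are hypothetical.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a primer sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Approximate Tm (deg C) by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Two primers copied from Table 7.1.
primers = {
    "712 Gfpt2 exon 7 (rev)": "GCTTCTCCCGGGTAGTGAAT",
    "80 Sp6 promoter": "ATTTAGGTGACACTATAG",
}
for name, seq in primers.items():
    print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, Tm ~{wallace_tm(seq)} C")
```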

7.5.10 Automated capillary sequencing of plasmid DNA

100 ng per 1000 bp of plasmid DNA was subjected to cycle sequencing in the presence of 25 ng of primer, 1 μL Big Dye terminator mix (PE Biosystems) and 3.5 μL of 5x ABI dilution buffer in a total volume of 20 μL. The reaction was cycled through the following steps 25 times: 96°C for 30 seconds, 50°C for 15 seconds, 60°C for 4 minutes. Completed reactions were precipitated for 15 minutes in 80 μL of 75 % isopropanol at room temperature. DNA was pelleted for 20 minutes at 12000 G, washed in 250 μL of 75 % isopropanol, re-centrifuged for 5 minutes and air-dried. Reactions were analysed at the DNA Sequencing Facility, University of New South Wales, Sydney, Australia, and viewed on the Seqman II program (DNASTAR).

7.6 Cell culture

7.6.1 Mouse embryonic fibroblast (MEF) generation and culturing

Embryos were dissected from 14.5 dpc NHtg mice (which are transgenic for neomycin and hygromycin resistance genes) and rinsed in PBS. Embryos were decapitated and internal organs removed. The remainder of each embryo was transferred to a 15 mL falcon tube (approximately 1 tube/embryo) containing 3 mL Trypsin/EDTA and incubated at 4°C overnight. The following day most of the trypsin was removed and tubes were incubated at 37°C for 30 minutes. Media was added to the tubes and the contents triturated several times before plating onto 15 cm plates and growing for 2 days at 37°C, 20 % O2.

When MEFs reached 80-100 % confluency, the media, limbs and bones were aspirated off. Cells were washed in 1x PBS before addition of 4 mL trypsin/EDTA. Cells were incubated for 5 min at 37°C, and 8 mL media was added to each 15 cm dish (3/plate) and 8 mL to a 50 mL falcon tube. Trypsinised cells were added to the 50 mL falcon tube and mixed well. Cells were plated at 4 mL of cells per plate for a 1:3 split and split again after 2 days.

7.6.2 Mitomycin C treatment of MEFs

MEFs were expanded onto 15 cm dishes (total of 27 plates). Plates were treated with Mitomycin C (mmC) for 2-3 hours, using 7 mL of media per 15 cm plate plus 70 μL of mmC (100X stock). The mmC media was aspirated off and cells were washed 3 times in 10 mL PBS. mmC treated MEFs were trypsinised with 3-4 mL trypsin. Trypsin was neutralised by adding 1 volume of ES media and the cells were triturated. Cells were centrifuged down at 800 G for 5 min. Cells were resuspended and aliquots frozen at -80°C (2 vials/15 cm dish) in FCS containing 10 % DMSO.

7.6.3 ES cell culture

The vial containing ES cells was quickly thawed in warm water and the contents transferred to a tube containing at least 5 mL of media. Cells were centrifuged down at 800 G for 5 minutes. The media was aspirated off and the pellet resuspended in 4 mL of media before plating on a 6 cm tissue culture treated dish containing feeder cells (mmC treated MEFs). The media was changed the next day. ES cells were passaged every third day with a 1:5 split. Cells were injected into blastocysts between passage 6 and 10 (cells were received at passage 4).

7.6.4 Transfection of C2C12 cells C2C12 cells were maintained by passaging every 2-3 days when approximately 80 % confluent. For transfection, cells were passaged into 6 well plates containing glass slide coverslips at low confluency and transfected using Lipofectamine LTX (Invitrogen) according to the One Tube Protocol and manufacturer’s instructions, except that serum free C2C12 media was used. Cells were fixed after 16-24 hours.

7.7 Mouse strains

Animal ethics permission to use both targeted and transgenic mouse lines was obtained under the animal ethics numbers 06/42 and 06/43, 09/33. All lines were housed in the Biological Testing Facility in the Garvan Institute of Medical Research, Sydney, or in the BioCore at the Victor Chang Cardiac Research Institute, Sydney, on a perpetual 12 hour light/dark cycle at 23 ºC, and kept under the animal ethics numbers. Males and females were separately caged unless needed for specific breeding purposes and fed ad libitum.

Wildtype mice for dissections were from either QS or C57BL/6J obtained from the ARC, Perth.

Gfpt2Gt(CMHD-GT_352F9-3)Cmhd (Gfpt2GtF9) and Gfpt2Gt(CMHD-GT_305A09-3)Cmhd (Gfpt2GtA09) were created at Victor Chang Cardiac Research Institute from ES cell clones obtained from The Centre for Modeling Human Disease, Toronto (To et al., 2004), by Natalie Wise and myself. The ES cell clones were created on the R1 ES cell line (129X1/SvJ x 129S1) (Nagy A., 2002). The ES cell clones were injected into C57BL/6J blastocysts by Natalie Wise, creating a C57BL/6J: 129X1/SvJ x 129S1 hybrid genetic background. Both lines created were maintained by back-crossing to C57BL/6J. Heterozygous inter-crosses of Gfpt2GtF9 mice created embryos and pups with wildtype Gfpt2+/+, heterozygous Gfpt2GtF9/+ or homozygous Gfpt2GtF9/GtF9 genotypes. Embryos and pups were mainly collected from the F1 inter-crosses of these mice (i.e. chimera mated to C57BL/6J).

Heterozygous Gfpt2GtF9/+ and Gfpt2GtA09/+ female mice were mated to Cre deleter (Tg(CMV-cre)1Cgn) (Schwenk et al., 1995) males to facilitate Cre-lox recombination and excision of the pUPA IRES. Mice heterozygous for the excised IRES were maintained by back-crossing to C57BL/6J. The resulting mouse lines were denoted Gfpt2Gt(CMHD-GT_352F9-3)d1Cmhd (Gfpt2GtF9d1) and Gfpt2Gt(CMHD-GT_305A09-3)d1Cmhd (Gfpt2GtA09d1) respectively. In the case of the 305A09 clone, chimeric males were also mated to Cre deleter females to generate mice that contained the UPA vector with the IRES deleted. Heterozygous inter-crosses of Gfpt2GtF9 mice created embryos with wildtype Gfpt2+/+, heterozygous Gfpt2GtF9/+ or homozygous Gfpt2GtF9/GtF9 genotypes. Heterozygous inter-crosses of Gfpt2GtA09 mice created embryos and pups with wildtype Gfpt2+/+, heterozygous Gfpt2GtA09/+ or homozygous Gfpt2GtA09/GtA09 genotypes. Embryos and pups were mainly collected from mice that had been back-crossed to C57BL/6J at least two or three times.

7.8 Statistical analysis

Chi-square test of embryo and pup survival was performed using Microsoft Excel:mac 2004 software, version 11.5.6. Analysis of Variance (ANOVA) of embryo, pup and mouse weights was performed using Graphpad Prism software, Version 5.0b with Tukey’s post-test.
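For illustration, a chi-square goodness-of-fit test of genotype counts from a heterozygous inter-cross can be sketched as follows. The analysis in this work was performed in Excel; the code below is only an equivalent outline. The expected 1:2:1 Mendelian ratio and the example counts are assumptions introduced here, and the p-value shortcut holds only for 2 degrees of freedom (three genotype classes).

```python
import math


def chi_square_stat(observed, expected_ratio):
    """Chi-square goodness-of-fit statistic against an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df2(chi2):
    """For 2 degrees of freedom the chi-square survival function is exp(-x/2)."""
    return math.exp(-chi2 / 2)


# Hypothetical counts of wildtype, heterozygous and homozygous pups.
observed = (30, 60, 10)
chi2 = chi_square_stat(observed, expected_ratio=(1, 2, 1))
p = p_value_df2(chi2)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

A small p-value for such counts would indicate that the genotypes depart from the expected ratio, for example through loss of homozygous animals.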


APPENDICES

Appendix 1 Gene ontology terms over-represented in the cDNA libraries

Genes 2-fold enriched in each of the germ layers and primitive streak (Analysis 1), and genes 2-fold enriched in both the endoderm and mesoderm (Endoderm-Mesoderm, Analysis 2), were compared to the MGI (Mouse Genome Informatics) gene-set to identify gene ontology terms over-represented in the libraries compared to the mouse genome. Bracketed numbers are the number of genes represented by each term.

Mesoderm | Primitive Streak | Endoderm | Ectoderm | Endoderm-Mesoderm
intracellular (125) | cell part (136) | cytoplasm (137) | intracellular organelle (52) | intracellular (218)
intracellular part (123) | intracellular (101) | protein binding (125) | intracellular membrane-bound organelle (48) | intracellular part (216)
organelle (106) | intracellular part (100) | localization (89) | membrane-bound organelle (48) | organelle (182)
intracellular organelle (105) | intracellular organelle (82) | cytoplasmic part (87) | biopolymer metabolic process (39) | intracellular organelle (181)
membrane-bound organelle (99) | organelle (82) | developmental process (85) | nucleus (35) | membrane-bounded organelle (163)
intracellular membrane-bound organelle (98) | cellular metabolic process (79) | extracellular region part (83) | biological regulation (34) | intracellular membrane-bounded organelle (162)
primary metabolic process (94) | primary metabolic process (78) | establishment of localization (79) | regulation of biological process (33) | cytoplasm (157)
cellular metabolic process (93) | intracellular membrane-bound organelle (75) | extracellular space (78) | developmental process (31) | primary metabolic process (146)
protein binding (79) | membrane-bound organelle (75) | transport (77) | regulation of cellular process (31) | cellular metabolic process (146)
nucleus (67) | macromolecule metabolic process (74) | multicellular organismal development (67) | nucleic acid binding (28) | macromolecule metabolic process (132)
developmental process (51) | biopolymer metabolic process (66) | anatomical structure development (62) | gene expression (27) | nucleus (98)
nucleic acid binding (48) | nucleus (62) | calcium ion binding (31) | multicellular organismal development (26) | cytoplasmic part (93)
anatomical structure morphogenesis (26) | regulation of biological process (49) | endoplasmic reticulum (29) | RNA metabolic process (26) | protein metabolic process (72)
organ morphogenesis (16) | regulation of cellular process (48) | proteolysis (27) | regulation of cellular metabolic process (25) | cellular component organization and biogenesis (65)
tube development (10) | nucleic acid binding (47) | peptidase activity (27) | regulation of metabolic process (25) | organelle part (63)
tube morphogenesis (9) | nucleobase, nucleoside, nucleotide and nucleic acid metabolic process (47) | Golgi apparatus (26) | intracellular organelle part (25) | nucleotide binding (56)
blood vessel morphogenesis (8) | gene expression (44) | endopeptidase activity (22) | organelle part (25) | purine nucleotide binding (48)
vascular endothelial growth factor receptor signaling pathway (4) | regulation of metabolic process (41) | proteinaceous extracellular matrix (18) | regulation of gene expression (24) | purine ribonucleotide binding (47)
neural crest cell development (4) | RNA metabolic process (41) | lipid binding (18) | regulation of transcription, DNA-dependent (23) | ribonucleotide binding (47)
neural crest cell differentiation (4) | regulation of cellular metabolic process (38) | vacuole (17) | transcription, DNA-dependent (23) | macromolecule localization (28)
mesenchymal cell development (4) | DNA binding (36) | lysosome (16) | RNA biosynthetic process (23) | protein localization (26)
mesenchymal cell differentiation (4) | regulation of nucleobase, nucleoside, nucleotide and nucleic acid metabolic process (35) | lytic vacuole (16) | regulation of transcription (23) | establishment of localization in cell (26)
cardiac muscle cell proliferation (3) | transcription (35) | endosome (14) | regulation of nucleobase, nucleoside, nucleotide and nucleic acid metabolic process (23) | intracellular transport (25)
striated muscle cell proliferation (3) | regulation of gene expression (35) | apical part of cell (10) | transcription (23) | protein transport (24)
regulation of mesenchymal cell proliferation (3) | regulation of transcription, DNA-dependent (34) | cysteine-type endopeptidase activity (8) | anatomical structure development (21) | pyrophosphatase activity (22)
positive regulation of mesenchymal cell proliferation (3) | transcription, DNA-dependent (34) | exopeptidase activity (8) | transcription factor activity (14) | hydrolase activity, acting on acid anhydrides, in phosphorus-containing anhydrides (22)
mesenchymal cell proliferation (3) | RNA biosynthetic process (34) | late endosome (6) | transcription factor complex (11) | hydrolase activity, acting on acid anhydrides (22)
muscle cell proliferation (3) | regulation of transcription (34) | lipid transporter activity (6) | nucleoplasm part (11) | nucleoside-triphosphatase activity (20)
blood vessel remodeling (3) | establishment of cell polarity (3) | apical plasma membrane (6) | nucleoplasm (11) | cytosol (17)
protein-tyrosine sulfotransferase activity (2) | stem cell factor receptor binding (2) | chylomicron (3) | embryonic development (10) | DNA replication (11)

Appendix 2 Analysis 1 candidate genes selected for screening by whole mount RNA in situ hybridisation The following candidate genes that were enriched in either the primitive streak or mesoderm (fold-enrichment in brackets) were selected for screening by whole mount RNA in situ hybridisation for heart and/or somite expression at 7.5, 8.5 and 9.5 dpc. Gene names and symbols are current as of March, 2012 (Mouse Genome Database). Gene ontology and protein domains information was collated using Mouse Genome Database (August, 2005). Clone names are the clones obtained to transcribe the anti-sense mRNA probes to detect each gene. Grey text indicates genes were selected but correct clones were not obtained. 1) Unless stated otherwise.

Microarray | Current Gene Name | Current Gene Symbol | Gene Ontology | Protein domain | Image clone (1)
Primitive Streak (4-fold) | RIKEN cDNA 2810408A11 gene | 2810408A11Rik |  | Protein phosphatase inhibitor 2 (IPP-2) | 4216824
Primitive Streak (4-fold) | Syndecan binding protein | Sdcbp | Metabolism, Ras protein signal transduction, membrane, activity, protein binding | PDZ/DHR/GLGF, Aldehyde dehydrogenase | 4017442
Primitive Streak (4-fold) | phosphoglycolate phosphatase | Pgp |  | Hypothetical HAD-like structure containing protein | 5356187
Primitive Streak (3-fold) | RIKEN cDNA 0610030E20 gene | 0610030E20Rik |  |  | 6827710
Primitive Streak (3-fold) | RIKEN cDNA 1200014J11 gene | 1200014J11Rik |  |  | 6335501
Primitive Streak (3-fold) | Myotubularin related protein 14 | Mtmr14 | Protein amino acid dephosphorylation, phosphoprotein phosphatase activity |  | 4021243
Primitive Streak (3-fold) | Transmembrane 9 superfamily member 3 | Tm9sf3 | Transport, extracellular space, integral to membrane, transporter activity | Nonaspanin (TM9SF) | 3589474
Primitive Streak (3-fold) | Sperm flagellar 1 | Spef1 |  | Calponin-like actin-binding, Protein of unknown function DUF1042 | 3590641
Primitive Streak (3-fold) | Serine/threonine kinase 25 (yeast) | Stk25 | Protein amino acid phosphorylation, ATP binding, kinase activity, protein kinase activity, protein serine/threonine kinase activity, protein-tyrosine kinase activity, transferase activity | Protein kinase-like, Protein kinase, Serine/threonine protein kinase, active site, Serine/threonine protein kinase, Tyrosine protein kinase | 4012695
Primitive Streak (3-fold) | Sorting nexin 18 | Snx18 | Intracellular signalling cascade, protein transport, transport, Golgi apparatus | Phox-like, SH3 | 4011275
Primitive Streak (3-fold) | Claudin 25 | Cldn25 | Integral to membrane, membrane |  | 6414170
Primitive Streak (3-fold) | ankyrin repeat domain containing 61 | Ankrd61 | Regulation of transcription, DNA-dependent, DNA binding, transcription factor activity | Ankyrin | 6703467
Primitive Streak (3-fold) | Kin of IRRE like 3 (Drosophila) | Kirrel3 |  |  | 6402632
Primitive Streak (3-fold) | NDC80 kinetochore complex component, homolog (S. cerevisiae) | Nuf2 | Attachment of spindle microtubules to kinetochore, kinetochore | Nuf2 | 3587655
Mesoderm (4-fold) | small nucleolar RNA host gene 8 | Snhg8 |  |  | 3375440
Mesoderm (4-fold) | RIKEN cDNA 9030425E11 gene | 9030425E11Rik | Extracellular space, integral to membrane | Immunoglobulin-like, Immunoglobulin C2 type | 4459717
Mesoderm (4-fold) | Methionine sulfoxide reductase B2 | Msrb2 | Protein repair; extracellular space, mitochondrion; oxidoreductase activity, protein-methionine-R-oxide reductase activity | Methionine sulfoxide reductase B | 5150285
Mesoderm (4-fold) | Rhomboid domain containing 2 | Rhbdd2 | Extracellular space |  | 313629
Mesoderm (4-fold) | tRNA methyltransferase 61 homolog B (S. cerevisiae) | Trmt61b |  |  | 834199
Mesoderm (4-fold) | RIKEN cDNA 9130221H12 gene | 9130221H12Rik |  |  | 4207058
Mesoderm (4-fold) | EFR3 homolog A (S. cerevisiae) | Efr3a | Cell-cell adhesion; cornified envelope, intracellular, plasma membrane; protein homodimerization activity |  | 3495608
Mesoderm (4-fold) | RIKEN cDNA 1500011H22 gene | 1500011H22Rik |  |  | 3153717
Mesoderm (4-fold) | Acyl-Coenzyme A dehydrogenase, short/branched chain | Acadsb | Electron transport, fatty acid metabolism, lipid metabolism; mitochondrion; acyl-CoA dehydrogenase activity, electron carrier activity, oxidoreductase activity, short-branched-chain-acyl-CoA dehydrogenase activity | Acyl-CoA dehydrogenase, Acyl-CoA dehydrogenase (C-terminal), Acyl-CoA dehydrogenase (central region), Acyl-CoA dehydrogenase (N-terminal), Acyl-CoA dehydrogenase (middle and N-terminal), Acyl-CoA dehydrogenase C-terminal-like | 5039627
Mesoderm (4-fold) | myosin, light chain 12A, regulatory, non-sarcomeric | Myl12a | Muscle development | Calcium-binding EF-hand | 3588394
Mesoderm (4-fold) | Ankyrin repeat and SOCS box-containing protein 3 | Asb3 | Intracellular signalling cascade, regulation of transcription, DNA-dependent, DNA binding, transcription factor activity | Ankyrin, SOCS protein, C-terminal | 4947610
Mesoderm (3-fold) | RIKEN cDNA 1700100L14 gene | 1700100L14Rik |  |  | 1745582
Mesoderm (3-fold) | cDNA sequence BC003331 | BC003331 |  |  | 3256634
Mesoderm (3-fold) | family with sequence similarity 36, member A | Fam36a |  |  | 5714061
Mesoderm (3-fold) | Guanylate cyclase 1, soluble, beta 3 | Gucy1b3 | cGMP biosynthesis, circulation, intracellular signalling cascade, nitric oxide mediated signal transduction, perception of smell; cytoplasm, guanylate cyclase complex (soluble), calmodulin binding, guanylate cyclase activity, activity, receptor activity | Guanylate cyclase, Heme NO binding, Heme NO binding associated | 6822142
Mesoderm (3-fold) | Predicted gene 9751 | Gm9751 |  |  | 3487966
Mesoderm (3-fold) | Guanosine monophosphate reductase 2 | Gmpr2 | GMP catabolism, monocyte cell differentiation, nucleotide metabolism; catalytic activity, GMP reductase activity, oxidoreductase activity | FMN/related compound-binding core, Guanosine monophosphate reductase 1, dehydrogenase/GMP reductase | 3590411
Mesoderm (3-fold) | RIKEN cDNA 2610305D13 gene | 2610305D13Rik | Nucleus; DNA binding | KRAB box, Zn-finger (C2H2 type) | 1110225
Mesoderm (3-fold) | WD repeat domain containing 82 | Wdr82 |  | WD-40 repeat | 3596090
Mesoderm (3-fold) | Melanoma associated antigen (mutated) 1 | Mum1 |  | PWWP | 5718202
Mesoderm (3-fold) | Mitochondrial trans-2-enoyl-CoA reductase | Mecr | Cytosol, nucleus; ligand-dependent nuclear receptor binding, receptor activity | Zinc-containing alcohol dehydrogenase superfamily | 3488760
Mesoderm (3-fold) | RIO kinase 2 (yeast) | Riok2 |  |  | 1246405
Mesoderm (3-fold) | RIKEN cDNA D930014E17 gene | D930014E17Rik |  |  | 3824708
Mesoderm (3-fold) | Ankyrin repeat and BTB (POZ) domain containing 2, Chr 2 | Abtb2 |  |  | 6493840
Mesoderm (3-fold) | Dihydrouridine synthase 4-like (S. cerevisiae) | Dus4l | tRNA processing; oxidoreductase activity | Dihydrouridine synthase (DuS), FMN/related compound-binding core | 5709812
Mesoderm (3-fold) | Dipeptidylpeptidase 8 | Dpp8 | Aminopeptidase activity | Peptidase S9, prolyl oligopeptidase region, Peptidase S9B, dipeptidylpeptidase IV N-terminal, Esterase//thioesterase | 6410075
Mesoderm (3-fold) | WD repeat domain, phosphoinositide interacting 2 | Wipi2 |  | WD-40 repeat | 4235237
Primitive Streak (3-fold) | Solute carrier family 25, member 44 | Slc25a44 | Transport, integral to membrane, mitochondrial inner membrane, binding | Adenine nucleotide translocator 1, Mitochondrial substrate carrier, Mitochondrial carrier protein | 6310271
Mesoderm (4-fold) | Kinesin family member 15 | Kif15 |  | Hypothetical P-loop containing nucleotide triphosphate structure containing protein | 640982
Mesoderm (4-fold) | Ubiquitination factor E4A, UFD2 homolog (S. cerevisiae) | Ube4a | cycle | U-box | 3498834
Mesoderm (4-fold) | Ring finger protein, transmembrane 1 | Rnft1 | Integral to membrane | Zn-finger (RING) | 6743413
Mesoderm (3-fold) | Tetratricopeptide repeat domain 33 | Ttc33 |  | TPR repeat | 6590550
Mesoderm (3-fold) | HD domain containing 2 | Hddc2 |  | FERM, Metal-dependent phosphohydrolase, HD region | 3418714
Mesoderm (3-fold) | Histocompatibility minor HA 1 | Hmha1 | Intracellular signalling cascade; diacylglycerol binding, GTPase activator activity | Cdc15/Fes/CIP4, Protein kinase C (phorbol ester/diacylglycerol binding), RhoGAP | 6333971

Appendix 3 Analysis 2 candidate genes selected for screening by whole mount RNA in situ hybridisation

The following candidate genes that were 3-fold enriched in either the mesoderm or the mesendoderm (MesEnd) were selected for screening by whole mount RNA in situ hybridisation for heart and/or somite expression at 7.5, 8.5 and 9.5 dpc. Gene names and symbols are current as of March 2012 (Mouse Genome Database). Gene ontology and protein domain information was collated using the Mouse Genome Database (July 2006). Clone names are the clones obtained to transcribe the anti-sense mRNA probes to detect each gene. Grey text indicates genes that were selected but for which correct clones were not obtained. 1) Unless stated otherwise.

Microarray | Current Gene Name | Current Gene Symbol | Gene Ontology | Protein domain | Image clone (1)
MesEnd 3-fold | SPARC related modular calcium binding 1 | Smoc1 | Basement membrane, extracellular matrix (sensu Metazoa); calcium ion binding | Thyroglobulin type-1; Calcium-binding EF-hand; Proteinase inhibitor I1, Kazal; Protease inhibitor, Kazal-type; EF-Hand type | 1039233
MesEnd 3-fold | wntless homolog (Drosophila) | Wls |  | Protein of unknown function DUF1171 | 110876
MesEnd 3-fold | NADH dehydrogenase (ubiquinone) Fe-S protein 2 | Ndufs2 | Electron transport; mitochondrion; 4 iron, 4 sulfur cluster binding, electron carrier activity, iron ion binding, iron-sulfur cluster binding, metal ion binding, NAD binding, NADH dehydrogenase (ubiquinone) activity, NADH dehydrogenase activity, oxidoreductase activity, oxidoreductase activity, acting on NADH or NADPH | NADH-ubiquinone oxidoreductase, chain 49kDa; NADH dehydrogenase I, D subunit | 3600832
MesEnd 3-fold | Phosphatidylinositol 3 kinase, regulatory subunit, polypeptide 4, p150 | Pik3r4 | Protein amino acid phosphorylation; ATP binding, binding, kinase activity, nucleotide binding, protein kinase activity, protein serine/threonine kinase activity, transferase activity | HEAT, Protein kinase, WD-40 repeat, Serine/threonine protein kinase, Serine/threonine protein kinase, active site, Armadillo-like helical | 371177
MesEnd 3-fold | Guanine nucleotide binding protein (G protein), gamma 12 | Gng12 | G-protein coupled receptor protein signalling pathway, signal transduction; heterotrimeric G-protein complex, membrane; signal transducer activity, small GTPase regulator activity | G-protein, gamma subunit | 373769
MesEnd 3-fold | Polycomb group ring finger 5 | Pcgf5 | Metal ion binding; zinc ion binding | Zinc finger, RING-type | 443986
MesEnd 3-fold | Nedd4 family interacting protein 1 | Ndfip1 | Integral to membrane, membrane; protein binding |  | 536937
MesEnd 3-fold | DEK oncogene (DNA binding) | Dek | DNA binding, GTP binding, RNA binding; nucleus, signal recognition particle (sensu Eukaryota); SRP-dependent cotranslational protein targeting to membrane | GTP-binding signal recognition particle SRP54, G-domain, DNA-binding SAP | 539816
MesEnd 3-fold | DEAH (Asp-Glu-Ala-His) box polypeptide 33 | Dhx33 | Nucleus; ATP binding, ATP-dependent helicase activity, helicase activity, hydrolase activity, nucleic acid binding, nucleotide binding | DEAD/DEAH box helicase; Helicase, C-terminal; ATP-dependent helicase, DEAH-box; Helicase-associated region; DEAD/DEAH box helicase, N-terminal; Protein of unknown function DUF1605 | 560048
MesEnd 3-fold | DEAD (Asp-Glu-Ala-Asp) box polypeptide 10 | Ddx10 | ATP binding, ATP-dependent helicase activity, helicase activity, hydrolase activity, nucleic acid binding, nucleotide binding, RNA binding | ATP-dependent helicase, DEAD-box; DEAD/DEAH box helicase; Helicase, C-terminal; DEAD/DEAH box helicase, N-terminal | 6411126
MesEnd 3-fold | Centrosomal protein 192 | Cep192 |  | Myb, DNA-binding; Calcium-binding EF-hand; PapD-like | 6416497
MesEnd 3-fold | Zinc finger protein 330 | Zfp330 | Nucleus; metal ion binding, zinc ion binding | NOA36 | 6510396
MesEnd 3-fold | CKLF-like MARVEL transmembrane domain containing 3 | Cmtm3 | Chemotaxis, response to stimulus; extracellular space, integral to membrane, membrane; cytokine activity | MARVEL | 670227
MesEnd 3-fold | Serine/threonine kinase 38 like | Stk38l | Actin binding, ATP binding, kinase activity, magnesium ion binding, metal ion binding, nucleotide binding, protein kinase activity, protein serine/threonine kinase activity, transferase activity; cytoplasm; protein amino acid phosphorylation, protein targeting | Protein kinase; Protein kinase, C-terminal; Serine/threonine protein kinase; Serine/threonine protein kinase, active site; Protein kinase-like | 6734706
MesEnd 3-fold | DEAD (Asp-Glu-Ala-Asp) box polypeptide 5 | Ddx5 | ATP binding, ATP-dependent helicase activity, helicase activity, hydrolase activity, nucleic acid binding, nucleotide binding, protein binding, RNA binding, RNA helicase activity, transcription activity; nucleus; positive regulation of transcription | ATP-dependent helicase, DEAD-box; DEAD/DEAH box helicase; Helicase, C-terminal; DEAD/DEAH box helicase, N-terminal, P68HR | 6827275
MesEnd 3-fold | DENN/MADD domain containing 5A | Dennd5a | Nucleic acid binding, protein binding, Rab GTPase binding | Lipoxygenase, LH2; DENN; RUN; dDENN; uDENN; Lipase/lipooxygenase, PLAT/LH2; Nucleic acid-binding, OB-fold | 6840562
MesEnd 3-fold | ST6 (alpha-N-acetyl-neuraminyl-2,3-beta-galactosyl-1,3)-N-acetylgalactosaminide alpha-2,6-sialyltransferase 2 | St6galnac2 | Protein amino acid glycosylation; extracellular space, integral to Golgi membrane, integral to membrane, membrane; sialyltransferase activity, transferase activity, transferase activity, transferring glycosyl groups | Glycosyl transferase, family 29; Sialyltransferase | 962873
MesEnd 3-fold | RIKEN cDNA 2610307P16 gene | 2610307P16Rik |  |  | 1851938
MesEnd 3-fold | small nucleolar RNA host gene 12 | Snhg12 |  |  | 3215442
MesEnd 3-fold | Transmembrane protein 216 | Tmem216 |  |  | 3972433
MesEnd 3-fold | Transmembrane protein 192 | Tmem192 |  |  | 536870
MesEnd 3-fold | ataxin 7-like 3B | Atxn7l3b |  |  | 5697691
MesEnd 3-fold | family with sequence similarity 92, member A | Fam92a |  | Protein of unknown function DUF1208 | 570068
MesEnd 3-fold | Arginine/serine-rich coiled-coil 2 | Rsrc2 |  |  | 613910
MesEnd 3-fold | intraflagellar transport 46 homolog (Chlamydomonas) | Ift46 |  |  | 905778
MesEnd 3-fold | RIKEN cDNA 9530006C21 gene | 9530006C21Rik |  |  | RIKEN clone 9530006C21
MesEnd 3-fold | Non-catalytic region of tyrosine kinase adaptor protein 2 | Nck2 |  |  | RIKEN clone 4833426I10
MesEnd 3-fold | kinesin family member 4, pseudogene | Kif4-ps |  |  | RIKEN clone 4930463M05
Mesoderm 3-fold | Intraflagellar transport 57 homolog (Chlamydomonas) | Ift57 | Apoptosis, caspase activation, regulation of apoptosis; Golgi apparatus; protein binding |  | 1282690
Mesoderm 3-fold | Zinc finger and BTB domain containing 48 | Zbtb48 | Regulation of transcription, DNA-dependent; nucleus; metal ion binding, protein binding, transcription factor activity, zinc ion binding | BTB, Zinc finger, C2H2-type, BTB/POZ | 3493611
Mesoderm 3-fold | TM2 domain containing 2 | Tm2d2 | Extracellular space, integral to membrane | TM2 | 3970884
Mesoderm 3-fold | Zinc finger protein 60 | Zfp60 | Regulation of transcription, DNA-dependent; intracellular, nucleus; DNA binding, metal ion binding, nucleic acid binding, zinc ion binding | KRAB box, Zinc finger, C2H2-type | 3992720
Mesoderm 3-fold | Fgfr1 oncogene partner | Fgfr1op | Unknown | Lissencephaly type-1-like homology motif | 4935625
Mesoderm 3-fold | endoplasmic reticulum lectin 1 | Erlec1 | Receptor activity | Mannose-6-phosphate receptor, binding, Glucosidase II beta subunit-like | 5699887
Mesoderm 3-fold | Zinc finger protein 423 | Zfp423 | Nucleus; nucleic acid binding | Zn-finger (C2H2 type) | 5705479
Mesoderm 3-fold | Zinc finger protein 398 | Zfp398 | Regulation of transcription, DNA-dependent; intracellular, nucleus; metal ion binding, nucleic acid binding, zinc ion binding | KRAB box, KRAB-related, Zinc finger, C2H2-type, t-snare | 6815946
Mesoderm 3-fold | Zinc finger protein 618 | Zfp618 |  |  | 6827030
Mesoderm 3-fold | Transmembrane protein 81 | Tmem81 | Protein targeting; integral to membrane | Hypothetical Microbodies C-terminal targeting signal containing protein | 774908
Mesoderm 3-fold | Microtubule-associated protein 9 | Mtap9 |  |  | 763339
Mesoderm 3-fold | RIKEN cDNA 4930451G09 gene | 4930451G09Rik |  |  | 849522
Mesoderm 3-fold | Deleted in lymphocytic leukemia, 2 | Dleu2 |  |  | RIKEN clone 1810047A16
Mesoderm 3-fold | hornerin | Hrnr |  |  | RIKEN clone 5830443G21
Mesoderm 3-fold | RIKEN cDNA 6030492E11 gene | 6030492E11Rik |  |  | RIKEN clone 6030492E11
Mesoderm 3-fold | RIKEN cDNA 2610034E01 gene | 2610034E01Rik |  |  | RIKEN clone 2610034E01
Mesoderm 3-fold | RIKEN cDNA 8430406H22 gene | 8430406H22Rik |  |  | RIKEN clone 8430406H22
MesEnd 3-fold | BCL2-associated transcription factor 1 | Bclaf1 | Negative regulation of transcription, positive regulation of apoptosis, regulation of transcription, DNA-dependent, transcription; nucleus; DNA binding, protein binding, transcriptional repressor activity |  | 554153
MesEnd 3-fold | Programmed cell death 10 | Pdcd10 | Apoptosis | Protein of unknown function DUF1241 | 5689088
MesEnd 3-fold | RAP1, GTP-GDP dissociation stimulator 1 | Rap1gds1 | Cytoskeleton | Armadillo; Armadillo-like helical | 615218
MesEnd 3-fold | Coiled-coil domain containing 56 | Ccdc56 | Integral to membrane, mitochondrion |  | 5407940
Mesoderm 3-fold | RIKEN cDNA 1500032P08 gene | 1500032P08Rik |  |  | 5704879
Mesoderm 3-fold | methylenetetrahydrofolate dehydrogenase (NADP+ dependent) 2-like | Mthfd2l |  |  | 636074
Mesoderm 3-fold | Zinc fingers and homeoboxes 3 | Zhx3 | Regulation of transcription (DNA-dependent); nucleus; DNA binding, nucleic acid binding, transcription factor activity, zinc ion binding | Homeobox, Zn-finger (C2H2 type) | RIKEN clone 9530010N21
Mesoderm 3-fold | anoctamin 10 | Ano10 |  | Protein of unknown function DUF590 | 3590438
MesEnd 3-fold | RIKEN cDNA 5730405O12 gene | 5730405O12Rik |  |  | RIKEN clone 5730405O12
Mesoderm 3-fold | RIKEN cDNA A430103D13 gene | A430103D13Rik |  |  | RIKEN clone A430103D13

References

Acevedo-Arozena, A., Wells, S., Potter, P., Kelly, M., Cox, R. D. and Brown, S. D. (2008). ENU mutagenesis, a way forward to understand gene function. Annu Rev Genomics Hum Genet 9, 49-69.

Akiyama, H., Chaboissier, M. C., Behringer, R. R., Rowitch, D. H., Schedl, A., Epstein, J. A. and de Crombrugghe, B. (2004). Essential role of Sox9 in the pathway that controls formation of cardiac valves and septa. Proc Natl Acad Sci U S A 101, 6502-7.

Alberts, B., Bray, D., Lewis, J., Raff, M., Roberts, K. and Watson, J. (1994). Molecular Biology of the Cell. New York: Garland Publishing.

Allen, B. L. and Rapraeger, A. C. (2003). Spatial and temporal expression of heparan sulfate in mouse development regulates FGF and FGF receptor assembly. J Cell Biol 163, 637-48.

Altschul, S. F., Madden, T. L., Schaffer, A. A., Zhang, J., Zhang, Z., Miller, W. and Lipman, D. J. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25, 3389-402.

Anderson, R. H., Webb, S., Brown, N. A., Lamers, W. and Moorman, A. (2003a). Development of the heart: (2) Septation of the atriums and ventricles. Heart 89, 949-58.

Anderson, R. H., Webb, S., Brown, N. A., Lamers, W. and Moorman, A. (2003b). Development of the heart: (3) formation of the ventricular outflow tracts, arterial valves, and intrapericardial arterial trunks. Heart 89, 1110-8.

Ang, S. J. and Behringer, R. R. (2002). Anterior-Posterior Patterning of the Mouse Body Axis at Gastrulation. In Mouse Development: Patterning, Morphogenesis, and Organogenesis, (ed. J. Rossant and P. P. L. Tam): Academic Press.


Arai, A., Yamamoto, K. and Toyama, J. (1997). Murine cardiac progenitor cells require visceral embryonic endoderm and primitive streak for terminal differentiation. Dev Dyn 210, 344-53.

Armstrong, E. J. and Bischoff, J. (2004). Heart valve development: endothelial cell signaling and differentiation. Circ Res 95, 459-70.

Aulehla, A. and Johnson, R. L. (1999). Dynamic expression of lunatic fringe suggests a link between notch signaling and an autonomous cellular oscillator driving somite segmentation. Dev Biol 207, 49-61.

Aulehla, A., Wehrle, C., Brand-Saberi, B., Kemler, R., Gossler, A., Kanzler, B. and Herrmann, B. G. (2003). Wnt3a plays a major role in the segmentation clock controlling somitogenesis. Dev Cell 4, 395-406.

Bajolle, F., Zaffran, S. and Bonnet, D. (2009). Genetics and embryological mechanisms of congenital heart diseases. Arch Cardiovasc Dis 102, 59-63.

Bajolle, F., Zaffran, S., Kelly, R. G., Hadchouel, J., Bonnet, D., Brown, N. A. and Buckingham, M. E. (2006). Rotation of the myocardial wall of the outflow tract is implicated in the normal positioning of the great arteries. Circ Res 98, 421-8.

Bamforth, S. D., Braganca, J., Farthing, C. R., Schneider, J. E., Broadbent, C., Michell, A. C., Clarke, K., Neubauer, S., Norris, D., Brown, N. A. et al. (2004). Cited2 controls left-right patterning and heart development through a Nodal-Pitx2c pathway. Nat Genet 36, 1189-96.

Barbera, J. P., Rodriguez, T. A., Greene, N. D., Weninger, W. J., Simeone, A., Copp, A. J., Beddington, R. S. and Dunwoodie, S. (2002). Folic acid prevents exencephaly in Cited2 deficient mice. Hum Mol Genet 11, 283-93.

Barbosky, L., Lawrence, D. K., Karunamuni, G., Wikenheiser, J. C., Doughman, Y. Q., Visconti, R. P., Burch, J. B. and Watanabe, M. (2006). Apoptosis in the developing mouse heart. Dev Dyn 235, 2592-602.

Barnett, J. V. and Desgrosellier, J. S. (2003). Early events in valvulogenesis: a signaling perspective. Birth Defects Res C Embryo Today 69, 58-72.

Bartram, U., Molin, D. G., Wisse, L. J., Mohamad, A., Sanford, L. P., Doetschman, T., Speer, C. P., Poelmann, R. E. and Gittenberger-de Groot, A. C. (2001). Double-outlet right ventricle and overriding tricuspid valve reflect disturbances of looping, myocardialization, endocardial cushion differentiation, and apoptosis in TGF-beta(2)-knockout mice. Circulation 103, 2745-52.

Bateman, A. (1999). The SIS domain: a phosphosugar-binding domain. Trends Biochem Sci 24, 94-5.

Beddington, R. S. and Robertson, E. J. (1999). Axis development and early asymmetry in mammals. Cell 96, 195-209.

Beissbarth, T. and Speed, T. P. (2004). GOstat: find statistically overrepresented Gene Ontologies within a group of genes. Bioinformatics 20, 1464-5.

Benson, D. W., Silberbach, G. M., Kavanaugh-McHugh, A., Cottrill, C., Zhang, Y., Riggs, S., Smalls, O., Johnson, M. C., Watson, M. S., Seidman, J. G. et al. (1999). Mutations in the cardiac transcription factor NKX2.5 affect diverse cardiac developmental pathways. J Clin Invest 104, 1567-73.

Bernanke, D. H. and Markwald, R. R. (1982). Migratory behavior of cardiac cushion tissue cells in a collagen-lattice culture system. Dev Biol 91, 235-45.

Bessho, Y., Miyoshi, G., Sakata, R. and Kageyama, R. (2001). Hes7: a bHLH-type repressor gene regulated by Notch and expressed in the presomitic mesoderm. Genes Cells 6, 175-85.

Biben, C., Weber, R., Kesteven, S., Stanley, E., McDonald, L., Elliott, D. A., Barnett, L., Koentgen, F., Robb, L., Feneley, M. et al. (2000). Cardiac septal and valvular dysmorphogenesis in mice heterozygous for mutations in the homeobox gene Nkx2-5. Circ Res 87, 888-95.

Blackshaw, S., Harpavat, S., Trimarchi, J., Cai, L., Huang, H., Kuo, W. P., Weber, G., Lee, K., Fraioli, R. E., Cho, S. H. et al. (2004). Genomic analysis of mouse retinal development. PLoS Biol 2, E247.

Boehmelt, G., Wakeham, A., Elia, A., Sasaki, T., Plyte, S., Potter, J., Yang, Y., Tsang, E., Ruland, J., Iscove, N. N. et al. (2000). Decreased UDP-GlcNAc levels abrogate proliferation control in EMeg32-deficient cells. Embo J 19, 5092-104.

Brand, T. (2003). Heart development: molecular insights into cardiac specification and early morphogenesis. Dev Biol 258, 1-19.

Brent, A. E., Schweitzer, R. and Tabin, C. J. (2003). A somitic compartment of tendon progenitors. Cell 113, 235-48.

Brown, C. B., Boyer, A. S., Runyan, R. B. and Barnett, J. V. (1999). Requirement of type III TGF-beta receptor for endocardial cell transformation in the heart. Science 283, 2080-2.

Brown, C. B., Feiner, L., Lu, M. M., Li, J., Ma, X., Webber, A. L., Jia, L., Raper, J. A. and Epstein, J. A. (2001). PlexinA2 and semaphorin signaling during cardiac neural crest development. Development 128, 3071-80.

Bruneau, B. G. (2008). The developmental genetics of congenital heart disease. Nature 451, 943-8.

Camenisch, T. D., Molin, D. G., Person, A., Runyan, R. B., Gittenberger-de Groot, A. C., McDonald, J. A. and Klewer, S. E. (2002a). Temporal and distinct TGFbeta ligand requirements during mouse and avian endocardial cushion morphogenesis. Dev Biol 248, 170-81.

Camenisch, T. D., Schroeder, J. A., Bradley, J., Klewer, S. E. and McDonald, J. A. (2002b). Heart-valve mesenchyme formation is dependent on hyaluronan-augmented activation of ErbB2-ErbB3 receptors. Nat Med 8, 850-5.


Camenisch, T. D., Spicer, A. P., Brehm-Gibson, T., Biesterfeldt, J., Augustine, M. L., Calabro, A., Jr., Kubalak, S., Klewer, S. E. and McDonald, J. A. (2000). Disruption of hyaluronan synthase-2 abrogates normal cardiac morphogenesis and hyaluronan-mediated transformation of epithelium to mesenchyme. J Clin Invest 106, 349-60.

Carey, F. J., Linney, E. A. and Pedersen, R. A. (1995). Allocation of epiblast cells to germ layer derivatives during mouse gastrulation as studied with a retroviral vector. Dev Genet 17, 29-37.

Chen, J. N. and Fishman, M. C. (2000). Genetics of heart development. Trends Genet 16, 383-8.

Chen, Z., Friedrich, G. A. and Soriano, P. (1994). Transcriptional enhancer factor 1 disruption by a retroviral gene trap leads to heart defects and embryonic lethality in mice. Genes Dev 8, 2293-301.

Christoffels, V. M., Habets, P. E., Franco, D., Campione, M., de Jong, F., Lamers, W. H., Bao, Z. Z., Palmer, S., Biben, C., Harvey, R. P. et al. (2000). Chamber formation and morphogenesis in the developing mammalian heart. Dev Biol 223, 266-78.

Comer, F. I. and Hart, G. W. (2001). Reciprocity between O-GlcNAc and O-phosphate on the carboxyl terminal domain of RNA polymerase II. Biochemistry 40, 7845-52.

Cooke, J. and Zeeman, E. C. (1976). A clock and wavefront model for control of the number of repeated structures during animal morphogenesis. J Theor Biol 58, 455-76.

Dale, J. K., Malapert, P., Chal, J., Vilhais-Neto, G., Maroto, M., Johnson, T., Jayasinghe, S., Trainor, P., Herrmann, B. and Pourquie, O. (2006). Oscillations of the snail genes in the presomitic mesoderm coordinate segmental patterning and morphogenesis in vertebrate somitogenesis. Dev Cell 10, 355-66.

de la Pompa, J. L., Timmerman, L. A., Takimoto, H., Yoshida, H., Elia, A. J., Samper, E., Potter, J., Wakeham, A., Marengere, L., Langille, B. L. et al. (1998). Role of the NF-ATc transcription factor in morphogenesis of cardiac valves and septum. Nature 392, 182-6.

de Lange, F. J., Moorman, A. F., Anderson, R. H., Manner, J., Soufan, A. T., de Gier-de Vries, C., Schneider, M. D., Webb, S., van den Hoff, M. J. and Christoffels, V. M. (2004). Lineage and morphogenetic analysis of the cardiac valves. Circ Res 95, 645-54.

Delot, E. C., Bahamonde, M. E., Zhao, M. and Lyons, K. M. (2003). BMP signaling is required for septation of the outflow tract of the mammalian heart. Development 130, 209-20.

Denisot, M. A., Le Goffic, F. and Badet, B. (1991). Glucosamine-6-phosphate synthase from Escherichia coli yields two proteins upon limited proteolysis: identification of the glutamine amidohydrolase and 2R ketose/aldose isomerase-bearing domains based on their biochemical properties. Arch Biochem Biophys 288, 225-30.

Dequeant, M. L., Glynn, E., Gaudenz, K., Wahl, M., Chen, J., Mushegian, A. and Pourquie, O. (2006). A complex oscillating network of signaling genes underlies the mouse segmentation clock. Science 314, 1595-8.

Dequeant, M. L. and Pourquie, O. (2008). Segmental patterning of the vertebrate embryonic axis. Nat Rev Genet 9, 370-82.

Dickson, M. C., Slager, H. G., Duffie, E., Mummery, C. L. and Akhurst, R. J. (1993). RNA and protein localisations of TGF beta 2 in the early mouse embryo suggest an involvement in cardiac development. Development 117, 625-39.

Dobbin, K. K., Kawasaki, E. S., Petersen, D. W. and Simon, R. M. (2005). Characterizing dye bias in microarray experiments. Bioinformatics 21, 2430-7.

Dor, Y., Camenisch, T. D., Itin, A., Fishman, G. I., McDonald, J. A., Carmeliet, P. and Keshet, E. (2001a). A novel role for VEGF in endocardial cushion formation and its potential contribution to congenital heart defects. Development 128, 1531-8.


Dor, Y., Klewer, S. E., McDonald, J. A., Keshet, E. and Camenisch, T. D. (2003). VEGF modulates early heart valve formation. Anat Rec A Discov Mol Cell Evol Biol 271, 202-8.

Dor, Y., Porat, R. and Keshet, E. (2001b). Vascular endothelial growth factor and vascular adjustments to perturbations in oxygen homeostasis. Am J Physiol Cell Physiol 280, C1367-74.

Dunwoodie, S. L. (2009). Mutation of the fucose-specific beta1,3 N-acetylglucosaminyltransferase LFNG results in abnormal formation of the spine. Biochim Biophys Acta 1792, 100-11.

Dunwoodie, S. L. and Beddington, R. S. (2002). The expression of the imprinted gene Ipl is restricted to extra-embryonic tissues and embryonic lateral mesoderm during early mouse development. Int J Dev Biol 46, 459-66.

Dunwoodie, S. L., Clements, M., Sparrow, D. B., Sa, X., Conlon, R. A. and Beddington, R. S. (2002). Axial skeletal defects caused by mutation in the spondylocostal dysplasia/pudgy gene Dll3 are associated with disruption of the segmentation clock within the presomitic mesoderm. Development 129, 1795-806.

Dunwoodie, S. L., Henrique, D., Harrison, S. M. and Beddington, R. S. (1997). Mouse Dll3: a novel divergent Delta gene which may complement the function of other Delta homologues during early pattern formation in the mouse embryo. Development 124, 3065-76.

Dunwoodie, S. L., Rodriguez, T. A. and Beddington, R. S. (1998). Msg1 and Mrg1, founding members of a gene family, show distinct patterns of gene expression during mouse embryogenesis. Mech Dev 72, 27-40.

Eisenberg, L. M. and Markwald, R. R. (1995). Molecular regulation of atrioventricular valvuloseptal morphogenesis. Circ Res 77, 1-6.


Eldadah, Z. A., Hamosh, A., Biery, N. J., Montgomery, R. A., Duke, M., Elkins, R. and Dietz, H. C. (2001). Familial Tetralogy of Fallot caused by mutation in the jagged1 gene. Hum Mol Genet 10, 163-9.

Elliott, D. A., Kirk, E. P., Yeoh, T., Chandar, S., McKenzie, F., Taylor, P., Grossfeld, P., Fatkin, D., Jones, O., Hayes, P. et al. (2003). Cardiac homeobox gene NKX2-5 mutations and congenital heart disease: associations with atrial septal defect and hypoplastic left heart syndrome. J Am Coll Cardiol 41, 2072-6.

Enciso, J. M., Gratzinger, D., Camenisch, T. D., Canosa, S., Pinter, E. and Madri, J. A. (2003). Elevated glucose inhibits VEGF-A-mediated endocardial cushion formation: modulation by PECAM-1 and MMP-2. J Cell Biol 160, 605-15.

Ensembl. (Dec 2009) (http://www.ensembl.org)

Esko, J. D. and Selleck, S. B. (2002). Order out of chaos: assembly of ligand binding sites in heparan sulfate. Annu Rev Biochem 71, 435-71.

Evrard, Y. A., Lun, Y., Aulehla, A., Gan, L. and Johnson, R. L. (1998). lunatic fringe is an essential mediator of somite segmentation and patterning. Nature 394, 377-81.

Flagg, A. E., Earley, J. U. and Svensson, E. C. (2007). FOG-2 attenuates endothelial-to-mesenchymal transformation in the endocardial cushions of the developing heart. Dev Biol 304, 308-16.

Forsberg, H., Crozet, F. and Brown, N. A. (1998). Waves of mouse Lunatic fringe expression, in four-hour cycles at two-hour intervals, precede somite boundary formation. Curr Biol 8, 1027-30.

Friedrich, G. and Soriano, P. (1991). Promoter traps in embryonic stem cells: a genetic screen to identify and mutate developmental genes in mice. Genes Dev 5, 1513-23.

Fukiishi, Y. and Morriss-Kay, G. M. (1992). Migration of cranial neural crest cells to the pharyngeal arches and heart in rat embryos. Cell Tissue Res 268, 1-8.

Ganss, B. and Kobayashi, H. (2002). The zinc finger transcription factor Zfp60 is a negative regulator of cartilage differentiation. J Bone Miner Res 17, 2151-60.

Garg, V., Kathiriya, I. S., Barnes, R., Schluterman, M. K., King, I. N., Butler, C. A., Rothrock, C. R., Eapen, R. S., Hirayama-Yamada, K., Joo, K. et al. (2003). GATA4 mutations cause human congenital heart defects and reveal an interaction with TBX5. Nature 424, 443-7.

Gaussin, V., Morley, G. E., Cox, L., Zwijsen, A., Vance, K. M., Emile, L., Tian, Y., Liu, J., Hong, C., Myers, D. et al. (2005). Alk3/Bmpr1a receptor is required for development of the atrioventricular canal into valves and annulus fibrosus. Circ Res 97, 219-26.

Gene Expression Database. (June, 2006) at the Mouse Genome Informatics. The Jackson Laboratory, Bar Harbor, Maine (http://www.informatics.jax.org)

Gene Expression Database. (May 2005) at the Mouse Genome Informatics. The Jackson Laboratory, Bar Harbor, Maine (http://www.informatics.jax.org)

Gersdorff, N., Muller, M., Schall, A. and Miosge, N. (2006). Secreted modular calcium-binding protein-1 localization during mouse embryogenesis. Histochem Cell Biol.

Gittenberger-de Groot, A. C., Bartelings, M. M., Deruiter, M. C. and Poelmann, R. E. (2005). Basics of cardiac development for the understanding of congenital heart malformations. Pediatr Res 57, 169-76.

Gossler, A. and Hrabe de Angelis, M. (1998). Somitogenesis. Curr Top Dev Biol 38, 225-87.

Gray, D., Plusa, B., Piotrowska, K., Na, J., Tom, B., Glover, D. M. and Zernicka-Goetz, M. (2004a). First cleavage of the mouse embryo responds to change in egg shape at fertilization. Curr Biol 14, 397-405.

Gray, P. A., Fu, H., Luo, P., Zhao, Q., Yu, J., Ferrari, A., Tenzen, T., Yuk, D. I., Tsung, E. F., Cai, Z. et al. (2004b). Mouse brain organization revealed through direct genome-scale TF expression analysis. Science 306, 2255-7.

Gritli-Linde, A., Vaziri Sani, F., Rock, J. R., Hallberg, K., Iribarne, D., Harfe, B. D. and Linde, A. (2009). Expression patterns of the Tmem16 gene family during cephalic development in the mouse. Gene Expr Patterns 9, 178-91.

Gruber, P. J. and Epstein, J. A. (2004). Development gone awry: congenital heart disease. Circ Res 94, 273-83.

Gu, H., Marth, J. D., Orban, P. C., Mossmann, H. and Rajewsky, K. (1994). Deletion of a DNA polymerase beta gene segment in T cells using cell type-specific gene targeting. Science 265, 103-6.

Guo, G. and Robson, P. (2008). Transcription factor dynamics in the preimplantation embryo MGI Direct Data Submission.

Hallaq, H., Pinter, E., Enciso, J., McGrath, J., Zeiss, C., Brueckner, M., Madri, J., Jacobs, H. C., Wilson, C. M., Vasavada, H. et al. (2004). A null mutation of Hhex results in abnormal cardiac development, defective vasculogenesis and elevated Vegfa levels. Development 131, 5197-209.

Hanover, J. A. (2001). Glycan-dependent signaling: O-linked N-acetylglucosamine. Faseb J 15, 1865-76.

Harris-Johnson, K. S., Domyan, E. T., Vezina, C. M. and Sun, X. (2009). beta-Catenin promotes respiratory progenitor identity in mouse foregut. Proc Natl Acad Sci U S A 106, 16287-92.

Harrison, S. M., Dunwoodie, S. L., Arkell, R. M., Lehrach, H. and Beddington, R. S. (1995). Isolation of novel tissue-specific genes from cDNA libraries representing the individual tissue constituents of the gastrulating mouse embryo. Development 121, 2479-89.

Harrison, S. M., Houzelstein, D., Dunwoodie, S. L. and Beddington, R. S. (2000). Sp5, a new member of the Sp1 family, is dynamically expressed during development and genetically interacts with Brachyury. Dev Biol 227, 358-72.

Hart, G. W., Housley, M. P. and Slawson, C. (2007). Cycling of O-linked beta-N-acetylglucosamine on nucleocytoplasmic proteins. Nature 446, 1017-22.

Hirata, H., Bessho, Y., Kokubu, H., Masamizu, Y., Yamada, S., Lewis, J. and Kageyama, R. (2004). Instability of Hes7 protein is crucial for the somite segmentation clock. Nat Genet 36, 750-4.

Horal, M., Zhang, Z., Stanton, R., Virkamaki, A. and Loeken, M. R. (2004). Activation of the hexosamine pathway causes oxidative stress and abnormal embryo gene expression: involvement in diabetic teratogenesis. Birth Defects Res A Clin Mol Teratol 70, 519-27.

Hughes, D. S., Keynes, R. J. and Tannahill, D. (2009). Extensive molecular differences between anterior- and posterior-half-sclerotomes underlie somite polarity and spinal nerve segmentation. BMC Dev Biol 9, 30.

Ishida, Y. and Leder, P. (1999). RET: a poly A-trap retrovirus vector for reversible disruption and expression monitoring of genes in living cells. Nucleic Acids Res 27, e35.

Ishikawa, A., Kitajima, S., Takahashi, Y., Kokubo, H., Kanno, J., Inoue, T. and Saga, Y. (2004). Mouse Nkd1, a Wnt antagonist, exhibits oscillatory gene expression in the PSM under the control of Notch signaling. Mech Dev 121, 1443-53.

Jiang, X., Rowitch, D. H., Soriano, P., McMahon, A. P. and Sucov, H. M. (2000). Fate of the mammalian cardiac neural crest. Development 127, 1607-16.

Jouve, C., Palmeirim, I., Henrique, D., Beckers, J., Gossler, A., Ish-Horowicz, D. and Pourquie, O. (2000). Notch signalling is required for cyclic expression of the hairy-like gene HES1 in the presomitic mesoderm. Development 127, 1421-9.

Kaufman, M. H. and Bard, J. B. L. (1999). The Anatomical Basis of Mouse Development: Academic Press.

Kelly, R. G., Brown, N. A. and Buckingham, M. E. (2001). The arterial pole of the mouse heart forms from Fgf10-expressing cells in pharyngeal mesoderm. Dev Cell 1, 435-40.

Kelly, R. G. and Buckingham, M. E. (2002). The anterior heart-forming field: voyage to the arterial pole of the heart. Trends Genet 18, 210-6.

Kern, C. B., Norris, R. A., Thompson, R. P., Argraves, W. S., Fairey, S. E., Reyes, L., Hoffman, S., Markwald, R. R. and Mjaatvedt, C. H. (2007). Versican proteolysis mediates myocardial regression during outflow tract development. Dev Dyn 236, 671-83.

Kern, C. B., Twal, W. O., Mjaatvedt, C. H., Fairey, S. E., Toole, B. P., Iruela-Arispe, M. L. and Argraves, W. S. (2006). Proteolytic cleavage of versican during cardiac cushion morphogenesis. Dev Dyn 235, 2238-47.

Kim, J. S., Viragh, S., Moorman, A. F., Anderson, R. H. and Lamers, W. H. (2001a). Development of the myocardium of the atrioventricular canal and the vestibular spine in the human heart. Circ Res 88, 395-402.

Kim, R. Y., Robertson, E. J. and Solloway, M. J. (2001b). Bmp6 and Bmp7 are required for cushion formation and septation in the developing mouse heart. Dev Biol 235, 449-66.

Kinder, S. J., Loebel, D. A. and Tam, P. P. (2001). Allocation and early differentiation of cardiovascular progenitors in the mouse embryo. Trends Cardiovasc Med 11, 177-84.

Kinder, S. J., Tsang, T. E., Quinlan, G. A., Hadjantonakis, A. K., Nagy, A. and Tam, P. P. (1999). The orderly allocation of mesodermal cells to the extraembryonic structures and the anteroposterior axis during gastrulation of the mouse embryo. Development 126, 4691-701.

Kioussi, C., Briata, P., Baek, S. H., Wynshaw-Boris, A., Rose, D. W. and Rosenfeld, M. G. (2002). Pitx genes during cardiovascular development. Cold Spring Harb Symp Quant Biol 67, 81-7.

Kirk, E. P., Sunde, M., Costa, M. W., Rankin, S. A., Wolstein, O., Castro, M. L., Butler, T. L., Hyun, C., Guo, G., Otway, R. et al. (2007). Mutations in cardiac T-box factor gene TBX20 are associated with diverse cardiac pathologies, including defects of septation and valvulogenesis and cardiomyopathy. Am J Hum Genet 81, 280-91.

Kuhn, R., Schwenk, F., Aguet, M. and Rajewsky, K. (1995). Inducible gene targeting in mice. Science 269, 1427-9.

Kusumi, K., Sewell, W. and O'Brien, M. L. (2007). Mouse Mutations Disrupting Somitogenesis and Vertebral Patterning. In Somitogenesis, vol. 638 (ed. M. Maroto and N. V. Whittock): Landes Bioscience and Springer Science.

Lakkis, M. M. and Epstein, J. A. (1998). Neurofibromin modulation of ras activity is required for normal endocardial-mesenchymal transformation in the developing heart. Development 125, 4359-67.

Lamers, W. H. and Moorman, A. F. (2002). Cardiac septation: a late contribution of the embryonic primary myocardium to heart morphogenesis. Circ Res 91, 93-103.

Lawson, K. A., Meneses, J. J. and Pedersen, R. A. (1991). Clonal analysis of epiblast fate during germ layer formation in the mouse embryo. Development 113, 891-911.

Leimeister, C., Dale, K., Fischer, A., Klamt, B., Hrabe de Angelis, M., Radtke, F., McGrew, M. J., Pourquie, O. and Gessler, M. (2000). Oscillating expression of c-Hey2 in the presomitic mesoderm suggests that the segmentation clock may use combinatorial signaling through multiple interacting bHLH factors. Dev Biol 227, 91-103.

Lewis, S. L. and Tam, P. P. (2006). Definitive endoderm of the mouse embryo: formation, cell fates, and morphogenetic function. Dev Dyn 235, 2315-29.

Liang, P. and Pardee, A. B. (1992). Differential display of eukaryotic messenger RNA by means of the polymerase chain reaction. Science 257, 967-71.

Lin, C. R., Kioussi, C., O'Connell, S., Briata, P., Szeto, D., Liu, F., Izpisua-Belmonte, J. C. and Rosenfeld, M. G. (1999). Pitx2 regulates lung asymmetry, cardiac positioning and pituitary and tooth morphogenesis. Nature 401, 279-82.

Little, C. D. and Rongish, B. J. (1995). The extracellular matrix during heart development. Experientia 51, 873-82.

Liu, C., Liu, W., Lu, M. F., Brown, N. A. and Martin, J. F. (2001). Regulation of left-right asymmetry by thresholds of Pitx2c activity. Development 128, 2039-48.

Lo, C. W., Cohen, M. F., Huang, G. Y., Lazatin, B. O., Patel, N., Sullivan, R., Pauken, C. and Park, S. M. (1997). Cx43 gap junction gene expression and gap junctional communication in mouse neural crest cells. Dev Genet 20, 119-32.

Loffredo, C. A., Wilson, P. D. and Ferencz, C. (2001). Maternal diabetes: an independent risk factor for major cardiovascular malformations with increased mortality of affected infants. Teratology 64, 98-106.

Logan, M., Pagan-Westphal, S. M., Smith, D. M., Paganessi, L. and Tabin, C. J. (1998). The transcription factor Pitx2 mediates situs-specific morphogenesis in response to left-right asymmetric signals. Cell 94, 307-17.

Lough, J. and Sugi, Y. (2000). Endoderm and heart development. Dev Dyn 217, 327-42.

Love, D. C. and Hanover, J. A. (2005). The hexosamine signaling pathway: deciphering the "O-GlcNAc code". Sci STKE 2005, re13.

Maclean, K. and Dunwoodie, S. L. (2004). Breaking symmetry: a clinical overview of left-right patterning. Clin Genet 65, 441-57.

Madri, J. A., Enciso, J. and Pinter, E. (2003). Maternal diabetes: effects on embryonic vascular development--a vascular endothelial growth factor-A-mediated process. Pediatr Dev Pathol 6, 334-41.

Marques, S., Borges, A. C., Silva, A. C., Freitas, S., Cordenonsi, M. and Belo, J. A. (2004). The activity of the Nodal antagonist Cerl-2 in the mouse node is required for correct L/R body axis. Genes Dev 18, 2342-7.

Martin-Magniette, M. L., Aubert, J., Cabannes, E. and Daudin, J. J. (2005). Evaluation of the gene-specific dye bias in cDNA microarray experiments. Bioinformatics 21, 1995-2000.

Masino, A. M., Gallardo, T. D., Wilcox, C. A., Olson, E. N., Williams, R. S. and Garry, D. J. (2004). Transcriptional regulation of cardiac progenitor cell populations. Circ Res 95, 389-97.

Matsuda, E., Shigeoka, T., Iida, R., Yamanaka, S., Kawaichi, M. and Ishida, Y. (2004). Expression profiling with arrays of randomly disrupted genes in mouse embryonic stem cells leads to in vivo functional analysis. Proc Natl Acad Sci U S A 101, 4170-4.

McDonald, J. A. and Camenisch, T. D. (2002). Hyaluronan: genetic insights into the complex biology of a simple polysaccharide. Glycoconj J 19, 331-9.

McElhinney, D. B., Geiger, E., Blinder, J., Benson, D. W. and Goldmuntz, E. (2003). NKX2.5 mutations in patients with congenital heart disease. J Am Coll Cardiol 42, 1650-5.

McFadden, D. G., Barbosa, A. C., Richardson, J. A., Schneider, M. D., Srivastava, D. and Olson, E. N. (2005). The Hand1 and Hand2 transcription factors regulate expansion of the embryonic cardiac ventricles in a gene dosage-dependent manner. Development 132, 189-201.

McKnight, G. L., Mudri, S. L., Mathewes, S. L., Traxinger, R. R., Marshall, S., Sheppard, P. O. and O'Hara, P. J. (1992). Molecular cloning, cDNA sequence, and bacterial expression of human glutamine:fructose-6-phosphate amidotransferase. J Biol Chem 267, 25208-12.

Meno, C., Shimono, A., Saijoh, Y., Yashiro, K., Mochida, K., Ohishi, S., Noji, S., Kondoh, H. and Hamada, H. (1998). lefty-1 is required for left-right determination as a regulator of lefty-2 and nodal. Cell 94, 287-97.

Meno, C., Takeuchi, J., Sakuma, R., Koshiba-Takeuchi, K., Ohishi, S., Saijoh, Y., Miyazaki, J., ten Dijke, P., Ogura, T. and Hamada, H. (2001). Diffusion of nodal signaling activity in the absence of the feedback inhibitor Lefty2. Dev Cell 1, 127-38.

Metzker, M. L. (2005). Emerging technologies in DNA sequencing. Genome Res 15, 1767-76.

Mouse Genome Informatics and National Center for Biotechnology Information, Mouse Gene Trap Data Load from dbGSS. (http://www.informatics.jax.org)

Mjaatvedt, C. H., Nakaoka, T., Moreno-Rodriguez, R., Norris, R. A., Kern, M. J., Eisenberg, C. A., Turner, D. and Markwald, R. R. (2001). The outflow tract of the heart is recruited from a novel heart-forming field. Dev Biol 238, 97-109.

Mjaatvedt, C. H., Yamamura, H., Capehart, A. A., Turner, D. and Markwald, R. R. (1998). The Cspg2 gene, disrupted in the hdf mutant, is required for right cardiac chamber and endocardial cushion formation. Dev Biol 202, 56-66.

Molkentin, J. D., Lin, Q., Duncan, S. A. and Olson, E. N. (1997). Requirement of the transcription factor GATA4 for heart tube formation and ventral morphogenesis. Genes Dev 11, 1061-72.

Morimoto, M., Takahashi, Y., Endo, M. and Saga, Y. (2005). The Mesp2 transcription factor establishes segmental borders by suppressing Notch activity. Nature 435, 354-9.

Morozova, O. and Marra, M. A. (2008). Applications of next-generation sequencing technologies in functional genomics. Genomics 92, 255-64.

Mouse Genome Database. (June, 2006) at the Mouse Genome Informatics The Jackson Laboratory, Bar Harbor, Maine (http://www.informatics.jax.org)

Mouse Genome Database. (May, 2005) at the Mouse Genome Informatics The Jackson Laboratory, Bar Harbor, Maine (http://www.informatics.jax.org)

Mouse Genome Database. (September, 2009) at the Mouse Genome Informatics The Jackson Laboratory, Bar Harbor, Maine (http://www.informatics.jax.org)

Nagy, A., Gertsenstein, M., Vintersten, K. and Behringer, R. (2002). Manipulating the Mouse Embryo: A Laboratory Manual. Cold Spring Harbor, New York: Cold Spring Harbor Press.

Nagy, E. and Maquat, L. E. (1998). A rule for termination-codon position within intron-containing genes: when nonsense affects RNA abundance. Trends Biochem Sci 23, 198-9.

Nemer, G., Fadlalah, F., Usta, J., Nemer, M., Dbaibo, G., Obeid, M. and Bitar, F. (2006). A novel mutation in the GATA4 gene in patients with Tetralogy of Fallot. Hum Mutat 27, 293-4.

Nerlich, A. G., Sauer, U., Kolm-Litty, V., Wagner, E., Koch, M. and Schleicher, E. D. (1998). Expression of glutamine:fructose-6-phosphate amidotransferase in human tissues: evidence for high variability and distinct regulation in diabetes. Diabetes 47, 170-8.

Niwa, H., Araki, K., Kimura, S., Taniguchi, S., Wakasugi, S. and Yamamura, K. (1993). An efficient gene-trap method using poly A trap vectors and characterization of gene-trap events. J Biochem 113, 343-9.

Niwa, Y., Masamizu, Y., Liu, T., Nakayama, R., Deng, C. X. and Kageyama, R. (2007). The initiation and propagation of Hes7 oscillation are cooperatively regulated by Fgf and notch signaling in the somite segmentation clock. Dev Cell 13, 298-304.

Nonaka, S., Shiratori, H., Saijoh, Y. and Hamada, H. (2002). Determination of left-right patterning of the mouse embryo by artificial nodal flow. Nature 418, 96-9.

Nonaka, S., Tanaka, Y., Okada, Y., Takeda, S., Harada, A., Kanai, Y., Kido, M. and Hirokawa, N. (1998). Randomization of left-right asymmetry due to loss of nodal cilia generating leftward flow of extraembryonic fluid in mice lacking KIF3B motor protein. Cell 95, 829-37.

Okada, I., Hamanoue, H., Terada, K., Tohma, T., Megarbane, A., Chouery, E., Abou-Ghoch, J., Jalkh, N., Cogulu, O., Ozkinay, F. et al. (2011). SMOC1 is essential for ocular and limb development in humans and mice. Am J Hum Genet 88, 30-41.

Okada, Y., Takeda, S., Tanaka, Y., Belmonte, J. C. and Hirokawa, N. (2005). Mechanism of nodal flow: a conserved symmetry breaking event in left-right axis determination. Cell 121, 633-44.

Oki, T., Yamazaki, K., Kuromitsu, J., Okada, M. and Tanaka, I. (1999). cDNA cloning and mapping of a novel subtype of glutamine:fructose-6-phosphate amidotransferase (GFAT2) in human and mouse. Genomics 57, 227-34.

Palmer, S., Groves, N., Schindeler, A., Yeoh, T., Biben, C., Wang, C. C., Sparrow, D. B., Barnett, L., Jenkins, N. A., Copeland, N. G. et al. (2001). The small muscle-specific protein Csl modifies cell shape and promotes myocyte fusion in an insulin-like growth factor 1-dependent manner. J Cell Biol 153, 985-98.

Parameswaran, M. and Tam, P. P. (1995). Regionalisation of cell fate and morphogenetic movement of the mesoderm during mouse gastrulation. Dev Genet 17, 16-28.

Park, E. J., Watanabe, Y., Smyth, G., Miyagawa-Tomita, S., Meyers, E., Klingensmith, J., Camenisch, T., Buckingham, M. and Moon, A. M. (2008). An FGF autocrine loop initiated in second heart field mesoderm regulates morphogenesis at the arterial pole of the heart. Development 135, 3599-610.

Pearce, J. J., Penny, G. and Rossant, J. (1999). A mouse cerberus/Dan-related gene family. Dev Biol 209, 98-110.

Pennisi, D. J., Wilkinson, L., Kolle, G., Sohaskey, M. L., Gillinder, K., Piper, M. J., McAvoy, J. W., Lovicu, F. J. and Little, M. H. (2007). Crim1KST264/KST264 mice display a disruption of the Crim1 gene resulting in perinatal lethality with defects in multiple organ systems. Dev Dyn 236, 502-11.

Person, A. D., Klewer, S. E. and Runyan, R. B. (2005). Cell biology of cardiac cushion development. Int Rev Cytol 243, 287-335.

Peters, H., Neubuser, A., Kratochwil, K. and Balling, R. (1998). Pax9-deficient mice lack pharyngeal pouch derivatives and teeth and exhibit craniofacial and limb abnormalities. Genes Dev 12, 2735-47.

Piotrowska, K., Wianny, F., Pedersen, R. A. and Zernicka-Goetz, M. (2001). Blastomeres arising from the first cleavage division have distinguishable fates in normal mouse development. Development 128, 3739-48.

Piotrowska, K. and Zernicka-Goetz, M. (2002). Early patterning of the mouse embryo-- contributions of sperm and egg. Development 129, 5803-13.

Quinlan, G. A., Williams, E. A., Tan, S. S. and Tam, P. P. (1995). Neuroectodermal fate of epiblast cells in the distal region of the mouse egg cylinder: implication for body plan organization during early embryogenesis. Development 121, 87-98.

Ranger, A. M., Grusby, M. J., Hodge, M. R., Gravallese, E. M., de la Brousse, F. C., Hoey, T., Mickanin, C., Baldwin, H. S. and Glimcher, L. H. (1998). The transcription factor NF-ATc is essential for cardiac valve formation. Nature 392, 186-90.

Rauch, A., Devriendt, K., Koch, A., Rauch, R., Gewillig, M., Kraus, C., Weyand, M., Singer, H., Reis, A. and Hofbeck, M. (2004). Assessment of association between variants and haplotypes of the remaining TBX1 gene and manifestations of congenital heart defects in 22q11.2 deletion patients. J Med Genet 41, e40.

Reamon-Buettner, S. M. and Borlak, J. (2006). HEY2 mutations in malformed hearts. Hum Mutat 27, 118.

Rivera-Feliciano, J., Lee, K. H., Kong, S. W., Rajagopal, S., Ma, Q., Springer, Z., Izumo, S., Tabin, C. J. and Pu, W. T. (2006). Development of heart valves requires Gata4 expression in endothelial-derived cells. Development 133, 3607-18.

Rochais, F., Mesbah, K. and Kelly, R. G. (2009). Signaling pathways controlling second heart field development. Circ Res 104, 933-42.

Rodriguez, T. A., Sparrow, D. B., Scott, A. N., Withington, S. L., Preis, J. I., Michalicek, J., Clements, M., Tsang, T. E., Shioda, T., Beddington, R. S. et al. (2004). Cited1 is required in trophoblasts for placental development and for embryo growth and survival. Mol Cell Biol 24, 228-44.

Rosenzweig, B. A., Pine, P. S., Domon, O. E., Morris, S. M., Chen, J. J. and Sistare, F. D. (2004). Dye bias correction in dual-labeled cDNA microarray gene expression measurements. Environ Health Perspect 112, 480-7.

Rumyantsev, P. P. (1977). Interrelations of the proliferation and differentiation processes during cardiac myogenesis and regeneration. Int Rev Cytol 51, 186-273.

Saga, Y., Hata, N., Koseki, H. and Taketo, M. M. (1997). Mesp2: a novel mouse gene expressed in the presegmented mesoderm and essential for segmentation initiation. Genes Dev 11, 1827-39.

Saijoh, Y., Oki, S., Ohishi, S. and Hamada, H. (2003). Left-right patterning of the mouse lateral plate requires nodal produced in the node. Dev Biol 256, 160-72.

Salminen, M., Meyer, B. I. and Gruss, P. (1998). Efficient poly A trap approach allows the capture of genes specifically active in differentiated embryonic stem cells and in mouse embryos. Dev Dyn 212, 326-33.

Sanford, L. P., Ormsby, I., Gittenberger-de Groot, A. C., Sariola, H., Friedman, R., Boivin, G. P., Cardell, E. L. and Doetschman, T. (1997). TGFbeta2 knockout mice have multiple developmental defects that are non-overlapping with other TGFbeta knockout phenotypes. Development 124, 2659-70.

Sayeski, P. P., Paterson, A. J. and Kudlow, J. E. (1994). The murine glutamine:fructose-6-phosphate amidotransferase-encoding cDNA sequence. Gene 140, 289-90.

Schluterman, M. K., Krysiak, A. E., Kathiriya, I. S., Abate, N., Chandalia, M., Srivastava, D. and Garg, V. (2007). Screening and biochemical analysis of GATA4 sequence variations identified in patients with congenital heart disease. Am J Med Genet A 143A, 817-23.

Schoenwolf, G. C., Bleyl, S. B., Brauer, P. R. and Francis-West, P. H. (2009). Larsen's Human Embryology. Philadelphia, PA: Churchill Livingstone Elsevier.

Schott, J. J., Benson, D. W., Basson, C. T., Pease, W., Silberbach, G. M., Moak, J. P., Maron, B. J., Seidman, C. E. and Seidman, J. G. (1998). Congenital heart disease caused by mutations in the transcription factor NKX2-5. Science 281, 108-11.

Schwenk, F., Baron, U. and Rajewsky, K. (1995). A cre-transgenic mouse strain for the ubiquitous deletion of loxP-flanked gene segments including deletion in germ cells. Nucleic Acids Res 23, 5080-1.

Segat, D., Frie, C., Nitsche, P. D., Klatt, A. R., Piecha, D., Korpos, E., Deak, F., Wagener, R., Paulsson, M. and Smyth, N. (2000). Expression of matrilin-1, -2 and -3 in developing mouse limbs and heart. Matrix Biol 19, 649-55.

Sewell, W., Sparrow, D. B., Smith, A. J., Gonzalez, D. M., Rappaport, E. F., Dunwoodie, S. L. and Kusumi, K. (2009). Cyclical expression of the Notch/Wnt regulator Nrarp requires modulation by Dll3 in somitogenesis. Dev Biol 329, 400-9.

Seyfried, N. T., McVey, G. F., Almond, A., Mahoney, D. J., Dudhia, J. and Day, A. J. (2005). Expression and purification of functionally active hyaluronan-binding domains from human cartilage link protein, aggrecan and versican: formation of ternary complexes with defined hyaluronan oligosaccharides. J Biol Chem 280, 5435-48.

Sharma, P. R., Anderson, R. H., Copp, A. J. and Henderson, D. J. (2004). Spatiotemporal analysis of programmed cell death during mouse cardiac septation. Anat Rec A Discov Mol Cell Evol Biol 277, 355-69.

Shi, S. and Stanley, P. (2003). Protein O-fucosyltransferase 1 is an essential component of Notch signaling pathways. Proc Natl Acad Sci U S A 100, 5234-9.

Shigeoka, T., Kawaichi, M. and Ishida, Y. (2005). Suppression of nonsense-mediated mRNA decay permits unbiased gene trapping in mouse embryonic stem cells. Nucleic Acids Res 33, e20.

Slawson, C., Housley, M. P. and Hart, G. W. (2006). O-GlcNAc cycling: how a single sugar post-translational modification is changing the way we think about signaling networks. J Cell Biochem 97, 71-83.

Slawson, C., Zachara, N. E., Vosseller, K., Cheung, W. D., Lane, M. D. and Hart, G. W. (2005). Perturbations in O-linked beta-N-acetylglucosamine protein modification cause severe defects in mitotic progression and cytokinesis. J Biol Chem 280, 32944-56.

Snider, P., Hinton, R. B., Moreno-Rodriguez, R. A., Wang, J., Rogers, R., Lindsley, A., Li, F., Ingram, D. A., Menick, D., Field, L. et al. (2008). Periostin is required for maturation and extracellular matrix stabilization of noncardiomyocyte lineages of the heart. Circ Res 102, 752-60.

Solloway, M. J. and Robertson, E. J. (1999). Early embryonic lethality in Bmp5;Bmp7 double mutant mice suggests functional redundancy within the 60A subgroup. Development 126, 1753-68.

Song, W., Jackson, K. and McGuire, P. G. (2000). Degradation of type IV collagen by matrix metalloproteinases is an important step in the epithelial-mesenchymal transformation of the endocardial cushions. Dev Biol 227, 606-17.

Sousa-Nunes, R., Rana, A. A., Kettleborough, R., Brickman, J. M., Clements, M., Forrest, A., Grimmond, S., Avner, P., Smith, J. C., Dunwoodie, S. L. et al. (2003). Characterizing embryonic gene expression patterns in the mouse using nonredundant sequence-based selection. Genome Res 13, 2609-20.

Sparrow, D. B., Chapman, G., Wouters, M. A., Whittock, N. V., Ellard, S., Fatkin, D., Turnpenny, P. D., Kusumi, K., Sillence, D. and Dunwoodie, S. L. (2006). Mutation of the LUNATIC FRINGE gene in humans causes spondylocostal dysostosis with a severe vertebral phenotype. Am J Hum Genet 78, 28-37.

Spicer, A. P., Tien, J. L., Joo, A. and Bowling Jr, R. A. (2002). Investigation of hyaluronan function in the mouse through targeted mutagenesis. Glycoconj J 19, 341-5.

St-Onge, L., Furth, P. A. and Gruss, P. (1996). Temporal control of the Cre recombinase in transgenic mice by a tetracycline responsive promoter. Nucleic Acids Res 24, 3875-7.

Stalsberg, H. and DeHaan, R. L. (1969). The precardiac areas and formation of the tubular heart in the chick embryo. Dev Biol 19, 128-59.

Stanley, E. G., Biben, C., Elefanty, A., Barnett, L., Koentgen, F., Robb, L. and Harvey, R. P. (2002). Efficient Cre-mediated deletion in cardiac progenitor cells conferred by a 3'UTR-ires-Cre allele of the homeobox gene Nkx2-5. Int J Dev Biol 46, 431-9.

Stanley, P. (2007). Regulation of Notch signaling by glycosylation. Curr Opin Struct Biol 17, 530-5.

Stennard, F. A. and Harvey, R. P. (2005). T-box transcription factors and their roles in regulatory hierarchies in the developing heart. Development 132, 4897-910.

Sugi, Y., Sasse, J., Barron, M. and Lough, J. (1995). Developmental expression of fibroblast growth factor receptor-1 (cek-1; flg) during heart development. Dev Dyn 202, 115-25.

Tam, P. P., Gad, J. M., Kinder, S. J., Tsang, T. E. and Behringer, R. R. (2001). Morphogenetic tissue movement and the establishment of body plan during development from blastocyst to gastrula in the mouse. Bioessays 23, 508-17.

Tam, P. P., Khoo, P. L., Wong, N., Tsang, T. E. and Behringer, R. R. (2004). Regionalization of cell fates and cell movement in the endoderm of the mouse gastrula and the impact of loss of Lhx1(Lim1) function. Dev Biol 274, 171-87.

Tam, P. P., Parameswaran, M., Kinder, S. J. and Weinberger, R. P. (1997). The allocation of epiblast cells to the embryonic heart and other mesodermal lineages: the role of ingression and tissue movement during gastrulation. Development 124, 1631-42.

Teplyakov, A., Obmolova, G., Badet, B. and Badet-Denisot, M. A. (2001). Channeling of ammonia in glucosamine-6-phosphate synthase. J Mol Biol 313, 1093-102.

Teplyakov, A., Obmolova, G., Badet-Denisot, M. A. and Badet, B. (1999). The mechanism of sugar phosphate isomerization by glucosamine 6-phosphate synthase. Protein Sci 8, 596-602.

Teplyakov, A., Obmolova, G., Badet-Denisot, M. A., Badet, B. and Polikarpov, I. (1998). Involvement of the C terminus in intramolecular nitrogen channeling in glucosamine 6-phosphate synthase: evidence from a 1.6 Å crystal structure of the isomerase domain. Structure 6, 1047-55.

To, C., Epp, T., Reid, T., Lan, Q., Yu, M., Li, C. Y., Ohishi, M., Hant, P., Tsao, N., Casallo, G. et al. (2004). The Centre for Modeling Human Disease Gene Trap resource. Nucleic Acids Res 32, D557-9.

Uchimura, T., Komatsu, Y., Tanaka, M., McCann, K. L. and Mishina, Y. (2009). Bmp2 and Bmp4 genetically interact to support multiple aspects of mouse development including functional heart development. Genesis 47, 374-84.

van den Heuvel, R. H., Ferrari, D., Bossi, R. T., Ravasio, S., Curti, B., Vanoni, M. A., Florencio, F. J. and Mattevi, A. (2002). Structural studies on the synchronization of catalytic centers in glutamate synthase. J Biol Chem 277, 24579-83.

Velculescu, V. E., Zhang, L., Vogelstein, B. and Kinzler, K. W. (1995). Serial analysis of gene expression. Science 270, 484-7.

Vincent, S. D. and Buckingham, M. E. (2010). How to make a heart: the origin and regulation of cardiac progenitor cells. Curr Top Dev Biol 90, 1-41.

Waldo, K. L., Hutson, M. R., Stadt, H. A., Zdanowicz, M., Zdanowicz, J. and Kirby, M. L. (2005). Cardiac neural crest is necessary for normal addition of the myocardium to the arterial pole from the secondary heart field. Dev Biol 281, 66-77.

Waldo, K. L., Kumiski, D. H., Wallis, K. T., Stadt, H. A., Hutson, M. R., Platt, D. H. and Kirby, M. L. (2001). Conotruncal myocardium arises from a secondary heart field. Development 128, 3179-88.

Wallin, J., Eibel, H., Neubuser, A., Wilting, J., Koseki, H. and Balling, R. (1996). Pax1 is expressed during development of the thymus epithelium and is required for normal T-cell maturation. Development 122, 23-30.

Webb, S., Brown, N. A. and Anderson, R. H. (1998). Formation of the atrioventricular septal structures in the normal mouse. Circ Res 82, 645-56.

Weismann, C. G., Hager, A., Kaemmerer, H., Maslen, C. L., Morris, C. D., Schranz, D., Kreuder, J. and Gelb, B. D. (2005). PTPN11 mutations play a minor role in isolated congenital heart disease. Am J Med Genet A 136, 146-51.

Wells, J. M. and Melton, D. A. (1999). Vertebrate endoderm development. Annu Rev Cell Dev Biol 15, 393-410.

Weninger, W. J., Floro, K. L., Bennett, M. B., Withington, S. L., Preis, J. I., Barbera, J. P., Mohun, T. J. and Dunwoodie, S. L. (2005). Cited2 is required both for heart morphogenesis and establishment of the left-right axis in mouse development. Development 132, 1337-48.

Wirrig, E. E., Snarr, B. S., Chintalapudi, M. R., O'Neal, J. L., Phelps, A. L., Barth, J. L., Fresco, V. M., Kern, C. B., Mjaatvedt, C. H., Toole, B. P. et al. (2007). Cartilage link protein 1 (Crtl1), an extracellular matrix component playing an important role in heart development. Dev Biol 310, 291-303.

Yamamoto, M., Mine, N., Mochida, K., Sakai, Y., Saijoh, Y., Meno, C. and Hamada, H. (2003). Nodal signaling induces the midline barrier by activating Nodal expression in the lateral plate. Development 130, 1795-804.

Ye, X., Hama, K., Contos, J. J., Anliker, B., Inoue, A., Skinner, M. K., Suzuki, H., Amano, T., Kennedy, G., Arai, H. et al. (2005). LPA3-mediated lysophosphatidic acid signalling in embryo implantation and spacing. Nature 435, 104-8.

Yoshida, M., Yagi, T., Furuta, Y., Takayanagi, K., Kominami, R., Takeda, N., Tokunaga, T., Chiba, J., Ikawa, Y. and Aizawa, S. (1995). A new strategy of gene trapping in ES cells using 3'RACE. Transgenic Res 4, 277-87.

Yoshioka, H., Meno, C., Koshiba, K., Sugihara, M., Itoh, H., Ishimaru, Y., Inoue, T., Ohuchi, H., Semina, E. V., Murray, J. C. et al. (1998). Pitx2, a bicoid-type homeobox gene, is involved in a lefty-signaling pathway in determination of left-right asymmetry. Cell 94, 299-305.

Zachara, N. E. and Hart, G. W. (2006). Cell signaling, the essential role of O-GlcNAc! Biochim Biophys Acta 1761, 599-617.

Zhivkov, V., Tosheva, R. and Zhivkova, Y. (1975). Concentration of uridine diphosphate sugars in various tissues of vertebrates. Comp Biochem Physiol B 51, 421-4.

Zhou, H. M., Weskamp, G., Chesneau, V., Sahin, U., Vortkamp, A., Horiuchi, K., Chiusaroli, R., Hahn, R., Wilkes, D., Fisher, P. et al. (2004). Essential role for ADAM19 in cardiovascular morphogenesis. Mol Cell Biol 24, 96-104.

Zhu, X., Sasse, J., McAllister, D. and Lough, J. (1996). Evidence that fibroblast growth factors 1 and 4 participate in regulation of cardiogenesis. Dev Dyn 207, 429-38.
