Investigating the Roles of IRF6 in Epithelial Maturation, Craniofacial Development, and Orofacial Cleft Pathogenesis


Citable link http://nrs.harvard.edu/urn-3:HUL.InstRepos:40049982


A dissertation presented

by

Edward Bing Hang Li

to

The Division of Medical Sciences

in partial fulfillment of the requirements

for the degree of

Doctor of Philosophy

in the subject of

Biological and Biomedical Sciences

Harvard University

Cambridge, Massachusetts

February 2018

© 2018 Edward Bing Hang Li

All rights reserved.

Dissertation Advisor: Dr. Eric Chienwei Liao

Author: Edward Bing Hang Li


Abstract

Cleft lip and/or palates (CL/P) are common congenital malformations, and mutations in the IRF6 gene are the most significant genetic contributors to cleft pathogenesis. IRF6 is a master regulator of epithelial maturation and is expressed in the oral epithelium, where it is hypothesized to play critical signaling functions during palatogenesis. Despite the importance of IRF6 in craniofacial development, much remains unknown about its biological function. To complement the studies of CL/P pathogenesis in other models, we used CRISPR to produce a zebrafish null model and discovered that maternal-null irf6-/- embryos displayed embryonic lethality due to periderm rupture.

The zebrafish periderm has been previously used as a model of the mammalian oral epithelium with conservation of both morphologies and molecular pathways. Due to strong cross-species sequence conservation in IRF6, either zebrafish or human IRF6 could rescue the maternal-null irf6-/- periderm rupture phenotype. This allowed us to test the functions of IRF6 missense variants of unknown significance from CL/P patients, functionally categorize them by residual protein function, and provide biological data to complement the traditional statistical/computational approaches for variant pathogenicity assignment. Next, because the early irf6-/- periderm rupture precluded studies of potential irf6 functions later during craniofacial development, we employed a novel optogenetic gene expression system with dominant-negative Irf6 to spatiotemporally inhibit Irf6 functions during palatogenesis. The results revealed a striking orofacial cleft phenotype in the zebrafish model, the molecular mechanisms of which are currently being investigated. Finally, our irf6-/- model provided an opportunity to elucidate Irf6 downstream transcriptional targets, enhance our understanding of orofacial cleft pathogenesis, and identify potential nodes for intervention to prevent CL/Ps in utero.


ChIP-seq and mRNA-seq were employed to compare wild type to maternal-null irf6-/- embryos and identify Irf6 transcriptional targets significantly downregulated in the absence of irf6 function during zebrafish development. Analyses of the candidate target genes have begun to reveal novel aspects of Irf6 biological function and developmental pathways previously uncharacterized during palatogenesis. These pathways could represent as yet unexplored mechanisms by which IRF6 in the mammalian embryonic oral epithelium regulates epithelial maturation and epithelial-mesenchymal interactions during craniofacial development.


Table of Contents

General Introductions

Cleft lip and/or palate prevalence and incidence

Division of CL/P into syndromic and nonsyndromic cases

IRF6 as the gene most commonly mutated in human CL/P patients

Mammalian craniofacial development process overview

Morphogenesis and fusion of facial prominences

IRF6 is a master regulator of epithelial differentiation and maturation

IRF6 gene regulation and protein interactions in the embryonic epithelium

Zebrafish as a model for mammalian craniofacial development

Chapter 1: Establishment of Zebrafish irf6 Models Using CRISPR-Cas9 Genome Editing

Introductions

Conservation of IRF6 gene structure and protein sequence in zebrafish

Maternal transcripts and early zebrafish embryonic development

Zebrafish embryonic periderm as a model of the mammalian oral epithelium

Generation of zebrafish irf6 models using CRISPR-Cas9 genome editing

Spatiotemporal gene expression patterns of irf6 in zebrafish

Experimental Results

Spatiotemporal expression patterns of Irf6 in zebrafish by IHC

Attempted generation of a zebrafish irf6 reporter line using CRISPR and HDR

Generation of a Tol2 transgenic zebrafish irf6 fluorescent reporter line

CRISPR-Cas9 mediated irf6 mutagenesis in zebrafish

Molecular characterizations of the zebrafish irf6 CRISPR allele

Molecular rescue of irf6 pathway genes by zebrafish and human IRF6

Phenotypic rescue of periderm rupture by zebrafish and human IRF6

Discussions

Zebrafish periderm as a model of mammalian oral epithelium and requirements of maternal irf6 in epiboly and craniofacial development

Chapter 2: Functional Genomics Analysis of Rare IRF6 Variants Using a Zebrafish Model

Introductions

Challenges in the identification and characterization of human IRF6 variants

Functional assessment of human IRF6 missense variants in zebrafish

Experimental Results

PolyPhen-2 and SIFT predictions of IRF6 variant protein function do not accurately reflect ability to rescue zebrafish periderm rupture

Dosage titrations can differentiate functional categories of IRF6 variants

Human IRF6 variants are capable of restoring zebrafish development

Discussions

Challenges in statistical and computational analyses of rare gene variants

Phenotypic rescue of maternal-null irf6-/- periderm rupture by IRF6 variants

Chapter 3: Analysis of the Potential Post-Epiboly Roles of irf6 in Craniofacial Development using Optogenetics

Introductions

Potential roles of irf6 during zebrafish post-epiboly embryonic development

Evaluation of spatiotemporal gene expression control methods in zebrafish

EL222 optogenetic regulation of spatiotemporal gene expression

Experimental Results

Optogenetic construct design and light activation of gene expression

Dominant-negative Irf6-ENR fusion expression mimics irf6 functional ablation

Irf6 function inhibition at various time points leads to different phenotypes

Live imaging of optogenetic irf6-ENR embryos reveals NCC migration defects

Proliferation and apoptosis in wild type vs. optogenetic irf6-ENR embryos

Discussions

Dominant-negative IRF6 in the forms of ENR-fusion and R84C

Dominant-negative Irf6 inhibition molecularly mimicked maternal-null irf6-/-

Validation of EL222 optogenetic gene expression control

Effects of Irf6 inhibition on NCC migration, proliferation, survival & differentiation

Factors affecting frontonasal neural crest cell migration and survival

Chapter 4: Identification of Direct Irf6 Transcriptional Target Genes in Periderm Maturation

Introductions

Utility of ChIP-seq for identifying direct IRF6 transcriptional target genes

Direct IRF6 transcriptional targets and the missing inheritance of VWS/PPS

Experimental Results

mRNA-seq of wild type vs. maternal-null irf6-/- zebrafish embryos

ChIP-seq of wild type vs. maternal-null irf6-/- zebrafish embryos

Prioritization of direct Irf6 transcriptional target genes for further investigation

Identification of CL/P patient variants in IRF6 transcriptional target genes

Morpholino knockdown of Irf6 transcriptional target genes

CRISPR-Cas9 gene editing of candidate Irf6 transcriptional target genes

Spatiotemporal expression of Irf6 transcriptional target esrp1 in zebrafish

Mouse embryonic palatal shelf Esrp1 expression and Esrp1-/- phenotypes

Discussions

Esrp1 drives alternative splicing of fibroblast growth factor receptors

Zebrafish esrp1/2 mutants and potential FGF interactions

Biological validation of CL/P patient variants in IRF6 transcriptional targets

IRF6 transcriptional target validation in cell culture, mouse, and zebrafish

Conclusions

Establishment of the zebrafish model for studies of IRF6 in craniofacial development

Usage of zebrafish irf6 models to biologically evaluate human variant protein functions

Zebrafish as a model for human craniofacial development and OFC pathogenesis

Optogenetic dissection of post-epiboly irf6 functions in craniofacial development

IRF6 downstream transcriptional target gene ESRP1

Methods

Fish rearing and husbandry

Mouse rearing and husbandry

CRISPR-Cas9 gene editing and genotyping

Total RNA isolation and RT-qPCR

Protein isolation and western blotting

Cryosectioning and fluorescence immunohistochemistry

IRF6 cDNA cloning and variant generation by site-directed mutagenesis

In vitro mRNA synthesis

Zebrafish embryo microinjection of mRNA and morpholinos

Whole-mount in situ hybridization and DIG-labeled riboprobe synthesis

Acid-free alcian blue staining and brightfield imaging

Two-photon in vivo zebrafish and IHC imaging

Computational modeling and statistical analyses

Optogenetic expression of genes-of-interest

Whole-mount zebrafish proliferation and cell death analyses

Chromatin immunoprecipitation and sequencing (ChIP-seq) of zebrafish embryos

mRNA-seq of zebrafish embryos

References

Glossary of Terms

A-P: Anterior-posterior

CDS: Coding sequence

ChIP: Chromatin immunoprecipitation

CL/P: Cleft lip and/or palate

(C)NCC: (Cranial) neural crest cell

CRISPR: Clustered regularly interspaced short palindromic repeats

D-V: Dorsal-ventral

E.#: Embryonic gestational day (mouse)

EMT: Epithelial-to-mesenchymal transition

ENR: Engrailed repressor

EP: Ethmoid plate

ESRP: Epithelial splicing regulatory protein

EVL: Enveloping layer

ExAC: Exome aggregation consortium

FEZ: Frontonasal ectodermal zone

FGF(R): Fibroblast growth factor (receptor)

FNP: Frontonasal process

HDR: Homology-directed repair

HPF/DPF: Hours post fertilization/days post fertilization (zebrafish)

IHC: Immunohistochemistry

MDP: Mandibular process

MEE: Medial edge epithelium

MO: Morpholino

MXP: Maxillary process

NHEJ: Nonhomologous end joining

OFC: Orofacial cleft

PDGF(R): Platelet derived growth factor (receptor)

sgRNA: Short guide ribonucleic acid

SHH: Sonic hedgehog

TSS: Transcriptional start site

VWS/PPS: Van der Woude syndrome/Popliteal Pterygium syndrome

WISH: Whole-mount in situ hybridization


Acknowledgements

Foremost, I would like to express my sincerest gratitude to my dissertation advisor Dr. Eric Chienwei Liao for his wholehearted support throughout my PhD years. He encouraged unhindered scientific exploration, fostered the growth of both my scientific and clinical interests, and guided me towards an assuredly fulfilling physician-scientist career path. I could not have imagined or asked for a better mentor in science, in medicine, and in life.

My PhD years have been a transformative period in my life, academically and personally. I would like to express my heartfelt appreciation for my amazing wife, Caitlin Naureckas Li, for her love and support. Hard days in lab were never so hard with the thoughts of returning home to you.

Furthermore, I would like to thank my incredible parents, Yushan Li and Yingxin Sun, for all of their hard work, sacrifice, nurture, and unconditional love. I am strong when I am on your shoulders; you raise me up to more than I can be. To my friends near-and-dear to my heart, thank you all for your companionship and support, and for filling my daily life with joy, laughter, and excitement. I feel truly blessed to have you all in my life.

None of this would be possible without my incredible dissertation advisory committee, Dr. Wolfram Goessling, Dr. Cliff Tabin, and Dr. Richard Maas. Thank you for your insightful comments, advice, and discussions along this journey. Moreover, I would like to thank my fellow lab members and others in the Center for Regenerative Medicine for their immense support, stimulating scientific discussions, working side-by-side at the benches, and all the fun over the past four years. Finally, I would like to thank our numerous collaborators who have all helped to propel our projects forward and instilled in me the true value and power of interdisciplinary collaborations in science.


General Introductions

Cleft lip and/or palate prevalence and incidence

The craniofacial skeleton has diverse evolutionary morphologies and serves a plethora of key biological functions, including protection of the brain and craniofacial sensory organs, structural support for the eyes, nose, and ears, segregation of the upper respiratory and digestive tracts, and many others [1]. The human craniofacial skeleton is complex and composed of over twenty cartilages and bones, and proper formation and development of the craniofacial skeleton are critical for vertebrate organisms ranging from humans to zebrafish [2]. Due to the elaborate nature of the craniofacial development process, numerous errors can arise to produce congenital craniofacial malformations, the most common of which is a cleft lip and/or palate (CL/P) [3]. In fact, CL/P is one of the most common congenital birth defects of any category, with an incidence of approximately 1 in 700 births [4]. In humans, the palate serves to separate the nasal and oral cavities and is composed of three components with distinct embryonic origins: the primary palate (upper teeth alveoli and anterior one-third of the hard palate), the secondary palate (posterior two-thirds of the hard palate), and the soft palate (Figure 1A) [5]. The hard palate establishes the physical barrier between the oral and nasal cavities and is required for proper speech development and negative pressure production during suckling [6]. The soft palate is flexible and can elevate during swallowing to seal off the posterior nasal cavity and prevent reflux of solids and liquids [6]. Especially during the early postnatal period, hard and soft palate defects can significantly hamper breastfeeding and are detrimental to growth and survival.

Numerous surgical techniques have been devised to address a wide variety of CL/P presentations (Figure 1B-I). However, although surgical approaches have improved dramatically over the past decades, to the point where many CL/P corrections are nearly imperceptible in adulthood, the process still requires several surgeries dispersed throughout childhood, depending on the severity of the orofacial cleft, and a wide array of therapies to correct the associated speech and hearing issues [7]. Because of the high incidence of orofacial clefts, these clinical treatments amount to significant burdens on patients, families, and the healthcare system, with substantial public health implications [8,9].

Figure 1: Array of orofacial cleft clinical presentations in human patients.

Illustration of various categories of orofacial clefts observed in human patients. (A) Normal human orofacial morphology (from the dorsal oropharynx and oral cavity perspective; lower jaw removed). (B) The least severe form of orofacial cleft presentation with microform cleft lip only, occasionally as mild as lip pits/indentations. (C-H) Unilateral and bilateral orofacial cleft presentations with combinations of clefts through the lips, hard palate, and soft palate. (I) Oblique orofacial clefts are rare and unique in etiology, often associated with skin and bony defects beyond the palate through the nasal maxillary and orbital facial bones. Skin abnormalities are indicated in orange. Clefts are indicated in brown. CLO: cleft lip only; CLP: cleft lip and palate; CPO: cleft palate only. Human patient images of orofacial cleft presentations adapted and modified from Dixon et al., Nature Reviews Genetics 2011 [3].

Through a large collection of forward/reverse genetic screens, genome-wide association studies (GWAS), epidemiological studies, animal models, and many other approaches, numerous genes have already been identified as associated with CL/P pathogenesis [10]. These studies will be discussed in depth in the context of Interferon Regulatory Factor 6 (IRF6), the gene-of-interest of this dissertation. The ultimate goal of examining the genetic and developmental bases of CL/P is to understand the molecular and cellular mechanisms-of-action of orofacial cleft pathogenesis and to prevent the disease phenotypes from manifesting. In an example analogous to orofacial clefts, spina bifida (incomplete closure of the caudal neural tube) is also a common congenital malformation caused by the improper union of embryonic epithelial tissue seams [11,12]. While surgical approaches were devised to correct spina bifida, the high incidence of disease and poor clinical outcomes generated intense scientific interest in discovering potential pathogenic mechanisms and medical interventions. Through numerous seminal basic science and epidemiological studies, the folate metabolism pathway was discovered to be intimately linked with the pathogenesis of spina bifida, and prenatal folate supplementation was found to greatly reduce its incidence [13,14]. Encouraged by these findings, many nations introduced staple grain folate supplementation and included folate as part of the prenatal vitamin regimen, all of which led to a significant decline in the incidence of spina bifida worldwide [15,16]. With the combination of basic science and epidemiological research, a complex multifactorial congenital disorder was largely prevented by a fairly innocuous substance taken during pregnancy. Because many of the scientific findings from spina bifida have translated to similar findings in orofacial cleft research, perhaps a fairly innocuous solution like prenatal folate supplementation could also be discovered to prevent or ameliorate orofacial cleft pathogenesis in human patients.

Taken together, orofacial clefts and congenital craniofacial malformations in general have generated immense interest in the medical and scientific communities, with both sides seeking to understand their fundamental principles, identify potential biological mechanisms of pathogenesis, and discover novel treatment options.

Division of CL/Ps into syndromic and nonsyndromic cases

Orofacial clefts are common congenital malformations, and many factors contribute to their diverse mechanisms of pathogenesis, which often involve both genetic and environmental contributions [3,17]. Approximately 30% of CL/Ps are components of the presentation of genetic syndromes and are considered syndromic OFCs (like the CL/P phenotype in DiGeorge syndrome involving 22q11 microdeletions) [18]. The remaining 70% of human OFC cases are considered nonsyndromic OFCs and affect approximately 0.1% of the population worldwide [4]. The overall risk for CL/P rises to approximately 15% in children with an affected first-degree relative, and the disease-phenotype concordance rate between identical twins is approximately 50% [19,20]. Taken together, it is currently believed that the pathogenesis of CL/P, especially nonsyndromic CL/P, results from a variety of genetic risk factors, prenatal environmental exposures, and gene-environment interactions that could amplify the pathogenic effects of each other to precipitate the manifestation of disease phenotypes. Interestingly, amongst nonsyndromic OFCs, CL/Ps are nearly twice as common in males as in females, while isolated cleft palates only (CPOs) demonstrate the converse pattern and are twice as common in females as in males [7]. Out of all OFC cases, cleft lip only (CLO) occurs most frequently, followed by unilateral cleft lip and palate (CLP). Cleft palate only (CPO) cases are rare occurrences, and oblique OFC cases have the lowest incidence, typically with distinct mechanisms of pathogenesis involved. Approximately 75% of OFCs involving the lips are unilateral, with those affecting the left side nearly twice as common as those affecting the right side [21]. In spite of these interesting epidemiological observations on OFC incidence, the biological underpinnings behind them are currently unknown.

Many epidemiological studies have been performed for OFCs due to their high incidence and financial cost to the healthcare system. Perhaps with a better understanding of the population incidence and risk factors associated with OFCs, novel preventive and interventional approaches could be taken to decrease their incidence, much like the effect of prenatal folate supplementation on spina bifida. For example, numerous studies have revealed that the incidence of nonsyndromic CL/Ps varies significantly by ancestry, with individuals of Asian descent most commonly affected (1/500 births) and those of pan-African descent least commonly affected (1/2,500 births) [3].

Lifestyle risk factors such as prenatal alcohol exposure, illicit or prescription drug use, and smoking have all been shown to be associated with an increased risk of CL/Ps [22-24]. In particular, smoking has been consistently shown to be highly associated with an increased risk of CL/P pathogenesis, with a population-attributable risk estimated as high as 20% over baseline [25]. Other socioeconomic factors, such as limited access to modern medical care [26] and prenatal vitamins [27], and nutritional deficiencies [28], have all been linked to an increased risk of CL/Ps, again hinting at the multifactorial nature of CL/P pathogenesis. Certain environmental toxin exposures, such as polycyclic aromatic hydrocarbons resulting from industrial manufacturing, fossil fuel combustion, and pesticides, have also been shown to increase the incidence of CL/Ps, offering a possible explanation for why the incidence of CL/P is higher than expected in developing nations with less stringent environmental protection measures [29,30]. Other environmental exposures, such as all-trans retinoic acid (ATRA) [31,32], aspirin [33], valproic acid [34], phenytoin [35,36], and other teratogenic chemicals, could all lead to significantly increased risks of OFCs in humans. Finally, environmental exposures could have different effect strengths on OFC pathogenesis, and their effects could be amplified and/or diminished in the setting of genetic backgrounds containing susceptibility genes. For example, benign genetic variations in alcohol metabolism, nonpathogenic when taken in isolation, could lead to increased accumulation of toxic metabolites with alcohol ingestion during pregnancy and thereby increased risks for OFC pathogenesis [22].

In recognition of the strong genetic contributions to both syndromic and nonsyndromic OFC pathogenesis, numerous approaches have been devised to identify potential genes implicated in orofacial clefts. The approaches include analyzing chromosomal rearrangements, familial linkage studies, genome-wide association studies (GWAS), and candidate gene animal models. Each of these approaches comes with unique strengths and weaknesses, but together they produce a more comprehensive picture of genes and pathways involved in OFC pathogenesis. The chromosomal rearrangement approach is dependent on the identification of human patients who have balanced chromosomal translocations such that the breakpoints disrupt the open reading frames of specific genes [37]. This approach is useful due to the relatively simple candidate gene identification process, especially before the widespread utilization of next generation whole exome/genome sequencing, because translocations can be easily identified by cytogenetic/karyotype analyses, and the gene disrupted is usually specific to the one contained within the breakpoint [38]. Several genes important for craniofacial development have been identified through this approach, including KDM6A [39] and SUMO1 [40]. However, a possible drawback of this approach is that while the most direct candidate genes would be the ones disrupted by the chromosomal translocations, the translocated elements could potentially shift crucial gene regulatory regions to adjacent genes and cause ectopic gene expression that leads to CL/P pathogenesis. In spite of this potential drawback, the chromosomal translocation approach could identify high-confidence candidates and advance the generation of gene disruption animal models to validate the functions of those genes in CL/P pathogenesis.

Familial linkage studies depend on the identification of markers that co-segregate with disease phenotypes in extended family pedigrees with multiple affected and non-affected individuals. For monogenic Mendelian diseases with high penetrance and expressivity, the disease-causing loci will tend to travel with adjacent genetic markers, with markers closer to the allele on the chromosome co-segregating with the disease phenotype more frequently than predicted by chance because of the reduced likelihood of chromosomal recombination between them [41]. By identifying the assortment of genetic markers that travel with disease manifestation in a family, the region of genetic linkage can be narrowed; the larger the pedigree and the number of individuals available for analysis, the smaller the region that can be identified. While this method has been very successful for identifying highly penetrant monogenic disorders, small-scale genetic linkage studies have met with limited success for complex disorders that are multigenic, variable in penetrance and/or expressivity, and heterogeneous in clinical disease presentations. These complex disorders require a greater number of patients in order to generate sufficient statistical power to detect limited individual genetic contributions to OFC pathogenesis.

With advances in SNP microarrays, and more recently whole exome/genome sequencing, GWA studies are becoming very popular approaches for gene discovery in complex diseases. In GWAS, thousands of unrelated individuals with and without the disease-of-interest are broadly genotyped for genetic markers (e.g., single nucleotide polymorphisms) [42]. Subsequently, the disease phenotype is correlated with over-representations in subsets of genetic markers that could either be directly contributory to disease or be linked to as yet undiscovered underlying genetic variations related to pathogenesis. Because of the uncertainty of causality between the identified markers and disease pathogenesis, GWAS results have been primarily used for discovery of novel genes and pathways. The mechanisms-of-action of these associated genes in disease pathogenesis require biological validation through experimentation. Several GWA studies have been performed in a variety of populations to identify genes associated with CL/P, including MAFB and ABCA4 [43].
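As an illustration of the case-control correlation step described above, the following minimal Python sketch tests a single marker for over-representation of one allele in affected individuals using a Pearson chi-square test on a 2x2 table of allele counts. It is not the analysis pipeline of any study cited here. The function name, the allele counts, and the use of the conventional genome-wide significance threshold of 5e-8 are assumptions made for illustration only; a real GWAS repeats such a test across many markers and corrects for population structure.

```python
import math

# Conventional genome-wide significance threshold (an assumption for
# illustration; it is not stated in the text).
GENOME_WIDE_ALPHA = 5e-8


def allelic_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 d.f.) on a 2x2 table of allele counts.

    Returns (chi-square statistic, p-value, odds ratio) for one marker.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_sums = [case_alt + case_ref, ctrl_alt + ctrl_ref]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, the upper-tail chi-square probability
    # equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    return chi2, p_value, odds_ratio


# Hypothetical allele counts for a single marker (not real data).
chi2, p_value, odds_ratio = allelic_association(
    case_alt=60, case_ref=40, ctrl_alt=40, ctrl_ref=60
)
print(f"chi2={chi2:.2f}, p={p_value:.4g}, OR={odds_ratio:.2f}")
print("genome-wide significant:", p_value < GENOME_WIDE_ALPHA)
```

Even a clearly over-represented allele in this small hypothetical sample falls far short of genome-wide significance, which is consistent with the need for thousands of genotyped individuals noted above.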

One crucial gene in particular, Interferon Regulatory Factor 6 (IRF6), has been consistently demonstrated to be significantly associated with CL/P pathogenesis across different studies and populations, which suggests that the molecular and cellular functions of IRF6 play important roles during craniofacial development and warrant greater scrutiny.


IRF6 as the gene most commonly mutated in human CL/P patients

Although a large number of genes have been discovered to be associated with OFC pathogenesis, IRF6 is the primary gene that has consistently demonstrated a significant degree of association with, and causation of, CL/P across studies. In fact, IRF6 is the gene most commonly mutated in human OFC patients and accounts for approximately 2% of all CL/P cases worldwide [44]. The gene IRF6 (NG_007081.2) is located on chromosome 1q32.2 and belongs to the larger Interferon Regulatory Factor family of helix-turn-helix transcription factors, with members IRF1-9 [45]. The canonical role of the IRF family members is the transcriptional regulation of interferon production in response to viral infections. While individuals with deleterious mutations in the other IRF members demonstrated defects in their immune systems, they did not exhibit any abnormalities in craniofacial development or structural birth defects [46]. In contrast, IRF6 is unique among its family members in that when it is mutated, no immunological defects are observed, and instead, structural birth defects including CL/Ps and inter-epithelial adhesions become apparent [145]. In humans, mutations in IRF6 cause two related Mendelian syndromes: Van der Woude syndrome (VWS; OMIM: 119300) and Popliteal Pterygium syndrome (PPS; OMIM: 119500) [45,47]. Furthermore, in addition to VWS and PPS, gene variants in IRF6 and surrounding gene regulatory regions have been repeatedly found in GWAS and other studies to be associated with increased nonsyndromic CL/P incidence [48-52]. Therefore, IRF6 is truly unique amongst CL/P candidate genes due to its high prevalence in OFC patients and its relationship with both syndromic and nonsyndromic forms of OFCs.

Both VWS and PPS display autosomal-dominant inheritance patterns and have high penetrance (>90%) but variable expressivity [44,53]. For VWS, most patients have lower lip pits, a nearly pathognomonic presentation feature, cleft lip and/or palates, hypodontia, and oral epithelial adhesions with skin abnormalities [54,55]. PPS patients share many of the same clinical presentations as VWS, but in general have more severe disease phenotype presentations. In addition, PPS often presents with several additional clinical features such as oropharyngeal and esophageal webbing, webbing of the popliteal fossa (backs of knees), syndactyly, genitourinary abnormalities (labial hypoplasia, cryptorchidism, bifid scrotums), ankyloblepharon (epithelial fusions between the upper and lower eyelids), and nail cuticle overgrowths [56,57].

Like other IRF family members, IRF6 is a transcription factor containing two key functional domains: 1) a helix-turn-helix DNA-binding domain (amino acids 13-113), and 2) a SMIR-IAD protein-binding domain (amino acids 226-394) [45]. The DNA-binding domain is required for IRF6 binding to transcriptional motifs in enhancer and promoter elements; the protein-binding domain is required for IRF6 to form homo- and heterodimers [58]. Subsequently, the IRF6 dimers can translocate into the nucleus, associate with other transcriptional regulators, bind to their DNA targets, and regulate transcriptional activities at those loci [58]. While IRF6 shares the same gene and protein structure as other IRF family members, it does not share the canonical IRF transcriptional regulatory targets, and the genes under IRF6 transcriptional regulation are currently unknown. Interestingly, whereas missense mutations associated with VWS are evenly distributed across the entire gene and both functional domains, most missense mutations associated with PPS are located within the DNA-binding domain [59]. In fact, every amino acid residue currently discovered to be mutated in PPS patients makes direct contact with DNA, as extrapolated from studies of crystal structures of related IRF members [45,59]. Conversely, this property holds for only approximately 15% of amino acid residues mutated in VWS patients [45]. Taken together, these observations led to the hypothesis that missense mutations associated with VWS cause severe loss-of-function of the mutant protein resulting in haploinsufficiency, whereas missense mutations associated with PPS lead to mutant IRF6 proteins that are unable to bind to DNA but still capable of interacting with protein partners, thereby forming inactive transcriptional complexes and acquiring dominant-negative activities [45]. In support of this hypothesis, mutations in the DNA-binding domains of IRF3/7 led to inactivated IRF transcriptional complexes and dominant-negative inhibition of viral infection-induced interferon expression [60,61]. For both molecular mechanisms of pathogenesis, the overt genetic inheritance pattern is autosomal dominant.

Figure 2: Mouse Irf6 model recapitulates Van der Woude syndrome phenotypes in humans.

The mouse model of Irf6 was generated by a gene-trap vector that inserted 36 bp downstream of the splice donor site of intron 1, which terminated transcription and translation with an ectopic splice acceptor, in-frame stop codon, and SV40 polyadenylation signal. (A) Gross morphology of E17.5 wild type mouse embryo. (B) Homozygous-null Irf6 E17.5 mouse embryos exhibited taut skin and shortened jaw, limbs, and tail. (C) Wild type E17.5 embryos have a functional epithelial barrier for toluidine blue dye exclusion compared to homozygous-null Irf6 E17.5 embryos (D). (E-L) Immunohistochemistry analyses of E17.5 epithelium for epithelial markers revealed absent Irf6 expression (E and I), increased depth of the spinous layers indicated by Keratin 1 (F and J), increased cellular proliferation indicated by BrdU (G and K), and decreased cell death indicated by TUNEL (H and L) in homozygous-null Irf6 embryo compared to wild type. (M-P) Histological sections revealed fusion of palatal shelves (M) and proper skin stratification (N) in wild type E17.5 embryos, but adhesions of palatal shelves to the surface of the tongue (O), and hypercellular skin with poor stratification (P) in homozygous-null Irf6 embryos. Dashed line: epidermal-dermal junction. To: tongue; *: palatal shelf. Figure adapted and modified from Ingraham et al., Nature Genetics 2006 (ref. 62).

Due to the central importance of IRF6 in cleft lip and/or palate pathogenesis in humans, a mouse model of Irf6 gene disruption was generated using a gene-trap vector. The vector inserted downstream of the splice-donor site of intron 1, which then terminated transcription and translation with an ectopic splice-acceptor, in-frame stop codon, and SV40 polyadenylation signal, effectively producing a null allele that displayed no production of full-length Irf6 protein62. Irf6+/- heterozygous-null animals did not exhibit any craniofacial or generalized developmental abnormalities (contrary to the expected autosomal-dominant inheritance patterns of VWS and PPS in humans). However, homozygous-null Irf6-/- animals did exhibit numerous developmental abnormalities, including cleft palate and oral epithelial adhesions, much like the disease phenotypes observed in VWS/PPS patients62. In addition, homozygous-null Irf6-/- animals also showed shortened tails, forelimbs, and hindlimbs compared to wild type (Figure 2A and 2B). Further analyses revealed Irf6 mutant animals had poor epithelial barrier functions in the toluidine blue dye exclusion assay (Figure 2C and 2D), most likely due to poor epithelial stratification and tight junction formation (Figure 2F and 2J), increased epithelial cell proliferation (Figure 2G and 2K), and decreased epithelial cell death (Figure 2H and 2L). These epithelial differentiation and maturation defects (Figure 2N and 2P) likely precipitated the OFC phenotype observed, where the secondary palatal shelves did not sufficiently elevate and adhere to each other at the midline in the oral cavity but instead became attached to the lateral surfaces of the tongue during the vertical outgrowth phase of palatogenesis (Figure 2M and 2O). Currently, it is unclear whether the pathological effects of these intraoral epithelial adhesions are physical, mechanically preventing the elevation and fusion of the secondary palatal shelves, and/or biological, perturbing aspects of the native oral/palate microenvironment to cause CL/P formation.

Another mouse model of Irf6 gene disruption was also generated by introducing a human disease allele called p.R84C through homologous recombination63. The R84C missense mutation disrupts a crucial DNA-binding amino acid residue in the DNA-binding domain of IRF6 and is the most common mutation observed in human PPS patients45. As such, the p.R84C IRF6 protein has been proposed to have dominant-negative activities. When R84C was introduced in the mouse model, Irf6 function was significantly reduced in heterozygous Irf6R84C/+ embryos, resulting in a subtle but reproducible phenotype where 89% of Irf6R84C/+ embryos analyzed between E12.0-16.0 had mild intraoral epithelial adhesions that resolved spontaneously with oral movements during subsequent stages of embryonic development63. Although the Irf6R84C model, unlike the Irf6null gene trap model, was able to precipitate a mutant phenotype in the heterozygous state, no cleft lip and/or palates were observed in the heterozygous state, suggesting possible differences in Irf6 function between mouse and human63. Only in the homozygous Irf6R84C/R84C embryos were strong epithelial maturation defects and orofacial clefts readily observed. Keratinocytes in the epidermis displayed increased proliferation, decreased cell death, and abnormal differentiation, much like keratinocytes in the Irf6null model63. The late markers of keratinocyte differentiation, such as loricrin and filaggrin, which are expressed in the granular and cornified layers of the epidermis, were absent from the epidermis of Irf6R84C/R84C mutants, suggesting a developmental arrest of epidermal keratinocytes early during their canonical differentiation process63.

Because the majority of the vertebrate craniofacial skeleton is composed of cranial neural crest cell derivatives, it is interesting to note that most of the defects observed in mutants of Irf6, the gene most commonly mutated in CL/P patients, are related to epithelial rather than CNCC dysregulation at the cellular level. In order to fully understand how mutations in IRF6 and changes in its function lead to epithelial maturation defects, and how those epithelial defects could manifest as craniofacial abnormalities, the normal craniofacial development process must be examined.

Mammalian craniofacial development process overview

Craniofacial development is a highly complex process involving the intricately coordinated movements, proliferation, and differentiation of numerous cell types. Although many cell types are required for this process to properly occur, epithelial and neural crest cells are the primary actors and play key roles in regulating craniofacial morphogenesis and contributing to adult structures64. Neural crest cells (NCCs) are a highly malleable cell population that has often been termed the “fourth germ layer” due to its ability to differentiate into diverse cell types that seemingly disobey the canonical barriers between ectoderm, mesoderm, and endoderm65.

Neural crest cell development initiates early in embryogenesis during neurulation before neural fold closure. NCCs are initially specified at the neural plate border at the junction between the neural and non-neural ectoderms, where NCCs undergo epithelial-to-mesenchymal transition (EMT), delamination, and migration into the surrounding mesenchymal tissues65. Based on the initial positions of the NCCs along the anterior-posterior (A-P) body axis at the time of neural crest cell specification, NCCs can be broadly divided into two categories: cranial neural crest cells (CNCCs) at the anterior (cranial) and trunk neural crest cells at the posterior (caudal). These NCC populations have been previously demonstrated to have distinct gene expression patterns and cellular regulatory mechanisms66. After delamination, anterior CNCCs migrate in canonicalized pathways toward the appropriate pharyngeal arches (PAs) and populate the PA mesenchyme to eventually give rise to their craniofacial derivatives1,67. In contrast, trunk NCCs migrate throughout the entire developing embryo and give rise to diverse tissues and cell types ranging from the melanocytes in the skin to smooth muscle cells in the cardiac septum and neurons in the peripheral nervous system68. The A-P identities of CNCCs are largely defined by Hox genes, akin to other processes involving A-P axis segmentations during embryogenesis69,70. However, CNCCs from the level of rhombomeres one and two, populations that migrate into PA1 and eventually contribute to the frontonasal and paired maxillary prominences, are completely devoid of Hox gene expression and have undefined mechanisms of identifying their A-P axial position information71,72. Furthermore, the anterior-most CNCCs that will eventually form the frontonasal mesenchyme never enter the pharyngeal arch, a tissue milieu that provides instructive signaling cues to CNCCs and directs their proliferation and differentiation73. Due to the noncanonical properties of the anterior-most CNCC population, various aspects of their development and regulation are intriguing. Indeed, the lack of constraints by Hox genes and the PA microenvironment enables anterior CNCCs to be developmentally plastic and adaptable, and has been hypothesized to be responsible for the massive divergence in craniofacial morphologies across diverse species and evolutionary time74-76.

What is currently known is that CNCCs from different A-P axial origins migrate in distinctive streams that do not intermingle even when they contribute to admixed structures77. Due to the lack of Hox gene expression in anterior CNCCs and their currently undefined mechanisms of cellular regulation, numerous experiments have been done to determine whether anterior CNCC identities are pre-defined by a set of unknown intrinsic molecular factors or remain plastic until guided by instructive developmental cues that influence cell fate decisions. Currently, the majority of available experimental data are from transplantation experiments in chick/quail, which seem to support the hypothesis that anterior CNCCs are developmentally plastic and able to adopt the cell fates of the locations to which they are transplanted rather than retain the fates of their embryological origins78-83. In addition, experimental data from the transplantation of tissues surrounding CNCCs suggest that the embryonic epithelium in the facial and oropharyngeal regions plays critical instructive roles in inducing CNCC growth, differentiation, and morphogenesis64. For example, several studies have demonstrated that cues from the oral ectoderm, including fibroblast growth factors (FGFs)84,85, Wnt signals86-88, bone morphogenetic proteins (BMPs)89,90, and Sonic hedgehog signals (SHHs)91,92, are indispensable for the growth, differentiation, and survival of the frontonasal and maxillary CNCCs.

In fact, a key facial organizer named the frontonasal ectodermal zone (FEZ) was recently identified at the extreme anterior position of the embryo as the junction between Fgf8 and Shh expression in the oral ectoderm overlying the frontonasal process down the midline93,94. The FEZ can not only dictate the dorsal-ventral (D-V) body axis necessary for maxillary outgrowth, but also induce ectopic upper jaw structures when transplanted onto posterior CNCC populations. These results suggested that the FEZ, and therefore the oral ectoderm, had a dominant role in the homeotic transformation of CNCCs to define specific anterior craniofacial structures84. Furthermore, a few studies identified another facial organizer adjacent to the FEZ near the mouth opening, named the extreme anterior domain (EAD), that can also play instructive specification and organization roles for anterior CNCC development95. The EAD describes a region of juxtaposition between the facial ectoderm and gut endoderm, and ablation of this structure has been shown to prevent mouth opening and facial bone differentiation in the underlying CNCC mesenchyme96,97. Lastly, as previously mentioned, the PA microenvironment, especially the pharyngeal epithelium, provides many important instructive and trophic cues to the CNCC mesenchyme. In the PA, several molecular factors are responsible for carrying out the signaling functions required for directing NCC proliferation, survival, differentiation, migration, and tissue patterning. For example, BMPs98, retinoic acid receptors (RARs and RXRs)99, and endothelins100,101, among many other well-established signaling pathways, have been shown to be expressed in pharyngeal arch tissues and to cause craniofacial abnormalities when disrupted in vivo in various animal models.

Morphogenesis and fusion of facial prominences

The facial morphogenesis process during embryonic development is well-conserved at the molecular and cellular levels across diverse species spanning the spectrum of vertebrate evolution. It typically starts with the formation of five facial prominences surrounding the mouth opening: the frontonasal process (FNP), which can be further subdivided into the medial and lateral FNPs, the paired bilateral maxillary processes (MXP), and the paired bilateral mandibular processes (MDP)5. The FNP and MXPs fuse at the embryonic midline to form the upper jaw apparatus; similarly, the MDPs fuse below the mouth opening at the midline to form the lower jaw5. All facial prominences are populated primarily by CNCCs. CNCCs that constitute the FNP consist of the anterior-most CNCCs that migrate anteriorly down the midline around the forebrain and eye primordia to reach the future roof of the mouth opening5. In contrast, CNCCs that constitute the MXPs originate from rhombomere one, migrate into the anterior aspects of the first pharyngeal arch, and subsequently converge bilaterally toward the midline1. Similar to the MXP, the CNCCs that constitute the MDPs originate from rhombomere two, migrate into the posterior aspect of the first pharyngeal arch, and subsequently converge bilaterally toward the midline. As craniofacial development proceeds, the FNP becomes divided into the medial and paired lateral processes by the formation of the nasal pits102. Subsequently, a series of important fusions occur between the facial prominences to form the rudiments of embryonic facial structures (Figure 3A): 1) the medial and lateral FNPs fuse at the ventral tips to form the nostrils, 2) the lateral FNP and ipsilateral MXP fuse to form the upper lips, primary palate, and secondary palatal shelves (Figure 3A-B)103, and 3) the bilateral MDPs fuse to form the mandibular Meckel’s cartilages (Figure 3A’)103. The secondary palate starts as outgrowths from the oral-cavity surface of the fused medial FNP and MXPs and extends vertically in the mouth to flank the developing tongue (Figure 3B’). Subsequently, the secondary palatal shelves elevate to a horizontal position in the oral cavity above the dorsal surface of the tongue (Figure 3C and 3C’), extend toward each other, abut at the midline, and fuse in a zipper-like fashion proceeding in the direction from the anterior stomodeum to the posterior oropharynx (Figure 3D)5. At the conclusion of palatogenesis, there remains a pinhole gap between the primary and secondary palates (Figure 3D). This gap forms the incisive foramen in the adult through which the greater palatine artery and nerves pass.

The secondary palatal shelves are covered by a thin layer of stratified squamous epithelia during embryonic craniofacial development. The oral epithelium has numerous signaling functions and is an important player in the epithelial-mesenchymal interactions required for the proliferation, differentiation, maturation, and survival of CNCCs in the facial prominences and palatal shelves64. Numerous previous publications support the central importance of the oral epithelium during many aspects of palatogenesis. For example, Shh is a secreted signaling molecule expressed in the oral epithelium during craniofacial development that is critical for driving facial prominence growth104. The reciprocal epithelial- or mesenchymal-specific genetic ablations of Shh or Smoothened (Smo), the primary receptor for Shh signaling, revealed that it is Shh signals from the oral epithelium that drive Smo-positive CNCC mesenchymal proliferation and subsequent palatal shelf protrusions into the oral cavity105. Reciprocal signaling occurs from the mesenchyme to the epithelium as well.

Whereas Fgf10 expression is restricted to the underlying CNCC palatal mesenchyme, Fgfr2b, the primary receptor for Fgf10, is expressed only in the oral epithelium. Epithelial-specific ablations of Fgfr2b functions render the oral epithelium unable to detect Fgf10 signals from the mesenchyme that normally drive epithelial maturation and Shh expression106. In combination, epithelial-Fgfr2b mutants displayed a cleft palate phenotype, likely due to defective palatal CNCC mesenchymal outgrowth104,107. Because of the spatial constraints and proximity of structures in the oral cavity, the oral epithelium also provides insulating barrier functions between distinct mesenchymes to prevent pathological fusions between adjacent structures, such as between the palatal shelves and lateral surfaces of the tongue during vertical outgrowth (Figure 3B’)108. This barrier function is achieved by the expression of tight junction proteins at the apico-lateral surfaces of the superficial epithelial (periderm) cells to insulate the spread of cell adhesion molecules associated with basal epithelial cells onto the apical surfaces64,108. In contrast, since normal palatogenesis involves several fusion events, the formation of a mesenchymal continuum requires the spatiotemporal disappearance of the oral epithelial cell barriers. In one well-studied context, the secondary palatal shelves are each enveloped by oral epithelium, and their adhesion and fusion at the midline to produce a continuous mesenchyme is a critical step during secondary palate development that requires the regulated dissolution of the medial edge epithelium (MEE) (Figure 3D’)5. In situations of MEE dysregulation, cleft palate usually results109. Mice that harbor deleterious mutations in genes critical for epithelial proliferation and differentiation, especially in the oral epithelium, often showed improper adhesions between structures in the oral cavity that physically or biologically inhibited palatal shelf elevation and fusion59,60. However, in certain situations, even in the experimental context where embryonic palatal shelves are microsurgically isolated and maintained as explants in culture, the secondary palatal shelves of epithelial gene mutants often remained incapable of fusion due to defective oral epithelial differentiation and regulated cell death at the MEE, suggesting a broader biological role for oral epithelial cells during palatogenesis108.

Figure 3: Mammalian primary and secondary palate development process.

(A-D) Scanning electron microscopy (SEM) images of the dorsal oral cavities of mouse embryos at representative embryonic craniofacial development stages. (A’-D’) Histological coronal sections of the mouse embryonic oral cavity (stained with H&E) corresponding to the SEM images above. (A) At E11.5, major facial prominences are populated with cranial neural crest cells and have converged at the embryonic midline. The lateral-FNP, medial-FNP, and MXP fuse at the lambdoid (λ) junction to form the nares, upper lips and primary palate. The MDPs fuse to form the Meckel’s cartilages (A’). (B) At E13.5, oral surface outgrowths of the MXPs form palatal shelves that proliferate and vertically extend along the lateral surface of the tongue (B’). (C) At E14.5, the palatal shelves have elevated to a horizontal position above the tongue (C’). (D) At E15.5, the palatal shelves contact and begin to fuse in an anterior-to-posterior direction. (D’) The medial edge epithelial seam also degrades in the anterior-to-posterior direction to form the continuous palatal mesenchyme of the secondary palate. The gap between the primary and secondary palates forms the incisive foramen through which the greater palatine artery and nerves pass. L-FNP: lateral frontonasal process; M-FNP: medial frontonasal process; MXP: maxillary process; MDP: mandibular process; UL: upper lip; PP: primary palate; NS: nasal septum; PS: palatal shelf; T: tongue. Arrowhead: medial edge epithelia. Figure adapted and modified from Bush and Jiang, Development 2012 (ref. 5).

The MEE represents a unique location in the palatal fusion process because its dissolution is necessary for a continuous secondary palatal mesenchyme to form. While previously published studies have demonstrated that the dissolution of the MEE is required for palatal shelf fusion, the exact molecular and cellular mechanisms behind how the MEE disappears during fusion remain highly contentious topics in the field of craniofacial development research109. Three non-mutually exclusive hypotheses have been proposed and experimentally tested in attempts to address this question: 1) the MEE cells undergo programmed cell death during secondary palatal shelf fusion to facilitate their removal, 2) MEE cells undergo epithelial-to-mesenchymal transition (EMT) during palatal shelf fusion and migrate into the mesenchyme, and 3) MEE cells shift and/or migrate in the nasal and oropharyngeal directions to expose the underlying mesenchyme prior to palatal shelf contact. In support of the role of programmed cell death in MEE disintegration during palatal fusion, several studies have shown through methods of detecting apoptosis, including cleaved caspase 3 antibody staining and terminal deoxynucleotidyl transferase dUTP nick-end labeling (TUNEL), that MEE cells are positive for markers of programmed cell death immediately before and during palatal shelf fusion110,111. However, when an alternative genetic approach was used to interrogate the role of apoptosis during palatal shelf fusion, a conflicting result was obtained. Apaf1, a key component of the caspase 3-dependent apoptosis pathway, was mutated to generate a mouse model unable to activate apoptosis. Homozygous-null Apaf1-/- mutant embryos showed normal palatal fusion despite their inability to activate caspase 3-dependent apoptosis, in contrast to the previous observations of apoptosis marker expression in the MEE112. One possible explanation to reconcile the conflicting result is that although Apaf1 is an important component of the apoptosis pathway, other methods of caspase activation exist, such as Fas-FasL, and could potentially partially rescue the Apaf1 mutation such that sufficient MEE cell death occurs in Apaf1 mutants to ensure proper palatal shelf fusion112. In support of the EMT hypothesis, cell lineage tracing experiments using an epithelial-specific Cre (K14-Cre) and the ROSA26 reporter line were performed to irreversibly label the oral epithelium with LacZ for subsequent examinations of the cell fates and locations of MEE cells using β-galactosidase staining113. With this approach, one group found no LacZ-positive cells in the palatal mesenchyme and concluded that EMT was not a contributor to MEE dissolution113,114. However, another group used a different K14-Cre line and reported speckled LacZ-positive cells throughout the palatal mesenchyme after palatal shelf fusion and MEE degeneration, contradicting the results from previous reports115. The author suggested that the discrepancies could be due to differences in the constructions of the K14-Cre lines in terms of their expression levels and spatiotemporal patterns in the oral epithelium during palatogenesis, and thus left the debate of whether EMT was truly important in palatal shelf fusion and MEE dissolution unresolved. Lastly, in support of the MEE cell migration hypothesis, cultured palatal shelf explant experiments juxtaposing an epithelial-labeled explant with an unlabeled explant revealed labeled cells dispersed throughout the epithelial surface of the opposing unlabeled explant after fusion115. The most likely mechanism of MEE dissolution during palatal shelf fusion is a combination of all three of the aforementioned biological processes, and potentially other currently undiscovered mechanisms. The magnitudes of the combinatorial defects from multiple processes likely determine the presence or absence of the resulting OFC phenotype.

IRF6 is a master regulator of epithelial differentiation and maturation

To understand the mechanisms through which IRF6 regulates epithelial differentiation and maturation, the structure and functions of embryonic epithelium, which differ from adult epithelium, should be reviewed. Embryonic epithelial differentiation occurs shortly after gastrulation where a single layer of ectoderm differentiates into both neural and non-neural ectodermal lineages116. The resulting embryonic epidermis consists of a single layer of multipotent epithelial stem cells, which subsequently gives rise to a transient protective layer of squamous epithelial cells on the superficial surface called the periderm through asymmetrical cell divisions117. The periderm serves protective, signaling, and mechanical functions until the underlying definitive epidermis is fully formed, at which point the periderm degenerates at around E16.0-17.0 during mouse embryonic development117.

The importance of the periderm is supported by observations that selective genetic ablation of the periderm, and not the underlying epidermis, resulted in intraoral epithelial adhesions, and in severe cases, CL/P secondary to adhesions between the palatal shelves and the lateral surfaces of the tongue108. The definitive epidermis is composed of several distinct layers. The deepest is the basal layer, which is composed of cells capable of proliferating in apically-oriented asymmetrical cell divisions and giving rise to more superficial cells118. As additional cells are added above the basal stratum, more superficial cells will undergo terminal differentiation in the keratinocyte lineage, and begin terminal keratinization with apoptosis to form a barrier layer that serves to prevent water loss, microbial invasions, and more119,120. Through numerous lines of evidence, IRF6 has been shown to be a master regulator of the epithelial proliferation-differentiation switch63. As discussed in the previous section on mouse models of VWS/PPS, mice lacking Irf6 function have hyperproliferative epithelia without stratification. Specifically, the deeper epidermal layers are overrepresented, and the superficial layers of differentiated keratinocytes (stratum corneum) are missing. Normally, the stratum corneum does not express desmosomal components. However, in Irf6 mutants, the most superficial cells of the epidermis have not undergone apoptosis and keratinization, and therefore express desmosomes at the apical surface that subsequently facilitate pathological inter-epithelial adhesions63. In addition, SEM analyses of the Irf6R84C/R84C epidermis revealed increased amounts of apical protrusions from the exposed non-terminally differentiated epithelium compared to wild type63. When adjacent epithelial surfaces were juxtaposed in close proximity, such as in the oral cavity, the epithelial protrusions of the Irf6R84C/R84C embryos were able to contact protrusions from adjacent epithelial surfaces and form desmosomal connections63,64. The finding that defects in Irf6 function can lead to abnormalities in epidermal/periderm differentiation likely accounts for the oral epithelial adhesions observed in Irf6 mutant animals and human VWS/PPS patients because their oral epithelia are more adhesion-prone and not insulated from contacts from adjacent surfaces by keratinized cells in the stratum corneum or periderm. Whether the oral adhesions could precipitate an OFC phenotype is currently under debate. Although the mechanical blockade of the secondary palatal shelves adhered to the lateral surfaces of the tongue could potentially physically prevent their elevation and lead to CLP pathogenesis, there is unpublished evidence from our collaborator that when Irf6R84C/R84C embryonic palatal shelves are microsurgically extracted and grown in explant culture without potential obstruction from the tongue, they are able to elevate and meet at the midline but still produce a CLP phenotype due to ineffective degradation of the MEE.

During mouse embryonic development, Irf6 is expressed in the placenta, skin, heart, liver, and testes45. Importantly, Irf6 is expressed in the oral epithelia and MEE of the elevating secondary palatal shelves and plays crucial roles in the regulation of MEE differentiation and degradation121. In one study that examined the K14-Cre epithelial-specific ablation of Tgfbr2 (a receptor for TGFβ signaling), the resulting mouse mutants displayed strong OFC phenotypes in addition to epithelial defects114. Subsequently, Irf6 expression was found to be significantly downregulated in the medial edge epithelium of epithelial-specific Tgfbr2 mutants, suggesting a possible role for Irf6 downstream of TGFβ signaling in the epithelium for MEE regulation and degradation122. In fact, it has since been shown that overexpression of Irf6 under the K14 promoter could actually rescue the palatal defects caused by epithelial-specific ablations of Tgfbr2 and signaling122. Taken together, the activities of Irf6 in MEE regulation during secondary palatal shelf fusion could function downstream of TGFβ signaling and be sufficient to execute the critical roles of TGFβ-mediated MEE degradation in mice.

IRF6 gene regulation and protein interactions in the embryonic epithelium

The molecular regulation of IRF6 in the oral and medial edge epithelium during craniofacial development is not well-characterized. Therefore, there is intense interest in elucidating the IRF6 gene regulatory network and identifying which factors can activate or inhibit IRF6 functions at the transcriptional and/or post-translational levels. Through studies of other genes important for craniofacial development, several IRF6 gene regulatory pathways have been identified as a byproduct. For example, through disruptions of Pbx1/2/3, proteins often known as Hox cofactors that are strongly expressed in the facial prominences during craniofacial development, Pbx genes were discovered to activate the expression of Wnt9b/Wnt3 at the embryonic midface, which could then upregulate P63, another gene critical for epithelial proliferation and differentiation expressed in basal epithelial cells123. Deleterious mutations in P63 cause ectrodactyly, ectodermal dysplasia, and cleft lip and palate syndrome (EEC OMIM: 129900) in humans124, which shares similar clinical presentations with VWS/PPS. Through a combination of computational and molecular methods, P63 was found to bind a conserved orofacial enhancer element upstream of IRF6 and drive its expression in relevant craniofacial domains123. Taken together, because Pbx genes were known to play early patterning roles in craniofacial development, a relatively comprehensive pathway in craniofacial patterning was established as Pbx-Wnt-P63-Irf6123. Interestingly, each node along this pathway, when mutated in mice or humans, causes similar CL/P phenotypes, and all are therefore highly relevant to craniofacial development and OFC pathogenesis. In another study, examination of noncoding gene variants within the IRF6 upstream gene sequences revealed variants within a multispecies conserved sequence (MCS) that disrupted a critical binding site of AP-2α125. TFAP2A belongs to a family of transcription factors all known to be involved in craniofacial development, and deleterious variants in TFAP2A cause branchio-oculo-facial syndrome (BOF OMIM: 113620) in humans, which has many clinical presentations that overlap with VWS/PPS126. AP-2α could bind upstream of IRF6 and drive its transcriptional activation, and mutations in those binding sites are associated with significantly increased risks of nonsyndromic CL/P pathogenesis. In terms of post-translational regulation of IRF6 activity, through studies of the crystal structures and functions of other IRF members, it was hypothesized that IRF6 normally exists in an auto-inhibited state until activated by protein phosphorylations127 on key residues Ser413 and Ser424 as identified from mutagenesis assays128-130. Although many of the phosphorylation partners that regulate IRF6 post-translational activities remain unknown, the gene RIPK4, which causes Bartsocas-Papas syndrome (BPS OMIM: 263650) when mutated in humans131, has been shown to have serine phosphorylation activities that can activate IRF6 dimerization and nuclear translocation128. Interestingly, one report has also shown that over-expression of IRF6 could activate the transcription of RIPK4, which subsequently amplified IRF6 nuclear translocation and transcriptional activities, suggesting that they could exist in a feed-forward loop128. However, the exact relationship between RIPK4 and IRF6 and how they cooperatively regulate epithelial differentiation is currently unknown.

It is important to note that thus far, the transcriptional and post-translational regulations of IRF6 activity described have all placed IRF6 as the terminal node of gene regulatory transductions, with the molecular events after IRF6 activation not well described or understood. In addition, since IRF6 is a transcription factor, it is crucial to identify the direct transcriptional targets of IRF6 in order to understand how IRF6 regulates MEE degradation and epithelial maturation during craniofacial development at the molecular level.

Zebrafish as a model for mammalian craniofacial development

The zebrafish (Danio rerio) has become an immensely popular model for medical and basic science research132-137. Especially during embryogenesis, the zebrafish model displays significant molecular, cellular, and phenotypic conservation of developmental processes compared to mice and humans. Moreover, the embryonic and adult phenotypes of humans in health and disease can often be recapitulated in zebrafish under corresponding pathological conditions, which makes the zebrafish a useful model to complement the research findings obtained from a variety of other experimental sources. In addition to this strong molecular, cellular, and phenotypic conservation, the zebrafish model also offers many distinct advantages. First, zebrafish embryos develop rapidly in an external aqueous environment composed of a simple salt solution rather than in utero, and therefore the entire duration of embryonic development can be directly observed by microscopy (brightfield for wild type embryos or fluorescence for transgenic reporter embryos) in a matter of hours to days, compared to weeks for mice and months for nonhuman primates. Second, zebrafish (embryonic or adult) can be maintained in a compact and efficient manner, allowing experimentation with a large number of adults with a small laboratory footprint and low operational budget. Each pair of male and female zebrafish can be mated every week to produce on average 200-300 embryos/clutch, which provides incredible strength in numbers to identify mutants from multigenic heterozygous in-crosses and the statistical power to detect small perturbations in experimental results. Furthermore, due to the large number of embryos that can be produced and the aqueous environment in which the embryos are raised, wild type and mutant zebrafish embryos can be easily treated with experimental reagents, such as small molecules, biologics, heat shocks, and light, to manipulate their gene expression, perturb signaling pathways, or screen for chemicals capable of augmenting certain developmental or disease processes138,139.
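The value of large clutches for multigenic in-crosses can be illustrated with simple Mendelian arithmetic. The following is a minimal sketch, assuming unlinked loci, parents heterozygous at every locus, ideal Mendelian segregation, and no embryo loss before genotyping; the function name and the clutch size of 250 are illustrative choices, not values taken from any experiment described here.

```python
from fractions import Fraction


def expected_homozygous_mutants(clutch_size: int, n_genes: int) -> Fraction:
    """Expected number of embryos homozygous mutant at all n_genes loci.

    Assumes an in-cross of parents heterozygous at each locus, with
    unlinked loci and Mendelian segregation (1/4 homozygous mutant per locus).
    """
    return clutch_size * Fraction(1, 4) ** n_genes


if __name__ == "__main__":
    # Illustrative clutch size within the 200-300 range quoted in the text.
    for n in (1, 2, 3):
        print(n, float(expected_homozygous_mutants(250, n)))
```

Under these assumptions the expected yield falls by a factor of four with each additional locus, which is why a clutch of several hundred embryos remains workable for double or triple mutants where a typical mammalian litter would not.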

In fact, numerous successful chemical screens have been performed in the zebrafish model and led to drugs that targeted important developmental pathways relevant to human disease, such as prostaglandin E2 (PGE2) in promoting human hematopoietic stem cell (HSC) proliferation140 and morpholinobutyl-4-thiophenol (MoTP) in preventing pigmentation in zebrafish, which has since become a popular reagent in the study of melanocyte biology141. In addition to chemical screens, the ability to produce a large number of zebrafish embryos also facilitates forward genetic screens with mutagens like ethyl-nitrosourea (ENU)142,143 or ionizing radiation144 to induce mutations in big pools of embryos and assess for interesting phenotypic changes and gene discoveries145. Finally, the zebrafish genome is well sequenced and annotated, and the zebrafish embryo is amenable to reverse genetic studies due to the ease of delivery of targeted genome editing reagents including transcription activator-like effector nucleases (TALENs)146,147, Tol2148, and clustered regularly interspaced short palindromic repeats (CRISPR) Cas9 endonucleases149,150. These techniques have led to the generation of numerous zebrafish gene expression reporters, knockout lines, and other reagents that facilitate in vivo cellular and molecular investigations. Moreover, gene expression perturbations could also be easily accomplished using microinjections of mRNAs, plasmids and morpholinos (MO) during embryonic development, thereby making the zebrafish a versatile model for in vivo studies of gene functions, either individually or massively in parallel, during embryonic development151.


Figure 4: Zebrafish as a model for mammalian craniofacial development.

The zebrafish craniofacial development process contains many analogous molecular and cellular features compared to the mammalian system. (A) Cranial neural crest cells arise at the midbrain- hindbrain junction and migrate in canonical streams, either anteromedially or bilateral-ventrally, to reach the future mouth opening. (A’) Zebrafish anterior cranial neural crest cells follow analogous migration patterns to reach the oral cavity and pharyngeal arches. Neural crest cells were marked with :kaede with the frontonasal population photoconverted in red and other NCCs in green.

(B) Cranial neural crest cells condense into facial prominences and converge at the midline to form the primary palate. (B’) Photoconverted populations in (A’) also condense and fuse at the midline to form the zebrafish primary palate analog, the ethmoid plate. (C) The mammalian primary palate results from the fusion of FNP and bilateral MXPs. (C’) The ethmoid plate is also derived from the fusion of FNP and bilateral MXPs. (D) Mammalian cleft lip and palates frequently result from bony defects in the primary palate at seams of fusion. (D’) Zebrafish ethmoid plate defects phenocopy mammalian primary palatal defects; ethmoid plate isolated from Tg(sox10:irf6 R84C) embryos at 96 hpf. Arrows: streams of CNCC migration; MES: mesencephalon; R: rhombomere; PA: pharyngeal arch; FNP: frontonasal process; MXP: maxillary process.


The zebrafish has been used to create numerous human disease models, ranging broadly from those affecting the gastrointestinal152,153, renal154,155, hematopoietic156-159, cardiovascular160,161, and musculoskeletal162-164 systems. Importantly for our research interests, the zebrafish model has been successfully applied in studies of key genes relevant for craniofacial development and human OFC disorders73,74,102-108. Craniofacial development in zebrafish has many parallels to mammalian systems (Figure 4). Although the mouse model often offers faithful recapitulation of gene functions compared to humans, mouse embryonic development occurs in utero and is relatively inaccessible to observers. Furthermore, the majority of orofacial clefts in mice involve exclusively the secondary palate (CPOs) while in humans, CPOs are rare and clefts through the lips and primary palate are far more common102,103. Taken together, the palatogenesis program has subtle differences across species, even amongst different mammalian systems, and should be studied using a wide variety of models to strengthen the validity and reproducibility of results.

Zebrafish craniofacial development starts with the specification of neural crest cells at the lateral border of the neural plate at the end of neurulation at approximately 12 hours post fertilization (hpf), similar to the mammalian system165. CNCCs are specified at the midbrain-hindbrain junction rhombomeres and migrate in canonical streams to the pharyngeal arches between 12-24 hpf, with the eventual frontonasal CNCCs taking a unique route anteromedially between the brain and eye primordia to condense on the roof of the future mouth opening (Figure 4A’), similar to mammalian systems as well (anterior-most CNCCs) (Figure 4A)102,103. The anterior population of NCCs in PA1 will eventually go on to form the paired maxillary processes, and the posterior population the paired mandibular processes. From 24-48 hpf, MXP CNCCs in PA1 migrate anteromedially and fuse with FNP CNCCs at the midline on the roof of the oral cavity to form the ethmoid plate (EP), a zebrafish craniofacial structure analogous to the mammalian primary palate (Figure 4B and 4B’)166. From 48-72 hpf, CNCCs of the EP undergo convergence-extension movements to intercalate and extend the ethmoid plate in the anterior direction, concurrently flattening the tissue primordium into a single layer of differentiating chondrocytes166,167. Much of zebrafish embryonic craniofacial development can be observed within the first 96 hpf, with further morphogenetic refinements and endochondral ossification occurring after 6-7 days post fertilization (dpf) and continuing into the larval stages of development168.

Clear differences exist between zebrafish craniofacial structures and those of mammalian systems. The lack of a nasal cavity in zebrafish technically designates the ethmoid plate as part of the ventral neurocranium rather than the “palate” (especially the secondary palate). Moreover, key mammalian palatogenesis events like secondary palatal shelf elevation and fusions do not occur in zebrafish, with the confluence of facial prominences in zebrafish more comparable to the fusion of the FNP and MXP at the primary palate lambdoidal junction in mammals. However, despite the differences between zebrafish and mammalian palatogenesis overall, the zebrafish ethmoid plate is still highly analogous to the mammalian primary palate and has strong conservation of critical genes, regulatory/signaling pathways, and molecular functions. Numerous molecular and cellular insights on orofacial cleft pathogenesis have been gained from studies conducted in the zebrafish model136,168. For example, deleterious mutations in the platelet-derived growth factor receptor alpha (pdgfra) gene in zebrafish resulted in abnormalities of the anterior neurocranium and clefts in the ethmoid plate resembling the cleft phenotype subsequently observed in Pdgfra-/- mice169-171. Many other examples abound. Mutations in specc1l in the zebrafish model caused bilateral clefts of the EP, much like the oblique orofacial cleft phenotypes observed in human patients172. Mutations in polr1 phenocopied the human clinical presentations of Treacher Collins syndrome173,174, in Kabuki syndrome175, in Bamforth-Lazarus syndrome176, smchd1 in Bosma arrhinia micro-ophthalmia syndrome177, in branchio-oculo-facial syndrome178, and many others.

In summary, the zebrafish model has a strong and robust history in craniofacial and orofacial cleft research and represents an ideal model to examine various aspects of IRF6 functions in OFC pathogenesis and craniofacial development in this dissertation.


Chapter 1:

Establishment of Zebrafish irf6 Models Using CRISPR-Cas9 Genome Editing

Research Attributions

The research presented in the following chapter has been in part published in Li et al., 2017.

Edward Li played a role in project conceptualization, experimental execution, data curation, formal analysis, and methodology development. Dawn Truong, an undergraduate student, played a role in experimental execution, data curation and formal analysis. Shawn Hallett, a research technician, played a role in experimental execution and data curation. Kusumika Mukherjee, a post-doctoral fellow, and Christina Nguyen, a former research technician, played a role in designing the CRISPR targeting strategy and generating the zebrafish irf6 mutant. Brian Schutte, a principal investigator and collaborator, played a role in the analysis of human genetics data, methodology development and manuscript preparations.

Li EB, Truong D, Hallett SA, Mukherjee K, Schutte BC, Liao EC (2017) Rapid functional analysis of computationally complex rare human IRF6 gene variants using a novel zebrafish model. PLoS Genet 13(9): e1007009.


Introduction

Conservation of IRF6 gene structure and protein sequence in zebrafish

IRF6 has been previously demonstrated by both human cell line and mouse in vivo studies to be a master regulator of keratinocyte maturation62,63,179. In human patients with VWS and PPS, dysregulated proliferation and differentiation of keratinocytes throughout the epithelial strata are hallmark cellular characteristics underlying the overt pathological clinical presentations observed, such as CL/Ps and oropharyngeal adhesions62. In one study, to analyze the roles of IRF6 in human epithelial keratinocyte maturation, primary normal human keratinocytes (NHK) with functional IRF6 or IRF6 siRNA knockdown were cultured and induced to differentiate in vitro179. Normally, cultured primary NHKs induced to differentiate could produce stratified epithelial structures resembling the natural layers of the human epidermis, with the basal layers retaining stem cell-like characteristics and the superficial layers becoming progressively keratinized and nonproliferative180. When NHKs were transfected with IRF6 siRNAs, they became hyperproliferative and non-apoptotic179. In fact, IRF6 was subsequently discovered to act as a tumor suppressor in human keratinocytes to control their growth, survival, differentiation, and maturation, with its promoter often silenced in squamous cell carcinoma samples179.

Various other model systems have been used for the study of IRF6. The zebrafish homolog was first described by Ben et al. in 2005, who used human IRF6 cDNA as a query in the GenBank database and identified a zebrafish expressed sequence tag (EST) with a high degree of sequence similarity to human IRF6181. The overall genomic organization of zebrafish irf6 is highly similar to that of human IRF6 as well, with the primary differences being the concatenation of the noncoding exons 1 and 2 into one element in the 5’ untranslated region (UTR) and the noncoding segments of exons 9 and 10 into one element in the 3’ UTR (Figure 5A). Overall, the structures of the 5’ and 3’ UTRs are conserved across species. In fact, the noncoding exon concatenations observed in the zebrafish genome are also seen in the mouse genome (Figure 5A).


Figure 5: Conservation of IRF6 gene structure and protein sequence across species.


(A) Genomic structure of IRF6 homologs from humans, mouse, and zebrafish. Boxes: exons; lines: introns; light gray sections: untranslated regions; dark gray sections: coding sequences. All three IRF6 homologs have a helix-turn-helix DNA-binding domain (yellow) at the 5’ side and a SMIR/IAD protein-binding domain (green) at the 3’ side. (B) PRALINE cross-species polypeptide sequence alignment and conservation results. The DNA-binding domain amino acid residue numbers are highlighted in yellow, and protein-binding domain amino acid residue numbers are highlighted in green. Amino acid symbols highlighted in red are highly conserved.

When the translated polypeptide sequence of zebrafish Irf6 was compared to those of mouse and human, it was found to have extensive conservation of the general protein structure, with strong conservation of the two critical functional modules, the helix-turn-helix DNA-binding domain and the SMIR/IAD protein-binding domain (Figure 5B). Using the computational tool PRALINE182 for polypeptide sequence alignment and conservation analysis, the overall protein sequence similarity between human and zebrafish IRF6 was found to be 63%, increasing to 87% in the DNA-binding domain and 75% in the protein-binding domain (Figure 5B). The high degree of genomic structure and amino acid sequence conservation between zebrafish and human IRF6 raised the possibility that the zebrafish could be used as a reliable model for human IRF6 functions to study its gene regulatory network and roles in craniofacial development and CL/P pathogenesis.
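The idea of reporting conservation overall and per domain can be sketched with a simple percent-identity calculation over a pairwise alignment. This is only an illustration under stated assumptions: it computes plain identity over ungapped columns, which is a simpler measure than the similarity and conservation scoring PRALINE reports, and the two aligned strings below are made-up toy sequences, not IRF6 sequences. The function name and the column window are likewise hypothetical.

```python
from typing import Optional


def percent_identity(aln_a: str, aln_b: str,
                     start: int = 0, end: Optional[int] = None) -> float:
    """Percent identity between two aligned sequences over columns [start, end).

    Both inputs must be rows of the same alignment (equal length, '-' for gaps).
    Columns containing a gap in either sequence are excluded from the count.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    columns = zip(aln_a[start:end], aln_b[start:end])
    compared = [(a, b) for a, b in columns if a != "-" and b != "-"]
    if not compared:
        return 0.0
    matches = sum(a == b for a, b in compared)
    return 100.0 * matches / len(compared)


if __name__ == "__main__":
    # Toy aligned sequences (hypothetical, for illustration only).
    seq_a = "MALHP-RKV"
    seq_b = "MSLHPQRRV"
    print(percent_identity(seq_a, seq_b))        # whole alignment
    print(percent_identity(seq_a, seq_b, 0, 5))  # a hypothetical "domain" window
```

Restricting the calculation to the alignment columns that span a domain is what allows a domain-level figure to exceed the whole-protein figure, as reported above for the DNA-binding and protein-binding domains.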

Maternal transcripts and early zebrafish embryonic development

From whole-mount in situ hybridization (WISH) studies performed in previous publications and our laboratory, zebrafish irf6 was determined to be initially expressed as a maternal transcript during early embryogenesis, observed ubiquitously throughout the embryonic blastoderm from the one-cell stage (0 hpf) up to the 70-80% epiboly stage (8 hpf)181. Because maternal transcripts are directly deposited into the oocyte cytoplasm by the female parent before fertilization, irf6 maternal transcripts are rapidly translated into protein after fertilization, before the maternal-zygotic transition (MZT) of zygotic transcriptional activation at 3 hpf183, and are therefore unaffected by the zygotic irf6 genotype. Furthermore, due to the early production of maternal Irf6 proteins in zebrafish, Irf6 likely serves critical developmental functions during early zebrafish embryogenesis. In order to examine these functions, a previous research project attempted to inhibit Irf6 protein function in zebrafish by microinjecting wild type embryos at the one-cell stage with a dominant-negative irf6 construct, which when translated into protein would inhibit wild type Irf6 functions184. The results revealed that nearly all zebrafish embryos without sufficient Irf6 function exhibited an embryonic lethal phenotype at approximately 90% epiboly (9 hpf) due to periderm rupture. A small fraction of embryos survived past gastrulation and showed shortened pectoral fins, blistered skin, and disorganized craniofacial cartilages at 3 dpf184, reminiscent of the shortened forelimbs, defective epithelial stratification, and CL/P phenotype seen in mutant mouse embryos homozygous for the dominant-negative Irf6R84C allele62. The periderm rupture phenotype of dominant-negative irf6 mRNA-microinjected embryos suggested failures in periderm (enveloping layer or EVL) maturation, which was verified by a strong downregulation of EVL-associated structural genes including keratins (krt4 and krt8) and adhesion molecules (cldnf and cldne)185. In fact, many of the genes that regulate periderm specification and maturation are similar to those found in the human epidermis. For irf6, its gene regulatory network, especially its downstream transcriptional targets, appears to be highly conserved between the EVL in zebrafish and the epidermis/oral epithelium in mouse and humans.

Figure 6: Graphical schematic of zebrafish epiboly and gastrulation.


(A-D) Zebrafish embryos in the lateral view during epiboly (4-9 hpf). (A’-D’) Corresponding cross-sectional schematics of the epiboly process illustrating the specification and movement of various cell layers. (A) The sphere stage (4 hpf) marks initiation of blastoderm cellular differentiation. (B) At 30% epiboly, the yolk domes in the animal pole direction. (B’) Nuclei in the yolk assemble into a sheet at the interface of yolk and hypoblast, anchored by microtubules. The outermost cells differentiate into the enveloping layer (EVL) and the deeper blastoderm layers begin to flatten. (C) At the shield stage (6 hpf), the EVL, blastoderm and hypoblast continue to thin and spread over the yolk. (C’) Gastrulation movements commence as cells undergo convergence movements toward the dorsum and extension movements along the A-P axis. Blastoderm cells involute at the epiboly margin to form multiple cell layers while the EVL maintains contact with the yolk to ensure embryo integrity under the mechanical stress of epiboly. (D-D’) Cell layers continue to migrate to cover the entire yolk. Epiblast cells form ectodermal tissues while hypoblast cells form mesendodermal tissues. Dark gray: blastoderm; light gray: hypoblast; green: enveloping layer; yellow: yolk; orange: yolk syncytial layer. Zebrafish embryo images adapted and modified from Kimmel et al., Developmental Dynamics 1995186.

Zebrafish embryonic periderm as a model of the mammalian oral epithelium

The group that utilized a dominant-negative irf6 to elucidate the roles of irf6 in periderm differentiation and maturation during zebrafish embryogenesis also performed transcriptome-wide gene expression analyses using mRNA microarrays185. By examining differential gene expression between wild type periderm and dominant-negative irf6-injected periderm, the gene Grainyhead-Like 3 (grhl3 – ENSDARG00000078552) was discovered to be one of the genes most significantly downregulated with disruptions in Irf6 function185. Grhl3 was found to be expressed in the EVL in overlapping domains with irf6 during early embryonic development, and forced over-expression of grhl3 by mRNA injections significantly delayed periderm rupture and partially rescued periderm gene expression in dominant-negative irf6 injected embryos185. In a subsequent study, the authors utilized a combination of chromatin immunoprecipitation-qPCR and minimal promoter luciferase assays to show that grhl3 is a direct downstream transcriptional target of Irf6 in zebrafish and plays key roles in periderm differentiation and maturation (Figure 7A)187. Further, the results suggested that grhl3 could play key effector roles in the irf6 downstream gene regulatory network and in the pathogenesis of VWS/PPS in humans. Here, it is important to note that VWS and PPS are clinical diagnoses based on disease presentations. Although IRF6 is the most significant gene associated with VWS/PPS, only 70% of VWS/PPS patients have IRF6 mutations upon examination by Sanger sequencing121; the remaining 30% do not have IRF6 mutations and are categorized as non-IRF6 VWS/PPS patients. There are several possible explanations that could account for this imperfect genotype-phenotype correlation. Since VWS/PPS patients are normally sequenced in a targeted fashion for the IRF6 coding regions, pathogenic variants in the introns and upstream/downstream regulatory regions could be missed. Another possible explanation is that other critical genes in the IRF6 gene regulatory network, when mutated, could also cause constellations of symptoms that closely phenocopy those seen in VWS/PPS patients with IRF6 mutations. Indeed, when the authors of the grhl3 study re-examined the GRHL3 locus in patients with non-IRF6 VWS, 8 out of 45 such patients actually had gene variants in GRHL3 (Figure 7B)187. Mouse Grhl3 mutants displayed a cleft palate phenotype (Figure 7C), and the GRHL3 gene variants identified in those non-IRF6 VWS patients, which were predicted by statistical and computational methods to be deleterious for protein function and pathogenic for disease, were subsequently functionally tested in the zebrafish model and shown to be dominant-negative in function (Figure 7D)187. Taken together, these experiments used a proven approach for identifying the transcriptional targets of IRF6, where the zebrafish periderm was used as a biological platform for identifying possible target genes that could thereafter be tested in animal models and against human genetics data from patients to discover novel gene variants of unknown significance. Finally, the biological properties of those variants could be functionally validated back in the zebrafish system, closing the loop on this line of investigation.


Figure 7: Clinical translation of zebrafish craniofacial genetics findings to human patients and the zebrafish periderm as a model for the mammalian oral epithelium.

Several previously published studies have illustrated the power of the zebrafish model in studying craniofacial genetics relevant to human disease. (A) In the IRF6 gene regulatory pathway, several downstream genes, such as OVOL1 and GRHL3, were initially identified in the zebrafish periderm where irf6 plays a critical role in epithelial differentiation and gene expression. (B) Experimental evidence for Grhl3 in craniofacial development allowed for focused resequencing of GRHL3 in human VWS/PPS patients without IRF6 mutations, and many were discovered to have potentially pathogenic mutations in GRHL3. (C) Mouse Grhl3 mutant models were generated and revealed craniofacial defects that recapitulated those observed in human patients. (D) The zebrafish model was then used to experimentally validate the biological functions of human GRHL3 variants. NS: nasal septum; PS: palatal shelf; T: tongue. Histological and human patient images adapted and modified from Peyrard-Janvid et al., AJHG 2014187.


Generation of zebrafish irf6 models using CRISPR-Cas9 genome editing

Encouraged by the conservation of IRF6 gene regulatory networks in zebrafish compared to human and positive discoveries surrounding IRF6 from the zebrafish system using the periderm as a model of the human embryonic oral epithelium, we decided to establish a zebrafish model of irf6 by generating a null mutant with CRISPR-Cas9 mediated genome editing. Clustered regularly interspaced short palindromic repeats (CRISPR) is an ancient bacterial adaptive immune defense system against phage viruses originally discovered in Streptococcus thermophilus188. Fragments of the invading viral genome could be inserted in tandem into the CRISPR locus of the bacterial genome, which would then produce short CRISPR RNAs that hybridize with Cas9 endonuclease and direct it to the viral genome through sequence homologies188. Once complexed with the viral genomes, Cas9 can cause double-stranded breaks (DSBs) in the DNA and thus viral inactivation.

In 2013, this system of prokaryotic immune defense was engineered to function in eukaryotic cells as a method of genome editing189,190. Single guide RNAs (sgRNAs), engineered versions of crRNA that provide Cas9 with sequence specificity, could be synthesized in vitro and provide targeting to an unprecedentedly large fraction of the genome. Aside from the potential sequence constraints imposed by in vitro transcription, the 20 bp sgRNA was constrained by the sequence requirement of an NGG 3’ protospacer adjacent motif (PAM), leaving a theoretical targetability of 1 in every 4² bp in the genome. With rapid advances in the development of CRISPR-Cas9 technology, various CRISPR classes and Cas9 variants have been discovered with unique characteristics to facilitate the ease-of-use and expand the functional repertoires of the system. For example, the original S. thermophilus Cas9 had a PAM sequence of NGGNG but was engineered with components of the S. pyogenes Cas9 to accept the more versatile NGG PAM191. Furthermore, evolutionary selection methods have now generated Cas9 variants with different PAM sequence requirements including NAG and NGA192-194, further expanding the already extensive genome targetable by CRISPR-Cas9 genome editing technology to theoretically every accessible nucleotide in the genome.
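The PAM constraint described above can be made concrete with a short scan for candidate target sites. This is a minimal sketch, not a guide-design tool: it only finds 20-nt protospacers immediately followed by an NGG PAM on either strand, and it ignores in vitro transcription constraints, off-target scoring, and chromatin accessibility. The function names are illustrative. Under the simplifying assumption of uniform, independent base composition, a GG dinucleotide is expected at roughly 1 in 16 positions on a given strand.

```python
import re


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string containing only A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def ngg_target_starts(seq: str, protospacer_len: int = 20) -> list:
    """0-based start positions of protospacers followed by an NGG PAM.

    Only the strand given in `seq` is scanned. A lookahead is used so that
    overlapping candidate sites are all reported.
    """
    pattern = r"(?=[ACGT]{%d}[ACGT]GG)" % protospacer_len
    return [m.start() for m in re.finditer(pattern, seq.upper())]


def count_sites_both_strands(seq: str, protospacer_len: int = 20) -> int:
    """Number of candidate NGG-adjacent protospacers on both strands."""
    seq = seq.upper()
    forward = ngg_target_starts(seq, protospacer_len)
    reverse = ngg_target_starts(reverse_complement(seq), protospacer_len)
    return len(forward) + len(reverse)


if __name__ == "__main__":
    # Toy sequence (hypothetical): 21 A's followed by GGG gives two overlapping sites.
    toy = "A" * 21 + "GGG"
    print(ngg_target_starts(toy))
    print(count_sites_both_strands(toy))
```

Swapping the PAM portion of the pattern (for example to NAG or NGA) is the only change needed to model Cas9 variants with altered PAM requirements.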


In eukaryotic cells, when sgRNAs complexed with Cas9 endonuclease bind to the desired genomic locus, a double-stranded DNA break typically occurs that can be repaired through two mechanisms: 1) nonhomologous end joining (NHEJ), and 2) homology-directed repair (HDR)195. In NHEJ, which is the principal method of eukaryotic cellular DSB repair, the exposed DNA ends recruit a variety of NHEJ-associated co-factors, such as Ku70, Ku80, XRCC4, DNA ligase IV, and DNA polymerase λ, to remove damaged/mismatched bases, join compatible 5’/3’ DNA ends, and fill in single-stranded DNA gaps196. Because the NHEJ method of DNA repair is non-templated, it is error-prone and often results in aberrant insertions or deletions of nucleotides termed indels196. When these indels occur in the coding region of genes and are not in multiples of three base pairs, they can cause reading frameshifts, premature nonsense mutations, polypeptide truncations, and disruptions in the resulting gene/protein functions. Although generating a null allele is the ultimate goal of many researchers, the NHEJ approach lacks precision because the sizes of indels resulting from CRISPR-mediated DSBs cannot be effectively controlled. Such precision can be achieved through HDR, which depends on the presence of either an endogenous or exogenous homologous sequence flanking the DSB for templated repair to achieve exact DNA sequence changes197. Exact sequence changes are especially crucial now with advances in whole genome sequencing, as more patient-specific variants of unknown functional significance are being identified. In order to evaluate the functional effects of these gene variants in animal models, the ability to rapidly generate alleles with patient-specific changes (e.g., Irf6R84C) rather than definitive total loss-of-function alleles (e.g., Irf6-/-) would be immensely powerful198. Experimentally, HDR is achieved through co-delivery of repair templates carrying the desired sequence change along with CRISPR-Cas9 components199. However, depending on the model, loci targeted, and a variety of other factors, the HDR efficiency following CRISPR-mediated DSBs has been extremely variable. For this dissertation, several approaches were used to generate zebrafish models of irf6 with various functions, including using CRISPR to generate a zebrafish irf6 null allele and an attempt to use CRISPR-mediated HDR to generate a zebrafish irf6 gene expression fluorescent reporter line.
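The rule that coding-region indels disrupt the reading frame unless their size is a multiple of three can be expressed in a few lines. This is a minimal sketch under the assumption that the indel lies entirely within a coding exon and does not touch a splice site; the function name and return labels are illustrative only.

```python
def classify_coding_indel(net_length_change: int) -> str:
    """Classify an indel in a coding region by its net length change in bp.

    Positive values are insertions, negative values are deletions.
    A change that is a nonzero multiple of 3 keeps the reading frame
    (codons are gained or lost), while any other nonzero change shifts it.
    """
    if net_length_change == 0:
        return "no length change"
    if net_length_change % 3 == 0:
        return "in-frame"
    return "frameshift"


if __name__ == "__main__":
    # Hypothetical indel sizes, for illustration only.
    for change in (-8, -6, 1, 3):
        print(change, classify_coding_indel(change))
```

Because NHEJ outcomes vary from embryo to embryo, a check of this kind is typically applied to each sequenced allele when choosing founders likely to carry a null allele.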


Spatiotemporal gene expression patterns of irf6 in zebrafish

As the first step in establishing the zebrafish as a model for irf6 functions during epithelial maturation and craniofacial development, the spatiotemporal gene expression patterns of irf6 in the zebrafish model must be characterized. Previous studies with a dominant-negative irf6 in the zebrafish model revealed its potential roles in zebrafish periderm differentiation and maturation184. However, in spite of our modest understanding of the functional roles of irf6 during pre-gastrulation zebrafish development, its roles during the later stages of embryonic development are currently unknown. A previous study performed WISH on zebrafish embryos at a variety of developmental stages to analyze the spatiotemporal gene expression pattern of irf6181. Irf6 gene transcripts were expressed ubiquitously at the 2- and 16-cell stages (Figure 8A and 8B). Because these time points preceded the maternal-zygotic transition of zygotic transcriptional activation at 3 hpf, the presence of irf6 transcripts corroborated findings that irf6 is expressed as a maternal transcript in zebrafish embryos. Subsequently, irf6 remained ubiquitously expressed from 40% epiboly (Figure 8C) to 8 hpf, when expression became concentrated in the dorsal forerunner cells (Figure 8D). At the bud stage, irf6 expression was restricted to Kupffer’s vesicle (Figure 8E), and at 14-somites to the otic vesicles and potentially weakly to the optic stalk (Figure 8F). Strong expression in the otic vesicles was observed in all subsequent embryonic stages analyzed, with additional expression domains in the cloaca at the 22-somites stage (Figure 8G) and in the olfactory placodes and gut at 26 hpf (Figure 8H). Finally, from 48 hpf, strong irf6 expression became visible in the anterior oropharynx and pharyngeal arches contiguous with expression in the gut tube (Figure 8I and 8J).

The expression patterns of Irf6 in mouse embryos have been previously described as well. However, they are mostly derived from methods like qPCR and thus provide poor spatiotemporal information45. Cross-referencing various publications, the expression patterns of irf6 in zebrafish are similar to those in mice in analogous structures, such as the skin, oral epithelium, and gut200. Whether the zebrafish irf6 expression domains correlate with other Irf6-expressing tissues and structures in mice, such as the eyes, heart, liver, lungs, placenta, testes, and tongue, is currently unknown45.


Figure 8: WISH spatiotemporal irf6 gene expression patterns in zebrafish.

(A-C) Ubiquitous irf6 expression was observed from initial fertilization to the mid-epiboly stages. (D) At 80% epiboly, irf6 expression was observed in the periderm and focally concentrated first in dorsal forerunner cells and then in Kupffer’s vesicle at 10 hpf (E). At 14-somites, irf6 was expressed in the otic vesicle (F-J), with additional expression domains in the cloaca at 20 hpf (G), the olfactory placode and gut tube at 26 hpf (H), and the pharyngeal arches from 48-72 hpf (I-J). F: forerunner cell; K: Kupffer’s vesicle; OV: otic vesicle; C: cloaca; G: gut; OF: olfactory placode; PA: pharyngeal arch. Figure adapted from Ben et al., Gene Expression Patterns 2005181.

Due to the numerous strengths of the zebrafish animal model, its strong protein sequence conservation of irf6 compared to humans, its irf6 gene expression patterns potentially relevant for craniofacial morphogenesis, and the previous successful applications of the zebrafish periderm for human orofacial cleft gene discovery, we decided to use the zebrafish model to study the irf6 gene regulatory network and downstream transcriptional targets and to better understand its function in epithelial maturation and craniofacial development. First, we attempted to use a variety of genome editing techniques to generate useful zebrafish models of irf6 functions (i.e., fluorescent reporters and null mutants). Subsequently, the zebrafish irf6 mutant generated was carefully characterized to determine the organismal phenotypes and molecular properties of the mutant allele. Finally, the irf6 mutant model was used to evaluate the degree to which human IRF6 protein function is conserved in zebrafish.

Experimental Results

Spatiotemporal expression patterns of Irf6 in zebrafish by IHC

In order to further characterize the spatiotemporal Irf6 gene expression pattern in zebrafish, fluorescence immunohistochemistry on zebrafish embryo cryosections with a polyclonal antibody specific for zebrafish Irf6 was performed for embryos at various stages. While technical difficulties prevented efficient staining of early-stage embryos from 4-24 hpf, highly specific IHC signal for Irf6 was observed in the periderm and oral epithelium at 72 hpf (Figure 9A and 9B) and 96 hpf (Figure 9C and 9D). No Irf6 protein expression was detected in other tissues. Importantly, Irf6 expression was not detected in neural crest cell-derived cartilaginous craniofacial structures like the ethmoid plate (Figure 9A’ dashed), parachordal trabeculae (Figure 9B’ dashed), Meckel’s cartilage (Figure 9C’ dashed), and palatoquadrate (Figure 9D’ dashed), suggesting that the irf6 expression detected by WISH in the oropharynx and PA domains previously reported was most likely in the epithelium.

Figure 9: Spatiotemporal zebrafish Irf6 expression by cryosection immunohistochemistry.

(A-B) Irf6 expression is observed in the periderm and oral epithelium at 72 hpf in both anterior (A) and posterior (B) craniofacial domains but is absent in neural crest-derived cartilaginous structures (A’ and B’ – dashed). Similarly, Irf6 expression is observed in the periderm and oral epithelium at 96 hpf (C-D), but not in neural crest-derived structures or other tissues (C’ and D’ – dashed).

Attempted generation of a zebrafish irf6 reporter line using CRISPR and HDR

In order to generate a tool to study the in vivo expression patterns of irf6 in zebrafish during embryonic development, we attempted to generate an irf6 GFP fluorescent reporter line for use in imaging experiments and expression studies with other transgenic fluorescent reporters including tg(sox10:GFP) for neural crest cells and tg(krt4:GFP) for epithelial cells201,202. With advancements in CRISPR-Cas9 genome editing technologies, instead of the traditional Tol2 transgenic approach for generating zebrafish transgenic reporters that required the identification and cloning of promoter and enhancer elements necessary to drive faithful spatiotemporal gene expression, we employed a knock-in approach using CRISPR-mediated HDR to insert a fluorescent protein in-frame downstream of the endogenous irf6 gene promoter and cis regulatory elements. Because endogenous gene regulatory sequences are used, this construct should theoretically drive fluorescence protein expression in the exact pattern that irf6 is normally expressed during embryonic development.

The HDR construct was designed to have approximately 1.2 kb homology arms upstream and downstream of the irf6 translational start site (Figure 10A). The GFP CDS was placed in-frame downstream of the translational start site, followed by a stop codon to terminate translation and a SV40 polyadenylation signal to terminate transcription, effectively producing a gene-trap construct that would sequester the irf6 promoter for GFP rather than irf6 expression. After the inserts, the 3’ homology arm continued the irf6 sequence immediately downstream of the translational start site. Several sgRNAs were identified at the irf6 translational start site at the start of exon 2 and injected with Cas9 protein into wild type zebrafish embryos at the one-cell stage to identify a high efficiency sgRNA that would maximize the likelihood of double-stranded DNA breaks and HDR (Figure 10B). Because the 3’ homology arm of the knock-in construct contained the sgRNA target, it was altered with silent mutations to the PAM with site-directed mutagenesis in order to prevent CRISPR-Cas9 cleavage. A high efficiency sgRNA was identified and co-injected with the knock-in plasmid (Figure 10C). Embryos were genotyped at 24 hpf with a forward primer lying immediately outside of the 5’ homology arm and a reverse primer inside of the GFP CDS, which should produce a PCR product only in the event of an insertion at the correct location of the zebrafish genome. While mosaic GFP expression was seen in approximately 10% of microinjected F0 embryos, no conditions produced a positive PCR signal (data not shown). Moreover, when the GFP-positive F0 embryos were raised to adulthood, PCR on tail-clip DNA samples was also negative for GFP insertion at the irf6 locus, and out-crossed F1 progeny never displayed GFP expression.
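
The search for candidate sgRNA target sites near a start codon can be illustrated with a short sketch. It is not the design procedure used in this study: it assumes an NGG PAM (typical of Streptococcus pyogenes Cas9, which the text does not specify), scans only the forward strand, and uses a made-up toy sequence. The function name `find_protospacers` is hypothetical.

```python
import re

def find_protospacers(sequence: str, length: int = 20):
    """Return (start, protospacer, pam) for every forward-strand site followed by NGG."""
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{length}}})([ACGT]GG))")
    hits = []
    for match in pattern.finditer(sequence.upper()):
        hits.append((match.start(), match.group(1), match.group(2)))
    return hits

# Toy sequence (hypothetical, NOT the irf6 locus): one 20-nt site followed by a TGG PAM.
toy_sequence = "A" * 20 + "TGG" + "CC"
hits = find_protospacers(toy_sequence)
for start, protospacer, pam in hits:
    print(start, protospacer, pam)
```

A silent mutation that changes either G of the PAM in the repair template would remove the corresponding hit from this scan, which is the rationale for altering the PAM in the 3’ homology arm.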

Figure 10: Generation of a zebrafish irf6 GFP reporter line with CRISPR genome editing and homology-directed repair.

(A) The CRISPR HDR template featured the eGFP CDS followed by a stop codon and the late SV40 polyadenylation signal, all flanked 5’ by the genomic sequence 1.1 kb upstream and 3’ by the genomic sequence 1.2 kb downstream of the irf6 ATG. (B) Generalized schematic of CRISPR-Cas9 genome editing. sgRNA with sequence homology to the genomic locus-of-interest directs the Cas9 endonuclease protein to the appropriate location to induce DNA double-stranded breaks. (C) sgRNAs were designed near the translational start site of irf6 in exon two and a high efficiency sgRNA was chosen based on results from fragment length microsatellite analysis.

Generation of a Tol2 transgenic zebrafish irf6 fluorescent reporter line

Because the attempt to generate a zebrafish irf6 fluorescence reporter line using CRISPR-mediated HDR knock-in was unsuccessful, we decided to pursue the traditional approach of using Tol2 transgenesis. Tol2 elements are evolutionarily ancient transposons that mediate transposition of DNA elements flanked by Tol2 sites in the presence of the Tol2 transposase enzyme203. Due to the popularity of the Tol2 system in the zebrafish community, many established shared resources are available like the Tol2kit, which utilizes a three-insert Gateway cloning reaction involving 1) a 5’ entry vector containing the promoter element, 2) a middle entry vector containing the eGFP CDS, and 3) a 3’ entry vector containing a SV40 polyadenylation sequence, to generate a reporter construct flanked by Tol2 sites nested inside a Tol2 Gateway destination vector204.

In order to identify the putative gene promoter and cis regulatory elements of zebrafish irf6, epigenetic chromatin markers were used from the zebrafish UMass histone ChIP-seq track on the UCSC Genome Browser205. Histone 3 modifications are often used as markers of gene activation and repression and of transcriptional regulatory elements. H3K4me3 (tri-methylation) is an activating marker often associated with transcriptional start sites, and in the case of irf6, there is a significant H3K4me3 ChIP-seq peak overlapping the transcriptional start site and first exon of irf6, which is noncoding (Figure 11A)206. H3K4me1 (mono-methylation) is an activating marker as well and is often associated with promoters and enhancer elements207,208. Examining the sequence surrounding irf6, one significant H3K4me1 ChIP-seq peak was identified at the TSS and another 7.0 kb upstream of the TSS, 8.6 kb upstream of the translational start site in exon 2 (Figure 11A arrowhead). As previously mentioned, the irf6 start codon is significantly downstream of the TSS. Due to the possibility that gene regulatory elements that control irf6 transcription could be present in this intervening sequence, we included the entire region from the upstream H3K4me1 peak to the ATG as the putative gene regulatory sequence required to drive transgene expression in a faithful spatiotemporal pattern in our Tol2 irf6 fluorescence reporter line.

Figure 11: Generation of a zebrafish irf6 GFP reporter line with Tol2 transgenesis.

(A) Identification of the putative zebrafish irf6 minimal promoter sequence through examination of histone H3K4 methylation signatures in 24 hpf zebrafish embryos using the UMass zebrafish ChIP seq tracks in the UCSC genome browser. One H3K4me3 peak (*), which signifies transcriptional start sites (TSS), was identified at the annotated irf6 TSS. H3K4me1 peaks (arrowheads), which signify TSS and enhancer elements, were identified at the annotated irf6 TSS and 7.0 kb upstream of the irf6 TSS (8.6 kb upstream of the translational start site). (B) The 8.6 kb fragment upstream of the irf6 ATG was cloned into a 5’ Gateway entry vector and combined with an eGFP middle entry vector, a SV40-pA 3’ entry vector, and a lens-selection Tol2 destination vector to generate the Tol2 transgenesis template. (C) Zebrafish embryos were injected at 1-cell stage with the construct and Tol2 transposase and selected for mCherry signal in the lens to detect genomic integration (D).

From our analyses of the epigenetic chromatin markers, the aforementioned 8.6 kb region upstream of the irf6 ATG was isolated by PCR from Tübingen genomic DNA and cloned upstream in-frame of the GFP CDS using Gateway reactions (Figure 11B). Zebrafish embryos were injected at the 1-cell stage with the Tol2 Gateway transgenesis vector and Tol2 transposase mRNA (Figure 11C). The resulting F0-injected embryos exhibited mosaic GFP expression at 24 hpf, primarily in the periderm (data not shown). In addition, the Gateway vector contained a crystallin lens promoter driving mCherry fluorescent protein expression that would turn the lenses mCherry-positive when successful integration of the vector into the genome had occurred (Figure 11D). Approximately 80% of injected embryos developed normally at 96 hpf with no overt phenotypic abnormalities. From the embryos that were deemed developmentally normal, approximately 15% of embryos were mCherry-positive for the lens integration marker (data not shown). Currently, these embryos are reaching adulthood and approaching sexual maturity. Subsequently, they will be out-crossed with wild type adults, and the F1 progeny will be examined throughout embryogenesis to identify transgenic founders where genomic integration of the irf6 reporter construct had undergone germline transmission.

CRISPR-Cas9 mediated irf6 mutagenesis in zebrafish

In addition to irf6 gene expression fluorescent reporters, we decided to generate a zebrafish irf6 null model in order to further investigate its function in periderm differentiation and craniofacial development. CRISPR-Cas9 was used for irf6 gene disruption using a sgRNA targeted at exon 6 at the start of the SMIR/IAD protein-binding domain (Figure 12A). CRISPR F0 embryos were raised to adulthood, out-crossed to wild type adults, and genotyped at the CRISPR sgRNA target site to identify F1 lines heterozygous for indel mutations that caused frameshifts and premature truncation of the resulting Irf6 proteins. An F1 line with an 8bp deletion in exon 6 of the irf6 coding region was identified (NC_007133.7(NM_200598.2):c.772_779del) (Figure 12A), which was predicted by in silico translation to truncate the resulting protein to 264 amino acids (29 kD) compared to the full-length zebrafish Irf6 protein that is 492 amino acids (55 kD) (Figure 12B).
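
The in silico translation step can be illustrated with a minimal sketch. It uses a made-up toy coding sequence, not the irf6 transcript, and the function names `translate` and `delete_cds_range` are hypothetical. The sketch applies a deletion given in 1-based inclusive coding coordinates (the convention of c.772_779del) and translates up to the first stop codon with the standard genetic code.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering (NCBI translation table 1).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(cds: str) -> str:
    """Translate a coding sequence up to (not including) the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

def delete_cds_range(cds: str, start: int, end: int) -> str:
    """Delete 1-based inclusive positions start..end of the coding sequence."""
    return cds[:start - 1] + cds[end:]

# Toy CDS (hypothetical, NOT irf6): ten codons followed by a stop codon.
wild_type = "ATGGCTGAAACCGGTTTGCATAAAGATCCCTAA"
# Toy 8-nt deletion starting at the first base of codon 3.
mutant = delete_cds_range(wild_type, 7, 14)

print(translate(wild_type))  # full-length toy protein
print(translate(mutant))     # frameshifted protein ending at a premature stop
```

By the same coordinate arithmetic, position c.772 is the first base of codon 258, since (772 - 1) // 3 + 1 = 258. Assuming coding positions are numbered from the A of the ATG, the reported 264-residue product would therefore consist of 257 native residues followed by 7 frameshifted residues.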

Figure 12: Generation of a zebrafish irf6 mutant model with CRISPR-Cas9 genome editing.

(A) The zebrafish irf6 gene structure is composed of eight exons, a helix-turn-helix DNA-binding domain (yellow), and a SMIR/IAD protein-binding domain (green). The CRISPR sgRNA target site was located in exon 6 at the start of the protein-binding domain. Sanger sequencing of the target site revealed a -8 bp deletion (-8bp) that created a frameshift and premature stop codon. Another +5 bp insertion was also identified. (B) In silico translations of the -8 bp and +5 bp mutant alleles confirmed the frameshift and presence of a premature stop codon, truncating the resulting mutant proteins to 29 kD compared to the 55 kD of full-length wild type Irf6. (C) DNA microsatellite results for genotyping PCR fragment size identified irf6-8bp/+ heterozygous progeny. (D) Breeding pedigree revealed progeny at the expected Mendelian ratio in F2, the irf6 mutant phenotype in F3, and the importance of maternal transcripts.

When irf6-8bp/+ heterozygous F1 adults were crossed, wild type (+/+), heterozygous (-8bp/+), and homozygous (-8bp/-8bp) progeny embryos were produced at the expected Mendelian ratio of 1:2:1 (Figure 12D). No abnormal developmental phenotypes were observed, and all F2 embryos developed normally through embryogenesis into viable and fertile F2 adults (Figure 12D). Because irf6 maternal transcripts were previously reported to play critical roles in early zebrafish embryonic development, and no obvious mutant phenotypes were observed in the F2 homozygous irf6-8bp/-8bp embryos, it was possible that maternally-deposited irf6 transcripts were sufficient for F2 progeny to undergo embryonic development, and zygotic irf6 transcription may be dispensable for zebrafish embryonic development under normal conditions. In order to assess this possibility, homozygous irf6-8bp/-8bp females were crossed with males of any genotype to produce both heterozygous (-8bp/+) and homozygous (-8bp/-8bp) embryos that developed normally until 4 hpf. However, shortly thereafter, these maternal-irf6-8bp/-8bp embryos, regardless of their own genotypes, failed to appropriately initiate epiboly compared to wild type (Figure 13A and 13A’), resulting in separation of the animal pole from the underlying yolk, periderm rupture, and embryonic lethality in 100% of embryos at 5-6 hpf (Figure 13B and 13B’). Conversely, when homozygous F2 irf6-8bp/-8bp males were crossed with wild type or heterozygous females, the resulting embryos underwent normal epiboly initiation and embryonic development, confirming that the maternal contribution of irf6, even from heterozygous irf6-8bp/+ females, was sufficient for epiboly progression and periderm maturation. In order to ensure the specificity of the CRISPR mutant phenotype observed, another zebrafish irf6 line harboring a +5bp insertion in exon 6 of the irf6 coding region was also identified and generated the same set of phenotypes as the aforementioned 8bp deletion line. In addition, a F1 out-cross to wild type was performed to obtain homozygous irf6-8bp/-8bp females in the F3 stage rather than F2. This increased the number of out-crossed generations and thereby decreased the potential for any CRISPR-Cas9 off-target mutations from carrying over to the subsequent generation, reaching homozygosity, and producing aberrant mutant phenotypes.
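
Agreement between genotype counts and the expected 1:2:1 ratio is commonly assessed with a Pearson chi-square goodness-of-fit test. The sketch below is illustrative only: the text does not state which test, if any, was applied, the counts are hypothetical, and the function name `chi_square_mendelian` is an assumption.

```python
def chi_square_mendelian(observed, ratio=(1, 2, 1)):
    """Pearson chi-square statistic for observed genotype counts against an expected ratio."""
    total = sum(observed)
    expected = [total * r / sum(ratio) for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts of +/+, -8bp/+, and -8bp/-8bp F2 embryos (not data from this study).
observed_counts = [24, 52, 24]
statistic = chi_square_mendelian(observed_counts)

# Critical value of the chi-square distribution with 2 degrees of freedom at alpha = 0.05.
CRITICAL_VALUE_DF2 = 5.991
consistent_with_mendelian = statistic < CRITICAL_VALUE_DF2
print(round(statistic, 3), consistent_with_mendelian)
```

With three genotype classes there are two degrees of freedom, so a statistic below 5.991 means the counts do not depart significantly from 1:2:1 at the 5% level.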

Molecular characterizations of the zebrafish irf6 CRISPR allele

The 8bp deletion in exon 6 of the irf6 coding sequence was predicted by in silico translation to cause a frameshift/nonsense mutation downstream of the CRISPR sgRNA target and truncate the resulting protein. It has been previously reported that some missense and nonsense mutations in exon 6 of IRF6 resulted in IRF6 proteins with dominant-negative activities45. Therefore, in order to characterize the irf6-8bp zebrafish model at the molecular level, RT-qPCR using primers specific to the irf6 5’ UTR overlapping the protein N-terminus was performed on zebrafish embryos at the sphere stage (4 hpf) from permutations of wild type and maternal/paternal homozygous irf6-8bp/-8bp crosses. Transcript levels of irf6 were undetectable in all maternal irf6-8bp/-8bp crosses at the sphere stage, regardless of whether the embryos were homozygous or heterozygous for the -8bp irf6 mutant allele (Figure 13D). Conversely, the relative levels of irf6 transcripts in embryos from wild type and paternal irf6-8bp/-8bp parents were comparable (Figure 13D). Western blot analysis using a rabbit polyclonal antibody specific for zebrafish Irf6 was performed and confirmed that Irf6 protein was undetectable in embryos from maternal irf6-8bp/-8bp crosses, but comparable between embryos from wild type and paternal irf6-8bp/-8bp crosses (Figure 13D). Care was taken in selecting qPCR and western blot reagents to ensure that all forms of irf6 transcript and protein (full-length and truncated) could be detected by our assays in zebrafish embryos. In all samples containing embryos with the -8bp allele, no truncated protein at 29 kD was detected (Figure 13D).

Interestingly, embryos from heterozygous irf6-8bp/+ females crossed with homozygous irf6-8bp/-8bp males developed normally. One might have expected them to rupture had the truncated Irf6 protein from the -8bp allele possessed dominant-negative activity since the maternal deposition of both wild type and -8bp irf6 transcripts in irf6-8bp/-8bp embryos would have led to manifestation of the lethal periderm rupture phenotype. Note that the results do not preclude the possibility that zygotic irf6 is expressed at a later developmental time point since the sphere stage (4 hpf) is only shortly after the maternal-zygotic transcription transition at 3 hpf and might not have provided enough lead time for zygotic irf6 transcripts to become detectable. Taken together, the results suggest that the 8bp deletion in the irf6 coding region likely led to irf6 transcript destabilization and degradation by nonsense-mediated decay in the oocyte and significantly decreased Irf6 protein production, thus indicating that the -8bp irf6 allele generated by CRISPR mutagenesis in our study is a null allele (from here forward the “-8bp” allele is referred to as “–”).
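
The crossing results above follow a strict maternal-effect pattern, which can be summarized in a small sketch. This is a simplified model for illustration, under the stated assumptions that a single wild type allele in the mother is sufficient and that the father's genotype and the embryo's own genotype do not influence the 4-6 hpf outcome. The names `GENOTYPES`, `epiboly_outcome`, and `offspring_genotypes` are hypothetical.

```python
from itertools import product

GENOTYPES = {
    "+/+": ("+", "+"),
    "-8bp/+": ("-8bp", "+"),
    "-8bp/-8bp": ("-8bp", "-8bp"),
}

def epiboly_outcome(mother: str, father: str) -> str:
    """Predict the early outcome from the mother's genotype alone (maternal-effect model)."""
    # The father's genotype is accepted for clarity but deliberately ignored.
    mother_has_wild_type_allele = "+" in GENOTYPES[mother]
    return "normal epiboly" if mother_has_wild_type_allele else "periderm rupture"

def offspring_genotypes(mother: str, father: str) -> set:
    """All zygotic genotypes that a cross can produce, written with '+' last."""
    combos = product(GENOTYPES[mother], GENOTYPES[father])
    return {"/".join(sorted(pair, key=lambda allele: allele == "+")) for pair in combos}

for mother, father in product(GENOTYPES, repeat=2):
    print(mother, "x", father, "->",
          sorted(offspring_genotypes(mother, father)), epiboly_outcome(mother, father))
```

In this model, heterozygous embryos from homozygous mutant mothers are predicted to rupture, whereas homozygous mutant embryos from heterozygous mothers are predicted to develop normally, matching the crosses described in the text.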

Figure 13: Molecular characterization of the zebrafish irf6 CRISPR allele reveals maternal transcript degradation.

(A-A’) Wild type embryos at the sphere stage (4 hpf) (A) and at the 30% epiboly stage (5 hpf) (A’). (B-B’) Maternal-zygotic irf6-8bp/-8bp embryos at the sphere stage (4 hpf) (B) and at the 30% epiboly stage (5 hpf) (B’) demonstrating the completely penetrant periderm rupture phenotype. Scale bar = 250 µm. (C) Cross-sectional schematic through the embryonic midline illustrating the zebrafish embryo epiboly process and periderm rupture phenotype. Arrows represent directional movement of cells and yolk. Wild type embryos experience rapid cellular lamination and yolk doming between 4-5 hpf, while maternal-zygotic irf6-8bp/-8bp embryos experience incomplete periderm differentiation and animal pole/yolk separation. (D) Top: western blot of cell lysates at the sphere stage revealed a lack of full-length (55 kD) or truncated (29 kD) Irf6 protein in all maternal irf6-8bp/-8bp embryos but not paternal irf6-8bp/-8bp embryos or wild type embryos. Bottom: relative gene expression by qPCR revealed a lack of irf6 mRNA transcripts in all maternal irf6-8bp/-8bp embryos but comparable levels between wild type and paternal irf6-8bp/-8bp embryos. Error bar = 2xSEM, n = 3.

Molecular rescue of irf6 pathway genes by zebrafish and human IRF6

It was previously demonstrated that inhibition of Irf6 protein function by dominant-negative Irf6 injections in zebrafish could cause decreased expression of several known downstream target genes such as krt4, klf2a, and grhl3, several of which have been previously shown to be important for regulating proper periderm maturation and developmental signaling pathways185,187. With the generation of our maternal-null irf6-/- zebrafish model, we sought to determine the gene expression changes that result from the depletion of irf6 maternal transcripts, and whether our genetic ablation model could accurately replicate the molecular gene expression changes observed from previous experiments with dominant-negative Irf6184. RT-qPCR was performed for a panel of genes known to be downregulated in the absence of functional irf6, such as tfap2a, grhl1, and grhl3, comparing their gene expressions between wild type and maternal-null irf6-/- embryos at 4 hpf125,184,185,187,207.

The results revealed significant downregulation of all genes tested in maternal-null irf6-/- embryos, at times more than one hundred-fold (Figure 14A). This corroborated published results, but with much more severe magnitudes of gene downregulation and suggested a more complete ablation of irf6 function. To further validate the maternal-null irf6-/- zebrafish model and illustrate functional conservation of human IRF6 in zebrafish, maternal-null irf6-/- zebrafish embryos were injected at the one-cell stage with either zebrafish or human IRF6 mRNA and subsequently examined for changes in gene expression by qPCR. The results revealed that either zebrafish or human IRF6 mRNA could rescue the expression of genes significantly downregulated in maternal-null irf6-/- embryos back to levels that were statistically indistinguishable compared to wild type (Figure 14A). Furthermore, the qPCR results revealed significant downregulation of genes important for epithelial and craniofacial development in maternal-null irf6-/- embryos compared to wild type (Figure 14A). These results suggest that irf6 maternal transcripts might play key roles in activating the molecular pathways required for zebrafish periderm maturation, epiboly initiation, and potentially later for signaling pathways related to craniofacial development.
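
Relative expression from RT-qPCR is often computed with the 2^-ΔΔCt method, and the sketch below shows how a fold change of more than one hundred corresponds to a shift of roughly seven cycles. The text does not state which quantification method or reference gene was used, so the method, the Ct values, and the function name `relative_expression` are assumptions for illustration. The calculation also assumes approximately 100% amplification efficiency for both primer pairs.

```python
from statistics import mean

def relative_expression(ct_target_test, ct_ref_test, ct_target_ctrl, ct_ref_ctrl):
    """Fold change of a target gene in test versus control samples by 2^-ddCt."""
    delta_ct_test = mean(ct_target_test) - mean(ct_ref_test)   # normalize test sample
    delta_ct_ctrl = mean(ct_target_ctrl) - mean(ct_ref_ctrl)   # normalize control sample
    delta_delta_ct = delta_ct_test - delta_ct_ctrl
    return 2 ** (-delta_delta_ct)

# Hypothetical Ct values for three replicates (not data from this study).
wild_type_target = [24.0, 24.2, 23.8]
wild_type_reference = [18.0, 18.1, 17.9]
mutant_target = [31.0, 31.2, 30.8]
mutant_reference = [18.0, 18.1, 17.9]

fold_change = relative_expression(
    mutant_target, mutant_reference, wild_type_target, wild_type_reference
)
print(f"fold change = {fold_change:.5f}, i.e. {1 / fold_change:.0f}-fold lower")
```

Because each cycle corresponds to a doubling of product, a ΔΔCt of 7 cycles in these toy values gives a fold change of 2^-7, or a 128-fold reduction.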

Figure 14: Zebrafish and human IRF6 mRNA can rescue the zebrafish irf6 maternal-zygotic null periderm rupture phenotype.

(A) qPCR relative gene expressions of wild type embryos, uninjected maternal/zygotic-null irf6-/- embryos, and maternal/zygotic-null irf6-/- embryos rescued with either wild type zebrafish or human IRF6 mRNA injections, for previously identified genes with key roles in the irf6 gene regulatory network. Error bar = 2xSEM, n = 3. (B-M) Zebrafish embryos at 96 hpf stained with alcian blue for cartilaginous craniofacial elements. Maternal/zygotic-null irf6-/- embryos were rescued by microinjection of either zebrafish irf6 mRNA (F-I) or human IRF6 mRNA (J-M) at the 1-cell stage, which prevented periderm rupture and restored craniofacial development compared to wild type control embryos (B-E). Scale bar = 150 µm. (N-P) Dimensional measurements of dissected ethmoid plates at 96 hpf, with the length (L) and width (W) denoted by dashed lines on panel (E). The length (N), width (O), and length/width ratio (P) of ethmoid plates from maternal-null irf6-/- zebrafish embryos rescued with wild type zebrafish or human IRF6 mRNA were statistically indistinguishable compared to wild type. Error bar = 2xSEM, n = 12.

Phenotypic rescue of periderm rupture by zebrafish and human IRF6

The molecular gene expression characterization of the maternal-null irf6-/- zebrafish model suggested that the periderm rupture phenotype observed is likely specific to ablation of Irf6 function during early embryogenesis rather than any potential CRISPR off-target mutations. Moreover, the CRISPR sgRNA used to generate the irf6 -8bp allele was predicted in silico to have no off-targets in coding regions of the zebrafish genome (data not shown). However, as greater fractions of the genome that were previously thought as noncoding now appear to be functional, care needed to be taken in establishing CRISPR models to ensure that no off-target mutations have been carried forward to manifest into aberrant mutant phenotypes. Thus, we sought to determine the specificity of the embryonic periderm rupture phenotype of maternal-null irf6-/- embryos by injecting wild type zebrafish irf6 mRNA into maternal-null irf6-/- embryos at the one-cell stage and assessing whether the periderm rupture phenotype can be rescued. Indeed, injections of zebrafish irf6 mRNA reliably rescued the rupture phenotype in maternal-null irf6-/- embryos, and the rescued embryos were able to initiate epiboly and undergo normal embryonic craniofacial development indistinguishable from wild type embryos (Figure 14B-E vs. 14F-I).

Because of the significant DNA and protein sequence conservation of IRF6 across species ranging from human to zebrafish, we hypothesized that it may be possible for human IRF6 to retain its functions in zebrafish. Wild type human IRF6 (NM_006147.3) was isolated, in vitro transcribed into mRNA, and injected into maternal-null irf6-/- embryos at the one-cell stage to determine whether the periderm rupture phenotype could be rescued. The results demonstrated that the ability of the human IRF6 mRNA to rescue the maternal-null irf6-/- periderm rupture phenotype was statistically indistinguishable from that of zebrafish irf6 mRNA (Figure 14J-M). Moreover, maternal-null irf6-/- embryos rescued by either zebrafish or human IRF6 mRNA injections appeared wild type during embryonic development and achieved wild type craniofacial morphologies at 96 hpf, with ethmoid plate dimensions (length, width, and ratio) statistically indistinguishable from stage-matched wild type embryos (Figure 14N-P). Taken together, the ability of zebrafish and human IRF6 mRNA to rescue not only the molecular gene expression levels but also the periderm rupture phenotype in maternal-null irf6-/- embryos suggested that the periderm rupture phenotype seen in our maternal-null irf6-/- embryos was likely specific to irf6 function ablation. Furthermore, it suggested that IRF6 transcriptional functions in humans are likely conserved in the zebrafish model.
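
The ethmoid plate summary statistics (length, width, length/width ratio, and error bars of 2xSEM) can be computed as in the sketch below. The measurements are hypothetical, the helper name `mean_and_two_sem` is an assumption, and the sketch does not reproduce whichever statistical test was used to compare groups.

```python
from math import sqrt
from statistics import mean, stdev

def mean_and_two_sem(values):
    """Return (mean, 2 x SEM), where SEM = sample standard deviation / sqrt(n)."""
    return mean(values), 2 * stdev(values) / sqrt(len(values))

# Hypothetical ethmoid plate measurements in micrometres (not data from this study).
lengths = [300.0, 310.0, 290.0, 305.0]
widths = [200.0, 205.0, 195.0, 200.0]
ratios = [length / width for length, width in zip(lengths, widths)]

for label, values in (("length", lengths), ("width", widths), ("L/W ratio", ratios)):
    centre, error = mean_and_two_sem(values)
    print(f"{label}: {centre:.3f} +/- {error:.3f}")
```

The ratio is computed per embryo before averaging, so that each ethmoid plate contributes one length/width value to the summary.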

Discussions

Zebrafish periderm as a model of mammalian oral epithelium and requirements of maternal irf6 in epiboly and craniofacial development

The zebrafish irf6-/- model revealed the importance of irf6 maternal transcript for embryonic periderm development, corroborating previous reports that characterized irf6 function in zebrafish and Xenopus laevis models184,185. However, in contrast to previous studies of irf6 performed using mRNA injections of dominant-negative irf6, which could result in delayed and incomplete knockdown of Irf6 functions, the genetic disruption of irf6 reported in this study more completely ablated Irf6 function in the early embryo with nearly undetectable expression of downstream transcriptional targets like grhl3 and krt4179,187,207,208. Maternal-null irf6-/- embryos exhibited arrest at the sphere stage and suggested that irf6 is a potential epiboly initiation factor necessary for regulating cellular signaling pathways that orchestrate this complex morphogenic event in zebrafish embryogenesis.

Interestingly, paternal-null irf6-/- embryos developed normally into viable adults and suggested that zygotic transcription of irf6 may not be necessary in zebrafish for normal embryonic development, possibly due to the persistence of maternal Irf6 protein throughout early embryogenesis. Irf6-/- mice exhibited cleft palates with oral epithelial adhesions and a number of other epithelial abnormalities, indicating that IRF6 played key roles in the regulation of epithelial proliferation and maturation62,63.

The zebrafish embryonic periderm has emerged as a model of mammalian oral epithelium because many of the gene regulatory networks, signaling pathways, and cellular behaviors during zebrafish epiboly, such as convergence-extension and epithelial-to-mesenchymal transition, are conserved in the mammalian oral epithelium during palatogenesis and craniofacial development187,207,209. The observation that maternal-null irf6-/- embryos failed to initiate epiboly cellular movements suggests that many of the downstream transcriptional networks, signaling pathways, and cellular behaviors are dependent upon proper establishment of the embryonic periderm. Moreover, it emphasizes the potential importance of the oral epithelium for initiating and orchestrating both epithelial and mesenchymal tissue behaviors during palatogenesis in the mammalian system.

By WISH and IHC, Irf6 is expressed in craniofacial tissues like the oral epithelium during time periods critical for craniofacial development in zebrafish and suggests that Irf6 could play key roles in craniofacial development in the zebrafish model. Due to the early embryonic lethal periderm rupture phenotype observed in maternal-null irf6-/- embryos, any potential craniofacial phenotypes resulting from the ablation of Irf6 function are obscured from analysis. Thus, we are currently attempting to use novel gene expression regulatory systems in zebrafish to inhibit Irf6 functions post-epiboly, and to elucidate its functional requirements in craniofacial development separate from its functions in the establishment of the periderm during epiboly.

Chapter 2:

Functional Genomics Analysis of Rare IRF6 Gene Variants Using a Zebrafish Model

Research Attributions

The research presented in the following chapter has been in part published in Li et al., 2017.

Edward Li played a role in project conceptualization, experimental execution, data curation, formal analysis, and methodology development. Dawn Truong, an undergraduate student, played a role in experimental execution, data curation and formal analysis. Shawn Hallett, a research technician, played a role in experimental execution and data curation. Kusumika Mukherjee, a post-doctoral fellow, and Christina Nguyen, a former research technician, played a role in designing the CRISPR targeting strategy and generating the zebrafish irf6 mutant. Brian Schutte, a principal investigator and collaborator, played a role in the analysis of human genetics data, methodology development and manuscript preparations.

Li EB, Truong D, Hallett SA, Mukherjee K, Schutte BC, Liao EC (2017) Rapid functional analysis of computationally complex rare human IRF6 gene variants using a novel zebrafish model. PLoS Genet 13(9): e1007009.

Introductions

Challenges in the identification and characterization of human IRF6 variants

The rapid development of next-generation sequencing technologies has ushered in a new era of personalized medicine for a myriad of diseases210. Large-scale consortia sequencing efforts have documented thousands of whole exomes/genomes from disease patients and the general population and captured a growing catalogue of genetic variations for statistical comparisons and analyses211. However, a frequent challenge in the analysis of human variants is the establishment of pathogenicity for disease, distinguishing disease-causing variants from the background of gene variants present across the human population that are rare and undetermined in function, but not actually pathogenic. Statistical methods based on the relative enrichment of certain gene variants in disease populations212-215 and computational methods based on sequence conservation and/or structural information with limited biological data are frequently inadequate and provide conflicting results216,217. Indeed, the false assignment of pathogenicity for gene variants is a key challenge in translating knowledge gained from genome sequencing to clinical diagnoses and treatments. One focused re-sequencing study recently demonstrated that as many as 27% of previously published proposed disease-causing variants were either actually benign or lacked the required evidence for pathogenicity, and therefore should be categorized as variants of unknown significance (VUS)218.

In addition, the Exome Aggregation Consortium (ExAC) recently published a study using the largest aggregation of human exomes to reveal that although each person has on average 54 variants in their genome that are annotated as pathogenic by current standards, as many as 41 of them are now observed to occur frequently in the human population and thus are unlikely to cause disease219. As the amount of exome/genome sequencing data continues to increase exponentially, it is crucial for candidate variants to undergo rigorous, multipronged evaluation before assignment of pathogenicity. In addition to statistical (case-control association, familial segregation, population frequency, etc.) and bioinformatic (evolutionary sequence conservation, protein energetics, etc.) methods, experimental approaches utilizing biological assays that directly test the protein functions of gene variants should be implemented to provide functional evidence that directly links rare gene variants to the pathogenesis of disease220.

As mentioned in the General Introductions, a well-studied example of a common congenital defect associated with a plethora of rare human gene variants is that of orofacial clefts associated with the transcription factor IRF6 (refs. 45,59,221,222). OFCs are extremely common congenital malformations. Pathogenic gene variants in IRF6 are amongst the most common genetic determinants of CL/P pathogenesis and are associated with two autosomal-dominant human Mendelian disorders, VWS and PPS, both with variable penetrance/expressivity and characterized by cleft lip and/or palate and skin abnormalities45,47. The IRF6 gene sequence is highly conserved across vertebrates and contains two functional domains, a helix-turn-helix DNA-binding domain and a SMIR/IAD protein-binding domain181. From murine studies, disruptions of Irf6 led to CL/P phenotypes in addition to oral epithelial adhesions, poor epithelial barrier functions, and improper skin stratification, revealing potentially important roles for the oral epithelia in coordinating mammalian palate development62,63.

Thus far, approximately 300 human IRF6 gene variants have been identified and catalogued222, but despite the relatively large amount of structural and biological data available for this important transcription factor, the accurate and reliable determination of gene variant protein functions and pathogenicity assignments associated with rare IRF6 gene variants remain significant challenges.

Moreover, various computational programs use algorithms that weigh aspects of nonsynonymous amino acid change differently, and thus often provide conflicting predictions on protein functions for the same missense mutation223.

Functional assessment of human IRF6 missense variants in zebrafish

The large number of human IRF6 variants and its fairly well-characterized biology make IRF6 an ideal model for examining the challenges of assigning gene variant protein functions and disease pathogenicities. In order to develop a biological functional assay to determine the protein functions of a large number of human IRF6 gene variants, we utilized the novel irf6 null zebrafish model described in the previous chapter of this dissertation. By taking advantage of the result that maternally-deposited irf6 transcripts were necessary for the proper development of the embryonic epithelium (periderm) and that all maternal-null irf6-/- embryos ruptured unless rescued by functional Irf6 protein, we developed a sensitive and specific rescue assay to biologically quantify the protein functions of human IRF6 missense variants184. This maternal-null irf6-/- rescue assay was used to test the protein function of human IRF6 missense gene variants and provide an additional avenue of biological evidence that helps to bridge the gap between rare human gene variant identification and disease pathogenicity assignments.

Experimental Results

PolyPhen-2 and SIFT predictions of IRF6 variant protein function do not accurately reflect ability to rescue zebrafish periderm rupture

The conservation of human IRF6 protein functions in zebrafish provided an opportunity to assess the protein functions of human IRF6 missense gene variants. Over 300 IRF6 variants have been identified from human CL/P patients, both syndromic and nonsyndromic, and approximately 50% are missense variants222. Two of the most frequently used computational prediction programs, PolyPhen-2 (ref. 224) and SIFT (ref. 225), were utilized to predict the effects of amino acid substitutions on the protein functions of IRF6 missense variants and segregate them into three categories (Figure 15A): 1) both programs agree the variant disrupts protein functions, resulting in a loss-of-function protein (Figure 15A magenta), 2) the programs disagree on the effects of the variant on protein functions (Figure 15A cyan), and 3) both programs agree the variant does not disrupt protein function (Figure 15A dark green). Human IRF6 missense gene variants were then mapped to their corresponding nucleotides in the zebrafish irf6 cDNA by sequence conservation, in vitro transcribed into mRNA, and microinjected into maternal-null irf6-/- zebrafish embryos to assess their abilities to rescue the mutant periderm rupture phenotype (Figure 15B).
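
The three-way grouping by computational concordance can be sketched in a few lines of Python. This is only an illustration, not the analysis pipeline used in this study: the function name, the boolean inputs, and the placeholder variant labels are assumptions introduced here, and how intermediate program outputs (for example, a PolyPhen-2 "possibly damaging" call) were binarized is not specified in the text.

```python
from enum import Enum


class Concordance(Enum):
    """Three categories of agreement between the two prediction programs."""
    AGREE_DELETERIOUS = "computational agreement deleterious"  # magenta in Figure 15A
    DISAGREE = "computational disagreement"                    # cyan in Figure 15A
    AGREE_BENIGN = "computational agreement benign"            # dark green in Figure 15A


def classify_concordance(polyphen_damaging: bool, sift_deleterious: bool) -> Concordance:
    """Group a missense variant by whether the two programs agree.

    Both inputs are assumed to be already reduced to a yes/no call
    (hypothetical preprocessing step, not described in the source).
    """
    if polyphen_damaging and sift_deleterious:
        return Concordance.AGREE_DELETERIOUS
    if not polyphen_damaging and not sift_deleterious:
        return Concordance.AGREE_BENIGN
    return Concordance.DISAGREE


# Placeholder inputs for illustration only; these are not real IRF6 predictions.
example_calls = {
    "variant_A": (True, True),
    "variant_B": (True, False),
    "variant_C": (False, False),
}
categories = {name: classify_concordance(*calls) for name, calls in example_calls.items()}

for name, category in categories.items():
    print(f"{name}: {category.value}")
```

Keeping the concordance category separate from the experimental rescue outcome makes it straightforward to cross-tabulate the two afterwards, which is the comparison the rescue experiments are designed to make.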

Figure 15: Human IRF6 missense variant functional validation using a zebrafish irf6 model.

(A) Computational characterization of IRF6 missense gene variants using PolyPhen-2 and SIFT. For each variant, PolyPhen-2 and SIFT produced computational predictions on the potential effect of the amino acid substitution on the resulting variant protein function and disease pathogenicity. Variants were classified into one of three categories based on the computational prediction of the two programs: 1) computational agreement deleterious (magenta), 2) computational disagreement (cyan), and 3) computational agreement benign (dark green). (B) Experimental approach for testing the protein functions of human IRF6 missense gene variants. Variant mRNAs were synthesized in vitro, injected into maternal-null irf6-/- embryos at the one-cell stage, and assessed for phenotypic rescue at 24 hpf.

The irf6 variant rescue results demonstrated that the computational programs did not offer a significant statistical advantage in predicting the biological functions and rescue outcomes of Irf6 variant proteins (Figure 16). The variants that received conflicting predictions from PolyPhen-2 and SIFT also generated mixed results in their abilities to rescue (Figure 16A). Moreover, variants that were predicted by both computational programs to result in loss-of-function variant proteins were frequently able to experimentally rescue the irf6 periderm rupture phenotype (Figure 16B). Lastly, missense irf6 gene variants that were predicted by both programs to be non-deleterious to protein functions were also mixed in their abilities to rescue, further demonstrating the limitations of these computational programs for predicting the function of variant proteins (Figure 16C). For example, the missense gene variant p.F252L that was predicted by both programs to be non-deleterious to protein functions was unable to rescue the maternal-zygotic null irf6-/- rupture phenotype (Figure 16C). When the experimentally tested IRF6 gene variants were grouped according to their abilities to rescue zebrafish periderm rupture and mapped to the predicted human IRF6 protein structures (DNA-binding and protein-binding domains) generated by ExPASy, the distribution of amino acids revealed that variants that could not rescue mostly resided in protein secondary structures and therefore are likely to disrupt protein conformation/function (data not shown). Conversely, amino acid residues for human IRF6 variants that retained protein function and rescued periderm rupture mostly mapped to regions without secondary structures and were therefore less likely to disrupt protein conformations critical for IRF6 function. In addition, the variants tested were re-examined for their genetic backgrounds and the number of individuals previously identified with the variant, and this information was used to classify the gene variants according to the five-category system for variant pathogenicity assignments established by current American College of Medical Genetics (ACMG) guidelines226. Although no variants in the highest pathogenicity certainty category of “Pathogenic” were able to rescue the rupture phenotype, other variants with various degrees of uncertainty in pathogenicity were mixed in their abilities to rescue. The only variant analyzed to be classified as “Benign”, p.V274I, was able to rescue the rupture phenotype (Figure 16C).

Figure 16: Human IRF6 missense gene variant protein function validation with the maternal-null irf6-/- zebrafish model.

Results for the functional rescue of periderm rupture with maternal-null irf6-/- zebrafish embryos for representative human IRF6 missense gene variants. The results were categorized as rescued if any maternal-null irf6-/- embryos injected with mRNA (100 pg/embryo) were alive and phenotypically wild type at 24 hpf (50 embryos/round, n = 3). Variants were categorized by location within the IRF6 protein and computational predictions. Further displayed are the ACMG guideline pathogenicity predictions (pathogenic, likely pathogenic, uncertain, and benign), and the number of families previously identified for each variant (all gene variant annotations were based on NM_006147.3). No missense gene variants classified as pathogenic by ACMG standards were able to rescue. (A) Periderm rupture rescue results for computational disagreement variants. (B) Periderm rupture rescue results for computational agreement deleterious variants. (C) Periderm rupture rescue results for computational agreement benign (non-deleterious) variants.

Dosage titrations can differentiate functional categories of IRF6 variants

The human IRF6 missense variants functionally tested in this current study could result in reduced function rather than complete loss-of-function variant proteins, thereby leaving open the possibility that while they could rescue the zebrafish maternal-null irf6-/- rupture phenotype in this assay, their reduced functions in vivo are sufficient to cause disease in humans. To address this possibility, the mRNA of wild type zebrafish irf6, human IRF6, and several missense gene variants were individually microinjected into maternal-null irf6-/- embryos at the one-cell stage through a range of concentrations to establish a titration-response curve for the zebrafish periderm rupture rescue phenotype. To assist in the functional analysis of these IRF6 missense gene variants with potentially more nuanced functional changes, we used the significantly increased statistical power of rare human gene variant detection provided by the Exome Aggregation Consortium database of over 60,000 individuals219,222. In addition, we also examined the gnomAD database, a more recent effort from the same group which now includes over 125,000 exomes and 15,000 whole genomes (Figure 17). The exomes/genomes included in these large databases could be used to gauge the prevalence of hypothesized “pathogenic” variants identified from human OFC patients in the general healthy population, with true pathogenic variants having extremely low relative abundances.

Figure 17: Whole exome and genome sequencing projects have catalogued an increasingly larger cross-section of rare human gene variants.

(A) Pictorial representation of the geographic and ethnic distribution of both common and rare human gene variants across the globe. (B) Improvements in sequencing technology and drops in cost have led to progressively larger databases of human whole exome and genome sequences. Larger exome/genome collections can power statistical approaches for rare variant discovery.

From the ExAC/gnomAD databases, the IRF6 missense gene variants p.R45Q, p.R45W, p.G70R, p.V274I, p.D354N, and p.F369S were identified. Through various lines of evidence, the variant p.V274I was already considered benign by ACMG standards. Indeed, 9,280 alleles of this gene variant (allele frequency 0.077) were found in ExAC, distributed across populations including Europeans, Africans, and Asians (Figure 18A). Variants p.D354N and p.F369S were found to be non-conserved in zebrafish irf6, which raises the possibility that the identities of these residues are not essential for IRF6 protein function. In addition, although p.D354N was previously identified in four VWS/PPS individuals, 37 alleles were also identified in the ExAC/gnomAD database (Figure 18A). Three other missense gene variants, p.R45Q, p.R45W, and p.G70R, were identified as single alleles in the ExAC/gnomAD database and were able to rescue the periderm rupture phenotype of maternal-null irf6-/- embryos. The small number of these alleles does not provide as strong support for non-pathogenicity as in the case of p.V274I. However, their presence in ExAC/gnomAD does raise the possibility that these gene variants could retain protein function and potentially be non-pathogenic, and thus should be tested for biological functions (Figure 18A). The aforementioned variants, in addition to p.V274I and several others that were mixed in their abilities to rescue, were used in an mRNA microinjection dosage titration assay. Interestingly, the examined IRF6 missense gene variants naturally segregated into three functional categories upon dosage titration (Figure 18B). The gene variants identified in ExAC/gnomAD rescued to the same degree as wild type zebrafish and human IRF6 mRNA (Figure 18B green). Variants that were able to rescue in the periderm rupture assay but not found in ExAC/gnomAD were discovered to be reduced in protein function compared to wild type, rescuing a smaller percentage of embryos at each of the mRNA dosages tested (Figure 18B blue). These results are in contrast to the IRF6 missense gene variants that could not rescue the rupture phenotype at any dosage tested and were not found in ExAC/gnomAD or other exome/genome databases (Figure 18B red). These variants likely have mutations that resulted in complete loss-of-function variant proteins and are likely pathogenic in human orofacial cleft patients.
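
The allele frequency quoted for p.V274I is simple arithmetic: the allele count divided by the number of chromosomes genotyped at that site. The short Python sketch below back-calculates the allele number implied by the two figures given in the text. The function names are hypothetical, and the resulting allele number is an inference from the rounded frequency rather than a value read from the database.

```python
def allele_frequency(allele_count: int, allele_number: int) -> float:
    """Allele frequency = observed variant alleles / chromosomes genotyped."""
    if allele_number <= 0:
        raise ValueError("allele_number must be positive")
    return allele_count / allele_number


def implied_allele_number(allele_count: int, frequency: float) -> float:
    """Back-calculate the number of chromosomes from a count and a frequency."""
    if frequency <= 0:
        raise ValueError("frequency must be positive")
    return allele_count / frequency


# Figures quoted in the text for p.V274I in ExAC.
v274i_count = 9280
v274i_frequency = 0.077

# Roughly 120,500 chromosomes, i.e. about 60,000 diploid individuals,
# consistent with the database size stated in the text (over 60,000 individuals).
approx_allele_number = implied_allele_number(v274i_count, v274i_frequency)
print(f"implied allele number: {approx_allele_number:.0f}")

# For comparison, a variant seen as a single allele in a cohort of this size
# (assumption: the same number of chromosomes was genotyped at that site).
singleton_frequency = allele_frequency(1, round(approx_allele_number))
print(f"singleton allele frequency: {singleton_frequency:.2e}")
```

The contrast between these two frequencies shows why a common variant such as p.V274I gives much stronger population-level support for non-pathogenicity than a variant observed only once.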

Figure 18: Variant mRNA dosage titrations using the zebrafish periderm rupture model can reveal nuanced functional differences between IRF6 missense variant proteins.

(A) Identification and characterization of IRF6 missense gene variants in the ExAC and gnomAD databases with information such as allele frequencies and ethnic origins. The p.V274I variant was identified in all populations in the database. (B) Maternal-null irf6-/- periderm rupture dosage titration rescue results for subsets of IRF6 missense gene variants, correlating the amount of variant mRNA injected to the percent of maternal-null irf6-/- embryos rescued from rupture and undergoing normal embryonic development at 24 hpf. Missense variants were classified into three categories based on their protein function. All variants identified from ExAC/gnomAD rescued the rupture phenotype and achieved wild type level protein function. All the other variants had either moderate or severe (complete) reductions in protein function. Error bars = 2×SEM, 50 embryos/n, n = 3.
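
The quantities plotted in Figure 18B (percent of embryos rescued per dose, with 2×SEM error bars over three rounds of 50 embryos) can be computed as in the sketch below. It is a minimal illustration under stated assumptions: the per-round counts are invented placeholders rather than data from this study, the function names are hypothetical, and the SEM is assumed to be the sample standard deviation divided by the square root of the number of rounds. The binary rescue call follows the criterion given for Figure 16 (any embryo alive and phenotypically wild type at 24 hpf).

```python
from math import sqrt
from statistics import mean, stdev

EMBRYOS_PER_ROUND = 50  # as stated in the figure legends


def percent_rescued(rescued_counts, embryos_per_round=EMBRYOS_PER_ROUND):
    """Convert per-round counts of rescued embryos into percentages."""
    return [100.0 * count / embryos_per_round for count in rescued_counts]


def summarize_dose(rescued_counts):
    """Summarize one mRNA dose: mean percent rescued, 2xSEM, and rescue call."""
    percents = percent_rescued(rescued_counts)
    # Assumption: SEM = sample standard deviation / sqrt(number of rounds).
    sem = stdev(percents) / sqrt(len(percents))
    return {
        "mean_percent": mean(percents),
        "error_bar": 2 * sem,
        # Rescued if any embryo in any round is alive and wild type at 24 hpf.
        "rescued": any(count > 0 for count in rescued_counts),
    }


# Invented placeholder counts for three rounds at a single dose.
placeholder_counts = [40, 45, 35]
summary = summarize_dose(placeholder_counts)
print(summary)
```

Repeating this summary across the range of injected mRNA amounts gives one titration curve per variant, which can then be compared against the wild type zebrafish irf6 and human IRF6 curves.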

Human IRF6 variants are capable of restoring zebrafish development

Although the human IRF6 missense gene variants that were not identified in ExAC/gnomAD had reduced protein activities, they otherwise retained their biological functions and were not only able to rescue the zebrafish periderm rupture phenotype at high mRNA injection dosages, but also permitted normal embryonic and craniofacial development (Figure 19). The IRF6 variant mRNA-rescued maternal-null irf6-/- embryos that otherwise would have ruptured at 4 hpf were grown under standard conditions and resulted in not only phenotypically wild type craniofacial development, but also in viable and fertile adults (data not shown).

Figure 19: Human IRF6 missense gene variants can rescue zebrafish periderm rupture and restore craniofacial development.

(A-T) Craniofacial morphologies of maternal-null irf6-/- embryos rescued by human IRF6 missense gene variant mRNA microinjections (100 pg/embryo) at 96 hpf stained with alcian blue. (A-D) Wild type control. (E-H) p.P12L. (I-L) p.P76L. (M-P) p.T100A. (Q-T) p.P222L. Scale bar = 150 µm.

Discussions

Challenges in statistical and computational analyses of rare gene variants

While it is tremendously useful to document human genetic variations in genes associated with disease from wide-ranging populations, as in the case of IRF6 for orofacial clefts, an important challenge in human genetics in the age of next-generation sequencing remains how to functionally ascertain whether coding sequence variations result in harmful alterations in protein function, and whether these functional changes are causal for pathogenesis214,215. This challenge is especially prominent for missense gene variants, where whether a single amino acid substitution can result in deleterious change in protein function is often unclear. In contrast, variants that result in nonsense, frameshift, and splice-site mutations, and larger chromosomal abnormalities like microdeletions and balanced translocations, often significantly change the polypeptide sequences and structures of the resulting variant proteins, and therefore offer more direct interpretations on their functional consequences. In order to facilitate this decision-making process, statistical methods can be used to provide support for the pathogenicity of missense gene variants by determining the segregation of pathogenic variants with disease status within affected families, or by examining the frequencies of variants in the healthy general population or patient cases versus control217. However, such statistical methods do not directly interrogate biological protein functions and are prone to biases, especially for rare gene variants. Distinct but unobserved pathogenic variants may be located on the same haplotype as the candidate rare variant, and therefore segregation analysis alone cannot unambiguously assign pathogenicity, especially in smaller pedigrees210. In addition, the incidence of CL/P is higher in regions of the world where whole exome and genome sequencing has not yet captured a large cross section of the normal population for use as controls. 
Due to this geographical clustering, many cases of newly discovered IRF6 gene variants cannot be statistically compared to public exome/genome databases because the population mismatch would overemphasize the relative rarity of certain gene variants and thereby their potential pathogenicity.

Computational programs, such as PolyPhen-2 and SIFT, that predict the effect of missense mutations on gene variant protein functions utilize complex algorithms that take into consideration a multitude of parameters to predict the thermodynamic stability and functions of variant proteins after amino acid substitution. However, there are numerous remaining unaccounted factors that go into translating amino acid changes into protein functional changes, and thus these computational programs often provide results that conflict with biological evidence. Because many computational prediction programs depend on machine-learning algorithms227, the results of direct biological assessments of gene variant protein functions could be fed back into the same algorithms to improve their predictive powers for both IRF6 and other proteins with similar sequences and motifs. According to recent ACMG guidelines, the assessment of gene variant pathogenicity should be multipronged with various lines of supporting evidence from independent approaches, including computational, statistical, and experimental226. Although pathogenicity assignments typically cannot be made from any line of evidence alone, the usage of experimental models to directly interrogate gene variant protein functions can provide valuable insights into the biological effects of missense gene variant amino acid substitutions on protein function and greatly assist in the interpretation of variant protein functions and pathogenicity.

Phenotypic rescue of maternal-null irf6-/- periderm rupture by IRF6 variants

The experimental findings that human IRF6 mRNA could not only rescue periderm rupture in zebrafish maternal-null irf6-/- embryos but also restore normal embryonic craniofacial development and IRF6 network gene expression suggest that there is significant cross-species conservation in IRF6 protein structure/function. Taken together, our findings demonstrated that the zebrafish maternal-null irf6-/- embryo model could serve as a sensitive and specific platform for the rapid assessment of human IRF6 missense variant protein function in a relevant in vivo context. This biological assay can complement traditional statistical and computational analyses to form a more comprehensive picture in the process of assigning pathogenicity to IRF6 missense variants. This complementary approach is especially important for rare IRF6 missense gene variants identified in a small number of individuals, often from a single pedigree. In the case of p.R45W, this gene variant was detected in a single VWS-affected proband, but also in his unaffected sibling and mother. The imperfect co-segregation with disease left in question whether this missense variant was pathogenic or simply a rare benign variant that was annotated as pathogenic despite the imperfect co-segregation that was previously attributed to incomplete penetrance and variable expressivity228. Although this interpretation is possible due to the variable expressivity of phenotypes exhibited by VWS patients in the same pedigree, other interpretations are possible, such as a case where an unobserved pathogenic variant hidden in a separate gene/locus adjacent to IRF6 is on the same haplotype co-segregating with the p.R45W variant. This interpretation is further supported by the biological analysis of the p.R45W variant protein function in our zebrafish model, which revealed that the p.R45W variant protein was able to rescue maternal-null irf6-/- periderm rupture quantitatively to the same degree as wild type IRF6. While this experimental validation of p.R45W variant protein function does not conclusively exclude its potential pathogenicity in humans, it does suggest that further biological evidence is needed before p.R45W can be reliably annotated as pathogenic in public databases and used in the clinical diagnosis of VWS patients.

The functional genomics validation of IRF6 missense variant protein function presented in this study is complemented by rapid increases in statistical power for rare gene variant identification and pathogenicity assignment through expansions in large public exome/genome databases. More individuals are being sequenced daily with advances in sequencing technology and concomitant decreases in sequencing cost. Several IRF6 missense gene variants previously unobserved in the general population, and therefore thought to be pathogenic, were identified in the ExAC/gnomAD databases219, potentially weakening the reliability of their previous pathogenicity assignments by ACMG standards. Further, their discovery in the ExAC/gnomAD databases streamlined the variant protein functional validation process by identifying gene variants with the highest probabilities of ambiguous pathogenicity assignments for testing in our model. The zebrafish maternal-null irf6-/- periderm rupture rescue assay allowed for the detection of subtle changes in IRF6 variant protein functions through mRNA microinjection dosage titration. The stability of variant mRNA and proteins could also be readily assessed through molecular biology techniques after mRNA microinjections.

The IRF6 missense variants p.R45Q, p.R45W, p.G70R, and p.V274I were not only able to rescue the periderm rupture phenotype of maternal-null irf6-/- zebrafish embryos but were also functionally indistinguishable from wild type zebrafish and human IRF6 in dosage titration, suggesting that the proteins produced from these missense variants retained full function. However, it is important to recognize that the pathogenicity of these variants cannot be conclusively determined with this IRF6 functional model due to the possibility that only a subset of IRF6 functions are conserved between humans and zebrafish. Interestingly, the variants tested that were able to rescue in our periderm rupture assay, and yet not found in the ExAC/gnomAD database, were reduced in protein function when compared to wild type zebrafish or human IRF6. These results suggest that these missense gene variants are not completely loss-of-function but rather reduced in function, and therefore still potentially pathogenic in humans because their reduced functions might be insufficient to prevent onset of disease phenotypes. Lastly, variants that were unable to rescue the maternal-null irf6-/- periderm rupture at any of the dosages tested likely represent complete loss-of-function missense variants. These variants were not found in the ExAC/gnomAD databases, and their complete loss-of-function provides an additional line of biological evidence in support of their pathogenic statuses in human VWS/PPS patients. Because of the relatively small number of missense gene variants functionally tested by dosage titrations in this study, no variants were discovered to be functionally wild type and not identified in the ExAC/gnomAD databases. Although the collection of sequenced exomes and genomes is continuously increasing, ever larger control population databases do not necessarily guarantee the discovery of rare benign variants previously thought to be pathogenic in human disease.
In these situations, experimental validation of rare gene variant protein function could bridge a gap in knowledge and provide additional biological insights to assist in gene variant pathogenicity assignments and improve their accuracies. While it is possible that only a subset of IRF6 protein functions are conserved in the zebrafish model and evaluated through this functional rescue assay, this model serves to provide novel insights into the effects of human IRF6 missense variants on IRF6 protein function and to complement current statistical and bioinformatics results.

Overall, the zebrafish maternal-null irf6-/- model not only offered a method to rapidly assess the protein function of current and yet undiscovered rare human IRF6 missense gene variants, but also illustrated a generalizable functional genomics paradigm where novel human variants can be biologically tested for protein function using the corresponding zebrafish mutants to provide another line of biological evidence to assist with assignments of pathogenicity. With advances in CRISPR-Cas9 targeted mutagenesis in zebrafish, it will be increasingly efficient to develop zebrafish gene disruption models, patient-specific mutation models, and mutant phenotype rescue assays to test the functions of human gene variants for genes in other contexts of disease.

Chapter 3:

Analysis of the Potential Post-Epiboly Roles of irf6 in Craniofacial Development Using Optogenetics

Research Attributions

The research presented in the following chapter is unpublished. Edward Li played a role in project conceptualization, experimental execution, data curation, formal analysis, and methodology development. Dawn Truong, an undergraduate student, played a role in experiment execution and data curation. Shawn Hallett, a research technician, played a role in experimental execution, data curation, and formal analysis, especially for the experiments involving zebrafish maintenance and husbandry. Laura Motta-Mena and Kevin Gardner, outside collaborators for EL222 optogenetics, provided reagents and technical expertise for the optogenetic gene expression system validation and optimization experiments.

Introductions

Potential roles of irf6 during zebrafish post-epiboly embryonic development

The CRISPR-mediated irf6 mutagenesis zebrafish model resulted in maternal-zygotic null irf6-/- embryos with a dramatic periderm (EVL) rupture phenotype that was completely penetrant and embryonic lethal at the initiation of epiboly at 5 hpf (see Chapter 1 Figure 13). This early embryonic lethal phenotype, however, has precluded studies of potential irf6 functions during later stages of embryonic development after epiboly, and thus whether irf6 plays a role in craniofacial development and CL/P pathogenesis in zebrafish is currently unknown. In corroboration of our zebrafish irf6 null model findings, wild type zebrafish embryos microinjected with dominant-negative irf6 at the one-cell stage also exhibited embryonic lethality due to the same periderm rupture phenotype, but at a slightly later developmental time point. However, possibly due to variability in the microinjection-based mRNA delivery, although the majority of embryos injected with dominant-negative irf6 mRNA ruptured, a small percentage remained alive throughout epiboly and were able to develop to 96 hpf, by which time the craniofacial development processes of interest to us have already occurred184. These embryos presented with generalized reductions of craniofacial cartilaginous elements. Specifically, the ethmoid plate was missing the frontonasal process and associated neural crest-derived cartilage cells, mimicking the cleft lip and palate phenotype in human VWS/PPS patients184. This phenotype suggested a potential role for irf6 in zebrafish craniofacial development; however, these interesting craniofacial phenotypes were not further characterized. In addition, a previous member of the lab generated a transgenic zebrafish line (sox10:irf6R84C) driving the expression of dominant-negative irf6R84C in neural crest cells, which resulted in defective fusion between the frontonasal and bilateral maxillary processes and a cleft in the ethmoid plate resembling bilateral CP in humans209. Finally, from our WISH and IHC studies, irf6 was found to be expressed post-epiboly in zebrafish in tissues relevant to craniofacial morphogenesis. Taken together, our preliminary results suggested that irf6 could play key roles in craniofacial development in zebrafish beyond its early roles in the periderm.

Figure 20: Investigation of Irf6 functions in zebrafish embryonic craniofacial development after the embryonic lethal critical period.

The zebrafish craniofacial development process can be approximately divided into several critical periods annotated above the 96 hpf developmental timeline. Maternal-null irf6-/- embryos ruptured at 4-5 hpf prior to the onset of epiboly/gastrulation. Wild type embryos microinjected at the one-cell stage with dominant-negative irf6 mRNAs (irf6-ENR) also ruptured, but slightly temporally delayed at 8-9 hpf near the end of epiboly. Because of the completely penetrant and early embryonic lethal phenotype resulting from the genetic ablation of Irf6 function, the potential roles of Irf6 in zebrafish craniofacial development were precluded from experimental investigation.

Evaluation of spatiotemporal gene expression control methods in zebrafish

Due to the presence of irf6 maternal transcripts deposited in the oocyte and the stability of Irf6 protein, which remained detectable by western blotting at 96 hpf in maternal-irf6+/- zygotic-irf6-/- embryos (cannot transcribe new irf6 from the zygotic genome but inherited maternally-deposited irf6 transcripts; data not shown), the study of Irf6 protein function in zebrafish after 4 hpf will require the inhibition of active wild type Irf6 proteins, but not before 4 hpf due to the lethal periderm rupture phenotype. This could be achieved by the expression of dominant-negative forms of Irf6. However, due to the temporal requirement of inhibition after 4 hpf, microinjections of dominant-negative irf6 mRNA into one-cell stage embryos will not be feasible since the mRNA will shortly thereafter be translated into dominant-negative protein and cause epiboly arrest and periderm rupture. Several methods are available to spatiotemporally control gene expression in the zebrafish model. These methods include CreER-lox229,230, Tet-on/off231,232, heat-shock promoter233,234, and Gal4-UAS235-237.

Figure 21: Comparison of the mechanisms-of-action for commonly used gene expression control systems in zebrafish.

Numerous gene expression systems have been developed for or adapted into the zebrafish model system to facilitate spatiotemporal gene expression regulation. (A-E) Off states. (A’-E’) On states.

(A-A’) The CreER-LoxP system contains CreER driven by a tissue-specific promoter that provides spatial regulation. CreER is not able to act on LoxP without the estrogen analog tamoxifen (Tm), which provides temporal regulation (A). When bound to Tm, CreER can excise the stop codon and enable translation of the gene-of-interest (GOI) (A’). (B-B’) The Tet-ON system contains a reverse tetracycline transactivator (rtTA) driven by a tissue-specific promoter that provides spatial regulation. rtTA is unable to bind to tetracycline-responsive elements (TRE) and activate transcription without tetracycline (Tet), which provides temporal regulation (B). When bound to Tet, rtTA can transcribe the GOI downstream of TRE (B’). (C-C’) The heat shock system takes advantage of the endogenous transcriptional activities of heat shock proteins (HSPs), which are inactive at zebrafish physiologic temperatures (C). Raising the temperature above physiologic levels can activate the heat shock response and enable transcription of the GOI from exogenous heat shock-responsive elements (HRE) (C’). (D-D’) Gal4-UAS contains Gal4 driven by a tissue-specific promoter that provides spatial and temporal regulation (D). Gal4 transcriptional activator can bind to upstream activating sequences (UAS) and drive transcription of the GOI (D’). (E-E’) The EL222 optogenetics system contains microinjected EL222 transcriptional activator that is inactive in the dark (E). Under 465nm blue light, EL222 shifts into the active conformation, binds to C120 elements, and enables transcription of the downstream GOI (E’). Light can be provided in a generalized or location-specific manner to provide both spatial and temporal regulation.

The CreER-loxP method for spatiotemporal control of gene expression in zebrafish involves nuclear translocation of Cre recombinase by estrogen analogs, commonly tamoxifen (Tm), to loop out loxP sites flanking a stop codon that previously prevented the expression of the gene-of-interest (GOI) under the control of a ubiquitous or tissue-specific promoter (Figure 21A and 21A’)238. While this method has a long history and successful record in a wide range of model organisms, such as mice and zebrafish, the use of Tm in vivo can result in nonspecific developmental abnormalities in zebrafish because of its narrow toxicity window239. Furthermore, due to the genetic setup of the CreER-loxP system, each new stream of experimentation could require the generation of novel CreER driver and loxP response lines, which necessitates laborious Tol2 transgenesis processes that could take months of line propagation and genotyping.

The Tet-on/off system is an alternative spatiotemporal gene expression control system that uses a tetracycline binding protein as a transcriptional transactivator that switches either on or off in the presence of tetracycline-derivatives, depending on the exact transactivator protein used in the experiment (Figure 21B and 21B’)240. The disadvantages of this method are similar to the CreER-loxP system, where tetracycline-derivatives could also be toxic to zebrafish embryos, and transgenic lines need to be generated before experiments could be performed241. Furthermore, the Tet system has not been as widely adopted in the zebrafish community compared to the CreER-lox system, and therefore has fewer shared resources and requires more challenging experimental validations. Lastly, for both the CreER and Tet systems, which require chemical inducers for gene expression activation, the chemicals could not only be toxic but also limited by diffusion rate (slow activation and deactivation) and tissue penetration (uneven expression). Given the rapid pace of zebrafish embryonic development, a method of gene expression regulation with rapid activation-deactivation kinetics and spatiotemporally restricted gene activation is preferred.

Finally, with the heat-shock promoter system, the GOI is cloned under an HSP70 heat-shock responsive promoter (Figure 21C and 21C’). When gene expression induction is desired, zebrafish embryos are placed under heat stress conditions to induce heat shock protein (HSP) expression. Subsequently, the HSPs activate the expression of the GOI under the control of the heat-shock promoter233. Although no exogenous chemicals are required for gene expression induction in this system, it does require another noxious stimulus, heat shock, which is not well tolerated by zebrafish embryos under various experimental conditions and can cause confounding abnormalities in developmental studies242. Furthermore, similar to the aforementioned systems, the heat-shock promoter driving gene expression needs to be inserted into the genome, and thus a Tol2 transgenic line would also need to be generated before experimentation. In order to rapidly assess the effects of temporally-regulated dominant-negative Irf6 expression and Irf6 protein function inhibition in zebrafish during embryonic development after its initial requirements in periderm maturation, an alternative spatiotemporal gene expression control method in zebrafish embryos without the drawbacks of the aforementioned systems is desired.

EL222 optogenetic regulation of spatiotemporal gene expression

In recent years, optogenetics has made tremendous strides in expanding its repertoire of use and made significant contributions to the field of neurobiology243. Generally, most optogenetic systems use bacterial-derived proteins that can undergo conformational and functional changes upon stimulation with visible light of a specific wavelength244. Applications of optogenetics started primarily in neurobiology with light-sensitive ion channels that allowed researchers to control the influx/efflux of sodium, potassium, and calcium ions across neuronal membranes and attenuate action potential generation with light stimulations247,248. As new light-sensitive proteins with novel functions were progressively discovered, some were genetically engineered into fusion proteins with transcriptional activation or repression properties245-247. From studies of one such protein, a 222-amino acid polypeptide named EL222 was initially isolated from the bacterium Erythrobacter litoralis (EL)248. In its endogenous state, EL222 contains an N-terminal light-sensitive Light-Oxygen-Voltage (LOV) domain and a C-terminal helix-turn-helix DNA-binding domain. The LOV domain is sensitive to blue light (465nm) and can undergo a conformational change under blue light stimulation. In the absence of light, the LOV domain is folded onto the DNA-binding domain and masks the α-helices required for dimerization and DNA-binding249. However, with 465nm blue light illumination, a photochemical protein conformational change is triggered that disrupts the tonic inhibition of the LOV domain on the DNA-binding domain. Subsequently, the DNA-binding domains can dimerize, bind to DNA, and commence transcriptional activation249. The conformational changes required for EL222 DNA binding occur within seconds of blue light exposure and are spontaneously reversed in the dark with a decay constant of only 11 seconds at 37°C, allowing for the rapid activation and inactivation of EL222 transcriptional activity248. The same group then engineered the endogenous bacterial EL222 protein to enable transcription regulation in eukaryotic systems with the addition of a nuclear localization sequence (NLS) and the VP16 transcriptional activation domain to the N-terminus248. Together, these modifications allowed the genetically modified EL222 (VP16-EL222; herein abbreviated as EL222) to recruit the necessary transcriptional cofactors in eukaryotic nuclei, and to activate transcription downstream of its binding sites. The DNA binding specificity of EL222 is conferred by its DNA-binding domain, which recognizes a unique bacterial sequence called C120, a 20 bp DNA sequence found in E. litoralis that is extremely uncommon in eukaryotic genomes, including that of zebrafish250. Therefore, in the event of EL222 activation by blue light, the likelihood that dimerized EL222 will bind to endogenous eukaryotic genomic sequences and lead to transcriptional transactivation is low. The mechanism of the EL222-C120 system is akin to the Gal4-UAS system, a gene transcription regulatory system originally discovered in the budding yeast Saccharomyces cerevisiae where Gal4 could bind specific DNA sequences found in the yeast genome called UAS and activate gene transcription downstream (Figure 21D and 21D’). The Gal4-UAS system can be engineered such that a GOI is inserted downstream of UAS and crossed into tissue-specific Gal4 expression zebrafish lines to enable tissue-specific expression of the GOI235-237. However, in the Gal4-UAS system, Gal4 protein activity cannot be user controlled and will activate gene expression from the UAS after the Gal4 protein is produced. In contrast, with the EL222-C120 optogenetics system, the co-existence of EL222 and C120-GOI in the same cell does not automatically cause gene expression activation until blue light stimulation. This added layer of regulation allows the temporal specificity and gene expression levels to be further titrated by light exposure and power according to the needs of the researcher (Figure 21E and 21E’).
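The rapid dark-state reversal described above can be illustrated with a simple first-order decay model. The following is a minimal sketch, assuming that the reported 11-second "decay constant" is an exponential time constant (rather than a half-life) and that reversal follows single-exponential kinetics; the function names are hypothetical and are not part of any published EL222 toolkit.

```python
import math

# Assumption: the 11 s "decay constant" at 37°C is treated as an exponential
# time constant (tau). If it were a half-life, tau would be 11 / ln(2).
TAU_DARK_S = 11.0


def active_fraction(seconds_in_dark: float, tau: float = TAU_DARK_S) -> float:
    """Fraction of light-activated EL222 still active after time in the dark."""
    if seconds_in_dark < 0:
        raise ValueError("time in the dark must be non-negative")
    return math.exp(-seconds_in_dark / tau)


def time_to_fraction(fraction: float, tau: float = TAU_DARK_S) -> float:
    """Seconds in the dark needed for the active pool to fall to `fraction`."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return -tau * math.log(fraction)
```

Under these assumptions, roughly 37% of the activated pool remains after 11 seconds in the dark, and the pool falls to 1% in under a minute, which is consistent with the rapid inactivation kinetics described in the text.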

Taken together, the EL222-C120 system overcame many of the challenges that traditional gene expression systems faced in the zebrafish model. Compared to expression induction using toxic chemicals and heat shocks, blue light has been shown to cause no significant developmental abnormalities even under constant illumination at high powers248. Furthermore, because zebrafish embryos are fairly small and transparent throughout embryonic development, blue light can evenly penetrate the entire depth of the embryo and allow for uniform induction of gene expression. Since blue light at 465nm is the wavelength commonly used to excite the green fluorescent protein, light sources that produce 465nm light can be found on a variety of fluorescence imaging modalities such as mercury lamps for epifluorescence and lasers for multiphoton microscopy. These imaging modalities could then be used to deliver blue light in focused locations or in user-specified patterns with location-of-interest scan settings to activate EL222-mediated gene expression in a restricted spatiotemporal pattern in zebrafish embryos251. Moreover, because of the ability to control EL222 protein activation with light, EL222 mRNA and C120-GOI response plasmids could be co-injected into the zebrafish embryo at the one-cell stage and activated at a later timepoint without concerns of transcriptional activation after EL222 mRNA translation into protein, thereby negating the time-consuming and labor-intensive requirements of generating transgenic zebrafish lines.

With the flexibility of microinjections at the one-cell stage and transcriptional activation at researcher-specified times, EL222 optogenetics can be used to bypass the pre-epiboly requirement of irf6 in periderm maturation that previously precluded studies of its functions after the embryonic lethality period in the zebrafish model. Here, we utilized the EL222-C120 optogenetics system to express dominant-negative irf6 and inhibit wild type Irf6 functions in zebrafish after the embryonic lethality period to uncover its potential functions in zebrafish craniofacial development.

Experimental Results

Optogenetic construct design and light activation of gene expression

Due to the advantages of the EL222-C120 optogenetics system, EL222-C120 was used to activate the expression of dominant-negative irf6 at a variety of time points after 4 hpf (embryonic lethality period) to reveal potential roles of irf6 in zebrafish craniofacial development. For the initial system validation experiment, the mCherry coding sequence was cloned downstream of C120 and embryos at the one-cell stage were injected with EL222 mRNA and C120-mCherry plasmid, either individually or together, at various concentrations and kept in the dark to determine the toxicity of the system components. Next, the same experimental setup was performed, but injected embryos were subjected to 465nm light at various powers to induce different levels of mCherry expression, assess for potential toxicity associated with EL222 transactivation of gene expression, and define the optimal amount of light that could be delivered without causing developmental abnormalities. Together, the results revealed that while the C120 plasmids did not cause obvious developmental abnormalities within the dosages tested, high levels of EL222 transactivation caused nonspecific developmental delays in zebrafish embryos (data not shown), corroborating previously published results248. Finally, wild type zebrafish embryos were microinjected at the one-cell stage with various combinations of EL222 mRNA and C120-mCherry plasmid at validated concentrations and kept in the dark or induced with blue light at 4 hpf in order to quantify the level of mCherry expression at 9 hpf and any potential leakiness in gene expression for this optogenetics system without blue light stimulation. The results showed undetectable mCherry expression levels in wild type embryos, but slightly elevated expression in all samples with C120-mCherry plasmid injected, with or without EL222 or blue light exposure, suggesting mildly leaky expression from the C120 minimal promoter construct in zebrafish (Figure 22A). However, mCherry mRNA expression level was approximately 16-fold higher in embryos microinjected with both EL222 and C120-mCherry and subjected to blue light stimulation from 4 hpf (Figure 22A), suggesting robust EL222-mediated gene expression. In support of the qPCR quantification results, doubly-injected embryos exposed to light were the only condition where mCherry expression was visible under whole-mount epifluorescence (Figure 22B-E). Thereafter, the irf6-ENR coding sequence was cloned downstream of C120 in replacement of mCherry and injected with EL222 mRNA into zebrafish embryos. Although there remained leaky expression of irf6-ENR without exposure to blue light, these embryos were developmentally normal at 96 hpf without observable mutant phenotypes.

Figure 22: Molecular characterization of EL222 optogenetic gene expression and dominant negative inhibition of Irf6 activity in zebrafish after epiboly.

(A) mCherry relative expression quantification at 9 hpf in optogenetics embryos treated with 465nm light from 4-9 hpf. Embryos injected with EL222 mRNA and C120-mCherry response plasmid treated with 465nm light displayed over 16-fold higher mCherry expression compared to embryos injected with identical constructs but maintained in the dark, which showed mild leaky expression. (B-E) Brightfield and fluorescence merged images of zebrafish embryos quantified in (A), showing strong and uniform mCherry fluorescence in injected embryos treated with 465nm light (B and B’) compared to injected embryos maintained in the dark (C and C’). (F) IRF6R84C and IRF6-engrailed repressor fusion forms of the dominant-negative IRF6 protein. (G) qPCR quantification of gene expression for irf6 transcriptional targets at 4 hpf, demonstrating significant downregulation of those targets in IRF6-ENR mRNA microinjected embryos to near mz-irf6-/- levels. Epcam is expressed in epithelium but known to be unaffected by Irf6. Error bar = 2xSEM, n = 3.

The leakiness of gene expression downstream of C120 is likely due to residual activity from the minimal promoter sequence. The more elevated expression observed in EL222 mRNA injected embryos without light stimulation (Figure 22C’) is likely due to either spontaneous EL222 activities or low-level exposures to ambient light (which contains a blue light component) during processing that could activate EL222 transcriptional activity, albeit at much lower levels than under blue light.
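For readers less familiar with relative qPCR quantification, the fold-change comparison reported in Figure 22A can be sketched with the comparative Ct (ΔΔCt) method. This is an illustrative sketch only: the text does not state which quantification method or reference gene was used, the calculation assumes 100% amplification efficiency, and the Ct values in the usage note are hypothetical numbers chosen to reproduce a 16-fold difference, not measured data.

```python
def fold_change(ct_target_sample: float, ct_ref_sample: float,
                ct_target_control: float, ct_ref_control: float) -> float:
    """Relative expression by the comparative Ct method (2^-ddCt).

    Assumes ~100% amplification efficiency for both target and reference
    genes, so each Ct cycle corresponds to a two-fold difference.
    """
    # Normalize the target gene to the reference gene within each condition.
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the normalized values between conditions.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)
```

As a hypothetical usage example, if mCherry amplified 4 cycles earlier in light-treated embryos than in dark-maintained embryos at the same reference gene Ct, `fold_change(22.0, 18.0, 26.0, 18.0)` would return 16.0.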

Dominant-negative Irf6-ENR fusion protein expression mimics irf6 functional ablation

As previously described in this dissertation and in previous publications, zebrafish embryos microinjected with irf6-ENR mRNA at the one-cell stage displayed embryonic lethality at 9 hpf due to periderm rupture, similar to the mutant phenotype seen in maternal-null irf6-/- embryos that lacked functional Irf6 protein. When EL222 mRNA and C120-irf6-ENR were injected into zebrafish embryos at the one-cell stage and stimulated with light, periderm ruptures and embryonic lethality were also observed (data not shown). Although the phenotypic manifestation of irf6-ENR expression is similar to deficiencies in Irf6 activity, other molecular pathways could be involved to achieve the periderm rupture phenotype in irf6-ENR-injected embryos compared to maternal-null irf6-/- mutants. Because of this possible mechanism, experiments were performed to assess the specificity of the irf6-ENR construct for inhibiting Irf6 function at the molecular level.

From previous publications, our experiments described in Chapter One characterizing irf6 in the zebrafish model, and our experiments described in Chapter Four elucidating the downstream irf6 transcriptional target genes using ChIP-seq and mRNA-seq, we obtained a list of genes under direct Irf6 transcriptional regulation, with focus on those genes most significantly downregulated in maternal-null irf6-/- embryos (where Irf6 activity is ablated) because the transcriptional regulation of those genes is most heavily influenced by Irf6 activities. In order to demonstrate the molecular specificity of the irf6-ENR construct at inhibiting gene expression of endogenous Irf6 transcriptional targets, wild type zebrafish embryos were injected with irf6-ENR mRNA at the one-cell stage and collected at 4 hpf for total RNA isolation. Similarly, RNA was also isolated from wild type embryos, maternal-null irf6-/- embryos, and maternal-null irf6-/- embryos rescued by irf6 mRNA injections for comparison purposes. Gene expression analyses were performed by RT-qPCR for genes known to be transcriptionally regulated by Irf6 activity. The results showed significant downregulation of all genes tested in maternal-null irf6-/- embryos compared to wild type and rescued gene expression back up to wild type levels in maternal-null irf6-/- embryos injected with zebrafish irf6 mRNA (Figure 22G), corroborating the results obtained in Chapter One. Interestingly, wild type embryos injected with irf6-ENR mRNA showed significant downregulation of the genes tested as well, albeit not as severely as maternal-null irf6-/- embryos, suggesting that the irf6-ENR construct can faithfully inhibit the expression of genes normally downstream of irf6 transcription regulation (Figure 22G). Further, the similarity in molecular signatures between irf6-ENR and maternal-null irf6-/- suggests that the periderm rupture phenotype observed in embryos injected with irf6-ENR mRNA likely resulted from irf6 pathway downregulation, rather than an alternative pathway that produced similar phenotypes.

For all experimental conditions collected, epcam, a gene highly expressed in the zebrafish periderm but not under the transcriptional control of Irf6 and undisturbed in maternal-null irf6-/- embryos, was used as a control for periderm gene expression. Our qPCR results showed that epcam is not downregulated in wild type zebrafish embryos injected with irf6-ENR mRNA (Figure 22G), suggesting that the downregulation of other genes observed in irf6-ENR injected embryos is due to specific downregulation of Irf6 downstream target genes by dominant-negative Irf6-ENR activities, and not due to broadly generalized dysregulation of the developing zebrafish periderm at 4 hpf, which would likely have resulted in more nonspecific reductions in periderm-specific gene expression.

Irf6 function inhibition at various time points leads to different phenotypes

Because the irf6-ENR construct was able to recapitulate both the phenotypic and molecular effects of irf6 downregulation in zebrafish, a C120-irf6-ENR (optogenetics controlled) plasmid was co-injected with EL222 mRNA into wild type zebrafish embryos at the one-cell stage and stimulated with blue light at various stages of zebrafish embryonic development to assess the effects of Irf6 inhibition, and thereby the functional requirements of irf6 during those corresponding time points.

Overall, zebrafish embryogenesis and craniofacial development can be approximately divided into several stages: 1) 0-4 hpf is generalized cell proliferation, 2) 4-10 hpf is epiboly and gastrulation, 3) 10-14 hpf is neural crest cell specification, 4) 10-24 hpf is neural crest cell migration, 5) 24-48 hpf is facial prominence condensation, 6) 48-72 hpf is ethmoid plate convergence-extension, and 7) 72-96 hpf is neurocranium morphogenesis (Figure 23)168. Zebrafish embryos microinjected with C120-irf6-ENR plasmid and EL222 mRNA were stimulated with blue light starting at each of the aforementioned developmental timepoints, and the resulting embryos were collected at 96 hpf and analyzed for generalized developmental and craniofacial abnormalities.
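The staging scheme above can be expressed as a small lookup table to show which developmental processes a given illumination window overlaps. This is an illustrative sketch rather than part of the experimental protocol: the stage boundaries are taken from the approximate divisions listed in the text, the windows are treated as half-open intervals (an assumption), and the function names are hypothetical.

```python
# Approximate developmental windows (start hpf, end hpf, process), as listed
# in the text. Note that specification and migration of neural crest cells
# overlap between 10 and 14 hpf.
STAGES = [
    (0, 4, "generalized cell proliferation"),
    (4, 10, "epiboly and gastrulation"),
    (10, 14, "neural crest cell specification"),
    (10, 24, "neural crest cell migration"),
    (24, 48, "facial prominence condensation"),
    (48, 72, "ethmoid plate convergence-extension"),
    (72, 96, "neurocranium morphogenesis"),
]


def stages_at(hpf: float) -> list[str]:
    """Processes under way at a given time (assumed half-open: start <= t < end)."""
    return [name for start, end, name in STAGES if start <= hpf < end]


def stages_covered(light_on: float, light_off: float) -> list[str]:
    """Processes that overlap an illumination window from light_on to light_off."""
    return [name for start, end, name in STAGES
            if start < light_off and end > light_on]
```

For example, `stages_covered(10, 72)` lists the four processes from neural crest cell specification through ethmoid plate convergence-extension, which corresponds to the 10-72 hpf illumination window used for the embryos in Figure 23.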

Figure 23: Dominant-negative inhibition of Irf6 protein functions after epiboly/gastrulation using optogenetics reveals potential roles of Irf6 during zebrafish craniofacial development.

Embryos were injected with an optogenetic dominant-negative irf6-ENR construct at the 1-cell stage and stimulated with blue light from 10-72 hpf. (A-B) Uninjected embryos raised in the dark or light displayed no gross abnormalities at 72 hpf. (C) Injected embryos maintained in the dark (with mild leaky irf6-ENR expression) displayed no gross abnormalities at 72 hpf. (D) Injected embryos treated with generalized illumination grossly displayed a hanging jaw craniofacial phenotype (arrowhead) and curved body at 72 hpf. Scale bar = 150 µm.

The results revealed various developmental abnormalities in zebrafish embryos when light stimulation (constant 0.3 mW/cm2 illumination) was initiated prior to 24 hpf; no phenotypic defects were observed with dominant-negative irf6-ENR expression after 24 hpf (data not shown). When zebrafish embryos injected with C120-irf6-ENR and EL222 mRNA were stimulated with blue light at 4 hpf, around the same time as epiboly initiation, they did not experience epiboly arrest and periderm rupture like maternal-null irf6-/- embryos or wild type embryos injected with irf6-ENR mRNA, possibly because the necessary gene regulatory networks for epiboly initiation and periderm maturation had already been appropriately established by functional Irf6 proteins during the initial 4 hours of development. Wild type embryos treated with blue light to inhibit Irf6 function from 4-72 hpf exhibited severe craniofacial abnormalities, observed under brightfield microscopy as a “hanging jaw” phenotype at 96 hpf (data not shown). Furthermore, these embryos displayed a severely stunted anterior-posterior body axis with moderate reductions in the lengths of the head and trunk, but severe reductions in the length of the tail to a stump, a phenotype highly reminiscent of the mutant phenotype observed in zebrafish notail (ntla or ) mutants252. These phenotypic observations could be due to potentially undiscovered roles of irf6 in regulating A-P body axis establishment and tail bud formation.

Interestingly, when blue light was initiated at 10 hpf at the bud stage immediately after the epiboly/gastrulation process and maintained until 72 hpf, the resulting embryos no longer displayed a truncated tail and instead retained only a hanging jaw craniofacial phenotype (Figure 23D) while embryos from control conditions did not display any mutant phenotypes (Figure 23A-23C). Alcian blue staining of embryos with the hanging jaw phenotype at 96 hpf revealed a specific defect in the ethmoid plate with sparing of the other craniofacial structures (Figure 24M and 24N) compared to control embryos (Figure 24A-24L). Using microdissections, the ethmoid plate was revealed to be bifurcated with a missing frontonasal process but retention of the maxillary processes (Figure 24O and 24P, arrowhead) reminiscent of the OFC phenotype of VWS/PPS patients. In addition, some irf6-ENR expressing embryos displayed decreased focal pigmentations in the eyes, similar to the coloboma phenotype associated with VWS-like disorders, suggesting a similar pathophysiological mechanism behind the abnormal phenotypes observed126,178. The same experimental results were obtained with optogenetic expression of irf6R84C (data not shown).

Figure 24: Alcian blue stain revealed craniofacial malformations in zebrafish embryos after dominant-negative inhibition of Irf6 function.

(A-D) Uninjected embryos raised in the dark showed wild type craniofacial morphologies at 96 hpf. (E-H) Uninjected embryos maintained under generalized illumination with 465nm blue light and (I-L) embryos injected with an optogenetic light-inducible dominant-negative irf6-ENR construct maintained in darkness all displayed wild type craniofacial morphologies. (M-P) C120-irf6-ENR injected embryos treated with blue light from 10-72 hpf displayed a hanging-jaw and OFC phenotype through the medial ethmoid plate (panel P arrowhead) with the frontonasal portion of the ethmoid plate missing. Scale bar = 150 µm.

Live imaging of optogenetic irf6-ENR embryos reveals NCC migration defects

Although IRF6 mutations are associated with CL/P in humans and cause OFC phenotypes in mice, they are not typically associated with frontonasal-specific craniofacial abnormalities often referred to as frontonasal dysplasias, caused by a small subset of craniofacial development genes such as ZIC2253 and ALX1254-256. Several possible mechanisms could explain the missing frontonasal phenotype of Irf6-inhibited embryos. As described in the General Introductions section of this dissertation, FNP NCCs are a unique CNCC population that never resides in the pharyngeal arch environment and migrates in a distinct stream anteromedially around the eye to arrive at the roof of the future mouth opening91,92. Because of their unique development process and long migratory distance, it is possible that FNP CNCCs fail to migrate to the anterior due to disruptions in their migratory route or chemoattractants resulting from Irf6 inhibition. Furthermore, since FNP CNCCs are the anterior-most CNCC population specified along the A-P axis, their specification could also be affected by the same initial defects in A-P axis establishment observed in the irf6-ENR embryos that phenocopied notail mutants. Lastly, the mouth opening is a complex developmental organizer involving the juxtaposition of ectoderm and endoderm. Appropriate FNP NCC development during craniofacial morphogenesis requires not only their deposition onto the roof of the mouth opening, but also their appropriate survival, proliferation, and differentiation in a spatiotemporally regulated manner. In the mouse and chick models, Fgf8 and Shh are required to prevent FNP NCC cell death, stimulate proliferation, and induce chondrogenic differentiations93. These growth factors could be dysregulated by optogenetic inhibition of Irf6 functions in the early stages of zebrafish craniofacial development and result in increased FNP CNCC apoptosis, decreased proliferation, or defective chondrogenic differentiation, any of which could generate the frontonasal process ethmoid plate defect observed in optogenetic irf6-ENR embryos. The possibility of chondrogenic differentiation defects is especially relevant because if FNP CNCCs are at the appropriate location but failed to differentiate into chondrocytes and gain the unique cartilage tissue biochemical composition, they would not be stained by alcian blue in our preliminary phenotype analyses and thereby produce a misleading orofacial cleft-like phenotype.

In order to differentiate between these possible pathophysiological mechanisms underlying the frontonasal defects observed in optogenetic irf6-ENR embryos, timelapse live imaging using two-photon microscopy was performed on both wild type and optogenetic irf6-ENR embryos during various time windows in craniofacial development. The migratory behaviors and cell fates of neural crest cells were tracked with a transgenic tg(sox10:mCherry) reporter line that fluorescently labeled all NCCs and derivatives. Matching for embryonic development stage using phenotypic landmarks such as somite counts, optogenetic irf6-ENR embryos treated with light exhibited slowed anterior CNCC migration between 14-24 hpf (Figure 25A’-25D’) and delayed circumferential wrapping of CNCCs around the posterior surface of the eye primordium (Figure 25C’ dashed circle) compared to optogenetic irf6-ENR embryos raised in the dark without irf6-ENR expression (Figure 25A-25D). Furthermore, optogenetic irf6-ENR embryos treated with blue light showed decreased FNP neural crest cell deposition onto the anterior segment of the roof of the future mouth opening (Figure 25D’ arrow) compared to control (Figure 25D). By 60 hpf, during convergence-extension of the EP, the middle of the fan-shaped EP was devoid of mCherry+ NCCs (Figure 25F’ arrowhead) compared to control (Figure 25F), suggesting that the FNP defect of optogenetic irf6-ENR embryos was not due to defective chondrogenic differentiation of CNCCs because FNP CNCCs should still be labeled with mCherry at 60 hpf due to the persistence of mCherry fluorescent proteins.

Figure 25: In vivo two-photon microscopy revealed an absence of cranial neural crest cells at the medial ethmoid plate (frontonasal process) domain.

Imaging of tg(sox10:mCherry) embryos (14-60 hpf). (A-F) Optogenetic irf6-ENR construct injected embryos maintained in the dark. (A’-F’) Optogenetic irf6-ENR construct injected embryos maintained under generalized illumination with 465nm light starting from 10 hpf. (A-C) Anterior-most CNCCs (FNP precursor) migrate anteromedially around the dorsal surface of the eye, while MXP precursors migrate around the ventral surface. (A’-C’) irf6-ENR expressing embryos displayed delayed anterior-most CNCC migration, but normal CNCC migration into the first pharyngeal arch. (D) FNP CNCCs have migrated around the optic stalk to the anterior extreme of the embryo. (D’) The FNP-precursor CNCCs did not reach the anterior extreme (arrow) while the stomodeum and PA1 developed appropriately in irf6-ENR expressing embryos. (E-E’) CNCCs have condensed into facial prominences and commenced their convergence toward the midline. (F) Bilateral MXPs have fused with the FNP at the midline to form the ethmoid plate. (F’) irf6-ENR expressing embryos displayed appropriately converged bilateral MXPs but missing FNP CNCCs at the midline (arrowhead). FNP: frontonasal process; MXP: maxillary process; MDP: mandibular process; E: eye; OS: optic stalk; S: stomodeum; PA: pharyngeal arch; MK: Meckel’s cartilage. Embryo schematic adapted and modified from Wada et al., Development 200592.

Proliferation and apoptosis in wild type vs. optogenetic irf6-ENR embryos

The in vivo imaging results with tg(sox10:mCherry) optogenetic irf6-ENR embryos revealed that the missing medial EP phenotype observed is not likely due to defects in frontonasal CNCC differentiation into chondrocytes. Another possible mechanism leading to the phenotype observed is that optogenetic irf6-ENR embryos have either decreased FNP CNCC proliferation or increased cell death, leading to an insufficient number of remaining FNP CNCCs to effectively fuse with the bilateral maxillary processes and thereby cause the bifurcated EP phenotype observed. Wild type and optogenetic irf6-ENR embryos treated with generalized illumination from 10 hpf were collected at 24 hpf and stained for phospho-histone 3 to assess for cell proliferation and cleaved activated caspase 3 to assess for cell death. No significant differences in proliferation in the craniofacial region were observed across samples comparing wild type with optogenetic irf6-ENR embryos (data not shown). However, a small population of focalized cell apoptosis was observed bilaterally ventromedial to the eyes on the roof of the future mouth opening at 24 hpf in optogenetic irf6-ENR embryos but not control (data not shown). The cells undergoing apoptosis spatiotemporally overlap with frontonasal cranial neural crest cells condensed onto the oral ectoderm of the mouth opening, and thus could potentially account for the cell death of the already limited number of FNP CNCCs that reached the roof of the future mouth opening due to their defective cell migration.

Discussions

Dominant-negative IRF6 in the forms of ENR-fusion and R84C

VWS and PPS are autosomal-dominant Mendelian disorders associated with mutations in the transcription factor IRF6. Because IRF6 mutations are the most common genetic determinants of human CL/P pathogenesis, hundreds of genetic variants from all mutational classes have been identified (i.e.: nonsense, missense, frameshift, and splice-site)59,222. As described in the General

Introductions, the types of mutation associated with VWS are usually full loss-of-function mutations like frameshifts and chromosomal deletions, leading to a hypothesis that VWS is caused by haploinsufficiency in IRF6 function. In contrast, the types of mutation associated with PPS are usually point mutations in the critical DNA-binding or protein-binding domains, which change the resulting proteins into dominant-negatives59. While the exact molecular mechanisms of IRF6 transcriptional activity regulation are not well understood, it has been hypothesized from studies of the crystal structures of other IRF family members that IRF6 requires homo- or hetero-dimerization to undergo nuclear translocation and acquire transcriptional activities128. Under this mechanism, mutations in the critical functional domains result in mutant proteins that can either bind to DNA but not interact with protein transcriptional co-activators (protein-binding domain mutations) or bind to protein co-factors but not to the appropriate DNA sequences (DNA-binding domain mutations). In both cases

the defective IRF6 proteins sequester functional IRF6 proteins in nonfunctional transcriptional complexes through dimerization and effectively inhibit IRF6 transcriptional activities. In fact, one of the most common mutations observed in IRF6 associated with PPS is p.R84C, which mutates a critical DNA-binding residue in the DNA-binding domain and results in a dominant-negative IRF6 protein (Figure 22F).

In a previous publication from our collaborator Dr. Robert Cornell, another form of Irf6 with dominant-negative activity was generated for the zebrafish model185. The amino-terminal 115 amino acids of zebrafish Irf6, containing the DNA-binding domain, were isolated and fused to the

Engrailed gene repressor domain (ENR). This generated a fusion protein that was shown to retain endogenous binding to putative Irf6 genomic binding sites and repress transcriptional activation downstream of those loci (Figure 22F). The Engrailed repressor domain has been previously shown to transform transcription factor-ENR fusions into dominant-negative forms and uses an inhibitory mechanism that can actively suppress transcriptional activity at genomic loci to which it is recruited.

In fact, several previous publications have already successfully employed ENR-fusion proteins in the zebrafish model to inhibit transcription factor functions257,258.

It is important to note that Irf6-ENR and Irf6R84C achieved their dominant-negative functions through different mechanisms. For the Irf6-ENR fusion, the DNA-binding domain is intact and can bind to all of the endogenous Irf6 DNA binding sites while bringing with it an ENR domain that can suppress transcription at those binding sites. While the ENR fusion is a powerful method of gene expression repression, it is possible that the DNA-binding domain, without the protein-binding domain attached, might lose attributes of its endogenous DNA binding regulation and bind to putative Irf6

DNA binding sites throughout the genome regardless of the spatiotemporal regulatory restrictions

that might normally be in place during embryonic development. For the Irf6R84C mutant protein, the critical DNA-binding amino acid was disrupted, rendering the resulting mutant protein incapable of binding to DNA. However, the endogenous protein-binding domain remained intact and should

therefore obey the hypothetical spatiotemporal regulations of Irf6 DNA binding during embryonic development. Therefore, the Irf6R84C protein is more likely to inhibit wild type Irf6 protein functions only when and where Irf6 is endogenously active compared to Irf6-ENR, and thus is more likely to recapitulate a precise inhibition of Irf6 protein function during embryonic development. In our initial experiments involving optogenetic dominant-negative Irf6 expression, both forms were used and characterized for Irf6 downstream gene expression and potential mutant phenotypes. Although

greater expression of Irf6R84C was required to achieve wild type Irf6 protein inhibition compared to

Irf6-ENR, ultimately the resulting downstream gene expression alterations and mutant phenotypes

were indistinguishable between the two dominant-negative forms. This suggested that the Irf6-ENR fusion can accurately recapitulate the inhibition of endogenous Irf6 transcriptional functions during zebrafish embryonic development without significant off-target effects.

Dominant negative Irf6 inhibition molecularly mimicked maternal-null irf6-/-

Although the two dominant-negative Irf6 proteins (Irf6-ENR and Irf6R84C) used independent approaches for inhibiting Irf6 protein functions, the novel nature of the craniofacial mutant phenotype observed in optogenetic irf6-ENR treated embryos called for additional validation experiments. The expression of known Irf6 downstream genes, such as krt4, grhl3, ppl and capn9, was compared between wild type and optogenetic irf6-ENR embryos, with maternal-null irf6-/- embryos used as a positive control for complete ablation of Irf6 protein function. The results revealed statistically significant downregulation of Irf6 downstream genes in optogenetic irf6-ENR embryos compared to wild type, albeit less severe than in maternal-null irf6-/- embryos (Figure 22G). This incomplete reduction in downstream gene expression could account for the observation that dominant-negative irf6-ENR mRNA injected embryos experienced delayed periderm rupture compared to maternal-null irf6-/- embryos. In spite of the partial reduction in Irf6 downstream gene expression, the results suggested that dominant-negative Irf6 expression in wild type zebrafish was sufficient to interfere with endogenous wild type Irf6 protein functions.


Moreover, the dimerization requirement for Irf6 nuclear translocation and transcription initiation hypothesized for human IRF6 is likely conserved in zebrafish in order for the dominant-negative

Irf6 constructs to function. Epcam, an epithelial gene strongly expressed in the periderm but known to be unaffected by Irf6 transcription regulation, was included to control for generalized periderm dysfunction. Although zebrafish Irf6 function ablation has been previously shown to be associated with defects in periderm maturation, epcam expression was found to be undisturbed in maternal-null irf6-/- embryos at 4 hpf, and thus should not be disrupted by dominant-negative Irf6 expression.

Our experiments revealed that epcam expression levels remained statistically indistinguishable in dominant-negative irf6-ENR injected embryos compared to wild type (Figure 22G) and suggested that the downregulation of other periderm genes like krt4 and grhl3 observed in dominant-negative irf6-ENR injected embryos was likely not due to generalized periderm dysregulation but rather to specific disruptions in the gene regulatory network downstream of Irf6 in the periderm.

Validation of EL222 optogenetic gene expression control

Although optogenetics platforms have established applications in the field of neurobiology where light-induced conformational changes in ion channels can subsequently result in alterations in electrochemical gradients across neuronal membranes, optogenetic gene expression systems, especially for use in the zebrafish model, are still emerging technologies. The EL222 optogenetics gene expression system employed in this dissertation allowed for precise temporal control of gene expression and overcame traditional limitations of other commonly used zebrafish spatiotemporal gene expression systems (Figure 21).

Using the validated microinjection concentrations of EL222 and C120 response plasmid, which limited over-exposure to blue light and the toxicities associated with EL222 transactivation, blue light activation of EL222 could robustly induce transcriptional activation of genes downstream of C120 by over 16-fold in zebrafish embryos treated under constant illumination from 4-10 hpf compared to EL222 and response plasmid-injected embryos raised in the dark (Figure 22A). It is important

to note that due to the design of the optogenetics response plasmid, where five tandem copies of the EL222 binding site C120 were placed upstream of a minimal promoter sequence and gene-of-interest, there appears to be basal level transcriptional activity from the response plasmid even in the absence of EL222, albeit at extremely low levels over 30-fold lower than fully activated

EL222 transcription from the response plasmid (Figure 22A). In addition, either due to exposure to ambient white light (which contains a blue light component) during live embryo processing, or to intrinsically aberrant EL222 activation in the absence of blue light stimulation, there was a 2-fold increase in transcriptional activity from the response plasmid with unstimulated EL222 compared to response plasmid-only control. However, this elevated response plasmid expression was still

16-fold lower than EL222-activated transcription (Figure 22A). In terms of dominant-negative irf6-ENR gene expression from the response plasmid by EL222, no mutant phenotypes were observed in any of the embryos examined that were injected with response plasmid only, or with EL222 and response plasmid but maintained in the dark (data not shown). Moreover, although irf6-ENR expression was detected by qPCR in these control conditions, no significant downregulation in Irf6 downstream genes was observed (data not shown). Taken together, these results suggested that the low-level leaky Irf6-ENR mRNA expression under conditions without blue light EL222 stimulation likely produced Irf6-ENR proteins at quantities insufficient to effectively compete with wild type

Irf6 proteins and inhibit endogenous Irf6 transcriptional activities.
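The fold differences quoted above can be checked against one another with a short calculation. The following is a minimal sketch only: the relative expression values are hypothetical placeholders normalized to the response plasmid-only condition, chosen to be consistent with the approximate fold differences stated in the text, and are not measured data from Figure 22A.

```python
# Hypothetical relative expression values (response plasmid-only = 1.0).
# Illustrative numbers consistent with the fold differences quoted in the
# text; they are NOT measurements from this dissertation.
relative_expression = {
    "response_plasmid_only": 1.0,   # basal activity without EL222
    "el222_dark": 2.0,              # EL222 present, no blue light
    "el222_blue_light": 32.0,       # EL222 present, constant blue light
}


def fold_change(condition, reference, levels=relative_expression):
    """Return expression in `condition` divided by expression in `reference`."""
    return levels[condition] / levels[reference]


leak_vs_basal = fold_change("el222_dark", "response_plasmid_only")
induced_vs_dark = fold_change("el222_blue_light", "el222_dark")
induced_vs_basal = fold_change("el222_blue_light", "response_plasmid_only")

print(leak_vs_basal, induced_vs_dark, induced_vs_basal)
```

Under these assumed values, a 2-fold leak in the dark combined with a 16-fold light induction corresponds to a difference of roughly 30-fold between fully activated and basal response plasmid expression, which is how the three quoted ratios relate to one another.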

Effects of Irf6 inhibition on NCC migration, proliferation, survival, and differentiation

The optogenetic irf6-ENR embryos exhibited an extremely specific craniofacial phenotype where the medial ethmoid plate cells are missing when examined at 96 hpf by alcian blue staining.

The medial ethmoid plate cells are derived from the zebrafish equivalent of mammalian frontonasal CNCCs.

In zebrafish, these cells are the anterior-most population of cranial neural crest cells and undergo a unique migratory process anteromedial to the eye primordia to condense onto the oral ectoderm on the roof of the future mouth opening92,166. A few explanations can account for the missing medial

ethmoid plate phenotype observed in optogenetic irf6-ENR embryos. The first possibility depends on a technical nuance of alcian blue staining, which relies on the unique chemical composition of the chondrocyte extracellular matrix for cartilaginous tissues to preferentially retain alcian blue dye. Therefore, if the medial ethmoid plate CNCCs in optogenetic irf6-ENR embryos failed to differentiate into chondrocytes, they would not be stained by alcian blue and would thus give the misleading appearance of an orofacial cleft phenotype. In optogenetic irf6-ENR treated tg(sox10:mCherry) embryos, in which neural crest cells and their derivatives are labeled with mCherry fluorescent protein, the medial ethmoid plate was deficient in mCherry-positive cells at

72 hpf, suggesting that the cause of the mutant phenotype is likely due to a lack of neural crest cells at the medial ethmoid plate rather than defective CNCC differentiation (Figure 25F and 25F’).

In addition, with in vivo two-photon imaging, cranial neural crest cell migration in optogenetic irf6-

ENR embryos appeared to be delayed compared to control, albeit this effect was not statistically quantified. At 24 hpf, when frontonasal and maxillary CNCCs should have condensed onto the oral ectoderm on the roof of the future mouth opening, a location normally occupied by frontonasal

CNCCs was devoid of mCherry-positive CNCCs (Figure 25D and 25D’ arrow), suggesting that defective frontonasal CNCC migration to the correct anatomical position could be a contributory factor to the OFC phenotype observed at 96 hpf in optogenetic irf6-ENR embryos.

Because the cellular defects in optogenetic irf6-ENR embryos were already observable at

24 hpf, the defects associated with post-epiboly dominant-negative inhibition of Irf6 function are likely associated with early events during craniofacial development. In addition to aberrant CNCC migration, it is possible that the anterior-most cranial neural crest cells displayed poor specification at the neural plate border in optogenetic irf6-ENR embryos. In order to examine this possibility, we are currently in the process of performing qPCR and WISH on zebrafish embryos from 10-14 hpf for early NCC markers to determine whether optogenetic dominant-negative Irf6 function inhibition has inhibitory effects on the specification of anterior CNCCs.


Furthermore, the missing medial ethmoid plate phenotype could be due to either decreased proliferation or increased cell death of frontonasal CNCCs. While the results are preliminary, there appear to be no significant changes in cell proliferation in the craniofacial region at 24 hpf, but markedly increased programmed cell death on the roof of the mouth opening in a spatiotemporal pattern corresponding to frontonasal NCCs (data not shown). These experiments will be replicated with co-staining for NCC markers to determine the identities of the cells undergoing programmed cell death. Moreover, apoptosis in the craniofacial region will be quantified to statistically evaluate the increase in cell death in optogenetic irf6-ENR treated embryos compared to wild type. If indeed there is an increase in cell death of frontonasal CNCCs at the roof of the future mouth opening, it would not only suggest a cellular mechanism behind the pathogenesis of the missing medial EP frontonasal process phenotype in optogenetic irf6-ENR embryos, but also point towards potential molecular mechanisms underlying this phenotype such as pathways regulating NCC survival at this stage of craniofacial development.

Factors affecting frontonasal neural crest cell migration and survival

Several factors could affect frontonasal cranial neural crest cell migration and survival. As mentioned in the General Introductions section, frontonasal cranial neural crest cells are unique in their migration and eventual condensation on the anterior dorsal oral ectoderm of the future mouth opening. Chemoattractants, such as Pdgfaa secreted from the anterior optic stalk, are required to define the migratory pathway of frontonasal CNCCs (which are Pdgfra-positive)169. In fact, pdgfra downregulation in neural crest cells has been previously shown to result in OFC phenotypes that are highly similar to the medial ethmoid plate defects seen in optogenetic irf6-ENR treated embryos169.

Due to the defective frontonasal CNCC migration observed in optogenetic irf6-ENR embryos and the resemblance of the mutant phenotype with pdgfra knockdown embryos, the medial ethmoid plate defect in optogenetic irf6-ENR embryos could be caused by disruptions in the Pdgfra-Pdgfaa signaling pathway stemming from post-epiboly inhibition of Irf6 protein function. Pdgfra is normally

expressed in cranial neural crest cells, and therefore is unlikely to be affected by dominant-negative

Irf6 because there is currently no evidence of irf6 expression in NCCs in zebrafish. However, we have preliminary experimental evidence through WISH and IHC that irf6 could be expressed in the optic stalk at 24 hpf (data not shown). Therefore, it is possible that dominant-negative inhibition of

Irf6 function in the optic stalk downregulated the gene expression of pdgfaa, and thereby decreased the chemoattractive strength of the optic stalk for FNP CNCCs in assisting their anterior migration and condensation onto the oral ectoderm of the roof of the future mouth opening.

In addition, the roof of the future mouth opening is a complex developmental environment.

One region in particular is named the Frontonasal Ectodermal Zone (FEZ) and is defined by the juxtaposition of Fgf8 and Shh expression domains at the mouth opening; this will be revisited in the Discussion section of Chapter Four. With respect to the optogenetic irf6-ENR embryos, it has been previously reported that Fgf8 signaling from the dorsal oral epithelium to the underlying frontonasal CNCCs is essential for cranial neural crest cell survival during the early stages of facial prominence formation and later for the induction of CNCC chondrogenic differentiation. Because

Irf6 is expressed in the zebrafish periderm and possibly the underlying basal epithelial layer, Irf6 expression likely spatiotemporally overlaps with fgf8 expression in the FEZ. Therefore, disruptions in Irf6 protein function by dominant-negative irf6-ENR expression in the epithelium could cause fgf8 expression dysregulation in the FEZ, and subsequently increased cell death in the underlying frontonasal cranial neural crest cells. In order to elucidate the molecular mechanisms behind the observed optogenetic irf6-ENR phenotype and to disentangle how defects in cellular migration and survival contribute to the missing medial ethmoid plate cells, we are performing qPCR and WISH for a variety of neural crest cell markers and developmentally relevant genes, such as pdgfra and fgf8, to identify possible pathways affected in optogenetic irf6-ENR treated embryos compared to wild type. Significant downregulation of these genes in the appropriate spatiotemporal expression pattern would be expected if they were to have key contributions to the overt mutant phenotype

observed. In addition to these currently proposed pathways, as we will examine in depth in Chapter

Four, we performed ChIP-seq and mRNA-seq to identify the direct Irf6 downstream transcriptional targets. The resulting gene list will be used to augment our understanding of Irf6 functions during post-epiboly craniofacial development and suggest additional molecular mechanisms behind the pathogenesis of the missing medial ethmoid plate CNCC phenotype.


Chapter 4:

Identification of Direct Irf6 Transcriptional Target Genes in Periderm Maturation

Research Attributions

The research presented in the following chapter is unpublished. Edward Li played a role in project conceptualization, experimental execution, data curation, formal analysis, and methodology development. Dawn Truong, an undergraduate student, played a role in experimental execution and data curation. Shawn Hallett, a research technician, played a role in experimental execution, data curation, and formal analysis, especially for the experiments involving CRISPR-Cas9 site-directed mutagenesis of Irf6 downstream transcriptional targets in zebrafish and subsequent maintenance and husbandry of the zebrafish lines. Yang Chai, an outside principal investigator and collaborator, provided Irf6R84C mouse embryos and adults.


Introductions

Utility of ChIP-seq for identifying direct IRF6 transcriptional target genes

IRF6 is a helix-turn-helix transcription factor whose core developmental function has been previously identified as transcriptional regulation of genes critical for epithelial maturation62,63. In addition, IRF6 has been implicated in the proliferation, differentiation, and apoptosis of lambdoidal junction epithelia (primary palate) and medial edge epithelia of the palatal shelves (secondary palate) during embryonic palatogenesis262. However, these roles are currently not well characterized, and the IRF6 transcriptional target profile during palatogenesis is unknown.

As described in the General Introductions, select factors within the IRF6 gene regulatory network have been identified through studies of genes in canonical epithelial gene regulatory networks like

P63123,208,259 and TFAP2A125,209. These studies mostly examined single-factor upstream/downstream genetic interactions with IRF6, and therefore have not provided a comprehensive genome-wide and transcriptome-wide view of IRF6 transcriptional functions.

Because IRF6 is a transcription factor, direct downstream genes under IRF6 transcription regulation are likely the direct functional executors of IRF6 function, and thus are of high interest for determining the functions of IRF6 in epithelial maturation and embryonic palate development.

The genome-wide direct transcriptional targets of IRF6 could be experimentally obtained through chromatin immunoprecipitation and next-generation sequencing (ChIP-seq) to identify the physical binding sites of IRF6 throughout the genome. The locations of these binding sites in relationship to adjacent genes and promoters would then identify candidate genes under direct IRF6 transcription regulation. To address this gap in knowledge, a ChIP-seq study was previously conducted on IRF6 in cultured human keratinocytes under differentiating conditions to identify candidate genes under direct IRF6 transcription regulation during keratinocyte differentiation and maturation179. The ChIP-seq results revealed 3,980 IRF6 binding peaks representing 2,201 genes179. Several of the genes identified were genes previously known to be important for IRF6 function in epithelial maturation,

such as TGFBR3, BMP2, OVOL1, and EGFR179. Moreover, from examinations of gene expression and epigenetic markers at the IRF6 locus in squamous cell carcinomas (SCC), it was discovered that IRF6 was significantly downregulated in a large fraction of human SCC samples, which implicated IRF6 as a potential tumor suppressor in the adult epidermis that directs differentiation of immature keratinocytes into mature keratinized skin cells and controls aberrant proliferation179.

However, because of the biological context and oncology focus of the aforementioned study, the experiments were performed in primary keratinocytes in vitro and not only lacked an appropriate in vivo tissue/organismal context for IRF6 functions, but also potentially missed key IRF6 functions in embryonic craniofacial development that are suppressed because the keratinocytes analyzed were adult in origin.
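The step of relating binding-site locations to adjacent genes can be illustrated with a simple nearest-transcription-start-site rule. This is a minimal sketch of one common convention, not the method used in the cited keratinocyte study or in this dissertation: the function name `nearest_gene`, the 10 kb distance cutoff, and the gene names and coordinates are all hypothetical.

```python
from bisect import bisect_left


def nearest_gene(peak_pos, tss_list, max_distance=10_000):
    """Assign a peak to the gene with the closest transcription start site.

    tss_list: list of (tss_position, gene_name) tuples for one chromosome,
              sorted by position.
    Returns the gene name, or None if no TSS lies within max_distance
    (an assumed cutoff chosen only for illustration).
    """
    if not tss_list:
        return None
    positions = [pos for pos, _ in tss_list]
    i = bisect_left(positions, peak_pos)
    # Only the TSS just upstream and just downstream can be the closest.
    candidates = tss_list[max(i - 1, 0): i + 1]
    pos, gene = min(candidates, key=lambda t: abs(t[0] - peak_pos))
    return gene if abs(pos - peak_pos) <= max_distance else None


# Hypothetical TSS coordinates on a single chromosome.
example_tss = [(1_000, "geneA"), (50_000, "geneB"), (120_000, "geneC")]

print(nearest_gene(1_500, example_tss))    # peak 500 bp from geneA
print(nearest_gene(80_000, example_tss))   # no TSS within 10 kb
print(nearest_gene(119_000, example_tss))  # peak 1 kb from geneC
```

A distance cutoff of this kind trades sensitivity for specificity: distal cis-regulatory elements are missed, while peaks far from any gene are not forced onto an unrelated neighbor.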

Previously published research and experiments performed in this dissertation project have demonstrated that the zebrafish periderm could serve as a faithful model of irf6 functions within a relevant biological and developmental context. Therefore, it is possible that the direct downstream transcriptional targets of Irf6 in the zebrafish periderm could reveal a gene regulatory profile that more closely resembles the biological functions during craniofacial development compared to the results obtained from adult human keratinocytes.

Direct IRF6 transcriptional targets and the missing inheritance of VWS/PPS

The identification of direct IRF6 downstream transcriptional targets could not only provide a more thorough understanding of the developmental function of IRF6 in epithelial maturation and craniofacial morphogenesis, but also help account for a large fraction of the missing inheritance in human VWS/PPS patients. Like many other Mendelian syndromes, VWS and PPS are clinical diagnoses based on canonical constellations of clinical presentations and not solely on genetic mutations in IRF6. In fact, only approximately 70% of patients with clinical VWS diagnoses have deleterious mutations in IRF6; the remaining 30% of patients are currently categorized as non-IRF6

VWS with unknown genetic causes121. Because of the high incidence of CL/Ps and VWS/PPS in

the general population, 30% of cases translate to a significant number of people with undiagnosed disease, and therefore represent a pressing need to identify other genetic causes of VWS and PPS.

Furthermore, the statistic of 70% IRF6 mutations in VWS is likely an inaccurate representation of the true relationship between IRF6 and VWS/PPS. From our analysis of rare IRF6 missense gene variants described in Chapter Two, numerous IRF6 missense gene variants previously hypothesized to be pathogenic actually resulted in IRF6 proteins with full biological function in the zebrafish model and are thus unlikely to be pathogenic in humans. In addition, many patients with clinical presentations of VWS/PPS were, and continue to be, sequenced by direct Sanger sequencing of the IRF6 coding regions, an approach that often neglects potentially pathogenic variants in IRF6 gene regulatory regions, introns, splice-sites, and other noncoding regions. Lastly, with targeted IRF6 Sanger sequencing, other potentially pathogenic gene variants that coincidentally coexisted with non-pathogenic IRF6 gene variants in VWS/PPS patients would not be identified, and the cause of pathogenesis would be wrongfully attributed to the benign IRF6 variant. This challenge is particularly pronounced in situations where pathogenic gene variants are adjacent to the IRF6 locus and thus would likely co-segregate with both the benign IRF6 variants and the disease phenotype, so that they cannot be disentangled from IRF6 by recombination in small and incomplete pedigrees.

In fact, locus heterogeneity likely accounts for a large fraction of the remaining 30% of VWS patients. In one large pedigree from Finland, the VWS presentation was linked to a chromosomal locus on 1p33-p36 rather than IRF6 located on 1q32-q41260. The direct downstream transcriptional targets of IRF6 that perform its core developmental functions could also mimic VWS presentations when mutated and account for some of the missing inheritance in VWS populations. For example, the gene grhl3 was initially identified as an Irf6 transcriptional target gene in the zebrafish periderm model and subsequently discovered to be deleteriously mutated in approximately 3% of non-IRF6

VWS patients. Other direct downstream transcriptional targets of IRF6 have also been identified and implicated in non-IRF6 VWS pathogenesis such as KLF4207. Through in vivo zebrafish models

and in vitro cell culture studies, KLF4 was identified as a direct downstream transcriptional target of IRF6. Focused resequencing of human patients with non-IRF6 VWS revealed a small portion of them to have potentially pathogenic KLF4 variants that again accounted for a small fraction of the missing inheritance in VWS patients and provided biological evidence for improved clinical genetics diagnoses for VWS patients and families.

Although focused studies of genes known to be important IRF6 transcriptional targets have provided incremental knowledge on other genes responsible for non-IRF6 VWS, the genetic causes of disease in the vast majority of non-IRF6 VWS patients remain unaccounted for. With rapid advances in next-generation sequencing technologies and decreasing costs associated with whole exome/genome sequencing, many potentially pathogenic variants in VWS/PPS patients are starting to surface in conjunction with IRF6 gene variants of unknown significance. Because of the high prevalence of direct IRF6 downstream transcriptional targets in identified cases of non-IRF6

VWS, we hypothesized that a complete repertoire of IRF6 downstream transcriptional target genes would not only identify biological pathways key for craniofacial development, but also IRF6 targets that could account for a significant portion of the variant burden in non-IRF6 VWS patients.

In order to examine IRF6 function during embryogenesis in a relevant biological context,

ChIP-seq could theoretically be performed on mouse embryonic tissues from the epidermis or oral epithelium. However, in the first situation of the epidermis, most epithelial cells do not perform any signaling functions during embryonic development and thus would dilute the critical peaks of IRF6 binding to developmentally relevant genes. In the second situation of oral epithelium, although the results would likely provide a more direct biological answer to our initial question of IRF6 function during craniofacial development and CL/P pathogenesis, current technical limitations for ChIP-seq still require large numbers of cells in order to obtain enough chromatin for immunoprecipitation and next-generation sequencing, and thereby preclude the isolation of oral epithelial cells through manual microdissection, flow cytometry, or laser capture microdissection techniques. Because previous

studies have illustrated the faithfulness and effectiveness of the zebrafish periderm as a model for the mammalian oral epithelium in studies of IRF6 function and gene regulation, ChIP-seq using a polyclonal antibody specific for zebrafish Irf6 was performed on wild type and maternal-null irf6-/- embryos at 4-5 hpf to identify direct Irf6 transcriptional target genes during periderm specification and differentiation. Zebrafish periderm development is similar to mammalian epithelial maturation and occurs during epiboly, a complex event involving cell lamination, convergence-extension, and epithelial-to-mesenchymal transition, processes that all have analogous features during zebrafish and mammalian craniofacial development. In addition, although transcription factor binding to the transcriptional start site or cis-regulatory elements of a gene normally implies direct transcriptional regulation, this relationship does not always hold true. In order to improve the correlation between

IRF6 genomic binding and IRF6 transcriptional activities at the genome/transcriptome-wide levels, mRNA-seq was also performed comparing wild type and maternal-null irf6-/- embryos at 4-5 hpf to identify genes whose expression is most significantly affected by disruptions in Irf6 function.

Results

mRNA-seq of wild type vs. maternal-null irf6-/- zebrafish embryos

From the zebrafish irf6 model characterization experiments described in Chapter One, our maternal-null irf6-/- zebrafish embryos lacked irf6 maternal transcripts or functional proteins and in effect represented a null model for studies of irf6 function. The null model displayed epiboly arrest, periderm rupture, and significant downregulation of several key downstream genes identified from

previous literature184,261. As an initial step in identifying IRF6 transcriptional regulatory targets, the global gene expression changes associated with ablation of Irf6 function were assessed by mRNA-seq by comparing wild type zebrafish embryos to maternal-null irf6-/- embryos at 4-5 hpf just before the onset of mutant phenotypes. RNA was extracted from both wild type and maternal-null irf6-/- zebrafish embryos at 4 hpf and mRNA was enriched by oligo-dT coated magnetic beads. Individual samples were multiplexed and sequenced by an Illumina HiSeq 2500 with single-end 50 bp reads

at approximately 25 million reads per sample with biological triplicates. Gene expression was compared between wild type and maternal-null irf6-/- samples to generate a list containing all genes with a statistically significant expression change above two-fold. Principal component analysis with unsupervised clustering of the mRNA-seq results revealed greater similarities among biological replicates within the wild type and maternal-null irf6-/- groups than between the two groups (Figure 26). Moreover, the mRNA-seq results revealed significant downregulation of genes previously known to be downregulated with disruptions in Irf6 function, such as grhl3, tfap2a, krt4, klf2a, and ovol1 (Figure 26).
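The selection of genes with a statistically significant expression change above two-fold can be sketched as a simple filter on differential expression results. This is an illustration only: the function name `significant_genes`, the adjusted p-value cutoff of 0.05, and the example genes and values are assumptions, since the text does not state the statistical test or significance threshold used.

```python
import math


def significant_genes(results, min_fold=2.0, max_padj=0.05):
    """Keep genes whose change exceeds min_fold in either direction.

    results: dict mapping gene name -> (log2 fold change, adjusted p-value).
    max_padj is an assumed significance cutoff, not taken from the text.
    Returns a dict of gene name -> log2 fold change for retained genes.
    """
    threshold = math.log2(min_fold)  # two-fold corresponds to |log2FC| > 1
    return {
        gene: lfc
        for gene, (lfc, padj) in results.items()
        if abs(lfc) > threshold and padj < max_padj
    }


# Hypothetical results (mutant relative to wild type); not real data.
example = {
    "gene1": (-3.0, 0.001),  # strongly downregulated and significant
    "gene2": (-0.5, 0.001),  # significant but below two-fold
    "gene3": (2.5, 0.2),     # above two-fold but not significant
    "gene4": (1.2, 0.01),    # upregulated above two-fold and significant
}

print(significant_genes(example))
```

Working on the log2 scale treats upregulation and downregulation symmetrically, so a two-fold decrease and a two-fold increase pass the same threshold.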

Many of the genes most significantly downregulated play important roles during epithelial differentiation. Moreover, when compared to the previously published IRF6 siRNA human keratinocyte differential gene expression data, there were major overlaps of genes in molecular pathways responsible for epithelial regulation. However, in addition to the traditional epithelial differentiation and maturation genes, many key developmental pathway genes including Fgfs and Wnts were also heavily represented in our dataset as genes downregulated due to Irf6 dysfunction, but not in the dataset derived from adult human keratinocytes. These observations potentially support our initial rationale for using the zebrafish periderm as a model of the mammalian oral epithelia to study Irf6 transcriptional targets. Due to the developmental setting in which the RNA samples were collected for mRNA-seq, Irf6 functions were captured in a biological context that is potentially relevant for craniofacial development rather than epidermal keratinocyte maturation. It is important to recognize that because of the rapid pace of zebrafish embryonic development at 4 hpf, many developmental processes could be indirectly disrupted by changes in maternally-deposited irf6 transcripts. mRNA-seq detects not only primary gene expression changes due to direct Irf6 transcriptional activity, but also any subsequent changes that cascade from the initial perturbations. Therefore, the genes identified from this initial mRNA-seq study will require cross-validation with Irf6 ChIP-seq results in order to identify which significantly downregulated genes are also direct transcriptional targets of Irf6 in zebrafish.

Figure 26: Differential gene expression heat-map between WT & maternal-null irf6-/- embryos

Heat-map of mRNA-seq results displaying differential gene expression between wild type and maternal-null irf6-/- embryos at 4-5 hpf for the top one-hundred genes. The biological replicates were found to be grouped together with unsupervised clustering by principal component analysis.

ChIP-seq of wild type vs. maternal-null irf6-/- zebrafish embryos

The zebrafish research community has long faced the challenge of antibodies with poor sensitivity and specificity. Several Irf6 antibodies were tested for use in western blotting during the initial zebrafish irf6 model validation experiments, and the antibody with the highest sensitivity and specificity was selected for all subsequent experiments. Chromatin immunoprecipitation requires a highly sensitive and specific antibody in order to pull down the desired protein-chromatin complexes and obtain sequence reads with a high signal-to-noise ratio to identify distinct peaks of protein-chromatin binding. Even for the most specific antibodies tested for zebrafish Irf6, nonspecific band patterns were observed on western blotting in addition to the band specific for Irf6 when wild type embryos were compared with maternal-null irf6-/- embryos devoid of Irf6 protein expression. When used for ChIP-seq, this antibody would likely pull down nonspecific protein-chromatin complexes and generate false-positive ChIP-seq reads. However, it was noted that the nonspecific bands on western blots were nonrandomly distributed and consistent between wild type and maternal-null irf6-/- samples (Figure 27A). Furthermore, depending on the blocking buffer and reaction condition used, the level of nonspecific background could be significantly reduced. Therefore, instead of the traditional ChIP-seq experiment setup where ChIP is performed only on wild type samples, ChIP-seq using a validated zebrafish polyclonal Irf6 antibody was performed on wild type 4 hpf embryos (to isolate Irf6 and nonspecific binding peaks) and maternal-null irf6-/- embryos that do not express Irf6. ChIP-seq on the maternal-null embryos would isolate only the same sets of nonspecific binding peaks found in wild type samples, which could then be used to cancel the noise from wild type ChIP samples and identify Irf6-specific binding peaks (Figure 27C). As an additional method to ensure peak specificity and decrease the rate of false-positives, maternal-null irf6-/- embryos were injected with zebrafish irf6 3xFLAG-tagged mRNA at the 1-cell stage, raised to 4-5 hpf, and used for ChIP-seq with an anti-FLAG M2 antibody. Lastly, all ChIP-seq samples included an input DNA control to account for potential sequence biases throughout the experiment protocol. Because the maternal-to-zygotic transcriptional transition occurs at around 3 hpf, chromatin fixation and crosslinking for ChIP-seq were performed on embryos at 4-5 hpf to maximize the amount of lead time available for the initiation of zygotic transcription and early developmental processes and to capture potential Irf6 binding sites throughout the genome before maternal-null irf6-/- embryo rupture.
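The subtractive rationale above (peaks present in both wild type and maternal-null samples are treated as nonspecific) can be illustrated with a simple interval-overlap filter. This is only a sketch under stated assumptions: the function names, the half-open coordinate convention, the "any overlap" rule, and the coordinates are all hypothetical, and the text does not specify how the subtraction was actually implemented.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def subtract_background(wild_type_peaks, null_peaks):
    """Keep wild type peaks that do not overlap any peak called in the
    maternal-null sample (assumed to represent nonspecific antibody binding)."""
    return [
        peak for peak in wild_type_peaks
        if not any(overlaps(peak, background) for background in null_peaks)
    ]

# Hypothetical peak coordinates, for illustration only
wild_type_peaks = [("chr1", 100, 300), ("chr1", 1000, 1200), ("chr2", 50, 250)]
null_peaks = [("chr1", 250, 400), ("chr2", 500, 700)]

specific_peaks = subtract_background(wild_type_peaks, null_peaks)
print(specific_peaks)
```

The nested scan is quadratic in the number of peaks, which is fine for a toy example; a genome-scale analysis would typically use sorted intervals or an established interval tool.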

The ChIP-seq sequencing reads were aligned to the zebrafish reference genome (Zv9) and input-normalized coverage tracks were used for peak identification. Genes adjacent to ChIP-seq peaks were identified, especially genes where ChIP-seq peaks lay within a few hundred base pairs upstream of the transcriptional start site, and the gene list along with the Irf6 ChIP-seq peak locations were collated. The candidate gene list was analyzed using Gene Ontology analysis262,263, which revealed developmental processes potentially relevant to Irf6 functions in embryogenesis and craniofacial development including convergence-extension, Wnt signaling, anterior-posterior pattern specification, epithelial development and morphogenesis, cell surface receptor signaling, and many others (Figure 27D).
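Assigning peaks to genes by their position relative to the transcriptional start site (TSS) can be sketched as below. The 500 bp window is an assumed stand-in for "a few hundred base pairs", and the function names, gene names, and coordinates are hypothetical. The one point the sketch is meant to show is that "upstream" depends on strand.

```python
def upstream_distance(peak_center, tss, strand):
    """Distance from the TSS to the peak center, positive when the peak lies
    upstream of the TSS relative to the direction of transcription."""
    return tss - peak_center if strand == "+" else peak_center - tss

def promoter_proximal_genes(peaks, genes, max_upstream=500):
    """Return names of genes with a peak center within max_upstream bp
    upstream of (or exactly at) their TSS. max_upstream is an assumed value."""
    hits = set()
    for chrom, start, end in peaks:
        center = (start + end) // 2
        for name, gene_chrom, tss, strand in genes:
            if gene_chrom != chrom:
                continue
            if 0 <= upstream_distance(center, tss, strand) <= max_upstream:
                hits.add(name)
    return sorted(hits)

# Hypothetical peaks (chrom, start, end) and genes (name, chrom, TSS, strand)
peaks = [("chr1", 900, 1000), ("chr1", 5100, 5300), ("chr2", 100, 200)]
genes = [
    ("geneA", "chr1", 1100, "+"),  # peak center 950 is 150 bp upstream
    ("geneB", "chr1", 5000, "-"),  # minus strand: center 5200 is 200 bp upstream
    ("geneC", "chr2", 5000, "+"),  # nearest peak is 4850 bp away, outside the window
]

proximal = promoter_proximal_genes(peaks, genes)
print(proximal)
```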

Prioritization of direct Irf6 transcriptional target genes for further investigation

The candidate gene lists from the mRNA-seq and ChIP-seq experiments were intersected to isolate a dataset containing genes that are both direct transcriptional targets of Irf6 and significantly downregulated in the absence of Irf6 function in zebrafish embryos at 4-5 hpf. While significant downregulation of a gene in the absence of Irf6 function does not necessarily equate to the importance of that gene in performing the core downstream developmental functions of Irf6 during zebrafish embryogenesis, it does suggest that Irf6 protein could be the primary driver of transcriptional activation for that gene.

Figure 27: Irf6 chromatin immunoprecipitation and next-generation sequencing identified Irf6 genomic binding sites and biological processes under Irf6 transcriptional regulation.

(A) Representative schematic of Irf6 western blots in zebrafish comparing wild type embryos with mz-irf6-/-. Wild type samples displayed an Irf6-specific band in addition to non-randomly distributed nonspecific bands that occurred in identical patterns in mz-irf6-/- samples. (B) Bioanalyzer result for a representative ChIP-seq sequencing library displaying fragment distribution and lack of adapter contamination. (C) Rationale for subtractive Irf6 ChIP-seq to increase peak specificity. (D) Gene Ontology analysis for genes identified as zebrafish Irf6 transcriptional targets from Irf6 ChIP-seq.

From the mRNA-seq and ChIP-seq overlap, a dataset of 320 genes was identified (Figure 28A) in which genes were both downregulated over two-fold in the absence of Irf6 function and had strong Irf6 ChIP-seq binding peaks within the proximal gene regulatory regions. This dataset was further refined to 277 genes by excluding unannotated zebrafish genes and those without direct homologs in humans and identifying genes with strong cross-species conservation of DNA and protein sequences (Figure 28B). These genes were then cross-referenced with the FaceBase data on mouse embryonic facial prominence gene expression to identify genes with relevant craniofacial spatiotemporal gene expression patterns and further narrow the list to 271 genes264 (Figure 28C). Lastly, with reviews of published literature, examination of the mouse Gene Expression Database (www.informatics.jax.org) and Zebrafish Information Network (www.zfin.org) for spatiotemporal gene expression patterns and mutant models, and experimental results generated through our own experiments, 18 genes were selected for further experimental analyses in zebrafish, mouse, and humans (Figure 28D). Four genes with particularly interesting mechanisms-of-action, dact1, esrp1, flrt3, and rspo3, became the focus of our experiments. The experimental results generated for the gene esrp1 will be the focus of this dissertation.
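The first three filtering steps of this prioritization (Figure 28A-C) amount to successive set intersections, which can be sketched as follows. The set memberships below are toy values chosen for illustration, the placeholder genes geneW through geneZ are hypothetical, and the function name `prioritize` is an assumption. The final manual curation step (Figure 28D) is not modeled.

```python
def prioritize(downregulated, chip_bound, conserved, facebase_expressed):
    """Apply the filters in order and return the gene set after each step."""
    overlap = downregulated & chip_bound                 # mRNA-seq and ChIP-seq overlap
    conserved_hits = overlap & conserved                 # annotated, conserved human homolog
    craniofacial = conserved_hits & facebase_expressed   # relevant facial expression pattern
    return overlap, conserved_hits, craniofacial

# Toy gene sets, for illustration only
downregulated = {"esrp1", "dact1", "geneX", "geneY", "geneZ"}
chip_bound = {"esrp1", "dact1", "geneX", "geneY", "geneW"}
conserved = {"esrp1", "dact1", "geneX"}
facebase_expressed = {"esrp1", "dact1", "geneW"}

overlap, conserved_hits, craniofacial = prioritize(
    downregulated, chip_bound, conserved, facebase_expressed
)
print(len(overlap), len(conserved_hits), len(craniofacial))
print(sorted(craniofacial))
```

Returning the intermediate sets makes it easy to report how many genes survive each filter, in the same way the counts are reported in the text.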

Identification of CL/P patient variants in IRF6 transcriptional target genes

We hypothesized that direct Irf6 transcriptional target genes should not only reveal the biological mechanisms underpinning its developmental functions, but also account for a large percentage of the missing inheritance in VWS patient populations who do not have IRF6 mutations. To interrogate this hypothesis, we worked with our collaborators (Dr. Elizabeth Leslie and Dr. Mary Marazita), who used whole genome sequencing data of human CL/P case-parent trios from the Gabriella Miller Kids First Pediatric Research Program in a preliminary study to identify potentially deleterious de novo coding variants in IRF6 transcriptional targets. The whole genome sequencing data used in the preliminary study consisted of two cohorts; the first cohort contained 372 trios of European ancestry, and the second contained 276 trios of Colombian and admixed ancestries. A third cohort containing 268 trios of African and Asian ancestries will also be sequenced and included in subsequent analyses but is beyond the current scope of this dissertation. From the narrowed list of 271 genes from the mRNA-seq and ChIP-seq overlap, with the cross-species sequence conservation and FaceBase spatiotemporal gene expression filters applied (Figure 28C), six putative IRF6 transcriptional target genes with putative deleterious de novo coding mutations were identified: WNT11, ETV4, KEAP1, METRN, PLEKHN1, and RAP1GAP. Whether these coding mutations actually impair the functions of the resulting proteins, and whether they are pathogenic, will require experimental validation through approaches like those described in Chapters One and Two of this dissertation. In addition, with more CL/P patient case-parent trios being sequenced in the near future, more IRF6 transcriptional target genes with potentially pathogenic variants should be identified to suggest novel biological pathways that constitute critical IRF6 developmental functions.
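The core logic of screening case-parent trios for de novo variants in a target gene list can be sketched as below. This is a deliberately simplified illustration and not the collaborators' pipeline: genotypes are reduced to alternate-allele counts, the `predicted_deleterious` flag stands in for whatever annotation was actually used, and all gene names and records are hypothetical.

```python
def is_de_novo(child, mother, father):
    """Genotypes are alternate-allele counts (0, 1, or 2). A variant is treated
    as de novo when the child carries it and neither parent does."""
    return child > 0 and mother == 0 and father == 0

def de_novo_target_genes(variants, target_genes):
    """Return target genes carrying a predicted-deleterious de novo variant."""
    return sorted({
        v["gene"] for v in variants
        if v["gene"] in target_genes
        and v["predicted_deleterious"]
        and is_de_novo(v["child"], v["mother"], v["father"])
    })

# Hypothetical target list and variant records, for illustration only
target_genes = {"GENE1", "GENE2", "GENE3"}
variants = [
    {"gene": "GENE1", "child": 1, "mother": 0, "father": 0, "predicted_deleterious": True},
    {"gene": "GENE2", "child": 1, "mother": 1, "father": 0, "predicted_deleterious": True},   # inherited
    {"gene": "GENE3", "child": 1, "mother": 0, "father": 0, "predicted_deleterious": False},  # predicted benign
    {"gene": "GENE4", "child": 1, "mother": 0, "father": 0, "predicted_deleterious": True},   # not a target gene
]

candidates = de_novo_target_genes(variants, target_genes)
print(candidates)
```

A real analysis would also need to account for genotype quality and sequencing depth in all three family members, since a missed parental allele makes an inherited variant look de novo.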

Morpholino knockdown of Irf6 transcriptional target genes

In order to evaluate the craniofacial development functions of dact1, esrp1, flrt3, and rspo3 in the zebrafish model, translation-blocking morpholinos were designed for the respective genes and individually microinjected into wild type zebrafish embryos at the one-cell stage to knock down their protein levels during embryonic development. Unfortunately, none of the morpholinos tested produced any obvious craniofacial morphant phenotypes at 96 hpf at any nontoxic concentration tested. However, the negative results could be due to the compensatory functions of the paralogs of those genes, including dact2, esrp2, flrt1/2, and rspo1/2, all of which show overlapping but distinct expression patterns during mouse embryonic craniofacial development in the mouse Gene Expression Database and FaceBase datasets (data not shown). Experiments are ongoing to determine whether combinatorial morpholino knock-downs of paralogs will generate craniofacial morphant phenotypes. Thus far, our preliminary studies have demonstrated that although dact1 or dact2 translation-blocking morpholinos alone did not produce morphant phenotypes, dact1 and dact2 morpholinos co-injected at low dosages could produce a midline craniofacial defect with a rod-shaped ethmoid plate and hypomorphic pharyngeal cartilages when examined by Alcian blue staining at 96 hpf (data not shown).

Figure 28: Irf6 downstream transcriptional target gene prioritization schematic.

(A) Overlap of genes from ChIP-seq and mRNA-seq results comparing wild type and maternal-null irf6-/- embryos at 4-5 hpf before periderm rupture. (B) Selection of genes with strong cross-species conservation in DNA and polypeptide sequence. (C) FaceBase mouse facial prominence spatiotemporal gene expression patterns. (D) Final candidate gene selection based on published literature, zebrafish and mouse spatiotemporal gene expression data and mutant models, human patient variants, and unpublished experimental data.

CRISPR-Cas9 gene editing of candidate Irf6 transcriptional target genes

In addition to potential gene compensation from paralogs of dact1, esrp1, flrt3, and rspo3, another potential explanation of why no morphant phenotypes were observed in single-morpholino knock-downs is incomplete reduction in function before the morpholino toxicity limits are reached. It has been previously demonstrated that morpholinos can nonspecifically activate and lead to apoptosis and nonspecific morphant phenotypes265. Therefore, if the maximum morpholino dosage cannot sufficiently reduce protein translation, then no morphant phenotypes would be precipitated. Also, our experiments with irf6 in zebrafish showed that many genes in this gene regulatory network were present as maternal transcripts and proteins, possibly due to their roles in periderm maturation during early zebrafish embryonic development. Maternal proteins would escape morpholino-mediated knock-down and obscure downstream morphant phenotype analyses. In order to generate zebrafish models of the Irf6 transcriptional target genes, CRISPR-mediated mutagenesis was used to induce indels and early frameshift/nonsense mutations in the critical functional domains within dact1, esrp1, flrt3, and rspo3, much like the approach taken in generating the zebrafish irf6 model. In certain situations, CRISPR-mediated gene mutagenesis in animal models has been previously reported to inadvertently trigger gene expression changes that compensate for the loss-of-function of the disrupted gene266. This effect is especially prominent in zebrafish gene disruption models where paralogs with compensatory functions are present, which could unfortunately be the case with all four of the Irf6 transcriptional target genes being studied. In anticipation of this possibility, for the top two candidate genes (dact1 and esrp1), their paralogs dact2 and esrp2 were also mutated by CRISPR-mediated mutagenesis.

For each gene (dact1/2, esrp1/2, flrt3, and rspo3), several sgRNAs were designed targeting critical functional domains and individually microinjected with Cas9 protein into wild type embryos at the one-cell stage. Indel mutations were detected by microsatellite analysis and the most efficient sgRNA was used for multiple rounds of microinjections to generate F0 mosaic embryos. F0 adults were then out-crossed to wild type zebrafish to generate F1 embryos, which in adulthood were tail-clipped to identify heterozygous adult animals carrying frameshift indel mutations that caused an early nonsense mutation and truncation of the wild type protein. These founder F1 adults were then out-crossed to wild type zebrafish to produce a large number of heterozygous F2s and to decrease the likelihood of nonspecific genomic alterations from CRISPR genome editing carrying over and reaching homozygosity in subsequent generations. Currently, mutant zebrafish lines for all genes are in different stages of breeding (dact1: F2 adults; dact2: F2 juveniles; esrp1: F2 adults; esrp2: F0 juveniles; flrt3: F2 adults; and rspo3: F2 adults). Heterozygous F2 adults for each gene will be in-crossed to produce F3 homozygous embryos for phenotypic and molecular analyses. Moreover, mutant lines of gene paralogs will be intercrossed with each other to generate double-heterozygous F3 embryos for the eventual generation of double-homozygous null embryos for phenotypic and molecular analyses.
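The selection criterion for founder alleles (an indel that shifts the reading frame and introduces an early stop codon) can be illustrated with a short sketch. The sequences below are invented toy sequences, not alleles from this study, and the function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def is_frameshift(ref_allele, alt_allele):
    """An indel shifts the reading frame when the net length change
    is not a multiple of three."""
    return (len(alt_allele) - len(ref_allele)) % 3 != 0

def first_stop_codon(cds):
    """Return the zero-based codon index of the first in-frame stop codon,
    or None if the sequence contains no in-frame stop."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None

# Toy coding sequence: ATG GAC TTG ACC AAG TAA (stop at codon index 5)
wild_type_cds = "ATGGACTTGACCAAGTAA"
# Hypothetical 1 bp deletion of the G at position 3 (zero-based)
mutant_cds = wild_type_cds[:3] + wild_type_cds[4:]

print(is_frameshift("G", ""))           # 1 bp deletion
print(first_stop_codon(wild_type_cds))  # normal stop
print(first_stop_codon(mutant_cds))     # premature stop after the frameshift
```

In this toy case the deletion changes the reading frame to ATG ACT TGA, so translation would terminate after two codons instead of five.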

Spatiotemporal expression of Irf6 transcriptional target esrp1 in zebrafish

One of the Irf6 downstream transcriptional targets, epithelial splicing regulatory protein 1 (esrp1, ENSDARG00000011245), has a particularly fascinating molecular mechanism involving alternative splicing of fibroblast growth factor receptors during embryonic development, which will be described in detail in the Discussions section. In our ChIP-seq analysis, Irf6 was found to have a strong binding peak at the esrp1 transcriptional start site and another peak approximately 10 kb upstream (Figure 29A) in a putative enhancer region marked by H3K4me1 (data not shown). Using qPCR, esrp1 was found to be downregulated over five-fold in maternal-null irf6-/- embryos at 4 hpf compared to wild type, and its expression was restored to levels statistically indistinguishable from wild type in maternal-null irf6-/- embryos microinjected with wild type irf6 mRNA (Figure 29B). Taken together, these results suggest that Irf6 could be a direct transcriptional regulator of esrp1 gene expression during early zebrafish embryonic development.
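Relative expression from qPCR is commonly computed with the 2^-ΔΔCt method. The text does not state which quantification method was used here, so the sketch below is an assumption for illustration only: the function names and all Ct values are hypothetical. The error-bar helper follows the 2xSEM convention stated in the Figure 29 legend.

```python
import math
from statistics import mean, stdev

def relative_expression(ct_target, ct_reference, ct_target_ctrl, ct_reference_ctrl):
    """Fold expression of a target gene in a sample relative to a control,
    normalized to a reference gene (2^-ddCt; assumes ~100% primer efficiency)."""
    delta_sample = ct_target - ct_reference
    delta_control = ct_target_ctrl - ct_reference_ctrl
    return 2 ** -(delta_sample - delta_control)

def mean_and_two_sem(values):
    """Mean and 2 x standard error of the mean for replicate measurements."""
    sem = stdev(values) / math.sqrt(len(values))
    return mean(values), 2 * sem

# Hypothetical Ct values: the target crosses threshold two cycles later
# in the mutant, with an unchanged reference gene.
fold = relative_expression(26.0, 18.0, 24.0, 18.0)
print(fold)

# Hypothetical replicate fold-change values (n = 3)
center, error_bar = mean_and_two_sem([1.0, 2.0, 3.0])
print(center, round(error_bar, 4))
```

Because each cycle corresponds to a doubling, a difference of two cycles gives a four-fold change; a five-fold downregulation would correspond to a ΔΔCt of roughly 2.3 cycles.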

Figure 29: Gene expression patterns of esrp1 during zebrafish embryonic development.

(A) Irf6 ChIP-seq reads aligned to Zv9 and peaks adjacent to esrp1, demonstrating a peak directly overlying the transcriptional start site and another peak 10 kb upstream. (B) qPCR gene expression analysis for esrp1, showing approximately five-fold esrp1 expression downregulation in mz-irf6-/- embryos compared to wild type at 4 hpf, and rescued esrp1 gene expression in mz-irf6-/- embryos injected with wild type zebrafish irf6 mRNA. Error bar = 2xSEM, n = 3. (C-H) Wholemount in situ hybridization results for esrp1 at 48 hpf (C-D), 72 hpf (E-F) and 96 hpf (G-H). (C) At 48 hpf, esrp1 is expressed in the stomodeum/oropharynx (arrow), nasal pit, otic placode, pectoral fin bud, lateral line neuromast, and generalized embryonic epithelium. (E) At 72 hpf, esrp1 is expressed in the oropharynx (arrow). (G) At 96 hpf, esrp1 is expressed in the epithelium of the anterior oropharynx (arrow) and (H) posterior pharyngeal apparatus (arrow). NP: nasal pits; OP: otic placode; FB: pectoral fin bud; NM: lateral line neuromast. Scale bar = 150 µm.

Furthermore, RNA whole-mount in situ hybridization was performed on wild type zebrafish embryos at various developmental stages to determine the spatiotemporal expression patterns of esrp1. From 4-24 hpf (time points examined: sphere, 4 hpf; shield, 6 hpf; bud, 10 hpf; 10-somites, 14 hpf; and prim-6, 24 hpf), esrp1 expression was restricted to the periderm but did not show any special regionalized expression patterns (data not shown). At 48 hpf, esrp1 gene expression was reduced in the periderm and displayed concentrated signals in the otic vesicles, pectoral fin buds, lateral line neuromasts (Figure 29C), and the oropharynx region (Figure 29D arrow). Interestingly, this spatiotemporal gene expression pattern corresponded to structures affected in the Irf6-/- mouse model, where oropharyngeal epithelial adhesions and limb truncations were observed, and in the Esrp1-/- mouse model, where hearing loss due to cochlear development defects was observed267. At 72 hpf, esrp1 expression was significantly downregulated in the periderm and became restricted to the oropharyngeal epithelium (Figure 29E) on both the dorsal and ventral surfaces starting from the anterior mouth opening (Figure 29F arrow). Finally, at 96 hpf, esrp1 expression continued in the oropharynx epithelia (Figure 29G) and expanded to the epithelia surrounding ceratobranchial cartilages (Figure 29H). Overall, the spatiotemporal expression patterns of esrp1 during zebrafish embryogenesis largely remained in epithelial structures but became restricted to regions that are potentially important for craniofacial development.

Mouse embryonic palatal shelf Esrp1 expression and Esrp1-/- phenotypes

The IRF6 transcriptional network has demonstrated significant cross-species conservation between zebrafish, mice, and humans. Moreover, the zebrafish model of craniofacial development has shown significant conservation of molecular and cellular developmental processes compared to mammalian systems. However, because the zebrafish ethmoid plate (model of the mammalian palate) is anatomically the base of the ventral neurocranium and not the nasal cavity, it is important to demonstrate that our experimental findings derived from zebrafish are reproducible in mammalian models as well.


Irf6 has been previously shown to be expressed in the oral epithelium of mice during craniofacial morphogenesis, and this finding has been independently reproduced in our experiments, where Irf6 protein expression was detected in the oral epithelium surrounding the secondary palatal shelves and tongue at E14.5 (Figure 30A and 30A’). Because our ChIP-seq and mRNA-seq results showed that esrp1 is potentially a direct downstream transcriptional target of Irf6 in zebrafish embryos, we hypothesized that if this relationship remained conserved in the mouse model, Irf6 and Esrp1 would share significant overlaps in spatiotemporal gene expression in key craniofacial structures during palatogenesis. In fact, Esrp1 was found to be expressed in the same oral epithelial pattern as Irf6 using immunohistochemistry on E14.5 embryo coronal cryosections (Figure 30B and 30B’). The overlap in Esrp1 and Irf6 expression in the oral epithelium supports the hypothesis that Esrp1 could be an important executor of Irf6 downstream molecular functions in craniofacial development.

The mouse mutant of Esrp1 has been previously published and displayed an orofacial cleft phenotype268. In Esrp1-/- P0 pups, bilateral OFCs could be observed through both the primary and secondary palates (Figure 30C and 30D). Furthermore, the clefts extended bilaterally beyond the hard palate through the upper lips into the anterior nares (Figure 30E and 30F arrowheads). In the same study, neither Esrp2-/- nor Esrp1+/-; Esrp2-/- mutants exhibited orofacial clefts. However, the mutant phenotypes observed in Esrp1-/-; Esrp2-/- double mutants were more severe than those of Esrp1-/- mutants alone. Taken together, the results suggested that Esrp1 is the primary functional effector in the Irf6 gene regulatory network during craniofacial development, and that some degree of functional compensation exists between the two Esrp paralogs. As mentioned in the General Introductions, orofacial clefts through the primary palate are rare in the mouse model while being extremely common in humans. Therefore, the CLPs observed in Esrp1-/- mutants could be highly informative as a model for studies of palatogenesis mechanisms other than palatal shelf fusion, such as lambdoidal junction fusion during upper lip and nares formation.

Figure 30: Oral epithelial Esrp1 expression and mouse Esrp1-/- orofacial cleft phenotypes.

Immunohistochemistry on coronal sections of E14.5 mouse oral cavities for Irf6 (A-A’) and Esrp1 (B-B’) illustrating overlapping expression patterns in the oral epithelium. Scale bar = 100 µm. (C) Palate morphology of wild type P0 embryo compared to Esrp1-/- (D) which displayed bilateral orofacial clefts through both the primary and secondary palates (arrows). (E) Frontal view of wild type P0 embryo compared to Esrp1-/- (F) which displayed bilateral clefts through the upper lips into the nasal cavity (arrowheads). UL: upper lip; PP: primary palate; SP: secondary palate. Mouse whole-mount images adapted and modified from Bebee et al., eLife 2015268.

Discussions

Esrp1 drives alternative splicing of fibroblast growth factor receptors

From the ChIP-seq and mRNA-seq overlap candidates, several genes displayed potentially interesting mechanisms to regulate critical craniofacial development events. One gene, esrp1, has been previously shown to regulate the alternative splicing of fibroblast growth factor receptors268. The FGF signaling pathway controls many important developmental functions, regulating diverse processes including cell proliferation through RAS-MAPK (Figure 31A), apoptosis through PI3K-AKT (Figure 31B), and cell migration through PLCγ-DAG/IP3 (Figure 31C). FGF receptor proteins contain modular extracellular immunoglobulin (Ig) domains that define their specificity and different intracytoplasmic kinase domains that recruit unique adaptors that affect specific cellular functions269. The modular Ig domains can be interchanged by alternative splicing during post-transcriptional processing. In particular, FGFR2 exists as two isoforms, FGFR2-IIIb and FGFR2-IIIc, which differ in the IgIII domain exon included and are associated with epithelial and mesenchymal tissues, respectively267. This alternative splicing is performed by ESRP1, which favors exon 8 and excludes exon 9 to produce FGFR2-IIIb, an epithelial-specific isoform (Figure 31D). Without ESRP1, the converse event occurs, and the FGFR2-IIIc mesenchymal-specific isoform is made (Figure 31E).

It has been previously shown that Esrp1 mutations in mice are sufficient to cause a bilateral cleft lip and palate phenotype268. During mammalian palatogenesis, proper fusion between several epithelial tissue seams is required to generate continuous mesenchymal tissues that bridge the bilateral facial prominences. These tissue seams include the MEE of the secondary palatal shelves and the lambdoidal junction formed by the FNP and MXP123. The improper fusion of these epithelial tissue seams leads to discontinuous palatal mesenchymes and ultimately orofacial clefts. Although many signaling pathways are involved during these complex fusion events, FGF signaling between the palatal epithelium and mesenchyme is thought to play important roles in coordinating the fusion process by regulating a combination of apoptosis and epithelial-to-mesenchymal transition.

Figure 31: Fibroblast growth factor receptor signaling pathways and mechanism of ESRP1-mediated alternative splicing of FGFR2.

Fibroblast growth factor (FGF) signaling proceeds through transmembrane receptors that dimerize upon ligand binding and activate their cytoplasmic kinase domains. Different molecular and cellular outcomes can be achieved through various adaptor proteins linked to the kinase domain. (A) SOS can activate the MAP kinase pathway to stimulate cell proliferation. (B) GAB1 can activate the PI3 kinase pathway to repress cell death. (C) PLCγ1 can activate the phosphatidylinositol pathway to control cellular migration. (D-E) ESRP1 can affect the alternative splicing of FGFR2 to generate either (D) an epithelial-cell specific isoform that includes exon 7 (IIIa) and exon 8 (IIIb), or (E) a mesenchymal-specific isoform that includes exon 7 (IIIa) and exon 9 (IIIc). Each FGFR2 isoform has its own unique specificities for FGF ligands.

The alternative splicing of FGFR2 between epithelial and mesenchymal isoforms has also been previously reported in vitro to induce EMT behaviors270. Taking into consideration the cellular behaviors of the oral epithelium during craniofacial prominence fusions, the OFC phenotype seen in Esrp1-/- mutants could be in part due to the improper regulation of FGFR2 isoform expression at the epithelial tissue seams requiring EMT for fusion. In addition, because Irf6 is canonically thought to be a master regulator of epithelial maturation, Esrp1-mediated alternative splicing of FGFR2 and other currently uncharacterized targets could represent critical direct downstream functions of Irf6 in orchestrating proper epithelial cell behaviors during craniofacial development.

Zebrafish esrp1/2 mutants and potential FGF interactions

During preparation of this dissertation, a paper was published in Nature Communications examining the evolutionary conservation of epithelial splicing regulatory protein splicing programs and associated gene expression changes across diverse species including zebrafish and mice271. In the zebrafish portion of the study, several whole-mount in situ hybridization results for esrp1/2 on wild type embryos were included that closely matched our experimental results shown earlier in this chapter. Moreover, this group also used CRISPR-Cas9 mutagenesis to produce zebrafish null mutants of esrp1/2 and showed that although esrp1-/- and esrp2-/- single mutants did not exhibit any mutant phenotypes, esrp1-/-; esrp2-/- double mutants exhibited several developmental defects. Over 75% of double-null mutant embryos died between 8-10 dpf, and none survived past 14 dpf271. In addition, double-null mutants exhibited distal dysgenesis of the pectoral fins, smaller inner ear volumes, and abnormal arrangements of the pharyngeal cartilages. Interestingly, these affected structures all corresponded to structures with strong esrp1 expression at 48 hpf (Figure 29C). Most importantly, the esrp1-/-; esrp2-/- double mutant embryos exhibited a cleft lip and palate phenotype where the medial ethmoid plate cells (derived from frontonasal NCCs) are missing but the lateral EP cells (derived from bilateral maxillary NCCs) are unaffected. This produced a near identical phenocopy of the OFC phenotype displayed by wild type embryos treated with dominant-negative irf6-ENR using optogenetics reported in Chapter Three of this dissertation (Figure 24M). This remarkable similarity in mutant phenotypes between esrp1/2 loss-of-function and post-epiboly Irf6 functional inhibition, coupled with our ChIP-seq and mRNA-seq results indicating that esrp1 is a direct downstream transcriptional target of Irf6, suggested that Esrp1 could carry out significant downstream effector functions of Irf6 in zebrafish craniofacial development by potentially affecting frontonasal NCC biology and their union with maxillary NCCs during ethmoid plate morphogenesis.

The observation that esrp1/2 double mutants phenocopy the OFC phenotype of dominant-negative irf6-ENR optogenetic embryos also provides insights into potential molecular mechanisms behind the OFC phenotype observed. As the results in Chapter Three showed, the missing medial ethmoid plate phenotype is likely not due to defective CNCC differentiation into chondrocytes or decreased CNCC proliferation, but rather a combination of defective CNCC migration to the future mouth opening before 24 hpf and selective apoptosis of the frontonasal CNCCs that did arrive at their proper destination. Although the alternative splicing repertoire of Esrp1 is substantial, its primary functional role is the alternative splicing of Fgfr2 into the epithelial-cell specific IIIb isoform. If the esrp1-/-; esrp2-/- double mutant and dominant-negative irf6-ENR phenotypes are caused by similar mechanisms, then our future investigation into the molecular mechanisms behind the optogenetic dominant-negative irf6-ENR medial ethmoid plate phenotype could be focused on FGF-specific developmental functions.

The FGF signaling pathway has been previously demonstrated to play critical roles during craniofacial development in zebrafish, chicks, and mice, especially in the regulation of the unique anterior cranial neural crest cell population. As described in the General Introductions section, the anterior-most CNCCs travel along a unique migration route anteromedial to the eye primordia to condense on the oral ectoderm of the roof of the future mouth opening. During migration, CNCCs express cxcr4 and pdgfra and encounter numerous chemoattractants including Sdf1 and Pdgfaa secreted from anterior structures like the optic stalk to define their migratory pathways toward their final destination171,279. Whether FGF signaling plays a critical role in the differentiation of chemokine-secreting structures or in the subsequent induction of chemokine gene expression is currently unknown. However, it has been previously shown in the mouse and chick models that FGF signaling plays an important role in defining the frontonasal ectodermal zone, an area at the roof of the future mouth opening at the juxtaposition point of Fgf8 and Shh expression domains93.

FGF signaling from the oral epithelium to the underlying NCC mesenchyme is required for CNCC survival and later for chondrogenic differentiation93,94. Consistent with the experimental result that frontonasal CNCCs exhibited increased cell death at 24 hpf in dominant-negative optogenetic irf6-ENR embryos, this increase in apoptosis could be caused by perturbations in FGF signaling resulting from disruptions in Irf6 protein function and the subsequent downregulation of esrp1 transcriptional activation.

Biological validation of CL/P patient variants in IRF6 transcriptional targets

Because previous publications successfully identified potential pathogenic variants in Irf6 downstream transcriptional targets like grhl3 and klf4 from non-IRF6 VWS human patients, we hypothesized that a more comprehensive repertoire of Irf6 transcriptional targets would reveal a greater number of genes with variants contributing to CL/P pathogenesis in humans. Based on the preliminary analysis by our computational collaborators of a relatively small cohort of CL/P case-parent trios, putative deleterious de novo coding mutations were identified in six Irf6 transcriptional target genes: WNT11, ETV4, KEAP1, METRN, PLEKHN1, and RAP1GAP. With greater numbers of trios and larger cohorts, more genes with potentially pathogenic gene variants will be identified to account for not only an increasingly larger percentage of the missing heritability of non-IRF6 VWS patients, but also interesting aspects of IRF6 biology during craniofacial development.

As we have shown in Chapter Two of this dissertation with rare human IRF6 gene variants, current statistical and computational methods for assigning rare gene variant pathogenicities and missense variant protein functions are often misleading when taken in isolation. Thus, in order to complement these in silico techniques, a principle similar to that described in Chapter Two will be applied to biologically evaluate the human gene variants of unknown significance obtained by our computational collaborators using zebrafish mutant gene model rescue assays261. We have begun to generate zebrafish mutants for a few of the genes with identified human variants like etv4. Once the potential mutant phenotypes are characterized, wild type etv4 mRNA and mRNAs with human variants will be microinjected into mutant embryos in an attempt to rescue those mutant phenotypes.

Moreover, genes with identified human variants will be molecularly characterized in the zebrafish and mouse models for their spatiotemporal expression patterns during embryogenesis in order to determine their expression in relevant craniofacial structures during palatogenesis.

IRF6 transcriptional target validation in cell culture, mouse, and zebrafish

The ChIP-seq and mRNA-seq overlap results identified genes that theoretically should be direct transcriptional targets of Irf6. However, despite strong Irf6 binding to the transcriptional start sites and proximal cis regulatory elements and corresponding downregulations in gene expression with ablation of Irf6 functions, further experimental validations are necessary to establish greater confidence in the transcriptional regulatory relationships between Irf6 and its target genes. In order to evaluate the transcriptional functions of Irf6 binding to putative ChIP-seq peaks, the ChIP-seq peaks within the proximal cis regulatory regions for each transcriptional target gene will be inserted upstream of a minimal promoter plasmid driving the expression of firefly luciferase. This plasmid will then be transfected into primary neonatal human keratinocytes (or initially into HEK293T cells) with and without a mammalian gene expression plasmid containing zebrafish irf6 to determine whether Irf6 proteins will bind to the putative ChIP-seq peaks and be sufficient to drive increased transcriptional activation downstream.

To extrapolate our zebrafish findings to mammalian systems and evaluate cross-species conservation, several additional experiments will be conducted in the mouse model. While we have demonstrated co-localization of Irf6 and Esrp1 protein expression in the mouse oral epithelium, this relationship needs to be established for other critical time points during embryonic craniofacial development through immunohistochemistry and whole-mount in situ hybridization experiments. Furthermore, we are in the process of procuring Irf6R84C/+ and Esrp1+/- mice from our collaborators Dr. Yang Chai and Dr. Russ Carstens, respectively. If Esrp1 is also an Irf6 transcriptional target in mice, and the significant downregulation in esrp1 gene expression in maternal-null irf6-/- zebrafish embryos holds true, then we would expect decreased Esrp1 expression in the oral epithelium of Irf6R84C/R84C embryos. Analogous experiments will also be performed for several other putative Irf6 transcriptional target genes including Dact1, Flrt3, and Rspo3.

Finally, in order to determine the presence of potential genetic interactions between Irf6 and Esrp1, Irf6R84C animals will be intercrossed with Esrp1+/- animals. Although Irf6R84C/+ heterozygotes occasionally exhibited intraoral epithelial adhesions, they did not exhibit the highly penetrant OFC phenotype observed in Irf6R84C/R84C homozygotes63. Meanwhile, Esrp1+/- heterozygotes did not exhibit mutant phenotypes63,268. Therefore, if the direct transcription regulatory relationship of decreased Irf6 function leading to decreased Esrp1 expression remains valid in the mouse model during craniofacial morphogenesis, then Irf6R84C/+; Esrp1+/- compound heterozygotes could exhibit a mutant cleft lip and palate phenotype when neither gene alone could in its heterozygous state. Such a finding would further support the presence of a genetic interaction between Irf6 and Esrp1 and strengthen the evidence for their direct transcription regulatory relationship.
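To make the proposed intercross concrete, the following is a minimal sketch of the expected offspring genotype fractions. It assumes that the parents are an Irf6R84C/+; Esrp1+/+ animal and an Irf6+/+; Esrp1+/- animal, which is one reading of the cross described above and is not stated explicitly in the text. The function names and the allele labels are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def gametes(genotype):
    """List the equally likely gametes of a two-locus genotype.

    genotype is ((a1, a2), (b1, b2)): the two alleles carried at each locus.
    Independent assortment of the two loci is assumed.
    """
    locus1, locus2 = genotype
    return list(product(locus1, locus2))


def cross(parent1, parent2):
    """Return expected offspring genotype fractions for parent1 x parent2."""
    counts = Counter()
    for g1, g2 in product(gametes(parent1), gametes(parent2)):
        # Pair the alleles locus by locus; sort so that allele order is ignored.
        offspring = tuple(tuple(sorted(pair)) for pair in zip(g1, g2))
        counts[offspring] += 1
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Assumed parental genotypes, written as (Irf6 alleles, Esrp1 alleles).
irf6_het = (("R84C", "+"), ("+", "+"))
esrp1_het = (("+", "+"), ("+", "-"))

ratios = cross(irf6_het, esrp1_het)
compound_het = (("+", "R84C"), ("+", "-"))
print(ratios[compound_het])  # expected fraction of compound heterozygotes
```

Under these assumptions, one quarter of the offspring are expected to be Irf6R84C/+; Esrp1+/- compound heterozygotes, with the remaining three quarters split evenly among wild type and the two single heterozygotes. Because each parent is heterozygous at only one locus, this expectation does not depend on whether the two genes are linked.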

Conclusions

Establishment of the zebrafish model for studies of IRF6 in craniofacial development

The zebrafish has been successfully applied to generate many models of human disease.

Through previous publications and research described in this dissertation, the zebrafish has now become an established model for studying the embryonic functions of irf6 in epithelial maturation, craniofacial development, and orofacial cleft pathogenesis. While IRF6 belongs to a family of nine transcription factors, it is the only member without a role in the interferon immunological response to viral infections and the only member associated with orofacial cleft pathogenesis in humans45.

The study that initially identified and described the zebrafish homolog of IRF6 and our own protein sequence analysis revealed significant polypeptide sequence conservation between human and zebrafish Irf6. In addition, the zebrafish Irf6 homolog retained the key helix-loop-helix DNA-binding and SMIR/IAD protein-binding domains and exhibited increased sequence conservation in those functional domains compared to the protein overall (Figure 5). Furthermore, the conservation in IRF6 sequence and structure across human and zebrafish allowed human IRF6 to rescue not only the molecular expression patterns of genes previously known to be downregulated with reductions in Irf6 function, but also the embryonic lethal periderm rupture and craniofacial development in our maternal-null irf6-/- zebrafish model (Figure 14). Taken together, our experimental results suggested that the molecular functions of human IRF6 are conserved in zebrafish and that the zebrafish model could be used as a platform for discovery of novel IRF6 functions. In this dissertation, we primarily examined the roles of Irf6 as a transcription factor. However, IRF6 could potentially possess non-transcription factor functions important for epithelial maturation and craniofacial development. For example, IRF6 has been previously hypothesized to regulate P63 activities in the basal epidermis through proteasome-mediated degradation208. Although the mechanism of this negative-feedback relationship between IRF6 and P63 is currently unknown and was not tested in our current project, the experimental finding that human IRF6 protein could rescue not only the molecular gene expression but also periderm rupture in maternal-null irf6-/- zebrafish embryos suggests a more comprehensive conservation of human IRF6 functions and gene regulatory networks in zebrafish, which would be required for human IRF6 to fully rescue the zebrafish mutant phenotypes at the tissue and organismal levels.

Although our zebrafish zygotic-irf6-/- embryos did not exhibit observable mutant phenotypes during embryonic development, our maternal-irf6-/- embryos, regardless of their zygotic genotypes, exhibited a completely penetrant embryonic lethal periderm rupture phenotype (Figure 12). These findings illustrated the importance of irf6 maternal transcripts for critical early events in zebrafish embryonic development prior to the maternal-zygotic transition and suggested a possible explanation as to why previous forward genetic screens for craniofacial defects that relied on the zygotic phenotypes of genes did not identify irf6 as a candidate142,272,273. With injections of zebrafish or human IRF6 mRNA into maternal-null irf6-/- embryos, the periderm rupture phenotype was rescued. In addition, the rescued embryos exhibited wild type craniofacial morphogenesis and eventually developed into viable/fertile adults. Given the molecularly null nature of our irf6 CRISPR allele and the rapid degradation of mutant irf6 transcripts, the injected wild type irf6 is extremely unlikely to have fulfilled the developmental requirements for irf6 into adulthood, which suggests that irf6 could be critical only for the early stages of embryonic development in zebrafish. However, based on our evaluation of maternal Irf6 protein dynamics, Irf6 protein appears to be highly stable and was detectable in non-maternal-null zygotic-null irf6-/- embryos up to 96 hpf (data not shown).

Taken together, our results suggested that the amount of wild type maternal irf6 deposited by the mother could be sufficient to meet all the subsequent post-epiboly requirements of irf6 and prevent the precipitation of developmental defects in zebrafish (Figure 32). This explanation is possible if the level of Irf6 activity required to prevent periderm rupture is substantially higher than the level of Irf6 activity required for the subsequent development of tissues important for epithelial maturation and craniofacial development. In this scenario, irf6 mRNA injection at the one-cell stage to rescue the periderm rupture phenotype of maternal-null irf6-/- embryos would provide sufficient Irf6 protein to not only rescue periderm rupture but also maintain sufficient levels to prevent the precipitation of other mutant embryonic phenotypes as well (Figure 32). With detailed micro-CT analysis by our collaborator, zygotic irf6-/- embryos could actually have mutant craniofacial phenotypes observable in adulthood that we did not characterize in the initial study because they did not impair the viability or fertility of the zebrafish during embryonic or larval development. These craniofacial abnormalities could represent larval or adult requirements of irf6 in zebrafish revealed after maternal or injected irf6 transcripts and proteins have degraded. Further analysis of the degradation kinetics of Irf6, its expression patterns later in zebrafish development, and the craniofacial abnormalities observed in adulthood will be necessary in order to identify and characterize any post-embryonic irf6 functions throughout zebrafish development.

Figure 32: Proposed Irf6 dynamics during zebrafish pre- and post-epiboly development

Zebrafish maternal Irf6 proteins are stable and detectable throughout 0-96 hpf. With our proposed model of Irf6 dynamics, if the level of Irf6 function required to prevent periderm rupture pre-epiboly is higher than the level required for craniofacial development, the presence of maternal or injected wild type Irf6 proteins sufficient for normal periderm development would also mask potential post-epiboly Irf6 requirements during craniofacial development.

Overall, the zebrafish periderm served as a faithful model of the mammalian embryonic oral epithelium and displayed strong cross-species conservation of the IRF6 gene regulatory network.

In addition, the downstream transcriptional target genes of IRF6 in terms of epithelial proliferation and differentiation appeared to be conserved between the zebrafish periderm and the mammalian embryonic periderm and epidermis, together suggesting that the zebrafish periderm could be used as a biological model tissue for novel molecular pathway discoveries surrounding the roles of IRF6 during epithelial maturation and craniofacial development.

Usage of zebrafish irf6 models to biologically evaluate human variant protein functions

Due to the complete penetrance and expressivity of the embryonic lethal periderm rupture of maternal-null irf6-/- embryos, our zebrafish model of irf6 provided a sensitive biological platform for evaluating IRF6 protein functions, especially the functions of IRF6 proteins derived from human missense gene variants of unknown significance with uncertain pathogenicity for disease. Through careful dosage titration studies, nuanced changes in IRF6 protein function could be distinguished.

In our study, variants with amino acid substitutions carrying no detrimental consequence on protein function were identified and likely represented benign human genetic variations not contributory to disease pathogenesis. Furthermore, variants with amino acid substitutions that caused complete loss-of-function proteins were identified and likely represented variants pathogenic for disease with haploinsufficiency or dominant-negative activities. The last category of variants had hypomorphic functions with undetermined pathogenic significance (Figure 18). Since our study was limited and focused on evaluating the biological function of human IRF6 gene variant proteins, it only provided part of the contributory evidence necessary for the comprehensive evaluation of rare human gene variants required for accurate disease pathogenicity assignments according to the ACMG226.

Because our experiments utilized a zebrafish model of IRF6, although human IRF6 mRNA could rescue both the periderm rupture phenotype and craniofacial development in maternal-null irf6-/- embryos, it is possible that not all the functions of human IRF6 are conserved in the zebrafish model and that these non-conserved functions could be pathogenic for disease when deficient in human patients. Indeed, such caveats are associated with the usage of any nonhuman model for experimentation. The possibility of incomplete conservation in IRF6 function and gene regulatory network is important in human patients, where pathogenicity at times depends not only on IRF6 protein functions but also on associated variations in the genetic backgrounds of individuals.

For example, hypomorphic mutations in IRF6 and an upstream transcriptional activator, each alone unable to reduce the IRF6 activity enough to precipitate orofacial cleft phenotypes, together could reduce IRF6 activity levels enough to cause disease onset. Furthermore, amino acid residues that when mutated do not have an effect on Irf6 protein functions in our zebrafish model could interact with protein co-factors that are not conserved in the zebrafish irf6 gene regulatory network. Amino acid substitutions of those residues would not cause detectable changes in Irf6 protein function in our zebrafish gene variant dosage titration assay but would alter endogenous human IRF6 protein interactions and possibly ablate important IRF6 functions. Nonetheless, this hypothesis is unlikely because the amino acid residues described above would likely be non-conserved across species while those tested by our assay demonstrated significant cross-species sequence conservation.

Furthermore, the IRF6 missense variants that retained complete wild type protein functions were also found in healthy individuals in public exome/genome databases and therefore more likely represent rare benign genetic variations between individuals not captured by our currently limited cross-section of human genetics rather than variants responsible for disease pathogenesis219.

Because our IRF6 variant protein function assay is based on an in vivo biological platform, the utility of the assay could be further extended to identify the molecular mechanisms behind the loss-of-function mutations through assays such as RT-qPCR, western blotting, electrophoretic mobility shift assay (EMSA), and co-immunoprecipitation (co-IP). After IRF6 variant mRNA injection into maternal-null irf6-/- zebrafish embryos, RT-qPCR could be performed to identify gene variants that caused transcript instability and reduced production of IRF6 variant proteins. In addition, RT-qPCR could also be used to assess IRF6 downstream gene expression changes in variant-rescued embryos compared to wild type IRF6-rescued embryos in order to evaluate any nuanced changes in the transcriptional functions of IRF6 variant proteins. At the protein level, western blots could be performed on rescued maternal-null irf6-/- embryos to determine the stability of IRF6 variant proteins compared to wild type. The ability of IRF6 variant proteins to bind to target DNA sequences could be assessed by EMSA, and their ability to associate with other protein co-factors could be assessed by co-IP. Taken together, our maternal-null irf6-/- embryos provided a novel biological platform for evaluating the protein functions of IRF6 variants from human orofacial cleft patients. Moreover, it illustrated a generalizable principle that could be applied to other genes where the corresponding gene disruption models with quantifiable mutant phenotypes could be generated in the zebrafish.

Now at the dawn of a new genomics revolution where personal genetics information is playing an increasingly important role in guiding medical decisions, more reliable methods for the holistic evaluation of patient gene variants and pathogenicity assignments are essential for expanding the translational capacity of clinical genetics toward better treatments for patients.

Zebrafish as a model for human craniofacial development and orofacial cleft pathogenesis

The zebrafish craniofacial development process involves numerous analogous processes compared to mammalian craniofacial development and has been successfully applied as a model to study genes important for craniofacial development and orofacial cleft pathogenesis in humans.

As discussed in the General Introductions, the zebrafish equivalent to the mammalian palate is the ethmoid plate, which is the ventral aspect of the neurocranium (Figure 4). Because zebrafish, unlike humans, do not have a nasal cavity, the ethmoid plate separates the oral cavity from the brain and therefore is not a direct anatomical equivalent to the mammalian hard palate. However, based on lineage tracing experiments of cranial neural crest cells of the ethmoid plate in zebrafish, the CNCCs of the EP share similar developmental origins with the CNCCs that contribute to the frontonasal and maxillary processes in mammalian systems (Figure 4). Further, the molecular signatures and cellular behaviors of EP CNCCs share many similarities with the morphogenesis of the primary palate in mammalian organisms. Although the mouse model is often considered the experimental gold standard of palatogenesis compared to zebrafish, each model carries its own distinct advantages and disadvantages. In addition, even though M. musculus also belongs to the mammalian class, its palatogenesis program contains definitive differences compared to humans. For example, while orofacial clefts in humans often involve the lips, CLOs and CLPs are extremely rare in the mouse model102,103. Even for genes whose causal relationships with OFC pathogenesis are relatively well-defined in humans, the mouse knockout models of those genes often resulted in CPOs rather than CLPs, suggesting either differences in the phenotypic manifestation of identical OFCs in various species or, more likely, fundamental differences in the developmental mechanism behind palatogenesis between mice and humans. This discrepancy is apparent for the IRF6 case, where although the most common clinical manifestations of IRF6 mutations in humans are lower lip pits and CLPs, the mouse model of Irf6 gene disruption exhibited CPOs with no clefts through the anterior primary palate despite its clear involvement in human patients45,63. Taken together, these discrepancies in OFC phenotypes suggest that studies of OFC pathogenesis in the mouse model should be complemented with studies in other model organisms274. This shortcoming is where the zebrafish model excels because the zebrafish ethmoid plate shares similar developmental origins with the human primary palate. As such, many of the genes known to cause CLPs in humans also cause clefts through the EP at the lines of fusion between the frontonasal and maxillary processes where clefts through the human primary palate would also occur. Overall, the zebrafish model could be a more suitable model to study genes associated with defects in the human primary palate.

Optogenetic dissection of post-epiboly irf6 functions in craniofacial development

During zebrafish embryogenesis, irf6 was found to be expressed in tissues like the pectoral fin buds, nasal placodes, otic vesicles, and oropharyngeal apparatus post-epiboly (Figure 8). Due to this spatiotemporally restricted expression pattern in craniofacial regions post-epiboly, irf6 could play key roles in craniofacial development in addition to its pre-epiboly roles in periderm maturation.

Previous studies have attempted to establish the zebrafish as a model to study the roles of irf6 in craniofacial development. However, because of the pre-epiboly embryonic lethal periderm rupture phenotype associated with pre-epiboly disruptions of maternal irf6 function, other groups were also precluded from studying its post-epiboly functions184,185. Through our combination of the dominant-negative irf6-ENR construct and the novel EL222 optogenetic expression control system, we were able to bypass the pre-epiboly functions of irf6 during zebrafish periderm maturation and selectively inhibit irf6 function during craniofacial development (Figure 22). Having side-stepped the periderm rupture phenotype, our results revealed a specific craniofacial defect where a missing frontonasal process and a bifurcated ethmoid plate phenotype were observed in optogenetic irf6-ENR embryos compared to wild type (Figure 24). Further, based on temporal regulations of Irf6-ENR expression and thereby wild type Irf6 functional inhibition in zebrafish, we noted that the bifurcated EP craniofacial phenotype was precipitated when Irf6-ENR expression was initiated at the 10-somite stage (14 hpf) but not after 24 hpf (data not shown). Through our molecular characterization of the EL222 optogenetics system, based on the concentrations of EL222 mRNA and C120-mCherry response plasmid delivered at the one-cell stage (0 hpf), we reliably detected mCherry expression in embryos stimulated with blue light up to 72 hpf, suggesting that the optogenetics system components were stable in zebrafish embryos and should have allowed for the induction of Irf6-ENR expression after 24 hpf. Thus, the lack of craniofacial defects in embryos induced to express Irf6-ENR after 24 hpf was most likely due to the bypass of important roles of irf6 in craniofacial development rather than a lack of Irf6-ENR expression and wild type Irf6 functional inhibition.

Based on the developmental origins of medial ethmoid plate CNCCs, the bifurcated ethmoid plate phenotype of optogenetic irf6-ENR embryos should have an analogous frontonasal dysplasia phenotype in mammalian systems. However, frontonasal dysplasias represent a specific subtype of craniofacial disorders with distinct genetic etiologies in humans that do not include IRF6. Hence, the exact correlation between the craniofacial manifestations of VWS/PPS and what we observed with Irf6 functional inhibition in zebrafish is currently unknown. One possible explanation is that the lack of a medial EP in the zebrafish model is actually the morphologic equivalent not of frontonasal dysplasias but of other forms of orofacial clefts like CLPs and CLOs. In terms of the zebrafish ethmoid plate, several zebrafish mutants have phenotypes similar to the one observed with optogenetic irf6-ENR gene expression. Although several gene disruptions result in the complete ablation of anterior neurocranium structures including the EP, only a few genes currently published have craniofacial phenotypes exclusively isolated to the medial EP. In the publication detailing the role of microRNA 140 in Pdgfra downregulation and neurocranium morphogenesis in zebrafish, disruption of Pdgfra revealed a bifurcated EP phenotype due to the defective anterior migration of pdgfra-positive FNP CNCCs to the roof of the mouth opening169. Disruption of PDGFR signaling in mouse models also caused an OFC phenotype171. However, similar to the Irf6 knockout mouse model, the craniofacial phenotypes observed in Pdgfra-/- mice were not FNP dysplasias but rather CPOs, suggesting that there might not be a direct embryological correlation between the zebrafish medial EP and the mammalian frontonasal process. Due to the resemblance of the craniofacial phenotypes between pdgfra-/- and optogenetic irf6-ENR embryos, the expression of PDGF pathway genes is currently being examined in wild type and optogenetic irf6-ENR embryos to determine if they are responsible for the bifurcated EP phenotype observed. In addition, we are using qPCR gene expression panels for signaling pathways including Wnt, FGF, PDGF, SHH, and others to determine which pathways are most significantly affected in optogenetic irf6-ENR embryos and provide clues to elucidate the molecular mechanisms behind the bifurcated medial EP phenotype observed.

IRF6 downstream transcriptional target gene ESRP1

Since IRF6 is a helix-loop-helix transcription factor, the most direct method for elucidating the developmental functions of IRF6 will be to identify its direct transcription targets. Our approach of using Irf6 ChIP-seq and mRNA-seq identified direct transcriptional targets of Irf6 during periderm maturation at 4 hpf in zebrafish embryos that were also significantly downregulated in the absence of Irf6 protein functions. These candidates are likely critical for zebrafish periderm maturation, and due to the significant conservation of periderm-associated molecular pathways in the mammalian oral epithelium, the same candidates could also play important roles in mammalian oral epithelium and craniofacial development. It is important to recognize that certain specialized functions of IRF6, especially those in the medial edge epithelium in regulating MEE degeneration, are under tight spatiotemporal regulation and thus are not present in the generalized epithelium or periderm; these processes could be under-represented in our candidate gene list. Moreover, since periderm maturation occurs in parallel with zebrafish epiboly initiation, our candidate gene list could capture genes/pathways related to this zebrafish-specific developmental event that are not directly related to mammalian craniofacial development. However, many of the morphogenic processes in epiboly, such as convergence-extension and EMT, also occur during mammalian craniofacial development. Therefore, the molecular mechanisms of how these processes are regulated by Irf6 could also be indirectly captured by our Irf6 ChIP-seq and mRNA-seq approach.
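The logic of combining the two datasets can be sketched as a simple set intersection: a gene is kept as a candidate direct target only if it is bound by Irf6 and is also significantly downregulated when Irf6 function is lost. The sketch below is illustrative only. The gene names, fold changes, adjusted p-values, and the 0.05 threshold are hypothetical placeholders and are not taken from the analysis pipeline used in this dissertation.

```python
def direct_target_candidates(bound_genes, expression_changes, alpha=0.05):
    """Return genes that are both bound and significantly downregulated.

    bound_genes: iterable of gene names with a ChIP-seq peak assigned to them.
    expression_changes: dict mapping gene name -> (log2 fold change, adjusted p),
        where the fold change compares loss of function to wild type.
    alpha: significance threshold on the adjusted p-value (an assumption).
    """
    downregulated = {
        gene
        for gene, (log2_fc, padj) in expression_changes.items()
        if log2_fc < 0 and padj < alpha
    }
    return sorted(set(bound_genes) & downregulated)


# Hypothetical placeholder data, for illustration only.
chip_bound = {"geneA", "geneB", "geneC"}
rna_changes = {
    "geneA": (-2.5, 0.001),   # bound and downregulated: kept
    "geneB": (-1.1, 0.010),   # bound and downregulated: kept
    "geneC": (0.8, 0.020),    # bound but upregulated: dropped
    "geneD": (-3.0, 0.0005),  # downregulated but not bound: dropped
    "geneE": (-0.4, 0.400),   # not significant and not bound: dropped
}

candidates = direct_target_candidates(chip_bound, rna_changes)
print(candidates)
```

Requiring both lines of evidence excludes genes that are bound without a change in expression as well as genes whose downregulation is likely an indirect effect, which is why the overlap list is smaller than either dataset alone.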

Based on multifactorial evaluations of the ChIP-seq and mRNA-seq overlap candidate list, a large number of genes were found to be expressed in mammalian embryonic facial prominences in spatiotemporal patterns potentially relevant for craniofacial development (Figure 33). The results suggested that our approach for identifying IRF6 downstream transcriptional targets enriched for genes with craniofacial expression and biological significance. Furthermore, with gene expression analyses in mice/zebrafish and extensive literature searches, we restricted the candidate gene list to four genes clustered in two signaling pathways: 1) dact1 and rspo3 in the Wnt pathway and 2) esrp1 and flrt3 in the FGF pathway (Figure 33).

The esrp1 cis-regulatory element was found to be bound by Irf6 through our Irf6 ChIP-seq study (Figure 29). Using qPCR, we were able to show a six-fold downregulation in esrp1 expression in maternal-null irf6-/- embryos compared to wild type and rescued esrp1 expression in maternal-null irf6-/- embryos rescued with zebrafish irf6 mRNA (Figure 29). Taken together, our experimental results suggested that Irf6 is a novel direct upstream positive regulator of esrp1 transcription in the zebrafish periderm. Furthermore, this novel transcriptional relationship provided insights into how Irf6 could regulate epithelial maturation through fine-tuning the FGF signaling pathway. With WISH, esrp1 was found to be expressed in the periderm during the first 24 hpf and subsequently became restricted to specialized structures such as the pectoral fin buds, olfactory placodes, otic vesicles, and oral epithelium (Figure 29). The esrp1 spatiotemporal gene expression patterns demonstrated significant overlaps with the irf6 expression domains during zebrafish embryonic development and therefore suggested that there could be a direct transcriptional relationship between irf6 and esrp1 in tissues with spatiotemporal gene expression overlap. However, because esrp1 transcripts were detectable in maternal-null irf6-/- embryos and only downregulated six-fold compared to wild type embryos, Irf6 could just be one of many transcriptional regulators of esrp1 expression in zebrafish.
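For readers less familiar with how a qPCR fold change such as the one above is commonly derived, the following is a minimal sketch of the comparative Ct (2^-ΔΔCt) calculation. The use of this particular method, the assumption of 100% amplification efficiency, the single reference gene, and all Ct values are assumptions made for illustration and are not taken from the experiments described here.

```python
import math


def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative expression of a target gene by the 2^-ddCt method.

    Each Ct is the cycle threshold of the target or reference gene in the
    sample (e.g. mutant) or control (e.g. wild type) condition. Assumes
    that every PCR cycle doubles the amount of product.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    ddct = delta_sample - delta_control
    return 2.0 ** (-ddct)


# Hypothetical Ct values: the target crosses threshold 3 cycles later in
# the mutant than in wild type, with an unchanged reference gene.
relative_expression = fold_change_ddct(27.0, 18.0, 24.0, 18.0)
print(relative_expression)  # fraction of wild type expression

# Cycle shift that corresponds to a six-fold downregulation.
cycles_for_sixfold = math.log2(6)
print(round(cycles_for_sixfold, 2))
```

In this framework, a six-fold downregulation corresponds to a shift of roughly 2.6 cycles in the reference-normalized Ct of the target gene, while the three-cycle shift in the placeholder values corresponds to an eight-fold downregulation.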

Figure 33: Irf6 downstream transcriptional target gene summaries

Information on selected Irf6 downstream transcriptional targets. Zebrafish CRISPR gene disruption models are currently in the F2 generation. FaceBase data demonstrated strong facial prominence expressions of Esrp1, Dact1, Flrt3, and Rspo3 during mouse craniofacial development.

During the course of this dissertation, a research article describing the esrp1; esrp2 double knockout phenotype in zebrafish was published271. Although the single homozygous knockouts of esrp1-/- and esrp2-/- did not exhibit any mutant phenotypes, the double knockout zebrafish embryo exhibited a bifurcated ethmoid plate phenotype due to a missing frontonasal process that is highly reminiscent of the EP defect observed in optogenetic irf6-ENR embryos (Figure 24). These results suggested that esrp1 and esrp2 could have redundant or compensatory functions during zebrafish development. Furthermore, the remarkable resemblance of the esrp1-/-; esrp2-/- mutant phenotype with optogenetic irf6-ENR embryos, taken in combination with our ChIP-seq and mRNA-seq results that identified esrp1 as a direct transcriptional target gene of Irf6, suggested that esrp1 could be a critical downstream effector of irf6 functions during zebrafish craniofacial development, especially since only a few zebrafish mutants have been currently identified with craniofacial defects specific to the medial EP. In addition, as mentioned in the description of the mouse model of Esrp1 knockout, Esrp1-/- mice exhibited a rare bilateral cleft lip and palate phenotype, making it one gene in a small collection of genes to exhibit a defect in the mouse primary palate and thus a valuable model to study craniofacial development in humans. The finding that Esrp1 disruption produced a CLP phenotype in the mouse model and a medial EP phenotype in the corresponding zebrafish model suggested that there could be some embryological correspondence between bifurcated ethmoid plate phenotypes in zebrafish and clefts through the primary palate in mice. Moreover, the results further supported our usage of the zebrafish model to investigate the roles of irf6 in epithelial maturation, craniofacial development, and orofacial cleft pathogenesis.

Esrp1 regulates a diverse epithelial-specific alternative splicing program. Through a variety of techniques, including mRNA-seq with multivariate transcript splicing analysis, Fgfr2 was identified as the key canonical target gene of Esrp1 alternative splicing268,275. Given the CLP phenotype in mice and the bifurcated ethmoid plate phenotype in zebrafish associated with Esrp1 gene disruption, the frontonasal development process could be intimately linked with the downstream molecular functions of Esrp1. Taken together with the increase in cell death seen in the frontonasal region overlapping FNP NCCs in optogenetic irf6-ENR embryos, this points to the frontonasal ectodermal zone (FEZ), the most important developmental organizing center regulating FNP cell behaviors dependent on epithelial FGF signaling, which lies at the juxtaposition of fgf8 and shh expression at the roof of the anterior mouth opening. Both Fgf8 and Shh play important roles in promoting FNP NCC survival and chondrogenic differentiation through epithelial-mesenchymal interactions, and disruptions of gene expression in the FEZ could lead to orofacial cleft pathogenesis. Given the central importance of FGF signals in the roof of the oral cavity, disrupted expression of Fgfr2 splice isoforms caused by loss of epithelial-specific Esrp1 expression could lead to dysfunctional epithelial and FNP NCC behaviors and, subsequently, to CNCC apoptosis and orofacial cleft pathogenesis. Although an fgf8a mutant zebrafish line called acerebellar (ace) has already been identified, the ace mutant did not show the missing medial EP phenotype observed in optogenetic irf6-ENR or esrp1-/-; esrp2-/- embryos. This suggested either that the FEZ is not an essential frontonasal organizing center in zebrafish or that other factors can compensate for fgf8a during zebrafish craniofacial development276. In addition, the lack of a medial EP phenotype in ace mutants does not exclude the FGF signaling pathway as the key effector in esrp1-/-; esrp2-/- double mutant embryos, because disruption of esrp would cause expression of a mesenchymal-specific isoform of fgfr2 in epithelial cells rather than disrupt FGF ligand expression. Mesenchymal fgfr2 isoform expression in the epithelium could disrupt the reciprocal signaling from mesenchymal CNCCs to epithelial cells and potentially cause the medial EP phenotype observed in esrp1-/-; esrp2-/- embryos. Further investigation into the gene expression changes resulting from esrp1/2 gene disruptions is necessary to identify the molecular pathways responsible for the bifurcated EP phenotype observed in esrp1-/-; esrp2-/- and optogenetic irf6-ENR embryos.

Methods

Fish rearing and husbandry

Zebrafish (Danio rerio) adults and embryos were maintained in accordance with approved institutional protocols (2010N000106) at Massachusetts General Hospital277. Embryos were raised at 28.5°C in E3 media (5.0 mM NaCl, 0.17 mM KCl, 0.33 mM CaCl2, 0.33 mM MgSO4) with 0.0001% methylene blue. Embryos were staged according to standardized developmental timepoints186 by hours or days post fertilization (hpf or dpf, respectively). All zebrafish lines used for experimentation were generated from the Tübingen strain.
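As a worked example of the E3 recipe above, the following sketch converts the stated millimolar concentrations into grams of salt for a chosen batch volume. The molar masses assume anhydrous salts, which is an assumption (the text does not state whether hydrated forms such as CaCl2·2H2O or MgSO4·7H2O were used), and the function name and the 1 L batch volume are illustrative only.

```python
# Molar masses (g/mol) for ANHYDROUS salts. This is an assumption: hydrated
# forms (e.g. CaCl2·2H2O, MgSO4·7H2O) would require different values.
MOLAR_MASS_G_PER_MOL = {
    "NaCl": 58.44,
    "KCl": 74.55,
    "CaCl2": 110.98,
    "MgSO4": 120.37,
}

# Final 1X E3 concentrations in mM, as stated in the text.
E3_MM = {"NaCl": 5.0, "KCl": 0.17, "CaCl2": 0.33, "MgSO4": 0.33}


def e3_salt_masses(volume_l):
    """Return grams of each salt needed for `volume_l` liters of 1X E3."""
    return {
        salt: mm / 1000.0 * MOLAR_MASS_G_PER_MOL[salt] * volume_l
        for salt, mm in E3_MM.items()
    }


if __name__ == "__main__":
    for salt, grams in e3_salt_masses(1.0).items():
        print(f"{salt}: {grams:.4f} g per liter")
```

Scaling the volume argument gives the masses for larger batches or concentrated stocks.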

Mouse rearing and husbandry

Mouse (Mus musculus) adults and embryos were maintained in accordance with approved institutional protocols (2017N000050) at Massachusetts General Hospital. Adults were maintained on a 12-hour day-night light cycle and provided with unrestricted access to food and water. Adults were separated by sex and ear-tagged for identification and genotyping. Timed matings were set up in the evenings using a 1:1 male:female ratio and mucus plugs were checked the next morning to mark embryonic age E0.5. Pregnant females were sacrificed by CO2 asphyxiation or by cervical dislocation at the appropriate gestational stages to surgically isolate embryos for experimentation.

CRISPR-Cas9 gene editing and genotyping

CRISPR sgRNA target sites were identified with a variety of online CRISPR computational programs such as zifit.partners.org/ZiFiT278, crispr.mit.edu279, and chopchop.rc.fas.harvard.edu280. sgRNAs were designed with the traditional sequence constraint of a 3’ PAM sequence containing NGG and an additional sequence constraint of a 5’ NG for in vitro RNA synthesis from T7 or SP6 promoters. The irf6 exon 6 sgRNA was generated by in vitro transcription from the T7 promoter as described149. The zebrafish sequence-optimized cas9 template pT3TS-nls-zCas9-nls plasmid281 was linearized with the restriction enzyme XbaI and purified using the QIAquick PCR purification kit (Qiagen). 5’ capped z-cas9 mRNA was synthesized from the T3 promoter with the mMessage mMachine T3 in vitro transcription kit (Ambion) and purified with the RNeasy mini plus kit (Qiagen). Zebrafish embryos were microinjected at the one-cell stage directly into the cytoplasm with 2 nl of solution containing 25 ng/µl of sgRNA and 100 ng/µl of z-cas9 capped mRNA.
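The two sequence constraints described above (a 5’ NG start and a 3’ NGG PAM) can be illustrated with a minimal target-site scan. This is a sketch only and is not the algorithm used by ZiFiT, crispr.mit.edu, or CHOPCHOP. The 20-nt protospacer length, the function names, and the toy input sequence are assumptions introduced for illustration.

```python
import re

# A protospacer whose second base is G (5' NG), immediately followed by an
# NGG PAM. The 20-nt protospacer length is an assumption; the text does not
# state it. The lookahead allows overlapping candidate sites to be reported.
TARGET_RE = re.compile(r"(?=([ACGT]G[ACGT]{18})([ACGT]GG))")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sgrna_sites(seq):
    """Return (strand, start, protospacer, pam) tuples for both strands.

    `start` is the 0-based index of the protospacer within the scanned
    strand (the reverse complement of the input for strand '-').
    """
    seq = seq.upper()
    hits = []
    for strand, scanned in (("+", seq), ("-", reverse_complement(seq))):
        for match in TARGET_RE.finditer(scanned):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from the irf6 locus.
    for hit in find_sgrna_sites("TTAGACGTACGTACGTACGTACTGGTT"):
        print(hit)
```

A real design workflow would additionally score candidates for off-target matches, which is the main service the online tools provide.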

The dact1, dact2, esrp1, esrp2, flrt3, rspo3, and irf6 CRISPR HDR knock-in sgRNAs were generated by in vitro transcription from an SP6 promoter as described282. Lyophilized Cas9 protein (PNA Bio) was resuspended in ddH2O to a stock concentration of 1 µg/µl and stored as single-use aliquots at -80°C for up to 6 months to avoid freeze-thaw inactivation. One-cell stage zebrafish embryos were microinjected directly into the cytoplasm with 2 nl of solution containing 15 ng/µl of sgRNA and 100 ng/µl of Cas9 protein, pre-complexed for 5-10 minutes at room temperature.
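For reference, the per-embryo dose implied by the stated injection volume and concentrations can be computed directly, because a concentration in ng/µl is numerically equal to pg/nl. The sketch below uses only the values given in the text; the function and dictionary names are illustrative.

```python
def injected_mass_pg(conc_ng_per_ul, volume_nl):
    """Mass delivered per embryo in picograms.

    1 ng/µl equals 1 pg/nl, so mass (pg) = concentration (ng/µl) x volume (nl).
    """
    return conc_ng_per_ul * volume_nl


# Injection mixes as stated in the text (concentrations in ng/µl).
MIXES = {
    "sgRNA + z-cas9 mRNA": {"sgRNA": 25, "z-cas9 mRNA": 100},
    "sgRNA + Cas9 protein": {"sgRNA": 15, "Cas9 protein": 100},
}

if __name__ == "__main__":
    for mix_name, components in MIXES.items():
        for component, conc in components.items():
            dose = injected_mass_pg(conc, volume_nl=2)
            print(f"{mix_name}: {component} = {dose} pg per embryo")
```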

Genomic DNA for genotyping was isolated from either whole 24 hpf embryos or tail fin clips using the HotSHOT method as described283. Genotyping primers flanking the CRISPR sgRNA site were designed using a combination of ChopChop (chopchop.rc.fas.harvard.edu) and NCBI Primer-BLAST (ncbi.nlm.nih.gov/tools/primer-blast/). Forward primers were synthesized by Invitrogen with 5’-FAM modifications. Microsatellite sequencing analyses were used to determine indel mutation sizes and frequencies (MGH DNA Core). PCR amplicons of CRISPR sgRNA targets were cloned into the pGEM-T Easy plasmid (Promega) and Sanger sequenced to confirm the exact sequence changes resulting from CRISPR mutagenesis.
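The step from fluorescent fragment sizes to indel sizes and frequencies can be sketched as follows. This is a simplified illustration, not the core facility's analysis pipeline. It assumes that peak area is proportional to allele abundance and that the indel lies within coding sequence, so that any length change not divisible by three is a frameshift. The wild-type amplicon size and the peak values in the example are hypothetical.

```python
def summarize_fragments(wt_size_bp, peaks):
    """Summarize fragment-analysis peaks as indel calls.

    peaks: list of (fragment_size_bp, peak_area) tuples.
    Returns a list of dicts sorted by descending allele frequency.
    """
    total_area = sum(area for _, area in peaks)
    if total_area <= 0:
        raise ValueError("peaks must have a positive total area")
    calls = []
    for size, area in peaks:
        indel = size - wt_size_bp  # negative = deletion, positive = insertion
        calls.append({
            "indel_bp": indel,
            # Length changes not divisible by 3 shift the reading frame.
            "frameshift": indel % 3 != 0,
            "frequency": area / total_area,
        })
    return sorted(calls, key=lambda call: call["frequency"], reverse=True)


if __name__ == "__main__":
    # Hypothetical example: 250 bp wild-type amplicon with three peaks.
    for call in summarize_fragments(250, [(250, 600.0), (242, 300.0), (253, 100.0)]):
        print(call)
```

Fragment sizing reports only length changes, which is why the text follows it with cloning and Sanger sequencing to confirm the exact sequence of each allele.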

Total RNA isolation and RT-qPCR

Zebrafish embryos were stage-matched by chronological or developmental age and flash-frozen in liquid nitrogen prior to RNA extraction. Samples not immediately used for RNA extraction were stored at -80°C for no longer than two months. Frozen embryos were homogenized with a micropestle in TRIzol (Invitrogen), and total RNA was isolated with standard 5:1 phenol-chloroform extraction and ethanol precipitation methods. Total RNA was digested with 10 U of DNase I (Ambion) for 1 hour at 37°C to remove possible genomic and plasmid DNA contamination and then purified again using the RNeasy mini plus kit (Qiagen). Total RNA was quantified using a Nanodrop 2000 spectrophotometer, and 1-5 µg was used for reverse transcription (oligo dT or random hexamers) using either the SuperScript III cDNA synthesis kit (Invitrogen) or the iScript Select cDNA synthesis kit (Bio-Rad). RT-qPCR was performed using the PowerUp SYBR Green master mix (Invitrogen) with 20 µl reactions in 96-well plates on the StepOne Plus RT-qPCR platform (Applied Biosystems). At least two internal controls, including ef1a (elongation factor 1α), actb1 (beta actin 1), 18S rRNA, and gapdh (glyceraldehyde-3-phosphate dehydrogenase), were examined for each experiment for expression normalization. qPCR primers were designed using NCBI Primer-BLAST to yield products of 70-200 bp, with exon-junction spanning and intron separation selected as design criteria whenever possible. Amplification specificity was verified with melt-curve analysis after each individual run. Negative control amplifications were performed on samples either without reverse transcription or without cDNA template to detect DNA contamination.
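The text does not state which quantification model was used for expression normalization. One common approach that accommodates two or more internal controls is the 2^-ΔΔCt method with the reference Ct values averaged, sketched below as an assumption rather than as the method actually applied. It further assumes near-100% amplification efficiency for every primer pair, and the Ct values in the example are hypothetical.

```python
from statistics import mean


def relative_expression(target_ct, ref_cts, control_target_ct, control_ref_cts):
    """Fold change of a target gene by the 2^-ddCt method.

    ref_cts / control_ref_cts: Ct values of the internal control genes in the
    test and control samples. Averaging Ct values is equivalent to taking the
    geometric mean of the reference expression levels.
    Assumes ~100% amplification efficiency for every primer pair.
    """
    d_ct_sample = target_ct - mean(ref_cts)
    d_ct_control = control_target_ct - mean(control_ref_cts)
    dd_ct = d_ct_sample - d_ct_control
    return 2.0 ** (-dd_ct)


if __name__ == "__main__":
    # Hypothetical Ct values: target gene plus two reference genes per sample.
    fold = relative_expression(26.0, [18.0, 20.0], 24.0, [18.0, 20.0])
    print(f"fold change relative to control: {fold:.2f}")
```

Using more than one reference gene reduces the risk that a single unstable control skews the normalized result.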

Protein isolation and western blotting

Zebrafish embryos before 48 hpf were enzymatically dechorionated by incubation in 1 mg/ml pronase (Roche) in E3 medium for 5 minutes at RT without agitation on agarose-coated plates. Dechorionated embryos were then deyolked according to a previously published protocol284, with all solutions supplemented with HALT pan-protease/phosphatase inhibitor cocktails (Thermo Scientific) to prevent sample degradation during processing. Cell pellets were flash-frozen in liquid nitrogen and stored at -80°C for less than six months if not processed immediately. Samples were homogenized with a micropestle in RIPA buffer and centrifuged to pellet cellular debris. Protein concentrations in the supernatant were quantified using the BSA protein standards assay kit (Bio-Rad). Gel electrophoresis was performed on Novex Bis-Tris 4-12% gradient or 10% non-gradient protein gels (Invitrogen) with 10-20 µg of total protein loaded per lane. Proteins were subsequently transferred onto methanol-activated 0.22 µm pore PVDF membranes (Novex), which were blocked for 2 hours at RT with StartingBlock in TBST (Thermo Scientific) and incubated with a rabbit polyclonal antibody against zebrafish Irf6 at 1:1500 dilution (GeneTex) and a rabbit monoclonal antibody against zebrafish β-actin at 1:1500 dilution (Cell Signaling Technology) in blocking buffer at 4°C overnight. Subsequently, the PVDF membrane was washed 3x10 minutes at RT on a rocking platform and incubated with an HRP-conjugated anti-rabbit antibody at 1:2000 dilution (Abcam) in blocking buffer at RT for 1 hour. Signal amplification was performed using the Novex ECL chemiluminescence reagent (Invitrogen). Band detection and visualization were performed in a dark room using Amersham ECL Hyperfilms (GE Life Sciences) with incremental exposure times.

Cryosectioning and fluorescence immunohistochemistry

Zebrafish and mouse embryos at the appropriate developmental stages were fixed in 4% paraformaldehyde (PFA) at 4°C overnight. Fixed embryos were washed with 1X PBS, transferred into 15% sucrose in 1X PBS for 30 minutes at RT, and subsequently into 30% sucrose in 1X PBS overnight at 4°C. Thereafter, embryos were serially transferred through optimal cutting temperature (OCT) media (Thermo Scientific), positioned in the desired orientation, and frozen in a dry ice-100% ethanol bath. Blocks were stored at -80°C for a maximum of 6 months and warmed to -20°C overnight prior to cryosectioning. Embryos embedded in OCT were sectioned at 6 µm/section and collected in groups onto Superfrost Plus microscope slides (Thermo Scientific). Slides were dried at RT overnight and subsequently stored at -80°C prior to immunohistochemical staining.

Tissue section slides were rehydrated by washing with 1X PBST (0.1% Tween-20) 3x5 minutes at RT, blocked by incubating with SuperBlock (Thermo Scientific) in PBST for 1 hour at RT, and incubated with primary antibody at the appropriate dilutions (listed below) in blocking buffer overnight at 4°C on a rocking platform. Slides were washed 3x10 min with 1X PBST and incubated for 1 hour at RT with Alexa Fluor conjugated secondary antibodies corresponding to the species of the primary antibodies, diluted 1:1000 in blocking buffer. Subsequently, slides were washed 1x10 min with 1X PBST and counterstained with DAPI (Invitrogen) in PBS at 250 ng/ml for 5 min at RT. Samples were further washed with 1X PBS 2x10 minutes at RT, drained of excess liquid without drying, and mounted with ProLong Diamond antifade mountant (Invitrogen). Slides were cured for 24 hours at RT in the dark before fluorescence imaging and subsequently stored at 4°C.

In certain situations, an alternative immunohistochemistry staining method was used to boost the signal-to-noise ratio. Alexa Fluor 488 Tyramide SuperBoost kits (Invitrogen) were used according to manufacturer instructions. This kit used poly-HRP conjugated secondary antibodies to catalyze tyramide-linked Alexa Fluor radical conversion and covalent deposition in tissues. Mouse primary antibody was applied on mouse embryonic tissue cryosections with the Mouse-on-Mouse (M.O.M) basic detection kit and ImmPRESS peroxidase polymers (Vector Labs) according to manufacturer instructions to prevent secondary antibody background detection of endogenous mouse IgGs.

Target    Primary antibody             Supplier   Dilution
Irf6      Rabbit-anti-zebrafish pAb    GeneTex    1:200
Irf6      Rabbit-anti-mouse mAb        Abcam      1:200
Esrp1     Mouse-anti-mouse mAb         Abcam      1:200

IRF6 cDNA cloning and variant generation by site-directed mutagenesis

Full-length zebrafish irf6 cDNA (NM_200598.2) and human IRF6 cDNA (NM_006147.3) were synthesized using GeneArt (Invitrogen) and subcloned into the pCS2+8 vector (Addgene #34931) at the EcoRV site in the multiple cloning site. IRF6 missense gene variants were identified from previous publications, and the mutations were mapped to the corresponding nucleotides in the zebrafish irf6 cDNA. PCR-based site-directed mutagenesis was performed to generate pCS2+8 vectors containing irf6 missense gene variants using the Q5 site-directed mutagenesis kit (NEB) according to manufacturer instructions, with primers containing the desired mutations designed with the NEBaseChanger tool (nebasechanger.neb.com). Plasmids containing irf6 missense variants were isolated using the QIAprep spin miniprep kit (Qiagen) and sequence-verified, both for introduction of the mutation and for sequence identity, by either complete plasmid sequencing or Sanger sequencing from the SP6 promoter through the entire cDNA (MGH DNA Core). All vectors were propagated in NEB 5α high efficiency chemically competent cells (NEB).
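Mapping a human variant onto the corresponding zebrafish residue amounts to walking a pairwise alignment while counting ungapped positions in each sequence. The sketch below illustrates that bookkeeping at the amino acid level. The function name and the short aligned strings are hypothetical placeholders, not the real IRF6 and Irf6 sequences, and the text does not specify how the mapping was performed in practice.

```python
def map_position(aligned_src, aligned_dst, src_pos):
    """Map a 1-based residue position in the source sequence to the
    destination sequence using a gapped pairwise alignment.

    Returns (dst_pos, src_residue, dst_residue), or None when the source
    residue aligns to a gap in the destination sequence.
    """
    if len(aligned_src) != len(aligned_dst):
        raise ValueError("aligned sequences must have equal length")
    src_i = dst_i = 0
    for a, b in zip(aligned_src, aligned_dst):
        if a != "-":
            src_i += 1
        if b != "-":
            dst_i += 1
        if a != "-" and src_i == src_pos:
            return None if b == "-" else (dst_i, a, b)
    raise ValueError("src_pos is outside the source sequence")


if __name__ == "__main__":
    # Hypothetical toy alignment ('-' marks a gap); NOT real IRF6 sequence.
    human = "MAL-HPRRVR"
    fish = "M-LSHPRKVR"
    print(map_position(human, fish, 6))  # conserved residue
    print(map_position(human, fish, 7))  # residue differs between species
    print(map_position(human, fish, 2))  # aligns to a gap
```

Checking that the source and destination residues match is a quick way to flag variants at positions that are not conserved between the two species.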

In vitro mRNA synthesis

Plasmids containing IRF6 missense gene variant cDNA in the pCS2+8 (Addgene #34931) backbone were linearized with NotI-HF (NEB) in CutSmart buffer and purified with the QIAquick PCR purification kit (Qiagen). Gene variant mRNA was synthesized by in vitro transcription with the SP6 mMessage mMachine in vitro transcription kit (Ambion) using plasmids linearized 3’ of the SV40 polyadenylation signal as template. After mRNA synthesis, cDNA templates were digested with Turbo DNase I according to manufacturer instructions, and mRNA was purified using the RNeasy mini plus kit (Qiagen). Purified gene variant mRNAs were quantified with either a Nanodrop 2000 spectrophotometer or a Qubit fluorometer (Invitrogen), diluted to 800 ng/µl, and stored as individual aliquots at -80°C for a maximum of six months until use to prevent freeze-thaw degradation. Upon thawing, mRNA concentrations were re-measured by Nanodrop or Qubit to ensure that concentrations had not changed significantly during long-term storage.
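The dilution to the 800 ng/µl working concentration follows from C1·V1 = C2·V2. A minimal sketch is given below; the stock concentration and volume in the example are hypothetical, and the choice of diluent is not specified in the text.

```python
def diluent_volume_ul(stock_ng_per_ul, stock_volume_ul, target_ng_per_ul=800.0):
    """Volume of diluent (µl) to add so that the final concentration equals
    `target_ng_per_ul`, using C1 * V1 = C2 * V2."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already below the target concentration")
    final_volume_ul = stock_ng_per_ul * stock_volume_ul / target_ng_per_ul
    return final_volume_ul - stock_volume_ul


if __name__ == "__main__":
    # Hypothetical example: 20 µl of mRNA measured at 1200 ng/µl.
    print(diluent_volume_ul(1200.0, 20.0), "µl of diluent to add")
```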

Zebrafish embryo microinjection of mRNA and morpholinos

Microinjection of mRNA was performed by injecting 2 nl of mRNA solution with 0.05% phenol red directly into the cytoplasm of one-cell stage embryos. GFP mRNA was similarly prepared by in vitro transcription and injected under identical conditions to control for potential toxicity associated with injection trauma, exogenous mRNA delivery, and excessive protein translation. Lyophilized morpholinos were resuspended in ddH2O to a stock concentration of 20 ng/µl and stored at RT in aliquots. Individual aliquots were heated to 70°C and briefly vortexed prior to preparation of the injection mix to ensure full dissolution. Morpholinos were delivered by microinjection into the yolk (rather than the cytosol) of one-cell stage embryos with 2 nl of morpholino solution diluted in 0.1 M KCl with 0.05% phenol red. Mismatch control morpholinos were injected under identical conditions to control for potential toxicities. Embryos from all methods of microinjection were examined at 3 hpf to remove unfertilized embryos, which were counted against the total number of microinjected embryos to confirm that no fertilization defects had occurred.

Whole-mount in situ hybridization and DIG-labeled riboprobe synthesis

Dechorionated zebrafish embryos were fixed in methanol-free 4% paraformaldehyde at 4°C overnight and subsequently washed with and stored in 100% methanol at -20°C for a minimum of one hour and a maximum of one week before whole-mount in situ hybridization (WISH). WISH and DIG-labeled riboprobe synthesis were performed essentially as described285. Briefly, for riboprobe synthesis, PCR was performed using mixed-stage embryonic zebrafish cDNA as template and T3 promoter sequence-linked reverse primers to generate cDNA templates for in vitro transcription. PCR reactions were purified using the QIAprep PCR purification kit (Qiagen). In vitro transcription was performed using a T3 polymerase (Roche) and DIG labeling mix (Roche) for 2 hours at 37°C, after which the DNA template was digested with DNase I (Ambion) for 30 min at 37°C. DIG-labeled riboprobes were isolated by ethanol-NaOAc precipitation, resuspended in DEPC-treated ddH2O, and stored at -20°C. All PCR products were TOPO cloned into pGEM-T Easy vectors (Promega) and sequence verified by Sanger sequencing. WISH colorimetric signal detection was performed using an alkaline phosphatase conjugated anti-DIG antibody (Roche) diluted 1:10,000 in blocking buffer (10% BSA and 5% FCS in MABT) and BCIP-NBT colorimetric substrates (Sigma).

Acid-free alcian blue staining and brightfield imaging

Zebrafish embryos at 96 hpf or 120 hpf were fixed in 4% paraformaldehyde in PBS at 4°C overnight, washed with PBS, and stained with acid-free alcian blue overnight on a rotating platform at RT essentially as described285. Stained embryos were washed with ddH2O and subsequently bleached in the dark (0.8% w/v KOH, 0.1% Tween 20, 0.9% H2O2) until cell pigmentation was no longer present. Whole and dissected stained embryos were mounted in 3% methylcellulose on a depression slide and imaged using a Nikon Eclipse 80i compound microscope with a Nikon DS Ri1 camera. Z-stacked images were taken to increase the depth of field with the NIS-Elements BR 3.2 software. Stacked images were processed with ImageJ or FIJI to generate maximum intensity projection images.
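The projection step can be illustrated independently of ImageJ: for each pixel position, the maximum value across all Z-planes is retained. The sketch below uses plain nested lists and a hypothetical 2 x 2 stack; it illustrates the operation only and does not reproduce the ImageJ implementation.

```python
def max_intensity_projection(stack):
    """Collapse a Z-stack into one 2D image by taking, for every pixel,
    the maximum value across all Z-planes.

    stack: list of Z-planes, each a list of rows of numeric pixel values.
    All planes are assumed to share the same dimensions.
    """
    if not stack:
        raise ValueError("stack must contain at least one plane")
    return [
        [max(pixels) for pixels in zip(*rows)]  # same pixel across planes
        for rows in zip(*stack)                 # same row across planes
    ]


if __name__ == "__main__":
    # Hypothetical stack of two 2 x 2 planes.
    z_stack = [
        [[1, 5], [0, 2]],
        [[3, 4], [7, 1]],
    ]
    print(max_intensity_projection(z_stack))
```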

Two-photon in vivo zebrafish and IHC imaging

Whole-mount in vivo transgenic reporter zebrafish embryo imaging and cryosectioned IHC slide imaging were performed on a custom Olympus FVMPE-RS multiphoton microscopy system (MGH multiphoton microscopy core). All images were captured using a water-immersion objective (25X magnification, 1.05 numerical aperture; Olympus #XLPLN25XWMP2), and the fluorophores were stimulated for two-photon excitation with two femtosecond pulsed lasers (Mai Tai DeepSee; Spectra-Physics). For whole-mount in vivo imaging, zebrafish embryos were manually dechorionated and then anesthetized in E3 media with low-dose tricaine (Sigma)286. Subsequently, embryos were mounted in 3% methylcellulose with tricaine on a depression slide and covered with a coverslip. Laser power was maintained under 10% intensity, and detector sensitivity was adjusted to maximize signal intensity while minimizing light exposure. For Z-stacked images, laser power was manually determined by histogram analysis at each imaging depth to establish a graduated laser power curve as a function of the Z-plane in order to maintain uniform fluorophore signal. Images were captured using the Olympus FluoView software and subsequently analyzed in ImageJ and Imaris for Z-stack maximum intensity projections and hyperstack playbacks.
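A graduated laser power curve of the kind described above can be thought of as a set of manually chosen (depth, power) anchor points with intermediate planes filled in between them. The sketch below uses linear interpolation, which is an assumption; the text does not state how intermediate values were assigned, and the anchor values shown are hypothetical (chosen to stay under the stated 10% limit).

```python
def laser_power_at(depth_um, anchors):
    """Linearly interpolate laser power (%) at `depth_um` from manually
    chosen (depth_um, power_percent) anchor points.

    Depths outside the anchored range are clamped to the nearest anchor.
    """
    if not anchors:
        raise ValueError("at least one anchor point is required")
    pts = sorted(anchors)
    if depth_um <= pts[0][0]:
        return pts[0][1]
    if depth_um >= pts[-1][0]:
        return pts[-1][1]
    for (z0, p0), (z1, p1) in zip(pts, pts[1:]):
        if z0 <= depth_um <= z1:
            return p0 + (p1 - p0) * (depth_um - z0) / (z1 - z0)


if __name__ == "__main__":
    # Hypothetical anchors set by histogram inspection at three depths.
    anchor_points = [(0, 2.0), (100, 6.0), (200, 9.0)]
    for z in range(0, 201, 50):
        print(z, "µm ->", laser_power_at(z, anchor_points), "%")
```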

Computational modeling and statistical analyses

Human IRF6 missense gene variants were identified from previously published literature222 and analyzed for computational predictions of missense variant protein function with the programs PolyPhen-2 (genetics.bwh.harvard.edu/pph2)224 and SIFT (sift.jcvi.org)287. For statistical analyses of experimental data, all error bars represent ±2x the standard error of the mean (SEM), and statistical significance for pairwise comparisons was determined using a two-tailed Student’s t-test with 0.05 as the P-value cut-off. To identify IRF6 missense gene variants in the ExAC database (exac.broadinstitute.org) and the gnomAD database (gnomad.broadinstitute.org), IRF6 was queried and the results were sorted for missense variants. Both ChIP-seq and mRNA-seq sequencing reads were aligned to zebrafish genome assembly Zv9. For mRNA-seq, transcriptome mapping was performed with STAR288 (v.2.3.0). Read counts for individual genes were produced using the unstranded count feature in HTSeq289 (v.0.6.0). Differential gene expression analysis was performed using the edgeR package after normalizing read counts290. For ChIP-seq, reads were aligned using BWA and filtered for uniquely mapped reads291. Input-normalized alignments were generated and regions of tag enrichment were determined by SPP292. Multi-species amino acid conservation alignment of IRF6 was performed with PRALINE182 (ibi.vu.nl/programs/pralinewww/) using human IRF6 (NP_006138.1), mouse Irf6 (NP_058547.2), and zebrafish Irf6 (NP_956892.1) as templates.
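The error bars and pairwise test described above can be reproduced with the standard library alone. The sketch below computes the SEM and a two-tailed Student's t-test, taking "Student's" to mean the classical equal-variance (pooled) form, which is an assumption. The p-value is obtained by numerically integrating the t-distribution density, a substitute for a statistics package rather than the software actually used, and the example measurements are hypothetical.

```python
import math
from statistics import mean, stdev


def sem(values):
    """Standard error of the mean: sample SD / sqrt(n)."""
    return stdev(values) / math.sqrt(len(values))


def _t_pdf(x, df):
    """Probability density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def students_t_test(a, b, steps=10000):
    """Two-tailed, equal-variance (Student's) t-test for two samples.

    Returns (t, df, p). `steps` must be even (Simpson's rule).
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    # Simpson's rule for the integral of the density from 0 to |t|.
    upper = abs(t)
    h = upper / steps
    total = _t_pdf(0, df) + _t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * _t_pdf(i * h, df)
    p = max(0.0, 1 - 2 * total * h / 3)
    return t, df, p


if __name__ == "__main__":
    # Hypothetical measurements for two groups.
    group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
    group_b = [2.0, 3.0, 4.0, 5.0, 6.0]
    t_stat, dof, p_value = students_t_test(group_a, group_b)
    print(f"error bar half-width (2 x SEM): {2 * sem(group_a):.3f}")
    print(f"t = {t_stat:.3f}, df = {dof}, p = {p_value:.4f}, significant: {p_value < 0.05}")
```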

Optogenetic expression of genes-of-interest

Genes-of-interest such as irf6, irf6-ENR, irf6R84C, and mCherry were isolated by PCR from various templates and inserted into the pGL4.23-(C120x5)-TATA vector with In-Fusion cloning (Clontech) according to manufacturer instructions using a 1:2 vector-to-insert ratio to generate optogenetic response plasmids. The constructs were transformed into Stellar chemically competent cells (Clontech), and colonies were screened by PCR, restriction digests, Sanger sequencing, and whole plasmid sequencing to verify the sequence identity and accuracy of the constructs. Light-sensitive response proteins, VP16-EL222 and TAEL (TA4-EL222), were subcloned into pCS2+8 and in vitro transcribed from the SP6 promoter as previously described to generate capped mRNA for embryo microinjections. The optogenetics injection mix consisted of 25 ng/µl EL222 or 75 ng/µl TAEL mRNA and 10 ng/µl pGL4.23 response plasmid with 0.05% phenol red. Each embryo was microinjected with 2 nl of the optogenetics injection mix directly into the cytoplasm at the 1-cell stage, immediately wrapped in aluminum foil, and placed into a dark incubator. Unfertilized and abnormal embryos were removed at 3 hpf in the dark room with limited exposure to ambient light. At the desired developmental stage, injected embryos were divided into two groups (dark and light) in E3 media without methylene blue. The light group was placed under constant illumination with 465 nm blue light (LED panel, HQRP) at 0.3 mW/cm2 (measured by a PM100D digital power meter with a SV120VC photodiode power sensor, ThorLabs), while the containers of control (dark) embryos were wrapped in aluminum foil.

Whole-mount zebrafish proliferation and cell death analyses

Dechorionated zebrafish embryos at 14-24 hpf were fixed in methanol-free 4% PFA at 4°C overnight and permeabilized in 100% acetone for 10 minutes at -20°C, followed by 2x5 min ddH2O washes and 3x5 min PBST (0.1% Tween-20) washes at RT. Zebrafish embryos were blocked with 10% fetal calf serum and 5% normal goat serum in PBST for 1 hour at RT on a rocking platform and then incubated with a 1:500 rabbit cleaved caspase 3 monoclonal antibody (BD Bioscience) or a 1:250 rabbit phospho-histone 3 monoclonal antibody (CST) in blocking buffer overnight at 4°C. Embryos were washed 4x15 min in PBST and incubated with HRP-conjugated secondary goat anti-rabbit IgG diluted 1:1,000 in blocking buffer for 2 hours at RT on a rocking platform. Subsequently, embryos were washed 4x15 min in PBST and developed for 5 minutes at RT in diaminobenzidine (DAB) solution with H2O2 (Vector Labs). Stained embryos were washed in PBST, stored in 4% PFA at 4°C, and visualized by whole-mount brightfield imaging.

Chromatin immunoprecipitation and sequencing (ChIP-seq) of zebrafish embryos

Zebrafish embryo chromatin immunoprecipitation and next-generation sequencing were performed essentially as described293 using a rabbit polyclonal Irf6 antibody specific for zebrafish Irf6 (GeneTex). Briefly, zebrafish embryos (wild type and maternal-null irf6-/-) were dechorionated and fixed at RT for 15 minutes with 1.5% methanol-free PFA in PBS. Glycine was added to a final concentration of 0.125 M and incubated for 5 min at RT to quench the PFA fixation, after which zebrafish embryos were washed 3x5 min in ice-cold PBS. Crosslinked embryos were resuspended in cell lysis buffer and transferred into a 15 ml Tenbroeck tissue grinder (Wheaton) to homogenize embryos and completely lyse cells while maintaining nuclear integrity. Lysed cells were centrifuged at 3500 RPM for 5 min at 4°C to pellet nuclei, after which the pelleted nuclei were resuspended in nuclei lysis buffer and pipetted with a P200 tip to release the crosslinked chromatin into solution. The crosslinked chromatin was sonicated using a Covaris E220 acoustic sonicator to generate a chromatin library with fragment lengths centered around 200 bp. Fragmented crosslinked chromatin was placed into a solution of pre-blocked Dyna Protein G magnetic beads (Invitrogen) linked to the anti-Irf6 primary antibody diluted 1:100 to perform chromatin immunoprecipitation. The beads were washed 4x5 min with RIPA wash buffers and resuspended in elution buffer to reverse PFA crosslinks for 6 hours at 65°C. RNase I (Ambion) was added to a final concentration of 0.25 µg/µl and incubated at 37°C for 2 hours to remove any RNA contamination. DNA fragments were isolated by standard phenol:chloroform:isoamyl alcohol (25:24:1) extraction. ChIP DNA fragments were quantified with a Qubit fluorometer, and their size and quality were assessed with a Bioanalyzer 2100 (Agilent). Sequencing libraries were prepared with the NEBNext Ultra DNA library preparation kit (NEB), multiplexed with NEBNext index primers (NEB), and adapted for the Illumina sequencing platforms according to manufacturer instructions. The resulting DNA libraries were quantified with a Qubit fluorometer and the NEBNext library quantification kit for Illumina (NEB) and assessed for quality with the Bioanalyzer 2100. ChIP libraries were sequenced as paired-end 50 bp reads at ≈25 million reads per sample in biological duplicate. Input DNA samples were also sequenced to control for sequence biases from ChIP, DNA isolation, and the library preparation process.

mRNA-seq of zebrafish embryos

Total RNA was isolated from 4 hpf wild type and maternal-null irf6-/- embryos by TRIzol and phenol-chloroform extraction with ethanol precipitation as previously described. Total RNA was quantified with the Nanodrop 2500 and assessed for quality with Bioanalyzer 2100 RNA chips (Agilent). Samples with RNA integrity numbers (RIN) over 9 were selected to proceed with sequencing library preparation. mRNA-seq libraries were prepared with the NEBNext Ultra RNA library preparation kit with the poly(A) mRNA magnetic isolation module (NEB) essentially according to manufacturer protocols. The resulting cDNA libraries were quantified with a Qubit fluorometer and assessed for quality with a Bioanalyzer. The sequencing-ready cDNA libraries were quantified with the NEBNext library quantification kit for Illumina (NEB). mRNA-seq libraries were sequenced as single-end 50 bp reads at ≈20 million reads per sample in biological triplicate.

References

1. Santagati F, Rijli FM. Cranial neural crest and the building of the vertebrate head. Nat Rev Neurosci 2003;4:806-18.

2. Matsui M, Klingensmith J. Multiple tissue-specific requirements for the BMP antagonist Noggin in development of the mammalian craniofacial skeleton. Developmental Biology 2014;392:168-81.

3. Dixon MJ, Marazita ML, Beaty TH, Murray JC. Cleft lip and palate: understanding genetic and environmental influences. Nat Rev Genet 2011;12:167-78.

4. Rahimov F, Jugessur A, Murray JC. Genetics of nonsyndromic orofacial clefts. Cleft Palate Craniofac J 2012;49:73-91.

5. Bush JO, Jiang R. Palatogenesis: morphogenetic and molecular mechanisms of secondary palate development. Development 2012;139:231-43.

6. Masarei AG, Sell D, Habel A, Mars M, Sommerlad BC, Wade A. The Nature of Feeding in Infants With Unrepaired Cleft Lip and/or Palate Compared With Healthy Noncleft Infants. The Cleft Palate-Craniofacial Journal 2007;44:321-8.

7. Mossey PA, Little J, Munger RG, Dixon MJ, Shaw WC. Cleft lip and palate. The Lancet 2009;374:1773-85.

8. Boulet SL, Grosse SD, Honein MA, Correa-Villaseñor A. Children with Orofacial Clefts: Health-Care Use and Costs Among a Privately Insured Population. Public Health Reports 2009;124:447-53.

9. Wehby G, Cassell CH. The Impact of Orofacial Clefts on Quality of Life and Health Care Use and Costs. Oral diseases 2010;16:3-10.

10. Stanier P, Moore GE. Genetics of cleft lip and palate: syndromic genes contribute to the incidence of non-syndromic clefts. Hum Mol Genet 2004;13 Spec No 1:R73-81.

11. Kousa YA, Mansour TA, Seada H, Matoo S, Schutte BC. Shared molecular networks in orofacial and neural tube development. Birth Defects Research 2017;109:169-79.

12. Copp AJ, Stanier P, Greene NDE. Neural tube defects: recent advances, unsolved questions, and controversies. The Lancet Neurology 2013;12:799-810.

13. Smithells RW, Sheppard S, Schorah CJ. Vitamin deficiencies and neural tube defects. Archives of Disease in Childhood 1976;51:944-50.

14. Laurence KM, James N, Miller MH, Tennant GB, Campbell H. Double-blind randomised controlled trial of folate treatment before conception to prevent recurrence of neural-tube defects. British Medical Journal (Clinical research ed) 1981;282:1509-11.

15. Smithells RW, Sheppard S, Schorah CT, et al. Vitamin supplementation and neural tube defects. The Lancet 1981;318:1425.

16. Oakley GP, Jr, Erickson J, Adams MJ, Jr. Urgent need to increase folic acid consumption. JAMA 1995;274:1717-8.

17. Vieira AR. Genetic and environmental factors in human cleft lip and palate. Frontiers of oral biology 2012;16:19-31.

18. Leslie EJ, Marazita ML. Genetics of cleft lip and cleft palate. American journal of medical genetics Part C, Seminars in medical genetics 2013;163C:246-58.

19. Sivertsen Å, Wilcox AJ, Skjærven R, et al. Familial risk of oral clefts by morphological type & severity: population based cohort study of first degree relatives. BMJ 2008;336:432-4.

20. Little J, Bryan E. Congenital anomalies in twins. Seminars in Perinatology;10:50-64.

21. Gundlach KKH, Maus C. Epidemiological studies on the frequency of clefts in Europe and world-wide. Journal of Cranio-Maxillofacial Surgery 2006;34:1-2.

22. Boyles AL, DeRoo LA, Lie RT, et al. Maternal Alcohol Consumption, Alcohol Metabolism Genes, and the Risk of Oral Clefts: A Population-based Case-Control Study in Norway, 1996–2001. American Journal of Epidemiology 2010;172:924-31.

23. Holmes LB, Harvey EA, Coull BA, et al. The Teratogenicity of Anticonvulsant Drugs. New England Journal of Medicine 2001;344:1132-8.

24. Shaw GM, Wasserman CR, Lammer EJ, et al. Orofacial clefts, parental cigarette smoking, and transforming growth factor-alpha gene variants. American Journal of Human Genetics 1996;58:551-61.

25. Jia ZL, Shi B, Chen CH, Shi JY, Wu J, Xu X. Maternal malnutrition, environmental exposure during pregnancy and the risk of non-syndromic orofacial clefts. Oral Dis 2011;17:584-9.

26. Burg ML, Chai Y, Yao CA, Magee W, Figueiredo JC. Epidemiology, Etiology, and Treatment of Isolated Cleft Palate. Frontiers in Physiology 2016;7.

27. Shaw GM, Wasserman CR, Murray JC, Lammer EJ. Infant TGF-Alpha Genotype, Orofacial Clefts, and Maternal Periconceptional Multivitamin Use. The Cleft Palate-Craniofacial Journal 1998;35:366-70.

28. Abbott BD. The etiology of cleft palate: a 50-year search for mechanistic and molecular understanding. Birth Defects Research Part B: Developmental and Reproductive Toxicology 2010;89:266-74.

29. Tanner JP, Salemi JL, Stuart AL, et al. Associations between exposure to ambient benzene and PM(2.5) during pregnancy and the risk of selected birth defects in offspring. Environ Res 2015;142:345-53.

30. Abbott BD, Birnbaum LS. TCDD Alters Medial Epithelial Cell Differentiation during Palatogenesis. Toxicology and Applied Pharmacology 1989;99:276-86.

31. Hu X, Gao J, Liao Y, Tang S, Lu F. Retinoic acid alters the proliferation and survival of the epithelium and mesenchyme and suppresses Wnt/beta-catenin signaling in developing cleft palate. Cell Death Dis 2013;4:e898.

32. Lohnes D, Mark M, Mendelsohn C, Dollé P, Dierich A, Gorry P, Gansmuller A, Chambon P. Function of the retinoic acid receptors (RARs) during development. Development 1994;120.

33. Nørgård B, Puhó E, Czeizel AE, Skriver MV, Sørensen HT. Aspirin use during early pregnancy and the risk of congenital abnormalities: A population-based case-control study. American Journal of Obstetrics & Gynecology;192:922-3.

34. Jentink J, Loane MA, Dolk H, et al. Valproic Acid Monotherapy in Pregnancy and Major Congenital Malformations. New England Journal of Medicine 2010;362:2185-93.

35. Azarbayjani F, Danielsson BR. Phenytoin-induced cleft palate: Evidence for embryonic cardiac bradyarrhythmia due to inhibition of delayed rectifier K+ channels resulting in hypoxia–reoxygenation damage. Teratology 2001;63:152-60.

36. Sulik KK, Johnston MC, Ambrose LJH, Dorgan D. Phenytoin (Dilantin)-induced cleft lip and palate in A/J mice: A scanning and transmission electron microscopic study. The Anatomical Record 1979;195:243-55.

37. Ordulu Z, Wong KE, Currall BB, et al. Describing Sequencing Results of Structural Chromosome Rearrangements with a Suggested Next-Generation Cytogenetic Nomenclature. The American Journal of Human Genetics 2014;94:695-709.

38. Higgins AW, Alkuraya FS, Bosco AF, et al. Characterization of Apparently Balanced Chromosomal Rearrangements from the Developmental Genome Anatomy Project. American Journal of Human Genetics 2008;82:712-22.

39. Lindgren AM, Hoyos T, Talkowski ME, et al. Haploinsufficiency of KDM6A is associated with severe psychomotor retardation, global growth restriction, seizures and cleft palate. Human Genetics 2013;132:537-52.

40. Alkuraya FS, Saadi I, Lund JJ, Turbe-Doan A, Morton CC, Maas RL. SUMO1 Haploinsufficiency Leads to Cleft Lip and Palate. Science 2006;313:1751.

41. Teare MD, Barrett JH. Genetic linkage studies. The Lancet;366:1036-44.

42. Bush WS, Moore JH. Chapter 11: Genome-Wide Association Studies. PLOS Computational Biology 2012;8:e1002822.

43. Beaty TH, Murray JC, Marazita ML, et al. A genome-wide association study of cleft lip with and without cleft palate identifies risk variants near MAFB and ABCA4. Nat Genet 2010;42:525-9.

44. Burdick AB. Genetic epidemiology and control of genetic expression in van der Woude syndrome. J Craniofac Genet Dev Biol Suppl 1986;2:99-105.

45. Kondo S, Schutte BC, Richardson RJ, et al. Mutations in IRF6 cause Van der Woude and popliteal pterygium syndromes. Nat Genet 2002;32:285-9.

46. Taniguchi T, Ogasawara K, Takaoka A, Tanaka N. IRF Family of Transcription Factors as Regulators of Host Defense. Annual Review of Immunology 2001;19:623-55.

47. Zucchero TM, Cooper ME, Maher BS, et al. Interferon Regulatory Factor 6 (IRF6) Gene Variants and the Risk of Isolated Cleft Lip or Palate. New England Journal of Medicine 2004;351:769-80.

48. Pan Y, Ma J, Zhang W, et al. IRF6 polymorphisms are associated with nonsyndromic orofacial clefts in a Chinese Han population. Am J Med Genet A 2010;152A:2505-11.

49. Blanton SH, Cortez A, Stal S, Mulliken JB, Finnell RH, Hecht JT. Variation in IRF6 contributes to nonsyndromic cleft lip and palate. American Journal of Medical Genetics Part A 2005;137A:259-62.

50. Srichomthong C, Siriwan P, Shotelersuk V. Significant association between IRF6 820G→A and non-syndromic cleft lip with or without cleft palate in the Thai population. Journal of Medical Genetics 2005;42:e46.

51. Park JW, McIntosh I, Hetmanski JB, et al. Association between IRF6 and nonsyndromic cleft lip with or without cleft palate in four populations. Genetics in medicine : official journal of the American College of Medical Genetics 2007;9:219-27.

52. Jugessur A, Rahimov F, Lie RT, et al. Genetic Variants in IRF6 and the Risk of Facial Clefts: Single-Marker and Haplotype-Based Analyses in a Population-Based Case-Control Study of Facial Clefts in Norway. Genetic Epidemiology 2008;32:413-24.

53. Janku P, Robinow M, Kelly T, et al. Van der Woude syndrome in a large kindred: variability, penetrance, genetic risks. American Journal of Medical Genetics 1980;5:117-23.

54. Leslie EJ, O'Sullivan J, Cunningham ML, et al. Expanding the genetic and phenotypic spectrum of popliteal pterygium disorders. American Journal of Medical Genetics Part A 2015;167:545-52.

55. Van Der Woude A. Fistula labii inferioris congenita and its association with cleft lip and palate. American Journal of Human Genetics 1954;6:244-56.

56. Froster-Iskenius UG. Popliteal pterygium syndrome. Journal of Medical Genetics 1990;27:320-6.

57. Bixler D, Poland C, Nance WE. Phenotypic variation in the popliteal pterygium syndrome. Clinical Genetics 1973;4:220-8.

58. Chen W, Lam SS, Srinath H, et al. Insights into interferon regulatory factor activation from the crystal structure of dimeric IRF5. Nature Structural & Molecular Biology 2008;15:1213.

59. de Lima RLLF, Hoper SA, Ghassibe M, et al. Prevalence and nonrandom distribution of exonic mutations in interferon regulatory factor 6 in 307 families with Van der Woude syndrome and 37 families with popliteal pterygium syndrome. Genet Med 2009;11:241-7.

60. Au WC, Yeow WS, Pitha PM. Analysis of Functional Domains of Interferon Regulatory Factor 7 and Its Association with IRF-3. Virology 2001;280:273-82.

61. Lin R, Heylbroeck C, Genin P, Pitha PM, Hiscott J. Essential Role of Interferon Regulatory Factor 3 in Direct Activation of RANTES Chemokine Transcription. Molecular and Cellular Biology 1999;19:959-66.

62. Ingraham CR, Kinoshita A, Kondo S, et al. Abnormal skin, limb and craniofacial morphogenesis in mice deficient for interferon regulatory factor 6 (Irf6). Nat Genet 2006;38:1335-40.

63. Richardson RJ, Dixon J, Malhotra S, et al. Irf6 is a key determinant of the keratinocyte proliferation-differentiation switch. Nat Genet 2006;38:1329-34.

64. Hammond NL, Dixon J, Dixon MJ. Periderm: Life-cycle and function during orofacial and epidermal development. Seminars in cell & developmental biology 2017.

65. Shyamala K, Yanduri S, Girish H, Murgod S. Neural crest: The fourth germ layer. Journal of Oral and Maxillofacial Pathology 2015;19:221-9.

66. Simoes-Costa M, Bronner ME. Reprogramming of avian neural crest axial identity and cell fate. Science 2016;352:1570.

67. Lumsden A, Sprawson N, Graham A. Segmental origin and migration of neural crest cells in the hindbrain region of the chick embryo. Development 1991;113:1281.

68. Artinger KB. The fascinating world of neural crest cells: Letter from the Guest Editor. Cell Adhesion & Migration 2010;4:551-2.

69. Hunt P, Krumlauf R. Deciphering the Hox code: Clues to patterning branchial regions of the head. Cell 1991;66:1075-8.

70. Trainor PA, Krumlauf R. Hox genes, neural crest cells and branchial arch patterning. Current Opinion in Cell Biology 2001;13:698-705.

71. Rijli FM, Mark M, Lakkaraju S, Dierich A, Dollé P, Chambon P. A homeotic transformation is generated in the rostral branchial region of the head by disruption of Hoxa-2, which acts as a selector gene. Cell 1993;75:1333-49.

72. Gendron-Maguire M, Mallo M, Zhang M, Gridley T. Hoxa-2 mutant mice exhibit homeotic transformation of skeletal elements from cranial neural crest. Cell 1993;75:1317-31.

73. Osumi-Yamashita N, Ninomiya Y, Eto K, Doi H. The contribution of both forebrain and midbrain crest cells to the mesenchyme in the frontonasal mass of mouse embryos. Developmental Biology 1994;164:409-19.

74. Creuzet S, Couly G, Vincent C, Le Douarin NM. Negative effect of Hox gene expression on the development of the neural crest-derived facial skeleton. Development 2002;129:4301.

75. Trainor A, Melton KR, Manzanares M. Origins and plasticity of neural crest cells and their roles in jaw and craniofacial evolution. The International Journal of Developmental Biology 2003;47.

76. Minoux M, Antonarakis GS, Kmita M, Duboule D, Rijli FM. Rostral and caudal pharyngeal arches share a common neural crest ground pattern. Development 2009;136:637.

77. Serbedzija GN, Bronner-Fraser M, Fraser SE. Vital dye analysis of cranial neural crest cell migration in the mouse embryo. Development 1992;116:297.

78. Prince V, Lumsden A. Hoxa-2 expression in normal and transposed rhombomeres: independent regulation in the neural tube and neural crest. Development 1994;120:911.

79. Couly G, Grapin-Botton A, Coltey P, Ruhin B, Le Douarin NM. Determination of the identity of the derivatives of the cephalic neural crest: incompatibility between Hox gene expression and lower jaw development. Development 1998;125:3445.

80. Hunt P, Clarke JDW, Buxton P, Ferretti P, Thorogood P. Stability and Plasticity of Neural Crest Patterning and Branchial Arch Hox Code after Extensive Cephalic Crest Rotation. Developmental Biology 1998;198:82-104.

81. Trainor PA, Ariza-McNaughton L, Krumlauf R. Role of the Isthmus and FGFs in Resolving the Paradox of Neural Crest Plasticity and Prepatterning. Science 2002;295:1288.

82. Trainor P, Krumlauf R. Plasticity in mouse neural crest cells reveals a new patterning role for cranial mesoderm. Nature Cell Biology 2000;2:96.

83. Schilling TF, Prince V, Ingham PW. Plasticity in Zebrafish hox Expression in the Hindbrain and Cranial Neural Crest. Developmental Biology 2001;231:201-16.

84. Hu D. A zone of frontonasal ectoderm regulates patterning and growth in the face. Development 2003;130:1749-58.

85. Xu J, Liu H, Lan Y, Aronow BJ, Kalinichenko VV, Jiang R. A Shh-Foxf-Fgf18-Shh Molecular Circuit Regulating Palate Development. PLoS Genet 2016;12:e1005769.

86. Yanfeng W, Saint-Jeannet JP, Klein PS. Wnt-frizzled signaling in the induction and differentiation of the neural crest. Bioessays 2003;25.

87. He F, Xiong W, Yu X, et al. Wnt5a regulates directional cell migration and cell proliferation via Ror2-mediated noncanonical pathway in mammalian palate development. Development 2008;135:3871-9.

88. Reid BS, Yang H, Melvin VS, Taketo MM, Williams T. Ectodermal WNT/β-catenin signaling shapes the mouse face. Developmental Biology 2011;349:261-9.

89. Liu W, Sun X, Braut A, et al. Distinct functions for Bmp signaling in lip and palate fusion in mice. Development 2005;132:1453.

90. Parada C, Chai Y. Roles of BMP Signaling Pathway in Lip and Palate Development. Frontiers of oral biology 2012;16:60-70.

91. Eberhart JK, Swartz ME, Crump JG, Kimmel CB. Early Hedgehog signaling from neural to oral epithelium organizes anterior craniofacial development. Development 2006;133:1069-77.

92. Wada N, Javidan Y, Nelson S, Carney TJ, Kelsh RN, Schilling TF. Hedgehog signaling is required for cranial neural crest morphogenesis and chondrogenesis at the midline in the zebrafish skull. Development 2005;132:3977-88.

93. Hu D, Marcucio RS. Unique organization of the frontonasal ectodermal zone in birds and mammals. Developmental Biology 2009;325:200-10.

94. Hu D, Marcucio RS. A SHH-responsive signaling center in the forebrain regulates craniofacial morphogenesis via the facial ectoderm. Development 2009;136:107.

95. Jacox L, Sindelka R, Chen J, Rothman A, Dickinson A, Sive H. The extreme anterior domain is an essential craniofacial organizer acting through Kinin-Kallikrein signaling. Cell reports 2014;8:596-609.

96. Chen J, Jacox LA, Saldanha F, Sive H. Mouth development. Wiley interdisciplinary reviews Developmental biology 2017;6.

97. Dickinson AJG, Sive HL. The Wnt antagonists Frzb-1 and Crescent locally regulate basement membrane dissolution in the developing primary mouth. Development 2009.

98. Xu X, Han J, Ito Y, Bringas Jr P, Deng C, Chai Y. Ectodermal Smad4 and p38 MAPK Are Functionally Redundant in Mediating TGF-β/BMP Signaling during Tooth and Palate Development. Developmental Cell 2008;15:322-9.

99. Cuervo R, Valencia C, Chandraratna R, Covarrubias L. Programmed cell death is required for palate shelf fusion and is regulated by retinoic acid. Dev Biol 2002;245:145-56.

100. Kimmel CB, Miller CT, Moens CB. Specification and morphogenesis of the zebrafish larval head skeleton. Dev Biol 2001;233:239-57.

101. Miller CT, Schilling TF, Lee K, Parker J, Kimmel CB. sucker encodes zebrafish Endothelin-1 required for ventral pharyngeal arch development. Development 2000;127:3815.

102. Cox TC. Taking it to the max: The genetic and developmental mechanisms coordinating midfacial morphogenesis and dysmorphology. Clinical Genetics 2004;65:163-76.

103. Tamarin A, Boyde A. Facial and visceral arch development in the mouse embryo: a study by scanning electron microscopy. Journal of Anatomy 1977;124:563-80.

104. Rice R, Connor E, Rice DPC. Expression patterns of Hedgehog signalling pathway members during mouse palate development. Gene Expression Patterns 2006;6:206-12.

105. Lan Y, Jiang R. Sonic hedgehog signaling regulates reciprocal epithelial-mesenchymal interactions controlling palatal outgrowth. Development 2009;136:1387.

106. Rice R, Spencer-Dene B, Connor EC, et al. Disruption of Fgf10/Fgfr2b-coordinated epithelial-mesenchymal interactions causes cleft palate. Journal of Clinical Investigation 2004;113:1692-700.

107. Rice R, Spencer-Dene B, Connor EC, et al. Disruption of Fgf10/Fgfr2b-coordinated epithelial-mesenchymal interactions causes cleft palate. The Journal of Clinical Investigation 2004;113:1692-700.

108. Richardson RJ, Hammond NL, Coulombe PA, et al. Periderm prevents pathological epithelial adhesions during embryogenesis. J Clin Invest 2014;124:3891-900.

109. Dudas M, Li W-Y, Kim J, Yang A, Kaartinen V. Palatal fusion – Where do the midline cells go?: A review on cleft palate, a major human birth defect. Acta Histochemica 2007;109:1-14.

110. Cuervo R, Covarrubias L. Death is the major fate of medial edge epithelial cells and the cause of basal lamina degradation during palatogenesis. Development 2004;131:15.

111. Martínez-Álvarez C, Blanco MJ, Pérez R, et al. Snail family members and cell survival in physiological and pathological cleft palates. Developmental Biology 2004;265:207-18.

112. Cecconi F, Alvarez-Bolado G, Meyer BI, Roth KA, Gruss P. Apaf1 (CED-4 Homolog) Regulates Programmed Cell Death in Mammalian Development. Cell 1998;94:727-37.

113. Vaziri Sani F, Hallberg K, Harfe BD, McMahon AP, Linde A, Gritli-Linde A. Fate-mapping of the epithelial seam during palatal fusion rules out epithelial–mesenchymal transformation. Developmental Biology 2005;285:490-5.

114. Xu X, Han J, Ito Y, Bringas Jr P, Urata MM, Chai Y. Cell autonomous requirement for Tgfbr2 in the disappearance of medial edge epithelium during palatal fusion. Developmental Biology 2006;297:238-48.

115. Jin J-Z, Ding J. Analysis of cell migration, transdifferentiation and apoptosis during mouse secondary palate fusion. Development 2006;133:3341.

116. Fuchs E. Scratching the surface of skin development. Nature 2007;445:834.

117. M’Boneko V, Merker H-J. Development and morphology of the periderm of mouse embryos (days 9–12 of gestation). Cells Tissues Organs 1988;133:325-36.

118. Lechler T, Fuchs E. Asymmetric cell divisions promote stratification and differentiation of mammalian skin. Nature 2005;437:275-80.

119. Okano J, Lichti U, Mamiya S, et al. Increased retinoic acid levels through ablation of Cyp26b1 determine the processes of embryonic skin barrier formation and peridermal development. J Cell Sci 2012;125:1827-36.

120. Hardman MJ, Sisi P, Banbury DN, Byrne C. Patterned acquisition of skin barrier function during development. Development 1998;125:1541-52.

121. Kousa YA, Schutte BC. Toward an orofacial gene regulatory network. Developmental Dynamics 2016;245:220-32.

122. Iwata J-i, Suzuki A, Pelikan RC, et al. Smad4-Irf6 genetic interaction and TGFβ-mediated IRF6 signaling cascade are crucial for palatal fusion in mice. Development 2013;140:1220-30.

123. Ferretti E, Li B, Zewdu R, et al. A conserved Pbx-Wnt-p63-Irf6 regulatory module controls face morphogenesis by promoting epithelial apoptosis. Dev Cell 2011;21:627-41.

124. Celli J, Duijf P, Hamel BC, et al. Heterozygous germline mutations in the p53 homolog p63 are the cause of EEC syndrome. Cell 1999;99:143-53.

125. Rahimov F, Marazita ML, Visel A, et al. Disruption of an AP-2alpha binding site in an IRF6 enhancer is associated with cleft lip. Nat Genet 2008;40:1341-7.

126. Milunsky JM, Maher TA, Zhao G, et al. TFAP2A mutations result in branchio-oculo-facial syndrome. The American Journal of Human Genetics 2008;82:1171-7.

127. Chen W, Royer WE. Structural insights into interferon regulatory factor activation. Cellular signalling 2010;22:883-7.

128. De Groote P, Tran H, Fransen M, et al. A novel RIPK4–IRF6 connection is required to prevent epithelial fusions characteristic for popliteal pterygium syndromes. Cell Death & Differentiation 2015;22:1012-24.

129. Kwa MQ, Huynh J, Aw J, et al. Receptor-interacting protein kinase 4 and interferon regulatory factor 6 function as a signaling axis to regulate keratinocyte differentiation. Journal of Biological Chemistry 2014;289:31077-87.

130. Kwa MQ, Nguyen T, Huynh J, et al. Interferon regulatory factor 6 differentially regulates Toll-like receptor 2-dependent chemokine gene expression in epithelial cells. Journal of Biological Chemistry 2014;289:19758-68.

131. Mitchell K, O'Sullivan J, Missero C, et al. Exome sequence identifies RIPK4 as the Bartsocas-Papas syndrome locus. The American Journal of Human Genetics 2012;90:69-75.

132. Pham D-H, Zhang C, Yin C. Using Zebrafish to Model Liver Diseases-Where Do We Stand? Current Pathobiology Reports 2017;5:207-21.

133. Casey MJ, Stewart RA. Zebrafish as a model to study neuroblastoma development. Cell and Tissue Research 2017.

134. Ganz J. Gut feelings: Studying enteric nervous system development, function, and disease in the zebrafish model system. Developmental Dynamics:n/a-n/a.

135. Mathai B, Meijer A, Simonsen A. Studying Autophagy in Zebrafish. Cells 2017;6:21.

136. Duncan KM, Mukherjee K, Cornell RA, Liao EC. Zebrafish models of orofacial clefts. Developmental Dynamics 2017;246:897-914.

137. Machado RG, Eames BF. Using Zebrafish to Test the Genetic Basis of Human Craniofacial Diseases. Journal of Dental Research 2017;96:1192-9.

138. Kaufman CK, White RM, Zon L. Chemical genetic screening in the zebrafish embryo. Nat Protoc 2009;4:1422-32.

139. Murphey RD, Zon LI. Small molecule screens in the zebrafish. Methods 2006;39:255-61.

140. North TE, Goessling W, Walkley CR, et al. Prostaglandin E2 regulates vertebrate haematopoietic stem cell homeostasis. Nature 2007;447:1007.

141. Yang C-T, Johnson SL. Small molecule-induced ablation and subsequent regeneration of larval zebrafish melanocytes. Development 2006;133:3563.

142. Driever W, Solnica-Krezel L, Schier AF, et al. A genetic screen for mutations affecting embryogenesis in zebrafish. Development 1996;123:37-46.

143. Mullins MC, Hammerschmidt M, Haffter P, Nüsslein-Volhard C. Large-scale mutagenesis in the zebrafish: in search of genes controlling development in a vertebrate. Current Biology 1994;4:189-202.

144. Chakrabarti S, Streisinger G, Singer F, Walker C. Frequency of γ-ray induced specific locus and recessive lethal mutations in mature germ cells of the zebrafish Brachydanio rerio. Genetics 1983;103:109.

145. Lawson ND, Wolfe SA. Forward and Reverse Genetic Approaches for the Analysis of Vertebrate Development in the Zebrafish. Developmental Cell 2011;21:48-64.

146. Bedell VM, Wang Y, Campbell JM, et al. In vivo genome editing using a high-efficiency TALEN system. Nature 2012;491:114-8.

147. Cade L, Reyon D, Hwang WY, et al. Highly efficient generation of heritable zebrafish gene mutations using homo- and heterodimeric TALENs. Nucleic Acids Research 2012;40:8001-10.

148. Kawakami K. A versatile gene transfer vector in vertebrates. Genome Biology 2007;8:S7.

149. Hwang WY, Fu Y, Reyon D, et al. Efficient genome editing in zebrafish using a CRISPR-Cas system. Nat Biotech 2013;31:227-9.

150. Zhang Y, Qin W, Lu X, et al. Programmable base editing of zebrafish genome using a modified CRISPR-Cas9 system. Nature Communications 2017;8:118.

151. Eisen JS, Smith JC. Controlling morpholino experiments: don't stop making antisense. Development 2008;135:1735-43.

152. Zhang D, Gates KP, Barske L, et al. Endoderm Jagged induces liver and pancreas duct lineage in zebrafish. Nature Communications 2017;8:769.

153. North TE, Babu IR, Vedder LM, et al. PGE2-regulated wnt signaling and N-acetylcysteine are synergistically hepatoprotective in zebrafish acetaminophen injury. Proceedings of the National Academy of Sciences 2010;107:17315-20.

154. Kroeger PT, Drummond BE, Miceli R, et al. The zebrafish kidney mutant zeppelin reveals that brca2/fancd1 is essential for pronephros development. Developmental Biology 2017;428:148-63.

155. Jerman S, Sun Z. Chapter Two - Using Zebrafish to Study Kidney Development and Disease. In: Sadler KC, ed. Current Topics in Developmental Biology: Academic Press; 2017:41-79.

156. Carroll KJ, North TE. Oceans of opportunity: Exploring vertebrate hematopoiesis in zebrafish. Experimental Hematology 2014;42:684-96.

157. Boatman S, Barrett F, Satishchandran S, Jing L, Shestopalov I, Zon LI. Assaying hematopoiesis using zebrafish. Blood Cells, Molecules, and Diseases 2013;51:271-6.

158. North TE, Goessling W. Haematopoietic stem cells show their true colours. Nature Cell Biology 2016;19:10.

159. North TE, Goessling W, Peeters M, et al. Hematopoietic Stem Cell Development Is Dependent on Blood Flow. Cell 2009;137:736-48.

160. González-Rosa JM, Burns CE, Burns CG. Zebrafish heart regeneration: 15 years of discoveries. Regeneration 2017;4:105-23.

161. Paffett-Lugassy N, Novikov N, Jeffrey S, et al. Unique developmental trajectories and genetic regulation of ventricular and outflow tract progenitors in the zebrafish second heart field. Development 2017.

162. Boyd PJ, Tu W-Y, Shorrock HK, et al. Bioenergetic status modulates motor neuron vulnerability and pathogenesis in a zebrafish model of spinal muscular atrophy. PLOS Genetics 2017;13:e1006744.

163. LaBonty M, Pray N, Yelick PC. A Zebrafish Model of Human Fibrodysplasia Ossificans Progressiva. Zebrafish 2017;14:293-304.

164. Chen JW, Galloway JL. Chapter 12 - Using the zebrafish to understand tendon development and repair. In: Detrich HW, Westerfield M, Zon LI, eds. Methods in Cell Biology: Academic Press; 2017:299-320.

165. Theveneau E, Mayor R. Neural crest delamination and migration: From epithelium-to-mesenchyme transition to collective migration. Developmental Biology 2012;366:34-54.

166. Dougherty M, Kamel G, Shubinets V, Hickey G, Grimaldi M, Liao EC. Embryonic fate map of first pharyngeal arch structures in the sox10:kaede zebrafish transgenic model. J Craniofac Surg 2012;23:1333-7.

167. Kamel G, Hoyos T, Rochard L, et al. Requirement for frzb and fzd7a in cranial neural crest convergence and extension mechanisms during zebrafish palate and jaw morphogenesis. Dev Biol 2013;381:423-33.

168. Mork L, Crump G. Chapter Ten - Zebrafish Craniofacial Development: A Window into Early Patterning. In: Chai Y, ed. Current Topics in Developmental Biology: Academic Press; 2015:235-69.

169. Eberhart JK, He X, Swartz ME, et al. MicroRNA Mirn140 modulates Pdgf signaling during palatogenesis. Nat Genet 2008;40:290-8.

170. McCarthy N, Liu JS, Richarte AM, et al. Pdgfra and Pdgfrb genetically interact during craniofacial development. Developmental Dynamics 2016;245:641-52.

171. Xu X, Bringas P, Jr., Soriano P, Chai Y. PDGFR-alpha signaling is critical for tooth cusp and palate morphogenesis. Dev Dyn 2005;232:75-84.

172. Saadi I, Alkuraya FS, Gisselbrecht SS, et al. Deficiency of the Cytoskeletal Protein SPECC1L Leads to Oblique Facial Clefting. The American Journal of Human Genetics 2011;89:44-55.

173. Weiner AMJ, Scampoli NL, Calcaterra NB. Fishing the Molecular Bases of Treacher Collins Syndrome. PLOS ONE 2012;7:e29574.

174. Noack Watt KE, Achilleos A, Neben CL, Merrill AE, Trainor PA. The Roles of RNA Polymerase I and III Subunits Polr1c and Polr1d in Craniofacial Development and in Zebrafish Models of Treacher Collins Syndrome. PLOS Genetics 2016;12:e1006187.

175. Van Laarhoven PM, Neitzel LR, Quintana AM, et al. Kabuki syndrome genes KMT2D and KDM6A: functional analyses demonstrate critical roles in craniofacial, heart and brain development. Human Molecular Genetics 2015;24:4443-53.

176. Nakada C, Iida A, Tabata Y, Watanabe S. Forkhead transcription factor foxe1 regulates chondrogenesis in zebrafish. Journal of Experimental Zoology Part B: Molecular and Developmental Evolution 2009;312B:827-40.

177. Shaw ND, Brand H, Kupchinsky ZA, et al. SMCHD1 mutations associated with a rare muscular dystrophy can also cause isolated arhinia and Bosma arhinia microphthalmia syndrome. Nature Genetics 2017;49:238.

178. Knight RD, Nair S, Nelson SS, et al. lockjaw encodes a zebrafish tfap2a required for early neural crest development. Development 2003;130:5755-68.

179. Botti E, Spallone G, Moretti F, et al. Developmental factor IRF6 exhibits tumor suppressor activity in squamous cell carcinomas. Proceedings of the National Academy of Sciences 2011;108:13710-5.

180. Shipley GD, Pittelkow MR. Growth and differentiation in vitro of human keratinocytes cultured in serum-free medium. Archives of Dermatology 1987;123:1541a-4a.

181. Ben J, Jabs EW, Chong SS. Genomic, cDNA and embryonic expression analysis of zebrafish IRF6, the gene mutated in the human oral clefting disorders Van der Woude and popliteal pterygium syndromes. Gene Expr Patterns 2005;5:629-38.

182. Simossis VA, Heringa J. PRALINE: a multiple sequence alignment toolbox that integrates homology-extended and secondary structure information. Nucleic acids research 2005;33:W289-W94.

183. Schier AF. Maternal-zygotic transition: death and birth of RNAs. Science 2007;316:406-7.

184. Sabel JL, d'Alencon C, O'Brien EK, et al. Maternal Interferon Regulatory Factor 6 is required for the differentiation of primary superficial epithelia in Danio and Xenopus embryos. Dev Biol 2009;325:249-62.

185. de la Garza G, Schleiffarth JR, Dunnwald M, et al. Interferon regulatory factor 6 promotes differentiation of the periderm by activating expression of Grainyhead-like 3. J Invest Dermatol 2013;133:68-77.

186. Kimmel CB, Ballard WW, Kimmel SR, Ullmann B, Schilling TF. Stages of Embryonic Development of the Zebrafish. Dev Dynamics 1995;203.

187. Peyrard-Janvid M, Leslie EJ, Kousa YA, et al. Dominant mutations in GRHL3 cause Van der Woude Syndrome and disrupt oral periderm development. Am J Hum Genet 2014;94:23-32.

188. Barrangou R, Fremaux C, Deveau H, et al. CRISPR provides acquired resistance against viruses in prokaryotes. Science 2007;315:1709-12.

189. Cong L, Ran FA, Cox D, et al. Multiplex genome engineering using CRISPR/Cas systems. Science 2013;339:819-23.

190. Mali P, Yang L, Esvelt KM, et al. RNA-guided engineering via Cas9. Science 2013;339:823-6.

191. Nishimasu H, Ran FA, Hsu PD, et al. Crystal structure of Cas9 in complex with guide RNA and target DNA. Cell 2014;156:935-49.

192. Kleinstiver BP, Prew MS, Tsai SQ, et al. Engineered CRISPR-Cas9 nucleases with altered PAM specificities. Nature 2015;523:481.

193. O’Connell MR, Oakes BL, Sternberg SH, East-Seletsky A, Kaplan M, Doudna JA. Programmable RNA recognition and cleavage by CRISPR/Cas9. Nature 2014;516:263.

194. Pattanayak V, Lin S, Guilinger JP, Ma E, Doudna JA, Liu DR. High-throughput profiling of off-target DNA cleavage reveals RNA-programmed Cas9 nuclease specificity. Nature biotechnology 2013;31:839-43.

195. Sander JD, Joung JK. CRISPR-Cas systems for editing, regulating and targeting genomes. Nature biotechnology 2014;32:347-55.

196. Bétermier M, Bertrand P, Lopez BS. Is Non-Homologous End-Joining Really an Inherently Error-Prone Process? PLOS Genetics 2014;10:e1004086.

197. Jasin M, Rothstein R. Repair of strand breaks by homologous recombination. Cold Spring Harbor perspectives in biology 2013;5:a012740.

198. Peng Y, Clark KJ, Campbell JM, Panetta MR, Guo Y, Ekker SC. Making designer mutants in model organisms. Development 2014;141:4042-54.

199. Auer TO, Duroure K, De Cian A, Concordet JP, Del Bene F. Highly efficient CRISPR/Cas9-mediated knock-in in zebrafish by homology-independent DNA repair. Genome Res 2014;24:142-53.

200. Fakhouri WD, Rhea L, Du T, et al. MCS9.7 enhancer activity is highly, but not completely, associated with expression of Irf6 and p63. Developmental Dynamics 2012;241:340-9.

201. Dutton JR, Antonellis A, Carney TJ, et al. An evolutionarily conserved intronic region controls the spatiotemporal expression of the transcription factor Sox10. BMC Developmental Biology 2008;8:1-20.

202. Ju B, Xu Y, He J, et al. Faithful expression of green fluorescent protein (GFP) in transgenic zebrafish embryos under control of zebrafish gene promoters. Developmental Genetics 1999;25:158-67.

203. Suster ML, Kikuta H, Urasaki A, Asakawa K, Kawakami K. Transgenesis in zebrafish with the tol2 transposon system. Transgenesis Techniques: Principles and Protocols 2009:41-63.

204. Kwan KM, Fujimoto E, Grabher C, et al. The Tol2kit: A multisite gateway-based construction kit for Tol2 transposon transgenesis constructs. Developmental Dynamics 2007;236:3088-99.

205. Nepal C, Hadzhiev Y, Previti C, et al. Dynamic regulation of the transcription initiation landscape at single nucleotide resolution during vertebrate embryogenesis. Genome research 2013;23:1938-50.

206. Schneider R, Bannister AJ, Myers FA, Thorne AW, Crane-Robinson C, Kouzarides T. Histone H3 lysine 4 methylation patterns in higher eukaryotic genes. Nature cell biology 2004;6:73-7.

207. Liu H, Leslie EJ, Jia Z, et al. Irf6 directly regulates Klf17 in zebrafish periderm and Klf4 in murine oral epithelium, and dominant-negative KLF4 variants are present in patients with cleft lip and palate. Hum Mol Genet 2016;25:766-76.

208. Moretti F, Marinari B, Lo Iacono N, et al. A regulatory feedback loop involving p63 and IRF6 links the pathogenesis of 2 genetically different human ectodermal dysplasias. The Journal of Clinical Investigation;120:1570-7.

209. Dougherty M, Kamel G, Grimaldi M, et al. Distinct requirements for wnt9a and irf6 in extension and integration mechanisms during zebrafish palate morphogenesis. Development 2013;140:76-81.

210. MacArthur DG, Manolio TA, Dimmock DP, et al. Guidelines for investigating causality of sequence variants in human disease. Nature 2014;508:469-76.

211. Durbin RM, Abecasis GR, Altshuler DL, et al. A map of human genome variation from population-scale sequencing. Nature 2010;467.

212. König E, Rainer J, Domingues FS. Computational assessment of feature combinations for pathogenic variant prediction. Molecular Genetics & Genomic Medicine 2016;4:431-46.

213. Rasmussen LJ, Heinen CD, Royer-Pokora B, et al. Pathological assessment of mismatch repair gene variants in Lynch syndrome: Past, present, and future. Human Mutation 2012;33:1617-25.

214. Stitziel NO, Kiezun A, Sunyaev S. Computational and statistical approaches to analyzing variants identified by exome sequencing. Genome Biology 2011;12:1-10.

215. Manolio TA, Collins FS, Cox NJ, et al. Finding the missing heritability of complex diseases. Nature 2009;461:747-53.

216. Thusberg J, Olatubosun A, Vihinen M. Performance of mutation pathogenicity prediction methods on missense variants. Human Mutation 2011;32:358-68.

217. Wei Q, Wang L, Wang Q, Kruger WD, Dunbrack RL. Testing computational prediction of missense mutation phenotypes: characterization of 204 mutations of human cystathionine beta synthase. Proteins: Structure, Function, and Bioinformatics 2010;78:2058-74.

218. Bell CJ, Dinwiddie DL, Miller NA, et al. Carrier Testing for Severe Childhood Recessive Diseases by Next-Generation Sequencing. Science Translational Medicine 2011;3:65ra4.

219. Lek M, Karczewski KJ, Minikel EV, et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 2016;536:285-91.

220. Quintáns B, Ordóñez-Ugalde A, Cacheiro P, Carracedo A, Sobrido MJ. Medical genomics: The intricate path from genetic variant identification to clinical interpretation. Applied & Translational Genomics 2014;3:60-7.

221. Little HJ, Rorick NK, Su LI, et al. Missense mutations that cause Van der Woude syndrome and popliteal pterygium syndrome affect the DNA-binding and transcriptional activation functions of IRF6. Hum Mol Genet 2009;18:535-45.

222. Leslie EJ, Standley J, Compton J, Bale S, Schutte BC, Murray JC. Comparative analysis of IRF6 variants in families with Van der Woude syndrome and popliteal pterygium syndrome using public whole-exome databases. Genet Med 2013;15:338-44.

223. Hicks S, Wheeler DA, Plon SE, Kimmel M. Prediction of missense mutation functionality depends on algorithm & sequence alignment employed. Human Mutation 2011;32:661-8.

224. Adzhubei IA, Schmidt S, Peshkin L, et al. A method and server for predicting damaging missense mutations. Nat Methods 2010;7.

225. Ng PC, Henikoff S. SIFT: predicting amino acid changes that affect protein function. Nucleic Acids Research 2003;31:3812-4.

226. Richards S, Aziz N, Bale S, et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med 2015;17:405-24.

227. Rost B, Radivojac P, Bromberg Y. Protein function in precision medicine: deep understanding with machine learning. FEBS Letters 2016;590:2327-41.

228. Ene-Choo Tan EC-PL, Shiao-Hui Yap, Seng-Teik Lee, Joanne Cheng, Yong-Chen Pop, Vincent Yeow. Identification of IRF6 gene variants in three families with Van der Woude syndrome. International Journal of Molecular Medicine 2008;21:747-51.

229. Feil R, Wagner J, Metzger D, Chambon P. Regulation of Cre recombinase activity by mutated ligand-binding domains. Biochemical and biophysical research communications 1997;237:752-7.

230. Chekuru A, Kuscha V, Hans S, Brand M. Ligand-Controlled Site-Specific Recombination in Zebrafish. Site-Specific Recombinases: Springer; 2017:87-97.

231. Buchholz DR. Tet-on binary systems for tissue-specific and inducible transgene expression. Xenopus Protocols: Post-Genomic Approaches 2012:265-75.

232. Gu Q, Yang X, He X, Li Q, Cui Z. Generation and characterization of a transgenic zebrafish expressing the reverse tetracycline transactivator. Journal of genetics and genomics 2013;40:523-31.

233. Shoji W, Sato-Maeda M. Application of heat shock promoter in transgenic zebrafish. Development, growth & differentiation 2008;50:401-6.

234. Halloran MC, Sato-Maeda M, Warren J, et al. Laser-induced gene expression in specific cells of transgenic zebrafish. Development 2000;127:1953-60.

235. Sadowski I, Ma J, Triezenberg S, Ptashne M. GAL4-VP16 is an unusually potent transcriptional activator. Nature 1988;335:563-4.

236. Halpern ME, Rhee J, Goll MG, Akitake CM, Parsons M, Leach SD. Gal4/UAS transgenic tools and their application to zebrafish. Zebrafish 2008;5:97-110.

237. Asakawa K, Kawakami K. Targeted gene expression by the Gal4-UAS system in zebrafish. Development, growth & differentiation 2008;50:391-9.

238. Nagy A. Cre recombinase: the universal reagent for genome tailoring. genesis 2000;26:99.

239. Mosimann C, Zon LI. Advanced zebrafish transgenesis with Tol2 and application for Cre recombination experiments. Zebrafish: genetics, genomics and informatics 2011;104.

240. Das AT, Tenenbaum L, Berkhout B. Tet-On Systems For Doxycycline-inducible Gene Expression. Current gene therapy 2016;16:156-67.

241. Knopf F, Schnabel K, Haase C, Pfeifer K, Anastassiadis K, Weidinger G. Dually inducible TetON systems for tissue-specific conditional gene expression in zebrafish. Proceedings of the National Academy of Sciences of the United States of America 2010;107:19933-8.

242. Krone PH, Lele Z, Sass JB. Heat shock genes and the heat shock response in zebrafish embryos. Biochemistry and cell biology 1997;75:487-97.

243. Yizhar O, Fenno LE, Davidson TJ, Mogri M, Deisseroth K. Optogenetics in Neural Systems. Neuron 2011;71:9-34.

244. Fenno L, Yizhar O, Deisseroth K. The Development and Application of Optogenetics. Annual Review of Neuroscience 2011;34:389-412.

245. Müller K, Naumann S, Weber W, Zurbriggen MD. Optogenetics for gene expression in mammalian cells. Biological chemistry 2015;396:145-52.

246. Shimizu-Sato S, Huq E, Tepperman JM, Quail PH. A light-switchable gene promoter system. Nature biotechnology 2002;20:1041-4.

247. Polstein LR, Gersbach CA. Light-inducible spatiotemporal control of gene activation by customizable transcription factors. JACS 2012;134:16480-3.

248. Motta-Mena LB, Reade A, Mallory MJ, et al. An optogenetic gene expression system with rapid activation and deactivation kinetics. Nat Chem Biol 2014;10:196-202.

249. Nash AI, McNulty R, Shillito ME, et al. Structural basis of photosensitivity in a bacterial light-oxygen-voltage/helix-turn-helix (LOV-HTH) DNA-binding protein. Proceedings of the National Academy of Sciences 2011;108:9449-54.

250. Rivera-Cancel G, Motta-Mena LB, Gardner KH. Identification of Natural and Artificial DNA Substrates for Light-Activated LOV–HTH Transcription Factor EL222. Biochemistry 2012;51:10024-34.

251. Reade A, Motta-Mena LB, Gardner KH, Stainier DY, Weiner OD, Woo S. TAEL: a zebrafish-optimized optogenetic gene expression system with fine spatial and temporal control. Development 2017;144:345-55.

252. Schulte-Merker S, Van Eeden F, Halpern ME, Kimmel C, Nusslein-Volhard C. no tail (ntl) is the zebrafish homologue of the mouse T gene. Development 1994;120:1009-15.

253. Sedykh I, Yoon B, Roberson L, Moskvin O, Dewey CN, Grinblat Y. Zebrafish zic2 controls formation of periocular neural crest and choroid fissure morphogenesis. Developmental biology 2017;429:92-104.

254. Farlie PG, Baker NL, Yap P, Tan TY. Frontonasal Dysplasia: Towards an Understanding of Molecular and Developmental Aetiology. Molecular syndromology 2016;7:312-21.

255. Ullah A, Kalsoom UE, Umair M, et al. Exome sequencing revealed a splice site variant in the ALX1 gene underlying frontonasal dysplasia. Clinical genetics 2017;91:494-8.

256. Dee CT, Szymoniuk CR, Mills PE, Takahashi T. Defective neural crest migration revealed by a Zebrafish model of Alx1-related frontonasal dysplasia. Human molecular genetics 2012;22:239-51.

257. Markel H, Chandler J, Werr W. Translational fusions with the engrailed repressor domain efficiently convert plant transcription factors into dominant-negative functions. Nucleic Acids Research 2002;30:4709-19.

258. Vickers ER, Sharrocks AD. The use of inducible engrailed fusion proteins to study the cellular functions of eukaryotic transcription factors. Methods 2002;26:270-80.

259. Gritli-Linde A. p63 & IRF6: brothers in arms against CLP. J Clin Invest 2010;120:1386-9.

260. Koillinen H, Wong FK, Rautio J, et al. Mapping of the second locus for the Van der Woude syndrome to chromosome 1p34. European Journal of Human Genetics 2001;9:747-52.

261. Li EB, Truong D, Hallett SA, Mukherjee K, Schutte BC, Liao EC. Rapid functional analysis of computationally complex rare human IRF6 gene variants using a novel zebrafish model. PLOS Genetics 2017;13:e1007009.

262. Ashburner M, Ball CA, Blake JA, et al. Gene Ontology: tool for the unification of biology. Nature genetics 2000;25:25-9.

263. Gene Ontology Consortium. Expansion of the Gene Ontology knowledgebase and resources. Nucleic acids research 2017;45:D331-D8.

264. Hooper JE, Feng W, Li H, et al. Systems biology of facial development: contributions of ectoderm and mesenchyme. Developmental Biology 2017;426:97-114.

265. Robu ME, Larson JD, Nasevicius A, et al. p53 activation by knockdown technologies. PLoS genetics 2007;3:e78.

266. Rossi A, Kontarakis Z, Gerri C, et al. Genetic compensation induced by deleterious mutations but not gene knockdowns. Nature 2015;524:230-3.

267. Rohacek AM, Bebee TW, Tilton RK, et al. ESRP1 Mutations Cause Hearing Loss due to AS Defects that Disrupt Cochlear Development. Developmental cell 2017;43:318-31.e5.

268. Bebee TW, Park JW, Sheridan KI, et al. The splicing regulators Esrp1 and Esrp2 direct an epithelial splicing program essential for mammalian development. Elife 2015;4:e08954.

269. Brewer JR, Mazot P, Soriano P. Genetic insights into the mechanisms of Fgf signaling. Genes & development 2016;30:751-71.

270. Ranieri D, Rosato B, Nanni M, Magenta A, Belleudi F, Torrisi MR. Expression of the FGFR2 mesenchymal splicing variant in epithelial cells drives epithelial-mesenchymal transition. Oncotarget 2016;7:5440.

271. Burguera D, Marquez Y, Racioppi C, et al. Evolutionary recruitment of flexible Esrp-dependent splicing programs into diverse embryonic morphogenetic processes. Nature communications 2017;8:1799.

272. Neuhauss S, Solnica-Krezel L, Schier AF, et al. Mutations affecting craniofacial development in zebrafish. Development 1996;123:357-67.

273. Yelick PC, Schilling TF. Molecular dissection of craniofacial development using zebrafish. Critical Reviews in Oral Biology & Medicine 2002;13:308-22.

274. Juriloff DM, Harris MJ. Mouse genetic models of cleft lip with or without cleft palate. Birth Defects Res A Clin Mol Teratol 2008;82:63-77.

275. Warzecha CC, Sato TK, Nabet B, Hogenesch JB, Carstens RP. ESRP1 and ESRP2 are epithelial cell-type-specific regulators of FGFR2 splicing. Molecular cell 2009;33:591-601.

276. Reifers F, Bohli H, Walsh EC, Crossley PH, Stainier D, Brand M. Fgf8 is mutated in zebrafish acerebellar (ace) mutants and is required for maintenance of midbrain-hindbrain boundary development and somitogenesis. Development 1998;125:2381-95.

277. Westerfield M. The zebrafish book: a guide for the laboratory use of zebrafish (Brachydanio rerio): University of Oregon press; 1995.

278. Sander JD, Maeder ML, Reyon D, Voytas DF, Joung JK, Dobbs D. ZiFiT: an updated zinc finger engineering tool. Nucleic acids research 2010;38:W462-W8.

279. Ran FA, Hsu PD, Wright J, Agarwala V, Scott DA, Zhang F. Genome engineering using the CRISPR-Cas9 system. Nat Protoc 2013;8:2281-308.

280. Montague TG, Cruz JM, Gagnon JA, Church GM, Valen E. CHOPCHOP: a CRISPR/Cas9 and TALEN web tool for genome editing. Nucleic acids research 2014;42:W401-W7.

281. Jao LE, Wente SR, Chen W. Efficient multiplex biallelic zebrafish genome editing using a CRISPR nuclease system. PNAS 2013;110:13904-9.

282. Gagnon JA, Valen E, Thyme SB, et al. Efficient Mutagenesis by Cas9 Protein-Mediated Oligonucleotide Insertion and Large-Scale Assessment of Single-Guide RNAs. PLoS ONE 2014;9:e98186.

283. Meeker N, Hutchinson S, Ho L, Trede N. Method for isolation of PCR-ready genomic DNA from zebrafish tissues. BioTechniques 2007;43:610-4.

284. Link V, Shevchenko A, Heisenberg C-P. Proteomics of early zebrafish embryos. BMC Developmental Biology 2006;6:1.

285. Thisse C, Thisse B. High-resolution in situ hybridization to whole-mount zebrafish embryos. Nat Protocols 2008;3:59-69.

286. Renaud O, Herbomel P, Kissa K. Studying cell behavior in whole zebrafish embryos by confocal imaging: application to hematopoietic stem cells. Nat Protoc 2011;6:1897-904.

287. Ng PC, Henikoff S. SIFT: predicting amino acid changes that affect protein function. Nucleic Acids Research 2003;31:3812-4.

288. Dobin A, Davis CA, Schlesinger F, et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 2013;29:15-21.

289. Anders S, Pyl PT, Huber W. HTSeq—a Python framework to work with high-throughput sequencing data. Bioinformatics 2015;31:166-9.

290. Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 2010;26:139-40.

291. Li H, Durbin R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 2009;25:1754-60.

292. Kharchenko PV, Tolstorukov MY, Park PJ. Design and analysis of ChIP-seq experiments for DNA-binding proteins. Nature biotechnology 2008;26:1351-9.

293. Bogdanovic O, Fernandez-Minan A, Tena JJ, de la Calle-Mustienes E, Gomez-Skarmeta JL. The developmental epigenomics toolbox: ChIP-seq and MethylCap-seq profiling of early zebrafish embryos. Methods 2013;62:207-15.
