This electronic thesis or dissertation has been downloaded from the King’s Research Portal at https://kclpure.kcl.ac.uk/portal/

Genetic investigation of neonatal sclerosing cholangitis

Grammatikopoulos, Tassos

Awarding institution: King's College London

The copyright of this thesis rests with the author and no quotation from it or information derived from it may be published without proper acknowledgement.

END USER LICENCE AGREEMENT

Unless another licence is stated on the immediately following page this work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International licence. https://creativecommons.org/licenses/by-nc-nd/4.0/

You are free to copy, distribute and transmit the work.

Under the following conditions:

Attribution: You must attribute the work in the manner specified by the author (but not in any way that suggests that they endorse you or your use of the work).

Non Commercial: You may not use this work for commercial purposes.

No Derivative Works: You may not alter, transform, or build upon this work.

Any of these conditions can be waived if you receive permission from the author. Your fair dealings and other rights are in no way affected by the above.

Genetic investigation of neonatal sclerosing cholangitis

Anastasios Grammatikopoulos

Supervisors

Professor Richard Thompson
Professor Giorgina Mieli-Vergani

MD (Res)

Institute of Liver Studies

Division of Transplantation Immunology & Mucosal Biology

King’s College School of Medicine at King’s College Hospital

King’s College London

Table of Contents

List of figures
List of tables
List of abbreviations
Abstract
1. Introduction
1.1. High γ-Glutamyltransferase cholangiopathies
1.1.1. Neonatal Sclerosing Cholangitis
1.1.2. γ-Glutamyltransferase
1.1.3. Neonatal ichthyosis-sclerosing cholangitis syndrome (NISCH)
1.1.4. North American Indian childhood cirrhosis
1.1.5. Kabuki syndrome
1.1.6. Progressive Familial Intrahepatic Cholestasis (PFIC) type 3
1.1.7. Biliary Atresia
1.1.8. Genetics in Biliary Atresia
1.1.9. Previous genetic analysis of Neonatal Sclerosing Cholangitis
1.2. Genetic techniques used in the project
1.2.1. Single Nucleotide Polymorphism
1.2.2. Homozygosity mapping
1.2.3. Whole Exome Sequencing
2. Research Aims
3. Patients and Methods
3.1. Patients
3.2. Methods
3.2.1. DNA extraction from stored blood
3.2.2. DNA quantification
3.2.3. Polymerase chain reaction (PCR)
3.2.3.1. PRIMER DESIGN
3.2.3.2. TESTING OF NEW PCR PRIMERS
3.2.3.3. PCR
3.2.3.4. PCR PRODUCT VERIFICATION
3.2.4. Whole genome analysis with the Human CytoSNP-12 v2.1 (300,000 Single Nucleotide Polymorphism) microarray analysis
3.2.5. Homozygosity mapping
3.2.5.1. UPLOADING GENOTYPES
3.2.5.2. GENOTYPE ANALYSIS
3.2.5.3. GENOME-WIDE HOMOZYGOSITY
3.2.6. Loss of Heterozygosity by BeadStudio
3.2.6.1. BeadStudio
3.2.7. Sequencing analysis
3.2.7.1. PURIFICATION OF PCR PRODUCTS FOR SEQUENCING
3.2.7.1.1. AMPure PCR Purification Kit
3.2.7.2. CYCLE SEQUENCING
3.2.7.2.1. Cycle sequencing reaction
3.2.7.2.2. Sequencing product purification by ethanol precipitation
3.2.7.2.3. Sequencing
3.2.8. Next generation sequencing – Whole exome sequencing (WES)
3.2.8.1. PREPARATION OF GENOMIC DNA
3.2.8.1.1. DNA quantification
3.2.8.1.2. DNA shearing
3.2.8.1.3. DNA purification
3.2.8.1.4. Quality assessment
3.2.8.1.5. End repair
3.2.8.1.6. Second DNA purification
3.2.8.1.7. Addition of “A” bases to the 3’ end of DNA fragments
3.2.8.1.8. Third DNA purification
3.2.8.1.9. Ligation of the indexing-specific paired-end adapter
3.2.8.1.10. DNA purification following adapter ligation
3.2.8.1.11. Adapter-ligated library amplification
3.2.8.1.12. DNA purification following adapter-ligated library amplification
3.2.8.1.13. Quality assessment after adapter-ligated library amplification
3.2.8.2. HYBRIDIZATION
3.2.8.2.1. Library hybridization
3.2.8.2.2. Magnetic beads preparation
3.2.8.2.3. Hybrid capture
3.2.8.2.4. Captured DNA purification
3.2.8.3. INDEX TAG ADDITION BY POST-HYBRIDIZATION AMPLIFICATION
3.2.8.3.1. Captured library amplification for addition of index tags
3.2.8.3.2. Captured DNA purification
3.2.8.3.3. Quality assessment with the Agilent 2100 Bioanalyzer HS DNA assay
3.2.8.4. SAMPLE POOLING FOR MULTIPLEXED SEQUENCING
3.2.8.5. CLUSTER AMPLIFICATION
3.2.8.6. CLUSTER GENERATION
3.2.8.7. REAGENT PREPARATION FOR HISEQ
3.2.8.8. SEQUENCING BY SYNTHESIS
3.2.8.9. SEQUENCE ALIGNMENT, VARIANT CALLING AND ANNOTATION
3.2.8.10. VARIANT FILTERING
3.2.9. Whole Exome Sequencing - Group 2
3.2.9.1. SAMPLE RECEIPT, QUALITY CONTROL AND TRACKING
3.2.9.2. LIBRARY PRODUCTION AND EXOME CAPTURE
3.2.9.3. CLUSTERING/SEQUENCING
3.2.9.4. READ PROCESSING
3.2.9.5. VARIANT DETECTION
3.2.9.6. DATA ANALYSIS QUALITY CONTROL (QC)
3.2.9.7. VARIANT ANNOTATION
3.2.9.8. VARIANT EXPLORATION
4. Clinical and laboratory profile
4.1. Kabuki syndrome and cholestatic liver disease
4.1.1. Introduction
4.1.2. Clinical features of KS
4.1.2.1. FACIAL AND DENTAL CHARACTERISTICS
4.1.2.2. RENAL ABNORMALITIES
4.1.2.3. CARDIAC ABNORMALITIES
4.1.2.4. GASTROINTESTINAL AND LIVER ABNORMALITIES
4.1.2.5. SKELETAL ABNORMALITIES
4.1.2.6. CENTRAL NERVOUS SYSTEM ABNORMALITIES
4.1.2.7. DEVELOPMENTAL DELAY
4.1.2.8. IMMUNE DYSREGULATION
4.1.2.9. RECURRENT INFECTIONS
4.1.2.10. ENDOCRINE AND GROWTH PROBLEMS
4.1.3. Genetics in KS
4.1.3.1. MIXED LINEAGE LEUKEMIA 2 (MLL2)
4.1.3.2. LYSINE-SPECIFIC DEMETHYLASE 6A GENE (KDM6A)
4.1.4. KS patients
4.1.4.1. CLINICAL, BIOCHEMICAL AND HISTOPATHOLOGICAL FEATURES
4.2. Non Kabuki syndrome patients
4.2.1. Neonatal sclerosing cholangitis patients
4.2.1.1. DEMOGRAPHICS
4.2.1.2. HISTOLOGICAL FEATURES
4.2.1.3. CLINICAL FEATURES AND INVESTIGATIONS AT PRESENTATION
4.2.1.4. DISEASE COURSE
4.2.2. Biliary atresia patients
5. Genetic results
5.1. Kabuki syndrome
5.1.1. Discussion
5.2. DNA for SNP analysis
5.3. PCR amplification of ABCB4 exon 6
5.4. SNP linkage analysis
5.4.1. Analysis A1
5.4.2. Analysis A2
5.4.3. Analysis A3
5.4.4. Analysis A4
5.4.5. Analysis A5
5.4.6. Analysis A6
5.4.7. Analysis A7
5.4.8. Analysis A8
5.4.9. Summary of HomozygosityMapper (HM) analysis
5.5. Results of Bead studio
5.6. RAB3B sequencing
5.6.1. RAB3B primers
5.6.2. RAB3B results
5.7. Whole Exome Sequencing analysis
5.7.1. Whole Exome Sequencing variant detection in Group 1
5.7.2. Whole Exome Sequencing detection in Group 2
5.7.3. Whole Exome Sequencing detection in BA patients
5.8. Doublecortin domain containing 2 (DCDC2)
5.8.1. Introduction
5.8.2. Variant confirmation through Sanger sequencing
5.8.3. DCDC2 variant detection in Group 1 WES
5.8.4. Liver tissue immunohistochemical and ultrastructural studies
5.8.4.1. METHODS
5.8.4.2. RESULTS
5.8.5. Discussion
6. Discussion

List of figures

Figure 1: GGT enzyme structure demonstrating the heavy (blue) and light chains (green) composing the molecule (image copied from www.uscn.com).

Figure 2: HomozygosityMapper analysis on all consanguineous patients across the genome.

Figure 3: Genotype table for a high homozygosity score region in 11 for all consanguineous patients

Figure 4: BeadStudio genotype analysis viewer

Figure 5: Synopsis of sequencing sample preparation workflow provided by Illumina.

Figure 6: Sheared DNA analysis showing an electropherogram distribution with an optimum peak size at 150-200 nucleotides.

Figure 7: Electropherogram of an amplified prepared DNA library

Figure 8: SureSelect Target enrichment System Capture Process workflow for the Illumina platform. The figure is provided by Agilent Technology

Figure 9: Hybridization Library PCR plate; Prepped Library (red), Hybridization Buffer (blue) and SureSelect Capture Library (green).

Figure 10: Electropherogram of an amplified captured DNA using the HS DNA Kit

Figure 11: Kabuki make up

Figure 12: Liver histology material from patients TG28 (a, b) and 2 (c, d)

Figure 13: Liver histology material from patient 9

Figure 14: Endoscopic retrograde cholangiopancreatography (ERCP) image from patient 5…

Figure 15: Genome-wide homozygosity depicting the areas of LOH in all 24 patients in group 1

Figure 16: Genome-wide homozygosity depicting the areas of LOH in 12 consanguineous patients plus individual 5 with PWS

Figure 17: Electropherogram appearance of cDNA Sanger sequencing in the region of the 2 base (GT) deletion in a control sample (top) and the c.123_124delGT change in proband TG11 (bottom)

Figure 18: (A): DCDC2 immunostaining of cuboidal cholangiocytes (black arrow) in control tissue 10x; (B): absence of DCDC2 immunostaining in a DCDC2 confirmed mutation patient (TG12) 10x; (C): H&E staining of interlobular portal tract with evident inflammation (black arrow) in the same patient in panel A 20x; (D): Cytokeratin 7 immunostaining of cholangiocytes (green arrow) in patient (TG12) 10x.

Figure 19: Transmission electron microscopy of hepatic lobule with portal tract and primary cilia in a control patient with Wilson disease

Figure 20: Transmission electron microscopy of a hepatic lobule with a portal tract in patient TG15

Figure 21: Diagram of a primary cilia structure

List of tables

Table 1: PCR master mix of a total volume of 20 μl. GC; guanine-cytosine.

Table 2: PCR thermal cycle, with temperature gradient in step 3 between 52-58 °C, depending on each exon’s annealing temperature.

Table 3: Homozygosity Mapper analysis sets.

Table 4: BeadStudio analysis parameters

Table 5: Cycle sequencing parameters

Table 6: Covaris DNA shearing settings

Table 7: End Repair mix for 5 samples.

Table 8: Adenylation of 3’ end of DNA fragments per sample

Table 9: Ligation of the indexing-specific paired-end adapter master mix. Volumes of reagents were adjusted for multiple libraries to achieve final volume of 37 μl/sample.

Table 10: PCR mix components per sample for adapter-ligated library amplification

Table 11: PCR parameters for adapter-ligated library amplification

Table 12: Hybridization buffer solution protocol

Table 13: SureSelect Block Mix

Table 14: PCR programme for library hybridization

Table 15: Herculase II Master Mix for captured library amplification and index tagging

Table 16: PCR programme parameters for index tagging

Table 17: TruSeq SBS Kit-HS (50 cycles) parameters

Table 18: Number of mutations in MLL2 reported in the literature. Mutations are categorised as de novo, nonsense, frameshift, missense, splice site and indels.

Table 19: KDM6A mutations in KS patients as reported by Miyake et al.

Table 20: Demographic data of KS patients.

Table 21: Liver histology and biochemical data of KS patients at time of admission

Table 22: Demographical, biochemical, radiological and histological data of all patients.

Table 23: MLL2 mutations identified in KS patients.

Table 24: DNA concentrations and identification numbers (sample code, DNA, plate and patient viewer numbers)

Table 25: Summary of the regions with the highest homozygosity scores for analysis A1.

Table 26: Summary of the regions with the highest homozygosity scores for analysis A2. nucleotide polymorphism.

Table 27: Regions with the highest homozygosity scores for analysis A3.

Table 28: Areas of LOH in individual identified by BeadStudio and mapped by chromosome start and stop position and length size.

Table 29: LOH regions for the two sets of siblings (patients 10 & 16) and (patients 3 & 18) when analysed per family.

Table 30: LOH regions for set of siblings (10 & 16) and (3 & 18) shared with other patients.

Table 31: Design parameters of RAB3B forward and reverse primers.

Table 32: PCR temperature per RAB3B exon tested.

Table 33: Novel homozygous variants identified in seven . The variants were individual amongst each patient.

Table 34: CNV generated by WES and identified in each patient in the regions of LOH as described in section 4.6.

Table 35: List of variations causing frameshift changes and introduction of a downstream premature stop codon. Patients are grouped according to number of common gene observed.

Table 36: Gene variations introducing premature stop codon with high impact on the encoded protein as identified on GEMINI.

Table 37: List of variations coding for a frameshift change and introduction of a premature stop codon downstream, identified on GEMINI.

Table 38: Primer sequences used for Sanger sequencing confirmation of DCDC2 mutations identified in WES analysis.

Table 39: Mutations in DCDC2 identified by WES and confirmed by Sanger sequencing.

Table 40: Variants in DCDC2 identified in the 1st group of patients who underwent WES.

List of abbreviations

ABCB4 ATP-binding cassette sub-family B member 4
ADPKD Autosomal dominant polycystic kidney disease
ALP Alkaline phosphatase
ARPKD Autosomal recessive polycystic kidney disease
BA Biliary atresia
BLAST Basic local alignment search tool algorithm
BRC Biomedical research centre
C Cytosine
Ca2+ Calcium
CVID Common variable immunodeficiency
CHF Congenital hepatic fibrosis
CLDN1 Claudin-1
cM Centimorgan
CGH Comparative genomic hybridization
DCDC2 Doublecortin domain containing 2
DCX Doublecortin gene family
EBV Epstein-Barr virus
ERCP Endoscopic retrograde cholangiopancreatography
ERK Extracellular signal-regulated kinase
ESP Exome sequencing project
FTT Failure to thrive
GC Guanine-cytosine
GFR Glomerular filtration rate
GGT γ-Glutamyltransferase
GTP Guanosine-5’-triphosphate
GWAS Genome-wide association study
H3K27 Histone H3 lysine 27
HLF Hepatic leukaemia factor
HMT Histone methyltransferase
HM HomozygosityMapper
HMBOX1 Homeobox containing protein 1
ICP Intrahepatic cholestasis of pregnancy
IDA Inner dynein arms
IFT Intraflagellar transport system
IGV Integrative genomics viewer
KDM6A Lysine-specific demethylase 6a
KS Kabuki syndrome
LCH Langerhans cell histiocytosis
LOD Logarithm of the odds
LPAC Low phospholipid associated cholelithiasis syndrome
LSCS Lower segment caesarean section
LT Liver transplantation
MAF Minor allele frequency
MAP Microtubule associated protein
Mb Megabase pair
MDR3 Multidrug resistance protein 3
MLL2 Mixed lineage leukaemia 2
MLPA Multiplex ligation-dependent probe amplification
MUC4 Mucin protein-4
MYH3 Myosin heavy chain 3 gene
NAIC North American Indian childhood cirrhosis
NGS Next generation sequencing
NISCH Neonatal ichthyosis-sclerosing cholangitis syndrome
NPHP-RC Nephronophthisis-related ciliopathies
NRH Nodular regenerative hyperplasia
NSC Neonatal sclerosing cholangitis
ODA Outer dynein arms
OSNs Olfactory sensory neurons
PCP Planar cell polarity
PCR Polymerase chain reaction
PFIC Progressive familial intrahepatic cholestasis
PSC Primary sclerosing cholangitis
PTC Percutaneous transhepatic cholangiogram
PTLD Post transplant lymphoproliferative disease
PWS Prader-Willi syndrome
QC Quality control
RAB3B Ras-associated protein 3B
ROH Runs of homozygosity
rRNA Ribosomal RNA
rs Reference SNP
RT-PCR Reverse transcription polymerase chain reaction
SET Drosophila Su3-9 Enhancer of zeste
Shh Sonic hedgehog
SNP Single nucleotide polymorphism
SNV Single nucleotide variants
SP3 Specificity protein 3
STR Short tandem repeat
SVD Spontaneous vaginal delivery
T Thymine
TEM Transmission electron microscopy
TFBS Transcription factor binding sites
TJ Tight junction
TJP2 Tight junction protein 2
TRPV4 Transient receptor potential vanilloid type 4
TRX Trithorax protein
UCSC University of California Santa Cruz
UW CMG University of Washington Centre for Mendelian Genomics
VCF Variant call format
WES Whole exome sequencing

Abstract

Neonatal sclerosing cholangitis (NSC) is a rare form of severe liver disease that presents in the newborn period and closely resembles the more common condition biliary atresia (BA). Both involve inflammation and narrowing of the bile ducts, and both lead to liver failure. NSC appears to be more common amongst children of consanguineous marriages, suggesting that it is inherited as an autosomal recessive trait. Despite the distinct disease phenotype and the suspicion of genetic inheritance, no causative gene variants have so far been identified.

The aim of this study is to identify the cause of NSC by mapping the genetic locus and characterising the causative gene within the disease region.

We retrospectively identified 39 NSC patients (20 male). Consanguinity was identified in 24, with two families each having 2 affected siblings. Four patients with Kabuki make-up syndrome, one of the high GGT cholestasis syndromes, were also included. DNA from 24 (14 male) patients was used for whole genome analysis with the Human CytoSNP-12 v2.1 microarray. HomozygosityMapper and the BeadStudio Genotyping Module identified large areas of homozygosity on 5 chromosomes; these contained a high number of genes, making it impractical to filter down to a single causal gene. Whole exome sequencing (WES) was performed in 5 patients initially, but a strong candidate gene was not found. A further 21 patients with a diagnosis of neonatal cholangiopathy underwent WES. Analysis of these data identified protein-truncating mutations in the doublecortin domain containing 2 gene (DCDC2) in seven patients (4 homozygous, 3 heterozygous). DCDC2, a microtubule associated protein, has a regulatory role in ciliary microtubule function, possibly by affecting the Shh and Wnt signalling pathways. These pathways could have an impact on cholangiocyte morphology and the development of cholangiopathies. Mutations in DCDC2 were confirmed by Sanger sequencing and by liver tissue immunohistochemical and ultrastructural studies. A better understanding of the aetiology and pathogenesis of NSC through this finding will allow improved diagnostics and potentially new treatment options.

1. Introduction

The liver consists of two main components: the parenchyma and the biliary system. Diseases affecting mainly the biliary system are described either as cholangiopathy or as cholangitis, the latter indicating an inflammatory aspect to the disease pathophysiology, and are usually reflected in a raised level of γ-Glutamyltransferase (GGT) in blood. The pathogenic mechanisms leading to cholangiopathies have been investigated more extensively in some conditions than in others. Using various genetic methods, researchers have identified a genetic cause in a proportion of cholangiopathies, as described in this section. Neonatal sclerosing cholangitis (NSC) is one form of early onset cholangiopathy, seen more frequently in children of consanguineous parents, but its genetic cause has not yet been found. In this project, in addition to NSC patients, a small group of children with Kabuki syndrome (KS) and neonatal onset cholangiopathy was also studied. This decision was taken because no disease-causing gene mutations had been identified in KS up to that point.

1.1.High γ-Glutamyltransferase cholangiopathies

1.1.1.Neonatal Sclerosing Cholangitis

Neonatal sclerosing cholangitis (NSC) is a rare form of severe liver disease, presenting in the newborn period. Two siblings with neonatal onset sclerosing cholangitis associated with high GGT were previously reported from our institution; both developed biliary cirrhosis, and one had required a liver transplant (Baker et al. 1993). The children had a strong history of consanguinity, the maternal grandfather and the paternal grandmother being siblings. The second sibling in that report subsequently also underwent liver transplantation (LT).

The first report on NSC was a series of 8 children presenting within the neonatal period with cholestatic jaundice, hepatosplenomegaly, pale stools and high GGT (when available) (Amedee-Manesme et al. 1987). In that paper the diagnosis of an intrahepatic cholangiopathy was confirmed during exploratory laparotomy, performed to exclude biliary atresia (BA), in 2 children, and by percutaneous cholangiography at a later stage in all. Liver biopsy showed ductular proliferation, moderate inflammation and fibrosis. The majority of the patients progressed to biliary cirrhosis. Of interest, consanguinity was reported in 3 cases, suggesting recessive Mendelian inheritance. Paediatric sclerosing cholangitis syndromes were subsequently divided into 3 groups: those similar to adult primary sclerosing cholangitis (PSC), those associated with immunodeficiency (Naveh et al. 1983) and those described with Histiocytosis X (Leblanc et al. 1981). In a more recent review, Mieli-Vergani and Vergani refined the classification of paediatric cholangiopathies into primary, autoimmune and neonatal sclerosing cholangitis, as well as cholangitis associated with immunodeficiency and Langerhans cell histiocytosis (LCH) (Mieli-Vergani and Vergani 2001).

1.1.2.γ-Glutamyltransferase

Children presenting with cholestasis in the neonatal period and early infancy are differentiated into high or low GGT syndromes, using this enzyme as a marker to distinguish bile duct damage from hepatocellular inflammation. Whitfield et al. showed a strong association between high GGT and biliary disease, and a correlation between serum GGT and alkaline phosphatase (ALP) in liver disease, suggesting that changes in GGT activity principally reflect alterations in biliary function rather than damage to the parenchymal liver cells (Whitfield et al. 1972).

GGT catalyses the transfer of the gamma-glutamyl moiety from peptides such as glutathione to an acceptor, which may be an amino acid, a peptide or water (forming glutamate), and has a potential role in amino acid transport. It plays a key role in the gamma-glutamyl cycle, a pathway for the synthesis and degradation of glutathione and for drug and xenobiotic detoxification. The enzyme is composed of a heavy and a light chain, which are derived from a single precursor protein (Figure 1), and is present in tissues involved in absorption and secretion such as kidneys, pancreas, spleen, heart and brain, but most notably in the hepatobiliary system (Kinlough et al. 2005).

Figure 1: GGT enzyme structure demonstrating the heavy (blue) and light chains (green) composing the molecule (image copied from www.uscn.com).

1.1.3.Neonatal ichthyosis-sclerosing cholangitis syndrome (NISCH)

Neonatal ichthyosis-sclerosing cholangitis syndrome (NISCH) is a high GGT cholestatic syndrome characterized by scalp hypotrichosis, scarring alopecia, sclerosing cholangitis, and leukocyte vacuolization. A series of 4 affected individuals from 2 small inbred Moroccan kindreds was first reported in 2002. The syndrome was mapped to a 21.2 cM interval of chromosome 3q27-q28, which was subsequently reduced to a 9.5 cM region (Baala et al. 2002). Two years later CLDN1 (OMIM#603718) was considered a strong candidate gene, and a 2 bp deletion in exon 1 of the gene was identified, resulting in a premature stop codon and a total absence of claudin-1 protein in the liver and skin (Hadj-Rabia et al. 2004). CLDN1 encodes claudin-1, an integral cell membrane protein that is part of tight junctions (TJs), which are present at the apical region of polarised epithelial and endothelial cells in various tissues including the liver and kidneys (Morita et al. 1999). Disruption of claudin-1 expression at a cellular level can affect TJ integrity and subsequently barrier permeability (Findley and Koval 2009).

1.1.4.North American Indian childhood cirrhosis

North American Indian childhood cirrhosis (NAIC), an isolated nonsyndromic form of high GGT familial cholestasis, has been reported in Ojibway-Cree children and young adults in the northwestern area of Quebec. The disease typically presents with transient neonatal jaundice in a well child, and progresses to biliary cirrhosis and portal hypertension. Biochemical and histopathological features of the disease (elevated GGT and ALP) suggest involvement of the bile ducts. An autosomal recessive mode of inheritance was reported, and a genome wide analysis was performed on three DNA pools of samples from 13 patients, 16 unaffected siblings, and 22 parents from five families. A five-marker haplotype on chromosome 16q22 was shared by all patients (Betard et al. 2000). Subsequently Drouin et al. described an extensive series of 30 individuals with multisystemic involvement and 47% mortality at an early age, with no recurrence of the disease following LT. Presenting features were neonatal jaundice, hepatosplenomegaly, facial telangiectasia and pruritus. The condition was associated with recurrent episodes of otitis media, bronchial hyperactivity, failure to thrive (FTT) and cardiac involvement (Drouin et al. 2000). The liver disease progresses rapidly to cirrhosis and portal hypertension. Further genetic analysis identified a disease-causing missense mutation in the gene FLJ14728 (CIRH1A, OMIM#604901), which is present in all NAIC individuals and alters the protein structure (Chagnon et al. 2002). Cirhin, the protein product, is ubiquitously expressed in different tissues and organs. Within cells, cirhin is located in the nucleolus, a small region inside the nucleus where ribosomal RNA (rRNA) is produced. It has been hypothesised that cirhin may play a role in rRNA processing, but to date there has been no clear explanation of its significance in the disease process.

1.1.5.Kabuki syndrome

Kabuki syndrome (KS) is a congenital mental retardation syndrome with additional multisystemic involvement. Major features include long palpebral fissures, short nasal septum, arched eyebrows, depressed nasal tip, high-arched palate, prominent ears, persistent fingertip pads and short 5th digit, hypotonia, short stature and IQ <80. Cardiovascular defects, vertebral deformities, hip dislocation, urinary tract malformations, hearing loss and seizures have been reported as minor features. Liver involvement includes high GGT cholangiopathy, BA (van Haelst et al. 2000) and end stage liver disease leading to LT (Ewart-Toland et al. 1998). The incidence is 1/32,000; it is sporadic with an equal sex ratio (Niikawa et al. 1981). In 2010, whole exome sequencing (WES) of 10 unrelated patients with KS (7 of European ancestry, 2 of Hispanic ancestry and 1 of mixed European and Haitian ancestry) identified nonsense or frameshift mutations in the MLL2 gene in 7 patients. Follow-up Sanger sequencing detected mutations in MLL2, on chromosome 12q12-14, in 2 of the 3 remaining individuals with KS and in 26 of 43 additional cases. In all, 33 distinct MLL2 mutations were identified in 35 of 53 families (66%) with KS (Ng et al. 2010). Currently over 247 mutations have been identified, with a high incidence of de novo occurrence (Hannibal et al. 2011). In about 10% of KS patients, mutations in the lysine-specific demethylase 6a gene (KDM6A) have been found (Lederer et al. 2012). The syndrome, including its clinical features and causative genes, is discussed separately at length in section 4.1.

1.1.6.Progressive Familial Intrahepatic Cholestasis (PFIC) type 3

Progressive Familial Intrahepatic Cholestasis (PFIC) type 3, one of the spectrum of familial intrahepatic cholestasis syndromes, is differentiated biochemically from the other 2 types, PFIC 1 (ATP8B1) (Bull et al. 1998) and PFIC 2 (ABCB11) (Strautnieks et al. 1998), by an elevated serum GGT. All three PFIC syndromes are inherited in an autosomal recessive pattern. PFIC 3 is caused by a variety of mutations in the ATP-binding cassette sub-family B member 4 gene (ABCB4), which encodes multidrug resistance protein 3 (MDR3) (de Vree et al. 1998), a flippase responsible for phosphatidylcholine translocation across the canalicular membrane. Defective translocation leads to a lack of phosphatidylcholine in bile. The absence of phosphatidylcholine inhibits the chaperoning of bile acids, causing damage to the biliary epithelium and cholangiopathy. Apart from PFIC 3, MDR3 deficiency causes a spectrum of liver diseases such as adult biliary cirrhosis, low phospholipid associated cholelithiasis syndrome (LPAC), transient neonatal cholestasis, intrahepatic cholestasis of pregnancy (ICP) and drug induced cholestasis (Gonzales et al. 2009).

1.1.7.Biliary Atresia

NSC has similarities with a more common condition, biliary atresia (BA). Both involve inflammation and narrowing of the bile ducts, and lead to liver failure. NSC could in fact be one form of BA, particularly as it is generally accepted that the latter is not a single condition but rather several different conditions with a similar ultimate phenotype. At King’s College Hospital NSC is diagnosed in only one or two newborn infants every year, while there are as many as twenty new cases of BA in the same period. BA is also a high GGT cholangiopathy, affecting extrahepatic and intrahepatic bile ducts and manifesting with prolonged cholestasis, progressive fibrosis and pronounced inflammatory obliteration of the extrahepatic biliary system during the first few weeks of life. Early surgery in the form of a Kasai portoenterostomy is the treatment of choice, with about 40-60% of patients clearing their jaundice (Davenport et al. 1997, Livesey et al. 2009). LT is the treatment of choice for patients who remain cholestatic or who develop portal hypertension with bleeding varices refractory to endoscopic treatment. The incidence of the disease is between 1:8,000 and 1:18,000 live births (Petersen et al. 2008). The pathogenesis remains unclear, with some patients developing cirrhosis very early in the neonatal period while others have milder liver histological changes. The best outcome with the native liver is reported in infants who undergo a Kasai portoenterostomy under the age of 43 days (Davenport et al. 2011).

1.1.8.Genetics in Biliary Atresia

In our centre a series of 5 infants with cholestatic neonatal disease and various forms of chromosome 22 aneuploidy has been reported: 2 infants had classical cat-eye syndrome, 2 had partial duplication of chromosome 22 (supernumerary der(22) syndrome), and 1 was mosaic for trisomy 22. Four of these infants had BA and 1 had a cystic choledochal malformation (Allotey et al. 2008). This observation led to the hypothesis that overexpression of genes located on chromosome 22 could be involved in early bile duct development (Allotey et al. 2008). This hypothesis, however, has not been supported by further published research. In 2010, 35 BA patients from another centre were screened for genomic alterations that may be involved in the probably multifactorial pathogenesis of BA. DNA was genotyped on the Illumina Human Hap 550,000 single nucleotide polymorphism (SNP) BeadChip platform. Two unrelated BA patients with overlapping heterozygous deletions of 2q37.3 were identified. Patient 1 had a 1.76 Mb (280 SNP) heterozygous deletion containing 30 genes. Patient 2 had a 5.87 Mb (1,346 SNP) heterozygous deletion containing 55 genes. The overlapping 1.76 Mb deletion on chromosome 2q37.3, from position 240,936,900 to 242,692,820, constituted the critical region, but because of the high number of genes within this region no further mapping analysis was performed (Leyva-Vega et al. 2010).
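The size of the critical region follows directly from the quoted coordinates, and the region itself is the intersection of the two patients' deletions. The following is a minimal Python sketch of that arithmetic. The function names are illustrative, and the interval assigned to Patient 2 is a hypothetical placeholder of the reported 5.87 Mb length, since the source does not give that patient's coordinates.

```python
from typing import Optional, Tuple

Interval = Tuple[int, int]  # (start, end) positions on one chromosome, in base pairs

# Critical region on 2q37.3 as quoted in the text (Leyva-Vega et al. 2010).
CRITICAL_REGION: Interval = (240_936_900, 242_692_820)

# HYPOTHETICAL coordinates for Patient 2, chosen only so that the interval is
# 5.87 Mb long and contains the critical region; not taken from the source.
PATIENT_2_HYPOTHETICAL: Interval = (236_900_000, 242_770_000)


def length_mb(region: Interval) -> float:
    """Length of a genomic interval in megabases (end minus start)."""
    start, end = region
    if end < start:
        raise ValueError("end must not be smaller than start")
    return (end - start) / 1_000_000


def overlap(a: Interval, b: Interval) -> Optional[Interval]:
    """Interval shared by two deletions, or None if they do not overlap."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


critical_mb = length_mb(CRITICAL_REGION)
shared = overlap(CRITICAL_REGION, PATIENT_2_HYPOTHETICAL)
print(f"Critical region length: {critical_mb:.2f} Mb")
print(f"Shared interval: {shared}")
```

Under these assumptions the shared interval is the smaller deletion, which is why the critical region has the same 1.76 Mb length as the deletion in Patient 1.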

In the same year another group, in an attempt to identify BA susceptibility loci, carried out a genome-wide association study (GWAS) using the Affymetrix 5.0 and 500 K marker sets [500,000 single-nucleotide polymorphisms (SNPs)] in 200 Chinese BA patients and 481 ethnically matched control subjects. The 10 most BA-associated SNPs from the GWAS were genotyped in an independent set of 124 BA and 90 control subjects. The strongest overall association was found for rs17095355 on 10q24, downstream of XPNPEP1, a gene involved in the metabolism of inflammatory mediators. XPNPEP1 and ADD3 were two genes identified in the region of interest, both expressed in the hepatobiliary epithelium. Current bioinformatics tools (Genomatic suite) were utilized and the sequence around the area of interest was found not to be conserved amongst species, weakening any causative effect. In a second stage, the University of California Santa Cruz (UCSC) browser was used to identify SNPs in LD (r² > 0.8) with rs17095355 that overlapped or were in the vicinity of transcription factor binding sites (TFBS) implicated in the regulation of liver genes. This revealed that rs921348 and rs9630101 were flanking the TFBS sequences of the hepatic leukaemia factor (HLF) and homeobox containing protein 1 (HMBOX1) genes, respectively. There is no supportive literature, however, to corroborate a link between these TFs expressed in the liver and overlapping BA-associated SNPs with the regulation of the candidate genes (Garcia-Barcelo et al. 2010). Further studies in another cohort of patients of Thai origin confirmed the association of XPNPEP1 rs17095355 in 124 BA patients when compared with 114 controls (Seelow et al. 2009). More recently, a GWAS undertaken in a population of non-Asian origin, comprising 171 BA patients and 1,630 controls, identified a strong signal at rs7099604 (p = 2.5 × 10−3) in intron 1 of the ADD3 gene. Tissue expression data suggested that ADD3, but not XPNPEP1, was differentially expressed in BA patients. No further suggestions on the role of ADD3 in biliary development were made (Findley and Koval 2009).

1.1.9.Previous genetic analysis of Neonatal Sclerosing Cholangitis

Previous linkage analysis performed in our laboratory, triggered by the observation of a patient with NSC and Prader-Willi syndrome (PWS), had focused our interest on chromosome 15. PWS (OMIM#176270) was initially reported in 1956 by Andrea Prader and Heinrich Willi. It is a complex multisystemic disorder with typical facies, hypotonia, short stature, learning difficulties, hyperphagia and obesity. It occurs due to a 5-7 Mb de novo deletion of the proximal region of the paternal chromosome 15 [del(15)(q11-q13)] (70%), maternal uniparental disomy (25-30%) or, less frequently, lack of expression of paternally derived genetic sequences on chromosome 15q11-q13 (2-5%) (Jin 2011).

In our index case, linkage analysis confirmed uniparental disomy with isodisomy on the long arm of chromosome 15. In 2006 four families with NSC, including the case with PWS, were selected for microsatellite analysis. Two strong candidate genes within this region of chromosome 15, TJP1 and ATP10A, were identified and subsequently sequenced. A missense mutation that was novel at the time, c.3968A>C, p.(A1323P), was identified in TJP1 only in the PWS patient. When subsequently tested, this variant was predicted to be tolerated (SIFT) and a polymorphism (MutationTaster). Several SNPs in ATP10A were also found, which were considered non-significant.

1.2.Genetic techniques used in the project

During the process of investigating the genetic cause of NSC in this project, a number of techniques were utilised in a stepwise approach. These techniques were SNP linkage analysis, homozygosity mapping and WES. The first two methods have been especially successful in identifying disease loci in small cohorts of affected individuals with a history of consanguinity, such as the patients studied here. WES has been used more recently in the detection of disease-causing genes, again in carefully selected populations. The major principles of these methods are described in sections 1.2.1-1.2.3.

1.2.1.Single Nucleotide Polymorphism

A single nucleotide polymorphism (SNP) is a DNA sequence variation involving a single nucleotide (A, T, C or G) alteration. Seventy five percent of SNPs involve the replacement of cytosine (C) with thymine (T). They can occur in the coding and non-coding regions of the genome, with variable distribution across the genome. SNPs in the coding region of a gene can produce the same polypeptide sequence (synonymous polymorphism) or a different one (non-synonymous polymorphism). Although in principle any of the four possible nucleotide bases can be present at each position of a sequence stretch, SNPs are usually bi-allelic in practice. SNPs have been utilized in the past for the construction of the and for finding the association of these markers with a specific disease (Kruglyak 1999). For the purpose of GWAS an SNP array can be utilized. Such oligonucleotide arrays are very small devices that contain thousands of short DNA sequences (about 25 nucleotides) immobilized at different positions on the surface. These sequences can be used to discriminate between alternative bases at the site of an SNP. The chips allow many SNPs to be analyzed simultaneously, a useful process for large-scale association studies. The method utilized here involves hybridization of the DNA sequence containing the SNP to the chip. Single base extension is followed by fluorescent staining, signal amplification, scanning, and analysis using commercially available software. The data generated are then used, among other purposes, for the detection of loss of heterozygosity (LOH).
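The distinction between synonymous and non-synonymous coding SNPs can be illustrated with a minimal sketch that uses the standard genetic code. The function name and the example codons are hypothetical and chosen for illustration only; they are not taken from the data in this project.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(codon, position, alt_base):
    """Classify a single-base change in a codon (hypothetical helper).

    codon    -- reference codon on the coding strand, e.g. "GCT"
    position -- 0-based index of the changed base within the codon
    alt_base -- alternative base at that position
    Returns (classification, reference amino acid, alternative amino acid).
    """
    codon = codon.upper()
    alt_base = alt_base.upper()
    if len(codon) != 3 or any(b not in BASES for b in codon):
        raise ValueError("codon must be three bases from A, C, G, T")
    if position not in (0, 1, 2) or alt_base not in BASES:
        raise ValueError("invalid position or alternative base")
    if codon[position] == alt_base:
        raise ValueError("alternative base equals the reference base")

    mutated = codon[:position] + alt_base + codon[position + 1:]
    ref_aa = CODON_TABLE[codon]
    alt_aa = CODON_TABLE[mutated]
    kind = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return kind, ref_aa, alt_aa


# CTT -> CTC both encode leucine; GCT -> CCT changes alanine to proline.
print(classify_coding_snp("CTT", 2, "C"))
print(classify_coding_snp("GCT", 0, "C"))
```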

1.2.2.Homozygosity mapping

Homozygosity mapping is a common method for mapping recessive traits in families with strong evidence of consanguinity. It consists of a genome-wide linkage analysis with microsatellites or, as in this study, with SNPs (Gibbs and Singleton 2006). A multipoint linkage analysis is performed via software under a recessive disease model, and haplotypes are prepared. The evidence for linkage at a locus is calculated as the ratio between the likelihood of the data under linkage (at a given recombination fraction) and the likelihood under no linkage, and is expressed as the logarithm of the odds (LOD) score. Following that, haplotypes can be manually inspected for homozygous regions shared by all affected individuals. It is a powerful technique as it does not require DNA from family members other than the affected offspring (Lander and Botstein 1987). The principle is that in consanguineous families with a rare autosomal recessive disease the same disease alleles have been inherited from both parents. Therefore, the proportion of homozygosity near the disease mutation among related individuals should be higher than expected by chance. In populations where consanguinity is common, even apparently unrelated individuals with the same disease are often found to be related, with perhaps shorter shared disease haplotypes (Mueller and Bishop 1993). Homozygosity can also occur by chance in non-consanguineous families, and in those individuals there may be a far more distant inbreeding history (Hildebrandt et al. 2009). In an effort to increase the number of SNPs studied and to reduce the calculation time, homozygosity mapping has been performed using the software packages BeadStudio Genotyping Module and HomozygosityMapper (Seelow et al. 2009).
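As an illustration of the LOD score, the sketch below computes a two-point LOD score for phase-known meioses. This is a simplified textbook form, assumed here for clarity; it is not the multipoint calculation performed by the linkage software, and the counts in the usage example are invented.

```python
import math


def lod_score(recombinants, non_recombinants, theta):
    """Two-point LOD score at recombination fraction theta (0 <= theta <= 0.5).

    LOD = log10( L(theta) / L(0.5) ), with
    L(theta) = theta**R * (1 - theta)**NR for phase-known meioses.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if recombinants < 0 or non_recombinants < 0:
        raise ValueError("counts must be non-negative")

    # A recombinant is impossible at theta = 0, so the likelihood is zero.
    if recombinants > 0 and theta == 0.0:
        return float("-inf")

    log_l_linked = 0.0
    if recombinants > 0:
        log_l_linked += recombinants * math.log10(theta)
    if non_recombinants > 0:
        log_l_linked += non_recombinants * math.log10(1.0 - theta)

    # Under no linkage every meiosis has probability 0.5.
    total = recombinants + non_recombinants
    log_l_unlinked = total * math.log10(0.5)
    return log_l_linked - log_l_unlinked


# Ten informative meioses without recombination give a LOD of about 3 at theta = 0.
print(round(lod_score(0, 10, 0.0), 2))
```

A LOD score of 3 or more is conventionally taken as significant evidence of linkage, which is why the ten-meioses example is a common reference point.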

1.2.3.Whole Exome Sequencing

The exome is defined as the whole coding sequence found in DNA. Whole exome sequencing (WES) is a technique of massively parallel sequencing of all the protein coding regions of DNA (Mamanova et al. 2010). Its first application was described by Ng et al when they confirmed mutations in the myosin heavy chain 3 gene (MYH3) in 12 individuals with Freeman-Sheldon syndrome (OMIM#193700) by targeted capture and parallel sequencing (Ng et al. 2009). The process involves initial DNA shearing and creation of a shotgun library of DNA fragments, flanked by adaptors. The library is then enriched for exonic sequences, with the fragments being hybridized to biotinylated DNA or RNA baits. These fragments are then recovered by biotin-streptavidin-based pull down, followed by amplification and parallel sequencing of the library, mapping and calling of the identified variants. As the data generated encompass between 20,000 and 24,000 variants per exome, it is challenging to differentiate between incidental and causal variants. The best approach to this problem combines, firstly, meticulous filtering of the data and, secondly, stratification of the results based on the predicted effect at the protein level. Together these steps can guide the search for the causal gene. The mode of inheritance and the availability of pedigree information can also narrow down the search for candidate genes (Bamshad et al. 2011). The use of WES for disease gene identification, however, does raise some ethical considerations if samples are not anonymised: the management of genetic results generated from each individual and their disclosure to patients, neither of which is clearly defined yet, and the limitations of consent obtained prior to the era of WES (Hallowell et al. 2014). The cost and time of WES have been dramatically reduced over recent years, making it a powerful tool in the discovery of the genetic basis of Mendelian and complex trait diseases.
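The filtering and stratification of exome variants can be sketched as follows. This is a minimal illustration under stated assumptions: the `Variant` record, the effect labels, the 1% allele frequency cut-off and the gene names are all hypothetical, and the sketch assumes a recessive model in which only homozygous variants are retained. It is not the pipeline used in this project.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str             # hypothetical gene label
    effect: str           # predicted effect, e.g. "missense", "stop_gained"
    population_af: float  # allele frequency in a reference population
    genotype: str         # "hom" or "het" in the affected individual


# Assumed ranking of predicted protein-level effects (lower = more severe).
EFFECT_RANK = {"stop_gained": 0, "frameshift": 0, "splice_site": 1, "missense": 2}


def filter_candidates(variants, max_af=0.01):
    """Keep rare, homozygous, protein-altering variants, most severe first."""
    kept = [
        v for v in variants
        if v.population_af <= max_af      # step 1: remove common variants
        and v.effect in EFFECT_RANK       # step 2: keep protein-altering changes
        and v.genotype == "hom"           # step 3: recessive model
    ]
    # Stratify by predicted effect, then by rarity.
    return sorted(kept, key=lambda v: (EFFECT_RANK[v.effect], v.population_af))


example = [
    Variant("GENE_A", "synonymous", 0.0, "hom"),
    Variant("GENE_B", "missense", 0.002, "hom"),
    Variant("GENE_C", "stop_gained", 0.0, "hom"),
    Variant("GENE_D", "missense", 0.15, "hom"),
    Variant("GENE_E", "frameshift", 0.0, "het"),
]
print([v.gene for v in filter_candidates(example)])
```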

In our project, patients had previously given consent for research investigations in accordance with our local R&D guidelines, and samples were anonymised.


2. Research Aims

The aim of this study was to identify the genetic cause of NSC. Initially we planned to map areas of homozygosity within the genome of NSC patients by utilising SNP linkage analysis and homozygosity mapping with platforms such as HomozygosityMapper and BeadStudio. This was followed by WES analysis. Through the data generated, potential strong candidate genes were identified and prioritised based on the function of their encoded proteins and the tissue distribution of their expression, primarily in the liver and biliary tract.

Subsequently, by direct sequencing of the genes, we aimed to confirm the disease causing mutations and the potential effect of the gene product in the manifestation of NSC.


3. Patients and Methods

3.1.Patients

Patients who presented to our centre with neonatal onset cholangiopathy were selected for the purpose of this project. The majority had a distorted but patent biliary tree (NSC), but a smaller cohort of consanguineous patients had complete obliteration of the biliary system (BA). A group of 4 patients had a clinical phenotype of KS; they are described in more detail in section 4.1. All patients underwent an extensive laboratory work-up as per our centre’s protocol. Endoscopic Retrograde Cholangiopancreatography (ERCP), Magnetic Resonance Cholangiopancreatography (MRCP) and liver biopsy were performed to assist the diagnostic process, where indicated. Based on the chronology and type of the analysis, patients were divided into Group 1 and Group 2. Group 1 consisted of 24 patients with a diagnosis of NSC. Five of these patients underwent the first WES. Group 2 patients were part of the second WES analysis, which included 10 BA and 4 NSC patients plus 7 patients from Group 1.

Both groups will be described in detail in section 4.2.

3.2.Methods

In this section the gene identification techniques used in this project are described. The stepwise approach includes DNA extraction and quantification, polymerase chain reaction, SNP linkage, homozygosity mapping and loss of heterozygosity analysis, whole exome and subsequently Sanger sequencing.

3.2.1.DNA extraction from stored blood

Stored blood samples from 24 candidate patients were processed using the QIAamp DNA Blood Mini Kit (Qiagen Ltd, Manchester, UK). A volume of 100 μl of whole blood was aliquoted into a 1.5 ml microcentrifuge tube. Initially 10 μl of Proteinase K were added, followed by 100 μl of lysis Buffer AL. The sample was mixed by pulse-vortexing for 15 seconds and incubated at 56 °C for 10 minutes on a thermal plate. The mix was centrifuged at 13,000 rpm for 1 minute to remove any drops from inside the lid.

50 μl of 100% ethanol were added to the mix, which was mixed by pulse-vortexing for 15 seconds. The mix was incubated at room temperature for 3 minutes.

The mix was centrifuged at 13,000 rpm for 1 minute to remove any drops from inside the lid and the entire lysate was then transferred onto a QIAamp MinElute column in a 2 ml collection tube, provided by Qiagen Ltd, Manchester, UK. The mix was again centrifuged at 13,000 rpm for 1 minute. The QIAamp MinElute column was placed into a clean 2 ml collection tube and the collection tube containing the flow-through was discarded.

500 μl of Buffer AW1 were added to the QIAamp MinElute column and the mix was centrifuged at 13,000 rpm for 1 minute. The QIAamp MinElute column was placed into a clean 2 ml collection tube and the collection tube containing the flow-through was discarded.

500 μl of Buffer AW2 were added to the QIAamp MinElute column and the mix was centrifuged at 13,000 rpm for 1 minute. The QIAamp MinElute column was placed into a clean 2 ml collection tube and the collection tube containing the flow-through was discarded. The column was then centrifuged at 13,000 rpm for 3 minutes to fully dry the membrane.

The QIAamp MinElute column was then placed in a clean 1.5 ml microcentrifuge tube, while the collection tube containing the flow-through was discarded. At room temperature, 100 μl Buffer AE were added to the QIAamp MinElute column.

The new mix was incubated at room temperature (15-25 °C) for 5 minutes. The tube was centrifuged at 13,000 rpm for 1 minute and was stored at 4 °C.

3.2.2.DNA quantification

The extracted DNA concentration was measured using the Qubit® 2.0 Fluorometer and the dsDNA BR Assay Kit provided by Invitrogen (Life Technologies, Paisley, UK). Two 0.5 ml Qubit assay tubes for the standards (S1 and S2) and one for each sample were prepared. All tubes were clearly labelled. A master working solution was prepared by mixing 199 μl/sample of the Qubit Buffer with 1 μl/sample of the Qubit Reagent, so that 200 μl of the working solution were available for each standard and sample. In each numbered sample tube 199 μl of the working solution were mixed with 1 μl of DNA to a final volume of 200 μl. Tubes S1 and S2 were prepared by mixing 190 μl of the working solution with 10 μl of the respective standard to obtain a final volume of 200 μl. All samples were mixed by vortexing for 2 minutes and then incubated for 2 minutes at room temperature. Standard samples were kept in a non-transparent box for the same incubation time. Standards S1 and S2 were first inserted into the fluorometric device, followed by each sample individually, to record DNA concentration in ng/μl and the 260/280 ratio for potential sample contamination.
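The volumes described above scale linearly with the number of tubes, which can be sketched as follows. The function names are hypothetical; the per-tube volumes are those stated in the text, and no pipetting excess is added.

```python
def qubit_working_solution(n_samples, n_standards=2):
    """Volumes (in microlitres) of buffer and reagent for the master working solution.

    199 ul of buffer and 1 ul of reagent are prepared per tube
    (samples plus the two standards, S1 and S2).
    """
    if n_samples < 0 or n_standards < 0:
        raise ValueError("tube counts must be non-negative")
    tubes = n_samples + n_standards
    return {"buffer_ul": 199 * tubes, "reagent_ul": 1 * tubes, "total_ul": 200 * tubes}


def tube_volumes(is_standard):
    """Return (working solution ul, DNA or standard ul) for one 200 ul assay tube."""
    return (190, 10) if is_standard else (199, 1)


# Example: 24 patient samples plus the two standards.
print(qubit_working_solution(24))
```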


3.2.3. Polymerase chain reaction (PCR)

3.2.3.1.PRIMER DESIGN

PCR reverse and forward primers were designed manually for the tested exons of the respective genes of interest. Criteria used for the design were as follows:

Length: 19-23 bp

Base composition: Cytosine and Guanine around 30-45%

Melting point temperature (Tm): 62-66 °C.

Designed primer sequences were checked by using the Basic Local Alignment Search Tool algorithm (BLAST) (http://blast.ncbi.nlm.nih.gov/Blast.cgi). The primer’s sequence was positioned 32-446 bp from either end of the respective intron.
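The design criteria listed above can be checked programmatically, as in the following minimal sketch. The method used to estimate Tm in this project is not stated, so the Wallace rule (2 °C per A/T and 4 °C per G/C) is used here purely as an assumption; it is a rough approximation and other methods will give different values. The example sequence is invented.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a primer sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Approximate Tm (deg C) by the Wallace rule: 2*(A+T) + 4*(G+C). Assumed method."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def check_primer(seq, length=(19, 23), gc=(30.0, 45.0), tm=(62.0, 66.0)):
    """Check a primer against the length, GC content and Tm criteria."""
    if not seq or any(b not in "ACGT" for b in seq.upper()):
        raise ValueError("primer must be a non-empty A/C/G/T sequence")
    return {
        "length_ok": length[0] <= len(seq) <= length[1],
        "gc_ok": gc[0] <= gc_percent(seq) <= gc[1],
        "tm_ok": tm[0] <= wallace_tm(seq) <= tm[1],
    }


# Hypothetical 23-mer with 10 G/C bases.
print(check_primer("AAATTTGGGCCCAATTGGCCATA"))
```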

3.2.3.2.TESTING OF NEW PCR PRIMERS

PCR primers were tested across a temperature gradient of 52-68 °C in order to establish the best temperature conditions at which they would work. A 20 μl PCR reaction mix was prepared (Table 1) using 2 μl of genomic control DNA, and a temperature gradient was established across the PCR plate. The PCR plate was covered with an adhesive PCR seal (Anachem, Luton, UK) and the reaction was cycled using a thermal cycler as per Table 2.

3.2.3.3.PCR

Polymerase chain reactions were set up using the reagents listed in Table 1, multiplied by the number of PCR reactions required. For each primer, a DNA sample from a positive control (known MDR3 deficiency), stored in our R&D approved biobank, was used to check the quality of the designed primer. A negative (no DNA) control mix, with reagents only, was also used to ensure the absence of any DNA contamination of the reagents. The PCR plate was covered with an adhesive PCR seal (Anachem, Luton, UK) and the reaction was placed on a thermal cycler with a heated lid at 105 °C as per Table 2.


Reagents                                   Volume
PCR buffer                                 2 μl
GC-Rich solution                           4 μl
dNTP mix (10 mM)                           0.6 μl
FastStart Taq DNA polymerase (5 U/μl)      0.25 μl
Forward primer (10 μM)                     0.5 μl
Reverse primer (10 μM)                     0.5 μl
DNA                                        2 μl
Distilled water                            10.15 μl
Total reaction volume                      20 μl

Table 1: PCR master mix of a total volume of 20 μl. GC, guanine-cytosine.
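Scaling the master mix in Table 1 to the number of reactions can be sketched as below. The per-reaction volumes are those in Table 1; the optional `overage` parameter, which adds a pipetting excess, is an assumption for illustration and is not described in the protocol. DNA is kept out of the master mix because it is added to each well separately.

```python
# Per-reaction volumes in microlitres, taken from Table 1 (DNA excluded).
PER_REACTION_UL = {
    "PCR buffer": 2.0,
    "GC-Rich solution": 4.0,
    "dNTP mix (10 mM)": 0.6,
    "FastStart Taq DNA polymerase (5 U/ul)": 0.25,
    "Forward primer (10 uM)": 0.5,
    "Reverse primer (10 uM)": 0.5,
    "Distilled water": 10.15,
}
DNA_UL = 2.0  # added to each well individually


def master_mix(n_reactions, overage=0.0):
    """Reagent volumes (ul) for n reactions, with an optional fractional excess."""
    if n_reactions < 1 or overage < 0:
        raise ValueError("need at least one reaction and a non-negative overage")
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in PER_REACTION_UL.items()}


# Example: 24 samples plus one positive and one negative control.
for reagent, volume in master_mix(26).items():
    print(f"{reagent}: {volume} ul")
```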

Temperature (°C)    Time (min)    Number of cycles
96                  8             1
96                  0.5           }
52-58               0.25          } 35
72                  0.5           }
72                  7             1

Table 2: PCR thermal cycle, with the temperature in step 3 set between 52 and 58 °C, depending on each exon’s annealing temperature.

3.2.3.4.PCR PRODUCT VERIFICATION

A 1% agarose gel mix with 0.1% TBE buffer (Sigma-Aldrich Ltd, Dorset, UK) was prepared and heated in a microwave at full power for 10 minutes. After the solution had cooled at room temperature, 2 μl of Nancy-520 (Sigma-Aldrich Ltd) were added. Five μl of PCR reaction product were electrophoresed through the prepared gel. A 100 bp DNA ladder (1 μg/μl) (Thermo Scientific, Waltham, MA, USA) was used to standardize the size of the PCR products. The DNA bands were visualized and the success of the amplification was confirmed by using a Gene Snap UV transilluminator (Syngene, Cambridge, UK).

3.2.4.Whole genome analysis with the Human CytoSNP-12 v2.1 (300,000 Single Nucleotide Polymorphism) microarray analysis

The 12-sample HumanCytoSNP-12 BeadChip is a genome-wide scanning panel, which can deliver high-throughput linkage analysis. This BeadChip includes a complete panel of genome-wide SNPs in a region of high linkage disequilibrium (tag SNPs), and markers targeting all regions of known cytogenetic importance. It incorporates 300,000 SNPs with the highest tagging power, offering global coverage. Two hundred nanograms of DNA per sample were prepared. The DNA samples were denatured and neutralized to prepare them for amplification. The denatured DNA was isothermally amplified overnight and the product was fragmented by a controlled end-point enzymatic process. Following isopropanol precipitation, DNA samples were collected and centrifuged at 4 °C. The precipitated DNA was subsequently resuspended in RA1 buffer (supplied by Illumina).

The BeadChip was prepared for hybridization in a capillary flow-through chamber. Samples were applied to the BeadChip and divided by an IntelliHyb seal. The loaded BeadChip was incubated overnight in the Illumina hybridization oven. The following day the amplified and fragmented DNA samples were annealed to locus-specific 50-mers during hybridization. The unhybridized and non-specifically hybridized DNA was washed away and the BeadChip was prepared for staining and extension. By using the captured DNA as a template, a single-base extension of the oligonucleotides on the BeadChip was performed. This process incorporated detectable labels on the BeadChip and determined the genotype call for each sample.

Following labelling and uploading via the Autoloader2, the Illumina iScan was employed to scan the BeadChip, using a laser to excite the fluorophore of each single base extension product on the beads. The scanner, a two-channel high-resolution laser imager, recorded high-resolution images of the fluorescent light captured at two wavelengths simultaneously. Data generated by the iScan were uploaded into a data file and processed with GenomeStudio. This is an integrated data analysis software platform used for extracting genotype data from intensity files such as those produced by the iScan.


3.2.5.Homozygosity mapping

3.2.5.1.UPLOADING GENOTYPES

Genotype files generated by the Human CytoSNP microarray analysis were uploaded to the HomozygosityMapper server. HomozygosityMapper is a web-based tool that can be used for the detection of recessive alleles in consanguineous families. During the upload, HomozygosityMapper screened all samples for blocks of homozygous genotypes in contiguous markers and stored each genotype along with the length of the block in which it is positioned. Results were then generated on the basis of the selected cases and the parameters set as described in the next section (Seelow et al. 2009).

3.2.5.2.GENOTYPE ANALYSIS

Uploaded data under the project name “fullset” were given a specific analysis name based on date of analysis, ethnic origin, consanguinity and patient number. Cases were entered as per individual patient identification number. The block length limit varied between analyses; block lengths above the limit were set to the limit value when calculating the homozygosity score. This prevents inflation of homozygosity scores by very long homozygous stretches in single individuals (Table 3). Once the analysis was performed, data became available online for review.
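The effect of the block length limit can be illustrated with a deliberately simplified sketch. It assumes that each marker in each sample is assigned the length of the homozygous run containing it, that this length is capped at the limit, and that the capped values are summed across samples. The scoring implemented by HomozygosityMapper differs in detail, so this sketch is not expected to reproduce the scores obtained in this project. The genotypes in the example are invented.

```python
def is_homozygous(genotype):
    """True for homozygous calls such as 'AA' or 'BB'."""
    return len(genotype) == 2 and genotype[0] == genotype[1] and genotype[0] in "AB"


def block_lengths(genotypes):
    """For each marker, the length of the homozygous run it lies in (0 if not homozygous)."""
    lengths = [0] * len(genotypes)
    i = 0
    while i < len(genotypes):
        if is_homozygous(genotypes[i]):
            j = i
            while j < len(genotypes) and is_homozygous(genotypes[j]):
                j += 1
            for k in range(i, j):
                lengths[k] = j - i
            i = j
        else:
            i += 1
    return lengths


def homozygosity_scores(samples, limit):
    """Per-marker score: sum over samples of block lengths capped at `limit`."""
    per_sample = [block_lengths(g) for g in samples]
    return [sum(min(length, limit) for length in column) for column in zip(*per_sample)]


# Two samples typed at the same four contiguous markers.
samples = [
    ["AA", "BB", "AB", "AA"],
    ["AA", "AA", "AA", "AA"],
]
print(homozygosity_scores(samples, limit=3))
```

Without the cap, the second sample alone would contribute 4 to every marker; with a limit of 3 its contribution is bounded, so a single long homozygous stretch cannot dominate the score.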


Analysis name    Patient No                        Block length    Maximum
(Fullset)                                          limit (bp)      homozygosity score
A01              1-24                              100             386
A02              1,3,5,7-10,13,14,16,23,24         100             1131
A03              1-24                              250             1733
A04              1,3,14,18,19,21,24                250             750
A05              1,3,14,18,19,21,24,15,12,17       250             1153
A06              7-10,13,16                        250             1221
A07              7-10,13,16,12,15,17               100             1705
A08              7-10,13,16,2,11                   250             1386
A09              1,3,7-10,13,14,16,18,19,23,24     1000            4860
A10              1,3,7-10,13,14,16,18,19,23,24     100             1406
A11              12,15,17                          100             300
A12              11                                1000            245
A13              1-3,5,7-10,12-19,23,24            100             1606
A14              10,16                             100             200
A15              5                                 1000            1000
A16              2,11                              100             200
A17              6,8                               100             200
A18              3,8,10,16,18                      250             1250
A19              5,8,9,10,13                       250             839
A20              3,8,18                            250             750
A21              8,10,16                           250             750
A22              1,9,10,14,16                      250             1000
A23              5,8,10,16                         250             1000
A24              3,8,10,16,18                      250             1250
A25              1,9,10,14                         250             1000
A26              8,10,16,19                        250             1000
A27              10,13,16                          250             750
A28              10,16,19                          250             750
A29              8,10,13,16,19                     250             1000

Table 3: HomozygosityMapper analysis sets. Each row represents a separate analysis (A01-A29), each including a different sub-cohort of patients based on common ethnic background [Arabic (1, 3, 4, 14, 18, 19, 21, 24), Asian (6-10, 13, 16), Greek (12, 15, 17)], consanguinity (1, 3, 7-10, 13, 14, 16, 18, 19, 23, 24) or sibling groups (10, 16 and 3, 18). Altering the block length limit and the number of patients generated a different maximum homozygosity score. The maximum homozygosity score in the listed analyses varied from 200 to 4860.


3.2.5.3.GENOME-WIDE HOMOZYGOSITY

Following selection of each analysis group, homozygosity scores were displayed as a genome-wide bar chart. Scores higher than 80% of the maximum score were displayed as red bars to highlight them (Figure 2). Above the homozygosity scores, the ratio of observed versus expected homozygous genotypes was displayed. Below the bar chart, all scores above 80% of the maximum were listed (sorted by their score) and direct links to these regions were provided. At the top of the table, “broad” regions were displayed, in which smaller decreases of the score within a homozygous region were neglected. At the bottom of the table, “narrow” regions with sharp limits followed. As we were expecting some degree of heterogeneity, the analysis was performed considering the “broad” regions. By choosing the chromosomal region of interest, candidate genes in all the identified long homozygous regions could be queried with Gene Distiller (http://genedistiller.org). Data were reviewed in more detail by using the automatically generated genotype view, a table in which each line stood for a SNP (sorted by its position) and each column for a sample (Figure 3).

Figure 2: HomozygosityMapper analysis on all consanguineous patients across the genome. Homozygosity scores are displayed as a genome-wide bar chart. The red bars identify highest homozygosity score regions. Areas of high homozygosity are listed automatically in the format displayed in the lower part of the figure including respective chromosome position (start-stop) and reference SNP (rs) number (start-stop).


Genotype and, in parentheses, homozygous block length for each patient. Position is the position on chromosome 11.

dbSNP       Position  Score  Patient number
                             1          3          5          7          8          9          10         13         14         16         23         24
rs12281162  48650491  1240   BB (4814)  AA (51)    AA (379)   AA (4)     AA (4)     AA (353)   AB (0)     AA (3370)  AA (7)     AB (0)     AA (240)   AA (6)
rs12290565  48664631  1240   BB (4814)  BB (51)    BB (379)   BB (4)     BB (4)     BB (353)   BB (3)     BB (3370)  BB (7)     BB (3)     BB (240)   BB (6)
rs2202626   48842766  1240   AA (4814)  AA (51)    AA (379)   AA (4)     AA (4)     AA (353)   AA (3)     AA (3370)  AA (7)     AA (3)     AA (240)   AA (6)
rs11040196  48994066  1240   BB (4814)  BB (51)    BB (379)   BB (4)     BB (4)     -  (353)   BB (3)     BB (3370)  BB (7)     BB (3)     BB (240)   BB (6)
rs11040198  49000558  1240   BB (4814)  BB (51)    BB (379)   AB (0)     AB (0)     AA (353)   AB (0)     BB (3370)  AB (0)     AB (0)     BB (240)   AB (0)
rs2865908   49009877  1240   BB (4814)  BB (51)    BB (379)   AB (0)     AB (0)     AB (353)   AB (0)     BB (3370)  AB (0)     AB (0)     BB (240)   AB (0)
rs10431006  49010832  1240   AA (4814)  AA (51)    AA (379)   AA (2)     AB (0)     AA (353)   AB (0)     AA (3370)  AA (2)     AB (0)     AA (240)   AB (0)
rs4614455   49018666  1240   BB (4814)  BB (51)    BB (379)   BB (2)     BB (1)     BB (353)   BB (1)     BB (3370)  BB (2)     BB (1)     BB (240)   BB (1)
rs7124489   49024333  1240   BB (4814)  BB (51)    BB (379)   AB (0)     AB (0)     AA (353)   AB (0)     BB (3370)  AB (0)     AB (0)     BB (240)   AB (0)
rs10769558  49030187  1240   BB (4814)  BB (51)    BB (379)   AB (0)     AB (0)     AA (353)   AB (0)     BB (3370)  AB (0)     AB (0)     BB (240)   AB (0)
rs10839210  49033342  1240   BB (4814)  BB (51)    BB (379)   AB (0)     AB (0)     AA (353)   AB (0)     BB (3370)  AB (0)     AB (0)     BB (240)   AB (0)
rs7929269   49040737  1240   BB (4814)  BB (51)    BB (379)   AB (0)     AB (0)     AA (353)   AB (0)     BB (3370)  AB (0)     AB (0)     BB (240)   AA (1)
rs3781713   49048198  1240   BB (4814)  BB (51)    BB (379)   AB (0)     AB (0)     AA (353)   AB (0)     BB (3370)  AB (0)     AB (0)     BB (240)   AB (0)
rs2696915   49058871  1240   AA (4814)  AA (51)    AA (379)   AB (0)     AB (0)     BB (353)   AB (0)     AA (3370)  AB (0)     AB (0)     AA (240)   AB (0)
rs11040236  49073888  1240   BB (4814)  BB (51)    BB (379)   BB (4)     BB (4)     BB (353)   BB (4)     BB (3370)  BB (4)     BB (4)     BB (240)   BB (4)
rs7924656   49085746  1240   AA (4814)  AA (51)    AA (379)   AA (4)     AA (4)     AA (353)   AA (4)     AA (3370)  AA (4)     AA (4)     AA (240)   AA (4)

Figure 3: Genotype table for a high homozygosity score region in chromosome 11 for all consanguineous patients. SNPs are listed as per rs number and position. Alleles are colour coded as red (homozygosity for common alleles), orange (homozygosity for non-common alleles) and blue (heterozygosity).


3.2.6.Loss of Heterozygosity by BeadStudio

3.2.6.1.BEADSTUDIO

For this part of the analysis the LOH Detector (Bead) GT v3.0 and CNV partition (Genome) v2.4.4 software packages were used. We incorporated the data generated from the SNP linkage array (named project 52).

We attempted 3 individual analyses based on patient number, SNP number, minimum Chi square and length (Table 4). SNP data were displayed as histogram, scatter or cluster plot, line graphs and reviewed.

Analysis No    Patient No               Min SNP    Min Chi square    Length
1              Consanguineous (n=14)    50         23.5              50
2              All (n=24)               50         250               50
3              All (n=24)               250        250               50

Table 4: BeadStudio analysis parameters

In Figure 4 SNP data are displayed for patients 10 & 16. Chromosomes 1-4 are displayed as a representative fraction of the genome, with patients 10 & 16 in ascending order. The midline blue scatter represents the common alleles (AA) and the blue scatter at the edges the rare alleles (BB). Regions of consecutive homozygous SNPs are noted as gaps in the blue scatter across the midline (red arrow) with increased signal in the respective regions across the margins (black arrow). The investigator can zoom into the respective regions of LOH and, by scrolling over the region, obtain the exact base start and stop positions and the length of LOH.


Figure 4: BeadStudio genotype analysis viewer. LOH regions are featured here for patients 10 & 16 for chromosomes 1-4 in ascending order. The midline blue scatter represents the common alleles (AA) and the blue scatter at the edges the rare alleles (BB). Gaps in the blue scatter across the midline (red arrow) represent regions of consecutive homozygous SNPs with increased signal in the respective rare allele regions (black arrow).
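The idea of reporting the start position, stop position and length of runs of consecutive homozygous SNPs can be sketched as follows. This is a minimal illustration that applies only a minimum SNP count; it is not the algorithm used by the BeadStudio LOH Detector, and the chi-square threshold listed among the analysis parameters is not modelled. Positions and genotypes in the example are invented.

```python
def loh_regions(positions, genotypes, min_snps=50):
    """Find runs of consecutive homozygous SNPs on one chromosome.

    positions -- base positions of the SNPs, in ascending order
    genotypes -- genotype calls ('AA', 'AB' or 'BB') in the same order
    Returns a list of (start, stop, length_bp, snp_count) tuples for runs
    containing at least `min_snps` SNPs.
    """
    if len(positions) != len(genotypes):
        raise ValueError("positions and genotypes must have the same length")

    regions = []
    run_start = None
    last = len(genotypes) - 1
    for idx, call in enumerate(genotypes):
        homozygous = call in ("AA", "BB")
        if homozygous and run_start is None:
            run_start = idx
        # Close the run at a non-homozygous call or at the end of the data.
        if run_start is not None and (not homozygous or idx == last):
            run_end = idx if homozygous else idx - 1
            count = run_end - run_start + 1
            if count >= min_snps:
                start, stop = positions[run_start], positions[run_end]
                regions.append((start, stop, stop - start + 1, count))
            run_start = None
    return regions


positions = [100, 200, 300, 400, 500, 600]
genotypes = ["AA", "BB", "AB", "AA", "AA", "BB"]
print(loh_regions(positions, genotypes, min_snps=3))
```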


3.2.7.Sequencing analysis

3.2.7.1.PURIFICATION OF PCR PRODUCTS FOR SEQUENCING

PCR products were purified in preparation for sequencing using the Agencourt AMPure XP 60 ml Kit (A63881). This process was used in order to achieve high recovery of amplicons and efficient removal of unincorporated dNTPs, primers, primer dimers, salts and other contaminants prior to sequencing.

3.2.7.1.1.AMPure PCR Purification Kit

The PCR plate was centrifuged briefly using a plate centrifuge to ensure all PCR product was collected in the bottom of the well and also to reduce the risk of cross contamination. The seal was carefully removed from the PCR plate to avoid splashing. The AMPure XP magnetic particle solution container (kept at 4 °C but brought slowly to room temperature over a 20 minute period) was gently shaken to re-suspend any magnetic particles, and 1.8 μl of the solution per 1.0 μl of PCR product were added to each well using a multipipette. One negative PCR control was used to help identify any potential contamination. We pipetted up and down gently 10 times to mix the beads and PCR product. The ensuing homogenous mixture was left to stand at room temperature for 5 minutes to enable DNA products (≥ 100 bp) to bind to the magnetic beads. The reaction plate was placed onto a 96 well magnetic plate for 5 minutes to separate the beads from the solution. Beads formed a ring around the well at the height of the magnet, with the solution appearing clear at the end. With the reaction plate still on the magnetic plate, the supernatant was pipetted off and discarded, and a multipipette was used to add 100 μl of 70% ethanol solution to each well. The mix was incubated for 30 seconds and the ethanol was aspirated and discarded. This process was repeated twice. Any remaining ethanol was aspirated and discarded by using a multipipette with 10 μl tips to avoid any PCR inhibition caused by the presence of ethanol. The plate was removed from the magnetic stand and was left uncovered to dry for 10 minutes, avoiding any cracking of the bead ring, as this would significantly reduce the efficiency of the elution. Thirty μl of distilled water were added to each well with a multipipette and mixed by gently pipetting up and down 10 times. The reaction plate was put onto the magnetic platform for another 5 minutes to separate the beads from the solution. A volume of 35 μl of the eluent of each amplicon was transferred by multipipetting into the adjacent free column of the PCR plate. The plate was labelled, sealed with a plate seal and stored at 4 °C in a post-PCR fridge.


3.2.7.2.CYCLE SEQUENCING

3.2.7.2.1.Cycle sequencing reaction

Initially the primers and BigDye Terminator mix were defrosted. The cycle sequencing reaction was prepared by mixing 1.5 μl of BigDye Terminator sequencing Buffer, 1 μl of BigDye Terminator mix v3.1 and 4.5 μl of distilled water. The sequencing reaction was aliquoted and 7 μl of sequencing reaction mix were pipetted into the allocated sequencing plate wells. 1.2 μl of forward sequencing primer were added to the allocated column and the same process was followed for the reverse primer. Using the plate spinner, the PCR plate was spun to draw condensation to the bottom of the wells; then the plate was placed on a magnetic rack and the sealer was removed. Using a multichannel pipette, 2 μl of PCR product were added to each well. The plate was covered and spun. The sequencing mixture was temperature cycled as per the protocol listed in Table 5.

Temperature (°C)    Time (minutes)    Number of cycles
96                  4                 1
96                  1                 }
55                  0.25              } 35
60                  1                 }
60                  5                 1

Table 5: Cycle sequencing parameters

3.2.7.2.2.Sequencing product purification by ethanol precipitation

A solution was made by mixing 20 μl of the cycle sequencing product with 50 μl of 100% ethanol, 4 μl of EDTA (125 mM) and 1 μl of sodium acetate (125 mM). The mix was vortexed and incubated at -20 °C for 30 minutes, then centrifuged at 13,000 rpm for 25 minutes, and the supernatant was discarded. The DNA was re-suspended in 200 μl of 70% ethanol, vortexed briefly and centrifuged again at 13,000 rpm for 20 minutes. The supernatant was removed and the DNA was left to dry completely in the unsealed tube at room temperature. The DNA was re-suspended in 10 μl of formamide and loaded into an optical ABI PRISM® plate. The solution was denatured at 95 °C for 5 minutes on a thermal cycler and subsequently loaded on the automated sequencer ABI PRISM 3130-Avant Genetic Analyser (Applied Biosystems, Life Technologies).

3.2.7.2.3.Sequencing

DNA solutions were loaded into the ABI PRISM® 3130-Avant Genetic Analyser and sequencing was performed according to protocol. The sequencing results were analysed by manually inspecting the electropherograms aligned to a reference sequence in the software GENtle v1.9.4 (Manske, M., University of Cologne, Germany), SeqScape® v2.1 (Applied Biosystems) and Sequencher (Gene Codes, Ann Arbor, MI, USA).

3.2.8.Next generation sequencing - Whole exome sequencing (WES)

Genomic DNA extracted from stored blood samples from our patient cohort was used for Whole Exome Sequencing according to the SureSelect XT Target Enrichment System Kit for Illumina Multiplexed Sequencing. The protocol was specifically developed to use biotinylated RNA oligomer libraries to enrich targeted regions of the genome from repetitive sequences in a paired-end multiplexed library preparation (Figure 5). Sample preparation and processing were performed at the Biomedical Research Centre, Guy’s Hospital, London, and are detailed below.

Figure 5: Synopsis of sequencing sample preparation workflow provided by Illumina.

3.2.8.1.PREPARATION OF GENOMIC DNA

3.2.8.1.1.DNA quantification

High-quality DNA at an adequate concentration (20 ng/μl, A260/A280 1.8-2.0) was confirmed with the Qubit 2.0 Fluorometer (Invitrogen, Life Technologies, Paisley, UK) as described in section 3.2.2.

3.2.8.1.2.DNA shearing

The Covaris S-series single tube sample preparation system (Model S2) was set up. The Covaris tank was filled with fresh deionised water up to line 12 on the graduated fill line label, making sure that the water covered the visible glass part of the tube. The chiller temperature of the water bath was set at 5 °C. The instrument was degassed for at least 30 minutes prior to use. A mix consisting of 3 μg of each DNA sample and 1X Low TE Buffer was prepared in a 1.5 ml LoBind tube to a total volume of 130 μl. A Covaris microTube was placed into the loading and unloading station with the cap kept on the tube. With a tapered pipette tip, 130 μl of DNA sample were transferred through the pre-split septa into the bottom of the tube, avoiding the introduction of any air bubbles. The microTube was secured in the tube holder and the DNA was sheared according to protocol (Table 6).

Setting Value

Duty Cycle 10%

Intensity 5

Cycles per Burst 200

Time 6 cycles of 60 seconds each

Set Mode Frequency sweeping

Temperature 4° to 7° C

Table 6: Covaris DNA shearing settings

Following the shearing process, the Covaris microTube was placed back into the loading and unloading station. With the snap-cap on, a pipette tip was inserted through the pre-split septa and the sheared DNA was removed and transferred into a new 1.5 ml LoBind tube. The process was repeated for each DNA sample.

3.2.8.1.3.DNA purification

While the DNA was being sheared, the Agencourt AMPure XP beads were left to reach room temperature for at least 30 minutes prior to the purification. The reagent was then mixed well until homogeneous. 180 μl of AMPure XP beads were pipetted into each 1.5 ml LoBind tube and the sheared DNA library (~130 μl) was added. The mix was vortexed and incubated for 5 minutes. The tubes were placed on a magnetic stand until the solution appeared clear (3-5 minutes). With the tubes kept on the magnetic stand, the entire cleared solution was discarded, making sure the beads were left untouched. With the tubes still on the magnetic stand, 500 μl of freshly prepared 70% ethanol were dispensed into each tube. The tubes were left to rest for 1 minute to allow any disturbed beads to settle and the ethanol was then removed. The ethanol step was repeated once more for optimum results. The samples were left to dry on a heat block set at 37 °C for 5 minutes or until the ethanol had completely evaporated, taking care that the bead pellet did not crack. 50 μl of nuclease-free water were added to each tube and the solution was mixed well on a vortex mixer and then incubated for 2 minutes at room temperature. The tubes were placed back on the magnetic stand and left for 3 minutes until the solution cleared. The 50 μl of supernatant were transferred to a new 1.5 ml LoBind tube and the used beads were discarded.

3.2.8.1.4.Quality assessment

Sheared DNA quality was checked using the Agilent 2100 Bioanalyzer. The Bioanalyzer DNA 1000 chip and reagent kit were used, having been kept at room temperature for at least 30 minutes prior to use. The 2100 Bioanalyzer electrodes were checked and cleaned. The Agilent 2100 Expert software (version B.02.02) was opened and communication with the Bioanalyzer was checked. The chip, gel-dye mix, samples and ladder were prepared according to guidelines (http://www.chem.agilent.com/en-US/Search/Library/_layouts/Agilent/PublicationSummary.aspx?whid=46764). Essentially, Screen Tape with DNA1K Reagents (ladder and sample buffer) was chosen to analyse the fragment length of the sheared DNA library. In optical strip tubes, 1 µl of ladder was placed in the first tube and 1 µl of sample buffer was mixed with 1 µl of each sample in the subsequent tubes. After vortexing and spinning down, the optical tube strips were placed on the Bioanalyzer together with the specific loading tips. The Bioanalyzer software was launched and the required sample panel was selected on the controller framework before starting the electrophoresis run, which was started within 5 minutes of preparation. Within the instrument settings the dsDNA assay was chosen and data were saved in an easily identifiable file. The results were verified and electropherograms were checked to ensure that the DNA size distribution peaked at 150-200 bp (Figure 6).

Figure 6: Sheared DNA analysis showing an electropherogram distribution with an optimum peak size at 150-200 nucleotides.

Upon completion of the assay the used chip was promptly removed from the Agilent 2100 Bioanalyzer and discarded. One of the wells of the electrode cleaner was slowly filled with 350 μl deionised analysis-grade water. The lid was opened and the electrode cleaner was placed in the Agilent 2100 Bioanalyzer. The lid was closed and was left for about 10 seconds. The lid was opened and the electrode cleaner was removed. A ten-second waiting time was allowed for water evaporation prior to closing the lid.

3.2.8.1.5.End repair

A master mix for 5 samples was prepared on ice. The SureSelect Library Prep Kit GA for multiple libraries was used and the reaction mix was prepared as per Table 7. 52 μl of the reaction mix and 48 μl of each DNA sample were added to each well in a PCR strip tube. The solution was mixed by pipetting and samples were incubated in a thermal cycler with no heated lid at 20 °C for 30 minutes.

Reagent Volume for 5 libraries (including excess)

DNA sample 48 μl/sample

Nuclease-free water 183.3 μl

10X End Repair Buffer 52.3 μl

dNTP Mix 8.33 μl

T4 DNA Polymerase 5.2 μl

Klenow DNA Polymerase 10.4 μl

T4 Polynucleotide Kinase 11.45 μl

Total volume 318.98 μl (63.79 μl/sample)

Table 7: End Repair mix for 5 samples
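The volumes in Table 7 can be cross-checked with a few lines of arithmetic. The sketch below sums the reagent volumes exactly as listed and relates them to the 52 μl of mix and 48 μl of DNA used per well; the dictionary layout and variable names are illustrative assumptions. It uses `Decimal` so that the sums are exact.

```python
from decimal import Decimal

# Volumes in ul, copied from Table 7 (master mix for 5 libraries, including excess)
MASTER_MIX = {
    "Nuclease-free water": Decimal("183.3"),
    "10X End Repair Buffer": Decimal("52.3"),
    "dNTP Mix": Decimal("8.33"),
    "T4 DNA Polymerase": Decimal("5.2"),
    "Klenow DNA Polymerase": Decimal("10.4"),
    "T4 Polynucleotide Kinase": Decimal("11.45"),
}
MIX_PER_REACTION = Decimal("52")   # ul of reaction mix added to each well
DNA_PER_REACTION = Decimal("48")   # ul of sheared DNA added to each well

mix_total = sum(MASTER_MIX.values())
print(mix_total)                      # 270.98 ul of master mix
print(mix_total + DNA_PER_REACTION)   # 318.98, the total volume listed in Table 7
print(mix_total / MIX_PER_REACTION)   # roughly 5.2 reactions' worth of mix
```

On this reading, the master mix covers the 5 libraries with a small excess.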

3.2.8.1.6. Second DNA purification

The same protocol as per 3.2.8.1.3 was followed until the mix was left to dry on the heat block. After this step 32 μl of nuclease-free water were added to each tube and the solution was mixed well on a vortex mixer and then incubated for 2 minutes at room temperature. The tubes were placed back on the magnetic stand and left for 3 minutes until the solution cleared. The 32 μl of supernatant were transferred to a new 1.5 ml LoBind tube and the used beads were discarded. The samples were stored overnight at -20 °C.

3.2.8.1.7.Addition of “A” bases to the 3’ end of DNA fragments

For the adenylation of the 3’ end of the DNA fragments, the SureSelect Library Prep Kit GA was used. A reaction mix for 5 samples was prepared on ice as per protocol (Table 8). The mix was vortexed for a few seconds. 20 μl of the reaction mix and 30 μl of each DNA sample were added to each individual well of a PCR strip tube. The reaction was mixed by pipetting and incubated in a thermal cycler for 30 minutes at 37 °C.

Reagent Volume for each library

DNA sample 30 μl

Nuclease-free water 11 μl

10X Klenow Polymerase Buffer 5 μl

dATP 1 μl

Exo(-) Klenow 3 μl

Total volume 50 μl

Table 8: Adenylation of 3’ end of DNA fragments per sample

3.2.8.1.8. Third DNA purification

Following the adenylation of the 3’ end of the DNA fragments, the purification step was repeated as per section 3.2.8.1.6. The samples were processed immediately for ligation of the indexing-specific paired-end adapter.

3.2.8.1.9.Ligation of the indexing-specific paired-end adapter

A reaction mix was prepared on ice using a 10:1 molar ratio of adapter to DNA insert according to the volumes in Table 9. 37 μl of the reaction mix and 13 μl of each DNA sample were pipetted into each well of the PCR strip. The mix was incubated for 15 minutes at 20 °C on a thermal cycler with no heated lid.

Reagent Volume per library

DNA sample 13 μl

Nuclease-free Water 15.5 μl

5X T4 DNA ligase Buffer 10 μl

InPE Adaptor Oligo Mix 10 μl

T4 DNA Ligase 1.5 μl

Total volume 50 μl

Table 9: Ligation of the indexing-specific paired-end adapter master mix. Volumes of reagents were adjusted for multiple libraries to achieve final volume of 37 μl/sample.

3.2.8.1.10.DNA purification following adapter ligation

After the ligation of the indexing-specific paired-end adapter, a purification step was repeated as per section 3.2.8.1.6. The only difference was that 32 μl of nuclease-free water were added to each tube prior to placing the tube in the magnetic stand and removing the supernatant to a fresh 1.5 ml LoBind tube.

3.2.8.1.11.Adapter-ligated library amplification

In a PCR strip tube a reaction mix was prepared as per guidelines (Table 10). The reaction was mixed by gently pipetting up and down. 35 μl of the reaction mix and 15 μl of each DNA sample were added to each strip tube and mixed by pipetting.

Reagent Volume per reaction

Indexing Adapter-ligated library 15 μl

Nuclease-free water 21 μl

InPE Primer 1.0 1.25 μl

SureSelect GA Indexing Pre Capture PCR Reverse Primer 1.25 μl

5X Herculase II Rxn Buffer 10 μl

100 mM dNTP Mix 0.5 μl

Herculase II Fusion DNA Polymerase 1 μl

Total volume 50 μl

Table 10: PCR mix components per sample for adapter-ligated library amplification

The PCR reaction was run on a thermal cycler with the programme detailed in Table 11.

Step Temperature (°C) Time

1 98 2 minutes

2 98 30 seconds

3 65 30 seconds

4 72 1 minute

5 Repeat (steps 2-4) x4

6 72 10 minutes

7 4 Hold

Table 11: PCR parameters for adapter-ligated library amplification

3.2.8.1.12.DNA purification following adapter-ligated library amplification

Following the adapter-ligated library amplification a purification step was repeated as per section 3.2.8.1.6. The only difference was that 30 μl of nuclease-free water were added to each tube prior to placing the tube in the magnetic stand and removing the supernatant to a fresh 1.5 ml LoBind tube.

3.2.8.1.13.Quality assessment after adapter-ligated library amplification

The quality of the amplified adapter-ligated library was checked following the protocol in section 3.2.8.1.4. The results were verified and electropherograms were checked to ensure that the DNA size distribution peaked at 250-275 bp (Figure 7). A minimum of 500 ng of library was required for hybridization.

Figure 7: Electropherogram of an amplified prepared DNA library. The graph shows a single peak in the size range of 250-275 bp.

3.2.8.2.HYBRIDIZATION

The complete process of hybridization involved the combining of the prepared library with the hybridization and blocking agents and the SureSelect capture library as demonstrated below (Figure 8).

Figure 8: SureSelect Target Enrichment System Capture Process workflow for the Illumina platform. The figure is provided by Agilent Technologies.

3.2.8.2.1.Library hybridization

For each sample library an individual hybridization and capture was prepared. 500 ng of DNA were required in a maximum volume of 3.4 μl.

a. If the prepared library concentration was below 147 ng/μl, the sample was concentrated at ≤45 °C as follows: 30 μl of the prepared library were added to an Eppendorf tube and, after the cap was broken off, the tube was covered with parafilm, in which holes were poked with a narrow-gauge needle. The sample was completely lyophilized on low heat. The sample was reconstituted with nuclease-free water to a final concentration of 147 ng/μl by pipetting up and down along the sides of the tube for maximum recovery. The new mix was mixed on a vortex mixer and spun in a microfuge for 1 minute.

b. A hybridization buffer solution was made according to the protocol listed in Table 12.

Reagent Volume for 6 captures (including excess)

SureSelect Hyb # 1 125 μl

SureSelect Hyb # 2 5 μl

SureSelect Hyb # 3 50 μl

SureSelect Hyb # 4 65 μl

Total 245 μl (40 μl /sample needed)

Table 12: Hybridization buffer solution protocol

c. In case of precipitation the hybridization buffer was warmed at 65 °C for 5 minutes.

d. Capture library tubes were kept on ice while the SureSelect capture library mix was prepared. For each library sample and a capture size of 50 Mb, 5 μl of SureSelect capture library were added. The volume of RNase Block dilution added was 2 μl at a dilution (parts RNase Block : parts H2O) of 1:3 (25%).

e. Nuclease-free water was used to prepare the solution of the SureSelect RNase Block, allowing for excess. Once the RNase Block solution was made, it was added to each capture library and mixed by pipetting.

f. The SureSelect Block mix was made as listed in Table 13.

Reagent Volume for 5 reactions

SureSelect Indexing Block # 1 13.75 μl

SureSelect Block # 2 13.75 μl

SureSelect GA Indexing Block # 3 3.3 μl

Total 30.8 μl

Table 13: SureSelect Block Mix

g. In a separate PCR plate 3.4 μl of 147 ng/μl library were added to the “B” row, with each sample in a separate well (Figure 9).

Figure 9: Hybridization Library PCR plate; Prepped Library (red), Hybridization Buffer (blue) and SureSelect Capture Library (green).

5.6 μl of the SureSelect Block Mix were added to each well in row “B” and the solution was mixed by pipetting up and down. The wells in row “B” were sealed with caps and the plate was placed in the thermal cycler and run with the programme listed in Table 14.

Step Temperature Time

Step 1 95 °C 5 minutes

Step 2 65 °C Hold

Table 14: PCR programme for library hybridization

h. While the plate was maintained at 65 °C in the cycler, 40 μl/well of hybridization buffer were added into the “A” row of the PCR plate. The samples remained on the plate at 65 °C for ≥5 minutes.

i. The capture library mix previously prepared (see d.) was transferred into the “C” row of the plate with a multi-channel pipette. The wells were firmly sealed with strip caps and incubated at 65 °C for 2 minutes.

j. While the plate was at 65 °C, 13 μl of Hybridization Buffer from row “A” were added with a multi-channel pipette into the SureSelect capture library mix in row “C” of the plate for each sample (Figure 9).

k. Immediately afterwards, with the plate still at 65 °C, the entire contents of each prepared library mix in row “B” were transferred rapidly with a multi-channel pipette into the hybridization solution in row “C”. The final solution was mixed well by slowly pipetting up and down 10 times. The hybridization mixture consisted of 27-29 μl, depending on the degree of evaporation. The wells were sealed with new strip caps to avoid an evaporation of more than 4 μl. The new mixture was incubated for 24 hours at 65 °C with a heated lid at 105 °C.
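The volumes quoted in steps a to k can be checked against one another. The sketch below, using illustrative names that are not part of the protocol, shows where the 147 ng/μl threshold comes from (500 ng in 3.4 μl) and adds up the components that end up in row “C”, which gives the upper end of the stated 27-29 μl hybridization volume. The total DNA amount passed to `reconstitution_volume_ul` in the usage note is a hypothetical value.

```python
REQUIRED_DNA_NG = 500.0
MAX_LIBRARY_UL = 3.4

def min_library_conc_ng_per_ul(dna_ng=REQUIRED_DNA_NG, volume_ul=MAX_LIBRARY_UL):
    """Lowest library concentration that delivers dna_ng within volume_ul."""
    return dna_ng / volume_ul

def reconstitution_volume_ul(total_dna_ng, target_conc_ng_per_ul=147.0):
    """Water volume that brings a lyophilized library to the target concentration."""
    return total_dna_ng / target_conc_ng_per_ul

# Components combined in row "C" (ul), as described in steps d to k
HYB_COMPONENTS = {
    "prepared library": 3.4,
    "SureSelect Block Mix": 5.6,
    "SureSelect capture library": 5.0,
    "RNase Block dilution": 2.0,
    "hybridization buffer": 13.0,
}

print(round(min_library_conc_ng_per_ul(), 1))   # 147.1
print(round(sum(HYB_COMPONENTS.values()), 1))   # 29.0
```

For example, a lyophilized library containing a hypothetical 588 ng of DNA would be reconstituted in `reconstitution_volume_ul(588)` = 4.0 μl of water.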

3.2.8.2.2.Magnetic beads preparation

The SureSelect Wash Buffer #2 was prewarmed to 65 °C in a heat block. The Dynal MyOne Streptavidin T1 (Invitrogen) magnetic beads were re-suspended on a vortex mixer to avoid settling at the bottom of the container. For each hybridization, 50 μl of Dynal magnetic beads were added to a 1.5 ml microfuge tube. 200 μl of SureSelect Binding Buffer were added to the beads and mixed on a vortex mixer for 5 seconds. The tubes were placed on the magnetic device and the supernatant was removed and discarded. The process was repeated for a total of 3 washes and the beads were afterwards re-suspended in 200 μl of SureSelect Binding Buffer.

3.2.8.2.3.Hybrid capture

The PCR plate was kept at 65 °C while the hybridization mixture was transferred from the thermal cycler to the bead solution; the plate was then sealed and mixed by inverting it 5 times. Minimal evaporation was essential. The hybrid-capture/bead solution was incubated on a heat block for 30 minutes at room temperature and the mix was briefly spun in a centrifuge. The beads and buffer were separated on the Dynal magnetic separator and the supernatant was removed. The beads were re-suspended in 500 μl of SureSelect Wash Buffer #1 and vortexed for 5 seconds. The samples were incubated for 15 minutes at room temperature and occasionally vortexed. Beads and buffer were again separated on the Dynal magnetic separator and the supernatant was removed. The beads were re-suspended in 500 μl of SureSelect Wash Buffer #2 prewarmed to 65 °C and mixed on a vortex mixer for 5 seconds. The samples were incubated for 10 minutes at 65 °C and briefly spun in a centrifuge (<3 seconds). The beads and buffer were separated on the Dynal magnetic separator and the supernatant was removed; this wash was performed a total of 3 times. The beads were mixed in 50 μl of SureSelect Elution Buffer on a vortex mixer for 5 seconds and the samples were incubated at room temperature for 10 minutes. The beads and buffer were separated on the Dynal magnetic separator and the supernatant containing the captured DNA was transferred to a new 1.5 ml microfuge tube. 50 μl of SureSelect Neutralization Buffer were added to the captured DNA.

3.2.8.2.4.Captured DNA purification

After the hybrid capture, the captured DNA library underwent another purification stage as described in section 3.2.8.1.6, the only difference being that at the last stage 30 μl of nuclease-free water were added to the dried samples and the same volume of supernatant was subsequently removed.

3.2.8.3.INDEX TAG ADDITION BY POST-HYBRIDIZATION AMPLIFICATION

This part of the WES process includes adding index tags by amplification, purification, assessment of the libraries, sample dilution for cluster amplification and pooling of barcoded samples for multiplexed sequencing.

3.2.8.3.1.Captured library amplification for addition of index tags

In a PCR strip tube a reaction mix was prepared on ice as per guidelines for each sample and a negative no-template control (Table 15). All samples were mixed on a vortex mixer.

Reagent Volume per reaction

Captured DNA 14 μl

Nuclease-free water 22.5 μl

5X Herculase II Rxn Buffer 10 μl

100 mM dNTP Mix 0.5 μl

Herculase II Fusion DNA Polymerase 1 μl

SureSelect GA Indexing Post Capture PCR (Forward) Primer 1 μl

PCR Primer Index 1 through Index 12 1 μl

Total 50 μl

Table 15: Herculase II Master Mix for captured library amplification and index tagging

In each strip tube 35 μl of the reaction mix and 1 μl of the appropriate index PCR Primer (Index 1 through Index 12) from the SureSelect Library Kit GA were added. The solution was mixed by pipetting. 14 μl of DNA were added to each tube, and the tubes were placed in a thermal cycler and run with the programme detailed in Table 16.

Step Temperature (°C) Time

1 98 2 minutes

2 98 30 seconds

3 57 30 seconds

4 72 1 minute

5 Repeat steps (2-4) x12

6 72 10 minutes

7 4 Hold

Table 16: PCR programme parameters for index tagging

3.2.8.3.2.Captured DNA purification

After the amplification of the captured library, purification was repeated as per section 3.2.8.2.4.

3.2.8.3.3.Quality assessment with the Agilent 2100 Bioanalyzer HS DNA assay

Post-hybridization tagged DNA quality was checked as per the protocol listed in section 3.2.8.1.4. The results were verified and electropherograms were checked to ensure that the DNA size distribution peaked at 300-400 bp (Figure 10).

Figure 10: Electropherogram of an amplified captured DNA library obtained with the HS DNA Kit. The peak lies in the expected size range of 300-400 nucleotides.

3.2.8.4. SAMPLE POOLING FOR MULTIPLEXED SEQUENCING

Libraries were combined so that each index-tagged sample was present in equimolar amounts in the pool. The formula to calculate the amount of index sample was:

Volume of Index = (V(f) x C(f)) / (# x C(i))

V(f): final desired volume of pool (20 μl)

C(f): desired concentration of all pooled DNA (10 nM)

#: number of indexes used

C(i): initial concentration of each index sample

In the case of final volume lower than the desired one, Low TE was added to the mix. In the case of final volume of the index-tagged samples higher than the desired one, the samples were lyophilized and reconstituted.
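The pooling formula can be illustrated with a short Python sketch. The function name and the four library concentrations below are hypothetical values chosen for illustration; only the defaults of 20 μl and 10 nM come from the text.

```python
def index_volume_ul(initial_conc_nm, n_indexes, final_volume_ul=20.0, final_conc_nm=10.0):
    """Volume of one index-tagged library: V(f) x C(f) / (# x C(i))."""
    if initial_conc_nm <= 0 or n_indexes <= 0:
        raise ValueError("concentration and number of indexes must be positive")
    return final_volume_ul * final_conc_nm / (n_indexes * initial_conc_nm)

# Hypothetical concentrations (nM) of four index-tagged libraries
concs = [40.0, 50.0, 80.0, 100.0]
volumes = [index_volume_ul(c, len(concs)) for c in concs]
low_te_ul = 20.0 - sum(volumes)    # Low TE needed to reach the 20 ul pool

print(volumes)      # [1.25, 1.0, 0.625, 0.5]
print(low_te_ul)    # 16.625
```

Each library contributes the same molar amount (volume multiplied by concentration is 50 fmol in every case), so the 20 μl pool ends up at 10 nM in total.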

3.2.8.5.CLUSTER AMPLIFICATION

During this step the pooled captured libraries were amplified using the paired-end protocol, and conditions were optimized to provide 400-600 K clusters/mm² on the flow cell. The flow cell is an eight-lane glass slide utilised for next generation sequencing (NGS), where libraries can be either individually placed or pooled in each lane. Pooled samples are separated based on the indices attached to them at the start of the process.

Initially HP3 (2N NaOH) was diluted 1:20 to 0.1N NaOH. The amplified captured libraries were diluted to 10 nM (10 fmol/μl) based on the Bioanalyzer quantitation. 2 μl of the 10 nM multiplexed sample pool were added to 8 μl of Buffer EB (10 mM Tris-Cl, pH 8.5) to make a 2 nM solution, to which 10 μl of 0.1N NaOH were added. The sample was mixed by vortexing and incubated for 5 minutes at room temperature. 980 μl of HT1 (Hybridization Buffer) were added to the denatured DNA to make a 20 pM template solution, which was mixed by vortexing and centrifuged.

A 4 pM template was made by mixing 200 μl of the 20 pM solution with 800 μl of pre-chilled HT1 (Hybridization Buffer), followed by vortexing and centrifugation. 10 μl of the solution were removed, leaving a volume of 990 μl, and replaced by 10 μl of PhiX control. The sample was mixed and centrifuged, and 120 μl of the diluted denatured sample were dispensed into a strip tube. The samples were kept on ice until cluster generation was about to start.
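The denaturation and dilution series above can be followed step by step with a small helper. This is a sketch for checking the stated concentrations only; the function name is an illustrative assumption and the PhiX spike-in is not modelled.

```python
def dilute(conc, volume_ul, added_ul):
    """Return (new concentration, new volume) after adding diluent."""
    total_ul = volume_ul + added_ul
    return conc * volume_ul / total_ul, total_ul

# Concentrations are in pM throughout (10 nM = 10,000 pM)
conc, vol = 10_000.0, 2.0                  # 2 ul of the 10 nM pool
conc, vol = dilute(conc, vol, 8.0)         # + 8 ul Buffer EB   -> 2 nM
conc, vol = dilute(conc, vol, 10.0)        # + 10 ul 0.1N NaOH  -> 1 nM
conc, vol = dilute(conc, vol, 980.0)       # + 980 ul HT1       -> 20 pM
print(conc, vol)                           # 20.0 1000.0

final_conc, final_vol = dilute(conc, 200.0, 800.0)   # 200 ul + 800 ul HT1
print(final_conc, final_vol)               # 4.0 1000.0
```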

3.2.8.6.CLUSTER GENERATION

After cluster amplification, the DNA library samples, which are by now attached to complementary adapter oligos fixed onto the surface of the flow cell, are amplified. The templates are copied from the hybridised primer by 3’ extension using a DNA polymerase. The fragments created are then amplified isothermally by the process of annealing, extension and denaturation. At the end they create clonal clusters of around 1,000 copies each.

3.2.8.7.REAGENT PREPARATION FOR HISEQ

The Enhanced Scanning Mix Reagent (SRE-50), Long Read Nucleotide Mix (LFN36) and the Cleavage Mix Reagent (CMR-50) were placed in a container with room temperature deionised water for about one hour until they had fully thawed. The reagents bind the DNA samples to the complementary adapter oligos on the paired-end flow cell, enhancing resynthesis on both forward and reverse strands. The reverse strand is then removed and the forward strands are left. The Incorporation Mix Buffer (ICB-50) was made up as per Illumina protocol (http://www.illumina.com/products/truseq_sbs_kit-hs.ilmn). Samples and reagents were loaded into the HiSeq and the programme was selected as per Table 17.

Cluster Kit Version: TruSeq Cluster Kit v3 (cBot - HS) with Flow Cell HS v3

SBS Kit version: TruSeq SBS Kit v3 - HS

HiSeq Control Software: HCS v1.4 / RTA v1.12

Table 17: TruSeq SBS Kit-HS (50 cycles) parameters

3.2.8.8.SEQUENCING BY SYNTHESIS

Sequencing starts with the extension of the first sequencing primer to produce the first read. In each cycle fluorescently tagged nucleotides are attached to the chain. After the addition of each nucleotide the clusters are excited by a light source to produce a specific signal based on the sequence of the template. The number of cycles depends on the specific programme. The emission wavelength and the light frequency determine the base call. Millions of clusters are sequenced simultaneously in a massively parallel process. After the completion of the first read, the read product is washed away. The index 1 read primer is introduced and hybridised to the template. Subsequent reads are generated in a similar way to the initial read. After completion of the index read the product is washed off and the 3’ primer folds over and binds to the next oligonucleotide on the flow cell. Index 2 is read and its read product is washed away. Polymerases extend the 2nd flow cell oligonucleotide and form a double-stranded DNA bridge. This double-stranded DNA is linearized and the 3’ ends are blocked. The forward strand is cleaved off and washed away. Read 2 starts with the introduction of the read 2 sequencing primer. The sequencing process is continued until the desired read length is achieved. Millions of reads are generated, representing all fragments.

3.2.8.9.SEQUENCE ALIGNMENT, VARIANT CALLING AND ANNOTATION

This process was undertaken at the Genomics Facility, Biomedical Research Centre (BRC), Guy’s and St Thomas’ NHS Foundation Trust and King’s College London. The sequence reads were separated by sample according to their allocated barcodes (de-multiplexing) and stored in FASTQ files. FASTQ is a text file format that contains both the sequence and its quality scores. For each sample two FASTQ files were generated, in accordance with the paired-end protocol. The reads contained in these files were aligned to the human reference genome version hg19/GRCh37 for identification of similar regions using the Novoalign tool. The advantages cited for this software include high sensitivity, multiple mismatch and gapped alignment, soft clipping to the optimal alignment, quality control (QC) filtering and quality recalibration prior to alignment. Quality control scores are generated for distribution across all bases, over all sequences, sequence content across all bases and guanine-cytosine (GC) distribution. Any differences from the reference genome were identified through the variant calling process, and such DNA variants were subsequently assigned functional characteristics such as gene name, position (intronic/exonic), homozygosity/heterozygosity, effects on protein function or structure and minor allele frequency (MAF). Once aligned, the data are stored as SAM files. A SAM file (.sam) is a tab-delimited text file that contains the sequence alignment data. Subsequently, for easier data management, SAM files were converted using SAMtools into BAM files (.bam), the binary version of the SAM format. BAM files are much smaller in size. BEDTools were utilised to generate capture and coverage reports. The data generated were checked for quantity and quality of alignment (failed, non-/aligned) (http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/), capture efficiency (on, off, near target) expressed as a percentage of reads, and completeness of coverage. Variant annotation was performed using Annovar software (http://www.openbioinformatics.org/annovar/), which can identify (i) whether SNPs or indels cause protein coding change, (ii) variants in specific genomic regions and (iii) variants in dbSNP, 1000 Genome Project.
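To make the FASTQ description concrete, the sketch below reads four-line FASTQ records and decodes the quality string into numeric Phred scores. It assumes the Phred+33 encoding, which is not stated in the text, and the read shown is an invented example; the function name is illustrative and does not correspond to any tool used in the pipeline.

```python
import io

def read_fastq(handle):
    """Yield (read_id, sequence, phred_scores) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            break
        seq = handle.readline().rstrip()
        handle.readline()                        # '+' separator line
        qual = handle.readline().rstrip()
        # Assumption: Phred+33 quality encoding
        scores = [ord(ch) - 33 for ch in qual]
        yield header[1:], seq, scores

# Invented single-read example
example = io.StringIO("@read1\nACGT\n+\nII5!\n")
for read_id, seq, scores in read_fastq(example):
    print(read_id, seq, scores)    # read1 ACGT [40, 40, 20, 0]
```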

Variants passed quality control if they had a quality score >20 and a total number of reads >4. Known variants were allocated their respective Reference SNP (rs) number and their frequency was screened against the BRC internal dataset and also against the 1000 Genomes Project (http://www.1000genomes.org), dbSNP135 and the NHLBI GO Exome Sequencing Project (ESP) (http://evs.gs.washington.edu/EVS/). All aligned read data were subject to the following steps: (i) “duplicate removal” was performed (i.e., the removal of reads with duplicate start positions; Picard MarkDuplicates; v1.70), (ii) indel realignment was performed (GATK IndelRealigner; v1.6-11-g3b2fab9), resulting in improved base placement and fewer false variant calls, and (iii) base qualities were recalibrated (GATK TableRecalibration; v1.6-11-g3b2fab9).
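The two quality thresholds stated above amount to a simple predicate. A minimal sketch follows, in which the variant labels and values are invented and the function name is illustrative:

```python
MIN_QUALITY = 20   # quality score must exceed this value
MIN_READS = 4      # total number of reads must exceed this value

def passes_qc(quality, n_reads):
    """True when a variant call exceeds both thresholds."""
    return quality > MIN_QUALITY and n_reads > MIN_READS

# Invented calls: (label, quality score, total reads)
calls = [("var1", 35.0, 12), ("var2", 20.0, 30), ("var3", 60.0, 4)]
print([name for name, q, n in calls if passes_qc(q, n)])   # ['var1']
```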

3.2.8.10.VARIANT FILTERING

Variants were initially examined in more detail if they were homozygous, novel, with MAF <2% when compared to the general population, non-synonymous, frameshift insertion/deletion, exonic or splice site. Variants were prioritised based on protein function and also tissue expression, with priority given to any liver-specific expression. The WES data were also compared with the LOH regions generated previously in the respective patients. The search for causal variants was also narrowed down to common LOH regions shared between 3 or more patients.
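One possible way to express these filtering criteria in code is sketched below. How the criteria combine (homozygous, and novel or rare, and protein-altering or splice-site) is an interpretation of the text rather than a documented rule, and the record fields, effect labels and gene names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Variant:
    gene: str                 # hypothetical gene label
    zygosity: str             # "hom" or "het"
    effect: str               # e.g. "nonsynonymous", "frameshift", "splicing", "synonymous"
    maf: Optional[float]      # None when the variant is novel (absent from databases)
    in_shared_loh: bool       # lies in an LOH region shared by 3 or more patients

KEPT_EFFECTS = {"nonsynonymous", "frameshift", "splicing"}
MAX_MAF = 0.02

def is_candidate(v: Variant) -> bool:
    rare_or_novel = v.maf is None or v.maf < MAX_MAF
    return v.zygosity == "hom" and rare_or_novel and v.effect in KEPT_EFFECTS

def prioritise(variants):
    """Keep candidates and list those inside shared LOH regions first."""
    kept = [v for v in variants if is_candidate(v)]
    return sorted(kept, key=lambda v: not v.in_shared_loh)

variants = [
    Variant("GENE_A", "hom", "nonsynonymous", None, True),
    Variant("GENE_B", "het", "frameshift", 0.001, True),
    Variant("GENE_C", "hom", "synonymous", 0.0005, False),
    Variant("GENE_D", "hom", "splicing", 0.01, False),
    Variant("GENE_E", "hom", "nonsynonymous", 0.15, True),
]
print([v.gene for v in prioritise(variants)])   # ['GENE_A', 'GENE_D']
```

Prioritisation by protein function and tissue expression is not modelled here, since it relied on manual review.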

3.2.9.Whole Exome Sequencing - Group 2

Whole Exome Sequencing in the 2nd cohort of patients was performed at the University of Washington Centre for Mendelian Genomics (UW CMG). DNA was extracted in our laboratory using the technique described above (section 3.2.1) and was then transferred to Seattle for WES.

3.2.9.1.SAMPLE RECEIPT, QUALITY CONTROL AND TRACKING.

Initial QC entailed DNA quantification, a gender validation assay and molecular “fingerprinting” with either a 96-plex genotyping assay (using the Illumina BeadXpress Reader) derived from a custom exome SNP set or a high-density Illumina SNP chip. This “fingerprint” was used to identify potential sample handling errors prior to sample processing and provided a unique genetic ID for each sample, eliminating the possibility of sample assignment errors. Samples were failed if: (i) the total amount, concentration or integrity of DNA was too low (exome sequencing requires a minimum of 4 µg of genomic DNA); (ii) the fingerprint assay produced poor genotype data or (iii) sex-typing was inconsistent with the sample. None of the samples sent failed to meet the qualification criteria described above.

3.2.9.2.LIBRARY PRODUCTION AND EXOME CAPTURE

Library construction and exome capture were automated (Perkin-Elmer Janus II) in 96-well plate format. 1 μg of genomic DNA was subjected to a series of shotgun library construction steps, including fragmentation through acoustic sonication (Covaris), end-polishing, A-tailing, ligation of sequencing adaptors and PCR amplification with 8 bp barcodes for multiplexing. Libraries underwent exome capture using the Roche/Nimblegen SeqCap EZ v3.0 (~62 MB target). Briefly, 1 µg of shotgun library was hybridized to biotinylated capture probes for 72 hours. Enriched fragments were recovered via streptavidin beads and PCR amplified. Since each library was uniquely barcoded, samples could be captured in multiplex. To facilitate optimal flow-cell loading, the library concentration was determined by triplicate qPCR and molecular weight distributions were verified on the Agilent Bioanalyzer (consistently 125 ± 15 bp). Processing a sample from genomic DNA into an exome-sequencing library required 6 days (1.5 days for library construction, 3 days for exome capture and 1.5 days for post-capture processing).

3.2.9.3.CLUSTERING/SEQUENCING

Barcoded exome libraries were pooled using liquid handling robotics prior to clustering (Illumina cBot) and loading. Massively parallel sequencing-by-synthesis with fluorescently labelled, reversibly terminating nucleotides was carried out on the HiSeq sequencer. Current throughput was sufficient to complete 3 - 4 multiplexed exomes per lane at high coverage (60 to 80X mean coverage).

3.2.9.4.READ PROCESSING

The processing pipeline consisted of the following elements: (i) base calls generated in real-time on the HiSeq2000/2500 instrument (RTA 1.13.48.0) (ii) demultiplexed, unaligned BAM files produced by Picard ExtractIlluminaBarcodes and IlluminaBasecallsToSam and (iii) BAM files aligned to a human reference (hg19) using BWA (Burrows-Wheeler Aligner; v0.6.2) (Li and Durbin 2009). Read-pairs not mapping within ± 2 standard deviations of the average library size (~125 ± 15 bp for exomes) were removed. All aligned read data were subject to the following steps: (i) “duplicate removal” was performed, (i.e., the removal of reads with duplicate start positions; Picard MarkDuplicates; v1.70) (ii) indel realignment was performed (GATK IndelRealigner; v1.6-11-g3b2fab9) resulting in improved base placement and lower false variant calls and (iii) base qualities were recalibrated (GATK TableRecalibration; v1.6-11-g3b2fab9).
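The read-pair size filter described above can be illustrated with a minimal Python sketch. It is not the sequencing centre's code: the `ReadPair` class, the function names and the example read pairs are hypothetical, and the only values taken from the text are the mean library size (125 bp), its standard deviation (15 bp) and the ± 2 standard deviation window. Treating the window boundaries as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    """Hypothetical container for one aligned read pair."""
    name: str
    insert_size: int  # inferred fragment length in bp


def within_library_size(pair, mean_bp=125.0, sd_bp=15.0, n_sd=2.0):
    """True if the pair maps within mean +/- n_sd standard deviations.

    Boundaries are treated as inclusive (an assumption).
    """
    return abs(pair.insert_size - mean_bp) <= n_sd * sd_bp


def filter_read_pairs(pairs, mean_bp=125.0, sd_bp=15.0, n_sd=2.0):
    """Keep only the read pairs inside the accepted size window."""
    return [p for p in pairs if within_library_size(p, mean_bp, sd_bp, n_sd)]


# Made-up example pairs; the accepted window here is 95 to 155 bp.
pairs = [
    ReadPair("r1", 120),
    ReadPair("r2", 160),
    ReadPair("r3", 95),
    ReadPair("r4", 90),
]
kept = filter_read_pairs(pairs)
kept_names = [p.name for p in kept]
```

With the stated library parameters the window spans 95 to 155 bp, so pairs `r2` and `r4` in this made-up example are discarded.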

3.2.9.5.VARIANT DETECTION

Variant detection and genotyping were performed using the UnifiedGenotyper (UG) tool from GATK (v1.6-11-g3b2fab9). Variant data for each sample were formatted (variant call format [VCF]) as “raw” calls that contain individual genotype data for one or multiple samples and flagged using the filtration walker (GATK) to mark sites that were of lower quality/false positives [e.g., low quality scores (Q50), allelic imbalance (ABHet 0.75), long homopolymer runs (HRun > 3) and/or low quality by depth (QD < 5)].
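The flagging logic can be sketched in a few lines of Python. This is an illustration only and does not reproduce the GATK filtration walker: the function name and flag labels are hypothetical, and because the text gives no comparison operator for the quality score (Q50) or for allelic imbalance (ABHet 0.75), the directions used below (quality below 50, ABHet above 0.75) are assumptions. The HRun and QD thresholds follow the text.

```python
def flag_variant(qual, ab_het, hrun, qd):
    """Return a list of hypothetical filter flags for one variant site.

    Any annotation may be None when it is not available for the site.
    """
    flags = []
    # Assumption: sites with quality score below 50 are flagged.
    if qual is not None and qual < 50:
        flags.append("LowQual")
    # Assumption: allele balance above 0.75 counts as allelic imbalance.
    if ab_het is not None and ab_het > 0.75:
        flags.append("AlleleBalance")
    # From the text: long homopolymer runs (HRun > 3).
    if hrun is not None and hrun > 3:
        flags.append("HomopolymerRun")
    # From the text: low quality by depth (QD < 5).
    if qd is not None and qd < 5:
        flags.append("LowQD")
    return flags or ["PASS"]


# Made-up example sites.
good_site = flag_variant(qual=200.0, ab_het=0.5, hrun=1, qd=12.0)
poor_site = flag_variant(qual=30.0, ab_het=0.9, hrun=5, qd=2.0)
```

Flagged sites are marked rather than deleted, which mirrors the description of “raw” calls being annotated for later filtering.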

3.2.9.6.DATA ANALYSIS QUALITY CONTROL (QC)

All sequence data underwent a QC protocol before they were released to the annotation group for further processing. For exomes, this included an assessment of: (1) total reads; exome completion typically requires a minimum of 50 million PE50 reads, (2) library complexity; the ratio of unique reads to total reads mapped to target. DNA libraries exhibiting low complexity were not cost-effective to finish, (3) capture efficiency; the ratio of reads mapped to human versus reads mapped to target, (4) coverage distribution; 90% at 8X required for completion, (5) capture uniformity, (6) raw error rates, (7) Transition/Transversion ratio (Ti/Tv); typically ~3 for known sites and ~2.5 for novel sites, (8) distribution of known and novel variants relative to dbSNP; typically < 7% using dbSNP build 129 in samples of European ancestry (Ng et al. 2009), (9) fingerprint concordance > 99%, (10) sample homozygosity and heterozygosity and (11) sample contamination validation. All QC metrics, for single-lane data, were reviewed by a sequence data analyst to identify data deviations from known or historical norms. Lanes/samples that failed QC were flagged in the system and could be re-queued for library prep (< 5% failure) or further sequencing (< 2% failure), depending upon the QC issue. Exome completion was defined as having > 90% of the exome target at > 8X coverage and > 80% of the exome target at > 20X coverage. Typically this requires mean coverage of the target at 50-60X.
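Two of the QC metrics above lend themselves to a short worked example. The Python sketch below is a hedged illustration, not the centre's QC software: the function names and the toy depth and variant lists are hypothetical. It applies the stated completion rule (> 90% of target bases at > 8X and > 80% at > 20X) to a list of per-base depths, and computes a Ti/Tv ratio from reference/alternate allele pairs for single nucleotide variants.

```python
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def fraction_covered(depths, min_depth):
    """Fraction of target bases with depth strictly above min_depth."""
    if not depths:
        raise ValueError("no target bases supplied")
    return sum(1 for d in depths if d > min_depth) / len(depths)


def exome_complete(depths):
    """Completion rule from the text: > 90% at > 8X and > 80% at > 20X."""
    return fraction_covered(depths, 8) > 0.90 and fraction_covered(depths, 20) > 0.80


def ti_tv_ratio(snvs):
    """Transition/transversion ratio for (ref, alt) single nucleotide pairs."""
    ti = sum(1 for pair in snvs if pair in TRANSITIONS)
    tv = len(snvs) - ti
    if tv == 0:
        raise ValueError("no transversions; ratio undefined")
    return ti / tv


# Made-up per-base depths for a 100-base target.
passing = [60] * 85 + [10] * 10 + [2] * 5   # 95% above 8X, 85% above 20X
failing = [60] * 75 + [10] * 20 + [2] * 5   # 95% above 8X, 75% above 20X

passing_ok = exome_complete(passing)
failing_ok = exome_complete(failing)
ratio = ti_tv_ratio([("A", "G"), ("C", "T"), ("G", "A"), ("A", "C")])
```

In the second toy sample the 8X criterion is met but the 20X criterion is not, so the sample would not count as complete under the stated definition.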

3.2.9.7.VARIANT ANNOTATION

An automated pipeline for annotation of variants derived from exome data was used, the SeattleSeq Annotation Server (http://gvs.gs.washington.edu/SeattleSeqAnnotation/). This publicly accessible server returns annotations including dbSNP rsID (or whether the coding variant is novel), gene names and accession numbers, predicted functional effect (e.g., splice-site, non-synonymous, missense, etc.), protein positions and amino-acid changes, PolyPhen predictions, conservation scores (e.g., PhastCons, GERP), ancestral allele, dbSNP allele frequencies and known clinical associations. The annotation process has also been automated into the analysis pipeline to produce a standardized, formatted output (VCF, as described above).

3.2.9.8.VARIANT EXPLORATION

Once variant data were formatted into VCF files they were loaded into the GEMINI software for exploration. GEMINI (GEnome MINIng) is a web based tool developed for identification of genetic variations in the human genome by (i) incorporating a set of annotation tools (dbSNP, ENCODE, UCSC, ClinVar, KEGG) in one database and (ii) providing the flexibility of multiple complex filtering and data exploration on both small family-based and large-scale projects (Paila et al. 2013). Data were then exported to an Excel file for further filtering and interpretation.


4. Clinical and laboratory profile

In chapter 4 the clinical and laboratory findings of the patients included in the project are described. Patients are categorised into Kabuki syndrome, neonatal sclerosing cholangitis and biliary atresia subgroups.

4.1.Kabuki syndrome and cholestatic liver disease

4.1.1.Introduction

Kabuki syndrome (OMIM#147920) (KS) is a congenital syndrome with typical facial characteristics, multi systemic involvement including cholestatic liver disease and mental retardation. The incidence is 1/32,000 with an equal sex ratio. It was first described in 1981 in 10 patients from Japan (Kuroki et al. 1981, Niikawa et al. 1981). The name of the syndrome originates from the facial similarities of affected individuals with the Japanese Kabuki theatre masks (Figure 11). In early 2010, when we started investigating potential genetic associations in neonatal cholangiopathies, we identified in our patient cohort a small number of children with KS and cholestatic liver disease. Only isolated cases of KS with liver disease have been so far reported, as discussed in more detail below. Our KS patients were included and analysed within our original cohort of cholestatic infants just prior to any work being published confirming MLL2 as the genetic cause in about 70% of KS cases.

4.1.2.Clinical features of KS

KS has multi systemic involvement including facial and dental characteristics, renal, cardiac, gastrointestinal, liver, skeletal and central nervous system abnormalities, developmental delay, autoimmune disease manifestations, recurrent infections and endocrine and growth problems.

4.1.2.1.FACIAL AND DENTAL CHARACTERISTICS

Ocular, auricular and ectodermal abnormalities have been described including lower palpebral eversion, coloboma of the iris, epicanthus and long palpebral fissure, ptosis and strabismus, arched eyebrows, and recurrent eye infections (most frequently associated with lacrimal duct anomalies) (Niikawa et al. 1988, Philip et al. 1992, Schrander-Stumpel et al. 1994, Galan-Gomez et al. 1995, Ilyina et al. 1995). Prominent or malformed ears have been reported along with depressed nasal tip, and micrognathia with associated anaesthetic risks (Casado et al. 2004, Sivaci et al. 2005, Johnson and Mayhew 2007). Ectodermal abnormalities include low posterior hairline, receding anterior hairline and sparse frontal scalp hair. On microscopy, twisting of hair shafts, irregularity of hair diameter, and trichorrhexis nodosa have been reported (Abdel-Salam et al. 2011). A high arched or cleft palate, abnormal dentition and widely spaced teeth have also been reported (Mhanni et al. 1999, Matsune et al. 2001, Atar et al. 2006, dos Santos et al. 2006).

4.1.2.2.RENAL ABNORMALITIES

Multiple renal and urinary tract anomalies have been described in KS patients. These include renal dysplasia, horseshoe or ectopic kidney, kidney duplication, ureteropelvic junction obstruction, vesicoureteric reflux, hydronephrosis and duplication of the collecting system (Niikawa et al. 1988, Philip et al. 1992, Schrander-Stumpel et al. 1994, Wilson 1998, Kawame et al. 1999, Digilio et al. 2001, McGaughran et al. 2001, Armstrong et al. 2005, Li et al. 2011). Courcet et al reported recently a French cohort of 94 KS patients where renal malformations were present in 22% of cases, and urinary tract abnormalities in 15% (Courcet et al. 2013). In their report, as in previous studies, there were no significant differences in renal or urinary tract anomalies based on mutation type or position. Since the identification of MLL2 as the disease-causing gene, renal anomalies were variably reported as being present in 47% (Hannibal et al. 2011), 41% (Banka et al. 2012) and 41% (Li et al. 2011) of MLL2 (+) patients. Overall, two reported cases required renal transplantation (Ewart-Toland et al. 1998, Hamdi Kamel et al. 2006) and a single case died from end stage renal failure aged 5 years (Armstrong et al. 2005).

4.1.2.3.CARDIAC ABNORMALITIES

A variety of congenital heart defects has been reported in KS patients such as coarctation of the aorta (Hughes and Davies 1994), hypoplastic left heart syndrome (Digilio et al. 2010), double aortic arch (Moral et al. 2009) and a further wider spectrum of cardiac anomalies (Ohdo et al. 1985, Okada et al. 2001, McMahon and Reardon 2006, Moral et al. 2009) including conduction abnormalities (Shah et al. 2005). In an extensive meta-analysis of 74 KS cases associated with cardiac defects, Yuan et al identified that the most common cardiac pathology is represented by left-sided obstructions/aortic dilatation (46.1%) and atrial/ventricular septal defects (32.9%). Overall 90.6% of the patients were diagnosed prenatally or at an early postnatal stage (Yuan 2013).


4.1.2.4.GASTROINTESTINAL AND LIVER ABNORMALITIES

Gastrointestinal disease in KS patients includes malrotation of the colon, anal atresia and rectovaginal fistulae (Philip et al. 1992). Niikawa et al were the first to report liver involvement in the form of prolonged hyperbilirubinaemia in 12 out of 58 neonates diagnosed with KS (Niikawa et al. 1988). The aetiology was not described. Since then liver disease in KS has been mainly described in small series or single case reports. In 1998 Ewart-Toland et al reported a single case of an infant with KS diagnosed with a choledochal cyst, who underwent a Kasai portoenterostomy at the age of 3 months. Follow up percutaneous cholangiography confirmed on-going cholangiopathy and the patient underwent LT at 8 months (Ewart-Toland et al. 1998). Two years later McGaughran et al described two female infants diagnosed with BA; both underwent a Kasai portoenterostomy, and one died subsequently after LT (McGaughran et al. 2000). In the same year a male patient with high GGT cholestasis who was diagnosed with BA and underwent a Kasai operation at the age of 3 months was published (van Haelst et al. 2000). Selicorni et al reported a child with KS and BA in 2001, who had a Kasai operation at the age of 44 days and required an LT at the age of 20 years due to development of portal hypertension, recurrent cholangitis and biliary cirrhosis (Selicorni et al. 2001). Nobili et al described a male patient diagnosed with congenital hepatic fibrosis confirmed on liver biopsy with no further information on follow up (Nobili et al. 2004). In 2005 Schrander-Stumpel et al reported a series of 20 children with KS, eight of whom had hyperbilirubinaemia and, on review of the literature, described an overall jaundice rate of 27% (21/78 of published cases) (Schrander-Stumpel et al. 2005). The latest case report described a male infant with raised serum transaminase levels, pale stools, diarrhoea and transient high GGT cholestasis. The patient improved on ursodeoxycholic acid supplementation (Isidor et al. 2007). Cholestasis has also been described in Hardikar syndrome which has phenotypic features overlapping with Kabuki comprising sclerosing cholangitis, prolonged hyperbilirubinaemia, cleft lip and palate, and pigmentary retinopathy (Cools and Jaeken 1997, McGaughran et al. 2000, van Haelst et al. 2000, Nobili et al. 2004). To our knowledge none of the published case reports of KS with associated liver involvement had confirmation of pathogenic mutations in MLL2 or KDM6A, the two genes currently associated with KS (Ng et al. 2010, Hannibal et al. 2011, Li et al. 2011, Micale et al. 2011, Paulussen et al. 2011).

4.1.2.5.SKELETAL ABNORMALITIES

Multiple skeletal anomalies are present in KS patients such as joint laxity, hypoplastic claviculae (Fryns and Devriendt 1998), hip dislocation (Ramachandran et al. 2007), patellar dislocation (Kurosawa et al. 2002), congenital talipes equinovarus (Phillips et al. 2005), scoliosis and spinal instability. Hand abnormalities include dermatoglyphics such as increase of ulnar loops, absence of digital triradius, increase of hypothenar loops, presence of fingertip pads, clinodactyly, short 5th middle phalanx, short metacarpal and coarse carpal bone (Niikawa et al. 1988).

4.1.2.6.CENTRAL NERVOUS SYSTEM ABNORMALITIES

Seizures have been reported in 15-25% of KS patients (Ogawa et al. 2003, Powell et al. 2003, Oksanen et al. 2004, Ito et al. 2007) often due to structural brain anomalies. Central nervous system abnormalities described include Arnold Chiari I malformation, syringohydromyelia, cerebellar vermis cleft (Dandy-Walker variant), absence of corpus callosum and cortical atrophy (Chu et al. 1997, Ben-Omran and Teebi 2005, Kara et al. 2006), hydrocephalus with ventricular enlargement (Gillis et al. 1990), hydrocephalus caused by aqueductal stenosis (Kasuya et al. 1998), cortical and central hypoplasia, atrophy of the hippocampus and periventricular changes (Oksanen et al. 2004), periventricular nodular heterotopia (Mihci et al. 2002), polymicrogyria (Di Gennaro et al. 1999, Powell et al. 2003), subarachnoid cysts (Chu et al. 1997), and microcephaly (Li et al. 2011). Neonatal hypotonia and isolated cranial nerve palsies have also been described (Matsumoto and Niikawa 2003).

4.1.2.7.DEVELOPMENTAL DELAY

Children with KS have reduced cognitive abilities and motor skills, variable severity of learning difficulties, mental retardation, and behaviour in the range of autistic spectrum disorder. Ho et al reported two boys who even demonstrated a deterioration in their IQ during early adolescent years (Ho and Eaves 1997). The developmental outcome in children with KS is variable, though the majority have a mild spectrum adaptive and intellectual disability (Ho and Eaves 1997, Ciprero et al. 2005, Mervis et al. 2005, Vaux et al. 2005, Akin Sari et al. 2008).

4.1.2.8.IMMUNE DYSREGULATION

Autoimmune disorders, usually manifesting during adolescence, have also been reported in KS patients. The most common conditions are idiopathic thrombocytopenic purpura (ITP), haemolytic anaemia (Kawame et al. 1999), thyroiditis, and vitiligo (Schrander-Stumpel et al. 1993). Some of these patients also had evidence of hypogammaglobulinaemia affecting immunoglobulin A (IgA) and/or immunoglobulin G (IgG) (Hostoffer et al. 1996). Both isolated IgA deficiency and combined IgA, IgG, and/or IgM deficiency have been described in KS patients (Zannolli et al. 2007) even leading to common variable immunodeficiency (CVID) syndrome (Ming et al. 2005).

4.1.2.9.RECURRENT INFECTIONS

As a consequence of the variable degree of immunodeficiency and the multi systemic abnormalities affecting KS patients, an increased incidence of recurrent infections has been reported. Otitis media (63%), sinusitis, urinary tract infection and pneumonia are the commonest infections described, probably due to hypogammaglobulinaemia (Niikawa et al. 1988, Schrander-Stumpel et al. 1994, Kawame et al. 1999, Mhanni et al. 1999). Hoffman et al reported a series of 19 KS patients, 16 of whom had some degree of hypogammaglobulinaemia (84%) resembling a CVID syndrome. Fifteen (79%) of these patients had decreased IgA levels and eight patients had decreased total IgG (42%) (Hoffman et al. 2005).

4.1.2.10.ENDOCRINE AND GROWTH PROBLEMS

Patients with KS have been reported to have a variety of endocrine problems such as premature thelarche (Bereket et al. 2001), precocious puberty, neonatal hypoglycaemia, growth hormone deficiency (Devriendt et al. 1995, Gabrielli et al. 2000, Gabrielli et al. 2002), follicular ovarian cyst (Devriendt et al. 1996), type 1 diabetes mellitus (Fujishiro et al. 2003) and, rarely, late onset obesity.

4.1.3.Genetics in KS

4.1.3.1.MIXED LINEAGE LEUKEMIA 2 GENE (MLL2)

The first association of KS with a genetic mutation was made in 2010 when, following WES of 10 unrelated patients, nonsense or frameshift mutations in the MLL2 gene, on chromosome 12q12-14, were identified in seven (Ng et al. 2010). MLL2 is the paralog of MLL1 (associated with chromosomal rearrangements in acute myeloid, myeloblastic and lymphatic leukaemia), a product of a gene duplication process. The name MLL (mixed lineage leukaemia) originates from its association with the pathogenesis of leukaemia (Ziemin-van der Poel et al. 1991, Djabali et al. 1992). MLL2 consists of 54 coding exons and encodes the histone H3 lysine 4 specific methyltransferase. Subsequent Sanger sequencing detected mutations in MLL2 in 2 of the 3 remaining individuals out of the initial 10 patients with KS and in 26 of 43 additional cases (Ng et al. 2010). In twelve KS patients, where parental DNA was available and tested, mutations in MLL2 were detected in only two families. Overall, 33 MLL2 mutations were identified in 35 of 53 families (66%) with KS (Ng et al. 2010).

To date, 247 mutations in MLL2 have been described in KS patients (Table 18). All mutations described are heterozygous with the majority being truncating. They thus appear to behave in a dominant fashion. Overall, the most common mutation types identified were nonsense (164) and frameshift (88). Of interest is that the majority of mutations described are de novo with only four families identified to have proven mutations in more than one family member. This could reflect the lack of ancestral DNA availability to facilitate further testing within families or a possible reduced fertility status (Hannibal et al. 2011, Li et al. 2011, Micale et al. 2011, Paulussen et al. 2011, Banka et al. 2012, Kokitsu-Nakata et al. 2012, Bogershausen and Wollnik 2013, Makrythanasis et al. 2013, Miyake et al. 2013, Dentici et al. 2014).

Author                     Total No of patients/     Nonsense  Frameshift  Missense  Splice site/
                           No of mutations/de novo                                   Indel

Paulussen et al 2011       45/34/27                  11        17          2         4
Li et al 2011              34/18/11                  3         6           7         2
Micale et al 2011          65/42/38                  17        13          7         3/3
Hannibal et al 2011        110/81/25                 37        22          16        3/3
Banka et al 2012           116/74/7                  25        30          9         10
Kokitsu-Nakata et al 2012  2/1/none                  -         -           1         -
Tanaka et al 2012          2/2/none                  1         -           1         -
Makrythanasis et al 2013   86/45/34                  31        -           10        4
Miyake et al 2013          81/50/35                  35
Dentici et al 2014         18/10/none                4         4           2         -

Table 18: Number of mutations in MLL2 reported in the literature. Mutations are categorised as de novo, nonsense, frameshift, missense, splice site and indels.

MLL2 is a large, 5,262-amino acid protein, containing a “Drosophila Su3-9 Enhancer of zeste” (SET) domain, five PHD fingers, an HMG-I binding motif, a zinc finger, and FY-rich motifs. The MLL protein family is part of the SET domain-containing group of HMTs in humans, which is evolutionarily homologous to the Drosophila trithorax protein (TRX). SET domains are highly conserved during evolution and are the essential catalytic protein subunits that execute the histone lysine methylation, which is the core function of HMTs. MLL proteins require complex formation with other proteins, which allows recruitment to target genes and facilitates the catalytic interaction of the protein’s SET domain. SET-1 like complexes, like MLL2, have been shown to have an H3K4 methyltransferase activity on H3 peptides and to participate in a nuclear receptor-dependent transcriptional activation mechanism. Histone methylation in general thus constitutes an evolutionarily conserved control mechanism for (epigenetic) gene regulation. MLL proteins in general are involved in embryogenesis and development, most notably through regulation of homeobox (HOX) gene expression, and their interaction with nuclear receptors suggests a role in steroid-hormone and oestrogen controlled signalling (Mo et al. 2006). MLL2 is also involved in regulation of cell adhesion, growth impairment and cell motility (Issaeva et al. 2007).

4.1.3.2.LYSINE-SPECIFIC DEMETHYLASE 6A GENE (KDM6A)

As genetic mutations in MLL2 were identified in only about 70% of KS patients, the suspicion arose of other genes associated with KS. The association of KS and a gene located in the X chromosome was hypothesized following a number of reports of KS patients with small ring X (r [X]) chromosomes (Niikawa et al. 1988, Hughes and Davies 1994, McGinniss et al. 1997, Digilio et al. 2001, Stankiewicz et al. 2001, Rodriguez et al. 2008). In early 2012, Lederer et al reported three phenotypically consistent MLL2-ve KS cases in which de novo mutations in the lysine-specific demethylase 6A (KDM6A, OMIM#300867) gene, located at Xp11.3, were identified by comparative genomic hybridization (CGH) analysis (Lederer et al. 2012). This was associated with X chromosome inactivation skewing calculated at 89:11 and 97:3 for patient 1 and 2, respectively. Patients 1, 2 and 3 had large deletions including exons 21-29, whole gene and exons 5-9, respectively.

A year later, Miyake et al described, in two separate reports, five KS cases with four truncating mutations and one in-frame deletion (Miyake et al. 2013, Miyake et al. 2013a) (Table 19). When compared, patients with KDM6A mutations had more frequently short stature and postnatal growth retardation than those with MLL2 mutations. So far pathogenic mutations in KDM6A have been reported in about 10% of KS patients.

KDM6A encodes a specific demethylase of histone H3 lysine 27 (H3K27), a trithorax group protein, which binds to the histone H3K4-specific methyltransferase (Lee et al. 2007). This demethylation is associated with the tissue expression of several genes, cell cycle and embryogenesis. The T-box family genes, encoding for transcription factors, utilise KDM6A to activate their target genes. One of these genes, Tbx3, specifically expressed in liver progenitor cells, has been isolated from developing mouse liver. It has been reported that Tbx3-deficient cells showed severe defects in proliferation, hepatobiliary lineage segregation and cholangiocyte differentiation, with subsequent abnormal liver development (Suzuki et al. 2008). The association between KDM6A and liver cell differentiation via the T-box gene activation could be suggested as a potential mechanism involved in the liver pathology described in KS patients. However, non-specific liver involvement has been so far reported only in two KDM6A-mutated KS patients (Miyake et al. 2013a).


Patient ID  Mutation             Predicted AA change  De novo  Reported by

1           c.3717G>A            p.Trp1239*           Unknown  Miyake et al (2013)
2           c.1555C>T            p.Arg519*            Unknown  Miyake et al (2013)
3           c.3354_3356delTCT    p.Leu1119del         Yes      Miyake et al (2013)
4           c.1909_1912delTCTA   p.Ser637Thrfs*53     Yes      Miyake et al (2013)
5           c.4051C>T            p.Arg1351*           Unknown  Miyake et al (2013)

Table 19: KDM6A mutations in KS patients as reported by Miyake et al in two separate publications identified either by Sanger sequencing or targeted resequencing (patients 4,5). Two mutations were de novo and in three cases no parental blood was available for genetic screening.

4.1.4.KS patients

In our patient database we identified four individuals with KS and associated liver disease. Clinical, laboratory and demographic data were collected and diagnostic/confirmatory gene sequencing was performed by the reference lab (Regional Molecular Genetics Service), at St. Mary’s Hospital, Manchester.

4.1.4.1.CLINICAL, BIOCHEMICAL AND HISTOPATHOLOGICAL FEATURES

Patients 11, TG27 & TG28 were white British and patient 2 was of consanguineous Indian origin. There was no family history of KS or chronic liver disease in any (Table 22). Two patients were born by emergency lower segment caesarean section (LSCS) due to fetal tachycardia and the other two by spontaneous vaginal delivery (SVD). Patients 2 and 11 were delivered prematurely at a gestational age of 35 and 36 weeks, respectively. KS was suspected in view of typical facial characteristics in all cases (Figure 11). The diagnosis was initially based on clinical features and was subsequently confirmed by mutations in the MLL2 gene.

Figure 11: A; Kabuki make up: typical facial characteristics including lower palpebral eversion, short nasal septum, arched eyebrows, depressed nasal tip, epicanthus and low posterior hair line. Facial photograph of patient TG28 (B) with arched eyebrows, eversion of the lateral part of lower eyelids and long palpebral fissure. Patient 2 (C) facial appearance with typical facial characteristics including also cleft lip and palate, microcephaly and visual impairment. Written informed consent was obtained for publication.

Patient number  Gestational  Delivery mode                 Birth weight  Siblings
(gender)        age (weeks)                                (kilograms)

2 (F)           35           Emergency LSCS (foetal        2.48          One older and two younger
                             bradycardia, maternal PE)                   siblings, all well
11 (F)          36           SVD                           3.2           nil
TG27 (F)        39           Emergency LSCS (foetal        3.35          nil
                             bradycardia)
TG28 (M)        39           SVD                           3.9           Five healthy half siblings.
                                                                         One sibling died of SIDS.

Table 20: Demographic data of KS patients. All but patient 2 had a birth weight within normal range. No family history of liver disease or KS was recorded. SVD, spontaneous vaginal delivery; LSCS, lower segment caesarean section; PE, pulmonary embolism; SIDS, sudden infant death syndrome.

Patient 2 has typical facial characteristics and microcephaly (Figure 11). She requires regular follow up for hearing problems, uses a hearing aid and has had grommets inserted in her right ear. She presented with conjugated hyperbilirubinaemia aged 3 weeks. Her liver biopsy specimen confirmed cholestasis and portal tract inflammation (Figure 12), endoscopic retrograde cholangiopancreatography (ERCP) showed evidence of intrahepatic cholangiopathy, and she was diagnosed as having neonatal sclerosing cholangitis. She cleared her jaundice with normalisation of gamma-glutamyltransferase (GGT) activity within six weeks of presentation, with only mild residual bile duct dilatation to date. She has cleft lip and palate and she may require further corrective surgery. Clinical diagnosis of KS was confirmed at the age of 2 years. She has problems with feeding and speech and requires regular follow up by the speech and language team. Growth is impaired with short stature and pubertal delay. She has moderate learning difficulties and she is attending a mainstream school where she is receiving extra support.

Patient 11 initially presented with recurrent infections and was subsequently diagnosed with KS complicated by common variable immunodeficiency requiring regular immunoglobulin replacement therapy, insulin dependent diabetes mellitus, osteoporosis, granulomatous disease in spleen and lungs, recurrent diarrhoea, ovarian failure and hypothyroidism. Splenomegaly and raised transaminases (AST 150 IU/L) were identified at the age of 15 years, which warranted a liver biopsy (Table 21). The biopsy had features of inflammation, cholestasis with copper associated protein deposition, and parenchymal irregularity suggestive of a chronic cholangiopathy and nodular regenerative hyperplasia. Patency of the porto-mesenteric, splenoportal and hepatic venous systems was confirmed on magnetic resonance imaging (MRI). At the age of 16 years she was started on oestrogen supplementation due to ovarian failure. She rapidly became cholestatic, with severe portal hypertension complicated by spontaneous bacterial peritonitis (SBP), and suffered severe gastrointestinal bleeds, which were treated by endoscopic ligation. In 2009, due to the severity of her liver disease she was assessed for LT, but subsequently improved significantly and the decision to transplant was deferred. She currently has regular follow up at King’s College Hospital in an endoscopic surveillance programme.

Patient TG27 presented with neonatal cholestasis aged 3 weeks and was diagnosed with BA and underwent a Kasai portoenterostomy. She has currently well compensated liver disease with yearly follow up at our centre. She was diagnosed with KS at the age of 9 years, due to facial features. She has a high arched palate, gastroesophageal reflux, visual problems requiring spectacles, bilateral knee joint laxity requiring orthopaedic plates for recurrent dislocation of the patella, and eczema. She has microcephaly, upward turning eyelids, sparse eyebrows and an unusual 4th finger joint. She has developmental delay & dyspraxia and she is requiring extra help at school.

Patient TG28 presented with conjugated hyperbilirubinaemia at 7 weeks of age. Large bile duct obstruction was suspected on liver biopsy showing features of cholestasis (Figure 12) with portal tract expansion and bile duct proliferation confirmed also at Kasai portoenterostomy (Table 21). He subsequently developed end stage liver disease with coagulopathy, hypoalbuminaemia, ascites and jaundice and underwent liver transplantation aged 11 years. His graft function is within normal limits. He was subsequently found to have mild aortic regurgitation and left ventricular hypertrophy, pulmonary arteriovenous malformations (AVM), bilateral renal hypertrophy and systemic hypertension. His facial characteristics suggestive of KS include arched eyebrows, eversion of the lateral part of lower eyelids and long palpebral fissure. He has learning difficulties and has suffered from recurrent ear infections.


Figure 12: Liver histology material from patients TG28 (a, b) and 2(c, d). Evident canalicular cholestasis with bile plugs (black arrow) (c, d) with biliary features and portal tract inflammation, portal and septal bridging fibrosis (green arrow) (a, b). H&E. (a, c, d) 20x, (b) 10x.


Patient  Age at        Portal tract  Inflammation  Bile duct      Cholestasis  Bilirubin  AST     GGT
number   presentation  expansion                   proliferation               (μmol/L)   (IU/L)  (IU/L)

2        3 weeks       -             +             +              +            68         110     178
11       15 years      -             +             -              +            17         150     148
TG27     3 weeks       +             +             +              +            132        134     614
TG28     7 weeks       -             +             +              +            172        193     469

Table 21: Liver histology and biochemical data at the time of admission to King’s College Hospital. All but patient 11 presented within the first 2 months of life. Liver biopsy showed cholestasis and inflammation in all and only patient 11 did not have features of portal tract expansion or bile duct proliferation. GGT values were high in all, with serum hyperbilirubinaemia recorded in all but patient 11. AST, aspartate aminotransferase; GGT, gamma glutamyltransferase.

4.2.Non Kabuki syndrome patients

The patients included in this group were selected on the basis of the following criteria: positive family history, clinical presentation during infancy, liver histology, radiological evidence of cholangiopathy (where available) and high GGT cholestasis at time of initial presentation (Table 22). Only one was asymptomatic, but diagnosed on screening within the family of a symptomatic case. Patients 1-24 underwent SNP analysis in the first step of this project. WES was undertaken in two subgroups initially for patients 5, 8-10 and 13 and at a second stage for NSC patients 4, 12, 15, 17, 20, 21, 22, TG02, TG11, TG14, TG18 and TG21-24. Consanguineous BA patients TG01, TG05-07, TG09, TG17 and TG25-26 were also included, who will be described as a separate cohort. The patients were investigated extensively at the time of presentation to exclude any other underlying liver pathology, either in our institution or in their presenting centres prior to referral to the Paediatric Liver Centre at King’s College Hospital. Through a retrospective analysis of our patient database, 36 patients (21 male) were identified who fulfilled the inclusion criteria and for whom there was DNA available. Clinical, laboratory, radiological and histological data are described in Table 22.


4.2.1.Neonatal sclerosing cholangitis patients

4.2.1.1.DEMOGRAPHICS

Patients whose parents were known to be consanguineous were chosen and, within their families, those individuals with the most similar, and most severe, phenotype were prioritised. Consanguinity was identified in 17 patients, with two families having more than one affected sibling (3, 18 & 10, 16, TG21, TG22). Eleven patients were of Asian and seven of Arabic origin (Table 22). One patient was of mixed race and 10 patients were Caucasian (5 of Greek descent). One patient (7) with a family history of BA (paternal aunt) was also included.

All patients apart from two were born at term, with a median birth weight of 2.988 kg [range, 2.4-3.6]. Eight had a family history of liver disease, with recurrence of NSC in six; one had a mother who was a hepatitis B virus carrier and one had a sibling who had died of acute liver failure of unknown aetiology.

4.2.1.2.HISTOLOGICAL FEATURES

An archived liver-biopsy specimen was available for review from 24 patients. For the remaining 4 patients a detailed liver histology report was obtained from the referring hospital. Tissue sections were cut at 4 µm and stained with haematoxylin-eosin (H&E) and with the periodic acid-Schiff technique after diastase digestion. In addition, parallel sections were immunostained with a monoclonal antibody against MDR3 (Alexis Biochemicals ALX-801-028, Nottingham, UK) and with M2-III-6, a monoclonal antibody raised against human MRP2 (Alexis Biochemicals, Nottingham, UK). Histological examination showed biliary features, porto-septal bridging fibrosis, ductular proliferation, copper-associated protein deposition and cholestasis consistent with NSC (Figure 13A) in all. Preservation of MDR3 immunostaining was recorded in available liver tissue from 18 patients (Figure 13B).



Figure 13: Liver histology material from patient 9. A. Features of portal tract inflammation and bile plugs (black arrow) (haematoxylin-eosin; original magnification x20). B. Same biopsy as in panel A immunostained for multidrug resistance protein 3 (MDR3) with haematoxylin counterstain. MDR3 expression is preserved along canaliculi (original magnification x20).

Figure 14: Endoscopic retrograde cholangiopancreatography (ERCP) image from patient 5 demonstrating intra and extrahepatic bile duct irregularity, dilatation and marked cholangiopathy


4.2.1.3.CLINICAL FEATURES AND INVESTIGATIONS AT PRESENTATION

Median age at presentation in 23 patients was 5 weeks of life [range, 1-108], while one patient (16) was asymptomatic and identified through family screening at an older age (Table 22). Presenting features were jaundice (26), failure to thrive (4), pale stools (9), pruritus (4), splenomegaly (9), coagulopathy (3), gastrointestinal bleeding (3) and hair loss (3). Median values at presentation were: serum bilirubin 111 μmol/l [range, 5-250], GGT 435 IU/L [range, 148-1120], aspartate aminotransferase (AST) 179 IU/L [range, 35-482], alkaline phosphatase (ALP) 694 IU/L [range, 215-1200] and albumin 36 g/L [range, 25-48]. Endoscopic retrograde cholangiopancreatography (ERCP), performed in 13 patients, and magnetic resonance cholangiopancreatography (MRCP), performed in 2, showed cholangiopathic changes in the extrahepatic and/or intrahepatic biliary tree (Figure 14).

4.2.1.4.DISEASE COURSE

Fourteen (48%) patients required LT for cholestasis and severe pruritus at a mean age of 46 months [range, 13-169]; patient 18 was re-transplanted in another centre 1 year later due to chronic rejection (Table 22). For four patients who are not being followed up in our centre, no recent clinical or laboratory data were available. For the remaining patients, the median time of follow up was 11 years [range, 2-34]. At the time of their last clinic follow up, median laboratory values were: serum bilirubin 19 μmol/l [range, 2-50], GGT 125 IU/L [range, 5-462], AST 93 IU/L [range, 21-302], ALP 364 IU/L [range, 132-543] and albumin 40 g/L [range, 29-47]. No recurrence of the original disease has been observed so far in the transplanted group.

Four patients died, two while listed for LT and two post LT. Among the transplanted patients, patient 17 died suddenly at the age of 17 years in her native country with no acute deterioration of her liver disease; patient 6 was also suffering from end stage renal disease of unknown aetiology and died aged 7 years. Patient 5 was treated for Epstein Barr Virus (EBV) related Post Transplant Lymphoproliferative Disease (PTLD) with 4 courses of anti-CD20 monoclonal antibody 6 years after his LT. He had a good response with no evidence of recurrence of PTLD.

Two patients (6 & TG14) had cholangiopathy and associated renal disease of unknown aetiology and both eventually died. Patient TG14 had severe portal hypertension and required splenic embolization twice for severe thrombocytopenia, followed by splenectomy at 13 years of age. One year earlier she had also suffered a subarachnoid haemorrhage due to a posterior cerebral artery aneurysm, which was clipped. Following surgery she progressed to end stage liver disease and was listed for LT. Unfortunately she also developed end stage renal disease requiring haemodialysis and eventually died at the age of 16 years from a catastrophic oesophageal bleed.

Patient TG23 had a cholecystojejunostomy at the age of 18 years for persistent cholestasis. Percutaneous Transhepatic Cholangiogram (PTC) at 21 years showed no obstruction but delayed biliary emptying. ERCP a year later showed papillary stenosis and a sphincterotomy was performed. In view of persistent cholestasis, she was listed for and underwent LT the same year.

The patients in the non-transplanted subgroup remain in stable condition, with median serum bilirubin 17 μmol/l [range, 5-37] and GGT 207 IU/L [range, 19-462], and with preservation of liver synthetic function: median INR 1.0 [range, 0.9-1.04] and serum albumin 42 g/L [range, 36-47]. Patient 8 has ongoing but not debilitating pruritus and FTT.

4.2.2.Biliary atresia patients

Eight patients from consanguineous families, with histological features suggestive of BA and the diagnosis confirmed at laparotomy, were also included. Clinical, laboratory, radiological and histological data are described in Table 22. The decision to include these BA patients was taken for two reasons. Firstly, because of a potential phenotypic overlap between the NSC and BA subgroups, their inclusion increased our patient pool and improved the chances of identifying potential causal variants when filtering and stratifying results. Secondly, consanguinity is a rare phenomenon in BA and such patients could be valuable in the search for a cholangiopathy suspected to be inherited in an autosomal recessive mode.

Six patients were of Asian origin, one of Arabic and one of Turkish origin. Median age at presentation was 4 weeks [range, 2-8]. Median values at presentation were: serum bilirubin 118 μmol/l [range, 6-713], GGT 257 IU/L [range, 207-613], AST 188 IU/L [range, 45-839], ALP 693 IU/L [range, 134-2101] and albumin 38 g/L [range, 28-44]. All patients underwent a Kasai portoenterostomy. Three patients underwent LT and three died (2 on the LT waiting list and 1 eight years post LT). Median follow up was 9 years [range, 10 months-12 years].


Patient | Gender/Origin/Consanguinity | Age at presentation | Presenting symptoms | GGT (IU/L) | Diagnosis | ERCP/MRCP (intrahepatic cholangiopathy) | Liver histology | MDR3/MRP2 staining | LT/age at LT | Follow up

1 | M/Arabic/Yes | 4 weeks | Jaundice, pale stools | 857 | NSC | N/a | Copper-associated protein deposition, cholestasis, ductular proliferation | Present/Present | No | 4 years

2 | F/Asian/Yes | 3 weeks | Jaundice, pale stools | 178 | NSC/KS | N/a | Bile duct proliferation, cholestasis, mild inflammation | Present/Present | No | 11 years

3 | M/Arabic/Yes | 5 weeks | Jaundice | 185 | NSC | N/a | Findings consistent with NSC (reported locally) | N/a | No | 10 years

4 | M/Arabic/No | 1 week | Jaundice, pale stools, splenomegaly | 523 | NSC | Yes/No (+) | Bile duct proliferation, copper-associated protein deposition, cholestasis, bridging fibrosis | Present/Present | Yes/11 months | 8 years

5 | M/Caucasian/No | 1 week | PWS, jaundice, pale stools | 435 | NSC | Yes/No (+) | Bile duct proliferation, copper-associated protein deposition, cholestasis, mild inflammation | Present/Present | Yes/9 months | 12 years

6 | F/mixed race/No | 5 weeks | Jaundice | 653 | NSC | N/a | Bile duct proliferation, copper-associated protein deposition, cholestasis, mild inflammation | Present/Present | Yes/13 months | 7 years (RIP)

7 | F/Asian/Yes | 56 weeks | Jaundice; family history of biliary atresia | 984 | NSC | N/a | Portal tract expansion, inflammation, cholestasis | N/a | No | N/a

8 | F/Asian/Yes | 3 weeks | Jaundice, pale stools | 635 | NSC | Yes/No (+) | Bile duct proliferation, copper-associated protein deposition, cholestasis, moderate inflammation | Present/Present | No | 7 years

9 | M/Asian/Yes | 12 weeks | Jaundice | 285 | NSC | N/a | Bile duct proliferation, copper-associated protein deposition, cholestasis | Present/Present | Yes/8 years | 16 years

10 | M/Asian/Yes | 8 weeks | Jaundice, ascites, splenomegaly, Kabuki syndrome | 261 | NSC | Yes/No (+) | Bile duct proliferation, biliary cirrhosis, copper-associated protein deposition, cholestasis, mild inflammation | Present/Present | Yes/10 years | 18 years

11 | F/Caucasian/No | 108 weeks | Recurrent infections, jaundice, GI bleeding, splenomegaly | 148 | NSC/KS | N/a | Lobular cholestasis, minimal inflammation, copper-associated protein deposition | N/a | No | 19 years

12 | M/Caucasian (GR)/No | 6 weeks | Jaundice, splenomegaly | 962 | NSC | Yes/No | Porto-septal fibrosis, cholangiolytic changes, copper-associated protein deposition, cholestasis | Present/Present | No | 6 years

13 | M/Asian/Yes | 7 weeks | Jaundice | 1120 | NSC | N/a | Cholestasis, portal vein radical hypoplasia, copper-associated protein deposition | N/a | No | 4 years

14 | M/Arabic/Yes | 8 weeks | Jaundice | 324 | NSC | N/a | Biliary features, severe cholestasis, bridging fibrosis | Present/Present | Yes/15 years | 16 years

15 | F/Caucasian (GR)/No | 21 weeks | Jaundice, GI bleeding, ascites, splenomegaly | 447 | NSC | Yes/No (+) | Biliary cirrhosis, bile duct proliferation, copper-associated protein deposition | Present/Present | N/a | N/a

16 | F/Asian/Yes | Family screening | Asymptomatic (raised AST) | 342 | NSC | Yes/No (+) | Moderate bridging fibrosis, copper-associated protein deposition, mild portal tract inflammation | Present/Present | No | 11 years

17 | F/Caucasian (GR)/No | 1 week | Jaundice | 196 | NSC | Yes/No | Mild portal tract inflammation, bridging fibrosis, cholestasis | Present/Present | Yes/14 years | 17 years (RIP)

18 | F/Arabic/Yes | 2 weeks | Jaundice | 532 | NSC | N/a | Findings consistent with NSC (reported locally) | Present (local report) | Yes/3 & 4 years | 12 years

19 | F/Arabic/No | 2 weeks | Jaundice, pale stools, ascites, splenomegaly | 181 | NSC | Yes/No (+) | Ductular proliferation, cholestasis, copper-associated protein deposition | Present/Present | No | 2 years

20 | M/Caucasian/No | 5 weeks | Jaundice, pale stools, splenomegaly | 435 | NSC | Yes/No (+) | Cholestasis, copper-associated protein deposition, ductular proliferation & oedema | Present/Present | No | 11 years

21 | M/Arabic/Yes | 6 weeks | Jaundice, GI bleeding | 711 | NSC | N/a | Biliary cirrhosis, cholangiopathy, copper-associated protein deposition | Present/Present | Yes/14 years | 16 years

22 | F/Caucasian/No | 3 weeks | Jaundice | 451 | NSC | N/a | Cholestasis, copper-associated protein deposition, mild portal tract inflammation | Present | Yes/3 years | 16 years

23 | M/Caucasian/Yes | 36 weeks | Jaundice | 590 | NSC | N/a | Findings consistent with NSC (reported locally) | N/a | N/a | N/a

24 | M/Arabic/Yes | 57 weeks | Jaundice | 195 | NSC | No/Yes (+) | Findings consistent with NSC (reported locally) | N/a | N/a | N/a

TG01 | M/Asian/Yes | 3 weeks | Jaundice, FTT, pale stools | 185 | BA | No/No | Bile duct proliferation, portal tract expansion, bile plugs | N/a | No/listed | RIP at 15 months

TG02 | F/Asian/Yes | 4 weeks | Jaundice, FTT | 1099 | NSC | Yes/No (+) | Cholangiopathy, cholestasis, bridging fibrosis | N/a | Yes/2 years | 8 years

TG05 | M/Arabic/Yes | 8 weeks | Jaundice, FTT, pale stools | 207 | BA | No/No | Bile duct proliferation, portal tract expansion, bile plugs | N/a | Yes/2 years | 9 years

TG06 | F/Asian/No | 8 weeks | Jaundice, pale stools | 613 | BA | No/No | Bile duct proliferation, portal tract expansion, bile plugs | N/a | Yes/8 months | 12 years

TG07 | M/Asian/Yes | 6 weeks | Jaundice, pale stools | 355 | BA | No/No | Bile duct proliferation, portal tract expansion, bile plugs | N/a | Yes/11 months | RIP at 9 years

TG09 | F/Turkish/Yes | 4 weeks | Jaundice, pale stools | 228 | BA | No/No | Bile duct proliferation, portal tract expansion, bile plugs | N/a | No | 9 years

TG11 | M/Caucasian (GR)/No | 7 weeks | Jaundice, pale stools, hepatomegaly, splenomegaly | 365 | NSC | Yes/No (+) | Cholangiopathy, cholestasis, bridging fibrosis | N/a | Yes/15 years | 18 years

TG14 | F/Asian/No | 6 weeks | Splenomegaly, PHT | 82 | NSC | Yes/No (+) | N/a | N/a | No/listed | RIP at 6 years

TG17 | F/Asian/Yes | 4 weeks | Jaundice, FTT, pale stools | 257 | BA | Yes/Yes (+) | Bile duct proliferation, portal tract expansion, bile plugs | N/a | No/listed | RIP at 10 months

TG21 | M/Asian/Yes | 14 months | Pale stools, abnormal LFTs | 516 | NSC | Yes/No (+) | Copper associated protein, cholestasis, bridging fibrosis | N/a | Yes/8 years | 14 years

TG22 | M/Asian/Yes | 5 years | Jaundice, pale stools, abnormal LFTs | 247 | NSC | Yes/No (+) | Copper associated protein, cholestasis, bridging fibrosis | N/a | No/listed | RIP at 16 years

TG23 | F/Caucasian/No | 12 months | Jaundice, pruritus, splenomegaly | 191 | NSC | No/No | Cholangiopathy, cholestasis, bridging fibrosis | N/a | Yes/23 years | 34 years

TG25 | M/Asian/Yes | 5 weeks | Jaundice, pale stools | 221 | BA | No/No | Bile duct proliferation, portal tract expansion, bile plugs | N/a | No | 16 months

TG26 | M/Asian/Yes | 2 weeks | Jaundice, pale stools | 582 | BA | No/No | Bile duct proliferation, portal tract expansion, bile plugs | N/a | No | 20 months

TG27 | F/Caucasian/No | 3 weeks | Jaundice, pale stools | 614 | BA/KS | No/No | Bile duct proliferation, portal tract expansion, cholestasis | N/a | No | 12 years

TG28 | M/Caucasian/No | 7 weeks | Jaundice, pale stools | 469 | BA/KS | No/No | Bile duct proliferation, portal tract expansion, cholestasis | N/a | Yes/11 years | 16 years

Table 22: Demographical, biochemical, radiological and histological data of all patients. M, male; F, female; ERCP, Endoscopic retrograde cholangiopancreatography; MRCP, Magnetic resonance cholangiopancreatography; GGT, γ-Glutamyltransferase; GI, gastrointestinal; MDR3, multi drug resistance protein 3; MRP2, multidrug resistance associated protein 2; BA, biliary atresia; LT, liver transplantation; NSC, neonatal sclerosing cholangitis; GR, Greek; KS, Kabuki syndrome; PWS, Prader-Willi syndrome; n/a, not available.


5. Genetic results

5.1. Kabuki syndrome

Bi-directional fluorescent DNA sequencing (BigDye v3.1) was used to screen for mutations in exons 1-54 of MLL2 and, in patients with no MLL2 mutation, in exons 1-29 of KDM6A. Large deletions and duplications in both genes were tested for by multiplex ligation-dependent probe amplification (MLPA). Genetic analysis was undertaken in the Genomics Diagnostic Laboratory, Manchester. Each patient was found to be heterozygous for a different MLL2 mutation (Table 23). Nonsense truncating mutations were identified in patients 1, 2 and 4; the mutation in patient 2 was novel. Parental DNA was not tested.

In patient 3, a heterozygous splice site mutation of the MLL2 gene was revealed, previously reported in dbSNP. As the diagnostic laboratory suspected that this mutation was a neutral polymorphism, DNA was also sequenced for KDM6A mutations for exons 1-29. No pathogenic mutations were identified. Following MLPA dosage analysis no evidence of any large deletion or duplication affecting the KDM6A exons was detected. Maternal DNA proved negative for mutations in KDM6A and also for the MLL2 c.5868-8C>T variant identified in the index case. Paternal DNA was not available to determine the inheritance of the MLL2 variant and also confirm its clinical significance.

Screening of DNA from patient 4 revealed a heterozygous nonsense mutation in exon 10 of the MLL2 gene. Paternal DNA proved negative for this MLL2 mutation.

Patient Number | MLL2 mutation | Amino acid change | Mutation type | Reported by

1 | c.9961C>T | R3312* | Truncating | Ng et al.

2 | c.7411C>T | R2471* | Truncating | novel

3 | c.5868-8C>T | | Splice site | rs75783546

4 | c.1306G>T | E436* | Truncating | Ng et al.

Table 23: MLL2 mutations identified in KS patients. Bi-directional fluorescent DNA sequencing (BigDye v3.1) was used to screen exons 1-54 of the gene in the Regional Genetics Laboratory, Manchester. Large deletions and duplications were tested for by MLPA (multiplex ligation-dependent probe amplification). Three previously described mutations (2 truncating and 1 splice site) and one novel mutation (truncating) were identified; the amino acid changes are listed.

5.1.1.Discussion

KS is a syndrome consisting of multiple congenital anomalies and facial characteristics, first described in 1981 (Kuroki et al. 1981). Liver involvement has been identified as non-specific cholestasis with high GGT, biliary atresia, cholangiopathy and congenital hepatic fibrosis (Niikawa et al. 1988, Ewart-Toland et al. 1998, McGaughran et al. 2000, van Haelst et al. 2000, Selicorni et al. 2001, Nobili et al. 2004, Schrander-Stumpel et al. 2005). Prognosis has been variable, with a minority of these patients requiring LT, depending on the underlying liver pathology. Nobili et al. hypothesized a correlation between notch signalling pathway defects and liver disease in KS in view of some similarities between KS and Alagille syndrome (Li et al. 1997); this was not supported further once MLL2 was identified as the causative gene.

In our cohort of four KS patients we have identified high GGT cholestasis with prominent intra and/or extrahepatic cholangiopathy, portal tract expansion, fibrosis, inflammation and, in patient 1, nodular regenerative hyperplasia (NRH). The severity of liver disease was variable, with patient 1 being considered for LT and patient 4 having already required LT. The other two patients remain stable under close follow up in our centre. The diagnosis of KS was suspected on clinical phenotype in all patients.

The four KS patients investigated here were part of a larger cohort of patients with high GGT activity liver disease. They had undergone SNP analysis to identify areas of homozygosity and potentially recessive candidate causative genes. As this was not informative, we proceeded with WES analysis in a sub-cohort of 5 non-KS NSC patients in order to identify a causative gene. By the time WES data became available, MLL2 had been reported as the causative gene in the majority of KS patients. In our initial WES data analysis we identified 8 heterozygous single base change variants in MLL2 (2 splice site, 4 synonymous and 3 non synonymous exonic) in three non-KS patients. These variants were all previously known SNPs (in dbSNP). When assessed with SIFT (score 0.19-0.37), Polyphen and Grantham (mild to moderate physicochemical difference), and given an MAF across populations of 1-1.85%, they were classified as genetic polymorphisms, excluding a pathogenic association.

Pathogenic mutations in MLL2 and KDM6A have so far been identified in about 73% and 10% of KS patients, respectively. MLL2 is part of the larger SET1 family of histone methyltransferases, which affect gene expression. MLL2 methylates the Lys-4 position of histone H3 and belongs to the Activating Signal Cointegrator-2 complex (ASCOM), which has been shown to be a transcriptional regulator of the nuclear Farnesoid X receptor (FXR), itself a regulator of several genes involved in, amongst other pathways, bile acid homeostasis (Kim et al. 2009). Histone H3K4 trimethylation by MLL3, another histone methyltransferase (HMT), as part of the ASCOM complex is essential for nuclear receptor activation of bile acid transporter genes [bile salt export pump (BSEP), sodium taurocholate co-transporting polypeptide (NTCP) and multidrug resistance associated protein 2 (MRP2)] and is down regulated in cholestasis (Ananthanarayanan et al. 2011). This mechanism does not appear to be involved in the pathogenesis of the liver disease in our KS patients, as the main feature appears to be a cholangiopathy and no defects in bile acid synthesis were found during their initial work up.

MLL2 has also been linked to the pathogenesis of hepatocellular carcinoma (HCC), along with other histone methyltransferase, nucleotide-binding domain, calcium channel subunit and leucine-rich repeat-containing gene families (Gu 2013). The mechanism of action seems to be via down regulation of the MLL genes with loss of H3K4 methyltransferase activity, affecting transcription factors and subsequent chromatin remodelling (Kuroki et al. 1981), leading to high cell proliferation and carcinogenesis. Through exome sequencing of 87 HCCs, recurrent mutations were identified affecting TP53 (18%), CTNNB1 (10%), KEAP1 (8%), C16orf62 (8%), MLL4 (7%) and RAC2 (5%) (Cleary et al. 2013). By utilising whole genome analysis in 27 HCCs (25 associated with Hepatitis B or C), mutations including MLL and MLL3 were identified in 50% of samples (Fujimoto et al. 2012). To our knowledge, however, there are no cases of liver tumours reported in KS patients.

Another gene recently implicated in the pathogenesis of KS is KDM6A, with de novo deletions and point mutations identified in a small number of patients (Lederer et al. 2012, Miyake et al. 2013a). KDM6A encodes lysine demethylase 6A, which demethylates di- and trimethyl-lysine 27 on histone H3 (H3K27). H3K4 methylation by MLL2/3 is linked to the demethylation of H3K27 by KDM6A, both being involved in cell development, embryogenesis and chromatin development, potentially affecting a variety of cell lineages.

In conclusion, liver disease is relatively uncommon in KS patients. In our patient cohort cholangiopathy seems to be the cardinal feature, with variable outcome. Mutations in MLL2 (3 truncating, 1 splice site) were identified, including one newly described. A potential pathogenic mechanism of cholangiopathy could be down regulation of genes involved in cholestasis and cell development, with possible disarray in biliary epithelial cell adhesion and cytoskeleton organisation. The association between liver disease in KS and gene mutations has been underreported, primarily due to the low incidence of liver manifestations in KS. Another contributing factor is that in 17% of KS patients no genetic cause has yet been identified, possibly suggesting further genetic heterogeneity.

5.2.DNA for SNP analysis

In section 4.2 the patients included in the genetic analysis were described in detail, focusing on clinical presentation, laboratory investigations and disease progression. Following the identification of two groups of children with neonatal onset cholangiopathy, this section focuses on the genetic analyses; each step of the process is detailed together with the results obtained.

DNA was extracted from stored blood samples and, following several attempts, the required quantity and recommended concentration were obtained. Samples were taken to the Biomedical Research Centre at Guy’s Hospital. DNA was loaded onto 2 plates (12 samples/plate) and coded. DNA concentrations and identification numbers are listed below in Table 24.

Sample code | DNA concentration (ng/uL) | DNA No | PLATE No | Patient No (viewer)

A1 | 17 | R01C01 | *002 | 1

G1 | 44.1 | R01C02 | *002 | 2

B1 | 66.5 | R02C01 | *002 | 3

H1 | 257 | R02C02 | *002 | 4

C1 | 58.7 | R03C01 | *002 | 5

A2 | 775 | R03C02 | *002 | 6

D1 | 28.4 | R04C01 | *002 | 7

B2 | 365 | R04C02 | *002 | 8

E1 | 40.2 | R05C01 | *002 | 9

C2 | 73.4 | R05C02 | *002 | 10

F1 | 24.7 | R06C01 | *002 | 11

D2 | 55.9 | R06C02 | *002 | 12

A4 | 52 | R01C01 | *003 | 13

C3 | 33.5 | R01C02 | *003 | 14

F2 | 26 | R02C01 | *003 | 15

D3 | 22.4 | R02C02 | *003 | 16

G2 | 28 | R03C01 | *003 | 17

E3 | 31 | R03C02 | *003 | 18

H2 | 34.3 | R04C01 | *003 | 19

F3 | 59.8 | R04C02 | *003 | 20

A3 | 53.1 | R05C01 | *003 | 21

G3 | 112.8 | R05C02 | *003 | 22

B3 | 43.6 | R06C01 | *003 | 23

B4 | 39.1 | R06C02 | *003 | 24

Table 24: DNA concentrations and identification numbers (sample code, DNA, plate and patient viewer numbers)

5.3.PCR amplification of ABCB4 exon 6

Following multiple DNA extraction attempts, and prior to the samples being taken for SNP linkage analysis, DNA from all 24 patients in group 1 was tested to ensure that it was of sufficient quality for amplification. A test PCR was performed using exon 6 of the ABCB4 gene. Agarose gel electrophoresis confirmed successful amplification of the target DNA.

5.4.SNP linkage analysis

Once data were generated from the Human CytoSNP-12 v2.1 (300,000 Single Nucleotide Polymorphism) microarray, they were transferred to the Institute of Liver Studies at KCH for analysis. HomozygosityMapper was used and, overall, 31 different combinations of patients were analysed, generating an equal number of data sets. A representative portion of these analyses is described below, highlighting the different subgroups and parameters used to search for a small region of LOH in which strong candidate genes for NSC could be found.

5.4.1.Analysis A1

All 24 patients were included in the initial attempt to identify areas of LOH. The block length limit was 100 bp and the maximum homozygosity score was low at 386. This score represents the excess of the identified homozygosity against one calculated from a control population, with the red bars representing a score >80% of the maximum score (Figure 15). Homozygosity was recorded in 18 regions of 11 chromosomes with a length ranging from 10-2,020 kb (Table 25). The regions identified were too numerous and too large, and contained too many candidate genes, for efficient gene identification by mutation analysis.

Figure 15: Genome-wide homozygosity depicting the areas of LOH in all 24 patients in group 1. The y axis shows the homozygosity scores per chromosome and the x axis the chromosome number. Red bars represent the maximum scores per chromosome.


score chr from (bp) to (bp) from SNP to SNP Length (kb)

386 11 46543524 47175327 rs11038910 rs752849 632

386 20 31177986 31205771 rs6058750 rs456798 28

364 14 60387340 60687278 rs12435660 rs8005975 300

355 14 60782189 60813416 rs1951116 rs10151339 31

345 11 46327103 46522304 rs11821450 rs7949371 195

340 2 88535325 88596705 rs11694318 rs6753479 61

339 7 102593794 103035255 rs13233521 rs4363136 441

338 16 31854357 33836510 rs7196252 rs9674439 1982

337 3 96417590 96710451 rs4857220 rs1547399 293

329 3 89823487 90097585 rs1284258 rs1608029 274

328 1 147795678 149815536 rs1763457 rs7531664 2020

321 1 32729702 33029907 rs1004420 rs12044277 300

321 2 88474370 88484808 rs2139100 rs2292869 10

321 16 33909759 34516075 rs7499238 rs7196505 606

318 1 65687556 65838229 rs1889925 rs7537733 151

314 4 53619367 53795651 rs4865408 rs301126 176

313 7 62534685 62573214 rs10227895 rs10280942 39

310 7 72789995 72970268 rs2528707 rs17145732 180

Table 25: Summary of the regions with the highest homozygosity scores for analysis A1. Regions are described as per bp number and reference SNP (rs) number.
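The lengths and the score threshold reported for analysis A1 can be checked against the coordinates in Table 25. The following is a minimal sketch in Python, assuming that the length column is simply the difference between the end and start positions rounded to the nearest kb, and that the threshold quoted in the text is 0.8 times the maximum score. The function names are illustrative and are not part of HomozygosityMapper.

```python
# Three rows copied from Table 25 as (score, chromosome, from_bp, to_bp).
ROWS_A1 = [
    (386, "11", 46543524, 47175327),
    (321, "2", 88474370, 88484808),
    (328, "1", 147795678, 149815536),
]


def length_kb(from_bp, to_bp):
    """Region length in kb, rounded to the nearest integer (assumed convention)."""
    return round((to_bp - from_bp) / 1000)


def above_threshold(score, max_score, fraction=0.8):
    """True if the score exceeds the given fraction of the maximum score."""
    return score > fraction * max_score


max_score = max(row[0] for row in ROWS_A1)
for score, chrom, start, end in ROWS_A1:
    print(chrom, length_kb(start, end), above_threshold(score, max_score))
```

For the three rows shown, the computed lengths agree with the values listed in the table (632, 10 and 2020 kb).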

5.4.2.Analysis A2

In this analysis all consanguineous patients were included (1, 3, 5, 7-10, 13, 14, 16, 23 & 24) plus patient 5 who has PWS (Figure 16). This was done to identify regions of LOH in chromosome 15, in which our PWS patient had proven uniparental disomy. The block length limit was reduced to 100 bp. Length of LOH ranged from 23-2034 kb in seven chromosomes (Table 26). A region in chromosome 1 (52358988-53051993) was identified, within which lies the Ras-associated protein 3B (RAB3B) gene (1:52,373,627-52,456,435). This was a possible candidate gene as it codes for Ras-associated protein 3B, a protein localised in multisystemic vesicular structures that, once bound to guanosine-5'-triphosphate (GTP), interferes with secretory regulation and transcytosis (Zahraoui et al. 1989, Rousseau-Merck et al. 1991, van et al. 2002). With RAB3B in mind, further analyses were undertaken to find a stronger candidate gene.

Figure 16: Genome-wide homozygosity depicting the areas of LOH in 12 consanguineous patients plus individual 5 with PWS. The y axis depicts the homozygosity scores per chromosome and the x axis the genome-wide distribution. The region in chromosome 1: 52358988-53051993 containing the RAB3B gene is easily identifiable (red arrow).


score chr from (bp) to (bp) from SNP to SNP Length (kb)

1131 1 52358988 53051993 rs2842579 rs12737517 693

1007 16 47538010 48079134 rs12149656 rs520151 541

997 1 32729702 33116623 rs1004420 rs704874 387

990 16 46723598 47379020 rs33994299 rs35514741 655

979 16 33909759 34516075 rs7499238 rs7196505 606

972 14 66911129 67234457 rs915056 rs17247792 323

960 12 112146964 112661104 rs608848 rs7297415 514

957 12 112667675 112981571 rs1005902 rs739787 314

949 10 38307723 38357774 rs594594 rs677565 50

942 1 51804731 52016713 rs11205836 rs6588412 212

942 12 111837285 112123284 rs7298118 rs601663 286

940 19 23960812 24022803 rs7245872 rs11085641 62

935 1 147826658 149860372 rs2932454 rs17581597 2034

933 15 48369485 48528113 rs16960538 rs11070629 159

925 16 31888684 33836510 rs11641727 rs9674439 1948

919 15 48670904 48694211 rs959324 rs17352842 23

906 12 110578192 110809859 rs12297035 rs28665218 232

Table 26: Summary of the regions with the highest homozygosity scores for analysis A2. Regions are described as per bp number and reference SNP (rs) number. Chr, chromosome; SNP, single nucleotide polymorphism.
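Whether a candidate gene lies inside a region of homozygosity is a simple interval containment test. The sketch below, in Python, applies it to the top-scoring chromosome 1 region of Table 26 and the RAB3B coordinates quoted in the text. The `Interval` type and `contains` function are illustrative names chosen for this example; this is not a description of how GeneDistiller or HomozygosityMapper perform the lookup.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # bp
    end: int    # bp


def contains(region: Interval, gene: Interval) -> bool:
    """True if the gene lies entirely within the region on the same chromosome."""
    return (
        region.chrom == gene.chrom
        and region.start <= gene.start
        and gene.end <= region.end
    )


# Coordinates as given in the text and in Table 26.
top_region_a2 = Interval("1", 52358988, 53051993)
neighbouring_region_a2 = Interval("1", 51804731, 52016713)
rab3b = Interval("1", 52373627, 52456435)

print(contains(top_region_a2, rab3b))
print(contains(neighbouring_region_a2, rab3b))
```

Requiring full containment is a deliberately strict choice; a gene that only partly overlaps a region would need a separate overlap test.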

All patients in this analysis were included in a further analysis with the block length limit increased to 250 bp, to narrow our search to larger regions of homozygosity only. LOH regions were identified in three chromosomes, with lengths ranging from 21 to 1204 bp. In the regions of interest GeneDistiller was used to investigate candidate genes in detail. There were 35 genes in chromosome 11, 8 genes in chromosome 14 and 25 genes in chromosome 12. None of these was classified as a strong candidate for NSC based on tissue protein expression and function.

5.4.3.Analysis A3

Patients included were 1, 3, 14, 18, 19, 21 and 24. These patients were from consanguineous families of Arabic origin (Table 27).

score chr from (bp) to (bp) from SNP to SNP Length (kb)

750 18 24880578 28943792 rs11872416 rs11081692 4063

716 5 66181023 66513326 rs27876 rs1627990 332

695 10 82974169 84151515 rs7919037 rs3740350 1177

680 2 88441879 88596705 rs9309644 rs6753479 155

672 10 12288780 12597108 rs4469792 rs2296754 308

670 6 57205920 62651063 rs6459193 rs6901766 5445

665 2 98677164 98928929 rs17026292 rs1466944 252

652 5 56952654 57602994 rs170741 rs1202500 650

636 15 71954934 73301315 rs8026078 rs8041358 1346

610 10 38667316 43474694 rs10827867 rs10900291 4807

Table 27: Regions with the highest homozygosity scores for analysis A3. Regions are described as per bp number and reference SNP (rs) number. Chr, chromosome; SNP, single nucleotide polymorphism.

The regions of homozygosity recorded in chromosomes 2, 15 and 18 were centromeric. The regions in chromosomes 5, 6 and 10, when analysed with GeneDistiller, contained 4, 17 and 27 genes respectively. None of these was considered a strong candidate gene.

5.4.4.Analysis A4

In this analysis all consanguineous patients of Arabic origin plus non-consanguineous patients of Greek origin were included. Areas of homozygosity were observed in chromosomes 1, 2, 6, 7, 11, 12, 16 and 19. The areas in chromosomes 2, 6 and 16 were centromeric; the area in chromosome 7 was very small, with no genes included. The regions in chromosomes 16 and 19 included genes coding for zinc signalling proteins. When these results were compared with the results of analysis A5 (all consanguineous Asian patients plus the three non-consanguineous Greek patients; section 5.4.5), the regions in chromosome 11 were different, weakening this observation. Block length was set to 250 bp. LOH length ranged from 9-4183 kb. The area in chromosome 1 containing the RAB3B (OMIM#179510) gene was again noted to be included in the LOH regions.

5.4.5.Analysis A5

In this analysis all consanguineous Asian patients plus the three non-consanguineous Greek patients were included. Block length was set to 100 bp. Homozygosity was observed in chromosomes 1, 12 and 14. The data were later compared with another analysis (not listed here) including only consanguineous Asian patients, and the areas in chromosomes 1, 12 and 14 were identical. The region containing the RAB3B gene was once more identified in the analysis.

5.4.6.Analysis A6

In this analysis we included only the consanguineous patients (1, 3, 7-10, 13, 14, 16, 18, 19, 23 & 24) of all ethnic backgrounds. The LOH block length limit was set at 1000 bp in an effort to identify only very large areas of homozygosity. Length of LOH regions was 479-3416 kb. In total only 5 regions in 4 chromosomes were found, but the size of these regions made it impossible to analyse in depth any strong candidate genes associated with NSC.

5.4.7.Analysis A7

This analysis included the three patients of Greek origin with no known family history of consanguinity. The LOH length range was 14-6412 bp. Multiple low-scoring regions, 22 in total, were identified in 13 chromosomes despite the lack of any history of consanguinity. The number of genes in these regions made further in-depth analysis impracticable.

5.4.8.Analysis A8

Patients included in this analysis were two sets of siblings from two consanguineous families plus another consanguineous Asian patient (3, 8, 10, 16 & 18). Multiple small areas of LOH in chromosome 7 were identified in a total region of 6320 kb. There were overall 154 candidate genes in that region and none of them were strong candidates based on tissue specificity and function.

5.4.9.Summary of HomozygosityMapper (HM) analysis

The SNP linkage analysis and subsequent homozygosity mapping identified large regions of homozygosity with a significant number of genes in each region. A variable block length limit (100-1000 bp) was used across the genome to identify contiguous runs of homozygous genotypes of different sizes. The majority of the analyses were performed at 250 bp to avoid short and very common runs of homozygosity (ROH). This approach generated areas of homozygosity that were too numerous and large and contained too many candidate genes for efficient gene identification by mutation analysis. Despite the multiple different analyses via HM, the size of the LOH regions remained unmanageable for conventional gene sequencing and BeadStudio was used in order to potentially overcome this problem.
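The idea of a block length limit can be illustrated with a simplified scan for contiguous runs of homozygous genotype calls. This is only a sketch of the general principle and not the scoring algorithm used by HomozygosityMapper; the function name, the genotype encoding ("AA", "AB", "BB") and the use of a marker count as the threshold are all assumptions made for illustration.

```python
from typing import List, Tuple


def runs_of_homozygosity(genotypes: List[str], min_markers: int) -> List[Tuple[int, int]]:
    """Return (first_index, last_index) for each run of consecutive
    homozygous calls that contains at least min_markers SNPs.

    genotypes: one call per SNP in chromosomal order, e.g. "AA", "AB", "BB".
    """
    runs = []
    start = None  # index where the current homozygous run began
    for i, call in enumerate(genotypes):
        homozygous = len(call) == 2 and call[0] == call[1]
        if homozygous:
            if start is None:
                start = i
        else:
            # A heterozygous call closes the current run, if any
            if start is not None and i - start >= min_markers:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the last marker
    if start is not None and len(genotypes) - start >= min_markers:
        runs.append((start, len(genotypes) - 1))
    return runs


# Hypothetical genotype calls along one chromosome
calls = ["AA", "BB", "AB", "AA", "AA", "BB", "AA", "AB", "BB"]
print(runs_of_homozygosity(calls, min_markers=3))
```

Raising `min_markers` discards the short runs, which mirrors the rationale given above for using a larger block length limit.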

5.5.Results of BeadStudio

The LOH Detector (Bead) GT v3.0 and CNV partition (Genome) v2.4.4 were used for further data analysis. We uploaded the data from the whole genome Human CytoSNP-12 v2.1 (300,000 Single Nucleotide Polymorphism) microarray analysis. We performed 4 individual analyses (Table 4) with different sub-cohorts of patients and different minimum SNP numbers. The LOH regions generated, located in 20 chromosomes, were scrutinised. Initially the maximum number of patients sharing common regions was recorded. In view of the large size of these regions the decision was taken to focus our efforts on regions that were shared between ≥4 patients (Table 28). This approach reduced the number of LOH regions to eight, across seven chromosomes with two regions in chromosome 15, which are highlighted in red in the table below.

Chromosome  LOH start    LOH finish   Length (kb)  Patient No     Total number of patients
1           201,509,798  202,637,037  1127         9,13,21,24     4
2           174,014,783  177,123,816  3109         14,21,24       3
3           32,623,375   33,161,360   538          14,16,7        3
3           190,172,681  192,490,021  2317         8,10,24        3
4           6,195,581    6,908,179    713          7,9,10         3
5           2,970,143    3,724,006    754          13,19,24       3
5           19,583,334   23,488,681   3905         1,9,23         3
5           83,355,369   88,301,953   4947         16,23,24       3
6           30,737,131   32,543,022   1806         5,9,10,13,21   5
7           141,319,073  146,061,374  4742         3,10,18        3
8           21,303,156   29,672,161   8369         9,10,13,14     4
9           15,695,395   19,504,307   3809         8,9,24         3
10          28,689,836   38,725,237   10035        1,7,16         3
10          51,258,183   56,296,559   5038         1,14,16        3
11          47,944,866   47,996,437   52           4,5,6,13,23    5
12          4,787,388    5,895,315    1108         8,10,16,19     4
13          20,937,821   23,365,342   2428         14,16          2
13          28,913,007   30,142,780   1230         1,8            2
13          39,603,939   48,818,777   9215         9,10           2
14          34,919,935   37,330,743   2411         7,8,13,19,24   5
15          95,854,088   99,202,109   3348         5,8,9,13,14    5
15          21,748,154   22,598,069   850          5,16,8,10      4
16          6,188,248    6,458,669    270          10,24          2
16          9,517,857    11,492,756   1975         16,19          2
16          54,283,042   56,526,012   2243         10,16          2
16          62,732,436   69,535,863   6803         7,23           2
17          65,401,023   69,559,372   4158         9,13,21        3
18          10,111,179   12,964,073   2853         8,9,24         3
19          45,139,097   51,569,245   6430         7,9            2
20          12,572,679   12,875,466   303          14,21          2

Table 28: Areas of LOH in individual chromosomes identified by BeadStudio and mapped by chromosome start and stop position and length. Patient identification number and total number of patients per region are included. The LOH regions shared amongst ≥4 patients are highlighted in red. No LOH regions were identified in chromosomes 21, 22, X and Y.
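Selecting the LOH regions shared by a minimum number of patients amounts to an interval-overlap problem. The sketch below shows one simple way to do it, assuming per-patient LOH intervals on a single chromosome; the function name, the patient labels and the coordinates are hypothetical, and this is not the procedure implemented in BeadStudio.

```python
from typing import Dict, List, Tuple

Region = Tuple[int, int]  # (start_bp, end_bp) on one chromosome


def shared_segments(loh: Dict[str, List[Region]], min_patients: int):
    """Split the chromosome at every region boundary and keep the
    segments that lie inside LOH regions of at least min_patients patients.

    Returns a list of (segment_start, segment_end, [patients]).
    """
    # Every start and end coordinate is a potential segment boundary
    points = sorted({p for regions in loh.values() for r in regions for p in r})
    shared = []
    for left, right in zip(points, points[1:]):
        carriers = sorted(
            patient
            for patient, regions in loh.items()
            if any(start <= left and right <= end for start, end in regions)
        )
        if len(carriers) >= min_patients:
            shared.append((left, right, carriers))
    return shared


# Hypothetical LOH intervals for four patients
loh = {
    "P1": [(100, 500)],
    "P2": [(200, 600)],
    "P3": [(300, 450)],
    "P4": [(700, 900)],
}
print(shared_segments(loh, min_patients=3))
```

Adjacent segments carried by the same patients could be merged afterwards; that step is left out to keep the sketch short.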

Siblings 10 & 16
Chromosome  LOH start    LOH finish   Length (kb)
1           108,276,164  120,886,421  12,610
1           142,739,671  155,751,199  13,011
2           62,478,288   65,890,044   3,411
7           115,465,882  129,404,963  13,939
8           55,432,038   70,661,462   15,229
12          3,543,760    5,895,315    2,351
13          35,365,943   41,164,889   5,798
15          18,421,386   22,598,069   4,176
15          38,432,425   46,782,678   8,350
16          52,830,779   56,526,012   3,695

Siblings 3 & 18
Chromosome  LOH start    LOH finish   Length (kb)
7           137,314,017  146,061,374  8,747

Table 29: LOH regions for the two sets of siblings (patients 10 & 16) and (patients 3 & 18) when analysed per family.

Interestingly in one set of siblings (patients 10 & 16) both were found to have regions of LOH in chromosomes 1, 2, 7, 8, 12, 13, 15 and 16. The decision was taken to analyse each one of the two sets of siblings separately for LOH by increasing the block length and also including small gaps (Table 29). This was performed directly in the haplotype tables in an attempt to minimize a type I error. Data were analysed further and regions of LOH between the siblings in each set were reduced by identifying regions of LOH also shared by other patients. The regions were reviewed on the Ensembl genome browser and a large number of genes were located in each region.

The most interesting LOH area was in chromosome 7: 141,319,073-146,061,374. This was because patient 10 was also homozygous in that region, bringing the total number of patients with LOH in the region up to 4 (Table 30). Genes in this region were reviewed, but none was of specific interest in terms of encoded protein or tissue expression, and this result was considered non-significant.

Siblings 10 & 16
Chromosome  LOH start    LOH finish   Length (kb)  Patients
7           115,465,882  121,774,394  6309         Patient 19
8           59,866,991   70,661,462   10794        Patient 13
12          3,543,760    5,895,315    2352         Patients 8, 19
13          39,603,939   41,164,889   1561         Patients 1, 9, 14
15          21,748,154   22,598,069   850          Patients 5, 8

Siblings 3 & 18
Chromosome  LOH start    LOH finish   Length (kb)  Patients
7           141,319,073  146,061,374  4742         Patients 8, 10

Table 30: LOH regions for set of siblings (10 & 16) and (3 & 18) shared with other patients.

5.6.RAB3B sequencing

5.6.1.RAB3B primers

Following the repeated identification of homozygosity in the region of chromosome 1 where RAB3B is located, we decided to proceed with sequencing of this gene. Forward and reverse primers for the five exons of RAB3B were designed and are listed in Table 31. Patients chosen according to HM analysis were 5, 9, 10 and 21.

Exon no.-strand  Primer sequence (5'-3')           Annealing temperature for PCR (°C)
RAB3B-1F         CCT AGT AAG TTC AGT CAA TGG A     64
RAB3B-1R         GGA TCG GTC CCT GAT CCT C         62
RAB3B-2F         TGC TGG AGT AAG GAG TGA GTG       64
RAB3B-2R         CAC CTC TAA GCC CTT GTA CTG       60
RAB3B-3F         ATG TCT TGG GTT ATT AGT TAA TTG   62
RAB3B-3R         GCA CAC AAT ACA GGA CCT AGC       64
RAB3B-4F         GAC TCT AGC CTC ATT GAC ATA AT    64
RAB3B-4R         ATA CAC ACA GAT CAC ATA GGT AC    64
RAB3B-5F         TGC TGA GCA CTG GAC ATA TTA AT    64
RAB3B-5R         TGT GGG CAG TGT GTA ACA GG        62

Table 31: Design parameters of RAB3B forward and reverse primers.
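For orientation, a first estimate of the melting temperature of a short primer is often obtained with the Wallace rule, Tm = 2(A+T) + 4(G+C). The sketch below applies that rule to primer sequences written as in Table 31. It is an assumption that a rule of this kind is relevant here: the thesis does not state how the tabulated temperatures were derived, and the working annealing temperatures were established empirically by gradient PCR.

```python
def wallace_tm(primer: str) -> int:
    """Approximate melting temperature (°C) of a short oligonucleotide
    using the Wallace rule: 2 °C per A/T and 4 °C per G/C."""
    seq = primer.replace(" ", "").upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("primer contains characters other than A, C, G, T")
    return 2 * at + 4 * gc


# Two primers from Table 31 (spaces between triplets are ignored)
print(wallace_tm("GGA TCG GTC CCT GAT CCT C"))    # RAB3B-1R
print(wallace_tm("TGC TGG AGT AAG GAG TGA GTG"))  # RAB3B-2F
```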

5.6.2.RAB3B results

PCR temperature gradients were tested for individual exons and after several attempts the optimal temperatures were defined as in Table 32. Following PCR clean-up, samples were run on the automated ABI PRISM 3130-Avant Genetic Analyser. Sequencing results were exported and aligned to a reference gene using Sequencher 4.8 (Gene Codes Corporation, Ann Arbor, MI, USA) and no mutations were identified in any of the 5 exons of RAB3B in the four patients. This gene was therefore excluded as a candidate gene for NSC.

Exon No   Temperature (°C)
1         54
5         62
2, 3, 4   58

Table 32: PCR temperature per RAB3B exon tested.

5.7.Whole Exome Sequencing analysis

In the previous chapter, analysis of the SNP mapping data was undertaken after identification of the patient groups. Despite testing various patient subgroup combinations, it was not possible to find a causative gene. At this stage the question of a genetic confirmation for the remaining patients was still unanswered. For this reason the decision was taken to proceed with WES analysis in a small group of 5 patients initially and subsequently to expand this to another 21 affected patients, as described in Methods sections 3.2.8-3.2.9.7.

5.7.1.Whole Exome Sequencing variant detection in Group 1

WES was performed initially in five patients and data were processed for filtering and identification of possible strong candidate genes. In total 128,580 variants were identified across these patients. As there was strong suspicion that NSC is an autosomal recessive inherited disease, the filtering followed a step-by-step process, which first selected homozygous variants (52,196 variants) and then frameshift deletions/insertions, non-synonymous single nucleotide variants (SNV) or splice-site variants (28,892 variants). After removing all variants previously known to dbSNP, the 1000 Genomes Project or the BRC internal control genomes, the number was reduced to 150 variants. These variants had passed quality control thresholds of read depth >4 and quality score >20 as described in section 3.2.8.9. The genes with the most variations amongst patients that passed the filtering process are listed in Table 33 along with their homozygous variations. None of these 150 variants were identified as strong candidates based on protein function and tissue expression.
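The stepwise filter described above can be sketched as a chain of successive selections. The record layout, field names and gene labels below are hypothetical and stand in for whatever the annotation pipeline produced; only the thresholds (read depth >4, quality score >20) are taken from the text.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Variant:
    gene: str
    zygosity: str   # "hom" or "het"
    effect: str     # e.g. "frameshift", "nonsynonymous", "splice_site", "synonymous"
    known: bool     # True if already present in a reference database
    depth: int      # read depth at the variant position
    quality: float  # variant quality score


# Effect classes retained by the filter (hypothetical labels)
DAMAGING = {"frameshift", "nonsynonymous", "splice_site"}


def filter_recessive(variants: List[Variant], min_depth: int = 4,
                     min_quality: float = 20) -> Dict[str, List[Variant]]:
    """Apply the filters one after another and keep every intermediate list."""
    homozygous = [v for v in variants if v.zygosity == "hom"]
    damaging = [v for v in homozygous if v.effect in DAMAGING]
    novel = [v for v in damaging if not v.known]
    passed_qc = [v for v in novel if v.depth > min_depth and v.quality > min_quality]
    return {"homozygous": homozygous, "damaging": damaging,
            "novel": novel, "passed_qc": passed_qc}


# Hypothetical toy input
variants = [
    Variant("GENE_A", "hom", "frameshift", False, 12, 45.0),
    Variant("GENE_B", "het", "nonsynonymous", False, 30, 60.0),
    Variant("GENE_C", "hom", "synonymous", False, 25, 50.0),
    Variant("GENE_D", "hom", "splice_site", True, 18, 40.0),
    Variant("GENE_E", "hom", "nonsynonymous", False, 3, 35.0),
]
for step, kept in filter_recessive(variants).items():
    print(step, [v.gene for v in kept])
```

Keeping the intermediate lists makes it easy to report how many variants survive each step, as done in the paragraph above.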

As this filtering process did not generate the desired result, the next approach in the analysis was to expand our search to the total 128,580 variants and include heterozygous changes in the same genes along with homozygous changes identified across subgroups of patients. No single gene had variants amongst all five patients. The numbers of genes identified across patient subgroups were 4, 16, 128 and 1865 in the 4, 3, 2 and 1 patient subgroups, respectively. Although this more inclusive approach generated a larger number of candidate genes, when results were filtered based on the encoded protein function and tissue expression characteristics again no strong candidates were identified.

Gene      RefSeq ID  Nucleotide change  Amino acid change
MUC5B     NM_002458  c.8588T>G          p.I2863R
MUC4      NM_018406  c.2491T>C          p.S831P
SSPO      NM_198455  c.7667G>T          p.G2556V
SSPO      NM_198455  c.12200C>T         p.A4067V
MYO10     NM_198455  c.2413C>T          p.R805W
ADAMTSL4  NM_012334  c.920G>A           p.R307Q
AGRN      NM_012334  c.1969G>A          p.E657K
SPTBN5    NM_016642  c.1969G>A          p.E657K
SPTBN5    NM_016642  c.3722_3744del     p.1241_1248del

Table 33: Novel homozygous variants identified in seven genes. The variants were individual amongst each patient.

In a third attempt to filter genes of interest, the data generated from the previous SNP mapping analysis were incorporated in the prioritization of variants found in WES. The areas of LOH already identified were screened for candidate genes within these regions and the same filtering process was followed. Initially, based on the most interesting area of LOH (chromosome 7: 141,319,073-146,061,374) identified through homozygosity mapping and BeadStudio, as described in section 4.6, variants located within this region were reviewed. On this occasion all variants located in that region were included and across all five patients 59 variants were identified. Further investigation showed that only 23 were non-synonymous (SNV, indels, splice-site, CNV), but all were previously known to dbSNP so they were not considered to be strong candidates.

In the last step the WES data were analysed based on all areas of LOH shared amongst patients as per section 4.6 (Table 34).

Patient Number  Homozygous CNV  Novel homozygous CNV  Gene Name
5               nil             nil                   nil
8               30              13                    ADAMTS20, ANO6, C14orf179, CKAP4, CDI63, DCAF4, GLT8D2, GPATCH8, IFI35, KIF21A, MAP3K9, OR4S2, PLA2G5, SSPO
9               4               1                     ZFHX4
10              37              20                    ADAMTSL4, AGRN, C1orf38, CAB39L, CRTC2, HR, KDM6B, OR10J1, PAFAH2, PMVK, RP1, SAMD11, SPATA5, SSPO, TCEA3, TIRAP, TMEM48, TNXB, TRIM16, VSX2
13              55              33                    ADAMTS17, ALPK1, ANKRD1, BTN3A3, C16orf59, C8orf34, CCT3, FRA10AC1, FSTL5, GPC3, GYLTL1B, IGHMBP2, IQGAP1, IQ6AP1, ISG20L2, KILLIN, KRT37, LAPTM4B, MAGI1, MBD6, MS4A5, MTCH1, MYO10, NR1H4, PHACTR2, PLAC1L, RHBG, RHD, SEMA4G, SHANK2, SLC16A5, SLC26A10, SLC39A14, TRAM1L1, UNC13D, ZNF705D

Table 34: CNV generated by WES and identified in each patient in the regions of LOH as described in section 4.6. Results are displayed per patient after filtering for homozygous (deletions or insertions) and not previously described variations.

The genes in which variants were identified in common LOH regions were filtered further with respect to encoded protein function (i.e. intracellular, transport, structure) and tissue specificity (i.e. liver, biliary system). From Table 34 the genes which passed filtering and were expressed in the liver as well as in other tissues were CDI63 and ANO6 (patient 8), RP1 and ADAMTSL4 (patient 10) and IQGAP1, ADAMTS17, GPC3 and IQ6AP1 (patient 13), but none were shared amongst more than one individual and they were not considered strong candidates.

Overall, through our analysis of the data generated by the first group of patients, no strong candidate genes were identified. There are several potential explanations for the inability to find a compelling candidate gene in which unknown variants are seen in all affected cases. Firstly, NSC may be subject to genetic heterogeneity, i.e. the disease phenotype may be caused by mutations in different genes in each patient. Secondly, the number of patients included in this first analysis may have been too small to find a causative gene, despite including patients with the most severe phenotype. Thirdly, some or all of the causative mutations could have been in regions of non-captured genes, and fourthly, the filtering strategy may have been too stringent and eliminated the possible pathogenic gene. In an attempt to overcome some of these issues, mainly the limitations in patient numbers and possible genetic heterogeneity, we proceeded with WES in a larger cohort of 21 affected individuals.

5.7.2.Whole Exome Sequencing detection in Group 2

In the second step of the WES analysis, which included 21 patients with neonatal cholangiopathy, gene mutations were annotated as per section 3.2.9 and filtered to find possible disease-causing mutations. Variants present with MAF>1% in the 1000 Genomes or the Exome Sequencing Project (ESP) data were excluded, as were intergenic variants and variants that were flagged as low quality or potential false positives (quality score <30, long homopolymer runs >5, low quality by depth <5 or within a cluster of SNPs). The initial analysis again focused on variants that supported an autosomal recessive model of inheritance, prioritized based on biological relevance. In each patient an average of 24,000 variations were annotated and variant data were loaded into GEMINI for analysis and filtering as described in section 3.2.9.8. In total 998 variations in genes inherited in an autosomal recessive mode were identified. Genes were categorized by the number of patients in whom homozygous variations were identified, with 896 individual gene variations found in single patients. Variable combinations of two patients shared 64 genes and the number of candidate genes was inversely correlated to the number of patients with variations in the same gene (Table 35).

Number of patients  Number of variations  Gene name
20                  1                     MUC4
19                  1                     NAGLU
18                  3                     C12orf75, TC2N, UCK1
15                  1                     CDCP2
14                  1                     URI1
12                  1                     THAP7
10                  2                     NAV3, SLU7
9                   4                     MAGEF1, MAP3K1, MYH3, ZNF214
8                   4                     CCDC66, MEOX2, MPRIP, PAPLN
7                   2                     AGAP7, MEF2A
6                   1                     DNMT1
5                   2                     HNF4G, KRTAP10-7
4                   7                     DMBT1, ENTPD2, FAT1, MAML3, OR7G1, TMIE, YEATS2
3                   8                     DCDC2, GLI1, GPR112, HOMEZ, KIAA0040, LRP11, PCDHB8, SYNE1

Table 35: List of variations causing frameshift changes and introduction of a downstream premature stop codon. Patients are grouped according to the number of common genes observed.
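A grouping of the kind shown in Table 35 can be produced by counting, for each gene, the number of distinct patients carrying a qualifying variant. The sketch below assumes the filtered calls are available as (patient, gene) pairs; the function name and the toy identifiers are hypothetical, and the thesis itself performed this step within GEMINI.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def genes_by_patient_count(calls: Iterable[Tuple[str, str]]) -> Dict[int, List[str]]:
    """Group genes by the number of distinct patients carrying a variant.

    calls: (patient_id, gene) pairs, one per qualifying variant.
    Returns {number_of_patients: [genes]}, largest groups first.
    """
    patients_per_gene = defaultdict(set)
    for patient, gene in calls:
        patients_per_gene[gene].add(patient)  # a set ignores repeat calls

    grouped = defaultdict(list)
    for gene, patients in patients_per_gene.items():
        grouped[len(patients)].append(gene)

    return {n: sorted(genes) for n, genes in sorted(grouped.items(), reverse=True)}


# Hypothetical toy input; P1 has two variants in GENE_A but is counted once
calls = [
    ("P1", "GENE_A"), ("P2", "GENE_A"), ("P3", "GENE_A"),
    ("P1", "GENE_B"), ("P2", "GENE_B"),
    ("P3", "GENE_C"),
    ("P1", "GENE_A"),
]
print(genes_by_patient_count(calls))
```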

MUC4 (OMIM#158372) was identified as a candidate in all patients from both sets of patients who underwent WES analysis. It encodes mucin-4 (MUC4), an epithelial cell surface glycoprotein and a major constituent of mucus. MUC4 function has been associated with certain types of cancer such as sarcomas and squamous cell carcinomas (Ponnusamy et al. 2013). However, MUC4 has been reported to carry commonly identified variants in WES. This is attributed to the gene having undergone recent copy events, which give rise to many false positive (paralogous) alignments and subsequently to systematic false positive variant calls. On the basis of this information MUC4 was dismissed as a strong candidate gene and was excluded from further analysis.

Candidate gene homozygous variations were also filtered based on their impact, prioritizing protein-truncating mutations that either introduce stop codons (Table 36) or cause frameshift changes and a premature stop codon downstream (Table 37). In the next step the homozygous variations identified in the 24 genes listed in Tables 36 & 37 were investigated individually and prioritized based on liver tissue expression and protein function, using gene databases such as GeneCards (http://www.genecards.org) and OMIM (http://www.omim.org).

The gene doublecortin domain containing 2 (DCDC2, OMIM #605755) was prioritized as a strong candidate as (i) three patients were found to have homozygous changes (2 frameshift, 1 stop codon), (ii) the encoded protein is expressed in the liver as well as in other organs, (iii) DCDC2 is a microtubule-associated protein regulating microtubules, which are structural components of primary cilia, which are also present in cholangiocytes, and finally (iv) heterozygous changes in DCDC2 were identified in another 3 patients (2 frameshift, 1 stop codon), bringing the total number of probands to six.

Patient Number  Gene      Chromosome  Chromosome position (start)  Chromosome position (end)  MAF (%)
TG20            ATP6V1G3  chr1        198505830                    198505831                  0.24
TG17            BBS12     chr4        123663686                    123663687                  0.13
TG07            CCDC150   chr2        197521567                    197521568                  0.13
TG22            COL9A2    chr1        40773149                     40773150                   0.92
TG15            DCDC2     chr6        24278308                     24278309                   0.26
TG14            DCDC2     chr6        24291214                     24291215                   0.13
TG21            LILRA3    chr19       54803663                     54803664                   0.40
TG20            METTL8    chr2        172182386                    172182387                  0.19
TG07            PRRG2     chr19       50091761                     50091762                   0.13
TG21            PTCHD3    chr10       27688100                     27688101                   0.78
TG05            SATL1     chrX        84349206                     84349207                   0.24
TG17            SLFN12    chr17       33749533                     33749534                   0.13
TG07            ZNF221    chr19       44470740                     44470741                   0.13

Table 36: Gene variations introducing a premature stop codon with high impact on the encoded protein, as identified on GEMINI. Gene variations are listed in alphabetical order including the chromosome number, start and end position and minor allele frequency (MAF) according to the exome sequencing project (ESP).

Patient Number  Gene     Chromosome  Chromosome position (start)  Chromosome position (end)  MAF (%)
TG20            BFSP1    chr20       17474719                     17474722                   0.17
TG21            C3orf35  chr3        37458911                     37458934                   0.24
TG21            CBY1     chr22       39069233                     39069234                   0.1
TG22            CBY1     chr22       39069233                     39069234                   0.1
TG17            CDCP2    chr1        54605318                     54605319                   0
TG16            CDCP2    chr1        54605318                     54605319                   0
TG19            CDCP2    chr1        54605318                     54605319                   0
TG14            CDCP2    chr1        54605318                     54605319                   0
TG09            CDCP2    chr1        54605318                     54605319                   0
TG23            CDCP2    chr1        54605318                     54605319                   0
TG11            CDCP2    chr1        54605318                     54605319                   0
TG26            CDCP2    chr1        54605318                     54605319                   0
TG06            CDCP2    chr1        54605318                     54605319                   0
TG07            CDCP2    chr1        54605318                     54605319                   0
TG21            CDCP2    chr1        54605318                     54605319                   0
TG04            CDCP2    chr1        54605318                     54605319                   0
TG05            CDCP2    chr1        54605318                     54605319                   0
TG03            CDCP2    chr1        54605318                     54605319                   0
TG22            CDCP2    chr1        54605318                     54605319                   0
TG20            DCDC2    chr6        24289081                     24289082                   0.13
TG07            FBXL21   chr5        135277239                    135277242                  0.99
TG21            SPTBN5   chr15       42158266                     42158269                   0.19
TG21            SPTBN5   chr15       42158266                     42158269                   0.19
TG05            TUT1     chr11       62359059                     62359061                   0.49

Table 37: List of variations coding for a frameshift change and introduction of a premature stop codon downstream, identified on GEMINI. Gene variations are listed in alphabetical order including the chromosome number, start and end position and minor allele frequency (MAF) according to the exome sequencing project database (ESP).

As DCDC2 was considered a strong candidate gene in the NSC cohort, Sanger sequencing was undertaken to validate the WES findings. This is described in section 4.9, along with details of the gene and the possible involvement of its encoded protein in the pathogenesis of NSC.

5.7.3.Whole Exome Sequencing detection in BA patients

The WES data generated from group 2 were analysed with all NSC and BA patients as one group, as explained in section 4.2.2. Through the stepwise filtering process that was applied in group 2, and despite the strong family history of consanguinity recorded in BA patients, no strong candidate gene variants were identified in the BA subgroup. Sequence depth and coverage along with quality scores were all comparable between the NSC and BA subgroups as described previously. Once patients with mutations in DCDC2 were excluded, none of the gene variants in the remaining group 2 patients (Tables 35-37) were considered significant, as described above. When the BA subgroup was filtered separately, there were overall 26 genes with variants in ≥2 patients (MUC4, TC2N, THAP7, NAV3, HNF4G, DMBT1, ENTPD2, GLI1, ANXA6, ATXN3L, BHLHB9, CCDC177, CCDC60, CCNB3, CDC16, COL6A1, COL6A3, CYB5R2, DDX26B, DMD, LAMB4, NXF3, PRKCH, RYR3, TTN, UBE4B). All but three (splice site) were non-synonymous coding variants. They were all classified as non-significant based on liver tissue expression and protein function, using gene databases such as GeneCards and OMIM.

The disease mutation for BA may lie in a coding region not captured by the current exome platform or in a non-coding region of the genome, or the mutation may cause disease through an alternate mechanism not detected by the methods employed in this thesis. BA may also be subject to genetic heterogeneity, which could not be detected in this small study group. Further work will be undertaken in the search for other strong candidate genes in the mutation-negative patients, which will be described in the discussion chapter.

5.8.Doublecortin domain containing 2 (DCDC2)

5.8.1.Introduction

In the process of searching for a strong candidate gene in NSC patients through WES, we identified mutations in DCDC2 (OMIM #605755), which could be implicated in the pathogenesis of the disease. The gene belongs to the Doublecortin gene family (DCX), named after the doublecortin gene. DCX (OMIM #300067), a gene located on Xq23, was initially described in 1998 and was associated with subcortical band heterotopia in females and lissencephaly in males (des Portes et al. 1998). In the DCX-repeat gene family 11 paralogs in human and in mouse have been described (Reiner et al. 2006).

DCDC2 was initially identified as a candidate gene for dyslexia (Meng et al. 2005). Its encoded protein, DCDC2, is highly expressed in the adult and foetal central nervous system, but also in other organs including the liver. DCDC2 contains two doublecortin domains previously described in DCX, which are microtubule-binding modifiers. Microtubules are formed through binding of alpha- and beta-tubulin dimers to a guanosine-5'-triphosphate (GTP) molecule, which then assemble into linear structures called protofilaments before forming mature microtubules. Microtubules are involved in cytoskeleton structure, cell movement and division, and intracellular transport, and are a key component of the internal structure of cilia. Microtubule-associated proteins (MAPs) have a regulatory role in microtubule function by binding onto the microtubules in a nucleotide-independent process (Kim et al. 2003). DCDC2, a known MAP, has the potential to interfere with tubulin binding and microtubule polymerisation, and subsequently ciliary structure (Kim et al. 2003). Ciliary proteins are synthesized in the cytoplasm and endoplasmic reticulum and transported within the cilium via the intraflagellar transport system (IFT), which is important for multiple ciliary functions. IFT, a transport system that is composed of twenty proteins and relies on kinesin-2 or dynein-2/1b, takes place along the microtubules, whose function is affected by MAPs such as DCDC2 (Massinen et al. 2011).

Cilia are structures present in eukaryotic cells enclosed by plasma membrane. They are divided into motile cilia (trachea, fallopian tubes) and primary cilia (also known as sensory cilia) such as the ones located in cholangiocytes (Satir and Christensen 2007). Each primary cilium consists of a microtubule based axoneme and a basal body centriole-derived microtubule centre, from which the axoneme is derived. The primary cilia have the basic structure of the 9+0 microtubule arrangements lacking the inner and outer dynein arms (IDA and ODA, respectively) and radial spokes and subsequently the ability to move spontaneously, i.e. non-motile.

The first cholangiocyte-located cilia were identified in murine cholangiocytes in 1963 (Grisham 1963). Their main function, through utilization of different sensory molecules, has been described to be threefold. Firstly, they detect changes in bile flow in the intraluminal space via the receptors polycystin 1 & 2; changes in bile flow cause cilia to bend and polycystins form a functional complex that allows calcium (Ca2+) to enter the cholangiocyte, affecting its function. Secondly, they act as sensors of bile composition by detecting changes in the concentration of biliary molecules via different mechanisms such as P2Y receptors and the presence of exosomes. P2Y receptors, which are expressed at the cholangiocyte apical membrane and cilia, are stimulated by specific nucleotide concentrations in bile and subsequently influence cell proliferation and secretion. Exosomes, small extracellular vesicles, are involved in intercellular communication. It has been shown that exosomes attach themselves onto cholangiocytic cilia and influence the extracellular signal-regulated kinase (ERK) signalling pathway and cell proliferation (Masyuk et al. 2010, Larusso and Masyuk 2011). Thirdly, cilia can also act as osmotic sensory structures. As bile flows through the biliary structures, its osmolality can alter due to absorption of bile acids and glucose or secretion of HCO3- and water. Transient receptor potential vanilloid type 4 (TRPV4) channels have been shown to regulate intracellular Ca2+ concentrations in response to changes in osmolality and biliary secretion and subsequently increase cholangiocyte proliferation (Gradilone et al. 2007).

As described above, cholangiocyte cilia have been shown to be multifunctional via various signalling modules. Microtubules are a key element of cilia, and cilia dysfunction may contribute to the pathogenesis of cholangiopathies. DCDC2 is a MAP and has recently been described to be localised in the ciliary axoneme and to interfere with signalling pathways (Schueler et al. 2015). Overexpression of dcdc2 has been shown to increase the length of cilia in rat hippocampal neurons and activate sonic hedgehog (Shh) signalling (Massinen et al. 2011). Subsequently, the most common alleles in a conserved purine-rich short tandem repeat (STR) (BV677278) in intron 2 of the gene have been shown to have an enhancing effect on gene expression and therefore on neuronal migration and brain development (Meng et al. 2011). Down-regulation of dcdc2 expression enhances Wnt signalling, consistent with a functional role in ciliary signalling: the transmission of signals from the cilia distal tip to the cytoplasm and nucleus via their basal body (Scholey and Anderson 2006). DCDC2 has been shown to interact with the proteins DVL1, 2 and 3, which act as regulators in the Wnt signalling pathway. Wnts are glycoproteins contributing to the regulation of cell development, migration and polarity along with neural patterning and organogenesis during embryonic development. The abnormal activation of Wnt signalling has been shown to increase fibrogenesis and liver fibrosis (Lu et al. 2014). To date three main Wnt pathways have been described: (i) the noncanonical, planar cell polarity (PCP) pathway regulating the integrity of cell cytoskeleton, involved in gene transcription, (ii) the canonical/β-catenin dependent pathway, involved in gene transcription, and (iii) the noncanonical/Ca2+ dependent pathway, which achieves its biological role through stimulation of intracellular calcium release (Komiya and Habas 2008).

Abnormalities in DCDC2 affecting Shh and Wnt signalling pathways may have an impact on cholangiocyte morphology and development of cholangiopathies similar to that described for neuronal cells (Massinen et al. 2011).

In this chapter the details of the DCDC2 mutations identified in our cohort are described. Following confirmation with Sanger sequencing, the DCDC2 protein expression was investigated in liver tissue via immunohistochemistry and potential structural effects on cilia, which could explain the association with the phenotype of NSC, were explored using transmission electron microscopy.

5.8.2.Variant confirmation through Sanger sequencing

As described in the previous chapter, the analysis of WES data identified frameshift and premature stop mutations in DCDC2 in six probands. Primers were designed and, after temperature gradient PCR for each primer, annealing temperatures were decided (Table 38). As the WES analysis showed a generally low coverage for exon 1 (<90 reads), all patients underwent Sanger sequencing for that exon to avoid a false negative result. Sanger sequencing confirmed in total three frameshift mutations and two single base substitutions leading to premature stops, as identified in WES (Table 39). The Sanger sequencing data were exported and aligned to a reference gene using Sequencher 4.8 (Gene Codes Corporation, Ann Arbor, MI, USA) and electropherograms were viewed using Chromas Lite 2.1.1 (Technelysium, South Brisbane, Australia). In more detail, patient TG14 was homozygous for a single base substitution in exon 5, which introduces a stop codon at position 217. Patient TG15 was homozygous for a single base substitution in exon 7, which introduces a stop codon at position 297. Patient TG20 was homozygous for a single base insertion in exon 7 leading to an amino acid substitution from serine to arginine and the introduction of a premature stop codon 4 codons downstream.

Patient TG04 was heterozygous for (i) a single base duplication in exon 4 leading to substitution of isoleucine with asparagine and a premature stop codon 20 codons later and (ii) the same nonsense mutation as patient TG15. Patient TG12 was heterozygous for (i) a two-base deletion in exon 1 leading to a change of serine to glutamine and a premature stop codon 72 codons downstream and (ii) the nonsense mutation in exon 7 seen in patient TG15. Lastly, patient TG11 was homozygous for a two-base deletion in exon 1 leading to a change of serine to glutamine and a premature stop codon 72 codons downstream (Figure 17).

Figure 17: Electropherogram appearance of cDNA Sanger sequencing in the region of the 2 base (GT) deletion in a control sample (top) and the c.123_124delGT change in proband TG11 (bottom). Electropherogram was viewed using Chromas Lite 2.1.1.

DNA from a sibling of TG04 with a similar phenotype underwent Sanger sequencing and was confirmed to have the same mutations in exons 4 and 7 observed in her sister: both had the same disease severity, requiring LT.


Exon       Strand     Primer sequence (5' to 3')  Annealing temperature for PCR (°C)
Exon 1     Forward 1  GGTAAAGGTGCAAGAAGAGAG       62
Exon 1     Reverse 3  TCACCTCCTTCAGGAAGAC         58
Exons 3&4  Forward 1  TCTTGAGAGCTTTGTTAAGG        56
Exons 3&4  Reverse 2  TTATGTTTGCCTAGTCACTC        56
Exon 5     Forward 1  CAGATCATCAGACTCACTG         56
Exon 5     Reverse 2  ATGTAGTCTTTCATAACTTCC       58
Exon 6     Forward 2  AAGTTTACTTGTAAGCATTGAT      54
Exon 6     Reverse 1  TGTCAGGTGTATATACAAGG        56
Exon 7     Forward 2  CCAGGAAGATACATTTGGAG        58
Exon 7     Reverse 1  AGTGTCCTCATGGCACAAC         58

Table 38: Primer sequences used for Sanger sequencing confirmation of DCDC2 mutations identified in WES analysis. The transcript used was ENST00000378454.
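As a rough cross-check on primer properties, the following minimal Python sketch computes GC content and a rule-of-thumb melting temperature for two of the primers in Table 38. It assumes the Wallace rule, Tm = 2(A+T) + 4(G+C); this rule is not stated in the thesis, where annealing temperatures were chosen empirically by temperature gradient PCR, so the estimate is only a starting point and need not agree with the tabulated values. The function names are illustrative.

```python
def wallace_tm(primer: str) -> int:
    """Rule-of-thumb melting temperature (°C): 2*(A+T) + 4*(G+C).

    Assumption: the Wallace rule, intended for short oligonucleotides;
    it is not the method used in the thesis to set annealing temperatures.
    """
    seq = primer.upper()
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters in primer: {sorted(unexpected)}")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def gc_percent(primer: str) -> float:
    """Percentage of G and C bases in the primer."""
    seq = primer.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


# Two primer sequences copied from Table 38; the keys are labels only.
primers = {
    "Exon 1 Forward 1": "GGTAAAGGTGCAAGAAGAGAG",
    "Exon 1 Reverse 3": "TCACCTCCTTCAGGAAGAC",
}

for label, seq in primers.items():
    print(f"{label}: {len(seq)} nt, GC {gc_percent(seq):.1f}%, "
          f"Wallace Tm {wallace_tm(seq)} °C")
```

An estimate of this kind is usually used only to centre the temperature gradient; the final annealing temperature is still decided from the gel.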

5.8.3.DCDC2 variant detection in Group 1 WES

Following the identification of disease causing mutations in DCDC2 in group 2, we reviewed the LOH data generated from SNP linkage analysis in the first stages of this project. No areas of LOH were identified in the DCDC2 area of interest (GRCh37, chr 6: 24,171,756-24,358,052). We then reviewed the WES data in group 1 in case variants affecting the gene of interest were present, which could have been excluded prior to the detection of DCDC2 due to stringent filtering. All variants found in exons 2, 5, 6 and 9 were known in dbSNP, all with MAF over 2%, apart from a single heterozygous variant identified in patient 9 (MAF 0.2%) (Table 40). Patients 8 and 13 were homozygous for the common variants identified. The physicochemical differences for the amino acid changes p.Pro152Ala, p.Ser239Ala and p.Ser221Gly were low to moderate, with Grantham scores of 27, 99 and 56, respectively. Grantham difference is a scoring system which details composition, polarity and volume differences between two amino acids (Grantham 1974).

Patient number  Exon number  Zygosity      Nucleotide change  Amino acid change    Chromosome position (start)  Coverage (reads)  Frequency (%)
TG14            Exon 5       Homozygous    c.649A>T           p.(Lys217*)          24291214                     3811              100
TG15            Exon 7       Homozygous    c.890T>A           p.(Leu297*)          24278308                     3114              100
TG20            Exon 6       Homozygous    c.757insG          p.(Ser253Argfs*4)    24289082                     1825              100
TG04            Exon 4       Heterozygous  c.529dupA          p.(Ile177Asnfs*20)   24301970                     2040              100
TG04            Exon 7       Heterozygous  c.890T>A           p.(Leu297*)          24278308                     3114              100
TG12            Exon 1       Heterozygous  c.123_124delGT     p.(Ser42Glnfs*72)    24357854                     231               100
TG12            Exon 7       Heterozygous  c.890T>A           p.(Leu297*)          24278308                     3114              99.45
TG11            Exon 1       Homozygous    c.123_124delGT     p.(Ser42Glnfs*72)    24357854                     231               100

Table 39: Mutations in DCDC2 identified by WES and confirmed by Sanger sequencing. Mutations are described based on NM_001195610.
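The relationship between the coding-DNA positions and the protein positions in Table 39 can be checked with simple arithmetic. The Python sketch below assumes that c. numbering starts at the A of the ATG start codon, so that position n falls in codon (n - 1) // 3 + 1; the helper name codon_of is illustrative and does not come from any variant-annotation library.

```python
from typing import Tuple


def codon_of(c_pos: int) -> Tuple[int, int]:
    """Map a 1-based coding-DNA position to (codon number, position within codon).

    Assumption: c.1 is the A of the ATG start codon, so codon 1 spans c.1 to c.3.
    """
    if c_pos < 1:
        raise ValueError("coding positions are 1-based")
    return (c_pos - 1) // 3 + 1, (c_pos - 1) % 3 + 1


# Coding positions taken from Table 39 (for c.123_124delGT the last deleted base is used).
for label, pos in [("c.649A>T", 649), ("c.890T>A", 890), ("c.757insG", 757),
                   ("c.529dupA", 529), ("c.123_124delGT", 124)]:
    codon, offset = codon_of(pos)
    print(f"{label}: codon {codon}, base {offset} of the codon")
```

The codon numbers obtained this way (217, 297, 253, 177 and 42) correspond to the first affected residues listed in the amino acid change column.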

In general, differences >100 are considered radical and can have a functional effect at the protein level. The variants were also checked with web-based prediction tools: SIFT, which scores the predicted effect of an amino acid change on protein function (http://sift.bii.a-star.edu.sg/), and MutationTaster (http://www.mutationtaster.org/) (Table 41). All amino acid changes were predicted to be tolerated at protein level by the respective web tools, with one exception: in patient 9, the single heterozygous missense variant with a MAF of 0.2%, which is considered rare, was predicted to have a deleterious effect at protein level in a homozygous or compound heterozygous state.
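The Grantham comparison in this section amounts to checking each score against the threshold of 100. The short Python sketch below uses only the three scores quoted in this section; the threshold constant and the function name are illustrative, and no Grantham distances are recomputed from amino acid properties here.

```python
# Threshold taken from the text: differences above 100 are considered radical.
RADICAL_THRESHOLD = 100

# Grantham scores as quoted in this section for the three missense changes.
grantham_scores = {
    "p.Pro152Ala": 27,
    "p.Ser239Ala": 99,
    "p.Ser221Gly": 56,
}


def is_radical(score: int, threshold: int = RADICAL_THRESHOLD) -> bool:
    """Return True when a Grantham score exceeds the 'radical' threshold."""
    return score > threshold


for change, score in grantham_scores.items():
    label = "radical" if is_radical(score) else "not radical"
    print(f"{change}: Grantham {score} ({label})")
```

None of the three changes exceeds the threshold, although p.Ser239Ala, at 99, lies just below it.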

Patient No  Exon    Nucleotide change  Amino acid change  dbSNP        MAF
9           Exon 6  c.715T>G           p.S239A            rs144695853  0.2%
8           Exon 9  c.1017C>T          p.V339V            rs9467075    11.2%
1           Exon 9  c.1017C>T          p.V339V            rs9467075    11.2%
1           Exon 6  c.661A>G           p.S221G            rs2274305    4.5%
10          Exon 9  c.1017C>T          p.V339V            rs9467075    11.2%
10          Exon 6  c.661A>G           p.S221G            rs2274305    4.5%
10          Exon 5  c.454C>G           p.P152A            rs33914824   3.3%
10          Exon 2  c.183C>T           p.A61A             rs33943110   3.1%
13          Exon 6  c.661A>G           p.S221G            rs2274305    4.5%

Table 40: Variants in DCDC2 identified in the 1st group of patients who underwent WES. Minor allele frequency (MAF) was checked for each variant and considered rare if <2%, when compared to the general population. Variations are homozygous for patients 8 and 13. Variants are described based on NM_001195610.
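The rarity criterion in the caption of Table 40 (MAF below 2%) can be expressed as a simple filter. The Python sketch below lists each distinct variant from Table 40 once with its reported MAF; the Variant type and the rare_variants function are illustrative names, not part of any analysis pipeline used in this project.

```python
from typing import List, NamedTuple


class Variant(NamedTuple):
    cdna: str           # nucleotide change as listed in Table 40
    dbsnp: str          # dbSNP identifier
    maf_percent: float  # minor allele frequency, in percent


# Distinct variants from Table 40 with their reported MAF values.
TABLE_40_VARIANTS = [
    Variant("c.715T>G", "rs144695853", 0.2),
    Variant("c.1017C>T", "rs9467075", 11.2),
    Variant("c.661A>G", "rs2274305", 4.5),
    Variant("c.454C>G", "rs33914824", 3.3),
    Variant("c.183C>T", "rs33943110", 3.1),
]


def rare_variants(variants: List[Variant], cutoff_percent: float = 2.0) -> List[Variant]:
    """Keep variants whose MAF is strictly below the cutoff (2% in Table 40)."""
    return [v for v in variants if v.maf_percent < cutoff_percent]


for v in rare_variants(TABLE_40_VARIANTS):
    print(f"{v.cdna} ({v.dbsnp}): MAF {v.maf_percent}%")
```

With the 2% cutoff only the variant found in patient 9 is retained, in line with the description in the text.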

5.8.4.Liver tissue immunohistochemical and ultrastructural studies

This work was undertaken by colleagues in the liver histopathology department, Institute of Liver Studies at King’s College Hospital.

5.8.4.1.METHODS

Sections at 4 μm of archival formalin-fixed, paraffin-embedded liver-biopsy or hepatectomy materials from probands (n = 5) and from patients with other cholestatic diseases were immunostained using a mouse anti-human DCDC2 monoclonal antibody (Santa Cruz / Insight Biotechnology, Wembley, UK; DCDC2 [C4], sc-166051, IgG1 isotype, recognizing C-terminus amino-acid residues 331-476; 1:50 dilution, 10 minutes pre-treatment at pH 9) with BondMax reagents and automated equipment (Leica Microsystems, Milton Keynes, UK). Liver material from probands (n = 3) primarily fixed in paraformaldehyde/glutaraldehyde, either at bedside on sampling or on retrieval from -80 °C storage, was post-fixed (OsO4) and embedded in resin. Ultrathin sections stained with uranyl acetate / lead citrate were evaluated by transmission electron microscopy (TEM).

5.8.4.2.RESULTS

After the identification of DCDC2 mutations in these seven patients we were interested to study the expression of the protein in liver tissue. Immunostaining as per 4.2.3.1 gave the following findings. Hepatobiliary marking for DCDC2 expression was examined in control patients with cholestasis (BA, alpha-1-antitrypsin deficiency, primary sclerosing cholangitis, primary biliary cirrhosis, Wilson disease) and in tissue from patients without cholestasis. Immunostaining for DCDC2 was largely restricted to cuboidal cholangiocytes (neocholangioles, interlobular bile ducts, small septal bile ducts) and was generally absent from columnar cholangiocytes (large septal bile ducts, hilar bile ducts, extrahepatic biliary tract with gallbladder) (Figure 18A). Some focal faint marking of columnar cholangiocytes was occasionally found. In probands no expression of DCDC2 was found at any site (Fig 18B), confirming the absence of the protein at a cellular level.

As the next step we investigated whether the lack of protein expression had an effect on ciliary structure. Section 4.2.1 describes how DCDC2 could affect cilia structure and signalling: we wanted, therefore, to investigate the presence and structure of cilia in the liver tissue by utilising TEM. This is a microscopy technique utilising electromagnetic lenses to focus electrons into a very thin beam, which subsequently travels through the liver tissue and produces black and white images depending on the electron density of the specific area. In a Wilson disease control tissue, TEM showed crystal lysosomal granulation (red arrow) typical of Wilson disease and lobular elements with steatosis. A bile ductule is pictured with its characteristic basement membrane (green arrow) (Fig 19A) and primary cilia structures in the centre (red arrow) (19B).

In proband TG15, TEM showed lobular elements with portal tract expansion. In the lobule there was cytoplasmic necrosis, dilatation of canalicular lumina with amorphous bile and blunting of villi (Fig 20A). Cholangiocellular apices showed blunting of microvilli and cytoplasmic blebbing into the canalicular lumen (Fig 20B). Cilia were not identified in any ductular structures.

The absence of cilia in cholangiocytes in conjunction with the lack of immunostaining by DCDC2 specific antibody in cholangiocytes in all six probands with available liver tissue supports the structural effect DCDC2 is likely to have on the formation of microtubules and subsequently cholangiocytic cilia.


Figure 18: (A): DCDC2 immunostaining of cuboidal cholangiocytes (black arrow) in control tissue 10x; (B): absence of DCDC2 immunostaining in a DCDC2 confirmed mutation patient (TG12) 10x; (C): H&E staining of interlobular portal tract with evident inflammation (black arrow) in the same patient in panel A 20x; (D): Cytokeratin 7 immunostaining of cholangiocytes (green arrow) in patient (TG12) 10x.


Figure 19: Transmission electron microscopy of hepatic lobule with portal tract and primary cilia in a control patient with Wilson disease. (A): A bile ductule with its characteristic basement membrane (green arrow) with (B) primary cilia structures in the centre (red arrow).


Figure 20: Transmission electron microscopy of a hepatic lobule with a portal tract in patient TG15. (A) There is evident portal tract expansion, with cytoplasmic blebbing (blue arrow) and (B) blunting of microvilli (red arrow), cytoplasmic blebbing and absence of ciliary structures.

5.8.5.Discussion

This work demonstrates that mutations in DCDC2 may be involved in the pathogenesis of NSC. Five different mutations (3 frameshift, 2 introducing premature stop codons) were identified in six patients through WES and subsequently in the sibling of one of the probands, bringing the total to 7 of the 27 cases investigated. So far variants have been found in exons 1, 4, 5, 6 and 7. All variants identified in our patients result in a premature stop codon, i.e. they are protein truncating mutations. All probands had a severe liver disease phenotype and all but one had either undergone LT or died while listed for LT.

DCDC2 expression has been shown to be present in embryonal neural tube, lung, kidney (tubular more than glomerular) and hepatobiliary tissue based on Ensembl (http://www.ensembl.org) and UniProtKB (http://www.uniprot.org), but it is abundant across systems in adult tissue. In our probands liver tissue immunohistochemistry demonstrated a lack of expression of DCDC2 in the biliary system when compared to other cholestatic liver disease controls (Wilson disease, BA, alpha-1-antitrypsin deficiency, primary sclerosing cholangitis, primary biliary cirrhosis) or normal tissue. In control tissue the expression of DCDC2 was restricted mainly to the cuboidal cholangiocytes in smaller biliary structures, suggesting a DCDC2 role in the evolutionary mechanism of small bile duct formation and their function as sensors of changes in bile flow, osmolality and composition (Tabibian et al. 2013). On TEM, the absence of primary cilia was notable in contrast to other cholestatic disorders. This absence would be expected to have a detrimental effect on cilia-based signalling and may explain the subsequent irregularities of the biliary system and the phenotype of neonatal onset cholangiopathy (Gunay-Aygun 2009). It has been previously shown that tubular cilia length and function is affected by changes in DCDC2 in murine models by modification in signalling pathways (Wnt) and by structural defects (IFT).

In a review on ciliopathies published in 2009 that utilized data generated from the Online Mendelian Inheritance in Man (OMIM) website (http://www.omim.org), 102 conditions were identified to be either known (n=14), likely (n=16) or possible ciliopathies (n=72) (Baker and Beales 2009). A large proportion of cilia related gene mutations have been associated with auditory, learning and olfactory, renal and skeletal disorders. In a study published this year, a homozygous point mutation in DCDC2 (c.1271A>C, p.Gln424Pro) was reported in a Tunisian family with non-syndromic autosomal recessive hearing loss (Grati et al. 2015). On immunofluorescence of rat inner ear neuroepithelial tissue, the isoform DCDC2a was found to localize to the kinocilia of sensory hair cells and the primary cilia of nonsensory supporting cells. DCDC2a was also distributed along the length of the kinocilium, a type of cilium on the apex of hair cells located in the sensory epithelium of the vertebrate inner ear, with increased density toward the tip. The authors also showed that DCDC2a-GFP overexpression in non-polarized COS7 cells induces the formation of long microtubule-based cytosolic cables and subsequently microtubule formation and stabilization (Grati et al. 2015). Meng et al. described an intronic change in DCDC2 as a possible cause for dyslexia and reading difficulties and showed via reverse transcription polymerase chain reaction (RT-PCR) that DCDC2 is expressed in the area of the brain involved with reading (Meng et al. 2005). Cilia also constitute the structure of olfactory sensory neurons (OSNs), which are part of the chemosensory interface between the environment and our brain, contributing to the pathophysiology of anosmia in Bardet-Biedl syndrome (Kulaga et al. 2004, Iannaccone et al. 2005). Cilia facilitate odour detection, which includes all necessary components from the initial stimuli to downstream olfactory signal transduction to the brain (Kuhlmann et al. 2014).

Nephronophthisis-related ciliopathies (NPHP-RC) have been associated with mutations in a variety of genes (Hildebrandt et al. 1997, Otto et al. 2002, Olbrich et al. 2003, Otto et al. 2005, Sayer et al. 2006, Valente et al. 2006, Attanasio et al. 2007, Delous et al. 2007, Otto et al. 2008). In a recent NGS study of 100 consanguineous cases with NPHP-RC a homozygous truncating mutation in DCDC2 was identified (as in TG14) in a single patient who also had liver disease. In the same paper, another 800 NPHP-RC families underwent high-throughput exon sequencing and a single patient with hepatic fibrosis but no renal involvement was found to have two compound heterozygous mutations: a frameshift (as in TG11) and a splice site mutation (c.349-2A>G; p.Val117Leufs*54). In their study no specific liver phenotype-genotype correlations were found, but the interaction between DCDC2 and Wnt signalling was confirmed in IMCD3 cell culture, as well as the localization of DCDC2 in the ciliary axoneme and mitotic spindle fibers (Schueler et al. 2015).

In our DCDC2 cohort, four patients underwent LT (one living related LT), one died while on the LT waiting list, and the sibling of TG04 died suddenly 2 years after LT of unknown cause. Patient TG12 is an overseas patient, who is followed up locally and the progression of his liver disease is not known. The disease phenotype was clearly that of a neonatal onset cholangiopathy, liver fibrosis and cholestasis in the absence of cystic renal changes. It is of significance that in this cohort there was early disease progression to end stage liver disease, LT and/or premature death. Only one of our DCDC2 probands had developed renal disease, the aetiology of which has not been clearly defined. All probands had undergone renal ultrasound scans and only two siblings (patient TG04 and her sister) had an isolated small cyst in the right kidney. Both siblings had normal kidney function and glomerular filtration rate (GFR). It is possible to speculate that in our cohort kidney involvement could be a somewhat belated manifestation, perhaps in adult age. Further follow up in patients with DCDC2 mutations is necessary in order to establish potential other organ manifestations.

The identification of a genetic defect related to cilia function primarily affecting the biliary system could enable us to understand better the pathogenesis of NSC and investigate new potential therapeutic routes. To date there have only been symptomatic management options (Cnossen and Drenth 2014), with more recent studies looking into medicinal agents such as somatostatin analogues and their effect in reducing liver volume (Patel et al. 2009, van Keimpema et al. 2009, Hogan et al. 2012).

In conclusion we identified mutations in DCDC2 linking the distinct phenotype of NSC with cilia related disorders. Through a stepwise approach and the utilization of NGS we were able to find the candidate gene in a significant proportion of our NSC patients. The significance of this finding was strengthened by the lack of DCDC2 immunostaining and absence of cilia at cholangiocyte level.

6. Discussion

Neonatal sclerosing cholangitis, which is a focus of interest in our group, is a rare but severe form of neonatal onset cholangiopathy. A large proportion of NSC patients develop end-stage liver disease and require LT during childhood. The similarities in clinical presentation with the much commoner paediatric liver disease, BA, have been suggestive of a possible common mechanism of pathogenesis. Equally importantly, the identification of a proportion of patients with NSC being of consanguineous origin supports the theory of an autosomal recessive pattern of inheritance.

The aim of this project was to identify the causative gene for NSC. The research was approached in a stepwise manner by initially utilising SNP microarray assay and homozygosity mapping. This approach generated areas of homozygosity that were too numerous and large, and contained too many candidate genes. It became apparent that conventional sequencing was not feasible at this stage. Therefore WES was undertaken in two groups of patients. A subgroup of KS patients was also investigated for genetic confirmation and description of their liver disease.

As no causal gene for KS was available at the initial stages of this project the KS patients were studied via SNP linkage analysis, with no conclusive results. Once MLL2 became a known gene associated with KS, the subgroup described in section 4.1 underwent genetic confirmation for mutations in MLL2 plus delineation of their liver disease. Mutations in MLL2 (3 truncating, 1 splice site) were identified in all four probands including one novel variant. The KS subgroup had been included in our analysis due to the nature of their liver disease identified as BA (2), NSC (1) and high GGT cholestasis with NRH (1). As discussed in 4.1.5 the gene’s encoded protein MLL2 is part of a large family of histone methyltransferases that methylates the Lys-4 position of histone H3 and belongs to the Activating Signal Cointegrator-2 complex (ASCOM), which has been shown to be a transcriptional regulator of the nuclear Farnesoid X receptor (FXR). FXR is a regulator of several genes involved in bile acid homeostasis (Kim et al. 2009). Our KS patients though had normal bile acid synthesis profile when tested.

In an experimental model, the generation of MLL2 knockdown cell lines demonstrated altered expression (mostly down regulation) in a variety of genes, which have a role in cell development and cholestasis. Down-regulated genes involved in cholestasis include adhesion receptors and co-receptors such as tight junction protein 2 (TJP2), actin cytoskeleton organisers and/or regulators, cytoskeleton constituents, transcription factors and regulators, signal transduction, phosphatases, transport (SLC16A5, SLC21A11) and many others (Issaeva et al. 2007). A possible explanation could be that, as shown in the knockdown mouse, MLL2 has an impact on gene expression involved in cholestasis, with subsequent bile duct damage, cholangiocyte distortion and development of cholangiopathy. This hypothesis emphasizes the need to investigate further a potential role for MLL2 in liver disease.

Extensive liver investigations including cholangiography and possibly a liver biopsy in KS patients with liver involvement are appropriate alongside genetic confirmation, as currently there is only a small group of patients with mutations in MLL2 with well-defined liver disease. This will enhance our understanding of KS and its possible association with neonatal cholangiopathies and stimulate further research in this field.

Attempts to identify genetic loci for NSC and BA have been made in the past. Previous work undertaken at King’s College Hospital, using SNP linkage analysis in an index case with PWS and NSC, confirmed the presence of uniparental disomy with isodisomy on the long arm of chromosome 15. This raised the possibility that the causative locus for NSC could be on chromosome 15. Subsequently, when four families with NSC, including the case with PWS, were selected for microsatellite analysis, two strong candidate genes within this region of chromosome 15, TJP1 and ATP10A, were identified, but when sequenced no mutations were found. The work described in this thesis is the first to confirm a genetic correlation in the pathogenesis of NSC, which makes it of value to researchers and clinicians.

Variants in DCDC2 were identified through WES and confirmed by Sanger sequencing in seven patients, giving an insight into the possible pathogenic mechanism of NSC. The mutations identified were either frameshifts introducing a downstream stop codon or nonsense mutations, i.e. all had a protein truncating effect. The significance of this new finding was strengthened by the lack of DCDC2 immunostaining on liver tissue and absence of cilia in cholangiocytes on TEM.

The encoded protein DCDC2 is highly expressed in a variety of organs including the liver. DCDC2 contains two doublecortin domains, previously described in DCX, which are microtubule-binding modifiers. Microtubules are formed by binding of alpha- and beta-tubulin dimers to a GTP molecule, which then assemble into linear structures called protofilaments before forming mature microtubules. Microtubules are involved in the cytoskeleton structure, cell movement and division, and intracellular transport, as well as being key components of the internal structure of cilia. MAPs have a regulatory role in microtubule function by binding to microtubules. DCDC2, a known MAP, has the potential to interfere with tubulin binding and microtubule polymerisation, and subsequently axoneme and ciliary structure (Kim et al. 2003). Ciliary proteins are synthesized in the cytoplasm and endoplasmic reticulum and transported within the cilium via the intraflagellar transport system (IFT), functioning along the microtubules (Figure 21). MAPs such as DCDC2 have a direct effect on IFT function (Massinen et al. 2011).


Figure 21: Diagram of the structure of a primary cilium. (A) Primary cilia extend through the apical membrane of a cholangiocyte. They consist of a group of microtubules, forming the axoneme, which is anchored in the cell by the basal body. IFTs and MAPs, such as DCDC2 (red arrow), are located at the tip and across the length of the axoneme. Various protein complexes are positioned at the ciliary tip regulating its function (blue arrow). (B) Cross sectional appearance of the axoneme with the 9+0 arrangement of microtubules.

Primary cilia were identified in murine cholangiocytes five decades ago (Grisham 1963). As previously described their main function is on three pathways; mechano-, chemo- and osmosensory. DCDC2 has been recently described to be localised in the ciliary axoneme and to interfere with signalling pathways such as Wnt and Shh. DCDC2 has been shown to interact with the proteins DVL1, 2 and 3, which act as regulators in the Wnt signalling pathway and interfere with cell development, migration, polarity and organogenesis during embryonic development. The absence of DCDC2 would be expected to have a detrimental effect on cilia-based signalling and may explain the subsequent irregularities of the biliary system and the phenotype of neonatal onset cholangiopathy (Gunay-Aygun 2009). It has been previously shown that microtubular length and function is affected by changes in DCDC2 in murine models by modification in signalling pathways (Wnt) and by structural defects (IFT). This point will be analysed below.

Cholangiocytes are exposed to hydrophobic bile via their apical membrane but have developed essential self-protective mechanisms. Disruptions in these mechanisms can lead to cellular inflammation and cholangiopathy. The ciliary mechanism, consisting of sensory proteins, IFTs, MAPs and microtubules, along with signalling pathways, forms a circuit of bile regulation, cholangiocyte response and biliary system preservation. The absence or malfunction of any component of the ciliary mechanism, such as DCDC2, could influence the response to potentially cytotoxic bile. Bile salts are synthesized from cholesterol in hepatocytes and are conjugated with taurine or glycine (Russell 2009). The newly synthesized bile salts are mixed with the ones from the portal system and are secreted across the canalicular membrane to the bile canaliculi, down the large bile ducts and subsequently to the small bowel. At least 95% of bile salts are reabsorbed in the terminal ileum and are transported back to the liver via the portal system. The uptake from the sinusoidal space into hepatocytes relies on the sodium taurocholate cotransporting polypeptide (NTCP) expressed at the basolateral surface. Organic anion transporting polypeptides (OATPs) are also located at the basolateral surface and function in a sodium independent way (Dawson et al. 2009). Bile salts are then transported across the hepatocytes back into the canaliculi via the bile salt export pump (BSEP), completing the enterohepatic circulation. The formation of bile in the canalicular space is an osmotically dependent process, which is followed by influx of water to maintain a stable canalicular environment (Trauner and Boyer 2003). Water can reach the canaliculi through TJs, located between cells, and through specific channels located in hepatocytes called aquaporins. Once the bile enters the biliary system it undergoes modification by the cholangiocytes, the lining cells of the biliary tree. While bile is in the biliary system it is enriched with bicarbonate via the apical membrane chloride-bicarbonate exchange molecule AE2 to reduce its concentration of toxic compounds.
AE2 is closely linked to changes through chloride channels such as the cystic fibrosis transmembrane conductance regulator and sodium-proton exchanger 2, which work in concert with the apical membrane’s aquaporin 1 (Bogert and LaRusso 2007). Large quantities of bicarbonate (HCO3-) are excreted therefore to form a protective layer in the outer surface of the apical membrane mainly of the larger bile ducts and less of the smaller ductules. Lack of a HCO3- protective layer could potentially leave the small intrahepatic ductules exposed to an imbalanced microenvironment with detrimental effects on the cholangiocyte’s integrity and possibly lead to the destruction of apical membrane cellular structures such as the cilia (Beuers et al. 2010).

A possible pathogenic mechanism in DCDC2 probands could be through abnormalities in signalling pathways. The absence of DCDC2 from the microtubules could lead to abnormal activation or impairment of the Wnt/β-catenin and/or Shh signalling pathways in response to stimuli targeting cholangiocytes. A malfunction in these pathways could affect the various chemo-, osmo- or mechano-sensory molecules and their ability to detect and respond to changes in bile flow and constitution. An inappropriate or “sluggish” response to disturbances in the canalicular microenvironment could affect apical membrane transporters, involved in HCO3-, Ca2+ or water movement, which in turn could lead to the formation of hydrophobic “toxic” bile. The altered bile constitution could cause cholangiocyte irritation starting from direct contact with the apical membrane, inflammation, proliferation and eventually fibrogenesis. Once cell protective mechanisms have been disabled via the lack of appropriate signalling and subsequent inadequate transport via the IFT, the intracellular environment would inevitably alter; cytoskeleton structures which are supported by microtubules, already disabled by defects in DCDC2, could be affected, leading to blebbing of cholangiocytes, a feature seen on liver TEM from DCDC2 probands.

Another hypothesis could be that NSC is caused by structural defects of the axoneme, which is linked to MAPs, such as DCDC2. The formation of the axoneme is based on the careful positioning of the tubulin dimers in an orderly linear fashion extending from the region of the basal body, which is called mother centriole, to the tip of the cilia (Figure 21). An encapsulation of the distal tip of the centriole with a membrane vesicle facilitates the further positioning of microtubules and subsequently the completion of cilia formation. A distortion of the axoneme caused by the lack of DCDC2 regulatory effect could lead to cilia shortening with an effect on IFT and cholangiocytes’ response to canalicular stimuli via the apical membrane receptors. The altered ciliary response to changes in the canaliculi such as abnormal interaction with extracellular ADP can increase intracellular cAMP levels leading to poor bile alkalisation and further bile duct damage and cholangiopathy.

A potential therapeutic mechanism could be the preservation of the HCO3- protective layer with the utilisation of ursodeoxycholic acid (UDCA). UDCA decreases cholesterol concentration in bile, has a choleretic effect and also increases the hydrophilic bile acid pool. This effect could help in maintaining a stable bile consistency and minimize cholangiocyte inflammation and fibrosis.

Another potential treatment option would be through preservation of the normal activation of the Wnt/β-catenin signalling pathway, which has been shown to limit liver fibrosis. Molecules that could regulate the levels of β-catenin and its associated pathway could sustain a dynamic balance between the canalicular stimuli and intracellular space response at cholangiocyte level.

In the literature a large proportion of cilia related gene mutations have been associated with disorders of different organs, particularly the kidneys. The most common hepatorenal fibrocystic diseases, generally known as ciliopathies, include autosomal dominant and recessive polycystic kidney disease (ADPKD & ARPKD), Joubert, Bardet-Biedl, Meckel-Gruber and oro-facial-digital syndromes. The liver manifestations have been described in the form of congenital hepatic fibrosis (CHF), Caroli’s disease and syndrome (Caroli 1968, Caroli 1968a) and polycystic liver disease (Tahvanainen et al. 2005). The identification of DCDC2 mutations in an NSC group, with no known renal involvement apart from one patient, could be the start of investigating in detail a liver-specific cilia mediated pathogenic mechanism, like the ones mentioned above. This knowledge could widen our horizons in understanding cholangiopathies in children. A distinct type of mainly liver specific ciliopathy with minimal renal involvement could be distinguished based on the findings of this project.

Various groups have investigated the pathogenesis of BA as the disease is more prevalent than NSC and the most common indication for LT in children. In a previous study, liver tissue from patients with obliterative cholangiopathy (BA) was found to be negative on immunohistochemistry screening for fibrocystin, a membrane receptor-like protein localised in primary cilia, encoded by the PKHD1 gene. This finding was observed in liver tissue from BA patients irrespective of the presence of renal cysts, which have been linked to PKHD1 gene but not in normal liver or other cholestatic disease controls (Wilson disease, autoimmune hepatitis, cystic fibrosis and tyrosinaemia). Sequencing of this gene did not reveal any disease causing mutations (Hartley et al. 2011). The authors explained the lack of fibrocystin on immunostaining as a possible effect of advanced liver disease. In a report from 2012 liver tissue from nine patients with BA (5 syndromic BA) was compared with tissue from controls (3 healthy individuals, 6 other cholestatic disorders) in regards to cholangiocyte cilia abnormalities. The authors used CK19 as a cholangiocyte marker and α-tubulin as a cilia marker for immunostaining purposes as well as 3 dimensional confocal computed reconstructions. They showed that cilia were shorter, fewer in numbers and abnormal in angulation in all BA patients compared to controls (Chu et al. 2012). They suggested that there might be a common disease mechanism between the two BA subgroups (syndromic and non-syndromic) but no functional conclusions between cilia and BA were drawn.

The presence of cilia in extrahepatic ducts was investigated further in liver tissue from BA patients compared with healthy controls, and in bile duct-ligated and rhesus rotavirus (RRV)-infected mice. On immunostaining, cilia numbers were significantly reduced in the bile ducts of BA patients as well as in the ducts of RRV-infected mice. The authors suggested that the association between BA and ciliopathies might be bidirectional: the reduction in cilia may be a secondary effect of bile duct obstruction, bile pooling and cholestasis, but a primary cilia deficiency can also cause loss of cholangiocyte polarity and epithelial cell integrity, leading to inflammation and fibrosis (Karjoo et al. 2013).

Previous studies in BA have also focused on the inflammatory milieu observed on liver histology, looking in particular at TNF-α (Overgaard et al. 2009), or on bile duct development transcription factors such as SOX9 (Masyuk et al. 2010), with no confirmation of pathogenic mechanisms. GWAS in two groups of patients of Asian origin identified a SNP on chromosome 10q as a susceptibility locus, located upstream of the XPNPEP1 and ADD3 genes (Seelow et al. 2009, Gazave et al. 2013). A recent study in a non-Asian population identified a strong signal at a SNP upstream of ADD3, with a reduction in tissue expression compared to controls. No further pathogenic associations were made, though (Findley and Koval 2009). In Chinese BA patients with the previously reported risk haplotype, a significantly lower expression of ADD3 was found, suggesting that the risk haplotype may contribute to BA by decreasing ADD3 expression (Chu et al. 2012). ADD3 encodes an F-actin-binding protein called adducin 3. The protein is involved in the formation of a skeletal protein network, contributes to cell-cell contact and was recently found to be expressed in the biliary epithelium (Tsai et al. 2014). In a paper presented at the annual Digestive Disease Week meeting last year, the authors identified the ABCB4 variant p.A934T, in a subgroup of BA patients who underwent WES, as a possible genetic modifier in BA (Mezina et al. 2014). In the last 10 years WES has been a powerful tool for identifying disease-causing gene mutations in small sets of families or in large-scale population studies, and may be the way forward in the investigation of BA’s genetic locus. In the work presented in this thesis, no gene variants were identified in either of the two previously suggested candidates, leaving the question of the BA gene locus still unanswered.

An interesting finding was that, of the seven DCDC2-mutated probands identified, five were of Greek ancestry. The patients originated from different parts of Greece and, according to the parents, there was no family history of consanguinity. Such clustering has been described in several genetic studies undertaken in small isolated communities. One could hypothesize, on the other hand, that this is an epiphenomenon of the patient referral pathway and of historical collaborations with referral centres in Greece. An argument against this hypothesis is that, so far, a similar pattern has not been witnessed in other population groups studied at King’s, such as those from the Middle East.

It would certainly be of interest to screen the remaining Greek patients with neonatal-onset cholangiopathy, obliterative or not, in whom a genetic diagnosis has not been made and who were not included in this project. This screening would establish whether other affected individuals with the same disease phenotype have mutations in DCDC2 and whether mutations are equally present across larger NSC subgroups of different origin.

The work presented here represents a first encouraging step towards further understanding the pathophysiology of NSC. Nevertheless, the fact remains that genetic confirmation has not been established in the majority of NSC patients, or in any of the BA patients, included in this project. The presence of genetic heterogeneity in the pathogenesis of early-onset cholangiopathies can be safely concluded through the results described. Another explanation for not finding a common genetic aetiology amongst all patients studied could, however, be that relevant genes may lie in regions not captured by the current exome workflow.

Future steps building on the findings described in this thesis would be to extend screening for DCDC2 mutations in (i) our larger existing cohort of NSC patients, (ii) patients with neonatal onset cholangiopathy of Greek origin and (iii) in all newly presented NSC patients for diagnostic purposes.

The commercial availability of a monoclonal antibody against DCDC2 and the confirmation of DCDC2 absence by liver immunohistochemistry would enable us to incorporate this technique in the diagnostic armamentarium and enhance the screening process in NSC-like patients, prior to obtaining genetic confirmation.

Long-term follow-up of the identified DCDC2-mutated patients will enable us to collect valuable information on the clinical course of the disease, on specific genotype-phenotype associations and on any other organ involvement.

Studies of the structure and ciliary location of DCDC2, along with its interaction with signalling pathways, could facilitate further therapeutic approaches for this severe form of liver disease.

The WES data, which became available from both groups 1 and 2 of patients, will be combined and further analysed by incorporating the information obtained from the SNP linkage analysis, in order to apply more advanced bioinformatic tools and possibly identify other strong candidate genes in the existing patient pool.
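As an illustration only, the combination of WES variant calls with SNP-derived regions of homozygosity could be sketched as follows. This is a minimal sketch under stated assumptions and not the pipeline used in this project: the `Region` and `Variant` structures, the function names, the gene labels and all coordinates are hypothetical placeholders, and real data would first need to be parsed from the variant and genotype files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A homozygous region from SNP mapping (hypothetical structure)."""
    chrom: str
    start: int  # inclusive
    end: int    # inclusive


@dataclass(frozen=True)
class Variant:
    """A WES variant call reduced to the fields needed here (hypothetical)."""
    chrom: str
    pos: int
    gene: str
    genotype: str  # "hom" or "het"


def in_any_region(variant, regions):
    """True if the variant position lies inside at least one region."""
    return any(
        r.chrom == variant.chrom and r.start <= variant.pos <= r.end
        for r in regions
    )


def candidate_genes(variants, regions):
    """Genes carrying at least one homozygous variant inside a homozygous region."""
    genes = {
        v.gene
        for v in variants
        if v.genotype == "hom" and in_any_region(v, regions)
    }
    return sorted(genes)


# Toy data: placeholder gene labels and coordinates, not patient results.
regions = [Region("6", 24_000_000, 25_000_000)]
variants = [
    Variant("6", 24_200_000, "GENE_A", "hom"),  # homozygous, inside the region
    Variant("6", 24_300_000, "GENE_B", "het"),  # inside the region but heterozygous
    Variant("2", 1_000_000, "GENE_C", "hom"),   # homozygous but outside any region
]
print(candidate_genes(variants, regions))
```

A filter of this kind only narrows the list of candidates under a recessive model; it would miss compound heterozygous variants and any gene lying outside the captured exome.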

REFERENCES

Abdel-Salam, G. M., H. H. Afifi, M. M. Eid, T. H. El-Badry and N. Kholoussi (2011). "Ectodermal abnormalities in patients with Kabuki syndrome." Pediatr Dermatol 28: 507-511.
Akin Sari, B., K. Karaer, S. Bodur and A. S. Soysal (2008). "Case report: autistic disorder in Kabuki syndrome." J Autism Dev Disord 38: 198-201.
Allotey, J., F. Lacaille, M. M. Lees, S. Strautnieks, R. J. Thompson and M. Davenport (2008). "Congenital bile duct anomalies (biliary atresia) and chromosome 22 aneuploidy." J Pediatr Surg 43: 1736-1740.
Amedee-Manesme, O., O. Bernard, F. Brunelle, M. Hadchouel, C. Polonovski, J. J. Baudon, P. Beguet and D. Alagille (1987). "Sclerosing cholangitis with neonatal onset." J Pediatr 111: 225-229.
Ananthanarayanan, M., Y. Li, S. Surapureddi, N. Balasubramaniyan, J. Ahn, J. A. Goldstein and F. J. Suchy (2011). "Histone H3K4 trimethylation by MLL3 as part of ASCOM complex is critical for NR activation of bile acid transporter genes and is downregulated in cholestasis." Am J Physiol Gastrointest Liver Physiol 300: G771-781.
Armstrong, L., A. Abd El Moneim, K. Aleck, D. J. Aughton, C. Baumann, S. R. Braddock, G. Gillessen-Kaesbach, J. M. Graham, Jr., T. A. Grebe, K. W. Gripp, B. D. Hall, R. Hennekam, A. Hunter, K. Keppler-Noreuil, D. Lacombe, A. E. Lin, J. E. Ming, N. M. Kokitsu-Nakata, S. M. Nikkel, N. Philip, A. Raas-Rothschild, A. Sommer, A. Verloes, C. Walter, D. Wieczorek, M. S. Williams, E. Zackai and J. E. Allanson (2005). "Further delineation of Kabuki syndrome in 48 well-defined new individuals." Am J Med Genet A 132A: 265-272.
Atar, M., W. Lee and D. O'Donnell (2006). "Kabuki syndrome: oral and general features seen in a 2-year-old Chinese boy." Int J Paediatr Dent 16: 222-226.
Attanasio, A., V. Argiriadou, G. Sandri and M. Diomedi (2007). "An anorexic woman with convulsive loss of consciousness. Syncope or epileptic fits?" Int J Cardiol 116: e34-38.
Baala, L., S. Hadj-Rabia, D. Hamel-Teillac, M. Hadchouel, C. Prost, S. M. Leal, E. Jacquemin, A. Sefiani, Y. De Prost, G. Courtois, A. Munnich, S. Lyonnet and P. Vabres (2002). "Homozygosity mapping of a locus for a novel syndromic ichthyosis to chromosome 3q27-q28." J Invest Dermatol 119: 70-76.
Baker, A. J., B. Portmann, D. Westaby, M. Wilkinson, J. Karani and A. P. Mowat (1993). "Neonatal sclerosing cholangitis in two siblings: a category of progressive intrahepatic cholestasis." J Pediatr Gastroenterol Nutr 17: 317-322.
Baker, K. and P. L. Beales (2009). "Making sense of cilia in disease: the human ciliopathies." Am J Med Genet C Semin Med Genet 151C: 281-295.
Bamshad, M. J., S. B. Ng, A. W. Bigham, H. K. Tabor, M. J. Emond, D. A. Nickerson and J. Shendure (2011). "Exome sequencing as a tool for Mendelian disease gene discovery." Nat Rev Genet 12: 745-755.
Banka, S., R. Veeramachaneni, W. Reardon, E. Howard, S. Bunstone, N. Ragge, M. J. Parker, Y. J. Crow, B. Kerr, H. Kingston, K. Metcalfe, K. Chandler, A. Magee, F. Stewart, V. P. McConnell, D. E. Donnelly, S. Berland, G. Houge, J. E. Morton, C. Oley, N. Revencu, S. M. Park, S. J. Davies, A. E. Fry, S. A. Lynch, H. Gill, S. Schweiger, W. W. Lam, J. Tolmie, S. N. Mohammed, E. Hobson, A. Smith, M. Blyth, C. Bennett, P. C. Vasudevan, S. Garcia-Minaur, A. Henderson, J. Goodship, M. J. Wright, R. Fisher, R. Gibbons, S. M. Price, C. d. S. D, I. K. Temple, A. L. Collins, K. Lachlan, F. Elmslie, M. McEntagart, B. Castle, J. Clayton-Smith, G. C. Black and D. Donnai (2012). "How genetically heterogeneous is Kabuki syndrome?: MLL2 testing in 116 patients, review and analyses of mutation and phenotypic spectrum." Eur J Hum Genet 20: 381-388.
Ben-Omran, T. and A. S. Teebi (2005). "Structural central nervous system (CNS) anomalies in Kabuki syndrome." Am J Med Genet A 137: 100-103.
Bereket, A., S. Turan, G. Alper, S. Comu, H. Alpay and F. Akalin (2001). "Two patients with Kabuki syndrome presenting with endocrine problems." J Pediatr Endocrinol Metab 14: 215-220.
Betard, C., A. Rasquin-Weber, C. Brewer, E. Drouin, S. Clark, A. Verner, C. Darmond-Zwaig, J. Fortin, J. Mercier, P. Chagnon, T. M. Fujiwara, K. Morgan, A. Richter, T. J. Hudson and G. A. Mitchell (2000). "Localization of a recessive gene for North American Indian childhood cirrhosis to chromosome region 16q22-and identification of a shared haplotype." Am J Hum Genet 67: 222-228.
Beuers, U., S. Hohenester, L. J. de Buy Wenniger, A. E. Kremer, P. L. Jansen and R. P. Elferink (2010). "The biliary HCO(3)(-) umbrella: a unifying hypothesis on pathogenetic and therapeutic aspects of fibrosing cholangiopathies." Hepatology 52: 1489-1496.
Bogershausen, N. and B. Wollnik (2013). "Unmasking Kabuki syndrome." Clin Genet 83: 201-211.
Bogert, P. T. and N. F. LaRusso (2007). "Cholangiocyte biology." Curr Opin Gastroenterol 23: 299-305.
Bull, L. N., M. J. van Eijk, L. Pawlikowska, J. A. DeYoung, J. A. Juijn, M. Liao, L. W. Klomp, N. Lomri, R. Berger, B. F. Scharschmidt, A. S. Knisely, R. H. Houwen and N. B. Freimer (1998). "A gene encoding a P-type ATPase mutated in two forms of hereditary cholestasis." Nat Genet 18: 219-224.
Caroli, J. (1968). "Diseases of intrahepatic bile ducts." Isr J Med Sci 4: 21-35.
Caroli, J. (1968a). "[Intrahepatic bile duct diseases]." Rev Med Chir Mal Foie 43: 211-230.
Casado, A. I., J. Ruiz, J. Oro, C. Martinez, I. Fernandez and P. Oliva (2004). "Anaesthetic management in a case of Kabuki syndrome." Eur J Anaesthesiol 21: 162-163.
Chagnon, P., J. Michaud, G. Mitchell, J. Mercier, J. F. Marion, E. Drouin, A. Rasquin-Weber, T. J. Hudson and A. Richter (2002). "A missense mutation (R565W) in cirhin (FLJ14728) in North American Indian childhood cirrhosis." Am J Hum Genet 71: 1443-1449.
Chu, A. S., P. A. Russo and R. G. Wells (2012). "Cholangiocyte cilia are abnormal in syndromic and non-syndromic biliary atresia." Mod Pathol 25: 751-757.
Chu, D. C., S. C. Finley, D. W. Young and V. K. Proud (1997). "CNS malformation in a child with Kabuki (Niikawa-Kuroki) syndrome: report and review." Am J Med Genet 72: 205-209.
Ciprero, K. L., J. Clayton-Smith, D. Donnai, R. A. Zimmerman, E. H. Zackai and J. E. Ming (2005). "Symptomatic Chiari I malformation in Kabuki syndrome." Am J Med Genet A 132A: 273-275.
Cleary SP, Jeck WR, Zhao X, Chen K, Selitsky SR, Savich GL, Tan TX, Wu MC, G. Getz, Lawrence MS, Parker JS, Li J, Powers S, Kim H, Fischer S, Guindi M, Ghanekar A and C. DY (2013). "Identification of driver genes in hepatocellular carcinoma by exome sequencing." Hepatology [epub ahead of print].
Cnossen, W. R. and J. P. Drenth (2014). "Polycystic liver disease: an overview of pathogenesis, clinical manifestations and management." Orphanet J Rare Dis 9: 69.
Cools, F. and J. Jaeken (1997). "Hardikar syndrome: a new syndrome with cleft lip/palate, pigmentary retinopathy and cholestasis." Am J Med Genet 71: 472-474.
Courcet, J. B., L. Faivre, C. Michot, A. Burguet, S. Perez-Martin, E. Alix, J. Amiel, C. Baumann, M. P. Cordier, V. Cormier-Daire, M. A. Delrue, B. Gilbert-Dussardier, A. Goldenberg, M. L. Jacquemont, A. Jaquette, H. Kayirangwa, D. Lacombe, M. Le Merrer, A. Toutain, S. Odent, A. Moncla, A. Pelet, N. Philip, L. Pinson, S. Poisson, Q. S. Kim-Han le, J. Roume, E. Sanchez, M. Willems, M. Till, C. Vincent-Delorme, C. Mousson, S. Vinault, C. Binquet, F. Huet, P. Sarda, R. Salomon, S. Lyonnet, D. Sanlaville and D. Genevieve (2013). "Clinical and molecular spectrum of renal malformations in kabuki syndrome." J Pediatr 163: 742-746.
Davenport, M., N. Kerkar, G. Mieli-Vergani, A. P. Mowat and E. R. Howard (1997). "Biliary atresia: the King's College Hospital experience (1974-1995)." J Pediatr Surg 32: 479-485.
Davenport, M., E. Ong, K. Sharif, N. Alizai, P. McClean, N. Hadzic and D. A. Kelly (2011). "Biliary atresia in England and Wales: results of centralization and new benchmark." J Pediatr Surg 46: 1689-1694.
Dawson, P. A., T. Lan and A. Rao (2009). "Bile acid transporters." J Lipid Res 50: 2340-2357.
de Vree, J. M., E. Jacquemin, E. Sturm, D. Cresteil, P. J. Bosma, J. Aten, J. F. Deleuze, M. Desrochers, M. Burdelski, O. Bernard, R. P. Oude Elferink and M. Hadchouel (1998). "Mutations in the MDR3 gene cause progressive familial intrahepatic cholestasis." Proc Natl Acad Sci U S A 95: 282-287.
Delous, M., L. Baala, R. Salomon, C. Laclef, J. Vierkotten, K. Tory, C. Golzio, T. Lacoste, L. Besse, C. Ozilou, I. Moutkine, N. E. Hellman, I. Anselme, F. Silbermann, C. Vesque, C. Gerhardt, E. Rattenberry, M. T. Wolf, M. C. Gubler, J. Martinovic, F. Encha-Razavi, N. Boddaert, M. Gonzales, M. A. Macher, H. Nivet, G. Champion, J. P. Bertheleme, P. Niaudet, F. McDonald, F. Hildebrandt, C. A. Johnson, M. Vekemans, C. Antignac, U. Ruther, S. Schneider-Maunoury, T. Attie-Bitach and S. Saunier (2007). "The ciliary gene RPGRIP1L is mutated in cerebello-oculo-renal syndrome (Joubert syndrome type B) and Meckel syndrome." Nat Genet 39: 875-881.
Dentici, M. L., A. Di Pede, F. R. Lepri, M. Gnazzo, M. H. Lombardi, C. Auriti, S. Petrocchi, E. Pisaneschi, E. Bellacchio, R. Capolino, A. Braguglia, A. Angioni, A. Dotta, M. C. Digilio and B. Dallapiccola (2014). "Kabuki syndrome: clinical and molecular diagnosis in the first year of life." Arch Dis Child.
des Portes, V., F. Francis, J. M. Pinard, I. Desguerre, M. L. Moutard, I. Snoeck, L. C. Meiners, F. Capron, R. Cusmai, S. Ricci, J. Motte, B. Echenne, G. Ponsot, O. Dulac, J. Chelly and C. Beldjord (1998). "doublecortin is the major gene causing X-linked subcortical laminar heterotopia (SCLH)." Hum Mol Genet 7: 1063-1070.
Devriendt, K., L. Lemli, M. Craen and F. de Zegher (1995). "Growth hormone deficiency and premature thelarche in a female infant with kabuki makeup syndrome." Horm Res 43: 303-306.
Devriendt, K., H. Van den Berghe, J. P. Fryns and P. Van Reempst (1996). "Large congenital follicular ovarian cyst in a girl with Kabuki syndrome." Am J Med Genet 65: 90-91.
Di Gennaro, G., C. Condoluci, C. Casali, O. Ciccarelli and G. Albertini (1999). "Epilepsy and polymicrogyria in Kabuki make-up (Niikawa-Kuroki) syndrome." Pediatr Neurol 21: 566-568.
Digilio, M. C., A. Baban, B. Marino and B. Dallapiccola (2010). "Hypoplastic left heart syndrome in patients with Kabuki syndrome." Pediatr Cardiol 31: 1111-1113.
Digilio, M. C., B. Marino, A. Toscano, A. Giannotti and B. Dallapiccola (2001). "Congenital heart defects in Kabuki syndrome." Am J Med Genet 100: 269-274.
Djabali, M., L. Selleri, P. Parry, M. Bower, B. D. Young and G. A. Evans (1992). "A trithorax-like gene is interrupted by chromosome 11q23 translocations in acute leukaemias." Nat Genet 2: 113-118.
dos Santos, B. M., R. R. Ribeiro, A. S. Stuani, F. W. de Paula e Silva and A. M. de Queiroz (2006). "Kabuki make-up (Niikawa-Kuroki) syndrome: dental and craniofacial findings in a Brazilian child." Braz Dent J 17: 249-254.

Drouin, E., P. Russo, B. Tuchweber, G. Mitchell and A. Rasquin-Weber (2000). "North American Indian cirrhosis in children: a review of 30 cases." J Pediatr Gastroenterol Nutr 31: 395-404.
Ewart-Toland, A., G. M. Enns, V. A. Cox, G. C. Mohan, P. Rosenthal and M. Golabi (1998). "Severe congenital anomalies requiring transplantation in children with Kabuki syndrome." Am J Med Genet 80: 362-367.
Findley, M. K. and M. Koval (2009). "Regulation and roles for claudin-family tight junction proteins." IUBMB Life 61: 431-437.
Fryns, J. P. and K. Devriendt (1998). "Hypoplastic claviculae in the Kabuki (Niikawa-Kuroki) syndrome." Genet Couns 9: 57-58.
Fujimoto, A., Y. Totoki, T. Abe, K. A. Boroevich, F. Hosoda, H. H. Nguyen, M. Aoki, N. Hosono, M. Kubo, F. Miya, Y. Arai, H. Takahashi, T. Shirakihara, M. Nagasaki, T. Shibuya, K. Nakano, K. Watanabe-Makino, H. Tanaka, H. Nakamura, J. Kusuda, H. Ojima, K. Shimada, T. Okusaka, M. Ueno, Y. Shigekawa, Y. Kawakami, K. Arihiro, H. Ohdan, K. Gotoh, O. Ishikawa, S. Ariizumi, M. Yamamoto, T. Yamada, K. Chayama, T. Kosuge, H. Yamaue, N. Kamatani, S. Miyano, H. Nakagama, Y. Nakamura, T. Tsunoda, T. Shibata and H. Nakagawa (2012). "Whole-genome sequencing of liver cancers identifies etiological influences on mutation patterns and recurrent mutations in chromatin regulators." Nat Genet 44: 760-764.
Fujishiro, M., T. Ogihara, K. Tsukuda, N. Shojima, Y. Fukushima, S. Kimura, Y. Oka and T. Asano (2003). "A case showing an association between type 1 diabetes mellitus and Kabuki syndrome." Diabetes Res Clin Pract 60: 25-31.
Gabrielli, O., S. Bruni, B. Bruschi, I. Carloni and G. V. Coppa (2002). "Kabuki syndrome and growth hormone deficiency: description of a case treated by long-term hormone replacement." Clin Dysmorphol 11: 71-72.
Gabrielli, O., I. Carloni, G. V. Coppa, M. F. Bedeschi, M. M. Petroncini and A. Selicorni (2000). "Long-term hormone replacement therapy in two patients with Kabuki syndrome and growth hormone deficiency." Minerva Pediatr 52: 47-53.
Galan-Gomez, E., J. J. Cardesa-Garcia, F. M. Campo-Sampedro, C. Salamanca-Maesso, M. L. Martinez-Frias and J. L. Frias (1995). "Kabuki make-up (Niikawa-Kuroki) syndrome in five Spanish children." Am J Med Genet 59: 276-282.
Garcia-Barcelo, M. M., M. Y. Yeung, X. P. Miao, C. S. Tang, G. Cheng, M. T. So, E. S. Ngan, V. C. Lui, Y. Chen, X. L. Liu, K. J. Hui, L. Li, W. H. Guo, X. B. Sun, J. F. Tou, K. W. Chan, X. Z. Wu, Y. Q. Song, D. Chan, K. Cheung, P. H. Chung, K. K. Wong, P. C. Sham, S. S. Cherny and P. K. Tam (2010). "Genome-wide association study identifies a susceptibility locus for biliary atresia on 10q24.2." Hum Mol Genet 19: 2917-2925.
Gazave, E., D. V. Lavrov, J. Cabrol, E. Renard, C. Rocher, J. Vacelet, M. Adamska, C. Borchiellini and A. V. Ereskovsky (2013). "Systematics and molecular phylogeny of the family oscarellidae (homoscleromorpha) with description of two new oscarella species." PLoS One 8: e63976.
Gibbs, J. R. and A. Singleton (2006). "Application of genome-wide single nucleotide polymorphism typing: simple association and beyond." PLoS Genet 2: e150.
Gillis, R., A. Klar and E. Gross-Kieselstein (1990). "The Niikawa-Kuroki (Kabuki make-up) syndrome in a Moslem Arab child." Clin Genet 38: 378-381.
Gonzales, E., A. Davit-Spraul, C. Baussan, C. Buffet, M. Maurice and E. Jacquemin (2009). "Liver diseases related to MDR3 (ABCB4) gene deficiency." Front Biosci 14: 4242-4256.
Gradilone, S. A., A. I. Masyuk, P. L. Splinter, J. M. Banales, B. Q. Huang, P. S. Tietz, T. V. Masyuk and N. F. Larusso (2007). "Cholangiocyte cilia express TRPV4 and detect changes in luminal tonicity inducing bicarbonate secretion." Proc Natl Acad Sci U S A 104: 19138-19143.
Grantham, R. (1974). "Amino acid difference formula to help explain protein evolution." Science 185: 862-864.
Grati, M., I. Chakchouk, Q. Ma, M. Bensaid, A. Desmidt, N. Turki, D. Yan, A. Baanannou, R. Mittal, N. Driss, S. Blanton, A. Farooq, Z. Lu, X. Z. Liu and S. Masmoudi (2015). "A missense mutation in DCDC2 causes human recessive deafness DFNB66, likely by interfering with sensory hair cell and supporting cell cilia length regulation." Hum Mol Genet.
Grisham, J. W. (1963). "Ciliated Epithelial Cells in Normal Murine Intrahepatic Bile Ducts." Proc Soc Exp Biol Med 114: 318-320.
Gu, D. C., YH.; Shih, JH.; Lin, CH.; Jou, YS.; Chen, CF. (2013). "Target genes discovery through copy number alteration analysis in human hepatocellular carcinoma." World J Gastroenterol 19: 8873-8879.
Gunay-Aygun, M. (2009). "Liver and kidney disease in ciliopathies." Am J Med Genet C Semin Med Genet 151C: 296-306.
Hadj-Rabia, S., L. Baala, P. Vabres, D. Hamel-Teillac, E. Jacquemin, M. Fabre, S. Lyonnet, Y. De Prost, A. Munnich, M. Hadchouel and A. Smahi (2004). "Claudin-1 gene mutations in neonatal sclerosing cholangitis associated with ichthyosis: a tight junction disease." Gastroenterology 127: 1386-1390.
Hallowell, N., A. Hall, C. Alberg and R. Zimmern (2014). "Revealing the results of whole-genome sequencing and whole-exome sequencing in research and clinical investigations: some ethical issues." J Med Ethics.
Hamdi Kamel, M., B. Gilmartin, P. Mohan and D. P. Hickey (2006). "Successful long-term outcome of kidney transplantation in a child with Kabuki syndrome." Pediatr Transplant 10: 105-107.
Hannibal, M. C., K. J. Buckingham, S. B. Ng, J. E. Ming, A. E. Beck, M. J. McMillin, H. I. Gildersleeve, A. W. Bigham, H. K. Tabor, H. C. Mefford, J. Cook, K. Yoshiura, T. Matsumoto, N. Matsumoto, N. Miyake, H. Tonoki, K. Naritomi, T. Kaname, T. Nagai, H. Ohashi, K. Kurosawa, J. W. Hou, T. Ohta, D. Liang, A. Sudo, C. A. Morris, S. Banka, G. C. Black, J. Clayton-Smith, D. A. Nickerson, E. H. Zackai, T. H. Shaikh, D. Donnai, N. Niikawa, J. Shendure and M. J. Bamshad (2011). "Spectrum of MLL2 (ALR) mutations in 110 cases of Kabuki syndrome." Am J Med Genet A 155A: 1511-1516.
Hartley, J. L., C. O'Callaghan, S. Rossetti, M. Consugar, C. J. Ward, D. A. Kelly and P. C. Harris (2011). "Investigation of primary cilia in the pathogenesis of biliary atresia." J Pediatr Gastroenterol Nutr 52: 485-488.
Hildebrandt, F., S. F. Heeringa, F. Ruschendorf, M. Attanasio, G. Nurnberg, C. Becker, D. Seelow, N. Huebner, G. Chernin, C. N. Vlangos, W. Zhou, J. F. O'Toole, B. E. Hoskins, M. T. Wolf, B. G. Hinkes, H. Chaib, S. Ashraf, D. S. Schoeb, B. Ovunc, S. J. Allen, V. Vega-Warner, E. Wise, H. M. Harville, R. H. Lyons, J. Washburn, J. Macdonald, P. Nurnberg and E. A. Otto (2009). "A systematic approach to mapping recessive disease genes in individuals from outbred populations." PLoS Genet 5: e1000353.
Hildebrandt, F., E. Otto, C. Rensing, H. G. Nothwang, M. Vollmer, J. Adolphs, H. Hanusch and M. Brandis (1997). "A novel gene encoding an SH3 domain protein is mutated in nephronophthisis type 1." Nat Genet 17: 149-153.
Ho, H. H. and L. C. Eaves (1997). "Kabuki make-up (Niikawa-Kuroki) syndrome: cognitive abilities and autistic features." Dev Med Child Neurol 39: 487-490.
Hoffman, J. D., K. L. Ciprero, K. E. Sullivan, P. B. Kaplan, D. M. McDonald-McGinn, E. H. Zackai and J. E. Ming (2005). "Immune abnormalities are a frequent manifestation of Kabuki syndrome." Am J Med Genet A 135: 278-281.
Hogan, M. C., T. V. Masyuk, L. Page, D. R. Holmes, 3rd, X. Li, E. J. Bergstralh, M. V. Irazabal, B. Kim, B. F. King, J. F. Glockner, N. F. Larusso and V. E. Torres (2012). "Somatostatin analog therapy for severe polycystic liver disease: results after 2 years." Nephrol Dial Transplant 27: 3532-3539.
Hostoffer, R. W., C. A. Bay, K. Wagner, J. Venglarcik, 3rd, H. Sahara, E. Omair and H. T. Clark (1996). "Kabuki make-up syndrome associated with an acquired hypogammaglobulinemia and anti-IgA antibodies." Clin Pediatr (Phila) 35: 273-276.
Hughes, H. E. and S. J. Davies (1994). "Coarctation of the aorta in Kabuki syndrome." Arch Dis Child 70: 512-514.
Iannaccone, A., K. Mykytyn, A. M. Persico, C. C. Searby, A. Baldi, M. M. Jablonski and V. C. Sheffield (2005). "Clinical evidence of decreased olfaction in Bardet-Biedl syndrome caused by a deletion in the BBS4 gene." Am J Med Genet A 132A: 343-346.
Ilyina, H., I. Lurie, I. Naumtchik, D. Amoashy, G. Stephanenko, V. Fedotov and A. Kostjuk (1995). "Kabuki make-up (Niikawa-Kuroki) syndrome in the Byelorussian register of congenital malformations: ten new observations." Am J Med Genet 56: 127-131.
Isidor, B., M. Rio, O. Mourier, D. Habes, J. Amiel and E. Jacquemin (2007). "Kabuki syndrome and neonatal cholestasis: report of a new case and review of the literature." J Pediatr Gastroenterol Nutr 45: 261-264.
Issaeva, I., Y. Zonis, T. Rozovskaia, K. Orlovsky, C. M. Croce, T. Nakamura, A. Mazo, L. Eisenbach and E. Canaani (2007). "Knockdown of ALR (MLL2) reveals ALR target genes and leads to alterations in cell adhesion and growth." Mol Cell Biol 27: 1889-1903.
Ito, H., K. Mori, N. Inoue and S. Kagami (2007). "A case of Kabuki syndrome presenting West syndrome." Brain Dev 29: 380-382.
Jin, D. K. (2011). "Systematic review of the clinical and genetic aspects of Prader-Willi syndrome." Korean J Pediatr 54: 55-63.
Johnson, G. and J. F. Mayhew (2007). "Anesthesia for a child with Kabuki Syndrome." Paediatr Anaesth 17: 900-901.
Kara, B., H. Kayserili, M. Imer, M. Caliskan and M. Ozmen (2006). "Quadrigeminal cistern arachnoid cyst in a patient with Kabuki syndrome." Pediatr Neurol 34: 478-480.
Karjoo, S., N. J. Hand, L. Loarca, P. A. Russo, J. R. Friedman and R. G. Wells (2013). "Extrahepatic cholangiocyte cilia are abnormal in biliary atresia." J Pediatr Gastroenterol Nutr 57: 96-101.
Kasuya, H., T. Shimizu, S. Nakamura and K. Takakura (1998). "Kabuki make-up syndrome and report of a case with hydrocephalus." Childs Nerv Syst 14: 230-235.
Kawame, H., M. C. Hannibal, L. Hudgins and R. A. Pagon (1999). "Phenotypic spectrum and management issues in Kabuki syndrome." J Pediatr 134: 480-485.
Kim, D. H., J. Lee, B. Lee and J. W. Lee (2009). "ASCOM controls farnesoid X receptor transactivation through its associated histone H3 lysine 4 methyltransferase activity." Mol Endocrinol 23: 1556-1562.
Kim, M. H., T. Cierpicki, U. Derewenda, D. Krowarsch, Y. Feng, Y. Devedjiev, Z. Dauter, C. A. Walsh, J. Otlewski, J. H. Bushweller and Z. S. Derewenda (2003). "The DCX-domain tandems of doublecortin and doublecortin-like kinase." Nat Struct Biol 10: 324-333.
Kinlough, C. L., P. A. Poland, J. B. Bruns and R. P. Hughey (2005). "Gamma-glutamyltranspeptidase: disulfide bridges, propeptide cleavage, and activation in the endoplasmic reticulum." Methods Enzymol 401: 426-449.
Kokitsu-Nakata, N. M., A. L. Petrin, J. P. Heard, S. Vendramini-Pittoli, L. E. Henkle, D. V. dos Santos, J. C. Murray and A. Richieri-Costa (2012). "Analysis of MLL2 gene in the first Brazilian family with Kabuki syndrome." Am J Med Genet A 158A: 2003-2008.
Komiya, Y. and R. Habas (2008). "Wnt signal transduction pathways." Organogenesis 4: 68-75.
Kruglyak, L. (1999). "Prospects for whole-genome linkage disequilibrium mapping of common disease genes." Nat Genet 22: 139-144.
Kuhlmann, K., A. Tschapek, H. Wiese, M. Eisenacher, H. E. Meyer, H. H. Hatt, S. Oeljeklaus and B. Warscheid (2014). "The membrane proteome of sensory cilia to the depth of olfactory receptors." Mol Cell Proteomics 13: 1828-1843.
Kulaga, H. M., C. C. Leitch, E. R. Eichers, J. L. Badano, A. Lesemann, B. E. Hoskins, J. R. Lupski, P. L. Beales, R. R. Reed and N. Katsanis (2004). "Loss of BBS proteins causes anosmia in humans and defects in olfactory cilia structure and function in the mouse." Nat Genet 36: 994-998.
Kuroki, Y., Y. Suzuki, H. Chyo, A. Hata and I. Matsui (1981). "A new malformation syndrome of long palpebral fissures, large ears, depressed nasal tip, and skeletal anomalies associated with postnatal dwarfism and mental retardation." J Pediatr 99: 570-573.
Kurosawa, K., H. Kawame, Y. Ochiai, M. Nakashima, T. Tohma and H. Ohashi (2002). "Patellar dislocation in Kabuki syndrome." Am J Med Genet 108: 160-163.
Lander, E. S. and D. Botstein (1987). "Homozygosity mapping: a way to map human recessive traits with the DNA of inbred children." Science 236: 1567-1570.
Larusso, N. F. and T. V. Masyuk (2011). "The role of cilia in the regulation of bile flow." Dig Dis 29: 6-12.
Leblanc, A., M. Hadchouel, P. Jehan, M. Odievre and D. Alagille (1981). "Obstructive jaundice in children with histiocytosis X." Gastroenterology 80: 134-139.
Lederer, D., B. Grisart, M. C. Digilio, V. Benoit, M. Crespin, S. C. Ghariani, I. Maystadt, B. Dallapiccola and C. Verellen-Dumoulin (2012). "Deletion of KDM6A, a histone demethylase interacting with MLL2, in three patients with Kabuki syndrome." Am J Hum Genet 90: 119-124.
Lee, M. G., R. Villa, P. Trojer, J. Norman, K. P. Yan, D. Reinberg, L. Di Croce and R. Shiekhattar (2007). "Demethylation of H3K27 regulates polycomb recruitment and H2A ubiquitination." Science 318: 447-450.
Leyva-Vega, M., J. Gerfen, B. D. Thiel, D. Jurkiewicz, E. B. Rand, J. Pawlowska, D. Kaminska, P. Russo, X. Gai, I. D. Krantz, B. M. Kamath, H. Hakonarson, B. A. Haber and N. B. Spinner (2010). "Genomic alterations in biliary atresia suggest region of potential disease susceptibility in 2q37.3." Am J Med Genet A 152A: 886-895.
Li, H. and R. Durbin (2009). "Fast and accurate short read alignment with Burrows-Wheeler transform." Bioinformatics 25: 1754-1760.
Li, L., I. D. Krantz, Y. Deng, A. Genin, A. B. Banta, C. C. Collins, M. Qi, B. J. Trask, W. L. Kuo, J. Cochran, T. Costa, M. E. Pierpont, E. B. Rand, D. A. Piccoli, L. Hood and N. B. Spinner (1997). "Alagille syndrome is caused by mutations in human Jagged1, which encodes a ligand for Notch1." Nat Genet 16: 243-251.
Li, Y., N. Bogershausen, Y. Alanay, P. O. Simsek Kiper, N. Plume, K. Keupp, E. Pohl, B. Pawlik, M. Rachwalski, E. Milz, M. Thoenes, B. Albrecht, E. C. Prott, M. Lehmkuhler, S. Demuth, G. E. Utine, K. Boduroglu, K. Frankenbusch, G. Borck, G. Gillessen-Kaesbach, G. Yigit, D. Wieczorek and B. Wollnik (2011). "A mutation screen in patients with Kabuki syndrome." Hum Genet 130: 715-724.
Livesey, E., M. Cortina Borja, K. Sharif, N. Alizai, P. McClean, D. Kelly, N. Hadzic and M. Davenport (2009). "Epidemiology of biliary atresia in England and Wales (1999-2006)." Arch Dis Child Fetal Neonatal Ed 94: F451-455.
Lu, H. W., J. H. Dong, C. H. Li, Q. Yu and W. Tang (2014). "The defects of cholangiocyte primary cilia in patients with graft cholangiopathies." Clin Transplant 28: 1202-1208.
Makrythanasis, P., B. van Bon, M. Steehouwer, B. Rodriguez-Santiago, M. Simpson, P. Dias, B. Anderlid, P. Arts, M. Bhat, B. Augello, E. Biamino, E. Bongers, M. Del Campo, I. Cordeiro, A. Cueto-Gonzalez, I. Cusco, C. Deshpande, E. Frysira, L. Izatt, R. Flores, E. Galan, B. Gener, C. Gilissen, S. Granneman, J. Hoyer, H. Yntema, C. Kets, D. Koolen, C. Marcelis, A. Medeira, L. Micale, S. Mohammed, S. de Munnik, A. Nordgren, S. Psoni, W. Reardon, N. Revencu, T. Roscioli, M. Ruiterkamp-Versteeg, H. Santos, J. Schoumans, J. Schuurs-Hoeijmakers, M. Silengo, L. Toledo, T. Vendrell, I. van der Burgt, B. van Lier, C. Zweier, A. Reymond, R. Trembath, L. Perez-Jurado, J. Dupont, B. de Vries, H. Brunner, J. Veltman, G. Merla, S. Antonarakis and A. Hoischen (2013). "MLL2 mutation detection in 86 patients with Kabuki syndrome: a genotype-phenotype study." Clin Genet.
Mamanova, L., A. J. Coffey, C. E. Scott, I. Kozarewa, E. H. Turner, A. Kumar, E. Howard, J. Shendure and D. J. Turner (2010). "Target-enrichment strategies for next-generation sequencing." Nat Methods 7: 111-118.
Massinen, S., M. E. Hokkanen, H. Matsson, K. Tammimies, I. Tapia-Paez, V. Dahlstrom-Heuser, J. Kuja-Panula, J. Burghoorn, K. E. Jeppsson, P. Swoboda, M. Peyrard-Janvid, R. Toftgard, E. Castren and J. Kere (2011). "Increased expression of the dyslexia candidate gene DCDC2 affects length and signaling of primary cilia in neurons." PLoS One 6: e20580.
Masyuk, A. I., B. Q. Huang, C. J. Ward, S. A. Gradilone, J. M. Banales, T. V. Masyuk, B. Radtke, P. L. Splinter and N. F. LaRusso (2010). "Biliary exosomes influence cholangiocyte regulatory mechanisms and proliferation through interaction with primary cilia." Am J Physiol Gastrointest Liver Physiol 299: G990-999.
Matsumoto, N. and N. Niikawa (2003). "Kabuki make-up syndrome: a review." Am J Med Genet C Semin Med Genet 117C: 57-65.
Matsune, K., T. Shimizu, T. Tohma, Y. Asada, H. Ohashi and T. Maeda (2001). "Craniofacial and dental characteristics of Kabuki syndrome." Am J Med Genet 98: 185-190.
McGaughran, J., S. Aftimos, C. Jefferies and I. Winship (2001). "Clinical phenotypes of nine cases of Kabuki syndrome from New Zealand." Clin Dysmorphol 10: 257-262.
McGaughran, J. M., D. Donnai and J. Clayton-Smith (2000). "Biliary atresia in Kabuki syndrome." Am J Med Genet 91: 157-158.
McGinniss, M. J., D. H. Brown, L. W. Burke, J. T. Mascarello and M. C. Jones (1997). "Ring chromosome X in a child with manifestations of Kabuki syndrome." Am J Med Genet 70: 37-42.
McMahon, C. J. and W. Reardon (2006). "The spectrum of congenital cardiac malformations encountered in six children with Kabuki syndrome." Cardiol Young 16: 30-33.
Meng, H., N. R. Powers, L. Tang, N. A. Cope, P. X. Zhang, R. Fuleihan, C. Gibson, G. P. Page and J. R. Gruen (2011). "A dyslexia-associated variant in DCDC2 changes gene expression." Behav Genet 41: 58-66.
Meng, H., S. D. Smith, K. Hager, M. Held, J. Liu, R. K. Olson, B. F. Pennington, J. C. DeFries, J. Gelernter, T. O'Reilly-Pol, S. Somlo, P. Skudlarski, S. E. Shaywitz, B. A. Shaywitz, K. Marchione, Y. Wang, M. Paramasivam, J. J. LoTurco, G. P. Page and J. R. Gruen (2005). "DCDC2 is associated with reading disability and modulates neuronal development in the brain." Proc Natl Acad Sci U S A 102: 17053-17058.
Mervis, C. B., A. M. Becerra, M. L. Rowe, J. H. Hersh and C. A. Morris (2005). "Intellectual abilities and adaptive behavior of children and adolescents with Kabuki syndrome: a preliminary study." Am J Med Genet A 132A: 248-255.
Mezina, A., K. Gandhi, A. Sabo, D. Munzy, R. Gibbs, M. Hedge and S. Karpen (2014). "Whole Exome Sequencing Identifies ABCB4 Gene Variants As Modifiers of Biliary Atresia Outcomes." Gastroenterology 146: S-928.
Mhanni, A. A., H. G. Cross and A. E. Chudley (1999). "Kabuki syndrome: description of dental findings in 8 patients." Clin Genet 56: 154-157.
Micale, L., B. Augello, C. Fusco, A. Selicorni, M. N. Loviglio, M. C. Silengo, A. Reymond, B. Gumiero, F. Zucchetti, E. V. D'Addetta, E. Belligni, A. Calcagni, M. C. Digilio, B. Dallapiccola, F. Faravelli, F. Forzano, M. Accadia, A. Bonfante, M. Clementi, C. Daolio, S. Douzgou, P. Ferrari, R. Fischetto, L. Garavelli, E. Lapi, T. Mattina, D. Melis, M. G. Patricelli, M. Priolo, P. Prontera, A. Renieri, M. A. Mencarelli, G. Scarano, M. della Monica, B. Toschi, L. Turolla, A. Vancini, A. Zatterale, O. Gabrielli, L. Zelante and G. Merla (2011). "Mutation spectrum of MLL2 in a cohort of Kabuki syndrome patients." Orphanet J Rare Dis 6: 38.
Mieli-Vergani, G. and D. Vergani (2001). "Sclerosing cholangitis in the paediatric patient." Best Pract Res Clin Gastroenterol 15: 681-690.
Mihci, E., S. Tacoy, S. Haspolat and K. Karaali (2002). "Central nervous system abnormalities in Kabuki (Niikawa-Kuroki) syndrome." Am J Med Genet 111: 448-449.
Ming, J. E., K. L. Russell, D. M. McDonald-McGinn and E. H. Zackai (2005). "Autoimmune disorders in Kabuki syndrome." Am J Med Genet A 132A: 260-262.
Miyake, N., E. Koshimizu, N. Okamoto, S. Mizuno, T. Ogata, T. Nagai, T. Kosho, H. Ohashi, M. Kato, G. Sasaki, H. Mabe, Y. Watanabe, M. Yoshino, T. Matsuishi, J. Takanashi, V. Shotelersuk, M. Tekin, N. Ochi, M. Kubota, N. Ito, K. Ihara, T. Hara, H. Tonoki, T. Ohta, K. Saito, M. Matsuo, M. Urano, T. Enokizono, A. Sato, H. Tanaka, A. Ogawa, T. Fujita, Y. Hiraki, S. Kitanaka, Y. Matsubara, T. 
Makita, M. Taguri, M. Nakashima, Y. Tsurusaki, H. Saitsu, K. Yoshiura, N. Matsumoto and N. Niikawa (2013). "MLL2 and KDM6A mutations in patients with Kabuki syndrome." Am J Med Genet A 161: 2234-2243. Miyake, N., S. Mizuno, N. Okamoto, H. Ohashi, M. Shiina, K. Ogata, Y. Tsurusaki, M. Nakashima, H. Saitsu, N. Niikawa and N. Matsumoto (2013a). "KDM6A point mutations cause Kabuki syndrome." Hum Mutat 34: 108-110. Mo, R., S. M. Rao and Y. J. Zhu (2006). "Identification of the MLL2 complex as a coactivator for estrogen receptor alpha." J Biol Chem 281: 15714-15720. Moral, S., F. Zuccarino and P. Loma-Osorio (2009). "Double aortic arch: an unreported anomaly with Kabuki syndrome." Pediatr Cardiol 30: 82-84. Morita, K., M. Furuse, K. Fujimoto and S. Tsukita (1999). "Claudin multigene family encoding four-transmembrane domain protein components of tight junction strands." Proc Natl Acad Sci U S A 96: 511-516. Mueller, R. F. and D. T. Bishop (1993). "Autozygosity mapping, complex consanguinity, and autosomal recessive disorders." J Med Genet 30: 798-799. Naveh, Y., H. Mendelsohn, G. Spira, L. Auslaender, H. Mandel and M. Berant (1983). "Primary sclerosing cholangitis associated with immunodeficiency." Am J Dis Child 137: 114-117. Ng, S. B., A. W. Bigham, K. J. Buckingham, M. C. Hannibal, M. J. McMillin, H. I. Gildersleeve, A. E. Beck, H. K. Tabor, G. M. Cooper, H. C. Mefford, C. Lee, E. H. Turner, J. D. Smith, M. J. Rieder, K. Yoshiura, N. Matsumoto, T. Ohta, N. Niikawa, D. A. Nickerson, M. J. Bamshad and J. Shendure (2010). "Exome sequencing identifies MLL2 mutations as a cause of Kabuki syndrome." Nat Genet 42: 790-793. Ng, S. B., E. H. Turner, P. D. Robertson, S. D. Flygare, A. W. Bigham, C. Lee, T. Shaffer, M. Wong, A. Bhattacharjee, E. E. Eichler, M. Bamshad, D. A. Nickerson and J. Shendure (2009). "Targeted capture and massively parallel sequencing of 12 human exomes." Nature 461: 272-276. Niikawa, N., Y. Kuroki, T. Kajii, N. Matsuura, S. Ishikiriyama, H. Tonoki, N. 
Ishikawa, Y. Yamada, M. Fujita, H. Umemoto and et al. (1988). "Kabuki make-up (Niikawa-Kuroki) syndrome: a study of 62 patients." Am J Med Genet 31: 565-589. Niikawa, N., N. Matsuura, Y. Fukushima, T. Ohsawa and T. Kajii (1981). "Kabuki make-up syndrome: a syndrome of mental retardation, unusual facies, large and protruding ears, and postnatal growth deficiency." J Pediatr 99: 565-569.

Nobili, V., M. Marcellini, R. Devito, R. Capolino, L. Viola and M. C. Digilio (2004). "Hepatic fibrosis in Kabuki syndrome." Am J Med Genet A 124A: 209-212.
Ogawa, A., S. Yasumoto, Y. Tomoda, M. Ohfu, A. Mitsudome and Y. Kuroki (2003). "Favorable seizure outcome in Kabuki make-up syndrome associated with epilepsy." J Child Neurol 18: 549-551.
Ohdo, S., H. Madokoro, T. Sonoda, T. Nishiguchi, K. Kawaguchi and K. Hayakawa (1985). "Kabuki make-up syndrome (Niikawa-Kuroki syndrome) associated with congenital heart disease." J Med Genet 22: 126-127.
Okada, Y., S. Ono, T. Tomomasa, Y. Inoue and A. Morikawa (2001). "Balloon angioplasty in a patient with Kabuki make-up syndrome." Pediatr Int 43: 694-696.
Oksanen, V. E., M. A. Arvio, M. M. Peippo, L. K. Valanne and K. O. Sainio (2004). "Temporo-occipital spikes: a typical EEG finding in Kabuki syndrome." Pediatr Neurol 30: 67-70.
Olbrich, H., M. Fliegauf, J. Hoefele, A. Kispert, E. Otto, A. Volz, M. T. Wolf, G. Sasmaz, U. Trauer, R. Reinhardt, R. Sudbrak, C. Antignac, N. Gretz, G. Walz, B. Schermer, T. Benzing, F. Hildebrandt and H. Omran (2003). "Mutations in a novel gene, NPHP3, cause adolescent nephronophthisis, tapeto-retinal degeneration and hepatic fibrosis." Nat Genet 34: 455-459.
Otto, E., J. Hoefele, R. Ruf, A. M. Mueller, K. S. Hiller, M. T. Wolf, M. J. Schuermann, A. Becker, R. Birkenhager, R. Sudbrak, H. C. Hennies, P. Nurnberg and F. Hildebrandt (2002). "A gene mutated in nephronophthisis and retinitis pigmentosa encodes a novel protein, nephroretinin, conserved in evolution." Am J Hum Genet 71: 1161-1167.
Otto, E. A., B. Loeys, H. Khanna, J. Hellemans, R. Sudbrak, S. Fan, U. Muerb, J. F. O'Toole, J. Helou, M. Attanasio, B. Utsch, J. A. Sayer, C. Lillo, D. Jimeno, P. Coucke, A. De Paepe, R. Reinhardt, S. Klages, M. Tsuda, I. Kawakami, T. Kusakabe, H. Omran, A. Imm, M. Tippens, P. A. Raymond, J. Hill, P. Beales, S. He, A. Kispert, B. Margolis, D. S. Williams, A. Swaroop and F. Hildebrandt (2005). "Nephrocystin-5, a ciliary IQ domain protein, is mutated in Senior-Loken syndrome and interacts with RPGR and calmodulin." Nat Genet 37: 282-288.
Otto, E. A., M. L. Trapp, U. T. Schultheiss, J. Helou, L. M. Quarmby and F. Hildebrandt (2008). "NEK8 mutations affect ciliary and centrosomal localization and may cause nephronophthisis." J Am Soc Nephrol 19: 587-592.
Overgaard, C. E., K. M. Sanzone, K. S. Spiczka, D. R. Sheff, A. Sandra and C. Yeaman (2009). "Deciliation is associated with dramatic remodeling of epithelial cell junctions and surface domains." Mol Biol Cell 20: 102-113.
Paila, U., B. A. Chapman, R. Kirchner and A. R. Quinlan (2013). "GEMINI: integrative exploration of genetic variation and genome annotations." PLoS Comput Biol 9: e1003153.
Patel, V., R. Chowdhury and P. Igarashi (2009). "Advances in the pathogenesis and treatment of polycystic kidney disease." Curr Opin Nephrol Hypertens 18: 99-106.
Paulussen, A. D., A. P. Stegmann, M. J. Blok, D. Tserpelis, C. Posma-Velter, Y. Detisch, E. E. Smeets, A. Wagemans, J. J. Schrander, M. J. van den Boogaard, J. van der Smagt, A. van Haeringen, I. Stolte-Dijkstra, W. S. Kerstjens-Frederikse, G. M. Mancini, M. W. Wessels, R. C. Hennekam, M. Vreeburg, J. Geraedts, T. de Ravel, J. P. Fryns, H. J. Smeets, K. Devriendt and C. T. Schrander-Stumpel (2011). "MLL2 mutation spectrum in 45 patients with Kabuki syndrome." Hum Mutat 32: E2018-2025.
Petersen, C., D. Harder, Z. Abola, D. Alberti, T. Becker, C. Chardot, M. Davenport, A. Deutschmann, K. Khelif, H. Kobayashi, N. Kvist, J. Leonhardt, M. Melter, M. Pakarinen, J. Pawlowska, A. Petersons, E. D. Pfister, M. Rygl, R. Schreiber, R. Sokol, B. Ure, C. Veiga, H. Verkade, B. Wildhaber, B. Yerushalmi and D. Kelly (2008). "European biliary atresia registries: summary of a symposium." Eur J Pediatr Surg 18: 111-116.

Philip, N., P. Meinecke, A. David, J. Dean, S. Ayme, R. Clark, E. Gross-Kieselstein, D. Hosenfeld, A. Moncla, D. Muller et al. (1992). "Kabuki make-up (Niikawa-Kuroki) syndrome: a study of 16 non-Japanese cases." Clin Dysmorphol 1: 63-77.
Phillips, S., S. Hemmady, P. Thomas and O. D. D (2005). "Kabuki syndrome presenting with congenital talipes equinovarus." J Pediatr Orthop B 14: 285-286.
Ponnusamy, M. P., P. Seshacharyulu, I. Lakshmanan, A. P. Vaz, S. Chugh and S. K. Batra (2013). "Emerging role of mucins in epithelial to mesenchymal transition." Curr Cancer Drug Targets 13: 945-956.
Powell, H. W., P. E. Hart and S. M. Sisodiya (2003). "Epilepsy and perisylvian polymicrogyria in a patient with Kabuki syndrome." Dev Med Child Neurol 45: 841-843.
Ramachandran, M., R. M. Kay and D. L. Skaggs (2007). "Treatment of hip dislocation in Kabuki syndrome: a report of three hips in two patients." J Pediatr Orthop 27: 37-40.
Reiner, O., F. M. Coquelle, B. Peter, T. Levy, A. Kaplan, T. Sapir, I. Orr, N. Barkai, G. Eichele and S. Bergmann (2006). "The evolving doublecortin (DCX) superfamily." BMC Genomics 7: 188.
Rodriguez, L., D. Diego-Alvarez, I. Lorda-Sanchez, F. L. Gallardo, M. L. Martinez-Fernandez, M. E. Arroyo-Munoz and M. L. Martinez-Frias (2008). "A small and active ring X chromosome in a female with features of Kabuki syndrome." Am J Med Genet A 146A: 2816-2821.
Rousseau-Merck, M. F., A. Zahraoui, N. Touchot, A. Tavitian and R. Berger (1991). "Chromosome assignment of four RAS-related RAB genes." Hum Genet 86: 350-354.
Russell, D. W. (2009). "Fifty years of advances in bile acid synthesis and metabolism." J Lipid Res 50 Suppl: S120-125.
Satir, P. and S. T. Christensen (2007). "Overview of structure and function of mammalian cilia." Annu Rev Physiol 69: 377-400.
Sayer, J. A., E. A. Otto, J. F. O'Toole, G. Nurnberg, M. A. Kennedy, C. Becker, H. C. Hennies, J. Helou, M. Attanasio, B. V. Fausett, B. Utsch, H. Khanna, Y. Liu, I. Drummond, I. Kawakami, T. Kusakabe, M. Tsuda, L. Ma, H. Lee, R. G. Larson, S. J. Allen, C. J. Wilkinson, E. A. Nigg, C. Shou, C. Lillo, D. S. Williams, B. Hoppe, M. J. Kemper, T. Neuhaus, M. A. Parisi, I. A. Glass, M. Petry, A. Kispert, J. Gloy, A. Ganner, G. Walz, X. Zhu, D. Goldman, P. Nurnberg, A. Swaroop, M. R. Leroux and F. Hildebrandt (2006). "The centrosomal protein nephrocystin-6 is mutated in Joubert syndrome and activates transcription factor ATF4." Nat Genet 38: 674-681.
Scholey, J. M. and K. V. Anderson (2006). "Intraflagellar transport and cilium-based signaling." Cell 125: 439-442.
Schrander-Stumpel, C., P. Meinecke, G. Wilson, G. Gillessen-Kaesbach, S. Tinschert, R. Konig, N. Philip, R. Rizzo, J. Schrander, L. Pfeiffer et al. (1994). "The Kabuki (Niikawa-Kuroki) syndrome: further delineation of the phenotype in 29 non-Japanese patients." Eur J Pediatr 153: 438-445.
Schrander-Stumpel, C., P. Theunissen, R. Hulsmans and J. P. Fryns (1993). "Kabuki make-up (Niikawa-Kuroki) syndrome in a girl presenting with vitiligo vulgaris, cleft palate, somatic and psychomotor retardation and facial dysmorphism." Genet Couns 4: 71-72.
Schrander-Stumpel, C. T., L. Spruyt, L. M. Curfs, T. Defloor and J. J. Schrander (2005). "Kabuki syndrome: Clinical data in 20 patients, literature review, and further guidelines for preventive management." Am J Med Genet A 132A: 234-243.
Schueler, M., D. A. Braun, G. Chandrasekar, H. Y. Gee, T. D. Klasson, J. Halbritter, A. Bieder, J. D. Porath, R. Airik, W. Zhou, J. J. LoTurco, A. Che, E. A. Otto, D. Bockenhauer, N. J. Sebire, T. Honzik, P. C. Harris, S. J. Koon, M. Gunay-Aygun, S. Saunier, K. Zerres, N. O. Bruechle, J. P. Drenth, L. Pelletier, I. Tapia-Paez, R. P. Lifton, R. H. Giles, J. Kere and F. Hildebrandt (2015). "DCDC2 Mutations Cause a Renal-Hepatic Ciliopathy by Disrupting Wnt Signaling." Am J Hum Genet 96: 81-92.
Seelow, D., M. Schuelke, F. Hildebrandt and P. Nurnberg (2009). "HomozygosityMapper--an interactive approach to homozygosity mapping." Nucleic Acids Res 37: W593-599.
Selicorni, A., C. Colombo, S. Bonato, D. Milani, A. M. Giunta and M. F. Bedeschi (2001). "Biliary atresia and Kabuki syndrome: another case with long-term follow-up." Am J Med Genet 100: 251.
Shah, M., B. Bogucki, M. Mavers, D. E. deMello and A. Knutsen (2005). "Cardiac conduction abnormalities and congenital immunodeficiency in a child with Kabuki syndrome: case report." BMC Med Genet 6: 28.
Sivaci, R., O. K. Kahveci, M. Celik, A. Altuntas and M. Solak (2005). "Anesthesia management in Kabuki make-up syndrome." Saudi Med J 26: 1980-1982.
Stankiewicz, P., H. Thiele, I. Giannakudis, M. Schlicker, C. Baldermann, A. Kruger, S. Dorr, H. Starke and I. Hansmann (2001). "Kabuki syndrome-like features associated with a small ring chromosome X and XIST gene expression." Am J Med Genet 102: 286-292.
Strautnieks, S. S., L. N. Bull, A. S. Knisely, S. A. Kocoshis, N. Dahl, H. Arnell, E. Sokal, K. Dahan, S. Childs, V. Ling, M. S. Tanner, A. F. Kagalwalla, A. Nemeth, J. Pawlowska, A. Baker, G. Mieli-Vergani, N. B. Freimer, R. M. Gardiner and R. J. Thompson (1998). "A gene encoding a liver-specific ABC transporter is mutated in progressive familial intrahepatic cholestasis." Nat Genet 20: 233-238.
Suzuki, A., S. Sekiya, D. Buscher, J. C. Izpisua Belmonte and H. Taniguchi (2008). "Tbx3 controls the fate of hepatic progenitor cells in liver development by suppressing p19ARF expression." Development 135: 1589-1595.
Tabibian, J. H., A. I. Masyuk, T. V. Masyuk, S. P. O'Hara and N. F. LaRusso (2013). "Physiology of cholangiocytes." Compr Physiol 3: 541-565.
Tahvanainen, E., P. Tahvanainen, H. Kaariainen and K. Hockerstedt (2005). "Polycystic liver and kidney diseases." Ann Med 37: 546-555.
Trauner, M. and J. L. Boyer (2003). "Bile salt transporters: molecular characterization, function, and regulation." Physiol Rev 83: 633-671.
Tsai, E. A., C. M. Grochowski, K. M. Loomes, K. Bessho, H. Hakonarson, J. A. Bezerra, P. A. Russo, B. A. Haber, N. B. Spinner and M. Devoto (2014). "Replication of a GWAS signal in a Caucasian population implicates ADD3 in susceptibility to biliary atresia." Hum Genet 133: 235-243.
Valente, E. M., J. L. Silhavy, F. Brancati, G. Barrano, S. R. Krishnaswami, M. Castori, M. A. Lancaster, E. Boltshauser, L. Boccone, L. Al-Gazali, E. Fazzi, S. Signorini, C. M. Louie, E. Bellacchio, G. International Joubert Syndrome Related Disorders Study, E. Bertini, B. Dallapiccola and J. G. Gleeson (2006). "Mutations in CEP290, which encodes a centrosomal protein, cause pleiotropic forms of Joubert syndrome." Nat Genet 38: 623-625.
van Haelst, M. M., A. S. Brooks, J. Hoogeboom, M. W. Wessels, D. Tibboel, J. C. de Jongste, J. C. den Hollander, J. J. Bongers-Schokking, M. F. Niermeijer and P. J. Willems (2000). "Unexpected life-threatening complications in Kabuki syndrome." Am J Med Genet 94: 170-173.
van, I. S. C., M. J. Tuvim, T. Weimbs, B. F. Dickey and K. E. Mostov (2002). "Direct interaction between Rab3b and the polymeric immunoglobulin receptor controls ligand-stimulated transcytosis in epithelial cells." Dev Cell 2: 219-228.
van Keimpema, L., F. Nevens, R. Vanslembrouck, M. G. van Oijen, A. L. Hoffmann, H. M. Dekker, R. A. de Man and J. P. Drenth (2009). "Lanreotide reduces the volume of polycystic liver: a randomized, double-blind, placebo-controlled trial." Gastroenterology 137: 1661-1668 e1661-1662.

Vaux, K. K., K. L. Jones, M. C. Jones, S. Schelley and L. Hudgins (2005). "Developmental outcome in Kabuki syndrome." Am J Med Genet A 132A: 263-264.
Whitfield, J. B., R. E. Pounder, G. Neale and D. W. Moss (1972). "Serum γ-glutamyl transpeptidase activity in liver disease." Gut 13: 702-708.
Wilson, G. N. (1998). "Thirteen cases of Niikawa-Kuroki syndrome: report and review with emphasis on medical complications and preventive management." Am J Med Genet 79: 112-120.
Yuan, S. M. (2013). "Congenital heart defects in Kabuki syndrome." Cardiol J 20: 121-124.
Zahraoui, A., N. Touchot, P. Chardin and A. Tavitian (1989). "The human Rab genes encode a family of GTP-binding proteins related to yeast YPT1 and SEC4 products involved in secretion." J Biol Chem 264: 12394-12401.
Zannolli, R., S. Buoni, F. Macucci, R. Scarinci, M. Viviano, A. Orsi, G. de Aloe, M. Fimiani, L. Volterrani, M. M. de Santi, C. Miracco, M. Zappella and J. Hayek (2007). "Kabuki syndrome with trichrome vitiligo, ectodermal defect and hypogammaglobulinemia A and G." Brain Dev 29: 373-376.
Ziemin-van der Poel, S., N. R. McCabe, H. J. Gill, R. Espinosa, 3rd, Y. Patel, A. Harden, P. Rubinelli, S. D. Smith, M. M. LeBeau, J. D. Rowley et al. (1991). "Identification of a gene, MLL, that spans the breakpoint in 11q23 translocations associated with human leukemias." Proc Natl Acad Sci U S A 88: 10735-10739.
