“Identification and characterization of candidate genes in individuals with autosomal recessive intellectual disability”

Identifizierung und Charakterisierung von Kandidatengenen bei Individuen mit autosomal rezessiver mentaler Retardierung

Submitted to the Faculty of Sciences of the Friedrich-Alexander-Universität Erlangen-Nürnberg for the degree of Dr. rer. nat.

Submitted by Hasan Tawamie from Edleb, Syria

Approved as a dissertation by the Faculty of Sciences of the Friedrich-Alexander-Universität Erlangen-Nürnberg

Date of the oral examination: 26.02.2018
Chair of the doctoral committee: Prof. Dr. Georg Kreimer
Reviewer: Prof. Dr. André Reis
Reviewer: Prof. Dr. Johann Brandstätter


Table of Contents

1 Summary

2 Zusammenfassung (summary in German)

3 Introduction
3.1 Intellectual disability; definition, classification
3.2 Genetic causes of intellectual disability

4 Aim of this thesis and strategies to identify genes for autosomal recessive intellectual disability

5 Families; brief clinical description and results of mapping
5.1 Families, ethics statement, and available mapping results
5.2 Homozygosity mapping

6 Methods
6.1 DNA Isolation and its quality measurement
6.2 Polymerase Chain Reaction (PCR)
6.3 Sanger Sequencing
6.4 Next Generation Sequencing Technologies
6.5 Plasmids cloning
6.6 Cell Culture
6.7 Protein isolation and concentration measurement
6.8 Small interfering RNAs (siRNAs)
6.9 Cell proliferation assay
6.10 Wound-healing assay
6.11 SK-N-BE(2) differentiation assay

7 Reagents and Materials
7.1 Consumables
7.2 Chemicals, Enzymes, Standards
7.3 Buffers and Solutions
7.4 Vectors
7.5 Antibodies
7.6 TaqMan Probes
7.7 Kits
7.8 Appliances
7.9 Databases and Software

8 Results
8.1 Exome sequencing of nineteen affected individuals with intellectual disability
8.2 Pathogenic variants in known genes for intellectual disability
8.3 Variants in novel genes

9 Discussion
9.1 Identification of mutations in known ID genes
9.2 Identification of pathogenic variants in novel ID genes
9.3 Concluding remarks

10 Literaturverzeichnis (References)

11 Curriculum vitae of Hasan Tawamie

12 List of Publications

13 Danksagung (Acknowledgements)

14 Erklärung (Declaration)

1 Summary
By applying a combination of homozygosity mapping and whole exome sequencing, I aimed in this thesis to identify the genetic causes of intellectual disability in nineteen consanguineous families with two or more affected children. After identifying genetic variants, I prioritized them based on frequency, computer-based modeling, protein function, and familial segregation. In two families, I identified homozygous pathogenic variants in AHI1 and SPG20, which were already associated with Joubert syndrome and Troyer syndrome, respectively. In 14 families, I identified 15 candidate variants in genes not previously associated with intellectual disability: BDH1, CCAR2, EZR, FAR1, KCTD18, KIAA0586, LRCH3, OGDHL, PGAP1, PGAP2, PUS7, SKIDA1, SVBP, TAF13, and TMTC3. In three families, I could identify no candidate variants. To confirm the pathogenicity of the identified variants, working hypotheses were generated based on the putative variant effect and/or the protein function, and on the results of molecular modeling of the altered protein. For SVBP, I could show a severe reduction in the secretion of its interaction partner VASH1 in cells overexpressing the altered SVBP. For TAF13, I could show that the variant leads to a major disruption of the interaction between TAF13 and TAF11. TAF13 is involved in transcription. I therefore also performed whole RNA sequencing on neuron-like cell lines in which TAF13 had been knocked down. This revealed that the transcription of genes involved in the proliferation and differentiation of neurons is affected, which I confirmed in cell cultures. Since further analyses were not possible within this thesis, experiments to prove the pathogenicity and/or causality of the variants in EZR, FAR1, KIAA0586, PGAP1, PGAP2, and TMTC3 were done by cooperation partners, after I had performed the preliminary work of validation, segregation analysis, testing for frequency in the general population, and vector construction.
In conclusion, in this thesis I could identify or contribute to the identification of 8 novel genes for autosomal recessive intellectual disability.


2 Zusammenfassung
In this work, I used a combination of homozygosity mapping and exome sequencing to investigate 19 consanguineous families with two or more children with intellectual disability (German: mentale Retardierung, MR). I prioritized the identified candidate variants on the basis of variant frequency, computer-based modeling analysis, protein function, and familial segregation. In two families, I identified one pathogenic homozygous variant each in the genes AHI1 and SPG20, thereby establishing the presence of Joubert syndrome and Troyer syndrome, respectively. In 14 families, I identified candidate variants in 15 genes that had not yet been associated with intellectual disability: BDH1, CCAR2, EZR, FAR1, KCTD18, KIAA0586, LRCH3, OGDHL, PGAP1, PGAP2, PUS7, SKIDA1, SVBP, TAF13, and TMTC3. In 3 families, I could not identify any candidate variants. To confirm the pathogenicity of the identified variants, I formulated a working hypothesis based on the putative effect of the variants, the protein functions, and the modeling analyses, which I then pursued with functional analyses. For the variant in SVBP, I showed that the unstable, altered SVBP causes a severe reduction in the secretion of its interaction partner VASH1. For the variant in TAF13, I demonstrated that the variant disrupts the interaction between TAF13 and TAF11. TAF13 is involved in transcription. I therefore performed a transcriptome-wide sequencing analysis and could show that the proliferation and differentiation of neurons in particular are affected. I then showed this in cell cultures.
Because further analyses to confirm the pathogenicity and/or causality of other identified variants were not possible within the scope of this work, these were carried out by cooperation partners for the genes FAR1, KIAA0586, PGAP1, PGAP2, and TMTC3, after I had performed the basic analyses (validation, segregation, prevalence in the general population, and vector construction). In summary, in this work I was able to contribute to the identification and characterization of 8 novel genes for autosomal recessive intellectual disability.


3 Introduction
3.1 Intellectual disability; definition, classification
Intellectual disability (ID) is a condition of arrested or incomplete development of the mind, characterized by impairment of skills that contribute to the overall level of intelligence, i.e. cognitive, language, motor, and social abilities. ID can occur in isolation or be accompanied by other congenital and/or neurological features such as epilepsy, sensory impairment, and autism spectrum disorders (ASD) (Vissers et al. 2016). Many studies report that the prevalence of ID ranges between 1 and 3% (Moeschler and Shevell 2014). It has also been reported that the incidence of ID is inversely correlated with socio-economic standards, with malnutrition, cultural deprivation, and inadequate health care as contributing factors. With lifetime costs of 1–2 million US dollars, ID is one of the leading socio-economic challenges of health care in Europe and the United States (Polder et al. 2002). However, ID still receives little public attention, because many health care professionals, organizations, and parents do not perceive it as a health condition but rather as a social or educational issue (Salvador-Carulla and Bertelli 2008). Classification of ID provides some clarity about its severity and the level of support that will be required in the educational system and beyond. Based on the degree of mental impairment, ID is classified as profound (IQ<20), severe (IQ 20–34), moderate (IQ 35–49), or mild (IQ 50–69) (Rabe-Jablonska and Bienkiewicz 1994).
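
As a small illustration, the IQ bands listed above can be written as a lookup function. This is only a sketch of the thresholds quoted in the text; the function name and the label returned for scores of 70 and above are illustrative assumptions, and a clinical classification considers more than the IQ score alone.

```python
def classify_id_severity(iq: int) -> str:
    """Map an IQ score to the severity band quoted in the text.

    Bands: profound (<20), severe (20-34), moderate (35-49), mild (50-69).
    The label for scores >= 70 is an assumption made for this sketch.
    """
    if iq < 0:
        raise ValueError("IQ must be non-negative")
    if iq < 20:
        return "profound"
    if iq <= 34:
        return "severe"
    if iq <= 49:
        return "moderate"
    if iq <= 69:
        return "mild"
    return "not classified as ID by IQ"


if __name__ == "__main__":
    for score in (15, 30, 45, 60, 85):
        print(score, classify_id_severity(score))
```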

3.2 Genetic causes of intellectual disability
While many exogenous factors such as maternal alcohol abuse during pregnancy, infections, birth complications, and extreme malnutrition play a role in the etiology of ID, pathogenic genetic variants seem to be the cause in the majority of cases, at least in countries with developed health systems (van Bokhoven 2011; Rauch et al. 2012). The widespread introduction of cytogenetic technologies allowed chromosomal abnormalities, including numerical abnormalities such as trisomy 21 (Down syndrome), to be recognized as a common cause of ID in about 15% of cases (van Karnebeek, Michelson et al. 2011). The introduction of genomic microarrays allowed genome-wide detection of copy number variants (CNVs) at finer resolution. Microarrays rapidly replaced G-banded karyotyping as the first-tier test for ID (Miller et al. 2010) and revealed submicroscopic aberrations as disease-causing in about 6-10% of patients with intellectual disability (Willemsen and Kleefstra 2014). For many years, linkage analyses using large families with only male patients led to the identification of several genes on the X chromosome. The heterogeneous group of X-linked intellectual disability disorders (XLID) now comprises more than 100 genes that explain 5%-10% of ID in males (Lubs et al. 2012). In contrast, progress in identifying autosomal recessive intellectual disability (ARID) genes was slow, primarily because of small family sizes in Western countries, but also because most ID cases are clinically indistinguishable yet genetically heterogeneous, which impedes positional mapping. The application of homozygosity mapping in consanguineous families followed by next generation sequencing (NGS), as was done in this study, has opened new perspectives for the identification of pathogenic variants related to ARID and has led to the discovery of new causative genes. For example, in 2011, Ropers, Najmabadi and colleagues applied targeted NGS and homozygosity mapping to a cohort of 136 consanguineous families and reported pathogenic variants in 23 genes previously implicated in ID or related neurological disorders as well as 50 probably disease-causing variants in novel candidate genes (Najmabadi et al. 2011). In a more recent study by our group around Abou Jamra, which includes the data of this thesis, 152 consanguineous families with ID were analyzed using a combination of homozygosity mapping and whole exome sequencing. In 72% of the families, potentially protein-disrupting and clinically relevant variants were identified; of them, 38.8% had pathogenic variants in recessive genes already established in neurodevelopmental disorders, while 31.6% of the families had potentially protein-disrupting variants in 52 candidate genes that had not otherwise been described in neurodevelopmental disorders (Reuter et al. 2017). Until recently, and still during the first few years of this study, researchers were convinced that only a minority of cases are due to autosomal dominant forms of ID and that the majority are due to autosomal recessive causes.
However, major advances have been made in the genetics of autosomal dominant intellectual disability, and it has been shown that the majority of sporadic cases of ID with unrelated parents are due to de novo mutations (Rauch et al. 2012; McRae et al. 2017). Trio-based exome sequencing was quickly adopted as an effective approach for identifying de novo, autosomal dominant pathogenic variants in ID and other neurodevelopmental disorders, especially in Western societies, where most non-familial cases of ID are due to pathogenic, autosomal dominant de novo variants. In 2010, Vissers and colleagues studied ten trios and identified and validated de novo variants in nine genes, six of which were likely pathogenic (Vissers et al. 2010). Shortly afterwards, Rauch and colleagues identified de novo variants in known intellectual disability genes in 31.8% of a cohort of 51 patients (Rauch et al. 2012). A recent landmark trio-based exome sequencing study of 4,293 cases with undiagnosed developmental disorders found a diagnostic yield as high as 42% for de novo pathogenic variants (McRae et al. 2016). Taken together, in addition to chromosomal abnormalities, over 1600 single ID genes have been identified to date, with X-linked, autosomal dominant, and autosomal recessive inheritance, in both isolated and syndromic ID (Koshinke et al. 2016). Many more genes are expected to be identified soon, confirming the unprecedented genetic heterogeneity of this condition.


4 Aim of this thesis and strategies to identify genes for autosomal recessive intellectual disability
In this study, I aimed to identify and characterize genes that, when mutated, lead to autosomal recessive intellectual disability. In nineteen consanguineous families with several affected children, I applied a combination of homozygosity mapping and whole exome sequencing to identify candidate pathogenic variants. I then aimed to confirm their pathogenicity and causality through functional analyses and through the identification of additional pathogenic variants in similarly affected families.
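
The combined strategy described above can be sketched as a simple filter: keep exome variants that are homozygous in all affected individuals, lie inside a candidate region from homozygosity mapping, and are rare in the general population. The sketch below is illustrative only. The class names, field names, placeholder gene names, coordinates, and the 1% frequency cutoff are assumptions made for the example, not the settings used in this thesis.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Region:
    """A candidate homozygous region (hypothetical coordinates)."""
    chrom: str
    start: int
    stop: int


@dataclass
class Variant:
    """A simplified exome variant record (hypothetical fields)."""
    gene: str
    chrom: str
    pos: int
    homozygous_in_all_affected: bool
    population_frequency: float


def in_candidate_region(variant: Variant, regions: List[Region]) -> bool:
    """True if the variant lies inside any candidate region."""
    return any(
        r.chrom == variant.chrom and r.start <= variant.pos <= r.stop
        for r in regions
    )


def prioritize(variants: List[Variant], regions: List[Region],
               max_frequency: float = 0.01) -> List[Variant]:
    """Keep rare variants that segregate and map to a candidate region."""
    return [
        v for v in variants
        if v.homozygous_in_all_affected
        and v.population_frequency <= max_frequency
        and in_candidate_region(v, regions)
    ]


# Placeholder data, not taken from the thesis.
regions = [Region("6", 1_000_000, 2_000_000)]
variants = [
    Variant("GENE_A", "6", 1_500_000, True, 0.0),    # passes all filters
    Variant("GENE_B", "6", 1_600_000, True, 0.05),   # too frequent
    Variant("GENE_C", "6", 5_000_000, True, 0.0),    # outside the region
    Variant("GENE_D", "6", 1_700_000, False, 0.0),   # does not segregate
]
candidates = prioritize(variants, regions)
print([v.gene for v in candidates])
```

Criteria such as computer-based modeling and protein function are not captured by a filter like this and require case-by-case assessment.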


5 Families; brief clinical description and results of mapping
5.1 Families, ethics statement, and available mapping results
This study was approved by the ethics committees of the University of Bonn and the University of Erlangen-Nuremberg in Germany. Informed consent was obtained from all examined persons or their legal guardians. Most of the examined families were interviewed and recruited by PD Dr. med. Rami Abou Jamra in different parts of Syria between 2008 and 2010. The remaining families were examined at different Institutes of Human Genetics in Germany. All families were consanguineous and had more than one affected child with intellectual disability. The examining physicians documented the clinical history of all affected individuals and their family history, including a detailed pedigree. For more details regarding family structure and symptoms, see Table 5.1. In addition, a control group of 376 healthy Syrian subjects of both sexes was examined. Blood samples were obtained from all family members and controls, and genomic DNA was isolated in the laboratories of the Institutes of Human Genetics in Bonn and Erlangen. In addition, lymphoblastoid cell lines (LCL) were produced, cultured and assimilated from the lymphocytes for most of the families. In most affected individuals, chromosomal aberrations and microdeletions were excluded in advance by karyotyping and array CGH with SNP arrays, as were pathogenic FMR1 repeat expansions causing fragile X syndrome.


Family ID | Number and sex of patients | Ethnic origin | HPO
MR020 | 1 male | Syria | severe ID, muscular hypotonia, short stature
MR023 | 1 female and 1 male | Syria | very severe ID, abnormalities of the placenta, small for gestational age, muscular hypotonia, congenital cataract, constipation, bruxism, autism, microcephaly, seizures, Dandy-Walker malformation, cerebellar vermis hypoplasia
MR026 | 1 female and 1 male | Syria | severe ID, muscular hypotonia, seizures, abnormalities of the face, myopia, strabismus, hypothyroidism, molar tooth sign on MRI, cerebellar hypoplasia
MR027 | 1 female and 1 male | Syria | mild ID, microcephaly, growth retardation, puberty delay
MR028 | 1 female and 1 male | Syria | very severe ID, seizures, muscular hypotonia, limb hypertonia, spasticity, short stature, microcephaly, leukodystrophy
MR037 | 2 males | Syria | profound ID, joint contractures, seizures, cerebral atrophy, hypoplastic corpus callosum, leukodystrophy
MR040 | 1 female and 1 male | Syria | moderate ID, small for gestational age, short stature
MR043 | 3 females (a and b branch) | Syria | very severe ID, decreased fetal movements, muscular hypotonia, absence seizures, sleep disturbances, short stature, cerebral atrophy, Dandy-Walker malformation
MR046 | 2 males | Syria | delayed motor and mental development, intellectual disability, aggressivity, extreme hyperactivity, microcephaly, short stature, provocative
MR053 | 2 females | Syria | moderate ID, short stature, microcephaly, dislocated hips
MR063 | 1 female and 1 male | Syria | moderate ID, microcephaly, ataxia
MR064 | 1 female and 1 male | Syria | mental deterioration, seizures, tremor, rigidity, dystonia
MR079 | 1 female and 1 male | Syria | very severe ID, muscular hypotonia, stereotypical motor behaviors, seizures, cerebral atrophy
MR083 | 2 females and 1 male | Syria | severe ID, small for gestational age, strabismus, short stature
MR101 | 2 females and 1 male | Syria | severe ID, seizures, muscular hypotonia, cardiac malformation, cerebral atrophy
MR105 | 2 females | Iraq | very severe ID, limb hypertonia, ataxia, deafness, cerebellar atrophy
MR121 | 1 female and 1 male | Turkey | mild ID, talipes equinovarus, Dandy-Walker malformation, ventriculomegaly
MR129 | 2 females | Pakistan | severe ID, muscular hypotonia, ataxia, microcephaly
MR-TUR-03-01 | 1 female and 1 male | Turkey | moderate ID, muscular hypotonia, spasticity, ataxia, pes cavus

Table 5.1: Overview of the family structure, geographic origin and phenotypic presentation of investigated families


5.2 Homozygosity mapping
Consanguineous marriage is defined as a union between individuals related as second cousins or closer. The increased risk of autosomal recessive disorders depends on the degree of relatedness between the parents. Homozygosity mapping, also called autozygosity mapping, is a classical and highly successful method of mapping recessive traits in consanguineous families. Compared with linkage analysis, homozygosity mapping is easy to perform, yields the same positional information, and works with a minimal number of family members (Abou Jamra et al. 2011). The standard workflow of homozygosity mapping starts with genotyping of single nucleotide polymorphisms (SNPs), followed by the construction of haplotypes, either manually or using software. If more consecutive SNPs are homozygous than expected by chance, it can be assumed that those alleles are identical by state (or by descent if the consanguinity is known), and the stretch is considered a run of homozygosity (ROH) (Gibbs and Singleton 2006). Consanguinity leads to an enrichment of homozygous variants, including pathogenic variants that are otherwise rare in the general population. In this study, nineteen consanguineous families with autosomal recessive intellectual disability were recruited in preliminary work by PD Dr. R. Abou Jamra. After exclusion of chromosomal aberrations and other common forms of intellectual disability, such as fragile X syndrome, genotyping was performed using Illumina or Affymetrix arrays. Homozygosity mapping was performed using the software HomozygosityMapper (Seelow et al. 2009) as described before (Abou Jamra et al. 2011). As a result, many candidate regions were identified in which the causal pathogenic variants are most likely located (Table 5.2).
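
The idea of a run of homozygosity can be illustrated with a minimal sketch that scans ordered SNP genotypes of one individual and reports stretches of consecutive homozygous calls. This is not the algorithm of HomozygosityMapper. The function name, the genotype encoding, and the `min_snps` threshold are assumptions for illustration, and a real analysis uses many thousands of markers, tolerates occasional genotyping errors, and compares affected and unaffected family members.

```python
from typing import List, Tuple


def find_runs_of_homozygosity(
    genotypes: List[Tuple[int, str]], min_snps: int = 3
) -> List[Tuple[int, int, int]]:
    """Return (start_bp, stop_bp, n_snps) for each homozygous run.

    genotypes: (position_bp, genotype) pairs sorted by position on one
    chromosome; a genotype is a two-letter string such as "AA" or "AG".
    """
    runs = []
    current: List[int] = []
    for pos, gt in genotypes:
        if len(gt) == 2 and gt[0] == gt[1]:
            current.append(pos)          # homozygous call extends the run
        else:
            if len(current) >= min_snps:
                runs.append((current[0], current[-1], len(current)))
            current = []                 # heterozygous call ends the run
    if len(current) >= min_snps:         # close a run at the chromosome end
        runs.append((current[0], current[-1], len(current)))
    return runs


# Toy genotypes, not real data.
snps = [(100, "AA"), (200, "GG"), (300, "CC"),
        (400, "AG"), (500, "TT"), (600, "TT")]
print(find_runs_of_homozygosity(snps, min_snps=3))
```

In this toy example only the first three SNPs form a run; the two homozygous calls after the heterozygous SNP fall below the chosen threshold.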

Family | Chromosome | Start (rs#) | Stop (rs#) | Start (bp) | Stop (bp) | Length (Mb) | Total length (Mb)
MR020 | 6 | rs2243372 | rs1000738 | 123,101,646 | 137,828,234 | 14.7 | 28.9
MR020 | 14 | rs1051860 | rs2526907 | 58,838,667 | 73,047,483 | 14.2 |
MR023 | 4 | rs2017368 | rs313942 | 111,364,166 | 113,972,855 | 2.6 | 24.6
MR023 | 6 | rs4896325 | rs9321970 | 138,484,928 | 144,671,770 | 6.2 |
MR023 | 11 | rs4520590 | rs2654008 | 7,712,660 | 19,368,171 | 11.7 |
MR023 | 15 | rs1378940 | rs3743198 | 75,083,493 | 79,273,984 | 4.2 |
MR026 | 1 | rs11162494 | rs17436482 | 78,938,708 | 88,739,922 | 9.8 | 67.1
MR026 | 1 | rs480993 | rs11583286 | 234,448,968 | 240,842,075 | 6.4 |
MR026 | 7 | rs1029544 | rs4719732 | 14,124,064 | 23,547,689 | 9.4 |
MR026 | 11 | rs3924580 | rs4146870 | 42,653,782 | 60,231,985 | 17.6 |
MR026 | 14 | rs7159296 | rs2074953 | 57,177,169 | 72,943,175 | 15.8 |
MR026 | 18 | rs2850889 | rs1788659 | 74,958,606 | 77,429,700 | 2.5 |
MR026 | 22 | rs3788268 | rs5996421 | 17,626,664 | 23,288,276 | 5.7 |
MR027 | 1 | rs6699397 | rs7737 | 91,212,215 | 112,026,151 | 20.8 | 63.0
MR027 | 5 | rs1841991 | rs7701153 | 78,276,147 | 98,645,067 | 20.4 |
MR027 | 11 | rs12785524 | rs7103667 | 69,177,046 | 76,747,389 | 7.6 |
MR027 | 12 | rs12422236 | rs11170521 | 33,232,041 | 39,448,671 | 6.2 |
MR027 | 17 | rs1106175 | rs3027208 | 6,887 | 8,014,307 | 8.0 |
MR028 | 1 | rs10737910 | rs294230 | 16,109,668 | 23,019,543 | 6.9 | 53.9
MR028 | 1 | rs3790461 | rs12068503 | 156,469,363 | 157,890,525 | 1.4 |
MR028 | 3 | rs17205271 | rs12489012 | 184,968,087 | 197,833,757 | 12.9 |
MR028 | 4 | rs13104944 | rs6552896 | 179,440,708 | 186,530,975 | 7.1 |
MR028 | 12 | rs2037410 | rs7971375 | 43,250,356 | 51,939,842 | 8.7 |
MR028 | 12 | rs470506 | rs11147298 | 129,441,109 | 133,777,644 | 4.3 |
MR028 | 17 | rs4372746 | rs271674 | 39,765,713 | 47,978,985 | 8.2 |
MR028 | 19 | rs6509279 | rs16987929 | 46,978,568 | 51,385,252 | 4.4 |
MR037 | 18 | rs4798592 | rs206501 | 7,747,872 | 10,426,034 | 2.7 | 18.5
MR037 | 22 | rs1006015 | rs713989 | 17,722,535 | 25,506,778 | 7.8 |
MR037 | 6 | rs162977 | rs10945891 | 155,649,655 | 163,733,302 | 8.1 |
MR040 | 3 | rs13072906 | rs7643631 | 145,079,738 | 180,578,828 | 35.5 | 132.3
MR040 | 8 | rs2514317 | rs3019259 | 99,192,418 | 102,067,128 | 2.9 |
MR040 | 8 | rs7824735 | rs7000214 | 16,874,269 | 41,744,093 | 24.9 |
MR040 | 8 | rs7830006 | rs4738034 | 49,040,835 | 70,761,909 | 21.7 |
MR040 | 10 | rs11010640 | rs4293075 | 36,648,745 | 60,561,750 | 23.9 |
MR040 | 12 | rs7295444 | rs17041197 | 72,619,187 | 96,016,923 | 23.4 |
MR043a | 4 | rs11727749 | rs4407490 | 11,535,284 | 26,839,459 | 15.3 | 106.4
MR043a | 11 | rs6421982 | rs6578953 | 339,942 | 8,333,825 | 8.0 |
MR043a | 13 | rs4942757 | rs975797 | 48,787,459 | 73,797,951 | 25.0 |
MR043a | 14 | rs7140189 | rs4899737 | 74,686,728 | 79,709,509 | 5.0 |
MR043a | 18 | rs11872913 | rs10438911 | 11,180,759 | 35,819,781 | 24.6 |
MR043a | 19 | rs4802726 | rs3760698 | 51,178,686 | 54,604,193 | 3.4 |
MR043a | 20 | rs6033078 | rs7509215 | 11,250,347 | 36,297,541 | 25.0 |
MR043ab | 11 | rs3802985 | rs7126612 | 198,509 | 6,610,562 | 6.4 | 6.4
MR043b | 1 | rs2365687 | rs4926506 | 244,965,587 | 249,081,329 | 4.1 | 96.5
MR043b | 1 | rs319952 | rs17118876 | 49,113,621 | 59,616,909 | 10.5 |
MR043b | 1 | rs6676886 | rs11165989 | 91,244,342 | 98,783,412 | 7.5 |
MR043b | 6 | rs1044670 | rs235460 | 56,891,978 | 79,052,612 | 22.2 |
MR043b | 7 | rs961652 | rs1866577 | 34,111,659 | 45,252,656 | 11.1 |
MR043b | 8 | rs7461969 | rs2300661 | 20,590,295 | 39,614,420 | 19.0 |
MR043b | 9 | rs1611131 | rs4295734 | 136,522,186 | 141,064,740 | 4.5 |
MR043b | 11 | rs7930823 | rs1800752 | 206,766 | 6,640,868 | 6.4 |
MR043b | 14 | rs7155319 | rs1269388 | 84,471,633 | 93,665,656 | 9.2 |
MR043b | 19 | rs4802875 | rs4801994 | 52,330,308 | 54,131,637 | 1.8 |
MR053 | 2 | rs6716370 | rs11896227 | 194,667,874 | 201,577,286 | 6.9 | 42.9
MR053 | 7 | rs6975142 | rs705324 | 82,017,448 | 98,162,255 | 16.1 |
MR053 | 10 | rs10795412 | rs2076951 | 16,714,465 | 33,925,713 | 17.2 |
MR053 | 12 | rs10850642 | rs4075946 | 116,891,823 | 119,570,063 | 2.7 |
MR063 | 1 | rs2007350 | rs783053 | 29,646,126 | 66,848,704 | 37.2 | 49.8
MR063 | 9 | rs10814410 | rs3793511 | 0 | 2,170,511 | 2.2 |
MR063 | 12 | rs17045900 | rs2220686 | 79,295,600 | 89,700,459 | 10.4 |
MR064 | 2 | rs829610 | rs2421844 | 30,631,692 | 74,540,093 | 43.9 | 82.8
MR064 | 3 | rs1911979 | rs6793853 | 183,186,074 | 189,044,502 | 5.9 |
MR064 | 6 | rs12180285 | rs1473745 | 194,720 | 17,102,757 | 16.9 |
MR064 | 14 | rs12881478 | rs2803958 | 68,227,631 | 73,048,943 | 4.8 |
MR064 | 14 | rs28508074 | rs915071 | 26,331,878 | 32,433,857 | 6.1 |
MR064 | 22 | rs9605084 | rs2103587 | 20,192,581 | 25,374,502 | 5.2 |
MR079 | 2 | rs4667666 | rs7581786 | 171,753,523 | 205,385,337 | 33.6 | 64.4
MR079 | 2 | rs6738683 | rs12151753 | 3,742,155 | 6,221,823 | 2.5 |
MR079 | 5 | rs12716141 | rs4370247 | 3,309,298 | 8,271,624 | 5.0 |
MR079 | 6 | rs9480343 | rs7766723 | 156,857,014 | 167,107,072 | 10.3 |
MR079 | 7 | rs6975142 | rs39027 | 82,017,448 | 89,096,540 | 7.1 |
MR079 | 11 | rs12271322 | rs4944114 | 70,388,250 | 76,367,685 | 6.0 |
MR083 | 9 | rs10815996 | rs7850937 | 8,915,141 | 21,846,284 | 12.9 | 28.9
MR083 | 10 | rs2601735 | rs603805 | 14,827,130 | 30,782,288 | 16.0 |
MR101 | 1 | rs745910 | rs10915428 | 1,829,482 | 4,158,955 | 2.3 | 5.6
MR101 | 3 | rs6799597 | rs1798624 | 195,835,519 | 196,365,765 | 0.5 |
MR101 | 10 | rs1833044 | rs9888108 | 85,655,442 | 87,311,994 | 1.7 |
MR101 | 18 | rs604050 | rs10502318 | 2,893,413 | 4,019,327 | 1.1 |
MR105 | 2 | rs705064 | rs2077724 | 129,278,456 | 144,682,018 | 15.4 | 89.6
MR105 | 2 | - | - | 239,739,972 | 242,645,262 | 2.9 |
MR105 | 4 | rs6553402 | rs10023094 | 169,939,843 | 174,356,664 | 4.4 |
MR105 | 4 | rs1916109 | rs13104911 | 181,910,926 | 185,067,204 | 3.2 |
MR105 | 6 | rs2395760 | rs9398312 | 40,896,656 | 97,521,441 | 56.6 |
MR105 | 6 | rs10484689 | rs17079048 | 147,925,750 | 150,246,386 | 2.3 |
MR105 | 12 | rs2416674 | - | 6,777,287 | 11,586,582 | 4.8 |
MR121 | 7 | rs2354951 | rs10215885 | 11,568,669 | 20,420,170 | 8.9 | 151.6
MR121 | 9 | rs10217627 | rs4262391 | 83,900,505 | 91,189,708 | 7.3 |
MR121 | 11 | rs10838573 | rs483182 | 46,190,981 | 56,471,194 | 10.3 |
MR121 | 12 | rs1908666 | rs12809872 | 69,375,205 | 130,862,390 | 61.5 |
MR121 | 16 | rs1878543 | rs4412968 | 5,495,968 | 10,492,468 | 5.0 |
MR121 | 16 | rs691656 | rs2317971 | 57,921,492 | 82,460,189 | 24.5 |
MR121 | 18 | rs12608218 | rs8087246 | 69,314,227 | 71,389,525 | 2.1 |
MR121 | 21 | rs2830622 | rs2839373 | 28,386,662 | 48,069,930 | 19.7 |
MR121 | 22 | rs138368 | rs3810648 | 38,790,528 | 51,175,626 | 12.4 |
MR129 | 5 | rs11744669 | rs13177248 | 172,725,336 | 180,685,900 | 8.0 | 92.1
MR129 | 7 | rs3936103 | rs2371642 | 141,795,281 | 144,513,549 | 2.7 |
MR129 | 18 | rs1443454 | rs8096030 | 25,554,315 | 67,038,133 | 41.5 |
MR129 | 19 | rs2058459 | rs1870063 | 2,663,849 | 18,170,962 | 15.5 |
MR129 | 20 | rs6016377 | rs12480151 | 39,172,728 | 56,221,849 | 17.0 |
MR129 | 21 | rs7278018 | rs2829862 | 19,664,113 | 27,054,612 | 7.4 |
MR-TUR-03 | 11 | rs7479420 | rs7108803 | 97,157,660 | 112,263,027 | 15.1 | 51.8
MR-TUR-03 | 13 | rs11616506 | rs9535757 | 30,125,816 | 52,352,132 | 22.2 |
MR-TUR-03 | 16 | rs4785229 | rs150724 | 50,872,995 | 63,402,942 | 12.5 |
MR-TUR-03 | 17 | rs12451599 | rs2721853 | 10,285,967 | 12,244,285 | 2.0 |

Table 5.2: Homozygosity mapping results of the nineteen families
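
The length columns of Table 5.2 appear to follow directly from the base-pair coordinates. The short sketch below recomputes the region lengths and the total for family MR020 from its start and stop positions, assuming that Length (Mb) is simply the stop position minus the start position divided by one million and rounded to one decimal place. The function and variable names are illustrative.

```python
def region_length_mb(start_bp: int, stop_bp: int) -> float:
    """Length of a region in megabases (assumed: (stop - start) / 1e6)."""
    return (stop_bp - start_bp) / 1_000_000


# Candidate regions of family MR020 as (chromosome, start_bp, stop_bp),
# copied from Table 5.2.
mr020_regions = [
    (6, 123_101_646, 137_828_234),
    (14, 58_838_667, 73_047_483),
]

lengths_mb = [round(region_length_mb(s, e), 1) for _, s, e in mr020_regions]
total_mb = round(sum(region_length_mb(s, e) for _, s, e in mr020_regions), 1)

print(lengths_mb)  # per-region lengths in Mb
print(total_mb)    # total homozygous length in Mb
```

For MR020 this reproduces the tabulated values of 14.7 Mb, 14.2 Mb, and a total of 28.9 Mb. For some other families the sum of the rounded per-region lengths differs from the tabulated total by 0.1 to 0.2 Mb, which is consistent with the total having been computed before rounding.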


6 Methods
6.1 DNA Isolation and its quality measurement
6.1.1 Genomic DNA extraction from human blood
Genomic DNA of affected individuals and healthy controls was obtained from peripheral blood leukocytes by the Genetic Diagnostics Laboratory of the Human Genetics Institute of Erlangen under the supervision of Dr. rer. nat. C. Kraus. After measurement of DNA concentration and quality, all samples were stored at 4 °C.

6.1.2 RNA extraction from human lymphoblastoid cell lines and other cell lines
For RNA extraction, approximately 30 ml of a lymphoblastoid cell line (LCL) suspension was centrifuged at 1000 rcf and 4 °C for 10 min; the supernatant was discarded and the cells were washed twice with PBS. Adherent cells were first detached with 0.05% trypsin/EDTA and taken up in DMEM, then washed and pelleted in the same way as the LCLs. The RNA was then isolated from the cell pellet using the RNeasy mini kit (Qiagen) according to the manufacturer's instructions.

6.1.3 Plasmid DNA extraction
Plasmid DNA was extracted according to the principle of alkaline lysis (Birnboim and Doly 1979). For this purpose, 3 ml of LB medium was mixed with 50 μg/ml of the corresponding antibiotic (ampicillin or kanamycin, depending on the plasmid of interest) and inoculated with the desired bacterial colony or a bacterial cryostock. The bacterial culture was incubated overnight at 37 °C with constant shaking. Then 2 ml of the culture was pelleted for 10 min at 2300 rcf and the plasmid DNA was isolated with the QIAprep Spin Miniprep Kit (Qiagen) according to the manufacturer's instructions. To obtain larger amounts of plasmid DNA, 100 ml of medium containing 50 μg/ml antibiotic was inoculated with the bacterial colony carrying the corresponding plasmid and incubated overnight. Subsequently, the plasmid DNA was extracted with the Nucleobond Xtra Midi plus kit (Macherey-Nagel) according to the manufacturer's instructions.

6.1.4 Quality control for isolated nucleic acids
The quality and quantity of the isolated nucleic acids were determined photometrically using the Infinite 200 NanoQuant instrument (Tecan).
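
Photometric quantification of nucleic acids is commonly based on the absorbance at 260 nm, with the A260/A280 ratio as a purity indicator. The sketch below shows this general calculation. It is not a description of the instrument's software or of the settings used in this thesis. The conversion factors (an A260 of 1.0 corresponding to about 50 ng/μl double-stranded DNA or about 40 ng/μl RNA at a 1 cm path length) are widely used conventions, and the function names and example readings are illustrative.

```python
# Conventional conversion factors in ng/ul per A260 unit (1 cm path length).
A260_FACTORS = {"dsDNA": 50.0, "RNA": 40.0}


def concentration_ng_per_ul(a260: float, kind: str = "dsDNA",
                            dilution_factor: float = 1.0) -> float:
    """Estimate the nucleic acid concentration from the A260 reading."""
    return a260 * A260_FACTORS[kind] * dilution_factor


def purity_ratio(a260: float, a280: float) -> float:
    """A260/A280 ratio; about 1.8 (DNA) or 2.0 (RNA) is usually read as pure."""
    return a260 / a280


# Hypothetical readings, not measurements from this study.
print(concentration_ng_per_ul(0.5, "dsDNA"))       # 25.0 ng/ul
print(concentration_ng_per_ul(0.25, "RNA", 2.0))   # 20.0 ng/ul
print(purity_ratio(0.9, 0.5))                      # 1.8
```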


6.1.5 Gel Electrophoresis
DNA and RNA fragments were separated according to their size by horizontal gel electrophoresis. For this, gels with 1-1.5% agarose in TBE buffer were used. For DNA staining, 0.1 μl/ml of ethidium bromide was added to the gel. The samples were mixed 1:1 with 2x loading buffer and loaded into the gel wells. To estimate the size of the fragments, 500 ng of the 1 kb standard for larger products and 500 ng of the 100 bp standard (New England Biolabs) were loaded alongside. A voltage of 120 V was then applied for 25 minutes to separate the DNA or RNA molecules by size. A UV transilluminator (UV Star, Biometra) and the BioDocAnalyze software (Biometra) were used to document the results.

6.1.6 DNA extraction from agarose gel
To extract the DNA from the gel, each band was partially cut out of the gel with a scalpel and the PCR product was then extracted using the QIAquick Gel Extraction Kit (Qiagen) according to the manufacturer's instructions.

6.1.7 Whole Genome Amplification
Because the amount of DNA isolated from patients was limited, and to conserve it for subsequent amplification and sequencing, 10 ng of DNA was amplified using the Illustra Genomiphi V2 DNA Amplification Kit (GE Healthcare) according to the manufacturer's instructions. Since the amplification can produce artefacts that may distort the sequencing results, all results obtained on Genomiphi DNA were validated on the original DNA.

6.2 Polymerase Chain Reaction (PCR)
With the help of a thermostable DNA polymerase from Thermus aquaticus, defined fragments of the genome can be exponentially amplified in a polymerase chain reaction (Saiki et al. 1985). Primers were designed according to the following criteria: the primer length should be between 19 and 25 bp, the melting temperature should be between 59 °C and 63 °C, and the 3' end should terminate with a G or C. The Primer3 program (Rozen and Skaletsky 2000) was used for part of the design. The primers were then tested with the BLAT and in silico PCR tools of the UCSC Genome Browser. The primers were synthesized by Thermo Scientific or Metabion and shipped in lyophilized form. In this work, different approaches and programs were used to amplify DNA or cDNA. The success of the reactions was tested by gel electrophoresis.
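
The three primer criteria above can be checked with a few lines of code. The sketch below is illustrative only: the text does not state how the melting temperature was calculated, so the simple Wallace rule (2 °C per A/T and 4 °C per G/C) is used here as an assumption. Primer3 uses a more detailed thermodynamic model and will give different values. The function names and the example sequence are hypothetical.

```python
def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C using the Wallace rule."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def check_primer(primer: str) -> dict:
    """Check a primer against the length, Tm, and 3' end criteria."""
    p = primer.upper()
    return {
        "length_ok": 19 <= len(p) <= 25,          # 19 to 25 bp
        "tm_ok": 59 <= wallace_tm(p) <= 63,       # 59 to 63 degrees C
        "three_prime_gc": p.endswith(("G", "C")), # 3' end is G or C
    }


# Hypothetical 20-mer, not a primer used in this study.
example = "ATGCATGCATGCATGCATGC"
print(wallace_tm(example))
print(check_primer(example))
```

For this hypothetical 20-mer with ten A/T and ten G/C bases, the Wallace rule gives 60 °C, and all three checks pass.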


6.2.1 PCR using the polymerase WinTaq
For the standard PCR reaction, a master mix was prepared containing 100-300 ng of DNA as a template, dNTPs (0.125 mM), specific forward and reverse primers for the fragment of interest (0.5 μM), 1.1 U of a thermostable recombinant Taq polymerase manufactured in house (WinTaq, provided by Professor A. Winterpacht), and an appropriate amount of 10x buffer. In some cases, DMSO (5%) and/or betaine (1 mM) were added to the master mix to increase the specificity of the PCR reaction. The standard master mix was made up as follows:

Master mix | Amount
DNA template | 1.5 μl
Forward primer (10 pM) | 1 μl
Reverse primer (10 pM) | 1 μl
Buffer 10x | 1.5 μl
MgCl2 (50 mM) | 0.45 μl
dNTPs (2.5 mM) | 0.3 μl
DMSO | 0.75 μl
Betaine | 3 μl
USF-H2O | 8.8 μl
Taq polymerase | 0.07 μl
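
When several reactions are set up in parallel, the per-reaction volumes of the master mix are usually multiplied by the number of reactions plus a small excess for pipetting losses. The sketch below illustrates this with the volumes from the master mix above. The 10% overage, the decision to add the DNA template separately to each tube, and the omission of the optional additives DMSO and betaine are assumptions made for the example.

```python
# Per-reaction volumes in ul, taken from the master mix table above.
# The DNA template is assumed to be added to each tube individually;
# the optional additives DMSO and betaine are left out of this sketch.
PER_REACTION_UL = {
    "Forward primer": 1.0,
    "Reverse primer": 1.0,
    "Buffer 10x": 1.5,
    "MgCl2 (50 mM)": 0.45,
    "dNTPs (2.5 mM)": 0.3,
    "USF-H2O": 8.8,
    "Taq polymerase": 0.07,
}


def scale_master_mix(per_reaction: dict, n_reactions: int,
                     overage: float = 0.10) -> dict:
    """Scale per-reaction volumes to n reactions plus a pipetting excess."""
    factor = n_reactions * (1 + overage)
    return {name: round(vol * factor, 2) for name, vol in per_reaction.items()}


# Example: master mix for 10 reactions with 10% excess.
for component, volume in scale_master_mix(PER_REACTION_UL, 10).items():
    print(f"{component}: {volume} ul")
```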

The program was as follows:

PCR step | Number of cycles | Temperature | Time
Initial activation | 1x | 94°C | 3 min
Denaturation | 20x | 94°C | 20 sec
Annealing | 20x | 65°C | 40 sec
Elongation | 20x | 68°C | 75 sec
Denaturation | 18x | 94°C | 20 sec
Annealing | 18x | 59°C | 40 sec
Elongation | 18x | 68°C | 75 sec
Final extension | 1x | 15°C | 5 min
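
As a quick plausibility check, the programmed hold time of this two-stage protocol can be added up from the step durations and cycle numbers. The sketch below does only this arithmetic. It ignores the heating and cooling ramps of the thermocycler, so the real run time will be longer, and the data structure is an illustrative assumption.

```python
# Each stage: (label, number of cycles, step durations in seconds).
# Durations and cycle numbers are taken from the program above.
PROGRAM = [
    ("Initial activation", 1, [180]),
    ("First cycling stage", 20, [20, 40, 75]),
    ("Second cycling stage", 18, [20, 40, 75]),
    ("Final step", 1, [300]),
]


def total_seconds(program) -> int:
    """Sum of all programmed hold times, without ramping."""
    return sum(cycles * sum(steps) for _, cycles, steps in program)


seconds = total_seconds(PROGRAM)
print(f"{seconds} s = {seconds / 60:.1f} min")
```

With the values above, the programmed hold times add up to 5610 s, or 93.5 min.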

6.2.2 PCR with commercial Taq polymerase from Life Technologies
Fragments that could not be readily amplified by the standard PCR were amplified with recombinant Taq polymerase from Life Technologies. The reaction was set up as follows:

Reaction mix | Amount
Buffer 10x | 1.5 μl
MgCl2 (50 mM) | 0.45 μl
dNTPs (2.5 mM) | 0.3 μl
Invitrogen Taq polymerase | 0.07 μl
Primer f (10 μM) | 1 μl
Primer r (10 μM) | 1 μl
DNA / cDNA | 1.5 μl / 1 μl
USF-H2O | to 15 μl


6.2.3 GC-rich system, dNTPack from Roche
Fragments with a high proportion of GC-rich sequences (> 50%) amplify very poorly with the standard PCR. For this reason, the GC-rich system dNTPack from Roche was used as follows:

Reaction mix 1 | Amount
dNTPs (2.5 mM) | 0.5 μl
USF-H2O | 12 μl / 12.5 μl
Primer f (10 μM) | 0.5 μl
Primer r (10 μM) | 0.5 μl
DNA / cDNA | 1.5 μl / 1 μl

Reaction mix 2 | Amount
H2O | 2 μl
Buffer (yellow) | 5 μl
Enzyme mix | 0.5 μl
Buffer (red) | 2.5 μl

The program was as follows:

Temperature | Time | Number of cycles
95°C | 3 min | 1x
95°C | 30 sec | 34x
60°C | 30 sec | 34x
68°C | 45 sec | 34x
68°C | 10 min | 1x
15°C | 10 min |

6.2.4 PCR with Pfu Ultra II Fusion HS DNA polymerase
For the amplification of longer DNA fragments, the Pfu Ultra II Fusion HS DNA polymerase (Stratagene) was used as follows:

Reaction mix | Amount
DNA / cDNA / Plasmid | 1.5 μl / 1 μl / 1 μl
Pfu Ultra II Buffer 10x | 2.5 μl
Primer f (10 μM) | 0.75 μl
Primer r (10 μM) | 0.75 μl
dNTPs (2.5 mM) | 2.5 μl
DMSO | 1.25 μl
Betaine (5 mM) | 5 μl
Pfu Ultra II Polymerase | 0.5 μl
USF-H2O | to 25 μl

The program was as follows:

Temperature | Time | Number of cycles
95°C | 3 min | 1x
95°C | 30 sec | 10x
65°C | 30 sec | 10x
68°C | 45 sec | 10x
95°C | 30 sec | 25x
58°C | 30 sec | 25x
68°C | 45 sec | 25x
68°C | 10 min | 1x
15°C | 10 min |


6.2.5 Reverse Transcriptase PCR (RT-PCR)
Using reverse transcription PCR (RT-PCR), the isolated mRNA was converted into cDNA (complementary DNA), which was subsequently used as a template in PCR reactions (see 6.2) and qPCR reactions (see 6.2.6). In this thesis, cDNA synthesis was performed with the SuperScript II enzyme (Life Technologies) and random hexamer and/or oligo dT primers (Life Technologies) according to the manufacturer's instructions. Approximately 500-1000 ng of RNA was used for the synthesis, and the finished cDNA was diluted to about 100 ng/μl with HPLC water and stored at -20 °C.

6.2.6 Quantitative real time PCR (qPCR)
Quantitative real time PCR (qPCR) is a method used to determine the expression of a specific gene. During the exponential phase of the PCR reaction, the relative amount of a specific transcript is quantified in relation to endogenous control transcripts, which makes it possible to compare the amount of cDNA between different samples. In this thesis, both the TaqMan and the SYBR Green methods were used.

In the TaqMan method, a fluorescence-labeled probe that binds specifically to the PCR product is used in addition to the specific primers. In the intact probe, the fluorescence is suppressed by fluorescence resonance energy transfer (FRET). In the annealing phase, both primers and probe bind to the DNA strand, and the Taq polymerase synthesizes the new DNA strand during the elongation phase. With its 5'-3' exonuclease activity, the Taq polymerase hydrolyzes the probe, and the resulting spatial separation of fluorescent dye and quencher lifts the suppression of fluorescence. The resulting fluorescence is then measured, with ROX used as a passive reference.

In SYBR Green-based qPCR, SYBR Green binds double-stranded DNA by intercalating between base pairs and fluoresces only when bound to DNA. The fluorescent signal is detected during the PCR cycle at the end of either the annealing or the extension step, when the greatest amount of double-stranded DNA product is present. However, SYBR Green detects any double-stranded DNA non-specifically. Therefore, the combination of primers and master mix must generate only a single gene-specific amplicon without any non-specific secondary products.

6.2.6.1 TaqMan qPCR
In this method, TaqMan Gene Expression Master Mix (Life Technologies) was used together with pre-tested TaqMan gene expression assays for the genes to be examined as well as for four control genes (ACTB, B2M, PO, TBP). The qPCR was performed in a 384-well plate with four technical replicates. The reaction was set up as follows:

Reaction mix          Amount
TaqMan Master Mix     7.5 μl
TaqMan Assay          0.75 μl
cDNA                  2 μl
HPLC water            4.75 μl
Total volume          15 μl

The amplification and detection took place in the ABI PRISM 7900HT Sequence Detection System (Life Technologies) or in the QuantStudio 12K Flex (Life Technologies) with the following program:

Temperature    Time      Number of cycles
50°C           2 min     1x
95°C           10 min    1x
95°C           15 sec    50x
60°C           1 min

6.2.6.2 SYBR Green qPCR
In this method, SYBR Green master mix (Thermo Fisher) was used together with a gene-specific primer pair as well as three internal controls (TBP, B2M, HPRT1).

Reaction mix                           Amount
SYBR Green Mastermix                   7.5 μl
Forward and reverse primer (10 pM)     0.2 μl
USF-H2O                                5.1 μl
cDNA                                   10 μl

The amplification and detection took place in the ABI PRISM 7900HT Sequence Detection System (Life Technologies) or in the QuantStudio 12K Flex (Life Technologies) with the following program:

Temperature    Time      Number of cycles
50°C           2 min     1x
95°C           15 min    1x
95°C           15 sec
55°C           30 sec    50x
72°C           30 sec

6.2.6.3 qPCR results evaluation
The analysis was done using the QuantStudio 12K Flex Software (Life Technologies) and the ExpressionSuite Software (Life Technologies). The relative gene expression was calculated via the ΔΔCT method (Pfaffl 2001).
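To illustrate the calculation, the following Python sketch computes a relative expression value from CT values of technical replicates. It is a simplified illustration rather than the procedure implemented in the software named above. It assumes a single reference gene and a perfect amplification efficiency (a doubling per cycle), whereas the analysis in this thesis used several control genes. The function and argument names are hypothetical.

```python
from statistics import mean


def relative_expression(target_ct_sample, reference_ct_sample,
                        target_ct_control, reference_ct_control):
    """Relative expression of a target gene via 2^(-ddCT).

    Each argument is a list of CT values from technical replicates.
    Assumes 100% amplification efficiency and one reference gene.
    """
    # dCT: target normalized to the reference gene within each condition
    dct_sample = mean(target_ct_sample) - mean(reference_ct_sample)
    dct_control = mean(target_ct_control) - mean(reference_ct_control)
    # ddCT: sample condition relative to the control condition
    ddct = dct_sample - dct_control
    return 2 ** (-ddct)


# Invented example values, four technical replicates each.
ratio = relative_expression(
    target_ct_sample=[24.0, 24.0, 24.0, 24.0],
    reference_ct_sample=[20.0, 20.0, 20.0, 20.0],
    target_ct_control=[22.0, 22.0, 22.0, 22.0],
    reference_ct_control=[20.0, 20.0, 20.0, 20.0],
)
print(ratio)
```

In this invented example the target needs two more cycles in the sample than in the control after normalization, which corresponds to a quarter of the control expression.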

6.3 Sanger Sequencing
6.3.1 Sequencing reaction
Purified PCR products were sequenced using the chain termination method (Lee et al. 1992). For this purpose, the BigDye Terminator v3.1 Cycle Sequencing Kit from ABI was used. This kit includes normal dNTPs and four fluorescently labeled dideoxynucleotides (ddNTPs). In the first step, double-stranded PCR products are denatured and separated at 94 °C. Only one of the two primers is used to avoid overlapping sequences, and the polymerization is carried out by a DNA polymerase. The incorporation of a ddNTP leads to chain termination because ddNTPs lack the hydroxyl groups at the 2' and 3' carbons of the ribose (Sanger et al. 1977). The reaction was set up as follows:

Reaction mix               Amount
PCR product / Plasmid      5 μl / 1-1.5 μl
Primer (10 μM)             0.5 μl
5x Sequencing Buffer       2 μl
BigDye                     0.2 μl
USF-H2O                    to 10 μl

The sequencing reaction was performed in the thermal cycler using the following program:

Temperature    Time      Number of cycles
94°C           10 sec
55°C           10 sec    26x
60°C           2 min

6.3.2 Sequencing reaction purification
Prior to sequence analysis, interfering by-products of the sequencing reaction, such as primers and unincorporated fluorescent ddNTPs, were removed by purifying the reaction mix with the CleanSeq magnetic bead system from Agencourt on the Biomek NX pipetting robot from Beckman. After a washing step with 85% ethanol, the product was eluted with water.

6.3.3 Analysis of the sequences on the sequencer ABI3730
The purified sequencing products were read on an ABI3730 sequencer from Applied Biosystems (Santa Clara, USA), which separates the products by size in a capillary filled with denaturing polymer. The labeled ddNTPs of the product are excited by a laser to emit fluorescence at a particular wavelength, which varies depending on the base. In this manner, the nucleotide at each position of the sequence can be determined from the resulting electropherogram. It should be noted that the first 50 bases of the sequence are difficult to evaluate and that the read length usually does not exceed 600 base pairs. These aspects should be considered when primers are designed.

6.3.4 The evaluation of generated sequences
The sequences were evaluated in SeqMan II (DNAStar Inc.) and compared against the reference sequence (UCSC, hg19). To identify possible polymorphisms, the observed variants were entered into UCSC BLAT. Variants that were not annotated or that were described as very rare (< 0.001) were then analyzed for possible pathogenicity using several in silico programs.

6.4 Next Generation Sequencing Technologies
Next generation sequencing (NGS) technologies make it possible to sequence large amounts of DNA or RNA quickly and cost-effectively. In this thesis, whole exome sequencing and transcriptome sequencing were carried out. The laboratory work and the bioinformatic analyses of the raw data were performed in the core unit Ultra-Deep Sequencing (Dr. rer. nat. Arif Ekici).

6.4.1 Whole Exome Sequencing (WES)
Novel DNA sequencing technologies based on massively parallel sequencing (also called Next Generation Sequencing, NGS) have brought unprecedented advances in human genetics over the past few years. Using NGS, hundreds of millions of DNA segments are sequenced simultaneously, in cycles from one or both ends. Without selective enrichment, any genomic region has an equal chance of being sequenced. With targeted gene capture, on the other hand, the proportion of DNA fragments containing exons or lying near targeted regions is substantially increased. Any region in the genome can also be targeted for enrichment by design, including exons from a list of genes associated with diseases or specific biological pathways, or disease-associated chromosome segments suggested by linkage analysis.

In this study, we used the SOLiD platform (Life Technologies) to sequence the exomes of index patients after positional mapping by homozygosity mapping. This system is based on sequencing-by-ligation technology and supports two types of libraries, fragment or mate-paired, which can be constructed depending on the purpose. The DNA fragments are ligated to adapters and bound to beads. Emulsion PCR then amplifies the DNA fragments on the beads. The sequencing methodology is based on sequential ligation with dye-labeled oligonucleotides (Housby and Southern 1998). In the first step, a primer is hybridized to the adapter sequence within the library template. Next, a set of four fluorescently labelled octamer oligonucleotides competes for ligation to the sequencing primer. After detection of the fluorescence, the ligated octamer oligonucleotides are cleaved, which removes the fluorescent label, and the hybridization and ligation cycles are repeated.

6.4.1.1 Whole Exome Sequencing (WES)
All examinations were performed on a SOLiD platform (Life Technologies), either SOLiD 4 or SOLiD 5500 XL. In the first step, the coding DNA was enriched with the SureSelect Human Exon 50 Mb Kit (Agilent). For this purpose, the genomic DNA was sheared with the Covaris Ultrasonicator (Life Technologies) into DNA fragments of 150 bp to 180 bp. The library was subsequently prepared automatically using the SPRIworks System III (Beckman Coulter) and then amplified. For enrichment with the Agilent SureSelect Target Enrichment System, the samples were first hybridized with the enrichment probes for 24 hours at 65 °C and then enriched in a pull-down with streptavidin beads. Several samples were pooled after tagging with attached DNA barcodes. The library was amplified again in an emulsion PCR. The sequencing was then carried out on a SOLiD System device (Life Technologies) with 50 bp-long fragment reads.

The bioinformatic processing of the raw data was standardized in the institute's NGS pipeline. Per sequenced individual, more than 100 million reads were generated, of which about 82 million were mapped to the human reference genome (hg19) using the SOLiD LifeScope software version 2.5 (Life Technologies). The variants were called with LifeScope software v2.5 and GATK v1.4. Variants that were called by both programs were used for further analyses. The public databases dbSNP, 1000 Genomes Project, and Exome Variant Server (NHLBI GO Exome Sequencing Project (ESP), Seattle, USA) as well as the in-house exome variant databases were used to filter out all polymorphisms from the data. Variants were further prioritized by restricting them to the respective candidate regions, which were determined individually for each family, and by incorporating the variant predictions from ANNOVAR. Promising variants were validated by Sanger sequencing and tested for segregation in the affected family, and a frequent occurrence of the variant was excluded in a control group of 376 healthy volunteers.

6.4.1.2 Variant identification and filtering
An in-house pipeline designed and supported by Dr. Steffen Uebe was applied in this study. It included quality control, preprocessing, mapping, post-alignment processing, variant calling, and annotation. The generated data comprise thousands of variants and therefore require an effective filtering strategy to identify and prioritize the most relevant candidate variants associated with the patient's phenotype. My filtering approach considered only exonic non-synonymous or splice site variants that are located in the homozygous candidate loci and are absent or very rare in variant databases (< 0.001). All candidate variants were validated by Sanger sequencing on the ABI3100 to exclude sequencing artefacts. Next, a segregation test was performed in all families to see whether the candidate variants truly segregate with the phenotype.
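The filtering criteria described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the in-house pipeline: the `Variant` record, its field names, the effect labels, and the example coordinates are all hypothetical, and the frequency is taken as the highest value found in any consulted database (`None` if the variant is absent from all of them).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Variant:
    chrom: str
    pos: int
    effect: str                    # hypothetical labels, e.g. "nonsynonymous"
    max_db_freq: Optional[float]   # None if absent from all databases


# Hypothetical effect labels kept by the filter.
KEPT_EFFECTS = {"nonsynonymous", "splicing"}


def in_candidate_locus(v: Variant, loci: List[Tuple[str, int, int]]) -> bool:
    """True if the variant lies inside one of the homozygous candidate loci."""
    return any(c == v.chrom and start <= v.pos <= end for c, start, end in loci)


def filter_candidates(variants, loci, max_freq=0.001):
    """Keep rare or absent non-synonymous/splice variants in candidate loci."""
    kept = []
    for v in variants:
        rare = v.max_db_freq is None or v.max_db_freq < max_freq
        if v.effect in KEPT_EFFECTS and rare and in_candidate_locus(v, loci):
            kept.append(v)
    return kept


# Invented example: one candidate locus and five variants.
loci = [("chr1", 1000, 2000)]
variants = [
    Variant("chr1", 1500, "nonsynonymous", None),    # kept: absent from databases
    Variant("chr1", 1600, "synonymous", None),       # dropped: synonymous
    Variant("chr1", 1700, "splicing", 0.05),         # dropped: too frequent
    Variant("chr2", 1500, "nonsynonymous", 0.0001),  # dropped: outside locus
    Variant("chr1", 1800, "splicing", 0.0005),       # kept: rare, in locus
]
candidates = filter_candidates(variants, loci)
print([v.pos for v in candidates])
```

Variants passing such a filter would still need Sanger validation and segregation testing, as described in the text.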

6.4.1.3 Proving variant pathogenicity
Recent studies revealed that a significant proportion of reported so-called disease mutations were either common polymorphisms, lacked direct evidence for pathogenicity, or were identified too often in the genomes of population controls (Bell et al. 2011; Xue et al. 2012; Norton et al. 2012). This makes proving pathogenicity an essential step to avoid false positive results. Thus, an array of computer-based prediction programs (in silico tools), molecular modeling, and functional experiments must be used to confirm the pathogenicity of the identified candidate variants.

In silico tools predict the effect of a sequence variant at the nucleotide and amino acid levels, including its effect on gene transcripts as well as on protein structure and function. Some tools, like SIFT, MutationTaster, and PolyPhen, predict the damaging effect of missense variants on the resulting protein, while others, like GeneSplicer and Human Splicing Finder, predict the effect on splice sites. Although it is highly recommended to use multiple in silico tools for variant interpretation, the generated results serve only for orientation and not as a sole source of evidence for a clinical assertion. Moreover, they give no detailed information about how the identified variant may affect protein structure or function. For that, we cooperated with Professor Heinrich Sticht (Biochemistry, Erlangen). Based on the three-dimensional structure, multiple alignments of homologous sequences, and molecular dynamics, the structural perturbations caused by the candidate variants can be studied at the levels of protein folding, solubility, stability, structure, function, protein interactions, or protein secretion (Tavtigian et al. 2008).

Using the results obtained by in silico analysis and molecular modeling, appropriate hypotheses can be proposed and functional analyses can be performed to confirm the potential impact of the identified pathogenic variants. These experimental approaches fall into three categories. First, experiments that demonstrate whether the gene function is consistent with the known biology of the disease process, for example by studying the expression of the gene of interest in tissues relevant to the phenotype (Lage et al. 2008), or whether the encoded protein physically interacts with its previously reported interaction partner (Franke et al. 2006). Second, experiments that demonstrate whether the pathogenic variants disrupt the protein function. Third, experiments that study whether disruption of the candidate gene in a model organism leads to a phenotype that recapitulates the relevant one in patients (MacArthur et al. 2014). In this thesis, a large variety of methods were applied. Information about the methods used is given in the results section. The most relevant or most frequently used ones are described below:

1. Examining the alternative transcripts caused by stop gain or splice variants
After isolation of total RNA from the patient's lymphoblastoid cell lines, reverse transcription PCR (RT-PCR) (Invitrogen) was performed. Then, using primers specific for the cDNA of the genes of interest, we amplified the alternative transcripts and analyzed their length, sequence, and/or expression using gel electrophoresis, Sanger sequencing, or quantitative PCR (qPCR). This strategy was used to prove the pathogenicity of the identified variants in KIAA0586.

2. Examining the mutant protein's localization
Western blotting is an important technique in cell and molecular biology that allows specific proteins to be identified in a complex mixture of proteins extracted from cells. The technique relies on three elements: (1) separation by size, (2) transfer to a solid support, and (3) marking of the target protein with appropriate primary and secondary antibodies for visualization (Liu et al. 2014). To investigate the localization of the wild-type or altered proteins, we isolated total protein from the patient's lymphoblastoid cell line using 1% SDS cell lysis buffer. For patients without cell lines, we first cloned the cDNA of the gene of interest using the TOPO cloning system (Invitrogen) and then transfected it, in a suitable expression vector, into an appropriate cell line such as HeLa or HEK293 using Lipofectamine 2000 (Invitrogen). We used this method to analyze the co-localization of SVBP and VASH1 within the transfected cells and in the conditioned medium.

3. Examining the protein-protein interaction using co-immunoprecipitation (Co-IP)
Co-immunoprecipitation (Co-IP) is a widely used technique to identify physiologically relevant protein-protein interactions. It uses antibodies specific for a target protein to indirectly capture proteins that are bound to that target. These protein complexes can then be analyzed by western blotting to identify new binding partners, binding affinities, and the function of the target protein. We used this method to confirm the pathogenicity of the identified variant of TAF13 by studying the interaction between the altered TAF13 and TAF11.

6.4.2 RNA Sequencing
RNA extraction was performed using the RNeasy Micro Kit according to the manufacturer's protocol (Qiagen). Before and during library preparation, quality control was done using the Agilent 2100 Bioanalyzer system with appropriate microfluidics chips (Agilent Technologies). Barcoded RNA sequencing libraries were prepared from 100 ng total RNA using the Nugen Ovation Human FFPE RNA-Seq System for small amounts of RNA according to the manufacturer's instructions (Nugen). For selective depletion of unwanted RNAs such as rRNAs, the insert-dependent adapter cleavage method (InDA-C) from Nugen was used before cDNA synthesis. To control sources of variability during sample and data processing, a common set of external spike-in RNA controls developed by the External RNA Controls Consortium (ERCC) was used (Ambion). Libraries were subjected to single-end sequencing (101 bp) on a HiSeq 2500 platform (Illumina, San Diego, CA).

Reads mapping to, e.g., rRNAs, tRNAs, snRNAs, and interspersed repeats were first filtered out by aligning against a manually curated filter reference file using bwa-mem v. 0.7.8-r455 and keeping only unmapped reads. Subsequently, these reads were mapped against the hg19 reference genome using the STAR aligner v. 2.4.0j. Read counts per gene were obtained using the featureCounts program v. 1.4.4 and the Ensembl gtf annotation file (genebuild 2013-09) for hg19. The subsequent analyses were performed in R. Differential expression analysis was performed with the DESeq2 package v. 1.6.3. Genes were considered differentially expressed if |fold change| > 1.5 and p < 0.01. Heatmaps of regularized log-transformed data were generated with the package gplots. Functional annotation analysis of the differentially expressed genes was performed using Ingenuity Pathway Analysis (IPA, QIAGEN Redwood City).
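The differential expression threshold can be illustrated with a short sketch. The analysis itself was done in R with DESeq2; the Python snippet below only mirrors the final thresholding step. It assumes that the input is a log2 fold change (the scale on which DESeq2 reports its estimates) and that the fold change criterion is applied symmetrically to up- and downregulation. Whether the p-value is raw or adjusted is not specified in the text and is left open here. The function name and the example values are hypothetical.

```python
import math


def is_differentially_expressed(log2_fold_change, p_value,
                                min_fold_change=1.5, max_p=0.01):
    """Apply the |fold change| > 1.5 and p < 0.01 criteria on the log2 scale."""
    # A 1.5-fold change in either direction corresponds to |log2FC| > log2(1.5).
    passes_fold = abs(log2_fold_change) > math.log2(min_fold_change)
    return passes_fold and p_value < max_p


# Invented example results: (gene label, log2 fold change, p-value).
results = [
    ("geneA", 1.0, 0.001),    # 2-fold up, significant
    ("geneB", 0.5, 0.001),    # about 1.41-fold, below the fold threshold
    ("geneC", -1.0, 0.001),   # 2-fold down, significant
    ("geneD", 2.0, 0.05),     # large change, but p too high
]
selected = [g for g, lfc, p in results if is_differentially_expressed(lfc, p)]
print(selected)
```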

6.5 Plasmids cloning
6.5.1 TOPO cloning using TOPO-PCRII
To insert the entire coding sequence of a gene of interest into a plasmid, a PCR product of the entire coding sequence was first prepared from human cDNA (from commercially available brain mRNA or LCLs from healthy controls) and then inserted into the TOPO-PCRII vector as described by the manufacturer of the TOPO-TA Cloning Kit (Life Technologies). The reaction was transformed into chemically competent bacteria (TOP10 cells; see 6.5.5). After TOPO cloning, the insert was excised from the TOPO vector and inserted into the plasmid of interest.

6.5.2 The DNA cleavage with restriction enzymes
The insert was cut from the TOPO vector using restriction enzymes selected with NEBcutter V2.0 (http://nc2.neb.com/NEBcutter2/). The amount of the selected enzyme varied according to the quantity and quality of the plasmid and the enzyme activity. The digestion lasted between 2 and 3 hours at 37 °C, depending on the enzyme activity. The reaction was then stopped by incubating the reaction mixture at 65 °C for 20 min. The products were separated on an agarose gel and the desired DNA fragment was extracted.

6.5.3 Ligation
To improve the efficiency of ligation, the digested plasmid was first dephosphorylated with alkaline phosphatase (New England Biolabs) according to the manufacturer's instructions. For the ligation, the Quick Ligation Kit (New England Biolabs) was used according to the manufacturer's instructions. A mixture of plasmid and insert in a molar ratio of 1:3 or 1:5, together with 1x T4 buffer and Quick T4 DNA ligase, was incubated for 5 min at room temperature, and the product was then introduced into chemically competent bacteria.
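Because the ratio is molar, the required insert mass depends on the lengths of insert and plasmid. The following Python sketch shows the standard conversion, assuming that the mass of double-stranded DNA is proportional to its length. The function name and the example values (50 ng of a 5000 bp plasmid, 1000 bp insert) are hypothetical and not taken from the experiments.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, insert_per_vector):
    """Insert mass (ng) for a given insert:vector molar ratio.

    Assumes DNA mass is proportional to length, so the molar ratio
    scales with insert_bp / vector_bp.
    """
    if vector_bp <= 0 or insert_bp <= 0:
        raise ValueError("fragment lengths must be positive")
    return vector_ng * insert_bp * insert_per_vector / vector_bp


# Hypothetical example: 50 ng of a 5000 bp plasmid and a 1000 bp insert.
for ratio in (3, 5):
    mass = insert_mass_ng(50, 5000, 1000, ratio)
    print(f"1:{ratio} plasmid:insert -> {mass:.1f} ng insert")
```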

6.5.4 In-Fusion HD cloning
Another way to clone gene products directly into a vector is the In-Fusion HD Cloning System (Clontech). Primers carrying a 15 bp-long homology to the plasmid at their 5' end were designed for the gene to be cloned. These primers were used to perform a PCR, and the plasmid was cut with a restriction enzyme. Both products were then purified by gel extraction. After the concentrations had been measured, the ligation was carried out and the product was used for a transformation.

6.5.5 Plasmids transformation into chemically competent bacteria
Plasmids, TOPO reactions, or In-Fusion reaction products were transformed into chemically competent OneShot TOP10 bacteria by heat shock. 1-2.5 μl of plasmid were added to 25 μl of bacteria and incubated for 20 minutes on ice. Subsequently, the bacteria were subjected to a heat shock at 42 °C for 35 seconds in a water bath, followed by a 10-minute incubation on ice. Thereafter, 125 μl of SOC medium was added and the bacteria were shaken for 1 hour at 37 °C in the shaking incubator (New Brunswick Scientific). The bacteria were then plated on LB agar plates containing the appropriate selection antibiotic (50 μg/ml) and 2 mg of X-Gal for blue-white selection and incubated overnight at 37 °C.

6.5.6 Selection of positive clones
To identify positive clones, colony PCR was used. For this purpose, a PCR (see 6.2.1) was performed with primers specific for the vector-insert boundary. Clones were picked directly from the agar plate with a pipette tip and transferred to a new agar plate. The cells remaining on the tip were then transferred to the PCR tube. The PCR products were analyzed by gel electrophoresis. Plasmid DNA was isolated from positive clones and validated by Sanger sequencing.

6.5.7 Preparation of plasmid cryostocks
750 μl of LB medium seeded with positive clones were mixed with 150 μl of sterile glycerol, and the bacterial suspension was frozen at -80 °C.

6.5.8 Directed mutagenesis of plasmids
To introduce the mutations of interest into the plasmids by PCR, the PrimerX program (http://www.bioinformatics.org/primerx/cgi-bin/DNA_1.cgi), with default settings for the QuikChange Site-Directed Mutagenesis Kit by Stratagene, was used to design mutagenesis primers that carry the base pair change at the desired location. The PCR was performed using these primers together with PfuUltra II Fusion HS DNA polymerase (Agilent) and 25 ng of the plasmid of interest. The following program was set up in the thermocycler:

Temperature    Time        Number of cycles
95°C           30 sec      1x
95°C           30 sec
55°C           1 min       12x
68°C           1 min/kb
68°C           10 min      1x

After the PCR reaction, the product was digested with DpnI for 1 h at 37 °C. DpnI cleaves only methylated DNA, so the template plasmid was cut but not the PCR products. The enzyme was then heat-inactivated at 80 °C for 20 min. The final reaction was transformed into Stellar competent cells.

6.6 Cell Culture
All cell culture work was carried out under sterile workbenches (HA 2472 GS, Heraeus) using gamma-irradiated or autoclaved consumables.

6.6.1 Lymphoblastoid cell lines culture
The immortalization of blood lymphocytes by EBV transformation to produce lymphoblastoid cell lines (LCLs) was carried out in the cytogenetic laboratory at the Institute of Human Genetics in Erlangen. The cells were suspended in RPMI-1640 (Biochrom) supplemented with 20% FCS, antibiotics (100 U/ml penicillin/streptomycin), and 2 mM L-glutamine, and then cultured at 37 °C in a BBD 6220 CO2 incubator (Heraeus) with 5% CO2 and 91% humidity.

6.6.2 Adherent human cell lines
In this thesis, all human cell lines used, such as HEK293, HeLa, and SK-N-BE(2), were incubated in a BBD 6220 CO2 incubator (Heraeus) at 37 °C in a CO2-containing atmosphere with 91% humidity. Growing cell lines were cultured in DMEM/Ham's F12 (Life Technologies) supplemented with 10% heat-inactivated fetal calf serum (FCS, Life Technologies) and 1% penicillin-streptomycin-glutamine (100x, Life Technologies).

6.6.3 Splitting the adherent cell lines
The adherent cell lines were split when their confluence was nearly 100%. To do this, the medium was discarded and the cells were washed with 1x PBS. Then 2 ml of 0.05% trypsin/EDTA (Life Technologies) were added, and the cells were incubated for 2-4 minutes in the incubator until they detached. The reaction was stopped by adding standard medium containing fetal calf serum. One tenth of the cells were transferred to fresh medium and the rest were discarded. Using this method, the cells could be split up to 30 times.

6.6.4 Preparing and thawing the cell lines cryostocks
For the storage of adherent cells, cryostocks were prepared. For this purpose, the cells were cultured to a confluence of almost 100%, washed with 1x PBS, and then incubated with trypsin for 30 seconds at 37 °C. The detached cells were taken up in standard medium containing fetal calf serum and centrifuged for 4 minutes at 1000 rcf. The supernatant was discarded and the pellet was resuspended in 3 ml of standard medium supplemented with 240 μl of DMSO. The cell suspension was distributed into two 1.5 ml tubes and incubated for 1 hour at 4 °C before storage at -80 °C. For thawing, the cryostock was first placed in a water bath at 37 °C and the cells were immediately transferred into 10 ml of pre-heated standard medium. The mixture was then centrifuged at 1000 rcf for 8 minutes, the supernatant was discarded, and the cells were taken up in fresh medium and transferred to a new culture flask.

6.6.5 The transfection of adherent cell lines with plasmids
To transfect the adherent cell lines with plasmids, Lipofectamine 2000 (Life Technologies) was used. Lipofectamine is a cationic liposome that forms a complex with the negatively charged nucleic acid, allowing it to cross the cell membrane. One day before the transfection, cells were split after reaching a confluence of 80%. 50 μl of Opti-MEM (Life Technologies) were carefully mixed with 1 μl of Lipofectamine 2000. In a second reaction tube, 50 μl of Opti-MEM were mixed with 200-800 ng of the plasmid of interest. Both reaction mixtures were incubated for 5 min at room temperature, then combined and incubated for 20 minutes at room temperature. During the incubation period, the cells were washed twice with 1x PBS, 500 μl of Opti-MEM were added per well, and the transfection mixture was then added dropwise to the well. The medium was changed four hours after transfection.

6.6.6 Cell Harvesting
To harvest the adherent cells, the medium was discarded and the cells were washed once in 1x PBS. After trypsin (Life Technologies) had been added, the cells were incubated for 2-4 min at 37 °C until they detached. After collection in serum-containing medium, the cells were centrifuged for 10 minutes at 1000 rcf and 4 °C and the supernatant was discarded. Subsequently, the pellets were washed once in 1x PBS and centrifuged for 5 min at 1000 rcf and 4 °C, and the supernatant was discarded again. The final pellets were used for RNA or protein isolation or frozen at -80 °C.

6.7 Protein isolation and concentration measurement
6.7.1 Protein isolation from human cell lines
For protein isolation, the previously harvested cell pellets were first frozen at -80 °C for at least 1 h to facilitate cell lysis and then slowly thawed on ice. Depending on the pellet size, 100-500 μl of protein lysis buffer were added and the pellet was incubated for 20 min at 4 °C in a thermo-shaker. Subsequently, the chromosomal DNA was sheared with a hypodermic needle. Larger cell debris was removed by centrifugation and the supernatant was stored at -80 °C.

6.7.2 Bradford protein assay
The Bradford protein assay was used to determine the concentration of proteins in solution (Bradford, 1976). Bovine serum albumin (BSA) was used as the protein standard. A standard curve was established using 1-25 µg/ml BSA incubated for 5 minutes with one-fifth volume of Bradford reagent (BioRAD). The protein samples were diluted 1:100 and 1:1,000 and were likewise incubated for 5 minutes with one-fifth volume of Bradford reagent. The absorbance of the samples was measured at 595 nm with the photometer (Eppendorf) after its calibration.
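How a sample concentration is read off the standard curve can be sketched as follows. The Python snippet fits a straight line to the BSA standards by ordinary least squares and converts a sample absorbance back to a concentration, corrected for the dilution. It assumes a linear response within the 1-25 µg/ml range. All absorbance values and function names are invented for illustration and are not measurements from this work.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def sample_concentration(absorbance, slope, intercept, dilution_factor):
    """Concentration of the undiluted sample in µg/ml."""
    return (absorbance - intercept) / slope * dilution_factor


# Invented BSA standards (µg/ml) and their absorbance at 595 nm.
bsa_ug_per_ml = [1, 5, 10, 25]
a595 = [0.12, 0.20, 0.30, 0.60]

slope, intercept = fit_line(bsa_ug_per_ml, a595)
# Invented sample: absorbance 0.30 measured at a 1:100 dilution.
conc = sample_concentration(0.30, slope, intercept, dilution_factor=100)
print(f"{conc:.0f} µg/ml")
```

Measuring two dilutions, as described above, makes it more likely that at least one reading falls within the range covered by the standards.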

6.7.3 SDS-PAGE
To separate proteins according to their electrophoretic mobility (Laemmli, 1970), proteins (15-50 µg) were mixed with 5 µl NuPAGE LDS Sample Buffer (Life Technologies) and denatured at 70 °C for 10 minutes. Proteins from Co-IP experiments had already been mixed with DTT buffer and were therefore only heated again for 10 minutes at 95 °C. NuPAGE 4-12% Bis-Tris gels (Life Technologies) were installed in the XCell SureLock Mini-Cell system (Life Technologies), and the denatured protein samples as well as the two protein standards MagicMark™ XP Western Protein Standard (Life Technologies) and SeeBlue® Plus2 Pre-stained Protein Standard (Life Technologies) were loaded. The Mini-Cell system was filled with 1x MOPS buffer and the proteins were separated electrophoretically for 60 min at 200 V.

6.7.4 Western Blot
The gel with the separated proteins was moved into the XCell II Blot Module (Life Technologies) and wet blotted onto a 0.2 µm nitrocellulose membrane (Life Technologies) according to the instructions of the manufacturer. A voltage of 30 V was applied for 1 hour and 45 minutes. The blotting success was checked by a short rinse in Ponceau S. After that, the membrane was rinsed in dH2O and then incubated in 2.5% skim milk in TBS-T for 1 hour and 30 minutes at room temperature. Primary antibody incubation took place in 0.5% skim milk overnight at 4 °C on a roller device (Hecht). On the next day, the membrane was washed 4x with TBS-T for 10 minutes. Secondary antibody incubation took place in 0.5% skim milk for 2 hours at room temperature on a roller device. The membrane was again washed 4x with TBS-T. For enhanced chemiluminescence detection, the membrane was incubated with 1:1 SuperSignal® West Femto Maximum Sensitivity Substrate (Thermo Scientific) for 3 minutes in the dark. The development was performed manually. For this, the membranes were first covered with a transparent foil. An Amersham Hyperfilm™ ECL (GE Healthcare) was placed on top of the membranes in a darkroom. After exposure times between 10 seconds and 10 minutes, the film was placed into developer solution, rinsed with water, fixed in fixer solution, finally rinsed in dH2O, and then air-dried.

6.7.5 Co-immunoprecipitation assay
This method was applied only for the experiments on TAF11 and TAF13. For co-immunoprecipitation (Co-IP), we first cloned TAF13 and TAF11 into pCDNA3.1-HA-His and pCDNA3.1-Myc-His vectors, respectively, using the in-fusion kit from QIAgene. HeLa cells were transfected with the constructed plasmids using Lipofectamine 2000 (Invitrogen) according to the manufacturer's instructions. Two days after the transfection, proteins were extracted and used for the Co-IP assay. The cell lysate was incubated with polyclonal rabbit α-HA antibody (Abcam), and the mix was then applied to the μMACS Protein G MicroBead System (Miltenyi) according to the manufacturer's protocol. The columns were washed four times with lysis buffer and once with a low salt buffer. Proteins from Co-IP experiments were mixed with DTT buffer, heated for 10 minutes at 95 °C, separated by SDS-polyacrylamide gel electrophoresis (SDS-PAGE), and transferred onto nitrocellulose membranes (GE Healthcare). Blotting was performed according to standard procedures using monoclonal mouse α-myc (Abcam) as the primary antibody. Immunoreactive protein bands were detected using SuperSignal West Femto reagents (Thermo Scientific) and Amersham Hyperfilm ECL LAS-4000.

6.8 Small interfering RNAs (siRNAs)
This experiment was applied only for TAF13. For RNA interference, siGENOME siRNA (SMARTpool) targeting human TAF13 mRNA (M-032253-01) and the siGENOME non-targeting siRNA pool (Dharmacon) were used. Cells were seeded in 6-well plates at a density of 1 × 10⁵ cells/well and transfected with siRNAs using Oligofectamine (Invitrogen) according to the manufacturer's instructions. Cells were then cultured for 72 hours at 37 °C.

6.9 Cell proliferation assay
Cell proliferation was analysed by a colorimetric assay using WST-8 (2-(2-methoxy-4-nitrophenyl)-3-(4-nitrophenyl)-5-(2,4-disulfophenyl)-2H-tetrazolium, monosodium salt) (Sigma). SK-N-BE(2) cells were seeded at a density of 1 × 10⁵ cells per well in 6-well plates and grown in 10% FCS/DMEM. Two hours before the end of the RNAi treatment, WST-8 labelling reagent was added to the culture medium (1:100). The absorbance at 450 nm was measured in a microplate reader.

6.10 Wound-healing assay
For wound-healing migration assays, cells were seeded in 6-well plates at a density of 1 × 10⁵ cells/well. 24 hours after seeding, confluent cells were transfected with TAF13 siRNA. After 48 hours, the cell monolayer was scratched with a fine pipette tip and washed with DMEM (10% FCS). Photographs were taken every 30 minutes for 24 hours. Relative cell migration distance was determined by measuring the wound area on the monolayer under a microscope using ImageJ. The data acquired from the scratched SK-N-BE(2) area on each plate were converted to the percentage of wound closure at a given time.
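The conversion of measured wound areas into a percentage of wound closure can be illustrated with a minimal sketch. The function name and the area values below are hypothetical and serve only as an example; the thesis does not specify the exact formula, so the usual definition (closed area relative to the initial wound area) is assumed here.

```python
def wound_closure_percent(initial_area, area_at_t):
    """Percentage of the initial wound area that has closed at time t.

    Assumed definition: 100 * (A0 - At) / A0, with areas in any
    consistent unit (e.g. pixels measured in ImageJ).
    """
    if initial_area <= 0:
        raise ValueError("initial_area must be positive")
    return 100.0 * (initial_area - area_at_t) / initial_area


# Hypothetical wound areas at successive time points (arbitrary units).
areas = [50000, 42000, 30000, 12500]
closure = [wound_closure_percent(areas[0], a) for a in areas]
print(closure)
```

With one such series per well, the closure curves of siRNA-treated and control cells can be compared at matching time points.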

6.11 SK-N-BE(2) differentiation assay
The human neuroblastoma cell line SK-N-BE(2) was grown in DMEM supplemented with fetal bovine serum (FBS) to a final concentration of 10% and 1% penicillin/streptomycin (P/S) at 37 °C with 5% CO2. Cells were seeded in 6-well plates at a density of 1 × 10⁵ cells/well. Three hours after transfection with siRNAs, differentiation into neuronal-like cells was induced by treatment with a mix of retinoic acid (10 µM) and caffeic acid (25 µM). After 72 hours, total RNA was extracted using the RNeasy Mini Kit (Qiagen) and cDNA was generated using the SuperScript II RT system (Life Technologies, Carlsbad, CA). Relative expression levels of GAP43, MAPT, and MYCN during the in vitro differentiation experiments were measured using TaqMan® Gene Expression Assays (Life Technologies, USA).
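The text does not state how relative expression was calculated from the TaqMan measurements. As an illustration only, the following sketch assumes the widely used 2^(-ΔΔCt) method with a single endogenous reference gene; the function name and all Ct values are hypothetical.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^(-ddCt) method (assumed here).

    dCt  = Ct(target) - Ct(reference), computed per condition
    ddCt = dCt(treated) - dCt(control)
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    ddct = delta_treated - delta_control
    return 2.0 ** (-ddct)


# Hypothetical Ct values: the target gene crosses the threshold two cycles
# earlier in treated cells, while the reference gene is unchanged.
fold_change = relative_expression(24.0, 18.0, 26.0, 18.0)
print(fold_change)
```

A fold change above 1 indicates higher expression in the treated sample relative to the control, and a value below 1 indicates lower expression.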

7 Reagents and Materials
7.1 Consumables
µMACS Protein G MicroBeads | Miltenyi Biotec, Bergisch Gladbach
0.2 ml Eppendorf tubes | Peqlab, Erlangen
1.5 ml Eppendorf tubes | Peske, Karlsruhe
15 ml, 50 ml Falcon tubes | Greiner Bio-One, California
2 ml Safe-Lock Tubes | Eppendorf, Hamburg
384-Well Optical Reaction Plate | Applied Biosystems, California
Corning® Thermowell PCR 96-well plates | Sigma-Aldrich, Taufkirchen
96-Well Polypropylene PCR Microplates | Greiner Bio-One
96-Well plate | Costar, Washington
Aluminum foil | Paclan
Amersham Hyperfilm™ ECL | GE Healthcare, Buckinghamshire, UK
Autoclave bag | Roth, Karlsruhe
Cannula Microlance | BD Biosciences, Heidelberg
Culture dishes (10 cm) | Techno Plastic Products, Trasadingen
Disinfection spray Apesin | Tana Professional, Mainz
Dispenser pipette tips 5 ml, 10 ml, 25 ml | Techno Plastic Products, Trasadingen
Disposable Pasteur pipette | Sarstedt, Nümbrecht
Filter tips 10 µl, 20 µl, 100 µl, 200 µl, 1000 µl | Nerbe plus, Winsen/Luhe
Clingfilm Saran | Saropack, Rorschach
MACS® Separation Columns | Miltenyi Biotec, Bergisch Gladbach
Nitrocellulose Membrane Filter Paper | Life Technologies, Carlsbad
NuPAGE 4-12% Bis-Tris gels | Life Technologies, Carlsbad
Sterile Surgical Blade Feather | Pfm medical, Köln
Parafilm | Pechiney Plastic Packaging, Chicago, IL, USA
Pipette Combitips plus 0.1 ml, 0.5 ml | Eppendorf, Hamburg
Pipette tips 10 µl | Biozym Scientific, Hessisch Oldendorf
Pipette tips 1000 µl | Greiner Bio, Frickenhausen
Pipette tips 200 µl | Sarstedt, Nümbrecht
Pipettes P2, P20, P200, P1000 | Gilson, Middleton
Sempercare Nitrile Skin2 Gloves | Sempermed, Wien, Austria
Sempercare Premium Gloves | Sempermed, Wien, Austria
Serological Pipettes 5 ml, 10 ml, 25 ml | Sarstedt, Nümbrecht
Stellar™ Competent Cells | Clontech Laboratories Inc., Mountain View, CA, USA
Syringe Filter sterile | VWR International, Darmstadt
Syringes 1 ml | BD Biosciences, Heidelberg
Corning Thermowell Sealing Mats | Sigma-Aldrich, Taufkirchen
Tissue culture flasks (25 and 75 cm²) | Techno Plastic Products AG, Trasadingen

7.2 Chemicals, Enzymes, Standards
Agar-Agar | Roth
Agarose | Biozym, Roth
Ampicillin | Roth
Antarctic phosphatase I (5 U/μl) | New England Biolabs
Antarctic phosphatase reaction buffer (10x) | New England Biolabs
Aqua-PolyMount | Polysciences
Betaine (5 M) | Sigma
Bicine | Roth
BIS-TRIS | Roth
Boric acid | Roth
Bradford reagent | Bio-Rad
Bromophenol blue | Sigma
BSA | Sigma
Chloroform | Merck
Complete Mini Protease Inhibitor Cocktail Tablets | Roche
DAPI | Serva
Deoxynucleotide triphosphates (dNTPs) | Life Technologies
dHCl | Merck
Dimethyl sulfoxide (DMSO) | Roth
Disodium hydrogen phosphate dihydrate (Na2HPO4 x 2 H2O) | Merck
Dithiothreitol (DTT) | Roth, Life Technologies
DNase I, RNase-Free (10 U/μl) | Qiagen
Dulbecco′s modified minimal essential medium (DMEM) | Biochrom, Life Technologies
DMEM, low glucose, GlutaMAX Supplement, pyruvate | Life Technologies
DpnI | New England Biolabs
EcoRI | New England Biolabs
EcoRV | New England Biolabs
EGTA | Roth
Ethanol | Roth
Ethidium bromide (10 mg/ml) | Roth
Ethylenediaminetetraacetate (EDTA) | Roth
Exonuclease I (20 U/μl) | New England Biolabs
Fetal calf serum (FCS) | Biochrom
Fish gelatin | Sigma
GeneScan 500 ROX Size Standard | Life Technologies
Glycerol | Roth
Yeast extract | Roth
HPLC water | Merck
Isopropanol | Roth
Kanamycin | Roth
KpnI | New England Biolabs
Lipofectamine 2000 Reagent | Life Technologies
Skimmed milk powder | K-Classic
Magnesium chloride (MgCl2) | Life Technologies
Methanol | Roth
Monopotassium phosphate (KH2PO4) | Merck
MOPS | Roth
NuPAGE Antioxidant | Life Technologies
NuPAGE LDS Sample Buffer (4X) | Life Technologies
NuPAGE Sample Reducing Agent (10X) | Life Technologies
Optimem | Life Technologies
Penicillin-Streptomycin-Glutamine (100x) | Life Technologies
Peptone | Merck
PfuUltra II Fusion HS DNA Polymerase | Stratagene
PfuUltra II Reaction Buffer (10x) | Stratagene
Ponceau S Solution | Sigma
Potassium chloride (KCl) | Roth
Random Primer | Life Technologies
RNaseOUT Ribonuclease Inhibitor (40 U/μl) | Life Technologies
Slim Fast | Unilever
Sodium chloride (NaCl) | Roth
Sodium citrate | Roth
Sodium hydroxide (NaOH) | Merck
SOC medium (bacteria) | Life Technologies
Sodium dodecyl sulfate (SDS) | Merck
SpeI | New England Biolabs
Standard: 1 kb DNA Ladder | New England Biolabs
Standard: 100 bp DNA Ladder | New England Biolabs
Standard: MagicMark XP Western Protein Standard | Life Technologies
Standard: pUC-Mix Marker, 8 | Fermentas
Standard: SeeBlue Plus2 Protein Standard | Life Technologies
SuperScript II Reverse Transcriptase | Life Technologies
SuperSignal West Femto Maximum Sensitivity Substrate | Thermo Scientific
Taq Reaction Buffer (10x) | Life Technologies
Taq DNA polymerase | Life Technologies
TaqMan Gene Expression Master Mix | Life Technologies
TOP10 E. coli (bacterial strain) | Life Technologies
Tris(hydroxymethyl)aminomethane (TRIS) | Roth
Tris(hydroxymethyl)aminomethane hydrochloride (TRIS-HCl) | Roth
Triton-X 100 | Pharmacia Biotech
Trypsin/EDTA (0.05%) | Life Technologies
Tween 20 | Sigma, VWR
XbaI | New England Biolabs
X-Gal | Roth
XhoI | New England Biolabs
Xylene cyanol | Merck

7.3 Buffers and Solutions

10x PBS
80 g NaCl
2 g KCl
11.5 g Na2HPO4
2 g KH2PO4
- Fill up to 1 L with dH2O, adjust to pH 7.0, then autoclave

10x TBS-T
24.2 g Tris
80 g NaCl
15 ml 32 % HCl (pH 7.6)
10 ml Tween 20
- Fill up to 1 L with dH2O

Protein lysis buffer
10 mM Tris
150 mM NaCl
1 % Triton-X100
- Before use, add 1 Complete Mini protease inhibitor tablet to 10 ml buffer

20x MOPS buffer
104.6 g MOPS
60.6 g Tris
10 g SDS
3 g EDTA
- Fill up to 500 ml with dH2O

20x Transfer buffer
10.2 g Bicine
13.08 g BIS-TRIS
0.75 g EDTA
- Fill up to 125 ml with dH2O

TE buffer
10 mM Tris-HCl
1 mM EDTA

10x TBE (Tris-borate-EDTA)
1.0 M Tris
0.83 M boric acid
0.01 M EDTA
- Fill up with dH2O (pH 8.3)

Agarose gel (1-3 %)
1-3 g agarose
- Fill up with 100 ml of 1x TBE, boil, add 2 μl ethidium bromide, and pour the gel into the gel electrophoresis tray

DNA loading buffer
50 % glycerol
0.01 % bromophenol blue
0.1 % xylene cyanol

LB medium
10 g peptone
5 g yeast extract
10 g NaCl
0.2 g NaOH
- Fill up to 1 L with dH2O (pH 7.5), then autoclave

Agar plates (12 plates)
500 ml LB medium
7.5 g Agar-Agar
- Autoclave and cool down to 60 °C
- Add 500 μl ampicillin / kanamycin, pour the agar into plates, and store at 4 °C

Ampicillin (50 mg/ml)
0.5 g ampicillin
10 ml dH2O
- Sterilize the solution by filtration and store at -20 °C

Kanamycin (50 mg/ml)
0.5 g kanamycin
10 ml dH2O
- Sterilize the solution by filtration and store at -20 °C

X-Gal (40 mg/ml)
0.4 g X-Gal
10 ml DMSO
- Sterilize the solution by filtration and store at -20 °C

20x SSC
525.9 g NaCl
264.6 g sodium citrate
- Fill up to 3 L with dH2O (pH 7.5), then autoclave

4x SSC / 0.2 % Tween
100 ml 20x SSC
400 ml dH2O
1 ml Tween 20

DAPI
1 ml 4x SSC / 0.2 % Tween
1 μl DAPI
- Fill up to 4 ml with dH2O

10x WinTaq buffer
500 mM KCl
10 mM Tris-HCl (pH 8.3)
15 mM MgCl2

7.4 Vectors
pCR II TOPO | Life Technologies
pcDNA 3.1 | Life Technologies
pCMV6-XL6 | Origene Technologies
pet28a | Novagen
pcDNA3.1(+)/HA-His A | 5470
pcDNA3.1(+)/myc-His A | 5500

7.5 Antibodies
Primary antibodies | Company | Dilution
Rabbit anti-HA | Abcam | 1:10,000
Mouse anti-β-actin | Abcam | 1:10,000
Mouse α-HA | Abcam | 1:1,000
Mouse α-myc | Abcam | 1:1,000
Secondary antibodies | Company | Dilution
Goat anti-mouse HRP | Life Technologies | 1:20,000
Goat anti-rabbit HRP | Life Technologies | 1:20,000

7.6 TaqMan Probes
Gene | Dye | Company
ACTB | FAM | Life Technologies
B2M | VIC | Life Technologies
PO | VIC | Life Technologies
TBP | VIC | Life Technologies
MAPT | FAM | Life Technologies
MYCN | FAM | Life Technologies
GAP43 | FAM | Life Technologies
TAF13 | FAM | Life Technologies

7.7 Kits
ABsolute QPCR SYBR Green ROX Mix | Thermo Scientific
Agencourt AMPure Kit | Beckman Coulter
Agencourt CleanSEQ Kit | Agencourt
BigDye Terminator v3.1 Cycle Sequencing Kit | Life Technologies
Expand 20 kbPLUS PCR System, dNTPack | Roche
Expand Long Range dNTPack | Roche
GC-Rich PCR-System | Roche
Illustra GenomiPhi V2 DNA Amplification Kit | GE Healthcare
In-Fusion HD Cloning Plus | Clontech
Nucleobond Xtra Midi plus | Macherey-Nagel
PAXgene Blood RNA Kit | Becton Dickinson
QIAprep Spin Miniprep Kit | Qiagen
QIAquick Gel Extraction Kit | Qiagen
QIAquick PCR Purification Kit | Qiagen
QIAshredder | Qiagen
RNase-Free DNase Set | Qiagen
RNeasy Mini Kit | Qiagen
SuperSignal West Femto Maximum Sensitivity Substrate | Thermo Scientific
TOPO TA Cloning Kit | Life Technologies

7.8 Appliances
Accu-jet pro | Brand, Wertheim
Hiclave™ HV-85 Autoclave | HMC, Engelsberg
Bio Photometer | Eppendorf, Hamburg
BioDocAnalyze Gel Documentation system | Analytik Jena, Jena
Biomek NXp robot | Beckman Coulter, California
C-chip Neubauer improved | Brand, Wertheim
Centrifuge 5804/5810/5415R | Eppendorf, Hamburg
Centrifuge Biofuge 22 R | Heraeus, Hanau
Centrifuge Biofuge Pico | Heraeus, Hanau
Centrifuge Varifuge 20 RS | Heraeus, Hanau
Centrifuge Varifuge F | Heraeus, Hanau
Digital pH-Meter HI 2211 | Hanna Instruments, Vöhringen
Dispenser-Pipette | Eppendorf, Hamburg
Fridge ProfiLine | Liebherr, Biberach an der Riß
Gel Electrophoresis Chambers | Peqlab, Erlangen
Gel Electrophoresis Combs | Peqlab, Erlangen
Ice Machine | Ziegra, Isernhagen
Incubator BBD 6220 | Thermo Fisher Scientific, Braunschweig
Innova™ 4300 Incubator Shaker | New Brunswick Scientific, Enfield, CT, USA
Kelvitron t Incubator | Heraeus, Hanau
Labovert Microscope | Leitz, Bad Dürkheim
Laminair HA 2472 GS tissue culture hood | Heraeus, Hanau
Magnetic Agitator | IKA, Staufen
Mastercycler | Eppendorf, Hamburg
Microwave NN-ST462M | Panasonic, Wiesbaden
MiniSpin Centrifuge | Eppendorf, Hamburg
Multipipette xstream | Eppendorf, Hamburg
Nano Quant Infinite F200 | Tecan, Männedorf
NuPAGE Novex System | Life Technologies, Carlsbad
Pipetting robot Biomek® NX | Beckman Coulter, Krefeld
PowerPac™ 300 | Bio-Rad, München
QuantStudio 12K Flex | Life Technologies, Carlsbad
Research plus 100 Pipette | Eppendorf, Hamburg
RM5 Roller | Hecht, Sondheim
Scale MC1 | Sartorius AG, Göttingen
Sepatech Omnifuge 2.0 RS | Heraeus, Hanau
Sequencer ABI Prism 2730 | Applied Biosystems, Santa Clara, USA
Thermomixer | Eppendorf, Hamburg
Thermomixer compact | Eppendorf, Hamburg
Unimax 1010 | Heidolph Instruments, Schwabach
UV Star | Biometra, Goettingen
Vortexer VF2 | IKA, Staufen im Breisgau
Water bath | Memmert, Schwabach
Water bath | Julabo, Seelbach
Water bath | Lauda-Brinkmann, Delran, NJ, USA
Water treatment plant Purelab plus | ELGA LabWater, Celle
XCell II Blot Module | Life Technologies, Carlsbad
XCell SureLock Mini-Cell | Life Technologies, Carlsbad

7.9 Databases and Softwares
1000 Genomes | http://www.1000genomes.org/
ABI PRISM® 7900 HT SDS V2.1.1 | Life Technologies
Axio Vision Software Version 4.8 | Zeiss
BioDocAnalyze 2.0 | Biometra
Chromas Version 2.23 | Technelysium
Database of Genomic Variants | http://projects.tcag.ca/variation/
Ensembl Variant Effect Predictor | http://www.ensembl.org/tools.html
ExAC Browser | http://exac.broadinstitute.org/
Exome Variant Server | http://evs.gs.washington.edu/EVS/
ExPASy Translate Tool | http://www.expasy.org/tools/dna.html
Human Genome Variation Society | http://www.hgvs.org/
IGV Genome Browser | www.broadinstitute.org/igv/
Illumina iGenomes transcriptome | http://cufflinks.cbcb.umd.edu/igenomes.html
Image Studio | Li-Cor
Microsoft Office 2007 | Microsoft
Mutation Taster | http://www.mutationtaster.org/
NCBI | http://www.ncbi.nlm.nih.gov/
NEBcutter V2.0 | http://nc2.neb.com/NEBcutter2/
NetGene2 | http://www.cbs.dtu.dk/services/NetGene2/
NNSplice 0.9 | http://www.fruitfly.org/seq_tools/splice.html
Polyphen 2 | http://genetics.bwh.harvard.edu/pph2/
Primer 3 | http://frodo.wi.mit.edu/
Primer X | http://www.bioinformatics.org/primerx/cgi-bin/DNA_1.cgi
NCBI Pubmed | http://www.ncbi.nlm.nih.gov/pubmed
QuantStudio 12K Flex Software | Life Technologies
Raw | MRC Holland
SeqManII | DNA Star
Sequencing Analysis 5.1 Software | Applied Biosystems
SIFT | http://sift.jcvi.org/
STRING 9.0 | http://string.embl.de/
Student t-test | http://www.fon.hum.uva.nl/Service/Statistics.html
SysID database | http://sysid.cmbi.umcn.nl/
The Human Gene Mutation Database | http://www.hgmd.cf.ac.uk/ac/index.php
UCSC Genome Browser | http://genome.ucsc.edu/

8 Results
8.1 Exome sequencing of nineteen affected individuals with intellectual disability
In the Ultra-Deep Sequencing core unit of the Institute of Human Genetics in Erlangen, we used the SOLiD platform to perform whole exome sequencing for one affected individual in each of the nineteen families (laboratory administration: Dr. rer. nat. Arif Ekici). Data processing and variant calling were performed by Dr. rer. nat. Steffen Uebe, based on an in-house bioinformatics pipeline. To identify the most relevant candidate variants, I considered only exonic or splice-site variants that were predicted in silico to be possibly pathogenic, were either absent or very rare in the general population, and were located in the homozygosity candidate loci. Following the validation and segregation analysis of the candidate variants, the structural and functional perturbations caused by them were analyzed in cooperation with Professor Heinrich Sticht (Biochemistry, Erlangen) by molecular modeling based on the three-dimensional structure, multiple alignments of homologous sequences, and molecular dynamics. Using this strategy, I obtained seventeen variants in sixteen families, as listed in table 8.1.

Family ID | Gene | Transcript | cDNA | Protein | ACMG classification
MR020 | AHI1 | NM_001134830.1 | c.917dup | p.Lys307Glufs*3 | Pathogenic
MR023 | FAR1 | NM_032228.5 | c.495_507delinsT | p.(Glu165_Pro169delinsAsp) | Pathogenic
MR026 | KIAA0586 | NM_001244189.1 | c.2414-1G>C | p.? | Pathogenic
MR027 | TAF13 | NM_005645.3 | c.119T>A | p.(Met40Lys) | Pathogenic
MR028 | BDH1 | NM_004051.4 | c.668G>A | p.(Arg223His) | novel gene
MR037 | EZR | NM_003379.4 | c.385G>A | p.(Ala129Thr) | novel gene
MR037 | CCAR2 | NM_021174.4 | c.2484C>A | p.(Tyr828*) | novel gene
MR040 | OGDHL | NM_001143997.1 | c.1979G>A | p.(Arg660Gln) | novel gene
MR043 | PGAP2 | NM_001256239.1 | c.296A>G | p.(Tyr99Cys) | Pathogenic
MR046 | PUS7 | NM_019042 | c.1348C>T | p.(Arg450*) | novel gene
MR053 | KCTD18 | NM_152387.2 | c.875C>T | p.(Ser292Leu) | novel gene
MR063 | SVBP | NM_199342.3 | c.82C>T | p.(Gln28*) | novel gene
MR079 | PGAP1 | NM_024989.3 | c.589_591del | p.(Leu197del) | Pathogenic
MR083 | SKIDA1 | NM_207371.3 | c.2600C>T | p.(Ala867Val) | novel gene
MR101 | LRCH3 | NM_032773.2 | c.761A>G | p.(Gln254Arg) | novel gene
MR121 | TMTC3 | NM_181783.3 | c.199C>G | p.(His67Asp) | Pathogenic
MR-TUR-03-01 | SPG20 | NM_001142294.1 | c.364_365del | p.(Met122Valfs*2) | Pathogenic
Table 8.1: Overview of the variants identified in the investigated families
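The filtering strategy described in section 8.1 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the in-house pipeline: the `Variant` fields, the consequence labels, the frequency cut-off `max_af`, and all example records (including the gene names) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    chrom: str
    pos: int
    consequence: str           # hypothetical labels, e.g. "missense", "splice_site"
    population_af: float       # allele frequency in the general population
    predicted_pathogenic: bool # summary of the in silico predictions


# Consequences treated as exonic or splice-site changes (assumed label set).
RELEVANT = {"missense", "nonsense", "frameshift", "inframe_indel", "splice_site"}


def in_candidate_region(variant, regions):
    """True if the variant lies in one of the homozygous candidate regions."""
    return any(chrom == variant.chrom and start <= variant.pos <= end
               for chrom, start, end in regions)


def filter_candidates(variants, regions, max_af=0.001):
    """Keep rare, possibly pathogenic coding/splice variants in candidate loci."""
    return [v for v in variants
            if v.consequence in RELEVANT
            and v.population_af <= max_af      # assumed cut-off for "very rare"
            and v.predicted_pathogenic
            and in_candidate_region(v, regions)]


# Hypothetical example data.
regions = [("1", 1_000_000, 5_000_000)]
variants = [
    Variant("GENE_A", "1", 2_000_000, "nonsense", 0.0, True),     # kept
    Variant("GENE_B", "1", 3_000_000, "missense", 0.05, True),    # too common
    Variant("GENE_C", "2", 2_000_000, "frameshift", 0.0, True),   # outside loci
    Variant("GENE_D", "1", 4_000_000, "synonymous", 0.0, False),  # not relevant
]
print([v.gene for v in filter_candidates(variants, regions)])
```

Variants that pass such a filter would still need Sanger validation and segregation analysis, as described above.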

8.2 Pathogenic variants in known genes for intellectual disability In two families, I identified pathogenic variants in, at that time, already known ID genes, AHI1 and SPG20 (Tawamie et al. 2015), and could diagnose Joubert syndrome and Troyer syndrome in the affected children, respectively.

8.2.1 Identification of a pathogenic variant in AHI1 in family MR020

Figure 8.2.1.1.1: Family pedigree and photographs of the two affected siblings

In this family, the parents are first-degree cousins and have five children; two boys (MR020-04 and MR020-06) and a deceased girl had intellectual disability (Figure 8.2.1.1.1). Pregnancy and delivery of the eldest boy (MR020-04) were unremarkable. Until the age of two years, he showed a severe form of hypotonia. He developed slowly, but continuously. He was a cheerful and social boy. He had no epilepsy. He had severe growth retardation (length 93 cm and weight 13.7 kg at the age of 6 years) but no microcephaly (head circumference 51 cm). His deceased sister showed a similar neonatal development but died at the age of ten months of an unknown cause. The birth weight of the younger boy (MR020-06) was higher than that of his siblings (including his healthy siblings). At the time of examination, he was a few months old and his development could not be assessed reliably; a developmental delay could not be excluded. The mother reported that he was doing better than his brother and did not exclude that the boys suffer from different disorders. Homozygosity mapping of family MR020 using Affymetrix 6.0 SNP microarrays and HomozygosityMapper led to the identification of two candidate loci with a total length of 28.9 Mb. Of the 36,818 unfiltered variants, 3,112 were not annotated in dbSNP132, and six of these were exonic or near exons and located in a candidate region. Only three variants, in AKAP7, LAMA2, and AHI1, could be validated using Sanger sequencing and were predicted to be disease causing. The variant c.917dup, p.(Lys307Glufs5X) in AHI1 was the most likely candidate, since pathogenic variants in this gene are known to cause Joubert syndrome. A subsequent cranial MRI of the boy confirmed this diagnosis clinically.

8.2.2 Identification of a pathogenic variant in SPG20 in family MR-TUR-03
8.2.2.1 Family MR-TUR-03: Phenotype

Figure 8.2.2.1.1: Family pedigree and photographs of the two affected siblings

This family was referred for genetic counselling at the Institute of Human Genetics in Bonn with a history of unexplained intellectual disability in two of four children. The parents were second-degree cousins of Turkish descent (Figure 8.2.2.1.1). The eldest daughter (P1) was born in the 8th month of pregnancy; she sat and first walked at the age of 12 and 20 months, respectively. At the age of 5 years, clinical examination revealed growth retardation (all growth parameters <3rd percentile), general muscular hypotonia, joint hypermobility, genu valgum, a prominent nose, raised nostrils, epicanthus, down-slanting palpebral fissures, and low-set ears. All laboratory investigations were unremarkable. At the age of 6 years, follow-up examination showed limb musculature atrophy. At the age of 14 years, clinical examination revealed muscular weakness of the upper extremities, and at the age of 16 years a hoarse voice, high arched feet, atrophy of the lower arm muscles, and increased proprioceptive muscular reflexes were also noticed. Electromyography and investigations of nerve conduction velocity were not suggestive of muscle disease. Later that year, she was admitted to a child and adolescent psychiatry ward with a suspected acute decompensating psychosis. At the age of 19 years, she was readmitted to the psychiatric ward with sleep disturbance, general anxiety, panic attacks with hallucinations, and suicidal ideation. All laboratory investigations were unremarkable, as were brain magnetic resonance imaging and electroencephalogram. Assessment using the Wechsler Adult Intelligence Scale (HSWI) revealed a total intelligence quotient (IQ) of 46. At the age of 26 years, she presented for genetic counselling. Her head circumference was 51 cm (1 cm below the 3rd percentile), her height was 144 cm (4.7 cm below the 3rd percentile), and her body weight was 49 kg (around the 10th percentile). Clinical examination revealed increased reflexes in the lower extremities with general muscle hypotonia, slurred speech, low-set dysmorphic thumbs, and clinodactyly with mild skeletal abnormalities of the hands (Figure 7.3.2.1.1). The younger brother (P2) was delivered at term. At the age of 9 months, he underwent surgery for hydronephrosis. During this hospital admission, he was noted to be small for his age, with short extremities and developmental delay. At the age of 10 years, he was referred to a pediatric clinic and was noted to have muscular hypotonia, hyperextensible joints, and distal muscular atrophy. At the age of 17 years, he presented for genetic counselling. His height was 153 cm (10 cm below the 3rd percentile), and he displayed an ataxic gait, slurred speech, large low-set ears, a prominent nose, retrognathism, and down-slanting palpebral fissures (Figure 8.2.2.1.1).

8.2.2.2 Family MR-TUR-03: homozygosity mapping and whole exome sequencing The homozygosity mapping of this family revealed five candidate regions on chromosomes 11, 13, 16, and 17 with a total length of 51.8 Mb (Table 4.2). The whole exome sequencing using DNA of P2 identified a total of 47,000 variants, in which only five variants in TEX26, B3GALTL, SPG20, IRX6, and KATNB1, were exonic or nearby and located in a candidate region. After taking into account the results of prediction programs, evolutionary conservation, and mutation estimation algorithms, only c.364_365delTA, p.(Met122Valfs*2) variant in SPG20 (NM_001142294) was retained as a candidate. This frameshift variant was validated using Sanger sequencing and was found to segregate with the phenotype in this family. A review of the literature revealed the same variant was identified in an Omani family with Troyer syndrome (Manzini et al. 2010). Thus, we assumed this variant to be the causal variant for the phenotype in the Turkish family, and therefore the diagnosis of Troyer syndrome could be assigned.

Interestingly, the large genotype differences between the haplotypes that carry the pathogenic SPG20 variant in the affected Turkish siblings and those reported by Manzini et al. (genotypes kindly provided by the authors) suggest that this is a recurrent pathogenic variant rather than a founder variant causing Troyer syndrome in both families (Figure 8.2.2.2.1).

Figure 8.2.2.2.1: Comparison of the mutation-carrying haplotypes in the Omani kindred (Manzini et al. 2010) and in the present Turkish family. Genotypes that differ between families are indicated by a green background, and genotypes that are identical in all families are indicated by an orange background. The presence of large genotype differences in the direct vicinity of the identified variant (in red) suggests that this pathogenic variant is recurrent rather than a founder mutation that arose in a single founder individual and was transmitted on the same haplotype.

8.2.2.3 Pathogenic variants in SPG20 and the frequency of Troyer syndrome
In the Exome Aggregation Consortium database (ExAC), the pathogenic variant c.364_365delTA has been reported as a heterozygous variant in six individuals (allele frequency of 0.000049). ExAC lists 481 further variants in SPG20, including 25 loss-of-function variants (frameshifts, stop gains, and variants at the highly conserved consensus splice sites) and 43 missense variants that were predicted to be pathogenic by three in silico prediction programs (LRT, PolyPhen2, MutationTaster) and affect highly conserved residues (GERP++ > 5). Considering the sum of the allele frequencies of all these variants, the frequency of possibly pathogenic alleles in SPG20 would be 0.0017. For a recessive disorder, the expected frequency of affected individuals is the square of this allele frequency, so the estimated prevalence of Troyer syndrome would be about 3 per million in the general population.
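The step from the summed allele frequency to the prevalence estimate can be reproduced with a short calculation. The sketch below assumes Hardy-Weinberg proportions (random mating) and full penetrance, neither of which is stated explicitly in the text; the function name is illustrative.

```python
def recessive_prevalence(q):
    """Expected frequency of homozygous or compound heterozygous individuals.

    Assumes Hardy-Weinberg proportions: prevalence = q**2, where q is the
    combined frequency of all pathogenic alleles in the gene.
    """
    return q ** 2


q = 0.0017  # combined frequency of possibly pathogenic SPG20 alleles (from the text)
per_million = recessive_prevalence(q) * 1_000_000
print(f"{per_million:.2f} affected individuals per million")
```

The result is close to 2.9 per million, which rounds to the figure of about 3 per million given above. In consanguineous populations the true prevalence would be expected to exceed this estimate, because random mating does not hold.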

8.2.2.4 Phenotype comparison between patients with pathogenic variants in SPG20
All affected Omani and Turkish individuals had short stature, mild skeletal abnormalities of the hands, spasticity, and distal muscle atrophy with hyperreflexia, and these features progressed insidiously over time. In adulthood, the affected individuals displayed gait disturbances, delayed motor and cognitive development, dysarthria, and slurred speech. A phenotypic feature observed only in males was a prominent chin. In contrast, brain imaging results varied between the families. The brain MRI performed for MR-TUR-03-01 was unremarkable, whereas brain MRI in affected males from the Omani family revealed mild atrophy of the cerebellar vermis, mild white matter volume loss, and periventricular white matter hyperintensity. Comparison of the clinical phenotypes of the affected Turkish and Omani individuals with those of individuals with Troyer syndrome from the Amish population (Proukakis et al. 2004) also revealed phenotypic overlap, including delayed milestones, dysarthria, hyperreflexia, spasticity of the lower limbs, distal muscular atrophy, skeletal abnormalities (small stature, hyperextensible joints), and emotional lability. Interestingly, brain MRI scans of the Amish patients revealed deep white matter abnormalities similar to those reported in the Omani patients (Table 8.2.2.4.1).

Characteristics | MR-TUR-03 | Omani | Amish I | Amish II
Age (in years) | 26 and 17 | Mean 16.6 | Mean 15 | Mean 43.5
Height | Short | Short | Short | Short
Occipitofrontal circumference | Normal | Normal | Normal | Normal
Hand/foot abnormalities | 2/2 | 6/6 | 10/10 | 11/11
Delayed speech | 2/2 | Late 6/6 | Late 8/10 | Late 9/11
Delayed walking | 2/2 | Late 4/6 | Late 8/10 | Late 10/11
Poor school performance | 2/2 | 5/6 | 10/10 | 9/11
Emotional lability | 1/2 | No | 7/10 | 10/11
Dysarthria | 2/2 | 6/6 | 10/10 | 11/11
Tongue dyspraxia | 2/2 | 5/6 | Often | Often
Distal amyotrophy | 2/2 | 4/6 | 7/10 | 11/11
Hyperreflexia of upper extremities | 2/2 | 2/6 | 3/10 | 10/11
Hyperreflexia of lower extremities | 2/2 | 4/6 | 10/10 | 11/11
Clumsy, spastic gait | 2/2 | 6/6 | 10/10 | 11/11
Dysmetria | 2/2 | 3/6 | 7/10 | 11/11
Table 8.2.2.4.1: Comparison of clinical features between the present subjects and those of Troyer syndrome patients from the Omani and the Amish population.

Since the family with the identified SPG20 variant represents one of only a few cases reported worldwide, we published the results in Molecular and Cellular Probes (Tawamie et al. 2015).

8.3 Variants in novel genes Of the 17 genes in which I have identified relevant variants, two were in known genes and 15 were in, at the time of identification, novel genes. In the context of this thesis, I have performed further analyses in eight genes, EZR, KIAA0586, PGAP1, PGAP2, PUS7, SVBP, TAF13 and TMTC3.

8.3.1 Identification of a pathogenic variant in SVBP in family MR053
8.3.1.1 Family MR053: Phenotype

Figure 8.3.1.1.1: Family pedigree and photographs of the two affected siblings (V:3 and V:5)

Family MR053 is a consanguineous family of Syrian descent with one healthy boy and two affected children (V:3 and V:5) (Figure 8.3.1.1.1). The pregnancy with V:3 was unremarkable. At delivery, she developed cyanosis and was unconscious. In the neonatal period, she had muscular hypotonia and early closure of the sutures. She learned to walk at the age of three years. At the age of 5 years, she could speak a few words and follow simple instructions. Furthermore, she presented with microcephaly (head circumference 45 cm (-5.53 SD)) and chest deformity. No MRI or CT scans were available. The pregnancy and birth of V:5 were unremarkable, as was the closure of the sutures. His neonatal development was unremarkable. However, at the age of 13 months, he still could not crawl or stand. He also presented with microcephaly (head circumference 43 cm (-2.63 SD)).

8.3.1.2 Family MR053: homozygosity mapping and whole exome sequencing
Under the assumption that the causative variant would be homozygous and identical by descent in both affected individuals, homozygosity mapping was performed. The results revealed five candidate regions on chromosomes 2, 7, 10, and 12 with a total length of 42.9 Mb. Subsequently, whole-exome sequencing (WES) using DNA of V:3 was performed and revealed a total of 48,267 variants. Of those, only four variants, in SVBP (c.82C>T, p.(Gln28*)), INADL (c.4355T>C, p.(Val1452Ala)), DMRT2 (c.1054G>A, p.(Asp352Asn)), and MYSM1 (c.4355T>C, p.(Val1452Ala)), affected conserved residues, were possibly pathogenic in the in silico analysis, and were absent in 380 Syrian control individuals, 728 in-house exomes, and all available SNP databases. The four variants were validated by Sanger sequencing and segregated in all family members. Later, we excluded the variants in DMRT2 and MYSM1 because variants in these genes are associated with a sex-reversed phenotype (MIM:154230) (Ottolenghi et al. 2000) and inherited bone marrow failure syndromes (Alsultan et al. 2013), respectively, which left the identified variants in SVBP and INADL.

8.3.1.3 Identification of a second family with c.82C>T variant in SVBP
In cooperation with Professor Hans van Bokhoven (Donders Institute for Brain, Radboud University Medical Center, the Netherlands), a second consanguineous family of Pakistani descent with two affected girls (IV:1 and IV:2) who are homozygous for the identical variant joined the project. The pregnancy and delivery of both affected members were unremarkable. Clinical examination of both affected individuals at the ages of 29 and 32 years showed intellectual disability with delayed and unclear speech, short stature, microcephaly, muscular hypotonia, ataxia, and aggressive behaviour. The CT scans of both affected individuals showed normal brain structures. Exome sequencing of the Pakistani family revealed two variants: the variant in SVBP (c.82C>T, p.(Gln28*)), as in family MR053, and a second variant in KLHL21 (c.1762C>T, p.(Arg588Trp)). The identified variant in SVBP segregated with the phenotype in this family and was absent in 240 healthy Pakistani controls.

8.3.1.4 The identified SVBP variant has a founder effect
The c.82C>T variant in SVBP was identified in two unrelated families whose affected individuals have an overlapping phenotype. This prompted us to investigate whether both families have a common ancestor. Haplotype analysis using SNP genotypes of both families showed that the SVBP variant (chr1:43,282,134) is located in a shared haplotype, spanning 2.25 Mb between 42,637,613bp and 43,398,085bp on chromosome 1 (GRCh37/hg19), indicating that a common founder pathogenic variant is likely in these two families (Figure 8.3.1.4.1).

Figure 8.3.1.4.1: The comparison between the genotypes of both families revealed a 2.25 Mb common haplotype flanking the identified SVBP variant and suggests a founder effect. Genotypes that differ between families are indicated by a red background, while genotypes that are identical in both families are indicated by a green background. Since the overlap is strong, it appears to be the same haplotype that has acquired some further variants over time.
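The genotype comparison underlying such a haplotype figure can be sketched as a search for the run of identical genotypes flanking the variant. This is an illustration only and not the analysis used in this thesis: the function, the marker positions, and the genotype codes are hypothetical, and the sketch stops at the first mismatch, whereas a real comparison may tolerate isolated differences.

```python
def shared_segment(markers, variant_pos):
    """Return (start, end) of the identical-genotype run around variant_pos.

    markers: list of (position, genotype_family_1, genotype_family_2),
    sorted by position. The run is extended outward from the variant and
    stops at the first marker where the two families differ. Either bound
    is None if the nearest marker on that side already differs.
    """
    left = [m for m in markers if m[0] <= variant_pos]
    right = [m for m in markers if m[0] > variant_pos]

    start = None
    for pos, g1, g2 in reversed(left):
        if g1 != g2:
            break
        start = pos

    end = None
    for pos, g1, g2 in right:
        if g1 != g2:
            break
        end = pos

    return start, end


# Hypothetical homozygous SNP genotypes in the two families.
markers = [
    (100, "AA", "AA"), (200, "AA", "BB"), (300, "BB", "BB"),
    (400, "AA", "AA"), (500, "BB", "BB"), (600, "AA", "BB"),
    (700, "AA", "AA"),
]
print(shared_segment(markers, variant_pos=450))
```

A long shared segment around the variant is what would be expected for a founder variant, whereas differing genotypes in its direct vicinity would point to a recurrent variant.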

8.3.1.5 Premature stop codon SVBP variant impairs the VASH1 secretion
The variant in SVBP leads to a premature stop codon at position 28, which may reduce SVBP expression by activating nonsense-mediated mRNA decay (NMD). To assess this hypothesis, we ran a qPCR using RNA isolated from the lymphoblastoid cell lines of the Syrian family and six healthy controls. The results showed no significant reduction in the mRNA levels between patients, parents, and healthy controls. This suggested that there was no immediate NMD (Figure 8.3.1.5.1, C), or that the expression in blood was leaky. The molecular modelling was done in cooperation with Professor Heinrich Sticht (Institute of Biochemistry, Friedrich-Alexander-University Erlangen-Nuremberg, Germany) using the Paircoil tool. The results predicted the loss of the highly conserved coiled-coil domain (CCD) of SVBP, and it is therefore assumed that the pathogenic effect of the SVBP variant may be related to altered protein stability and/or function (Figure 8.3.1.5.1, A and B). To investigate this assumption, VASH1 and SVBP expression in the cell lysate and conditioned medium of HeLa cells double transfected with VASH1 and N-terminally Flag-tagged human SVBP (wild type and mutant) was analysed by western blot. The results revealed a marked reduction of VASH1 in the conditioned medium (Figure 8.3.1.5.2, A) and in the cell lysate of cells transfected with mutant SVBP compared with those transfected with the wild type (Figure 8.3.1.5.2, B). Moreover, the fact that no altered SVBP was detectable in the cell lysate or conditioned medium (Figure 8.3.1.5.2, C and D) suggested that the identified variant in SVBP produces an unstable protein.


Figure 8.3.1.5.1: (A) SVBP is evolutionarily highly conserved among mammals. The arrow indicates the putative coiled-coil domain and the red box indicates the mutated amino acid (Suzuki et al. 2010). (B) Molecular modelling of wild-type (left) and altered SVBP (right) using the Parcoil tool revealed loss of the coiled-coil domain in the mutant SVBP. (C) Nonsense-mediated decay as a possible pathogenic mechanism of the SVBP variant. Using two primer pairs specific for the SVBP cDNA and SYBR Green quantitative PCR, we found no differences in the mRNA levels between patients, parents and six healthy controls.

Figure 8.3.1.5.2: Double transfection of HeLa cells with equal amounts of hVASH1 and N-terminally Flag-tagged hSVBP (wild type and mutant). (A) The cell lysate contains only a very small amount of VASH1 in cells transfected with mtSVBP compared with wtSVBP, which indicates lower solubility and increased degradation of the VASH1 protein. (B) The same was observed in the cell medium, which indicates a marked reduction in VASH1 protein secretion. (C, D) The impairment of VASH1 results from the mutant SVBP, which yields an unstable, non-functional protein that is subsequently degraded.


8.3.2 Identification of a pathogenic variant in TAF13 in family MR026

8.3.2.1 Family MR026: Phenotype

Figure 8.3.2.1.1: Family pedigree (A) and photographs of the two affected siblings (IV:3 and IV:6)

Family MR026 is a consanguineous Syrian family of Kurdish descent. The parents are first-degree cousins and had six children; two of them (IV:3 and IV:6) have mild intellectual disability, microcephaly, and severe growth retardation, and one girl (IV:2) died two hours after birth due to multiple malformations (Figure 8.3.2.1.1). Pregnancy and delivery of both affected individuals were normal. A low birth weight of 2 kg was noted in both (<3rd centile; −2.5 SD; no information regarding length and head circumference). No convulsions were reported, and hearing, vision, and social interaction seemed normal. Neither child showed malformations. The parents reported that the elder boy IV:3 achieved all developmental milestones with a slight delay, e.g. walking at about two years. An examination at the age of seven years revealed delayed skeletal maturation with a bone age of three years. At the time of examination, IV:3 was 16.5 years old and showed no signs of puberty. His height was 134 cm (30 cm below the 3rd centile; −6.07 SD), weight 24 kg (25 kg below the 3rd centile; −9.39 SD), and head circumference 46 cm (7.5 cm below the 5th centile; −7.06 SD). At the time of examination, the younger sister IV:6 was ten months old and could sit alone. Her weight was 5.3 kg (1.4 kg below the 3rd centile; −3.57 SD) and her head circumference 39 cm (4 cm below the 3rd centile; −5.83 SD). We could not obtain a reliable measurement of her length, but she was significantly smaller than expected for her age. The MRI of IV:6 at the age of ten months revealed slightly delayed myelination with deep sulci of the cerebellum (Figure 8.3.2.1.2, A), corresponding to the brain development of a six-month-old child (Figure 8.3.2.1.2, B), and suspected pachygyria with slightly enlarged gyri in the frontal lobe


(Figure 8.3.2.1.2, C). Genetic testing comprised karyotyping, array analysis, and screening for Fragile X syndrome, all of which were normal.

Figure 8.3.2.1.2: MRI of individual IV:6 at the age of ten months showing mildly delayed myelination (A), discrete frontal pachygyria (B) and deep sulci of the cerebrum (C).

8.3.2.2 Family MR026: homozygosity mapping and whole exome sequencing

Homozygosity mapping revealed five candidate regions on chromosomes 1, 7, 11, 14, 18 and 22 with a total length of 67.1 Mb. Whole-exome sequencing (WES) using DNA of IV:3 identified a total of 48,119 variants, of which only 44 homozygous variants were located in the candidate regions. Of eight exonic or splice-site variants in the candidate regions, only the homozygous variant in TAF13 (NM_005645.3), c.119T>A (p.Met40Lys), affects a highly conserved residue; it was predicted to be pathogenic by all applied in silico programs and was absent in 380 Syrian control individuals, 728 in-house exomes, and all available SNP databases (ExAC, ESP, EVS, 1000 Genomes). Sanger sequencing of all healthy siblings, both affected individuals, and the parents showed that the variant in TAF13 segregates with ID in the family.
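The filtering cascade described above (all variants, then homozygous variants inside the mapped candidate regions, then exonic or splice-site variants) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the `Variant` record, the region coordinates, the consequence labels, and the toy variants are hypothetical and do not reproduce the pipeline or the data used in this thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    zygosity: str     # "hom" or "het"
    consequence: str  # e.g. "missense", "splice_site", "intronic"


# Hypothetical candidate intervals (chromosome, start, end); placeholders,
# not the regions actually mapped in family MR026.
CANDIDATE_REGIONS = [("1", 100_000_000, 110_000_000), ("7", 5_000_000, 9_000_000)]

# Assumed set of consequence labels treated as "exonic or splice-site".
EXONIC_OR_SPLICE = {"missense", "nonsense", "frameshift", "splice_site"}


def in_candidate_region(variant, regions=CANDIDATE_REGIONS):
    """True if the variant lies inside any candidate interval."""
    return any(variant.chrom == chrom and start <= variant.pos <= end
               for chrom, start, end in regions)


def filter_candidates(variants, regions=CANDIDATE_REGIONS):
    """Keep homozygous variants in candidate regions, then exonic/splice ones."""
    homozygous_in_region = [v for v in variants
                            if v.zygosity == "hom" and in_candidate_region(v, regions)]
    return [v for v in homozygous_in_region if v.consequence in EXONIC_OR_SPLICE]


# Toy input, invented for this example.
TOY_VARIANTS = [
    Variant("1", 105_000_000, "hom", "missense"),  # kept
    Variant("1", 105_500_000, "het", "missense"),  # dropped: heterozygous
    Variant("7", 6_000_000, "hom", "intronic"),    # dropped: not exonic/splice
    Variant("3", 1_000_000, "hom", "nonsense"),    # dropped: outside the regions
]

candidates = filter_candidates(TOY_VARIANTS)
```

In a real analysis, further steps such as conservation, in silico prediction, and control-cohort frequency would follow this positional filter, as described in the text.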

8.3.2.3 Identification of a second variant in TAF13

In cooperation with Dr. Roberto Colombo (Center for the Study of Rare Hereditary Diseases, Milan, Italy), another variant in TAF13, c.92T>A, p.(Leu31His), was identified in a consanguineous family of Southern Italian origin with two affected males. In this family, the affected males (IV:4 and IV:6) displayed a phenotypic overlap with the Syrian patients, including mild intellectual disability, microcephaly, and delayed skeletal maturation and puberty, but showed no growth retardation at the time of examination (Table 8.3.2.3.1). Urinary ketones, non-glucose reducing substances, organic acid and amino acid profiles, array analysis, and karyotyping were normal in both, and Fragile X syndrome was excluded by molecular analysis of FMR1.


| Symptoms | Family 1, IV:3 | Family 1, IV:6 | Family 2, IV:4 | Family 2, IV:6 |
|---|---|---|---|---|
| Pregnancy and delivery | normal | normal | normal | normal |
| Development milestones delay | mild | mild | mild | mild |
| Intellectual disability | mild | mild | mild | mild |
| Dysmorphic facial features | none | none | none | none |
| Puberty signs | delayed | NA | delayed | delayed |
| Bone age | delayed | NA | delayed | delayed |
| Growth retardation | severe | severe | none | none |
| Microcephaly | severe | severe | moderate | mild |
| Brain imaging | NA | Brain MRI showed delayed myelination | Brain CT scan at the age of 7.5 years was unremarkable | NA |

Table 8.3.2.3.1: Overview of the clinical presentation of all four individuals with a homozygous variant in TAF13, showing a substantial phenotypic overlap. NA: data not available.

8.3.2.4 Molecular modeling suggested that both variants hamper TAF13-TAF11 heterodimer formation by a similar mechanism

The molecular modelling analysis was done in cooperation with Professor Heinrich Sticht (Institute of Biochemistry, Friedrich-Alexander-University Erlangen-Nuremberg, Germany) and highlighted the detrimental effect of the p.(Met40Lys) and p.(Leu31His) variants on the capacity of TAF13 to form a heterodimer with TAF11. In the TAF13-TAF11 complex, TAF13 forms two α helices connected by a tight turn. This conformation is stabilized by intramolecular interactions formed between the hydrophobic residue pairs Leu31-Ile62 and Met40-Leu57 (Figure 8.3.2.4.1, I). In both variants, p.(Met40Lys) and p.(Leu31His), a hydrophobic residue is replaced by a positively charged residue, resulting in a loss of the hydrophobic interactions with Leu57 or Ile62. These structural effects were predicted to destabilize the interactions between the α1 and α2 helices of the TAF13 histone fold that stabilize the L1 loop required for heterodimerization with TAF11, and were therefore expected to affect formation of the TAF13-TAF11 complex (Figure 8.3.2.4.1, II and III).


Figure 8.3.2.4.1: (I) Structure of TAF13 (dark blue) in complex with TAF11 (cyan). Residues Leu31 and Met40 are shown in space-filled presentation and colored according to their atom types. The interacting residues Leu57 and Ile62 are colored in red and orange, respectively. (II) The Leu31His pathogenic variant results in a loss of the hydrophobic contacts with Ile62 (indicated by a black arrow). (III) The Met40Lys pathogenic variant results in unfavorable interactions between the Lys40 amino group and the hydrophobic Leu57 side chain (black arrow) and additionally causes steric clashes due to the longer lysine side chain compared to methionine (red arrow).

8.3.2.5 Co-immunoprecipitation (Co-IP) proves pathogenicity of the c.119T>A variant

To test the molecular modelling assumptions, co-immunoprecipitation experiments were performed to analyse the pathogenic effect of the c.119T>A variant on the ability of TAF13 to interact with TAF11. The experiments were, with my support and contribution, part of the master's thesis of Natalie Wohlfahrt (Institute of Human Genetics, Erlangen). In preparation for the Co-IPs, TAF13 (mutant and wild type) and TAF11 were cloned into pcDNA3.1(+)/HA-His and pcDNA3.1(+)/myc-His vectors, respectively. The qPCRs of HeLa cells transfected with myc+TAF11 and either wild-type or mutant HA+TAF13 revealed an approximate 3:1 overexpression ratio of TAF13:TAF11, with consistently higher expression levels of wild-type than of mutant TAF13. Subsequently, two different Co-IP experiments were carried out using the cell lysates of HeLa cells double-transfected with TAF11-TAF13 and TAF11-mutTAF13 (Figure 8.3.2.5.1, A). In the first Co-IP experiment, in contrast to cells transfected with TAF11/TAF13, no distinct TAF11 band was detectable in the protein lysates of cells transfected with TAF11-mutTAF13 (Figure 8.3.2.5.1, B-I). Later on, in cooperation with the group in Strasbourg (see below), similar results were obtained in the second type of Co-IP, in which no obvious TAF13 band was detectable in the protein lysates of cells transfected with TAF11-mutTAF13, compared with cells transfected with TAF11/TAF13 (Figure 8.3.2.5.1, B-II).


Figure 8.3.2.5.1: (A) qPCR results to evaluate the gene expression after double transfection into HeLa cells. Lane 1 presents the cDNA of the empty vectors myc/HA, lane 2 contains the cDNA of myc+TAF11 / HA+TAF13 and lane 3 portrays the cDNA of myc+TAF11 / HA+varTAF13. For all templates, specific TAF13 and TAF11 primers were used. (B) Western blots of Co-IP experiments with transfected HeLa cell protein lysates. (B-I) Co-IP performed with α-HA antibody, western blot with α-myc antibody. Lane 1 displays the molecular ladder SeaBlue®, lane 2 TAF11/ TAF13, lane 3 TAF11/varTAF13, lane 4 HA/myc and lane 5 a NC without antibodies in the Co-IP. (B-II) Co-IP performed with α-myc antibody, western blot with α-HA antibody. Lane 1 displays TAF11/TAF13, lane 2 TAF11/ varTAF13, lane 3 HA/myc, lane 4 a NC without antibodies and lane 5 the molecular ladder MagicMark (Master thesis of Natalie Wohlfahrt, Institute of Human Genetics, Erlangen).

Later, Co-IP experiments were performed in cooperation with the laboratory of Prof. Irwin Davidson (IGBMC, France) to confirm the pathogenic effect of the p.(Leu31His) variant as well as our Co-IP results for p.(Met40Lys). TAF13 (wild type and mutants) and TAF11 were cloned into the pcDNA3.1-HA-His vector (master's thesis of Natalie Wohlfahrt) and the pAT6-B10 vector (Igor Martianov, IGBMC), respectively. Co-IP experiments were performed on protein extracted from transfected HeLa cells using the anti-B10 antibody to precipitate TAF11-TAF13 complexes, followed by immunoblotting with anti-HA antibody to


detect co-precipitated TAF13. As shown in Figure 8.3.2.5.2, both variants reduced the TAF11-TAF13 interaction, so that less TAF13 was co-precipitated than with the wild type, although comparable amounts of TAF11 were precipitated in all reactions. The effect was particularly marked with the c.119T>A (p.Met40Lys) variant.

Figure 8.3.2.5.2: Western blots of Co-IP experiments with HeLa cells transfected with TAF11 and either wild-type TAF13 or one of the mutant TAF13 constructs. As shown in the figure, anti-B10 antibody was used to precipitate the TAF11-TAF13 heterodimer, and the blots were then revealed using anti-HA antibody (Prof. Irwin Davidson, IGBMC, France) (Tawamie et al. 2017).

8.3.2.6 TAF13 widely regulates gene expression and cellular functions

TAF13 is a constituent of at least two protein complexes: the TFIID complex and the small nuclear RNA gene specific TAF complex (snTAFc), which play a critical role in the regulation of gene transcription in eukaryotic cells (Purrello et al. 1998; Zaborowska et al. 2012). To test the consequences of TAF13 loss of function on the gene expression pattern, TAF13 expression was knocked down by about 80% using siRNA in human neuroblastoma cells (SK-N-BE(2)) (Figure 8.3.2.6.1, A), and RNA sequencing was subsequently performed. Of 21,620 evaluated Ensembl genes, 16,984 with an HGNC gene symbol were selected for filtering. Compared with control cells, knockdown cells showed deregulation of 1,278 (7.8%) of these genes with |fold change| > 1.5 and p-value < 0.01 (Figure 8.3.2.6.1, B and C). Our analysis using the Molecular Signatures Database (MSigDB) for transcription factor targets (Subramanian et al. 2005) revealed significant enrichment of several transcription


factor binding motifs, such as the E-box (CANNTG) or other motifs (e.g., AACTTT and CTTTGT), in the genes deregulated by TAF13 knockdown (Table 8.3.2.6.1).
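The selection rule stated above (|fold change| > 1.5 and p-value < 0.01) can be written as a small filter. The sketch below assumes that fold changes are available on the log2 scale and that the threshold is applied symmetrically to up- and down-regulation; the gene names and values are invented placeholders, not results from this thesis.

```python
import math

FOLD_CHANGE_CUTOFF = 1.5  # threshold on the absolute fold change, from the text
P_VALUE_CUTOFF = 0.01     # threshold on the p-value, from the text


def is_deregulated(log2_fold_change, p_value):
    """Apply |fold change| > 1.5 (symmetric on the log2 scale) and p < 0.01."""
    return (abs(log2_fold_change) > math.log2(FOLD_CHANGE_CUTOFF)
            and p_value < P_VALUE_CUTOFF)


# Hypothetical results: gene -> (log2 fold change, p-value).
TOY_RESULTS = {
    "GENE_A": (1.20, 0.0004),   # up-regulated and significant
    "GENE_B": (-0.90, 0.0030),  # down-regulated and significant
    "GENE_C": (0.30, 0.0001),   # significant, but the effect is too small
    "GENE_D": (2.00, 0.2000),   # large effect, but not significant
}

deregulated = sorted(gene for gene, (lfc, p) in TOY_RESULTS.items()
                     if is_deregulated(lfc, p))
```

Working on the log2 scale keeps the cutoff symmetric: a 1.5-fold increase and a 1.5-fold decrease are treated alike.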

Figure 8.3.2.6.1: (A) TAF13 knockdown in SK-N-BE(2) cells. RT-PCR shows that siGENOME siRNA targeting human TAF13 knocked down almost 80% of the gene expression in SK-N-BE(2). Values shown are the mean from four independent experiments. Error bars represent the standard error of the mean (SEM). (B) Changes in gene expression upon TAF13 knockdown. SK-N-BE(2) cells were treated with negative control siRNA or TAF13 siRNAs (siTAF13) and sampled for RNA-seq analysis after 72 h. RNA-seq samples intercorrelate well at the expression level. The heatmap uses regularized log-transformed counts. Unsupervised hierarchical clustering separates the samples into two distinct groups, TAF13-knockdown SK-N-BE(2) cells and controls. (C) The heat map displays hierarchically clustered expression values for all 1,278 significantly altered, differentially expressed genes (DEGs). Regularized log-transformed counts for each gene were standardized across samples; the red colour scale indicates up-regulation and the blue colour scale down-regulation.
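For orientation, a knockdown efficiency of about 80% corresponds to a residual relative expression of roughly 0.2. The sketch below shows how such a value could be derived with the widely used 2^(−ΔΔCt) calculation. This is an assumption for illustration: the caption does not state which quantification method was used, and the Ct values and the function name `relative_expression` are hypothetical.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Relative expression by 2^(-ddCt), normalised to a reference gene."""
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_treated - delta_control)


# Hypothetical Ct values: the target gene crosses the threshold 2.32 cycles
# later in siRNA-treated cells, while the reference gene is unchanged.
rel = relative_expression(ct_target_treated=26.32, ct_ref_treated=18.0,
                          ct_target_control=24.0, ct_ref_control=18.0)
knockdown_percent = (1.0 - rel) * 100.0
```

With these invented values, the residual expression is close to 0.2, i.e. a knockdown of about 80%.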


| Transcription factor target motif | # genes with motif | # deregulated genes (observed) | # deregulated genes (expected) | ratio | rawP | adjP |
|---|---|---|---|---|---|---|
| hsa_CAGGTG_V$E12_Q6 (E box) | 2024 | 240 | 155.52 | 1.54 | 7.59E-13 | 4.60E-10 |
| hsa_AACTTT_UNKNOWN | 1549 | 185 | 119.02 | 1.55 | 3.23E-10 | 9.79E-08 |
| hsa_CTTTGT_V$LEF1_Q2 | 1658 | 193 | 127.4 | 1.51 | 1.07E-09 | 1.62E-07 |
| hsa_CAGCTG_V$AP4_Q5 (E box) | 1244 | 154 | 95.59 | 1.61 | 1.03E-09 | 1.62E-07 |
| hsa_V$TAL1BETAE47_01 (E box) | 190 | 37 | 14.6 | 2.53 | 1.20E-07 | 1.45E-05 |
| hsa_GGGAGGRR_V$MAZ_Q6 | 1910 | 204 | 146.76 | 1.39 | 3.30E-07 | 3.33E-05 |
| hsa_YTATTTTNR_V$MEF2_02 | 537 | 72 | 41.26 | 1.74 | 2.26E-06 | 0.0002 |
| hsa_YATGNWAAT_V$OCT_C | 284 | 45 | 21.82 | 2.06 | 2.58E-06 | 0.0002 |
| hsa_TTGTTT_V$FOXO4_01 | 1647 | 174 | 126.55 | 1.37 | 5.54E-06 | 0.0004 |
| hsa_TTANTCA_UNKNOWN | 747 | 91 | 57.4 | 1.59 | 6.60E-06 | 0.0004 |

Table 8.3.2.6.1: Enrichment of the top ten promoter motifs in genes deregulated after knockdown of TAF13, determined using the Molecular Signatures Database (MSigDB) for transcription-factor targets (Subramanian et al. 2005). The first column gives the motif and, where known, its associated transcription factor (e.g. E12, LEF1, MAZ). The second column gives the number of genes with this motif among the genes expressed in the examined cell line (SK-N-BE(2)). The third and fourth columns give the observed and the expected number of significantly changed genes. The fifth column (ratio) gives the ratio between the observed and the expected number of deregulated genes with this motif. The last two columns give the significance of the result nominally (rawP) and after correction for multiple testing (adjP).
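The ratio column of Table 8.3.2.6.1 can be recomputed from the observed and expected counts, and dividing the expected count by the number of genes carrying the motif gives the background fraction of deregulated genes that the enrichment analysis implicitly used. The sketch below does both with the values copied from the table; the function names are illustrative, and the comparison tolerances are an assumption that allows for rounding in the printed values.

```python
# (motif, genes with motif, observed, expected, reported ratio), from the table.
ROWS = [
    ("hsa_CAGGTG_V$E12_Q6", 2024, 240, 155.52, 1.54),
    ("hsa_AACTTT_UNKNOWN", 1549, 185, 119.02, 1.55),
    ("hsa_CTTTGT_V$LEF1_Q2", 1658, 193, 127.4, 1.51),
    ("hsa_CAGCTG_V$AP4_Q5", 1244, 154, 95.59, 1.61),
    ("hsa_V$TAL1BETAE47_01", 190, 37, 14.6, 2.53),
    ("hsa_GGGAGGRR_V$MAZ_Q6", 1910, 204, 146.76, 1.39),
    ("hsa_YTATTTTNR_V$MEF2_02", 537, 72, 41.26, 1.74),
    ("hsa_YATGNWAAT_V$OCT_C", 284, 45, 21.82, 2.06),
    ("hsa_TTGTTT_V$FOXO4_01", 1647, 174, 126.55, 1.37),
    ("hsa_TTANTCA_UNKNOWN", 747, 91, 57.4, 1.59),
]


def enrichment_ratio(observed, expected):
    """Observed over expected number of deregulated genes with the motif."""
    return observed / expected


def background_fraction(expected, genes_with_motif):
    """Share of deregulated genes in the reference set implied by 'expected'."""
    return expected / genes_with_motif


recomputed = {motif: enrichment_ratio(obs, exp) for motif, _, obs, exp, _ in ROWS}
fractions = {motif: background_fraction(exp, n) for motif, n, _, exp, _ in ROWS}

# Rows whose recomputed ratio differs from the printed one by 0.01 or more.
mismatches = [motif for motif, _, _, _, reported in ROWS
              if abs(recomputed[motif] - reported) >= 0.01]
```

All ten rows imply nearly the same background fraction, which is what one expects when the expected count is simply the number of motif-carrying genes multiplied by the overall share of deregulated genes in the reference set.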

Functional enrichment analysis of the 1,278 deregulated genes (DEGs) via Ingenuity Pathway Analysis (IPA) indicated that a broad spectrum of functions was affected, often with an increased activity state. This led to the assumption that TAF13 is involved in transcriptional regulation across a wide range of tissues and pathways. The top six affected functions were the general cell functions of movement, migration, proliferation, morphology, protrusion formation, and differentiation (Table 8.3.2.6.2). These were followed by neuron development, neuronal cell proliferation, and neuritogenesis.

| Diseases or Functions Annotation | p-Value | Predicted Activation State |
|---|---|---|
| cell movement | 1.39E-18 | Increased |
| migration of cells | 2.14E-18 | Increased |
| proliferation of cells | 1.85E-15 | |
| morphology of cells | 2.56E-15 | |
| formation of cellular protrusions | 1.47E-14 | Increased |
| differentiation of cells | 1.58E-14 | Increased |
| development of neurons | 1.79E-14 | Increased |
| proliferation of neuronal cells | 9.35E-14 | |
| differentiation of connective tissue cells | 1.72E-13 | Increased |
| neuritogenesis | 2.80E-13 | Increased |

Table 8.3.2.6.2: The top ten functional annotations of the deregulated genes in TAF13-knockdown SK-N-BE(2) cells, obtained using Ingenuity Pathway Analysis (IPA).


8.3.2.7 TAF13 is required for neuronal differentiation and proliferation

The RNA sequencing results suggest that TAF13 has an important role in the developmental stages of the central nervous system. Consistent with this, I found an increase in TAF13 expression in fetal brain development (Figure 8.3.2.7.1, A) and during the differentiation of SK-N-BE(2) cells (Figure 8.3.2.7.1, B).

Figure 8.3.2.7.1: TAF13 is up-regulated upon neuronal differentiation and brain development. TAF13 expression levels were measured by qRT-PCR and compared between fetal and adult human brain (A) and between undifferentiated SK-N-BE(2) cells and cells differentiated for three days (B).

This and the functional enrichment analyses prompted me to test the proliferation and differentiation of SK-N-BE(2) cells after TAF13 knockdown using siRNA. To study the changes in the differentiation of SK-N-BE(2) cells after TAF13 knockdown, we induced SK-N-BE(2) cells to form neuron-like cells by using a mixture of retinoic acid (RA) and caffeic acid (CA) and used quantitative real-time PCR to measure the expression of specific differentiation markers: growth-associated protein 43 (GAP43), microtubule-associated protein tau (MAPT), and v-myc avian myelocytomatosis viral oncogene neuroblastoma derived homolog (MYCN) (Figure 8.3.2.7.2, A and B) (Leotta et al. 2014). Expression of GAP43 and MAPT was significantly reduced (p < 0.0001) in both differentiating and undifferentiated knockdown SK-N-BE(2) cells. MYCN showed no significant change in the undifferentiated cells but was significantly downregulated in the differentiating cells. Although we did not achieve a stable knockdown of TAF13 for the full differentiation process across 12 days (Leotta et al. 2014), the results indicate that TAF13 regulates the expression of many genes involved in the differentiation process.


Figure 8.3.2.7.2: TAF13 is required for efficient differentiation of SK-N-BE(2) cells into neuron-like cells in vitro. The expression of the differentiation markers (GAP43, MYCN, MAPT) in undifferentiated and differentiating WT and TAF13-knockdown cells was measured using qRT-PCR. In contrast to MYCN, which shows its highest expression in the early steps of neuronal differentiation, GAP43 and MAPT are characterized by high expression in the last stage of the differentiation process. Values are relative to wild-type differentiated SK-N-BE(2) cells. Values shown are the mean from three independent experiments.

In the cell-proliferation assay, cells were seeded and transfected with siRNA against TAF13. Cells were then counted by colorimetric assay, and TAF13-knockdown SK-N-BE(2) cells were found to exhibit a 2-fold higher proliferation rate than negative control cells (p < 0.0001; Figure 8.3.2.7.3, C). Comparable results were obtained in the migration assay. SK-N-BE(2) cells were seeded and transfected with siRNA against TAF13. After 48 hours of transfection, the cell monolayer was scratch-wounded. Wound healing of the knockdown SK-N-BE(2) cells was completed within 24 to 30 hours, which is 30% quicker than for the wild-type cells (p < 0.02; Figure 8.3.2.7.3, A and B). However, we cannot reliably distinguish the migration effect from the proliferation effect.


Figure 8.3.2.7.3: (A) Compared with the negative control, TAF13-targeted RNAi increased the migration ability of SK-N-BE(2) cells, as determined by scratch assay. (B) The cell monolayer was scratched with a pipette tip and incubated at 37°C for 24 hours, and images of the scratches were captured. Values shown are the mean for each condition from three independent experiments. (C) Compared with the negative control, TAF13-targeted RNAi increased the proliferation ability of SK-N-BE(2) cells, as determined by WST-8 assay.


8.3.3 Identification of a pathogenic variant in KIAA0586 in family MR026

8.3.3.1 Family MR026-04: Phenotype

Figure 8.3.3.1.1: (A, B) Family pedigree and photographs of the two affected siblings (MR026-01 and MR026-04), as well as an MRI of individual MR026-01 that clearly shows the molar tooth sign (MTS), the hallmark of Joubert syndrome (C).

Family MR026 is a consanguineous Syrian family of Kurdish descent with two affected individuals (Figure 8.3.3.1.1, A and B). Pregnancy, delivery, and birth parameters of both children were unremarkable. In the neonatal period, both were hypotonic and weepy. Motor and speech development in MR026-01 were delayed, and his IQ was estimated to be between 50 and 70. Further symptoms were severe myopia, scoliosis, brachydactyly, distinct facial characteristics, and recurrent febrile seizures. Height was reduced (108 cm, −2.6 SD), weight was normal (22 kg, −0.27 SD), and head circumference was increased (57 cm, +2.3 SD). MR026-04 had not reached any milestones, and at the age of 7 years, she was wheelchair-bound. Her cognitive abilities were weaker than those of her brother, with an IQ estimated to be below 35. MR026-04 had physical characteristics similar to those of her brother, as well as severe muscular hypotonia, prolonged and therapy-resistant seizures since the age of 14 months, and hypothyroidism. At the time of examination, her height was 91 cm (1 SD), weight was 11.5 kg (−0.7 SD), and there was macrocephaly (head circumference of 59 cm, +8 SD). The MRI of MR026-01 clearly revealed the molar tooth sign (MTS), the hallmark of Joubert syndrome (Figure 8.3.3.1.1, C) (Andermann et al. 1999; Quisling et al. 1999; Maria et al. 1997).


8.3.3.2 Family MR026: homozygosity mapping and whole exome sequencing

Under the assumption that the causative variant would be homozygous and identical by descent in both affected individuals, homozygosity mapping was performed and revealed five candidate regions on chromosomes 2, 5, and 7 with a total length of 107.8 Mb. Whole exome sequencing using DNA from individual MR026-01 identified only one candidate homozygous pathogenic variant, c.2414-1G>C, which affects the invariant consensus of the exon 18 acceptor splice site in KIAA0586. This variant was predicted to be pathogenic and was absent in 380 Syrian control individuals, including 92 of Kurdish origin, in 728 in-house exomes, and in many other SNP databases. Sanger sequencing of all healthy siblings, both affected individuals, and the parents showed that the segregation of the variant in KIAA0586 was compatible with causality in the family.

8.3.3.3 Consequences of the KIAA0586 variant c.2414-1G>C on mRNA level

The c.2414-1G>C variant affects the invariant consensus of the acceptor splice site of exon 18. RT-PCR and Sanger sequencing of the fragments amplified from cDNA of the patients (lymphoblastoid cell lines) revealed three aberrant splicing products, due to the usage of alternative exonic acceptor splice sites at AG motifs within exon 18 and due to skipping of exon 18: a 13-bp deletion that results in a premature termination codon (alternative acceptor splice site at c.2425/2426AG), a 108-bp in-frame deletion (alternative acceptor splice site at c.2520/2521AG), and a 188-bp deletion due to skipping of exon 18 that results in a premature stop codon (Figure 8.3.3.3.1, A). These aberrant transcripts were present in the cDNA from both patients, but not in the cDNA of a healthy control individual. Semi-quantitative PCR suggested that the aberrant transcripts are likely to be degraded by nonsense-mediated decay (NMD) (Figure 8.3.3.3.1, B).
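The sizes and reading-frame consequences of the three aberrant transcripts follow directly from the coordinates given above, which a short check makes explicit. The sketch assumes that exon 18 begins at c.2414 (as implied by the variant name c.2414-1G>C) and that splicing resumes immediately after the G of each cryptic AG; the function names are illustrative.

```python
EXON18_FIRST_BASE = 2414  # exon 18 starts at c.2414 (variant sits at c.2414-1)
EXON18_SKIP_LENGTH = 188  # size of the deletion when exon 18 is skipped (text)


def deletion_length(cryptic_ag_last_base):
    """Bases lost when splicing resumes right after the G of a cryptic AG."""
    return cryptic_ag_last_base - EXON18_FIRST_BASE + 1


def keeps_reading_frame(deleted_bases):
    """A deletion preserves the reading frame only if it is a multiple of 3."""
    return deleted_bases % 3 == 0


del_first_ag = deletion_length(2426)   # cryptic acceptor at c.2425/2426AG
del_second_ag = deletion_length(2521)  # cryptic acceptor at c.2520/2521AG

frame_status = {
    del_first_ag: keeps_reading_frame(del_first_ag),              # 13 bp, frameshift
    del_second_ag: keeps_reading_frame(del_second_ag),            # 108 bp, in frame
    EXON18_SKIP_LENGTH: keeps_reading_frame(EXON18_SKIP_LENGTH),  # 188 bp, frameshift
}
```

Only the 108-bp deletion is a multiple of three, which matches the description of one in-frame product and two products with premature termination codons.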


Figure 8.3.3.3.1: (A) Genomic structure of KIAA0586 (NM_001244190.1) including a scheme of human KIAA0586 protein and predicted consequences of c.2414-1G>C pathogenic variant. Orange color: unrelated residues included due to frameshift mutations. The third coiled-coil domain is the counterpart of the functionally essential fourth coiled-coil domain in chicken (framed in red). (B) The gel electrophoresis shows the aberrant transcripts due to c.2414-1G>C.

8.3.3.4 Further studies on KIAA0586

Two further families were recruited, each with one affected individual who showed the “molar tooth sign” on cranial axial MRI and carried probably pathogenic variants in KIAA0586. Further functional studies by Dr. Megan G Davey’s lab (Roslin Institute, Edinburgh) confirmed that KIAA0586 is an essential protein with several distinct roles in centriole function. Taken together, this led to the description of KIAA0586 as a novel gene for Joubert syndrome (Stephen et al. 2015).

8.3.4 Further convincing candidate variants

In addition to the extensive analyses that I performed for the above-mentioned genes, I identified further convincing variants. For several of these (EZR, FAR1, PGAP1, PGAP2, and TMTC3), I performed only the preliminary analyses, including validation with Sanger sequencing, testing for minor allele frequencies in a healthy, ethnically matched cohort, basic


in silico analyses using publicly available software, and first steps of functional analysis (e.g. mutagenesis, construction of vectors, etc.). Further analyses were done in cooperation. For more information, please see the discussion.


9 Discussion

9.1 Identification of mutations in known ID genes

In this thesis, two pathogenic variants in already known intellectual disability genes were identified in two consanguineous families from Syria and Turkey. The identification of only 2 known genes in 19 families (10.5%) is understandable, as the number of known genes for autosomal recessive intellectual disability (ID) was still very small when I started this thesis. In the meantime, higher diagnostic yields can be achieved. One example is the study by Trujillano et al. (2017), in which pathogenic or likely pathogenic variants were identified in 307 of 1000 families (overall molecular diagnostic yield of 30.7%), which illustrates the progress that has been achieved in this field.

In the first family, I identified an extremely rare frameshift variant, c.917dup, p.(Lys307Glufs*5), in AHI1 (absent from the Genome Aggregation Database (gnomAD), data extracted on 13.09.2017). The truncated protein completely lacks the SH3 domain and all the WD40 repeats. A total of 32 truncating pathogenic variants have been reported in the literature or in the Human Gene Mutation Database (HGMD, data extracted on 13.09.2017), but these are located closer to the C terminus. We thus considered this variant as likely pathogenic. We then initiated a cMRI of the affected child in Damascus, Syria, which showed the molar tooth sign that is characteristic of Joubert syndrome (Maria et al. 1997). Joubert syndrome is a cerebello-oculo-renal syndrome with phenotypes including cerebellar hypoplasia, retinal dystrophy, and nephronophthisis. In two surveys of 117 and 137 subjects with Joubert syndrome, 11% and 7.3% had causative pathogenic variants in AHI1. In summary, clinical and molecular data converge on the diagnosis of Joubert syndrome.

In the second family, I identified a homozygous truncating variant in SPG20, c.364_365delTA, p.(Met122Valfs*2), in two patients of Turkish background with delayed milestones, dysarthria, hyperreflexia, spasticity of the lower limbs, distal muscular atrophy with normal electromyography and nerve conduction velocity, skeletal abnormalities, and emotional lability. This extremely rare variant (frequency of 0.0000686 in gnomAD, data extracted on 13.09.2017) was previously identified by Manzini and colleagues in an Omani family with Troyer syndrome (Manzini et al. 2010). Troyer syndrome is an autosomal recessive, “complicated” form of hereditary spastic paraplegia (HSP) that, in addition to the paraplegia, is associated with distal amyotrophy, dysarthria, mild developmental delay, and skeletal abnormalities. For a long time, the denomination Troyer syndrome was restricted to the Amish population with only one founder loss-of-function pathogenic variant (c.1110delA) in


the spastic paraplegia 20 gene (SPG20) (Cross and McKusick 1967; Patel et al. 2002; Bakowska et al. 2008). The deletion of two base pairs in the identified SPG20 variant leads to a premature stop codon at position 122 of the protein and probably triggers nonsense-mediated decay. In line with this hypothesis, the full-length spartin protein encoded by SPG20 and also the mRNA were absent in cells of affected individuals, indicating that this mutation is a null allele (Manzini et al. 2010). The comparison of the haplotypes between the Omani family of Manzini and the Turkish family in this thesis suggests that this is a recurrent variant and not the result of a founder effect (see also results 8.2.2.2). This suggests either that other variants in SPG20 would lead to a different phenotype or even be lethal, or that this position is prone to mutations due to DNA characteristics. The phenotypic comparison between the affected Omani (Manzini) and Turkish (this thesis) individuals revealed an overlapping phenotype including short stature, mild skeletal abnormalities of the hands, spasticity, and distal muscle atrophy with hyperreflexia. These features progressed insidiously over time. In adulthood, the affected patients displayed gait disturbances, delayed motor and cognitive development, dysarthria, and slurred speech. A phenotypic feature observed only in males was the presence of a prominent chin. In contrast, brain imaging results varied between the families. The brain MRI performed for our Turkish patient P1 was unremarkable, whereas brain MRI in affected males from the Omani family revealed mild atrophy of the cerebellar vermis, mild white matter volume loss, and periventricular white matter hyperintensity (Manzini et al. 2010).
Again, the comparison of the clinical features of the affected individuals from the Turkish and Omani families with those from the Amish population revealed a phenotypic overlap, including delayed milestones, dysarthria, hyperreflexia, spasticity of the lower limbs, distal muscular atrophy with normal electromyography and nerve conduction velocity, skeletal abnormalities (small stature, hyperextensible joints), and emotional lability. Interestingly, brain MRI scans of the Amish patients revealed deep white matter abnormalities similar to those reported in the Omani patients (Proukakis et al. 2004). This family represents the opposite example to the first family with Joubert syndrome. Years of exhaustive clinical evaluation by highly specialized clinicians did not establish the diagnosis. Such rare syndromes with non-specific features represent a considerable burden for the family and for health care due to repeated medical assessments and extensive investigations over the years. The diagnosis of Troyer syndrome was probably missed for several reasons: a) the absence of pathognomonic signs or symptoms, b) the syndrome has been

69

considered to be confined to the Amish population, and c) the patients were regarded as intellectually disabled and not as having hereditary spastic paraplegia, which prevented a correct differential diagnosis. This family demonstrates the importance of modern sequencing approaches, in which a candidate variant is first identified and retrospective phenotyping is then performed in order to assign a diagnosis. Considering the improved quality and reliability of sequencing and the expanding applications of this approach, introducing it as a diagnostic tool may spare the parents years of uncertainty and may also reduce health costs, as it is considerably less expensive than years of fruitless medical examinations and investigations. Nevertheless, precise phenotyping and clinical examination remain highly important, since the interpretation of sequencing results depends on the quality of the available clinical data.

9.2 Identification of pathogenic variants in novel ID genes

The identification and characterization of novel candidate genes for ID is of great importance not only for the affected individuals and their families but also for expanding our understanding of the biological processes underlying neuronal development and function, which may in future help to develop effective treatments. At the beginning of this thesis, almost 500 genes were known that may lead to ID if mutated. These included, in particular, genes underlying metabolic disorders and syndromic forms of ID. About 60% of the cases of intellectual disability could not be diagnosed (Vissers et al. 2016).

9.2.1 Identification of SVBP as a novel gene for autosomal recessive intellectual disability

Whole exome sequencing revealed a homozygous founder pathogenic variant in SVBP, c.82C>T; p.(Gln28*), in four affected individuals from two consanguineous families of Syrian and Pakistani descent. The clinical presentation was characterized by intellectual disability, microcephaly, neonatal muscular hypotonia, delayed motor development, and ataxia. SVBP, previously known as CCDC23, encodes a highly conserved, small chaperone (66 amino acids) with one coiled-coil domain corresponding to residues 32–52. In 2010, Suzuki and colleagues reported CCDC23 as an interaction partner of the vasohibin protein family (VASH1 and VASH2) and therefore renamed it small vasohibin binding protein (SVBP) (Suzuki et al. 2010). In an unconventional secretion pathway, SVBP first disperses the VASHs in the cytosol by inhibiting the formation of cytosolic VASH punctate structures (PS) and then interacts with the plasma membrane to release VASH1 and VASH2 into the extracellular space (Kadonosono et al. 2017). Knockdown of SVBP significantly impaired VASH1 secretion and its solubility in cell lysis buffer containing 1% Triton X-100 (Suzuki et al. 2010; Xue et al. 2013). This shows that the interaction with SVBP is indispensable for the secretion of vasohibin proteins. Similar findings were observed in this thesis. Although the expression of mutant SVBP in patients' LCLs did not appear to be affected by nonsense-mediated decay (NMD) (Figure 8.3.1.5.1, C), the altered protein was not detected at all in the cell lysate or in the conditioned medium. This was possibly due to late degradation of the RNA or to degradation of an unstable, truncated protein. This explains the reduced secretion of the three VASH1 isoforms (42, 36 and 32 kDa) into the conditioned medium and their low solubility in cell lysis buffer, and confirms the pathogenic effect of the identified variant. However, it is still unclear how this variant would lead to a rather unspecific phenotype of ID without clear symptoms in other systems. Suzuki and colleagues showed that the co-transfection of SVBP and VASH1 significantly augmented the inhibitory effect of VASH1 on the migration of endothelial cells (Suzuki et al. 2010). Thus, the pathomechanism of the identified variant in the two families may be related to the role of endothelial cells in the neurovascular unit (NVU) and the blood-brain barrier. However, there is as yet no proof of this hypothesis. Also, other, still unknown interaction partners of SVBP, rather than VASH1 and angiogenesis, may play a role in the pathomechanism. Interestingly, in cooperation with Prof.
Hans van Bokhoven (Donders Institute for Brain, Radboud University Medical Center, Netherlands) we could show for the first time a neuron-related function for SVBP: knocking down SVBP expression in rat hippocampal neurons (embryonic day 18) causes a notable reduction in excitatory synapse formation (results of Radboud University Medical Center). The identification of a founder pathogenic variant in SVBP in four affected individuals with overlapping phenotypes, the evidence that this variant functionally impacts SVBP, and the role of SVBP in synapse formation in neuronal cells together suggest SVBP as a novel candidate gene for ID. The results on SVBP are a good example of the research to identify novel ID genes. On the one hand, there are four patients with similar symptoms, a truncating variant, functional evidence that this variant abolishes the function of the protein, and functional evidence that the gene plays a role in neurons. On the other hand, the number of patients is still very limited, the pathomechanism is still not understood, and there is not yet even an established hypothesis for it. This shows how important the first description of a gene is, but also how necessary it is to continue the detailed characterization of the phenotype and the pathomechanism.

9.2.2 The identified variants of TAF13 as disease-causing for autosomal recessive intellectual disability and microcephaly

Whole exome sequencing revealed two homozygous pathogenic variants in TAF13, c.119T>A; p.(Met40Lys) and c.92T>A; p.(Leu31His), in four affected individuals from two consanguineous families of Syrian and Italian descent, respectively, with mild intellectual disability, microcephaly, delayed bone age, and delayed puberty (Tawamie et al. 2017). Molecular modeling showed that both variants identified in TAF13 affect conserved residues in the histone-binding domain (HBD), which is known to be essential for TAF13-TAF11 heterodimerisation (Mengus et al. 1995). The results of co-immunoprecipitation, performed in our laboratory in Erlangen and in cooperation with the laboratory of Prof. Irwin Davidson (IGBMC, France), confirmed the molecular modeling results and showed that both variants destabilize the bent TAF13 conformation and impair the ability of the altered TAF13 to form the heterodimer with TAF11. TAF13 forms a histone-fold-like heterodimer with TAF11, and this heterodimer is essential for their recruitment into the RNA polymerase II general transcription factor IID complex (TFIID) (Birck et al. 1998; Mengus et al. 1995; Cler et al. 2009; Papai et al. 2010). Structurally, the TFIID complex consists of the TATA-binding protein (TBP) and 13 to 14 highly conserved TBP-associated factors (TAFs), and it is the principal component of the transcriptional machinery responsible for recognizing and binding specific promoter DNA. TFIID also plays a central role in forming the pre-initiation complex (PIC) by providing an interface for other PIC components, such as TFIIA and TFIIB, by recognizing core promoter DNA elements such as the TATA box and downstream promoter elements, and by interacting with modified histone tails and transcriptional activators (Louder et al. 2016).
Many pathogenic variants associated with neurological phenotypes were reported in several components of the TFIID complex. Pathogenic variants in TAF1 and TATA-box-binding protein (TBP) are associated with X-linked dystonia-parkinsonism (XDP) and spinocerebellar ataxia 17 (SCA17), respectively. Pathogenic variants in TAF1, TBP, TAF2, and TAF6 have been reported in individuals with autosomal recessive ID (Koide et al. 1999; Rooms et al. 2006; Makino et al. 2007; Najmabadi et al. 2011; Hellman-Aharony et al. 2013; Hu et al. 2016; O'Rawe et al. 2015; Yuan et al. 2015).

The functional analyses suggested that neither variant completely abolishes the function of TAF13 (see Figure 8.3.2.5.2). However, in the co-IP the variant Leu31His (in family 2) appears to have a milder effect than the variant Met40Lys (in family 1) (see Figure 8.3.2.5.2). This leaves room for speculation that the milder phenotype of the affected members of family 2 (no growth retardation and head circumference still above the 3rd percentile) is due to a milder effect of the variant. However, it should be kept in mind that these experiments were performed in vitro under overexpression conditions and do not necessarily reflect the in vivo situation. Also, both pathogenic variants that we observed in TAF13 affect the same domain and thus impair the same function of the protein. However, TAF13 has different interaction partners and thus probably different functions. It is therefore possible that alterations in other domains of TAF13 would not necessarily lead to the same clinical presentation as in our patients. Likewise, in other subunits of TFIID, different variants lead to different phenotypes. Variants in TAF1 are associated with ID and microcephaly (Hu et al. 2016; O'Rawe et al. 2015), whereas a reduction in TAF1 expression is associated with X-linked dystonia-parkinsonism (XDP, MIM:314250) (Makino et al. 2007). Also, in TBP, the expansion of the polyglutamine tract (polyQ) causes spinocerebellar ataxia 17 (SCA17, MIM:607136), probably through a gain-of-function mechanism, whereas heterozygous deletion of TBP is probably responsible for ID and microcephaly (Rooms et al. 2006; Eash et al. 2005). The pathomechanism that leads to ID when TAF13 is mutated is still unclear. However, as a constituent of TFIID and of the small nuclear RNA gene-specific TAF complex (snTAFc), TAF13 has been suggested to play an important role in the regulation of gene transcription in eukaryotic cells (Purrello et al. 1998; Zaborowska et al. 2012).
I could further support this assumption by transcriptional profiling of the neuroblastoma cell line SK-N-BE(2) upon TAF13 knockdown, which showed a significant deregulation of 7.8% of the genes expressed in SK-N-BE(2). I could also show that genes containing E-box motifs in their promoters are affected more strongly. A deregulation of a comparably large number of genes was also reported in yeast, where TAF13 participates in the transcriptional regulation of 16% of TAF-dependent genes (Shen et al. 2003). Thus, at this stage it seems plausible that TAF13 plays a special role in the transcription of specific groups of genes that, when deregulated, lead to the ID phenotype. This was supported by the subsequent functional enrichment analysis performed in this thesis, which showed that TAF13 is an essential transcriptional regulator of subsets of genes involved in neuronal development functions such as differentiation and proliferation. This, in addition to the high expression of TAF13 in the brain, suggests TAF13 as an important player in neurological functions. To verify the results of RNA sequencing, I tested some of the affected functions of neuronal cells, such as proliferation, differentiation, and migration. This revealed an increased proliferation and the deregulation of many genes involved in neuronal differentiation, such as GAP43, MAPT, and MYCN, in the TAF13 knockdown SK-N-BE(2) cells, suggesting TAF13 as a regulator of neuronal proliferation and differentiation. Indeed, many TAFs play roles in nervous system development and maintenance, such as TAF9B and TAF4 in neuronal differentiation and TAF1, TAF9, and TAF10 in cell cycle progression (Herrera et al. 2014; Metsis et al. 2001; Mengus et al. 1997; Davidson et al. 2005). Thus, considering the genetic, clinical, and functional results on TAF13 together with the literature on other components of the TFIID complex allows the conclusion that bi-allelic pathogenic variants in TAF13 lead to ID (Jeronimo et al. 2007; Havugimana et al. 2012). The identification of pathogenic variants in TAF13 as the cause of mild intellectual disability with microcephaly makes TAF13 the fifth protein of TFIID, after TAF1, TAF2, TAF6 and TBP, whose pathogenic alterations lead to intellectual disability. The fact that we identified two different variants in TAF13, a gene belonging to a group already associated with ID, makes the results even more reliable than those for the previous gene, SVBP. Also, for TAF13 the RNA sequencing allowed me to formulate a hypothesis about a possible pathomechanism. However, much still needs to be done, including describing more patients and performing detailed functional analyses.

9.2.3 Identification of KIAA0586 as a novel gene for Joubert syndrome

Whole exome sequencing revealed a homozygous pathogenic variant in KIAA0586 (chr14:58934452G>C, c.2414-1G>C) in two siblings of Syrian descent with intellectual disability, macrocephaly, delayed motor development, and the characteristic “molar tooth sign” on MRI. In cooperation with Hanno J. Bolz (University Hospital of Cologne, Germany), Megan G. Davey (, UK), and Uwe Wolfrum (Johannes Gutenberg University of Mainz, Germany), further pathogenic variants in KIAA0586 could be identified in independent patients (Stephen et al. 2015). I found that the identified splice variant (c.2414-1G>C), by affecting the acceptor splice site of exon 18, produced three alternatively spliced transcripts: two of them carry frameshift deletions of 13 and 188 bp, respectively, while the third carries an in-frame deletion of 108 bp. Although the RT-PCR revealed that the total expression of KIAA0586 in patients might be affected by nonsense-mediated decay (NMD), all alternative transcripts may still produce a truncated protein with a preserved essential coiled-coil domain that mediates the centrosomal localization and function of the KIAA0586 protein (Yin et al. 2009; Ben et al. 2011; Wu et al. 2014). This assumption is supported by the fact that the complete disruption of this domain in animal models, including mouse, zebrafish, and chicken, is lethal. KIAA0586 is an evolutionarily conserved centrosomal protein essential for vertebrate development and ciliogenesis. Because the centrosome fails to migrate apically or to dock at the plasma membrane, loss of KIAA0586 function in animal models results in a failure to produce both primary and motile cilia, which in turn results in abnormal Hedgehog (Hh) signaling and disrupted GLI processing (Davey et al. 2006). The known function of the gene in ciliogenesis prompted us to check whether the patients have the typical symptoms of a ciliopathy. Thus, we initiated a cMRI of the index patient, which showed the molar tooth sign, a typical sign of Joubert syndrome. Joubert syndrome (JBTS) is a genetically heterogeneous, rare ciliopathy characterized by the molar tooth sign (MTS) on axial magnetic resonance imaging (MRI), which results from abnormal migration and development of cranial structures; it is caused by bi-allelic pathogenic variants in more than 20 genes that encode proteins related to the function and structure of cilia (Bachmann-Gagescu et al. 2015a). Thus, KIAA0586 is one further gene in a long list of Joubert genes. KIAA0586 was described as a novel gene for Joubert syndrome by several other groups in parallel with or shortly after us (Alby et al. 2015; Malicdan et al. 2015; Roosing et al. 2015; Bachmann-Gagescu et al. 2015b; Vilboux et al. 2017). Interestingly, a group from San Diego identified the same variant in a Syrian family with a similar phenotype including macrocephaly (Roosing et al. 2015). This further supports the pathogenicity and causality of this variant.
Although almost all reported patients with KIAA0586-related JBTS displayed a similar phenotype, some patients with loss-of-function variants in KIAA0586 were reported with severe, lethal symptoms (Alby et al. 2015). The difference in phenotype severity might be explained by a profound reduction in KIAA0586 expression and by the differential effect of pathogenic variants on the various KIAA0586 isoforms. For example, patients with the homozygous splice variant (c.1815G>A), which results in complete deletion of exon 14 from the KIAA0586 mRNA, showed a severe lethal ciliopathy phenotype including severe hydrocephalus, polydactyly of hands and feet, a cleft palate, and skeletal abnormalities. In comparison, Joubert syndrome patients who carry the same variant in the compound heterozygous state have a much milder clinical presentation (Alby et al. 2015; Roosing et al. 2015). In the same context, an early stop-gain variant (c.230C>G, p.(Ser77*)) affecting several KIAA0586 isoforms (Alby et al. 2015) could explain the lethal phenotype, in contrast to the classical Joubert syndrome phenotype caused by a likewise early stop-gain variant (c.74del, p.(Lys25Argfs*6)) that is located one exon earlier and affects a smaller number of KIAA0586 transcripts (Roosing et al. 2015). In comparison to SVBP and TAF13, the results for KIAA0586 seem to be the most convincing. This is due to the high number of reported patients and the well-studied pathway with known pathomechanisms. Obviously, no single group could have achieved such results alone. This, once again, demonstrates that sharing information and publishing findings is essential to achieve progress in this field.
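
The reading-frame consequences of the three deletions reported in this section for the splice variant c.2414-1G>C (13, 188, and 108 bp) follow from simple arithmetic: a deletion whose length is a multiple of three preserves the reading frame, while any other length shifts it. The following is a minimal sketch of that check; the function name `classify_deletion` is purely illustrative and is not part of the analysis pipeline used in this thesis.

```python
def classify_deletion(length_bp: int) -> str:
    """Classify a coding deletion by its effect on the reading frame."""
    if length_bp <= 0:
        raise ValueError("deletion length must be a positive number of base pairs")
    # Codons are 3 bp long, so only multiples of 3 keep the frame intact.
    return "in-frame" if length_bp % 3 == 0 else "frameshift"


# Deletion sizes of the three aberrant KIAA0586 transcripts described in the text.
deletions_bp = [13, 188, 108]
effects = {size: classify_deletion(size) for size in deletions_bp}

for size, effect in effects.items():
    print(f"{size} bp deletion: {effect}")
```

The check only considers deletion length; it makes no statement about whether a transcript is actually degraded by NMD or translated.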

9.2.4 Other genes of autosomal recessive intellectual disability

In the context of this study, I identified variants in the genes BDH1, CCAR2, EZR, FAR1, KCTD18, KIAA0586, LRCH3, OGDHL, PGAP1, PGAP2, PUS7, SKIDA1, SVBP, TAF13 and TMTC3. The results for KIAA0586, SVBP, and TAF13 have been discussed above. For some of the remaining genes, I performed only the preliminary analyses, including validation by Sanger sequencing, testing for minor allele frequencies in a healthy, ethnically matched cohort, basic in silico analyses using publicly available software, and the first steps of functional analysis (e.g. mutagenesis, construction of vectors, etc.). Further functional analyses were done in cooperation. Briefly:

1. EZR: I identified a homozygous candidate variant, c.385G>A, p.(Ala129Thr), in two affected males (muscular hypotonia, developmental delay, and distinct facial characteristics). Further analyses done by Dr. Helen Morrison’s laboratory (Leibniz Institute for Ageing Research, Jena) showed a normal subcellular localization of the altered protein, but an inability to activate Ras (Riecken et al. 2015).
2. FAR1: In two affected children (severe ID, growth retardation, and epilepsy) I identified a homozygous candidate variant, c.495_507delinsT. After the preliminary analyses, further functional studies were performed in the context of the doctoral thesis of Rebecca Buchert (The Institute of Human Genetics, Erlangen), since this would have gone beyond the scope of my thesis. The functional analyses of Mrs. Buchert showed completely abolished FAR1 activity as a result of the pathogenic variant. FAR1 could be confirmed as an ID gene through the identification of a second patient (Buchert et al. 2014).
3. PGAP1: In two further affected children (moderate ID and epilepsy) I identified a homozygous candidate variant, c.589_591delCTT, p.(Leu197del). Functional analyses done by Dr. Yoshiko Murakami (Research Institute for Microbial Diseases and WPI Immunology Frontier Research Center, Osaka University, Osaka, Japan) showed that this variant leads to degradation of the protein (Murakami et al. 2014). Since then, further patients with pathogenic variants in PGAP1 have been described (Kettwig et al. 2016).
4. PGAP2: In three children with severe ID and muscular weakness I identified a homozygous variant, c.296A>G, p.(Tyr99Cys). Here too, the functional analysis done by Dr. Yoshiko Murakami (Research Institute for Microbial Diseases and WPI Immunology Frontier Research Center, Osaka University, Suita, Osaka, Japan) clearly demonstrated a negative impact of the variant. A second patient has been identified, and thus PGAP2 could be validated as an ID gene (Hansen et al. 2013).
5. TMTC3: In addition to one variant, c.199C>G, p.(His67Asp), that I identified in two children with mild ID, club foot and cranial ventriculomegaly, seven other pathogenic variants were identified by Professor Joseph G. Gleeson (Rady Children’s Institute for Genomic Medicine, University of California, USA) and his colleagues in five independent families. The phenotype comparison confirmed that all affected individuals suffer from a recessive form of cobblestone lissencephaly (COB) (Jerber et al. 2016).

9.3 Concluding remarks

Over the last few years, the great advances in high-throughput sequencing have allowed for a substantial increase in sequencing content at dramatically reduced cost. This has allowed for the simultaneous interrogation of multiple genes in a single reaction and has proven to be an effective alternative for establishing the genetic basis of Mendelian diseases in the research setting and, more recently, in the clinical diagnostic setting, especially in intellectual disability (ID) (Reuter et al. 2017; Trujillano et al. 2017). Overall, a large curated database on the genetic basis of ID lists 1638 genes as associated with ID. Of these, 976 are mainly associated with autosomal recessive ID (SysID database, as of September 2017 (Kochinke et al. 2016)). It is, however, expected that there are some 1500-2000 autosomal recessive ID genes (van Bokhoven 2011). The combination of homozygosity mapping and next generation sequencing is an extremely efficient tool for identifying new candidate genes for autosomal recessive ID in consanguineous families, as has been shown in this thesis and in the literature (Reuter et al. 2017).

9.3.1 High yield of the exome sequencing after homozygosity mapping

In addition to making a diagnosis in two families due to variants in known genes (AHI1 and SPG20), I identified convincing variants in 15 genes that were not yet associated with intellectual disability. Of these genes, 8 (FAR1, KIAA0586, PGAP1, PGAP2, PUS7, SVBP,
TAF13 and TMTC3) have meanwhile been validated as ID genes through confirming functional analyses and/or comparison with independent families (data on PUS7 and SVBP are not yet published). A further 7 genes still lack confirmation of causality (BDH1, CCAR2, EZR, KCTD18, LRCH3, OGDHL and SKIDA1). For some of these latter candidates, several lines of evidence support their relevance for the phenotypes of the affected children. As an example, functional analyses of EZR have shown that the identified variant impairs the function of the protein, which plays a role in the RAS/MAPK activation pathway, a pathway that has been associated with ID (the Noonan syndrome spectrum). Another example is the homozygous variant in OGDHL; it is rare and the in silico prediction suggests pathogenicity. OGDHL encodes a brain-specific component of the multi-enzyme OGDH complex, whose malfunction is associated with neurodegenerative diseases (Bunik et al. 2008). However, final confirmation still needs more functional, genetic, and clinical support, which demonstrates the effort needed to report reliable results. Indeed, despite stringent filtering against all known neutral and pathogenic sequence variants, in addition to well-established molecular and functional analyses, it is still possible that some of the above-mentioned candidates represent false positive results. In three families, I did not find candidate variants. Reasons for such negative results are:

1. Misjudging a pathogenic variant: I saw the causative variant but misjudged it and thus did not prioritize it adequately. The solution for such cases lies in the growing databases, better in silico evaluation, a better understanding of different pathways and their connections, and a better description of clinical phenotypes. This would enable, e.g., a better evaluation of the variants based on their frequency in the populations. Also, the growing knowledge on pathways and clinical disorders would make it possible to recognize a specific gene as relevant, and thus to better prioritize the variants in it.
2. Pathogenic variant was not sequenced: Due to technical limitations, exome sequencing does not truly cover every coding base pair. Also, non-coding variants cannot be identified. The solution would be genome sequencing, since this would cover much more of the DNA. However, the risk of misjudging is high due to the high number of potentially pathogenic variants. The solution, as described above, lies in the growing databases. Also, the combination of different analyses may help, e.g. combining genome sequencing with RNA sequencing in order to identify regulatory or splicing variants.
3. Not autosomal recessive: In this study, I analyzed only recessive forms of ID due to homozygous variants. However, it is possible that exogenous factors, or compound-heterozygous or X-linked variants, play a role. The solution for this source of error would be a comprehensive analysis by sequencing all family members, together with a comprehensive clinical evaluation that would largely exclude exogenous factors.
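
The prioritization logic discussed in this section (homozygous calls, located within the regions found by homozygosity mapping, and rare in the population) can be sketched as a simple filter. This is only an illustration under stated assumptions, not the pipeline used in this thesis: the `Variant` fields, the gene names, the coordinates, and the frequency cut-off of 0.01 are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str              # hypothetical gene label
    chrom: str
    pos: int
    zygosity: str          # "hom" or "het"
    population_maf: float  # minor allele frequency in a reference cohort


def in_homozygous_region(variant, regions):
    """True if the variant lies in one of the mapped (chrom, start, end) intervals."""
    return any(
        chrom == variant.chrom and start <= variant.pos <= end
        for chrom, start, end in regions
    )


def prioritize(variants, regions, max_maf=0.01):
    """Keep rare homozygous variants inside the mapped homozygous regions."""
    return [
        v for v in variants
        if v.zygosity == "hom"
        and v.population_maf <= max_maf  # assumed cut-off, not taken from the thesis
        and in_homozygous_region(v, regions)
    ]


# Invented toy data: one mapped region and four calls.
regions = [("chr1", 1_000_000, 5_000_000)]
variants = [
    Variant("GENE_A", "chr1", 2_500_000, "hom", 0.0),
    Variant("GENE_B", "chr1", 3_000_000, "het", 0.0),   # heterozygous: dropped
    Variant("GENE_C", "chr1", 4_000_000, "hom", 0.20),  # too common: dropped
    Variant("GENE_D", "chr2", 2_500_000, "hom", 0.0),   # outside region: dropped
]
candidates = [v.gene for v in prioritize(variants, regions)]
print(candidates)
```

The sketch also makes the third limitation above concrete: a compound-heterozygous or X-linked cause (like the `GENE_B` call here) is discarded by design under a strictly homozygous filter.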

9.3.2 Validating pathogenicity and causality is needed to identify novel genes

Although filtering for the most relevant variant was guided by the results of the homozygosity mapping and usually led to only very few candidates, a relation between a variant and the phenotype can only be proven after validating both the pathogenicity and the causality. For example, in the family MR063, based on positional mapping and in silico prioritization, I identified four candidate variants in the genes SVBP, INADL, MYSM1, and DMRT2. Molecular modeling of the identified variants revealed that the variants in MYSM1 and DMRT2 have a minor effect on the structure of the encoded proteins, while the variants in SVBP and INADL would have a strong negative impact on protein function. Also, pathogenic variants in MYSM1 and DMRT2 are associated with different phenotypes. Since we were able to cooperate with a group in Nijmegen that had identified a second family with exactly the same variant in SVBP and with similar symptoms, SVBP was prioritized as the most probable causal gene. I then performed functional analyses and demonstrated the pathogenicity of the variant, since the ability of the altered protein to facilitate the secretion of VASH1 from the cells was disrupted. Further experiments done by the group in Nijmegen suggested that SVBP has an important function in the brain. The combination of all these data strongly suggests SVBP as a novel gene for ARID. However, it is still not possible to fully exclude a role of the variant in INADL. Indeed, it has been described on several occasions, e.g. by Trujillano et al. 2017 and by Reuter et al. 2017, that in some patients two different and independent pathogenic variants lead to the symptoms. Also, the results on SVBP reflect the importance of identifying several families and of functional analyses, but also show that even more would be needed in order to eventually confirm the results.
The comparison between TAF13 and EZR is a good example of the importance of having both functional evidence and independent families. In both genes, I identified convincing candidate variants, and the functional analyses confirmed the pathogenic impact of the identified variants and revealed a possible mechanism through which the function of each gene may relate to the patients' phenotype: an influence on the transcription of a specific group of genes and a dysregulation of the RAS-MAPK pathway, respectively. Nevertheless, for TAF13 we could identify a second pathogenic variant in an independent family (in Italy) and could thus confirm the gene as an ID gene. For EZR there is still no confirmation by a second family, and EZR remains a candidate gene that could still be a false positive result. However, there are exceptions to this strategy, and identifying novel ID genes does not follow rigid rules. In PGAP1 and TMTC3 we identified variants and assumed relevance even though they did not completely fulfil our criteria for confirming the pathogenicity and causality of novel genes. In PGAP1, we identified a variant in only one family with ARID, but this gene was considered clearly relevant since the identified variant interrupts the synthesis of GPI-anchored proteins, whose malfunction is already clearly associated with ID. Our hypothesis was confirmed shortly afterwards by independent literature (Kettwig et al. 2016). In TMTC3, the identification of bi-allelic loss-of-function variants in six unrelated families with cobblestone lissencephaly supported this gene as disease-causing, although no functional analyses were performed for any of the identified variants.

9.3.3 Sharing information is essential to achieve progress

Although attributing causality to a candidate gene requires functional investigations and independent, similarly affected individuals, reporting unconfirmed candidate genes accelerates the confirmation by other groups. This is because making gene lists public helps to reach other physicians and scientists who have unpublished or not yet validated results. Despite the strong competition between groups worldwide to identify new candidate genes for genetic diseases, I could show in this thesis that collaboration with other research groups specialized in specific genes or pathways is very helpful at different levels. At the research level, this collaboration greatly reduced the time needed to prove the pathogenicity and causality of the identified variants and, as a result, facilitated the rapid dissemination of these candidates within the scientific community. Furthermore, at the personal level, this collaboration helped me to communicate with many junior scientists and professors, which in turn broadened my experience in the field of human genetics with several new methods. Thus, the candidate genes identified in this thesis were made public from the beginning at conferences and in presentations, which eventually was a big advantage since we were
contacted several times and fruitful cooperations were started. Similarly, to counter the extreme heterogeneity of genetic disorders, public platforms were established that help to communicate results and combine efforts. One good example is GeneMatcher (Sobreira et al. 2015), a freely accessible web-based tool that facilitates connections between scientists who work on the same genes or who have convincing variants in association with specific phenotypes. Intensive cooperation and sharing of data at the national and international level also developed quickly. In addition, the group of Rami Abou Jamra at the Institute of Human Genetics in Erlangen established the Consortium of Autosomal Recessive Intellectual Disability (CARID), a network to join forces, which was successful in several cases (KIAA0586, PUS7, SVBP and TMTC3). Taken together, based on the experience that I gained in this thesis and on the experience of the group around Rami Abou Jamra in general, genes could be confirmed as ID genes and detailed functional studies could be started once several patients had been identified, thus giving more weight to the clinical validation of a gene based on several families. In contrast, having only functional analyses without further patients (e.g. EZR) leads to less convincing results. Thus, identifying further pathogenic variants in PGAP2, KIAA0586, PUS7, SVBP, TMTC3 and TAF13 as a result of cooperation with many research groups worldwide played the decisive role in re-classifying them from candidate genes to ID genes.
In conclusion, in my doctoral thesis I could show that whole exome sequencing in consanguineous families with multiple affected individuals, combined with clinical examination, molecular modeling, in vitro functional analysis, and intensive networking in the scientific community, is an effective strategy to identify and characterize novel genes associated with autosomal recessive intellectual disability. This helps to expand our knowledge about neuronal functions and diseases and also improves the diagnosis of intellectual disability, especially in cases where the diagnosis cannot be suspected from the clinical presentation. However, further steps to study the functional aspects, the pathomechanism, and finally therapeutic aspects can only be taken through dedicated research work.


11 Curriculum vitae of Hasan Tawamie

Calvissusstr. 35, 04177 Leipzig, Germany
T: +49 341 9723806 (office) or +49 172 9283251 (mobile)
M: [email protected]

Education
1993 – 2005: Primary and secondary school in Aleppo, Syria
2005 – 2010: Biotechnology studies at Aleppo University, Syria
April 2011 – June 2012: DAAD fellow and Master's student at the Institute of Human Genetics, University of Erlangen-Nuremberg, Germany
June 2011 – December 2015: DAAD fellow and PhD student at the Institute of Human Genetics, University of Erlangen-Nuremberg, Germany
January 2016 – present: Postdoc at the molecular diagnosis lab of the Institute of Human Genetics, University of Leipzig, Germany

Research Topic
• Identifying novel genes associated with autosomal recessive intellectual disability using whole-exome sequencing and functional analysis, under the supervision of Prof. André Reis and OA PD Dr. med. Rami Abou Jamra

Language and other skills
• Arabic (native language), very good English, good German
• Very good command of Microsoft Office and other routine computer skills

Oral presentations at conferences
• Oral presentations at the German Society of Human Genetics meetings in Dresden (2013), Essen (2014), and Lübeck (2016).

12 List of Publications

Parts of this dissertation have been published in the following articles:

Hasan Tawamie, Igor Martianov, Natalie Wohlfahrt, Rebecca Buchert, Gabrielle Mengus, Steffen Uebe, Luigi Janiri, Franz Wolfgang Hirsch, Johannes Schumacher, Fulvia Ferrazzi, Heinrich Sticht, André Reis, Irwin Davidson, Roberto Colombo, Rami Abou Jamra. “Hypomorphic pathogenic variants in TAF13 are associated with autosomal recessive intellectual disability and microcephaly.” Am J Hum Genet. 2017 Mar 2; 100(3):555-561 (Impact factor 10.362)

Miriam S. Reuter, Hasan Tawamie, Rebecca Buchert, Ola Hosny Gebril, Tawfiq Froukh, Christian Thiel, Steffen Uebe, Arif B. Ekici, Mandy Krumbiegel, Christiane Zweier, Juliane Hoyer, Karolin Eberlein, Judith Bauer, Ute Scheller, Tim M. Strom, Sabine Hoffjan, Ehab R. Abdelraouf, Nagwa A. Meguid, Ahmad Abboud, Mohammed Ayman Al Khateeb, Mahmoud Fakher, Saber Hamdan, Amina Ismael, Safia Muhammad, Ebtessam Abdallah, Heinrich Sticht, Dagmar Wieczorek, André Reis, Rami Abou Jamra. “Diagnostic Yield and Novel Candidate Genes by Exome Sequencing in 152 Consanguineous Families with Neurodevelopmental Disorders.” JAMA Psychiatry. 2017 Mar 1; 74(3):293-299 (Impact factor 15.3)

Hasan Tawamie, Eva Wohlleber, Steffen Uebe, Christine Schmäl, Markus M. Nöthen, Rami Abou Jamra. “Recurrent null mutation in SPG20 leads to Troyer syndrome.” Mol Cell Probes. 2015 Oct; 29(5):315-8 (Impact factor 1.494)

Louise A Stephen, Hasan Tawamie, Gemma M Davis, Lars Tebbe, Peter Nürnberg, Gudrun Nürnberg, Holger Thiele, Michaela Thoenes, Eugen Boltshauser, Steffen Uebe, Oliver Rompel, André Reis, Arif B Ekici, Lynn McTeir, Amy M Fraser, Emma A Hall, Pleasantine Mill, Nicolas Daudet, Courtney Cross, Uwe Wolfrum, Rami Abou Jamra, Megan G Davey, Hanno J Bolz. “TALPID3 controls centrosome and cell polarity and the human ortholog KIAA0586 is mutated in Joubert syndrome (JBTS23).” eLife. 2015 Sep 19; 4:e08077 (Impact factor 7.01)

Lars Björn Riecken, Hasan Tawamie, Carsten Dornblut, Rebecca Buchert, Amina Ismayel, Alexander Schulz, Johannes Schumacher, Heinrich Sticht, Katja J. Pohl, Yan Cui, André Reis, Helen Morrison, Rami Abou Jamra. “Inhibition of RAS Activation Due to a Homozygous Ezrin Variant in Patients with Profound Intellectual Disability.” Hum Mutat. 2015 Feb; 36(2):270-8 (Impact factor 4.601)

Rebecca Buchert, Hasan Tawamie, Christopher Smith, Steffen Uebe, A. Micheil Innes, Bassam Al Hallak, Arif B. Ekici, Heinrich Sticht, Bernd Schwarze, Ryan E. Lamont, Jillian S. Parboosingh, Francois P. Bernier, Rami Abou Jamra. “A Peroxisomal Disorder of Severe Intellectual Disability, Epilepsy, and Cataracts Due to Fatty Acyl-CoA Reductase 1 Deficiency.” Am J Hum Genet. 2014; 95(5):602-610 (Impact factor 10.362)

Yoshiko Murakami, Hasan Tawamie, Yusuke Maeda, Christian Büttner, Rebecca Buchert, Farah Radwan, Stefanie Schaffer, Heinrich Sticht, Michael Aigner, André Reis, Taroh Kinoshita, Rami Abou Jamra. “Null mutation in PGAP1 impairing GPI-anchor maturation in patients with intellectual disability and encephalopathy.” PLoS Genet. 2014 May 1; 10(5):e1004320 (Impact factor 7.17)

Lars Hansen, Hasan Tawamie, Yoshiko Murakami, Yuan Mang, Shoaib ur Rehman, Rebecca Buchert, Stefanie Schaffer, Safia Muhammad, Mads Bak, Markus M. Nöthen, Eric P. Bennett, Yusuke Maeda, Michael Aigner, André Reis, Taroh Kinoshita, Niels Tommerup, Shahid Mahmood Baig, Rami Abou Jamra. “Hypomorphic mutations in PGAP2, encoding a GPI-anchor-remodeling protein, cause autosomal-recessive intellectual disability.” Am J Hum Genet. 2013; 92(4):575-583 (Impact factor 10.362)

13 Acknowledgements

I thank Professor Dr. med. André Reis for the very kind welcome at the institute and for his expert support, as well as for acting as first reviewer. I sincerely thank my supervisor, PD Dr. med. Rami Abou Jamra, for providing this interesting topic, for his constant readiness for discussion, and for the trust he placed in me. I am equally grateful for the motivation and support, which contributed substantially to the success of this work. I thank Professor Dr. Johann Helmut Brandstätter for acting as second reviewer. Furthermore, I would like to thank all collaboration partners who advanced this project, in particular Professor Dr. Heinrich Sticht, Professor Dr. Irwin Davidson, Dr. Helen Morrison, Dr. Roberto Colombo, and Dr. Yoshiko Murakami. My heartfelt thanks go to all members of the Institute of Human Genetics for their support and the great working atmosphere. In particular, I would like to thank Farah Radwan for her support. I thank all fellow doctoral students and friends for the fruitful discussions, for the encouragement when things did not go as hoped, and for the fun hours inside and outside the lab. I thank my parents for their tremendous support in everything I do. I thank my wife, Bayan Amroush, for her loving support, and I look forward to our future together.

14 Declaration

I hereby declare that I

• have prepared the submitted dissertation independently and without unauthorized assistance,

• have used no sources or aids other than those listed in the bibliography, and have marked as such all passages taken verbatim or nearly verbatim from the literature, documenting each individually in the dissertation by its origin, stating the edition (printing and year of publication), the volume, and the page of the work used,

• have not yet submitted the dissertation to any other body for examination, and that it has not served any other purpose, not even in part.

Leipzig, 07.10.2017

Hasan Tawamie
